
16 May 2006

Reducing host checks for passive results

We found last time that host checks are run whenever a service comes back with a non-OK state, so it is important that your host checks are as fast as possible.

However, during our investigation we found that a host check is also initiated when a passive check result arrives with a non-OK state. We don't think this is necessary: if the passive result has come from the host itself, the host must be up.

What if the passive check is not related to the host? For instance, if a host is set up as a central syslog-ng server and a passive check result reports that a security login failed on a client, the result may go to the syslog-ng server, not the originating host. But in this case, checking the syslog-ng server gives no extra value either!

So we think this is a safe patch to apply. We've tested it on Nagios 2.2 and Nagios 2.3.1. We'll see if Ethan agrees.
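To make the idea concrete, here is a minimal Python sketch of the decision the patch changes. It is not the patch itself and it does not use any Nagios code: Nagios is written in C, and the names `State`, `CheckType`, `needs_host_check` and `skip_for_passive` are made up for this illustration. The only thing taken from Nagios is the usual plugin return codes (0 to 3).

```python
from enum import Enum


class State(Enum):
    # Standard Nagios plugin return codes.
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


class CheckType(Enum):
    # Hypothetical labels for how a service result arrived.
    ACTIVE = "active"
    PASSIVE = "passive"


def needs_host_check(state: State, check_type: CheckType,
                     skip_for_passive: bool = True) -> bool:
    """Decide whether a service result should trigger a host check.

    skip_for_passive=False models the behaviour described above
    (any non-OK result triggers a host check); True models the
    proposed change (non-OK passive results do not).
    """
    if state is State.OK:
        # OK results never trigger a host check.
        return False
    if skip_for_passive and check_type is CheckType.PASSIVE:
        # Assumption behind the patch: the result came from the
        # host, so the host must be up.
        return False
    return True
```

With `skip_for_passive=False` a critical passive result still asks for a host check; with the default it does not, while active results behave the same either way.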

Update: Jason Martin points out that the main assumption - that a passive check comes from the host, so it must be up - is not true with distributed monitoring.

Distributed monitoring can be set up in two ways:

1. the master receiving passive service checks and checking the host (the only way for Nagios 1.x)

2. the master receiving passive service and host check results from the slave (http://nagios.sourceforge.net/docs/2_0/distributed.html)

This patch is fine for case 2, but would break case 1. So this is not quite a safe patch anymore.
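The two cases can be written down as a small Python sketch. Again, this is only an illustration of the reasoning, not Nagios code: `MasterSetup`, `receives_passive_host_results` and `may_skip_host_check` are hypothetical names, and the sketch assumes that the only thing that matters is whether the master gets host results from the slave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MasterSetup:
    # True for case 2 (slave sends host check results to the master),
    # False for case 1 (master has to check the host itself).
    receives_passive_host_results: bool


def may_skip_host_check(setup: MasterSetup, service_ok: bool,
                        result_is_passive: bool) -> bool:
    """Return True if skipping the host check loses no information."""
    if service_ok:
        # No host check is triggered for OK results anyway.
        return True
    if not result_is_passive:
        # The patch only concerns passive results.
        return False
    # Non-OK passive result: skipping is only safe when the master
    # learns the host state from the slave (case 2).
    return setup.receives_passive_host_results


case_1 = MasterSetup(receives_passive_host_results=False)
case_2 = MasterSetup(receives_passive_host_results=True)
```

For a non-OK passive result, `may_skip_host_check(case_2, False, True)` allows the skip, while the same call with `case_1` does not, which is the break Jason describes.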


Comments

I can also see problems arising if the passive check is alerting of an immediate system shutdown (environmental alarm forcing a shutdown, etc).

Larry: Possibly. The host check could still pass (if it is a ping), as network connectivity is shut down last.
