08 January 2008

NSCA's aggregate writing

While working on speeding up Opsview, we found a bug in how NSCA handles aggregate writes when run in --single mode.

The specific failure scenario is this:


  1. NSCA and Nagios are told to start up
  2. A send_nsca request is received by NSCA before Nagios has created the nagios.cmd command pipe
  3. NSCA tries to open the command file for writing, but finds that it does not exist yet
  4. NSCA opens the alternate dump file instead

Normally, once Nagios does create the nagios.cmd file, NSCA starts using it. The exception is when aggregate mode is on and the daemon mode is --single: in that case NSCA keeps writing to the alternate dump file, so Nagios never sees the results from the slaves.
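To make the failure mode concrete, here is a small toy model in Python. It is not NSCA's code (NSCA is written in C) and not our patch: the class, the file names and the `recheck` flag are all made up for illustration, and regular files stand in for the real command pipe. It assumes that aggregate mode means "keep the output handle open between writes", which is what makes the first choice of file stick in a single long-running process.

```python
import os
import tempfile


class ResultWriter:
    """Toy model: write results to a command file, or to a dump file
    if the command file does not exist when the output is opened."""

    def __init__(self, command_file, dump_file, aggregate, recheck):
        self.command_file = command_file
        self.dump_file = dump_file
        self.aggregate = aggregate  # assumption: keep the handle open between writes
        self.recheck = recheck      # hypothetical fix, not the actual patch
        self.handle = None
        self.using_dump = False

    def _open(self):
        # Prefer the command file; fall back to the dump file if it is missing.
        if os.path.exists(self.command_file):
            self.handle = open(self.command_file, "a")
            self.using_dump = False
        else:
            self.handle = open(self.dump_file, "a")
            self.using_dump = True

    def write(self, line):
        # Hypothetical fix: drop a cached dump-file handle as soon as the
        # command file has appeared, so the next open picks the command file.
        if (self.handle is not None and self.recheck and self.using_dump
                and os.path.exists(self.command_file)):
            self.handle.close()
            self.handle = None
        if self.handle is None:
            self._open()
        self.handle.write(line + "\n")
        self.handle.flush()
        if not self.aggregate:
            # Without aggregation the output is reopened for every write.
            self.handle.close()
            self.handle = None


def simulate(aggregate, recheck):
    """Return the lines that end up in the (stand-in) command file."""
    with tempfile.TemporaryDirectory() as tmp:
        cmd = os.path.join(tmp, "nagios.cmd")
        dump = os.path.join(tmp, "nsca.dump")
        writer = ResultWriter(cmd, dump, aggregate, recheck)
        writer.write("result 1")   # arrives before the command file exists
        open(cmd, "w").close()     # "Nagios" creates its command file
        writer.write("result 2")   # arrives afterwards
        if writer.handle is not None:
            writer.handle.close()
        with open(cmd) as f:
            return f.read().splitlines()
```

In this model, `simulate(aggregate=False, recheck=False)` delivers the second result to the command file, while `simulate(aggregate=True, recheck=False)` delivers nothing because the writer is still holding the dump file open. Setting `recheck=True` restores delivery in the aggregate case.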

Here's the patch, which we've also added into our source for Opsview.

As we are very keen on good testing, we've managed to recreate the failing behaviour in a test script. You also need a test configuration file and a patch to the test framework. If you run this test without the patch, it shows the error; once the patch is applied, the test should pass.

Comments

....or just start NSCA in the nagios init script :)
