We have been asked by a customer if it is possible to change a check command for a service depending on the time of day.
Why would this be useful?
Well, if a server runs time-critical processes during the day and slow-running batch processes overnight, how can a single check command report sensibly on CPU or memory usage in both situations without generating false alerts? Yes, you could write your own plugins that take the time into account and react accordingly for each check that needs it, but these would have to be installed on each host for each service, the wealth of plugins from http://www.nagiosexchange.org/ could not easily be used, setting the system up would take longer, and it would all be much harder to maintain.
Instead, we have made changes to the service stanza within the Nagios configuration files to include a "check_timeperiod_command <timeperiod>,<command>" entry:
define service {
    host_name                  server1
    service_description        Free Widgets
    check_command              check_widget -w 40% -c 20%
    check_timeperiod_command   nonworkhours,check_widget -w 5% -c 2%
    .....
}
You get the idea....
check_command provides the default check, used whenever no alternative applies. During the nonworkhours time period, the command and arguments given in check_timeperiod_command are used instead.
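To make the selection rule concrete, here is a minimal sketch in Python. It is not the code from the patch (which lives inside Nagios itself); it only illustrates the behaviour described above. The function names, the 18:00 to 08:00 definition of nonworkhours, and the decision to ignore weekdays are all assumptions made for the example.

```python
from datetime import time

# Assumed definition of the "nonworkhours" period: 18:00 until 08:00.
# In Nagios this would come from a timeperiod definition; the hours
# here are made up for illustration, and weekdays are ignored.
NONWORKHOURS = (time(18, 0), time(8, 0))


def in_period(now, period):
    """Return True if `now` lies in [start, end), allowing wrap past midnight."""
    start, end = period
    if start <= end:
        return start <= now < end
    # The period spans midnight, e.g. 18:00 -> 08:00.
    return now >= start or now < end


def select_command(now, default_command, timeperiod_commands):
    """Pick the first alternative whose period contains `now`, else the default."""
    for period, command in timeperiod_commands:
        if in_period(now, period):
            return command
    return default_command


# Mirrors the service definition above.
DEFAULT = "check_widget -w 40% -c 20%"
ALTERNATIVES = [(NONWORKHOURS, "check_widget -w 5% -c 2%")]

if __name__ == "__main__":
    for hour in (3, 12, 23):
        print(hour, select_command(time(hour, 0), DEFAULT, ALTERNATIVES))
```

With these assumed hours, a check scheduled at midday runs with the default thresholds, while one scheduled at 23:00 or 03:00 runs with the relaxed overnight thresholds.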
This seems far too useful to the community to keep to ourselves, so we offer the patch for Nagios 2.8 here, for peer review and comments (all of which are very welcome).
And here is a patch for ndoutils 1.4b2 that goes with it.
Enjoy!
Update: Patches for Nagios 3.0.6 and NDOutils 1.4b7 are available.
Would it be possible to have some kind of automatic threshold creator? Let's say that for each period of time (day, week, month), a cron job checks the previous results for that period and adjusts the thresholds according to those results. The thresholds could be an average of the previous periods. Some kind of graphical approval could let the user review threshold changes caused by abnormal previous results.
Sorry for my bad English.
Posted by: Samuel | June 4, 2008 at 03:15 PM
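The averaging idea in this comment could be prototyped outside Nagios along the following lines. This is only a rough sketch: the function name, the multipliers, and the assumption that a higher value is worse (as with CPU usage) are all hypothetical and not part of the patches above.

```python
from statistics import mean


def suggest_thresholds(period_averages, warn_factor=1.5, crit_factor=2.0):
    """Suggest (warning, critical) thresholds from earlier periods.

    `period_averages` holds one average reading per previous period
    (day, week or month). The factors are arbitrary illustrative
    multipliers, chosen for a metric where higher means worse.
    """
    baseline = mean(period_averages)
    return baseline * warn_factor, baseline * crit_factor


if __name__ == "__main__":
    # Three previous periods with made-up average readings.
    warning, critical = suggest_thresholds([10.0, 20.0, 30.0])
    print(warning, critical)
```

A cron job could run something like this per service and present the suggested values for approval rather than applying them directly, so that one abnormal period does not silently skew the thresholds.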