Michael Prochaska was having trouble compiling NDOutils on Solaris 10. Since we have an interest in getting Opsview working on Solaris (the upcoming 2.12 release will add Solaris 10 as a supported platform), we offered to help. This is the result of his company, Bacher Systems, sponsoring our work.
This is our version of NDOutils 1.4b7. We've applied the following patches, most of which can go back upstream:
ndoutils_better_mysql_detection.1.4b7.patch and ndoutils_np_mysqlclient.m4
We've talked about this before and it looks like it is included in CVS for ndoutils. However, since it is not released yet, we've included it in this list. You need to copy ndoutils_np_mysqlclient.m4 into a new m4/ directory at the top level of ndoutils and run:
    aclocal -I m4
    autoconf
This updates the ./configure script to detect mysql using mysql_config.
ndoutils_show_mysql_error.patch
We've found this to be a useful patch. When ndoutils logs mysql errors to syslog, we don't always understand what they mean. This patch adds the returned mysql error to syslog too.
ndoutils_solaris_eintr_in_accept.patch
We've found that it is possible to get an EINTR during the accept() on Solaris systems. This patch ignores the EINTR and waits for the next connect.
ndoutils_retry_on_soft_read_errors.patch
While reading through the code for EINTR, we found there was a case where it was possible that an EINTR falls through during a read(). This patch loops around the read() in these cases.
ndoutils_remove_multiple_children.patch
For reasons that we haven't quite understood, sometimes the SIGCHLD signal is not processed correctly. This leaves a defunct process behind. However, the next time a child exits, it is processed, but only one child. This leaves many defunct processes over time. This patch loops around the waitpid() call to get all the children.
ndoutils_use_sigaction_for_child_handler.patch
Related to the above, the signal handler for the children in Solaris looks like it needs to be reset each time if you use signal() to set it. However, sigaction() is a more convenient system call. When researching this, we found that NSCA uses this (and a similar loop to remove multiple children). So sigaction() is used instead of signal() to set the SIGCHLD handler.
ndoutils_sunos.h, ndoutils_sunos.c and ndoutils_build_on_solaris.patch
These files include asprintf() and vasprintf(), which are not available on Solaris. However, this is not the best way of handling this. A more appropriate way would be to include them only if ./configure finds they are missing. I'd be inclined to use gnulib to add these functions in, because then there's a whole set of other functions that can be brought in as well. The Nagios Plugins use this in their project to great success. If Ethan says the word, we'll make this a better patch. But it works for now.
We hope these all get included in a release of NDOutils soon. We love that Nagios logs its status information into a database - Opsview relies on this in order to show status screens for large systems. It is in all our interests to get this crucial piece of software as stable and as complete as possible.
Can someone send me the syntax for applying ndoutils_sunos.h, ndoutils_sunos.c and ndoutils_build_on_solaris.patch?
Thanks in advance
Posted by: Toby Nelson | July 23, 2008 at 05:44 PM
Thanks a lot!!
Now I can compile ndoutils. :D
Thanks again.
Regards.
Posted by: Marcio Seiji | January 6, 2009 at 05:45 PM
After compiling the patched version in Solaris 10 x86, I get the following message in the nagios.log file:
[1254246969] Error: Could not load module '/export/home/nagios/bin/ndomod.o' -> ld.so.1: nagios: fatal: /export/home/nagios/var/spool/checkresults/nebmod9ia4qB: wrong ELF class: ELFCLASS32
Any ideas what I am missing?
Posted by: Francisco Franco | September 29, 2009 at 07:01 PM