Nagios Documentation
Nagios® Version 2.x Documentation

Copyright © 1999-2006 Ethan Galstad
www.nagios.org
Last Updated: 11-27-2006

Nagios and the Nagios logo are registered trademarks of Ethan Galstad. All other trademarks, servicemarks, registered trademarks, and registered servicemarks mentioned herein may be the property of their respective owner(s). The information contained herein is provided AS IS with NO WARRANTY OF ANY KIND, INCLUDING THE WARRANTY OF DESIGN, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

Version 2.0 Documentation
Table of Contents

About
    What is Nagios?
    System requirements
    Licensing
    Downloading the latest version
    Other monitoring utilities

Release Notes
    What's new in this version
    Change log

Support
    Self-service and commercial support

Getting Started
    Advice for beginners

Installing Nagios
    Compiling and installing Nagios
    Setting up the web interface

Configuring Nagios
    Configuration overview
    Main configuration file options
    Object configuration file options
    CGI configuration file options
    Configuring authorization for the CGIs

Running Nagios
    Verifying the configuration
    Starting Nagios
    Stopping and restarting Nagios

Nagios Plugins
    Standard plugins
    Writing your own plugins

Nagios Addons
    NRPE - Daemon and plugin for executing plugins on remote hosts
    NSCA - Daemon and client program for sending passive check results across the network

Theory Of Operation
    Determining status and reachability of network hosts
    Network outages
    Notifications
    Plugin theory
    Service check scheduling
    State types
    Time periods

Advanced Topics
    Event handlers
    External commands
    Indirect host and service checks
    Passive service checks
    Volatile services
    Service and host result freshness checks
    Distributed monitoring
    Redundant and failover monitoring
    Detection and handling of state flapping
    Service check parallelization
    Notification escalations
    Monitoring service and host clusters
    Host and service dependencies
    State stalking
    Performance data
    Scheduled host and service downtime
    Using the embedded Perl interpreter
    Adaptive monitoring
    Object inheritance
    Time-saving tips for object definitions

Integration With Other Software
    SNMP Traps
    TCP Wrappers

Miscellaneous
    Securing Nagios
    Tuning Nagios for maximum performance
    Using the nagiostats utility
    Using macros in commands
    Information on the CGIs
    Custom CGI headers and footers

About Nagios®

What Is This?

Nagios® is a system and network monitoring application. It watches hosts and services that you specify, alerting you when things go bad and when they get better.

Nagios was originally designed to run under Linux, although it should work under most other unices as well.

Some of the many features of Nagios® include:

    Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.)
    Monitoring of host resources (processor load, disk usage, etc.)
    Simple plugin design that allows users to easily develop their own service checks
    Parallelized service checks
    Ability to define network host hierarchy using "parent" hosts, allowing detection of and distinction between hosts that are down and those that are unreachable
    Contact notifications when service or host problems occur and get resolved (via email, pager, or user-defined method)
    Ability to define event handlers to be run during service or host events for proactive problem resolution
    Automatic log file rotation
    Support for implementing redundant monitoring hosts
    Optional web interface for viewing current network status, notification and problem history, log file, etc.

System Requirements

The only requirements for running Nagios are a machine running Linux (or a UNIX variant) and a C compiler. You will probably also want to have TCP/IP configured, as most service checks will be performed over the network.

You are not required to use the CGIs included with Nagios. However, if you do decide to use them, you will need to have the following software installed:

1. A web server (preferably Apache)
2. Thomas Boutell's gd library version 1.6.3 or higher (required by the statusmap and trends CGIs)

Licensing

Nagios® is licensed under the terms of the GNU General Public License Version 2 as published by the Free Software Foundation. This gives you legal permission to copy, distribute and/or modify Nagios under certain conditions. Read the 'LICENSE' file in the Nagios distribution or read the online version of the license for more details.

Nagios® is provided AS IS with NO WARRANTY OF ANY KIND, INCLUDING THE WARRANTY OF DESIGN, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgements

Several people have contributed to Nagios by reporting bugs, suggesting improvements, writing plugins, and so on. A list of some of the many contributors to the development of Nagios can be found at http://www.nagios.org.

Downloading The Latest Version

You can check for new versions of Nagios at http://www.nagios.org.

What's New in Version 2.0

Important: Make sure you read through the documentation before sending a question to the mailing lists.

Change Log

The change log for Nagios can be found online at http://www.nagios.org/development/changelog.php or in the Changelog file in the root directory of the source code distribution.

Known Issues

There is a known issue that can affect Nagios 2.0 on FreeBSD systems. Hopefully this problem can be fixed in a 2.x release.

1. FreeBSD and threads. On FreeBSD there is a native user-level implementation of threads called 'pthread', and there is also an optional ports collection 'linuxthreads' that uses kernel hooks. Some folks from Yahoo! have reported that using the pthread library causes Nagios to pause under heavy I/O load, causing some service check results to be lost.
Switching to linuxthreads seems to reduce the problem but does not fix it. The lock happens in liblthread's __pthread_acquire(), which can never acquire the spinlock. It happens when the main thread forks to execute an active check. On the second fork, which creates the grandchild, the grandchild is created but never returns from liblthread's fork wrapper, because it is stuck in __pthread_acquire(). Maybe some FreeBSD users can help out with this problem.

Changes and New Features

1. Macro Changes - Macros have undergone a major overhaul. You will have to update most of your command definitions to match the new macros. Most macros are now available as environment variables. Also, "on-demand" host and service macros have been added. See the documentation on macros for more information.

2. Hostgroup Changes

Hostgroup escalations removed - Hostgroup escalations have been removed. Their functionality can be duplicated by using the hostgroup_name directive in host escalation definitions.

Member directive changes - Hostgroup definitions can now contain multiple members directives, which should make editing the config files easier when you have a lot of member hosts. Alternatively, you may use the hostgroups directive in host definitions to specify which hostgroup(s) a particular host is a member of.

Contact group changes - The contact_groups directive has been moved from hostgroup definitions to host definitions. This was done in order to maintain consistency with the way service contacts are specified. Make sure to update your config files!

Authorization changes - Authorization for access to hostgroups in the CGIs has been changed. You must now be authorized for all hosts that are members of the hostgroup in order to be authorized for the hostgroup.

3. Host Changes

Host freshness checking - Freshness checking has been added for host checks. This is controlled by the check_host_freshness option in the main config file, along with the check_freshness directive in host definitions.
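As a rough illustration of the hostgroup and host freshness changes described above, the fragment below sketches how the directives might appear in the configuration files. The host, hostgroup, and contact group names and the address are hypothetical, the value 1 is assumed to mean "enabled", and the host definition is deliberately incomplete (other required host directives are left out for brevity). Only the directive names members, hostgroups, contact_groups, check_freshness, and check_host_freshness come from the text.

```
# Main configuration file (assumed value: 1 = enabled)
check_host_freshness=1

# Object configuration file
# A hostgroup may now carry more than one members directive.
define hostgroup{
        hostgroup_name  web-servers
        alias           Web Servers
        members         web1,web2
        members         web3
        }

# Hypothetical host that declares its own hostgroup membership,
# names its contact group in the host definition, and enables
# freshness checking. Other required host directives are omitted.
define host{
        host_name       web4
        alias           Web Server 4
        address         192.0.2.14
        hostgroups      web-servers
        contact_groups  web-admins
        check_freshness 1
        }
```

Running the configuration verification described in "Verifying the configuration" after such a change is the safest way to confirm that the definitions are accepted.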
OCHP Command - Host checks can now be obsessed over, just as services can be. The OCHP command is run for all hosts that have the obsess_over_host directive enabled in their host definition.

4. Host Check Changes

Regularly scheduled checks - You can now schedule regular checks of hosts by using the check_interval directive in host definitions. NOTE: Listen up! You should use regularly scheduled host checks rather sparingly. They are not necessary for normal operation (on-demand checks are already performed when necessary) and can negatively affect performance if used improperly. You've been warned.

Passive host checks - Passive host checks are now supported if you've enabled them with the accept_passive_host_checks option in the main config file and the passive_checks_enabled directive in the host definition. Passive host checks can make setting up redundant or distributed monitoring environments easier. NOTE: There are some problems with passive host checks that you should be aware of. They are described elsewhere in this documentation.

5. Retention Changes

Retention of scheduling information - Host and service check scheduling information (next check times) can now be retained across program restarts using the use_retained_scheduling_info directive.

Smarter retention - Values of various host and service directives that can be retained across program restarts are now only retained if they are changed during runtime by an external command. This should make things less confusing for people who modify host and service directive values and then restart Nagios, expecting to see some changes.

More stuff retained - More information is now retained across program restarts, including flap detection history. Hoorah!

6. Extended Info Changes