WMLUG July 2015

Nagios, PNP4Nagios, and NConf by Patrick TenHoopen

What is Nagios?

Nagios is an IT infrastructure monitoring and alerting tool. The free Nagios DIY Core provides the central monitoring engine and the basic web interface.

Current Version: 4.0.8 (2014-08-12)

Download: https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.0.8.tar.gz

Nagios Demo

Demo

Installation Prerequisites

● gcc

● apache2

● rrdtool

● php5-gd

● php5-zlib

● php5-socket

Installation

Follow the Quick-Start Guides: https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/quickstart.html

After installing, if a firewall is running on the Nagios server, remember to configure it to allow HTTP access.

Installation, cont.

tar xf nagios-4.0.8.tar.gz
cd nagios-4.0.8
./configure --with-command-group=nagcmd
make all
make install
make install-init
make install-config
make install-commandmode
make install-webconf
htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Nagios Plugins

Download: http://nagios-plugins.org/download/nagios-plugins-2.0.3.tar.gz

tar xf nagios-plugins-2.0.3.tar.gz
cd nagios-plugins-2.0.3
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install

Configuration

Nagios comes with a default configuration for monitoring the localhost that Nagios is installed on (localhost.cfg) plus some other examples.

The configuration files are stored in /usr/local/nagios/etc/objects/ and are plain text files written in Nagios's own object definition format.

Detailed description of configuration files and options: https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/objectdefinitions.html

Default Configuration Files

● commands.cfg – Check commands that are used in service definitions

● contacts.cfg - Who to contact if an alert is generated

● hosts.cfg - Hosts to monitor

● localhost.cfg - Basic config for Nagios host

● printer.cfg – Sample config for printers

● services.cfg - Things on hosts to monitor

● switch.cfg - Sample config for switches

● templates.cfg - Definition templates used by hosts, services, etc.

● timeperiods.cfg – Notification times/hours of alerting

● windows.cfg - Sample config for a Windows machine

Configuration File Organization

You don't need to split the definitions across separate files; a single large configuration file works too.

The cfg_file line(s) in the /usr/local/nagios/etc/nagios.cfg file control which files are used.
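
To see which object files an installation actually loads, the cfg_file entries can be listed from the shell. This is a minimal sketch: the function name list_cfg_files is made up for illustration, and it also matches cfg_dir lines, a nagios.cfg directive that pulls in every .cfg file found in a directory.

```shell
# list_cfg_files is a hypothetical helper name, not part of Nagios.
# It prints the value of every active (uncommented) cfg_file= or cfg_dir= line.
list_cfg_files() {
    # $1: path to the main config file, e.g. /usr/local/nagios/etc/nagios.cfg
    sed -n -E 's/^[[:space:]]*cfg_(file|dir)[[:space:]]*=[[:space:]]*//p' "$1"
}
```

Example call: list_cfg_files /usr/local/nagios/etc/nagios.cfg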

Note: If you want to import existing Nagios conf files into NConf (discussed later), it will work better if they are separated out by function/type.

Templates

Templates are used by configuration definitions to provide default values for settings. This keeps the actual definitions smaller and easier to update. If you modify a template, all definitions that use it pick up the change.

Generic Linux Host Template

# Linux host definition template - This is NOT a real host, just a template!
define host{
    name                    linux-server      ; The name of this host template
    use                     generic-host      ; Inherits other values from generic-host template
    check_period            24x7              ; By default, Linux hosts are checked round the clock
    check_interval          5                 ; Actively check the host every 5 minutes
    retry_interval          1                 ; Schedule host check retries at 1 minute intervals
    max_check_attempts      10                ; Check each Linux host 10 times (max)
    check_command           check-host-alive  ; Default command to check Linux hosts
    notification_period     workhours         ; Only notify during the day
                                              ; Note that the notification_period variable is being
                                              ; overridden from the value that is inherited from the
                                              ; generic-host template!
    notification_interval   120               ; Resend notifications every 2 hours
    notification_options    d,u,r             ; Only send notifications for specific host states
    contact_groups          admins            ; Notifications get sent to the admins by default
    register                0                 ; DONT REGISTER THIS DEFINITION
    }

Generic Service Template

# Generic service definition template - This is NOT a real service, just a template!
define service{
    name                            generic-service  ; The 'name' of this service template
    active_checks_enabled           1                ; Active service checks are enabled
    passive_checks_enabled          1                ; Passive service checks are enabled/accepted
    parallelize_check               1                ; Active service checks should be parallelized
                                                     ; (disabling this can lead to major performance problems)
    obsess_over_service             1                ; We should obsess over this service (if necessary)
    check_freshness                 0                ; Default is to NOT check service 'freshness'
    notifications_enabled           1                ; Service notifications are enabled
    event_handler_enabled           1                ; Service event handler is enabled
    flap_detection_enabled          1                ; Flap detection is enabled
    process_perf_data               1                ; Process performance data
    retain_status_information       1                ; Retain status information across program restarts
    retain_nonstatus_information    1                ; Retain non-status information across program restarts
    is_volatile                     0                ; The service is not volatile
    check_period                    24x7             ; The service can be checked at any time of the day
    max_check_attempts              3                ; Re-check the service up to 3 times in order to determine its final (hard) state
    normal_check_interval           10               ; Check the service every 10 minutes under normal conditions
    retry_check_interval            2                ; Re-check the service every two minutes until a hard state can be determined
    contact_groups                  admins           ; Notifications get sent out to everyone in the 'admins' group
    notification_options            w,u,c,r          ; Send notifications about warning, unknown, critical, and recovery events
    notification_interval           60               ; Re-notify about service problems every hour
    notification_period             24x7             ; Notifications can be sent out at any time
    register                        0                ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
    }

Commands

Nagios comes with several commands for checking services, and more are installed with the Nagios plugins. The plugin programs are located in the /usr/local/nagios/libexec/ directory.

Some examples: check_disk, check_http, check_log, check_nt, check_ping

Community Check Commands

You can download check plugins created by the Nagios community by browsing Nagios Exchange at: http://exchange.nagios.org/directory/Plugins

Custom Commands

You can also create custom check commands using scripts or custom programs. The script/program just needs to return one of the exit statuses that Nagios expects:

UNKNOWN = 3, CRITICAL = 2, WARNING = 1, OK = 0
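
As a rough illustration of those exit statuses, here is a minimal sketch of a custom check written as a shell function. The name check_file_count, its arguments, and the wording of the status messages are assumptions made for this example; only the exit codes come from the list above.

```shell
#!/bin/sh
# check_file_count is a hypothetical custom check: it counts the entries in a
# directory and compares the count against warning and critical thresholds.
check_file_count() {
    if [ $# -ne 3 ]; then
        echo "UNKNOWN - usage: check_file_count DIR WARN CRIT"
        return 3
    fi
    dir=$1
    warn=$2
    crit=$3
    if [ ! -d "$dir" ]; then
        echo "UNKNOWN - $dir is not a directory"
        return 3
    fi
    # Count directory entries (hidden files are not counted in this sketch).
    count=$(ls "$dir" | wc -l | tr -d ' ')
    if [ "$count" -ge "$crit" ]; then
        echo "CRITICAL - $count files in $dir"
        return 2
    elif [ "$count" -ge "$warn" ]; then
        echo "WARNING - $count files in $dir"
        return 1
    fi
    echo "OK - $count files in $dir"
    return 0
}

# When saved as a standalone plugin script, end the file with:
#   check_file_count "$@"; exit $?
```

Nagios reads the first line of output as the status text and the exit code as the state, so the function prints one line and returns the matching code.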

Example Check Definition

$USER1$, $HOSTADDRESS$, $ARG1$, and $ARG2$ are Nagios macros. Nagios replaces them with actual values when the check runs: $USER1$ is the plugin directory, $HOSTADDRESS$ comes from the host definition, and $ARG1$ and $ARG2$ are the values passed in from the service definition. -w is the warning threshold. -c is the critical threshold.

# 'check_ping' command definition
define command{
    command_name    check_ping
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
    }

Example Host Definition

define host{
    use          linux-server  ; Name of host template to use
                               ; This host definition will
                               ; inherit all variables that are
                               ; defined in (or inherited by)
                               ; the linux-server host template
                               ; definition.
    host_name    localhost
    alias        localhost
    address      127.0.0.1
    }

Example Service

Note that the command parameters are delimited by an "!". The parameters are used in the check definition ($ARG1$, $ARG2$, etc).

# Define a service to "ping" the local machine
define service{
    use                    local-service  ; Name of service
                                          ; template to use
    host_name              localhost
    service_description    PING
    check_command          check_ping!100.0,20%!500.0,60%
    }
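
How the "!"-delimited parameters end up in the command line can be sketched in a few lines of shell. This only mimics the $ARGn$ substitution for illustration; expand_check_command is a made-up helper, not something Nagios ships, and other macros such as $USER1$ and $HOSTADDRESS$ are left untouched.

```shell
# expand_check_command is a hypothetical helper that imitates how the values
# after each "!" in check_command replace $ARG1$, $ARG2$, ... in command_line.
expand_check_command() {
    # $1: command_line template, $2: check_command value from the service
    awk -v tmpl="$1" -v cmd="$2" 'BEGIN {
        n = split(cmd, parts, "!")          # parts[1] is the command name
        for (i = 2; i <= n; i++) {
            macro = "$ARG" (i - 1) "$"
            # Replace every literal occurrence of the macro with its value.
            while ((pos = index(tmpl, macro)) > 0)
                tmpl = substr(tmpl, 1, pos - 1) parts[i] substr(tmpl, pos + length(macro))
        }
        print tmpl
    }'
}
```

Called with the check_ping command_line and the check_command value from the service above, it prints the template with -w 100.0,20% and -c 500.0,60% filled in.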

Host Groups

By using host groups, you can easily set up checks for a set of hosts with one service definition. You can create a new config file named hostgroups.cfg.

define hostgroup{
    hostgroup_name    linux-servers  ; Name of the hostgroup
    alias             Linux Servers  ; Long name of the group
    members           localhost,linuxbox1,linuxbox2  ; Comma separated list of
                                                     ; hosts that belong to this group
    }

define service{
    use                    local-service  ; Name of service template to use
    hostgroup_name         linux-servers
    service_description    PING-LINUX-HOSTS
    check_command          check_ping!100.0,20%!500.0,60%
    }

Parent/Child Relationships

By defining what other hosts a host depends on, Nagios can distinguish between down and unreachable states for the host.

For example, if Nagios is monitoring a host connected to a switch and the switch goes down, Nagios can no longer ping the host. Nagios then alerts that the switch is down and flags the other host as unreachable instead of raising a second down alert.

Parents Setting

When defining a host, use the "parents" setting to establish the parent/child relationship.

define host{
    host_name    Nagios  ; Nagios host has no parent
    }

define host{
    host_name    Switch1
    parents      Nagios
    }

define host{
    host_name    OtherHost
    parents      Switch1
    }
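
The down/unreachable decision can be summarized in a small shell sketch. It is a simplification under the assumption that a failed host counts as DOWN when it has no parents or at least one parent is up, and as UNREACHABLE when every parent is down; classify_host and its up/down arguments are invented for this example.

```shell
# classify_host is a hypothetical helper illustrating the decision.
# $1: result of the host's own check (up or down)
# remaining arguments: current state of each parent host (up or down)
classify_host() {
    result=$1
    shift
    if [ "$result" = up ]; then
        echo UP
        return 0
    fi
    if [ $# -eq 0 ]; then
        echo DOWN            # no parents: nothing else can explain the failure
        return 0
    fi
    for parent in "$@"; do
        if [ "$parent" = up ]; then
            echo DOWN        # a working path exists, so the host itself failed
            return 0
        fi
    done
    echo UNREACHABLE         # every parent is down, the host may be fine
}
```

With the definitions above, a failed check on OtherHost while Switch1 is down corresponds to classify_host down down, which reports UNREACHABLE.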

Parent/Child Relationship Picture

Pictorial representation: https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/networkreachability.html

NSClient++

With the NSClient++ add-on, you can easily set up checks on Windows servers.

http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NSClient%2B%2B/details

NRPE

The NRPE (Nagios Remote Plugin Executor) add-on runs checks on a remote Linux host. NSClient++ also acts as an NRPE listener on Windows servers.

http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details

File Count Example

Using NSClient++ and a community-created command (Check Filecount), you can monitor the number of files in a directory on a Windows computer.

File Count Example – Service Definition

From services.cfg:

# Service definition
define service{
    use                    generic-service
    host_name              WINSERVER
    service_description    Temp File Count
    check_command          check_temp_files
    }

File Count Example – Command Definition

From commands.cfg:

# 'check_temp_files' command definition
define command{
    command_name    check_temp_files
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_temp_files
    }

File Count Example – NSClient++ Definition

From the NSClient++ nsclient.ini file on the Windows server being monitored:

[/settings/external scripts/scripts]
check_temp_files=c:\windows\system32\cscript.exe //NoLogo //T:30 C:\nrpe\directory_file_count\directory_file_count.wsf c:\\windows\\temp 50 100

Pre-Flight Check

Configuration changes don't take effect until Nagios is restarted. After making configuration changes and before restarting Nagios, run a pre-flight check to make sure Nagios doesn't find anything wrong with the configuration.

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
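
One way to make the check a habit is to tie the restart to it, so Nagios is only restarted when the check passes. This is a minimal sketch: the function name verify_then_restart and the three variables are assumptions for this example, with defaults taken from the paths and systemctl command used in these slides.

```shell
# Defaults follow the paths used in these slides; override them as needed.
: "${NAGIOS_BIN:=/usr/local/nagios/bin/nagios}"
: "${NAGIOS_CFG:=/usr/local/nagios/etc/nagios.cfg}"
: "${RESTART_CMD:=systemctl restart nagios}"

# verify_then_restart is a hypothetical wrapper: it runs the pre-flight
# check and restarts Nagios only if the check exits successfully.
verify_then_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        # Left unquoted on purpose so the command and its arguments are split.
        $RESTART_CMD
    else
        echo "Configuration check failed; Nagios was not restarted." >&2
        return 1
    fi
}
```

Run verify_then_restart as root after editing the configuration files.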

Starting Nagios

You'll need to restart the web server and start Nagios:

systemctl restart apache2
systemctl start nagios

Logging Into Nagios

Go to the Nagios web page and log in: http://nagioshost/nagios/

PNP4Nagios

PNP4Nagios is an add-on to Nagios that analyzes performance data provided by Nagios plugins, stores it in Round Robin Database (RRD) files, and graphs it.

https://docs.pnp4nagios.org/
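
Performance data is the part of a plugin's output that follows the "|" character, written as label=value pairs. The sketch below shows how that part can be separated from the status text. perfdata_of is a hypothetical helper, and the sample line in the usage note is only illustrative of what a check_ping result can look like.

```shell
# perfdata_of is a hypothetical helper: it prints the performance data part
# of a plugin output line, i.e. everything after the first "|".
perfdata_of() {
    case $1 in
        *'|'*) printf '%s\n' "${1#*|}" ;;
        *)     return 1 ;;   # this output carries no performance data
    esac
}
```

For an output line such as "PING OK - Packet loss = 0%, RTA = 0.80 ms|rta=0.800000ms;100.000000;500.000000;0.000000 pl=0%;20;60;0", the function prints the rta= and pl= fields, which are the values PNP4Nagios would graph.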

Prerequisites

● Perl 5.x or higher, without additional modules

● RRDtool 1.x or higher, better with 1.2

● Nagios 2.x or higher

Installation

https://docs.pnp4nagios.org/pnp-0.6/install

tar xf pnp4nagios-0.6.25.tar.gz
cd pnp4nagios-0.6.25
./configure
make all
make fullinstall

Configuration – Choose Mode

Bulk Mode with NPCD seems to be the only mode that works with Nagios Core 4.

https://docs.pnp4nagios.org/pnp-0.6/modes#bulk_mode_with_npcd

It is also the best way to process performance data because Nagios is not blocked while the data is processed. NPCD (Nagios Performance C Daemon) monitors the spool directory for new files and processes them.

Configuration - Enable Processing

Enable processing of performance data in /usr/local/nagios/etc/nagios.cfg:

process_performance_data=1

# service performance data
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file

# host performance data
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file

Configuration – Add Commands

Add new commands to /usr/local/nagios/etc/objects/commands.cfg:

# 'process-host-perfdata-file' command definition
define command{
    command_name    process-host-perfdata-file
    command_line    /bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata.$TIMET$
    }

# 'process-service-perfdata-file' command definition
define command{
    command_name    process-service-perfdata-file
    command_line    /bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata.$TIMET$
    }

Configuration - Templates

Add new templates to /usr/local/nagios/etc/objects/templates.cfg:

define host {
    name          host-pnp
    action_url    /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
    register      0
    }

define service {
    name          srv-pnp
    action_url    /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    register      0
    }

Verify Configuration

Download the verify_pnp_config Perl script from: https://docs.pnp4nagios.org/pnp-0.6/verify_pnp_config

Run it, specifying the mode, the location of the Nagios config file, and the PNP4Nagios config directory:

perl verify_pnp_config --mode=bulk+npcd --config=/usr/local/nagios/etc/nagios.cfg --pnpcfg=/usr/local/pnp4nagios/etc

Starting PNP4Nagios

Start PNP4Nagios as a daemon:

/usr/local/pnp4nagios/bin/npcd -d -f /usr/local/pnp4nagios/etc/npcd.cfg

You'll also need to restart the web server and Nagios:

systemctl restart apache2
systemctl restart nagios

Using PNP4Nagios

Click the graph icon next to hosts and services in the Nagios page to see the graphs of performance data.

PNP4Nagios Demo

Demo

NConf

NConf is a PHP-based web tool for configuring Nagios. It has features like templates, service to hostgroup assignment, and dependencies.

http://www.nconf.org

Prerequisites

● PHP 5.x or higher

● php-

● php-ldap (only if using LDAP auth)

● MySQL 5.0.2 or higher (with InnoDB!)

● Perl 5.6 or higher

● perl-DBI

● perl-DBD-MySQL

Installation - Prep

Note your web server document root, user, and group:

Apache document root: /srv/www/htdocs/
User: wwwrun
Group: www

Installation Guide: http://www.nconf.org/dokuwiki/doku.php?id=nconf:help:documentation:start:installation

Installation - Extract

Copy the tar file to the document root and expand it:

tar xf nconf-1.3.0-0.tgz

Grant permissions to the web server user:

chown wwwrun:www ./config
chown wwwrun:www ./output
chown wwwrun:www ./static_cfg
chown wwwrun:www ./temp

Installation – Create Database

Start the MySQL prompt:

mysql -u root -p

If the MySQL install is new, follow this to set the password: https://dev.mysql.com/doc/refman/5.0/en/default-privileges.html

Then create the database:

mysql> CREATE DATABASE NConf;
mysql> GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER, DROP ON `NConf`.* TO 'nconfuser'@'localhost' IDENTIFIED BY 'nconfp';
mysql> quit

GUI Installation Method

GUI (easy): http://localhost/nconf/INSTALL.php

1. Enter MySQL information.
2. Enter NConf and Nagios paths.
3. Set up authentication (use defaults).
4. Remove INSTALL, INSTALL.php, UPDATE, and UPDATE.php from the NConf directory.

Manual Installation Method

1. Change into the extracted NConf folder.
2. Create the database structure:
   mysql -u nconfuser -D NConf -p < INSTALL/create_database.sql
3. Configure NConf.
   1. Copy the contents of ./config.orig to ./config.
   2. Edit ./config/mysql.php, and set values for DBHOST, DBNAME, DBUSER, and DBPASS.
   3. Edit ./config/nconf.php, and set values for NCONFDIR and NAGIOS_BIN.
4. Remove INSTALL, INSTALL.php, UPDATE, and UPDATE.php from the NConf directory.

NConf Nagios Configuration Check

In order to enable NConf to check the Nagios configuration files, make sure your webserver user has access to your Nagios binary, or copy the binary to the '/srv/www/htdocs/nconf/bin/' folder and make the webserver user the owner.
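
The copy-and-chown option could look like the following sketch. copy_nagios_binary is a made-up function name; the paths and the wwwrun:www owner in the example call are the ones used elsewhere in these slides and may differ on other systems.

```shell
# copy_nagios_binary is a hypothetical helper: it copies the Nagios binary
# into NConf's bin folder and hands ownership to the web server user.
# $1: path to the Nagios binary, $2: destination folder, $3: owner (user:group)
copy_nagios_binary() {
    src=$1
    dest_dir=$2
    owner=$3
    mkdir -p "$dest_dir" &&
        cp "$src" "$dest_dir/" &&
        chown "$owner" "$dest_dir/$(basename "$src")"
}

# Example call (run as root), using the paths from these slides:
#   copy_nagios_binary /usr/local/nagios/bin/nagios /srv/www/htdocs/nconf/bin wwwrun:www
```

If the binary is copied rather than shared, it needs to be copied again after each Nagios upgrade so that NConf checks the configuration with the same version that runs it.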

Using NConf

Open the NConf page: http://nagioshost/nconf

NConf Demo

Demo