Nagios, PNP4Nagios, and NConf by Patrick TenHoopen

WMLUG July 2015

What is Nagios?

Nagios is an IT infrastructure monitoring and alerting tool. The free Nagios DIY Core provides the central monitoring engine and the basic web interface.

Current Version: 4.0.8 (2014-08-12)
Download: https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.0.8.tar.gz

Nagios Demo

Demo

Installation Prerequisites

● gcc
● apache2
● perl
● php
● rrdtool
● php5-gd
● php5-zlib
● php5-socket

Installation

Follow the Quick-Start Guides:
https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/quickstart.html

After installing, remember to configure the firewall on the Nagios server, if one is running, to allow HTTP access.

Installation, cont.

    tar xf nagios-4.0.8.tar.gz
    cd nagios-4.0.8
    ./configure --with-command-group=nagcmd
    make all
    make install
    make install-init
    make install-config
    make install-commandmode
    make install-webconf
    htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Nagios Plugins

Download: http://nagios-plugins.org/download/nagios-plugins-2.0.3.tar.gz

    tar xf nagios-plugins-2.0.3.tar.gz
    cd nagios-plugins-2.0.3
    ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    make
    make install

Configuration

Nagios comes with a default configuration for monitoring the host it is installed on (localhost.cfg), plus some other examples. The configuration files are stored in /usr/local/nagios/etc/objects/ and are plain text files written in a Nagios-specific format.
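To see which of those object files a given installation actually loads, it can help to read the cfg_file= lines out of the main configuration file. The following is a minimal sketch, not part of Nagios itself: list_cfg_files is a hypothetical helper name, and it assumes each entry is written as cfg_file=<path> at the start of a line, as in the sample nagios.cfg.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios): print the object files
# that a nagios.cfg-style file loads through its cfg_file= lines.
# Assumption: entries start at column 0, so commented-out lines
# such as "#cfg_file=..." are skipped automatically.
list_cfg_files() {
    sed -n 's/^cfg_file=//p' "$1"
}
```

With the install paths used in these slides, the helper would be called as list_cfg_files /usr/local/nagios/etc/nagios.cfg. After editing any of the listed files, running /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg asks Nagios to verify the configuration before the service is restarted.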
Detailed description of configuration files and options:
https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/objectdefinitions.html

Default Configuration Files

● commands.cfg - Check commands that are used in service definitions
● contacts.cfg - Who to contact if an alert is generated
● hosts.cfg - Hosts to monitor
● localhost.cfg - Basic config for the Nagios host
● printer.cfg - Sample config for printers
● services.cfg - Things on hosts to monitor
● switch.cfg - Sample config for switches
● templates.cfg - Definition templates used by hosts, services, etc.
● timeperiods.cfg - Notification times/hours of alerting
● windows.cfg - Sample config for a Windows machine

Configuration File Organization

You don't have to split the definitions across separate files; a single large configuration file works too. The cfg_file lines in /usr/local/nagios/etc/nagios.cfg control which files are loaded.

Note: If you want to import existing Nagios config files into NConf (discussed later), it works better if they are separated by function/type.

Templates

Configuration definitions use templates to supply default values for settings. This keeps each definition smaller and easier to update. If you modify a template, every definition that uses it picks up the change.

Generic Linux Host Template

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server        ; The name of this host template
        use                     generic-host        ; Inherits other values from generic-host template
        check_period            24x7                ; By default, Linux hosts are checked round the clock
        check_interval          5                   ; Actively check the host every 5 minutes
        retry_interval          1                   ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                  ; Check each Linux host 10 times (max)
        check_command           check-host-alive    ; Default command to check Linux hosts
        notification_period     workhours           ; Only notify during the day
                                                    ; Note that the notification_period variable is being
                                                    ; overridden from the value that is inherited from the
                                                    ; generic-host template!
        notification_interval   120                 ; Resend notifications every 2 hours
        notification_options    d,u,r               ; Only send notifications for specific host states
        contact_groups          admins              ; Notifications get sent to the admins by default
        register                0                   ; DONT REGISTER THIS DEFINITION
        }

Generic Service Template

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The 'name' of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized
                                                        ; (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service 'freshness'
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

Commands

Nagios comes with several commands for checking services, and more are installed with the Nagios plugins. They are located in the /usr/local/nagios/libexec/ directory.
Some examples: check_disk, check_http, check_log, check_nt, check_ping

Community Check Commands

You can download command definitions created by the Nagios community by browsing the plugin exchange at:
http://exchange.nagios.org/directory/Plugins

Custom Commands

You can also create custom check commands using scripts or custom programs. The script or program just needs to return one of the exit statuses that Nagios expects:

UNKNOWN = 3, CRITICAL = 2, WARNING = 1, OK = 0

Example Check Definition

$USER1$, $HOSTADDRESS$, $ARG1$, and $ARG2$ are Nagios macros. Nagios replaces them with actual values when the check runs: $USER1$ is the plugin directory, $HOSTADDRESS$ comes from the host definition, and $ARG1$ and $ARG2$ are the arguments passed in from the service definition. -w is the warning threshold. -c is the critical threshold.

# 'check_ping' command definition
define command{
        command_name    check_ping
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }

Example Host Definition

define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will
                                        ; inherit all variables that are
                                        ; defined in (or inherited by)
                                        ; the linux-server host template
                                        ; definition.
        host_name       localhost
        alias           localhost
        address         127.0.0.1
        }

Example Service

Note that the command parameters are delimited by an "!". The parameters are used in the check definition ($ARG1$, $ARG2$, etc.).

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service
                                                ; template to use
        host_name               localhost
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

Host Groups

By using host groups, you can easily set up checks for a set of hosts with one service definition. You can create a new config file named hostgroups.cfg.
define hostgroup{
        hostgroup_name  linux-servers   ; Name of the hostgroup
        alias           Linux Servers   ; Long name of the group
        members         localhost,linuxbox1,linuxbox2   ; Comma separated list of
                                                        ; hosts that belong to this group
        }

define service{
        use                     local-service   ; Name of service template to use
        hostgroup_name          linux-servers
        service_description     PING-LINUX-HOSTS
        check_command           check_ping!100.0,20%!500.0,60%
        }

Parent/Child Relationships

By defining which other hosts a host depends on, Nagios can distinguish between down and unreachable states for that host. For example, if Nagios monitors a host that sits behind a switch and the switch goes down, Nagios can no longer ping the host. It then alerts only that the switch is down and treats the host as unreachable rather than down.

Parents Setting

When defining a host, use the "parents" setting to establish the parent/child relationship.

define host{
        host_name       Nagios          ; Nagios host has no parent
        }

define host{
        host_name       Switch1
        parents         Nagios
        }

define host{
        host_name       OtherHost
        parents         Switch1
        }

Parent/Child Relationship Picture

Pictorial representation:
https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/networkreachability.html

NSClient++

With the NSClient++ add-on, you can easily set up checks on Windows servers.
http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NSClient%2B%2B/details

NRPE

The NRPE (Nagios Remote Plugin Executor) add-on runs checks on a remote Linux host. On a Windows server, NSClient++ can also act as the NRPE listener.
http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details

File Count Example

Using NSClient++ and a community-created command (Check Filecount), you can monitor the number of files in a directory on a Windows computer.
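The same kind of check can be tried locally on a Linux host to see how a custom command reports its state. The sketch below is not the community Check Filecount command; check_file_count, its three arguments, and the FILECOUNT output prefix are assumptions made for illustration. It follows the exit-status convention from the Custom Commands slide and appends performance data after a "|" in the label=value;warn;crit form, which is the part a grapher such as PNP4Nagios reads.

```shell
#!/bin/sh
# Hypothetical custom check: count regular files directly inside a directory.
# Usage (assumed): check_file_count <directory> <warn_count> <crit_count>
# Exit statuses: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
check_file_count() {
    dir=$1; warn=$2; crit=$3

    # Thresholds must be non-negative integers; otherwise report UNKNOWN.
    for n in "$warn" "$crit"; do
        case $n in
            ''|*[!0-9]*)
                echo "FILECOUNT UNKNOWN - thresholds must be non-negative integers"
                return 3 ;;
        esac
    done

    # A check that cannot be evaluated is UNKNOWN, never OK.
    if [ ! -d "$dir" ]; then
        echo "FILECOUNT UNKNOWN - $dir is not a directory"
        return 3
    fi

    # Count files at the top level only; subdirectories are ignored.
    count=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
    perf="files=$count;$warn;$crit"

    if [ "$count" -ge "$crit" ]; then
        echo "FILECOUNT CRITICAL - $count files in $dir | $perf"
        return 2
    elif [ "$count" -ge "$warn" ]; then
        echo "FILECOUNT WARNING - $count files in $dir | $perf"
        return 1
    fi
    echo "FILECOUNT OK - $count files in $dir | $perf"
    return 0
}

# Behave as a plugin only when called with all three arguments.
if [ "$#" -eq 3 ]; then
    check_file_count "$1" "$2" "$3"
    exit $?
fi
```

Saved as an executable script in the plugin directory, this could be referenced from a command definition in the same way as check_ping, with the directory and the two thresholds passed as $ARG1$, $ARG2$, and $ARG3$. Comparing with "greater than or equal" means a count that exactly matches a threshold already raises that state.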
File Count Example - Service Definition

From services.cfg:

# Service definition
define service {
        use                     generic-service
        host_name               WINSERVER
        service_description     Temp File Count
        check_command           check_temp_files
        }

File Count Example - Command Definition

From commands.cfg:

# 'check_temp_files' command definition
define command{
        command_name    check_temp_files
        command_line