Advanced Monitoring Upload.Key

Advanced System Monitoring with Nagios, PNP4Nagios and NConf
Josh Malone
Systems Administrator
National Radio Astronomy Observatory
Charlottesville, VA

Nagios is great
It checks your servers
It tells you when there are problems
But…
Services keep expanding…
We work in larger teams
We all want to work on things at the same time
Management demands data
You need the right tools
We Need to Engineer a Monitoring Solution That Goes to 11!

The Right Addons
• PNP4Nagios
  • Graph the data from your service checks
  • https://github.com/lingej/pnp4nagios
  • https://docs.pnp4nagios.org/pnp-0.6/
• NConf
  • Web-based Nagios configurator
  • http://www.nconf.org/dokuwiki/doku.php
  • https://github.com/nconf/nconf

The Right Plugins
• Online plugin repositories
  • Nagios Exchange
  • Icinga Exchange
  • Monitoring Plugins
• But… if you want something done right
  • Write it yourself!
  • …and write it RIGHT!

PNP4Nagios
Performance Data + Graphing

Nagios Performance Data
• Check plugins can optionally return “performance data” (‘perfdata’)
• Perfdata is just any metric associated with a check
  • Response time (seconds, ms)
  • Web page size (bytes, kb)
  • Network throughput (bits/sec, kB/sec, mb/s)
  • Room temperature (F, C)

Perfdata Output
    ./check_ping -H 184.6.0.1 -w 100,2% -c 200,5%
    PING OK - Packet loss = 0%, RTA = 56.56 ms|rta=56.563000ms;100.000000;200.000000;0.000000 pl=0%;2;5;0
• All output is on STDOUT
• Vertical bar separates “screen output” from performance data

Support By Plugins
• Not all plugins report performance data
• Some plugins require a command-line flag to activate perfdata output
• Some plugins output things that could be perfdata, but they do it in the screen output
  • Wrap these plugins in a script to parse screen output and reformat it as proper perfdata

Performance Data Handling
• Nagios does not natively do much with performance data
• Perfdata must be passed to an add-on for it to be useful
• Nagios comes with sample commands for processing perfdata
  • process-host-perfdata
  • process-service-perfdata

Getting Perfdata into PNP
• misccommands.cfg - redefine perfdata commands
    define command {
        command_name  process-service-perfdata
        command_line  /usr/local/nagios/libexec/process_perfdata.pl
    }
    define command {
        command_name  process-host-perfdata
        command_line  /usr/local/nagios/libexec/process_perfdata.pl -d HOSTPERFDATA
    }

Understanding RRDs
• RRD is a “Round Robin Database”
• Data in an RRD is stored as sets of averages
  • 1 minute, 5 min, 15 min, 1 hr, 6 hr, 12 hr, etc.
• File never grows, but resolution is lost with time
• Maximum time to hold data is set when the RRD is created (number of slots for each time ‘bin’)
• PNP4Nagios holds enough data for 4 years by default

Multi-value graphs
• Graphs can overlay multiple values from one RRD

Perfdata Processing Modes
Synchronous
• The PNP processor is invoked after each and every service check
• RRDs are updated immediately after each service check
• Number of perl execs can cause high load
Bulk mode
• Perfdata is accumulated in a flat file after each service check
• PNP processor is called every 30 seconds and handles all data from file
• Reduced PNP load

Increase Graph Data Age
• PNP4Nagios shows graphs out to 1 year by default
• The default RRDs hold data for 4 years
• All that’s missing is some links for older data
• Defined in the $views array in config_local.php
    $views[] = array('title' => 'Two Years', 'start' => (3600*24*740) );
  • 740 is roughly the number of days in 2 years

Using PNP4Nagios
PNP4Nagios Overview

PNP4Nagios Menus
• Switch to a different host right from PNP screen
• Select date range
• Create PDF export

Using the Basket
• Basket can be used to combine graphs from multiple hosts into a single page
• Use in combination with PDF export to generate printable/mailable summaries for others
  • Management, vendors, etc.

Templates
• Templates define how the perfdata is displayed
• PNP4Nagios looks for a template with the same name as the check command
  • Falls back to a default if not found
• Define how to present values from the RRDs
• Written in PHP so you can do any kind of processing you like (scaling, coloring, etc.)
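
The template lookup described above (a template named after the check command, otherwise a default) can be sketched in a few lines. This is only an illustration in Python, not PNP4Nagios's own PHP code: the function name `find_template`, the `search_dirs` parameter and the `default.php` file name are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def find_template(check_command, search_dirs, default_name="default"):
    """Return the first template named after the check command, else the default.

    Hypothetical helper: a command-specific template in any directory wins
    over the default template; None means nothing was found at all.
    """
    for name in (check_command, default_name):
        for directory in search_dirs:
            candidate = Path(directory) / f"{name}.php"
            if candidate.is_file():
                return candidate
    return None


# Small demonstration with a throwaway template directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "default.php").touch()
    (Path(tmp) / "check_ping.php").touch()
    print(find_template("check_ping", [tmp]).name)  # prints "check_ping.php"
    print(find_template("check_disk", [tmp]).name)  # prints "default.php"
```

Trying the specific name in every directory before the default keeps a local template from being shadowed by a default in an earlier directory.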

Using templates to tune graphs
• Define command line options to rrdtool
  • $opt[$key] = "-X 0 --height 200 --vertical-label 'foo' --title 'Graph Title' "
  • Tells rrdgraph not to power-scale the Y axis, sets Y axis label and graph title and makes graphs taller
• Divide a value by 1024 and call the result ‘gb’
  • $def[$key] .= "CDEF:gb=var1,1024,/ ";
  • Converts MB to GB

NConf
Web-based GUI configurator

NConf
• Web-based GUI configurator for Nagios
• Stores config objects in MySQL database
• Generates Nagios config files from DB for deployment to Nagios servers
• Deployment is scriptable (SCP, rsync, etc.)
• NConf need not run on the Nagios server itself

Installation: Pre-requisites
• MySQL with InnoDB
• OS packages
  • apt-get install libdbi-perl php5-mysql gcc
  • yum install perl-DBI perl-DBD-mysql
• PHP
  • short_open_tag = On
  • register_globals = Off
  • magic_quotes_gpc = Off

Install
• Un-tar files into web server document area
• config/mysql.php for database server/user/pass
• config/authentication.php - AD, sql, file or basic auth
• config/deployment.ini - How to deploy conf files to Nagios instance

Local Deployment
    [local deployment]
    type = local
    source_file = /etc/nconf/output/NagiosConfig.tgz
    target_file = /etc/nagios
    action = extract
    reload_command = "sudo /etc/init.d/nagios reload"

Importing Existing Configs
• NConf can import existing config files, but the process must be done in multiple steps
• Each type of object (hosts, services, commands, contacts, etc.)
must be imported separately and in the correct order (contacts before contact groups)
• Nagios object cache lists all objects sorted by type
• See the Import Guide

Extending the Schema
• Some Nagios configuration attributes aren’t supported by NConf out of the box
• Luckily, the configuration schema/data model used by NConf is extensible
  • Administration • Attributes • Add
• Back up your database before changing the schema!
• Example attribute values: contacts | Contacts | People to notify about this host | host | assign-many | contact

Check Plug-Ins

Must-have plugins
• check_openmanage - Monitor Dell servers with OMSA
• check_netappfiler.py
  • Old, but still works great
  • Uses SNMP, compatible with OnTap 7-Mode
  • Comes with PNP templates
  • https://github.com/wAmpIre/check_netappfiler
• check_logfiles
  • https://github.com/lausser/check_logfiles
  • Scans logfiles for patterns indicating Warning, Critical or OK states
  • Handles rotated logfiles
  • Detects recovery strings as well
  • Can use external config files for complex checks
• check-cisco.pl
  • Cisco router / switch CPU, PSU, temp
  • https://github.com/ranl/monitor-utils
• Synology status (check_snmp_synology)
  • Check health, RAID, disk temps, storage
  • Available on Nagios Exchange

Writing Check Plug-ins
Have no fear - Write exactly the plugin you need

Custom Plugins
• Nagios can monitor anything you can write a script to check
• Simple API
• You can write plugins in ANY language you choose!
  • bash, python, tcl, expect
  • perl (Nagios has embedded perl interpreter for speed)
  • C, C++

Plugin API
• Exit code determines check state
  • 0 - OK
  • 1 - Warning
  • 2 - Critical
  • 3 - Unknown
• Stdout is for human-readable notices; Nagios does not use it to determine state
• Perfdata written on stdout, after vertical bar
• Multiple lines allowed - up to 4 kB
• http://nagios.sourceforge.net/docs/3_0/pluginapi.html

Writing plugins in Perl
• Nagios provides utils.pm
  • Provides %ERRORS hash
  • Maps status names to exit codes
  • $ERRORS{'CRITICAL'}
• You can use my template as a starting point
  • https://github.com/48kRAM/nagios-plugins/tree/master/Template
  • Command-line parsing, threshold parsing, output formatting

Writing Good Plugins
• Keep default output short and to the point
  • Suitable for SMS messages, pagers, etc.
  • Easy to parse in a time-critical situation
  • Remember: Nagios should help you fix the problem!
• Call external binaries by their full path
  • Make it configurable on the cmdline or at the top of the script in a variable
• Watch out for long runtimes or hung processes
  • Perl: Use alarm (standard function)
  • Bash/Sh: Use timeout (coreutils)
• Avoid temp files in case your disk is full, out of file handles, etc.
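
The plugin rules above (state from the exit code, short text on stdout, perfdata after the vertical bar, a guard against hung checks) fit in a small skeleton. The following is a minimal sketch in Python rather than the Perl template mentioned earlier. The names `evaluate`, `format_output` and `main`, the `measure` callable and the "higher is worse" threshold logic are all assumptions for illustration, and the alarm-based timeout only works on Unix-like systems.

```python
import signal
import sys

# Exit codes from the Plugin API slide.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measurement to a state; in this sketch higher values are worse."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(label, value, unit, warn, crit):
    """Build one output line: short text, vertical bar, then perfdata."""
    state = evaluate(value, warn, crit)
    text = f"{label.upper()} {STATE_NAMES[state]} - {label} = {value}{unit}"
    perfdata = f"{label}={value}{unit};{warn};{crit}"
    return state, f"{text} | {perfdata}"


def main(measure, timeout_s=10):
    """Run `measure` (a hypothetical callable) under an alarm-based timeout."""
    def on_timeout(signum, frame):
        print("UNKNOWN - check timed out")
        sys.exit(UNKNOWN)

    signal.signal(signal.SIGALRM, on_timeout)
    signal.alarm(timeout_s)      # abort if the measurement hangs
    value = measure()
    signal.alarm(0)              # cancel the pending alarm
    state, line = format_output("rta", value, "ms", warn=100, crit=200)
    print(line)
    sys.exit(state)              # the exit code is what Nagios acts on
```

A real plugin would call `main` with a function that performs the actual measurement, and would parse the warning and critical thresholds from the command line instead of hard-coding them.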