IRC Control Bot for Zabbix Monitoring System
MASARYK UNIVERSITY
FACULTY OF INFORMATICS

IRC control bot for Zabbix monitoring system

BACHELOR'S THESIS

Filip Zachar

Brno, Fall 2015

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Filip Zachar

Advisor: RNDr. Adam Rambousek, Ph.D.

Acknowledgement

I would like to thank my supervisor RNDr. Adam Rambousek, Ph.D., who supported me while I was writing this thesis and provided useful feedback. I would also like to thank Marek Mahut for advice and consultation regarding technical details of the Zabbix monitoring system and for feedback provided during the test stage of this work.

Abstract

Applications these days run as distributed systems consisting of many parts working together. This thesis discusses the necessity of monitoring these parts and describes the design and implementation of an IRC control bot for the Zabbix monitoring system that provides extended features through the IRC communication network.

Keywords

Zabbix, monitoring, IRC, DevOps, Ruby

Contents

1 Introduction
2 Monitoring
  2.1 Monitoring system
  2.2 MRTG - Multi Router Traffic Grapher
  2.3 Cacti
  2.4 Nagios
  2.5 Zabbix
    2.5.1 Architecture
    2.5.2 Data collection
    2.5.3 Data Visualization
    2.5.4 Alerts and Triggers
    2.5.5 Maintenance
    2.5.6 API
3 IRC - Internet Relay Chat
  3.1 Architecture
  3.2 Conferencing
  3.3 Bot
4 Zabbirc
  4.1 Goal
  4.2 Architecture
    4.2.1 Zabbix API
    4.2.2 IRC API
  4.3 Implementation
    4.3.1 Zabbix component
    4.3.2 Services component
    4.3.3 IRC component
  4.4 Features
    4.4.1 Events
    4.4.2 Hosts
    4.4.3 Maintenance
    4.4.4 Settings
  4.5 Installation
5 Conclusion
Index
Bibliography

List of Tables

3.1 Operator privileged actions
3.2 Operator privileged actions

List of Figures

2.1 Screenshot of MRTG web page [4]
2.2 Screenshot of RRDtool generated graph
2.3 Screenshot of Cacti web interface
2.4 Screenshot of Nagios web interface
2.5 Zabbix deployment model for large environments
2.6 Zabbix deployment model using proxy servers
2.7 Graph showing three items in stacked format
2.8 Map representing physical infrastructure of a web service
2.9 Screen showing map with appropriate graphs alongside
2.10 Trigger rules
2.11 JSON-RPC API login request
3.1 IRC network
4.1 Zabbirc basic architecture
4.2 Zabbirc components interaction
4.3 Definition of the matchers for Zabbirc commands
4.4 Event notification and acknowledgment
4.5 Host status reporting
4.6 Installation steps for Zabbirc

1 Introduction

More than 3 billion people now use the Internet on a daily basis, and the number is still growing [1]. But the Internet no longer consists of just simple hypertext pages, as it did in the early World Wide Web era.
It is full of dynamic web applications and services such as social networks, video streaming services, storage exchange services, and even office-like applications that used to exist mainly as desktop software. These applications serve millions of people at once. To achieve this scale of availability, they are no longer implemented on a single server but as more sophisticated distributed systems.

A distributed system consists of many moving parts working together, so more points of failure are present. The whole system can be designed to operate with some parts missing, but to achieve the best reliability the parts should be monitored in order to prevent possible failures or to react as soon as possible to those that have occurred.

In Chapter 2 we describe what monitoring of a system consists of and discuss several existing monitoring solutions. Later in the chapter the thesis focuses on the Zabbix monitoring system, its capabilities, and the options for extending it.

Chapter 3 describes the IRC (Internet Relay Chat) protocol, which is used for large-scale communication. It describes the architecture of an IRC network and discusses the applicability of IRC bots.

The main goal of this thesis is to design and implement an IRC control bot that will serve as a gateway for controlling a Zabbix monitoring system. The architecture, implementation, and features of the bot are described in Chapter 4. The chapter also justifies the technologies and libraries used to create the bot.

2 Monitoring

Most of the applications we use on a daily basis provide us with a simple interface for accomplishing the task at hand, yet they are much more complicated under the hood when we look at the architectural and implementation details. What started as a single-server service maintained by one system administrator can easily grow into a distributed system that sits in the cloud.
This more complex distributed architecture allows service providers to handle the enormous number of customers that use various Internet services these days and to provide more redundancy, availability, and speed. With a distributed environment like that, it is really important to keep all the required units in a healthy state.

A company that is creating a product has to make an important decision about its deployment:

• It can build its own bare-metal infrastructure to run the product. If it chooses this option, it has to provide all the maintenance that the infrastructure needs to ensure the product's availability. To accomplish this, the use of various monitoring systems is recommended.

• It can use one of the existing cloud platforms to host its application and thus bring a layer of abstraction to the infrastructure. It defines the required services in a declarative way, and the cloud platform takes care of fulfilling the defined requirements.

Regardless of which option the company chooses, there must be bare-metal infrastructure somewhere underneath. The cluster of devices in this infrastructure forms a computer network that faces several challenges. The amount of data flowing through the network is almost constantly growing. Application data, media streams, backups, database queries, and replication tend to saturate bandwidth just as much as they eat up storage space. To avoid congestion in the network and running out of storage space on the nodes, system administrators need a good overview of the infrastructure status, obtained by visualizing the right data in the right way. This can be achieved by using a monitoring system.

In this chapter we will talk about what a monitoring system is and what it should provide. We look at several existing monitoring systems and briefly discuss their advantages, disadvantages, and main characteristics.

2.1 Monitoring system

A monitoring system is a piece of software that collects data from several sources, analyzes the data, and provides a sophisticated visualization of it.

The data source can be any component in the network. A monitoring system usually supports data collection by standard methods such as SNMP (Simple Network Management Protocol), so any device that implements this protocol is a potential data source for the monitoring system. It can also use a custom API (Application Programming Interface) to retrieve information from the monitored devices using the agent approach. An agent is a small program running on the monitored device that gathers information about the device and communicates with the monitoring system using an agreed protocol. Agents are mostly used to monitor computers in the network, since agents require more computing power.

The collected data are analyzed using the rules configured on the monitoring system. The monitoring system checks whether any threshold has been exceeded and takes appropriate action to respond to the event.

The data visualization consists of various graphs generated from the data. These graphs show the data in a more informative way and also show historical information about the metric.

Several monitoring systems are available with good community and enterprise support. This thesis considers four of them: MRTG, Cacti, Nagios, and Zabbix.

2.2 MRTG - Multi Router Traffic Grapher

Multi Router Traffic Grapher (MRTG) was initially just a Perl script that used external utilities to perform SNMP queries and create GIF images for display on HTML pages (Figure 2.1). The script was executed every 5 minutes and showed accumulated data in graphs for the last day, week, month, and year. It was written in 1995 by Tobias Oetiker, who was working at De Montfort University in Leicester, United Kingdom, as a System Administrator Trainee. At that time the university had a 64 kbit/s Internet link, and the management was not planning to upgrade this link any time soon. The performance data provided by MRTG proved to be a key argument in convincing management of the necessity of a faster Internet link [2, 3].

Figure 2.1: Screenshot of MRTG web page [4]

One of the main problems in the first version of MRTG was performance. Monitoring 10 switch links worked fine, but in larger environments it reached its limits.