Project Report - CERN Summer Student Program 2014

Icinga Monitoring System Interface

The University of Manchester - Department of Science

Author: Alina-Georgiana Neculae
Supervisor: Olivier Chaze

August 23, 2014

Abstract

The aim of this project is to develop a web interface for the Icinga monitoring system, which is used to manage the CMS online cluster at the experimental site. The interface allows users to visualize the information in a compact and intuitive way, to modify the information of each individual object, and to edit the relationships between classes.

Acknowledgments

I would like to thank my supervisors Olivier Chaze and Lavinia Darlea for their continued support and careful guidance over the course of this project.

Contents

1 Introduction
  1.1 Motivation

2 Database motivation and configuration
  2.1 Database

3 Tools used for development
  3.1 Front end
  3.2 Back end

4 Interface
  4.1 Read-only interface
  4.2 Read-write interface

5 Conclusions and future work

Chapter 1

Introduction

Icinga is an open source network and computer system monitoring application, which was originally created as a fork of Nagios in 2009. Since then, the developers of Icinga have substantially improved the software in order to overcome some of Nagios's perceived shortcomings.

Reflecting the needs of the community, the features added to Icinga include a modern Web 2.0 style user interface, additional database connectors for MySQL, Oracle and PostgreSQL, and a REST API that allows the integration of numerous extensions without any additional modifications to the Icinga core.

The CMS System Administration Team uses Icinga to manage between 2500 and 3000 devices and their services for the online cluster at the experimental site. At the moment, the software relies on a flat file configuration to retrieve the cluster organization and any device-specific information.

Motivation

The need for this project arises from the fact that each time the system is updated or a new device is introduced, the Icinga flat files have to be modified manually for the change to be seen by the software. In response to this problem, a database containing all the relevant information about the cluster was created, and a series of scripts uses it to generate the flat files required by Icinga.

The project presented in the following chapters is a front end to this database. It enables users to view the environment in a more structured and compact way, and it allows operations such as the insertion or removal of hosts to be monitored and the addition, removal or modification of checks.

In addition, time permitting, the interface is to be integrated with Puppet, the central management system used to install and remember the profile of each machine. Puppet would then be used to determine the processes that need to be monitored and would update the database accordingly, through the developed interface.

Chapter 2

Database motivation and configuration

The CMS System Administration Team currently monitors over 2500 devices, with a total of 50 000 checks. The Icinga flat files document which hosts need to be monitored and which checks need to be performed on each of them, for example available disk space or the number of running processes. When a new machine is introduced into the environment, or the role of a machine is modified, the changes have to be made manually in each file. This leaves room for errors and omissions, and as a result what is currently monitored no longer covers the entire infrastructure.

Because the structure of the configuration files is very similar to a database schema, it was decided that translating the files into such a format would greatly facilitate any required modifications to the system, especially since the database content can be translated automatically into the flat files required by Icinga by applying a series of scripts.
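As an illustration of the kind of translation script described above, the following minimal sketch renders host definitions from rows of the Host table. It is not the team's actual script: the function name is hypothetical, an in-memory SQLite database stands in for the MySQL database, the mapping of FK_H_TPName to the check_period directive is an assumption, and a real Icinga host definition needs further directives that are omitted here.

```python
import sqlite3


def render_host_definitions(conn):
    """Return one 'define host' block per row of the Host table (sketch only)."""
    rows = conn.execute(
        "SELECT HName, FK_H_TPName FROM Host ORDER BY HName"
    ).fetchall()
    blocks = []
    for host_name, time_period in rows:
        # Assumption: the time period referenced by the host is its check period.
        blocks.append(
            "define host{\n"
            f"    host_name     {host_name}\n"
            f"    check_period  {time_period}\n"
            "}\n"
        )
    return "\n".join(blocks)


# In-memory SQLite stands in for the MySQL database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Host (HName CHAR(100) PRIMARY KEY, FK_H_TPName CHAR(100))"
)
conn.executemany(
    "INSERT INTO Host VALUES (?, ?)",
    [("node-b", "24x7"), ("node-a", "workhours")],  # made-up example rows
)
config_text = render_host_definitions(conn)
print(config_text)
```

Regenerating the whole file from the database on every change, rather than patching files in place, keeps the database as the single source of truth.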

Database

The database structure shown in Figure 2.1 describes the current configuration of the CMS cluster. The diagram shows how the main groups of objects interact and form dependencies, as well as the main characteristics that define each class. The central classes used by the system are Host, Service, Command and Contact.

Each fundamental class is then organized into groups, depending on the properties and requirements its members have in common. Groups are also used to describe dependencies between large sets of basic objects. One example is the ContactGroupHost table, which records that a group of contacts is responsible for a particular host and should be contacted in case of emergencies.

Understanding the complexity of the relationships that can be described by the central classes was the first step required for the completion of this project. One of the main requirements was to develop a user-friendly and intuitive way of displaying these complex relationships while maintaining the readability and usability of the system.
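The ContactGroupHost relationship mentioned above can be resolved to individual contacts with a single join against ContactContactGroup. The sketch below uses the table and column names from the schema, but the function name and the sample rows are hypothetical, and SQLite is used only as a stand-in for MySQL.

```python
import sqlite3


def contacts_for_host(conn, host_name):
    """Return the sorted contacts whose contact group is attached to host_name."""
    rows = conn.execute(
        """
        SELECT DISTINCT ccg.FK_CCG_CName
        FROM ContactGroupHost AS cgh
        JOIN ContactContactGroup AS ccg
          ON ccg.FK_CCG_CGName = cgh.FK_CGH_CGName
        WHERE cgh.FK_CGH_HName = ?
        ORDER BY ccg.FK_CCG_CName
        """,
        (host_name,),  # parameter binding avoids building SQL from user input
    ).fetchall()
    return [name for (name,) in rows]


# Made-up sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ContactGroupHost (FK_CGH_CGName CHAR(100), FK_CGH_HName CHAR(100));
    CREATE TABLE ContactContactGroup (FK_CCG_CName CHAR(100), FK_CCG_CGName CHAR(100));
    INSERT INTO ContactGroupHost VALUES ('sysadmins', 'node-a'), ('network', 'node-b');
    INSERT INTO ContactContactGroup VALUES
        ('alice', 'sysadmins'), ('bob', 'sysadmins'), ('carol', 'network');
    """
)
print(contacts_for_host(conn, "node-a"))
```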

Tables shown in the schema diagram (column name and type):

Host: HName CHAR(100), FK_H_TPName CHAR(100)
HostGroup: HGName CHAR(100)
HostHostGroup: FK_HHG_HGName CHAR(100), FK_HHG_HName CHAR(100)
HostDependency: FK_HD_HName CHAR(100), FK_HD_PName CHAR(100)
HostGroupDependency: FK_HGD_HGName CHAR(100), FK_HGD_MemberName CHAR(100)
ScheduledDowntime: ScheduledDowntimeID INT(10), SDTComment CHAR(100), SDTAuthor CHAR(100), SDTStartTime DATETIME, SDTEndTime DATETIME
HostScheduledDowntime: FK_HSDT_HName CHAR(100), FK_HSDT_ScheduledDowntimeID INT(10)
TimePeriod: TPName CHAR(100), TPMondayStart TIME, TPMondayEnd TIME, TPTuesdayStart TIME, TPTuesdayEnd TIME, TPWednesdayStart TIME, TPWednesdayEnd TIME, TPThursdayStart TIME, TPThursdayEnd TIME, TPFridayStart TIME, TPFridayEnd TIME, TPSaturdayStart TIME, TPSaturdayEnd TIME, TPSundayStart TIME, TPSundayEnd TIME
Service: SName CHAR(100), SService_description CHAR(100), SActive_checks_enabled TINYINT(4), SPassive_checks_enabled TINYINT(4), SParallelize_check TINYINT(4), SObsess_over_service TINYINT(4), SCheck_freshness TINYINT(4), SNotifications_enabled TINYINT(4), SEvent_handler_enabled TINYINT(4), SFlap_detection_enabled TINYINT(4), SFlap_detection_options CHAR(100), SFailure_prediction_enabled TINYINT(4), SProcess_perf_data TINYINT(4), SRetain_status_information TINYINT(4), SRetain_nonstatus_information TINYINT(4), SIs_volatile TINYINT(4), SCheck_period CHAR(100), SMax_check_attempts TINYINT(4), SNormal_check_interval INT(10), SCheck_interval INT(10), SRetry_check_interval INT(10), SContact_groups CHAR(100), SNotification_options CHAR(100), SNotification_interval INT(11), SNotification_period CHAR(100), SRegister TINYINT(4), SAction_url CHAR(100), SParameter CHAR(100), FK_S_SName CHAR(100), FK_S_CName CHAR(100), FK_S_HGName CHAR(100)
ServiceGroup: SGName CHAR(100)
ServiceServiceGroup: FK_SSG_SName CHAR(100), FK_SSG_SGName CHAR(100)
ServiceHost: FK_SH_HName CHAR(100), FK_SH_SName CHAR(100)
ServiceHostGroup: FK_SHG_HGName CHAR(100), FK_SHG_SName CHAR(100)
Contact: CName CHAR(100), CAlias CHAR(100), CMail CHAR(100), CPhone CHAR(100)
ContactGroup: CGName CHAR(100)
ContactContactGroup: FK_CCG_CName CHAR(100), FK_CCG_CGName CHAR(100)
ContactGroupHost: FK_CGH_CGName CHAR(100), FK_CGH_HName CHAR(100)
ContactGroupService: FK_CGS_CGName CHAR(100), FK_CGS_SName CHAR(100)
Command: CName CHAR(100), CLine VARCHAR(500)
CommandContact: FK_CC_ContactName CHAR(100), FK_CC_CommandName CHAR(100), CC_NotificationType CHAR(100)

Figure 2.1: Database schema used to generate the Icinga configuration files.

Chapter 3

Tools used for development

The tools used for the development of this project were chosen after careful consideration. The main criteria were: the documentation and resources available, since the system has to be easily maintainable; the scalability of the resulting application; the stability and speed offered; and the complexity of the environment that the tools require in order to function.

Front end

For this part of the interface a number of tools were investigated, and the JavaScript InfoVis Toolkit was chosen as the main library for visualizing the data and the relationships formed. This library provides web-standards-based tools for creating interactive data visualizations for web applications. Besides providing a large number of visualization techniques, such as PieChart, Sunburst, Force-Directed graphs and many more, it is also very fast, relying on JSON objects to generate diagrams. Furthermore, it allows a large degree of customization, such as touch events, drag and drop, and a large number of other complex animations that can use any style property required.

The remaining interface elements were developed using HTML5 and the Twitter Bootstrap library, which are widely used and well documented online. These tools were selected for their ability to deliver rich content without needing any external plugins, their speed of execution and the increased stability they offer the application.
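To make the JSON-driven approach concrete, the sketch below builds the nested id / name / data / children structure that the toolkit's tree visualizations load, here for one host group, its hosts and their services. It is written in Python purely for illustration (the project's back end is PHP); the function name, the id scheme and the "type" entries in data are assumptions, not taken from the project.

```python
import json


def build_tree(group_name, hosts_to_services):
    """Build a nested id/name/data/children dict for one host group."""
    children = []
    for host, services in sorted(hosts_to_services.items()):
        children.append({
            "id": f"host:{host}",
            "name": host,
            "data": {"type": "host"},
            "children": [
                {
                    # The host name is part of the id so that the same service
                    # attached to two hosts still yields unique node ids.
                    "id": f"service:{host}:{service}",
                    "name": service,
                    "data": {"type": "service"},
                    "children": [],
                }
                for service in services
            ],
        })
    return {
        "id": f"hostgroup:{group_name}",
        "name": group_name,
        "data": {"type": "hostgroup"},
        "children": children,
    }


# Made-up example data.
tree = build_tree("webservers", {"node-b": ["ping"], "node-a": ["disk", "procs"]})
print(json.dumps(tree, indent=2))
```

The serialized result would be sent to the browser and handed to the visualization, so the server-side code only has to produce plain data rather than markup.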

Back end

Even though a NoSQL database could allow the application to extract more performance from the system, increasing the speed of the operations and the overall responsiveness of the application, the downsides of this approach were too great for this project. One problem was that the time required to learn the different concepts and best practices was too long for the time allocated to the project. Another was that the complexity of the application lies not in the database queries but in the functionality behind them, so faster queries would bring little benefit. As a result, MySQL was chosen as the database.

For the last development tool required, the choice was between Python and PHP. Both languages are supported by large developer communities, are high-level, and are extremely portable, since they run on almost any platform without recompilation. PHP was chosen due to the simplicity of the environment it requires and the fact that it is commonly installed, which reduces the level of maintenance required while offering the same degree of functionality.

Chapter 4

Interface

The development of the Icinga configuration interface was divided into two main parts: the read-only interface, aimed mainly at general users as an easier way of visualizing the data and retrieving information, and the read-write interface, designed for experienced users who manage updates or modifications to the device cluster.

Read-only interface

Using this interface, a user is able to visualize the most important information about each group of central objects, as well as the relationships they form with other objects or groups of objects. The views created for the read-only interface are outlined in the following paragraphs.

The first view accessible allows the user to view the existing host group dependencies as well as the hosts that populate a host group, and the services applied to either the entire group or each individual host.

Another view is Contact Groups - Services, in which the user can visualize the contact groups stored in the database, the contacts contained in each group and the services associated with each contact group.

Figure 4.1: Series of views available from the Host Group - Host menu entry. From left to right: Host groups dependencies, hosts contained in host group, showing all the services applied to each host individually, services applied to each host group and their corresponding commands.

Figure 4.2: Series of views available from the Contact Group - Services menu entry. From left to right: Contact groups and the contacts they contain, services applied to each contact group.

The Host - Services view allows the user to search for a host name using wildcards, and displays the services that are associated with each host, either directly or through the host groups that contain that particular host.
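The lookup behind the Host - Services view can be sketched as a wildcard search followed by a union of directly attached and group-inherited services. The report does not say which wildcard characters the interface accepts, so the use of * and ? here is an assumption, as are the function names, the choice of ! as the LIKE escape character and the sample rows; SQLite again stands in for MySQL.

```python
import sqlite3


def wildcard_to_like(pattern):
    """Translate '*' and '?' wildcards to SQL LIKE syntax, escaping with '!'."""
    escaped = pattern.replace("!", "!!").replace("%", "!%").replace("_", "!_")
    return escaped.replace("*", "%").replace("?", "_")


def services_for_hosts(conn, pattern):
    """Map each matching host to its services, direct or via its host groups."""
    rows = conn.execute(
        """
        SELECT h.HName, sh.FK_SH_SName
        FROM Host AS h
        JOIN ServiceHost AS sh ON sh.FK_SH_HName = h.HName
        WHERE h.HName LIKE :p ESCAPE '!'
        UNION
        SELECT h.HName, shg.FK_SHG_SName
        FROM Host AS h
        JOIN HostHostGroup AS hhg ON hhg.FK_HHG_HName = h.HName
        JOIN ServiceHostGroup AS shg ON shg.FK_SHG_HGName = hhg.FK_HHG_HGName
        WHERE h.HName LIKE :p ESCAPE '!'
        ORDER BY 1, 2
        """,
        {"p": wildcard_to_like(pattern)},
    ).fetchall()
    result = {}
    for host, service in rows:
        result.setdefault(host, []).append(service)
    return result


# Made-up sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Host (HName CHAR(100));
    CREATE TABLE ServiceHost (FK_SH_HName CHAR(100), FK_SH_SName CHAR(100));
    CREATE TABLE HostHostGroup (FK_HHG_HGName CHAR(100), FK_HHG_HName CHAR(100));
    CREATE TABLE ServiceHostGroup (FK_SHG_HGName CHAR(100), FK_SHG_SName CHAR(100));
    INSERT INTO Host VALUES ('web-01'), ('web-02'), ('db-01');
    INSERT INTO ServiceHost VALUES ('web-01', 'http');
    INSERT INTO HostHostGroup VALUES
        ('linux', 'web-01'), ('linux', 'web-02'), ('linux', 'db-01');
    INSERT INTO ServiceHostGroup VALUES ('linux', 'ping');
    """
)
print(services_for_hosts(conn, "web-*"))
```

UNION (rather than UNION ALL) removes duplicates, so a service attached both directly and through a group is listed once per host.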

Figure 4.3: Series of views available from the Host - Services menu entry. From left to right: Main search page, to look for a host name, a list of hosts that contain the search string provided, the list of services applied to a host.

Lastly, the Host Dependencies view shows in the main window all the parent hosts registered in the database. Clicking a host name reveals all the children of that host. Clicking a child name then leads to a new page that displays all the information related to that host: the host groups that contain it, the services run on it and the contact groups associated with it.
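The parent and child listing of this view amounts to grouping the rows of the HostDependency table by parent. The sketch below assumes that FK_HD_HName holds the dependent (child) host and FK_HD_PName its parent, which the schema suggests but does not state; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def children_by_parent(dependency_rows):
    """Group (child, parent) pairs into a mapping of parent -> sorted children."""
    grouped = defaultdict(list)
    for host_name, parent_name in dependency_rows:
        grouped[parent_name].append(host_name)
    return {parent: sorted(children) for parent, children in sorted(grouped.items())}


# Rows as they might come from: SELECT FK_HD_HName, FK_HD_PName FROM HostDependency
rows = [("node-b", "switch-1"), ("node-a", "switch-1"), ("switch-1", "router-1")]
dependency_tree = children_by_parent(rows)
print(dependency_tree)
```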

Figure 4.4: Series of views available from the Host Dependencies menu entry. From left to right: The list of all parent hosts, the expanded parent host, showing all the children associated, all the information relevant to a host, in one view.

Besides the main views available, each page has a search function that enables the fast retrieval of all the data associated with the object of interest.

Figure 4.5: Series of views available after entering a search query. From left to right: Results after searching for a host group, results of a query made in the Host Dependency view.

Read-write interface

As mentioned previously, from the read-write interface a user will be able to modify the properties and information of every central element of the system. The views created, as well as their properties are explained in the following section.

The first view, Host Group - Host, contains the same information as the equivalent page in the read-only interface, with the difference that right-clicking any element in the graph (a host group name, host name, service name or command name) opens a pop-up window. The pop-up window offers the options of modifying the host group information (e.g. its name), removing a host from the group, adding an existing host to the group, or creating a new host and adding it to the group. Equivalent options are available for a service or a command.

Figure 4.6: Pop-up windows with modifiable information in the Host Group - Host view. From left to right: Modify Host group, modify host, modify service information.

The next view, Contact Group - Contacts, displays all the contact groups as well as the contacts they contain. Objects are modified as before: a pop-up window appears on right-clicking the object's name. Each pop-up window displays all the information relevant to the object, together with the options of adding an existing contact to the group, removing a contact from the group, creating a new contact, or modifying the object's information.

Figure 4.7: Pop-up window with the modifiable information available from the Contact Group - Contacts view. From left to right: Modify contact group, modify contact information.

The last view, Command - Contact, allows users to view and modify all the command information and the contacts associated with each command, and to modify these relationships in the same way as before. Because the number of commands recorded in the database is very large while the number of commands that have contacts associated is much smaller, the two cases are displayed in separate views, so that the interface continues to scale correctly.
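The split between the two views can be derived from a single LEFT JOIN of Command against CommandContact: commands without a matching row are the independent ones. This is a sketch under assumptions rather than the project's PHP code; the function name and sample rows are hypothetical, and SQLite stands in for MySQL.

```python
import sqlite3


def split_commands(conn):
    """Return (commands with their contacts, commands without any contact)."""
    rows = conn.execute(
        """
        SELECT c.CName, cc.FK_CC_ContactName
        FROM Command AS c
        LEFT JOIN CommandContact AS cc ON cc.FK_CC_CommandName = c.CName
        ORDER BY c.CName, cc.FK_CC_ContactName
        """
    ).fetchall()
    with_contacts, independent = {}, []
    for command, contact in rows:
        if contact is None:
            # No CommandContact row matched: the command has no contacts.
            independent.append(command)
        else:
            with_contacts.setdefault(command, []).append(contact)
    return with_contacts, independent


# Made-up sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Command (CName CHAR(100), CLine VARCHAR(500));
    CREATE TABLE CommandContact (
        FK_CC_ContactName CHAR(100),
        FK_CC_CommandName CHAR(100),
        CC_NotificationType CHAR(100)
    );
    INSERT INTO Command VALUES
        ('check_ping', 'placeholder'),
        ('notify-by-email', 'placeholder'),
        ('check_disk', 'placeholder');
    INSERT INTO CommandContact VALUES
        ('bob', 'notify-by-email', 'service'),
        ('alice', 'notify-by-email', 'host');
    """
)
with_contacts, independent = split_commands(conn)
print(with_contacts)
print(independent)
```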

Figure 4.8: View available from the Command - Contact menu entry. From left to right: commands with contacts associated, independent commands with options of modifying the information.

Figure 4.9: Pop-up window with the modifiable information available from the Command - Contact view. From left to right: Modify command and modify contact information.

Chapter 5

Conclusions and future work

As this report shows, the Icinga configuration interface has been completed and, after thorough testing, all the functionality included was found to work correctly. Furthermore, the finished application is stable and scalable, with various scenarios having been taken into account. For example, in a couple of views the information has been segmented in order to ensure readability. One such case is the Command - Contact view, in which the commands are displayed on two separate pages, depending on whether or not the command has any contacts associated.

A possible improvement would be to integrate the developed interface with the configuration management system (Puppet), in order to automatically insert hosts into the database with the appropriate checks and in the appropriate groups. Due to lack of time this part of the project could not be addressed, but the existing system has been set up in such a way as to facilitate such an integration.
