50-10-11

DATA COMMUNICATIONS MANAGEMENT

WITH LINUX AND BIG BROTHER WATCHING OVER YOUR NETWORK, YOU DO NOT HAVE TO LOOK OVER YOUR SHOULDER (OR YOUR BUDGET)

Daniel Carrere

INSIDE
Components of a Linux and Big Brother Network Monitor; Understanding Linux and Big Brother UNIX Network Monitoring; Economizing Without Sacrifice; Details of the Services That Can Be Monitored; How Big Brother Monitors Linux’s Hardware Utilization; Big Brother: The User Interface; Automatically Notifying the System Administrator in the Event of a Problem via Pager, E-mail, or Both

INTRODUCTION

Many systems administrators in the networking world of today find themselves attending to multiple systems. As such, these administrators want to automate monitoring and problem detection. To effectively monitor a system, one must consider many aspects, each of which is vital to system availability. The aspects that require a system monitor’s attention are the states of the services being provided (DNS, NNTP, FTP, SMTP, HTTP, and POP3), the states of the server’s hardware (disk space usage, CPU usage/utilization), and the states of core aspects (essential system processes). In addition, one also would like to be able to determine the system uptime, as well as any warning or status messages. Most importantly, one would like to be able to accomplish all of these tasks without repetition.

One of the most effective ways to accomplish all of these aims without repetitiveness, and to provide the results within a single interface for ease of analysis, is to combine Linux with the Big Brother UNIX network monitor. Furthermore, with this combination one can view the status of systems and their associated processes without specialized software, since the results are formatted into an HTML document that can be served by the Apache Web server running on Linux. Because the monitoring server is reachable over TCP/IP, one can monitor the systems from anywhere in the world, provided that the machine serving the results is not blocked by a firewall and uses a routable network layer address. Last but not least, a monitoring system composed of Linux and Big Brother involves no software costs for deployment (neither system nor monitoring).

PAYOFF IDEA

Network administrators can combine the Linux operating system with the Big Brother UNIX network monitor to view the status of their systems without requiring specialized software to view the results. The status information can be accessed from anywhere in the world, and the solution provides a very low cost of ownership.

04/00 Auerbach Publications © 2000 CRC Press LLC

COMPONENTS OF A LINUX AND BIG BROTHER UNIX NETWORK MONITOR

There are two core components of a Linux and Big Brother network monitoring solution: (1) the Linux operating system and (2) the Big Brother network monitor. Each of these can be broken down into numerous categories; but because the purpose of this article is network monitoring, the discussion focuses on the monitoring components and how they relate to, integrate with, and operate within the Linux operating system.

Big Brother: The Five Core Parts

Big Brother is composed of five core components: the central monitoring station (also called the display server), the network monitor, the local system monitor, pager programs, and intra-machine communications programs. Additionally, these five components involve two key programs: bb (the client that runs on the machines being monitored) and bbd (the server program [daemon] running on the central monitor/display server).

Component One: The Central Monitoring Station (Display Server). The central monitoring station/display server accepts the system status reports from the systems being monitored. Because the reports are generated as HTML, they can be viewed on virtually any computing platform available today. The format of the results can be customized by modifying one of the Bourne shell scripts.

Component Two: The Network Monitor. This segment of Big Brother operates using ICMP echo requests (pings). The network monitor runs on any UNIX machine and periodically pings every host listed in the directory_path_chosen_for_installation/bb/etc/bb-hosts file. The results are sent to the central monitor to update the system status.
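The polling pass described above can be sketched in a few lines. This is an illustration in Python, not Big Brother's own Bourne shell code. It assumes that the hosts file holds one "IP hostname" pair per line with `#` starting a comment (as in /etc/hosts), and that the system ping command accepts `-c 1` as it does on common Linux installations. The function names are hypothetical.

```python
import subprocess

def parse_hosts(text):
    """Return (ip, hostname) pairs from /etc/hosts-style text.

    Assumed format: "IP hostname", with anything after '#' ignored.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            hosts.append((fields[0], fields[1]))
    return hosts

def is_reachable(ip, timeout_s=5):
    """Send one ICMP echo request using the system ping command."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", ip],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0

def poll(hosts_text):
    """Map each hostname to True (answered) or False (no reply)."""
    return {name: is_reachable(ip) for ip, name in parse_hosts(hosts_text)}
```

In use, one would read the hosts file, call `poll` on its contents at a fixed interval, and forward the results to the central monitor.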

Component Three: The Local System Monitor. The local system monitor checks disk utilization and CPU utilization and verifies that essential system processes are running. After determining the status of these system aspects, it updates the central monitor. In the event of problems, the local system monitor can contact the system administrator.

Component Four: The Pager Programs. The Big Brother client program that resides on the system being monitored sends the monitoring information to the display server, which then forwards the information to the administrator’s pager using the Kermit modem protocol.

Component Five: Intra-machine Communications Programs. The Big Brother client program sends its status information to the specified display and pager servers on TCP port 1984 (this port number was chosen by Big Brother’s creator, Sean MacGuire, in reference to George Orwell’s book, 1984).
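As a rough sketch of this exchange, a client could open a TCP connection to port 1984 on the display server and send a one-line report. The port number comes from the text above; the message layout and the function names below are assumptions made for illustration and are not taken from the Big Brother sources.

```python
import socket

BB_PORT = 1984  # port number given in the article

def format_status(host, service, color, text):
    """Build a one-line report. This layout is assumed for illustration."""
    return f"status {host}.{service} {color} {text}"

def send_status(display_server, message, port=BB_PORT, timeout_s=10):
    """Open a TCP connection to the display server and send the report."""
    with socket.create_connection((display_server, port), timeout=timeout_s) as conn:
        conn.sendall(message.encode("ascii", errors="replace"))
```

A call such as `send_status("display.example.com", format_status("10.10.10.1", "disk", "green", "74% used on /usr"))` would deliver one report; the display server name here is a placeholder.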

UNDERSTANDING LINUX AND BIG BROTHER UNIX NETWORK MONITORING

The Big Brother network monitor fits well with the traditional understanding of client/server computing. The stations being monitored act as servers: each gathers its system and process statuses with its local client applications and serves that information to the display server. The central monitor, the machine collecting the information about the other hosts, acts as a client: it polls the servers and obtains status information much as in any interactive client/server query session. Once the central monitor has the information, the Web server running on it serves the statuses, represented by colored spheres, to requesting Web clients as an HTML document, so that administrators can determine the status of their network at a glance.

ECONOMIZING WITHOUT SACRIFICE

When one combines Linux and Big Brother, one gains in several areas. The most important benefit is the stability afforded by the Linux operating system. The last thing one wants, for the sake of the machines being monitored, is to have the monitor machine fail. Monitoring from Linux provides stability, at least as far as the operating system is concerned. When combined with high-quality hardware, one has a winning solution. To further ensure reliable monitoring, one can run redundant display servers. This is easily accomplished because Big Brother is simply a group of Bourne shell scripts that require only a text editor (vi, emacs, pico, etc.) to modify their operation.

DETAILS OF THE SERVICES THAT CAN BE MONITORED

Big Brother can monitor connection, CPU utilization, disk utilization, DNS availability, HTTP (HyperText Transfer Protocol) service availability, IMAP (Internet Message Access Protocol) service availability, MRTG (Multi-router Traffic Grapher) service availability, msgs, POP3 (Post Office Protocol 3) service availability, procs (specified system processes), SMTP (Simple Mail Transfer Protocol) service availability, SSH (Secure SHell) service availability, and Telnet service availability.

Many of these services are monitored via an ICMP echo request (the ping command) to determine whether the system is reachable. Although an unanswered ping does not necessarily indicate a host failure, it does alert the system administrator to investigate whether there are problems along the transmission line that provides connectivity to the machine. If the ping fails and the transmission line is verified to be functioning properly, the system administrator should then check the cabling to the machine as well as its network interface card(s) to determine the source of the problem.

What Big Brother Monitors

Big Brother monitors the following services offered by Linux:

• DNS availability
• HTTP (HyperText Transfer Protocol) service availability
• IMAP (Internet Message Access Protocol) service availability
• MRTG (Multi-router Traffic Grapher) service availability
• msgs
• POP3 (Post Office Protocol 3) service availability
• procs (specified system processes)
• SMTP (Simple Mail Transfer Protocol) service availability
• SSH (Secure SHell) service availability
• FTP service availability
• NNTP (Network News Transfer Protocol) service availability
• Telnet service availability


DNS availability is assessed via a nameserver lookup.

HTTP server process/daemon availability is checked by using a session of lynx (a text-based Web browser) to look for a valid HTTP response as well as output. TCP port 80 is the port queried on the host unless an alternate port is specified in the /etc/services file.

IMAP availability is verified by querying the server on TCP port 143 on the host unless an alternate port is specified in the /etc/services file.

POP3 availability is verified by querying the server on TCP port 110 on the host unless an alternate port is specified in the /etc/services file.
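The port-based checks above follow one pattern: look up the service's port in the system services database, fall back to the well-known default, and try to connect. The Python sketch below illustrates that pattern only. It is a plain TCP connect test rather than the lynx-based HTTP check, and the function names and the service names passed to the lookup are assumptions.

```python
import socket

# Well-known default ports named in the text.
DEFAULTS = {"http": 80, "imap": 143, "pop3": 110}

def resolve_port(service, default_port):
    """Look up a TCP service in the system services database
    (/etc/services on Linux); fall back to the default if absent."""
    try:
        return socket.getservbyname(service, "tcp")
    except OSError:
        return default_port

def port_is_open(host, port, timeout_s=5):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False

def check_services(host):
    """Map each service name to True (port answers) or False."""
    return {name: port_is_open(host, resolve_port(name, port))
            for name, port in DEFAULTS.items()}
```

A successful connection shows only that something is listening on the port; checking for a valid protocol response, as the lynx session does for HTTP, is a stronger test.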

PROCS (PROCESSES)

Big Brother determines whether critical system processes are up and running through a Bourne shell script located in the directory in which Big Brother is installed. The specific script location is: /directory_path_chosen_for_installation/bb/etc/bbdef.sh. In the event that a system process defined as critical within this script is no longer running on the system (as determined by issuing the ps command), Big Brother can use its paging facilities to contact the administrator.
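The process check amounts to comparing a list of required process names against the output of ps. The following Python sketch shows the idea; it is not the contents of bbdef.sh. The function names are hypothetical, and the `ps -A -o comm=` invocation assumes a POSIX-style ps.

```python
import subprocess

def parse_ps(ps_output):
    """Reduce each line of ps output to the command's base name."""
    names = set()
    for line in ps_output.splitlines():
        line = line.strip()
        if line:
            names.add(line.split("/")[-1])
    return names

def running_commands():
    """Return the set of command names currently reported by ps."""
    out = subprocess.run(
        ["ps", "-A", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out)

def missing_processes(required, running):
    """List the required process names that are not running."""
    return sorted(set(required) - set(running))
```

For example, `missing_processes(["sendmail", "httpd"], running_commands())` returns the names that would warrant a page; the two process names are only examples of what an administrator might define as critical.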

SMTP (Simple Mail Transfer Protocol)

The standard TCP port (port 25) that sendmail runs on is examined. Note that sendmail is a Mail Transfer Agent (MTA) used to route messages from one system to another. Sendmail was developed at the University of California at Berkeley and is the dominant mail transfer agent on UNIX systems.

The Connection

The connection is verified via the ping command. The script that performs this check is:

/directory_path_chosen_for_installation/bb/etc/bb-network.sh

NNTP (Network News Transfer Protocol)

By default, the TCP port examined for NNTP is TCP port 119.

FTP Service

The standard FTP port (TCP port 21, unless an alternate port is specified in the /etc/services file) is contacted to see if a connection can be established.


EXHIBIT 1 — Output of df Command

Filesystem   1024-blocks    Used  Available  Capacity  Mounted on
/dev/hda2         249855   24153     212800       10%  /
/dev/hda5         398124    2551     375012        1%  /home
/dev/hda1        1435168  793760     641408       55%  /mnt/win
/dev/hda9          50717    1883      46215        4%  /tmp
/dev/hda6        1193895  835034     297173       74%  /usr
/dev/hda10        117087      32     111009        0%  /usr/local
/dev/hda8          23391      13      22170        0%  /usr/misc
/dev/hda7         101471   23550      72681       24%  /var
/dev/fd0            1423     299       1124       21%  /mnt/floppy

Telnet Service

For Telnet availability status, Big Brother contacts TCP port 23 (unless an alternate port is specified in the /etc/services file).

HOW BIG BROTHER MONITORS LINUX HARDWARE UTILIZATION

Big Brother monitors Linux hardware utilization via the standard UNIX commands available on a Linux system. These commands are df (to determine the percentage of disk space used) and uptime (which displays the time of day, duration of uptime, number of users, and the process load average, which is indicative of CPU utilization).

The Big Brother scripts process the output of these commands and warn the system administrator of problems based on thresholds. For example, the threshold for disk utilization is 95 percent capacity. In the output displayed in Exhibit 1, the utilization on the /usr partition is 74 percent, so the Big Brother client program running on the station being monitored would send a message to the Big Brother server process running on the display server indicating that the disks are in good condition from a standpoint of utilization.

Another Big Brother script would run the uptime command and use its output to assess the processor’s utilization. CPU utilization is assessed from the load average reported by the uptime command:

7:55pm up 52 min, 2 users, load average: 0.22, 0.06, 0.02
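The threshold logic can be illustrated with a short Python sketch that parses output in the format of Exhibit 1 and of the uptime line above. The 95 percent disk threshold comes from the text. The load-average limit, the "red" status name, and the function names are assumptions chosen for the example; the article mentions only green spheres explicitly.

```python
DISK_LIMIT_PERCENT = 95   # threshold stated in the article
LOAD_LIMIT = 3.0          # hypothetical; the article gives no CPU figure

def disk_usage(df_output):
    """Map mount point -> percent used, from df-style output."""
    usage = {}
    for line in df_output.splitlines()[1:]:   # skip the header row
        fields = line.split()
        if len(fields) >= 6 and fields[4].endswith("%"):
            usage[fields[5]] = int(fields[4].rstrip("%"))
    return usage

def load_averages(uptime_output):
    """Return the 1-, 5-, and 15-minute load averages as floats.

    Assumes the Linux-style "load average: a, b, c" suffix.
    """
    tail = uptime_output.rsplit("load average:", 1)[1]
    return [float(value) for value in tail.replace(",", " ").split()]

def disk_status(usage, limit=DISK_LIMIT_PERCENT):
    """Report "red" if any file system is over the limit, else "green"."""
    return "red" if any(p > limit for p in usage.values()) else "green"

def cpu_status(loads, limit=LOAD_LIMIT):
    """Report "red" if the 1-minute load average exceeds the limit."""
    return "red" if loads[0] > limit else "green"
```

Applied to Exhibit 1, every file system is below 95 percent, so `disk_status` reports green, which matches the outcome described in the text.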

BIG BROTHER: THE USER INTERFACE

The interface to any system is crucial. Big Brother’s interface allows the many aspects under the monitoring software’s watchful eye to be collected and displayed from a central location that requires no operating system-specific specialized software. All that one needs to interact with the Big Brother monitor display is a Web browser. Because the system generates its reports in HTML, one can monitor the performance of one’s systems from anywhere in the world via a consistent user interface across all computing platforms.

EXHIBIT 2 — Top-Level Interface to Big Brother’s Monitoring Results

Exhibit 2 shows the top-level interface to Big Brother’s monitoring results. When the monitor is running on a group of specified hosts, the URL or IP address of each host begins its own row, and the system aspects under Big Brother’s supervision run horizontally as column titles. In Exhibit 2, one sees the titles conn, cpu, disk, dns, ftp, http, imap, mrtg, msgs, pop3, procs, smtp, ssh, and telnet, which represent the various services being monitored by Big Brother. For example,

                 conn   cpu    disk   dns    ftp    http   imap   mrtg   msgs   pop3   procs  smtp   ssh    telnet
www.someurl.com  [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]

or

10.10.10.1       [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]    [ ]
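A grid of this kind is simple to produce as HTML. The Python sketch below is a text-only stand-in for the display, which in Big Brother is generated by shell scripts and shows colored spheres. The data structure, the use of the color name as a CSS class, and the function name are assumptions for illustration.

```python
from html import escape

COLUMNS = ["conn", "cpu", "disk", "dns", "ftp", "http", "imap",
           "mrtg", "msgs", "pop3", "procs", "smtp", "ssh", "telnet"]

def render_grid(statuses, columns=COLUMNS):
    """Render {host: {service: color}} as an HTML table.

    One row per host, one column per service; a service with
    no report is shown as a dash.
    """
    header = "".join(f"<th>{escape(c)}</th>" for c in columns)
    rows = ["<table>", f"<tr><th></th>{header}</tr>"]
    for host in sorted(statuses):
        cells = []
        for column in columns:
            color = statuses[host].get(column)
            if color is None:
                cells.append("<td>-</td>")
            else:
                cells.append(f'<td class="{escape(color)}">{escape(color)}</td>')
        rows.append(f"<tr><th>{escape(host)}</th>" + "".join(cells) + "</tr>")
    rows.append("</table>")
    return "\n".join(rows)
```

Writing the returned string into a file under the Web server's document root would make the grid viewable from any browser, which is the property the article emphasizes.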


EXHIBIT 3 — The Details of the HTTP Service and Its Status

Drilling Down for Detail

The Big Brother systems and network monitor provides drill-down facilities: one can click on one of the colored spheres in a column to display details about the system service or system aspect in question. Exhibit 3 shows a portion of the detail for the HTTP service, generated by clicking on one of the green spheres under the HTTP column in Exhibit 2. It shows the status of one of the Web servers, as well as details about its uptime.

AUTOMATICALLY NOTIFYING THE SYSTEM ADMINISTRATOR IN THE EVENT OF A PROBLEM VIA PAGER, E-MAIL, OR BOTH

In the event of a system problem, Big Brother can notify the systems administrator via pager by using the Kermit protocol. System users can also page the systems administrator manually in the event of a problem or, alternatively, send the administrator an acknowledgment of system status. Exhibit 4 shows the form that users complete to do either.

The Paging Facilities: Codifying System Statuses

The pager facilities of Big Brother let the administrator determine system status from a three-digit code. The three-digit code comes before the IP address of the host in the page, so that the administrator first notices the system status and then, if the status demands immediate attention because of its effect on system availability, knows the network address/IP address of the machine requiring attention.

EXHIBIT 4 — The Interface to the Manual Paging Facilities of Big Brother

Users can submit a page to alert the administrator or send acknowledgment to the system administrator.

Pager Codes. In the event of a problem with one of the systems being monitored, the administrator can either be paged automatically by the system or manually by a user. The format used to send the administrator a page is:

[3 DIGIT CODE] [IP-ADDRESS]

100   Disk error: disk is over 95 percent full
200   CPU error: CPU load average is unacceptably high
300   Process error: an important process has died
400   Message file contains a serious error
500   Network error: cannot connect to that IP address
600   Web server HTTP error: server is down
7xx   Generic server error: 7 + server port number (e.g., 721 = ftp down)
800   DNS server on that machine is down
911   User page: message is phone number to call back
999   The host reporting an error could not be found in the etc/bb-hosts file
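The code table lends itself to a small lookup. The Python sketch below builds and decodes a page in the "[3 DIGIT CODE] [IP-ADDRESS]" format, treating any other code that begins with 7 as 7 followed by a port number. The function names are hypothetical, and this is an illustration rather than Big Brother's own paging code.

```python
FIXED_CODES = {
    100: "Disk error: disk is over 95 percent full",
    200: "CPU error: CPU load average is unacceptably high",
    300: "Process error: an important process has died",
    400: "Message file contains a serious error",
    500: "Network error: cannot connect to that IP address",
    600: "Web server HTTP error: server is down",
    800: "DNS server on that machine is down",
    911: "User page: message is phone number to call back",
    999: "Reporting host not found in the etc/bb-hosts file",
}

def format_page(code, ip_address):
    """Build the page text: the code first, then the address."""
    return f"{code} {ip_address}"

def describe_page(page):
    """Turn a page such as "721 10.10.10.1" into a readable message."""
    code_text, _, rest = page.partition(" ")
    code = int(code_text)
    if code in FIXED_CODES:
        return f"{rest}: {FIXED_CODES[code]}"
    if code_text.startswith("7") and len(code_text) > 1:
        return f"{rest}: generic server error on port {int(code_text[1:])}"
    return f"{rest}: unknown code {code_text}"
```

Putting the code first mirrors the reasoning in the text: the administrator reads the severity before the address.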

CONCLUSION

The purpose of this article was to introduce the reader to the Big Brother System and Network Monitor and to show how it integrates with Linux to offer an effective, efficient, and stable system and network monitoring solution. The author’s intention was to provide an overview of the services and how they are monitored rather than an installation manual, as the accompanying online documentation does an excellent job of guiding one through the installation and its various configuration aspects.

This author would like to thank Sean MacGuire for writing such a useful program, one that stresses simplicity, modularity, and extensibility in design, and for making the package available in source code form. It is this author’s belief that Big Brother can effectively provide a monitoring solution for a Linux network or any other UNIX network in which system monitoring is desired.

Notes

Big Brother UNIX Network Monitor (main page, for downloading and information):
http://maclawran.ca/bb-dnld/index.html

Multi-router Traffic Grapher (MRTG):
http://ee-staff.ethz.ch/~oetiker/webtools/mrtg/mrtg.html
http://www.ee.ethz.ch/stats/mrtg/

Daniel Carrere is with Open Systems Consulting in Milledgeville, Georgia.
