Setup of a Ganglia Monitoring System for a Grid Computing Cluster

Total Page:16

File Type:pdf, Size:1020Kb

Setup of a Ganglia Monitoring System for a Grid Computing Cluster Setup of a Ganglia Monitoring System for a Grid Computing Cluster Aufbau eines Ganglia-Überwachungssystems für ein Grid-Computing-Cluster von Oleg Ivasenko Bachelorarbeit in Physik vorgelegt der Fakultät für Mathematik, Informatik und Naturwissenschaften der RWTH Aachen im September, 2015 angefertigt im Physikalischen Institut IIIb bei Prof. Dr. Achim Stahl Zusammenfassung Das RWTH Grid Cluster ist ein Tier 2/3 Rechenzentrum im WLHC Grid mit über zwei hundert Rechnern. Diese müssen bei Betrieb überwacht werden, um aufkommende Probleme analysieren zu können. Da das aktuell laufende System gewisse Inkompatibilitäten zu neuen Versionen des laufenden Betriebssystems zeigt, soll es durch Ganglia ersetzt werden. In dieser Arbeit wird der Aufbau des Ganglia Monitoring Systems und die Anpassung dessen an die Besonderheiten des RWTH Grid Clusters vorgenommen. Das Ganglia Monitoring System überwacht das gesamte Cluster und zeigt sich weiterhin skalierbar für zukünftige Erweiterungen. Abstract The RWTH Grid Cluster is a Tier 2/3 grid site in the WLHC Grid with over two hundred com- puters. Those have to be monitored while operating to analyze occurring problems. The currently running monitoring system shows incompatibilities with newer versions of the running operating system and thus should be replaced by Ganglia. In this thesis the setup and the customization of the Ganglia Monitoring System to suit the RWTH Grid Cluster specic needs is detailed. The Ganglia Monitoring System monitors the whole Cluster and proves itself as further scalable for future extensions. Contents List of Figures I List of Tables II List of Listings II 1 Introduction 1 1.1 Worldwide LHC Computing Grid . .2 1.2 RWTH Grid Cluster . .3 2 Monitoring 5 2.1 Ganglia Monitoring System . .5 2.1.1 gmond . .6 2.1.2 RRDtool . 11 2.1.3 gmetad . 13 2.1.4 gweb . 15 3 Deployment 17 3.1 Installation . 17 3.2 Conguration . 17 3.2.1 Database . 17 3.2.2 Network Topology . 18 3.2.3 IO Optimisations . 18 3.3 Modules . 19 3.3.1 Torque . 20 3.3.2 Maui . 22 3.3.3 dCache . 23 3.4 Ganglia Web . 25 3.5 Extensions . 26 3.5.1 Job Monarch . 26 4 Conclusion 29 List of Figures 1.1 CERN's site plan [6] . .1 1.2 Hierarchical structure of the WLCG . .2 1.3 RWTH Grid Cluster overview . .3 2.1 Ganglia Monitoring System overview . .6 2.2 Structural overview of gmond . .6 2.3 File structure of a Round Robin Database . 12 2.4 Normalization and consolidation of reported data . 13 2.5 Structural overview of gmetad. LT: listening threads, IP: interactive ports . 13 2.6 Structure of the directory containing the RRDs . 14 2.7 Main navigation tabs of gweb . 15 2.8 Views of the main tab of Ganglia Web . 16 3.1 Impact of the monitoring system on the hosting node . 19 3.2 Optimized metric_callback function . 20 3.3 Update function for node statistics . 20 3.4 Update function for general job statistics . 20 3.5 Update function for jobs by queue statistics . 21 3.6 Update function for jobs by owner statistics . 21 3.7 Update function for job eciency by owner statistics . 22 3.8 Update function for reservations statistics . 23 3.9 Update function for fair share statistics . 24 3.10 dCache pool usage collection query . 24 3.11 dCache queues collection query . 25 3.12 Graph generated from the nodes statistics . 26 3.13 dCache view from the `Views' tab . 27 3.14 Job Monarch user interface . 28 4.1 Fully implemented Ganglia Monitoring System . 29 I List of Tables 1.1 List of the 13 Tier 1 grid sites[29] . .2 2.1 List of default metrics: reformatted output of gmond -m part 1 . .9 2.2 List of default metrics: reformatted output of gmond -m part 2 . 10 3.1 RWTH Grid Cluster network topology . 18 List of Listings 2.1 gmond global settings . .7 2.2 gmond cluster settings (Worker Nodes) . .7 2.3 gmond host settings . .7 2.4 gmond udp send settings . .8 2.5 gmond udp receive settings . .8 2.6 gmond tcp accept settings . .8 2.7 gmond modules settings . .8 2.8 gmond collection group settings . 10 2.9 Skeleton python module[18] . 11 2.10 gmetad settings . 14 3.1 Output of showq -r ................................... 22 3.2 Output of showres -n .................................. 23 3.3 Output of mdiag -f (truncated) . 23 3.4 Node Report denition . 25 3.5 dCache view denition . 26 3.6 Job Monarch conguration . 27 II 1| Introduction Build by the European Organization for Nuclear Research (CERN), the Large Hadron Collider (LHC) is the largest experimental facility in existence. It is located in a tunnel beneath the Swiss French border near Geneva, on the former site of the Large Electron-Positron Collider (LEP). In contrast to the LEP the LHC is able to accelerate much heavier particles and achieve signicantly higher collision energies. The LHC was build with collision energies of 14 TeV in mind, which corresponds to protons traveling at 99.9999991% the speed of light. Initially it was operated at much lower energies, and even this year as phase two of the experiment is started it still operates at 13 TeV. Higher collision energies enable physicists to observe physics on a much smaller scale with the possibility to generate even heavier and possibly previously unobserved particles. Just this month (August 2015) a paper[2] on the rst observation of a pentaquark was submitted for peer review. CMS North Area ALICE LHC LHCb 40 TT41TT TI8 neutrinos TI2 TT10 SPS ATLAS CNGS TT60 Gran Sasso TT2 AD BOOSTER ISOLDE p p East Area n-ToF LINAC 2 PS CTF3 neutrons e– LINAC 3 LEIR Ions Figure 1.1: CERN's site plan [6] Inside the LHC, two beams of protons or lead ions in two separate tubes are accelerated to nearly the speed of light in opposite direction from each other. A beam consists of 2808 bunches and each bunch contains about 1011 protons. 9593 superconducting magnets[5] keep the two beams on a circular trajectory of 27 km in circumference and aect the form of the beams. To maximize the collision rate the beams are highly concentrated at the four experiment sites (g. 1.1). These are the only sections of the LHC where the two tubes intersect and collisions are possible. 1. Compact Muon Solenoid (CMS)[7] is a general purpose detector with a solenoid magnet at its core that forces charged particles onto a circular trajectory. From the curvature of the trajectory the momentum of the particle can be determined. The goals of the CMS experiment are the study of the standard physics model, the search for extra dimensions and new particles. 2. ATLAS[4] has the same purpose as CMS, but is build up dierently. The magnetic eld in ATLAS is weaker and engulfs fewer of the sub-detectors. 3. Large Hadron Collider beauty (LHCb)[17] is used to examine the dierences of particles and antiparticles, particularly the b quark, with the goal to explain the absence of antimatter in the universe. 1 1.1. WORLDWIDE LHC COMPUTING GRID CHAPTER 1. INTRODUCTION Country Tier 1 Grid site Tier 0 Tier 3 Tier 1 Canada TRIUMF Germany KIT Tier 2 Spain PIC France IN2P3 Italy INFN Nordic countries Nordic Datagrid Facility Netherlands NIKHEF / SARA Republic of Korea GSDC at KISTI Russian Federation RRC-KI and JINR Taipei ASGC United Kingdom GridPP US Fermilab-CMS US BNL ATLAS Figure 1.2: Hierarchical structure of the WLCG Table 1.1: List of the 13 Tier 1 grid sites[29] 4. A Large Ion Collider Experiment (ALICE)[3] examines quark-gluon plasma. A phase that matter is believed to have been in just after the Big Bang. There are two more smaller experiments: TOTal Elastic and diractive cross section Measure- ment (TOTEM) that is placed in the same cavern as the CMS detector and Large Hadron Collider forward (LHCf) that is placed in the same cavern as the ATLAS detector. Both measure the same events as CMS and ATLAS respectively, but at a much more acute angle of diraction. As of 2015 the data from TOTEM and CMS will be combined to increase accuracy[32]. As soon as protons collide in one of the interaction points new particles are formed and decay into other particles. The resulting cascade of particles is interacting with the detectors. To be able to detect dierent particles the large detectors like CMS consist of a multitude of smaller sub- detectors that each have the ability to detect certain types of particles. Each collision is considered as one event. About 600 million events occur every second[16], which is why the data has to be ltered with hardware and software lters to retain only the events deemed as interesting (e.g. where four muons were detected, which is a typical signature of a certain Higgs boson dacay). Even with proper selection those experiments collect as much as 15 petabytes of data per year or 1 gigabyte per second. In phase two runs, which start this year, data rates of 6 GB/s are expected on average and peaking rates of up to 10 GB/s[36] are possible. Storing, distributing and analyzing this amount of data goes beyond the scope of what a single institution could provide in resources, which is why an international network of computing facilities that share their resources was initiated. 1.1 Worldwide LHC Computing Grid The Worldwide LHC Computing Grid (WLCG)[35] is the world's largest computing grid. Combin- ing the resources from multiple international and regional grids, it provides the necessary resources to process and store the data generated by the LHC experiments.
Recommended publications
  • Monitoring Cluster on Online Compiler with Ganglia
    Monitoring Cluster on Online Compiler with Ganglia Nuryani, Andria Arisal, Wiwin Suwarningsih, Taufiq Wirahman Research Center for Informatics, Indonesian Institute of Sciences Komp. LIPI, Gd. 20, Lt. 3, Sangkuriang, Bandung, 40135 Indonesia Email : {nuryani|andria.arisal|wiwin|taufiq}@informatika.lipi.go.id Abstract strength and weaknesses in certain part compared with the others. For the example, Ganglia is more concerned Ganglia is an open source monitoring system for high with metrics gathering and tracking them over time performance computing (HPC) that collect both a whole while Nagios has focused in an alerting mechanism, cluster and every nodes status and report to the user. Visuel has an additional function to visualize the We use Ganglia to monitor our performance data of MPI application and Clumon spasi.informatika.lipi.go.id (SPASI), a customized- offers web-based pages to show the detailed fedora10-based cluster, for our cluster online compiler, information about resources, queues, jobs and CLAW (cluster access through web). Our experience on warnings[2, 3, 6]. using Ganglia shows that Ganglia has a capability to At present, we are developing online compiler for view our cluster status and allow us to track them. high performance computing environment focusing on Keywords: high performance computing, cluster, cluster developing web-interface to ease user (especially monitoring, ganglia programmer) on using cluster infrastructure named . CLAW (cluster access through web). Online compiler 1. Introduction is a web-based application that provides service for The need for high performance computer which writing, compilation, and execution of source code closely related with supercomputer and massively through web.
    [Show full text]
  • Ganglia Users Guide 7.0 Edition Ganglia Users Guide 7.0 Edition Published Dec 01 2017 Copyright © 2017 University of California
    Ganglia Users Guide 7.0 Edition Ganglia Users Guide 7.0 Edition Published Dec 01 2017 Copyright © 2017 University of California This document is subject to the Rocks® License (see Appendix: Rocks Copyright). Table of Contents Preface.....................................................................................................................................................................v 1. Overview.............................................................................................................................................................1 2. Installing.............................................................................................................................................................2 2.1. On a New Server.....................................................................................................................................2 2.2. On an Existing Server..............................................................................................................................2 3. Using the Ganglia Roll.......................................................................................................................................4 3.1. Using the Ganglia Roll............................................................................................................................4 4. Customizing the Ganglia Roll...........................................................................................................................6 4.1. Customizing the Ganglia Web interface..................................................................................................6
    [Show full text]
  • Monitoring Systems and Tricks of the Trade
    Enabling Grids for E-sciencE Monitoring Systems and Tricks of the Trade Antun Balaz Scientific Computing Laboratory Institute of Physics Belgrade http://www.scl.rs/ 21 Jan – 01 Feb 2009 www.eu-egee.org FP7-INFRA-222667 The 2nd workshop on HPC, IPM and Shahid Beheshti University, Tehran, Iran Overview Enabling Grids for E-sciencE • Ganglia (fabric monitoring) • Nagios (fabric + network monitoring) • Yumit/Pakiti (security) • CGMT (integration + hardware sensors) • WMSMON (custom service monitoring) • BBmSAM (mobile interface) • CLI scripts • Summary FP7-INFRA-222667 The 2nd workshop on HPC, IPM and Shahid Beheshti University, Tehran, Iran Ganglia Overview Enabling Grids for E-sciencE • Introduction • Ganglia Architecture • Apache Web Frontend • Gmond & Gmetad • Extending Ganglia – GMetrics – Gmond Module Development FP7-INFRA-222667 The 2nd workshop on HPC, IPM and Shahid Beheshti University, Tehran, Iran Introduction to Ganglia Enabling Grids for E-sciencE • Scalable Distributed Monitoring System • Targeted at monitoring clusters and grids • Multicast-based Listen/Announce protocol • Depends on open standards – XML – XDR compact portable data transport – RRDTool - Round Robin Database – APR – Apache Portable Runtime – Apache HTTPD Server – PHP based web interface • http://ganglia.sourceforge.net or http://www.ganglia.info FP7-INFRA-222667 The 2nd workshop on HPC, IPM and Shahid Beheshti University, Tehran, Iran Ganglia Architecture Enabling Grids for E-sciencE • Gmond – Metric gathering agent installed on individual servers • Gmetad –
    [Show full text]
  • Ganglia Deployment Guide @CERN Crist´Ov˜Aocordeiro, IT-SDC-ID [email protected]
    Ganglia Deployment Guide @CERN Crist´ov~aoCordeiro, IT-SDC-ID [email protected] 1 Introduction With the constant increase of Cloud resources being used, there is the need of having something which allows their monitoring/accounting. This can be very useful and practical in a way where we will be able to not only create alarms, and automatic decisions based on these, but also compare and report resource usages for specific cloud infrastructures, over specific periods of time. Ganglia is a distributed system which allows us to achieve exactly this. "Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technolo- gies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node over- heads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world."1. It works on a Headnode Node based interaction and its deployment is fairly easy. During instantiation the workernodes can also be easily config- ured to report their metrics to the headnode by using the Ganglia module for CloudInit which is available and maintained at CERN (configuration through other contextualization methods is also possible). NOTE: Ganglia is a highly configurable system. This guide only covers the basics for some specific situations.
    [Show full text]
  • The Ganglia Distributed Monitoring System: Design, Implementation, and Experience
    The Ganglia Distributed Monitoring System: Design, Implementation, and Experience Matthew L. Massie Brent N. Chun David E. Culler Univ. of California, Berkeley Intel Research Berkeley Univ. of California, Berkeley [email protected] [email protected] [email protected] Abstract growth, system configurations inevitably need to adapt. In summary, high performance systems today have sharply di- Ganglia is a scalable distributed monitoring system for high- verged from the monolithic machines of the past and now performance computing systems such as clusters and Grids. face the same set of challenges as that of large-scale dis- It is based on a hierarchical design targeted at federations of tributed systems. clusters. It relies on a multicast-based listen/announce proto- col to monitor state within clusters and uses a tree of point- One of the key challenges faced by high performance to-point connections amongst representative cluster nodes to distributed systems is scalable monitoring of system state. federate clusters and aggregate their state. It leverages widely Given a large enough collection of nodes and the associated used technologies such as XML for data representation, XDR computational, I/O, and network demands placed on them for compact, portable data transport, and RRDtool for data by applications, failures in large-scale systems become com- storage and visualization. It uses carefully engineered data monplace. To deal with node attrition and to maintain the structures and algorithms to achieve very low per-node over- health of the system, monitoring software must be able to heads and high concurrency. The implementation is robust, quickly identify failures so that they can be repaired either has been ported to an extensive set of operating systems and automatically or via out-of-band means (e.g., rebooting).
    [Show full text]
  • Monitoring Systems and POWER5/6 Lpars with Ganglia
    Monitoring Systems and POWER5/6 LPARs with Ganglia Michael Perzl – [email protected] Agenda . Ganglia – what is it ? . Ganglia components and data flow . An introduction to RRDTool . Ganglia metrics – what can be measured ? . New POWER5/6 metrics (AIX & Linux) . Extending Ganglia with gmetric . Add device specific information to Ganglia . Ganglia network communication . Installation issues . Where to get Ganglia for AIX and Linux on POWER ? . Best practices . Future additions / plans . Discussion . Links 2 Monitoring Systems and POWER5/6 LPARs with Ganglia Ganglia – what is it ? Ganglia – what is it ? (1/3) . Ganglia is an Open Source cluster performance monitoring tool and has been extended to include POWER5/6 features like shared processor LPARs, entitlement, physical CPU usage etc. This session covers: – the technical details of Ganglia and the POWER5/6 extensions – how to set it up and use it to monitor all LPARs in a single machine and lots of machines 4 Monitoring Systems and POWER5/6 LPARs with Ganglia Ganglia – what is it ? (2/3) Ganglia properties: . scalable distributed monitoring system for high-performance computing systems such as clusters and grids . based on a hierarchical design targeted at federations of clusters . relies on a multicast-based listen/announce protocol to monitor state within clusters and uses a tree of point-to-point connections amongst representative cluster nodes to federate clusters and aggregate their state . leverages widely used technologies such as – XML for data representation – XDR (eXternal Data Representation) for compact, portable data transport – RRDtool for data storage and visualization . uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency .
    [Show full text]
  • Thirty Billion Metrics a Day: Large-Scale Performance Metrics with Ganglia Adam Compton, Quantcast [email protected] @Comptona
    Thirty Billion Metrics a Day: Large-Scale Performance Metrics with Ganglia Adam Compton, Quantcast [email protected] @comptona November 8–13, 2015 | Washington, D.C. www.usenix.org/lisa15 #lisa15 Thirty Billion Metrics a Day: Forty Adam Compton, Quantcast [email protected] @comptona November 8–13, 2015 | Washington, D.C. www.usenix.org/lisa15 #lisa15 ● 7,000,000 unique metrics collected every 15 sec ● from 1000s of machines ● 480,000 metrics/second ● 41,000,000,000 (billion) metrics/day How many machines do we need for this? Two. Who am I? ● What is Ganglia? ● How Ganglia works ● Inputs ● Outputs ● Scaling ● What’s next What is Ganglia? “Ganglia is a scalable distributed monitoring system for high-performance computing systems” What does “monitoring” mean? What is Ganglia - monitoring vs. metrics performance metrics: regular, numeric, time-series data monitoring: scheduled checks (of metrics, processes, endpoints, etc.) to identify anomalous behavior alerting: getting a human’s attention visualization: charts and graphs By these definitions, Ganglia collects, aggregates, and visualizes performance metrics. What is Ganglia not? ● No monitoring ● No alerting ● No asset database ● No Julienne fries ● What is Ganglia? ● How Ganglia works ● Inputs ● Outputs ● Scaling ● What’s next How Ganglia works How Ganglia works - gmond How Ganglia works - gmetad (aside) How RRDs work How Ganglia works - gweb How Ganglia works - Upsides Built-in hierarchy, summarization, and metadata Web UI is easy to browse and grok How Ganglia works - Downsides
    [Show full text]
  • HPC Best Practices: Software Lifecycles San Francisco, California September 28-29, 2009
    Report of the 3rd DOE Workshop on HPC Best Practices: Software Lifecycles San Francisco, California September 28-29, 2009 HPC Best Practices: Software Lifecycles Findings and notes from the the third in a series of HPC community workshops Workshop Sponsors: DOE Office of Science – ASCR and DOE NNSA Chair: David E. Skinner (LBNL) Vice-Chair: Jon Stearley (SNL) Report Editors: John Hules (LBNL) and Jon Bashor (LBNL Attendees/Report Contributors: Deb Agarwal (LBNL), Dong Ahn (LLNL), William Allcock (ALCF), Kenneth Alvin (SNL), Chris Atwood (HPCMPO/CREATE), Charles Bacon (ANL), Bryan Biegel (NASA NAS), Buddy Bland (ORNL), Brett Bode (NCSA), Robert Bohn (NCO/NITRD), Jim Brandt (SNL), Jeff Broughton (NERSC), Shane Canon (NERSC), Shreyas Cholia (NERSC), Bob Ciotti (NASA NAS), Susan Coghlan (ANL), Bradley Comes (DoD HPCMP ), Paul “Steve” Cotter (ESnet), Nathan DeBardeleben (LANL), Sudip Dosanjh (SNL), Tony Drummond (LBNL), Mark Gary (LLNL), Ann Gentile (SNL), Gary Grider (LANL), Pam Hamilton (LLNL), Andrew Hanushevsky (SLAC), Daniel Hitchcock (DOE ASCR), Paul Iwanchuk (LANL), Gary Jung (LBNL), Suzanne Kelly (SNL), Patricia Kovatch (NICS/UTK), William Kramer (NCSA), Manojkumar Krishnan (PNNL), Sander Lee(NNSA), Janet Lebens (Cray Inc.), Lie-Quan Lee (SLAC), Ewing Lusk (ANL), Osni Marques (DOE ASCR), David Montoya (LANL), Krishna Muriki (LBNL), Tinu Ogunde (ASC/NNSA/DOE), Viraj Paropkari (NERSC), Rob Pennington (NSF), Terri Quinn (LLNL), Randal Rheinheimer (LANL), Alain Roy (Univ. Wisconsin-Madison), Stephen Scott (ORNL), Yukiko Sekine (DOE ASCR),
    [Show full text]
  • Ganglia on the Brain
    iVoyeur: More Ganglia on the Brain DAVE JOSEPHSEN Dave Josephsen is the Welcome to the second installment in my Ganglia series . In the last issue I prom- author of Building a ised (threatened?) to walk you through building a Ganglia data collection module Monitoring Infrastructure in C, and that’s exactly what we’re going to do . with Nagios (Prentice Ganglia, as you’ll recall, is composed primarily of two daemons: gmond, which Hall PTR, 2007) and is senior systems runs on the client side, collecting metrics, and gmetad, which runs server-side engineer at DBG, Inc., where he maintains and polls the clients, pushing their data into RRDTool databases . Gmond, out of a gaggle of geographically dispersed server the box, can collect a litany of interesting metrics from the system on which it is farms. He won LISA ’04’s Best Paper award for installed and can be extended to collect user-defined metrics in three different his co-authored work on spam mitigation, and ways . he donates his spare time to the SourceMage GNU Linux Project. The first method to add new metrics to gmond is to collect the data yourself using [email protected] a shell script, or similar program, piping the resultant data to the Gexec daemon . This method is simple and widely used, but adds overhead to a system carefully designed to minimize it . The second and third methods are to write plug-ins for gmond in Python or C . Writing gmond plug-ins in Python requires that Python be installed and that gmond be compiled against the Python libs on every box where you might want to use your module .
    [Show full text]
  • Open Source Software Options for Government
    Open Source Software Options for Government Version 2.0, April 2012 Aim 1. This document presents options for Open Source Software for use in Government. 2. It is presented in recognition that open source software is underused across Government and the wider public sector. 3. This set of options is primarily intended to be used by Government to encourage IT suppliers and integrators to evaluate open source options when designing solutions and services. 4. This publication does not imply preference for any vendor or product. Open source software, by definition, is not tied inextricably to any particular commercial organisation. Any commercial entity can choose to support, maintain, or integrate open source software. 5. It is understood that the software market, and the open source ecosystem in particular, is a rapidly developing environment and any options list will be incomplete and may become outdated quickly. Even so, given the relatively low level of open source experience in Government, this options list has proven useful for encouraging IT suppliers to consider open source, and to aid the assurance of their proposals. Context 1. The Coalition Government believes Open Source Software can potentially deliver significant short and long term cost savings across Government IT. 2. Typical benefits of open source software include lower procurement prices, no license costs, interoperability, easier integration and customisation, fewer barriers to reuse, conformance to open technology and data standards giving autonomy over your own information, and freedom from vendor lock in. 3. Open Source is not widely used in Government IT. The leading systems integrators and supplies to Government do not routinely and effectively consider open source software for IT solutions, as required by the existing HMG ICT policy.
    [Show full text]
  • GANGLIA on the BRAIN 55 Specific Purpose in Mind, So Some of What It Isn’T Is by Design
    IT’S IRONIC, I THINK, THAT I’VE BEEN dissatisfied with my RRDTool data DAVE JOSEPHSEN collection and display options for as long as I have. Ironic in the sense that RRDTool is such a category killer for what it does (storing and graphing time series data). iVoyeur: Ganglia It used to be novel, to see that distinctive- looking RRDTool graph popping up on the brain somewhere unexpected, but these days Dave Josephsen is the author of Building it’s so ubiquitous that I’m surprised to see a Monitoring Infrastructure with Nagios anything but. So if RRDTool is the ultimate (Prentice Hall PTR, 2007) and is senior systems engineer at DBG, Inc., where he solution for everything that happens maintains a gaggle of geographically dispersed server farms. He won LISA ’04’s between data coming in and graphs going Best Paper award for his co-authored out, the open source community has surely work on spam mitigation, and he do- nates his spare time to the SourceMage provided category killers for polling data GNU Linux Project. and displaying those graphs, yes? [email protected] No. When it comes to polling and displaying that data, we’re far from having a category-killer, in my opinion. It’s not for lack of candidates, mind you. Freshmeat can provide you with myriad options written in all manner of languages (but mostly PHP) that do essentially the same thing—put an RRDTool graph on a Web page. It’s a bit of a cop-out to say they all do “essentially” the same thing, in fact, since they all seem to make the same mistakes over and over again.
    [Show full text]
  • Maestro: a Remote Execution Tool for Visualization Clusters Aron Lee Bierbaum Iowa State University
    Iowa State University Capstones, Theses and Retrospective Theses and Dissertations Dissertations 2007 Maestro: a remote execution tool for visualization clusters Aron Lee Bierbaum Iowa State University Follow this and additional works at: https://lib.dr.iastate.edu/rtd Part of the Computer Sciences Commons Recommended Citation Bierbaum, Aron Lee, "Maestro: a remote execution tool for visualization clusters" (2007). Retrospective Theses and Dissertations. 14635. https://lib.dr.iastate.edu/rtd/14635 This Thesis is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Retrospective Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Maestro: A remote execution tool for visualization clusters by Aron Lee Bierbaum A thesis submitted to the graduate faculty in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE Major: Computer Engineering Program of Study Committee: Carolina Cruz-Neira, Major Professor Julie Dickerson Chris Harding Iowa State University Ames, Iowa 2007 UMI Number: 1447485 UMI Microform 1447485 Copyright 2008 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. ProQuest Information and Learning Company 300 North Zeeb Road P.O. Box 1346 Ann Arbor, MI 48106-1346
    [Show full text]