Mcfarmgraph Daemon Flowchart

UTA HEP/Computing/0026: McFarmGraph______1

McFarmGraph: a web-based monitoring tool for McFarm jobs

Sankalp Jain, Aditya Nishandar, Drew Meyer, Jae Yu, Mark Sosebee, Prashant Bhamidipati, Heunu Kim, Karthik Gopalratnam, Vijay Murthi, Parag Mhashilkar

Abstract

McFarmGraph is a web-based Graphical User Interface used to monitor McFarm jobs on Linux farms. This document is intended to be a comprehensive description of the design of McFarmGraph. For installation and administration, please refer to the McFarmGraph administration guide.

1. INTRODUCTION:

McFarmGraph consists of two parts – the front-end CGI scripts that are used to display the status of various jobs in a graphical format, and the back-end daemon that is used to bring status files over to the web server from the various farms.

Status Daemon (McFarmGraph_daemon): McFarm periodically (e.g. every four hours) outputs a status file on each farm where it is running. This flat file summarizes the status of the various jobs running on the farm. The update period is set by a parameter that the farmer can change (see McFarm documentation). The purpose of the daemon is to bring over the status file from each remote farm that is running productions with McFarm. The daemon periodically uses Globus services (gsiftp) to transfer the status files and stores them locally. The daemon then triggers the XML generator, which converts the flat files to XML format. The flowcharts in Fig 1a and 1b illustrate the control flow of the McFarmGraph daemon.

Graphical User Interfaces: This is a set of CGI scripts written in PERL, and some applets written in Java. The scripts interpret the XML representation of the status files transferred from the remote farms and present the data to the user in a graphical format that can be accessed from the web.

Request Structure: Each McFarm request consists of a set of jobs that are grouped together according to a request id of the form ReqXXXX. Individual job names consist of the request id followed by a descriptor string and a number that is unique within the group. For example, “Req6279-zh-zmumu+hbb-03219094626”, “Req6279-zh-zmumu+hbb-03219094710” and “Req6279-zh-zmumu+hbb-03219094921” all belong to the group with request id Req6279. The “%Done” attribute displayed on the webpage for a particular request id is the average of the %Done attributes of the individual jobs within that group. The figures displayed in the PieChart represent the percentage of jobs in the group that are in a particular phase.
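The grouping and the two summary figures described above can be sketched in a few lines of Python. This is only an illustration, not the McFarmGraph code (which is written in PERL): the function names, the tuple layout of a job record and the sample values are assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# Job names start with the request id, e.g. "Req6279-zh-zmumu+hbb-03219094626".
REQUEST_ID = re.compile(r"^(Req\d+)-")


def request_id(job_name):
    """Return the request id prefix (e.g. 'Req6279') of a job name."""
    match = REQUEST_ID.match(job_name)
    if match is None:
        raise ValueError(f"not a McFarm job name: {job_name!r}")
    return match.group(1)


def summarize(jobs):
    """Group jobs by request id.

    `jobs` is an iterable of (job_name, percent_done, phase) tuples;
    this record layout is hypothetical, chosen for the example.
    Returns {request_id: {"percent_done": average, "phase_share": {...}}}.
    """
    groups = defaultdict(list)
    for name, done, phase in jobs:
        groups[request_id(name)].append((done, phase))

    summary = {}
    for req, entries in groups.items():
        # %Done for the request: average of the per-job %Done values.
        average = sum(done for done, _ in entries) / len(entries)
        # Pie chart figures: percentage of jobs in each phase.
        counts = Counter(phase for _, phase in entries)
        shares = {p: 100.0 * n / len(entries) for p, n in counts.items()}
        summary[req] = {"percent_done": average, "phase_share": shares}
    return summary


if __name__ == "__main__":
    # Sample values are invented; only the job names come from the text.
    sample = [
        ("Req6279-zh-zmumu+hbb-03219094626", 40, "D0GSTAR"),
        ("Req6279-zh-zmumu+hbb-03219094710", 60, "D0GSTAR"),
        ("Req6279-zh-zmumu+hbb-03219094921", 80, "D0SIM"),
    ]
    print(summarize(sample))
```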

2. STATUS DAEMON DESIGN:

Scalability and simplicity are the pivotal issues that influence the design of the daemon as well as the CGI scripts. Fig 1 shows the directory structure in which the job status files from the remote farms are stored.

/home/mcfarm/McFarmGraph_New/
    SWIFT-HEP/
    CSE-HEP/
    OU-HEP/
    LTU-HEP/
    conf/
        daemon.conf
    log/
        daemon.log
    tmp/
    README

(Each farm directory contains mcpxx subdirectories; the figure shows mcp10, mcp11 and mcp14, each holding a status file of the same name.)

Fig 1. McFarmGraph Directory Structure on hepfm007.uta.edu

The McFarmGraph job information as well as the configuration and log files are placed in the directory structure shown above. Whenever a new farm is added, the daemon automatically creates a directory corresponding to that farm (e.g. SWIFT-HEP for the Swift farm). The mcpxx subdirectories are created according to the mcp versions on a particular farm (e.g. CSE-HEP has mcp13 and mcp14, whereas OU-HEP has only mcp14). The status files and their XML representations are stored in these mcpxx directories. Each mcpxx directory will typically contain mcpxx (the flat file), mcpxx_arch (the XML representation of the archived job information) and mcpxx.xml (the XML representation of queued and live jobs).

The conf and log directories contain the McFarmGraph_daemon configuration and log files respectively. Farms are added through the configuration file. The tmp directory is used as scratch space while the daemon is running, to store the process id of the running daemon as well as some temporary files (e.g. ls.txt).
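As an illustration of the layout described above, the following Python sketch creates the per-farm mcpxx directories if they are missing and derives the three file names kept in each of them. It is not the daemon's own farm_mkdir() code; the function names and arguments are assumptions made for the example.

```python
from pathlib import Path


def ensure_farm_dirs(root, farm, mcp_versions):
    """Create <root>/<farm>/<mcpxx> for every version, if not already present.

    Hypothetical helper: it mirrors the directory layout of Fig 1 only.
    Returns the list of mcpxx directories.
    """
    directories = []
    for version in mcp_versions:
        mcp_dir = Path(root) / farm / version
        # exist_ok=True makes the call a no-op when the directory exists.
        mcp_dir.mkdir(parents=True, exist_ok=True)
        directories.append(mcp_dir)
    return directories


def status_paths(mcp_dir):
    """Return the files expected in an mcpxx directory."""
    name = Path(mcp_dir).name
    return {
        "flat": Path(mcp_dir) / name,               # status file from the farm
        "live_xml": Path(mcp_dir) / f"{name}.xml",  # queued and live jobs
        "arch_xml": Path(mcp_dir) / f"{name}_arch", # archived jobs
    }


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as root:
        for d in ensure_farm_dirs(root, "CSE-HEP", ["mcp13", "mcp14"]):
            print(status_paths(d))
```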

Fig 1a and 1b illustrate the control flow in the McFarmGraph_daemon.

Start
1. Read the configuration file.
2. Check the argument (start or stop); otherwise print the usage and exit.
3. Check if the daemon is up.
4. On start: if the daemon is running, print “daemon is running” and exit. Otherwise fork a child process and separate it from the parent, redirect the output stream to the log file, and invoke the main() subroutine.
5. On stop: if the daemon is running, read the daemon’s process id, issue a kill with the daemon’s pid, flush the logs, and exit.

Fig 1a. McFarmGraph Daemon Flowchart

main() subroutine:
1. Read the configuration file for the farm variables by invoking the initialize() subroutine.
2. Check for configuration errors. If there are errors, print “Configuration Errors” and exit.
3. Create Farm objects corresponding to NUMBER_OF_FARMS.
4. For each farm, call the farm_mkdir() method to create farm specific directories if not already present.
5. For each farm, call the getFiles() method to retrieve the job status files.
6. When all the files are retrieved, start the XML generator.
7. Sleep for the specified time (UPDATE_INTERVAL). When the sleep interval is over, the cycle repeats.

Fig 1b. main() subroutine in the McFarmGraph Daemon
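The two flowcharts can be summarized in a short Python sketch: a check on whether a daemon is already up, based on a stored process id, and a retrieve/convert/sleep cycle. This is a simplified illustration under stated assumptions, not the daemon itself: the function names, the PID file handling and the injected `get_files` and `generate_xml` callables are hypothetical, and the fork, log redirection and gsiftp transfer are left out.

```python
import os
import time


def read_pid(pid_file):
    """Return the process id stored in pid_file, or None if unavailable."""
    try:
        with open(pid_file) as handle:
            return int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def daemon_is_up(pid_file):
    """Check whether the process recorded in pid_file exists (POSIX only)."""
    pid = read_pid(pid_file)
    if pid is None:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def main_loop(farms, get_files, generate_xml, update_interval,
              cycles=None, sleep=time.sleep):
    """Retrieve status files for every farm, run the XML generator, sleep.

    `cycles=None` loops forever; a number limits the loop (useful in tests).
    """
    completed = 0
    while cycles is None or completed < cycles:
        for farm in farms:
            get_files(farm)       # stands in for the getFiles() method
        generate_xml()            # stands in for starting the XML generator
        sleep(update_interval)    # UPDATE_INTERVAL in the configuration
        completed += 1
```

Passing `sleep` and the two callables as arguments keeps the cycle testable without waiting or contacting a remote farm.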

3. XML GENERATOR: In earlier versions of McFarmGraph a lot of computation was done while the client (browser) was waiting. Although this processing was not a bottleneck, its cost would have grown as the size of the flat files increased. To avoid this, the bulk of the processing is now done offline and the resulting data is stored in an XML file. Generating a page is now simply a matter of reading the data from the XML file and producing the HTML code.

The task of generating the XML data is done by two scripts: a wrapper that, for each file pulled over from the various farms, calls a subroutine (in xmlgen.pm), which generates the XML data for that status file. The diagram below shows the flow chart for the subroutine.

Flow chart of XML generator

START
1. Read the file path of the file to operate on and create file paths for both XML files.
2. Create a temporary sorted file from the status file.
3. Read a line from the sorted file.
4. EOF? If yes, write the archived job info to the arch XML file, delete the sorted file, and exit. If no, continue.
5. RequestId changes? If no, accumulate job info and return to step 3. If yes, calculate % Done.
6. %Done = 100? If yes, accumulate archived job info. If no, write the info to the live jobs XML file. Return to step 3.
EXIT
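The sort, group and split steps of the XML generator can be illustrated with the Python sketch below. The real subroutine lives in xmlgen.pm and works on McFarm's status file format, which is not reproduced here: the three-column line format (job name, % done, phase), the XML element and attribute names, and the function names are all assumptions made for the example.

```python
import itertools
import xml.etree.ElementTree as ET


def request_id(job_name):
    """Request id is the part of the job name before the first '-'."""
    return job_name.split("-", 1)[0]


def build_xml(status_lines):
    """Return (live, arch) XML trees built from status file lines.

    Each line is assumed to be '<job name> <% done> <phase>'; this layout
    is hypothetical and stands in for the real flat file format.
    """
    records = [line.split() for line in status_lines if line.strip()]
    # Sorting brings all jobs of one request together, as the temporary
    # sorted file does in the flow chart.
    records.sort(key=lambda rec: (request_id(rec[0]), rec[0]))

    live = ET.Element("requests")
    arch = ET.Element("requests")
    for req, group in itertools.groupby(records,
                                        key=lambda rec: request_id(rec[0])):
        jobs = list(group)
        percent = sum(float(job[1]) for job in jobs) / len(jobs)
        # Requests that are 100% done go to the archive, others stay live.
        target = arch if percent >= 100.0 else live
        node = ET.SubElement(target, "request", id=req, done=f"{percent:.1f}")
        for name, done, phase in jobs:
            ET.SubElement(node, "job", name=name, done=done, phase=phase)
    return live, arch


if __name__ == "__main__":
    lines = [
        "Req2-b-02 100 DONE",
        "Req1-a-01 50 D0GSTAR",
        "Req2-b-01 100 DONE",
        "Req1-a-02 100 DONE",
    ]
    live_tree, arch_tree = build_xml(lines)
    print(ET.tostring(live_tree, encoding="unicode"))
    print(ET.tostring(arch_tree, encoding="unicode"))
```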

4. CGI and PERL Scripts

The following scripts generate the various Web pages:
1. filter.cgi
2. applet.pm
3. filemani.pm
4. generalpage.pm
5. html.pm
6. jobpage.pm

All of these scripts are written in PERL and are located under /usr/local/apache2/cgi-bin on hepfm000.uta.edu. Apart from these scripts there is the Java applet code, which is in the file PieChart.java under /usr/public_html/job_status/applet on hepfm000.uta.edu. All the images that are used in the web pages and the “style.css” file are also under /usr/public_html/job_status on hepfm000.uta.edu.

Functions of various scripts

filter.cgi: All the requests from the browser are directed to filter.cgi along with a set of parameters. This script then invokes subroutines in other files depending on the parameters.
applet.pm: This script generates all the applet specific HTML code.
filemani.pm: This script consists of a single subroutine whose function is explained below.
generalpage.pm: This script generates the bulk of the Req. Desc page.
jobpage.pm: This script generates the Job Desc page.
html.pm: This script prints most of the HTML code for all scripts.

Generation of Web Pages

McFarmGraph generates most of the pages dynamically using CGI. The only static page is the “index.html” page, which is stored under /usr/public_html/job_status on hepfm000.uta.edu. To add a new farm this page has to be modified (refer to the installation guide for more details). For the other pages there are 3 cases:

1. “Farm Request Ids” Page Request

Generation of “Farm Request Ids” page

Browser (Job Status index.html page) -> filter.cgi: request with parameters main, farm name
filter.cgi -> html.pm: printHTMLHeader, printHTMLfooter, printHTMLCell, printCellLink
filter.cgi -> filemani.pm: readDir
filter.cgi -> Browser: HTML code for the webpage

In this case the parameters passed to the filter.cgi script include “main” and the farm name. The filter.cgi script calls a function in html.pm to print the header and then calls the readDir function, which reads the directory for the requested farm and creates a link for each mcp version available on that farm.

2. “Farm Request Desc.” Page Request

Generation of “Farm Request Desc.” page

Browser (Farm request ids page) -> filter.cgi: request with parameters genpage, mcp ver., farm name, arch?
filter.cgi -> generalpage.pm: generalPage
html.pm functions called: printHTMLHeader, printHTMLCell, printCellLink, printHTMLfooter
generalpage.pm -> applet.pm: printApplet
filter.cgi -> Browser: HTML code for the webpage

The page generated here contains either all the “live jobs” or all the “archived jobs” on this farm for the requested mcp version, depending on whether the last parameter (arch) is present. The filter.cgi script calls the generalPage subroutine in generalpage.pm, which does the rest of the processing. generalPage calls various subroutines in html.pm and also printApplet in applet.pm, which embeds the applet into the generated HTML code. The PHASES column in the archived page indicates all the phases this Request has gone through.

3. “Job Desc.” Page Request

Generation of “Job Desc.” page

Browser (Farm request desc. page) -> filter.cgi: request with parameters jobpage, mcp ver., Req desc., farm name, status
filter.cgi -> jobpage.pm: jobPage
html.pm functions called: printHTMLHeader, printHTMLfooter, printHTMLCell
filter.cgi -> Browser: HTML code for the webpage

The page generated here lists all the jobs in a particular group identified by the Request Id passed on by the browser. If a status parameter is present in the request (as is the case when the applet link is clicked), the page lists the details of all the jobs in the group whose current status is the one requested. For example, it might contain all the jobs whose status is “D0GSTAR”.
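The three cases above amount to a dispatch on the first request parameter. The Python sketch below illustrates the idea only; filter.cgi itself is a PERL script, and the query string keys (page, farm, mcp, req, status, arch), the handler names and the returned strings are hypothetical. It also assumes that the presence of arch selects the archived view.

```python
from urllib.parse import parse_qs


def farm_request_ids_page(params):
    """Case 1: list the mcp versions available on the farm."""
    return f"request ids for farm {params['farm']}"


def farm_request_desc_page(params):
    """Case 2: live or archived requests for one mcp version."""
    view = "archived" if "arch" in params else "live"
    return f"{view} requests for {params['farm']}/{params['mcp']}"


def job_desc_page(params):
    """Case 3: jobs of one request, optionally filtered by status."""
    text = f"jobs of {params['req']} on {params['farm']}/{params['mcp']}"
    if "status" in params:
        text += f" with status {params['status']}"
    return text


# The keys are the first parameters shown in the three diagrams.
HANDLERS = {
    "main": farm_request_ids_page,
    "genpage": farm_request_desc_page,
    "jobpage": job_desc_page,
}


def dispatch(query_string):
    """Pick the page generator from the 'page' parameter and call it."""
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    handler = HANDLERS.get(params.get("page"))
    if handler is None:
        raise ValueError(f"unknown page request: {query_string!r}")
    return handler(params)


if __name__ == "__main__":
    print(dispatch("page=main&farm=CSE-HEP"))
    print(dispatch("page=genpage&farm=CSE-HEP&mcp=mcp14&arch=1"))
    print(dispatch("page=jobpage&farm=CSE-HEP&mcp=mcp14&req=Req6279&status=D0GSTAR"))
```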

5. FUTURE WORK:

The performance of the McFarmGraph tool can certainly be improved. Some possible future work is highlighted below. These are only suggestions; no study of their feasibility or likely success has been done.

• Exploring alternatives to CGI: When the number of farms being monitored increases, the CGI scripts could become a performance bottleneck. Java servlets might be able to solve this issue.

• Expiration of proxies: During the course of development of McFarmGraph, it was observed that the proxies expire, which disables the retrieval of status files from the remote site. A modification to the daemon could solve this problem.

• Java applets load slowly: For every row in the job status page a new applet is loaded and executed on the client side Java Virtual Machine. Mechanisms for caching the byte code and having a single instance of the applet would reduce the loading time.

• McFarmGraph status updates: Currently McFarmGraph pulls complete status files, even though the majority of the information they contain has already been pulled over before. It would be more efficient to get information only about those Requests that either have jobs still running on the farm or have finished since the last update.