17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09) IOP Publishing Journal of Physics: Conference Series 219 (2010) 082004 doi:10.1088/1742-6596/219/8/082004

DIRAC: Secure Web User Interface

A Casajus Ramo 1, M Sapunov 2

1 University of Barcelona, Diagonal 647, ES-08028 Barcelona, Spain 2 Centre de Physique des Particules de Marseille, 163 Av de Luminy Case 902 13288 Marseille, France

E-mail: [email protected]

Abstract. Traditionally, users interact with the Grid through command line tools. However, these tools are difficult for non-expert users: they provide minimal help and generate output that is not always easy to understand, especially in case of errors. Graphical User Interfaces are typically limited to providing access to monitoring or accounting information and concentrate on particular aspects, failing to cover the full spectrum of grid control tasks. To make the Grid more user friendly, more complete graphical interfaces are needed. Within the DIRAC project we have attempted to construct a Web based User Interface that provides the means not only to monitor the system behavior but also to steer the main user activities on the grid. Using DIRAC's web interface a user can easily track jobs and data. It provides access to job information and allows performing actions on jobs such as killing or deleting them. Data managers can define and monitor file transfer activity as well as check requests set by jobs. Production managers can define and follow large data productions and react if necessary by stopping or starting them. The Web Portal is built following all the grid security standards and using modern Web 2.0 technologies, which make it possible to achieve a user experience similar to that of desktop applications. Details of the DIRAC Web Portal architecture and User Interface are presented and discussed.

© 2010 IOP Publishing Ltd

1. Introduction

Since the beginning of the distributed computing era, users have needed to know what has happened to their payloads. Command line interfaces have been the usual tools, but in the framework of the LCG and EGEE projects several graphical interfaces were created. Most current monitoring systems provide either very low level or very high level views. Although these types of views are very useful for site managers, users require other ways to control their grid activity. Few monitoring systems provide views useful for non-expert users or interactivity with the monitored object.

When the development of the new revision of the DIRAC project started, an interactive monitoring interface was defined as the key new feature. It had to allow users to monitor their jobs in a platform independent way. The web proved to be a framework that allowed having an interactive monitoring interface that is easy to use for non-expert users, and a powerful way to interact with DIRAC [1] for experts. In order to shorten the learning curve, the Web Monitoring had to have a user-friendly interface mimicking standard graphical interface elements, like menus or windows, commonly found in desktop applications. Another key requirement was complete interactivity in the monitoring interface: all the actions users can perform via the command line have to be available via the monitoring web interface as well. Interaction requires an authentication and authorization mechanism based on grid certificates.

After formulating these requirements we started to look for a satisfactory solution. The well known and widely used Grid monitors were carefully examined: GridView [2], GridPP [3] and MonALISA [4].

GridView is a monitoring and visualization tool which provides a high level view of various functional aspects of the LHC Computing Grid (LCG). It shows statistics of data transfers, running jobs and service availability information for the Grid. Unfortunately for us, it is a very high level solution used to display statistical information; it does not meet user needs.

GridPP is a brilliant 3D monitor which gathers information from resource brokers around the world. Using images from NASA's Blue Marble Project, it presents a visualization of the Grid at work. It could be used as a general overview for the DIRAC system, but like GridView it is a high level solution. Moreover, the client itself is not web based. Although there is an option which allows mapping the monitoring data to Google Maps, it cannot interact with a user at the level we want.
MonALISA is a framework based on a dynamic distributed service architecture and is able to provide complete monitoring, control and global optimization services for complex systems. The monitor can be used at a user level, but it cannot provide certificate based authorization, and the control interfaces it provides cannot be used for job manipulation.

Based on these studies we decided to create our own monitoring client to fit our needs. The main features to provide were secure access to the web monitor using grid certificates and user-system interaction. The resulting DIRAC monitoring interface is designed and built with an interaction paradigm in mind, instead of passively looking at the history of objects.

In Section 2 of this paper we describe the architecture of the monitoring system and justify the choice of its components and their implementation. An overview of the security issues and solutions is presented in Section 3. Interaction between the Web Portal and the services is described in Section 4. The user interface, its goals and features, as well as known limitations are described in Sections 5 and 6 respectively. Section 7 is devoted to conclusions and an outlook for future work.

2. Architecture overview

2.1. Brief explanation of how it works

In this section we present the architecture of the web monitoring interface and its interaction with DIRAC. We start with a quick explanation of the way it works, from the mouse click to the page update. Details follow later in this section.

When a user clicks on any element of the web page, an event is triggered and processed by the JavaScript interpreter. We used a JavaScript library to create a common look and feel throughout the whole set of web pages. Using a JavaScript library allows us to focus our efforts on building functionality by having a set of widgets ready to use. To mimic the look-and-feel of a desktop application the ExtJS library [5] is used. It allows information retrieved from the web server to be displayed dynamically using AJAX techniques [7], so there is no need to refresh the whole page. AJAX provides a way to issue a standard GET/POST HTTP query from the user's browser to the web server and feeds the results to ExtJS components, which can modify the web page dynamically and hide the client-server interaction.

When the web server receives a query, it is processed by DIRAC code running in the web server. To handle all the parameter parsing and URL mapping, the Pylons framework [6] is used.


Pylons processes all the incoming HTTP queries, translates the parameters to Python variables and maps each URL to a Python function. The Python function executed by Pylons acts as an adapter to DIRAC. If some information is required from a DIRAC service, the function uses DIRAC clients to retrieve it. When a connection to a DIRAC service is required, the DISET [1] secure protocol (which is part of the DIRAC framework) is used. Once the web server gets a response from the service, it passes the results back to the user's browser. This information is then processed by JavaScript code and the web page is modified accordingly.

Fig 1. Protocol used for interaction between layers

2.2. Server side architecture

The DIRAC Web Portal uses Pylons as the Python framework to handle all the HTTP processing. Pylons includes a web server for testing purposes, but it does not scale properly. To make the solution more scalable, Pylons is run in conjunction with an Apache web server. Apache can run multiple processes to serve requests and spawn or kill processes as needed. Each Apache process runs a Pylons instance. Client authentication can be handled by the Apache mod_ssl module.

Although Apache is a well known and rock solid solution, it may not be the most suitable one for our needs. As an alternative we have tried Lighttpd [8], a well known web server used by projects such as YouTube. Its high speed I/O infrastructure allows better scaling than the Apache server on the same hardware. Moreover, its event-driven architecture is optimized for a large number of parallel connections. Unfortunately, Lighttpd does not fully support the OpenSSL authentication mechanisms. If future releases provide the required functionality, Lighttpd will probably be used instead of Apache for our solution. Another alternative to Apache is Nginx [9], a web server with load balancing and fault tolerance, but it does not fully support OpenSSL either.

The DIRAC web logic is coded in Python and runs under Pylons, which is also coded in Python. To run it under Apache, the mod_python module is used. By embedding a Python interpreter in the server process, it increases the execution speed compared with the standard CGI mechanism, which needs to instantiate the interpreter each time a request is received.

Most modern Python frameworks can use the WSGI [10] mechanism. The Web Server Gateway Interface specification defines a simple and universal interface between a web server and a web application or framework. Currently, the DIRAC Web Portal uses mod_python coupled with the WSGI mechanism, but mod_wsgi has recently been developed and provides all the necessary functionality. mod_python could be replaced by mod_wsgi, but we have not tested it yet.

Another way of executing Python code under a web server is to use the SCGI [11] or FastCGI [12] interfaces. The Simple Common Gateway Interface (SCGI) and FastCGI are replacements for the old CGI, and they are supported by most web servers. The main difference between these two interfaces is that SCGI is much simpler and is capable of faster processing of requests. SCGI can be a good alternative to the WSGI interface. Any of these solutions can be used as a replacement for the "standard" Apache and mod_python chain. In any case, having the interpreter invoked only once per process and kept alive across different queries makes mod_python the more performant solution.
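To make the WSGI contract concrete, the following is a minimal sketch of a WSGI callable in the sense of PEP 333 (written here in its Python 3 form). It is not DIRAC or Pylons code: the names `application` and `call_once` and the response text are illustrative assumptions. It only shows that the server holds on to one callable and invokes it for every request, instead of starting a new interpreter each time as CGI does.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable: the server invokes it once per request."""
    path = environ.get("PATH_INFO", "/")
    body = ("Requested path: %s\n" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_once(path):
    """Drive the application by hand, the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys the WSGI spec requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body
```

Any WSGI-capable server could host the same `application` object unchanged, which is the portability argument made above for the WSGI mechanism.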


As for the web framework, Pylons provides a really easy way to construct a web service. The only requirement is Python. The DIRAC Web Portal has to interact with DIRAC services, so the ability to use DIRAC libraries to connect to DIRAC services was required. There are other Python frameworks, such as TurboGears, but Pylons is the simplest and easiest to use while still being feature rich. In fact, we only require a way to map a URL to a Python function and a library to handle all the low level HTTP parsing and encoding.

An application running inside Pylons is organized as a set of controllers, which perform actions. A controller is a Python object, and each object has one or several functions, which are the actions that can be performed. Typically a URL is mapped to a controller action by having URLs of the form "prefix/controller/action", but the mapping can be modified to suit any need.

Template engines are used to separate the data from its representation by adding the requested data to a page skeleton to generate the final web pages. Template engines allow a developer to code a web page in HTML and reserve the space for the data using special indicators. To render the final web page, the template is parsed and the dynamic indicators are filled with data by the template engine. The Mako [13] template engine is used for the DIRAC Web Portal. It is the default template engine for the Pylons framework. Although Pylons can use other template engines such as Cheetah or Genshi, Mako is powerful enough to fit our needs.

To describe the execution chain in detail, the following actions are triggered on the server when a user enters a URL in the address bar of the browser:
• If the query is made through the HTTPS (secure HTTP) protocol, mod_ssl authenticates the connection. If the certificate presented by the user is not valid, the connection is rejected. If the client certificate is valid, Apache invokes Pylons to process the request. If the connection is insecure (HTTP), no authentication takes place.
• Based on the query URL, Pylons maps the query to a Python function. This function can use DIRAC if it needs to. If this function needs to contact a DIRAC service, the DISET protocol is used.
• After the response has been received, the Python function can decide to render a web page. In this case the template engine is fed with the received data to generate an HTML page, which is sent back to the browser. Alternatively, if it is a response to an AJAX request, the data is formatted as JSON and sent back to the browser.
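The last two steps of this chain can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Pylons routing code or the portal's controllers: `JobsController`, `dispatch`, the `DIRAC` URL prefix and the job data are hypothetical, and the standard library `string.Template` stands in for the Mako template engine.

```python
import json
from string import Template

# Stand-in for a Mako template: a page skeleton with a placeholder for the data.
PAGE = Template("<html><body><h1>Jobs</h1><p>$summary</p></body></html>")


class JobsController:
    """Hypothetical controller; each public method is an action."""

    def index(self, params):
        # A full page request: feed the data to the template engine.
        return "text/html", PAGE.substitute(summary="2 jobs found")

    def list(self, params):
        # An AJAX request: return only the data, encoded as JSON.
        jobs = [{"id": 1, "status": "Running"}, {"id": 2, "status": "Done"}]
        wanted = params.get("status")
        if wanted:
            jobs = [job for job in jobs if job["status"] == wanted]
        return "application/json", json.dumps({"total": len(jobs), "jobs": jobs})


CONTROLLERS = {"jobs": JobsController()}


def dispatch(url, params=None):
    """Map a 'prefix/controller/action' URL to a controller action."""
    parts = url.strip("/").split("/")
    if len(parts) != 3:
        return "404 Not Found", "text/plain", "unknown URL"
    _prefix, controller_name, action_name = parts
    controller = CONTROLLERS.get(controller_name)
    # Refuse private names so that only intended actions are reachable.
    if controller is None or action_name.startswith("_"):
        return "404 Not Found", "text/plain", "unknown URL"
    action = getattr(controller, action_name, None)
    if not callable(action):
        return "404 Not Found", "text/plain", "unknown URL"
    content_type, body = action(params or {})
    return "200 OK", content_type, body
```

For example, `dispatch("/DIRAC/jobs/index")` returns an HTML page built from the skeleton, while `dispatch("/DIRAC/jobs/list", {"status": "Done"})` returns a JSON document suitable for an AJAX caller.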

Fig 2. Software layers for communication between users and DIRAC services


2.3. Client side architecture

Plain HTML pages can display information but allow little interaction. One of the main goals of the DIRAC Web Portal is to provide an interactive interface that emulates a desktop application. To do so, web pages must be updated dynamically as users interact with them. We use JavaScript to transform web pages as needed, which avoids external dependencies. JavaScript is a well established technology. Although all major browsers support JavaScript, there are some differences in their implementations, and some of them have poor support for the JavaScript standard.

There are several other solutions that provide dynamic updates of web pages. Adobe Flash provides a scripting language called ActionScript [14], which is used to display changes dynamically. ActionScript is an implementation of the ECMAScript standard focused on animation. It has almost the same syntax as JavaScript but is executed in a different framework with a different set of libraries. It requires an execution environment called "Flash Player", and does not provide a secure connection mechanism with grid certificates. On top of that, Adobe Flash is a proprietary technology and its use can be forbidden under certain circumstances by Adobe. Nevertheless, Adobe Flash has become very widespread: Adobe claims that 99.3 percent of all internet users have the Flash player installed.

Another alternative is the Silverlight [15] technology created by Microsoft. In a Silverlight application, the user interface is declared in XAML (a declarative XML-based language used to initialize structured values and objects) and programmed using a subset of the .NET framework. Textual content created with Silverlight is searchable and indexable, as it is represented as text rather than rendered. Unfortunately, Silverlight does not use open standards that would allow continued use if Microsoft decided to change the API between versions.

JavaScript is a stable and mature technology, although it might not be as easy to use as the alternatives. When AJAX (Asynchronous JavaScript and XML) was established as a de facto standard, JavaScript received much more attention, and as a result there are many frameworks and libraries which ease the JavaScript development cycle. These libraries allow building rich interactive web applications using techniques such as AJAX, DHTML and DOM scripting.

We started with the Yahoo User Interface (YUI) library [16]. Although YUI is a perfectly documented, fast and flexible library, after three months of prototyping we decided to move to another solution. The main reason was the slow processing of large tables with more than 500 entries. Another important reason was that it is a low level library, and we do not need that level of detail. A JavaScript library should help build web applications from large building blocks, whereas YUI requires writing a lot of low level glue code.

We moved to the ExtJS library, originally built as an add-on extension to YUI. It provides an extra level of abstraction. The second version of this library provides an interface more similar to those traditionally associated with desktop application development. The ExtJS documentation is somewhat lacking, especially compared with YUI. An active community and pixel-perfect widgets similar to those used in desktop applications compensate for this.

The ExtJS library works at the first level of the user interaction. Every user action is handled by the ExtJS code. If the JavaScript code running in the user's browser detects an action that requires some information from the web server, an AJAX request is executed and the response is processed to modify the web page accordingly.

3. Authorization and authentication

Users can access sensitive information through the DIRAC Web Portal. To protect it, the DIRAC Web Portal provides access control. If users do not identify themselves, they are only allowed to access the public part of the Web Portal. The public part allows a visitor to see information about DIRAC and some statistics showing how the system is behaving, but it does not allow users to see any private information.


3.1. User authentication

The DIRAC Web Portal provides HTTP and HTTPS access. Insecure HTTP access is provided for visitors who want to navigate the public part of the site. Secure HTTPS access, on the other hand, allows users to authenticate themselves. Users need to have a valid certificate registered in the DIRAC configuration database and loaded in the browser. If the certificate has not been issued by one of the trusted Certification Authorities (CA), access to the site is denied. If the certificate is valid, the site checks whether the Distinguished Name (DN) of the certificate is registered in DIRAC. If the DN is not registered, the Web Portal gives the user visitor credentials and allows navigation only to the public part of the site.

Certificate authentication provides a secure mechanism to authenticate users. Users do not have to type any secret pass phrase. However, users need to have their certificate with them, so logging in from a public computer is not easy. Passwords can be captured by key loggers on public computers, so the DIRAC Web Portal does not provide password authentication.
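The decision flow described in this subsection can be summarized in a short Python sketch. It is illustrative only: the issuer and subject names, the `Credentials` type and the `authenticate` function are hypothetical, and the trust check is reduced to a name lookup, whereas in the portal the certificate chain is verified cryptographically by mod_ssl before any Python code runs.

```python
from dataclasses import dataclass

# Hypothetical data; in the portal these would come from the trusted CA
# bundle and from the DIRAC configuration database.
TRUSTED_ISSUERS = {"/C=ES/O=ExampleCA/CN=Example Grid CA"}
REGISTERED_DNS = {"/C=ES/O=Example/CN=Alice Example": "alice"}


@dataclass(frozen=True)
class Credentials:
    username: str
    is_visitor: bool


class AccessDenied(Exception):
    """Raised when the connection itself must be rejected."""


def authenticate(secure, issuer_dn=None, subject_dn=None):
    """Return the credentials for one request, following the flow above."""
    if not secure:
        # Plain HTTP: no authentication, public pages only.
        return Credentials("anonymous", is_visitor=True)
    if issuer_dn not in TRUSTED_ISSUERS:
        # Certificate not issued by a trusted CA: access is denied.
        raise AccessDenied("certificate not issued by a trusted CA")
    username = REGISTERED_DNS.get(subject_dn)
    if username is None:
        # Valid certificate, but the DN is unknown to DIRAC: visitor access.
        return Credentials("anonymous", is_visitor=True)
    return Credentials(username, is_visitor=False)
```

Note the three distinct outcomes: rejection for an untrusted certificate, visitor credentials for an unregistered DN or an insecure connection, and full user credentials otherwise.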

3.2. Groups and privileges

Different users have different privileges. To map privileges, users are organized into groups. Each group has a different set of privileges and can perform different actions. Before executing any action, users have to select their active group, and the DIRAC Web Portal provides a way to do so.

Fig 3. Selection of the active group for a user

Once the user has selected the active group, the site has to restrict access to the areas allowed for that group. This is achieved by modifying the web pages to reflect the privileges: only the actions and data allowed for the active group are displayed in the pages.

Fig 4. Web pages modification based on active group


Not only do the web pages have to reflect the different privileges, but access to certain parts of the site also has to be restricted. The DIRAC Web Portal has a set of authorization rules applied to the web pages. The authorization schema is the same as for the DIRAC services, but web pages, instead of service actions, are the entities to which access is controlled.
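As a rough illustration of these two layers, the sketch below keeps one rule table for pages and one for actions. The group names, page names and rule tables are invented for the example and do not reproduce the actual DIRAC authorization schema; the point is only that hiding an action in the page and refusing access to a page are separate checks.

```python
# Hypothetical rules: which groups may open each page and trigger each action.
PAGE_RULES = {
    "info/general": {"visitor", "user", "production"},
    "jobs/monitor": {"user", "production"},
    "production/manage": {"production"},
}
ACTION_RULES = {
    "kill": {"user", "production"},
    "delete": {"user", "production"},
    "stop_production": {"production"},
}


def can_access(page, group):
    """Server-side check applied before a page is served; unknown pages are refused."""
    return group in PAGE_RULES.get(page, set())


def visible_actions(group):
    """Actions to render in the page for the active group, in a stable order."""
    return sorted(name for name, groups in ACTION_RULES.items() if group in groups)
```

With these tables, a member of the hypothetical `user` group sees the `delete` and `kill` actions but is refused the `production/manage` page even if its URL is typed by hand.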

4. Services interaction

The DIRAC Web Portal is designed to display dynamic information and to interact with the user. When a user accesses a page that contains dynamic information or triggers any action (like requesting the description of a job), the Web Portal has to connect to a DIRAC service to forward this request. DIRAC services need to know the identity of the requester. If the user is a visitor, the requester is the web site itself, but if it is a registered user, the user's credentials also need to be forwarded.

Fig 5. Mechanism of the Web Portal interaction with the DIRAC services

When the DIRAC Web Portal receives a request from an authenticated user via the secure interface, it can access the certificate information of the user, but cannot access the private key. This prevents the Web Portal from using the user's certificate to contact the DIRAC services, so the credentials have to be forwarded in some other way. There are two ways in which the user credentials can be forwarded:
• Using the DIRAC Proxy Manager service to retrieve the user's proxy and use it to contact the final service. Using the real user proxy prevents the Web Portal from inventing credentials, but it can only use credentials that are stored in the Proxy Manager service. This requires that users upload their proxies with a defined DIRAC group to the Proxy Manager service before using the Web Portal with that group.
• The Web Portal uses its own credentials and instructs the DIRAC service to use other credentials. This solution allows the Web Portal to use any credentials, but users do not have to do anything special. As soon as they are registered in DIRAC, users can access the Web Portal.


Both solutions are implemented, but currently the second one is used for convenience: users just need to connect with their certificate to use the site, with no prior step. The first option can also be enabled via a configuration parameter.
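The trade-off between the two forwarding options can be made explicit with a small sketch. Everything in it is hypothetical: `ProxyStore` merely stands in for the Proxy Manager service, `build_request` and its `use_user_proxy` flag stand in for the configuration parameter mentioned above, and the returned dictionary is not the DISET wire format.

```python
class ProxyStore:
    """Hypothetical stand-in for the Proxy Manager service."""

    def __init__(self):
        self._proxies = {}

    def upload(self, dn, group, proxy):
        self._proxies[(dn, group)] = proxy

    def get(self, dn, group):
        return self._proxies.get((dn, group))


def build_request(dn, group, use_user_proxy, store, portal_credentials="portal-cert"):
    """Describe how a call to a DIRAC service would be authenticated."""
    if use_user_proxy:
        # Option 1: works only if the user uploaded a proxy for this group.
        proxy = store.get(dn, group)
        if proxy is None:
            raise LookupError("no proxy uploaded for %s in group %s" % (dn, group))
        return {"credentials": proxy, "on_behalf_of": None}
    # Option 2: the portal authenticates itself and names the real requester.
    return {"credentials": portal_credentials, "on_behalf_of": (dn, group)}
```

The sketch shows why the second option needs no prior step from the user, and also why it requires the services to trust the portal: nothing but the portal's own statement ties the request to the named user.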

5. User Interface Layer

The DIRAC Web Portal is built using GUI elements that mimic desktop applications, such as toolbars, menus, windows and buttons. All pages have two toolbars, one at the top and another at the bottom of the page, which contain the main navigation widgets. The top toolbar contains the main menu and reflects the logical structure of the Portal. It also allows the active DIRAC setup to be selected. The bottom toolbar allows users to select their active group and displays the identity the user is connected with.

Fig 6. Job monitoring page in DIRAC Web Portal

The most commonly used layout within our Web Portal is a table on the right side of the page and a side bar on the left. Almost all the data that needs to be displayed can be represented as a two-dimensional matrix using a table widget. This widget has a built-in pagination mechanism and is very customizable. As a drawback, loading data into the table is rather slow. On average desktop hardware, tables with more than 100 elements can be slow to display, and tables with more than a thousand entries can take more than 10 seconds to load. The table widget is slow because processing large arrays is inefficient in JavaScript.

The Job Monitoring page is the most accessed page of the DIRAC Portal. On the left of the page there is a side bar, as seen in Figure 6. It is used as a container for selection widgets and global sort controls, and to display statistical data. The main display on the Job Monitoring page is the table widget showing the user's jobs and some relevant information about them. There is a lot of information associated with each job. To simplify the presentation, the non-essential information is hidden by default, but it can easily be displayed. Extra information that cannot be displayed in table form, for example the job JDL, is also available on request. The requested information appears in a new window created by ExtJS in the same web page as an overlay. Users can open as many display windows as they like, so they can view multiple pieces of information at the same time.

Each row has a widget to mark selected jobs. After marking the jobs, the user can perform several actions on them, like rescheduling, killing or deleting them. Any action can be performed on a single job or on a group of jobs.
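Pagination is what keeps the table usable despite these costs: the browser only ever receives one page of rows. A minimal server-side sketch is shown below. The function name `paginate`, the `start` and `limit` parameters and the `total` and `result` fields are assumptions chosen for the example, not the portal's actual request format.

```python
import json


def paginate(rows, start=0, limit=25):
    """Return one page of rows plus the overall total, encoded as JSON."""
    if start < 0 or limit <= 0:
        raise ValueError("start must be >= 0 and limit must be > 0")
    page = rows[start:start + limit]
    # The total lets the table widget compute how many pages exist.
    return json.dumps({"total": len(rows), "result": page})
```

With 25 rows per page, a user with several thousand jobs still triggers only small array operations in the browser for each page shown.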


6. Known limitations

There are currently several limitations in the Web Portal usability:
• A modern browser with JavaScript support (except Microsoft Internet Explorer) is required. Chrome, among others, can be used.
• The DIRAC Web Portal is still under development and the code is not yet optimized. As a result, some pages are relatively slow to respond.
• Unfortunately, due to limitations in the Internet Explorer JavaScript interpreter, the ExtJS library does not work properly on it, so users cannot use the DIRAC Web Portal from this browser. Future versions of the Internet Explorer JavaScript interpreter are expected to satisfy the ECMA standards.
• As mentioned before, JavaScript has poor efficiency when dealing with big arrays, and many web browsers have memory leaks that render the web pages unusable and force a restart of the web browser. The situation is improving with new versions of browsers, in which JavaScript support is receiving much attention.

7. Conclusions and outlook

After a year of development the DIRAC Web Portal now provides the desired features. It is a secure portal that behaves like a desktop application and is easy for new users to learn. However, quite a number of views remain to be implemented. Users have also provided some feedback; the main feature requested is GUI customization. Although ExtJS provides a rich set of GUI components, not all of them are perfect. For instance, the table widget does not adjust column widths automatically, so users have to change the column widths by hand. One way to implement these requests would be to have a user profile that stores the modifications made by users.

As for the general development, testing and incorporating new technologies like Adobe AIR [17] or Google Gears [18] is under investigation. Both cross-platform runtime environments allow executing web application code offline and moving time consuming and network intensive operations to the background. This new type of web environment will provide the basis for building the next generation of interfaces that will replace today's standard command line interfaces.

References
[1] A Tsaregorodtsev et al, DIRAC: A Community Grid Solution, Proceedings of the CHEP 2007 Conference
[2] GridView project, http://gridview.cern.ch
[3] GridPP project, http://www.gridpp.ac.uk
[4] MonALISA project, http://monalisa.caltech.edu
[5] ExtJS, http://extjs.com
[6] Pylons, http://pylonshq.com
[7] AJAX, "The XMLHttpRequest Object", World Wide Web Consortium, 2006-04-05
[8] Lighttpd project, http://www.lighttpd.net
[9] Nginx project, http://nginx.net
[10] WSGI, PEP 333, Python Web Server Gateway Interface v1.0
[11] SCGI specification
[12] FastCGI specification
[13] Mako project, http://www.makotemplates.org
[14] Adobe ActionScript, ActionScript 3.0 Language & Component Reference
[15] Silverlight architecture
[16] YUI, http://developer.yahoo.com/yui/
[17] Adobe AIR: Browser vs. Desktop
[18] Google Gears
