International Desktop Grid Federation – Support Project

Contract number: RI-312297

Report on best practices in infrastructure support

Project deliverable: D2.3.1

Due date of deliverable: 30/04/2013
Actual submission date: 08/07/2013

Lead beneficiary: MTA SZTAKI
Workpackage: WP2 - Supporting infrastructure

Dissemination Level: PU (Public)
Version: 6.0 (FINAL)

IDGF-SP is supported by the FP7 Capacities Programme under contract nr RI-312297.


Copyright (c) 2013. Members of IDGF-SP consortium, see http://idgf-sp.eu/partners for details on the copyright holders.

You are permitted to copy and distribute verbatim copies of this document containing this copyright notice but modifying this document is not allowed. You are permitted to copy this document in whole or in part into other documents if you attach the following reference to the copied elements:

‘Copyright (c) 2013. Members of IDGF-SP consortium - http://idgf-sp.eu’.

The commercial use of any information contained in this document may require a license from the proprietor of that information.

The IDGF-SP consortium members do not warrant that the information contained in the deliverable is capable of use, or that use of the information is free from risk, and accept no liability for loss or damage suffered by any person and organisation using this information.


1 Table of Contents

1 Table of Contents
2 Status and Change History
3 Glossary
4 Introduction
5 Best practices in support tools
5.1 Non-DG related tools
5.2 DG related tools
6 Best practices in different desktop grid scenarios
6.1 Simple scenarios
6.1.1 SZDG BOINC project with the KOPI application
6.1.2 edges@home with clouds
6.2 Complex scenarios
6.2.1 University of Westminster Local Desktop Grid
6.2.2 Autodock portal + edges@home
6.3 Bridging scenarios
6.3.1 Modified Computing Element + edges@home
6.3.2 Bridges from DGs to the DesktopGrid VO
7 Best practices for security in desktop grid computing
7.1 Introduction
7.2 Securing servers
7.3 Securing the grid
7.4 Making projects reliable
8 Further case studies
8.1 DesktopGrid Testbed at JINR
8.2 DIRAC supported by EDGeS@home
8.3


8.3.1 How we raise money for great causes -- and the prize draw
8.3.2 How the prize winner is chosen
9 Questionnaire for BOINC project operators
10 References


2 Status and Change History

Status | Name | Date | Signature
Draft | Jozsef Kovacs | 09/05/13 | n.n. electronically
Reviewed | Adam Visegradi, Reka Makkos | 12/06/13 | n.n. electronically
Approved | Robert Lovas | 05/07/13 | n.n. electronically

Version | Date | Section/Part | Author | Modification
1.0 | 18/04/13 | All sections | JK | Creation of the initial version
1.1 | 23/04/13 | All sections | JK, TK | Updating sections
2.0 | 25/04/13 | Section 8 | JK, MM | Adding case studies
2.1 | 02/05/13 | Section 7 | AM | Adding section on “best practices for security…”
3.0 | 06/05/13 | Section 1 | JK, RL | Improving introduction
4.0 | 20/05/13 | Sections 4, 6, 9, 10 | JK, RL | Adding more material to the introduction, shortening section 6.1.1, updating references, adding the section about the questionnaire
5.0 | 05/06/13 | All sections | JK | Minor corrections
6.0 | 29/06/13 | All sections | RL | Final version (after review)


3 Glossary

3G Bridge Generic Grid-Grid Bridge

APEL Accounting Processor for Event Logs (tool for accounting in EGI)

ARC Advanced Resource Connector (grid middleware)

CE gLite Computing Element (grid middleware component)

CLI Command-Line Interface

CO Confidential

CREAM Computing Resource Execution And Management (gLite CE)

BOINC Berkeley Open Infrastructure for Network Computing

DC-API Distributed Computing Application Programming Interface

DCI Distributed Computing Infrastructure

DEGISCO Desktop Grids for International Scientific Collaboration

DG Desktop Grid

DoW Description of Work

EC European Commission

EC2 Amazon Elastic Compute Cloud

EDGeS Enabling Desktop Grids for e-Science

EDGI European Desktop Grid Initiative

EGI European Grid Infrastructure

FLOPS FLoating point Operations Per Second

GBAC Generic BOINC Application Client

GenWrapper Generic Wrapper tool for BOINC applications

gLite Middleware for Grid Computing supported by EGI

GOCDB Grid Configuration Database

GPL General Public License (open-source license)

GRID Distributed Computational Network


GUI Graphical User Interface

gUSE Grid and cloud User Support Environment (web-based portal)

IDGF International Desktop Grid Federation

mCE modified (gLite) Computing Element

MUNIN network/system monitoring application that presents output in graphs

NAGIOS Monitoring system in EGI

NGI National Grid Initiatives

OurGrid Free-to-join peer-to-peer Grid computing platform

PU Public

REST Representational State Transfer (web-service interface)

SAM Service Availability Monitoring

SG Service Grid

SOAP Simple Object Access Protocol (web-service interface)

SVN Apache Subversion

UNICORE Uniform Interface to Computing Resources (grid middleware)

VO Virtual Organisation

WLDG Westminster Local Desktop Grid

VM Virtual Machine

WMS Workload Management System (gLite component)

WN Worker Node (gLite component)

WP Workpackage

WU WorkUnit (task to be processed by a computational resource)

XML Extensible Markup Language (document format)

XWHEP XtremWeb for High Energy Physics: Desktop Grid Middleware

ZENWorks Software product for systems management


4 Introduction

The IDGF-Support Project aims to give the IDGF a boost in two important areas. Firstly, it helps considerably with increasing the number of citizens who donate computing time to e-Science; this is to be achieved by targeted communication activities and by setting up a network of "ambassadors". Secondly, it helps universities and other organisations to include otherwise idle PCs from their classrooms and offices in their e-infrastructures. In addition, IDGF-SP has been collecting and analysing data that will help deploy idle PCs in an effective and energy-efficient way; it has been shown that Desktop Grids can contribute to Green IT if used in the correct way.

This deliverable, ‘Report on best practices in infrastructure support’, briefly describes the selected set of support tools underpinning the best practices applied in the IDGF-SP project.

The deliverable also collects and describes the best practices suggested for infrastructure operation and support, based on the experiences of existing DG installations. The infrastructure expert group of the project proposes various scenarios and a wide range of configurations for Desktop Grid related computing infrastructures. Each scenario and configuration targets a well-defined usage mode, depending also on the type of included resources and on how users access them (i.e. job submission). These scenarios and configurations are briefly introduced at the beginning of each subsection in Section 6, followed by the description of best practices. Background information and more precise, detailed descriptions can be found at http://desktopgridfederation.org/technical-wiki (or http://doc.desktopgrid.hu).

Moreover, the document summarizes the best practices regarding security aspects of Desktop Grids, one of the key issues of such distributed systems. Finally, some further case studies following the proposed ways of building a combined SG-DG infrastructure are introduced, together with the first version of a questionnaire to be distributed among the infrastructure operators in order to collect more information for the second version of this deliverable.


5 Best practices in support tools

There are several widespread monitoring, communication, and testing tools that can be utilized during the operation and support of the infrastructure. The following sections describe these tools in detail.

5.1 Non-DG related tools

In the case of Service Grid resources and services it is highly recommended to rely on the automated tools provided by the European Grid Infrastructure (EGI, https://wiki.egi.eu/wiki/Tools). The following table summarizes the EGI central operations tools, their documentation, links to running instances, and the most relevant IDGF-SP related information already available for support purposes.

Tool: Operations portal
Description & documentation: The Operations Portal provides a view, based on the role of the viewer, of information about the status of resources and grid services. https://wiki.egi.eu/wiki/Operations_Portal
Instance: http://operations-portal.egi.eu/
IDGF related information: DesktopGrid VO: http://operations-portal.egi.eu/vo/view/voname/desktopgrid.vo.edges-grid.eu

Tool: Service Availability Monitoring (SAM)
Description & documentation: SAM is a system for monitoring distributed grid services that are part of the EGI infrastructure. The service covers the storage and computation of status and availability of services (OPS VO), the generation of the monthly EGI league report, a web interface (MyEGI) and web APIs for retrieving the stored data. https://wiki.egi.eu/wiki/SAM
Instance: https://wiki.egi.eu/wiki/SAM_Instances (Nagios and MyEGI portals)
IDGF related information: Example - availability and reliability of the IPB site of the DesktopGrid VO: http://grid-monitoring.cern.ch/myegi/sa/?view=2&graph=1&vo=37&profile=28&filters-value-Regions_or_Tiers=&filters-value-Sites=342&dateorperiod=pd&period=pM&startdate=28-03-2013&enddate=27-04-2013

Tool: Accounting portal
Description & documentation: The accounting infrastructure is a complex system that involves various sensors in different regions, all publishing data to a central repository. The data is processed, summarized and displayed in the accounting portal, which acts as a common interface to the different accounting record providers and presents a homogeneous view of the gathered data together with user-friendly access for understanding resource utilization. https://wiki.egi.eu/wiki/Accounting
Instance: http://accounting.egi.eu/
IDGF related information: Example - DesktopGrid VO utilization in terms of CPU hours since the start of the IDGF-SP project: http://accounting.egi.eu/custom.php?query=normcpu&startYear=2012&startMonth=11&endYear=2013&endMonth=6&yRange=DATE&xRange=SITE&voGroup=custom&voList%5B%5D=vo.edges-grid.eu&chart=GRBAR&scale=LIN

Tool: GStat
Description & documentation: GStat displays information about grid services, the grid information system itself and related metrics. It provides a method to visualize a Grid infrastructure from an operational perspective, based on information found in the grid information system. https://wiki.egi.eu/wiki/External_tools#Gstat
Instance: http://gstat.egi.eu/
IDGF related information: DesktopGrid VO: http://gstat.egi.eu/gstat/vo/desktopgrid.vo.edges-grid.eu/

Tool: Grid Configuration Database (GOCDB)
Description & documentation: The GOCDB is the central input system for recording Grid topology information. This includes the Grid sites that contribute to the production Grid infrastructure, their associated Service Endpoints, service downtime information and contact details for participants who maintain the infrastructure. https://wiki.egi.eu/wiki/GOCDB
Instance: https://goc.egi.eu/ (certificate is required for access)

Table 1 - EGI tools used heavily in infrastructure support

In the case of cloud resources, providers can consider the outcomes of the ongoing EGI Cloud Federation Task Force, especially ‘Scenario 5: Reliability/Availability of Resource Providers’ (https://wiki.egi.eu/wiki/Fedcloud-tf:WorkGroups:Scenario5), in order to apply the Grid related tools described above to cloud-based resources as well.

Another practice comes from SZTAKI: the use of ZABBIX (www.zabbix.com) for monitoring and testing cloud resources has proven feasible in the case of the OpenNebula based SZTAKI cloud (http://cloud.sztaki.hu/en/monitoring), which hosts several IDGF related services, e.g. the science gateway for the Autodock application (https://autodock-portal.sztaki.hu/), bridges, and storage/computing elements. The Phoronix test suite (http://www.phoronix-test-suite.com), an automated open-source testing framework, is currently under evaluation and seems to be a candidate tool especially for benchmarking related testing in the context of a new cloud-related accreditation effort.

5.2 Desktop Grid related tools

In the case of Desktop Grid resources, the Nagios framework (http://www.nagios.org/), a widely used tool with grid plugins, is recommended for monitoring purposes. For example, the following figure shows the status of bridges monitored with Nagios.

Figure 1 - Nagios-based monitoring of bridges in operation
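To make this practice concrete, the following is a minimal sketch of a Nagios-compatible probe for a bridge service. The host name and port are illustrative assumptions, not the actual deployment details, and a production check would typically also validate the service's SOAP response rather than mere TCP reachability.

#!/usr/bin/env python
# Minimal Nagios-style probe: checks that a 3G Bridge service endpoint
# accepts TCP connections. Host and port below are hypothetical.
# Exit codes follow the Nagios plugin convention: 0 = OK, 2 = CRITICAL.
import socket
import sys

HOST = "bridge.example.org"   # hypothetical bridge host
PORT = 8091                   # hypothetical service port

def main():
    try:
        with socket.create_connection((HOST, PORT), timeout=10):
            print("BRIDGE OK - %s:%d is accepting connections" % (HOST, PORT))
            return 0
    except OSError as err:
        print("BRIDGE CRITICAL - %s:%d unreachable (%s)" % (HOST, PORT, err))
        return 2

if __name__ == "__main__":
    sys.exit(main())

Registered as a Nagios command, such a script turns a bridge outage into a CRITICAL alert within one polling interval.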

The RT tool (http://bestpractical.com/rt/) is an open source enterprise-grade issue tracking system. It is used by several organizations for bug tracking, help desk ticketing, customer service, workflow processes, and change management. The tool has been used by other Grid projects and organizations as well, e.g. EGI (https://wiki.egi.eu/wiki/New_Requirement_Manual) and each phase of SEE-GRID (http://wiki.egee-see.org/index.php/Grid_Operational_Tools#Request_tracker).

WP2 Copyright (c) 2013. Members of IDGF-SP consortium - http://idgf-sp.eu 11/52

D2.3.1 – Report on best practices in infrastructure support

The RT instance for IDGF-SP (originating from DEGISCO) is available at http://desktopgridfederation.eu/helpdesk and can be accessed after registration. One ticket queue has been set up for each major infrastructure element, as can be seen in the figure below. For each queue, a group of IDGF infrastructure support experts has been registered and assigned.

Figure 2 - RT in operation

A ticket is recommended to be opened for the following event categories:

– critical software bug or vulnerability in the 3G bridge or other software component

– major configuration change is needed in the Desktop Grid VO or Desktop Grid server

– security incident

– every problem with an expected solution time of 72 hours or longer

– misuse of the infrastructure

The Message board (Forum) is operated at the IDGF site based on the functionality of the Liferay portal framework; the forum provides community based support: http://desktopgridfederation.org/forum


Figure 3 - Message board in operation

The message board implements the first-level centralized support functionality for the IDGF-SP infrastructure, but tickets must be opened in the cases described above.

There is control over who can see certain message threads. The threads concerning infrastructure support are restricted to members by default, since they may contain sensitive information about the IT infrastructures (such as firewall settings).

The infrastructure support is channeled through the expert groups and members.


6 Best practices in different desktop grid scenarios

The scenarios described in Section 6 are available in detail on the technical wiki website (http://desktopgridfederation.org/technical-wiki or http://doc.desktopgrid.hu).

6.1 Simple scenarios

Core setup

The core setup is a minimal version of a SZTAKI Desktop Grid site, i.e. only the SZTAKI-BOINC software package is installed, and based on it a volunteer or a private desktop grid can be formed. The core setup is the starting point for every DG site, but it can be upgraded at any time by adding other components, as the other setups and scenarios detail.

Figure 4 - Layout of the core setup

Basic setup

The basic setup is a standard version of a SZTAKI Desktop Grid site, i.e. beyond the SZTAKI-BOINC software package, the 3G Bridge component is also installed. 3G Bridge enhances the way jobs of the BOINC project are submitted and handled, therefore we recommend starting with this setup. Based on this setup, either a volunteer or a private desktop grid can be formed. The basic setup is the starting point for using the various features and services 3G Bridge provides through its plugins.

Figure 5 - Layout of the basic setup


Cloud setup

The cloud setup is an enhanced version of a SZTAKI Desktop Grid site, i.e. beyond a complete BOINC project and the enhanced job submission mechanism provided by 3G Bridge, a cloud resource handling mechanism supported by the Cloud plugin of 3G Bridge is also available. Compared to the basic setup, this scenario offers the possibility to dynamically attach BOINC resources from a Cloud infrastructure on demand. We recommend using this setup whenever cloud resources are to be used for workunit processing. Based on this setup, both a volunteer and a private desktop grid can be formed. The cloud setup is the starting point for integrating cloud resources into your SZTAKI Desktop Grid server.

Figure 6 - Layout of the cloud setup

6.1.1 SZDG BOINC project with the KOPI application

This section describes a best practice for the Basic setup scenario.

Figure 7 shows the two main processes involved in the Desktop Grid enabled KOPI system. The first one (see 1. in Figure 7) is the preprocessing of Wikipedia data with the help of SZTAKI Desktop Grid. Here, one at a time, a periodically published Wikipedia database dump is sliced into multiple jobs (also called work units) by the Work Unit Generator.


Figure 7 - The two main processes for the Desktop Grid enabled KOPI system: (1) the preprocessing of Wikipedia data on SZTAKI Desktop Grid, and (2) the plagiarism search at the KOPI Portal

These work units are submitted as a single batch to the desktop grid, where the volunteer resources process them. The returned results are continuously assimilated into a single dataset. Once all work units are finished, the dataset is complete and the KOPI database is updated. This process must be repeated for each different-language Wikipedia dump and every time a new dump is published by Wikipedia. The exact steps are detailed in the next section.

The second process (see 2. in Figure 7) is the plagiarism search initiated at the KOPI Portal (c.f. Section 0). This is a request in the form of a document submitted by the user of the portal. First, during candidate selection, the submitted text is compared to the KOPI database and suspicious sentences and fragments are selected. These candidates are then evaluated based on a similarity metric, and the most promising ones are returned to the user via the portal. Section 0 discusses the details of the selection and evaluation.

Desktop grid based Wikipedia data preprocessing

Figure 8 shows an overview of the architecture used for preprocessing the Wikipedia content for the KOPI Plagiarism Search Portal using desktop grid resources. The architecture consists of four main components, which are detailed in the following subsections. First the KOPI Server related parts: (i) the Master (Work Unit Generator and Dataset Builder) and (ii) the Scheduler, used for improving batch completion times; then (iii) the SZTAKI Desktop Grid (SZDG) Server components and volunteer resources; and finally (iv) the KOPI desktop grid application deployed at the SZDG Project.

Figure 8 - The architecture used for Wikipedia data preprocessing

KOPI Server

The KOPI Server (shown in Figure 8) is responsible for managing the desktop grid and contains two components: (i) the Master and (ii) the Scheduler. A single Wikipedia dump is submitted to the desktop grid and handled as a batch. The Work Unit Generator (Master) is responsible for partitioning these XML dumps into smaller pieces for the desktop grid and for combining the results into a single dataset as they return. Wikipedia publishes its content periodically in an XML format file for each language. These dumps have a size of several GiB (e.g. Hungarian 1.7 GiB, French 8.7 GiB, German 10 GiB, English 36 GiB) and are therefore too large, and require too much CPU time, to be processed sequentially. Fortunately, the pre-processing of these dumps can easily be parallelized by splitting them into smaller chunks (e.g., by splitting on article boundaries). The chunks can be sized arbitrarily; however, we wanted to satisfy several constraints: (1) the size of the input and output files, and thus the time spent doing network I/O, should be small compared to the time spent processing (e.g., inputs and outputs less than 10 MiB); (2) the runtime should be less than an hour, since we are going to run legacy code on volatile (volunteer) resources and have no possibility to implement any checkpointing mechanism. Based on our empirical studies, a work unit with a total of around 3 MiB of input files requires an hour of processing time on a current dual-core computer (although the application is not multi-threaded) and creates a compressed output file of around 7 MiB.
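As an illustration of these sizing constraints, the following sketch slices a MediaWiki XML dump on article (<page>) boundaries while capping each chunk's size. It is a simplified stand-in for the actual Work Unit Generator; the chunk limit and all names are assumptions for illustration.

# Illustrative sketch (not the project's actual Work Unit Generator):
# slice a MediaWiki XML dump on <page> boundaries so that each chunk
# stays below a size cap, mirroring the ~3 MiB input target above.
import xml.etree.ElementTree as ET

CHUNK_LIMIT = 3 * 1024 * 1024  # ~3 MiB of article text per work unit

def slice_dump(dump_path):
    """Yield lists of serialized <page> elements, each list < CHUNK_LIMIT."""
    chunk, size = [], 0
    for _, elem in ET.iterparse(dump_path):
        if elem.tag.endswith("page"):          # namespace-agnostic match
            page = ET.tostring(elem, encoding="utf-8")
            if size + len(page) > CHUNK_LIMIT and chunk:
                yield chunk                     # emit one work-unit input
                chunk, size = [], 0
            chunk.append(page)
            size += len(page)
            elem.clear()                        # keep memory bounded
    if chunk:
        yield chunk

# Each yielded chunk would be written to a file and submitted as one job.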

The Scheduler component is responsible for the submission and management of the work units of the batch. It is also responsible for minimizing the total time required to finish the batches (the "makespan"). This is currently achieved by resubmission techniques, which are discussed in detail in Section 0. The Dataset Builder part of the Master is responsible for processing the computed results returned from the desktop grid and combining them into a single dataset that is used for updating the KOPI database.

SZTAKI Desktop Grid

SZTAKI Desktop Grid (SZDG) is a BOINC based public desktop grid (DG) project, also referred to as a volunteer computing (VC) project, running since 2005 at the Laboratory of Parallel and Distributed Systems (LPDS) of MTA SZTAKI. Currently SZDG has over 41,000 registered volunteers with more than 92,000 hosts in total. Volunteers join the project by installing a small client application, the BOINC Client, which first downloads the deployed application(s) from the project and then periodically fetches new input files and parameters for the application in the form of "work units". These compute-intensive tasks are then processed in the background.

Any application for the desktop grid must first be deployed and registered, thus it is not possible to submit and run arbitrary ones. Desktop grids are best suited for long-term bag-of-tasks or parameter study type applications where the application does not change frequently. There is no possibility of communication between running tasks either, thus so-called "embarrassingly parallel" applications are required. These restrictions stem from many factors: (i) the volunteer resources are traditionally home desktops behind Network Address Translation (NAT) and firewalls, so they cannot be reached from outside; as a result (ii) the clients on the volunteer resources always pull tasks from the server, rather than the server pushing them to the clients, thus there is no guarantee that two clients are processing work (two tasks are running on different hosts) at the same time.

Contrary to traditional desktop grids, which usually run a single application, SZTAKI Desktop Grid is considered an "umbrella" project, since it hosts many applications, KOPI being one of them. Any volunteer has the freedom to select which applications they want to run from those deployed at the project. This means that applications usually compete with each other for a given set of resources, and each of them can use only a fraction of the total available to the project. Volunteer resources tend to be volatile and unreliable; it is possible that an assigned work unit never finishes on a host. The middleware is prepared to handle this by setting a "deadline" for all in-progress work units. Once this deadline passes, the work unit is considered lost and a new instance is sent to another host.

The 3G Bridge (in Figure 8) running on the SZDG Server is a generic grid-grid bridge, which provides interoperability between different distributed computing infrastructures (DCIs) on a job level. It is able to fetch jobs from different DCIs, queue them internally and submit them to different middleware using its modular plug-in architecture. Its "WSSubmitter" component provides a generic SOAP based web service interface for accessing infrastructures and for remote job management: it allows submitting jobs directly to the different queues of 3G Bridge and also managing these jobs. 3G Bridge provides command-line tools as well, which can access the web service interface, thus allowing (a) remote job submission to SZDG (or any BOINC based project with 3G Bridge deployed); (b) querying the status of running jobs; (c) cancelling jobs and (d) retrieving their outputs. In our case this service and the remote access feature are used by the KOPI Server components to interact with SZTAKI Desktop Grid.
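For illustration, remote job handling through this interface could be scripted along the following lines. The client name ("wsclient") and every flag shown are assumptions made for the sketch; the authoritative tool names and options should be taken from the 3G Bridge documentation.

# Hedged sketch of remote job handling via the 3G Bridge web service.
# "wsclient" and all flags below are illustrative assumptions.
import subprocess

ENDPOINT = "https://szdg.example.org:8091"  # hypothetical WSSubmitter URL

def submit(queue, app, args, inputs):
    """Submit one job and return the bridge-assigned job identifier."""
    cmd = ["wsclient", "-e", ENDPOINT, "-m", "add",
           "-g", queue, "-n", app, "-a", args]
    for name, url in inputs.items():        # input files passed by URL
        cmd += ["-i", "%s=%s" % (name, url)]
    return subprocess.check_output(cmd, text=True).strip()

def status(job_id):
    """Query the job state (e.g. running, finished, error)."""
    return subprocess.check_output(
        ["wsclient", "-e", ENDPOINT, "-m", "status", "-j", job_id],
        text=True).strip()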


KOPI Desktop Grid Application

Generally, developing and porting applications to desktop grids is not a straightforward process. BOINC has several constraints and requirements for application development: (a) binaries have to be linked with the BOINC libraries; (b) special BOINC API function calls for initialization and termination must be called from the application; (c) file access functions should be replaced by the ones provided by the BOINC API; (d) various optional functions (e.g., reporting the completion ratio of the currently running work unit) should be implemented; and (e) the BOINC API is only available for a set of programming languages (C/C++, Fortran and Python). There are some additional requirements if the desktop grid is open to the public and volunteer resources are used: (i) application binaries should be available for the major operating systems (e.g., Windows, Linux); (ii) 32- and 64-bit application binaries should be provided for the supported operating systems; and (iii) of the optional functions (see (d)), progress reporting should be implemented to keep the volunteers informed about the work they are currently processing. These requirements and constraints stem from many factors; an extensive list detailing them and the API description can be found on the BOINC wiki.

In the case of KOPI, chunking and natural language processing are used at two distinct places in the whole system: first, for the data used for dataset building and database update (see process 1 in Figure 7); second, when the suspected document is compared to the index and the candidate chunks (see process 2 in Figure 7). It is necessary to use exactly the same method at these two points to ensure that even if there are some bugs or errors (like a stemming error), they are the same on both sides. The easiest way to ensure this is to have only one implementation of the functions. The functionality was originally implemented in PHP at the KOPI Portal, so we decided to use this PHP code for the desktop grid application as well. The resulting code is possibly slightly slower than it would be with compiled code (e.g., Java). However, it requires less memory, and the stemming, which is one of the slowest parts, is done by an external program, namely Hunspell [1]. This functionality (the preprocessing) is resource-intensive and embarrassingly parallel, therefore it can be ported to desktop grids. It performs five distinct steps: (i) language identification; (ii) MediaWiki XML to plain text conversion; (iii) text cleansing and chunking; (iv) stemming; and (v) output creation.

Language identification (see (i) above) at first seems superfluous, as the language of each Wikipedia dump from which the chunks are extracted is known by the system. However, as Wikipedia is only one of the many possible inputs, the system is designed so that the same software can be used to chunk text files with unknown content. The input does not even need to be one single file; it can be hundreds of smaller files differing in language and size. Our current scenario is Wikipedia; however, later the same system is to be used to convert document collections harvested from other sources (e.g., from the web). The system uses the n-gram method for language identification [2][3], but the algorithm was modified to quickly recognize if the document contains parts written in other languages as well, even if those parts are scattered all over the document [4]. The latter is important when, for example, documents harvested from the Internet are used. There are currently 15 languages supported, including English, Hungarian, French, German, Spanish, Bulgarian and Dutch. The list can easily be extended using some sample text written in the desired language.

The client can detect the file format of the input document; if MediaWiki is detected, it is converted to plain text (see (ii) above). There are several possibilities for converting wiki format to plain text. The most obvious would be to install a MediaWiki instance and harvest the pages. This, however, raises two obstacles: first, on the desktop grid client this would be unfeasible; secondly, this would generate high load even on a server (which the clients could somehow access), as the XML to HTML conversion is compute-intensive. The other option is to convert it with some stand-alone software. Most of the freely available software is either operating system dependent or needs installed software, which makes it unsuitable for desktop grid environments. Others make errors when converting special pages, or truncate long ones. For these reasons the conversion is done by a self-written component with the following features:

● the names and boundaries of the Wikipedia pages are kept,
● within this, only the textual information is necessary,
● “info boxes” – as they are duplicated information – are filtered out,
● comments, templates and math tags are dismissed,
● other pieces of information, like tables, are converted to text.

For the purposes of multilingual plagiarism search, having text versions is only half of the work. Each XML file contains thousands of Wikipedia articles, which have to be split in order to determine, at the plagiarism search level, where the copied content can be found. Articles are then split into smaller chunks (see (iii) above), according to the algorithm used by the search engine, and all non-alphanumeric characters are disregarded, as they do not carry any useful information. After chunking, the chunks are stemmed word-by-word (see (iv) above) and, unlike in automatic translation engines, all the possible stems (lemmas) are kept (e.g., divers → divers (assorted), diver). In English this is not so important, but for agglutinative languages like Hungarian this step is essential, as word forms more often have multiple lemmas. The free Hunspell program is used for stemming, as it is available for both Linux and Windows with dictionary files for more than 100 languages and dialects. Dictionary files are 1-2 MiB per language, so only English, German, French and Hungarian are currently included in the client. The last step of the process is the creation of the output result file (see (v) above), which is used as an input for the database upload and indexing step at the server. This file contains all the necessary information for the search process: the fast indexed candidate selection and the linear similarity metric.
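The keep-all-lemmas stemming step can be illustrated with the pyhunspell binding; note that the KOPI client itself invokes Hunspell as an external program, and the dictionary paths below are typical Debian locations assumed for the sketch.

# Illustration of the keep-all-lemmas stemming step using pyhunspell.
# The KOPI client calls Hunspell as an external program instead;
# dictionary paths are assumptions (typical Debian locations).
import hunspell  # pip package "hunspell" (pyhunspell)

stemmer = hunspell.HunSpell("/usr/share/hunspell/en_US.dic",
                            "/usr/share/hunspell/en_US.aff")

def all_lemmas(word):
    """Return every stem Hunspell offers, falling back to the word itself."""
    stems = [s.decode("utf-8") for s in stemmer.stem(word)]
    return stems or [word]

# Unlike a translation engine, the chunker keeps *all* candidate lemmas:
print(all_lemmas("divers"))  # e.g. ['divers', 'diver'] with the en_US dictionary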

Our goal was to reuse the existing implementation from the KOPI Portal code – written in PHP – and to be able to update the desktop grid application with little effort if the portal code changes over time. This meant that we could not use the BOINC API directly, thus alternative methods for application development had to be considered. GenWrapper [5] – a previous work of ours – is specifically aimed at solving these types of problems. It acts as a generic wrapper between the application and the middleware, in this case BOINC. It allows running applications without modification ("legacy applications") on different DCIs. It provides a POSIX-like scripting environment (shell) for bootstrapping the environment for the application, executing the application and post-processing the results on the clients before the upload. GenWrapper is available for different operating systems (32- and 64-bit versions of Mac OS X, Linux and Windows). In the case of KOPI this meant that no changes to the portal code were necessary. Only the GenWrapper script had to be written, which performs the following steps: (1) it bootstraps the environment by deploying the bundled PHP and Hunspell binaries in the working directory; (2) prepares the input files; (3) executes the portal code via PHP, which then processes the input files; and finally (4) collects and compresses the output files to be uploaded to the desktop grid server. The resulting desktop grid application bundle, including all binaries, is only 8-13 MiB, depending on the operating system.


Improving batch completion times on the desktop grid

Desktop grids provide access to a considerable amount of computing power; however, reliability is always a problem when using volatile volunteer resources. Batch makespans are greatly affected even by a single faulty (or slow) resource. Machine availability in wide-area distributed computing follows the Weibull or Pareto distributions [6], which are power-law probability distributions, resulting in the so-called "long tail effect" for batch completion times. This effect can be mitigated, and batch completion times thus improved, e.g., by simple resubmission mechanisms. In this section our "black box" approach is detailed. We refer to a Wikipedia dump as a "batch", to tasks belonging to a batch and submitted to 3G Bridge as "jobs", and finally to tasks in the SZDG middleware as "work units". Depending on the application settings of the desktop grid, a single submitted job can be executed by multiple volunteers.

In our approach the middleware is considered a black box, namely we have (a) no influence on the work unit settings (e.g., replication factor, required quorum) and (b) no influence on the scheduling policies. However, we assume that (c) the middleware is not overcommitted and (d) its load (non-KOPI related in our case) does not cause interference (or can be considered constant) for our application. We also do not handle failures explicitly; we assume that (e) in the first place the middleware is responsible for failure tolerance, (f) there is no permanent failure in the system, and (g) if we still encounter a failed job from the middleware, it can be resubmitted safely. We chose the black box approach (see (a), (b) and (c)) since we wanted a generic approach independent of the capabilities of the underlying middleware (in this case BOINC). However, we acknowledge that in some cases this approach may result in increased load on the middleware; e.g., BOINC uses replication, and a resubmission at a higher level results in a new work unit which is then replicated to multiple instances, instead of a single new instance of the previous work unit. Constraints (c) and (d) can be considered heavy restrictions; however, in our case for example, public desktop grid projects usually run a single application, which in turn can use all the resources available.

First, let $t$ denote a job and $B$ a batch consisting of $n$ jobs:

$$B = \{t_1, t_2, t_3, \ldots, t_n\}$$

Let $i_{mj}$ be the $j$-th running instance of the $m$-th job, and let $c$ denote the number of times the jobs have been resubmitted so far. $r_c$ is a resend constant, which defines how many jobs can be resubmitted for a smaller batch; $r_r$ is a relative value, which defines the percentage of jobs that can be resubmitted. The function $\mathrm{unfinished}(B)$ returns the unfinished jobs of a batch $B$, and the function $\mathrm{useful}(i_{mj})$ returns 1 if $t_m \in \mathrm{unfinished}(B)$ and 0 otherwise. All the unfinished jobs are resubmitted if the following criterion is met:

$$\sum_{m=1}^{n} \sum_{j=1}^{c+1} \mathrm{useful}(i_{mj}) < \max\bigl(r_c,\; r_r \cdot |B|\bigr)$$

If at every resubmission all unfinished jobs are resubmitted, the left-hand side can be rewritten as

$$\sum_{m=1}^{n} \sum_{j=1}^{c+1} \mathrm{useful}(i_{mj}) = (c+1) \cdot |\mathrm{unfinished}(B)|$$

and with that the new criterion for resubmission is:

$$(c+1) \cdot |\mathrm{unfinished}(B)| < \max\bigl(r_c,\; r_r \cdot |B|\bigr)$$

For our experiments we chose empirically the following constants:

r  100 c rr  0.1

These parameters mean that batches with fewer than 1000 jobs are resubmitted once fewer than 100 unfinished ones are running. For larger batches this occurs once the number of potentially useful running jobs falls below 10 percent of the size of the batch.
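The criterion translates directly into code; the following sketch assumes the simplified form above, in which every unfinished job is resubmitted in each round.

# Direct translation of the resubmission criterion, assuming the
# simplified form where every unfinished job is resubmitted each round.
R_C = 100   # resend constant, floor for small batches
R_R = 0.1   # relative threshold for large batches

def should_resubmit(batch_size, unfinished, rounds_done):
    """True when the number of potentially useful running instances
    drops below max(R_C, R_R * |B|).

    batch_size   -- |B|, number of jobs in the batch
    unfinished   -- |unfinished(B)|, jobs without a result yet
    rounds_done  -- c, how many times jobs have been resubmitted
    """
    useful_running = (rounds_done + 1) * unfinished
    return useful_running < max(R_C, R_R * batch_size)

# A 3326-job batch (English dump): the first resubmission fires once
# fewer than max(100, 332.6) ~ 333 useful instances remain running.
assert should_resubmit(3326, 300, 0)
assert not should_resubmit(3326, 400, 0)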

KOPI Portal

Figure 9 - The interface of the KOPI Portal

The KOPI Online Plagiarism Detection Portal is a unique, open service for Hungarian- and English-speaking web users that enables them to check for identical or similar content between their own documents and the files uploaded by other authors. The monolingual plagiarism search function works in any European language, thanks to the language-independent algorithm. At the time of this writing, Hungarian, English and German texts can be compared to the English and Hungarian Wikipedia, but the translational detection system is being continuously improved to support other languages as well. On the web interface (see Figure 9) the user can upload documents and start a plagiarism search using these documents and the options defined above. When the search is finished, the result is emailed to the user and can also be seen on the portal interface. If the document was compared to Wikipedia, a list of possible Wikipedia articles is returned (see Figure 10), with the title of the article, the original sentences, and the suspicious ones from the document. Thus the user can hand-check the result given by the system. This is necessary, as the system does not decide in place of the user: it does not distinguish between citation and plagiarism; the last word is that of the user.

Figure 10 - Results of a plagiarism search presented by the KOPI Portal. The user can decide what is considered citation and what plagiarism.

Results

Until June 2008, static HTML dumps of all Wikipedia wikis were available from the Wikimedia Foundation [7], but this project has since been discontinued. As these text versions can be used for several other purposes as well, we decided to share them and make them available to everybody [8]. Currently the English (5.5 GiB), German (2.1 GiB), French (1.4 GiB) and Hungarian (311 MiB) versions can be downloaded; other languages will follow shortly.

The KOPI application is deployed and running on SZTAKI Desktop Grid permanently. SZDG is an umbrella project, but donors can select which application(s) they want to run; this leads to a great number of donors who support KOPI exclusively. KOPI work units have a higher priority set on the server, so even if volunteers allow multiple applications from SZDG, KOPI is still processed first. These steps ensure that there is computing capacity available for the KOPI application (see criteria (c) and (d) in Section 0) whenever required. For evaluation we included the measurements of six representative Wikipedia batch conversions: two English, one French, one German and two Hungarian batches.

Table 2 shows the results, namely (i) the number of jobs in the batch; (ii) the total number of jobs executed (including resubmissions) for the batch; (iii) the mean round trip time (RTT) for the initial submission of the batch; (iv) the standard deviation of the RTTs; (v) the mean and (vi) standard deviation for all jobs; and finally (vii) the mean and (viii) standard deviation for the "useful" jobs. The dates given in Table 2 denote the date of the Wikipedia dump used by the conversion, and not the date when the experiments started, as all experiments were executed independently. For the discussion we group the six conversions (batches) into three groups based on their number of jobs: (1) the English ones, (2) the French and German ones and (3) the Hungarian conversions.

Batch | Jobs in batch | Jobs executed | First submission: mean | First submission: deviation | All jobs: mean | All jobs: deviation | Useful jobs: mean | Useful jobs: deviation
English - 16/11/2011 | 3326 | 4277 | 120464 | 157630 | 117847 | 152527 | 76744 | 53313
English - 02/12/2011 | 3348 | 4189 | 151052 | 182702 | 130929 | 172591 | 94140 | 69613
French - 17/01/2012 | 823 | 1093 | 67114 | 104129 | 71312 | 111911 | 43767 | 22632
German - 17/01/2012 | 946 | 1209 | 96231 | 100432 | 100080 | 123244 | 72002 | 26847
Hungarian - 06/01/2012 | 162 | 409 | 62483 | 91627 | 72987 | 119442 | 24285 | 16354
Hungarian - 15/01/2012 | 162 | 381 | 75307 | 120757 | 61917 | 114965 | 22181 | 14100

Table 2 - Statistics for round trip times of batches in seconds

For the first group our algorithm introduced 28.5% (951 jobs) and 25.1% (841 jobs) overhead respectively in the total job numbers (including resubmissions), while resulting in 36.29% and 37.28% improvements in the mean round trip times and 66.18% and 61.9% in the standard deviation, comparing the useful jobs against the first submission round trip times. For the second group the algorithm caused 33.2% (270 jobs) overhead for the French and 27.8% (263 jobs) overhead for the German conversion in terms of job numbers. However, the mean round trip times were improved by 38.63% and 28.06%, while the standard deviations were reduced by 79.78% and 78.22%. The algorithm caused the largest overhead for the last group (the Hungarian conversions) in percentage terms, namely 152.47% and 135.19%; however, these mean only 247 and 219 additional jobs. In this case the mean round trip time was improved by 66.73% and 64.18%, and the standard deviation by 86.31% and 87.74%. The algorithm proved to be effective: it resulted in only 25-28% additional jobs even for the largest batches, the English Wikipedia.


As can be seen in Table 2, smaller languages like Hungarian (which has around 20 times less content, and thus fewer jobs in the conversion, than English) result in relatively the highest resend rate (152.4%), but even so, in absolute numbers it is less than for the English Wikipedia (247 vs. 841 and 219 vs. 951). The overall resend rate for the 4 languages is 31%. By resending all the unfinished jobs regularly, the probability that all copies of a given job will fail decreases exponentially (considering criteria (e), (f) and (g) in Section 0). The algorithm calculates the time of the resubmission based on the number of returned jobs. As our results show, this is more efficient for larger batches, where the resend rate is around 25-38%. As the size of the batch decreases, the resend rate (in percentage) grows.

Figure 11 - Time lapse of a batch (left side) and its jobs in round trip time order (right side) for an English Wikipedia dataset

Figures 11-13 show the time lapse (left side) and the jobs in round trip time order (right side) for three selected batches from the previous six. The time lapse measurements are displayed up to 4.00e+05 seconds (~4.6 days) from start. Similarly, the round trip time measurements are limited in time, but show times up to 8.00e+05 seconds (~9.25 days); any job reaching this threshold can be considered unfinished and highly responsible for the "tail effect". Each chart shows the details of (i) the initial submission, (ii) the total jobs and (iii) the useful jobs, which can be considered as three individual batches with their own time lapse and round trip measurements on the same chart. The vertical and horizontal dashed lines on the time lapse charts represent the time and the number of initial jobs when the first resubmission occurred.

We can see that the time lapses on the three charts have similar characteristics, and so do the round trip time measurements. The number of initial and total jobs is obviously the same until the first resubmission; however, the difference between the initial and the useful jobs shows that, from the beginning, some useful jobs came from the first resubmission. The size of the jump in job numbers at the first resubmission (denoted by the dashed horizontal and vertical lines) also shows that a single resubmission was not enough: not all new jobs are considered useful. Similarly, the round trip measurements show that there are jobs from the initial submission which are considered unfinished, and whose resubmitted instances were useful jobs. We can also observe that although some jobs


have lower round trip times, they were still not considered useful jobs; e.g., the chart of the French conversion shows this: the solid line (total jobs) deviates upwards from the dashed line. This means that a job was resubmitted, but a previously running instance finished before the new one. In such cases our algorithm currently does not cancel the remaining instances, so they remain running.

Figure 12 - Time lapse of a batch (left side) and its jobs in round trip time order (right side) for a French Wikipedia dataset

Figure 13 - Time lapse of a batch (left side) and its jobs in round trip time order (right side) for a German Wikipedia dataset

If we look at the useful "batch" completion times, we see that the English conversion finished after 3.1e+05 seconds, the French one after 1.76e+05 seconds and the German one after 1.97e+05 seconds. However, if we look at the initial batch, we can see that in each case jobs were still running at the 8e+05 cutoff threshold of the measurements. Considering this threshold, we can state that the English conversion took 61.25% less time with 25.1% (841 jobs) overhead, the German conversion took 75.38% less time with 27.8% (263 jobs) overhead, and the French conversion took 78.01% less time with 33.2% (270 jobs) overhead in total.

6.1.2 edges@home with clouds

This section describes a best practice for the Cloud setup scenario.

Hosting machine specifications

The BOINC version of EDGeS@home is hosted on a virtual machine running on the computing infrastructure of MTA SZTAKI. The BOINC project URL is http://home.edges-grid.eu/home/. The virtual machine has the resource allocations described in Table 3. An additional 100 GB of disk capacity was allocated to satisfy the needs of the new applications (GBAC and Autodock Vina in particular).

Property | Value
Operating System | Debian GNU/Linux 6.0.5 (Squeeze)
CPU | Intel Xeon E5420 (2 cores)
Memory | 1 GB
Swap | 512 MB
Disk | 14 GB OS + 200 GB BOINC
Network connection | 1 GBit/s

Table 3 - EDGeS@home BOINC server specifications

Installed software versions

The currently installed versions of the BOINC and EDGI related software components for the EDGeS@home BOINC server are summarised in Table 4. The software components are installed from Debian packages using their official software repositories where possible. The software components and their dependencies are regularly updated from these repositories to keep the system secure and to incorporate new features into the infrastructure as they become available.

Property | Value
Operating System | Debian GNU/Linux 6.0.5 (Squeeze)
BOINC | SZDG 6.11.0+r18946-11
3G Bridge | 1.9-1

Table 4 - EDGeS@home BOINC server software components

Installed BOINC applications

The EDGeS@home BOINC desktop grid has the following applications installed:

 ISDEP (6.06) The purpose of this code is to solve the collisional transport in fusion plasmas using the equivalence between the Fokker-Planck and Langevin equations (Stochastic Differential Equation). Although in principle it solves the linearized kinetic equation, non-linear terms are included using an iterative method. EGI Application Database: http://appdb.egi.eu/store/software/isdep

 EMMIL (1.02) EMMIL is one of the e-Market simulators. It facilitates three-sided negotiation between buyers, sellers and third-party logistics providers aimed at optimising the total costs, a capability that had not been offered before. EGI Application Database: http://appdb.egi.eu/store/software/emmil

 PR (1.00) Frequent patient readmissions have significant organisational consequences. This has led healthcare commissioners in England to use emergency readmission as an indicator in the performance rating framework, where hospitals are rated based on their levels of readmission. The Patient Readmission Application is a statistical model developed in R, where individual hospitals' propensities for first readmission, second readmission, third (and so on) are considered to be measures of performance. EGI Application Database: http://appdb.egi.eu/store/software/patient.readmission.application

 X-Ray (1.00) X-ray analysis is a technique widely used in several areas (physics, materials science, medicine) to obtain information about the particle that the X-ray hits (e.g. the size and shape of the particle). EGI Application Database: http://appdb.egi.eu/store/software/x.ray

 VisIVO (1.00) Visualisation Interface to the Virtual Observatory. See the document available at http://www.edges-grid.eu/c/document_library/get_file?folderId=63854&name=DLFE-1633.pdf for details about this application.

 Protein (1.00) Protein Molecule Simulation Application. See the document available at https://sites.google.com/a/staff.westminster.ac.uk/engage/Home for details about this application.

 Laserac (1.00) This application simulates the dynamics of laser devices using a Cellular Automata-based discrete model. Individual-based models like Cellular Automata are very effective for carrying out detailed simulations of complex systems in a broad range of fields of science and technology. This kind of model has recently been applied by the proponent to simulate one of the most paradigmatic complex systems of particular technological importance: laser systems. Our application uses this model to carry out simulations intended to understand the emergence of macroscopic behaviours in lasers, arising from the interaction of simple microscopic components, and to simulate specific optoelectronic devices of arbitrary shape. EGI Application Database: http://appdb.egi.eu/store/software/cald

 DSP (1.00) Digital Alias-free Signal Processing (DASP) is a set of methodologies that use non-uniform sampling and specialised signal processing algorithms to process signals with unknown spectral support in frequency ranges wider than half the average sampling rate. The grid enabled application allows researchers in this area to design and analyse non-uniform sampling sequences more efficiently than was previously possible. EGI Application Database: http://appdb.egi.eu/store/software/dasp


 Autodock 4.2.3 (1.01) AutoDock is a suite of automated docking tools. It is designed to predict how small molecules, such as substrates or drug candidates, bind to a receptor of known 3D structure. EGI Application Database: http://appdb.egi.eu/store/software/autodock

 Autodock VINA (1.03) AutoDock Vina provides several enhancements over AutoDock, increasing average simulation accuracy whilst also being up to two orders of magnitude faster. AutoDock Vina is particularly useful for virtual screening, whereby a large set of ligands can be compared for docking suitability with a single receptor. EGI Application Database: http://appdb.egi.eu/store/software/autodock.vina

 GBAC (1.70) Generic BOINC Application Client. Virtualization based generic application execution framework, see http://gbac.sourceforge.net/ for more details.

Cloud resources attached to the project

Table 5 describes the Cloud IaaS resources attached to EDGeS@home. These resources are single-core, thus the number of cores equals the number of resources. All the resources run a specially prepared Virtual Appliance for BOINC. These instances run the 64-bit version of Debian Linux 6.0 and have at least 2 GB of free space for BOINC tasks. The appliance uses contextualization data to attach to a specified BOINC project with a specified user (using the authenticator of the user).

Cloud Acronym | Cloud Provider | EDGeS@home User | Middleware | Number of Resources
CICA | Centro Informatico Cientifico de Andalucia | CloudCICA | OpenNebula | 100
SZTAKI | MTA SZTAKI | CloudLPDS | OpenNebula | 64
UNIZAR | University of Zaragoza | CloudUNIZAR | OpenStack | 50
UoW | University of Westminster | UoW | OpenStack | 52
EC2 | Amazon EC2 | CloudAMAZON | Amazon EC2 | 2+ (dynamic)
Total | | | | 268+

Table 5 - IaaS cloud resources attached to EDGeS@home

The clouds are managed from a special instance on the SZTAKI cloud. This instance runs a 3G Bridge with several instances of the cloud plug-in. Each plug-in instance handles one of the clouds from Table 5 for EDGeS@home. The plug-in starts and stops instances and makes sure that new instances are contextualized properly, i.e., connects the instances to EDGeS@home with the proper credentials.
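The contextualization step can be sketched as follows. The file layout follows the standard BOINC account-file convention, but the data directory, helper name and URL munging are simplified assumptions, not the appliance's actual implementation.

# Minimal sketch of the contextualization step: the cloud plug-in passes
# the project URL and a user authenticator to the booting appliance,
# which attaches itself by dropping a BOINC account file into the
# client's data directory. Paths and names are illustrative.
import os

BOINC_DIR = "/var/lib/boinc-client"   # typical Debian data directory

ACCOUNT_TEMPLATE = """<account>
    <master_url>{url}</master_url>
    <authenticator>{auth}</authenticator>
</account>
"""

def attach(project_url, authenticator):
    """Write an account file so the BOINC client joins the project."""
    # BOINC derives the file name from the project URL; this munging is
    # a simplification of the client's actual naming scheme.
    munged = project_url.split("//", 1)[-1].strip("/").replace("/", "_")
    path = os.path.join(BOINC_DIR, "account_%s.xml" % munged)
    with open(path, "w") as f:
        f.write(ACCOUNT_TEMPLATE.format(url=project_url, auth=authenticator))

# attach("http://home.edges-grid.eu/home/", "<authenticator of the user>")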

Job assignment to cloud resources

A new BOINC component (daemon) was developed and deployed which continuously assigns the oldest unfinished jobs to cloud resources to improve batch completion times.
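A hedged sketch of such a daemon's main loop is given below. The table and column names follow the standard BOINC database schema (result.server_state = 2 meaning "unsent"), but the earmarking step is a simplified stand-in for the real assignment logic.

# Hedged sketch of the assignment daemon's main loop. It polls the
# project database for the oldest results still waiting to be sent and
# raises their priority so that the next work-requesting (cloud) hosts
# receive them first. The real daemon uses BOINC's internal interfaces.
import time
import MySQLdb  # BOINC projects store their state in MySQL

RESULT_UNSENT = 2   # result.server_state value for "unsent" in BOINC
BATCH = 50          # how many jobs to earmark per iteration
PERIOD = 60         # seconds between polls

def oldest_unsent(db, limit=BATCH):
    """Return the ids of the oldest results not yet sent to any host."""
    cur = db.cursor()
    cur.execute("SELECT id FROM result WHERE server_state = %s "
                "ORDER BY create_time ASC LIMIT %s",
                (RESULT_UNSENT, limit))
    return [row[0] for row in cur.fetchall()]

def earmark_for_cloud(db, result_ids):
    """Simplified stand-in: bump scheduling priority of selected results."""
    if not result_ids:
        return
    placeholders = ",".join(["%s"] * len(result_ids))
    db.cursor().execute(
        "UPDATE result SET priority = priority + 1 WHERE id IN (%s)"
        % placeholders, tuple(result_ids))

def main():
    db = MySQLdb.connect(db="edgeshome")  # hypothetical project DB name
    while True:
        earmark_for_cloud(db, oldest_unsent(db))
        db.commit()
        time.sleep(PERIOD)

if __name__ == "__main__":
    main()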


Property | Value
Number of users | 12225
Number of hosts | 19575
Number of CPU cores | 66605
Estimated average performance | 1240 GFlop/s
Results ready to send | 407
Results in progress | 7505
Workunits waiting for validation | 0
Workunits waiting for assimilation | 1625

Table 6 - EDGeS@home BOINC project status

6.2 Complex scenarios

University DG

The most typical scenario that fits the requirements of scientists at universities who want to execute jobs on a private desktop grid (i.e. on resources owned by the university) is called the "University Desktop Grid". This scenario provides

 an easy-to-install SZTAKI-BOINC package to schedule the incoming workunits among the attached BOINC clients

 a 3G Bridge component to ease the handling of jobs submitted externally

 a MetaJob plugin to handle the submission of multiple jobs

• a GBAC virtualization framework (deployed as a BOINC application) to enable the execution of unregistered Linux binaries,

• a comfortable web-based graphical portal, called gUSE, to ease the submission of jobs and of complex applications such as scientific workflows.

There are special cases when the deployed Desktop Grid server is open to volunteers as well. In this situation the infrastructure is the same; however, the applications and workunits must be handled with special care so as not to flood the volunteers with erroneous workunits or faulty applications.


Figure 14 - Layout of University DG scenario

6.2.1 University of Westminster Local Desktop Grid

This section describes a best practice for the University DG complex scenario.

The University of Westminster Local Desktop Grid (WLDG) connects laboratory PCs of the University of Westminster, London, UK into a BOINC based Desktop Grid infrastructure. The University is spread over four main campuses and some other locations in Central and North-West London, each offering a different number of Microsoft Windows based dual-core PCs for teaching purposes. Over 1900 of these machines are connected to the WLDG. The BOINC project URL is http://dg-server2.wmin.ac.uk/WminDG; however, the server is not accessible from outside the university due to firewall restrictions.

Figure 15 - The Westminster Local Desktop Grid


The installed desktop grid middleware is the SZTAKI BOINC Debian package. This is a modified version of the original BOINC installation that automates the installation of other related middleware such as the 3G Bridge and the WSSubmitter. It has been built to comply with the requirements of local or institutional desktop grids: in these scenarios the resources are controlled centrally (in the case of the WLDG, by the central computing services of the University), and there is no need for a public website or a credit system to attract donors. Figure 15 illustrates the WLDG and the resources available at the different campuses in August 2012. As the University is undergoing a major re-organisation, the number of PCs connected from the different campuses will change significantly in the near future. Also, the PCs in the labs are continuously upgraded and replaced. However, the automatic deployment procedures guarantee that newly deployed machines are connected to the WLDG right after their installation.

Upgrading WLDG

Upgrading BOINC Server

The server of the WLDG, dg-server2.wmin.ac.uk, is the second generation of the old dg-server.wmin.ac.uk, which was also used in the EDGeS project. The whole Debian system has been upgraded to Debian Squeeze (6.0). In order to accommodate the requirements of the EDGI project, both the SZTAKI Local Desktop Grid and the 3G Bridge packages had to be upgraded to their newest versions. As a result, the boinc-server2 and boinc-skin-ldg packages have been upgraded to version 1.6.11.0+r18946-11, whilst the 3G Bridge package has been replaced with version 1.9-1. In both cases we used the SZDG APT repository for Debian Squeeze to obtain the packages.

Setting Up a New 3G Bridge

The WS-PGRADE portal currently installed at Westminster is version 3.4.5, which requires at least version 1.8 of the 3G Bridge to be installed. This requirement clashed with the requirement of the EDGI project, which specified the 1.8 bridge release as the required minimum version for BOINC based DGs.

The initial solution was to deploy a second bridge on the dg-portal machine; an additional 3G Bridge (version 1.02) was deployed there manually. Similarly to the production bridge, it submits work to the main BOINC project, so jobs could be computed on any of the 1600 PCs connected to the WLDG. Currently, the WS-PGRADE portal works directly with the 3G Bridge version 1.9-1 installed on the dg-server2.wmin.ac.uk server, using the new DCI Bridge technology.

Utilising the Westminster Local Desktop Grid

The University of Westminster Local Desktop Grid is utilised in two different ways, depending on the expertise of the user community and on the complexity of the ported application. In the first scenario the user works through the WS-PGRADE portal; in the second, a command-line client based on the 3G Bridge wsclient and custom scripts are used for submission.

The Westminster Grid Application Support Service (W-GRASS) offers application porting services and runtime support for prospective users. Currently 9 different applications from diverse disciplines are supported on the WLDG, including bio-molecular simulations, 3D video rendering, X-ray profile analysis, astrophysics data visualisation and digital signal processing. Users can access the infrastructure via an easy-to-use generic portal interface, the WS-PGRADE portal. The portal uses the bridge to submit work to the BOINC server.

Submitting Applications to WLDG

Submission from the WS-PGRADE portal

Three applications, SIMAP, the rendering application and the Protein Molecule Simulation, are submitted this way. In all cases the easy user mode of the portal is used, and the user submits the work through a customised web form. A typical experiment consists of 100-1000 workunits and experiments are submitted on a daily basis; in some large experiments the number of workunits can reach several thousand.

Command line submission

In the second scenario the user runs a command-line client for submission, as depicted in Figure 16. The wsclient injects workunits one by one into the BOINC server, together with parameters such as input and output files, contextualization, etc. In order to minimize the scientists' effort, a generic wsclient-based submitter script has been developed. The script provides functions both for submitting jobs (consisting of hundreds or thousands of workunits) and for assimilating them; it also takes care of cleaning up unnecessary files and database entries from the 3G Bridge after each experiment. A sketch of such a script follows Figure 16.

Figure 16 - Command-line Submission to WLDG
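A minimal sketch of such a submitter script is given below. The wsclient command-line options shown are assumptions made for illustration, as is the WSSubmitter endpoint; the real script used at Westminster differs in detail, and the 3G Bridge documentation describes the exact interface.

    import subprocess

    # Assumed WSSubmitter endpoint; not the real production URL.
    WSSUBMITTER = "https://dg-server2.wmin.ac.uk:8443/wssubmitter"

    def submit_jobs(app_name, input_files, grid):
        """Inject one workunit per input file through wsclient.

        The flags (-e endpoint, -m mode, -n name, -g grid, -i/-o file
        lists) approximate the wsclient interface for illustration only.
        """
        job_ids = []
        for infile in input_files:
            result = subprocess.run(
                ["wsclient", "-e", WSSUBMITTER, "-m", "add",
                 "-n", app_name, "-g", grid,
                 "-i", "input=" + infile, "-o", "output"],
                capture_output=True, text=True, check=True)
            job_ids.append(result.stdout.strip())  # wsclient reports the job ID
        return job_ids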

The AutoDock Vina, public_autodock_4_2_3, public_autodock_vina_1_1_2, autodock_4, mentalray, keepalive, classification, simul8 and visivo_space_mission applications can be submitted this way. A typical experiment can range from 10-200 up to 10000-20000 jobs. In all cases the generic submitter script has been customized to match the application requirements.

Running the rendering application on WLDG

Setting up Maya rendering

As a new school project, students of the University will submit AutoDesk Maya rendering jobs to the Westminster Local Desktop Grid. However, the AutoDesk suite needed by the application is only installed in some of the University labs, and only the machines in these labs are capable of executing the Maya rendering jobs. Therefore a separate BOINC user had to be set up, as depicted in Figure 17, to prevent other clients from picking up Maya work and to prevent the Maya-enabled machines from computing work for other applications.


The users of the application start rendering jobs from the WS-PGRADE portal. During the last year the application workflow and the WS-PGRADE integration scripts have been fully rewritten from scratch. Users are now able to submit easily from the WS-PGRADE portal through the 3G Bridge, without any of the special middleware required in earlier environments.

Figure 17 - Running Maya on WLDG

Setting up clients for Maya rendering

In order to connect the lab machines to the aforementioned Maya rendering BOINC project, a different BOINC user configuration had to be deployed on the Maya-enabled lab machines. All other configuration options are identical to the previous configurations: the client applications run under an unprivileged user account and computation is suspended while the machines are used by students. In order to deploy the client application automatically, investigations are being performed to create the necessary ZenWorks objects; ZenWorks is the automatic software deployment appliance used at the University of Westminster for MS Windows systems.
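One lightweight way to script such a per-user switch is to place a project account file in the BOINC data directory before the client starts. In the sketch below the data directory path, the derived file name and the authenticator are placeholders, not Westminster's actual values; the <account> XML structure follows the usual BOINC account file layout.

    from pathlib import Path

    # Placeholders: the real authenticator would be delivered by the
    # deployment system (e.g. as part of a ZenWorks object).
    PROJECT_URL = "http://dg-server2.wmin.ac.uk/WminDG/"
    AUTHENTICATOR = "REPLACE_WITH_MAYA_USER_KEY"
    BOINC_DATA = Path(r"C:\ProgramData\BOINC")  # common BOINC data directory

    # BOINC derives account file names from the project URL; the name
    # below is a simplified guess of that scheme.
    account_file = BOINC_DATA / "account_dg-server2.wmin.ac.uk_WminDG.xml"
    account_file.write_text(
        "<account>\n"
        "    <master_url>" + PROJECT_URL + "</master_url>\n"
        "    <authenticator>" + AUTHENTICATOR + "</authenticator>\n"
        "</account>\n")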

Figure 18 - Welcome page of the Autodock portal

6.2.2 Autodock portal + edges@home

This section also describes a best practice for the University DG complex scenario; however, it has fewer restrictions, i.e. it provides open access and uses volunteer DG resources.

The AutoDock portal (Figure 18), available at http://autodock-portal.sztaki.hu, offers ready-to-use AutoDock and AutoDock Vina applications for its users on the EDGeS@home public BOINC desktop grid. Access to the service is open, that is, anyone can register with an e-mail address or a Facebook account.


Figure 19 shows the outline of the AutoDock portal installation and its surrounding infrastructure. As can be seen, the gUSE service, the EDGeS@home desktop grid and the MySQL database used by the desktop grids are located within one network, connected through a GbE connection. The infrastructure hosting these services is protected by a firewall that restricts access to the services' web interfaces. The generic portal installation's infrastructure is similar, with the addition that it is also connected to the EDGI WMS, so jobs can be submitted through the gLite to desktop grid bridge as well.

Figure 19 - The AutoDock portal infrastructure

The installed AutoDock portal service is gUSE version 3.5.0, running on two virtual machines (one for the front-end service and one for the back-end services) in the cloud infrastructure of MTA SZTAKI.

Workflows in the Autodock portal

The AutoDock workflow can be seen on the left side of Figure 20. This is a random blind docking workflow requiring pdb input files: it docks a small ligand molecule onto a larger receptor molecule structure in a Monte-Carlo simulation, using version 4.2.3 of the AutoDock docking simulation package. Users are expected to provide:

• input files for AutoGrid (molecules in pdb format),
• a grid parameter (gpf) file,
• a docking parameter (dpf) file,
• the number of simulations to be carried out,
• the number of required results.


Figure 20 - Autodock and Autodock Vina workflows in the portal

The AutoDock without AutoGrid workflow can be seen in the middle of Figure 20. This is a random blind docking workflow requiring pdbqt input files: it docks a small ligand molecule onto a larger receptor molecule structure in a Monte-Carlo simulation, using version 4.2.3 of the AutoDock docking simulation package. Users are expected to provide:

• a docking parameter file,
• a zip file containing input files for AutoDock that were generated using version 4.2.3 of AutoGrid and the aforementioned docking parameter file,
• the number of simulations to be carried out,
• the number of required results.

The AutoDock Vina workflow is on the right side of Figure 20. It provides virtual screening of a library of ligands: using version 1.1.2 of AutoDock Vina, it docks a library of small ligands onto a selected receptor molecule. Users provide:

• a config file for AutoDock Vina (a minimal example is sketched after this list),
• an input receptor molecule in PDBQT file format,
• a zip file containing a number of ligands in PDBQT file format,
• the number of parallel work units,
• the number of required results.
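For orientation, a minimal AutoDock Vina configuration of the kind this workflow expects might look as follows; all numeric values are invented for illustration, and the snippet is written in Python simply to keep the examples in one language.

    # All values below are invented; a real config depends on the
    # receptor and on the search box chosen by the user.
    vina_config = """\
    receptor = receptor.pdbqt
    center_x = 11.0
    center_y = 90.5
    center_z = 57.5
    size_x = 22
    size_y = 24
    size_z = 28
    exhaustiveness = 8
    """
    with open("vina.conf", "w") as config_file:
        config_file.write(vina_config)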


6.3 Bridging scenarios

gLite to DG

The goal of bridging gLite to Desktop Grid systems is to transparently transfer parametric jobs from a gLite VO to one or more supporting BOINC systems, distributing the large number of job instances of the parametric job among the large number of BOINC client resources.

Figure 21 - Layout of gLite to DG scenario

In order to extend gLite VOs with BOINC systems we designed a bridging solution. The following two key components in the infrastructure represent the two pillars of the bridge:

• modified gLite CREAM CE: it extracts the job from the gLite system and transfers it to a remote desktop grid site

• 3G Bridge service: it runs on the remote desktop grid site, receiving the incoming jobs through its WS-SOAP interface and inserting them into the BOINC project

Two further components are optional but strongly recommended:

• MetaJob plugin: due to scalability issues on the gLite side, the MetaJob plugin allows submitting a batch of jobs through the bridge

• GBAC: to handle the heterogeneity of the BOINC clients and untrusted or unsafe applications, the GBAC virtualization framework enables running jobs inside virtual machines on the BOINC clients

DG to gLite

The DG to gLite bridge has two sides:

1. The BOINC side is a modified BOINC Core Client that, instead of running the downloaded WUs locally, writes the contents of each WU to a jobwrapper config file and launches a jobwrapper process in place of the executable specified in the WU. The jobwrapper creates an archive of the slot directory and generates a shell script which extracts the archive, runs the executable specified in the WU and finally creates an output archive. It then submits this script to a 3G Bridge queue on the same machine, with the archived WU as its input. The jobwrapper stays running and tracks the execution by polling the 3G Bridge periodically, passing the results back to the BOINC client after the job has finished; if errors are encountered, the jobwrapper exits with an error, letting the BOINC client know about it. (A sketch of the generated script appears after this list.)

2. The gLite side is a 3G Bridge instance with the EGEE output plugin that submits jobs to a gLite VO. The VO, the WMS and the DN of the user used for job submission are specified in the plugin configuration. This is because in BOINC typically only the project administrator can install applications; thus the DN of the project administrator can be specified in the 3G Bridge plugin configuration. Proxies are retrieved from a MyProxy server by the 3G Bridge plugin. Also specified in the plugin configuration is a gsiftp-accessible Storage Element, which is used to transfer WU archives and output archives. This avoids putting these files in the input and output sandboxes, which would generate too much load on the WMS.
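To make the first side more concrete, the following sketch generates the kind of shell script described above; the archive names, the executable name and the output directory are placeholders, and the real jobwrapper output differs in detail.

    # Illustrative generator for the jobwrapper's remote-execution script;
    # "wu_archive.tar.gz", "app_binary" and "outputs/" are placeholders.
    script = """#!/bin/sh
    set -e
    tar xzf wu_archive.tar.gz       # unpack the archived slot directory
    ./app_binary                    # run the executable named in the WU
    tar czf output.tar.gz outputs/  # collect the results for upload
    """
    with open("run_wu.sh", "w") as script_file:
        script_file.write(script)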

The same 3G Bridge can run multiple instances of the EGEE plugin, configured differently and handling different queues to which jobs are submitted by different jobwrapper clients. Thus a single BOINC-gLite bridge can connect several desktop grids and gLite VOs.

Figure 22 - Layout of the DG to gLite scenario

6.3.1 Modified Computing Element + edges@home

This section describes a best practice for the gLite to DG bridging scenario.

In IDGF-SP, SZTAKI runs two gLite-DG bridge mCEs to connect gLite VOs to DGs. While sites and VOs can (and are encouraged to) run their own bridge mCEs, these serve as a reference installation and allow VOs that do not want to run additional infrastructure components, or that are evaluating the technology during its initial adoption phase, to utilise DG resources. This section describes the modified Computing Elements as part of the IDGF-SP production infrastructure, the installed gLite and EDGI software components and their configuration.

Hosting machine specifications

The gLite-DG bridges are hosted on virtual machines running on the computing infrastructure of MTA SZTAKI. The virtual machines have the resource allocations described in Table 7; the two machines are called cr1.edgi-grid.eu and cr2.edgi-grid.eu.


Property             cr1                           cr2
Operating System     Scientific Linux 5 (Boron)    Scientific Linux 5 (Boron)
CPU                  AMD Opteron 2210 (3 cores)    AMD Opteron 2210 (1 core)
Memory               2GB                           512MB
Swap                 2GB                           1GB
Disk                 64GB                          10GB
Network connection   1 GBit/s                      1 GBit/s

Table 7 - gLite-DG bridge mCE specifications

Installation

The installation steps differed slightly for the two CREAM mCE machines, as cr1 is an older machine installed from a gLite distribution, whereas cr2 is a new installation based on the EMI repository. Although the two installations come from different software repositories, the functionality they provide (at least from gLite's point of view) is the same. First, a gLite CREAM CE was installed on the machines, following the gLite installation instructions (http://glite.cern.ch/glite-CREAM) for cr1 and the EMI installation instructions (http://www.eu-emi.eu/c/document_library/get_file?uuid=fc399549-d94b-4b56-bcbd-f5dcaabe62fa&groupId=14057) for cr2. Then the EDGI yum repository was added and the EDGIExecutor component implementing the bridging to DGs was installed. The configuration was done with YAIM, the supported way of configuring the service for gLite, EMI and the EDGI bridging extension alike. The currently installed software versions are summarized in Table 8.

Package            cr1 version                    cr2 version
Operating System   Scientific Linux 5.5 (Boron)   Scientific Linux 5.5 (Boron)
gLite              3.2.2-1                        -
EMI                -                              1.11.0-1
EDGIExecutor       1.5-0                          1.6.1-0

Table 8 - gLite-DG bridge mCE software components

Current Configuration

The gLite-DG bridge mCE allows gLite VOs to utilise DG resources by bridging jobs received from the SG side to the 3G Bridge WSSubmitter service of the connected DGs. The CREAM DesktopGrid VO mCEs operated by SZTAKI are connected to all DGs in the EDGI production infrastructure; these DGs appear as queues that can be made available to the gLite VOs. The queues on the EDGI CREAM mCEs and the corresponding DGs are shown in Table 9.


Table 9 - Configured Queues and DGs on the gLite-DG bridge mCEs

As with normal gLite CREAM CEs, the queues can be enabled for different VOs or VOMS FQANs, which allows us to control which DGs can be used by which VOs. The currently supported VOs are desktopgrid.vo.edges-grid.eu, demo.vo.edges-grid.eu, fusion, gilda, hungrid, seegrid, edgiprod.vo.edgi-grid.eu, chem.vo.ibergrid.eu, compchem, gaussian, trgrida, trgridb, biomed, enmr.eu, inaf, lsgrid and vlemed (17 in total). The VO-queue relation can be seen in Table 10.

However, unlike normal CREAM CEs, the gLite-DG bridge mCE supports an additional policy decision via the EDGI Application Repository (AR). Since the security model of BOINC DGs is based on trusted applications, the bridge mCEs consult the AR to decide whether a job can be bridged. Apart from checking whether the application is supported by the destination desktop grid, the AR also allows the administrators to specify whether a VO is allowed to use a specific application.

This is checked by the mCEs in addition to the local configuration mentioned above. The EDGI CREAM mCEs are configured to use the AR servlet interface of the production EDGI AR, which can be found at http://edgi-repository.cpc.wmin.ac.uk/repository/mce/.
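The shape of such a policy check can be pictured with the sketch below; the query parameters and the true/false reply format are invented for illustration, since the actual AR servlet interface is defined by the EDGI documentation.

    import urllib.parse
    import urllib.request

    AR_URL = "http://edgi-repository.cpc.wmin.ac.uk/repository/mce/"

    def job_allowed(application, vo, desktop_grid):
        """Ask the Application Repository whether a job may be bridged.

        The parameter names and the 'true'/'false' reply are assumptions
        made for this sketch; the real servlet interface differs.
        """
        query = urllib.parse.urlencode(
            {"app": application, "vo": vo, "dg": desktop_grid})
        with urllib.request.urlopen(AR_URL + "?" + query) as response:
            return response.read().decode().strip().lower() == "true"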


Table 10 - Configured VOs per queues on the gLite-DG bridge mCE


6.3.2 Bridges from DGs to the DesktopGrid VO

This section describes a best practice for the DG to gLite bridging scenario.

The DG-SG Bridges route jobs from the connected Desktop Grids to the DesktopGrid gLite VO resources. Currently there are two DG-SG 3G Bridges in production:

• ui3.grid.edges-grid.eu (operated by SZTAKI)
• 3gbridge.cloud.sztaki.hu (operated by SZTAKI)

The infrastructure consists of:

• gLite resources performing the computation,
• 3G Bridges performing the automatic forwarding of the incoming DG workunits,
• the supported DGs from which workunits are downloaded,
• Munin and Nagios to monitor the operation.

An overview of the infrastructure is given in Figure 23.

Figure 23 - DesktopGrid VO to support DG projects

Currently, the connected Desktop Grids from which jobs are redirected to gLite resources are as follows:

• CAS@Home
• OPTIMA@Home
• SLinCA@Home
• SZDG
• UoW
• YOYO@Home


7 Best practices for security in desktop grid computing

7.1 Introduction

The very nature of desktop grid computing gives rise to unique and serious security issues; in particular, someone with control over the grid infrastructure could easily run malware (malicious software) on the desktop machines connected to the grid. The validity of the results from your grid also depends on whether you can trust the desktop machines involved; at least in a volunteer desktop grid setting this is not the case, and further steps should be taken to minimise the risk of incorrect results. Surveys of potential users for volunteer desktop grids have shown that a lack of trust regarding security issues is a major factor discouraging participation in such grids; the possibility of security vulnerabilities and concerns about potential misuse of their computers are serious issues of concern to end-users. (Some volunteer desktop grid projects provide details to potential volunteers of how they secure their projects. For example, IBM's World Community Grid discusses digital signatures, biometric access control to their servers, and thorough security reviews.) This document provides an overview of some best practices for operating grid infrastructure in a way which minimises such security risks.

7.2 Securing servers

Standard server security practices are a vital basis for keeping grid infrastructure secure.

● Keep the operating system and server software up-to-date. Security fixes should be applied as soon as possible.
● Restrict access to servers as much as possible. Configure firewalls, limit the user accounts to only those which are necessary, and give them access to only the resources they need for specific tasks.
● Configure logging and use an Intrusion Detection System. Make sure that logs are retained for a suitable period of time, and make sure that the gathered information is used (for example, by using automated log file analysis tools).
● Use encrypted connections if viable. Where resource limitations limit the use of encrypted connections, make sure that at least user login pages are encrypted, to avoid passwords being sent in plaintext.

For a more detailed discussion, you may wish to read the NIST publication "Guidelines on Securing Public Web Servers" (NIST SP 800-44 Version 2), which covers a number of security best practices, both for servers in general and for machines such as web servers which serve and process content, something which is applicable to many pieces of grid infrastructure.


7.3 Securing the grid

This section describes some best practices concerning security issues of Desktop Grids. The next version of this deliverable will provide more details based on the ongoing related studies.

1. Keep the infrastructure software up-to-date with security updates. Most grid software has both a mailing list for announcing such updates and a webpage listing recent security advisories/fixes. The fixes may not necessarily apply to older versions of the software, so if you don't update your infrastructure software to newer releases regularly, applying the patches can become more and more difficult. When many public BOINC projects were contacted and reminded to apply a recent security fix, they responded with many different explanations, which may give some insight into the difficulties of keeping such software up-to-date:

• not being subscribed to the relevant BOINC mailing list, or not having read the relevant e-mail;
• not having realised that the BOINC source code had moved to a new version control system;
• lack of time to upgrade custom infrastructure code to work with recent versions of BOINC, and/or difficulty applying the fix to older versions;
• problems with changes/bugs present in the newer versions of the code.

2. Secure public-facing web-based software. Checking that your web applications aren't vulnerable to the OWASP Top 10 (covering common problems such as SQL injection and insufficient access checking) is a good start.

3. Review all code which handles user-provided data. Data coming from machines on the desktop grid should not be trusted; assimilator/validator programs should be carefully checked for potential security vulnerabilities (for example, using code reviews and static analysis tools) and the amount of code involved should be kept as small as possible.

4. Use code signing, and keep your keys secret. By signing the executables you distribute, and keeping the keys secret (ideally, stored on a separate machine which is not connected to any kind of network), you can make sure that any attacker who does manage to compromise your grid infrastructure is prevented from distributing malicious software to the desktop computers. This can be done with a simple signing key (the default situation with BOINC), but ideally you should use (or participate in) a certification authority system, whether your own internal system or one run by a project like the European Grid Infrastructure. (A sketch of this signing step is shown after this list.)

5. Consider task data. Signing task data in a secure manner, while a simple way to make sure that tasks have not been tampered with, may not be viable for a project with large amounts of dynamically generated work. If this is the case, try to ensure that your applications don't suffer from common vulnerabilities such as buffer overflows when provided with hostile data.

6. Enable sandboxing functionality if available. Sandboxing restricts the access of distributed software to the resources of the desktop machines it runs on to only the minimum which is necessary (for example, only read/write access to its own files), hopefully removing any opportunity for maliciously-distributed software to access local information or the network. This can be done by use of a virtual machine (such as for Java code), virtualisation (such as VirtualBox), or other techniques (such as running all code under a user account with heavily limited permissions). Consider also the use of software such as GBAC (the Generic BOINC Application Client) to simplify the work of configuring and deploying virtualisation-based sandboxing.

7. Participate in an Application Repository if possible. Submitting your grid applications to an application repository allows end-users to verify and trust a single authoritative source for the review and verification of software.

8. Restrict access to submit jobs. If end-users are able to submit jobs, make sure that they are authenticated and require that the jobs are signed by a valid end-user certificate, whether submitted via bridges or directly to grids. In any case, make sure that job access is suitably restricted (for example, by being firewalled), in particular considering mechanisms such as SOAP interfaces which might be used by bridges.
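As an illustration of point 4, BOINC ships a crypt_prog utility for signing files with the project's code-signing key. The sketch below wraps it from Python; the key and binary paths are placeholders for the offline signing machine.

    import subprocess
    from pathlib import Path

    # Placeholders: on a real project the private key stays on an offline
    # machine and only the signature files are copied back.
    KEY_FILE = "/secure/keys/code_sign_private"
    APP_BINARY = Path("apps/example_app_1.0_x86_64-pc-linux-gnu")

    # crypt_prog -sign prints a signature for the given file; BOINC
    # expects it alongside the binary as <file>.sig.
    signature = subprocess.run(
        ["crypt_prog", "-sign", str(APP_BINARY), KEY_FILE],
        capture_output=True, text=True, check=True).stdout
    (APP_BINARY.parent / (APP_BINARY.name + ".sig")).write_text(signature)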

7.4 Making projects reliable

1. Protect yourself against flooding. Use quotas for the number of tasks which can be requested per day, and for the maximum file sizes which can be uploaded.

2. Ensure reasonable response times. Set reasonable time limits on result submission in volunteer grids; setting the limit too low can discourage volunteers who have lower-end hardware or are not connected to the internet all the time, while setting it too high leads to lengthy waits for results if users have stopped running the grid software.

3. Verify submitted results. Try to ensure that your applications provide as much information as possible to detect incorrect results (e.g. due to hardware issues), and check this information in your validator. This is especially important in systems like BOINC which offer credit to volunteers, since this provides a motivation for users to submit deliberately incorrect results and to collude with each other while doing so. There have also been various instances of users replacing grid applications with unofficial "optimised" versions, with the best of intentions, but which produce invalid (or, in any case, untrustable) results. In addition, to mitigate the possibility of hosts tampering with results, use credibility-based functionality if available, and consider systems like majority voting as well (a sketch is given after this list). For a more detailed discussion (and suggestions of other techniques you might use to validate results), you might want to read the chapter "Security and Result Certification" (starting on page 211 of Desktop Grid Computing, CRC Press).
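As an illustration of point 3, the following is a minimal sketch of majority voting over replicated results, assuming each workunit is replicated to several hosts and that result payloads can be compared directly; it is generic logic, not BOINC's validator API.

    from collections import Counter

    def majority_vote(results, quorum=2):
        """Return the canonical payload if at least `quorum` replicas agree.

        results maps host IDs to comparable result payloads; hosts whose
        payload disagrees with the winner could then lose credibility.
        """
        payload, votes = Counter(results.values()).most_common(1)[0]
        return payload if votes >= quorum else None

    # Three replicas of one workunit, one of them from a faulty host:
    replicas = {"host_a": "0.4131", "host_b": "0.4131", "host_c": "9.9999"}
    print(majority_vote(replicas))  # prints 0.4131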


8 Further case studies

8.1 DesktopGrid Testbed at JINR

In order to utilize the underloaded computational capacities of personal computers at JINR and at organizations in its member states, a DesktopGrid (DG) testbed was deployed, with core services at LIT JINR. The work was performed with the kind help of the MTA SZTAKI team working in IDGF-SP. The plan is to use these kinds of resources for performing calculations in the interest of different JINR and member state research groups. The DesktopGrid testbed is based on the SZTAKI DG package, which is in turn based on BOINC. Right now the DG testbed consists of the BOINC server with a 3G Bridge and a few BOINC clients. The JINR grid site of the t-infrastructure has a separate EMI CREAM CE with the EDGI Executor enabled, which allows jobs to be submitted from an EMI UI to DG resources. It is planned to build a DG production infrastructure consisting of the resources of JINR and its member state organizations, and to port some applications able to run in such an environment.

Figure 24 - Service Grid to Desktop Grid bridging at JINR, Russia


8.2 DIRAC supported by EDGeS@home

The IDGF infrastructure has been used to support the DIRAC community with resources. Currently, the gLite mCE (cr2.edgi-grid.eu) is accessible from the BIOMED VO, and DIRAC pilots are submitted to the EDGIDemo and EDGeS@home BOINC projects. In order to execute DIRAC pilots on BOINC resources, we use GBAC (the virtualization framework on the BOINC resources); GBAC executes DIRAC inside a VM with a Linux OS. Several small changes have been made to enable DIRAC execution:

• the cr2 mCE can handle "native" gLite JDLs. Previously, mCEs were only able to handle jobs coming from the EDGI repository; since the modification, any binary can be executed thanks to the GBAC virtualization framework;
• the DIRAC pilot is implemented in Python, so the GBAC VM had to be extended with a Python framework;
• the DIRAC pilot downloads jobs from the DIRAC servers, so the GBAC VM is now enhanced with networking capabilities.

In this scenario, the DIRAC automatic pilot submission mechanism makes sure that as many pilots as possible are running. It submits pilots to cr2, which forwards them as GBAC jobs to EDGeS@home. Once a volunteer starts running a GBAC workunit with the DIRAC pilot, the pilot comes to life and starts downloading and executing DIRAC jobs on the resource. This support is ongoing work; experiences are being collected and will be reported later.

8.3 Charity Engine

Figure 25 - Charity Engine


Charity Engine is a BOINC Account Manager run by The Worldwide Computer Company Limited. The project works by selling spare home computing power to universities and corporations, then sharing the profits between nine partner charities and periodic cash prize draws for the users, i.e. those running the BOINC software on their home computers. When no corporations are purchasing the computing power, Charity Engine donates it to existing volunteer computing projects such as Einstein@Home and Malaria Control, and the prize draws are funded by donations.

8.3.1 How we raise money for great causes -- and the prize draw

Charity Engine takes enormous, expensive computing jobs and chops them into thousands of small pieces, each simple enough for a home PC to work on as a background task. Once each PC has finished its part of the puzzle, it sends back the correct answer and earns some money for charity, and for the prize fund. (It also earns more chances to win.) Where does the money come from? Science and industry. The grid is rented out like a giant supercomputer, and all the profits are shared 50-50 between the charities and the lucky prize winners. Charity Engine typically adds less than 10 cents per day to a PC's energy costs and can generate $10-$20 for charity, and the prize draws, for each $1 of electricity consumed.

8.3.2 How the prize winner is chosen

When Charity Engine announces a prize draw, the List of Entries is published and a date for the draw is chosen in advance. The List of Entries shows each user's unique ID number and how many prize draw entries they have in total. Everyone can verify their own place on the list and the block of numbers assigned to their entries. Any mistakes must be reported within 48 hours; then the list is final.

We then construct a random number by noting the last digits of the following stock market closing prices, beginning on the date of the draw: Tokyo (Nikkei 225), Hong Kong (Hang Seng), Mumbai (Sensex), Frankfurt (DAX) and finally New York (Dow Jones). Each is in a different time zone, so the winning number is revealed one digit at a time. The prize-winning number is assembled right-to-left, smallest digit first. If a stock market produces a digit that is too high for the final position, it is discarded and we keep going until a valid digit is chosen. Human hands are never involved. The last digits of stock market closing prices are effectively random and practically impossible to predict or tamper with; that is why Charity Engine uses them.
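The selection procedure described above can be expressed compactly; the sketch below is an illustration of the published rules, with made-up digits standing in for the market closes.

    def draw_winner(num_entries, closing_digits):
        """Build the winning entry number from market-close last digits.

        closing_digits yields the last digit of successive closing prices
        (Nikkei 225, Hang Seng, Sensex, DAX, Dow Jones, ...). Digits fill
        the number right-to-left; a leading digit that would push the
        number past the entry range is discarded, as described above.
        """
        digits_needed = len(str(num_entries - 1))
        number, place, taken = 0, 1, 0
        for digit in closing_digits:
            candidate = number + digit * place
            if taken == digits_needed - 1 and candidate >= num_entries:
                continue  # too high for the leading position: skip it
            number, place, taken = candidate, place * 10, taken + 1
            if taken == digits_needed:
                return number
        return None  # not enough digits supplied

    # 5000 entries; digits 6, 7, 3, then 9 (discarded, 9376 >= 5000), then 4:
    print(draw_winner(5000, [6, 7, 3, 9, 4]))  # prints 4376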


9 Questionnaire for BOINC project operators

In order to get feedback on the methods, procedures and experiences of BOINC project operators, we prepared a questionnaire. It contains a few questions on different areas related to the operation of a BOINC server/project. The questionnaire is going to be distributed, and the results will be presented in the next version of this deliverable. Here is the initial version of the questionnaire:

Green
- Do you apply any green-related solution in your BOINC project/infrastructure? Or do you consider introducing one in the future?
- Do you use virtualisation on the clients? If yes, please summarise your solution.

Security
- What kind of security-related updates do you apply on your BOINC server?
- Please summarise your own security patches, if any.
- How often do you review your server with respect to security?

Hardware
- Please summarise your BOINC infrastructure and any related custom solutions.
- Are your available processing capacity and the generated computational load balanced? If not, how do you handle this?

Virtual image / virtualised environment
- Are you using a virtualised environment for your BOINC server? If not, do you consider it for the near future? If yes, please summarise your architecture.

Monitoring/statistics tools (BOINCstats, SZDG built-in, ...)
- Please summarise the tools you use most often to keep track of the performance of your BOINC clients and your server.

Compliance with local policies (IT department)
- Please summarise the policies your IT department applies to the BOINC infrastructure.

Maintenance
- Please summarise the applications running on your BOINC infrastructure and their characteristics.


- Do you generate workunits continuously or occasionally (e.g. a few times per month/week)?
- How do you perform result validation in your BOINC infrastructure?
- What tool/mechanism do you use for workunit generation?
- Do you consider/handle batches/job-collections during work generation? If yes, how do you handle the tail effect?

Upgrade
- If you operate BOINC clients on your infrastructure, how do you perform software updates?
- Do you use the BOINC built-in credit system? If yes, what is your method for defining the amount of credit assigned to a given workunit?

