Managing Applications and Data in Distributed Computing Infrastructures
Dedicated to my Family

List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

Project - 1: Papers I and II describe work on application execution environments. Paper I presents tools for general-purpose solutions using portal technology, while paper II addresses access to grid resources from within an application-specific problem solving environment.

I Erik Elmroth, Sverker Holmgren, Jonas Lindemann, Salman Toor, and Per-Olov Östberg. Empowering a Flexible Application Portal with a SOA-based Grid Job Management Framework. In Proc. 9th Workshop on State-of-the-art in Scientific and Parallel Computing (PARA 2008), Springer series Lecture Notes in Computer Science (LNCS), 6126–6127.

II Mahen Jayawardena, Carl Nettelblad, Salman Toor, Per-Olov Östberg, Erik Elmroth, and Sverker Holmgren. A Grid-Enabled Problem Solving Environment for QTL Analysis in R. In Proc. 2nd International Conference on Bioinformatics and Computational Biology (BiCoB 2010), 2010. ISBN 978-1-880843-76-5.

Contributions: In this project I participated in architecture design, integration component implementation, and design of the QTL-specific interface in LAP. I also participated in system deployment, in running experiments, and in writing the article.

Project - 2: Papers III, IV and V describe file-oriented distributed storage solutions. Paper III focuses on the architectural design of the Chelonia system, whereas papers IV and V address stability, performance, and identified issues.

III Jon Kerr Nilsen, Salman Toor, Zsombor Nagy, and Bjarte Mohn. Chelonia – A Self-healing Storage Cloud. In M. Bubak, M. Turala, and K. Wiatr, editors, CGW'09 Proceedings, Krakow, 2 2010. ACC CYFRONET AGH. ISBN 978-83-61433-01-9.

IV Jon Kerr Nilsen, Salman Toor, Zsombor Nagy, and Alex Read. Chelonia: A self-healing, replicated storage system. Published in Journal of Physics: Conference Series, 331(6):062019, 2011.
V Jon Kerr Nilsen, Salman Toor, Zsombor Nagy, Bjarte Mohn, and Alex Read. Performance and Stability of the Chelonia Storage System. Accepted at International Symposium on Grids and Clouds (ISGC) 2012.

Contributions: I did part of the system design and implementation. I also designed, implemented and executed the test scenarios presented in all the articles, and I was heavily involved in the technical discussions and in writing the papers.

Project - 3: Papers VI and VII discuss a database-driven approach for managing data and the analysis requirements of scientific applications. Paper VI focuses on data management, whereas paper VII presents a solution for data analysis.

VI Salman Toor, Manivasakan Sabesan, Sverker Holmgren, and Tore Risch. A Scalable Architecture for e-Science Data Management. Published in Proc. 7th IEEE International Conference on e-Science, ISBN 978-1-4577-2163-2.

VII Salman Toor, Andrej Andrejev, Andreas Hellander, Sverker Holmgren, and Tore Risch. Scientific Analysis by Queries in Extended SPARQL Over a Distributed e-Science Data Store. Submitted to The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2012).

Contributions: I did the architecture design, interface implementation and static partitioning for complex datatypes in Chelonia. I also participated in designing use-cases to demonstrate the system and in article writing.

Project - 4: Paper VIII also addresses a distributed storage solution. In this paper we explore a cloud-based storage solution for scientific applications.

VIII Salman Toor, Rainer Töebbicke, Maitane Zotes Resines, and Sverker Holmgren. Investigating an Open Source Cloud Infrastructure for CERN-Specific Data Analysis. Accepted at 7th IEEE International Conference on Networking, Architecture, and Storage (NAS 2012).

Contributions: I participated in enabling access from the ROOT framework to SWIFT and in prototype system deployment.
I worked on design, implementation and execution of the test-cases presented, contributed to the technical discussion, and participated in paper writing.

Reproduced with the permission of the publishers, presented here in another format than in the original publication.

Contents

Part I: Introduction
1 Introduction
  1.1 Overview of Distributed Computing
    1.1.1 Communication Protocols
    1.1.2 Architectural Designs
    1.1.3 Frameworks for Distributed Computing
  1.2 Models for Scalable Distributed Computing Infrastructures
    1.2.1 Grid Computing
    1.2.2 Cloud Computing
    1.2.3 Grids vs Clouds
    1.2.4 Other Relevant Models
  1.3 Technologies for Large Scale Distributed Computing Infrastructures
Part II: Application Execution Environments
2 Application Environments for Grids
  2.1 Grid Portals
  2.2 Application Workflows
  2.3 The Job Management Component
  2.4 Thesis Contribution
    2.4.1 System Architecture
Part III: Distributed Storage Solution
3 Distributed Storage Systems
  3.1 Characteristics of Distributed Storage
  3.2 Challenges of Distributed Storage
  3.3 Thesis Contribution
    3.3.1 Chelonia Storage System
    3.3.2 Database Enabled Chelonia
    3.3.3 Cloud based Storage Solution
Part IV: Resource Allocation in Distributed Computing Infrastructures
4 Resource Allocation in Distributed Computing Infrastructures
  4.1 Models for Resource Allocation
  4.2 Thesis Contribution
Part V: Article Summary
5 Summary of Papers in the Thesis
  5.1 Paper-I
  5.2 Paper-II
  5.3 Paper-III
  5.4 Paper-IV
  5.5 Paper-V
  5.6 Paper-VI
  5.7 Paper-VII
  5.8 Paper-VIII
6 Svensk sammanfattning
7 Acknowledgments
References

List of Other Publications

These publications have been written during my PhD studies but are not part of the thesis. However, some of the material in publications I and II below is included in other papers in the thesis. Also, some of the conclusions in publication III are presented in Section 4.2 in the thesis summary.

I. Mahen Jayawardena, Salman Toor, and Sverker Holmgren. A grid portal for genetic analysis of complex traits. Proc. 32nd International Convention on Information and Communication Technology, Electronics and Microelectronics, Volume I. Rijeka, Croatia: MIPRO, 2009, pp. 281-284.

II. Mahen Jayawardena, Salman Toor, and Sverker Holmgren. Computational and visualization tools for genetic analysis of complex traits. Technical Report no. 2010-001. Department of Information Technology, Uppsala University.

III. Salman Toor, Bjarte Mohn, David Cameron, and Sverker Holmgren. Case-Study for Different Models of Resource Brokering in Grid Systems. Technical Report no. 2010-009. Department of Information Technology, Uppsala University.

List of Presentations

The material presented in this thesis has