Platform to Ease the Deployment and Improve the Availability of TRENCADIS Infrastructure

Total Pages: 16

File Type: PDF, Size: 1020 KB

7th Iberian Grid Infrastructure Conference Proceedings (IBERGRID 2013). Madrid, Spain, September 19-20, 2013.

EDITORS: Ignacio Blanquer (Universitat Politècnica de València), Isabel Campos (Instituto de Física de Cantabria), Gonçalo Borges (Laboratório de Instrumentação e Física Experimental de Partículas), Jorge Gomes (Laboratório de Instrumentação e Física Experimental de Partículas)

ISBN 978-84-9048-110-3 (printed version)
http://www.ibergrid.eu/2013
[email protected]
GOBIERNO DE ESPAÑA, MINISTERIO DE ECONOMÍA Y COMPETITIVIDAD

Editorial Universitat Politècnica de València. The contents of this publication have been subject to revision by the Scientific Committee. First edition, 2013. © of this edition: Editorial Universitat Politècnica de València. Distribution: Tel. +34 963 877 012 / http://www.lalibreria.upv.es / Ref.: 2241. Print on demand. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Contents

Message from the conference chairs
Chairs

Session I: Energy Efficiency Infrastructures
  • SPARKS, a dynamic power-aware approach for managing computing cluster resources (J. Martins, G. Borges, N. Dias, H. Gomes, J. Gomes, J. Pina, C. Manuel)
  • Performance and Energy Aware Scheduling Simulator for High-Performance Computing (C. Gómez, M. A. Vega, J. L. González, J. Corral, D. Cortés)

Session II: Interoperability and Federation of Infrastructures
  • Towards Federated Cloud Image Management (A. Simón, E. Freire, R. Rosende, I. Díaz, A. Feijóo, P. Rey, J. López-Cacheiro, C. Fernández, O. Synge)
  • Infrastructure-Agnostic Programming and Interoperable Execution in Heterogeneous Grids (E. Tejedor, J. Álvarez, R. M. Badia)

Session III: Advanced Data Processing
  • Graph Database for Structured Radiology Reporting (L. Díaz, D. Segrelles, E. Torres, I. Blanquer)
  • Big data and urban mobility (A. Tugores, P. Colet)
  • Analyzing File Access Patterns in Distributed File-systems (J. Gomes, J. Pina, G. Borges, J. Martins, N. Dias, H. Gomes, C. Manuel)
  • Studying the improving of data locality on distributed Grid applications in bioinformatics (J. Herrera, I. Blanquer)

Session IV: Scientific Clouds
  • Service Construction Tools for Easy Cloud Deployment (J. Ejarque, A. Sulistio, F. Lordan, P. Gilet, R. Sirvent, R. M. Badia)
  • Platform to Ease the Deployment and Improve the Availability of TRENCADIS Infrastructure (D. Segrelles, M. Caballer, E. Torres, G. Moltó, I. Blanquer)
  • Analysis of Scientific Cloud Computing requirements (A. López, E. Fernández)
  • Easing the Structural Analysis Migration to the Cloud (M. Caballer, P. de la Fuente, P. Lozano, J. M. Alonso, C. de Alfonso, V. Hernández)
Session V: Scientific Applications in Distributed Infrastructures
  • Leveraging Hybrid Data Infrastructure for Ecological Niche Modelling: The EUBrazilOpenBio Experience (I. Blanquer, L. Candela, P. Pagano, E. Torres)
  • An unattended and fault-tolerant approach for the execution of distributed applications (M. Rodríguez-Pascual, R. Mayo-García)
  • Calculation of Two-Point Angular Correlation Function: Implementations on Many-Core and Multicore Processors (M. Cárdenas-Montes et al.)
  • Performance Assessment of a Chaos-based Image Cipher on Multi-GPU (J. J. Rodríguez-Vázquez, M. Cárdenas-Montes)
  • Operation of the ATLAS Spanish Tier-2 during the LHC Run I (E. Oliver et al.)
  • The LHC Tier1 at PIC: experience from first LHC run and getting prepared for the next (J. Flix et al.)

Author Index

Message from the conference chairs

2013 is proving to be a challenging year for research in Europe in general, and in Spain and Portugal in particular. Reduced investment is forcing scientists to become ever more efficient in their use of resources, and international cooperation is becoming a key target. These major aims are clearly reflected in the activity of IBERGRID. Usage of the IBERGRID NGI resources has increased by 18% since last year, reaching almost 180 million KSI2K hours from July 2012 until June 2013. The Large Hadron Collider (LHC) experiments continue to represent the major share of this usage, but multidisciplinary Virtual Organizations keep the same share as in previous years, making IBERGRID one of the top NGIs in EGI in the share dedicated to non-LHC scientists. In addition, the usage of the EGI infrastructure by Spanish users has increased to 161 million KSI2K hours from July 2012 until June 2013, 20% more than in the same period of the previous year. The participation of Spanish institutions in European Research Infrastructure projects has grown to 12 European projects, some of which present their results in this book.

This year's edition specially addresses cloud computing and expertise from user communities. Challenges such as the federation of virtual infrastructures and the easy deployment of services and virtual appliances constitute a major trend in today's research. The Spanish participation in EGI's Virtualized Clouds Task Force has been important and relevant.

Regarding the use cases, experiences in the LHC, energy, biodiversity, engineering and security will be presented and discussed, surely showing many common points for exchanging approaches and solutions. ICT advances in distributed systems in terms of data, fault-tolerance and interoperability will also be presented.

Finally, it is important to mention that IBERGRID organizes the 2013 Technical Forum of EGI in Madrid, Spain. This event will bring together over 400 participants with an interest in distributed infrastructures for research, including top members of the European Commission. This seventh IBERGRID conference has been co-located with this major event. We believe that attendees of IBERGRID will strongly benefit from the wider scope of the Technical Forum while increasing the chances of impact of their presentations. IBERGRID is a mature and lively community that works seamlessly, with a wide impact on the Spanish and Portuguese research communities.
It constitutes a major example of bilateral cooperation and a solid partner for international cooperation.

The Conference Chairs: Ignacio Blanquer, Isabel Campos, Gonçalo Borges, Jorge Gomes

Chairs

Conference Chairs
Ignacio Blanquer, Universitat Politècnica de València - UPV
Isabel Campos, Instituto de Física de Cantabria - IFCA-CSIC
Gonçalo Borges, Laboratório de Instrumentação e Física Experimental de Partículas - LIP
Jorge Gomes, Laboratório de Instrumentação e Física Experimental de Partículas - LIP

Scientific Advisory Committee
Ignacio Blanquer, Andreu Pacheco, Tomás de Miguel, Isabel Campos, António Pina, Victor Castelo, Carlos Fernández, Eduardo Huedo, Carlos Redondo, Francisco Castejón, Enol Fernández, Rosa Badia, Jesus Marco, Gaspar Barreira, Miguel Cárdenas-Montes, Vicente Hernández, Gonzalo Merino, Germán Moltó, Jorge Gomes, Ignacio López, Erik Torres, Gonçalo Borges, Javier G. Tobío, Damián Segrelles, Mario David, Javier Lopez, Carlos de Alfonso, João Pina, Jose Salt, Vítor Oliveira, Alvaro Fernández, Josep Flix, Helmut Wolters

Local Organizing Committee
Ignacio Blanquer, Isabel Campos, Carlos Fernández, Antonio Fuentes, Francisco Castejón, Miguel Caballer, Enrique Bayonne

Sponsors. Jointly promoted by: Spanish Ministry of Science and Innovation; Fundação para a Ciência e Tecnologia. With the support of: Instituto de Física de Cantabria - IFCA-CSIC; Universitat Politècnica de València - UPV.

SPARKS, a dynamic power-aware approach for managing computing cluster resources
J. Martins, G. Borges, N. Dias, H. Gomes, J. Gomes, J. Pina, C. Manuel
Laboratório de Instrumentação e Física Experimental de Partículas, Portugal
[email protected], [email protected]

Abstract. SPARKS is a new flexible and modular solution to optimize the execution of computing tasks with the minimum of available resources, focusing on reducing power consumption. SPARKS acts as a third-party agent interacting with the Local Resource Management System (LRMS) and the cluster resources. The agent tries to influence resource allocation decisions with the objective of concentrating executing tasks on partially filled hosts, minimizing the need to wake up new resources, and powering down unused hosts. The proposed approach provides the necessary resources to the LRMS without interfering in any of its scheduling decisions, guaranteeing that the resource allocation policies are never infringed.

1 Introduction

High Energy and Particle Physics research institutions are well known for hosting scientists performing state-of-the-art research with extraordinary computing power requirements. Such researchers are normally part of wider worldwide collaborations focused, at a technical level, on the processing of large amounts of experimental data collected at big
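The abstract describes a consolidation policy: pack tasks onto already-powered, partially filled hosts, wake hardware only as a last resort, and switch off whatever sits idle. The following is a minimal illustrative sketch of such a loop, not the SPARKS implementation; the Host shape, the per-cycle task list, and the wake/power-down hooks are assumptions for the example.

    # Illustrative sketch (not SPARKS itself) of a power-aware placement cycle.
    # Task completion/release is omitted for brevity.
    from __future__ import annotations
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Host:
        name: str
        slots: int           # total job slots on this host
        used: int = 0        # slots currently allocated
        online: bool = False
        idle_cycles: int = 0

    def place(task_slots: int, hosts: list[Host]) -> Optional[Host]:
        """Best fit on already-online hosts: prefer the fullest host that
        still fits the task, so load concentrates on fewer powered nodes."""
        fits = [h for h in hosts if h.online and h.slots - h.used >= task_slots]
        return max(fits, key=lambda h: h.used, default=None)

    def schedule_cycle(pending: list[int], hosts: list[Host], idle_limit: int = 3) -> None:
        for task_slots in pending:
            host = place(task_slots, hosts)
            if host is None:
                # Nothing fits: wake the smallest powered-down host that can take it.
                offline = [h for h in hosts if not h.online and h.slots >= task_slots]
                if not offline:
                    continue  # task stays queued in the LRMS
                host = min(offline, key=lambda h: h.slots)
                host.online = True      # stand-in for a real wake-on-LAN / IPMI call
            host.used += task_slots
            host.idle_cycles = 0
        for h in hosts:
            if h.online and h.used == 0:
                h.idle_cycles += 1
                if h.idle_cycles >= idle_limit:
                    h.online = False    # stand-in for a real power-down command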
Recommended publications
  • Slurm Overview and Elasticsearch Plugin
    Slurm Workload Manager Overview, SC15. Alejandro Sanchez, [email protected]. Copyright 2015 SchedMD LLC, http://www.schedmd.com

    Overview:
    ● Originally intended as a simple resource manager, but has evolved into a sophisticated batch scheduler
    ● Able to satisfy scheduling requirements for major computer centers with use of optional plugins
    ● No single point of failure, backup daemons, fault-tolerant job options
    ● Highly scalable (3.1M-core Tianhe-2 at NUDT)
    ● Highly portable (autoconf, extensive plugins for various environments)
    ● Open source (GPL v2)
    ● Operating on many of the world's largest computers
    ● About 500,000 lines of code today (plus test suite and documentation)

    Enterprise Architecture (diagram slide)

    Architecture:
    ● Kernel with core functions plus about 100 plugins to support various architectures and features
    ● Easily configured using a building-block approach
    ● Easy to enhance for new architectures or features, typically just by adding new plugins

    Elasticsearch Plugin (diagram slide)

    Scheduling Capabilities:
    ● Fair-share scheduling with hierarchical bank accounts
    ● Preemptive and gang scheduling (time-slicing parallel jobs)
    ● Integrated with database for accounting and configuration
    ● Resource allocations optimized for topology
    ● Advanced resource reservations
    ● Manages resources across an enterprise
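    To make the batch-scheduler workflow above concrete, here is a minimal sketch that submits a job and polls its state using standard Slurm commands (sbatch and squeue) from Python; the partition name "batch" and the job script "job.sh" are assumptions.

        # Sketch: driving Slurm via its standard CLI tools from Python.
        import subprocess
        import time

        def submit(script: str) -> str:
            """Submit a batch script with sbatch and return its job id."""
            out = subprocess.run(
                ["sbatch", "--partition=batch", "--ntasks=1",
                 "--time=00:05:00", "--parsable", script],
                check=True, capture_output=True, text=True).stdout
            return out.strip().split(";")[0]  # --parsable prints "jobid[;cluster]"

        def state(job_id: str) -> str:
            """Query the job state with squeue; empty output usually means
            the job has already left the queue."""
            out = subprocess.run(
                ["squeue", "-j", job_id, "-h", "-o", "%T"],
                capture_output=True, text=True).stdout.strip()
            return out or "FINISHED"

        if __name__ == "__main__":
            job = submit("job.sh")  # job.sh is a user-provided batch script
            while (s := state(job)) != "FINISHED":
                print(f"job {job}: {s}")
                time.sleep(10)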
  • Proceedings of the Linux Symposium
    Proceedings of the Linux Symposium, Volume One. June 27th-30th, 2007, Ottawa, Ontario, Canada.

    Contents:
    ● The Price of Safety: Evaluating IOMMU Performance (Ben-Yehuda, Xenidis, Mostrows, Rister, Bruemmer, Van Doorn)
    ● Linux on Cell Broadband Engine status update (Arnd Bergmann)
    ● Linux Kernel Debugging on Google-sized clusters (M. Bligh, M. Desnoyers, R. Schultz)
    ● Ltrace Internals (Rodrigo Rubira Branco)
    ● Evaluating effects of cache memory compression on embedded systems (Anderson Briglia, Allan Bezerra, Leonid Moiseichuk, Nitin Gupta)
    ● ACPI in Linux – Myths vs. Reality (Len Brown)
    ● Cool Hand Linux – Handheld Thermal Extensions (Len Brown)
    ● Asynchronous System Calls (Zach Brown)
    ● Frysk 1, Kernel 0? (Andrew Cagney)
    ● Keeping Kernel Performance from Regressions (T. Chen, L. Ananiev, A. Tikhonov)
    ● Breaking the Chains—Using LinuxBIOS to Liberate Embedded x86 Processors (J. Crouse, M. Jones, R. Minnich)
    ● GANESHA, a multi-usage with large cache NFSv4 server (P. Deniel, T. Leibovici, J.-C. Lafoucrière)
    ● Why Virtualization Fragmentation Sucks (Justin M. Forbes)
    ● A New Network File System is Born: Comparison of SMB2, CIFS, and NFS (Steven French)
    ● Supporting the Allocation of Large Contiguous Regions of Memory (Mel Gorman)
    ● Kernel Scalability—Expanding the Horizon Beyond Fine Grain Locks (Corey Gough, Suresh Siddha, Ken Chen)
    ● Kdump: Smarter, Easier, Trustier (Vivek Goyal)
    ● Using KVM to run Xen guests without Xen (R.A. Harper, A.N. Aliguori, M.D. Day)
    ● Djprobe—Kernel probing with the smallest overhead (M. Hiramatsu, S. Oshima)
    ● Desktop integration of Bluetooth (Marcel Holtmann)
    ● How virtualization makes power management different (Yu Ke)
    ● Ptrace, Utrace, Uprobes: Lightweight, Dynamic Tracing of User Apps (J.
  • Teaching Operating Systems Concepts with Systemtap
    Session 8B: Enhancing CS Instruction. ITiCSE '17, July 3-5, 2017, Bologna, Italy

    Teaching Operating Systems Concepts with SystemTap
    Darragh O'Brien, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland. [email protected]

    ABSTRACT: The study of operating systems is a fundamental component of all undergraduate computer science degree programmes. Making operating system concepts concrete typically entails large programming projects. Such projects traditionally involve enhancing an existing module in a real-world operating system or extending a pedagogical operating system. The latter programming projects represent the gold standard in the teaching of operating systems and their value is undoubted. However, there is room in introductory operating systems courses for supplementary approaches and tools that support the demonstration of operating system concepts in the context of a live, real-world operating system. This paper describes how SystemTap [3, 4] can be applied to both demonstrate and explore low-level behaviour across a range of system modules in the context of a real-world operating system. SystemTap scripts allow the straightforward interception of kernel-level events, thereby providing instructor and students alike with concrete examples of operating system concepts that might otherwise remain theoretical. The simplicity of such scripts makes them suitable for inclusion in lectures and live demonstrations in introductory operating systems courses.
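    As an illustration of the kind of short script the paper describes, the sketch below launches a SystemTap one-liner from Python; it assumes the stap binary is installed and run with sufficient privileges, and it traces openat() system calls system-wide for ten seconds, the sort of live demo suited to a lecture.

        # Sketch: running a classic SystemTap probe from Python.
        import subprocess

        STAP_SCRIPT = r'''
        probe syscall.openat {
            printf("%s(%d) openat %s\n", execname(), pid(), filename)
        }
        probe timer.s(10) { exit() }   # stop the demo after 10 seconds
        '''

        # stap -e runs a script given on the command line (needs root/stapdev).
        subprocess.run(["stap", "-e", STAP_SCRIPT], check=True)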
  • Red Hat Enterprise Linux 7 SystemTap Beginners Guide
    Red Hat Enterprise Linux 7 SystemTap Beginners Guide: Introduction to SystemTap

    William Cohen, Red Hat Performance Tools, [email protected]
    Don Domingo, Red Hat Engineering Content Services, [email protected]
    Jacquelynn East, Red Hat Engineering Content Services, [email protected]

    Legal Notice: Copyright © 2014 Red Hat, Inc. and others. This document is licensed by Red Hat under the Creative Commons Attribution-ShareAlike 3.0 Unported License. If you distribute this document, or a modified version of it, you must provide attribution to Red Hat, Inc. and provide a link to the original. If the document is modified, all Red Hat trademarks must be removed. Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law. Red Hat, Red Hat Enterprise Linux, the Shadowman logo, JBoss, MetaMatrix, Fedora, the Infinity Logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries. Linux® is the registered trademark of Linus Torvalds in the United States and other countries. Java® is a registered trademark of Oracle and/or its affiliates. XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries. MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries. Node.js® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
  • D2.3 Transport Research in the European Open Science Cloud
    Ref. Ares(2020)5096102 - 29/09/2020

    BE OPEN: European forum and oBsErvatory for OPEN science in transport. Project Number: 824323. Topic: MG-4-2-2018 - Building Open Science platforms in transport research. Type of Action: Coordination and support action (CSA).

    D2.3 Transport Research in the European Open Science Cloud (Final Version)
    Deliverable Title: Open/FAIR data, software and infrastructure in European transport research
    Work Package: WP2. Due Date: 2019.11.30. Submission Date: 2020.09.29.
    Start Date of Project: 1st January 2019. Duration of Project: 30 months.
    Organisation Responsible for Deliverable: ATHENA RC. Status: Final.
    Author(s): Natalia Manola, Afroditi Anagnostopoulou, Harry Dimitropoulos, Alessia Bardi
    Reviewer(s): Kristel Palts (DLR), Christian von Bühler (Osborn-Clarke), Anna Walek, M. Zuraska (GUT), Clara García (Scipedia), Anja Fleten Nielsen (TOI), Lucie Mendoza (HUMANIST), Michela Floretto (FIT), Ioannis Ergas (WEGEMT), Boris Hilia (UITP), Caroline Almeras (ECTRI), Rudolf Cholava (CDV), Milos Milenkovic (FTTE)
    Nature: R - Report. Dissemination level: PU - Public.

    Document history: v0.1, 2019.11.25, Afroditi Anagnostopoulou, Alessia Bardi, Harry Dimopoulos - EOSC sections according to EC documents.
  • Monday, Oct 12, 2009
    Monday, Oct 12, 2009. Keynote Presentations.

    Infrastructures Use, Requirements and Prospects in ICT for Health Domain
    Karin Johansson, European Commission, Brussels, Belgium

    Progress in Integrating Networks with Service Oriented Architectures / Grids: ESnet's Guaranteed Bandwidth Service
    William E. Johnston, Senior Scientist, Energy Sciences Network, Lawrence Berkeley National Laboratory, USA

    The past several years have seen Grid software mature to the point where it is now at the heart of some of the largest science data analysis systems, notably the CMS and ATLAS experiments at the LHC. Systems like these, with their integrated, distributed data management and workflow management, routinely treat computing and storage resources as 'services': resources that can be discovered, queried as to present and future state, and scheduled with guaranteed capacity. In Grid-based systems the network provides the communication among these service-based resources, yet historically the network has been a 'best effort' resource offering no guarantees and little state transparency. Recent work in the R&E network community associated with the science community has made progress toward developing network capabilities with service-like characteristics: guaranteed capacity that can be scheduled in advance, and transparency for the state of the network from end to end. These services have grown out of initial work in the Global Grid Forum's Grid Performance Working Group. The services are defined by standard interfaces and data formats, but may have very different implementations in different networks. An ad hoc international working group has been implementing, testing, and refining these services in order to ensure interoperability among the many network domains involved in science collaborations.
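    To illustrate the advance-reservation idea in the abstract, here is a purely hypothetical sketch of a guaranteed-bandwidth service with admission control over a time window; it is not ESnet's actual interface, and all names and shapes are invented for the example.

        # Hypothetical sketch of advance-scheduled guaranteed bandwidth.
        from dataclasses import dataclass, field

        @dataclass
        class Reservation:
            src: str
            dst: str
            gbps: float
            start: int   # epoch seconds
            end: int

        @dataclass
        class BandwidthService:
            capacity_gbps: float
            booked: list = field(default_factory=list)

            def committed(self, t0: int, t1: int) -> float:
                """Total bandwidth of reservations overlapping [t0, t1);
                a conservative upper bound on peak committed capacity."""
                return sum(r.gbps for r in self.booked
                           if r.start < t1 and t0 < r.end)

            def reserve(self, r: Reservation) -> bool:
                """Admit the reservation only if guaranteed capacity remains
                for the whole window; otherwise reject it up front."""
                if self.committed(r.start, r.end) + r.gbps > self.capacity_gbps:
                    return False
                self.booked.append(r)
                return True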
  • Intel® SSF Reference Design: Intel® Xeon Phi™ Processor, Intel® OPA
    Intel® Scalable System Framework (Intel® SSF) Reference Design: cluster installation for systems based on the Intel® Xeon® processor E5-2600 v4 family, including Intel® Ethernet. Based on OpenHPC v1.1. Version 1.0.

    Summary: This Reference Design is part of the Intel® Scalable System Framework series of reference collateral. The Reference Design is a verified implementation example of a given reference architecture, complete with hardware and software Bill of Materials information and cluster configuration instructions. It can confidently be used "as is", or be the foundation for enhancements and/or modifications. Additional Reference Designs are expected in the future to provide example solutions for existing reference architecture definitions and for utilizing additional Intel® SSF elements. Similarly, more Reference Designs are expected as new reference architecture definitions are introduced. This Reference Design is developed in support of the specification listed below using certain Intel® SSF elements:
    ● Intel® Scalable System Framework Architecture Specification
    ● Servers with Intel® Xeon® processor E5-2600 v4 family processors
    ● Intel® Ethernet
    ● Software stack based on OpenHPC v1.1

    Legal Notices: No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice.
  • SUSE Linux Enterprise Server 12 SP4 System Analysis and Tuning Guide
    SUSE Linux Enterprise Server 12 SP4 System Analysis and Tuning Guide

    An administrator's guide for problem detection, resolution and optimization. Find out how to inspect and optimize your system by means of monitoring tools and how to efficiently manage resources. Also contains an overview of common problems and solutions, and of additional help and documentation resources.

    Publication Date: September 24, 2021. SUSE LLC, 1800 South Novell Place, Provo, UT 84606, USA. https://documentation.suse.com

    Copyright © 2006-2021 SUSE LLC and contributors. All rights reserved. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled "GNU Free Documentation License". For SUSE trademarks, see https://www.suse.com/company/legal/. All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks. All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors nor the translators shall be held liable for possible errors or the consequences thereof.

    Contents: About This Guide; Available Documentation
  • SUSE Linux Enterprise High Performance Computing 15 SP3 Administration Guide
    SUSE Linux Enterprise High Performance Computing 15 SP3 Administration Guide

    Publication Date: September 24, 2021. SUSE LLC, 1800 South Novell Place, Provo, UT 84606, USA. https://documentation.suse.com

    Copyright © 2020-2021 SUSE LLC and contributors. All rights reserved. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled "GNU Free Documentation License". For SUSE trademarks, see http://www.suse.com/company/legal/. All third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks. All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors nor the translators shall be held liable for possible errors or the consequences thereof.

    Contents: Preface (Available documentation; Giving feedback; Documentation conventions; Support, including the support statement for SUSE Linux Enterprise High Performance Computing and technology previews); 1 Introduction (1.1 Components provided; 1.2 Hardware platform support; 1.3 Support
  • Version Control Graphical Interface for Open OnDemand
    VERSION CONTROL GRAPHICAL INTERFACE FOR OPEN ONDEMAND
    by HUAN CHEN

    Submitted in partial fulfillment of the requirements for the degree of Master of Science, Department of Electrical Engineering and Computer Science, Case Western Reserve University, August 2018.

    Case Western Reserve University School of Graduate Studies. We hereby approve the thesis/dissertation of Huan Chen, candidate for the degree of Master of Science. Committee Chair: Chris Fietkiewicz. Committee Members: Christian Zorman, Roger Bielefeld. Date of Defense: June 27, 2018.

    TABLE OF CONTENTS
    Abstract
    Chapter 1: Introduction
    Chapter 2: Methods
      2.1 Installation for Environments and Open OnDemand
        2.1.1 Install SLURM (Create User; Install and Configure Munge; Install and Configure SLURM; Enable Accounting)
        2.1.2 Install Open OnDemand
      2.2 Git Version Control for Open OnDemand
  • Extending SLURM for Dynamic Resource-Aware Adaptive Batch Scheduling
    Extending SLURM for Dynamic Resource-Aware Adaptive Batch Scheduling
    Mohak Chadha, Jophin John, Michael Gerndt
    Chair of Computer Architecture and Parallel Systems, Technische Universität München, Garching (near Munich), Germany
    Email: [email protected], [email protected], [email protected]

    Abstract: With the growing constraints on power budget and increasing hardware failure rates, the operation of future exascale systems faces several challenges. Towards this, resource awareness and adaptivity enabled by malleable jobs have been actively researched in the HPC community. Malleable jobs can change their computing resources at runtime and can significantly improve HPC system performance. However, due to the rigid nature of popular parallel programming paradigms such as MPI and the lack of support for dynamic resource management in batch systems, malleable jobs have been largely unrealized. In this paper, we extend the SLURM batch system to support the execution and batch scheduling of malleable jobs. The malleable applications are written using a new adaptive parallel paradigm called Invasive MPI, which extends the MPI standard to support resource-adaptivity at runtime.

    From the introduction: ...their resource requirements at runtime due to varying computational phases based upon refinement or coarsening of meshes. A solution to overcome these challenges and efficiently utilize the system components is resource awareness and adaptivity. A method to achieve adaptivity in HPC systems is by enabling malleable jobs. A job is said to be malleable if it can adapt to resource changes triggered by the batch system at runtime [6]. These resource changes can either increase (expand operation) or reduce (shrink operation) the number of processors. Malleable jobs have been shown to significantly improve the performance of a batch system in terms of both system- and user-centric metrics such as system utilization and average response time [7].
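    As a rough illustration of the expand and shrink operations described above (not the paper's SLURM extension), the sketch below rebalances malleable jobs: it shrinks flexible jobs toward their minimum when work is queued, and expands them when nodes would otherwise sit idle. The job shape and the meaning of "queued" (nodes requested by waiting jobs) are assumptions for the example.

        # Sketch of a batch-system rebalancing pass over malleable jobs.
        from dataclasses import dataclass

        @dataclass
        class MalleableJob:
            job_id: int
            nodes: int       # nodes currently assigned
            min_nodes: int   # below this the job cannot run
            max_nodes: int   # above this extra nodes bring no speedup

        def rebalance(jobs: list, total_nodes: int, queued: int) -> None:
            free = total_nodes - sum(j.nodes for j in jobs)
            if queued > 0:
                # Jobs are waiting: shrink the most flexible jobs first.
                for j in sorted(jobs, key=lambda j: j.nodes - j.min_nodes, reverse=True):
                    if free >= queued:
                        break
                    give = min(j.nodes - j.min_nodes, queued - free)
                    j.nodes -= give          # shrink operation
                    free += give
            else:
                # Empty queue and idle nodes: expand jobs up to max_nodes.
                for j in sorted(jobs, key=lambda j: j.max_nodes - j.nodes, reverse=True):
                    take = min(j.max_nodes - j.nodes, free)
                    j.nodes += take          # expand operation
                    free -= take
                    if free == 0:
                        break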
  • Collaborations, Exploitation and Sustainability Plan
    EUROPEAN MIDDLEWARE INITIATIVE
    DNA2.4.2 - Collaborations, Exploitation and Sustainability Plan
    EU Deliverable: D3.1.1
    Document identifier: EMI-DNA3.1.1-1450889-Collaborations_Exploitation_and_Sustainability_Plan_M24-v1.0.doc
    Date: 30/04/2012. Activity: NA3 - Sustainability and Long-Term Strategies. Lead Partner: NIIF. Document status: Final.
    Document link: https://cdsweb.cern.ch/record/1450889?ln=en

    Abstract: This document is a 12-month follow-up on the initial Exploitation and Sustainability Plan, DNA2.4.2, and a 16-month follow-up of the Initial Collaboration Programs, DNA2.1.1. In this document we describe the current definition of the sustainability and exploitation plans, summarize the progress made in their implementation, and give an outline of the work planned for the final year of the project.

    INFSO-RI-261611, 2010-2013. © Members of the EMI collaboration. PUBLIC.

    Delivery slip: From Peter Stefan (NIIF/NA3), 24 May 2012. Reviewed by Alberto Di Meglio (CERN/NA1) and Florida Estrella (CERN/NA1), 25 May 2012. Approved by Ludek Matyska (CESNET/CB) and the PEB, 26/05/2012.

    Document log: v0.7, 18/05/2012, first draft (Peter Stefan/NIIFI); v0.8-0.16, 22-25/05/2012, iterative revisions (Peter Stefan/NIIFI, Alberto Di Meglio/CERN, Florida Estrella/CERN); v1.0, 26/05/2012, final version for release (PEB).