Platform to Ease the Deployment and Improve the Availability of TRENCADIS Infrastructure
Recommended publications
-
Slurm Overview and Elasticsearch Plugin
Slurm Workload Manager Overview, SC15. Alejandro Sanchez, [email protected]. Copyright 2015 SchedMD LLC, http://www.schedmd.com

Overview
● Originally intended as simple resource manager, but has evolved into sophisticated batch scheduler
● Able to satisfy scheduling requirements for major computer centers with use of optional plugins
● No single point of failure, backup daemons, fault-tolerant job options
● Highly scalable (3.1M core Tianhe-2 at NUDT)
● Highly portable (autoconf, extensive plugins for various environments)
● Open source (GPL v2)
● Operating on many of the world's largest computers
● About 500,000 lines of code today (plus test suite and documentation)

Enterprise Architecture (diagram slide)

Architecture
● Kernel with core functions plus about 100 plugins to support various architectures and features
● Easily configured using building-block approach
● Easy to enhance for new architectures or features, typically just by adding new plugins

Elasticsearch Plugin (diagram slide)

Scheduling Capabilities
● Fair-share scheduling with hierarchical bank accounts
● Preemptive and gang scheduling (time-slicing parallel jobs)
● Integrated with database for accounting and configuration
● Resource allocations optimized for topology
● Advanced resource reservations
● Manages resources across an enterprise
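The deck highlights Slurm's batch scheduling and database-backed accounting. As a rough illustration in that spirit (not from the slides), a minimal Python sketch that drives the standard sbatch and sacct command-line tools; job.sh is a hypothetical batch script, and the cluster is assumed to have accounting enabled:

```python
import subprocess

def submit_job(script_path: str) -> str:
    """Submit a batch script with sbatch and return the job ID."""
    out = subprocess.run(
        ["sbatch", "--parsable", script_path],
        capture_output=True, text=True, check=True,
    )
    # --parsable prints "jobid" (optionally "jobid;cluster")
    return out.stdout.strip().split(";")[0]

def job_state(job_id: str) -> str:
    """Look up the job's state in the accounting database via sacct."""
    out = subprocess.run(
        ["sacct", "-j", job_id, "-X", "--noheader", "--format=State"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

if __name__ == "__main__":
    jid = submit_job("job.sh")  # job.sh: a hypothetical batch script
    print(f"submitted {jid}; state: {job_state(jid)}")
```
-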
Proceedings of the Linux Symposium
Volume One. June 27th–30th, 2007. Ottawa, Ontario, Canada

Contents
The Price of Safety: Evaluating IOMMU Performance (p. 9), Ben-Yehuda, Xenidis, Mostrows, Rister, Bruemmer, Van Doorn
Linux on Cell Broadband Engine status update (p. 21), Arnd Bergmann
Linux Kernel Debugging on Google-sized clusters (p. 29), M. Bligh, M. Desnoyers, & R. Schultz
Ltrace Internals (p. 41), Rodrigo Rubira Branco
Evaluating effects of cache memory compression on embedded systems (p. 53), Anderson Briglia, Allan Bezerra, Leonid Moiseichuk, & Nitin Gupta
ACPI in Linux – Myths vs. Reality (p. 65), Len Brown
Cool Hand Linux – Handheld Thermal Extensions (p. 75), Len Brown
Asynchronous System Calls (p. 81), Zach Brown
Frysk 1, Kernel 0? (p. 87), Andrew Cagney
Keeping Kernel Performance from Regressions (p. 93), T. Chen, L. Ananiev, and A. Tikhonov
Breaking the Chains—Using LinuxBIOS to Liberate Embedded x86 Processors (p. 103), J. Crouse, M. Jones, & R. Minnich
GANESHA, a multi-usage with large cache NFSv4 server (p. 113), P. Deniel, T. Leibovici, & J.-C. Lafoucrière
Why Virtualization Fragmentation Sucks (p. 125), Justin M. Forbes
A New Network File System is Born: Comparison of SMB2, CIFS, and NFS (p. 131), Steven French
Supporting the Allocation of Large Contiguous Regions of Memory (p. 141), Mel Gorman
Kernel Scalability—Expanding the Horizon Beyond Fine Grain Locks (p. 153), Corey Gough, Suresh Siddha, & Ken Chen
Kdump: Smarter, Easier, Trustier (p. 167), Vivek Goyal
Using KVM to run Xen guests without Xen (p. 179), R.A. Harper, A.N. Aliguori & M.D. Day
Djprobe—Kernel probing with the smallest overhead (p. 189), M. Hiramatsu and S. Oshima
Desktop integration of Bluetooth (p. 201), Marcel Holtmann
How virtualization makes power management different (p. 205), Yu Ke
Ptrace, Utrace, Uprobes: Lightweight, Dynamic Tracing of User Apps (p. 215), J.
-
Teaching Operating Systems Concepts with SystemTap
Session 8B: Enhancing CS Instruction. ITiCSE '17, July 3-5, 2017, Bologna, Italy.
Darragh O'Brien, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland. [email protected]

ABSTRACT
The study of operating systems is a fundamental component of all undergraduate computer science degree programmes. Making operating system concepts concrete typically entails large programming projects. Such projects traditionally involve enhancing an existing module in a real-world operating system or extending a pedagogical operating system. The latter programming projects represent the gold standard in the teaching of operating systems and their value is undoubted. However, there is room in introductory operating systems courses for supplementary approaches and tools that support the demonstration of operating system concepts in the context of a live, real-world operating system. This paper describes how SystemTap [3, 4] can be applied to both demonstrate and explore low-level behaviour across a range of system modules in the context of a real-world operating system. SystemTap scripts allow the straightforward interception of kernel-level events thereby providing instructor and students alike with concrete examples of operating system concepts that might otherwise remain theoretical. The simplicity of such scripts makes them suitable for inclusion in lectures and live demonstrations in introductory operating systems courses. This paper describes an approach where the Linux monitoring tool SystemTap is used
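As a taste of the kind of live demonstration the paper advocates, here is a minimal sketch (not from the paper) that drives SystemTap from Python. It assumes the stap tool is installed, the user has the required privileges, and the syscall.openat probe point is available on the running kernel:

```python
import subprocess

# Count openat() calls per process for five seconds, then print the top ten.
# The probe point (syscall.openat) is an assumption; available probe points
# vary with the kernel and the installed SystemTap version.
STP_SCRIPT = r"""
global opens
probe syscall.openat { opens[execname()]++ }
probe timer.s(5) { exit() }
probe end {
    foreach (name in opens- limit 10)
        printf("%-20s %d\n", name, opens[name])
}
"""

if __name__ == "__main__":
    # stap typically requires root or membership in the stapusr/stapdev groups.
    subprocess.run(["stap", "-e", STP_SCRIPT], check=True)
```
-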
Red Hat Enterprise Linux 7 SystemTap Beginners Guide
Red Hat Enterprise Linux 7 SystemTap Beginners Guide: Introduction to SystemTap

William Cohen, Red Hat Performance Tools, [email protected]
Don Domingo, Red Hat Engineering Content Services, [email protected]
Jacquelynn East, Red Hat Engineering Content Services, [email protected]

Legal Notice
Copyright © 2014 Red Hat, Inc. and others. This document is licensed by Red Hat under the Creative Commons Attribution-ShareAlike 3.0 Unported License. If you distribute this document, or a modified version of it, you must provide attribution to Red Hat, Inc. and provide a link to the original. If the document is modified, all Red Hat trademarks must be removed. Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law. Red Hat, Red Hat Enterprise Linux, the Shadowman logo, JBoss, MetaMatrix, Fedora, the Infinity Logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries. Linux® is the registered trademark of Linus Torvalds in the United States and other countries. Java® is a registered trademark of Oracle and/or its affiliates. XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries. MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries. Node.js® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
-
D2.3 Transport Research in the European Open Science Cloud
Ref. Ares(2020)5096102 - 29/09/2020

BE OPEN: European forum and oBsErvatory for OPEN science in transport
Project Acronym: BE OPEN
Project Title: European forum and oBsErvatory for OPEN science in transport
Project Number: 824323
Topic: MG-4-2-2018 - Building Open Science platforms in transport research
Type of Action: Coordination and support action (CSA)

D2.3 Transport Research in the European Open Science Cloud (Final Version)

Deliverable Title: Open/FAIR data, software and infrastructure in European transport research
Work Package: WP2
Due Date: 2019.11.30
Submission Date: 2020.09.29
Start Date of Project: 1st January 2019
Duration of Project: 30 months
Organisation Responsible for Deliverable: ATHENA RC
Version: Final
Status: Final
Author name(s): Natalia Manola, Afroditi Anagnostopoulou, Harry Dimitropoulos, Alessia Bardi
Reviewer(s): Kristel Palts (DLR), Christian von Bühler (Osborn-Clarke), Anna Walek, M. Zuraska (GUT), Clara García (Scipedia), Anja Fleten Nielsen (TOI), Lucie Mendoza (HUMANIST), Michela Floretto (FIT), Ioannis Ergas (WEGEMT), Boris Hilia (UITP), Caroline Almeras (ECTRI), Rudolf Cholava (CDV), Milos Milenkovic (FTTE)
Nature: R - Report
Dissemination level: PU - Public

Document history
0.1, 2019.11.25, Afroditi Anagnostopoulou, Alessia Bardi, Harry Dimopoulos: EOSC sections according to EC documents.
-
Monday, Oct 12, 2009
Keynote Presentations

Infrastructures Use, Requirements and Prospects in ICT for Health Domain
Karin Johansson, European Commission, Brussels, Belgium

Progress in Integrating Networks with Service Oriented Architectures / Grids: ESnet's Guaranteed Bandwidth Service
William E. Johnston, Senior Scientist, Energy Sciences Network, Lawrence Berkeley National Laboratory, USA

The past several years have seen Grid software mature to the point where it is now at the heart of some of the largest science data analysis systems, notably the CMS and ATLAS experiments at the LHC. Systems like these, with their integrated, distributed data management and workflow management, routinely treat computing and storage resources as 'services': resources that can be discovered, queried as to present and future state, and scheduled with guaranteed capacity. In Grid-based systems the network provides the communication among these service-based resources, yet historically the network is a 'best effort' resource offering no guarantees and little state transparency. Recent work in the R&E network community associated with the science community has made progress toward developing network capabilities that provide service-like characteristics: guaranteed capacity that can be scheduled in advance, and transparency for the state of the network from end to end. These services have grown out of initial work in the Global Grid Forum's Grid Performance Working Group. The services are defined by standard interfaces and data formats, but may have very different implementations in different networks. An ad hoc international working group has been implementing, testing, and refining these services in order to ensure interoperability among the many network domains involved in science collaborations.
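To make the "guaranteed capacity scheduled in advance" idea concrete, here is a toy admission-control model (the class and field names are hypothetical illustrations, not ESnet's actual reservation interface):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Reservation:
    # Field names are hypothetical; real services define their own
    # request schemas for endpoints and time windows.
    src: str
    dst: str
    mbps: int
    start: datetime
    end: datetime

    def overlaps(self, other: "Reservation") -> bool:
        return self.start < other.end and other.start < self.end

def admissible(existing: list, new: Reservation, link_mbps: int) -> bool:
    """Admission control: can the link honour `new` alongside every
    already-accepted reservation that overlaps it in time?"""
    committed = sum(r.mbps for r in existing if r.overlaps(new))
    return committed + new.mbps <= link_mbps

# Example: request a 4-hour, 2 Gb/s window starting tomorrow.
t0 = datetime.now() + timedelta(days=1)
req = Reservation("site-a", "site-b", 2000, t0, t0 + timedelta(hours=4))
print(admissible([], req, link_mbps=10000))  # True on an empty 10 Gb/s link
```
-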
Intel® SSF Reference Design: Intel® Xeon Phi™ Processor, Intel® OPA
Intel® Scalable System Framework (Intel® SSF) Reference Design
Cluster installation for systems based on the Intel® Xeon® processor E5-2600 v4 family, including Intel® Ethernet. Based on OpenHPC v1.1. Version 1.0

Summary
This Reference Design is part of the Intel® Scalable System Framework series of reference collateral. The Reference Design is a verified implementation example of a given reference architecture, complete with hardware and software Bill of Materials information and cluster configuration instructions. It can confidently be used "as is", or be the foundation for enhancements and/or modifications. Additional Reference Designs are expected in the future to provide example solutions for existing reference architecture definitions and for utilizing additional Intel® SSF elements. Similarly, more Reference Designs are expected as new reference architecture definitions are introduced. This Reference Design is developed in support of the specification listed below using certain Intel® SSF elements:
• Intel® Scalable System Framework Architecture Specification
• Servers with Intel® Xeon® processor E5-2600 v4 family processors
• Intel® Ethernet
• Software stack based on OpenHPC v1.1

Legal Notices
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice.
-
SUSE Linux Enterprise Server 12 SP4 System Analysis and Tuning Guide
SUSE Linux Enterprise Server 12 SP4: System Analysis and Tuning Guide

An administrator's guide for problem detection, resolution and optimization. Find how to inspect and optimize your system by means of monitoring tools and how to efficiently manage resources. Also contains an overview of common problems and solutions and of additional help and documentation resources.

Publication Date: September 24, 2021
SUSE LLC, 1800 South Novell Place, Provo, UT 84606, USA
https://documentation.suse.com

Copyright © 2006–2021 SUSE LLC and contributors. All rights reserved. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled "GNU Free Documentation License". For SUSE trademarks, see https://www.suse.com/company/legal/ . All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks. All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors nor the translators shall be held liable for possible errors or the consequences thereof.

Contents
About This Guide (xii)
1 Available Documentation (xiii)
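In the spirit of the guide's monitoring-tools material, a small sketch of system inspection (this example is not from the guide; it assumes a Linux /proc filesystem, including the MemAvailable field present in modern kernels):

```python
def load_averages() -> tuple:
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg."""
    with open("/proc/loadavg") as f:
        one, five, fifteen = f.read().split()[:3]
    return float(one), float(five), float(fifteen)

def meminfo() -> dict:
    """Parse /proc/meminfo into a {field: kilobytes} dictionary."""
    fields = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, rest = line.split(":", 1)
            fields[key] = int(rest.split()[0])  # values are reported in kB
    return fields

if __name__ == "__main__":
    print("load averages:", load_averages())
    mem = meminfo()
    print("available memory: %.1f GiB" % (mem["MemAvailable"] / 1024 ** 2))
```
-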
SUSE Linux Enterprise High Performance Computing 15 SP3 Administration Guide
SUSE Linux Enterprise High Performance Computing 15 SP3: Administration Guide

Publication Date: September 24, 2021
SUSE LLC, 1800 South Novell Place, Provo, UT 84606, USA
https://documentation.suse.com

Copyright © 2020–2021 SUSE LLC and contributors. All rights reserved. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled "GNU Free Documentation License". For SUSE trademarks, see http://www.suse.com/company/legal/ . All third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks. All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors nor the translators shall be held liable for possible errors or the consequences thereof.

Contents
Preface (vii)
1 Available documentation (vii)
2 Giving feedback (vii)
3 Documentation conventions (viii)
4 Support (x): Support statement for SUSE Linux Enterprise High Performance Computing (x) • Technology previews (xi)
1 Introduction (1)
1.1 Components provided (1)
1.2 Hardware platform support (2)
1.3 Support
-
Version Control Graphical Interface for Open OnDemand
VERSION CONTROL GRAPHICAL INTERFACE FOR OPEN ONDEMAND
by HUAN CHEN

Submitted in partial fulfillment of the requirements for the degree of Master of Science
Department of Electrical Engineering and Computer Science
CASE WESTERN RESERVE UNIVERSITY
AUGUST 2018

CASE WESTERN RESERVE UNIVERSITY, SCHOOL OF GRADUATE STUDIES
We hereby approve the thesis/dissertation of Huan Chen, candidate for the degree of Master of Science.
Committee Chair: Chris Fietkiewicz
Committee Member: Christian Zorman
Committee Member: Roger Bielefeld
Date of Defense: June 27, 2018

TABLE OF CONTENTS
Abstract
Chapter 1: Introduction (1)
Chapter 2: Methods (4)
2.1 Installation for Environments and Open OnDemand (4)
2.1.1 Install SLURM (4)
2.1.1.1 Create User (4)
2.1.1.2 Install and Configure Munge (5)
2.1.1.3 Install and Configure SLURM (6)
2.1.1.4 Enable Accounting (7)
2.1.2 Install Open OnDemand (9)
2.2 Git Version Control for Open OnDemand
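The thesis wraps Git operations behind a graphical interface; as a rough sketch of the underlying calls such a tool might make (not code from the thesis; assumes git is installed and a user identity is configured):

```python
import subprocess
from pathlib import Path

def git(repo: Path, *args: str) -> str:
    """Run a git subcommand inside `repo` and return its trimmed output."""
    out = subprocess.run(
        ["git", "-C", str(repo), *args],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

repo = Path("demo-repo")  # a scratch repository for the demonstration
repo.mkdir(exist_ok=True)
git(repo, "init")
(repo / "README.md").write_text("hello\n")
git(repo, "add", "README.md")
git(repo, "commit", "-m", "initial commit")
print(git(repo, "log", "--oneline"))
```
-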
Extending SLURM for Dynamic Resource-Aware Adaptive Batch Scheduling
Extending SLURM for Dynamic Resource-Aware Adaptive Batch Scheduling
Mohak Chadha, Jophin John, Michael Gerndt
Chair of Computer Architecture and Parallel Systems, Technische Universität München, Garching (near Munich), Germany
Email: [email protected], [email protected], [email protected]

Abstract—With the growing constraints on power budget and increasing hardware failure rates, the operation of future exascale systems faces several challenges. Towards this, resource awareness and adaptivity by enabling malleable jobs has been actively researched in the HPC community. Malleable jobs can change their computing resources at runtime and can significantly improve HPC system performance. However, due to the rigid nature of popular parallel programming paradigms such as MPI and lack of support for dynamic resource management in batch systems, malleable jobs have been largely unrealized. In this paper, we extend the SLURM batch system to support the execution and batch scheduling of malleable jobs. The malleable applications are written using a new adaptive parallel paradigm called Invasive MPI which extends the MPI standard to support resource-adaptivity at runtime.

From the introduction: … their resource requirements at runtime due to varying computational phases based upon refinement or coarsening of meshes. A solution to overcome these challenges and efficiently utilize the system components is resource awareness and adaptivity. A method to achieve adaptivity in HPC systems is by enabling malleable jobs. A job is said to be malleable if it can adapt to resource changes triggered by the batch system at runtime [6]. These resource changes can either increase (expand operation) or reduce (shrink operation) the number of processors. Malleable jobs have shown to significantly improve the performance of a batch system in terms of both system and user-centric metrics such as system utilization and average response time [7],
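The expand and shrink operations described in the abstract can be illustrated with a toy scheduler model (an illustrative sketch only; this is neither the paper's algorithm nor its Invasive MPI interface):

```python
from dataclasses import dataclass

@dataclass
class MalleableJob:
    name: str
    min_nodes: int
    max_nodes: int
    nodes: int  # current allocation

def rebalance(jobs: list, total_nodes: int, queued: int) -> None:
    """Toy policy: shrink running jobs toward min_nodes when jobs are queued,
    otherwise expand them into idle nodes up to max_nodes."""
    idle = total_nodes - sum(j.nodes for j in jobs)
    if queued > 0:
        for j in jobs:  # shrink operation: reclaim nodes for waiting jobs
            give_back = min(j.nodes - j.min_nodes, queued)
            j.nodes -= give_back
            queued -= give_back
    else:
        for j in jobs:  # expand operation: hand idle nodes to running jobs
            growth = min(j.max_nodes - j.nodes, idle)
            j.nodes += growth
            idle -= growth

jobs = [MalleableJob("cfd", 2, 8, 4), MalleableJob("fem", 2, 6, 4)]
rebalance(jobs, total_nodes=12, queued=0)
print([(j.name, j.nodes) for j in jobs])  # [('cfd', 8), ('fem', 4)]
```
-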
Collaborations, Exploitation and Sustainability Plan
EUROPEAN MIDDLEWARE INITIATIVE
DNA2.4.2 - Collaborations, Exploitation and Sustainability Plan
EU Deliverable: D3.1.1
Document identifier: EMI-DNA3.1.1-1450889-Collaborations_Exploitation_and_Sustainability_Plan_M24-v1.0.doc
Date: 30/04/2012
Activity: NA3 - Sustainability and Long-Term Strategies
Lead Partner: NIIF
Document status: Final
Document link: https://cdsweb.cern.ch/record/1450889?ln=en

Abstract: This document is a 12-month follow-up on the initial Exploitation and Sustainability Plan, DNA2.4.2, and a 16-month follow-up of the Initial Collaboration Programs DNA2.1.1. In this document we describe the current definition of the sustainability and exploitation plans, summarize the progress made in their implementation and give an outline of the work planned for the final year of the project.

INFSO-RI-261611, 2010-2013, © Members of EMI collaboration. PUBLIC

I. Delivery Slip
From: Peter Stefan (NIIF/NA3), 24 May 2012
Reviewed by: Alberto Di Meglio (CERN/NA1), Florida Estrella (CERN/NA1), 25 May 2012; Ludek Matyska (CESNET/CB)
Approved by: PEB, 26/05/2012

II. Document Log
0.7, 18/05/2012, First draft, Peter Stefan/NIIFI
0.8-0.16, 22-25/05/2012, Iterative revisions, Peter Stefan/NIIFI, Alberto Di Meglio/CERN, Florida Estrella/CERN
1.0, 26/05/2012, Final version for release, PEB

III. Document Change Record
Issue, Item, Reason for Change

IV.