Yu Zhang: Performance Improvement of Hypervisors for HPC Workload

Total Pages: 16

File Type: PDF, Size: 1020 KB

Performance Improvement of Hypervisors for HPC Workload

Universitätsverlag Chemnitz, 2018

Impressum

Bibliographic information of the Deutsche Nationalbibliothek: the Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://www.dnb.de.

Except for quotations, the cover, the TU Chemnitz logo, and the images in the text, this work is licensed under the Creative Commons Attribution 4.0 International licence (CC BY 4.0): http://creativecommons.org/licences/by/4.0/deed.de

Cover graphic: Yu Zhang
Typesetting/layout: Yu Zhang

Technische Universität Chemnitz/Universitätsbibliothek
Universitätsverlag Chemnitz
09107 Chemnitz
https://www.tu-chemnitz.de/ub/univerlag

readbox unipress in der readbox publishing GmbH
Am Hawerkamp 31
48155 Münster
http://unipress.readbox.net

ISBN 978-3-96100-069-2
http://nbn-resolving.de/urn:nbn:de:bsz:ch1-qucosa2-318258

Performance Improvement of Hypervisors for HPC Workload

Dissertation submitted in fulfilment of the requirements for the academic degree of Dr.-Ing. by Mr. Yu Zhang, M.Eng., born on 7 July 1980 in Huhhot, China, at the Faculty of Computer Science of the Technische Universität Chemnitz.
Dean: Prof. Dr. rer. nat. Wolfram Hardt
Reviewers: Prof. Dr.-Ing. habil. Matthias Werner, Prof. Dr.-Ing. Alejandro Masrur
Date of submission: 30 January 2018
Date of defense: 3 July 2018

Zhang, Yu: Performance Improvement of Hypervisors for HPC Workload. Dissertation, Faculty of Computer Science, Technische Universität Chemnitz, January 2019.

Acknowledgments

After eight years of effort, my PhD study is slowly drawing to an end. Many people have rendered their help, expressed their concern, and encouraged me. All of this strengthened my confidence to overcome the difficulties along the way and to carry this work through. First, I sincerely thank my former supervisor, Prof.
Wolfgang Rehm, who offered me the opportunity to pursue a PhD at TU Chemnitz and arranged the equipment and environment necessary for scientific research. More importantly, under Prof. Rehm's guidance I was ushered into the domains of computer architecture, HPC, and system virtualization; work on a topic spanning these areas led to this dissertation.

Second, I would like to express my heartfelt gratitude to my current supervisor, Prof. Matthias Werner, who sincerely supervised my research topic and arranged a very comfortable environment for me to complete this research task. Without Prof. Werner's generous support, I cannot imagine having completed this research. As an earnest scholar and an upright man, Prof. Werner really impressed me.

My thanks extend to my colleagues in the two professors' research groups. René Oertel, Nico Mittenzwey, Hendrik Nöll, and Johannes Hiltscher gave me invaluable suggestions on many issues, from equipment and publications to professional technical skills and my stay in Chemnitz. With his patience and experience, Dr. Peter Tröger helped me avoid quite a few unpromising research ideas while drafting each chapter of this dissertation. I benefited enormously from his comments, his feedback, and the conversations with him.

It is among my best memories to have attended the conferences EMS 2013 in Manchester, BigDataScience'14 in Beijing, Euro-Par VHPC'15 in Vienna, and ISC VHPC'17 in Frankfurt. I am grateful to the many anonymous reviewers who evaluated my contributions positively and gave me the chance to meet excellent researchers and scholars from around the world. During these conferences, the chairs, Prof. David Al-Dabass, Dr. Alvin Chin, and Dr. Michael Alexander, had very nice talks with me. Dr. Andrew J. Younge, whom I got to know at ISC VHPC'17, gave me valuable constructive comments on a portion of this manuscript. Prof. Andreas Goerdt encouraged me to keep going even at the tough moments.
Other friends, Wang Jian, Chou Chih-Ying, and Bai Qiong, supported me throughout the past eight years. They are true friends with whom I can share joy and frustration. Furthermore, I would like to mention the China Scholarship Council (CSC), whose financing made most of this research possible, and the Deutscher Akademischer Austauschdienst (DAAD), which also sponsored me within the framework of a research project in the last year.

This manuscript was proofread by Ms. Jennifer Hofmann and Ms. Christine Jakobs. They took great patience in scrutinizing the lengthy text and correcting grammatical and typographical errors, punctuation, spelling, and inconsistencies. Their efforts helped me produce a significantly clearer and more professional scientific work. A few lines of words could never be enough to express my gratitude.

Finally, I express my gratitude to my parents and my brother, who silently supported me through the years with their love, concern, and understanding from thousands of miles away. I dedicate this book to my father, who loved me so much but left me forever before the submission.

Abstract

Virtualization technology has many excellent features that benefit today's high-performance computing (HPC): it enables a more flexible and effective utilization of computing resources. However, a major barrier to its wide acceptance in the HPC domain lies in the relatively large performance loss for workloads. Among the major performance-influencing factors, the memory management subsystem for virtual machines is a potential source of performance loss. Many efforts have been invested in seeking solutions that reduce the overhead of the guest memory address translation process. This work contributes two novel solutions, "DPMS" and "STDP". Both are presented conceptually and implemented partially for a hypervisor, KVM.
The benchmark results for DPMS show that the performance of a number of workloads that are sensitive to the paging method can be more or less improved by adopting this solution. STDP illustrates that it is feasible to reduce the performance overhead of second-dimension paging for those workloads that cannot make good use of the TLB.

Zusammenfassung

Virtualization technology has many excellent properties that are advantageous for today's high-performance computing. It enables a more flexible and effective use of computing resources. A major obstacle to its acceptance in the HPC domain, however, lies in the relatively large performance loss for workloads. Among the most important performance-influencing factors, the memory management subsystem for virtual machines is a potential source of performance loss. Many efforts have been made to find solutions that reduce the overhead of translating guest memory addresses. This work provides two new solutions, "DPMS" and "STDP". Both are presented conceptually and partially implemented for a hypervisor, KVM. The benchmark results for DPMS show that the performance of a number of paging-method-sensitive workloads can be more or less improved by introducing this solution. STDP illustrates that it is possible to reduce the performance overhead of two-dimensional paging for those workloads that cannot exploit the advantages offered by the TLB.

Contents

List of Figures
List of Tables
List of Abbreviations

1 Introduction
  1.1 Background
    1.1.1 HPC and its Major Concerns
      Performance
      Power Consumption and Efficiency
      Hardware Utilization and Efficiency
    1.1.2 System Virtualization
      Overview of Virtualization
      Virtualization's Benefits
      Virtualization's Limitations
    1.1.3 Exploit Virtualization in HPC
      Customizable Execution Environment
      System Resource Utilization Enhancing
      Power and Hardware Efficiency Enhancing
  1.2 Problem Description
  1.3 Thesis Contribution to the Current Research
  1.4 Outline of the Thesis

2 Related Work
  2.1 System Virtualization
  2.2 Virtualization in HPC
  2.3 Performance for Memory Virtualization
  2.4 Summary

3 A Study of the Performance Loss for Memory Virtualization
  3.1 HPC Workload Deployment Patterns
  3.2 Benchmark Selection for Intra-node Pattern
  3.3 Benchmark Platform
    3.3.1 Intel Platform
      Intel Core i7-6700K (Skylake)
      Intel Xeon E5-1620 (Ivy Bridge-E)
    3.3.2 AMD Platform
      AMD FX-8150 (Bulldozer)
    3.3.3 System Software
  3.4 Performance for Memory Paging
  3.5 Summary

4 DPMS - Dynamic Paging Method Switching and STDP - Simplified Two-Dimensional Paging
  4.1 Reflections and Solutions
    4.1.1 Dynamic Paging Method Switching
    4.1.2 STDP with Large Page Table
  4.2 DPMS - Dynamic Paging Method Switching
    4.2.1 Performance Data Sampling
    4.2.2 Data Processing
    4.2.3 Decision Making
    4.2.4 Switching Mechanism
  4.3 STDP - Simplified Two-Dimensional Paging
    4.3.1 Revisiting the Current Paging Scheme
    4.3.2 Restructured Page Table
    4.3.3 Page Fault Handling in TDP
    4.3.4 Adaptive Hardware MMU
  4.4 Summary

5 Implementation
  5.1 QEMU-KVM Hypervisor Analysis
    5.1.1 About QEMU
      Main Components
    5.1.2 About KVM
      Guest Creation and Execution
  5.2 Parameter Study for DPMS
  5.3 DPMS on QEMU-KVM for x86-64
    5.3.1 Performance Data Sampling
    5.3.2 Data Processing
    5.3.3 Decision Making
    5.3.4 Switching Mechanism
    5.3.5 Repetitive Mechanism
    5.3.6 PMC Mechanism in QEMU-KVM Context
    5.3.7 DPMS for Multi-Core Processor
  5.4 STDP on QEMU-KVM for x86-64
    5.4.1 Restructured Page Table
    5.4.2 Adaptive MMU for TDP
  5.5 Summary

6 Experiments and Evaluation
  6.1 Objective
  6.2 Testing Design
    6.2.1 Functional Correctness Test
      Performance Data Sampling
      Decision Making
      Paging Method Switching
    6.2.2 Performance Test
  6.3 Test and Benchmark Results of DPMS
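The abstract describes DPMS as switching between paging methods for workloads that are sensitive to them: two-dimensional paging (TDP/EPT) makes every TLB miss more expensive, while shadow paging makes page faults more expensive. A minimal sketch of that decision step is shown below. The counter fields, threshold values, and function names are illustrative assumptions for this summary only; the dissertation's actual sampling and switching mechanisms inside QEMU-KVM are described in Chapters 4 and 5.

```python
# Hypothetical sketch of the DPMS decision idea: sample paging-related
# performance counters over a window, then pick the paging method whose
# dominant cost the workload avoids. Names and thresholds are invented
# for illustration; they are not the dissertation's implementation.
from dataclasses import dataclass

@dataclass
class PagingSample:
    tlb_misses: int   # TLB miss events observed in the sampling window
    page_faults: int  # guest page-fault exits observed in the window
    cycles: int       # elapsed cycles in the window

def choose_paging_method(sample: PagingSample,
                         tlb_miss_threshold: float = 0.01,
                         fault_threshold: float = 1e-4) -> str:
    """Pick a paging method for the next window.

    A workload with poor TLB behaviour pays for the long two-dimensional
    walk on every miss, so it favours shadow paging; a fault-heavy
    workload pays for shadow page-table maintenance, so it favours TDP.
    """
    tlb_miss_rate = sample.tlb_misses / max(sample.cycles, 1)
    fault_rate = sample.page_faults / max(sample.cycles, 1)
    if tlb_miss_rate > tlb_miss_threshold and fault_rate < fault_threshold:
        return "shadow"
    return "tdp"

# Example: a TLB-thrashing but fault-light window favours shadow paging.
window = PagingSample(tlb_misses=2_000_000, page_faults=50,
                      cycles=100_000_000)
print(choose_paging_method(window))  # -> shadow
```

In the real system the "switch" is far more involved than returning a string (page tables must be rebuilt and the VMCS reconfigured), which is why the thesis devotes a separate switching mechanism to it.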