A Novel Simulator for Heterogeneous Parallel and Distributed Systems

Total Page:16

File Type:pdf, Size:1020Kb

A Novel Simulator for Heterogeneous Parallel and Distributed Systems TE C H N I C A L UN I V E R S I T Y O F CR E T E ELECTRONIC AND COMPUTER ENGINEERING DEPARTMENT MICROPROCESSOR & HARDWARE LABORATORY A novel Simulator for Heterogeneous Parallel and Distributed Systems NIKOLAOS G. TAMPOURATZIS DISSERTETION SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS OF THE DEGREE OF DOCTOR OF PHILOSOPHY AT TECHNICAL UNIVERSITY OF CRETE CHANIA, GREECE JUNE 2018 2 Doctoral Thesis Committee Ioannis Papaefstathiou (Supervisor) Associate Professor, Technical University of Crete Dionisios Pnevmatikatos Professor, Technical University of Crete Apostolos Dollas Professor, Technical University of Crete Kostas Kalaitzakis Professor, Technical University of Crete Vasilis Samoladas Associate Professor, Technical University of Crete Dimitrios Soudris Associate Professor, National Technical University of Athens Nikolaos Bellas Associate Professor, University of Thessaly 3 Thesis Statement The astonishing growing development of Heterogeneous Parallel Systems and CPS trigger an emergence need of efficient simulators for such platforms, simulators that are extremely complex and processing hungry. 4 Abstract In an era of complex networked heterogeneous systems, simulating independently only parts, components or attributes of a system-under-design is not a viable, accurate or efficient option. By considering each part of a system in an isolated manner, and due to the numerous and highly complicated interactions between the different components, the system optimization capabilities are severely limited. One of the main problems Cyber Physical Systems (CPS) and Highly Parallel Systems (HPS) designers face is the lack of simulation tools and models for system design and analysis. This is mainly because the majority of the existing simulation tools can handle efficiently only parts of a system (e.g. only the processing or only the network). Moreover, most of the existing simulators need extreme amounts of processing resources while faster approaches cannot provide the necessary precision and accuracy. On top of that, the growing use of hardware accelerators in both embedded systems (e.g. mobile phones) and high-end systems (e.g. HPC/Cloud systems) triggers an urgent demand for simulation frameworks that can simulate in an integrated manner all the components (i.e. CPUs, Memories, Networks, Hardware Accelerators) of a system-under- design (SuD). By utilizing such systems, software design can proceed in parallel with hardware development which will result in the reduction of the so important time-to-market. The main problem, however, is that such systems do not exist; most current simulators used for modelling the user applications (i.e. full-system CPU/Mem/Peripheral simulators) lack any type of support for tailor-made hardware accelerators. In this thesis we present the COSSIM simulation framework which is the first known open-source, high-performance simulator that can handle holistically system-of-systems including processors, peripherals and networks; such an approach is very appealing to both CPS and Highly Parallel Heterogeneous Systems designers and application developers. In the context of COSSIM, a novel intercommunication and synchronization scheme is developed which is fully compliant with the IEEE HLA standard. Our highly integrated approach is further augmented with accurate power estimation that can tap on all system 5 components and perform analysis of the overall system under design something which was unfeasible up to know. In addition, we present the ACSIM framework which is the first known open-source, high-performance simulator that can handle holistically system-of-systems including processors, peripherals, networks and accelerators. ACSIM is an extension of the COSSIM simulation framework and it integrates, in a novel and efficient way, a combined system and network simulator with a SystemC simulator, in a transparent to the end-user way. Finally, a sophisticated Eclipse-based GUI has been developed to provide easy simulation set-up, execution and visualization of results. COSSIM and ACSIM have been evaluated when executing several real-world use cases; the end results demonstrate that the presented approach has up to 99% accuracy in the reported SuD aspects (when compared with the corresponding characteristics measured in the real systems), while the overall simulation time can be accelerated almost linearly with the number of CPUs utilized by the simulator. Finally, the presented ACSIM interconnection scheme between the Processing and the SystemC simulators is orders of magnitude faster than the existing solutions, while our simulation framework can efficiently simulate up to several hundreds of processing nodes with hardware accelerators interconnected together, in a full distributed manner. 6 Acknowledgements I am especially grateful to my supervisor, Associate Professor Ioannis Papaefstathiou for his guidance and support during all my years in my Ph.D. I would like to thank Antonios Nikitakis, Andreas Brokalakis, Konstantinos Georgopoulos, Pavlos Malakonakis for the excellent collaboration and their valuable contribution in this thesis work. Moreover, I would like to thank Professors Dionisios Pnevmatikatos, Apostolos Dollas, Kostas Kalaitzakis, Manolis Katevenis, Dimitrios Soudris, Pavlos Mattheakis for their co-advising and their participation in my Ph.D. Thesis committee. Finally, I would like to thank my friends and family, my father George, my mother Irene, my brother Manos, as well as Xrysa for their love, support and encouragement. This work was carried out with the financial support of Telecommunication Systems Institute (Chania Crete Greece) in the framework of the COSSIM and ECOSCALE H2020 European projects with scientific support of Ioannis Papaefstathiou. 7 Contents Doctoral Thesis Committee ...................................................................................................................... 3 Thesis Statement ........................................................................................................................................... 4 Abstract........................................................................................................................................................ 5 Acknowledgements ................................................................................................................................... 7 List of Tables ............................................................................................................................................. 12 List of Figures ........................................................................................................................................... 13 1. Introduction .......................................................................................................................................... 17 1.1 Motivation ....................................................................................................................................... 18 1.2 Contribution.................................................................................................................................... 19 2. Parallel Systems Simulators and COSSIM/ACSIM Approach ..................................................... 23 2.1 CPS Simulation Tools .................................................................................................................... 24 2.1.1 TOSSIM (extensions: PowerTOSSIM) .................................................................................. 24 2.1.2 ATEMU ..................................................................................................................................... 25 2.1.3 AVRORA .................................................................................................................................. 26 2.1.4 WorldSens ................................................................................................................................ 26 2.1.5 Cooja (Contiki OS) .................................................................................................................. 27 2.1.6 SunShine (TOSSIM – SimulAVR - GEZEL) ......................................................................... 28 2.2 Cloud Simulation tools .................................................................................................................. 31 2.2.1 CloudSim .................................................................................................................................. 32 2.2.2 BigHouse – CactoSim - GreenCloud .................................................................................... 34 2.3 HPC Simulation tools .................................................................................................................... 36 3. COSSIM/ACSIM Design Choices ...................................................................................................... 39 3.1 An Overview of the Processor-Only Simulation Tools ............................................................ 40 3.1.1 Imperas Open Virtual Platforms (OVP) ............................................................................... 41 3.1.2 SimpleScalar ............................................................................................................................. 41 3.1.3 CPU Sim ................................................................................................................................... 42 3.1.4 ESCAPE ...................................................................................................................................
Recommended publications
  • Table of Contents
    A Comprehensive Introduction to Vista Operating System Table of Contents Chapter 1 - Windows Vista Chapter 2 - Development of Windows Vista Chapter 3 - Features New to Windows Vista Chapter 4 - Technical Features New to Windows Vista Chapter 5 - Security and Safety Features New to Windows Vista Chapter 6 - Windows Vista Editions Chapter 7 - Criticism of Windows Vista Chapter 8 - Windows Vista Networking Technologies Chapter 9 -WT Vista Transformation Pack _____________________ WORLD TECHNOLOGIES _____________________ Abstraction and Closure in Computer Science Table of Contents Chapter 1 - Abstraction (Computer Science) Chapter 2 - Closure (Computer Science) Chapter 3 - Control Flow and Structured Programming Chapter 4 - Abstract Data Type and Object (Computer Science) Chapter 5 - Levels of Abstraction Chapter 6 - Anonymous Function WT _____________________ WORLD TECHNOLOGIES _____________________ Advanced Linux Operating Systems Table of Contents Chapter 1 - Introduction to Linux Chapter 2 - Linux Kernel Chapter 3 - History of Linux Chapter 4 - Linux Adoption Chapter 5 - Linux Distribution Chapter 6 - SCO-Linux Controversies Chapter 7 - GNU/Linux Naming Controversy Chapter 8 -WT Criticism of Desktop Linux _____________________ WORLD TECHNOLOGIES _____________________ Advanced Software Testing Table of Contents Chapter 1 - Software Testing Chapter 2 - Application Programming Interface and Code Coverage Chapter 3 - Fault Injection and Mutation Testing Chapter 4 - Exploratory Testing, Fuzz Testing and Equivalence Partitioning Chapter 5
    [Show full text]
  • Getting Started Mikrocodesimulator Mikrosim 2010 and Code-Generator Mikrobat 2010
    Mikrocodesimulator MikroSim 2010 Code-Generator MikroBAT 2010 Getting Started Mikrocodesimulator MikroSim 2010 and Code-Generator MikroBAT 2010 On following pages a short overview of the possibilities to create microcoding machine codes with the Mikrocodesimulator MikroSim 2010 is given, and how applying it when assembling binary code with the Code-Generator MikroBAT 2010. By means of several specially selected examples, i.e. in which simply the register A (R1) of the Mikrocodesimulator is incremented by one, the functionality of the simulator is explained step by step. At first, in the so called exploration mode the 49-bit microcode is investigated in its functionality. Already only one single microcode itself can control the execution of the incrementation of a register. In the next example, the microcode simulation mode is explained, that uses already programmed microcode examples provided by the simulator setup routine. The incrementation can be achieved in a sequence of microcode instructions (Example 1), in a looped sequence of microcode instructions (Example 2) or in a looped sequence of microcoded machine codes (Example3). Finally, a microcoded machine code program is presented which calculates the factorial (i.e. n!) of an integer value. Installation of the Mikrocodesimulator MikroSim 2010 The Mikrocodesimulator MikroSim 2010 is distributed with its own installation program that ease you the installation on your MS-Windows system. At the beginning of the installation the user can choose between two installation languages: English and German. The setup language automatically determines the start up-language of the simulation application. Of course, in the application itself the user can choose at any time the start-up language of the user interface of the bilingual software.
    [Show full text]
  • A Generic Description and Simulation of Architectures Based on Microarchitectures
    UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL INSTITUTO DE INFORMÁTICA CURSO DE BACHARELADO EM CIÊNCIA DA COMPUTAÇÃO GUSTAVO GARCIA VALDEZ A Generic Description and Simulation of Architectures Based on Microarchitectures Monograph presented in partial fulfillment of the requirements for the degree of Bachelor of Computer Science Prof. Dr. Raul Fernando Weber Advisor Porto Alegre, December 5th, 2013 CIP – CATALOGING-IN-PUBLICATION Gustavo Garcia Valdez, A Generic Description and Simulation of Architectures Based on Microarchitectures / Gustavo Garcia Valdez. – Porto Alegre: Graduação em Ciên- cia da Computação da UFRGS, 2013. 64 f.: il. Monograph – Universidade Federal do Rio Grande do Sul. Curso de Bacharelado em Ciência da Computação, Porto Alegre, BR–RS, 2013. Advisor: Raul Fernando Weber. 1. Computer Architectures. 2. Machine Simulation. 3. Mi- croarchitectures. 4. Computer Organization. I. Weber, Raul Fer- nando. II. Título. UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL Reitor: Prof. Carlos Alexandre Netto Vice-Reitor: Prof. Rui Vicente Oppermann Pró-Reitor de Graduação: Prof. Sérgio Roberto Kieling Franco Diretor do Instituto de Informática: Prof. Luís da Cunha Lamb Coordenador do CIC: Prof. Raul Fernando Weber Bibliotecário-chefe do Instituto de Informática: Alexsander Borges Ribeiro Instruction tables will have to be made up by mathematicians with com- puting experience and perhaps a certain puzzle-solving ability. There need be no real danger of it ever becoming a drudge, for any processes that are quite mechanical may be turned over to the machine itself. (TURING, 1946) ACKNOWLEDGMENTS Firstly, I would like to thank my advisor Prof. Raul Fernando Weber for the support in my bachelor thesis, for everything that I learned from him in many classes during the bachelor and for envisioning the system that I would them use as conceptual basis for my project idea.
    [Show full text]
  • Enocs: an Interactive Educational Network-On-Chip Simulator
    Paper ID #14561 ENoCS: An Interactive Educational Network-on-Chip Simulator Paul William Viglucci, Binghamton University Prof. Aaron P. Carpenter, Wentworth Institute of Technology Professor Carpenter is an Assistant Professor at the Wentworth Institute of Technology. In 2012, he completed his PhD at the University of Rochester, focusing on the performance and energy of the on-chip interconnect. c American Society for Engineering Education, 2016 ENoCS: An Interactive Educational Network-on-Chip Simulator Paul Viglucci∗ and Aaron Carpentery ∗Dept. of Electrical & Computer Engineering Binghamton University, [email protected] yDept. of Electrical Engineering & Tech. Wentworth Institute of Technology, [email protected] Abstract On-chip networking concepts, which are central to multicore microprocessor design, are often taught using textbooks and the standard lecture model, as it is difficult to provide interactive learning opportunities and hands-on assignments. Available on-chip network simulators typically focus on research-level accuracy and are not suitable for novices or students. Meanwhile, with each new chip generation, the importance of the on-chip interconnect grows. Career opportunities in computer architecture will increasingly rely on an understanding of the chip’s communication substrate. The lack of interactive, approachable tools thus leaves students with a gap in their computer architecture education in an increasingly multicore industry. In this paper, we present ENoCS, the Educational Network-on-Chip Simulator, which allows users of all levels of expertise to explore the on-chip network environment and see the inner workings of the on-chip communication substrate and all of its components. ENoCS contains multiple topologies and network options, as well as multiple traffic patterns for testing.
    [Show full text]
  • Design of Embedded DSP Processors Unit 7: Programming Toolchain
    © Copyright of Linköping University, all rights reserved ® TSEA80 by Dake Liu: [email protected] Design of Embedded DSP Processors Unit 7: Programming toolchain 9/26/2017 Unit 7 of TSEA26 – 2017 –H1 1 © Copyright of Linköping University, all rights reserved ® TSEA80 by Dake Liu: [email protected] Toolchain introduction There are two kinds of tools 1.The ASIP design tool for HW designers 。Frontend tool: Profiler and architecture optimizer 。Backend tool: Processor (ASIP) synthesizer 。Pipeline accurate simulator: For HW debugging 2.The assembly coding tool for programmers 。C-compiler, Assembler, Linker, (Loader) 。 Instruction set simulator: for ASM programmers 。 ASM coding Debugger: for firmware designers 9/26/2017 Unit 7 of TSEA26 – 2017 –H1 2 © Copyright of Linköping University, all rights reserved ® TSEA80 by Dake Liu: [email protected] Job loads to develop an ASIP Processor HW architecture Toolchain and kernel lib SW Ecological env 9/26/2017 Unit 7 of TSEA26 – 2017 –H1 3 Product © Copyright of Linköping University, all rights reserved ® Source code TSEA80 by Dake Liu: [email protected] Specification Lexical analysis, parsing Compiler Understand CFG extraction Front-end Applications CFG Front-end Function and Architecture design Static Specification Annotation profiling ASM Dynamic Instruction set Stimuli Critical path Specification profiling Cost and ASIP design Performance document Estimation ASIP Designer Synthesizer Idea + Application analysis Benchmark nML language, PDG language and chess compiler Synthesized ASIP Architecture Assembly Constraint
    [Show full text]
  • System-Level Energy Optimisation Methodologies for DRAM Memory of Embedded Systems
    System-level Energy Optimisation Methodologies for DRAM Memory of Embedded Systems by Su Myat Min Shwe A Thesis Submitted in Accordance with the Requirements for the Degree of Doctor of Philosophy School of Computer Science and Engineering The University of New South Wales Nov 2013 ⃝c Copyright by Su Myat Min Shwe 2013 All Rights Reserved ii Thesis Publications • S. M. Min, H. Javaid, A. Ignjatovic and S. Parameswaran. A Case Study on Exploration of Last-level Cache for Energy Reduction in DDR3 DRAM. In 2nd Mediterranean Conference on Embedded Computing, MECO 2013 & ECyPS2013,´ Budva, Montenegro. • S. M. Min, H. Javaid and S. Parameswaran. XDRA: Exploration and Opti- mization of Last-level Cache for Energy Reduction in DDR DRAMs. In Design Automation Conference, DAC'13, USA, June 2013. • S. M. Min, H. Javaid and S. Parameswaran. RExCache: Rapid Exploration of Unified Last-level Cache. In Asia and South Pacific Design Automation Conference, ASP-DAC'13, Japan, Jan 2013. • S. M. Min, J. Peddersen, and S. Parameswaran. Realising Cycle Accu- rate Processor-Memory Simulation via Interface Abstraction. In VLSI Design (VLSI Design), 2011 24th International Conference on VLSI Design, VLSI'11, India, Jan 2011. vii Contributions of this Thesis • A novel interface abstraction layer between the processor and memory system is proposed to implement a cycle-accurate processor-memory system simulator, so that detailed statistics of the memory system, such as performance, and power consumption, can be captured cycle-accurately. • A novel estimation methodology of the execution time and energy consumption of the memory system is proposed. • A rapid exploration framework is presented to quickly estimate a suitable last- level cache configuration which enables maximum power savings with negligible performance degradation of the memory system.
    [Show full text]
  • Department of Computer Science and Engineering Curriculum and Syllabi – Regulations 2011
    DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING CURRICULUM AND SYLLABI – REGULATIONS 2011 HOURS/WEEK MAXIMUM MARKS S.NO CODE COURSE CREDITS L T P CA FE TOTAL SEMESTER - 1 THEORY 1 11USL101 Communication Skills - I 3 0 1 3 40 60 100 2 11USM101 Engineering Mathematics - I 3 1 0 4 40 60 100 3 11USC102 Chemistry for Computing Sciences 3 0 0 3 40 60 100 4 11UCK101 Fundamentals of Computing 3 0 0 3 40 60 100 5 11USP102 Physics for Computing Sciences 3 0 0 3 40 60 100 Basics of Electrical and Electronics 6 11UFK101 3 1 0 4 40 60 100 Engineering 7 11UCK102 History of Science and Engineering 1 0 0 1 100 - 100 PRACTICAL 1 11USH111 Physical Science lab I 0 0 3 1 40 60 100 2 11UCK103 Computing Practices Lab 0 0 3 2 40 60 100 3 11UAK108 Engineering Graphics Lab 1 0 3 2 40 60 100 TOTAL 20 2 10 26 HOURS/WEEK MAXIMUM MARKS S.NO CODE COURSE CREDITS L T P CA FE TOTAL SEMESTER - 2 THEORY 1 11USL201 Communication Skills - II 3 0 1 3 40 60 100 2 11USM201 Engineering Mathematics - II 3 1 0 4 40 60 100 3 11USC201 Environmental Science and Engineering 3 0 0 3 40 60 100 4 11UCK201 C Programming and Practices 3 1 0 4 40 60 100 5 11UAK201 Engineering Mechanics 3 1 0 4 40 60 100 6 11USP202 Science of Engineering Materials 3 0 0 3 40 60 100 PRACTICAL 1 11USH211 Physical Sciences Lab II 0 0 3 1 40 60 100 2 11UCK202 C Programming Lab 0 0 3 2 40 60 100 3 11UAK204 Engineering Practices Lab 0 0 3 2 40 60 100 TOTAL 18 3 10 26 HOURS/WEEK MAXIMUM MARKS S.NO CODE COURSE CREDITS L T P CA FE TOTAL SEMESTER - 3 THEORY 1 11USM301 Engineering Mathematics- III 3 1 0 4 40 60 100
    [Show full text]
  • POLITECNICO DI MILANO Master of Science Intelecommunication Engineering
    POLITECNICO DI MILANO Master of Science inTelecommunication engineering Electronics, Information and Bioengineering Department Visual Search modeling for end to end simulation of a Cyber Physical Systems Supervisor: Prof. Marco Marcon Advisor: Eng. Danilo Pietro Pau Eng. Emanuele Plebani Graduation thesis of: Shen Yun 835888 Acadmic year 2015 - 2016 Acknowledgements There are people whom I would like to acknowledge, for their assistance and support during my studies in Politecnico di Milano. I would like to thank all the wonderful teachers, colleagues, family, and friends whom I have been fortunate to interact with during my lifetime. I would like to take this opportunity to express my sincere gratitude and appreciation to my supervisor in STMicroelectronics, Danilo Pau for his countless efforts in guiding and encouraging me throughout my studies and work. His friendly attitude has been a very strong support for me to work with him. This work without his guidances and encouragements was not possible. I am so grateful to him. I am also thankful to Prof. Marco Marcon, for his valuable advices in monthly meetings and discussions, which was a plus point during my re- search. Without doubt, all these meetings together guided me into a bright way to handle the research and studies. I would like to give special thanks to Eng.Emanuele Plebani and Eng.Marco Brando Paracchini in STMicroelectronics for the uncountable helps and ad- vices they have given to me during my internship in STMicroelectronics. Also I have to thank my colleagues in the university, for their effortless helps, valuable advices and discussions. We had great and unforgettable times during all these years.
    [Show full text]