HW-SW COMPONENTS FOR PARALLEL EMBEDDED COMPUTING ON NOC-BASED MPSOCS

Ph.D. Thesis in Computer Science (Microelectronics and Electronics System Department)

Author: Jaume Joven Murillo
Supervisor: Prof. Jordi Carrabina Bordoll

UNIVERSITAT AUTÒNOMA DE BARCELONA (UAB)
December 2009, Bellaterra, Spain

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Prof. Jordi Carrabina Bordoll

This work was carried out at Universitat Autònoma de Barcelona (UAB), École Polytechnique Fédérale de Lausanne (EPFL), and the ARM Ltd. R&D Department (Cambridge).

© 2009 Jaume Joven Murillo. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the copyright owner.

Science is an imaginative adventure of the mind seeking truth in a world of mystery.

– Sir Cyril Norman Hinshelwood (1897-1967), English chemist, Nobel Prize 1956.

Abstract

In recent years, in the on-chip and embedded domain, we are witnessing the rise of the Multi-Processor System-on-Chip (MPSoC) era. Networks-on-Chip (NoCs) have been proposed as a viable, efficient, scalable, predictable and flexible solution to interconnect IP blocks on a chip, or full-featured subsystems, in order to create highly complex systems. Thus, the path to high-performance embedded computing goes through high hardware parallelism and concurrent software stacks, achieving maximum system platform composability and flexibility using pre-designed IP cores. These are the emerging NoC-based MPSoC architectures.

However, as the number of IP cores on a single chip increases exponentially, many new challenges arise. The first challenge is the design of a suitable hardware interconnection that provides adequate Quality of Service (QoS), ensuring certain bandwidth and latency bounds for inter-block communication, at minimal power and area costs. Due to the huge NoC design space, simulation and verification environments must be put in place to explore, validate and optimize many different NoC architectures.

The second target, nowadays a hot topic, is to provide efficient and flexible parallel programming models for the new generation of highly parallel NoC-based MPSoCs. Thus, it is mandatory to use lightweight SW libraries that are able to exploit the hardware features present on the execution platform. Using these software stacks and their associated APIs according to a specific parallel programming model will let software application designers reuse and program parallel applications effortlessly at higher levels of abstraction.

Finally, to obtain efficient overall system behaviour, a key research challenge is the design of suitable HW/SW interfaces. This is especially crucial in heterogeneous multiprocessor systems, where parallel programming models and middleware functions must abstract the communication resources during high-level specification of software applications.

Thus, the main goal of this dissertation is to enrich the emerging NoC-based MPSoCs by exploring and adding engineering and scientific contributions to the new challenges that have appeared in recent years. This dissertation focuses on all of the above points:

• by describing an experimental environment to design NoC-based systems, xENoC, and a NoC design space exploration tool named NoCMaker. This framework enables rapid prototyping and validation of NoC-based MPSoCs.

• by extending Network Interfaces (NIs) to handle heterogeneous traffic from different bus-based standards (e.g. AMBA, OCP-IP), in order to reuse and communicate a great variety of off-the-shelf IP cores and software stacks in a way that is transparent from the user point of view.

• by providing runtime QoS features (best-effort and guaranteed services) through NoC-level hardware components and software middleware routines.

• by exploring HW/SW interfaces and resource sharing when a Floating Point Unit (FPU) co-processor is interfaced on a NoC-based MPSoC.

• by porting parallel programming models, such as shared memory or message passing models, to NoC-based MPSoCs. We present the implementation of an efficient lightweight parallel programming model based on the Message Passing Interface (MPI), called on-chip Message Passing Interface (ocMPI). It enables the design of parallel distributed computing at task or function level using explicit parallelism and synchronization methods between the cores integrated on the chip.

• by providing runtime application-to-packet QoS support on top of the OpenMP runtime library targeted at shared memory MPSoCs, in order to boost or balance critical applications or threads during their execution.

The key challenges explored in this dissertation are formalized in a HW-SW communication-centric platform-based design methodology. The outcome of this work is a robust cluster-on-chip platform for high-performance embedded computing, whereby hardware and software components can be reused at multiple levels of design abstraction.

Keywords

NETWORKS-ON-CHIP (NOCS)
MULTI-PROCESSOR SYSTEMS-ON-CHIP (MPSOCS)
NOC-BASED MPSOCS
ON-CHIP INTERCONNECT FABRIC
EDA/CAD NOC DESIGN FLOW
HW-SW INTERFACES
CYCLE-ACCURATE SIMULATION
QUALITY-OF-SERVICE (QOS)
MESSAGE PASSING
PARALLEL PROGRAMMING MODELS

Acknowledgments

A large number of people have been indispensable to the successful completion of this Ph.D. thesis.

First of all, I would like to thank my advisor, Prof. Jordi Carrabina, for his guidance and support over these years. I would also like to express my deep gratitude to Prof. David Castells, Prof. Lluís Terés and Dr. Federico Angiolini for their assistance and technical support, and for listening and helping me every time I needed it through long research discussions and meetings.

Another special mention goes to Prof. Giovanni De Micheli, Prof. David Atienza and Prof. Luca Benini for opening the door and giving me the opportunity to do research at École Polytechnique Fédérale de Lausanne (EPFL) side by side with many brilliant people (thanks to Ciprian, Srini, Antonio,...), as well as in Bologna (thanks Andrea).

I am also deeply grateful to Per Strid, John Penton and Emre Ozer for giving me the chance to live an incredible working experience in a world-class company like ARM Ltd. at the Research & Development Department in Cambridge, as well as to all the researchers at the ARM Ltd. R&D Department (especially Derek, Akash and Antonino).

A special thanks to all the people I had the pleasure to work with in Barcelona (all my colleagues at the CEPHIS Lab, with special mention to Sergi and Eduard).

Throughout this Ph.D. I had the chance to travel around Europe and meet very interesting people. Special thanks to all the people I had the pleasure to meet during my Ph.D. research stays in Lausanne (the Nestlé "Spanish mafia", EPFL friends, ...), who helped me adapt to another country and surely made me live one of the best periods of my life.

Thanks to all my friends who, even though they were never quite convinced whether I was working, studying, or both, during the last four years, encouraged me to chase the goal of finishing the Ph.D.

Last but not least, I am deeply grateful to my family for supporting me unconditionally and for teaching me to always do as much as I can to reach my goals. This dissertation is dedicated to them.

I cannot finish without acknowledging the institutions that supported me and my research with funding and infrastructure over these years. This Ph.D. dissertation has been mainly supported by the Agència de Gestió d'Ajuts Universitaris i de Recerca (AGAUR) and the Spanish Ministry of Industry, Tourism and Trade (MITyC) through TEC2008-03835/TEC and "ParMA: Parallel Programming for Multi-core Architectures" (ITEA2 Project No. 06015, June 2007 – May 2010), as well as by the collaboration grants awarded by the European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC).

Thanks to all of you...

...and now, in this "crisis" era, lots of fellows have a B.A., M.D., M.Phil. or Ph.D., but unfortunately they don't have a J.O.B.

I hope this does not happen to me, and I am impatiently awaiting new positive opportunities to do what I like.

Contents

Abstract

Keywords

Acknowledgments

Contents

List of Figures

List of Tables

Listings

1 Introduction
  1.1 Background
  1.2 NoC-based MPSoC: Opportunities and Challenges
  1.3 Contributions of This Dissertation
  1.4 Dissertation Outline

2 Experimental Framework for EDA/CAD of NoC-based Systems
  2.1 Networks-on-Chip: Preliminary Concepts
    2.1.1 NoC Paradigm - General Overview
    2.1.2 Topologies
    2.1.3 Routing Protocol and Deadlock Freedom
    2.1.4 Switching Communication Techniques
    2.1.5 Buffering and Virtual Channels
    2.1.6 Flow Control Protocols
    2.1.7 Micro-network Protocol Stack for NoCs
  2.2 Motivation and Key Challenges
  2.3 Related Work on NoC Design and EDA Tools
  2.4 xENoC Environment - NoCMaker
    2.4.1 Design Flow of NoCMaker
    2.4.2 NoC Modeling
    2.4.3 Metrics Extraction According to Design Space Points
    2.4.4 Modeling Traffic Patterns and Benchmarks
    2.4.5 System and Circuit Validation
  2.5 Experiments using xENoC – NoCMaker
    2.5.1 Synthesis of Nios II NoC-based MPSoC Architectures
  2.6 Conclusion and Future Work

3 Full-Featured AMBA AHB Network Interface OCP Compatible
  3.1 General Concepts of ×pipes NoC Library
  3.2 Motivation and Challenges
  3.3 AMBA & OCP-IP Interconnection Fabric Overview
    3.3.1 AHB/AXI Architecture Overview
    3.3.2 OCP-IP Overview
  3.4 Related Work
  3.5 Full-Featured AMBA 2.0 AHB NI for ×pipes NoC
    3.5.1 Packet Definition and Extension
    3.5.2 AMBA 2.0 AHB Transactions to NoC Flits
  3.6 Experimental Results
    3.6.1 Functional Verification on an AMBA-OCP Star NoC
    3.6.2 Synthesis of AMBA 2.0 AHB NI vs. OCP NI
  3.7 Conclusion and Future Work

4 Runtime QoS Support Through HW and Middleware Routines
  4.1 Motivation and Challenges
  4.2 Related Work
  4.3 NoC-level QoS Design
    4.3.1 Trigger QoS at Network Interfaces
    4.3.2 QoS Extensions at Switch Level
  4.4 Low-level QoS-support API and Middleware Routines
  4.5 QoS Verification and Traffic Analysis
  4.6 Synthesis Results
    4.6.1 QoS Overhead at Network Interface
    4.6.2 QoS Overhead at Switch Level
  4.7 Conclusion and Future Directions

5 HW-SW FPU Accelerator Design on a Cortex-M1 Cluster-on-Chip
  5.1 Motivation and Challenges
  5.2 Related Work
  5.3 Cortex-M1 FPU Accelerator – HW Design
  5.4 HW-SW CPU-FPU Communication Protocol
  5.5 FPU Hardware-Dependent Software Library Extensions
  5.6 Experimental Results
    5.6.1 Cortex-M1 SP FPU Hardware Synthesis
    5.6.2 Performance Studies – HW-FPU vs. SW Emulation
  5.7 Conclusion and Future Work

6 Parallel Programming Models on NoC-based MPSoCs
  6.1 A General Overview of Parallelism
  6.2 Traditional Parallel Programming Models
  6.3 Motivation and Key Challenges
  6.4 Related Work
  6.5 Embedding MPI for NoC-based MPSoCs
    6.5.1 Message Passing Interface (MPI) - General Concepts
    6.5.2 MPI Adaptation Towards NoC-based MPSoCs
    6.5.3 Encapsulation, and on-chip MPI Packet Format
    6.5.4 A Lightweight on-chip MPI Microtask Stack
    6.5.5 Software Stack Configurations
    6.5.6 Message Passing Communication Protocols
    6.5.7 Improving ocMPI Through Tightly-Coupled NIs
  6.6 OpenMP Framework for NoC-based MPSoCs
  6.7 Evaluation of on-chip Message Passing Framework
    6.7.1 Profiling the Management ocMPI Functions
    6.7.2 Low-level Message Passing Benchmarks
    6.7.3 Design of Parallel Applications using the ocMPI Library
  6.8 Improving OpenMP Kernels using QoS at NoC Level
    6.8.1 Balancing and Boosting Threads
    6.8.2 Communication Dominated Loop
    6.8.3 Computation Dominated Loop
    6.8.4 Balance Single Thread Interference
  6.9 Conclusion and Open Issues

7 Conclusions and Future Work
  7.1 Overview and General Conclusions
  7.2 Open Issues and Future Work

A Author's Relevant Publications

B Curriculum Vitae

Glossary

Bibliography

List of Figures

1.1 HW-SW evolution of the productivity gap
1.2 Evolution of microelectronics, from ASICs to NoCs in SoCs
1.3 Evolution of the communication fabrics in MPSoCs
1.4 HW-SW communication-centric platform-based design

2.1 Logical view of a NoC-based system
2.2 Classification of NoC design choices
2.3 Representative subset of NoC topologies
2.4 Classification of NoC systems (granularity vs. homogeneity)
2.5 Turn model for deadlock-free adaptive routing
2.6 Packet transmission using different switching techniques
2.7 Micro-network stack for NoC-based system
2.8 State of the art of academic EDA NoC tools
2.9 Design space exploration – design effort vs. flexibility
2.10 NoCMaker design flow (synthesis and metrics extraction)
2.11 Definition of a NoC design point in NoCMaker
2.12 Point-to-point statistics and traffic analysis in NoCMaker
2.13 Interactive simulation window in NoCMaker
2.14 Area usage window in NoCMaker
2.15 Overall power usage window statistics in NoCMaker
2.16 Custom/pre-defined synthetic traffic patterns in NoCMaker
2.17 Verification of NoC-based systems using NoCMaker
2.18 Packet sequence analysis window in NoCMaker
2.19 Nios II NoC-based MPSoC architecture
2.20 Circuit switching vs. ephemeral circuit switching
2.21 Synthesis of Nios II NoC-based MPSoCs
2.22 Area breakdown of our Nios II NoC-based MPSoC

3.1 NI overview on a multi-protocol NoC-based system
3.2 Block diagram of AMBA 2.0 AHB (bus vs. multi-layer)
3.3 AXI overview, and AXI vs. AMBA 2.0 AHB transactions
3.4 AHB NI block diagram in ×pipes NoC
3.5 AMBA 2.0 AHB – OCP request packet
3.6 AMBA 2.0 AHB – OCP response packet
3.7 AMBA 2.0 AHB NI request FSM
3.8 AMBA 2.0 AHB NI receive FSM
3.9 AMBA 2.0 AHB NI resend FSM
3.10 AMBA 2.0 AHB NI response FSM
3.11 Single switch AMBA 2.0 AHB – OCP star topology
3.12 Synthesis of AMBA 2.0 AHB NI vs. OCP NI
3.13 Circuit performance (fmax), AMBA 2.0 AHB vs. OCP NIs

4.1 QoS extension on the header packets
4.2 Architecture overview of ×pipes switch
4.3 Allocator block diagram & arbitration tree
4.4 Topology of 5x5 clusters NoC-based MPSoC
4.5 Validation and traffic prioritization under different scenarios
4.6 AMBA – OCP Network Interfaces (QoS vs. noQoS)
4.7 QoS impact (LUTs, fmax) at NI level (AMBA-OCP)
4.8 Switch synthesis varying the arity (QoS vs. noQoS)
4.9 QoS impact (fmax, LUTs) at switch level
4.10 QoS impact (LUTs, fmax) according to the priority levels

5.1 Single and double precision IEEE 754 representation
5.2 Cortex-R4 Floating Point Unit (FPU)
5.3 Cortex-M1 AHB SP memory mapped FPU accelerator
5.4 SP FPU accelerator memory mapped register bank
5.5 Floating Point OpCode register (FP Opcode) layout
5.6 Floating Point System ID register (FPSID) layout
5.7 Floating Point Status and Control register (FPSCR) layout
5.8 Floating Point Exception Register (FPEX) layout
5.9 CPU – FPU hardware-software protocol
5.10 Cortex-M1 MPSoC cluster architecture
5.11 FPU execution times using different CPU-FPU notifications
5.12 Cortex-M1 SP FPU speedup vs. SW emulation FP library
5.13 FPU execution times on Cortex-M1 MPSoC
5.14 HW FPU speedup vs. SW emulation on Cortex-M1 MPSoC
5.15 HW-SW interface co-design on embedded SoC architectures

6.1 Multiple levels and ways of parallelism
6.2 On-chip message passing interface libraries
6.3 Logical view of communicators and messages in MPI
6.4 Patterns of the MPI collective communication routines
6.5 MPI adaptation for NoC-based systems
6.6 onMPI/j2eMPI message layout
6.7 Different software stack configurations (ocMPI library)
6.8 Eager vs. rendezvous message passing communication
6.9 Tight-coupled NI block diagram
6.10 HW-SW view of our OpenMP platform
6.11 Profiling of the management ocMPI functions
6.12 Comparison of a bus-based vs. tight-coupled NI
6.13 Execution time of ocMPI_Init() & ocMPI_Barrier()
6.14 Execution time of point-to-point and broadcast in ocMPI
6.15 Parallelization of cpi computation
6.16 Parallelization of Mandelbrot set application
6.17 QoS applied to unbalanced OpenMP programs
6.18 Balancing unbalanced OpenMP programs using QoS

7.1 SoC consumer design complexity trends
7.2 Thesis overview and NoC design space overview
7.3 HW-SW design costs on future SoC platforms


List of Tables

1.1 Bus-based vs. on-chip network architectures
1.2 State of the art in MPSoCs
1.3 NoC research topics according to the abstraction levels

2.1 Buffering costs for different packet switching protocols

3.1 Mapping AHB – OCP signals on the request header

5.1 Cortex-M1 SP FPU accelerator core latencies
5.2 Synthesis of single and double precision PowerPC FPU
5.3 Cortex-M1 SP FPU synthesis on FPGA devices
5.4 Synthesis of Cortex-M1 MPSoC cluster without FPU
5.5 Synthesis of Cortex-M1 MPSoC cluster with FPU

6.1 Parallel architectures according to memory models
6.2 Supported functions in our on-chip MPI libraries
6.3 cpi and Mandelbrot set problem size parameters
6.4 Parameters of our OpenMP NoC-based MPSoC

Listings

2.1 XML example of a NDSP in NoCMaker
3.1 AMBA-OCP topology description file
4.1 QoS encoding on header packets
4.2 Low-level middleware NoC services API
4.3 QoS middleware functions
5.1 CPU – FPU protocol definition
5.2 FSUB implementation using a HW FPU accelerator
5.3 HW FPU accelerator software routines
5.4 Floating point software emulated routines
6.1 Tight-coupled NI instructions

CHAPTER 1

Introduction

Nowadays, the continuous evolution of technology (i.e. Moore's law, which states that the number of transistors on a chip doubles roughly every 18 months) means that every INTEGRATED CIRCUIT can contain a very large number of transistors, and this number will keep growing until 2020 according to the prediction of the International Technology Roadmap for Semiconductors [1]. Under this prediction, designers must change their methodologies and the associated EDA tools to exploit all the potential of the technology effectively. Furthermore, from 2005 onwards, due to the importance of the software components (i.e. SW stacks, libraries and stand-alone components) that full-featured systems require, the productivity gap also includes the software required on top of the hardware platform. This software is crucial to enable flexibility and programmability by exploiting all the available hardware features. In addition, to boost microelectronics productivity and especially to reduce the hardware gap between technology and design productivity, designers must change their mindset when planning the chip design process. Figure 1.1 shows the evolution of the productivity gap over the last years.

Because of this law, during the last 30 years microelectronics has experienced a continuous and rapid evolution in many aspects, such as:

• Design methodologies (i.e. bottom-up, top-down, meet-in-the-middle, communication-centric...).

• COMPUTER-AIDED DESIGN (CAD) and ELECTRONIC DESIGN AUTOMATION (EDA) tools.

• Abstraction levels of design (i.e. layout, schematic, system-level...).

• Technology integration (i.e. LSI, VLSI,...).

Figure 1.1: HW-SW evolution of the productivity gap

As shown in Figure 1.2, in the 70s, when microelectronics saw the light, designs were based on a bottom-up methodology, which consisted in using the layout design rules and transistor models or, later, a library of basic cells, to implement the functionality of the IC. During this decade, EDA tools let designers create the schematic and the layout of their IC quickly, but designers were not able to simulate the whole IC until it was completely developed. In addition, designers were obliged to select a particular technology and, as a consequence, the corresponding library of basic cells. Thus, a design error would probably only be detected at the block composition level, causing big problems during the last phases of design. Therefore, to avoid these portability problems and the lack of a globally verifiable view, Hardware Description Languages (HDLs) emerged, changing this methodology to a top-down approach. Using a top-down methodology, designers were able to specify all requirements in the first phases of design in order to verify functional requirements, later implement the circuit using synthesis tools based on portable (technology-independent) hardware description languages, map the circuit to a specific technology, and after that check whether the IC fulfils the detailed performance requirements (using back-annotation). If it does not, designers had to go through a feedback phase in order to improve the design (changing the design or selecting another technology) until it fulfils the requirements; this is the spiral model [2].


(Figure content: timeline of design complexity and methodology, from bottom-up layout/gate-level designs of about 20K gates at 3.0 µm, through top-down HDL/RTL designs of 50–500K gates, to SoC designs of 10–130M gates at 250–130 nm reusing IP-core libraries with HW-SW co-design, and finally to communication-centric NoC-based designs of around 1G gates below 130 nm, over the 1975–200? time span.)

Figure 1.2: Evolution of microelectronics, from ASICs to NoCs in SoCs

Furthermore, HDLs let designers move towards higher levels of abstraction thanks to design certification (sign-off) based on library characterization. So, for most digital designs it was no longer necessary to design at transistor level: using HDLs, designers were able to work at Register Transfer Level (RTL) descriptions and above. Designing at RTL made it possible to build large chips within shorter design cycles, and this RTL design methodology is still today the de facto standard for making large cell-based ASICs quickly.

Despite the advantages of RTL design, during the 90s designers realized that the existing methodologies were not enough to reduce the productivity gap, the design costs and the time-to-market. Therefore, designers decided to set up strategies to reuse previously implemented and tested RTL designs, known as Intellectual Property (IP) cores. The reuse of these IP cores, together with HW-SW co-design strategies and the improvement of EDA tools, was the solution adopted to face part of that problem by boosting productivity. Thus, more complex IPs could be designed, from layouts containing special transistor structures, to functional blocks such as ALUs and multipliers, up to microprocessors, ASIPs, VLIWs, DSPs (cores and surrounding infrastructure) and also verification IPs, which have become the primitive building blocks when designers implement an IC. For these reasons, designers keep moving towards higher levels of abstraction, describing the system's functionality using these components and their related methodologies. Thus, designers are able to realize complex ICs, reducing the design time (and therefore the time-to-market).


However, the increase in design productivity was not enough to close the gap, and technology scaling problems appeared. Deep Submicrometer (DSM) and Sub-Wavelength (SWL) technology effects and interconnection complications appear when designers develop complex ICs with many IP cores. The most common submicron effects are cross-coupling, noise and transient errors, and the loss of global clock synchrony. The clock signal will soon need several clock cycles to traverse the chip, clock skew becomes unmanageable, and the clock distribution tree is already today a major source of power consumption and area cost. In addition, with higher clock frequencies these problems become more significant.

On the other hand, the interconnection problems are also a huge issue, because technology scaling works better for transistors than for interconnection wires [3] (even though the number of metal layers is also increasing). This gradually leads to the dominant contribution of wires to performance figures, power consumption and speed. All these reasons carry us to the obvious conclusion that we need something more than IP-core reuse alone in order to develop complex ICs with billions of transistors in the near future.

The next step forward in design methodologies was to move from bottom-up or top-down methodologies towards a hybrid combination of both, called Meet-in-the-Middle. It leverages the power of top-down methods and the efficiency of bottom-up styles. This methodology, together with HW-SW co-design, lets designers develop HW and SW components concurrently, and also select a HW-SW interface infrastructure at the moment of designing a SoC. In addition, it balances design time with implementation automation to get first-time-right designs. From this meet-in-the-middle approach, a more advanced and generic methodology that abstracts architectural concepts emerged around the year 2000: platform-based design [4, 5]. This methodology is also based on the reuse and integration of HW and SW components and, furthermore, it extends its predecessors by also claiming the reuse of system architectures and topologies. Moreover, this methodology defines several independent levels, such as virtual and real platforms. Finally, it also includes domain-specific software layers that help in application development, and an industrialization layer that helps to reduce costs.

To achieve all these trends, an emerging paradigm called SYSTEM-ON-CHIP (SOC), or MULTI-PROCESSOR SYSTEM-ON-CHIP (MPSOC), and its associated concepts appeared in order to design the new generation of ICs in the embedded domain.


MPSoCs [6, 7] are used in many different applications and environments, such as mobile multimedia (e.g. mobile computing, video and audio coding, photography), robotics and automation (e.g. virtual reality), SDR communications (e.g. GPS, 3G, UMTS, HSDPA), networking, health care devices, stand-alone aerospace and automotive systems, and many more. Thus, these systems will require between tens and hundreds of IP cores on a single die; the trend is that the number of IP cores on a single chip doubles every two years [8].

Depending on the system composability and the traffic diversity, we can talk about homogeneous (e.g. Symmetric Multiprocessing (SMP)) or heterogeneous (e.g. Asymmetric Multiprocessing (ASMP)) systems, and about general-purpose or application-specific MPSoCs. Homogeneous MPSoCs are composed of many identical cores (e.g. exclusively replications of the same processor), which allows for performance, power efficiency and high reuse. On the other side, heterogeneous MPSoCs integrate specialized cores (e.g. DSPs and VLIWs together with microprocessors and specialized HW blocks) around a main processor to provide specialized processing; these cores are not identical from the programming and implementation point of view. For many applications, heterogeneous AMP is the right approach, mainly due to power efficiency, but homogeneous SMP is still attractive when the applications cannot be predicted in advance.

1.1 Background

A critical piece of an MPSoC is the communication infrastructure, especially when the number of IP cores integrated on the system increases. The communication fabrics interconnecting these IP cores have been organized in different architectures during the MPSoC era (see Figure 1.3).

In the beginning, due to the simplicity of the designs and the uniformity of the traffic, MPSoCs used a point-to-point communication scheme [9] (see Figure 1.3(a)). In this architecture, all IP cores were able to communicate with each other (when required) only by using dedicated wires. The communication infrastructure is fixed between all IP cores rather than reusable and flexible. Despite this fact, a point-to-point architecture is optimal in terms of bandwidth availability, latency and power usage. Normally, this architecture is used to design application-specific ICs.

Once IP cores became established in the market, a big problem cropped up: how could designers interconnect these IP cores? A complete design can integrate many IP cores, and each IP core might have a different interface because it can come from different partners or third parties.


Figure 1.3: Evolution of the communication fabrics in MPSoCs: (a) point-to-point, (b) shared bus, (c) crossbar, (d) hierarchical bus, (e) on-chip network

So, two approaches emerged to standardize this interconnection: one is known as VSIA [10, 11], with its own concept of a Virtual Component Interface (VCI), and the other is OCP-IP [12]. Both included the definition of a standard communication socket, allowing IP cores from different vendors to be plugged together effortlessly, as well as a set of architectural rules (e.g. separated address and data buses). These rules, mainly VSIA's, were widely adopted for the different interconnection structures, principally busses, of the processor core producers in the dominant processor-centric approach, whereas OCP-IP was better suited to a communication-centric approach. Thus, the design of complex MPSoCs relies on the existence of EDA tools for design space exploration to help designers integrate the cores into the system effortlessly.

In consequence, to get more flexibility, the next step forward was to move from this point-to-point architecture to a Shared Bus infrastructure, as shown in Figure 1.3(b), often called an On-Chip Bus (OCB). On it, all IP cores (i.e. CPUs, DMAs, memory controllers,...) are connected through the same backbone, i.e. sharing the communication resources, and devices are memory mapped on the address space of the bus, which leads to a global component view of the whole system. Since the introduction of bus-based SoC architectures at the end of the 90s, the solution to implement SoC communication has generally been characterized by custom ad-hoc designs mixing buses and point-to-point links.

These kinds of architectures are built around well-understood concepts, and they are extremely simple and easy to design. However, for the next generation of complex MPSoCs with many functional units and traffic loads, this shared backbone quickly becomes a communication bottleneck, since only one pair of devices can communicate at a time. In other words, as more units are added to the bus, its usage per communication event grows, leading to higher contention and bandwidth limitations. The longer wires, due to the increasing number of components along the chip, and their associated capacitance also lead to a severe scalability problem and a dramatic reduction of the overall operating frequency of the system. This gets even worse as the technology scales down. Finally, for multi-master busses, the arbitration policy (e.g. fixed priorities, round robin, or time-slot access) is also a non-trivial problem.

Bus pros and cons vs. on-chip network pros and cons:

- Bus (−): Every unit attached adds parasitic capacitance, so electrical performance degrades with growth. Network (+): Only point-to-point one-way wires are used for all network sizes, so local performance is not degraded when scaling.
- Bus (−): Bus timing is difficult in a DSM process. Network (+): Routing decisions are distributed, if the network protocol is made non-central.
- Bus (−): The bus arbiter is instance specific. Network (+): The same router may be reinstantiated for all network sizes (communication reusability).
- Bus (−): Bus testability is problematic and slow. Network (+): Locally placed dedicated BIST is fast and offers good test coverage.
- Bus (−): Bandwidth is limited and shared by all units. Network (+): Aggregated bandwidth scales with the network size.
- Bus (+): Bus latency is wire-speed once the arbiter has granted control. Network (−): Internal network contention may cause large latencies.
- Bus (+): Any bus is almost directly compatible with most available IPs, including software running on CPUs. Network (−): Bus-oriented IPs need smart wrappers, and software needs clean synchronization in multiprocessor systems.
- Bus (+): The concepts are simple and well understood by designers. Network (−): System designers need re-education for the new concepts.

Table 1.1: Bus-based vs. on-chip network architectures (adapted from [13])
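To make the bandwidth rows of Table 1.1 concrete, the following first-order expressions (an illustrative model added here, not taken from the thesis, assuming uniform traffic and ignoring arbitration and contention overheads) contrast how the available bandwidth evolves in the two cases:

\[
B_{\text{core}}^{\text{bus}} \approx \frac{B_{\text{bus}}}{N}
\qquad \text{vs.} \qquad
B_{\text{aggregate}}^{\text{NoC}} \approx L \cdot B_{\text{link}},
\]

where $N$ is the number of cores sharing a single bus of raw bandwidth $B_{\text{bus}}$, and $L$ is the number of links that can be used concurrently in the NoC, each of bandwidth $B_{\text{link}}$. For a $k \times k$ 2D mesh, $L$ grows as $O(k^2)$ while the average hop count grows only as $O(k)$, which is why the aggregated bandwidth scales with the network size whereas the per-core share of a shared bus shrinks as $1/N$.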

Along this bus-based era, many standard bus-based SoCs have been developed by silicon companies. The most widespread pairs of OCBs and embedded processor cores used to build SoCs or MPSoCs are the following:

• AMBA 2.0 [14, 15] ↔ ARM or LEON3 [16].

• IBM Coreconnect [17] ↔ PowerPC/Microblaze [18].

• Avalon [19] ↔ Nios II [20].

In consequence, to face these limitations, pure shared bus-based systems shifted towards Hierarchical [14, 17] or Multi-Layer [27] global communication architectures.


In these architectures, multiple buses are interconnected through bridges (see Figure 1.3(d)) or through full or partial crossbars (see Figure 1.3(c)). This evolution makes long wires shorter thanks to segmentation, avoiding signal degradation, and busses are implemented as multiplexed structures to reduce capacitance (and therefore power usage) and increase responsiveness (see Table 1.1). Many MPSoCs targeted at the embedded domain have been manufactured by different companies. Table 1.2 presents a summary of the most widespread MPSoC platforms targeted at multimedia, networking and wireless mobile applications.

Philips Nexperia™ [21]: MIPS PR3940 and Trimedia TM32 VLIW; three buses (memory, MIPS and Trimedia); 2D rendering and MPEG-2 video decoding, image composition, video input processor, scaler and system processor; I/O: UART, PCI, IEEE 1394, etc.; OS: Windows CE, Linux, Wind River VxWorks; uses semaphores to access shared resources; used in multimedia applications, digital video systems, DTV and set-top boxes; the first experimental family was the Viper PNX-8500 [22].

TI OMAP™ [23]: ARM7, MCU and TMS320C54x for GPRS/GSM, plus ARM926EJ-S; 2D graphics accelerator, shared memory, frame buffer, cryptography engines; supports all leading OSes; I/O: Bluetooth, USB, UART, IrDA, WLAN; supports 2.5G and 3G mobile applications; security processing, gaming and mobile multimedia.

ST Nomadik™ [24]: heterogeneous MPSoC (ARM926EJ-S) with AMBA bus and video/audio accelerators (MMDSP+ ST CPU); I/O: UART, USB, Bluetooth, WLAN, GPS, FIrDA; 20 mW decoding MPEG-4 at 15 fps (QCIF) plus MP3 audio; supports the OMAP architecture; security processing, gaming and mobile multimedia applications (MPEG-4, H.263).

Cell Processor [25]: heterogeneous MPSoC (PPE + SPEs); the PPE is a 64-bit dual-threaded PowerPC microprocessor, with 8 SPEs (SIMD processors with 128 registers of 128 bits); Element Interconnect Bus; supports VMX; consumes 60–80 W at 4 GHz (256 GFLOPS single-precision and 25 GFLOPS double-precision floating point); used in 3D game consoles; cryptography, graphics transform and lighting, physics, FFTs, matrix operations.

Table 1.2: State of the art in MPSoCs (extracted from [26])


In response to these issues, more advanced busses [28–30], advanced multi-layer communication infrastructures [31] and IP-socket-based protocols, such as AMBA-AXI [32] or OCP-IP [12], have been developed, mainly to provide more flexible topologies, smarter distributed arbitration schemes, and more sophisticated handshake protocols (i.e. out-of-order transactions, multiple burst types, pipelined transactions, etc). However, these approaches are still sub-optimal due to the inherent bus limitations explained before in Table 1.1, and they are definitely not suitable for the large, scalable next-generation MPSoCs, as reported in [33–38].

Therefore, a natural evolution of the original shared, hierarchical bus-based architectures, and its associated paradigm, has been proposed by the research community to face the design of the next generation of SoCs and MPSoCs. This paradigm is built upon the NETWORKS-ON-CHIP (NOCS) concept [3, 13, 39–42]. We view a SoC as a micro-network of components, where the network is the abstraction of the communication among components and must satisfy many constraints (e.g. Quality-of-Service requirements, reliability, performance and energy bounds, etc). Using this on-chip network communication architecture, the throughput is increased by pipelining the point-to-point wires, but also by having parallel channels, since the bus bridges become distributed routing nodes.

As a replacement for busses and point-to-point links, NoCs hold the potential of scalability from the ground up and of a modular and flexible design flow, and they address fundamental physical-level problems introduced when scaling microchip technologies into DSM geometries. In addition, NoCs greatly improve the topological scalability of SoCs, and show higher power efficiency in complex SoCs compared to buses.

In the research community, there are two widely held perceptions of NoCs:

• NoC as a subset of SoC.

• NoC as an extension of SoC.

In the first view, the NoC is defined strictly as the data-forwarding communication fabric, i.e. the network and the methods used in accessing the network. In the second view, the NoC is defined more broadly (like a composition of SoCs), also encompassing issues dealing with the application, the system architecture, and their impact on communication, or vice versa.

In contrast to the traditional SoC communication schemes outlined above in Table 1.2, the distributed, segmented scheme used in NoCs, with specific communication links and routing nodes, scales well in terms of chip size and complexity.

A priori, using NoCs it will be possible to integrate tens to hundreds of cores in a single chip without losing throughput. An additional advantage of this on-chip network architecture is that it is able to increase aggregated performance by exploiting parallel operations, which is a very important issue in MPSoCs.

To solve the clock distribution and global clock synchrony problems, a Globally Asynchronous Locally Synchronous (GALS) scheme has been proposed to design NoC systems. Using this scheme, it is not necessary to use a global clock for the whole NoC. Each tile can have an independent clock, and global synchronization is implemented in an asynchronous style. Therefore, global chip clock synchrony is not necessary and the clock tree complexity is reduced. Thus, the GALS scheme avoids increasing clock skew problems and notably improves power consumption.

Despite the large number of methodologies presented in this chapter, MPSoC designs are progressively requiring a Communication-Centric and System-Level Platform-Based design methodology. This methodology lets designers reuse both HW and SW components but also, and this is the most important innovation, the communication architecture. In terms of reusability, NoCs let designers reuse both the computation nodes (using pre-designed IP cores) and the communication infrastructure (reusing routers/switches and network interfaces (NIs) or hardware wrappers), which is crucial to reduce the productivity gap presented in Figure 1.1.

In conclusion, NoCs constitute a viable solution paradigm to communicate SoCs, or future scalable large MPSoCs. Thus, the trend in the near future will be the custom design of NOC-BASED MPSOC architectures.

1.2 NoC-based MPSoC: Opportunities and Challenges

The term NoC is used in research today in a very broad sense, ranging from gate-level physical implementation, across system layout aspects and applications, to design methodologies and tools (see Table 1.3). Hence, many opportunities and challenges arise, especially in hardware design, but once the hardware design is fully functional and optimized, lots of research topics emerge, such as parallel programming models, HW-SW interfaces, application mapping, run-time reconfiguration, O/S support, and multiple memory hierarchy options.


System/Application Level (NoC-based systems):
- Design methodologies, co-exploration, modeling, simulation, EDA tools
- Architecture domain: mapping of applications onto NoCs, system composition, clustering, reconfigurability, chip multiprocessors, SMPs or ASMPs
- NoC benchmarks
- Traffic modeling: latency-critical, data-stream and best-effort
- Parallel programming models
- High-level languages to exploit concurrency and parallelism
- O/S support for NoC
- Power and thermal management, Dynamic Voltage and Frequency Scaling (DVFS)
- Run-time services according to NoC features

Network Interfaces:
- Functionality: encapsulation, service management
- Sockets: IP reuse

Network Architecture:
- Topology: regular vs. irregular topologies, switch layout
- Protocols: routing, switching, control schemes, arbitration
- Flow control: deadlock avoidance, virtual channels, buffering
- Quality of Service: service classification and negotiation
- Features: error-correction, fault tolerance, broadcast/multicast/narrowcast, virtual wires

Link Level:
- Timing, synchronous/asynchronous, reliability, encoding
- Physical design of interconnect
- Signaling and circuit design for links
- NoCs for FPGAs and structured ASICs
- 3D NoC integration

Table 1.3: NoC research topics according to the abstraction levels

There are several key design questions that must be answered when developing NoC-based MPSoCs for specific next-generation embedded applications:

• Extraction of concurrency from the application software or models: decades of research have gone into trying to automate the discovery of concurrency in software, but very little in the way of generally applicable results has emerged. Therefore, the integration of NoC architectures with parallel programming models is the next big challenge.

• Homogeneity and heterogeneity: for many applications, only a heterogeneous AMP approach is valid, but homogeneous SMP is also attractive when the applications cannot be predicted in advance. How can we combine both approaches where needed? How do the two domains communicate and synchronize?

• How many processors are required for an application, and are they best designed or configured as the composition of discrete subsystems or as the melding of them to share resources?


• What kind of on-chip communication architectures are best? Does NoC have wide applicability? How do we effectively mix NoC, buses, and point-to-point approaches?

• Scalability: it is relatively easy to visualize how to get to 10, 20, and 100 processors in a complex SoC by composing many multiprocessor subsystems. How do you get to 500, 1000 or more? Do we even want to? What are the practical limits?

In this dissertation we will try to answer part of these general questions. The novelty and the contributions come from the integration of different ideas from different research areas rather than from exploiting specific research topics. Thus, applying well-known concepts and adapting them to the existing requirements leads to a novel and reusable component approach to create a working framework for high-performance embedded computing on the emerging parallel NoC-based MPSoC architectures. Section 1.3 presents the main contributions of this Ph.D. dissertation.

1.3 Contributions of This Dissertation

This dissertation aims to shed light on the trade-offs in the design of NoC-based MPSoC systems at many different levels, ranging from specific NoC-level features to their integration and programmability through high-level parallel programming models at application level. This vertical approach also takes into account the design of efficient hardware-software interfaces, and the inclusion in the system of full-featured cores to achieve high-performance embedded computing. The main contributions are:

• The contribution of a NoC component library for simulation and synthesis, called NoCMaker, within the xENoC environment. This library is highly configurable and targeted at evaluating the interconnection of different homogeneous and heterogeneous NoC-based MPSoCs.

• The analysis of NoC performance and cost in a variety of environments and traffic conditions, as well as in real parallel applications. Results include architectural simulations and back-end analysis of operating frequency, area and power consumption.

• The extension of NoCs with facilities able to guarantee the interoperability of IP cores compliant with the OCP-IP and AMBA 2.0 standards. This is a key feature in today's system design in order to enable the reuse and the transparent integration of different types of IP cores on a full-featured NoC-based MPSoC.


• The design and extension of NoC components to guarantee different levels of Quality-of-Service (QoS). The main target is to provide a combination of guaranteed services and best-effort QoS. The QoS is provisioned by hardware and middleware components in order to enable run-time QoS reconfiguration at application level.

• The integration of all NoC-level blocks and SW libraries in a cycle-accurate simulation and traffic generation environment for validation, characterization and optimization purposes.

• The HW-SW integration of a Floating Point Unit (FPU) on a NoC-based MPSoC. This work explores the hardware and software issues that arise when interfacing an FPU on an MPSoC system, such as synchronization, resource sharing, HW-SW communication interfaces and protocols, etc. Results show the trade-offs between a pure HW block and SW emulation during the execution of floating point operations.

• The implementation of an efficient lightweight message passing programming model, inspired by MPI (Message Passing Interface), called on-chip MPI (ocMPI). It has been integrated and tested successfully in a cycle-accurate simulator, but it has also been ported to a real FPGA-based NoC-based MPSoC to run parallel applications and benchmarks (a brief usage sketch is shown after this list).

• The study of a vertically integrated approach on how to enable QoS at NoC level from the application level using middleware routines. The study focuses on providing QoS at task and thread level to balance the workload under congestion, to boost the performance of critical threads in OpenMP programs, and in general to meet application requirements.
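To illustrate the flavour of this message passing programming model, the sketch below shows a minimal parallel program written against the standard MPI C API (a hypothetical example added here for illustration only; ocMPI mirrors this style with an ocMPI_ prefix, e.g. ocMPI_Init() and ocMPI_Barrier(), and the exact subset of supported ocMPI functions is detailed in Chapter 6):

    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        // Initialize the message passing runtime on every core/process.
        MPI_Init(&argc, &argv);

        int rank = 0, size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // id of this core/process
        MPI_Comm_size(MPI_COMM_WORLD, &size);   // total number of cores/processes

        if (rank != 0) {
            // Every worker explicitly sends its rank to core 0.
            MPI_Send(&rank, 1, MPI_INT, 0, /*tag=*/0, MPI_COMM_WORLD);
        } else {
            for (int src = 1; src < size; ++src) {
                int value = 0;
                MPI_Recv(&value, 1, MPI_INT, src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                std::printf("core 0 received %d from core %d\n", value, src);
            }
        }

        // Explicit synchronization point between all cores.
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }

On a NoC-based MPSoC the same structure applies, with send/receive operations encapsulated into NoC packets by the transport layer (cf. Section 6.5.3).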

Moreover, as a contribution to the field of NoC-based system design, this thesis provides a structured overview of the state of the art in each chapter, according to the specific research topic. Furthermore, as a proof of concept, all theoretical contributions have been deployed in practical implementations on top of cycle-accurate simulators and/or real prototyping platforms, in order to obtain reliable and accurate experimental results.

This dissertation also contributes to the formal definition of the HW-SW design methodology presented in Figure 1.4 to design highly parallel NoC-based MPSoCs for high-performance embedded computing.

The proposed methodology is used throughout this thesis, and it rests on the definition of a robust OSI-like layered micro-network protocol stack adapted to NoC-based systems.

(Figure content: design flow from system specification, through architecture exploration over abstract HW and SW models, software design (parallelization, ocMPI, middleware and drivers) and HW/SW partitioning, architecture design and HW/SW co-design of the RTL platform with co-simulation in SystemC, to HW synthesis, SW compilation, integration and system verification with on-chip co-debugging on the physical platform.)

Figure 1.4: HW-SW communication-centric platform-based design (adapted from [43])

Figure 1.4 shows the HW-SW communication-centric platform-based design used to implement NoC-based MPSoCs. In this design flow we can distinguish five design phases:

1. System specification: related to system-level specs, such as system and component parameters (performance, throughput, latency, bandwidth,...). At this level we impose restrictions to be checked later in terms of correct functionality and performance. It is also defined which tasks and components are necessary to build the complete system. Normally, this system specification is done using natural language; however, UML is often used for system specification, and both hardware and software designers must be present to discuss the system specs accurately.

2. Architecture exploration: its main target is to perform a co-exploration in order to decide the best architecture according to the system requirements (named the Golden Abstract Architecture in Figure 1.4). This implies that both the computational components (e.g. microprocessors, VLIWs, DSPs, FFT cores, 3D accelerators, etc.) and the communication infrastructure (bus-based, NoC-based or dedicated point-to-point communication) must be optimal to fulfil the initial requirements. Therefore, during this co-exploration task, it is mandatory to carry out a cyclic refinement phase, improving the system behaviour in each loop. During this refinement, two architectural aspects must be improved: from the point of view of the computational components, it is important to decide which components will be implemented in software and which in dedicated hardware, whereas from the communication point of view the most important issue is to perform a design space exploration to obtain the best communication infrastructure. Afterwards, system designers define the hardware-software partitioning and application mapping in order to perform HW-SW co-design concurrently.

3. Software design: designers must optimize, according to the application, the microprocessor interfaces and software drivers and also, when required, the operating system and middleware routines. The study and task mapping of the application, represented with a directed acyclic graph (DAG), must be done, as well as the potential parallelization of the sequential code into a parallel program, as presented in the Multiprocessing Parallelization Assistant (MPA) from IMEC [44]. This software parallelization, implicitly in conjunction with the architecture, will determine the parallel programming model. Software optimization can be done at all software levels, for instance by redefining concurrent tasks, improving HW-SW interfaces, or using hardware accelerators to boost software execution.

4. Architecture design & HW-SW co-design: the golden reference model of the abstract architecture (obtained during architecture exploration) must be implemented with the help of an HDL in order to obtain a synthesizable RTL description. Obviously, this architecture will be composed of hardware IP cores, software components (described during software design) and the communication infrastructure.


In addition, an important element that must be implemented is the HW-SW wrapper interface. As commented before, through this interface we connect the hardware platform and the software components (i.e. drivers and middleware APIs) of our system. The design of this interface is a key challenge to obtain good system performance. Another important issue during design is to guarantee correct operation. Once we have all the system components of our MPSoC (HW or SW computational components, the NoC hardware communication architecture and the software APIs), it is time to integrate everything in order to implement the complete system. Another interesting path to implement the complete system without using any final physical platform is to merge all components using SystemC™ [45, 46]. This language lets us describe both the hardware architecture and the software components in C/C++, and after that we are able to do a co-simulation to verify the whole virtual system at different abstraction levels (a minimal SystemC sketch is shown after this list).

5. Integration & system verification: in the last phase of the design flow, it only remains to synthesise the system and to compile the software components and libraries together with the application (or benchmarks). Depending on the environment, the hardware synthesis and the software compilation can be done with multiple tool chains. In this dissertation, we used Altera® and Xilinx® design flows, and gcc-like or Java SW development tools. Finally, the last step, in most cases a laborious one, is to perform an exhaustive full-system verification, getting experimental results from our applications or benchmarks. This can be done by using SystemC™ or JHDL® co-simulation frameworks, or using on-chip co-debugging through embedded logic analyzers (such as SignalTap II from Altera®, or Chipscope from Xilinx®). Initial verification of the designed hardware, IP cores and our NoC design can also be performed using ModelSim [47, 48] or GTKWave [49].

The author would like to stress that NoCs are a relatively recent and very active research field, with roots dating back to 2001, meaning that research opportunities were (and still are) numerous. This dissertation presents work jointly carried out by the author and by several co-authors. Furthermore, describing the complete framework in which the author's research was performed is helpful for a much better understanding of the goals and the achievements of this effort. Appropriate credit to major co-authors will be provided across this dissertation. A list of the

papers on which this work is based, complete with the names of all the co-authors, is supplied in Appendix A on page 183. This dissertation merely reflects the research choices made by the author and his co-authors based on the information and analyses available to them (which will be substantiated wherever possible across the present dissertation), and based on time constraints.

1.4 Dissertation Outline

In Chapter 1 on page 1, we provide a brief introduction and an analysis of the main design challenges to enable high-performance computing on NoC-based MPSoCs.

In Chapter 2 on page 19, we provide the state of the art in NoCs. This analysis leads us to the necessity of EDA tools, because the NoC design space is huge. In this chapter we show an overview of our rapid prototyping and simulation experimental framework, called xENoC [50], to design custom NoC-based systems, and our tool called NoCMaker [51] is presented in detail. NoCMaker can be configured in many aspects, including topology, amount of buffering, packet and flit width, and even flow control and arbitration policies. This chapter specially deals with the NoC simulation infrastructure and performance evaluation, in terms of throughput and latency, but also their impact on area resources and power consumption. Moreover, our framework can also be configured with many different core interfaces to enable IP core reuse. Finally, this chapter shows the potential capabilities of NoCMaker to create highly scalable parallel systems based on NoCs. It presents the work we did to physically implement full-featured NoC-based MPSoCs on FPGA prototyping platforms using the Nios II soft-core processor.

The following two chapters describe specific extensions of the ×pipes NoC library [52, 53] to support key advanced features for non-homogeneous traffic on NoC architectures.

Chapter 3 on page 57 focuses on the design and extension of the ×pipes Network Interfaces (NIs) to interact transparently, in a heterogeneous system, with many IP cores compliant with different bus-based standards (i.e. AMBA 2.0 [14] and OCP-IP [12]). This leads to the seamless accommodation of a variety of IP cores in the system. To understand the concepts, we provide a brief discussion of the OCP-IP and AMBA 2.0 protocols, as well as their implications and particularities when they are interfaced with NoC systems.

In Chapter 4 on page 85, we describe in detail the architectural changes done on switches and NIs to provide QoS support at NoC level. Basically,

we introduce two levels of QoS: (i) traffic service differentiation, by defining a configurable priority-level scheme to obtain soft QoS guarantees, and (ii) a circuit switching scheme (based on channel reservation/release) on top of a wormhole packet switching NoC to deal with hard real-time QoS traffic. For each QoS level, low-level and middleware API software routines are provided to control, at runtime and from the application, the QoS present at NoC level. To make this work understandable, this chapter explains the general concepts about QoS and the related work on NoC systems. All HW-SW components are integrated in MPARM [54], a SystemC-based cycle-accurate simulator of NoC-based MPSoCs using ×pipes, in order to verify and test a complete system. Results in terms of area overhead and latency are presented under synthetic and real application workloads.

In Chapter 5 on page 107, we show a full-featured HW-SW design of a Floating Point Unit (FPU) on a NoC-based MPSoC based on the ARM Cortex-M1. This work presents a complete HW-SW design on how to implement floating point instructions in a transparent way from/to a decoupled memory-mapped FPU coprocessor (as a bus-based element or NoC tile) without making changes to the ARM compiler. This chapter also analyses the HW-SW synchronization and communication schemes, as well as the possibility to share the FPU between different ARM Cortex-M1s. Quantitative results are reported in terms of area and performance, as well as speedups against software floating point libraries, in different scenarios.

In Chapter 6 on page 133, we explore how to map parallel programming models, specially those based on MPI, on top of NoC-based MPSoC architectures. This chapter shows the special features required at architectural level in order to obtain efficient inter-processor communication through message passing. In this chapter we also explore how to integrate the QoS features presented in Chapter 4 on page 85 on top of an OpenMP runtime library or at application level. Profiling results and speedups are reported by executing benchmarks and applications on different platforms using both parallel programming models.

CHAPTER 2

Experimental Framework for EDA/CAD of NoC-based Systems

This chapter¹ describes in detail the huge design space that is present when designing NoCs, and its implications. Next, it lists the state of the art on well-known communication-centric EDA tools for NoC-based systems. Finally, our experimental environment for automatic NoC design is presented. This framework is based on two tools, NoCWizard and its evolution, NoCMaker, a cross-platform open-source EDA tool for design space exploration of NoC systems. Part of this dissertation relies on these architectural design environments.

2.1 Networks-on-Chip: Preliminary Concepts

This section covers basic concepts of NoC architectures that have not been discussed yet. First, a component-based view is presented, introducing the basic building blocks of a general NoC-based system. Then, we describe in detail all the important points of the huge NoC design space, and we look at the system-level architectural issues relevant to the design of parallel NoC-based MPSoC architectures. After this, a layered abstraction view is presented according to network models, in particular OSI, and its adaptation for NoCs. Next, we go into further detail on specific NoC research and EDA tools in Section 2.3 on page 35, reviewing the state of the art. Afterwards, we introduce our experimental NoC design framework,

listing and describing all the features included to evaluate, simulate, debug and validate multiple NoC-based MPSoC architectures. Results will be shown by generating multiple parallel Nios II NoC-based MPSoC architectures, applying different architectures and NoC switching modes.

¹The author would like to acknowledge the contributions of David Castells, Sergio Risueño and Eduard Fernandez to the NoCMaker tool, as well as Oriol Font-Bach's contribution to NoCWizard in the xENoC framework.

2.1.1 NoC Paradigm - General Overview

Figure 2.1 on the facing page shows a general overview of a NoC-based system. Instead of busses or dedicated point-to-point links, a more flexible, segmented and scalable NoC backbone is built. Basically, a NoC is a "sea" of switches which interconnect "tiles" across the chip using "short" point-to-point communication links. Thus, a NoC-based system is built using only three basic components:

• Network Interface (NI): there are many names for this component in the NoC literature. It can be called a Network Adapter (NA), a Resource Network Interface (RNI) or a Network Interface Controller (NIC), but in essence they are the same component. NIs implement the interface between each computation or memory node and the communication infrastructure. Thus, their function is to decouple computation (i.e. processing element tiles with IP cores) from communication (switches), also adapting the potentially multiple clock domains on the system (i.e. GALS, mesochronous, etc.). This element is also in charge of adapting bus-based protocols (i.e. AMBA, AXI, OCP-IP) and generating network packets in a transparent way from the point of view of the user.

• Switch nodes: also called routers, they are in charge of forwarding data according to a particular switching technique (see Section 2.1.4 on page 26). The switches also implement the routing protocol (see Section 2.1.3 on page 24 for further details) and the buffering capabilities (see Section 2.1.5 on page 28). These parameters characterize the final NoC system in terms of area costs, power consumption and performance (i.e. bandwidth, latency).

• Links: dedicated point-to-point connections that provide the communication between each pair of components. In a physical implementation, these links represent one or more logical channels of wires.

Another important point is that, at the hardware level, the NoC presented in Figure 2.1 on the facing page can be implemented using asynchronous (GALS) [55], fully synchronous [52, 53] or mesochronous [56, 57] logic.



Figure 2.1: Logical view of a NoC-based system

Figure 2.2: Classification of NoC design choices (from [58]): routing adaptivity (deterministic, adaptive, stochastic), connection-oriented vs. connection-less operation, switching technique (store-and-forward, virtual cut-through, wormhole switching, circuit switching), and routing implementation (source vs. distributed; look-up table vs. finite state machine)

As will be shown next, the NoC design space is huge. Some of the NoC communication architecture decisions are plotted in Figure 2.2. Thus, topology, channel width, buffer sizing, floorplan/wiring, routing algorithms, switching/forwarding techniques and application IP core mapping are application- (and traffic-) dependent open problems that designers must settle accurately when implementing a NoC architecture for a given domain.

2.1.2 Topologies

Since the emerging NoC architectures have been proposed to replace the bus-based approach for large SoC designs, many on-chip network architectures have been designed by adopting topologies previously used in parallel multi-processor systems, but applying the important constraints inherent to on-chip environments (e.g. limited buffer resources, power usage, flow control, routing schemes, ...).


A large body of research work exists on synthesizing and generating bus-based systems [59]. While some of the design issues in NoCs are similar to those of bus-based systems (such as link width sizing), a large number of issues, such as finding the number of required switches, sizing the switches, finding routes for packets, etc., are new in NoCs. Another important design point when designing NoCs is the selection of the topology [60, 61]. The topology has an effect on all NoC parameters, such as network latency, performance, area and power consumption [62]. In [63] the effects of different topologies on floorplan routability on FPGAs/ASICs are explained. Furthermore, the topology selection is also very significant when mapping the IP cores and the tasks onto the network [64]. Hence, it is appropriate that collaborating tiles/tasks are placed as close as possible so that they can cooperate effortlessly. Figure 2.3 on the facing page shows a large representative subset of typical topologies used in NoCs. Generally, this subset of topologies may be classified in two different ways. These criteria are the following:

• Regular or Irregular (or application specific).

• Direct or Indirect.

Direct topologies are those that have at least one core/tile attached to each node, whereas indirect topologies are networks that have a subset of nodes not connected to any core. In that case, these nodes only perform forwarding/relaying operations (e.g. Figures 2.3(d), 2.3(h), 2.3(i) on page 23). Early works on NoC topology design assume that, using regular topologies (e.g. Figures 2.3(a), 2.3(b), 2.3(c), and 2.3(e) on page 23) as in traditional macro-networks, we can predict the performance, power consumption and area occupation, and that they have a predictable layout/floorplan which helps place and route [60, 65]. However, this is only useful for homogeneous MPSoCs; regular topologies are not suitable for application-specific heterogeneous NoC-based MPSoCs, where each core has a different size, functionality and communication requirements. Thus, standard topologies can have a structure that poorly matches the application traffic, so the NoC system will be oversized in terms of power consumption and area costs. In fact, until now many early MPSoCs (like Philips Nexperia [21], ST Nomadik [24], or TI OMAP [23]) shown in Table 1.2 on page 8 have been designed statically or semi-statically by mapping tasks to processors and hardware cores. Furthermore, to achieve more complex interconnection schemes, as in traditional networks, a combination like Ring-Mesh [66], or another configuration, can be used.


Figure 2.3: Representative subset of NoC topologies: (a) Mesh, (b) Torus, (c) Hypercube, (d) Binary (2-ary) Tree, (e) Spidergon, (f) Irregular/Custom, (g) Ring, (h) K-ary M-Tree, (i) 3-Stage Butterfly

Unfortunately, NoCs do not have a perfect geometry of regular grids or trees because of the inherent heterogeneity. Figure 2.4 on the following page shows a classification of NoC systems categorized by homogeneity (X axis) and granularity (Y axis). There is consensus that, for generic homogeneous or static multiprocessing, regular topologies are suitable enough and can be a solution. Nevertheless, custom NoC topologies [61, 68–70], which satisfy the requirements of the emerging heterogeneous MPSoC systems, should be provided as an efficient alternative solution for application-specific NoC-based MPSoCs. Depending on the computational nodes and their functionality, NoCs can have different granularity (as shown in Figure 2.4 on the next page), and therefore can be coarse or fine grained.


For example, if each tile integrates a whole bus-based architecture with several CPUs, DSPs, or huge VLIWs, we can talk about a coarse-grained system, whereas when every tile is composed of relatively simple function-specific blocks, such as ALUs, FFTs or accelerators, it becomes a fine-grained NoC-based system.

Figure 2.4: Classification of NoC systems (granularity vs. homogeneity) (from [67])

2.1.3 Routing Protocol and Deadlock Freedom

After explaining the topology issues and the NoC components, this section introduces the routing schemes supported by these topologies. The routing protocol concerns the "intelligent" strategy of moving data through the NoC using a given switching technique (see Section 2.1.4 on page 26). The routing protocol is an important design space point because it has a strong influence on many issues, such as the packet delivery latency under congestion, the power consumption, etc. Hence, the routing establishes a trade-off between network performance, power consumption and also area occupation. As a consequence, the selection of the right routing algorithm is crucial.

• Deterministic: in a deterministic routing strategy, the path is determined by its source and destination addresses (e.g. XY-YX [71, 72]).

• Adaptive or partially adaptive: the routing path is decided on a per-hop basis, and involves dynamic arbitration mechanisms (e.g. based on local link congestion or a general traffic policy). Examples of adaptive routing are Odd-even [73], West-first, North-last, Negative-first [71] or Dyad [74].


Thus, adaptive routing offers benefits over deterministic routing (e.g. dynamic traffic balancing), but it is more complex to implement. This is because, with adaptive routing, a deadlock or a livelock may occur. A deadlock is defined as a cyclic dependency among nodes requiring access to a set of resources, so that no forward progress can be made, no matter what sequence of events happens. Livelock refers to packets circulating the network without ever making any progress towards their destination (e.g. during non-minimal adaptive routing). Two classes of deadlocks can occur in transactional NoCs: routing-dependent deadlocks and message-dependent deadlocks [72, 75]. To avoid malfunctions, both types of deadlock must be solved properly. Methods to avoid routing-dependent deadlocks or livelock can be applied either locally, at the nodes with support from service primitives, for instance implemented in hardware, or globally at NoC level. It is possible to implement deadlock-free routing algorithms in Mesh topologies if at least two turns are forbidden [71, 76] (dotted lines in Figure 2.5(b)), as well as in irregular Meshes [77, 78]. This is sufficient to avoid routing-dependent deadlocks on regular topologies. In addition, [68] and [79] present techniques to ensure deadlock freedom on application-specific or custom NoC topologies. In [80, 81] a theory to avoid deadlocks in wormhole networks is reported (see Section 2.1.4 on the following page), whereas in [82] a similar approach is described for a fault-tolerant scenario.

Figure 2.5: Turn model for deadlock-free adaptive routing: (a) all possible turns in a 2D Mesh; (b) limited turn model in a 2D Mesh

On the other hand, message-dependent deadlocks occur when interactions and dependencies appear between different message types at the network endpoints, when they share resources in the network. Even when the underlying network is designed to be free from routing-dependent deadlocks, message-level deadlocks can block the network indefinitely. In [72, 75] it is shown how to solve this problem in traditional multi-processor networks, whereas in [83, 84] an approach for application-specific SoCs is reported. In [83, 85] it is described how to solve message-dependent deadlocks on SoCs, quantifying the overhead in the overall system. The common technique

to avoid message-dependent deadlocks is to ensure the logical (e.g. using virtual channels) or physical (i.e. using a doubled network) separation of the network for different message types when an end-to-end control mechanism is applied. This guarantees that different message types (e.g. request messages or response messages) do not share the same resources between two endpoints. The routing can also be categorized as:

• Minimal vs non-minimal routing: a routing algorithm is minimal if it always chooses among the shortest paths toward the destination; otherwise it is non-minimal. Minimal routing (e.g. XY-YX) is free of deadlock and livelock and delivers packets with low latency and low energy consumption; however, it is less flexible when a switch is congested. On the other side, non-minimal routing offers great flexibility in terms of possible paths when congestion occurs, but it can lead to livelock situations and increase the packet delivery latency.

• Source vs distributed routing: in source routing the whole path is decided at the source switch. In distributed routing, the most common choice for segmented interconnection networks, the routing decisions are made locally in each routing node (a minimal sketch of distributed deterministic XY routing is given after this list).
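To make the deterministic case concrete, the following minimal Java sketch shows how a distributed XY routing decision can be taken locally at each switch of a 2D Mesh; the class and port names are illustrative and do not correspond to any particular NoC implementation.

// Minimal sketch of distributed, deterministic XY routing on a 2D Mesh
// (illustrative only; a real switch also handles arbitration, flow control, etc.).
public final class XyRouting {
    public enum Port { LOCAL, NORTH, SOUTH, EAST, WEST }

    /** Chooses the output port at switch (x, y) for a packet addressed to (dstX, dstY). */
    public static Port route(int x, int y, int dstX, int dstY) {
        if (dstX > x) return Port.EAST;   // first travel along the X dimension
        if (dstX < x) return Port.WEST;
        if (dstY > y) return Port.NORTH;  // then along the Y dimension
        if (dstY < y) return Port.SOUTH;
        return Port.LOCAL;                // the packet has reached its destination tile
    }

    public static void main(String[] args) {
        // A packet at switch (0,0) addressed to (2,1) leaves twice through EAST and then NORTH.
        System.out.println(route(0, 0, 2, 1)); // EAST
    }
}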

2.1.4 Switching Communication Techniques

From the point of view of hop-to-hop communication, NoCs adopt the switching techniques used in the past in traditional multiprocessor networks, but applying the inherent constraints of the on-chip scenario. The switching mode, together with the routing algorithm, is the most important issue in meeting certain NoC requirements. It determines a trade-off between implementation complexity (typically area costs and power consumption) and the overall network performance. The switching strategies commonly used in NoCs are presented below:

• Circuit switching: involves the creation of a circuit from source to destination, which is set up and reserved until the transport of data is complete. Circuit switching can ensure a number of constant-bit-rate and constant-delay connections between the nodes because of their exclusive use of the resources for the duration of the communication. In other words, circuit-switched NoCs [86] provide inherent QoS guarantees [87–89] and minimum buffering.


• Packet switching: forwards data to the next hop by using the routing information contained in the packet. In each switch the packets are queued or buffered, resulting in a variable delay depending on congestion, routing algorithm, switch arbitration, etc. Packet-switched networks [13, 90] are normally used for best-effort traffic. The amount of buffering depends on the packet switching technique, and the buffers are used to increase performance by delaying the onset of congestion.

Another communication-related classification concerns the connection of the components. Basically there are two mechanisms: connection-oriented [91] and connection-less [92]. The first one involves a dedicated logical connection path being established prior to data transport. The connection is then terminated upon completion of the communication. In connection-less mechanisms the communication occurs in a dynamic manner, with no prior arrangement between the sender and the receiver. Thus, circuit-switched communications are always connection-oriented, whereas packet-switched ones may be either connection-oriented or connection-less. Another important design point in packet-switched NoCs is the switching technique (see Figure 2.6 on the next page). The most common are store-and-forward, virtual cut-through and wormhole [72], and they are explained below:

• Store-and-forward: a packet-switched protocol in which each switch stores the complete packet and forwards it based on the information within its header. Thus, the packet may stall if the switch in the forwarding path does not have sufficient buffer space.

• Virtual cut-through: a forwarding technique similar to wormhole, but before forwarding the first flit of the packet, the switch waits for a guarantee that the next switch in the path will accept the entire packet. Thus, if the packet stalls, it is aggregated in the current switch without blocking any links.

• Wormhole: this switching scheme combines packet switching with the data streaming quality of circuit switching to attain minimal packet latency/delay. In wormhole switching, each packet is further segmented into flits. The header flit reserves the channel of each switch on the path, the body flits then follow the reserved channel like a "worm", and finally the tail flit releases the reservation of the channel. One major advantage of wormhole switching is that it does not require the complete packet to be stored in the switch while it is waiting for the header flit to route


to the next stages. Wormhole switching not only reduces the store-and-forward delay at each switch, but it also requires much less buffer space. The drawback is that one packet may occupy several intermediate switches at the same time. As a consequence, the amount of buffering resources in the network depends on the implemented switching technique. Table 2.1 on the facing page presents a summary of the latency penalty and buffering cost in each switch for every packet switching technique.
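As a simple illustration of the flit-level segmentation used by wormhole switching, the following minimal Java sketch splits a packet into header, body and tail flits; the class and field names are illustrative and not taken from any of the NoC implementations discussed in this dissertation.

import java.util.ArrayList;
import java.util.List;

// Minimal sketch of splitting a packet into header/body/tail flits for wormhole
// switching (illustrative only; a real NI also adds routing and sequencing fields).
public final class WormholePacketizer {
    public enum FlitType { HEADER, BODY, TAIL }

    public static final class Flit {
        final FlitType type;
        final int payload;          // flit-sized chunk of the packet
        Flit(FlitType type, int payload) { this.type = type; this.payload = payload; }
        public String toString() { return type + ":" + Integer.toHexString(payload); }
    }

    /** Splits a packet (destination id + payload words) into a flit sequence. */
    public static List<Flit> toFlits(int destination, int[] payloadWords) {
        if (payloadWords.length == 0)
            throw new IllegalArgumentException("a packet needs at least one payload word");
        List<Flit> flits = new ArrayList<>();
        flits.add(new Flit(FlitType.HEADER, destination));        // header reserves the path
        for (int i = 0; i < payloadWords.length - 1; i++)
            flits.add(new Flit(FlitType.BODY, payloadWords[i]));  // body follows like a "worm"
        flits.add(new Flit(FlitType.TAIL, payloadWords[payloadWords.length - 1])); // tail releases the channels
        return flits;
    }

    public static void main(String[] args) {
        System.out.println(toFlits(5, new int[] {0xA, 0xB, 0xC})); // [HEADER:5, BODY:a, BODY:b, TAIL:c]
    }
}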

Figure 2.6: Packet transmission using different switching techniques (from [93]): (a) circuit switching, (b) store-and-forward, (c) virtual cut-through, (d) wormhole

2.1.5 Buffering and Virtual Channels

As we can see in Table 2.1 on the next page, buffers are necessary in the switches to store flits or packets, especially in packet-switched NoCs. Buffers account for the main part of the area usage of a switch; therefore, when designing the switches it is important to reduce the amount of buffering area as much as possible without losing the required performance [94].


Protocol                Per-switch cost               Stalling
                        Latency      Buffering
Store-and-forward       packet       packet           At the two nodes between the link
Virtual cut-through     header       packet           At the local node
Wormhole                header       flit             At all nodes spanned by the packet

Table 2.1: Buffering costs (in terms of buffering, latency, and stalling) for different packet switching protocols (from [67])

There are two main aspects related to the buffering capabilities:

• Their size: the width and depth define how many words can be allocated in the switch.

• Their location within the switch: in the debate of input vs. output buffers, for equal performance the queue length in a system with output port buffering is always found to be shorter than the queue length in an equivalent system with input port buffering [88]. This is so because, in a routing node with input buffers, a packet is blocked if it is queued behind a packet whose output port is busy (head-of-line blocking).

Another important issue worth remarking is that increasing the buffer size is not a solution towards avoiding congestion [95]. At best, it delays the onset of congestion, since the throughput is not increased. Moreover, if the increment in performance is marginal in relation to the power and area overhead, then the selected buffering scheme is useless. On the other hand, buffers are useful to absorb bursty traffic, thus levelling the bursts. In [96] a customization of buffer space allocation for an application-specific design is proposed; nevertheless, a general optimized approach is much more flexible and can fit more applications. In addition, combined with [96], in [97] it is described how important the traffic characteristics generated by the application are in order to build a better space allocation. In [97] it is concluded that, depending on the physical platform resources, it is important to choose registers or dedicated memory to implement the queuing buffers. In [88] input queuing and output queuing are combined to achieve virtual output queuing. This solution is moderate in cost (N² buffers, where N is the number of inputs) and power efficiency, but achieves the high performance of output queuing. The decision to select traditional input queuing or virtual output queuing depends on system-level aspects like topology, network utilization and global wiring cost. In addition, another significant issue is to choose between a centralized buffer pool shared between multiple

input and output ports or distributed dedicated FIFOs. Generally, centralized buffer implementations are found to be expensive in area due to the overhead of the control implementation, and they become bottlenecks during periods of congestion. Another way to improve NoCs, combined with buffering in packet switching forwarding techniques, is to multiplex a physical channel by means of virtual channels. Virtual channels reduce latency (improving the throughput), avoid deadlocks, optimize wiring utilization (reducing leakage power and congestion), increase network performance (minimizing the frequency of stalls), and moreover, the insertion of virtual channels also makes it possible to implement policies for allocating the physical channel bandwidth, which enables QoS support in applications (e.g. differentiated services). In [98, 99] a virtual channel switch architecture for NoCs is presented, which optimizes routing latency by hiding control overheads in a single-cycle implementation. In contrast, the use of virtual channels dramatically increases the area and power consumption of each switch.
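To make the virtual output queuing organisation mentioned above more concrete, the following minimal Java sketch keeps one queue per output at each input port (the N² buffers referred to above); the class and method names are illustrative only and do not describe any specific switch design.

import java.util.ArrayDeque;
import java.util.Queue;

// Minimal sketch of virtual output queuing (VOQ): each of the N input ports keeps
// one queue per output port (N^2 queues in total), so a flit destined to a busy
// output no longer blocks flits behind it that target a free output
// (illustrative only; a real switch adds an allocator/arbiter per output port).
public final class VirtualOutputQueues {
    private final Queue<Integer>[][] voq;   // voq[input][output] holds queued flits

    @SuppressWarnings("unchecked")
    public VirtualOutputQueues(int ports) {
        voq = new Queue[ports][ports];
        for (int in = 0; in < ports; in++)
            for (int out = 0; out < ports; out++)
                voq[in][out] = new ArrayDeque<>();
    }

    /** Enqueue a flit arriving on 'input' whose routing decision selected 'output'. */
    public void enqueue(int input, int output, int flit) { voq[input][output].add(flit); }

    /** Dequeue the next flit waiting at 'input' for 'output', or null if none. */
    public Integer dequeue(int input, int output) { return voq[input][output].poll(); }

    public static void main(String[] args) {
        VirtualOutputQueues sw = new VirtualOutputQueues(4);
        sw.enqueue(0, 2, 0xA);                 // input 0 has traffic for output 2...
        sw.enqueue(0, 3, 0xB);                 // ...and for output 3; neither blocks the other
        System.out.println(sw.dequeue(0, 3));  // prints 11 (0xB), even if output 2 is busy
    }
}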

2.1.6 Flow Control Protocols

Finally, it is also important to implement an efficient synchronization/flow control scheme between the network resource components, i.e. switches and network interfaces. Moreover, the synchronization must follow a GALS-compatible scheme in order to create isolated synchronous regions, i.e. tiles, with global asynchronous communication.

• Credit-based: using a credit-based technique, the network components keep counters of the available buffer space. When a flit is received, the counter is decremented, and when a flit is transmitted the counter is incremented. If the counter reaches zero, the buffer is full and the switch accepts no further flits until buffer space becomes available again. A simple way to do this without counters is to check the status signals (empty or full) of the buffering resources [94, 99] (a minimal counter sketch is given after this list).

• Handshake: in handshake flow control, the upstream switch sends flits whenever they become available. If the downstream switch has available buffer space, it accepts the flit and sends an acknowledgment (ACK) to the upstream switch. Otherwise, the downstream switch has two options: it can drop or stall the flit, sending a negative acknowledgement (NACK) and delaying the flit until its retransmission [55, 99, 100].


There are two types of handshake:

– 2-phase: it is simple, efficient and high-throughput, but ambiguous.

– 4-phase: it is more complex to implement but is not ambiguous.
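As a minimal illustration of the credit-based scheme described above, the following Java sketch keeps, on the upstream side of a single link, one credit per free slot of the downstream input FIFO; the class and method names are assumptions made for this example only.

// Minimal sketch of credit-based flow control on a single link: the upstream side
// holds one credit per free buffer slot of the downstream input FIFO (illustrative only).
public final class CreditCounter {
    private int credits;                       // free slots currently available downstream

    public CreditCounter(int fifoDepth) { credits = fifoDepth; }

    /** Called before sending a flit: consume a credit, or stall if the FIFO is full. */
    public synchronized boolean trySend() {
        if (credits == 0) return false;        // downstream buffer full: stall
        credits--;
        return true;
    }

    /** Called when the downstream switch frees a slot and returns a credit. */
    public synchronized void creditReturned() { credits++; }

    public static void main(String[] args) {
        CreditCounter link = new CreditCounter(2);
        System.out.println(link.trySend());    // true  (1 credit left)
        System.out.println(link.trySend());    // true  (0 credits left)
        System.out.println(link.trySend());    // false (stall until a credit returns)
        link.creditReturned();
        System.out.println(link.trySend());    // true
    }
}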

Despite the use of these flow control or synchronization schemes between network components, congestion can still occur due to the lack of infinite buffering capacity, or due to congestion on a circuit-switched path. In order to handle this unwanted effect, two mechanisms can be applied: delaying or dropping the packet. In the delay model, packets are never dropped; this means that the worst that can happen is the data being delayed. On the other side, in the loss model, when congestion occurs we can drop the packet and retransmit it afterwards. The loss model introduces an overhead because the state of the transmission, successful or failed, must somehow be communicated back to the source. Despite this fact, there are some advantages to dropping packets, such as resolving the contention in the network. According to Dally [72], handshake flow control is less efficient than credit-based flow control, because flits remain stored in the buffers for an additional time, waiting for an acknowledgment. However, if we use a circuit switching forwarding technique, both flow control schemes are efficient due to the inexistent buffering of this switching technique. In addition, as reported in [100], a handshake protocol is simpler and cheaper than a general credit-based flow control scheme, and can be implemented effortlessly in a GALS scheme. However, a particular case of credit-based flow control, called stall-and-go, is even cheaper, but its main drawback is that it does not offer fault tolerance.

2.1.7 Micro-network Protocol Stack for NoCs

Next, a layered abstraction view of NoC systems is presented, looking at the network abstraction models. In particular, in [39, 101–104] the use of a micro-network stack based on the well-known ISO/OSI model [105] is proposed for NoC-based systems. This standardisation makes it possible to abstract the electrical, logic, and functional properties of the interconnection scheme, up to the system-level application. The network is the abstraction of the communication among components. It must satisfy QoS requirements (e.g. reliability, performance, latency, and energy bounds) under the limitation of intrinsically unreliable signal transmission and significant communication delays on wires, and therefore it is important to define a robust NoC stack. Moreover, in the NoC paradigm the layers are generally more closely bound

than in a macro-network. Issues with a physically related flavour often arise even at the higher abstraction levels. Furthermore, NoCs differ from wide area networks in their local proximity and because they exhibit less non-determinism. As shown in Figure 2.7 on the facing page, the micro-network stack is divided into five generic layers, which are explained below:

• The physical layer determines the number and length of the wires connecting resources and switches. It ensures reliable (lossless, fault-free) communication over a data link, and it considers crosstalk and fault tolerance (both transient soft errors and hard errors). This layer is technology dependent and is implemented fully in hardware.

• The data-link layer defines the protocol to transmit a flit between the network components, i.e. between resource and switch and between switch and switch. The data-link layer is technology dependent; thus, for each new technology generation these two layers have to be defined again.

• The network layer defines how a packet is delivered over the network from an arbitrary sender to an arbitrary receiver, directed by the receiver's network address. More specifically, it covers routing, congestion control, and scheduling over multiple links. This layer can also include network management, such as accounting, monitoring, and so on. It is technology dependent and designed in hardware.

• The transport layer is partially technology independent. The transport-layer message size can be variable, and the NI has to pack transport-layer messages into network-layer packets (a minimal packetisation sketch is given after this list). This abstraction layer hides the topology of the network, and the implementation of the links that make up the network, from the IP core protocol. This layer should provide end-to-end services over connections, through packetisation/unpacketisation, (de)multiplexing of multiple transport connections over the network, in-order delivery, and so on. These services must offer uncorrupted and lossless transmission, and at the same time guarantee a certain throughput, jitter, and latency.

• The system and application layer is technology independent. At this level most of the network implementation details must be transparent to the end user. The application layer is in charge of hosting the application and the SW APIs or run-time libraries which interact with the HW platform (e.g. middleware routines, OS, application concurrency and parallel computing APIs, high-level parallel programming models). Depending on


the implementation, within this application layer it is possible to define the session and presentation layers in order to be fully compatible with OSI. The session layer can be defined to combine multiple connections into high-level services, such as multicast, half-duplex or full-duplex connections, but also to achieve synchronization of multiple connections or the management of mutual exclusion and atomic transactions (e.g. test-and-set, swap). A common session protocol is synchronous messaging, which requires that the sending and receiving components rendezvous as the message is passed. The state maintained by the protocol is a semaphore that indicates when both the sender and the receiver have entered the rendezvous. Many applications use this sort of functionality to synchronize system components that are running in parallel. On the other hand, the presentation layer adapts the application data to the transport layer and vice versa (e.g. compression, encryption, endianness conversion, etc.).
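As a small illustration of the transport-layer packetisation mentioned above, the following Java sketch splits a variable-length message into fixed-size packets carrying destination and sequencing information; the class and field names are assumptions for this example and do not describe any particular NI implementation.

import java.util.ArrayList;
import java.util.List;

// Minimal sketch of transport-layer packetisation: a variable-length message is split
// into fixed-size network packets carrying destination and sequencing information
// (illustrative only; a real NI also handles connections, ordering and flow control).
public final class Packetiser {
    public static final class Packet {
        final int destination, seq; final byte[] payload;
        Packet(int destination, int seq, byte[] payload) {
            this.destination = destination; this.seq = seq; this.payload = payload;
        }
    }

    /** Splits 'message' into packets of at most 'maxPayload' bytes addressed to 'destination'. */
    public static List<Packet> packetise(int destination, byte[] message, int maxPayload) {
        List<Packet> packets = new ArrayList<>();
        for (int off = 0, seq = 0; off < message.length; off += maxPayload, seq++) {
            int len = Math.min(maxPayload, message.length - off);
            byte[] chunk = new byte[len];
            System.arraycopy(message, off, chunk, 0, len);
            packets.add(new Packet(destination, seq, chunk));
        }
        return packets;
    }

    public static void main(String[] args) {
        List<Packet> p = packetise(3, new byte[100], 32);
        System.out.println(p.size() + " packets"); // 4 packets (32 + 32 + 32 + 4 bytes)
    }
}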


Figure 2.7: Micro-network stack for NoC-based system (from [39, 67])

Using the presented abstract layered micro-network stack lets us provide a layered view of the whole system, grouping hardware and software components in a vertical approach. Thanks to this layered view, it is easy to provide high-level end-to-end services over connections, such as QoS requirements (see Chapter 4 on page 85), and also to achieve traffic heterogeneity in a transparent way from the point of view of the software execution (see Chapter 3 on page 57). Moreover, using this protocol stack for NoC-based systems enables the development of many different architectures, for instance a virtual shared memory or a message passing architecture. These architectures, together with their associated programming models

and API libraries, will help SW developers to program concurrent or parallel data- and computation-intensive applications (see Chapter 6 on page 133). This layered approach separates the optimization of transactions from the physical layers. Thus, it is possible to simulate complex applications at a very high abstraction level, which only captures traffic events among an arbitrary number of components.

2.2 Motivation and Key Challenges

The design of future homogeneous and, specially, heterogeneous MPSoCs [6, 7, 26, 106, 107] introduces many difficult problems, the design of the interconnection being one of the most challenging. As the complexity of these systems grows, the design effort is less focused on the computation side and increasingly on the communication side. The outcome is a shift towards communication-centric designs, which nowadays focus on NoCs. Because the design space of the NoC paradigm is huge and the potential applications targeted at NoC-based systems will request different solutions, it is mandatory to provide EDA tools to design, effortlessly and automatically, a great variety of NoC-based systems. Thanks to EDA tools, we can create NoC architectures with many arbitrary IP cores and with generic or custom requirements optimized for the application. The critical points that EDA tools must face are:

• The integration of existing IP cores compliant with many different protocols must be easy (i.e. plug&play composability).

• The tool must offer results on communication and traffic patterns and identify bottlenecks, in order to quantify the performance and to balance the workload, in terms of bandwidth, between critical points.

• The design must provide low latency within and between components (i.e. NIs and switches).

• The implementation costs must be optimal in terms of area and power consumption.

• QoS must be delivered depending on the user requirements.

• Easy capture and tuning/optimization of different NoC configurations through a Graphical User Interface (GUI).

• Easy verification and simulation of the system using automatically generated synthetic NoC traffic.


• RTL code generation (i.e. VHDL or Verilog), together with early pre-synthesis area and power estimations.

• Generate a complete framework to simulate and verify the hardware components together with the SW components and APIs before synthesis.

• Generate reports of the most important system metrics (i.e. latency, on-the-fly packets, sent/received packets, etc.).

As a starting point to create EDA or design space exploration tools, we can take previous bus-based works as reported in [108], and integration tools such as AMBA Designer [109]. However, in the on-chip network scenario many important features must be taken into account to model, verify and analyze the communication infrastructure. As a consequence, to face all the challenges presented along this chapter, next we present the state of the art and our contribution on EDA tools to design NoC-based systems.

2.3 Related work on NoC Design and EDA Tools

Once the NoC concepts and paradigm became well established, in the last years the efforts of many research groups, as well as companies [110–113], turned to the development of EDA tools to specify, generate, simulate, verify and evaluate NoC-based systems. Figure 2.8 on the following page shows the most relevant academic works (an exhaustive review of the state of the art is reported in [114]) on NoCs and/or EDA tools, distinguished by their granularity (i.e. at what level the NoC components are described) and parametrizability (i.e. how easy it is to change system-level NoC characteristics at instantiation time). Some early approaches based on general-purpose network simulators to evaluate NoC-based systems have been done using OPNET [115, 116] or NS-2 [117, 118]. These frameworks are useful to obtain performance metrics, but they cannot be used to estimate the area and power consumption. To get this kind of information it is necessary to create an automatic EDA framework and use the HDL descriptions of the network elements. As a consequence, a large body of specific NoC frameworks has been reported in the last years (e.g. SPIN [13], Chain [119], SoCBUS [120], Proteo [90], NoCGEN [121], Nostrum [101], Hermes [122] and MAIA [123], Æthereal [86], MANGO [67], ×pipes [52, 53, 124] and SUNMAP [61]). Some of these frameworks, such as NoCGEN, Proteo and MAIA, use a set of parametric template-based tools to generate NoC architectures at RTL. In ×pipes, a NoC generator called ×pipescompiler [69] is able to generate



models in Verilog or in SystemC™ at RTL, as well as to simulate traffic inside the on-chip network. The RTL design level is convenient for NoC synthesis and power consumption evaluation, but it is slow to simulate. Therefore, higher abstraction models based on SystemC™ TLM can be used to achieve faster simulation times, whereas RTL is mandatory to extract area and performance metrics. Several approaches of this kind are presented in [125] for bus-based systems, as well as for NoCs in [55, 126, 127]. Another initiative focused on QoS features, named QNoC, is reported in [87], whereas a completely asynchronous NoC (aNoC) is modeled using SystemC™ in [55].

Figure 2.8: State of the art of academic EDA NoC tools (adapted from [67])

2.4 xENoC environment - NoCMaker

Our contribution to the related work is our experimental Network-on-Chip environment, xENoC [50], to design NoC-based MPSoCs. The core of this framework is NoCMaker, a cross-platform open-source EDA tool for design space exploration of NoC-based systems. Figure 2.9 on the next page summarizes the degrees of freedom and the exploration space related to the NoC paradigm. Of course, depending on the design effort and the required flexibility, we can obtain different design quality.


Figure 2.9: Design space exploration – design effort vs. flexibility (from [128]): the customization level ranges from a fixed standard architecture, through a customized standard infrastructure, to a fully customized infrastructure, with increasing design effort, flexibility and design quality

Our xENoC framework and NoCMaker share and extend some features shown in the related work presented in Section 2.3 on page 35. They share the capability of ×pipes and SUNMAP to generate different topologies, and they are inspired by NoCGEN's parametrizability features to customize and select different NoC parameters (e.g. FIFO depth, packet size, etc.). In addition, NoCMaker is able to automatically create NIs, as in MAIA, and it is capable of choosing different tile mappings and different routing and switching schemes (e.g. packet switching and circuit switching), special features which are not present together in any previous work. The framework also provides easy generation of synthetic traffic patterns, as well as the execution of real parallel applications using a parallel programming model based on MPI, i.e. j2eMPI [129] (see Section 6.5 on page 146).

2.4.1 Design Flow of NoCMaker

NoCMaker is based on Java technology, and it lets hardware designers create cycle-accurate designs of different types of NoC architectures. Unfortunately, there are many interrelated variables that determine the performance of a specific NoC: traffic distribution, network topology, switching scheme, flow control, routing algorithm, channel properties, FIFO depth, packet/flit size, etc. Thus, the NoC design space can be viewed as a multidimensional space. In some of the dimensions only a few discrete values are permitted, whereas in other dimensions infinitely many possible values can be

assigned. Since NoCMaker is developed in Java, we define a Java class to represent a specific NoC design space point (NDSP). Obviously, an NDSP is a combination of different NoC parameters that identifies a specific point within all the possibilities of the huge design space. As shown in Listing 2.1, the input of this class is an XML file that describes the NDSP. Instead of defining a DTD and creating a conformant parser, we use an XML serialization process as a much simpler solution to obtain the NDSP data. NoCMaker uses this information to automatically build NoC architectures according to the selected NDSP.

Listing 2.1: XML example of a NDSP in NoCMaker (element values shown):
FIXED_CLOCKWISE, MESH, INPUT, false, false, 2, WORMHOLE_SWITCHING, DISTRIBUTED, XY, FOUR_PHASE_HANDSHAKE, GLOBAL_CLOCK, J2EMPI_PICOMPUTING, AVALON, NIOS2, 8, 8, 48, 304, 32, 0.0, 0, false, 4, 4, 0, 0, NO_EMULATION
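To give these values some structure, the following Java sketch shows how such a design space point could be held in a plain class before being serialized to XML; the field names and their pairing with the values above are assumptions made for illustration only and do not reproduce NoCMaker's actual class or XML schema.

// Illustrative sketch only: a design space point holder whose field names are
// assumptions, not NoCMaker's actual schema.
public class NocDesignSpacePoint {
    public String topology       = "MESH";
    public String routingAlgo    = "XY";
    public String routingImpl    = "DISTRIBUTED";
    public String switching      = "WORMHOLE_SWITCHING";
    public String flowControl    = "FOUR_PHASE_HANDSHAKE";
    public String bufferLocation = "INPUT";
    public String clocking       = "GLOBAL_CLOCK";
    public String busInterface   = "AVALON";
    public String processor      = "NIOS2";
    public String application    = "J2EMPI_PICOMPUTING";
    // Numeric sizing parameters (FIFO depth, flit width, packet size, mesh
    // dimensions, ...) would complete the design point; they are omitted here.

    public static void main(String[] args) {
        NocDesignSpacePoint p = new NocDesignSpacePoint();
        System.out.println(p.topology + " / " + p.routingAlgo + " / " + p.switching);
    }
}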

As shown in Figure 2.10 on the facing page, once NoCMaker parses the input XML specification, the logic builder assembles and interconnects instances of different objects, according to the selected NDSP, in order to create a final global object containing the NoC-based model. A complete set of NoC elements held in the repository, such as switches, processor models, traffic generators, different NIs, etc., is supplied to the NoCMaker logic builder to generate a virtual prototype of the final NoC-based MPSoC. Once the NoC model is created by NoCMaker, the hardware designer can estimate the area by analyzing the circuit hierarchy.


Figure 2.10: NoCMaker design flow (synthesis and metrics extraction)

Furthermore, if a traffic pattern, benchmark or high-level application is simulated, NoCMaker will provide information about the performance (i.e. throughput and latency), as well as the dynamic power and energy consumption. Thus, NoCMaker can be used for NoC design space exploration. During design space exploration multiple NDSPs will be created, and if the requirements are not fulfilled, the above process will be repeated, changing the XML specification to design a new point. Sometimes we will be able to come up with an optimal design that minimizes or maximizes our metrics, but quite often we will face trade-offs. Finally, as the result of the design space exploration process to select the optimal architecture, NoCMaker is able to generate a netlist according to the NDSP. Although JHDL only includes an EDIF netlister, we have developed a Verilog RTL netlister, so that the resulting implementation is more technology independent (FPGA or ASIC). Several NoCs generated by NoCMaker have been synthesized as the interconnect of an MPSoC based on the Nios II soft-core processor on Altera FPGAs, fulfilling the need for a correct-by-construction prototyping approach. Thus, NoCMaker is able to evaluate NoC performance through simulation, to validate the system thanks to the rapid prototyping of multiple NoC-based MPSoC architectures, and finally, to generate Verilog RTL code that can be used to build a real system prototype.
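The iterative exploration just described can be pictured with the following Java sketch of a selection loop over a set of NDSP descriptions; the interfaces and method names (Builder, NocModel, Metrics, explore) are hypothetical and do not correspond to NoCMaker's actual API.

import java.util.List;

// Illustrative sketch of a design space exploration loop around a NoC generator and
// simulator. The interfaces and method names below are hypothetical; they do not
// reproduce NoCMaker's actual API.
public final class ExplorationLoop {
    interface Metrics  { double avgLatency(); double throughput(); double area(); }
    interface NocModel { Metrics simulate(String workload); }
    interface Builder  { NocModel build(String ndspXmlFile); }

    /** Returns the NDSP description that meets a latency bound with the smallest area. */
    static String explore(Builder builder, List<String> ndspXmlFiles,
                          String workload, double latencyBound) {
        String best = null;
        double bestArea = Double.MAX_VALUE;
        for (String xml : ndspXmlFiles) {
            Metrics m = builder.build(xml).simulate(workload); // build and simulate one NDSP
            if (m.avgLatency() <= latencyBound && m.area() < bestArea) {
                best = xml;
                bestArea = m.area();
            }
        }
        return best; // null means no explored point fulfils the requirements
    }

    public static void main(String[] args) {
        // Toy stub standing in for the real generator/simulator, just to exercise the loop.
        Builder stub = xml -> workload -> new Metrics() {
            public double avgLatency() { return xml.length() % 7 + 1; }
            public double throughput() { return 1.0; }
            public double area()       { return xml.length(); }
        };
        System.out.println(explore(stub, List.of("mesh4x4.xml", "torus4x4.xml"), "pi", 5.0)); // mesh4x4.xml
    }
}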

2.4.2 NoC modeling

NoC modeling in NoCMaker is done either by specifying the XML presented before in Listing 2.1 or by using a very intuitive step-by-step wizard GUI, which lets hardware designers choose between different options at

the moment of designing the NoC-based MPSoC. To model NoC-based systems, in this work we use JHDL [130, 131]. JHDL is a design environment that provides a Java API to describe hardware circuits at RTL in a constructive way, as well as a collection of tools and utilities for their simulation and hardware execution. JHDL provides an integrated simulation environment with several facilities for exercising and viewing the state of the circuit during simulation. The simulator includes a hierarchical circuit browser with a tabular view of signals, a schematic viewer that includes the values of signals, a waveform viewer, a memory viewer and a command line interpreter. Moreover, the command line interpreter can be extended with custom commands that interact with the NoC-based model. Thus, more complex hybrid testbenches can be developed dynamically. There are some good reasons to use JHDL as a modeling language for NoCs:

• Circuits can dynamically change their interface using the addPort() and removePort() methods. This unique feature allows efficient design of NoC components, especially switches, which can get rid of unused elements or ports before synthesis.

• Block construction can be parametrized by complex arguments (like our NDSP). This feature simplifies the context information needed by the module creators and enables the design of generic traffic generators.

• Custom state viewers can be developed easily so that they provide a much richer interpretation of the NoC-based system than waveforms.

• Simulation is interactive, and can be done step by step in a cycle-accurate manner, as in software debugging.

• It is cross-platform, since it uses Java technology.

To define a design point in NoCMaker we follow these steps:

(1) Selection of one of the topologies implemented in the NoCMaker builder, i.e. Mesh, Torus or Spidergon; a model of a partial matrix fabric, similar to AMBA AHB ML [27] systems, can also be selected (see Figure 2.11(a) on the next page).

(2) Assignment of a deterministic or adaptive routing protocol (see Figure 2.11(b) on the facing page) for Mesh or Torus topologies.


(3) Selection of circuit switching², or wormhole packet switching (see Figure 2.11(c)).

(4) If buffers are required, depending on the switching protocol, their size and position can be configured (see Figure 2.11(d)).

(5) As shown in Figure 2.11(e), the channel arbitration can be selected among fixed priority, round-robin and weighted round-robin.

(6) Selection of the NoC flow control, mainly based on stall&go or 4-phase handshake (see Figure 2.11(f)).

Figure 2.11: Definition of a NoC design point in NoCMaker: (a) topology selection, (b) routing protocol, (c) switching technique, (d) buffering options, (e) switch channel arbitration, (f) flow control

Once the NoC backbone is modeled, several abstract processors or traffic generators can be included, as well as simple NI models compatible with multiple standard protocols (such as FSL [132], AMBA AHB [14], Avalon [19], ...), to create a full-featured NoC-based MPSoC simulation model.

²We developed a variant of traditional circuit switching called Ephemeral Circuit Switching [93], which is a bufferless, extremely low-cost scheme, as shown in Section 2.5 on page 52.


2.4.3 Metrics Extraction according to Design Space Points

In order to compare different NoC designs, we have to extract some metrics from the simulated NDSPs, tested by using the flow shown in Figure 2.10 on page 39. For accuracy, these metrics are extracted from the NoC JHDL RTL model. In NoC-based systems there are some values that must be extracted to identify the efficiency of the final design:

• Overall system performance (mainly throughput and latency).

• Area usage (LUTs or gates).

• Power (mW) and energy consumption (mJ).

NoC-based MPSoC architectures require high performance, low latency, energy efficiency and low area overhead. In NoCMaker, we try to give an estimation of these metrics, but we are not willing to create extremely accurate and efficient models to compute them. In fact, in design space exploration it is not crucial that the estimations match exactly the observed real values; the most important issue is that the proportionality is maintained. The target is just to give an overall approximation so as to be able to distinguish and evaluate the trade-offs between NoC systems.

Performance Metrics

At NoC level, the performance is mainly measured in terms of throughput and latency. To compute these metrics we follow the definitions presented in [133, 134]. The throughput is the maximum amount of information delivered per time unit. It can also be defined as the maximum traffic accepted by the network, where traffic, or accepted traffic, is the amount of information delivered per time unit (bits/sec). Throughput can be measured in messages per second or messages per clock cycle, depending on whether absolute or relative timing is used. In NoCMaker, the Total Throughput (TT) is computed as shown in Equation 2.1:

TT = Transmitted bits / (TotalTime × Num_cpus)    (2.1)

where Transmitted bits can be the packet width or, if the packet is subdivided into many flits, the flit width multiplied by the number of completed messages, and Num_cpus is the number of processors on the NoC-based system.


Latency is the time elapsed (see Equation 2.2) from when the message transmission is initiated (T_TX) until the message is received (T_RX) at the destination node. This general definition can be interpreted in many different ways. If only the NoC is considered, latency is defined as the time elapsed from when the message header is injected into the network at the source node until the last unit of information is received at the destination node. However, when a software message layer is also considered, latency is defined as shown in Equation 2.3, as the time elapsed from when the system call to send a message is initiated at the source node until the system call to receive that message returns control to the user program at the destination node.

L_i = T_{RX} - T_{TX} \qquad (2.2)

L_{sw} = \text{sender overhead} + \text{transport latency} + \text{receiver overhead} \qquad (2.3)

The latency of individual messages is often not important, especially when the study is performed using synthetic workloads. In most cases, the designer is interested in the average value of the latency (see Equation 2.4). However, during the execution of parallel programs the peak latency indicates that some messages are blocked for a long time, so it can also be useful to track it.

L_{avg} = \frac{\sum_{i=1}^{P} L_i}{P} \qquad (2.4)

In NoCMaker, we compute the total throughput and the mean NoC latency using Equations 2.1 and 2.4. These values are obtained by maintaining statistics of all the packets that travel through the NoC (called on-fly packets). Thus, when an initiator processor creates a packet, it registers its time stamp; when the receiver processor unpacks the packet, its latency can be easily computed. In NoCMaker, these time values are obtained from the global time, which in JHDL is easily retrieved using the method getTotalClockCount().

In terms of latency, NoCMaker offers statistics related to the overall network or between point-to-point components. Figure 2.12 on the next page shows a complete set of statistics between different source and destination nodes of the NoC system, displaying the mean latency and mean bandwidth, as well as the number of packets transmitted. On the other side, within the interactive simulation window (see Figure 2.13 on the following page), NoCMaker shows the overall mean latency and the total throughput, as well as the total task time (last reception).
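The on-fly packet bookkeeping described above can be sketched as follows (a simplified illustration with hypothetical names; in NoCMaker the current time would come from JHDL's getTotalClockCount(), here it is simply passed in):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the on-fly packet bookkeeping used to compute Equations 2.2 and 2.4.
// Names are illustrative, not the actual NoCMaker classes.
public final class LatencyStats {
    private final List<Long> latencies = new ArrayList<>();

    /** Tag a packet with its injection time when the initiator creates it. */
    public long stampOnInjection(long now) {
        return now; // stored inside the packet as its time stamp
    }

    /** When the receiver unpacks the packet, record Li = T_RX - T_TX (Equation 2.2). */
    public void onReception(long injectionStamp, long now) {
        latencies.add(now - injectionStamp);
    }

    /** Mean NoC latency over the P received packets (Equation 2.4). */
    public double averageLatency() {
        long sum = 0;
        for (long l : latencies) sum += l;
        return latencies.isEmpty() ? 0.0 : (double) sum / latencies.size();
    }
}
```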


Figure 2.12: Point-to-point statistics and traffic analysis in NoCMaker

Total throughput and mean latency give a global view of the NoC performance, but it is also interesting to have detailed information about each transmission and reception at link level. This kind of information helps us to understand the load in different parts of the NoC, enabling the potential detection of hotspots and bottlenecks. A link load calculator is attached to each NoC link so that the traffic traversing the link can be presented graphically, as shown in Figure 2.13.


Figure 2.13: Interactive simulation window in NoCMaker

This approach has been previously applied in [135], but NoCMaker shows this information during interactive simulations, which helps a better understanding of the traffic dynamics of the NoC-based system.

In addition, Figure 2.13 on the facing page shows the four phases of the handshake flow control. Using the interactive simulation window, we can track how packets and flits traverse NIs and switches in the NoC, how the channels are established and released, and we can detect potential communication issues, such as deadlocks and livelocks, at network or application level. In the figure, each switch is drawn as a box with its bidirectional ports. A line inside the box connecting an input and an output indicates that this route has been established. A color code informs about the state of each link according to the flow control (i.e. idle, request, ack, release).

NoC Area Usage

Unlike performance analysis, area usage can be obtained without running a simulation. There are two main approaches to measure it. The first is to use hardware synthesis, but place and route tools are quite slow, especially for very complex designs. The second approach is to synthesize some basic elements and infer the whole system area from the observed values.

Even though JHDL lets the programmer use TLM-style modeling, as in SystemC [136], it encourages the use of the structural level when designing a circuit, while still being usable at higher levels of abstraction. In NoCMaker, designers normally instantiate very simple logic elements like gates and registers to build larger blocks, which can in turn be instantiated and reused to build even larger blocks. The result is a hierarchical logic circuit containing a concrete number of basic elements. Area usage is usually counted in number of LUTs for FPGAs and number of gates for ASICs. Thus, if a precise profiling of each basic element is done by assigning a fixed number of LUTs or gates to it, the total area can be computed by summing up this information along the hierarchy of the circuit.

This kind of measurement is quite simplistic and should not be used to predict the real area occupancy of the final design, because the synthesis tools perform resource sharing and trimming of unnecessary logic, and therefore the area cost can be overestimated. Moreover, FPGAs have different LUT architectures (6-input, 4-input, etc.). Nevertheless, to be useful for design space exploration, it is only necessary that the early area estimations of different designs are proportional to the area usage of their real synthesized versions.

A comparison between the resource estimation and the real resource needs of the different routers of a Mesh NoC using ephemeral circuit switching architectures is shown in Figure 2.14(a). The results have been normalized to eliminate the effect of the disparity between the absolute values of the estimated and synthesis results. Although resource estimations can be far from the real observed resource needs, they still maintain the same proportion, with a maximum worst case error of 15%.

(a) Predicted and real area usage (b) NoC area usage window in NoCMaker

Figure 2.14: Area usage window in NoCMaker

NoCMaker includes an area usage estimation window (see Figure 2.14(b)) which shows a hierarchical report of the area costs of NoC elements, mainly NIs and routers, in terms of LUTs and equivalent gates.
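The hierarchical summation of per-primitive costs described above can be sketched as follows (a simplified model with placeholder costs, not NoCMaker's actual classes):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of hierarchical area estimation: leaf primitives carry a fixed
// LUT/gate cost and composite blocks sum the costs of their children.
// The costs used below are placeholders, not a calibrated technology profile.
abstract class AreaNode {
    abstract int luts();
    abstract int gates();
}

final class Primitive extends AreaNode {
    private final int luts, gates;
    Primitive(int luts, int gates) { this.luts = luts; this.gates = gates; }
    int luts()  { return luts; }
    int gates() { return gates; }
}

final class Block extends AreaNode {
    private final List<AreaNode> children = new ArrayList<>();
    Block add(AreaNode child) { children.add(child); return this; }
    int luts()  { int s = 0; for (AreaNode c : children) s += c.luts();  return s; }
    int gates() { int s = 0; for (AreaNode c : children) s += c.gates(); return s; }
}

class AreaDemo {
    public static void main(String[] args) {
        // Toy "switch" made of two profiled primitives (hypothetical costs).
        Block toySwitch = new Block()
            .add(new Primitive(1, 6))   // e.g. a LUT-mapped gate
            .add(new Primitive(1, 8));  // e.g. a flip-flop
        System.out.println(toySwitch.luts() + " LUTs, " + toySwitch.gates() + " gates");
    }
}
```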

Power and Energy Consumption

The calculation or estimation of energy consumption is more complex than the evaluation of performance, since many factors of the underlying technology (90nm, 65nm, 40nm on ASICs or FPGAs) should be taken into account. This metric is also more complex to compute than area costs and performance metrics: we basically need the whole NoC-based system at RTL (as for the area usage estimation) together with the dynamic information according to the traffic patterns or application (as in the performance analysis). Thus, simulation is needed to obtain the activity of each wire and component of the NoC.

Several approaches have been proposed to forecast the energy and power consumption: (i) a straight approach is to use the technology provider's power analysis tools on the NoC circuit and a set of stimuli to compute the activity of the wires, as used in PIRATE [137], (ii) analyze parts of the circuit and infer a cost function as in [134,138], or (iii) use a predefined library such as Orion [139,140] or INTACTE [141].


In NoCMaker, we consider only the dynamic power (see Equation 2.5), which is an important part of the power consumption, together with leakage and static power. The dynamic power consumption can be computed by the expression presented in Equation 2.6, derived from the classical Equation 2.5, which describes the power on a wire.

P_{wire} = \frac{1}{2} \cdot \alpha \cdot C_{wire} \cdot V_{dd}^{2} \cdot f \qquad (2.5)

P_{NoC_{dyn}} = V_{dd}^{2} \cdot \sum_{i=1}^{n} C_i \cdot \alpha_i \qquad (2.6)

The load capacitance (C_i) is determined by the gate and drain capacitance (C_g) of all nodes connected to each node, and by the wiring capacitance among them. The fan-out (F_i) is the number of nodes connected to the output of node i. For intra-switch wiring we make the assumption that wire length is proportional to the fan-out or, in other words, that each connected node is attached directly to its driving node via a non-shared wire with capacitance C_w. Clock distribution makes a major contribution to the overall power and energy consumption, so clock wires have a different capacitance (C_clk). The capacitance of inter-switch wires is determined by C_link. In NoCMaker, C_g, C_w, C_clk and C_link are parameters of the design space. F_i is obtained by analyzing the circuit structure hierarchy, whereas α_i is found during the simulation thanks to the wire.getTransitionCount() method. This simple approach can be used to compare different NoCs or even to create a power breakdown of the system. The final formula to compute the load capacitance is described in Equation 2.7.

C_i = F_i \cdot (C_g + C_w) \qquad (2.7)

When flits or packets are transmitted/received on the NoC along multiple hops, both the inter-switch wires and the logic gates in the switches and NIs toggle, and this results in energy dissipation. In NoCMaker, we are concerned with the dynamic energy dissipation caused by the message passing. Consequently, we determine the energy dissipated by the flits in each interconnect and switch hop. The energy per flit per hop is given by the formula presented in Equation 2.8.

E_{hop} = E_{switch} + E_{interconnect} \qquad (2.8)


where E_{switch} and E_{interconnect} depend on the total capacitance and signal activity of the switch and of each section of interconnect wire, respectively (see Equations 2.9 and 2.10). The capacitances C_{switch} and C_{interconnect} are calculated in a similar way as presented before, using Equation 2.7.

E_{switch} = \alpha_{switch} \cdot C_{switch} \cdot V_{dd}^{2} \qquad (2.9)

E_{interconnect} = \alpha_{interconnect} \cdot C_{interconnect} \cdot V_{dd}^{2} \qquad (2.10)

If the profiling option to compute the energy and power consumption is enabled, NoCMaker pops up a window with all the statistics in a hierarchical view.

Figure 2.15: Overall power usage window statistics in NoCMaker
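The dynamic power and energy model of Equations 2.6–2.10 can be sketched as follows (a simplified illustration with hypothetical names; in NoCMaker the activity factors come from wire.getTransitionCount() and the fan-out analysis described above, and the capacitance values are design-space parameters, not calibrated library data):

```java
// Sketch of the dynamic power/energy model of Equations 2.6-2.10.
// Names and parameters are illustrative, not NoCMaker's actual code.
public final class DynamicPowerModel {
    private final double vdd, cg, cw;   // supply voltage and per-node capacitances

    public DynamicPowerModel(double vdd, double cg, double cw) {
        this.vdd = vdd; this.cg = cg; this.cw = cw;
    }

    /** Equation 2.7: Ci = Fi * (Cg + Cw). */
    public double loadCapacitance(int fanOut) {
        return fanOut * (cg + cw);
    }

    /** Equation 2.6: P = Vdd^2 * sum(Ci * alpha_i), alpha_i from the transition counts. */
    public double dynamicPower(int[] fanOuts, long[] transitionCounts, long totalCycles) {
        double sum = 0.0;
        for (int i = 0; i < fanOuts.length; i++) {
            double alpha = (double) transitionCounts[i] / totalCycles; // switching activity
            sum += loadCapacitance(fanOuts[i]) * alpha;
        }
        return vdd * vdd * sum;
    }

    /** Equations 2.8-2.10: energy per flit per hop. */
    public double energyPerHop(double alphaSwitch, double cSwitch,
                               double alphaLink, double cLink) {
        double eSwitch = alphaSwitch * cSwitch * vdd * vdd;   // Equation 2.9
        double eLink   = alphaLink   * cLink   * vdd * vdd;   // Equation 2.10
        return eSwitch + eLink;                               // Equation 2.8
    }
}
```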

2.4.4 Modeling Traffic Patterns and Benchmarks

Using the XML definition, NoCMaker generates a cycle accurate model of the whole system by attaching different processor models or traffic generators to each tile of the NoC. In NoCMaker, the workload model of the traffic generators is basically defined by three parameters: distribution of destinations, injection rate, and message length. As shown in Figure 2.16(a) on page 50, we can tune these parameters and define different custom traffic patterns very easily through a GUI or by importing a *.csv stimulus file.

NoCMaker supports two types of execution models: infinite or finite traffic patterns. With infinite traffic patterns, the benchmark is executed until the user observes that the system has reached a steady state. With finite ones, the task or the traffic pattern itself is executed until its runtime completion. NoCMaker includes some simple but representative finite traffic patterns, ranging from synthetic workloads, broadcast, narrowcast and cache coherence write-through token pass protocols, to synchronization patterns using barriers.

Moreover, specific temporal locality communication patterns between pairs of nodes have been implemented to evaluate the performance of the interconnection: universal, bit reversal, perfect shuffle, butterfly, matrix transpose and complement. These communication patterns take into account the permutations that are usually performed in parallel numerical algorithms. In these patterns, the destination node for the messages generated by a given node is always the same, and the utilization factor of the network links is not uniform. However, these distributions achieve the maximum degree of temporal locality. These communication patterns, extracted from [133], can be defined as follows (a small code sketch of the permutations is given after the list):

• Universal: all nodes communicate with all the other nodes.

• Bit reversal: the node with binary coordinates a_{n-1}, a_{n-2}, ..., a_1, a_0 communicates with the node a_0, a_1, ..., a_{n-2}, a_{n-1}.

• Perfect shuffle: the node with binary coordinates a_{n-1}, a_{n-2}, ..., a_1, a_0 communicates with the node a_{n-2}, a_{n-3}, ..., a_0, a_{n-1} (rotate left 1 bit).

• Butterfly: the node with binary coordinates a_{n-1}, a_{n-2}, ..., a_1, a_0 communicates with the node a_0, a_{n-2}, ..., a_1, a_{n-1} (swap the most and least significant bits).

• Matrix transpose: the node with binary coordinates a_{n-1}, a_{n-2}, ..., a_1, a_0 communicates with the node a_{n/2-1}, ..., a_0, a_{n-1}, ..., a_{n/2}.

• Complement: the node with binary coordinates a_{n-1}, a_{n-2}, ..., a_1, a_0 communicates with the node \overline{a}_{n-1}, \overline{a}_{n-2}, ..., \overline{a}_1, \overline{a}_0 (i.e. every address bit inverted).
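The sketch below makes the permutations above concrete by computing the destination identifier for a given source node (an illustrative helper, not NoCMaker's traffic-generator code; bit a_0 is the least significant bit):

```java
// Destination permutations over n-bit node identifiers (a0 is the LSB).
// Illustrative helper only, not the actual NoCMaker traffic-generator classes.
public final class Permutations {
    /** Bit reversal: a_{n-1}...a_0 -> a_0...a_{n-1}. */
    public static int bitReversal(int src, int n) {
        int dst = 0;
        for (int i = 0; i < n; i++) dst |= ((src >> i) & 1) << (n - 1 - i);
        return dst;
    }

    /** Perfect shuffle: rotate the identifier left by one bit. */
    public static int perfectShuffle(int src, int n) {
        int msb = (src >> (n - 1)) & 1;
        return ((src << 1) | msb) & ((1 << n) - 1);
    }

    /** Butterfly: swap the most and least significant bits. */
    public static int butterfly(int src, int n) {
        int msb = (src >> (n - 1)) & 1, lsb = src & 1;
        int dst = src & ~((1 << (n - 1)) | 1);
        return dst | (lsb << (n - 1)) | msb;
    }

    /** Matrix transpose: swap the upper and lower halves of the identifier (n even). */
    public static int matrixTranspose(int src, int n) {
        int half = n / 2;
        int low = src & ((1 << half) - 1);
        return (low << half) | (src >> half);
    }

    /** Complement: invert every address bit. */
    public static int complement(int src, int n) {
        return ~src & ((1 << n) - 1);
    }
}
```

For example, in a 16-node network (n = 4), bitReversal(0b0010, 4) returns 0b0100.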

Beyond these, the designer can easily extend the simulator to support custom traffic patterns. We extended the simulator to attach different processor models or traffic generators to each NoC tile. In addition, we extended them to support parallel computing using the message passing parallel programming model, as we will show in Section 6.5 on page 146. In NoCMaker, it is possible to execute parallel programs using MPI [142]. Thus, we developed j2eMPI, a Java MPI-like library to allow embedded parallel computing on NoC-based MPSoCs (further information is presented in Section 6.5 on page 146).


(a) Custom traffic patterns (b) Pre-defined traffic patterns

Figure 2.16: Custom/pre-defined synthetic traffic patterns in NoCMaker

2.4.5 System and Circuit Validation

The building blocks used to design a NoC backbone are mainly NIs and switches. These circuits can be fairly simple and easy to verify independently. However, when several modules are combined to form a full-featured NoC, the verification becomes quite complex. A simple 4x4 mesh NoC, for instance, has 80 communication links among the switches. If each link carries a 32-bit flit and two control bits, the number of wires that interconnect the switches is 80 × (32 + 2) = 2720. Analyzing the behavior of such a system at the inter-router level with classical waveform analysis becomes tough because, even after grouping the 32 data bits of each link, there are still 80 × 3 = 240 waveforms to look at. Moreover, the network dynamics are crucial in the appearance of errors. Congestion can stall packets along different routers, or an error could be triggered only when a very specific traffic pattern occurs in a certain region of the NoC. A set of techniques is proposed to tackle this high complexity.

(i) Registering the time of the packet transmission and reception events in the processor models allows performance information, like point-to-point mean latency and bandwidth, to be computed, as shown before in Figure 2.12 on page 44, but also allows the packet communication to be presented in a timeline (see Figure 2.18 on page 52), so that the cause of large latencies can be identified.

(ii) The graphical view at TLM level of the NoC topology in Figure 2.13 on page 44 can be used to give quick information about the routing decisions and the flow control status, which are two common causes of NoC malfunctions.

(iii) Detailed information on the internal circuit state can be visualized with the classic JHDL facilities, the circuit browser, the interactive schematic view (Figure 2.17(a)) and the waveform viewer (Figure 2.17(b)), to verify the correctness of the simulation process. All these features present the values of the signals on the circuit wires in real time during a simulation.

(iv) Assertions can be inserted in behavioral circuits so that the correctness of the system is constantly supervised. If some incorrect state occurs, the assertion code stops the simulation and pops up a message giving the designer the necessary hints to identify the cause of the error.
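As a generic illustration of point (iv) (not JHDL's actual assertion facility; names are hypothetical), an assertion boils down to a check evaluated on every simulated cycle that aborts the run with a hint:

```java
// Generic sketch of a simulation-time assertion (not the JHDL API): the check
// is evaluated every clock cycle and stops the run with a hint when an illegal
// state is detected.
public final class SimAssert {
    public static void check(boolean condition, long cycle, String hint) {
        if (!condition) {
            throw new IllegalStateException("Assertion failed at cycle " + cycle + ": " + hint);
        }
    }
}

// Example use inside a behavioral model's clock callback (creditCounter is a
// hypothetical state variable of that model):
//   SimAssert.check(creditCounter >= 0, cycle, "negative credit count on output port 2");
```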

(a) Circuit browser window (b) Waveform viewer

Figure 2.17: Verification of NoC-based systems using NoCMaker

An additional feature, used in conjunction with the presented verification tools, is the possibility to advance one or more clock cycles in the simulation, together with the software application, using the interactive simulation window (see Figure 2.13 on page 44). Combining all these techniques, the designer can detect problems at a high level and successively descend to deeper levels of detail to identify and correct the root causes of errors.

Another feature developed in NoCMaker to allow system-level validation at a TLM-like level is the packet sequence analyzer. As shown in Figure 2.18 on the following page, the packet sequence analyzer tracks transmitted and received messages on the NoC. This feature is especially interesting for detecting application-level message deadlocks/livelocks when message passing applications run on top of the NoC. In Figure 2.18(a) on the next page, we show the sequence analysis of a simulation running a perfect shuffle traffic pattern. In Figure 2.18(b), we depict the traffic generated by a j2eMPI Barrier, where the root processor of the Mesh is the node with coordinates (2,0).

(a) Perfect shuffle traffic pattern (b) Barrier traffic pattern

Figure 2.18: Packet sequence analysis window in NoCMaker

2.5 Experiments using xENoC – NoCMaker

In this section we use NoCMaker to compare a variant of circuit switching against wormhole packet switching NoCs. All the synthesized NoC-based MPSoC systems are based on the Nios II processor [20]. Even though NoCMaker can be used for design space exploration, in this dissertation the aim is only to show the rapid prototyping of NoC fabrics through NoCMaker, and how these IP core fabrics can be effortlessly integrated in Altera's design flow. Thus, in these experiments we integrate our NoC fabric generated by NoCMaker together with multiple Nios II cores on an EP1S80 FPGA. As shown in Figure 2.19 on the next page, our system is composed of:

• Standard Nios II soft-core processor.

• Customized NoC fabric generated with NoCMaker.

• On-chip memory.

• Profiling peripherals (i.e. timers, performance counters).

• Debug peripherals (i.e. UARTs/JTAGs).


From a software point of view, we integrate our software stack (further information in Section 6.5.5 on page 153). Optionally, an embedded OS or micro-kernel, such as ucLinux [143] or eCos [144], can also be integrated in the software stack, but this is not necessary and is out of the scope of this dissertation.


Figure 2.19: Nios II NoC-based MPSoC architecture

2.5.1 Synthesis of Nios II NoC-based MPSoC architectures

In NoCs the main forwarding techniques are circuit switching and packet switching. In this section we quantify the area overhead between both switching techniques. Figure 2.20 shows the differences between pure circuit switching and our ephemeral circuit switching. Ephemeral circuit switching is a variant of the pure circuit switching scheme in which some data is already transmitted with the open-channel signaling packet during the set-up of the channel, and the channel is released automatically at the end of each packet.


Figure 2.20: Circuit switching vs. ephemeral circuit switching


We have applied NoCMaker to generate different regular topologies using ephemeral circuit switching [93] and wormhole packet switching. In Figure 2.21, we present the area results of a Nios II NoC-based MPSoC, using a 32-bit flit width, minimal buffering (8 slots), and the 4-phase handshake protocol on the packet switching NoCs. As expected, and as also predicted by the early area estimation report in NoCMaker, the 2D-Torus packet switching NoC is the most expensive in terms of area usage, because every router is connected to all its neighbors, whereas the 2D-Mesh ephemeral circuit switching NoC is the cheapest.

(a) 2D-Mesh packet switching (b) 2D-Torus packet switching

(c) Spidergon packet switching (d) 2D-Mesh ephemeral circuit switching

Figure 2.21: Synthesis (in LEs) of Nios II NoC-based MPSoCs

In fact, doing a fair comparison between our 2D-Mesh ephemeral circuit switching and packet switching in Figure 2.21(a) and Figure 2.21(d), it is easy to notice that our bufferless variant of the circuit switching scheme uses around 15% less area than the equivalent packet switching. Spidergon, in Figure 2.21(c), is in between, having a similar area usage to Meshes under equivalent configurations.

Another important metric is the ratio between the communication and the computational resources. The area breakdown of a 3x3 Mesh circuit switching Nios II-based MPSoC is shown in Figure 2.22. The cores and peripherals occupy 80% of the area, while the area related to the interconnection fabric (switches and NIs) is 20%. In an equivalent 2D-Mesh packet switching NoC (see Figure 2.21(a) on the preceding page), this ratio rises to 34% for the NoC backbone and 66% for the computational tiles.

Figure 2.22: Area breakdown of our Nios II NoC-based MPSoC

Despite the low area costs of circuit switching schemes, in terms of area savings and null buffer requirements, this architecture is generally worse than packet switching in terms of flit/packet contention and throughput. Hence, the selection between these two switching schemes will mainly depend on high-level system and application factors, such as traffic statistics (e.g. contention probability) and the application throughput requirements, apart from the availability of physical resources.

2.6 Conclusion and Future Work

We have proposed a cross-platform, rapid and robust experimental NoC-based environment to create multiple interconnection fabrics for next-generation MPSoCs. The environment is based on a repository of NoC components, enabling the reuse of communication models, which can be effortlessly extended.

NoCMaker is able to generate HDL models of the specified NoC from a simple but powerful XML specification. This allows us to explore and test multiple NoC backbones through simulation and verification, and by extracting early estimations of area costs, power consumption and overall system performance (in terms of total throughput and mean latency).

As we have shown, the HDL models can be integrated effortlessly in the Altera design flow [145] to create a full-featured prototype of a parallel Nios II NoC-based MPSoC architecture. This is just a particular case study: the same HDL NoC model can also be included as an IP interconnection fabric in AMBA Designer [109], within the Xilinx design flow [146], as in the work presented in [147], or in Cadence SoC Encounter [148] to perform synthesis and P&R on different ASIC technologies.

The xENoC environment, in conjunction with NoCMaker, is a stable framework that gives us the chance to do deeper research in NoC design. In addition, xENoC and NoCMaker bring together many topics and features already presented in previous related work, but they also extend the design process to include software development by adding software components, such as j2eMPI (see Section 6.5 on page 146).

The design of NoC systems is difficult because of the huge NoC design space. Moreover, the simulation and verification of a complete NoC-based system can be very tedious, because even a relatively small NoC can have hundreds of point-to-point wires. For all of these reasons, EDA tools like NoCMaker are definitely necessary to effectively develop and verify NoCs at system level.

Since embedded systems are sometimes application specific, the next step to improve our xENoC EDA framework will be the inclusion of irregular topologies in NoCMaker. A priori, this is an easy step by means of the source routing scheme. In that case, apart from the XML, we will need a complete graph representation of the topology as an input. This will enable another degree of freedom in the NoC design space when performing a full design space exploration, by generating more and more NDSPs and selecting them according to input constraints. Other directions are to extend the NoC models to support virtual channels [72,98].

CHAPTER 3

Full-Featured AMBA AHB Network Interface OCP Compatible

This chapter¹ shows the design of a full-featured AMBA 2.0 AHB Network Interface (NI) for the ×pipes NoC library. This work also extends the packet format to ensure traffic compatibility and transparent plug&play of AMBA 2.0 AHB and OCP-IP cores in the NoC architecture. This development will enable the future design of heterogeneous NoC-based systems using multiple protocols. Finally, an evaluation of the AHB NI is reported, comparing the presented AMBA 2.0 AHB NI with the existing OCP NI of the ×pipes NoC library, in terms of area costs and circuit performance (i.e. fmax), by means of physical synthesis on FPGA platforms.

3.1 General Concepts of ×pipes NoC library

In this section we briefly introduce the ×pipes NoC library [52,53] in order to ease the understanding of the rest of the chapter. ×pipes is based on a set of architectural choices:

• A library of fully synthesizable components which can be parameterized in many respects, such as buffer depth, data width, arbitration policies, etc. These components have been designed for lowest area and power consumption and minimum latency.

• The NIs support the OCP-IP socket standard [12]. However, in this work the functionality is extended to be compatible with the AMBA 2.0 AHB standard [14], as shown in Section 3.5 on page 67.

¹ The author would like to acknowledge the contributions of Federico Angiolini.

• It is fully synchronous, but offers facilities to support multiple clock domains at the NIs by including integer frequency dividers. Nevertheless, mesochronous [57] and GALS approaches are still possible by inserting synchronizers at appropriate places in the NoC.

• Routing is static and determined in the NIs (source routing). No adaptive routing schemes are implemented, due to their complexity in terms of area and deadlock avoidance. Source routing minimizes the implementation cost while maximizing the flexibility and the potential customization for application-specific NoC systems.

• It adopts wormhole switching [72,133] as the only method to deliver packets to their destinations. It supports both input and/or output buffering [72,133] according to stall-and-go or handshake flow control protocols.

• QoS is provided at design time according to specific application traffic requirements defined by the application communication graph (e.g. point-to-point bandwidth) and the initial simulations and estimations. Runtime QoS features will be added and reported in Chapter 4 on page 85 of this dissertation, as a complement to the design-time QoS approach.

• It does not include explicit support for virtual channels [72,133]; instead, parallel links can be deployed between each pair of switches to resolve bandwidth issues.

3.2 Motivation and Challenges

The Virtual Socket Interface (VSI) Alliance [11] developed, at the end of the 90s, a standard interface to connect virtual components (VCs) to on-chip buses. Nowadays, a wide range of IP cores and bus-based systems and protocols (such as AMBA/AXI [14,32], OCP-IP [12], STBus [28], IBM CoreConnect [17], Avalon [19], etc.) adopt or extend this standard to interconnect complex SoCs and MPSoCs.

Nevertheless, since the NoC paradigm has been demonstrated to be a feasible and scalable solution to implement the interconnection fabric of highly composable systems, a generic NoC socket interface must be designed to integrate different subsystems and protocols. Nowadays, interconnects use multi-master and multi-slave protocols including shared, hierarchical, bridged and clustered buses of several types, partial and full crossbars, etc. To make NoC-based systems meaningful and attractive, the communication between these communication resources needs to be transparent, and the interface between a resource and an IP core needs to be standardized. This is a key challenge in NoC design today, together with providing the efficient and reliable communication services required from the application point of view.

A common VSIA-like or VC-like socket for NoCs must be standardized to enhance design productivity by plugging together off-the-shelf IP cores from different vendors and protocols, in order to design custom heterogeneous multi-protocol NoC-based MPSoCs. Moreover, it will enable the orthogonalization of computation and communication elements [149] for fast design space exploration and hierarchical verification.

As a consequence, a key challenge nowadays is how to design efficient NETWORK INTERFACES (NIs), or custom wrappers, which automatically convert foreign IP protocols to the NoC-specific protocol (see Figure 3.1(a) on the next page), but without adding too much overhead in terms of area and latency penalty. However, the overhead cost of including a NI is often considered negligible because it boosts reusability and productivity. When NoC packets or streams of flits are converted into core transactions, the NI must be intelligent enough to identify and adapt each protocol or type of transaction where required.

Similar to a computer network with layered communication protocols, the NoC relies on its own layered micro-network stack. This layered approach, based on the ISO-OSI protocol stack (see Figure 2.7 on page 33), allows the mixture of transactions of multiple protocols in a NoC-based system. Thus, the lowest four layers (i.e. transport layer, network layer, link layer and physical layer) are implemented in the NI backend (see Figure 3.1(a) on the next page) and are part of the NoC backbone and its specific protocol. The computational resources reside outside, thanks to the decoupling logic and synchronization. On the other side, the NI frontend shown in Figure 3.1(a) on the following page is responsible for supporting bus or core transactions of the standard node protocol (e.g. AMBA 2.0, AXI, OCP, etc.), as well as for providing a degree of parallelism and services at the session layer.


(a) Network interface (NI) overview (b) AMBA 2.0 AHB & OCP multi-protocol NoC-based system

Figure 3.1: NI overview on a multi-protocol NoC-based system

Despite this ideal view, many issues and problems arise. Some of these protocols, like OCP, support non-posted writes, a feature not present a priori in AMBA 2.0 AHB. Others, such as AXI, have independent read and write channels, further obscuring ordering constraints. Different semantics can also be found; for example, certain types of OCP byte enables are not supported in AMBA 2.0 AHB, while protected transactions are supported in AMBA 2.0 AHB but do not exist at all in OCP. In addition, exclusive reads and lock transactions directly impact the NoC transport level, because switches must take decisions about exclusivity and lock-unlock transactions. Thus, the network and physical levels are mixed with the transport level, and NoC services should be activated by a particular transaction packet. Therefore, even though it is not easy, NIs should handle all these issues as much as possible, enabling full-featured transactional compatibility at the session layer with many different protocols.

As shown in Figure 3.1(b), the outcome of this layered approach is the feasibility of creating a multi-protocol NoC-based system. Finally, the interaction of NIs with higher layers is definitely a key point to exploit NoC services, e.g. providing QoS, from the application level down to the NoC backbone, as explained in Chapter 4 on page 85.

3.3 AMBA & OCP-IP Interconnection Fabric Overview

In this section, we focus our attention on some of the best-known and most widely used bus-based on-chip communication fabrics and socket architectures:

• Advanced Microcontroller Bus Architecture (AMBA) [14].

• Open Core Protocol (OCP-IP) [12].

Thanks to its moderate silicon footprint, AMBA 2.0 is widely used as the fabric architecture of many of today's SoCs and MPSoCs. On the other side, OCP-IP succeeds because of its architectural flexibility and advanced features, enabling IP core creation independently of the communication architecture. Both fabric architectures are mainly used to boost productivity and plug&play IP integration, enabling IP reuse, but nowadays the challenge is to integrate them within a full-featured NoC-based system.

This section is not intended to explain both fabrics exhaustively, but rather to give a brief overview of both architectures. It will help the reader to understand the rest of the chapter, and it also shows the trend of on-chip fabrics, reporting incompatibilities and specific features.

3.3.1 AHB/AXI Architecture Overview

The Advanced Microcontroller Bus Architecture (AMBA) 2.0 [14] interconnect distinguishes three types of buses:

• Advanced High-performance Bus (AHB).

• the Advanced System Bus (ASB).

• the Advanced Peripheral Bus (APB).

Despite this fact, AMBA 2.0 is considered a hierarchical bus based on two levels: the AHB bus acts as a high-performance backbone, whereas APB is a low-power, low-performance bus. Thus, AHB normally includes the high-performance cores, such as CPUs, on-chip and off-chip memories, memory controllers, DMA, and so on, while all the low-power, low-performance peripherals of the system, like UARTs, I/O, LCD controllers, etc., are attached to APB. Both levels are interconnected by an AHB-APB bridge.

AHB is a multi-master multi-slave bus, and therefore it relies on the existence of an arbiter and a decoder used to interpret the address of each transfer and provide a select signal (i.e. HSEL) for the slave that is involved in the transfer (see Figure 3.2(a) on the next page). Thus, a bus transaction is initiated by a bus master, which requests access from a central arbiter. The bus arbiter, if there are conflicts, ensures that only one bus master at a time is allowed to initiate data transfers. Even though the arbitration protocol is fixed, any arbitration algorithm, such as highest priority or fair access, can be implemented depending on the application requirements. Once the arbiter issues HGRANT to the master, the transfer starts. In AHB, transactions are pipelined to increase the effective bus bandwidth, and burst transactions are allowed to amortize the performance penalty of arbitration. Thus, all transactions are composed of an address phase, involving the address (i.e. HADDR) and control wires (i.e. HTRANS, HSIZE, HBURST, HWRITE), and of a data phase, involving the separate data buses. Two data buses, one for reads (i.e. HRDATA) and one for writes (i.e. HWDATA), are available, but only one of them can be active at any time.

To avoid bus starvation because of slow slaves, and therefore poor bus utilization, AMBA provides two mechanisms: (i) split/retry transfers, where a high-latency slave can optionally suspend a bus transaction while it is preparing the response, and (ii) early burst termination, where, if the arbiter detects a transaction that is too long, it can release the burst in progress and grant another master. In addition, locked transfers are also supported by AMBA 2.0 AHB if a master requires exclusive access to a specific slave. However, multiple outstanding transactions are supported only to a very limited degree using split transactions, where a burst can be temporarily suspended (by a slave request). New bus transactions can be initiated during a split burst.

(a) AMBA 2.0 AHB shared bus (b) AMBA 2.0 AHB multi-layer

Figure 3.2: Block diagram of AMBA 2.0 AHB (bus vs. multi-layer)

Despite this fact, AMBA 2.0 AHB can also be deployed in MPSoCs as a multi-layer AHB matrix (ML-AHB) or crossbar topology (see Figure 3.2(b)). Using this architecture as a backbone, multiple parallel transactions can take place between arbitrary masters and slaves of the system at the same time. The ML-AHB busmatrix consists of an input stage, a decoder and an output stage embedding an arbiter. The input stage is responsible for holding the address and control information when the transfer to a slave cannot commence immediately. The decoder determines which slave a transfer is destined for. The output stage is used to select which of the various master input ports is routed to the slave. Each output stage has an arbiter, which looks at which input stage has to perform a transfer to the slave and decides which one currently has the highest priority. Notably, the ML-AHB busmatrix uses a slave-side arbitration scheme, i.e. the master just starts a transaction and waits for the slave response to proceed to the next transfer.

An evolution of AMBA 2.0 is the AMBA Advanced eXtensible Interface (AXI) [32]. The AMBA AXI methodology brings a new step forward for high-bandwidth and low-latency design with backward AHB compatibility. Thus, AMBA AXI retains some useful pipelining, as in AMBA 2.0 AHB, and it adds novel features that benefit the design of larger chips requiring scalable interconnects at high frequency.

As shown in Figure 3.3(a), AMBA AXI is a fully point-to-point fabric consisting of five unidirectional parallel channels: (i) read address channel, (ii) write address channel, (iii) write data channel, (iv) read data channel, and (v) response channel. All channels are fully decoupled in time, which enables optimization by breaking timing paths as required to maximize clock frequency and minimize latency.

(a) AXI Point-to-Point scheme (b) AMBA 2.0 AHB vs. AXI transactions

Figure 3.3: AXI Overview, and AXI vs. AMBA 2.0 AHB transactions

With symmetrical point-to-point master and slave interfaces, AXI can easily be used for anything from simple directed point-to-point architectures to multi-hierarchy systems, reducing the gate count of handshakes thanks to the simplicity of point-to-point links, and hence the timing penalty when signals traverse complex SoCs. Any master or slave of the system is completely agnostic of the interconnection, without the need for the arbitration or decoder modules used in AHB. This approach allows seamless topology scaling compared with a simple AMBA 2.0 AHB hierarchical shared-bus backbone.


In AXI, a transaction implies activity on multiple channels. For instance, a read transaction uses the read address and read data channels. Because each channel acts independently, each transaction is labeled when it starts on the address channel. Thus, AXI supports multiple outstanding transactions (in order, as in AMBA 2.0 AHB, but also with out-of-order delivery, as shown in Figure 3.3(b) on the preceding page), enabling the parallel execution of bursts and resulting in greater data throughput than in AMBA 2.0 AHB. Moreover, to minimize the protocol information, single-address burst transactions generate multiple data transfers on the read/write channels. All these features facilitate both high performance and a better average use of the interconnection fabric, and enable low power, as tasks can complete in a shorter time.

3.3.2 OCP-IP Overview

The Open Core Protocol™ (OCP) defines a high-performance, bus-independent interface between IP cores that reduces design time, design risk, and manufacturing costs for SoC designs. OCP is a superset of VCI [11], additionally supporting configurable sideband control, and defines protocols to unify all inter-core communication. This protocol shares some features with AXI.

OCP defines a point-to-point interface between two communicating entities and is completely independent from the interconnection fabric. OCP cores can be tested with any bus-independent interface by connecting a master to a slave without an intervening on-chip bus. Thus, the use of decoders and arbiters is hidden from the cores and must be addressed by using a basic command field. This permits an OCP-compliant master to pass command fields across the OCP interface that the bus or NoC interface logic can convert, for instance, into an arbitration request, a decode command, etc.

To support concurrency and out-of-order processing of transfers, the extended OCP supports the notion of multiple threads. Transactions within different threads have no ordering requirements, and so can be processed out of order. Within a single thread of data flow, all OCP transfers must remain ordered.

OCP is based on two basic commands, Read and Write, but five command extensions are also supported (ReadEx, ReadLinked, WriteNonPost, WriteConditional, Broadcast). An interesting feature of OCP is that it offers posted writes (Write) or WriteNonPost, i.e. writes without or with a response, respectively. The other commands are basically used for synchronization between initiators. ReadExclusive is paired with Write or WriteNonPost and has blocking semantics. ReadLinked, used in conjunction with WriteConditional, has non-blocking (lazy) semantics. These synchronization primitives correspond to those available natively in the instruction sets of different processors.

OCP also supports pipelining and burst transfers. To support pipelining, the return of read data and the provision of write data may be delayed after the presentation of the associated request. On the other side, bursts can either include addressing information for each successive command (which simplifies the requirements for address sequencing/burst count processing in the slave), or include addressing information only once for the entire burst.

A specific feature that makes OCP flexible and extensible is the ability to pass core-specific in-band information on requests and responses, as well as on read and write data (e.g. interrupts, cacheability information, data parity, etc.).

3.4 Related work

NI design has received considerable attention for parallel computers [150–152] and computer networks [153,154]. These designs are optimized for performance (high throughput, low latency), and often consist of a dedicated processor and a large amount of buffering. As a consequence, their cost is too high to be applicable on chip.

In [155], different packetization strategies have been investigated: (i) software library, (ii) on-core module based and (iii) off-core wrapper based. The authors evaluate each approach mainly in terms of latency and area costs, and conclude that an off-core wrapper based approach is the best solution in terms of latency and area.

Thus, a large body of research work implements NIs. In [156], a NoC architecture is presented in which the router and the NI are integrated in a single block, called the communication interface. In [157], it is proposed to view the NoC as an interconnect intellectual property module. The NI of this NoC offers services at the transport layer, being responsible for message to/from packet conversion. An NI example offering a unidirectional 64-bit data-only interface to a CPU is presented.

The QNoC NI is explained in [87], offering a bus-like protocol with conventional read and write semantics; the authors also propose four service levels: (i) signaling, (ii) real-time, (iii) read/write and (iv) block transfers. A dual-NI is reported in [158]. In this work, the target is to improve the NoC architecture to be fault tolerant. Some network topologies and routing algorithms are presented to maximize packet delivery to all cores in case of a fault in the topology.


Regarding the OCP-IP socket standard, in [159,160] an OCP-compliant NI used in the MANGO clockless NoC architecture is reported. A similar work is presented in [52,53] on top of the ×pipes NoC library. In addition, very close to our work, in [161,162] AMBA 2.0 NIs are presented to interconnect AMBA-based IP cores. However, all these works lack the compatibility and interoperability needed to design heterogeneous multi-protocol systems.

Related work on multi-protocol NI design is not very common at the research level. This is probably because it is more an engineering task than a long-term research problem but, from the point of view of the author, it is definitely a key challenge to ensure backward compatibility and coexistence of different standard protocols, to enable IP core reuse, and also to enable NoC services at NI level. Thus, in [163] the author exposes and presents the idea of VC-neutral NoC design. This work introduces the necessity and the potential issues of designing full-featured NIs (named NIUs in that work) which are able to communicate between different standard protocols, such as AMBA 2.0, AXI, VCI and OCP. In fact, nowadays, most NoC companies, such as Arteris S.A. [110], iNoCs SaRL [111] and Sonics Inc [113], claim to have NIs capable of interacting with several protocols, but the degree of interoperability between them is unclear.

An interesting work from Philips Research Laboratories is reported in [164]. A complete NI compatible with multiple protocol standards, such as AXI, OCP and DTL [165], is presented, providing both guaranteed and best-effort services via connections on top of the Æthereal NoC. These services are configured via NI ports using the network itself, instead of a separate control interconnect.

The existing NI designs address one or more of the aspects of our NI, like low cost, high-level NoC services offering an abstraction of the NoC, and plug&play modular design and integration. However, in this work we present a full-featured AMBA 2.0 AHB NI fully compatible with OCP transactions. The NI design is targeted to be extremely slim (on the order of a hundred LUTs) and very low-latency. Furthermore, in this work we take into account the inherent incompatibilities of both standards (i.e. byte enables, alignments, protection, burst types, etc.).

Later, both NIs, AMBA 2.0 AHB and OCP, will be extended to enable software programmability of different QoS features at NoC level (i.e. guaranteed and best-effort services). The outcome will be a complete solution to configure NoC features through IP cores at runtime (see Chapter 4 on page 85).


3.5 Full-Featured AMBA 2.0 AHB NI for ×pipes NoC

As explained before in Section 3.3 on page 60, AMBA 2.0 (AHB-APB) is a hierarchical bus interconnected with an AHB-APB bridge. In this section we show the design of a full-featured AMBA 2.0 AHB NI, targeted at translating AHB transactions into NoC packets and flits.

The NI back-end works up to the transport layer (see Figure 2.7 on page 33), injecting/absorbing the flits leaving/arriving at the IP core, whereas the front-end is responsible for packing/unpacking the AHB signals coming from or reaching AHB-compatible IP cores into/from multiple messages/flits. In other words, the NI is in charge of converting multiple transactions (i.e. request/response packets) into network flits and vice versa, adapting the clock domains.

In this work, the goal is to design a completely renewed AMBA 2.0 AHB NI fully compatible with OCP at the transactional level. As shown in Figure 3.4 on the following page, the basic block of our design is divided in two parts:

(i) AHB NI Initiator: a slave module connected to each master (e.g. processors, such as CPUs, VLIWs, DSPs, image processing ASIPs, etc.) of the system.

(ii) AHB NI Target: a master module attached to each slave (e.g. memories, such as SRAM or DRAM, but also custom co-processor accelerators, such as FPUs, FFTs, etc.) of the system.

×pipes leverages static source routing, which means that a dedicated Look-up Table (LUT) is present in each NI to specify the path that packets will follow in the network to reach their final destination. This type of routing minimizes the complexity of the routing logic in the switches. Furthermore, NIs also optionally provide buffering resources to improve performance.

Finally, as shown in Figure 3.4 on the next page, two different clock signals are connected to our AMBA 2.0 AHB NI: one coming from the AHB interface towards the NI front-end (i.e. HCLK), and another (i.e. xpipes clk) to drive the NI back-end, which is used by the ×pipes protocol. For simplicity, the back-end clock must, however, have a frequency which is an integer multiple of that of the front-end clock. This arrangement allows the NoC to run faster even though some of the attached IP cores are slower, which is crucial to keep transaction latency low. The constraint on integer divider ratios is instrumental in reducing the implementation cost of this facility down to almost zero, but it decreases the design possibilities. However, a mesochronous or a GALS scheme can be obtained by effortlessly extending the NoC [57,166]. Therefore, since each IP core can run at a frequency different from the ×pipes frequency, mixed-clock platforms are possible.


Figure 3.4: AHB NI block diagram in ×pipes NoC

3.5.1 Packet Definition and Extension

The packet format is the most important piece in achieving backward compatibility between different protocols and in designing a fully compliant AHB NI. In this case study, we define an extended or abstract packet format based on the OCP codification but suitable to hold all the AHB signals coded in a proper way. As shown in Figure 3.4, to enable full-featured compatibility (i.e. read/write single, precise or imprecise burst transactions, etc.) between AHB and OCP, our design includes pairs of encoders/decoders in each AHB NI to adapt the AHB codifications to OCP.

[Figure 3.5 layout: the request header carries the command, address, byte enables, burst length, burst sequence, burst precise, protection, sender and route fields; the request payload carries the data, burst length and byte enable fields.]

Figure 3.5: AMBA 2.0 AHB – OCP request packet


Different types of packets are defined by the virtual paths between NI blocks. We distinguish two kinds of packets: the request packet (see Figure 3.5 on the preceding page), which travels from master to slave, and the response packet (see Figure 3.6 on the following page), which travels in the opposite direction, from the slave to the master. Moreover, each packet is subdivided into header and payload. As shown in Figure 3.5 on the preceding page, in the request header we place all the AHB control signals, codified in a proper way to be compatible with OCP. In Table 3.1, we show the coding of each field of the request header according to AHB and OCP.

Field – Codification / Signal Mapping

Protection: Encodes the protection information (i.e. HPROT) present in the AHB specification. No equivalence in OCP¹.

Burst Length: A numeric value expressing the burst length. Encodes HBURST² (AHB) & MBurstLength (OCP).

Burst Sequence:
  000 (INCR) ↔ if HBURST == SINGLE/INCR³ (AHB) or if MBurstSeq == INCR (OCP)
  001 (WRAP) ↔ if HBURST == WRAP (AHB) or if MBurstSeq == WRAP (OCP)
  101 (STRM) ↔ if MBurstSeq == STRM (OCP), no equivalent in AHB

Burst Precise:
  0 ↔ if HBURST != INCR (AHB)
  1 ↔ if HBURST == INCR (AHB)
  Simply encodes MBurstPrecise in OCP.

Byte Enable⁴: One bit per byte in the data field of the payload. Encodes HSIZE (AHB) & MByteEn (OCP).

Command:
  001 (WRITE) ↔ if HWRITE == True (AHB) or MCmd == WR (OCP)
  010 (READ) ↔ if HWRITE == False (AHB) or MCmd == RD (OCP)
  101 (WRNP) ↔ if MCmd == WRNP (OCP), no equivalent in AHB
  Others ↔ reserved for future usage

Address⁵: Encodes HADDR (AHB) or MAddr (OCP).

¹ There is no equivalence in OCP, and therefore this field is simply ignored when it is received by an OCP NI Target.
² For HBURST equal to INCR4/INCR8/INCR16 or WRAP4/WRAP8/WRAP16, this field codifies the numeric value, i.e. 4, 8, 16.
³ In AHB, it is possible to define single read/write requests by doing an undefined-length burst (INCR) of burst length equal to 1.
⁴ Some OCP byte enable semantics (such as …) are not supported by the AHB standard.
⁵ The most significant byte of the address field is used to refer to a specific slave on the NoC.

Table 3.1: Mapping AHB – OCP signals on the request header
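As a small illustration of the COMMAND-field mapping in Table 3.1 (a hypothetical helper; in the real design this logic lives in the RTL encoder/decoder blocks of the NI, not in software):

```java
// Illustrative mapping of the request-header COMMAND field of Table 3.1.
// Helper names are hypothetical; the actual AHB-OCP encoders/decoders are RTL.
public final class CommandField {
    public static final int WRITE = 0b001;  // AHB HWRITE==1  /  OCP MCmd==WR
    public static final int READ  = 0b010;  // AHB HWRITE==0  /  OCP MCmd==RD
    public static final int WRNP  = 0b101;  // OCP MCmd==WRNP (no AHB equivalent)

    /** Encode from the AHB side: only reads and posted writes exist in AHB. */
    public static int fromAhb(boolean hwrite) {
        return hwrite ? WRITE : READ;
    }

    /** Decode towards an AHB slave (assumption: a WRNP reaching AHB is handled as a plain write). */
    public static boolean toAhbWrite(int command) {
        return command == WRITE || command == WRNP;
    }
}
```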

The last two fields of the request header, sender and route, are completely independent from any protocol. The sender is a specific ID (normally 8 bits) that identifies the requesting source, which allows up to 256 cores to be integrated on the network, whereas the route is a variable-length field which determines the path of the packet along the NoC. The length of this field is determined by Equation 3.1.


route_{width} = \log_2(N) \times M \qquad (3.1)

where N is the maximum number of ports over all switches of the NoC, and M is the maximum number of hops that a packet can take to traverse the NoC.

In the request payload, the mapping of the signals of each protocol is straightforward. In the DATA field, only for write requests, we encode the data to be written, i.e. HWDATA (AHB) or MData (OCP); otherwise this field is removed for optimization. In the Burst Length field we encode, as before, a numeric value expressing the burst length. However, both AHB and OCP allow undefined-length bursts, so this field encodes a mix of HBURST and HTRANS (AHB), or MBurstLength (OCP). Finally, as in the request header, the Byte Enable field encodes HSIZE or MByteEn, but this one overrides the one provided in the request header to allow different byte enable alignments for each piece of data during a transaction.

In the response packet (see Figure 3.6), the AHB & OCP signals are mapped effortlessly. The header fields have the same meaning as in the request packet, but in the opposite direction, i.e. changing the role of the sender and including the reverse path in the route field. The DATA field encodes HRDATA (AHB) or SData (OCP), which is the data coming from the slave in a read transaction. Finally, in the RESP field we encode HRESP (AHB) or SResp (OCP). This field is very important because it provides additional information on the status of the transaction. AHB and OCP encode four types of responses but, in this work, we only support OKAY/DVA and ERR, since exclusive access (SPLIT/RETRY) can be supported using QoS at NoC level.
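As a quick numeric check of Equation 3.1, with purely illustrative values rather than a configuration taken from this design: for switches with at most N = 8 ports and a maximum of M = 10 hops per route,

route_{width} = \log_2(8) \times 10 = 3 \times 10 = 30 \ \text{bits}.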

[Figure: the response packet consists of a response header with the RESP, SENDER and ROUTE fields, and a response payload carrying the DATA field.]

Figure 3.6: AMBA 2.0 AHB – OCP response packet

Once these request and response packets are composed and properly codified, they are then split into a sequence of flits before transmission, to decrease the physical wire parallelism requirements. The width of the flits in ×pipes is a fully configurable parameter; depending on the needs, ×pipes designs can have as few as 4 wires carrying data, or as many as in a highly parallel bus (64-bit buses typically have close to 200 wires, considering a read bus, a write bus, an address bus, and several control lines).
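The packet-to-flit split described above can be illustrated with the following simplified C model. It is only a sketch: it assumes the flit width is a multiple of 8 bits and at most 32 bits, and that the packet buffer is zero-padded to a whole number of flits; the actual ×pipes logic works at the bit level in RTL.

#include <stdint.h>
#include <string.h>

enum flit_type { FLIT_HEADER, FLIT_BODY, FLIT_TAIL };

struct flit {
    enum flit_type type;
    uint32_t       payload;   /* flit_bits of packet data (flit_bits <= 32) */
};

/* Split a packet of 'packet_bits' bits into flits of 'flit_bits' each.
 * Returns the number of flits written into 'out'. */
static int packet_to_flits(const uint8_t *packet, int packet_bits,
                           int flit_bits, struct flit *out)
{
    int nflits = (packet_bits + flit_bits - 1) / flit_bits;

    for (int i = 0; i < nflits; i++) {
        uint32_t chunk = 0;
        memcpy(&chunk, packet + i * (flit_bits / 8), flit_bits / 8);

        out[i].payload = chunk;
        out[i].type = (i == 0)          ? FLIT_HEADER
                    : (i == nflits - 1) ? FLIT_TAIL
                                        : FLIT_BODY;
    }
    return nflits;
}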


To provide complete deployment flexibility, in addition to being parameterizable in terms of flit width, the designed AHB NI is also parameterizable in the width of each packet field. Depending on the resulting ratios, a variable number of flits will be needed to encode an AHB or an OCP transaction. However, once the size of each field is known, the offsets are fixed, so the packeting and unpacketing logic can operate at high frequencies.

3.5.2 AMBA 2.0 AHB Transactions to NoC Flits

This section explains in detail how the NIs translate AHB transactions to NoC flits and vice versa. Two types of NI are required, (i) an AHB NI initiator and (ii) an AHB NI target, to interact with master and slave IP cores, respectively. As part of these main modules, several submodules are in charge of packing/unpacking request and response packets from/to the NoC:

(i) ahb ni request: packs AHB transactions coming from an AHB master (e.g. a CPU) using the request packet format; at the same time, the packet is split into a stream of flits to be transmitted over the NoC.

(ii) ahb ni receive: unpacks the transaction arriving at the target NI from the NoC as a sequence of one or more flits and then regenerates the transaction in order to interact with an arbitrary AHB slave IP core (e.g. a memory) on the system.

(iii) ahb ni resend: packs the AHB response coming from the AHB slave (e.g. a memory) following the response packet format into a sequence of flits to be forwarded back to the master that issued the request.

(iv) ahb ni response: unpacks the NoC flits arriving from the NoC and creates the response transaction to be delivered to the AHB master (e.g. a CPU). Once this module finishes, it notifies ahb ni request that another transaction can be issued.

As explained before in Section 3.3 on page 60, AMBA 2.0 AHB is a multi-master multi-slave bus-based fabric architecture based on request and response transactions. Moreover, AHB relies on the existence of a shared bus that includes a centralized arbiter and decoder (see Figure 3.2(a) on page 62). The decoder is used to select a specific slave using the address issued by the master, whereas the arbiter is in charge of prioritizing a master when multiple requests are issued on the bus simultaneously. Therefore, the interface between IP cores and NIs cannot be defined as in AHB, where the master requests the arbiter and afterwards the decoder

selects the corresponding slave. As shown in Figure 3.4 on page 68, the proposed AHB NI initiator, in order to be fully AMBA 2.0 AHB compliant, includes the typical arbitration cycles described in the specifications [14, 15]. On the other side, the AHB NI target integrates the decoder logic to select the specific slave. This logic must be included in a distributed way in the FSMs of ahb ni request and ahb ni receive, respectively. This does not happen when designing NIs for point-to-point protocols, such as OCP or AXI, because they do not rely on the shared-bus-specific modules present in AHB. To translate or serialize AHB transactions to NoC flits, and vice versa, each of the submodules presented before includes a quite complex FSM2. In the next sections, we present the FSM of each submodule.

AHB FSM Request

The ahb ni request FSM is a module that packs AHB request transfers, serializing and converting them into a sequence of flits to be transmitted over the NoC. As shown in Figure 3.7 on the next page, the FSM initially remains in IDLE until the master requests to initiate a transaction. When a master IP core wants to issue a new transaction, it requests the bus, and if the NI is ready, i.e. there is no ongoing flit (i.e. !tx gone) and no response in progress (i.e. !full response), it asserts HGRANT; otherwise the master waits until the previous transaction finishes. Once these arbitration cycles are done, the transaction itself can be initiated, and the FSM jumps to the START TRANSACTION state3. Once the transaction starts, we distinguish between the multiple types of transactions that can occur in AHB:

(i) Single Write: in this type of transaction, only one request header and one request payload with the data to be written are transmitted over the NoC.

(ii) Precise Burst Write: a request header and multiple request payloads, according to the burst length field, are transmitted over the NoC.

(iii) Imprecise Burst Write: as in a precise burst write, a request header and multiple request payloads are transmitted over the NoC.

2 The FSMs are not presented in full detail due to their complexity (for instance, the ahb receive FSM has 16 states, tens of transitions, 12 inputs and 12 outputs); therefore, only states and transitions are reported in order to understand the general behaviour. 3 AHB-Lite modules can jump directly to this phase since they do not need the arbitration phase, but they can also be adapted using an AHB-Lite to AHB wrapper to be connected to the NI.


(iv) Imprecise Burst Read: as in a precise burst write, a request header and multiple request payloads are transmitted, but no data is included in the request payloads, only the control information.

(v) Single or Precise Burst Read: in both cases only a request header is transmitted over the NoC. The information in the request header is enough to reconstruct the transaction in the ahb ni receive module. This mechanism is also present in OCP and AXI, and in our case it saves bandwidth on the NoC.

Figure 3.7: AMBA 2.0 AHB NI request FSM

In all these states, the FSM constantly monitors tx gone (an internal control signal that is asserted when a header or payload is ready to be transmitted) and HTRANS (the AHB signal that contains the information about the status of the transfer) to know exactly which is the next state. Two output signals must be explained in order to understand the handshaking process between the AHB master, the ahb ni request module and the NoC: HREADY and flit sequence. HREADY is asserted on every transfer of each AHB address or data phase once all the valuable information on the bus has been sampled and processed by the FSM, in order to notify the master that the transfer is finished. Furthermore, when a busy transfer occurs, i.e. HTRANS==BUSY,

the FSM suppresses this transfer, asserting HREADY but without generating any overhead traffic on the NoC. At the same time, the FSM marks flit sequence as a header flit when the FSM is going to start a transaction. After that, while the FSM is processing the packets and jumping between the different states, flit sequence is tagged as a body flit. Finally, when the last flit is going out, flit sequence is marked as a tail flit. As shown in Figure 3.7 on the preceding page, at the end of the final phase of each type of transaction, when the last flits are ready to be transmitted (i.e. tx gone is high), the FSM can jump to IDLE, or to BUS GRANT if the master requests another transaction (i.e. HBUSREQ is high). This mechanism pipelines the end of a transaction with the beginning of the next one, saving one clock cycle each time and improving the injection rate on the NoC.
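The general shape of this FSM can be approximated by the following behavioural C sketch. It is only an illustration under simplified assumptions (a handful of states and signals; the real RTL module has many more) and is not the actual ×pipes implementation.

#include <stdbool.h>

enum req_state { REQ_IDLE, REQ_BUS_GRANT, REQ_START_TRANSACTION,
                 REQ_SEND_HEADER, REQ_SEND_PAYLOAD };

enum flit_tag { TAG_HEADER, TAG_BODY, TAG_TAIL };

struct req_inputs {
    bool hbusreq;        /* master requests the bus            */
    bool tx_gone;        /* current flit accepted by the NoC   */
    bool full_response;  /* a response is still in flight      */
    bool last_beat;      /* last address/data beat of a burst  */
};

struct req_outputs {
    bool hgrant;
    bool hready;
    enum flit_tag flit_sequence;
};

/* One evaluation step of a simplified ahb_ni_request FSM. */
static enum req_state req_fsm_step(enum req_state s,
                                   const struct req_inputs *in,
                                   struct req_outputs *out)
{
    out->hgrant = false;
    out->hready = false;

    switch (s) {
    case REQ_IDLE:
        if (in->hbusreq && !in->tx_gone && !in->full_response) {
            out->hgrant = true;              /* arbitration cycle */
            return REQ_BUS_GRANT;
        }
        return REQ_IDLE;
    case REQ_BUS_GRANT:
        return REQ_START_TRANSACTION;
    case REQ_START_TRANSACTION:
        out->flit_sequence = TAG_HEADER;     /* first flit of the packet */
        return REQ_SEND_HEADER;
    case REQ_SEND_HEADER:
        if (in->tx_gone) {
            out->flit_sequence = TAG_BODY;
            return REQ_SEND_PAYLOAD;
        }
        return REQ_SEND_HEADER;
    case REQ_SEND_PAYLOAD:
        out->hready = true;                  /* beat sampled and processed */
        if (in->last_beat && in->tx_gone) {
            out->flit_sequence = TAG_TAIL;   /* close the wormhole packet */
            return in->hbusreq ? REQ_BUS_GRANT : REQ_IDLE;
        }
        return REQ_SEND_PAYLOAD;
    }
    return s;
}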

AHB FSM Receive

The ahb ni receive FSM module (see Figure 3.8 on the next page) is probably the most complex one because of its synchronization requirements, especially for read transactions. This module must properly handle and synchronize several modules at the same time. It must receive flits from the NoC and unpack them into a request header and one or more request payload packets (depending on whether the request is a single or a burst transaction). Simultaneously, it should regenerate them in a pipelined fashion according to the AHB specification [14]. In addition, it must perform the handshake of each address and data phase with the AHB slave using HREADYin and HREADYout, and in case the transfer is a read, it has to exchange control information (i.e. is resending) with the ahb ni resend module in order to know whether each piece of the read transfer has been resent successfully. The receive FSM stays in the IDLE state until it receives the full header of an AHB request transfer and there is no previous transfer to be resent (i.e. full header && !is resending). Once the transfer is ready to be initiated, it could happen that the last transfer of the previous read transaction is still ongoing (because of congestion in the network, a slow slave, etc.), and therefore the FSM cannot issue a new request transfer (i.e. !enable new request). In that case, the FSM jumps to the WAITING state. When a transfer is ready to be initiated (i.e. full header && !is resending && enable new request), depending on the received command, the burst sequence and whether it is precise or not, the FSM has to process different types of transactions: (i) precise read, (ii) imprecise read, (iii) precise write and (iv) imprecise write.


Figure 3.8: AMBA 2.0 AHB NI receive FSM

When a transaction is received and is about to start, the FSM automatically generates the HSEL signal, which selects the corresponding AHB slave. For single or precise burst read requests, only the header is sent from ahb ni request to ahb ni receive. Each time an address phase or data phase is processed successfully (i.e. HREADYin4 is high), an address generator increments the address sequence according to HSIZE and HBURST. For instance, during a 32-bit 4-beat burst transaction, this module creates relative offset addresses of 0x0, 0x4, 0x8 and 0xC. Concurrently, each time a data phase is processed by the FSM, an internal counter (i.e. burst counter) is incremented in order to provide a stopping mark when a precise read/write burst is going to finish. The FSM uses this stopping mark in conjunction with full payload and packet finish in order to identify where single, precise or undefined-length burst transactions end. Every time the FSM module has a transfer ready (address or data phase)

4 HREADYin is an FSM input coming from the AHB slave; in other words, it is the HREADYout of the slave. The slave raises its HREADYout to notify the master, in this case the ahb ni receive FSM within the AHB NI target, each time a response is ready.

to be processed, it notifies the slave by raising HREADYout5. On the other side, if the transfer is not ready at the AHB NI target due to congestion on the NoC, busy transfers are generated (i.e. HTRANS==BUSY) by the FSM to notify the slave that the transfer is going to be delayed. Under normal conditions, the FSM generates pipelined AHB transactions, putting the typical values on HTRANS, i.e. NONSEQ, SEQ, SEQ, ... and then NONSEQ or IDLE to start the next transaction. The FSM also generates mask/unmask control signals for the command and data phases, which are asserted to enable all output AHB signals over the bus to the slave once all packet flits have been unpacked and processed. Once the transaction completes6, the FSM resets both the address generator and the burst counter to be ready for the next AHB transaction.
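The address generator mentioned above can be illustrated with a small C model. It is a sketch only: HSIZE/HBURST encodings are reduced to byte counts and beat counts, and wrapping bursts are assumed to have power-of-two sizes, as in AHB WRAP4/8/16.

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Address of beat 'i' of a burst starting at 'base'.
 * 'beat_bytes' is the transfer size (e.g. 4 for 32-bit beats),
 * 'beats' the burst length, 'wrap' selects wrapping bursts. */
static uint32_t burst_addr(uint32_t base, unsigned beat_bytes,
                           unsigned beats, bool wrap, unsigned i)
{
    uint32_t addr = base + i * beat_bytes;
    if (wrap) {
        uint32_t span = beat_bytes * beats;       /* wrap boundary (power of 2) */
        uint32_t boundary = base & ~(span - 1);   /* aligned boundary           */
        addr = boundary + (addr & (span - 1));    /* wrap inside it             */
    }
    return addr;
}

int main(void)
{
    /* 32-bit, 4-beat incrementing burst from 0x0 -> 0x0, 0x4, 0x8, 0xC */
    for (unsigned i = 0; i < 4; i++)
        printf("0x%X ", burst_addr(0x0, 4, 4, false, i));
    printf("\n");
    return 0;
}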

AHB FSM Resend

The ahb ni resend FSM takes the AHB responses from the AHB slave IP core and creates the response packet, which is split into a sequence of flits over the NoC. As shown in Figure 3.9, the FSM stays in IDLE until a new response arrives, i.e. start receive response is high. Then, the FSM waits to fill the header and the payload (one or multiple, depending on whether the request is a single or a burst read) of the response packet.

Figure 3.9: AMBA 2.0 AHB NI resend FSM

5 HREADYout is an FSM output from the AHB NI target towards the AHB slave, or in other words, it is the HREADYin of the slave. Under reset conditions, the FSM drives HREADYout high so that the slave is ready to sample an AHB access. 6 For simplicity, and to avoid drawing more lines from the states where the transaction finishes to the initial IDLE state, two initial/final states labeled IDLE are shown in Figure 3.8 on the preceding page; it is important to remark that only one is present in the FSM and both have exactly the same meaning.


In the HEADER TX and PAYLOAD TX states, AHB control and data signals are sampled, and at the same time the FSM sends flits out over the NoC. If an error transfer occurs, i.e. HRESP==ERR, the FSM ignores this transfer and no flits are transmitted over the network. When the transaction starts, in HEADER TX, the flits are marked as header flits; in all the other states they are marked as body flits, and in the last state, i.e. LAST PAYLOAD TX, the last flit is marked as a tail flit in order to close the wormhole packet. On the AHB slave side, response transfers can be extended or delayed by one or more clock cycles through the HREADY signal from the slave to the ahb ni resend module. This often occurs when slow slaves require a few cycles to produce the response because they are still computing the output data (e.g. an FPU, FFT or JPEG core). To handle this behaviour, a PAYLOAD WAIT DATA state has been added. The FSM loops there until HREADY is high, which means that HRDATA is ready to be sampled and afterwards transmitted as a sequence of flits.

AHB FSM Response

The ahb ni response FSM is in charge of unpacking AHB responses coming from the AHB NI target, concretely from the ahb ni resend module, through the NoC as a sequence of flits. As shown in Figure 3.10, the FSM stays in IDLE until response awaited is high, or in other words, until a new response must be processed.

Figure 3.10: AMBA 2.0 AHB NI response FSM

Afterwards, it unpacks the header using the control line full header, which indicates that all the flits of the header have been received. Once the header is valid and processed, the FSM does the same with each payload


(one or more, depending on whether the response is a single or a burst read) using full payload. Every time full payload is asserted, the FSM transmits the response back to the AHB master through the HRDATA and HRESP signals. This FSM has an output (i.e. mask response) to mask/unmask the AHB pinout, enabling or disabling it whenever required. This allows the NI to wait until all flits of each payload have been completely received and the response is stable, so that it can be transferred to the AHB master.

3.6 Experimental results

To assess the validity of our AMBA 2.0 AHB NI, we perform a full functional verification to ensure reliable and transparent traffic communication at the transactional level between AMBA and OCP IP cores on a custom NoC architecture. Moreover, we perform hardware synthesis on different FPGA platforms to quantify the area costs of our AMBA 2.0 AHB NI against an OCP NI.

3.6.1 Functional Verification on an AMBA-OCP Star NoC

To carry out a full functional verification of our AMBA 2.0 AHB NI and its compatibility with OCP-IP transactions, we mainly exercise two communication patterns between AMBA and OCP cores:

(i) AMBA – AMBA communication.

(ii) AMBA – OCP / OCP – AMBA communication.

To test these communication patterns, a simple heterogeneous NoC-based architecture with 2 initiator cores (see Figure 3.11 on the next page) has been developed, consisting of:

(i) AMBA 2.0 AHB compliant traffic generator (AMBA TGEN).

(ii) AMBA 2.0 AHB memory.

(iii) OCP compliant traffic generator (OCP TGEN).

(iv) OCP memory.


[Figure: star topology around a single switch (switch_0); the AMBA TGEN (core_0) and the AHB memory (shm_0) attach through AMBA NI initiator/target, while the OCP TGEN (core_1) and the OCP memory (shm_1) attach through OCP NI initiator/target.]

Figure 3.11: Single switch AMBA 2.0 AHB – OCP star topology

To describe this topology, we use ×pipescompiler [69] and the topology description file presented in Listing 3.1. Both the AMBA and OCP memories have been specified as shared memories, and there is a route to/from the OCP and AMBA traffic generators in order to check AMBA-OCP traffic compatibility, as well as pure AMBA NoC-based communication. Fully customizable AMBA 2.0 AHB components (i.e. AMBA TGEN and AHB memory) have been developed from scratch. The AMBA 2.0 AHB traffic generator is able to run any type of AHB transaction:

• Single transactions (SINGLE).

• Incremental bursts (INCR4/INCR8/INCR16).

• Wrapping bursts (WRAP4/WRAP8/WRAP16).

• Undefined-length bursts (INCR).

The AHB memory can also be parameterized in terms of word size and depth, as well as independent write/read latencies, and it is able to process the different types of single or burst transactions. Using this highly customizable framework, an exhaustive verification test has been run using the developed AMBA 2.0 AHB compliant traffic generator. This module generates a set of consecutive writes and reads using all possible types of transactions between the AMBA TGEN core and the AHB memory (placed at 0x00), but also between the AMBA TGEN and the OCP memory core (placed at 0x01), and vice versa. Notice that the 8 MSB bits are used to address any slave IP core on the NoC-based system. The idea of generating pairs of write and read transactions is to verify the content of the memory with a read after the write transaction has finished. Checking the integrity of the data in the memory ensures the correct functionality of the AMBA 2.0 AHB NI at the transactional level.


If an error occurs during the verification process, an assertion is raised and a debug signal indicates the error.
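The verification strategy boils down to a write-followed-by-read check for every transaction type. A minimal C sketch of such a check is shown below; the bus-access helpers are hypothetical (the real traffic generator is an RTL module), and only the slave-ID placement in the MSB byte follows the scheme described above.

#include <stdint.h>
#include <assert.h>

#define AHB_MEM_BASE 0x00000000u  /* slave 0x00 (AHB memory), MSB = slave ID */
#define OCP_MEM_BASE 0x01000000u  /* slave 0x01 (OCP memory), MSB = slave ID */

/* Hypothetical bus-access helpers provided by the testbench. */
extern void     tgen_write32(uint32_t addr, uint32_t data);
extern uint32_t tgen_read32(uint32_t addr);

/* Write a known pattern and read it back to check data integrity. */
static void write_read_check(uint32_t base, unsigned words)
{
    for (unsigned i = 0; i < words; i++)
        tgen_write32(base + 4 * i, 0xCAFE0000u + i);

    for (unsigned i = 0; i < words; i++)
        assert(tgen_read32(base + 4 * i) == 0xCAFE0000u + i);
}

void run_verification(void)
{
    write_read_check(AHB_MEM_BASE, 16);  /* AMBA -> AMBA */
    write_read_check(OCP_MEM_BASE, 16);  /* AMBA -> OCP  */
}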

Listing 3.1: AMBA-OCP topology description file

// --------------------------------------------------------------------
// Define the topology here
// name, mesh/torus specifier ("mesh"/"torus"/"other")
// --------------------------------------------------------------------
topology(4x4_AMBA_OCP, other);

// --------------------------------------------------------------------
// Define the cores
// core name, NI clock divider, input buffer depth, output buffer depth,
// type of core ("initiator"/"target"), memory mapping (only if target),
// "fixed" specifier (only if target and of shared type),
// NI type (amba, ocp)
// --------------------------------------------------------------------
core(core_0, 1, 2, 6, tester, initiator, amba);
core(core_1, 1, 2, 6, tester, initiator, ocp);
core(shm_0, 1, 2, 6, tester, target:0x00-fixed, amba);
core(shm_1, 1, 2, 6, tester, target:0x01-fixed, ocp);

// --------------------------------------------------------------------
// Define the switches here
// One switch of 4x4 ports with 2 input and 6 output buffers
// --------------------------------------------------------------------
switch(switch_0, 4, 4, 2, 6);

// --------------------------------------------------------------------
// Define the links here
// link name, source, destination, number of pipeline stages (optional)
// --------------------------------------------------------------------
link(link_0, core_0, switch_0);
link(link_1, switch_0, core_0);
link(link_2, core_1, switch_0);
link(link_3, switch_0, core_1);
link(link_4, shm_0, switch_0);
link(link_5, switch_0, shm_0);
link(link_6, shm_1, switch_0);
link(link_7, switch_0, shm_1);

// --------------------------------------------------------------------
// Define the routes here
// source core, destination core, the order in which switches need to be
// traversed from the source core to the destination core
// --------------------------------------------------------------------
route(core_0, shm_0, switches:0);
route(core_0, shm_1, switches:0);
route(core_1, shm_0, switches:0);
route(core_1, shm_1, switches:0);
route(shm_0, core_0, switches:0);
route(shm_0, core_1, switches:0);
route(shm_1, core_0, switches:0);
route(shm_1, core_1, switches:0);

The tests have been run using a single clock domain, i.e. the NoC and the NIs running at the same clock frequency, but also in a multiple

clock-domain system, using the integer clock divider present in the NI, to validate the system behaviour under different clock domains (e.g. running HCLK at half of the ×pipes clock).

3.6.2 Synthesis of AMBA 2.0 AHB NI vs. OCP NI

In this section, we quantify the results by synthesizing our AMBA 2.0 AHB NI and comparing it with the equivalent OCP NI design already present in the ×pipes library. In these experiments we synthesize each module of our AHB NI on different FPGAs in order to extract more representative results across different technologies. Moreover, to make a fair comparison, both NIs, AMBA 2.0 AHB and OCP, have been synthesized using the same parameters:

• 32-bit address and write/read data buses.

• Minimum buffer capacity (2-flit input buffers).

• 32-bit flit width.

• The ×pipes NoC uses wormhole packet switching and stall&go flow control.

The results7, shown in Figure 3.12 on the following page, demonstrate that our AMBA 2.0 AHB NI is roughly equivalent to the OCP NI in terms of area costs. Actually, this is not an entirely fair comparison because, as explained before, our AMBA 2.0 AHB NI includes several encoders/decoders to encode AHB transactions so that they are compatible with OCP traffic. Obviously, this area overhead is not present in the OCP NI since it does not need to perform any codification of any signal. Looking at the results in more detail, Figures 3.12(a) and 3.12(c) show the results of synthesizing our ahb ni request and ahb ni resend modules on our set of FPGAs. It is easy to observe that the area costs (taking the average of the syntheses on the different FPGAs) in LUTs are practically the same as for the OCP modules. Nevertheless, when we compare ahb ni receive and ahb ni response against their equivalent OCP modules (see Figures 3.12(b) and 3.12(d)), the results are not so similar. The ahb ni receive is 23.57% bigger than ocp ni receive, mainly because of the complexity of the FSM (at the time of regenerating pipelined AHB transactions), the longer request header packet (HPROT is not present in OCP systems), and the decoder to translate HWRITE, HSIZE and HBURST into their OCP equivalent codes.

7 The results have been extracted using Synplify® Premier 9.4 [167] to synthesize each component on different FPGAs without any optimization: VirtexII (xc2v1000bg575-6) and Virtex4 (xc4vfx140ff1517-11) from Xilinx, and StratixII (EP2S180F1020C4) and StratixIII (EP3SE110F1152C2) from Altera.


The 27.63% area overhead of ahb ni response is caused by the fact that this module is extremely small, so the inclusion of the decoder to encode the responses has a large impact on its final size. In any case, the design of the AMBA 2.0 AHB NI fully compatible with OCP is still slim and efficient: the area costs are approximately 300 LUTs for the AHB NI initiator and 400 LUTs for the AHB NI target. It is important to remark that the area cost of our AMBA 2.0 AHB NI is similar to that of the OCP NI as long as the area overhead of the internal encoders/decoders is not taken into account; even including them, the overhead is not critical compared with the final area of a complex NoC-based system.

On the other side, the circuit performance results (fmax) are shown in Figure 3.13 on the next page. There, it is easy to observe that our AMBA 2.0 AHB NI can operate at high frequencies even on FPGA platforms. As expected, since the area costs are quite similar to those of the OCP NI, the differences in terms of fmax are negligible when both NIs are compared. This behaviour is easy to observe in Figure 3.13, where the lines are more or less parallel for each equivalent pair of AMBA/OCP modules. The impact on the maximum achievable frequency is on average around 1.5%, which is negligible in a NoC system, where the clock frequency is usually limited by the switches. In any case, taking the worst-case delay, our AHB NI can operate at up to 300 MHz on the new generation of FPGAs (StratixIII).

[Figure: bar charts of synthesis area in LUTs on VirtexII, Virtex4, StratixII and StratixIII: (a) AHB NI Request, (b) AHB NI Receive, (c) AHB NI Resend, (d) AHB NI Response; each panel compares the AHB NI module against its OCP NI counterpart.]

Figure 3.12: Synthesis of AMBA 2.0 AHB NI vs. OCP NI

[Figure: line plot of maximum operating frequency (Fmax, MHz) on VirtexII, Virtex4, StratixII and StratixIII for ahb_ni_request, ahb_ni_receive, ahb_ni_resend, ahb_ni_response and their ocp_ni counterparts.]

Figure 3.13: Circuit performance (fmax), AMBA 2.0 AHB vs. OCP NIs

3.7 Conclusion and Future Work

This work enables compatibility of AMBA 2.0 AHB and OCP transactions on the same NoC-based system. This allows seamless mixing of IP blocks fully compliant with this bus-based standard (i.e. AMBA) and socket IP protocol (i.e. OCP). In other words, thanks to the developed NIs, we can mix AMBA and OCP IP cores, and all types of transactions (i.e. INCR/WRAP/undefined-length burst types, multiple byte enables, request/response/resend and error protocols, etc.) can coexist and interact transparently with each other depending on the application dynamism and the communication patterns. Clearly, this work brings many advantages from the point of view of the core. The AHB NI has been implemented with the target of being slim and efficient, in terms of low latency and high performance thanks to its reduced area costs. It

optimizes the transmission efficiency and latency with an optimized packet format: the packets minimize the amount of information in each transmission that can instead be regenerated at the receiver (e.g. addresses during bursts).

This architectural support and backward compatibility makes it possible to design highly heterogeneous NoC-based MPSoCs integrating multi-protocol shared or hierarchical bus-based, or even multi-layer or crossbar, subsystems. Inside these complex systems, thanks to the synchronization done in the NI FSMs, multiple clock domains can work at the same time without losing data.

Further work can be done in many directions. Of course, the layered structure from the transport to the physical layer, and the flexibility of the modular design of the NI, will allow further extensions of the packet format to effortlessly include more standard protocols, such as AXI [32], STBus [28] or Avalon [19] network interfaces. To improve the present work, we can add support for SPLIT and RETRY transactions, which are confined in the target NI, and support for LOCK transactions. However, these features can also be obtained by reusing the QoS features present in NoC backbones: a master can block a channel during a period of time using circuit switching and, obviously, if another master requests access to the same core, it will wait until the channel is released. In terms of full compatibility between AHB and OCP, we can issue an error to handle the incompatibility of non-posted versus posted writes, because in AMBA there is no way to request posted transactions. The semantic differences between OCP and AMBA in terms of byte enables, transaction alignments and wrapping burst boundaries might also be an interesting issue to solve. In the same direction, handling another level of heterogeneity with different core parameters (e.g. data width, address width, endianness, etc.) is also a key point to improve this work. The automatic generation of downsizers or upsizers should be tackled by EDA tools, attaching these modules, if necessary, to each NI according to the IP specs at design time. This core-centric feature will further increase reuse, integrating heterogeneous cores when designing NoC-based MPSoCs and enabling transparent communication among the cores of the system.

CHAPTER 4

Runtime QoS Support Through HW and Middleware Routines

This chapter1 introduces the extension of the NIs and switches of the existing ×pipes NoC library in order to enable different types of QoS. Moreover, this chapter also introduces the low-level QoS-support routines that exploit the QoS features from the application at runtime. Finally, the aim is to speed up real applications and synthetic benchmarks using this HW-SW infrastructure.

4.1 Motivation and Challenges

The design of emerging and modern heterogeneous SoCs or MPSoCs consists of many different components and IP blocks (i.e. a mix of general-purpose processors, memories, DSPs and multimedia accelerators) interconnected by a NoC. These components can exhibit disparate traffic characteristics and constraints, such as requirements for guaranteed throughput and bounded communication latency. The problem is further compounded by the recent trend towards digital convergence, software-upgradeable devices and the need for dynamic power management. Chips are going to be used in various application scenarios, also called use cases (for example, 3G connectivity as opposed to video gaming); their firmware and application code could be updated over the device's lifetime; and the various IP blocks may operate at different performance points. Each of these effects implies additional unpredictability in the on-chip traffic patterns, possibly rendering their static characterization impractical or overly conservative.

1 The author would like to acknowledge the contributions of the whole ×pipes development team, and especially Dr. Federico Angiolini.

At design time, to provide a certain QoS, designers using EDA tools often tend to overdimension the NoC backbone to fulfill the overall application requirements. However, this approach is suboptimal and should therefore be combined with runtime QoS facilities to enable load balancing, prioritization or exclusive access when the application requires it. It is therefore essential for a NoC system to support Quality of Service (QoS) through efficient transport services without resorting to resource overprovisioning, as is common in large-area networks. Thus, different levels of QoS, i.e. Best-Effort (BE) and Guaranteed Services (GS), should be available to applications, allowing designers to carefully allocate communication resources to high-throughput or low-latency traffic classes in a prioritized manner, but at minimal power and area costs.

As an example, many designs nowadays require specific guarantees for particular data flows. Even if some parts of a chip can operate on a best-effort basis (results should be provided as quickly as possible but without a hard performance bound, e.g. the time to browse an address book), other parts may have stringent "guaranteed" requirements (results should be available within a maximum time, e.g. the time to play back the next video frame when watching a movie). The design of interconnects which mix BE and GS is a research challenge that is still very open.

In traditional networks [72, 133] many techniques can be applied to provide QoS, but this scenario changes radically when designing NoC-based MPSoCs due to largely different objectives (e.g. nanosecond-level latencies), opportunities (e.g. more control over the software stack) and constraints (e.g. minimum buffering to minimize area and power costs). QoS for NoCs is impractical to implement only in hardware, due to the large and hard-to-determine number of possible use cases at runtime; a proper mix of hardware facilities and software-controlled management policies is vital to achieving efficient results.

This facility must be controllable by the software stack to tolerate software updates over the lifetime of the chip, to make incremental optimizations possible even after tape-out, and to allow for different operating regimes as the use cases alternate at runtime. Therefore, applications and/or the programming environment chosen for their development/execution should have some degree of control over the available NoC services. In this way, the application programmer or the software execution support layer (either an OS, a custom support middleware or a runtime environment) can exploit the hardware resources so as to

meet critical requirements of the application and, more generally, speed up execution while fitting within a limited power budget. Access to the NoC configuration should be mediated by an easily usable API, should be low-overhead in terms of computation cycles, and should be compatible with mainstream multi-core programming approaches.

4.2 Related work

The NoC research community is focusing on a variety of design issues concerning heterogeneous NoC design, such as those presented in [58] (e.g. channel width, buffer size, application or IP mapping, routing algorithms, etc.), as well as the design of Network Interfaces (NIs) for different bus-based protocols (e.g. AMBA/AXI [14, 32], OCP-IP [12]), adding virtual channels [98, 168] to improve latency, designing efficient application-specific or custom topologies automatically [169], and bringing these designs to different VLSI technologies, such as 65 nm [170]. Despite this, not much effort has been devoted to studying how to support low-cost QoS and make it available at runtime across the software stack, from the application down to the packets.

Previous works providing different levels of QoS at the hardware level are presented in [87, 135], where a complete NoC framework named QNoC is designed to split the NoC traffic into different QoS classes (i.e. signalling for inter-module control signals, real-time for delay-constrained bit streams, RD/WR to model short data accesses, and block transfers for handling large data bursts), but more work would be needed to expose these facilities across the software stack.

The work presented in [171] tries to combine BE and GT streams by handling them with a time-slot allocation mechanism in the NoC switches. A clockless NoC that mixes BE and Guaranteed Services (GS) by reserving virtual circuits is reported in [67, 91]. In addition, in [88, 172, 173] the authors claim to combine GT and BE in an efficient way, taking into account the required resource reservations for worst-case scenarios. Another generic approach following the idea offered in [87] about traffic classification is demonstrated in [79, 169, 174]. These works show a methodology to map multiple use cases or traffic patterns onto a NoC architecture, satisfying the constraints of each use case. However, all these straightforward solutions are unfortunately expensive in terms of silicon requirements (area, power, frequency), while also being complicated to exploit at higher levels of abstraction (e.g. devising a set of use cases, accounting for application dynamism, etc.).

Very recently, the work presented in [175] introduces a method to manage


QoS at the task level on a NoC-based MPSoC, and in [176, 177] the authors report a detailed model of how to enable application-performance guarantees by applying dataflow graphs, as well as aelite, a NoC backbone based on Æthereal with composable and predictable services.

Our contribution is similar to these previous works, but we focus on the actual viability of the hardware implementation, ensuring a reasonable hardware cost of the QoS features at the NoC level. Moreover, the work reported in [175] has been validated with synthetic or custom traffic, while we leverage a fully functional MPSoC virtual platform [54], as we will show in Section 6.8 on page 166. In this work, we basically offer end-to-end QoS by means of:

• Best-Effort (BE): these services make no commitments about QoS. They refer to basic connectivity with no guarantees. Such a service is provided using first-in, first-out (FIFO) or first-come, first-served (FCFS) scheduling based on fixed priorities or a Round-Robin (RR) arbitration scheme on the channels in each switch, as in ×pipes, or even dynamically based on the NoC workload [178].

• Differentiated Services: split the network traffic into several classes, each with different requirements regarding data delivery. In this work, a priority scheme (e.g. high- or low-priority packets) is implemented, and packets are treated according to the class they belong to. This QoS feature is also known as soft QoS, because no hard guarantees are made for individual flows.

• Guaranteed Throughput (GT): these services ensure that each flow has a negotiated bandwidth regardless of the behaviour of all other traffic. This type of QoS is implemented by performing circuit switching or a TDMA technique on the channel in order to ensure end-to-end exclusive access to the resources through a reservation/release policy. It provides hard guarantees (e.g. real-time) in terms of deterministic latency and bandwidth for each specific flow.

In summary, our contribution is to implement low-level QoS features (i.e. a priority scheme and the allocation/release of end-to-end circuits) at a reasonable implementation cost. All QoS features are then exposed through a lightweight low-level and middleware API (for differentiated services and GT traffic) across the software stack, all the way up to the OS layer, the runtime environment and the application layers. This work enables the runtime provisioning of QoS to specific flows at the application or task level, or even the use of QoS features through a global communicator or

scheduler/broker/supervisor working at a higher level, as reported in [179, 180]. A similar approach to the present work, in terms of middleware APIs, is reported in [181], where a NoC Assembler Language (NoC-AL) is proposed as an interface between NoC architectures and services, but it mainly focuses on supporting power management instead of runtime QoS.

4.3 NoC-level QoS Design

Homogeneous, but especially next-generation heterogeneous, NoC-based MPSoCs will concurrently run multiple applications according to a few use cases with different requirements in terms of bandwidth and latency bounds. Hence, it is crucial to provide QoS features at the NoC level. This work goes further and tries to expose the QoS present in the NoC backbone as NoC services to the application, or even on top of the parallel programming model.

We designed low-cost NoC-level QoS features based on a priority scheme (to fit a few use cases) in order to classify several types of traffic, and we also enable the capability of reserving channels to ensure end-to-end bandwidth or latency for certain traffic between specific IP cores on the NoC-based system. Thus, this work tries to support NoC services by combining BE and GT, as well as differentiated services, but without a large impact on the design of the hardware components (in terms of area and power consumption).

In this work, we design NIs and switches that allow channels or bandwidth to be reserved for certain traffic, or packets to be arbitrated with different priorities set by the application or the programming model. Rather than a costly table-based TDMA [88, 103, 171] or virtual channel scheme [98], both QoS features have been designed with their real hardware implementation or prototyping in mind: a low-cost priority scheme not based on virtual channels (parallel links can be deployed instead), and guaranteed services based on channel reservation.

From the hardware point of view, a set of components has been redesigned in Verilog RTL to extract synthesis results, and equivalent SystemC™ RTL modules have been implemented to be included in the MPARM [54] cycle-accurate simulator through ×pipescompiler [69]. This allows the designed QoS to be explored in conjunction with the software components (i.e. middleware routines, OS and the parallel programming model) using a virtual platform.

To expose the QoS features at higher levels as NoC services, and to

enable runtime changes in the NoC backbone, we extend the basic elements of the ×pipes NoC library: (i) the Network Interfaces (NIs) and (ii) the switches.

4.3.1 Trigger QoS at Network Interfaces

The aim of designing QoS features in the NIs is to expose them towards higher levels of abstraction. In the NI, concretely in the NI initiator, the main target is to identify which type of QoS is requested by the processor, accelerator, etc., but also to enable its programmability and reconfiguration at runtime.

In the initiator NIs, a set of configuration registers, memory-mapped in the address space of the system, is used to program the different levels of priority. These registers can be programmed to assign a different priority level to each individual packet at runtime. When the application demands a given QoS feature, the NI tags the header packets in a suitable manner. To do this, the packet header presented in Section 3.5.1 on page 68 has been extended. Figure 4.1(a) on page 92 shows the new packet format, and in Listing 4.1 we present the possible QoS encodings embedded in the 4-bit QoS field.

Listing 4.1: QoS encoding on header packets

// Priority levels
#define ENC_QOS_HIGH_PRIORITY_PACKET_0 4'b0000
#define ENC_QOS_HIGH_PRIORITY_PACKET_1 4'b0001
#define ENC_QOS_HIGH_PRIORITY_PACKET_2 4'b0010
#define ENC_QOS_HIGH_PRIORITY_PACKET_3 4'b0011
#define ENC_QOS_HIGH_PRIORITY_PACKET_4 4'b0100
#define ENC_QOS_HIGH_PRIORITY_PACKET_5 4'b0101
#define ENC_QOS_HIGH_PRIORITY_PACKET_6 4'b0110
#define ENC_QOS_HIGH_PRIORITY_PACKET_7 4'b0111

// Open/close circuits
#define ENC_QOS_OPEN_CHANNEL 4'b1000
#define ENC_QOS_CLOSE_CHANNEL 4'b1001

The 4 bits of the QoS field allow up to an 8-level priority scheme to classify different types of traffic within the system. Level 0 has the lowest priority and level 7 is the most prioritized on the NoC architecture. To send a prioritized packet, two phases must be performed: first, the NI is programmed through a transaction to the specific configuration register with the required priority level, and afterwards one or more real transactions or packets carry the actual data. The actual priority scheme is implemented in each switch depending on the channel requests and the priority level embedded in the packets.
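A sketch of that two-phase sequence from the processor's point of view is shown below. The register address and helper name are illustrative assumptions, not the actual ×pipes memory map.

#include <stdint.h>

/* Hypothetical memory-mapped location of the NI priority register. */
#define NI_QOS_PRIORITY_REG  ((volatile uint32_t *)0x00D00000u)

static void write_with_priority(volatile uint32_t *dst,
                                uint32_t data, unsigned level)
{
    /* Phase 1: program the requested priority level (0..7) in the NI. */
    *NI_QOS_PRIORITY_REG = level & 0x7u;

    /* Phase 2: issue the real transaction; the NI tags the packet
     * header with the programmed QoS level. */
    *dst = data;
}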

On the other side, to trigger the opening and closing of circuits in order to block/release channels for a certain period of time, "fake" transactions must be performed at the beginning and at the end in order to notify the switches. This mechanism is based on the emulation of circuit switching on top of a wormhole packet switching scheme. The normal behaviour to open/close an exclusive transmission/reception guaranteeing QoS is done in three phases:

(1) Open the circuit (unidirectional or bidirectional ”fake” transaction).

(2) Perform a normal stream of transactions or packets over the NoC under GT conditions.

(3) Close the circuit (unidirectional or bidirectional ”fake” transaction).

The packet formats for these "fake" request and response transactions are shown in Figure 4.1(b) and Figure 4.1(c) on the following page. In addition, to treat these "fake" transactions as special, two specific addresses inside the memory-mapped space of each slave have been defined on the NoC-based MPSoC. These special addresses, to which "fake" transactions must point, are placed in the MSB bits of the slave memory address space. This is because the MSB byte of the address is used in ×pipes to select a slave on the NoC-based system; thus, we choose the next consecutive bits at the end of the slave memory space, avoiding overlap with the useful address space of the IP core. An example is given next; however, this is a configurable parameter that can be fixed as required.

#define OPEN_CHANNEL_ADDRESS  0x00E00000
#define CLOSE_CHANNEL_ADDRESS 0x00F00000
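Assuming those two special addresses, the three-phase open/stream/close sequence can be driven from software with plain stores; the sketch below is an illustration only (the slave-ID byte in the example addresses and the destination buffer are arbitrary).

#include <stdint.h>

#define OPEN_CHANNEL_ADDRESS   0x00E00000u
#define CLOSE_CHANNEL_ADDRESS  0x00F00000u

static void stream_with_gt(volatile uint32_t *dst,
                           const uint32_t *src, unsigned words)
{
    /* (1) "fake" write: the NI emits an open-circuit packet. */
    *(volatile uint32_t *)OPEN_CHANNEL_ADDRESS = 0;

    /* (2) normal transactions travel on the reserved circuit (GT). */
    for (unsigned i = 0; i < words; i++)
        dst[i] = src[i];

    /* (3) "fake" write: the NI emits a close-circuit packet. */
    *(volatile uint32_t *)CLOSE_CHANNEL_ADDRESS = 0;
}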

As shown in Figure 4.1(b) and Figure 4.1(c) on the next page, the packets of the "fake" transactions have been simplified to minimize the overhead of transmitting them. Only the route, source, address and command fields, together with the QoS field, are necessary to establish or release a channel between nodes. The minimization is based on the fact that there is no need to transmit any payload data, and it is also not necessary to transmit the burst-related information as long as open/close transactions are considered single request transactions. The command field is used at the initiator NI to know whether the channel is going to be established for unidirectional (write transactions) or bidirectional/full-duplex (write and read transactions) traffic.

The latency overhead (in clock cycles) to establish or release a channel during the emulation of circuit switching on the ×pipes wormhole packet switching NoC is determined by the time to set up a request (see Equation 4.1) and a response (see Equation 4.2), in conjunction with the number of hops that the "fake" packets take to traverse from the source to the destination on the NoC.

[Figure: packet formats. (a) QoS header packet: ROUTE, QoS, SENDER, ADDRESS, COMMAND, BURST LENGTH, BURST SEQUENCE, BURST PRECISE, PROTECTION and BYTEENABLE fields; (b) open/close circuit "fake" packet: ROUTE, QoS, SENDER, ADDRESS and COMMAND fields; (c) QoS packet format on responses: ROUTE, QoS and SENDER fields.]

Figure 4.1: QoS extension on the header packets

\[ request\_channel = \frac{request\_packet_{width}}{FLIT_{width}} + Number_{hops} \tag{4.1} \]

\[ response\_channel = \frac{response\_packet_{width}}{FLIT_{width}} + Number_{hops} \tag{4.2} \]

where Number_hops is the number of switches that the "fake" packet must traverse on the NoC-based system, assuming 1 clock cycle to forward a flit in each switch. On the other hand, request_packet_width and response_packet_width are the number of bits of the request and response header fields of each type of packet:

\[ request\_packet_{width} = ROUTE_{width} + QOS_{width} + SOURCE_{width} + ADDRESS_{width} + COMMAND_{width} \]
\[ response\_packet_{width} = ROUTE_{width} + QOS_{width} + SOURCE_{width} \]

The total overhead in clock cycles to open and close a circuit is exactly the same, since we use the same packet format and only change the QoS bits. However, the overhead differs depending on whether we need to open/close a circuit in unidirectional mode (write commands) or in bidirectional/full-duplex mode (write and read commands). A quantification of this

overhead is given in Equation 4.3 and Equation 4.4, respectively, using the formulas presented in Equations 4.1 and 4.2.

\[ circuit_{uni} = 2 \times \frac{request\_packet_{width}}{FLIT_{width}} \tag{4.3} \]

\[ circuit_{bid} = 2 \cdot \frac{request\_packet_{width}}{FLIT_{width}} + 2 \cdot \frac{response\_packet_{width}}{FLIT_{width}} \tag{4.4} \]

As shown in Equation 4.3, the overhead under unidirectional traffic is twice the time to transmit a request packet, in order to open and close the circuit, respectively. For bidirectional QoS the situation is slightly different: in Equation 4.4 the round-trip time (from source to target and in the opposite direction) is obtained by adding the overhead of sending the response packet on the response channel.

Obviously, apart from the modifications at packet level, all FSMs in each NI submodule (i.e. ni request, ni receive, ni resend, ni response) have been extended with a new special state to absorb these "fake" transactions. Within these special states, the transactions are treated as dummy transactions, avoiding the generation of undesirable traffic to/from the target slave. It is important to notice, however, that the QoS tags must be included in the first flit of the packet header in order to route and grant the packets according to the QoS in each switch. As a consequence, in our implementation the flit width must follow the rule FLIT_width ≥ ROUTE_width + QOS_width.
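For concreteness, the following C sketch evaluates Equations 4.1-4.4 for an example configuration. The field widths are illustrative values (not the ones used later in the experiments), and partial flits are rounded up to whole flit cycles as an assumption.

#include <stdio.h>

struct noc_cfg {
    unsigned route_w, qos_w, source_w, address_w, command_w;
    unsigned flit_w, hops;
};

/* Assumption: a partial flit still costs a full flit cycle. */
static unsigned div_ceil(unsigned a, unsigned b) { return (a + b - 1) / b; }

static unsigned req_bits(const struct noc_cfg *c)
{ return c->route_w + c->qos_w + c->source_w + c->address_w + c->command_w; }

static unsigned resp_bits(const struct noc_cfg *c)
{ return c->route_w + c->qos_w + c->source_w; }

int main(void)
{
    struct noc_cfg c = { 12, 4, 8, 32, 3, 32, 3 };   /* illustrative values */

    unsigned req  = div_ceil(req_bits(&c),  c.flit_w) + c.hops;  /* Eq. 4.1 */
    unsigned resp = div_ceil(resp_bits(&c), c.flit_w) + c.hops;  /* Eq. 4.2 */
    unsigned uni  = 2 * div_ceil(req_bits(&c), c.flit_w);        /* Eq. 4.3 */
    unsigned bid  = uni + 2 * div_ceil(resp_bits(&c), c.flit_w); /* Eq. 4.4 */

    printf("request setup: %u cycles, response setup: %u cycles\n", req, resp);
    printf("open+close unidirectional: %u cycles, bidirectional: %u cycles\n",
           uni, bid);
    return 0;
}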

4.3.2 QoS Extensions at Switch Level

The switch is the key element for lowering congestion and improving performance by providing buffering resources, but it is also the main hardware component to tune in order to support QoS in the NoC backbone. Its function is to route packets between two endpoints within the NoC. As shown in Figure 4.2 on the following page, the main blocks of the switch are:

• The crossbar, which provides full connectivity among the input and output ports.

• The routing block, which is in charge of performing the algorithm to route packets, mainly XY-YX for mesh-based NoCs. However, since ×pipes performs source routing, the switch does not include routing LUTs.


• The flow control block, which implements a credit-based (i.e. stall&go) or handshake-based (i.e. ACK/NACK) flow control protocol in conjunction with the control signals from the buffers.

• The input and output buffers, which are simply optimized FIFOs or registers to store flits while packets are forwarded. Their size is the main trade-off to lower congestion and improve performance on the interconnection.

• The arbitration module, which is attached to each output port to resolve conflicts among packets when they overlap in requesting access to the same output link.

In ×pipes, switches can be fully parameterized in terms of the number of inputs and outputs independently, the flit width, and the buffer capacity of the input and output FIFOs, according to the selected flow control protocol.

Figure 4.2: Architecture overview of ×pipes switch

From the QoS point of view, the most important module is the arbiter, which can be configured on a BE basis with fixed-priority or round-robin policies. In this work, we extend the BE behaviour of the allocator/arbiter block, enabling up to 8 levels of priority, which is enough to deal with the different use cases present in an MPSoC, and guaranteeing GT through the establishment and release of exclusive circuits when necessary. Figure 4.3(a) on the next page shows a block diagram of a general allocator/arbiter with N ports and M levels of priority. The following modifications have been made to this module:

• Include QoS detector logic.

• Add a flip-flop register (i.e. qos channel) to store whether the connection is established or not.


• Extend the arbitration tree policy based on fixed priority to enable multiple levels of priority.

• Modify the grant generation using a priority encoder based on the selected priority level or the qos channel register.

In the allocator (see Figure 4.3(a)), the QoS detector identifies which QoS is requested for a specific packet or flow. The QoS field is parsed by selecting the proper range of bits of the first flit of the request header packet. When the QoS bits relate to circuit reservation, and depending on their codification (i.e. ENC QOS OPEN CHANNEL or ENC QOS CLOSE CHANNEL), the circuit is set up or torn down by updating the value of the qos channel flip-flop register ('1' if established and '0' if not). The value of this register is used in the arbitration process to grant the next packet. Once the circuit has been established, i.e. once QoS bits equal to ENC QOS OPEN CHANNEL have been seen, all other flows are blocked until the QoS detector identifies a tear-down packet with QoS bits equal to ENC QOS CLOSE CHANNEL. Once the circuit is released, the other flows can progress because the allocator grants them on a BE basis.
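A behavioural C model of that grant decision is given below. It is a deliberate simplification of the RTL allocator (it ignores the per-port control and selection signals and only models the circuit register plus the priority comparison), so it should be read as a sketch rather than the implemented hardware.

#include <stdbool.h>

#define NPORTS 5

struct alloc_state {
    bool circuit_open;     /* qos_channel flip-flop           */
    int  circuit_owner;    /* input port holding the circuit  */
};

/* Decide which input port is granted the output link.
 * 'req[i]' is true when port i requests the link, 'prio[i]' is the
 * 0..7 priority carried in its header, and 'open_req'/'close_req'
 * flag open/close "fake" packets. Returns the granted port or -1. */
static int allocator_grant(struct alloc_state *st, const bool req[NPORTS],
                           const int prio[NPORTS],
                           const bool open_req[NPORTS],
                           const bool close_req[NPORTS])
{
    /* A reserved circuit blocks every other flow until it is released. */
    if (st->circuit_open) {
        if (req[st->circuit_owner]) {
            if (close_req[st->circuit_owner])
                st->circuit_open = false;       /* tear down */
            return st->circuit_owner;
        }
        return -1;
    }

    /* Otherwise grant the highest-priority requester (fixed priority
     * breaks ties between equal levels). */
    int best = -1;
    for (int i = 0; i < NPORTS; i++) {
        if (req[i] && (best < 0 || prio[i] > prio[best]))
            best = i;
    }

    /* An open "fake" packet reserves the circuit for the granted port. */
    if (best >= 0 && open_req[best]) {
        st->circuit_open = true;
        st->circuit_owner = best;
    }
    return best;
}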

[Figure: (a) block diagram of the QoS allocator/arbiter, with the QoS detector, the qos_channel flip-flop, the priority logic, the arbitration tree and the grant/selection stage over N flit and control inputs; (b) the priority arbitration tree with an M-level priority encoder.]

Figure 4.3: Allocator block diagram & arbitration tree

On the other side, when multiple packets simultaneously request the same channel of the switch, the allocator grants them using a fixed-priority scheme. This rigid behaviour has been modified to enable soft QoS: whenever the QoS detector identifies a priority packet (level different from 0) tagged by the NI initiator, the allocator grants it using the priority level embedded in the request header packet (see Figure 4.1(a) on page 92). The arbitration tree reported in [72] has been extended, as shown in Figure 4.3(b), for each port of the switch. Depending on the simultaneous requests (inferred using the control lines) and the assigned priority levels, in conjunction

with an M-level priority encoder, the arbitration tree produces the grant signal. In both schemes, depending on the grant and selection output signals, the switch chooses the output control and the flit that must go through, by means of a final multiplexer stage. It is important to remark, however, that any packet with a priority different from 0 has more priority than the open/close "fake" packets; in other words, GT circuits are created and removed with BE packets that reserve a path through the NoC. Despite this fact, once the circuit is established, the switch will not grant any other packet, even a very high priority one, until the circuit and the channel are released. Furthermore, the proposed QoS guarantees in-order packet delivery for streaming connections. This is important since packet re-ordering at NI or switch level is impractical in NoCs, first because of the overhead of including additional buffers, and second because it would increase the packet delivery latency.

4.4 Low-level QoS-support API and Middleware Routines

In order to exploit the QoS features in an efficient way, a set of low-level QoS API functions has been implemented to interact with the NIs. These low-level functions interact directly with the NIs and are extremely lightweight (only a few assembly instructions), and therefore their execution only takes a few clock cycles. In Listing 4.2, we describe the functions that make up the low-level QoS API.

Listing 4.2: Low-level middleware NoC services API

// Set up an end-to-end circuit
// unidirectional or full duplex (i.e. write or R/W)
inline int ni_open_channel(uint32_t address, bool full_duplex);

// Tear down a circuit
// unidirectional or full duplex (i.e. write or R/W)
inline int ni_close_channel(uint32_t address, bool full_duplex);

// Write a packet with a certain level of priority to a specific address
inline int ni_send_priority_packet(uint32_t address, int data, int level);

// Receive a packet with a certain level of priority from a specific address
inline int ni_recv_priority_packet(uint32_t address, int *data, int level);

The purpose of ni open channel() and ni close channel() is to open/close the circuits through the "fake" transactions explained before. These functions receive as parameters the address of the target and

a boolean to specify whether the circuit will be established for unidirectional (write) or full duplex (write and read) traffic. The other two routines, ni_send_priority_packet() and ni_recv_priority_packet(), are used to send/receive 32-bit data according to the priority level to/from an address on the NoC-based system, respectively. Using a translation table on the NoC-based system, the low-level functions have been encapsulated and extended to manage the priorities and circuits at a higher level between any arbitrary source and destination nodes. These QoS middleware functions are shown in Listing 4.3.

Listing 4.3: QoS middleware functions
// Set high-priority in all W/R packets between an
// arbitrary CPU and a Shared Memory on the system
int setPriority(int PROC_ID, int MEM_ID, int level);

// Reset priorities in all W/R packets between an
// arbitrary CPU and a Shared Memory on the system
int resetPriority(int PROC_ID, int MEM_ID);

// Reset all priorities of W/R packets on the system
int resetPriorities(void);

// Function to send a stream of data with GT (unidirectional)
int sendStreamQoS(byte *buffer, int length, int MEM_ID);

// Function to receive a stream of data with GT (full-duplex)
int recvStreamQoS(byte *buffer, int length, int MEM_ID);

Thus, to deal with priorities, setPriority() and resetPriority() are used to assign or remove a priority level between two NoC nodes at runtime. An additional middleware routine, resetPriorities(), has been included to initialize and restore all flows to non-prioritized traffic, removing all priority levels previously assigned. In sendStreamQoS() and recvStreamQoS(), a circuit is formed from the source to the destination node by means of resource reservation, over which data propagation occurs in a contention-free regime. In sendStreamQoS(), because the traffic flows only from source to target, the circuit is opened unidirectionally, while in recvStreamQoS() the reservation of the channel is done in both directions, allowing GT on the reception of the stream. For further information about the use of these functions refer to Section 6.8 on page 166, where they are used to apply QoS on OpenMP parallel regions, boosting and delaying threads on the parallel programming model.
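As a usage illustration of the APIs in Listings 4.2 and 4.3, the sketch below first prioritizes the traffic of one processor towards a shared memory and streams a buffer with GT, and then tears everything down. The function names come from the listings above; the node identifiers, base address, buffer and priority value are hypothetical placeholders and only serve the example.

/* Minimal usage sketch of the QoS middleware and low-level NI services.
 * CPU_3, SHM_ID, SHM_ADDRESS and the priority value are assumptions
 * chosen only for illustration. */
#define CPU_3        3
#define SHM_ID       0x19        /* shared memory node id (see Figure 4.4) */
#define SHM_ADDRESS  0x19000000  /* hypothetical base address */

void qos_example(void)
{
    byte buffer[256];

    /* Tag all W/R packets from CPU_3 to the shared memory as high priority */
    setPriority(CPU_3, SHM_ID, 1);

    /* Stream the buffer under guaranteed throughput (unidirectional circuit) */
    sendStreamQoS(buffer, sizeof(buffer), SHM_ID);

    /* Alternatively, drive the NI directly with the low-level API */
    ni_open_channel(SHM_ADDRESS, false);             /* write-only circuit  */
    ni_send_priority_packet(SHM_ADDRESS, 0xCAFE, 1); /* priority level 1    */
    ni_close_channel(SHM_ADDRESS, false);

    /* Restore best-effort traffic for this flow */
    resetPriority(CPU_3, SHM_ID);
}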


4.5 QoS Verification and Traffic Analysis

In this section we perform functional verification and simulation under different traffic conditions using the QoS features explained before. The aim is to ascertain the correct behaviour of the QoS features at NoC level under different scenarios, as well as to verify the interaction of the low-level and middleware functions when complex transaction patterns are generated from software applications.

We use MPARM [54], a cycle-accurate modeling infrastructure to investigate the system-level architecture of MPSoC platforms. MPARM is a plug-and-play platform based upon the SystemC [45,46] simulation engine, where multiple IP cores and interconnects can be freely mixed and composed. At its core, MPARM is a collection of component models, comprising processors, interconnects, memories and dedicated devices like DMA engines. The user can deploy different system configuration parameters by means of command line switches, which allows for easy scripting of sets of simulation runs. A thorough set of statistics, traces and waveforms can be collected to analyze performance bottlenecks and to debug functional issues. In terms of interconnect, MPARM provides a wide choice, spanning multiple topologies (shared buses, bridged configurations, partial or full crossbars, NoCs) and both industry-level fabrics (AMBA AHB/AXI [14, 32], STBus [28]) and academic research architectures (×pipes [52,53,69]). On top of the hardware platform, MPARM provides a port of the uClinux [143] and RTEMS [182] operating systems, several benchmarks from domains such as telecommunications and multimedia, and libraries for synchronization and message passing.

Using MPARM, we developed a SystemC cycle-accurate model of a 4-core NoC-based MPSoC architecture using the custom topology presented in Figure 4.4 on the next page. The NoC-based system has been configured using the ×pipes compiler as follows:

• Three switches of 5x5 ports.

• Four ARM processors with their private memories placed from 0x00 to 0x04.

• A shared memory, semaphore and an interrupt device placed at 0x19, 0x20 and 0x21 on the system address space, respectively.

• All QoS features developed in this chapter (QoS NIs and extended switches).

98 HW-SW Components for Parallel Embedded Computing on NoC-based MPSoCs 4.5. QoS Verfication and Traffic Analysis 99

• The NoC uses wormhole packet switching, stall&go flow control, and fixed priority arbitration if no priority packet is requested.

(Figure content: CPU_0-CPU_3 attach through QoS-NI initiators and the private memories PM_0-PM_3 through QoS-NI targets to switch_0 and switch_1; the shared memory SHM (0x19), the semaphore module SMM (0x20) and the interrupt device INT (0x21) attach through QoS-NI targets to switch_2.)

Figure 4.4: Custom topology of the NoC-based MPSoC built from 5x5 switches

In order to validate our QoS support in the previous architecture, we developed a full functional verification framework to demonstrate the strength of the designed QoS. From each processor, we continuously inject traffic towards the shared memory placed at address 0x19, which automatically generates congestion on the NoC backbone. Four different scenarios have been tested; a possible runtime setup for one of them is sketched after the list:

• Scenario 0: No traffic prioritization.

• Scenario 1: Only CPU 0 traffic is not prioritized.

• Scenario 2: The traffic of the right cluster (i.e. CPU 2, CPU 3) is prioritized over the cluster on the left (i.e. CPU 0, CPU 1).

• Scenario 3: Only CPU 3 traffic is prioritized.
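These scenarios can be configured at runtime with the middleware routines from Listing 4.3. The sketch below shows one plausible setup for Scenario 2; the node identifiers are hypothetical placeholders and the actual configuration code is not part of the original text.

/* Plausible runtime setup of Scenario 2 (right cluster prioritized).
 * The node identifiers below are hypothetical placeholders. */
#define CPU_2   2
#define CPU_3   3
#define SHM_ID  0x19   /* shared memory node (placed at 0x19 in Figure 4.4) */

void configure_scenario_2(void)
{
    resetPriorities();              /* start from best-effort traffic only */
    setPriority(CPU_2, SHM_ID, 1);  /* prioritize CPU_2 -> shared memory   */
    setPriority(CPU_3, SHM_ID, 1);  /* prioritize CPU_3 -> shared memory   */
}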

The results are shown in Figure 4.5 on the following page. In scenario 0, the order of completion of the CPUs is: 0, 1, and then 2, 3. This is because no traffic prioritization has been defined using the middleware routines, and therefore the fixed priorities defined in the switches (see Equation 4.5) resolve the concurrent access to the shared resource.

CPU_P0 > CPU_P1 and CPU_P2 > CPU_P3    (4.5)

and the cluster on the left has more priority than the right cluster on switch_2.


In scenario 1, where all processors are prioritized except CPU 0, we can see the influence of changing the priorities at runtime, because all the other processors finish before CPU 0. In scenario 2, we simulated a similar approach, reversing the fixed priorities so that the cluster on the right has more priority than the one on the left. As expected, CPU 2 and CPU 3 finish before CPU 0 and CPU 1, in contrast to scenario 0. In scenario 3, we assessed the effect of prioritizing only CPU 3. As we can observe, CPU 3 finishes first, almost at the same time as CPU 0 and CPU 1, but the collateral effect is that CPU 2 is delayed, since it is always less prioritized when it competes with the rest of the processors.

(Figure content: completion time in ns of CPU_0, CPU_1, CPU_2 and CPU_3 under Scenario 0 to Scenario 3.)

Figure 4.5: Validation and traffic prioritization under different scenarios

4.6 Synthesis results

In this section, a quantification of the overhead, in terms of resource usage, and of the degradation of the circuit performance, in terms of fmax, of the implemented QoS features is reported by synthesizing the components on different FPGA devices2. A detailed quantification has been done at two levels: (i) at NI level and (ii) at switch level, comparing the circuit blocks with and without the QoS extensions. The components have been fixed to 32 bits for the flit width, as well as for the address and data buses, while the switches have been configured with minimal buffer capacity (2-flit input buffers).

2The results have been extracted using Synplify® Premier 9.4 [167] to synthesize each component on different FPGAs without any optimization. VirtexII (xc2v1000bg575-6) and Virtex4 (xc4vfx140ff1517-11) from Xilinx, and StratixII (EP2S180F1020C4) and StratixIII (EP3SE110F1152C2) from Altera.

100 HW-SW Components for Parallel Embedded Computing on NoC-based MPSoCs 4.6. Synthesis results 101

Note that all synthesis results reported in the next sections are before place-and-route and without any optimization. After layout, the area normally increases and the maximum frequency drops.

4.6.1 QoS Overhead at Network Interface

The QoS features have been implemented on the AMBA 2.0 AHB network interfaces presented in Chapter 3 on page 57, but also on the OCP network interfaces already present in ×pipes [52, 53]. At first glance in Figure 4.6, it is easy to observe that the major overhead at NI level is in the ni_request module. This result is easy to understand, since it includes the mechanism to trigger QoS packets (a QoS encoder, which codifies the bits of the QoS field) using a set of configuration registers, as well as several changes on the request FSM to generate the "fake" transactions. On the remaining modules, such as ni_receive, ni_resend and ni_response, the QoS extensions do not have a significant impact, since only small changes have been done on the FSMs of each module.

(Figure content, area in LUTs on VirtexII, Virtex4, StratixII and StratixIII devices: (a) AMBA NI Initiator area (QoS vs. noQoS); (b) AMBA NI Target area (QoS vs. noQoS); (c) OCP NI Initiator area (QoS vs. noQoS); (d) OCP NI Target area (QoS vs. noQoS).)

Figure 4.6: AMBA – OCP Network Interfaces (QoS vs. noQoS)


However, even if the trend on the OCP NIs (see Figure 4.6(c) and Figure 4.6(d) on the previous page) is similar to that of the AMBA 2.0 NIs (see Figure 4.6(a) and Figure 4.6(b) on the preceding page), the overhead is a little more significant on the OCP NIs, especially on ni_request. A summary, for each FPGA, of the effect of including the QoS extensions on the AMBA 2.0 AHB and OCP NIs is presented in Figure 4.7(a) and Figure 4.7(b), respectively. Computing the average, we can affirm that the extended AHB NIs with QoS have around 12% area overhead, whereas for the OCP NIs this overhead increases to 27%. However, these area overheads can be accepted, since the percentages are relative to NoC components of a few hundred LUTs, and therefore, at the end of the day, we are talking about less than a few tens of LUTs of overhead. In contrast, as shown in Figure 4.7, the fmax degradation of the circuitry of both NIs can be considered strictly negligible, since the average among the FPGA devices is around 0%. Surprisingly, in some cases, for instance on Virtex4 FPGAs, the synthesis gives a better fmax for the extended NI with QoS than without, even if a priori the QoS NI is a little more complex. In addition, the critical paths are normally in the switches, and therefore the fmax at NI level is not especially critical.


(a) QoS impact at AMBA NI (LUTs, fmax) (b) QoS impact at OCP NI (LUTs, fmax)

Figure 4.7: QoS impact (LUTs, fmax) at NI level (AMBA-OCP)

4.6.2 QoS Overhead at Switch Level

In this section, we focus on exploring the synthesis results carried out on multiple switches by increasing the cardinality, in order to evaluate the QoS overhead on different FPGA devices. Because the implemented QoS features are strongly dependent on the number of ports, as commented before, we configured the switches with multiple arities, fixing the flit width to 32 bits.


In Figure 4.8 we show the synthesis results (LUTs, fmax) of multiple switch configurations with and without the QoS extensions3. As expected, without QoS, the behaviour in all figures reflects that when the switch radix increases, the area increases and the frequency drops. The same behaviour can be extracted from the plots showing the switches with QoS extensions. However, when QoS is included, it is easy to observe that the area increment grows exponentially, and the frequency drops, but not dramatically.

(Figure content, switch arities from 2x2 up to 19x19 on VirtexII, Virtex4, StratixII and StratixIII devices: (a) Switch fmax without QoS; (b) Switch fmax with QoS; (c) Switch area (in LUTs) without QoS; (d) Switch area (in LUTs) with QoS.)

Figure 4.8: Switch synthesis varying the arity (QoS vs. noQoS)

A summary of the results, computing the average area overhead and the decrease of the operating frequency among the switch configurations on each FPGA device, is reported in Figure 4.9 on the following page. As shown in Figure 4.8(a), on the fastest FPGA, the StratixIII devices, the fmax ranges from 280.6 MHz down to 143.2 MHz between the smallest and the largest switch configuration without QoS. Including the QoS extensions, on the same FPGA, the fmax range drops to 247.5 and 128.8 MHz. As shown in Figure 4.9 on the following page, the operating frequency of the switch with QoS drops between 11-20% on each FPGA device.
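As a quick check of the lower end of that range, the relative frequency drop for the smallest StratixIII switch configuration can be computed directly from the figures just quoted:

(280.6 - 247.5) / 280.6 ≈ 11.8%

which is consistent with the reported 11-20% per-device averages.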

3The QoS extensions have been configured to support circuit reservation and only two levels of priority.


Similarly, comparing the area resources of the switches with and without QoS, as shown in Figure 4.9, the overhead is quantified between 23-28% depending on the FPGA device. Even if the area overhead and the frequency drop on the switches cannot be considered negligible, the QoS extension can certainly be considered a low-cost implementation because of its simplicity, avoiding the use of virtual channels and complex table-based TDMA schemes to guarantee latency and bandwidth, which would increase the area resources of each switch. Instead, a unified buffer priority scheme and the emulation of a simple but well-known circuit switching on a wormhole packet switching network have been implemented successfully.


Figure 4.9: QoS impact (fmax, LUTs) at switch level

Next, in Figure 4.10 on the facing page, we explore in more depth the scalability, varying the number of priority levels on each of the switch configurations presented in Figure 4.8 on the previous page. As expected, when 8 priority levels are used, the area cost increases considerably; concretely, the area resources increase by close to 100%, i.e. doubling the area of the switch without QoS. Despite this fact, the area resources stay moderately close when 2 or 4 priority levels are used, ranging from 23-40% in Virtex FPGAs. In the Stratix FPGAs the trend is similar, but the overhead and the range to include 2 or 4 levels are wider, going from 25-56% in the worst case. On the other side, in terms of fmax, as shown in Figure 4.10 on the facing page, the frequency drops fast when more levels of priority are included on the switch. Despite this fact, the curve tends to stabilize when more than 8 levels are implemented. In fact, the possible scenarios, traffic classes or use

cases in the embedded domain can be covered using up to 8 levels of priority. Thus, a priori, unless it is a very specific application, it will not be necessary to extend the switch with more than eight levels of priority. In fact, it is easy to observe that the overhead to include 2 and 4 priority levels, even if it is not negligible, can be accepted taking into account the potential benefits of having runtime QoS on the final system.


Figure 4.10: QoS impact (LUTs, fmax) according to the priority levels

4.7 Conclusion and Future Directions

NoC-based systems rely on an OSI-like protocol stack (see Section 2.1.7 on page 31) which allows NoC services to be supported by hardware at NoC level within the switch. In addition, our approach is based on controlling the QoS features from the software layers (through the transport layer at the network interfaces) up to the application level and the parallel programming model (see Section 6.8 on page 166).

Predictability on homogeneous, but especially on heterogeneous, parallel NoC-based MPSoCs needs to be addressed by providing the right mix of soft and hard real-time guarantees. Moreover, thanks to predictability, individual IP cores with a great variety of traffic requirements can be more easily integrated on a generic NoC-based MPSoC. For instance, multimedia real-time cores can be integrated effortlessly with low-rate UART I/O cores, facilitating the analytical verification models.


In this work, we introduced a complete framework to support QoS-related features up to the software layers. We developed a reasonably low-cost QoS combining BE and GT at NI and switch level on the NoC backbone. Additionally, a set of low-level and middleware functions is provided to reconfigure the traffic behaviour on the NoC backbone whenever it is required. Based on both the low-cost hardware design and a lightweight software implementation of the middleware API, we can achieve an extremely efficient runtime QoS reconfiguration support on our NoC-based MPSoC. This hardware-software infrastructure can also be exploited effectively by QoS managers or an OS scheduler to steer the NoC fabric, ensuring end-to-end QoS and reconfiguring it in case of congestion.

Short-term future work is to include QoS on top of parallel programming models. In fact, in Section 6.8 on page 166, the QoS extension at NoC level has been integrated by means of the QoS middleware API into the runtime OpenMP library presented in [183]. Clearly, the next step in the same direction will be to extend our message passing communication library, ocMPI (see Section 6.5 on page 146), to provide GT during message passing.

Future directions are to provide the rest of the NoC services at runtime, such as performance monitoring and tuning, DVFS and dynamic power management, etc. As in this work, all these services must be made available through hardware components and software libraries to reconfigure the chip whenever necessary based on software directives.

CHAPTER 5

HW-SW FPU Accelerator Design on a Cortex-M1 Cluster-on-Chip

This chapter1 addresses the HW-SW design of a Floating Point Unit (FPU) on a Cortex-M1 MPSoC cluster. A full-featured hardware design of the FPU as a decoupled accelerator has been done in conjunction with a software API to perform floating point operations in a transparent way from the user point of view (i.e. without doing any changes on the ARM compiler).

5.1 Motivation and Challenges

To face the increasing design complexity of next generation NoC-based MPSoCs, the efficient integration and reuse of IP cores is essential to integrate highly complex functionality on a single chip. However, this approach imposes significant challenges, mainly in the field of the communication between hardware and software components on the system.

Nowadays, most MPSoCs, due to the inherent requirement to compute complex software applications and kernels, make use of floating-point arithmetic, whether for occasional calculations or for intensive computation kernels. The most relevant applications in the area of embedded systems are:

• Digital multimedia processing of high-quality audio and video.

• Digital signal processing tasks, particularly spectral methods such as DCT/iDCT, FFT.

1The author would like to thank Per Strid for his contributions on the Cortex-M1 Polygonos MPSoC platform and John Penton for the development of the Cortex-R4F processor, as well as the rest of the brilliant researchers working at the ARM R&D Department in Cambridge.

• 3D gaming and graphics processing (computation of floating point data sets within rotation, transformation, etc.).

• Matrix inversion, QR decomposition, etc., in software defined radio or wireless communications and radar kernels.

• Sophisticated control algorithms in embedded automotive applications.

• Computation intensive applications in real-time systems.

Often, in low-cost and non-performance-critical systems, all these floating point operations are emulated and executed using only a floating point software library without any hardware support. However, to achieve high performance embedded computing, in the next generation of embedded MPSoC architectures there is a clear requirement to implement them in hardware rather than in software.

The IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754) [184] is the most widely-used standard (since 1985) for floating-point computation, and is followed by many CPUs, compilers and custom hardware FPU implementations. The standard defines formats (see Figure 5.1 on the next page) for representing floating-point numbers in single and double precision (i.e. including zero and denormal numbers, infinities and NaNs) together with special values (such as ±∞ or ±0, as shown in Figure 5.1(a) on the facing page). The IEEE 754 representation is composed of three main fields: (i) sign, (ii) exponent and (iii) mantissa. As shown in Figure 5.1(a) and Figure 5.1(b) on the next page, depending on the precision the widths of these fields are slightly different. A set of floating-point operations (e.g. add, subtract, multiply, division, square root, fused multiply-add, etc.) has been defined robustly on top of this representation, as well as type conversions (int to float, float to double, etc.) and manipulation of the sign of these values (e.g. abs, negate). The standard also specifies four rounding modes to be applied when a floating point operation is performed:

(1) Rounding to nearest (RN): rounds to the nearest value; if the number falls midway it is rounded to the nearest value above (for positive numbers) or below (for negative numbers).

(2) Rounding towards +inf (RP): directed rounding towards +∞.

(3) Rounding towards -inf (RM): directed rounding towards -∞.

(4) Rounding towards zero (RZ): directed rounding towards 0 (also called truncation),

and five exceptions: invalid operation (e.g. the square root of a negative number), division by zero, overflow and underflow (if a result is too large or too small to be represented correctly), and inexact.

(a) IEEE 754 single precision

(b) IEEE 754 double precision

Figure 5.1: Single and double precision IEEE 754 representation
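To make the single-precision field layout concrete, the following small C sketch (not part of the original design) extracts the sign, exponent and mantissa of a value according to the IEEE 754 layout shown in Figure 5.1(a): 1 sign bit, 8 exponent bits and 23 mantissa bits.

#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Decompose an IEEE 754 single-precision number into its three fields. */
static void fp32_fields(float f, uint32_t *sign, uint32_t *exponent, uint32_t *mantissa)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));      /* reinterpret the 32-bit pattern  */

    *sign     = (bits >> 31) & 0x1;       /* bit 31                          */
    *exponent = (bits >> 23) & 0xFF;      /* bits 30..23 (biased by 127)     */
    *mantissa =  bits        & 0x7FFFFF;  /* bits 22..0 (implicit leading 1) */
}

int main(void)
{
    uint32_t s, e, m;
    fp32_fields(-1.5f, &s, &e, &m);
    /* -1.5f -> sign=1, exponent=127 (0x7F), mantissa=0x400000 */
    printf("sign=%u exponent=0x%X mantissa=0x%06X\n",
           (unsigned)s, (unsigned)e, (unsigned)m);
    return 0;
}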

In this chapter, we will focus on the exploration of different alternatives to integrate a floating point unit, taking into account all hardware and software issues. We will first outline how we design in hardware an AMBA-based standalone FPU, reusing an existing one targeted to accelerate floating point operations for the Cortex-R4F processor [185]. We will then explain how to integrate the FPU in a full-featured MPSoC cluster based on the ARM Cortex-M1 [186], the ARM soft-core targeted to FPGA devices. Finally, we will explore the different methods to communicate between the designed FPU and the processor, under different scenarios, based on crossbar or NoC backbones.

5.2 Related Work

Nowadays, FPGA platforms are emerging as HPC systems, in terms of GFLOPS/Watt, implementing parallel double-precision floating point coprocessors to enable the concept of Green Computing [187,188], thanks to the great amount of embedded DSP blocks (i.e. adders, multipliers, etc.) integrated in the latest FPGA devices. In addition, the main FPGA vendors, such as Altera or Xilinx, offer the possibility to include custom hardware FPUs together with their soft-core processors. Altera offers the FPU as a custom instruction inside the Nios II processor [20], and also the possibility to create a standalone Altera Megafunction [189] (e.g. addition/subtraction, multiplication, division, logarithm,

etc.) according to the floating point operation that has to be accelerated. Xilinx provides an equivalent approach: the user can optionally integrate an FPU on MicroBlaze or PowerPC [190] processors, or a custom floating point operation [191]. Thanks to the use of soft-core processors, such as Nios II [20] and MicroBlaze [18], and depending on the requirements of our system, we can choose whether to integrate the FPU or not. Another relevant FPU implementation is GRFPU, an AMBA AHB FPU compliant with the SPARC V8 standard (IEEE-1754) provided in conjunction with the LEON3 processor [16].

Generally, these FPUs can be integrated as tight-coupled coprocessors with each CPU to compute floating point operations purely in hardware, and all of them are IEEE 754 compliant. Normally, they support single or double precision addition, subtraction, multiplication, division, square root, comparison, absolute value and sign change operations, as well as built-in conversion instructions to transform from integer to floating point types and vice versa.

Xtensa processors also provide these capabilities, extending the base ISA by using the Tensilica Instruction Extension (TIE) [192,193] to add new features, for instance AES/DES encryption, FFT and FIR operations, or double precision floating point instructions. Thus, the processor can be tailored to a particular set of SoC applications, delivering high performance and energy efficiency at a reasonable cost, becoming an ASIP-like processor.

In this work, we follow a similar approach to those presented before to accelerate the existing software emulation libraries using the IEEE 754 floating point representation. Thus, it is immediately apparent that there is value in re-designing and reusing an existing FPU from ARM, concretely the ARM Cortex-R4F FPU, to be the FPU of the Cortex-M1 processor, but also of a Cortex-M1 MPSoC cluster on chip. In addition, from a research point of view, there are many important issues to explore:

• Hardware and software communication interfaces.

• Study the viability of sharing the FPU among several processors by exploring different synchronization methods.

• Test the performance of the hardware-software communication and synchronization protocols.

• Atomic memory accesses on different interconnection fabrics.

• Cycle count comparison on the execution of different floating point instructions.


• Speedup of our FPU accelerator floating point functions versus the existing ARM software emulation floating point library.

• Evaluate the increment in area when the HW floating point support is included in the system.

In addition, since this dissertation is mainly focused on MPSoCs and the evaluation of interconnection fabrics, we provide analyses reporting results under two alternative fabric architectures, i.e. an AMBA AHB ML crossbar fabric [27] and a ×pipes NoC backbone. We believe that the implementation of, and experimentation with, such platforms is crucial in the context of the main goal of this dissertation, i.e. the assessment of the trade-offs involved in the communication of hardware and software components on MPSoCs, as well as providing hardware support during the execution of floating point operations to achieve high performance parallel embedded computing.

5.3 Cortex-M1 FPU Accelerator – HW Design

In the previous section, we listed the most important FPUs together with their corresponding CPUs, mainly targeted to FPGA devices or to ASIC technologies, especially in the case of the GRFPU-LEON3 and the Xtensa TIE processors. Next, we describe our proprietary design of a custom FPU accelerator based on the existing Cortex-R4F FPU. Our FPU accelerator is targeted to be included in conjunction with the Cortex-M1 soft-core processor. However, to face the hardware design of the FPU, two restrictions have been imposed in order to preserve the Cortex-M1 architecture:

• It must be done without any changes to the Cortex-M1 core, and by extension to the ARM-v6M architecture.

• It must be completely independent from the ARM compiler.

Both restrictions imply that the FPU cannot be tight-coupled to the Cortex-M1 architecture, inside the processor datapath, as it is done in the Cortex-R4F [185] CPU and in the rest of the related work. In this work, the FPU accelerator will be designed to be standalone and attached to an AMBA 2.0 AHB interconnection fabric (i.e. bus, crossbar or NoC backbone).

The Cortex-R4F processor's FPU performs single and double precision floating-point calculations that allow a greater dynamic range and accuracy than fixed-point calculations. The FPU is IEEE 754 compatible and is backward compatible with earlier ARM FPUs (VFP9/10/11). The implementation is

optimized for single precision (SP) processing without sacrificing double precision (DP) support. The architecture is presented in Figure 5.2. Basically, the FPU architecture is embedded in the Cortex-R4 processor datapath, using multiple pipeline stages (fetch, decode, execution and write back), and it also has its own 32-bit register file. As it is tight-coupled with the CPU, it makes use of the CPU ALU to execute the external integer multiplications necessary to perform floating point operations. The architecture is divided in two parts: (i) a dual pipeline architecture for SP, and (ii) a unified compact pipeline for DP floating point operations. Since in this work we will support SP operations, we will focus on analyzing the dual SP pipeline architecture.


Figure 5.2: Cortex-R4 Floating Point Unit (FPU)

As shown in Figure 5.2, the idea is to cut out the single precision (SP) block of this complex tight-coupled FPU and reuse it as the back-end of our SP FPU accelerator. In this work, we mainly focus on the front-end design, together with the integration and validation of the existing back-end, in order to implement an SP FPU compatible with AMBA 2.0 AHB. The new design, rather than a tight-coupled FPU, will be an AMBA-based memory-mapped standalone FPU accelerator for the Cortex-M1 (see Figure 5.3 on the facing page). The front-end implementation includes the following features:

• An AMBA 2.0 AHB front-end interface, fully compatible with AMBA 2.0 AHB [14] or AHB ML standards [27].

• A simple memory register bank.

• An AMBA-based control unit.

• A semaphore to block/unlock exclusive access to the FPU.

• A micro-instruction decoder.


• The external multipliers required.


Figure 5.3: Cortex-M1 AHB SP memory mapped FPU accelerator

The AMBA AHB interface module is in charge of translating control and data AHB transactions from the CPU towards our FPU. Since the FPU will be attached to AMBA-based systems, an FSM has been designed as the control unit of the FPU. This control unit supervises the CPU-FPU protocol presented in Section 5.4 on page 116, by properly asserting/de-asserting the control lines to/from the other blocks in our SP FPU accelerator. To support the same features as the tight-coupled ARM FPU, an equivalent register bank has been included in our standalone design. The register bank of the FPU is presented in Figure 5.4. It consists of multiple R/W registers memory mapped on the address space of the designed SP FPU accelerator. This front-end has the same features as its tight-coupled version, and it is extremely slim and latency efficient, since it only takes one cycle of delay for writes and two cycles for read operations on the register bank.

FPSID        (32-bit, RO)   AddressBase + 0x00
FPSCR        (32-bit, RW)   AddressBase + 0x04
FPEX         (32-bit, RW)   AddressBase + 0x08
FP Operand A (32-bit, WO)   AddressBase + 0x0C
FP Operand B (32-bit, WO)   AddressBase + 0x10
FP Opcode    (32-bit, WO)   AddressBase + 0x14
Result PipeA (32-bit, RO)   AddressBase + 0x18
Result PipeB (32-bit, RO)   AddressBase + 0x1C

Figure 5.4: SP FPU accelerator memory mapped register bank
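For reference, the register map of Figure 5.4 can be captured in software as word offsets from the accelerator's base address, in the style used later in Listing 5.2. This is only a sketch: the base address and the names FPU_ADDRESS, FPSID_OFFSET_REG, FPSCR_OFFSET_REG and FP_RESULT_PIPEB_OFFSET_REG are assumptions, and the numeric offset values are simply derived from the byte offsets in the figure.

/* Word-indexed offsets of the SP FPU accelerator register bank (from
 * Figure 5.4, byte offset / 4). Names and FPU_ADDRESS are assumptions. */
#define FPU_ADDRESS                0x50000000u  /* hypothetical base address */

#define FPSID_OFFSET_REG           0   /* 0x00, RO */
#define FPSCR_OFFSET_REG           1   /* 0x04, RW */
#define FPEX_OFFSET_REG            2   /* 0x08, RW */
#define FP_OPA_OFFSET_REG          3   /* 0x0C, WO */
#define FP_OPB_OFFSET_REG          4   /* 0x10, WO */
#define FP_OPCODE_OFFSET_REG       5   /* 0x14, WO */
#define FP_RESULT_PIPEA_OFFSET_REG 6   /* 0x18, RO */
#define FP_RESULT_PIPEB_OFFSET_REG 7   /* 0x1C, RO */

/* Volatile word pointer used to access the memory-mapped register bank. */
static volatile unsigned int * const AMBA_FPU =
    (volatile unsigned int *) FPU_ADDRESS;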

HW-SW Components for Parallel Embedded Computing on NoC-based MPSoCs 113 114 5.3. Cortex-M1 FPU Accelerator – HW Design

In the register bank of our AMBA-based SP FPU accelerator, we can logically distinguish two different types of registers:

• 32-bit registers used as storage for floating point numbers.

• FPU system registers.

Basically, the registers used for floating point operations are FP Operand A and FP Operand B, used as write-only input operands, and two read-only output result registers connected to each pipe, i.e. Result Pipe A and Result Pipe B, which hold the final result of the operation selected in the FP Opcode register. The FP Opcode register, shown in Figure 5.5, uses 9 bits to hold the opcode (i.e. FADDS, FSUBS, FMULS, FTOSIS, FDIVS, FSQRTS, etc.) during its execution; furthermore, it is used to configure two different flags (i.e. default NaN or flush to zero), as well as one of the four rounding modes specified in the IEEE 754 floating point standard.

FP_OPCODE[31:13]  Reserved
FP_OPCODE[12]     DEFAULT_NAN
FP_OPCODE[11]     FLUSH_TO_ZERO
FP_OPCODE[10:9]   ROUND_MODE
FP_OPCODE[8:0]    OPCODE_INSTRUCTION

Figure 5.5: Floating Point OpCode register (FP Opcode) layout
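Following the layout in Figure 5.5, the word written to FP Opcode can be assembled by packing the fields into their bit positions, as in the sketch below. FSUBS_OPCODE reappears in Listing 5.2, but its numeric value, the rounding-mode encodings and the remaining macro names are assumptions for illustration only.

/* Pack the FP_Opcode fields according to Figure 5.5.
 * Field positions come from the figure; the concrete values of the
 * rounding modes and of FSUBS_OPCODE are assumptions. */
#define FP_ROUND_RN        0u          /* assumed encoding of the 4 modes */
#define FP_ROUND_RP        1u
#define FP_ROUND_RM        2u
#define FP_ROUND_RZ        3u

#define FSUBS_OPCODE       0x021u      /* hypothetical 9-bit opcode value */

static inline unsigned int fp_opcode_word(unsigned int opcode9,
                                          unsigned int round_mode,
                                          unsigned int flush_to_zero,
                                          unsigned int default_nan)
{
    return  (opcode9        & 0x1FFu)          /* FP_OPCODE[8:0]   */
          | ((round_mode    & 0x3u)   << 9)    /* FP_OPCODE[10:9]  */
          | ((flush_to_zero & 0x1u)   << 11)   /* FP_OPCODE[11]    */
          | ((default_nan   & 0x1u)   << 12);  /* FP_OPCODE[12]    */
}

/* Example: a subtraction rounded to nearest, both flags disabled.      */
/* AMBA_FPU[FP_OPCODE_OFFSET_REG] =
       fp_opcode_word(FSUBS_OPCODE, FP_ROUND_RN, 0, 0);                 */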

The system registers (i.e. FPSID, FPSCR, FPEX) are used to configure and examine the status of our SP FPU accelerator.

• Floating Point System ID register (FPSID): a read-only 32-bit register which stores ARM-specific information (see Figure 5.6), such as the variant and revision, whether it is a tight-coupled coprocessor or it uses a SW library, and whether it supports single and double precision (D variant) or not.

FPSID[31:24]  FPU_IMPLEMENTOR
FPSID[23]     SW
FPSID[22:21]  FORMAT
FPSID[20]     SNG
FPSID[19:16]  ARCH
FPSID[15:8]   PART
FPSID[7:4]    VARIANT
FPSID[3:0]    REVISION

Figure 5.6: Floating Point System ID register (FPSID) layout

• Floating Point Status and Control register (FPSCR): a read/write 32-bit configuration and status control register used to check the NZCV condition flags (i.e. less than, equal, greater than or unordered result) used in floating point comparisons, but also to examine whether an exception occurred during the execution of a floating point operation.


FPSCR[31:10]  Reserved
FPSCR[9:6]    NZCV
FPSCR[5]      IFZ
FPSCR[4]      INEXACT
FPSCR[3]      UNDERFLOW
FPSCR[2]      OVERFLOW
FPSCR[1]      DIVISION BY ZERO
FPSCR[0]      INVALID OP

Figure 5.7: Floating Point Status and Control register (FPSCR) layout

• Floating Point Exception register (FPEX): a read/write 32-bit register used to deal with code exceptions through the EX and EN bits. Inside this register, on the bits reserved for user-defined implementation, we map two special features of our SP FPU accelerator. We include an embedded semaphore bit, i.e. FPU_BUSY on FPEX[1], to enable the capacity to share the FPU among different processors in an MPSoC, as shown in Section 5.6 on page 122, and another bit to check from the CPU whether the floating point operation is finished or not, i.e. DATA_READY on FPEX[0].

FPEX[31]    EX bit
FPEX[30]    EN bit
FPEX[29:2]  Reserved
FPEX[1]     FPU_BUSY
FPEX[0]     DATA_READY

Figure 5.8: Floating Point Exception Register (FPEX) layout

On the other side, the back-end integrated in our SP FPU accelerator is almost the same as the one used in the Cortex-R4F, with the set of embedded integer multipliers required by the core. Thus, as shown in Figure 5.2 on page 112, the wrapped SP block is divided in two pipes, Pipe A and Pipe B. Pipe A is used for addition/subtraction, and Pipe B is used for multiplication, division and square root, as well as for the comparisons and conversions among integer and standard float C/C++ data types.

In Pipe B, the multiplication is pipelined, but the division and the square root use an iterative algorithm, stalling the pipeline of the FPU whenever it is required. This issue makes it impossible to execute pipelined multiplications and divisions. Thus, in our design, each floating point operation is not pipelined, due to the fact that the FPU is connected through an interconnection fabric rather than in the CPU datapath. In both pipelines, results are stored in the WR stage in a 32-bit register. Moreover, the FPU also stores, in the FPSCR register, the exceptions and flags, generating the corresponding value according to the operation executed on the back-end. This dual-pipe architecture of the back-end, together with the slim and efficient front-end, will deliver the best performance-area trade-offs when the Cortex-M1 executes floating point operations using our SP FPU accelerator.


5.4 HW-SW CPU-FPU Communication Protocol

As explained before, the FPU is normally embedded in the processor datapath as a tight-coupled coprocessor. However, in our approach the SP FPU accelerator is completely standalone and memory mapped on the system. This architecture opens different alternatives when the CPU needs to issue a floating point instruction and communicate with the external FPU accelerator. In Figure 5.9 on the next page, our CPU-FPU protocol at transaction level is presented on an AHB-based system, whereas in Listing 5.1 we enumerate the main points, consisting of six basic transactions.

Listing 5.1: CPU – FPU protocol definition
Step 0 - Check if the FPU is not busy (only if it is shared)
Step 1 - Transfer OperandA
Step 2 - Transfer OperandB
Step 3 - Transfer OpCode
Step 4 - Wait until the result is ready
Step 5 - Read flags (optional or by IRQs)
Step 6 - Read result

Principally, this CPU-FPU protocol can be viewed as three main groups of transfers:

(i) Check whether the FPU is busy computing a previous floating point operation or not (Step 0).
(ii) FPU configuration (Steps 1-3), based on three 32-bit write transactions involving the operands and the opcode of the floating point unit.
(iii) Wait until the result is computed, and afterwards collect it (Steps 4-6).

In (i), the principle is to enable the ability to share the FPU among several processors in the system. This means that at any time, any processor or specific piece of hardware can issue a floating point instruction to our FPU, and therefore some synchronization is required. Furthermore, since the proposed protocol in Listing 5.1 performs individual transactions to access and configure the FPU, these transactions must be atomic in order to avoid another CPU accessing the FPU while it is computing. Normally, atomic memory accesses are implemented by locking the AHB bus. As long as one master locks the bus, the arbiter does not grant it to any other master. Unfortunately, the ARM-v6M architecture, and by extension the Cortex-M1, does not support exclusive access transactions which block the interconnection fabric.


In our design, synchronization and atomicity have been provided using a traditional lock-based method to support concurrent access to the shared resource. This process restricts the independence of processes by enforcing a certain order of execution between the consumer and the producer. The synchronization is implemented using a semaphore embedded in the FPEX register. In the simplest synchronization case, one or more processors competing for our shared FPU resource may poll the semaphore to gain access to the resource. Once a CPU gets access to the FPU, it automatically asserts the FPU_BUSY bit (see in Figure 5.9 how the FPEX value changes to 0x40000002) to block other processors which could potentially try to access that shared resource. When the floating point computation finishes, and after the result has been read by the requesting CPU, the FPU unlocks by de-asserting the FPU_BUSY bit. This process ensures at the same time the atomicity of the protocol, enabling the shareability in a multi-core system.

Figure 5.9: CPU – FPU hardware-software protocol

On a NoC-based MPSoC, the synchronization and the memory access atomicity can be achieved by blocking the NoC channel (inside each allocator of the switches involved in the path) using the QoS features presented in Section 4.3 on page 89. This QoS support enables exclusive transmission/reception of memory transactions, and consequently the granted master can

perform atomic memory operations, since no other processor will have access to the FPU. Although locked transfers are sometimes necessary, they do not match well with scalable communication architectures where the arbiter is distributed.

Once a particular CPU on the system has exclusive access to the FPU, in (ii) the FPU is configured from that CPU by doing three write transactions, involving the operands A and B and the opcode of the floating point instruction (see Steps 1-3 in Figure 5.9 on the preceding page).

Finally, from the hardware-software communication point of view, in (iii) the notification and the subsequent result collection can be performed using different HW-SW methods. The internal architecture (based on the block diagram in Figure 5.2 on page 112) of our Cortex-M1 SP FPU accelerator executes floating point operations with different latencies. The expected latencies for single precision operations are shown in Table 5.1.

Instruction type              Latency (cycles)
Addition                      3
Subtraction                   3
Multiplication                3
Comparison                    2
Square-root                   16
Division                      16
Conversions (from/to float)   3
Mul-Add                       6
Mul-Sub                       6

Table 5.1: Cortex-M1 SP FPU accelerator core latencies (from [194])

Since our FPU is completely standalone, as we show in Section 5.5 on the next page, a hardware-dependent software communication library has been implemented to handle the slave-side FPU-CPU notification. This notification indicates exactly the moment when the FPU has finished computing the floating point operation, and therefore when the result is ready to be read. In this work, we explored the following hardware-software notification schemes for our Cortex-M1 SP FPU accelerator.

• No Operation (NOP) – This mechanism inserts NOP operations in the software library to synchronize, wasting exactly the number of CPU cycles required by the FPU core to compute the floating point instruction.

• Polling – This mechanism is done from the CPU by permanently polling (performing


read transactions) the DATA_READY bit in the FPEX register.

• Interrupts (IRQ) – A very common slave-side method used when a slave wants to notify the master of an event; in our case, when the result is ready.

The number of transactions of the protocol (see Figure 5.9 on page 117) needed to perform a floating point calculation can vary depending on the availability of the FPU, the HW-SW notification method, and the operation itself. However, when the FPU is idle, the HW-SW protocol normally takes 6 data word transfers:

• 1 read of FPEX register.

• 3 writes to FP Operand A, FP Operand B and FP Opcode registers.

• 1 read of FPSCR register (exceptions and flags).

• 1 read of the FP Result A or FP Result B register, depending on the floating point instruction (see the sketch below).
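As a small illustration of that last read, the register to collect depends on which pipe executes the instruction (Pipe A for addition/subtraction, Pipe B for the rest, following Section 5.3). The helper below is only a sketch: FSUBS_OPCODE and FP_RESULT_PIPEA_OFFSET_REG appear in Listing 5.2, while FADDS_OPCODE and FP_RESULT_PIPEB_OFFSET_REG are assumed names.

/* Choose the result register index for a given opcode (illustrative only).
 * Pipe A handles addition/subtraction; Pipe B handles multiplication,
 * division, square root, comparisons and conversions. */
static inline unsigned int fp_result_offset(unsigned int opcode)
{
    switch (opcode) {
    case FADDS_OPCODE:                     /* assumed macro */
    case FSUBS_OPCODE:
        return FP_RESULT_PIPEA_OFFSET_REG;
    default:
        return FP_RESULT_PIPEB_OFFSET_REG; /* assumed macro */
    }
}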

5.5 FPU Hardware-Dependent Software Library Extensions

Since we have to communicate between the Cortex-M1 processor and the FPU, a full hardware-dependent software library has been implemented, following the CPU-FPU protocol, for each floating point instruction. However, we implement a subset of the most important single precision functions2. The functions implemented in our Cortex-M1 FPU software library are FADDS, FSUBS, FMULS, FDIVS and FSQRTS. In Listing 5.2 on the next page, we show the FSUB floating point instruction using our SP FPU accelerator. This function makes use of the alternative CPU-FPU notification protocols described in Section 5.4 on page 116 by means of pre-compiler directives. FSUB accesses the memory-mapped FPU accelerator

2Our SP FPU accelerator implements full-featured single precision floating point functionality, but only a subset of these features has been exercised using the proposed hardware-software approach, since the main goal is to explore the performance of the synchronization and the HW-SW protocols under different interconnection fabrics and architectures, rather than to test all FP instructions exhaustively.

using a pointer to the memory address where it is placed:

unsigned int *AMBA_FPU = (unsigned int *) (FPU_ADDRESS);

As shown in Listing 5.2, relative offsets to the AMBA_FPU address pointer are used to access the FPU registers when executing a floating point instruction. FSUB and all the other floating point instructions included in our FPU API software library, implemented to work with the Cortex-M1 SP FPU accelerator, only generate a few assembly instructions, so they are very efficient and can be executed in a few clock cycles.

Listing 5.2: FSUB implementation using a HW FPU accelerator
unsigned int amba_fpu_fsub(unsigned int a, unsigned int b)
{
    // Getting access to SP FPU accelerator (Step 0)
    volatile unsigned int busy = 1;
    do {
        busy = (AMBA_FPU[FPEX_OFFSET_REG] & FP_EXC_FPU_BUSY_MASK) >> 1;
    } while (busy);

    // Operands and opcode transfers (Steps 1-3)
    AMBA_FPU[FP_OPA_OFFSET_REG] = a;
    AMBA_FPU[FP_OPB_OFFSET_REG] = b;
    AMBA_FPU[FP_OPCODE_OFFSET_REG] = FSUBS_OPCODE;

    // CPU waits for FPU result (Step 4)
#ifdef NOPS_PROTOCOL
    __nop(); __nop(); __nop();
#endif

#ifdef POLLING_PROTOCOL
    while (!(AMBA_FPU[FPEX_OFFSET_REG] & FP_EXC_DATA_READY_MASK)) {}
#endif

#ifdef IRQS_PROTOCOL
    while (irqHandler == 0) {}
    irqHandler = 0;
#endif

    // Return the result (Steps 5-6)
    return AMBA_FPU[FP_RESULT_PIPEA_OFFSET_REG];
}

Using a NoC backbone, the library uses the QoS features to lock/unlock the channel towards the SP FPU. Thus, rather than polling the semaphore on the FPU_BUSY bit (FPEX[1]) corresponding to Step 0 of the CPU-FPU protocol, the library substitutes it with a call to ni_open_channel(FPU_ADDRESS, TRUE) to open and block the channel,

enabling bidirectional traffic towards our SP FPU. This mechanism adds some latency, but does not add any extra traffic on the NoC backbone. At the end of the function, after reading the result/flags from the FPU (Steps 5-6), the function must close the channel, and therefore a call to ni_close_channel(FPU_ADDRESS, TRUE) is required before the return statement.

The presented software library substitutes the existing __aeabi_xxxx() runtime functions, which emulate the floating point instructions purely in software. In Listing 5.3, we show the prototypes of the implemented floating point instructions. In case the system includes our SP FPU block, and by using a pre-compiler directive, i.e. -DCM1_AMBA_FPU, the ARM compiler will define the $Sub$$__aeabi_xxxx() functions. These functions call directly our equivalent amba_fpu_xxxx() hardware FP implementation from our software library.

Thus, at compile time, the compiler will attach the code of our amba_fpu_xxxx() macros as the particular implementation of the corresponding floating point instruction. This enables the hardware-assisted execution of the floating point instructions using our proposed HW-SW protocol between the Cortex-M1 and the memory-mapped SP FPU hardware accelerator.

Listing 5.3: HW FPU accelerator software routines
// Functions that substitute the single precision SW emulation routines
// FADD, FSUB, FMUL, FDIV, FSQRT
float $Sub$$__aeabi_fadd(float a, float b);
float $Sub$$__aeabi_fsub(float a, float b);
float $Sub$$__aeabi_fmul(float a, float b);
float $Sub$$__aeabi_fdiv(float a, float b);
float $Sub$$_fsqrt(float a);

Despite this fact, the original software library that executes the floating point instructions purely in software can also be called using the $Super$$__aeabi_xxxx() function corresponding to each FP operation (see Listing 5.4 for the prototypes).

Listing 5.4: Floating point software emulated routines
// Functions that call directly the single precision
// software emulated routines in case you need them
float $Super$$__aeabi_fadd(float a, float b);
float $Super$$__aeabi_fsub(float a, float b);
float $Super$$__aeabi_fmul(float a, float b);
float $Super$$__aeabi_fdiv(float a, float b);
float $Super$$_fsqrt(float a);
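To illustrate how this substitution mechanism can be wired together, the sketch below shows one plausible shape of a $Sub$$ wrapper that forwards a subtraction to the hardware-assisted routine from Listing 5.2 and falls back to the $Super$$ emulation otherwise. The union-based bit reinterpretation and the overall structure are assumptions for illustration, not the dissertation's actual implementation.

/* Hedged sketch of a $Sub$$ wrapper (ARM linker symbol patching).
 * amba_fpu_fsub() is the HW-assisted routine from Listing 5.2;
 * the float/word reinterpretation via a union is an assumption. */
float $Super$$__aeabi_fsub(float a, float b);   /* original SW emulation */

float $Sub$$__aeabi_fsub(float a, float b)
{
#ifdef CM1_AMBA_FPU
    union { float f; unsigned int u; } ua, ub, ur;
    ua.f = a;
    ub.f = b;
    ur.u = amba_fpu_fsub(ua.u, ub.u);   /* drive the memory-mapped FPU  */
    return ur.f;
#else
    return $Super$$__aeabi_fsub(a, b);  /* fall back to SW emulation    */
#endif
}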


5.6 Experimental results

In this section, we report the results obtained when implementing our SP FPU on real prototyping platforms (Xilinx and Altera FPGA devices). Later, we integrate the SP FPU on the complete Cortex-M1 MPSoC cluster architecture presented in Figure 5.10 on page 124, which can be parameterized using two interconnection fabrics and the different alternative CPU-FPU protocols to communicate the result of an operation. Similarly, we include, on the software stack of this Cortex-M1 MPSoC platform, the floating point API software library to enable the hardware-assisted execution of floating point instructions.

The proposed HW-SW framework lets us (i) assess and explore the validity of our HW-SW components; (ii) compare our standalone memory-mapped SP FPU against tight-coupled ones (like the one provided by Xilinx); and (iii) explore the ability to share our SP FPU, in terms of execution time, under different synthetic floating point workloads. The exploration also reports speedups of our hardware-assisted FP execution versus a pure software emulation of floating point instructions. Finally, we compute a significant metric, the MFLOPS, which is very important to ensure high-performance computing on embedded systems.

5.6.1 Cortex-M1 SP FPU Hardware Synthesis

In this section we report the synthesis results, in terms of LUTs, fmax, and the number of embedded memory and DSP blocks used, for our Cortex-M1 SP FPU accelerator. We synthesized the proposed SP FPU accelerator on multiple FPGA devices to explore the trend by evaluating the performance and area costs. Altera and Xilinx are currently the main FPGA vendors, and since the ARM Cortex-M1 and our SP FPU accelerator have been targeted to be used in FPGAs, we are going to compare our memory-mapped SP FPU accelerator with the equivalent tight-coupled one used by Xilinx. In Table 5.2 on the next page, we show the area costs and the fmax of the Xilinx FPU on a Virtex-5 FXT FPGA used with PowerPC processors (see [190] for further information). In Table 5.3 on the facing page, the results of our Cortex-M1 SP FPU accelerator3 are reported in the same terms on multiple Xilinx and Altera FPGAs.

3Even if our SP FPU accelerator has been conceived to be used with the Cortex-M1 soft-core processor, thanks to its AHB compatible interface it can be used with any other processor or system that follows the AMBA 2.0 AHB standard.


                                   Speed (MHz)
                   Part/Device     C_LATENCY=0  C_LATENCY=1  LUTs  RAMs  DSP Blocks
Double precision   Virtex-5        200-225*     140-160*     4950  0     13
Single precision   Virtex-5        200-225*     140-160*     2520  0     3

* Depending on the Virtex-5 FXT speed grade (-1 or -2), the fmax range goes from 200-225 MHz to 140-160 MHz according to the FPU configuration (C_LATENCY=0 for high-speed and C_LATENCY=1 for low-latency execution of floating point instructions).

Table 5.2: Synthesis of single and double precision PowerPC FPU

Focusing on the single precision implementation, we can affirm that our Cortex-M1 SP FPU accelerator (see Table 5.3) is very competitive compared with the one provided by Xilinx (see Table 5.2). In our implementation we use only 270 LUTs more (2520 for the Xilinx FPU vs. 2790 for our Cortex-M1 SP FPU accelerator). This overhead is probably caused by the integration of additional blocks in our standalone memory-mapped FPU, such as the register bank, the semaphore, the AMBA 2.0 AHB interface and its decoder unit, the external multipliers, etc., which presumably are not counted in the Xilinx implementation. In terms of fmax, our SP FPU is only 5% slower than the Xilinx FPU (134.2 MHz vs. 140 MHz) on a Virtex-5 FPGA, since our SP FPU is optimized to be low-latency (C_LATENCY=1). Obviously, depending on the FPGA technology and speed grade, the circuit performance oscillates. For instance, on the latest Altera Stratix III devices the FPU can reach 176.5 MHz, whereas when the synthesis is done on a Stratix I FPGA the frequency drops to 76.9 MHz.

Vendor   Part/Device          Speed (MHz)  LUTs  RAMs    DSP blocks
Xilinx   xc2v1000bg575-6      84.0         3592  1 M512  4
         xc4vfx140ff1517-11   101.6        3655  1 M512  4
         xc5vlx85ff1153-3     134.2        2890  1 M512  4
Altera   EP1S80B956C6         76.9         3937  0       1
         EP2S180F1020C4       132.0        2697  0       1
         EP3SE110F1152C2      176.5        2768  0       1

Table 5.3: Cortex-M1 SP FPU accelerator synthesis on Xilinx and Altera FPGA devices

Finally, our SP FPU design on Xilinx FPGAs uses 4 DSP48 blocks against the 3 used by the PowerPC FPU, and it requires 1 M512 block of embedded RAM to map the FPU register bank. Surprisingly, on Altera FPGAs our FPU does not require embedded FPGA memory (the register bank is implemented using LUTs and registers) and it only needs one DSP block.


As shown in the above tables, the area of the FPU is quite large (around 2x the area of a Nios II [20] or MicroBlaze [18] soft-core processor, and 4x larger when double precision support is included). Since our FPU design is completely standalone and memory-mapped as an accelerator on the system, rather than tightly-coupled with the CPU, we can consider sharing an arbitrary number of FPUs among different CPUs. To explore this possibility, and to validate the presented HW-SW infrastructure, a complete framework has been developed, integrating the software API library that performs hardware-assisted floating point operations in conjunction with our SP AMBA 2.0 AHB compliant FPU. For comparison, as shown in Figure 5.10, we have integrated our SP FPU on an AMBA 2.0 AHB template Cortex-M1 MPSoC architecture, which can be designed using two different communication backbones4:

• An AMBA 2.0 AHB Multi-Layer [27].

• A NoC backbone generated by ×pipes [52, 53] and using the AMBA 2.0 AHB NIs presented in Chapter 3 on page 57.

[Block diagram: two banks of four Cortex-M1 cores (CM1_0-CM1_3 and CM1_4-CM1_7, each with ITCM/DTCM) connected through several AHB fabrics, one shared SP FPU per bank (FPU0_3, FPU4_7), DMA, IPCM, Trickbox, AHB-APB bridge, internal/external memory ports and two 32 KB block RAMs.]

Figure 5.10: Cortex-M1 MPSoC cluster architecture

This template architecture, presented in Figure 5.10, consists of two symmetric 4-core Cortex-M1 banks interconnected using one of the interconnection fabrics described above.

4Both communication fabrics have been configured to operate at a single clock frequency.

In each bank we integrate our SP FPU, so that each bank shares a single FPU among its four Cortex-M1 processors. Each Cortex-M1 consists of:

• ARM-v6M architecture compatible core.

• 3-stage pipeline core.

• L1 memory system providing tightly-coupled instruction and data memories (ITCM/DTCM).

• Interrupt controller supporting 32 interrupt levels.

• 32-bit Timer.

The MPSoC architecture also contains a shared 64-Kbyte L2 AHB scratchpad memory to provide efficient data and program buffering, while the external memory acts as L3 memory. An eight-channel DMA controller is also included to support transfers between the Cortex-M1 TCMs and the AHB scratchpad or the external memory, and even between the TCMs of cores in the same cluster. The system also supports message passing by means of Inter-Processor Communication Module (IPCM) blocks between the cores of the MPSoC. A Trickbox is used to control asynchronous input signals to the cores, for instance generating reset pulses to keep processors idle; the logic to check the status of the cores (e.g. halted or lockup) is also implemented within this block. Finally, a master and a slave AHB port are provided to enable data transfers with other external systems and subsystems.

In Tables 5.4 and 5.5 on the following page, we show synthesis results of the Cortex-M1 MPSoC using an AHB multi-layer as communication backbone, with and without floating point support. This architecture has been configured to use 8 KB for the DTCM/ITCM, and 32 KB for the AHB scratchpad memories.

The impact, in terms of circuit performance and area resources, of integrating 2 FPUs to be shared by each bank of 4 Cortex-M1 processors is not negligible, but it is affordable. The fmax is not strongly affected by the inclusion of the FPU and, surprisingly, in some cases the resulting fmax is even better when the system includes floating point support in hardware. This is because the critical path of the system lies in the memory controller rather than within the SP FPU block. In terms of area, the overhead of including the proposed SP FPU shared among each bank of 4 Cortex-M1s represents about 12-14% of the whole system, depending on the FPGA technology.


Part/Device       Speed (MHz)   LUTs    rams                 DSP blocks
xc4vfx140ff1517   70.0          50315   RAM16: 512           24 DSP48
                                        Block RAMs: 98
xc5vlx85ff1153    117.6         38294   RAM128X1D: 512       24 DSP48
                                        Block RAMs: 98
EP2S180F1020C4    90.9          34360   M4Ks: 400            8 (78 nine-bit DSP)
                                        M512s: 16
                                        ESB: 1613824 bits
EP3SE110F1152C2   122.3         34696   M9Ks: 226            4 (32 nine-bit DSP)
                                        ESB: 1613824 bits

Table 5.4: Synthesis of Cortex-M1 MPSoC cluster without hardware floating point support on Xilinx and Altera FPGAs

Part/Device       Speed (MHz)   LUTs    rams                 DSP blocks
xc4vfx140ff1517   74.4          57346   RAM16: 512           32 DSP48
                                        Block RAMs: 98
xc5vlx85ff1153    123.9         43790   RAM128X1D: 512       32 DSP48
                                        Block RAMs: 98
EP2S180F1020C4    87.8          39979   M4Ks: 400            10 (78 nine-bit DSP)
                                        M512s: 18
                                        ESB: 1614592 bits
EP3SE110F1152C2   124.1         39936   M9Ks: 226            6 (46 nine-bit DSP)
                                        ESB: 1614592 bits

Table 5.5: Synthesis of Cortex-M1 MPSoC cluster with hardware floating point support on Xilinx and Altera FPGAs

If, on the same MPSoC architecture (see Figure 5.10 on page 124), each processor had its own tightly-coupled FPU, the overhead would be considerable: the area resources dedicated to supporting floating point in hardware alone would increase to 35-39% of the system.

5.6.2 Performance Studies – HW-FPU Accelerator vs. SW Emulation Floating Point Execution

In order to validate the designed HW and SW components, we run synthetic FP benchmarks (synthetic µkernels) in multiple scenarios, varying the interconnection fabric and the CPU-FPU synchronization and notification protocols. First, to test the performance, a mono-processor architecture has been used, with only one Cortex-M1 processor connected to our SP FPU accelerator through an AHB multi-layer (ML).


Figure 5.11 reports the performance of our Cortex-M1 SP FPU accelerator, measured in clock cycles, for the different CPU-FPU notification protocols5.

[Bar chart: clock cycles per operation (FADDS, FSUBS, FMULS, FDIVS, FSQRTS) for SW FPU emulation and for the FPU with NOPs, polling, IRQs, and NoC QoS + NOPs notification schemes.]

Figure 5.11: Cortex-M1 SP FPU accelerator execution times of various floating point operations using different CPU-FPU notification methods vs. SW emulation

When NOPs are used, we achieve the maximum performance, i.e. between 15-30 clock cycles to compute an FP operation, since the Cortex-M1 and the FPU are ideally synchronized. This is because we insert exactly the number of NOPs that matches the latency of the single precision core (see Table 5.1 on page 118). Similarly, when a polling CPU-FPU notification protocol is used, the performance is slightly worse than with NOPs. This is because polling is penalized by its high bus usage, and by the de-synchronization between the time when the result is ready in the FPU and the time when the CPU becomes aware of it. However, this notification protocol performs close to NOPs, and the latency of the hardware core can be changed independently of the software library. Finally, if interrupts are used, even though the Cortex-M1 is optimized to handle them, the notification protocol spends almost double the clock cycles of NOPs or polling. We explain this effect by the context switching and the IRQ handler routines, which take around 30 cycles.
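To make the NOP-based and polling protocols concrete, the sketch below shows how a software library might drive a memory-mapped FPU from C. The register offsets, opcode value and latency constant are hypothetical placeholders used only to illustrate the mechanism; they do not reflect the actual register map of our SP FPU accelerator (the IRQ variant is omitted, since it only replaces the wait with an interrupt handler).

```c
#include <stdint.h>

/* Hypothetical register map of a memory-mapped SP FPU (illustration only). */
#define FPU_BASE    0x40000000u
#define FPU_OPA     (*(volatile uint32_t *)(FPU_BASE + 0x00))
#define FPU_OPB     (*(volatile uint32_t *)(FPU_BASE + 0x04))
#define FPU_OPCODE  (*(volatile uint32_t *)(FPU_BASE + 0x08))
#define FPU_RESULT  (*(volatile uint32_t *)(FPU_BASE + 0x0C))
#define FPU_STATUS  (*(volatile uint32_t *)(FPU_BASE + 0x10))

#define OP_FADDS        0x1   /* assumed opcode for single precision add   */
#define FADDS_LATENCY   4     /* assumed latency of the SP core, in cycles */

/* NOP-based notification: pad with exactly the known latency of the core. */
static inline uint32_t fadds_nops(uint32_t a, uint32_t b)
{
    FPU_OPA    = a;
    FPU_OPB    = b;
    FPU_OPCODE = OP_FADDS;                 /* trigger the operation         */
    for (int i = 0; i < FADDS_LATENCY; i++)
        __asm__ volatile ("nop");          /* wait exactly the core latency */
    return FPU_RESULT;
}

/* Polling notification: spin on a ready flag; each poll is an extra AHB read. */
static inline uint32_t fadds_poll(uint32_t a, uint32_t b)
{
    FPU_OPA    = a;
    FPU_OPB    = b;
    FPU_OPCODE = OP_FADDS;
    while ((FPU_STATUS & 0x1u) == 0)
        ;                                  /* busy-wait until the result is ready */
    return FPU_RESULT;
}
```

The trade-off discussed above is visible directly in the code: the NOP variant couples the library to the core latency, whereas the polling variant keeps them independent at the cost of extra bus traffic.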

5The benchmarks, as well as the implemented hardware-dependent software library, have been executed from the tightly-coupled memories, which load one instruction every cycle.


However, in contrast to polling notification, no traffic is added on the communication backbone, resulting in a reduced bus utilization. We extended our exploration to an equivalent NoC-based MPSoC architecture, using a NoC backbone equivalent to the AHB multi-layer presented in Figure 5.10 on page 124. In these experiments, we fix the CPU-FPU notification protocol to NOPs. The proposed scheme, using NoC transactions and QoS features, exhibits a performance level close to the one obtained with the AHB ML. Basically, it takes a few more cycles than plain NOPs due to the overhead of opening/closing the channel to get exclusive access to our Cortex-M1 SP FPU block. To validate the effectiveness of our approach, we compare our hardware-assisted framework, computing FP operations on our SP FPU accelerator, against a purely software emulation. In Figure 5.11 on the preceding page, it is very clear that while the FPU accelerator takes tens of cycles, the software emulation consumes hundreds of cycles. Additionally, in Figure 5.12, we show the exact speedup of each implemented FP operation. Depending on the CPU-FPU protocol and the FP operation, the speedups range from ∼3x to ∼45x compared to a purely software emulation.

[Bar chart: per-operation HW-vs-SW speedups, from ∼2.8x up to ∼43.3x over the SW emulation baseline (1.0x), for the five FP operations under the different notification schemes.]

Figure 5.12: Cortex-M1 SP FPU accelerator speedup vs. the SW emulation FP library for different floating point operations

Second, to test the shareability and the resulting performance when executing FP operations, we vary the number of processors that access the FPU in the Cortex-M1 MPSoC cluster architecture. Scalability results are shown in Figure 5.13 on the facing page, in terms of the execution time of a synthetic stress test running one of the 5 implemented FP operations.


Similarly, in Figure 5.14, we report results in terms of speedup, varying the number of Cortex-M1 CPUs from 1 to 4 (using only half of the Cortex-M1 MPSoC cluster) against execution with software emulation. The plot shows that, as the traffic on the interconnection fabric increases, access congestion appears (through semaphore synchronization, or through channel blocking when the NoC backbone is used) and completion latencies grow; as a consequence, the resulting speedups decrease smoothly for each CPU-FPU protocol.

[Bar chart: execution time in clock cycles for 1 to 4 cores, comparing SW FPU emulation with the FPU using NOPs, polling, IRQs, and NoC QoS + NOPs notification.]

Figure 5.13: Cortex-M1 SP FPU accelerator execution times using different CPU-FPU notification protocols


Figure 5.14: HW FPU accelerator speedup vs. SW emulation on different Cortex-M1 MPSoC configurations

Experimental results show that, in the presence of concurrent tasks running floating point instructions on multiple processors, the shareability of the SP FPU has a bound.


As shown in Figure 5.13 and Figure 5.14 on the preceding page, under our FP stress-test workloads the sharing ratio is 1:4, i.e. each bank of 4 processors uses 1 FPU accelerator; beyond this ratio software emulation performs better, so it does not make sense to share a single FPU among more cores, nor to integrate more than one FPU per bank in this MPSoC architecture. As a consequence, the proposed Cortex-M1 MPSoC cluster architecture delivers between 1.1-3.5 MFLOPS at only 50 MHz, depending on the CPU-FPU notification protocol used. Of course, all the results shown in the previous plots are relatively slow when we compare our SP FPU accelerator against a tightly-coupled FPU. This is mainly due to the fact that our design does not support pipelined FP instructions, since the FPU accelerator is memory-mapped on the bus and we use a simple register bank. As a consequence, early estimations show that our SP FPU accelerator is ∼5x slower than a tightly-coupled one.
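A rough sanity check of the MFLOPS figure above (an estimate only, assuming one floating point operation completes every 15 to 45 cycles depending on the notification protocol, and ignoring arbitration and contention effects) is

\[
\text{MFLOPS} \approx \frac{f_{clk}}{\text{cycles per FP operation}}, \qquad
\frac{50\,\text{MHz}}{45\ \text{cycles}} \approx 1.1, \qquad
\frac{50\,\text{MHz}}{15\ \text{cycles}} \approx 3.3,
\]

which is roughly consistent with the measured 1.1-3.5 MFLOPS range.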

5.7 Conclusion and Future Work

This chapter describes the complete design of an AMBA 2.0 AHB standalone memory-mapped SP FPU, and the hardware-dependent software library used to accelerate single precision floating point functions. The library can be used to speed up floating point computations. The HW-SW framework enables high-performance computing on embedded mono-processor or multi-core ARM Cortex-M1 cluster architectures on FPGAs, or on any other processor or system that uses the AMBA 2.0 AHB standard. The presented approach can be applied to many widely used digital multimedia and signal processing algorithms and kernels to execute FP instructions in hardware whenever required. Our SP FPU accelerator speeds up FP operations to achieve high-performance embedded computing on mono-processor or multi-core ARM Cortex-M1 cluster architectures, using a moderate amount of hardware resources (2890 LUTs and a few DSP blocks on FPGA devices).

In addition, the methodology for modifying software applications does not require any change to the ARM compiler nor, by extension, to the ARM-v6M architecture, so it is completely transparent to the software designer. On the other side, we also explored different HW-SW interfaces on the execution platform (see the abstract view presented in Figure 5.15 on the next page) in order to obtain an efficient execution of hardware-assisted floating point instructions using our hardware-dependent software routines.

Thus, as shown in the experimental results, depending on the system requirements, different CPU-FPU notification protocols can be used with our FPU accelerator. The outcome of our exploration shows that NOPs and polling outperform the IRQ-based CPU-FPU notification protocol because of the overhead added by the context switch and the IRQ handler. However, under heavy traffic conditions, when other peripherals need to use the communication backbone, IRQs can perform better than polling, since they do not inject any traffic into the interconnect architecture.

[Layered view: application software on top of platform APIs (communication and resource management, hardware-dependent software, OS, low-level software, drivers), the HW/SW interfaces, and the hardware execution platform (CPU subsystems, DSPs/MCUs/VLIWs, memories, IP cores and the system interconnect / Network-on-Chip).]

Figure 5.15: HW-SW interface co-design on embedded SoC architectures (from [43])

Sharing one FPU among a few processors has been demonstrated to be a feasible alternative when low-cost constraints are imposed and, of course, when the application does not access the FPU accelerator very frequently and simultaneously from several cores. In our experiments, we performed several FPU stress tests in our Cortex-M1 cluster-on-chip, and we found that the bound for achieving better execution times than the equivalent software library is about 1:4, i.e. 1 FPU accelerator for every 4 ARM Cortex-M1 processors.

As future work, the first engineering task is to extend the software library to cover the remaining single precision FP instructions, which are already implemented in the designed AMBA AHB single precision FPU accelerator. Second, since the designed FPU targets the Cortex-M1 on FPGAs, it would be convenient to tweak the RTL code to map the design properly onto the FPGA architecture, improving the area cost and the circuit performance (i.e. fmax) of the SP core by properly using embedded shifters, DSP blocks, etc. Finally, using the presented HW-SW design approach, it would be useful to design an equivalent double precision FPU accelerator to speed up operations on C/C++ doubles.

In terms of research, some extensions and improvements of the register bank can be made, including an accumulator register bypassed from the result of the last operation in order to perform single and double compound FP operations, such as fused multiply-add/subtract (FMAC, FNMAC, ...). Additionally, to enable FP instruction pipelining (as is done when the FPU is tightly-coupled with the processor) and further improve the performance of our standalone memory-mapped FPU, the register bank could be extended with NextOperand A and NextOperand B registers. Another potential improvement is to decouple the pipeline architecture of the FPU in order to execute additions/subtractions simultaneously with multiplications/divisions/conversions/square roots, since both pipelines are independent. This would imply deep changes in the front-end of the FPU, duplicating the register bank for each pipeline and re-designing the policy to block/unblock the access, especially when more than one processor intends to execute floating point operations on it.

CHAPTER 6

Parallel Programming Models on NoC-based MPSoCs

This chapter1 introduces parallel programming models focusing on MPSoC architectures. Next, we present our message passing software stack and API library (i.e. ocMPI/j2eMPI), and its efficient interaction with Network Interfaces (NIs) to improve inter-process communication through message passing. This chapter also contributes the mapping of the QoS features presented in Chapter 4 on page 85 onto a custom OpenMP runtime library targeted to NoC-based MPSoCs, or even onto the parallel programming model at application level.

6.1 A General Overview of Parallelism

There is an on-going need to solve bigger and more realistic problems, and this requires improved computing performance. This increase can come from faster processing power, from specialized computers (where some extra capability is included in the computer hardware), or from writing parallel code. Two levels of abstraction exist in parallelism. Fine-grain parallelism refers to parallelism at the operand or data level (e.g. at the pixel level, matrix rows and columns, etc.), where non-dependent parts of the data can be processed concurrently, whereas coarse-grain (high-level) parallelism refers to entire routines being executed concurrently (e.g. frames, DCTs, FFTs, etc.). An important feature of a parallel algorithm is its scalability, i.e. the level of performance is maintained as the number of processors and the problem size increase.

1The author would like to acknowledge Andrea Marongiu and Prof. Luca Benini for their contribution to the OpenMP framework for embedded systems.

Another challenge is how the application is broken down into individual parts or tasks, how these tasks are mapped onto processors, and which cost is inherent in their communication. In addition, in most cases it is mandatory to use load-balancing techniques, and to study the locality of each task as well as of the data (i.e. maximizing the reuse of data in nearby locations by minimizing the number of accesses to data in far locations) and their relationships, in order to obtain better performance by minimizing communication and overlapping computation with communication. These considerations ensure that all the processors in the resulting parallel system work concurrently, rather than some processors being idle, and they let closely related tasks collaborate without a large communication overhead.

Having introduced the basics of parallelism, we next detail the existing kinds of parallelism, together with several classifications that have been proposed to structure them into logical categories, for example according to the architecture memory model, the processor architecture, or other recently proposed levels of parallelism.

The first traditional classification is Flynn's taxonomy [195]. It categorizes the various architectures of parallel computers, although it has now been extended to distinguish between shared memory and distributed memory systems:

• Single Instruction, Single Data (SISD): consists of a single CPU executing an instruction stream serially.

• Single Instruction, Multiple Data (SIMD): these machines often have many processors that execute the same instruction all the time, but on different data; they operate in lock-step mode. A subclass of SIMD machines are vector processors, which operate simultaneously on an array of data. This is fine-grained parallelism, with little inter-processor communication required (e.g. VLIW, DSP).

• Multiple Instructions, Single Data (MISD): as yet, no machine with this classification has been developed.

• Multiple Instructions, Multiple Data (MIMD): these machines operate in parallel on different data, using different instructions. This is a wide class, and it is now often broken down into either shared-memory or distributed-memory systems.

Nowadays, Single Program Multiple Data (SPMD) and Multiple Program Multiple Data (MPMD) are both widespread.

Although MPMD is gaining popularity, SPMD is the most used model. In this model, multiple autonomous processors simultaneously execute the same program at independent points, rather than in the lock-step that SIMD imposes on different data. With SPMD, tasks can be executed on general purpose CPUs, whereas SIMD requires vector processors to manipulate data streams. SPMD usually refers to message passing programming on distributed memory computer architectures. The first SPMD standard was PVM [196], but the current de-facto standard is MPI [142]. With SPMD on a shared memory machine (a computer with several CPUs that access the same memory space), messages can be sent by depositing their contents in a shared memory area. This is often the most efficient way to program shared memory computers with a large number of processors, especially on NUMA machines, where memory is local to processors and accessing the memory of another processor takes longer.

According to the previously presented memory models, parallel architectures can be classified in two categories (see Table 6.1 on the next page):

• UMA (Uniform Memory Access): in this architecture, the memory access time is independent of which processor makes the request or which memory contains the requested data. Usually, this architecture is used in SMPs, which normally include a global memory shared by all processors.

• NUMA (Non-Uniform Memory Access): this architecture is used in multiprocessors where the memory access time depends on the memory location relative to a specific processor. An example of a shared-memory NUMA is the cache-coherent NUMA (ccNUMA) architecture. Under this architecture, any processor can access its own local memory faster than non-local memory, which in most cases is shared by all processors. In shared-memory systems, we must synchronize the access to shared data, for example using semaphores. Opposite to NUMA with shared memory are the distributed memory systems. These systems are also called massive parallel processing (MPP) systems because many processors are used to work together in a distributed way. In these systems each processor has its own local memory, and all processors are connected through a network (which needs to be well designed to minimise communication costs). Therefore, these systems need to use message-passing techniques to share and communicate data among them.

The main disadvantage of UMA architectures against NUMA is scalability: under UMA architectures, when the number of processors increases beyond a threshold, the system does not scale well.


Memory model         UMA        NUMA
Shared memory        SMP        ccNUMA (cache-coherent NUMA)
Distributed memory   Not used   MPP (Massive Parallel Processing)

Table 6.1: Parallel architectures according to memory models

Another important aspect is that both SIMD and MIMD can use either shared or distributed memory models. Apart from this traditional classification, more types of parallelism have emerged in research. One is Instruction Level Parallelism (ILP), which is a measure of how many operations in a computer program can be performed simultaneously. Techniques that exploit this type of parallelism are related to the processor micro-architecture; the compiler and the processor are the basic pieces to obtain as much parallelism as possible. The most important techniques are:

• Instruction pipelining: the execution of multiple instructions can be partially overlapped.

• Superscalar execution: multiple execution units are used to execute multiple instructions in parallel. In superscalar processors, the instructions that are executed simultaneously are adjacent in the original program order.

• Out-of-order execution: instructions can be executed in any order that does not violate data dependencies. Out-of-order execution is orthogonal to pipelining and superscalar execution, since it can be used with or without these other two techniques.

• Register renaming: a technique used to avoid unnecessary serialization of program operations imposed by the reuse of registers. This technique is usually used with out-of-order execution.

• Speculative execution: allows the execution of complete instructions, or parts of instructions, before it is certain whether that execution should take place. A commonly used form of speculative execution is applied to control flow instructions (e.g. a branch): instructions beyond the target of the control flow instruction are fetched and executed based on a prediction of which branch the execution will follow. Several other forms of speculative execution have been proposed and are in use, including those driven by value prediction, memory dependence prediction and cache latency prediction.

• Branch prediction: used to avoid stalling for control dependencies to be resolved. Branch prediction is used with speculative execution.

Another important type of parallelism is Thread Level Parallelism (TLP). This parallelism matters because multithreaded processors are emerging in the design of generic multiprocessors and of MPSoCs in particular; an example is the CMT Niagara processor [197–200]. TLP is based on the parallelism inherent to an application that runs multiple threads at once. Thus, while one thread is delayed, for example waiting for a memory access, other threads can do useful work. The exploitation of coarse-grain TLP, inherent to many multimedia algorithms, has not been investigated in depth up to now because the uniprocessor model has dominated the research community.

6.2 Traditional Parallel Programming Models

The traditional parallel architectures explained in the previous section (shared memory or distributed memory systems) imply the use of different parallel programming models.

• Message passing: a message passing scheme is required to perform data communication and synchronization among all processors through the communication infrastructure, which should be very efficient to ensure low-latency inter-processor communication, synchronization and task partitioning.

• Shared memory: in shared memory systems it is necessary to ensure coherence, synchronization and memory ordering. The cache coherence model is applied, i.e. whenever one cache is updated with information that may be used by other processors, all the changes must be reflected to the other processors in order to keep the data coherent. Sometimes this model can become a performance bottleneck because of the overhead of exchanging the information related to memory coherence (e.g. snooping, write-through, write-back).

Therefore, several software runtime environments and standard software APIs have been proposed according to these well-known programming models. The most significant are Parallel Virtual Machine (PVM) [196, 201] and Message Passing Interface (MPI) [142, 202–204] for message passing on distributed-memory systems, and OpenMP [205] for shared-memory systems.

The target is to provide software developers with libraries to express concurrency and parallelism in many different ways, on a great variety of parallel platforms, and to perform efficient inter-process communication between all processes, cores and threads. Both parallel programming models and API libraries abstract the hardware and software components and let the designer execute concurrent or parallel applications on a multiprocessor system. They offer two general ways to express parallelism:

(i) Implicit parallelism: a compiler or interpreter exploits the parallelism inherent to the computations expressed by some of the language's constructs. It is very easy to program, but sometimes sub-optimal results are achieved because the programmer does not have complete control over the parallelism, and significant hardware support (i.e. cache coherence) must be added to keep the parallel execution consistent.

(ii) Explicit parallelism: the programmer has control over the parallel execution, so a highly skilled parallel programmer can take advantage of it to produce very efficient parallel code. However, it is difficult for unskilled programmers, because many special-purpose directives or functions, mostly for synchronization, communication or task partitioning, must be called explicitly. This problem can be mitigated using optimizing compilers or communication assistants.

Often, OpenMP is associated with implicit parallelism, since the inclusion of #pragma compiler directives lets programmers express parallelism while keeping a single-threaded view (a minimal code sketch is given at the end of this section). On the other side, MPI/PVM are related to explicit parallelism, since software developers must explicitly call the communication and synchronization routines to express parallelism. Recently, a novel parallel programming model based on database transaction processing has been proposed in the form of Transactional Memory (TM) [206–209]. Transactional memory is intended as a mechanism for implementing language-level concurrency control features to speed up sequential code transparently from the user point of view, by exploiting thread-based programming [210] and the inherent parallelism of multithreaded architectures on top of a cache coherence model.

It simplifies parallel programming by guaranteeing that transactions appear to execute serially (i.e. using a consistency model to ensure commit order), atomically (i.e. all or nothing: a transaction never appears to be interleaved with another transaction) and in isolation (i.e. writes are not visible until the transaction commits). Thus, a transaction either commits, making all its changes visible to the other processes, or aborts, discarding all its changes through a roll-back process. This parallel programming model is currently under research. Nevertheless, the feasibility of this approach is unclear due to the large traffic overhead generated at the HW/SW level (i.e. snooping and the roll-back process using validate, commit or abort transactions) in order to serialize the multiple threads of the different cores.
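As a minimal illustration of the implicit style discussed above (standard OpenMP; the function and its data are arbitrary examples, not code from our framework), a single directive is enough to parallelize a loop while the programmer keeps a single-threaded view of the code:

```c
#include <math.h>

/* Implicit parallelism: the #pragma lets the compiler/runtime split the
 * loop iterations among threads; no explicit communication is written. */
void scale(float *v, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        v[i] = sqrtf(v[i]);
}
```

The explicit message passing counterpart must partition the data and call send/receive or collective routines by hand, as shown with the MPI examples in Section 6.5.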

6.3 Motivation and Key Challenges

Moore's law is still alive, but technology clock frequency scaling has been replaced by scaling the number of cores per chip, as reported by the ITRS [1]. In addition, single-core performance has slowed along with the power budget in all systems, traditional multiprocessors and MPSoCs alike. Nowadays each core operates at 1/3 to 1/10th of the efficiency of the largest chip, but hundreds of very simple cores can be packed onto a chip while consuming less power. Thus, in today's MPSoCs, in conjunction with the extremely low-cost interconnection infrastructure previously discussed in this dissertation, parallel programming models are needed to fully exploit the hardware capabilities and to reach hundreds of GOPS/W of energy efficiency.

The mapping of these abstract parallel programming models onto tightly power-constrained hardware architectures imposes several overheads which might seriously compromise performance and energy efficiency. Thus, an important key challenge in MPSoC research is to provide well-known, customized parallel programming models and libraries that boost applications by exploiting all the hardware features present in the platform, while taking into account the inherent memory constraints of NoC-based MPSoC architectures. Using parallel programming models hides the hardware complexity from the programmer's point of view, facilitating the development, reusability and portability of parallel applications.

The goal of executing parallel applications is to reduce the wall clock time (speed-up), saving time in the resolution of large problems by exploiting the HW features present in the parallel platform. A key challenge is how the problem is mapped onto the hardware and broken into concurrent streams of instructions on the multiprocessor platform. Therefore, the software APIs or the compiler directives associated with the parallel programming model should offer facilities to split data, functions or high-level tasks effortlessly in many different ways.


In fact, using the parallel programming models presented before, many different levels and mechanisms of parallelism can be exploited. In Figure 6.1, we show the most common ones.

[(a) Levels of parallelism: large grain (task level), medium grain (control level), fine grain (data level) and very fine grain (multiple issue); (b) master-slave and pipelined/signal-processing parallelization schemes.]

Figure 6.1: Multiple levels and ways of parallelism

Usually, MPI is related to parallelism at the large/medium grain task level, exchanging messages (see Figure 6.1(a)), especially in computation-intensive applications with large data sets. However, it can also be used to enable medium and often fine grain parallelism at the function and data level, as well as in synchronization-intensive applications with small data sets. On the other hand, OpenMP is very useful for low computation/communication ratios, emphasizing shared memory efficiency; it is normally used from task level parallelism down to data level parallelism. Finally, very fine grain parallelism can be achieved at the operator and instruction level using custom VLIW processors or DSPs, but this is out of the scope of this dissertation, which focuses on the use of parallel programming models.

Both MPI and OpenMP define multiple ways in which a parallel workload is assigned to the parallel system to be performed simultaneously by different processor cores. In Figure 6.1(b), we show master-slave (i.e. broadcast-reduce, scatter-gather, etc.) and pipelined or signal processing schemes in which a workload is parallelized into different computation subtasks.

However, while MPI suits both schemes perfectly, in OpenMP it is not straightforward to implement an efficient pipeline due to memory contention. The use of parallel programming models must be supported by robust software stacks and efficient HW-SW interfaces to avoid large inter-process communication and synchronization overheads. Thus, performance is now a software concern, so the compiler directives or API libraries which abstract the hardware and software components must be designed accurately. In addition, responsibility for performance is shifted to programmers, so parallel programming models must expose features to deal with performance.

Finally, when developing or choosing an existing parallel programming model, many considerations must be taken into account:

(i) Parallel execution model: implicit or explicit communications using synchronization and/or memory consistency methods.

(ii) Productivity: related to the language or APIs used to express parallelism. If a new language is provided, developers must learn it. On the other side, the use of an existing language does not require the re-education of developers. A hybrid approach is to extend an existing language (limited by the constraints of the base language) or to provide directives for it (to achieve higher levels of abstraction) to match the platform requirements. In the best case, developers only need to learn very few new API constructs.

(iii) Portability: associated with the runtime system and its dependence on specific machine features.

(iv) Performance: latency management and static and dynamic load balancing are key features to be exposed to the programmer in order to enable efficient parallel programs.

(v) Debug tools: a parallel programming environment should build proprietary tools into the runtime library, provide logs to enable tracing and performance evaluation (e.g. Paraver [211], Vampir [212, 213]), or enable interoperability with standard tools (e.g. gdb, gprof) available on the platform, because debugging parallel programs is tough.

6.4 Related work

Due to the necessity of programming homogeneous and heterogeneous MPSoCs to increase performance/power figures, a large number of MPSoC-specific programming models and software APIs have recently been defined or ported, for stream processing, shared-memory or message-passing parallel programming models.


The Task Transaction Level interface (TTL) proposed in [214] focuses on stream processing applications in which concurrency and communication are explicit. The TTL API defines three abstraction levels: the vector read and vector write functions are typical system level functions in which synchronization and data transfers are combined, as in the reAcquireRoom and releaseData functions. In [215], a compiler and a novel high-level language named StreamIt are reported, providing syntax that improves programmer productivity for high-performance streaming applications.

In [216], the Multiflex multi-processor SoC programming environment is presented. It focuses on two programming models: a remote procedure call on a distributed system object component (DSOC) message passing model, and a symmetrical multi-processing (SMP) model using shared memory (close to the one provided by POSIX [217], i.e. thread creation, mutexes, condition variables, etc.). It is targeted at multimedia and networking applications, with the objective of having good performance even for small granularity tasks. DSOC uses a CORBA-like approach, but implements hardware accelerators to optimize performance. In [218], a Remote Method Invocation (RMI) model is explored to communicate distributed heterogeneous NoC-based systems.

The transactional memory model has also been implemented in MPSoCs. In [219] a complete hardware transactional memory solution is reported for an embedded multi-core architecture using MPARM [54], consisting of a cache-coherent ARM-based cluster on a shared-memory system similar to ARM's MPCore [220], while in [221] the idea of how to design a lightweight transactional memory system is presented.

The Eclipse NoC architecture is presented in [222] in conjunction with a sophisticated programming model, realized through multithreaded processors, interleaved memory modules, and a high-capacity NoC. Eclipse provides an easy-to-program, exclusive-read, exclusive-write (EREW) parallel random-access machine (PRAM) programming model with many physical threads to profit from TLP.

Recently, a large body of research effort has been devoted to fitting shared-memory and message passing parallel programming models onto MPSoC architectures. Thanks to their portability, both the OpenMP and MPI APIs can be adapted effortlessly, keeping the interface invariant while being improved by custom hardware features.


In [223] the use of OpenMP targeted to shared-memory MPSoCs is reported, without using any operating system or OpenMP extensions, but improving the low-level synchronization directives. In [224] OpenMP is ported on top of the Cell Broadband Engine™ processor [25]. This work presents thread and synchronization support, code generation and the memory model on Cell. The compiler is novel in that it generates binaries that can be executed across multiple heterogeneous ISAs and multiple memory spaces. In [225] another OpenMP approach is presented, targeted at heterogeneous MPSoCs which include RISC-like processors and DSPs, using OpenMP extensions. Recently, in [183] the authors also explore extensions and heterogeneity by optimizing the OpenMP implementation and the runtime support for non-cache-coherent distributed memory MPSoCs. This work uses extensions to perform profile-based efficient data allocation with an explicit management of scratchpad memories.

In addition, together with all these efforts, [226, 227] explain the need to abstract HW/SW interfaces during the high level specification of software applications because of the heterogeneity of emerging MPSoCs. From this HW/SW interface point of view, in [228, 229] a video encoder (openDIVX, MPEG4) is parallelized using MPI on an MPSoC virtual platform.

In [230, 231] a message passing implementation based on producer-consumer communication is presented, taking into consideration the communication overhead and synchronization, but it does not use the standard MPI API, so the application code is not portable. Despite this fact, a large body of work has been presented on the design of MPI on top of MPSoCs. In [232, 233] an MPI-like microtask communication is presented upon the Cell Broadband Engine™ processor [25], and on a small cluster-on-chip based on MicroBlaze and FSL channels, respectively. Nevertheless, both approaches have architectural scalability problems, since the maximum number of processing elements is eight on a Cell, and the maximum number of FSL channels is also eight in the MicroBlaze MPSoC architecture.

In [234] a SoC-MPI library is implemented on Xilinx Virtex FPGAs, exploring different mappings upon several generic topologies. In [235] a custom MPI called rMPI is reported, targeted at embedded systems using the MIT Raw processor [236] and its on-chip network. In this work, a traditional MPI implementation is compared with rMPI, the version ported to embedded systems. Another interesting work, presented in [237], focuses on parallel programming of multi-core multi-FPGA systems based on message passing. This work presents TMD-MPI as a subset of MPI, and the definition and extension of the packet format to communicate systems in different FPGAs, and intra-FPGA using a simple NoC architecture.


Some of the MPI-based implementations presented before [230, 232–235, 237] can be executed standalone, without relying on an existing operating system. Furthermore, each MPI library is implemented to have a very small footprint due to the memory constraints of embedded and on-chip systems. In contrast, a lightweight MPI for embedded heterogeneous systems (LMPI) is reported in [238] relying on the existence of an OS; however, this approach is more focused on a custom implementation of MPI for a traditional cluster of workstations.

Very recently, in [239, 240] adaptive task migration over message passing is explored using the HS-Scale MPSoC framework [241]. In this work, a homogeneous NoC-based MPSoC is developed for exploring scalable and adaptive on-line continuous mapping techniques using MPI. Each processor of this system is compact and runs a tiny preemptive operating system that monitors various metrics and is entitled to take remapping decisions through code migration techniques.

In addition, outside academic research, the Multicore Association proposed MCAPI [242]. This API is intended to unify all previous work. It is targeted at multicore systems based on message passing and shared memory, but it also includes APIs to support load balancing, power and task management, as well as QoS and debug facilities.

In this dissertation, we focus on embedding MPI towards NoC-based systems. We believe that MPI is a good candidate to be the parallel programming model mapped upon highly parallel and scalable NoC-based MPSoCs. The main reasons are:

• The inherently distributed and scalable nature of the communication backbone, and the heterogeneity of the computational elements in non-cache-coherent NoC-based MPSoCs, make message passing a priori the best model.

• The MPI-based communication model fits perfectly on highly scalable distributed multiprocessor systems with many cores interconnected by an on-chip network backbone, even on a single chip. A priori, shared-memory architectures using the OpenMP model are poorly scalable because of the limited bandwidth of the memory.

• The end-to-end low-latency interconnection of the elements allows fast inter-process communication using message passing (OpenMP often suffers due to the overhead produced by cache-coherence protocols).

• Explicit parallel programming is often hard but very useful for writing efficient custom programs, since NoC-based MPSoCs are often application-specific. To mitigate this, communication assistants and automatic parallel code generators or compilers can be used.

• The portability and extensibility of the MPI API make it easy to tailor to NoC-based MPSoCs.

• It is a very well-known API and parallel programming model, and many debug and trace tools are readily available.

• Parallel source code previously used in HPC can be reused in embedded scenarios. This avoids replicating the effort involved in developing and optimizing code for a specific parallel template.

• It hides hardware complexity (the NoC design space is huge), increasing programmability from the programmer's point of view. Nevertheless, extensions can be included to enable low-level performance features (e.g. load balancing, latency management, power management, etc.).

Furthermore, message passing has proven to be a successful parallel programming model for distributed memory machines, such as grid infrastructures, supercomputers and clusters of workstations; now, due to the advantages of low-latency inter-processor communication and thanks to the inherently distributed nature of NoC-based MPSoCs, the author believes it is the most suitable well-known parallel programming model.

Our contribution is the formal definition of how to embed standard MPI targeted to distributed-memory NoC-based MPSoCs, taking into account all hardware and software issues. Thus, in this work we propose an efficient and lightweight implementation of an on-chip message passing interface, called on-chip MPI (ocMPI). This implementation does not rely, as some previous work does, on any OS [238, 243], but it includes very efficient HW-SW interfaces using specific tightly-coupled NIs, a feature not present in any previous work. The outcome is a high-performance, low-latency injection rate of message passing packets/messages into the NoC-based system.

Even if in this chapter we mainly focus on message passing, we also contributed by adding the QoS features presented in Chapter 4 on page 85 on top of an existing OpenMP framework specially targeted to MPSoCs. Therefore, applications and/or the programming environment chosen for their development/execution should have some degree of control over the available NoC services.

In this way, the application programmer or the software execution support layer (either an OS, a custom support middleware or a runtime environment) can exploit the HW resources so as to meet critical requirements of the application and, more generally, speed up threads and execution while fitting a limited power budget. In both scenarios and parallel programming models, message-passing or shared-memory NoC-based MPSoCs, the hardware architectures and software libraries have been tuned to obtain high-performance inter-processor communication. This is not explicitly present in any previous work.

6.5 Embedding MPI for NoC-based MPSoCs

This section describes all the features and key points faced during the adaptation of the standard Message Passing Interface (MPI) towards on-chip architectures. Thus, we present the development of two software stacks based on MPI [202] to be used as the core of the message passing parallel programming model upon different NoC-based MPSoC architectures:

• on-chip Message Passing Interface (ocMPI): fully developed in ANSI C (to avoid large memory footprints and to achieve maximum runtime performance), which can be compiled for many real MPSoC architectures integrating processors (e.g. ARM, Nios II, MicroBlaze, PowerPC, DSPs, etc.) that support a gcc-like tool chain.

• java Embedded Message Passing Interface (j2eMPI): written completely in Java in order to be the parallel programming model on top of NoCMaker, our open source NoC-based MPSoC cycle accurate simulator [51, 244] presented in Section 2.4 on page 36.

Figure 6.2: On-chip message passing interface libraries


Both software stacks and APIs are completely equivalent, with the particularity that ocMPI is designed to be used on memory-constrained NoC-based MPSoCs in real prototyping platforms, whereas j2eMPI has been implemented to run on top of NoCMaker. The advantage of having both message passing APIs (i.e. ocMPI and j2eMPI) for NoC-based MPSoCs is that we can run simulations and applications on NoCMaker, but also execute them quickly on a real prototyping platform. Finally, by comparing the results of both executions, we can tune and increase the accuracy of our simulator according to the information extracted from the real platform.

6.5.1 Message Passing Interface (MPI) – General Concepts

MPI is a portable [245] C/C++ standard software API to create and communicate parallel distributed-memory applications. The MPI-2 software API contains over 125 routines to perform distributed parallelism, but programs are essentially based on two basic routines to perform peer-to-peer communication: MPI_Send() and MPI_Recv(). MPI-2 is the second version of MPI; it extends the functionality of MPI-1 by adding parallel I/O support, adding C++ bindings and being more portable to different platforms.

As shown in Figure 6.3, MPI is based on the communicator concept, which is an ordered set of processes, indexed by rank, that remains constant from its creation until its destruction. Inside it, N processes can live, identified by a rank in the range 0 to N-1; the whole set of MPI processes forms the MPI_COMM_WORLD communicator. In addition, MPI users are able to define their own communicators, which are subsets of MPI_COMM_WORLD.

[Diagram: seven processes with ranks 0-6 inside MPI_COMM_WORLD; a process is identified by (group, rank) and a message by (context, tag).]

Figure 6.3: Logical view of communicators and messages in MPI

In MPI, a process involved in a communication operation is identified by its group and its rank within the group, whereas messages may be considered labeled by communication context and by a message tag within that context. Thus, through the MPI_Comm_size() and MPI_Comm_rank() functions, we are able to answer two important questions in parallelism: how many processes are involved? and who am I?, respectively.

MPI contains a large set of point-to-point communication routines between two processes. Several send modes are available (a minimal blocking send/receive sketch follows the list below):

• Blocking mode – MPI_Send(): the function does not complete until the whole message has been sent and received using MPI_Recv(). In this mode, completion depends on the message size and the amount of system buffering.

• Synchronous mode – MPI_Ssend(): the sender notifies the receiver; after the matching receive is posted, the receiver acknowledges back and the sender sends the message, so the function does not complete until a matching receive has begun (e.g. like a fax). Unsafe programs become incorrect and usually deadlock within MPI_Ssend().

• Buffered (asynchronous) mode – MPI_Bsend(): the user supplies a buffer to the system for its use. If there is not enough space, the message is saved in a temporary buffer until someone retrieves it (e.g. like e-mail). In other words, the sender notifies the receiver and the message is buffered either on the sender side or on the receiver side, according to its size, until a matching receive forces a network transfer or a local copy, respectively. Users must supply enough memory to make an unsafe program safe.

• Ready mode – MPI_Rsend(): similar to the synchronous mode, but without acknowledging that there is a matching receive at the other end. It allows access to faster protocols, but has undefined behaviour if the matching receive is not present.
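The following minimal program sketches the blocking mode listed above using standard MPI calls (the message size and tag are arbitrary example values): rank 0 sends an integer array to rank 1, which posts the matching receive.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, data[16] = { 0 };

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* who am I?           */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes? */

    if (rank == 0 && size > 1) {
        /* blocking send: completion depends on message size and buffering */
        MPI_Send(data, 16, MPI_INT, 1, /* tag */ 42, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Status status;
        /* matching blocking receive, identified by (source, tag, communicator) */
        MPI_Recv(data, 16, MPI_INT, 0, 42, MPI_COMM_WORLD, &status);
    }

    MPI_Finalize();
    return 0;
}
```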

Note that non-blocking variants of these send modes also exist: MPI_Isend(), MPI_Issend(), MPI_Irsend(), MPI_Ibsend(). It is important to remark that all these non-blocking functions return immediately, but they can also be waited on or queried. In addition, sends of any mode may be received with MPI_Recv(). MPI also includes a great variety of collective communication routines to communicate the processes in a group. Basically, these communication routines can be classified as:

• one-to-many (e.g. MPI_Bcast(), MPI_Scatter(), ...).

• many-to-one (e.g. MPI_Gather(), MPI_Reduce(), ...).

• many-to-many (e.g. MPI_Barrier(), MPI_Alltoall(), ...).


In Figure 6.4, we show the typical traffic patterns of these collective communication routines (a short usage example follows the figure).

[Diagram: data movement among processes P0-P3 for broadcast, scatter/gather, allgather, all-to-all, reduce and scan.]

Figure 6.4: Patterns of the MPI collective communication routines
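As an illustration of the one-to-many and many-to-one patterns in Figure 6.4, the fragment below (standard MPI; the array size is an arbitrary example value and must be divisible by the number of ranks) scatters slices of an array, lets every rank sum its slice, and reduces the partial sums on the root:

```c
#include <mpi.h>

#define N 1024                         /* example problem size */

long parallel_sum(int *full_array /* only significant on rank 0 */)
{
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int  chunk = N / size;
    int  slice[N];                     /* upper bound on the slice length */
    long local = 0, global = 0;

    /* one-to-many: distribute equal slices from the root */
    MPI_Scatter(full_array, chunk, MPI_INT,
                slice, chunk, MPI_INT, 0, MPI_COMM_WORLD);

    for (int i = 0; i < chunk; i++)
        local += slice[i];

    /* many-to-one: combine the partial sums on rank 0 */
    MPI_Reduce(&local, &global, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    return global;                     /* meaningful only on rank 0 */
}
```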

MPI-2 defines basic data types, but also the possibility to define custom data types, serving two purposes: handling heterogeneity between different processors, and working with non-contiguous data (e.g. structures or vectors with non-unit stride). Thus, the MPI data types can be classified as:

• Basic or elementary: language-defined data types (e.g. MPI_INT, MPI_DOUBLE, MPI_BYTE, MPI_UNSIGNED, ...).

• Vector/Hvector: arrays separated by a constant stride (e.g. an array of integers or chars). In Hvector the stride is specified in bytes.

• Indexed/Hindexed: arrays of indices (used for scatter and gather). In Hindexed the indices are specified in bytes.

• Struct: generally used for mixed types (e.g. heterogeneous structures in C).

Of course, MPI has more complex functionality than the one outlined above; for example, it contains routines for parallel I/O, creation of virtual topologies, etc. Nevertheless, in most cases when MPI is used to develop parallel programs, only a subset of the MPI primitives is required. In addition, the MPI features explained before are the most important ones from the architectural and communication point of view to make this chapter easy to understand.

6.5.2 MPI Adaptation Towards NoC-based MPSoCs

In this section, we explain the software design to create our message passing communication software stack and its associated API library for NoC-based systems. In this work, two main phases have been performed to port the standard MPI towards NoC-based systems by using a bottom-up approach. This is possible because MPI is a highly portable parallel programming model [245].

• We selected a minimal working subset of standard MPI functions. This task is needed because MPI contains more than one hundred functions, most of them not useful in NoC-based MPSoC scenarios (e.g. generation of virtual topologies).

• We performed an accurate porting process (see Figure 6.5) by modifying the lower layers of each selected standard MPI function in order to send/receive MPI packets by creating transactions through the NI to the NoC, and vice versa.

[Figure 6.5 maps the standard MPI stack to its on-chip counterpart: Standard MPI <-> ocMPI/j2eMPI; Transport layer (Operating System, TCP/IP) <-> Network Interface; Network layer <-> NoC switch; MAC layer <-> arbitration and flow control; PHY <-> NoC wiring.]

Figure 6.5: MPI adaptation for NoC-based systems

The on-chip MPI software stack presented in Figure 6.5 hides NoC-level features from the high-level parallel programming model. In other words, from the point of view of the software, the API library to perform message passing is completely independent from NoC architectural choices, such as topology, flit width, arbitration policy, switching communication scheme, etc.


6.5.3 Encapsulation and on-chip MPI Packet Format

Each ocMPI message is divided into two parts: a 20-byte header which contains the message passing protocol information, and a variable-length payload which contains the data to be sent. As shown in Figure 6.6, each ocMPI message has the following envelope: (i) source rank (4 bytes), (ii) destination rank (4 bytes), (iii) message tag (4 bytes), (iv) packet datatype (4 bytes), (v) payload length (4 bytes), and finally (vi) the payload data (a variable number of bytes).


Figure 6.6: ocMPI/j2eMPI message layout

Later, this high-level ocMPI message is split into packets/transactions or a stream of flits according to the channel width of the communication infrastructure, in our case a NoC.
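In C, this 20-byte envelope can be pictured roughly as the following packed structure (the field order follows the description above; the type and field names are assumptions, since the actual ocMPI source code is not reproduced here):

#include <stdint.h>

/* Hypothetical view of the fixed 20-byte ocMPI header; the variable-length
 * payload data follows it inside the same message. */
typedef struct {
    int32_t src_rank;      /* (i)   source rank              */
    int32_t dst_rank;      /* (ii)  destination rank         */
    int32_t tag;           /* (iii) message tag              */
    int32_t datatype;      /* (iv)  packet datatype          */
    int32_t payload_len;   /* (v)   payload length in bytes  */
} ocMPI_Header;            /* 5 x 4 bytes = 20 bytes in total */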

6.5.4 A Lightweight on-chip MPI Microtask Stack

Due to the on-chip memory constraints of MPSoC architectures, and in order to obtain an efficient and lightweight version of MPI on top of NoC-based systems, we started the porting process from scratch based on the source code of the Open MPI-2 initiative [204]. In addition, we carried out the porting process using an incremental bottom-up approach with several refinement phases, following the methodology presented in [243]. Mainly, in order to design our lightweight ocMPI we follow the next considerations:

• All MPI data structures have been simplified in order to get a reduced memory footprint. The main structures that have been changed are the communicator, the datatypes and the operations. Thus, we only create the default communicator, MPI_COMM_WORLD, and its associated group.

• No user-defined datatypes and operations are supported. Only the basic MPI datatypes and operations are defined (e.g. MPI_INT, MPI_SHORT, MPI_UNSIGNED_SHORT, etc.).


• No support for virtual topologies has been added, because application-specific or custom NoC designs can be modeled using the tools presented in [50, 51, 68, 124].

• No Fortran bindings and MPI I/O functions have been included.

• All ocMPI/j2eMPI functions follow the standardized definition and prototype of MPI 2.0 in order to preserve portability.

Table 6.2 shows the 20 standard MPI functions ported to our NoC-based MPSoC platform.

Types of MPI functions                Ported MPI functions
Management                            ocMPI_Init(), ocMPI_Finalize(), ocMPI_Initialized(), ocMPI_Finalized(), ocMPI_Comm_size(), ocMPI_Comm_rank(), ocMPI_Get_processor_name(), ocMPI_Get_version()
Profiling                             ocMPI_Wtick(), ocMPI_Wtime()
Point-to-point communication          ocMPI_Send(), ocMPI_Recv(), ocMPI_SendRecv()
Advanced & collective communication   ocMPI_Broadcast(), ocMPI_Barrier(), ocMPI_Gather(), ocMPI_Scatter(), ocMPI_Reduce(), ocMPI_Scan(), ocMPI_Exscan()

Table 6.2: Supported functions in our on-chip MPI libraries

This ocMPI implementation is completely layered: the advanced communication routines (such as ocMPI_Gather(), ocMPI_Scatter(), ocMPI_Bcast(), etc.) are implemented using the simple point-to-point routines ocMPI_Send() and ocMPI_Recv(). Furthermore, our ocMPI implementation does not depend on an OS; in consequence, there is no daemon, such as mpd, running to allow launching mpirun/mpiexec. Thus, we define the number of processors involved in the NoC using a global variable in a configuration file. Then, we specify which processor is the master at compile time by using the pre-compiler directive -DMASTER on the processor that we require to act as the master of the NoC-based MPSoC. Finally, the assignment of the rank to each processor is performed at runtime when the ocMPI_Init() function is called. This enables the possibility to create different application mappings depending on the application requirements. The set of cores involved in the NoC-based MPSoC is defined by a constant named ocMPI_CPUs² declared also in a configuration file.

² This constant expresses the number of processors involved in the NoC-based MPSoC, which can be different from the number of processors actually available on the platform.


This is equivalent to the -np option of the mpirun/mpiexec command. Later, as usual, we can obtain how many processors are involved in the communication, as well as their ranks, using the ocMPI_Comm_size() and ocMPI_Comm_rank() functions, respectively.
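As an illustration, the compile-time configuration described above could look like the following sketch (the file name and the exact macro spellings are assumptions; only the ocMPI_CPUs constant and the -DMASTER switch come from the text):

/* ocmpi_config.h -- hypothetical configuration file */
#define ocMPI_CPUs 4          /* processors involved, like mpirun -np 4 */

/* Compiled with -DMASTER only on the tile acting as master: */
#ifdef MASTER
#define ocMPI_IS_MASTER 1     /* this core supervises rank assignment */
#else
#define ocMPI_IS_MASTER 0
#endif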

6.5.5 Software Stack Configurations

Depending on the application requirements, we can select different software stack configurations. Figure 6.7 lists four typical configurations of the ocMPI library and their stripped code size in bytes³. The minimal working subset of our ocMPI requires only 4,942 bytes of memory, and it contains the NI driver and middleware routines and the following six functions (a minimal usage example follows the list):

• ocMPI_Init() & ocMPI_Finalize(): initialize/end the ocMPI software library.

• ocMPI_Comm_size(): gets the number of concurrent processes that run over the NoC-based MPSoC.

• ocMPI_Comm_rank(): gets the rank of a process in the ocMPI software library.

• ocMPI_Send() & ocMPI_Recv(): blocking send/receive functions.
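A minimal program built only on this six-function subset could look like the sketch below (argument lists mirror the MPI 2.0 prototypes as stated in Section 6.5.4, but the header name and the communicator/datatype constants are assumptions):

#include "ocmpi.h"   /* hypothetical header of the ocMPI library */

int main(void)
{
    int rank, size, value = 0;

    ocMPI_Init(NULL, NULL);
    ocMPI_Comm_size(ocMPI_COMM_WORLD, &size);
    ocMPI_Comm_rank(ocMPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 100;
        for (int dst = 1; dst < size; dst++)        /* master feeds every slave */
            ocMPI_Send(&value, 1, ocMPI_INT, dst, 0, ocMPI_COMM_WORLD);
    } else {
        ocMPI_Recv(&value, 1, ocMPI_INT, 0, 0, ocMPI_COMM_WORLD, NULL);
    }

    ocMPI_Finalize();
    return 0;
}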

[Figure 6.7 plots the code size in bytes (0-14000) of four ocMPI stack configurations (basic stack; basic + management; basic + profiling; basic + management + profiling + advanced communication), broken down into NI driver/middleware, basic ocMPI, ocMPI management, ocMPI profiling and advanced communication.]

Figure 6.7: Different software stack configurations (ocMPI library)

³ The results shown in Figure 6.7 have been extracted using the nios2-elf-gcc compiler with -O2 code size optimization.


Another interesting software configuration is the basic stack plus the ocMPI profiling functions. As shown in Figure 6.7, these profiling functions take up a large code size because they must include both many Nios II HAL API function calls and the libraries to access the required peripherals (i.e. the timers and/or the performance counter). This is an important point to be optimized in the future by creating custom APIs and peripherals. Finally, Figure 6.7 shows that the complete ocMPI library with the advanced collective communication primitives takes only 13,258 bytes, which makes it suitable for embedded and on-chip systems.

6.5.6 Message Passing Communication Protocols

Message passing protocols (see Figure 6.8), as well as the intermediate buffers in MPI, determine different trade-offs at the moment of packing/unpacking short and long messages in the runtime library (in conjunction with the hardware architecture). These elements determine the final performance of our inter-processor on-chip MPI communication library. Most MPI implementations employ a two-level protocol for point-to-point messages. Short messages are sent eagerly (see Figure 6.8(a)) to optimize the communication in terms of latency, while long messages are typically implemented using a rendezvous mechanism (see Figure 6.8(b)).

[Figure 6.8 sketches both handshakes: (a) MPI eager implementation, where send() pushes the message directly into a receive buffer; (b) MPI rendezvous implementation, with a request-to-send, a clear-to-send reply, and then the message transfer.]

Figure 6.8: Eager vs. rendezvous message passing communication

The rendezvous implementation is fully synchronous. The sender must first send a request which includes only the message envelope with the details of the MPI message (basically the MPI message header presented in Figure 6.6). Later, if the header matches on the receiver side, the receiver replies with an acknowledgment (often called clear-to-send), and finally the sender transmits the actual data. Furthermore, depending on whether the send() arrives before the matching recv(), or the opposite, the sender or the receiver process must be suspended, respectively.

This fact can lead to early-sender or late-receiver performance inefficiencies. On the other side, an "eager" implementation allows a send operation to complete without the corresponding receive operation being executed. This protocol assumes enough buffering capabilities on the system, and often it translates into synchronous and buffered communication schemes: when a send stalls because all the available buffer space is full, the send routine waits until buffer space becomes available again. Rendezvous has a higher message delivery overhead than the eager protocol due to the synchronization process. However, since the messages sent this way are large, the overhead of the protocol handshake with the receiver is assumed to be amortized by the transfer. Under rendezvous, the required buffer space is just enough to hold the MPI header (20 bytes for ocMPI). In our ocMPI implementation, we assume not very long MPI messages (a few KBs), due to the memory constraints and the reduced data sets and workloads used on NoC-based MPSoCs. Therefore, we believe that an eager protocol, or potential variants improved for long messages [246], will outperform the rendezvous protocol, which has been implemented in [237]. In addition, it is obvious that local memories in processor nodes cannot be as large as in traditional distributed memory multiprocessors. Instead, scratchpad memories of reasonable size can be used, since they exhibit a negligible access/energy cost. In any case, further research will be desirable in order to compare both message passing communication protocols more quantitatively.
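Both protocols can be sketched in a few lines of C on top of a hypothetical low-level NI driver (ni_send() and wait_for_clear_to_send() are invented names used only to show the control flow, not actual ocMPI calls):

/* Prototypes of the assumed low-level NI driver calls. */
extern void ni_send(const void *data, int len);
extern void wait_for_clear_to_send(void);

/* Eager: envelope and payload are injected immediately, assuming the
 * receiver has enough buffering for unexpected messages. */
void eager_send(const void *hdr, int hdr_len, const void *payload, int len)
{
    ni_send(hdr, hdr_len);        /* 20-byte ocMPI envelope */
    ni_send(payload, len);        /* data follows right away */
}

/* Rendezvous: only the envelope travels first; the payload is sent once
 * the matching receive has been posted and a clear-to-send comes back. */
void rendezvous_send(const void *hdr, int hdr_len, const void *payload, int len)
{
    ni_send(hdr, hdr_len);        /* request to send           */
    wait_for_clear_to_send();     /* early sender blocks here  */
    ni_send(payload, len);        /* actual data transfer      */
}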

6.5.7 Improving ocMPI Through Tightly-Coupled NIs

The shortcoming of previous work [232-235, 237] is probably the lack of analysis and improvement of the hardware-software interfaces, which is present in [230, 247], but it is definitely still an open issue. In this work, we introduce an exotic tightly-coupled NI implemented as an assembly custom instruction (CI) inside the processor datapath. This NI implementation is quite different from the previous ones. In this solution, we exploit the flexibility offered by the soft-core Nios II architecture to integrate our own CIs in the ALU within the processor datapath (see Figure 6.9(a)). This feature lets us extend the basic Nios II instruction set by adding our NoC-based tightly-coupled custom instructions, and it also allows messages to be pipelined when they are sent and received. Obviously, in this implementation it behaves like a functional unit or a co-processor to boost message passing. The interface of a custom instruction on a Nios II, as shown in Figure 6.9(a), is essentially based on two input operands, dataa[31:0] and datab[31:0], used to pack the data and the destination address, and the signal n[7:0], used to select a specific operation, i.e. ciNiSend(), ciNiRecv() or ciNiStatus(). In addition, the start signal is asserted by the processor when the CI starts. The CI generates two outputs: result[31:0], containing the data under read operations, and done, indicating when the output is valid and available.

[Figure 6.9(a) shows the Nios II custom instruction interface attached to the CI NI; Figure 6.9(b) shows the tight-coupled NI block diagram: status, RX, control and TX registers, an internal instruction decoder, an FSM controller (txFSM/rxFSM), the NoC protocol adaptation layer and the hard-coded XY address.]

Figure 6.9: Tight-coupled NI block diagram

In contrast to bus-based NI implementations, the tight-coupled NI is not memory-mapped into the addressable memory space. Instead, it is attached directly through a CI inside the CPU, and it has a set of status, control and data registers, as shown in Figure 6.9(b).

• 32-bit Status Register (0x0).

• 32-bit Receive Register (0x4) – Configurable as a FIFO.

• 32-bit Control Register (0x8).

• 32-bit Transmitting Register (0xC) – Configurable as a FIFO.

The status register includes different status bits in order to identify whether the NI is busy, the queue is full, or there is data to be read. TX and RX are the registers to send and receive data, whereas the control register is included for advanced future usage. Based on the address, the tight-coupled NI is able to send/receive messages to/from any IP core on the NoC-based MPSoC. The behaviour of the tight-coupled NI relies on producer/consumer synchronization.


Next, Listing 6.1 shows the three main custom instructions implemented.

Listing 6.1: Tight-coupled NI instructions

// Check NI status register.
// Return the status as a 32-bit integer.
inline int ciNiStatus(void);

// Send 32-bit data to a specific address.
// Return 1 for success, otherwise return < 0.
inline int ciNiSend(int *data, int address);

// Receive 32-bit data from a specific address into a buffer.
// Return 1 for success, otherwise return < 0.
inline int ciNiRecv(int *data, int address);

Using ciNiSend() and ciNiRecv(), other specific data exchange routines for particular MPI datatypes can be defined:

inline int ciNicSend(byte *buffer, int length, int address);
inline int ciNicRecv(byte *buffer, int length, int address);
inline int ciNicSendChar(char caracter, int address);
inline int ciNicSendDouble(double data, int address);
.....
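For instance, a buffer-oriented send can be layered on top of the 32-bit primitive roughly as follows (a sketch under the assumption that ciNiSend() transfers one 32-bit word per call; padding of a trailing partial word is ignored for brevity):

/* Possible implementation of ciNicSend() on top of ciNiSend(): the buffer
 * is streamed to the destination tile one 32-bit word at a time. */
inline int ciNicSend(byte *buffer, int length, int address)
{
    int *words  = (int *)buffer;
    int  nwords = (length + 3) / 4;          /* round up to whole 32-bit words */

    for (int i = 0; i < nwords; i++) {
        if (ciNiSend(&words[i], address) < 0)
            return -1;                        /* propagate the NI error code */
    }
    return 1;                                 /* success, as in the primitives */
}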

As shown in Section 6.7.2 on page 160, under point-to-point unidirectional or bidirectional traffic our tight-coupled NI can reach better injection rates than a simple NI attached to the system bus. This is because we remove the arbitration and decoding phases of buses like AMBA AHB. Nevertheless, this type of exotic NI is only applicable to soft-core processors, where the hardware designer can tune the processor datapath.

6.6 OpenMP Framework for NoC-based MPSoCs

In this section we introduce our contribution to the OpenMP framework reported in [183] by adding all the QoS features presented in Chapter 4 on page 85, either on top of the runtime of the parallel programming model or within the OpenMP application. This enables the QoS features present at NoC level to be used from the software application level.

The OpenMP implementation consists of a code translator and a runtime support library. The framework presented in this work is based on the GCC 4.3.2 compiler and its OpenMP translator (GOMP). The OMP pragma processing code in the parser and the lowering passes have been customized to match the peculiarities of our architecture. All the major directives and clauses are supported by our implementation. Work-sharing constructs support all kinds of static and dynamic scheduling, and all synchronization constructs are supported. The runtime library (libgomp) has been re-implemented from scratch due to the absence of a homogeneous threading model.

The original implementation of the libgomp library is designed as a wrapper around the pthreads library. A #pragma omp parallel directive is handled by outlining the code within its scope into a new function that will be mapped to parallel threads through a call to pthread_create. In our case, we need to take a different approach, since we cannot rely on a thread library that allows us to fork a thread onto another core.

The runtime library implements the main() function. Each processor loads its executable image of the program, and then the execution is differentiated between master and slave threads based on the core ids. After an initialization step, executed by every thread, the master calls the application main() function while the slaves wait on a barrier.

The custom translator replaces every #pragma omp parallel directive with a call to our library function do_parallel(), and passes it two arguments: the address of the parallel function code, and the address of a memory location where shared data is placed. In the function do_parallel(), the master updates the addresses at which the slaves will look for the parallel function and the shared data. At this point it releases the barrier and all the threads can execute the outlined function concurrently. At the end of the parallel region the master core continues executing the sequential parts of the application, while the slaves come back to the barrier.

In the OpenMP programming model, barriers are often implied at the end of parallel regions or work-sharing directives. For this reason they are likely to overwhelm the benefits of parallelization if they are not carefully designed taking into account hardware peculiarities and potential bottlenecks. This may happen when the application shows load imbalance in a parallel region, or when most of the execution time is spent in serial regions. The latter happens because the remote polling activity of the slaves on the barrier fills the interconnect with access requests to the shared memory and to the semaphore device. To address this issue we devised a barrier implementation that exploits the scratchpad memories. The master core keeps an array of flags representing the status of each slave (entered or not), and notifies each slave through its local memory when everybody has arrived at the barrier. Moving the polling activity from the shared memory to the local scratchpad memory and eliminating the need for lock acquisition allows us to remove the congestion on the interconnect.


As shown in Figure 6.10, the embedded OpenMP parallel programming model has been mapped on top of a distributed shared memory NoC-based MPSoC virtual architecture developed using MPARM [54] and ×pipes [52, 53, 69]. In Figure 6.10(a), we depict a general hardware template of our parallel architecture and the OpenMP software components.

[Figure 6.10(a) shows the HW-SW stack: OpenMP application, OpenMP runtime environment, middleware API for QoS support, QoS-NI initiators/targets at the transport layer, and the QoS NoC interconnection at the network layer. Figure 6.10(b) shows the ARM shared memory NoC-based MPSoC: ARM PE master and slaves with private memories and QoS-NI initiators, two banks of shared memories with QoS-NI targets, connected through a mesh of switches.]

Figure 6.10: HW-SW view of our OpenMP platform

Our architecture template, as shown in Figure 6.10(b), is a regular 2D-Mesh NoC-based MPSoC platform which consists of a configurable number of ARM processors and shared memories. The system contains a master processor (ARM PE master) which is in charge of supervising the execution, of setting up or tearing down, either dynamically or statically, the NoC QoS features through the middleware functions (see Section 4.4 on page 96), and of distributing and mapping data around the shared memories of the OpenMP application. Normally this is the only processor which reconfigures the QoS features using the profiled OpenMP application, but it is important to remark that the middleware routines can be invoked by any other slave processor during the execution of the application if useful. On the other hand, the system also contains a set of slave processors (ARM PE slave) which are in charge of executing the different parts of the OpenMP kernels distributed by the master. In addition to the PEs, the system includes two banks of shared memories. The first bank is used for the OpenMP library runtime support, while the second bank is used to store the shared data of the OpenMP application. Obviously, the NoC fabric, as well as the ARM PEs and the shared memories, include all the QoS features presented in Section 4.3 on page 89, including QoS NI initiators/targets and QoS switches, respectively.


6.7 Evaluation of on-chip Message Passing Framework

This section shows the results extracted from the different parallel programming models, i.e. ocMPI/j2eMPI using architectures created by NoCMaker [51] on Nios II prototyping platforms, and the OpenMP runtime framework on an ARM NoC-based MPSoC (see Figure 6.10(b)).

6.7.1 Profiling the Management ocMPI Functions

This section presents the profiling of the ocMPI management functions to demonstrate their minimal execution overhead. Figure 6.11 presents the results acquired by monitoring, with an external non-ocMPI profiling function, the execution of each parallel program. The results show that our ocMPI management functions consume around 75 cycles, with the exception of ocMPI_Get_processor_name(), which spends around 210 clock cycles.

[Figure 6.11 plots the cost in clock cycles (0-250) of the ocMPI management functions: ocMPI_Finalize(), ocMPI_Get_version(), ocMPI_Initialized(), ocMPI_Finalized(), ocMPI_Comm_rank(), ocMPI_Comm_size(), ocMPI_Wtick(), ocMPI_Wtime() and ocMPI_Get_processor_name().]

Figure 6.11: Profiling of the management ocMPI functions

6.7.2 Low-level Message Passing Benchmarks

This section shows the evaluation and profiling of all the software layers of the ocMPI/j2eMPI API for message passing. Thus, in this section we characterize our ocMPI library by executing several benchmarks in order to extract the most important communication metrics. Performance parameters such as link latency, link bandwidth and synchronization cost are measured by executing a set of customized micro-benchmarks [248, 249].

The process to execute the different micro-benchmarks is the following (a code sketch is shown after the list):

(1) Start the timer – ocMPI Wtime().

(2) Loop modifying the parameters of the micro-benchmark (e.g. message size in powers of 2, number of repetitions, etc.).

(3) Verify if the micro-benchmark finishes and the system is stable.

(4) Stop the timer – ocMPI Wtime().

(5) Compute the metric.
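A typical instance of this procedure, here for a ping-pong latency measurement between ranks 0 and 1, is sketched below (loop counts and message sizes are arbitrary; only the ocMPI calls themselves come from the library described above):

#define MAX_SIZE    (64 * 1024)
#define REPETITIONS 100

static char buffer[MAX_SIZE];

void pingpong_benchmark(int rank)
{
    for (int size = 4; size <= MAX_SIZE; size *= 2) {      /* step (2): powers of 2 */
        double t_start = ocMPI_Wtime();                    /* step (1): start timer */

        for (int i = 0; i < REPETITIONS; i++) {            /* steps (2)-(3)         */
            if (rank == 0) {
                ocMPI_Send(buffer, size, ocMPI_BYTE, 1, 0, ocMPI_COMM_WORLD);
                ocMPI_Recv(buffer, size, ocMPI_BYTE, 1, 0, ocMPI_COMM_WORLD, NULL);
            } else if (rank == 1) {
                ocMPI_Recv(buffer, size, ocMPI_BYTE, 0, 0, ocMPI_COMM_WORLD, NULL);
                ocMPI_Send(buffer, size, ocMPI_BYTE, 0, 0, ocMPI_COMM_WORLD);
            }
        }

        double t_stop  = ocMPI_Wtime();                    /* step (4): stop timer  */
        double latency = (t_stop - t_start) / (2.0 * REPETITIONS);   /* step (5)    */
        (void)latency;   /* report or store the metric here */
    }
}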

Firstly, to evaluate the real bandwidth and the injection rate of the low-level functions on the NI, we performed some basic tests, like unidirectional or ping-pong tests. Looking at the results in Figure 6.12, the use of the tight-coupled NI and the low-level function routines is the best choice to communicate any Nios II based tiles in our NoC-based MPSoC⁴.

[Figure 6.12 compares the injection rate over 1 to 4 hops for the bus-based NI and the tight-coupled NI, each under unidirectional and ping-pong traffic.]

Figure 6.12: Comparison of a bus-based vs. tight coupled NI

⁴ We used a standard Nios II soft-core processor on top of an ephemeral circuit switching NoC-based system.


The results show that our tight-coupled NI integrated in the processor datapath is faster than our bus-based NI. It offers around 80 Mbps of bandwidth (counting the software stack) under unidirectional traffic patterns, and the injection rate is close to 1 packet every 40 cycles at only 100 MHz. Figure 6.12 shows the injection rates of both NI implementations for our ephemeral circuit switching. Obviously, in all cases the bandwidth and the injection rate degrade with the number of hops due to the end-to-end flow control protocol on the NoC. However, this degradation only increases slightly with the number of hops, and therefore it permits scaling up to large N-dimensional mesh NoCs.

Second, we execute some ocMPI functions to profile them by extracting their execution time. Scalability tests have been done by increasing the number of IP cores. The ocMPI_Init() function notifies to all the processors on the NoC their rank and the nodes involved in the cluster-on-chip. In Figure 6.13(a), we show the execution time in clock cycles (and the equivalent in µs using a 100 MHz clock) to initialize the NoC in different 2D-Mesh NoC-based MPSoCs, where the root processor is placed on the bottom-left corner. The latency to assign the ranks dynamically to each processor on the NoC ranges from 3.88 µs for a quad-core up to 87.96 µs on a 36-core MPSoC. These times are negligible compared to the execution time of the applications we will run, and furthermore, this is only done once when the ocMPI stack starts.

[Figure 6.13 reports, for 2x2 to 6x6 meshes, (a) the execution time of ocMPI_Init() in clock cycles and µs, and (b) the number of barriers per second and the execution time in µs of a single ocMPI_Barrier().]

Figure 6.13: Execution time of ocMPI Init() & ocMPI Barrier()

In message passing, barriers are usually required in MPI programs to synchronize all the CPUs involved in a parallel task. In Figure 6.13(b), we report the evolution of the synchronization time in multiple 2D-Mesh NoC-based MPSoC architectures. The plot shows the number of barriers/sec we can run on the MPSoC, as well as the execution time in µs of a single barrier using a clock frequency of 100 MHz.


These metrics, while topology dependent, show the overall system performance and its degradation when the number of cores increases. Both ocMPI_Init() and ocMPI_Barrier() have been implemented without any optimization. Thus, the broadcast notification performed in ocMPI_Init() is based on sending N-1 messages from the root to the rest of the slave processors, whereas ocMPI_Barrier() relies on a pure master-slave software implementation. The software barrier techniques reported in [250], based on tree or butterfly approaches, are a potential improvement.

Finally, Figure 6.14(a) plots the execution time of ocMPI_Send() under unidirectional traffic on the NoC-based MPSoC. In the same figure, the time to execute ocMPI_Send() and ocMPI_Recv() is reported under bidirectional traffic. In both benchmarks the size of the ocMPI payload has been varied from 4 bytes to 1 MByte. In message passing programs, a typical function to share data among the cores on the NoC is ocMPI_Bcast(). In Figure 6.14(b), we show the execution times to perform a broadcast in 2x2 and 3x3 NoC-based MPSoCs.

[Figure 6.14 plots latency versus payload size (4 bytes to 1 MB): (a) ocMPI_Send() and ocMPI_SendRecv(); (b) ocMPI_Bcast() on 2x2 and 3x3 meshes.]

Figure 6.14: Execution time of point-to-point and broadcast in ocMPI

As expected, the execution time on the 3x3 mesh grows considerably with respect to the 2x2 one. The implementation of ocMPI_Bcast(), as well as of the remaining collective communication routines, is built on top of the point-to-point functions (i.e. ocMPI_Send() and ocMPI_Recv()). Furthermore, it uses N-1 point-to-point messages from the broadcast root to the other nodes inside the communicator. This behaviour can be improved, for instance, by means of a row-column broadcast scheme. As before, the execution time is reported varying the packet size from 4 bytes to 1 MByte. All these benchmarks take into account the overhead of executing the ocMPI software stack on top of the NoC-based MPSoC architecture based on Nios II processors.

In addition, in our architectures we never used a DMA to transfer data, so depending on the application mapping and the communication patterns, computation is not always concurrent with communication.
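The layered ocMPI_Bcast() described above can be pictured as the following sketch (prototypes mirror MPI 2.0; the actual library source may differ):

/* Naive broadcast: the root issues N-1 point-to-point sends, every other
 * rank posts a single matching receive. */
int ocMPI_Bcast(void *buf, int count, ocMPI_Datatype type, int root, ocMPI_Comm comm)
{
    int rank, size;

    ocMPI_Comm_rank(comm, &rank);
    ocMPI_Comm_size(comm, &size);

    if (rank == root) {
        for (int dst = 0; dst < size; dst++)
            if (dst != root)
                ocMPI_Send(buf, count, type, dst, 0, comm);   /* N-1 messages */
    } else {
        ocMPI_Recv(buf, count, type, root, 0, comm, NULL);
    }
    return 0;
}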

6.7.3 Design of Parallel Applications using the ocMPI library

In order to verify our ocMPI/j2eMPI library, a large variety of tests has been applied on different NoC-based MPSoC architectures, and the high-level real application test sets involve well-known parallel applications. The following parallel applications have been tested using our ocMPI/j2eMPI:

• Parallel calculation of π: this parallel version to compute π is based on the summation approximation presented in Equation 6.1.

    \frac{\pi}{4} = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}    (6.1)

• Parallel generation of fractals: a fractal image is generated by taking complex number representations and applying, for every point of a given area, one simple recursive formula. The Mandelbrot set M is defined by a family of complex quadratic polynomials:

    f_c : \mathbb{C} \to \mathbb{C} \quad \text{given by} \quad f_c(z) = z^2 + c    (6.2)

In Equation 6.2, c is a complex parameter taken from an arbitrary area. For each c, one considers the recursive behaviour of the sequence (0, f_c(0), f_c(f_c(0)), f_c(f_c(f_c(0))), ...) obtained by iterating f_c(z) starting at z = 0.

Both applications have been parallelized in a distributed master-slave mode through ocMPI on different multi-core architectures, such as 2x2 and 3x3 mesh architectures with four and nine processors, respectively. Furthermore, it is important to remark that the source codes of both applications have been compiled and executed without any relevant changes with respect to the source code used in a traditional large-scale distributed system. As shown in Table 6.3, for each parallel application we run four different tests by changing the problem size parameters (i.e. N as the π precision, and MAX_ITERATIONS to fix the definition of the Mandelbrot set using a fixed 176x144 QCIF image).


Test #    Problem size
Test 1    N=10000,    MAX_ITERATIONS=64
Test 2    N=100000,   MAX_ITERATIONS=128
Test 3    N=1000000,  MAX_ITERATIONS=256
Test 4    N=10000000, MAX_ITERATIONS=512

Table 6.3: cpi and Mandelbrot set problem size parameters
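A master-slave ocMPI parallelization of the π series of Equation 6.1 can be sketched as follows (the rank-strided work partition and the constant names are assumptions; prototypes mirror MPI 2.0):

/* Each rank accumulates every size-th term of the series; the partial
 * sums are then combined on the master (rank 0) with a reduction. */
double compute_pi(int N)
{
    int    rank, size;
    double local = 0.0, pi = 0.0;

    ocMPI_Comm_rank(ocMPI_COMM_WORLD, &rank);
    ocMPI_Comm_size(ocMPI_COMM_WORLD, &size);

    for (int n = rank; n < N; n += size)
        local += ((n % 2) ? -1.0 : 1.0) / (2.0 * n + 1.0);

    ocMPI_Reduce(&local, &pi, 1, ocMPI_DOUBLE, ocMPI_SUM, 0, ocMPI_COMM_WORLD);

    return 4.0 * pi;   /* only meaningful on the master */
}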

Figure 6.15(a) shows the execution time of the cpi application for each test, obtained by means of ocMPI_Wtime(). In Figure 6.15(b), we show the speedup according to the number of processors when changing the π precision. It is easy to observe that, using the parallel implementations, we can achieve speedups close to the ideal case (blue dashed line), i.e. according to the number of CPUs integrated in our parallel embedded platform. For instance, the serial computation of π on Test 4 takes 390.2 s, whereas its parallel version takes 99.71 s and 43.6 s on our NoC-based MPSoCs with four and nine processors, respectively.

[Figure 6.15 reports, for Tests 1-4, (a) the execution time in seconds of cpi_serie, cpi_parallel (2x2) and cpi_parallel (3x3), and (b) the speedup versus the number of CPUs, together with the ideal linear speedup.]

Figure 6.15: Parallelization of cpi computation

On the other side, Figure 6.16 presents the results of the parallel and serial generation of the Mandelbrot set. Figure 6.16(a) shows the execution times when changing the architecture and the maximum number of iterations of each test, and Figure 6.16(b) the speedup of the parallel generation of the Mandelbrot set. Thus, while Test 4 takes around 410 s in its serial version, the parallel generation takes 109 s or 47 s, respectively, on our 2x2 and 3x3 MPSoC mesh architectures. As we can see in Figure 6.16(b), the speedups are again very close to the ideal (blue dashed line). This is because each point of the Mandelbrot set can be calculated independently and there are no data dependencies.


[Figure 6.16 reports, for Tests 1-4, (a) the execution time in seconds of mandelbrot_serie, mandelbrot_parallel (2x2) and mandelbrot_parallel (3x3), and (b) the speedup versus the number of CPUs, together with the ideal linear speedup.]

Figure 6.16: Parallelization of Mandelbrot set application

Because both applications have been parallelized with a coarse granularity, the computation-to-communication ratio is high, and due to the independence of the data sets, both parallel applications show very high scalability. In addition, it is important to remark that the on-chip network communication latencies are almost negligible compared to the computation time on each processor, and as a consequence, in both experiments we obtain speedups close to the number of processors.

6.8 Improve OpenMP Kernels using QoS at NoC-level

OpenMP is a widely adopted shared memory programming API. It allows parallelism to be specified incrementally in a sequential C (or C++, or Fortran) code through the insertion of specific compiler directives. This work is based on the OpenMP implementation presented in [183], on top of which we integrated our extensions to support QoS-related features.

There are several ways in which priorities and channel establishment can be exploited within the OpenMP programming framework. One possibility is to extend the API with custom directives to trigger prioritized execution for a particular thread (or group of threads). This option is useful to give the knowledgeable programmer the possibility of specifying appropriate prioritization patterns at different program points as needed. On the other hand, keeping the programmer in the decision loop requires insight into both program behavior and architectural details, thus compromising ease of programming. A different approach is to allow the programmer to express QoS design intent in a very lightweight manner, not by manually invoking QoS-related APIs, but simply by opting to use OpenMP functionality.

In this model, the OpenMP runtime is in charge of automatically invoking the low-level QoS calls without further programmer involvement. In this work we focus on the latter approach, and leave the former for future work. The support for QoS at the programming model level has been integrated in an MPSoC-specific implementation of the OpenMP compiler and runtime library [183] based on the GCC 4.3 toolchain.

The idea here is to provide a sort of "privileged program area" for the execution of OpenMP threads, which are guaranteed not to be impacted by the effect of conflicting transactions from non-OpenMP programs. To do so, we embed in the parallel region manager of our runtime environment appropriate calls to the setPriority() and resetPriority() middleware API functions presented in Section 4.4 on page 96. This has the effect of establishing prioritized channels towards the memory devices accessed by the threads.

Each thread works on datasets which may change over time (i.e. at each parallel region). We exploit application profiling to ensure that each thread's data, at any point in time, is accessible at the highest possible priority through an appropriate QoS configuration of the NoC. More specifically, based on profile data, the priorities are automatically re-set at the beginning of each parallel region.
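Conceptually, the parallel region manager wraps each region with the middleware calls roughly as follows (only setPriority() and resetPriority() come from the middleware API of Section 4.4; their argument lists, the profiling table and the helper functions are assumptions used for illustration):

struct qos_entry { int target_memory; int priority_level; };

extern int              num_threads;
extern struct qos_entry profile[][8];    /* filled offline from application profiling */

extern void setPriority(int target_memory, int priority_level);   /* middleware API */
extern void resetPriority(int target_memory);
extern void dispatch_to_threads(void (*fn)(void *), void *data);
extern void barrier_wait(void);

void run_parallel_region(int region_id, void (*outlined_fn)(void *), void *data)
{
    /* Re-program the NoC with the priorities recorded for this region. */
    for (int t = 0; t < num_threads; t++)
        setPriority(profile[region_id][t].target_memory,
                    profile[region_id][t].priority_level);

    dispatch_to_threads(outlined_fn, data);   /* threads run in a "privileged area" */
    barrier_wait();                           /* implicit barrier at region end     */

    for (int t = 0; t < num_threads; t++)
        resetPriority(profile[region_id][t].target_memory);
}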

6.8.1 Balancing and Boosting Threads

In this section we describe the experimental setup that we considered to implement and evaluate the proposed framework. Our target platform template has been described in Figure 6.10(b) on page 159. We implemented an instance of this template within the MPARM cycle accurate simulator [54]. The architectural details are reported in Table 6.4.

To test the effectiveness of the approach, we employ a set of variants of the Loop with dependencies benchmark from the OmpSCR (OpenMP Source Code Repository) suite [251]. In this program a number of parallelization schemes are considered to resolve loop-carried dependencies. Due to the possibility of finely tuning the workload through several parameters, as well as the alternation of communication-intensive and computation-intensive loops, this benchmark allows us to investigate a number of interesting case studies.

Typically, OpenMP programs achieve a balanced parallel execution time between worker threads by allocating similar amounts of work to each of them. The most common example in this sense is loop parallelization with static scheduling, where the iteration space is evenly divided among threads.


QoS-based NoC parameters   - 3x8 Quasi-Mesh
                           - Fixed Priorities or Round Robin BE arbitration policy (if no prioritization is performed)
                           - Wormhole packet switching (FLIT_WIDTH = 45 bits)
                           - XY routing
Processing elements        - 8 ARM RISC processors (4 KB data cache, 8 KB instruction cache)
                           - QoS-NI initiator
                           - Private local memories (placed at 0x00-0x07)
Shared memories            - Two banks of 8 shared memories (placed at 0x10-0x1F)
                           - QoS-NI targets
Other                      - Semaphore (placed at 0x20)

Table 6.4: Parameters of our OpenMP NoC-based MPSoC

Communication is also evenly distributed: threads reference distinct equally-sized subsets of a shared data structure (i.e. an array). From the architecture-agnostic application (or compiler) point of view, this is all that can be done to achieve workload balancing. Unfortunately, when mapping the application onto the hardware resources, many issues arise that can break this balance. High contention for the memory device where the shared data is allocated may cause one (or more) threads to be delayed in accessing their dataset. Due to the OpenMP semantics, where a barrier is implied at the end of each parallel region, this delay lengthens the overall program execution. Our aim has been to explore the effectiveness of high-priority packets in solving this issue. To model the described problem, we allocate four of the eight available processors to the OpenMP program. The remaining four processors host independent threads that generate interfering/background traffic, as produced by another application. We consider three cases:

• All OpenMP threads within a communication-intensive loop are delayed in accessing shared data on a unique memory device.

• All OpenMP threads within a computation-intensive loop are delayed in accessing shared data on a unique memory device.

• Every OpenMP thread accesses a separate memory device: a single thread is delayed.

We describe each experiment in detail in the following sections.


6.8.2 Communication Dominated Loop

One of the techniques considered in our benchmark for loop parallelization in the presence of dependencies is that of replicating the target array prior to overwriting its location with newly-computed data. The copy operation is fully parallel, with threads simply reading and writing to memory. In this experiment, the OpenMP threads are hosted on processors four to seven. We allocate the target program arrays on a single memory device, namely shared memory 8. Interfering traffic to the same memory device is generated by noisy threads running on processors zero to three. The results for this experiment are reported in the plot on the left of Figure 6.17.

[Figure 6.17 shows two plots (communication-intensive and computation-intensive loops) with the normalized execution time of processors P0-P7, distinguishing the OpenMP team from the interfering traffic, with and without priorities.]

Figure 6.17: Effect of priorities on an OpenMP program unbalanced by memory contention

The x-axis represents processors zero to seven, whereas the y-axis reports the normalized execution time for each of them. The blue bars sketch the program behavior in the absence of prioritized transactions; the red bars represent the OpenMP threads enhanced with high-priority transactions. It is possible to notice that in this mapping the OpenMP threads experience a very unbalanced execution, due to the combined effect of memory device contention and arbitration policy in the interconnection medium. It is also possible to see that enclosing the OpenMP program within high-priority threads allows perfect workload balancing to be achieved. Furthermore, the program execution, whose performance degradation was originally dictated by the huge delay on processor 7, is sped up by 60.25%. We also want to highlight a kind of "system-level" effect of priorities. The interfering threads have been assigned an amount of work similar to that of the OpenMP threads to show how the time recovered on processors four to seven is spread among the non-prioritized code on processors zero to three.

6.8.3 Computation Dominated Loop

Once a copy of the original array content is stored in a replica, the second loop of the benchmark computes the new array element values by applying a computation-intensive kernel. The results for this experiment are reported in the plot on the right of Figure 6.17. Here the setup is largely similar to that described in the previous section, and for this reason the behaviour of this loop closely resembles that of the previous one. Unsurprisingly, due to the reduced communication-to-computation ratio, here the effect of the high-priority transactions is less pronounced. Nonetheless, a speedup of 14.87% is achieved on the loop execution.

6.8.4 Balancing Single Thread Interference

As a final experiment, we want to explore the role of priorities in a situation in which every OpenMP thread accesses a separate memory device (thus no contention is generated by the program itself). Here we suppose that a single thread is delayed in communicating with a memory by concurrent operations performed by other devices in the system (e.g. an accelerator performing DMA transfers or similar). We model this situation in the following way. The OpenMP threads are mapped onto processors zero to three. Each of them executes the whole loop with dependencies benchmark, but we use a custom feature of our compiler that allows us to partition shared arrays in as many tiles as cores, and to place the individual tiles onto nearby memories. Thus processor zero only accesses data on memory zero, processor one only accesses data on memory one, and so on. All of the interfering threads generate transactions to memory three, namely the memory accessed by OpenMP thread three. Looking at the blue bars in Figure 6.18, it is possible to notice that this placement seriously degrades the performance of the OpenMP program, since thread three is delayed by the interfering transactions. Even in this case it is possible to see that priorities completely solve the issue. The red bars show that the prioritized OpenMP program achieves a perfectly balanced execution, distributing its original delay over the non-prioritized threads. The overall program execution time achieves a 66.04% speedup.


[Figure 6.18 shows the normalized execution time of processors P0-P7 (OpenMP team on P0-P3, interfering traffic on P4-P7), with and without priorities.]

Figure 6.18: Effect of priorities on an OpenMP program unbalanced by a delayed thread

6.9 Conclusion and Open Issues

Through this approach we are able to efficiently program homogeneous or heterogeneous NoC-based MPSoC systems by using high-level parallel programming models, and to speed up applications and threads.

We presented a complete software stack in order to effectively exploit all the available computational resources included in our on-chip platform by adapting high-level MPI to our on-chip microtask communication message passing programming model. Thus, we used this software stack to map high-level message passing parallel applications on a generic distributed-memory 2D-Mesh NoC-based MPSoC architecture, which includes several Nios II soft-cores. Because the standard MPI is not suitable for on-chip systems, due to its large memory footprint and its high execution latencies, in this work we faced an extremely accurate porting process, without relying on an OS, for the design of a lightweight message passing interface library. This leads to a compact implementation of the standard MPI that minimizes internal data structures, gives support only for basic MPI datatypes and operations, and defines only the functions useful for an efficient microtask communication in on-chip environments. The results obtained in this chapter demonstrate three main points:

• Mixing well-known modern on-chip architectures (i.e. NoC-based MPSoCs) with a standard message passing programming model produces a reusable, robust and efficient microtask communication layer for NoC-based MPSoCs.

• Our tiny ocMPI software library for microtask communication is a custom instance of MPI, and it is a viable solution to program NoC-based MPSoC architectures through a high-level message passing programming model.

• Due to its intrinsic support for heterogeneous systems, our ocMPI is a suitable way to program future homogeneous or heterogeneous NoC-based MPSoCs.

Finally, our ocMPI eases the design of coarse-grain or fine-grain parallel on-chip applications using the explicit parallelism and synchronization inherent in all MPI source codes. Moreover, MPI software designers do not need re-education to write and run parallel applications on these NoC-based MPSoC architectures. This maximizes the productivity of software developers and the reuse of parallel applications.

As future work, the ocMPI software library will evolve by adding more standard MPI functions useful for on-chip high-performance computing applications. The candidates are the non-blocking point-to-point functions and other collective communication routines. The ocMPI evolution also involves the optimization of the data distribution in most collective communication routines (now ocMPI_Bcast() sends N-1 packets and ocMPI_Barrier() uses an inefficient master-slave scheme) using, for instance, binary tree or butterfly distribution algorithms, which have faster completion times, typically O(log2 N). Additionally, we will continue our research efforts to support custom hardware IP core modules which participate as stand-alone peers in the ocMPI NoC-based system, instead of as a coprocessor inside each tile. For this approach, a hardware ocMPI protocol must be embedded in the IP core.

On the other side, in this chapter, we introduced a complete framework to support QoS-related features at the programming model level. More specifically, we employ a layered design to expose the hardware QoS features of a NoC-based MPSoC architecture to the runtime environment of our OpenMP programming framework. This allows us to build an infrastructure to assign varying levels of priority to the communication issued by OpenMP tasks.

Based on both a low-cost hardware design and a streamlined software implementation of the middleware API, we achieved extremely efficient support for QoS. The results confirm a very small hardware overhead to support packet-level QoS. Experiments on a set of representative OpenMP kernels show that, under different traffic patterns and thread allocation schemes, the use of prioritized transactions balances and boosts the overall execution time by up to 66%.

Future work in this field will focus on further extending the OpenMP support for QoS-related features. We are exploring the possibility of providing means for the programmer to explicitly specify prioritized execution within the program (e.g. with the use of specific compiler directives). Dynamic priority adjustments entail negligible overhead even at the single read/write level. Therefore, the design overhead vs. performance gain of very fine-grain QoS tuning is open for future study.

For all of these reasons, the outcome of this work is a cluster-on-chip environment which can be programmed using OpenMP or MPI for high-performance embedded computing (HPEC).

The discussion, and the larger question driving this chapter, is related to the software implications on NoC-based MPSoCs. While there is a considerably large body of research on parallel programming models and software stacks for cache-coherent processors, in this dissertation we believe that the advantages of message passing software on naturally distributed NoC-based MPSoCs are too great to be ignored.

First, managing messages and explicitly distributing data structures adds considerable complexity to the software development process. This complexity, however, is balanced by the greater ease of avoiding race conditions when all sharing of data happens through explicit messages rather than through a shared address space.

Second, it is quite difficult to write software with predictable performance when the state of the cache is not predictable at all. A NoC-based MPSoC without cache coherence between cores avoids this problem.

Third, there are software engineering advantages of message passing architectures when modelling the software application. Software modules are often built upon the idea that computation occurs in a "black box" entity, and the exchange of information only happens through its interfaces. A message passing architecture and its parallel programming model, like MPI, naturally support this, and data sharing only occurs when messages are explicitly transmitted and received.

Despite this, as future work, the idea is to perform a fair comparison between both parallel programming models using a set of generic NoC-based MPSoC architectures and benchmarks, in order to shed light on this open issue at the moment of programming a multi-core system.


CHAPTER 7

Conclusions and Future Work

This short chapter reviews the initial motivations, presents the general conclusions of this Ph.D. dissertation, and afterwards lists open issues and potential future research directions in the area following this research.

7.1 Overview and General Conclusions

Emerging consumer applications demand high performance and, especially in the embedded domain, low-power and low-cost systems. The research community, as well as industry, has moved to design homogeneous or heterogeneous multi-core/many-core/MPSoCs to fulfill the demands of next-generation SoCs.

Thus, we are witnessing the continuous growth in the number of processing elements and internal memory in current and future multi-core systems (i.e. many-cores or MPSoCs). In Figure 7.1, we show the current figures and the prediction made by the International Technology Roadmap for Semiconductors in terms of number of components, internal memory, and the required performance over the upcoming years.

Thus, we are facing a big change in hardware and software design, and attending the revolution towards parallel processing, since future platforms will integrate multiple cores. We are still following Moore's law (doubling the number of transistors every 2 years), and shifting from the dead micro-architectural Moore's law (doubling the performance per core every 2 years) towards a multi-core Moore's law. This new Moore's law says that the number of cores per chip doubles every 2 years, and by extension it also claims to double the parallelism per workload every 2 years, in conjunction with its architectural support, resulting in a doubling of the final performance per chip every 2 years.

In fact, as shown in Figure 7.1(a), the prediction shows that in a 5-year time frame, i.e. in 2015, we will have to deal with hundreds of processors, between main CPUs and processing elements, on SoC architectures which should deliver around 10 TFLOPS. In addition, as shown in Figure 7.1(b), the amount of memory will also increase accordingly, which opens many opportunities to design efficient memory hierarchies.


Figure 7.1: SoC consumer design complexity trends (extracted from [1])

Considering that multi-core platforms are here, this dissertation provides a set of contributions to the scientific community (exploring most degrees of freedom of the NoC design space, see Figure 7.2(a)), and hopefully to the industry community, to face the key design challenges of multi-core and many-core systems:

• Design suitable NoC interconnections through EDA tools in order to generate energy efficient embedded multi-core systems.

• Simulation and validation environments, as well as real FPGA prototyping of NoC-based MPSoC systems using different soft-core processor architectures (e.g. Nios II and ARM Cortex-M1).

• Design of NoC interfaces to interconnect MPSoCs built around heterogeneous standard protocols, such as OCP-IP and AMBA 2.0 AHB.

• Design and exploration of an AMBA AHB standalone FPU accelerator to speed up floating point operations, which can also be shared among several processors, depending on system requirements.

• Reasonably low-cost hardware-software support for runtime QoS and guaranteed services in NoC-based systems.


• An efficient, lightweight message passing parallel programming model and its specific hardware support to enable low-latency inter-process communication and parallel computing, and to speed up parallel applications on a chip.

• Exploration of QoS support on top of the OpenMP parallel programming model to boost critical threads, balance traffic, and meet application requirements on a cluster-on-chip.

This Ph.D. dissertation explores the three main key challenges to design full-featured, high-performance embedded multi-core systems, at both the hardware and the software level: (i) the interconnection backbone based on the NoC paradigm through our NoCMaker EDA tool [50, 51], (ii) the hardware-software interfaces and the provision of QoS through middleware routines [252], to enable runtime reconfiguration of hardware features, and (iii) the adaptation and/or customization of well-known parallel programming models, in our case MPI [129, 253, 254], to enable embedded parallel computing; a sketch of how such middleware routines can be exposed to the software is given after Figure 7.2. As shown in Figure 7.2(b), this dissertation also states that using and combining well-known concepts, such as the general theory of communication networks, with emerging many-core MPSoC and NoC-based architectures, in conjunction with parallel programming models, generates new knowledge to face the trends in the design and prototyping of cluster-on-chip multi-core architectures.

[Figure 7.2(a) depicts the explored NoC design space: QoS, bandwidth, power, gate count, IP reuse, wire efficiency, reconfigurability, and custom cores/interfaces. Figure 7.2(b) depicts the thesis overview: SoC/MPSoC architectures, Networks-on-Chip (NoC) and parallel programming models (OpenMP/MPI) converging into QoS plus OpenMP/MPI on NoC-based MPSoCs.]

Figure 7.2: Thesis overview and NoC design space overview
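The dissertation provides QoS through middleware software routines operating on the NoC network interfaces [252]. The exact API is not reproduced in this chapter, so the function names and register layout below are hypothetical placeholders; what is grounded in the text is that traffic-class priorities and guaranteed services (emulated circuit switching on the wormhole backbone) are configured at runtime from software.

    #include <stdint.h>

    /* Assumed memory-mapped NI control registers (addresses are illustrative). */
    #define NI_QOS_BASE  0x80000000u
    #define NI_PRIO_REG  (NI_QOS_BASE + 0x0)
    #define NI_GT_REG    (NI_QOS_BASE + 0x4)

    static inline void write_reg(uint32_t addr, uint32_t val)
    {
        *(volatile uint32_t *)addr = val;   /* bare-metal MMIO write */
    }

    /* Tag all subsequent packets injected by this core with a priority level. */
    void qos_set_priority(unsigned level)
    {
        write_reg(NI_PRIO_REG, level);
    }

    /* Reserve an end-to-end path towards a destination tile, emulating circuit
     * switching on top of the wormhole packet-switched backbone. */
    void qos_open_gt_connection(unsigned dst_tile)
    {
        write_reg(NI_GT_REG, (1u << 31) | dst_tile);
    }

    void qos_close_gt_connection(unsigned dst_tile)
    {
        write_reg(NI_GT_REG, dst_tile);
    }

Keeping these services behind a small set of middleware calls is what allows higher software layers (MPI or OpenMP runtimes) to request priorities or guaranteed throughput without knowing the NI register details.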

Using the initial HW-SW methodology (see Figure 1.4 on page 14), and thanks to the HW-SW components developed and presented along this Ph.D. thesis, multiple scalable multi-core and MPSoC architectures have been designed by exploiting hardware features through software facilities.

In more detail, in Chapter 2 this dissertation has demonstrated that, using xENoC – NoCMaker as EDA tool, we can reach a rapid and correct-by-construction prototyping of different interconnection fabrics for next-generation NoC-based MPSoCs, as well as a robust simulation and validation environment to obtain early estimations of area costs, power consumption and performance. We demonstrated that our variant of circuit switching uses around 15% less area than a conventional packet switching design with minimal buffer capabilities. We also showed trade-offs between common regular topologies, giving the ratio between communication and computational resources, in several embedded parallel Nios II NoC-based MPSoC architectures. The tool flow will enhance design productivity and bridge the gap between the computation and the communication architecture.

In Chapter 3, we showed the design of a full-featured AMBA 2.0 AHB NI which is OCP-IP compatible. The results showed that our NI design is slim and efficient. The area overhead is around 25% compared to the equivalent OCP-IP NI (which does not impact much the final size of the NoC backbone). This is mainly due to the need to adapt AMBA 2.0 AHB to OCP-IP signals, and to the pipelined nature of AMBA 2.0 AHB. On the other side, the circuit performance only drops ∼1.5%, which can definitely be considered negligible. The main benefit of this work is that it enables reusability, allowing the transparent plug&play of pre-designed and pre-tested IP cores compliant with these two on-chip communication protocols. Thus, on the final NoC-based system, all IP blocks will be fully compatible at the transactional level from the user point of view. This is a key factor to boost design productivity and fast time-to-market when designing multi-protocol heterogeneous multi-core systems.

In Chapter 4, we provisioned runtime QoS (i.e. traffic class classification using a unified buffer priority scheme, and guaranteed services using the emulation of circuit switching on the wormhole packet-switched NoC backbone) on AMBA AHB and OCP network interfaces at NoC level through middleware software routines. On average, the area overhead to support QoS from middleware software routines at NI level is about 20%, without any influence on circuit performance (i.e. fmax). At switch level, however, the inclusion of QoS features shows an important impact, especially when the number of priority levels increases, in terms of both area and performance. Thus, our tests assess an area overhead of 25% and 40-50% (depending on the target device) when we support 2 and 4 levels of priority, respectively. Furthermore, this impact rises dramatically up to 100% (i.e. doubling the area of a switch without QoS) if 8 levels of priority are supported on the switch. On the other side, on average, the circuit performance also drops significantly on the tested FPGA devices, i.e. 15-20%, 24-29% and 35% with 2, 4, and 8 priority levels, respectively. Despite this fact, the benefits are large, since we can handle and balance traffic, or guarantee throughput at runtime, when congestion and hotspots appear on the NoC backbone.

In Chapter 5, we explored the way to accelerate floating point instructions by designing a standalone memory-mapped FPU accelerator (either as a NoC tile or as a crossbar element), exploring the different HW-SW notification protocols to enable HW-assisted FP instructions. Moreover, we also investigated its shareability among different ARM Cortex-M1 processors on a full-featured cluster-on-chip. Final results demonstrate execution speedups ranging from ∼3x to ∼45x compared with an equivalent ARM emulated floating point software library. At only 50 MHz, the cluster-on-chip platform delivers 1.1-3.5 MFLOPS, depending on the CPU-FPU notification protocol used.

In Chapter 6, an efficient, lightweight MPI-based parallel programming model SW stack has been designed and targeted at inter-process communication on a chip. The software stack with the most common MPI communication functions only requires ∼13 KB, which makes it suitable for embedded multi-core systems. Inter-process communication latencies have been profiled under different workloads, as well as for barrier synchronization, and quantified at around tens of µs to distribute small data sets when running the system at 100 MHz. Traditional highly parallel applications have been parallelized, resulting in speedups close to the number of processors. This demonstrates its feasibility as the parallel programming model of future distributed NoC-based MPSoCs or many-core architectures. The source code of the parallel applications has been compiled and executed without any relevant changes with respect to the source MPI codes used in traditional large-scale distributed systems, which enables software code reuse and avoids the unnecessary re-education of software developers; a minimal example of such an unmodified MPI program is shown below.

On the other side, along Chapter 6, we improved OpenMP kernels by applying QoS on top of the shared memory parallel programming model targeted at embedded systems. Thus, we are able to balance and boost threads and, ultimately, to fulfill application requirements when interfering workloads use specific resources (in our case the shared memories) which are also used by the OpenMP runtime software layer. The outcome is execution speedups of up to 66% in traditional OpenMP loop parallelization.
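As an illustration of the code-reuse claim above, the following is a minimal standard MPI program of the kind that, according to this chapter, compiles and runs on the lightweight on-chip MPI stack (ocMPI) without relevant changes. It uses only basic functions of the standard MPI API; whether every one of them belongs to the exact subset implemented by the on-chip library is an assumption, since that subset is not listed here.

    #include <mpi.h>
    #include <stdio.h>

    /* Each worker sends a partial result to rank 0, which accumulates them.
     * Standard MPI calls only; no platform-specific code. */
    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int partial = rank * rank;          /* stand-in for real local work */

        if (rank == 0) {
            int sum = partial, recv;
            MPI_Status status;
            for (int src = 1; src < size; src++) {
                MPI_Recv(&recv, 1, MPI_INT, src, 0, MPI_COMM_WORLD, &status);
                sum += recv;
            }
            printf("sum of squares of ranks = %d\n", sum);
        } else {
            MPI_Send(&partial, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

The same source can be built and tested on a workstation with a conventional MPI distribution; the point of the chapter is that it also targets the on-chip stack unchanged.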


Even if further investigation is needed, the author believes that this Ph.D. dissertation, with all its engineering contributions, is a first step towards the required software-driven flexibility, enhancing programmability through high-level parallel programming models to achieve high-performance embedded computing.

7.2 Open Issues and Future Work

As described at the end of each chapter, there are a few specific future work items for each topic presented in this Ph.D. dissertation.

In Chapter 2, the main open issue is to extend our EDA tool, i.e. NoCMaker, in order to generate application-specific NoC-based MPSoC architectures.

In Chapter 3, the most important enhancement as future work will be to improve the NIs to handle as much inter-protocol interoperability and compatibility as possible (i.e. AHB, AXI, OCP, etc.). This will further increase IP reuse, as well as the transparent plug&play of commodity IPs with different protocols when designing next-generation NoC-based SoCs.

In Chapter 4, as future research, the idea is to provision runtime monitoring, DVFS, and dynamic power and thermal management services applying the same philosophy presented in that chapter, which focuses exclusively on providing runtime QoS using hardware support and software middleware routines.

In Chapter 5, as future work, the main target is to expand the functionality of the standalone single-precision FPU accelerator to support double-precision floating point operations, and to study in more depth the configuration of the register file to enable pipelined transactions over the interconnection fabric (i.e. hierarchical bus or NoC backbone).

In Chapter 6, as future research, the idea is to explore hybrid parallelization using OpenMP and MPI, and potential extensions and their interaction, to enable a better synergy with the next generation of embedded parallel heterogeneous architectures; the sketch below illustrates the shape such hybrid codes typically take.
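The following is a minimal sketch of hybrid MPI+OpenMP parallelization, using only standard MPI calls and a standard OpenMP worksharing directive. It is generic illustrative code, not taken from the dissertation's software stack, and it assumes a platform where both runtimes coexist, which is precisely the open research issue discussed here.

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    /* Hybrid decomposition: MPI distributes coarse-grained chunks among
     * clusters/nodes, while OpenMP parallelizes the loop inside each of them. */
    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        enum { N = 1 << 16 };
        static double data[N];
        int chunk = N / size;          /* assumes N divisible by the rank count */
        int first = rank * chunk;

        double local = 0.0;
        /* Fine-grained, shared-memory parallelism within each MPI process. */
        #pragma omp parallel for reduction(+:local)
        for (int i = first; i < first + chunk; i++) {
            data[i] = (double)i * 0.5;
            local  += data[i];
        }

        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("total = %f (MPI ranks = %d, OpenMP threads per rank = %d)\n",
                   total, size, omp_get_max_threads());

        MPI_Finalize();
        return 0;
    }

The open question for embedded heterogeneous platforms is how the two runtimes should share the NoC, the memories and the QoS services, rather than the structure of the code itself.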

The industry has manufactured different experimental research prototypes of many-core architectures using regular 2D mesh NoC backbones (e.g. Ambric [255, 256], Tilera [257], and the 80-core Polaris [200, 258]). Very recently, within Intel's Tera-scale Computing Research Program [259], a fully compliant x86 Intel Architecture 48-core Single-Chip Cloud Computer (SCC) [260] and the Larrabee GPGPU [261] for visual computing were launched, both an evolution of their predecessor Polaris.

Even if industry manufactures experimental many-core chips, there is still a big gap in research to enable the easy programmability and reconfigurability of these many-core chips through high-level parallel programming models. As proposed in this Ph.D. dissertation, some of them include hardware support to enable message passing communication and synchronization models and equivalent stream-based software stacks.

However, this open issue will become even more challenging as the multi-core execution platform tends to be asymmetric or more heterogeneous. As a consequence, more research and development must be done in this direction, in order to define potentially hybrid or adaptive parallel programming models and runtime layers which can exploit, in different scenarios, all the features present on the hardware execution platform.

In Figure 7.3, we highlight the prediction made by The International Technology Roadmap for Semiconductors, which clearly lists future directions, depicting the increasing importance of software costs in future SoC design. Thus, in the next years we will shift towards heterogeneous parallel processing and, further on, towards system design automation, going through the generation of HW-SW intelligent testbenches and runtime debuggers at higher levels of abstraction, since verification still consumes most of the time when designing complex HW-SW multi-core systems.

Figure 7.3: HW-SW design costs on future SoC platforms (from [1])

Thus, there are several exciting and challenging long-term open issues and research lines, especially in the parallel programming software area. In this section, we summarize the current work in progress and the future research directions:

• Runtime reconfiguration of NoC services (thermal and power management, performance monitoring and debugging, service discovery, reliability and fault tolerance, ...) using hardware support and middleware software libraries.

• Hybrid programming of future 2D/3D NoC-based MPSoCs or many-core chips.

• Intelligent functional verification and runtime system-level debugging through hardware and software components.

• Adapting existing HPC debugging and performance analysis tools, in conjunction with the presented parallel programming models, to the embedded domain.

Thus, future parallelization on heterogeneous multi-core execution platforms must be capable of dynamically adapting, once the application is compiled, to the underlying hardware (2D or 3D chips), by changing the number of cores or functional units, modifying the NoC design space parameters, switching between power modes, as well as tracking potential HW or SW bugs, etc. In addition to the static parallelization framework demonstrated in this thesis, an OS or a runtime system layer will be required to enable runtime application reconfiguration over time, re-evaluating the initial static parallelism strategy and execution decisions whenever required.

APPENDIX A

Author's Relevant Publications

This appendix lists the works in which the author has been involved and which are relevant to the subject of this dissertation.

• Jaume Joven et al., "Shareable Compiler-unaware FPU Accelerator on an Extensible Symmetric Cortex-M1 Cluster on-a-Chip", To be submitted to DSD 2010.

• Jaume Joven et al., "Application-to-Packets QoS Support - A Vertical Integrated Approach for NoC-based MPSoC", To be resubmitted to CODES+ISSS 2010.

• Jaume Joven, ”A Lightweight MPI-based Programming Model and its HW Support for NoC-based MPSoCs”, PhD Forum DATE, IEEE/ACM Design, Automation and Test in Europe (DATE’09), April 2009, Nice, France.

• Jaume Joven, David Castells-Rufas, Sergi Risueño, Eduard Fernandez, Jordi Carrabina, "NoCMaker & j2eMPI - A Complete HW-SW Rapid Prototyping EDA Tool for Design Space Exploration of NoC-based MPSoCs", IEEE/ACM Design, Automation and Test in Europe (DATE'09), April 2009, Nice, France.

• David Castells-Rufas, Jaume Joven, Sergi Risueño, Eduard Fernandez, Jordi Carrabina, "NocMaker: A Cross-Platform Open-Source Design Space Exploration Tool for Networks on Chip", 3rd Workshop on Interconnection Network Architectures: On-Chip, Multi-Chip (INA-OCMC'09), January 2009, Paphos, Cyprus.

• Jaume Joven, David Castells-Rufas, Jordi Carrabina, "An Efficient MPI Microtask Communication Stack for NoC-based MPSoC Architectures", Fourth International Summer School on Advanced Computer Architecture and Compilation for Embedded Systems (ACACES'08), July 2008, L'Aquila, Italy.

• Jaume Joven, Oriol Font-Bach, David Castells-Rufas, Ricardo Martinez, Lluis Teres, Jordi Carrabina, "xENoC - An eXperimental Network-On-Chip Environment for Parallel Distributed Computing on NoC-based MPSoC architectures", 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP'08), February 2008, Toulouse, France (best paper award).

• Jaume Joven, David Castells-Rufas, Jordi Carrabina. "HW-SW Framework for Parallel Distributed Computing on NoC-based MPSoC architectures", Third International Summer School on Advanced Computer Architecture and Compilation for Embedded Systems (ACACES'07), July 2007, L'Aquila, Italy.

• Jaume Joven, Jorge Luis Zapata, David Castells-Rufas, Jordi Carrabina. "Embedded Parallel Software Framework for on-chip Distributed Parallel Computing", XIII Taller Iberchip (IWS'07), March 2007, Lima, Peru.

• David Castells-Rufas, Jaume Joven, Jordi Carrabina. ”A Validation and Performance Evaluation Tool for ProtoNoC”. IEEE International Symposium on System-on-Chip (ISSoC’06), November 2006, Tampere, Finland.

• Jaume Joven, David Castells-Rufas, Jordi Carrabina. "HW-SW Framework for Distributed Parallel Computing on Programmable Chips". XXI Conference on Design of Circuits and Integrated Systems (DCIS'06), ISBN: 978-84-690-4144-4, November 2006, Barcelona, Spain.

• Jaume Joven, David Castells-Rufas, Jordi Carrabina. "Metodología de Codiseño Hardware-Software para la Computación Paralela Distribuida dentro de un chip". VI Jornadas de Computación Reconfigurable y Aplicaciones (JCRA'04), September 2006, Cáceres, Spain.

APPENDIX B

Curriculum Vitae

Jaume Joven Murillo was born on 12th April 1981 in Terrassa (Barcelona), Spain. He completed his high school and junior college education at Escola Pia de Terrassa in 1999.

Afterwards, he pursued his graduate education, obtaining his Master of Science in Computer Science in 2004 for "Components i Sistemes SoC per a bus AMBA i Especificació XML" at Universitat Autònoma de Barcelona (UAB). At the same time he worked as a HW-SW engineer and assistant researcher in the CEPHIS Lab, in cooperation with the EPSON Electronics Barcelona Design Center (EPSON-BDC).

In September 2005, he began his post-graduate studies in the Ph.D. program in Computer Science (specialization in Electronic and Microelectronics Circuits and Systems) at Universitat Autònoma de Barcelona (UAB). Afterwards, he received a Ph.D. grant awarded by the Agència de Gestió d'Ajuts Universitaris i de Recerca (AGAUR) in conjunction with D+T Microelectrónica A.I.E., a company associated with the Centre Nacional de Microelectrónica at the Institut de Microelectrónica de Barcelona (CNM-IMB).

In April 2007, he received his Master of Advanced Studies (MAS) for his work on "HW-SW Framework for Parallel Distributed Computing on NoC-based MPSoC architectures" at Universitat Autònoma de Barcelona (UAB).

During the Ph.D. he had the privilege of working for three months (Sep-Dec 2008) in the R&D Department of a world-class company, ARM Ltd. in Cambridge, after winning a contest for the ARM Ph.D. intern position. Moreover, during his research, he spent two six-month periods (Feb-Jul 2008 and 2009) as a visiting Ph.D. student at the Integrated Systems Laboratory (LSI) of École Polytechnique Fédérale de Lausanne (EPFL), thanks to a BE-2 grant awarded by AGAUR and to a HIPEAC Ph.D. collaboration grant, respectively.

Along his Ph.D., he also gained teaching experience, working sporadically at UAB as an Assistant Professor (2007-2008), teaching courses in the Computer Science, Telecommunications and Electronics degrees.

Finally, he received his Ph.D. for the dissertation titled "HW-SW Components for Parallel Embedded Computing on NoC-based MPSoCs" in January 2010 at Universitat Autònoma de Barcelona (UAB).

He has held various positions over the last five years, mostly as an embedded software engineer, hardware engineer and system designer. Through these roles, he has gained extensive experience in the fields of computer engineering, microelectronics and electronic design, as well as in telecommunications and communication systems.

Glossary

3G Third-generation - Is a third generation of standards and tech- nology, after 2G. It is based on the International Telecommunication Union (ITU) family of standards under the International Mobile Telecommunica- tions programme, IMT-2000. 3G technologies enable network operators to offer users a wider range of more advanced services while achieving greater network capacity through improved spectral efficiency. Services include wide-area wireless voice telephony and broadband wireless data, all in a mobile environment.

ALU Arithmetic and Logic Unit - Component of the CPU that performs calculations and logical operations.

API Application Programming Interface - Is the interface that a computer system, library or application provides in order to allow requests for services to be made of it by other computer programs, and/or to allow data to be exchanged between them.

ASIC Application-Specific Integrated Circuit - Is an integrated circuit (IC) customized for a particular application rather than intended for general-purpose use.

ASIP Application Specific Instruction-Set Processor - Is a methodology used in SoC design. It represents a compromise between ASIC and general purpose CPU. In ASIP, the instruction set provided by the core can be configured to fit the specific application. This configurability of the core provides a trade- off between flexibility and performance. Usually, the core is divided into two parts: static logic which defines a minimum ISA and configurable logic which can be used to design new instructions.

CMP Chip-Level Multiprocessing - In computing, chip-level multiprocessing is symmetric multiprocessing (SMP) implemented on a single VLSI integrated circuit. Multiple processor cores (multicore) typically share a common second- or third-level cache and interconnect.

CMT Chip Multithreading - Is the capability of a processor to process multiple software threads and to support simultaneous hardware threads of execution. CMT is achieved by having multiple cores on a single chip (sharing chip resources such as the memory controller and the L2 cache) or multiple threads on a single core.

DMA Direct Memory Access - Is a feature of modern computers that allows certain hardware subsystems within the computer to access system memory for reading and/or writing independently of the central processing unit. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without a DMA channel.

DSM Deep Sub-Micrometre Technology.

DSP Digital Signal Processor - Is a specialized microprocessor designed specif- ically for digital signal processing (sound, video streaming...) generally in real-time.

DVFS Dynamic Voltage Frequency Scaling - Is a technique in computer architec- ture whereby the frequency of a microprocessor can be automatically ad- justed on-the-fly, either to conserve power or to reduce the amount of heat generated by the chip.

EDA Electronic Design Automation - Is the category of tools for designing and producing electronic systems ranging from printed circuit boards (PCBs) to integrated circuits (ICs). This is sometimes referred to as ECAD (electronic computer-aided design) or just CAD.

ESL Electronic System Level - An emerging electronic design methodology for SoCs which focuses on the higher abstraction level concerns first and fore- most. ESL is evolving into a set of complementary methodologies that en- able embedded system design, verification, and debugging through to the hardware and software implementation of custom SoC, system-on-FPGA, system-on-board, and entire multi-board systems. ESL can be accomplished through the use of SystemC as an abstract modeling language.

FFT Fast Fourier Transform - Is an efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse. FFTs are of great importance to a wide variety of applications, from digital signal processing to solving partial differential equations to algorithms for quickly multiplying large integers.

FPGA Field Programmable Gate Array - A field programmable gate array (FPGA) is a semiconductor device containing programmable logic components and programmable interconnects. The programmable logic components can be programmed to duplicate the functionality of basic logic gates such as AND, OR, XOR, NOT or more complex combinational functions such as decoders or simple math functions. In most FPGAs, these programmable logic components (or logic blocks, in FPGA parlance) also include memory elements, which may be simple flip-flops or more complete blocks of memory.

FPU Floating Point Unit - Is a part of a computer system specially designed to carry out operations on floating point numbers. Typical operations are float- ing point arithmetic (such as addition and multiplication), but some systems may be capable of performing exponential or trigonometric calculations as well (such as square roots or cosines).

FSM Finite State Machine - Is a model of behavior composed of a finite number of states, transitions between those states, and actions.

GALS Globally Asynchronous Locally Synchronous Circuits - Is a system which does not have a global clock, but wherein each submodule is independently clocked.

GPS Global Positioning System - Is the only fully functional Global Naviga- tion Satellite System (GNSS). Utilizing a constellation of at least 24 medium Earth orbit satellites that transmit precise microwave signals, the system en- ables a GPS receiver to determine its location, speed, direction, and time.

GUI Graphical User Interface - Is a type of user interface which allows people to interact with a computer and computer-controlled devices which employ graphical icons, visual indicators or special graphical elements, along with text, labels or text navigation to represent the information and actions avail- able to a user.

HDL Hardware Description Language - In electronics, a hardware description language or HDL is any language from a class of computer languages for formal description of electronic circuits. It can describe the circuit’s opera- tion, its design, and tests to verify its operation by means of simulation.

HSDPA High-Speed Downlink Packet Access - Is a 3G (third generation) mo- bile telephony communications protocol in the High-Speed Packet Access (HSPA) family, which allows networks based on Universal Mobile Telecom- munications System (UMTS) to have higher data transfer speeds and capac- ity.

IC Integrated Circuit - A monolithic integrated circuit is a miniaturized elec- tronic circuit (consisting mainly of semiconductor devices, as well as passive components) which has been manufactured in the surface of a thin substrate of semiconductor material.

IP Intellectual Property core - Is a reusable hardware component, represented either as synthesizable HDL, or as hardware netlist, or as layout, part of a system-on-chip.

ISA Instruction Set Architecture - Is the part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O. An ISA includes a specification of the set of opcodes (machine language), the native commands implemented by a particular CPU design.

LUT Look-up Table - Is a data structure, usually an array or associative array, used to replace a runtime computation with a simpler lookup operation. The speed gain can be significant, since retrieving a value from memory is often faster than undergoing an expensive computation.

NoC Network-on-Chip - Is a new approach to design the communication subsys- tem of SoCs. NoC-based systems can accommodate multiple asynchronous clocking that many of today’s complex SoC designs use. NoC brings net- working theories and systematic networking methods to on-chip commu- nication and brings notable improvements over conventional bus systems. NoC greatly improve the scalability of SoCs, and shows higher power effi- ciency in complex SoCs compared to buses.

NUMA Non Uniform Memory Access - Is a computer memory design used in multiprocessors, where the memory access time depends on the memory location relative to a processor. Under NUMA, a processor can access its own local memory faster than non-local memory, that is, memory local to another processor or memory shared between processors.

OCB On-chip Bus - Term used for the hierarchical or shared on-chip communication buses used in SoCs in order to interconnect IP cores (e.g. AMBA, Avalon, IBM CoreConnect, STBus, etc.).

ocMPI on-chip Message Passing Interface - Lightweight version of MPI presented in this dissertation, targeted at the emerging NoC-based MPSoCs.

QoS Quality of Service - In the fields of packet-switched networks and com- puter networking, the traffic engineering term Quality of Service, refers to resource reservation control mechanisms. Quality of Service can provide different priority to different users or data flows, or guarantee a certain level of performance to a data flow in accordance with requests from the applica- tion program or the internet service provider policy.

RTL Register Transfer Level - Is a way of describing the operation of a digital circuit. In RTL design, a circuit’s behavior is defined in terms of the flow of signals or transfer of data between registers, and the logical operations performed on those signals. RTL is used in HDLs like Verilog and VHDL to create high-level representations of a circuit, from which lower-level repre- sentations and ultimately actual wiring can then be derived.

SDR Software-Defined Radio - Is a radio communication system which can tune to any frequency band and receive any modulation across a large frequency spectrum by means of programmable hardware which is controlled by software. An SDR performs significant amounts of signal processing in a general purpose computer, or in a reconfigurable piece of digital electronics. The goal of this design is to produce a radio that can receive and transmit a new form of radio protocol just by running new software.

SMP Symmetric Multiprocessing - Is a multiprocessor computer architecture where two or more identical processors can connect to a single shared main memory.

SoC System-on-Chip - Is an idea of integrating all components of a computer or other electronic system into a single chip. It may contain digital, analog, mixed-signal, and often radio-frequency functions all on one chip. A typical application is in the area of embedded systems.

TDMA Time Division Multiple Access - Is a channel access method for shared medium (usually radio) networks. It allows several users to share the same frequency channel by dividing the signal into different timeslots.

TLM Transaction-Level Modeling - Is a high-level approach to modeling digi- tal systems where details of communication among modules are separated from the details of the implementation of functional units or of the commu- nication architecture. At the transaction level, the emphasis is more on the functionality of the data transfers - what data are transferred to and from what locations - and less on their actual implementation, that is, on the ac- tual protocol used for data transfer.

UART Universal Asynchronous Receiver/Transmitter - Is a type of asynchronous receiver/transmitter, a piece of computer hardware that translates data be- tween parallel and serial interfaces. Used for serial data telecommunication (i.e. serial ports), a UART converts bytes of data to and from asynchronous start-stop bit streams represented as binary electrical impulses.

UMA Uniform Memory Access - Is a computer memory architecture used in par- allel computers having multiple processors and probably multiple memory chips. All the processors in the UMA model share the physical memory uni- formly. Cache memory may be private for each processor. In a UMA archi- tecture, accessing time to a memory location is independent from which pro- cessor makes the request or which memory chip contains the target memory data.

UML Unified Modelling Language - In software engineering, is a non-proprietary specification language for object modelling. UML is a general-purpose modelling language that includes a standardized graphical notation used to create an abstract model of a system, referred to as a UML model. UML is extendable as it offers a profile mechanism for customization.

UMTS Universal Mobile Telecommunications System - Is one of the third- generation (3G) cell phone technologies. Currently, the most common form uses W-CDMA as the underlying air interface, is standardized by the 3GPP, and is the European answer to the ITU IMT-2000 requirements for 3G cellu- lar radio systems.

VLIW Very Long Instruction Word - Refers to a CPU architectural approach to exploiting instruction-level parallelism (ILP). Performance can be improved by executing different sub-steps of sequential instructions simultaneously (this is pipelining), or even by executing multiple instructions entirely simultaneously, as in superscalar architectures. Further improvement can be realized by executing instructions in an order different from the order in which they appear in the program; this is called out-of-order execution.

Bibliography

[1] “The International Technology Roadmap for Semiconductors,” Tech. Rep., 2007, http://www.itrs.net/Links/2007ITRS/Home2007.htm. [2] L. Teres, Y. Torroja, S. Olcoz, and E. Villar, VHDL, Lenguaje Estndar de Diseo Electrnico. McGraw-Hill/Interamericana de Espana,˜ 1998. [3] W. J. Dally and B. Towles, “Route Packets, Not Wires: On-Chip In- terconnection Networks,” in Proceedings of the 38th Design Automation Conference, June 2001, pp. 684–689. [4] A. Sangiovanni-Vincentelli and G. Martin, “Platform-Based Design and Software Design Methodology for Embedded Systems,” IEEE De- sign and Test of Computers, vol. 18, no. 6, pp. 23–33, 2001. [5] J. pekka Soininen, A. Jantsch, M. Forsell, A. Pelkonen, J. Kreku, and S. Kumar, “Extending Platform-Based Design to Network on Chip Systems,” in Chip Systems, Proceedings International Conference on VLSI Design, 2003, pp. 46–55. [6] A. Jerraya and W. Wolf, Multiprocessor Systems-on-Chips. Morgan Kaufmann, Elsevier, 2005. [7] W. Wolf, A. A. Jerraya, and G. Martin, “Multiprocessor System-on- Chip (MPSoC) Technology,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 27, no. 10, pp. 1701–1713, Oct. 2008. [8] M. D. Hill and M. R. Marty, “Amdahl’s Law in the Multicore Era,” Computer, vol. 41, no. 7, pp. 33–38, 2008. [9] J. Hu, Y. Deng, and R. Marculescu, “System-Level Point-to-Point Communication Synthesis Using Floorplanning Information,” in Proc. 7th Asia and South Pacific and the 15th International Conference on VLSI Design Design Automation Conference Proceedings ASP-DAC 2002, Jan. 7–11, 2002, pp. 573–579. [10] M. Birnbaum and H. Sachs, “How VSIA answers the SOC dilemma,” IEEE Computer, vol. 32, no. 6, pp. 42–50, June 1999. [11] Virtual Socket Interface Alliance (VSIA), “On-chip Bus Attributes and Virtual Component Interface,” http://www.vsi.org/. [12] OCP International Partnership (OCP-IP), “Open Core Protocol Stan- dard,” 2003, http://www.ocpip.org/home. [13] P. Guerrier and A. Greiner, “A generic architecture for on-chip packet- switched interconnections,” in DATE ’00: Proceedings of the conference on Design, automation and test in Europe. New York, NY, USA: ACM, 2000, pp. 250–256. [14] “ARM AMBA 2.0 AHB-APB Overview,” ARM Ltd., 2005, http://www.arm.com/products/system-ip/interconnect/amba-design-kit.php. [15] AMBA Specification, ARM Inc., May 1999, http://www.arm.com/products/system-ip/amba/amba-open-specifications.php. [16] “Aeroflex Gaisler,” http://www.gaisler.com/. [17] “IBM CoreConnect Bus Architecture,” IBM, https://www-01.ibm.com/chips/techlib/techlib.nsf/pages/main. [18] “Xilinx MicroBlaze Soft Processor core,” Xilinx, http://www.xilinx.com/tools/microblaze.htm. [19] “Avalon Interface Specifications,” 2009, http://www.altera.com/literature/manual/mnl avalon spec.pdf. [20] “Altera Nios II Embedded Processor,” Altera, http://www.altera.com/literature/lit-nio2.jsp. [21] “Philips Nexperia - Highly Integrated Programmable System-on- Chip (MPSoC),” 2004, http://www.nxp.com. [22] S. Dutta, R. Jensen, and A. Rieckmann, “Viper: A Multiprocessor SOC for Advanced Set-Top Box and Digital TV Systems,” IEEE Design and Test of Computers, vol. 18, no. 5, pp. 21–31, 2001. [23] (TI), “OMAP Platform,” 2004, http://focus.ti.com/omap/docs/. [24] “ST Nomadik Multimedia Processor,” 2004, http://www.st.com/stonline/prodpres/dedicate/proc/proc.htm.

[25] “The Cell project at IBM Research - Heterogeneous Chip Multiprocessing,” 2004, http://www.research.ibm.com/cell/.

[26] W. Wolf, “The Future of Multiprocessor Systems-on-Chips,” in Proc. DAC, June 2004, pp. 681–685.

[27] Multi-layer AHB Technical Overview, ARM Ltd., 2001, http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dvi0045b/index.html.

[28] “STBus Interconnect,” STMicroelectronics, http://www.st.com/.

[29] D. Wingard, “MicroNetwork-based Integration for SOCs,” in Proc. De- sign Automation Conference, 2001, pp. 673–677.

[30] E. Salminen, T. Kangas, T. D. Hämäläinen, J. Riihimäki, V. Lahtinen, and K. Kuusilinna, “HIBI Communication Network for System-on-Chip,” J. VLSI Signal Process. Syst., vol. 43, no. 2-3, pp. 185–205, 2006.

[31] S. Pasricha, N. D. Dutt, and M. Ben-Romdhane, “BMSYN: Bus Ma- trix Communication Architecture Synthesis for MPSoC,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 26, no. 8, pp. 1454– 1464, Aug. 2007.

[32] “AMBA 3 AXI overview,” ARM Ltd., 2005, http://www.arm.com/products/system-ip/interconnect/axi/index.php.

[33] F. Angiolini, P. Meloni, S. M. Carta, L. Raffo, and L. Benini, “A Layout- Aware Analysis of Networks-on-Chip and Traditional Interconnects for MPSoCs,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 26, no. 3, pp. 421–434, Mar. 2007.

[34] R. A. Shafik, P. Rosinger, and B. M. Al-Hashimi, “MPEG-based Perfor- mance Comparison between Network-on-Chip and AMBA MPSoC,” in Proc. 11th IEEE Workshop on Design and Diagnostics of Electronic Cir- cuits and Systems DDECS 2008, Apr. 16–18, 2008, pp. 1–6.

[35] C. Bartels, J. Huisken, K. Goossens, P. Groeneveld, and J. van Meer- bergen, “Comparison of An Æthereal Network on Chip and A Tradi- tional Interconnect for A Multi-Processor DVB-T System on Chip,” in Very Large Scale Integration, 2006 IFIP International Conference on, Oct. 2006, pp. 80–85.

[36] Arteris, “A Comparison of Network-on-Chip and Busses,” Arteris, Tech. Rep., 2005, http://www.arteris.com/noc whitepaper.pdf.

[37] K. C. Janac, “When is the use of a NoC Most Effective and Why,” in 7th International Forum on Application-Specific Multi-Processor SoC, 2007.

[38] E. Salminen, T. Kangas, V. Lahtinen, J. Riihimäki, K. Kuusilinna, and T. D. Hämäläinen, “Benchmarking Mesh and Hierarchical Bus Networks in System-on-Chip Context,” J. Syst. Archit., vol. 53, no. 8, pp. 477–488, 2007.

[39] L. Benini and G. De Micheli, “Networks on Chips: A new SoC Paradigm,” IEEE Computer, vol. 35, no. 1, pp. 70 – 78, January 2002.

[40] A. Jantsch and H. Tenhunen, Networks on chip, K. A. Publishers, Ed. Hingham, MA, USA: Kluwer Academic Publishers, 2003.

[41] L. Benini and G. D. Micheli, Networks on chips: Technology and Tools. San Francisco, CA, USA: Morgan Kaufmann Publishers, 2006.

[42] G. Mas and P. Martin, “Network-on-Chip: The Intelligence is in The Wire,” in Proc. IEEE International Conference on Computer Design: VLSI in Computers and Processors ICCD 2004, 2004, pp. 174–177.

[43] A. A. Jerraya and W. Wolf, “Hardware/Software Interface Codesign for Embedded Systems,” Computer, vol. 38, no. 2, pp. 63–69, Feb. 2005.

[44] J.-Y. Mignolet and R. Wuyts, “Embedded Multiprocessor Systems-on- Chip Programming,” Software, IEEE, vol. 26, no. 3, pp. 34–41, May- June 2009.

[45] D. Black and J. Donovan, SystemC: From the Ground Up. Kluwer Aca- demic Publishers, 2004.

[46] “Open SystemC Initiative - Defining & Advancing SystemC Stan- dards,” http://www.systemc.org/home/.

[47] “ModelSim® – Advanced Simulation and Debugging,” http://www.model.com.

[48] “Mentor Graphics® – Functional Verification,” http://www.mentor.com/products/fv/modelsim/.

[49] “GTKWave Electronic Waveform Viewer,” http://intranet.cs.man.ac.uk/apt/projects/tools/gtkwave/.

[50] J. Joven, O. Font-Bach, D. Castells-Rufas, R. Martinez, L. Teres, and J. Carrabina, “xENoC - An eXperimental Network-On-Chip Environment for Parallel Distributed Computing on NoC-based MPSoC Architectures,” in Proc. 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing PDP 2008, Feb. 13–15, 2008, pp. 141–148.

[51] D. Castells-Rufas, J. Joven, S. Risueño, E. Fernandez, and J. Carrabina, “NocMaker: A Cross-Platform Open-Source Design Space Exploration Tool for Networks on Chip,” in 3rd Workshop on Interconnection Network Architectures: On-Chip, Multi-Chip, 2009.

[52] D. Bertozzi and L. Benini, “Xpipes: A Network-on-Chip Architecture for Gigascale Systems-on-Chip,” vol. 4, no. 2, pp. 18–31, 2004.

[53] S. Stergiou, F. Angiolini, S. Carta, L. Raffo, D. Bertozzi, and G. De Micheli, “×pipes Lite: A Synthesis Oriented Design Library for Networks on Chips,” in Proc. Design, Automation and Test in Europe, 2005, pp. 1188–1193.

[54] L. Benini, D. Bertozzi, A. Bogliolo, F. Menichelli, and M. Olivieri, “MPARM: Exploring the Multi-Processor SoC Design Space with Sys- temC,” The Journal of VLSI Signal Processing, vol. 41, no. 2, pp. 169–182, September 2005.

[55] E. Beigne, F. Clermidy, P. Vivet, A. Clouard, and M. Renaudin, “An Asynchronous NoC Architecture Providing Low Latency Service and Its Multi-Level Design Framework,” in Proc. 11th IEEE International Symposium on Asynchronous Circuits and Systems ASYNC 2005, Mar. 14–16, 2005, pp. 54–63.

[56] D. T. Stephan Kubisch, Enrico Heinrich, “A Mesochronous Network-on-Chip for an FPGA,” Tech. Rep., 2007, http://www.imd.uni-rostock.de/veroeff/memics2007 paper.pdf.

[57] I. Loi, F. Angiolini, and L. Benini, “Developing Mesochronous Syn- chronizers to Enable 3D NoCs,” in Proc. Design, Automation and Test in Europe DATE ’08, Mar. 10–14, 2008, pp. 1414–1419.

[58] U. Y. Ogras, J. Hu, and R. Marculescu, “Key Research Problems in NoC Design: A Holistic Perspective,” in CODES+ISSS ’05: Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, 2005, pp. 69–74.

HW-SW Components for Parallel Embedded Computing on NoC-based MPSoCs 197 [59] S. Murali and G. De Micheli, “An Application-Specific Design Methodology for STbus Crossbar Generation,” in Proc. Design, Au- tomation and Test in Europe, 2005, pp. 1176–1181. [60] L. Bononi and N. Concer, “Simulation and Analysis of Network on Chip Architectures: Ring, Spidergon and 2D Mesh,” in Proc. Design, Automation and Test in Europe DATE ’06, vol. 2, Mar. 6–10, 2006, p. 6pp. [61] S. Murali and G. De Micheli, “SUNMAP: A Tool for Automatic Topol- ogy Selection and Generation for NoCs,” in Proc. 41st Design Automa- tion Conference, 2004, pp. 914–919. [62] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh, “Effect of Traf- fic Localization on Energy Dissipation in NoC-based Interconnect,” in Proc. IEEE International Symposium on Circuits and Systems ISCAS 2005, May 23–26, 2005, pp. 1774–1777. [63] M. Saldana,˜ L. Shannon, and P. Chow, “The Routability of Multipro- cessor Network Topologies in FPGAs,” in SLIP ’06: Proceedings of the 2006 international workshop on System-level interconnect prediction, 2006, pp. 49–56. [64] A. Hansson, K. Goossens, and A. Radulescu,ˇ “A Unified Approach to Constrained Mapping and Routing on Network-on-Chip Architec- tures,” in CODES+ISSS ’05: Proceedings of the 3rd IEEE/ACM/IFIP in- ternational conference on Hardware/software codesign and system synthesis, 2005, pp. 75–80. [65] J. Hu and R. Marculescu, “Exploiting the Routing Flexibility for Ener- gy/Performance Aware Mapping of Regular NoC Architectures,” in DATE ’03: Proceedings of the conference on Design, Automation and Test in Europe, 2003. [66] S. Bourduas and Z. Zilic, “A Hybrid Ring/Mesh Interconnect for Network-on-Chip Using Hierarchical Rings for Global Routing,” in Proc. First International Symposium on Networks-on-Chip NOCS 2007, May 7–9, 2007, pp. 195–204. [67] T. Bjerregaard, “The MANGO Clockless Network-on-Chip: Con- cepts and Implementation,” Ph.D. dissertation, Informatics and Mathematical Modelling, Technical University of Denmark, DTU, Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby, 2005, supervised by Assoc. Prof. Jens Sparsø, IMM. [Online]. Available: http://www2.imm.dtu.dk/pubdb/p.php?4025

198 HW-SW Components for Parallel Embedded Computing on NoC-based MPSoCs [68] S. Murali, P. Meloni, F. Angiolini, D. Atienza, S. Carta, L. Benini, G. De Micheli, and L. Raffo, “Designing Application-Specific Net- works on Chips with Floorplan Information,” in ICCAD ’06: Proceed- ings of the 2006 IEEE/ACM international conference on Computer-aided design, 2006, pp. 355–362. [69] A. Jalabert, S. Murali, L. Benini, and G. De Micheli, “×pipesCompiler: A Tool for Instantiating Application-Specific Networks on Chip,” in Proc. Design, Automation and Test in Europe Conference and Exhibition, vol. 2, 2004, pp. 884–889. [70] L. Benini, “Application specific NoC Design,” in DATE ’06: Proceed- ings of the conference on Design, automation and test in Europe, 2006, pp. 491–495. [71] A. Mello, L. C. Ost, F. G. Moraes, and N. L. Calazans., “Evalua- tion of Routing Algorithms on Mesh Based NoCs,” Technical Re- port TR2000-02, University of Texas at Austin, USA, Tech. Rep., 2004, http://www.inf.pucrs.br/tr/tr040.pdf. [72] W. Dally and B. Towles, Principles and Practices of Interconnection Net- works. Morgan Kaufmann Publishers Inc., 2004. [73] G.-M. Chiu, “The Odd-Even Turn Model for Adaptive Routing,” IEEE Trans. Parallel Distrib. Syst., vol. 11, no. 7, pp. 729–738, July 2000. [74] J. Hu and R. Marculescu, “DyAD - Smart Routing for Networks-on- Chip,” in Proc. 41st Design Automation Conference, 2004, pp. 260–263. [75] Y. H. Song and T. M. Pinkston, “A Progressive Approach to Handling Message-Dependent Deadlock in Parallel Computer Systems,” IEEE Trans. Parallel Distrib. Syst., vol. 14, no. 3, pp. 259–275, Mar. 2003. [76] C. J. Glass and L. M. Ni, “The Turn Model for Adaptive Routing,” in Proc. 19th Annual International Symposium on Computer Architecture, May 1992, pp. 278–287. [77] M. K. F. Schafer, T. Hollstein, H. Zimmer, and M. Glesner, “Deadlock- Free Routing and Component Placement for Irregular Mesh-based Networks-on-Chip,” in Proc. ICCAD-2005 Computer-Aided Design IEEE/ACM International Conference on, Nov. 6–10, 2005, pp. 238–245. [78] E. Bolotin, I. Cidon, R. Ginosar, and A. Kolodny, “Routing Table Min- imization for Irregular Mesh NoCs,” in DATE ’07: Proceedings of the conference on Design, automation and test in Europe, 2007, pp. 942–947.

[79] A. Hansson, K. Goossens, and A. Radulescu, “UMARS: A Unified Approach to Mapping and Routing in a Combined Guaranteed Service and Best-Effort Network-on-Chip Architecture,” Philips Research Technical Report 2005/00340, Tech. Rep., April 2005.

[80] Duato, J., “A New Theory of Deadlock-Free Adaptive Routing in Wormhole Networks,” IEEE Trans. Parallel Distrib. Syst., vol. 4, no. 12, pp. 1320–1331, Dec. 1993.

[81] S. Taktak, J.-L. Desbarbieux, and E. Encrenaz, “A Tool for Automatic Detection of Deadlock in Wormhole Networks on Chip,” ACM Trans. Des. Autom. Electron. Syst., vol. 13, no. 1, pp. 1–22, 2008.

[82] Duato, J., “A Theory of Fault-Tolerant Routing in Wormhole Net- works,” IEEE Trans. Parallel Distrib. Syst., vol. 8, no. 8, pp. 790–802, Aug. 1997.

[83] S. Murali, P. Meloni, F. Angiolini, D. Atienza, S. Carta, L. Benini, G. De Micheli, and L. Raffo, “Designing Message-Dependent Dead- lock Free Networks on Chips for Application-Specific Systems on Chips,” in Proc. IFIP International Conference on Very Large Scale Inte- gration, Oct. 16–18, 2006, pp. 158–163.

[84] K. Goossens, J. Dielissen, O. P. Gangwal, S. G. Pestana, A. Radulescu, and E. Rijpkema, “A design flow for application-specific networks on chip with guaranteed performance to accelerate soc design and verifi- cation,” in DATE ’05: Proceedings of the conference on Design, Automation and Test in Europe, 2005, pp. 1182–1187.

[85] A. Hansson, K. Goossens, and A. Rădulescu, “Avoiding Message-Dependent Deadlock in Network-Based Systems on Chip,” VLSI Design, 2007, http://repository.tudelft.nl/file/1225531/382523.

[86] K. Goossens, J. Dielissen, and A. Radulescu, “Æthereal Network on Chip: Concepts, Architectures, and Implementations,” IEEE Design Test of Computers, vol. 22, no. 5, pp. 414–421, Sept. 2005.

[87] E. Bolotin, I. Cidon, R. Ginosar, and A. Kolodny, “QNoC: QoS Archi- tecture and Design Process for Network on Chip,” Journal of System Architecture, vol. 50, pp. 105–128, 2004.

[88] E. Rijpkema, K. G. W. Goossens, A. Radulescu, J. Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander, “Trade Offs in the De- sign of a Router with Both Guaranteed and Best-Effort Services for

Networks on Chip,” in Proc. Design, Automation and Test in Europe Conference and Exhibition, 2003, pp. 350–355.

[89] T. Bjerregaard and J. Sparso, “A Scheduling Discipline for Latency and Bandwidth Guarantees in Asynchronous Network-on-Chip,” in Proc. 11th IEEE International Symposium on Asynchronous Circuits and Systems ASYNC 2005, Mar. 14–16, 2005, pp. 34–43.

[90] D. Siguenza-Tortosa and J. Nurmi, “Proteo: A New Approach to Network-on-Chip,” Malaga (Spain), Tech. Rep., 2002.

[91] T. Bjerregaard and J. Sparso, “A Router Architecture for Connection-Oriented Service Guarantees in the MANGO Clockless Network-on-Chip,” in Proc. Design, Automation and Test in Europe, 2005, pp. 1226–1231.

[92] M. D. Harmanci, N. P. Escudero, Y. Leblebici, and P. Ienne, “Provid- ing QoS to Connection-less Packet-switched NoC by Implementing DiffServ Functionalities,” in Proc. International Symposium on System- on-Chip, Nov. 16–18, 2004, pp. 37–40.

[93] D. Castells-Rufas, J. Joven, and J. Carrabina, “A Validation And Per- formance Evaluation Tool for ProtoNoC,” in Proc. International Sympo- sium on System-on-Chip, Nov. 13–16, 2006, pp. 1–4.

[94] M. Coenen, S. Murali, A. Radulescu, K. Goossens, and G. De Micheli, “A Buffer-Sizing Algorithm for Networks on Chip Using TDMA and Credit-Based End-to-End Flow Control,” in Proc. 4th international con- ference Hardware/software codesign and system synthesis CODES+ISSS ’06, Oct. 22–25, 2006, pp. 130–135.

[95] S. Kumar, A. Jantsch, J.-P. Soininen, M. Forsell, M. Millberg, J. Oberg, K. Tiensyrja, and A. Hemani, “A Network on Chip Architecture and Design Methodology,” in Proc. IEEE Computer Society Annual Sympo- sium on VLSI, Apr. 25–26, 2002, pp. 105–112.

[96] J. Hu and R. Marculescu, “Application-Specific Buffer Space Alloca- tion for Networks-on-Chip Router Design,” in Proc. ICCAD-2004 Com- puter Aided Design IEEE/ACM International Conference on, Nov. 7–11, 2004, pp. 354–361.

[97] I. Saastamoinen, M. Alho, and J. Nurmi, “Buffer Implementation for Proteo Network-on-Chip,” in Proc. International Symposium on Circuits and Systems ISCAS ’03, vol. 2, May 25–28, 2003, pp. II–113–II–116.

[98] R. Mullins, A. West, and S. Moore, “Low-Latency Virtual-Channel Routers for On-Chip Networks,” in Proc. 31st Annual International Symposium on Computer Architecture, June 19–23, 2004, pp. 188–197.

[99] A. Mello, L. Tedesco, N. Calazans, and F. Moraes, “Virtual Channels in Networks on Chip: Implementation and Evaluation on Hermes NoC,” in Proc. 18th Symposium on Integrated Circuits and Systems Design, Sept. 4–7, 2005, pp. 178–183.

[100] C. A. Zeferino and A. A. Susin, “SoCIN: A Parametric and Scalable Network-on-Chip,” in Proc. 16th Symposium on Integrated Circuits and Systems Design SBCCI 2003, Sept. 8–11, 2003, pp. 169–174.

[101] M. Millberg, E. Nilsson, R. Thid, S. Kumar, and A. Jantsch, “The Nostrum backbone-a communication protocol stack for Networks on Chip,” in Proc. 17th International Conference on VLSI Design, 2004, pp. 693–696.

[102] M. Dehyadgari, M. Nickray, A. Afzali-kusha, and Z. Navabi, “A New Protocol Stack Model for Network on Chip,” in Proc. IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architec- tures, vol. 00, Mar. 2–3, 2006, p. 3pp.

[103] K. Goossens, J. van Meerbergen, A. Peeters, and R. Wielage, “Net- works on Silicon: Combining Best-Effort and Guaranteed Services,” in Proc. Design, Automation and Test in Europe Conference and Exhibition, Mar. 4–8, 2002, pp. 423–425.

[104] M. Sgroi, M. Sheets, A. Mihal, K. Keutzer, S. Malik, J. Rabaey, and A. Sangiovanni-Vincentelli, “Addressing the System-on-a-Chip Inter- connect Woes Through Communication-Based Design,” in Proc. De- sign Automation Conference, 2001, pp. 667–672.

[105] H. Zimmermann, “OSI Reference Model - The ISO Model of Architec- ture for Open Systems Interconnection,” IEEE Trans. Commun., vol. 28, no. 4, pp. 425–432, Apr. 1980.

[106] A. Baghdadi, D. Lyonnard, N. Zergainoh, and A. Jerraya, “An Efficient Architecture Model for Systematic Design of Application-Specific Multiprocessor SoC,” in DATE ’01: Proceedings of the conference on Design, automation and test in Europe, 2001, pp. 55–63.

[107] D. Lyonnard, S. Yoo, A. Baghdadi, and A. A. Jerraya, “Automatic Generation of Application-Specific Architectures for Heterogeneous

Multiprocessor System-on-Chip,” in DAC ’01: Proceedings of the 38th annual Design Automation Conference, 2001, pp. 518–523.

[108] S. Kim and S. Ha, “Efficient Exploration of Bus-Based System-on-Chip Architectures,” IEEE Trans. VLSI Syst., vol. 14, no. 7, pp. 681–692, July 2006.

[109] “AMBA Designer,” ARM Ltd., 2004, http://www.arm.com/products/system-ip/amba-design-tools/amba-designer.php.

[110] Arteris S.A. – The Network on Chip Company, 2005, http://www.arteris.com.

[111] iNoCs SaRL – Structured Interconnects, 2007, http://www.inocs.com.

[112] Silistix Ltd. – CHAINworks, 2006, http://www.silistix.com.

[113] Sonics Inc – On Chip Communication Network for Advanced SoCs, 2004, http://www.sonicsinc.com.

[114] E. Salminen, A. Kulmala, and T. D. Hämäläinen, “Survey of Network-on-chip Proposals,” Tech. Rep., 2008.

[115] J. Xu, W. Wolf, J. Henkel, S. Chakradhar, and T. Lv, “A Case Study in Networks-on-Chip Design for Embedded Video,” in Proc. Design, Automation and Test in Europe Conference and Exhibition, vol. 2, Feb. 16– 20, 2004, pp. 770–775.

[116] G. Fen, W. Ning, and W. Qi, “Simulation and Performance Evaluation for Network on Chip Design Using OPNET,” in Proc. TENCON 2007 - 2007 IEEE Region 10 Conference, Oct. 2007, pp. 1–4.

[117] Y.-R. Sun, “Simulation and Evaluation for a Network on Chip Archi- tecture Using NS-2,” Master’s thesis, 2002.

[118] R. Lemaire, F. Clermidy, Y. Durand, D. Lattard, and A. A. Jerraya, “Performance Evaluation of a NoC-Based Design for MC-CDMA Telecommunications Using NS-2,” in Proc. 16th IEEE International Workshop on Rapid System Prototyping (RSP 2005), June 8–10, 2005, pp. 24–30.

[119] J. Bainbridge and S. Furber, “Chain: a Delay-Insensitive Chip Area Interconnect,” IEEE Micro, vol. 22, no. 5, pp. 16–23, Sept. 2002.

[120] D. Wiklund and D. Liu, “SoCBUS: Switched Network on Chip for Hard Real Time Embedded Systems,” in Proc. International Parallel and Distributed Processing Symposium, Apr. 22–26, 2003, p. 8pp.

[121] J. Chan and S. Parameswaran, “NoCGEN: A Template Based Reuse Methodology for Networks on Chip Architecture,” in Proc. 17th Inter- national Conference on VLSI Design, 2004, pp. 717–720.

[122] F. Moraes, N. Calazans, A. Mello, L. Möller, and L. Ost, “HERMES: An Infrastructure for Low Area Overhead Packet-switching Networks on Chip,” Integration, the VLSI Journal, vol. 38, no. 1, pp. 69–93, 2004.

[123] L. Ost, A. Mello, J. Palma, F. Moraes, and N. Calazans, “MAIA – A Framework for Networks on Chip Generation and Verification,” in Proc. Asia and South Pacific Design Automation Conference (ASP-DAC 2005), vol. 1, Jan. 18–21, 2005, pp. 49–52.

[124] D. Bertozzi, A. Jalabert, S. Murali, R. Tamhankar, S. Stergiou, L. Benini, and G. De Micheli, “NoC Synthesis Flow for Customized Domain Specific Multiprocessor Systems-on-Chip,” IEEE Trans. Parallel Distrib. Syst., vol. 16, no. 2, pp. 113–129, Feb. 2005.

[125] S. Pasricha, N. Dutt, and M. Ben-Romdhane, “Using TLM for Exploring Bus-based SoC Communication Architectures,” in Proc. 16th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP 2005), July 23–25, 2005, pp. 79–85.

[126] B. Talwar and B. Amrutur, “A SystemC-based Microarchitectural Exploration Framework for Latency, Power and Performance Trade-offs of On-Chip Interconnection Networks,” in First International Workshop on Network on Chip Architectures (NoCArc), 2008.

[127] H. Lebreton and P. Vivet, “Power Modeling in SystemC at Transaction Level, Application to a DVFS Architecture,” IEEE Computer Society, Los Alamitos, CA, USA, 2008, pp. 463–466.

[128] U. Y. Ogras, J. Hu, and R. Marculescu, “Communication-Centric SoC Design for Nanoscale Domain,” Tech. Rep., 2005.

[129] J. Joven, D. Castells-Rufas, S. Risueño, E. Fernandez, and J. Carrabina, “NoCMaker & j2eMPI – A Complete HW-SW Rapid Prototyping EDA Tool for Design Space Exploration of NoC-based MPSoCs,” in IEEE/ACM Design, Automation and Test in Europe, 2009.

[130] P. Bellows and B. Hutchings, “JHDL - An HDL for Reconfigurable Systems,” in Proc. IEEE Symposium on FPGAs for Custom Computing Machines, Apr. 15–17, 1998, pp. 175–184.

[131] B. Hutchings, P. Bellows, J. Hawkins, S. Hemmert, B. Nelson, and M. Rytting, “A CAD Suite for High-Performance FPGA Design,” in Proc. Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines FCCM ’99, Apr. 21–23, 1999, pp. 12–24.

[132] “Xilinx® Fast Simplex Link (FSL),” Xilinx, http://www.xilinx.com/products/ipcenter/FSL.htm.

[133] J. Duato, S. Yalamanchili, and L. Ni, Interconnection Networks: An Engineering Approach. Morgan Kaufmann Publishers, 2003.

[134] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh, “Performance Evaluation and Design Trade-Offs for Network-on-Chip Interconnect Architectures,” Computers, IEEE Transactions on, vol. 54, no. 8, pp. 1025–1040, Aug. 2005.

[135] E. Bolotin, A. Morgenshtein, I. Cidon, R. Ginosar, and A. Kolodny, “Automatic Hardware-Efficient SoC Integration by QoS Network on Chip,” in Proc. 11th IEEE International Conference on Electronics, Circuits and Systems ICECS 2004, Dec. 13–15, 2004, pp. 479–482.

[136] F. Ghenassia, Transaction-Level Modeling with SystemC: TLM Concepts and Applications for Embedded Systems. Springer-Verlag New York, Inc., 2006.

[137] G. Palermo and C. Silvano, “PIRATE: A Framework for Power/Performance Exploration of Network-on-Chip Architectures,” in PATMOS, 2004, pp. 521–531.

[138] J. Chan and S. Parameswaran, “NoCEE - Energy Macro-Model Extraction Methodology for Network on Chip Routers,” in ICCAD ’05: Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design, 2005, pp. 254–259.

[139] H.-S. Wang, X. Zhu, L.-S. Peh, and S. Malik, “Orion: A Power-Performance Simulator for Interconnection Networks,” in Microarchitecture, 2002. (MICRO-35). Proceedings. 35th Annual IEEE/ACM International Symposium on, 2002, pp. 294–305.

[140] A. B. Kahng, B. Li, L.-S. Peh, and K. Samadi, “ORION 2.0: A Fast and Accurate NoC Power and Area Model for Early-Stage Design Space Exploration,” in Proc. Design, Automation and Test in Europe Conference and Exhibition (DATE ’09), Apr. 20–24, 2009, pp. 423–428.

[141] R. Nagpal, A. Madan, A. Bhardwaj, and Y. N. Srikant, “INTACTE: An Interconnect Area, Delay, and Energy Estimation Tool for Microarchitectural Explorations,” in CASES ’07: Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems, 2007, pp. 238–247.

[142] “The Message Passing Interface (MPI) standard,” http://www.mcs.anl.gov/mpi/.

[143] “The uClinux Embedded Linux/Microcontroller Project,” http://www.uclinux.org.

[144] “embedded Configurable operating system (eCos),” http://ecos.sourceware.org/.

[145] “Altera Quartus II,” Altera, http://www.altera.com.

[146] “Xilinx Design Flow – ISE Design Suite,” Xilinx, http://www.xilinx.com/tools/designtools.htm.

[147] S. Lukovic and L. Fiorin, “An Automated Design Flow for NoC-based MPSoCs on FPGA,” in RSP ’08: Proceedings of the 19th IEEE/IFIP International Symposium on Rapid System Prototyping. IEEE Computer Society, 2008, pp. 58–64.

[148] “SoC Encounter,” Cadence, http://www.cadence.com.

[149] K. Keutzer, A. R. Newton, J. M. Rabaey, and A. Sangiovanni-Vincentelli, “System Level Design: Orthogonalization of Concerns and Platform-Based Design,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 19, no. 12, pp. 1523–1543, Dec. 2000.

[150] D. J. Culler, J. P. Singh, and A. Gupta, Parallel Computer Architecture: A Hardware/Software Approach. San Francisco, CA: Morgan Kaufmann, 1999.

[151] P. Steenkiste, “A High-Speed Network Interface for Distributed-Memory Systems: Architecture and Applications,” ACM Trans. Comput. Syst., vol. 15, no. 1, pp. 75–109, 1997.

[152] W. S. Lee, W. Dally, S. Keckler, N. Carter, and A. Chang, “An Efficient, Protected Message Interface,” Computer, vol. 31, no. 11, pp. 69–75, Nov. 1998.

[153] T. Callahan and S. Goldstein, “NIFDY: A Low Overhead, High Throughput Network Interface,” in Computer Architecture, 1995. Proceedings. 22nd Annual International Symposium on, Jun. 1995, pp. 230–241.

[154] A. Chien, M. Hill, and S. Mukherjee, “Design Challenges for High-Performance Network Interfaces,” Computer, vol. 31, no. 11, pp. 42–44, Nov. 1998.

[155] P. Bhojwani and R. Mahapatra, “Interfacing Cores with On-chip Packet-Switched Networks,” in Proc. 16th International Conference on VLSI Design, Jan. 4–8, 2003, pp. 382–387.

[156] J. Liang, S. Swaminathan, and R. Tessier, “aSOC: A Scalable, Single-Chip Communications Architecture,” in Proc. International Conference on Parallel Architectures and Compilation Techniques, Oct. 15–19, 2000, pp. 37–46.

[157] J. Liu, L.-R. Zheng, and H. Tenhunen, “Interconnect Intellectual Property for Network-on-Chip (NoC),” J. Syst. Archit., vol. 50, no. 2-3, pp. 65–79, 2004.

[158] V. Rantala, T. Lehtonen, P. Liljeberg, and J. Plosila, “Dual NI Architectures for Fault Tolerant NoC,” in Proc. Design, Automation and Test in Europe DATE ’09, 2009.

[159] T. Bjerregaard, S. Mahadevan, R. G. Olsen, and J. Sparsø, “An OCP Compliant Network Adapter for GALS-based SoC Design Using the MANGO Network-on-Chip,” in Proc. International Symposium on System-on-Chip, Nov. 2005, pp. 171–174.

[160] T. Bjerregaard and J. Sparsø, “Packetizing OCP Transactions in the MANGO Network-on-Chip,” in Proc. 9th EUROMICRO Conference on Digital System Design: Architectures, Methods and Tools DSD 2006, 2006, pp. 657–664.

[161] E. B. Jung, H. W. Cho, N. Park, and Y. H. Song, “SONA: An On-Chip Network for Scalable Interconnection of AMBA-based IPs,” in International Conference on Computational Science (4), 2006, pp. 244–251.

[162] S.-H. Hsu, Y.-X. Lin, and J.-M. Jou, “Design of a Dual-Mode NoC Router Integrated with Network Interface for AMBA-based IPs,” Nov. 2006, pp. 211–214.

[163] P. Martin, “Design of a Virtual Component Neutral Network-on-Chip Transaction Layer,” in Proc. Design, Automation and Test in Europe, 2005, pp. 336–337.

[164] A. Radulescu, J. Dielissen, S. Pestana, O. Gangwal, E. Rijpkema, P. Wielage, and K. Goossens, “An Efficient On-Chip NI Offering Guaranteed Services, Shared-Memory Abstraction, and Flexible Network Configuration,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 24, no. 1, pp. 4–17, 2005.

[165] Philips Semiconductors, “Device Transaction Level (DTL) Protocol Specification. Version 2.2,” July 2002.

[166] D. Ludovici, A. Strano, D. Bertozzi, L. Benini, and G. Gaydadjiev, “Comparing Tightly and Loosely Coupled Mesochronous Synchronizers in a NoC Switch Architecture,” in Networks-on-Chip, 2009. NoCS 2009. 3rd ACM/IEEE International Symposium on, May 2009, pp. 244–249.

[167] “Synplicity Synplify® Premier,” Synopsys, http://www.synopsis.com.

[168] W. J. Dally, “Virtual-Channel Flow Control,” IEEE Trans. Parallel Distrib. Syst., vol. 3, no. 2, pp. 194–205, Mar. 1992.

[169] S. Murali, M. Coenen, A. Radulescu, K. Goossens, and G. De Micheli, “A Methodology for Mapping Multiple Use-Cases onto Networks on Chips,” in Proc. Design, Automation and Test in Europe DATE ’06, vol. 1, Mar. 6–10, 2006, pp. 1–6.

[170] A. Pullini, F. Angiolini, S. Murali, D. Atienza, G. De Micheli, and L. Benini, “Bringing NoCs to 65 nm,” IEEE Micro, vol. 27, no. 5, pp. 75–85, Sept. 2007.

[171] T. Marescaux and H. Corporaal, “Introducing the SuperGT Network-on-Chip; SuperGT QoS: more than just GT,” in Proc. 44th ACM/IEEE Design Automation Conference DAC ’07, June 4–8, 2007, pp. 116–121.

[172] D. Andreasson and S. Kumar, “On Improving Best-Effort throughput by better utilization of Guaranteed-Throughput channels in an on-chip communication system,” in Proceedings of the 22nd IEEE Norchip Conference, 2004.

[173] ——, “Improving BE traffic QoS using GT slack in NoC systems,” in Proceedings of the 23rd IEEE Norchip Conference, vol. 23, 2005, pp. 44–47.

[174] A. Hansson and K. Goossens, “Trade-offs in the Configuration of a Network on Chip for Multiple Use-Cases,” in NOCS ’07: Proceedings of the First International Symposium on Networks-on-Chip, 2007, pp. 233–242.

[175] E. Carara, N. Calazans, and F. Moraes, “Managing QoS Flows at Task Level in NoC-Based MPSoCs,” in Proc. VLSI-SoC ’09 (to appear), 2009.

[176] A. Hansson, M. Wiggers, A. Moonen, K. Goossens, and M. Bekooij, “Enabling Application-Level Performance Guarantees in Network-Based Systems on Chip by Applying Dataflow Analysis,” IET Computers & Digital Techniques, vol. 3, no. 5, pp. 398–412, Sept. 2009.

[177] A. Hansson, M. Subburaman, and K. Goossens, “Aelite: A Flit-Synchronous Network on Chip with Composable and Predictable Services,” in Proc. Design, Automation and Test in Europe Conference and Exhibition (DATE ’09), Apr. 20–24, 2009, pp. 250–255.

[178] J. Wang, Y. Li, Q. Peng, and T. Tan, “A dynamic priority arbiter for Network-on-Chip,” in Industrial Embedded Systems, 2009. SIES ’09. IEEE International Symposium on, July 2009, pp. 253–256.

[179] M. Winter and G. P. Fettweis, “A Network-on-Chip Channel Allocator for Run-Time Task Scheduling in Multi-Processor System-on-Chips,” in DSD ’08: Proceedings of the 2008 11th EUROMICRO Conference on Digital System Design Architectures, Methods and Tools, 2008, pp. 133–140.

[180] O. Moreira, J. J.-D. Mol, and M. Bekooij, “Online Resource Management in a Multiprocessor with a Network-on-Chip,” in SAC ’07: Proceedings of the 2007 ACM symposium on Applied computing, 2007, pp. 1557–1564.

[181] Z. Lu and R. Haukilahti, “NOC Application Programming Interfaces: High Level Communication Primitives and Operating System Services for Power Management,” pp. 239–260, 2003.

[182] “Real-Time Operating System for Multiprocessor Systems (RTEMS),” http://www.rtems.com.

[183] A. Marongiu and L. Benini, “Efficient OpenMP Support and Extensions for MPSoCs with Explicitly Managed Memory Hierarchy,” in Proc. Design, Automation and Test in Europe Conference and Exhibition (DATE ’09), Apr. 20–24, 2009, pp. 809–814.

[184] “IEEE 754: Standard for Binary Floating-Point Arithmetic,” 1985, http://grouper.ieee.org/groups/754/.

[185] “ARM Cortex-R4(F) – ARM Processor,” ARM Ltd., 2006, http://www.arm.com/products/processors/cortex-r/cortex-r4.php.

[186] “ARM Cortex-M1 – ARM Processor,” ARM Ltd., 2007, http://www.arm.com/products/processors/cortex-m/cortex-m1.php.

[187] D. Kreindler, “How to implement double-precision floating-point on FPGAs,” Altera, Tech. Rep., 2007, http://www.pldesignline.com/202200714?printableArticle=true.

[188] D. Strenski, “FPGA Floating Point Performance – A Pencil and Paper Evaluation,” Cray Inc., Tech. Rep., 2007, http://www.hpcwire.com/features/FPGA Floating Point Performance.html.

[189] “Altera Floating Point Megafunctions,” Altera, http://www.altera.com/products/ip/dsp/arithmetic/m-alt-float-point.html.

[190] “Virtex-5 APU Floating-Point Unit v1.01a,” Xilinx, http://www.xilinx.com/support/documentation/ip documentation/apu fpu virtex5.pdf.

[191] “Floating-Point Operator v5.0,” Xilinx, http://www.xilinx.com/support/documentation/ip documentation/floating point ds335.pdf.

[192] “TIE Application Notes and Examples,” Tensilica Inc., 2006, http://www.tensilica.com/products/literature-docs/application-notes/tie-application-notes.htm.

[193] G. Ezer, “Xtensa with User Defined DSP Coprocessor Microarchitectures,” in Computer Design, 2000. Proceedings. 2000 International Conference on, 2000, pp. 335–342.

[194] “Cortex™-R4 and Cortex-R4F Technical Reference Manual (Revision: r1p3),” ARM Ltd., Tech. Rep., 2009, http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0363e/index.html.

[195] M. J. Flynn, “Some Computer Organizations and Their Effectiveness,” IEEE Trans. Comput., no. 9, pp. 948–960, Sept. 1972.

[196] “PVM (Parallel Virtual Machine),” http://www.csm.ornl.gov/pvm/.

[197] P. Kongetira, K. Aingaran, and K. Olukotun, “Niagara: A 32-way Multithreaded Sparc Processor,” IEEE Micro, vol. 25, no. 2, pp. 21–29, Mar. 2005.

[198] A. S. Leon, J. L. Shin, K. W. Tam, W. Bryg, F. Schumacher, P. Kongetira, D. Weisner, and A. Strong, “A Power-Efficient High-Throughput 32-Thread SPARC Processor,” in Proc. Digest of Technical Papers. IEEE International Solid-State Circuits Conference ISSCC 2006, Feb. 6–9, 2006, pp. 295–304.

[199] U. M. Nawathe, M. Hassan, L. Warriner, K. Yen, B. Upputuri, D. Greenhill, A. Kumar, and H. Park, “An 8-Core 64-Thread 64b Power-Efficient SPARC SoC,” in Proc. Digest of Technical Papers. IEEE International Solid-State Circuits Conference ISSCC 2007, Feb. 11–15, 2007, pp. 108–590.

[200] S. R. Vangal, J. Howard, G. Ruhl, S. Dighe, H. Wilson, J. Tschanz, D. Finan, A. Singh, T. Jacob, S. Jain, V. Erraguntla, C. Roberts, Y. Hoskote, N. Borkar, and S. Borkar, “An 80-Tile Sub-100-W TeraFLOPS Processor in 65-nm CMOS,” IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 29–41, Jan. 2008.

[201] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam, PVM: Parallel Virtual Machine: A Users’ Guide and Tutorial for Network Parallel Computing. MIT Press, 1994.

[202] W. Gropp and E. Lusk, “The MPI Communication Library: its Design and a Portable Implementation,” in Proc. Scalable Parallel Libraries Conference, Oct. 6–8, 1993, pp. 160–165.

[203] D. W. Walker and J. J. Dongarra, “MPI: A Standard Message Passing Interface,” Supercomputer, vol. 12, pp. 56–68, 1996.

[204] “Open MPI: Open Source High Performance Computing,” http://www.open-mpi.org/.

[205] “The OpenMP API Specification for Parallel Programming,” http://www.openmp.org/.

[206] M. Herlihy and J. E. B. Moss, “Transactional Memory: Architectural Support For Lock-free Data Structures,” in Proc. 20th Annual International Symposium on Computer Architecture, May 16–19, 1993, pp. 289–300.

[207] T. Harris and K. Fraser, “Language Support for Lightweight Transactions,” in OOPSLA ’03: Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, 2003, pp. 388–402.

[208] L. Hammond, V. Wong, M. Chen, B. D. Carlstrom, J. D. Davis, B. Hertzberg, M. K. Prabhu, H. Wijaya, C. Kozyrakis, and K. Olukotun, “Transactional Memory Coherence and Consistency,” in Proc. 31st Annual International Symposium on Computer Architecture, June 19–23, 2004, pp. 102–113.

[209] A. McDonald, B. D. Carlstrom, J. Chung, C. C. Minh, H. Chafi, C. Kozyrakis, and K. Olukotun, “Transactional Memory: The Hardware-Software Interface,” IEEE Micro, vol. 27, no. 1, pp. 67–76, Jan. 2007.

[210] B. Lewis and D. J. Berg, Threads Primer: A Guide to Multithreaded Programming. Prentice Hall Press, 1995.

[211] V. Pillet, J. Labarta, T. Cortés, and S. Girona, “PARAVER: A Tool to Visualize and Analyse Parallel Code,” in Proceedings of WoTUG-18: Transputer and occam Developments, volume 44 of Transputer and Occam Engineering, 1995, pp. 17–31.

[212] W. E. Nagel, A. Arnold, M. Weber, H.-C. Hoppe, and K. Solchenbach, “VAMPIR: Visualization and Analysis of MPI Resources,” Supercomputer, vol. 12, pp. 69–80, 1996.

[213] H. Brunst, H.-C. Hoppe, W. E. Nagel, and M. Winkler, “Performance Optimization for Large Scale Computing: The Scalable VAMPIR Approach,” pp. 751–760, 2001.

[214] S. Vinoski, “New Features for CORBA 3.0,” Commun. ACM, vol. 41, no. 10, pp. 44–52, 1998.

[215] W. Thies, M. Karczmarek, M. Gordon, D. Maze, J. Wong, H. Hoffmann, M. Brown, and S. Amarasinghe, “StreamIt: A Compiler for Streaming Applications,” Tech. Rep., 2001.

[216] P. G. Paulin, C. Pilkington, M. Langevin, E. Bensoudane, and G. Nicolescu, “Parallel Programming Models for a Multi-Processor SoC Platform Applied to High-Speed Traffic Management,” in Proc. International Conference on Hardware/Software Codesign and System Synthesis CODES + ISSS 2004, 2004, pp. 48–53.

[217] B. Nichols, D. Buttlar, and J. P. Farrell, Pthreads Programming. O’Reilly, 1998.

[218] F. Rincón, F. Moya, J. Barba, D. Villa, F. J. Villanueva, and J. C. López, “A New Model for NoC-based Distributed Heterogeneous System Design,” in PARCO’05: International Conference on Parallel Computing, 2005, pp. 777–784.

[219] C. Ferri, T. Moreshet, R. I. Bahar, L. Benini, and M. Herlihy, “A Hardware/Software Framework for Supporting Transactional Memory in a MPSoC Environment,” vol. 35, no. 1, 2007, pp. 47–54.

[220] “ARM11 MPCore Processors,” ARM Ltd., 2004, http://www.arm.com/products/processors/classic/arm11/arm11-mpcore.php.

[221] Q. Meunier and F. Pétrot, “Lightweight Transactional Memory Systems for Large Scale Shared Memory MPSoCs,” in Proc. Joint IEEE North-East Workshop on Circuits and Systems and TAISA Conference NEWCAS-TAISA ’09, June 2009, pp. 1–4.

[222] M. Forsell, “A Scalable High-Performance Computing Solution for Networks on Chips,” IEEE Micro, vol. 22, no. 5, pp. 46–55, Sept. 2002.

[223] W.-C. Jeun and S. Ha, “Effective OpenMP Implementation and Translation For Multiprocessor System-On-Chip without Using OS,” in Proc. Asia and South Pacific Design Automation Conference ASP-DAC ’07, Jan. 23–26, 2007, pp. 44–49.

[224] K. O’Brien, K. O’Brien, Z. Sura, T. Chen, and T. Zhang, “Supporting OpenMP on Cell,” International Journal of Parallel Programming, vol. 36, no. 3, pp. 289–311, 2008.

[225] F. Liu and V. Chaudhary, “Extending OpenMP for Heterogeneous Chip Multiprocessors,” in Proc. International Conference on Parallel Processing, 2003, pp. 161–168.

[226] A. A. Jerraya, A. Bouchhima, and F. Pétrot, “Programming Models and HW-SW Interfaces Abstraction for Multi-Processor SoC,” in DAC ’06: Proceedings of the 43rd annual Design Automation Conference. New York, NY, USA: ACM, 2006, pp. 280–285.

[227] L. Kriaa, A. Bouchhima, M. Gligor, A. Fouillart, F. Pétrot, and A. Jerraya, “Parallel Programming of Multi-processor SoC: A HW-SW Interface Perspective,” International Journal of Parallel Programming, vol. 36, no. 1, pp. 68–92, 2008.

[228] M. Youssef, S. Yoo, A. Sasongko, Y. Paviot, and A. Jerraya, “Debugging HW/SW Interface for MPSoC: Video Encoder System Design Case Study,” in DAC ’04: Proceedings of the 41st Design Automation Conference, 2004, pp. 908–913.

[229] M. Bonaciu, S. Bouchhima, W. Youssef, X. Chen, W. Cesario, and A. Jerraya, “High-Level Architecture Exploration for MPEG4 Encoder with Custom Parameters,” in Proc. Asia and South Pacific Conference on Design Automation, Jan. 24–27, 2006, 6 pp.

[230] P. Francesco, P. Antonio, and P. Marchal, “Flexible Hardware/Software Support for Message Passing on a Distributed Shared Memory Architecture,” in DATE ’05: Proceedings of the conference on Design, Automation and Test in Europe, 2005, pp. 736–741.

[231] A. Dalla Torre, M. Ruggiero, and L. Benini, “MP-Queue: An Efficient Communication Library for Embedded Streaming Multimedia Platforms,” in Embedded Systems for Real-Time Multimedia, 2007. ESTIMedia 2007. IEEE/ACM/IFIP Workshop on, Oct. 2007, pp. 105–110.

[232] M. Ohara, H. Inoue, Y. Sohda, H. Komatsu, and T. Nakatani, “MPI Microtask for Programming the Cell Broadband Engine™ Processor,” IBM Syst. J., vol. 45, no. 1, pp. 85–102, 2006.

[233] J. A. Williams, I. Syed, J. Wu, and N. W. Bergmann, “A Reconfigurable Cluster-on-Chip Architecture with MPI Communication Layer,” in FCCM ’06: Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, 2006, pp. 350–352.

[234] P. Mahr, C. Lorchner, H. Ishebabi, and C. Bobda, “SoC-MPI: A Flexible Message Passing Library for Multiprocessor Systems-on-Chips,” in Proc. International Conference on Reconfigurable Computing and FPGAs ReConFig ’08, Dec. 3–5, 2008, pp. 187–192.

[235] J. Psota and A. Agarwal, “rMPI: Message Passing on Multicore Processors with On-chip Interconnect,” Lecture Notes in Computer Science, vol. 4917, p. 22, 2008.

[236] M. B. Taylor, J. Kim, J. Miller, D. Wentzlaff, F. Ghodrat, B. Greenwald, H. Hoffman, P. Johnson, J.-W. Lee, W. Lee, A. Ma, A. Saraf, M. Seneski, N. Shnidman, V. Strumpen, M. Frank, S. Amarasinghe, and A. Agarwal, “The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs,” IEEE Micro, vol. 22, no. 2, pp. 25–35, 2002.

[237] M. Saldaña and P. Chow, “TMD-MPI: An MPI Implementation for Multiple Processors Across Multiple FPGAs,” in Field Programmable Logic and Applications, 2006. FPL ’06. International Conference on, Aug. 2006, pp. 1–6.

[238] A. Agbaria, D.-I. Kang, and K. Singh, “LMPI: MPI for Heterogeneous Embedded Distributed Systems,” in Proc. 12th International Conference on Parallel and Distributed Systems ICPADS 2006, vol. 1, 2006, 8 pp.

[239] N. Saint-Jean, P. Benoit, G. Sassatelli, L. Torres, and M. Robert, “MPI-Based Adaptive Task Migration Support on the HS-Scale System,” in Proc. IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2008, pp. 105–110.

[240] G. M. Almeida, G. Sassatelli, P. Benoit, N. Saint-Jean, S. Varyani, L. Torres, and M. Robert, “An Adaptive Message Passing MPSoC Framework,” International Journal of Reconfigurable Computing, 2009.

[241] N. Saint-Jean, G. Sassatelli, P. Benoit, L. Torres, and M. Robert, “HS-Scale: a Hardware-Software Scalable MP-SoC Architecture for Embedded Systems,” in VLSI, 2007. ISVLSI ’07. IEEE Computer Society Annual Symposium on, March 2007, pp. 21–28.

[242] “The Multicore Association – The Multicore Communication API (MCAPI),” http://www.multicore-association.org/home.php.

[243] T. P. McMahon and A. Skjellum, “eMPI/eMPICH: embedding MPI,” in Proc. Second MPI Developer’s Conference, July 1–2, 1996, pp. 180–184.

[244] Centre de Prototips i Solucions Hardware-Software Lab, “NoCMaker Project,” 2007, http://www.cephis.uab.es/proj/public/nocmaker/index.xhtml.

[245] W. Gropp, E. Lusk, and A. Skjellum, Using MPI: Portable Parallel Programming with the Message-Passing Interface. The MIT Press, 1999.

[246] R. Brightwell and K. D. Underwood, “Evaluation of an Eager Protocol Optimization for MPI,” in PVM/MPI, 2003, pp. 327–334.

[247] F. Poletti, A. Poggiali, D. Bertozzi, L. Benini, P. Marchal, M. Loghi, and M. Poncino, “Energy-Efficient Multiprocessor Systems-on-Chip for Embedded Computing: Exploring Programming Models and Their Architectural Support,” Computers, IEEE Transactions on, vol. 56, no. 5, pp. 606–621, May 2007.

[248] P. Mucci and K. London, “The MPBench Report,” Tech. Rep., 1998.

[249] D. Grove and P. Coddington, “Precise MPI Performance Measurement Using MPIBench,” in Proceedings of HPC Asia, 2001.

[250] O. Villa, G. Palermo, and C. Silvano, “Efficiency and Scalability of Barrier Synchronization on NoC Based Many-Core Architectures,” in CASES ’08: Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems, 2008, pp. 81–90.

[251] A. Dorta, C. Rodriguez, and F. de Sande, “The OpenMP Source Code Repository,” in Parallel, Distributed and Network-Based Processing, 2005. PDP 2005. 13th Euromicro Conference on, Feb. 2005, pp. 244–250.

[252] J. Joven et al., “Application-to-Packets QoS Support - A Vertical Integrated Approach for NoC-based MPSoC,” 2010.

[253] J. Joven, D. Castells-Rufas, and J. Carrabina, “An efficient MPI Microtask Communication Stack for NoC-based MPSoC Architectures,” in Fourth Computer Architecture and Compilation for Embedded Systems, 2008.

[254] J. Joven, “A Lightweight MPI-based Programming Model and its HW Support for NoC-based MPSoCs,” in IEEE/ACM Design, Automation and Test in Europe (PhD Forum), 2009.

[255] M. Butts, “Synchronization through Communication in a Massively Parallel Processor Array,” IEEE Micro, vol. 27, no. 5, pp. 32–40, Sept. 2007.

[256] M. Butts, A. M. Jones, and P. Wasson, “A Structural Object Programming Model, Architecture, Chip and Tools for Reconfigurable Computing,” in Proc. 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines FCCM 2007, Apr. 23–25, 2007, pp. 55–64.

[257] “Tilera Corporation,” Tilera, http://www.tilera.com.

[258] “80-Core Programmable Teraflops Research Processor Chip,” Intel, http://techresearch.intel.com/articles/Tera-Scale/1449.htm.

[259] “Tera-scale Computing Research Program,” Intel, http://techresearch.intel.com/articles/Tera-Scale/1421.htm.

[260] “Single-chip Cloud Computing,” Intel, http://techresearch.intel.com/articles/Tera-Scale/1826.htm.

[261] L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, P. Dubey, S. Junkins, A. Lake, R. Cavin, R. Espasa, E. Grochowski, T. Juan, M. Abrash, J. Sugerman, and P. Hanrahan, “Larrabee: A Many-Core Architecture for Visual Computing,” IEEE Micro, vol. 29, no. 1, pp. 10–21, Jan. 2009.
