Achieving Performance in Networks-On-Chip for Real-Time Systems

Adam Kostrzewa




Dissertation approved by the Fakultät für Elektrotechnik, Informationstechnik, Physik of the Technische Universität Carolo-Wilhelmina zu Braunschweig for obtaining the degree of Doktor der Ingenieurwissenschaften (Dr.-Ing.)

by Adam Kostrzewa from Warsaw

Submitted on: 16.05.2018 / Oral examination on: 13.08.2018

First referee: Prof. Dr.-Ing. Rolf Ernst / Second referee: Prof. Dr.-Ing. Mladen Berekovic

Year of publication: 2018

Abstract

In many new applications, such as automated driving, high performance requirements have reached safety-critical real-time systems. Consequently, Networks-on-Chip (NoCs) must efficiently host new sets of highly dynamic workloads, e.g., high-resolution sensor fusion and data processing, or autonomous decision making combined with machine learning. The static platform management used in current safety-critical systems is no longer sufficient to provide the needed level of service. A dynamic platform management could meet the challenge, but it usually suffers from a lack of the predictability and simplicity necessary for the certification of safety and real-time properties.

In this work, we propose a novel, global and dynamic arbitration scheme for NoCs with real-time QoS requirements. The scheme follows the design principles of Software Defined Networks (SDN) and adjusts them for the purposes of NoCs in real-time embedded systems. The mechanism decouples the admission control from the arbitration in routers, thereby simplifying dynamic adaptation and real-time analysis. Consequently, the proposed solution allows the deployment of sophisticated contract-based QoS provisioning without introducing the complicated and hard-to-maintain schemes known from the frequently applied static arbiters.

The presented work introduces an overlay network which synchronizes transmissions using arbitration units called Resource Managers (RMs), allowing global and work-conserving scheduling. The description of the resource allocation strategies is supplemented by a protocol design and a verification methodology, bringing adaptive control to NoC communication in setups with different QoS requirements and traffic classes. To this end, a formal worst-case timing analysis for the mechanism is proposed, which demonstrates that this solution not only achieves higher performance in simulation but, even more importantly, consistently reaches smaller formally guaranteed worst-case latencies than other strategies for realistic levels of system utilization.

The approach is not limited to a specific network architecture or topology: the mechanism does not require modifications of the routers and can therefore be used together with the majority of existing manycore systems.

Indeed, the evaluation was conducted using generic performance-optimized router designs as well as two systems-on-chip focused on real-time deployments. The results confirm that the proposed approach exhibits significantly higher average performance in simulation and execution.

Zusammenfassung

In many new safety-critical applications, such as automated driving, high demands are placed on the performance of real-time systems. Therefore, Networks-on-Chip (NoCs) must efficiently accommodate new, highly dynamic workloads, e.g., high-resolution sensor fusion and data processing or autonomous decision making combined with machine learning, on a single system. The control of the underlying NoC architecture must protect the system safety against faults resulting from the dynamic behavior of the system while simultaneously providing the required performance.

In this work, we propose a novel, global and dynamic control scheme for NoCs with real-time QoS requirements. The scheme follows the design principles of Software Defined Networks (SDN) and decouples the admission control from the arbitration in routers. This enables dynamic adaptation and simplifies the real-time analysis. It thereby permits the deployment of a sophisticated contract-based resource allocation without introducing the complex and hard-to-maintain mechanisms already known from static platform management.

This work introduces an overlay network which synchronizes transmissions with the help of arbitration units, the so-called Resource Managers (RMs). This overlay network enables a global and work-conserving control. The description of different resource allocation strategies is supplemented by a protocol design and methods for the verification of the adaptive NoC control with different QoS requirements and traffic classes. For this purpose, a formal worst-case timing analysis covering the presented mechanism is introduced. The results confirm that the presented solution not only offers higher performance in simulation but also formally guarantees smaller worst-case latencies for realistic system loads than other strategies.

The presented approach is not limited to a specific network architecture or topology, since the mechanism requires no changes to the underlying routers; it can therefore be used together with existing manycore systems.

The evaluation was carried out on the basis of a performance-optimized router design as well as two platforms focused on real-time applications. The results confirmed that the proposed approach delivers, on average, significantly higher performance in simulation and execution.

Contents

1 Introduction ...... 1
  1.1 New Challenges in Design of Safety-Critical Systems ...... 4
  1.2 Real-Time Applications ...... 7
    1.2.1 Real-Time Traffic Requirements ...... 8
    1.2.2 Integration of Traffic ...... 11
  1.3 Safety Standards and Certification ...... 12
    1.3.1 Sufficient Independence ...... 14
    1.3.2 Communication Faults and Real-Time Properties ...... 15
  1.4 Requirements for Real-Time and/or Safety-Critical Workloads ...... 17
  1.5 Thesis Contribution and Outline ...... 25

2 A Survey of Mechanisms for Supporting Real-Time in NoCs ...... 29
  2.1 Spatial Isolation of Traffic in Networks-On-Chip ...... 30
  2.2 Temporal Isolation of Traffic in Networks-on-Chip ...... 32
    2.2.1 TDM-Based Networks-on-Chip Architectures ...... 33
    2.2.2 Traffic Isolation in the Router ...... 37
    2.2.3 Performance Optimized NoCs in the Real-Time Context ...... 38
  2.3 Comparison of Different Mechanisms ...... 44
  2.4 Summary ...... 49

3 The QoS Control Layer ...... 53
  3.1 Baseline NoC Architecture ...... 54
    3.1.1 Router ...... 55
    3.1.2 Network Interface ...... 56
    3.1.3 Traffic Characteristic ...... 59
  3.2 Software Defined Networking ...... 61
    3.2.1 SDN Principles and Real-Time Systems ...... 62
    3.2.2 SDN mechanisms in NoCs ...... 65
  3.3 The Main Concepts of the Control Layer ...... 66

  3.4 Overview of the Architecture ...... 69
  3.5 Synchronization with the Control Layer ...... 71
    3.5.1 Phase One: Initialization ...... 73
    3.5.2 Phase Two: Reservation ...... 74
    3.5.3 Phase Three: Usage ...... 76
    3.5.4 Phase Four: Release ...... 77
  3.6 Synchronization Protocols ...... 78
    3.6.1 Time-Driven Scheduling ...... 80
    3.6.2 Static Priority Based Arbitration ...... 83
    3.6.3 Dynamic Priority Based Arbitration ...... 86
    3.6.4 Adaptive Path Allocation ...... 89
    3.6.5 Adaptive Rate Control ...... 91
  3.7 Interface Between Tiles and NoC ...... 93
    3.7.1 Resource Control in NI ...... 93
    3.7.2 NoC Support for Suspensions ...... 94
    3.7.3 Control Layer Support for Suspensions during NoC Accesses ...... 97
  3.8 Interface Between NoC and Memory ...... 98
    3.8.1 Classic Approach for Safe Handling of Accesses to SDRAMs ...... 98
    3.8.2 RM-based Admission Control for SDRAMs ...... 100
  3.9 Summary ...... 101

4 Realization of the Control Layer in NoC ...... 105
  4.1 Design Environment ...... 109
    4.1.1 Temporal Analysis ...... 110
    4.1.2 Simulation ...... 129
    4.1.3 Tool for Protocol Configuration ...... 136
  4.2 Implementation ...... 139
    4.2.1 Transmission of Control Messages ...... 141
    4.2.2 HW/SW Co-design for Clients & RM ...... 145
    4.2.3 Hardware NoC-Extensions ...... 151
  4.3 Summary ...... 155

5 Evaluation ...... 157
  5.1 Experimental Setup ...... 157
  5.2 Worst-Case Guarantees ...... 163
    5.2.1 Time-Driven Scheduling ...... 163
    5.2.2 Priority-Based Arbitration ...... 170
    5.2.3 Worst-Case Temporal Overhead ...... 171

  5.3 Performance Measured by Simulation ...... 172
    5.3.1 Performance of Time Driven Scheduling ...... 173
    5.3.2 Work-Conserving Arbitration in NoC ...... 175
  5.4 Resource Overhead ...... 182
  5.5 Case Studies with Commercial and Research MPSoCs ...... 186
    5.5.1 IDAMC - Research Platform ...... 186
    5.5.2 KALRAY MPPA - Commercial Platform ...... 192
  5.6 Evaluation against requirements ...... 201
  5.7 Summary ...... 210

6 Conclusions ...... 213

Chapter 1: Introduction

Multi-Processor Systems-on-Chip (MPSoCs) enable high performance through the integration and concurrent execution of previously separated applications and functions. In this manner, MPSoCs offer extensive sharing of intellectual property (IP) components and other system resources at low power and competitive cost [69]. Driven by their commercial success and the continually increasing requirements of contemporary workloads, designers are constantly increasing the size and complexity of such architectures. Indeed, currently available MPSoCs embrace hundreds of IP cores, e.g., 50 cores in Intel's Knights Corner [1], 256 cores in the MPPA [71], 100 processing cores in Tilera's TILE-Gx line of processors [2], and 1024 cores in the Kilo-NoC architecture [57], and include DSPs, CPUs and memory blocks. Further complexity growth of on-chip systems is expected in the near future [25].

Such highly complex designs define a new set of requirements for the on-chip communication. The interconnect should be flexible and modular, and its capacity must scale along with the number of integrated IPs. Consequently, several trends have emerged showing clear limitations of bus-based and crossbar network architectures. Such simple interconnects, although straightforward in implementation, do not scale up well with the increasing number of connected senders [17]. For instance, in on-chip buses the same transmission medium (i.e., a link formed by a set of wires) is shared by all senders, and only one node at a time may conduct its transfer, which is protected by an arbiter module. Consequently, buses can serve only a limited number of IPs, as their performance degrades due to the serialization of all inter-IP communication, which becomes a bottleneck in System-on-Chip (SoC) architectures. As a result, the applicability of buses in complex SoCs is severely limited by the parasitic capacitance of the wires and the complexity of the arbitration [17]. Crossbar interconnects offer an alternative solution where each sender-receiver pair has its own independent communication line. These additional lines increase the aggregate bandwidth (as different transactions may proceed simultaneously), which is not possible in the case of a bus. However, the higher number of long wires, which increases quadratically with the number of connected senders [62], leads inevitably to difficult timing closure of very-large-scale integration (VLSI) designs, and increases the complexity of manufacturing processes, e.g., multiple synchronous regions and globally asynchronous locally synchronous (GALS) circuits.
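To make the scaling argument concrete, the following back-of-the-envelope sketch (illustrative arithmetic only, not taken from this thesis; the link-counting formulas are simplifying assumptions) contrasts how the number of point-to-point links grows for a bus, a full crossbar and a 2D-mesh NoC:

    // Illustrative link counts for n communicating IP cores (assumed models):
    // bus: one shared medium; crossbar: a dedicated line per ordered
    // sender-receiver pair; 2D mesh: links between neighboring routers only.
    #include <cstdio>

    int main() {
        for (int n : {4, 16, 64, 256}) {
            int bus = 1;
            long crossbar = static_cast<long>(n) * (n - 1);
            int side = 1;
            while (side * side < n) ++side;          // side = ceil(sqrt(n))
            int mesh = 2 * side * (side - 1);        // horizontal + vertical links
            std::printf("%4d cores: bus=%d crossbar=%6ld mesh=%4d\n",
                        n, bus, crossbar, mesh);
        }
        return 0;
    }

For 256 cores the crossbar under these assumptions already needs 65,280 dedicated lines, while the mesh needs 480 inter-router links; this is the quadratic-versus-linear gap referred to above.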

Consequently, the integration of components in a SoC is difficult, increasing the costs and complexity, or even leading to unfulfilled design requirements due to the aforementioned problems at the physical and logical design levels. These challenges have raised the need for a cheaper and more flexible solution for on-chip communication whenever the number of IP cores (communicating nodes) scales beyond the point supportable by the implementation technology for a given SoC. Consequently, at the beginning of the 2000s, Networks-on-Chip (NoCs) were proposed as a viable alternative to fulfill these goals [16]. NoCs bring the principles of switch-based networks known from off-chip communication into MPSoCs [62], [37].

In NoC-based architectures, the chip can be divided into on-chip computational units called nodes, as presented in Figure 1.1. Each node is built from tiles, a router (sometimes also referred to as a switch) and network interfaces. Each tile may encompass multiple hardware IPs, e.g., processors, memories or controllers, or may even be equipped with its own interconnect, e.g., a bus or crossbar. Network Interfaces (NIs), or Network Interface Units (NIUs), enable the communication between tiles and routers. They usually provide a common transmission protocol for tiles that use different protocols (e.g., AXI, AHB, APB, OCP, PIF or BVCI) and data formatting standards. This transmission protocol can be used instead of the tiles' own protocols or may be applied to convert the specific tile protocol to a common one. Consequently, a NI often encompasses bridge functionality. Each router is connected through a set of links with a subset of tiles as well as a subset of other nodes. Routers do not perform information processing but instead forward the data to other nodes and to the processing elements (tiles) connected to these nodes. Links are simply sets of wires used to forward the data signals. They can be bi-directional or uni-directional depending on the implementation. The devices responsible for connecting the wires to the routers are called ports. Ports are often equipped with buffers which hold packets during times of congestion. Frequently, designers distinguish between input ports for incoming packets and output ports for outgoing packets. Buffers may be used correspondingly for input buffering, output buffering, or both. The node degree defines the number of links (communication channels) to other neighbouring nodes.

The communication in a network will be explained in the scope of the layered approach known from general-purpose networking. Consequently, a NoC can be decomposed into a set of mechanisms (layers), see Figure 1.2. Each layer denotes an independent set of mechanisms responsible for different technical problems which can be handled independently.

The transaction layer defines the communication primitives for the IP blocks belonging to tiles. It decides about the communication participants (i.e., sender and receiver, or master and slave) and the purpose of the communication, e.g., a memory load or store between a master NI and a slave NI, but not about the method used for the data transport. Consequently, the transaction layer is frequently implemented through a set of mechanisms in the NI which function like bridges between the different external protocols of the tiles connected to the NoC (e.g., AHB, AXI), making the NoC communication transparent to its participants. The transport layer is responsible for how data is forwarded and switched throughout the interconnect. It deals with the problems of packetization and addressing, but may also provide additional reliability functions such as parity bits. An important part of this problem is the arbitration between concurrent packets progressing through the network, performed in each router. The selected scheduling method has a great impact on the average and worst-case performance metrics of the interconnect. Finally, the physical layer is responsible for how packets are converted into signals which are physically transmitted throughout the interconnect. The presented approach follows the "route packets, not wires" principle from [38]. Communication between network nodes is carried out through streams of packets on a specific route, making NoCs similar to large-scale computer interconnects, e.g., Ethernet.

Designed in this way, NoCs have numerous advantages when applied to complex MPSoCs and compared to other, simpler solutions for on-chip communication. First of all, NoCs scale significantly better. By adding a new node to the system, the raw aggregated communication bandwidth also increases. Additionally, nodes operate independently and can therefore conduct a high number of transmissions in parallel. Decoupling the transaction layer from the transport layer introduces high flexibility into the design of a network architecture for supreme performance. For example, tiles no longer need to be homogeneous. They can connect a plethora of different IP blocks using various transaction standards, e.g., AXI or AHB, through adjustments in the NIs. This makes NoC communication "transparent" to legacy software and components, allowing backward compatibility and easy incorporation within existing MPSoCs. NoCs also support different topologies and setups. For example, connections between nodes can be homogeneous and regular, as presented in Figure 1.1, which means that each router has the same architecture, an equal number of nodes and the same degree. However, NoCs may also be heterogeneous and irregular. In such setups, routers can operate at various frequencies, support different numbers of ports and even different widths of links between nodes (i.e., different bandwidths), as well as different packet scheduling techniques. This allows NoCs to span MPSoCs with various IP blocks, independent clock domains and the available chip space. Finally, the application of switches improves control over the wires, which become shorter. Therefore, switches simplify timing closure and reduce the overall cost of the system, including power consumption and production effort. By applying NoCs in a complex system, the designer has fine-grained control over the communication properties, allowing performant, flexible and resource-efficient systems on a chip.
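As an illustration of the decoupling between the transaction and the transport layer, the following minimal sketch (types, field widths and packet size are assumptions for illustration, not the interface of any concrete NoC) shows how an NI could split a transaction-layer request into transport-layer packets:

    // Sketch of NI packetization: the IP block issues a transaction, the NI
    // splits it into sequence-numbered packets for the transport layer.
    #include <algorithm>
    #include <cstdint>
    #include <vector>

    struct Transaction {            // transaction layer: what the IP requests
        uint16_t src, dst;          // master and slave NI addresses
        std::vector<uint8_t> data;  // e.g., payload of a memory store
    };

    struct Packet {                 // transport layer: how data is forwarded
        uint16_t src, dst;
        uint16_t seq;               // enables in-order reassembly at the NI
        std::vector<uint8_t> payload;
    };

    std::vector<Packet> packetize(const Transaction& t, size_t maxPayload = 16) {
        std::vector<Packet> out;
        for (size_t off = 0, seq = 0; off < t.data.size(); off += maxPayload, ++seq) {
            size_t len = std::min(maxPayload, t.data.size() - off);
            out.push_back({t.src, t.dst, static_cast<uint16_t>(seq),
                           {t.data.begin() + off, t.data.begin() + off + len}});
        }
        return out;
    }

The IP block only sees the Transaction; route selection, switching and flit-level signalling below this interface remain invisible to it, which is precisely what makes the NoC transparent to legacy components.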

Figure 1.1: Example of a NoC-based MPSoC platform composed of network nodes (N0-N8) and tiles (T1-T4); each node couples a router (R) via network interfaces (NI) to tiles which contain, e.g., processing cores, caches, local memories, a local interconnect (e.g., AHB), memory and peripherals.

The aforementioned features of contemporary NoC architectures have contributed to their wide commercial success in the domain of multi-functional consumer and mobile devices. Therefore, as the complexity of setups in the safety-critical embedded domains (e.g., automotive, avionics, industrial robotics) has increased significantly in recent years, NoCs have gained researchers' and designers' attention as a new key-enabling technology.

1.1 New Challenges in Design of Safety-Critical Systems

In many embedded applications, high performance requirements have reached safety-critical real-time systems. For instance, an average modern vehicle has approximately 25-30 Electronic Control Units (ECUs), whereas some high-end models have over 100 ECUs, see Figure 1.3. The contemporary in-vehicle network, combined with the wiring, is the second heaviest component in the vehicle (behind the engine), comprising over 6 km of copper wire and weighing over 70 kg [109]. These numbers are foreseen to rise in the near future with the advent of advanced features, e.g., autonomous driving.

ECUs are also becoming more complex and highly interconnected to provide new levels of comfort, safety and efficiency. Figure 1.4 presents examples of such features in a car, e.g., camera-based control (lane departure warning, automatic cruise control, etc.) and sensors and actuators used for measurements in mechanical control applications (motor, brakes), e.g., gas (COx, NOx, etc.), temperature, vibration, wheel speed, torque and yaw.

Figure 1.2: Layers of the NoC-based communication in the MPSoC.

Figure 1.3: Number of ECUs in different segments of automotive production, source [109].

Besides these features, upcoming highly automated and autonomous driving introduces a new set of requirements, see Figure 1.4 (red boxes). These include, e.g., robust and efficient surround sensing and decision making, including machine learning and control. The system must offer high performance and parallel processing (e.g., big data crunching, decision making with convolutional neural networks) as well as the synchronization of different ECUs over heterogeneous interconnects, e.g., Ethernet switches and buses. The sensor fusion is done in real time while gathering and processing data from a plethora of sensor inputs (i.e., high-resolution sensor fusion).

Figure 1.4: Overview of the electronic features in a modern vehicle including ADAS, spanning domains such as power train, chassis, body electronics, safety, infotainment and networking, as well as ADAS functions such as sensor fusion, deep learning and data crunching.

Actions must be taken in the context of a dynamic, constantly changing environment (e.g., the situation on the road, the quality of sensor data) as well as platform pitfalls (e.g., transient overloads, memory overflows, data loss and missed deadlines). Finally, these systems have strict design constraints with respect to power consumption and cost budgets.

Consequently, to keep up with these high demands, multi- and many-core SoCs have become the architectures of choice also in the safety-critical domains. In the case of automotive designs this has three significant benefits: aggregation, extension and redundancy. The aggregation property refers to design schemes which decrease the number of ECUs by moving the software from multiple smaller ECUs (e.g., with low utilization) onto one many-core ECU (e.g., with higher utilization and thus lower cost). The extension property permits the integration of a higher number of features in the same ECU, e.g., new advanced functionalities such as ADAS, or the parallelization of code execution. Finally, the redundancy property ensures the combination of high performance with high reliability on the same chip. Therefore, some ECUs or functions may be duplicated (e.g., executed on different cores) to compare their results and look for erroneous behavior.

As an inevitable consequence of integration, handling these new workloads requires flexible on-chip networks whose capacities scale along with the size and multiplicity of features in the design.

The advantages of NoCs described in the previous section, which greatly contributed to the commercial success of SoCs and MPSoCs in the domain of general-purpose computing, drew designers' attention to their application in the embedded domains as well. However, it soon became clear that their implementation would not be as straightforward as it might have seemed, due to the new requirements resulting from the deployed workloads and the physical environments in which they are applied.

Firstly, many of the workloads in the embedded domain have real-time constraints which require not only correct calculation/operation results but also timely responses. Secondly, some of the electronic equipment can be critical to the mission or to user safety, i.e., its malfunction could cause serious harm to the user or to the system within which it is integrated. Consequently, these devices must be designed according to safety standards and must frequently undergo a rigorous certification process before their market deployment. This requires producers to provide methods for the integration, testing and verification of their products.

These requirements fully apply to the automotive domain and are especially important for new advanced subsystems such as ADAS. It is easy to predict that a delayed signal from a sensor (camera or radar) can have disastrous effects on the vehicle and its surroundings. Moreover, the integration of multiple different features in the same SoC could delay the processing of the sensor data and result in similarly fatal consequences due to accesses to shared resources [107].

In Sections 1.2 and 1.3, these new aspects of SoC design are discussed in greater detail with a focus on real-time and safety. The goal is to provide a brief overview before the detailed discussion of how these challenges translate into NoC requirements, which follows in Section 1.4. Although the presented discussion focuses on the automotive domain, many of its aspects can be directly extrapolated to other embedded domains such as industrial automation, avionics or robotics. Therefore, whenever possible, these common challenges and solutions (e.g., standardization processes) will be highlighted for the sake of better understanding.

1.2 Real-Time Applications

It is a common intuition that many automotive functions, especially when controlled by ECUs, have temporal constraints, e.g., braking systems, steering or engine control. The upcoming ADAS functionality extends this spectrum even further by gathering sensor data (e.g., timely video frames and/or radar signals), processing it (e.g., denoising, filtering, compression), and making decisions (e.g., big data processing and machine learning with convolutional neural networks).

However, real-time requirements also appear in other electronic domains, e.g., traffic control, multimedia, mobile communication, medical applications, industrial robotics or avionics. In this section, we provide a brief overview and summary of the most important properties of such setups, with a special focus on their requirements towards the communication architecture.

1.2.1 Real-Time Traffic Requirements

In systems with real-time (temporal) requirements, the correctness of the system design and operation depends equally on temporal and functional aspects. Real-time applications/systems must guarantee not only that their results are logically correct, but also that they are delivered on time. The physical time by which a specific result must be produced is called the deadline. Deadlines are frequently dictated by the environment and the physical nature of the controlled process.

Fundamentally, real-time applications can be classified by the consequences that a deadline miss can cause. The theory of real-time systems design [92] distinguishes three application categories under this criterion: hard real-time (HRT), soft real-time (SRT) and best-effort (BE). These categories will be briefly discussed, as the properties of the applications directly influence the properties of the initiated NoC traffic, i.e., a hard real-time application has hard real-time communication requirements with respect to bandwidth or latency. However, the presented classification is orthogonal to the different traffic metrics described in the next sections. For instance, it is possible to have a hard real-time application requiring guaranteed worst-case latencies or a soft real-time application requiring guaranteed throughput.

HRT applications have firm deadlines, i.e., the utility of the produced result is zero when the deadline is crossed. Consequently, any delay in their execution, including high latencies of initiated transmissions, may have severe consequences for the whole system, i.e., fatal faults and prohibitive degradation of service. Note that this frequently includes situations in which the user's safety may be in danger, e.g., the activation of an anti-lock braking system (ABS), or a high latency of video frames from cameras in an ADAS system leading to malfunction.

Indeed, as presented in Figure 1.5, in practice many safety-critical systems have hard deadlines - hence these systems are both safety- and time-critical. Therefore, transmissions conducted by HRT senders, i.e., hard real-time transmissions (HRTTs), are usually not allowed to miss any deadline, i.e., their worst-case latency must stay within the assumed upper/lower limit.

Figure 1.5: Dependency between safety-critical and time-critical systems, based on [50]; examples range from general-purpose computing and multimedia (neither safety- nor time-critical) to avionic flight management, medical pacemakers and automotive active safety or autonomous driving (both safety- and time-critical).

Similarly, even if a functionality does not directly affect user safety, it may have hard real-time requirements from the producer's point of view, as a failure of the service could cause the loss of clients or a substantial financial penalty. Accommodation of the traffic from HRT applications requires worst-case dimensioning during the design process and, in the case of safety-critical systems, also a verification proving adherence to the standards. Due to these requirements, the characteristics of traffic originating from hard real-time senders are usually well specified and tested, e.g., periodic DMA transfers.

Similarly to HRTTs, transmissions initiated by SRT senders, i.e., soft real-time transmissions (SRTTs), must also comply with the overall real-time performance objectives, e.g., guaranteed latency or throughput. The main difference, when compared to hard real-time senders, is that these applications are rarely required to rigorously meet all their deadlines, i.e., the produced results retain some utility after the deadline, see Figure 1.6. For instance, video streaming as part of the infotainment functions in a car or a plane does not influence vehicle safety, but video frames must still arrive with a certain latency to prevent quality drops and glitches. Another example are control algorithms based on a feedback loop. Frequently, the algorithm may tolerate a limited number of cases in which, instead of sampling new data, old values are used, e.g., [156, 143, 125]. However, also in the case of SRT senders, timing can be a critical factor depending on other non-functional requirements. For a producer of an infotainment system, the quality of its users' experience may play a critical role in the market success of the product. Consequently, it may accept a sporadic drop of the video quality but may lose clients whenever it happens too often.

Figure 1.6: Communication requirements of NoC traffic concerning temporal isolation, based on [132]: utility as a function of latency for hard real-time, soft real-time and best-effort traffic.

The last category, best-effort (BE) senders, is populated by the majority of existing applications. In contrast to both aforementioned real-time classes, BE transmissions have no strict temporal requirements, i.e., a deadline missed by them does not endanger the system. Consequently, the behavior of best-effort applications is rarely tested with respect to temporal properties, e.g., the number, duration and frequency of memory accesses. BE senders are usually designed targeting average performance metrics, such as average latency or average throughput, and are compiled with standard toolchains without considering temporal properties or certification requirements, e.g., GNU compilers and applications running on processors with caches. Consequently, in the case of initiated accesses to the on-chip interconnect, BE senders may exhibit high burstiness, i.e., an unpredictable and highly variable resource usage (number and length of transmissions). BE senders suffer from a lack of temporal models, and their integration with other applications having real-time requirements is difficult.

Figure 1.6 presents a summary of the temporal properties of NoC traffic. The X-axis depicts the worst-case communication latency and the Y-axis the utility of the data, i.e., the usefulness of the data for the receiver. A utility value equal to zero denotes a missed deadline. The timely arrival of data (before or on the deadline) is denoted by a utility value equal to one. All other values of the utility function (between these bounds) denote a performance degradation. For instance, HRTTs have utility only if they arrive before the deadline. For SRTTs, the utility function (usefulness of the data) decreases after the missed deadline.
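The utility curves of Figure 1.6 can be stated as simple functions of latency. The following sketch assumes concrete curve shapes (step, linear decay and hyperbolic decay) purely for illustration; the thesis defines the curves only qualitatively:

    // Utility of a transmission as a function of its latency (assumed shapes).
    #include <cstdio>

    double utilityHRT(double latency, double deadline) {
        return latency <= deadline ? 1.0 : 0.0;     // step: useless when late
    }

    double utilitySRT(double latency, double deadline, double cutoff) {
        if (latency <= deadline) return 1.0;        // on time: full utility
        if (latency >= cutoff)   return 0.0;        // far too late: useless
        return (cutoff - latency) / (cutoff - deadline);  // linear decay
    }

    double utilityBE(double latency, double scale) {
        return scale / (scale + latency);           // no deadline, lower is better
    }

    int main() {
        for (double l : {5.0, 10.0, 15.0, 25.0})
            std::printf("latency %4.1f -> HRT %.2f, SRT %.2f, BE %.2f\n",
                        l, utilityHRT(l, 10.0), utilitySRT(l, 10.0, 20.0),
                        utilityBE(l, 10.0));
        return 0;
    }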

1.2.2 Integration of Traffic

As discussed in Section 1, NoCs are foreseen as a communication backbone for large SoCs integrating different ECUs. As a result of such integration, it can happen that diverse traffic classes, i.e., HRTTs, SRTTs and BE, must share the SoC resources. This integration causes co-dependencies between applications running on different cores, which may endanger safety. Unpredictable and bursty accesses from BE senders may lead to contention in network buffers.

In on-chip interconnects without appropriate quality-of-service (QoS) mechanisms, resources are not reserved in advance, i.e., transmissions are scheduled as soon as they arrive, and all traffic receives equal treatment. Because of that, interference from BE traffic may lead to missed deadlines of SRTTs or HRTTs. On the other hand, there is often no advantage in completing an HRTT earlier than required. The design process (e.g., safety standards) usually insists on the satisfaction of all timing constraints, even if they are unnecessarily stringent. Therefore, as long as an application completes by its deadline, its response time is not important [132, 144]. This property could be used to slow down SRTTs and HRTTs in order to accommodate BE traffic. Finally, the majority of best-effort applications, although not timing-critical, are still latency-sensitive. Their performance degrades with higher latency, as presented in Figure 1.6. For instance, applications running on processors with caches must frequently fulfill temporal requirements to profit from low average-case latencies for improved processor utilization. Consequently, they profit from higher interconnect performance and a higher share of SoC resources. These properties lead to contradictory requirements whenever different classes of real-time traffic share the interconnect.

The integration of HRTTs with SRTTs and BEs in the same chip requires enforcing sufficient independence between components, i.e., assuring that the behavior of HRT senders is independent of the behavior of SRT and BE senders. However, this often compromises the performance of SRTTs or BEs. Quality-of-service mechanisms for on-chip interconnects usually prioritize hard real-time senders over best-effort and soft real-time traffic. Hence, BE traffic suffers from high latency, although time-critical traffic has little to no benefit from the reduced latency. Moreover, the worst-case dimensioning required for the deployment of HRT senders frequently leads to resource overprovisioning. During regular operation of the system, such extreme conditions rarely occur, and such arbitration results in a significant drop in average utilization, i.e., underutilized resources. Consequently, an efficient co-execution of such diverse application types is still an open research question with a potentially high engineering and economic impact, i.e., it is critical to the success of NoCs on the market.
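The effect of missing QoS mechanisms can be illustrated with a toy single-link experiment (unit-time flits and a single output port are simplifying assumptions; this is not the arbitration scheme proposed in this thesis): under FIFO scheduling a best-effort burst arbitrarily delays a critical flit, while a static priority bounds its latency:

    // Toy link arbiter: serve one flit per cycle, FIFO vs. critical-first.
    #include <algorithm>
    #include <cstdio>
    #include <deque>

    struct Flit { int arrival; bool critical; };

    // Returns the worst latency experienced by critical flits.
    int worstCriticalLatency(std::deque<Flit> q, bool prioritize) {
        int worst = 0;
        for (int t = 0; !q.empty(); ++t) {
            auto pick = q.begin();                 // FIFO default (q sorted by arrival)
            if (prioritize)
                for (auto it = q.begin(); it != q.end(); ++it)
                    if (it->critical && it->arrival <= t) { pick = it; break; }
            if (pick->arrival > t) continue;       // link idles this cycle
            if (pick->critical)
                worst = std::max(worst, t + 1 - pick->arrival);
            q.erase(pick);
        }
        return worst;
    }

    int main() {
        std::deque<Flit> q;
        for (int i = 0; i < 20; ++i) q.push_back({0, false}); // BE burst at t=0
        q.push_back({5, true});                               // one HRTT flit at t=5
        std::printf("FIFO:     worst critical latency = %d cycles\n",
                    worstCriticalLatency(q, false));
        std::printf("priority: worst critical latency = %d cycles\n",
                    worstCriticalLatency(q, true));
        return 0;
    }

The toy example also shows the flip side discussed above: with static prioritization the BE burst finishes later, although the critical flit gains nothing from arriving many cycles before a deadline that may lie much further away.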

1.3 Safety Standards and Certification

As already discussed, although not all real-time systems are safety-critical, the majority of safety-critical systems have real-time constraints. Safety-critical systems are systems where a malfunction can cause an "unacceptable risk of physical injury or of damage to the health of people, either directly, or indirectly as a result of damage to property or to the environment", following the definition from [5]. Therefore, for several years one has been able to observe an accumulation of industrial and research efforts towards the standardization of the safety life cycle for electronic products.

This process is already at an advanced stage in the avionics industry and in the domain of industrial automation. In the avionic domain, the development and validation of software is regulated by the standard DO-178B [4] and of the underlying hardware components by DO-254 [3]. During a rigorous validation process, producers must prove the compliance of their products with the requirements and production methods enforced by these standards, which plays a decisive role in aircraft approval by a certification authority such as the Federal Aviation Administration (FAA). A similar process is also enforced for the development of heavy machinery (IEC 62061), systems for the process industries (IEC 61511), railway (IEC 62279) and power plants (IEC 61513).

In the case of the automotive industry, safety standards have been introduced relatively late.¹ The major industrial efforts have led to the establishment of the ISO26262 [67] standard, derived directly from the IEC 61508 [5] standard document, in 2011. IEC 61508, introduced by the International Electrotechnical Commission (IEC), proposed a generic approach to the safety life cycle of all systems comprised of electrical and/or electronic elements, including programmable elements. Its main goal was to establish a foundation which can be adjusted for different branches of industry, e.g., rail, machinery, power plants. Therefore, ISO26262 constitutes in many aspects a straightforward extension of this approach, adjusted for the purposes of the automotive domain. However, there are also differences. For instance, the original IEC 61508 considered low-volume fabrication of components, whereas ISO26262 is focused on large quantities, i.e., serial production.

According to ISO26262, safety is defined as "the absence of unreasonable risk". Solving this problem in the case of electrical and electronic devices requires the incorporation of appropriate measures throughout the design process. Consequently, ISO26262 adopts and refines the V-Model [67] presented in Figure 1.7. It offers a top-down approach to the engineering problem.

¹The reasons for this are: 1) the consequences of a car crash are considered to be minor in comparison with an avionic accident; 2) the liability constraints already enforce a certain quality from automotive manufacturers.

Figure 1.7: V-Model according to ISO26262, source [67].

Firstly, the requirements must be derived while considering: a) the specification of the intended functions and their interactions necessary to achieve the desired behavior of the system (the functional concept), as well as b) the quantification of the tolerable risk. Although the former task is straightforward, the latter can be demanding. It usually requires formalized methods and processes for achieving the assurance level required by standards and certification authorities. This is done through model-driven design, where the producer must provide not only the product but also sufficient data to verify and test the design (i.e., the artifact) independently.

Therefore, NoCs, whenever used for communication between safety-critical IPs such as automotive functions, are or will be, depending on the safety-critical domain, the subject of regulation through standards and certification procedures to assure their correct functioning. In this context, not only do possibly high average performance and low costs play a critical role but also, even more importantly, the ability to prove adherence to the safety requirements.

This adds another complexity layer to the design process and requires traceability with respect to real-time properties, e.g., the application of formal analysis methods such as Real-Time Calculus [138], Network Calculus [89] or Compositional Performance Analysis [61]. Next, details on which of the ISO26262 requirements for data communication are applicable to Networks-on-Chip will be presented.
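As a concrete instance of such a formal method, consider the standard delay bound from Network Calculus [89] (a textbook result quoted here for illustration; the parameter values in the example below are assumptions): for a flow with a token-bucket arrival curve and a router offering a rate-latency service curve, the worst-case delay is the maximal horizontal deviation between the two curves:

    \alpha(t) = b + r\,t \qquad \text{(arrival curve: burst } b \text{, long-term rate } r\text{)}
    \beta(t)  = R\,\max(0,\, t - T) \qquad \text{(service curve: rate } R \text{, latency } T\text{)}
    D_{\max}  = h(\alpha, \beta) = T + \frac{b}{R} \qquad \text{for } r \le R

For example, with b = 64 flits, r = 0.2 flits/cycle, R = 1 flit/cycle and T = 8 cycles, the bound is D_max = 8 + 64/1 = 72 cycles; certification artifacts can cite such bounds as evidence of traceable real-time properties.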

1.3.1 Sufficient Independence

There are multiple sources of possible hazards in an MPSoC, which differ in their severity and exposure. The process of integrating several ECUs in the same system-on-chip leads to a setup in which not all integrated IPs might have the same impact on the system safety (i.e., the same criticality), and thus a mixed-criticality system is created. For instance, some of them might not be designed with user safety in mind, e.g., with respect to worst-case behavior. Consequently, their behavior cannot be trusted. From now on, these IPs (hardware or software) will be called non-critical; they generate the majority of the BE traffic. Moreover, even safety-critical IPs may have a different impact on user safety and therefore require different certification efforts. In the automotive context, ISO26262 distinguishes between four Automotive Safety Integrity Levels (ASIL A-D) defining the relative level of risk reduction provided by a safety function. Each of them defines more restrictive risk analyses and safety goals.

For such mixed-criticality systems, safety standards require the certification of the whole system to the highest relevant safety level (e.g., the highest ASIL), or temporal and spatial separation. For instance, IEC 61508 states: "If the safety integrity requirements for these safety functions differ, unless there is sufficient independence of implementation between them, the requirements applicable to the highest relevant safety integrity level shall apply to the entire E/E/PE safety-related system." Similarly, ISO26262 states that "If the embedded software has to implement software components of different ASILs, or safety-related and non-safety-related software components, then all of the embedded software shall be treated in accordance with the highest ASIL, unless the software components meet the criteria for coexistence in accordance with ISO 26262-9:2011, Clause 6.", where Clause 6 proposes: "In the case of the coexistence of sub-elements that have different ASILs assigned or the coexistence of sub-elements that have no ASIL assigned with safety-related ones, it can be beneficial to avoid raising the ASIL for some of them to the ASIL of the element. When determining the ASIL of sub-elements of an element, the rationale for freedom from interference is supported by analyses of dependent failures focused on cascading failures".

Figure 1.8: An example of a NoC with a mixed-critical workload. The interconnect must be certified according to the traffic with the highest criticality or must provide QoS mechanisms for sufficient isolation.

Freedom from interference is later defined as the "absence of cascading failures between components that could lead to the violation of [some] safety requirements". By applying these rules to the system-on-chip, the parts of the HW and RTE which are always used must be certified to the highest relevant safety level. For all other components, "sufficient independence" must be implemented, see Figure 1.8. As non-critical tasks cannot be trusted (e.g., unknown execution time, activation frequency, communication volume), network interfaces must separate the critical network from non-critical tiles. Additionally, routers must provide a predictable upper bound on the worst-case interference between concurrent transmissions.

1.3.2 Communication Faults and Real-Time Properties

According to ISO26262, the communication of safety-related data must be protected at run-time against the effects of faults which may lead to failures of the system. These faults include transient faults (e.g., single event upsets such as bit-flips caused by electromagnetic radiation leading to corrupted data), physical damage, as well as a lack of sufficient independence between tasks. Consequently, ISO26262 provides a list of faults regarding the exchange of information, presented in Table 1.1, which must be considered for an interconnect for certification purposes. In this context, end-to-end protection defines a set of mechanisms which allow a reliable detection of these faults and appropriate countermeasures. In the NoC context, some of these faults directly refer to the real-time metrics of the on-chip interconnect, e.g., a delay of information or blocked access to the communication channel. Others relate to the protection of packets performed directly in the routers, e.g., without a backpressure signal / flow control (informing about full buffers in ports), packets can be overwritten, leading to a corruption of information.

Repetition of information: A type of communication fault where information is received more than once.

Loss of information: A type of communication fault where information or parts of information are removed from a stream of transmitted information.

Delay of information: A type of communication fault where information is received later than expected.

Insertion of information: A type of communication fault where additional information is inserted into a stream of transmitted information.

Masquerade or incorrect addressing: A type of communication fault where non-authentic information is accepted as authentic information by a receiver.

Incorrect sequence of information: A type of communication fault where information is accepted from an incorrect sender or by an incorrect receiver.

Corruption of information: A type of communication fault which changes information.

Asymmetric information from sender to multiple receivers: A type of communication fault where receivers receive different information from the same sender.

Information from a sender received by only a subset of the receivers: A type of communication fault where some receivers do not receive the information.

Blocking access to a communication channel: A type of communication fault where the access to a communication channel is blocked.

Table 1.1: Summary of communication faults which must be considered in the design of automotive systems, from ISO26262 [67].

Note that other faults, although not directly related to the temporal metrics, frequently influence temporal predictability indirectly. Transient errors and malfunctioning or malicious senders may introduce uncertainty and dynamics into the system, e.g., sporadic overloads due to re-transmissions or "babbling idiots". Therefore, their contribution may significantly degrade the worst-case estimate of the system performance.
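The following sketch shows how an end-to-end protection wrapper could detect several of the fault types from Table 1.1 at the receiving NI. The frame layout, field widths and fault mapping are assumptions for illustration; ISO26262 mandates which faults must be detected, not this particular realization:

    // End-to-end protection at the receiver (illustrative; wraparound of
    // the sequence counter and the CRC computation itself are omitted).
    #include <cstdint>

    struct Frame {
        uint16_t srcId;    // checked against the expected sender (masquerade)
        uint16_t seq;      // repetition, loss, insertion, incorrect sequence
        uint32_t crc;      // corruption of information
        uint64_t payload;
    };

    enum class Fault { None, Masquerade, Corruption, Delay, Repetition, LossOrSequence };

    struct E2EReceiver {
        uint16_t expectedSrc;
        uint16_t nextSeq = 0;
        uint32_t maxAgeCycles;   // bound on the tolerated delay of information

        Fault check(const Frame& f, uint32_t ageCycles, uint32_t computedCrc) {
            if (f.srcId != expectedSrc)   return Fault::Masquerade;
            if (computedCrc != f.crc)     return Fault::Corruption;
            if (ageCycles > maxAgeCycles) return Fault::Delay;
            if (f.seq < nextSeq)          return Fault::Repetition;     // duplicate
            if (f.seq > nextSeq)          return Fault::LossOrSequence; // gap/reorder
            ++nextSeq;
            return Fault::None;
        }
    };

Blocked access to a communication channel would manifest at the sender side instead, e.g., as a timeout on the backpressure/flow-control signal mentioned above.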

1.4 Requirements for Real-Time and/or Safety-Critical Workloads

NoC architectures are judged by four measures [37]: production cost, transmission latency, throughput and energy/power consumption. The two latter factors are performance metrics. Latency denotes the time necessary for a packet to traverse the interconnect, whereas throughput defines the amount of data per time unit which can be moved through the network (e.g., in bits per second). Consequently, the majority of NoC architectures target a possibly high average performance [37]. In the following, such NoCs will be referred to as common or performance-optimized NoCs/architectures.

However, even if a NoC design achieves high utilization at low cost, it is still difficult to guarantee system predictability in real-time domains like avionics or automotive. Whenever concurrent transmissions compete for the interconnect resources (wires, buffers), the resulting interference couples the execution of applications running on different cores. The network architecture must assure that this interference is predictable and stays within the assumed limits. Otherwise, a sender may not comply with its temporal requirements. Therefore, providing predictable arbitration between concurrent transmissions is a critical factor in the design of NoCs for SoCs with temporal requirements.

Moreover, the support for temporal properties should not cancel out the benefits resulting from the application of NoCs, e.g., high efficiency, scalability, flexibility and low production costs. This requires that the designer adjusts the arbitration of interconnect resources to incorporate the dynamics resulting from system integration, e.g., simultaneously hosting different real-time and/or safety traffic classes. Mechanisms designed for this purpose should be not only efficient and affordable but should also allow straightforward verification and testing, as required by the safety standards. A summary of these requirements is presented in Tables 1.2 and 1.3; a detailed discussion and justification follows.

R1 Traffic Types (Real-Time, Performance, Safety, Costs): NoC should provide support for multiple traffic classes, e.g., GL, GT, BE.

R2 Workload Integration (Real-Time, Performance, Costs): NoC should provide efficient support for real-time requirements without the need to modify legacy code and other IPs.

R3 Flexibility (Performance, Costs): NoC architecture should be able to support different resource allocation strategies depending on the particular setup.

R4 Dynamics (Performance, Safety, Costs): NoC architecture should allow efficient and safe incorporation of dynamics in system behavior.

R5 Fairness (Real-Time): NoC architecture should provide fair allocation of interconnect resources whenever RT senders have different requirements.

Table 1.2: Requirements for contemporary and future NoC architectures for deployment in real-time and safety-critical domains, e.g., automotive (part 1).

R6 Mixed-Criticality (Safety, Performance, Costs, Energy): NoC architecture should provide as high an average performance to BE senders as possible in mixed-critical setups.

R7 Switch-off QoS (Costs, Energy): NoC architecture should provide a possibility to switch off real-time and/or safety mechanisms whenever there are no deployed senders with such requirements.

R8 Scalability (Performance, Real-Time): NoC architecture should provide performance (average and worst-case) proportional to the load at runtime (i.e., work-conserving arbitration) and not to other static factors, e.g., the number of senders.

R9 Locality (Safety): NoC architecture should preserve the order of arriving packets whenever it is necessary.

R10 Verification (Safety, Costs): NoC architecture should allow efficient formal verification for standardization/certification purposes.

R11 Detect NoC HW faults for higher ASIL levels (Real-Time, Safety): Errors must be detected in order to initiate countermeasures (for fail-safe and fail-operational behavior).

Table 1.3: Requirements for contemporary and future NoC architectures for deployment in real-time and safety-critical domains, e.g., automotive (part 2).

Support for Different Traffic Types (R1, R5, R6)

The main purpose of a real-time NoC is to assure that the temporal properties of transmissions comply with the timing requirements of the integrated applications. These requirements for real-time traffic are usually defined using the notions of worst-case latency and minimum throughput. The former defines the upper bound on the duration of the communication, e.g., the maximum latency of a single packet transmission. The latter refers to the lower bound on the amount of data transferred per unit of time.

Consequently, the distinction between these two classes of safety-critical workloads is visible in many NoC designs. Transmissions are then classified as follows (a short sketch after the list illustrates the difference between per-packet and per-burst guarantees):

Guaranteed Latency (GL) traffic, requiring strict guarantees for each initiated transmission, e.g., control traffic, synchronizations or interrupts. Note that GL senders usually require low latencies but at the same time transmit short messages, i.e., they have a low communication volume.

Guaranteed Throughput (GT) traffic, requiring strict temporal guarantees per certain data/transmission volume. Consequently, the sender requires a service guarantee per whole logical transmission, which can be composed of several packets, rather than per each packet independently. For example, in the case of streaming applications using DMA engines for cyclic transfers, it is important that the whole chunk of data is available at a certain moment in time. Consequently, the latency of single packets may vary as long as the deadline for the burst is met.

Best Effort (BE) traffic, having no strict requirements with respect to timing, i.e., no strict latency and throughput requirements are known whose violation could endanger the system safety. However, this does not mean that BE senders have no timing requirements at all. BE traffic often benefits from lower latencies, as it originates from "general-purpose" applications such as typical desktop or infotainment functionalities. These senders (by design) profit from low average temporal metrics, e.g., low average latency or high throughput, since low latencies increase the utilization of the cores and therefore of the whole system. Low average latencies of BE traffic can be used, for example, to improve the user experience in the case of infotainment systems.
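The sketch below contrasts the GL and GT guarantees on one logical transmission (per-packet latencies, deadlines and the back-to-back completion model are assumed values for illustration):

    // GL: every packet meets its own deadline. GT: only the completion of
    // the whole burst matters (packets modeled as transferred back-to-back).
    #include <algorithm>
    #include <cstdio>
    #include <vector>

    bool glSatisfied(const std::vector<int>& latencies, int packetDeadline) {
        return std::all_of(latencies.begin(), latencies.end(),
                           [=](int l) { return l <= packetDeadline; });
    }

    bool gtSatisfied(const std::vector<int>& latencies, int burstDeadline) {
        int completion = 0;
        for (int l : latencies) completion += l;   // accumulated transfer time
        return completion <= burstDeadline;
    }

    int main() {
        std::vector<int> burst = {4, 9, 3, 12, 5}; // one DMA transfer, 5 packets
        std::printf("GL (10 cycles/packet): %s\n",
                    glSatisfied(burst, 10) ? "met" : "missed");
        std::printf("GT (40 cycles/burst):  %s\n",
                    gtSatisfied(burst, 40) ? "met" : "missed");
        return 0;
    }

Here the fourth packet alone violates a 10-cycle GL bound, yet the burst as a whole completes after 33 cycles and thus satisfies a 40-cycle GT contract; this is exactly the freedom a GT sender grants the arbiter.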

These traffic classes constitute the basis for the designs. Combinations of requirements are possible, e.g., high-latency, low-throughput traffic. Moreover, the knowledge and understanding of the workloads may be exploited to optimize the designs, for instance by safely postponing some of the packets from GT transmissions to improve the performance of BEs.

Consequently, the NoC architecture for real-time workloads should provide support for multiple traffic classes to ease their incorporation. This is reflected in requirement R1 "Traffic Types" in Table 1.2.

Moreover, whenever a NoC is used to host multiple applications with different requirements, this should be reflected in the interconnect arbitration. First of all, even if different transmissions have hard real-time or soft real-time requirements, these requirements need not be the same. As described in Section 1.2.2, some transmissions may finish later than others and still be on time. Therefore, the arbitration in the NoC should be fair: it should offer sufficient resources for each sender to comply with its requirements while avoiding overprovisioning. This is reflected in requirement R5 "Fairness" in Table 1.2.

As discussed in Section 1.3.1, this is especially important in mixed-critical setups, where there is usually no advantage in completing an HRTT earlier than required, since safety standards insist on the satisfaction of all timing constraints even if they are unnecessarily stringent, i.e., as long as an application completes by its deadline, its response time is not important. In contrast, transmissions from soft real-time and best-effort applications are rarely required to rigorously meet their deadlines, but at the same time they must still comply with the overall real-time performance objectives, e.g., low average latencies. However, many safety mechanisms compromise the performance of BEs and SRTTs, which should be avoided. This problem is reflected in requirement R6 "Mixed-Criticality" in Table 1.3.

Integration Efforts (R2, R8, R9)

The previous section explained why the application of NoCs to workloads in real-time and safety-critical domains is much more challenging than in the case of general-purpose computing. Note that the aforementioned temporal requirements do not exclude pursuing other design goals in conjunction. Consequently, in the ideal case, supporting real-time properties should be transparent to the other integrated applications and should neither decrease their performance nor require exceptional integration efforts.

Firstly, the integration of real-time senders in a NoC should not require extensive modification of the IPs, e.g., of legacy code or of the design of hardware components. This requirement is essential not only due to the increased production costs which such modifications could cause, but also because producers frequently apply proprietary components and have no rights and/or possibility to adjust them. Moreover, even slight adjustments in safety-critical domains could trigger the costly process of verification and re-certification. The requirement R2 "Workload Integration" in Table 1.2 reflects these challenges.

Moreover, mechanisms for QoS should not limit the scalability of NoC architectures. Although the strict separation of transmissions through static resource allocation assures QoS guarantees, it usually results in high overhead. For example, in performance-optimized NoCs, to guarantee the safety of critical senders, mapping algorithms try to avoid overlapping of paths used by safety-critical and non-critical, best-effort (BE) traffic, i.e., spatial separation of transmissions. Otherwise, the critical streams may not be able to meet their deadlines. Such a mapping typically leads to prolonged paths for BE traffic, which degrades the BE performance as these transmissions are often latency-sensitive [104]. Hence, although BE senders are rarely required to meet all deadlines rigorously, they must still comply with overall real-time performance objectives such as low average latencies. At the same time, static mapping based on the worst-case behavior frequently leads to a significant decrease of hardware utilization. In setups with sporadic safety-critical transmissions, the resources (paths) reserved for them are rarely used, e.g., [7]. Consequently, the performance of the NoC architecture should (in the optimal scenario) be influenced only by the workload initiated at runtime and not by static factors such as the number and type of integrated senders. The requirement R8 "Scalability" addresses this challenge. Finally, the NoC design can additionally assure the following traffic properties [37]:

traffic locality requires that: i) packets belonging to the same logical connection arrive in the same order in which they were injected into the NoC (in-order delivery) and/or ii) streams from two different senders targeting the same node do not mix in the NoC. This is especially important in the case of state-dependent peripherals, such as DDR-SDRAM memories, for which performance, response time and power consumption depend on the history of previous accesses [81].

bounded jitter/varying arrival rate, as some applications require a certain granularity of data to proceed with execution, e.g., the rate of arrival of video frames for a video decoder. The goal is to limit the arrival jitter which could appear when non-bursty traffic mixes with traffic that varies strongly over time.

In many deployment scenarios, protecting the locality of connections increases the utilization of the peripherals and IP components. This happens, for example, with the latencies of DDR SDRAM memories. Indeed, SDRAMs have an internal level of caching, which is equal to 8 kB in standard DDR3 modules. Consequently, contiguous and aligned 8 kB long transfers fully benefit from this caching. The requirement R9 "Locality" in Table 1.2 addresses these requirements.
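For illustration, the following minimal sketch (the row size is taken from the 8 kB DDR3 example above; the helper name is hypothetical) checks whether a transfer stays within a single row buffer and therefore benefits from this caching:

    # Sketch: check whether a transfer exploits DRAM row locality.
    # ROW_SIZE follows the 8 kB DDR3 example above (device dependent).
    ROW_SIZE = 8 * 1024  # bytes per internal row buffer

    def stays_in_one_row(start_addr, length):
        """True if [start_addr, start_addr + length) touches a single row."""
        first_row = start_addr // ROW_SIZE
        last_row = (start_addr + length - 1) // ROW_SIZE
        return first_row == last_row

    # An aligned 8 kB transfer hits exactly one row and keeps it open:
    print(stays_in_one_row(0x2000, 8192))   # True (0x2000 is 8 kB aligned)
    # The same transfer shifted by 4 kB spans two rows and loses locality:
    print(stays_in_one_row(0x1000, 8192))   # False

Interleaving unaligned or fragmented transfers thus directly translates into additional row activations and, consequently, longer and less predictable access latencies.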

Incorporation of Dynamics (R4)

Traditionally, Quality-of-Service (QoS) mechanisms solve the problem of interference between concurrent transmissions in real-time NoCs by minimizing non-functional dependencies at the cost of less efficient communication protocols. Indeed, the frequently applied static resource allocation makes communication timing straightforward but results in a significant performance decrease. However, this is contradictory to the requirements of highly dynamic, contemporary applications in real-time domains that demand allocation flexibility and extensive on-chip reactivity, e.g., advanced driver/pilot assistance systems, industrial robotics, autonomous flight/drive systems. For instance, automated driving introduces a transition of decision making from the driver to the vehicle. Decisions and actions must be taken in the context of a dynamic, continually changing environment, e.g., the situation on the road or weather conditions. Frequently, the concurrent execution of complex applications (including high-resolution sensor fusion and data processing combined with machine learning) is used to address this challenge. At least a part of these functions will be implemented on a many-core system to reduce the number of components and cost. Consequently, NoCs must efficiently host new sets of highly dynamic workloads, and the behavior of the system may be influenced by many external factors. Tasks may modify their transmission profiles at run-time depending on the arriving sensor data. Moreover, sets of applications may be initiated dynamically, introducing system modes with changing traffic patterns and mapping or workload profiles, e.g., a convolution of a neural network used for decision making. Similarly, sensor fusion and data crunching bring up data-dependent execution, resulting in highly dynamic task behavior and high jitters. Finally, dynamics can result from transient errors and the mechanisms handling them, such as data re-transmissions or acknowledgements. Therefore, the control used for the underlying NoC architecture must protect the system safety against pitfalls resulting from dynamic behavior (e.g., transient overloads, data loss, high transmission latencies and missed deadlines) while delivering the requested performance. The requirement R4 "Dynamics" in Table 1.2 addresses this challenge.

Efficient Verification (R10)

As discussed in Section 1.3, safety standards enforce separation of the design and specification from the implementation. Therefore, designers of the system should provide methods for verification and testing of the NoCs apart from the functioning implementations. In this context, some design decisions may significantly increase the difficulty of formal verification. For instance, complex mechanisms for assuring high average performance in mixed-critical setups frequently lead to a fine-granular traffic classification which in turn is dominated by the real-time requirements. Consequently, routers and NIs become complex (e.g., extra header information, translation tables, additional logic), making them more difficult to analyze and formally verify. As a consequence, the production costs and necessary efforts increase. On the other hand, it is still essential to share the NoC resources (i.e., allow as many communications as possible) and to assure that transmission latencies are low (i.e., efficient synchronization with minimum impact on senders' performance). This additional validation effort should be as small as possible, which is addressed in R10 "Verification" in Table 1.2. Note that formal verification must also account for the effects of dynamics described in the previous paragraphs. This incorporation of dynamics may often be complicated and contradictory to other mechanisms in the NoC, for instance, fail-safety. In fact, a load of 50% is already considered to be significant in the case of automotive systems exposed to dynamics in the system's behavior [118]. Indeed, results from Daimler [118] state: "... it is accepted that the cyclic busload should not exceed 50% (the exact upper bound depends on the OEM)" and later "... In networks dominated by event-driven communication, the theoretical worst case load then overshoots 100% and may reach 500%.". Bosch [156] supports these claims. Otherwise, an OEM would not be able to provide the formal guarantees required by the standards.

Safety and Real-Time as Features of the Architecture (R3, R7)

In many architectures proposed by academia, support for real-time and safety features plays the central role, and therefore other practical properties are neglected. It is important to mention that although the automotive electronics market is the fastest growing among embedded systems [98], it still covers less than 20% of the overall production. Moreover, real-time workloads are usually in the minority of developed and deployed applications. Consequently, from a manufacturer's point of view, the best solution would be to provide a generic platform which can simultaneously handle general-purpose, BE applications and, whenever necessary, real-time and safety-critical features at low cost. The integration of BE and real-time (safety-critical) features would allow increasing the production series of SoCs and leave it to the integrator if and to what extent a particular design should incorporate these specific requirements. Consequently, a future NoC architecture should allow safety and/or predictable timing as a feature of the system and not its sole purpose. This challenge is reflected in the requirement R7 "Switch-off QoS" in Table 1.2. Of course, such an approach is only feasible if the overhead resulting from the domain-specific mechanisms is low in comparison with the overall production costs of the IP component. Similar considerations apply to the methods/mechanisms used for supporting temporal and safety requirements. The workloads in embedded domains may differ significantly depending on the applications. For instance, in some setups, the system integrator may only have safety-critical tasks. These safety-critical tasks are usually highly optimized and transmit their data using a regular pattern with a pre-defined transmission size, e.g., feedback loops used in control engineering. However, in other applications such as ADAS, there may be a lot of sporadic and dynamic sensor data which, as discussed before, requires different arbitration and policies for efficient work. Therefore, the same generic NoC should (in the optimal scenario) offer designers the possibility to select an optimal Quality-of-Service mechanism and strategy for resource reservations depending on the particular deployment. The requirement R3 "Flexibility" in Table 1.2 addresses this challenge.

1.5 Thesis Contribution and Outline

As highlighted in the preceding sections, in many embedded applications high-performance requirements have reached systems with real-time and/or safety-critical elements, e.g., Advanced Driver Assistance Systems or Flight Management Systems. These setups form the major challenge for real-time NoC architectures as they need to handle the dynamics of the system's behavior and to increase utilization/performance, while preserving predictability. The presented work addresses these challenges and introduces a novel, global and dynamic arbitration for NoCs with real-time QoS requirements. The scheme uses key elements of Software Defined Networking (SDN) [86] and adapts them to the needs of real-time NoCs. Consequently, it decouples admission control from arbitration in routers and separates the NoC into a (virtual) QoS control plane and a data plane, see Figure 1.9. The global QoS arbitration is achieved through the adoption of admission control in nodes based on contract-based negotiation between senders. The resource manager must validate whether the currently available NoC resources are sufficient for the change in QoS before the application receives physical access to the NoC.
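The following conceptual sketch illustrates this admission check; the class, the simple per-link bandwidth model and all names are illustrative assumptions, not the actual implementation presented later in this thesis:

    # Sketch: contract-based admission control in a Resource Manager (RM).
    # The per-link bandwidth model and all names are illustrative.

    class ResourceManager:
        def __init__(self, link_capacity):
            self.capacity = dict(link_capacity)  # free bandwidth per link
            self.granted = {}                    # channel id -> (path, bw)

        def request_channel(self, chan_id, path, bandwidth):
            """Grant the channel only if every link on the path still has
            enough spare capacity for the requested QoS contract."""
            if any(self.capacity[link] < bandwidth for link in path):
                return False                     # reject: sender must wait
            for link in path:
                self.capacity[link] -= bandwidth
            self.granted[chan_id] = (path, bandwidth)
            return True                          # sender may access the NoC

        def release_channel(self, chan_id):
            """Return the reserved capacity once the transmission finishes."""
            path, bandwidth = self.granted.pop(chan_id)
            for link in path:
                self.capacity[link] += bandwidth

The decision is thus taken against the global allocation state (all currently granted channels) rather than against the local state of individual routers.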

Figure 1.9: NoC extension for decoupling of traffic arbitration in routers and end-to-end, dynamic channel reservations. (Legend: RM: resource manager; NI: network interface; R: router; MI: memory interface.)

The first advantage of such an approach is that it is aware of the QoS functions of the routers. It can adjust to them, and no custom QoS-oriented NoC extensions are necessary. Furthermore, the solution allows the deployment of arbitration based on the global state of the NoC, defined by the currently running senders and the occupied resources. Such a scheme offers contract-based QoS provisioning at different granularity levels, e.g., (dynamic) traffic shaping offering a predictable allocation of resources for whole transmissions (constructed of several packets). As a result, QoS arbitration can be optimized in a network-wide fashion depending on the current state of the system. The introduced dynamic adaptation to the changing system behavior/mode/environment based on the global system state offers high performance of the NoC arbitration while preserving efficient real-time and/or safety guarantees. To address the previously described research objectives, this thesis has the following structure:

Chapter 2 provides an overview of existing mechanisms and solutions for handling safety-critical workloads with real-time requirements in NoCs. This chapter also highlights why the existing solutions cannot simultaneously provide high performance and an effective upper bound on the worst-case latencies.

Chapter 3 introduces the design of the QoS control layer. The description includes the establishment of design requirements, the presentation of different allocation schemes, and the implementation steps. This chapter also highlights how and why the QoS control layer can solve the problems (i.e., trade-offs between utilization and performance) in most existing NoCs.

Chapter 4 focuses on tools and methods required for the implementation of the mechanism. The tools section describes both the timing analysis providing guaranteed worst-case behavior and the simulation environment for the evaluation of performance in regular operation. Finally, possible implementations are detailed, including solutions entirely in software, in hardware, or as software/hardware co-design.

Chapter 5 introduces the experimental evaluation, including case studies with commercial and research NoCs. These case studies provide the results from different implementations of the control layer along with an assessment based on benchmarks from the automotive and avionic domains.

Chapter 2: A Survey of Mechanisms for Supporting Real-Time in NoCs

The integration of traffic from multiple senders in the same Network-on-Chip results in setups where concurrent transmissions compete for interconnect resources such as link bandwidth and buffer space. In real-time systems, the most important challenge is adherence to the temporal requirements of applications, which dominates other requirements, e.g., high average performance. Consequently, real-time NoCs must provide mechanisms for supporting predictable arbitration. This means that such networks should offer a predictable upper bound on the worst-case end-to-end latencies. Moreover, whenever the NoC runs as a part of a safety-critical system, there is a need to verify that property during the design process and integration, as requested by safety standards. In order to achieve predictable timing, NoC architectures offer two main control features, cf. [37]:

flow control deciding about the scheduling and allocation of the network's resources, such as channel bandwidth, buffer space, and control state. Flow control is usually realized through control path components, i.e., arbiters, in routers. The duration of a single grant depends on multiple factors. For instance, some requesters may require uninterrupted access to the resource for a number of cycles.

admission control deciding about who can initiate a transmission and when, i.e., when an application can access the NoC. It is usually realized in the network interfaces of processing nodes, e.g., rate limiters enforcing a minimum temporal distance between two consecutive transmissions from the same processing node.

This chapter presents an overview of available methods and mechanisms for assuring temporal properties in a NoC. The presented survey encompasses temporal and spatial isolation methods. As will be shown, this challenge of providing real-time predictability (and safety whenever relevant) is not trivial as these mechanisms must fulfill multiple design goals at once, as described in the previous chapter. For instance, they must assure predictable behavior and performance simultaneously. Moreover, it is necessary to cope efficiently with the system's dynamics resulting from the application behavior (e.g., activation jitter, transmission durations) as well as the physical environment (e.g., sensor data, off-chip communication). Finally, safety standards for higher safety levels require verification by formal methods.

2.1 Spatial Isolation of Traffic in Networks-On-Chip

The most straightforward method for achieving temporal predictability in real-time NoCs is spatial isolation, i.e., assigning each of the senders (or sender/traffic classes) its own set of NoC resources, i.e., links and buffers. This is commonly achieved through static load distribution with oblivious routing [37]. When applied, the paths followed by packets during runtime are independent of the current state of the NoC and determined only by the source and destination of the data stream. Consequently, in mixed-critical NoCs, spatial isolation limits the interference between senders of different criticality. Note, however, that in such a NoC all transmissions inherently suffer from the main drawback of oblivious routing - the lack of load balancing during runtime, leading to an under-utilization of resources and performance bottlenecks. In fact, for every oblivious routing algorithm there is a traffic pattern that causes a large load imbalance [37]. Consequently, as will be discussed in this section, this method, although relatively straightforward, is contradictory to multiple requirements established in Section 1.2.1. This is especially problematic in mixed-critical and safety-critical setups as discussed in Chapter 1. The spatial separation forbids sharing of the NoC's resources (buffers and link bandwidth) between BE senders and HRTTs. This is enforced by appropriate mapping and path allocation, e.g., [6, 52]. Although the implementation of this method is relatively simple and provides the requested predictability, it simultaneously decreases both hardware utilization and senders' performance in all traffic classes. BE applications are frequently forced to take non-optimal (i.e., longer) routes to avoid interference with HRTTs. NoCs suffer from an imbalance in link occupation. For instance, if an HRTT sender does not have any pending transfer, its path remains unused even if BE links are heavily loaded; e.g., timing-sensitive transmissions from an Ethernet port occur rarely when compared to the cache traffic of BE tasks [6].

Additionally, strict separation may lead to a segmentation of the NoC and unused/underutilized cores [52]. Moreover, in commercially available NoCs, e.g., Tilera [151] or MPPA [40], nodes are heterogeneous and constructed of processing cores as well as peripherals, e.g., I/O interfaces and memory controllers. As these different hardware units have a fixed position in the system, certain critical communication paths are also fixed, and interference between senders with different criticality levels is inevitable (i.e., it cannot be solved by mapping). This endangers the temporal predictability of the system and thus its real-time and safety properties. Consequently, temporal isolation is the only feasible solution in many setups. Although spatial isolation has significant performance drawbacks, it is frequently used in practice due to easy verification. As discussed in [73] or [59], NoC architectures with support for quality-of-service have high production costs. In [73] an approach based on simplifying the router structure was proposed. As shown experimentally, by removing the complexity of a baseline router, the low-cost router microarchitecture can also approach the ideal latency in on-chip networks. However, as discussed in Section 1.2.1, router simplification may endanger safety. In [73] quality of service was implemented by delaying the rate credits that are returned upstream. The evaluation reported that the proposed architecture can decrease the area by almost 40% and the power consumption by 45% when compared to standard performance-oriented solutions. Such results encouraged research and design efforts towards multi-layer networks. These architectures are based on the assumption that the under-utilization of the NoC resources can be compensated by the simplicity and low costs of the router construction. Consequently, such MPSoCs support several physical NoC layers for spatial isolation. These architectures are popular especially in commercial applications as they simplify the verification and control of the MPSoC.

The Tile64 NoC [151] from Tilera supports five independent networks which are designed to serve different traffic classes: 1) user dynamic network (UDN), 2) I/O dynamic network (IDN), 3) static network (STN), 4) memory dynamic network (MDN), and 5) tile dynamic network (TDN). Four of them (1, 2, 4, 5) are realized with the same dynamic router architecture supporting dimension-ordered wormhole routing. The main goals are to preserve the order of messages and to offer flow control and reliable delivery for short transmissions from BE and safety-critical applications (control-oriented). The STN network is designed as a circuit-switched network for streaming applications. Each streaming sender may set up a route through the network and stream directly without interference. The MPPA NoC from Kalray [40] offers two layers, C-NoC and D-NoC. The first one is designed for worst-case-latency-sensitive short control messages, the second for bandwidth-critical data transmissions. The routers in both layers are identical with respect to the applied 2D torus topology and the wormhole switching. The differences appear at the device interfaces and in the size of the router buffers. A similar approach is proposed by [55], where BE and safety-critical traffic are separated into two different planes. The layer for HRTTs uses routers supporting TDM slot allocation done in the routers, whereas BE messages are transmitted using a performance-optimized router architecture. Another possible application of spatial isolation is dividing the whole NoC into regions, i.e., assuring through predictable routing and appropriate application mapping that traffic from a specific class does not interfere with other traffic, e.g., BE senders and HRTTs never share links. For instance, the KiloNoC [57] architecture isolates shared interconnect resources into one or more dedicated regions called shared regions (SRs). If QoS is necessary within selected SRs, these SRs may provide additional hardware support allowing the accommodation of senders with real-time and/or safety requirements. The rest of the network is freed from different temporal requirements. These mechanisms are based on prioritizing accesses to virtual channels (buffer queues) in routers. Moreover, there is still the option to switch off QoS in the remaining/all regions and by doing so offer maximum performance. A similar approach for the Kalray MPPA architecture was followed by [6]. The authors proposed a methodology for deriving a mapping that contains the traffic of certain applications within selected regions, i.e., initiated transmissions can reach only a selected subset of the available nodes and therefore traffic of different criticalities never interferes.
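Since oblivious routing fixes all paths at design time, the spatial-isolation property itself can be checked statically. The following sketch illustrates this for a 2D mesh with dimension-ordered (XY) routing; it is a simplified, hypothetical helper, not taken from the cited works:

    # Sketch: static check that critical and best-effort flows never share
    # links, assuming dimension-ordered (XY) routing on a 2D mesh.

    def xy_path_links(src, dst):
        """Return the set of directed links (node, node) of the XY route."""
        (dx, dy) = dst
        links, cur = set(), src
        while cur[0] != dx:                      # route along X first
            nxt = (cur[0] + (1 if dx > cur[0] else -1), cur[1])
            links.add((cur, nxt)); cur = nxt
        while cur[1] != dy:                      # then along Y
            nxt = (cur[0], cur[1] + (1 if dy > cur[1] else -1))
            links.add((cur, nxt)); cur = nxt
        return links

    def spatially_isolated(critical_flows, be_flows):
        crit = set().union(*(xy_path_links(s, d) for s, d in critical_flows))
        be = set().union(*(xy_path_links(s, d) for s, d in be_flows))
        return crit.isdisjoint(be)

    # Flows are (source, destination) pairs of mesh coordinates.
    print(spatially_isolated(critical_flows=[((0, 0), (2, 0))],
                             be_flows=[((0, 1), (2, 1))]))  # True: disjoint

As discussed above, such a check fails by construction whenever fixed peripherals (e.g., memory controllers) force flows of different criticalities onto the same links.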

2.2 Temporal Isolation of Traffic in Networks-on-Chip

Temporal isolation is the frequently applied solution to the problem of temporal predictability and safety in the majority of NoCs. According to this approach, senders with different requirements with respect to the interconnect resources are allowed to share links and buffers. Therefore, providing formal timing guarantees for safety-critical traffic in NoCs is usually done through [55]:

identification of interfering senders,

temporal synchronization of transmissions competing for shared resources.

In the majority of NoCs, these two actions are considered to be largely orthogonal [37] and are therefore separated and designed independently in order to enforce predictable behavior [100]. The set of interfering senders can be bounded with deterministic or oblivious routing [37], where the paths that transmissions may use are selected during the design phase and remain static during runtime. The synchronization of concurrent accesses in time can be done in routers as well as in network interfaces with a selected temporal arbitration method. There exist two dominant groups of mechanisms for temporal isolation: Time-Division Multiplexing (TDM) and predictable arbitration in routers. The former offers static guarantees which are independent of the behavior of other synchronized senders. The latter offers guarantees which depend on the behavior of other senders, but this interference can be safely bounded, e.g., through the prioritization of traffic in routers. However, the market success of a NoC architecture depends mainly on two factors: an efficient scheduling methodology that does not sacrifice performance, and a router/switch design that is competitive in terms of area and power in comparison with a conventional NoC architecture. As we will argue in the remainder of this section, these goals are largely contradictory and each of the methods results in hard trade-offs between performance and temporal predictability.

2.2.1 TDM-Based Networks-on-Chip Architectures

The first group of mechanisms allowing temporal isolation in a NoC is based on the paradigm of Time Division Multiplexing (TDM). In the TDM approach, each sender receives, in a cyclic order, a dedicated time slot for exclusive access to the NoC [92]. Consequently, TDM-based systems are relatively easy to implement and analyze. The implementation of TDM in a NoC can differ w.r.t. link sharing, routing, connection setup, end-to-end flow control and the supported connection types. An extensive survey of TDM-based NoC architectures is available in [134]. This section concentrates on the significant representatives. The simplest and most straightforward approaches for time-division multiplexing (TDM) in a NoC utilize circuit switching. In [129] the whole system is considered to be a globally shared resource. Consequently, the NoC architecture provides directed virtual circuits with the same bandwidth for connections between nodes. The establishment of a dedicated communication channel (circuit) through the network is conducted for a predefined amount of time (time slot) and repeated cyclically. A similar approach was proposed by [93]. This NoC architecture offers TDM-based arbitration applied to the usage of virtual channels. Consequently, time slots are assigned based on path and VC pairs. Both works have shown that circuit-switched TDM saves resources (e.g., buffers) because once packets are granted access to the NoC, they can proceed at full speed due to the absence of interference.

However, although TDM-based NoC architectures allow an easy implementation and provide timing guarantees, they suffer from severe drawbacks. Firstly, the introduced scheduling is static and non-work-conserving, i.e., the worst-case latencies depend on the number of synchronized senders and the duration of their time slots and not on their actual activity during runtime. This causes scalability issues, i.e., a high number of synchronized senders (even if they use the NoC sporadically) results in a long TDM cycle and pessimistic guarantees. Moreover, if transmissions are not perfectly synchronized with the cyclic resource allocation scheme, average latencies are very close to the worst-case scenario even when the system is not highly loaded [59]. Therefore, to achieve high average performance, dedicated solutions are necessary, such as an offline generated schedule statically applied to the whole NoC [129]. This is especially visible in the presence of dynamics in the behavior of senders, such as activation jitter or variable resource usage (e.g., the length of a transmission). Consider the example depicted in Figure 2.1, composed of two applications running in a NoC-based MPSoC arbitrated using TDM. App1 is composed of three tasks (A, B and C) with precedence constraints and App2 is composed of one task (D). If the behavior of the system were static and predictable, the running tasks could be synchronized with the TDM cycles, see Figure 2.1.a), exploiting their periodic activation scheme where the start of an application's transmission occurs in a cyclic way whenever it is granted a time slot. This assumption is unrealistic in most setups. The cyclical repetition of the senders' behavior must match the cyclical repetition of the TDM cycles. In the case of sporadic (e.g., coprime) repetition rates of activations and/or bursty memory access patterns of the applications, optimization of the TDM cycle is only possible with extremely long TDM cycles or with extremely short cycles (a divisor of 1), as described in the next section. Therefore, even with strictly cyclical activation in a fully predictable environment, full resource utilization cannot be guaranteed in practice due to the nature of the underlying workloads. Moreover, whenever tasks expose dynamics in their behavior, even with a small jitter, transmissions are blocked and their execution is delayed for the duration of a whole TDM cycle, see Figure 2.1.b). The activation of task A arrives with a small jitter and misses its slot, therefore tasks B and C cannot start their execution before the next cycle. Consequently, task B is granted access in the second cycle, but due to the activation jitter of task C, a third TDM cycle is required for App1 to complete its execution. The resulting architectures are not flexible, and the worst-case guarantees can easily be endangered by modifications of the running applications. Changes in the toolchain, in the distribution of the load on the cores, or modifications of the software (adding/removing tasks) as well as the hardware may cause additional jitter, making the TDM schedule and/or the safety guarantees effectively infeasible. Finally, if an application does not have any pending transfer, its time slot is wasted. This prevents the slack bandwidth from being redistributed to improve the system's performance.
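The full-cycle penalty discussed above can be reproduced with a toy model of the scenario in Figure 2.1; the slot table and all timings are illustrative assumptions:

    # Sketch: worst-case effect of activation jitter under TDM.
    # One slot per sender, repeated cyclically; timings are illustrative.
    SLOTS = ["A", "D", "B", "C"]   # TDM table, one slot per sender
    SLOT_LEN = 10                  # cycles per slot
    CYCLE = len(SLOTS) * SLOT_LEN  # 40 cycles per TDM cycle

    def departure(sender, ready_at):
        """Start time of the sender's next slot at or after ready_at."""
        offset = SLOTS.index(sender) * SLOT_LEN
        k = max(0, -(-(ready_at - offset) // CYCLE))  # ceiling division
        return offset + k * CYCLE

    print(departure("A", 0))   # 0  : perfectly synchronized, no waiting
    print(departure("A", 1))   # 40 : one cycle of jitter costs a full TDM cycle

The penalty is thus independent of the actual load: even an otherwise idle NoC delays the jittered transmission by the whole cycle.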

Figure 2.1: Effect of jitter on the service guarantees of applications synchronized with TDM. (The figure contrasts a) an ideal TDM schedule with b) TDM under dynamics, marking used slots, unused slots and activation jitter over time.)

Consequently, TDM-based NoC architectures move the main design effort from the NoC arbitration to the software/on-core layer, which must optimize the behavior of the whole system with respect to the NoC arbitration, although this effort is not visible on the NoC layer. Using legacy code without large modifications leads to a substantial performance decrease [59], resulting in unfulfilled design requirements.

TDM-based Mechanisms with Reduced Adverse Performance Impacts

Such a drastic decrease in system performance forces designers to optimize their setups, e.g., by reducing the number of tasks running on the processing nodes or by modifying the functionality and the number of tasks, i.e., a move from requirements resulting from the nature of the controlled processes to requirements enforced by the platform. As the overhead of TDM depends on the duration of the cycle, i.e., the number of applications and the size of their time slots, and not on the frequency of their accesses to the interconnect, a popular solution is to decrease the slot granularity. Using large slots, for instance adjusted to whole DMA transfers from streaming applications, increases the latency of all senders, e.g., [129], [56]. Consequently, architectures with short slots (e.g., a single packet/flit) have been proposed. TDM with short slots shortens the TDM cycle and decreases the overhead resulting from dynamics. Prominent examples are Aethereal [56], dAElite [134] and Argo [72]. The Aethereal [56] architecture supports two physical layers (spatial separation) for the accommodation of BE and safety-critical traffic. BE traffic can either be transmitted using distributed or source routing with a conventional packet-switched router architecture, or it may use otherwise unused time slots of real-time senders [59].

BE traffic is also used to set up the safety-critical connections, i.e., to configure the slot tables. The separate physical layer mitigates the overhead for the non-TDM-optimized traffic. However, this comes at a substantial hardware cost, i.e., in fact such systems require two separate NoCs with different router architectures. The real-time traffic is routed statically and contention-free using router slot tables. Slots are short and adjusted to single flits. Consequently, it is possible to avoid buffering for the time-critical traffic class. However, the NoC requires a complex calculation of a slot assignment allowing transmissions without interference, and dynamics cannot be efficiently captured in time-critical applications. Consequently, recent studies suggest that the aethereal architecture is not very cost-effective [55]. aelite [55] was proposed as a lightweight version of aethereal. The main goal was to decrease the hardware overhead of the solution by moving the routing from the routers to the network interfaces. Additionally, support for mesochronous networks has been added, i.e., NoCs that have a single clock frequency but asynchronous (not in the same phase) clocks. dAElite [133] is another TDM-based NoC with support for short slots, which are two words long and can be further decreased to a single word if necessary. Finally, Argo [72] combines TDM and GALS/mesochronous arbitration - the network clocks of the interfaces may have different phases and the routers are asynchronous. Similarly to aethereal and dAElite, Argo avoids all buffering and (run-time) flow control due to the lack of direct interference in the NoC. The common drawback of architectures utilizing short TDM cycles is that short TDM slots lead to a distribution of longer transmissions over several TDM cycles, even when the remaining slots are not used, which drastically decreases performance. Consequently, transmissions are unnecessarily slowed down even if the NoC is empty. Moreover, too short TDM slots are undesirable when accessing state-dependent shared resources that benefit from spatial locality, such as DDR memories. In the case of short TDM slots, longer transmissions are distributed over multiple TDM cycles and packets from different senders may freely interleave in the memory controller, potentially leading to undesirable timing effects [54]. Therefore, in order to employ short TDM slots in safety-critical systems, sufficient independence between interfering senders must be enforced using specialized predictable memory controllers which, to deal with the lack of locality, employ a combination of a close-page policy and static bank interleaving. This introduces custom hardware, increasing costs and power consumption. Finally, the scalability problems remain, i.e., utilization is low as unused slots cannot be handed over, and the length of the TDM cycle depends directly on the number of synchronized applications. To mitigate these scalability issues, the multiplexing of time slots between several channels with independent TDM schedules was proposed in [59].

The effectiveness of such approaches depends directly on the number of independent channels as well as on their utilization, and it requires additional arbitration effort, thereby increasing the costs of design and verification. If there are few heavily utilized channels, the performance improvement will be low. Moreover, these approaches rely on static budgets, which leads to the same problems as in the case of the TDM-slot granularity. Other approaches, such as SurfNoC [150] and PhaseNoC [116], employ optimized TDM scheduling to minimize the latency overhead. This is performed by replacing the cycle-by-cycle TDM schedule with more flexible solutions, e.g., domain-oriented waves. Although these mechanisms decrease the negative side effects of TDM arbitration, they do not fully eliminate them, providing only an improved solution. Moreover, such TDM-based architectures require complex routers, thereby drastically increasing the hardware costs and power consumption.

2.2.2 Traffic Isolation in the Router

Due to the aforementioned limitations, static non-work-conserving arbitration is not a feasible solution in many setups with dynamic workloads. The alternative to TDM is based on two main components: local arbitration in routers and rate control for the transmissions initiated by the processing nodes. This well-known solution from off-chip networks [155] has been extended and adjusted for the purpose of on-chip interconnects, e.g., [31, 66, 24]. In such NoCs, transmissions must acquire output ports in routers locally and independently as they progress through the interconnect. The router ports are the only points where queuing occurs and can be modeled as queuing servers. The arbitration between transmissions is performed locally and independently within each of the routers. Contrary to TDM, these QoS mechanisms do not prevent interference between senders but rather tend to safely control its effects. Consequently, the offered service for a particular safety-critical transmission depends on the behavior of other streams. For instance, such a mechanism may not avoid interference due to other traffic in a router, but by selective prioritization it can ensure that a packet blocked at an output port cannot block a link for other arriving packets, i.e., it prevents head-of-line blocking. In the following sections, existing solutions are briefly discussed considering their priority assignment schemes. The description distinguishes between NoCs in which priorities are assigned to transmissions offline and are therefore static at runtime, and real-time NoCs in which the priorities of a sender may change dynamically during runtime (see the following subsections of Section 2.2.3). This follows the commonly used classification from related research [30], [92].

2.2.3 Performance Optimized NoCs in the Real-Time Context

The majority of generic performance-optimized NoC architectures apply traffic arbitration done locally and independently in routers. However, their main goal is to deliver the lowest possible average latencies and a high average throughput in the NoC, combined with some packaging and cost constraints. These criteria are predominant and drive other design choices, e.g., topology, routing, and flow control. In such performance-optimized NoCs there is no distinction between different traffic classes. All ongoing transmissions compete for output ports (link bandwidth) and virtual channels (buffer space) and receive equal treatment. Therefore, without appropriate quality-of-service (QoS) mechanisms, resources are not reserved in advance, i.e., packets are switched as soon as they arrive. Moreover, some interference cannot be resolved locally by the router's arbiter and requires input from adjacent neighbors. Therefore, verification is difficult, as a single operation may involve multiple routers (independent IPs) as well as multiple arbiters within a router, e.g., a transmission over several routers on a single path with multistage schedulers. This results in a complex spectrum of direct and indirect interferences between data streams which may endanger the system safety [44], [31]. Although recent works from the real-time analysis domain, such as [44], [142], proved that standard NoCs, even with complex two-staged arbitration (e.g., iSLIP) and virtual channels (VCs), are analyzable, allowing the computation of the worst-case network latency, this analysis is based on several demanding assumptions. Firstly, reasonably tight guarantees can (typically) be provided only by assuming a predictable and a priori known behavior of all potentially conflicting senders. Unfortunately, this is not the case for most general-purpose applications. Secondly, all traffic should belong to the same class (e.g., RT and/or safety-critical) or must be spatially separated from other senders. In order to design such a mechanism, one must take into account that, in SoCs with performance-optimized NoCs, nodes are usually heterogeneous and constructed of processing cores as well as peripherals, e.g., I/O interfaces, Ethernet or DRAM controllers. For instance, Tilera's Tile64 provides 3 Ethernet interfaces and 4 memory controllers while the MPPA provides 8 Ethernet interfaces and 2 memory controllers. These different hardware units have a fixed position in the system. Therefore, when applied in real-time setups, certain critical communication paths are also fixed (i.e., defined by the static routing algorithm), e.g., the communication from an Ethernet controller to the DRAM memory. This frequently makes spatial separation unfeasible (i.e., it is impossible to avoid overlap in communication), as already discussed in Section 2.1. Consequently, when applied in real-time systems, performance-optimized NoCs suffer from drastic resource overprovisioning or unfulfilled design requirements.

Static Priority Assignments

Work-conserving QoS mechanisms in the domain of real-time NoCs frequently utilize static priority preemptive (SPP) scheduling, see Figure 2.2. According to this scheme, transmissions from senders with different requirements with respect to the interconnect, i.e., different criticalities, are statically mapped to different virtual channels (VCs) in routers to assure functional independence on the data layer. The arbitration is then conducted locally and independently in each router. Transmissions mapped to different VCs can preempt each other. The granularity of the preemption depends on the selected router architecture and switching method. For instance, in commonly deployed NoCs with wormhole switching and virtual-channel flow control [37], the preemption can be done at the packet or flit level. The scheduling may optimize different NoC properties such as latency, buffer sizes and utilization [155]. Exemplary NoC architectures following this principle are QNoC [24], MANGO [22] and KiloNoC [57]. The QNoC architecture uses four traffic classes: signaling (for inter-module control signals), real-time (representing delay-constrained bit streams), RD/WR (modeling short data accesses) and block-transfer (handling large data bursts). The isolation of the different traffic classes in routers is done using a static priority assignment scheme. Routers use wormhole flow control, i.e., buffer management and arbitration are performed on flits. Packets are built of a head flit, multiple body flits and a tail flit. Packet routing is resolved upon the arrival of the head flit. Resources are released after receiving the tail flit. Packets of different priority are stored in separate buffer queues. Each output port schedules the transmission of flits according to the availability of buffers in the next router, the priority (namely service level) of the pending flits, and the packet-based round-robin ordering of the input ports awaiting transmission of packets within the same service level. Preemptions, i.e., interruptions of the currently ongoing transmission of packets with lower priority, are done at the flit level. An important functional limitation of the QNoC architecture is that it does not distinguish between the different requirements of RT applications within the same traffic class. Consequently, if different senders belong to the same traffic class, they must compete with each other and the conflicts are resolved using round-robin scheduling locally in routers. Another implementation of this general principle is offered by MANGO [22], implementing virtual circuits. The MANGO ("Message-passing Asynchronous Network-on-chip providing Guaranteed services over Open core protocol (OCP) interfaces") NoC presents an asynchronous architecture where BE and safety-critical traffic are separated in each router.

Consequently, each router supports two different arbiters, i.e., one for BE and one for GS traffic, and RT and BE are mapped to subsets of the available VCs. The arbitration for BE traffic is done through simple packet-switched transmissions using source routing. The RT traffic uses virtual circuits - reserving sets of buffers in different routers for end-to-end connections. Real-time traffic (e.g., SRTT or HRTT) is prioritized over BE. Within the same traffic class (BE or RT), a fair (round-robin) or a non-symmetric arbiter can be applied. The latter uses strict priorities to provide non-symmetric latency guarantees. Similarly to MANGO, KiloNoC [57] applies the separation of RT and BE traffic in different virtual channels. Consequently, a baseline router features 25 VCs (and therefore 25 priority levels) for accommodating incoming traffic. Each of the VCs in a router has a buffer capable of storing one packet built of four flits. Therefore, each router has 750 buffers belonging to different VCs and 3000 flit slots, as it supports 30 ports. To limit this substantial hardware overhead, it is possible to switch the support for safety-critical applications on and off for selected NoC regions, cf. Section 2.1. If QoS support is off, each router port serves just one VC per packet class and there are two priority levels (requests at low priority and replies at high priority). The required per-port buffering is reduced to 70 flits, compared to 100 flits in a QoS-enabled router (25 VCs with 4 flits per VC). As confirmed using formal worst-case analysis methods and simulations, e.g., [31], [66], [24], the prioritization of traffic performed locally in routers (switches) provides HRT guarantees for the synchronized applications. However, this method has several drawbacks which limit its applicability. First, the improved work-conserving guarantees offered by this arbitration are possible only under the assumption that packets can be forwarded as they arrive [44], [56], i.e., there is no back pressure and no correlation between routers. To guarantee that, larger buffers must be applied compared to TDM. Otherwise, the propagation of blocking leads to cyclic dependencies, and either the systems are hardly analyzable or no guarantees can be given. Second, the scalability of these solutions in a mixed-critical context depends directly on the amount of available hardware resources. If the number of VCs is not equal to the number of criticality levels in the system, the formal analysis becomes complex and the guarantees can be pessimistic or cannot be given at all, as it is necessary to aggregate multiple streams within the same class. Therefore, there is a predominant trade-off between real-time predictability and the amount of HW resources used for the implementation. However, the major challenge emerges from the constant increase in the number of applications integrated into a single chip, such as the Flight Management System [46], an application encompassing multiple tasks with different criticality levels. Note that the number of VCs is independent of the activities of the senders, i.e., the number and frequency of the transmissions initiated by them.

Figure 2.2: A NoC router architecture supporting static priorities which are assigned to independent buffer queues (virtual channels).

This increases the cost of design and power consumption because the hardware must be optimized with respect to the worst-case behavior of the running applications and not their average performance. The worst-case dimensioning is usually overly pessimistic, thus drastically decreasing the utilization of the system, as in the case of TDM. This effect can be observed in the case of KiloNoC. Although it is theoretically capable of handling 25 priority levels, it can apply the priority-based arbitration only to selected parts of the NoC, i.e., other parts of the NoC use simpler routers. This is due to the high hardware overhead resulting from storing streams of different priority in separate queues, which increases the resource overhead. Consequently, the overhead increases exponentially with the number of supported priority levels. Finally, priority-based arbitration is adjusted to the flow-control units (e.g., packets or flits), not to logical transmissions, which are frequently composed of multiple packets. This leads to design challenges known from TDM-based arbitration, e.g., locality, in-order delivery, the assumed error rate or burstiness, which in many setups may drastically decrease the utilization of peripherals and memories. For instance, as discussed in [81], longer DMA transfers could benefit from a locality principle to exploit the stateful nature of the memory, e.g., an open-bank policy in the case of DRAM. Indeed, DRAMs have an internal level of caching, which in standard DDR3 modules amounts to 8 kB. Consequently, contiguous and aligned 8 kB long transfers fully benefit from this caching. In the case of SPP-based routers, this property can only be assured for the streams with the highest priority. For all others, preemption may happen at any arbitrary moment in time and locality cannot be assured.
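The flit-level SPP arbitration described in this section can be summarized by a minimal model of a single output port; the queues, priorities and flit names are illustrative:

    # Sketch: flit-level static-priority (SPP) arbitration at one output port.
    # vcs maps a static priority (0 = highest) to the flit queue of that VC.
    from collections import deque

    def spp_select(vcs):
        """Forward the next flit from the highest-priority non-empty VC;
        lower-priority packets are thus preempted between flits whenever
        higher-priority flits are pending."""
        for prio in sorted(vcs):
            if vcs[prio]:
                return vcs[prio].popleft()
        return None  # all queues empty: the port idles

    vcs = {0: deque(["HRTT.f1", "HRTT.f2"]), 2: deque(["BE.f1", "BE.f2"])}
    print([spp_select(vcs) for _ in range(4)])
    # ['HRTT.f1', 'HRTT.f2', 'BE.f1', 'BE.f2'] - BE drains only when VC0 is empty

The model also makes the locality problem visible: a lower-priority DMA transfer can be interrupted between any two flits, so its accesses may arrive at the memory controller arbitrarily interleaved.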

 

Figure 2.3: Effects of the static priority arbitration performed locally in routers on the latencies of the BE senders in the NoC.

Dynamic Priority Assignments

Priority-based solutions are frequently foreseen as a solution capable of mitigating the problems of static TDM approaches. However, the SPP solutions described in the previous section have a common drawback of treating best-effort traffic as a "second-class" citizen. The BE transmissions are preempted as soon as HRTT traffic arrives and therefore suffer from high latency and jitter, which is contradictory to their requirements. On the other hand, high-criticality traffic is forwarded as soon as it arrives although it has little to no benefit from the reduced latency (being deadline driven). For instance, Figure 2.3 presents a mixed-critical system where three applications share the NoC. Two of them are safety-critical (HRTT1 and HRTT2) and the last one is streaming data (SRTT). Consequently, each safety-critical sender has a unique static priority (assigned, for instance, according to a rate-monotonic scheme) and the remaining BEs and SRTTs share the same lowest priority. Figure 2.3 illustrates the evolution over time of the transmissions in a real-time NoC using a static priority preemptive (SPP) policy, similarly to [24] or [31]. In such systems, SRTTs can progress through the NoC only when HRTTs are not sending data, therefore they are often unnecessarily delayed. HRTTs are scheduled as soon as they arrive, although they usually do not profit from a faster execution as long as they are guaranteed to finish before their deadline.

Slack-based resource allocation combined with dynamic priorities is a well-known solution to this integration challenge from scheduling theory [92], providing high performance for soft real-time and BE applications without violating the timing constraints of hard real-time applications. According to this approach, SRTTs are scheduled whenever the execution of HRTTs can be safely postponed without causing missed deadlines. This is possible whenever slack is available, which is the time budget between the worst-case response time of a hard real-time application and its deadline. Only a small part of the existing research has considered using slack-based resource allocation in the context of NoCs by applying, locally in routers, dynamic prioritization to different virtual channels. For instance, the "Back Suction" mechanism [43] prioritizes SRTT and BE traffic over HRTT traffic. Consequently, an HRTT progresses only during idle times when there is no other ongoing transmission, as long as its guarantees are not endangered. This is assured by monitoring the buffer space occupied in routers by HRTTs. If the buffers are not sufficiently filled, a signal is sent to the upstream router, resulting in a reverse prioritization of the traffic, so that HRTT traffic is prioritized over SRTT and BE. This signal is propagated upstream towards the source. If the buffers for HRTT traffic are filled again, then the prioritization is once again reverted. Another example is WPMC (Wormhole Protocol for Mixed Criticality) [29], which introduces two modes of operation for the NoC. Different traffic classes are separated in different buffer queues and their behavior is controlled before the packets are injected into the NoC. Monitors in the NIs observe the traffic behavior, i.e., message sizes and inter-arrival times. If the system adheres to the predefined behavioral model, it works in the low-criticality mode, in which the transmissions of all criticality levels are scheduled based on their typical-case behavior, i.e., with no differences between the criticalities. When a monitor detects a violation of this behavior, the NoC switches to a high-criticality mode where only streams allocated to VCs of high criticality are scheduled and all best-effort traffic is dropped by the routers to favor the critical packets. This WPMC scheme is extended in [65] to further improve the performance of BEs. The main goal is to allow best-effort traffic to use the idling ports of a router even in the critical state instead of dropping it as in [29]. Finally, the mechanism proposed in [141], similarly to [65] and [29], prioritizes best-effort traffic over safety-critical traffic in NoCs as long as its guarantees are not endangered. However, the monitoring is done in routers and the priority switch is done on demand based on the allowed blocking time (slack). Consequently, it is possible to exploit the latency slack of critical senders, while providing sufficient independence among the different criticality levels with respect to timing properties. The information about the available slack is contained within the packet header of the safety-critical traffic. The slack value is then decremented in each router whenever interference happens. Packets with low slack values are scheduled before those with high slack values.
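A minimal sketch of such slack-aware arbitration, in the spirit of [141], is shown below; the slack field in the packet header and its update rule are simplified assumptions, not the published hardware design:

    # Sketch: slack-aware arbitration between a critical and a BE packet.

    def arbitrate(critical, best_effort, blocking_time):
        """Forward the BE packet while the critical packet still has enough
        slack; every act of interference consumes part of that slack."""
        if critical is None:
            return best_effort
        if best_effort is not None and critical["slack"] >= blocking_time:
            critical["slack"] -= blocking_time   # decremented in each router
            return best_effort                   # exploit the latency slack
        return critical                          # slack exhausted: critical wins

    hrtt = {"id": "HRTT1", "slack": 5}
    print(arbitrate(hrtt, "BE1", blocking_time=3))  # 'BE1' (slack: 5 -> 2)
    print(arbitrate(hrtt, "BE2", blocking_time=3))  # the HRTT packet (2 < 3)

In this way a critical packet is only delayed while its remaining slack provably covers the blocking, which preserves its worst-case guarantee while improving BE latencies.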

The main drawbacks of all the aforementioned mechanisms are the high hardware overhead resulting from a custom router design and the high number of virtual channels (i.e., required buffers) corresponding to the number of priority levels in the system. Moreover, these solutions can significantly increase the NoC's complexity as they require propagating the global state of the NoC to the local arbiters in routers. This is incompatible with the principle of local arbitration done independently in routers, which requires no correlation between local arbiters, and it thereby increases the complexity of the worst-case timing analysis. Additionally, the configuration complexity increases. Mechanisms such as [65], [29] or [43] require the assignment of a particular application to one of the existing traffic classes (supported by the NoC architecture). This assignment, when done improperly due to a lack of well-specified traffic behavior or appropriate configuration, can still lead to underutilization of the interconnect as well as decreased performance of BE and SRTT senders.

2.3 Comparison of Different Mechanisms

In order to discuss the implications of RT requirements on NoC design, this section presents an evaluation of selected NoC architectures frequently used for the deployment of real-time workloads. The examination is done in the context of the requirements defined in Section 1.4. Its main goal is to provide insight into design trade-offs and open research points in this field. The results are presented in Tables 2.1 and 2.2.

Classification

The discussed designs are divided into two groups: general-purpose and real-time architectures. NoCs belonging to the former group are performance-optimized (all traffic receives equal treatment) whereas the real-time architectures use the mechanisms described earlier in this chapter. Moreover, real-time architectures assume that the load from senders is predictable or can be enforced using rate limiters (RLs). RLs define the maximum number of transmissions that a particular sender/processing node can initiate per given time period, i.e., if transmissions arrive too fast, they are postponed; cf. Section 1.2.1 for a detailed description.
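A minimal sketch of such a rate limiter is shown below (window-based replenishment; all parameters and names are illustrative assumptions):

    # Sketch: a rate limiter (RL) in the network interface, bounding the
    # number of transmissions a node may initiate per replenishment period.

    class RateLimiter:
        def __init__(self, max_transmissions, period):
            self.max = max_transmissions   # budget per period
            self.period = period           # cycles per replenishment window
            self.used = 0
            self.window_start = 0

        def try_send(self, now):
            """True if the transmission may be injected now; transmissions
            exceeding the budget are postponed to a later window."""
            if now - self.window_start >= self.period:
                elapsed = (now - self.window_start) // self.period
                self.window_start += elapsed * self.period
                self.used = 0              # replenish the budget
            if self.used < self.max:
                self.used += 1
                return True
            return False

    rl = RateLimiter(max_transmissions=2, period=100)
    print([rl.try_send(t) for t in (0, 10, 20, 120)])  # [True, True, False, True]

Bounding the injection rate in this way is what makes the senders' load predictable for the formal analyses assumed by the real-time architectures below.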

and an applied method for assuring temporal predictability. Consequently, the fol- lowing NoC architectures are considered:

CB-CS - Conventional-Baseline Circuit-Switched is a circuit switched archi- tecture which applies round-robin based scheduling between transmissions

CB-PS - Conventional-Baseline Packet-Switched is a packet switched archi- tecture which applies single stage round-robin based scheduling between transmissions and single shared buffer queue per port

CB-PS-VC - Conventional-Baseline Packet-Switched with Virtual Channels is a packet-switched architecture which applies multi-stage round-robin based scheduling between transmissions, with support for multiple virtual channels and thus multiple buffer queues per router port

CS-RL - Circuit-Switched flow control with Rate Limiters in NIs; routers use priority-based arbitration with two allocation granularities after which resources must be released: long (logical) transmissions composed of multiple packets, and short transmissions sized for a single packet (or a few packets)

PS-VC - Packet-Switched flow control and Rate Limiters in NIs, with support for virtual channels and different router arbiters: round-robin, static-priority based and dynamic-priority based

TDM - Time Division Multiplexing realized with different slot granularities (short and long slots), including an advanced architecture (advn.) with an optimized slot allocation to cope with dynamic load (e.g., PhaseNoC, SurfNoC)

Hybrid - frequently referenced architecture with two physical layers: packet switched with round-robin for best-effort traffic and TDM with short slots for RT traffic

Supported Features

The features used for the evaluation relate directly to the requirements discussed in Section 1. Note that "relevant requirement" denotes the primary challenge addressed by the feature, but each feature may simultaneously refer to more than one problem in the design of real-time NoCs.

The first feature group encompasses the quality of support for GL and GT traffic, understood as the efficiency (i.e., pessimism) of the worst-case guarantees. In this case, "high" relates to guarantees which are not far from the actual performance of the traffic, whereas "low" relates to the significant overprovisioning required for providing the RT support.

The performance of the BE senders in mixed-critical scenarios relates to the decrease of the average latencies of this traffic which results from the application of safety-critical mechanisms.

Next come features that relate to the dynamics in the sender behavior, such as variable activation jitter and variable lengths of transmissions. They allow assessing the additional overhead (delay) which the sender experiences from the applied scheduling mechanisms.

Later, the influence of these dynamics on other senders in the system is considered. In this context, "yes" relates to the case when a predictable incorporation of dynamics in the behavior of one sender may decrease the guarantees of other transmissions through over-provisioning.

Fairness of arbitration for GL and GT data streams relates to the efficient allocation of resources for senders with different real-time requirements, e.g., different deadlines or throughput guarantees.

The possibility to switch off QoS, and the hardware penalty for doing so, relate to scenarios where the real-time NoC is deployed in a setup with solely general-purpose workloads (safety as a feature of the design, not its main goal).

Support for scalability, including its hardware and temporal overheads, relates to the quality of safe resource allocation with respect to the worst-case guarantees. A real-time NoC should offer an architecture where the guarantees depend on the actual load of the system and not on static factors, e.g., the number of integrated senders (temporal penalty). Consequently, the predictable deployment of many senders with different RT requirements should be possible at a low cost (hardware penalty).

Support for locality relates to setups where packets from transmissions must arrive not only timely but also in the correct order to meet the requirements of the peripherals.

Finally, the two last features relate to the complexity and pessimism of the verification using formal methods, as required by the standards for higher safety levels.

Table 2.1: Evaluation of different designs for real-time NoCs w.r.t. the requirements from Section 1.4, part one. Features (with relevant requirements): quality of support for GL (R1) and GT (R1); performance of BE senders in mixed-critical scenarios (R6); sender penalty for variable transmission size (R2, R3); sender penalty for variable jitter (R2, R3); performance penalty for other senders in setups with jitter and variable transmissions (R4, R3); arbitration fairness for GL senders (R5, R3).

| Type | Architecture | GL support | GT support | BE performance | Penalty var. size | Penalty var. jitter | Penalty for others | Fairness GL |
| General Purpose | CB-CS | low (depends on num. of senders and msg. size) | low | low avg. latencies | low | low | no | - |
| General Purpose | CB-PS | low (depends on num. of senders) | low | low avg. latencies | low | low | no | - |
| General Purpose | CB-PS-VC | low (depends on num. of senders) | low | low avg. latencies | low | low | no | - |
| Real-Time | CS+RL, long trans. | medium (high avg. latencies, depends on trans. size) | high | high avg. latencies | low | high | no | - |
| Real-Time | CS+RL, short trans. | high (depends on trans. size) | high | medium avg. latencies | low | medium | no | low |
| Real-Time | PS+VC+RL, static prio. | medium (depends on sender prio., num. of VCs and prios.) | medium | high avg. latencies | low | low | no | low |
| Real-Time | PS+VC+RL, dynamic prio. | medium (depends on sender prio., num. of VCs and prios.) | medium | low avg. latencies | low | low | no | medium |
| Real-Time | PS+VC+RL, round-robin | low (depends on rate limiting and num. of senders) | medium | medium avg. latencies | low | low | no | medium |
| Real-Time | TDM, long slots | medium (high avg. latencies, depends on slot size) | high | high avg. latencies | low | high | yes | - |
| Real-Time | TDM, short slots | high (depends on slot size) | high | high avg. latencies | low | medium | yes | low |
| Real-Time | TDM, advn. | medium (num. of senders) | high | high avg. latencies | low | low | no | medium |
| Real-Time | Hybrid TDM+PS | high (depends on slot size) | high | low avg. latencies | low | low | yes | low |

Table 2.2: Evaluation of different designs for real-time NoCs w.r.t. the requirements from Section 1.4, part two. Features (with relevant requirements): arbitration fairness for GT senders (R5, R3); possibility to switch off QoS (R7); hardware overhead in setups with disabled QoS (R7); support for scalability (R8); scalability temporal penalty (R8); scalability resources penalty (R8); support for locality (R9); pessimism (R10) and complexity (R10) of formal verification.

| Type | Architecture | Fairness GT | QoS switch-off | HW overhead (QoS off) | Scalability | Temporal penalty | Resources penalty | Locality | Pessimism | Complexity |
| General Purpose | CB-CS | - | - | none | yes | low | low | no | high | low |
| General Purpose | CB-PS | - | - | none | yes | low | low | no | high | low |
| General Purpose | CB-PS-VC | - | - | none | yes | low | low | no | high | low |
| Real-Time | CS+RL, long trans. | high | yes | low | yes | low | low | yes | medium (low-high) | low (high-low) |
| Real-Time | CS+RL, short trans. | high | yes | low | yes | low | low | no | medium (low) | low (high) |
| Real-Time | PS+VC+RL, static prio. | medium | yes | medium | yes | low | high | no (yes only for highest prio.) | medium (low) | low (high) |
| Real-Time | PS+VC+RL, dynamic prio. | medium | yes | high | yes | low | high | no (yes only for highest prio.) | medium (low) | high (even higher) |
| Real-Time | PS+VC+RL, round-robin | medium | yes | low | yes | low | medium | no | medium | low |
| Real-Time | TDM, long slots | high | no | - | no | high | low | yes | low | low |
| Real-Time | TDM, short slots | high | no | - | no | high | low | no | low | low |
| Real-Time | TDM, advn. | high | no | - | yes | medium | high | no | low | high |
| Real-Time | Hybrid TDM+PS | high | yes | high | yes | medium | high | no | low | low |

2.4 Summary

The detailed survey of mechanisms and architectures conducted above is briefly summarized in Figure 2.4. The evaluation has shown that there are hard trade-offs in the design of real-time NoCs (with respect to performance, implementation overhead, and quality of guarantees) and that none of the existing solutions dominates the whole spectrum of desired properties.

Commonly used performance-oriented NoCs use wormhole-switched architectures with multi-stage arbitration for efficiency and scalability. However, they are not designed to meet temporal requirements but rather to deliver high performance on average. In such networks, ongoing transmissions compete for output ports (link bandwidth) and virtual channels (buffer space). Therefore, without appropriate quality-of-service (QoS) mechanisms, resources are not reserved in advance, i.e., packets are switched as soon as they arrive, and all traffic receives equal treatment. Moreover, some interference cannot be resolved locally by the router's arbiter and requires input from the adjacent neighbours, e.g., a joint allocation of the crossbar switch and the router output due to a possible lack of buffers at the output (input-buffered routers). This results in a complex spectrum of direct and indirect interferences between data streams which may endanger the system safety [122]. Nevertheless, standard NoC architectures remain appealing since they are affordable, fast and flexible [73].

Isolating traffic in space offers each of the senders a dedicated set of resources. Therefore, it offers maximum performance also in the presence of system dynamics (as each sender is the only one using the given path through the NoC). However, although safe and performant, spatial isolation comes at the substantial price of hardware over-provisioning. Hardware resources remain unused whenever applications are not transmitting data, which is especially problematic in systems with sporadic workloads and high dynamics. These effects increase the production costs as well as the power consumption of the MPSoC.

In order to mitigate the shortcomings of spatial isolation, temporal isolation has been proposed. Two widely established solutions are Time-Division Multiplexing (TDM) (e.g., [56]) and source rate-limiting (e.g., [155]). TDM-based architectures consider the network as a single shared resource protected by a global (TDM) arbiter. Each application or transmission is granted, in a cyclic order, a dedicated time slot with exclusive access to the NoC. The size and number of these slots (i.e., the TDM schedule) are defined during the system's design phase and optimized for the worst-case behavior. Although TDM offers a relatively simple analysis and implementation, it drastically reduces the system utilization whenever the applications do not behave according to the predefined static scheme used for the schedule.
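The non-work-conserving character of TDM can be illustrated with a minimal sketch (the slot table and all parameters are hypothetical):

    # Minimal TDM arbiter sketch (hypothetical schedule): a slot table,
    # fixed at design time, is replayed cyclically; a sender may inject
    # only during its own slots, even if all other slots remain idle.
    SLOT_TABLE = ["A", "B", "C", "A", "D", "idle"]
    SLOT_LEN = 4  # cycles per slot

    def slot_owner(cycle: int) -> str:
        return SLOT_TABLE[(cycle // SLOT_LEN) % len(SLOT_TABLE)]

    def may_inject(sender: str, cycle: int) -> bool:
        # Non-work-conserving: sender "B" owns only cycles 4..7 of every
        # round of len(SLOT_TABLE) * SLOT_LEN = 24 cycles, regardless of
        # whether the other senders are active at all.
        return slot_owner(cycle) == sender

The worst-case waiting time grows with the table length, i.e., with the number of integrated senders, independently of the actual load - exactly the scalability problem discussed above.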


Figure 2.4: Comparison of different NoC architectures with the special focus on safety and performance in the presence of dynamics.

Dealing with transient errors, arrival jitter, changes in communication volume, or conditional executions of synchronized senders results in average latencies close to the worst case, or in overly complex schedules which are hard to calculate and maintain. These effects occur even in lightly loaded systems, introducing a major overhead squandering the performance advantage of MPSoCs. Furthermore, the overhead of TDM depends on static factors, i.e., the number of applications and the size of their time slots; thus it raises the scalability problems known from bus-based interconnects. Therefore, although TDM supports independent verification of each communication's runtime behavior, it comes at a significant performance penalty whenever the system exposes dynamics in its behavior.

In contrast to TDM, rate control avoids the drawbacks of a fixed time schedule. This scheme is based on two main components: local arbitration in routers and rate control for transmissions initiated by senders. Hereby, the local arbitration in each router assigns resources (link bandwidth and buffers) independently to packets traversing the NoC. Rate control is applied in the network interface (NI) of a processing node and limits the maximum number of packets sent per sender. Consequently, it provides an upper bound on the interference a transmission can experience in the network. Additionally, transmissions can be isolated in dedicated buffer queues in routers (virtual channels - VCs). VCs enable arbiters in routers to further separate senders with different QoS requirements by a priority-based arbitration. However, the multi-stage arbitration and the resulting transient run-time effects require a complex corner-case analysis, i.e., the obtained results are either overly pessimistic or difficult to verify. Additionally, as all the arbiters must be optimized and adjusted to the formally verified worst-case behavior, the performance overhead in highly dynamic setups is significant. For instance, as a local arbiter cannot control the number of simultaneously transmitting senders, the buffers in the routers' ports must be large enough to accommodate all incoming packets. Otherwise, packets can be lost (overwritten), or cyclic dependencies between schedulers may endanger safety (e.g., propagation of blocking). Similarly, the settings of the rate regulators are static during runtime, which contradicts the required dynamics of the system, i.e., it is not possible to change resource allocations along with the varying system state without losing safety guarantees. Note that the performance penalty increases along with the complexity and the inherent dynamics - as in the case of the TDM-based arbitration.

Consequently, the static and dynamic resource allocation schemes, as used in current safety-critical systems, are no longer sufficient to provide the needed level of performance without endangering the safety guarantees.

Chapter 3: The QoS Control Layer

This chapter introduces a new method for operating a NoC which permits simultaneously satisfying the performance and real-time/safety requirements of modern MPSoCs in the presence of highly dynamic workloads. To achieve this goal, the settings of the QoS mechanisms are adjusted at runtime depending on the global state of the system. The base functionality is delivered through a novel, global arbitration layer using key elements of Software Defined Networks (SDN) [86] and adapting them to the requirements of real-time NoCs [78, 81].

The proposed scheme is based on the decoupling of admission control from traffic arbitration and separates the NoC into a (virtual) QoS control layer and a data layer, see Fig. 3.1. The control layer is used to modify the settings of the admission control mechanisms (i.e., who, when and for how long may use the NoC) depending on the state of the system (i.e., who is active and which resources are currently occupied). These actions are carried out using a protocol-based negotiation between senders, i.e., providing a validation method to verify if the currently available NoC resources are sufficient before the application is granted physical access to the NoC. The synchronization can follow through a central scheduling unit, e.g., [78], [81], or directly through broadcast or multicast protocols, e.g., [77].

The introduced decoupling of the admission and flow control is appealing in several aspects. The major advantage of the proposed approach comes from the fact that the routers can be agnostic to the global QoS functions. Therefore, there is no need for custom QoS-oriented NoC extensions which can be costly in terms of area and power, e.g., [56], [116]. Additionally, the proposed solution allows the deployment of a sophisticated contract-based QoS provisioning (e.g., dynamic priority-based arbitration) without introducing the complicated and hard-to-maintain schemes known from hardware arbiters in real-time routers, e.g., [43]. Consequently, it allows the application of commercially available performance-oriented NoCs in real-time domains. Otherwise, these architectures introduce complex interdependencies between senders, leading to pessimistic worst-case guarantees or a lack of formally assured safety [142], [44]. Furthermore, using the knowledge about the current (global) state of the system permits efficiently incorporating dynamics in the senders' behavior while providing real-time and/or safety guarantees. Consequently, the proposed control layer offers the performance known from standard NoCs and the QoS necessary for real-time applications, without the need for specialized infrastructure.

Figure 3.1: Illustration of a QoS control plane and its operations in a NoC domain.

The rest of the chapter is structured as follows: Section 3.2 gives an overview of the related work regarding Software Defined Networks and their applications to NoCs as well as to real-time and/or safety-critical systems. Section 3.3 introduces the main concepts of the control layer, followed by an overview of the centralized architecture in Section 3.4. Figure 3.2 presents an overview of possible applications of the control layer, which will be discussed in the remaining sections of this chapter. The main purpose of the mechanism is the allocation of resources in the NoC. For achieving this goal, the control layer may use different resource allocation policies, which are discussed in Section 3.6. Additionally, the clients and the RM may be used to build the interface between the NoC and the on-core schedulers, which increases the efficiency of node utilization, see Section 3.7. Moreover, as discussed in Section 3.8, the RM may optimize the scheduling of NoC accesses with respect to the requirements of connected peripherals and memories. Finally, Section 3.9 provides a summary which concludes the chapter.

3.1 Baseline NoC Architecture

The proposed solution is built on top of an existing NoC and is largely orthogonal to the underlying interconnect. As there exists a wide selection of possible solutions (see Chapter 2), in this section we present the underlying NoC (router and NI) on which we focus throughout the remainder of this document. The description highlights all essential features and requirements which must be considered for porting the mechanism to other NoC architectures. The primary goal is to build a solid theoretical basis through an in-depth discussion of the selected NoC example and, without losing generality, to allow a fairly straightforward application of the solution to other architectures.

Figure 3.2: Applications of the control layer in the NoC-based MPSoC: the interface between the on-core scheduler and the NoC (Section 3.7), NoC resource allocation with the control layer (Section 3.6), and the interface between memory and the NoC (Section 3.8).

3.1.1 Router

The baseline router design is largely based on the generic architecture proposed by Dally [37], which is frequently cited and applied in the related work. Figure 3.3 presents its block diagram. The datapath of the router is responsible for storing and forwarding the data and is built out of the following modules: input ports with VC buffers, a switch, and a set of output ports. The remaining modules belong to the control functionality and are responsible for allocating the available resources (i.e., buffers and links/output ports) to the concurrently running transmissions. The router arbitration supports prioritized virtual channels (VCs) (cf. [24]) and wormhole switching (cf. [37]) where packets are decomposed into flits. These flits constitute the granularity of transmission and arbitration. Each packet is constructed of a single header flit (including address information), a variable number of body flits, and a tail flit. An arriving header flit is stored in the appropriate FIFO queue (depending on the priority resulting from the statically assigned VC) and triggers the arbitration.

Figure 3.3: The exemplary baseline router architecture for the NoC.

Forwarding of the tail flit releases the resources. Each virtual channel has an assigned, fixed static priority. Buffering happens in the input ports, and buffers for different virtual channels (priorities) may differ in length, i.e., asymmetric buffers are possible. The routers schedule transmissions according to:

the availability of buffers (for a given priority) in the next router (flow control/backpressure signals),

the priority of the particular VC, and

the packet-based round-robin between the input ports for packets using the same VC.

The transmission of a packet is preempted at the flit level (i.e., individual flits cannot be preempted) as soon as a new packet with a higher priority arrives at the router. A blocked transmission can resume after all higher-priority packets have been served. A deadlock-free, deterministic source routing policy is assumed.
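These rules can be condensed into a simplified behavioral model (illustrative only, not the RTL of the baseline router; a complete model would additionally hold the round-robin choice until a packet's tail flit):

    # Simplified model of the switch allocation in the baseline router:
    # static priority over VCs, round-robin between input ports within
    # one VC, and backpressure via downstream buffer credits.
    def select_next_flit(vcs, credits, rr_pointer):
        """vcs[vc][port] is the FIFO (list of flits) of one input port
        for virtual channel vc (vc 0 = highest priority); credits[vc]
        counts free buffer slots in the next router; rr_pointer[vc] is
        the input port served last for that VC."""
        for vc, ports in enumerate(vcs):            # highest priority first
            if credits[vc] == 0:                    # flow control: no space
                continue                            # in the next router
            n = len(ports)
            for i in range(1, n + 1):               # round-robin over ports
                port = (rr_pointer[vc] + i) % n
                if ports[port]:                     # a flit is waiting here
                    rr_pointer[vc] = port
                    credits[vc] -= 1
                    return ports[port].pop(0)       # forward one flit
        return None                                 # all VCs empty/blocked

Because the function is re-evaluated for every flit, a newly arriving higher-priority packet naturally preempts a lower-priority one at flit granularity, as described above.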

3.1.2 Network Interface

To assure predictable access patterns of the integrated senders, the network interfaces use rate control [147], [155], [41]. Rate control mechanisms are a commonly applied safety measure in many real-time setups as they safely bound the number of transmissions initiated in the NoC (accesses) as well as their length.

Figure 3.4: Rate control for providing safety guarantees in real-time NoCs.

According to this method, at each source node a monitor - a rate limiter (RL) - regulates the pattern of the injected traffic. This is realized through the adjustment of two parameters: the window length (Tw) and the bandwidth quota (Nmax). At each cycle, the rate limiter checks whether the length of a pending packet, plus the amount of data (e.g., packets or flits) sent during the previous period of one window length (in cycles or ms), exceeds the bandwidth quota. If not, the packet can be injected into the NoC; otherwise, it must wait until the budget replenishes. Figure 3.4 presents an example illustrating the principle of such flow regulation at the source. The X axis depicts the trace of accesses from a node to the NoC whereas the Y axis depicts the accumulated number of finished transmissions. In the figure, within each time window (Tw) there may be only three (Nmax = 3) conducted transmissions. When the second burst of three activations arrives too early, it is postponed and may be executed only after the first window finishes. Related research, e.g., [108], has also reported that a fine-granular representation of activation patterns is generally possible by means of minimum distance functions. Fine-granular rate control offers better latency compared to classical periodic (coarse-grained) monitors and better control of the overhead in terms of necessary processing time/power and memory.

Consequently, extending the network interface with monitors using rate limiters is a well-known solution for supporting quality-of-service in NoCs, e.g., [148], [135], [23], [44]. This method owes its popularity to a relatively easy implementation (no need for a custom router architecture) as well as its flexibility (it may cover different activation patterns, with no need for "predictable execution models"). Therefore, it is mainly suitable for BE traffic from processors with caches and, as such, is already available in many NoCs on the market, e.g., the MPPA from Kalray [40].
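The check described above can be captured in a behavioral sketch (hypothetical class and names, not a concrete NI implementation):

    # Behavioral sketch of the rate limiter (RL) in the NI: a packet may
    # be injected only if its length plus the data already sent within
    # the last Tw cycles stays within the quota Nmax.
    from collections import deque

    class RateLimiter:
        def __init__(self, t_w: int, n_max: int):
            self.t_w = t_w            # window length Tw (cycles)
            self.n_max = n_max        # bandwidth quota Nmax per window
            self.history = deque()    # (cycle, amount) of past injections

        def try_inject(self, cycle: int, length: int) -> bool:
            while self.history and self.history[0][0] <= cycle - self.t_w:
                self.history.popleft()              # drop expired entries
            if sum(a for _, a in self.history) + length > self.n_max:
                return False          # budget exhausted: postpone packet
            self.history.append((cycle, length))
            return True               # packet may enter the NoC

With Tw = 100 and Nmax = 3 (counting each transmission as length 1), this reproduces the example of Figure 3.4: a fourth activation arriving within the same window is postponed until the budget replenishes.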

However, this solution also has a drawback: it trades performance for temporal guarantees, i.e., it comes with a performance penalty. The rates are adjusted statically, according to the worst-case scenario, during design or startup time, i.e., the designer must assume that all senders are running simultaneously with the maximum possible transmission arrival rates and adjust the limiters appropriately. Therefore, this approach is not work-conserving and sacrifices interconnect utilization whenever senders expose dynamics in execution time, release jitter or communication volume, i.e., whenever the system is not highly loaded.

Figure 3.5: Address Translation in NI for providing safety guarantees in real-time NoCs.

Additionally, in more complex or larger setups (e.g., many senders per node, mixed criticality), a NI may support address translation and work similarly to a Memory Management Unit (MMU) or a Memory Protection Unit (MPU)1, see Figure 3.5. The mapping of a local physical address within a tile (processing node) is then performed with address translation tables. Each record in these tables may include the route (which a transmission should take), the access rights at the target (e.g., read or write permissions), and the physical address at the remote node.

1 Note that the application of such a unit in the NI is orthogonal to the work of the MMU/MPU units running in the tile.

Such a mechanism offers several significant benefits in the context of safety. Firstly, it isolates selected parts of the NoC, which is especially important for the spatial isolation of traffic and for mixed-critical setups. Secondly, it separates the address spaces of different tiles in order to provide backward compatibility and higher flexibility of the integrated software. Finally, overlapping address spaces can be used to accommodate core-to-core communication.
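A minimal model of such an address-translation lookup, with hypothetical field names, could look as follows:

    # Sketch of the NI address translation (AT) step (field names
    # illustrative): each record maps a local address range to a route
    # through the NoC, access rights, and a remote physical address.
    from typing import NamedTuple, Optional

    class ATRecord(NamedTuple):
        base: int           # start of the local address range
        size: int           # length of the range
        route: tuple        # source route: output port per hop
        remote_base: int    # physical base address at the target node
        rights: str         # subset of "rw" permitted at the target

    def translate(table, addr: int, access: str) -> Optional[tuple]:
        for rec in table:
            if rec.base <= addr < rec.base + rec.size:
                if access not in rec.rights:
                    return None       # trap: access right violation
                return rec.route, rec.remote_base + (addr - rec.base)
        return None                   # trap: unmapped address

A failed lookup traps the access in the NI instead of injecting it into the NoC, which realizes the isolation discussed above.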

3.1.3 Traffic Characteristics

This section discusses the typical NoC traffic patterns resulting from design constraints in the real-time and safety-critical domains. To achieve temporal predictability and shorter worst-case execution/response times (which can be proven with formal methods), designers distinguish between two traffic groups: short and long transmissions. The former group consists of memory accesses, typically cache lines, from applications compiled with regular toolchains. They originate from both the real-time and the best-effort traffic classes, with the worst-case and average latency of a single packet as the typical metric. The latter group contains messages composed of several packets which are usually sent by real-time and streaming applications. Such senders are usually optimized with respect to the controlled processes and hardware (control-oriented real-time applications). The typical performance criterion for such applications is the worst-case or average throughput. In the following part of this section, we describe both groups in detail.

Long Transfers and Timing-Critical Traffic

The main goal of systems with hard real-time requirements is to decouple on-core interference from the scheduling of accesses to shared resources, e.g., a shared main DDR-SDRAM memory. Such decoupling simplifies the verification process, which is otherwise costly and complicated. Designers of applications in real-time domains, e.g., automotive and avionics, frequently achieve this goal through predictable execution models (e.g., [113]) applied to tiles with local memories (scratchpads). Figure 3.6 presents a "classic" example of this approach - "read at the beginning and write at the end" - which is applied in both the avionic and automotive domains [112, 114]. According to this scheme, the on-core execution of an application can be divided into three phases: a memory read block (Acquisition), the actual execution (Execution), and a memory write block (Restitution). During the acquisition phase, a task or an application accesses the external memory to receive the data and/or the necessary code. Later, during the execution phase, the application/task proceeds, sporadically accessing the shared resource to ensure data consistency. Finally, it may write the results at the end for further usage or integration by other system components.

Figure 3.6: Model enforcing predictable on-core execution in safety critical domains [49].


Figure 3.7: Predictable timing through independent reservation of NoC and on-core resources [49].

The introduced separation of the execution from the communication is performed earlier, during the design phase, i.e., a white-box approach [113], [126]. However, in some setups/scenarios it may not be possible or efficient to load all required data during the acquisition (read) phase due to the nature of the performed operation, e.g., a database search. Therefore, tasks must still be capable of conducting "sporadic" accesses to the main memory (e.g., loading specific records from the database in the case of drive management). As a result, the underlying NoC infrastructure must frequently support a combination of long and short accesses.

The main advantage of the predictable execution model is the decoupling of local (i.e., within a tile) and global (e.g., NoC, DDR-SDRAM) resource reservations, see Figure 3.7. The worst-case timing can be verified by measuring two factors: the local (on-core) worst-case execution time and the accesses to shared resources. As the latter factor includes the latencies of NoC transmissions, the designer must determine, through measurements and analysis, that there exists an upper bound on the transfer latency. The efficiency of this evaluation can be supported by design, e.g., by on-core or peripheral schedulers and the architecture. The next sections discuss these choices for NoCs in detail.
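The three phases can be expressed as a simple task skeleton (dma_read and dma_write are hypothetical block-transfer helpers; the concrete API depends on the tile):

    # Task skeleton following the "read at the beginning, write at the
    # end" model from Figure 3.6 (helper names are illustrative).
    def run_task(dma_read, dma_write, compute):
        data = dma_read("inputs")        # acquisition: one long, schedulable
                                         # block transfer from shared memory
        result = compute(data)           # execution: local computation with
                                         # only bounded sporadic accesses
        dma_write("outputs", result)     # restitution: write back results
                                         # as one block transfer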

The second group of applications, requiring throughput guarantees (soft and hard), includes signal-processing and streaming senders. As discussed in [37], their requirements are usually directly related to user perception, e.g., video and audio codecs, or result from standards, e.g., DVB, DAB, TSN, AVB. Consequently, timing-critical traffic in such setups often exhibits regular and predictable access patterns, e.g., periodic communications with jitter. Moreover, the initiated transmissions are built of multiple packets, which promotes the usage of DMA engines, e.g., for video traffic or memory read accesses during the acquisition phase. The effectiveness of DMA transfers may be limited by the size of the local memories (in tiles) as well as by the communication costs resulting from interference with other senders using the NoC. In many setups, the regularities in the behavior of real-time senders are exploited to improve the utilization of resources. Moreover, some peripherals, such as DDR-SDRAM memories, are state-dependent, i.e., their response time depends on the history of previous accesses. Therefore, clustering of transmissions combined with a known memory-controller behavior assures a known access sequence, further decreasing the response latency.

Short Transfers and Best-Effort Traffic

Short transmissions (e.g., single packets or flits) usually originate from control-oriented real-time applications as well as from best-effort senders running on processors with caches. Such transfers are sensitive to latency, i.e., the reaction time is critical for the worst-case or average performance [37, 43]. In many architectures, cache misses increase the worst-case execution time of a task and thus the load of a processor. Since it is only possible to bound the number of misses conservatively (see the overview paper [152]), the miss times must be included in the WCETs several times. Therefore, they conservatively increase the worst-case load. Furthermore, the characteristics of traffic from general-purpose, best-effort applications are usually unknown, which may endanger the deadlines (i.e., temporal properties) and/or safety. If the aggregated traffic exceeds the available bandwidth, the real-time requirements of other senders using the NoC resources may be endangered. A popular and frequently applied method to solve these problems is traffic shaping using rate limiters. This scheme is discussed in detail in Section 3.1.2.

3.2 Software Defined Networking

The main concept of the proposed resource manager and control layer is derived from the global management scheme for off-chip interconnects - Software-Defined Networking (SDN). This section briefly revises and summarizes the related research based on [137] and [86]. The key concepts of SDNs rely on the division and separation of the interconnect's mechanisms into a control plane and a data plane. The data plane is responsible for the forwarding of the transmitted data, e.g., packets or frames. The control plane controls, through protocol-based synchronization, the behavior of the routers and adjusts it depending on the global state of the system.


Figure 3.8: Main SDN components in a switch, based on [137].

3.2.1 SDN Principles and Real-Time Systems

The exemplary SDN scheme, taken from the Ethernet domain, is presented in Figure 3.8. Components belonging to the data plane are marked in green whereas the control plane is marked in red. Traditionally in an SDN, the control layer is built of a centralized SDN controller and of agents in the network switches (red components in Figure 3.8). The data plane encompasses the remaining functionality of the switch (green components in the figure). The role of the SDN agent within a switch is to monitor the communication passing through the switch, synchronize it with the SDN controller and, if necessary, update/modify the rules used for traffic arbitration. To do that, the SDN agent uses a flow table inside the switch which stores the settings for the forwarding operations performed by the switch's data plane. For instance, each incoming traffic stream (frame or packet) can be identified by its source and destination MAC addresses and/or port numbers. Consequently, the flow table may contain settings deciding the output port to which the particular packet (transmission) should be forwarded. The SDN controller is usually realized in the form of a software library (also referred to as a network operating system) running on a dedicated core. The communication between the SDN agents and the controller takes place in the same network in which the general communication is done. The SDN interface defines the communication protocol used for the synchronization between the SDN controller and the switches/routers.
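The flow-table lookup steered by the controller can be sketched as follows (a schematic model with illustrative match fields and values; real OpenFlow tables support richer matches and actions):

    # Schematic SDN data-plane lookup: the flow table, installed by the
    # SDN agent on behalf of the controller, maps a stream key to an
    # output port; a miss is deferred to the controller.
    flow_table = {
        # (src MAC, dst MAC, ingress port) -> output port
        ("02:00:00:00:00:01", "02:00:00:00:00:02", 1): 3,
        ("02:00:00:00:00:02", "02:00:00:00:00:01", 3): 1,
    }

    def forward(frame, table, controller_port=0):
        key = (frame["src"], frame["dst"], frame["in_port"])
        # On a miss the frame is sent to the controller, which may then
        # install a new rule into the table (run-time reconfiguration).
        return table.get(key, controller_port)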

Figure 3.9: An exemplary topology for automotive Ethernet enhanced with a SDN controller, based on [137].

A plethora of mechanisms supporting SDN exists for off-chip networks, with a predominant focus on Ethernet. A comprehensive overview of SDNs (including standardization, interfaces, controllers, network programming languages, and applications) is provided in [86]. The primary research focus is placed on optimization with respect to the average system performance. Consequently, the schemes are either hardly analyzable or no guarantees can be given. For instance, the most prominent SDN interface, OpenFlow [102], relies on TCP-based communication. Therefore, it suffers from complex handshake and flow-control schemes which significantly increase the worst-case latency of real-time traffic [137], which in the case of Ethernet is typically UDP-based. Nevertheless, SDN Ethernet protocols have been analyzed in [12] and [137].

The challenges in porting the SDN principles to the real-time and/or safety-critical domains, such as automotive and avionics, will be discussed using an exemplary topology from the automotive domain, presented in Figure 3.9. In this context, the role of the SDN controller would be to manage and control the arbitration settings (e.g., routing or rate control) in switches depending on information about the global state of the entire network, i.e., who is sending and which resources are occupied. The re-configuration includes admission control and run-time reconfiguration. The former limits the link load, and therefore the network congestion, by bounding the number of network accesses (blocking or rate control) for selected senders. The latter can be used for recovery from faults, e.g., by bypassing a failed switch or link.

A single centralized point of synchronization through the SDN controller could be a performance bottleneck. However, it also has significant advantages, especially in the safety-critical domains. Firstly, it simplifies the synchronization protocol, e.g., the (rapid) spanning tree protocol or shortest path bridging. In contrast, decentralized systems frequently require lengthy and complicated protocols for assuring system coherence. This is because the communication participants must first agree on a state of the network before they can take actions. In the case of a single SDN controller, all actions are serialized in this node, which simplifies the process of worst-case verification.

The application of SDNs in safety-critical domains, e.g., automotive or avionics, requires strict timing guarantees and a verification methodology for the system, cf. Chapter 1.4. Consequently, the time necessary for admission control and re-configuration, including the actions of the SDN controller and agents, must be bounded (worst-case behavior). Verification in such real-time systems is frequently done with formal worst-case analysis methods and/or simulation-based approaches. The disadvantage of the latter is the lack of a guarantee that the simulation exposed all corner cases leading to the worst-case behavior. On the other hand, the former gives safe upper bounds on a system's worst-case behavior which can, however, be slightly pessimistic (they include scenarios which won't happen in practice).

In [12] a formal analysis of the communication in an SDN utilizing the OpenFlow protocol has been proposed using network calculus. This work models the behavior of an SDN switch regarding delay and queue-length boundaries. Next, it provides an analysis of the buffer lengths of the SDN controller and the SDN switch by modeling them as single resources (network calculus servers). The work has proved that it is indeed possible to provide a formal upper bound on the transmission latencies in an SDN network.

In [137] the authors proposed an alternative SDN analysis focusing on automotive Ethernet using Compositional Performance Analysis. Switches have been modeled considering their internal structure (e.g., ports, multiple traffic queues). Moreover, for both the SDN controller and the SDN agents, a limited amount of processing resources (CPUs) has been assumed. In contrast to [12], the latency of network re-configuration has also been considered in the worst-case timing derivation. This refers to the temporal delay required by the SDN controller to re-configure the network by distributing new forwarding rules to the individual switches and waiting for their acknowledgments. Finally, the overhead of introducing SDN has been evaluated, i.e., the impact of SDN traffic on non-SDN traffic.

Finally, some related research, e.g., [96], [154], considered using queueing theory for performance evaluation. These methods, although useful for identifying bottlenecks, offer only the probabilistic performance guarantees given by queuing theory. Therefore, they are not capable of deriving the worst-case metrics required in real-time, safety-critical domains. At the moment of writing, there is no related research considering the application of SDN in NoCs with real-time and/or safety objectives.

3.2.2 SDN Mechanisms in NoCs

Existing NoC architectures implementing the concept of SDN concentrate on the reconfiguration of independent NoC switches to optimize the average performance by adjusting the scheduling arbitration during runtime. Consequently, they offer no guarantees for the worst-case behavior. Next, the most significant contributions are described in detail.

In [35] the authors present a software-defined on-chip network (SDNoC) in which the arbitration logic in routers is software-controlled, and SDN re-configures the NoC to improve the average latencies. SDN agents are fully embedded within the routers and are used to exchange control messages. Consequently, the routers are significantly more complex than the baseline implementation. The control capabilities include routing and link load. The former can be used for the application of fail-over mechanisms, which allow switching to a redundant or standby hardware component upon a failure. SDNoC has no central management unit. Resource allocation is done directly by the sending applications, which adjust the headers of the packets they issue. This significantly limits the applicability of these mechanisms in hard real-time domains, i.e., all senders must be certified to the highest adequate level. Additionally, no solution for resolving conflicts between requests from different senders is proposed.

In [124] the authors present an alternative SDNoC architecture in which they also directly apply principles known from off-chip SDNs to the on-chip interconnect. Consequently, switches are enhanced with agents and configuration tables which decide about the routing, whereas the SDN controller is running on a selected network node. This leads to a significant hardware overhead (similar to [35]) as the router complexity increases with the number of senders and synchronization scenarios.

Finally, in [18] SDNoC is realized in the form of a hybrid hardware/software approach. Firstly, the authors introduce a spatial (physical) isolation between the control network and the data network. Next, the SDNoC is controlled by a centralized network manager (NM) which is running on a selected NoC node in the form of a software library. Finally, the SDNoC reduces buffering in the data layer.

There are significant differences between the solution proposed in this work and the approaches described above. The majority of the related work applies a straightforward approach for porting the off-chip SDN ideas and mechanisms to the on-chip interconnect. Consequently, the SDN controller adjusts the settings of routers as it is done in the off-chip networks. This limits the effort required for porting the SDN idea but disregards the specifics of on-chip interconnects. As discussed in [73] and [37], the constraints imposed on an on-chip interconnect differ significantly from off-chip networks, e.g., the small size of routers and the short latencies of memory traffic.

As a result, the direct adoption of the principles from a conventional off-chip SDN network switch microarchitecture results in an overly complicated design. Such NoCs increase not only the production costs (concerning area and power) but, even more importantly, the latencies of traffic (typically memory accesses). In contrast to the related architectures ([35], [124], [18]), the control layer proposed in this work controls the admission of traffic at the source, i.e., it checks the availability of the interconnect resources before the sender can obtain physical access to the interconnect. Therefore, no router modifications are necessary, and agents can be implemented as a straightforward extension of the rate control mechanism available in many architectures. Moreover, the described solution is focused on assuring temporal guarantees for synchronized real-time and/or safety-critical senders. Currently, in the context of NoCs, client-server admission control schemes are only applied to improve the average latencies of synchronized transmissions by limiting the access rates to frequently requested peripherals such as DDR-SDRAM memories, e.g., [147], [76].

3.3 The Main Concepts of the Control Layer

In real-time systems using NoCs, synchronization between interfering transmissions can help to run the network with significantly fewer resources and still be able to guarantee the temporal properties of the system and, whenever required, its safety. Achieving these goals includes handling the effects of both backpressure and head-of-line blocking in routers, where the switch arbitration between packets/flits is performed [76], [147]. To provide service guarantees, e.g., an upper bound on the worst-case latencies or a guaranteed throughput, the architecture must allow bounding the direct and indirect interference between interfering transmissions - transmissions which overlap in at least one router on their path from source to destination and therefore share NoC resources, i.e., link bandwidth and/or buffers in routers. For doing so, applications are grouped into synchronization scenarios, i.e., sets of senders which may mutually influence their execution times, e.g., through concurrent accesses to shared interconnect resources. Assuring temporal guarantees for synchronization scenarios, i.e., offering a predictable NoC, requires reserving enough resources so that the traffic requirements of all senders are met, cf. Chapter 1.

To achieve the goals mentioned before, it is required to start with the decoupling of the admission control / QoS configuration from the system execution, using key principles of Software Defined Networks. This is reflected by the architectural measures introduced by the proposed control layer.

The existing NoC is treated as a data layer (communication domain) responsible for the switching of particular packets. The control layer (model domain), responsible for a safe admission control, is built on top of it. Admission control is understood as a validation process performed at runtime, before communication is established, to see if the currently available NoC resources are sufficient for the particular transmission. Consequently, the main functionality of the control layer encompasses model-based analysis methods which are used to establish the settings of the monitoring and isolation mechanisms. Both the control and the data layer are coupled with a protocol-based synchronization, i.e., contracts, allowing resource reservations for senders.

In the case of a switch-based interconnect, these reservations must include all routers on the end-to-end path through the NoC (i.e., more than one network resource). Consequently, for each transmission it is possible to define a virtual resource - a set of real resources belonging to different network components which must be selected and allocated to satisfy its requirements. This is accomplished through connections managed by the introduced control layer, i.e., contract-based end-to-end resource reservations for a given sender-receiver pair on the selected path with a selected QoS provisioning. The QoS provisioning defines the minimal set of resources requested for connections (reservations), including at least the type and the desired upper bound on the response time of the NoC, depending on the particular synchronization scenario. For interfering transmissions, the response time translates to the latency for a particular volume of data and to the rate settings for the rate limiters (traffic shapers).

Connections (for each synchronized sender) must be synchronized and arbitrated at runtime. The arbitration is based on the global state of the system, i.e., on the coordinated reservations. The global state of the system is defined by the number of currently running applications and their requirements for the shared NoC resources. Note that both factors can change during runtime depending on the dynamics of the system itself as well as on the physical environment in which the system is used, e.g., the situation on the road, cf. Chapter 1.

Conceptually, the proposed resource arbitration scheme introduces the model domain architecture, see Fig. 3.10. The deployment of the control layer is realized using multiple analysis engines responsible for the evaluation of different aspects of the resource allocation and admission control. The analysis can be done to optimize various aspects of the system's work (e.g., safety, security or performance). Moreover, the analysis engines can assign resources according to pre-defined static (e.g., Time-Division Multiplexing) as well as dynamic allocation schemes and work-conserving schedulers (e.g., round-robin, priority-based policies or even dynamic priority assignments). Summarizing, the behavioral model defined by the connections for a particular sender on a specific node is negotiated with the control layer and later enforced in the data layer using the aforementioned mechanisms for predictable resource reservations.
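Such a contract can be captured in a small data structure (a sketch with illustrative fields, not a normative record format):

    # Sketch of a connection contract negotiated with the control layer
    # (field names illustrative). The path is the "virtual resource";
    # the remaining fields describe the requested QoS provisioning.
    from typing import NamedTuple

    class Connection(NamedTuple):
        sender: str          # requesting task / processing node
        receiver: str        # target node or peripheral (e.g. DDR-SDRAM)
        path: tuple          # routers and links forming the virtual resource
        max_latency: int     # desired upper bound on the NoC response time
        volume: int          # data volume the latency bound refers to
        rate: tuple          # (Tw, Nmax) to be programmed into the limiter

The control layer admits a new connection only if, considering the contracts already granted on links of the same path, the requested bound can still be guaranteed.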

Figure 3.10: Model domain architecture for adaptive arbitration in a NoC.

As one of the main goals of this work is to provide a deterministic response time for the senders, the control layer must offer a predictable resource allocation which can be verified during the design phase. This requires:

avoiding contention in (NoC) buffers,

dividing bandwidth between interfering senders according to their requirements,

adjusting the settings during the runtime to cope with the dynamics in the behavior of the system.

In most existing NoCs, predictable resource reservations are controlled through the data layer, introducing:

admission control done locally in nodes, e.g., traffic limiters, performance counters,

arbiters deciding about the allocation of resources to streams in routers,

Figure 3.11: New modules introduced by the control layer forming the session layer of NoC communication.

mixed solutions combining both methods mentioned above.

Based on these mechanisms, we show in the following parts of this chapter how a control layer can be applied to simultaneously achieve efficient performance and real-time guarantees in the presence of dynamics. The proposed scheme opens multiple implementation possibilities. The data layer, encompassing independent, local resource schedulers, must be implemented on each resource in the system which can be accessed by more than one communication participant (processing node/tile). However, for the control layer, several negotiation schemes are possible since it is orthogonal to the underlying system and may be implemented independently.

3.4 Overview of the Architecture

The dynamic resource allocation requires consideration of the local state of the core (the number and execution profiles of the currently active applications, defining the core's requirements for the NoC) as well as of the global state of the interconnect (the number and profiles of the active interfering senders, determining the possible interference from the interconnect). Although the evaluation of the former, "on-core" factor can be done locally (disregarding information about other senders), information about the latter, "off-core" factors requires further, global synchronization done by the QoS control plane. The essence of the centralized implementation of the QoS control plane is a resource management (RM) unit and clients.

Figure 3.11 presents an overview of the design which, from a technical perspective, introduces connection-oriented network services [136] adjusted to provide a predictable network behavior. To use a connection-oriented network service, the communication participant (e.g., an IP) must first establish a connection with the Resource Manager (RM) unit, i.e., request the allocation of the NoC resources for the particular connection. Only after the resources are assigned by the RM may the sender conduct the transmission (use the resources). Finally, the NoC resources must be released whenever they are no longer necessary. In this context, the primary function of the RM is to establish and manage all aspects of this process, i.e., the RM's arbiter must decide the order and duration of the connections (transmissions).

However, assuring the predictable behavior of the RM is not enough for providing temporal guarantees. Note that malfunctioning or malicious senders may easily influence the work of the RM. For instance, wrongly configured applications could send too many requests, delaying or blocking the work of the arbitration unit. They could also wrongly/maliciously set up their connections, e.g., ask for too many resources (blocking other senders) or occupy them for an extended period (or even never release them). Due to their importance, these problems must be addressed whenever temporal predictability is required; otherwise, no real-time guarantees can be given. Therefore, the proposed mechanism introduces clients - admission control units (local supervisors) running locally on each processing node in the NoC. Clients are responsible for: i) establishing connections with the RM (issuing correct messages to the RM for the appropriate actions of the senders and the processing node); ii) preventing non-authorized accesses to the NoC; iii) adjusting the local admission settings in the NI, e.g., the rate settings of the rate limiters or the MMU/MPU address translation tables, based on the configuration messages from the RM; iv) preventing too frequent accesses; v) releasing the NoC resources (informing the RM whenever an application terminates); and vi) preventing unbounded NoC accesses (releasing the NoC resources and notifying the RM whenever a connection takes too long, i.e., terminating overly long connections).

Figure 3.11 presents the client and the RM, forming the session layer (in red). The client modules can be implemented as hardware extensions of the NI (for high performance) or as software modules running on the IP core (although also in this case some extension of the NI may be necessary). Similarly, the RM can be fully realized in hardware, i.e., as an independent HW IP connected to the NoC, or as a software IP running on one of the nodes. The main advantage of the software realization is its flexibility. The complexity of both units depends on the complexity of the selected synchronization protocol and will be discussed in the next sections. Separating the clients and the RM from the software running on the processing nodes is especially important in safety-critical setups where the NoC is treated as a shared resource.

Recall that in such setups it is necessary to either certify the whole system (including all applications) to the highest relevant safety level or to decouple the resource arbitration from the senders to provide sufficient independence [67], [5]. As assuring adherence to standards is a costly and demanding process, the latter is the preferred solution in most setups. It can be achieved through the clients and the RM, which can be designed and certified independently to the highest relevant safety level.

3.5 Synchronization with the Control Layer

In general, the process of global synchronization can be divided into the following phases: initialization, reservation, usage, and release. Fig. 3.12 presents the method in the context of a motivational example: an application (sender) issues a request to the RM for access to a DDR-SDRAM. Firstly, the client must detect the access from the application (IP) and block (trap) it until the RM grants the connection. Next, the client must issue a request to the RM to establish a proper connection, Fig. 3.12.a). After receiving, evaluating and processing the request, the RM can grant the access to the resource - the DDR-SDRAM memory, Fig. 3.12.b). Upon the arrival of the grant message from the RM, the client unblocks the sender, and the connection becomes active, Fig. 3.12.c). The sender may have access to the resources for a predefined amount of time or for the length of the particular transmission (e.g., a number of issued packets), depending on the actual implementation. After the transmission finishes or the connection length has been reached, the client must terminate the communication and release the resources (and eventually block the sender). This is done with an appropriate release message issued to the RM, Fig. 3.12.d). The resources are considered to be free again after the RM has processed the release message.

Now, the process of synchronization will be discussed in detail. This description will be used for presenting different versions of the protocol, depending on the temporal requirements of the senders. Fig. 3.13 provides a generalized version of the synchronization workflow. Note that the actual steps and their details may vary according to the selected protocol and implementation. In the figure, a sender conducts an access to a virtual resource composed of two real hardware resources, i.e., a connection (e.g., a path through the NoC with a predefined QoS) and the exclusive access to a selected hardware module (e.g., a DDR-SDRAM controller). To satisfy this request, the RM firstly evaluates the available resources while considering the global state of the system.

[Figure 3.12 consists of four panels - a) Request (reqMsg), b) Grant (gntMsg), c) Access (data packets), d) Release (relMsg) - each showing the sender IP, client, NI, NoC routers, the RM IP and the DDR target within the session layer.]

Figure 3.12: New modules introduced by the control layer forming the session layer of NoC communication.

This is done based on the defined system model and the selected arbitration policy. Next, the RM locks the NoC resources for the requesting sender and (if necessary) re-configures the settings of the NoC (e.g., traffic shapers/rate limiters in the egress ports of NIs) to fulfill the requested connection parameters. The primary goal is to assure that every requesting sender receives the maximum available service for the given state of the system, and that the transitions between system states (i.e., mode changes) are safe and reliable, i.e., avoid sporadic overloads.

[Figure 3.13 diagram: the phases Initialization, Reservation (Request -> Determine configuration -> Lock -> Acknowledge Reservation -> Re-Configure -> Acknowledge), Usage and Release, spanning the Client, the RM and the NoC resource.]

Figure 3.13: Exemplary workflow of the control layer specifying synchronization phases.

After the transmission is complete, the resources must be released again for other senders. The complexity of the protocol, i.e., of each of the reservation phases, depends directly on the resource allocation scheme applied by the control layer (designer) as well as on the concrete architectural features of the selected underlying system architecture. The following sections provide an overview of the necessary actions, covering all aspects of these phases in the scope of the conducted operations and the participating elements of the system. Possible implementation methods and required tools will be detailed in Chapter 4.

3.5.1 Phase One: Initialization

First, the access from the synchronized sender must be detected and evaluated to provide information about the QoS requirements of the requested connection. Later, this information must be forwarded to the central scheduling unit - the Resource Manager.
The prerequisite is HW support for the admission control, i.e., NIs/tiles should support at least monitoring for rate control. This assumption is realistic, as rate limiters are provided by the majority (if not all) of contemporary MPSoCs, e.g., R-Car [120], Cell [115], KiloNoC [57], IDAMC [140], MPPA [41]. The considered baseline NoC architecture also supports them. The main change introduced by the control

layer is the need to re-configure the rate settings at runtime. The values of rate limiters are currently set only once during the boot-up phase of the SoC or hard-coded in the modules. Consequently, adjusting them at runtime may require an additional extension of the interface. An MMU or MPU would be required if the NoC should provide fine-granular protection, e.g., distinguish between different address ranges. Both can be found in most new controllers, such as the R-Car from Renesas, but also in the Cell processor. Therefore, the synchronization can be initiated by:

NI through monitors (Rate Ctrl + MPU/MMU),

Tile with Rate Control (implemented in HW or OS-SW).

The accuracy/granularity of synchronization constitutes the major design trade-off. In the first scenario, a monitor in the NI can recognize only the addresses accessed by the tasks running on the tile. As a result, distinguishing between specific senders may be difficult. Therefore, synchronization can be applied either to the whole tile (e.g., monitoring the joint workload resulting from traffic initiated by a group of tasks) or to specific address ranges (whenever the NI supports MPU/MMU protection). The second scenario considers setups where the tile provides additional support for identifying senders and processing messages from the RM. This could be done through SW (e.g., dedicated syscalls) or HW (e.g., interrupts) extensions. At the cost of this additional resource overhead, it is possible to identify the specific tasks running on the tile or, at a higher granularity, particular actions conducted by applications (e.g., concrete DMA transfers). Similarly, the client's logic (responsible for synchronization with the RM) can be developed as a SW or HW IP. The software deployment decreases the size of the NI; it requires only an extra interface for programming the rate limiters and the MMU/MPU. On the other hand, the HW implementation improves performance. The complexity and goals of clients depend on the resources available for implementation as well as on the characteristics of the running senders and therefore vary between setups. Independently of the method of deployment (HW or SW), the actions of the client are transparent to the sender.
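To make the runtime re-configuration concrete, the following minimal sketch shows how a software client could re-program an NI rate limiter after receiving new settings from the RM. It is written in C; the register layout (rate_budget, rate_window, enable) is a hypothetical stand-in for whatever interface a concrete NI exposes, not the interface of the baseline architecture.

#include <stdint.h>

/* Hypothetical memory-mapped register block of an NI rate limiter. */
typedef struct {
    volatile uint32_t rate_budget;  /* flits allowed per replenishment window */
    volatile uint32_t rate_window;  /* window length in NoC clock cycles */
    volatile uint32_t enable;       /* 1 = limiter active */
} ni_rate_limiter_t;

/* Apply settings received in a configuration message from the RM. */
static void client_apply_rate(ni_rate_limiter_t *lim,
                              uint32_t budget, uint32_t window)
{
    lim->enable = 0;            /* quiesce the limiter before re-programming */
    lim->rate_budget = budget;
    lim->rate_window = window;
    lim->enable = 1;            /* re-arm with the new mode's settings */
}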

3.5.2 Phase Two: Reservation

Each request must pass several processing steps defining the internal working of the RM. These steps are presented in Fig. 3.14. First, a new request is evaluated concerning the availability of resources, depending on the current state of the NoC, i.e., the number of running senders and conducted transmissions. Later, the

RM must check if the requested connection can be realized in the current system configuration assuming the selected arbitration method, i.e., conduct an admissibility test. For this purpose, the RM requires an up-to-date model of the NoC and a set of rules - the arbitration policy. The former takes the form of data structures with information about resources, e.g., links and buffers in the NoC, their capacity and their utilization. The primary goal is to check which NoC resources are occupied by senders and what the QoS requirements of the currently running transmissions are. The arbitration policy must decide the order in which requests are allocated. For instance, all requests can be treated equally (round-robin policy), priority-based preemptive or non-preemptive schedules can be used, as well as time-based allocations (e.g., time-division multiple access, TDMA). Additionally, the RM may introduce advanced arbitration policies oriented towards selected properties of the system. As an example, the RM arbiter can optimize the utilization of a DDR-SDRAM by preserving the locality of memory accesses. In this case, the arbitration must demand that:

only one transmission from the synchronization scenario can be active at a time,

the change of an active transmission is possible only after the previous one has finished or it is preempted due to a transmission request with a higher priority,

the RM must prevent starvation of requestors.

If there are enough resources to handle the request, the RM may allocate them to the requester and notify the client (or sender) about the success. If this is not the case, the management unit may look for another possible setup/configuration, e.g., adjust the properties of other clients so that the requirements of all senders can be met. If reservation requests are served with a priority-based arbitration, the RM must enforce the proper order of re-configurations during the allocation phase. This can be done in either a preemptive or a non-preemptive manner. If the former method is selected, the RM can deprive senders with lower priorities of resources as soon as a high-priority request arrives. This operation inherently causes a change of the system mode and, as a result, may cause a sporadic overload situation which endangers the system's safety [105], [78]. For instance, although the RM may send a control message to block individual nodes, the flits/packets injected before the arrival of this message must still leave the NoC and could interfere with higher priority senders. Therefore, such effects must be accounted for during the design of the control protocol to assure safe and efficient mode changes. A detailed description follows in Section 3.6 and Chapter 4. If a non-preemptive method is selected, the higher priority request must wait for the ongoing lower priority transmission to finish, but this blocking happens only once.

[Figure 3.14 flowchart: Request -> Evaluate configuration -> "Can resources be allocated?" - if yes, allocate the resources and re-configure the system, then notify of success; if no, check "Is another setup possible?" and either re-evaluate or notify of failure.]

Figure 3.14: Steps required for the re-configuration of the QoS control plane and its operations in the NoC domain.

In situations where there are not enough NoC resources to handle the request, the RM may actively reject the transmission or inform the requester about the foreseen length of the blocking, see Section 3.7. This creates an interface between on-core scheduling and resource scheduling, e.g., on-core preemptions. Similarly, the RM may re-order (or prioritize) requests based on the properties of the resources which they target, e.g., the state-dependent nature of a DDR-SDRAM scheduler. Details on the interfaces to the on-core and resource schedulers are provided in Sections 3.7 and 3.8. Finally, the RM must communicate its decisions (appropriate regulator settings) to all the involved senders/clients with control messages.
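The admissibility test and the allocation step of Fig. 3.14 can be condensed into a few lines. The sketch below, in C, assumes a simple link-utilization model; all names (rm_model_t, request_t, the link capacity) are illustrative and do not reflect the concrete data structures used later in this work.

#include <stdbool.h>
#include <stdint.h>

#define NUM_LINKS 16

typedef struct { uint32_t link_load[NUM_LINKS]; } rm_model_t; /* utilization per link */
typedef struct { uint32_t links[4]; uint32_t n_links; uint32_t demand; } request_t;

/* Admissibility test: every link on the requested path must have spare capacity. */
static bool rm_admissible(const rm_model_t *m, const request_t *r, uint32_t cap)
{
    for (uint32_t i = 0; i < r->n_links; i++)
        if (m->link_load[r->links[i]] + r->demand > cap)
            return false;
    return true;
}

static bool rm_reserve(rm_model_t *m, const request_t *r, uint32_t cap)
{
    if (!rm_admissible(m, r, cap))
        return false;                            /* notify the client of failure */
    for (uint32_t i = 0; i < r->n_links; i++)
        m->link_load[r->links[i]] += r->demand;  /* lock the resources */
    return true;                                 /* re-configure the NoC, ack the client */
}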

3.5.3 Phase Three: Usage

The resource usage session starts directly after the client receives an acknowledgment of its request. The client modifies/adjusts the regulator settings only after receiving them from the RM, depending on the current system mode communicated by the RM. Furthermore, the acknowledgment may allow:

entire transmissions consisting of multiple packets, for instance a DMA transfer,

accesses to the NoC with a pre-defined rate (settings for rate controllers) for cache-based traffic.

The settings may change at runtime during the usage phase. Firstly, the application may be blocked to allow transmissions with higher priorities to progress with their execution. Secondly, the allowed access rate may be increased or decreased depending on the global state of the system. In the case of cache lines, this effectively throttles (accelerates or slows down) the on-core execution of senders. Finally, for some applications the settings may be pre-defined (during the design phase) and constant at runtime, i.e., they remain in the usage phase for the whole execution duration. For instance, the designer may decide to keep the rates of some applications at a constant level (without endangering their deadlines) in some or all modes, to limit the number of necessary synchronizations and re-configurations. Additionally, the fully dynamic reconfiguration may be conducted only for selected senders or usage scenarios, e.g., fail-over recovery assigning more bandwidth to a selected sender or data flow when a re-transmission is needed due to a transient error.

3.5.4 Phase Four: Release

The mechanism requires that the resources are released by clients whenever a sender does not use them. The releases introduce dynamics which not only improve the utilization of the system but, even more importantly, enable better formal guarantees in most setups.
The release method strictly depends on the requesting procedure and has similar advantages and drawbacks. Apart from the release being performed at the end of a transmission, e.g., signaled by the last packet, flit or an interrupt, the release may be enforced by a monitor after a predefined timeout. This is done to assure a safe upper bound on the resource usage sessions, i.e., to avoid infinite accesses which could be caused by malicious or malfunctioning applications.
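A minimal sketch of such a timeout monitor is given below, assuming a platform cycle counter (noc_cycles()) and a message helper (send_relMsg()) that are placeholders for a concrete implementation.

#include <stdbool.h>
#include <stdint.h>

extern uint64_t noc_cycles(void);            /* assumed platform time base */
extern void send_relMsg(uint32_t sender_id); /* release message to the RM */

typedef struct { uint32_t sender_id; uint64_t t_start; uint64_t budget; } session_t;

/* Called periodically by the client; forces a release once the budget expires. */
static bool session_watchdog(const session_t *s)
{
    if (noc_cycles() - s->t_start < s->budget)
        return false;          /* session still within its allowed budget */
    send_relMsg(s->sender_id); /* bound the usage session: forced release */
    return true;
}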

Summary of the Phases

The mechanism provides a QoS abstraction of the underlying NoC (data plane), allowing a path-oriented approach in which both the per-hop behaviors of routers and the end-to-end properties of communication can be unified. This unification enables safe and efficient resource reservations but requires knowledge about the global state of the system, i.e., the number of simultaneously running senders, their current state and their QoS requirements, which may change during runtime. Therefore, to establish and adaptively manage connections, clients must negotiate

reservations. The negotiation is done through an exchange of messages between nodes with interfering senders. This communication forms a synchronization protocol whose purpose is to propagate information about the state of a connection as well as the current QoS requirements of a particular sender. Note that the latter parameter can change dynamically, and the contract-based negotiation safely adjusts the QoS settings of the platform.
Several negotiation schemes are possible, with two predominant patterns: direct communication between clients, or communication through a single scheduling unit (Resource Manager). This problem translates to the classic synchronization dilemma differentiating between centralized and decentralized resource assignment schemes. The trade-offs between both approaches are well researched and described in the existing literature, e.g., [74], [103], [14]. The choice between these schemes is usually setup dependent and influenced by the number of synchronized senders, their mapping, the parameters of the transmissions (length, duration, etc.) as well as the particular architecture (e.g., the propagation latency of a single control message). In the next section, a couple of exemplary resource arbitration protocols will be used to familiarize the reader with the mechanism and present the possible trade-offs.
However, the main advantage of the mechanism, common to both synchronization schemes, is that the real-time and safety guarantees do not depend on the QoS functions in the routers, i.e., the mechanism can be incorporated into existing NoC architectures by adjusting the admission control in the NIs. The admission control and the adaptive management of the NoC state are entirely controlled by the introduced QoS control plane, making the system potentially more efficient. The resource arbitration can be optimized and adjusted to the changing global state of the system to accommodate arriving workloads as well as to react to possible errors.

3.6 Synchronization Protocols

An important advantage resulting from the decoupling of the admission control from the underlying NoC infrastructure is the broad spectrum of synchronization protocols (allocation schemes) which can be implemented with the control layer. Recall that the control layer may implement non-work-conserving (e.g., time-division multiplexing) and work-conserving arbiters (e.g., round-robin arbitration), covering the majority of the sharing schemes known from the literature, e.g., [92], [121]. Moreover, it is also possible to use custom arbiters for particular deployments, i.e., adaptive arbitration.

This flexibility allows extending the capabilities of a selected NoC-based platform without the need for modifications of routers. For instance, the introduced solution permits the implementation of a TDM-based arbitration on top of a NoC with performance-oriented arbiters in routers, e.g., the iSLIP arbiter [101]. Consequently, the control layer broadens the spectrum of system applications, e.g., adjusting the platform to provide guarantees in real-time domains, without a need for re-design or costly hardware extensions.
The following section describes several resource allocation strategies (and synchronization protocols) for the centralized implementation of the mechanism. Although a centralized point of synchronization through a single RM unit may seem to be a performance bottleneck, it also has significant advantages when it comes to safety and formal verification, as in the case of SDNs, see [137]. Firstly, it simplifies the synchronization protocol, e.g., compared to the (rapid) spanning tree protocol or shortest path bridging. In contrast, decentralized systems frequently require long and complicated protocols for assuring system coherence. For example, in the decentralized scenario, assuring that all senders agree on the global state of the network before they can take actions significantly increases the complexity and volume of communication in larger scenarios. Moreover, the simpler clients and RM resulting from the single point of synchronization decrease the costs of verification and certification, which play a critical role in many real-time applications, e.g., in the avionic and automotive domains.
Note that a centralized synchronization inherently suffers from a scalability problem. Therefore, a possible future extension of the presented work could consider systems with multiple RMs controlling different regions of the NoC. Moreover, it would be interesting to investigate fault-tolerant mechanisms as a further development of this thesis. These mechanisms would explore protocols for the control layer capable of performing a design-space exploration within the system model to find a feasible solution for resource allocations. This would be done whenever the requested contract could not be established, as well as for adjusting the NoC settings, e.g., for robustness against faults of components. Consequently, in the future, the mechanism may introduce full or partial adaptivity to the architecture.
This section presents multiple examples of synchronization protocols which support different allocation schemes. Note that the protocols are adjusted for sharing the same VC between transmissions while preserving service guarantees. This also includes senders with different criticality levels, cf. Chapter 1. This feature of the mechanism permits decreasing the number of required VCs in a system or increasing the number of hosted applications. Additionally, the majority of the proposed protocols derive their efficiency from work-conserving scheduling, which tries to keep the interconnect resources busy for a maximum period of time.

[Figure 3.15 diagram: the RM assigns TDM slots S1-S7 alternately to applications A and B; each slot comprises the gntMsg, the RM processing overhead (rm_proc) and the transmission (trans).]

Figure 3.15: Workflow of the TDM cycle enforced using the RM in a NoC, with the protocol overhead (rm_proc) and the durations of two different transmissions (trans).

3.6.1 Time-Driven Scheduling

Static isolation is a popular method for managing contention in a NoC, as it provides service guarantees that are independent of other tasks' behavior, cf. [92], [121]. Consequently, each task can occupy the interconnect for a given, predefined amount of time. After entirely using its time budget, the sender must release the resource and allow the next task to proceed with its transmission. The budgets are replenished in cyclic order. The proposed control layer can implement both popular variants of time-driven scheduling: time-division multiple access (TDMA) and round robin (RR).

Time-Division Multiple Access

As discussed in Chapter 2, the time-division multiplexing (TDM) approach constitutes a commonly applied solution for real-time systems and is a dominant standard in the avionic domain. According to this scheme, each application is granted a dedicated time slot during which it acquires exclusive access to the interconnect. Transmissions are performed in a cyclic, static order forming a TDM cycle. Consequently, the designer may enforce full temporal isolation between independent transmissions, i.e., the guarantees for running applications are independent of the activities of other senders in the system.
Fig. 3.15 presents the workflow of such a solution. In the considered example, two senders (A and B) form a TDM cycle in which they acquire access to the NoC alternately. The RM sends messages - gntMsg - granting access to a selected core for a predefined amount of time² - its time slot (slots S1-S8).

²Note that NIs must have the ability to measure time (e.g., a time window for rate limiters). However, a setup is also possible where only the RM has a timer; in this case, the RM must first block/preempt the ongoing transmission from the previous slot and only then send the gntMsg, as described in the following sections.

The time necessary for the RM to generate the message, together with its propagation latency in the NoC, constitutes the overhead of the scheme when compared to traditional TDM deployments done in hardware. However, the main advantage of the proposed solution is its applicability to most commercially available NoCs, such as the MPPA or the Tile64. The proposed RM-based TDM admission control can be introduced "on top" of existing, commonly used wormhole-switched NoCs with multi-stage arbitration. In contrast, existing NoCs supporting time-driven scheduling require custom, and frequently complex, router architectures³, e.g., PhaseNoC [116], SurfNoC [150], Aethereal [56]. Moreover, the TDM scheme, when implemented with the control layer, can easily be extended to incorporate elements of round-robin work-conserving scheduling, e.g., [80].
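For illustration, the slot rotation performed by the RM can be reduced to the following sketch: the RM walks a static slot table and emits one gntMsg per slot. The table contents and the message helper are assumptions made for the example, not the thesis' implementation.

#include <stdint.h>

extern void send_gntMsg(uint32_t client_id, uint32_t slot_len);

#define TDM_SLOTS 2
static const uint32_t slot_table[TDM_SLOTS] = { /*A*/ 0, /*B*/ 1 };

static uint32_t current_slot = 0;

/* Invoked when the current slot's time budget expires. */
static void rm_tdm_next_slot(uint32_t slot_len)
{
    current_slot = (current_slot + 1) % TDM_SLOTS;    /* static cyclic order */
    send_gntMsg(slot_table[current_slot], slot_len);  /* gntMsg = slot grant */
}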

Round-Robin Based Performance Optimized Arbitration

Although a TDM-based resource allocation protocol allows an easy implementation and provides timing guarantees, it also results in average latencies which are very close to the worst case, even when the system is not highly loaded [59]. This is mainly because the traffic from general-purpose applications hardly ever follows the constant, predictable pattern assumed by the TDM approach.
The RM-based control layer helps to overcome this challenge, as it permits implementing the round-robin based resource allocation well known from the real-time literature [92]. The presented description of the protocol with the round-robin scheme is based on [81], [82]. As in the TDM-based design of the control layer, the interconnect is considered to be a global resource shared in time, and transmissions are granted access to the NoC in a cyclic, repetitive order. However, the slots of non-active senders can be released (re-allocated) to serve pending requests from active senders faster, i.e., the introduced arbitration policy is work-conserving. The round-robin scheduler avoids unnecessary idle times and offers a more compact schedule, i.e., it tries to keep the NoC resources busy whenever there are pending transmissions. Consequently, a round-robin based control layer improves average performance without decreasing safety guarantees, i.e., it offers the same worst-case performance/guarantees as TDM.
Workflow The communication protocol for a round-robin based allocation of resources is realized using three control messages: reqMsg (request), relMsg (release) and ackMsg (acknowledge), cf. the workflow in Fig. 3.16.

³Please refer to Chapter 2 for a detailed discussion.

[Figure 3.16 diagram: timelines of the Sender, the Client (clnt_req(), clnt_strt(), clnt_rel()) and the RM (rm_proc(), rm_rel()), connected by the reqMsg, ackMsg and relMsg messages around the execution.]

Figure 3.16: Workflow of the RR-based admission control in a NoC implemented with the control layer, based on [81].

The corresponding client sends a request message to the RM (clnt_req()). The RM is equipped with a queue for storing pending requests from clients. If the queue is empty, the RM must wait for a new request to arrive. Otherwise, the scheduler decides which request from the queue should be served first (rm_proc()). The accesses are granted in the predefined cyclic order, but unused slots (for which there are no pending requests) are skipped. The RM notifies the sender selected for service with the ackMsg. After receiving the ackMsg, the communication may start (clnt_strt()). Once granted, the connection holds until the end of the transmission or until its abortion by the client after a predefined timeout, used to prevent unbounded connection latencies. When the client detects the end of the transmission (e.g., based on its time budget or the injection of the last flit), it issues a relMsg to the RM (clnt_rel()). As soon as the relMsg arrives, the RM considers the resource to be free again (rm_rel()).
RR-based Control Layer vs TDM The proposed solution overcomes the main drawbacks of the TDM-based arbitration. Firstly, due to the work-conserving arbitration, the introduced control layer decreases the average latencies, i.e., the temporal over-provisioning, of applications which are not optimized for the TDM cycle. As presented in Fig. 3.17.a), in a system with a TDM-based arbitration, whenever tasks expose dynamics in their behavior, e.g., even a small jitter, transmissions are blocked and their execution is delayed for the duration of a whole TDM cycle. In contrast, when the RR-based arbitration and the control layer are applied, transmissions can be scheduled whenever the NoC is free, see Fig. 3.17.b). This improved arbitration is possible due to the introduced arbiter, which tries to process requests as soon as they arrive and re-allocates the unused slots. Whenever the interconnect is not heavily loaded, see Figure 3.18, the unused slots can be skipped (re-allocated) to serve pending requests. This re-allocation is especially useful for longer transmissions composed of multiple packets, e.g., DMA transfers, which can progress faster on average without idling between consecutive TDM cycles.
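The slot-skipping behavior can be illustrated with the following sketch of the rm_proc() step. The per-client pending flags and the helper send_ackMsg() are assumptions; a real RM would additionally bound the queue and enforce the timeout described above.

#include <stdbool.h>
#include <stdint.h>

#define NUM_CLIENTS 4

static bool pending[NUM_CLIENTS];   /* set on reqMsg, cleared when served */
static uint32_t last_granted = 0;

extern void send_ackMsg(uint32_t client_id);

/* Called on relMsg (resource freed) or on reqMsg when the NoC is idle. */
static void rm_rr_schedule(void)
{
    for (uint32_t i = 1; i <= NUM_CLIENTS; i++) {
        uint32_t c = (last_granted + i) % NUM_CLIENTS;  /* cyclic order */
        if (pending[c]) {               /* unused slots are simply skipped */
            pending[c] = false;
            last_granted = c;
            send_ackMsg(c);             /* grant: client may start sending */
            return;
        }
    }
    /* no pending requests: the RM waits for the next reqMsg */
}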

[Figure 3.17 diagram: panels a) and b) compare the release of a transmission A under a TDM cycle and under the RM-based arbitration.]

Figure 3.17: Incorporation of dynamics in activation patterns in a system with a) TDM-based arbitration and b) RR-based control layer.

[Figure 3.18 diagram: panels a) and b) compare a sequence of transmissions A under TDM-based arbitration and under the RR-based control layer at low load.]

Figure 3.18: Dependency between blocking and the actual load of the system for a) TDM- based arbitration and b) RR-based control layer.

Furthermore, the arbitration of large scenarios with multiple senders exposing dynamics in their activation patterns (e.g., modes of work, interrupt signals) is done without the need for complicated and hard-to-maintain TDM cycles. The proposed mechanism maintains the locality of memory accesses [113] through the isolation of whole transmissions.

3.6.2 Static Priority Based Arbitration

This section presents an alternative approach towards assuring QoS in the NoC. The control layer is adjusted to provide arbitration in mixed-critical real-time systems. In such setups, the different requirements of senders (regarding NoC resources and/or their importance for safety) are reflected by priorities assigned to the transmissions, e.g., using a deadline-monotonic priority assignment. Consequently, the arbiter used by the RM must adjust the resource reservations depending on the importance (priority) of the request. The remainder of this section describes the static-priority-based RM protocol, according to which resources are always granted to the request with the highest priority. The description is based on [83] and [75].
The synchronization between clients and the RM is accomplished using five control messages: reqMsg (request), relMsg (release), ackMsg (acknowledge), blckMsg (preempt communication) and resMsg (resume communication). The workflow is depicted in Figure 3.19.

[Figure 3.19 diagram: message sequence between the Sender, the Client and the Resource Manager - clnt_req()/reqMsg(1), rm_proc(), ackMsg, clnt_strt(), a second request reqMsg(2), rm_premt()/blckMsg and clnt_block() for the preemption, relMsg(2)/rm_rel(), resMsg and clnt_unblock() to resume, and finally relMsg(1)/clnt_rel()/rm_rel().]

Figure 3.19: Workflow of the SPP-based synchronization protocol in a NoC utilizing the control layer, based on [83].

Clients issue a request message (clnt_req) whenever a supervised sender tries to access resources. If there is no pending access, the RM must wait for a new request to arrive. If a request cannot be granted due to an ongoing higher-priority transmission, it is stored in a request queue. When the resource is free again and the request queue is not empty, the RM selects the request with the highest priority and notifies the appropriate client with the ackMsg, which unblocks the granted transmission. If a resource is occupied, i.e., there is an ongoing transmission, the RM must monitor all arriving requests and compare their priorities against the priority of the currently ongoing transmission. Comparison and monitoring are done in arrival order. If the priority of the request is lower than the priority of the ongoing transmission, it is stored in the queue; otherwise, the

RM conducts a preemption. In order to preempt an ongoing transmission, the RM sends a blckMsg to the client supervising it. Preemptions may be nested, i.e., transmissions which previously preempted an ongoing communication can also be preempted. Therefore, the RM must store information about ongoing and preempted transmissions. Moreover, the RM sends the ackMsg with a delay to ensure that the packets from the previously ongoing transmission have left the NoC, i.e., to provide logical isolation. After receiving the blckMsg, the client blocks the next packet of the ongoing transmission as soon as possible. The preemption granularity depends on the admission control in the NI (rate limiter). In this work, assuming the baseline architecture, the arbitration is performed at the packet level. Because preemptions happen on the same VC, the flit level is unacceptable due to the properties of wormhole routing (cf. [37]). A coarser preemption granularity, e.g., multiple packets to ensure the locality of a transmission, is also possible after accounting for an additional latency. When the client detects the end of a transmission (e.g., based on its time budget/timeout or the injection of the last flit), it issues a relMsg to the appropriate RM. As soon as the relMsg arrives, the RM considers the resource to be free again. If there is a pending preempted transmission with a lower priority, the RM resumes its execution by sending a resMsg to the appropriate client. Otherwise, a new request is selected from the queue. As soon as the resMsg arrives, the client unblocks the corresponding preempted transmission.
RM-based SPP vs SoA Solutions As discussed in Chapter 2, prioritization of traffic conducted locally in routers is a popular solution for the allocation of resources in NoCs, e.g., [31], [66]. The main disadvantage of this scheme is the high resource overhead (demand), i.e., streams of different priorities must be isolated in separate traffic queues. Consequently, the number of VCs must be equal to the number of criticality levels in the system and must increase accordingly with the constant growth in the number of applications integrated into a single chip, e.g., a Flight Management System [46]; otherwise, the system is not predictable [55]. The proposed scheme based on the control layer offers efficient sharing of the same buffer queues in routers while preserving the priority-based arbitration. This feature is especially important in mixed-critical setups.
Additionally, the formal verification of priority-based routers can be complicated. For instance, the effects of backpressure (see Chapters 1 and 2) could lead to the propagation of blocking and cyclic dependencies between streams. The proposed protocol for the control layer avoids these problems by using a global arbitration scheme. The blocking is moved from the NoC to the cores; therefore, i) it is possible to reduce the size of the buffers significantly [83], and ii) one could re-use the on-core blocking time for the execution of other tasks, which is described in detail in Section 3.7.

The control layer also guarantees the locality of memory accesses, similarly to the RR- and TDM-based protocols. This is not possible in the case of priority-based scheduling performed locally in routers.
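The grant/preempt decision taken by the RM on each reqMsg can be summarized in the following sketch, where a lower number denotes a higher priority and preempted transmissions are kept on a stack for the later resMsg. All names, and the omitted request queue, are illustrative simplifications.

#include <stdint.h>

extern void send_ackMsg(uint32_t client_id);   /* grant (in practice, delayed) */
extern void send_blckMsg(uint32_t client_id);  /* preempt the active sender */

#define MAX_NEST 8
static uint32_t preempted[MAX_NEST];  /* stack of preempted client ids */
static uint32_t n_preempted = 0;
static uint32_t active = UINT32_MAX;  /* UINT32_MAX = NoC free */
static uint32_t prio_of[16];          /* per-client static priority, lower = higher */

static void rm_spp_request(uint32_t client)
{
    if (active == UINT32_MAX) {
        active = client;
        send_ackMsg(client);               /* NoC free: grant at once */
    } else if (prio_of[client] < prio_of[active] && n_preempted < MAX_NEST) {
        send_blckMsg(active);              /* preempt the lower-priority sender */
        preempted[n_preempted++] = active; /* remember it for the later resMsg */
        active = client;
        send_ackMsg(client);  /* delayed in practice until the NoC has drained */
    }
    /* else: store the request in the priority queue (omitted for brevity) */
}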

3.6.3 Dynamic Priority Based Arbitration

The strict priority-based scheduling done locally in routers decreases the performance of best-effort senders. These usually receive the lowest possible priority [43], [83], and thus they are scheduled only when there is no other pending transmission. Consequently, BE senders suffer from high latencies and jitter.
Slack-based resource allocation is a solution to this integration challenge well known from scheduling theory [92], [39]: it provides high performance for best-effort (BE) and soft real-time transmissions (SRTTs) without violating the timing constraints of hard real-time transmissions (HRTTs). According to this approach, BEs and SRTTs are scheduled whenever the execution of HRTTs can be safely postponed without causing missed deadlines. This additional delay is possible whenever slack is available, which is the time budget between the worst-case response time of a hard real-time application and its deadline. Applying this principle to NoCs significantly increases the performance of soft real-time and best-effort data streams, which is particularly useful for general-purpose, latency-sensitive applications running on processors with caches [144, 141, 145].
This section proposes an extension of the previously introduced priority-based protocol for the control layer. The description is based on [75]. The RM safely delays the execution of HRTTs, whenever possible, in order to improve the performance of SRTTs. A prerequisite of the introduced arbitration is an offline computation of a time budget for each HRTT, called slack (see Chapter 4, Section 4.1.1), which defines the maximum delay a HRTT can experience without endangering its deadline. At runtime, the RM monitors the slack budgets of the currently ongoing HRTTs and dynamically adjusts the priority of SRTTs. SRTTs get the highest priority whenever slack is available and the lowest priority whenever there is no slack. Note that delaying the execution of the HRTT with the highest priority also delays all currently preempted HRTTs with lower priorities. Therefore, this delay can only occur if enough time (i.e., slack) is available for all HRTTs to meet their deadlines.
Workflow The implementation of the arbitration based on dynamic priorities is realized using five control messages: reqMsg (request), relMsg (release), ackMsg (acknowledge), preMsg (preempt) and resMsg (resume). Upon arrival of a request message (reqMsg) at the RM, each request is classified as SRTT or HRTT and stored in the appropriate queue: a FIFO is used for SRTTs and a priority queue for HRTTs. As

described previously, the RM monitors all HRTTs and keeps track of the amount of available slack. When a SRTT is granted access to the NoC, the RM decrements in real time the slack budgets of all active pending HRTTs, i.e., it considers all requests in the HRTT queue. The slack budget is renewed at each request arrival (reqMsg) from a HRTT. If there are no pending SRTTs or the slack budget of at least one HRTT has been exhausted, HRTT transmissions regain access to the NoC and are scheduled according to the standard static priority preemptive (SPP) policy. Note that when SRTTs have access to the NoC, the RM must release the resource early enough to guarantee that HRTTs start on time (i.e., not after the slack budget has been consumed). When the NoC is free, the RM notifies the appropriate client with an ackMsg for the pending transmission to start sending data. If the NoC is occupied and the RM receives a new request with a higher priority, the RM must preempt the currently ongoing transmission. The preemption is realized in the same manner as in the protocol for arbitration with static priorities, i.e., with a preMsg for blocking the client and a resMsg for resuming it. Recall that the high-priority sender must start its transmission with a delay (set by the ackMsg) to ensure that packets from the previously ongoing transmission are no longer present in the NoC. Clients confirm finished transmissions by sending the relMsg to the RM, which removes the corresponding request from the head of the queue. Next, if there is a pending preempted transmission with a lower priority, the RM resumes its execution by sending a resMsg to the appropriate client. Otherwise, a new request is scheduled.
Comparison against SoA Only a few existing works, e.g., the Backsuction mechanism [43, 141, 145], have considered slack-based resource allocation in the context of NoCs by applying, locally in routers, dynamic prioritization of different virtual channels. However, the main drawback is the high hardware overhead resulting from a custom router design and the high number of virtual channels (i.e., required buffers) corresponding to the number of priority levels in the system. Moreover, these solutions drastically increase NoC complexity, as they require propagating the global state of the NoC to the local arbiters in routers. This propagation contradicts the principle of non-blocking routers, which requires no correlation between local arbiters, and thereby increases the complexity of the worst-case timing analysis.
Consider the example depicted in Figure 3.20: a NoC shared between a set of HRTTs and SRTTs. It constitutes a mixed-critical system where each HRTT has a unique static priority (assigned, for instance, according to a rate-monotonic scheme) and the remaining SRTTs share the same lowest priority. Figure 3.20-(a) illustrates the evolution over time of the transmissions in a real-time NoC using a static priority preemptive (SPP) policy [31], [66]. The RM permits safely delaying the execution of HRTTs, whenever possible, to improve the performance of SRTTs.⁴

[Figure 3.20 diagram: two panels - (a) SPP in NoCs: SRTTs can access the NoC only during spare time; (b) slack-based allocation: postponing HRTTs allows early SRTT progress in the NoC - showing the activations, transmissions, preemptions and deadlines of HRTT 1, HRTT 2 and the SRTTs over two periods.]

Figure 3.20: Comparison between SPP and slack-based resource allocation in NoCs, based on [75].

SRTTs get the highest priority whenever there is available slack and the lowest priority whenever there is no slack. Note that delaying the execution of the HRTT with the highest priority also delays all currently preempted HRTTs with a lower priority. Therefore, the delay can only occur if enough slack is available for all HRTTs to meet their deadlines. This effect is illustrated in Figure 3.20-(b), where HRTTs are postponed and SRTTs may start earlier. Consequently, the introduced control layer manages to drastically reduce the latencies of SRTTs, compared to Figure 3.20-(a), without endangering the deadlines of HRTTs.
Finally, this arbitration does not require a custom router design. Indeed, the proposed solution can be applied to enhance architectures using simpler routers, e.g., in performance-optimized NoCs.

⁴Recall that in real-time systems, SRTTs can frequently progress through the NoC only when HRTTs are not sending data; therefore, they are often unnecessarily delayed. In contrast, HRTTs are scheduled as soon as they arrive, although they usually do not profit from faster execution as long as they are guaranteed to finish before their deadline.
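The slack accounting performed by the RM while a SRTT occupies the NoC can be sketched as follows. The per-HRTT state, the time quantum, and the decrement policy are assumptions made for the example; the offline-computed slack values come from the analysis in Chapter 4.

#include <stdbool.h>
#include <stdint.h>

#define MAX_HRTT 8

typedef struct {
    bool    pending;  /* reqMsg received, not yet served */
    int64_t slack;    /* remaining budget, renewed on each reqMsg */
} hrtt_state_t;

static hrtt_state_t hrtt[MAX_HRTT];

/* Called once per time quantum while a SRTT is using the NoC.
 * Returns true if the SRTTs may keep the NoC for another quantum. */
static bool rm_srtt_may_continue(int64_t quantum)
{
    for (int i = 0; i < MAX_HRTT; i++) {
        if (!hrtt[i].pending)
            continue;
        hrtt[i].slack -= quantum;  /* SRTT progress consumes HRTT slack */
        if (hrtt[i].slack <= 0)
            return false;          /* revert to SPP: serve the HRTTs now */
    }
    return true;
}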

3.6.4 Adaptive Path Allocation

In most existing NoCs, guarantees for senders with hard real-time requirements (HRT) are achieved at the cost of decreased BE performance. To assure predictability, designers apply static resource allocation schemes - oblivious routing - where communication paths are defined solely by source and destination and do not change during runtime, see Chapter 2. This allocation policy results in known, well-defined sets of interfering senders and permits the implementation of strict spatial separation (BE senders do not share paths with HRT senders) or temporal isolation (BE senders are blocked whenever HRT senders are sending) for providing guarantees. However, a static path allocation scheme significantly decreases hardware utilization and performance. Indeed, if oblivious routing is applied, non-uniform traffic patterns can induce large load imbalances giving suboptimal throughput [37], thus resulting in overly pessimistic worst-case guarantees for HRT senders and unfulfilled design requirements for BE senders [147].
To fully exploit the multidimensionality of a NoC's topology, this section introduces a protocol for the operation of the control layer which applies different path allocation methods depending on the sender's criticality. The description is based on [77], [79]. To handle HRT traffic, the control layer applies oblivious routing (static path allocation), whereas BE traffic uses multiple paths from the source to the destination - an adaptive load distribution. Whenever the NoC resources are free, BE traffic is allowed to use the shortest (optimal) path to its destination, even if it overlaps with links used by critical senders. Upon activation of a HRT sender, the control layer releases the NoC resources by redirecting the BE transmissions to an alternative route, i.e., BE senders use detoured (non-optimal) paths only when critical senders are actively using the resources. This arbitration reduces the impact of HRT transmissions on the average BE performance and permits a gradual degradation of service, i.e., instead of being fully blocked, BE senders experience slightly higher latencies on the detoured path for some packets and, consequently, lower average latencies overall.
Workflow Ensuring worst-case guarantees requires identifying interfering senders and providing temporal synchronization of concurrent transmissions. The former can be achieved by statically assigning a set of paths to each BE application with one of the available allocation methods, i.e., the set of possible paths for a BE sender does not change during runtime. In principle, there are no limitations on the alternative route selection, besides the rule that if a link in the path is shared with a HRT sender, this sender must be capable of accepting the protocol overhead. However, if all detoured paths are shared with HRTs and all HRTs are active, then the BE sender is blocked, as in the case of strict temporal isolation.


[Figure 3.21 diagram: a 3x4 mesh (nodes N0-N11) with two DRAM ports, showing the critical traffic and three alternative BE paths (path 0, path 1, path 2) with the shared links.]

Figure 3.21: An exemplary setup detailing the possibilities of spatial isolation in a NoC, based on [77], [79].

The latter condition is achieved by decoupling the admission (the control layer) and the flow control (the data layer) in the NoC. Upon the beginning and the termination of a transmission from a HRT sender, the RM sends control messages to all BE senders interfering with it: actMsg (HRT activation) and relMsg (HRT release). These messages propagate the global NoC state, defined by the currently active HRT applications, and determine the paths for the BE senders, i.e., after receiving these messages, the appropriate paths used by HRTs are blocked or released for BE transmissions. After each mode change, the responsible client must re-program the routing (or address translation) tables in the NI. The changes should force the BE sender to use the shortest path available without any active HRT sender. Finally, to preserve the predictability (i.e., isolation) of HRT senders, the RM must account for the delay resulting from the time necessary to deliver the control messages to the BE senders and for the packets of BE transmissions to leave the interconnect on the selected path. This overhead is acceptable due to the slack of HRT senders, which is the time budget between the worst-case transmission latency and the deadline. A trade-off analysis provides an estimation of the overhead resulting from the global arbitration, see Chapter 4. Note that the adaptive path allocation may increase the size of the client and/or the NI, as it must store the routing information for the alternative communication paths. Alternatively, the RM may send the routing information to the clients in the ackMsg, at the cost of an increased protocol overhead.
Comparison against static routing. The main advantage offered by the solution is the work-conserving resource allocation scheme allowing improved average performance of the best-effort senders. Consider the exemplary setup with the mixed-

critical workload presented in Figure 3.21. As path sharing between BE and HRT senders may endanger the latter, the traditional safety approach would require the traffic from application AppC to take the longest path (Path 2). Consequently, AppC would not share any resources (links, buffers) with the HRT senders, and spatial isolation would be achieved. However, this decreases the average performance of AppC: it must continuously follow the longer route even when the HRT senders are not active.
When the control layer is applied, clients intercept transmission requests from HRT senders (e.g., Ethernet ETH0 and AppA) and enforce the paths for BE senders (e.g., AppC) accordingly, based on the received control messages, i.e., the currently active HRT senders. Consequently, if the NoC is free, the BE sender (AppC) is allowed to use the shortest path (Path 0) towards the DRAM. Upon activation of the critical sender ETH0 (arrival of an Ethernet frame), the RM sends the control messages (actMsg) to all BE applications sharing resources (links and buffers) with ETH0. After receiving this message, the BE sender must redirect its traffic to the longer, detoured path (Path 1). If the second critical sender (AppA sending to AppB) is activated, the procedure repeats and AppC must take the longest path (Path 2). The HRT sender occupies the paths until the end of the transmission/task execution or until the abortion by the supervisor after a predefined timeout. When the supervisor detects the end of a HRT transmission (e.g., based on its time budget or the injection of the last flit), it issues the relMsg to the involved BE senders. As soon as Path 0 is free, AppC is allowed to use it once again. As illustrated in Figure 3.21, the detoured transmissions progress faster using the free hardware resources (links and buffers). This enables further exploitation of the NoC's dimensionality.
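The client-side path selection can be illustrated with the following sketch, which mirrors the AppC example from Figure 3.21: actMsg/relMsg maintain a bitmask of active HRT senders, and the client programs the shortest conflict-free route into the NI. The conflict masks and the ni_program_route() helper are assumptions made for the example.

#include <stdint.h>

typedef struct {
    uint32_t route_id;       /* entry programmed into the NI routing table */
    uint32_t hrt_conflicts;  /* bitmask of HRT senders sharing this path */
} be_path_t;

/* Statically assigned paths, ordered from shortest to longest (Path 0..2). */
static const be_path_t paths[] = {
    { 0, 0x3 },  /* shortest path, shared with ETH0 and AppA (assumed)  */
    { 1, 0x2 },  /* detour, shared with AppA only (assumed)             */
    { 2, 0x0 },  /* longest path, fully isolated from HRT traffic       */
};

static uint32_t active_hrt = 0;  /* updated on actMsg (set) / relMsg (clear) */

extern void ni_program_route(uint32_t route_id);

static void client_select_path(void)
{
    for (unsigned i = 0; i < sizeof paths / sizeof paths[0]; i++) {
        if ((paths[i].hrt_conflicts & active_hrt) == 0) {
            ni_program_route(paths[i].route_id); /* shortest conflict-free path */
            return;
        }
    }
    /* all paths busy with active HRTs: block, as under temporal isolation */
}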

3.6.5 Adaptive Rate Control

Traffic shaping using rate control is a well-known method for achieving quality-of-service in real-time systems. The goal of this approach is to bound/enforce the NoC access patterns and therefore the interference which must be resolved by the arbiters in NoC routers. The description is based on [78].
As in the previously described protocols, tasks/applications must inform the RM whenever they are activated or terminated on the processing nodes. Using this information, the RM may decrease or increase the injection rates for a particular node, as depicted in Figure 3.22, dynamically depending on the current system mode. The number of currently active applications defines each mode and determines the minimum time separating any two transmissions issued by the same application.
Workflow The communication is realized with four control messages, see Figure 3.23, including the following operations: activation (actMsg), termination (termMsg), stop (stopMsg) and configuration (confMsg).

[Figure 3.22 diagram: the number of active applications (6, 4, 2) over time defines modes 1-3; the minimum time between the transmissions of an application varies accordingly at each mode change.]

Figure 3.22: A schematic description of the traffic injection rate for a given application, based on [78]. The rate varies according to the system mode.

[Figure 3.23 diagram: Sender, Client and RM timelines with actMsg, confMsg (rate_1), stopMsg, confMsg (rate_2) and termMsg around two mode changes; the transmissions are blocked during the re-configuration.]

Figure 3.23: Workflow of a protocol for the dynamic rate control with the RM, based on [78].

First, the corresponding client sends an activation message to the RM when a task is activated on a processing node. The RM processes the activation and termination messages in their arrival order. Each of them initiates a transition of the system to a different mode. These transitions are performed sequentially, i.e., a new one may start only after the previous re-configuration has fully completed, to ensure predictability. Additionally, the RM must assure that the transition between modes is safe. Therefore, before changing the rates, the RM sends a stop message (stopMsg) to the clients supervising active applications, blocking all accesses to the NoC from the corresponding nodes. The clients then wait for the confMsg communicating the current system mode. The RM sends the confMsg with a delay to ensure that the packets from previously ongoing transmissions have left the NoC, i.e., to provide logical isolation. After receiving the confMsg, the clients adjust the rate and unblock the transmissions. Additionally, the confMsg initiates the newly activated applications (those which just issued an actMsg), which can then use the NoC.

Comparison with rate control done locally in nodes. The major drawback of the traditional approach to rate control is that the rates are adjusted statically according to the worst-case scenario, i.e., assuming that all senders run simultaneously with the maximum possible transmission arrival rates. Therefore, this approach is not work-conserving and sacrifices interconnect utilization whenever senders expose dynamics in their execution times, release jitter or communication volume, and the system is not highly loaded. Note that the performance penalty increases along with the complexity and the inherent dynamics, as in the case of the TDM-based arbitration.
The RM may decrease or increase the injection rates for a particular node dynamically, depending on the number of concurrently active applications. The mechanism is capable of enforcing symmetric and non-symmetric guarantees. The proposed approach decreases both the temporal and the hardware overhead. Furthermore, the control layer significantly improves the overall performance of the system and allows meeting strict timing constraints.
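As an illustration of the mode-dependent rates, the sketch below derives a per-node injection budget from the number of active applications; the even split of a fixed budget is one possible policy chosen for the example, not the allocation rule mandated by the mechanism.

#include <stdint.h>

#define TOTAL_BUDGET 1024u  /* flits per window available on the shared path (assumed) */

/* System mode = number of concurrently active applications. */
static uint32_t rate_for_mode(uint32_t active_apps)
{
    if (active_apps == 0)
        return 0;
    return TOTAL_BUDGET / active_apps;  /* fewer active apps -> higher rates */
}

The RM would evaluate this function on every actMsg/termMsg and distribute the result in the subsequent confMsg messages.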

3.7 Interface Between Tiles and NoC

The client forms an interface between the on-core execution of applications and their accesses to the NoC. Therefore, the client's functions can be divided into two groups. Firstly, it must control the interference which applications running on a processing node may exert on other traffic in the NoC. Secondly, the client must provide the processing node with information about the status of the NoC (i.e., whether this resource is busy or free) to accommodate new, incoming traffic. The following subsections describe these main functions in detail.

3.7.1 Resource Control in NI

As each processing node hosts multiple senders requiring different connection settings, clients must distinguish between them. Consequently, connections must be described accurately by the specific parameters that can be negotiated, e.g., the senderId, the route (path) through the network, the settings of the rate limiters and the access rights (e.g., read/write transactions). Typically, the sender should produce a flow specification stating the parameters it would like to use. However, in the considered context of real-time and safety-critical systems, the characteristics of applications with real-time and safety requirements are usually known beforehand and thoroughly tested, cf. Section 3.1.3. These settings are determined during the design phase and encoded/programmed at chip startup (boot-up) in the real-time

and safety mechanisms. Examples of such mechanisms are the address translation tables and rate limiters which extend the features of the NI from the baseline architecture. Therefore, the implementation of the clients can be done as an extension of these mechanisms. Figure 3.24 presents an exemplary structure of a client.
The described implementation of the control layer assumes a system with a memory-mapped NoC. The NI introduces transparent address translation, where a local (to the processing node) physical address is converted into a remote address (on another processing core) and forwarded over the NoC. Consequently, senders do not require any knowledge about the operation of the interconnect. The configurable address translation module in the NI performs the actual translation (including the proper NoC addressing). This module is equipped with tables containing information about the routes and destinations (e.g., remote addresses) of the initiated connections. The implementation of the client can be done through a straightforward extension of this mechanism: new columns are added to the address translation table to assess whether synchronization with the RM is necessary and which settings should be applied to ongoing transmissions. These columns must contain the sender ID identifying the synchronization scenario (assuming there are multiple scenarios) and the QoS settings (e.g., a number of packets per base period). Note that the last field can be omitted at the cost of a higher protocol overhead, i.e., the control messages may carry the QoS settings, but their length (and thus interference) will increase accordingly.
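A possible shape of such an extended table entry is sketched below; the field names and widths are illustrative assumptions rather than the exact layout of the baseline NI.

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    /* baseline NI translation fields */
    uint64_t local_base;   /* start of the local physical address range */
    uint64_t remote_base;  /* translated remote (NoC) address */
    uint32_t route;        /* path through the network */
    /* extensions for the control layer */
    uint16_t sender_id;    /* identifies the synchronization scenario */
    bool     rm_synced;    /* true: trap the access and issue a reqMsg first */
    uint16_t qos_pkts;     /* QoS setting, e.g., packets per base period */
} ni_xlate_entry_t;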

3.7.2 NoC Support for Suspensions

In the majority of safety-critical systems, suspension-based locking protocols, e.g., MPCP, OMLP, FMLP, are used to coordinate accesses to shared resources efficiently and safely. However, existing architectures do not support such arbitration for NoCs, although NoCs must resolve conflicts between concurrent transmissions. Enabling suspension-based locking for applications using interconnect resources requires not only predictable latencies of resource operations, e.g., transmission times, but also feedback about the global state of the interconnect. This mechanism is relatively simple in classic bus-based architectures, where senders can directly receive feedback about the occupancy of the interconnect from the bus controller using control lines.
However, existing NoCs do not provide such information. In most NoCs the arbitration between transmissions is done independently and locally in each router. Consequently, the state of the network is unknown to the on-core scheduler during runtime, i.e., it assumes the NoC is always ready.

Figure 3.24: Synchronization data for the client stored in the address translation tables of an address-mapped NI.

Whenever a task accesses the NoC, its processor is stalled (e.g., busy waiting) for the entire duration of the transmission, including the blocking from other senders. This stalling leads to decreased processor utilization and overly pessimistic worst-case guarantees, as the network blocking propagates to the cores, affecting the timing of other (lower priority) tasks. Existing real-time NoCs, e.g., [56, 130, 81], can provide an upper bound on the transmission latencies using Quality-of-Service (QoS) mechanisms, such as TDM or dynamic prioritization. However, the decreased processor utilization resulting from the lack of feedback about the state of the interconnect has not been considered until now in the design of on-chip communication.
In commonly used wormhole-switched NoCs with multi-stage arbitration, ongoing transmissions compete for output ports (link bandwidth) and virtual channels (buffer space). As the arbitration between interfering streams is conducted independently and locally in routers, each network node has only information about the local status, e.g., information from the flow-control mechanism about the state of the buffers in the neighboring router (back-pressure). Consequently, as a transmission progresses through the interconnect, it must acquire several resources with independent arbiters, and information about the global state of the NoC is unavailable. Because of that, some interference cannot be resolved locally by the router's arbiter and requires input from the adjacent neighbors, e.g.,

[Figure 3.25 diagram: panels a) without suspensions and b) with suspensions show the activations, executions, stalling and transmissions of App1 (CoreB) and App2/App3 (CoreA) around the shared NoC.]

Figure 3.25: Example: effects of conflicting accesses to the NoC resources from tasks running on different cores in a system with and without support for suspensions.

a joint allocation of the crossbar switch and the router output due to a possible lack of buffers at the output (input-buffered router). Consequently, for the on-core sender, the state of the transmission is unknown during the execution, e.g., it stalls the core during the blocking.
Figure 3.25.a) depicts an exemplary scenario where three tasks (App1-3) on two cores (CoreA and CoreB) share a path in a NoC with priority-based arbitration between VCs, e.g., [31]. The application's number denotes its priority, i.e., App1 has the highest priority and App3 the lowest. App1 conducts its transmission first. Next, App2 is activated and blocks (i.e., preempts) App3. App2 initiates its transmission, but it is blocked in the routers, as its transmission's inherited priority is lower than the priority of the transmission from App1. As App2 does not know how long the blocking will take, it stalls the core. Thus, App3 is also blocked and can resume its execution only when both App1 and App2 have finished their transmissions.
An alternative (and desired) solution is suspension-based locking, which suspends an active application waiting for a resource, i.e., App2, to let other (ready) applications, such as App3, execute even if they have a lower priority, see Fig. 3.25.b). The blocking time of App2 remains unchanged but is moved from the NoC to the core, where the waiting/stall time may be used to increase the performance and improve the guarantees of other tasks. This suspension requires feedback on the state of the NoC resources (e.g., the number of currently active senders and the duration of their transmissions), which is not supported by most existing NoC architectures, cf. Chapter 2.

Figure 3.26: Workflow of the RM-based admission control in a NoC without and with the support for suspension-based locking of interconnect resources, based on [85].

3.7.3 Control Layer Support for Suspensions during NoC Accesses

This section presents the extension of the control layer protocol which orchestrates both computation and communication in real-time multi-core systems. The introduced protocol supports suspension-based locking for transmissions from tasks sharing the NoC, i.e., accesses to NoC resources. The description of this extension is based on [85].

Consider the same example as previously. Figure 3.26-(a) illustrates the evolution over time of the task execution when the control layer is applied to arbitrate between interfering transmissions on the shared NoC. Whenever a scheduled task running on the processor tries to start a transmission, its request is trapped by the NI/scheduler, and the task is stalled. The corresponding NI/scheduler sends a request message (reqMsg) to the RM to obtain access to the network. The RM is equipped with a queue for storing pending requests from clients/NIs. Consequently, the RM knows the global state of the interconnect, i.e., which transmissions/tasks are active and which resources (links and buffers) are occupied. The scheduler decides which request from the queue should be served first. After that, the RM notifies the selected application with an acknowledge message (ackMsg). After receiving the ackMsg, the communication may start. Once access to the NoC is granted, the execution of the stalled task on the processor is resumed, and the connection holds until the end of the transmission. The end of the transmission is signaled with a release message (relMsg) issued to the RM. As soon as the relMsg arrives, the RM considers the NoC to be free again.

The RM must have the ability to propagate the information about the state of a particular transmission (e.g., whether it is blocked and for how long) to allow suspensions. The mechanism's extension is relatively simple as it requires only introducing a new block message (blkMsg) to the synchronization protocol. Whenever a request for a particular transmission arrives and the NoC is busy, the RM issues the blkMsg, as depicted in Figure 3.26-(b), to the requesting node. After receiving this information, the on-core scheduler suspends the requesting task. Similarly, whenever the transmission is re-scheduled by the RM, the arrival of the ackMsg must be forwarded to the on-core scheduler.
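The RM-side behavior of this protocol can be summarized in a few lines. The following is a minimal sketch, not the actual implementation: the message names come from the text, while the FIFO service order, the callback interface, and all data structures are illustrative assumptions (the text leaves the scheduling policy open).

```python
from collections import deque

# Message types of the synchronization protocol (names from the text).
REQ, ACK, REL, BLK = "reqMsg", "ackMsg", "relMsg", "blkMsg"

class ResourceManager:
    """Sketch of an RM serving one synchronization scenario (FIFO order assumed)."""

    def __init__(self, send):
        self.send = send          # callback: send(client_id, message_type)
        self.pending = deque()    # queue of clients waiting for the NoC
        self.active = None        # client currently owning the NoC path

    def on_message(self, client_id, msg):
        if msg == REQ:
            if self.active is None:            # NoC path is free
                self.active = client_id
                self.send(client_id, ACK)      # client resumes and transmits
            else:                              # NoC path is busy
                self.pending.append(client_id)
                self.send(client_id, BLK)      # on-core scheduler suspends task
        elif msg == REL:                       # active sender finished
            self.active = None
            if self.pending:                   # re-schedule next pending request
                self.active = self.pending.popleft()
                self.send(self.active, ACK)    # wakes the suspended task
```

A client-side counterpart would simply forward an incoming blkMsg or ackMsg to the on-core scheduler, which suspends or resumes the requesting task accordingly.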

3.8 Interface Between NoC and Memory

The worst-case timing of new, complex applications in real-time and safety-critical domains, e.g., autonomous driving, is strongly influenced by the latency of accesses to a complex memory hierarchy. Consequently, during the system's runtime, applications typically access one or more memory controllers, scratchpad memories, and even one or more levels of cache memories. Some of the elements of the hierarchy are shared only between tasks running on a particular processing node. This work refers to them as local/on-chip memories, and for their arbitration one may apply rules known from existing multicore setups, e.g., [27]. However, in MPSoCs, selected components of the hierarchy can be jointly used by a plurality of tasks running on different processing nodes. The need for sharing memories applies especially to high-speed external memories, such as DDR SDRAMs. SDRAMs are often required since on-chip SRAMs cannot provide adequate capacity in a cost-effective manner [10].

3.8.1 Classic Approach for Safe Handling of Accesses to SDRAMs

As already discussed, the primary challenge comes from the need to solve the joint arbitration of multiple resources (on-core schedulers, NoC routers, and memories). However, even SDRAMs themselves constitute a significant problem in safety-critical design due to their stateful nature. Indeed, the latency of a single memory operation depends not only on the number of interfering requests (e.g., from other senders on different nodes) but also on the commands required to serve them [47]. This dependency makes temporal analysis challenging and results in variable access times, making the available bandwidth to/from the external memory dependent on the traffic history [10].

For instance, given that our NoC admission control enforces the locality of transfers, it is possible to map consecutive physical SDRAM addresses to the same bank/row in an SDRAM device. Therefore, one can employ a contiguous address mapping, as depicted in Figure 3.27. Such a mapping decreases the memory response time significantly, as consecutive CAS commands that target the same bank row can be executed faster than those that do not: there is no need for pre-charging and activating a row.

Figure 3.27: Mapping of four consecutive addresses to SDRAM banks in two different mapping configurations (in a system with 4 banks), based on [82].

Consequently, the majority of available, performance-optimized memory controllers (using, for instance, First-Come First-Served arbitration) are either not sufficiently flexible to manage the NoC's traffic or are hardly analyzable with respect to their temporal properties, e.g., due to sophisticated features such as support for preemption and reordering [123]. For instance, in NoCs with large TDM slots (that exceed the granularity of the DRAM controller), DRAM locality is also enforced and, consequently, a standard SDRAM controller can be employed. However, as described in Chapter 2, large TDM slots have the disadvantage of increasing the latency of all requestors in a system, including those which generate non-SDRAM traffic. These shortcomings can be tackled by employing smaller TDM slots or a priority-based arbitration in routers. In such a setup, the safe use of a performance-optimized SDRAM controller would require matching the granularity of the SDRAM with the granularity of the NoC. Due to the high granularity of accesses allowing improved scheduling (e.g., contiguous and aligned 8 kB long transfers fully benefit from the caching in DDR3), this is either not feasible or drastically decreases the utilization of the interconnect.

The alternative is to map consecutive addresses to different banks, the so-called interleaved address mapping. Bank interleaving is attractive when the locality of incoming accesses is not enforced, as it allows hiding the latency of the row-buffer management.

Figure 3.28: Multiple RMs to mitigate the interference on the NoC and memory, based on [82].

Therefore, researchers and designers proposed dedicated memory controllers which exploit this effect. An example of non-trivial predictable SDRAM controllers are Dedicated Close-Page Controllers (DCPC), which employ static bank interleaving and a closed-page policy (automatically closing the row buffer after using it), e.g., [10]. However, although a DCPC makes the response time independent of the access history, it also significantly increases power consumption [34]. Consequently, designers in safety-critical domains must confront a hard trade-off between cheap but hardly analyzable, performance-oriented memory controllers and custom-made, expensive, and power-hungry dedicated controllers. The next section describes how the introduced control layer mitigates these shortcomings and protects the locality of memory transfers (low memory latency) without the need for custom memory controllers (lower costs and power consumption).
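To make the two mapping policies of Figure 3.27 concrete, the following sketch decodes a physical address into a (bank, row) pair under each scheme. The SDRAM geometry (4 banks, 8 KiB rows, 64 B columns) and all names are invented for the illustration, not taken from a specific device:

```python
# Illustrative address decoding for the two mappings of Figure 3.27.
NUM_BANKS = 4
ROW_SIZE = 8192     # bytes covered by one row buffer (assumed)
COL_SIZE = 64       # bytes per column access, i.e., one CAS command (assumed)

def contiguous_mapping(addr):
    """Consecutive addresses stay in the same bank/row until the row is full."""
    row_global = addr // ROW_SIZE
    return row_global % NUM_BANKS, row_global // NUM_BANKS

def interleaved_mapping(addr):
    """Consecutive column-sized blocks are spread round-robin across banks."""
    block = addr // COL_SIZE
    return block % NUM_BANKS, block // (NUM_BANKS * (ROW_SIZE // COL_SIZE))

# Four consecutive CAS-sized accesses: one open row (one bank) under the
# contiguous mapping, but all four banks when interleaved.
for addr in range(0, 4 * COL_SIZE, COL_SIZE):
    print(addr, contiguous_mapping(addr), interleaved_mapping(addr))
```

Under the contiguous mapping, a local, aligned 8 KiB transfer thus hits a single open row, which is exactly the locality the RM-based admission control preserves.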

3.8.2 RM-based Admission Control for SDRAMs

The proposed admission control mechanism not only bounds the interference on the NoC side but also on the shared SDRAM side. Indeed, since memory traffic constitutes an essential part of the traffic in an MPSoC system, controlling the interference on the NoC and the memory becomes a primary issue in real-time multicore systems. The proposed admission control mechanism considers a holistic approach, covering the NoC and the shared SDRAM, as follows: i) it preserves the locality of memory accesses, since access to the NoC is allocated for the entire transmission, i.e., the requestor fully benefits from the open-row policy, which optimizes the performance of the system in addition to ensuring predictability; ii) it simplifies the management of interference between the NoC and the memory controller.

The mechanism is explained using the example in Figure 3.28. Three applications are accessing the shared DRAM memory through the interconnect. The data streams triggered by applications T1 and T2 interfere on the NoC since they are sharing the same path in the network, and therefore form a synchronization scenario S1 managed by RM1. The data stream triggered by application T3 uses a different path in the network but shares the input to the DRAM controller. Therefore, this traffic stream should also be synchronized with the rest of the traffic in the network, using RM2 - synchronization scenario S2. Note that, depending on the tolerable synchronization overhead, a single RM can be used to synchronize all the traffic in the NoC accessing the DRAM memory. In this case, whenever the RM grants access, a sender acquires exclusive access to both resources: the NoC as well as the SDRAM memory. The system then does not require a predictable memory controller, since the RM guarantees the temporal isolation of data streams at the NoC and SDRAM level. Note that the SDRAM controller must still be analyzable, otherwise it is not possible to provide the worst-case guarantees. However, the control layer simplifies the analysis by limiting the number of possible interference scenarios, and therefore provides more efficient guarantees. This is especially important in the case of performance-optimized controllers. Consequently, the proposed solution moves the arbitration from the resource controllers to the RM, allowing a joint arbitration without further need for custom hardware extensions.

3.9 Summary

This chapter introduced a novel Quality-of-Service mechanism for NoCs - the control layer. The proposed method uses as its basis the fundamental principles of Software Defined Networks but extends them for real-time NoCs. The control layer decouples the QoS mechanisms from the flow control conducted locally and independently in routers. The QoS is achieved through the online adaptation of the admission control in nodes, based on a contract-based negotiation between senders, i.e., providing a validation method to check if the currently available NoC resources are sufficient for the change in QoS before the application receives physical access to the NoC. Consequently, routers are relieved of the QoS functions, hence there is no need for custom QoS-oriented NoC extensions. Furthermore, our solution enables the deployment of a sophisticated contract-based QoS provisioning (e.g., round-robin, priority-based arbitration) without introducing the complicated and hard-to-maintain schemes known from static arbiters. The control layer dynamically adapts the NoC to changes in the system's behavior, mode, or environment

which can be influenced by on-chip as well as off-chip factors. Moreover, through the implementation of the arbitration based on the global state of the network (the number of running senders and the resources used by them), the control layer introduces dynamic interfaces to on-core schedulers as well as peripherals. In the former case, the RM provides information about the state of the transmission (e.g., the duration of blocking), which can be used to improve the arbitration of local resources, e.g., providing suspension-based scheduling of the CPU time. In the latter case, the protocol allows orchestrating the order and duration of accesses to selected shared resources. Such arbitration is particularly crucial in the case of state-based schedulers where the history of accesses decides about the latency, e.g., DDR SDRAM. Consequently, the introduced approach allows achieving safety along with online adaptivity for high-performance real-time systems with dynamic communication requirements. The summary of improvements which can be achieved for the baseline architecture (cf. Sec. 3.1) is available in Table 3.1 and Table 3.2.

Table 3.1: Evaluation of the baseline NoC architecture features with and without the control layer (part 1). For each feature, the entry for the baseline NoC (PS + VC + RL, static priorities) is given first, followed by the entry for the RM + baseline NoC and the relevant requirement.

- Quality of Support for GL (R1): baseline: medium (depends on the sender's priority, the number of VCs, and the number of priorities); with RM: high (independent of available NoC resources; through different allocation schemes it may be directly adjusted to the requirements of senders).
- Quality of Support for GT (R1): baseline: medium; with RM: high.
- Performance of BE senders in mixed-critical scenarios (R6): baseline: low (high average latencies); with RM: high (possibility to apply slack-based arbitration, low average latencies).
- Sender penalty for variable transmission size (R2): baseline: low; with RM: low.
- Sender penalty for variable jitter (R2): baseline: low; with RM: low.
- Arbitration fairness for GL senders (R5): baseline: low; with RM: high.
- Arbitration fairness for GT senders (R5): baseline: medium; with RM: high.
- Performance penalty for other senders in setups with jitter and variable transmission size (R4): baseline: no; with RM: no.

Table 3.2: Evaluation of the baseline NoC architecture features with and without the control layer (part 2).

- Possibility to switch off QoS (R7): baseline: yes (but BE workloads may suffer from a high performance penalty); with RM: yes.
- Hardware overhead in setups with disabled QoS (R7): baseline: medium (but workloads may suffer from high jitter and out-of-order arrival of packets); with RM: low (most resources of the control layer can be re-assigned for other purposes; opportunity to introduce round-robin based scheduling on top of routers with SPP).
- Support for scalability (R8): baseline: poor (statically limited to the available HW resources); with RM: good (largely independent of the underlying HW resources, e.g., priority-based sharing of the same VC).
- Scalability: temporal penalty (R8): baseline: low (if there are enough HW resources); with RM: medium (depends on the protocol overhead, i.e., the number and frequency of synchronized reconfigurations).
- Scalability: resource penalty (R8): baseline: high (a VC per priority level); with RM: low (proportional only to the load and independent of resources).
- Support for locality (R9): baseline: no (yes only for the highest priority); with RM: yes (the RM decides about the order of the transmissions, therefore non-preemptive priority-based schedules are also possible).
- Pessimism of formal verification in static setups (R10): baseline: medium (depends on the HW resources, i.e., the number of VCs in the NoC and the size of buffers; low if there are enough resources, otherwise high); with RM: low (the application of global scheduling allows removing cyclic dependencies and efficiently captures the dynamics).
- Complexity of formal verification in static setups (R10): baseline: low or high (depends on the amount of available NoC resources: low if there are enough resources, high otherwise); with RM: medium (depends directly on the complexity of the introduced synchronization protocol).

Chapter 4: Realization of the Control Layer in NoC

The control layer employs a formal, model-based verification for proving the adherence of the interconnect to real-time and/or safety constraints whenever a dynamic adaptation of the resource allocation is necessary. From a conceptual point of view, the mechanism is based on a mathematical model providing a quality-of-service (QoS) abstraction of the underlying NoC (i.e., data plane). The workflow for a successful deployment of the proposed scheme is shown in Figure 4.1 and can be divided into the following four steps:

Step 1: Evaluation of the underlying MPSoC concerning timing properties.

Step 2: Development of the RM protocol/architecture for the selected MPSoC.

Step 3: Evaluation of the introduced protocol concerning temporal overhead.

Step 4: Assessment of the necessary HW extensions regarding implementation overhead.

In the first step (Step 1), it is necessary to verify to what extent the underlying MPSoC architecture is already suitable for hosting real-time and/or safety-critical workloads. In this context, the predictability of a NoC denotes the accuracy of the formal evaluation (i.e., calculation or "prediction") of the observed run-time behavior. The accuracy is evaluated for selected NoC properties (e.g., the latency of transmissions) without (full) knowledge of the run-time workloads (e.g., deployed traffic, sender behavior), cf. [45]. This includes an assessment of the following NoC features:

(flow-control) arbiters in routers,

mechanisms for admission control in NIs.

Note that the predictability of a NoC does not guarantee a successful deployment of the control layer, but it provides important information about how difficult it is to compute the timing bounds for the NoC and how tight these guarantees are. Tightness refers to over-estimations which may happen during the analysis. For instance, the designer (in real-time domains) frequently considers only one potentially unrealistic scenario which can be worse than any observed combination of traffic at run-time, cf. [45]. The main trade-off lies between the quality of the analysis results and the complexity of the computation. Usually, by increasing the complexity of the analysis one may improve its results, e.g., provide models which mimic the behavior of the NoC and traffic with higher accuracy. These inputs allow verifying if support for the control layer is possible in the context of a particular NoC and what benefits it could bring.

Figure 4.1: Design steps necessary for the implementation of the control layer.

To achieve the goals mentioned above, the initial evaluation employs profiles of senders (i.e., benchmarks) and models of the underlying NoCs' hardware (i.e., routers and network interfaces). The profiles of senders (e.g., behavioral models) can usually be extracted from the specifications of real-time and safety-critical applications. Recall that the behavior and characteristics of such senders are typically well described and tested due to the nature of controlled processes and certification purposes (Chapter 1.4). Examples of the characteristics are the maximal and minimal frequencies with which traffic is initiated, the number of initiated transmissions, and the upper limits (i.e., deadlines) for the allowed transmission latencies. The non-safety-critical, best-effort (BE) senders constitute a special group of applications which can be integrated within a NoC-based system. They typically do not provide safe application profiles, e.g., general-purpose applications running on processors with caches. For the simulation of these applications, synthetic benchmarks (e.g., bursty access patterns using Markov chains) or traces of memory accesses recorded in real systems can be used; a minimal sketch of such a bursty traffic source follows the list below.

The mapping of applications to the network nodes plays an important role in the process of NoC design. In many existing NoC-based MPSoCs, nodes are heterogeneous and constructed out of processing cores as well as peripherals, e.g., I/O interfaces, Ethernet or DRAM controllers. These different hardware units have a fixed position in the system. Therefore, when applied to real-time setups, selected paths for critical communication are also fixed and enforced by the routing algorithm, e.g., transmissions from an Ethernet controller to the DRAM memory. Consequently, mapping or routing modifications cannot solve many of the performance bottlenecks which may occur in a NoC. The results from Step 1 provide the designer with the following outcomes:

real-time models of the underlying NoC architecture (QoS parameters in routers and NIs which can be controlled with RM and clients) and sender profiles (e.g., worst-case latencies, deadlines, slacks, etc.)

synchronization scenarios, i.e., sets of interfering senders (cf. Chapter 3)

Quality-of-Service settings for each synchronization scenario assuring a predictable system behavior, e.g., rates enforced by NIs
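For the BE senders mentioned above, a bursty access pattern can be generated with a simple two-state (idle/burst) Markov chain. The following is a minimal, illustrative sketch; the probabilities and the notion of one packet per busy cycle are invented for the example, not taken from a specific benchmark:

```python
import random

def bursty_trace(length, p_start_burst=0.05, p_end_burst=0.3, seed=0):
    """Two-state Markov chain: each cycle either starts/continues a burst
    (injecting one packet) or stays idle; returns the injection cycles."""
    rng = random.Random(seed)
    trace, bursting = [], False
    for cycle in range(length):
        if bursting:
            trace.append(cycle)                    # inject one packet
            bursting = rng.random() >= p_end_burst # stay in the burst?
        else:
            bursting = rng.random() < p_start_burst
    return trace

print(len(bursty_trace(10_000)))  # rough load of the synthetic BE sender
```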

In Step 2, the design of an appropriate synchronization protocol and of the control layer elements, i.e., clients and RM, is carried out. In this step, it is necessary to identify the parts of the NoC architecture which can/must be controlled by the RM, e.g., rate limiters, admission control, or the settings of schedulers in routers. Note that the RM can usually control a subset of these parameters without additional hardware overhead, i.e., there is no need for new interfaces or APIs. For instance, in many architectures, parameters for rate limiters are already propagated using control messages during the system startup. However, it may happen that some QoS parameters, although theoretically adjustable, would require slight modifications of the existing infrastructure, e.g., extending a router with a configurable flow table similarly to Software-Defined Networking (SDN). With this knowledge (about the elements that can be adjusted) it is possible to design the RM protocol, following the arbitration guidelines from Chapter 3. The protocol description should include the workflow, the definitions of control messages, and the arbitration in the RM. The average performance can be evaluated in a simulator (cf. Section 4.1.2) and the worst-case verification with a formal analysis framework (cf. Section 4.1.1).

In Step 3, the derived protocol architecture is evaluated concerning the introduced overhead. The QoS control plane, like many other mechanisms for QoS, introduces additional latency resulting from the synchronization. This overhead must be compared to the gain in guaranteed performance achieved using RM-based arbitration. The results from related research (cf. Chapters 2 & 3) have shown that an RM-based control offers not only a higher NoC performance/utilization but, even more importantly, shorter formally guaranteed latencies for realistic levels of utilization. Therefore, the reduced transmission latency can typically compensate for the overhead. Consequently, the utilization of the control layer enables solutions for otherwise unfeasible configurations. Additionally, Step 3 introduces an iterative process in which, through gradual improvements, a compromise between protocol efficiency and complexity (i.e., introduced overhead) can be found. The outcome of this design stage provides detailed requirements for the implementation of the platform extensions required by the control layer, e.g.: the number of synchronization scenarios, the frequency of initiated messages for estimating the size of the queues in the RM and clients, the maximum latency for message processing done by the RM and clients, and the maximal mode-change latency (if applicable). Here, once again, the introduced design tools can be used, i.e., worst-case analysis and simulation (cf. Section 4.1).

Finally, in Step 4, the actual implementation of the clients and the RM is done, using the outcomes of the previously described design stages. The resource overhead resulting from the integration of the QoS control plane must be considered in the context of a concrete MPSoC. For instance, clients and RM modules can be realized in software (e.g., stand-alone libraries, extensions of the existing operating system (OS)/Run-Time Environment (RTE)) as well as in hardware (e.g., standalone modules, extensions of NIs). The former offers full flexibility, the latter maximum performance. Note that hybrid solutions (hardware/software co-design) re-using existing chip components are also possible. For example, as the complexity of a central RM unit is higher than the complexity of a client, the implementation can follow in the form of a stand-alone processing node, e.g., by re-using supervisor/manager nodes available in many MPSoCs. The generation of HW/SW modules for the control layer (RM and clients) can be automated with Electronic Design Automation (EDA) tools (or their extensions) for the configuration and design of the protocol. The discussion of the implementation following hardware/software co-design principles is presented in Section 4.2.2 and the HW implementation follows in Section 4.2.3. The implementations of the control layer are flexible and may be carried out with different degrees of complexity. This permits a fine-granular control of the necessary system resources. Consequently, Step 4 introduces iterations during which different possible setups must be evaluated to find a compromise between efficiency and overhead. In some cases, regressions to Step 3 are necessary, as the introduced temporal overhead strongly depends on the method of implementation.

4.1 Design Environment

The design of a multiprocessor system-on-chip is a complicated process, as on-chip networks integrate a plethora of programmable ECUs as well as other intellectual property (IP) components and peripherals. For the evaluation of possible setups (e.g., NoC-based MPSoCs running a specific benchmark) there are mainly two tools available: simulation and analysis. The former is the state-of-the-art solution frequently used for the verification of MPSoC performance, e.g., Mentor Graphics Seamless-CVE, Axys Design Automation's MaxSim. Existing tools support cycle-accurate simulation of the complete hardware and software of the system. Although simulation is time-consuming, it simultaneously evaluates (with the same environment and benchmarks) the functioning and performance of the target. However, the effectiveness of this approach diminishes with the increasing complexity of the simulated system. For a high number of stimuli (e.g., different arbiters, complex traffic patterns, etc.) it is often difficult to assess whether the observed/measured results cover all possible corner cases, i.e., the actual worst-case scenario. Therefore, the existing tools for simulation usually offer only quick and rough average system estimates but do not reliably cover system-level corner cases.

In contrast, a formal analysis, such as the example described in [142], offers full corner-case coverage and worst-case performance bounds for the critical performance parameters. However, this technique also has a disadvantage. The provided guarantees are frequently pessimistic, i.e., they may cover scenarios which will never occur at runtime. This can lead to significant differences between the formally analyzed worst-case timing and the observed worst-case results in simulation or measurement. For instance, as reported by Daimler in [118], the majority of OEMs accept that the cyclic load of a bus should not exceed 50%. These differences are even higher in the case of interconnects dominated by event-driven traffic, i.e., the theoretical worst-case load may overshoot the actual utilization by 100% and even reach the level of 500%, cf. the results from Daimler [118] and similar ones from Bosch [156].

In order to avoid resource over-provisioning and still be able to provide worst-case guarantees, both tools are frequently used in conjunction. Consequently, the next sections describe the event-based simulator using the OMNeT++ framework (cf. Section 4.1.2) as well as the formal analysis framework based on Compositional Performance Analysis (CPA) (cf. Section 4.1.1) designed for the evaluation of the control layer proposed in the previous chapters.

4.1.1 Temporal Analysis

From a conceptual point of view, the introduced NoC control employs a formal, model-based verification to prove the adherence of the interconnect to the real-time and/or safety constraints whenever a safe adaptation is necessary. The mechanism is based on a QoS abstraction of the underlying NoC (data plane) allowing a path-oriented arbitration approach. To assure safety, the calculation of the settings is done offline (at design time) and is largely based on system-level, worst-case timing analysis methods. The analysis uses profiles of senders and models of the underlying NoC's hardware. The modeling of the underlying arbiters in routers can be conducted with any of the available frameworks, e.g., [44], [89], [130]. The profiles of real-time and safety-critical senders can typically be extracted from the specification, as the behavior and characteristics of real-time applications are usually well defined and tested. Examples of the characteristics are the maximal and minimal frequencies and the number of initiated transmissions (e.g., the amount of transmitted data) as well as the upper limits (i.e., deadlines) for transmission latencies. These models can later be translated into the settings for the rate controllers on each processing node, i.e., the system must assign each real-time and/or safety-critical sender a bandwidth share sufficient to reach its deadlines and requested throughput.

Using these inputs, the worst-case analysis derives information about the schedulability of the integrated workloads, e.g., whether all real-time transmissions meet their deadlines. This is done through the maximization of the response time. The response time denotes the period between the beginning (activation) and the termination of each transmission. Additionally, the analysis provides indications about the resources required for achieving these results, e.g., the maximal number of outstanding packets in router buffers.

These initial results are used as a foundation for the incorporation of the control layer in the design. The online adaptation of the system behavior with the control layer will induce additional overhead to the system. In this context, it is possible to distinguish between two major sources of overhead resulting from the control layer: (i) control messages (i.e., transmission latency, interference), and (ii) the complexity of the used protocol (e.g., the different system behavior due to the protocol, the frequency of mode changes). These factors must be accounted for in the evaluation of the system's worst-case performance. A safe application of the control layer is possible whenever all real-time senders belonging to the synchronization scenario have enough slack to compensate for the protocol overhead. The slack denotes the difference between the maximum time needed to traverse the network and the allowed deadline for traversing the network. The formal NoC analysis of the underlying architecture is used as an input to the analysis of the RM arbiter and the control layer. Analysis models/frameworks for different synchronization protocols are provided in the related work, e.g., round-robin [81], static priorities [83], dynamic priorities [75]. As these are based on well-known principles that are also used by commercial tools (e.g., SymTA/S, Timing-Architect Tools Suite), it is possible to integrate/provide extensions for the formal verification of the control layer in existing, commercial toolchains/workflows.

Note that the worst-case latencies of transmissions synchronized with RMs depend directly on the activities of the other senders sharing the resources. Therefore, it is possible to apply temporal-analysis frameworks, such as Real-Time Calculus [12] or Compositional Performance Analysis (CPA) [61], which use abstractions of the worst-case possible access traces (event models) to capture the dynamics of the system behavior. This reduces the pessimism in setups which do not fully utilize the resources. The main goals of the formal analysis are:

evaluation of the benefit resulting from the control layer with respect to the worst-case performance of senders

evaluation of the temporal overhead resulting from the transmissions of the control messages and the selected arbitration scheme

Through iterative runs and design space exploration, the designer should search for the configuration and synchronization method which achieve schedulability of the system and improve its performance. As a result of the worst-case analysis, the designer obtains the following information:

worst-case performance of the NoC with the applied control-layer,

settings for the synchronization protocol and senders which meet the temporal requirements of the workload.

Recall that providing formal guarantees is the predominant requirement in safety-critical domains. The effectiveness of simulation for the evaluation of the worst-case behavior of complex setups is limited. The main problem results from the necessity of providing stimuli which cover all system-level corner cases.

The following subsections are structured as follows: Subsection 4.1.1 introduces the basic principles of the Compositional Performance Analysis framework. This analysis framework, along with the system model presented in Section 4.1.1, constitutes the main input for the analysis of the introduced control layer from Section 4.1.1. The evaluation of different scheduling policies for the operation of the control layer is presented in Chapter 5.

Figure 4.2: Event models covering all traces which stay between η+(∆t) / δ−(n) and η−(∆t) / δ+(n).

Compositional Performance Analysis

The proposed analysis is based on the Compositional Performance Analysis (CPA) framework [61] and is designed to evaluate the worst-case performance of the control layer. The description in this section is based on [137] and [61]. The system model introduced by CPA is built out of three main components: resources, tasks, and event models. Resources are used for modeling processing or network nodes (e.g., CPUs, router ports, control units). Tasks are mapped to resources and compete for the service provided by them. The allocation of the service depends on the selected scheduling policy.

To capture the dynamics of the system's behavior, task activations are abstracted using event models. These models define the arrival functions η−(∆t) and η+(∆t), which provide the lower and upper bounds on the number of events (task activations) in any half-open time interval. Consequently, they allow capturing all possible event arrival patterns/scenarios within the interval bounds. Therefore, it is possible to cover all corner cases, and not only a single scenario as in the case of a recorded memory trace, i.e., to model the variety of activation patterns used in practice, e.g., periodic + spontaneous, dual cyclic, TDMA. Correspondingly, the minimum and maximum distance functions δ−(n) and δ+(n) are the counterparts of the event arrival functions η, defining the lower and upper bound on the time interval between the first and the last event in any sequence of n events. As presented in Figure 4.2, the η and δ functions can be straightforwardly converted into each other. A summary of methods for obtaining typically used behavioral models is available in [121], [33], [128].

The execution of an application in a system is modeled using a directed graph. In such a setup, edges symbolize dependencies and nodes denote tasks. A task consumes temporal service of a resource, which varies per activation between the best- and worst-case execution/transmission time. The jitter (the difference between maximum and minimum response times) resulting from the response time permits deriving new output event models. Consequently, the output event model of a task on a particular resource becomes the input event model for its dependent task(s) on other resources. Additionally, the CPA framework efficiently analyzes functional dependencies between tasks, such as communication/execution chains, e.g., activations of tasks which depend on inputs from multiple other tasks. This is carried out by joining the event models from several tasks using an AND function [61].

CPA offers an iterative approach following the busy-period method [139]. According to this approach, the critical instant scenario is used for the derivation of the formal guarantees. It maximizes the response time of the currently analyzed task by simultaneous activations of all other interfering tasks and by assuming the maximal load initiated by them (i.e., worst-case activation patterns). CPA propagates event models between dependent tasks in a loop. This is done in the form of an iterative process. The analysis finishes when all models reach a fixed point and do not change anymore. Alternatively, the analysis can finish earlier when a constraint violation is detected, e.g., a too high number of analysis iterations, too high end-to-end latencies, or a too high response time.
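As an illustration of these event models, consider the widely used periodic-with-jitter activation model. The following minimal sketch (the class and parameter names are our own, and integer time granularity is assumed) implements η+(∆t) and δ−(n) for this model and shows one plausible numerical conversion between the two representations, cf. Figure 4.2:

```python
import math

class PJEventModel:
    """Periodic-with-jitter event model (period P, jitter J); an assumed,
    illustrative instance of the CPA arrival functions."""

    def __init__(self, period, jitter):
        self.P, self.J = period, jitter

    def eta_plus(self, dt):
        """Upper bound on the number of events in any half-open window dt."""
        return 0 if dt <= 0 else math.ceil((dt + self.J) / self.P)

    def delta_minus(self, n):
        """Minimum distance between the first and the n-th event."""
        return 0 if n < 2 else max(0, (n - 1) * self.P - self.J)

    def delta_minus_via_eta(self, n):
        """Convert eta+ back into delta- (pseudo-inverse, integer time):
        delta-(n) is one less than the smallest half-open window that
        must already contain n events."""
        if n < 2:
            return 0
        dt = 1
        while self.eta_plus(dt) < n:
            dt += 1
        return dt - 1

m = PJEventModel(period=10, jitter=4)
assert m.eta_plus(30) == 4                       # ceil((30 + 4) / 10)
assert m.delta_minus(3) == m.delta_minus_via_eta(3) == 16
```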

Modeling Control Layer with Compositional Performance Analysis

Our starting point is a system with a set of real-time tasks/applications T = {τ1, ..., τl} mapped on a multicore platform composed of n processing nodes P = {p1, ..., pn}. These cores are connected through a Network-on-Chip (NoC) composed of k routers R = {r1, ..., rk}. As detailed in Chapter 3, this work concentrates on NoCs with work-conserving schedulers, i.e., performance-oriented router arbiters or prioritization applied to independent VCs (cf. Section 3.1).

Definition 1 (Performance Optimized Networks-on-Chip) The considered class of NoCs is modeled as a tuple N = (P, R, V, L) where P is a finite set of processing nodes, R is a finite set of routers, V is a set of virtual channels, and L ⊆ R × R is a set of communication links, all with the same bandwidth.

The model of a NoC with priorities assigned to the Virtual Channels can be straightforwardly derived from the model of the performance optimized NoC.

Definition 2 (Priority Based Networks-on-Chip) The considered class of NoCs is modeled as a tuple N = (P, R, V, L, Θ) where P is a finite set of processing nodes, R is a finite set of routers, V is a set of virtual channels, and L ⊆ R × R is a set of communication links, all with the same bandwidth. The set of priority values ϕ is partitioned into disjoint sets. The function Θ : V → ϕ assigns to each virtual channel vi a range of priority values [ϕi, ϕ̄i] ⊂ ϕ. By assigning ranges of priority levels to VCs, the hardware can support more criticality levels than available VCs. Therefore, a finite set of priorities is assigned, which corresponds to the available set of virtual channels. Note that a router and the processor(s) connected to it through the network interface are referred to similarly.

The traffic in the NoC is determined by a set of data streams between commu- nicating tasks. The following assumes arbitrarily activated tasks/data streams. The dynamics of the communication traffic is modeled using event-arrival functions where an event refers to a transmission (composed of one or multiple packets) traveling through the network from source to destination.

Definition 3 (Data Streams) A data stream S defined between a couple of tasks (τ1, τ2)¹ is a tuple S = ((T × T), ϕ, n, η(∆t), D) where T is a finite set of tasks including τ1 and τ2, ϕ is a statically assigned priority level equal to the priority level of the sending task, n ∈ N is a constant which defines the amount of communicated data in number of packets, and η(∆t) is an event-arrival function which defines a bound on the number of events which can arrive in a given time interval ∆t. Each data stream has a real-time constraint, i.e., a deadline D on the latency of the transmission through the NoC.

Note that all packets/flits of a given stream inherit the priority level of this stream. The task mapping and the routing strategy determine the link between the data streams and the NoC hardware they are mapped on.

Definition 4 (Transmission) A transmission m of a data stream S on a network-on-chip N is defined by the mapping function m : S −→ N which assigns to every couple of tasks (τi, τi′) ∈ e1(S)² the network nodes (s, d) ∈ e1(N) × e1(N) on which they are respectively executed, a path h = {(s, r1), (r2, r3), ..., (rk, d)} in the network, and a virtual channel vk ∈ e2(N) according to the appropriate priority value (if supported by the NoC) such that ϕ ∈ Θ(vk). Note that the assigned virtual channel is the same for all routers rj in the path h.

¹ τ1 sends data to τ2.

Additionally, the priority levels of several transmissions can fall into the same range of values of a given VC, and therefore the corresponding streams will be assigned the same VC. If these streams share the same path, the control layer must be used to arbitrate between them and to provide timing guarantees to each stream. Therefore, streams are clustered in synchronization scenarios which require synchronization with the RM. Moreover, in some designs multiple VCs may have the same priority. In this case, fair treatment is assumed, using a selected local arbitration method in the router.

Definition 5 (Synchronization Scenarios) A synchronization scenario r = {m1, m2, ..., mn} is a set of n transmissions sharing the same VC and at least one link in their path. That is,

r = ∪_{i,j ∈ n×n} (mi, mj)  s.t.  vi = vj and (hi ∩ hj) ≠ ∅

where mi = ((si, di), vi, hi) and mj = ((sj, dj), vj, hj). Note that a synchronization scenario is associated to one VC.
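As an illustration of Definition 5, the following sketch groups transmissions into synchronization scenarios: transmissions with the same VC and (transitively) overlapping paths end up under a common RM. The data structures and the transitive-closure reading of the definition are our own assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transmission:
    src: int
    dst: int
    vc: int
    path: frozenset   # set of directed links (router_a, router_b)

def synchronization_scenarios(transmissions):
    """Cluster transmissions that share a VC and at least one link."""
    scenarios = []    # each scenario: {'vc', 'links', 'members'}
    for t in transmissions:
        overlapping = [s for s in scenarios
                       if s["vc"] == t.vc and s["links"] & t.path]
        merged = {"vc": t.vc, "links": set(t.path), "members": {t}}
        for s in overlapping:          # merge every scenario t connects to
            merged["links"] |= s["links"]
            merged["members"] |= s["members"]
            scenarios.remove(s)
        scenarios.append(merged)
    return scenarios

t1 = Transmission(0, 4, vc=1, path=frozenset({(0, 1), (1, 4)}))
t2 = Transmission(2, 4, vc=1, path=frozenset({(2, 1), (1, 4)}))  # shares (1, 4)
t3 = Transmission(3, 5, vc=1, path=frozenset({(3, 5)}))          # disjoint path
assert len(synchronization_scenarios([t1, t2, t3])) == 2
```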

There is a finite number of such scenarios in the system, i.e., in the worst case all transmissions form one synchronization scenario. As mentioned previously, this work assumes that the baseline architecture of a particular NoC is analyzable. This analysis provides basic worst-case latency bounds for any transmission conducted in the network. The bottom-layer latency is determined by the following factors: the source/destination routing distance, the packet size, the link bandwidth, additional protocol overheads, and other ongoing communication.

Definition 6 (Basic Network Latency Bounds) Let the minimum and maximum network latencies Ci− and Ci+ of a given transmission mi denote the minimum and maximum time required to transfer n packets from source to destination. Ci− corresponds to the scenario where there is no contention in the network and Ci+ to the scenario where contentions from other VCs are involved. Note that worst-case latency bounds involving contentions from the same VC are provided by the timing analysis of the control layer.

² The notation ei(S) refers to the i-th item of a tuple S.

Corollary 1 The time during which the packets of the transmission mi are physically present in the network can be bounded by [Ci−; Ci+].

The selected analysis of the baseline NoC architecture must be capable of providing the basic network latency bounds. Obviously, the worst-case guarantees depend on the quality of this analysis. This work relies on the analysis from [142] and presents in the rest of the chapter how to incorporate Ci+ and Ci− in the analysis of the control layer. Finally, each transmission with real-time requirements must finish before a given deadline Di.

Analysis and Derived Metrics

As discussed, the proposed mechanism can be efficiently analyzed using the busy-window approach (see [91], [61]). It constructs a critical instant (i.e., the beginning of the busy window) and considers the worst-case activation sequence of all transmissions, their duration, and the scheduling policy to compute the maximum delay a sender can experience. Therefore, this method maximizes the response time, denoting the duration between the beginning (activation) and the termination of a transmission. Every path used by senders with real-time and/or safety requirements is considered to be a resource under a selected arbitration method (i.e., scheduler) and every interfering sender is considered to be a task. In the remainder of this subsection, the analysis introduces basic definitions (e.g., busy window, response time) for transmissions in a system without the control layer. Later, they will be extended to account for the blocking from the specific synchronization method and the protocol overhead.

Definition 7 Let ωi−(q) and ωi+(q) denote the minimum and maximum q-event busy window of a sender i on a path h, which is the time interval required to fully transmit q consecutive packets from the sender on that path.

Theorem 1 The worst-case time necessary to transmit q consecutive packets on path h considering interfering senders on the same path is bounded by [77]:

ωi+(q) ≤ q · Ci+ + Bi(ωi+(q))    (4.1)

where Bi(ωi+(q)) is the maximum interference resulting from other traffic issued by other senders from the synchronization scenario sharing the same path.

Proof: The busy window of q consecutive packets from a sender i on a path h is bounded by the time necessary to send all packets (q · Ci+) on the selected path and the maximum blocking time due to interfering traffic on the shared path (Bi(ωi+(q))). □

As ωi+(q) appears on both sides of Eq. 4.1, this equation represents an integer fixed-point problem. It can be solved iteratively, starting with ωi+(q) = q · Ci+. The interference factor Bi(ωi+(q)) depends on the selected arbitration method (scheduler). An in-depth analysis of different arbitration methods is provided in the next subsection. Note that the control layer does not cancel the interference from synchronized senders (Bi(ωi+(q))) but moves it from the NoC to the cores to avoid the propagation of blocking.

Based on the computed busy window ωi+(q), it is possible to derive the worst-case response time Ri+ of a single packet and, for each activation q in the busy window, Ri(q) of sender i, i.e., the longest time interval between the activation and the completion of a transmission on a given path:

Ri+ = max_{∀q≥1} { Ri(q) | ωi+(q) ≥ δi−(q + 1) }
Ri(q) = ωi+(q) − δi−(q)    (4.2)

The response time Ri+ is represented by the difference between the busy window ωi+(q) and the earliest possible activation δi−(q) of a packet. This difference bounds the maximum time li+(q) necessary for a sender i to fully transmit q packets over the selected path (cf. [142]) by:

li+(q) ≤ Inji−(q) + Ri+,    (4.3)

where Inji−(q) denotes the time at which the sender intends to inject the packets. Basically, the equation computes the time interval required by a stream to inject q packets and then assumes the last of these to experience the worst-case blocking in the network. Due to the in-order delivery of the network, all previous packets must have arrived at the destination before the last one. The additional delay which previous flits may observe is included as interference in the worst-case blocking of the last flit.

Based on the worst-case time to transmit q consecutive packets, ωi+(q), follows the lemma of the minimum accepted throughput β̂i− for a sender i:

β̂i−(∆t) = min( ∆t − 1, max{ n | ωi+(n) < ∆t } )    (4.4)

This lemma can be proven as follows: the max-function selects the highest number of events that can be accepted (i.e., fully transmitted) during a time interval ∆t, based on the busy window of sender i. In order for this to happen, only events that can arrive before ∆t can be accepted. Additionally, the network cannot accept more packets than time slots have passed, which is covered by the first term of the min-function.

These calculations allow defining different network slacks for a (critical) sender i on a given path: the worst-case network slack of a single packet Ŝi, of packet q in a transmission Si(q), and of a transmission considering q packets:

Ŝi = Di(1) − Ri+
Si(q) = Di(q) − Ri(q)
S̄i(q) = D̄i(q) − li+(q),    (4.5)

where Di(q) is the deadline of the q-th packet in a transmission (derived from the real-time requirements), Di(1) the deadline for a single packet, and D̄i(q) the end-to-end deadline for the transmission of q packets. The slack is basically the difference between the maximum time needed to traverse the network and the allowed deadline for traversing the network. The slack and the minimum accepted throughput are computed offline and can be used during the system design phase for the worst-case dimensioning. Depending on the design goal, one of the following requirements, or any combination, must be satisfied for each critical sender for a system to be schedulable:

(r1) Di(1) ≥ Ri+,
(r2) Di(q) ≥ Ri(q),
(r3) D̄i(q) ≥ li+(q),
(r4) ∀t ∈ N : β̂i−(t) ≥ αi(t − d),

where αi denotes the requested throughput of sender i and d is a specified delay for the sender until which the network needs to provide the required service [142]. If all or some HRT senders have an available slack, this can be used for the control layer presented in Chapter 3. Based on these conditions and specifications, the analysis can be applied during the design phase of the system to check if the control layer can be safely deployed and which protocol and configuration are allowed. If the temporal overhead of the control layer is acceptable, it can be safely used to dynamically adapt the system behavior and increase the performance of the system in the presence of dynamics.
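To make the busy-window recursion concrete, the following is a minimal sketch of the fixed-point iteration behind Eq. 4.1 and the response-time computation of Eq. 4.2, assuming integer time units and a caller-supplied blocking function B; all numbers in the usage example are invented:

```python
import math

def busy_window(q, C_plus, B):
    """Solve w = q*C+ + B(w) by fixed-point iteration (Eq. 4.1)."""
    w = q * C_plus                       # start without interference
    while True:
        w_new = q * C_plus + B(w)
        if w_new == w:                   # fixed point reached
            return w
        w = w_new

def worst_case_response_time(C_plus, B, delta_minus, q_max=1000):
    """R+ = max over q of (w+(q) - delta-(q)), iterating while the busy
    window still overlaps the next activation (Eq. 4.2)."""
    R = 0
    for q in range(1, q_max + 1):
        w = busy_window(q, C_plus, B)
        R = max(R, w - delta_minus(q))
        if w < delta_minus(q + 1):       # busy window closes before q+1
            return R
    raise RuntimeError("busy window did not close; load too high?")

# One interferer with period 20 and C+ = 6; the analyzed sender has
# C+ = 4 and period 50 (purely illustrative numbers).
B = lambda w: math.ceil(w / 20) * 6
delta = lambda n: (n - 1) * 50
assert worst_case_response_time(4, B, delta) == 10
```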

Blocking and Timing Overhead of the Control Layer

The online adaptation of the system behavior and the control layer will induce additional overhead to the system. Regarding the timing overhead, it is possible to distinguish between two major sources of overhead resulting from the control layer: (i) control messages (i.e., transmission latency, interference), and (ii) the complexity of the used protocol (i.e., the different system behavior due to the protocol, the frequency of mode changes, etc.). These must be accounted for when analyzing and designing a system with the proposed resource arbitration scheme. Note that, depending on the implementation, the aforementioned sources can have a different impact on the system.

The overhead of the control messages depends on the implementation of the control layer, which is discussed in detail in the next section. For instance, if a dedicated network/communication medium for such messages exists, they will have no direct impact on the timing (i.e., they induce no additional interference in the network). However, if a network with virtual channels is used and the control messages are sent on a higher priority level than the data, they induce an additional interference. This must then be accounted for when analyzing the latency and the possible interference with other tasks. The latter is important, as the additional interference from the control messages will increase the time a sender spends in the network (i.e., the busy window), and thus it might observe a higher number of interfering packets from another sender.

The protocol overhead depends on the actions necessary for conducting mode changes with respect to their frequency and complexity. For example, if the protocol delays the access to the network to ensure that a path can be used in isolation by a sender, there will be an initial delay before accessing the network. However, as this delay is spent at the sender, it will not increase the overhead a sender experiences in the network. Yet, if a protocol which permits the sharing of some links is applied (instead of detouring or blocking other traffic), this time will count as an additional interference increasing the time the transmission spends in the network, and thus it might also increase the number of packets from another sender that can be observed while being in the network. To estimate the overhead of the different approaches, the analysis will be extended to account for the control layer.

Definition 8 The Protocol Overhead Oi denotes the additional delay which a single transmission (possibly containing multiple packets) from an application i may experience in the worst case due to concurrently running interfering senders on the selected path.

Consequently, the proposed mechanism allows a safe application of the control layer whenever all critical senders belonging to the synchronization scenario have enough slack to compensate for the protocol overhead, i.e., satisfy the requirements when accounting for the overhead:

(r1) Di(1) ≥ Ri+ + Oi,
(r2) Di(q) ≥ Ri(q),
(r3) D̄i(q) ≥ li+(q) + ⌈q/ni⌉ · Oi,

where ni is the number of packets in a transmission of stream i and hence ⌈q/ni⌉ is the number of transmissions needed for q packets. Note that depending on the implementation of the control layer, the protocol overhead might contain the control messages of other senders, i.e., the preemption time due to activations of senders with higher priority. Moreover, the needed requirements, or a combination of requirements, strongly depend on the system design, and thus not all requirements need to be satisfied in all cases.
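A small sketch of these checks follows; the field names and data layout are our own, and the response times, latencies, and overheads are assumed to be pre-computed (e.g., with the busy-window routine sketched earlier). Which subset of the requirements must hold is, as stated above, a design decision:

```python
import math

def requirement_checks(s):
    """Evaluate (r1)-(r3) for one sender described by a plain dict."""
    return {
        "r1": s["D1"] >= s["R_plus"] + s["O"],
        "r2": all(s["D"][q] >= s["R"][q] for q in s["D"]),
        "r3": all(s["D_e2e"][q] >= s["l_plus"][q]
                  + math.ceil(q / s["n_i"]) * s["O"]
                  for q in s["D_e2e"]),
    }

sender = {"D1": 60, "R_plus": 40, "O": 12,            # invented numbers
          "D": {1: 60, 2: 110}, "R": {1: 40, 2: 90},
          "D_e2e": {4: 260}, "l_plus": {4: 200}, "n_i": 2}
print(requirement_checks(sender))  # {'r1': True, 'r2': True, 'r3': True}
```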

Analysis for Different Allocation Schemes

This subsection presents an analysis of different resource allocation schemes for the variants of the mechanism presented in Chapter 3. The application of the mechanism to real-time systems requires a formal analysis capable of providing latency and throughput guarantees for the synchronized transmissions³. For analysis purposes, every synchronization scenario protected by an RM is considered to be a resource under scheduling and every transmission belonging to this scenario is considered a task. The release time of a task is represented by the arrival of the request message (reqMsg) at the RM. The time which a request has to wait to use a resource (scheduling and preemption delay) is denoted as the blocking time of the task on the resource. The task execution constitutes the time necessary for sending all packets belonging to a particular transmission and releasing the resource. The formal worst-case timing analysis must account for the latencies resulting from the synchronization protocol, which are defined by the architecture of the interconnect. In the case of the RM-based NoC implementation, this does not fully hold due to the protocol overhead.

Theorem 2 The worst-case time necessary to conduct q transmissions of a sender i belonging to a synchronization scenario protected by the resource manager RM is bounded by:

ωi+(q) ≤ q · Ci+ + Oi(ωi+(q)) + Bi(ωi+(q))    (4.6)

where Ci+ denotes the maximal transmission latency, Oi denotes the maximum protocol overhead for sender i, and Bi(ωi+(q)) the maximum blocking resulting from the scheduling of other transmissions belonging to the same synchronization scenario.

Proof: The theorem directly results from the description of the mechanism and protocol (see Sec. 3.6.1). The busy window of q consecutive transmissions mi is bounded by the time necessary to send all packets belonging to these transmissions (q · Ci+), the time (Oi) necessary to exchange control messages for each transmission, plus the maximum time interval during which a particular request can be blocked due to other ongoing transmissions. □

³ For the interested reader, a detailed discussion of real-time metrics applicable to NoC traffic is presented in Chapter 2.

Note that ωi+(q) appears on both sides of Eq. 4.6, forming a fixed-point iteration problem. It can be solved iteratively, starting with ωi+(q) = q · Ci+ and passing the result to the Oi(ωi+(q)) and Bi(ωi+(q)) functions. The calculations must then be repeated until two consecutive iterations produce the same result. Additionally, the number of activations q during the time interval ∆t can be calculated using η+(∆t); therefore, both notations can be used interchangeably. In the following, an in-depth discussion of the Bi and Oi factors for different synchronization methods is provided.

Time-Division Multiple Access

The description begins with time-driven scheduling, which is based on the two following standard notions of time allocation: the time slot and the scheduling cycle. The following definitions will be used later in this work.

Definition 9 (Time slot) A time slot s is the basic time budget allocated to each application belonging to the synchronization scenario S, during which it acquires exclusive access to the NoC. It is assumed that this static parameter is assigned at design time and does not change during execution. Note that each transmission can have a different size of the time slot. This can be used to distinguish between data streams and to favor one application over other applications accessing the NoC.

The time slot can be obtained from the analysis of the underlying NoC architecture (i.e. the basic network latency).

Definition 10 (Scheduling Cycle) The time to access the NoC is partitioned into a set of time slots allocated to the applications in a synchronization scenario as follows: $\Theta : \{s_1, s_2, \ldots, s_k\} \longrightarrow \{t_1, t_2, \ldots, t_n\}$, where $\Theta(s_i) = t_j$ means that the time slot $s_i$ is allocated to transmission $t_j$. Note that the same time slot cannot be allocated to different applications. A scheduling cycle defines a cyclic, repetitive order according to which the same application is allocated access to the NoC, that is, $\Theta(s_i + \delta) = \Theta(s_i)$, with $\delta$ being the length of the cycle.

In a standard time-driven allocation (e.g. round-robin or TDM), the duration of the scheduling cycle is static and corresponds to the number of synchronized applications. In the following pages, the terms TDM-cycle and scheduling cycle are used interchangeably.

The basic property of TDMA is to enforce a fully static NoC behavior, i.e. to make the transmission time independent of the behavior of other running senders. Therefore, time slots are assigned to tasks regardless of whether the tasks are active, i.e. request the resource, or not. The blocking time under TDM-based scheduling depends statically on the activations of other synchronized transmissions, i.e. the length of the TDM-cycle does not change in time.

Lemma 1 The interference which $q = \eta^+(\Delta t)$ requests experience in a time window $\Delta t$ can be bounded by:

$$B_i^{TDM}(\Delta t) = \eta_i^+(\Delta t) \cdot \left\lceil \frac{C_i^+}{s_i} \right\rceil \cdot \sum_{\forall j \in S} s_j \qquad (4.7)$$

where $S$ is the set of all transmissions belonging to the same synchronization scenario as transmission $i$, $C_j^+$ denotes the maximum latency of the transmission $j$, $s_j$ denotes the maximal blocking caused by a single TDM-slot belonging to the transmission $j$, and $\eta_i^+(\Delta t)$ denotes the maximum number of transmissions initiated by the sender under analysis ($i$) within the interval $\Delta t$.

Proof: The lemma results directly from the description of the scheduler (see Sec. 3.6.1). The sum computes the interference from all other transmissions belonging to the same synchronization scenario $S$. Following the TDM scheduling policy, each TDM slot $s_j$ belonging to the transmission $j$ may block a single slot belonging to the transmission $i$ only once. Moreover, transmission $j$ cannot cause blocking in more slots than the number of TDM cycles which transmission $i$ requires to complete, i.e. $\eta_i^+(\Delta t) \cdot \lceil C_i^+ / s_i \rceil$. The assumed scheduler is a special case of the TDM-scheduler scheme which was analyzed in [121]. □

Theorem 3 The worst-case protocol overhead $O_i^{TDM}$ in the TDM cycle during the period $\Delta t$ can be calculated as follows:

$$O_i^{TDM}(\Delta t) = \eta_i^+(\Delta t) \cdot \Bigg( \underbrace{\left\lceil \frac{C_i^+}{s_i} \right\rceil \cdot C_{i,ctrl}^+}_{\text{own activations}} + \underbrace{\left\lceil \frac{C_i^+}{s_i} \right\rceil \cdot \sum_{\forall j \in S} C_{j,ctrl}^+}_{\text{interference from other senders}} \Bigg) \qquad (4.8)$$

where $C_{i,ctrl}^+$ denotes the maximum latency of the control messages (gntMsg) and $s_i$ the length of the TDM-slot for the considered transmission $i$. Note that different transmissions may be assigned slots of different size.

Proof: The theorem results directly from the description of the mechanism and protocol (cf. Sec. 3.6.1). The protocol overhead of $q = \eta_i^+(\Delta t)$ consecutive activations of transmission $i$ includes, for each activation, the time necessary for

the RM to issue gntMsgs for each time slot required for the particular transmission ($\lceil C_i^+/s_i \rceil \cdot C_{i,ctrl}^+$), and also the control messages for granting the time slots of the other synchronized senders ($\lceil C_i^+/s_i \rceil \cdot \sum_{\forall j \in S} C_{j,ctrl}^+$). □
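For illustration, the two TDM bounds can be evaluated with a few lines of code. The following hypothetical C++ sketch mirrors Eqs. 4.7 and 4.8; the struct and function names are assumptions, not from the thesis, and `others` holds the values of the interfering transmissions of the scenario $S$.

```cpp
// Hypothetical sketch evaluating the TDM bounds of Eqs. 4.7 and 4.8.
#include <cmath>
#include <vector>

struct Tx {
    double C;      // C^+      : maximum transmission latency
    double s;      // s        : TDM-slot length
    double Cctrl;  // C^+_ctrl : maximum control-message (gntMsg) latency
};

// B_i^TDM(dt) for eta = eta_i^+(dt) activations of transmission ti (Eq. 4.7).
double tdmBlocking(double eta, const Tx& ti, const std::vector<Tx>& others) {
    double slotSum = 0.0;
    for (const Tx& tj : others) slotSum += tj.s;      // sum of s_j over S
    return eta * std::ceil(ti.C / ti.s) * slotSum;
}

// O_i^TDM(dt): gntMsgs for the own slots plus the grants for the other
// senders issued during the cycles occupied by ti (Eq. 4.8).
double tdmOverhead(double eta, const Tx& ti, const std::vector<Tx>& others) {
    double ctrlSum = 0.0;
    for (const Tx& tj : others) ctrlSum += tj.Cctrl;  // sum of C^+_{j,ctrl}
    const double slots = std::ceil(ti.C / ti.s);
    return eta * (slots * ti.Cctrl + slots * ctrlSum);
}
```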

Round-Robin Based Performance Optimized Arbitration

Round-robin schedulers (cf. [92]) avoid the performance penalty introduced by TDMA in dynamic setups (arrival jitter, variable transmission length) by releasing a time slot whenever it is not used by its sender. Consequently, this reduces unnecessary idle times and results in an improved, more compact schedule. However, from the formal verification perspective, although round-robin systems perform much better than TDMA on average, they still offer the same worst-case guarantees as TDMA schedulers, i.e. the worst-case load and response time calculations are the same as in the case of TDMA. In the case of the RM-based NoC implementation this is not fully true, i.e. there is a difference resulting from the modified synchronization protocol: an extension from one control message (gntMsg) to three (reqMsg, ackMsg and relMsg). The analysis is based on the protocol introduced in Section 3.6.1 and was published in [81] and extended in [82].

Lemma 2 The blocking time which $q$ requests experience in a time window $\Delta t$ can be bounded by:

$$B_i^{RR}(\Delta t) = \eta_i^+(\Delta t) \cdot \sum_{\forall t_j \in S} C_j^+ \cdot \min\{q, \eta_{j,out}^+(\Delta t)\} \qquad (4.9)$$

where $S$ is the set of all transmissions belonging to the same synchronization scenario as transmission $i$, $C_j^+$ denotes the maximum latency of the transmission $t_j$, and $\eta_{j,out}^+(\Delta t)$ is the maximum number of requests for transmission $j$ within the interval $\Delta t$, taking into consideration the worst-case propagation jitter resulting from the transmission of control messages.

Proof: According to the assumptions (see 3.6.1), the considered RM uses a round-robin scheduler. The sum in Eq. 4.9 computes the interference from all other transmissions belonging to the same synchronization scenario. Following this arbitration policy, each transmission $j$ may block the transmission $i$ only once. However, $\eta_i^+(\Delta t)$ activations of $i$ can be blocked only $\eta_i^+(\Delta t)$ times and also no more than $\eta_j^+(\Delta t)$ times; this is denoted by the min function. Finally, it is assumed that the first requests from all streams arrive at exactly the same moment in order to construct the critical instant. Therefore, the bound is a conservative overestimation of the actual interference. The assumed fixed round-robin scheduler is a special case of the round-robin scheme which was analyzed in [119], where a proof is also included. □

Theorem 4 The worst-case protocol overhead in the time interval $\Delta t$ resulting from $\eta_i^+(\Delta t)$ transmissions $i$ belonging to a synchronization scenario protected by the resource manager RM is bounded by:

$$O_i^{RR} = \eta_i^+(\Delta t) \cdot \Big( 3 \cdot C_{i,ctrl}^+ + \sum_{\forall t_j \in S} (2 \cdot C_{j,ctrl}^+) \cdot \min\{\eta_i^+(\Delta t), \eta_j^+(\Delta t)\} \Big) \qquad (4.10)$$

where $C_i^+$ denotes the maximal transmission latency, $C_{i,ctrl}^+$ denotes the maximum latency of the control messages for transmission $i$, and $C_{j,ctrl}^+$ denotes the maximum latency of the control messages for the other transmissions belonging to the same synchronization scenario $S$.

Proof: The theorem directly results from the description of the mechanism and protocol (see Sec. 3.6.1). $\eta_i^+(\Delta t)$ consecutive transmissions require $3 \cdot C_{i,ctrl}^+$ for the control messages of each transmission (reqMsg, ackMsg, relMsg), plus they must account for two control messages from each interfering transmission, i.e. ackMsg and relMsg. The reqMsg from interfering senders happens in parallel with ongoing transmissions and therefore does not influence the timing. The number of blockings is calculated as in Eq. 4.9. □

In the following paragraphs, the work-conserving priority-based schedulers will be described.
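Analogously to the TDM case, the round-robin bounds can be evaluated directly from the parameters of the synchronized transmissions. The sketch below is hypothetical and simply mirrors Eqs. 4.9 and 4.10 as stated; all names are illustrative.

```cpp
// Hypothetical sketch evaluating the round-robin bounds of Eqs. 4.9 and 4.10.
// etaI stands for eta_i^+(dt); each entry of `others` describes one
// interfering transmission t_j of the synchronization scenario S.
#include <algorithm>
#include <vector>

struct RrTx {
    double C;       // C_j^+            : maximum transmission latency
    double Cctrl;   // C_{j,ctrl}^+     : maximum control-message latency
    double eta;     // eta_j^+(dt)      : maximum number of requests of t_j
    double etaOut;  // eta_{j,out}^+(dt): requests incl. propagation jitter
};

// B_i^RR(dt), Eq. 4.9; q equals etaI per the lemma.
double rrBlocking(double etaI, const std::vector<RrTx>& others) {
    double sum = 0.0;
    for (const RrTx& tj : others)
        sum += tj.C * std::min(etaI, tj.etaOut);
    return etaI * sum;
}

// O_i^RR(dt), Eq. 4.10: three own control messages (reqMsg, ackMsg, relMsg)
// plus ackMsg and relMsg of every interfering grant.
double rrOverhead(double etaI, double CiCtrl, const std::vector<RrTx>& others) {
    double sum = 0.0;
    for (const RrTx& tj : others)
        sum += 2.0 * tj.Cctrl * std::min(etaI, tj.eta);
    return etaI * (3.0 * CiCtrl + sum);
}
```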

Mode Change Analysis

In contrast to time-driven schedulers, priority-based schedulers introduce mode changes in the work of the system. For instance, whenever an RM is trying to block a certain sender, it must send an appropriate control message. Until receiving this message, the client may freely use the NoC (inject packets), as it is unaware of the changed state of the NoC. Consequently, receiving the block message does not prevent interference with packets injected in the previous mode of NoC operation (i.e. when the client was still allowed to use the interconnect). Note that in the majority of NoCs it is impossible to remove flits from the NoC routers once they are injected; one must wait until they leave the interconnect. This introduces an additional delay: the mode-change latency.

The RM must assure that the transition between modes is safe. Therefore, before switching to a new mode of NoC work (e.g. acknowledging senders with higher priority, changing the settings of rate controllers), the RM sends a stop message (stopMsg) to the clients supervising active applications, to block all accesses to the NoC from the corresponding node. Clients then wait for the confMsg communicating the current system mode. Consequently, the RM sends the confMsg with a delay (cf. Sections 3.6.2, 3.6.5) to ensure that packets from previously ongoing transmissions have left the NoC, i.e. to provide logical isolation. After receiving the confMsg, clients

adjust the rate and unblock transmissions. Additionally, the confMsg initiates active applications (the ones which have just issued an actMsg), which can then use the NoC.

Static Priority Preemptive Scheduler

This section introduces the analysis of the static priority preemptive (SPP) arbitration from [83], done using the introduced mode-change analysis. The analysis extends the principles of commonly used SPP schedulers [92], [127], [106] by considering the overhead resulting from the introduced control layer and the specifics of the NoC architecture. Firstly, the maximum blocking time which a transmission may experience due to other transmissions belonging to the same synchronization scenario will be considered. Unlike in the classic SPP scheduler, the preemptions introduce an additional delay also for applications with higher priorities. This delay is caused by the synchronization protocol and the necessity to ensure that two transmissions will not interfere with each other, i.e. to prevent priority inversion.

Lemma 3 $B_i^{SPP}$ defines the worst-case blocking that can happen during the time interval $\Delta t$ in a system with an RM applying the static priority preemptive scheduler. It is built of two factors. The indirect preemption delay (IPD) defines the additional latency which $\eta_i^+(\Delta t)$ requests for a transmission $i$ will experience in the worst case due to transmissions with a lower priority. The direct preemption delay (DPD) defines the blocking which $\eta_i^+(\Delta t)$ requests experience in a time window $\Delta t$ due to transmissions with a higher priority. Consequently, the worst-case blocking can be bounded by:

$$B_i^{SPP}(\Delta t) = \eta_i^+(\Delta t) \cdot \Bigg( \underbrace{\min\Big\{\eta_i^+(\Delta t), \sum_{\forall t_j \in lp(i)} \eta_j^+(\Delta t)\Big\} \cdot \max_{\forall t_j \in lp(i)}(C_j^+)}_{\text{Indirect Preemption Delay}} + \underbrace{\sum_{\forall t_j \in hp(i)} \eta_j^+(\Delta t) \cdot C_j^+}_{\text{Direct Preemption Delay}} \Bigg) \qquad (4.11)$$

where $lp(i)$ is the set of transmissions with a lower priority than $t_i$ belonging to the same synchronization scenario $S$, $hp(i)$ is the set of transmissions with a higher priority than $t_i$ belonging to the same synchronization scenario $S$, $C_{j,ctrl}^+$ denotes the maximum latency of the control message for transmission $t_j$, and $C_j^+$ denotes the maximum latency of the single transmission $t_j$.

Proof: The proof is derived directly from the description of the mechanism. The $\eta_i^+(\Delta t)$ transmissions $t_i$ preempt the transmissions belonging to $lp(i)$ at most $\eta_i^+(\Delta t)$ times.

It may also happen that, in the worst case, the maximal number of activations of the tasks from $lp(i)$ is lower than $\eta_i^+(\Delta t)$; therefore, one should select the minimum of both. Secondly, the RM must assure the logical separation between the preempted transmission and $t_i$, i.e. that all packets from the preempted transmission $t_j$ safely leave the NoC before $t_i$ starts. This is conservatively assured by selecting the longest (i.e. maximum) latency of the last packet over all transmissions belonging to $lp(i)$.

In the next step follows the definition of the blocking which a task may experience due to activations of other tasks with a higher priority belonging to the same synchronization scenario. Following the assumed static priority preemptive (SPP) arbitration policy, each transmission $m_j$ may block/preempt the transmission $t_i$ several times. However, the number of activations of $t_j$ can be bounded by $\eta_j^+(\Delta t)$. □

Next, the worst-case protocol overhead will be defined.

Theorem 5 The worst-case protocol overhead in the time interval $\Delta t$ resulting from $\eta_i^+(\Delta t)$ transmissions $i$ belonging to a synchronization scenario protected by the resource manager RM in a system with static priority based scheduling is bounded by:

$$O_i^{SPP}(\Delta t) = \eta_i^+(\Delta t) \cdot \Bigg( \underbrace{3 \cdot C_{i,ctrl}^+}_{\text{own activations}} + \underbrace{\min\Big\{\eta_i^+(\Delta t), \sum_{\forall t_j \in lp(i)} \eta_j^+(\Delta t)\Big\} \cdot \max_{\forall t_j \in lp(i)}(C_{j,ctrl}^+)}_{\text{Indirect blocking}} + \underbrace{\sum_{\forall t_j \in hp(i)} \eta_j^+(\Delta t) \cdot (C_{i,ctrl}^+ + 2 \cdot C_{j,ctrl}^+)}_{\text{Direct blocking}} \Bigg) \qquad (4.12)$$

where $C_i^+$ denotes the maximal network latency of $m_i$, $C_{i,ctrl}^+$ denotes the maximum latency of the control message for $m_i$, $B_{i,IPD}(w(q))$ the maximum blocking resulting from preemptions of lower-priority tasks, and $B_{i,DPD}(w(q))$ the maximum blocking due to transmissions with a higher priority than $m_i$.

Proof: The theorem directly results from the description of the mechanism and protocol. The total busy window of $q$ consecutive transmissions $m_i$ is constructed from the maximum network latency of these transmissions ($q \cdot C_i^+$), the latency ($3q \cdot C_{i,ctrl}^+$) of the control messages for each transmission (reqMsg, ackMsg, relMsg), plus the maximum latency resulting from preemptions of transmissions with a lower priority and from the arrival of requests for transmissions with a higher priority.

The $\eta_i^+(\Delta t)$ transmissions $t_i$ preempt the transmissions belonging to $lp(i)$ at most $\eta_i^+(\Delta t)$ times. It may also happen that, in the worst case, the maximal number of activations of the tasks from $lp(i)$ is lower than $\eta_i^+(\Delta t)$; therefore, one should select the minimum of both. Each preemption request introduces an additional delay resulting from the synchronization protocol. Firstly, the higher-priority request must wait until the control message (blckMsg) arrives at the appropriate client. In this analysis, the longest possible blckMsg latency over all transmissions belonging to $lp(i)$ is conservatively assumed. Additionally, each preemption by a task with a higher priority is defined by the time necessary to send an ackMsg and receive a relMsg from the higher-priority application, as well as the latency of the resMsg and the maximum network latency of the higher-priority transmission $t_j$. Finally, in order to construct the worst-case scenario, it is assumed that the requests from all streams arrive at exactly the same moment. Therefore, the bound is a conservative overestimation of the actual interference. Note that the assumed static priority preemptive scheduler is a special case of the general SPP scheme analyzed in [121]. □
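The two SPP bounds can again be evaluated mechanically from the per-transmission parameters. The following hypothetical sketch mirrors Eqs. 4.11 and 4.12 as stated; `lp` and `hp` hold the lower- and higher-priority transmissions of the scenario, and all names are illustrative.

```cpp
// Hypothetical sketch of the SPP bounds (Eqs. 4.11 and 4.12).
#include <algorithm>
#include <vector>

struct SppTx {
    double C;      // C_j^+        : maximum transmission latency
    double Cctrl;  // C_{j,ctrl}^+ : maximum control-message latency
    double eta;    // eta_j^+(dt)  : maximum number of activations
};

double sppBlocking(double etaI, const std::vector<SppTx>& lp,
                   const std::vector<SppTx>& hp) {
    double lpEta = 0.0, lpCmax = 0.0;
    for (const SppTx& t : lp) { lpEta += t.eta; lpCmax = std::max(lpCmax, t.C); }
    const double ipd = std::min(etaI, lpEta) * lpCmax;  // indirect preemption
    double dpd = 0.0;
    for (const SppTx& t : hp) dpd += t.eta * t.C;       // direct preemption
    return etaI * (ipd + dpd);                          // Eq. 4.11
}

double sppOverhead(double etaI, double CiCtrl, const std::vector<SppTx>& lp,
                   const std::vector<SppTx>& hp) {
    double lpEta = 0.0, lpCtrlMax = 0.0;
    for (const SppTx& t : lp) {
        lpEta += t.eta;
        lpCtrlMax = std::max(lpCtrlMax, t.Cctrl);
    }
    double hpSum = 0.0;           // ackMsg/relMsg of hp senders + own resume
    for (const SppTx& t : hp) hpSum += t.eta * (CiCtrl + 2.0 * t.Cctrl);
    return etaI * (3.0 * CiCtrl                        // own reqMsg/ackMsg/relMsg
                   + std::min(etaI, lpEta) * lpCtrlMax // indirect blocking
                   + hpSum);                           // Eq. 4.12
}
```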

Dynamic Priority Preemptive Scheduler

As discussed in Chapter 3, if there is slack left for safety-critical senders after considering the synchronization protocol (blocking and overhead), it may also be used for improving the performance of best-effort senders. In the dynamic priority approach, BE traffic is allowed to use the path (or parts of it) of an HRT sender when the HRT sender is not active. As soon as the HRT sender starts a transmission, the BE traffic is detoured or blocked, such that the HRT sender experiences no interference from (this) BE traffic while sending data. Hence, for each transmission of a sender $i$ (which can contain multiple packets), the control layer must enforce the detouring of any BE sender sharing the path, which is handled by a detouring request. Each detouring request to BE senders introduces an additional delay resulting from the change of the system state. Firstly, the critical sender must send the request message to the RM (reqMsg). Later, the RM blocks (or detours) the BE senders. To do this, it injects the control message (cfgMsg) and waits until it arrives at the clients supervising the BE senders. This time can be obtained with a standard network analysis deriving the worst-case latency of control messages. Secondly, the RM must ensure sufficient separation between the blocked (or detoured) BE sender and the HRT sender, i.e., ensure that all packets from the BE sender leave the NoC before the HRT sender starts sending. This can be conservatively ensured by selecting the maximum latency of any BE packet on the path. Hence, this work assumes that the slowest BE sender injects a packet at the same moment the control message arrives, and one has to wait until this packet leaves the network. With

this, it is possible to define the additional delay that $\eta_i^+(\Delta t)$ requests from an HRT sender may experience as:

Definition 11 The Protocol Overhead defines the additional delay which a single HRT transmission (possibly containing multiple packets) from an HRT (and/or SC) application $i$ may experience in the worst case due to concurrently running BE senders on path $h$. It can be upper bounded by [77]:

$$O_i^{DP}(h) = Inj_i^{ctrl}(|BE(h)|) + 3 \cdot C_{i,ctrl}^+ + \max_{\forall j \in BE(h)}(C_{j,ctrl}^+) + \max_{\forall j \in BE(h)}(C_{j,pckt}^+) \qquad (4.13)$$

where $BE(h)$ is the set of BE senders sharing the same path $h$ as the critical transmission $i$, $C_{j,ctrl}^+$ denotes the maximum latency of the control message to the BE sender $j$, $C_{j,pckt}^+$ denotes the maximum latency of a single packet belonging to the BE sender $j$, $|BE(h)|$ the number of BE senders sharing path $h$, and $Inj_i^{ctrl}(m)$ the time to inject $m$ control messages. Note that for a network supporting multicast it is possible to neglect the injection time, as only a single message must be sent, whose delay is fully accounted for in $C_{j,ctrl}^+$. $3 \cdot C_{i,ctrl}^+$ is the protocol overhead of the RM (reqMsg, ackMsg, relMsg).

Consequently, the proposed mechanism safely blocks (detours) BE applications whenever all critical senders belonging to the synchronization scenario have enough slack. Therefore, it satisfies the requirements when accounting for the overhead:

$$\begin{aligned} (r1_{ALD})\quad & D_i(1) \ge R_i^+(h) + O_i^{ALD}(h), \\ (r2_{ALD})\quad & D_i(q) \ge R_i^+(q, h), \\ (r3_{ALD})\quad & D_i(q) \ge \bar{l}_i^+(q, h) + \left\lceil \frac{q}{n_i} \right\rceil \cdot O_i^{ALD}(h), \end{aligned}$$

where $n_i$ is the number of packets in a transmission of stream $i$ and hence $\lceil q/n_i \rceil$ the number of transmissions needed for $q$ packets. In the case of detouring BE traffic, the set of interfering streams with the same priority $EP_i(h)$ for sender $i$ only contains the other safety-critical senders sharing path $h$ and no best-effort traffic when deriving $R_i^+$, $R_i$, and $l_i^+$. Depending on the implementation of the control layer, the set of senders with a higher priority might contain the control messages of other senders. Note that the needed requirements, or combinations of requirements, depend on the system design (i.e. not all requirements need to be satisfied in all cases).
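To close this analysis, the detouring overhead of Eq. 4.13 and the check of requirement $(r1_{ALD})$ can be written down compactly. The sketch below is hypothetical; `injTime` stands for $Inj_i^{ctrl}(|BE(h)|)$ and all other names are illustrative.

```cpp
// Hypothetical sketch of the best-effort detouring overhead (Eq. 4.13)
// and the deadline check r1_ALD.
#include <algorithm>
#include <vector>

struct BeTx {
    double Cctrl;  // C_{j,ctrl}^+: control-message latency to BE sender j
    double Cpckt;  // C_{j,pckt}^+: latency of a single BE packet on path h
};

double dpOverhead(double injTime, double CiCtrl, const std::vector<BeTx>& be) {
    double ctrlMax = 0.0, pcktMax = 0.0;
    for (const BeTx& j : be) {
        ctrlMax = std::max(ctrlMax, j.Cctrl);
        pcktMax = std::max(pcktMax, j.Cpckt);
    }
    return injTime + 3.0 * CiCtrl + ctrlMax + pcktMax;  // Eq. 4.13
}

// r1_ALD: a single transmission meets its deadline despite the overhead.
bool r1Ald(double Di, double Ri, double Oald) { return Di >= Ri + Oald; }
```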

4.1.2 Simulation

Simulation can be applied at all abstraction levels. This feature is particularly useful during the design of the control layer, which can be carried out using a transaction-level modeling (TLM) approach. This method separates the design of the communication protocol (e.g. details of communication, channel, number of messages, scheduling policy of the RM and client, mapping of the RM) from the design and implementation of the functional units (clients and the RM) or the communication architecture (e.g. routers and network interfaces). An application of the TLM method permits quick evaluation and exploration of multiple design options through transaction abstractions used instead of a simulation of individual bits and cycles. By doing so, it is possible to accelerate the design process and to decrease modeling efforts. However, to achieve these goals, TLM sacrifices temporal and functional accuracy in comparison to modeling on the register-transfer level (RTL). Therefore, input from these tools should be used at an early stage of the NoC's design to narrow the spectrum of possible choices. Later, it can be supported by the evaluation of RTL-level prototypes which, although accurate, are complex in terms of modeling effort and slow in execution.

The incorporation of the control layer can be done through an extension of existing TLM simulation tools for NoCs, see Table 4.1, as it is independent of the underlying NoC architecture (cf. Chapter 3). The rest of this section describes the simulation library designed for these purposes using the Omnet++ framework. The description of the simulator designed for supporting the implementation of the control layer is structured as follows: Subsection 4.1.2 briefly presents the available simulation environments for NoCs and motivates the choice of Omnet++ and the Hnocs [15] library. Next, Subsection 4.1.2 presents the model of the underlying NoC architecture, derived from the Hnocs [15] library, and highlights the introduced extensions, e.g. the additional physical layers necessary for the transmission of control messages. Finally, Subsection 4.1.2 introduces the models for simulating the control layer and its main features. The evaluation of the different protocols using the introduced simulator is presented in Chapter 5.

Background and Related Work

Since NoCs have emerged as a new on-chip interconnect paradigm for multi- and many-core systems, multiple research and commercial efforts have focused on the development of tools for NoC simulation and modeling. These efforts are stimulated by the size and dimensionality of the design space. The designer must simultaneously consider multiple architectural features: topology, routing, congestion control, admission control, link bandwidth, size and number of buffer queues, etc.

Simulator          | Framework  | Topologies       | Open Source | Heterogeneity | Synchronous/Asynchronous
BookSim [70]       | C++        | mesh/torus/trees | Yes         | No            | Synchronous
Noxsim [32]        | SystemC    | All              | Yes         | No            | Synchronous
Nigram [88]        | SystemC    | All              | Yes         | No            | Synchronous
Wormsim [63],[99]  | C++        | mesh/torus       | Yes         | No            | Synchronous
Garnet2.0 [8]      | C++ (gem5) | All              | Yes         | Yes           | Synchronous
Sicosys [117]      | C++        | Mesh/Torus       | Yes         | No            | Synchronous
Nostrum [94]       | SystemC    | Mesh/Torus       | Yes         | No            | Synchronous
HNOCS [15]         | Omnet++    | All              | Yes         | Yes           | Both

Table 4.1: Comparison of the open source NoC simulators.

The comparison of the considered, popular NoC simulation tools is presented in Table 4.1. The implementation of the control layer can be done in any of these simulators, as it is mostly independent of the underlying NoC architecture, cf. Chapter 3. As the choice is arbitrary, the selection process is mainly defined by the functional properties of the environment, e.g. its flexibility, code re-usability, popularity among the research community and industrial partners, as well as the availability of libraries.

For the purpose of the initial design of the control layer, the simulation environment has been implemented through an extension of the HNOCS library for the Omnet++ simulation framework. OMNeT++ is an object-oriented, modular, discrete-event network simulator. It has gained widespread popularity as a network simulation platform in the scientific community (more than 200 OMNEST/OMNeT++ related publications per year [111]) as well as among industrial partners. OMNEST [111] is the commercially supported version of OMNeT++.

OMNeT++'s environment provides several important advantages which can be applied to the modeling of the NoC. Through an extensible, modular, component-based C++ simulation framework, it offers easy integration of many independent libraries and modules, i.e. different traffic sources, integration of on-chip and off-chip traffic streams, portability and flexibility of the designs. This allows a significant reduction of the effort required for modeling domain-specific functionalities such as support for sensor networks, wireless ad-hoc networks, Internet protocols, performance modeling, photonic networks, etc. All these libraries are developed as independent projects. OMNeT++ offers an Eclipse-based IDE, a graphical runtime environment, and an interface for hosting other tools. Additionally, this toolset

can be enhanced with extensions for real-time simulation, network emulation, database integration, SystemC integration, and others.

Consequently, Omnet++ has been used for the implementation of the open-source HNOCS library (Heterogeneous Network-on-Chip Simulator) [15]. Later, in the scope of the presented work, this library has been enhanced and extended for the transaction-level modeling of the control layer. HNOCS offers models of heterogeneous NoCs with variable link capacities and numbers of VCs per port, following the baseline design from [37]. The HNOCS simulation platform provides a modular, scalable, expandable and fully parameterizable framework which includes three types of NoC routers: synchronous, synchronous with virtual output queue (VoQ) and asynchronous. Therefore, it offers support for modeling heterogeneous NoCs, frequently applied in industrial practice, e.g. FlexNoC [11], which is missing in other simulators (cf. Table 4.1). For supporting evaluation and verification, the HNOCS library offers a rich set of tools for measurements at different granularity levels (packets or flits), e.g.: end-to-end latencies, throughput, VC acquisition latencies, transfer latencies, etc. The next section describes the underlying NoC architecture used as the basis for the introduced mechanism.

Simulated NoC Architecture

The simulator is built using as a basis the works from [37] and [15], which introduce a NoC implementation following the principles of a performance-optimized NoC, cf. Chapter 1. Therefore, all traffic is treated equally, and fair local arbitration is done independently in the routers. The library provides modules to model the interconnect of the underlying NoC-based many-core system, including processing nodes, routers (incl. virtual channel allocators, ports, schedulers), and NIs (sinks and sources). The simulation is event-based.

The building elements of this NoC are presented in Figure 4.3. The presented NoC architecture is based on a 2D mesh with routers connecting individual processing nodes. All elements have a modular and hierarchical design. The router is composed of an adjustable number of ports, in this case five, which are fully interconnected. Each port consists of four basic modules: inPort, vcCalc, opCalc and scheduler. inPort is responsible for receiving arriving packets and queuing them in different buffers for different VCs. opCalc and vcCalc assign the output port and output virtual channel for waiting packets, whereas the scheduler decides the order in which packets get access to the port while leaving the router⁴. Modules are interchangeable.

⁴ For the interested reader, a detailed description of the internal router working is available in [15].

For instance, the extension of the simulator proposed in this work supports round-robin-based (similar to iSLIP) as well as static-priority-based (with preemptions at flit level) arbiters. Additionally, the routers apply winner-takes-all scheduling, i.e. all flits of the winning packet must be transmitted before the next packet can be scheduled [37].


Figure 4.3: 3x3 mesh NoC with the internal structure of Router, Processing Node, and Port.

Each processing node is constructed from a network interface built out of two primary modules, NISrc and NISink, as well as from the processing core. The core contains a traffic generator which injects packets into the NoC. For each core, there are two main configuration parameters: the destination and the packet inter-arrival timing parameters. The destination can be pre-defined (deterministic) or randomly selected using one of the OMNeT++ generation functions (e.g. uniform or exponential). Similarly, the inter-arrival times can be: randomly distributed (e.g. exponential, constant), driven by a mathematical model, or defined by a trace file. The mathematical models can be applied to simulate the following traffic patterns: i) periodic activation with jitter, ii) periodic bursts with jitter, iii) Bayesian probability defining the next occurrence of a periodic activation, and iv) Markov chains for modeling sporadic bursts which could happen on processors with caches. The trace-based execution uses files which contain specific information about the times when packets must be injected and the length of the transmissions.

The collection of data is done on the transaction level (single flit, packet latency

or round-trip times) as well as for whole logical transmissions constructed of multiple packets. In addition to latency and throughput measurements, information about queue occupancies is available, e.g. the worst-case backlog. This allows a quick identification of bottlenecks and saturation points in the NoC.


Figure 4.4: Multilayer NoCs modeled with the simulator: a) heterogeneous double layer, b) homogeneous multilayer.

Finally, besides regular 2D mesh architectures, the developed simulator offers support for multi-layered chips, see Figure 4.4. This includes the simulation of heterogeneous layers, e.g. different routers for the transmission of control and data messages as in the MPPA architecture from Kalray, see Figure 4.4 a), as well as the modeling of homogeneous chips constructed from multiple layers with the same router architecture, similarly to Tile64 from Tilera [151] or [73], see Figure 4.4 b).

Modules for Modeling of the Control Layer

Figure 4.5 presents the modules introduced to model the control layer. A processing node with a client, i.e. a client node, is constructed from two sub-modules: a traffic generator and the actual client logic belonging to the control layer. The working of the former module has already been discussed in the previous section. The working of the latter will be described in the scope of the conducted operations, see Figure 4.6. Whenever a sender is trying to start a transmission, the traffic generator initiates a transInitMsg event (a). This message arrives at the client module and is trapped. The corresponding client sends a request message to the RM to obtain access to the network (b). After receiving the ackMsg (c), the communication may start (d).


Figure 4.5: 3x3 mesh NoC with the internal structure of Client Node, RM Node, and Peripheral Node.

Once granted, the connection holds until the end of the transmission or an abortion by the client based on a predefined timeout used to prevent unbounded connection times. When the client detects the end of the transmission (e), based on the injection of the last flit, it issues a relMsg to the RM (f).
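The client behavior just described maps naturally onto a small state machine. The following framework-agnostic C++ sketch is illustrative only; in the actual simulator this logic is an OMNeT++ module reacting to the corresponding events, and all names below are assumptions.

```cpp
// Framework-agnostic sketch of the client logic from Figure 4.6.
class Client {
public:
    enum class State { Idle, WaitingForGrant, Transmitting };
    enum class Msg { Req, Rel };

    // (a)+(b): a trapped transInitMsg triggers a request to the RM.
    void onTransInit() {
        if (state_ == State::Idle) {
            sendToRm(Msg::Req);               // reqMsg
            state_ = State::WaitingForGrant;
        }
    }
    // (c)+(d): the ackMsg from the RM unblocks the sender.
    void onAck() {
        if (state_ == State::WaitingForGrant) {
            state_ = State::Transmitting;
            unblockSender();
        }
    }
    // (e)+(f): the last injected flit, or the timeout preventing unbounded
    // connection times, releases the resource.
    void onLastFlitOrTimeout() {
        if (state_ == State::Transmitting) {
            sendToRm(Msg::Rel);               // relMsg
            state_ = State::Idle;
        }
    }
private:
    void sendToRm(Msg) { /* inject a control message into the NoC */ }
    void unblockSender() { /* let the traffic generator proceed */ }
    State state_ = State::Idle;
};
```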



Figure 4.6: Client transmission process example (Omnet++ event log, linear time scale).

Figure 4.5 depicts the design of the processing node with the RM logic. Due

to the modular design, the arbitration unit supports all schedulers described in Chapter 3, among others round-robin, static and dynamic priority-based arbitration, detouring of transmissions (multipath scheduling), and finally adaptive rate control. The working of the RM unit will be described in the scope of the conducted operations, see Figure 4.7. Upon its arrival, each control message from the clients is placed in a queue in the RM (a). If the queue is empty, the RM must wait for a new request to arrive. Otherwise, the scheduler decides which request from the queue should be served first (b). Each RM is sequential and serves one request at a time. The selected request is removed from the queue, and starting at this moment the resource is considered to be occupied. After that, the RM must notify the sender selected for service with an acknowledge message (b). Once granted, the connection holds until the end of the transmission or an abortion by the client. From the moment of arrival of the release message, the RM considers the resource to be free again (c).


Figure 4.7: RM unit working example (Omnet++ event log, linear time scale).
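The sequential operation of the RM can likewise be captured in a few lines. The sketch below is a hypothetical illustration of the queue-grant-release cycle from Figure 4.7, here with a priority-ordered queue; the real module additionally supports the other schedulers listed above, and all names are illustrative.

```cpp
// Hypothetical sketch of the sequential RM cycle: requests are queued on
// arrival (a), granted one at a time (b), and the resource is freed again
// by the release message (c).
#include <queue>
#include <vector>

struct Request { int senderId; int priority; };

struct ByPriority {
    bool operator()(const Request& a, const Request& b) const {
        return a.priority > b.priority;   // lower number = higher priority
    }
};

class ResourceManager {
public:
    void onReqMsg(const Request& r) { queue_.push(r); tryGrant(); }
    void onRelMsg() { busy_ = false; tryGrant(); }  // resource free again
private:
    void tryGrant() {
        if (busy_ || queue_.empty()) return;  // wait for a request or release
        const Request r = queue_.top();       // scheduler selects the request
        queue_.pop();
        busy_ = true;                         // resource considered occupied
        sendAck(r.senderId);                  // ackMsg to the selected sender
    }
    void sendAck(int /*senderId*/) { /* inject the control message */ }
    std::priority_queue<Request, std::vector<Request>, ByPriority> queue_;
    bool busy_ = false;
};
```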

Finally, the simulator is capable of modeling direct communication between applications (core-to-core) as well as accesses to memory modules. Figure 4.5 depicts a peripheral node with an interface to the DDR SDRAM. The simulator can model both write and read accesses. In the latter case, the activation of the next access to the NoC may be scheduled only after receiving the answer, i.e., the self-throttling effect [37]. This is especially useful for modeling traces of the on-core execution, where the response time of an application depends directly on the interference during accesses to the globally shared resources. The modeling of the response time of peripherals can be done in two ways: i) by using a static worst-case delay, or ii) by introducing a peripheral module to model the actual behavior, e.g. a state-dependent response time.

4.1.3 Tool for Protocol Configuration

The initial design phase, done with the analysis and transaction-level simulation, allows the designer to evaluate and verify the protocol architecture and the selected method for resource arbitration. This assessment is done on a higher abstraction level (e.g. transactions and contracting) for the purpose of an early exploration of a variety of inputs and possible strategies for resource assignments. As a result, a set of constraints for the low-level design follows. Examples of these characteristics are: the number of synchronization scenarios, the maximal and minimal frequency and number of initiated transmissions, the states of the RM, the size of messages, as well as the upper limits (i.e. deadlines) for transmission latencies which do not violate the safety of the system. Consequently, a tool can be provided which makes this process (semi-)automatic and generates the code of the clients and the RM depending on the selected and evaluated resource allocation scheme.

Such a tool for the (semi-)automatic generation of the IPs required for the deployment of the control layer could in turn extend the capabilities of other EDA toolchains, e.g., tools from Synopsys and Xilinx. The automated design flow, including the design evaluation and enhancements, would introduce optimized real-time solutions without the need for specialized know-how from the designer. It is critical for enabling "safety as a feature" in future NoC designs and for simplifying the process of the system's verification.

States of the System

Note that the concrete implementation of the protocol strongly depends on the selected arbitration method and the capabilities of the underlying NoC architecture. In this section, a generic, exemplary deployment supporting a combination of the mechanisms from Chapter 3 is used. The selection of features for the control layer was done to expose the reader to a multiplicity of possible options and at the same time to avoid unnecessary complexity which could make understanding difficult. However, the presented solution is general enough to be straightforwardly extended for the purposes of a particular implementation. Consequently, the protocol considered in this section supports the following features: priorities, rate control and multiple paths per sender (adaptive load distribution).

Different sets of running senders and the resulting settings for the rate controllers define independent states of the system, as shown in Figure 4.8. With the profile and requirements of each application, the analysis can be used to obtain the sets of applications that are allowed to use the network simultaneously and to identify the necessary settings for the rate controllers. Later, these senders must be synchronized with the control layer.

[Figure 4.8 contains the settings table and the state transition graph of the RM. The table lists, per state of the active HRT senders (C1C2C3), the bandwidth shares in percent:

Active HRT (C1C2C3) | C1 | C2 | C3 | C4 | C5
100                 | 50 | 0  | 0  | 0  | 0
110                 | 50 | 50 | 0  | 0  | 0
011                 | 0  | 50 | 30 | 50 | 70
010                 | 0  | 50 | 0  | 50 | 100

The transitions of the graph are annotated with the triggering control messages and the resulting settings, e.g. C2-reqMsg or C2-relMsg accompanied by a stopMsg and the new bandwidth shares.]


Figure 4.8: Control Logic of the RM including the tables with synchronization scenarios a) and the state transition graph used by the arbiter b), based on [84].

Figure 4.8 presents a tabular description of the settings (left) for the different states of the system. It also contains an exemplary state machine (right) which defines the main workings of the RM control unit. In this scenario there are three HRT senders (C1-C3) and two best-effort senders (C4-C5). The state of the system solely depends on the HRT applications; see the three-bit representation on the left. In the first state, C1 is running alone and requires 50% of the bandwidth. The second line shows the case where the senders C1 and C2 are active and get the same share of the network (50% each), i.e., symmetric guarantees. In the fourth line, two senders (C2 and C3) are trying to use the NoC with different requirements on the bandwidth (50% and 30%), i.e., asymmetric guarantees. Moreover, C2 is sharing a link with the BE sender C4, which may use the remaining 50% of the bandwidth. Similarly, the BE sender C5 may only use 70% of the bandwidth, as it shares a link with C3. In the last case, only one HRT sender, C2, is running. As before, C4 is allowed to use 50% of the bandwidth. Additionally, C5 receives the full bandwidth, while C3 is not active and it does not interfere directly with C2. During run-time, the RM uses the current self-image of the system, i.e., the information about running senders and occupied resources, to dynamically adapt the NoC to the changing workloads. The arrival of a request or release message triggers a change of the system state. Note that the transitions between the states from the table are triggered only by critical senders and not by BE senders. For instance, if C1 is active (state 100) and a reqMsg from C2 arrives, a transition to the state 110 will follow. Moreover, if there are multiple outstanding requests or releases, they can be handled at the same time (e.g. if a sender has to wait for another sender to finish). For example, if the NoC is in state

110 and there are two messages (a relMsg from C1 and a reqMsg from C3), the RM can directly conduct the transition to state 011.

Specification of the Protocol

Table 4.2 presents exemplary content of the Look-Up Table (LUT) used by the RM. The synchronization scenarios (states of the system) are defined by the sets of applications which are allowed to use the NoC concurrently. The scenarioID must be unique and is defined by a binary vector where each bit identifies a sender (1 - active, 0 - deactivated), together with a priority (a lower number identifies a higher priority). The change between states is triggered by the arrival of request or release messages from an HRT sender. States with the highest priority are selected first. If there are multiple states with the same priority, they are treated equally, i.e. round-robin arbitration is used. Exemplary settings for the senders are presented in Table 4.3. For each state and sender, rate control settings are used: for instance, the maximum number of packets allowed within a time period T, i.e. the max counter (cnt_max), and the path which should be used by the arbiter. Finally, Figure 4.9 presents an exemplary format of the control messages. The protocol is realized with three messages: relMsg, reqMsg, cfgMsg. The type of each message is specified by its unique ID (MsgType), followed by the sender ID (note that on the same core there might be multiple synchronized senders). Finally, the cfgMsg contains a payload allowing the client to identify the settings for a given scenario. Here, two implementations are possible. The first one requires a longer message containing the full configuration (e.g. the route and the maximal value for the counter). The second implementation introduces shorter messages, i.e., only the scenario ID is transmitted, but it requires the client to store the table with the settings locally.

[reqMsg], [relMsg]:   MsgType | SenderID

[cfgMsg]:             MsgType | SenderID | Cnt_max | Route

[cfgMsg_alternative]: MsgType | SenderID | ScenarioID

Figure 4.9: Content of the control messages used for the synchronization protocol.
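One possible encoding of these messages is shown below; the field widths are illustrative assumptions and not taken from the thesis.

```cpp
// Possible packed encoding of the control messages from Figure 4.9
// (hypothetical field widths).
#include <cstdint>

enum class MsgType : uint8_t { Req, Rel, Cfg };

struct ReqRelMsg {                 // [reqMsg], [relMsg]
    MsgType  type;
    uint16_t senderId;
};

struct CfgMsg {                    // [cfgMsg]: full configuration payload
    MsgType  type;
    uint16_t senderId;
    uint16_t cntMax;               // rate-limiter budget per period T
    uint8_t  route;                // identifier of the path to be used
};

struct CfgMsgAlt {                 // [cfgMsg_alternative]: LUT index only
    MsgType  type;
    uint16_t senderId;
    uint32_t scenarioId;           // client resolves settings from local table
};
```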

The number of scenarios and the number of synchronized senders are established with the earlier introduced tools for analysis and simulation. The main trade-off lies between optimal performance (i.e. a high number of states) and protocol overhead (i.e. a high number of mode changes, propagation latency). Note that some senders may be excluded from the arbitration (e.g. run with constant settings in all modes) in order to decrease the number of state transitions in the NoC.

ScenarioID (Active HRTs) | Priority
000…001                  | 1
000…011                  | 2
000…010                  | 3
…                        | …

Table 4.2: Content of a LUT defining system modes (depending on active senders) and their priorities.

ScenarioID | SenderID | Cnt_Max | Path
000…001    | 000…001  | 100     | route_2
000…011    | 000…001  | 40      | route_1
000…011    | 000…010  | 23      | route_1
…          | …        | …       | …

Table 4.3: Settings for different scenarios considering the maximal arrival rates and the paths for the transmissions.

The worst-case analysis is employed to derive the required settings for the rate controllers on each processing node, i.e., the system must assign each HRT sender a sufficient bandwidth share to reach its deadlines and requested throughput. It also provides the upper bounds on the spatial and temporal overhead, e.g. the maximal processing time for the control layer modules (clients and RM), the size of the queues, the amount of data necessary to identify scenarios, etc.
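The interplay of the two tables can be illustrated with a small sketch: the bits of the active HRT senders form the scenarioID, which selects the settings to be distributed via cfgMsg. All names and field types below are hypothetical.

```cpp
// Hypothetical sketch of the scenario LUT from Tables 4.2 and 4.3.
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct SenderSetting { uint16_t senderId; uint16_t cntMax; std::string path; };

struct Scenario {
    int priority;                         // lower number = higher priority
    std::vector<SenderSetting> settings;  // rate limits and routes (Table 4.3)
};

class ScenarioLut {
public:
    void add(uint32_t scenarioId, Scenario s) { lut_[scenarioId] = std::move(s); }

    // A reqMsg (req = true) or relMsg (req = false) toggles the bit of the
    // corresponding HRT sender and selects the new system state (Table 4.2).
    const Scenario* onControlMsg(uint32_t& state, unsigned senderBit, bool req) {
        state = req ? (state | (1u << senderBit)) : (state & ~(1u << senderBit));
        const auto it = lut_.find(state);
        return it == lut_.end() ? nullptr : &it->second;
    }
private:
    std::map<uint32_t, Scenario> lut_;
};
```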

4.2 Implementation

In the previous chapters, a collection of synchronization protocols for the control layer has been proposed. This section explains how to translate the requirements and constraints of the design into a working setup with the control layer. The method of implementation, as well as the resource overhead resulting from the integration of the components belonging to the QoS control plane, must be considered in the context of a concrete MPSoC architecture and its specific deployment. For instance, the clients and RM modules can be realized in the form of a hardware/software co-design re-using existing chip components (e.g. a stand-alone software library for running the RM, software extensions of the existing OS/RTEs for adjusting the parameters of the NIs, i.e. client functionalities) as well as solely in hardware (e.g. stand-alone

modules, extensions of NIs, microcoded processors). The former offers full flexibility, the latter maximum performance. Independently of the type of implementation, clients must meet the following functional requirements (a minimal interface covering them is sketched after the list):

trapping/intercepting and distinguishing between different transmissions;

generating request messages;

processing acknowledge messages;

detecting configuration messages cfgMsg and changing the QoS settings in NIs;

detecting the end of transmissions and releasing resources.
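A minimal client interface covering these five requirements could look as follows; this is a sketch only, since, as discussed below, the concrete realization may be split between NI hardware and software, and all names are assumptions.

```cpp
// One possible client interface mirroring the five requirements above.
class ClientInterface {
public:
    virtual ~ClientInterface() = default;
    // trap/intercept an access and classify the transmission it starts
    virtual int  trapTransmission() = 0;
    // generate a reqMsg for the classified transmission
    virtual void sendRequest(int transmissionId) = 0;
    // process the ackMsg granting access to the NoC
    virtual void onAck(int transmissionId) = 0;
    // detect a cfgMsg and reprogram the QoS settings (e.g. rate limiter) in the NI
    virtual void onConfig(int scenarioId) = 0;
    // detect the end of the transmission and issue the relMsg
    virtual void onTransmissionEnd(int transmissionId) = 0;
};
```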

The actions of clients can be completely transparent to the running applications, which assures compatibility with legacy software and most existing applications. The performance of the control layer and the NoC is strongly influenced by the granularity of the synchronization, i.e. how much data is necessary for a client to identify the initiated transmissions. Clients may conduct a synchronization depending simultaneously on multiple factors, for instance including:

each initiated transmission from the processing node,

transmissions using a particular set of resources (virtual channel and/or path),

transmissions initiated by a particular task/application or module within the processing node, e.g. a DMA engine or a specific task running within an OS,

transmissions targeting a particular receiver,

based on monitors (and/or the data stored by them), e.g. a continuous synchronization depending on the frequency of the initiated transmissions.

The main trade-off lies between the amount of resources (processing time, stored data, size of the client) and the precision of the synchronization, i.e. a better (fine-granular) service costs more. Similarly, the RM control unit must fulfill the following functional requirements:

receiving and qualifying control messages (type and sender);

distinguishing between different synchronization scenarios;

switching between scenarios according to the predefined set of rules (scheduling method);

assuring a safe transition between system states (preventing sporadic overloads);

and generating control messages.

In the future, the RM may be enhanced with advanced mechanisms based on monitoring, for a fully flexible deployment. However, the precision of the arbitration usually comes paired with implementation overheads in terms of increased resource cost and synchronization latency, as in the case of the clients. Moreover, a higher complexity of the RM usually has a negative impact on the formal analysis, see the discussion in Sec. 4.1.1.

The resource overhead depends directly on the complexity of the introduced arbitration (e.g. dynamic switching between off-line defined scenarios or full on-line resource allocation), the verification methods, and the availability of existing infrastructure in a particular configuration of a chip. However, in many NoCs this cost can be amortized by the re-usability of existing NoC components, e.g. the monitoring infrastructure in the NIs. For instance, as the complexity of the central RM unit is higher than that of a client, it can be implemented in the form of a stand-alone processing node. In order to do this, the designer may use the supervisor "high-performance" nodes available in many MPSoCs, e.g. IDAMC [140], MPPA [40], FlexNoC [11]. Similarly, in many NoC architectures, the NIs are already equipped with interfaces for the configuration of admission control settings, e.g. rate limiters. This configuration is typically done at platform startup by supervisor nodes. Therefore, it is typically possible to directly re-use, adjust or enhance the existing mechanisms.

The rest of this sub-section discusses the challenges in detail. Firstly, Section 4.2.1 introduces the possible communication channels for transmitting control messages between the RM and the clients. Next, Section 4.2.2 discusses a possible software implementation of the RM and the clients (through maximal re-use of the existing platform elements), whereas Section 4.2.3 concentrates on the hardware architecture.

4.2.1 Transmission of Control Messages

As discussed, the safety and latency of control messages have a critical influence on the performance of the QoS control plane. In general, it is possible to transmit control messages using the following methods:

within the same NoC infrastructure as data transmissions (i.e. using data layer);

using an additional (physical) NoC layer;

using an additional bus, e.g. bus-enhanced NoC;

using a separated and directed set of signal lines between the RM and clients.

The most straightforward solution is to use the same infrastructure for transmitting control messages as for data packets, e.g. any available VC capable of providing latency guarantees. This deployment is presented in Figure 4.10 a). In such a setup, there is no need for a modification of the underlying NoC. Of course, this approach, although resource efficient, may negatively impact the temporal guarantees for the latency of both control and data transmissions, as they are allowed to freely interfere. For bounding the transmission time of the control traffic, routers with priority-based schedulers can be used, see [83], [31], [66]. Such routers have been described in Chapter 2. For instance, the control layer traffic could be mapped to the VC with the highest priority, i.e. it cannot be blocked by any other ongoing data transmission. Additionally, to reduce the blocking exerted by control messages on the data layer, their payload should preferably be small, e.g. cfgMsgs containing solely the information for the identification of transmissions (e.g. an identification number) and the type of the message. Therefore, this solution should be introduced after an examination of the trade-offs between the synchronization protocol and the other traffic in the NoC. This can be done using the previously described analysis and simulation tools. A higher number of issued control messages (for instance due to the size, granularity and number of synchronization scenarios) inevitably increases the introduced overhead with respect to both time and resources.

To increase performance, one may apply a dedicated independent control NoC, cf. Figure 4.10 b). In such setups, control messages are transmitted using an independent layer of routers (as well as links or even additional NIs) so that control messages are not blocked by any other traffic. Therefore, the control layer is physically separated from the data layer. Of course, these implementations introduce a higher hardware overhead. However, the cost of such a solution may not be as high as it may intuitively seem. Firstly, the routers for the control layer may be much simpler than the routers in the data plane, e.g. no support for multiple buffer queues in router ports (VCs), a simple single-stage scheduler, or even a different operational frequency. This can drastically decrease the size of the routers, as proved by related studies, e.g. [73]. For a more detailed discussion please refer to Chapter 2. Indeed, some of the existing MPSoCs already support multi-layer NoCs, e.g. Tilera Tile64 or Kalray MPPA. In


a) Control Layer sharing routers with Data Layer


b) Control Layer deployed with a dedicated NoC


c) Control Layer deployed using a bus-based interconnect

d) Control Layer deployed using direct signal lines

Figure 4.10: Different MPSoC setups for the transmission of control messages.

the Tile64 architecture, independent layers are used for the spatial separation of the traffic streams, i.e. Quality-of-Service support. MPPA applies a heterogeneous architecture where simpler routers in the C-NoC (e.g. shorter buffer queues in the ports) are used for the configuration and monitoring of the processing nodes, whereas more complex routers (e.g. larger buffer queues) in the D-NoC are used for DMA-based data transmissions. Moreover, many MPSoC architectures introduce an additional interconnect layer also to assure an error-free execution and fast recovery/fail-over, i.e. spatial redundancy. Summarizing, in many setups the control traffic can be straightforwardly accommodated and spatially separated from the data plane while using the existing interconnect infrastructure.

Alternatively, the spatial separation of the control messages can also follow using a bus-based interconnection layer, as proposed in [97]. In such enhanced designs, Figure 4.10 c), a bus layer with a central arbiter is used to accommodate low-latency and low-bandwidth transmissions (e.g. short control messages), whereas the NoC layer is used for high-throughput point-to-point connections (data plane). Such multi-layer interconnects allow decreasing the hardware overhead, when compared to multi-layer NoC architectures, while circumventing some of their main weaknesses, e.g. the latency of control signals and the complexity and cost of broadcast operations. Consequently, the bus is used for a selected subset of specialized system operations, which decreases the amount of transmitted data (thus permitting to avoid scalability issues) while benefiting from the proximity of neighboring nodes.

Indeed, these properties, combined with the moderate number of processing nodes in many MPSoC architectures known from research (e.g. from 4 to 32 processing nodes in IDAMC assuming a Virtex-6 FPGA deployment) and commercial deployments (e.g. 16 processing nodes in MPPA, 64 processing nodes in Tile64), could even motivate connections with direct signal lines between the clients and the RM arbitration units, see Figure 4.10 d). Applying a dedicated set of direct signal lines offers maximum performance but at the same time suffers from the well-known limitations of centralized and direct communication, i.e. it does not scale well with the size of the chip and the number of synchronized senders.

Note that independently of the form of deployment, the frequency, number and latency of the necessary synchronizations and re-configurations strongly depend on the use-case. In some cases, to mitigate the unwanted side effects, the designer may conduct a trade-off analysis during the design phase, providing an estimation of the overhead resulting from the global arbitration. For instance, it is possible to keep the settings for some applications at a constant level (without endangering their deadlines) in some or all modes, i.e. to decrease the number of modes of system work. This allows limiting the number of necessary synchronizations and re-configurations, reducing the probability of the global arbitration being the bottleneck of the design.

Moreover, it decreases the amount of resources necessary to keep the system settings in the RM and the clients. The predominant trade-off is between fine-granular adjustments (for increasing performance) and both temporal and hardware overheads.

4.2.2 HW/SW Co-design for Clients & RM

The resource overhead resulting from the integration of the QoS control plane must be considered in the context of a concrete MPSoC with the goal of minimizing the implementation costs. Note that hybrid solutions (hardware/software co-design) re-using existing chip components are frequently used in MPSoCs and constitute an appealing alternative to homogeneous implementations (e.g. clients and RM realized fully in hardware).

As the complexity of a central RM unit is higher than that of a client, it can be implemented as a software component running on a stand-alone processing node, for instance a small ARM core. This allows maintaining high flexibility (deployments with different scenarios and arbitration methods) as well as decreasing the development costs. As previously discussed, the designer may use the supervisor nodes available in many MPSoCs, e.g. IDAMC [140], MPPA [40], to follow these design patterns.

Similarly, mechanisms for rate control are applied in many MPSoCs and are frequently considered their standard feature⁵. For example, the commercially available NoCs such as Tile64 [151], MPPA [40] or FlexNoC [11] offer rate limiters (along with the necessary monitors and interfaces) realized in hardware as integral modules within the NIs. Therefore, the software implementation of clients can follow as a straightforward extension of this mechanism. The necessary client logic (e.g. the generation of control messages, the identification of the sender) is implemented in the form of a software module running on the processing node. This new application later adjusts the settings of the QoS mechanisms deployed in hardware, e.g. the rate limiters or MMUs in the NI from the baseline architecture described in Chapter 3.

Finally, the internal working of the RM as well as of the client modules can be divided between hardware and software modules. For instance, in the case of the RM, the control and arbitration units can be deployed in software, which assures high flexibility, whereas the LUTs with the settings for the synchronization scenarios can be stored in hardware, which significantly decreases the time necessary to identify senders and to process the control messages. The same principles apply to clients.

⁵ Recall that admission control with rate limiters in the NI is necessary in both performance-optimized as well as real-time NoCs. Otherwise, the results of backpressure and the propagation of blocking can lead to NoC saturation and endanger its performance or real-time requirements, cf. Chapter 1.

As discussed, in the case of a client, the detection of the access and the synchronization can be done by an extension of the operating system and/or hypervisor, whereas the rate control and low-level monitoring are done in hardware. This allows simultaneously decreasing the resource overhead (a small extension of the NI interfaces) and simplifying the synchronization, which can then happen on the higher abstraction and virtualization layers.

The discussion of the software deployment of the control layer begins with the functional description of clients and the RM. Later, it accounts for the placement of the control layer code within the infrastructure of the software stack running on the MPSoC.

Figure 4.11: Workflow of the software modules constructing the control layer incl. RM unit, client for HRT senders and client for BE senders.

Workflow of the software modules.

The presented discussion considers a specific, but realistic, scenario where the control layer is applied in a NoC-based manycore. The workload in this platform comprises traffic with hard real-time (HRT) requirements deployed along with BE applications. The synchronization protocol concentrates on a subset of the most important features of the introduced mechanism, assuming a deployment which follows the principles of HW/SW co-design. Consequently, the RM is deployed fully in software on a dedicated core, whereas the clients' libraries run on the processing nodes and are capable of controlling the QoS modules within the NIs. The selection of features for the control layer was made to expose the reader to the multiplicity of possible options and, at the same time, to avoid unnecessary complexity which could make the understanding difficult. However, the presented solution is general enough to be straightforwardly extended for the purposes of a particular implementation. Implementations of other resource arbitration protocols, cf. Chapter 3, can be derived as direct extensions of the presented scheme. The actions of the software modules are explained in the scope of the operations conducted by the clients and the RM, using an exemplary protocol with the workflow presented in Figure 4.11; a simplified code sketch of the RM's event loop follows the step list below.

Step 1: client_trap(): Each transmission from an HRT sender must be trapped (intercepted) before the application may access the NoC.

Step 2: monitor(): Alternatively to the explicit trap() function, an on-core (processing node) monitor may signal a change of the application state (e.g. a faster arrival rate of NoC accesses). This change must first be acknowledged by the RM, as it may affect other senders sharing the NoC. Consequently, after the signal from the monitor, the client traps the next arriving access and proceeds as in the case of a newly established connection.

Step 3: client_req(): Next, the client issues a request message (reqMsg) to the RM, requesting access to the interconnect resources.

Step 4: rm_process(): Each RM is sequential and serves one request at a time. If there are no pending requests, the RM must wait for a new request to arrive. For each pending request, processing is done in the scope of three actions: evaluation, reconfiguration and assignment. Firstly, the available NoC resources (link bandwidth and buffer space) must be evaluated with respect to the requirements of the transmission. If there are enough resources to accommodate the new data stream, a new assignment is done. The goal is to provide an optimal allocation of interconnect resources, i.e. to minimize possible interference and dependencies between senders.

If there are not enough interconnect resources, the RM must assess the possibility of a re-configuration. Both assignment and re-configuration are done by selecting an available configuration from the predefined⁶ set of scenarios, as discussed in the previous sections. Therefore, the RM's internal working resembles a finite state machine, where a state change is triggered by the arrival of a control message. Finally, if a request cannot be granted, for instance due to another ongoing communication with a higher priority, it is stored in a request queue. When the resource is free again and the request queue is not empty, the RM selects the request with the highest priority and removes it from the queue.

Step 5: rm_reconf(): If necessary, the RM must propagate the new settings to the other senders running in the system, i.e. conduct a re-configuration of the NoC. In the considered setup, BE senders are affected. Consequently, the RM sends a cfgMsg to each interfering sender, defining the new settings for the rate controllers.

Step 6: be_process(): To assure QoS, BE senders must also be protected by clients. Consequently, the client must enforce the rate control on the node using the settings provided by the RM.

Step 7: be_block(): If there is no alternative resource (e.g. path) available for the BE sender, it must be blocked.

Step 8: be_reconf(): Alternatively, if there are available interconnect resources, the client protecting the BE sender may reconfigure the connection to use a different setting, e.g. an alternative path through the network.

Step 9: be_inf(): Additionally, in more advanced deployments of the synchronization protocol, clients may initiate a nested re-configuration, informing other BE clients about the changes. Both be_reconf() and be_inf() should result in selecting optimal settings for the BE sender without endangering the safety of the system.

Step 10: rm_delay(): Each system mode defines the set of running senders (BEs and HRTs) as well as the maximum rate of initiated transmissions, i.e. the maximum possible interference that a transmission can suffer from other active senders.

⁶Note that self-aware online re-configuration is theoretically possible, but constitutes an orthogonal problem and is therefore left outside the scope of this work.

Consequently, if the transition between different scenarios happens (e.g. the activation of a new sender) without supervision from the RM, the packets from transmissions initiated in different system modes may interfere. This can cause sporadic overloads and transient load effects, which may endanger the system's safety. Therefore, the safe switching between different system modes, i.e. safe mode changes, is of fundamental importance for the real-time properties of the system and must be enforced by the proposed control layer. Consequently, after initiating the reconfiguration (rm_reconf() function), the RM must delay the acknowledgment until the packets initiated in the different system mode have left the NoC.

Step 11: rm_ack(): The RM notifies the appropriate client with the ackMsg which unblocks the granted transmission. From this moment, the resource is considered to be busy once again.

Step 12: client_start(): After receiving the ackMsg, the transmission may start. Once granted, the connection holds until the end of the sender's transmission or until the client aborts it based on a predefined timeout, preventing unbounded connection times.

Step 13: client_rel(): When the client detects the end of a transmission (e.g. based on its time-budget/timeout or injection of the last flit), it issues a relMsg to the appropriate RM.

Step 14: rm_rel(): As soon as the relMsg arrives, the RM considers the resource to be free again. If there is a pending preempted transmission with a lower priority, the RM resumes its execution by sending a resMsg to the appropriate client. Otherwise, a new request is selected from the queue.
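To summarize Steps 4, 5, 10, 11 and 14 in one place, the following minimal sketch outlines how the RM's sequential processing could look in software. All helper names (recv_ctrl_msg(), enough_resources(), reconfigure(), ...) are hypothetical placeholders for platform-specific primitives, not an actual implementation from this work:

    #include <stdbool.h>

    typedef enum { REQ_MSG, REL_MSG } msg_type_t;

    typedef struct {
        msg_type_t type;
        int        sender_id;
        int        priority;
    } ctrl_msg_t;

    /* Platform-specific primitives, assumed to exist elsewhere. */
    extern ctrl_msg_t recv_ctrl_msg(void);        /* blocking receive (Step 4)  */
    extern bool enough_resources(int sender_id);  /* evaluation (Step 4)        */
    extern bool reconfigure(int sender_id);       /* rm_reconf() (Step 5)       */
    extern void wait_old_mode_drained(void);      /* rm_delay() (Step 10)       */
    extern void send_ack(int sender_id);          /* rm_ack() (Step 11)         */
    extern void enqueue_request(ctrl_msg_t m);
    extern bool dequeue_highest_prio(ctrl_msg_t *m);

    /* Try to grant one request: evaluate, optionally reconfigure, acknowledge. */
    static bool try_grant(ctrl_msg_t m)
    {
        if (enough_resources(m.sender_id)) {
            send_ack(m.sender_id);
            return true;
        }
        if (reconfigure(m.sender_id)) {   /* new scenario for interfering senders */
            wait_old_mode_drained();      /* delay ack until old-mode packets left */
            send_ack(m.sender_id);
            return true;
        }
        return false;
    }

    /* Sequential RM: one control message at a time (finite-state-machine style). */
    void rm_main_loop(void)
    {
        for (;;) {
            ctrl_msg_t m = recv_ctrl_msg();
            if (m.type == REQ_MSG) {
                if (!try_grant(m))
                    enqueue_request(m);   /* store until the resource is free    */
            } else {                      /* REL_MSG: resource free again (Step 14) */
                ctrl_msg_t next;
                if (dequeue_highest_prio(&next) && !try_grant(next))
                    enqueue_request(next);/* could not be served yet, keep it    */
            }
        }
    }

The loop mirrors the finite-state-machine character of the RM described in Step 4: every state change is triggered by the arrival of a single control message.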

The discussed steps present the set of actions which the software modules must implement in order to introduce the proposed control layer. However, these modules may have different placements within the run-time environment and/or software stack, influencing both the design overhead and their capabilities. The incorporation of the main RM unit realized in software is relatively straightforward, as it constitutes a self-contained and independent functionality. For this purpose, the designer may use one of the processing nodes or the dedicated management cores frequently available in MPSoCs for maintenance, e.g. boot-up, fail-over or monitoring.

4.2.3 Hardware NoC-Extensions

This section provides an overview of the hardware design required for the implementation of the modules belonging to the proposed control layer. A control layer deployed fully in hardware inherently follows a low-level approach. The most intuitive design choice is to implement the client as an extension of the network interface connecting a processing node with the NoC. Next, the RM can be introduced as an independent and dedicated node to minimize the temporal overhead. Note that both units can be implemented as independent hardware modules controlling accesses before they arrive at the network interfaces (NIs). Moreover, due to the low-level approach, the actions of the supervisors and the protocol layer can be completely transparent to the running applications and thus compatible with legacy software, i.e. there is no need for costly code adjustments. A motivational example of such a deployment is presented in Figure 4.12, where an NI unit is enhanced with a dedicated "state" module for the realization of the client functionality. This module contains information about the active senders running on the processing node and their runtime properties, i.e. active connections, deadlines and rate-limiter settings. Moreover, the client must contain the control logic connecting it to a standard packetization/depacketization interface for issuing and receiving control messages. The actions of the sub-modules belonging to this functionality are explained in the scope of the steps which must be conducted in the process of transmission handling. Whenever a sender running on a node tries to access the NoC, the client will detect and trap (intercept) the access. Later, the client must identify this access, i.e. distinguish between different connections/transmissions. This is done by the state module, which is equipped with a lookup table (LUT) containing an entry for each transmission requiring synchronization and the state of the possible resources (e.g. paths), indicating whether they are currently available. As discussed before, the state depends on the currently active senders and shared links. The amount of information stored by the state module strongly depends on the version of the implemented protocol. In some scenarios, it may contain a simple register defining the on or off state of the NI. This functionality may be straightforwardly extended by providing the settings for the rate controller, i.e. the maximum rates of initiated transmissions depending on the system mode. The size of the LUT depends directly on the number of different connections and their parameters (connection settings and system modes) as well as on the number of senders requiring synchronization. As in the case of the software deployment, the trade-off lies between fine-granular adjustments for improving performance and the necessary resources. Note that, in the case of more complex synchronization protocols for the control layer (e.g. the adaptive load distribution from Section 3.6.4), each sender may use several paths and, on each path, different rate-controller settings.


Figure 4.12: Simplified structure of the client logic implemented in the form of an extension of the NI.

Similarly to the software implementation, the settings for the different modes may be stored in the client or in the RM. In the former case, the client's area is larger, but the protocol overhead (the size of the transmitted messages) is lower. In the latter case, the RM must send the complete settings (and not only the ID of the appropriate scenario) to the client each time. This additional payload inevitably increases the overhead for other running transmissions. Based on the information and settings from the LUTs, the control module decides whether the access can be granted using the available resources and whether an additional negotiation with other senders through the RM is necessary. If the negotiation is required, the control module generates the corresponding protocol messages and must ensure that the network access is delayed until the negotiation is finished, i.e. safe mode changes, cf. Chapter 3. The optional monitor module can be used to transparently detect workload changes of the applications deployed on the particular node (e.g. variations in the accumulated frequency and length of the initiated connections). To achieve this objective, the module can monitor the access behavior of the sender, e.g. compare the inter-arrival times of requests [29, 65]. An exemplary deployment of the client's LUT is presented in Figure 4.13. It follows in a system with a memory-mapped NoC, where the NI introduces a transparent address translation. In the considered setup, the local physical address (of the respective processing node) is converted into a remote address (on another processing core) and forwarded using the NoC. This releases the senders (which are executed on the processing nodes) from having knowledge about the operation of the interconnect (e.g. correct routing and addressing).

Figure 4.13: Synchronization data for the client stored in the address translation tables of an address-mapped NI.

The actual translation (including proper NoC addressing) is done by the configurable address translation module in the NI. This module is equipped with additional tables containing information about the route and destination (remote address) of a particular transmission. The implementation of the client is done through an extension of this mechanism. Firstly, a new column is added to assess whether synchronization with the RM is necessary. Next, information about the sender's ID must be added for the identification of the synchronization scenario (assuming multiple of them). Finally, the QoS settings are stored in the form of a counter for the rate limiter, defining the maximum number of packets injected by the particular sender/node per base time period. Note that the last field can be omitted at the cost of a higher protocol overhead, i.e. the control messages may then carry the settings, but they will be longer. The RM can be implemented in the form of a stand-alone processing node or built into one of the existing network modules. Figure 4.14 presents a block diagram of an exemplary RM deployment. It is possible to distinguish four major components: 1) qualifier, 2) message queues, 3) look-up tables containing the synchronization scenarios and finally 4) control module (i.e. resource arbiter). A fifth module for monitoring is optional and will be discussed at the end of the section.
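As an illustration of how compact such a LUT entry can be, the following sketch models one row of the extended translation table from Figure 4.13. The field names and widths are invented for illustration and depend on the concrete NI:

    /* Minimal sketch of one entry of the extended address translation table
     * (hypothetical layout; the actual widths are NI-specific). */
    #include <stdint.h>

    typedef struct {
        uint32_t local_addr;   /* local physical address (page base)          */
        uint32_t remote_addr;  /* translated address on the remote node       */
        uint16_t route;        /* source-routing descriptor for the path      */
        uint8_t  needs_sync;   /* 1 if the access must be granted by the RM   */
        uint8_t  sender_id;    /* identifies the synchronization scenario     */
        uint16_t rate_budget;  /* max. packets per base period (rate limiter);
                                  may be omitted if the settings are carried
                                  in the (then longer) control messages       */
    } xlat_entry_t;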


Figure 4.14: RM unit implemented in hardware as an independent module connected with the NI.

Firstly, the qualifier module must detect the arrival of a new ctrlMsg. This requires payload decoding for classification. Next, messages are stored in queues. The number of queues depends on the protocol implementation. For example, there may be a separate queue for each of the message types (i.e., one for requests and one for releases) or per sender and message type, implying the predominant trade-off between processing speed and the amount of necessary resources. Next, the RM must be equipped with the control module, which is responsible for granting, preempting and resuming transmissions as well as for releasing resources. The settings for the different modes of the system are stored in the LUT module. Finally, the decisions of the control module are issued in the form of ctrlMsgs and converted into NoC packets. Additionally, the RM may include a monitoring module which observes system events: the frequency of activations, the load, frequently used scenarios, failures. Later, this monitoring module may adjust the control settings appropriately. The scalability of the system can be increased through the implementation of multiple RMs protecting separate synchronization scenarios (regions of the NoC). Such a partitioning decreases the average transmission time of the control messages and distributes them spatially.

4.3 Summary

This chapter discussed the workflow and tools necessary for the design, formal verification and configuration of a NoC-based system using a control layer. Firstly, Section 4.1.1 has shown that the resulting design relies on the correctness of the models of the underlying NoC hardware and the applications' behaviors. Therefore, safe implementations demand inputs from the data layer to provide admissibility tests. In each system state, the low-level arbitration in the NoC (arbiters in routers) as well as the maximum load from the processing nodes must be predictable. Achieving these goals in most contemporary NoCs is fairly straightforward and requires: the identification of interfering senders on every path (according to the routing algorithm and mapping), and

a temporal synchronization of transmissions competing for shared resources. Additionally, these two actions are separated and designed independently in order to enforce a predictable behavior [110]. Consequently, the creation of analysis models usually does not require any special hardware NoC extensions. Indeed, the recent advances in the real-time analysis domain ([89], [142], [44]) prove that even standard NoCs, with complex two-staged arbitration (e.g. iSLIP [101]) and virtual channels (VCs), are efficiently analyzable, allowing to compute worst-case network latencies if the aforementioned traffic regulation and static routing are applied. If the underlying NoC architecture complies with the requirements resulting from the need for formal verification, the designer may start considering the verification of the requirements necessary for the implementation of the extensions for the control layer. The proposed tools for analysis and simulation have been introduced in Sections 4.1.1 and 4.1.2, respectively. Their main goal is to provide a set of requirements for the implementation of the control modules. These actions are discussed with respect to the considered elements of the control layer. Section 4.1.3 discussed the translation of these findings into the concrete settings and interfaces which are necessary for the implementation of the control layer. Recall that this configuration process can be automated, allowing extensions of the EDA tools. The design of the control layer (RM, client as well as the synchronization protocol) has two primary requirements: the required processing time and the amount of required hardware resources. The deciding factors are the complexity of the protocol and the size of the design. Both are interconnected, i.e. along with an increasing complexity of the main central unit usually comes an increase in the time necessary for the processing of a single request. The complexity is defined by three factors (the same for the RM and clients): (i) the number of synchronized senders, (ii) the frequency (granularity) of the conducted synchronizations and (iii) the type of arbitration.

The analysis of the protocol with respect to these factors using the introduced EDA tools should provide the designer with a set of baseline system constraints, e.g. the maximal length of the queues for incoming messages, the desired average and worst-case performance of the central unit, and the number of system states which must be handled by the manager (e.g. the amount of data which must be stored to define the settings for a particular sender in a particular state). The implementation of clients and the RM is discussed in Section 4.2. The resource overhead resulting from the integration of the QoS control plane must be considered in the context of a concrete MPSoC. For instance, client and RM modules can be realized in software (e.g. stand-alone libraries, extensions of the existing OS/RTEs) as well as in hardware (e.g. stand-alone modules, extensions of NIs). The former offers full flexibility, while the latter offers maximum performance. Note that hybrid solutions (hardware/software co-design) re-using existing chip components are also possible. Finally, there is the requirement of backwards compatibility, i.e. the design decision whether the synchronization with a client should happen transparently to the running applications or after their modification. The former may increase the complexity of the clients as well as enforce certain limitations with respect to the identification of initiated transmissions. The latter may improve the clients' capabilities and simplify their structure, but requires adjustments of the legacy code, which can be expensive or sometimes even impossible (e.g. due to licensing). The predominant requirement for the control messages is the worst-case and average latency of their transmission, i.e. the maximum duration of a transmission from the client to the resource manager. This constraint directly influences the temporal overhead resulting from the synchronization protocol, and thus the performance of the synchronized senders. An additional requirement is the amount of data which must be transmitted, as it has a severe impact not only on the transmission latency but also on other important properties of the NoC architecture, e.g. the dynamic power consumed by a transmission, the overhead imposed by the control traffic on other senders, and the complexity of message creation/processing. The introduced design flow offers benefits for both SoC manufacturers and OEMs integrating highly dynamic real-time applications. The former may extend the applicability of their NoCs to the real-time and/or safety-critical domains at low cost (e.g. through a software implementation re-using existing components), whereas the latter can use already available products, profiting from the low prices resulting from high-volume production, i.e. without the need for custom QoS-oriented ASIC designs.

Chapter 5: Evaluation

The experimental evaluation presented in this chapter explores the benefits as well as the design trade-offs resulting from the implementation of the control layer. The investigation concerns the obtained worst-case guarantees as well as the average performance in realistic setups and synthetically generated corner cases. The former is done with the analysis framework introduced in Section 4.1.1; the latter is performed using the simulator from Section 4.1.2. Additionally, two test deployments with existing many-core platforms have been considered: the commercially available MPPA [40] and the research-oriented IDAMC [140]. The rest of this chapter is structured as follows: Section 5.1 presents a detailed description of the experimental setup; Section 5.2 contains the analysis of the worst-case behavior in systems deploying the proposed solution, whereas Section 5.3 analyses the average performance; Section 5.4 evaluates the resource overhead introduced by the control layer; and finally, Section 5.5 presents the two case studies mentioned above, carried out with existing MPSoC architectures.

5.1 Experimental Setup

In this section, we describe the setup used in the experiments. The NoC architecture follows the baseline design from Chapter 3, which is realized with the tools from Chapter 4. The first two groups of experiments, from Sections 5.2.1 and 5.2.2, are designed to test the assumptions from Chapter 3 with a special focus on design trade-offs. These experiments consider the following sets of benchmarks: synthetic traffic generators, memory trace-based processor models and models of real applications (two different use-cases). A detailed description of the used workloads is presented below, followed by details regarding the configuration of the baseline NoC architecture from Section 3.1. Later, in consecutive experiment series, the arbitration of these NoCs has been adjusted to test different QoS mechanisms, e.g. TDM, static and dynamic priority-based arbitration performed locally in routers, or rate control along with round-robin scheduling in routers, etc.

Overview of the Benchmarks

The benchmarks can be divided into two classes of traffic sources: i) synthetic traffic generators and ii) memory trace-based processor models. The former initiate transmissions according to a predefined synthetic pattern. This approach is frequently used in (on-chip) network research as it allows covering corner cases, i.e. the worst-case behavior of the system. Consequently, the conducted experiments consider synchronization scenarios with x senders performing a burst of y transmissions per activation. For each sender it is possible to set the activation period P, a small jitter J, as well as the length of the transmission L, defining how many packets are issued per activation. In order to imitate irregular traffic patterns, e.g. sporadic bursts, either Bernoulli or Markov processes have been applied. Furthermore, the considered synthetic sources allow the simulation of streaming senders (e.g. video cameras, decoders etc.), which impose a high load while exhibiting predictable access patterns. Consequently, in the first part of the experimental evaluation two use-cases with streaming senders are used: a real-time video noise reduction application [95] and MPEG-4 [19]. Figure 5.1(a) presents the structure of the algorithm used for the real-time video noise reduction [95]. Its workings can be divided into three phases: motion estimation (T1), motion compensation (T2) and discrete wavelet transformation (T3), and further decomposed into communicating tasks as presented in Figure 5.1(b). The data rate r is relative to the video data rate. In the conducted experiments, the data rate r = 318 MB/s has been used, which translates into 40 fps at 1080p resolution (1920x1080 pixels) or 26 fps at 2K resolution (2048x1556 pixels). For the evaluation, each task from the noise reduction application is modeled with a periodic traffic generator. The periods are adjusted to conduct 8kB long DMA transfers to maximize the benefit from the DDR3-SDRAM [68]. A detailed evaluation of the memory effects will follow in the next sections. The MPEG-4 video decoder [19] has been modeled using a similar methodology. Figure 5.2 presents the structure of the implemented algorithm with bandwidth requirements given in MB/s (HD video resolution: 1920x1080 @ 30 Hz [48]). For evaluation purposes, three modules with especially high communication requirements have been identified: DRAM (the target of seven senders), SRAM2 (the target of four senders) and SRAM1 (the target of two senders). The transmissions to each memory module constitute an independent synchronization scenario which is mapped to an independent VC and protected by an RM. Similarly, each module of the MPEG-4 decoder is modeled with a traffic generator periodically conducting 8kB long DMA transfers to increase the performance of the DDR3-SDRAM 2133N module, cf. [68].


Figure 5.1: Use-case algorithm for film grain noise reduction [95] (a) and its average communication demands (b).
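To make the generator model tangible, the following minimal sketch releases one jittered, periodic burst of L packets. The helpers inject_packet() and uniform() are hypothetical; the actual generators are OMNeT++ modules of the framework from Section 4.1.2:

    /* Minimal sketch of a periodic traffic generator with release jitter. */
    typedef struct {
        double period_P;     /* activation period                        */
        double jitter_J;     /* maximum release jitter (e.g. 0.1 * P)    */
        int    burst_len_L;  /* packets issued per activation            */
    } tgen_cfg_t;

    extern void   inject_packet(int sender_id, double release_time);
    extern double uniform(double lo, double hi);

    /* k-th activation: release a burst of L packets at k*P plus a bounded
     * random jitter; the NI's rate limiter paces the actual injection.   */
    void tgen_activate(int sender_id, const tgen_cfg_t *cfg, int k)
    {
        double release = k * cfg->period_P + uniform(0.0, cfg->jitter_J);
        for (int i = 0; i < cfg->burst_len_L; i++)
            inject_packet(sender_id, release);
    }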


Figure 5.2: MPEG-4 average communication demands specified in MB/s, cf. [19].

Finally, some of the conducted experiments used traces of memory accesses from real applications belonging to the CHStone benchmark suite [51].

An overview of the used applications is presented in Table 5.1. The traces of the benchmarks are extracted using the Gem5 simulation platform [21] with an ARMv7-A core (1 GHz frequency and a single cache level with 64kB and a block size of 64 bytes). The compilation was done with the standard GCC compiler (v4.7.3), i.e., it has not been optimized with respect to the QoS mechanisms. Furthermore, the benchmarks can be divided into computation- and memory-intensive groups, based on the characteristics of the memory traffic, the frequency of accesses and the duration of the on-core execution. The computation-intensive applications conduct between 1500 and 4000 transmissions every millisecond (e.g. mips, motion, gsm), whereas the memory-intensive applications conduct between 10000 and 16000 transmissions every millisecond (e.g. adpcm, dfdiv, aes, dfmul).

Table 5.1: List of benchmark programs belonging to the CHStone suite [51].

Program  | Description                                                                  | Source
DFADD    | Double-precision floating-point addition                                     | SoftFloat [60]
DFMUL    | Double-precision floating-point multiplication                               | SoftFloat [60]
DFDIV    | Double-precision floating-point division                                     | SoftFloat [60]
DFSIN    | Sine function for double-precision floating-point numbers                    | CHStone group, SoftFloat [60]
MIPS     | Simplified MIPS processor                                                    | CHStone group
ADPCM    | Adaptive differential pulse code modulation decoder and encoder              | SNU [87]
GSM      | Linear predictive coding analysis of global system for mobile communications | MediaBench [90]
JPEG     | JPEG image decompression                                                     | The Portable Video Research Group [64], CHStone group
MOTION   | Motion vector decoding of the MPEG-2                                         | MediaBench [90]
AES      | Advanced encryption standard                                                 | AILab [9]
BLOWFISH | Data encryption standard                                                     | MiBench [58]
SHA      | Secure hash algorithm                                                        | MiBench [58]

Configuration of the Baseline NoC

The used NoC architecture complies with the baseline architecture described in Chapter 3, if not stated differently. It supports a 2D mesh topology (cf. Figure 5.3), wormhole switching, source routing (XY as a baseline), two-stage priority-based arbitration, a 4-cycle router pipeline, a 1-cycle link traversal, virtual channels (VCs), non-blocking routers, a buffer depth of 5 flits, a data packet size of 4 flits, and an interconnect frequency of 500 MHz. The arbitration between streams with the same priority is solved using a round-robin scheduler. Control messages (belonging to the synchronization protocol) are propagated using the VC with the highest priority. Additionally, in the considered architecture, nodes can be heterogeneous and constructed of processing cores (T1 to T12) as well as peripherals, e.g., I/O interfaces, Ethernet or DRAM controllers. Moreover, as in many commercially available NoC-based architectures, certain peripherals (e.g. memory hardware units) have a fixed position in the system which cannot be changed by mapping (e.g. the edge of the chip, due to the physical size of the controller). Therefore, certain critical communication paths are also fixed (i.e., established by the static routing algorithm), e.g., the communication from the Ethernet controller to the DRAM memory.
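For reference, the following sketch collects these baseline parameters in a small configuration record. The struct and field names are invented for illustration (the 5x4 mesh corresponds to the routers R1-R20 in Figure 5.3) and do not stem from the dissertation's tooling:

    /* Baseline NoC parameters of the experimental setup (hypothetical struct). */
    typedef struct {
        int mesh_cols, mesh_rows;   /* 2D mesh topology (Figure 5.3: 5x4)    */
        int router_pipeline_cycles; /* 4-cycle router pipeline               */
        int link_traversal_cycles;  /* 1-cycle link traversal                */
        int buffer_depth_flits;     /* 5 flits per VC buffer                 */
        int packet_size_flits;      /* 4 flits per data packet               */
        int freq_mhz;               /* 500 MHz interconnect clock            */
    } noc_cfg_t;

    static const noc_cfg_t baseline_noc = { 5, 4, 4, 1, 5, 4, 500 };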


Figure 5.3: Exemplary NoC-based MPSoC setup used in the experiments.

5.2 Worst-Case Guarantees

This section presents results from the formal verification of setups with different versions of the synchronization protocol (cf. Chapter 3). In consecutive runs, the properties of the baseline NoC enhanced with the control layer are compared to the state-of-the-art solutions described in Chapter 2. Consequently, Section 5.2.1 considers time-driven NoC designs, whereas Section 5.2.2 considers priority-based schedulers in routers. Finally, Section 5.2.3 evaluates the worst-case communication overhead introduced by the synchronization protocol. The proposed analysis can be directly extended to account for the overhead resulting from the SW/HW stack using the CPA principles [61].

5.2.1 Time-Driven Scheduling

In consecutive series of experiments, synchronized senders are activated periodically with a period P = x · C+ and a small jitter J equal to 10% of the period. This permits varying the system's load L, defined as the total number of transmissions from the synchronized applications per period P, i.e. L = ∑x (yx · C+)/P. Moreover, all synchronized transmissions are of equal length, i.e., they have the same C+ equal to 16kB. Figure 5.4 summarizes the results from the first series of experiments, where a setup with four (x = 4) applications (A1-A4) is considered. In each run, the lengths of the bursts are adjusted to generate different loads L on the network (L equal to 15%, 65% and 90%). For each application, the worst-case latencies obtained per burst are calculated. It is visible that synchronization with the RM always results in better guarantees than TDM (even when short slots are used), despite the additional communication protocol. The overhead considers the protocol-based synchronization without the reaction time of the clients and the RM. This overhead increases with the load but remains low compared to the transmission time (4.1% of C+ for L = 90%). In the next series of experiments, a variable load has been tested. Figure 5.5 illustrates the results from the previously considered setup, i.e. the transmission latency and the protocol overhead in the worst case, while changing the number of synchronized applications (x = [2..16]) and the interfering load (yx = [2..16]). The results concern an application periodically transferring a burst of 16 transmissions. Each single transmission from all synchronized senders has an equal length, i.e., they have the same C+ equal to 8kB. The worst-case latencies, depicted in Figure 5.5(a), for both TDM and the RM, increase along with the size of the synchronization scenario. However, they remain constant for TDM and independent of the system's load.
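As an illustrative (not measured) example of the load metric: with x = 4 senders, a period P = 4 · C+, and bursts summing up to ∑x yx = 3.6 transmissions of length C+ per period, the load amounts to L = (3.6 · C+)/(4 · C+) = 90%.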


Figure 5.4: Analytical comparison of worst-case latency guarantees for applications (A1-A4) generating different NoC load. A3 and A4 have the same settings, see [81]. The overhead considers the communication protocol, i.e. the latency of the synchronization messages.

Therefore, the approach based on the control layer significantly outperforms TDM (up to 80%), due to the applied work-conserving scheduling. Note that TDM with short slots performs better than TDM with long slots as it is less sensitive to jitter. The protocol overhead, depicted in Figure 5.5(b), is presented as a percentage of the transmission length. It increases proportionally to the number of synchronized senders. This effect can be mitigated by implementing multiple RMs in the same NoC. Moreover, the protocol overhead depends directly on the frequency and number of synchronized transmissions, i.e., the system load. Finally, this overhead decreases, as an absolute ratio, with an increasing length of transmissions, as the protocol overhead is constant with respect to the transmission length. The next series of experiments compares the service guarantees provided by the RM and TDM for an application performing a burst of 16 consecutive transmissions, see Figure 5.6. Similarly to the previous experiment, in consecutive runs the number of synchronized applications x increases (x ∈ [2..16]). Each application performs a burst of transmissions per activation. Applications are activated periodically with a period P = 16 · x · C+. Consequently, the load L, defined as the number of transmissions y = [2..16] from interfering applications, varies per single burst activation, i.e., L = ∑ (y · C+)/P. Moreover, transmissions are not perfectly synchronized with the TDM schedule (jitter equal to 5% of C+). Figure 5.6 presents the results. In the system with TDM, the worst-case service depends directly on the number of synchronized applications and does not vary with the load.


Figure 5.5: Worst-case guarantees for a burst of 16 transmissions with jitter = 10% of P: (a) transmission latency, (b) protocol overhead resulting from the RM. Results are based on [81] and [82].

This results in a low scalability and poor support for dynamics. For the RM-based admission control, it is possible to achieve better service guarantees when the system is not fully loaded (L < 100%). This is due to the introduced work-conserving arbiter, which schedules transmissions as soon as they arrive, in contrast to the static overhead of TDM. Consequently, if the NoC's load is low, the service guarantee for the sender (RM-Low) remains continuously at the same low value regardless of the number of synchronized senders. Note that the service guarantees for other load levels scale proportionally, see the green series (RM-High) in Figure 5.6. For a fully loaded system, the overhead resulting from the synchronization protocol is also acceptable (≈4% compared to TDM). Such a differentiation is not possible in the case of TDM, as TDM offers a non-work-conserving arbitration scheme, according to which the guarantees depend on the number of synchronized applications and not on the load induced by them (cf. [59]).


Figure 5.6: Worst-case service for sixteen consecutive transmissions in a system with TDM-based arbitration compared to the RM-based architecture, based on [83].

Memory Effects

The next series of experiments extends the evaluation to account for the memory effects caused by the proposed arbitration. For evaluation purposes, the following DDR3 memories have been considered: 800E, 1066G, 1066E, 1333H, 1333J, 1600K, 1866M and 2133N. Firstly, the effects of spatial locality on the response time of a memory module have been evaluated, assuming the simple SDRAM controller presented in Sec. 3.8 of Chapter 3. In consecutive series of experiments, a benchmark application tries to write between 8 and 1024 bytes of data into the memory in systems with (solid lines) and without (dashed lines) support for locality of transfers, cf. Figure 3.28 from Chapter 3. In the context of this experiment, to support locality means to ensure that the entire chunk of data arrives in one piece at the SDRAM controller, which would be the case in a system with large TDM slots or an RM. Not to support locality means that several independent small requests are performed in order to transfer the entire data chunk of a request, which is the case in a system with small TDM slots. Figure 5.7 summarizes the obtained results. For all tested SDRAM devices, whenever locality is enforced, the latency of a request can be significantly decreased. The performance gain is proportional to the length of the transmissions and scales with the frequency of the memory module, see the zoomed section of the figure. The faster the frequency of the SDRAM device, the larger the latency measured in clock cycles. This is because SDRAM timing constraints vary little between different speed bins when measured in nanoseconds.


Figure 5.7: Effect of memory locality on the total transmission latencies using TDM and RMs, based on [82]. The calculations assumed an SDRAM device with an interface width of 8 bits. Notice that the latency is measured in the data-bus clock cycles of the SDRAM devices.

Consequently, devices that have faster clock frequencies need more clock cycles to satisfy the constraints. The former experiment shows that increasing the transmission granularity decreases the memory overhead in a NoC-based MPSoC. However, for TDM-based arbitration, this action simultaneously increases the worst-case NoC latencies. Recall that, whenever the senders in a system exhibit dynamics in their behavior, even with a small jitter, transmissions are blocked and their execution is delayed for the duration of a whole TDM cycle.


Figure 5.8: Effect of memory locality on the latencies of resource accesses with different granularity, based on [82].

Therefore, longer transmissions increase this penalty. Figure 5.8 presents the worst-case latencies in a NoC-based MPSoC for an application synchronized with 10 other senders and writing 1024 bytes to the SDRAM memory. The experiment is conducted under the assumption that each packet is capable of delivering 64 bytes of payload. In consecutive runs, the granularity of access has been increased, conducting the synchronization for 1, 2, 4, 8 and 16 packets respectively, assuming a small dynamic behavior in the activation of a sender (around 5%). Consequently, the load from the other synchronized senders is equal to 70% of the network capacity. In the case of TDM, the NoC latency increases proportionally to the length of the slot, i.e., the number of packets. Consequently, decreasing the memory latency increases the NoC response time. In contrast, when the RM is used, the latency decreases with increasing granularity of the transfer, i.e., it is possible to simultaneously improve the network and the memory latency. However, for short transmissions (1 or 2 packets), the RM's protocol overhead is a major performance bottleneck (recall that three control messages per transmission are required). In these scenarios, TDM manages to guarantee lower transmission latencies. Yet, as soon as the granularity of a single transfer increases, the number of necessary synchronizations decreases. This results in a significant improvement for the 1024-bytes long transmission (almost 70%) and confirms the previous findings. Finally, the last series of experiments demonstrates that preserving the locality of large transfers can have a beneficial impact on the worst-case latency of SDRAM requests. In the considered setup, an SDRAM controller with a single port is connected to the NoC. For the RM-managed NoC, the simple SDRAM controller is used and all requests are treated equally (no QoS support, performance-optimized, cf. Sec. 3.8 of Chapter 3). In the case of the TDM-managed NoC, a non-trivial predictable SDRAM controller is considered, employing static bank interleaving and a closed-page policy. This controller will be referred to as Dedicated Close-Page Controller (DCPC) in the following. The operation of the considered DCPC is controlled by two parameters: Bank Interleaving (BI) and Burst Count (BC). BI determines how many banks a single request accesses, while BC determines how many read or write commands are executed per bank. The evaluation consists of computing the worst-case latency of a 256-bytes long data transfer. For the TDM-based NoC, 64-bytes long slots are considered, i.e. the 256-bytes transfer will demand 4 independent SDRAM requests. For the RM-managed NoC (which enforces the locality of a single large transfer), the 256-bytes long transfer is served in one chunk. The results are presented in Figure 5.9(a) and Figure 5.9(b). The first figure depicts the results for a scenario in which a single SDRAM device with an 8-bit data bus is considered.
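For intuition, a back-of-the-envelope example with illustrative numbers (assuming the DDR3 burst length BL = 8): on an 8-bit data bus, one CAS command transfers 8 · 8 bit = 8 bytes, so serving a 256-bytes request requires BI · BC = 256/8 = 32 commands, e.g. BI = 4 banks with BC = 8 bursts each, leaving room for bank interleaving. On a 32-bit bus, a single command already moves 32 bytes, so only 8 commands are needed, which explains the reduced interleaving potential discussed for Figure 5.9(b).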


Figure 5.9: Worst-case latency for a 256-bytes request on DDR3 devices (a) with 8-bit wide interfaces and (b) with 32-bit wide interfaces, connected to a NoC with and without the RM (locality protection), based on [82]. It is visible that the RM (green series) consistently reaches shorter formally proven latencies.

Notice that, for SDRAM devices with slow operating frequencies (DDR3-800E), the DCPC provides better latency bounds than the combination of a standard controller and the RM. This is because, for slow devices, the penalty to close and open SDRAM rows is smaller (in terms of data-bus clock cycles). This situation changes for devices with higher operating frequencies because the overhead becomes larger. The second figure depicts the results for a scenario in which an SDRAM module with a 32-bit data bus is considered. As the data bus is wider, the possibility to perform interleaving is reduced (because each SDRAM CAS command transfers 4 times more data in comparison with the previous scenario). Hence, there is no efficient way to mitigate the overhead to close and open SDRAM rows. Therefore, exploiting the locality of large transfers has a solid advantage.

5.2.2 Priority-Based Arbitration

This section presents results comparing the worst-case guarantees provided by RMs against a system utilizing the prioritization of VCs (VC-prio) carried out locally in routers, as proposed in [31], [24], see Figure 5.10. The same experimental settings as before are assumed, with a variable number of synchronized applications. Each application has a unique priority and periodically conducts burst-accesses to the NoC. All transmissions belonging to the same burst are of equal length (125 packets) and have the same maximum network latency C+. An activation period of P = 16 · x · C+ and a small jitter equal to 5% of C+ are assumed. In the experiments, the number x of synchronized applications (x = [2..16]) is adjusted; each sender conducts burst-accesses to the NoC with l transmissions per activation. We assign priorities to applications using the rate-monotonic priority assignment (see the sketch below). The considered mechanism for the prioritization of VCs offers preemption at the flit level: at any time, a packet with a higher priority gets the privilege to proceed in the router and access the output link. Figure 5.10 presents the results for the synchronized applications with the lowest priority. Latencies are depicted with bar charts and refer to the left Y-axis. Both approaches provide work-conserving scheduling of the network traffic, therefore the results are comparable. It is visible that the prioritization of VCs offers better worst-case guarantees, as it does not require a synchronization protocol like the RM. However, the mechanism requires the number of VCs to be equal to the number of criticality levels in the system. At the same time, the RM requires only two VCs (including one for control messages) for all applications. The hardware overhead, defined as the number of VCs, is denoted with dashed (VC-prio-Lo) and solid (RMs-Lo) lines and refers to the Y-axis on the right. At the cost of roughly 13% of additional latency overhead, the RM significantly reduces the hardware overhead, which is otherwise proportional to the number of criticality levels in the system. In the example, this overhead is divided by 8 (16 VCs in VC-prio vs. 2 VCs with the RM). In the second series of experiments, the RM-based approach is compared against the mechanism based on the prioritization of VCs (flit-preemptive scheduling performed locally in routers) while considering the memory latency. Although both solutions offer work-conserving scheduling, the latter does not preserve the locality of synchronized transmissions (cf. Sec. 3.8 of Chapter 3). For the evaluation, a DRAM main-memory module was considered, with worst-case response latencies modeled after the specifications of DDR3-SDRAM 2133N. In these experiments, the DRAM synchronization scenario from the MPEG-4 use-case has been considered, with 7 synchronized senders. Similarly to the previous experiments, the priorities increase along with decreasing bandwidth requirements.
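A minimal sketch of the rate-monotonic assignment mentioned above (shorter period, higher priority); the struct and helper names are illustrative, not taken from the experimental framework:

    #include <stdlib.h>

    typedef struct { int id; long period; int priority; } app_t;

    static int by_period(const void *a, const void *b)
    {
        long pa = ((const app_t *)a)->period, pb = ((const app_t *)b)->period;
        return (pa > pb) - (pa < pb);   /* ascending by period */
    }

    /* Assign priority 0 (highest) to the application with the shortest period. */
    void assign_rm_priorities(app_t *apps, int n)
    {
        qsort(apps, n, sizeof *apps, by_period);
        for (int i = 0; i < n; i++)
            apps[i].priority = i;
    }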


Figure 5.10: Worst-case response times in a system with prioritized VCs and RMs, based on [83].

Figure 5.11: Effect of memory locality on average latencies for prioritized VCs and RMs, based on [83].

Therefore, in scenarios utilizing the prioritization of VCs, one has to assign seven VCs, whereas the RM requires only two VCs (one for control messages and one for data transmissions). In the case of the RM, a preemption granularity of 8kB has been considered (similar to main-memory paging), i.e. the preemption may happen only after the completed transfer of an 8kB chunk of the lower-priority transmission. This allows applying the open-bank policy in the selected memory module, as described in [153]. The solution based on the prioritization of VCs does not offer this possibility, as the preemption happens locally in routers at the flit level. Figure 5.11 presents the worst-case latencies obtained with the proposed analysis method. It is visible that although the prioritization of VCs results in a lower network latency, it increases the latency of memory operations. Consequently, the RM achieves better overall results, in particular when most of the streams have lower priorities.

5.2.3 Worst-Case Temporal Overhead

In order to assess the scalability of the approach, a series of experiments has been conducted to measure the worst-case temporal overhead by varying the number of synchronized applications and the system load. Consequently, the number of applications (and the induced traffic) using the same RM has been increased from 2 to 16 in consecutive runs. Figure 5.12 summarizes the results as a percentage of the transmission latency, which is understood as the time necessary to send 8kB of data using 125 packets in a NoC without interference. The overhead increases proportionally to the number of synchronized applications and their activity, i.e., the load L. Note that this is directly related to the frequency of transmissions. If the system is composed of many applications that are rarely sending data, then the overhead remains at a low level.


Figure 5.12: Temporal overhead resulting from the RM, based on [83].

Moreover, in the considered safety-critical systems, the behavior of safety-critical applications is well specified and tested, and the frequency of transmissions with hard real-time requirements is limited (e.g. periodic in the range of milliseconds). It is also visible that the overhead decreases, as an absolute ratio, with an increasing length of transmissions, i.e., the overhead is constant with respect to the transmission length. DMA engines imposing a large granularity of transfers, combined with scratchpad memories, are commonly used under safety requirements as they are predictable when compared to caches. The temporal overhead can be mitigated by implementing multiple RMs in the same NoC (for example by providing different RMs for different memory modules or regions of the NoC). Consequently, large synchronization scenarios can be decomposed into smaller ones by mapping transmissions to different VCs supervised by independent RMs. Additionally, for senders conducting short transmissions (e.g. cache-based applications), the synchronization can be performed per whole on-core task activation rather than for single NoC accesses, cf. Chapter 3.

5.3 Performance Measured by Simulation

In this section, the performance of the proposed mechanism is evaluated using simulation. The arbitration is implemented using the OMNeT++ framework for network simulation and the HNOCS library [15]. A detailed description of the framework is presented in Chapter 4, Section 4.1.2.


Figure 5.13: Latencies of CHStone benchmarks with TDM and RMs, based on [81].

5.3.1 Performance of Time Driven Scheduling

Firstly, the evaluation is carried out through average performance measurements using traces from the CHStone benchmark suite [51]. In consecutive runs, different synchronization scenarios have been simulated by varying the number of senders. For each synchronization scenario (selected number of applications), all possible mappings have been simulated, assuming a constant placement of the RM and one sender per node. Figure 5.13 presents the average latencies for different sizes of synchronization scenarios. In this case, it is visible that the RM significantly outperformed the other solutions. However, when compared to TDM for small synchronization scenarios, e.g. 2 senders, the difference is rather low (around 8%). This has two reasons: the short duration of the TDM cycles and the relatively high RM overhead (three control messages per transmission). For larger scenarios, the solution based on RMs is up to 60% better than TDM. Note that the activation patterns of the senders are not necessarily synchronized with respect to the TDM cycle. Tailoring the TDM schedule in order to optimize such systems, which are not fully loaded, requires dedicated solutions and introduces additional hardware overhead, e.g. SurfNoC [150], PhaseNoC [116]. The RM conducts the arbitration efficiently without additional effort. Secondly, the average performance of the RM-based arbitration has been evaluated in the use-case with the MPEG-4 video decoder application [19]. This comparison is relevant for all cases in which a synchronized application not only requires worst-case guarantees but also profits from a faster execution. In the MPEG-4 use-case, the requests to each memory module constitute synchronization scenarios. Consequently, the different scenarios are mapped to independent VCs and protected with a dedicated RM running on a different node, see Figure 5.14.


Figure 5.14: Experimental setup used for the evaluation of locality effects on the total transmission latencies for the MPEG-4 module, based on [82].

Figure 5.15: Effect of memory locality on the total transmission latencies for the MPEG-4 module using TDM and RMs, based on [82].

The following concentrates on the DRAM scenario. Each module of the MPEG-4 decoder is modeled with a traffic generator conducting 8kB long DMA transfers. This is because, in commercial SDRAM modules with 64-bit data buses, the row buffer size is 8kB. Transmissions are performed periodically, and the periods are calculated based solely on the required bandwidth, including some release jitter (J = 10% of P). Figure 5.15 presents the achieved average latencies of a single transmission in the simulated system with TDM- and RM-based arbitration. The transmissions are 125 packets long, which corresponds to 8kB of data payload in the considered baseline NoC. The depicted values include both the network and memory latencies to assess the effect of memory locality on the described mechanisms. The latencies of the SDRAM memory are modeled after the specifications of DDR3-SDRAM 2133N [68]. As previously explained, TDM with short slots performs better than TDM with long slots in the network. However, the memory latency for TDM with long slots is lower than for TDM with short slots, since the long slots allow maintaining the locality of the memory accesses to the SDRAM. Indeed, SDRAMs have an internal level of caching, which in standard DDR3 modules amounts to 8kB. Consequently, contiguous and aligned 8kB long transfers fully benefit from this caching. The RM also maintains the locality of memory accesses, since applications are granted access to the NoC for the entire transmission. Overall, the RM performs better than TDM for memory traffic traversing the NoC and accessing the main SDRAM memory. Similarly to the worst case, the RM offers the lowest average latencies.
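As a rough consistency check (illustrative arithmetic, assuming 64 bytes of payload per packet as in the earlier experiments): 125 packets · 64 B = 8000 B ≈ 8kB, i.e. approximately one DDR3 row buffer, so a granted transmission can be served almost entirely from a single open row.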

5.3.2 Work-Conserving Arbitration in NoC

As discussed in Section 5.2.2, the control layer permits priority-based arbitration within the same virtual channel of a NoC. Consequently, when compared to architectures offering QoS based on the prioritization of VCs done locally in routers, it introduces additional overhead resulting from the synchronization protocol. However, at the same time, the control layer decreases the number of required buffer queues (i.e., the number of priorities is independent of the available hardware resources) and increases the performance of state-based peripherals, e.g., the locality of accesses to DDR-SDRAM memories.

Results from the conducted experiments confirm that the proposed resource management scheme can further improve the performance of running applications. The first evaluation is performed through a comparison of the RM, introducing a slack-based, global and dynamic prioritization of data streams, with SPP-based solutions. The former schedules BE and soft real-time (SRT) senders whenever the hard real-time (HRT) senders can be safely postponed. This is possible whenever slack is available, i.e., the temporal budget between the worst-case response time of an HRT application and its deadline, cf. Section 3.6.2 from Chapter 3. For the SPP arbitration carried out locally in routers, SRT and BE streams are always statically assigned the lowest priority.


Figure 5.16: Experimental setup used for evaluation of the dynamic priority based scheduler in the RM, based on [75].

The slack budgets for HRT senders, used in the experimental runs, are computed offline with the analysis from Section 4.1.1 (implemented in the pyCPA framework [42]). Figure 5.16 shows the manually chosen, sparse mapping for the DRAM synchronization scenario of the MPEG-4 use-case. HRTs as well as SRTs are mapped to the same VC and synchronized with the same RM.


Figure 5.17: Performance of CHSTONE benchmarks (SRTTs) synchronized with MPEG-4, based on [75].

Figure 5.18: Trace with completed transmissions from the SRTT dfmul benchmark, based on [75].

The evaluation begins with an assessment of the performance gain for SRT senders achieved through the proposed mechanism. Figure 5.17 presents the latencies of CHSTONE benchmarks (SRTs) when synchronized with tasks from MPEG-4 (HRTs) belonging to the DRAM synchronization scenario. Results cover both the SPP arbitration and the RM using slack-based scheduling. One may observe that benchmarks synchronized with the RM finish faster compared to SPP, despite the additional overhead resulting from the synchronization protocol. The improvement varies among applications and reaches a speedup of nearly 60% for the jpeg application. It depends on the duration and the frequency of blocking in the NoC resulting from the access patterns of the soft and hard real-time applications. The speedup is also proportional to the number of memory accesses, e.g., jpeg exposes the highest number of transfers and thereby benefits most from the improvement in network latency. Finally, it is visible that the protocol overhead remains low compared to the transmission time (from 5% to 14%).

For a detailed explanation, Figure 5.18 presents the evolution over time of the number of completed transmissions from the dfmul benchmark. It is visible that the application progresses significantly faster when synchronized with the RM. The slack-based resource allocation allows SRTs to fetch data and start processing early, therefore accelerating the benchmark’s execution when compared to SPP.

As previously explained, the slack budget available to SRTs varies at runtime. Figure 5.19 presents the evolution of the available slack over a time window of 200 ms in the considered DRAM scenario. Rising edges in the figure correspond to the provisioning of the slack budget (between Smin and Smax) when a new HRTT is activated, and falling edges to the consumption of the slack budget when an SRT is granted access to the NoC. It is visible that the value of the available slack varies between the initial slack budgets (Smin/Smax) and ψ.


Figure 5.19: Slack budget monitored by the DRAM’s RM, based on [75]. (Available slack ≥ SRTTs’ requirements.)

Figure 5.20: Experimental setup with mapping of HRT tasks performing video denoising. Network demands are presented in MB/s for 1080p and 40 fps, based on [77].

Consequently, as the slack never reaches ψ, SRTs can be safely accelerated by postponing the execution of HRTs without causing a deadline miss. Moreover, it is visible that the available slack stays at a relatively high level (compared to ψ). This means that the provided slack budget is larger than the SRTs’ requirements. Therefore, the average performance of HRTs has not decreased significantly.
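The slack accounting can be sketched as follows (a minimal sketch with assumed names and units, not the actual RM implementation): HRT activations replenish the budget (rising edges, bounded by Smax), while an SRT grant consumes it (falling edges) and is only issued if the remaining budget stays above the reserve ψ.

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int64_t slack;  /* currently available slack [cycles]     */
    int64_t s_max;  /* upper bound of the provisioned budget  */
    int64_t psi;    /* reserve that must never be consumed    */
} slack_acct_t;

/* Rising edge: a new HRT activation provides fresh slack,
 * bounded by s_max. */
static void hrt_activated(slack_acct_t *a, int64_t provided)
{
    a->slack += provided;
    if (a->slack > a->s_max)
        a->slack = a->s_max;
}

/* Falling edge: grant an SRT transmission only if its worst-case
 * duration fits into the slack above the reserve psi. */
static bool try_grant_srt(slack_acct_t *a, int64_t srt_cost)
{
    if (a->slack - srt_cost < a->psi)
        return false;        /* would endanger HRT deadlines */
    a->slack -= srt_cost;    /* consume the budget           */
    return true;
}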

Adaptive Load Distribution

As discussed in Section 3.6.5, the introduced control layer also offers the possibility to detour traffic instead of blocking it upon the activation of higher-priority senders. Consequently, BE senders may use multiple paths from the source to the destination, even if these overlap with links used by critical traffic. Whenever safety-critical HRT senders are active, such links are treated as the ones with the highest contention, and BE traffic is detoured to an alternative path instead of being blocked permanently, as a state-of-the-art solution would do. As will be shown, this arbitration significantly improves the average latencies of BE senders. The RM thus gradually enhances BE performance whenever the NoC contention is high.

In this series of experiments, HRT senders are modeled after a real-time multimedia application performing video noise reduction [95], see Figure 5.20. The HRT traffic is distributed between the DRAM ports due to the conflicting temporal (bandwidth) requirements: T2 to port 1, ETH and T3 to port 2. BE senders are modeled after applications from the CHSTONE benchmark. A comparison is performed for three mechanisms: the proposed adaptive load distribution

(ALD) and two other common solutions for mixed-critical NoCs: spatial isolation (SIS) and temporal isolation (TIS) done by static priorities. SIS forces all BE senders to use the remaining PORT 0. For ALD and TIS, the load from BE senders is distributed between all available ports (three BEs per port, using the shortest path and the mapping from [52]). Consequently, for ALD and TIS, the memory ports constitute three hot-modules, and the interfering senders using each of them can be synchronized independently. ALD enforces one detour path per BE sender towards PORT 0, assured by XY-routing and non-blocking routers. For all HRTs, the overhead against the slack budget was confirmed with the formal analysis done in the pyCPA framework [42] (for ALD, an overhead of around ~5% of the transmission time and between ~0.1% and 0.2% of the period).

Figure 5.21 presents the achieved results; the values cover both the core execution and the network communication (interference). It is visible that SIS offers safety at the cost of drastically decreased BE performance, i.e., the longest transmission latencies, due to the strict separation resulting in high interference at PORT 0 and an increased average path length. For TIS, the distribution of the BE traffic between the ports decreased the average transfer latencies and improved the average BE performance. However, due to the temporal synchronization, the improvement is proportional to the HRT load (i.e., the duration of blocking); therefore, the improvement is better for PORT 1 (one HRT sender) than for PORT 2 (two HRT senders). Finally, ALD allowed a further performance improvement for traffic directed to PORT 1 and PORT 2 at the cost of a minimal slowdown of the applications accessing PORT 0. If HRTs are active, the traffic from BE senders is redirected (detoured) to PORT 0, causing interference with benchmarks which were running alone in the previous run (TIS). However, as detouring happens only during conflicts with HRTs, on average (over all benchmarks) a 37% improvement is reported when compared to SIS and a 20% improvement when compared to TIS. Note that the difference between TIS and ALD is higher for PORT 2 (higher HRT load) than for PORT 1 (lower HRT load), as only the blocked traffic can profit from the detouring (ALD), in contrast to TIS.

For a more detailed explanation, Figure 5.22 presents the histogram of latencies from the adpcm benchmark running on the node connected to R9. If the NoC is protected by ALD, most packets can progress using the optimal path (short latencies) and only a few are detoured, causing a gradual decrease of performance. In case of SIS, all transmissions must take a longer, non-optimal path and experience higher interference from other BE senders, thus with significantly higher average latencies. Finally, in case of TIS, although some transmissions may progress quickly (similarly to ALD), others are blocked by one or two consecutive HRT transmissions (around 1.5x and 2.5x, respectively, depending on the activities of

Figure 5.21: Performance of CHSTONE benchmarks (BEs) in the use-case with ALD, based on [77].

Figure 5.22: Histogram of transmission latencies for the adpcm benchmark in the ALD use-case, based on [77].

Figure 5.23: Improvement for different BE benchmarks in the ALD use-case normalized to SIS, based on [77].

Figure 5.24: Effect of memory locality on the total transmission latencies for the MPEG-4 module using TDM and RMs, based on [77].

ETH0 and T3), jeopardizing the average performance.

The performance improvement depends heavily on the characteristics of the memory accesses from the benchmarks. Figure 5.23 presents the improvement (normalized w.r.t. SIS) for different benchmarks running on the node connected to R9 (averaged over all possible benchmark mappings). The improvement is significant and varies among benchmarks, starting from 25% for adpcm and reaching a speedup of nearly 55% for dfdiv. The speedup is proportional to the frequency and number of memory accesses as well as to the activity of other senders, i.e., a memory-intensive application conducting more transfers experiences a higher improvement in its response time than a computation-intensive task.
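A minimal sketch of the ALD routing decision discussed in this section (names and types are illustrative; the actual RM logic is specified in Section 3.6.4): a BE sender uses its shortest path unless an HRT transmission is currently active on it, in which case the preconfigured XY detour towards PORT 0 is selected instead of blocking.

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t primary_route;  /* shortest path, e.g., towards PORT 1/2 */
    uint32_t detour_route;   /* alternative XY-route towards PORT 0   */
} be_sender_t;

static uint32_t ald_select_route(const be_sender_t *s,
                                 bool hrt_active_on_primary)
{
    /* Work-conserving: use the optimal path whenever no HRT
     * transmission occupies it; otherwise detour instead of stalling. */
    return hrt_active_on_primary ? s->detour_route : s->primary_route;
}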



Figure 5.25: Mapping of the MPEG-4 with interfering traffic from general purpose (GP) applications in the DTS use-case, based on [78].

Figure 5.26: Trace from the dfmul benchmark with the number of active senders, showing the number of packet arrivals in the DTS use-case, based on [78].

Dynamic Rate Control

This section presents results proving that the control layer can efficiently control the rates at which running applications access the NoC. At each source node, a monitor regulates the rate with which the source can inject traffic into the NoC. The major drawback of the state-of-the-art solution is that the rates are adjusted statically according to the worst-case scenario, i.e., assuming that all senders are running simultaneously with the maximum possible transmission arrival rates. Therefore, this approach is not work-conserving and sacrifices interconnect utilization whenever senders expose dynamics in their execution time, release jitter or communication volume and the system is not highly loaded. In contrast, the control layer offers a solution in which the regulation is performed dynamically (at runtime) according to the system load, i.e., the number of simultaneously active applications. Consequently, the introduced approach results in higher performance and tighter guarantees while simultaneously decreasing the hardware (up to 60%) and temporal overhead (up to 80%) when compared to existing solutions.

The evaluation starts with an assessment of the performance gain achieved through the proposed mechanism. Figure 5.25 presents the average response times of CHSTONE benchmarks running in the considered scenario, with the mapping presented in Figure 5.24. Results include the distributed traffic shaping (DTS) arbitration in the NI (rates adjusted and fixed according to the worst-case requirements in the NIs and not in routers) as well as the RM approach (rates decreasing uniformly with the increasing number of active senders). One may observe that benchmarks synchronized with the RM finish faster despite the additional overhead resulting from the synchronization protocol.


Figure 5.27: Average latencies for DTS and RM in the system requiring non-symmetric guarantees, based on [78].

The improvement is significant and varies among benchmarks, starting from 50% for blowfish and reaching a speedup of nearly 85% for mips. This depends on the frequency and access patterns of the initiated transmissions as well as on the activity of other senders. The speedup is also proportional to the number of memory accesses, i.e., an application conducting more transfers experiences a higher improvement in its response time due to the dynamic-rate RM arbitration. It is also visible that the protocol overhead remains low compared to the transmission time (~15%).

For a detailed explanation, Figure 5.26 presents the evolution of the first 1000 accesses of the dfmul benchmark in the system using the RM, along with the variation of the number of active senders (right Y-axis). It is visible that the benchmark’s speed depends on the present system mode, i.e., the packet latency varies according to the number of senders running in parallel. In the simulation, one may observe a maximum of 5 active applications running in parallel with dfmul, for a total of 8 synchronized applications. Moreover, during 60% of the time there is only one application running together with the considered benchmark. Consequently, the RM may significantly increase the transmission rate and provide a high performance improvement at runtime. This performance gain results from the moderate load of the system during runtime.

The next comparison concerns the average latencies of transmissions synchronized with the RM against the system utilizing DTS when non-symmetric guarantees are required. The same experimental setting as before is applied, this time in a priority-based system, where the MPEG-4 tasks communicating with the DRAM memory are assigned priorities: tasks with higher frequencies are assigned a higher priority according to a rate-monotonic scheme. Recall that each task periodically conducts burst accesses (of equal length) to the NoC.

DTS, used to enforce non-symmetric guarantees, applies a static minimum-distance function between two transmissions according to the priority, i.e., a shorter distance (higher rate) in case of high-priority transmissions. In case of the RM, the rates of tasks can be increased whenever possible, i.e., when other tasks are not active. The rate decreases with the increasing number of active senders; however, it cannot drop below the rate which allows all active applications to meet their timing constraints. Figure 5.27 presents the results for the MPEG-4 decoder tasks. It is visible that when DTS is used, the average latencies increase along with the decreasing priorities of senders, which get a low rate according to their priority level. Whenever the RM is applied, applications with lower priorities benefit significantly from the unused bandwidth by increasing their rate whenever possible (i.e., when higher-priority applications are not active), thereby achieving a speedup of up to 30% for the UPSMAP module. The transmissions with higher priorities achieve average latencies similar to those under DTS.
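The following C fragment sketches the dynamic rate adjustment in both modes under assumed names: the RM divides the bandwidth of a synchronization scenario among the currently active senders, but never drops a sender below the statically verified guaranteed rate, which preserves the worst-case timing constraints.

#include <stdint.h>

typedef struct {
    uint32_t max_rate;        /* injection rate when running alone     */
    uint32_t guaranteed_rate; /* minimum rate from the formal analysis */
    uint32_t weight;          /* non-symmetric mode: importance weight */
} rate_cfg_t;

/* Symmetric mode: the bandwidth is divided uniformly among the
 * active senders of the synchronization scenario. */
static uint32_t rm_rate_symmetric(const rate_cfg_t *c, uint32_t active)
{
    uint32_t rate = c->max_rate / (active ? active : 1);
    return (rate < c->guaranteed_rate) ? c->guaranteed_rate : rate;
}

/* Non-symmetric mode: the share also depends on the sender's weight,
 * e.g., derived from its rate-monotonic priority. */
static uint32_t rm_rate_weighted(const rate_cfg_t *c,
                                 uint32_t active_weight_sum)
{
    uint32_t sum  = active_weight_sum ? active_weight_sum : c->weight;
    uint32_t rate = (uint32_t)(((uint64_t)c->max_rate * c->weight) / sum);
    return (rate < c->guaranteed_rate) ? c->guaranteed_rate : rate;
}
/* The resulting value is written to the rate limiter in the NI. */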

5.4 Resource Overhead

As discussed, the proposed control layer introduces a resource synchronization scheme which can be applied to a broad spectrum of underlying NoC architectures if they comply with the general requirements established in Chapter 3, e.g., predictable arbitration in routers, adjustable admission control in nodes, and a predictable latency of control messages. In most setups, only small extensions of the underlying infrastructure are necessary, e.g., additional registers/interfaces in NIs and client logic. Moreover, the solution also offers fine-granular resource control, e.g., an implementation entirely in hardware or a HW/SW co-design for minimizing the overhead. This section presents the results of an initial evaluation of the resource overhead introduced by the mechanism. The initial designs follow as an extension of the baseline architecture discussed in Section 3.1.

Hardware Implementation

The implementation of the control layer requires the introduction of a new RM unit and clients. The considered hardware implementation follows the design guidelines from Section 4.2.3. The RM is implemented in the form of a stand-alone processing node connected to one of the existing network modules. The clients are implemented as an extension of the NI in a system with a memory-mapped NoC, where the NI introduces a transparent address translation using LUTs, see

Section 3.1. Moreover, the NI is already equipped with a mechanism for admission control, i.e., a rate limiter which enforces a minimum temporal distance between the initiated transmissions.

The IDAMC platform [140] is selected as the basis for the implementation, as it complies with all the requirements mentioned above. The IDAMC’s router and NI architectures follow the generic design from [37], which is frequently applied by the research community. The deployment of the architecture is carried out on a Virtex-6-LX760 Xilinx FPGA board using Xilinx ISE 14.6 with default optimization settings and no special optimizations of the VHDL implementation. The device utilization data were collected from the Module Level Utilization Summary Report produced by ISE.

Firstly, the results for several RM designs of increasing complexity are presented in Table 5.2. All considered approaches have been introduced in Chapter 3. The results cover solely the main RM logic, i.e., they do not include the NI functionality (e.g., packetization and depacketization). All RMs assume the same size of the stored tables with synchronization scenarios (64 entries). In the considered hardware implementation, the following RM schedulers have been used: static priority preemptive (RM-SPP) from Section 3.6.2, temporal isolation with round-robin (RM-RR) from Section 3.6.1, the adaptive load distribution (RM-ALD) from Section 3.6.4 (the RM, instead of blocking low-priority senders, redirects their traffic) and, finally, throttling using the dynamic traffic shapers (RM-DTS) from Section 3.6.5. The overhead is compared with the size of an arbiter used to perform prioritization locally in the router (fixed priority, RTER-FP), see Section 3.1.1. The RTER-FP approach uses two different priorities in the router and hence two virtual channels. In general, the RTER-FP scheme limits the number of available priorities to the number of VCs supported by the NoC.

The synthesis shows that the simple approaches (RM-RR, RM-SPP) introduce less than 5% overhead to the area of the NI module. The more complex approaches (RM-ALD, RM-DTS) introduce a slightly higher overhead (5.5%). The increase for RM-ALD and RM-DTS results from the additional information and control logic needed to estimate the path or the rate limiter settings. However, all the approaches allow accounting for more priority levels than are available in the architecture. Hence, whenever more than the two priorities supported by the scheduler in the router (RTER-FP) are necessary, the overhead of the RM-based approaches will not change. In contrast, NoCs with classic prioritization done in routers will have to increase the size of routers by adding additional buffer space to prevent overflows or backpressure. Supplemental results for schedulers with even higher complexity are available in [146].

Unit     RTER-FP   RM-RR   RM-SPP   RM-ALD   RM-DTS
#Regs    2581      2674    2702     2707     2715
#LUTs    4925      5093    5160     5160     5162

Table 5.2: Synthesis results of the RM on a Virtex-6 LX760 FPGA.

The initial evaluation of the client functionality is realized as an extension of the NI from the baseline architecture presented in Section 3.1.2. The NI used as the basis for the experiments is already equipped with rate limiters and translation tables allowing memory-mapped NoC address resolution. The NI extensions required by clients follow the design from Section 4.2.3. The evaluation has been conducted by Spieker in [131]. The synthesis results are given in Table 5.3. The table lists all NI modules which have been extended to form the basis of the client functionality. The changes include the address translation table (for the resolution of a local address into NoC routes) (atransl0), the packetizer (packetizer0), the AHB slave interface of the NI (slv_if0), and the client itself (resbroker0). For the sake of comparison, the table also reports results for higher-level modules, i.e., the whole network interface (netif0) without the client functionality as well as the entire MPSoC system (idamc). It is easily visible that, in comparison to other modules of the NI, clients exhibit a low demand for FPGA resources (1% of the NI). A detailed evaluation is available in [131].

Module        Slices   Slice Reg   LUTs     LUTRAM   BRAM/FIFO
idamc         43,180   66,559      82,636   76       313
netif0        2,761    3,328       6,033    0        3
atransl0      73       12          132      0        1
packetizer0   254      17          447      0        0
slv_if0       397      621         1,051    0        0
resbroker0    30       70          79       0        0

Table 5.3: Synthesis results of a client as an extension of the network interface on a Virtex-6 LX760 FPGA, from [131].

Moreover, in many setups, the usage of the RM may save resources which would otherwise have been spent to assure the real-time properties of synchronized senders. For example, Figure 5.28 presents the area overhead resulting from the implementation of RM-based and SPP-based arbitration (the latter performed locally in routers) for NoCs of different sizes and numbers of employed priorities. The results are based on the implementation and synthesis of the IDAMC platform with routers offering five-flit buffers per VC per ingress port. For small NoCs with only a few HRTTs, the overheads of SPP and

RM are comparable, e.g., 3600 vs. 3200 LUTs for four priorities (VCs). However, in case of larger designs, the RM reduces the area overhead significantly, e.g., for a design with eight priorities (VCs) and 32 routers, the area required by the RM is three times smaller than in case of SPP.


Figure 5.28: Area overhead resulting from the RM in comparison with SPP done locally in routers, based on [75].

HW/SW Co-Design

Finally, the RM unit and client can be implemented in software by re-using existing platform components. The presented results are based on the specification from Section 4.1.3. The client logic covers the processing of application requests, the synchronization with the RM and the programming of the rate limiters provided by the NI (i.e., it does not include functions of the NI such as packetization or addressing). The initial implementation of the control layer was written in C for the Leon3 processor supported by the IDAMC architecture. The compilation was done using a standard GCC compiler (v4.7.3), i.e., the compilation process has not been optimized with respect to any quality-of-service mechanism. The code size for the implementation of clients is approx. 12 kB; the code size for the RM unit is approx. 20 kB. Additionally, the size of the tables with the settings for synchronization scenarios depends on the complexity of the protocol. In the considered scenario, a LUT entry was made of a ScenarioID (8 bits, i.e., max. 256 senders, consequently the SenderID is also 8 bits) and a priority (4 bits), adding up to 20 bits in total. A settings entry was made of ScenarioID, SenderID, rate control settings (24 bits), route (32 bits) and VC (4 bits), giving a total of 76 bits per entry.
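For illustration, the two table entries can be written down as C bit-fields (a sketch only; the field names are assumptions, and the physical layout of C bit-fields is implementation-defined and padded by the compiler):

#include <stdint.h>

struct lut_entry {                /* 8 + 8 + 4 = 20 bits            */
    uint32_t scenario_id : 8;     /* max. 256 scenarios             */
    uint32_t sender_id   : 8;     /* max. 256 senders               */
    uint32_t priority    : 4;
};

struct settings_entry {           /* 8 + 8 + 24 + 32 + 4 = 76 bits  */
    uint32_t scenario_id : 8;
    uint32_t sender_id   : 8;
    uint32_t rate_ctrl   : 24;    /* rate limiter settings          */
    uint32_t route;               /* 32-bit source-routing path     */
    uint32_t vc          : 4;     /* virtual channel                */
};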

This evaluation shows that the proposed control layer is feasible and can be realized at minimal overhead. The memory overhead resulting from the settings required for synchronization scales linearly with the number of scenarios and the number of senders. The runtime of the RM logic depends directly on the size of the RM’s buffers, which is statically configured at design time. Thus, the RM arbitration is performed in constant runtime. Please refer to Section 5.5.1 for results regarding the execution times of the synchronized software.

5.5 Case Studies with Commercial and Research MPSoCs

In order to demonstrate the effectiveness of our approach, two implementations are discussed, using two different MPSoCs as a baseline: the commercially available MPPA [40] and the research-oriented IDAMC [140]. Both MPSoCs target real-time applications with high performance requirements. Consequently, the QoS control layer is used to synchronize memory transfers, as these transmissions allow us to challenge the performance and the scalability of the system [153].

5.5.1 IDAMC - Research Platform

The Integrated Dependable Architecture for Many Cores (IDAMC) is a many-core processor platform developed at the Institute of Computer and Network Engineering of the TU Braunschweig. Its primary purpose is the evaluation and investigation of how multi- and many-core architectures can be used in timing- and safety-critical applications. The IDAMC [140] is a configurable many-processor platform based on a NoC composed of nodes (N) connected in a 2D mesh topology. The interconnect connects up to 64 nodes (N0-N63), each composed of a router (R) and up to four tiles (T1-T4). The tiles are based on the open-source IP library from Gaisler [53]. Each tile contains a bus-based (AMBA) sub-system, which may include one or multiple processors (up to 16 Leon3 SPARC cores), on-chip memory, a DDR2 memory controller and other peripherals, e.g., interfaces to Ethernet. The routers are equipped with up to 8 ports (4 neighbors, 4 local ports) and offer support for a variable (adjustable) number of virtual channels. The NoC offers a general-purpose (i.e., performance-optimized) architecture with flexible source routing and wormhole switching. Scheduling in routers is performed using round-robin-based arbitration (iSLIP protocol [101]) or static priorities assigned to separate virtual channels. To support quality of service, nodes are equipped with fine-grained traffic shapers (rate control). For safety-critical communication, IDAMC applies scratchpad memories and DMA engines imposing a large granularity of transfers. Consequently, the IDAMC NoC architecture is

efficiently analyzable with the CPA framework using the methods from [44] and [142], and can therefore be used for the implementation of the control layer. The implementation of the control layer in the IDAMC can follow in hardware, in software or as a hardware/software co-design, due to the availability of its HDL sources and the FPGA deployment.

Multi-threaded application: ADA

The evaluation of the control layer on the IDAMC platform was done with the Artificial Demonstrator Application (ADA) use-case provided by Airbus within the scope of the EMC2 project. The results have been reported in [26], and the presented evaluation is based on this document. ADA mimics the behavior of a Helicopter Terrain Awareness and Warning System (HTAWS), which provides a comprehensive, map-based overview of the helicopter’s surroundings in order to prevent avoidable collisions with the ground or obstacles. Note that its functionality is analogous to an automotive driver assistance system. Its main goal is to support the steering of the aircraft in the following conditions: flying at night, changing weather with poor visibility, rough terrain, or low altitudes.


Figure 5.29: Main processing pipeline of the Artificial Demo Application (ADA) from Airbus mimicking the behavior of Helicopter Terrain Awareness and Warning System for helicopters, based on [26].

The ADA is built from 4 to 36 periodically activated threads. These threads construct a single frame with a period of 67 ms and are designed for execution on a symmetric multiprocessing (SMP) system. A single frame is composed of a variable number of internal minor frames which are not synchronized. The main application (data) flow comprises 4 major pipeline stages (see Figure 5.29). Two stages (decompress and draw) can be further parallelized (data parallelism). The depth of these two parallel stages can range from 1 to 16 threads each, as the parallel operations are independent. In the reference implementation provided by Airbus, all 4 stages are executed sequentially on a single core with a total period of 60 ms. However, the execution of the stages can be further pipelined to increase the overall throughput.

However, a careful evaluation of the application shows that only a further parallelization of stage 2 can lead to a speed-up of the application. This is because the processing time of the pipeline stages is highly unbalanced, i.e., stage 2 consumes most of the processing time. Due to this imbalance, the operation of the decompression stage has been spread over multiple tiles when porting the ADA to a multi-tile IDAMC setup. All other remaining stages are executed on the same single tile. Figure 5.30 depicts this scenario for a 4-tile instance of IDAMC. In the figure, T0 (Tile 0) executes the code for pipeline stages 1, 3 and 4, while T1, T2 and T3 are responsible for processing pipeline stage 2.


Figure 5.30: Placement of stages on tiles and data flow through the NoC. In the figure, T0, T1, T2 and T3 are IDAMC tiles, based on [26].

Notice that, depending on the size of the input processed by the real HTAWS application, pipeline stage 2 could be distributed over more than three processing tiles. Moreover, regardless of the number of tiles over which pipeline stage 2 is distributed, this implementation strategy requires chunks of data to be transferred over the network. For instance, in Figure 5.30, the data produced by the first pipeline stage (Load) must be sent over the NoC from T0 to T1, T2 and T3. Similarly, the output of the second pipeline stage (Decompression) must be sent over the NoC from T1, T2 and T3 back to T0.

Finally, in order to synchronize the operation of the different pipeline stages, a barrier synchronization function is needed.


Figure 5.31: IDAMC experimental setup for testing ADA with the control layer, based on [26].

The barrier is provided by a simple, shared-memory based implementation built up from two-processor barriers, called the “Butterfly Barrier” [28]. This approach works with minimal resources (only a few shared memory words are needed) and without adding any data traffic to the NoC when a processor is actively waiting for the barrier to be reached by the other processors.
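A minimal sketch of the two-processor building block, generalized over log2(n) stages in the spirit of the Butterfly Barrier [28] (simplified: plain spin-waiting on volatile flags, without the memory-ordering primitives a real IDAMC implementation would additionally require):

#define NPROC  4                          /* number of processors (power of two) */
#define STAGES 2                          /* log2(NPROC)                         */

static volatile int flag[STAGES][NPROC];  /* only a few shared memory words      */

void butterfly_barrier(int me)
{
    for (int s = 0; s < STAGES; s++) {
        int partner = me ^ (1 << s);      /* pairwise partner in stage s   */
        while (flag[s][me])               /* wait until the partner has    */
            ;                             /* consumed our previous flag    */
        flag[s][me] = 1;                  /* announce arrival              */
        while (!flag[s][partner])         /* wait for the partner          */
            ;
        flag[s][partner] = 0;             /* consume partner's flag        */
    }
}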

Setup and Metrics

While the data processing of all pipeline stages happens within a tile and is therefore independent of applications running elsewhere in the system, the data transfers between the main tile and the three processing tiles of the decompression stage depend on the availability of the NoC. For example, consider the scenario depicted in Figure 5.31, in which an application running on T3 (“streaming app”) needs to regularly transfer data to T4 (“streaming target”). Notice that the path that a NoC packet sent from T3 must traverse to reach T4 overlaps with the paths that the tiles running the Airbus application use to communicate with each other. This means that the performance of ADA can be degraded by the streaming application.

Therefore, to protect the safe and efficient execution of the use-case, the control layer is used with the setup presented in Figure 5.31. In order to evaluate the temporal isolation properties of the proposed control layer, a total of four tiles are used to run ADA, three tiles generate interference and one tile executes the RM. Two pairs of scenarios are investigated (a pair refers to a set of two executions: one in which the requestors use the NoC without supervision, and one in which the requestors negotiate access to the NoC with the RM). As performance metrics, the following are considered:

- The latency observed in order to process 10 batches of data, i.e., the amount of time required for 10 batches of data to go through each of the 4 pipeline stages of ADA. Each batch of data is 200 words.
- The latency of individual data transfers.

                    10 Batches Latency
Interference Size   No interference   With interference (without RM)   With interference (with RM)
8 KB                5,815 ms          8,302 ms                         9,338 ms
128 KB              5,815 ms          14,545 ms                        9,182 ms

Table 5.4: Temporal isolation evaluation: latency to process 10 batches of data, based on [26].

For the evaluation, IDAMC is prototyped on a Virtex-6 FPGA. The chosen IDAMC topology is the one depicted in Figure 5.31, in which a total of 8 tiles are present, organized in a 2x4 mesh. The arbitration in routers follows the principles of the iSLIP protocol [101]. The design works with a clock of 80 MHz. Moreover, the NoC of IDAMC is configured with 2 virtual channels, one for data transfers and another one for control information (only used if the access control layer is employed). The buffer queues can store up to five flits.

Results

In the first series of experiments, the interfering requestors make transfers that are 8 KB long. In the second, the transfer size is increased to 128 KB. The latency to process a total of 10 batches (each with 200 words) of data is measured and presented in Table 5.4. The table summarizes the results obtained running ADA on four processing tiles, with and without the presence of interference (and with and without the use of the control layer).

If no interference is present (leftmost column), the latency to process 10 batches of data is only 5,815 ms, which is significantly smaller than the values observed if interference is present. In the presence of interference and without a NoC access control layer (third column), the latency to process 10 batches of data depends drastically on the size of the transfers performed by the interfering requestors. This is because, as discussed in the previous section, each transfer is sent as a single data chunk in the NoC, which blocks other packets trying to use the same virtual channel. In the presence of interference, but with a NoC access control layer (rightmost column), the latency to process 10 batches is independent of the size of the transfers from the interfering requestors (although a small difference is seen, it is considered negligible). This is because the NoC access control layer preempts the interfering requestors when ADA wants to use the NoC.

Finally, it is worth noticing that, even though the RM has preemption capabilities, control messages between the requestors and the RM must be exchanged, which takes extra time. Consequently, ADA executes slower in a system with the RM than in a system with no interference (and no RM), i.e., the values in the rightmost column are larger than the values in the leftmost column.


Figure 5.32: Worst-case guarantees for a burst of 16 transmissions with jitter J=10% of P for the ADA benchmark running on the IDAMC architecture: (a) transmission latency, (b) protocol overhead resulting from the RM, based on [26].

Considering the investigated scenarios, it is also interesting to evaluate the latency of individual data transfers. In order to be concise, this evaluation focuses on the latency of the data transfers performed from Tile1 to Tile2 in Figure 5.31 (Tile2 implements the decompression stage of the processing pipeline and receives its input from Tile1). The measured results are depicted in

Figure 5.32(a) and Figure 5.32(b). To facilitate a comparison, both figures employ the same scale. Notice that one transfer is required per batch of data; hence, for each scenario, the latencies of 10 transfers are shown. In a system that relies on the NoC admission control layer, the latencies of the transfers from ADA behave very similarly, regardless of how large the interfering transfers are (notice how the gray bars in Figure 5.32(a) are similar to the ones in Figure 5.32(b)). In summary, the use of the RM and the control layer enforces predictable transfer times. In a system without a NoC admission control layer, the transfer times for ADA (black bars) can vary drastically, depending on how much competition is observed in the NoC, i.e., whether an ADA packet is stalled because a packet from the interfering requestors is being transferred. Moreover, as expected, the latencies are significantly higher if the interfering requestors make 128 KB transfers.

Furthermore, the results from the software implementation of the control layer have proven that, although the scheme follows the design principles of Software Defined Networks (SDN), these principles are not naively transferable to the embedded real-time domain. An implementation entirely in software results in excessively high synchronization latencies, and the dynamic control must be applied to processing nodes rather than routers. For instance, the high latencies (milliseconds in the results from Table 5.4 and Figure 5.32) are driven by two main factors: custom solutions resulting from the research-oriented IDAMC design (e.g., stacked controllers (AMBA, NoC) and the low processor frequency of 100 MHz) as well as the high processing time of the software stack. The results have confirmed that a straightforward implementation of the SDN paradigm for NoCs violates the properties of the synchronized workloads (e.g., short latencies of memory accesses) and can be applied only to selected setups with a low synchronization frequency. Recall that the introduced protocol overhead remains constant with respect to the transmission latency (i.e., the duration of synchronization) and can therefore be acceptable if synchronizations do not happen frequently. Finally, the control layer provides a solution for the formal verification of safety-critical systems, which is not the case for SDNs.

5.5.2 KALRAY MPPA - Commercial Platform

The initial evaluation is done considering the MPPA2-256 (Bostan) architecture designed and commercialized by Kalray. The Multi-Purpose Processing Array (MPPA) integrates 256 processing engine (PE) cores and 32 resource management cores on a single 28 nm CMOS chip. These cores are divided into sixteen computing clusters and four I/O subsystems, see Figure 5.33.

Each of the I/O subsystems contains a DDR-SDRAM memory (with access to up to 64 GB of external DDR), one Ethernet controller, one PCI Express (PCIe) controller, 64 GPIOs, two quad-cores with a shared D-cache, an on-chip memory of 4 MB, and other I/O devices [40].


Figure 5.33: The block diagram of the MPPA2-256 (Bostan) architecture, based on [71].

The MPPA provides two distinct physical network layers: a control NoC (C-NoC) and a high-bandwidth data NoC (D-NoC) for DMA transfers. Both NoCs support full-duplex links (4 B/cycle), wormhole switching, 5-port NoC routers, and output buffering without virtual channels. The FIFO buffer in the router is 401 flits deep for each queue in the D-NoC and eight flits deep for each queue in the C-NoC. The 2D torus topology, see Figure 5.34, connects 32 routers. Sixteen routers (nodes 1-16) are connected to a processing node (cluster), whereas the remaining ones are connected with I/O peripheral banks on the top (T1-4), bottom (B1-4), left (L), and right (R) edges of the chip.

The MPPA routers implement round-robin arbitration, and the nodes offer rate regulators (rate limiters) with fine-grained traffic control, cf. Section 3.6.5. This flow regulation is adjusted for the formal verification done with the network calculus framework [89]. The network calculus establishes a set of linear constraints on the link bandwidths (“capacity constraints”) and on the sizes of the routers’ FIFO queues (“backlog constraints”). Consequently, the admission control is done for each network node independently and is controlled by a packet shaper and a traffic limiter in tandem. These units allow adjusting the bandwidth quota as well as a maximal payload for each regulator per temporal window, which is global for the whole D-NoC. The QoS guarantees rely on the absence of buffer overflows in the NoC

routers [41]. Therefore, the guarantees directly depend on the number of simultaneously running senders and the amount of transmitted data.
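In generic (σ, ρ)-notation, the two constraint families can be sketched as follows (the notation is assumed here for illustration and is not copied from [89] or [41]): each regulated flow i is bounded by an arrival curve α_i(t) = σ_i + ρ_i t, and the admissibility of a configuration requires, for every link l and every FIFO queue q,

\begin{align*}
  \sum_{i \in \mathcal{F}(l)} \rho_i \;&\le\; C_l
      && \text{(capacity constraint per link } l\text{)} \\
  \sum_{i \in \mathcal{F}(q)} \sigma_i \;&\le\; B_q
      && \text{(backlog constraint per queue } q\text{)}
\end{align*}

where C_l denotes the link bandwidth, B_q the FIFO depth, and \mathcal{F}(\cdot) the set of flows traversing the resource. Both families shrink the admissible region as more senders run simultaneously, which is exactly the dependence noted above.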


Figure 5.34: 2D torus topology of the C-NoC and D-NoC interconnect layers used by the MPPA architecture, based on [71].


Figure 5.35: Dataflow of the PARSEC bodytrack benchmark, based on [20] and [149].

Bodytrack application from PARSEC benchmark.

The evaluation is done using the bodytrack computer vision application from the PARSEC benchmark [20]. The presented analysis of the benchmark is based on the results from [149]. Bodytrack provides a workload from a computer vision application designed by Intel. The benchmark applies an annealed particle filter for tracking the 3D pose or movement of a markerless human body [13]. Its behavior is similar to modules foreseen for autonomous driving, i.e., Advanced Driver Assistance Systems (ADAS), where a car relies on computer vision for its real-time interactions with the environment without any available aid, e.g., constrained behavior.

Figure 5.35 presents the data flow of the bodytrack application. The recorded video frames are stored in the main DDR-SDRAM memory. The ”GetObservation” module reads four original camera images and four foreground maps. The experiments used PARSEC’s native input set: 4 cameras, 261 frames, 4000 particles and five annealing layers. Each frame has a resolution of 640x480 pixels and corresponds to 300 kB, which makes a total of 2.4 MB of input for the ”GetObservation” module. The results from the ”GetObservation” module are then provided to the ”CalcWeight” module. They contain four edge maps and four foreground maps (2.5 MB). For each of the five annealing steps, the ”Generate” module transfers 4000 new particles to the ”CalcWeight” module (1.1 MB). Next, the results from ”CalcWeight” are provided back to the ”Generate” module (the same 1.1 MB). Once the annealing of the frame finishes, the ”CalcWeight” module delivers 106 kB of data to the ”Output” module. Finally, the ”Output” module fetches the four original frames (1.2 MB) and produces an output image of 301 kB with a marked position of the body.

The stages of the bodytrack benchmark (GetObservation, CalcWeight, Generate, Output) operate with the Pthreads library. Consequently, a further parallelization within stages is possible, e.g., computing the gradient magnitude and threshold, applying a Gaussian blur when creating the edge map, generating new particles, and calculating particle weights. The multithreading follows the map-reduce principle: the main thread distributes the workload among worker threads and resumes when it receives the results from all independent computations.
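This map-reduce structure can be sketched with Pthreads as follows (the worker body and all names are illustrative and not taken from the benchmark sources): the main thread forks one worker per particle chunk and resumes only after joining all of them.

#include <pthread.h>

#define WORKERS 4

typedef struct { int first, last; double partial; } chunk_t;

static void *calc_weights(void *arg)      /* "map" step */
{
    chunk_t *c = (chunk_t *)arg;
    c->partial = 0.0;
    for (int p = c->first; p < c->last; p++)
        c->partial += 1.0;                /* stands in for per-particle work */
    return NULL;
}

static double run_annealing_step(int particles)
{
    pthread_t tid[WORKERS];
    chunk_t   job[WORKERS];
    int per = particles / WORKERS;

    for (int i = 0; i < WORKERS; i++) {   /* distribute the workload */
        job[i].first = i * per;
        job[i].last  = (i == WORKERS - 1) ? particles : (i + 1) * per;
        pthread_create(&tid[i], NULL, calc_weights, &job[i]);
    }
    double sum = 0.0;                     /* "reduce": resume only after
                                             all workers have finished  */
    for (int i = 0; i < WORKERS; i++) {
        pthread_join(tid[i], NULL);
        sum += job[i].partial;
    }
    return sum;
}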

Setup and Metrics

This section presents an exemplary deployment of the bodytrack functionality from the PARSEC benchmark on the MPPA using the control layer. The description is based on [149], which contains further details. The limiting factor for the deployment of the benchmark is the amount of data and instruction memory available per cluster. Consequently, the presented solution is done under

the assumption that each cluster has at least 2MB of internal (on-chip) memory. Figure 5.37 presents the mapping of the Bodytrack modules as well as the RM unit.


Figure 5.36: Workflow of the PARSEC bodytrack benchmark adjusted for the 2 MB scratchpad memories in clusters, based on [20] and [149].

The operations of the modules ”GetObservation” and ”CalcWeight” are each divided into two parts to fit the available scratchpad memories, as presented in Figure 5.36. Consequently, during each computational cycle of bodytrack, the ”GetObservation1” module requires 4 frames and the remaining class ”TrackingModel” (1.3 MB). Later, the ”CalcWeight1” module uses the ”TrackingModel” class with four edge maps. The ”GetObservation2” module receives four foreground maps (1.2 MB); after finishing its operation, the ”CalcWeight1” module transmits them to the ”CalcWeight2” module. To execute each of the annealing steps, the ”CalcWeight1” module receives 4000 new particles from the ”Generate” module. Later, the ”CalcWeight1” module uses them for computation along with the input from ”GetObservation1”. The ”CalcWeight2” module receives the interim results (1.1 MB) and processes the 4000 particles along with the four foreground maps from the ”GetObservation2” module. Its output is provided to the ”Generate” module. After the annealing is finished, the ”Output” module receives 106 kB of data from the ”CalcWeight2” module. Finally, the ”Output” module may generate and write the final image (300 kB) to the DDR-SDRAM memory. This operation requires the four original frames from the ”GetObservation2” module and the input from the ”CalcWeight2” module.

In order to deploy the bodytrack benchmark, the upper half of the MPPA architecture is used, see Figure 5.37. The main mapping goal was to minimize the distances between the modules. Both ”GetObservation” modules are placed one hop from the left I/O sub-system.


Figure 5.37: Mapping of bodytrack modules to the MPPA nodes, based on [149].

After consideration of the annealing steps, which must be conducted in a loop, the ”CalcWeight1” and ”CalcWeight2” modules as well as the ”Generate” module are mapped in nearest proximity. Moreover, the ”CalcWeight2” module has a direct connection to the left I/O sub-system to provide the data to the ”Output” module. The results from both ”GetObservation” modules are stored intermediately in the DDR-SDRAM. Later, the ”CalcWeight1” and ”CalcWeight2” modules can read this data directly (notice the one-hop distance). This also helps to keep the executions of ”GetObservation” and ”CalcWeight” in order. The ”Output” module is running on a quad-core processor in the left I/O sub-system connected to node L2. Due to the full-duplex lines of each MPPA link, the transmissions from and to the I/O sub-system do not interfere. The mapping proposed in Figure 5.37 depicts only one exemplary solution, as the 2D torus topology offers rotational symmetry.

Clearly, the integration of the QoS control plane should, at best, not add to the complexity of the system during operation, for both cost and performance reasons. Due to the ASIC deployment of the MPPA, a software implementation of the client and the RM has been considered. The resulting design follows the principles of the HW/SW co-design implementation of the control layer described in the previous chapter. For example, the considered centralized setup for the QoS control plane consists of clients and the RM. Processor clusters in the MPPA architecture are equipped with special supervisor/”high-performance” nodes (used for the system’s boot-up and configuration) which could be used for hosting and accelerating the main RM logic as well as the clients. The initial stand-alone software library was

programmed in C using the GNU toolchain supported by the MPPA, with standard libraries (uClibc, Newlib). The RM required between 12 kB and 128 kB depending on the number of synchronized applications and the complexity of the use-case. Clients required between 12 kB and 48 kB depending on the use-case. As the complexity of the central RM unit is higher than that of a client, it can be implemented in the form of a stand-alone processing node; therefore, it is mapped to Cluster 11. Note that the pure benchmark functionality has been compiled as a bare-metal C library with a standard GNU toolchain (gcc ver. 4.7.3); therefore, its size may change depending on the form of deployment on the MPPA. The control messages were transmitted on the independent, physically separated control NoC (C-NoC).

To evaluate the temporal isolation properties of the proposed control layer, a formal worst-case analysis of the architecture has been conducted. The developed framework accounts for wormhole switching and considers the effects of pipelining and parallel transmissions. The details of the implementation for the MPPA are available in [149]. The following setup of the MPPA architecture has been used for the evaluation: flit size: 32 bits; maximum payload per packet: 32 flits; protocol overhead per packet: 2 flits; link bandwidth: 2.4 GB/s; constant router delay: 2 ns; buffer size: 12 packets. Two pairs of scenarios are investigated: one in which the requestors use the NoC without supervision, and one in which the requestors negotiate access to the NoC with the RM. For evaluation purposes it is assumed that the bodytrack modules are safety-critical (i.e., have hard real-time requirements) and must not lose any data, as would be the case for ADAS modules. Consequently, the goal is to evaluate the guarantees for the bodytrack functions considering different levels of load from interfering senders. These interfering senders are other applications (BE and soft real-time) sharing links with the considered critical modules. The load assignment for BE and SRT senders is enforced with rate regulators. Note that such resource sharing can happen in the MPPA, as bodytrack occupies less than 30% of the available processing nodes on the chip. As metrics, the worst-case backlog (the worst-case number of packets in a router’s buffer queue) and the worst-case latency of a transmission have been selected.
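As a quick sanity check of these parameters, the zero-load latency of one maximum-size packet can be computed as follows (a back-of-the-envelope sketch that ignores contention and regulator delays; the three-hop path is an arbitrary example):

#include <stdio.h>

int main(void)
{
    const double flit_bytes   = 4.0;      /* 32-bit flits            */
    const double packet_flits = 32 + 2;   /* payload + header flits  */
    const double link_Bps     = 2.4e9;    /* 2.4 GB/s                */
    const double router_ns    = 2.0;      /* constant delay per hop  */
    const int    hops         = 3;        /* example path length     */

    double serialization_ns =
        packet_flits * flit_bytes / link_Bps * 1e9;   /* ~56.7 ns */
    double latency_ns = serialization_ns + hops * router_ns;

    printf("zero-load latency: %.1f ns\n", latency_ns);  /* ~62.7 ns */
    return 0;
}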

Results

The MPPA is efficiently analyzable with the network calculus framework [41] or with the SLA framework [149], and is therefore capable of providing the inputs for the admissibility tests. Consequently, for selected accumulated load values (from all synchronized senders), it is possible to obtain upper bounds on the worst-case latencies, provided the loads do not exceed the available link bandwidth.

For instance, Figure 5.38(a) presents the worst-case latency for the ”GetObservation” module. The results are based on the work of Wang reported in [149]. The

”GetObservation” module’s bandwidth requirements have been scaled depending on the frequency and resolution of the frames from the video cameras constituting the main input (from 10% to 90% of the link bandwidth).


Figure 5.38: Worst-case guarantees for a burst of 10 transmissions: (a) transmission latency, (b) backlog in an MPPA using rate limiters. Results are based on [149].

It is visible that in most scenarios the worst-case latency is below 50 ms, which is acceptable when compared to the assumed deadline of 200 ms. Consequently, the 2.4 GB/s link offered by the MPPA is more than sufficient for the selected setups.

However, these formal quality-of-service guarantees rely on the absence of a FIFO queue overflow in the NoC routers, as discussed in [41] and [149]. Therefore, they directly depend on the number of simultaneously running applications and the amount of sent data. The static resource allocation enforced by rate regulation

can control only the latter. Therefore, the guarantees are either overly pessimistic or cannot be given, i.e., there is an upper bound on the number of applications sharing a path. Figure 5.38(b) presents a worst-case backlog analysis conducted for an exemplary deployment of a bodytrack module from the PARSEC benchmark. In the considered scenario, along with the increasing requirements of a sender (resolution and frequency of video frames as % of bandwidth) and the increasing load from interfering senders, the backlog also increases. Consequently, in all scenarios, an interfering load higher than 30% of the bandwidth capacity leads to a violation of the safety guarantees.

The next series of experiments investigates the effects of the backlog in detail. Figure 5.39(a) presents a comparison of worst-case guarantees for an exemplary deployment of a bodytrack module from the PARSEC benchmark sharing a buffer in setups with different NoC loads induced by a varying number of interfering senders. For each series of experiments, 1000 different scenarios have been generated, with the interfering senders placed at most four hops from the target DDR3 controller. The results depict the percentage of schedulable scenarios, i.e., scenarios in which all transmissions finish before their deadlines. Consequently, in the MPPA, along with the increasing accumulated NoC load from interfering transmissions, the backlog also increases, and in most considered scenarios the guarantees cannot be given, i.e., the setups are not schedulable. For further details of the evaluation please refer to [149].

In contrast, the deployment of the RM results in a significant increase of the utilization by switching selected senders on and off; thus, the RM always keeps the backlog at an acceptable level, see Figure 5.39(b). The introduced overhead, in the considered scenario, was always below 5% of a transfer duration (with the RM placed three hops from the sender). Moreover, the RM may dynamically decrease or increase the injection rates of a particular node depending on the number of concurrently active applications. The mechanism is capable of enforcing symmetric and non-symmetric guarantees. In the former case, transmission rates decrease uniformly for all applications belonging to a synchronization scenario along with the increasing number of concurrent senders. In the latter, transmission rates also depend on the application’s importance and may differ between senders. The non-symmetric mode can be used in a mixed-criticality system to maintain the guarantees of the critical applications by limiting best-effort traffic, see Section 3.6.5. The presented results are based on [84].

[Plots of Figure 5.39 removed in extraction; both panels show the percentage of schedulable setups over the NoC load in %, for 2 to 10 interfering senders; panel (b) additionally shows the RM with 5 and with 10 senders.]

Figure 5.39: Percentage of schedulable scenarios (out of 1000 runs) for an MPPA running (a) without the RM and (b) with the control layer, based on [84].

5.6 Evaluation against requirements

Section 1.4 described the set of functional and non-functional requirements which NoCs should support to host workloads from real-time and/or safety-critical domains. The next paragraphs provide a detailed discussion of how to achieve these goals with the help of the control layer. The presented conclusions are supported by the experimental results from the previous sections. A summary with a detailed evaluation of the mechanism with respect to the initial requirements is

presented in Tables 5.5 and 5.6, which include a comparison with other real-time mechanisms for NoCs.

R1 Traffic Types

R1 refers to support for multiple classes of real-time traffic, e.g., GL, GT, BE. This goal can be achieved with the control layer because:

the control layer can implement different arbitration methods for different synchronized senders, groups of senders, or even selected nodes in the NoC (parts of the network)

clients are capable of providing fine-granular support for different admission control mechanisms (rate limiters, address translation tables), which allows treating short (cache-based) and long (DMA-based) transmissions differently (an illustrative sketch follows this list)

the mechanism permits the flexible, efficient and safe incorporation of scenarios where multiple different traffic types share the interconnect. Different resource allocation schemes can be applied to accommodate these traffic classes without the need for modification of the underlying router architectures.
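As a purely illustrative sketch of the second point, a client's admission-control table could hold one entry per sender or address range; the field and type names below are assumptions, not the client interface defined in this work.

```c
/* Hypothetical admission-control entry in a client (NI extension). */
typedef enum { TRAFFIC_GL, TRAFFIC_GT, TRAFFIC_BE } traffic_class_t;

typedef struct {
    traffic_class_t cls;        /* traffic class arbitrated by the RM      */
    unsigned rate_limit;        /* rate-limiter setting, flits per window  */
    unsigned addr_lo, addr_hi;  /* address range mapped to this entry, so
                                   short cache-based and long DMA-based
                                   transmissions can be told apart         */
} client_entry_t;
```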

R2 Workload Integration

R2 demands that NoCs provide efficient support for real-time requirements without the need to modify legacy code and other IPs. The proposed mechanism achieves this goal as the synchronization between the RM and clients can be fully transparent to the legacy code and be implemented as an extension of the underlying NoC infrastructure, e.g., extensions of the NIs. Note that the accuracy and granularity of synchronization constitute the main trade-off, as discussed in Chapter 3. Clients must distinguish between different transmissions, which requires synchronization with the RM. In case of the coarse-grained solution, the client's arbitration is applied to the accumulated workload (traffic from all senders running on this tile). This keeps the client complexity at a relatively low level as there is no need to recognize and store information about individual transmissions/connections initiated in the tile. The fine-grained integration of workload requires distinguishing between transmissions initiated by individual senders. This can be achieved through extensions of the address translation units (similar to MMU/MPU) in the NI. However, to achieve the highest efficiency, some adjustments of the code running on the core may be necessary. An example would be a new system call which would allow the client to precisely identify a sender (e.g. an OS would write the

appropriate value to the client's register). Note that such extensions will usually require only minor modifications of the OS/drivers for tasks running on the tile.
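A minimal sketch of such a system call is shown below; the register address, the register name and the function are hypothetical and only illustrate the idea of attributing subsequent traffic to a sender.

```c
#include <stdint.h>

/* Assumed memory-mapped client register in the NI (address is made up). */
#define CLIENT_SENDER_ID_REG ((volatile uint32_t *)0x40001000u)

/* Hypothetical OS-level call: executed before a task starts issuing
 * transmissions, so the client can attribute the traffic to the sender. */
void noc_set_sender(uint32_t sender_id)
{
    *CLIENT_SENDER_ID_REG = sender_id;
}
```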

R3 Flexibility

The proposed solution achieves flexibility of the design through support for different resource allocation strategies depending on a particular setup. As discussed, switching between different strategies for resource allocation in NoCs is relatively simple with the control layer. It requires only reprogramming of the RM module, which for this purpose can be delivered in software or as a microcoded processor. It is also possible to apply different resource allocation policies within regions of the same NoC. For instance, static priority-based scheduling can be applied to selected nodes, whereas strict temporal isolation of senders with TDM can be enforced in other regions. The arbitration can be fine-granular and limited to a specific set of resources (e.g. a path through the NoC, a virtual channel) or a mode of operation (e.g. performance-optimized for regular work and priority-based for emergency situations).

Finally, the application of different resource arbitration strategies does not require modifications of routers. Therefore, it is possible to offer one chip with a generic router architecture and switch on the real-time/safety features only in case of concrete deployments, i.e., offer safety and real-time as a feature for a specific design.
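The following sketch illustrates, under assumed names and types, why switching allocation strategies can be as simple as reprogramming the RM: the scheduling policy is just a replaceable decision function, while the routers stay untouched.

```c
/* Hypothetical pending-request list kept by the RM. */
typedef struct request {
    int sender;
    int prio;
    struct request *next;
} request_t;

/* A policy picks the next request to grant. */
typedef request_t *(*rm_policy_t)(request_t *pending);

static request_t *policy_static_prio(request_t *pending)
{
    request_t *best = pending;
    for (request_t *r = pending; r; r = r->next)
        if (r->prio > best->prio)
            best = r;
    return best;
}

static request_t *policy_fcfs(request_t *pending)
{
    return pending; /* first come, first served */
}

/* Switching the strategy, e.g. at a mode change, is one assignment: */
rm_policy_t rm_policy = policy_static_prio;
```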

R4 Dynamics

The mechanism handles dynamics in the system's behavior efficiently and safely. Firstly, it offers support for work-conserving resource arbitration schemes (a minimal sketch of the grant logic follows the list below). Therefore, the workloads are processed as soon as they arrive (or NoC resources are released), minimizing the penalty for senders with variable transmission times or changes in the amount of transmitted data. Furthermore, our solution allows the deployment of a sophisticated contract-based QoS provisioning without introducing complex and hard to maintain management schemes, known from static arbiters. Consequently, the QoS control plane dynamically adapts the NoC to the changing system behavior, mode or environment, which can be influenced by on-chip as well as off-chip factors, e.g.:

in-field software integrations, including upgrades, software modifications, and conditional or mode-dependent functions

transient errors resulting from technology downscaling, which raises the susceptibility of the hardware

self-optimization with respect to aging (temperature), power consumption, response time, or resource utilization

predictable fail-operational behavior, providing error recovery procedures with minimum performance overhead and without functional hazard

Recall that adjustments of the scheduling policy can also be done in already deployed chips, as they usually require only a reconfiguration of the RM unit. This permits optimizing, modifying and updating the real-time and safety policy along with updates of the running applications (workloads). Such adjustments can be hard or even impossible in case of other QoS mechanisms, which are usually an integral (hard-coded) part of the routers in a NoC. Finally, the control layer may provide self-aware arbitration where the RM adjusts its policy based not only on the predefined scheduling algorithm but also on monitoring data (e.g. temperature, access patterns). However, although such an approach allows a far-reaching optimization to achieve high performance, such systems may be hard to certify due to their complexity. Therefore, this challenge is left for future investigation.
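A work-conserving grant loop, continuing the hypothetical request_t/rm_policy_t names from the R3 sketch above, could look as follows; grant() stands in for sending the control-layer grant message and is not an interface defined in this work.

```c
/* Reuses the hypothetical types from the R3 sketch. */
typedef struct request request_t;
typedef request_t *(*rm_policy_t)(request_t *pending);

extern request_t  *pending_head;  /* assumed queue of pending requests */
extern rm_policy_t rm_policy;     /* currently active policy           */
extern void        grant(request_t *r);

/* Work-conserving behavior: as soon as a transmission releases its NoC
 * resources, the next pending request is granted immediately instead of
 * waiting for a statically assigned slot. */
void on_resources_released(void)
{
    if (pending_head)
        grant(rm_policy(pending_head));
}
```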

R5 Fairness

The control layer allows the NoC infrastructure to provide a fair allocation of interconnect resources whenever RT senders have different requirements. Firstly, the proposed mechanism supports different resource allocation schemes, e.g., round-robin, static and dynamic priorities, which can be applied depending on the deployed workloads. The RM has knowledge of the global state of the NoC, i.e., which sender is active and which resources are occupied. Using this information, the RM may dynamically decrease or increase the allocated NoC resources (e.g. adjust rate limiters) for a particular node depending on the current system mode. Each mode is determined by the number of currently active applications and sets the minimum time separating every two transmissions issued from the same application. The mechanism is capable of enforcing symmetric and non-symmetric guarantees. In the former, the amount of allocated NoC resources decreases uniformly for all synchronized senders along with the increasing number of connections (transmissions) running in parallel. In the latter, the amount of allocated NoC resources depends not only on the current system mode but also on the sender's particular requirements (e.g. deadline, slack, priority) and may differ between tiles and senders. The non-symmetric mode can be used in a mixed-criticality system to maintain the critical application guarantees while reducing BE traffic. Consequently, the introduced mechanism enforces behavioral models which are required to meet the critical timing constraints of the system for each application at runtime.

R6 Mixed-Criticality

Interfering HRTT and BE senders are synchronized using the control layer. This enables the RM to use dynamic priority arbitration and thus to give BE senders access to the NoC whenever slack and resources are available.
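A minimal sketch of such slack-based admission, under assumed names and abstract time units, checks whether a BE transfer fits into the slack of every hard real-time sender before it is granted.

```c
/* Hypothetical per-HRTT state tracked by the RM. */
typedef struct {
    double deadline;           /* absolute deadline                   */
    double worst_case_finish;  /* analysis-derived finish-time bound  */
} hrtt_t;

/* Grant a BE transfer only if it cannot endanger any HRTT guarantee. */
int admit_be(const hrtt_t hrtt[], int n, double be_duration)
{
    for (int i = 0; i < n; i++) {
        double slack = hrtt[i].deadline - hrtt[i].worst_case_finish;
        if (be_duration > slack)
            return 0;
    }
    return 1;
}
```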

R7 Switch-off QoS

R7 demands that the NoC architecture provide a mechanism to switch off real-time and/or safety mechanisms whenever there are no deployed senders with such requirements. Achieving this goal is straightforward with the control layer. The QoS arbitration introduced by the RM and clients is built on top of the existing routers. Therefore, by switching off clients it is possible to provide the generic NoC functionality. Note that the hardware overhead resulting from the unused mechanism (clients and the RM) is relatively low and most resources may be released for non-real-time setups. Recall that clients are created as extensions of the existing mechanisms in the NIs, thus only small extensions of interfaces and logic are necessary. Moreover, a software deployment of the RM and clients is also possible. In this case, the NI should only provide an interface for controlling its QoS mechanisms (e.g. programming of rate limiters or address translation tables) for a client running as a task on a processing node (e.g. an extension of the OS). Similarly, the processing node used to deploy the RM software can be directly recovered and applied for other purposes whenever there is no need for real-time and/or safety. Note also that the deployment of the control layer may be limited to selected parts (regions) of the SoC.

R8 Scalability

Requirement R8 demands that the NoC architecture provide performance (average and worst-case) proportional to the load at runtime (i.e., work-conserving arbitration) and not to other static factors, such as the number of senders. To do so, the control layer introduces a work-conserving scheduling of network traffic, i.e., it always tries to keep the network resources (links and buffers) busy whenever there are pending connections. This feature, along with the capability to run on top of performance-optimized NoC architectures, allows overcoming the limitations of strict temporal or spatial arbitration.

However, there are several design trade-offs which one must consider while implementing the control layer, as they directly influence the scalability of the approach. This is due to the additional latency resulting from the synchronization of the clients and the RM with the selected protocol. Each control message may interfere with other control messages during

transmission or processing. This overhead depends on the frequency, number and latency of the necessary synchronizations and re-configurations. Note that, in the considered embedded real-time and/or safety-critical domains, the behavior and characteristics of real-time applications are usually well specified and tested, in contrast to off-chip networks where the behavior of nodes is often unknown at design time. Therefore, it is possible to derive a load distribution to estimate this interference (i.e. the overhead of the synchronization) and assess if response times stay within the requested bounds. If the overhead is too high, the designer still has several mechanisms to control it:

Settings for some applications may be kept at a constant level in some or all modes, i.e., the number of modes of system work can be decreased. This limits the number of necessary synchronizations and re-configurations.

In systems where multiple disjoint sets of interfering applications exist, the designer may use different RMs to synchronize each of them independently.

For maximum performance, one may connect the clients and the RM with dedicated lines (star topology). In this case, only the processing delay must be considered.

Adjust the granularity of the synchronization. The overhead decreases, as a relative share, with the increasing length of a connection, as the protocol overhead is constant with respect to the transmission/connection duration (a short sketch quantifying this follows the list).

Decrease the complexity of the protocol or apply another resource allocation scheme.
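The granularity trade-off from the fourth item can be quantified with a back-of-the-envelope sketch; the function and the example numbers below are illustrative, not measurements from the evaluation.

```c
/* Relative cost of the synchronization protocol for one connection:
 * a (roughly) constant per-connection overhead t_sync is amortized
 * over the connection duration t_connection. */
double sync_overhead_ratio(double t_sync, double t_connection)
{
    return t_sync / (t_sync + t_connection);
}

/* Example: with t_sync = 20 cycles, a 400-cycle transfer pays ~4.8%
 * overhead, while a 4000-cycle transfer pays only ~0.5%. */
```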

The latency of control messages is crucial for the performance of the proposed mechanism. In order to minimize the interference with other traffic in the baseline NoC, the control messages are transmitted using the VC with the highest priority. More generally, control messages can be allocated to any available VC capable of giving latency guarantees, to a dedicated independent control NoC, to an additional bus (e.g. a bus-enhanced NoC), or to signal lines for maximum performance.

To summarize, the predominant trade-off resulting from global synchronization is between fine-granular adjustments (for increasing performance) and both temporal (interference) and hardware (size of clients and RM) overhead.

R9 Locality

R9 demands that the NoC architecture provides the possibility to prevent interleaving of different transmissions. The proposed admission control achieves this

goal through a holistic approach: (i) it preserves the locality of memory accesses, since access to the NoC is allocated for the entire transmission, and (ii) it adjusts the order of granted transmissions (connections) to mitigate the management overhead resulting from the arbiter in the peripheral (interference between the transmissions in the resource scheduler). The RM may introduce a resource-oriented arbitration to decrease the performance overhead between the NoC and the memory controller. Furthermore, the RM may improve the (formally proven) worst-case guarantees. A detailed discussion of the interface between the control layer and DDR SDRAM is provided in Section 3.8.
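One way to realize such resource-oriented ordering is sketched below under assumed names: among pending requests, the RM prefers one that targets the same peripheral resource (e.g. the same DRAM bank) as the last granted transmission; the bank-level interpretation is an assumption for illustration.

```c
/* Hypothetical pending request with the addressed peripheral resource. */
typedef struct rm_request {
    int target;                /* id of the addressed resource */
    struct rm_request *next;
} rm_request_t;

rm_request_t *pick_locality_aware(rm_request_t *pending, int last_target)
{
    for (rm_request_t *r = pending; r; r = r->next)
        if (r->target == last_target)
            return r;          /* keep the access stream local */
    return pending;            /* otherwise fall back to FCFS  */
}
```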

R10 Verification

R10 states that the NoC architecture should support efficient formal verification for standardization/certification purposes (whenever relevant). The control layer achieves this goal, as has been proven in related publications, e.g., [81], [78], [83]. In order to provide the predictability of the system, it is possible to compute a bound on the worst-case latency for each sender synchronized with the RM. The analysis takes into consideration other interfering senders as well as the overhead of the protocol. The timing relations of the individual transmissions can be abstracted by event models [61] to capture the worst-case and best-case behavior of every possible transmission arrival/activation pattern. Therefore, one may use temporal-analysis frameworks, such as the Compositional Performance Analysis (CPA) [61, 42] (which is applied in this work), to capture the dynamics of the system's behavior with event models.

Additionally, the proposed centralized architecture with the RM permits realizing a globally optimal network management based on the current network state. This allows the designer to avoid complex (and potentially lengthy) distributed network control protocols and synchronization scenarios which have to account for the network state before they can take appropriate measures, e.g., propagation of blocking or cyclic dependencies. Therefore, the control layer frequently allows providing tighter, formally proven end-to-end temporal guarantees in complex SoCs, and/or limits the amount of resources (e.g. buffers) necessary for providing such guarantees when compared to other state-of-the-art solutions.

Table 5.5: Evaluation of the control layer, part 1. The compared mechanisms are: (1) RM + baseline NoC (PS + VC + RL, static priorities); (2) dynamic temporal isolation (baseline NoC: PS + VC + RL, static priorities); (3) static temporal isolation (TDM); (4) performance-optimized NoC (PS, VC, round-robin).

Quality of Support for GL (R1):
(1) high: independent of the available NoC resources; through different allocation schemes it may be directly adjusted to the requirements of senders.
(2) medium: depends on the sender's priority, the number of VCs, and the number of priorities.
(3) medium: depends on the size of the slots and the synchronization of senders w.r.t. the TDM cycle.
(4) low: depends on the number of senders and the message size.

Quality of Support for GT (R1):
(1) high: possibility to efficiently schedule whole logical transmissions.
(2) medium.
(3) high.
(4) low: static isolation depends directly on the available resources.

Performance of BE senders in mixed-critical scenarios (R6):
(1) high: possibility to apply slack-based arbitration, low average latencies.
(2) low: usually the lowest priority, high avg. latencies.
(3) low: high avg. latencies; depends on the slot size and the number of senders.
(4) high: low avg. latencies.

Sender penalty for variable transmission size (R2, R3):
(1) low: support for work-conserving schedulers.
(2) low.
(3) high: depends on the slot size and the number of senders.
(4) low.

Sender penalty for variable jitter (R2, R3):
(1) low: support for multiple work-conserving schedulers.
(2) low: work-conserving arbitration.
(3) high: depends on the slot size and the number of senders.
(4) low: work-conserving arbitration.

Arbitration fairness for GL senders (R5, R3):
(1) high: multiple schedulers including dynamic priorities and round-robin.
(2) low.
(3) medium: depends on the slot allocation policy.
(4) low: depends on the distances and traffic from other senders.

Arbitration fairness for GT senders (R5, R3):
(1) high: multiple schedulers including dynamic priorities and round-robin.
(2) medium: depends on the available resources.
(3) medium: depends on the slot allocation policy.
(4) low: depends on the distances and traffic from other senders.

Performance penalty for other senders in setups with jitter and variable transmissions (R4, R3):
(1) no: the RM supports several work-conserving schedulers.
(2) no.
(3) yes (high).
(4) no.

Possibility to switch off QoS (R7):
(1) yes.
(2) yes, but BE workloads may suffer from the high performance penalty.
(3) no.
(4) yes.

Table 5.6: Evaluation of the control layer, part 2. Mechanisms numbered as in Table 5.5.

Hardware overhead in setups with disabled QoS (R7):
(1) low: most resources from the control layer can be re-assigned for other purposes; possibility to introduce a round-robin based scheduling on top of routers with SPP.
(2) medium: workloads may suffer from high jitter and out-of-order arrival of packets.
(3) high: requires a BE NoC, or traffic suffers from high avg. latencies.
(4) low: workloads may suffer from high jitter.

Support for scalability (R8):
(1) good: largely independent of the underlying HW resources, e.g. priority-based sharing of the same VC.
(2) poor: statically limited to the available HW resources.
(3) poor: depends on the number of senders and not their activities.
(4) good.

Scalability, temporal penalty (R8):
(1) medium: depends on the protocol overhead, i.e., the number and frequency of synchronized reconfigurations.
(2) low, if there are enough HW resources.
(3) high: depends on the number of senders and not their activities.
(4) low, but depends on the available HW resources.

Scalability, resource penalty (R8):
(1) low: proportional only to the load and independent of resources.
(2) high: a VC per priority level.
(3) high: requires optimization of the slot allocation, e.g. PhaseNoC.
(4) low, but depends on the available HW resources.

Support for locality (R9):
(1) yes: the RM decides the order of the transmissions, therefore non-preemptive priority-based schedules are also possible.
(2) no (yes only for the highest priority).
(3) yes, but requires long TDM slots, which decreases performance in setups with dynamics.
(4) no: each individual packet is treated equally, thus there is no protection for longer transmissions.

Pessimism of formal verification, static setups (R10):
(1) low: application of the global scheduling allows removing cyclic dependencies and efficiently capturing dynamics.
(2) medium: depends on the HW resources in the NoC, the number of VCs and the buffer sizes; low if there are enough resources, otherwise high.
(3) low.
(4) high: large worst-case scenarios with high interference.

Complexity of formal verification, static setups (R10):
(1) medium: depends directly on the complexity of the introduced synchronization protocol.
(2) low: depends on the amount of available NoC resources; low if there are enough resources, high otherwise.
(3) low.
(4) low.
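Returning to the event models named under R10: the standard upper event-arrival curve of a periodic activation pattern with jitter, as defined in the CPA literature [61], is eta+(dt) = ceil((dt + J) / P). The C sketch below only restates this textbook formula; the type and function names are illustrative, not taken from the tooling used in this work.

```c
#include <math.h>

/* Periodic-with-jitter event model: period P and jitter J. */
typedef struct {
    double period;
    double jitter;
} event_model_t;

/* Upper bound on the number of activations in any window of length dt. */
unsigned eta_plus(const event_model_t *em, double dt)
{
    if (dt <= 0.0)
        return 0;
    return (unsigned)ceil((dt + em->jitter) / em->period);
}
```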

5.7 Summary

This chapter provided the experimental evaluation of the proposed mechanism. The conducted experiments concentrated on the worst-case guarantees and the average performance achieved by senders synchronized with the proposed control layer. As discussed in Chapter 1, these criteria are of critical importance for future generations of NoCs in real-time and/or safety-critical domains. For evaluation purposes, diverse benchmarks have been used. The experiments include synthetic NoC workloads (for covering corner cases) as well as profiles of real applications (for testing average behavior). The latter group of traffic profiles encompasses memory traces from the CHSTONE benchmark [51] as well as multiple use-cases: MPEG-4 [19], the real-time video denoising application [95], the automated flight manager from Airbus, and finally, the object detection in the bodytrack application from the PARSEC benchmark suite. One of the most significant features of the control layer is its flexibility, allowing straightforward integration in different NoCs. Therefore, tests considered not only the generic NoC with virtual channels [37] (commonly applied in the related research) but also case studies with commercial (MPPA [40]) and research-oriented (IDAMC [140]) MPSoCs.

Experimental results have confirmed the design assumptions from Chapter 3. In most of the considered scenarios, the control layer was capable of providing high average performance along with tight real-time guarantees. Therefore, the results have proven that the proposed mechanism is capable of reducing the performance penalties resulting from other QoS schemes in real-time NoCs. Firstly, the comparison with TDM-based NoCs resulted in up to 80% improvement in the considered scenarios using DMA transfers. Similarly, dynamically adjusting the rate control in NIs for handling cache-based traffic resulted in up to 75% improvement when compared to the scenarios with static arbitration. The mechanism allows supporting multiple scheduling schemes (e.g. time-driven schedulers, static and dynamic priority-based arbitration) in the same baseline architecture. There is no need for arbitration-specific modifications. Moreover, the RM-based arbitration allows optimization of the NoC schedule for improving the synchronization of the processing nodes with the peripherals connected to the MPSoC. For instance, in case of the DDR3-SDRAM, the implementation of the control layer significantly decreased the latency of memory accesses (up to 70% improvement) while providing the real-time guarantees.

The conducted evaluation has also shown that the proposed synchronization can be efficiently applied to transmissions with different granularity levels (e.g. flits, packets, DMA transfers or activations of whole tasks). This introduces a fine-granular control of the overhead (resulting from the synchronization protocol) and permits keeping it at an acceptable level (e.g. below 5% of the duration of the considered NoC access).

[Chart of Figure 5.40 removed in extraction; it compares the Performance NoC, Spatial Isolation NoC, Static Temporal Isolation NoC (e.g. TDM), Dynamic Temporal Isolation NoC (e.g. SPP) and the RM NoC with respect to performance, implementation overhead, work-conserving behavior, safety/predictability, and analysis effort.]

Figure 5.40: Comparison of different QoS mechanisms for NoCs with the control layer. Special focus is placed on safety and performance in the presence of dynamics.

Next, the resource overhead resulting from the clients and the RM modules has been evaluated. The synthesis reports were conducted based on the implementation in the IDAMC platform [140], using a Virtex-6-LX760 Xilinx FPGA and Xilinx ISE 14.6 with default optimization settings and no special optimizations for the VHDL implementation. The synthesis results show that the simple approaches (e.g. round-robin) introduce less than 5% overhead to the area of the NI module. The more complex approaches (e.g. adaptive load distribution) introduce slightly more overhead (5.5%). This slight increase comes from the higher complexity of the control logic which is necessary for an online adaptation. However, the control layer simultaneously decreases the hardware requirements of the MPSoC due to the global synchronization and therefore reduces router costs, e.g., shorter buffer queues and simpler arbiters. The RM safely limits the size of buffers and head-of-line blocking without the need for back pressure. Moreover, in highly dynamic priority-based setups, the control layer can be used to decouple the number of priorities from the available HW resources, i.e., priority-based sharing of the same buffer queue. Sections 5.4 and 5.5 proved that in many systems the resource overhead from the control layer can be minimized through re-using the existing components of the MPSoC architecture (i.e. different NoC layers for control messages, existing resource management cores). Consequently, the hardware overhead is low and feasible for deployment in many contemporary and future setups. The introduced analysis framework allows not only to provide temporal

guarantees but also, even more importantly, to reach improved service guarantees in many setups when compared to static resource allocation schemes. The detailed results from the experimental verification are available in Section 5.2. The improved service guarantees are possible due to the global point of synchronization, simplifying the formal verification and capturing the system-wide dynamics. Note that guarantees can be given for the architecture without the hard trade-offs between predictability and performance or required HW resources (e.g. buffer size) which are known from other solutions.

Consequently, the control layer fulfills all the evaluation criteria, see Figure 5.40. The proposed mechanism constitutes a feasible and appealing alternative for future designs requiring simultaneously high performance and real-time guarantees.

Chapter 6: Conclusions

NoCs designed to host advanced embedded applications require, simultaneously, performance and efficient real-time guarantees in the presence of dynamics. Furthermore, new complex functions, e.g., autonomous driving, combine high load requirements with a need for safety guarantees and therefore must provide a verification methodology to comply with the safety standards.

Existing architectures apply Quality-of-Service mechanisms locally in routers and NIs to address these problems. Their primary goal is to minimize the non-functional dependencies between senders and to put a predictable upper bound on the interference in the interconnect. However, as shown in Chapter 2, this is frequently done at the cost of significantly reduced performance or high hardware overhead. The static platform management, as used in current safety-critical systems, is not sufficient anymore to provide the needed level of service. Dynamic platform management could meet the challenge, but it usually suffers from a lack of predictability and the simplicity necessary for certification of safety and real-time properties.

Solving this major challenge was the main goal of the presented work. In Chapter 3, a novel, global and dynamic control for NoCs with real-time requirements has been proposed. The arbitration between concurrent senders is done through protocol-based synchronization between a central scheduling unit, the Resource Manager, and local arbiters in the nodes, the clients. The behavioral models for senders are enforced with traffic regulators (shapers) in the NIs connecting the node with the NoC. Consequently, the admission control in the NoC is decoupled from the flow control decisions. This allows a path-oriented approach in which both the per-hop behavior of routers and the end-to-end properties of the communication can be unified.

To translate these assumptions into real designs, Chapter 4 provides simulation tools and a verification methodology for setups with different QoS requirements and traffic classes. From a conceptual point of view, the introduced NoC control employs a formal, model-based verification to prove the adherence to constraints

whenever a safe adaptation is necessary.

The results of the evaluation have been presented in Chapter 5. Firstly, the conducted experiments have confirmed that the underlying principles of the proposed scheme allow flexible implementations. Therefore, the solution can be applied to different NoC architectures and MPSoCs, if they comply with some general requirements defined in Chapter 3, e.g., predictable arbitration in routers, adjustable admission control in nodes, and a predictable latency of control messages. In most setups (e.g., the research-oriented IDAMC, the commercial MPPA platform), only small extensions of the underlying infrastructure were necessary for the incorporation of the control layer, e.g., additional registers in NIs and small client logic. The flexibility of the solution also allows a fine-granular resource control, e.g., an implementation entirely in hardware or a HW-SW co-design to minimize the overhead.

Simultaneously, the introduced resource management scheme decreases hardware overhead whenever hard real-time guarantees are necessary. This is possible due to the proposed arbitration based on the global state of the system. For instance, an RM allows safely limiting the load to avoid buffer overflows which could endanger safety (i.e., it permits smaller buffers) or decoupling the number of priorities in the NoC from the available VCs. Furthermore, the global and centralized arbitration allows handling dynamics in systems' behavior efficiently. As opposed to the frequently applied static arbiters, the sophisticated contract-based QoS provisioning can be implemented with the RM without introducing complicated and hard to maintain schemes or costly hardware extensions. The control layer supports the full range of the available schedulers, e.g., TDM, round-robin, as well as static and dynamic priorities. This allows flexibly and dynamically adjusting the arbitration depending on the system's operating conditions.

Consequently, the guarantees for senders with hard real-time requirements can be achieved while simultaneously offering high average performance to BE applications. Indeed, an experimental comparison with TDM resulted in up to 80% improvement in the considered scenarios with DMA transfers [81]. Similarly, the RM adjusting rate limiters in nodes (for handling cache-based traffic) resulted in up to 75% improvement [78]. The mechanisms offer high flexibility, supporting multiple scheduling schemes (time-driven schedulers, static and dynamic priority-based arbitration). The majority of experiments reported improvements in a range from 30% to 70% [81], [83], [75], [78], [77]. The introduced interface between the control layer and peripheral schedulers allows decreasing the latency of memory accesses while providing real-time guarantees, e.g., up to 70% improvement in case of DDR3 SDRAM [82].

Although the scheme follows the design principles of Software Defined Networks (SDN), it adjusts them for the requirements of real-time embedded systems,

e.g., automotive and avionics domains. The results from Chapter 5 have confirmed that a naive implementation of the SDN paradigm for NoCs is not possible due to the properties of the synchronized workloads (e.g., single memory accesses) and the specifics of embedded architectures. An implementation entirely in software results in excessively high synchronization latencies, and the dynamic control must be applied to processing nodes rather than routers. Finally, the control layer provides a solution for formal verification in case of safety-critical systems, which is not the case for SDNs.

Summarizing, this work has proposed a mechanism which is capable of achieving temporal predictability without compromising other benefits resulting from the application of NoCs in a SoC, e.g., scalability, modularity, flexibility and excellent performance. Moreover, the principles of the control layer are universal and flexible, thus offering a broad scale of possible industrial deployments. It is certain that such a synchronization mechanism is an essential step towards an efficient application of NoCs in critical applications with high performance requirements and dynamics, such as automated driving.

However, there are still multiple open questions which provide a direction for future research. Firstly, further synchronization of on- and off-chip traffic could be investigated. This includes communication between multiple RMs in the same system as well as shared resource reservations to provide end-to-end latency guarantees. Moreover, the control layer could simplify the integration of heterogeneous networks, e.g., running with different frequencies and providing a different protocol granularity. An integration of SDN-controlled Ethernet with on-chip MPSoC traffic, as foreseen for future automotive setups, seems especially interesting. Additionally, the suggested next step is to research global power management strategies as well as resilience and error control. The changes mentioned above could lead to significant improvements in guaranteed performance and safety for future MPSoC generations, allowing a higher density and complexity of integrated functions. Finally, future research could also investigate possible adjustments of NoC architectures in order to maximize the benefit from the control layer while minimizing the overhead from the synchronization protocol.

List of Publications

This appendix provides the list of all publications written by the author. The list is divided into thesis-related and thesis-unrelated publications. All presented works have been peer-reviewed.

Related to the Thesis

1) Adam Kostrzewa, Sebastian Tobuschat, Philip Axer and Rolf Ernst, "Supervised Sharing of Virtual Channels in Networks-on-Chip" in Proc. of SIES, (Pisa, Italy), June 2014. The article presents the challenges and sketches solutions for global, dynamic and work-conserving arbitration in Networks-on-Chip.

2) Adam Kostrzewa, Selma Saidi, Leonardo Ecco and Rolf Ernst, "Flexible TDM-based resource management in on-chip networks" in Proceedings of the 23rd International Conference on Real Time and Networks Systems (RTNS), vol. 23, (Lille, France), pp. 151-160, 2015. The article presents the control layer protocol for introducing TDM-based resource management in Networks-on-Chip. The discussion is supported by a formal worst-case analysis.

3) Adam Kostrzewa, Selma Saidi and Rolf Ernst, "Dynamic Control for Mixed-Critical Networks-on-Chip" in IEEE Real-Time Systems Symposium (RTSS), (San Antonio, TX, USA), December 2015. The paper presents the static priority-based resource management in real-time Networks-on-Chip using the control layer. The work combines the global, work-conserving scheduling for end-to-end guarantees with the local arbitration in routers.

4) Adam Kostrzewa, Selma Saidi, Leonardo Ecco and Rolf Ernst, "Dynamic admission control for real-time networks-on-chips" in Design Automation Conference (ASP-DAC) 21st Asia and South Pacific, (Macau, China), January 2016. The work targets the limitations of the TDM-based network management in NoCs. Consequently, it introduces an alternative round-robin-based scheme through a dedicated control layer protocol.

5) Adam Kostrzewa, Selma Saidi, Leonardo Ecco and Rolf Ernst, "Ensuring safety and efficiency in networks-on-chip", Elsevier Integration, the VLSI Journal, 2016. The article extends the work from ASP-DAC 2016 by considering memory effects of modern DDR-SDRAM chips. Of special interest are temporal properties and response time in combination with the locality of accesses. Consequently, control layer extensions are proposed, forming an interface between the memory and the NoC.

6) Adam Kostrzewa, Selma Saidi and Rolf Ernst, "Slack-based resource arbitration for real-time Networks-on-Chip" in Design, Automation & Test in Europe Conference & Exhibition (DATE), (Dresden, Germany), April 2016. The work presents a dynamic allocation scheme for BE senders, i.e., providing latency guarantees for hard real-time transmissions with minimum impact on performance-sensitive best-effort traffic. This is performed using the control layer.

7) Adam Kostrzewa, Rolf Ernst and Selma Saidi, "Multi-path scheduling for multimedia traffic in safety critical on-chip network" in 2016 14th ACM/IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia), vol. 14, (Pittsburgh, PA, USA), pp. 1-10, October 2016. The work considers multi-path traversals of safety-critical traffic with real-time requirements in a system supervised with the control layer.

8) Adam Kostrzewa, Sebastian Tobuschat, Selma Saidi and Rolf Ernst, "Supporting Suspension-based Locking Mechanisms for Real-Time Networks-on-chip" in 24th International Conference on Real-Time Networks and Systems (RTNS), vol. 24, (Brest, France), pp. 215-224, October 2016. Enabling suspension-based locking requires providing feedback about the global state of the interconnect. For this purpose, a control layer extension is proposed in the article, forming an interface between the cores and the interconnect.

9) Adam Kostrzewa, Sebastian Tobuschat, Leonardo Ecco and Rolf Ernst, "Adaptive load distribution in mixed-critical Networks-on-Chip" in 22nd Asia and South

Pacific Design Automation Conference (ASP-DAC), 2017. The publication discusses a protocol-based adaptive load distribution which, by selectively detouring BE traffic (i.e., load balancing), allows significantly improving the NoC's performance without costly hardware extensions.

10) Adam Kostrzewa, Sebastian Tobuschat and Rolf Ernst, "Self-Aware Network-On-Chip Control in Real-Time Systems", IEEE Design & Test, 2018. The article discusses the application of the control layer in the context of new, highly complex embedded applications, e.g. autonomous driving, which require simultaneously high performance and safety in highly dynamic setups.

11) Sebastian Tobuschat, Adam Kostrzewa and Rolf Ernst, "Selective congestion control for mixed-critical networks-on-chip", Elsevier Integration, the VLSI Journal, 2017. The article discusses selective congestion control for mixed-critical networks which is based on the control layer. By selectively detouring real-time or BE traffic (i.e., load balancing) and dynamically throttling BE traffic, the proposed solutions allow improving the NoC performance without costly hardware extensions.

Unrelated to the Thesis

1) Leonardo Ecco, Selma Saidi, Adam Kostrzewa and Rolf Ernst, "Real-Time DRAM Throughput Guarantees for Latency Sensitive Mixed QoS MPSoCs" in Proceedings of the IEEE 10th International Symposium on Industrial Embedded Systems (SIES), (Siegen, Germany), June 2015 - BEST PAPER AWARD. The work proposes a memory controller that provides low latency for best-effort requestors and real-time guarantees for senders requiring throughput guarantees.

2) Leonardo Ecco, Adam Kostrzewa and Rolf Ernst, "Minimizing DRAM Rank Switching Overhead for Improved Timing Bounds and Performance" in Proceedings of the IEEE Euromicro Conference on Real-Time Systems (ECRTS), (Toulouse, France), July 2016. This paper proposes a mixed-critical real-time controller for multi-rank DRAM modules that minimizes rank switches. The introduced controller works by scheduling batches of data transfers for each rank and performing rank switches only at the end of each batch.

3) Sebastian Tobuschat, Adam Kostrzewa, Falco K. Bapp and Christoph Dropmann, "Online monitoring for safety-critical multicore systems", it - Information Technology, 2017. The work originates from the ARAMIS research project. It discusses the usage and limitations of monitoring approaches in the context of safety-critical real-time deployments.

4) Werner Weber, Alfred Hoess, Adam Kostrzewa, Rolf Ernst and others, "The EMC2 Project on Embedded Microcontrollers: Technical Progress after Two Years" in Digital System Design (DSD), Euromicro Conference on, 2016. The publication discusses the control layer development and other research activities in the context of the EMC2 Artemis project.

Bibliography

[1] Intel many integrated core architecture - advanced. (2011). http://www.intel.com. Accessed: 10.03.2018.

[2] Tile processor architecture overview for the tile-gx series. http://www.mellanox.com/repository/solutions/tile-scm/docs/UG130-ArchOverview-TILE-Gx.pdf. Accessed: 10.03.2018.

[3] DO-254 - design assurance guidance for airborne electronic hardware, April 2000.

[4] DO-178B - software considerations in airborne systems and equipment certification, December 2009.

[5] IEC SC 65A. Functional safety of electrical/electronic/programmable electronic safety-related systems. Technical Report IEC 61508, The International Electrotechnical Commission, 3, rue de Varembé, Case postale 131, CH-1211 Genève 20, Switzerland, 1998.

[6] L. Abdallah, J. Ermont, J. l. Scharbarg, and C. Fraboul. Towards a mixed noc/afdx architecture for avionics applications. In 2017 IEEE 13th International Workshop on Factory Communication Systems (WFCS), pages 1–10, May 2017.

[7] L. Abdallah, M. Jan, J. Ermont, and C. Fraboul. I/o contention aware map- ping of multi-criticalities real-time applications over many-core architec- tures. In 2016 IEEE Real-Time and Embedded Technology and Applications Sym- posium (RTAS), pages 1–1, April 2016.

[8] Niket Agarwal, Tushar Krishna, Li-Shiuan Peh, and Niraj K Jha. Garnet: A detailed on-chip network model inside a full-system simulator. In Performance Analysis of Systems and Software, 2009. ISPASS 2009. IEEE International Symposium on, pages 33–42. IEEE, 2009.

[9] AILab. http://www-ailab.elcom.nitech.ac.jp/, Accessed online 29.11.2017.

[10] Benny Akesson, Kees Goossens, and Markus Ringhofer. Predator: A pre- dictable sdram memory controller. In Proceedings of the 5th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS ’07, pages 251–256, New York, NY, USA, 2007. ACM.

[11] Arteris. http://www.arteris.com/flexnoc, Accessed online 1.12.2017.

[12] S. Azodolmolky, R. Nejabati, M. Pazouki, P. Wieder, R. Yahyapour, and D. Simeonidou. An analytical model for software defined networking: A net- work calculus-based approach. In 2013 IEEE Global Communications Conference (GLOBECOM), pages 1397–1402, Dec 2013.

[13] A. O. Balan, L. Sigal, and M. J. Black. A quantitative evaluation of video-based 3d person tracking. In Proceedings of the 14th International Conference on Com- puter Communications and Networks, ICCCN ’05, pages 349–356, Washington, DC, USA, 2005. IEEE Computer Society.

[14] Mario Baldi, Silvano Gai, and Gian Pietro Picco. Exploiting code mobility in decentralized and flexible network management. In Proceedings of the First International Workshop on Mobile Agents, MA ’97, pages 13–26, London, UK, UK, 1997. Springer-Verlag.

[15] Y. Ben-Itzhak, E. Zahavi, I. Cidon, and A. Kolodny. Hnocs: Modular open- source simulator for heterogeneous nocs. In 2012 International Conference on Embedded Computer Systems (SAMOS), pages 51–57, July 2012.

[16] L. Benini and G. De Micheli. Networks on chips: a new soc paradigm. Com- puter, 35(1):70–78, Jan 2002.

[17] Luca Benini and Giovanni De Micheli. Networks on Chips: Technology and Tools. Systems on Silicon. Morgan Kaufmann, San Francisco, 2006.

[18] Konstantin Berestizshevsky, Guy Even, Yaniv Fais, and Jonatan Ostrometzky. Sdnoc: Software defined network on a chip. Microprocessors and Microsystems, 50(Supplement C):138 – 153, 2017.

[19] D. Bertozzi, A. Jalabert, Srinivasan Murali, R. Tamhankar, S. Stergiou, L. Benini, and G. De Micheli. Noc synthesis flow for customized domain specific multiprocessor systems-on-chip. IEEE Transactions on Parallel and Distributed Systems, 16(2):113–129, Feb 2005.

[20] Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. The par- sec benchmark suite: Characterization and architectural implications. In Proceedings of the 17th International Conference on Parallel Architectures and Com- pilation Techniques, PACT ’08, pages 72–81, New York, NY, USA, 2008. ACM.

[21] Nathan Binkert, Bradford Beckmann, Gabriel Black, Steven K. Reinhardt, Ali Saidi, Arkaprava Basu, Joel Hestness, Derek R. Hower, Tushar Krishna, So- mayeh Sardashti, Rathijit Sen, Korey Sewell, Muhammad Shoaib, Nilay Vaish, Mark D. Hill, and David A. Wood. The gem5 simulator. SIGARCH Comput. Archit. News, 39(2):1–7, August 2011.

[22] T. Bjerregaard and J. Sparso. Implementation of guaranteed services in the mango clockless network-on-chip. IEEE Proceedings - Computers and Digital Techniques, 153(4):217–229, July 2006.

[23] Paul Bogdan and Radu Marculescu. Workload characterization and its im- pact on multicore platform design. In Proceedings of the Eighth IEEE/ACM/I- FIP International Conference on Hardware/Software Codesign and System Synthesis, CODES/ISSS ’10, pages 231–240, New York, NY, USA, 2010. ACM.

[24] Evgeny Bolotin, Israel Cidon, Ran Ginosar, and Avinoam Kolodny. Qnoc: Qos architecture and design process for network on chip. J. Syst. Archit., 50(2- 3):105–128, February 2004.

[25] Shekhar Borkar. Thousand core chips: A technology perspective. In Proceed- ings of the 44th Annual Design Automation Conference, DAC ’07, pages 746–749, New York, NY, USA, 2007. ACM.

[26] Thomas Boroske, Leonardo Ecco, Jan Westermann, Adam Kostrzewa, et al. Deliverable D.14 Hybrid Avionic Integrated Architecture Detailed Design and Prototype. WP8 Hybrid Avionic Integrated Architecture. EMC2 Project Consortium, 2017.

[27] Björn B. Brandenburg and James H. Anderson. The omlp family of optimal multiprocessor real-time locking protocols. Design Automation for Embedded Systems, 17(2):277–342, 2013.

[28] Eugene D. Brooks. The butterfly barrier. International Journal of Parallel Pro- gramming, 15(4):295–307, Aug 1986.

[29] A. Burns, J. Harbin, and L. S. Indrusiak. A wormhole noc protocol for mixed criticality systems. In 2014 IEEE Real-Time Systems Symposium, pages 184–195, Dec 2014.

[30] Alan Burns and Robert I. Davis. Mixed criticality systems - a review, 9th edition. In Technical Report, University of York, York, UK., 2017.

[31] Alan Burns, Leandro Soares Indrusiak, and Zheng Shi. Schedulability anal- ysis for real time on-chip communication with wormhole switching. Int. J. Embed. Real-Time Commun. Syst., 1(2):1–22, April 2010.

[32] Vincenzo Catania, Andrea Mineo, Salvatore Monteleone, Maurizio Palesi, and Davide Patti. Cycle-accurate network on chip simulation with noxim. ACM Trans. Model. Comput. Simul., 27(1):4:1–4:25, August 2016.

[33] S. Chakraborty, S. Kunzli, and L. Thiele. A general framework for analysing system properties in platform-based embedded system designs. In 2003 De- sign, Automation and Test in Europe Conference and Exhibition, pages 190–195, 2003.

[34] K. Chandrasekar, B. Akesson, and K. Goossens. Improved power modeling of ddr sdrams. In Digital System Design (DSD), 2011 14th Euromicro Conference on, pages 99 –108, 31 2011-sept. 2 2011.

[35] Liu Cong, Wang Wen, and Wang Zhiying. A configurable, programmable and software-defined network on chip. In 2014 IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA), pages 813–816, Sept 2014.

[36] AUTOSAR Consortium. AUTOSAR OS, https://www.autosar.org/standards/classic-platform/, Accessed online 29.11.2017.

[37] William Dally and Brian Towles. Principles and Practices of Interconnection Net- works. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2003.

[38] William J. Dally and Brian Towles. Route packets, not wires: On-chip in- teconnection networks. In Proceedings of the 38th Annual Design Automation Conference, DAC ’01, pages 684–689, New York, NY, USA, 2001. ACM.

[39] R. I. Davis, K. W. Tindell, and A. Burns. Scheduling slack time in fixed priority pre-emptive systems. In 1993 Proceedings Real-Time Systems Symposium, pages 222–231, Dec 1993.

[40] B. D. de Dinechin, D. van Amstel, M. Poulhiès, and G. Lager. Time-critical computing on a single-chip massively parallel processor. In 2014 Design, Automation Test in Europe Conference Exhibition (DATE), pages 1–6, March 2014.

[41] Benoît Dupont de Dinechin, Yves Durand, Duco van Amstel, and Alexandre Ghiti. Guaranteed services of the noc of a manycore processor. In Proceedings of the 2014 International Workshop on Network on Chip Architectures, NoCArc ’14, pages 11–16, New York, NY, USA, 2014. ACM.

[42] Jonas Diemer, Philip Axer, and Rolf Ernst. Compositional performance anal- ysis in python with pycpa. In 3rd International Workshop on Analysis Tools and Methodologies for Embedded and Real-time Systems (WATERS), jul 2012.

[43] Jonas Diemer and Rolf Ernst. Back suction: Service guarantees for latency- sensitive on-chip networks. In Proceedings of the 2010 Fourth ACM/IEEE Inter- national Symposium on Networks-on-Chip, NOCS ’10, pages 155–162, Washing- ton, DC, USA, 2010. IEEE Computer Society.

[44] Jonas Diemer, Jonas Rox, Mircea Negrean, Steffen Stein, and Rolf Ernst. Real- time communication analysis for networks with two-stage arbitration. In Proceedings of the 11th International Conference on Embedded Software, EMSOFT 2011, part of the Seventh Embedded Systems Week, ESWeek 2011, Taipei, Taiwan, October 9-14, 2011, 2011.

[45] Jonas Fabian Diemer. Predictable Architecture and Performance Analysis for General-Purpose Networks-on-Chip. PhD thesis, TU Braunschweig, 2016.

[46] Guy Durrieu, Madeleine Faugère, Sylvain Girbal, Daniel Gracia Pérez, Claire Pagetti, and Wolfgang Puffitsch. Predictable flight management system im- plementation on a multicore processor. In Embedded Real Time Software and Systems, ERTS ’14, 2014.

[47] Leonardo Ecco and Rolf Ernst. Improved dram timing bounds for real-time dram controllers with read/write bundling. In 2015 IEEE Real-Time Systems Symposium, pages 53–64, Dec 2015.

[48] Erik B. van der Tol and Egbert G.T. Jaspers. Mapping of mpeg-4 decoding on a flexible architecture platform, 2001.

[49] Rolf Ernst. Bringing dynamic control to real-time nocs. In 16th International Forum on MPSoC for Software-defined Hardware, Nara, Japan, July 11-15 2016.

[50] Rolf Ernst et al. Report from the workshop on mixed criticality systems. European Commission, Information Society and Media Directorate-General, Unit G3/Computing Systems Research Objective, 2012.

[51] Yuko Hara et al. Proposal and quantitative analysis of the chstone benchmark program suite for practical c-based high-level synthesis. Journal of Information Processing, 17:242–254, 2009.

[52] Mohammad Fattah, Masoud Daneshtalab, Pasi Liljeberg, and Juha Plosila. Smart hill climbing for agile dynamic mapping in many-core systems. In Proceedings of the 50th Annual Design Automation Conference, DAC ’13, pages 39:1– 39:6, New York, NY, USA, 2013. ACM.

[53] Jiri Gaisler, Edvin Catovic, Marko Isomaki, Kristoffer Glembo, and Sandi Habinc. Grlib ip core user’s manual. Gaisler research, 2007.

[54] Manil Dev Gomony, Benny Akesson, and Kees Goossens. A real-time multi- channel memory controller and optimal mapping of memory clients to memory channels. ACM Transactions on Embedded Computing Systems (TECS), 2014.

[55] K. Goossens and A. Hansson. The aethereal network on chip after ten years: Goals, evolution, lessons, and future. In Design Automation Conference, pages 306–311, June 2010.

[56] Kees Goossens, John Dielissen, and Andrei Radulescu. Aethereal network on chip: Concepts, architectures, and implementations. IEEE Des. Test, 22(5):414– 421, September 2005.

[57] Boris Grot, Joel Hestness, Stephen W. Keckler, and Onur Mutlu. Kilo-noc: A heterogeneous network-on-chip architecture for scalability and service guar- antees. In Proceedings of the 38th Annual International Symposium on Computer Architecture, ISCA ’11, pages 401–412, New York, NY, USA, 2011. ACM.

[58] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown. Mibench: A free, commercially representative embedded benchmark suite. In Proceedings of the Fourth Annual IEEE International Workshop on Work- load Characterization. WWC-4 (Cat. No.01EX538), pages 3–14, Dec 2001.

[59] A. Hansson, M. Coenen, and K. Goossens. Channel trees: Reducing latency by sharing time slots in time-multiplexed networks on chip. In 2007 5th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and Sys- tem Synthesis (CODES+ISSS), pages 149–154, Sept 2007.

[60] John Hauser. SoftFloat, http://www.jhauser.us/arithmetic/SoftFloat, Accessed online 29.11.2017.

[61] Rafik Henia, Arne Hamann, Marek Jersak, Razvan Racu, Kai Richter, and Rolf Ernst. System level performance analysis - the symta/s approach. IEE Pro- ceedings Computers and Digital Techniques, 2005.

[62] John L. Hennessy and David A. Patterson. Computer Architecture, Fifth Edition: A Quantitative Approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 5th edition, 2011.

[63] Jingcao Hu and R. Marculescu. Energy- and performance-aware mapping for regular noc architectures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 24(4):551–562, April 2005.

[64] A. C. Hung. "PVRG-JPEG CODEC 1.1" Technical Report, Stanford University, 1993.

[65] L. S. Indrusiak, J. Harbin, and A. Burns. Average and worst-case latency im- provements in mixed-criticality wormhole networks-on-chip. In 2015 27th Euromicro Conference on Real-Time Systems, pages 47–56, July 2015.

[66] Leandro Soares Indrusiak. End-to-end schedulability tests for multiproces- sor embedded systems based on networks-on-chip with priority-preemptive arbitration. Journal of Systems Architecture, 60(7):553 – 561, 2014.

[67] ISO26262. Road vehicles – Functional safety, 2011.

[68] JEDEC, Arlington, Va, USA. JESD79-3F: DDR3 SDRAM Specification, July 2012.

[69] Ahmed Jerraya and Wayne Wolf. Multiprocessor systems-on-chips. Elsevier, 2004.

[70] Nan Jiang, Daniel U. Becker, George Michelogiannakis, James D. Balfour, Brian Towles, David E. Shaw, John Kim, and William J. Dally. A detailed and flexible cycle-accurate network-on-chip simulator. In ISPASS, pages 86–96. IEEE Computer Society, 2013.

[71] Kalray. "MPPA Getting Started Guide", 2016.

[72] Evangelia Kasapaki and Jens Sparsø. The Argo NoC: Combining TDM and GALS. IEEE, 2015.

[73] J. Kim. Low-cost router microarchitecture for on-chip networks. In 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 255–266, Dec 2009.

[74] John Leslie King. Centralized versus decentralized computing: Organiza- tional considerations and management options. ACM Comput. Surv., 15(4):319– 349, December 1983.

[75] A. Kostrzewa, S. Saidi, and R. Ernst. Slack-based resource arbitration for real- time networks-on-chip. In 2016 Design, Automation Test in Europe Conference Exhibition (DATE), pages 1012–1017, March 2016.

[76] A. Kostrzewa, S. Tobuschat, P. Axer, and R. Ernst. Supervised sharing of vir- tual channels in networks -on-chip. In Proceedings of the 9th IEEE International Symposium on Industrial Embedded Systems (SIES 2014), pages 133–140, June 2014.

[77] A. Kostrzewa, S. Tobuschat, L. Ecco, and R. Ernst. Adaptive load distribution in mixed-critical networks-on-chip. In 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC), pages 732–737, Jan 2017.

[78] A. Kostrzewa, S. Tobuschat, R. Ernst, and S. Saidi. Safe and dynamic traf- fic rate control for networks-on-chips. In 2016 Tenth IEEE/ACM International Symposium on Networks-on-Chip (NOCS), pages 1–8, Aug 2016.

[79] Adam Kostrzewa, Rolf Ernst, and Selma Saidi. Multi-path scheduling for multimedia traffic in safety critical on-chip network. In Proceedings of the 14th ACM/IEEE Symposium on Embedded Systems for Real-Time Multimedia, ES- TIMedia’16, pages 37–46, New York, NY, USA, 2016. ACM.

[80] Adam Kostrzewa, Selma Saidi, Leonardo Ecco, and Rolf Ernst. Flexible TDM-based resource management in on-chip networks. In Proceedings of the 23rd International Conference on Real Time and Networks Systems, RTNS '15, pages 151–160, New York, NY, USA, 2015. ACM.

[81] Adam Kostrzewa, Selma Saidi, Leonardo Ecco, and Rolf Ernst. Dynamic admission control for real-time networks-on-chips. In 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), pages 719–724, Jan 2016.

[82] Adam Kostrzewa, Selma Saidi, Leonardo Ecco, and Rolf Ernst. Ensuring safety and efficiency in networks-on-chip. Integration, the VLSI Journal, 58(Supplement C):571–582, 2017.

[83] Adam Kostrzewa, Selma Saidi, and Rolf Ernst. Dynamic control for mixed-critical networks-on-chip. In Proceedings of the 2015 IEEE Real-Time Systems Symposium (RTSS), RTSS '15, pages 317–326, San Antonio, TX, USA, 2015. IEEE Computer Society.

[84] Adam Kostrzewa, Sebastian Tobuschat, and Rolf Ernst. Self-aware network-on-chip control in real-time systems. IEEE Design and Test, 2018.

[85] Adam Kostrzewa, Sebastian Tobuschat, Selma Saidi, and Rolf Ernst. Supporting suspension-based locking mechanisms for real-time networks-on-chip. In Proceedings of the 24th International Conference on Real-Time Networks and Systems, RTNS '16, pages 215–224, New York, NY, USA, 2016. ACM.

[86] D. Kreutz, F. M. V. Ramos, P. E. Veríssimo, C. E. Rothenberg, S. Azodolmolky, and S. Uhlig. Software-defined networking: A comprehensive survey. Proceedings of the IEEE, 103(1):14–76, Jan 2015.

[87] Daniel Kroening. SNU, http://www.cprover.org/goto-cc/examples/snu.html. Accessed online 29.11.2017.

[88] L. Jain, B. Al-Hashimi, M. Gaur, V. Laxmi, and A. Narayanan. NIRGAM: A simulator for NoC interconnect routing and applications' modeling. In Workshop on Diagnostic Services in Network-on-Chips, Design, Automation and Test in Europe Conference (DATE'07), pages 16–20, 2007.

[89] Jean-Yves Le Boudec and Patrick Thiran. Network Calculus: A Theory of Deterministic Queuing Systems for the Internet. Springer-Verlag, Berlin, Heidelberg, 2001.

[90] Chunho Lee, M. Potkonjak, and W. H. Mangione-Smith. MediaBench: A tool for evaluating and synthesizing multimedia and communications systems. In Proceedings of 30th Annual International Symposium on Microarchitecture, pages 330–335, Dec 1997.

[91] J. P. Lehoczky. Fixed priority scheduling of periodic task sets with arbitrary deadlines. In Proceedings of the 11th Real-Time Systems Symposium, pages 201–209, December 1990.

[92] Jane W. S. W. Liu. Real-Time Systems. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1st edition, 2000.

[93] Z. Lu and A. Jantsch. TDM virtual-circuit configuration for network-on-chip. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 16(8):1021–1034, Aug 2008.

[94] Zhonghai Lu, Rikard Thid, Mikael Millberg, and Axel Jantsch. NNSE: Nostrum network-on-chip simulation environment. In Proc. of SSoCC, 2005.

[95] Amilcar Do Carmo Lucas, Henning Sahlbach, Sean Whitty, Sven Heithecker, and Rolf Ernst. Application development with the FlexWafe real-time stream processing architecture for FPGAs. ACM Trans. Embed. Comput. Syst., 9(1):4:1–4:23, October 2009.

[96] K. Mahmood, A. Chilwan, O. Østerbø, and M. Jarschel. Modelling of OpenFlow-based software-defined networks: the multiple node case. IET Networks, 4(5):278–284, 2015.

[97] R. Manevich, I. Walter, I. Cidon, and A. Kolodny. Best of both worlds: A bus enhanced NoC (BENoC). In 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip, pages 173–182, May 2009.

[98] David Manners. Auto is fastest growing IC market, says IC Insights. electronicsweekly.com, (Online on 21.02.2018), 9th November 2017.

[99] R. Marculescu. Worm_sim: A cycle accurate simulator for networks-on-chip. Carnegie Mellon University, website (accessed 01.11.2017), 2017.

[100] R. Marculescu, J. Hu, and U. Y. Ogras. Key research problems in NoC design: A holistic perspective. In 2005 Third IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS'05), pages 69–74, Sept 2005.

[101] N. McKeown. The iSLIP scheduling algorithm for input-queued switches. IEEE/ACM Transactions on Networking, 7(2):188–201, Apr 1999.

[102] Nick McKeown, Tom Anderson, Hari Balakrishnan, Guru Parulkar, Larry Peterson, Jennifer Rexford, Scott Shenker, and Jonathan Turner. OpenFlow: Enabling innovation in campus networks. SIGCOMM Comput. Commun. Rev., 38(2):69–74, March 2008.

[103] Kraig Meyer, Mike Erlinger, Joe Betser, Carl Sunshine, Germán Goldszmidt, and Yechiam Yemini. Decentralizing Control and Intelligence in Network Management, pages 4–16. Springer US, Boston, MA, 1995.

[104] Naveen Muralimanohar and Rajeev Balasubramonian. Interconnect design considerations for large NUCA caches. SIGARCH Comput. Archit. News, 35(2):369–380, June 2007.

[105] Mircea Negrean, Sebastian Klawitter, and Rolf Ernst. Timing analysis of multi-mode applications on AUTOSAR conform multi-core systems. In Design, Automation and Test in Europe, DATE 13, Grenoble, France, March 18-22, 2013, pages 302–307, 2013.

[106] Mircea Negrean, Simon Schliecker, and Rolf Ernst. Response-time analysis of arbitrarily activated tasks in multiprocessor systems with shared resources. In Proc. of Design, Automation, and Test in Europe (DATE), Nice, France, April 2009.

[107] Mircea Negrean, Simon Schliecker, and Rolf Ernst. Timing implications of sharing resources in multicore real-time automotive systems. SAE International Journal of Passenger Cars - Electronic and Electrical Systems, 3(1):27–40, August 2010.

[108] M. Neukirchner, T. Michaels, P. Axer, S. Quinton, and R. Ernst. Monitoring arbitrary activation patterns in real-time systems. In 2012 IEEE 33rd Real-Time Systems Symposium, pages 293–302, Dec 2012.

[109] NXP AMPG Body Electronics Systems Engineering Team. Future advances in body electronics, 2015.

[110] Umit Y. Ogras, Jingcao Hu, and Radu Marculescu. Key research problems in NoC design: A holistic perspective. In Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS '05, pages 69–74, New York, NY, USA, 2005. ACM.

[111] Omnest. https://www.omnest.com/publications.php, Accessed online 1.12.2017.

[112] OSEK Group. OSEK/VDX Operating System Specification, 2015.

[113] R. Pellizzoni, E. Betti, S. Bak, G. Yao, J. Criswell, M. Caccamo, and R. Kegley. A predictable execution model for COTS-based embedded systems. In 2011 17th IEEE Real-Time and Embedded Technology and Applications Symposium, pages 269–279, April 2011.

[114] Quentin Perret, Pascal Maurère, Éric Noulard, Claire Pagetti, Pascal Sainrat, and Benoît Triquet. Mapping hard real-time applications on many-core processors. In Proceedings of the 24th International Conference on Real-Time Networks and Systems, RTNS '16, pages 235–244, New York, NY, USA, 2016. ACM.

[115] D. Pham, S. Asano, M. Bolliger, M. N. Day, H. P. Hofstee, C. Johns, J. Kahle, A. Kameyama, J. Keaty, Y. Masubuchi, M. Riley, D. Shippy, D. Stasiak, M. Suzuoki, M. Wang, J. Warnock, S. Weitzel, D. Wendel, T. Yamazaki, and K. Yazawa. The design and implementation of a first-generation Cell processor. In ISSCC. 2005 IEEE International Digest of Technical Papers. Solid-State Circuits Conference, 2005, pages 184–592 Vol. 1, Feb 2005.

[116] A. Psarras, I. Seitanidis, C. Nicopoulos, and G. Dimitrakopoulos. PhaseNoC: TDM scheduling at the virtual-channel level for efficient network traffic isolation. In 2015 Design, Automation Test in Europe Conference Exhibition (DATE), pages 1090–1095, March 2015.

[117] V. Puente, J. A. Gregorio, and R. Beivide. SICOSYS: An integrated framework for studying interconnection network performance in multiprocessor systems. In Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing, pages 15–22, 2002.

[118] S. Quinton, T. T. Bone, J. Hennig, M. Neukirchner, M. Negrean, and R. Ernst. Typical worst case response-time analysis and its use in automotive network design. In 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC), pages 1–6, June 2014.

[119] Razvan Racu, Li Li, Rafik Henia, Arne Hamann, and Rolf Ernst. Improved response time analysis of tasks scheduled under preemptive round-robin. In Proc. of the International Conference on Hardware-Software Codesign and System Synthesis, pages 179–184, Salzburg, Austria, October 2007.

[120] Renesas. R-Car H3 Architecture Overview. renesas.com, (Online on 21.02.2018), 2017.

[121] Kai Richter. Compositional Scheduling Analysis Using Standard Event Models. PhD thesis, TU Braunschweig, IDA, 2005.

[122] E. Rijpkema, K. G. W. Goossens, A. Radulescu, J. Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander. Trade offs in the design of a router with both guaranteed and best-effort services for networks on chip. In 2003 Design, Automation and Test in Europe Conference and Exhibition, pages 350–355, 2003.

[123] Scott Rixner, William J. Dally, Ujval J. Kapasi, Peter Mattson, and John D. Owens. Memory access scheduling. In Proceedings of the 27th Annual International Symposium on Computer Architecture, ISCA '00, pages 128–138, New York, NY, USA, 2000. ACM.

[124] R. Sandoval-Arechiga, R. Parra-Michel, J. L. Vazquez-Avila, J. Flores-Troncoso, and S. Ibarra-Delgado. Software defined networks-on-chip for multi/many-core systems: A performance evaluation. In 2016 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), pages 129–130, March 2016.

[125] L. Schenato, B. Sinopoli, M. Franceschetti, K. Poolla, and S. S. Sastry. Foundations of control and estimation over lossy networks. Proceedings of the IEEE, 95(1):163–187, January 2007.

[126] Bastian Schlich. Model checking of software for microcontrollers. ACM Trans. Embed. Comput. Syst., 9(4):36:1–36:27, April 2010.

[127] Simon Schliecker, Mircea Negrean, and Rolf Ernst. Response time analysis in multicore ECUs with shared resources. IEEE Transactions on Industrial Informatics, 5(4), November 2009.

[128] Simon Schliecker, Jonas Rox, Matthias Ivers, and Rolf Ernst. Providing accurate event models for the analysis of heterogeneous multiprocessor systems. In Proceedings of the 6th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS '08, pages 185–190, New York, NY, USA, 2008. ACM.

[129] Martin Schoeberl, Florian Brandner, Jens Sparsø, and Evangelia Kasapaki. A statically scheduled time-division-multiplexed network-on-chip for real-time systems. In Proceedings of the 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, NOCS '12, pages 152–160, Washington, DC, USA, 2012. IEEE Computer Society.

[130] Zheng Shi and Alan Burns. Real-time communication analysis for on-chip networks with wormhole switching. In Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip, NOCS '08, pages 161–170, Washington, DC, USA, 2008. IEEE Computer Society.

[131] David Spieker. Dynamische Verwaltung von virtuellen Kanälen im IDA NoC Netzwerk Interface. Bachelor's thesis, TU Braunschweig, March 2015.

[132] John A. Stankovic, Krithi Ramamritham, and Marco Spuri. Deadline Scheduling for Real-Time Systems: EDF and Related Algorithms. Kluwer Academic Publishers, Norwell, MA, USA, 1998.

[133] R. Stefan, A. Molnos, A. Ambrose, and K. Goossens. A TDM NoC supporting QoS, multicast, and fast connection set-up. In 2012 Design, Automation Test in Europe Conference Exhibition (DATE), pages 1283–1288, March 2012.

[134] R. A. Stefan, A. Molnos, and K. Goossens. dAElite: A TDM NoC supporting QoS, multicast, and fast connection set-up. IEEE Transactions on Computers, 63(3):583–594, March 2014.

[135] D. Stiliadis and A. Varma. Latency-rate servers: A general model for analysis of traffic scheduling algorithms. IEEE/ACM Transactions on Networking, 6(5):611–624, Oct 1998.

[136] Andrew Tanenbaum. Computer Networks. Prentice Hall Professional Technical Reference, 5th edition, 2011.

[137] Daniel Thiele and Rolf Ernst. Formal analysis based evaluation of software defined networking for time-sensitive Ethernet. In Design Automation and Test in Europe (DATE), Dresden, Germany, March 2016.

[138] L. Thiele, S. Chakraborty, and M. Naedele. Real-time calculus for scheduling hard real-time systems. In 2000 IEEE International Symposium on Circuits and Systems. Emerging Technologies for the 21st Century. Proceedings (IEEE Cat No.00CH36353), volume 4, pages 101–104, 2000.

[139] Ken W. Tindell, Alan Burns, and Andy J. Wellings. An extendible approach for analyzing fixed priority hard real-time tasks. Real-Time Systems, 6(2):133–151, 1994.

[140] S. Tobuschat, P. Axer, R. Ernst, and J. Diemer. IDAMC: A NoC for mixed criticality systems. In 2013 IEEE 19th International Conference on Embedded and Real-Time Computing Systems and Applications, pages 149–156, Aug 2013.

[141] S. Tobuschat and R. Ernst. Efficient latency guarantees for mixed-criticality networks-on-chip. In 2017 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), pages 113–122, April 2017.

[142] S. Tobuschat and R. Ernst. Real-time communication analysis for networks-on-chip with backpressure. In Design, Automation Test in Europe Conference Exhibition (DATE), 2017, pages 590–595, March 2017.

[143] S. Tobuschat, R. Ernst, A. Hamann, and D. Ziegenbein. System-level timing feasibility test for cyber-physical automotive systems. In 2016 11th IEEE Symposium on Industrial Embedded Systems (SIES), pages 1–10, May 2016.

[144] S. Tobuschat, M. Neukirchner, L. Ecco, and R. Ernst. Workload-aware shaping of shared resource accesses in mixed-criticality systems. In Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2014 International Conference on, pages 1–10, October 2014.

[145] Sebastian Tobuschat and Rolf Ernst. Providing throughput guarantees in mixed-criticality Networks-on-Chip. In 2017 30th IEEE International System-on-Chip Conference (SOCC 2017), pages 207–212, Munich, Germany, September 2017.

[146] Sebastian Tobuschat, Adam Kostrzewa, and Rolf Ernst. Selective congestion control for mixed-critical networks-on-chip. December 2017.

[147] I. Walter, I. Cidon, R. Ginosar, and A. Kolodny. Access regulation to hot-modules in wormhole NoCs. In First International Symposium on Networks-on-Chip (NOCS'07), pages 137–148, May 2007.

[148] E. Wandeler and L. Thiele. Interface-based design of real-time systems with hierarchical scheduling. In 12th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS’06), pages 243–252, April 2006.

[149] Zichen Wang. Evaluation, implementation, integration and testing of the control layer based resource management strategy for a performance optimized network-on-chip. Master's thesis, TU Braunschweig, January 2017.

[150] Hassan M. G. Wassel, Ying Gao, Jason K. Oberg, Ted Huffmire, Ryan Kastner, Frederic T. Chong, and Timothy Sherwood. SurfNoC: A low latency and provably non-interfering approach to secure networks-on-chip. In Proceedings of the 40th Annual International Symposium on Computer Architecture, ISCA '13, pages 583–594, New York, NY, USA, 2013. ACM.

[151] D. Wentzlaff, P. Griffin, H. Hoffmann, L. Bao, B. Edwards, C. Ramey, M. Mattina, C. C. Miao, J. F. Brown III, and A. Agarwal. On-chip interconnection architecture of the Tile processor. IEEE Micro, 27(5):15–31, Sept 2007.

[152] Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, Stephan Thesing, David Whalley, Guillem Bernat, Christian Ferdinand, Reinhold Heckmann, Tulika Mitra, Frank Mueller, Isabelle Puaut, Peter Puschner, Jan Staschulat, and Per Stenström. The worst-case execution-time problem—overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst., 7(3):36:1–36:53, May 2008.

[153] Z. P. Wu, Y. Krish, and R. Pellizzoni. Worst case analysis of DRAM latency in multi-requestor systems. In 2013 IEEE 34th Real-Time Systems Symposium, pages 372–383, Dec 2013.

[154] Bing Xiong, Kun Yang, Jinyuan Zhao, Wei Li, and Keqin Li. Performance evaluation of OpenFlow-based software-defined networks based on queueing model. Comput. Netw., 102(C):172–185, June 2016.

[155] Hui Zhang. Service disciplines for guaranteed performance service in packet-switching networks. Proceedings of the IEEE, 83(10):1374–1396, Oct 1995.

[156] Dirk Ziegenbein and Arne Hamann. Timing-aware control software design for automotive systems. In Proceedings of the 52nd Annual Design Automation Conference, San Francisco, CA, USA, June 7-11, 2015, pages 56:1–56:6, 2015.

Acronyms

ADAS Advanced Driver Assistance System. 155

BC Burst Count. 136

BE best-effort. 12, 14, 86, 102, 103, 111, 118–120

BI Bank Interleaving. 136

CPU central processing unit. 7

DSP digital signal processor. 7

ECU Electronic Control Unit. 10

EDA Electronic Design Automation. 87

FAA Federal Aviation Administration. 15

GALS globally asynchronous locally synchronous. 7

GS Guaranteed Service. 36

HRT hard real-time. 12, 13, 15

HRTT hard real-time transmission. 13–15

HW hardware. 85, 87

IEC International Electrotechnical Commission. 16

IP intellectual property. 7

LUT Look-Up Table. 124

MMU Memory Management Unit. 49

MPSoC Multiprocessor System-on-Chip. 8, 11, 85

MPU Memory Protection Unit. 49

MSoC Manycore System-on-Chip. 11

NI Network Interface. 8, 85–87, 104, 106, 113, 117, 118, 123–125, 127

NIU Network Interface Unit. 8

NoC Network-on-Chip. 5, 8, 10, 12, 14, 16, 24, 25, 27, 46, 85–94, 96–102, 104–112, 114, 115, 117–120, 122–124, 126, 127

OS operating system. 87

QoS quality-of-service. 14, 85

RL Rate Limiter. 39

RM Resource Manager. 63

RT Real-Time. 35, 36

RTE Run-Time Environment. 87

RTL register-transfer level. 103

SDN Software-defined Networking. 53, 87

SoC System-on-Chip. 7, 11, 12, 14, 15, 20

SRT soft real-time. 12, 13, 15

SRTT soft real-time transmission. 13–15

SW software. 87

TDM Time-Division Multiplexing. 33

TLM transaction-level modeling. 103, 104

VC Virtual Channel. 36

VLSI very-large-scale integration. 7

Glossary

admission control A validation process performed at runtime before a communication is established to see if the currently available NoC resources are sufficient for the particular transmission. 56

backpressure "Information about the utilization of downstream resources. Backpressure information is used by flow control to prevent overflow of buffers and can be used by an adaptive routing algorithm to avoid congested resources, for example." from [37]. 18

connection The contract-based end-to-end resource reservations for a given sender-receiver pair on the selected path with a selected QoS provisioning. 57

criticality Criticality is a designation of the level of assurance against failure needed for a system component, from [30]. 16

data flow "The directed transmission of data between multiple entities", from [36]. 65

deadline "The point in time when an execution of an entity must be finished", from [36]. 12

event "State change of a hardware and/or software entity", from [36]. 18

fail-operational Feature of a system or a unit. Describes the ability of a system or functional unit to continue normal operation despite the presence of hardware or software faults, based on [36]. 21

fail-safe Feature of a system or a unit. In case of a fault the system or the functional unit transits to a safe state, based on [36]. 21

failure Termination of the ability of a system or functional unit to perform a required function, from [36]. 18

fault Abnormal condition that can cause an element or an item to fail. 18

flow control "Flow control is the scheduling and allocation of a network's resources, such as channel bandwidth, buffer space, and control state." from [37]. 18

global state of the system The number of currently running applications and their current requirements with respect to the shared system resources, e.g., NoC's links and buffers. 57

hot-module A hot-spot module is a module whose demand is significantly greater than other, similar resources. For example, a particular destination terminal becomes a hot-spot in a shared memory multicomputer when many processing nodes are simultaneously reading from the node, based on [37]. 144

interfering transmissions Transmissions which overlap in at least one router on their path from source to destination and therefore are sharing NoC resources, i.e., link bandwidth and/or buffers in routers. 56

jitter "The maximum difference in the latency between two packets within a flow. Low jitter is often a requirement for video streams or other real time data for which the regularity of data arrival is important. The jitter times the bandwidth of a flow gives a lower bound on the size of buffer required." from [37]. 23

latency Time required to deliver a unit of data (usually a packet or message) through the network, measured from the injection of the first bit at the source to the ejection of the last bit at the destination [37]. 12

mixed-criticality system A system where applications of different levels of criticality are executed on a shared computing platform, from [30]. 16

non-blocking "A network is non-blocking if it can simultaneously handle all circuit requests that are a permutation of the inputs and outputs. A non-blocking network can always handle a request for a circuit from any idle input to any idle output.", from [37]. 72

predictability Accuracy of the formal evaluation (i.e., calculation or "prediction") of the observed run-time behavior of the system or a system's element (e.g., the latency of transmissions on the interconnect) without (full) knowledge of run-time workloads (e.g., deployed traffic, senders' behavior), cf. [45]. 85

QoS provisioning The Quality-of-Service (QoS) provisioning defines the minimal set of resources requested for connections (reservations), including at least the type and desired upper time response of the NoC depending on the particular synchronization scenario. 57

quality of service The bandwidth, latency, and/or jitter received by a particular connection or class of traffic. A QoS policy differentiates between connections and provides services to those connections based on a contract that guarantees the QoS provided to each connection, provided that the connection complies with restrictions on volume and burstiness of traffic, based on [37]. 29

real-time systems Systems in which the correctness of the design and of the operation depends equally on temporal and functional aspects. 12

response time "The length of time from the release time of the job to the instant when it completes.", from [92]. 57

risk Combination of the probability of occurrence of harm and the severity, from [36]. 16

safety "Freedom from unacceptable risk to persons and goods", from [36]. 15

scalability "The degree to which assets (e.g. NoC) can be adapted to specific target environments for various defined measures.", from [36]. 20

synchronization scenarios Sets of applications which may mutually influence their execution times, e.g., through concurrent accesses to shared interconnect resources. 56

throughput "The amount of traffic (in bits/s) delivered to the destination terminals of the network", from [37]. 12

timeout Notification with respect to deadline violation of an event or task (e.g. while working on/with information: receiving, sending, processing), from [36]. 65

topology The static arrangement of routers, links, and processing nodes in a network, based on [37]. 3

traffic locality Feature of the NoC assuring that packets belonging to the same logical connection arrive in the same specific order in which they were injected to the NoC (in-order delivery) and/or that streams from two different senders targeting the same node do not mix in the NoC. 23

use-case "A model of the usage by the user of a system in order to realize a certain functional feature of the system.", from [36]. 117

Adam Kostrzewa
Hans-Sommer-Str. 66
38106 Braunschweig