Suppressing Jitter by Isolating Bare-Metal Tasks within a General-Purpose OS on NUMA Systems

Dämpfung des Jitters durch Isolation von Bare-Metal Tasks in Standardbetriebssystemen auf der x86 NUMA-Architektur

Von der Fakultät für Elektrotechnik und Informationstechnik der Rheinisch-Westfälischen Technischen Hochschule Aachen zur Erlangung des akademischen Grades eines Doktors der Ingenieurwissenschaften genehmigte Dissertation

vorgelegt von

Diplom-Ingenieur Georg Wassen aus Leverkusen

Berichter: Prof. Dr. rer. nat. Rainer Leupers
Prof. Dr.-Ing. Stefan Kowalewski

22. Juni 2016

Diese Dissertation ist auf den Internetseiten der Hochschulbibliothek online verfügbar.

Bildungsgang

1985 – 1989 Gemeinschaftsgrundschule im Kirchfeld, Leverkusen-Lützenkirchen.

1989 – 1998 Werner-Heisenberg-Schule, städtisches Gymnasium Leverkusen.

1999 – 2007 Diplomstudium Elektrotechnik und Informationstechnik, Studienrichtung Informations- und Kommunikationstechnik. RWTH Aachen University.

2008 – 2015 Promotionsstudium Fakultät für Elektrotechnik und Informationstechnik. RWTH Aachen University.


Abstract

Today, multi-processor systems have become commonplace in the market, from High Performance Computing down to small embedded systems. This shift of paradigm comes at the cost of adapting the software for concurrent execution. Besides correctness, real-time systems have the additional requirement of predictable timing, especially to reliably meet deadlines. This predictability is particularly hard to implement and verify on multi-processor systems. Different approaches exist that mainly restrict what the entire system may execute. This thesis presents a new concept to execute hard real-time tasks in isolation alongside a General-Purpose Operating System. This is accomplished by restricting the environment of isolated tasks while still allowing the full power and performance of the remaining system. To limit the impact of arbitrary applications on isolated tasks via shared resources, the Non-Uniform Memory-Access (NUMA) architecture is analyzed and shown to be capable of eliminating this performance interference. The isolated task concept is generally portable to arbitrary General-Purpose Operating Systems (GPOSs) and architectures but was implemented for Linux on x86 multi-processor systems. The isolation of tasks revokes the operating system's control of the affected CPU, which Linux does not tolerate without further measures. Therefore, several modifications are described that restore system stability. Communication and data acquisition are realized with shared memory and user-mode drivers, which limits the impact of concurrent execution and simplifies the Control Flow Graph (CFG). Consequently, the execution time of real-time tasks is influenced only by the execution time of their basic blocks. Hence, the architectural analyses apply not only to the isolated task concept but also to other Real-Time Operating Systems and research approaches.
The main contribution in this area is the demonstration that NUMA systems are suitable to separate memory and I/O streams, complementing the interference isolation with a complete functional partitioning of the system. The isolated task design makes it possible to extend existing applications with hard real-time tasks for high-frequency Programmable Logic Controller implementations. The application can still build on all available libraries and tools of current Linux distributions, extended by the Real-Time Preemption Patch (RT-Patch) for better soft real-time behavior. Hard real-time tasks previously implemented on external Micro-Controllers (µCs) can be merged onto the same x86 multi-processor NUMA system to reduce hardware costs, improve communication throughput and latency, and simplify development and verification. The entire concept is formally verified where the documentation allows, specifically for interrupt-freedom and the mutual impact on the CFG. The hardware effects that cannot fully be derived from the documentation are analyzed with extensive benchmarks that show the value of NUMA partitioning in real-time systems.


Zusammenfassung

Mehrprozessorsysteme haben sich vom Hochleistungsrechnen bis zu kleinen eingebetteten Systemen durchgesetzt. Dieser Paradigmenwechsel geht auf Kosten der Anpassung der Software an die parallele Ausführung. Außer Korrektheit müssen Echtzeitsysteme auch vorhersehbares zeitliches Verhalten garantieren – das bedeutet meist das Einhalten von Zeitfristen. In Mehrprozessorsystemen ist diese Berechenbarkeit besonders schwierig zu realisieren und zu beweisen. Verschiedene Lösungen existieren, die hauptsächlich einschränken, was das gesamte System ausführen kann. Diese Arbeit stellt ein neues Konzept vor, das harte Echtzeitaufgaben neben dem Betriebssystem isoliert. Realisiert wird das durch eine Beschränkung, was in Isolation ausgeführt werden darf. Die volle Mächtigkeit und Leistungsfähigkeit des übrigen Systems wird dabei erhalten. Der negative Einfluss des übrigen Systems auf isolierte Tasks wird weiter eingeschränkt durch die Analyse und Nutzung der NUMA- (Non-Uniform Memory-Access-) Architektur, die sich dadurch für Mehrprozessor-Echtzeitsysteme besonders anbietet. Das Konzept isolierter Tasks kann generell auf verschiedenen Betriebssystemen und Hardwarearchitekturen realisiert werden. Im Rahmen dieser Arbeit wurde es für Linux auf x86 realisiert und untersucht. Das Isolieren von Tasks entzieht dem Betriebssystem die Kontrolle über diese CPUs, was von Linux nicht ohne weiteres toleriert wird. Daher werden einige Änderungen beschrieben, die das Gesamtsystem wieder stabilisieren. Kommunikation und Datenaustausch mit physischen Systemen werden durch gemeinsamen Speicher und direkten Hardwarezugriff aus dem Benutzermodus realisiert. Diese Maßnahmen beschränken den Einfluss gleichzeitiger Ausführung auf den Kontrollflussgraph. Daraus folgt, dass die Ausführungszeit der Basisblöcke von Echtzeittasks nur noch von Hardwareeinflüssen bestimmt wird.
Daher können die Erkenntnisse dieser Analyse auf alle anderen Echtzeitbetriebssysteme und Forschungsarbeiten übertragen werden. Die wichtigste Erkenntnis ist die Eignung der NUMA-Architektur zur Trennung der Speicher- und E/A-Datenströme, um die Partitionierung der isolierten Tasks durch eine vollständige funktionelle Trennung zu unterstützen. Der Entwurf erlaubt die Erweiterung bestehender Anwendungen um echtzeitfähige, hochfrequente Regelaufgaben. Die Anwendung kann weiterhin alle verfügbaren Bibliotheken und Werkzeuge von beliebigen Linuxdistributionen verwenden. Die Echtzeiterweiterung Linux Preemption Patch kann genutzt werden, um weiche Echtzeitaufgaben zu verbessern. Harte Echtzeitregelungen, die zuvor auf externen Mikrocontrollern realisiert wurden, können nun auf dem gleichen x86 Mehrprozessor-NUMA-System ausgeführt werden und damit Hardwarekosten reduzieren, Kommunikationsdurchsatz und -latenz verbessern und die Entwicklung und Verifikation vereinfachen. Das gesamte Konzept wurde formal verifiziert, so weit die Dokumentation dies zulässt. Insbesondere gilt das für die Freiheit von Unterbrechungen und den gegenseitigen Einfluss auf den Kontrollfluss. Die Effekte der Hardware, die wegen mangelnder Dokumentation nicht vollständig modelliert werden können, wurden durch umfangreiche praktische Messungen untersucht. Dies unterstreicht den Wert der NUMA-Partitionierung für Echtzeitsysteme.

Contents

Abstract ...... v

1. Introduction ...... 1
   1.1. Motivation ...... 3
   1.2. Contributions ...... 8
   1.3. Structure ...... 11

2. Fundamental Principles ...... 13
   2.1. Hardware ...... 13
        2.1.1. Processor ...... 14
        2.1.2. Protection Levels ...... 15
        2.1.3. Memory ...... 15
        2.1.4. Cache ...... 17
        2.1.5. Interrupts ...... 19
        2.1.6. Input and Output ...... 20
        2.1.7. x86 Architecture ...... 21
        2.1.8. Performance Details: Micro-Benchmarks ...... 26
        2.1.9. Multi-processor Systems ...... 32
        2.1.10. System Performance: Application Benchmarks ...... 39
        2.1.11. Embedded Systems ...... 40
   2.2. Operating Systems ...... 42
        2.2.1. General Concepts ...... 43
        2.2.2. Hardware Abstraction ...... 46
        2.2.3. Programming Languages and Application Programming Interfaces ...... 47
        2.2.4. Multi-Processor Aspects ...... 48
        2.2.5. Actual Implementation: Linux ...... 50
   2.3. Real-Time ...... 52
        2.3.1. Application Architecture ...... 54
        2.3.2. ...... 60
        2.3.3. Real-Time Operating Systems ...... 61
        2.3.4. Multi-Processor Aspects ...... 62
        2.3.5. Performance of Real-Time Systems ...... 64


3. Related Work ...... 71
   3.1. Fundamental Real-Time Research ...... 71
        3.1.1. Isolation or Partitioning on Single-Processor Systems ...... 71
        3.1.2. Foundations of Real-Time Computing on Multi-Processor Systems ...... 72
        3.1.3. Scheduling Theory ...... 73
   3.2. Real-Time with a ...... 74
        3.2.1. The Real-Time Preemption Patch ...... 75
        3.2.2. Interrupt Abstraction ...... 77
   3.3. Partitioning and Isolation Approaches ...... 80
   3.4. Previous Publications ...... 84
   3.5. Supporting Student's Theses ...... 84

4. Isolation of Bare-metal Tasks ...... 87
   4.1. General Concept ...... 88
   4.2. Setup and Partitioning ...... 89
        4.2.1. Hardware Configuration ...... 90
        4.2.2. Linux Settings ...... 92
        4.2.3. Partitioning ...... 93
   4.3. Setup ...... 96
        4.3.1. Moving to the Dedicated CPU ...... 97
        4.3.2. Synchronization ...... 97
        4.3.3. Memory Management ...... 98
        4.3.4. Implementation ...... 99
   4.4. Full Isolation ...... 100
   4.5. Analysis ...... 101
   4.6. Practical Evaluation ...... 103
        4.6.1. Risks ...... 109
        4.6.2. System Stability ...... 109

5. Linux Kernel Modification ...... 111
   5.1. Monitoring System Stability ...... 112
   5.2. Linux Kernel Patch ...... 114
        5.2.1. Infrastructure ...... 115
        5.2.2. CPU Online Mask ...... 116
        5.2.3. Inter-Processor Interrupts ...... 116
        5.2.4. Read-Copy-Update ...... 120
        5.2.5. Notifier Chain ...... 121
   5.3. Deactivation of the Timer-Interrupt ...... 122
   5.4. Alternative Implementations ...... 123
        5.4.1. Kernel Module ...... 123
        5.4.2. Hotplugging ...... 124


   5.5. Evaluation ...... 125

6. Application Support ...... 127
   6.1. Shared-Memory Communication ...... 130
        6.1.1. Signals ...... 132
        6.1.2. ...... 134
        6.1.3. Complex IPC Objects ...... 136
        6.1.4. Dynamic Memory Allocation ...... 140
   6.2. Management ...... 140
        6.2.1. Control ...... 140
        6.2.2. Supervision ...... 142
        6.2.3. Adaptability ...... 143
        6.2.4. Exceptions in Isolated Tasks ...... 144
   6.3. Input and Output ...... 145
        6.3.1. Direct Hardware Access ...... 146
        6.3.2. Protocol Stack ...... 147
   6.4. Interrupts ...... 148
        6.4.1. User-Mode Interrupts ...... 148
        6.4.2. Interrupt Sources ...... 152
        6.4.3. Preemptive Scheduling ...... 152

7. Architectural Influence ...... 155
   7.1. Estimation of the Execution Time ...... 156
   7.2. Multi-Core Systems ...... 158
   7.3. Uniform Memory Access Systems ...... 162
   7.4. Non-Uniform Memory Access ...... 164
   7.5. Symmetric Multi-Threading ...... 169
   7.6. Input and Output ...... 170
        7.6.1. Basic Access ...... 171
        7.6.2. Non-uniform I/O Access ...... 173
   7.7. Interpretation and Recommendations ...... 177

8. Application Examples ...... 181
   8.1. Hardware Selection ...... 181
   8.2. Software Architecture ...... 183
        8.2.1. Graphical User Interface ...... 185
        8.2.2. Manager and Isolated Tasks ...... 185
        8.2.3. Closed-Loop Control and I/O ...... 186
        8.2.4. Observation of Execution-Times ...... 188
        8.2.5. Error Recovery ...... 188
   8.3. Other Applications ...... 189
        8.3.1. Application-Specific Integrated Circuit ...... 189


        8.3.2. Network Packet Processing ...... 190
        8.3.3. Medical Imaging ...... 191
        8.3.4. More Examples ...... 191

9. Conclusion ...... 193
   9.1. Summary ...... 195
   9.2. Evaluation ...... 197
   9.3. Future Perspectives ...... 199
        9.3.1. Hardware ...... 200
        9.3.2. Linux ...... 202

A. Test Systems ...... 205
   A.1. Core 2 ...... 205
        A.1.1. Test system poodoo ...... 206
        A.1.2. Test system octopus ...... 206
   A.2. Intel Nehalem ...... 206
        A.2.1. Test system xaxis ...... 207
   A.3. Intel Westmere ...... 207
        A.3.1. Test system haixagon ...... 208
   A.4. Intel ...... 209
        A.4.1. Test system pandora ...... 209
   A.5. AMD ...... 210
        A.5.1. Test system devon ...... 210

B. smp.boot Kernel 213

C. Examples for Real-Time Systems ...... 215

Acronyms 219

Glossary 225

List of Figures 245

List of Tables 249

List of Listings 251

Index 253

Bibliography 261

1. Introduction

“Actually, manycore has been around for many years in the desktop and supercomputing arenas. But it has lagged in the mainstream embedded world; it is now here for embedded as well.” Moyer [Moy13b, p. 1]

Multi-core processors and multi-processor systems in general have in fact become ubiquitous. Not only in High Performance Computing (HPC), but also in commodity computing and recently in small embedded systems, multiple processors work in parallel. This architectural shift requires a more radical change of the applied software than any other improvement of the last decades, because concurrent execution requires careful synchronization [Sut05]. In time-sensitive applications, the whole system also needs to be predictable and efficient. This work targets applications that require hard real-time with very low latency and jitter in the microsecond range, together with compute- and memory-intensive tasks, and also high-data-rate, low-latency communication between them. The classical implementation would be a distributed system of multiple µCs, connected by a field bus to a high-performance computer. The recent development of multi-core general-purpose processors (and multi-processor systems in general) enables a consolidation of multiple distributed systems into a single computer. The applications become complex, using multiple communicating processes or threads, but the development can be unified to a single architecture. Common tools and development environments simplify debugging and verification, and the performance can be increased since Commercial off-the-Shelf (COTS) systems provide a superior performance per cost ratio. The architecture of x86-compatible systems with processors from Intel and AMD was selected because it has the widest distribution in server systems.1 Although ARM-based systems are catching up, those are usually built as fixed systems that are not extensible, while x86 systems can be built flexibly to order from components and can be extended with devices via PCI Express (PCIe).
This type of hardware is often referred to as COTS to emphasize its easy availability, wide distribution, and low costs and is utilized in an increasing number of real-time and embedded systems [Eid+04; Dom08; Moy13a]. The system software for multi-processor systems can be an arbitrary operating system (OS) extended for multiple processors or any mix of multiple virtualized OS instances. Instead of installing a multi-processor capable Real-Time Operating

1In High Performance Computing, 458 systems in the Top 500 list of June 2014 use x86 processors [Meu+].


System (RTOS) that limits the general flexibility of existing applications, this work evaluates the possible limits for hard real-time tasks on commodity operating systems. Linux was selected for its wide deployment, but also because it scales from embedded systems and smartphones2 up to most of the TOP 500 high-performance computers.3 A further advantage is its open-source license, which is cost-effective (no license fee per deployed system) and allows studying its implementation and modifying its source code. Linux supports many devices, and a wide range of modifications by other developers and researchers is available to be integrated selectively into a self-built Linux kernel. The approach taken here is to isolate some processors for bare-metal execution of hard real-time tasks while the remaining processors execute the standard software of the Linux system. This is a partitioning approach that turns a Symmetric Multi-Processing (SMP) OS into Bound Multi-Processing (BMP) [Moy13d], extended with bare-metal execution on the isolated processors. The isolation of the real-time processors must be arranged with the Linux kernel, which is generally not prepared for such an operation. Further, the hardware effects of concurrent access to shared resources (Last-Level Cache, system buses, memory) are analyzed to find solutions to isolate bare-metal tasks from interfering execution on other processors. The specific goals are:

• Use a standard operating system on common multi-processor hardware. The experiments are implemented on Linux for x86 (32 and 64 bit) systems, but should be transferable to different operating systems and architectures.

• On such a multi-processor system, reserve one (or some) processor(s) for hard real-time work, leaving the other processors fully usable for soft real-time and compute-intensive tasks to exploit the power of the Linux system (with all services and libraries).

• Execute hard real-time tasks with full control of every execution cycle of their Central Processing Unit (CPU) (bare-metal execution).

• Allow a flexible reconfiguration of the partitioning of CPUs for real-time, system, and high-performance jobs.

• Evaluate the effect of stealing processors from the Linux kernel to execute isolated tasks and find a configuration that keeps the system stable, preferably without or with only minor modifications to the Linux kernel.

• Analyze the special execution environment of bare-metal tasks for limitations and provide solutions for the implementation of usable applications. This includes the management of isolated tasks, Inter-Process Communication, and hardware access.

2In mid-2013, nearly 1 billion Android smartphones were based on a Linux kernel.
3486 out of 500 systems in the Top 500 list of June 2014 [Meu+] run Linux.


• Analyze the timing behavior: Is this really hard real-time? Evaluate sources of jitter (software and hardware) and reduce them as much as possible to provide strong guarantees for a tight Worst-Case Execution Time (WCET) estimation.

• Solve the jitter problem on x86 multi-processor systems by analyzing the sources of jitter and providing guidance to design the partitioning to reduce hardware influence. By using NUMA systems, it will be shown how the performance interference can be constrained.

1.1. Motivation

[Figure 1.1 shows the simplified automation pyramid: the company level with Manufacturing Execution Systems (MES), the cell level with Supervisory Control and Data Acquisition (SCADA) and Human-Machine Interfaces (HMIs), and the field level with Programmable Logic Controllers (PLCs) connecting sensors and actuators of the factory/plant; the levels are linked by LAN and field bus.]

Figure 1.1.: Automation pyramid (simplified) [Sau07]

Real-time applications require not only a correct result but also need the reaction by a defined deadline (the terms will be defined precisely in Section 2.3). This is a very wide field of research. Typical application areas of real-time systems can be illustrated in the automation pyramid (Fig. 1.1) [Sau07]. The field level is the lowest, as it directly interacts with the physical systems of a factory (plant). Its components are typically Programmable Logic Controllers (PLCs) that use sensors to detect the state of the plant and actuators to adjust the plant's properties. The cell level controls multiple components of the field level and is itself managed by the company level above. From top to bottom, the timing requirements become tighter and the frequency increases. Conversely, in the higher levels, the complexity of the algorithms rises because optimization and guidance of multiple systems must be accounted for. The company level may even connect multiple buildings or remote sites over a Wide Area Network (WAN).

Classically, the components of the automation pyramid at cell level and below are dedicated embedded systems that communicate via field bus. The company level and sometimes also the cell level use Personal Computer (PC) type systems with Human-Machine Interfaces (HMIs) and database access. So far, digital controllers with hard real-time and low latency requirements are often implemented on µCs without an OS. Programming the hardware in such a way is referred to as bare-metal execution. This allows full control over every functional unit and CPU cycle but waives the comfort of the OS's hardware abstraction (Section 2.2.2) and complicates portability [Lal13, Sect. 1]. The current realization of complex systems is based on special embedded systems communicating via field bus. The concept of isolated tasks presented in this work makes it possible to integrate multiple PLCs horizontally and multiple levels vertically. This can be realized on a single x86 multi-processor system with various tasks from digital control up to command and control. This work targets complex time-sensitive applications that include multiple levels of the automation pyramid, for instance hard real-time tasks with a time-frame below 10 µs as well as multiple compute-intensive jobs. An exemplary application is a chemical experiment and measurement control system. This can be used as a prototypical example to present the application of the following concepts.

• Control a research experiment by reading sensor data and setting actuator commands, e. g. a chemical reaction experiment with sensors such as temperature, pressure, and filling level, and settings such as valves, electric pumps, heaters, coolers, blenders, etc.

• Guarantee hard real-time for the correctness of the results. This is not because somebody could get hurt if a deadline is missed or the system fails (which would be criticality, Section 2.3), but because the results would be biased: a scientific finding must be based on trusted measurements, and the expensive or long-running experiment would have to be repeated.

• Constrain latency and maximum jitter to the low microsecond range.

• Log sensor values to a database, pre-process results to reduce the data volume.

• Provide an interactive Graphical User Interface (GUI) that displays the current state, allows changing parameters, starts and aborts the experiment, and executes a first analysis of the results to indicate success or failure of the experiment.

• Integrate algorithms flexibly: use an external program to convert a graphical representation of a controller (e. g. Matlab Simulink, LabVIEW) into code and compile it into a library loadable by the control application.


• Include compute-intensive jobs, e. g. the analysis of a video stream (soft real-time, but best-effort), or a complex simulation of the reaction model to derive non-measurable physical quantities.

• Provide high-throughput and low-latency communication between tasks, especially between real-time and non-real-time tasks, without introducing possible blocking into the real-time execution.

Liu lists many examples of algorithm complexity demonstrating that increasing computing power is required for modern real-time data processing systems [Liu00, Chap. 1]. For example, real-time databases contain perishable data with degrading precision (e. g. in radar tracking of airplane positions). Each data entity has an age and a temporal dispersion. Additionally, the query process must be deterministic in time. In contrast, general-purpose databases (e. g. a company payroll) contain data that remains correct in the absence of updates. More examples of real-time applications are listed in Appendix C on page 215. This wide range of applications illustrates some more use cases for the presented concept. All these systems require a combination of hard real-time with very low latencies (below 10 µs) and high compute performance. Two trends become apparent: the use of multi-processor systems to exploit their increased compute performance, and the application of commodity hardware and software instead of specialized µCs and RTOSs [DM03a]. The first step towards using multiple processors are distributed systems [Kop97] (federated architectures [WR12]). These use multiple distinct computers that cooperate tightly by using interconnects [KW05; Kop08] for message passing and synchronization [KG93]. In large-scale computing, this architecture [Sta09, Chap. 16] is used to increase performance (e. g. in HPC) or availability (e. g. internet servers, “cloud computing”).

The trend of multiple cores per processor package will probably continue: Agarwal and Levy expect to see processors with 1000 cores around 2017 [AL07], and Vajda estimates 250 to 4000 cores for 2020 [Vaj11, Sect. 1.2]. At the time when this project was done4, systems with two to 16 processors were common. The PC and server market is currently dominated by the x86 architecture with processors from Intel and AMD. The strongest competing processor architecture is ARM, which leads the market for mobile devices and embedded systems. Various companies have announced an emphasized effort to enter the server market with new energy-efficient multi-processor ARM systems. With the ubiquitous availability of multi-processor computers on every scale from small embedded systems up to high-performance servers, these systems are already installed in real-time applications [HX96; VT10; WR12; Moy13a]. This is supported

4Planning from 2010, main implementation 2012, evaluation 2013

by many developers who are experienced with the architecture they know from desktop computers. A straightforward approach to transfer multiple existing single-processor systems to a multi-processor system is Asymmetric Multi-Processing (AMP) [NKN08]. This partitioning allows executing single-processor software on dedicated CPUs. It can be realized with limited effort, and the individual applications keep their timing and verification because they are functionally isolated. However, this only yields the benefit of replacing multiple (possibly similar) embedded systems with one multi-processor system. The communication previously done by field bus must now be implemented in software. With more effort, more complex software architectures can be realized to exploit the scalability of multi-processor platforms [BCA08; NP12].

Independently of the OS architecture, the realization of real-time applications on multi-processor hardware poses many challenges. An RTOS is designed from the ground up for determinism. It is far more difficult to modify an existing GPOS for predictable response times because all non-preemptive code paths must be evaluated and possibly modified for a bounded execution time [MS05]. Extending single-processor RTOSs to support multiple processors in a scalable way is challenging. For time-sensitive applications, it is far more complicated to avoid all possibly indeterministic algorithms. A third way is a combination of multiple separated OSs for different tasks. A complex OS needs computing time for itself to maintain internal structures. This time, which is unavailable to the real-time application, is called OS noise [Lam09]. Further, if applications executed concurrently by multiple processors share functional units, they influence each other's execution time because of hardware effects. All these causes increase the total jitter experienced by applications running on multi-processor systems. This is also unwanted in HPC, where real-time methods are increasingly applied to fine-grained synchronization to improve the time-step of massively concurrent applications [McK96; Tsa+05; Boc+09; Mor+11]. Unpredictable timing stems from macroscopic causes like scheduling, interrupts, operating system jitter, or indefinite loop counts. All these are in the scope of software. Further, the execution time of an uninterrupted piece of code is subject to hardware jitter caused by caches and shared resources. Timing anomalies [LS99; Rei+06] are architecture-dependent and can hardly be solved (completely) by software. All dedicated RTOSs have to work around these problems. Many researchers try to estimate interrupt handling time to account for it more precisely in the Worst-Case Execution Time [BLA11].
Part of that theoretical work uses assumptions not applicable to actual systems or proposes improvements for future architectures. However, if the duration of interrupt handlers can only be predicted inaccurately, their frequency of occurrence is even harder to foresee. This usually results in an overly pessimistic estimation.


Other challenges include the hardware abstraction provided by OSs, which must be realized with predictable timing if the application depends on it. If no suitable RTOS is available, a new one could be implemented or an existing GPOS could be modified. For complex processors such as the x86 architecture, the overhead of initializing the processor and implementing drivers for all hardware devices would be very costly. If the device drivers provided by the GPOS are not real-time capable, the drivers required by the real-time application are sometimes implemented in user-space. Examples are General-Purpose I/O (GPIO) [Abb13, Chap. 7], user-space network drivers [Cor13d], and block Input/Output (I/O) [Cor13g]. The synchronization in multi-processor RTOSs must be predictable and avoid or prevent deadlocks. In priority-driven scheduling, the priority inversion problem describes a higher-prioritized task waiting for a resource blocked by a lower-prioritized one. Since the resource cannot be revoked, a viable solution is to increase the priority of the blocking task, giving it priority over the waiting one until it releases the resource (priority ceiling or priority inheritance). Message passing and other concurrent functions can be implemented with lock-free or even wait-free algorithms [SS11].

Linux5 is an operating system with a very versatile application range. Its availability under the terms of the open-source license GPL version 26 allows using and redistributing it free of charge and modifying it as required. Linux is the basis of embedded and mobile systems (e. g. the Android7 smartphone OS, or the Raspberry Pi8), desktop and workstation computers (e. g. in the form of distributions like openSuSE9 or Red Hat10), and is used also by performance-critical servers, cloud installations, and in High Performance Computing. Further, Linux is increasingly utilized in embedded control systems [Dom08; BA09a].
Various efforts are taken to improve the timing predictability of Linux as an operating system for real-time applications [Sri+98], most notably the RT-Patch11. This is a collection of modifications that change the behavior of many places in the kernel (Linux with the RT-Patch is referenced as Linux-rt). Since its introduction, many parts of this collection have already been integrated into the mainline Linux kernel (and can be activated on demand). The major improvements for better (hard) real-time support are reduced non-preemptive sections, a real-time capable Read-Copy-Update (RCU) variant, and improved timing predictability of synchronization methods [MS05]. The long-term evaluation of Linux-rt [Emd10] demonstrates impressive numbers, but this still cannot be considered hard real-time because that would require a formal verification.

5Online: https://www.kernel.org (visited November 12, 2015)
6Linux file COPYING (Linux 3.12)
7Online: http://www.android.com (visited November 12, 2015)
8Online: http://www.raspberrypi.org (visited November 12, 2015)
9Online: http://www.opensuse.org (visited November 12, 2015)
10Online: http://www.redhat.com (visited November 12, 2015)
11Online: http://rt.wiki.kernel.org (visited November 12, 2015)


Interrupt handlers on current Linux systems interrupt the running task for 2 to 40 µs, and for even longer with the RT-Patch applied. If the hard real-time task requires less jitter than that, even a single interrupt will cause a violation of the timing. It is a common practice to confine the handling of Interrupt Requests (IRQs) to dedicated CPUs [DW05]. Brosky and Rotolo propose to allocate some processors of a multi-processor computer solely for real-time applications [BR03] (Shielded CPUs). A fairly recent publication promotes CPUs isolated for real-time applications because the scheduling latency of Linux-rt (Linux 3.8.13-rt) was measured between 11 µs in an idle system and 44 µs during I/O-bound workloads [CB13]. Further, the integration of separately developed components of a real-time system often reveals timing problems due to collisions on the bus level caused by memory and I/O transfers [PC07; Das+11; Das+13].
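The IRQ confinement and CPU shielding mentioned above are commonly configured through kernel boot parameters and the /proc interface. The following configuration fragment is illustrative only: the IRQ number and CPU masks are placeholders, and writing the affinity file requires root.

```shell
# Keep CPUs 2 and 3 out of the general scheduler via the kernel command line:
#   isolcpus=2,3
# Then steer a device interrupt away from the shielded CPUs by writing a CPU
# bitmask (here CPUs 0-1 = 0x3) to its affinity file. IRQ 19 is a placeholder.
echo 3 > /proc/irq/19/smp_affinity
```

Note that some interrupts (e.g. per-CPU timer and IPI interrupts) cannot be steered this way, which is one reason the thesis additionally modifies the kernel.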

Several approaches further optimize the real-time behavior and predictability of a system running a Linux kernel. Between changing the Kernel itself (which will always remain soft real-time) and employing a dedicated Real-Time Operating System (which will never provide full Linux compatibility), several approaches execute the Linux kernel under the control of a sub-kernel. That sub-kernel or Hardware Abstraction Layer (HAL) is a Real-Time Operating System executing all hard real-time tasks. When no time-critical task or deadline is pending, the Linux system is executed as the idle task. This type of system comprises two different domains in which a task can be executed. Originally, both the sub-kernel with the hard real-time tasks and the GPOS with its jobs running when the real-time part is idle were executed on the same processor. With the advent of commonly available multi-processor systems, these approaches adopted partitioning to execute the hard real-time tasks on a processor reserved solely for this purpose. Although the development environment is quite advanced and tasks can be moved between the domains, the whole system is not fully compatible with modern Linux distributions, and it is still uncertain whether an existing complex application can be ported to them.

1.2. Contributions

The broad availability of multi-processor systems (Sect. 2.1.9) enables the application to real-time systems of concepts that have long been established in high-performance computing. Having multiple tightly cooperating concurrent processes or threads is a foundation of parallelizing programs. The optimization on NUMA systems demands careful planning of where a task will run and where its data should be stored. However, common algorithms and synchronization functions are optimized for average throughput. For large systems, the temporal predictability becomes more relevant because, with fine-grained synchronization, a deviation on one processor has a large impact on the overall throughput. In this work, best practices from

8 1.2. Contributions

Figure 1.2.: Layered diagram of the proposed system architecture

HPC optimization like using CPU affinity to assign specific tasks to dedicated CPUs, careful partitioning of memory accesses (especially in NUMA systems), fine-grained synchronization, and user-mode implementation of synchronization methods will be applied to multi-processor real-time systems.

To combine the comfort and versatility of a GPOS with the predictability of bare-metal programming, this work examines how to partition a multi-processor system into a general-purpose partition and one or more hard real-time partitions (called bare-metal tasks) that are isolated from the OS. The concept of isolated CPUs executing bare-metal code addresses the macroscopic causes of jitter from the OS and interrupts and proposes a method to execute tasks with predictable timing. That is hard real-time with regard to task scheduling (no interruptions from unknown code), but the execution time of every uninterrupted piece of code or basic block on the x86 architecture cannot be determined that way. Only by estimating the WCET very pessimistically or by applying advanced scientific methods can the whole solution satisfy the academic definition of hard real-time. To solve, or at least attenuate, the problem of architectural jitter, its causes in multi-processor systems are analyzed and it will be shown how the NUMA architecture can be utilized successfully.

The layers of a typical GPOS-based system are displayed in the left half of Fig. 1.2. The OS controls the CPUs and devices. The applications can directly access the OS interfaces but typically use libraries. The main contribution is the execution of several bare-metal tasks directly on dedicated processors so that they become isolated from the system partition and their performance is not influenced. They control their assigned devices and can communicate with other bare-metal tasks and with applications in the system partition without involving the OS.
The general concept of bare-metal tasks was first published in 2012 together with results under static loads [WLB12]. The isolated task concept was brought to practical usability for a contract partner that successfully deploys its product including multiple isolated tasks. In that project, the problem of x86 multi-processor hardware jitter was solved by utilizing a NUMA system. This work now presents a detailed evaluation and advice on how to realize bare-metal tasks with minimum hardware jitter on x86 NUMA systems. Specific features of the presented architecture are:

• Isolated bare-metal tasks are UNIX processes (or threads) executed in user-space. Their memory protection from each other and from general-purpose processes is guaranteed by the operating system architecture, which is based on hardware features.

• Isolated bare-metal tasks can be implemented as threads of a larger application where threads can be configured for different timing guarantees. Implementation is generally possible in every compiled programming language provided that its run-time is free of system calls. For example, C++ can be used if all objects are created statically on start-up and only their internal methods are invoked during run-time (e. g., new must not be executed during bare-metal execution).

• Initialization at start-up time: All resources (memory allocations, shared buffers, IPC, devices) should be initialized at start-up time. However, local real-time-proven allocators can use preallocated memory pools to allow dynamic memory management.

• Single process per CPU: Each isolated process executes solely on a dedicated CPU (time-triggered paradigm [Kop91]). With the user-mode interrupt handlers presented in Section 6.4.1, event-triggered applications and preemptive scheduling can still be realized.

• The process that will become an isolated task can be initialized as a usual process with all libraries. It can even switch between fully isolated mode and normal operation, but the duration of the switch to isolated mode is not predictable.

• Hard real-time execution can only be guaranteed when not issuing system calls. This prevents the use of the operating system’s device drivers. For lowest latency and jitter, device access must be implemented from user-space via direct or memory-mapped I/O.

• The general-purpose part of an application can utilize all Linux services and all available libraries such as, for example, Qt for Graphical User Interfaces, ImageMagick for image processing, OpenCV for video processing, database connections, GSL for data processing, or the inclusion of a scripting language like Lua. If that part of the application has soft real-time demands, the Linux system can be deployed with the RT-Patch to increase its predictability.

• Because of the bare-metal execution, the latency of hard real-time tasks is the optimum that can be realized on the x86 architecture.


• The predictability of the bare-metal execution is further improved by analyzing the hardware architecture of multiple processors and presenting methods to avoid mutual influence on NUMA systems.

• High Performance Computing optimization experience is applied to parallelized mixed hard real-time and compute-intensive applications.

• The eviction of private cache entries caused by the execution of other cores on the shared Last-Level Cache has not been described before, nor has the use of PCIe in hard real-time systems.

• Hardware-in-the-Loop simulators realized as distributed system need to handle clock drift [Pal+11]. The isolation approach enables the implementation as a single multi-processor system with a consistent clock.

The concept of isolated tasks on a multi-processor system allows extending existing complex applications within the well-known x86 environment. Besides the availability of libraries for large (existing) applications that should be extended with high-frequency real-time tasks, also their use of a specific compiler (e. g. through compiler-defined behavior) may be a reason not to change the execution environment or operating system version. The entire application can be built with the same tools. Further, this concept enables a rapid implementation to achieve a short time to market, prototypes, and small series. Both research on such systems as well as research applications (control of experiments in other fields) can benefit from this concept.

1.3. Structure

In the following Chapter, the fundamental principles of the hardware, operating system, and real-time research used in this work are introduced. Chapter 3 presents related work and explains where this concept differs from previous efforts. The main part starts with the presentation of the isolation concept in Chapter 4 and a description of the required modifications to the Linux kernel to stabilize the system in Chapter 5. Chapter 6 presents solutions to circumvent the limited execution environment of bare-metal tasks on isolated processors. The hardware jitter induced by the x86 architecture is addressed in Chapter 7, where the NUMA architecture is presented as a viable solution for both memory and I/O transfers. Chapter 8 demonstrates how applications based on the presented concepts can be implemented and how existing complex applications can be extended. Finally, Chapter 9 concludes this work and presents future perspectives.


2. Fundamental Principles

This Chapter introduces the basic principles required for the following Chapters. The hardware is described to derive models of the system's behavior needed to analyze the predictability of the isolated processor concept and the hardware jitter effects. Further, operating systems and real-time systems are presented to centralize definitions and to explain details that are built upon later.

2.1. Hardware

This work mainly concentrates on current multi-processor systems of the 80x86 (x86) architecture using processors from Intel and AMD. This group of processors founded the class of Personal Computer (PC) systems, multi-purpose computers that quickly gained a large market share and formed the basis of today's ubiquity of computing. Besides the original PC systems, this architecture also powers servers, mobile devices, and embedded systems. Competitors of the x86 architecture are, among many others, ARM, MIPS, and PowerPC. All these processors are descendants of the Von Neumann architecture. Programs in binary code cannot directly be exchanged between different architectures, but the principles are similar and high-level languages can typically be compiled for all of them. Although this work employs the x86 architecture, the concepts can be transferred to other architectures.

The Von Neumann computer [Neu93] describes a basic model of a computer consisting of a Control Unit (CU), an Arithmetic Logical Unit (ALU), memory, and Input/Output (I/O). Today, the former two are integrated in a processor, and all parts are still the main functional units of computer systems. Current processors are more complex, but from the software side, they can still be modeled as a Von Neumann architecture.

This Section introduces only the foundations required for this work. More information can be gathered, for example, from the textbooks Computer Architecture by Hennessy and Patterson (design decisions) [HP07], Computer Organization and Design by Patterson and Hennessy (hardware functionality, emphasis on parallel processors) [PH11], and Advanced Computer Architecture by Hwang (high-performance computing) [Hwa93]. All details about the applied x86 processors are documented in the according manuals of Intel [Intel13e] and AMD [AMD12d].


2.1.1. Processor

The Central Processing Unit (CPU) is the active part of a computer that executes the programs. It contains a Control Unit (CU), at least one Arithmetic Logical Unit (ALU), and a set of registers. It accesses the memory via a data path and the Input/Output (I/O) capabilities either via the same path (system bus) or using a separate specialized path. The CU controls the execution of code by reading instructions and managing the ALUs and registers. The set of registers contains data registers for operands and results, and control registers, most notably the Instruction Pointer (IP), the Stack Pointer (SP), and a Flags register. The IP holds the address of the next instruction to execute. The SP manages the top of the stack (see below), and the Flags register supports arithmetic and logic operations and branches (e. g. it indicates whether an overflow occurred in the last operation).

An instruction is a directive to the CU or an ALU, possibly accompanied by one or more parameters. Examples are the loading of data into a register, an arithmetic operation on two registers, or the jump to a different location to continue execution there. The instructions understood by a CPU are defined in the instruction set, which is typically consistent within a selected architecture. The bit-width defines the size of the basic registers and the size of pointers to addresses. Therefore, a 32 Bit system generally supports up to 4 GiB of address space (2^32 Bytes = 4 GiB ≈ 4 000 000 000 Bytes), and the 64 Bit instruction set is an extension supporting a much larger address space. Instruction sets can be classified by the extent of supported instructions and combinations. A Reduced Instruction-Set Computer (RISC) architecture generally has a smaller set of instructions and is more streamlined, easing the CU's work.
In contrast, a Complex Instruction-Set Computer (CISC) architecture offers specialized instructions whose effect would require multiple RISC instructions. Thus, many tasks can be accomplished in various ways, and optimization becomes important and powerful. A typical instruction is accompanied by up to two operands, the destination and the source argument. An example x86 instruction is ADD rax, rbx, which instructs the processor to add the value of the register rbx (source) to rax (destination), storing the result in rax (in other words: rax ← rax + rbx).

The IP is either advanced to the next instruction, moved by a larger, given distance by a control transfer instruction, or diverted by an interrupt (Section 2.1.5). The former two cases depend solely on the internal state of the processor. Although external state can exert influence via I/O instructions, this influence is controllable because such instructions are placed deliberately. In contrast, the arrival of interrupts is generally unpredictable.

The processor executes instructions at a high frequency; today's processors easily reach more than 2 GHz. This equals a cycle length of less than 0.5 ns (a time in which electrical signals can travel no further than 15 cm). At such high frequencies, not every instruction can be executed completely in a single cycle. Especially loading data from memory takes considerably longer and will delay the execution.


For a long time, the performance of processors, measured in (million or billion) operations per second (MFLOPS, GFLOPS), could be increased by higher frequencies. But this also resulted in a higher energy consumption, and the heat dissipation became a problem. Other possibilities to increase the performance were introduced, for example pipelines to overlap the stages of instruction execution [PH11, Sect. 4.5], Instruction-Level Parallelism (ILP) to charge multiple ALUs from a single instruction stream [PH11, Sect. 4.10] resulting in super-scalar execution (multiple instructions per cycle), out-of-order execution to let independent instructions temporarily overtake others, and more. Thereby, the complexity of processors rises and the pipeline length grows steadily. Since wrongly predicted branches require the pipeline to be filled again, the branch predictors became more complex and performance-critical. Today's x86 processors are very complex, and this renders them very hard to predict. They are optimized for instruction throughput, but the time required for a given program depends on many parameters, some of which are very hard to investigate [Fog13a].

2.1.2. Protection Levels

Many processor architectures support different protection levels. The Current Privilege Level (CPL) is part of the state of the processor and of the code it is currently executing. Memory regions have an associated Descriptor Privilege Level (DPL), and code is only allowed to access a certain part of memory if the CPL is lower than or equal to the DPL. Figure 2.1 displays an extended ring structure [SS72] of privilege levels. Generally, code executed in a certain level has full access to the above levels (memory, function calls), but can only use well-defined services and interfaces from lower levels.

2.1.3. Memory

The Turing machine's tape extends its state space beyond that of a finite state machine. Likewise, the memory extends the state space of the Von Neumann computer beyond the internal states of the processor (i. e. registers and flags). Each location in memory can be accessed directly by its address. In modern operating systems (OSs), every process has a distinct, linear address space that is mapped to physical page frames with the help of a Memory Management Unit (MMU). Thereby, multiple processes can execute concurrently (or interleaved), each having its own view of a distinct linear address space, without interfering with other processes. In x86 processors, the MMU is a physical unit performing the address translation automatically and transparently for the running program. The details of the translation of virtual to physical addresses (Paging and Segmentation [Intel13c; AMD12b]) are of minor importance for this work. A number of recent translations is cached in the Translation Look-aside Buffer (TLB). On a miss in that cache, the translation takes a considerable amount of time, which must be regarded in time-critical systems.


Figure 2.1.: Ring structure of privilege levels. The levels 0–3 are exemplary for the x86 architecture, the levels −1 and −2 are optional.

In a typical multiprogramming OS [Tan07, Chap. 2], every process uses its own memory sections for code, data, and stack (Fig. 2.2). The stack is a Last-In, First-Out (LIFO) data structure for temporary storage supported by the processor's SP register. Further, the OS is embedded in the address space but not directly accessible from the process, to protect global data structures. It is possible to map physical memory into multiple processes to share data between them (shared memory). After its initialization by the OS, the access to shared memory is handled by the hardware in the same way as the access to private memory.

Demand paging is a feature to make more memory available than is physically present in the system. It is based on the observation that some parts of the memory are seldom used, for example only during the initialization or in the case of an error, and that most processes show a temporal and spatial locality, i. e. focus on a working set. Some regions can be swapped out of the main memory and stored on the hard disk at the granularity of pages (usually 4 KiB). Optimally, the logic chooses pages that will not be used in the time to come. When a processor tries to access such a relocated page, a page fault triggers the reloading of the respective page (by displacing another page). This procedure is transparent for the execution on the CPU except for the lapse of time caused by the interruption.


Figure 2.2.: Address space of a process and mapping to physical memory.

2.1.4. Cache

Over the years, the CPU speed increased faster than the memory transfer rate (Moore's law [Moo06]). To reduce the effect of this memory wall [WM95], caches were introduced. Caches benefit from the same working-set observation that also makes demand paging effective: most programs use only a small set of memory at a time. This working set moves, but the acceleration of successful accesses more than compensates for the required reloading of new regions. Since faster caches are more expensive, current processors use multiple levels of caches whose capacity grows with the distance to the CPU. Most strategies try to approximate the required memory by keeping the most recently used or the most frequently used data. Locality also implies that data near recent accesses is more likely to be used soon than distant data. This makes caches efficient that keep recently used data and pre-fetch adjacent elements.

Current x86 processors include up to three levels of caches, named from L1 (near to the processor, very fast, in the range of 32 to 64 KiB) to L3 (far, slower, 8 to 16 MiB). Figure 2.3 displays the cache hierarchy of the processor. The First-Level Cache (L1$) is usually separated into two parts exclusively for data (L1d) and instructions (L1i). Without loss of generality, only the data cache is regarded in the following. The term Last-Level Cache (LLC) can be used independently of whether two or three levels exist. The access times range from 3 cycles for the private L1$ up to some tens of cycles for the LLC, which is a major improvement over more than 100 cycles for main memory accesses [Fog13c, Chap. 11].

Figure 2.3.: Cache levels of Intel P4 processors.

The smallest entity in caches is a cache line. This is a block of Bytes (in current processors usually 64) that is always loaded from or written to lower levels contiguously. If a variable with a size of 4 Bytes is written, the entire cache line must be read into the L1$ (if not already present) and subsequently be transferred back to the lower levels and the main memory. Consequently, all cache levels must be synchronized to guarantee data consistency. The policies supported by x86 processors are Write-Through (changes are immediately propagated to lower levels and the main memory) and Write-Back (changes are written back only upon the eviction of a cache line) [Hwa93, Sect. 4.3.2]. The former results in higher cache and memory contention but faster eviction to load new data, while the latter reduces the bus load at the cost of eviction time.

Multiple cache levels can further be organized inclusively or exclusively [BW88; Dre07, Sect. 3.2]. Inclusive caches keep all elements of higher levels also in the lower levels so that evicted unchanged or written-through cache lines can simply be dropped [Intel13c, Chap. 11]. This comes at the cost of the total amount of cacheable data being the size of the largest cache. In contrast, exclusive caches allow replacing cache lines in lower levels when they are moved to higher levels, increasing the total cacheable amount of data to the sum of all cache sizes [AMD12b, Chap. 7]. The drawback of this strategy is that modified cache lines must be written back, eventually evicting other cache lines from the lower levels.

A cache can be organized with physical or with virtual addresses. When using virtual addresses, an access can use the address directly without translating it to the physical address. But since virtual addresses are only valid for the current process, every switch to a different context invalidates the entire virtually indexed cache. Therefore, current processors implement the smallest cache with virtual addressing and the others with physical addressing [PH11, Chap. 5.4, p. 508]. The cache management must be able to decide quickly if a requested element is present.
To simplify the logic, caches are usually organized with a limited associativity [Hwa93, Sect. 5.2.2]. N-way associative means that a cache line can be placed in only one of N associated places in the cache (with N commonly in the range of 2 to 16). Figure 2.4 displays an example of a two-way associative cache, where the elements of the memory can be placed in two different locations of the cache and, vice versa, every cache element can store one of four different memory elements.

Figure 2.4.: Cache associativity: 2-way.

If all activity on a cache as well as its management strategy are known in detail, the availability of items in the cache is predictable. But since the OS and possibly other processes also use the cache, it becomes increasingly difficult to track which data can be kept in the cache. A deeper understanding of caches can be obtained from the processor documentation [Intel13c; AMD12b] and computer architecture textbooks [PH11, Chap. 5; HP07, Chap. 5; Hwa93, Chap. 5.2; Dre07].

2.1.5. Interrupts

On most common processors, the current execution can be temporarily suspended by interrupts [PH11, Chap. 4.9]. This mechanism saves the current state of the processor, executes an interrupt handler, and subsequently restores the previous state to continue the suspended execution. Other than a lapse of time, the interrupted program does not notice the interruption. In a General-Purpose Operating System (GPOS), the interrupts are initialized and controlled by the OS itself and cannot directly be manipulated by user-mode processes. Interrupts can be used to signal events from peripheral I/O devices (Interrupt Request, IRQ) or to execute code triggered by a clock (periodically or at a given point in time). Software interrupts can be triggered by an executed program to notify the OS. This mechanism can be used to realize the system call interface because user-mode programs are not allowed to call kernel-mode functions directly. Another source of interrupts are exceptions, which are triggered by program or hardware errors, by demand paging, and for system maintenance (debugging). Brandenburg et al. group interrupts into four categories: Device Interrupts (DIs), Timer Interrupts (TIs), cycle-stealing interrupts (CSIs), and Inter-Processor Interrupts (IPIs) [BLA09]. In this work, CSIs are called Platform Interrupts (PIs) because they are inherent to the system and the others do also “steal” cycles. The most important example of an exception is the page fault, which instructs the OS to provide memory that is currently not present. The handler can provide a new page or swap in a page that was evicted to the hard disk by demand paging. The execution then repeats the triggering instruction, which should execute successfully once the

conditions are fixed. Other exceptions signal illegal actions of the executing program, leading to the OS aborting the process. Processors usually have an Interrupt Flag (IF) that blocks the serving of interrupts. The OS may use it to temporarily delay incoming interrupts during time-critical operations (e. g. hardware control) and to protect shared data structures. Since user-mode processes could interfere with the OS, they are usually not allowed to change this flag.

With long pipelines in the CU, interrupts are difficult to manage. Synchronous interrupts are triggered by an instruction somewhere in the pipeline while other instructions are already partly executed. An asynchronous interrupt must be asserted between two distinct instructions. In both cases, preceding instructions must be completely executed (retired) and side-effects of following instructions must be rolled back. This overhead increases with the length and complexity of the execution pipelines. RISC processors generally have a lower interrupt latency. Additionally, the MIPS architecture provides register banks that are used during interrupt handling so that the registers of the running process do not need to be temporarily saved to the stack in memory. In contrast, handling an interrupt on current x86 processors introduces a large overhead.

2.1.6. Input and Output

The third important part of a computer besides processor and memory is the Input/Output (I/O) system. This includes keyboard and monitor of a PC, but also clocks, background storage (i. e. hard disks), network, and raw electric signals. The access to I/O devices can generally be realized with special instructions (usually IN and OUT) accessing I/O ports or with registers embedded in the memory address space (“memory-mapped I/O”). In the latter case, control registers are read and written using the usual load and store operations identical to other memory locations (e. g. MOV). It depends on the device which mechanism must be used, what the addresses are, and how the bits in the registers are defined. Essentially, all interaction with hardware (peripheral devices) is based on setting and interpreting bits at certain positions.

The I/O ports may be connected to a dedicated I/O bus or accessed using the same system bus as memory transfers. Similarly, memory-mapped registers are routed via the memory bus out of the processor and are located somewhere in the chip set or interconnect. The software side does not depend on the implementation, but the performance may be impacted. The first single-processor systems used straight-forward implementations connecting the CPU with memory and I/O. Usually, the address was put on a parallel address bus and the data to or from that address was then transferred on the data bus. This allows accessing multiple devices or memory units by addressing them in the more significant bits, thereby extending the address range. The parallel I/O bus Peripheral Component Interconnect (PCI) [PCI02] was a widely used standard until it was replaced with its successor PCI Express (PCIe) [PCIe10], which uses serial point-to-point links. Both PCI and PCIe are used for the internal wiring of the Integrated Circuits (ICs) that compose the chip set on the main board and also provide standardized slots to install expansion cards.

2.1.7. x86 Architecture

The previous descriptions of important computer system functionalities apply to most modern computer architectures. Since the major part of this work is based on the x86 architecture, the details referenced in later parts are presented in this Section. Intel introduced the 8086 processor in 1978 [Intel13a, Chap. 2.1], followed by a series of compatible processors (80286, 80386, 80486, Pentium, Core, …) that are referred to as the x86 architecture. All new models are downward compatible so that old software (OSs and applications) can still be executed on the most modern x86 computers. Other companies, e. g. AMD, Via, and Cyrix, provide processors of the same architecture capable of running the same OSs and programs (i. e. binary compatible). Based on this processor architecture grew the PC ecosystem with compatible chip sets and peripheral devices (parallel and serial ports, graphics adapters, network interfaces, printers, etc.). Today, the x86 architecture is continued with Intel's Xeon and Core i processors and AMD's Opteron, Athlon, and Geode processors. The universal programmability was so successful that cost-effective systems became available from small embedded devices up to high-end servers, and many programmers are familiar with these systems. The processors from the leading suppliers Intel [Intel13e] and AMD [AMD12d] are fully documented. This allows completely understanding their operation and easily implementing or adapting OS-level code.

“The choice of hardware platform has become less important than it used to be. The distinctions between RISC and CISC processors, between PCs and mainframes, and between simple processors and vector processors are becoming increasingly blurred as the standard PC processors with CISC instruction sets have got RISC cores, vector processing instructions, multiple cores, and a processing speed exceeding that of yesterday’s big mainframe computers.” Fog [Fog13b, Sect. 2.1]

While hand-held devices (smartphones, tablets) are increasingly used for web browsing, consuming multimedia content and communication, the PC and notebook class remains important for demanding tasks such as constructing (CAD), graphics, video editing, games, and programming. This architecture even gains new application fields in server and cluster environments where Commercially off-the-Shelf (COTS) systems replace more and more workstations, mainframes and specialized High

21 2. Fundamental Principles

Performance Computing (HPC) systems. For example, Beowulf [Ste+95] was the first widely noticed cluster of COTS systems using open source software, and it triggered a boom of such systems. Today, most HPC systems in the Top 500 list [Meu+] are clusters of x86 nodes connected with a high-speed interconnect. There are standards for the software design, but the lack of standardization in hardware implies extra difficulties in implementing for a given processor architecture or porting to a different platform [Fue+12]. COTS systems, and especially x86, provide such a standardized platform.

Processors

Current x86 processors still start executing in the legacy 16 Bit mode for downward compatibility, but the major modes are the 32 Bit and 64 Bit protected modes IA-32 (x86_32) and AMD64 (x86_64), respectively. Compared to other architectures like ARM and MIPS, the x86 architecture provides relatively few registers (8 multi-purpose registers in 32 Bit mode, 16 in 64 Bit mode) and does not support register banks for low-overhead context switching on interrupts. These design decisions originate from the early implementations and were hard to change without breaking compatibility. The x86 architecture uses Little Endian byte ordering [Intel13a, Sect. 1.3.1; AMD12a, Sect. 2.2.1], storing the least significant byte of a multi-byte value at the lowest byte address.

The x86 processors support four privilege levels, but most GPOSs only use two of them, the kernel-mode (level 0) and the user-mode (level 3). Besides the address ranges that can be configured to be reserved for privileged access, some instructions are reserved for kernel-mode execution but can be configured to be usable also in user-mode programs (e. g. the I/O port instructions IN and OUT and the manipulation of the IF with CLI and STI). The extension to the inner rings −1 and −2 was introduced to illustrate the higher privilege of the hypervisor and the firmware over the OS [WR09].

Instructions of the x86 architecture can be grouped into the following categories [Intel13b; AMD12a]:

• Data transfer and conversion: move data between registers and memory or between registers of different size.
• Arithmetic: calculation of mathematical and logical operations.
• Logical: comparisons and decisions based on binary values.
• Control transfer: conditional branches (e. g. based on logical comparisons), unconditional jumps, and function calls.
• I/O: access to I/O ports (e. g. peripheral devices).
• FPU and SSE: floating-point and vector instructions.
• System: privileged instructions generally reserved for use by the OS.

22 2.1. Hardware

System Management Mode

An integral part of Intel [Intel13c] and AMD [AMD12b] processors is the System Management Mode (SMM). This is a privileged execution mode beyond the kernel-mode; the OS can usually neither deactivate it nor influence what is executed in this mode. Therefore, it is sometimes referenced as privilege level −2 (Fig. 2.1 on page 16) [WR09]. This feature is intended to be used by the firmware (BIOS, UEFI) for tasks such as energy management or security features that should be protected even from the OS. They are implemented as System Management Interrupts (SMIs), the most notable form of PIs on the x86 architecture. The memory where the handler is stored is not accessible from other privilege levels, thus it is beyond the OS’s control what code is installed by the firmware. This mode has attracted attention from security research [DEG01; BcD08; ESZ08; Duf+09; Wec09; WR09] and recently also in the light of real-time systems [Man09; SO13]. Wallraf has analyzed the effects of SMIs on real-time systems and how to detect, circumvent, or handle their influence [Wal10a].

Interrupts

The x86 architecture supports 256 different interrupts, called vectors because each of them may point to a dedicated handling routine. The first 32 vectors are reserved for processor exceptions; the remainder can be used for hardware Interrupt Requests (IRQs), software interrupts, and IPIs. The handler functions for each vector are registered in the Interrupt Descriptor Table (IDT), which is located in main memory and usually managed by the OS, but it can be manipulated by user-mode programs [kad02] if they manage to access the memory where the IDT is stored. The functional units controlling interrupt handling in the x86 architecture are summarized as Advanced Programmable Interrupt Controller (APIC). Located in the global chip set, the Input/Output Advanced Programmable Interrupt Controller (I/O APIC) detects signals from peripheral devices and forwards them to the CPU, where its Local Advanced Programmable Interrupt Controller (local APIC) handles the interruption of the running code. In the case of multiple CPUs (Section 2.1.9 on page 32), the I/O APIC allows to load-balance the interrupts between them or to route interrupts to dedicated CPUs. It also allows to trigger IPIs on addressed target CPUs [Intel13c, Sect. 10.4]. The APICs are programmed with memory-mapped I/O: their physical addresses can be mapped into the virtual address space and then accessed with usual load and store operations. The registers of the local APIC are documented in the processor manuals [Intel13c, Chap. 10; AMD12b, Chap. 16] and the I/O APIC is documented in the chip set documentation (e. g. south bridge Intel ICH7, Chap. 5.10).

Figure 2.5.: Classical PC I/O architecture (block diagram: CPU and cache on the system bus to the north bridge with memory; an I/O bus leads to the south bridge with PCI slots and devices).

The Flags register provides the Interrupt Flag (IF) that disables interrupt handling when cleared. It does not block exception handling, nor does it influence Non-Maskable Interrupts (NMIs). It is intended to be cleared only for short periods of time by the OS, but the instructions CLI (Clear Interrupt Flag) and STI (Set Interrupt Flag) can be permitted to selected user-mode processes.

Memory

The x86 architecture supports paging and segmentation, although the latter is mostly deprecated in the 64 Bit mode. The page size is 4 KiB with two or four levels of page tables (in 32 and 64 Bit mode, respectively). The lowest-level page table can be omitted to create Huge Pages of 4 MiB or 2 MiB (in 32 and 64 Bit mode, respectively), which saves TLB entries for large contiguous memory regions.

Input/Output

The x86 architecture supports both I/O ports and memory-mapped I/O. The I/O ports are accessed with the IN and OUT instructions for one, two, and four byte transfers. These instructions are usually privileged to the kernel-mode but can be permitted to user-mode processes. The access to memory-mapped I/O is restricted to processes that can access the according physical address range. Again, this is usually true only for the kernel-mode, but it can be established for user-mode processes too by mapping the physical address range into the virtual address space of that process. The devices in PC systems are classically organized as depicted in Fig. 2.5. The CPU is connected via the system bus to the north bridge [PH11, Chap. 6.5]. That chip hosts the Memory Controller (MC) that physically controls the memory chips. Further, the north bridge directs I/O transfers to the I/O bus that connects the south bridge, which contains some common devices such as serial and parallel ports, USB, network interfaces, hard disk controllers, etc. The south bridge also provides


Product Name          Micro-architecture   Introduction   Pipeline Length
Intel Pentium         —                    1993           10
Intel Pentium III     —                    1999           10
Intel Pentium IV      Netburst             2000           20–31
Intel Core            Yonah                2006           14
Intel Core 2          Merom, Wolfdale      2006           15
Intel Core i7         Nehalem, Westmere    2008           17
Intel 2nd gen. Core   Sandy Bridge         2011           17
Intel 3rd gen. Core   Ivy Bridge           2012           17
Intel 4th gen. Core   Haswell              2013           17
AMD Athlon/Opteron    K8                   2003           12
AMD Athlon/Opteron    K10                  2007           12
AMD Bulldozer         Bulldozer            2011           12
AMD Piledriver        Piledriver           2012           12

Table 2.1.: List of micro-architectures from Intel [Intel13d] and AMD [AMD05]; pipeline information estimated by Fog [Fog13d].

connections for further devices fixed on the main board and provides standardized connections such as PCI slots to install expansion cards. With the AMD Opteron and Intel Nehalem generations (2003 and 2008, respectively), the MC was moved from the north bridge into the processor chip to reduce its communication costs. The north bridge persisted for some time to host PCIe connections, but in recent systems (e. g. the Intel Sandy Bridge generation, ca. 2011), the PCIe connections are also provided by the CPU, and a system can be constructed with a single I/O controller (with the functionality of the former south bridge).

Current Implementations

Some recent examples of x86 processors from Intel and AMD are listed in Table 2.1. Since the marketing names (e. g. Pentium, Core i7, Xeon, Opteron) often obfuscate the implementation details (micro-architecture), the code names of the micro-architectures are listed. These names reflect performance-critical features [Fog13d, Table 1.1]. The instruction CPUID can be used to obtain detailed information about the executing CPU [Intel08a; AMD10]. The length of the instruction pipeline grew from 10 stages in the Pentium generation (1993) up to more than 30 stages in the latest Pentium IV generation (2005). The simultaneous increase of the processor frequency to nearly 4 GHz led to an extreme increase in energy consumption, which was finally a dead-end for further performance

growth. Intel went back to a simpler design with the Core generation (Yonah micro-architecture) with a pipeline reduced to 14 stages and enhanced that to about 17 stages in current processors.

The Core 2 and Nehalem [Fog13d, Chap. 8] micro-architectures introduced considerably faster atomic operations. Since Nehalem, misaligned memory access costs hardly any penalty. Fog has listed many details about the current micro-architectures Sandy and Ivy Bridge [Fog13d, Chap. 9], Haswell [Fog13d, Chap. 10], and AMD Bulldozer and Piledriver [Fog13d, Chap. 14]. In the Intel Nehalem and AMD Bulldozer generations, processors gained dynamic frequency scaling capabilities: a power management unit may change the speed of single cores/compute units. On AMD Bulldozer, the maximum clock speed is reached only after a long sequence of compute-intensive code [Fog13d, Sect. 14.1].

Besides architectural optimizations, multi-core and multi-processor systems have become common since approximately 2006 [Dom08, Sect. 1.4]. The specifics of multi-processor systems will be treated in Section 2.1.9 on page 32. Before that, the performance of single-processor systems is presented.

2.1.8. Performance Details: Micro-Benchmarks

On complex processors, the performance depends on the characteristics of an application. Therefore, to estimate the performance, some low-level details must be regarded. With more knowledge about how exactly a processor works and behaves, elementary metrics can be used to form models for a wider range of applications. This allows deciding whether a program can be improved further or whether it is already limited by a hardware restriction like the memory transfer rate. HPC has revealed a vast knowledge about optimizing algorithms [HW10; Fog13a] and about where the bottlenecks are (e. g. Memory Wall [WM95], Roofline model [WWP09]). In time-sensitive applications, it is desirable to estimate the execution time and its variance over a series of executions. Further, it is beneficial to know which instructions or operations are fast or slow and to get a relative impression of what could be accomplished in a given time. This section presents basic techniques for evaluating small portions of code, from basic blocks down to single instructions. The details and provided values relate to the x86 architecture [Pao10], but most methods apply in a similar way to other architectures [PH11, Chap. 1.4]. More details about specific CPUs are presented in the optimization manuals of Intel [Intel13d] and AMD [AMD05].

A benchmark measures the performance. For a computer, low-level or micro-benchmarks analyze basic parameters [MS96a], while application benchmarks measure the execution time for a given task. The latter are only relevant for comparable workloads. Some examples of application benchmarks will be presented in Section 2.1.10 on page 39 and Section 2.3.5 on page 64.


The most important metrics are latency and throughput. The latency is defined as the response time from an event triggering an action up to its handling. Applied to executing tasks, the latency is measured up to the beginning of the handler function, while in network transfers, the latency is the time between the beginning of sending data and the completion of receiving it. The throughput can be defined as instructions executed or bytes transferred per time unit. Usually, the latency is measured for very short instances (single instructions, one byte transferred) and the throughput is measured as an average over longer times. This makes the latency subject to a larger variance. The jitter is defined as the difference between minimum and maximum latency. While the minimum, resulting from the hardware implementation, is generally stable, the maximum latency is often subject to a large variance between test series. In a network transfer, this can be caused by concurrent network transfers; the instruction latency can be disturbed, for example, by cache eviction or interrupts. Therefore, the observed jitter may differ from the maximum possible real jitter if the maximum latency was not restricted by deactivating or avoiding the interfering effects.

Time-Stamp Counter

To measure very short execution times of single instructions, a timing function with low overhead and only minor variance is required. This is provided by the RDTSC instruction that reads the cycle count from a 64 Bit timer register. This time stamp is reset at system initialization and incremented with the CPU frequency. The 64 Bit width guarantees that this counter will not overflow in practice (at 4 GHz, it lasts for more than a century). The documentation [Intel13c, Chap. 17.13] states that this instruction is not serializing; a super-scalar out-of-order processor may therefore reorder instructions so that the measured routine is (partly) moved outside the timed region. This must be avoided with serializing instructions such as memory fences (available with the SSE instruction set) or the CPUID instruction.

The cycle counter used by RDTSC may depend on the current frequency of the CPU, so powered-down processors count slower than those running at full speed. Newer processors tend to provide invariant time-stamp counters that always run at the maximum frequency they support. Since the measuring routine executes CPU-bound code, the processor will run at maximum speed anyway. The lowest observed difference of two consecutive RDTSC instructions varies between 5 and 100 cycles (Table 2.2). On Intel processors, the latency appears to be influenced by the pipeline length. The serializing memory fence is generally much faster than the CPUID instruction and should be preferred where available. Single instructions between two measurement points are hidden by the super-scalar execution. In this case, the affected instruction must be repeated multiple times, and other effects such as cache misses and loop overhead should be considered [Fog13c].


Micro-architecture   RDTSC   MFENCE   CPUID
Intel P III            31      —       158
Intel P IV             97     244      730
Intel Core 2           80      88      380
Intel Nehalem          24      44      450
Intel Westmere         24      40      640
Intel Sandy Bridge     21      44      228
Intel Ivy Bridge       18      45      225
AMD K8                  5      25       65
AMD K10                60     100      119
Intel SCC              13      —        30
Intel Xeon Phi          6      —       180

Table 2.2.: Latency (in CPU cycles) of the RDTSC and serializing instructions (where available).

An alternative to the time-stamp counter are the Performance Measurement Counters (PMCs). These special-function registers count specified events with high accuracy and without overhead but are less portable.

Most instructions (especially data transfer and integer arithmetic) can be executed in a single cycle. Due to the super-scalar property, several such instructions can be executed concurrently if they use execution units that are available multiple times. The execution stream is limited by the decoding unit that fills the pipeline and by dependencies between instructions. The branch prediction is highly optimized and can adapt to the code within two to three iterations [Fog13c, Sect. 18.1]. Since pipelines in current processors have 12–22 stages [Fog13c, Sect. 9.6], a branch mis-prediction [Fog13d, Chap. 3] causes a pipeline flush and costs in the order of 12–50 cycles. In the following, some characteristics of current processor architectures are presented for reference and to put subsequent results into perspective.

Cache and Memory Access Latency

The execution time of a given routine depends largely on the efficient utilization of the caches. The latencies of register, cache (of different levels), and main memory accesses differ by several orders of magnitude. To reliably measure the access latency of the different cache levels, a routine with predictable cache utilization must be used. One such micro-benchmark is presented by Saavedra and Smith [SS95], McVoy and Staelin [MS96a, Sect. 6.2], and Babka and Tůma [BT09]. The Membench routine traverses an array of size range with a step of stride, which results in a predictable number of cache misses, given that the cache size is known and the benchmark is


Micro-architecture   L1$   LLC   Main Memory (sequential)   Main Memory (random)
Intel Core 2          4    13     54                         271
Intel Nehalem         4    12     21                          91
Intel Westmere        4    12     18                         123
Intel Sandy Bridge    4    16     23                         123
Intel Ivy Bridge      4    12     15                          91
AMD K8                6    17     60                         136
AMD K10               3    20     49                         194

Table 2.3.: Memory access latency in CPU cycles.

the only user of the cache. Repeating this test for a series of ranges and strides further allows determining the size of the cache lines and of the different cache levels, the associativity, and the TLB size [YPS05]. The results for some systems of different micro-architectures are presented in Table 2.3. The First Level Cache access takes 3–4 cycles on current systems; the Last-Level Cache takes between 12 and 20 cycles. If an L2$ is available, its access latency lies between those. The main memory access depends on the memory interface, on the quality of the installed modules, and on the access pattern: sequential access benefits from pre-fetching, while a random access pattern yields slower results. Further, the access to main memory is independent of the CPU frequency; therefore, the access latency measured in CPU cycles depends on this frequency. The values in Table 2.3 were taken on CPUs running at 2.3 to 2.6 GHz. This test series shows that the access to variables can vary between 4 and 200 cycles, depending on their availability in the caches or main memory. On 64 Bit systems, the MMU uses four levels of page tables. If the resolution of a virtual to a physical address cannot be served by the TLB, the table walk requires four additional memory accesses. If the required tables are not present in the cache, this can result in an additional slowdown of up to 800 cycles.

Notable Instructions

Both Intel and AMD implement their x86 processors with RISC cores and a decoder unit that translates x86 CISC instructions into micro-operations. Table 2.4 lists some exemplary values for recent processors [Fog13e]. Most data transfers and simple arithmetic functions are executed in a few cycles. The division is more complex, taking 10 to 40 cycles. The latency is the time it takes to execute a single instruction. The reciprocal throughput defined by Fog [Fog13e, p. 3] is the average time per instruction for a chain of similar, independent instructions. For example, the reciprocal throughput of 0.33

denotes three functional units that allow completing three such instructions in a single cycle, provided they are not functionally dependent (i. e. none depends on the result of another). Hence, most simple instructions are executed after a few cycles, and the super-scalar out-of-order pipeline is able to execute on average two to three instructions concurrently (if well utilized) [HW10, Sect. 1.2.4].

Function and system calls and interrupts are more complex since they need to store some context to be able to return to the instruction following the call. A function call pushes the current IP on the stack, and the return from that function uses the topmost stack element to restore the IP. System calls and interrupts usually switch to the kernel-mode and save some more registers. A function call is voluntary; therefore, caller and callee can agree on the registers belonging to each of them (usually defined in a calling convention). In contrast, interrupts can occur between arbitrary instructions and must therefore save and restore the complete state of the CPU. Table 2.5 lists the measured latencies of calls to an empty function, of a software-triggered interrupt, and of the syscall() function that is usually implemented with the efficient SYSCALL or SYSENTER instructions. All times include returning and restoring the interrupted execution. Most notable is the acceleration of all latencies with newer micro-architectures: in the P III and P IV generations, the system call takes nearly as long as a software interrupt, while in newer variants it is considerably faster.

Input and Output

In terms of I/O capabilities, the throughput is measured most often. For many applications, the effective throughput of a certain device is important. For instance, hard disk or network transfer rates are lower than those of the internal system and I/O buses and therefore do not saturate them. A range of specialized benchmarks was designed to measure classes of peripheral devices such as main memory (Stream1, Calibrator2), hard disks (IOzone3, Bonnie++4), or network connections (iperf5). The raw access parameters of the internal buses can be measured by reading and writing the configuration registers. The access latency of I/O ports with the IN and OUT instructions is in the range of 1000 to 5000 cycles. It depends largely on where the device is connected to the internal buses and on the device itself. The latency of memory-mapped I/O is in the same order of magnitude but supports the transfer of larger buffers with a better throughput. Compared to simple arithmetic instructions, the I/O on the x86 architecture is very slow. This will be investigated further in

1 Online: http://www.cs.virginia.edu/stream/ (visited May 6, 2014)
2 Online: http://homepages.cwi.nl/~manegold/Calibrator/ (visited November 12, 2015)
3 Online: http://www.iozone.org (visited November 12, 2015)
4 Online: http://www.coker.com.au/bonnie++/ (visited November 12, 2015)
5 Online: https://github.com/esnet/iperf (visited November 12, 2015)


Micro-architecture   Instr.   Nbr. of Micro-Ops   Latency   Recipr. Throughput
Intel Core 2         MOV      1–2                 1–3       0.33–1
                     ADD      1–2                 1–6       0.33–1
                     MUL      1–3                 3–7       1–4
                     DIV      4–7                 9–23      ?
                     JMP      1                   0         1–2
Intel Nehalem        MOV      1–2                 1–3       0.33–1
                     ADD      1–2                 1–6       0.33–1
                     MUL      1–3                 3–5       1–2
                     DIV      4–8                 19–28     7–17
                     JMP      1                   0         2
Intel Sandy Bridge   MOV      1–2                 1–3       0.5–1
                     ADD      1–2                 1–6       0.25–1
                     MUL      1–4                 3–4       1–2
                     DIV      9–11                20–28     11–18
                     JMP      1                   0         2
AMD K8               MOV      1                   1         0.33
                     ADD      1                   1         0.33–0.5
                     MUL      1–3                 3–5       1–2
                     DIV      31–88               15–41     15–41
                     JMP      1                   0         2
AMD K10              MOV      1                   1–8       0.33–0.5
                     ADD      1                   1–4       1.33–1
                     MUL      1–3                 3–4       1–2
                     DIV      ?                   15–87     15–87
                     JMP      1                   0         2

Table 2.4.: Latency of selected x86 instructions (in CPU cycles) [Fog13e].


Micro-architecture   Function Call   INT 80   syscall()
Intel P III           38              322      350
Intel P IV            97             1680     1720
Intel Core 2          88              970      450
Intel Nehalem         24              575      200
Intel Westmere        24              630      160
Intel Sandy Bridge    22              460      153
Intel Ivy Bridge      21              700      200
AMD K8                12              630      400
AMD K10               63              500      370

Table 2.5.: Latency of function and system calls (in CPU cycles).

Section 7.6 on page 170. Application benchmarks using common tasks to rate a whole system will be presented in Section 2.1.10 on page 39.

2.1.9. Multi-processor Systems

High Performance Computing (HPC) started to use multiple cooperating processors long ago [Hwa93] to obtain results faster and to increase the manageable problem size or resolution. Instead of engineering or buying a faster CPU, computer systems with multiple CPUs were constructed. With scaling algorithms, this allowed a nearly linear performance increase. Around 2006, Intel and AMD started to promote multi-core processors for commodity PCs [Dom08, Sect. 1.4]. The optimization of processors had led to overly complex designs, and the power consumption outgrew the benefits. To gain a comparable performance increase with architectural improvements, a doubling of the frequency (costing four times the power consumption, a quadratic increase) or extremely more complex instruction-level improvements (that only yield a few percent more performance) would be necessary [MS13, Sect. 2.1, p. 336]. By integrating two simpler processors, the power consumption was (only) doubled, but the performance could also reach a factor of two (linear scaling). Some new technologies can be exploited by optimizing compilers, for example automatic vectorization to use the SSE instructions. However, the new architecture introduces concurrency with new challenges, or, as Sutter expressed it: “The free lunch is over” [Sut05]. This refers to single-threaded programs that no longer benefit automatically from faster processors but must be parallelized to exploit the new multi-processor systems [Gee07]. And parallelizing programs efficiently is not trivial, since errors in the synchronization carry the risk of wrong results and bad performance [Moy13c; McK14b; Lue12].


Today, mobile systems (notebooks, smartphones and tablet computers) use two to four processors, desktop systems provide four to eight processors, and typical nodes of high-performance computers integrate 16 to 20 processors [HW10]. The New Moore’s Law predicts a doubling of cores per chip every two years [Vaj11, p. 3]. This leads to Many-Core processors, loosely defined as chips with more than tens of cores or, more timelessly, as substantially more cores than commonly available. This development poses new challenges in avoiding bottlenecks where too many cores share the memory interface.

Flynn classified computer system architectures depending on the number of execution contexts and the data partitioning [Fly72]: Single Instruction Single Data (SISD) stands for single-processor systems. Single Instruction Multiple Data (SIMD) denotes systems where a single program instruction processes multiple data items, e. g. elements of a vector; this technique is also called vectorization. The classification Multiple Instructions Multiple Data (MIMD) stands for multi-processor systems executing multiple programs concurrently (including shared-memory multi-processors and distributed systems, also called clusters) [PH11, Sect. 7.6].

Today’s HPC systems cannot easily be classified with these terms: the concepts, invented in the 1960s, are still widely applied, but in combination. Beginning with the Pentium MMX (1997), SIMD instructions for vector processing were integrated into the x86 architecture. Today, the successors SSE and AVX process up to eight single-precision or four double-precision floating-point operations with a single instruction. With the Pentium IV, Intel introduced Hyper-Threading, a technique for Simultaneous Multi-Threading (SMT) that presents two processors to the software [Vaj11, Sect. 2.2.1.7]. Those two CPUs are not fully featured; rather, the logical processors share some resources of the physical one, together gaining between 140 and 160 % of the performance of a single processor. AMD processors may integrate two “Cores” into a “Compute Unit” [AMD13, Sect. 2.1] that share some resources in a similar way. To software, the logical processors can hardly be differentiated from two physical ones. Subsequently, chips with multiple independent processors integrated in the same package (multi-core processors) became available. A more empirical classification, oriented to the software architecture, is the division into data parallelism and functional parallelism [Moy13c].
Data parallelism denotes multiple similar tasks working on different data, also called Single Program Multiple Data (SPMD) in the style of Flynn’s classification. In functional parallelism, also called Master-Worker or Multiple Programs Multiple Data (MPMD), the tasks do different things and often forward results to subsequent processing stages (data streaming). This can be realized on every architecture, but it has no advantage if only a single instance of data is processed. It benefits from a pipelining effect when the first stage of the next instance can overlap with the processing of the subsequent stages.


To the software, it does not matter whether multiple processors are integrated on a single chip (usually called “multi-core”), on different chips (in the same package or installed in different sockets), or whether they are logical processors (Simultaneous Multi-Threading, SMT): the OS lists them all as equal processors. For optimization, however, the layout and the interdependencies are important to detect bottlenecks. For code that is bound by memory transfer, it is sensible to use fewer CPUs than available to avoid contending for the system bus. Compute-bound code can benefit from using only one logical processor per core. Other optimization methods regard the physical dependencies between processors and physical units, such as the distance (latency, throughput) to memory, and place threads accordingly.

In this work, CPU and processor synonymously denote a von Neumann processor visible to software, independently of whether it is a logical processor (SMT, Hyper-Thread) or a physical one. If a distinction is required, logical and physical denote this characteristic. A chip is a single piece of silicon carrying one or more physical processors (each of which may provide multiple logical processors). A package contains a single chip or multiple chips and is installed in a socket on the main board. A multi-core processor integrates multiple processors in a single package, while for a multi-processor system, it does not matter whether the processors are integrated on the same chip, are logical processors, or are installed in multiple sockets.

Memory Architecture

Although the memory architecture is transparent from a programmer’s perspective, optimizing for speed or timing predictability requires knowledge of the layout and consideration of the behavior. The first real multi-core processors shared the system bus and had equal access to the main memory, called Uniform Memory-Access (UMA) (Fig. 2.6a). This architecture is subject to contention on the shared system bus. Systems with more processors avoid this bottleneck by placing multiple memory regions near the individual processors, resulting in faster access to near memory. These nodes consist either of a single processor or of a multi-core processor with an integrated Memory Controller (MC), each with local memory. Non-Uniform Memory-Access (NUMA) systems (Fig. 2.6b) can still run the same programs because all the memory is addressed in a linear manner. No Remote Memory-Access (NoRMA) systems are distributed systems where multiple systems cooperate via a network and cannot directly access another node’s main memory. This type of system requires special programming to exchange messages (Message-Passing, MP).

The current generation of x86 processors by Intel and AMD supports the NUMA architecture. Depending on the target group, a package can contain a single NUMA node, thus only showing an effect on systems with multiple sockets, or carry multiple Memory Controllers (MCs). Figure 2.7 shows an example system with one Interconnect Controller (ICC), one MC, and two CPUs per node. The processors can access the local MC directly. To access the other memory regions, a request


Figure 2.6.: Comparison of simplified UMA and NUMA architectures.

Figure 2.7.: Multi-processor architecture with NUMA characteristic.

must be transmitted over the interconnect to a different MC. In Fig. 2.7, node 0 can access the memory of nodes 1 and 3 over one network link, while the access to node 2 requires two links. The access to local memory always has the lowest latency. The remote accesses are slower and may all be similar or depend on the network distance. For a remote access, the probability of contention (lower throughput, larger outliers in latency) increases with the number of links (distance). Even more complex NUMA characteristics are possible, e. g. multiple systems as displayed in Fig. 2.7 could be connected by a different (external) interconnect.

In single-processor systems, the software developer can apply a simple model of sequence and dependencies to understand what data is written when and where and what a read access will see. The processor optimizations of ILP and out-of-order execution may reorder instructions, but they are still bound to functional dependencies (read-after-write, etc.). Additionally, the compiler may optimize a high-level language and emit machine instructions in a different order, since the compiler can only regard the current code and does not know about concurrent access. In multi-processor systems, other execution threads may concurrently access the same variable, and modifications may become visible to other processors in a different sequence than suggested by the instructions or the programming language. Various consistency models exist [AG96; Des13a, Sect. 3.2.3, p. 138] to describe the hardware behavior of different architectures and to help programmers verify their code. If the sequential visibility of memory transfers is crucial and some of those transfers use memory shared with concurrent execution, the consistency requirements must be made known to both the compiler and the processor. Compilers usually offer synchronization primitives that are respected by their optimizations and emit memory-fence instructions to the processor. A load or store fence guarantees that all previous reads or writes (resp.) are committed and globally visible before any subsequent instruction is executed. A memory fence is the combination of load and store fence. The x86 architecture provides the serializing assembly instructions SFENCE (store), LFENCE (load), and MFENCE (memory). The Gnu Compiler Collection (GCC) provides the intrinsic function __sync_synchronize()6 and the C11 standard [C11, Sect. 7.17.4.1] introduces atomic_thread_fence() for the same purpose.
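As an illustration, a fence can be used to publish data before a ready flag in a producer/consumer pattern; a minimal sketch using the GCC intrinsic named above (a full memory fence; the variable and function names are illustrative):

```c
#include <stddef.h>

static int shared_data;           /* payload published by the producer */
static volatile int ready;        /* flag signaling that data is valid */

void produce(int value)
{
    shared_data = value;
    __sync_synchronize();         /* all previous stores become globally
                                     visible before ...                 */
    ready = 1;                    /* ... the flag is set                */
}

int consume(void)
{
    while (!ready)
        ;                         /* spin until the producer signals    */
    __sync_synchronize();         /* read the flag before the data      */
    return shared_data;
}
```

Without the fences, the compiler or a processor with a weaker consistency model could make the flag visible before the data, so the consumer could read a stale value.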

6 Online: https://gcc.gnu.org/onlinedocs/gcc-4.9.1/gcc/_005f_005fsync-Builtins.html (visited November 12, 2015)

Cache Hierarchies The cache hierarchy used in single-processor systems (Section 2.1.4) introduces additional challenges when extended to multi-processor systems [Hwa93, Chap. 7.2]. Combining multiple processors into a shared-memory system offers two options: private and shared caches. If every processor has its own stack of private caches, a lot of data will probably be present in several of them. This effect is reduced by implementing shared caches at a lower level. The existence of multiple caches introduces the coherence problem because the private copies in all caches must be held in a synchronized state. Whenever a processor changes a value in its private cache that is present in other private caches, these copies must be either updated or invalidated. The x86 architecture provides cache coherence in hardware, so that the programmer does not have to care for this function-wise.7 This hardware management of cache coherence uses states attached to each cache line. The most common protocol is MESI, using the states modified, exclusive, shared, and invalid [HP07, Chap. 4.2]. Current x86 processors use that [Intel13c, Chap. 11.4] or a variant (e. g. MOESI [AMD12b, Chap. 7.3], MESIF [Intel09, p. 15]). In Figure 2.8, some examples are given for the management of cache coherence. Cache line A is exclusive to CPU x and could be dropped without further action. CPU y also holds line B exclusively, but it is marked as modified, so it must be written back before eviction. Cache lines C and D are shared between both. But while C is the same for both, D was modified by CPU x, rendering it invalid for CPU y, which must reload the data before the next access. Generally, the synchronization of caches can be organized by bus snooping (i. e. listening on the global system bus to what other CPUs are requesting and storing) or by message-passing protocols. While bus snooping is easily possible in UMA systems (Fig. 2.9), NUMA systems must provide a broadcast of memory operations to support those protocols. If snooping the transfers of all processors is too inefficient, directory-based protocols can be used [HP07, p. 208]. If multiple processors alternately modify the same variable, the cache line containing this memory location oscillates between their private caches.

Private cache of CPU x      Private cache of CPU y
A exclusive                 B exclusive, modified
C shared                    C shared
D shared, modified          D invalid

Figure 2.8.: Examples for coherent cache lines.
The same is true for two variables that are placed nearby and share the same cache line. In this case, two processors may each write to their own, distinct variable but, in effect, to the same cache line, with drastically reduced performance; this is called false sharing. To avoid this, data structures used exclusively by concurrent tasks should be placed at least a cache line apart from each other.
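In C11, such placement can be requested with alignment specifiers; a sketch assuming a 64-byte cache line (typical for current x86, but machine-dependent):

```c
#include <stdalign.h>   /* C11 alignas */

#define CACHE_LINE 64   /* assumed line size; query the CPU to be sure */

/* One counter per concurrent task: the alignment places each counter
 * in its own cache line, so updates do not falsely share a line. */
struct padded_counter {
    alignas(CACHE_LINE) unsigned long value;
};

struct padded_counter counters[2];   /* e. g. one per CPU */
```

The alignment also pads the structure size to a full line, so adjacent array elements never overlap in a cache line.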

7However, the performance usually benefits from regarding the cache layout.


Figure 2.9.: Processor architecture with private L1 caches (L1d, L1i) per processor, pairwise shared L2 caches, and a common memory controller in front of the memory.

Input and Output The data transfers to peripheral devices are implemented similarly to single-processor systems. The I/O bus is either a dedicated one or uses the same interconnects as memory transfers. Depending on that layout, I/O performance may interfere with the I/O and memory transfers of other processors. Since peripheral devices are not aware of multiple processors, software should ensure an exclusive reservation of devices to avoid races.

Multi-Processor Synchronization Multiple processes on a single-processor system are executed alternately in time slices (pseudo-parallel). In this case, blocking all interrupts avoids interference from other processes because the context switch is initiated by a timer interrupt. In multi-processor systems, processes on different CPUs can in fact be executed concurrently, and other means of synchronization are required. Therefore, processors support special atomic operations that guarantee inseparable execution. Reading and writing integer-sized variables is usually atomic, i. e. all bytes are read together so that a modification of one of them during loading is not possible. For more complex synchronization mechanisms, atomic operations [Sta09, Table 5.1] are supported that are guaranteed to complete uninterrupted (by other processors’ execution and by interrupts). Recent x86 multi-processor CPUs provide the LOCK prefix to execute a following instruction atomically. It can be used with a wide range of instructions, such as ADD (addition), BTS (Bit Test-and-Set, TAS; conditionally sets a bit), XCHG (exchange; swaps the content of a register and a memory variable), and CMPXCHG (Compare-and-Swap, CAS; exchanges conditionally) [Intel13b; AMD12c]. Two other important atomic operations used in many algorithms and provided by other architectures are FAD and LL/SC. Fetch-and-Add (FAD) loads a variable into a register while adding another value. The Load-Linked/Store-Conditional (LL/SC) operation is the only atomic operation on ARM. It loads a memory variable in

a linked state and later stores it only if the source was not touched by other processors since loading. The conditional store can fail, and the algorithm must handle this by aborting or retrying. The newest generation of Intel processors (micro-architecture Haswell, 2013) introduced transactional memory that can be used for multi-processor synchronization [Intel13d, Chap. 12]. It is a generalization of LL/SC [Vaj11, Sect. 5.4.5]. A transaction is a group of load and store operations that are either executed completely (and uninterrupted) or rolled back. In the latter case, the implementation can decide to retry or to abort and use other means of synchronization. On large machines with many processors, the synchronization can alternatively use special instructions using I/O devices or dedicated interconnects. All these instructions can be used in programs to realize synchronized access to resources. Instead of implementing these mechanisms repeatedly, they are usually provided by the OS (Section 2.2.1 on page 44). A different method for multi-processor synchronization is the use of Inter-Processor Interrupts (IPIs). Similar to a software interrupt, they allow triggering an interrupt, but on a different CPU. On the x86 architecture, they are realized in the local APIC that has configuration registers to issue such interrupts. With this mechanism, an interrupt can originate from hardware (peripheral devices, IRQ), software (voluntarily called locally with INT), an exception (involuntarily triggered by the local code), or a different processor.
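As an illustration of how these instructions are used, a minimal spin-lock can be built from CAS; a sketch using GCC’s __sync builtins, which compile to LOCK-prefixed instructions on x86 (the function names are illustrative):

```c
/* Minimal spin-lock from atomic Compare-and-Swap. */
static volatile int lock_var = 0;

void spin_lock(volatile int *lock)
{
    /* atomically change 0 -> 1; retry while another CPU holds the lock */
    while (!__sync_bool_compare_and_swap(lock, 0, 1))
        ;
}

void spin_unlock(volatile int *lock)
{
    __sync_lock_release(lock);   /* store 0 with release semantics */
}
```

This is the actively waiting kind of lock discussed in Section 2.2.1: cheap for short critical sections, but wasteful if the lock is held long.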

2.1.10. System Performance: Application Benchmarks For the detailed understanding of a system’s behavior, elementary metrics (Section 2.1.8) are very helpful. Further, they help to rate the performance and grade of optimization of a given application. But for the evaluation of applications, an exact model of their working is required (e. g. the Roofline Model [Wil08; HW10; PH11]). For many applications, this effort is undesirable. Instead, the performance of a given application can be measured and compared to modifications or to the execution on other systems. This includes profiling to detect where an application spends most of its execution time [Fog13b, Chap. 16]. Several collections of application benchmarks for the single-processor performance are available that measure typical algorithms. The classic benchmarks for floating-point (Whetstone [CW76]) and integer arithmetic throughput (Dhrystone [Wei84]) measure the respective operations per second. The Standard Performance Evaluation Corporation (SPEC)8 provides various benchmarks, e. g. CPU2006 to measure the processor performance [Dom08, Sect. 3.5.1]. The Embedded Microprocessor Benchmark Consortium (EEMBC)9 suites target embedded systems [Dom08, Sect. 3.5.1].

8 Online: http://www.spec.org (visited November 12, 2015)
9 Online: http://www.eembc.org (visited November 12, 2015)


The BDTI Benchmark Suites10 analyze digital signal processing performance [Dom08, Sect. 3.5.1]. The macroscopic analysis can use different clocks for its timing: OS functions offer a varying resolution from seconds down to nanoseconds, or the CPU’s time-stamp counter can be read (Sect. 2.1.8). It is reasonable to time a function repeatedly: The first iteration will be the slowest because the caches are cold [Pao10]. Subsequent executions will be faster depending on how much of the data can be held in the caches. To reduce the probability of a task switch distorting the results, the process priority can be increased. On multi-processor systems, the execution time may depend on the assignment of processes to CPUs, the placement of memory buffers (esp. on NUMA systems), the cache utilization, and the load balancing (how often processes move to other CPUs) [Kle04]. All this should be considered when setting up experiments to evaluate the performance of applications. Notable benchmarks for overall system performance on HPC systems are Linpack [Don88] and the NAS Parallel Benchmarks [Bai+91] that cover a range of typical algorithms found in many real applications. Further, there exist specialized benchmarks to evaluate the power usage or graphics processing performance.

2.1.11. Embedded Systems Embedded computer systems are defined as computers that are part of a larger system [Sta09, Chap. 13] and that have a dedicated function that is started automatically on system initialization [Moy13a]. In contrast to general-purpose computers (e. g. PCs, laptops), they usually have limited interactive components but interact closely with their (physical) system. Examples are the control computers of washing machines, electronic music keyboards, control units of internal combustion engines, digital controllers for factory production processes, and medical devices. According to some opinions, a web server, database, or a render farm for an animated movie can also be regarded as an embedded system since they serve a specific purpose and are carefully designed and optimized for their task [Moy13a]. Generally, the lines are blurry: While an electronic book (eBook reader) is a device dedicated solely to reading texts and thus is clearly an embedded system, a smartphone or tablet computer has a similar form (large screen, few keys, touch-sensitive screen) but allows installing applications (called “Apps” in this context) and may also be classified as a general-purpose computer. The function of an embedded system may often also be provided by electronics, but in the context of this work, it refers to the combination of hardware and software. An embedded system usually starts its predefined program after

10 Online: http://www.bdti.com/Services/Benchmarks (visited November 12, 2015)

being switched on. In contrast, general-purpose computers usually start applications interactively on user request. An embedded system often has special input and output means. Instead of keyboard and monitor, they usually employ sensors, actuators, limited displays (e. g. LEDs), and buttons or touch-screens. They often interact with real-world processes, and thus, they are often also real-time systems. But both terms are not congruent. Embedded systems typically contain special elements usually not present in PCs:

• Micro-Controllers (µCs) are ICs that combine a CPU with internal memory and peripheral devices for I/O capabilities. A µC enables the construction of cost-effective computer systems with very few additional components, while general-purpose CPUs must be installed on a complex main board providing special voltage supply, memory, chip set, and more.

• Co-processors (e. g. Digital Signal Processor, DSP; Field-Programmable Gate Array, FPGA; Application-Specific Integrated Circuit, ASIC) can be added to calculate special functions more efficiently than a small processor. They are less flexible but more powerful and energy-efficient.

• Input/Output: General-Purpose I/O (GPIO) provides digital interfaces; Analog-to-Digital Converters (ADCs) and Digital-to-Analog Converters (DACs) allow the interaction with variable voltages. These units are summarized as Data Acquisition (DAQ).

• Read-Only Memory (ROM) for fixed software (firmware) provided by the producer instead of allowing a user to install programs on a hard disk.

Embedded systems range from very limited capability in processing power and memory (e. g. coffee maker, airbag control) up to very sophisticated machine controls (e. g. airplanes, power plants, video systems). They are specifically designed and evaluated (tested). Updates are similarly tested and usually only provided to fix existing errors. On most embedded systems, no other software can be installed by a user. Very small systems can waive memory protection as only verified tasks are executed. However, there are also embedded systems that are very sensitive to security threats: Smartphones allow the installation of “Apps” while being expected to protect user data. Network equipment is subjected to hostile traffic engineered to attack vulnerabilities. Many embedded systems must endure extreme environmental conditions:

• Dust and water in industry applications,
• Extreme temperatures in automotive systems,
• Limited energy in portable devices,
• Severe cost optimization in consumer products,
• Superb reliability in life-critical systems (e. g. airplanes, medical devices),


• Very long lifetime (cars are expected to operate for decades, compared to PCs and smartphones, which are replaced after a few years),
• Difficult accessibility for service (e. g. satellites).

Therefore, embedded systems can be classified according to different aspects [ISO06, Chap. 4.1]. Some examples are: scale (small, medium, large), timing (real-time, non-real-time; how predictable the execution must be to correctly interact with real-world processes), or criticality (consequence of failure, e. g. toy car vs. airplane).

2.2. Operating Systems

An operating system (OS) is software that acts as an intermediary between user and hardware [SG94], also called user/computer interface or resource manager. The main objectives are convenience, efficiency, and the ability to evolve [Sta09, Chap. 2.1]. It acts as government, resource allocator, and control program: it governs the users and processes, supervising their privileges and guarding their integrity; it allocates resources, avoiding conflicts of concurrent usage and exhaustion; and it controls complying behavior, allowing to prevent or detect malpractice. Operating systems are available in a wide range of types. General-Purpose Operating Systems (GPOSs) are known from PCs: They are installed on a persistent storage (e. g. a hard disk), started first, present an interface to the user, and control the applications. In contrast, the smallest-scale embedded OSs are just a set of helper functions serving the same purpose (control, resource management), but they are integrated with the user functions into a monolithic firmware. Embedded systems (Section 2.1.11) may either use a GPOS that is usually configured in a way that only the required tasks and services run, or a special embedded OS. A GPOS is usually installed on the bare hardware and consists of the kernel, libraries, and tools. The kernel is the supervisor running with highest privileges. Libraries are collections of functions that can be used by programs. Tools are programs accompanying the OS to provide essential services, for instance to compile code to object files (assembler, compiler), handle object and executable files (linker), load a program to execute (loader), manage users and their privileges, and a variety of background services such as spoolers, network servers, and other daemons. This Section introduces concepts implemented in most common GPOSs. There exist experimental OSs that demonstrate radically new ways, but as this work targets common systems, only conventional methods are regarded.
Unless otherwise stated, the information in this Section is based on the operating-system textbooks by Tanenbaum [Tan07], Stallings [Sta09], and Silberschatz and Galvin [SG94].


2.2.1. General Concepts Multi-tasking OSs support the execution of multiple execution contexts that are separated from each other. These tasks all have a notion of exclusive execution, but are executed alternately in time slices. Depending on the hardware capabilities, the OS supports processes that have their own, unique, protected address space and/or threads that share the address space. Typical GPOSs support both, and multiple threads of a single process are separated from other processes (and their threads). Here, the term task is used for an execution context (process or thread) if the address space does not matter. The architecture of an OS is divided into the classes monolithic and micro-kernel. While a monolithic OS is a single program with a unique address space (the kernel-space), in a micro-kernel, only the most basic services are implemented in the actual OS while all other functions are provided by dedicated server processes with reduced privileges and protected from each other [Tan07, Chap. 1.7]. The micro-kernel architecture is considered cleaner and more secure, but its performance is more challenging due to Inter-Process Communication (IPC) being required where the monolithic kernel can use function and system calls [Sta09, Chap. 4.3]. In an embedded OS, tasks can be static, that means they are created at system initialization time and live as long as the system is executed (although they can be suspended and reactivated). More complex OSs and GPOSs support the dynamic creation of tasks. Usually, they inherit most properties from their parents that initialized them. Each task owns some resources (independently of whether they are protected from access by other tasks). The code section holds the instructions to execute. The code is loaded together with initialized data in the data section when a new task is created. The stack is a Last-In, First-Out (LIFO) data structure for temporary storage of data used for function calls and local variables.
The heap is a region of dynamically allocated memory for the storage of data whose size is not known beforehand. On hardware with memory protection (e. g. an MMU), each process is provided with its own address space. These virtual addresses are translated to physical addresses in hardware (see also Fig. 2.2). This allows executing programs that do not need to know in advance where their allocated memory lies but can rely on always using the same virtual addresses.
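The distinction between threads (shared address space) and processes can be made visible with POSIX threads; a minimal sketch (the function names are illustrative, the API is the standard pthread interface):

```c
#include <pthread.h>

/* Threads of one process share the address space: a store by one
 * thread is visible to the others. A separate process created with
 * fork() would modify its own copy instead. */
static int shared = 0;

static void *set_flag(void *arg)
{
    (void)arg;
    shared = 1;        /* visible to the creating thread after join */
    return 0;
}

int run_thread_demo(void)
{
    pthread_t t;
    if (pthread_create(&t, 0, set_flag, 0) != 0)
        return -1;
    pthread_join(t, 0);
    return shared;     /* 1: the thread's write is visible here */
}
```

Performed by a forked child process instead, the parent would still read 0, because each process has its own protected address space.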

Scheduling Multi-tasking OSs commonly manage states of their tasks. Those states often include ready, running, blocked, and finished. New tasks are created with the state ready. When they are executed, their state changes to running while the interrupted task is set back to ready. All tasks in the state ready are waiting to be executed again. They can be managed in a ready queue or by more complex data structures. If a running task has to wait for the availability of a resource or decides to sleep, its state

is changed to blocked. After the time-out or when the resource becomes available, the blocked task changes to ready and becomes eligible for execution again. When a task ends, its state becomes finished. This allows keeping the data structures for evaluation before deleting all traces of that task. The temporary suspension of a task and the activation of another one is called a task switch. A preemptive OS assigns time slices to tasks and is able to interrupt them whenever a time slice ends or a more important action is required. To the tasks, the preemption is transparent. They are temporarily suspended by an interrupt, and their state is fully preserved until they are continued. The opposing technique is cooperative scheduling, where tasks are only replaced by another task when they call system functions or voluntarily return the control of their CPU. The scheduling algorithm selects a new task from the ready queue containing the tasks in the state ready. A very simple concept is round-robin, where just the next task in the list of all tasks is selected. Different optimizations aim for various purposes: fair scheduling improves the utilization under differing workloads (e. g. short-running latency-sensitive tasks vs. compute-heavy number-crunching), and priority scheduling allows specifying which tasks are more important than others. Common goals of scheduling algorithms include system throughput, interactivity, and power awareness. A task waiting for something (time, a resource becoming available, notification by another task) can do so by polling the state repeatedly (spinning, busy waiting). This allows a prompt reaction to a state change but wastes execution time. The alternative is setting the own state to blocked so that other tasks can execute. In this case, the OS can wake up the blocked task or another task can do so.
For example, the sleep function call temporarily blocks the current task with a time-out after which it is reactivated. A hybrid approach is to poll with short periods of sleeping in between. This releases the CPU to other tasks while still controlling the frequency of checking the state, and it can be realized for actions that are not supported directly by the OS.
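Such hybrid waiting can be sketched with a POSIX nanosleep() between checks (the period and retry count are illustrative parameters):

```c
#include <time.h>

/* Poll a condition, but sleep briefly between checks so other tasks
 * can run. Returns 1 when the flag was seen set, 0 on timeout. */
int poll_with_sleep(volatile int *flag, long period_ns, int max_tries)
{
    struct timespec period = { 0, period_ns };
    for (int i = 0; i < max_tries; i++) {
        if (*flag)
            return 1;              /* condition met */
        nanosleep(&period, 0);     /* release the CPU for a while */
    }
    return 0;                      /* timed out */
}
```

The sleep period trades reaction latency against wasted CPU time: a shorter period reacts faster but polls more often.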

Synchronization In a multi-tasking environment, tasks can be executed concurrently. Regardless of whether they are executed alternately in fine-grained time slices (pseudo-parallel) or on a multi-processor system, sharing resources may cause races. Since all arithmetic operations are done in the CPU registers, a common procedure is to load the content of a variable into a processor register, change its value (e. g. by adding a constant), and store the result back into main memory. If this action is interrupted by another task that changes (e. g. increments) the same variable, the storing task will overwrite the modification of the other. Such a region of code modifying a shared resource is a critical section.
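The lost update can be avoided by making the load-modify-store sequence indivisible; a sketch with POSIX threads and GCC’s atomic fetch-and-add builtin (assumed available, like the __sync_synchronize() intrinsic mentioned earlier; names illustrative):

```c
#include <pthread.h>

/* Two threads increment a shared counter. A plain ++ compiles to a
 * load-modify-store sequence and loses updates under concurrency;
 * the atomic fetch-and-add makes the increment indivisible. */
enum { ITERATIONS = 100000 };
static unsigned long counter;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++)
        __sync_fetch_and_add(&counter, 1);   /* atomic increment */
    return 0;
}

unsigned long run_counter_demo(void)
{
    pthread_t a, b;
    counter = 0;
    pthread_create(&a, 0, worker, 0);
    pthread_create(&b, 0, worker, 0);
    pthread_join(a, 0);
    pthread_join(b, 0);
    return counter;   /* exactly 2 * ITERATIONS with atomic increments */
}
```

Replacing the builtin with `counter++` would make the result nondeterministic and usually smaller than 2 × ITERATIONS.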


Simple modifications of shared variables can be solved with atomic instructions (Section 2.1.9). If the critical section contains more complex operations (e. g. to change multiple fields of a structure that must be visible to others only as a whole because single changes are inconsistent), it must be protected from concurrent access. On single-processor systems, the temporary deactivation of interrupts inhibits task switches. On multi-processor systems, mutual exclusion [Dij65] is required, which allows only one task at a time to enter the critical section. This can be realized by checking a shared object (e. g. a Mutex) before entering and, if needed, waiting until the section becomes free. After leaving the section, the Mutex is notified that the section is free so that a waiting task can get the Mutex and proceed into the critical section. A more complex synchronization object is the Semaphore [Dij68], a shared variable that counts the number of free slots. The test to enter is successful if the value is above zero. This check must be done atomically with reducing the value on success. A Semaphore with only one slot (binary Semaphore) is a Mutex. Many classical synchronization problems like the Dining Philosophers and the Barbershop Problem [Dow08] can be solved with Semaphores. These synchronization objects can be constructed from atomic instructions like Compare-and-Swap (CAS) [Sta09, Sect. 5.2], Exchange (XCHG) [SG94, Sect. 6.3], Test-and-Set (TAS) [Tan07, Sect. 2.3.3], Fetch-and-Add (FAD), or Load-Linked/Store-Conditional (LL/SC) [Her91], depending on their availability on the applied processor. These implementations wait actively (spin-locks). This is a disadvantage because longer critical sections waste execution time of the waiting tasks. It would be more efficient to suspend a waiting task (state blocked) and switch to a different task that is able to progress.
Further, if a task holding a Mutex is interrupted, the waiting time of others increases, which may even result in deadlocks. The alternative is blocking synchronization objects. Since they influence the state of tasks, they can only be provided by the OS. Some implementation-dependent properties of Mutexes and Semaphores are:

• Acquire and release functions with timeout,
• Assignment to waiters: queue, priority, random,
• Waiting: active (spin-lock) or passive (blocking and wake-up by the OS), and
• Ownership: release only by the acquirer or by anyone.

If multiple participants access shared Mutexes, the risk of a deadlock becomes evident [Tan07, Chap. 6]. Four conditions are necessary for a deadlock: mutual exclusion, hold and wait, no preemption, and circular wait [Sta09, Chap. 6]. Operating systems can decide to handle this situation differently. Deadlock prevention is a system design that prevents the situation by breaking at least one of the four conditions [Tan07, Sect. 6.6], avoidance checks every request for its risk [Tan07, Sect. 6.5], and detection inspects the running system to resolve occurred deadlocks in hindsight. However, most current GPOSs just ignore deadlocks.
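A blocking binary Semaphore used as a Mutex can be sketched with the POSIX semaphore API, where sem_wait() suspends the calling task instead of spinning (the surrounding function names are illustrative):

```c
#include <semaphore.h>

/* Binary Semaphore (initial value 1) protecting a critical section. */
static sem_t sem;
static int shared_state;

int sem_demo_init(void)
{
    return sem_init(&sem, 0, 1);   /* 0: shared between threads only */
}

int protected_update(int value)
{
    sem_wait(&sem);                /* P(): enter, blocking if occupied */
    shared_state = value;          /* complex updates would be safe here */
    int snapshot = shared_state;
    sem_post(&sem);                /* V(): leave, waking one waiter */
    return snapshot;
}
```

With an initial value above 1, the same object acts as a counting Semaphore admitting that many tasks at once.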


The presented methods are based on the agreement that every access to a shared resource will be protected by the programmer. It is still possible to ignore the convention. A different approach is to use higher-level methods like monitors. A monitor is an object with integrated shared storage, but access is only possible through the provided functions. These functions inherently protect against concurrent access. Alternatively, the class of Message-Passing (MP) functions can be used. This avoids shared variables and only communicates with standardized send and receive functions that implicitly synchronize. Inter-Process Communication (IPC) comprises methods to pass messages between processes and is available in most OSs. The communication is classified as:

Asynchronous (no-wait) Send and immediately proceed.
Synchronous Proceed only after the message has been received.
Remote invocation Proceed only after a reply has been received.

The asynchronous communication can be used to construct the others and is therefore the most flexible type [BW01, Sect. 9.1]. On single-processor systems, it depends on the scheduling policy which task (sender or receiver) executes first. On multi-processor systems with real concurrency, asynchronous communication becomes feasible, and overlapping pipeline stages are fast and efficient.
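Asynchronous message passing can be sketched with a POSIX pipe, where the sender proceeds immediately after writing (as long as the pipe buffer has room) and the receiver blocks until a message arrives (shown within one process here for brevity; names illustrative):

```c
#include <unistd.h>
#include <string.h>

/* Send a message through a pipe and read it back. Returns the number
 * of bytes received, or -1 on error. */
int send_receive(const char *msg, char *buf, int buflen)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    /* send: write and proceed immediately (no-wait) */
    (void)write(fds[1], msg, strlen(msg) + 1);
    /* receive: blocks until data is available */
    int n = (int)read(fds[0], buf, (size_t)buflen);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Between two processes, the same pipe would be inherited across fork(), giving exactly the asynchronous (no-wait) semantics described above.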

Time Most OSs offer their tasks the service of timekeeping. This includes functions to ask for the current time (wall-clock time), measure time differences, install timers to wake up or get signaled at a specified time, and frequently repeat actions.

2.2.2. Hardware Abstraction Hardware abstraction makes programs more portable so they can be transferred to different systems. Without it, software must detect the peripheral devices and know exactly how to control each of them. To avoid reimplementing these details and to make a wider range of hardware available, OSs offer unified interfaces that abstract the implementation details. Further, the hardware abstraction allows protecting data from users. For instance, a file system abstracts from the storage details on the hard disk and introduces file ownership and access rights so that one user cannot delete another user’s data or modify the OS without permission. A basic abstraction is the task scheduling (Section 2.2.1) that creates the notion of a dedicated CPU for every task. On modern processors with multiple privilege levels (Section 2.1.2), processes are protected from each other. The most commonly used privilege levels are the kernel-mode with a CPL of 0 and the user-mode with a larger CPL (e. g. 3). Consequently, kernel-mode code is allowed to access all regions of memory while user-mode code may access only its own user-space memory. The memory protected from user-mode processes is the kernel-space. Classically, an

operating system executed in ring 0 controls the hardware and offers system calls to applications running in user-mode. The kernel can copy data to and from the application’s memory, but those applications can only use the offered interfaces. A hypervisor [Tan07, Sect. 1.7.5] can be classified as executing in ring −1 (Fig. 2.1) because it controls its guest OSs. This can be transparent to the guest OS, which sees no difference from being executed on real hardware. The memory management is an important responsibility of OSs. To processes and threads, a linear virtual address space is presented. The physical page frames are managed by the OS with the processor features paging and segmentation. The OS initializes and controls the Memory Management Unit (MMU). The available main memory can be extended with demand paging, the storing of temporarily unused memory regions to background storage (e. g. a hard disk). Since the swapping is time-consuming, it is crucial to apply suitable algorithms to select which regions shall be displaced. The most important part of hardware abstraction are device drivers that can be added modularly to the OS to extend its functionality. If a new peripheral device is released, an according device driver allows instructing existing OSs how to control this device without the need to install a new OS. In this way, existing software can also make use of new devices by adhering to the same abstracting interfaces. However, the details of how a device is controlled are often poorly documented, and only a driver provided by the vendor exploits the full functionality or performance. Many devices are accessed as part of a layered architecture where devices are grouped into classes and abstracted by the OS. For example, hard disks are not accessed on the bit level, but a file system offers the abstraction of files with names and directories independently of whether the data is stored on a hard disk, USB stick, or a network drive.
In the latter case, a network adapter is used transparently without the user-mode software needing to adapt. Another layered software architecture is the network stack that abstracts from the physical transport layer.

2.2.3. Programming Languages and Application Programming Interfaces

Software is usually written in a programming language and compiled into machine code. High-level programming languages abstract from the machine code, and compilers allow the generated instructions to be optimized. Optimization targets are execution time, code size [Fog13b, Chap. 4], data size, or time and/or space of the edit/compile/link process [ISO06]. If utmost optimization is required, different compilers should be tried, and the recommendations given in optimization manuals [Intel13d; AMD05] and textbooks [Fog13a; HW10] should be followed. For very performance-sensitive tasks, parts of programs can also be implemented in Assembly language, which can be converted instruction by instruction to machine code. The machine code is closely tied to an architecture. Depending on the task, a suitable language should be selected [Fog13b, Sect. 2.4]. Common programming languages are [Jai13]:

Assembly: Low-level machine language, complex to use but offers direct access to all features. Tied to an architecture.

C: Mid-level language for hardware interaction, operating system implementation, and performance-critical applications.

C++: Object-oriented high-level general-purpose language with good performance.

Java: Very portable general-purpose language with sufficient performance.

Python: Scripting language for quick modifications and easy maintenance.

Ada: Real-time language for safety-critical embedded systems.

Besides the OS interfaces, Application Programming Interfaces (APIs) extend functionality and ease development [Dom13]. Different OSs that support the same API allow the same programs to be used on all of them (portability). Notable interfaces are the C library that is part of the programming language's standard, and the UNIX standards POSIX [POSIX08] and System V [SysV95a; SysV95b; SysV95c]. The MCAPI [MCAPI11; Wal10b] is a communication standard for shared-memory multi-processor systems. The Multicore Programming Practices guide is published by the inventors of MCAPI to provide "best practices" [Lue12] in writing code that can easily be parallelized [Mul13]. Other standards for multi-processor systems are currently in development by the Multicore Association.

2.2.4. Multi-Processor Aspects

Current general-purpose processors offer parallelization based on Single Instruction Multiple Data (SIMD, vectorization) and Multiple Instruction Multiple Data (MIMD, specifically shared-memory multi-processing). The SSE and AVX extensions support embedding vector operations in otherwise serial code. This does not require fundamental modifications of OSs. However, to support multiple processors, an OS must be constructed accordingly [Tan07, Sect. 8.1.2]. Different strategies are Asymmetric Multi-Processing (AMP) and Symmetric Multi-Processing (SMP) [Sta09, Sect. 4.2]. An AMP system can be implemented on heterogeneous processors where the OS is executed on a general-purpose CPU. It delegates work (tasks) to restricted or optimized processors. But, more relevant to this work, this approach can also be implemented on homogeneous systems (like x86 multi-processor systems) by executing multiple independent OS instances on dedicated processors [Moy13d, Sect. 6.3]. Likewise, single-processor software (OS and applications) can be transferred to a multi-processor system with only minor changes. However, either the OS must be instructed to use only the dedicated resources, or a hypervisor enforces the separation by generating the impression of private systems (virtual machines). In an SMP [Vaj11, Sect. 3.3.1] system (Fig. 2.10), a single OS instance is executed on all processors. Tasks can be executed on all processors and the load is balanced to efficiently utilize


[Diagram: tasks P0–P4 are mapped by a single OS instance onto CPU0–CPU3.]

Figure 2.10.: SMP Operating system on multi-processor system.

all processors. Scheduling becomes more complex [KCS04; RLA07] due to the load-balancing between the processors, fair sharing of common caches, and energy-saving attempts [SSP08]. A special case of this type is Bound Multi-Processing (BMP) [Vaj11, Sect. 3.3.3] where all (or some) tasks are not shifted between processors but bound to fixed CPUs to enhance the performance or predictability. An existing single-processor OS can be extended with SMP support by adding the process of initializing the other processors, enabling multi-processor interrupt handling, extending the scheduling and load-balancing algorithms to include multiple processors, mutually excluding concurrent access to shared resources, and providing those synchronization mechanisms also to user-mode applications [Dom08, Sect. 4.2]. Current GPOSs are SMP systems. But the lines are blurred: the common OSs Linux, Windows, and Mac OS all support binding tasks to dedicated processors (Bound Multi-Processing, BMP) and can act as hypervisors for virtual machines (that function is usually provided by additional software). On multi-processor systems, scheduling and load-balancing are an important factor. Global scheduling algorithms are possible, but current GPOSs implement one run queue per CPU with a single-processor scheduling algorithm and balance the load between them. The goals of load-balancing vary depending on the application and include evening out the load to maximize system throughput, concentrating the load on few CPUs to switch others off (power-awareness [Cor13b]), or placing tasks near to their memory (e. g. on NUMA systems [Bre93]). The memory management must be adapted to multiple processors concurrently accessing its data structures. For NUMA systems, a further optimization potential is the placement of allocated memory. The strategy Affinity on First Touch uses the minor page fault to assign a physical page frame local to the processor that first accesses (reads or writes) a memory page, independently of which processor made the allocation call. Further features discussed for HPC systems are the migration of memory if the access pattern moves to a different memory node, or executing tasks as near as possible to their memory working set.


The synchronization objects provided for user-mode applications were presented in Section 2.2.1. These are suitable for multi-processor systems [HS08], although systems with many processors (also called many-core systems) demand additional optimization methods [Reb09]. Instead of spin-locks on the same shared variable, fairness and performance may require multi-level algorithms. The internal synchronization of the OS is done with the same algorithms and with Inter-Processor Interrupts.

2.2.5. Actual Implementation: Linux

This work concentrates on Linux for several reasons. Linux is open source, which means all its components are available as source code and can be analyzed and modified. Further, it is very well documented [BC05; Lov10] and the base of many research projects. Above all, Linux is very versatile and widely accepted; it can be – and in fact often is – used on small embedded systems, mobile and desktop computers, internet servers, and many HPC systems. Linux is a UNIX descendant, but it is not fully POSIX [Lew91] compatible, although it mostly adheres to that standard. The Linux API [Ker10] is stable; therefore, programs can be used on a wide range of Linux distributions, and future versions are guaranteed to execute old programs without modification. A Linux distribution consists of the Kernel, being the core OS, and a wide range of supporting software that provides the libraries (e. g. GNU LibC), tools (e. g. GNU binutils), compilers (e. g. GCC or clang), and other software (e. g. a Graphical User Interface (GUI) like KDE or FVWM). The officially published11 Linux kernel is called the mainline Kernel or Vanilla kernel. Many projects modify their distribution Kernel to support more features. Linux is a very modern OS that supports many architectures and includes many optimizations. It is actively developed with a new Kernel version every two to three months. Since it is used in many large computers, its scalability is highly optimized [GSC07]. Some features utilized later in this work are presented below.

Kernel Threads

For internal maintenance, Linux uses kernel threads, which are tasks executing in kernel-mode [Vir12]. Some kernel threads exist for special purposes (e. g. to process the items waiting in the work queue) and others are pinned to their CPU to execute only there. The pinned kernel threads usually respond only to requests on their processor.

Hotplugging

Linux supports the deactivation and removal of components during system runtime with the Hotplugging system [Cor13f]. This allows certain processors to be switched off.

11Online: https://www.kernel.org (visited November 12, 2015)


The OS handles the remaining processes by moving them to other CPUs and then shuts down the processor. After restarting a CPU, it is added to the set of available (online) processors and the load-balancer is free to execute tasks there.

Timer and Wall Clock

Linux uses a timer interrupt with a frequency of 100 to 1000 Hz (depending on the configuration). Recent versions support a flexible setting that adapts itself to the system load. The timer interrupt is used for keeping the wall-clock time and for internal maintenance. Linux supports various sources for this timer interrupt. The classic unit of the x86 architecture is the RTC (real-time clock); current systems use the internal local APIC of each CPU. A higher precision is supported by the High-Precision Event Timer (HPET). Linux supports the POSIX timer API for user-mode programs with resolutions down to Nanoseconds.

Multi-processor Support

Linux supports Symmetric Multi-Processing (SMP) and Bound Multi-Processing (BMP). Tasks can be bound to single processors or groups of them, either with the CPU-Affinity function that sets process properties or with the CPU-Set12 method that supports the management of containers. For the optimization of HPC applications, the topology of CPUs and memory nodes can be queried (as documented in [LINUX:cputopology.txt]13) or managed with tools like Likwid14. The memory management [Gor07] is optimized for NUMA systems [Kle04; Bli+04].

Read Copy Update

Read-Copy-Update (RCU) is a synchronization mechanism to optimize shared resources that are mostly read and rarely changed [McK+01]. With conventional locking (e. g. a Mutex), every access requires two additional operations to lock and unlock the critical section. On large multi-processor systems, this operation causes cache contention, which costs a considerable amount of time, especially relative to short read accesses [McK09]. RCU takes an optimistic approach that allows reading without further protection. Generally, data can always be read. Removing no longer needed data structures is postponed until no readers are active on the respective element. Instead of modifying elements, new elements are created and the old ones are removed when no reader is affected anymore. The writer can either wait for the next grace period and then proceed with deleting or modifying an element, or it can register

12Online: https://rt.wiki.kernel.org/index.php/Cpuset_management_utility/tutorial (visited November 12, 2015) 13Documentation/cputopology.txt (Linux 3.12) 14Online: https://github.com/RRZE-HPC/likwid (visited November 12, 2015)

a callback function with the RCU system that will be called later to catch up with the required actions. The waiting approach would block the execution for unpredictable times, particularly with the pessimistic estimation of grace periods. Therefore, most parts of the Linux kernel use the callback method. The drawback is that every modification is more costly. Since the readers do not register themselves, a heuristic, overly pessimistic approach is taken: the detection that no more readers are using an element is done pessimistically by waiting until all CPUs have changed their execution context. With exact knowledge about the scheduling and inter-processor relations, the algorithm can define a grace period after which every reader has left. An overview of the Linux implementation and further literature is given in [LINUX:RTFP.txt]15. One important use case of RCU is the Slab system [Bon94] that provides reusable memory buffers and automatically deallocates no longer used buffers after an RCU grace period to ensure that no processor still uses them.

2.3. Real-Time

A real-time system has time-critical jobs that not only need to calculate a correct answer but also need to provide this answer at a specified time. The job can be a function call or the reaction to a trigger (e. g. an interrupt handler). The time specification is usually a maximum time-out, but a minimum time is also possible. Recommended textbooks about real-time systems are available by Kopetz [Kop97], Liu [Liu00], and Burns and Wellings [BW01]. Often being embedded systems, real-time systems are commonly part of larger systems interacting either with other computers or with physical processes (plants). The system designer specifies the timing constraints, but they are usually derived from external real-world processes or the laws of control stability. A common misconception is the confusion with fast systems [Sta88; McK08]. Although a faster system may be more successful in meeting a deadline than a slower system, solid real-time systems are designed differently in order to be able to provide guarantees. They are optimized for the predictability of timing at the expense of raw performance. When speaking of real-time systems, an important distinction is between hard and soft real-time. Burns and Wellings define real-time as any task with time sensitivity in the sense of a delayed result being as harmful as a wrong result [BW01, Chap. 1.1]. Their separation of hard and soft real-time is, however, guided by criticality (defined below). In contrast, Liu presents a very sophisticated definition of real-time [Liu00, Chap. 2]. After listing common definitions (based on criticality, quantization of timing failures, and determinism) and arguing their advantages, real-time systems are defined simply depending on whether the system designer requires an elaborate formal verification (i. e. hard real-time) or if statistical demonstration (i. e. soft

15Documentation/RCU/RTFP.txt (Linux 3.12)

real-time) suffices. This is sometimes phrased as guaranteed versus best-effort service. Soft real-time is sometimes associated with a time/utility function [JLT85] that specifies how the utility of an answer degrades over time after the deadline. A steep decrease to zero is then defined as hard real-time. The definition agreed upon in most publications and used in this work is that a hard real-time system must provably always comply with the specification, while a soft real-time system is allowed a certain degree of failure (e. g. 95 % of all events must be on time and for the remaining ones, a deviation of 10 % is allowed). Following that definition, hard real-time systems must be formally verified while soft real-time can be validated more heuristically. The formal verification includes the analysis of the Worst-Case Execution Time (WCET) and the specification of frequency and probability of all events. Since this procedure comes at a high effort [Liu00, Chap. 2.4.1], the system designer must balance the real-time requirements with development costs. Many applications include tasks with a wide range of specified demands. However, if a task depends on other tasks, its verification is simplified if it can be based on hard deadlines kept by the other tasks [Liu00, p. 31]. Besides the hardness, the magnitude of the specified latency is important. A process depending on temperature may have a deadline in the Second to Minute range, while other systems must respond in the Micro- or even Nanosecond range. Obviously, a larger latency requirement, even with hard limits, may be easier and cheaper to realize than a soft but extremely short deadline. These terms should not be confused with criticality, which is often done with the examples of a nuclear power plant and a video codec for hard and soft real-time, respectively. Anderson et al. provide examples for criticality from an aviation standard with the levels Catastrophic, Hazardous, Major, Minor, and No Effect [ABB09, Table 1].
A similar term is availability, which is mentioned as an advantage of distributed systems [Sta09, p. 708]. The often cited reason for hard real-time being people's safety/injuries may derive from the fact that, especially for systems that are safety-critical, the "designer has the burden of proof that bad things will never happen" [Liu00, p. 31]. This is similar to the definition of hard real-time, and in fact, many critical systems are also hard real-time systems (but not the opposite). The assessment of a system's criticality must include the failure rate of the computer hardware. If the hardware has a certain probability to fail, the system must include a strategy to handle timing violations and outages. In critical systems, this could be a second fail-over system (hot stand-by), while in less critical systems a detection, verbose warning, and transition to a safe state (e. g. stopping a robot arm) may suffice [KSB10, Sect. III-C]. But the real-time property is restricted to the software, and hardware failures are beyond the scope of this work. On the x86 architecture, the hardware failure expectation is higher than on special embedded systems. Therefore, a full evaluation must include software and hardware, and the equation becomes simpler if a strong assurance can be given for as many parts as possible.


The terms process and thread are defined in OSs, as is task, being either of them (Section 2.2.1). The smallest entity of work (e. g. query a sensor, set an actor) is a job [Liu00], of which multiple can be in the responsibility of one task. An application is a collection of tasks that jointly serve a purpose (e. g. control a plant). The measurements latency and jitter were already defined (Section 2.1.8) as the time between a stimulus and the triggered event, and the difference between minimum and maximum thereof, respectively. Examples for stimuli are an external event (electric impulse), a message from another CPU (IPI on the hardware level), a message from another process (IPC on the OS level), or a previously blocked resource becoming available. The latency is generally not constant. The jitter is a measure for the absolute variability, hence the difference of minimum and maximum latency. For hard real-time, the maximum latency and the Worst-Case Execution Time (WCET) are important [Wil+08]. The WCET is the longest run-time of the respective task. In a formal verification, this must include all possible code-paths of basic blocks in the Control Flow Graph (CFG), using the worst expectations for retries, loops, and instruction latencies [Fer+01]. In the presence of caches, it must also include cache misses where they may occur. Both values can be measured or calculated. To measure the worst case, strong efforts must be made to provoke as many disturbances as possible. This is usually done by loading the system with more work than realistically present during real execution and by measuring for a long time to increase the probability of catching incidents as near to the worst case as possible. Since hard real-time requires a formal verification, it cannot be proved by experiments but must be calculated abstractly. On real hardware, the minimum latency (or execution time) depends on hardware effects (e. g. instruction latency, memory transfer).
If the average or median value is near the minimum, this is an indication that outliers are rare. For the analysis of jitter sources, more sophisticated statistics or histograms are helpful. But if the investigated effect is very short, the statistics must be recorded with minimal impact to avoid distorting the results.

2.3.1. Application Architecture

The automation pyramid (Fig. 1.1 on page 3) illustrates how real-time systems are applied in most factory control systems [Sau07]. To narrow down the topic and to specify general application areas, Liu classifies real-time systems as follows [Liu00, Chap. 1]:

Digital control Field level (lowest): Form (feedback) control loops by directly reading sensors, computing an output value according to a control law, and modifying actors.

Guidance and control Cell level: Management of multiple digital controllers, e. g. in an aircraft: flight level control (to control the physical plant), flight management (to control the virtual plant provided by flight level control, i. e. manage the flight path), air-traffic control (to control coarse-grained navigation on the basis of flight management).

Optimal control Cell or company level: Implement optimization problems (of path, cost, etc.).

Command and control Company level (highest): Control multiple systems on the level of other controllers. This usually queries and displays their state and processes user input and calculates plans.

Generally, the timing periods grow from digital control (Micro- to Milliseconds) to command and control (Seconds to Minutes). Similarly, the "hardness" of the real-time requirements becomes softer in the same direction. For example, if an airplane does not get an update from optimal control at the expected time, it can still follow its previously set course (safe state). The components at the lowest level, with the highest timing accuracy requirements, are the Programmable Logic Controllers (PLCs). They often form closed-loop feedback controls as illustrated in Fig. 2.11a. Digital controllers commonly need a conversion of analogue sensor readings to digital values (Analog to Digital Converter, ADC) and a generation of analogue current for actor settings (Digital to Analog Converter, DAC). The sampling period of the ADC and DAC is a key design choice. Their selection may be based on the perceived responsiveness of the system or on the dynamic behavior required by the laws of control theory and stability theory to keep the oscillation, and thus the process, under control. A similarly demanding application is Hardware-in-the-Loop (HiL) Simulation (Fig. 2.11b). It is used to verify the implementation of PLC embedded systems when the physical system is not (yet) available. The simulator is driven with the same signals and currents as the real actors and provides the same readings as the sensors. A physical model is implemented to simulate the behavior of the real system. The timing is very demanding in order to correctly verify whether the PLC implementation would work against the real plant. In multi-rate systems, multiple sensors collect multiple states and multiple actors influence multiple dimensions of behavior. A digital control could implement different rates: in a combustion engine control unit, the rotation can change faster than the temperature, thus the distinct control computation tasks can use different rates, or the frequency of a task can depend on the rotational speed. This is where tasks with different priorities or different deadlines originate. In signal processing, applications stream data with real-time constraints, e. g. for digital filtering, video and voice (de)compression, and radar signal processing. Although many of these systems are subject to soft real-time, there may be systems where hard real-time is required (e. g. radar, in addition to high computing demands).


[Diagram (a): the input and the feedback are combined and fed to the Controller (PLC); its output drives, via D/A conversion, an electromagnet holding a metallic ball (the physical system, or plant); an optical distance sensor closes the loop through an A/D converter.]

(a) Closed-loop control: The Programmable Logic Controller (PLC) controls a physical system (plant).

[Diagram (b): the finished PLC module is connected through its A/D and D/A converters to the D/A and A/D converters of a HiL simulator that models sensor, ball, and magnet.]

(b) An implemented PLC is tested against a simulator that behaves like the plant.

Figure 2.11.: Hardware-in-the-Loop Simulation.


In special µCs and appliances, Digital Signal Processors (DSPs) are added for stream signal processing. Signal processing may be of constant complexity (in the radar example: basic data processing) or time variant (in the radar example: dependent on the current number of tracked objects). A similar example is the Positron Emission Tomography (PET) data processing, where medical images are created based on associated events that must be discovered in a high-volume data stream.

[Diagram: initialization, then an endless loop calling task 1, task 2, and, every Nth iteration, task 3.]

    void task1(){ /* do something */ }
    void task2();
    void task3();

    void main(){
        initialization();
        while(1){
            task1();
            task2();
            if (cnt % N == 0)
                task3();
            cnt++;
        }
    }

Figure 2.12.: Control flow diagram and example code of time-triggered paradigm.

The possible implementations range from a single-processor µC running only the application without an OS (i. e. bare-metal) or with a statically linked embedded OS up to multi-processor high-end servers and distributed systems. However, some paradigms are common to a range of systems. In time-triggered systems [Kop91], all tasks are executed periodically in a large endless loop (Fig. 2.12). If a task is required less often (with a different frequency) than the others, it can be activated every Nth loop iteration. This paradigm is very basic and can be implemented on the most simple processors. It is well suited for closed control loops where the first task reads sensor values, the second calculates the control function, and the third one sets actor values (Input-Process-Output (IPO) model). An optional task could be used for communication with a production control system to send the current state and receive the set point for the control function. The frequency of the entire loop can be set to a given value by waiting before the next iteration until its start time has come. As long as all tasks keep their expected execution time, the timing is correct. Thus, the formal verification of the time-triggered paradigm is simple.


The Parallax Propeller16 is an eight-core µC that does not support interrupts. It is intended to be programmed in a time-triggered manner to poll its inputs. Since it has enough processors (when it was introduced in 2006, it was one of the first multi-processor µCs available to the general public), reserving a core for the sole purpose of polling inputs is considered acceptable.

[Diagram: initialization, then an endless idle loop; events A, B, and C trigger task 1, task 2, and task 3, respectively.]

    void task1(){ /* do something */ }
    void task2();
    void task3();

    void main(){
        register(EVENT_A, task1);
        register(EVENT_B, task2);
        register(EVENT_C, task3);
        while(1){};
    }

Figure 2.13.: Control flow diagram and example code of event-triggered paradigm.

The opposing paradigm is event-triggered [Kop91], where all tasks are only executed when triggered by events. After the initialization, the processor halts or waits in an empty loop, and interrupts call the tasks (Fig. 2.13). This method is suitable for control applications where a small reaction latency to many possible events is required. The events can be assigned priorities to interrupt less important tasks. While the latencies can be better than in a time-triggered system, the formal verification must now include the probability of occurrence and the maximum frequency of events, which is more complex. More realistic than the pure paradigms is a merger, the hybrid paradigm, that includes tasks triggered by events and also an endless loop with some time-triggered tasks (Fig. 2.14). If one event is triggered by time (e. g. a frequent timer interrupt), the limits blur. All tasks called in a loop are defined as time-triggered; all interrupt handlers are called event-triggered. If an OS implements cooperative scheduling, this can be regarded as time-triggered because the tasks decide when to return control to the OS, which then activates the next task.

16Online: http://www.parallax.com/microcontrollers/propeller (visited November 12, 2015)

58 2.3. Real-Time

[Diagram: initialization, then an endless loop running an idle task; events A, B, and C trigger tasks 1, 2, and 3.]

Figure 2.14.: Control flow diagram of hybrid (time- and event-triggered) paradigm.

[Diagram: initialization, then the scheduler is started; on each timeout, the next task among task1, task2, and task3 is selected.]

    void task1(){ /* do something */ }
    void task2();
    void task3();

    void main(){
        init_task(task1);
        init_task(task2);
        init_task(task3);
        start_scheduler();
        /* does not return */
    }

Figure 2.15.: Control flow diagram of scheduling.


A preemptive multi-processing OS assigns time slices to its tasks and is able to revoke control at events (interrupts or the end of a time slice). The selection of tasks can be regarded as time-triggered, but the system surely also has event-triggered capabilities for the timer and external events (Fig. 2.15). The selection of a paradigm must take into account the requirements of latency, predictability, and performance, and also the targeted hardware [Kop93]. Tasks can be classified as periodic (i. e. at regular time intervals, generally implemented time-triggered), aperiodic (i. e. irregular but frequent, both event- and time-triggered), and sporadic (infrequent and random, usually event-triggered) [Liu00, Sect. 3.3].

2.3.2. Scheduling

The sequence of tasks can be fixed or dynamic. A fixed schedule is calculated in advance (at system build time) and stored with the firmware. This allows evaluating whether all tasks can be scheduled according to the real-time requirements (schedulability). Dynamic scheduling implements the algorithm in the OS and calculates at each invocation which task to activate. Since the optimal solution is very compute-intensive, fixed schedules can be more elaborate while dynamic scheduling uses simplified methods. The schedulability analysis of a task-set with dynamic scheduling includes security margins and generally achieves a lower utilization. If tasks are created during run-time, dynamic scheduling must be implemented, and the OS can also verify whether a new task would corrupt the system's timing in order to decide if it can be accepted. In real-time systems, specialized scheduling algorithms are used that differ from those implemented in GPOSs (Section 2.2.1). The most common strategy is priority-based [LL73]. In this scheduling algorithm, every task is assigned an integer-valued priority and the ready task with the highest priority is executed. If multiple tasks meet this criterion, two strategies are common: round-robin executes them alternately in time slices, and first-in, first-out completes the longest-waiting one before activating the next. A task should block itself or lower its priority after a while to let lower-prioritized tasks execute. If a task misbehaves (on an error or intentionally), the whole system can be blocked. This scheduling algorithm is offered by most GPOSs, Real-Time Operating Systems (RTOSs), and embedded OSs. A different method is deadline scheduling, where every task defines a deadline by which it must be completed. Further, it must be specified how long the task intends to work. This information allows calculating the latest possible activation time for each task to complete the work in time. Various algorithms are known (earliest deadline first, latest deadline first, etc.). Other algorithms, in recent times also for multi-processor real-time scheduling, are discussed in the literature, e. g. Pfair [Bar+96; BS97].


2.3.3. Real-Time Operating Systems

Common operating systems (OSs) are optimized for average throughput (Personal Computer, PC; server; and High Performance Computing, HPC) or energy efficiency (mobile systems). The predictability required for real-time systems is supported by specialized Real-Time Operating Systems (RTOSs). Since the range of hardware and applications is very wide, many very different RTOSs are offered. They range from very small embedded OSs that are compiled and linked with the application code like a library up to OSs that are installed on hard disk like a GPOS. Also available are distributed real-time systems that include message-passing on a real-time capable interconnect [KG93; Pop+04]. Because of their differences, it is difficult to transfer an application to another RTOS. This is addressed by API standards like the real-time extensions of the UNIX standard, POSIX.4 [Gal95; Obe00]. Some RTOSs mimic the behavior and API of competing systems. Some notable products are [Lue13]:

FreeRTOS An open source embedded RTOS17. It is very portable and offered for a wide range of processors and µCs. The code is linked statically with the application to a binary firmware.

MicroC/OS-II Embedded RTOS for commercial applications and learning [Lab97].

QNX A commercial RTOS18 that is based on a micro-kernel and used in many products.

VxWorks A commercial RTOS19, also used in many products.

Linux

Being a General-Purpose Operating System, Linux is optimized for average throughput and energy efficiency (Section 2.2.5). Nevertheless, it provides the real-time scheduling classes round-robin and FIFO that implement priority-based scheduling. Only if no task in these classes is ready to execute are the other processes with fair scheduling executed. This allows implementing soft real-time applications, but since interrupts and kernel tasks can interject the real-time tasks at any time, the predictability is low. However, if the latency requirements are in the second to millisecond range and the criticality is low, a plain Linux system can be utilized. Further, these mechanisms can be used to improve best-effort applications on Personal Computers (PCs), e. g. audio, video, and games. Linux also provides real-time capable synchronization objects and high-precision timing functions to better support real-time applications. In fact, Linux is applied in a wide range of real-time systems from embedded [Abb13; SGD09] to high-end servers. It has the advantage of a very flexible configurability and the possibility to modify its source code due to the open source

¹⁷Online: http://www.freertos.org (visited November 12, 2015)
¹⁸Online: http://www.qnx.com/products/rtos/ (visited November 12, 2015)
¹⁹Online: http://www.windriver.com/products/vxworks/ (visited November 12, 2015)

license. Further, it is well documented and many developers have experience with implementing and optimizing it. After early attempts to use Linux for real-time systems [Sri+98], many different approaches have been made to improve its predictability and suitability for real-time applications [Hag05; SL06]. The main types are improvements to Linux itself, turning it more into an RTOS, and the sub-kernel approach of executing a different RTOS below the Linux kernel and virtualizing the Linux system [Bar+07]. These will be presented in detail in Section 3.2.
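How an application requests one of these Linux real-time scheduling classes can be sketched with the POSIX sched_setscheduler() interface. Note that moving a process into SCHED_FIFO requires CAP_SYS_NICE (or a sufficient RLIMIT_RTPRIO), so the call may legitimately fail for unprivileged users; the sketch merely reports this.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

/* Try to put the calling process into the SCHED_FIFO class at the given
 * priority.  Returns 0 on success and -1 on failure (typically EPERM
 * when the process lacks CAP_SYS_NICE or a matching RLIMIT_RTPRIO). */
int try_sched_fifo(int priority)
{
    struct sched_param sp = { .sched_priority = priority };

    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        fprintf(stderr, "sched_setscheduler: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

The valid priority range can be queried with sched_get_priority_min()/sched_get_priority_max(); on Linux, SCHED_FIFO priorities range from 1 to 99.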

2.3.4. Multi-Processor Aspects

Since multi-processor systems offer high compute performance and are energy-efficient, they have long been utilized for real-time systems that need more compute performance [HH89] and are of growing importance in recent times [SMZ11]. The common Symmetric Multi-Processing (SMP) concept can be applied to RTOSs, but their complexity increases. The scheduling problem, already hard on single-processor systems, is fundamentally more complex on multiple processors [Liu00, Chap. 9; ACD06]. The scheduler and load-balancer additionally have to decide on which processor the tasks should be executed. If the system has caches and a NUMA characteristic, those should be included in the decision process [Gua+09]. Similar to the other designs, most SMP RTOSs support partitioning to include Bound Multi-Processing (BMP) properties. Therefore, a wide variety of OS concepts is applied to real-time systems [VT10].

Figure 2.16.: Real-time Asymmetric Multi-Processing (AMP) system based on a hypervisor.

A straight-forward approach to integrate the functionality of multiple embedded systems into a new multi-processor system is to use a hypervisor to integrate multiple OS instances (Fig. 2.16 and Section 2.2.2). This layout is called Asymmetric Multi-Processing (AMP). The instances can be single-processor OSs, each tied to its own CPU like the tasks in a BMP system. Further, a mixture of different systems or of real-time with non-real-time OSs can be realized. The guest OSs either need to cooperate with the partitioning approach (e. g. by using the Hardware Abstraction Layer (HAL)

interfaces instead of direct hardware access, i. e. para-virtualization) or must be executed in an environment that imitates a private system. The communication between the instances can be realized by a virtual interconnect provided by the hypervisor. The advantage of this principle is the possible maintenance of evaluated systems on consolidated hardware. But this principle does not exploit the gains of multi-processor systems. In addition, the hypervisor or HAL layer introduces additional overhead, and it influences the real-time capabilities of its guests, which cannot be better than their foundation.

Figure 2.17.: Operating system based on a sub-kernel.

The sub-kernel approach (Fig. 2.17) is very similar, except that the hypervisor is simultaneously a complete RTOS that directly executes the hard real-time tasks and runs a virtualized GPOS as its lowest prioritized (idle) task. It can easily be extended to multiple CPUs by combining it with partitioning to reserve processors for certain tasks and others for the GPOS. The above methods are common on homogeneous systems with equal processors. In embedded systems and as an extension for commodity systems, heterogeneous processors are also employed [Sch13]. If the processors have different performance, capabilities, or even different architectures (i. e. instruction sets), an Asymmetric Multi-Processing (AMP) structure is more suitable. These systems execute the OS on a subset of the available CPUs, generally the more capable ones [Moy13d]. The other CPUs are assigned tasks in a master/worker concept. Usually, the less capable or more specialized processors execute special code without OS support (bare-metal) [Lal13].

Distributed systems or clusters are multiple systems that do not have access to the main memory of the others. Instead, they communicate with message-passing over an interconnect. This type of system is also used for real-time applications [BW01; Kop97], with its own challenges. An important factor for the extension of existing real-time systems to multiple processors is the scalability of the implemented algorithms [BCA08; Vad+09]. The extension to a few processors is well known from GPOSs, but the more CPUs are involved, the more complex the algorithms must be to avoid unfairness and bad scaling. McKenney presents challenges that a growing number of processors poses to OSs, with a special focus on Linux-rt [McK07].


Synchronization

Synchronization in real-time systems may use similar means as in other OSs, but special challenges have to be considered apart from fairness and general throughput. If a task holds a mutex and is then preempted by a higher prioritized task that waits on that same mutex, the latter may wait forever. This effect is called priority inversion because a task is waiting for another with a lower priority [SRL90]. Common solutions are priority ceiling and priority inheritance. These strategies temporarily increase the effective priority of a task holding a lock to the maximum priority or to that of the waiting task, respectively. Which synchronization objects (e. g. mutex, semaphore) support these strategies depends on the features of the RTOS.

If a lower latency is required than the synchronization objects provided by the OS can offer, user-mode solutions like spinlocks can be used. They wait busily on a shared variable by frequently polling that object using atomic operations. This has the advantage of a very fast reaction time and low overhead, but since spinlocks do not block the waiting task, they should only be used for short periods of time. Further, there is no solution for the priority inversion problem, and the wasted processor cycles could be used more productively by other tasks [SG94, Chap. 6.4.2].

Instead of locking, special algorithms can be used that do not need locks. These lock-free algorithms are efficient and can be used for message-passing protocols or shared resources. Since some of these algorithms include retry loops in case of a collision, the property wait-free was defined for algorithms that always terminate in a constant time [Her93]. If tasks do not share resources, critical sections are avoided. Communication via message-passing is inherently synchronizing because a message is sent after the data is available and a task waiting to receive will only proceed after the data has arrived.
This is also more secure, because distinct processes without shared memory cannot corrupt each other. However, a message not sent will also violate the timing, and received messages must be checked for integrity [Vaj11, Sect. 5.4.8]. Synchronization is very important in concurrent real-time systems. It has a direct effect on the performance and correctness of the system. Errors (both in the OS implementation and in its usage by the application) are hard to detect since races may occur only seldom. The timing of synchronizing tasks is very difficult to analyze and may even introduce timing anomalies. This behavior can be observed in out-of-order architectures due to dynamic reservation of functional units within the processor. Besides cache misses, these timing anomalies can result in execution time jitter [LS99; Rei+06].
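A user-mode spinlock of the kind described above can be sketched with C11 atomics. This is a minimal busy-waiting lock for illustration; a production implementation would at least add a pause hint and back-off in the polling loop, and, as noted above, it offers no protection against priority inversion.

```c
#include <stdatomic.h>

/* Minimal test-and-set spinlock using C11 atomics. */
typedef struct {
    atomic_flag locked;
} spinlock_t;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

static inline void spin_lock(spinlock_t *l)
{
    /* Busy-wait: atomically set the flag until we observe it clear.
     * acquire ordering makes the critical section visible after the lock. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;   /* on x86, a 'pause' hint would go here */
}

static inline void spin_unlock(spinlock_t *l)
{
    /* release ordering publishes the critical section before unlocking */
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Because the waiting task never blocks, the reaction time after spin_unlock() is only the latency of one atomic read-modify-write on the shared cache line.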

2.3.5. Performance of Real-Time Systems

A real-time system either complies with its specific requirements or it does not. Therefore, real-time research generally prefers formal verification over benchmarks emitting a

performance measure. But some concepts and OS implementations are too complex to prove, and for soft real-time systems or to show improvements to OSs, real-time benchmarks are used. Further, the behavior of some hardware architectures is very difficult to predict, partly because of their complexity, but also because of missing documentation about implementation details. Real-time benchmarks are used to quickly compare OS modifications, configuration settings, or different hardware. Further, for soft real-time and very complex systems, it is more cost-effective to measure the performance.

Verification

Formal verification [Els+72] uses mathematical and algorithmic methods to prove whether a program adheres to a specified model. Besides the functional correctness, real-time systems require the analysis of their timing. This must include the levels of basic blocks, tasks, and schedulability [CP01]. Basic blocks are linear parts of code that have single entry and exit points and do not contain any loops or branches. They always execute the same instructions in the same sequence. Their WCET depends on the hardware and, on multi-processor systems, on the contention for shared resources. Tasks are built from basic blocks that are connected by loops and branches into a Control Flow Graph (CFG). The timing depends on the number of loop iterations and the probability of branches. Finally, the schedulability analysis determines whether all tasks can be executed with the selected scheduling algorithm (e. g. preemptive priority scheduling) under inclusion of blocked shared resources.
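The composition of a task's WCET from basic-block times can be illustrated on a toy CFG: an if/else whose branches cost differently, inside a loop with a known iteration bound. The cycle counts are hypothetical, and real analyses solve this path problem more generally (e. g. with integer linear programming over the CFG), but the worst-case path principle is the same.

```c
#include <stdint.h>

/* Toy WCET composition for the CFG
 *     entry -> loop { cond ? then_blk : else_blk } -> exit
 * with a static loop bound.  The worst case takes the more expensive
 * branch in every iteration. */
uint64_t task_wcet(uint64_t entry, uint64_t then_blk, uint64_t else_blk,
                   uint64_t exit_blk, uint64_t loop_bound)
{
    uint64_t branch_max = then_blk > else_blk ? then_blk : else_blk;
    return entry + loop_bound * branch_max + exit_blk;
}
```

This already shows why loop bounds must be known statically: without loop_bound, no finite WCET exists for the task.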

Execution Time of Tasks

For soft real-time applications, the average and variance of the execution time may suffice, but for hard real-time systems, the Worst-Case Execution Time (WCET) must be regarded [PB00]. In some settings, the Best-Case Execution Time (BCET) is also used. Generally, the WCET can be estimated by formal methods or measured. In hard real-time systems where a formal verification is required, the WCET must also be derived by formal methods. Processors for embedded hard real-time systems (Programmable Logic Controller, PLC) often use a RISC architecture that is optimized for constant execution time instead of average throughput. For example, the Atmel²⁰ AVR series of Micro-Controllers [Küh98] executes most instructions in one cycle, and the documentation gives an exact cycle count for every instruction [AVR10]. This makes it possible to determine the execution time of linear (non-branched) code exactly by counting instructions and identifying the number of loop iterations. Varying code paths can be evaluated with established methods.
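For such a single-cycle RISC core, the execution time of a short loop follows directly from the per-instruction cycle counts. The sketch below performs this arithmetic for a hypothetical two-instruction delay loop; the cycle counts are given in the style of the AVR documentation (decrement: 1 cycle; conditional branch: 2 cycles if taken, 1 if not), but the concrete loop is an assumption for illustration.

```c
#include <stdint.h>

/* Cycle count for a hypothetical AVR-style delay loop
 *     loop: subi r24, 1    ; 1 cycle
 *           brne loop      ; 2 cycles if taken, 1 cycle if not taken
 * Each of the first n-1 iterations costs 1+2 cycles (branch taken),
 * the final fall-through iteration costs 1+1 cycles. */
uint32_t delay_loop_cycles(uint32_t n)
{
    if (n == 0)
        return 0;
    return (n - 1) * (1 + 2) + (1 + 1);
}
```

On an out-of-order x86 core, no such closed formula exists, which is exactly the contrast drawn in the next paragraph.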

²⁰Online: http://www.atmel.com (visited November 12, 2015)


In contrast, modern processors for commodity systems are optimized for average throughput, posing challenges for a predictable execution time [PC07]. The execution time of a task or function varies the more, the more complex the architecture is [Pet02]. For the architectures applied in this type of system (e. g. x86 and ARM), the estimation thus becomes very diffuse. The gap between quickly increasing processor speed and memory throughput widens [WM95]. To mitigate memory transfers slowing down the instruction throughput, optimizations like multiple levels of caches, long execution pipelines, Instruction-Level Parallelism (ILP; super-scalar out-of-order execution, branch prediction), vector instructions, multi-core processors, and more were introduced [HP07, Chap. 2]. Therefore, the history of the pipeline filling and branch predictors, the use of private and shared caches, and the contention for shared resources and I/O [SBF05] all contribute to the jitter of a basic block. The dependencies are hardly documented and difficult to measure [And+97; Fog13a]. Therefore, the formal derivation is very complex, as it must consider the most pessimistic blocking, cache misses, and contention [Pao+13]. But this would yield an overly pessimistic value that would allow only few tasks per processor, resulting in a massive under-utilization. Therefore, the aim is a tight estimation of the WCET. A realistic result can be obtained by an estimation on the basis of a cycle-accurate processor simulator [YL06] (e. g. PTLSim [You07], MPTLsim [Zen+09]) or by measuring the execution time under load on the real system. Since so many factors are unknown in this setting, they must be accounted for by a security margin.

Benchmarks

Benchmarking real-time applications, OSs, and hardware is difficult and easily leads to misdirections [DM04]. To obtain the WCET experimentally, the worst case must be observed. The probability of provoking events causing the worst case (or a delay close to it) increases if the test system is loaded with more work than realistically present in the final system [Fog13b, Sect. 16.2]. It is advisable to warm up the tested routine by executing some iterations before recording a statistic. This avoids including cache effects that will not be present in the running application. To create a more than realistic load, the following measures can be taken [Fog13b, Sect. 16.2]:

• Executing the whole application with all dynamic link libraries.
• If using the network or a server, loading them too.
• For files and databases, using large data-sets.
• If the application will be installed on various systems, testing on slow hardware with insufficient RAM and with many background processes.
• Using different brands of CPUs, different graphics cards, and other relevant peripheral devices.
• Running multiple instances and, if testing on a multi-processor system (or SMT), binding them to the same physical core or socket.


Gaps (CPU cycles)
Log mode           Minimum   Average    Maximum
Only statistics         72     93.50    2303304
Histogram               96    178.08     236488

Table 2.6.: Hourglass overhead.

• If using shared memory for communication and synchronization, testing scenarios with all processes on the same CPU, all on the same NUMA node, and all on different NUMA nodes.
• Provoking swapping by allocating more memory than physically available.
• Further types of load are: CPU bound, memory transfer, I/O, OS services, and errors or exceptions.

A universal load often used is building the Linux kernel during the benchmark. This is not comparable across systems, but can be used to rate single changes to a system. In the following, some benchmarks are presented that are used in the literature [Hal+00; Dan12]. The Realfeel benchmark is a comprehensive suite of functions that tries to rate the quality of a real-time system [Web02]. Abeni et al. evaluate the real-time performance of the now outdated Linux 2.4 kernel [Abe+02], but the paper contains guidance on loading the system during real-time benchmarks. Gonçalves and Melo present rules and best practices for testing real-time applications with Linux-rt [GM08] that also apply to general real-time programming. Rostedt and Hart present some simple benchmarks [RH07] used for the improvement of the Real-Time Preemption Patch (RT-Patch). Williams presents how to analyze the scheduler [Wil02]. This includes stressing loads and some test programs for specific details.
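One of the load types above, memory transfer, can be generated with a simple buffer-sweeping loop: a buffer larger than the last-level cache forces main-memory traffic on every sweep. The buffer size, stride, and sweep count below are arbitrary illustrative choices.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Sweep a buffer of the given size repeatedly to generate memory-transfer
 * load.  Touches one byte per (assumed 64-byte) cache line and returns a
 * checksum so the compiler cannot elide the accesses. */
uint64_t memory_load(size_t bytes, int sweeps)
{
    uint8_t *buf = malloc(bytes);
    uint64_t sum = 0;

    if (!buf)
        return 0;
    memset(buf, 0xA5, bytes);
    for (int s = 0; s < sweeps; s++)
        for (size_t i = 0; i < bytes; i += 64)  /* one access per cache line */
            sum += buf[i]++;                    /* read-modify-write traffic */
    free(buf);
    return sum;
}
```

Run with bytes well above the LLC size (e. g. tens of MiB) on a concurrent CPU to stress the shared memory controller while the benchmark executes.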

Hourglass

An important benchmark used repeatedly in this work is the Hourglass benchmark [Reg02]. It was introduced to analyze the time slices assigned by the operating system to multiple processes. In this work, the concept was used to analyze gaps in the execution time caused mainly by interrupts. The example code in Listing 2.1 presents the basic principle. The minimal loop is short enough to detect a variance in the range of a few CPU cycles around the mean execution time. To measure more realistic scenarios, it can be extended with a synthetic workload of variable size, e. g. the writing of a memory buffer or an I/O access. The size of such a buffer can be selected to fit into the different cache levels, so that working in the L1$ or even cache misses to the main memory can be ensured, similar to the Membench (Sect. 2.1.8).

void hourglass(uint64_t duration, uint64_t threshold)
{
    uint64_t t1, t2, t_end, diff;       /* timestamps */

    rdtsc(&t1);                         /* start time */
    t_end = t1 + duration;              /* calculate end time */

    while (t1 < t_end) {                /* loop until end time */
        t2 = t1;
        rdtsc(&t1);
        diff = t1 - t2;
        if (diff > threshold) {
            store_timestamp(t1, diff);
        }
        /* Note: additional task workload may be added here */
    }
}

Listing 2.1: Hourglass benchmark routine (simplified).

The function store_timestamp() must be implemented as efficiently as possible to avoid a measurable impact that would corrupt the results. The simplest implementation only records the minimum and maximum values (Listing 2.2). More meaningful is the collection of a histogram to understand the distribution of gaps.

uint64_t min = U64_MAX, max = 0, sum = 0, cnt = 0;  /* globals */
uint64_t t_min, t_max;              /* timestamps of min/max events */

void store_timestamp(uint64_t offset, uint64_t gap)
{
    if (gap < min) { min = gap; t_min = offset; }
    if (gap > max) { max = gap; t_max = offset; }
    sum += gap;
    cnt++;                          /* avg = sum/cnt */
}

Listing 2.2: Minimal implementation of the timestamp recording function.

The impact of logging histogram data can be seen in Table 2.6. With statistics-only recording, the average is very close to the minimum, indicating that the outliers up to the maximum are very rare. The absolute value of the maximum in this test series depends on the OS scheduling. The histogram logging nearly doubles the average loop execution time. On recent x86 systems, the loop is very short (20 to 100 cycles) when executed in the L1$ and depends on the CPU micro-architecture. Figure 2.18 displays the distribution of results on the desktop system poodoo (system specification in Appendix A.1.1) during normal work. The number of events is in logarithmic scale. The highest bar is more than two orders of magnitude higher than any other.

Figure 2.18.: Hourglass histogram.

With a threshold well above the mean execution time, only interruptions of the normal execution are detected and recorded. This limits the number of events so that they can be recorded in a list and their length plotted over timestamps, allowing a more precise evaluation of the distribution and clustering of events and helping to find the source of disturbing events. Figures 2.19 and 2.20 illustrate this: with a threshold of 1000 cycles, the first 1000 gaps are nearly all in the range of 4000 cycles length. A burst between 0.4 and 0.5 seconds includes many gaps of up to 200 000 cycles. The threshold of 10 000 hides the large number of short gaps and records more of the longer ones.

Figure 2.19.: Scatter plot (threshold 1000).

Figure 2.20.: Scatter plot (threshold 10 000).

The task switch depends on the OS, but its magnitude is also influenced by the hardware overhead of interrupt handling. For this test series, the original Hourglass benchmark was used. Two processes are executed on the same CPU, and if they are the only ready tasks, they run alternately. Both Hourglass routines detect the gaps in their execution time when an interrupt or the other process executes. The difference between one task being interrupted and the other continuing is the task switch latency. The minimum task switch time on the test system poodoo with Linux 3.11 was measured at 5600 cycles (2.11 µs). The minimum time for an empty interrupt handler on the Core 2 is 970 cycles (Table 2.5), and the majority of detected interrupts on a real system was detected at approx. 4000 cycles. From this follows that the task switch causes slightly more overhead than the typical timer interrupt. The median of approx. 1000 recorded task switches in 10 seconds was 7800 cycles, and 95 % were below 13 000 cycles (5 µs). The maximum was 200 000 cycles, but all values above the 95 % quantile can be biased by long interrupts and other tasks executing in between.
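The Hourglass principle can also be reproduced portably, without the rdtsc() helper of Listing 2.1, by using clock_gettime(CLOCK_MONOTONIC) for nanosecond timestamps. The following sketch combines the measurement loop with the minimal statistics recording of Listing 2.2; timestamps are in nanoseconds instead of cycles, so the resolution is coarser than with the TSC.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Portable hourglass sketch: measure the gap between consecutive
 * timestamps for duration_ns and keep min/max/count statistics. */
struct gap_stats { uint64_t min, max, cnt; };

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

struct gap_stats hourglass_ns(uint64_t duration_ns)
{
    struct gap_stats st = { UINT64_MAX, 0, 0 };
    uint64_t t1 = now_ns(), t_end = t1 + duration_ns;

    while (t1 < t_end) {            /* loop until end time */
        uint64_t t2 = t1;
        t1 = now_ns();
        uint64_t diff = t1 - t2;    /* gap between consecutive timestamps */
        if (diff < st.min) st.min = diff;
        if (diff > st.max) st.max = diff;
        st.cnt++;
    }
    return st;
}
```

On an idle core, st.max stays small; interrupts or a competing process on the same CPU show up as outliers, exactly as in the scatter plots above.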

3. Related Work

This chapter presents related work with the same motivation, similar goals, or similar approaches, and highlights where these publications helped to advance this project and how it distinguishes itself from others. The field of research includes High Performance Computing (HPC) and real-time research, which are combined to improve multi-processor real-time systems. Further, supervised students' theses are presented that had an influence on this work.

3.1. Fundamental Real-Time Research

Before approximately the year 2000, multi-processor systems were mostly reserved for HPC applications in compute centers of science and research facilities. Apart from some special applications requiring real-time as well as high compute performance, which were implemented on special multi-processor systems, specifically designed Systems-on-a-Chip (SoCs), and elaborate software, the concepts of multiple operating system (OS) instances and real-time-capable scheduling were invented for single-processor systems.

3.1.1. Isolation or Partitioning on Single-Processor Systems

Lipari et al. use the term isolation in the context of a single-processor scheduling algorithm to emphasize the protection of an application from other, misbehaving applications [LCB00]. However, they neither present an implementation nor consider the hardware effects caused by other applications. Bollella and Jeffay present a way to execute multiple OS instances on a single-processor system with real-time constraints [BJ95] (Asymmetric Multi-Processing, AMP). They use virtualization to reuse existing General-Purpose Operating Systems (GPOSs) and Real-Time Operating Systems (RTOSs) without the need to modify them. The RTOS is granted time slice reservations, i. e. temporal partitioning or Time-Division Multiple Access (TDMA). Kuo et al. present an open environment for parallel and distributed real-time systems [KLW00]. This appears to be a special RTOS that uses time slices for applications, and each application schedules its own tasks. Time-based resource reservations solve the real-time composability problem, but waste computing resources in the same way as the spatial partitioning presented here. Another approach of

temporal resource partitioning on single-processor systems, by Mok et al., aims to simplify the global verification by creating partitions that can be evaluated separately [MFC01]. A similar project is also presented by Deng et al., with a virtual processor isolating the hard real-time application from other applications [Den+99]. The textbook of Buttazzo promotes a more scientific approach to the implementation of real-time systems [But05]. The author presents categorized scheduling algorithms for single-processor systems and introduces the practical problems of synchronizing access to shared resources as well as available OSs. Regehr and Stankovic observe the problem of predictable execution on loaded single-processor systems and look ahead to upcoming multi-processor systems where a separation becomes feasible [RS01].

3.1.2. Foundations of Real-Time Computing on Multi-Processor Systems

The first systems with multiple processors were connected single-processor systems without a common memory space. These distributed systems were either special embedded systems (e. g. sensor networks) [Kop97] or HPC clusters used for soft real-time (e. g. streaming). Multi-processors then arrived in Personal Computer (PC) type systems (Simultaneous Multi-Threading (SMT), e. g. Intel HyperThreading [Cha02] in 2002; multi-core processors such as the Intel Pentium D¹ in 2005) and became main-stream (since the Intel Core i7 in 2008, all x86 processors – except the smallest scale Atom series – have at least two physical cores), and Micro-Controller vendors began to promote multi-core processors for embedded systems (e. g. ARM-based by Freescale², Texas Instruments³, etc. around 2007). The following publications present a first overview of real-time approaches on multi-processor systems. These works use neither Linux nor the x86 architecture.

The Spring kernel is claimed to be a new paradigm for RTOSs [SR89]. This and the Hawk kernel [HH89] are two early examples of RTOSs for embedded multi-processor systems. They are intended for Guidance and Control systems and Command and Control applications with hard real-time requirements. These applications have a lower frequency than digital controllers (Programmable Logic Controllers) but higher compute demand. The special processors and OSs are of lower relevance today, but their concepts are still applied.

¹Online: http://www.intel.de/content/www/de/de/processors/pentium/pentium-d-processor-brief.html (visited November 12, 2015)
²e. g. http://media.freescale.com/investor-relations/press-release-archive/2007/16-10-2007-d.aspx (visited November 12, 2015)
³e. g. http://newscenter.ti.com/index.php?s=32851&item=123723 (visited November 12, 2015)


Sindhwani et al. present RTOS acceleration techniques for embedded systems with multi-core processors [Sin+04]. They include some interesting ideas, e. g. a co-processor for interrupt handling (which is related to this work) and a hardware RTOS. The paper was presented at the Real-Time Linux Workshop 2004 in Singapore, but does not cover Linux in particular. They target special Micro-Controllers (µCs) and soft-core processors implemented in a Field-Programmable Gate Array (FPGA), missing the opportunity of today's systems with eight and more x86 processors. Scheler et al. present their OS extension for real-time interrupt handling using a heterogeneous co-processor [Sch+09]. The architecture is a TriCore µC, but the partitioning approach is similar. Dubrulle and Ohayon present a special micro-kernel for embedded real-time systems that can serve different demands such as hard real-time and streaming [DO13]. Although this project supports both hard real-time and compute-intensive applications, it is not as versatile as a Linux system.

The advantage of non-x86 architectures like many µCs and the Scalable Processor Architecture (SPARC) [Lic+08] is their execution with a predictable timing. This is further examined by the MERASA [Ung+10] and parMERASA [Ung+13] projects that emphasize the need for powerful multi-processor systems providing hard real-time capabilities. A new architecture of small and simple cores, a cache-less memory hierarchy, and a statically switched network-on-chip is promoted [MMU11] to improve the Worst-Case Execution Time (WCET) estimation in multi-processor systems with caches [MU12]. There are some multi-processor capable RTOSs, for example RTEMS⁴, the Real-Time Executive for Multiprocessor Systems. This open source licensed project provides some POSIX interfaces and BSD sockets but cannot be used to extend existing Linux applications with hard real-time tasks.

3.1.3. Scheduling Theory

Goossens and Richard present an overview of real-time scheduling problems and algorithms [GR05], both for single and multi-processor systems. Anderson et al. provide basics on real-time scheduling theory on multi-processor platforms [ACD06]. Carpenter et al. present an extended overview of scheduling algorithms for hard real-time tasks on multi-processor systems. The scheduling problem for multiple processors is still hard to solve efficiently. They describe two approaches as traditional: partitioning and global scheduling. In global scheduling, all tasks are organized in a single queue and a global scheduler distributes them to the Central Processing Units (CPUs). With partitioning, the tasks are assigned to a fixed CPU. The authors further present a hybrid approach where jobs are assigned fixed, but tasks are allowed to migrate [Car+04].

⁴Online: http://www.rtems.org (visited November 12, 2015)


The formal verification and some scheduling algorithms require an estimation of the execution time. The challenge of a tight estimation in the presence of concurrent resource users is thoroughly analyzed considering interrupts [Abe01; BLA11] and device drivers [Lew+07]. They all conclude that the prediction is very complex and can hardly be handled without deep modifications to the OSs. Abeni and Buttazzo debate how to integrate soft real-time and compute-intensive workloads into hard real-time scheduling algorithms like Earliest Deadline First (EDF) [AB98]. This indicates a need for the combination of those applications on multi-processor systems. Starke and Oliveira analyze a heterogeneous scheduling approach that is part preemptive and part non-preemptive [SO12]. They use a partitioned system to avoid migration overhead. This encourages a spatial partitioning, given that enough processors are available.

3.2. Real-Time with a Linux Kernel

This section gives an overview of the different approaches to building real-time systems with Linux. The number of publications concerning soft and hard real-time improvements for Linux indicates its wide application in those systems and a strong interest in using it for even more applications [Fin01; WL04; Duv09]. Scordino and Lipari group those approaches into interrupt abstraction on the one hand and kernel preemption and real-time improvements on the other [SL06]. Here, modifications to the Linux kernel are presented first. They intend to improve the preemptibility and reduce the worst-case reaction time (latency). Some separate efforts led to the Real-Time Preemption Patch (RT-Patch), a joint project that collects and improves Linux kernel modifications. Further, a variety of approaches is presented that execute Linux under the supervision of or concurrently to an RTOS micro-kernel. The underlying sub-kernel or abstraction layer has prioritized access to the interrupts and executes the Linux system only when the timing allows it. On single-processor systems, both OS instances are executed with different priority, while on multi-processor systems, a spatial partitioning can be applied.

Predecessors of the RT-Patch.

The Linux kernel was first published in 1991 and quickly gained developers and users. In the late 90s, Linux was mature enough for a wide range of applications and could be installed on many different computer architectures. Naturally, it also attracted developers of real-time systems. However, the standard kernel was not well suited for predictable execution, but since the source code is available and the open source license allows modifications, several approaches were started. Among them is "KURT" (Kansas University Real-Time), a collection of modifications to provide

74 3.2. Real-Time with a Linux Kernel higher resolution timing and a priority based scheduler [Sri+98]. The authors describe their system as firm real-time, a slightly weaker definition than hard real-time. Goel et al. have presented Time-Sensitive Linux [Goe+02], a modification of Linux for soft real-time applications and multimedia [Kra+09]. Other independent Linux modifications are a Linux kernel patch to improve predictability and reduce latency [Yan+05], a scheduler extension supporting reservations and deadline scheduling [KRI10], and another scheduler replacement supporting partitioning and reservations [Sou+11].

3.2.1. The Real-Time Preemption Patch

More systematic approaches to increase the predictability of the Linux kernel were the Low Latency Patch of Andrew Morton and MontaVista’s Preemptible Kernel Patch [SL06]. They analyzed the code for long regions of uninterruptible execution due to deactivated interrupts or held locks and modified those sections either with voluntary interruption points or by using finer-grained locks and preemptible spin-locks. Today, many of the fundamental modifications are integrated in the mainline Kernel [McK05b]. Since real-time support generally comes at the cost of lower general throughput, the grade of preemptibility can be configured. The remaining modifications not yet included in the mainline Kernel were collected by the preempt_rt project5, where many well-reputed kernel developers collaborate to improve the real-time capability provided by the Real-Time Preemption Patch (RT-Patch) and to include the modifications into the mainline Kernel [DW05; Cor10]. The major modifications target the predictability by lowering the worst-case latency and add infrastructure required by real-time applications. To protect global data structures from concurrent access, the earlier Kernel deactivated interrupts on single-processor systems and used a global lock (“Big Kernel Lock”) to synchronize multiple processors. When a higher-prioritized task was due while a lower one executed such a critical region, the former had to wait until the blocking one left that region and released the global lock. The duration of such blocking situations is obviously not predictable and scales badly with the number of processors. Therefore, blocked sections were made interruptible to decrease the maximum time other tasks are blocked [McK05a]. Since an interrupt handler blocks the handling of new interrupts, the interrupt system was split into minimal interrupt handlers and deferred work queues that process the pending requests when appropriate [GDA08].
Further, synchronization objects such as spinlocks, mutexes, and semaphores were extended with priority-awareness to better support real-time applications and to avoid priority inversion [RH07]. The infrastructure is further supported by high-resolution timers [GN06] to exploit the hardware capabilities of modern systems, providing a resolution of microseconds and below. An important part is the enabling of Read-Copy-Update (RCU) (Section 2.2.5 on page 51) for the fully preemptible kernel. It was long discussed whether and how RCU’s determination of grace periods can be made aware of preemption [MS05], but it was finally solved [McK+06] and further optimized [McK09], also for real-time systems.

5Online: https://rt.wiki.kernel.org (visited November 12, 2015)

Scheduling

After several attempts to add new scheduling algorithms to the Linux kernel, LITMUSRT was created as a test bed to empirically evaluate new multi-processor scheduling algorithms in a Linux system [Cal+06]. It is used in several publications to validate the models of their scheduling theory. In a practical evaluation, LITMUSRT was compared to the RT-Patch and its overhead was rated acceptably low, but it inherits the sensitivity of Linux to Input/Output (I/O) load [CB13]. Kato and Yamasaki present a Linux kernel module to exchange the scheduler with their own implementation [KY08]. The motivation for a kernel module is that it requires the least modifications to the kernel source code and thus allows easy porting to various versions. Recently, a deadline scheduling class was integrated into the mainline Linux 3.14 kernel [Cor14a]. However, all scheduling work is still bound by the task-switching overhead, especially in multi-processor systems where load-balancing additionally causes cache misses and thus a larger jitter. Use-cases that require a jitter in the microsecond range, as targeted by this work, cannot afford task switches, at least on the CPU where such a high-frequency task is executed.

Tickless Kernel

In time-sensitive as well as in HPC applications, the frequent timer interrupt was repeatedly identified as unwanted overhead because its compute time is missing from the application and because cache effects further reduce the cycles spent productively. The Linux configuration allows selecting the frequency of the timer interrupt between 100 and 1000 Hz to adapt the OS to specific work-loads. This work was extended by the Dynticks modification, initially developed as part of the RT-Patch [Cor07]. It reprograms the timer interrupt depending on the situation and can, in the best case, rigorously reduce the number of such interrupts. The kernel configuration option CONFIG_NO_HZ turns off the timer interrupt whenever a CPU is idle. This allows a deeper sleeping mode without frequent wake-ups, but increases the latency to reactivate idle/sleeping CPUs. Thus, it is applicable to desktop systems but not suited for scheduling real-time systems. This approach was continued in the no_hz development of Linux 3.9 [Cor13a] and extended in Linux 3.10 [Cor13e]. Currently, a new full tickless mode is implemented that further reduces the timer interrupts if only a single task is active on a given CPU. In this case, no scheduling is required, but one interrupt per second is still needed for time-keeping. There are RCU modifications [Cor12] required to support the full tickless mode. The constraint of the remaining timer interrupt should be removed in later kernel versions [Cor13c; McK13a]. The kernel developers expect that HPC and real-time applications will benefit from these improvements since it is common for those applications to dedicate CPUs to single tasks [Cor13e]. Thus, the Linux kernel supports the reduction of the timer interrupts up to their complete elimination. This is very similar to the complete isolation of a CPU, but since other interrupts, especially the Inter-Processor Interrupts (IPIs), are not blocked, this only improves the average performance but does not grant hard real-time. However, this approach could be further investigated as an alternative to the kernel modification presented in Chapter 5.
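For illustration, the tickless mode described above is typically activated at boot time via kernel command-line parameters. The following fragment is a hedged sketch: the parameter names (nohz_full, rcu_nocbs, isolcpus) are documented for kernels from about Linux 3.10 on, but the CPU numbers are arbitrary example values for a hypothetical 4-CPU system.

```
# GRUB kernel command line (example: keep CPU 3 as free of ticks as possible)
#   nohz_full=3   run CPU 3 tickless while only a single task is active on it
#   rcu_nocbs=3   offload RCU callback processing from CPU 3 to other CPUs
#   isolcpus=3    exclude CPU 3 from the scheduler's general load-balancing
linux /vmlinuz root=/dev/sda1 nohz_full=3 rcu_nocbs=3 isolcpus=3
```

Even with such a configuration, IPIs and other interrupts can still reach the tick-free CPU, which is why this mode improves the average case but, as argued above, does not by itself grant hard real-time.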

Evaluation

The RT-Patch is still actively developed and its inclusion into the mainline Kernel is promoted [Edg13b]. The real-time capability of Linux with the RT-Patch (Linux-rt) was evaluated many times [AEM07; BCB09; GL11], but always experimentally. Since the Linux kernel is very complex, a formal verification is probably not feasible. The most thorough practical evaluation is presented by Emde, consisting of a large range of systems tested under heavy load over long periods [Emd10]. This coverage does not replace a formal verification, but demonstrates that Linux-rt is applicable to a range of demanding real-time applications. However, the Linux kernel can only be considered for soft real-time systems. In the scope of this project, the RT-Patch can be applied to the Linux kernel to improve the timing of applications running in the system partition.

3.2.2. Interrupt Abstraction

A different approach to increase the real-time suitability of a system running Linux is to abstract interrupts. In other words, Linux is executed under the supervision of a hypervisor (Section 2.2.2) or another OS (sub-kernel), effectively creating an Asymmetric Multi-Processing (AMP) system. The RTOS is executed directly on the hardware and runs Linux as the task with the lowest priority. This results in all hard real-time tasks being preferred over all Linux tasks. The interrupts are caught by the RTOS and passed through to the GPOS when no time-critical work is pending. Since even a modified Linux remains a complex system with more possible code-paths than can be covered, these approaches allow using Linux applications together with hard real-time tasks on the same system. The first extension of Linux for hard real-time applications was RTLinux6 [Yod99]. It was designed to require as few modifications to the Linux kernel as possible.

6Not to be confused with the RT-Patch, which extends the name of the kernel to Linux-rt; some authors use “RT-Linux” when speaking about Linux with the RT-Patch [GL11].


Essentially, it replaces the interrupt handlers and the code to disable interrupts [BY97]. A thin layer below the Linux kernel controls the real-time tasks and executes Linux as a para-virtualized guest OS when no time-critical work is pending. Devices are either controlled by the real-time layer, if they are used by time-critical tasks, or assigned to Linux drivers [Bar97]. This allows using a wide range of available hardware that is supported by Linux and requires implementing drivers only for time-critical I/O devices. The product was offered by FSMLabs, later acquired by Wind River7, and is discontinued today. The tight control by a patent-holding company was probably the trigger of several imitations with similar goals but sometimes slightly different details. One of these similar approaches is RTAI8, created in the style of RTLinux. But instead of a dedicated Hardware Abstraction Layer (HAL) integrated in the Linux kernel, RTAI is implemented as a kernel module that requires only a small patch [MDP00] to facilitate its porting to new kernel versions. Real-time tasks are also implemented as kernel modules that can be loaded or changed at run-time. RTAI promotes the use of low-cost high-performance hardware (Commercial off-the-Shelf, COTS) [DM03a] and allows implementing distributed systems [DM03b]. It was also used to improve hard real-time applications on multi-processor systems with an isolation approach executing a GPOS and an RTOS on dedicated CPUs [SBF04]. The latest version of RTAI was 4.0, published9 on December 6, 2013, supporting recent Linux 3.x kernels. RTAI was initially based on RTLinux for its HAL but moved to ADEOS as a free alternative. The Adaptive Domain Environment for Operating Systems [Yag01] is a so-called nano-kernel (i. e. smaller than a micro-kernel) intended as a real-time capable hypervisor for the construction of AMP systems. In the case of Linux, the kernel is moved from privilege level 0 to 1.
This causes instructions with direct hardware access to trigger CPU exceptions that are handled by ADEOS. Its virtualization implementation is very specific to the x86 architecture and less dependent on Linux. The motivation is to ease OS development by enabling debugging under control of the hypervisor and to execute multiple OS instances. With RTAI, all interrupts are handled with priority in the ADEOS layer and forwarded to the guest OS after hard real-time tasks were executed. The project Xenomai10 was started separately from RTAI, merged in 2003, and separated again in 2005 [Ger04]. Xenomai has its own HAL and RTOS kernel called nucleus. The motivation is to allow an easy migration of RTOS applications developed for other, commercial RTOSs and to integrate them with Linux [Ger01]. This is realized with various skins that translate the Application Programming Interface (API) of other systems to Xenomai’s own API. Xenomai is actively developed, with the latest release 2.6.3 on October 5, 2013 supporting up to Linux 3.10. RTAI aims for the lowest technically possible latency while Xenomai also aims for clean, extensible interfaces. RTAI appears to be more of a research project while Xenomai is developed and embraced by the industry. Another project [HHW98] uses the L4 micro-kernel as hypervisor for a guest Linux system to execute hard real-time applications. The hypervisor uses para-virtualization; therefore, Linux is modified to use L4 and L4Env interfaces instead of direct hardware access. Mehnert et al. compare the approaches of two separate OS instances with their own address space and a single, real-time-enabled OS instance [MHH02]. In both cases, the integrated Linux system allows reusing existing software without influencing the timing of the tasks executed in the sub-kernel. The real-time capabilities of standard Linux, the RT-Patch, and various sub-kernel solutions are compared practically by many authors. Brown and Martin conclude that Xenomai provides a better guarantee of timeliness but requires more effort to implement [BM10]. The sub-kernel approaches are in fact able to provide hard real-time guarantees because the timing depends only on the hypervisor or hardware abstraction layer, which is much smaller than Linux and thus makes a formal verification feasible. However, an available verification is not advertised by any of these projects. Sub-kernels provide a solution for single-processor systems or where real-time and general-purpose tasks need to share CPUs (e. g. for load-balancing). The isolated task approach is simpler, requiring fewer changes to the Linux kernel, and can be used for systems with enough processors or if real-time tasks utilize a full processor on their own.

7Online: http://www.windriver.com (visited November 12, 2015)
8Online: https://www.rtai.org (visited November 12, 2015)
9Online: https://www.rtai.org/?Archive_announcements&id=32 (visited November 12, 2015)
10Online: http://www.xenomai.org (visited November 12, 2015)

Suites Based on Interrupt Abstraction Projects

Several research projects are based on the above approaches. The Hyades project aims for distributed systems that offer both hard real-time for data acquisition and intensive computing capabilities with soft real-time [Cha+04]. After evaluating shielding approaches, RTLinux and RTAI, this project followed the approach to implement a new RTOS, DIC, on top of ADEOS. The same group developed ARTiS, a partitioning approach with a new Linux scheduler providing only soft real-time [MM01]. That Linux extension creates two partitions for real-time and general-purpose tasks. In the real-time partition, only preemptible code is allowed, to confine the maximum latency [Mar+04]. Tasks requiring system calls are automatically migrated to the general-purpose partition [Pie+04]. The evaluation of ARTiS is done with benchmarks under load [Pie+06], which does not qualify it for hard real-time. The last published version11 is from 2005, based on Linux 2.6.12.

11Online: http://www.lifl.fr/west/artis/ (visited November 12, 2015)


Ragot et al. present a Linux extension for high performance and real-time computing [Rag+04]. It is related to ARTiS, covering the same field of applications and listing some examples. Betti et al. present the Linux kernel modification ASMP-Linux [Bet+08]. This is an approach of shielded CPUs [BR03] to create domains of real-time CPUs and system CPUs. But this work goes further than the shielded CPUs of Brosky and Rotolo and also than ARTiS of Marquet et al. [Mar+04]. A simple interface /proc/asmp allows the configuration. Programming is done with POSIX.4 real-time extensions [Gal95]. A special feature is the complete suppression of interrupts (including the Local Advanced Programmable Interrupt Controller (local APIC) timer) to guarantee hard real-time. This voids preemptive scheduling (similar to the approach presented in this work) but keeps cooperative scheduling. The paper includes extensive benchmarks and evaluation. The current state12 is a patch for Linux 2.6.19.1 from 2008. Guiggiani et al. presented the Realtime Suite, a collection of Linux tools to create real-time capable I/O applications based on RTAI [Gui+11]. The suite of software, tools and documentation is intended to compete with commercial real-time products. The AMP hypervisor SPUMONE shares the motivation and concepts with RTAI and Xenomai but is implemented for the SH4a architecture (related to MIPS) that is used in many embedded systems developed in Japan [Mit+11]. Twin-Linux is another AMP approach for Linux. It does not aim for real-time but for the concurrent execution of multiple Linux instances [Jos+10], thus it belongs to the partitioning concepts. Instead of a hypervisor, the guest systems are started with different parameters by the boot manager GRUB. This requires no modifications to the Linux kernel but is limited by assigning all Peripheral Component Interconnect (PCI) devices to one system and all PCI Express (PCIe) devices to the other system.
All these projects recognize the advantage of combining a full-featured Linux system with hard real-time tasks. Some of them include partitioning, but none uses bare-metal execution on isolated processors to reach the lowest possible latency and jitter.

3.3. Partitioning and Isolation Approaches

The approaches listed above realize real-time computing on multi-processor systems with a common OS but provide either only soft real-time (in the case of kernel modifications) or require multiple OS instances. The latter is realized either with deep changes to the GPOS or with an overly complex system architecture. The following shows how these systems and new approaches are adapted to the availability of an increasing number of processors per system. An early mention of isolation originates from High Performance Computing, where partitioning is used to improve the performance of CPU, memory, and I/O and to provide a sort of quality-of-service in heavily loaded systems [VGR98]. The first description of isolated tasks on dedicated processors is the shielded processor concept by Brosky and Rotolo. They assume a multi-processor system and reserve some of the CPUs for real-time applications. Depending on the requirements, multiple tasks are scheduled together on a CPU or, in the most demanding case, a single task owns its processor. The Linux kernel is modified for the affinity of tasks, new scheduling algorithms, and some other improvements similar to those provided today by the RT-Patch. They describe the implementation of real-time applications with memory locking, priority-based scheduling, and interrupt and system call avoidance, but provide few details about their kernel modifications [BR03]. In comparison to RTLinux and RTAI, the shielded task concept has the advantage of a simpler overall architecture; it allows reusing Linux drivers and protocols and using high-level languages for the implementation of applications [Bro04]. The concept is still available as part of the commercial RedHawk Linux distribution13. These publications are not detailed enough to re-implement them. In a publication discussing how RCU can be modified for the fully preemptible kernel required by the RT-Patch, McKenney and Sarma also list some ideas for partitioning and isolation [MS05]. They consider isolation a workaround as long as the RT-Patch is not able to provide hard real-time. In their concept, a migration approach is presented that groups CPUs into real-time and general-purpose partitions [MS05, Sect. 3]. If a real-time task executes a system call, it should be moved to a non-real-time processor. This implies executing only (hard) real-time tasks on dedicated CPUs and avoiding system calls on these processors.

12Online: http://www.sprg.uniroma2.it/asmplinux/ (visited November 12, 2015)
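For illustration, the user-space part of such a shielded setup – pinning a process to one CPU, locking it to a real-time priority, and then avoiding system calls – can be sketched with standard Linux interfaces. The following Python sketch is a hypothetical illustration (the function name and values are my own; SCHED_FIFO normally requires root privileges, and this does not remove kernel threads or interrupts from the chosen CPU):

```python
import os

def shield_current_process(cpu: int, fifo_priority: int = 50):
    """Pin the calling process to one CPU and try to raise its priority.

    This only approximates a shielded CPU: the kernel may still run its
    own threads and interrupt handlers on that CPU.
    """
    # Restrict the scheduler to a single CPU (Linux-specific call).
    os.sched_setaffinity(0, {cpu})

    # Switch to the real-time FIFO policy if we are privileged to do so.
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(fifo_priority))
    except PermissionError:
        pass  # unprivileged: keep the default time-sharing policy

    # Return the effective affinity for verification.
    return os.sched_getaffinity(0)

if __name__ == "__main__":
    print(shield_current_process(0))  # e.g. {0}
```

A real shielded-processor product additionally redirects interrupts away from the reserved CPU and locks the task's memory, which is beyond what these unprivileged user-space calls can achieve.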
Other conditions are described that are similar to the isolation approach presented here, but they were not realized because the idea is only drawn up as a concept for as long as the RT-Patch does not provide sufficient predictability. The proposed implementation seems to be a kernel modification, but since the original ambition is a modification of RCU as part of the mainline Kernel, it can be expected that the isolation approach would be pursued in a portable way that allows its inclusion in the mainline Kernel. Checconi et al. describe an extension of the Linux scheduler with reservation properties. This would not change the user interface and allows various partitioning schemes [Che+09], but it remains a Linux kernel modification still subject to unpredictable latencies. In a conference talk summary, CPU isolation is cited as a future goal of the RT-Patch [Edg11]. It appears to be seen not in relation to real-time applications but as a topic that the developers of the RT-Patch should be able to realize with their experience. The state of this project is further explained in a presentation of McKenney given at various Linux conferences in 2013 and 201414. The motivation to remove the timer interrupts is derived from energy conservation and High Performance Computing and extended to real-time applications that would also benefit from reduced execution interruption [McK14a]. The presentation introduces the vast number of modifications that led to the tickless kernel presented above. A project to execute multiple OS instances on top of a hypervisor is frequently promoted by Xenomai developer Jan Kiszka. Citing actual use-cases that need more functionality than an RTOS and harder real-time than a Kernel modification like the RT-Patch can provide, virtualization with the existing projects KVM and QEMU is evaluated to build an AMP system [Kis09]. The hard real-time guest system uses only a reduced set of resources while the remaining resources stay under control of the GPOS for soft real-time tasks [Kis13b]. These considerations led to the development of Jailhouse, a hypervisor explicitly advertised as a base for AMP systems executing GPOS and RTOS instances on dedicated CPUs [Kis13a]. The project was publicly announced in November 2013 [Cor13i] and made available as open source software15. The system is started and initialized by a normal Linux kernel. In the running system, the Jailhouse hypervisor is placed below the Kernel by loading a module. It integrates advantages from Xenomai, RTAI, and earlier discontinued approaches like RTLinux into a modern, lightweight hypervisor based on the newest Linux kernel technology. Compared to this work, Jailhouse shares the motivation and goals but uses the different approach of multiple OS instances (AMP).

13Online: http://real-time.ccur.com/home/products/redhawk-linux (visited November 12, 2015)

Partitioning on other operating systems

A precondition of this work is to use Linux as the base system. However, similar concepts of partitioning were realized for other OSs, mainly Windows. Since the source code is not freely available, modifications are not possible. Still, dual-Kernel approaches (AMP) with hypervisor or HAL solutions [MČ08] were developed based on published driver interfaces and CPU properties or with special development contracts. Most of the following products are or were commercially available. Therefore, their inventors did not publish every detail about their implementation. Similar to other real-time systems, the approaches were initially intended for single-processor systems [TVU98], but were generally extended for multi-processor systems to allow spatial partitioning. In this context, the various suppliers often use the classification as hard real-time in a weaker definition of better than soft or just lower measured latency [Obe+99]. The product INtime for Windows16 from tenAsys is a hypervisor approach that runs the INtime RTOS beside the Windows Kernel. It supports single- and multi-processor systems, the latter through partitioning. Shared memory and shared virtual I/Os are available to create distributed applications on the base of the INtime SDK. The product Kithara RealTime Suite17 extends Windows with a driver for real-time tasks supporting single- and multi-processor systems. The Kithara RTS kernel runs inside Windows and real-time tasks are executed in kernel mode. The API supports interaction with standard Windows programs. The product RTX Real-Time Platform for Windows18 [RTX12] from IntervalZero is also a partitioned hypervisor approach where a real-time kernel runs beside Windows on dedicated CPUs. The RTX architecture [RTX09] is based on a Windows driver executing in kernel context. It can share a CPU with Windows or run on dedicated CPUs. The RTSS environment shares a single RTOS instance with preemptive multi-tasking. Applications using RTX are described with a relatively long cycle time of 2 to 50 ms [XX11; LQ12]. In a comparison of RTX and RT-Linux, RTX is rated better [MČ08], but the applied benchmark and the discussion reveal a weak understanding of real-time principles by the authors. The product RTPRO-PC19 from ETAS realizes a hypervisor approach to run Windows in parallel to an RTOS on a multi-processor system. The hardware abstraction is closely tied to selected processors and main boards and only available for a few certified systems. Its use is demonstrated in a rapid prototyping application [NW12]. Hyperkernel from Imagination Systems20 was a similar product for single-processor systems, but it is no longer available. Because of the complexity of the Linux kernel, modifications of it are not suited for the strong definition of hard real-time, which requires a formal verification of all possible code paths. All presented products for Windows and the Linux extension Jailhouse are AMP systems with a spatial partitioning.

14Online: http://www2.rdrop.com/~paulmck/scalability/ (visited November 12, 2015)
15Online: https://github.com/siemens/jailhouse (visited November 12, 2015)
16Online: http://www.tenasys.com/index.php/overview-ifw (visited November 12, 2015)
They realize a fully isolated CPU that executes only hard real-time tasks while the other CPU(s) execute a GPOS with full comfort. The realization can either be a hypervisor that starts multiple guest OS instances or a HAL that is implanted into a running GPOS, taking over some resources for its real-time control. To the best of my knowledge, the concept of isolating a usual UNIX process from all interruptions to the level of bare-metal execution was not publicly presented by other authors. The alternate concept of hot-unplugging a CPU and rebooting it without notification of the OS – outlined in Section 5.4.2 on page 124 but not implemented – was also not presented before.

17Online: http://www.kithara.de/de/produkte/realtime-suite (visited November 12, 2015)
18Online: http://www.directinsight.co.uk/products/venturcom/soft-control-architecture.html (visited November 12, 2015)
19Online: http://www.etas.com/de/products/rtpro_pc.php (visited November 12, 2015)
20Online: http://www.nematron.com/products/legacy/hyperkernel.html (visited November 12, 2015)


3.4. Previous Publications

The concept of isolated bare-metal tasks was first presented in the state of an early prototype with a preliminary practical evaluation [WLB12]. This publication included a prediction of OS conflicts that led to the analysis presented in Chapter 5. The concept was also presented as a poster [Was+12] in conjunction with an HPC cooperation. In 2013, the concept was brought to production quality in a confidential industry cooperation. The result is successfully deployed in automotive Hardware-in-the-Loop (HiL) simulators. The experience gained with low-level benchmarking and real-time interferences supported the research on Inter-Process Communication (IPC) for many-core systems [RW14].

3.5. Supporting Students’ Theses

The following students helped advance the project and wrote their Bachelor’s or Master’s theses under my supervision (listed in chronological order). In a more distant project on many-core systems, Pablo Reble researched synchronization mechanisms that scale well to hundreds of processors [Reb09]. He was mentored by Jan Kiszka, who develops Xenomai and Jailhouse. The experience from his work, together with frequent discussions about his current research, provided a basic experience for the implementation of the application support library for real-time capable shared memory communication presented in Section 6.1. Maximilian Marx ported the Real-Time Operating System FreeRTOS to a Linux process using UNIX signals as timer interrupt [Mar10]. This Bachelor’s thesis paved the way for the porting of an RTOS to isolated tasks using user-mode interrupts as shown in Section 6.4.3. For her Diploma thesis, Lena Oden evaluated the application of the real-time methods of isolated tasks to High Performance Computing [Ode10]. However, in the scope of our institute’s possibilities at that time, with clusters of up to 16 nodes and Symmetric Multi-Processing (SMP) systems with up to 8 processors, it was difficult to prove positive effects. Still, a promising trend for larger systems could be predicted. Based on an early prototype of isolated tasks, Raphael Bruns implemented a first real-world application example [Bru10]. His Diploma thesis describes how to realize a software-defined protocol using digital I/O. Gerhard Wallraf analyzed the x86 System Management Mode in his Diploma thesis [Wal10a]. Although System Management Interrupts are hardly detectable by system software, he managed to successively cut out all other interruptions until the influence on hard real-time applications could be evaluated.


Sebastian Meßingfeld designed a real-time sensor based on the shared memory Inter-Process Communication for his Bachelor’s thesis [Meß11]. Josef Raschen thoroughly analyzed the possibilities of hard real-time inter-task communication as described in Section 6.1 and implemented a portable version [Ras11]21. His Diploma thesis describes portable wait- and lock-free algorithms for a wide variety of communication objects for x86 and Advanced RISC Machines (ARM) systems. Robert Uhl analyzed the x86 interrupt handling and how Linux uses it [Uhl11]. His Bachelor’s thesis supported the modification of the Linux source code presented in Chapter 6 and the analysis of remaining interrupt effects described in Chapter 7. Jens Dankert researched the implementation of real-time applications [Dan12]. This Master’s thesis assisted the application support layer (Chapter 6) and provided experience in related work (Chapter 8). Stephan Alt implemented a complex real-world application example based on isolated tasks [Alt13]. This Bachelor’s thesis is a further demonstration of the applicability and versatility of the concept. It is shortly described in Chapter 8. During the project, the student assistants Christian Spoo and Sonja Kolen helped with implementing, debugging, system administration, benchmarking, and discussing current topics.

21This thesis was awarded in the Graduation Challenge 2012 of the German Fachausschuss Echtzeitsysteme (Technical Committee for Real-Time Systems) http://www.real-time.de/preise/grad2012.html (visited November 12, 2015)


4. Isolation of Bare-metal Tasks

Even using a Real-Time Operating System (RTOS), scheduling hard real-time tasks is subject to task-switch and kernel latencies that are difficult to predict. Instead of installing a specialized RTOS whose inner structure favors predictability over performance, a different approach is presented to realize hard real-time applications using only the General-Purpose Operating System (GPOS) Linux. Since current systems have multiple processors, a spatial partitioning is selected over temporal partitioning. The Bound Multi-Processing (BMP) (Section 2.3.4) approach is taken one step further: a Central Processing Unit (CPU) is reserved for a single real-time task while other processes and the OS’s internal maintenance are confined to the other CPUs. The isolated task is executed bare-metal, without operating system (OS) support. This concept can freely be adapted to the application’s requirements of timeliness and compute power. The extreme case would be a single CPU for the OS and all others each executing a single real-time task. After servicing an interrupt handler, the Linux kernel checks internal work queues and kernel threads. This can add an unpredictable overhead to the execution time of each interrupt handler. Besides the frequency of interrupts, which is difficult to account for, their execution time increases the inaccuracy of the Worst-Case Execution Time (WCET), which must be estimated with a sufficient safety margin. The same holds for task switches and system calls because they are likewise handled in kernel mode. This Chapter presents the implementation of bare-metal tasks for x86 systems running Linux. The concept allows using an arbitrary Linux distribution (possibly with the Real-Time Preemption Patch) with all updates and all available libraries. This eases the porting of complex applications with multiple threads that may have high computing demands as well as hard real-time requirements with latencies below 10 µs.
Starting with an off-the-shelf Linux system, standard tools are used to modify a user-mode process to run uninterrupted on a dedicated CPU. This Chapter demonstrates that the goal of a user-mode process fully controlling its CPU can be reached without modifications to the Linux kernel. However, the system stability is weakened. This will be addressed in Chapter 5.


Figure 4.1.: Concept of a strictly isolated task P5 on a dedicated processor (processes P0–P4 run under the GPOS on CPU0–CPU2; P5 runs alone on CPU3).

4.1. General Concept

Having a multi-processor system running a single OS instance, a standard process will be isolated on a dedicated CPU to run its code uninterrupted like on a bare-metal single-processor system (Fig. 4.1). To retain full control over that processor, the isolated task will not be allowed to use services of the operating system because they may invoke further kernel functions whose execution time is not predictable. This allows a formal verification of the hard real-time tasks while the other CPUs can fully utilize the power and convenience of a GPOS. Generally, the partitioning of multiple processors supports an arbitrary number of isolated tasks provided that at least one CPU remains available for the operating system and its services. The operating system is restricted to a smaller number of CPUs than before, similar to the CPU hot-plugging feature where some CPUs can be switched off. Apart from the reduced number of available CPUs, everything is executed as usual. The normal processes are load-balanced between the remaining processors and may use the operating system's support for soft real-time scheduling. The isolated process runs bare-metal on its processor, which means it is not under the control of the operating system (Fig. 4.1). Its timing can be verified like on a single-processor system. However, the interdiction of using operating system services like Inter-Process Communication (IPC) and device drivers locks in the isolated task, which reduces its applicability for useful workloads. This restriction can be bypassed by using shared memory communication and direct hardware access, as covered extensively in Chapter 6. Here, the uninterrupted operation is analyzed in practice using the Hourglass benchmark (Section 2.3.5) with small synthetic workloads. This serves as a first validation and to identify remaining interruptions. The final verification is formally based on the processor documentation. Brandenburg et al.
examine the influence of interrupts on the execution time. They identify three choices to limit interrupt-related delays: splitting the handlers into short interrupt handlers and deferring the bottom halves into work queues, using only polling to avoid Device Interrupts (DIs) (which does not avoid the other interrupt classes), and masking all interrupts during real-time task execution. Their conclusion is that interrupts can not be avoided completely and therefore must be included in the schedulability analysis


[BLA09]. This isolation concept promotes a different solution by using time-triggered bare-metal tasks on dedicated CPUs and handling interrupts on the other CPUs of a multi-processor system. Synchronous control transfers can be voluntary (e. g. jumps) or involuntary (e. g. exceptions) because they are detected while executing an instruction. Other interrupts are asynchronously triggered by devices (Device Interrupts: Interrupt Requests, IRQs, and Timer Interrupts), other CPUs (Inter-Processor Interrupts, IPIs), or the firmware (Platform Interrupts). Synchronous events are the responsibility of the programmer and can be confined. The duration of a branch or function call is static except for the variation caused by caches and shared resources. The time for executing a called function can be analyzed for known functions. Unknown functions must be replaced with analyzable code. System calls and software interrupts must be avoided during time-sensitive tasks because those are functions of the OS that can not be analyzed with all side-effects. The goal is to deactivate asynchronous interrupts whose length and frequency are out of the user's control. Full isolation employs a time-triggered, uninterrupted execution model which is easier to manage and analyze. In the remainder of this Chapter, the successive steps are described to set up a completely isolated process on the x86 architecture running a standard Linux OS. Initially, the configuration of the hardware (firmware setup) and operating system is optimized for real-time applications. These methods are reasonable for every real-time system. Then follows the partitioning of the CPUs into a system share and dedicated CPUs for isolation. All Device Interrupts must be routed to the system partition. The tasks that will be isolated are moved to their CPUs and must be configured and initialized to avoid page faults and other synchronous interrupts and exceptions.
Finally, the full isolation is reached by blocking all remaining interrupts (mainly Timer and Inter-Processor Interrupts) and adhering to the coding standard of not using system calls.

4.2. Setup and Partitioning

After being powered up, a single CPU (the Bootstrap Processor, BSP) starts executing the firmware initialization code from a read-only memory. This program is called BIOS (Basic Input/Output System) or UEFI (Unified Extensible Firmware Interface). The hardware initialization starts a boot manager that selects and loads an OS from mass storage and hands over control to the boot code of the OS. In multi-processor systems, this is all executed with only a single CPU active. It is the OS that starts and initializes the other processors (called Application Processors, APs). Many features of the CPU, the memory, and the connected devices can be configured in the firmware setup program. These settings may change how the OS sees the hardware. This way, behavior can be affected that can not be changed by other means. Other adjustments can be given to the OS by the boot manager or later be modified in the running system. The partitioning of processes and hardware interrupts intervenes in the OS's data structures. Linux provides standardized functions for these essential methods, which are also available on most other OSs. The settings and methods presented in this Section are also advisable for all other real-time systems.

4.2.1. Hardware Configuration

The firmware can be configured by entering a setup program shortly after powering on a system (often by pressing the keys DEL or F1 at the appropriate time). All following methods improve the general predictability of the execution time of code. This setup is the same as for RTOSs on the x86 architecture [Zha+05] (e. g. RTX [RTX09], Xenomai1).

Simultaneous Multi-Threading (SMT)

This feature splits a physical CPU into multiple logical processors (Sect. 2.1.9) that share some resources like the First Level Cache (L1$) and the Instruction Decoder. The feature is called Hyper-Threading on Intel processors and Cores of a Compute Unit on AMD processors. Hardware threads sharing resources impact each other if one of the shared units is a bottleneck for both. It is expected that the impact on hard real-time tasks spoils the predictability up to the point where hard real-time can not be guaranteed. If the system supports Simultaneous Multi-Threading (SMT), it should be deactivated in the firmware configuration. Alternatively, the Linux kernel can be instructed to boot only the number of physical CPUs via the boot parameter maxcpus=N. It will select only distinct physical CPUs in this case.

Virtualization Support

Since 2006, both Intel (VT-x) and AMD (AMD-V) processors provide hardware-based support for virtualization, the Virtual Machine Extensions (VMX) [Intel13c; AMD12b]. Further, memory and Input/Output (I/O) virtualization may be supported by the chip set (e. g. the IOMMU, I/O Memory Management Unit). The technology is based on an additional layer of indirection for memory and I/O addresses. Current Linux versions activate these features even for non-virtualized systems. The support should be deactivated in the firmware configuration to reduce the I/O latency.

1 Online: http://xenomai.org/2014/06/configuring-for-x86-based-dual-kernels/ (visited November 12, 2015)


Energy Saving and Sleep Modes

Recent processors can change their frequency in a wide range to reduce power and heat (Intel Speedstep [Intel13c, Chap. 14], AMD Performance Control [AMD12b, Chap. 17]). The decision when to reduce the frequency or how to balance performance and power consumption can either be delegated to the firmware (e. g. BIOS) or be controlled by the OS. For real-time systems, it is crucial to keep the control in the OS because the transition between operating modes induces extra latency and the instructions executed per time depend on the CPU speed. Some settings in this regard can be configured in the firmware while others are accessible in the running system. The firmware configuration generally supports a setting for maximum performance that should be selected for real-time systems. If the Linux system is configured with power management, the governor2 should be set to “performance” to force the OS to keep all CPUs at the maximum frequency [Emd11]. Additionally, keeping processors at their maximum frequency avoids the latency of powering them up when needed.

System-Management Interrupt

Being a Platform Interrupt (PI), this interrupt is initialized by the firmware, which also provides the handler. The x86 architecture includes the System Management Mode (SMM) mechanism to hide the code from users by making the memory region inaccessible. When a System Management Interrupt (SMI) is triggered, all CPUs suspend their work while one of them handles the interrupt. The SMI may be used for various tasks, e. g. power management, thermal management, OS interaction, and emulation of legacy hardware. It depends on the firmware and its settings which functions are implemented with SMIs. Some systems, especially laptops, have time-triggered SMIs that occur with periods between half a second (2 Hz) and 15 seconds [Wal10a]. Other triggers for SMIs are the system temperature (in case the OS fails to activate a fan), the failure of a fan, and user interaction such as display brightness on laptops or case intrusion detection on servers. The x86 architecture promises to be backwards compatible, thus still running 1980s DOS. To support such OSs on systems where only USB keyboards are present, the firmware uses SMIs to provide a USB driver and to translate input events to a PS/2-compatible interface. The SMI handling time is very long. This is due to the complex mode switch including the synchronization of all CPUs and the access to the hidden SMM memory region that is switched in the chip set. Analyses have revealed SMIs with durations usually above 100 000 CPU cycles (50 µs) and up to 250 ms [Wal10a]. This would spoil every hard real-time application.

2 /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor and respectively for other CPUs.


Only a few settings with regard to SMIs can be changed in the firmware configuration. Often, the legacy USB keyboard support can be deactivated. This may result in the boot manager no longer being able to use the keyboard. Once the system is booted, the OS's USB drivers take over. There is a configuration register in Intel chip sets to deactivate the SMI, but this register can be locked by the firmware after booting to irrevocably disable its modification. The detection of SMIs can only be done by excluding all other interruptions, e. g. by fully isolating a CPU and watching for remaining gaps in the execution of a short hourglass loop (Section 2.3.5) [Wal10a]. If a given system shows frequent time-triggered SMIs, it is not suited for real-time applications [Zha+05].

4.2.2. Linux Settings

The Linux kernel can be configured in various ways, most notably by build-time configuration, boot parameters, and run-time settings. The build-time configuration allows omitting features so that they will not impose any overhead in the running system. But changing those settings requires recompiling the Kernel. Boot parameters allow configuring deeply invasive settings by rebooting. Finally, a large number of run-time settings can be accessed via the /sys and /proc virtual file-system hierarchies. Those are usually limited to the administrator's access and allow the fine-tuning of the system's behavior. Some of the settings presented below can be modified by several of these methods.

NMI Watchdog

A watchdog timer [Wil06, Sect. 15.17] frequently checks the correct operation and may warn the user or restart the system. Linux provides a Softlockup Detector3 that watches all CPUs for blockings. It issues a Non-Maskable Inter-Processor Interrupt (NM-IPI) to unblock the CPU or at least notify the user. As the projected isolation of a process on a CPU will preclude the OS and its interrupts from executing on that CPU, a watchdog timer would interfere and should be deactivated. This can be configured either at build-time or at run-time. The latter is done by writing the value 0 into the file /proc/sys/kernel/nmi_watchdog. If this file does not exist, the watchdog was deactivated at compile-time. Since Linux 3.12, the time-out for detecting blocked tasks can be configured in the file /proc/sys/kernel/hung_task_timeout_secs. Writing 0 to this file avoids warnings in the system log and Non-Maskable Interrupts (NMIs) on isolated processors.

3 Linux file Documentation/lockup-watchdogs.txt (Linux 3.10)


Figure 4.2.: Example setup for multiple partitions: a system partition (cores CS,0 … CS,N), a soft real-time partition (CR,0 … CR,O), and isolated partitions (CI,0 … CI,P).

Time Keeping

A Linux system uses Timer Interrupts on each CPU to keep its notion of wall-clock time. Different modes using either the Local Advanced Programmable Interrupt Controller (local APIC) timers or the High-Precision Event Timer (HPET) are available. There is a routine checking the cadence of all CPUs that may interfere if it detects that the OS is not running on an isolated CPU. This will be addressed in Section 5.2 on page 114. The boot parameters tsc=stable and clocksource=tsc instruct the Kernel to use the local APIC method and not to wrongly deem it unreliable. This avoids warnings in the system log and possibly NMIs to the isolated processors.

4.2.3. Partitioning

Like in a Bound Multi-Processing (BMP) system, assigning tasks and interrupts to dedicated CPUs increases their predictability. The extreme variant of executing only a single process or thread on a CPU prevents all task-switching overhead. The disadvantage is a weak utilization of the compute resources of the whole system. Having more than 10 processors in recent x86 systems allows partitioning the system into some CPUs dedicated to the lowest latency while the remaining CPUs build a partition that is utilized as usual at the cost of only the generally available soft real-time latencies. The System Partition (Fig. 4.2) comprises one or multiple processors and executes the basic system services (called the remaining system in the context of this work). An optional soft real-time partition can include additional CPUs for prioritized work. The isolated partitions always include a single CPU and will execute a single process or thread. The partitioning can be adapted to the needs but must remain constant for the run-time of the application. The remaining system lacks the CPUs dedicated to the hard real-time tasks as if they were switched off by the hotplugging system. The current hotplugging implementation in Linux does not support removing CPU #0 (the Bootstrap Processor, BSP) because this CPU executes some internal services that can not be moved to other processors so far. Likewise, CPU #0 should not be selected for isolation. Since this CPU always remains functional, it can be selected to handle all interrupts. If a certain interrupt is crucial for the application (e. g. the network interrupt for low-latency transfers), it can be bound to another dedicated CPU.


Changing the interrupt and process affinity requires administrator privileges. All the following methods can either be executed from the shell (e. g. by an initialization script) or from the application itself by writing to the appropriate pseudo files. Instead of running with administrator privileges, the application can also be configured to use Linux Capabilities [HM08]. This reduces the security risk of running an application with maximum rights.

Interrupt Affinity

On x86 multi-processor systems, hardware interrupts are distributed by the Input/Output Advanced Programmable Interrupt Controller (I/O APIC) (Section 2.1.9 on page 32) to the CPUs, where a local APIC accepts the notifications and manages the interruption of the current instruction flow. The I/O APIC has a routing table that allows assigning where each IRQ should be handled. By default, the interrupts are sent to an arbitrary target negotiated in hardware. In Linux, the system service irqbalance reprograms the routing table frequently to distribute the load caused by interrupt handling. Before changing the IRQ affinity to match the selected partitioning, the service irqbalance must be stopped so that it does not disturb the intended setting. Linux provides the interface /proc/irq to set the affinity of each Interrupt Request (IRQ) by writing a bit mask into the file smp_affinity. For devices that are initialized later, smp_default_affinity can be configured to supply this bit mask as the default. If a different OS does not offer an interface to the I/O APIC routing tables, the interrupt affinity can also be configured directly in the hardware. The configuration registers are memory-mapped and can be accessed by a privileged process. Even with the handling of Device Interrupts removed from the dedicated hard real-time CPUs, Inter-Processor Interrupts (IPIs) can still be delivered. There is no hardware means of routing them because they can be addressed directly to a particular CPU. This will be handled in Section 4.4.

System Partitioning

Processes and threads (or generalized “tasks”) are a concept of the operating system and managed in its data structures. Thus, configuring on which CPUs a task is allowed to run requires using an interface of the OS. Restricting the running processes to the CPUs of the system partition and moving processes or threads dedicated for isolation to their CPUs can be accomplished in Linux systems by means of CPU Affinity or CPU-Sets. The CPU Affinity is not a POSIX standard but should be portable to other operating systems. CPU-Sets are a Linux feature that supports assigning processes and threads to partitions. These partitions can be spread over CPUs and Non-Uniform Memory-Access (NUMA) nodes. Both mechanisms are

applicable to realize isolated tasks. However, CPU-Sets were used in subsequent implementations of the concept because of their convenience. CPU-Sets are part of the control-group feature [LINUX:cpusets.txt]4. They are created as a hierarchy of groups. The interface is a virtual filesystem mounted at /cpusets/ or /sys/fs/cgroup/cpuset/5. New sets can be generated by creating a directory. The configuration is handled by writing to the virtual files within the set directories. Each CPU-Set can be assigned CPUs by writing to cpuset.cpus. Then, processes and threads can be moved to a CPU-Set by writing their TIDs (Task-IDs) to the file cpuset.procs. For a process with multiple threads, the threads can be assigned to different CPU-Sets. Child processes and threads created by tasks in a CPU-Set remain in their parent's set. The list of tasks in a set can be checked by reading cpuset.procs. For the isolation concept, multiple CPU-Sets are created, one for each partition. The sets for isolated tasks each contain only a single CPU to run on. At least one system partition containing CPU #0 must be created. All running processes must be moved to that CPU-Set. An alternative implementation of process partitioning can be realized with the CPU Affinity interface. A process can change its own affinity to a bit mask defining the CPUs it is allowed to run on. Threads of a process can be configured to different affinity masks using pthread_setaffinity_np(). With appropriate privileges, the affinity can also be changed for other tasks. According to its documentation6, a new process or thread inherits the affinity of its parent. Similar to using CPU-Sets, all existing tasks must be configured to run on the CPUs allocated for their partition. The CPUs where isolated tasks will be executed must be freed from other tasks.
This interface is portable to other operating systems supporting POSIX or similar functions (Windows uses SetProcessAffinityMask() for this purpose). Both interfaces can be accessed from the shell to set up the system in a script or from within the application. The shell interface for CPU-Sets is the pseudo directory hierarchy described above. This can also be used from arbitrary programming languages by creating directories and writing to files. The CPU Affinity can be changed from the command line with the help of the program taskset. In Linux systems, many kernel threads (Section 2.2.5 on page 50) exist besides the user-mode processes. They are scheduled by the Kernel like other tasks but execute in kernel-mode. They are used for the internal maintenance of data structures and for asynchronous I/O. Most of them are blocked for long periods and only activated when needed. These kernel threads can be divided into two types: the so-called “pinned” kernel threads are tied to a CPU and generally exist multiple times, once

4 Documentation//cpusets.txt (Linux 3.12)
5 The actual mount-point can be found in /proc/mounts by searching for “cpuset”
6 man-page sched_setaffinity()

for each CPU. The other kernel threads can be executed on an arbitrary CPU like normal tasks. The distinction between a user-mode process and a kernel thread can be made with the help of /proc/<PID>/cmdline. If this file is empty, the PID belongs to a kernel thread. Otherwise, it contains the command line that was used to load that process. The command line tools ps and top indicate kernel threads by displaying their name in square brackets (e. g. [kthreadd]). Using sched_getaffinity() or by reading /proc/<PID>/status, the affinity bit mask can be identified. Kernel threads with only a single bit set are “pinned” (e. g. [kworker/0]). All non-pinned kernel threads should be moved together with the user-mode tasks. Pinned kernel threads should not have their affinity changed and can not be moved to other CPU-Sets. But they are blocked on their dedicated CPU until they are activated by other code on that particular CPU. Since isolated tasks are not allowed to use operating system interfaces, they will not activate pinned kernel threads on their CPU. It is assumed that, when the isolated task takes full control of its CPU, pinned kernel threads are not harmed by excluding them from execution. Processes can change their CPU Affinity as well as their assignment to a CPU-Set. In High Performance Computing (HPC), it is a common technique to use the CPU affinity to distribute the work over the available CPUs and to keep the code close to its memory (especially in NUMA systems). But generally, programs will not change their affinity or CPU-Set. It is a good convention to let the administrator decide whether a program should be balanced by the OS or fixed to dedicated CPUs. For the isolation concept, it is crucial that no program interferes with the partitioning. In all experiments, it was never observed that a program escaped from its partition.

4.3. Process Setup

The isolated task can be a process or a thread of the main application. Processes have the advantage of memory protection, thus a reduced risk of faults injected by other parts of the application via shared memory. Using processes, they can still interact through shared memory segments, but other parts of the memory can not be accessed by a different execution context. Programming with threads is easier but requires additional care for synchronization and protection from concurrent access to shared data structures. An isolated task can be a stand-alone application or part of a larger one. The latter case is reasonable for assembling different tasks that interact with each other. In a larger application, a manager task can be implemented to start and handle isolated tasks (Fig. 4.3). A collection of isolated hard real-time tasks, an optional manager task, and other (soft or non real-time) processes and threads are regarded as a real-time application. By carefully assigning CPUs to applications, this concept can be extended to multiple applications running concurrently on the same system.


Figure 4.3.: Example application with a manager task controlling multiple soft real-time threads in its partition and multiple isolated tasks in dedicated partitions (system partition on CPU #0, application partition on CPUs #1–#2, isolated tasks on CPUs #3–#5).

In the following and without loss of generality, only a single real-time application is executed on a system. Since the isolated tasks are not allowed to use OS interfaces (system calls), all allocations and initializations must be done before the application enters its time-critical part. The isolated tasks are therefore separated into a setup phase and the real-time execution. The methods presented here are generally applicable to most real-time applications7 [BYT12; Fin01; Duv09].

4.3.1. Moving to the Dedicated CPU

The first action after creating a new thread or process dedicated for isolation should be moving it to its CPU. The assignment of a task to its CPU is done by the same means as the partitioning explained above (Section 4.2.3). This can be configured by the task itself or from the outside by the manager task that started the isolated task (Fig. 4.3). Experiments using the CPU-Sets feature revealed problems on current Linux versions when moving threads between CPU-Sets while another task is completely isolated (i. e. in its real-time execution phase). Therefore, all isolated tasks must be started and set up before one of them enters the isolated state.

4.3.2. Synchronization

Inter-Process Communication (IPC) will be covered in Section 6.1 further below. However, for the setup of isolated tasks, a means of synchronization is required. This method must work without relying on the OS after its initialization. Shared memory

7 Online: https://rt.wiki.kernel.org/index.php/HOWTO:_Build_an_RT-application (visited November 12, 2015)

is suited as it is a hardware feature which works by referencing the same physical memory page from the virtual address spaces of different processes. Threads of the same process naturally share global variables. For different processes, the UNIX standards System V and POSIX provide their variants of shared memory, which are both supported by Linux [Ker10, Chap. 48 and 54]. Shared memory provides the lowest latency between two processors. One task can wait actively (busy waiting, polling) on a shared flag until the other one changes (e. g. increments) that variable. On multi-processor systems, the time between changing and detecting that modification is caused by the store-back from the First Level Cache to the cache level shared with the other processor. This can be the Last-Level Cache or main memory (Section 2.1.9). Although functionally irrelevant, placing communicating tasks on the same multi-core processor or on distinct ones has an impact on the synchronization latency. Generally, the access of multiple tasks to a shared resource must be coordinated. If only a single writer changes a flag, multiple readers can wait on that flag without further synchronization. On the x86 architecture, single read and write accesses to aligned integer variables up to the bit-size of the architecture (32 or 64 bits) are atomic. More complex synchronization primitives (Mutex, Semaphore) can be realized in user-mode using atomic operations (Section 2.1.9).

4.3.3. Memory Management

Newly allocated memory regions are usually only registered in the page-tables but not backed with physical memory at the time the allocation function returns. Linux implements a first-touch approach: the pages for virtual addresses are prepared but marked non-present because they are not yet assigned to physical page frames. On the first access, the CPU exception page-fault is triggered. This minor page fault will choose a physical page frame (freeing it first if occupied) and connect it to the virtual address. To avoid the page-fault handler being executed during the hard real-time execution, all dynamic memory regions must be allocated and first-touched at initialization time. The first assignment to all virtual addresses is sometimes called “pre-faulting”. It suffices to assign a default value to one memory address in each virtual page of 4 KiB. The stack must be handled in a similar way because it grows by function calls and can extend to new pages not used before. For the stack, it suffices to call a function containing a large local array. But care must be taken that the compiler optimization does not remove this function or its data initialization. If the total amount of allocated memory exceeds the physically available memory, the OS can swap pages to background storage (demand paging, Section 2.1.3). In that case, the page-table entries remain in place but are marked non-present. On the next access, a major page fault will load the displaced data from background storage into a page frame and mark the page as present. The accessing process can

continue without noticing the swapping mechanism. But for real-time applications, the latency of swapping back pages causes an unpredictably long blocking of the execution. Demand paging can be avoided for certain regions of memory by marking them as locked. The POSIX standard offers the function mlock() to lock a region and mlockall() to lock all of a process's pages into memory [Ker10, Sect. 50.2]. For isolated tasks, all dynamically allocated memory must be initialized and all accessed memory regions (dynamically allocated, stack, heap, code) must be locked into memory. If the memory requirement is larger, the use of huge pages (Section 2.1.7) could be considered. This would reduce the Memory Management Unit (MMU) overhead and Translation Look-aside Buffer (TLB) usage [ALL12]. But since the hard real-time task should restrict itself to the cache (as shown later), it is not advisable to use large amounts of memory.

4.3.4. Implementation

As already mentioned in the previous Sections, tasks in isolation are not allowed to use system calls. To enable a formal verification of all running code, calls to libraries should also be confined as much as possible. In particular, all utilized functions must be checked to not include system calls or unpredictable execution times. The basic structure of an isolated task is an initialization phase before the activation of the full isolation, followed by an endless loop repeating its work until notified to exit. In the initialization, the task is moved to its CPU and initializes its memory (locking and pre-faulting). The transition to the hard real-time execution can be managed by a task of the application using shared flags. During real-time execution, the memory footprint should be kept as small as possible to keep most of the data in cache levels near the processor. Shared variables should be used only when required because they must be transferred via the Last-Level Cache or main memory between CPUs. Read and written shared flags should be placed in different cache lines to avoid the bouncing of cache lines between different First Level Caches. Shared variables used by different CPUs must even be placed in dedicated cache lines to avoid false sharing. The real-time jobs are called repeatedly in an endless loop (time-triggered paradigm). Thus, it is possible to delay the next iteration using the time-stamp counter and to call certain functions only every Nth iteration (Listing 4.1). Isolated tasks can also be held actively waiting for a notification to execute some work. The timing resolution depends on the WCET of the called functions. When notified to terminate the real-time execution phase, the loop is left and all resources should be gracefully returned. Memory allocated and locked will be automatically deallocated (and implicitly unlocked) when the process or thread ends.
Generally, it is nevertheless considered good programming style to undo all allocations explicitly.
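The placement of shared flags in separate cache lines can be sketched with an aligned structure (a minimal illustration; the flag names are hypothetical, and the line size of 64 bytes is typical for recent x86 processors):

```c
#include <stdint.h>

#define CACHE_LINE 64   /* line size on recent x86 CPUs */

/* Flags shared between the isolated task and the controlling task are
 * placed in separate cache lines: the line holding 'stop' is only
 * written by the controller and only read by the isolated CPU, the
 * line holding 'ack' only the other way around. This avoids the lines
 * bouncing between different First Level Caches. */
struct shared_flags {
    volatile uint32_t stop __attribute__((aligned(CACHE_LINE)));
    volatile uint32_t ack  __attribute__((aligned(CACHE_LINE)));
};
```

With this layout, each flag's cache line settles in the reading CPU's cache instead of being invalidated on every write to an unrelated neighbor, which also rules out false sharing between the two flags.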

extern volatile unsigned stop;      /* global flag: end of appl. */

void task1(void)
{
    unsigned iter = 0;              /* iterator variable */
    uint64_t t, tnext;              /* curr. time and end of time slice */

    setup();                        /* setup thread and start isolation */

    rdtsc(&tnext);
    while (stop == 0) {             /* execute until notified to quit */

        func1();                    /* call every LOOP_TIME cycles */

        iter++;
        if (iter >= 10) {
            func2();                /* call every 10*LOOP_TIME cycles */
            iter = 0;
        }

        tnext += LOOP_TIME;
        do {
            rdtsc(&t);
        } while (t < tnext);        /* busy-wait for end of time slice */
    }
}

Listing 4.1.: Basic structure of an isolated task with a time-triggered loop.

4.4. Full Isolation

As explained in Section 2.1.5, only interrupts (and exceptions, being a subclass of them) can asynchronously interrupt program execution. Those are Device Interrupts (DIs) by the hardware (i. e. IRQs), the Timer Interrupt (TI), and IPIs from other CPUs. Synchronous interrupts are caused by the code itself (i. e. system calls and exceptions) and must be excluded by implementing the isolated task accordingly. The Platform Interrupts (PIs) (such as the SMI) are asynchronous but were already regarded and excluded on the utilized systems.

A brute-force approach to block all maskable interrupts is to clear the Interrupt Flag (IF) with the CLI instruction. This is usually forbidden for user-mode processes but can be allowed on Linux systems with a call to iopl(3) if a process is run by root or has the appropriate capability. The IF masks the asynchronous interrupts from the hardware (IRQs) and from other CPUs (IPIs). Synchronous interrupts triggered by the running code (software interrupts: INT n, and system calls: SYSCALL, SYSENTER) and CPU exceptions are not maskable. As explained in Section 4.3.4, the task in isolation is not allowed to issue system calls and must be programmed carefully to avoid CPU exceptions and page faults. Further, the Non-Maskable Interrupt (NMI) is not blocked by the IF. It can be triggered by hardware (e. g. a watchdog timer) or by software (non-maskable IPI). The watchdog must be deactivated (Section 4.2.2). Non-maskable IPIs are used after severe errors in the Linux kernel; their safe exclusion is handled in Section 5.2.3.

If a standard process masks its asynchronous interrupts without further preparation, it is very likely that the system will block completely because other tasks and interrupts on that CPU will no longer be serviced. For short periods of time, this can be tolerable, but for executions longer than a minute, the methods described above address these risks. The most important precondition is the partitioning of the system, removing all IRQs and processes from the isolated CPUs so that they can still be serviced on the remaining CPUs.

An even stronger means to inhibit all interrupts is deactivating the local APIC [Wal10a] to a soft-off state. This blocks even NMIs from being serviced. The only interruptions that can still occur are PIs (i. e. the SMI). However, this was not required in any of the experiments. Since Device Interrupts are routed to other processors, the IF or the deactivation of the local APIC are only required to avoid interrupts sent from other CPUs and from the timer of the local APIC. Therefore, an alternative implementation without using the IF would be to deactivate both IPIs and the local timer. The local timer can easily be reprogrammed by writing to the memory-mapped configuration registers of the local APIC, which can be accomplished by a sufficiently privileged user-mode process. Constraining the Linux kernel from sending IPIs to the isolated processor requires a modification of the Linux kernel that will be addressed in Section 5.2.3.
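The brute-force approach of raising the I/O privilege level with iopl() and clearing the IF with inline assembly can be sketched as follows (a minimal illustration; the function names are hypothetical and error handling is reduced to the essentials):

```c
#include <stdio.h>
#if defined(__x86_64__) && defined(__linux__)
#include <sys/io.h>
#endif

/* Sketch: enter the fully isolated state by raising the I/O privilege
 * level and clearing the Interrupt Flag. Returns 0 on success, -1 if
 * the privilege is missing (root or CAP_SYS_RAWIO is required). */
static int mask_interrupts(void)
{
#if defined(__x86_64__) && defined(__linux__)
    if (iopl(3) != 0) {
        perror("iopl");
        return -1;
    }
    __asm__ volatile("cli");   /* mask IRQs and maskable IPIs on this CPU */
    return 0;
#else
    return -1;                 /* IF masking is x86/Linux specific */
#endif
}

static void unmask_interrupts(void)
{
#if defined(__x86_64__) && defined(__linux__)
    __asm__ volatile("sti");   /* pending masked interrupts fire now */
#endif
}
```

Note that CLI at privilege level 3 without IOPL = 3 would raise a general-protection fault, so the flag is only cleared after iopl(3) has succeeded.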

4.5. Analysis

According to the documentation of Intel [Intel13a, Sect. 3.5] and AMD [AMD12a, Sect. 3.7] processors, the control flow is sequential, i. e. the instruction pointer cannot be changed directly. Instructions implicitly affecting the control flow are voluntary

jumps (JMP), branches (Jxx, e. g. JLE, jump if less or equal), function calls (CALL), and the return from functions and interrupt handlers (RET and IRET). The only other events causing a divergence from the sequential instruction execution are interrupts and exceptions. Interrupts can be triggered asynchronously by the hardware (via I/O APIC or directly from the local APIC) or synchronously with the instructions INT, SYSCALL, and SYSENTER. Exceptions are always synchronous, i. e. triggered as a side-effect of an instruction.

All deviations from a sequential code path caused by instructions and by synchronous interrupts and exceptions must be predictable to make the entire execution time predictable. If state-dependent branches use only conditions resulting from previous execution, they become predictable. If they depend on global variables influenced by other processors or on hardware registers (I/O), care must be taken to validate all possible code paths. All called functions must be available for analysis, hence available in source code. This is given for all functions of the application itself. Library functions must be included in the verification; their use should be reduced to a minimum. Synchronous interrupts and exceptions must be avoided by careful programming because they execute kernel code⁸. That code is available for analysis, but very hard to evaluate for its execution time because of its complexity. The asynchronous interrupts are routed to other processors (IRQs), masked by the Interrupt Flag (IPIs), or deactivated (local APIC timer). Only the Non-Maskable Interrupt (NMI) and the System Management Interrupt (SMI) can arrive, but they were prevented by system configuration.

From this, it follows that the OS can no longer interrupt the isolated task (cooperative scheduling) and that the isolated task does not use system calls. Thus, the inner privilege rings (Section 2.2.2) have no control over the user-mode task.
The result can be modeled as a non-preemptive execution on a single-processor system. The application is executed bare-metal, without supervision by and support of an OS. Hence, existing scheduling research for this type of time-triggered applications, often used in embedded systems, is applicable.

Besides the remaining risk of being interrupted by NMIs and SMIs, hostile applications could interfere with the hard real-time tasks on isolated processors. It is still required to examine all applications for improper actions like escaping from their CPU affinity or CPU-Set or changing the OS configuration to unwanted settings (the extreme would be shutting down the system). But real-time systems must control to a certain extent what runs concurrently. This isolation concept removes the influence of external, unknown code paths from hard real-time execution, but does not protect from hostile programs and attacks. This could probably be accomplished by executing virtual machines in non-isolated partitions. The potential problem can be illustrated with memory management: physical memory can go offline (as can CPUs) [Edg13a]. This would lead to the system moving memory used by the

⁸ In Section 6.4.1, it will be shown how interrupts can be used without Kernel code being executed.

real-time task to different physical addresses although it is locked, causing page faults with unpredictable latency in the hard real-time tasks.

So far, it has been shown that it is possible to formally verify all code paths in the hard real-time tasks. But to translate a given code path into an execution time, the instruction latencies must be known. Unlike on RISC processors for embedded systems, the execution time of an instruction on recent x86 processors is not constant due to a long pipeline, multiple execution units (super-scalar out-of-order execution), and different cache levels. The prediction is further complicated by shared functional units like Last-Level Caches, main memory, and devices, where the code executed concurrently on other processors must be regarded, too. Hence, for the Control Flow Graph (CFG), unbounded execution delays in waiting loops can be ruled out because all running code can be formally verified. What remains is unpredictability in the execution time of basic blocks due to hardware effects from the caches and shared resources. A first impression will be obtained in the following practical evaluation. The detailed analysis and opportunities to get a tighter estimation of the worst-case execution time depending on the multi-processor architecture will be presented in Chapter 7.

4.6. Practical Evaluation

According to the analysis, the code executed in full isolation is not interrupted by the OS. To demonstrate the gain of a reduced worst-case latency, a practical evaluation of different degrees of partitioning and isolation was done with a benchmark similar to Hourglass (Section 2.3.5 and Listing 2.1 on page 68). In this test routine, a small loop repeatedly samples the time-stamp counter that increases with every CPU cycle. During all following measurements, the CPU executes at maximum speed. Therefore, the cycle counts can easily be converted to seconds by dividing them by the frequency.

The following results were obtained with an early prototype of the concept implemented in C. The test system is a desktop computer (System xaxis, Appendix A.2.1 on page 207) with an Intel Core i7 920 at 2.667 GHz and 3 GiB RAM. The OS is Linux 2.6.27⁹. The minimum and average loop times are below 30 cycles on an idle system, therefore the following benchmark uses a threshold of 50 cycles. All loop iterations longer than that are counted as gaps. The idle system was running a graphical user interface, but was left without further interaction. The load scenario consists of four CPU-intensive processes, a memory-intensive process using a 256 MiB array, two processes doing file I/O (with read/write and memory-mapping), and a heavily forking process to engage the Kernel. The benchmark was executed for 60 minutes each with fair scheduling without further real-time programming measures, with partitioning to execute solely on a CPU, and with full isolation by additionally

⁹ Fedora 10 (64-bit) with Linux kernel version 2.6.27.24-170.2.68.fc10.x86_64


System   Benchmark        Avg. loop   Max. gap      Total gap   Plot
idle     fair scheduling  25.16       2 894 613     0.71 %      Fig. 4.4
idle     partitioning     25.15       36 897 576    0.72 %
idle     full isolation   24.76       166           0.000 ppm
loaded   fair scheduling  74.84       144 043 058   65.3 %      Fig. 4.5
loaded   partitioning     26.14       123 790       0.6 %       Fig. 4.6
loaded   full isolation   26.13       1 135         0.01 %      Fig. 4.7

Table 4.1.: First evaluation on Intel Core i7 920 (in CPU cycles).

clearing the Interrupt Flag (Table 4.1). The benchmark program collects the offset (time-stamp) and length of detected gaps, but the size of the buffer for this list must be kept small enough to fit into the First Level Cache. The plots below show only the first 2000 detected gaps, with the x axis converted to seconds to better illustrate the time span. To avoid impacting the results, further gaps are not recorded but only statistically evaluated.

In the idle system, the benchmark detects large gaps above 30 million cycles (more than 10 ms) unless it is fully isolated from all interrupts. In the absence of other runnable processes, the benchmark is interrupted for 0.7 % of its execution time by the timer interrupt (all other hardware interrupts are routed to the other CPU). In the case of a cleared Interrupt Flag, the maximum gap detected during one hour of execution is 166 cycles (62 ns), with a total of 5 gaps above 50 cycles. As shown in Section 2.1.8, the minimum disruption time for an empty interrupt on the Nehalem micro-architecture is 575 cycles (Table 2.5). Typical Linux interrupt handlers are in the range of several 1000 cycles (Section 2.3.5). Therefore, the observed gaps in the fully isolated system do not result from interrupts triggered by hardware or software on the remaining system, but must be due to hardware effects caused by shared functional units (e. g. cache misses, bus collisions).

On a heavily loaded system, the improvement of the real-time behavior can be seen directly. The benchmark executed with fair scheduling must compete for a CPU and is executed for only 35 % of the time. With partitioning, the gaps are reduced to 0.6 %, approximately the same as on an idle system. With full isolation, the percentage of disruptions is further reduced, but the maximum gap was detected with a length of 1135 cycles (425 ns). This is the dimension of an interrupt and must be analyzed for its real cause.
Figure 4.4 shows the duration (in cycles) over the time (seconds after start) of the first 2000 gaps in the idle system with fair scheduling (first row in Table 4.1). The maximum gap in one hour was 2.9 million cycles, but the majority of gaps are in the range of 20 000 cycles. A lower level of clustered gaps is in the range of 50–70 cycles. In short bursts (e. g. around 1.2 seconds), many interruptions of varying length can

104 4.6. Practical Evaluation


Figure 4.4.: First 2000 gaps (above 50 cycles) on idle system, fair scheduling for benchmark.

occur. In this test series, no other processes compete for the CPU; the most probable source of interruptions is the local timer interrupt used for time management and scheduling.


Figure 4.5.: First 2000 gaps (above 50 cycles) on loaded system, fair scheduling for benchmark.

The same benchmark with fair scheduling executed on the loaded system collects 2000 gaps in only 0.07 s (Fig. 4.5). At this resolution, the gaps in the execution time can also be identified on the x axis. As in the previous plot, certain levels of gaps around 70 (here extended to slightly above 100) and around 20 000 cycles can be identified.

If the benchmark is executed in its own partition on the loaded system (Fig. 4.6), most of the larger gaps no longer occur. Most gaps are in the range of 70–100 cycles with fewer, but very periodic ones at 10 000 cycles. This is an indication that the gaps below 1000 cycles are caused by hardware effects such as cache misses while the disruptions at 10 000 cycles are the periodic timer interrupts. When the

105 4. Isolation of Bare-metal Tasks


Figure 4.6.: First 2000 gaps (above 50 cycles) on loaded system, benchmark executed in its own partition.

benchmark is fully isolated (Fig. 4.7), the gaps above 1000 cycles no longer appear, and the higher resolution of the y axis makes distinctive levels visible at 56, 116, and 133 cycles, with further gaps spread between 150 and 500 cycles.


Figure 4.7.: First 2000 gaps (above 50 cycles) on loaded system, benchmark executed fully isolated.

Using a more complex benchmark on haixagon (Intel Xeon Hex-Core E5645 at 2.4 GHz, Appendix A.3.1 on page 208) with the advanced implementation presented later, a similar result can be observed. The histograms in Fig. 4.8 display the distribution of detected gaps during 10-minute test series using different real-time methods, comparable to the results presented above. With fair scheduling and partitioning, a considerable number of iterations are interrupted for 10 000 to 25 000 cycles. However, the majority lies in the lowest histogram range, below 1 500 cycles, occurring more than 5 orders of magnitude more frequently. The partitioning results in fewer interruptions in the range above 30 000 cycles (12.5 µs). Only the fully isolated execution is able to prevent all delays above 2 500 cycles.

106 4.6. Practical Evaluation

[Figure 4.8 shows three histograms of events (logarithmic scale) over loop time (10³ cycles, 0–30): fair scheduling with 332 283 events above 30 000 cycles (max.: 3429.32 µs), partitioning with 5060 events above 30 000 cycles (max.: 2323.62 µs), and fully isolated execution with a maximum of 0.97 µs.]

Figure 4.8.: Histograms on loaded system.

107 4. Isolation of Bare-metal Tasks

To rule out interrupts being the cause of the gaps between 100 and 1000 cycles, a similar benchmark routine was executed on a minimal OS kernel (smp.boot Kernel, Appendix B on page 213). This bare-metal micro-kernel initializes all CPUs of a multi-processor system to a known and well controllable state. In particular, the interrupt system (I/O APIC, local APIC, sending of IPIs) is deactivated, so the observed jitter can only result from CPU and cache effects.

                  Hourglass max. gap
Load range    Write-Back    Write-Through
16 KiB        64            64
128 KiB       64            64
256 KiB       52            52
512 KiB       52            52
1 MiB         52            52
2 MiB         52            48
4 MiB         1080          52
5 MiB         1260          52
6 MiB         1420          48
8 MiB         1552          52
16 MiB        1480          52

Table 4.2.: Maximum gap on the smp.boot Kernel with different cache configurations for varying load ranges (in CPU cycles).

An hourglass benchmark routine was executed on one CPU while a load with a varying memory access range was executed on another CPU. The remaining CPUs were idle (halted). The results (Table 4.2) for Write-Back caching clearly show that a load using less than 2 MiB of the shared Last-Level Cache has no impact on the benchmark. However, if the range of the load increases above half of the Last-Level Cache size, the benchmark experiences occasional gaps above 1000 cycles, similar to the previous test series with an isolated task on a Linux system. Since interrupts can be excluded in the smp.boot Kernel, this proves that the shared Last-Level Cache can in fact slow down the execution of a processor that works only in its First Level Cache. In all tests, the minimum and average loop times are 44 and 47 cycles, respectively, which indicates that the deviations are very rare because they have no impact on the average.

With a Write-Through caching strategy, the impact of larger loads on the hourglass benchmark is avoided. But if the benchmark uses a real payload instead of the nearly empty hourglass loop, Write-Through has the drawback of always impacting the measurement task, whose average loop time is then 288 cycles. The estimation of the worst-case execution time must pessimistically include these effects if other processes use the Last-Level Cache shared with a hard real-time task. In Chapter 7, methods are presented to analyze and handle this jitter.
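The load generator sweeping different access ranges can be sketched as follows (a minimal illustration; the stride and the function name are assumptions, not the implementation used in this evaluation):

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the cache-load generator: walk a buffer of configurable
 * size so that a range larger than the shared Last-Level Cache forces
 * continuous evictions, while a small range stays cache-resident. */
static uint64_t memory_load(volatile uint8_t *buf, size_t range,
                            unsigned long iterations)
{
    uint64_t sum = 0;
    size_t pos = 0;

    while (iterations--) {
        sum += buf[pos];          /* each read touches one cache line */
        pos += 64;                /* stride of one cache line */
        if (pos >= range)
            pos = 0;              /* wrap around the configured range */
    }
    return sum;
}
```

Sweeping `range` from 16 KiB to 16 MiB while the hourglass routine runs on another CPU reproduces the effect of Table 4.2: only ranges approaching the Last-Level Cache size disturb the other processor.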

108 4.6. Practical Evaluation

4.6.1. Risks

Certain actions in the system partition may harm the real-time execution of isolated tasks. In all tests, these did not occur, but all deployed software should be evaluated for compliance as derived in Section 4.5. The partitioning relies either on CPU-Sets or on CPU affinity, and both can be modified by other processes. If an application uses them for its own purposes, it will most probably interfere with the isolation. Only processes with root privilege can modify the base Control-Group hierarchy. However, sub-trees of CPU-Sets can be delegated to users to create further subdivisions without influence on neighboring containers. Other unwanted behavior includes hotplugging of CPUs, memory, and other devices, and the shutdown of the system.

Using system calls from completely isolated tasks is prohibited. Obviously, their timing is not predictable and may include the execution of pending work queue jobs. If code to be executed in an isolated task is provided by a user, it must be analyzed for compliance. Another reason to prohibit system calls was observed on some Linux versions: the Kernel may restore the Interrupt Flag implicitly after returning to user-mode. This would involuntarily terminate the isolated state of the task.

The isolated task concept removes concurrent execution on other processors from the formal verification and the estimation of the WCET. But the whole system must still be certified and controlled for other (possibly hostile) applications.
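Pinning a task to its CPU, on which the partitioning relies, can be sketched with the Linux affinity interface (a minimal illustration; the function name is hypothetical, and as noted above, any sufficiently privileged process can overwrite this mask later):

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Sketch: pin the calling thread to one CPU via its affinity mask, as
 * done in the setup phase of an isolated task. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");   /* e. g. CPU excluded by a CPU-Set */
        return -1;
    }
    return 0;
}
```

Restricting the own affinity needs no privilege, which is exactly the risk discussed above: another process with CAP_SYS_NICE can just as easily widen the mask again.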

4.6.2. System Stability

The tests presented here were done with a very simple C program and setup shell scripts. The load applications were started before the benchmark initialized the isolated task and were stopped afterwards. In tests on interactively used systems, the system under test remained responsive. However, some more complex tests and heavier use of the system during active isolations revealed problems. Some processes were blocked until all isolations ended. Other problems included lost network connections, impossible logins, and a wrong wall clock time. All these problems are addressed in the following Chapter.

109

5. Linux Kernel Modification

In the previous Chapter, the concept of isolated bare-metal tasks on dedicated Central Processing Units (CPUs) was presented and evaluated. The evaluation concentrated on the execution of code in isolation and showed its uninterrupted execution even under concurrently executing heavy system load. However, certain loads cause the system to stop reacting to user input.

As predicted before [WLB12]¹, the masking of Inter-Processor Interrupts (IPIs) is expected to block the execution on other, non-isolated CPUs. This may happen due to the synchronous call_function() mechanism that waits for the completion of a remotely triggered function. Among others, this is used in the Linux function tlb_shootdown() that instructs other CPUs to flush their Translation Look-aside Buffer (TLB) after changes were made that affect other processors. An IPI is sent to the target CPU, which reads the function pointer to execute from a global variable, calls the function, and acknowledges the completion by setting a flag. The sender cannot notice the masking of its IPI but will wait for the acknowledgment by spinning on the flag until the masking of interrupts is undone. This will not happen until the bare-metal task terminates its isolated state. It is highly probable that, after some time, every CPU gets stuck in such a synchronous wait.

Indeed, various more elaborate tests have shown that it depends on the executed programs whether and when a complete system blocking occurs. There were successful executions of an isolation lasting several minutes while the system was used for office work. But the start of new processes as well as heavy memory and network load tend to block one CPU after another within a few minutes after the beginning of an isolation. If the isolation ends, the execution of the blocked CPUs resumes within instants. This is plausible because masked IPIs have their assertion bit set and will be serviced as soon as the Interrupt Flag is set again (i. e. interrupts are unmasked).
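The blocking mechanism can be illustrated with a strongly simplified, single-threaded model of such a synchronous cross-CPU call (an illustrative sketch, not kernel code; deliver_ipi models the interrupt system, and all names are hypothetical):

```c
#include <stdatomic.h>

/* Model of a synchronous cross-CPU call as in smp_call_function: the
 * sender posts a request, triggers an IPI and spins on a completion
 * flag. If the target CPU has masked its interrupts, the handler never
 * runs and the sender spins forever. */
struct remote_call {
    void (*func)(void *);   /* function to run on the target CPU */
    void *arg;
    atomic_int done;        /* completion flag set by the target */
};

static void (*deliver_ipi)(struct remote_call *call);

static int tlb_flushes;     /* counts executions of the model job */

static void model_tlb_flush(void *arg)
{
    (void)arg;
    tlb_flushes++;
}

/* Executed in the target CPU's interrupt handler: never runs while
 * the target has its Interrupt Flag cleared. */
static void remote_call_handler(struct remote_call *call)
{
    call->func(call->arg);
    atomic_store(&call->done, 1);   /* release the spinning sender */
}

/* Models a target CPU with interrupts enabled: the IPI is delivered
 * immediately. A target with IF = 0 would simply never call this. */
static void deliver_unmasked(struct remote_call *call)
{
    remote_call_handler(call);
}

static void remote_call_sync(struct remote_call *call)
{
    atomic_store(&call->done, 0);
    deliver_ipi(call);                    /* send IPI to the target CPU */
    while (!atomic_load(&call->done))
        ;   /* the sender blocks here while the target's IF is cleared */
}
```

With interrupts unmasked, remote_call_handler runs promptly and releases the sender; with the IF cleared on the target, the spin loop in remote_call_sync never terminates, which is exactly the system-wide blocking observed in the tests.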
The most notable problems observed during all tests are the blocking in kernel-mode of some or all CPUs in the system partition (i. e. with the effect that processes are no longer executed), impossible logins, an unstable wall clock time (and related problems such as sleeping processes not waking up because the clock does not advance past a certain point), increasing memory consumption (because Kernel buffers are not freed), and reported faults (often false-positives) in the Kernel log (output of the command dmesg), sometimes causing Kernel-Bugs, non-maskable Inter-Processor Interrupts, stack-dumps, or even a reboot. Some problems even followed from the Kernel

¹ First prediction by Jan Kiszka in personal communication, 2009.

modifications, for instance pinned Kernel-Threads appearing in created CPU-Sets that cannot be moved back, which in turn prevents the CPU-Set from being removed.

In this Chapter, methods are presented to analyze the system state during the execution of isolated tasks, along with approaches to address the observed problems. Although it was initially intended to use the isolation concept on unmodified systems, some changes to the Linux kernel are required to keep the system responsive during longer executions.

5.1. Monitoring System Stability

To detect problems with all processes executed on the system, a full logging of a wide range of system states was implemented. Since the system is always tested under load, this integrated recording does not affect the measurements in a negative way but instead increases the load. Besides observing the system from user-mode and searching the source code for the triggers of problems, kernel debuggers and tracers like SystemTap and [Gre14] can be helpful.

As mentioned, prolonged periods of isolated tasks can disturb the time keeping system. This leads to unreliable answers from the functions reporting the wall clock time and to problems with the sleep() function. Consequently, a script that records some information and then sleeps for five minutes cannot be trusted to create a track of information obtained at a constant interval of five minutes. Additionally, recording the system’s notion of the current time can also be misleading. Hence, it is advisable to use the processor’s time-stamp counter (RDTSC) as an additional point of reference because it is independent of the operating system (OS) and monotonically increasing. This allows detecting whether the wall clock time reported by system functions is correct. It is also reasonable to compare and synchronize the output of different logging systems such as own log files and the Kernel log (dmesg). Further, the start and end of the real-time application (including the isolated tasks) should be recorded with time-stamps to match recorded events with the start, ongoing execution, or shutdown of isolated tasks.

A viable solution to record the state of the real-time system is to monitor it from a second system via network. A shell script was used to connect to the system under test with Secure Shell (SSH) and record all required information together with its own wall clock time. It is advisable to process this massive amount of data automatically and to create a report including graphics to quickly detect anomalies in the execution.
Further, the recording should start some minutes before and end sufficiently after the real-time application with its isolated tasks to understand how the values evolve during normal operation. Some examples of data sources for recording the system state are listed in Table 5.1. The active processes can be monitored with the command ps. This command can be configured to include more information in its output, especially the CPU a process is actually executed on and its scheduling priority (if applicable). Many files in

112 5.1. Monitoring System Stability

Data Source        Type             Comment
date               Shell command    Print wall clock time
uptime             Shell command    Time since boot and system load
ps                 Shell command    Running processes and state
dmesg              Shell command    Print Kernel log buffer
RDTSC              Assembly instr.  Time-stamp counter
/proc/meminfo      Virtual file     Contains free and occupied memory
/proc/interrupts   Virtual file     Displays number of interrupts per CPU
/proc/slabinfo     Virtual file     Number of Kernel memory objects
/cpusets/*/tasks   Virtual file     Existing CPU-Sets and tasks therein

Table 5.1.: Data sources for logging the system state during isolation.

the virtual file systems /proc and /sys help to understand the Kernel’s internal state and behavior. The Kernel log is a ring buffer holding the last 16–64 KiB of log messages. The output of dmesg is the current content of the buffer. This command can be instructed to clear the buffer, reducing the output of the next invocation to only new entries. Alternatively, subsequent recordings must be purged of duplicate entries. To avoid missing entries when the buffer overflows, the messages should be stored to a file sufficiently frequently.

During the practical evaluation, some effects were observed that occurred only rarely (e. g. only in the second isolation after a reboot, or only in the second isolation after every fifth to twentieth reboot). The utmost reliability cannot be proven with experiments (especially for hard real-time execution), but to get reproducible results, the testing procedure should be automated. This was realized by controlling the system under test from a second computer that repeatedly reboots the first system and iterates many test series. The test sequence should include many short isolations as well as a few very long ones. Some negative effects are triggered by the modifications to the Linux kernel presented in the next Section (and addressed by other modifications or settings). Generally, more data must be included in the logging if undesired behavior is experienced, to be able to understand the cause and find remedial measures.

Figure 5.1 is a plot of the free memory as reported by /proc/meminfo. The x axis displays minutes after the start of the system. The test series includes a 15-minute isolation and then six executions of two hours each. During each execution, the free memory decreases linearly. It is not the test application that consumes this space, but the Kernel itself.
Comparing the free memory trend with the Slab size, it is obvious that the decrease of free memory is caused by an increasing total Slab size that somehow accumulates during an active isolation instead of being returned to the free memory pool. The Slab is a Kernel data structure to manage small memory allocations. All memory blocked during the isolations was immediately

113 5. Linux Kernel Modification

[Plot: free memory (GiB, left axis) together with Slab and buffer size (GiB, right axis) over minutes after system start, 0–800.]

Figure 5.1.: Example trend of various /proc/meminfo values during six isolations of two hours each (System: haixagon, Appendix A.3.1 on page 208).

freed when the isolated tasks ended. The falling rate depends on the frequency of the SSH logging; hence, it is the network connection and the creation of processes that leave those traces. It can be extrapolated after how long the free memory would be exhausted. In this example, the free memory decreases by approximately 1 GiB in 6 h. A system with 4 GiB of main memory would be exhausted after an isolation of 24 h duration. In the example of Fig. 5.1, a singular event 200 minutes after system start consumed memory that was not freed after the isolation ended. A part of this consumption can be seen in the Slab size; another part caused a steep increase of the buffer size, which otherwise remained nearly constant. This event’s time of day could be matched with the execution of the Cron daemon’s daily maintenance jobs, which were deactivated subsequently.

5.2. Linux Kernel Patch

The predicted internal blocking of synchronous function calls happened while actively waiting for the completion of a called remote procedure, and it could only be resolved by a modification of the Linux kernel. Therefore, the initial ambition to realize the concept on an unchanged Linux system could not be met. Other problems observed with the methods described in the previous Section needed to be addressed with further modifications. To ease porting the changes to other Kernel versions, they are confined as much as possible. Finally, a total of six files,

four in the architecture-independent kernel/ directory and two x86-specific files below arch/x86/kernel, were changed. The described modifications were initially developed for Linux 2.6.31-rt and later ported with little effort to Linux 3.2, Linux 3.2-rt, Linux 3.9, Linux 3.11, Linux 3.12, and Linux 3.12-rt. The following descriptions are based on Linux 3.12, the most recent official version used during the project. The modifications are similarly valid for the same Kernel version with the Real-Time Preemption Patch (RT-Patch), Linux 3.12-rt. The changes are collected as the patch isol.patch. The term “the Patch” refers to all modifications described in this Chapter.

5.2.1. Infrastructure

Similar to hotplugging CPUs (Section 2.2.5), the isolation can be interpreted as a state of certain processors. Therefore, the changes to the Linux kernel need to be activated for the isolated processors if and only if a task is executed in the isolated state. A convenient method to offer an interface accessible by user-mode applications is to create a virtual file in /proc or /sys [CRK05]. The latter, being the more modern approach, was selected. During initialization, the Patch creates the virtual directory /sys/isol with files to configure the behavior and to activate isolations for individual or multiple processors. The virtual file system registers a function that is called when a user reads or writes a file in that directory. The reading function can create a character string out of internal data structures to pass information to the user, while the writing function receives the written data to store it internally and to optionally execute required actions.

The actions to isolate a task now include the additional step of notifying the Kernel about the activation of the isolation. This should be the second-to-last step, just before clearing the Interrupt Flag (IF). After the real-time operation, the Interrupt Flag is restored and the Kernel must be notified about the ending of the isolation before any other take-down steps are executed. So far, using system calls from within a completely isolated task was prohibited. This collides with notifying the Patch about activating or ending the isolation. Strictly, a task completes the initialization phase by clearing the Interrupt Flag; hence, the assembly instruction CLI is the point when the full isolation begins. Likewise, restoring the Interrupt Flag with STI terminates the isolation before the other measures are reverted. The important reason for not using system calls is their unknown duration and the potential implicit restoration of the IF.
If the writing to the sysfs file happens before masking interrupts, this is acceptable. Similarly, after the real-time application has ended (the control loop was left), the deinitialization and restoration of normal system operation is not time-sensitive and can safely be done using a system call. Generally, it is recommended to use the low-level UNIX interface (e. g. open() and write()) for accessing sysfs files instead of the C library (e. g. fopen() and fprintf()) or C++ (e. g. class ostream) because those

may buffer the output. The handle of the open file /sys/isol/cpu_mask should be left open to be used again for terminating the isolation.

The current Patch supports two methods of adding a processor to the set of isolated CPUs. The first method is writing a bit mask of isolated CPUs to the file /sys/isol/cpu_mask. The internal infrastructure compares this new bit mask with the previous setting, activates the isolation for all additionally set bits and reverts the changes for all cleared bits. To avoid the race condition of reading the previous state, setting the bit corresponding to the own CPU and writing back the new bit mask, the alternative method supports adding and removing individual CPUs by writing +N or -N to the same virtual file to add or remove only CPU #N. The management functions use a Kernel spinlock to avoid problems with concurrent access.

Other files in the directory /sys/isol serve to configure the behavior of the modifications and to ease debugging. The file flags sets a bit mask to selectively activate individual modifications in order to test different combinations of them without the need to rebuild the whole Linux kernel. Similarly, the individual entries of the notifier chain (Section 5.2.5) can be activated with the bit mask notifier. The amount of logging output can be configured with verbose. During the evaluation of the Patch and of the behavior of the system, it is very beneficial to be able to deactivate selected methods and to understand exactly when different parts of the modification become active.

5.2.2. CPU Online Mask

The first approach was to change the bit mask that is used by the Kernel to track available CPUs. Usually, it is used when restricting the number of started CPUs (boot parameter isolcpus) or for Hotplugging. It was expected that all Inter-Processor Interrupt (IPI) sending functions check this bit mask and refuse to send to offline CPUs. But this did not solve the problem and even caused CPUs to not work properly after an isolation. Thus, clearing such bits may lead to shutting down subsystems in the Kernel that are not restarted automatically after restoring the bit.

5.2.3. Inter-Processor Interrupts

Inter-Processor Interrupts (IPIs) are used in multi-processor systems to invoke interrupt handlers on other CPUs. This is required to wake up idle CPUs to schedule work on them, to invoke arbitrary functions to maintain the OS's internal structures, to handle hardware errors and lockups, for performance monitoring, and so forth. IPIs can be sent by a CPU either to a dedicated target or broadcast to multiple or all CPUs. The delivery of IPIs to isolated CPUs that have masked all interrupts has the potential of blocking the remaining system if the sender waits synchronously for the

completion of the requested function. As experiments have shown, the load in the system partition determines how soon complete system lockups become probable. To allow all types of programs in the non-isolated parts for arbitrary durations, the dispatching of IPIs to isolated CPUs must be prevented.

Linux offers variants of the function smp_call_function() to execute a given function on either a given CPU or on all CPUs. To manipulate this system, the call tree was analyzed to find places where changes should be applied. The analysis is made intricate by the coding style of Linux, which involves many macros and data structures of function pointers to allow porting Linux to a wide variety of architectures. Generally, two approaches exist to understand the interactions of many functions in different parts of the Kernel: Bottom-up and Top-down. The Bottom-up approach starts either at the low-level functions that directly access the hardware of the Local Advanced Programmable Interrupt Controller (localAPIC) or at a level where all functions involved with IPIs can clearly be identified. If this is not the layer where modifications are suitable, all places must be identified where these functions are called. This recursive process must be followed until a layer is found where modifications are reasonable. The apic structure contains function pointers that are initialized depending on the type of localAPIC. They are invoked mostly through macro abstraction; therefore, the macros using the function pointers must be identified first, and then the places where those macros are used. In this direction, the tree becomes very widely branched. The Top-down approach starts at a similarly identifiable point and follows function calls down to the functions that actually access the configuration registers of the localAPIC directly.
The challenge is to detect the values of the function pointers in the apic structure that are initialized at boot time. For this task, the Patch includes a function that reads the current values of the five apic elements concerned with IPIs and resolves their addresses to symbols. Both methods can be combined to make sure that the whole IPI system is understood and covered. The implementation of the x86 architecture branch of Linux 3.12 is displayed in Fig. 5.2. The apic structure in apic.h2 contains pointers to low-level sending functions. On the test system haixagon (Appendix A.3.1), the functions of the physflat module are applied (except apic_send_IPI_self(), which is similar for both modules as indicated by comments in the source code).

The Top-down approach covers all branches of callers but fails on waiting inter-processor function calls: The architecture-independent sending function instructs low-level functions to transmit an IPI and waits for the target to acknowledge the execution. If the IPI is just not sent, the acknowledgment will never arrive. As the sending functions synchronously block until notification, the waiting task will block until the isolation ends and IPIs are reactivated. Finally, at least the synchronously waiting inter-processor function calls must be modified in the Bottom-up path; the low-level functions in the Top-down direction can then forestall all further IPIs with

2 arch/x86/include/asm/apic.h (Linux 3.12)

[Figure 5.2.: Call graph of the functions triggering Inter-Processor Interrupts in Linux 3.12: the function pointers apic->send_IPI_self, apic->send_IPI_mask, apic->send_IPI_allbutself, apic->send_IPI_mask_allbutself and apic->send_IPI_all resolve to apic_send_IPI_self() and the physflat_send_IPI_*() functions, which call default_send_IPI_mask_sequence_phys() and default_send_IPI_mask_allbutself_phys(), finally reaching the localAPIC via __default_send_IPI_shortcut() and __default_send_IPI_dest_field().]


Function                                   File          Type
default_send_IPI_mask_sequence_phys()      ipi.c a       single, low-level
default_send_IPI_mask_allbutself_phys()    ipi.c         single, low-level
apic_send_IPI_self()                       probe_64.c b  single, low-level
resched_task()                             core.c c      single, high-level
wake_up_idle_cpu()                         core.c        single, high-level
kick_process()                             core.c        single, high-level
ttwu_queue_remote()                        core.c        single, high-level
generic_exec_single()                      smp.c d       single, waiting
smp_call_function_single()                 smp.c         single, waiting
__smp_call_function_single()               smp.c         single, waiting
smp_call_function_many()                   smp.c         mask, waiting

a arch/x86/kernel/apic/ipi.c (Linux 3.12)
b arch/x86/kernel/apic/probe_64.c (Linux 3.12)
c kernel/sched/core.c (Linux 3.12)
d kernel/smp.c (Linux 3.12)

Table 5.2.: Functions modified to block sending IPIs in Linux 3.12.

least effort. The lowest-level functions in Fig. 5.2 use physical localAPIC IDs that can not easily be converted back to the CPU IDs used by the isolation bit mask. Therefore, the Patch modifies the three functions of the level above (marked “low-level” in Table 5.2). These three functions block sending IPIs to the isolated CPUs on the lowest level. Even the Non-Maskable Inter-Processor Interrupt (NM-IPI) used for debugging Kernel problems is prevented from being asserted to an isolated CPU. The higher-level functions use low-level ones to send an IPI. Further, the waiting functions actively wait for the completion signaled by an acknowledge flag in a shared variable. If the IPI is never handled, the synchronous waiting blocks. The IPIs blocked by this modification do not even result in the corresponding pending bit being set; therefore, they are not caught up after all isolations end. Functions of the type single target one CPU given by its ID, while the mask functions are given a bit mask of multiple CPUs to execute on and wait for. The three functions marked low-level suffice to also handle the high-level functions. The four waiting functions are an exception: they must be modified in addition to the low-level functions because they synchronously wait for the completion of the triggered function. With this modification, no IPIs are sent to isolated processors, which is a basic requirement to avoid blockings caused by IPIs. If the hardware interrupts are routed correctly to other CPUs and all IPIs are blocked by the Patch, only the timer interrupt is still triggered and blocked by the Interrupt Flag. This allows fully isolating a task without clearing the IF, which will be described in Section 5.3.


If IPIs are sent but masked on the target CPU, their pending bit is set and the interrupt handler is called when the IF is reset. The described changes inhibit sending the IPIs in the first place; therefore, they are not caught up after the isolation ends. The function tlb_shootdown() is executed on all processors after global changes to the Kernel page tables. If a CPU were not notified, it would access memory with outdated TLB entries. The Patch avoids this by always executing this function on a CPU when it reverts its isolation.

5.2.4. Read-Copy-Update

Read-Copy-Update (RCU) is widely used in Linux (Section 2.2.5) for mostly-read data structures [BC05, p. 207]. By default, the variant rcutree is used. In preemptive real-time Kernels (especially with the RT-Patch), the detection of a grace period is more complicated; this is implemented in the special version rcupreempt. For the isolation context, it is important to understand that the RCU system tracks the state of all CPUs and either waits for them to increment a counter or periodically executes a function (via IPI) on all of them. If a CPU is isolated, it neither increments the counter nor can the remote function be executed. This blocks the RCU system from ever ending a grace period, which can either result in an indefinite blocking (if synchronously waiting) or in data structures never being freed (if callbacks are never invoked).

The Slab system uses RCU to manage allocations of memory buffers. If a buffer is returned, it is not immediately freed; instead, a callback is scheduled to free the buffer when it can be guaranteed that no more readers hold a reference to it. Hence, if the RCU system is blocked and its grace period never ends, returned Slab elements are never freed and can not be reused. This is the cause of a growing Kernel memory consumption during an active isolation. As observed, the growth rate depends on the type and amount of load in the system partition.

The solution can be adapted from the Hotplugging system. Since the RCU management keeps track of all CPUs, it must be notified of unplugged and isolated CPUs. The straight-forward solution is to notify the RCU subsystem about the unavailable CPU like Hotplugging does. The Patch includes functions rcu_offline_cpu() and rcu_online_cpu() in rcutree.c3 that call the same internal functions that are invoked by Hotplugging events. The process of registering a CPU offline is notified by the sequence of CPU_DOWN_PREPARE, CPU_DYING and CPU_DEAD.
The inverse operation includes the events CPU_UP_PREPARE and CPU_ONLINE (the meaning of these events will be described in the next Section). Essentially, they clean up pending RCU callbacks on the current CPU and (de)register the processor in the list of active CPUs. Since an isolated task is not allowed to use system calls, it can not interfere with data structures protected by RCU. The next Section will describe how

3 kernel/rcutree.c (Linux 3.12)


Event              Description
CPU_DOWN_PREPARE   ask for approval
CPU_DOWN_FAILED    only for rollback on disapproval
CPU_DYING          shortly before CPU is switched off
CPU_DEAD           after CPU is powered down
CPU_UP_PREPARE     ask for approval
CPU_UP_CANCELED    only for rollback on disapproval
CPU_STARTING       CPU will be started
CPU_ONLINE         CPU is running

Table 5.3.: CPU notification events

the Hotplugging system can be imitated more closely, which will include notifying the RCU system.

5.2.5. Notifier Chain

The Notifier Chain is a mechanism of the Linux kernel that allows other subsystems to register to be notified on certain events. Since the activation of isolated CPUs is similar to Hotplugging, the CPU notifier chain is especially interesting. Every kernel module (e. g. device drivers) or subsystem (e. g. time keeping, RCU) can register its callback function with the CPU notifier chain. During shutdown or startup of each CPU, multiple events (Table 5.3) are issued to all registered callbacks. The *_PREPARE events are inquiries that can be answered with disapproval. In that case, all previously notified callbacks are rolled back with an abort event. With this mechanism, a callback can place a veto to keep a certain CPU running.

Many parts of the Kernel need to be notified about unavailable CPUs; examples are the time system, watchdogs, and RCU. Therefore, it is sensible to imitate this mechanism also when isolating a CPU. A copy of the original functions was extended with a bit mask to disable the notification of selected entries. Some notifiers must be omitted because the CPU is not in fact switched off and the isolated task needs to continue executing; therefore, the migration notifier is excluded. The modifications are implemented in cpu.c4. The functions cpu_isol() and cpu_unisol() imitate the function notifier_call_chain(), originally from notifier.c5.

Since the inclusion of the notifier chain, another problem arose with kernel threads. The imitation of Hotplugging terminates many pinned kernel threads on the respective processor. When reactivating it, the kernel threads are restarted by the kthreadd

4 kernel/cpu.c (Linux 3.12)
5 kernel/notifier.c (Linux 3.12)


(Kernel Thread Daemon). If this non-pinned kernel thread was moved to a CPU-Set containing only CPU #0, the spawned threads inherit this property and remain pinned to CPU #0. Therefore, in the system initialization when moving all processes and non-pinned kernel threads to a system CPU-Set, the Kernel Thread Daemon must be pinned to CPU #0 and remain in the root CPU-Set (that contains all CPUs). During restart, the pinning of the new threads is changed to their CPUs and they remain in the root CPU-Set as before they were terminated.

5.3. Deactivation of the Timer-Interrupt

With the prevention of sending IPIs to isolated CPUs described in Section 5.2.3, the only interrupt still blocked by the Interrupt Flag (IF) is the local timer triggered by the localAPIC. That interrupt is used by the Linux kernel for periodic maintenance (e. g. advancing the wall clock and usage statistics) and for preemptive scheduling. Since an isolated CPU hosts only a single process and this interrupt is masked anyway, the local timer can be stopped during active isolations.

The local timer is a functional unit of the localAPIC (Section 2.1.7 on page 21) and can be configured per CPU through the memory-mapped localAPIC configuration registers [Intel13c, Sect. 10.5.4]. Those are available at constant physical addresses, which makes them accessible both for the Kernel and for user-mode applications. The latter can map the respective region of the virtual file /dev/mem6 and modify the localAPIC behavior directly. The timer uses two registers, Initial Count and Current Count. It is stopped by writing zero to Initial Count. To restart the timer, either the old value of Initial Count or the last value of Current Count before stopping the timer must be written to Initial Count. Each CPU can only configure its own localAPIC; to be able to activate an isolation remotely, a remote function call (Section 5.2.3) via IPI is used. This must be done before the Patch starts blocking IPIs to that CPU.

If the partitioning redirects all hardware interrupts, IPIs are blocked by the Patch and the local timer is deactivated, the IF does not need to be cleared to achieve the state of complete isolation. Moreover, the IF can only be changed by the CPU itself. Therefore, a complete isolation supported by a cleared IF can only be reverted by the CPU itself. This could be triggered by an IPI, but the handler executes in kernel-mode and can only influence the user-mode flags by manipulating the stored flags on the user-mode stack. In contrast, activating the local timer of another CPU by IPI is easier.
With this modification, the isolation can be controlled from another CPU. This makes managing multiple isolated tasks (Section 6.2) more convenient and enables handling errors in isolated tasks. For example, if an isolated task gets stuck in an

6 Newer Linux distributions tend to deactivate this interface for security reasons, but it can be reactivated.

endless loop, a cleared Interrupt Flag would only allow a Non-Maskable Interrupt (NMI) to pass. If the blocked task does not need to clear the IF, the isolated state can easily be terminated by another CPU and the isolated task becomes a normal thread again that can be terminated with OS methods.

A similar approach of deactivating or reducing the timer interrupt and modifying the Linux kernel was taken in a project to increase the performance of High Performance Computing (HPC) applications. They changed the timer frequency and removed the major source of IPIs by changing RCU, instead of systematically verifying that no further interruption is possible, which is crucial for hard real-time execution. They did not reach a fully stable solution but proved their concept for HPC systems [ALL12].

5.4. Alternative Implementations

During the development and evaluation of the described changes to the Linux kernel (the Patch), alternative methods were considered but not implemented. The most notable alternative way of handling the problems that the complete isolation causes in the Kernel is the Hotplugging approach. At the time of implementation, the chosen method of changing some places in the Kernel was expected to lead to a successful result faster. Since modifying the Kernel and porting the changes to a different version is a major effort, the alternative concepts are described here. However, it was not analyzed whether these methods are realizable.

5.4.1. Kernel Module

One motivation of the isolation concept was initially to use an unmodified Linux system. This would have offered the benefit of independence from the underlying OS and of easily transferring the real-time application to other systems or upgrading the OS [KY08]. However, Kernel problems had to be addressed by modifications to the Linux source code in the form of a patch. This Patch changes only six files and was already ported to several new Kernel versions. This did not demand deep analysis; the only problems were functions that had moved too far, so that the patch tool was not able to identify the places to change. Otherwise, kernel modules are considered easier to port to new Linux versions. They use a more stable Application Programming Interface (API) but are restricted to the released interfaces. Another disadvantage of the Patch is the requirement of rebuilding the Kernel after every change. In contrast, a kernel module can be unloaded, recompiled and loaded again. If the package management of a distribution installs a new Kernel (e. g. with security fixes), it would lack the Isolation Patch but could probably just load the same kernel module automatically as before.


An attempt was made to realize the changes of the Patch as a kernel module. The infrastructure with the sysfs interfaces and the management of isolated CPUs can easily be implemented in a module; after all, this is how device drivers work. But the modification of IPI sending functions changes several high- and low-level functions (in the architecture-independent and the x86 parts of the code, respectively). Other functions like the notifier chain and the notification of RCU are not part of the module API. They could be exported by a patch again, but this would defeat the advantage of a kernel module.

5.4.2. Hotplugging

As introduced in the motivation for this Kernel Patch (Section 5.2.1), during an active isolation, the respective CPU is unavailable to the other CPUs just like a switched-off processor is. The procedure of activating the isolation largely imitates the Hotplugging system. Consequently, an alternative method to unregister isolated CPUs from the Kernel would be to in fact switch them off with the Hotplugging system. To execute the isolated real-time task, the halted processor could then be woken up again without using the functions of the Linux kernel. This would result in a CPU that executes some code without the Kernel registering this CPU as available.

The waking up of a switched-off CPU is described in the processor manuals [Intel13c; AMD12b]. It is the same procedure used by a Symmetric Multi-Processing (SMP) OS when booting the other processors (Application Processors, APs) from the Bootstrap Processor (BSP). The boot process of a multi-processor OS starts with a single processor that initializes some early required devices and then wakes up the other processors with a Startup-IPI. This is a special IPI that can be generated by the localAPIC similarly to usual IPIs. Instead of an interrupt vector number, the Startup-IPI carries the address of a physical memory page as payload. This page must contain the boot code starting from 16 Bit Mode to initialize the CPU to 32 Bit Mode (and possibly to 64 Bit Mode) and to set up some processor-specific functional units like the localAPIC and the page tables used by the Memory Management Unit (MMU). To exploit this hardware feature while circumventing the Linux kernel, a memory page in the physical address range below 640 KiB must be allocated and initialized with the compiled boot code. Then, a Startup-IPI can be sent to the switched-off processor with the physical address of the page holding the boot code.
The boot code is standard for x86 processors, switching from 16 Bit Mode to the required 32 or 64 Bit Mode that the other CPUs use. Bringing up the CPU is straight-forward and certainly possible. However, it must be analyzed how the localAPIC must be configured to work correctly. It will not receive any IPIs because the Kernel expects this CPU to be switched off, but it must be evaluated whether the reactivated localAPIC reacts to broadcast IPIs. The final action of the boot procedure is generally handing over to a scheduler that manages multiple tasks. In this setting, only a single task (the isolated one)

should be executed. The code is stored in a process running in the remaining Linux system. It is probably a viable way to use the same page tables for the woken-up CPU as for a manager process and to jump to the address of the code to be executed in isolation. To revert the state of a CPU executing without being registered by the Linux kernel, the isolated task must shut down the CPU to a halted state. Then, Linux can re-initialize the CPU with the Hotplugging system. It is not clear how the CPU would react to faults and errors and how a stuck isolated task could be recovered. It would probably be required to implement interrupt and exception handlers for those occasions.

Generally, this method carries a lower risk than the Patch of causing yet undiscovered errors in the OS, because the Hotplugging system handles all subsystems correctly and is widely used and tested, while the Patch is only evaluated with a selection of loads. Further, this procedure can probably be implemented as an easily portable kernel module because it is based on processor features and requires only minor interaction with Kernel functions. It remains future work to test the viability of this approach.

5.5. Evaluation

The implemented modifications were successfully tested with a wide variety of loads and benchmarks, with isolations of up to 5 days. This included heavy system load and remote logins; the load processes were recorded to evaluate whether they worked properly. Further, there were no critical messages in the Kernel log (dmesg). Using system calls now includes another risk (in addition to those explained in Section 4.6.1): the Patch deactivates some subsystems for the isolated CPU, and system calls that rely on these functionalities can either block indefinitely, reactivate sleeping services, or even block the whole system. Nevertheless, single isolations of up to 120 hours and tests with many reboots and many short executions (up to 50 isolations of 5 minutes each) all went fine. The hard real-time property of uninterrupted execution on the bare metal is not changed in this Chapter. But the fact that a system can not crash is not easily formally verifiable when running a Linux kernel. Therefore, the criticality guarantee does not hold as strongly as the temporal predictability.


6. Application Support

In the previous practical evaluations, the isolated task has always executed either the empty loop of the hourglass benchmark or small synthetic benchmark routines doing substitute work on memory buffers, and multiple isolated tasks did not cooperate. Since the isolation concept presented in Chapter 4 does not allow system calls, the isolated tasks can so far neither communicate nor use device drivers. This constrains their applicability for real work. The solution to restore the usability already began to show in using shared variables for notifying the isolated tasks. This Chapter presents methods for shared memory communication without interference of the operating system (OS) and for direct hardware access to interact with the physical world. A similar trend to avoid context switches into the OS can be observed in High Performance Computing (HPC), where communication methods implemented in the user-mode are increasingly popular. Supported by Remote Direct-Memory Access (RDMA) interconnects, the latency is decreased and the OS jitter reduced [Gio+04].

The objective of the isolation concept is to integrate multiple components and even levels of the automation pyramid (Section 1.1) into a single system with multiple processors. While applications of the company level like Manufacturing Execution Systems (MES) and Human-Machine Interfaces (HMIs) are already implemented customarily on PC systems, the real-time requirements of the cell level and even more the hard real-time and very low-latency demands of Programmable Logic Controllers (PLCs) in the field level are still in the domain of embedded systems. For development systems, prototypes and unique research specimens, the rapid prototyping approach using the isolation concept enables the integration of hard real-time tasks on commodity x86 systems.
A possible starting point is an already existing complex application that should be extended with low-latency closed-loop control, or a distributed system of multiple external micro-controllers that should be consolidated into fewer systems for cost-efficiency or simplified management. Generally, common guidelines to implement real-time applications remain valid [Lab97; SRH99]. The exemplary application introduced in Section 1.1 is used as a motivation to present how all required means of complex real-time applications can be realized with the isolated task concept. The main application contains a Graphical User Interface (GUI) (e. g. a Qt1 application) to display the state of the experiment and to allow configuring and starting test series. It uses a network connection to store results in an external database. Multiple threads are started to execute longer running analyses

1Online: http://qt-project.org (visited November 12, 2015)

and data preprocessing. The GUI should always display a rather up-to-date snapshot of some graphs representing the current state of the experiment. The user interface must always be responsive to adapt parameters or even to abort the experiment in case of a dangerous condition. Decisions of the operator must be transmitted to the experiment control process. Several actors of the physical system (plant) must be regulated by closed-loop controllers with a frequency of 100 000 Hz (10 µs) and a time-out of 2 µs under hard real-time constraints. The controllers should update their parameters with soft real-time behavior (quickly after they are changed by the operator) and they must make their current state available to the main application for post-processing.

For debugging purposes, a recording of every 10th state (100 µs) of the controllers can be activated. Alternatively, the debugging can be supported by providing a special soft real-time mode where a bare-metal task on an isolated processor is not completely isolated. This would allow executing the isolated task under the control of a common debugger. Software tracing [Sto13, Sect. 3, p. 569] can be implemented via a shared memory buffer where the bare-metal task writes trace entries into a ring buffer structure and a system process reads and evaluates the traces. The overhead depends on the granularity and volume of trace events and has the worst impact on the cache, resulting in a larger jitter than with deactivated tracing. On x86, the Performance Measurement Counters (PMCs) [Intel13c, Chap. 19; AMD12b, Chap. 13.2] can be used to acquire processor details such as cache misses or branch mis-predictions. The infrastructure supports interrupts, but can also be used purely on a polling basis, only counting events that are reported via shared memory to a processing task in the system partition. Debugging mode and tracing should be deactivated during time-sensitive execution. Between test series (i. e.
not during real-time operation), the code implementing the closed-loop control can be replaced. The chemical and physical models are developed using a graphical tool (e. g. Simulink), converted to C code, compiled to shared libraries and dynamically linked to the application. The interaction of the tasks and their required timing are illustrated in Fig. 6.1.

The term “hard real-time” is interpreted in this setting as meaning that not a single violation of the defined deadlines is allowed. This should be proved by formal verification of the code. But since the execution time of instructions on x86 hardware is difficult to predict, the practical execution must also be monitored. This enables registering timing errors, because even a single timing violation invalidates a test series. It is not a safety-critical application (on which lives depend), but failures can be expensive if long-running test series must be repeated or hardware is damaged. The GUI, the Manager and the Calculation tasks can be implemented as threads in a process. As the system, an x86 server with two sockets is projected that can be configured with 8 to 20 processors. The high-performance calculation can therefore

[Figure 6.1.: Example application: Tasks and interaction — the Operator GUI, Manager Task, Compute Tasks and database (DB) run as normal and soft real-time tasks; Simulink-generated code (.c, compiled to .so) configures three hard real-time PLCs (Valve, Pressure, Temperature).]

be implemented with OpenMP2 and use several Central Processing Units (CPUs) as well as the soft real-time scheduling methods provided by Linux. It would be possible to install an accelerator card (e. g. an Intel Xeon Phi3 or a graphics card for general-purpose processing4) and delegate some work. The Manager Task interacts with the GUI using the functions of the framework (e. g. Qt). It starts and interacts with the calculation threads and writes records to the database using a network connection. Instead of implementing the PLCs on external embedded systems with Micro-Controllers (µCs) connected by a field bus, they are realized with the isolated task concept on three dedicated CPUs. Each PLC executes a time-triggered loop every 10 µs and calls its functions periodically. These functions include receiving updated control parameters, reading sensor values, calculating the control function, setting actuator values and sending the current state to the Management Task. The Input/Output (I/O) of physical signals is provided by peripheral Data Acquisition (DAQ) devices installed as PCI Express (PCIe) expansion cards with digital General-Purpose I/O (GPIO), Analog to Digital Converter (ADC) and Digital to Analog Converter (DAC) interfaces. The control function is provided by the Management Task during the initialization and not changed during real-time operation. The following Sections present how the required means of communication, management and device interaction can be realized under the constraints of the com-

2Online: http://openmp.org (visited November 12, 2015)
3Online: http://www.intel.com/content/www/us/en/processors/xeon/xeon-phi-coprocessor-overview.html (visited November 12, 2015)
4Online: http://gpgpu.org (visited November 12, 2015)

pletely isolated task concept. Additionally, a mechanism to allow certain interrupts without losing control to the OS is presented, allowing the use of the event-triggered paradigm and even the embedding of a Real-Time Operating System (RTOS) providing preemptive scheduling into an isolated task.

6.1. Shared-Memory Communication

Isolated tasks can be threads (e. g. Pthreads [NBP96]) or forked [Roc08] processes. Threads have the advantage of a shared address space enabling the common use of global variables. But this also implies the risk of mis-implementing synchronization and generating data races. With distinct processes, the separated address spaces protect the tasks from accidentally corrupting another's data. Processes can still establish shared variables using shared memory segments (e. g. System V shared memory, Section 2.2.3 on page 47). On most architectures, shared memory is a hardware feature implemented in the paging system. At initialization, allocation or mapping, the OS must be consulted, but afterwards, the hardware manages the access to shared memory and the coherence of the memory view. On the x86 architecture, multiple processes can point their page table entries to the same physical page frames [Intel13c, Chap. 4; AMD12b, Chap. 5]. According to the consistency model (Section 2.1.9), data stores of one process or CPU to an address will be immediately visible to the others that share this variable. The access to shared resources must be protected by mutual exclusion or special serializing algorithms. This can be enforced by using only common functions to access those shared variables (e. g. Message-Passing or the Monitor concept). Generally, mutual exclusion can be realized with OS methods (e. g. Mutex, Semaphore) that usually block a task until the resource becomes free, or with user-mode implementations waiting actively (e. g. spinlock). Since system calls must not be used by isolated tasks, the standard library mechanisms like the PThread-Mutex (on Linux realized with futexes [Dre11; FRK02], which may use a system call5), pipes and message queues [Ker10] must be replaced with other functions. Spinlocks have the advantage of a very low overhead and latency, but they inherently carry a risk of starvation and deadlocks [Tan07, Chap. 6].
Completely isolated tasks are executed bare-metal, and the rules of OS development apply. For all shared resources, the hardware-induced latency of accesses from different processors must be regarded. If a variable is written by a processor, other processors having the same line in their cache must invalidate it and reload its content from the level below when they access this data the next time. On the x86 architecture, the cache coherence is managed by the hardware; this is transparent to

5Züpke presented an alternative implementation which is claimed to be capable of hard real-time [Züp13], but his definition of hard real-time is different (weaker). This approach still uses a system call in the contended case.

software. The common level can be a shared Last-Level Cache or the main memory. The latency between storing on one processor and another CPU detecting the change includes the write-back to the shared level and the loading into the other cache. This effect will be analyzed in detail in Chapter 7. A similar effect in this context is false sharing (Section 2.1.9). Therefore, when implementing all following algorithms, variables written by distinct processors should be placed in different cache lines. This can be realized by including padding (unused variables) or by allocating aligned buffers, e. g. with posix_memalign(). Further important details to watch for while implementing synchronization, and generally any access to shared variables, follow from compiler optimization. In the programming language C, the volatile modifier should be used for variables that are read repeatedly in a loop while another processor may change them. Without this modifier, the compiler may optimize the memory access and only use a register, which will never mirror the subsequently changed state of the originating memory location. A similar effect can be achieved with memory and compiler barriers or intrinsic atomic functions (Section 2.1.9, also explained in volatile-considered-harmful.txt6). Much research has been published on multi-processor real-time systems and RTOSs, proposing a wide range of algorithms for real-time capable synchronization. Generally, all methods can be applied to isolated tasks if they use only hardware-supported instructions like Compare-and-Swap (CAS) or Fetch-and-Add (FAD) (Section 2.2.1) and if they consider multi-processor systems. Of special interest for hard real-time systems are non-blocking [MS96b], lock-free [FLO02; LS04] and wait-free [Her91] algorithms.
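The cache-line separation described above can be sketched as follows. This is an illustrative layout, not taken from the thesis: the structure and field names are hypothetical, and a 64-byte cache line is assumed (typical for current x86 processors).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch: two flags written by different processors are
 * padded onto separate cache lines (64 bytes assumed) so that a store
 * by one CPU does not invalidate the line holding the other flag
 * (false sharing). Names are hypothetical. */
#define CACHE_LINE 64

struct comm_area {
    volatile uint64_t writer_flag;              /* written by one CPU */
    char pad1[CACHE_LINE - sizeof(uint64_t)];   /* fill up this line */
    volatile uint64_t reader_flag;              /* written by the other CPU */
    char pad2[CACHE_LINE - sizeof(uint64_t)];
} __attribute__((aligned(CACHE_LINE)));
```

A dynamically allocated comm_area can likewise be placed on a cache-line boundary with posix_memalign(), as mentioned above.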
Because of its emphasis on low overhead and predictable latency, some research from the area of HPC is also applicable [MS91; Dak09] if it is evaluated for the special (hard) real-time demands. However, the verification of the correctness of complex lock- and wait-free algorithms is more involved than that of mutual exclusion [Des13b]. Therefore, the communication functions should be encapsulated in a library that can be verified independently and reused. Complex non-blocking data structures can be constructed as needed, e. g. for different numbers of readers and writers and with different progress guarantees (e. g. a FIFO queue with single-producer wait-free enqueue and multi-consumer lock-free dequeue) [AlB13, p. 56]. Brandenburg et al. compare blocking to non-blocking algorithms. Generally, the non-blocking kind can be implemented in user-mode based on atomic operations, and the non-blocking variants are also rated better for hard real-time applications [Bra+08]. This is demonstrated by Pohlack et al. in a simple, low-overhead implementation that is advertised as being easy to integrate into existing projects, possibly based on a wide variety of environments [PAH04]. Their concept was later implemented in a framework to monitor real-time behavior without unpredictably disturbing the timing [PDL06]. This monitoring system is “solely based on shared memory regions, not using any system calls” [PDL06, Conclusion]

6Documentation/volatile-considered-harmful.txt (Linux 3.12)


[Figure 6.2.: Simple flag operation — the writer increments the flag (+1); readers compare its current value against the expected value.]

which is similar to the shared memory Inter-Process Communication proposed here. Richter et al. present a framework for real-time communication on Linux-rt [RWK11]. Libraries for user-space synchronization based on atomic operations support non-blocking data structures. Recent examples are Userspace Read-Copy-Update7 and Concurrency Kit8, both intended for HPC but possibly applicable – at least in parts – to real-time programs. However, this Section presents how such synchronization and Inter-Process Communication (IPC) mechanisms can be implemented to the needs of the application. In the following, the synchronization of conditions with signals [BW01, Sect 8.1], the protection of shared resources with mutual exclusion [Liu00, Chap. 8] and some message-based IPC objects [BW01, Chap. 9] are considered.

6.1.1. Signals

The signaling between isolated tasks and with tasks in the system partition can easily be realized with flags [Sta09, Table 5.3]. This concept is also called a condition variable [Tan07, Sect. 2.3; Dic13, Sect. 5.2], providing wait() and signal() methods. Here, a flag is an aligned shared memory variable of 4 or 8 bytes with a defined meaning of either its bits or its integer value. The writer sets a bit or stores a new value, and the reader can test if the writing has already happened (Fig. 6.2). This procedure can be applied one-way with a single writer and one or multiple readers that wait for certain signals from the writer. Alternatively, a two-way usage allows a handshake protocol between two partners if they alternately read and write the same shared variable. Essentially, it must always be clearly defined who is allowed to write the flag while the other waits, before the roles change. Since aligned store operations on the x86 architecture are atomic, no further synchronization is required. In both the one-way and the handshake case, all members of the operation need a local copy of the shared variable. The writer increments its local copy and writes the new value to the shared variable to start the next epoch. The readers compare their local copy with the shared variable to detect the change to the expected value.
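The epoch scheme just described can be sketched as follows (an illustration under the stated x86 assumption that aligned stores are atomic; the function names are hypothetical, not the thesis implementation):

```c
#include <assert.h>
#include <stdint.h>

/* Flag-based signaling sketch: the flag lives in shared memory, and
 * each participant keeps a local copy of the epoch counter. */
static void flag_signal(volatile uint64_t *flag, uint64_t *local)
{
    *local += 1;       /* start the next epoch */
    *flag = *local;    /* plain aligned store, atomic on x86 */
}

static void flag_wait(volatile const uint64_t *flag, uint64_t *local)
{
    *local += 1;       /* the epoch we wait for */
    while (*flag < *local)
        ;              /* busy wait; the writer may already be ahead */
}
```

The comparison uses < rather than != so that a reader that missed intermediate epochs returns immediately instead of waiting forever.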

7Online: http://lttng.org/urcu/ (visited November 12, 2015)
8Online: http://concurrencykit.org (visited November 12, 2015)


[Figure 6.3.: Flag used for a handshake operation — after init(0), Task A and Task B alternately signal(n) and wait(n) for successive epochs.]

The readers must increment their local copy between two waiting operations to wait for the next epoch, because the writer may already have written the value or they may even have skipped epochs. In a handshake algorithm (Fig. 6.3), two members alternately signal the start of an epoch and wait for the next epoch. In the example, Task A signals the odd epoch numbers and waits for the even numbered ones. When Task B waits for epoch 3, that epoch has already begun and Task B can immediately proceed with its work. This demonstrates that a task can still execute in the notion of a previous epoch even if the other task has already begun the next. With two participants signaling alternately, they can omit the local copy of the expected value and wait for an even or odd flag value. This procedure can be extended to multiple members passing a virtual token in a circle. If all members wait for all others to complete the current epoch before jointly proceeding, a Barrier can be realized. For hard real-time tasks, the blocking wait operation can be dangerous to the timing because the partner has the power to jam the other task. If one task has a lower priority or is implemented with only soft real-time constraints, the duration of blocking is unpredictable (i. e. priority inversion). In such combinations, a test or try function can be useful that does not wait actively (like a spinlock), but only returns whether the requested epoch was already signaled. With this sort of function, the programmer can decide how to proceed if the waiting condition is not met yet. Allowing only busy waiting on spinlocks instead of suspending a task wastes CPU cycles and reduces the productive work that can be done on that CPU. Suspending, however, is not possible in the bare-metal execution of the isolated task concept. Still, the advantage of busy waiting is the extremely low latency to notice a change of the flag.
On current x86 processors, an empty loop that only compares a flag with a constant value can be executed in 20 to 40 cycles. To that, the Last-Level Cache or memory latency must be added, depending on the level at which the memory is shared between the respective processors. With a common Last-Level Cache, two processors can exchange

signals with a latency of 42 to 125 cycles. In the absence of a shared Last-Level Cache, the latency increases to 208 to 250 cycles9. In the example application presented above, the flag can be used to notify isolated tasks that they need to read new control parameters. The handshake mechanism is useful to manage the isolated tasks by instructing them to initialize, start and stop their work.
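A non-blocking try variant of the waiting operation, as suggested above, could look like this (a sketch under the epoch scheme of this Section; the name and signature are assumptions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constant-time test whether the next epoch has been signaled.
 * Returns true (and advances the local epoch) if the caller may
 * proceed, false if the condition is not met yet. No busy waiting. */
static bool flag_trywait(volatile const uint64_t *flag, uint64_t *local)
{
    if (*flag >= *local + 1) {
        *local += 1;
        return true;
    }
    return false;   /* caller can do other useful work and retry later */
}
```

Because the function contains no loop, it always returns after a constant number of instructions, which simplifies the timing analysis of the calling hard real-time task.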

6.1.2. Mutual Exclusion

Critical sections are parts of the code that access shared resources and that must be protected from concurrent access (Section 2.2.1). This can be a conditional update of a variable that must not be interrupted by another task changing this value, or the update of multiple entries of a record that other tasks must not read with only some values changed (i. e. in an inconsistent state). On multi-processor systems, those sections cannot be protected by blocking interrupts on the current CPU because tasks on other CPUs execute concurrently. A straight-forward solution is to use mutual exclusion and enter a critical section only if no other task is currently executing in this section. If blocked, the task must wait until the section becomes free. After leaving the section, the Mutex is notified so that a waiting task can proceed into the critical section. Instead of using the methods provided by the OS, a task can implement actively waiting variants of Mutexes, Semaphores and more complex synchronization means on top of flags or by directly using atomic instructions (Section 2.1.9) on shared variables. In any case, the applicability of synchronization primitives based on shared variables must be evaluated for hard real-time demands [BW01, Chap. 8] because they imply a risk of priority inversion (Section 2.3.4) and deadlock (Section 2.2.1). In the following Section, complex synchronization objects like message queues will be presented that are easier to use and do not entail these risks. A Semaphore as described by Dijkstra [Dij68] provides two operations, P and V. They use an unsigned integer value initialized to the number of free resources or the number of participants that are allowed to enter the critical section at the same time.
The P operation (prolaag, a Dutch portmanteau of probeer te verlagen: try to decrease [Dij63]) probes whether the calling task is allowed to advance into the critical section and returns only when the condition is met. Implementations often call similar functions down() or wait(). On leaving the critical section, the V operation (Dutch verhoog: increase [Dij63]) increases the Semaphore to indicate that a resource became free. This operation is also called up() or signal(). The binary Semaphore with the initial value of 1 is often called a Mutex and used with lock() and unlock() operations. A Mutex implemented with busy waiting is called a spinlock.

9Results taken on test system haixagon with isolated tasks on CPUs #6 and #7 with a shared Third Level Cache (L3$) and on CPUs #5 and #6 using the main memory.

unsigned int sem = 1;          /* initialize as 1 (free) */

void P(unsigned int *sem)      /* "Prolaag" ~ try to reduce */
{
    while (1) {
        ATOMIC {
            if (*sem > 0) {
                (*sem)--;
                break;
            }
        }
    }
}

void V(unsigned int *sem)      /* "Verhoog" = increase */
{
    (*sem)++;
}

Listing 6.1: Semaphore implementation using pseudo atomic functions.

The typical implementation of a Semaphore is shown in Listing 6.1. The section marked ATOMIC cannot be implemented in a high-level language like C10 but must use an atomic assembly instruction like CAS or FAD (Section 2.2.1 on page 44). This universal Semaphore implementation cannot easily be realized with the atomic instructions provided by the x86 architecture. For the binary Semaphore, which is initialized with 1 and can otherwise only be 0, an implementation with the atomic Exchange (XCHG) instruction is shown in Listing 6.2. The example code uses GCC intrinsic functions instead of embedding assembly code. All these implementations wait actively on spinlocks. Since suspending is not possible, the blocking lock() operation can be replaced with a trylock() operation that immediately returns a boolean value indicating whether the Semaphore was obtained or was not free. This kind of function is beneficial for the timing analysis because it returns after a constant execution time. The programmer can decide to execute other work before retrying to obtain the Semaphore. In case of a positive result of the trylock() operation, the Semaphore must be released with the unlock() operation after leaving the critical section. When handling multiple locks, the risk of a deadlock is present (Section 2.2.1). Deadlock detection and subsequent recovery is in most cases not productive in real-time systems. The occurrence of deadlocks can either be avoided by only granting access to a Semaphore if the situation is safe in every possible following path [Tan07, Sect. 6.5], or the system can be designed to prevent deadlocks by breaking at least one of the four required conditions [Tan07, Sect. 6.6]. Blocking implementations

10Although most compilers support intrinsic functions that are implemented with atomic instructions, e. g. __atomic_compare_exchange() in GCC.

unsigned int mutex = 1;        /* initialize as free */

void lock(unsigned int *mutex)
{
    while (1) {
        unsigned int local = 0;
        local = __sync_lock_test_and_set(mutex, local);
        /* LOCK XCHG mutex, local: write 0 (locked), return old value */
        if (local == 1) { break; }    /* if it was 1 (free) */
    }
}

void unlock(unsigned int *mutex)
{
    __sync_add_and_fetch(mutex, 1);   /* LOCK INC mutex: atomic increment */
}

Listing 6.2: Binary Semaphore (Mutex) implementation with GCC intrinsic functions.

can handle the priority inversion problem by temporarily boosting the priority of a task holding a Mutex, but spinlocks cannot influence OS priorities. For hard real-time tasks, it is generally preferable to use Mutexes and Semaphores only if all accessing tasks are implemented for hard real-time.
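The trylock() variant discussed above can be sketched with the same GCC builtins as Listing 6.2 (an illustration, not the thesis code; as above, 1 means free and 0 means locked):

```c
#include <assert.h>
#include <stdbool.h>

/* Non-blocking trylock matching the semantics of Listing 6.2:
 * atomically exchange 0 (locked) into the Mutex; the old value tells
 * whether it was free. Constant execution time, no spinning. */
static bool trylock(unsigned int *mutex)
{
    return __sync_lock_test_and_set(mutex, 0) == 1;
}

static void unlock(unsigned int *mutex)
{
    __sync_add_and_fetch(mutex, 1);   /* back to 1 (free) */
}
```

On failure the caller regains control immediately and can schedule other work before retrying, which keeps the worst-case path through the real-time loop bounded.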

6.1.3. Complex IPC objects

The synchronization objects presented in the previous Section are required to protect critical sections. However, these parts of the code must be identified and handled accordingly. The Monitor concept [BW01, Sect. 8.6; Tan07, Sect. 2.3.7] abstracts shared resources and protects the access implicitly. This can be realized in object-oriented programming with private data members. The access is handled by public methods that protect critical sections where needed. Consequently, the user (i. e. the programmer of a real-time application) does not need to analyze side effects, and the timing evaluation becomes easier. Monitors are a common concept in real-time applications and can be implemented for isolated tasks based on spinlocks. However, the accessing functions of Monitors are not generally lock-free if they implicitly use spinlocks, and can thus be subject to priority inversion. An alternative concept with implicit synchronization is message-based communication [BW01, Chap. 9]. The basic example is a message queue that supports sending values between tasks. The send() and receive() functions can be implemented wait-free, i. e. they always return within a limited time.
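The Monitor-style encapsulation described above can be sketched as follows: the shared resource is private to one translation unit and only reachable through accessor functions that take a spinlock internally. This is an illustration using the __sync builtins of this Section; all names are hypothetical.

```c
#include <assert.h>

/* Monitor sketch: the protected resource and its lock are private,
 * so callers cannot bypass the accessors. Lock convention as in
 * Listing 6.2: 1 = free, 0 = locked. */
static unsigned int mon_lock = 1;
static long counter;                 /* the protected shared resource */

static void spin_lock(unsigned int *lock)
{
    while (__sync_lock_test_and_set(lock, 0) != 1)
        ;                            /* spin until it was free */
}

static void spin_unlock(unsigned int *lock)
{
    __sync_add_and_fetch(lock, 1);
}

void monitor_add(long delta)         /* public accessor */
{
    spin_lock(&mon_lock);
    counter += delta;                /* critical section */
    spin_unlock(&mon_lock);
}

long monitor_get(void)               /* public accessor */
{
    spin_lock(&mon_lock);
    long v = counter;
    spin_unlock(&mon_lock);
    return v;
}
```

Because the spinlock is hidden inside the accessors, the timing analysis only needs to bound the (short, constant) critical sections inside this one module.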


A message queue can be implemented using a ring buffer of constant size. This avoids the allocation of memory for new elements, which usually requires an OS function (system call) not allowed in the isolated task concept. The disadvantage of a ring buffer is the limited number of elements, so the buffer can run full and a strategy of either discarding further elements or overwriting the oldest must be designed. Depending on the application, different properties and guarantees can be realized.

unsigned int next_read = 0, next_write = 0;   /* empty */
int buffer[SLOTS];                            /* only for int elements */

void send(int value)
{
    if (((next_write + 1) % SLOTS) == next_read) {
        return;                   /* ERROR: ring buffer full! */
    }
    buffer[next_write] = value;
    __sync_synchronize();         /* Memory barrier */
    next_write = (next_write + 1) % SLOTS;
}

int receive()
{
    int result;
    if (next_read == next_write) {
        return -1;                /* ERROR: ring buffer empty! */
    }
    result = buffer[next_read];
    __sync_synchronize();         /* Memory barrier */
    next_read = (next_read + 1) % SLOTS;
    return result;
}

Listing 6.3: Simplified wait-free message queue implementation [Lam77].

A slightly simplified implementation of a wait-free message queue for a single sender and a single receiver, both being hard real-time tasks, is shown in Listing 6.3. In a dynamic initialization, one of the tasks must allocate the shared resources and the other one needs to connect to them. In both the send() and the receive() function, the respective pointer next_write or next_read is first checked for a full or empty buffer (the error notification is skipped in the example). Then, the element of the ring buffer is accessed and finally, the pointer is advanced to the next element. Since each pointer is only modified by one of the two functions and the next to-be-read and to-be-written elements are assigned to either the reader or the writer, this is not a critical

section and therefore does not need locks. Further, the functions do not contain any loops for waiting or retrying; therefore, they always return within a limited time. As mentioned before, the cases of a full buffer while sending or an empty buffer while reading must be handled by a protocol depending on the application. In the example of a producer/consumer scheme, the consumer could just wait if the buffer does not contain any elements to read and immediately proceed when a new message arrives. Otherwise, the producer must decide if it can wait on a full buffer or if this would violate its real-time conditions, so that it should rather drop or overwrite elements. An example of a consumer-centered setting is a DVD burner, where the producer must not let the buffer run empty but can wait if it is full. In contrast, if the producer controls a plant and sends the current state to a queue, it cannot wait on a full buffer. If the transmitted elements are large messages, zero-copy protocols can be implemented [HP11, App. F.2; YAW05]. This can be accomplished by separating the allocation of a ring buffer element from the increment of the pointer that signals the new element to the reader. With this procedure, the producer announces a new element to the message queue and gets a pointer in return where it can store the element directly in the ring buffer while creating it. Once the element is complete, the send operation is committed to make the new element accessible to the consumer. Likewise, the consumer asks for a new element and gets a pointer to the ring buffer location. It can read and process the element and, when finished, free it to make the element reusable by the producer. The example code in Listing 6.3 contains a call to the GCC intrinsic function __sync_synchronize(), a memory barrier to avoid consistency problems (Section 2.1.9). Current processors of the x86 architecture use a slightly weaker memory consistency [Mos93; HP07, Sect.
4.6] than sequential consistency [OSS09]. On a single processor, a following instruction sees all modifications done by previously executed instructions (sequential consistency) [Intel13c, Sect. 8.2]. For multi-processor systems, this is slightly trickier and documented by the processor manufacturers. The C programming language defines in its latest standard (C11) a weak consistency model for threads [C11, Sect. 5.1.2.4] and the library [C11, p. 7.17.3]. This allows the optimizing compiler to reorder operations that are not inter-dependent in the same thread [Bat+11]. In the example, the increment of the next pointer and the access to the referenced buffer element appear independent to the compiler, which misses the semantic dependency. Thus, the write of the pointer to memory could happen before the buffer element is accessed (for instance if the pointer was held in a register). In that case, the other task would compare against an updated pointer while the value is not yet stored in the ring buffer. To prohibit the compiler from moving memory stores across this line, the memory barrier is placed between those two lines [VN13]. Further, the out-of-order x86 architecture must be instructed not to reorder these instructions, which is done with a memory barrier (mfence). How this function is realized depends on the compiler, e. g.


GCC provides the intrinsic function __sync_synchronize(), and the C11 standard introduced the _Atomic qualifier [C11, Sect. 6.2.5]. The MCAPI (Section 2.2.3) is targeted at embedded systems but does not consider real-time systems. The literature describes many algorithms for special settings like multiple non-real-time senders and a single hard real-time receiver. Brandenburg et al. provide an overview and evaluation of various lock- and wait-free Message-Passing (MP) algorithms [Bra+08]. They consider the lock-free queue of Michael and Scott [MS96b], lock-free buffers of Tsigas and Zhang [TZ99], wait-free buffers of Anderson and Holman [AH00], and wait-free universal constructions of Anderson and Moir [AM95]. Brandenburg and Anderson further analyze the schedulability of reader-writer synchronization for real-time systems [BA09b]. Engel and Völp evaluate the application of lock-free algorithms from HPC (using the MCS lock [Kri+93]) in real-time systems [EV11]. Other related work combines shared memory synchronization with multi-processor real-time scheduling [CRJ07], applies non-blocking synchronization to a robotic system [SSL09], or advances the formal verification of hard real-time systems [Ger+12]. The application of IPC for isolated tasks was extensively analyzed by Raschen [Ras11]. The x86 architecture provides a large number of atomic instructions (Section 2.1.9 on page 38) that allow applying a wide range of algorithms presented in the literature. Some architectures like ARM only provide the Load-Linked/Store-Conditional (LL/SC) instructions to build atomic operations. This type of implementation always requires a retry loop in case the conditional store fails. Therefore, on architectures only providing LL/SC, most wait-free algorithms cannot be implemented [Ras11]. Alt implemented a demonstration application using the isolated task concept [Alt13].
To transfer video frames from a soft real-time image processing task (producer) to a hard real-time transmitter task (consumer), he implemented a triple buffer. The switching of buffers is implemented wait-free with atomic instructions. The producer always finds a free, unused buffer. In the next iteration, the consumer switches to that buffer if a new element is available. Raschen describes a sensor-buffer object with wait-free writing of new sensor values while the reader has no timing limits (and can thus be a normal process) [Ras11]. A different approach to eliding locks is to wait until no more users hold a reference to an object [McK13b], e. g. by using reference counters or Read-Copy-Update (RCU) (Section 2.2.5 on page 51). The latter uses non-blocking entry and exit functions for critical sections to register threads using a data structure and subsequently estimates pessimistically when that data structure is no longer used. This is suitable and optimized for mostly-read data structures where writing can wait or use callbacks to complete pending actions. RCU is implemented with various optimizations in the Linux kernel, and a user-mode implementation is available [MDL13]. The variants that work without system calls are generally suited to be

used in completely isolated tasks. The grace period could be determined based on the lowest control loop frequency of all isolated tasks.
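As an illustration of the wait-free buffer switching mentioned above, a triple buffer for a single producer and a single consumer might be sketched as follows. This is a hypothetical reconstruction of the idea, not the implementation from [Alt13]; all names and the FRESH-bit encoding are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Triple-buffer sketch: three slots are permuted between the producer
 * (back), a shared middle slot and the consumer (front) with a single
 * atomic exchange each. The FRESH bit marks a new, complete frame. */
#define FRESH 0x4u

static int frame[3];                 /* stand-in for three video frames */
static unsigned int middle = 1;      /* shared slot, initially not fresh */
static unsigned int back   = 0;      /* producer's private slot */
static unsigned int front  = 2;      /* consumer's private slot */

static void producer_publish(int value)
{
    frame[back] = value;             /* fill the private back buffer */
    back = __sync_lock_test_and_set(&middle, back | FRESH) & 3u;
}

static bool consumer_fetch(int *value)
{
    if (!(middle & FRESH))
        return false;                /* no new frame available */
    front = __sync_lock_test_and_set(&middle, front) & 3u;
    *value = frame[front];
    return true;
}
```

Both operations execute a bounded number of instructions and exactly one atomic exchange, so they are wait-free: the producer always obtains a slot that neither the consumer nor the middle slot currently references, and the consumer only ever sees the most recent complete frame.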

6.1.4. Dynamic Memory Allocation

The requirement to allocate all needed memory before the application enters real-time operation follows from the rule not to use system calls in isolated tasks. The allocation functions provided by the standard library (e. g. malloc(), new) are not real-time capable. But it is possible to implement a user-mode allocator that provides hard real-time guarantees as long as it does not run out of space. This allocator must reserve its memory pool at initialization time. An example is the Two-Level Segregated Fit (TLSF) memory allocator [Mas+04] that allows memory allocation with a guaranteed Worst-Case Execution Time (WCET) [Mas+08]. With dynamic memory allocation, linked lists protected by locks, RCU or Hazard Pointers [Mic04] can be applied. This enables the implementation of advanced synchronization and communication objects that are safe to use with the isolated task concept and to synchronize between hard real-time tasks and processes in the system partition.
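The principle of reserving the pool at initialization can be illustrated with a much simpler fixed-size block allocator (a sketch only; it is far less general than TLSF, which manages variable block sizes with a guaranteed WCET, and all names and sizes here are assumptions):

```c
#include <assert.h>
#include <stddef.h>

/* Fixed-size block pool: all memory is reserved statically; alloc and
 * free then run in constant time without any system call. */
#define BLOCK_SIZE 64
#define NUM_BLOCKS 32

static unsigned char pool[NUM_BLOCKS][BLOCK_SIZE];
static void *free_stack[NUM_BLOCKS];   /* stack of free blocks */
static int top = -1;

void pool_init(void)                   /* run before real-time operation */
{
    for (int i = 0; i < NUM_BLOCKS; i++)
        free_stack[++top] = pool[i];
}

void *pool_alloc(void)                 /* O(1); NULL when exhausted */
{
    return (top >= 0) ? free_stack[top--] : NULL;
}

void pool_free(void *p)                /* O(1) */
{
    free_stack[++top] = p;
}
```

Running out of blocks is signaled by a NULL return instead of a blocking call, so the real-time task keeps a constant worst-case path and can handle exhaustion in its own error protocol.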

6.2. Management

The setup of the system (Section 4.2) and the start and initialization of the isolated tasks (Section 4.3) should be managed by a dedicated task of the real-time application. Due to unresolved problems with moving tasks to other CPU-Sets while an isolation is active in the system, all isolated tasks must be started, initialized, and isolated in lockstep. Similarly, they must be shut down in a synchronized fashion. Further duties of a manager task can be the monitoring of isolated tasks and the handling of errors. The following presents how the test and benchmark applications use flags and message queues for synchronization and communication with the isolated tasks.

6.2.1. Control

Generally, isolated tasks can be implemented as threads of a main application or as processes with their own address space. In the latter case, the communication can be based on shared memory segments allocated and connected to during the initialization phase. In the following, an approach to managing threads will be described without loss of generality. The management task is a thread that continuously interacts with the isolated tasks. If the application requires a high throughput of data with multiple isolated tasks, the concept could even be extended to dedicated partner tasks with soft


State Signaled by Description init (Manager) Initialization value thr_is_running Isol. task Was first executed by theOS ma_init Manager Thread should start initialization thr_inited Isol. task Thread finished initialization ma_start_isol Manager Thread should activate isolation thr_isolated Isol. task Thread activated isolation ma_run Manager Thread should enter real-time loop ma_shutdown Manager Thread should exit real-time loop thr_left Isol. task Thread has left the loop ma_stop_isol Manager Thread should deactivate isolation thr_unisolated Isol. task Thread deactivated isolation ma_emergency Manager Emergency exit (skip prev. steps) thr_end Isol. task Thread has shut down and will exit

Table 6.1.: Management states as implemented in the benchmark application. real-time behavior. Alternatively, the management could be provided by start and shutdown functions that are called from appropriate methods of the application. The example application defines multiple states of operation. In the initialization state, the system is set up and the isolated tasks are started. If all threads are prepared, the isolation state successively activates the complete isolation for each of them. The threads wait in the stand-by state for a flag signal to start the real-time work. When the experiment is finished or interrupted by a user, the shutdown state handles the reverting of the isolations. Depending on the needs, the threads can be kept running and move back to the initialized state to start a new experiment with different parameters. If a different number of isolated tasks is required for each experiment, the threads should terminate and the manager task will start new ones. The switch between the states is managed by a dedicated flag for each isolated task. This way, the manager can follow a handshake protocol with each task. The following protocol with the states listed in Table 6.1 is applied by the benchmark application used for the evaluation in Chapter7. The state names indicate who signals each state while the other waits. The manager task maintains a field of structures holding the data for each task such as the Pthread handle, the task ID (required for CPU-Set handling), the state flag, communication objects, and monitoring variables. This data structure should be carefully designed to separate elements written by the manager and the isolated task to avoid false sharing. To start the isolated tasks, the data structures are initialized in a loop. Then, each thread is created and signals when it is running. After all threads are started, the manager waits in another loop for each of them signaling that they are running.
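A per-task management structure that separates the writer roles onto distinct cache lines, as suggested above, could look like the following sketch. The field names and the 64-Byte cache-line size are illustrative assumptions, not taken from the benchmark implementation.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

#define CACHE_LINE 64  /* assumed cache-line size, common on x86 */

/* Hypothetical per-task control block: fields written by the manager
 * and fields written by the isolated task start on separate cache
 * lines, so updates of one side never invalidate the other side's
 * line (no false sharing). */
struct task_ctl {
    /* written by the manager */
    _Alignas(CACHE_LINE) volatile int state;  /* handshake state flag */
    pthread_t thread;                         /* Pthread handle       */
    pid_t tid;                                /* for CPU-Set handling */
    /* written by the isolated task */
    _Alignas(CACHE_LINE) volatile uint64_t heartbeat;
    volatile uint64_t min_cycles, max_cycles; /* monitoring variables */
};
```

The `_Alignas(CACHE_LINE)` on the second group forces a new cache line at that field, which can be verified with `offsetof`.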


Each thread is then moved into its dedicated CPU-Set. After all threads are moved, the manager signals each of them to start the initialization. This step is required because the initialization must be executed by the task while running on the target CPU in order to allocate buffers on the correct Non-Uniform Memory-Access (NUMA) node, as will be shown in Section 7.4. Since the initializations of the isolated tasks do not depend on each other, they can be executed concurrently. The manager then waits for all threads to signal that they have finished their initialization.

In the next step, the manager signals one task after the other to start its isolation and waits for the acknowledgment before proceeding to the next task. This careful synchronization is necessary because the Linux kernel modification requires that only one isolated task at a time execute the activation by writing to /sys/isol/cpu_mask. After activating the isolation, each task is in a stand-by state waiting for the signal to start the actual real-time work. This time can be used to run warm-up routines that load the working set of variables into the caches and train the branch predictor. This is best done by executing the same function that will be run in the real-time state. If that function has side-effects, it should be modified to not interact with others.

The manager then notifies all isolated tasks within a short timespan to enter their real-time state and start working. The tasks execute the real-time loop until they are signaled to exit. On a graceful shutdown, the manager changes the state flags of all isolated tasks in quick succession to the shutdown state. Before the isolated tasks may deactivate their isolation, the manager waits until all of them have signaled that they have left the loop. Then, one task after the other is instructed to deactivate its isolation, and the manager waits for each before proceeding to the next.
After this step, the threads free their resources, signal that they are done, and terminate. An emergency shutdown protocol is in place for the case that a task misses a deadline in the handshake protocol (e. g. does not leave the real-time state loop) or that an exception signal (e. g. Segmentation Fault or Divide-by-Zero) was received by the application. In this protocol, the manager signals this condition to all threads, and the threads move through the shutdown process without further synchronization. With the external controlling of the isolated CPUs described in Section 5.3, the manager also writes a zero bit mask into the patch control CPU mask to immediately deactivate all changes in the Kernel. If the threads have not cleared the Interrupt Flag (IF), their isolation can be deactivated from the outside, and in most cases a recovery is possible.
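The flag-based handshake can be sketched with one atomic state flag per task. The state names follow Table 6.1, while the helper functions are illustrative; isolated tasks busy-wait on the flag because they must not block in the kernel.

```c
#include <stdatomic.h>

/* Handshake states of Table 6.1 (illustrative encoding). */
enum hs_state {
    HS_INIT, HS_THR_IS_RUNNING, HS_MA_INIT, HS_THR_INITED,
    HS_MA_START_ISOL, HS_THR_ISOLATED, HS_MA_RUN, HS_MA_SHUTDOWN,
    HS_THR_LEFT, HS_MA_STOP_ISOL, HS_THR_UNISOLATED,
    HS_MA_EMERGENCY, HS_THR_END
};

/* The signaling side publishes the new state... */
static void signal_state(atomic_int *flag, enum hs_state s)
{
    atomic_store_explicit(flag, s, memory_order_release);
}

/* ...and the other side spins until it appears. Busy-waiting avoids
 * any system call on the isolated CPU. */
static void wait_for_state(const atomic_int *flag, enum hs_state s)
{
    while (atomic_load_explicit(flag, memory_order_acquire) != (int)s)
        ;
}
```

Release/acquire ordering makes all data written before `signal_state()` visible to the waiting side, which is what allows the flag to double as a cheap synchronization barrier over shared memory.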

6.2.2. Supervision

To monitor the correct functioning of the isolated tasks, each task measures the execution time of its main loop and increments an error counter when missing a deadline. The manager periodically checks these counters of all isolated tasks and reacts accordingly if the hard real-time condition is no longer met. Depending on the application, different reactions are advisable. In the physical experiment control application described above, the results could be rejected or the test series could be aborted. In more critical applications, a warning, a fail-over to other systems, or a transition to a safe state should be implemented.

If the isolated tasks frequently exchange data with their manager (e. g. to record the control values), the manager can detect whether the tasks are running. Otherwise, the tasks can increment a heartbeat counter that the manager observes. Depending on the frequency and volume of monitoring, the manager must be implemented as a soft real-time task, or even a dedicated manager for each isolated task may be required.

For a retrospective evaluation, it is advisable to collect at least the minimum and maximum execution times of the main loops of each isolated task. After deactivating the isolation, the manager reads those values from the tasks and displays statistics. This enables the developer to evaluate the timing and to check code changes for regressions. Advanced statistics can include a histogram like in the benchmark evaluation (Fig. 2.18). The drawback of any statistic is that it requires extra memory, which consumes First Level Cache (L1$) capacity that is more valuable for the real-time work. A histogram should be tailored to have only a minor impact on the jitter, and it should be deactivated in production.
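The described per-loop bookkeeping, heartbeat, deadline-miss counter, and minimum/maximum execution time, can be sketched as follows. The structure and function names are illustrative; the isolated task calls the update after every loop iteration, and the manager only reads the fields.

```c
#include <stdint.h>

/* Illustrative monitoring record, written by the isolated task only. */
struct loop_stats {
    uint64_t heartbeat;        /* incremented every iteration          */
    uint64_t deadline_misses;  /* iterations exceeding the deadline    */
    uint64_t min_cycles;       /* minimum loop execution time          */
    uint64_t max_cycles;       /* maximum loop execution time          */
};

static void stats_update(struct loop_stats *s, uint64_t cycles,
                         uint64_t deadline_cycles)
{
    s->heartbeat++;                        /* liveness for the manager */
    if (cycles > deadline_cycles)
        s->deadline_misses++;              /* hard RT condition violated */
    if (s->heartbeat == 1 || cycles < s->min_cycles)
        s->min_cycles = cycles;
    if (cycles > s->max_cycles)
        s->max_cycles = cycles;
}
```

Only a handful of scalar fields are kept, in line with the remark that larger statistics (e. g. histograms) cost L1$ capacity and should be optional.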

6.2.3. Adaptability

Depending on the application, the assignment of PLCs implemented as isolated tasks must be flexible. During the execution of a real-time operation, the control parameters can be updated by using shared variables or message queues. If the algorithms must be modified in a more fundamental way, this can generally be done at run-time or after interrupting the real-time operation. Since control algorithms are often designed with graphical tools such as Simulink or LabVIEW, their C/C++ export can be used to generate compiled object files that can be linked to the main application. The most fundamental change would be the modification of the application's source code and a rebuild of the program, which requires a termination and restart. A more flexible method is to implement the functions executed by isolated tasks in a shared library that can be loaded dynamically during the execution of the application. This could even be done during real-time execution by providing function pointers to the isolated tasks, but it carries the risk of a high latency when executing the code for the first time on the isolated CPU because neither the code nor the variables are in the cache. This can only be countered with a sufficiently pessimistic estimation of the WCET. For hard real-time tasks, it is better to change the fundamental algorithms only between real-time operations, with a warm-up phase before the next real-time phase is started.

An alternative to statically or dynamically linked libraries are scripting languages. Klotzbücher et al. evaluated the use of Lua for hard real-time systems [KSB10]. In this work, the unpredictable memory consumption of user-supplied routines is identified as a major challenge for the estimation of the execution time [KSB10, Chap. V]. This holds similarly true for every function that is provided by third parties, be it as an abstract model that is converted to C code, as a library, or as script code. A viable help would be a guiding tool that loads user-provided functions and analyzes them by executing them in a sandbox prior to starting the real-time operation. The average execution time is easy to measure; but for hard real-time systems, the WCET is required, which in a strict interpretation can not be measured. What the application can do automatically is only a best-effort test that estimates a bad-case execution time by executing the function under heavy system load with an environment (i. e. control parameters etc.) as close to the real operation as possible. On top of that, a safety margin should be added based on sufficient experience.

Since the isolated tasks are not allowed to execute system calls, this must either be prevented by linking the code with a reduced C library or be verified by statically or dynamically analyzing the code. A static analysis can be executed on object files with library and symbol checks using tools such as ldd and nm. This covers the whole file, independently of unused functions or code paths. A dynamic check executes the function under the control of a tracer such as strace that displays every system call. In any case, the running real-time application should monitor the isolated tasks for latency violations and provide the operator with detailed information about the latency distribution of all isolated tasks to help identify errors.
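The shared-library approach mentioned above can be sketched with the standard dlopen()/dlsym() interface. The library and symbol names are hypothetical; `RTLD_NOW` is chosen so that all symbols are resolved at load time, before any function pointer is handed to an isolated task.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical signature of a user-provided control step function. */
typedef void (*control_fn)(void);

/* Load a control function from a shared library at run-time.
 * Resolving everything immediately (RTLD_NOW) avoids lazy symbol
 * resolution, which would otherwise cause an unpredictable delay on
 * the first call inside the isolated task. */
static control_fn load_control(const char *lib, const char *sym)
{
    void *h = dlopen(lib, RTLD_NOW | RTLD_LOCAL);
    if (!h) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }
    return (control_fn)dlsym(h, sym);
}
```

Even with eager symbol resolution, the first execution on the isolated CPU still suffers cold caches, which is why the text recommends a warm-up phase between loading and the next real-time operation.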

6.2.4. Exceptions in Isolated Tasks

Above, the emergency exit protocol was presented. This protocol is also utilized if the monitoring explained in the previous Section detects a hung task or if the application receives an exception signal. Blocked tasks can occur if their implementation uses loops that become endless. Generally, the code of the isolated tasks, being hard real-time code, should be evaluated for its WCET. But not only during development, unforeseen conditions can occur that lead to an isolated task not responding to its manager. Since the manager can only interact with the isolated tasks via shared memory, which requires the receiver to read its messages, a stable solution for terminating blocked tasks should be implemented.

If the isolated tasks use the Interrupt Flag method to activate the complete isolation, this can hardly be reverted from the outside. A possible approach would be to send a Non-Maskable Inter-Processor Interrupt (NM-IPI), but this would have to be implemented in the Kernel. Using the Kernel patch with deactivated Local Advanced Programmable Interrupt Controller (localAPIC) timer, the isolated tasks do not need to clear the IF to be completely isolated from external interrupts. In this mode, the activation of the isolation is done by the patch and can be controlled from other CPUs. Thus, the manager is able to deactivate the isolation of non-responding tasks as part of the emergency exit protocol.

If an isolated task triggers a CPU exception, for example by accessing an invalid address or by dividing by zero, the exception is not blocked by the IF, but the Linux kernel interrupt handler is executed. While a page fault can sometimes be handled by the Kernel by swapping a page back into main memory, the fault can also be due to an out-of-bounds array access. The latter case is called a Segmentation Fault and is forwarded to the causing application as a UNIX signal. Other examples are the Bus Error, the Floating Point Exception that can be triggered by invalid arithmetic, and the Illegal Instruction error. Especially when loading functions provided by a user (using the concept of the previous Section 6.2.3), such errors are more common than in carefully verified real-time applications. Unhandled error signals cause the OS to terminate the application. Since the isolation concept executes tasks that can not simply be terminated by the OS, these signals must be handled to initiate an emergency shutdown and keep the system responsive.

6.3. Input and Output

A real-time task must react in a defined time to stimuli. In the previous Section, methods were presented to transmit messages to isolated tasks that can trigger such a reaction. But physical signals are still missing. In fact, most hard real-time requirements originate from the need to keep up with physical systems (e. g. in a Programmable Logic Controller). In such cases, an external signal from the physical world enters the computer, must be processed in real-time, and a physical output is required within a maximum time frame. These signals are usually detected by sensors that convert physical quantities to electrical current or voltage. These analog values can be converted to digital numbers by Analog-to-Digital Converters (ADCs). The output can be converted to voltage by Digital-to-Analog Converters (DACs) to drive actuators. Binary values (on/off) or bit masks can be read and written by General-Purpose I/O (GPIO) interfaces. They provide multiple pins that can be configured to sense input values or to drive an output. These devices are subsumed as Data Acquisition (DAQ) devices. Other functions provided by peripheral devices are high-precision clocks [Bül+04] or co-processors like Field-Programmable Gate Arrays (FPGAs) and Digital Signal Processors (DSPs).

Real-time capable hardware drivers for Input/Output (I/O) devices are a challenge. General-Purpose Operating Systems support many peripheral devices, but the drivers are not optimized for predictable timing. In contrast, Real-Time Operating Systems only support a limited set of devices. If real-time drivers or libraries are available from the device vendors, their Application Programming Interfaces (APIs) are not unified [Kis05]. Since the isolated tasks are not allowed to use system calls, they can not use the usual drivers provided by Linux to access I/O hardware. The Comedi project¹¹ provides user-mode drivers for a number of DAQ devices. However, it must be evaluated whether the applied methods avoid system calls during real-time operation. The Linux User I/O driver framework [AM12] supports writing drivers that are executed in user-mode. This is not primarily targeted at real-time, but at improving the throughput and latency of high-performance applications such as networks [Cor13d] and block device I/O [Cor13g], or at improving safety-critical applications [WDL11]. In this Section, alternative methods to circumvent these constraints by using direct hardware access, user-mode drivers, and even interrupts are presented. This restores the applicability of isolated tasks to the level of tasks in an embedded RTOS.

    Interface        Application
    I2C              Bus, mostly internal functional units
    RS-232, Parport  Point-to-point legacy ports, easy to program
    PCI and PCIe     Connectors, plug and play
    USB              Tree of devices, complex protocol requires driver stack
    Ethernet         Network, protocol stack requires driver

Table 6.2.: Common hardware extension interfaces of the x86 architecture.

6.3.1. Direct Hardware Access

Input/Output devices can be built into the system as a fixed component of the main board, or they can be plugged into an interface connector. A wide range of interfaces enables the x86 architecture to be extended very flexibly (Table 6.2). The x86 architecture generally supports two mechanisms to interact with devices: I/O ports and memory-mapped I/O. The I/O ports are accessed with IN and OUT instructions. Similar to the CLI instruction, they can be used in user-mode with the appropriate process privilege configured with iopl(3). Memory-mapped I/O uses configuration registers mapped into the physical address space. A process with root privileges can map this address region into its virtual address space with the mmap() function. After this assignment has been installed, an application can access the device with common load and store instructions (e. g. MOV) or pointer assignments in the C programming language. Neither method requires the OS after initialization.

The legacy serial (RS-232) and parallel (Centronics printer) interfaces are rarely available on current commodity systems. Some embedded systems still provide them because they are very easy to access on the hardware level. They use few I/O ports that are well documented. In contrast, the Universal Serial Bus (USB) port, which is common for the connection of peripheral devices such as keyboard, mouse, and printer, is very difficult to handle because of its complex protocol. The (commonly) internal interfaces Peripheral Component Interconnect (PCI) and PCI Express (PCIe) are best known for their connectors to plug in expansion cards like graphics accelerators or I/O extension adapters. The expansion cards are initialized by the firmware (BIOS/UEFI) or the OS and are assigned I/O ports, memory-mapped I/O regions, and interrupts. After the initialization, those devices can be accessed like built-in functional units. The newer PCIe interface supports very high throughput, which makes it suitable for graphics accelerators, Network Interface Cards (NICs), special HPC interconnects (e. g. InfiniBand), etc.

The complexity of programming such a device depends on its features. Expansion cards with serial and parallel ports extend the system with these legacy interfaces that are no longer available on new systems. They can be programmed like the built-in ones. For interfacing with physical experiments and machines, DAQ expansion cards provide GPIOs, ADCs, and DACs, possibly accompanied by an FPGA. Those devices are often programmed by writing output values to memory-mapped I/O registers and reading input values from there. More complex PCIe devices such as NICs require registered memory buffers to write received network packets to. Thus, it depends on the device how complex the realization is, but it is generally possible to implement direct hardware access in the user-mode application. It is advisable to leave the initialization up to the driver of the OS and to implement only the required interaction during real-time execution. Some vendors of I/O devices for real-time systems already provide user-mode libraries to access the device without system calls, independently of the OS.

¹¹ Online: http://www.comedi.org (visited November 12, 2015)
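The memory-mapped I/O variant described above can be set up as in the following sketch: during initialization (with root privileges), a PCI BAR is mapped once; afterwards, register accesses are plain loads and stores without any system call. The sysfs resource path and the register offset in the usage comment are examples, not tied to a specific device.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a PCI BAR exposed via sysfs into the process address space.
 * Returns a pointer to the register region, or NULL on failure. */
static volatile uint32_t *map_bar0(const char *sysfs_resource, size_t len)
{
    int fd = open(sysfs_resource, O_RDWR | O_SYNC); /* uncached access */
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after closing the fd */
    return (p == MAP_FAILED) ? NULL : (volatile uint32_t *)p;
}

/* usage (during initialization; path and offset are examples):
 *   volatile uint32_t *regs =
 *       map_bar0("/sys/bus/pci/devices/0000:03:00.0/resource0", 4096);
 *   uint32_t v = regs[0x10 / 4];   // later: read an input register
 *   regs[0x14 / 4] = v;            // ...or write an output register
 */
```

The `volatile` qualifier prevents the compiler from caching or reordering the register accesses; these are exactly the "common load and store instructions" mentioned above.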

6.3.2. Protocol Stack

Since the implementation of user-mode drivers for common I/O devices connected to PCI or PCIe is much simpler than that of the USB or Ethernet protocol stack, these adapters should be preferred. However, if a hard real-time capable network connection (e. g. Time-Triggered Ethernet [Kop08]) or a USB device is required, its protocol stack must also be implemented in user-mode to be usable in isolated tasks. Examples are a user-mode driver for Real-Time Ethernet [WB11], the IP network stack lwIP¹², and the Sandstorm web server using a user-mode network stack for performance reasons¹³. Practical experience is further presented in Section 7.6 and Chapter 8.

¹² Online: http://savannah.nongnu.org/projects/lwip/ (visited November 12, 2015)
¹³ Online: http://heise.de/-2063812 (visited November 12, 2015)


6.4. Interrupts

So far, the implementation of isolated tasks was only possible using the time-triggered paradigm of real-time programming. This is due to interrupt handlers always executing in kernel-mode, which includes OS code. Additionally, by excluding every interruption, the formal verification of the hard real-time task becomes very easy. However, in embedded systems using an RTOS, methods are known to schedule and validate event-triggered and preemptively scheduled tasks [BW01, Chap. 13]. In the setting of an unchanged Linux system, essentially every execution of Linux kernel code must be prevented because it is too complex to verify. Even short functions can have side-effects on the whole system. An important example is the low-level interrupt handler that checks the internal work queues for pending Kernel tasks before returning to the interrupted user-mode process. In other words, every timer or keyboard interrupt does not only handle its event, but possibly activates arbitrarily long-running maintenance functions.

The basic concept for isolated tasks cleared the Interrupt Flag (IF) to securely block all hardware and Inter-Processor Interrupts (IPIs) (Section 4.4). With the Kernel modification deactivating the IPIs and the localAPIC timer, clearing the IF is no longer required because all external triggers of interrupts are disabled (Section 5.3). Using this mode, a selective reactivation of certain interrupts becomes possible. In the following, a method is presented to handle these interrupts with a user-mode function without the execution of Linux kernel code. This enables using the event-triggered paradigm and even preemptive scheduling in an isolated task. Ultimately, the execution of an embedded RTOS on an isolated bare-metal CPU becomes feasible.

6.4.1. User-Mode Interrupts

In the Linux kernel, the usual control flow of an interrupt is as follows: the processor logic suspends the user-mode process, preserving the current context (i. e. Instruction Pointer (IP), Stack Pointer (SP), and Flags register) on the stack, and switches to kernel-mode where the interrupt entry function¹⁴ starts the handling. It saves the remaining registers and calls the actual handling function according to the interrupt vector. After the return from that function, the saved registers are restored and the handler returns to user-mode. The context switch and the restoring of IP, SP, and Flags register are managed by the hardware automatically. As explained before, the Kernel may decide to start pending work before returning to the interrupted user-mode process.

¹⁴ Linux file arch/x86/kernel/entry_{32,64}.S (Linux 3.12)


The goal is to execute only known code to handle an interrupt on the isolated CPU and to exclude all Linux code from execution, to be sure that no unexpected functions and delays are inserted. The x86 architecture enforces the handling of interrupts in kernel-mode [Intel13c, Sect. 6.12.1.1; AMD12b, Sect. 8.7.4]. Therefore, the handler must be inserted into the Kernel's address space. Consequently, a method was developed to execute the interrupt handling function in the user-mode context of the isolated task. Only a springboard function is inserted into the Kernel that immediately returns to user-mode after manipulating the stack to execute the user-mode handler. This method was implemented as a kernel module that can easily be loaded into the running Linux kernel and is portable to new versions of Linux. The module is configured by virtual sysfs files.

The detailed procedure is as follows: During setup, an isolated task bound to a CPU can install the user-mode interrupt springboard handler for a dedicated interrupt vector by writing the address of the user-mode handler function to the file /sys/irq/0x7A/usi¹⁵. This installs the springboard into the Interrupt Descriptor Table (IDT) and stores the user-mode address. The address of Linux's handler function is overwritten, but saved to undo the change later. The location of the IDT data structure can be obtained without support of Linux with the SIDT instruction. Since the user-mode address is only valid in the context of the initiating process, the Task ID (TID) is also stored. When the springboard handler is executed, it checks if the correct task is active and aborts otherwise. This information can be found on the Kernel stack because Linux stores the TID there for easy referencing. On its kernel-mode stack, the springboard finds the stored user-mode SP and the address of the next instruction of the interrupted task (IP). This IP is replaced with the registered address of the user-mode handler function.
The overwritten return address must not be lost, to be able to continue the interrupted task. Therefore, it is pushed onto the user-mode stack. Then, the springboard immediately returns to user-mode. Since the IP in the stored context was changed to the handler function, the hardware logic of the interrupt return starts executing that function instead of directly continuing the interrupted task. When the inserted handler returns, it reads a return address from its stack, and that is the address where the task was interrupted, which was moved from the kernel stack to the user stack. This mechanism inserts a function into the instruction stream. Of course, the handler must save the context and all registers. In the normal flow, this is done implicitly by the interrupt handling logic (for the context) and by the Kernel interrupt handler (for the registers); the user-mode handler, however, needs to push the Flags register and all used registers onto the stack itself and restore them before returning. In the experimental implementation, this is done by a manager function that in turn calls the actual handler provided by the programmer. By doing so, the handler can be a standard C function as illustrated in Listing 6.4.

¹⁵ Example for vector 0x7A (hexadecimal) = 122 (decimal)

    unsigned counter = 0;                  /* global variable */

    void my_handler(void)                  /* user-mode handler function */
    {
        unsigned char value;
        counter++;                         /* increment global counter */
        value = io_read(CHANNEL);          /* read value from ADC */
        msg_send(value);                   /* send to message queue */
    }

    int main()
    {
        unsigned char value;
        /* ... */
        umi_install(0x7A, my_handler);     /* install on vector 0x7A */
        isol_activate();                   /* activate isolation */
        while (1) {                        /* time-triggered loop */
            if (msg_recv(&value)) {        /* if message in queue... */
                process(value);            /* ... process value */
            }
            /* ... */
        }
        isol_deactivate();                 /* deactivate isolation */
        umi_uninstall(0x7A);               /* restore original int. handler */
        printf("handled %u user-mode interrupts\n", counter);
    }

Listing 6.4: Implementation and use example of a user-mode interrupt handler.

The install function internally stores the address of the handler function and installs its own manager into the Kernel springboard handler. If the system executes Linux in 64-bit mode, the applicable calling convention includes a red zone [SysV10, Sect. 3.2.2], a region of 128 Bytes below the SP. With the stack growing downwards on the x86 architecture, the red zone can be used for local data by leaf functions (i. e. functions not calling other functions) without needing to adjust the SP (to reserve this space on the stack). In normal operation, the red zone is never overwritten, but if an interrupt handler using the stack is unexpectedly inserted, the stack must be expanded by 128 Bytes to protect the red zone. Therefore, in 64-bit mode, the springboard does not just push the return address onto the user-mode stack but first decrements the SP by 128. The manager function in user-mode uses the correct return address to continue the interrupted task but additionally needs to remove the 128 Bytes of red zone protection from the stack. This can conveniently be done with the RETN instruction¹⁶ that removes N Bytes from the stack after reading the return address.

[Figure 6.4.: Control flow for user-mode interrupts. Timeline t0 to t9 across the layers Task, Manager, Handler (user-mode) and Springboard, Kernel Code (kernel-mode); the kernel-mode intervals are of constant length.]

On x86 systems, the context switch to kernel-mode takes rather long: an empty interrupt handler suspends the running task for 500 to 1000 cycles¹⁷. But the variance is low, and in the absence of unknown Kernel code, the execution time of the springboard handler up to the return to user-mode is bounded, allowing a tight estimation of the WCET. The control flow is shown in Fig. 6.4. At t0, a signal triggers the interrupt, and the springboard handler is executed after the switch to kernel-mode at t1. The execution time of the springboard is very short, and at t3, the manager function in user-mode is executed. It stores the registers and calls the handler. The interval from t0 to t5 has a constant maximum. When the handler returns to the manager, the latter restores the context and registers of the interrupted task and returns control to it. The interval t6 to t9 is also constantly bounded. In the timing verification, the WCET of the handler must be determined and added to the constant overhead of the user-mode interrupt mechanism. The implementation of this mechanism was tested on an Intel Core 2 system. The results are listed in Table 6.3.

Bracy et al. present a new paradigm of light-weight inter-core communication called Disintermediated Active Communication (DAC) [BDJ06]. It is based on a hypothetical extension of the instruction set by an instruction to register a memory location (on the granularity of a cache line) and an instruction to register a call-back (user-mode interrupt handler) function that is called when the memory location is invalidated by a write access of another processor. This user-mode interrupt includes minimal context saving to gain a very low latency.
Such a mechanism is applied in the concept processor Pangaea [Won+08], demonstrating low-overhead fine-grained thread synchronization. This example recognizes the importance of a low-overhead, low-latency, and low-jitter asynchronous notification mechanism similar to IPIs but without the overhead of entering kernel-mode. It is shown how such hardware support could be implemented with acceptable effort, but it is not implemented in actually available general-purpose processors.

                          Offset  Cycles  Diff.         Action
    Begin of Springboard  t1      313     313 (t1−t0)   Switch to kernel-mode
    Begin of Manager      t3      779     466 (t3−t1)   Switch to user-mode
    In Handler            t5      817      38 (t5−t3)   Save context and function call
    End of Manager        t8      893      76 (t8−t5)   Return and restore context
    After Interrupt       t9      931      38 (t9−t8)   Return to task

Table 6.3.: Timings of user-mode interrupts on an Intel Core 2 with a nearly empty handler function.

¹⁶ This instruction is commonly used by calling conventions where the called function must remove its parameters from the stack.
¹⁷ On Intel Core 2: 589 cycles minimum, 643 cycles average.
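Cycle offsets like those in Table 6.3 can be taken by reading the Time Stamp Counter at each measurement point. A minimal sketch using the compiler intrinsic for the RDTSC instruction:

```c
#include <stdint.h>
#include <x86intrin.h>

/* Read the Time Stamp Counter; __rdtsc() compiles to RDTSC on x86.
 * On a single isolated CPU with constant TSC, differences between two
 * readings give elapsed core cycles. */
static inline uint64_t cycles(void)
{
    return __rdtsc();
}

/* usage:
 *   uint64_t t0 = cycles();
 *   // ...measured section, e.g. the user-mode interrupt path...
 *   uint64_t dt = cycles() - t0;
 */
```

Since the isolated task never migrates to another CPU, the usual caveat of comparing TSC values across cores does not apply here.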

6.4.2. Interrupt Sources

The triggers of user-mode interrupts must be carefully selected. The Interrupt Requests (IRQs) of devices can be routed to a selected processor executing an isolated task if the device is not used by other parts of the system. Similar to the direct I/O, the device can be initialized by the OS and then reserved for use by isolated tasks. User-mode interrupts can also be triggered by the units of the localAPIC, which mainly provides a flexible timer. Since each CPU has its private localAPIC, there is no risk of sharing. For a higher timing precision, special hardware exists (e. g. high-precision timers, HPET [ST93]). But to use them in isolated tasks, the OS must first be configured not to use them. Finally, IPIs can be used to trigger user-mode interrupts. They can even be generated from user-mode by writing to the memory-mapped configuration registers of the localAPIC. This allows sending asynchronous signals between isolated tasks without interaction of the OS.
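Sending such an IPI from user-mode amounts to composing the localAPIC Interrupt Command Register (ICR) value and writing it to the memory-mapped register page. The bit positions follow the x86 xAPIC register layout; the mapping of the localAPIC page (set up elsewhere, with root privileges) and the vector number are assumptions of this sketch.

```c
#include <stdint.h>

/* ICR low dword fields (x86 xAPIC layout). */
#define ICR_DELIVERY_FIXED  (0u << 8)   /* fixed delivery mode        */
#define ICR_DEST_PHYSICAL   (0u << 11)  /* physical destination mode  */
#define ICR_LEVEL_ASSERT    (1u << 14)  /* assert the interrupt       */

/* Compose the ICR low dword for a fixed-vector IPI. */
static uint32_t icr_low(uint8_t vector)
{
    return (uint32_t)vector | ICR_DELIVERY_FIXED
                            | ICR_DEST_PHYSICAL
                            | ICR_LEVEL_ASSERT;
}

/* usage: with the localAPIC page mapped at `apic` (uint32_t pointer,
 * memory-mapped I/O), an IPI for vector 0x7A is sent by:
 *   apic[0x310 / 4] = (uint32_t)target_apic_id << 24;  // ICR high: dest.
 *   apic[0x300 / 4] = icr_low(0x7A);                   // ICR low: send
 */
```

Writing the low dword triggers the send, which is why the destination in the high dword must be written first.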

6.4.3. Preemptive Scheduling With selected interrupts enabled in isolated tasks, a preemptive switching of tasks can be implemented. This is similar to user-mode threads (e. g. GNU Pth [Eng00], Application-Level Scheduling [Li+04]) but works without the interaction of theOS. The essential resources required for an embeddedOS are a processor, a region of memory, and a timer interrupt for preemptive scheduling. The first two are generally available in a process and the timer interrupt can be realized with UNIX signals that behave like interrupts in the view of the process [Roc08, Chap. 9]. Signals asynchronously interrupt the process and its state is automatically preserved until

the end of the signal handling. They can be triggered by events as well as by one-shot and periodic timers. Marx ported FreeRTOS18 to an isolated Linux process using UNIX signals as timer interrupt [Mar10]. This concept can be transferred to user-mode interrupts. Since such an embedded RTOS can be ported to an isolated task, existing hard real-time applications can be integrated in x86 multi-processor applications. A proof-of-concept implementation manages registered tasks in an array of control blocks, each holding a dedicated stack and a field to store the SP while other tasks execute. The task-switch and scheduling function is registered as user-mode interrupt handler with the service described in the previous Section and is triggered by a periodic timer of the local APIC. For a task switch, it pushes the Base Pointer (BP) to the stack and stores the SP in the task control block. Since all other registers were preserved before by the user-mode interrupt manager, the whole context of a task can be restored with only its SP. Then, the next task is selected, its preserved top of stack is copied to the actual SP register, and the previously pushed BP is restored. When the scheduler returns, the stack has been exchanged with that of another task. The return address from the new stack results in executing a different task after this context switch. During initialization, the stacks of the user-mode threads must be prepared in a state corresponding to the point where the scheduling function switches to a different task. Only the task that will be activated first starts with an empty stack. The initialization jumps to that task function instead of calling it because a call would store a return address that will never be used. On an Intel Core 2 system19, the task-switch time under complete isolation was measured between 1368 and 1480 cycles, which includes the user-mode interrupt handling and the context switch.
Linux switches tasks with a minimum of 5000 cycles, a median of approx. 8000 cycles and a maximum an order of magnitude above that (Section 2.3.5).
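The described stack-switching mechanism can be mirrored portably with the POSIX ucontext API. This is not the proof-of-concept implementation (which manipulates SP and BP directly); it only sketches the same round-robin structure of task control blocks holding a dedicated stack each, with invented task bodies:

```c
#include <stdint.h>
#include <ucontext.h>

#define NTASKS 2
#define STACK_SIZE (64 * 1024)

/* Task control blocks: one saved context (stack, registers) per task. */
static ucontext_t tcb[NTASKS], main_ctx;
static uint8_t stacks[NTASKS][STACK_SIZE];
static int current, counter[NTASKS];

/* Cooperative stand-in for the user-mode interrupt handler: pick the
 * next task round-robin and exchange the stacks. */
static void yield(void) {
    int prev = current;
    current = (current + 1) % NTASKS;
    swapcontext(&tcb[prev], &tcb[current]);
}

/* Invented task body: do some work, yield, repeat three times. */
static void task_body(void) {
    for (int i = 0; i < 3; i++) {
        counter[current]++;
        yield();
    }
    swapcontext(&tcb[current], &main_ctx);  /* done: back to main */
}

/* Prepare each stack as if the scheduler had just switched away from
 * the task, then activate the first task. */
static void run_tasks(void) {
    for (int t = 0; t < NTASKS; t++) {
        getcontext(&tcb[t]);
        tcb[t].uc_stack.ss_sp = stacks[t];
        tcb[t].uc_stack.ss_size = STACK_SIZE;
        tcb[t].uc_link = &main_ctx;
        makecontext(&tcb[t], task_body, 0);
    }
    current = 0;
    swapcontext(&main_ctx, &tcb[0]);
}
```

The stack preparation in `run_tasks` corresponds to the initialization step described above; `swapcontext` performs the save-SP/restore-SP exchange that the proof-of-concept implements by hand.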

18 Online: http://www.freertos.org (visited November 12, 2015)
19 Test system poodoo, Appendix A.1.1


7. Architectural Influence

Hard real-time by definition requires a formal verification of all running code. Scheduling algorithms for multiple real-time tasks have been researched for decades [LL73; BGP97; Car+04] and can be regarded as mostly solved [But05]. However, in multi-processor systems, additional challenges must be covered [GR05; ACD06; Bak06; Cal+07; BCA08; SE11; SZ13]. Many publications make the problem manageable by simplifying the assignment, either by using synthetic (idealized) hardware or by reducing the number of tasks or their dependencies. For complex real applications, the effort grows drastically. As stated before, the concept of isolated tasks on dedicated processors eases the formal verification by reducing the code that must be regarded to the real-time application itself (Chapter 4). In particular, the large code base of the General-Purpose Operating System (GPOS) (Section 4.5) as well as inter-processor dependencies (caused by Inter-Processor Interrupts (IPIs), Section 6.1) are suppressed while their services remain fully available in the system partition. The timing of the resulting single-processor bare-metal system can either be modeled or measured (Section 2.3.5 on page 64). Even for an embedded preemptive Real-Time Operating System (RTOS) running in an isolated task, established methods for single-processor systems can be used to predict and verify the timing. This allows formally verifying the actual application with reasonable effort and without the need to simplify the problem, which would distort the results. All methods of verification require knowledge of the Worst-Case Execution Time (WCET) of the tasks involved [PS08]. When designing hard real-time systems, it is crucial to understand every effect and every setting of the system. Future systems integrate multiple processors, and the optimization of applications requires specific knowledge about the processors and compilers [Fog13a; HW10].
The experience from High Performance Computing (HPC) can be similarly valuable for real-time systems. Independently from real-time research, the term OS noise [Tsa+05; Mor+11] describes variations in the execution time that originate from the operating system's interrupts and scheduling. Further, so-called timing anomalies [LS99] may even cause a faster execution when adding instructions. Following the experience that massively parallel applications are sensitive to execution-time deviations in single processes, HPC systems have started utilizing real-time methods to achieve a better predictability and hence a faster execution [ALL12]. This Chapter covers information relevant for the estimation of execution times – especially the WCET – on x86 multi-processor systems. In Section 7.4, new research is presented that promotes the application of Non-Uniform Memory-Access (NUMA)

systems for isolated hard real-time tasks by combining HPC research on those architectures with hard real-time demands. Finally, in Section 7.6, the experiences from Non-Uniform Memory-Access are transferred to non-uniform Input/Output (I/O) access to reduce the jitter of accessing peripheral devices. Essentially, this hardware-related Chapter is independent of the previously presented concept of isolated bare-metal tasks. Furthermore, this research applies to all real-time applications that struggle to confine the jitter on x86 multi-processor systems. Some experience with implementing hard real-time on x86 Uniform Memory-Access (UMA) systems was published before [Zha+05], but to the best of my knowledge, there are no publications available so far that analyze memory and cache jitter and PCI Express (PCIe) I/O on NUMA architectures for hard real-time applications.

7.1. Estimation of the Execution Time

The isolated task concept targets the verification of the schedule by waiving all influences of the GPOS and other processors, reducing the code to cover to the hard real-time task itself. The execution time of basic blocks and tasks must be estimated for most real-time verification and scheduling methods. Further, a tight estimation is preferable to optimize the system utilization. Most important is the Worst-Case Execution Time (WCET) [PB00], but for some methods, the Best-Case Execution Time (BCET) is also required. Generally, the approaches of formal analysis and measurement can be distinguished. The formal verification can be separated into the verification of the task schedule and the determination of the WCET of each task (Section 2.3.5). If the hardware is too complex or not sufficiently documented to derive the WCET by formal methods, only a pessimistic approach of measuring under heavy system load is possible. To tighten the resulting WCET, the system load can also be modeled according to the real application. In the context of this work, this exception is made for the execution time of basic blocks, while their combination to the Control Flow Graphs (CFGs) of tasks and the schedulability analysis remain formally verifiable. Calandrino et al. present their expectations of future architectures: those will have many cores with deep cache hierarchies [CAB07]. Although, six years later, such architectures are not yet on the horizon, the general idea of deep hierarchies of shared caches is still valid (and investigated for HPC systems, e. g. with Intel's Single-Chip Cloud Computer (SCC) [SCC10]), and it is sensible to evaluate their handling for real-time systems. In Section 2.1.8, some basic characteristics were presented. For most applications, the average values (e. g. of a memory access) are relevant because a long execution time levels out deviations of single instructions.
In concurrent applications, the jitter becomes disruptive if fine-grained synchronization points spread the delays to cooperating tasks. In real-time systems, the potential accumulation of several instructions that are each delayed to their maximum execution time

results in the WCET. Therefore, the maximum latencies of simple instructions are relevant in this context. In contrast, the minimum duration can be interpreted as the architectural effort, provided an elimination of the instruction (by optimization) can be ruled out. The difference between minimum and maximum time is defined as jitter. Since a small jitter facilitates a tighter estimation of the WCET, methods to reduce the maximum deviation (and hence the jitter) are beneficial for hard real-time systems [Bon+12]. Regarding influences on the CFG, an isolated processor can be regarded like a single-processor system. But the execution time of basic blocks depends on architectural effects in shared functional units like the Last-Level Cache (LLC). A lot of research on the reduction of cache-induced jitter on special architectures has been published, such as specially designed Systems-on-a-Chip (SoCs) [Ros+07; And+08], locking caches [Tam+04; Cam+05], a software emulation of locking caches on x86 [BSF04], scratchpad memories for hard real-time [PP07], or suggestions of other hardware modifications [SM08; LLX09], which are all not practicable on current x86 hard real-time systems. But the estimation can be based on existing research for x86 single-processor systems. For example, Petters and Färber present an approach to derive the WCET by automatically instrumenting the code based on the compiler's optimization information [PF00]. An approach to control the influence of the operating system (OS) on the cache is presented by Liedtke et al. They regard single-processor real-time systems where cache effects make the estimation of the WCET hard. On modern processors, the hit/miss ratio is very high; therefore, disabling caches would avoid their influence but at the same time yield an extremely reduced performance [LHH97]. Altmeyer et al.
present an efficient method to calculate a tight bound of the cache-related preemption delay, which is needed to analyze the WCET of preemptable tasks [AMR10]. Similar analyses were published by Reineke et al. [Rei+07], Wilhelm et al. [Wil+09], and Ishikawa et al. [Ish+13]. According to Wilhelm and Reineke, these are important steps towards better WCET estimation [WR12]. To practically evaluate the execution time jitter of a small task on an idle system, the following benchmark was executed on an isolated Central Processing Unit (CPU) of the test system haixagon (Appendix A.3.1). All other CPUs on the same socket were left idle, and the system on the other socket was also mostly idle to reduce the influence of concurrent execution as much as possible. The test routine repeatedly accessed variables in a buffer of 4 KiB size. Both code and data are small enough to fit in the First Level Cache (L1$). Before the measurement, some warm-up iterations were executed to load all required cache lines and train the branch predictors [CP00]. The result for up to a million memory accesses shows a very low jitter of only 4 cycles (Table 7.1). This demonstrates that the x86 architecture is in fact capable of hard real-time execution with low jitter if the influences of other processors on the caches can be ruled out.


Memory accesses       Min.        Avg.        Max.   Jitter (%)
              1         24          25          28    4 (16.7)
              4         24          29          32    8 (33.3)
             16         48          57          64   16 (33.3)
             64        184         188         204   20 (10.9)
            256        792         794         796    4 ( 0.5)
          1 024      3 096       3 098       3 100    4 ( 0.1)
          4 096     12 312      12 314      12 316    4 ( 0.0)
         16 384     49 176      49 178      49 180    4 ( 0.0)
         65 536    196 632     196 633     196 636    4 ( 0.0)
        262 144    786 456     786 458     786 460    4 ( 0.0)
      1 048 576  3 145 752   3 145 754   3 145 756    4 ( 0.0)

Table 7.1.: Execution time jitter (in CPU cycles) of a small task using 4 KiB buffer (code and data in L1$) on an idle system after warm-up.

In the following Sections, the execution time is measured under various load scenarios to illustrate the added effects in multi-processor systems over single CPUs. But depending on the requirements, more elaborate (and expensive) methods can be applied.
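The hourglass-style measurement underlying these benchmarks can be sketched as follows (buffer size, warm-up rounds, and iteration counts are illustrative; on x86, the time-stamp counter is read with rdtsc):

```c
#include <stdint.h>

/* Read the time-stamp counter (x86).  On other architectures, a
 * monotonic clock would be substituted. */
static inline uint64_t rdtsc(void) {
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0;  /* placeholder on non-x86 */
#endif
}

#define BUF_WORDS (4096 / sizeof(uint64_t))  /* 4 KiB buffer (fits L1$) */
static volatile uint64_t buf[BUF_WORDS];

/* Record the min/max duration of 'accesses' buffer reads, repeated
 * 'samples' times after a warm-up phase that loads the cache lines
 * and trains the branch predictors. */
static void measure(int accesses, int samples,
                    uint64_t *min, uint64_t *max) {
    *min = UINT64_MAX; *max = 0;
    for (int s = -16; s < samples; s++) {    /* 16 warm-up rounds */
        uint64_t t0 = rdtsc();
        for (int i = 0; i < accesses; i++)
            (void)buf[i % BUF_WORDS];        /* L1-resident access */
        uint64_t dt = rdtsc() - t0;
        if (s < 0) continue;                 /* discard warm-up */
        if (dt < *min) *min = dt;
        if (dt > *max) *max = dt;
    }
}
```

The difference between the recorded maximum and minimum is the jitter as defined above; Table 7.1 was obtained with this kind of loop under complete isolation.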

7.2. Multi-Core Systems

Although generally not distinguishable from a programmer's point of view, this Section explicitly regards multi-core chips instead of general multi-processor systems. The most important common detail is that multiple processors in the same package share their Last-Level Cache and other common infrastructure (e. g. the system bus). This architecture was developed as an easy extension from single- to multi-processor systems by placing multiple dies in a single package and later by integrating multiple processor cores on a single die. Multi-processor systems with multiple sockets are regarded in the following Sections. Since multi-core processors were the base of the first widely available multi-processor systems in the Commercially off-the-Shelf (COTS) market (Section 2.1.9), many publications regard this special type. With multiple independent execution units sharing infrastructure such as the Last-Level Cache, the system bus, and the Memory Controller (MC), new dependencies are introduced that influence the execution time. Wilhelm and Reineke present an overview of currently unsolved problems arising from the unpredictability of multiple processors and architectural jitter (cache, pipeline, Translation Look-aside Buffer (TLB)) [WR12]. Dasari analyzes the new architecture for real-time applications [DNA11], including the contention of

the memory bus [Das+11] (assuming a UMA system with a single RTOS instance and knowledge of all real-time tasks) and other sources of unpredictability such as shared caches, interconnects, power management, and more [Das+13]. The researchers Pellizzoni, Betti, Bak, and Schranzhofer cooperate on a solution for UMA systems with an emphasis on memory transfers [PC07]. They present co-scheduling of tasks [Pel+08], worst-case delay analysis [Pel+10], and resource access models [Sch+10], to finally propose an execution model that controls the operating point of shared resources to remain below the saturation limit [Pel+11]. Thus, the confinement of jitter induced by cache and memory access is very complex, especially if unrestricted tasks are executed concurrently on other cores of the same package. It follows that it is very expensive to formally derive the WCET of basic blocks in a multi-processor system covering all theoretically possible interactions and collisions. In reality, a system behaves more benignly. Although the academic definition of hard real-time demands a formal analysis that must add all possible delays to the WCET, a real system shows much less jitter. It happens only rarely that multiple delaying events accumulate to the calculated WCET. The benchmarks cannot prove hard real-time, but the distribution of the times in a histogram allows a first rating that can be extended by a safety margin to a practically usable tight estimation.
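Turning such a measured histogram into a practically usable bound might look as follows (the 20 % safety margin is an arbitrary illustration, not a value from this work):

```c
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for 64-bit cycle counts. */
static int cmp_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Derive a practical execution-time bound from measured samples:
 * take the observed maximum and add a safety margin (here 20 %, an
 * arbitrary choice).  Sorting also gives access to percentiles for a
 * histogram-based rating.  This does not prove a WCET; it only turns
 * a measurement series into a usable engineering estimate. */
static uint64_t estimated_bound(uint64_t *samples, size_t n) {
    qsort(samples, n, sizeof *samples, cmp_u64);
    uint64_t observed_max = samples[n - 1];
    return observed_max + observed_max / 5;   /* +20 % margin */
}
```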

   Load    Min.  Avg.   Max.
 16 KiB     52    52     64
 32 KiB     52    52     60
 64 KiB     52    52     68
128 KiB     52    52     68
256 KiB     52    52     60
512 KiB     52    52     60
  1 MiB     52    52     60
  2 MiB     52    52     60
  4 MiB     52    52   1268
  8 MiB     52    52   1660
 16 MiB     52    52   1636

Table 7.2.: Influence of load using the shared cache on the inclusive private cache of another core (latencies in CPU cycles).

Figure 7.1.: Multi-core architecture (Intel Nehalem) with four cores and two logical threads each. [Diagram: four cores (threads T0/T1, private L1d/L1i and L2$ each) sharing the L3$, the Memory Controller, and Memory.]

The effects of a shared cache, where tasks on other processors may cause the eviction of other tasks' cache lines, are thoroughly analyzed [YZ08]. The best results are obtained if the hard real-time tasks use only their private caches. Since the

isolated task concept targets very high-frequency applications, this is advisable anyway. To evaluate the influence of other tasks on such an isolated task that uses only its private caches, an Hourglass benchmark was executed on a micro-kernel (smp.boot, Appendix B) while a load process using a varying memory range was executed on a different processor. Only two processors execute load and benchmark; all other processors as well as interrupts are not activated, to analyze the hardware effect strictly without any unknown interference. The load covers memory ranges using only the private cache, the shared Last-Level Cache, and main memory access. This experiment was executed on the test system xaxis (Appendix A.2.1) with an Intel Nehalem QuadCore processor with a single shared Third Level Cache (L3$). The results (Table 7.2) indicate in fact an influence of a load that uses a fair amount of the shared Last-Level Cache onto the private cache of a different processor. Minimum and average do not depend on the load and are very close together, which indicates that the maximum values are observed only rarely. The occasional outliers cause a steep increase of the maximum execution time if the load uses a large portion of the 8 MiB shared Last-Level Cache. Since the benchmark routine uses only its private caches, its execution should not be influenced. But the inclusive caching algorithm of Intel processors is not aware of the temporal locality of the private caches and may evict cache lines from other processors' private caches when they are displaced from the shared Last-Level Cache. They are called "cache victims" by Jaleel et al., who analyze this effect inherent to inclusive caches to improve the average throughput [Jal+10]. In Fig. 7.2, a simplified multi-core processor is illustrated. This example uses a two-way associative cache (Section 2.1.4); hence, every element can be stored in one of two possible cache elements of the above level.
In the example, the left processor has successively used the elements m0 to m3, resulting in loading their contents into the shared Last-Level Cache. The right processor uses only element m5 that was loaded into cache element c1 (evicting m1) and q1. The left processor reuses all its data beginning with m0 = A, which was still in the shared Last-Level Cache. This is the state displayed in Fig. 7.2. Now, element m1 is required by the load process executed on the left processor, but it is not present in the shared Last-Level Cache. It can be loaded into c1 or c3 (because of the two-way associativity). If the cache management selects c1, its content X must be evicted from the shared Last-Level Cache and also from any higher level to meet the inclusive condition that all elements must be present in lower levels. Therefore, the execution of the left processor may cause the eviction of element m5 with value X from the private First Level Cache q1 of the right processor. It depends on the access history which element of the shared Last-Level Cache is selected by the cache management, but the measurement indicates that collisions do in fact happen. In an exclusive cache, the element could remain in the First Level Cache although the other CPU uses the entire shared Last-Level Cache. This effect can be observed on AMD processors that use an exclusive caching algorithm [HMN09]. The test series displayed in Fig. 7.3 were executed on the test systems devon (AMD Opteron,


Figure 7.2.: Example: cache content of a multi-core processor (simplified).

    Load process (left)                       Hard real-time task (right)
    Private First-Level Cache:  p0: m0=A  p1: m3=D   |   q0: empty  q1: m5=X
    Shared Last-Level Cache:    c0: m0=A  c1: m5=X  c2: m2=C  c3: m3=D
    Memory:                     m0=A  m1=B  m2=C  m3=D  m4  m5=X  m6  m7
    (m0–m3 allocated by the load process; m4–m7 reserved for the hard real-time task)

Appendix A.5.1) and haixagon (Intel Xeon, Appendix A.3.1). Both systems were configured to execute only two isolated tasks on processors sharing their Last-Level Cache. One processor executed a load process reading and writing a buffer varying from 1 KiB up to 256 MiB, while the measurement task accessed only a single element of a 1 KiB buffer. The benchmark was executed for two minutes and collected only minimum, average, and maximum values to avoid the statistics influencing the caches. The minimum and average are constant (216/270 cycles on the inclusive Intel system, 338/395 cycles on the exclusive AMD system) for different load working sets, while the maximum depends on the load's working set size (Fig. 7.3 on the following page). On the Intel processor with an inclusive cache, an increase of the maximum execution time can be observed at 12 MiB load working set size, while the AMD processor with exclusive cache is much less influenced by the load working set. On processors with inclusive caching, this effect can only be reduced by limiting the usage of the shared Last-Level Cache by all processors. There are proposals to use cache coloring [LHH97] or hardware extensions [Seb01], but they are very intrusive or costly (large parts of the memory cannot be used) or not applicable with available processors. The cache of x86 processors can be configured by several parameters, for example the write-through or write-back strategy for modified data. The default setting is write-back, which transfers only modified entries to lower levels when they are evicted or requested by other cores. The setting write-through transfers every change immediately to the lower levels so that all levels are always consistent and an entry can simply be dropped or shared with another core. But this comes at the cost of increased traffic for every modification.
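The back-invalidation ("cache victim") effect of inclusive caches walked through in Fig. 7.2 can be reproduced with a toy model of a single two-way LLC set (all identifiers invented; real replacement policies are more elaborate than the round-robin choice used here):

```c
#include <string.h>

#define WAYS 2
#define CORES 2

/* Toy model: one set of a two-way inclusive last-level cache plus a
 * one-entry private cache per core.  -1 marks an empty entry. */
static int llc[WAYS];
static int l1[CORES][1];
static int next_victim;       /* trivial round-robin replacement */

static void cache_init(void) {
    memset(llc, -1, sizeof llc);
    memset(l1, -1, sizeof l1);
    next_victim = 0;
}

static int llc_lookup(int elem) {
    for (int w = 0; w < WAYS; w++)
        if (llc[w] == elem) return w;
    return -1;
}

/* Core 'core' accesses element 'elem'. */
static void access_elem(int core, int elem) {
    if (llc_lookup(elem) < 0) {
        int w = next_victim;              /* choose a way to evict */
        next_victim = (next_victim + 1) % WAYS;
        int evicted = llc[w];
        /* Inclusive condition: the evicted element must also leave
         * every private cache -- even that of another core. */
        for (int c = 0; c < CORES; c++)
            if (l1[c][0] == evicted) l1[c][0] = -1;
        llc[w] = elem;
    }
    l1[core][0] = elem;
}
```

Replaying Fig. 7.2 with this model, the real-time core caches one element; a load core that merely fills the LLC set then back-invalidates the real-time core's private copy, exactly the cross-core interference measured above.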


Figure 7.3.: Comparison of inclusive and exclusive caches. Impact of the load working set size on a small isolated task (max. latency). [Plot: max. latency in cycles (up to 3 000) over load working set sizes from 1 KiB to 256 MiB; the Intel system (inclusive cache) shows a steep rise near its 12 MiB last-level cache size, while the AMD system (exclusive cache, 6 MiB last-level cache) remains largely unaffected.]

With a longer execution time, it becomes more probable that a task is hit by multiple delaying events slowing it further down in the worst case. The eviction effect of inclusive caches can obviously be avoided by using processors with exclusive caching. In the following, the use of separate inclusive Last-Level Caches is analyzed. Today, systems with multiple sockets enable the construction of systems with more than 16 processors out of COTS components. A hard real-time task executes with the lowest latency and jitter if it uses as few memory accesses as possible, hence if it can keep most of its working data-set in its private caches. Since sharing the Last-Level Cache with other processes will increase its jitter, especially when using inclusive caches, the other processors should be employed carefully. The requirement of having compute-intensive tasks leads to the use of multi-socket systems with multiple Last-Level Caches, which are investigated in the following Section. Further, the required memory and I/O device accesses may collide on the shared system bus. The separation of data streams will be analyzed in Section 7.4 and Section 7.6. Controlling the execution on the other cores of a multi-core processor violates the requirement of executing arbitrary load in the system partition. But multi-core processors are not the only option to use multiple CPUs. In the following Sections, the combination of single-core and multi-core chips into multi-socket systems with UMA and NUMA characteristics is analyzed.

7.3. Uniform Memory Access Systems

The classification of UMA systems also includes multi-core packages, but this Section concentrates on multi-socket systems. The differentiating detail is the existence of multiple shared caches, which results in not all processors sharing their Last-Level Cache. All sockets are connected to the same system bus and share the Memory


Figure 7.4.: UMA system with two sockets and four cores each (test system octopus, Appendix A.1.2). [Diagram: eight processors with private L1d/L1i caches, pairs of cores sharing an L2$, all connected to a common Memory Controller and Memory.]

Controller (MC) located in the chip set (Fig. 7.4). Therefore, on average, each processor has the same temporal distance when accessing the memory on a cache miss. In HPC, this can become a congestion point if too many processors require large data-sets. Worse, in real-time systems, the system bus and the MC are shared functional units that can be blocked by other processors for unpredictable periods. Dasari et al. very thoroughly analyze the causes of varying execution times (jitter) in COTS-based multi-core (UMA) systems, including a long list of related work [Das+11; Das+13]. To investigate the effects of different memory ranges accessed from different settings, a small benchmark mimics two isolated tasks by placing them on dedicated CPUs and using the Interrupt Flag (IF) to avoid all other influences on the run-time. A load task reads and writes an array of configurable size while the isolated benchmark task takes the timing statistic for accessing a single element of its own working set. The minimum, average, and maximum latency is recorded. The test system octopus (Appendix A.1.2) uses a UMA architecture with two double-core chips per socket and two sockets, hence 8 CPUs (Fig. 7.4). The settings are: both tasks on adjacent CPUs sharing the Last-Level Cache, both processes on the same socket but using their own Last-Level Cache, and both tasks on different sockets, only sharing the system bus and the MC. The load was varied up to a working set size large enough to overwrite the Last-Level Cache, but was not grown further to reduce the probability of triggering foreign cache eviction as demonstrated in the previous Section. The various load sizes are displayed using colors grading from blue for small to red for large load working sets. Here, the isolation with a small working set is not influenced in the Figures 7.5a and 7.5b. Only in Fig.
7.5c, loads with memory sizes larger than the shared Last-Level Cache cause a large jitter in isolations using only their private caches. If two tasks are executed on different sockets using only their private caches, they should not influence each other, but apparently the cache


[Plot: max. latency in cycles (up to 40 000) over isolation buffer sizes from 4 KiB to 8 MiB, for load working set sizes from 4 KiB to 16 MiB; three panels: (a) same Last-Level Cache, (b) same socket, different Last-Level Cache, (c) different sockets.]

Figure 7.5.: Maximum latency depending on isolation buffer size with varying load working set sizes (4 KiB to 16 MiB).

synchronization (i. e. the coherence protocol) does so. This could be due to the MESI protocol (Section 2.1.9) that relies on bus snooping. In the case of multiple sockets holding multiple Last-Level Caches, broadcasts consume more time. Concluding, this system architecture is not suited for hard real-time applications with latency requirements in the 10 µs range. Generally, each system should be analyzed with similar methods before being considered for use in a real-time system. UMA systems use other cache coherence algorithms than NUMA systems; therefore, in the next Section, systems with multiple sockets and memory connected to each socket individually will be investigated.
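The placement of benchmark and load tasks on dedicated CPUs used throughout these experiments relies on processor affinity. A minimal sketch using the Linux interface (the Interrupt Flag manipulation itself requires the kernel-side isolation support and is omitted here):

```c
#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling thread to one CPU so that benchmark and load run
 * on chosen processors (e.g. sharing a Last-Level Cache or not). */
static int pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof set, &set);
}

/* Return the (single) CPU the thread is pinned to, for verification. */
static int pinned_cpu(void) {
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof set, &set) != 0) return -1;
    for (int c = 0; c < CPU_SETSIZE; c++)
        if (CPU_ISSET(c, &set)) return c;
    return -1;
}
```

Calling `pin_to_cpu` for the load and the measurement task with CPU numbers from the same or different sockets reproduces the three settings of Fig. 7.5.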

7.4. Non-Uniform Memory Access

Non-Uniform Memory-Access (NUMA) systems are multi-processor systems constructed of multiple nodes, each with local memory and one or multiple processors. Every CPU can access the entire memory range, but the access to the local memory is more efficient (lower latency and higher throughput) than the access to remote memory connected to other nodes. In HPC systems, this enhancement of UMA systems allows avoiding the bottleneck of memory access [Bla+10] while keeping the same programming paradigm of shared memory systems (Single Program Multiple Data, SPMD). Remote accesses are less efficient, but still fully functional for Inter-Process Communication (IPC). For optimization reasons, it is advisable to partition the data in a way that code running on a node mostly uses data located in the same node's local memory [Boy+10; MG11].


Some previously cited publications mention NUMA systems in their plans for future work [Das+13, Sect. IV.C] or just explain theoretical foundations [Sto06], but they do not analyze the opportunities and weaknesses of such systems. To the best of my knowledge, no publications with experience about hard real-time applications on actual NUMA systems exist so far.

Figure 7.6.: Separation of system and real-time partitions on dedicated NUMA nodes. [Diagram: the system partition (system CPU 0 and load CPU 1) with its local memory M0 attached via its Memory Controller, connected through the interconnect IC0–IC1 to the real-time partition (CPU 2 and isolated CPU 3) with its local memory M1.]

In the experiments presented above, the shared functional units of the system bus and the MC were identified as contention points where exceptionally large latencies are occasionally caused by so-called delaying events that may spoil hard real-time tasks. With the experience from HPC optimization, it can be expected that NUMA architectures with separate memory nodes allow the creation of physical partitions for system and real-time tasks so that the data paths (Fig. 7.6) are nearly completely separated. All compute-intensive processes with full system utilization are placed on their own node. The OS will by default allocate local memory from that node. The hard real-time tasks placed on the other node use their local memory so that the access paths are separated and the system interconnect (between IC0 and IC1) is rarely used. The impact of separate Last-Level Caches (Fig. 7.5c) was blamed on the MESI protocol using broadcasts on the shared system bus to synchronize caches. Processors designed for NUMA systems use the modified variants MOESI (Section 2.1.9, [AMD12b, Chap. 7.3]) or MESIF ([Intel09, p. 15]) to improve the critical memory-access performance [Tho11]. The documentation does not reveal every last detail, so a measurement-based approach remains to understand the behavior [Fog13d]. It is expected that such optimized cache coherence protocols interfere less with real-time applications. The assumption that a NUMA system is better suited for a partitioning approach was analyzed on the test system haixagon (Appendix A.3.1). It has two NUMA nodes, each with six physical processor cores and 24 GiB main memory. The isolation was set up with the Linux kernel patch as described in Chapters 4 and 5. The partitioning creates a system partition on CPU #0 for the internal maintenance, services, and interrupts, a user partition on the remaining CPUs #1 to #5 on the same node for soft real-time tasks, and six partitions for isolated hard real-time tasks on the


CPUs #6 to #11 on the other node. Like in the previous test series, the benchmark uses two processes, a load and an isolation. The load process is executed in the user partition. It reads or writes the elements of a memory buffer of varying size located in its local memory. The isolated task in the other partition does so with a buffer on its local memory node. The timing includes one memory access with a stride of one cache line, so that every measurement accesses a new cache line. All other CPUs, especially the ones in the real-time partition on the memory node where the isolated task is executed, are idle. Both tasks iterate over a wide range of buffer sizes to analyze the impact of the load onto the isolation.
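The node-local buffer placement described above can be requested explicitly on Linux with the mbind system call. A hedged sketch (raw syscall to avoid a libnuma dependency; it falls back silently on kernels without NUMA support):

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MPOL_BIND 2   /* policy constant from <linux/mempolicy.h> */

/* Allocate 'size' bytes and bind the pages to NUMA node 'node' so an
 * isolated task uses only node-local memory.  On kernels without
 * NUMA support (ENOSYS/EINVAL), the plain mapping is kept. */
static void *alloc_on_node(size_t size, int node) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;

    unsigned long mask = 1UL << node;
    if (syscall(SYS_mbind, p, size, MPOL_BIND, &mask,
                sizeof(mask) * 8 + 1, 0) != 0
        && errno != ENOSYS && errno != EINVAL) {
        munmap(p, size);
        return NULL;
    }
    memset(p, 0, size);   /* touch the pages: they are placed now */
    return p;
}
```

Without an explicit policy, Linux places pages by first touch, which already yields node-local memory when the pinned task itself initializes its buffer; the explicit binding merely makes the placement robust against later migrations of the allocating thread.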

[3D plot: max. latency from 10² to 10⁶ cycles over load working set sizes (1 KiB to 512 MiB) and isolation sizes (4 KiB to 512 MiB), with separate surfaces for "isolation reads" and "isolation writes".]

Figure 7.7.: Maximum latency of an isolated task using various memory sizes under varying load scenarios.

The full results are displayed in Fig. 7.7. The red meshes are generated by the data sets where the isolation only reads and the load either only reads or reads and writes. Since they are not significantly different, they are both drawn in red. The same holds for the blue meshes, where the isolation reads and writes. The most important observation is the independence of the results from the load working set size over the complete range from 1 KiB up to 512 MiB. The minimum and average execution times are independent of the isolation buffer range, as illustrated for three different load working set sizes in Fig. 7.8. Only the maximum execution time (WCET) increases if the memory range of the isolated task uses a large part of the Last-Level Cache. When writing large buffers, an isolation is subject to occasional delays of 10⁶ CPU cycles (millisecond range). This is the same effect of cache misses observed in multi-core systems, only that in this case, the

166 7.4. Non-Uniform Memory Access

106 Load 16 KiB Load 8 MiB Load 512 MiB min. avg. max. 105 L3$ RAM 104

103 Latency ( cycles ) 102

1 KiB 16 KiB 256 KiB 12 MiB 512 MiB Isolation buffer size Figure 7.8.: Jitter experienced by the isolated task depending on its memory range for three different load sizes (that do not influence the result significantly). isolated task itself causes the eviction of its own data if its working set reaches the cache size. The distribution of the isolated task’s execution times for various memory sizes is displayed in Fig. 7.9. The load working set does not influence the result as showed before, but the largest size of 512 MiB read/write access was still selected for these figures. If the isolated task uses less memory than fitting in its private caches (the Westmere CPU has 256 KiB L2$), the execution times are very narrow. An isolated task using 8 MiB (the size of the L3$ is 12 MiB) still executes 1010 iterations at the same latency as a smaller task, but 106 iterations (which is only a fraction below 1/1000th of all iterations) experience a delay of up to 1 µs. In the last figure representing an isolation working set size of 512 MiB, the histogram is similar to the previous case, but it includes additionally 224 events above 1.2 µs up to a maximum of approx. 500 µs. Since real applications require some shared variables for the synchronization of the application and Inter-Process Communication (Section 6.1), the access to cache lines not currently stored in the cache is inevitable. Those memory regions are placed on one of the nodes and the processes located on the other node must access it remotely. To estimate the risk of large delays, Fig. 7.10 displays the number of extreme delays above 2 µs depending on the isolation size. It can be seen, that the likelihood of extreme delays grows with the size of the buffer. Therefore, it is advisable to keep the working set of isolated tasks in the private caches and keep the shared buffers for IPC as small as possible. Consequently, the shared buffers for IPC between partitions should be placed in the real-time partition. 
This allows the time-sensitive tasks to access them locally while the less critical tasks must access remotely. The results of these short benchmark series are supported by tests with an application up to 72 Hours that show the same characteristic.


[Figure 7.9: four histograms of loop times (0–1.2 µs, events on a log scale up to 10¹⁰) for isolation sizes 4 KiB (max. 0.09 µs), 128 KiB (max. 0.08 µs), 8 MiB (max. 1.01 µs), and 512 MiB (max. 505.68 µs; 224 events > 1.2 µs).]

Figure 7.9.: Histograms with largest load (512 MiB read/write buffer size).


[Figure 7.10: number of events above 1.2 µs (0–200) over isolation read/write buffer sizes of 64 MiB, 128 MiB, 256 MiB, and 512 MiB.]

Figure 7.10.: Number of extreme delays above 1.2 µs depending on isolation size.

Not only is the memory access separated per socket; the I/O of the system partition also no longer interferes with memory transfers of the real-time tasks, because the MC is in the processor package and does not use the same system interconnect as the device access. This will be further evaluated in Section 7.6.

7.5. Symmetric Multi-Threading

The Pentium IV generation introduced Simultaneous Multi-Threading (SMT) (called "Hyper-Threading") to the x86 architecture by dividing a physical processor into two logical execution units (Section 2.1.9). They execute independently and are presented to the OS like real processors but share some parts. In fact, their instructions are executed interlaced on the same functional units to cover waiting times due to memory loads. Section 4.2.1 argued for deactivating SMT altogether because the sharing of functional units such as the decoding unit and the caches may introduce unpredictable jitter. With the consideration of NUMA systems and their partitioning along NUMA node borders, the partial use of SMT becomes desirable. The system partition on dedicated NUMA nodes could exploit the additional compute power and usage efficiency, while CPUs with isolated tasks would not be influenced by logical processors using the same functional units. To avoid a negative impact of logical threads onto isolated tasks on the same physical processor, the configuration of an N-processor system to boot 1.5 × N logical processors was analyzed. As illustrated in Fig. 7.11 for the test system haixagon (Appendix A.3.1), the numbering of the logical threads places the Thread IDs i and i + N on the same physical processor. If the system with N = 2 × 6 = 12 physical processors is started with 18 processors, the first socket carries the Threads 0–5 and 12–17 with activated SMT while the second socket executes only Threads 6–11, i. e. one thread per physical processor.

[Figure 7.11: two sockets, each with six physical processors (P0–P5 and P6–P11); logical threads Ti and Ti+12 share physical processor Pi; per-core caches, one Last-Level Cache and Memory Controller per socket.]

Figure 7.11.: Symmetric Multi-Threading example on two-socket Intel Westmere (test system haixagon, Appendix A.3.1) with physical processor numbers (Pi) and logical thread numbers (Tj).

Although the activation of hardware threads on system nodes should not influence the execution on the real-time node, a different behavior was observed. If at least one physical processor executes two logical CPUs, the Linux kernel starts kernel threads that are not active without SMT. The existence of these kernel threads causes an instability with the isolated task concept: isolated tasks fail to terminate gracefully after they ran for more than five minutes. This is probably related to the kernel modifications described in Chapter 5. This problem has not been solved so far, and deactivating SMT is still advised unless further experiments prove successful.

7.6. Input and Output

Many real-time systems (e. g. for Programmable Logic Controllers, PLCs) require direct Input/Output (I/O) of digital (General-Purpose I/O, GPIO) or analogue signals (Analog to Digital Converter, ADC, or Digital to Analog Converter, DAC) and field buses (e. g. RS-232, I2C, CAN). While most Micro-Controllers (µCs) integrate those functional units, COTS systems must be extended with peripheral devices connected for example via Peripheral Component Interconnect (PCI) or PCI Express (PCIe). This is very flexible, but the access to those devices uses the system buses and is subject to greater jitter than on µCs. A wide range of publications analyzes the effects of bus load on real-time applications on single-processor [Sch03; Lew+07] and multi-processor systems [SBF05]. On multi-processor systems, other processors' data streams may interfere with the I/O access of hard real-time tasks, increasing the unpredictability of their timing [Sch+10]. Stohr et al. propose confining the I/O of the system partition to a level that does not impact the real-time processors too much [SBF04]. To solve the problem of unpredictable jitter, co-scheduling of tasks [Pel+08] and hardware extensions


[Bak+09; Bet+13] are proposed. But since the concept of isolated tasks should allow arbitrary load on the remaining processors with only minor changes to the OS, this is not a viable solution. All evaluations of multi-processor I/O so far only regard UMA systems and the older PCI bus [Nam+09]. Stohr includes PCIe in his analysis of using the x86 architecture for real-time systems [Sto06] but does not cover NUMA and non-uniform I/O. The details of PCIe are covered in the standard [PCIe10] and the book of Budruk et al. [BAS03]. A quick overview is given by Bhatt, who mentions the support of prioritized data streams and that switched point-to-point connections can transfer concurrently to other transfers without interference [Bha02]. In High Performance Computing, NUMA systems are regarded as the solution to overcome limitations in memory throughput and I/O congestion [BF11]. As shown above, NUMA systems also have advantages in real-time systems by allowing the separation of data transfers between processors and memory, because they replace the system bus as the unique point of congestion with multiple point-to-point links (e. g. QuickPath Interconnect [Intel09]). In the same way, the shared PCI bus is replaced by its successor PCIe using point-to-point links. The following shows how this architecture also improves the predictability of real-time I/O. Today, the trend is to integrate the MCs in the processor chips instead of in the classical north bridge. This removes local memory transfers from the system bus, leaving more capacity for (and less influence onto) I/O. But the transfers of other processors must still be considered. In the most recent generation, some PCIe I/O lanes are also directly connected to each processor and no longer need a chip-set component (north bridge).

7.6.1. Basic Access

Reading or writing to I/O ports on recent Intel micro-architectures takes 0.5 to 1.5 µs on average (Table 7.3). These measurements were executed on available test systems during normal operation. The minimum time results from the architecture-dependent latency; the maximum is influenced by scheduling and other operations. Therefore, the average may be biased, but it allows a first estimation of the jitter. The important result is that the access on a given system depends on the location of the device on the internal buses and on whether it is placed directly in the chip set or in a peripheral device on the PCI or PCIe connectors. Further, the access latency may depend on the device itself, as is visible in the haixagon (Appendix A.3.1) example of a Parallel port and a Serial port that are both on the same adapter connected to the PCIe bridge but show different latencies. Generally, I/O access is very slow compared to other execution in the processors. On recent super-scalar CPUs, multiple instructions can be executed every clock cycle, while an IN or OUT instruction takes between 1500 and more than 4000 cycles. Some devices map their state registers into the memory address space (memory-mapped I/O) and applications can load and store to these locations similar to normal memory. This form of I/O shows an access latency similar to I/O ports.


                                                        Latency (µs)
System (µ-arch)             Device     Connection           min.    avg.
poodoo (Core 2 Dual)        VGA        PCIe                 0.55    0.96
                            IDE        PCIe                 1.28    1.29
                            USB        south bridge         0.79    0.97
octopus (Core 2 Quad,       SCSI       PCIe                 0.84    0.85
  UMA 2 × 4)                USB        south bridge         0.80    0.81
xaxis (Nehalem, 1 × 4)      VGA        PCIe                 0.82    0.82
                            Parport    PCIe                 6.32   10.49
                            USB        south bridge         0.89    0.91
haixagon (Westmere,         Parport    PCIe direct          1.27    1.33
  NUMA 2 × 6)               Parport    PCIe cross           1.39    1.46
                            Parport    PCI (PCIe bridge)    1.56    1.61
                            Serial     PCI (PCIe bridge)    1.36    1.42
                            Parport    PCI (south bridge)   1.40    1.47
                            USB-Port   south bridge         1.01    1.03
pandora (Sandy Bridge,      SCSI       PCIe                 1.26    1.55
  NUMA 2 × 8)               SATA       south bridge         0.99    1.01

Table 7.3.: I/O latencies of different PCI and PCIe devices (systems described in Appendix A).


[Figure 7.12: two quad-core sockets (eight processors with private L1$, L2$ shared per core pair) attached via the system bus to the Memory Controller Hub (memory, PCIe) and an I/O controller providing further PCIe connectors and the NIC.]

Figure 7.12.: Peripheral I/O capabilities of an UMA system with two sockets (Test system octopus, Server board Intel S5000PSL, Appendix A.1.2).

Consequently, the x86 architecture is not capable of executing tasks that require I/O access below a few microseconds (i. e. above approx. 500 kHz). If a feedback control task needs to read multiple sensor values, calculate an answer and then set multiple actuator outputs, the latencies of all these I/O accesses must be taken into account. A potential optimization is the combination of multiple values into single registers, which can be done on programmable FPGA-I/O adapters.

7.6.2. Non-uniform I/O Access

The typical peripheral architecture of a UMA system is displayed in Fig. 7.12. All transfers between the CPUs and the chip set may collide on the system bus between the CPUs and the Memory Controller Hub (i. e. the north bridge). Some PCIe ports are directly provided by the Memory Controller Hub, and additional PCIe as well as PCI ports are made available by the Interconnect Controller Hub (ICH) (i. e. south bridge). Concurrent memory and I/O accesses from different processors may contend for the chip set, inducing unpredictable latencies on hard real-time tasks. Since NUMA systems behave more predictably than UMA architectures with relation to memory and cache effects (Section 7.4), the I/O transfers are also assumed to be less affected by other data and I/O transfers. Similar to the separation of memory transfers into a system and a real-time partition (Fig. 7.6 on page 165), NUMA systems also allow separating the I/O transfers (Fig. 7.13). In the following experiments, the OS and the compute-intensive tasks are confined to the CPU socket and I/O devices displayed on the left side of Fig. 7.13. On the 12-processor test system haixagon (Appendix A.3.1), these are the CPUs #0 to #5. The right side is the dedicated real-time partition with the CPUs #6 to #11 and the PCIe slots connected to the I/O Controller Hub (IOH) directly connected to these CPUs (referenced as "PCIe direct"). The PCIe slots in the system partition are referenced as "PCIe cross" because they are accessed from the real-time CPUs over an additional QuickPath connection and may be interfered with by system transfers between socket #0 and the ICH.

[Figure 7.13: two sockets (CPU0–CPU5 and CPU6–CPU11), each with a local memory controller and memory node (M0, M1), connected by QuickPath links to two IOHs providing PCIe slots; the left IOH additionally connects the ICH with PCI, USB, VGA, and the NIC.]

Figure 7.13.: Peripheral I/O capabilities of a NUMA system with two sockets (Simplified test system haixagon, Server board Tyan S7025, Appendix A.3.1).

The I/O devices in the ICH (so-called legacy I/O) and adapter cards installed in the PCI slot are also assigned to the system partition. Several Parallel port expansion cards are used as exemplary peripheral devices. Two identical PCIe adapters are placed in the "direct" and "cross" locations. Two PCI adapters are installed for comparison, one in the PCI slot connected via the south bridge and the other in a PCIe-PCI bridge connected to a "PCIe direct" slot. It is expected that the latency of devices on the "PCIe direct" connection will not be influenced by load in the system partition. The "PCIe cross" devices may be influenced, and all devices connected to the south bridge are assumed to be strongly hit by transfer contention. All following test series were executed on the test system haixagon with the isolation patch described in Chapter 5, the isolation methods presented in Chapter 4, and the application architecture proposed in Chapter 6. The Linux system was confined to CPU #0 and the loads were executed on the CPUs #1 to #5. Five concurrent isolated tasks were executed. The first is used for the verification of the isolation quality; it does not execute anything other than the hourglass statistics routine (Section 2.3.5 on page 67), and its execution time was in all tests between 0.01 µs and 0.4 µs with a constant average of 0.03 µs. The other four isolated tasks read an I/O port of their assigned Parallel port adapter at the maximum possible frequency.


Load           Min.   1st Quart.   Average   Median   3rd Quart.   Max.
idle           0.96   0.99         1.01      1.01     1.03          5.85
sigwait        0.97   0.99         1.01      1.01     1.03          5.53
mmap(4)        0.97   0.99         1.05      1.01     1.03         37.58
mmap(16)       0.97   0.99         1.03      1.01     1.03          5.55
mmap(512)      0.97   0.99         1.33      1.03     1.93          6.32
mmap(1024)     0.97   1.01         1.46      1.05     1.93          6.25
mmap(2048)     0.97   1.03         1.61      1.93     1.93          6.29
mmap(4096)     0.97   1.91         1.73      1.93     1.93          6.35
mmap(16384)    0.97   1.91         1.86      1.93     1.93          6.36

Table 7.4.: Statistics of latencies (in µs) when accessing a PCI device connected via the south bridge under various loads.

Every test was executed for one minute and generated between 3 × 10⁷ and 2 × 10⁹ samples collected in a histogram. The median and percentile statistics were derived from the histogram with an accuracy of 0.02 µs. The idle system scenario does not execute any additional processes. The system load was varied with different generators, beginning with a memory transferring routine, a forking process and a timer signal benchmark loading the OS, and a file writing routine to generate I/O load. Since those load generators resulted in hardly any difference, the system load scenario is the concurrent execution of all those loads. Finally, a memory-mapped access to the network card's buffers with varying block size influenced the access to the PCI adapter connected to the south bridge. The PCI transfer scenario was executed with a wide range of memcpy() read transfers that were separated by a sleep time of 1 µs. Figure 7.14 shows the latency of the PCI device connected to the south bridge under various load scenarios. By far the most accesses are in the range 0.9 to 1.1 µs. The maximum of approx. 5.5 µs causes a very large jitter. Since the delayed events are three to four orders of magnitude rarer, the average is close to the minimum. The PCI memory-mapping data transfer with 4 Byte block size generates a slight increase of the events around 2.5 to 3.5 µs and some delays up to 36 µs. With a PCI transfer block size of 4 KiB, the outliers do not occur, but the median latency increases to 1.9 µs. The detailed statistics are listed in Table 7.4 (generated from a different benchmark execution, resulting in a slight variation of the experienced maximum latencies). To compare the different connections of peripheral devices, the histograms of all four tested devices under maximum load (system load and PCI transfers with 16 KiB block size) are compared in Fig. 7.15.
The PCI device connected to the south bridge experiences a very large jitter with occasional outliers of more than 30 µs. The histograms of the lower three figures for devices connected to the PCIe slots are the same for all different load scenarios. This proves that neither the PCIe devices in the real-time partition (right side in Fig. 7.13, as assumed) nor the devices connected to the ICH that forwards the PCI transfers from socket #0 to the south bridge (left side, rather not recommended) are influenced by arbitrary load on the system CPUs.

[Figure 7.14: four histograms of loop times (1–6.5 µs, events on a log scale up to 10⁸) for the scenarios idle (max. 5.69 µs), system load (max. 5.54 µs), PCI transfer 4 B (max. 36.02 µs; 16387 events > 6.5 µs), and PCI transfer 4 KiB (max. 6.33 µs).]

Figure 7.14.: Distribution of latencies of a PCI device connected via the south bridge under various loads.

If the device is connected to the PCIe slot of the real-time partition, the jitter of I/O port accesses on this test system is 0.8 µs. Real applications will most likely require reading and writing multiple I/O ports. If the worst-case latencies are added, the possible jitter increases with every I/O access. In practice, the experience from extended test series is much better than that. Figure 7.16 displays a histogram of an application benchmark that was modeled after a proprietary hardware-in-the-loop simulator. The test routine executes 6 read and 5 write I/O accesses as well as exchanging 2000 Bytes with a manager process. The run-time of the test series was 5 days; the test was repeated several times and yielded similar results. The jitter of this test routine is 1.6 µs, which is far better than the expected 11 × 0.8 = 8.8 µs obtained by adding the worst-case latencies. This observation can be explained by the causes of the delays being spread out, or at least never aggregating and hitting subsequent accesses. Additionally, the very long execution time supports the measurement-based approach for a tight WCET estimation.

7.7. Interpretation and Recommendations

While the control flow of a real-time application can be formally verified, the execution time of basic blocks on current x86 multi-processor systems depends on many factors. Some can be controlled (such as the influence of processes on other CPUs and I/O data transmissions), others are caused or at least influenced by hardware features that are hardly documented. Generally, the resulting execution time can only be measured under (heavier than) practical load. Since NUMA systems prove real-time capable both regarding memory and cache jitter and regarding I/O predictability, this architecture, initially developed to enhance HPC applications, is well suited for the isolated task concept. A wide range of matching systems is available and can be selected according to the needs (number of processors, proportion of system to real-time partition, number and type of peripheral connectors). In this chapter, only a single exemplary NUMA system based on the Intel Westmere micro-architecture with a Tyan S7025 main board was used for the practical measurements. The newer Intel micro-architectures Sandy Bridge and Haswell might have introduced different behavior and must be evaluated in a similar way before using them for hard real-time systems. But generally, the NUMA architecture with PCIe devices is a promising solution for jitter problems on x86 systems. However, it is hardly considered in real-time research so far. The concluding recommendations are:


[Figure 7.15: four histograms of loop times (1–6.5 µs, events on a log scale up to 10⁸) for a PCI device connected via the south bridge, PCI via a PCIe bridge, PCIe cross, and PCIe direct.]

Figure 7.15.: Distribution of latencies of different PCI and PCIe devices under maximum load.


[Figure 7.16: histogram of loop times (11.8–13.4 µs, events on a log scale up to 10¹⁰).]

Figure 7.16.: Distribution of execution times of an application benchmark task with 11 I/O accesses and 2000 Byte data exchange after an execution of 5 days (120 h).

• Group isolated processes on dedicated chips/sockets to avoid sharing their Last-Level Cache with non-isolated processes.
• Isolated tasks should limit their working set as far as possible to remain in the private caches.
• Avoid false sharing.
• Use as few shared variables as possible because they introduce jitter.
• Separate I/O transfers in a similar way as the memory placement.
• Select a main board with PCIe lanes connected to the dedicated hard real-time socket to avoid conflicting data transfers with the rest of the system.

It is emphasized again that this should not soften the requirement for a formal verification of hard real-time tasks. The code must be free of unbounded loops, uncontrolled synchronization, and other sources of unpredictability such as system calls to a GPOS. Only the estimation of the WCET of basic blocks can either be too pessimistic, and hence reject a feasible solution, or too tight, missing latencies that ruin deadlines in the real application. This hardware experience also applies to other real-time approaches like RTOSs or sub-kernels (e. g. Xenomai, Jailhouse).


8. Application Examples

Several example and demonstration applications were developed to examine the concept of isolated tasks. Bruns evaluated an early implementation to generate an I2S signal in software that was transmitted to an electronic circuit generating an audio output [Bru10]. The average output latency was 470 µs; the audio output frequency was set to 22 kHz 16-Bit stereo, hence one frame of 32 Bits (up to 64 serial I/O operations) had to be completed below 45 µs. That implementation on an Intel Core 2 Dual-Core processor suffered from periodic System Management Interrupts (SMIs) issued by the system's firmware [Wal10a] but was a successful demonstration apart from that. Alt implemented a demonstration application to fully exploit the concept with multiple cooperating tasks on a system with twelve processors [Alt13]. His system transmits data by low-level I/O port access and generates an image on an oscilloscope via Digital to Analog Converters (DACs). In a confidential cooperation, the concept was implemented in a commercial product for a hardware-in-the-loop simulator. That application realizes multiple concurrent isolated tasks with a hard real-time constraint below 20 µs running multi-day tests. Multiple systems are in productive use and fully conform to the expectations. The application benchmark used for the test series in the previous chapters was developed as a template for a real application but replaces real tasks with synthetic payloads. It features the start of several processes with various configurable loads that provide continuous feedback on their performance. A manager task controls a variable number of isolated tasks, each capable of executing different payloads and statistically recording their execution times. The benchmark implements most features described in Chapter 6. Several different approaches to technical challenges were implemented and compared. These experiences form the basis for the following suggested application architecture for the chemical experiment control system introduced in Section 1.1 and further outlined in Chapter 6. This description is used to connect all techniques presented in the previous chapters. Finally, it is shown how some application examples from the literature can be ported to the isolated task concept.

8.1. Hardware Selection

A capable base system, which was also used for the non-uniform memory and I/O access benchmarks in the previous chapter (test system haixagon, Appendix A.3.1), is a system with two sockets, each with a six-core Intel Xeon E5645 processor with 2.4 GHz on a Tyan S7025 main board. The system has 48 GiB of main memory distributed over two memory nodes corresponding to the processor sockets. The PCI Express (PCIe) Input/Output (I/O) is connected via two I/O Controller Hubs (IOHs) forming a QuickPath ring with the Central Processing Unit (CPU) sockets. Other peripheral devices are provided by an Interconnect Controller Hub (ICH) that is connected to the left IOH (Fig. 7.13), making it possible to divide the resources into a system and a real-time partition. Each CPU socket has local access to half of the installed memory and must access the remaining memory remotely. This platform does not execute time-triggered SMIs, which is a prerequisite for its eligibility as a hard real-time system. Basic tests have revealed the need for the following BIOS settings to minimize I/O latencies and to reduce jitter:

Hyperthreading must be deactivated to avoid interference from another hardware thread (logical processor) using the same CPU and private cache.

Intel Virtualization Tech is deactivated because it introduces an additional layer of indirection tables increasing the latency.

I/O Virtualization is disabled for the same reason.

Legacy USB Support should be deactivated because it uses SMIs to simulate a PS/2 keyboard from one connected via Universal Serial Bus (USB). This setting disables the use of a USB keyboard in the boot menu and should be applied only after the system is fully installed.

Power Management is deactivated to avoid automatic frequency switching or power-down triggered by the firmware.

This main board offers six PCIe slots, three connected to each IOH (north bridge). The single Peripheral Component Interconnect (PCI) connector is served by the ICH (south bridge) and is subject to large jitter (Section 7.6.2).
This connector should not be used for real-time I/O but can be used by applications in the system partition. Digital I/O (GPIO) and analogue converters (DAC and ADC) can be installed as PCIe expansion cards. If they are used by hard real-time tasks, they should be installed in the three connectors assigned to the real-time partition. Those used by soft real-time tasks or that are not time-critical can safely be installed in the other PCIe slots. If a Data Acquisition (DAQ) peripheral device is only available with a PCI interface, a PCIe-PCI bridge can be installed in the real-time partition. For the I/O, the suitability of available adapters must be evaluated depending on the application. The selection depends on the required number of General-Purpose I/O (GPIO), Analog to Digital Converter (ADC), and DAC connections. Further, the number and layout of PCIe and PCI slots must be balanced with the isolated tasks. Ideally, each peripheral device is controlled by a single isolated task. I/O adapters for real-time systems are often equipped with user-mode libraries that access the hardware without operating system (OS) interaction for the lowest latencies. It is crucial either to select devices for which the vendor offers such libraries or to verify carefully that the documentation allows implementing them.


8.2. Software Architecture

The software architecture for the control system of a chemical experiment is presented as a guideline for how such a system can be designed. This application was not implemented, but it demonstrates how to apply the experience gained during the implementation of the various concepts, benchmarks and demonstration systems. Some integration experience was gained during the benchmark implementation and project consultation.

[Figure 8.1: layer diagram; a system partition (OS, services, devices) on C0, an application partition (GUI, manager, calculation threads on system libraries and the OS) on C1–C5, and three isolated partitions 0–2 with bare-metal PLC tasks and their devices on C6–C8; the system and application partitions use memory node M0 with a shared cache, the isolated partitions use the real-time node's shared cache and memory node M1, which also holds the IPC buffer.]

Figure 8.1.: Layer diagram of example application.

The proposed software architecture of an application using the isolated task concept is displayed in Fig. 8.1. The processors are divided into a system partition on processor C0, an application partition spanning processors C1 to C5, and three distinct isolated partitions on the processors C6 to C8. With the experience from Chapter 7, the boundary between the system and application partitions and the isolated partitions should be aligned to the Non-Uniform Memory-Access (NUMA) nodes. On the example hardware presented in the previous section with two processor sockets, the system is divided into two halves. The processors C9 to C11 are not used and remain idle. The system and application partitions execute the Linux OS with the Real-Time Preemption Patch (RT-Patch). All interrupts and system services are assigned to the system partition on processor C0. The main part of the application is executed on the five CPUs of the application partition. For further optimization, the Graphical User Interface (GUI) can be assigned a dedicated processor while the calculation and manager threads are managed with preemptive priority scheduling by the OS on the remaining CPUs of that partition. The application may use all services and libraries offered by the installed Linux distribution. If required, the application can directly use kernel interfaces or the abstraction offered by the C library. The devices required by soft real-time tasks or without timing constraints are controlled by the OS's drivers and accessed by the application using the standard interfaces of the Linux kernel. All Programmable Logic Controller (PLC) tasks having hard real-time and low latency requirements are executed fully isolated and bare-metal on up to six dedicated CPUs of the real-time partition. In the example, three such bare-metal tasks are started in three dedicated isolated partitions. These tasks are implemented as threads of the main application, which eases the sharing of memory buffers. If the application requires a higher security level, unintentional access to shared variables can be impeded by implementing the isolated tasks as separate processes that share only the required Inter-Process Communication (IPC) buffers using e. g. System V Shared Memory. The system and application partitions use only local memory on their NUMA node. The isolated tasks restrict themselves as much as possible to their private caches. They also directly access their assigned peripheral devices for I/O. The single access outside these partitions is the IPC buffer located in the memory local to the isolated partitions. The isolated tasks using this buffer will cause cache misses that can be served locally. Only the manager task running in the application partition accesses the IPC buffer remotely, which is slightly slower but can be tolerated since it is not under hard real-time conditions. The communication functions between the main application and the isolated tasks should be implemented as a statically linked library that can be used by all members.
This reduces the risk of accidentally accessing unprotected shared variables, violating the communication protocol, or the chance of a deadlock. The user-mode I/O device access of the isolated tasks should also be implemented as simple linked library. The twelve CPUs of the system are partitioned with CPU-Sets. The system set confines all kernel threads and system services to processorC 0. The application partition spans the processorsC 1 to C5 intended for the user interface and the performance-critical threads. The manager task is also executed in that partition since it has less work during the time-critical application phase. The second socket with processorsC 6 to C11 is reserved for completely isolated tasks executed in up to six dedicated CPU-Sets. The system and application CPU-Sets use memory nodeM 0 and the isolated partitions are assigned to memory nodeM 1. The IPC buffers are allocated and initialized by the isolated tasks to place them in their memory node M1.
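The confinement of a single task to one CPU of its partition can be sketched with the standard Linux affinity interface (a minimal illustration, not the thesis implementation; the full CPU-Set mechanism additionally confines kernel threads and memory nodes):

```c
/* Hedged sketch: move the calling thread onto one CPU of its partition,
 * similar to what a CPU-Set placement achieves for a single task.
 * Assumes Linux with glibc; error handling is reduced to the return code. */
#define _GNU_SOURCE
#include <sched.h>

int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 addresses the calling thread; returns 0 on success */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

An isolated task would call such a function with its target CPU (e. g. C6) before initializing its buffers; unlike a plain affinity mask, the CPU-Set configuration also excludes all other tasks from that CPU.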


8.2.1. Graphical User Interface

The main part of the program is implemented as a Qt application. During start-up, the command line arguments are checked and stored in an internal data structure. Additionally, default settings are read from a configuration file. The main window displays the state of the application and allows the user to configure and start an experiment. All tasks are executed in other threads or processes to keep the interface responsive. The GUI and calculation tasks are executed together in the application partition spanning five CPUs. Their scheduling is managed by the OS and the compute-intensive threads may use priorities to employ soft real-time scheduling. The calculation tasks are started dynamically depending on the experiment configuration. IPC mechanisms from Qt (Signals and Slots1) or the C and UNIX libraries (e. g. Pipes [Ker10, Chap. 44] or Message Queues [Ker10, Chap. 46 and 52]) are used to interact with the child processes. By doing so, they can be given updated parameters, receive data to process and send back results. Depending on the requirements of the current experiment, new calculation tasks can be started or their scheduling settings can be adapted as needed.
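The pipe-based parameter exchange with a calculation child can be sketched as follows (illustrative names; the `params` structure is an assumption, not taken from the application):

```c
/* Hedged sketch of handing updated parameters to a calculation child
 * over a pipe, cf. [Ker10, Chap. 44]. A fixed-size struct keeps the
 * transfer to a single read()/write() pair. */
#include <unistd.h>

struct params {
    double setpoint;     /* illustrative fields */
    int    iterations;
};

int send_params(int write_fd, const struct params *p)
{
    return write(write_fd, p, sizeof(*p)) == (ssize_t)sizeof(*p) ? 0 : -1;
}

int recv_params(int read_fd, struct params *p)
{
    return read(read_fd, p, sizeof(*p)) == (ssize_t)sizeof(*p) ? 0 : -1;
}
```

Since writes of at most PIPE_BUF bytes are atomic, such a small struct always arrives as one whole message.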

8.2.2. Manager and Isolated Tasks

The manager task is a thread started from the main process at initialization time. It is executed in the application partition and configured for real-time scheduling together with the calculation tasks, but with a lower priority to give them precedence. If the manager needs to transfer high-volume data from its clients, it could also be placed in a dedicated manager partition to guarantee its execution time. The timing diagram of the setup, warm-up, experiment and shut-down phases is shown in Fig. 8.2. When the application is started, it moves itself to the application partition, initializes itself and starts the Manager task. (1) After user interaction to start the experiment, the GUI notifies the Manager. (2) The Manager creates the isolated threads successively and moves them to their partitions. They initialize themselves and then go into warm-up mode. The Manager also starts all other tasks. (3) When everything is set up, the Manager activates the experiment phase and notifies the GUI, which displays the new state to the user. (4) The end of the experiment is either initiated by the user in the GUI or by an abort condition. (5) The isolated tasks and calculation threads are notified to terminate; the latter just exit. (6) The isolated tasks wait to be synchronized so they terminate one after the other. During the whole run-time, various system processes may be executed in the system partition. In the experiment phase, the Manager task monitors the activity of the isolated tasks. It

1Online: http://qt-project.org/doc/qt-4.8/signalsandslots.html (visited November 12, 2015)



Figure 8.2.: Timing example: (1) User interaction to start. (2) Thread creation and isolation. (3) Activation of experiment. (4) User interaction to terminate. (5) Notification to exit. (6) Synchronized termination of isolations.

also forwards control points to the isolated tasks and statistical data from them. These jobs have a lower priority; therefore, the Manager could share its CPU with other tasks. The isolated tasks, started as POSIX threads, wait for the Manager task to notify them when they have arrived on their CPU. Then, they initialize their memory, IPC buffers and I/O resources. Memory buffers are by default allocated on the local NUMA node when they are first accessed (Affinity on First Touch, Section 2.2.4). Therefore, it is important to initialize all memory buffers when already running on the target CPU. After initialization, the isolated tasks notify the Manager, which in turn continues to set up the experiment. The isolated tasks start to warm up and stand by until the manager signals them to start their real-time jobs. The warm-up phase is important to load all used data into the private caches and to keep it there. The linked control routines need a warm-up mode that executes a calculation very similar to the real work but without any side effects. I/O accesses should also be included in the warm-up routine, but it must be ensured that this does not have any negative impact on the external physical system (plant). When the manager has registered that all isolated tasks and all calculation tasks are standing by, it releases them to start the real-time experiment.
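The affinity-on-first-touch initialization described above can be sketched as follows (assumptions: Linux default NUMA policy and 4 KiB pages; the helper name is illustrative):

```c
/* Hedged sketch: malloc() only reserves virtual memory; physical pages
 * are allocated on the NUMA node of the CPU that first writes them.
 * The isolated task therefore touches every page itself after arriving
 * on its target CPU so that all pages end up on the local node. */
#include <stdlib.h>

#define PAGE_SIZE 4096

void *alloc_local(size_t bytes)
{
    char *buf = malloc(bytes);
    if (buf == NULL)
        return NULL;
    for (size_t off = 0; off < bytes; off += PAGE_SIZE)
        buf[off] = 0;          /* first touch: fault the page in locally */
    return buf;
}
```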

8.2.3. Closed-Loop Control and I/O

The code for the closed-loop control is modeled with graphical algorithm design tools. These tools generate C code that is compiled before application start. The


Figure 8.3.: Example application I/O partitioning.

isolated tasks are not allowed to use system calls, their run-time must be predictably short and their memory foot-print (instructions and data) must be small enough to be kept in the private caches. Therefore, the calculation routines are evaluated during compilation with static code checkers to preclude actions violating the isolated task concept. The linker can generate a map file to analyze the memory usage of different parts of the application. This can be helpful for reducing the cache footprint of hard real-time tasks [Fog13b, Chap. 16]. Some of these properties are clearly detectable, others can only generate a warning to be analyzed more carefully at run-time. The application offers a mode to evaluate the linked control functions. For this, the functions are executed in a similar way as in the real-time experiment mode and supplied with test data. The execution time is analyzed statistically to be able to rate their suitability. During application start, this code is linked as a shared library. The isolated PLC tasks execute the closed-loop control function selected in the GUI. Parameters changed by the user must be handed to the isolated tasks in a synchronized way. At the end of an experiment, the isolated closed-loop control tasks are notified by a flag that they should terminate their work and unwind their isolation mode. In a coordinated manner, they deactivate the Kernel patch, revert the partitioning settings and quit. The partitioning is shown in Fig. 8.3 to illustrate the separation of data paths. The OS services, the GUI, Manager and Calculation threads use memory on their

node and devices in their partition (i. e. the left IOH and the ICH). These devices are controlled by OS drivers. The PLC threads in the isolated partitions use peripheral Data Acquisition adapters connected to the right IOH. They access the I/O devices with user-mode functions implementing the direct access. The only transfer across the partition limit is the Manager task accessing the IPC buffers.

8.2.4. Observation of Execution-Times

In this setup, the occurrence of timing violations must be detected because a violation can render the results of an experiment invalid. Therefore, all execution times are observed and compared to their allowed latencies. Timing errors are tracked by a counter. The Manager reads these values from the IPC buffer and reports them to the GUI. To support the evaluation of user-defined control functions, the isolated tasks generate a statistic of their execution times. This statistic must influence the execution times as little as possible. Therefore, it is kept light-weight with regard to memory foot-print and calculation overhead. The simplest solution is tracking only the maximum and minimum execution times, possibly adjoined with the average. A viable compromise between overhead and informative value is a histogram. The offset, number, and width of the recorded intervals are configurable to match the expected execution time and the resulting outliers. This allows flexibly activating the histogram for evaluation and removing its impact in time-critical situations.
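A light-weight statistic of this kind might look as follows (a sketch; the names, the bin count and the behavior for values below the offset are illustrative assumptions, not taken from the thesis code):

```c
/* Hedged sketch: min/max tracking plus a histogram with configurable
 * offset and bin width. Values below the offset fall into the first
 * bin; values above the covered range are collected in an outlier bin. */
#include <stdint.h>
#include <string.h>

#define NBINS 32

struct tstat {
    uint64_t min, max, count;
    uint64_t offset, width;     /* first bin starts at offset, bins are width wide */
    uint64_t bins[NBINS + 1];   /* bins[NBINS] collects outliers */
};

void tstat_init(struct tstat *s, uint64_t offset, uint64_t width)
{
    memset(s, 0, sizeof(*s));
    s->min = UINT64_MAX;
    s->offset = offset;
    s->width = width;
}

void tstat_record(struct tstat *s, uint64_t t)
{
    if (t < s->min) s->min = t;
    if (t > s->max) s->max = t;
    s->count++;
    uint64_t bin = (t < s->offset) ? 0 : (t - s->offset) / s->width;
    if (bin > NBINS) bin = NBINS;   /* clamp outliers above the range */
    s->bins[bin]++;
}
```

Each recording costs only a handful of arithmetic operations and touches a small, cache-resident structure, which matches the requirement of minimal influence on the measured execution times.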

8.2.5. Error Recovery

Generally, a real-time application must be implemented error-free because a failure would be the ultimate timing violation. But since all complex code is error-prone, and since this application supports linking user-supplied code for the closed-loop control functions, a strategy to handle faults and unbounded loops in this user-provided code is needed. Two main classes of programming errors can occur: signaled error conditions and unbounded loops. Usual program errors such as illegal instructions, division by zero, or access to memory regions without permission are reported to the application by a UNIX signal. Some signals can be handled and the program may continue, other signals only allow a graceful shut-down. With active isolated tasks, these should be shut down similarly to the end of a real-time experiment. Since this usual shutdown contains complex synchronization, the emergency shut-down follows similar steps but omits the waiting for other tasks. This is intended to restore the system with the highest likelihood of being able to start another real-time experiment without rebooting the whole system. If an unbounded loop is executed in a control function, the time-triggered task is blocked and will no longer execute or respond. But since the isolated task is not

influenced by other processors, a special recovery is required. With the isolation patch described in Chapter 5 and especially with the deactivation of the timer interrupt (Section 5.3), the Interrupt Flag (IF) does not have to be cleared and the isolation of a CPU can be activated from a different processor. This also allows deactivating the isolation of a task that was detected as stuck and terminating this thread with the usual functions such as pthread_cancel().
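The last step of this recovery can be sketched as follows (a reduced illustration: the deactivation of the isolation from the other processor is omitted, and the stuck control function is modeled by a plain endless loop):

```c
/* Hedged sketch: a thread stuck in an unbounded loop never reaches a
 * cancellation point, so it enables asynchronous cancelability during
 * its initialization; a manager thread can then terminate it with
 * pthread_cancel(). */
#include <pthread.h>
#include <stdatomic.h>

static atomic_int task_ready;

static void *stuck_task(void *arg)
{
    (void)arg;
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
    atomic_store(&task_ready, 1);
    for (;;)                      /* models a fault in user-supplied code */
        ;
    return NULL;
}

/* returns 1 if the thread was terminated by cancellation */
int cancel_stuck(pthread_t tid)
{
    void *res;
    pthread_cancel(tid);
    pthread_join(tid, &res);
    return res == PTHREAD_CANCELED;
}
```

In the real application, the manager would first deactivate the isolation of that CPU before issuing the cancellation.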

8.3. Other Applications

This section briefly outlines the migration path for applications presented as examples in other publications.

8.3.1. Application-Specific Integrated Circuit

Application-Specific Integrated Circuits (ASICs) are specifically designed Integrated Circuits (ICs) that can contain multiple processors, caches, I/O units and communication paths. The processors can be of different performance or even different architectures (heterogeneous system). The communication may use hardware queues and synchronization means. The architecture is usually optimized for a single application and the software for the various processors is adapted to them. The extra development costs for an ASIC are only sensible if the performance of general-purpose processors is not sufficient or if a very high number of units sold levels them out. During development (before the hardware is available), for small series, or for a lower price, the migration to Commercial off-the-Shelf (COTS) systems is reasonable. The concept of isolated tasks allows implementing the individual processing elements of an ASIC as bare-metal tasks with a General-Purpose Operating System (GPOS) running on the remaining CPUs. A further advantage of full-featured processors over Digital Signal Processors (DSPs) and Field-Programmable Gate Arrays (FPGAs) is that floating-point arithmetic is available at full execution speed and does not have to be circumvented with fixed-point algorithms [JCW12].

Figure 8.4 shows an example ASIC [Sch13, Fig. 3.10]. Its purpose is not revealed, but the architecture demonstrates the different jobs of the four processing elements. Processor 4 accesses only the input and can communicate with processor 3. It could serve as hard real-time input stream processing. The processors 2 and 3 access part of the memory using a common bus. They appear the most versatile. Processor 2 has hardware queues both for the input and the output to compensate for not being ready to process pending input and to send bursts of output faster than the receiver can process. The memory is split and only processor 2 can access the whole range. Independent of the real application domain of this ASIC, the hardware units can be mapped to the isolated task concept as follows. The software running on processors 1 and 4 is definitely executed in isolated partitions with assigned DAQ


Figure 8.4.: Example ASIC attributed to a Tensilica presentation [Sch13, Fig. 3.10].

adapters. The hardware queues are realized as wait-free ring buffers in the IPC memory section. The tasks handling I/O can also provide the data via additional message queues or shared variables to the other tasks. Depending on its timing requirements, processor 3 can be realized as an isolated task or as a soft real-time scheduled thread in the application partition. The main application running on processor 2 is moved to the application partition. It can be supported by a GUI and communicates with tasks in its partition via OS services and with the isolated tasks via shared-memory based IPC.
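Such a wait-free ring buffer can be sketched with C11 atomics for the single-producer/single-consumer case (a simplified illustration; the thesis places these buffers in the shared IPC memory section, which is omitted here):

```c
/* Hedged sketch of a wait-free SPSC ring buffer. Head and tail are
 * free-running counters; the capacity must be a power of two so the
 * modulo indexing stays correct across counter wrap-around. Neither
 * side ever blocks: full/empty is reported to the caller instead. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 64               /* power of two */

struct ring {
    _Atomic uint32_t head;         /* advanced by the consumer */
    _Atomic uint32_t tail;         /* advanced by the producer */
    uint32_t slot[RING_SIZE];
};

bool ring_put(struct ring *r, uint32_t v)   /* producer side */
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_SIZE)
        return false;              /* full: no waiting */
    r->slot[tail % RING_SIZE] = v;
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

bool ring_get(struct ring *r, uint32_t *v)  /* consumer side */
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return false;              /* empty */
    *v = r->slot[head % RING_SIZE];
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

The release/acquire pairing guarantees that a slot is fully written before the consumer can observe the advanced tail, without any lock or retry loop on either side.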

8.3.2. Network Packet Processing

An OS abstracts the hardware, thereby adding overhead. Since embedded systems are often designed for a single purpose without direct user interaction, they can waive the OS and execute on the bare metal to exploit the performance that the hardware allows [Lal13]. A recurring example is network packet processing in internet routers [DAR12]. Those have multiple input lines managed either by an Ethernet Network Interface Card (NIC) or by special field bus hardware; they need preprocessing, routing, and sending over another network [Lal13, Fig. 15.6, 15.7, and 15.13]. These systems are usually realized with special multi-processor System-on-a-Chip (SoC) hardware, but could also be transferred to the isolated task concept. For maximum throughput, each NIC installed as a PCIe adapter card is managed by an isolated task that implements the driver stack in user-mode. They use ring

buffers in the IPC section to communicate and distribute the packets. More complex handling or firewall rules can be assisted by threads in the application partition. This builds a slow path for packets that do not qualify for the simpler fast-path rules. This can, for example, be used with the intrusion detector Snort [Dom08, Chap. 8] to include flexible user-provided filtering rules. In an extreme form, data parallelism and pipelining can be interlaced depending on the throughput of the tasks. A single first stage could buffer incoming data and distribute it to N tasks of the second stage (which take longer for each packet, e. g. to decode), then two tasks of the last stage multiplex the packets and send them using two network adapters [Lal13, Fig. 15.13 and 15.14]. However, if the data moved between processors is too large, this can impose too much contention on the caches [DAR12]. In this case, it should be evaluated whether executing the stages of the pipeline on the same CPU or multi-core processor lowers the cache contention.

8.3.3. Medical Imaging

The combination of Positron Emission Tomography (PET) and Magnetic Resonance Imaging (MRI) improves the image quality because both technologies capture different materials of the human body [Wei+12]. The data processing poses challenges in terms of real-time and compute power to present a live image [Gol+13]. Another example is Amide2 (A Medical Imaging Data Examiner), a three-dimensional image processor. Domeika presents how to parallelize this software using the technique of Data Decomposition [Dom08, Chap. 7]. Those applications can generally be implemented using the isolated task concept to improve the timing predictability and get the most performance from the hardware. Since data is processed in streams, the tasks do not need OS services if the communication is based on shared-memory IPC. The Data Acquisition can be realized either with PCIe adapter cards or via NICs that receive data from preprocessing hardware [Gol+13]. The isolated tasks benefit from low-jitter execution and can use the compute time that is otherwise consumed by the OS itself.

8.3.4. More Examples

In robotic systems, increasingly high compute requirements are driven by more complex image recognition and behavior algorithms to make robots more environment-conscious. In mobile systems, the energy is limited by the battery weight and the power consumption of the installed controllers. A multi-processor system can benefit from consolidation effects. Bäuml and Hirzinger present the software architecture of such a complex mechatronic system and how it is implemented on a distributed system [BH08, Fig. 2]. They use multiple Linux systems for soft real-time and GUI jobs

2Online: http://amide.sourceforge.net (visited November 12, 2015)

and a two-processor QNX system for the hard real-time tasks. The communication is very complex and the partitioning should be optimized to reduce the crossing of partition limits [Moy13d, Sect. 6.3.1, p. 210]. Using the isolated task concept, the distributed system can be migrated to a single multi-processor system with different partitions. The systems shown in their figure can be mapped directly to partitions, where the Linux systems map to a single or multiple application partitions and the two processors of the QNX system map to two corresponding isolated partitions. A further use-case for isolated tasks is computing hardware research. As shown in Sections 2.1.8 and 2.1.10, the behavior of undocumented processor features is difficult to derive in the presence of concurrent execution [Fog13b]. Instead of special micro-kernels such as smp.boot (used in Sections 4.6 and 7.2), those benchmarks can also be executed in isolated tasks on arbitrary systems during normal operation.

9. Conclusion

In the area of hard real-time and high-performance computing on multi-processor systems, this work presents a new approach for realizing hard real-time tasks that execute on the same system as an application requiring performance and versatility. Further, a solution for the jitter problem on the x86 architecture is presented. The performance isolation [Des13a, Sect. 2.4.2, p. 129] of multiple partitions is realized by reserving some CPUs of a multi-processor system for bare-metal execution and partitioning this system along NUMA nodes to reduce mutual influence. This allows utilizing modern COTS systems, for example x86 servers with 8 to 20 processors and multiple PCIe connectors for Data Acquisition (DAQ) peripheral expansion cards. An arbitrary GPOS can be installed that supports the compute-intensive part of the application and its soft real-time requirements. The implementation of complex embedded systems must regard the hardware architecture, as was always the case when developing embedded systems based on Micro-Controllers (µCs) and especially Application-Specific Processors (ASPs) [Moy13a]. The x86 architecture tempts developers to employ the same methods as in general-purpose computing, but to realize hard real-time, the advanced methods of such systems must be applied. The previous chapters presented how this concept differs from previous approaches and how it was realized on an x86 NUMA system. This includes the modification of the employed Linux kernel to tolerate the isolation of some processors and a detailed analysis of the hardware behavior to confine the execution time jitter caused by other processors' execution and contending data transfers.

The advantages of using a commodity multi-processor server system with isolated processors over embedded µCs and DSPs are the availability of high computing performance at a superior cost ratio in the well-established x86 environment, allowing a homogeneous development tool-chain. Existing multi-threaded applications can easily be extended with hard real-time tasks on their dedicated processors without needing to reevaluate the timing of the base application. All available third-party libraries for GUI (e. g. Qt), image and video manipulation and analysis (e. g. OpenCV), network services and many more can be used in the application partition. During development, the flexible configuration and interactivity allow a fast turn-around without complex cross-building and remote debugging. Applications can be adapted to the user's needs by loading modifiable code generated with graphical tools like Simulink (e. g. to change PLC feedback loops or in Hardware-in-the-Loop (HiL) simulators). The IPC between multiple isolated tasks and with the soft real-time threads in the application partition is realized with low-latency and high-throughput

shared memory buffers and efficient spinlock or wait-free algorithms. Public services (e. g. web server, database access) can be realized with established and proven software, increasing the security without impact on the timing of the hard real-time partition. This also allows redistributing task groups to other systems (e. g. in a fail-over situation) without the need to reevaluate the timing.

The assignments listed in Chapter 1 are addressed as follows. The standard Linux OS is used on x86 hardware. A twelve-processor NUMA system was selected as the final evaluation hardware. Its two NUMA nodes are separated into system and application partitions on one side and multiple isolated partitions on the other. The system partition executes all usual OS services and internal maintenance jobs as well as handles the devices used by soft real-time tasks. The application partition supports the Bound Multi-Processing (BMP) approach of partitioning by reserving processors for selected soft real-time tasks. They may use all OS services but the system designer has finer control over what executes concurrently. The isolated partitions are single processors that are isolated from the OS. This means that a single process is executed without interruption or interference. The bare-metal execution prohibits using system calls, forcing a pure user-mode execution. The number and assignment of processors to partitions is flexible and can be modified between executions without restarting the whole system. Since the isolation of some CPUs from the Linux kernel is not tolerated by an unchanged OS, the causes of the observed problems were analyzed and addressed with slight modifications. The resulting Kernel patch allows a stable execution as intended. Since the bare-metal execution environment of isolated tasks limits their applicability, the most important functions of communication and accessing peripheral devices are restored by using shared-memory-based lock- and wait-free message queue algorithms and direct device access based on I/O ports and memory-mapped I/O. This supports the time-triggered paradigm that is the base of many control processes. Additionally, a way to execute interrupts in the isolated bare-metal environment is presented in Section 6.4 that also allows employing the event-triggered paradigm and even preemptive scheduling in an embedded Real-Time Operating System (RTOS). The timing behavior of this realization was analyzed theoretically and evaluated practically.

In the strict sense of its definition, hard real-time requires a formal verification. This includes the decomposition of a task into its basic blocks that build the Control Flow Graph (CFG). Based on branch probability, loop detection and count prediction, the paths through the CFG can be analyzed. For the execution time of each path, the execution time of the basic blocks must be determined. Although the blocks do not contain any branches, their run time is difficult to predict on hardware as complex as the x86 architecture. Many different approaches exist, but in the presence of shared functional units where contention imposes varying interference, this work settles on evaluating the Worst-Case Execution Time (WCET) of basic blocks practically. Systems using the x86 architecture use a range of structures to connect multiple processors to the same memory. In Chapter 7, multi-core, Uniform


Memory-Access (UMA) and NUMA systems are compared with respect to the predictability of memory and I/O accesses. So far, no other related work has shown the NUMA architecture to be capable of very predictable hard real-time execution in the presence of uncontrolled concurrent execution on other processors. For an Intel Westmere based system with twelve processors, it is shown how memory and I/O accesses can be taken into account in the partition design to realize hard real-time tasks as isolated bare-metal tasks with minimum jitter and high predictability while the other partition executes an unrestricted GPOS with arbitrary applications.
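The practical evaluation of basic-block execution times mentioned above can be sketched with the Time-Stamp Counter (an assumption-laden illustration: x86-64 with invariant TSC and GCC/Clang intrinsics; the lfence serialization follows common practice and is not the thesis's exact measurement code):

```c
/* Hedged sketch: read the TSC around a measured block. The lfence
 * instructions keep earlier and later instructions from drifting
 * into the measured region on Intel CPUs. */
#include <stdint.h>
#include <x86intrin.h>

static inline uint64_t tsc_now(void)
{
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}

static void example_block(void)   /* stands in for one basic block */
{
    volatile int x = 0;
    for (int i = 0; i < 1000; i++)
        x += i;
}

uint64_t measure_block(void (*block)(void))
{
    uint64_t start = tsc_now();
    block();
    return tsc_now() - start;
}
```

Repeating such a measurement many times and feeding the results into an execution-time histogram yields the practically observed distribution, including the outliers caused by contention.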

9.1. Summary

Multi-processor systems have long been employed for hard real-time systems with exceptionally high performance requirements using special OSs. Since the broad availability of multi-processor systems at all scales of computing, their application to real-time systems accelerates and triggers an increasing research interest. Chapter 3 presents a wide range of related research. The most important concepts are symmetric and Asymmetric Multi-Processing (AMP) software architectures employing either a single OS or multiple instances for different partitions. The Symmetric Multi-Processing (SMP) concept requires a special multi-processor-capable RTOS to support hard real-time. The drawback is often limited support of third-party libraries and limited portability of existing applications. The AMP concept uses multiple OS instances that are managed either by a Hardware Abstraction Layer (HAL) or by a hypervisor. In this setting, one partition can use a full-featured GPOS and the hard real-time tasks are executed on their own RTOS instances. The coordination and communication between the OSs must be provided by the lower layer. A related approach is the sub-kernel where the HAL layer is provided by the RTOS that executes the GPOS as its lowest-prioritized task. Those multi-OS-instance approaches are not considered in this work because of their limited portability.

The concept of isolated tasks initially aimed for an unmodified Linux system. Even if this proposition was breached by slightly patching the Linux kernel, any standard Linux distribution can be used as the base system, as presented in Chapter 4. Some settings should be configured in the firmware (BIOS or UEFI) to reduce their influence or to avoid problems caused by non-responding CPUs. The system is set up for the partitioning using CPU-Sets by moving all running system services into a system partition on a dedicated CPU. Further, all Interrupt Request (IRQ) handlers are bound to CPU #0 to ensure their execution when other CPUs block all interrupts. The task intended for isolation can be a process or a thread of a main application. It is started, moved to the target CPU-Set and isolated by clearing its IF. This effectively inhibits all external interrupts so that the process itself runs continuously. By adhering to general hard real-time coding standards (e. g. allocating and initializing

all required memory in advance to avoid page faults) and not using system calls, it is executed bare-metal on its own CPU.

Extensive experiments have revealed system instabilities during isolated execution lasting longer than several minutes. These were analyzed and their causes are addressed in Chapter 5. The circumstances of system blocking are difficult to record because of the instability and unreliability of the system in that condition. The recording was finally done from a remote system that collected extensive information for automated evaluation. The main problems could be located in synchronously waiting Inter-Processor Interrupts (IPIs) and the Read-Copy-Update (RCU) service. It is shown how the IPI sending routines can be located and how they should be modified to inhibit sending IPIs to currently isolated processors. The RCU system and other subsystems are notified using the same mechanism that prepares hot-plugging a CPU (i. e. physically turning it off or restarting it). Since the kernel is changed anyway, a possibility to block all interrupts to an isolated processor without clearing the IF becomes feasible. After routing all IRQs to a different CPU and deactivating all IPIs, only the timer interrupt remains. It can easily be deactivated, which allows leaving the IF unchanged in order to allow selected interrupts again or to recover a hung task. With the presented user-mode interrupt handling, interrupts do not execute Linux code, which preserves the timing predictability. This enables event-triggered tasks and even preemptive scheduling in an embedded RTOS. The modifications and infrastructure to activate isolation and support debugging, realized for various versions between Linux 2.6.31 and Linux 3.12, in total require the modification of six files.

The bare-metal execution environment is very limited in its applicability. Chapter 6 presents how management, IPC and access to peripheral devices can be realized from the user-mode of an isolated task. Shared memory is a hardware feature that imposes no overhead or OS interaction after set-up. Threads naturally share global variables within their process; distinct processes can allocate shared memory regions (e. g. System V shared memory). During initialization, those IPC buffers are set up and allow a task of the main application to manage multiple isolated tasks and to synchronize their start and shut-down. Using wait-free queue algorithms, tasks of different real-time capabilities can exchange messages without the risk of blocking a higher-prioritized one. It is further shown how DAQ expansion cards can be accessed from isolated tasks to interact with physical signals. The applicability of the isolated task concept is presented with practical examples in Chapter 8. A chemical experiment control system is used as a prototypical example that integrates most methods presented in this work. A variety of smaller examples complements how the concept can be exploited.
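The System V shared-memory set-up mentioned above can be sketched as follows (a reduced illustration: a forked child stands in for the isolated task, and the segment uses a private key instead of a named one):

```c
/* Hedged sketch: the parent creates an IPC segment, the forked child
 * writes into it, and the parent reads the value back after the child
 * exits. After shmat(), accesses are plain memory operations without
 * any OS interaction. */
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

int shared_roundtrip(void)
{
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id < 0)
        return -1;
    int *buf = shmat(id, NULL, 0);
    if (buf == (void *)-1)
        return -1;

    if (fork() == 0) {            /* child: stands in for an isolated task */
        buf[0] = 1234;            /* write into the shared IPC buffer */
        _exit(0);
    }
    wait(NULL);                   /* parent: the manager side */

    int seen = buf[0];
    shmdt(buf);
    shmctl(id, IPC_RMID, NULL);   /* release the segment */
    return seen;
}
```

In the real application, the synchronization of start and shut-down would of course not rely on process exit but on flags and wait-free queues inside such a segment.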


9.2. Evaluation

Together with the fundamental principles of hardware, OSs and real-time computing in Chapter2, some micro-benchmarks and their results on current x86 hardware are presented. This allows to set the following results into proportion. A first evaluation of the isolated task concept on a commodity desktop Personal Computer (PC) is presented in Section 4.6. Together with a formal verification, this demonstrates the feasibility of the approach. In course of analyzing the causes of system blocking in Chapter5, further methods of practically measuring the application performance and real-time quality are presented. The CFG of isolated tasks can be verified formally. This results also in the CFG of the entire CPU because an isolated task is neither interrupted nor is it allowed to call unknown functions (such as system calls). Missing for a complete WCET estimation is the (worst-case) run-time of the basic blocks. Previous publications on determining the execution time of basic blocks on multi-processor systems either restricted the applied hardware or required to control tightly what is executed concurrently on other CPUs that share functional units. Otherwise, the estimation becomes overly pessimistic. However, all related work so far misses to evaluate the NUMA architecture for this use. The extensive evaluation of architectural influences in Chapter7 includes multi-core, Simultaneous Multi-Threading (SMT), UMA and NUMA systems. Multi-core processors contain multiple CPUs in the same package that usually share the Last-Level Cache (LLC). It is shown, how the inclusive caching mechanism of Intel processors causes the eviction of cached items from the private cache of another processor which increases the unpredictability. To allow arbitrary applications to be executed on the other processors, the approach of controlling their execution presented in other publications is not followed here. 
Instead, UMA and NUMA systems providing multiple sockets, each with distinct LLCs, are evaluated. In the former, the cache management of the employed micro-architecture (Intel Core 2) exerts unacceptable influence on the other CPUs. The solution is to use a NUMA architecture that includes the Memory Controller (MC) in the processor package and connects multiple sockets with the I/O chip set using point-to-point interconnects (Intel processors are based on the QuickPath Interconnect since the Nehalem generation and AMD processors use HyperTransport since 2003). Those systems can be partitioned along NUMA node limits. By restricting the partitions to using their local memory, they do not interfere with the other partition. It is still possible to share memory regions since all memory is available in a global address space; only remote access is slightly slower and imposes a greater risk of unpredictable jitter. Extensive benchmarks have shown that the execution in the application partition does not influence the hard real-time tasks in the isolated partitions. In a related approach, it was tested whether the logical processors (SMT) of the application partition can be activated without interfering with the isolated tasks. But this

imposes functional risks, probably caused by internals of the Linux kernel. This was not traced further. After successfully separating the memory transfers to avoid contention, the NUMA architecture also allows separating I/O accesses. The interconnect between the sockets supports a ring structure that is implemented in some systems with multiple chip set bridges. The major test system haixagon has two IOHs that provide three PCIe connectors each. They are assigned to the adjacent partitions so that the isolated tasks access their DAQ expansion cards without crossing I/O transfers of the system and application partitions. The ICH for other I/O devices (e. g. USB, NIC, graphics) is located in the system partition so that the OS does not interfere with the isolated tasks. In a company cooperation, the isolated task concept was adapted for a commercial product. An existing multi-threaded application with sensitive soft real-time requirements based on Linux-rt was moved to the application partition without increasing the missed deadlines. It was then extended with multiple isolated tasks that access PCIe DAQ devices with hard real-time demands and a period below 20 µs. The access to the memory-mapped I/O in the range of 12 to 13 µs takes the major amount of time. The control algorithms and data exchange with the main application fit into the remaining time span. The product is successfully deployed and executes multi-day simulations. Advantages of bare-metal tasks compared to other approaches are:

• Use of the well-known x86 environment (tool-chain, optimization, debugging).

• Availability of a wide range of third-party libraries (in soft real-time parts of the application), e. g. OpenCV (image and video processing), Qt (Graphical User Interfaces).

• Adaptability (during development, adaptive algorithms, prototypes).

• Easy replacement of code (using available and familiar tools), even during runtime (or in very short downtimes).

• Communication between multiple processors via shared memory (high throughput and low latency, manageable application architecture).

• Integration of digital control and interactive services (e. g. via a web server) on the same system using established software (security) while avoiding impact on the timing of real-time tasks.

• Sharing resources is more economical than dedicated resources [MFC01].

• Fail-over: Redistribution of task groups (applications) to other systems without the need of re-evaluation [MFC01].


• Control of every detail of the system due to the open source GPOS.

• Execution of threads of the same application as isolated tasks, easing application development.

• Provision of bare-metal tasks with hard real-time execution for repeatable execution of control tasks, but also for development, test, and debugging.

9.3. Future Perspectives

Generally, the concept of isolated bare-metal tasks can similarly be realized on other multi-processor architectures and other OSs. The features required of an alternate GPOS are the availability of interfaces for process and interrupt partitioning (equivalent to CPU and IRQ affinity), basic real-time support (e. g. memory locking to avoid the eviction of pages), user-mode enabling of IF manipulation and I/O access, availability of shared memory or at least threads, and the configuration of other interfering services such as deactivating NMI watchdogs. The use of synchronous IPIs in other GPOSs is very likely; therefore, it is most probably required to have the source code available to be able and allowed to modify the OS itself. If certain features are not available, some can be realized with direct hardware access from user-mode (e. g. the interrupt routing can be set directly in the I/O APIC). In Section 5.4.2, an alternate approach based on the Hotplugging system is described. This is a hardware-independent feature of Linux that allows physically deactivating a processor, e. g. to replace a defective unit. By imitating the procedure of waking up a processor during system start or after Hotplugging, a CPU can also be started and initialized by another function without notifying the OS, resulting in this processor executing an isolated task bare-metal without OS interference. This should be portable to every OS as a kernel module (hardware driver) and would mostly be based on hardware features and configuration. The current realization of the concept is a C library that includes system setup and isolated task configuration. For a more flexible implementation and to simplify the porting of existing applications, a POSIX interface could be implemented that provides the POSIX.4 real-time extensions to isolated tasks. The Kernel feature RCU is also available as a user-mode implementation [MDL13]. It supports five modes with different overhead/memory/real-time trade-offs.
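The GPOS features listed above map onto concrete Linux interfaces. The following sketch shows the user-mode set-up steps with the standard calls (sched_setaffinity, mlockall, iopl); it is a minimal illustration, not the thesis library's actual initialization code, and the error handling as well as the CPU number passed by the caller are reduced examples.

```c
/* Sketch of the user-mode set-up an alternate GPOS must enable,
 * shown with the Linux interfaces named in the text (x86, glibc).
 * Illustrative only; not the thesis library's real initialization. */
#define _GNU_SOURCE
#include <sched.h>
#include <sys/mman.h>
#include <sys/io.h>

static int prepare_isolation(int cpu)
{
    cpu_set_t set;

    /* 1. Process partitioning: pin this task to its dedicated CPU. */
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;

    /* 2. Basic real-time support: lock all current and future pages
     *    to avoid major page faults caused by swapping.            */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;

    /* 3. User-mode IF manipulation and port I/O: I/O privilege
     *    level 3 permits in/out and cli/sti (requires root).       */
    if (iopl(3) != 0)
        return -1;

    return 0;
}
```

Interrupt partitioning (IRQ affinity) is configured separately via /proc/irq/*/smp_affinity and is therefore not part of this per-process sketch.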
The variants not using system calls during run-time (i. e. after setup) can be applied to isolated tasks. It is further possible to use scripting languages to control hard real-time tasks [Yod99]. Klotzbücher and Bruyninckx work on implementing hard real-time tasks in Lua [KSB10] and present an example applying it to robotics [KB11] as a component-based system [TLH97; Kop98; Kop97, Chap. 2.2]. Further, the runtime of existing real-time programming languages like Ada or PEARL could probably be implemented in an

isolated bare-metal task. This would further ease the flexible implementation or transfer of existing applications. Kiszka and Wagner model security risks in an RTOS. Processes have separate address spaces, which protects them from other processes but not from the OS [KW07]. As every isolated partition has full control of its CPU, the hard real-time guarantee also protects from malicious or error-prone code in other processes and the OS. So far, the framework is trusting: every process (with the required privilege) can isolate itself or terminate isolated hard real-time tasks. The required privilege is to be executed as root user. As long as a malicious user has no such privilege, the isolated tasks should be secure, but this must be evaluated in further detail [Lei+07]. Besides the computational protection by partitioning the processors and address spaces of isolated tasks, the I/O access could be protected by using the IOMMU to prevent other processes, applications and devices (e. g. via Direct Memory Access (DMA)) from accessing memory and I/O ports that are assigned to isolated tasks [Des13a, Sect. 2.4.2, p. 128].

9.3.1. Hardware

The x86 architecture is very common and available in flexible configurations. It is difficult to predict where the future development of the x86 architecture will head (apart from the high probability of more processors per system). Current shared-memory systems (UMA and NUMA) are cache-coherent, which means that the caches of all processors are held in a consistent state by hardware, transparently to the executed programs. They can access memory as if no caches were present. This imposes a great overhead that rises with the number of processors. Therefore, Intel has searched for alternatives with the Single-Chip Cloud Computer (SCC) research processor that waives cache coherence [SCC10]. In such a system, every processor has a private cache and accesses only its own memory partition, effectively forming a distributed system (No Remote Memory-Access (NoRMA) cluster). To cooperate, a Message-Passing mechanism is realized based on special scratchpad memories (called message-passing buffers, MPBs). This cluster of single-processor systems requires special programming. It is currently unknown whether this architecture will be implemented in future x86 systems. If so, it would provide natural performance partitioning and thus a convenient base for the isolated task concept. A desirable feature of future x86 based processors is a cache-locking mechanism to avoid the eviction of cache lines of other cores as observed in inclusive caches on Intel processors. The possibility to deactivate the Last-Level Cache would require a memory access on every cache miss but would avoid the influence of other processors. Offering weaker consistency models would provide flexible control over when caches are synchronized between processors. Fog predicts the implementation of features currently inherent to SoCs, such as programmable logic (FPGA), in general-purpose

processors some day. This would enable an application to define specific instructions coded in a hardware definition language [Fog13b, Sect. 2.1]. Alternate hardware must provide multiple processors that can be assigned directly to certain tasks. Capable replacements include the ARM and MIPS architectures, which both provide multi-processor support and are increasingly offered for server systems. To realize DAQ, the systems must either provide electrical interfaces (General-Purpose I/O, Digital to Analog and Analog to Digital Converters) or allow installing expansion cards with modern point-to-point interfaces such as PCIe. The positive experience made with the NUMA architecture is most probably transferable to other systems with separated memory and I/O nodes that allow partitioning the transfers on the system interconnect to avoid conflicts and contention. An important factor for the implementation of predictably timed IPC algorithms is the availability of atomic operations for user-mode applications. Architectures such as ARM provide only the Load-Linked/Store-Conditional (LL/SC) operation, which does not necessarily allow implementing wait-free queues. Similarly, transactional memory as introduced with the Intel Haswell micro-architecture is not suited for hard real-time. A conditional store as well as a transaction may fail and must be repeated, which may occur arbitrarily often. The Silvermont micro-architecture1 (released end of 2013) is advertised to be more powerful at the same power consumption than competing ARM processors currently found in smartphones and tablets. This emphasizes the applicability of the x86 architecture in performance-critical embedded systems. However, currently ARM succeeds not only in typical µCs, but also as processor (or SoC) in general-purpose computers (like netbooks and cheap computers like the Raspberry Pi and the BeagleBone Black) and most mobile systems (smartphones and tablets).
The isolated task concept was tested on the two-processor Panda Board [Ras11]. The ARM big.LITTLE concept integrates multiple powerful processors with the same number of power-efficient, weaker (slower) processors. The simplest approach to OS support is to use either group of them and to switch depending on performance requirements. But a heterogeneous system could also be realized with a load-balancer that moves each task to the processor best serving its performance-to-efficiency requirements. Such a system could be used for bare-metal tasks, selecting either the powerful or the efficient processors (or a mix of them). Fast interrupts [Tho14] on ARM could be used to implement a low-overhead user-mode interrupt mechanism. Generally, for all embedded systems with sensitive requirements (in reliability, timing predictability, performance, etc.), any new hardware must be thoroughly evaluated. In x86 systems, the SMI poses a challenge or even renders a system inapplicable. Besides hardware jitter, the documentation of processor and chip set must be consulted for other risks. In the likely case of too scarce documentation, a full evaluation using benchmarks as described in this work is advisable.

1 http://heise.de/-1857164
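The distinction drawn above between wait-free and merely lock-free atomic operations can be made concrete in C11. This is a generic illustration, not code from this work: the wait-free variant compiles to a single atomic read-modify-write instruction (lock xadd on x86), while the compare-and-swap loop — the only option when emulating fetch-and-add via LL/SC — may retry arbitrarily often under contention.

```c
/* Contrast between a wait-free and a merely lock-free counter
 * increment, illustrating why native atomic read-modify-write
 * operations matter for predictably timed IPC. Generic example. */
#include <stdatomic.h>

/* Wait-free: completes in a bounded number of steps regardless of
 * other CPUs — on x86 this is a single lock xadd instruction.    */
static int increment_wait_free(_Atomic int *counter)
{
    return atomic_fetch_add(counter, 1);
}

/* Lock-free only: the compare-and-swap may fail and be retried
 * arbitrarily often under contention. LL/SC-based architectures
 * face the same unbounded retry when emulating fetch-and-add.    */
static int increment_lock_free(_Atomic int *counter)
{
    int old = atomic_load(counter);
    while (!atomic_compare_exchange_weak(counter, &old, old + 1))
        ;                       /* retry count is unbounded */
    return old;
}
```

For a hard real-time task, only the first form yields a bounded worst-case execution time; the second gives no upper bound on the number of loop iterations.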


9.3.2. Linux

Since the implementation of the Kernel patch up to Linux 3.12 described in Chapter 5, the development of the Linux kernel has paced very fast, with a new version approximately every 10 to 12 weeks. The tickless feature was already presented in Section 3.2.1. It was introduced in Linux 3.14 and extended later [McK14a]. This modification has similar requirements on the targeted processors as this work. This mode is activated whenever only a single process is active on a CPU, and it aims for minimum interruption by the timer interrupt and other Kernel systems. These changes might replace the modifications described in Chapter 5 if it can be guaranteed that only a single task is placed on a CPU and that this task does not issue unwanted system calls. In particular, it must be evaluated whether the Inter-Processor Interrupts are securely and completely eliminated. Since Linux 3.7, the Kernel contains a park facility for kernel threads [Gle12] that is used by the Hotplugging system. Instead of terminating those kernel threads on isolated processors, the park mechanism could probably be used to avoid problems when recreating them after an isolation (Section 4.2.3). Currently, the semantics of memory locking in Linux just avoid memory being swapped out to background storage (i. e. commonly a hard disk) so that no major page fault caused by a memory access can result in hard-disk reading or writing. But locking a memory region does not guarantee that the pages remain mapped to fixed physical addresses. Therefore, the Kernel might move the physical page frames for compacting or to optimize NUMA placement. Besides the requirement of fixed physical addresses in DMA transfers, this can result in minor page faults as well as cache and Translation Look-aside Buffer (TLB) misses unsuited for hard real-time tasks. It is currently discussed to implement pinning to fix the physical placement of memory regions [Cor14b].
Thus, with the current Linux kernel, the advice to mlock() all memory used by isolated tasks still holds true, but in future versions, the locking could be modified to allow moving pages in the physical address space. This could result in unpredictable latency. When this is changed in the future, the notion of pinning will be required to guarantee the current behavior of not moving a memory region. Since Linux 3.11, work queues can contain tasks that may be scheduled on another CPU. So far, if an isolated CPU does not issue work queue jobs, it will not have to handle any [Cor13h]. That may change in the future, and a configuration must be found to avoid executing work queue tasks on isolated CPUs. Linux control groups are currently implemented as the successor to CPU-Sets. The partitioning of future Linux versions will therefore be based on control groups, which may impose other side-effects. The porting to a new Linux version depends strongly on the integrated new features. Some transfers were trivial; other versions required a completely new evaluation of new interrupt sources that had to be handled by the Kernel modification. If the
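The mlock() advice can be sketched as follows; a minimal illustration assuming a statically allocated buffer of arbitrary example size. Note that, as discussed, locking only prevents swapping (major faults) — it does not pin physical page frames.

```c
/* Sketch of the "mlock all memory" advice: lock and pre-fault a
 * buffer so that neither a major nor a minor page fault occurs
 * inside the real-time path. Buffer and size are illustrative.   */
#include <string.h>
#include <sys/mman.h>

#define RT_BUF_SIZE (1 << 20)   /* 1 MiB example buffer */

static char rt_buffer[RT_BUF_SIZE];

static int lock_and_prefault(void)
{
    /* Lock current and future mappings: pages are never swapped
     * out, so no major page fault can hit the real-time phase.   */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;              /* e. g. missing CAP_IPC_LOCK     */

    /* Touch every page once so all mappings exist before the
     * real-time phase starts. mlock does not guarantee fixed
     * physical addresses — that would need pinning (see text).   */
    memset(rt_buffer, 0, sizeof(rt_buffer));
    return 0;
}
```

Unprivileged processes are limited by RLIMIT_MEMLOCK, so real deployments run this set-up as root or raise the limit beforehand.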

implementation for a Linux kernel is available, the concept is however well suited for applications that require the lowest latency that a bare-metal execution can provide as well as multiple cooperating tasks demanding soft real-time and high compute performance.


A. Test Systems

The following test systems were used in the course of this work. The hardware details are listed further below.

Name      Micro-architecture   Processors
poodoo    Core 2 Multi-Core    1 × 2
octopus   Core 2 UMA           2 × 4
xaxis     Nehalem Multi-Core   1 × 4
aix       Westmere NUMA        2 × 4
haixagon  Westmere NUMA        2 × 6
pandora   Sandy Bridge NUMA    2 × 8
devon     AMD K10 NUMA         2 × 4

Table A.1.: Test system overview.

A.1. Intel Core 2

The Core micro-architecture was introduced in 2006 as the successor of the Netburst (Pentium 4) micro-architecture, which became too complex and power-inefficient at high frequencies above 4 GHz. The Core 2 micro-architecture is a minor enhancement that gained a wide distribution. Some previous processors included Simultaneous Multi-Threading (SMT) providing two logical processors, but beginning with the Core 2 generation, most general-purpose processors contained two or four physical cores. The quad-core processors are assembled from two double-core silicon chips, hence they have two separate Last-Level Caches.


Figure A.1.: Core 2 double- and quad-core cache and memory structure.


A.1.1. Test system poodoo

This system is a desktop PC with a double-core processor (Fig. A.1, left).

Description                  Per entity               Total
CPU type                     Intel Core2 Duo E6750 2.66 GHz
Physical cores               1 × 2                    2
L1 Cache                     2 × 32 KiB               64 KiB
L2 Cache                     1 × 4 MiB                4 MiB
Memory                       1 × 6 GiB                6 GiB
Main board                   ASUS P5KR
Memory Controller Hub        Intel P35
Interconnect Controller Hub  Intel ICH9R

Table A.2.: Test system poodoo.

A.1.2. Test system octopus

The test system octopus is a server system with two quad-core chips, thus a total of eight processors connected in a Uniform Memory-Access (UMA) architecture (Fig. 7.4 on page 163).

Description                  Per entity               Total
CPU type                     2× Intel Xeon X5355 2.66 GHz
Physical cores               2 × 2 × 2                8
L1 Cache                     8 × 32 KiB               256 KiB
L2 Cache                     2 × 2 × 4 MiB            16 MiB
Memory                       1 × 4 GiB                4 GiB
Main board                   Intel X3420GP
Memory Controller Hub        Intel 5000P
Interconnect Controller Hub  Intel ESB2-E

Table A.3.: Test system octopus.

A.2. Intel Nehalem

The Nehalem micro-architecture contains four processor cores on a chip and integrates the Memory Controller with three memory channels on the same die. It supports three levels of caches with a shared L3 cache. Each core can be split into two logical processors (Intel Hyper-Threading), but this feature is deactivated on our systems.



Figure A.2.: Nehalem cache and memory structure.

A.2.1. Test system xaxis

This is a desktop computer with the first-generation Intel Core i7 processor.

Description                  Per entity               Total
CPU type                     Intel Core i7 Quad-Core 920 2.667 GHz
Physical cores               1 × 4                    4
L1 Cache                     4 × 32 KiB               128 KiB
L2 Cache                     4 × 256 KiB              1 MiB
L3 Cache                     1 × 8 MiB                8 MiB
Memory                       1 × 3 GiB                3 GiB
Main board                   Intel DX58SO
I/O Controller Hub           Intel X58
Interconnect Controller Hub  Intel 82801JI (ICH10 Family)

Table A.4.: Test system xaxis

A.3. Intel Westmere

The Westmere micro-architecture is a minor enhancement of Nehalem, most notably a shrink of the die structure from 45 nm to 32 nm. Our test systems of this processor generation are all Non-Uniform Memory-Access (NUMA) systems with two sockets.



Figure A.3.: Westmere Non-Uniform Memory-Access (NUMA) cache and memory structure.

A.3.1. Test system haixagon

The test system haixagon was used for most analyses. Notable system parameters are listed in Table A.5. The block diagram of the system architecture is shown in Fig. A.4.

Description                  Per entity               Total
CPU type                     Intel Xeon Hex-Core E5645 2.4 GHz
Physical cores               2 × 6                    12
L1 Cache                     12 × 32 KiB              384 KiB
L2 Cache                     12 × 256 KiB             3 MiB
L3 Cache                     2 × 12 MiB               24 MiB
Memory                       2 × 24 GiB               48 GiB
Main board                   Tyan S7025
I/O Controller Hub           2× Intel 5520
Interconnect Controller Hub  Intel ICH10R

Table A.5.: Test system haixagon.



Figure A.4.: Block diagram of haixagon.

A.4. Intel Sandy Bridge

The Sandy Bridge micro-architecture is a major improvement over Nehalem and Westmere, but the modifications are in the areas of Instruction-Level Parallelism (ILP) and instruction set extensions (e. g. AVX) that are of minor relevance to this work. The most important change is that Sandy Bridge processors provide more PCIe connections. The previous generations started this trend, but the number of point-to-point links did not suffice for an average system and was extended by the chip set (IOH). The Sandy Bridge micro-architecture allows building systems with only a single chip set component that provides the legacy I/O (ICH, former south bridge, see Fig. A.5).

A.4.1. Test system pandora

The test system pandora is a server with two NUMA nodes. The main board provides seven PCIe connectors for peripheral expansion cards and should be suitable for the partitioning of I/O transfers along NUMA nodes.


Description                  Per entity               Total
CPU type                     Intel Xeon Eight-Core E5-2650 2.0 GHz
Physical cores               2 × 8                    16
L1 Cache                     16 × 32 KiB              512 KiB
L2 Cache                     16 × 256 KiB             4 MiB
L3 Cache                     2 × 20 MiB               40 MiB
Memory                       2 × 32 GiB               64 GiB
Main board                   Supermicro X9DRG-QF
Interconnect Controller Hub  Intel C602

Table A.6.: Test system pandora.


Figure A.5.: Block diagram of pandora.

A.5. AMD

A.5.1. Test system devon

This is a server system for compute- or memory-bound loads with only a single PCIe expansion slot. The micro-architecture is AMD K10. This was an early test system for the NUMA architecture and was only used for the comparison of caching strategies.


Description     Per entity                  Total
CPU type        AMD Opteron Quad-Core 2376 2.3 GHz
Physical cores  2 × 4                       8
L1 Cache        8 × 64 KiB                  512 KiB
L2 Cache        8 × 512 KiB                 4 MiB
L3 Cache        2 × 6 MiB                   12 MiB
Memory          2 × 16 GiB                  32 GiB
Main board      Poweredge SC1435
Chip set        Broadcom HT-2100/HT-1100

Table A.7.: Test system devon.


Figure A.6.: Block diagram of devon.


B. smp.boot Kernel

In embedded systems, the term bare-metal execution describes the situation that a program runs directly on a processor, i. e. without an operating system. In that setting, the program has to manage the hardware itself, and no OS comforts like files, input/output drivers, or networking are available. On the other hand, the environment is very controllable and the overhead of unused functionalities can be avoided. Architectures for embedded systems (µCs) are usually very easy to set up (boot and configure) for bare-metal programs. However, the x86 architecture – although backwards compatible to the 8086 – is far more difficult to initialize and operate. To support benchmarking low-level code in a strictly controllable environment, the lightweight smp.boot Kernel was implemented. This Kernel boots and initializes a multi-processor system in 32 or 64 bit protected mode. Paging is set up, and all processors can be started (configurable) to execute a benchmark payload. The system does not activate interrupts, therefore it is well suited for research on the hardware behavior in a strictly controllable environment. All execution is in kernel-mode. The initialization of the hardware is based on Bran's Kernel Development tutorial [Fri05] and the web sites OSDev.org1 and Lowlevel.eu2. The multi-processor initialization is mainly based on the processor documentation [Intel13c]. The smp.boot Kernel is published as open source software3.

1 Online: http://wiki.osdev.org/Main_Page (visited November 12, 2015)
2 Online: http://lowlevel.eu/wiki/Hauptseite (visited November 12, 2015)
3 Online: https://github.com/RWTH-OS/smp.boot (visited November 12, 2015)


C. Examples for Real-Time systems

Here, some more examples are listed that are suited for the concept of isolated tasks on x86 multi-processor systems because they require hard real-time and have a high demand for compute performance [Tan12].

• Robotics: a wide range of jobs from fine-grained control of movements up to compute-intensive routing, map generation and image analysis [YSL09; KB11; SSL09; BH08]. In mobile robots, an additional challenge is the energy efficiency and weight reduction.

• Driver support (assisted driving) and self-driving cars [GF07; SFR12].

• Industrial control systems for time-critical control of a machine and visualization [Kis09].

• Video-based sorting (e. g. quality check of produced goods, separation of unattractive vegetables, routing of letters and parcels) [SFR12].

• Baggage transportation at airports: read bar-code, query database for flight information (gate), redirect to correct conveyor belt.

• Engine Control Unit: at 1500 RPM, one degree of crankshaft rotation takes 111 µs [McK08]. More realistic in current car engines: up to 10 000 RPM: 16 µs.

• Flight simulation: distributed simulation [PB01; Rob+03; XX11] can be realized on multi-processor systems with multiple concurrent isolated tasks synchronizing via shared memory.

• Steam-Boiler [ABL96]: a formal problem specification [Abr96] was given to start a competition to advance specification and validation. The result is a Programmable Logic Controller (PLC), but could be part of a larger system.

• Hardware-in-the-Loop (HiL) integration of electric ships [Paq+09] and HiL simulation [PB01; LQ12].

• Medical imaging, e. g. X-Ray [TLH97; Intel08b] or Positron Emission Tomography (PET) [Sch+11]: 3D image processing [Dom08, Chap. 7] or real-time number crunching for live images during surgical intervention.


• High-frequency trading [Cor13d].

• Controlling a Radio Telescope [Bül+04].

• Renewable Energy Research Hardware-in-the-Loop testbed [Mol+13].

• Software-defined Radio [Wan+13].

• Behavior research: systems for interactive test series presenting stimuli and timing the reaction of test subjects [Fin01]. Often, the timing constraints are in the range of 1 ms [PHW02; Ste06], but if the latency must be lower and the compute performance higher, this concept could be helpful.

• Control of mechatronic systems under consideration of friction and mass inertia for rapid prototyping or educational purposes [Wei11].

• Active Noise Control: cancellation or reduction of audio noise by generating a phase-inverted signal. For multiple sources, the math is complex and must be computed in real-time (bounded by the propagation speed of sound).

• Direct control of telecommunications and networking devices e. g. during development or for research purposes.

• Avionic systems requiring hard real-time control tasks and a high compute performance with low weight and energy consumption [BS97].

• Applications currently realized with specially designed multi-processor Systems-on-a-Chip (SoCs) (Application-Specific Processors, ASPs) based on Digital Signal Processors (DSPs) or Field-Programmable Gate Arrays (FPGAs) [Sch13, Fig. 3.10 and 3.19].

• Gaming consoles: the Playstation 3 was based on a special processor (Cell BE) which was tried in High Performance Computing (HPC) but did not succeed widely because of its complicated programming model. The current generation of HPC clusters is based on PC processors. This serves as an example that Application-Specific Processors (ASPs) are pushed aside by general-purpose processors.

• 3D printing: instead of pre-computing the casting process, the path of the printing head could be determined on the fly based on the input model and a simulation of the heat dissipation in the created structure.

• Embedded multimedia systems that traditionally have been implemented with custom hardware are increasingly built with Commercially off-the-Shelf (COTS) hardware and a General-Purpose Operating System (GPOS) [Fue+12].

• Hidden Markov Models for speech recognition, lip-reading and gesture recognition [Nak+12]. These applications are usually classified as soft real-time, but if the performance of a processor shared with the operating system (OS) does not suffice, they can be implemented as bare-metal tasks.

• Cryptography: encrypt or decrypt a data stream before transmitting over high-throughput connections (e. g. ocean cable, satellite) [Tan+12].
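The engine-control timing quoted in the list above can be verified with a small helper (a toy calculation for illustration, not code from the cited sources): one revolution takes 60/RPM seconds, divided over 360 degrees of crankshaft rotation.

```c
/* Time budget per degree of crankshaft rotation at a given engine
 * speed: 60e6/rpm microseconds per revolution, over 360 degrees.  */
static long us_per_degree(long rpm)
{
    return 60 * 1000000L / rpm / 360;
}
```

At 1500 RPM this yields 111 µs per degree, and at 10 000 RPM about 16 µs — matching the deadlines stated in the list.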


Acronyms

µC Micro-Controller

µs Microsecond

ADC Analog to Digital Converter

ALU Arithmetic Logical Unit

AMP Asymmetric Multi-Processing

AP Application Processor

API Application Programming Interface

APIC Advanced Programmable Interrupt Controller

ARM Advanced RISC Machines

ASIC Application-Specific Integrated Circuit

ASP Application-Specific Processor

BCET Best-Case Execution Time

BMP Bound Multi-Processing

BP Base Pointer

BSP Bootstrap Processor

CAS Compare-and-Swap

CFG Control Flow Graph

CISC Complex Instruction-Set Computer

COTS Commercially off-the-Shelf

CPL Current Privilege Level


CPU Central Processing Unit

CU Control Unit

DAC Digital to Analog Converter

DAQ Data Acquisition

DI Device Interrupt

DMA Direct Memory Access

DPL Descriptor Privilege Level

DSP Digital Signal Processor

EDF Earliest Deadline First

FAD Fetch-and-Add

FIFO First-In, First-Out

FPGA Field-Programmable Gate Array

GHz Gigahertz

GiB Gibi-Byte

GPIO General-Purpose I/O

GPOS General-Purpose Operating System

GUI Graphical User Interface

HAL Hardware Abstraction Layer

HiL Hardware-in-the-Loop

HMI Human-Machine Interface

HPC High Performance Computing

HPET High-Precision Event Timer

HT HyperTransport

Hz Hertz


I2C Inter-IC Communication

I/O Input/Output

I/O APIC Input/Output Advanced Programmable Interrupt Controller

IC Integrated Circuit

ICC Interconnect Controller

ICH Interconnect Controller Hub

IDT Interrupt Descriptor Table

IF Interrupt Flag

ILP Instruction-Level Parallelism

IOH I/O Controller Hub

IP Instruction Pointer

IPC Inter-Process Communication

IPI Inter-Processor Interrupt

IRQ Interrupt Request

kB Kilobyte

kHz Kilohertz

KiB Kibi-Byte

L1$ First Level Cache

L2$ Second Level Cache

L3$ Third Level Cache

LAN Local Area Network

LIFO Last-In, First-Out

LL/SC Load-Linked/Store-Conditional

LLC Last-Level Cache

localAPIC Local Advanced Programmable Interrupt Controller


MB Megabyte

MC Memory Controller

MCH Memory Controller Hub

MES Manufacturing Execution Systems

MHz Megahertz

MiB Mebi-Byte

MIMD Multiple Instructions Multiple Data

MIPS Microprocessor without Interlocked Pipeline Stages

MMU Memory Management Unit

MP Message-Passing

MPMD Multiple Programs Multiple Data

ms Millisecond

NIC Network Interface Card

NMI Non-Maskable Interrupt

NoRMA No Remote Memory-Access

ns Nanosecond

NUMA Non-Uniform Memory-Access

OS Operating system

PC Personal Computer

PCI Peripheral Component Interconnect

PCIe PCI Express

PI Platform Interrupt

PLC Programmable Logic Controller

PMC Performance Measurement Counter

ppm Parts per million


QPI QuickPath Interconnect

RCU Read-Copy-Update

RDMA Remote Direct-Memory Access

RISC Reduced Instruction-Set Computer

RMA Remote Memory Access

RPM Revolutions per Minute

RT-Patch Real-Time Preemption Patch

RTOS Real-Time Operating System

SCADA Supervisory Control and Data Acquisition

SIMD Single Instruction Multiple Data

SISD Single Instruction Single Data

SMI System Management Interrupt

SMM System Management Mode

SMP Symmetric Multi-Processing

SMT Simultaneous Multi-Threading

SoC System-on-a-Chip

SP Stack Pointer

SPARC Scalable Processor Architecture

SPMD Single Program Multiple Data

SSH Secure Shell

TAS Test-and-Set

TDMA Time-Division Multiple Access

TI Timer Interrupt

TID Task ID

TLB Translation Look-aside Buffer

UMA Uniform Memory-Access

USB Universal Serial Bus

WAN Wide Area Network

WCET Worst-Case Execution Time

x86 80x86

x86_32 80x86 (32 Bit)

x86_64 80x86 (64 Bit)

XCHG Exchange

Glossary

80x86 (x86) Architecture of all Intel 80x86 compatible processors. It can further be separated into the 32 Bit and the 64 Bit sub-architectures.

80x86 (32 Bit) (x86_32) Sub-architecture of x86 with 32 Bit instruction set.

80x86 (64 Bit) (x86_64) Sub-architecture of x86 with 64 Bit instruction set.

Actor Influences a plant when triggered by a signal (e. g. magnetic valve, motor, heater), i. e. converts numerical values into physical quantities. Input of a physical system, output of a controller.

Advanced Programmable Interrupt Controller (APIC) Part of the x86 architecture that handles interrupts from peripheral devices and between CPUs. It consists of a localAPIC in each CPU and an I/O APIC in the chip set.

Advanced RISC Machines (ARM) Processors using the Reduced Instruction-Set Computer (RISC) architecture provided by ARM Holdings and produced by various companies. Very common in mobile systems and multimedia devices.

Affinity Binding a process or interrupt handler to a specific processor (or set of processors).

Aligned A memory region is aligned if its address is a multiple of its size. Some instructions require or work faster with aligned data.

Analog to Digital Converter (ADC) Voltage measurement that generates a digital number proportional to the input. Required to process sensor data in a computer.

Application Software serving a greater purpose, consisting of possibly multiple processes and threads.

Application Binary Interface (ABI) Defines how compiled programs interact with the system and with each other.

Application Processor (AP) The processors waiting during system start to be initialized by the Bootstrap Processor.

Application Programming Interface (API) The definition includes function names, data structures and rules how programs interact with libraries or operating systems.

Application-Specific Integrated Circuit (ASIC) Special IC developed for a specific application.

Application-Specific Processor (ASP) Special processor developed for a specific application possibly including multiple CPUs and other functional units such as buses, hardware message queues, caches, etc.

Architecture The processor type distinguished by instruction set. The term “archi- tecture” is also applied to software and to the entire hardware system with buses, memory and devices.

Arithmetic Logical Unit (ALU) The functional unit of a CPU that actually calcu- lates. It can be instructed to execute various arithmetic and logic operations on given arguments stored in registers. It is controlled by the Control Unit.

Asymmetric Multi-Processing (AMP) Multiple operating system instances run concurrently on a multi-processor system on dedicated CPUs.

Atomic A single instruction or function that always executes inseparably; no intermediate states are visible to other processors.

Automation pyramid Leveled model of components in a complex automated sys- tem.

Bare-metal Execution of low-level software directly on the hardware, i. e. without other software layers such as libraries or operating system below.

Base Pointer (BP) Register in x86 systems commonly used to manage the current stack frame.

Basic block Part of code with single entry and exit points that contains no loops or branches and thus executes linearly.

Best-Case Execution Time (BCET) The minimum time a function or basic block can technically require to execute.

Bit mask Variable or register where every bit is interpreted as an on/off switch. Allows setting arbitrary combinations of bits.

Bootstrap Processor (BSP) The processor that starts first to initialize a system. It wakes up the Application Processors.

Bound Multi-Processing (BMP) Symmetric Multi-Processing(SMP) system with tasks bound to dedicated CPUs.

Branch Low-level instruction to conditionally continue the execution elsewhere or with the subsequent instruction. See also: jump.

Bus Type of interconnect with all devices connected to the same link. Only a single device may send at any time. It places the target address and data on the bus and the receiving device reads the data. All others should ignore that transmission (but could “snoop”).

Cache A fast but rather small memory located in the CPU that transparently holds the most recently used data to accelerate subsequent accesses.

Cache coherence Hardware feature in multi-processor systems to synchronize all private caches to present a view on shared data as if no caches were involved (transparency). If two caches hold a copy of the same variable and one changes, the other must either update or invalidate.

Cache line Smallest entity that the caches are organized in. Typically 32 or 64 Bytes. Every access to smaller elements still requires a full cache line to be transferred to/from the lower level cache or main memory.

Cache miss If a requested cache line is not present in the cache, the data must be loaded from the level below or from main memory.

Cell level Middle level of the automation pyramid supervising multiple Programmable Logic Controllers of the field level.

Central Processing Unit (CPU) A single processor that executes code independently from others.

Chip set The ICs supporting the CPU, e. g. north bridge and south bridge.

Commercially off-the-Shelf (COTS) A computer assembled from readily available commodity parts.

Company level Top level of the automation pyramid managing the components of the cell level.

Compare-and-Swap (CAS) Atomic operation: Compare variable with given constant and swap the variable with a new value if they are equal. Generalization of Test-and-Set (TAS).

Complex Instruction-Set Computer (CISC) Processor architectures with a large instruction set. Many operations can be implemented with fewer instructions (than on RISC) but some instructions take longer to execute.

Compute Unit (CmU) A functional unit containing one or more tightly integrated processors sharing some resources (nomenclature of AMD).

Context switch Changing between user- and kernel-mode, usually for interrupt handling or to switch to a different task.

Control Flow Graph (CFG) Directed graph representation of a computer program. The nodes are basic blocks and the edges represent the possible flow of control from block to block. The CFG is used in compiler optimization and execution time estimation.

Control Unit (CU) The functional unit of a CPU that reads the program instructions and manages the registers, Arithmetic Logical Unit (ALU), memory, and I/O.

Core A single processor in a multi-core package, usually sharing the Last-Level Cache with others. This term is used explicitly for multi-core processors, otherwise, “CPU” or “processor” denominates a unit independently from the package.

CPU-Set Linux feature to generate process containers that can be assigned to a group of CPUs. Alternative to process affinity.

Critical section A section of code where shared resources are used that must be protected from concurrent access.

Cron Service of the operating system to execute maintenance work periodically (every hour, day, week, etc.).

Current Privilege Level (CPL) The privilege level a process currently executes in. A CPL of 0 is kernel-mode, 3 is user-mode.

Cycle One period of the clock driving a processor; the cycle time is the inverse of the clock frequency. In a current computer with 2 GHz clock frequency, a cycle is 0.5 ns.

Data Acquisition (DAQ) Digital and analog input and output of physical signals, including General-Purpose I/O (GPIO), Analog to Digital Converter (ADC) and Digital to Analog Converter (DAC).

Deadline scheduling Real-time scheduling algorithm based on defined deadlines and run-times for every task.

Deadlock Situation where multiple tasks wait for a Semaphore and block each other mutually.

Deadlock avoidance Pessimistically grant access to a Semaphore only if no subsequent code path can induce a deadlock.

Deadlock detection Detect a deadlock situation to recover subsequently.

Deadlock prevention Design a system in a way that deadlocks are not possible.

Descriptor Privilege Level (DPL) The privilege level of a memory page or segment. A process with a Current Privilege Level below or equal to it can access this resource. A DPL of 0 is commonly used for kernel-space, DPL 3 for user-space.

Device Interrupt (DI) Interrupt triggered by a peripheral device, generally an IRQ.

Digital Signal Processor (DSP) Integrated Circuit (IC) for signal processing, simpler than a general-purpose processor but highly optimized for specific applications.

Digital to Analog Converter (DAC) Configurable voltage generator proportional to a digital number. Required to drive an actor.

Direct Memory Access (DMA) A functional unit able to copy memory regions independently from the CPU. Typically used for devices like communication adapters and storage.

Distributed system A group of systems not sharing memory and clock.

Distribution Collection of Linux kernel and system software to provide a functional operating system.

Earliest Deadline First (EDF) Strategy for deadline scheduling: Always selects the task with the nearest deadline.

Event-triggered Paradigm of real-time programming: All tasks are activated on events (interrupts).

Exception Interrupt triggered by the processor on errors during the execution of instructions.

Exchange (XCHG) Atomic operation: Exchange register with variable.

False sharing If two variables are located in the same cache line and modified by different processors, the whole cache line bounces between those CPUs which reduces the throughput drastically although both work on different variables.

Fault Exception that allows execution to continue after handling the error condition, e. g. swapping a displaced page back in (page fault).

Fetch-and-Add (FAD) Atomic operation: Add a given value to a variable and return its previous value.

Field bus Short-distance bus network optimized for time-critical applications used to connect PLCs in a factory.

Field level Lowest level of the automation pyramid directly interacting with physical systems (plants).

Field-Programmable Gate Array (FPGA) An Integrated Circuit reconfigurable by software to create arbitrary logic for special purposes.

First Level Cache (L1$) First level of the cache hierarchy, typically private for each processor.

First-In, First-Out (FIFO) A buffer strategy where the sequence of the objects is retained, e. g. a message queue.

Flag Binary setting, usually a bit in a bit mask.

Flags register Register holding bit flags like the Interrupt Flag. The flags are set by executing instructions (e. g. comparison, arithmetic) or dedicated instructions and used by conditional branch instructions.

Formal verification Proof of properties of a real-time program by analytical and mathematical methods.

General-Purpose I/O (GPIO) Digital I/O interface that can be configured to be input or output.

General-Purpose Operating System (GPOS) An operating system for usual desktop and laptop usage like Windows, Mac OS and Linux.

Gibi-Byte (GiB) Binary prefix for 2^30, circa one billion bytes. 1 GiB = 1024 MiB = 2^30 B = 1 073 741 824 B.

Gigabyte (GB) One billion bytes, 1 GB = 1 000 MB = 1 000 000 000 B.

Gigahertz (GHz) One billion Hertz, 1 GHz = 1 000 MHz = 1 000 000 000 Hz. The period is 1 ns.

Graphical User Interface (GUI) An interactive program displaying its state and requesting user input.

Hard real-time Variant of real-time where all deadlines must strictly be met. This usually requires a formal verification.

Hardware Abstraction Layer (HAL) Software layer that offers a unified interface to a variety of devices.

Hardware-in-the-Loop (HiL) Simulation of a plant to test a closed-loop feedback controller. Can be used for development and debugging when the physical plant is not available or during successive integration.

Hertz (Hz) Unit of frequency: per second.

High Performance Computing (HPC) Throughput-oriented large-scale computing.

High-Precision Event Timer (HPET) Functional unit of x86 systems supporting a higher timer interrupt precision.

Human-Machine Interface (HMI) Automation component of the Company level; displays the state of the factory and allows configuring general production parameters.

Hybrid paradigm Paradigm of real-time programming: Mix of time-triggered and event-triggered.

HyperTransport (HT) System interconnect based on point-to-point links used by current AMD processors.

Hypervisor Manager of virtualized guest operating systems that have the impression of running directly on their own systems. Allows executing multiple instances of operating systems that are not capable of, or aware of, sharing a real machine.

I/O Controller Hub (IOH) Intel’s naming scheme for the chip set part that handles PCIe connections after the Memory Controller was moved to the processor. Former Memory Controller Hub.

I/O port Control registers of the x86 architecture accessed with IN and OUT instructions.

Input/Output (I/O) Input and output of data from/to devices or external signals.

Input/Output Advanced Programmable Interrupt Controller (I/O APIC) Part of the chip set that receives electrical signals and forwards them as Interrupt Requests to the CPUs.

Instruction Pointer (IP) Register holding the address of the currently executing instruction.

Instruction-Level Parallelism (ILP) Optimization of single processors to execute multiple instructions out-of-order and/or in parallel. The program need not be adapted to benefit.

Integrated Circuit (IC) A silicon chip with electronic components such as transistors constructed to serve a special purpose. Examples range from simple logic operators and signal amplifiers up to CPUs.

Inter-IC Communication (I2C) A two-wire bus to connect Integrated Circuits.

Inter-Process Communication (IPC) The synchronization and communication of multiple processes.

Inter-Processor Interrupt (IPI) An interrupt issued by one processor to another.

Interconnect Communication structure between processors and devices (chip set, memory) in a computer system.

Interconnect Controller (ICC) Functional unit that manages the physical layer of the interconnect access.

Interconnect Controller Hub (ICH) Intel’s naming scheme for the chip set providing legacy I/O devices such as PCI, USB, and Network Interface Card. Former south bridge.

Interrupt Interrupts are functions that can be triggered by external events (hardware or other execution context) and that intercept the currently executing code, store its context before running and restore and continue it afterwards.

Interrupt Descriptor Table (IDT) Table of the interrupt handlers on the x86 architecture.

Interrupt Flag (IF) Bit in the x86 Flags register to mask (block) hardware inter- rupts.

Interrupt Request (IRQ) A hardware interrupt that requests handling by the operating system.

Intrinsic function Built-in function that is implemented by the compiler rather than linked from a library.

Isolation Protecting a process or CPU from effects caused by others.

Jitter Measure for the variability of latencies, difference between minimum and maximum latency.

Job Smallest entity of work, a task consists of one or multiple jobs.

Jump Low-level instruction to unconditionally continue the execution elsewhere. See also: branch.

Kernel Central program of an operating system that supervises all processes and manages the hardware.

Kernel module Dynamically loadable module that allows adding drivers or subsystems to the Linux kernel. Modules can use an Application Programming Interface (API) that is more stable than integrating code directly in the Kernel, but they must restrict themselves to the offered interfaces.

Kernel thread Thread executing in kernel-mode for internal services and maintenance of the Linux kernel.

Kernel-mode Code executing in the context of the kernel with raised privilege.

Kernel-space Memory range reserved for use by the kernel.

Kibi-Byte (KiB) Binary prefix for 2^10, circa one thousand bytes. 1 KiB = 2^10 B = 1 024 B.

Kilobyte (kB) One thousand bytes, 1 kB = 1 000 B.

Kilohertz (kHz) One thousand Hertz, 1 kHz = 1 000 Hz. The period is 1 ms.

Last-In, First-Out (LIFO) A buffer strategy where the last (topmost) object is removed first, e. g. a stack.

Last-Level Cache (LLC) Last level of the cache hierarchy, typically shared between all cores of a package. This term can be used independently from the number of levels.

Latency Measure of the timespan between trigger and action. In network transfers, constant time offset for sending small amounts of data.

Linux kernel Core (kernel) of the Linux operating system.

Load-Linked/Store-Conditional (LL/SC) Pair of instructions for an atomically executed section. Alternative to atomic instructions when their implementation would be too complex on the hardware architecture.

Local Advanced Programmable Interrupt Controller (localAPIC) Part of the processor that receives interrupt messages from the I/O APIC and controls their handling in the CPU.

Local Area Network (LAN) Network to connect local systems in a building.

Local memory Memory directly connected to a node in a shared memory system.

Lock-free Property of an algorithm that does not require locks but may fail and must retry. See also wait-free.

Main board Electronic circuit board hosting the processor(s), the chip set and connectors for peripheral devices.

Mainline Kernel The standard Linux kernel as published on www.kernel.org. Also referred to as ‘Vanilla Kernel’. Distributions usually provide extended or specially configured Linux kernels.

Manufacturing Execution Systems (MES) Automation component of the Company level.

Many-core Term to describe future multi-core processors with many more cores than today. The increasing processor count poses a scalability challenge.

Mebi-Byte (MiB) Binary prefix for 2^20, circa one million bytes. 1 MiB = 1024 KiB = 2^20 B = 1 048 576 B.

Megabyte (MB) One million bytes, 1 MB = 1 000 kB = 1 000 000 B.

Megahertz (MHz) One million Hertz, 1 MHz = 1 000 kHz = 1 000 000 Hz. The period is 1 µs.

Memory consistency Definition or contract how the accesses of different processors to the memory are ordered and detected by others.

Memory Controller (MC) Functional unit that translates bus requests into memory access commands.

Memory Controller Hub (MCH) Intel’s naming scheme for the chip set part that contains the Memory Controller and PCIe connections. Former north bridge.

Memory Management Unit (MMU) Translates virtual into physical addresses.

Memory-mapped I/O Control registers embedded in the memory address space accessed with common load/store instructions.

Message queue Data structure and synchronization object for a buffered transfer of asynchronous messages in FIFO order to a recipient (Message-Passing).

Message-Passing (MP) Programming paradigm for High Performance Computing where multiple processes are executed on separate systems without access to each other’s main memory. Communication and synchronization are done by explicitly exchanging messages via high-performance interconnects.

Micro-architecture Subclass of the same architecture to distinguish generations, optimizations, and features.

Micro-Controller (µC) Integrated circuit containing a processor, memory, and I/O devices. It allows implementing digital controllers with few external components.

Micro-kernel Minimal operating system kernel with the highest privilege, other functions are realized as processes/servers.

Microprocessor without Interlocked Pipeline Stages (MIPS) Processor architecture with a RISC instruction set.

Microsecond (µs) One-millionth of a second, 1 µs = 10^-3 ms = 10^-6 s.

Millisecond (ms) One-thousandth of a second, 1 ms = 10^-3 s.

Monitor Synchronization object that allows access to shared resources only by functions and implicitly protects critical sections.

Multi-core In the scope of this work: emphasizes multiple processors in a single package. If multi-core processors are combined in a Non-Uniform Memory-Access system, each multi-core becomes a NUMA-node.

Multi-processor A computing system with multiple CPUs. In the scope of this work, this term is used independently from the processors being built in a single package or installed in multiple sockets.

Multiple Instructions Multiple Data (MIMD) Flynn’s classification for computer structures: Multiple autonomous processors independently execute programs processing different data. This includes multi-processor shared-memory systems and clusters (distributed systems).

Multiple Programs Multiple Data (MPMD) Programming paradigm for Multiple Instructions Multiple Data (MIMD) systems: Multiple programs or different functions execute different jobs (functional parallelism). On shared-memory systems, this can be realized with threads or multiple processes, on distributed systems, remote procedure calls can be used.

Mutex Mutual Exclusion. A binary Semaphore used to allow only one party to enter a critical section.

Mutual exclusion The concept of allowing access to a shared resource if no others use it concurrently. A typical implementation is the Mutex.

Nanosecond (ns) One-billionth of a second, 1 ns = 10^-3 µs = 10^-9 s.

Network Interface Card (NIC) Peripheral device to connect a system to a network (e. g. Ethernet).

No Remote Memory-Access (NoRMA) A cluster architecture where nodes can not directly access another node’s memory.

Node Part of a computer system, consisting of one or multiple processors and memory. The nodes can be part of a shared memory system (UMA or NUMA) or be connected to a distributed system.

Non-Maskable Interrupt (NMI) An interrupt that can not be masked (blocked) by the Interrupt Flag. Can be triggered by hardware (Interrupt Request) or another processor (non-maskable Inter-Processor Interrupt).

Non-Uniform Memory-Access (NUMA) A system architecture with memory access timing depending on the location of the physical memory page.

North bridge Classic part of the chip set containing the Memory Controller.

Open source Licence model that provides a software with source code and allows modifying and redistributing it.

Operating system (OS) The system software controlling the hardware and servicing programs. The operating system commonly includes a kernel and system software.

OS noise Deviation in the execution time due to interrupts and preemptively scheduled tasks that influence the caches and other functional units.

Out-of-order Property of Instruction-Level Parallelism: a complex pipeline can reorder the instruction stream to better utilize multiple execution units (Arithmetic Logical Units) if the operations are independent (e. g. do not use the result of the previous instruction as input).

Page fault Fault (exception) when accessing an invalid virtual address. The handler has the chance to fix the problem by swapping a displaced page back in or by providing more space.

Page table Data structure used by the Memory Management Unit to resolve a virtual to a physical address. Managed by the operating system to protect the address space of processes.

Para-virtualization Type of virtualization where the guest knows about its hypervisor and restricts itself to using the offered interfaces instead of directly manipulating the hardware.

Partition Separation of a system into multiple independent parts.

Parts per million (ppm) Relative quantity of 10^-6, e. g. used to state clock accuracy.

PCI Express (PCIe) Successor of PCI with higher speed.

Performance Measurement Counter (PMC) Special function registers of processors to count specified events.

Peripheral Component Interconnect (PCI) High-speed bus to connect peripheral devices.

Personal Computer (PC) Computer system platform originally presented by IBM and rebuilt by many others. Typically using an x86 processor and in the form of a stand-alone desktop computer or notebook.

Physical address Real address in the main memory.

Plant Physical system, e. g. a machine or factory, supervised by a controller.

Platform Interrupt (PI) Interrupt triggered by the hardware platform and handled by the firmware without interaction or influence of the OS.

Point-to-point Type of interconnect where devices have multiple links to other devices and routers may connect more complex structures.

Portable A software is considered portable, if it can be transferred easily to other OSs or hardware.

PowerPC Performance Optimization With Enhanced RISC – Performance Computing. Processor architecture with a Reduced Instruction-Set Computer (RISC) instruction set.

Preemption Capability of an operating system to interrupt a task at any time to activate another one. The preemptibility is a measure of how quickly and predictably the preemption can be done. If the preemption can be blocked for certain times, the preemptibility is lowered.

Priority inversion A lower-priority task holds a Mutex and blocks a higher-priority task that waits for the same resource.

Private cache Cache that is only used by a single processor.

Process Execution unit of an operating system with private address-space protected from others.

Processor In the context of this work a single execution unit that runs independently from others, also Central Processing Unit (CPU). A silicon chip (IC) or package may contain multiple processors.

Programmable Logic Controller (PLC) Automation component of the Field level, typically embedded systems for closed-loop control.

QuickPath Interconnect (QPI) System interconnect based on point-to-point links used by current Intel processors.

Read-Copy-Update (RCU) A synchronization mechanism that elides locks to minimize cache misses.

Real-time Computing programs that not only depend on the correct result, but also on it being available in time.

Real-Time Operating System (RTOS) A special class of operating system for real-time systems.

Real-Time Preemption Patch (RT-Patch) Modification of the mainline Linux kernel to improve its predictability and make it more suitable for soft real-time applications.

Reduced Instruction-Set Computer (RISC) Processor architectures with a small, concise instruction set. This requires more instructions to implement algorithms but simplifies the Control Unit and Arithmetic Logical Unit accelerating the execution of a single instruction (compared to CISC).

Remote Direct-Memory Access (RDMA) Use of a DMA engine to transfer data to an RMA buffer of a different system.

Remote memory Memory connected to a different node in a shared memory system. Can be accessed with the same instructions but is slower (increased latency and lower throughput) than local memory.

Remote Memory Access (RMA) Memory Segments of a different system mapped to the local address space can be accessed with load and store instructions like local memory. The transfer is transparently handled by special hardware. Different access times result in a Non-Uniform Memory-Access system.

Revolutions per Minute (RPM) Frequency commonly used for combustion engines; 60 RPM = 1 Hz.

Ring buffer Data structure of equally sized elements used in FIFO order reusing the first element after the last one.

Scalable Processor Architecture (SPARC) Processor architecture for workstations and high-performance servers with a RISC instruction set.

Scheduler Module of an operating system that selects which task to execute. A dynamic scheduler does so during runtime; otherwise, a static schedule can be pre-defined.

Script Program listing for interpreter execution, usually executed by a shell.

Second Level Cache (L2$) Second level of the cache hierarchy. If more than two levels are available, then it is typically private for each processor.

Secure Shell (SSH) Provides remote terminal access to a UNIX system using an encrypted network connection.

Semaphore Synchronization primitive: applicants must wait until a resource becomes available.

Sensor Measures the state (e. g. temperature, voltage, speed) of a plant, i. e. converts physical quantities into numerical values. Output of a physical system, input of a controller.

Shared cache Cache that is shared by multiple CPUs of a multi-core processor.

Shared memory Region of memory that can be accessed by multiple processes that otherwise can not access each other’s memory. Realized by page table entries: The initialization must be done by the operating system but the access is managed by hardware.

Shared-Memory (SHM) Programming paradigm for High Performance Computing where multiple threads are executed on the same system sharing their data. Concurrent access must be protected from data races.

Shell Interactive text terminal to access a UNIX system.

Sign Flag (SF) Bit in the x86 Flags register, equals the sign of the result of an arithmetic operation or a comparison to be used for above/below tests. Usually used by a subsequent branch instruction.

Simultaneous Multi-Threading (SMT) A physical CPU is divided into multiple logical CPUs executing in an interleaved fashion. By better utilizing internal functional units, nearly parallel execution is possible.

Single Instruction Multiple Data (SIMD) Flynn’s classification for computer structures: A processor executing an instruction that processes multiple data elements, e. g. from a vector.

Single Instruction Multiple Threads (SIMT) Used on highly parallel processors such as Graphic Processing Units (GPU).

Single Instruction Single Data (SISD) Flynn’s classification for computer structures: A single sequential processor executes code processing a single data stream without parallelism. This is the classical single-processor program.

Single Program Multiple Data (SPMD) Programming paradigm for Multiple Instructions Multiple Data (MIMD) systems: A single program executes on different parts of the data (data parallelism). On shared-memory systems, this can be realized with threads or OpenMP; on distributed systems (clusters), a Message-Passing (MP) library like MPI is used.

Slab The Slab is a Linux kernel data structure to manage small memory allocations.

Soft real-time Variant of real-time where most deadlines should be met. Usually, a percentage of met deadlines is given or the degradation function of the result usefulness is defined.

Software interrupt Synchronous interrupt triggered by the running code using an assembly instruction.

South bridge Classic part of the chip set containing I/O devices.

Spinlock A lock that is acquired by actively waiting for an event, repeatedly checking with high frequency instead of blocking.

Stack Last-In, First-Out (LIFO) based data structure, used in a memory region to hold temporary data.

Stack Pointer (SP) Register holding the current top of the stack.

Sub-kernel Special form of an Asymmetric Multi-Processing system with a Hardware Abstraction Layer being an RTOS at the same time and executing a GPOS as a task with the lowest priority (idle task).

Super-scalar Property of Instruction-Level Parallelism: multiple execution units (Arithmetic Logical Units) can execute multiple operations in the same cycle.

Supervisory Control and Data Acquisition (SCADA) Automation component of the Cell level that controls multiple PLCs and can also integrate sensors and actors.

Symmetric Multi-Processing (SMP) A single OS instance manages all CPUs.

System call Well-defined interface of the operating system through which user-mode processes request its services.

System Management Interrupt (SMI) An interrupt installed and used by the BIOS which cannot be controlled by the operating system (a Platform Interrupt). It is executed in System Management Mode.

System Management Mode (SMM) A special operating mode of x86 processors, only activated for System Management Interrupts and not controllable by the operating system.

System-on-a-Chip (SoC) Similar to a µC, but with emphasis on a whole computer system integrated on a single Integrated Circuit.

Task Execution unit of an operating system, used where the distinction between processes and threads does not matter.

Task switch Saving the context of a process and restoring another, suspended one.

Task ID (TID) A unique identifier for each process and thread.

Test-and-Set (TAS) Atomic operation: Write 1 to the given variable and return its previous value. This very basic atomic operation allows a simple implementation of a Mutex.
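To illustrate the last sentence (a sketch only, not code from this thesis): a minimal spinning Mutex can be built directly on test-and-set using the GCC atomic intrinsics the thesis also uses in Listing 6.2. The names `tas_mutex`, `tas_lock`, and `tas_unlock` are chosen here for illustration.

```c
/* Sketch: TAS-based mutex with GCC atomic built-ins (illustrative names). */
typedef volatile char tas_mutex;         /* 0 = free, nonzero = taken */

static void tas_lock(tas_mutex *m)
{
    /* __atomic_test_and_set atomically writes a nonzero value and
     * returns the previous one: keep spinning while it was already set. */
    while (__atomic_test_and_set(m, __ATOMIC_ACQUIRE))
        ;                                /* busy wait (spinlock) */
}

static void tas_unlock(tas_mutex *m)
{
    __atomic_clear(m, __ATOMIC_RELEASE); /* atomically write 0 */
}
```

The acquire/release memory orders confine reordering of the critical section, which is sufficient for mutual exclusion on x86.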

Third Level Cache (L3$) Third level of the cache hierarchy, typically shared between all cores of a package.

Thread Execution unit of an operating system, multiple threads in a process share the same address-space.

Throughput Measure for the amount of data per time transferred over a connection. In network transfers, this often depends on the data size and saturates for large data packets.

Time slice Small portion of time (typically around 10 ms) assigned to a task for execution before the scheduler switches to another one.

Time-Division Multiple Access (TDMA) Reservation of recurrent time-slices for different users, temporal partitioning.


Time-triggered Paradigm of real-time programming: All tasks are activated repeatedly in a loop.

Timer Interrupt (TI) Interrupt triggered by a timer.

Timing anomaly Shorter code requires more execution time (or vice versa) because of hardware effects on the micro-code level.

Translation Look-aside Buffer (TLB) A cache for mapping logical page addresses to physical page frames that reduces page table walks in the Memory Management Unit.

Uniform Memory-Access (UMA) An architecture with memory access timing similar for all processors.

Universal Serial Bus (USB) A plug and play interface to connect peripheral devices like keyboard, mouse, printer, etc.

User-mode Code executing in the context of a process with reduced privilege. Can access kernel-space only via defined interfaces (system-calls).

User-space Memory range accessible for a process. The kernel can access user-space memory.

Virtual address Logical address in a linear space that must be converted to a physical address for the memory access.

Virtual file Configuration access through a file that can be read and written. Instead of a real file access, an interface function is executed.

Virtualization Simulation of an entire system for a client instance to execute multiple (operating) systems on the same hardware.

Wait-free Property of an algorithm that does not require locks and always proceeds, i. e. every operation completes in a bounded number of steps. See also lock-free.
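As a hedged sketch in the spirit of the wait-free message queue of Listing 6.3 [Lam77] (the names and the fixed capacity are assumptions of this example, not taken from the thesis): with a single producer and a single consumer, each operation below finishes in a bounded number of steps; it either succeeds or reports full/empty without ever waiting.

```c
#define QCAP 8u                          /* capacity, a power of two (assumed) */

struct spsc_queue {
    int buf[QCAP];
    volatile unsigned head;              /* written only by the consumer */
    volatile unsigned tail;              /* written only by the producer */
};

/* Producer side: returns 1 on success, 0 if full -- never blocks. */
static int spsc_push(struct spsc_queue *q, int v)
{
    if (q->tail - q->head == QCAP)
        return 0;                            /* full: bounded failure */
    q->buf[q->tail % QCAP] = v;
    __atomic_thread_fence(__ATOMIC_RELEASE); /* publish data before index */
    q->tail++;
    return 1;
}

/* Consumer side: returns 1 with *out set, 0 if empty -- never blocks. */
static int spsc_pop(struct spsc_queue *q, int *out)
{
    if (q->head == q->tail)
        return 0;                            /* empty: bounded failure */
    __atomic_thread_fence(__ATOMIC_ACQUIRE); /* read index before data */
    *out = q->buf[q->head % QCAP];
    q->head++;
    return 1;
}
```

Because each index is written by exactly one side, no atomic read-modify-write operation is needed; the fences order the data accesses against the index updates.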

Wall clock The operating system’s notion of the current time as displayed on a wall clock. Since no hardware unit provides the current time, the operating system uses timer interrupts to keep track of the time passed by.

Wide Area Network (WAN) Network to connect remote sites.

Worst-Case Execution Time (WCET) The maximum time a function or basic block can possibly require to execute.


Zero Flag (ZF) Bit in the x86 Flags register indicating that the result of an arithmetic operation or a comparison was zero. Usually used by a subsequent branch instruction.


List of Figures

1.1. Automation pyramid (simplified) [Sau07]...... 3
1.2. Layered diagram of the proposed system architecture...... 9

2.1. Ring structure of privilege levels. The levels 0–3 are exemplary for the x86 architecture, the levels −1 and −2 are optional...... 16
2.2. Address space of a process and mapping to physical memory...... 17
2.3. Cache levels of Intel P4 processors...... 18
2.4. Cache associativity: 2-way...... 19
2.5. Classical PC Input/Output (I/O) architecture...... 24
2.6. Comparison of simplified Uniform Memory-Access (UMA) and Non-Uniform Memory-Access (NUMA) architectures...... 35
2.7. Multi-processor architecture with NUMA characteristic...... 35
2.8. Examples for coherent cache lines...... 37
2.9. Processor...... 38
2.10. SMP Operating system on multi-processor system...... 49
2.11. Hardware-in-the-Loop Simulation...... 56
2.12. Control flow diagram and example code of time-triggered paradigm.. 57
2.13. Control flow diagram and example code of event-triggered paradigm. 58
2.14. Control flow diagram of hybrid (time- and event-triggered) paradigm. 59
2.15. Control flow diagram of scheduling...... 59
2.16. Real-time Asymmetric Multi-Processing (AMP) system based on a hypervisor...... 62
2.17. Operating system based on a sub-kernel...... 63
2.18. Hourglass histogram...... 69
2.19. Scatter plot (threshold 1000)...... 70
2.20. Scatter plot (threshold 10 000)...... 70

4.1. Concept of a strictly isolated task P5 on a dedicated processor..... 88
4.2. Example setup for multiple partitions...... 93
4.3. Application Example...... 97
4.4. First 2000 gaps (above 50 cycles) on idle system, fair scheduling for benchmark...... 105
4.5. First 2000 gaps (above 50 cycles) on loaded system, fair scheduling for benchmark...... 105


4.6. First 2000 gaps (above 50 cycles) on loaded system, benchmark executed in its own partition...... 106
4.7. First 2000 gaps (above 50 cycles) on loaded system, benchmark executed fully isolated...... 106
4.8. Histograms on loaded system...... 107

5.1. Example trend of various /proc/meminfo values during six isolations of two hours each (System: haixagon, Appendix A.3.1 on page 208). 114
5.2. Inter-Processor Interrupt triggering functions call graph of Linux 3.12. 118

6.1. Example application: Tasks and interaction...... 129
6.2. Simple flag operation...... 132
6.3. Flag used for a handshake operation...... 133
6.4. Control flow for user-mode interrupts...... 151

7.1. Multi-core architecture (Intel Nehalem) with four cores and two logical threads each...... 159
7.2. Example: cache content of a multi-core processor (simplified)..... 161
7.3. Comparison of inclusive and exclusive caches. Impact of the load working set size to a small isolated task (max. latency)...... 162
7.4. UMA system with two sockets and four cores each (Test system octopus, Appendix A.1.2)...... 163
7.5. Maximum latency depending on isolation buffer size with varying load working set sizes (4 KiB to 16 MiB)...... 164
7.6. Separation of system and real-time partitions on dedicated NUMA nodes...... 165
7.7. Maximum latency of an isolated task using various memory sizes under varying load scenarios...... 166
7.8. Jitter experienced by the isolated task depending on its memory range for three different load sizes (that do not influence the result significantly)...... 167
7.9. Histograms with largest load (512 MiB read/write buffer size)..... 168
7.10. Number of extreme delays above 1.2 µs depending on isolation size.. 169
7.11. Symmetric Multi-Threading example on two-socket Intel Westmere (test system haixagon, Appendix A.3.1) with physical processor numbers (Pi) and logical thread numbers (Tj)...... 170
7.12. Peripheral I/O capabilities of an UMA system with two sockets (Test system octopus, Server board Intel S5000PSL, Appendix A.1.2).... 173
7.13. Peripheral I/O capabilities of a NUMA system with two sockets (Simplified test system haixagon, Server board Tyan S7025, Appendix A.3.1). 174
7.14. Distribution of latencies of a Peripheral Component Interconnect (PCI) device connected via the south bridge under various loads.... 176


7.15. Distribution of latencies of different PCI and PCI Express (PCIe) devices under maximum load...... 178
7.16. Distribution of execution times of an application benchmark task with 11 I/O accesses and 2000 Byte data exchange after an execution of 5 days (120 h)...... 179

8.1. Layer diagram of example application...... 183
8.2. Timing example: (1) User interaction to start. (2) Thread creation and isolation. (3) Activation of experiment. (4) User interaction to terminate. (5) Notification to exit. (6) Synchronized termination of isolations...... 186
8.3. Example application I/O partitioning...... 187
8.4. Example Application-Specific Integrated Circuit (ASIC) attributed to a Tensilica presentation [Sch13, Fig. 3.10]...... 190

A.1. Core 2 double- and quad-core cache and memory structure...... 205
A.2. Nehalem cache and memory structure...... 207
A.3. Westmere NUMA cache and memory structure...... 208
A.4. Block diagram of haixagon...... 209
A.5. Block diagram of pandora...... 210
A.6. Block diagram of devon...... 211


List of Tables

2.1. List of micro-architectures from Intel [Intel13d] and AMD [AMD05], pipeline information estimated by Fog [Fog13d]...... 25
2.2. Latency (in Central Processing Unit (CPU) cycles) of the RDTSC and serializing instructions (where available)...... 28
2.3. Memory access latency in CPU cycles...... 29
2.4. Latency of selected x86 instructions (in CPU cycles) [Fog13e]..... 31
2.5. Latency of function and system calls (in CPU cycles)...... 32
2.6. Hourglass overhead...... 67

4.1. First evaluation on Intel Core i7 920 (in CPU cycles)...... 104
4.2. Maximum gap on the smp.boot kernel with different cache configurations for varying load ranges (in CPU cycles)...... 108

5.1. Data sources for logging the system state during isolation...... 113
5.2. Functions modified to block sending Inter-Processor Interrupts (IPIs) in Linux 3.12...... 119
5.3. CPU notification events...... 121

6.1. Management states as implemented in the benchmark application... 141
6.2. Common hardware extension interfaces of the x86 architecture..... 146
6.3. Timings of user-mode interrupts on Intel Core 2 with nearly empty handler function...... 152

7.1. Execution time jitter (in CPU cycles) of a small task using 4 KiB buffer (code and data in L1$) on an idle system after warm-up.... 158
7.2. Influence of load using the shared cache on the inclusive private cache of another core (latencies in CPU cycles)...... 159
7.3. I/O latencies of different PCI and PCIe devices (Systems described in Appendix A)...... 172
7.4. Statistic of latencies (in µs) accessing a PCI device connected via the south bridge under various loads...... 175

A.1. Test system overview...... 205
A.2. Test system xaxis...... 206
A.3. Test system octopus...... 206
A.4. Test system xaxis...... 207


A.5. Test system haixagon...... 208
A.6. Test system pandora...... 210
A.7. Test system devon...... 211

List of Listings

2.1. Hourglass benchmark routine (simplified)...... 68
2.2. Minimal implementation of timestamp recording function...... 69

4.1. Task Example using the time-triggered paradigm...... 100

6.1. Semaphore implementation using pseudo atomic functions...... 135
6.2. Binary Semaphore (Mutex) implementation with GCC intrinsic functions...... 136
6.3. Simplified wait-free message queue implementation [Lam77]...... 137
6.4. Implementation and use example of a user-mode interrupt handler.. 150


Index

C++ programming language, 48 Asynchronous communication, 46 A/D converter, 129, 145 control transfer, 89 Actor,3, 145 Device interrupt, 100 Ada, 48 I/O, 95 Advanced Programmable Interrupt Con- Inter-processor interrupt, 100 troller (APIC), 23 Interrupt, 101 Advanced Vector Extensions (AVX), 33, interrupt, 20, 102 48 Platform interrupt, 101 Affinity, 51 Timer interrupt, 100 Interrupt, 94 Atmel AVR, 65 Process, 94 Atomic Affinity on First Touch, 49, 186 x86, 38 AMD,5, 13, 21 Atomic operation, 38, 98 Amide, 191 Automation pyramid,3, 54, 127 Android,2,7 Availability, 53 Application, 54, 96 Application benchmark, 26, 39 Bare-metal Application processor, 89 execution,4, 57, 108, 190 Application Programming Interface (API), task,9, 87, 111 48, 145 Bare-metal execution, 63 Application-specific integrated circuit, Bare-metal task, 10, 102 189 Barrier (synchronization object), 133 Arithmetic Logical Unit (ALU), 14 Basic block, 65, 156 ARM,1,5, 13, 22, 38, 66, 72, 139, 201 Benchmark, 26 big.LITTLE, 201 Best-case execution time, 65, 156 ARTiS, 79 Big Kernel Lock, 75 ASMP-Linux, 80 BIOS (Basic Input/Output System), Assembler, 42 89 Assembly language, 47, 48 Blocked (task state), 44 Associativity (cache), 18, 160 Blocked waiting, 44 Asymmetric Multi-Processing (AMP), Bonnie++, 30 48, 62 Boot manager, 89, 90


Boot parameter, 92 Control Unit (CU), 14 Bootstrap processor, 89 Cooperative multi-processing, 58 Bottle-neck (performance), 26 Cooperative scheduling, 44, 102 Bounded multi-processing, 100 CPU, 34 Bounded Multi-Processing (BMP), 49 CPU Affinity, 51 Branch, 102 CPU affinity, 94 Branch prediction, 66 CPU-Set (Linux feature), 51, 94, 184 BSD sockets, 73 Critical section, 44, 134 Build-time configuration, 92 Criticality,4, 42, 53 Busy waiting, 44, 98, 133, 134 Cron daemon, 114 Current Privilege Level (CPL), 15 C programming language, 48 Cycle-stealing interrupt, see Platform C11 standard, 138 interrupt Cache, 17 Cyrix, 21 Associativity, 160 Cache coherence, 37, 130 D/A converter, 129, 145 Cache line, 17, 159 Daemon, 42 Cache miss, 28 Data aquisition, 145, 146 Calibrator (benchmark), 30 Data parallelism, 33 Calling convention, 30 Data section, 43 Capabilities (Linux feature), 94 Data streaming, 55 Capability, 101 Deadline scheduling, 60, 74, 76 Cell level,3, 127 Deadlock, 45, 134, 184 Centronics printer interface, 146 Avoidance, 45, 135 Chip, 34 Detection, 45, 135 Chip set, 21 Prevention, 45, 135 CISC architecture, 14 Deadlock avoidance,7 Cluster, 33, 63 Deadlock prevention,7 Co-processor, 41 Delaying event, 159, 162, 165 Code section, 16, 43 Demand paging, 16, 19, 47, 98 Coherence, 130 Descriptor Privilege Level (DPL), 15 Comedi, 146 Device driver, 47 Company level,3, 127 Device interrupt, 19 Compiler, 42, 47 Distributed system,5, 33, 63, 72, 200 Compiler optimization, 47 Distribution, 50, 87 Compute Unit, 33, 90 Distribution (Linux), 50 Concurrency Kit, 132 dmesg, 113 Condition variable, 132 Dynamic scheduling, 60 Context, 43 Dynticks (Linux feature), 76 Context switch, 18, 22, 30, 38, 148, 151, 153 Embedded system, 40 Control theory, 55 Emergency shutdown, 142


Epoch, 132 Handshake, 133 Event-triggered paradigm, 58 Hard real-time, 52 Exception, 102 Hardware abstraction, 46 Exclusive cache, 18, 160 Hardware Abstraction Layer,8 Execution context, 43 Hardware effects,6, 65 Execution time, 102, 103, 128, 156, 157 Hardware jitter,6 Best-case, 65, 156 Hardware-in-the-Loop (HiL) Simulation, Worst-case, 53, 65, 156 55 Execution time estimation, 74 Hawk kernel, 72 Expansion card, 21, 25, 129, 147, 174, Hazard pointer, 140 182 Heap, 43 Heart beat, 143 False sharing, 37, 99, 131, 141 Heterogeneous system, 63 Field bus,4, 129 High-Precision Event Timer (HPET), Field level,3, 127 51, 93 FIFO priority-based scheduling, 60 Homogeneous system, 63 File system, 47 Horizontal integration,4 Finished (task state), 44 Hotplugging (Linux feature), 50, 93 Firm real-time, 75 Hourglass benchmark, 67, 69, 92, 127 Firmware, 42 Huge page, 24 First touch Huge pages, 99 affinity, 49, 186 Human-Machine Interface,4 Memory allocation, 98 Hyades, 79 Fixed schedule, 60 Hybrid paradigm, 58 Flag, 132 Hyper-Threading, 33, 90, 169 Flags register, 14 Hypervisor,8, 47, 48, 62, 77, 82 Formal verification, 52, 53, 65, 74, 77, 79, 88, 99, 103, 148, 155, 156 I/O APIC, 23 Event-triggered, 58 I/O controller, 25 Time-triggered, 57 I/O Memory Management Unit (IOMMU), FreeRTOS, 61, 84, 153 90, 200 Freescale, 72 I/O port, 20 Function call, 102 Inclusive cache, 18, 160 Functional parallelism, 33 Input-Process-Output (IPO) model, 57 Input/Output, 145 General-purpose I/O (GPIO), 129, 145 Instruction, 14 General-Purpose Operating System (GPOS), Instruction Pointer (IP), 14 42 Instruction set, 14 Gnu Compiler Collection (GCC), 36, Intel,5, 13, 21 135, 138, 139 Inter-processor interrupt, 19 Graceful shutdown, 142 Interrupt, 102 Graphical User Interface (GUI), 127 Interrupt abstraction, 74, 77


Interrupt affinity, 94 Local memory, 36, 164 Interrupt Flag (IF), 20 Locality, 16 Interrupt Request, 19 Lock-free,7, 64 Intrinsic function, 36, 135, 138, 139 Logical processor, 34, 90 IOzone, 30 Low-level benchmark, 26 iperf, 30 lwip, 147 Isolated partition, 93 Isolation, 71, 80, 87 Main board, 21, 25, 34, 41, 83, 146, 177, 182 Jailhouse, 82, 83, 179 Mainline Linux kernel,7, 50, 75–77, 81 Java, 48 Major page fault, 98, 202 Jitter,6, 27, 156, 157, 163 Manager task, 96 Job, 52, 54 Many-core processor, 33, 50 Jump, 102 Master-worker, 33 Matlab Simulink,4, 128, 143, 193 Kernel, 42 Membench, 28 Kernel log, 113 Memory consistency, 36, 130 Kernel mode, 22 Memory consisteny, 138 Kernel modification, 74 Memory management, 47 Kernel preemption, 74 Memory wall, 26 Kernel thread (Linux feature), 50, 95 Memory-mapped I/O, 20 Kernel-mode, 46 MERASA project, 73 Kernel-space memory, 46 MESI (cache synchronization protocol), L4 (micro-kernel), 79 37, 164, 165 Labview,4, 143 Message Passing, 46 Last-Level Cache (LLC), 17 Message queue, 136 Latency, 27, 157 Wait-free, 137 latency, 55 Message-Passing, 130 Legacy I/O, 174 Micro-Architecture, 25 Library, 42 Micro-benchmark, 26 Likwid, 51 Micro-controller, 41 Linker, 42 Micro-Kernel, 43 Linux, 50 MicroC/OS-II, 61 Linux distribution, 50 Microsoft Windows, 82 Linux kernel module, 76, 78, 82, 121, Minor page fault, 98, 202 123, 125, 149 MIPS, 13, 20, 22, 80, 201 Linux User I/O, 146 Monitor (Synchronization object), 130, Linux-rt,7 136 Little endian, 22 Monitor (synchronization object), 46 Load balancing, 48, 49 Monolithic Kernel, 43 Loader, 42 Multi-Core Communications API (MCAPI), Local APIC, 23 48


Multi-core processor,1, 32–34, 72, 158 Physical processor, 34, 90 Multi-processor,1, 155 Pinned kernel thread, 50, 95, 121 Multi-processor system, 34 Pipelining, 33 Multi-tasking, 43 Plant, 52, 186 Multiple Instructions Multiple Data (MIMD), Platform interrupt, 19, 101 33 Polling, 44, 98 Mutex, 75, 134 Portability,4, 46 Mutual exclusion, 45, 134 POSIX, 48, 73 Real-time extensions, 80 NMI Watchdog, 92 Shared memory, 98 No remote memory access architecture Threads, 186 (NoRMA), 34 PowerPC, 13 no_hz (Linux feature), 76 Pre-faulting, 98 Node, 164 Pre-fetching (cache), 17, 29 Non-uniform memory architecture (NUMA), Predictability, 52 34 Preemptibility, 74 North bridge, 24, 173 Preemptive multi-processing, 60 Notifier chain (Linux feature), 116, 121, Preemptive scheduling, 44 124 Printer interface, 146 OpenMP, 129 Priority openSuSE,7 Ceiling,7, 64 Operating system, 42 Inheritance,7, 64 Optimization, 47 Priority inversion,7, 64, 75, 133, 136 OS noise,6 Priority-based scheduling, 60 Out-of-order architecture, 64 First-in, first-out, 60 Out-of-order execution, 15, 27, 30, 36, Round-robin, 60 66, 103, 138 Private cache, 36 Privilege level, 15 Package, 34 Process, 43 Page fault, 16, 19, 196 Process affinity, 94 Major, 98, 202 Processor, 34 Minor, 98, 202 Profiling, 39 Paging, 15, 47 Programmable Logic Controller,3 Para-virtualization, 63, 78, 79 Programmable Logic Controller (PLC), Parallax Propeller, 58 55 Parallel port, 146 Protection level, 15 parMERASA project, 73 Python, 48 Partitioning, 62, 63, 73, 87, 97 PCI, 20 QNX, 61, 192 PCI Expess, 21 Qt, 10, 127, 129, 185, 193, 198 Peripheral device, 129, 170 Rapid prototyping, 127 Physical address, 43


Reactivation, 43 Segmentation, 47 Read-Copy-Update (RCU), 51, 120, 121, Segmentation fault, 145 139, 140 Semaphore, 75, 134 Ready (task state), 43 Sensor,3, 145 Real-time, 42 Serial port, 146 Firm, 75 Setup phase, 97 Hard, 52 Shared cache, 36 Soft, 52 Shared library, 143 Real-Time Ethernet, 147 Shared memory, 16, 64, 67, 88, 96, 98, Real-time execution, 97 130, 198 Real-Time system, 52 Shared-memory multi-processor system, Realfeel, 67 33, 48 ,7 Shell Redhawk Linux, 81 Script, 112 Remaining system, 93 Shielded CPU,8, 80, 81 Remote invocation, 46 Signal processing, 55 Remote memory, 164 Simulink,4, 128, 143, 193 Return (from function and interrupt Simultaneous multi-threading (SMT), handler), 102 33, 90, 169 Ring buffer, 137 Single Instruction Multiple Data (SIMD), RISC architecture, 14 33 Roofline model, 26 Single Instruction Single Data (SISD), Round-robin priority-based scheduling, 33 60 Single Program Multiple Data (SPMD), RS-232, 146 33, 164 RT-Patch,7, 10, 67, 74 Slab (Linux feature), 52, 113, 120 RTAI Linux, 78–82 Sleep, 44 RTEMS, 73 Slot, 21 RTLinux, 77–79, 81, 82 smp.boot kernel, 108 Run-time setting, 92 Socket, 34 Running (task state), 43 Soft real-time, 52 Softlockup Detector, 92 Safe state, 53, 55 Software interrupt, 19 Sand box, 144 Software tracing, 128 Scalability, 50, 63 South bridge, 24, 173 SCC, 156, 200 Spacial partitioning, 82, 83 Schedulability, 60, 65 SPARC processor, 73 Scheduling algorithm, 44 Spatial partitioning, 74, 87 Deadline, 60, 74, 76 Spin-lock, 45, 75, 133 First-in, first-out, 60 Spinlock, 64, 130, 134, 135 Priority-based, 60 Spinning, 44 Round-robin, 60 Spring kernel, 72


Springboard function, 149 tickless (Linux feature), 76 Stability theory, 55 Tickless mode (Linux feature), 76, 82, Stack, 16, 43 202 Stack Pointer (SP), 14 Time slice, 43, 60, 71 Stream (benchmark), 30 Time-Triggered Paradigm, 102 Streaming, 33 Time-triggered paradigm, 57, 99 Streaming SIMD Extension (SSE), 33, Time/utility function, 53 48 Timer interrupt, 19 Sub-kernel,8, 63, 77 Timing analysis, 65 Super-scalar execution, 15, 27, 30, 66, Timing anomalies,6, 64 103, 171 Timing anomaly, 155 Suspend, 43 TLB shootdown, 120 Suspend (task), 44 Tool (software), 42 Swapping, 47 Transactional memory, 39, 201 Symmetric Multi-processing, 62 Translation Look-aside Buffer (TLB), Symmetric Multi-Processing (SMP), 48 15 Synchronous Tri-Core micro-controller, 73 communication, 46 control transfer, 89 UEFI (Unified Extended Firmware In- Exception, 101, 102 terface), 89 Interrupt, 101 Uniform memory architecture (UMA), interrupt, 20, 102 34 System call, 100 UNIX signal, 152 sysfs virtual file system, 115 User mode, 22 System call, 19, 47 User-mode, 46 System partition, 93 User-space memory, 46 System V, 48 Userspace Read-Copy-Update, 132 Shared Memory, 98, 130, 196 Vectorization, 33, 48 System-Management Interrupt, 91 Vertical integration,4 System-Management Interrupt (SMI), Via, 21 91, 101 Virtual address, 43, 47 System-Management Mode (SMM), 23 Virtual machine, 48 Task, 43, 54 Von Neumann architecture, 13 Task state, 43 VxWorks, 61 Task switch, 40, 44, 45, 69, 76, 87, 93, Wait free, 190 152 Wait-free,7, 64, 136 Temporal partitioning, 71, 87 Wall clock time, 51, 93, 109, 111, 112, , 72 122 Thread, 43 Wall-clock time, 46 Threshold, 68, 70, 103 Watchdog timer, 92 Throughput, 27 Working set, 16, 17


Worst-case Latency, 103 Worst-case execution time, 53, 65, 156 Worst-case execution time (WCET), 54 Write-back, 161 Write-Back cache, 18, 108 Write-through, 161 Write-Through cache, 18, 108 x86, 21 x86 architecture, 21 Atomic operations, 38 Cache coherence, 130 Shared memory, 130 Time-stamp counter, 27 Xenomai, 78–80, 82, 179

Zero-copy protocol, 138

Bibliography

[AB98] Luca Abeni and Giorgio Buttazzo. “Integrating Multimedia Applications in Hard Real-Time Systems.” In: Proc. 19th IEEE Real-Time Systems Symp. (RTSS). (Madrid, Spain, Dec. 2–4, 1998). Washington, DC, USA: IEEE Computer Society, 1998, pp. 4–13. doi: 10.1109/REAL.1998.739726.

[ABB09] James H. Anderson, Sanjoy K. Baruah, and Björn B. Brandenburg. “Multicore Operating-System Support for Mixed Criticality.” In: Proc. Workshop on Mixed Criticality: Roadmap to Evolving UAV Certification. (San Francisco, CA, USA, Apr. 16, 2009). 2009.

[Abb13] Doug Abbott. Linux for Embedded and Real-Time Applications. Waltham, MA, USA: Elsevier, 2013. isbn: 978-0-12-415996-9.

[Abe+02] Luca Abeni, Ashvin Goel, Charles Krasic, Jim Snow, and Jonathan Walpole. “A Measurement-Based Analysis of the Real-Time Performance of Linux.” In: Proc. of 8th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). (San Jose, CA, USA, Sept. 24–27, 2002). Washington, DC, USA: IEEE Computer Society, 2002, pp. 133–142. doi: 10.1109/RTTAS.2002.1137388.

[Abe01] Luca Abeni. “Coping With Interrupt Execution Time in RT Kernel: a Non-Intrusive Approach.” In: Proc. Work-in-Progress Session 22nd IEEE Real-Time Systems Symposium (RTSS-WiP). (London, England, Dec. 3–6, 2001). 2001.

[ABL96] Jean-Raymond Abrial, Egon Börger, and Hans Langmaack, eds. Formal Methods for Industrial Applications. Specifying and Programming the Steam Boiler Control. Vol. 1165. Lect. Notes in Comp. Sci. The book grew out of a Dagstuhl Seminar, June 1995. Berlin, Germany: Springer, 1996. isbn: 3-540-61929-1. doi: 10.1007/BFb0027227.

[Abr96] Jean-Raymond Abrial. “Steam-boiler control specification problem.” In: Formal Methods for Industrial Applications. Specifying and Programming the Steam Boiler Control. Ed. by Jean-Raymond Abrial, Egon Börger, and Hans Langmaack. Vol. 1165. Lect. Notes in Comp. Sci. The book grew out of a Dagstuhl Seminar, June 1995. Berlin, Germany: Springer, 1996, pp. 500–509. isbn: 3-540-61929-1. doi: 10.1007/BFb0027252.


[ACD06] James H. Anderson, John M. Calandrino, and UmaMaheswari C. Devi. “Real-Time Scheduling on Multicore Platforms.” In: Proc. of 12th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). (San Jose, CA, USA, Apr. 4–7, 2006). Washington, DC, USA: IEEE Computer Society, 2006, pp. 179–190. doi: 10.1109/RTAS.2006.35.

[AEM07] Siro Arthur, Carsten Emde, and Nicholas Mc Guire. “Assessment of the Realtime Preemption Patches (RT-Preempt) and their impact on the general purpose performance of the system.” In: Proc. 9th Real-Time Linux Workshop (RTLWS). (Linz, Austria, Nov. 2–4, 2007). 2007.

[AG96] Sarita V. Adve and Kourosh Gharachorloo. “Shared Memory Consistency Models: A Tutorial.” In: Computer 29.12 (Dec. 1996), pp. 66–76. issn: 0018-9162. doi: 10.1109/2.546611.

[AH00] James H. Anderson and Philip Holman. “Efficient Pure-buffer Algorithms for Real-time Systems.” In: Proc. 7th Int. Conf. on Embedded and Real-Time Computing Systems and Applications (RTCSA). (Cheju Island, South Korea, Dec. 12–14, 2000). Washington, DC, USA: IEEE Computer Society, 2000, pp. 57–64. doi: 10.1109/RTCSA.2000.896371.

[AL07] Anant Agarwal and Markus Levy. “The KILL Rule for Multicore.” In: Proc. 44th ACM/IEEE Design Automation Conf. (DAC). (San Diego, CA, USA, June 4–8, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 750–753. isbn: 978-1-59593-627-1.

[AlB13] Samy Al Bahra. “Nonblocking Algorithms and Scalable Multicore Programming.” In: ACM Queue 11.5 (May 2013), 40:40–40:64. issn: 1542-7730. doi: 10.1145/2488364.2492433.

[ALL12] Hakan Akkan, Michael Lang, and Lorie M. Liebrock. “Stepping towards noiseless Linux environment.” In: Proc. 2nd Int. Workshop on Runtime and Operating Systems for Supercomputers (ROSS). (Venice, Italy, June 29, 2012). New York, NY, USA: ACM Assoc. for Computing Machinery, 2012, 7:1–7:7. isbn: 978-1-4503-1460-2. doi: 10.1145/2318916.2318925.

[Alt13] Stephan Alt. “Kooperation von rechenintensiven und isolierten Tasks unter Echtzeitbedingungen.” German. Bachelor thesis. Aachen, Germany: RWTH Aachen University, Faculty of Electrical Engineering and Information Technology, Chair for Operating Systems, Sept. 2013.

[AM12] Hemant Agrawal and Ravi Malhotra. “Device Drivers in User-Space: A Case for Network Device Driver.” In: Int. J. Inf. Educ. Technol. 2.5 (2012). Ed. by Steve Thatcher et al., pp. 461–463. issn: 1947-3826. doi: 10.7763/IJIET.2012.V2.179.


[AM95] James H. Anderson and Mark Moir. “Universal Constructions for Large Objects.” In: Proc. 9th Int. Workshop Distributed Algorithms (WDAG). (Le Mont-Saint-Michel, France, Sept. 13–15, 1995). Ed. by Jean-Michel Hélary and Michel Raynal. Vol. 972. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 1995, pp. 168–182. isbn: 3-540-60274-7. doi: 10.1007/BFb0022146.

[AMD05] Software Optimization Guide for AMD64 Processors. 25112. Sunnyvale, CA, USA: Advanced Micro Devices, Inc., Sept. 2005.

[AMD10] CPUID Specification. 25481. Sunnyvale, CA, USA: Advanced Micro Devices, Inc., Sept. 2010.

[AMD12a] AMD64 Architecture Programmer’s Manual. Vol. 1: Application Programming. 24592. Sunnyvale, CA, USA: Advanced Micro Devices, Inc., Mar. 2012.

[AMD12b] AMD64 Architecture Programmer’s Manual. Vol. 2: System Programming. 24593. Sunnyvale, CA, USA: Advanced Micro Devices, Inc., Sept. 2012.

[AMD12c] AMD64 Architecture Programmer’s Manual. Vol. 3: General-Purpose and System Instructions. 24594. Sunnyvale, CA, USA: Advanced Micro Devices, Inc., Sept. 2012.

[AMD12d] AMD64 Architecture Programmer’s Manual. Sunnyvale, CA, USA: Advanced Micro Devices, Inc., 2012.

[AMD13] BIOS and Kernel Developer’s Guide for AMD Family 15h Models 00h-0Fh Processors. 42301. Sunnyvale, CA, USA: Advanced Micro Devices, Inc., Jan. 2013.

[AMR10] Sebastian Altmeyer, Claire Maiza, and Jan Reineke. “Resilience Analysis: Tightening the CRPD bound for set-associative caches.” In: Proc. ACM SIGPLAN/SIGBED Conf. on Languages, Compilers, and Tools for Embedded Systems (LCTES). (Stockholm, Sweden, Apr. 12–16, 2010). New York, NY, USA: ACM Assoc. for Computing Machinery, 2010, pp. 153–162. isbn: 978-1-60558-953-4. doi: 10.1145/1755888.1755911.

[And+08] Alexandru Andrei, Petru Eles, Zebo Peng, and Jakob Rosen. “Predictable Implementation of Real-Time Applications on Multiprocessor Systems-on-Chip.” In: 21st Int. Conf. on VLSI Design (VLSID). (Hyderabad, India, Jan. 4–8, 2008). Washington, DC, USA: IEEE Computer Society, 2008, pp. 103–110. doi: 10.1109/VLSI.2008.33.


[And+97] Jennifer M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika R. Henzinger, et al. “Continuous Profiling: Where Have All the Cycles Gone?” In: ACM Trans. Comput. Syst. 15.4 (Nov. 1997), pp. 357–390. issn: 0734-2071. doi: 10.1145/265924.265925.

[AVR10] 8-bit AVR Instruction Set. Rev. 0856I-AVR-07/10. San Jose, CA, USA: Atmel Corporation, July 2010.

[BA09a] Björn B. Brandenburg and James H. Anderson. “Joint Opportunities for Real-Time Linux and Real-Time Systems Research.” In: Proc. 11th Real-Time Linux Workshop (RTLWS). (Dresden, Germany, Sept. 28–30, 2009). 2009, pp. 1–12.

[BA09b] Björn B. Brandenburg and James H. Anderson. “Reader-Writer Synchronization for Shared-Memory Multiprocessor Systems.” In: Proc. 21st Euromicro Conf. on Real-Time Systems (ECRTS). (Dublin, Ireland, July 1–3, 2009). Washington, DC, USA: IEEE Computer Society, 2009, pp. 184–193. doi: 10.1109/ECRTS.2009.14.

[Bai+91] D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, et al. “The NAS Parallel Benchmarks.” In: Int. J. High Perform. Comput. Appl. 5.3 (1991), pp. 63–73. issn: 1741-2846. doi: 10.1177/109434209100500306.

[Bak+09] Stanley Bak, Emiliano Betti, Rodolfo Pellizzoni, Marco Caccamo, and Lui Sha. “Real-Time Control of I/O COTS for Embedded Systems.” In: Proc. 30th IEEE Real-Time Systems Symp. (RTSS). (Washington, DC, USA, Dec. 1–4, 2009). Washington, DC, USA: IEEE Computer Society, 2009, pp. 193–203. doi: 10.1109/RTSS.2009.41.

[Bak06] Theodore P. Baker. “A Comparison of Global and Partitioned EDF Schedulability Tests for Multiprocessors.” In: Int. Conf. on Real-Time and Network Systems (RTNS). (Poitiers, France, May 30–31, 2006). Ed. by Guy Juanole and Pascal Richard. Poitiers, France: Laboratoire d’Informatique Scientifique et Industrielle, 2006, pp. 119–127. url: http://www.lisi.ensma.fr/rtns06/fichiers/rtns06_proceedings.pdf (visited on Sept. 4, 2014).

[Bar+07] A. Barbalace, A. Luchetta, G. Manduchi, M. Moro, A. Soppelsa, et al. “Performance Comparison of VxWorks, Linux, RTAI and Xenomai in a Hard Real-time Application.” In: 15th IEEE-NPSS Real-Time Conference. (Batavia, IL, USA, Apr. 29–May 4, 2007). Washington, DC, USA: IEEE Computer Society, May 2007. isbn: 1-4244-0867-9. doi: 10.1109/RTC.2007.4382787.


[Bar+96] Sanjoy K. Baruah, N. K. Cohen, C. G. Plaxton, and D. A. Varvel. “Proportionate Progress: A Notion of Fairness in Resource Allocation.” In: Algorithmica 15.6 (1996), pp. 600–625. issn: 0178-4617. doi: 10.1007/BF01940883.
[Bar97] Michael Barabanov. “A Linux-based Real-Time Operating System.” Master’s thesis. Socorro, NM, USA: New Mexico Institute of Mining and Technology, 1997.
[BAS03] Ravi Budruk, Don Anderson, and Tom Shanley. PCI Express System Architecture. Boston, MA, USA: Addison-Wesley, 2003. isbn: 0-321-15630-7.
[Bat+11] Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. “Mathematizing C++ Concurrency.” In: SIGPLAN Not. 46.1 (Jan. 2011), pp. 55–66. issn: 0362-1340. doi: 10.1145/1925844.1926394.
[BC05] Daniel P. Bovet and Marco Cesati. Understanding the Linux Kernel. From I/O Ports to Process Management. 3rd ed. Sebastopol, CA, USA: O’Reilly, 2005. isbn: 978-0-59600-565-8.
[BCA08] Björn B. Brandenburg, John M. Calandrino, and James H. Anderson. “On the Scalability of Real-Time Scheduling Algorithms on Multicore Platforms: A Case Study.” In: Proc. 29th IEEE Real-Time Systems Symp. (RTSS). (Barcelona, Spain, Nov. 30–Dec. 3, 2008). Washington, DC, USA: IEEE Computer Society, 2008, pp. 157–169. doi: 10.1109/RTSS.2008.23.
[BCB09] Wolfgang Betz, Marco Cereia, and Ivan Cibrario Bertolotti. “Experimental evaluation of the Linux RT Patch for real-time applications.” In: IEEE Conf. on Emerging Technologies Factory Automation (ETFA). (Mallorca, Spain, Sept. 22–25, 2009). Washington, DC, USA: IEEE Computer Society, Sept. 2009, pp. 1–4. doi: 10.1109/ETFA.2009.5347056.
[BcD08] BSDaemon, coideloko, and D0nAnd0n. “System Management Mode Hack Using SMM for “Other Purposes”.” In: Phrack mag. 0x0c (0x41 Apr. 2008). url: http://www.phrack.org/issues.html?issue=65&id=7 (visited on May 7, 2014).
[BDJ06] Anne Bracy, Kshitij Doshi, and Quinn Jacobson. “Disintermediated Active Communication.” In: IEEE Comput. Archit. Lett. 5.2 (July 2006). issn: 1556-6056. doi: 10.1109/L-CA.2006.15.
[Bet+08] Emiliano Betti, Daniel Pierre Bovet, Marco Cesati, and Roberto Gioiosa. “Hard Real-Time Performances in Multiprocessor-Embedded Systems Using ASMP-Linux.” In: EURASIP J. Embed. Syst. 2008.1 (2008), 10:1–10:16. issn: 1687-3955. doi: 10.1155/2008/582648.


[Bet+13] Emiliano Betti, Stanley Bak, Rodolfo Pellizzoni, Marco Caccamo, and Lui Sha. “Real-Time I/O Management System with COTS Peripherals.” In: IEEE Trans. Comput. 62.1 (2013), pp. 45–58. issn: 0018-9340. doi: 10.1109/TC.2011.202.
[BF11] Sergey Blagodurov and Alexandra Fedorova. “User-level scheduling on NUMA multicore systems under Linux.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, June 13–15, 2011). Ed. by Ralph Siemsen. 2011, pp. 81–91. url: http://kernel.org/doc/mirror/ols2011.pdf (visited on May 7, 2014).
[BGP97] Sanjoy K. Baruah, Johannes E. Gehrke, and C. Greg Plaxton. “Fast Scheduling of Periodic Tasks on Multiple Resources.” In: Proc. Workshop Job Scheduling Strategies for Parallel Processing (IPPS). (Geneva, Switzerland, Apr. 5, 1997). Ed. by Dror G. Feitelson and Larry Rudolph. Vol. 1291. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 1997, pp. 280–288. isbn: 3-540-63574-2. doi: 10.1109/IPPS.1995.395946.
[BH08] Berthold Bäuml and Gerd Hirzinger. “When hard realtime matters: Software for complex mechatronic systems.” In: Robot. Auton. Syst. 56.1 (2008). Ed. by T. Arai, R. Dillmann, and R. Grupen, pp. 5–13. issn: 0921-8890. doi: 10.1016/j.robot.2007.09.017.
[Bha02] Ajay V. Bhatt. Creating a PCI Express Interconnect. White paper. Santa Clara, CA, USA: Technology and Research Labs, Intel Corp., 2002. url: http://www.pcisig.com/specifications/pciexpress/resources/PCI_Express_White_Paper.pdf (visited on May 7, 2014).
[BJ95] Gregory Bollella and Kevin Jeffay. “Support For Real-Time Computing Within General Purpose Operating Systems. Supporting Co-Resident Operating Systems.” In: Proc. of 1st IEEE Real-Time Technology and Applications Symp. (RTAS). (Chicago, IL, USA, May 15–17, 1995). Washington, DC, USA: IEEE Computer Society, 1995, pp. 4–14. doi: 10.1109/RTTAS.1995.516189.
[Bla+10] Sergey Blagodurov, Sergey Zhuravlev, Alexandra Fedorova, and Ali Kamali. “A Case for NUMA-aware Contention Management on Multicore Systems.” In: Proc. 19th Int. Conf. on Parallel Architectures and Compilation Techniques (PACT). (Vienna, Austria, Sept. 11–15, 2010). New York, NY, USA: ACM Assoc. for Computing Machinery, 2010, pp. 557–558. isbn: 978-1-4503-0178-7. doi: 10.1145/1854273.1854350.
[BLA09] Björn B. Brandenburg, Hennadiy Leontyev, and James H. Anderson. “Accounting for Interrupts in Multiprocessor Real-Time Systems.” In: Proc. 15th IEEE Int. Conf. on Embedded and Real-Time Computing Systems and Applications (RTCSA). (Beijing, China, Aug. 24–26, 2009). Washington, DC, USA: IEEE Computer Society, 2009, pp. 273–283. doi: 10.1109/RTCSA.2009.37.
[BLA11] Björn B. Brandenburg, Hennadiy Leontyev, and James H. Anderson. “An overview of interrupt accounting techniques for multiprocessor real-time systems.” In: J. Syst. Archit. 57.6 (2011), pp. 638–654. issn: 1383-7621. doi: 10.1016/j.sysarc.2010.05.011.
[Bli+04] Martin J. Bligh, Matt Dobson, Darren Hart, and Gerrit Huizenga. “Linux on NUMA Systems.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, July 21–24, 2004). Ed. by John W. Lockhart. Vol. 1. 2004, pp. 89–101. url: http://kernel.org/doc/ols/ (visited on May 7, 2014).
[BM10] Jeremy H. Brown and Brad Martin. “How fast is fast enough? Choosing between Xenomai and Linux for real-time applications.” In: Proc. 12th Real-Time Linux Workshop (RTLWS). (Nairobi, Kenya, Oct. 25–27, 2010). 2010, pp. 1–17.
[Boc+09] Robert L. Bocchino Jr., Vikram S. Adve, Sarita V. Adve, and Marc Snir. “Parallel Programming Must Be Deterministic by Default.” In: 1st USENIX Workshop on Hot Topics in Parallelism (HotPar). (Berkeley, CA, USA, Mar. 30–31, 2009). Berkeley, CA, USA: USENIX, 2009.
[Bon+12] Frédéric Boniol, Hugues Cassé, Eric Noulard, and Claire Pagetti. “Deterministic Execution Model on COTS Hardware.” In: Proc. 25th Int. Conf. Architecture of Computing Systems (ARCS). (Munich, Germany, Feb. 28–Mar. 2, 2012). Ed. by Andreas Herkersdorf, Kay Römer, and Uwe Brinkschulte. Vol. 7179. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 2012, pp. 98–110. isbn: 978-3-642-28292-8. doi: 10.1007/978-3-642-28293-5_9.
[Bon94] Jeff Bonwick. “The Slab Allocator: An Object-Caching Kernel Memory Allocator.” In: USENIX Summer Technical Conf. (Boston, MA, USA, June 6–10, 1994). Berkeley, CA, USA: USENIX, 1994. url: http://static.usenix.org/publications/library/proceedings/bos94/bonwick.html (visited on May 7, 2014).
[Boy+10] Silas Boyd-Wickizer, Austin T. Clements, Yandong Mao, Aleksey Pesterev, M. Frans Kaashoek, et al. “An Analysis of Linux Scalability to Many Cores.” In: 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI). (Vancouver, BC, Canada, Oct. 4–6, 2010). Berkeley, CA, USA: USENIX, 2010. url: http://www.usenix.org/event/osdi10/tech/full_papers/Boyd-Wickizer.pdf (visited on July 18, 2014).


[BR03] Steve Brosky and Steve Rotolo. “Shielded Processors: Guaranteeing Sub-millisecond Response in Standard Linux.” In: Proc. 2003 IEEE Int. Parallel and Distributed Processing Symp. (IPDPS). (Nice, France, Apr. 22–26, 2003). Washington, DC, USA: IEEE Computer Society, 2003. doi: 10.1109/IPDPS.2003.1213237.
[Bra+08] Björn B. Brandenburg, John M. Calandrino, Aaron Block, Hennadiy Leontyev, and James H. Anderson. “Real-Time Synchronization on Multiprocessors: To Block or Not to Block, to Suspend or Spin?” In: Proc. of 14th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). (St. Louis, MO, USA, Apr. 22–24, 2008). Washington, DC, USA: IEEE Computer Society, 2008, pp. 342–353. doi: 10.1109/RTAS.2008.27.
[Bre93] Timothy Brecht. “On the importance of parallel application placement in NUMA multiprocessors.” In: 4th Symp. on Experiences with Distributed and Multiprocessor Systems (SEDMS). 1993, pp. 1–18.
[Bro04] Steve Brosky. “Shielded CPUs: Real-Time Performance in Standard Linux.” In: Linux J. (Issue 121 May 2004). issn: 1075-3583. url: http://www.linuxjournal.com/article/6900 (visited on Sept. 9, 2014).
[Bru10] Raphael Bruns. “Design and Implementation of a Real-Time Interface.” Diploma thesis. Aachen, Germany: RWTH Aachen University, Faculty of Electrical Engineering and Information Technology, Chair for Operating Systems, Dec. 2010.
[BS97] Tim Boggess and Fred Shirley. “High-performance Scalable Computing for Real-Time Applications.” In: Proc. 6th Int. Conf. on Computer Communications and Networks (ICCN). (Las Vegas, NV, USA, Sept. 22–25, 1997). Washington, DC, USA: IEEE Computer Society, 1997, pp. 332–335. isbn: 0-8186-8186-1. doi: 10.1109/ICCCN.1997.623332.
[BSF04] Alexander von Bülow, Jürgen Stohr, and Georg Färber. “Towards an Efficient Use of Caches in State of the Art Processors for Real-Time Systems.” In: Proc. Work-In-Progress Session 16th Euromicro Conf. on Real-Time Systems (ECRTS-WiP). (Catania, Italy, June 30–July 2, 2004). Ed. by Steve Goddard. Tech. rep. TR-UNL-CSE-2004-0010. University of Nebraska-Lincoln, Department of Computer Science and Engineering, 2004. url: http://cse.unl.edu/~goddard/ecrts04wip/proceedings/ (visited on May 7, 2014).


[BT09] Vlastimil Babka and Petr Tůma. “Investigating Cache Parameters of x86 Family Processors.” In: Computer Performance Evaluation and Benchmarking. Proc. SPEC Benchmark Workshop. (Austin, TX, USA, Jan. 25, 2009). Ed. by David R. Kaeli and Kai Sachs. Vol. 5419. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 2009, pp. 77–96. isbn: 978-3-540-93798-2. doi: 10.1007/978-3-540-93799-9_5.
[Bül+04] Alexander von Bülow, Jürgen Stohr, Georg Färber, Peter Müller, and Johann B. Schraml. Using the RECOMS Architecture for Controlling a Radio Telescope. Tech. rep. Munich, Germany: Lehrstuhl für Realzeit-Computersysteme, Technische Universität München, 2004.
[But05] Giorgio C. Buttazzo. Hard Real-Time Computing Systems. Predictable Scheduling Algorithms and Applications. 2nd ed. New York, NY, USA: Springer, 2005. isbn: 0-387-23137-4.
[BW01] Alan Burns and Andy Wellings. Real-Time Systems and Programming Languages. 3rd ed. Harlow, England: Pearson Education Ltd., 2001. isbn: 0-201-72988-1.
[BW88] Jean-Loup Baer and Wenn-Hann Wang. “On the Inclusion Properties for Multi-level Cache Hierarchies.” In: Conf. Proc. 15th Int. Symp. on Computer Architecture (ISCA). (Honolulu, HI, USA, May 30–June 2, 1988). Los Alamitos, CA, USA: IEEE Computer Society, 1988, pp. 73–80. isbn: 0-8186-0861-7. doi: 10.1109/ISCA.1988.5212.
[BY97] Michael Barabanov and Victor Yodaiken. “Introducing Real-Time Linux.” In: Linux J. (Issue 34 Feb. 1997). issn: 1075-3583. url: http://www.linuxjournal.com/article/232 (visited on Sept. 9, 2014).
[BYT12] Lana Brindley, Alison Young, and Cheryn Tan. Red Hat Enterprise MRG 2. Realtime Tuning Guide. 3rd ed. Raleigh, NC, USA: Red Hat, Inc., 2012. url: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_MRG/2/html/Realtime_Tuning_Guide/index.html (visited on May 7, 2014).
[C11] ISO/IEC JTC1/SC22/WG14 – C. ISO/IEC 9899:2011: Programming language C. Technical Standard. Geneva, Switzerland, 2011.
[CAB07] John M. Calandrino, James H. Anderson, and Dan P. Baumberger. “A Hybrid Real-Time Scheduling Approach for Large-Scale Multicore Platforms.” In: Proc. 19th Euromicro Conf. on Real-Time Systems (ECRTS). (Pisa, Italy, July 4–6, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 247–258. doi: 10.1109/ECRTS.2007.81.


[Cal+06] John M. Calandrino, Hennadiy Leontyev, Aaron Block, UmaMaheswari C. Devi, and James H. Anderson. “LITMUSRT: A Testbed for Empirically Comparing Real-Time Multiprocessor Schedulers.” In: Proc. 27th IEEE Real-Time Systems Symp. (RTSS). (Rio de Janeiro, Brazil, Dec. 5–8, 2006). Washington, DC, USA: IEEE Computer Society, 2006, pp. 111–126. doi: 10.1109/RTSS.2006.27.
[Cal+07] John M. Calandrino, Dan Baumberger, Tong Li, Scott Hahn, and James H. Anderson. “Soft Real-Time Scheduling on Performance Asymmetric Multicore Platforms.” In: Proc. of 13th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). (Bellevue, WA, USA, Apr. 3–6, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 101–112. doi: 10.1109/RTAS.2007.35.
[Cam+05] A. Martí Campoy, E. Tamura, S. Sáez, F. Rodríguez, and J. V. Busquets-Mataix. “On Using Locking Caches in Embedded Real-Time Systems.” In: Proc. 2nd Int. Conf. Embedded Software and Systems (ICESS). (Xi’an, China, Dec. 16–18, 2005). Ed. by Laurence Tianruo Yang, Xingshe Zhou, Wei Zhao, Zhaohui Wu, Yian Zhu, et al. Vol. 3820. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 2005, pp. 150–159. isbn: 3-540-30881-4. doi: 10.1007/11599555_17.
[Car+04] John Carpenter, Shelby Funk, Philip Holman, Anand Srinivasan, James Anderson, et al. “A Categorization of Real-time Multiprocessor Scheduling Problems and Algorithms.” In: Handbook on Scheduling Algorithms, Methods, and Models. Ed. by Joseph Y-T. Leung. Boca Raton, FL, USA: Chapman Hall/CRC Press, 2004, pp. 30.1–30.19. isbn: 1-58488-397-9.
[CB13] Felipe Cerqueira and Björn B. Brandenburg. “A Comparison of Scheduling Latency in Linux, PREEMPT_RT, and LITMUSRT.” In: Int. Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT) in conjunction with the 25th Euromicro Conf. on Real-Time Systems. (Paris, France, July 9, 2013). Ed. by Andrea Bastoni and Shinpei Kato. 2013. url: http://ertl.jp/~shinpei/conf/ospert13/OSPERT13_proceedings_web.pdf (visited on May 7, 2014).
[Cha+04] Gilles Chanteperdrix, Alexis Berlemont, Dominique Ragot, and Philippe Kajfasz. “Integration of Real-Time Services in User-Space Linux.” In: Proc. 6th Real-Time Linux Workshop (RTLWS). (Singapore, Nov. 3–5, 2004). 2004.
[Cha02] Lin Chao, ed. Intel Technology Journal 6 (1 Feb. 14, 2002): Hyper-Threading Technology. issn: 1535-766X.


[Che+09] Fabio Checconi, Tommaso Cucinotta, Dario Faggioli, and Giuseppe Lipari. “Hierarchical Multiprocessor CPU Reservations for the Linux Kernel.” In: Int. Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT) in conjunction with the 21st Euromicro Conf. on Real-Time Systems. (Dublin, Ireland, July 1–3, 2009). Ed. by Stefan M. Petters and Peter Zijlstra. 2009, pp. 14–22. url: http://www.artist-embedded.org/artist/Overview,1750.html (visited on May 7, 2014).
[Cor07] Jonathan Corbet. “Clockevents and dyntick.” In: LWN.net (Feb. 21, 2007). url: http://lwn.net/Articles/223185 (visited on Oct. 7, 2014).
[Cor10] Jonathan Corbet. “The state of realtime Linux.” In: LWN.net (June 15, 2010). url: http://lwn.net/Articles/392154 (visited on Oct. 7, 2014).
[Cor12] Jonathan Corbet. “Relocating RCU callbacks.” In: LWN.net (Oct. 31, 2012). url: http://lwn.net/Articles/522262 (visited on Oct. 7, 2014).
[Cor13a] Jonathan Corbet. “3.9 Merge window part 1.” In: LWN.net (Feb. 20, 2013). url: http://lwn.net/Articles/539179 (visited on Oct. 7, 2014).
[Cor13b] Jonathan Corbet. “A new direction for power-aware scheduling.” In: LWN.net (Oct. 15, 2013). url: http://lwn.net/Articles/570353 (visited on Oct. 7, 2014).
[Cor13c] Jonathan Corbet. “Is the whole system idle?” In: LWN.net (July 10, 2013). url: http://lwn.net/Articles/558284 (visited on Oct. 7, 2014).
[Cor13d] Jonathan Corbet. “Low-latency ethernet device polling.” In: LWN.net (May 21, 2013). url: http://lwn.net/Articles/551284 (visited on Oct. 7, 2014).
[Cor13e] Jonathan Corbet. “(Nearly) full tickless operation in 3.10.” In: LWN.net (May 8, 2013). url: http://lwn.net/Articles/549580 (visited on Oct. 7, 2014).
[Cor13f] Jonathan Corbet. “Optimizing CPU hotplug locking.” In: LWN.net (Oct. 9, 2013). url: http://lwn.net/Articles/569686 (visited on Oct. 7, 2014).
[Cor13g] Jonathan Corbet. “Polling block drivers.” In: LWN.net (June 26, 2013). url: http://lwn.net/Articles/556244 (visited on Oct. 7, 2014).


[Cor13h] Jonathan Corbet. “The 3.11 merge window opens.” In: LWN.net (July 3, 2013). url: http://lwn.net/Articles/557314 (visited on Oct. 7, 2014).
[Cor13i] Jonathan Corbet. “The ‘Jailhouse’ hypervisor.” In: LWN.net (Nov. 19, 2013). url: http://lwn.net/Articles/574274 (visited on Oct. 7, 2014).
[Cor14a] Jonathan Corbet. “Deadline scheduler merged for 3.14.” In: LWN.net (Jan. 21, 2014). url: http://lwn.net/Articles/581491 (visited on Oct. 7, 2014).
[Cor14b] Jonathan Corbet. “Locking and pinning.” In: LWN.net (June 4, 2014). url: http://lwn.net/Articles/600502 (visited on Oct. 7, 2014).
[CP00] Antoine Colin and Isabelle Puaut. “Worst Case Execution Time Analysis for a Processor with Branch Prediction.” In: Real-Time Syst. 18.2-3 (2000), pp. 249–274. issn: 0922-6443. doi: 10.1023/A:1008149332687.
[CP01] Antoine Colin and Isabelle Puaut. “Worst-case execution time analysis of the RTEMS real-time operating system.” In: Proc. 13th Euromicro Conf. on Real-Time Systems (ECRTS). (Delft, Netherlands, June 13–15, 2001). Washington, DC, USA: IEEE Computer Society, 2001, pp. 191–198. doi: 10.1109/EMRTS.2001.934029.
[CRJ07] Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen. “Synchronization for an optimal real-time scheduling algorithm on multiprocessors.” In: Proc. 2nd IEEE Int. Symp. on Industrial Embedded Systems (SIES). (Lisbon, Portugal, July 4–6, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 9–16. isbn: 1-4244-0840-7. doi: 10.1109/SIES.2007.4297311.
[CRK05] Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman. Linux Device Drivers. 3rd ed. Sebastopol, CA, USA: O’Reilly, 2005. isbn: 978-0-596-00590-0.
[CW76] Harold J. Curnow and Brian A. Wichmann. “A synthetic benchmark.” In: Comput. J. 19.1 (1976), pp. 43–49. issn: 1460-2067. doi: 10.1093/comjnl/19.1.43.
[Dak09] Steven Dake. “The Corosync High Performance Shared Memory IPC Reusable C Library.” In: Proc. Linux Symp. (OLS). (Montreal, QC, Canada, July 13–17, 2009). Ed. by Robyn Bergeron, Chris Dukes, Jonas Fonseca, and John Hawley. 2009. url: http://linuxsymposium.org/2009/ls-2009-proceedings.pdf (visited on May 7, 2014).


[Dan12] Jens Dankert. “Implementation and Evaluation of Real-Time Applications comparing different Paradigms.” Master’s thesis. Aachen, Germany: RWTH Aachen University, Faculty of Electrical Engineering and Information Technology, Chair for Operating Systems, Dec. 2012.
[DAR12] Mihai Dobrescu, Katerina Argyraki, and Sylvia Ratnasamy. “Toward Predictable Performance in Software Packet-Processing Platforms.” In: 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI). (San Jose, CA, USA, Apr. 25–27, 2012). Berkeley, CA, USA: USENIX, 2012, pp. 141–154. isbn: 978-931971-92-8. url: https://www.usenix.org/conference/nsdi12/technical-sessions/presentation/dobrescu (visited on July 11, 2014).
[Das+11] Dakshina Dasari, Björn Andersson, Vincent Nélis, Stefan M. Petters, Arvind Easwaran, et al. “Response Time Analysis of COTS-Based Multicores Considering the Contention on the Shared Memory Bus.” In: IEEE 10th Int. Conf. on Trust, Security and Privacy in Computing and Communications (TrustCom). (Changsha, Hunan Province, China, Nov. 16–18, 2011). Washington, DC, USA: IEEE Computer Society, 2011, pp. 1068–1075. isbn: 978-0-7695-4600-1. doi: 10.1109/TrustCom.2011.146.
[Das+13] Dakshina Dasari, Benny Akesson, Vincent Nélis, Muhammad Ali Awan, and Stefan M. Petters. “Identifying the Sources of Unpredictability in COTS-based Multicore Systems.” In: Proc. 8th IEEE Int. Symp. on Industrial Embedded Systems (SIES). (Porto, Portugal, June 19–21, 2013). Washington, DC, USA: IEEE Computer Society, 2013, pp. 39–48. isbn: 978-1-4799-0658-1. doi: 10.1109/SIES.2013.6601469.
[DEG01] Loïc Duflot, Daniel Etiemble, and Olivier Grumelard. “Using CPU System Management Mode to Circumvent Operating System Security Functions.” In: Proc. 7th CanSecWest Conf. 2001.
[Den+99] Z. Deng, Jane W.-S. Liu, L. Zhang, S. Mouna, and A. Frei. “An Open Environment for Real-Time Applications.” In: Real-Time Syst. 16.2-3 (1999), pp. 155–185. issn: 0922-6443. doi: 10.1023/A:1008094905565.
[Des13a] Sanjay R. Deshpande. “Design Considerations for Multicore SoC Interconnections.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newnes, 2013. Chap. 5. isbn: 978-0-12-416018-7.
[Des13b] Mathieu Desnoyers. “Proving the Correctness of Nonblocking Data Structures.” In: ACM Queue 11.5 (May 2013), 30:30–30:43. issn: 1542-7730. doi: 10.1145/2488364.2490873.


[Dic13] Tom Dickens. “Software Synchronization.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newnes, 2013. Chap. 12. isbn: 978-0-12-416018-7.
[Dij63] Edsger W. Dijkstra. “Over seinpalen.” Dutch. Circulated privately. 1963. url: http://www.cs.utexas.edu/users/EWD/ewd00xx/EWD74.PDF (visited on May 7, 2014).
[Dij65] Edsger W. Dijkstra. “Solution of a Problem in Concurrent Programming Control.” In: Commun. ACM 8.9 (Sept. 1965), p. 569. issn: 0001-0782. doi: 10.1145/365559.365617.
[Dij68] Edsger W. Dijkstra. “Cooperating sequential processes.” In: Programming Languages: NATO Advanced Study Institute. Ed. by F. Genuys. Academic Press, 1968, pp. 43–112.
[DM03a] Lorenzo Dozio and Paolo Mantegazza. “Linux Real Time Application Interface (RTAI) in low cost high performance motion control.” In: ANIPLA Conf. on Motion Control. (Milano, Italy, Mar. 27–28, 2003). 2003.
[DM03b] Lorenzo Dozio and Paolo Mantegazza. “Real time distributed control systems using RTAI.” In: 6th Int. Symp. on Object-Oriented Real-Time Distributed Computing (ISORC). (Hakodate, Hokkaido, Japan, May 14–16, 2003). Washington, DC, USA: IEEE Computer Society, 2003, pp. 11–18. doi: 10.1109/ISORC.2003.1199229.
[DM04] Cort Dougan and Zwane Mwaikambo. “Lies, Misdirection, and Real-Time Measurements. Here are guidelines for evaluating claims made by real-time operating systems vendors.” In: C/C++ Users J. (Apr. 2004), pp. 6–13. issn: 1075-2838. url: http://www.drdobbs.com/184401780 (visited on May 7, 2014).
[DNA11] Dakshina Dasari, Vincent Nélis, and Björn Andersson. “WCET Analysis Considering Contention on Memory Bus in COTS-Based Multicores.” In: IEEE 16th Conf. on Emerging Technologies Factory Automation (ETFA). (Toulouse, France, Sept. 5–9, 2011). Washington, DC, USA: IEEE Computer Society, 2011. isbn: 978-1-4577-0018-7. doi: 10.1109/ETFA.2011.6059176.
[DO13] Paul Dubrulle and Emmanuel Ohayon. “A Dedicated Micro-Kernel to Combine Real-Time and Stream Applications on Embedded Manycores.” In: Procedia Comput. Sci. 18 (2013), pp. 1634–1643. issn: 1877-0509. doi: 10.1016/j.procs.2013.05.331.
[Dom08] Max Domeika. Software Development for Embedded Multi-core Systems. A Practical Guide Using Embedded Intel Architecture. Burlington, MA, USA: Newnes, 2008. isbn: 978-0-7506-8539-9.


[Dom13] Max Domeika. “Communication and Synchronization Libraries.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newnes, 2013. Chap. 8. isbn: 978-0-12-416018-7.
[Don88] Jack J. Dongarra. “The LINPACK Benchmark: An explanation.” In: 1st Int. Supercomputing Conf. (SC). (Athens, Greece, June 8–12, 1987). Ed. by Elias N. Houstis, Theodore S. Papatheodorou, and Constantine D. Polychronopoulos. Vol. 297. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 1988, pp. 456–474. isbn: 3-540-18991-2. doi: 10.1007/3-540-18991-2_27.
[Dow08] Allen B. Downey. The Little Book of Semaphores. 2nd ed. Online. Needham, MA, USA: Green Tea Press, 2008. url: http://www.greenteapress.com/semaphores/ (visited on May 7, 2014).
[Dre07] Ulrich Drepper. What Every Programmer Should Know About Memory. Online. Raleigh, NC, USA: Red Hat, Inc., 2007. url: http://www.akkadia.org/drepper/cpumemory.pdf (visited on June 5, 2014).
[Dre11] Ulrich Drepper. Futexes Are Tricky. Tech. rep. Version 1.6. Online. Raleigh, NC, USA: Red Hat, Inc., Nov. 2011. url: http://www.akkadia.org/drepper/futex.pdf (visited on May 7, 2014).
[Duf+09] Loïc Duflot, Olivier Levillain, Benjamin Morin, and Olivier Grumelard. Getting into the SMRAM: SMM Reloaded. Presentation slides. 2009. url: http://www.ssi.gouv.fr/IMG/pdf/Cansec_final.pdf.
[Duv09] Dominic Duval. “From Fast to Predictably Fast.” In: Proc. Linux Symp. (OLS). (Montreal, QC, Canada, July 13–17, 2009). Ed. by Robyn Bergeron, Chris Dukes, Jonas Fonseca, and John Hawley. 2009, pp. 79–86. url: http://linuxsymposium.org/2009/ls-2009-proceedings.pdf (visited on May 7, 2014).
[DW05] Sven-Thorsten Dietrich and Daniel Walker. “The Evolution of Real-Time Linux.” In: Proc. 7th Real-Time Linux Workshop (RTLWS). (Lille, France, Nov. 3–4, 2005). 2005.
[Edg11] Jake Edge. “ELC: A PREEMPT_RT roadmap.” In: LWN.net (Apr. 27, 2011). url: http://lwn.net/Articles/440064 (visited on Oct. 7, 2014).
[Edg13a] Jake Edge. “Plans for hot adding and removing memory.” In: LWN.net (June 12, 2013). url: http://lwn.net/Articles/553199 (visited on Oct. 7, 2014).
[Edg13b] Jake Edge. “The future of realtime Linux.” In: LWN.net (Nov. 6, 2013). url: http://lwn.net/Articles/572740 (visited on Oct. 7, 2014).


[Eid+04] Eric Eide, Tim Stack, John Regehr, and Jay Lepreau. “Dynamic CPU management for real-time, middleware-based systems.” In: Proc. of 10th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). (Toronto, ON, Canada, May 25–28, 2004). Washington, DC, USA: IEEE Computer Society, 2004, pp. 286–295. doi: 10.1109/RTTAS.2004.1317274.
[Els+72] Bernard Elspas, Karl N. Levitt, Richard J. Waldinger, and Abraham Waksman. “An Assessment of Techniques for Proving Program Correctness.” In: ACM Comput. Surv. 4.2 (June 1972), pp. 97–147. issn: 0360-0300. doi: 10.1145/356599.356602.
[Emd10] Carsten Emde. “Long-term monitoring of apparent latency in PREEMPT RT Linux real-time systems.” In: Proc. 12th Real-Time Linux Workshop (RTLWS). (Nairobi, Kenya, Oct. 25–27, 2010). 2010.
[Emd11] Carsten Emde. “How to cope with the negative impact of a processor’s energy-saving features on real-time capabilities?” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Development Lab (OSADL) eG, 2011. isbn: 978-3-0003-6193-7.
[Eng00] Ralf S. Engelschall. “Portable Multithreading: The Signal Stack Trick For User-Space Thread Creation.” In: USENIX Annu. Technical Conf. (San Diego, CA, USA, June 18–23, 2000). Berkeley, CA, USA: USENIX, June 2000.
[ESZ08] Shawn Embleton, Sherri Sparks, and Cliff Zou. “SMM Rootkits: A New Breed of OS Independent Malware.” In: Proc. of the 4th Int. Conf. on Security and Privacy in Communication Networks (SecureComm). (Istanbul, Turkey, Sept. 22–25, 2008). New York, NY, USA: ACM Assoc. for Computing Machinery, 2008, 11:1–11:12. isbn: 978-1-60558-241-2. doi: 10.1145/1460877.1460892.
[EV11] Benjamin Engel and Marcus Völp. “Turning Krieger’s MCS Lock into a Send Queue. Or, a Case for Reusing Clever, Mostly Lock-Free Code in a Different Area.” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Development Lab (OSADL) eG, 2011. isbn: 978-3-0003-6193-7.
[Fer+01] Christian Ferdinand, Reinhold Heckmann, Marc Langenbach, Florian Martin, Michael Schmidt, et al. “Reliable and Precise WCET Determination for a Real-Life Processor.” In: Proc. 1st Int. Workshop Embedded Software (EMSOFT). (Tahoe City, CA, USA, Oct. 8–10, 2001). Ed. by Thomas A. Henzinger and Christoph M. Kirsch. Vol. 2211. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 2001, pp. 469–485. isbn: 3-540-42673-6. doi: 10.1007/3-540-45449-7_32.


[Fin01] Steven A. Finney. “Real-time data collection in Linux: A case study.” In: Behav. Res. Methods Instrum. Comput. 33.2 (2001), pp. 167–173. issn: 0743-3808. doi: 10.3758/BF03195362.
[FLO02] Dominique Fober, Stephane Letz, and Yann Orlarey. “Lock-Free Techniques for Concurrent Access to Shared Objects.” In: Actes des Journées d’Informatique Musicale (JIM). (Marseille, France, May 29–31, 2002). Marseille, France: Centre Nationale de Création Musicale, 2002, pp. 143–150. url: http://www.grame.fr/ressources/publications/fober-JIM2002.pdf (visited on May 7, 2014).
[Fly72] Michael J. Flynn. “Some Computer Organizations and Their Effectiveness.” In: IEEE Trans. Comput. C-21.9 (Sept. 1972), pp. 948–960. issn: 0018-9340. doi: 10.1109/TC.1972.5009071.
[Fog13a] Agner Fog. Software optimization manuals. Copenhagen, Denmark: Technical University of Denmark, 2013. url: http://www.agner.org/optimize/ (visited on May 7, 2014).
[Fog13b] Agner Fog. Software optimization manuals. Vol. 1: Optimizing software in C++. An optimization guide for Windows, Linux and Mac platforms. Copenhagen, Denmark: Technical University of Denmark, Sept. 23, 2013. url: http://www.agner.org/optimize/ (visited on May 7, 2014).
[Fog13c] Agner Fog. Software optimization manuals. Vol. 2: Optimizing subroutines in assembly language. An optimization guide for x86 platforms. Copenhagen, Denmark: Technical University of Denmark, Sept. 28, 2013. url: http://www.agner.org/optimize/ (visited on May 7, 2014).
[Fog13d] Agner Fog. Software optimization manuals. Vol. 3: The microarchitecture of Intel, AMD and VIA CPUs. An optimization guide for assembly programmers and compiler makers. Copenhagen, Denmark: Technical University of Denmark, Sept. 4, 2013. url: http://www.agner.org/optimize/ (visited on May 7, 2014).
[Fog13e] Agner Fog. Software optimization manuals. Vol. 4: Instruction tables. Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs. Copenhagen, Denmark: Technical University of Denmark, Apr. 3, 2013. url: http://www.agner.org/optimize/ (visited on May 7, 2014).
[Fri05] Brandon Friesen. Bran’s Kernel Development. A tutorial on writing kernels. Online. 2005. url: http://www.osdever.net/bkerndev/Docs/title.htm (visited on May 7, 2014).


[FRK02] Hubertus Franke, Rusty Russell, and Matthew Kirkwood. “Fuss, Futexes and Furwocks: Fast Userlevel Locking in Linux.” In: Proc. Ottawa Linux Symp. (OLS). (Ottawa, ON, Canada, June 26–29, 2002). Ed. by John W. Lockhart. 2002, pp. 479–489. url: http://www.linuxsymposium.org/archives/OLS/Reprints-2002/ (visited on May 7, 2014).
[Fue+12] David de la Fuente, Jesús Barba, Fernando Rincón, Julio Daniel Dondo, and Juan Carlos López. “Flexible, Open and Efficient Embedded Multimedia Systems.” In: Embedded Systems. High Performance Systems, Applications and Projects. Ed. by Kiyofumi Tanaka. Rijeka, Croatia: InTech, 2012. Chap. 7. isbn: 978-953-51-0350-9. doi: 10.5772/2684.
[Gal95] Bill O. Gallmeister. POSIX.4, Programming for the Real World. Sebastopol, CA, USA: O’Reilly, 1995. isbn: 1-56592-070-0.
[GDA08] Alexander Graf, Olaf Dabrunz, and Stefan Assmann. “Interrupt Handling on x86 (RT) and Boot Interrupt Quirks.” In: Proc. 10th Real-Time Linux Workshop (RTLWS). (Colotlán, Jalisco, Mexico, Sept. 28–30, 2008). 2008.
[Gee07] David Geer. “For Programmers, Multicore Chips Mean Multiple Challenges.” In: Computer 40.9 (Sept. 2007), pp. 17–19. issn: 0018-9162. doi: 10.1109/MC.2007.311.
[Ger+12] Mike Gerdes, Florian Kluge, Theo Ungerer, Christine Rochange, and Pascal Sainrat. “Time Analysable Synchronisation Techniques for Parallelised Hard Real-Time Applications.” In: Design, Automation Test in Europe Conf. Exhibition (DATE). (Dresden, Germany, Mar. 12–16, 2012). Washington, DC, USA: IEEE Computer Society, Mar. 2012, pp. 671–676. doi: 10.1109/DATE.2012.6176555.
[Ger01] Philippe Gerum. “The XENOMAI Project, Implementing a RTOS emulation framework on GNU/Linux.” In: Proc. 3rd Real-Time Linux Workshop (RTLWS). (Milan, Italy, Nov. 26–29, 2001). 2001.
[Ger04] Philippe Gerum. Xenomai – Implementing a RTOS emulation framework on GNU/Linux. White paper. Xenomai, 2004. url: http://www.xenomai.org/documentation/branches/v2.3.x/pdf/xenomai.pdf (visited on May 7, 2014).
[GF07] Matthias Goebl and Georg Färber. “A Real-Time-capable Hard- and Software Architecture for Joint Image and Knowledge Processing in Cognitive Automobiles.” In: IEEE Intelligent Vehicles Symp. (Istanbul, Turkey, June 13–15, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 734–740. doi: 10.1109/IVS.2007.4290204.

278 Bibliography

[Gio+04] Roberto Gioiosa, Fabrizio Petrini, Kei Davis, and Fabien Lebaillif- Delamare. “Analysis of System Overhead on Parallel Computers.” In: Proc. 4th IEEE Int. Symp. on Signal Processing and Information Technology (ISSPIT). (Rome, Italy, Dec. 18–21, 2004). Washington, DC, USA: IEEE Computer Society, 2004, pp. 387–390. doi: 10.1109/ ISSPIT.2004.1433800. [GL11] Thomas Gusenleitner and Gerhard Lettner. “Evaluation of RT-Linux on different hardware platforms for the use in industrial machinery control.” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Develop- ment Lab (OSADL) eG, 2011. isbn: 978-3-0003-6193-7. [Gle12] Thomas Gleixner. “kthread: Implement park/unpark facility.” In: LWN.net (June 5, 2012). url: http://lwn.net/Articles/500338 (visited on Oct. 7, 2014). [GM08] Luis Claudio R. Gonçalves and Arnaldo Carvalho de Melo. “Application Testing under Realtime Linux.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, July 23–26, 2008). Ed. by John W. Lockhart, Gurhan Ozen, Eugene Teo, Kyle McMartin, Jake Edge, et al. Vol. 1. 2008, pp. 143–150. url: http://kernel.org/doc/ols/ (visited on May 7, 2014). [GN06] Thomas Gleixner and Douglas Niehaus. “Hrtimers and Beyond: Trans- forming the Linux Time Subsystems.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, July 20–23, 2006). Ed. by John W. Lockhart, David M. Fellows, and Kyle McMartin. Vol. 1. 2006, pp. 333–346. url: http://kernel.org/doc/ols/ (visited on May 7, 2014). [Goe+02] Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow, and Jonathan Walpole. “Supporting Time-Sensitive Applications on a Commodity OS.” In: SIGOPS Operat. Syst. Rev. 36.SI (Dec. 2002), pp. 165–180. issn: 0163-5980. doi: 10.1145/844128.844144. [Gol+13] Benjamin Goldschmidt, Christoph W. Lerche, Torsten Solf, André Salomon, Fabian Kiessling, et al. “Towards Software-Based Real-Time Singles and Coincidence Processing of Digital PET Detector Raw Data.” In: IEEE Trans. Nucl. Sci. 
60.3 (June 2013), pp. 1550–1559. issn: 0018-9499. doi: 10.1109/TNS.2013.2252193. [Gor07] Mel Gorman. Understanding The Linux Virtual Memory Manager. 2007. url: https://www.kernel.org/doc/gorman/pdf/understand.pdf (visited on May 7, 2014).

279 Bibliography

[GR05] Joël Goossens and Pascal Richard. “Overview of real-time scheduling problems.” In: 9th Int. Workshop on Project Management and Schedul- ing. (Nancy, France, Apr. 26–28, 2004). Ed. by Marie-Claude Portmann. Invited session. 2005. [Gre14] Brendan Gregg. “Ftrace: The hidden light switch.” In: LWN.net (Aug. 13, 2014). url: http://lwn.net/Articles/608497 (visited on Oct. 7, 2014). [GSC07] Corey Gough, Suresh Siddha, and Ken Chen. “Kernel Scalability— Expanding the Horizon Beyond Fine Grain Locks.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, June 27–30, 2007). Ed. by John W. Lockhart, Gurhan Ozen, John Feeney, Len DiMaggio, and John Poelstra. Vol. 1. 2007, pp. 153–165. url: http://kernel.org/doc/ ols/ (visited on May 7, 2014). [Gua+09] Nan Guan, Martin Stigge, Wang Yi, and Ge Yu. “Cache-Aware Schedul- ing and Analysis for Multicores.” In: Proc. 7th ACM Int. Conf. on Em- bedded Software (EMSOFT). (Grenoble, France, Oct. 12–16, 2009). New York, NY, USA: ACM Assoc. for Computing Machinery, 2009, pp. 245– 254. isbn: 978-1-60558-627-4. doi: 10.1145/1629335.1629369. [Gui+11] Alberto Guiggiani, Michele Basso, Massimo Vassalli, and Francesco Difato. “Realtime Suite: a step-by-step introduction to the world of real-time signal acquisition and conditioning.” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Development Lab (OSADL) eG, 2011. isbn: 978-3-0003-6193-7. [Hag05] William von Hagen. “Real-Time and Performance Improvements in the 2.6 Linux Kernel.” In: Linux J. (Issue 134 June 2005). issn: 1075-3583. url: http://www.linuxjournal.com/article/8041 (visited on May 7, 2014). [Hal+00] Wolfgang A. Halang, Roman Gumzej, Matjaž Colnarič, and Marjan Družovec. “Measuring the Performance of Real-Time Systems.” In: Real- Time Syst. 18 (1 2000), pp. 59–68. issn: 0922-6443. doi: 10.1023/A: 1008102611034. [Her91] Maurice Herlihy. “Wait-Free Synchronization.” In: ACM Trans. Pro- gram. Lang. Syst. 13.1 (Jan. 
1991), pp. 124–149. issn: 0164-0925. doi: 10.1145/114005.102808. [Her93] Maurice Herlihy. “A Methodology for Implementing Highly Concurrent Data Objects.” In: ACM Trans. Program. Lang. Syst. 15.5 (Nov. 1993), pp. 745–770. issn: 0164-0925. doi: 10.1145/161468.161469.

280 Bibliography

[HH89] V. P. Holmes and D. L. Harris. “A designer’s perspective of the Hawk multiprocessor operating system kernel.” In: SIGOPS Operat. Syst. Rev. 23.3 (July 1989), pp. 158–172. issn: 0163-5980. doi: 10.1145/ 71021.71030. [HHW98] Hermann Härtig, Michael Hohmuth, and Jean Wolter. “Taming Linux.” In: Proc. 5th Australasian Conf. on Parallel and Real-Time Systems. Berlin, Germany: Springer, 1998. isbn: 9814021229. [HM08] Serge E. Hallyn and Andrew G. Morgan. “Linux Capabilities: making them work.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, July 23–26, 2008). Ed. by John W. Lockhart, Gurhan Ozen, Eugene Teo, Kyle McMartin, Jake Edge, et al. Vol. 1. 2008, pp. 163–172. url: http://kernel.org/doc/ols/ (visited on May 7, 2014). [HMN09] Daniel Hackenberg, Daniel Molka, and Wolfgang E. Nagel. “Comparing Cache Architectures and Coherency Protocols on x86-64 Multicore SMP Systems.” In: Proc. 42nd Annu. IEEE/ACM Int. Symp. on Microarchi- tecture (MICRO). (New York, NY, USA, Dec. 12–16, 2009). New York, NY, USA: ACM Assoc. for Computing Machinery, 2009, pp. 413–422. isbn: 978-1-60558-798-1. doi: 10.1145/1669112.1669165. [HP07] John L. Hennessy and David A. Patterson. Computer Architecture. A Quantitative Approach. 4th ed. San Francisco, CA, USA: Morgan Kaufmann, 2007. isbn: 978-0-12-370490-0. [HP11] John L. Hennessy and David A. Patterson. Computer Architecture. A Quantitative Approach. 5th ed. San Francisco, CA, USA: Morgan Kaufmann, 2011. isbn: 978-0-12-383872-8. [HS08] Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Program- ming. Elsevier Inc., 2008. isbn: 978-0-12-370591-4. [HW10] Georg Hager and Gerhard Wellein. Introduction to High Performance Computing for Scientists and Engineers. Boca Raton, FL, USA: CRC Press, Inc., 2010. isbn: 978-1-4398-1192-4. [Hwa93] Kai Hwang. Advanced Computer Architecture. Parallelism, Scalability, Programmability. New York, NY, USA: McGraw-Hill, Inc., 1993. isbn: 0-07-113342-9. [HX96] Kai Hwang and Zhiwei Xu. 
“Scalable Parallel Computers for Real-Time Signal Processing.” In: IEEE Signal Process. Mag. 13.4 (July 1996), pp. 50–66. issn: 1053-5888. doi: 10.1109/79.526898. [Intel08a] Intel Processor Identification and the CPUID Instruction. Application Note 485. 241618-033. Santa Clara, CA, USA: Intel Corp., Nov. 2008.

281 Bibliography

[Intel08b] How Much Performance Do You Need for 3D Medical Imaging? 315896. Santa Clara, CA, USA: Intel Corp., 2008. url: http://download. intel . com / design / embedded / medical - solutions / 315896 . pdf (visited on July 1, 2014). [Intel09] An Introduction to the Intel QuickPath Interconnect. 320412-001US. Santa Clara, CA, USA: Intel Corp., Jan. 2009. [Intel13a] Intel 64 and IA-32 Architectures Software Developer’s Manual. Vol. 1: Basic Architecture. 253665-045US. Santa Clara, CA, USA: Intel Corp., Jan. 2013. [Intel13b] Intel 64 and IA-32 Architectures Software Developer’s Manual. Vol. 2: Instruction Set Reference. 325383-045US. Santa Clara, CA, USA: Intel Corp., Jan. 2013. [Intel13c] Intel 64 and IA-32 Architectures Software Developer’s Manual. Vol. 3: System Programming Guide. 325384-045US. Santa Clara, CA, USA: Intel Corp., Jan. 2013. [Intel13d] Intel 64 and IA-32 Architectures Optimization Reference Manual. 248966- 028. Santa Clara, CA, USA: Intel Corp., July 2013. [Intel13e] Intel 64 and IA-32 Architectures Software Developer’s Manual. Santa Clara, CA, USA: Intel Corp., Jan. 2013. [Ish+13] Takuya Ishikawa, Toshikazu Kato, Shinya Honda, and Hiroaki Takada. “Investigation and Improvement on the Impact of TLB misses in Real- Time Systems.” In: Int. Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT) in conjunction with the 25th Euromicro Conf. on Real-Time Systems. (Paris, France, July 9, 2013). Ed. by Andrea Bastoni and Shinpei Kato. 2013. url: http: //ertl.jp/~shinpei/conf/ospert13/OSPERT13_proceedings_web. pdf (visited on May 7, 2014). [ISO06] ISO/IEC. Technical Report on C++ Performance. Tech. rep. ISO/IEC TR 18015:2006(E). Geneva, Switzerland, 2006. [Jai13] Gitu Jain. “Programming Languages.” In: Real World Multicore Em- bedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newness, 2013. Chap. 9. isbn: 978-0-12-416018-7. [Jal+10] Aamer Jaleel, Eric Borch, Malini Bhandaru, Simon C. 
Steely Jr., and Joel Emer. “Achieving Non-Inclusive Cache Performance with Inclusive Caches. Temporal Locality Aware (TLA) Cache Management Policies.” In: Proc. 43rd Annu. IEEE/ACM Int. Symp. on Microarchitecture (MICRO). (Atlanta, GA, USA, Dec. 4–8, 2010). Washington, DC, USA: IEEE Computer Society, 2010, pp. 151–162. doi: 10.1109/MICRO. 2010.52.

282 Bibliography

[JCW12] Jhing-Fa Wang, Po-Chun Lin, and Bo-Wen Chen. “Design and Appli- cations of Emnbedded Systems for Speech Processing.” In: Embedded Systems. High Performance Systems, Applications and Projects. Ed. by Kiyofumi Tanaka. Rijeka, Croatia: InTech, 2012. Chap. 9. isbn: 978-953-51-0350-9. doi: 10.5772/2684. [JLT85] E. Douglas Jensen, C. Douglas Locke, and Hideyuki Tokuda. “A Time- Driven Scheduling Model for Real-Time Operating Systems.” In: Proc. 6th IEEE Real-Time Systems Symp. (RTSS). (San Diego, CA, USA, Dec. 3–6, 1985). Washington, DC, USA: IEEE Computer Society, 1985, pp. 112–122. isbn: 0-8186-0675-4. [Jos+10] Adhijaj Joshi, Swapnil Pimpale, Mandar Naik, Swapnil Rathi, and Kiran Pawar. “Twin-Linux: Running independent Linux Kernels simul- taneously on separate cores of a multicore system.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, July 13–16, 2010). Ed. by Robyn Bergeron. 2010, pp. 101–107. url: http://kernel.org/doc/mirror/ ols2010.pdf (visited on May 7, 2014). [kad02] kad. “Handling Interrupt Descriptor Table for fun and profit.” In: Phrack mag. (59 July 2002). url: http://www.phrack.org/issues. html?issue=59&id=4 (visited on May 7, 2014). [KB11] Markus Klotzbücher and Herman Bruyninckx. “Hard real-time Control and Coordination of Robot Tasks using Lua.” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Development Lab (OSADL) eG, 2011. isbn: 978-3-0003-6193-7. [KCS04] Seongbeom Kim, Dhruba Chandra, and Yan Solihin. “Fair Cache Shar- ing and Partitioning in a Chip Multiprocessor Architecture.” In: Proc. 13th Int. Conf. on Parallel Architectures and Compilation Techniques (PACT). (Antibes Juan-les-Pins, France, Sept. 29–Oct. 3, 2004). Wash- ington, DC, USA: IEEE Computer Society, 2004, pp. 111–122. isbn: 0-7695-2229-7. doi: 10.1109/PACT.2004.15. [Ker10] . The Linux Programming Interface. San Francisco, CA, USA: No Starch Press, 2010. isbn: 978-1-59327-220-3. [KG93] H. Kopetz and G. 
Grünsteidl. “TTP - A time-triggered protocol for fault-tolerant real-time systems.” In: Dig. of Papers 23rd Int. Symp. on Fault-Tolerant Computing (FTCS). (Toulouse, France, June 22–24, 1993). Washington, DC, USA: IEEE Computer Society, June 1993, pp. 524–533. doi: 10.1109/FTCS.1993.627355.

283 Bibliography

[Kis05] Jan Kiszka. “The Real-Time Driver Model and First Applications.” In: Proc. 7th Real-Time Linux Workshop (RTLWS). (Lille, France, Nov. 3– 4, 2005). 2005. [Kis09] Jan Kiszka. “Towards Linux as a Real-Time Hypervisor.” In: Proc. 11th Real-Time Linux Workshop (RTLWS). (Dresden, Germany, Sept. 28– 30, 2009). 2009, pp. 205–214. [Kis13a] Jan Kiszka. Static System Partitioning and KVM. Online. Presentation slides: KVM Forum, Edinburgh, Scotland, October 21+22. Edinburgh, Scotland, Oct. 21, 2013. url: http://www.linux-kvm.org/wiki/ images/b/b1/Kvm-forum-2013-Static-Partitioning.pdf (visited on May 7, 2014). [Kis13b] Jan Kiszka. Virtualizing Real-time Systems with Linux. Online. Pre- sentation slides: LinuxCon Japan, Tokyo, May 29–31. Tokyo, Japan, May 30, 2013. url: http://events.linuxfoundation.org/sites/ events/files/lcjp13_kiszka.pdf (visited on May 7, 2014). [Kle04] Andi Kleen. An NUMA API for Linux. White Paper. Online. SUSE Labs, Aug. 2004. url: http://halobates.de/numaapi3.pdf (visited on Sept. 2, 2014). [KLW00] Tei-Wei Kuo, Kwei-Jay Lin, and Yu-Chung Wang. “An Open Real-Time Environment for Parallel and Distributed Systems.” In: Proc. 20th Int. Conf. on Distributed Computing Systems (ICDCS). (Taipei, Taiwan, Apr. 10–13, 2000). Washington, DC, USA: IEEE Computer Society, 2000, pp. 206–213. doi: 10.1109/ICDCS.2000.840931. [Kop08] Hermann Kopetz. “The Rationale for Time-Triggered Ethernet.” In: Proc. 29th IEEE Real-Time Systems Symp. (RTSS). (Barcelona, Spain, Nov. 30–Dec. 3, 2008). Invited keynote paper. Washington, DC, USA: IEEE Computer Society, 2008, pp. 3–11. url: http://ieeexplore. ieee.org/xpl/mostRecentIssue.jsp?punumber=4700407 (visited on May 7, 2014). [Kop91] Hermann Kopetz. “Event-Triggered versus Time-Triggered Real-Time Systems.” In: Proc. Int. Workshop Operating Systems of the 90s and Beyond. (Dagstuhl Castle, Germany, July 8–12, 1991). Ed. by Arthur I. Karshmer and Jürgen Nehmer. Vol. 563. Lect. Notes in Comp. Sci. 
Berlin, Germany: Springer, 1991, pp. 86–101. isbn: 3-540-54987-0. doi: 10.1007/BFb0024530. [Kop93] Hermann Kopetz. “Should Responsive Systems be Event-Triggered or Time-Triggered?” In: IEICE Trans. Inf. Syst. 76.11 (Nov. 1993), pp. 1325–1332. issn: 0916-8532.

284 Bibliography

[Kop97] Hermann Kopetz. Real-Time Systems. Design Principles for Distributed Embedded Applications. Norwell, MA, USA: Kluwer Academic Publish- ers, 1997. isbn: 0-7923-9894-7. [Kop98] Herman Kopetz. “Component-based design of large distributed real- time systems.” In: Control Eng. Pract. 6.1 (1998), pp. 53–60. issn: 0967-0661. doi: 10.1016/S0967-0661(97)10047-8. [Kra+09] Charles Krasic, Mayukh Saubhasik, Anirban Sinha, and Ashvin Goel. “Fair and Timely Scheduling via Cooperative Polling.” In: Proc. 4th ACM European Conf. on Computer Systems (EuroSys). (Nuremberg, Germany, Mar. 31–Apr. 3, 2009). New York, NY, USA: ACM Assoc. for Computing Machinery, 2009, pp. 103–116. isbn: 978-1-60558-482-9. doi: 10.1145/1519065.1519077. [Kri+93] Orran Krieger, Michael Stumm, Ron Unrau, and Jonathan Hanna. “A Fair Fast Scalable Reader-Writer Lock.” In: Proc. of the Int. Conf. on Parallel Processing (ICPP). (Syracuse, NY, USA, Aug. 16–20, 1993). Vol. 2. Washington, DC, USA: IEEE Computer Society, 1993, pp. 201– 204. doi: 10.1109/ICPP.1993.21. [KRI10] Shinpei Kato, Ragunathan Rajkumar, and Yutaka Ishikawa. “AIRS: Supporting Interactive Real-Time Applications on Multicore Plat- forms.” In: Proc. 22nd Euromicro Conf. on Real-Time Systems (ECRTS). (Brussels, Belgium, July 6–9, 2010). Washington, DC, USA: IEEE Com- puter Society, 2010, pp. 47–56. doi: 10.1109/ECRTS.2010.33. [KSB10] Markus Klotzbücher, Peter Soetens, and Herman Bruyninckx. “ORO- COS RTT-Lua: an Execution Environment for building Real-time Robotic Domain Specific Languages.” In: Int. Workshop on Dynamic languages for Robotic and Sensors (DYROS). (Darmstadt, Germany, Nov. 15–18, 2010). Ed. by Emanuele Menegatti. 2010, pp. 284–289. url: https://www.sim.informatik.tu-darmstadt.de/simpar/ws/ sites/DYROS2010/03-DYROS.pdf (visited on May 7, 2014). [Küh98] C. Kühnel. AVR RISC Handbook. Woburn, MA, USA: Butterworth-Heinemann, 1998. isbn: 978-0-7506-9963-1. [KW05] J. Kiszka and B. Wagner. 
“RTnet - a flexible hard real-time networking framework.” In: 10th IEEE Conf. on Emerging Technologies and Fac- tory Automation (ETFA). (Catania, Italy, Sept. 19–22, 2005). Vol. 1. Washington, DC, USA: IEEE Computer Society, Sept. 2005, pp. 449– 456. doi: 10.1109/ETFA.2005.1612559.

285 Bibliography

[KW07] Jan Kiszka and Bernardo Wagner. “Modelling Security Risks in Real- Time Operating Systems.” In: 5th IEEE Int. Conf. on Industrial In- formatics. (Vienna, Austria, July 23–27, 2007). Vol. 1. Washington, DC, USA: IEEE Computer Society, June 2007, pp. 125–130. doi: 10.1109/INDIN.2007.4384743. [KY08] Shinpei Kato and Nobuyuki Yamasaki. “Modular Real-Time Linux.” In: Proc. 10th Real-Time Linux Workshop (RTLWS). (Colotlán, Jalisco, Mexico, Sept. 28–30, 2008). 2008. [Lab97] Jean J. Labrosse. “Designing with Real-Time Kernels.” In: Proc. of Embedded Systems Conf. East. (Boston, Massachusetts, USA, Mar. 10– 12, 1997). San Francisco, CA, USA: Miller Freeman Inc., 1997, pp. 379– 389. isbn: 0-8793-0388-3. [Lal13] Sanjay Lal. “Bare-Metal Systems.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newness, 2013. Chap. 15. isbn: 978-0-12-416018-7. [Lam09] Christopher Lameter. “Shoot first and stop the OS noise.” In: Proc. Linux Symp. (OLS). (Montreal, QC, Canada, July 13–17, 2009). Ed. by Robyn Bergeron, Chris Dukes, Jonas Fonseca, and John Hawley. 2009, pp. 479–489. url: http://linuxsymposium.org/2009/ls- 2009- proceedings.pdf (visited on May 7, 2014). [Lam77] Leslie Lamport. “Concurrent Reading and Writing.” In: Commun. ACM 20.11 (Nov. 1977), pp. 806–811. issn: 0001-0782. doi: 10.1145/359863. 359878. [LCB00] Guiseppe Lipari, John Carpenter, and Sanjoy Baruah. “A framework for achieving inter-application isolation in multiprogrammed, hard real- time environments.” In: Proc. 21st IEEE Real-Time Systems Symp. (RTSS). (Orlando, FL, USA, Nov. 27–30, 2000). Washington, DC, USA: IEEE Computer Society, 2000, pp. 217–226. doi: 10.1109/REAL.2000. 896011. [Lei+07] Bernhard Leiner, Martin Schlager, Roman Obermaisser, and Bernhard Huber. “A Comparison of Partitioning Operating Systems for Integrated Systems.” In: 26th Int. Conf. Computer Safety, Reliability, and Security (SAFECOMP). (Nuremberg, Germany, Sept. 18–21, 2007). Ed. 
by Francesca Saglietti and Norbert Oster. Vol. 4680. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 2007, pp. 342–355. isbn: 978-3-540- 75100-7. doi: 10.1007/978-3-540-75101-4\_33.

286 Bibliography

[Lew+07] Mark Lewandowski, Mark J. Stanovich, Theodore P. Baker, Kartik Gopalan, and An-I Andy Wang. “Modeling device driver effects in real-time schedulability analysis: Study of a network driver.” In: Proc. of 13th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). (Bellevue, WA, USA, Apr. 3–6, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 57–68. doi: 10.1109/ RTAS.2007.18. [Lew91] Donald Lewine. POSIX Programmer’s Guide. Writing Portable UNIX Programs. Sebastopol, CA, USA: O’Reilly, 1991. isbn: 0-937175-73-0. [LHH97] Jochen Liedtke, Hermann Härtig, and Michael Hohmuth. “OS-Controlled Cache Predictability for Real-Time Systems.” In: Proc. of 3rd IEEE Real-Time Technology and Applications Symp. (RTAS). (Montréal, QC, Canada, June 9–11, 1997). Washington, DC, USA: IEEE Computer Society, 1997, pp. 213–224. doi: 10.1109/RTTAS.1997.601360. [Li+04] Peng Li, Binoy Ravindran, Syed Suhaib, and Shahrooz Feizabadi. “A Formally Verified Application-Level Framework for Real-Time Schedul- ing on POSIX Real-Time Operating Systems.” In: IEEE Trans. Softw. Eng. 30.9 (Sept. 2004), pp. 613–629. issn: 0098-5589. doi: 10.1109/ TSE.2004.45. [Lic+08] Ben Lickly, Isaac Liu, Sungjun Kim, Hiren D. Patel, Stephen A. Ed- wards, et al. “Predictable Programming on a Precision Timed Architec- ture.” In: Proc. Int. Conf. on Compilers, Architectures and Synthesis for Embedded Systems (CASES). (Atlanta, GA, USA, Oct. 19–24, 2008). New York, NY, USA: ACM Assoc. for Computing Machinery, 2008, pp. 137–146. isbn: 978-1-60558-469-0. doi: 10.1145/1450095. 1450117. [Liu00] Jane W. S. Liu. Real-Time Systems. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2000. isbn: 0-13-099651-3. [LL73] C. L. Liu and James W. Layland. “Scheduling Algorithms for Multipro- gramming in a Hard-Real-Time Environment.” In: J. ACM 20.1 (Jan. 1973), pp. 46–61. issn: 0004-5411. doi: 10.1145/321738.321743. [LLX09] Tiantian Liu, Minming Li, and Chun Jason Xue. 
“Minimizing WCET for Real-Time Embedded Systems via Static Instruction Cache Lock- ing.” In: Proc. of 15th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). (San Francisco, CA, USA, Apr. 13–16, 2009). Washington, DC, USA: IEEE Computer Society, 2009, pp. 35–44. doi: 10.1109/RTAS.2009.11. [Lov10] Robert Love. Linux Kernel Development. 3rd ed. Boston, MA, USA: Pearson Education Inc., 2010. isbn: 978-0-672-32946-3.

287 Bibliography

[LQ12] Chai Lina and Zhou Qiang. “The Design and Implementation of Real- time HILS Based on RTX Platform.” In: 10th IEEE Int. Conf. on Industrial Informatics (INDIN). (Beijing, China, July 25–27, 2012). Washington, DC, USA: IEEE Computer Society, 2012, pp. 276–280. doi: 10.1109/INDIN.2012.6301215. [LS04] Edya Ladan-Mozes and Nir Shavit. “An Optimistic Approach to Lock- Free FIFO Queues.” In: Proc. 18th Int. Conf. Distributed Computing (DISC). (Amsterdam, Netherlands, Oct. 4, 2004–Oct. 7, 2010). Ed. by Rachid Guerraoui. Vol. 3274. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 2004, pp. 117–131. isbn: 3-540-23306-7. doi: 10. 1007/978-3-540-30186-8_9. [LS99] Thomas Lundqvist and Per Stenstrom. “Timing Anomalies in Dy- namically Scheduled .” In: Proc. 20th IEEE Real-Time Systems Symp. (RTSS). (Phoenix, AZ, USA, Dec. 1–3, 1999). Wash- ington, DC, USA: IEEE Computer Society, 1999, pp. 12–21. doi: 10.1109/REAL.1999.818824. [Lue12] Kenn R. Luecke. “Software Development for Parallel and Multi-Core Processing.” In: Embedded Systems. High Performance Systems, Appli- cations and Projects. Ed. by Kiyofumi Tanaka. Rijeka, Croatia: InTech, 2012. Chap. 3. isbn: 978-953-51-0350-9. doi: 10.5772/2684. [Lue13] Ken Luecke. “Tools.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newness, 2013. Chap. 10. isbn: 978-0-12-416018-7. [Man09] Keith Mannthey. System Management Interrupt Free Hardware. Pre- sentation slides: Linux Plumbers Conference, Portland, OR, USA, September 23–25. 2009. url: http://linuxplumbersconf.org/2009/ slides/Keith-Mannthey-SMI-plumers-2009.pdf. [Mar+04] Philippe Marquet, Eric Piel, Julien Soula, and Jean-Luc Dekeyser. “Implementation of ARTiS, an asymmetric real-time extension of SMP Linux.” In: Proc. 6th Real-Time Linux Workshop (RTLWS). (Singapore, Nov. 3–5, 2004). 2004. [Mar10] Maximilan Marx. “Porting of an Embedded RTOS into a Userspace Process.” Bachelor thesis. 
Aachen, Germany: RWTH Aachen University, Faculty of Electrical Engineering and Information Technology, Chair for Operating Systems, Aug. 2010. [Mas+04] M. Masmano, I. Ripoll, A. Crespo, and J. Real. “TLSF: a new dynamic memory allocator for real-time systems.” In: Proc. 16th Euromicro Conf. on Real-Time Systems (ECRTS). (Catania, Italy, June 30–July 2, 2004).

288 Bibliography

Washington, DC, USA: IEEE Computer Society, 2004, pp. 79–88. doi: 10.1109/EMRTS.2004.1311009. [Mas+08] Miguel Masmano, Ismael Ripoll, Patricia Balbastre, and Alfons Crespo. “A constant-time dynamic storage allocator for real-time systems.” In: Real-Time Syst. 40.2 (2008), pp. 149–179. issn: 0922-6443. doi: 10.1007/s11241-008-9052-7. [MČ08] Pavel Moryc and Jindřich Černohorský. “Task jitter measurement under RTLinux and RTX operating systems, comparison of RTLinux and RTX operating environments.” In: Int. Multiconf. on Computer Science and Information Technology (IMCSIT). (Wisła, Poland, Oct. 20–22, 2008). Ed. by M. Ganzha, M. Paprzycki, and T. Pełech-Pilichowski. Vol. 3. Washington, DC, USA: IEEE Computer Society, 2008, pp. 703– 709. isbn: 978-83-60810-14-9. doi: 10.1109/IMCSIT.2008.4747319. [MCAPI11] The Multicore Association. Multicore Communications API. Technical Standard. Version V2.015. Mar. 25, 2011. [McK+01] Paul E. McKenney, Jonathan Appavoo, Andi Kleen, Orran Krieger, Rusty Russell, et al. “Read-Copy Update.” In: Proc. Ottawa Linux Symp. (OLS). (Ottawa, ON, Canada, July 25, 2001–July 28, 2002). 2001. url: http://www.linuxsymposium.org/archives/OLS/Reprints- 2001/ (visited on May 7, 2014). [McK+06] Paul E. McKenney, Dipankar Sarma, Ingo Molnar, and Suparna Bhat- tacharya. “Extending RCU for Realtime and Embedded Workloads.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, July 20–23, 2006). Ed. by John W. Lockhart, David M. Fellows, and Kyle McMartin. Vol. 2. 2006, pp. 123–138. url: http://kernel.org/doc/ols/ (visited on May 7, 2014). [McK05a] Paul E. McKenney. “A realtime preemption overview.” In: LWN.net (Aug. 10, 2005). url: http://lwn.net/Articles/146861 (visited on Oct. 7, 2014). [McK05b] Paul E. McKenney. Attempted summary of “RT patch acceptance” thread, take 2. Message to Linux Kernel Mailing List. July 11, 2005. url: http://article.gmane.org/gmane.linux.kernel/317742 (visited on May 7, 2014). [McK07] Paul E. McKenney. 
“SMP and Embedded Real-Time.” In: Linux J. (Is- sue 153 Jan. 2007). issn: 1075-3583. url: http://www.linuxjournal. com/article/9361 (visited on Sept. 9, 2014).

289 Bibliography

[McK08] Paul E. McKenney. “‘Real Time’ vs. ‘Real Fast’: How to Choose?” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, July 23–26, 2008). Ed. by John W. Lockhart, Gurhan Ozen, Eugene Teo, Kyle McMartin, Jake Edge, et al. Vol. 2. 2008, pp. 57–66. url: http : //kernel.org/doc/ols/ (visited on May 7, 2014). [McK09] Paul E. McKenney. “Deterministic Synchronization in Multicore Sys- tems: the Role of RCU.” In: Proc. 11th Real-Time Linux Workshop (RTLWS). (Dresden, Germany, Sept. 28–30, 2009). 2009, pp. 245–254. [McK13a] Paul E. McKenney. “Provide infrastructure for full-system idle.” In: LWN.net (July 8, 2013). url: http://lwn.net/Articles/558229 (visited on Oct. 7, 2014). [McK13b] Paul E. McKenney. “Structured Deferral: Synchronization via Pro- crastination.” In: ACM Queue 11.5 (May 2013), 20:20–20:39. issn: 1542-7730. doi: 10.1145/2488364.2488549. [McK14a] Paul McKenney. Bare-Metal Multicore Performance in a General- Purpose Operating System. Online. Presentation slides: Linux Collabo- ration Summit, Napa Valley, CA, USA, March 26–28. Napa Valley, CA, USA, Mar. 28, 2014. url: http : / / www2 . rdrop . com / ~paulmck / scalability / paper / BareMetal . 2014 . 03 . 28a . pdf (visited on May 15, 2014). [McK14b] Paul E. McKenney. Is Parallel Programming Hard, And, If So, What Can You Do About It? First print edition. Apr. 3, 2014. url: https: //www.kernel.org/pub/linux/kernel/people/paulmck/perfbook/ perfbook.html (visited on July 4, 2014). [McK96] Paul E. McKenney. “Selecting Locking Primitives for Parallel Pro- gramming.” In: Commun. ACM 39.10 (Oct. 1996), pp. 75–82. issn: 0001-0782. doi: 10.1145/236156.236174. [MDL13] Paul E. McKenney, Mathieu Desnoyers, and Lai Jiangshan. “User- space RCU.” In: LWN.net (Nov. 13, 2013). url: http://lwn.net/ Articles/573424 (visited on Oct. 7, 2014). [MDP00] P. Mantegazza, E. L. Dozio, and S. Papacharalambous. “RTAI: Real Time Application Interface.” In: Linux J. (Issue 72 Apr. 2000). issn: 1075-3583. url: http : / / www . 
linuxjournal . com / article / 3838 (visited on Sept. 9, 2014). [Meß11] Sebastian Meßingfeld. “Integration eines optischen 3D-Sensors in eine mobile Roboterplattform.” German. Bachelor thesis. Aachen, Germany: Fachhochschule Aachen, Medizintechnik und Technomathematik, July 2011.

290 Bibliography

[Meu+] Hans Meuer, Erich Strohmaier, Jack Dongarra, and Horst Simon. Top 500 Sites. List of June 2014. url: http://www.top500. org (visited on Sept. 30, 2014). [MFC01] Aloysius K. Mok, Xiang Feng, and Deji Chen. “Resource Partition for Real-Time Systems.” In: Proc. of 7th IEEE Real-Time Technology and Applications Symp. (RTAS). (Taipei, Taiwan, May 30–June 1, 2001). Washington, DC, USA: IEEE Computer Society, 2001, pp. 75–84. doi: 10.1109/RTTAS.2001.929867. [MG11] Zoltan Majo and Thomas R. Gross. “Memory Management in NUMA Multicore Systems: Trapped Between Cache Contention and Intercon- nect Overhead.” In: Proc. Int. Symp. on Memory Management (ISMM). (San Jose, CA, USA, June 4–5, 2011). New York, NY, USA: ACM Assoc. for Computing Machinery, 2011, pp. 11–20. isbn: 978-1-4503-0263-0. doi: 10.1145/1993478.1993481. [MHH02] Frank Mehnert, Michael Hohmuth, and Hermann Härtig. “Cost and benefit of separate address spaces in real-time operating systems.” In: Proc. 23rd IEEE Real-Time Systems Symp. (RTSS). (Austin, TX, USA, Dec. 3–5, 2002). Washington, DC, USA: IEEE Computer Society, 2002, pp. 124–133. doi: 10.1109/REAL.2002.1181568. [Mic04] Maged M. Michael. “Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects.” In: IEEE Trans. Parallel Distrib. Syst. 15.6 (2004), pp. 491–504. issn: 1045-9219. doi: 10.1109/TPDS.2004.8. [Mit+11] Hitoshi Mitake, Thung-Han Lin, Hiromasa Shimada, Yuki Kinebuchi, Ning Li, et al. “Towards Co-existing of Linux and Real-Time OSes.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, June 13–15, 2011). Ed. by Ralph Siemsen. 2011, pp. 55–68. url: http://kernel.org/ doc/mirror/ols2011.pdf (visited on May 7, 2014). [MM01] Momtchil Momtchev and Philippe Marquet. An Open Operating System for Intensive Signal Processing. Tech. rep. LIFL Report #2001-08. Lille, France: Laboratoire d’informatique fondamentale, Université des sciences et technologies de Lille, Oct. 2001. [MMU11] Stefan Metzlaff, Jörg Mische, and Theo Ungerer. 
“A Real-Time Capa- ble Many-Core Model.” In: Work-in-Progress Proc. 32nd IEEE Real- Time Systems Symposium (RTSS-WiP). (Vienna, Austria, Nov. 30, 2011). 2011, pp. 21–24. url: http://www.cs.wayne.edu/~fishern/ Meetings/wip-rtss2011/WiP-RTSS-2011-Proceedings-Post.pdf (visited on Sept. 5, 2014).

291 Bibliography

[Mol+13] Christoph Molitor, Andrea Benigni, Alexander Helmedag, Kan Chen, Davide Calì, et al. “Multiphysics Test Bed for Renewable Energy Systems in Smart Homes.” In: IEEE Trans. Ind. Electron. 60.3 (2013), pp. 1235–1248. issn: 0278-0046. doi: 10.1109/TIE.2012.2190254.

[Moo06] Gordon E. Moore. “Cramming more components onto integrated circuits, Reprinted from Electronics, volume 38, number 8, April 19, 1965, pp. 114 ff.” In: Solid-State Circuits Society Newsletter, IEEE 11.5 (Sept. 2006), pp. 33–35. issn: 1098-4232. doi: 10.1109/N-SSC.2006.4785860.

[Mor+11] Alessandro Morari, Roberto Gioiosa, Robert W. Wisniewski, Francisco J. Cazorla, and Mateo Valero. “A Quantitative Analysis of OS Noise.” In: Proc. 2011 IEEE Int. Parallel and Distributed Processing Symp. (IPDPS). (Anchorage, AK, USA, May 16–20, 2011). Washington, DC, USA: IEEE Computer Society, 2011, pp. 852–863. doi: 10.1109/IPDPS.2011.84.

[Mos93] David Mosberger. “Memory Consistency Models.” In: SIGOPS Oper. Syst. Rev. 27.1 (Jan. 1993), pp. 18–26. issn: 0163-5980. doi: 10.1145/160551.160553.

[Moy13a] Bryon Moyer, ed. Real World Multicore Embedded Systems. A Practical Approach. Oxford, England: Newnes, 2013. isbn: 978-0-12-416018-7.

[Moy13b] Bryon Moyer. “Introduction and Roadmap.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newnes, 2013. Chap. 1. isbn: 978-0-12-416018-7.

[Moy13c] Bryon Moyer. “The Promise and Challenges of Concurrency.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newnes, 2013. Chap. 2. isbn: 978-0-12-416018-7.

[Moy13d] Bryon Moyer. “Operating Systems in Multicore Platforms.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newnes, 2013. Chap. 6. isbn: 978-0-12-416018-7.

[MS05] Paul E. McKenney and Dipankar Sarma. “Towards Hard Realtime Response from the Linux Kernel on SMP Hardware.” In: linux.conf.au. (Canberra, Australia, Apr. 18–23, 2005). 2005.

[MS13] Bryon Moyer and Paul Stravers. “Partitioning Programs for Multicore Systems.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newnes, 2013. Chap. 11. isbn: 978-0-12-416018-7.

[MS91] John M. Mellor-Crummey and Michael L. Scott. “Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors.” In: ACM Trans. Comput. Syst. 9.1 (Feb. 1991), pp. 21–65. issn: 0734-2071. doi: 10.1145/103727.103729.

[MS96a] Larry W. McVoy and Carl Staelin. “lmbench: Portable Tools for Performance Analysis.” In: USENIX Annu. Technical Conf. (San Diego, CA, USA, Jan. 22–26, 1996). Berkeley, CA, USA: USENIX, 1996, pp. 279–294.

[MS96b] Maged M. Michael and Michael L. Scott. “Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms.” In: Proc. 15th Annu. ACM Symp. on Principles of Distributed Computing (PODC). (Philadelphia, PA, USA, May 23–26, 1996). New York, NY, USA: ACM Assoc. for Computing Machinery, 1996, pp. 267–275. isbn: 0-89791-800-2. doi: 10.1145/248052.248106.

[MU12] Stefan Metzlaff and Theo Ungerer. “Impact of Instruction Cache and Different Instruction Scratchpads on the WCET Estimate.” In: Proc. 14th IEEE Int. Conf. on High Performance Computing and Communication (HPCC) – 9th IEEE Int. Conf. on Embedded Software and Systems (ICESS). (Liverpool, England, June 25–27, 2012). Washington, DC, USA: IEEE Computer Society, June 2012, pp. 1442–1449. doi: 10.1109/HPCC.2012.211.

[Mul13] The Multicore Association. Multicore Programming Practices. White Paper. 2013.

[Nak+12] Kazuhiro Nakamura, Ryo Shimazaki, Masatoshi Yamamoto, Kazuyoshi Takagi, and Naofumi Takagi. “A VLSI Architecture for Output Probability and Likelihood Score Computations of HMM-Based Recognition Systems.” In: Embedded Systems. High Performance Systems, Applications and Projects. Ed. by Kiyofumi Tanaka. Rijeka, Croatia: InTech, 2012. Chap. 8. isbn: 978-953-51-0350-9. doi: 10.5772/2684.

[Nam+09] Min-Young Nam, Rodolfo Pellizzoni, Lui Sha, and Richard M. Bradford. “ASIIST: Application Specific I/O Integration Support Tool for Real-Time Bus Architecture Designs.” In: 14th IEEE Int. Conf. on Engineering of Complex Computer Systems (ICECCS). (Potsdam, Germany, June 2–4, 2009). Washington, DC, USA: IEEE Computer Society, June 2009, pp. 11–22. isbn: 978-0-7695-3702-3. doi: 10.1109/ICECCS.2009.31.

[NBP96] Bradford Nichols, Dick Buttlar, and Jacqueline Proulx Farrell. Pthreads Programming. Sebastopol, CA, USA: O’Reilly, 1996. isbn: 1-56592-115-1.

[Neu93] John von Neumann. “First draft of a report on the EDVAC.” In: IEEE Ann. Hist. Comput. 15.4 (1993), pp. 27–75. issn: 1058-6180. doi: 10.1109/85.238389.

[NKN08] Farhang Nemati, Johan Kraft, and Thomas Nolte. “Towards Migrating Legacy Real-Time Systems to Multi-Core Platforms.” In: IEEE Int. Conf. on Emerging Technologies and Factory Automation (ETFA). (Hamburg, Germany, Sept. 15–18, 2008). Washington, DC, USA: IEEE Computer Society, Sept. 2008, pp. 717–720. doi: 10.1109/ETFA.2008.4638477.

[NP12] Jan Nowotsch and Michael Paulitsch. “Leveraging Multi-Core Computing Architectures in Avionics.” In: 9th European Dependable Computing Conference (EDCC). (Sibiu, Romania, May 8–11, 2012). Washington, DC, USA: IEEE Computer Society, May 2012, pp. 132–143. doi: 10.1109/EDCC.2012.27.

[NW12] Dieter Nazareth and Christian Wurm. “Modellbasierte Entwicklung einer Lichtsteuerung für ein Rapid Prototyping System.” German. In: Herausforderungen durch Echtzeitbetrieb. Fachtagung des gemeinsamen Fachausschusses Echtzeitsysteme von Gesellschaft für Informatik e.V. (GI). (Boppard, Germany, Nov. 22–23, 2012). Ed. by Wolfgang A. Halang. Informatik aktuell. Berlin, Germany: Springer, 2012. isbn: 978-3-642-24658-6.

[Obe+99] Kevin M. Obenland, Tiffany Frazier, Jin S. Kim, and John Kowalik. “Comparing the Real-time Performance of Windows NT to an NT Real-time Extension.” In: Proc. of 5th IEEE Real-Time Technology and Applications Symp. (RTAS). (Vancouver, BC, Canada, June 2–4, 1999). Washington, DC, USA: IEEE Computer Society, 1999, pp. 142–151. doi: 10.1109/RTTAS.1999.777669.

[Obe00] Kevin M. Obenland. The Use of POSIX in Real-time Systems, Assessing its Effectiveness and Performance. Tech. rep. McLean, VA, USA: The MITRE Corporation, Sept. 2000. url: http://mitre.org/work/tech_papers/tech_papers_00/obenland_posix/ (visited on May 7, 2014).

[Ode10] Lena Oden. “Examination of Real-Time Optimization Methods for High Performance Computing.” Diploma thesis. Aachen, Germany: RWTH Aachen University, Faculty of Electrical Engineering and Information Technology, Chair for Operating Systems, Nov. 2010.

[OSS09] Scott Owens, Susmit Sarkar, and Peter Sewell. “A Better x86 Memory Model: x86-TSO.” In: Proc. 22nd Int. Conf. Theorem Proving in Higher Order Logics (TPHOLs). (Munich, Germany, Aug. 17–20, 2009). Ed. by Stefan Berghofer, Tobias Nipkow, Christian Urban, and Makarius Wenzel. Vol. 5674. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 2009, pp. 391–407. isbn: 978-3-642-03358-2. doi: 10.1007/978-3-642-03359-9_27.

[PAH04] Martin Pohlack, Ronald Aigner, and Hermann Härtig. Connecting Real-Time and Non-Real-Time Components. Tech. rep. TUD-FI04-01. Dresden, Germany: Technische Universität Dresden, Fakultät Informatik, Feb. 2004.

[Pal+11] Jacob Palczynski, Carsten Weise, Stefan Kowalewski, and Daniel Ulmer. “Estimation of Clock Drift in HiL Testing by Property-Based Conformance Check.” In: IEEE 4th Int. Conf. on Software Testing, Verification and Validation Workshops (ICSTW). (Berlin, Germany, Mar. 21–25, 2011). Washington, DC, USA: IEEE Computer Society, 2011, pp. 590–595. isbn: 978-0-7695-4345-1. doi: 10.1109/ICSTW.2011.101.

[Pao+13] Marco Paolieri, Jörg Mische, Stefan Metzlaff, Mike Gerdes, Eduardo Quinones, et al. “A Hard Real-Time Capable Multi-Core SMT Processor.” In: ACM Trans. Embed. Comput. Syst. 12.3 (Apr. 2013), 79:1–79:26. issn: 1539-9087. doi: 10.1145/2442116.2442129.

[Pao10] Gabriele Paoloni. How to Benchmark Code Execution Times on Intel IA-32 and IA-64 Instruction Set Architectures. White paper. Santa Clara, CA, USA, Sept. 2010.

[Paq+09] Jean-Nicolas Paquin, Wei Li, Jean Bélanger, Loïc Schoen, Irène Pérès, et al. “A Modern and Open Real-Time Digital Simulator of All-Electric Ships with a Multi-Platform Co-Simulation Approach.” In: IEEE Electric Ship Technologies Symp. (ESTS). (Baltimore, MD, USA, Apr. 20–22, 2009). Washington, DC, USA: IEEE Computer Society, 2009, pp. 28–35. doi: 10.1109/ESTS.2009.4906490.

[PB00] Peter Puschner and Alan Burns. “Guest Editorial: A Review of Worst-Case Execution-Time Analysis.” In: Real-Time Syst. 18.2-3 (2000), pp. 115–128. issn: 0922-6443. doi: 10.1023/A:1008119029962.

[PB01] Marco Papini and Paul Baracos. “Real-Time Simulation, Control and HiL with COTS Computing Clusters.” In: AIAA Modeling and Simulation Technologies Conf. (Denver, Colorado, USA, Aug. 14–17, 2000). Reston, VA, USA: American Inst. of Aeronautics and Astronautics, 2001. doi: 10.2514/6.2000-4593.

[PC07] Rodolfo Pellizzoni and Marco Caccamo. “Toward the Predictable Integration of Real-Time COTS based Systems.” In: Proc. 28th IEEE Real-Time Systems Symp. (RTSS). (Tucson, AZ, USA, Dec. 3–6, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 73–82. doi: 10.1109/RTSS.2007.15.

[PCI02] PCI Special Interest Group. PCI Local Bus Specification. Technical Standard. Version 2.3. Mar. 29, 2002.

[PCIe10] PCI Special Interest Group. PCI Express Specification. Technical Standard. Version 3.0. Nov. 10, 2010.

[PDL06] Martin Pohlack, Björn Döbel, and Adam Lackorzyński. “Towards Runtime Monitoring in Real-Time Systems.” In: Proc. 8th Real-Time Linux Workshop (RTLWS). (Lanzhou, China, Oct. 12–15, 2006). 2006.

[Pel+08] Rodolfo Pellizzoni, Bach D. Bui, Marco Caccamo, and Lui Sha. “Coscheduling of CPU and I/O Transactions in COTS-Based Embedded Systems.” In: Proc. 29th IEEE Real-Time Systems Symp. (RTSS). (Barcelona, Spain, Nov. 30–Dec. 3, 2008). Washington, DC, USA: IEEE Computer Society, 2008, pp. 221–231. doi: 10.1109/RTSS.2008.42.

[Pel+10] Rodolfo Pellizzoni, Andreas Schranzhofer, Jian-Jia Chen, Marco Caccamo, and Lothar Thiele. “Worst Case Delay Analysis for Memory Interference in Multicore Systems.” In: Design, Automation Test in Europe Conf. Exhibition (DATE). (Dresden, Germany, Mar. 8–12, 2010). Washington, DC, USA: IEEE Computer Society, Mar. 2010, pp. 741–746. doi: 10.1109/DATE.2010.5456952.

[Pel+11] Rodolfo Pellizzoni, Emiliano Betti, Stanley Bak, Gang Yao, John Criswell, et al. “A Predictable Execution Model for COTS-based Embedded Systems.” In: Proc. of 17th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). (Chicago, IL, USA, Apr. 11–14, 2011). Washington, DC, USA: IEEE Computer Society, 2011, pp. 269–279. doi: 10.1109/RTAS.2011.33.

[Pet02] Stefan M. Petters. “Worst Case Execution Time Estimation for Advanced Processor Architectures.” Doctoral dissertation. Munich, Germany: Technische Universität München, 2002.

[PF00] Stefan M. Petters and Georg Färber. “Bounding the Execution Time of Real-Time Tasks on Modern Processors.” In: Proc. 7th Int. Conf. on Embedded and Real-Time Computing Systems and Applications (RTCSA). (Cheju Island, South Korea, Dec. 12–14, 2000). Washington, DC, USA: IEEE Computer Society, 2000, pp. 498–502. doi: 10.1109/RTCSA.2000.896433.

[PH11] David A. Patterson and John L. Hennessy. Computer Organization and Design. The Hardware/Software Interface. 4th ed. San Francisco, CA, USA: Morgan Kaufmann, 2011. isbn: 978-0-12-374750-1.

[PHW02] Richard R. Plant, Nick Hammond, and Tom Whitehouse. “Toward an Experimental Timing Standards Lab: Benchmarking precision in the real world.” In: Behav. Res. Methods Instrum. Comput. 34 (2 2002), pp. 218–226. issn: 0743-3808. doi: 10.3758/BF03195446.

[Pie+04] Éric Piel, Philippe Marquet, Julien Soula, and Jean-Luc Dekeyser. Load-Balancing for a Real-Time System Based on Asymmetric Multi-Processing. Research report LIFL Report #2004-06. Lille, France: Laboratoire d’informatique fondamentale, Université des sciences et technologies de Lille, Apr. 2004.

[Pie+06] Éric Piel, Philippe Marquet, Julien Soula, and Jean-Luc Dekeyser. “Asymmetric Scheduling and Load Balancing for Real-Time on Linux SMP.” In: Revised Selected Papers 6th Int. Conf. Parallel Processing and Applied Mathematics (PPAM). (Poznan, Poland, Sept. 11–14, 2005). Ed. by Roman Wyrzykowski, Jack Dongarra, Norbert Meyer, and Jerzy Wasniewski. Vol. 3911. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 2006, pp. 896–903. isbn: 3-540-34141-2. doi: 10.1007/11752578_108.

[Pop+04] Paul Pop, Petru Eles, Zebo Peng, and Traian Pop. “Analysis and Optimization of Distributed Real-Time Embedded Systems.” In: ACM Trans. Des. Autom. Electron. Syst. 11.3 (June 2004), pp. 593–625. issn: 1084-4309.

[POSIX08] IEEE Portable Applications Standards Committee. Portable Operating System Interface (POSIX.1-2008). Technical Standard. 2008. url: http://pubs.opengroup.org/onlinepubs/9699919799/ (visited on Nov. 29, 2013).

[PP07] Isabelle Puaut and Christophe Pais. “Scratchpad memories vs locked caches in hard real-time systems: a quantitative comparison.” In: Design, Automation Test in Europe Conf. Exhibition (DATE). (Nice, France, Apr. 16–20, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 1–6. doi: 10.1109/DATE.2007.364510.

[PS08] Peter Puschner and Martin Schoeberl. “On Composable System Timing, Task Timing, and WCET Analysis.” In: 8th Int. Workshop on Worst-Case Execution Time Analysis (WCET). Ed. by Raimund Kirner. Vol. 8. Dagstuhl, Germany: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2008. isbn: 978-3-939897-10-1. doi: 10.4230/OASIcs.WCET.2008.1662.

[Rag+04] Dominique Ragot, Yulen Sadourny, Denis Foueillassar, Philippe Couvee, Léonard Sibille, et al. “Linux for High Performance and Real-Time Computing on SMP Systems.” In: Proc. 6th Real-Time Linux Workshop (RTLWS). (Singapore, Nov. 3–5, 2004). 2004, pp. 105–116.

[Ras11] Josef Raschen. “Hardware-Independent Inter-Process Communication for Real-Time Applications.” Diploma thesis. Aachen, Germany: RWTH Aachen University, Faculty of Electrical Engineering and Information Technology, Chair for Operating Systems, Nov. 2011.

[Reb09] Jan Pablo Reble. “Development and Evaluation of Scaling Mechanisms for Synchronisation in Multi-Core Real Time Systems.” Diploma thesis. Aachen, Germany: RWTH Aachen University, Faculty of Electrical Engineering and Information Technology, Chair for Operating Systems, Oct. 2009.

[Reg02] John Regehr. “Inferring Scheduling Behavior with Hourglass.” In: USENIX Technical Conf., FREENIX Track. (Monterey, CA, USA, June 10–15, 2002). Berkeley, CA, USA: USENIX, 2002. url: http://static.usenix.org/publications/library/proceedings/usenix02/tech/freenix/regehr.html (visited on May 7, 2014).

[Rei+06] Jan Reineke, Björn Wachter, Stefan Thesing, Reinhard Wilhelm, Ilia Polian, et al. “A Definition and Classification of Timing Anomalies.” In: 6th Int. Workshop on Worst-Case Execution Time Analysis (WCET). (Dresden, Germany, July 4, 2006). Ed. by Frank Mueller. Vol. 4. Open-Access Series in Informatics (OASIcs). Dagstuhl, Germany: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2006. isbn: 978-3-939897-03-3. doi: 10.4230/OASIcs.WCET.2006.671.

[Rei+07] Jan Reineke, Daniel Grund, Christoph Berg, and Reinhard Wilhelm. “Timing predictability of cache replacement policies.” In: Real-Time Syst. 37.2 (2007), pp. 99–122. issn: 0922-6443. doi: 10.1007/s11241-007-9032-3.

[RH07] Steven Rostedt and Darren V. Hart. “Internals of the RT Patch.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, June 27–30, 2007). Ed. by John W. Lockhart, Gurhan Ozen, John Feeney, Len DiMaggio, and John Poelstra. Vol. 2. 2007, pp. 161–172. url: http://kernel.org/doc/ols/ (visited on May 7, 2014).

[RLA07] Mohan Rajagopalan, Brian T. Lewis, and Todd A. Anderson. “Thread Scheduling for Multi-Core Platforms.” In: 11th USENIX Workshop on Hot Topics in Operating Systems (HotOS). (San Diego, CA, USA, May 7–9, 2007). Berkeley, CA, USA: USENIX, 2007. url: https://www.usenix.org/legacy/events/hotos07/tech/ (visited on May 7, 2014).

[Rob+03] Andrew Robbie, Graeme Simpkin, John Fulton, and David Craven. “Experiences in developing a Linux Cluster for Real-Time Simulation.” In: SimTecT 2003. (Adelaide, Australia, May 26–29, 2003). 2003. url: https://web.archive.org/web/20060821224125/http://siaa.asn.au/get/2395356957.pdf (visited on May 7, 2014).

[Roc08] Marc J. Rochkind. Advanced UNIX Programming. 2nd ed. Boston, MA, USA: Pearson Education Inc., 2008. isbn: 0-13-141154-3.

[Ros+07] Jakob Rosén, Alexandru Andrei, Petru Eles, and Zebo Peng. “Bus Access Optimization for Predictable Implementation of Real-Time Applications on Multiprocessor Systems-on-Chip.” In: Proc. 28th IEEE Real-Time Systems Symp. (RTSS). (Tucson, AZ, USA, Dec. 3–6, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 49–60. doi: 10.1109/RTSS.2007.24.

[RS01] John Regehr and John A. Stankovic. “Augmented CPU Reservations: Towards Predictable Execution on General-Purpose Operating Systems.” In: Proc. of 7th IEEE Real-Time Technology and Applications Symp. (RTAS). (Taipei, Taiwan, May 30–June 1, 2001). Washington, DC, USA: IEEE Computer Society, 2001, pp. 141–148. doi: 10.1109/RTTAS.2001.929880.

[RTX09] Hard Real-Time with IntervalZero RTX on the Windows Platform. White paper DOC-RTX-004. Waltham, MA, USA: IntervalZero, Inc., Sept. 2009. url: http://www.directinsight.co.uk/products/venturcom/RTXWhitePaper-6-09.pdf (visited on May 7, 2014).

[RTX12] IntervalZero RTX 2012 Product Brief. Data sheet. Waltham, MA, USA: IntervalZero, Inc., 2012. url: http://www.directinsight.co.uk/downloads/evc/031/RTX2012ProductBrief.pdf (visited on May 7, 2014).

[RW14] Pablo Reble and Georg Wassen. “Towards Predictability of Operating System Supported Communication for PCIe Based Clusters.” In: Euro-Par 2013: Parallel Processing Workshops. Revised Selected Papers BigDataCloud, DIHC, FedICI, HeteroPar, HiBB, LSDVE, MHPC, OMHI, PADABS, PROPER, Resilience, ROME, and UCHPC. (Aachen, Germany, Aug. 26–27, 2013). Ed. by Dieter an Mey, Michael Alexander, Paolo Bientinesi, Mario Cannataro, Carsten Clauss, et al. Vol. 8374. Lect. Notes in Comp. Sci. Berlin, Germany: Springer, 2014, pp. 833–842. isbn: 978-3-642-54419-4. doi: 10.1007/978-3-642-54420-0_81.

[RWK11] Stefan Richter, Michael Wahler, and Atul Kumar. “A Framework for Component-Based Real-Time Control Applications.” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Development Lab (OSADL) eG, 2011. isbn: 978-3-0003-6193-7.

[Sau07] Thilo Sauter. “The Continuing Evolution of Integration in Manufacturing Automation.” In: IEEE Ind. Electron. Mag. 1.1 (2007), pp. 10–19. issn: 1932-4529. doi: 10.1109/MIE.2007.357183.

[SBF04] Jürgen Stohr, Alexander von Bülow, and Georg Färber. “Using State of the Art Multiprocessor Systems as Real-Time Systems – The RECOMS Software Architecture.” In: Proc. Work-In-Progress Session 16th Euromicro Conf. on Real-Time Systems (ECRTS-WiP). (Catania, Italy, June 30–July 2, 2004). Ed. by Steve Goddard. Tech. rep. TR-UNL-CSE-2004-0010. University of Nebraska-Lincoln, Department of Computer Science and Engineering, 2004. url: http://cse.unl.edu/~goddard/ecrts04wip/proceedings/ (visited on May 7, 2014).

[SBF05] Jürgen Stohr, Alexander von Bülow, and Georg Färber. “Bounding Worst-Case Access Times in Modern Multiprocessor Systems.” In: Proc. 17th Euromicro Conf. on Real-Time Systems (ECRTS). (Palma de Mallorca, Spain, July 6–8, 2005). Washington, DC, USA: IEEE Computer Society, 2005, pp. 189–198. doi: 10.1109/ECRTS.2005.10.

[SCC10] SCC External Architecture Specification (EAS). Revision 1.1. Intel Corp. Santa Clara, CA, USA, Nov. 2010. url: http://communities.intel.com/docs/DOC-5852 (visited on Aug. 18, 2014).

[Sch+09] Fabian Scheler, Wanja Hofer, Benjamin Oechslein, Rudi Pfister, Wolfgang Schröder-Preikschat, et al. “Parallel, Hardware-Supported Interrupt Handling in an Event-Triggered Real-Time Operating System.” In: Proc. Int. Conf. on Compilers, Architectures, and Synthesis for Embedded Systems (CASES). (Grenoble, France, Oct. 11–16, 2009). New York, NY, USA: ACM Assoc. for Computing Machinery, 2009.

[Sch+10] Andreas Schranzhofer, Rodolfo Pellizzoni, Jian-Jia Chen, Lothar Thiele, and Marco Caccamo. “Worst-Case Response Time Analysis of Resource Access Models in Multi-core Systems.” In: Proc. 47th Design Automation Conference (DAC). (Anaheim, CA, USA, June 13–18, 2010). New York, NY, USA: ACM Assoc. for Computing Machinery, 2010, pp. 332–337. isbn: 978-1-4503-0002-5. doi: 10.1145/1837274.1837359.

[Sch+11] Volkmar Schulz, Bjoern Weissler, Pierre Gebhardt, Torsten Solf, Christoph W. Lerche, et al. “SiPM based preclinical PET/MR Insert for a human 3T MR: first imaging experiments.” In: IEEE Nuclear Science Symp. and Medical Imaging Conf. (NSS/MIC). (Valencia, Spain, Oct. 23–29, 2011). 2011, pp. 4467–4469. doi: 10.1109/NSSMIC.2011.6152496.

[Sch03] Sebastian Schonberg. “Impact of PCI bus load on applications in a PC architecture.” In: Proc. 24th IEEE Real-Time Systems Symp. (RTSS). (Cancun, Mexico, Dec. 3–5, 2003). Washington, DC, USA: IEEE Computer Society, 2003, pp. 430–439. doi: 10.1109/REAL.2003.1253290.

[Sch13] Frank Schirrmeister. “Multicore Architectures.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newnes, 2013. Chap. 3. isbn: 978-0-12-416018-7.

[SE11] Simon Schliecker and Rolf Ernst. “Real-time Performance Analysis of Multiprocessor Systems with Shared Memory.” In: ACM Trans. Embed. Comput. Syst. 10.2 (Jan. 2011), 22:1–22:27. issn: 1539-9087. doi: 10.1145/1880050.1880058.

[Seb01] Filip Sebek. Cache Memories and Real-Time Systems. Tech. rep. MRTC 01/37. Västerås, Sweden: Mälardalen University, Dept. of Computer Engineering, 2001.

[SFR12] Michael Schmidt, Dietmar Fey, and Marc Reichenbach. “Parallel Embedded Computing Architectures.” In: Embedded Systems. High Performance Systems, Applications and Projects. Ed. by Kiyofumi Tanaka. Rijeka, Croatia: InTech, 2012. Chap. 1. isbn: 978-953-51-0350-9. doi: 10.5772/2684.

[SG94] Abraham Silberschatz and Peter B. Galvin. Operating System Concepts. 4th ed. Reading, MA, USA: Addison-Wesley, 1994. isbn: 0-201-59292-4.

[SGD09] Joachim Schröder, Tilo Gockel, and Rüdiger Dillmann. Embedded Linux – Das Praxisbuch. Dordrecht, Netherlands: Springer, 2009. isbn: 978-3-540-78619-1.

[Sin+04] Mohit Sindhwani, Tim F. Oliver, Douglas L. Maskell, and T. Srikanthan. “RTOS Acceleration Techniques – Review and Challenges.” In: Proc. 6th Real-Time Linux Workshop (RTLWS). (Singapore, Nov. 3–5, 2004). 2004, pp. 131–136.

[SL06] Claudio Scordino and Giuseppe Lipari. “Linux and Real-Time: Current Approaches and Future Opportunities.” In: ANIPLA International Congr. (Rome, Italy). 2006.

[SM08] Vivy Suhendra and Tulika Mitra. “Exploring Locking & Partitioning for Predictable Shared Caches on Multi-Cores.” In: Proc. 45th ACM/IEEE Design Automation Conf. (DAC). (Anaheim, CA, USA, June 8–13, 2008). New York, NY, USA: ACM Assoc. for Computing Machinery, 2008, pp. 300–303. isbn: 978-1-60558-115-6. doi: 10.1145/1391469.1391545.

[SMZ11] Sandeep Shukla, Prabhat Mishra, and Zeljko Zilic. “A Brief History of Multiprocessors and EDA.” In: IEEE Design & Test of Computers 28.3 (2011), p. 96. issn: 0740-7475. doi: 10.1109/MDT.2011.50.

[SO12] Renan Augusto Starke and Romulo Silva de Oliveira. “A Heterogeneous Preemptive and Non-preemptive Scheduling Approach for Real-Time Systems on Multiprocessors.” In: 2nd Brazilian Conf. on Critical Embedded Systems (CBSEC). (Campinas, SP, Brazil, May 21–25, 2012). Washington, DC, USA: IEEE Computer Society, 2012, pp. 70–75. isbn: 978-1-4673-1912-6. doi: 10.1109/CBSEC.2012.9.

[SO13] Renan Augusto Starke and Rômulo S. Oliveira. “System-Management-Mode in Real-Time PC-Based Control Applications.” In: J. Contr. Autom. Electr. Syst. 24.4 (2013), pp. 430–438. issn: 2195-3880. doi: 10.1007/s40313-013-0053-y.

[Sou+11] Paulo Baltarejo Sousa, Konstantinos Bletsas, Eduardo Tovar, and Björn Andersson. “On the implementation of real-time slot-based task-splitting scheduling algorithms for multiprocessor systems.” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Development Lab (OSADL) eG, 2011, pp. 207–218. isbn: 978-3-0003-6193-7.

[SR89] John A. Stankovic and Krithi Ramamritham. “The Spring kernel: a new paradigm for real-time operating systems.” In: SIGOPS Operat. Syst. Rev. 23.3 (July 1989), pp. 54–71. issn: 0163-5980. doi: 10.1145/71021.71024.

[SRH99] David Stepner, Nagarajan Rajan, and David Hui. “Embedded Application Design Using a Real-Time OS.” In: Proc. 36th Annu. Design Automation Conf. (DAC). (New Orleans, LA, USA, June 21–25, 1999). New York, NY, USA: ACM Assoc. for Computing Machinery, 1999, pp. 151–156. isbn: 1-58113-092-9. doi: 10.1109/DAC.1999.781301.

[Sri+98] Balaji Srinivasan, Shyamalan Pather, Robert Hill, Furquan Ansari, and Douglas Niehaus. “A Firm Real-Time System Implementation Using Commercial Off-The-Shelf Hardware and Free Software.” In: Proc. of 4th IEEE Real-Time Technology and Applications Symp. (RTAS). (Denver, CO, USA, June 3–5, 1998). Washington, DC, USA: IEEE Computer Society, 1998, pp. 112–119. doi: 10.1109/RTTAS.1998.683194.

[SRL90] Lui Sha, Ragunathan Rajkumar, and John P. Lehoczky. “Priority Inheritance Protocols: An Approach to Real-Time Synchronization.” In: IEEE Trans. Comput. 39.9 (Sept. 1990), pp. 1175–1185. issn: 0018-9340. doi: 10.1109/12.57058.

[SS11] Philippe Stellwag and Wolfgang Schröder-Preikschat. “Challenges in Real-Time Synchronization.” In: 3rd USENIX Workshop on Hot Topics in Parallelism (HotPar). (Berkeley, CA, USA, May 26–27, 2011). Ed. by Michael McCool and Mendel Rosenblum. Berkeley, CA, USA: USENIX, 2011.

[SS72] Michael D. Schroeder and Jerome H. Saltzer. “A Hardware Architecture for Implementing Protection Rings.” In: Commun. ACM 15.3 (Mar. 1972), pp. 157–170. issn: 0001-0782. doi: 10.1145/361268.361275.

[SS95] Rafael H. Saavedra and Alan Jay Smith. “Measuring Cache and TLB Performance and Their Effect on Benchmark Runtimes.” In: IEEE Trans. Comput. 44.10 (Oct. 1995), pp. 1223–1235. issn: 0018-9340. doi: 10.1109/12.467697.

[SSL09] Philippe Stellwag, Wolfgang Schröder-Preikschat, and Daniel Lohmann. “An Asynchronous Nonblocking Coordination and Synchronization Protocol for a Parallel Robotic Control Kernel.” In: Proc. 2nd Workshop on Isolation and Integration in Embedded Systems (IIES). (Nuremberg, Germany, Mar. 31, 2009). New York, NY, USA: ACM Assoc. for Computing Machinery, 2009, pp. 7–12. isbn: 978-1-60558-464-5. doi: 10.1145/1519130.1519132.

[SSP08] Vaidyanathan Srinivasan, Gautham R Shenoy, and Venkatesh Pallipadi. “Energy-aware task and interrupt management in Linux.” In: Proc. Linux Symp. (OLS). (Ottawa, ON, Canada, July 23–26, 2008). Ed. by John W. Lockhart, Gurhan Ozen, Eugene Teo, Kyle McMartin, Jake Edge, et al. Vol. 2. 2008, pp. 187–198. url: http://kernel.org/doc/ols/ (visited on May 7, 2014).

[ST93] Stefan Savage and Hideyuki Tokuda. “Real-Time Mach Timers: Exporting Time to the User.” In: 3rd USENIX MACH III Symp. (MACH). (Santa Fe, NM, USA, Apr. 1993). Berkeley, CA, USA: USENIX, 1993, pp. 221–232. url: http://www-cse.ucsd.edu/~savage/papers/Machnix93.pdf (visited on May 7, 2014).

[Sta09] William Stallings. Operating Systems. Internals and Design Principles. 6th ed. Upper Saddle River, NJ, USA: Pearson Prentice Hall, 2009. isbn: 978-0-13-603337-0.

[Sta88] John A. Stankovic. “Misconceptions About Real-Time Computing. A Serious Problem for Next-Generation Systems.” In: Computer 21.10 (Oct. 1988), pp. 10–19. issn: 0018-9162. doi: 10.1109/2.7053.

[Ste+95] Thomas L. Sterling, Daniel Savarese, Donald J. Becker, John E. Dorband, Udaya A. Ranawake, et al. “BEOWULF: A Parallel Workstation for Scientific Computation.” In: Proc. of the Int. Conf. on Parallel Processing (ICPP). (Urbana-Champaign, IL, USA, Aug. 14–18, 1995). Vol. 1. Washington, DC, USA: CRC Press, 1995. isbn: 0-8493-2615-X.

[Ste06] Neil Stewart. “A PC parallel port button box provides millisecond response time accuracy under Linux.” In: Behav. Res. Methods 38 (1 2006), pp. 170–173. issn: 1554-351X. doi: 10.3758/BF03192764.

[Sto06] Jürgen Stohr. “Auswirkungen der Peripherieanbindung auf das Realzeitverhalten PC-basierter Multiprozessorsysteme.” German. Doctoral dissertation. Munich, Germany: Fakultät für Elektrotechnik und Informationstechnik, TU München, Mar. 2006.

[Sto13] Neal Stollon. “Multicore Debug.” In: Real World Multicore Embedded Systems. A Practical Approach. Ed. by Bryon Moyer. Oxford, England: Newnes, 2013. Chap. 16. isbn: 978-0-12-416018-7.

[Sut05] Herb Sutter. “The Free Lunch Is Over. A Fundamental Turn Toward Concurrency in Software.” In: Dr. Dobb’s Journal 30.3 (Mar. 2005). issn: 1044-789X.

[SysV10] Michael Matz, Jan Hubička, Andreas Jaeger, and Mark Mitchell, eds. System V Application Binary Interface. AMD64 Architecture Processor Supplement. Technical Standard. Sept. 3, 2010.

[SysV95a] System V Interface Definition. 4th ed. Vol. 1. Provo, UT, USA: Novell, Inc., June 15, 1995. url: http://www.sco.com/developers/devspecs/ (visited on May 7, 2014).

[SysV95b] System V Interface Definition. 4th ed. Vol. 2. Provo, UT, USA: Novell, Inc., June 15, 1995. url: http://www.sco.com/developers/devspecs/ (visited on May 7, 2014).

[SysV95c] System V Interface Definition. 4th ed. Vol. 3. Provo, UT, USA: Novell, Inc., June 15, 1995. url: http://www.sco.com/developers/devspecs/ (visited on May 7, 2014).

[SZ13] Hang Su and Dakai Zhu. “Scheduling of Elastic Mixed-Criticality Tasks in Multiprocessor Real-Time Systems.” In: 19th IEEE Real-Time and Embedded Technology and Applications Symp., Work-in-Progress Proc. (RTAS-WiP). (Philadelphia, PA, USA, Apr. 9–12, 2013). Ed. by Marko Bertogna. 2013. url: http://www.cister.isep.ipp.pt/rtas2013/wip (visited on May 7, 2014).

[Tam+04] E. Tamura, F. Rodríguez, J. V. Busquets-Mataix, and A. Martí Campoy. “High Performance Memory Architectures with Dynamic Locking Cache for Real-Time Systems.” In: Proc. Work-In-Progress Session 16th Euromicro Conf. on Real-Time Systems (ECRTS-WiP). (Catania, Italy, June 30–July 2, 2004). Ed. by Steve Goddard. Tech. rep. TR-UNL-CSE-2004-0010. University of Nebraska-Lincoln, Department of Computer Science and Engineering, 2004. url: http://cse.unl.edu/~goddard/ecrts04wip/proceedings/ (visited on May 7, 2014).

[Tan+12] Camel Tanougast, Abbas Dandache, Mohamed Salah Azzaz, and Said Saboudi. “Hardware Design of Embedded Systems for Security Applications.” In: Embedded Systems. High Performance Systems, Applications and Projects. Ed. by Kiyofumi Tanaka. Rijeka, Croatia: InTech, 2012. Chap. 12. isbn: 978-953-51-0350-9. doi: 10.5772/2684.

[Tan07] Andrew S. Tanenbaum. Modern Operating Systems. 3rd ed. Upper Saddle River, NJ, USA: Pearson Prentice Hall, 2007. isbn: 978-0138134594.

[Tan12] Kiyofumi Tanaka, ed. Embedded Systems. High Performance Systems, Applications and Projects. Rijeka, Croatia: InTech, 2012. isbn: 978-953-51-0350-9. doi: 10.5772/2684.

[Tho11] Michael E. Thomadakis. The Architecture of the Nehalem Processor and Nehalem-EP SMP Platforms. Research report. College Station, TX, USA: Texas A&M University, Supercomputing Facility, Mar. 17, 2011. url: http://sc.tamu.edu/systems/eos/nehalem.pdf (visited on Sept. 9, 2014).

[Tho14] Daniel Thompson. “Debugging ARM kernels using fast interrupts.” In: LWN.net (May 29, 2014). url: http://lwn.net/Articles/600359 (visited on Oct. 7, 2014).

[TLH97] Vu Tran, Dar-Biau Liu, and Brad Hummel. “Component-based Systems Development: Challenges and Lessons Learned.” In: Proc. 8th IEEE Int. Workshop on Software Technology and Engineering Practice (STEP). (London, England, July 14–18, 1997). Ed. by David Budgen, Gene Hoffnagle, and Jos Trienekens. Washington, DC, USA: IEEE Computer Society, 1997, pp. 452–462. doi: 10.1109/STEP.1997.615535.

[Tsa+05] Dan Tsafrir, Yoav Etsion, Dror G. Feitelson, and Scott Kirkpatrick. “System Noise, OS Clock Ticks, and Fine-Grained Parallel Applications.” In: Proc. 19th Annu. Int. Conf. on Supercomputing (ICS). (Cambridge, MA, USA, June 20–22, 2005). New York, NY, USA: ACM Assoc. for Computing Machinery, 2005, pp. 303–312. isbn: 1-59593-167-8. doi: 10.1145/1088149.1088190.

[TVU98] Martin Timmerman, Bart Van Beneden, and Laurent Uhres. “Windows NT Real-Time Extensions: better or worse?” In: Real-Time Magazine 3 (1998), pp. 11–19. url: http://redwood.cs.ttu.edu/~andersen/cs5355/1998q3_p011.pdf (visited on Sept. 3, 2014).

[TZ99] Philippas Tsigas and Yi Zhang. “Non-blocking Data Sharing in Multiprocessor Real-Time Systems.” In: Proc. 6th Int. Conf. on Embedded and Real-Time Computing Systems and Applications (RTCSA). (Hong Kong, China, Dec. 13–15, 1999). Washington, DC, USA: IEEE Computer Society, 1999, pp. 247–254. doi: 10.1109/RTCSA.1999.811240.

[Uhl11] Robert Uhl. “Analyse und Manipulation der Linux Interrupt-Behandlung auf der x86-Architektur.” German. Bachelor thesis. Aachen, Germany: RWTH Aachen University, Faculty of Electrical Engineering and Information Technology, Chair for Operating Systems, Nov. 2011.

[Ung+10] Theo Ungerer, Francisco J. Cazorla, Pascal Sainrat, Guillem Bernat, Zlatko Petrov, et al. “Merasa: Multicore Execution of Hard Real-Time Applications Supporting Analyzability.” In: IEEE Micro 30.5 (2010), pp. 66–75. issn: 0272-1732. doi: 10.1109/MM.2010.78.

[Ung+13] T. Ungerer, C. Bradatsch, M. Gerdes, F. Kluge, R. Jahr, et al. “parMERASA – Multi-core Execution of Parallelised Hard Real-Time Applications Supporting Analysability.” In: Proc. 16th Euromicro Conf. on Digital System Design (DSD). (Santander, Spain, Sept. 4–6, 2013). Ed. by José Silva Matos and Francesco Leporati. Washington, DC, USA: IEEE Computer Society, 2013, pp. 363–370. isbn: 978-0-7695-5074-9. doi: 10.1109/DSD.2013.46.

[Vad+09] Srivatsa Vaddagiri, Bharata B. Rao, Vaidyanathan Srinivasan, Anithra P. Janakiraman, Balbir Singh, et al. “Scaling software on multi-core through co-scheduling of related tasks.” In: Proc. Linux Symp. (OLS). (Montreal, QC, Canada, July 13–17, 2009). Ed. by Robyn Bergeron, Chris Dukes, Jonas Fonseca, and John Hawley. 2009, pp. 287–295. url: http://linuxsymposium.org/2009/ls-2009-proceedings.pdf (visited on May 7, 2014).

[Vaj11] András Vajda. Programming Many-Core Chips. New York, NY, USA: Springer, 2011. isbn: 978-1-4419-9738-8.

[VGR98] Ben Verghese, Anoop Gupta, and Mendel Rosenblum. “Performance Isolation: Sharing and Isolation in Shared-Memory Multiprocessors.” In: SIGOPS Operat. Syst. Rev. 32 (5 Oct. 1998), pp. 181–192. issn: 0163-5980. doi: 10.1145/384265.291044.

[Vir12] Al Viro. “Al Viro’s new execve/kernel-thread design.” In: LWN.net (Oct. 16, 2012). url: http://lwn.net/Articles/520227 (visited on Oct. 7, 2014).

[VN13] Viktor Vafeiadis and Chinmay Narayan. “Relaxed Separation Logic: A Program Logic for C11 Concurrency.” In: SIGPLAN Not. 48.10 (Oct. 2013), pp. 867–884. issn: 0362-1340. doi: 10.1145/2544173.2509532.

[VT10] Vaidehi M. and T. R. Gopalakrishnan Nair. “Multicore Applications in Real Time Systems.” In: Comp. Res. Repos. (2010). arXiv: 1001.3539 [cs.SE].


[Wal10a] Gerhard Wallraf. “In-depth Analysis of x86’s System Management Mode.” Diploma thesis. Aachen, Germany: RWTH Aachen University, Faculty of Electrical Engineering and Information Technology, Chair for Operating Systems, Dec. 2010.

[Wal10b] Colin Walls. A Case for MCAPI: CPU-to-CPU Communications in Multicore Designs. White Paper. Wilsonville, OR, USA: Mentor Graphics Corp., 2010. url: http://www.mentor.com/embedded-software/request?selected=58911 (visited on Sept. 9, 2014).

[Wan+13] Zhen Wang, Limin Xiao, Xin Su, Sin Qi, and Xibin Xu. “On the Optimization of Real Time Performance of Software Defined Radio on Linux OS.” In: J. Commun. Netw. 5.3B (2013). Ed. by Bharat Bhargava et al., pp. 292–297. issn: 1947-3826. doi: 10.4236/cn.2013.53B2054.

[Was+12] Georg Wassen, Pablo Reble, Stefan Lankes, and Thomas Bemmerl. Real-Time on Many-Core Systems. Where real-time and high-performance can benefit from each other. Poster presentation. MARC Symposium, Aachen, Germany, Nov. 29, 2012.

[WB11] Wolfgang Wallner and Josef Baumgartner. “openPOWERLINK in Linux Userspace: Implementation and Performance Evaluation of the Real-Time Ethernet Protocol Stack in Linux Userspace.” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Development Lab (OSADL) eG, 2011, pp. 155–164. isbn: 978-3-0003-6193-7.

[WDL11] Hannes Weisbach, Björn Döbel, and Adam Lackorzynski. “Generic User-Level PCI Drivers.” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Development Lab (OSADL) eG, 2011, pp. 91–100. isbn: 978-3-0003-6193-7.

[Web02] Andrew Webber. Realfeel Test of the Preemptible Kernel Patch. Seattle, WA, USA, Oct. 2002. url: http://www.linuxjournal.com/article/6405 (visited on May 7, 2014).

[Wec09] Filip Wecherowski. “A Real SMM Rootkit: Reversing and Hooking BIOS SMI Handlers.” In: Phrack mag. 0x0d (0x42 June 2009). url: http://www.phrack.org/issues.html?issue=66&id=11 (visited on May 7, 2014).

[Wei+12] Björn Weissler, Pierre Gebhardt, Peter Düppenbecker, Benjamin Goldschmidt, André Salomon, et al. “Design concept of world’s first preclinical PET/MR insert with fully digital silicon photomultiplier technology.” In: Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 2012 IEEE. (Anaheim, CA, USA, Oct. 27–Nov. 3, 2012). 2012, pp. 2113–2116. doi: 10.1109/NSSMIC.2012.6551484.

[Wei11] Klaus Weichinger. “A Nonlinear Model-Based Control realized with an Open Framework for Educational Purposes.” In: Proc. 13th Real-Time Linux Workshop (RTLWS). (Prague, Czech Republic, Oct. 20–22, 2011). Open Source Automation Development Lab (OSADL) eG, 2011. isbn: 978-3-0003-6193-7.

[Wei84] Reinhold P. Weicker. “Dhrystone: A Synthetic Systems Programming Benchmark.” In: Commun. ACM 27.10 (Oct. 1984), pp. 1013–1030. issn: 0001-0782. doi: 10.1145/358274.358283.

[Wil+08] Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, Stephan Thesing, et al. “The Worst-Case Execution-Time Problem—Overview of Methods and Survey of Tools.” In: ACM Trans. Embed. Comput. Syst. 7.3 (May 2008), 36:1–36:53. issn: 1539-9087. doi: 10.1145/1347375.1347389.

[Wil+09] Reinhard Wilhelm, Daniel Grund, Jan Reineke, Marc Schlickling, Markus Pister, et al. “Memory Hierarchies, Pipelines, and Buses for Future Architectures in Time-Critical Embedded Systems.” In: IEEE Trans. Comput.-Aided Design Integr. Circuits Syst. 28.7 (July 2009), pp. 966–978. issn: 0278-0070. doi: 10.1109/TCAD.2009.2013287.

[Wil02] Clark Williams. Linux Scheduler Latency. Tech. rep. Raleigh, NC, USA: Red Hat, Inc., Mar. 2002.

[Wil06] Rob Williams. Real-Time Systems Development. Oxford, England: Elsevier, 2006. isbn: 978-0-7506-6471-4.

[Wil08] Samuel Webb Williams. “Auto-tuning Performance on Multicore Computers.” Tech. rep. no. UCB/EECS-2008-164. PhD Thesis. Berkeley, CA, USA: Electrical Engineering and Computer Sciences, University of California at Berkeley, Dec. 17, 2008. isbn: 978-1-109-09814-3.

[WL04] Peter Wurmsdobler and Suvendhu Laha. “A Real-Time Compliant Implementation of Linear Time-Invariant Digital Filters in the C Programming Language.” In: Proc. 6th Real-Time Linux Workshop (RTLWS). (Singapore, Nov. 3–5, 2004). 2004, pp. 81–89.
[WLB12] Georg Wassen, Stefan Lankes, and Thomas Bemmerl. “Harte Echtzeit für Anwendungsprozesse in Standard-Betriebssystemen auf Mehrkernprozessoren.” German. In: Herausforderungen durch Echtzeitbetrieb. Fachtagung des gemeinsamen Fachausschusses Echtzeitsysteme von Gesellschaft für Informatik e.V. (GI). (Boppard, Germany, Nov. 22–23, 2012). Ed. by Wolfgang A. Halang. Informatik aktuell. Berlin, Germany: Springer, 2012, pp. 39–48. isbn: 978-3-642-24658-6. doi: 10.1007/978-3-642-24658-6_5.

[WM95] William A. Wulf and Sally A. McKee. “Hitting the Memory Wall: Implications of the Obvious.” In: SIGARCH Comput. Archit. News 23.1 (Mar. 1995), pp. 20–24. issn: 0163-5964. doi: 10.1145/216585.216588.

[Won+08] Henry Wong, Anne Bracy, Ethan Schuchman, Tor M. Aamodt, Jamison D. Collins, et al. “Pangaea: A Tightly-Coupled IA32 Heterogeneous Chip Multiprocessor.” In: Proc. 17th Int. Conf. on Parallel Architectures and Compilation Techniques (PACT). (Toronto, ON, Canada, Oct. 25–29, 2008). New York, NY, USA: ACM Assoc. for Computing Machinery, 2008, pp. 52–61. isbn: 978-1-60558-282-5. doi: 10.1145/1454115.1454125.

[WR09] Rafal Wojtczuk and Joanna Rutkowska. Attacking SMM Memory via Intel CPU Cache Poisoning. Tech. rep. Warsaw, Poland: Invisible Things Lab, Mar. 2009. url: http://invisiblethingslab.com/itl/Resources.html (visited on May 7, 2014).

[WR12] Reinhard Wilhelm and Jan Reineke. “Embedded Systems: Many Cores – Many Problems.” In: Proc. 7th IEEE Int. Symp. on Industrial Embedded Systems (SIES). (Karlsruhe, Germany, June 20–22, 2012). Washington, DC, USA: IEEE Computer Society, 2012, pp. 176–180. isbn: 978-1-4673-2684-1. doi: 10.1109/SIES.2012.6356583.

[WWP09] Samuel Williams, Andrew Waterman, and David Patterson. “Roofline: An Insightful Visual Performance Model for Multicore Architectures.” In: Commun. ACM 52.4 (Apr. 2009), pp. 65–76. issn: 0001-0782. doi: 10.1145/1498765.1498785.

[XX11] Yan Xiaodong and Zhou Xiang. “Design of Rapid Prototyping System Based on RTX Realtime Kernel.” In: IEEE Int. Conf. on Signal Processing, Communications and Computing (ICSPCC). (Xi’an, China, Sept. 14–16, 2011). Washington, DC, USA: IEEE Computer Society, 2011. isbn: 978-1-4577-0894-7. doi: 10.1109/ICSPCC.2011.6061551.

[Yag01] Karim Yaghmour. Adaptive Domain Environment for Operating Systems. Tech. rep. Sherbrooke, QC, Canada: Opersys Inc., 2001. url: http://ftp.opersys.com/ftp/pub/Adeos/adeos.pdf (visited on May 7, 2014).

[Yan+05] Jian Yang, Yu Chen, Huayong Wang, and Bibo Wang. “A Linux Kernel with Fixed Interrupt Latency for Embedded Real-Time System.” In: 2nd Int. Conf. on Embedded Software and Systems. (Xi’an, China, Dec. 16–18, 2005). Washington, DC, USA: IEEE Computer Society, Dec. 2005. doi: 10.1109/ICESS.2005.3.

[YAW05] S. Yamagiwa, K. Aoki, and K. Wada. “Active Zero-copy: A performance study of non-deterministic messaging.” In: 4th Int. Symp. on Parallel and Distributed Computing (ISPDC). (Lille, France, July 4–6, 2005). Washington, DC, USA: IEEE Computer Society, July 2005, pp. 325–332. doi: 10.1109/ISPDC.2005.11.

[YL06] Joshua J. Yi and David J. Lilja. “Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations.” In: IEEE Trans. Comput. 55.3 (Mar. 2006), pp. 268–280. issn: 0018-9340. doi: 10.1109/TC.2006.44.

[Yod99] Victor Yodaiken. “The RTLinux Manifesto.” In: Proc. 5th Linux Expo. (Raleigh, NC, USA, May 22, 1999). 1999. url: http://vyodaiken.com/papers-and-talks/ (visited on Sept. 2, 2014).

[You07] Matt T. Yourst. “PTLsim: A Cycle Accurate Full System x86-64 Microarchitectural Simulator.” In: IEEE Int. Symp. on Performance Analysis of Systems Software (ISPASS). (San Jose, CA, USA, Apr. 25–27, 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 23–34. doi: 10.1109/ISPASS.2007.363733.

[YPS05] Kamen Yotov, Keshav Pingali, and Paul Stodghill. “Automatic Measurement of Memory Hierarchy Parameters.” In: SIGMETRICS Perform. Eval. Rev. 33.1 (June 2005), pp. 181–192. issn: 0163-5999. doi: 10.1145/1071690.1064233.

[YSL09] Hobin Yoon, Jungmoo Song, and Jamee Lee. “Real-Time Performance Analysis in Linux-Based Robotic Systems.” In: Proc. Linux Symp. (OLS). (Montreal, QC, Canada, July 13–17, 2009). Ed. by Robyn Bergeron, Chris Dukes, Jonas Fonseca, and John Hawley. 2009, pp. 331–339. url: http://linuxsymposium.org/2009/ls-2009-proceedings.pdf (visited on May 7, 2014).

[YZ08] Jun Yan and Wei Zhang. “WCET Analysis for Multi-Core Processors with Shared L2 Instruction Caches.” In: Proc. of 14th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). (St. Louis, MO, USA, Apr. 22–24, 2008). Washington, DC, USA: IEEE Computer Society, 2008, pp. 80–89. doi: 10.1109/RTAS.2008.6.

[Zen+09] Hui Zeng, Matt Yourst, Kanad Ghose, and Dmitry Ponomarev. “MPTLsim: A Simulator for X86 Multicore Processors.” In: Proc. of the 46th Annu. Design Automation Conf. (DAC). (San Francisco, CA, USA, July 26–31, 2009). New York, NY, USA: ACM Assoc. for Computing Machinery, 2009, pp. 226–231. isbn: 978-1-60558-497-3. doi: 10.1145/1629911.1629974.

[Zha+05] J. Zhang, R. Lumia, J. Wood, and G. Starr. “Achieving Deterministic, Hard Real-time Control On An IBM-Compatible PC: A General Configuration Guideline.” In: IEEE Int. Conf. on Systems, Man and Cybernetics. (Hawai’i, HI, USA, Oct. 10–12, 2005). Vol. 1. Washington, DC, USA: IEEE Computer Society, Oct. 2005, pp. 293–299. isbn: 0-7803-9298-1. doi: 10.1109/ICSMC.2005.1571161.

[Züp13] Alexander Züpke. “Deterministic Fast Synchronization.” In: Int. Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT) in conjunction with the 25th Euromicro Conf. on Real-Time Systems. (Paris, France, July 9, 2013). Ed. by Andrea Bastoni and Shinpei Kato. 2013. url: http://ertl.jp/~shinpei/conf/ospert13/OSPERT13_proceedings_web.pdf (visited on May 7, 2014).

Referenced URLs

• http://amide.sourceforge.net (visited on November 12, 2015)

• http://concurrencykit.org (visited on November 12, 2015)

• http://gpgpu.org (visited on November 12, 2015)

• http://heise.de/-2063812 (visited on November 12, 2015)

• http://homepages.cwi.nl/~manegold/Calibrator/ (visited on November 12, 2015)

• http://lowlevel.eu/wiki/Hauptseite (visited on November 12, 2015)

• http://lttng.org/urcu/ (visited on November 12, 2015)

• http://media.freescale.com/investor-relations/press-release-archive/2007/16-10-2007-d.aspx (visited on November 12, 2015)

• http://newscenter.ti.com/index.php?s=32851&item=123723 (visited on November 12, 2015)

• http://openmp.org (visited on November 12, 2015)

• http://qt-project.org (visited on November 12, 2015)

• http://qt-project.org/doc/qt-4.8/signalsandslots.html (visited on November 12, 2015)


• http://real-time.ccur.com/home/products/redhawk-linux (visited on November 12, 2015)

• http://rt.wiki.kernel.org (visited on November 12, 2015)

• http://savannah.nongnu.org/projects/lwip/ (visited on November 12, 2015)

• https://gcc.gnu.org/onlinedocs/gcc-4.9.1/gcc/_005f_005fsync-Builtins.html (visited on November 12, 2015)

• https://github.com/esnet/iperf (visited on November 12, 2015)

• https://github.com/RRZE-HPC/likwid (visited on November 12, 2015)

• https://github.com/RWTH-OS/smp.boot (visited on November 12, 2015)

• https://github.com/siemens/jailhouse (visited on November 12, 2015)

• https://rt.wiki.kernel.org (visited on November 12, 2015)

• https://rt.wiki.kernel.org/index.php/Cpuset_management_utility/tutorial (visited on November 12, 2015)

• https://rt.wiki.kernel.org/index.php/HOWTO:_Build_an_RT-application (visited on November 12, 2015)

• https://www.kernel.org (visited on November 12, 2015)

• https://www.rtai.org (visited on November 12, 2015)

• https://www.rtai.org/?Archive_announcements&id=32 (visited on November 12, 2015)

• http://wiki.osdev.org/Main_Page (visited on November 12, 2015)

• http://www2.rdrop.com/~paulmck/scalability/ (visited on November 12, 2015)

• http://www.android.com (visited on November 12, 2015)

• http://www.atmel.com (visited on November 12, 2015)

• http://www.bdti.com/Services/Benchmarks (visited on November 12, 2015)

• http://www.coker.com.au/bonnie++/ (visited on November 12, 2015)

• http://www.comedi.org (visited on November 12, 2015)

• http://www.cs.virginia.edu/stream/ (visited on May 6, 2014)


• http://www.directinsight.co.uk/products/venturcom/soft-control-architecture.html (visited on November 12, 2015)

• http://www.eembc.org (visited on November 12, 2015)

• http://www.etas.com/de/products/rtpro_pc.php (visited on November 12, 2015)

• http://www.freertos.org (visited on November 12, 2015)

• http://www.intel.com/content/www/us/en/processors/xeon/xeon-phi-coprocessor-overview.html (visited on November 12, 2015)

• http://www.intel.de/content/www/de/de/processors/pentium/pentium-d-processor-brief.html (visited on November 12, 2015)

• http://www.iozone.org (visited on November 12, 2015)

• http://www.kithara.de/de/produkte/realtime-suite (visited on November 12, 2015)

• http://www.lifl.fr/west/artis/ (visited on November 12, 2015)

• http://www.nematron.com/products/legacy/hyperkernel.html (visited on November 12, 2015)

• http://www.opensuse.org (visited on November 12, 2015)

• http://www.parallax.com/microcontrollers/propeller (visited on November 12, 2015)

• http://www.qnx.com/products/rtos/ (visited on November 12, 2015)

• http://www.raspberrypi.org (visited on November 12, 2015)

• http://www.real-time.de/preise/grad2012.html (visited on November 12, 2015)

• http://www.redhat.com (visited on November 12, 2015)

• http://www.rtems.org (visited on November 12, 2015)

• http://www.spec.org (visited on November 12, 2015)

• http://www.sprg.uniroma2.it/asmplinux/ (visited on November 12, 2015)

• http://www.tenasys.com/index.php/overview-ifw (visited on November 12, 2015)

• http://www.windriver.com (visited on November 12, 2015)

• http://www.windriver.com/products/vxworks/ (visited on November 12, 2015)


• http://www.xenomai.org (visited on November 12, 2015)

• http://xenomai.org/2014/06/configuring-for-x86-based-dual-kernels/ (visited on November 12, 2015)

Cited Linux source and documentation files

• linux-3.12/arch/x86/kernel/entry_{32,64}.S

• linux-3.12/COPYING

• linux-3.12/Documentation/cgroups/cpusets.txt

• linux-3.12/Documentation/cputopology.txt

• linux-3.10/Documentation/lockup-watchdogs.txt

• linux-3.12/Documentation/RCU/RTFP.txt
