Hermes: a Real Time Hypervisor for Mobile and Iot Systems
Total Page:16
File Type:pdf, Size:1020Kb
Session: Performance and Experimentation HotMobile’18, February 12–13, 2018, Tempe, AZ, USA Hermes: A Real Time Hypervisor for Mobile and IoT Systems Neil Klingensmith Suman Banerjee University of Wisconsin University of Wisconsin [email protected] [email protected] ABSTRACT Workshop on Mobile Computing Systems & Applications, Tempe , AZ, USA, We present Hermes, a hypervisor for MMU-less microcontrollers. February 12–13, 2018 (HotMobile ’18), 6 pages. https://doi.org/10.1145/3177102.3177103 Hermes enables high-performance bare metal applications to coex- ist with RTOSes and other less time-critical software on a single CPU. We experimentally demonstrate that a real-time operating system scheduler does not always provide deterministic response 1 INTRODUCTION times for I/O events, which can cause real-time workloads to be Modern embedded sensing and mobile applications increasingly unschedulable. Hermes solves this problem by adding a layer of perform diverse functions, including displaying user interfaces, abstraction between the hardware I/O devices and the software managing networking, performing real-time data acquisition, and that services them, making I/O transactions truly deterministic. more. Some even allow third-party code to be downloaded and Virtualization on low-power mobile and embedded systems also run alongside the factory firmware [17]. Such diversity in runtime enables some interesting software capabilities like secure execu- requirements poses challenges to software architects, who must tion of third-party apps, software integrity attestation, and bare manage the often competing needs of different tasks. metal performance in a multitasking software environment. These To manage the diverse runtime requirements of embedded soft- features otherwise require additional hardware (i.e. multiple CPUs, ware, we have developed a lightweight embedded hypervisor we hardware TPM, etc) or may not be available at all. In other projects, call Hermes1, targeted to ARM Cortex-M microcontrollers. Other we have anecdotally noticed that real time operating systems are authors have proposed similar systems for mobile phone environ- not always able to respond quickly and deterministically enough ments, but none that we are aware of on MMU-less processors to time-sensitive operations, particularly under high I/O load. We [5, 9, 14]. validate this observed timing problem by measuring interrupt la- IoT applications are frequently implemented on CPUs without tency in an RTOS environment and comparing to an experimental an MMU in order to save cost and power. While the cost of MMU- implementation of Hermes. We find that not only is the interrupt ful Linux-capable processors is going down all the time, energy latency lower in the virtualized environment, but it is also much considerations (especially for mobile applications) are not likely to more deterministic—a key figure of merit for real-time software go away. systems. We discuss challenges of implementing a hypervisor on The problem we set out to solve is one of I/O latency in such a a CPU with no memory management unit, and we present some complex runtime environment. Real-time operating system (RTOS) preliminary solutions and workarounds. We go on to explore some scheduling algorithms cannot guarantee deadlines will be met un- other applications of virtualization to mobile and IoT software. der high I/O load. People usually solve this problem by running time-critical operations on a separate CPU [13]. For example, high- CCS CONCEPTS frequency signal sampling may be implemented in bare metal code • Software and its engineering → Virtual machines; • Com- running on an independent microcontroller while the user interface, puter systems organization → Real-time operating systems; networking, storage, etc. runs on the main device. This approach Embedded systems; has a lot of obvious shortcomings: increased hardware and soft- ware complexity, power consumption, physical size, verification KEYWORDS difficulty, etc. Driver-level I/O processing has traditionally been assumed to be Real-time systems; hypervisor; virtualization a negligible component of overall response time—an assumption ACM Reference format: that was valid 30 years ago as these real-time scheduling algo- Neil Klingensmith and Suman Banerjee. 2018. Hermes: A Real Time Hy- rithms were being developed. At that time, embedded computers pervisor for Mobile and IoT Systems. In Proceedings of 19th International were single-purpose machines that largely performed the same task repetitively. Permission to make digital or hard copies of all or part of this work for personal or But that assumption of single-purposeness is becoming less valid. classroom use is granted without fee provided that copies are not made or distributed Modern microcontrollers are equipped with a diverse range of for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the peripherals that was unimaginable in the 1980s.Network interfaces, author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or high-speed data acquisition devices, touchscreens, and more all republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. have a diverse range of requirements, but they are treated the same HotMobile ’18, February 12–13, 2018, Tempe , AZ, USA by the RTOS and CPU. Exception management for low-priority I/O © 2018 Copyright held by the owner/author(s). Publication rights licensed to Associa- tion for Computing Machinery. ACM ISBN 978-1-4503-5630-5/18/02...$15.00 1HypErvisor for Real time MicrocontrollErS https://doi.org/10.1145/3177102.3177103 http://hermes.wings.cs.wisc.edu 101 Session: Performance and Experimentation HotMobile’18, February 12–13, 2018, Tempe, AZ, USA High-Priority Software Environment Entropy of Latency (a) High-Priority ISR User-Mode Processing FreeRTOS, Serial Only 0.85 FreeRTOS, Serial + Ping Flood 1.78 High-Priority (b) High-Priority ISR Low-Priority ISR Bare Metal Guest, Serial Only 0.27 User-Mode Processing Bare Metal Guest, Serial + Ping Flood 0.14 Table 1: Entropy of the distributions of latency measure- Figure 1: Timeline of (a) high-priority ISR followed by user- ments (distributions shown in Figure 2). Low values of en- mode I/O processing and (b) low-priority ISR co-occuring tropy are more deterministic. The bare metal guest running with a high-priority ISR, delaying high-priority user mode in Hermes has much more predictable latency than tasks in processing. FreeRTOS. Under Hermes, latency is still highly determinis- tic under high I/O load. is always performed before user-mode code can respond to high- priority events, creating a kind of unintended priority inversion (depicted in Figure 1). Consequently, response times to latency- computing devices with heterogeneous hardware implemen- sensitive I/O events are not deterministic, which can result in failure tations to run user apps targeted to a common platform. (see an exploration of this in Section 2). A diagram of the Hermes software architecture is shown in Conventional wisdom among real-time programmers is that ISRs Figure 3. We are implementing Hermes on an ARM Cortex M7 CPU should be as short as possible: clear the interrupt, maybe transfer a called the Atmel SAM E70 [6, 7] which has 2 Mbytes of flash and few bytes of data, and exit. Userland code should be responsible for 384 kbytes of RAM. It also includes many advanced features of responding to the event. In a crowded software environment with the latest ARM microcontrollers such as a floating point unit, a multiple drivers and tasks competing for CPU time, this program- memory protection unit, separate instruction and data caches, and ming method has the effect of delaying the actual response of all I/O many peripherals. We have tested Hermes by running a FreeRTOS events until all ISRs have finished executing. These delays break the v9.0.0 [2] guest on top of the hypervisor. The contributions made assumptions that underlie real-time scheduling algorithms, which by this work are the following: require the highest-priority task to always run first. Instead, we • We measure the response time to interrupts in an RTOS are running the driver code associated with low-priority tasks be- environment, demonstrating that latencies can be nondeter- fore the user code for high-priority tasks, and RTOSes do not have ministic under high I/O load. flexibility to change this behavior. • We propose the use of a hypervisor in real time software Hermes is a lightweight virtualization platform that lives be- environments to ameliorate the determinism problem. The tween the hardware and the operating system. At its core, Hermes hypervisor can provide isolation between software tasks, consists of some initialization code and a single interrupt service which can improve timing predictability. routine that catches and preprocesses all exceptions before dis- • We develop a rudimentary implementation of the hypervisor, patching them to the operating system. In its role as a mediator of and we study its performance. We discuss some of the diffi- exception processing, Hermes can allow multiple operating systems culties of implementing a hypervisor on a microcontroller or bare metal applications to run side-by-side on an MMUless mi- with no MMU. crocontroller, dispatching exception processing to the appropriate OS as necessary, much like a hypervisor running on a PC or server. 2 PROBLEM VALIDATION We see several potential advantages to this software architecture: Using performance counters on the ARM Cortex M7 CPU, we mea- (1) Performance. For time-critical applications, Hermes can sure the ISR-user space latency—the time between beginning of provide a thin layer between the software and the hardware. ISR execution to beginning of userspace data processing. This is Real time operating systems (RTOSs) on the other hand, a metric of how long it takes to respond to an I/O event.