A Virtualized Multikernel for Safety-Critical Real-Time Systems
Total Page:16
File Type:pdf, Size:1020Kb
Quest-V: A Virtualized Multikernel for Safety-Critical Real-Time Systems Richard West, Ye Li and Eric Missimer Computer Science Department Boston University Boston, MA 02215 Email: richwest,liye,missimer}@cs.bu.edu Abstract—Modern processors are increasingly featuring called Quest-V [1], is designed as a multikernel [2], or multiple cores, as well as support for hardware virtual- distributed system on a chip. It encapsulates different ization. While these processors are common in desktop kernel instances, and their applications, in separate vir- and server-class computing, they are less prevalent in embedded and real-time systems. However, smartphones tual machines (VMs). Faults in one VM are isolated from and tablet PCs are starting to feature multicore processors other VMs, similar to how processes are isolated from with hardware virtualization. If the trend continues, it is one another in a traditional operating system. However, possible that future real-time systems will feature more each VM has greater capabilities than a traditional pro- sophisticated processor architectures. Future automotive cess running on top of a more privileged OS kernel. With or avionics systems, for example, could replace complex networks of uniprocessors with consolidated services on a Quest-V, each kernel instance in its own VM runs on top smaller number of multicore processors. Likewise, virtual- of a privileged monitor. While a monitor is required to ization could be used to isolate services and increase the be trusted, it can be kept to a minimal size and removed availability of a system even when failures occur. from the normal execution path of each kernel in the This paper investigates whether advances in modern system. processor technologies offer new opportunities to rethink the design of real-time operating systems. We describe some Quest-V is not intended to replace RTOSes found in of the design principles behind Quest-V, which is being relatively simplistic closed embedded systems, with fixed used as an exploratory vehicle for real-time system design tasksets and highly deterministic behavior. Instead, it is on multicore processors with hardware virtualization ca- targeted at open real-time systems with dynamic tasksets pabilities. While not all embedded systems should assume and potentially unpredictable operating environments. such features, a case can be made that more robust, safety- critical systems can be built to use hardware virtualization Such systems often feature a mix of real-time and best- without incurring significant overheads. effort tasks, and data inputs generated by potentially untrusted external sources. Safety and security become I. INTRODUCTION significant, particularly in application domains such as Multicore processors are becoming ubiquitous in all health-care, factory automation, avionics and automotive classes of computing, from desktops to servers, and even systems. As an example, a future automotive system embedded systems. Many of these processors also sup- may not only involve internal tasks communicating over port hardware virtualization capabilities. For example, a controller area network, but may include dynamic Intel VT-x, AMD-V, and more recently, ARM Cortex tasks and data from interaction with other vehicles A15 processors all have native machine virtualization or the surroundings. External factors have significant support. While many such processors have been used in consequences on the safe and timely operation of the virtual datacenters, there is an opportunity to consider vehicle [3]. multicore virtualized systems in new areas of embedded While Linux supports hardware virtualization through computing. The ARM Cortex A15, for example, is being its KVM interface, and hypervisors such as Xen [4] exist, targeted at tablet devices and smartphones, with the abil- neither approach has focused on providing real-time ity to support multiple guest environments that separate guarantees to tasks with safety and security constraints. personal and work-related information and services. Quest-V is an investigation into whether a single system This paper investigates whether advances in modern can be built from a collection of separate kernel images processor technologies offer new opportunities to rethink operating together, to satisfy both temporal and spatial the design of safety-critical real-time operating systems isolation requirements. Temporally, one task should not (RTOSes). We present a new system design that uses interfere with another in its timing requirements, while both virtualization capabilities and the redundancy of- spatially, the misuse of resources such as memory by fered by multiple processing cores, to develop a real-time one task should not affect others. system that is resilient to software faults. Our system, We show how Quest-V does not incur significant 1 operational overheads compared to a non-virtualized discussed in the work on Barrelfish [2]. This leads to version of our system, simply called Quest, designed for reduced resource contention and improved system effi- SMP platforms. We describe how to enforce real-time ciency on platforms with multiple cores, even accounting guarantees on communication, interrupt handling, thread for explicit inter-kernel communication costs. As system scheduling, and cross-core task migration. Similarly, we resources are effectively distributed across cores, and show how to leverage virtualization to prevent software each core is managed separately, there is no need to component failures (either through error or malicious have shared structures such as a global scheduler queue. attacks) from compromising an entire system. This, in turn, can improve predictability by eliminating In the following section we describe the rationale for undue blocking delays due to synchronization. the design of Quest-V. This is followed by a descrip- (2) Fault Resilience – replication of kernel function- tion of the architecture in Section III. An experimental ality or, at least, separation of services in different evaluation of the system is provided in Section IV. protection domains increases fault resilience. This, in Here, we show the overheads of online fault recovery, turn, increases system availability when there are partial along with the costs of using hardware virtualization system failures. to isolate kernels and system components. Section V (3) Highest Safe Privilege – Rather than adopting a describes related work, while conclusions are discussed principle of least privilege for software services, as is in Section VI. done in micro-kernels, a virtualized system can support II. DESIGN RATIONALE the highest safe privilege levels for different services. Quest-V is centered around three main goals: safety, Virtualization provides an extra logical ”ring of pro- predictability and efficiency. The system is focused tection” that allows guests to think they are working on safety-critical application domains, requiring high directly on the hardware. Thus, virtualized services can confidence in their operation [5] to prevent potential be written with traditional kernel privileges, yet still be loss of lives or equipment. With recent advances in isolated from other equally privileged services in other fields such as cyber-physical systems, more sophisticated guest domains. This avoids the costs typically associated OSes beyond those traditionally found in real-time and with micro-kernels, which require added communication embedded computing are now required. Consider, for overheads to request services in different protection example, a collection of automotive sub-systems for domains. engine, body, chassis, transmission, safety and info- (4) Minimal Trusted Code Base – A micro-kernel tainment services. These could be consolidated on the attempts to provide a minimal trusted code base for the same multicore platform, with space-time partitioning to services it supports. However, it must still be accessed ensure malfunctions do not propagate across services. as part of inter-process communication, and basic op- Moreover, if future automotive systems interact with erations such as coarse-grained memory management. their environments as part of vehicle-to-vehicle (V2V) Monitors form a trusted code base in the Quest-V virtual- or vehicle-to-infrastructure (V2I) communication, they ized multikernel. Access to these can be avoided almost are open to potential safety and security breaches from entirely, except to bootstrap (guest) sandbox kernels, external sources. handle faults and manage EPTs. This enables sandboxes While safety is a key goal, hardware virtualization to operate, for the most part, independently of any other provides a method to encapsulate, or sandbox, system code base that requires trust. In turn, the trusted code resources from access by unauthorized sources. Virtual- base (i.e., monitors) can be limited to a small memory ization provides an opportunity to enforce both system- footprint. wide safety and security beyond that achievable with While Quest-V uses hardware virtualization to iso- non-virtualized hardware solutions such as paging and late sandbox kernels, this is not a requirement of our segmentation [6], [7]. Quest-V relies on Extended Page multikernel approach. Many platforms, especially those 1 Tables (EPTs) to separate system software components in embedded systems, still lack hardware virtualization operating as a collection of services in a distributed sys- features. In such cases, it is possible to use alternative tem on a chip. The rationale