Flexpret: a Processor Platform for Mixed-Criticality Systems

FlexPRET: A Processor Platform for Mixed-Criticality Systems Michael Zimmer David Broman Christopher Shaver Edward A. Lee Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2013-172 http://www.eecs.berkeley.edu/Pubs/TechRpts/2013/EECS-2013-172.html October 23, 2013 Copyright © 2013, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Acknowledgement This work was supported in part by the Center for Hybrid and Embedded Software Systems (CHESS) at UC Berkeley, which receives support from the National Science Foundation (NSF awards #0720882 (CSR-EHS: PRET), #0931843 (CPS: Large: ActionWebs), and #1035672 (CPS: Medium: Timing Centric Software)), the Naval Research Laboratory (NRL #N0013-12-1-G015), and the following companies: Bosch, National Instruments, and Toyota. The second author was funded by the Swedish Research Council #623-2011-955. FlexPRET: A Processor Platform for Mixed-Criticality Systems Michael Zimmer∗, David Broman∗y, Chris Shaver∗, Edward A. Lee∗ ∗University of California, Berkeley, CA, USA yLinkoping¨ University, Sweden fmzimmer, broman, shaver, [email protected] Abstract—Mixed-criticality systems, in which multiple tasks of to switch tasks, monitor execution times, or handle sensing varying criticality are executed on a single hardware platform, are and actuation [12]. In this paper, we take a different approach an emerging research area in real-time embedded systems. High- and investigate if a new processor architecture can provide criticality tasks require spatial and temporal isolation guarantees, hardware-based isolation to mixed-criticality systems without whereas low-criticality tasks should efficiently utilize hardware underutilizing resources. resources. Hardware-based isolation is desirable, but commonly underutilizes hardware resources, which can consist of multiple Hardware-based isolation can be achieved by deploying single-core, multicore, or multithreaded processors. We present each task on separate computational components: either pro- FlexPRET, a processor designed specifically for mixed-criticality cessors, cores in a multicore processor, or hardware threads systems by allowing each task to make a trade-off between hardware-based isolation and efficient processor utilization. Flex- in a multithreaded processor. Although potentially better than PRET uses fine-grained multithreading with flexible scheduling one task per processor, one task per core can still underutilize and timing instructions to provide this functionality. resources. A hardware thread, subsequently referred to as just a thread, uses hardware support to share the underlying processor with other threads. Provided thread scheduling maintains I. INTRODUCTION temporal isolation, which is not done in many multithreaded A current trend in real-time embedded systems, driven processors, one task per thread can better utilize resources by by size, weight, and power concerns, is consolidating many enabling multiple tasks to be deployed on a single processor increasingly complex applications onto fewer hardware plat- without losing hardware-based isolation. forms. A processor must then execute tasks with differing importance, safety, or certification requirements—creating a One approach is to use fine-grained multithreading (in- mixed-criticality system [1], [2], [3]. These requirements are terleaved multithreading, barrel processor) [13], [14], where often specified using criticality levels, such as the five levels instructions from different threads are interleaved in the (A-E) used in the DO-178C avionics standard [4]. Higher- pipeline every cycle. Existing fine-grained multithreaded pro- criticality-level tasks correspond to hard real-time tasks, cessors have, however, inflexible thread scheduling mecha- whereas lower-criticality-level tasks correspond to soft real- nisms. PTARM [15] provides scheduling guarantees to each time tasks. Each criticality level has different requirements. thread but cannot utilize cycles when a thread is no longer active, whereas XMOS X1 [16] loses temporal isolation if it Spatial and temporal isolation prevent a task from be- utilizes cycles when a thread is not active. Both processors also ing unintentionally affected by another task: spatial isolation require four threads to be active to fully utilize the pipeline. protects a task’s state (stored in registers and memory), and temporal isolation protects a task’s desired timing behavior. This paper presents FlexPRET, a processor designed to Timing predictability, still an issue even if a task is isolated, is exhibit architectural techniques useful for mixed-criticality desirable to tightly bound a task’s worst-case execution time systems. FlexPRET uses fine-grained multithreading, and by (WCET) and avoid over provisioning resources. These are classifying each thread as either a hard real-time thread desirable properties for verification and certification of a hard (HRTT) or a soft real-time thread (SRTT), provides hardware- real-time task. For a soft real-time task, however, efficiently based isolation to HRTTs while allowing SRTTs to efficiently utilizing the processor is more important than isolation and utilize the processor. Unlike other fine-grained multithreaded predictability guarantees. processors, FlexPRET supports an arbitrary interleaving of threads in the pipeline and uses a novel thread scheduler. Hardware-based isolation is robust and often used in safety- Each thread, either an HRTT or SRTT, can be guaranteed critical systems [5], [6], but traditionally utilizes hardware to be scheduled at certain cycles for isolation or throughput resources inefficiently [7], as each processor can only execute guarantees. If no thread is scheduled for a cycle or a scheduled a single task. As a consequence, extensive research has been thread has completed its task, that cycle is used by some SRTT performed during the past several years on software scheduling in a round-robin fashion—efficiently utilizing the processor. of mixed-criticality systems [1], [8], [9], [2], [10], [11], where isolation is provided by software running on a real-time Even with spatial and temporal isolation, a task’s WCET operating system (RTOS). Although this can drastically reduce can be difficult to bound; the predictability and performance hardware costs, the RTOS itself must be verified and certified. trade-off of the underlying processor depends on architectural This includes accounting for overhead and behavior of run- decisions, typically regarding branch prediction, pipeline or- time monitoring and control, which could include preemption dering, and caches. Current research in scheduling mixed- critically systems assumes that WCET bounds are either un- of task periods) is shown, where rectangles indicate a task is safe [9] (not actually a bound) or overly pessimistic [1], [2]. executing. The upper plot is unusual in that the units of both The problem is many processors lack fine-grained predictabil- axes are processor cycles, but this is done to more clearly show ity, meaning the ability to determine the execution time of which task’s thread is executed each cycle; the vertical axis an instruction with little knowledge of execution history— indicates which threads are executed in a particular pipeline a desirable property for WCET analysis. Complex branch stage each cycle over a four cycle interval. The sequence of prediction and multilevel caches, used to optimize average-case thread interleaving is read top to bottom, left to right, as shown performance, do not have fine-grained predictability and make for two intervals by the lower plot. WCET analysis a non-trivial task [17]. FlexPRET is related Task τ executes once every four cycles and is the first to previous precision-timed (PRET) machines [18], [15], [19], B to complete (t = 2; 000). Its cycles are not needed until its where fine-grained multithreading allows processors to remove next period (t = 5; 000) and are donated to τ . Task τ , dynamic branch prediction and caches for fine-grained pre- C A requiring more cycles than τ to meet its deadline, executes dictability with less of a performance penalty. B once every two cycles and also donates cycles to τC when FlexPRET also uses a variation of timing instruc- it completes (t = 8; 000). Notice that τB does not donate tions [20], [18], [19] to provide direct control over timing cycles to τA; as a hard real-time task, τA would be verified to (in nanoseconds). With these instructions, the processor itself meet its deadline with only its allocated cycles and does not is responsible to execute with the timing specified by the benefit by completing earlier. Tasks τA and τB are temporally instruction—enabling a more direct temporal mapping between isolated by only being executed at prescribed cycles—enabling languages with timing semantics and the processor. Unlike independent verification. Task τC efficiently uses every cycle previous work, cycles are not left unused to satisfy temporal not needed by τA or τB, but has sacrificed temporal isolation: constraints; FlexPRET reallocates these cycles to SRTTs. timing behavior depends on when τA and τB start and

Load more