Virtual Platform for Hw RTOS – Multiprocessor Hardware RTOS

Fabrice Muller, Farooq Muhammad
[email protected], [email protected]
LEAT-CNRS - University of Nice-Sophia Antipolis – France
http://www.polytech.unice.fr/~fmuller

Abstract

We propose a Hardware Multiprocessor OS running on a prototyping platform, which offers new services to improve multiprocessor management and includes a performance analyzer tool.

1. Introduction

With the increasing complexity of applications, the Multiprocessor System on Chip (MPSoC) has become an important implementation choice. However, MPSoC solutions for real-time systems require an efficient Real Time Operating System (RTOS) to manage the resources and to guarantee the real-time constraints. Verifying an application running on an MPSoC platform also becomes too complex as the number of processors, the application complexity, the bus interconnections and the management of resources by one or more OS all increase. Management of resources and scheduling of applications distributed over processors are complex problems as well.

We propose a new approach which includes a Prototyping Platform and a Virtual Platform based on the SystemC language, composed of a number of heterogeneous processors, communication busses and a generic HwRTOS (Hardware Real Time Operating System) to manage the MPSoC platform. The main challenge is to validate the application behavior on an MPSoPC platform. To make this possible, we developed a generic RTOS in hardware with a Hardware Debug Module included in the HwRTOS and connected to the Performance Analyzer tool. It is also possible to make decisions about scheduling techniques (Rate Monotonic, EDF) and scheduling taxonomy (Global, Local, Hybrid), and to adapt these choices to the varying needs of the application at run time.

One solution often used for the management of applications is an RTOS. Some RTOS include specific services to carry out memory or input/output management, but all these functionalities are implemented in software and are supported by the processor. For example, VxWorks and RTLinux are very complete. Our goal is therefore not to rival these RTOS, but to define an efficient hardware implementation and evaluate its benefits.

The idea of a hardware Operating System that moves scheduling and inter-process communication from software to hardware has been addressed in some previous works. The main idea is to move the RTOS functionalities that consume the most CPU power into hardware in order to benefit from hardware acceleration.

SiliconOS [1] is a full-fledged operating system in which the majority of the μITRON functionality is implemented on a coprocessor called Silicon TRON. SiliconOS does not support multiprocessor architectures. Moreover, services like memory and timer management that consume processor time are still implemented in software. The resulting software kernel is one third the size of the original software kernel.

The δ SoC Codesign Framework [2] is built around the Atalanta kernel [3] and allows a more fine-grained partitioning than [1]. This kernel provides key RTOS features including multitasking capabilities, event-driven and priority-based pre-emptive scheduling, inter-task communication and synchronization, but it does not permit changing the scheduling at run time. HOPES [4] is an RTOS-like system that allows run-time partitioning and allocation of reconfigurable FPGAs. It supports both pre-emptive and non-pre-emptive scheduling methods but does not provide multiprocessor scheduling or semaphore services. FASTCHART [1] is a real time kernel fully implemented in hardware. Key features of this kernel are priority scheduling, synchronization primitives and interrupt handling, but the last version (Sierra 16) does not support multiprocessor architectures.

Despite this amount of previous work and initial industrial attempts like [5], at present commercial RTOS generally do not offer:
• Multiprocessor support with the possibility of dynamic load balancing (global and local scheduling),
• Ability for designers to modify policies at run time. Users can decide only offline which algorithms to implement in hardware and cannot adapt to user needs at run time.

We propose an embedded and generic hardware multiprocessor RTOS where the user can statically select the different scheduling algorithms and can also change the scheduling policies at run time. The effort in this work focused not only on implementing an RTOS in hardware, but also on giving the user the flexibility to change decision algorithms at run time thanks to VHDL or SystemC algorithms within the HwRTOS.

[Figure 1 : Hardware RTOS possibilities. Labels: Multiprocessor Architecture, Main OS Hw Services, Complex applications, Open Hardware RTOS, Trace Analyzer, Scheduling Algorithms, Easy-plug user Hw services.]

2. Hw MP RTOS

The HwRTOS is the heart of the SoPC platform and Virtual Platform, as shown in Figure 1. It can manage complex applications on homogeneous or heterogeneous multiprocessor architectures. Generic parameters (number of processors, number of tasks, …) adapt the HwRTOS to a target application.

Considering the multiprocessor architecture of Figure 2 and a partition of the application, we propose to allocate task execution to a set of processors and to use HwRTOS services to schedule tasks and to synchronize the communications between tasks by semaphores or message queues. The goal is to run the computationally intensive parts of the HwRTOS in hardware to improve performance and to control all processors at any time. We have kept only a minimal software layer that responds quickly to the commands of the hardware part.

The software layer sends parameters and a command associated with an OS service to the hardware part, which triggers the execution of the hardware OS service. The HwRTOS core can be connected to any bus: the designer has to create a wrapper to adapt the bus protocol to the HwRTOS generic bus. Multiple ports make it possible to separate the debug port from the application ports, so that debugging does not disturb the behavior of the application on the Prototyping/Real Platform.

[Figure 2 : Interaction with Environment. Block diagram labels: Application, Processors, Bus, Interrupt, Wrapper, Debug port, HwRTOS IP, Hw OS core, Hw OS service, Debug service. Exchanges between processors and wrapper: command and parameters; status and return value. Register groups: Global Area Registers (OS name, version, generic parameters, service information, debug information); Processor Area Registers (parameters, status, command, return value, …, TCB pointer, timer); Debug Area Registers (busy place, debug control, event filter, trigger, data trace).]

5. Results

The HwRTOS implementation has been tested on VirtexII Pro, Virtex4 and Virtex5 technology. Figure 3 sums up the resources used by several configurations of the HwRTOS for the Virtex5 target. We fixed 8 semaphores, 8 message queues and no debug service. We also include the local scheduling mode (by priority and parallel computation) and the mixed scheduling mode (by priority and sequential computation). To obtain the total number of schedulable tasks for a configuration, add the local tasks and the global tasks. For example, for 4 processors and 8 local tasks, the HwRTOS can schedule up to 48 tasks.

CLB Slices
Per processor       1 proc    2 proc    4 proc
8 local tasks         1507      2035      3037
16 local tasks        1998      2816      4668
32 local tasks        2853      4478      7958

Dffs or Latches
Per processor       1 proc    2 proc    4 proc
8 local tasks         2196      2914      4545
16 local tasks        2798      3962      6732
32 local tasks        3948      6090     10930

Figure 3 : Resources Utilizations on Virtex5 SX50. (Bar labels in the original charts give the number of global tasks of each configuration, from 8 to 128.)

6. Conclusion

The multiprocessor Hardware RTOS is efficient (fast) and flexible, since it gives the designer much freedom for customization. It integrates the main OS services with multiprocessor characteristics. The other important feature is the SystemC algorithms integrated in the HwRTOS for scheduling analysis.

In conclusion, we have proposed a new hardware RTOS able to easily support different scheduling algorithms and new user OS services. The resulting system can be integrated easily on a SoC/SoPC and Virtual Platform. No other multiprocessor hardware RTOS offers this capability and flexibility.

7. References

[1] Nakano, A. Utama, M. Itabashi, A. Shiomi, and M. Imai. Hardware implementation of real-time operating system. Proceedings of the 12th TRON Project International Symposium, pp. 34-42, 1995.
[2] V. J. Mooney and D. M. Blough. A Hardware-Software Real-Time Operating System Framework for SOCs. IEEE Design and Test of Computers, pp. 44-51, November-December 2002.
[3] P. Kuacharoen, M. Shalan and V. Mooney. A Configurable Hardware Scheduler for Real-Time Systems. Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA'03), pp. 96-101, June 2003.
[4] K. Baskaran, W. Jigang, and T. Srikanthan. A Hardware Operating System based Approach for Run-time Reconfigurable Platform of Embedded Devices. 6th Real Time Linux Workshop (Singapore), Nov 3-5, 2004.
[5] T. Klevin. Get RealFast RTOS with Xilinx FPGAs. XCELL, 45, 2003.
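
As a side illustration of the run-time choice between the Rate Monotonic and EDF techniques mentioned in the Introduction, the following is a minimal behavioral sketch in Python. It is not the HwRTOS implementation, which the paper describes as VHDL or SystemC algorithms. The names Policy, Task and select_task, the tick unit and the example task set are all hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    """Scheduling techniques named in the paper (enum itself is hypothetical)."""
    RATE_MONOTONIC = "RM"  # static priority: the shortest period wins
    EDF = "EDF"            # dynamic priority: the earliest absolute deadline wins


@dataclass
class Task:
    name: str
    period: int      # activation period, in timer ticks (assumed unit)
    deadline: int    # absolute deadline of the current job, in ticks
    ready: bool = True


def select_task(tasks, policy):
    """Return the ready task to run next under `policy`, or None if none is ready."""
    ready = [t for t in tasks if t.ready]
    if not ready:
        return None
    if policy is Policy.RATE_MONOTONIC:
        # Ties are broken by name so the choice is deterministic.
        return min(ready, key=lambda t: (t.period, t.name))
    return min(ready, key=lambda t: (t.deadline, t.name))


# Hypothetical task set: "sensor" has the shortest period,
# "control" has the earliest absolute deadline.
tasks = [
    Task("sensor", period=10, deadline=40),
    Task("control", period=20, deadline=25),
    Task("logger", period=50, deadline=90),
]

rm_choice = select_task(tasks, Policy.RATE_MONOTONIC)
edf_choice = select_task(tasks, Policy.EDF)
print(rm_choice.name, edf_choice.name)
```

Passing a different policy value between two calls models a policy change at run time: with the task set above, Rate Monotonic selects sensor (shortest period) while EDF selects control (earliest deadline).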