FACULDADE DE ENGENHARIA DA UNIVERSIDADE DO PORTO

Real-Time Scheduling on Multi-core: Theory and Practice

Paulo Manuel Baltarejo de Sousa

Doctoral Program in Informatics Engineering

Supervisor: Prof. Eduardo Manuel Medicis Tovar Second Supervisor: Prof. Luís Miguel Pinho Almeida

October 29, 2013

© Paulo Manuel Baltarejo de Sousa, 2013

Real-Time Scheduling on Multi-core: Theory and Practice

Paulo Manuel Baltarejo de Sousa

Doctoral Program in Informatics Engineering

October 29, 2013

Abstract

Nowadays, multi-core platforms are commonplace for efficiently achieving high computational power, even in embedded systems, with the number of cores steadily increasing and expected to reach hundreds of cores per chip in the near future. Real-time computing is becoming increasingly important and pervasive, as more and more industries, infrastructures, and people depend on it. For real-time systems too, multi-cores offer an opportunity for a considerable boost in processing capacity, at relatively low price and power. This could, in principle, help with meeting the timing requirements of computationally intensive applications which could not be met on single-cores. However, real-time system designers must adopt suitable approaches for their systems to allow them to fully exploit the capabilities of the multi-core platforms. In this line, scheduling algorithms will certainly play an important role.

Real-time scheduling algorithms for multiprocessors are typically categorized as global, partitioned, and semi-partitioned. Global scheduling algorithms store tasks in one global queue, shared by all processors. Tasks can migrate from one processor to another; that is, a task can be preempted during its execution and resume its execution on another processor. At any moment, the m highest-priority tasks are selected for execution on the m processors. Some algorithms of this kind achieve a utilization bound of 100% but generate too many preemptions and migrations. Partitioned scheduling algorithms partition the task set and assign all tasks in a partition to the same processor. Hence, tasks cannot migrate between processors. Such algorithms involve few preemptions but their utilization bound is at most 50%. In semi-partitioned (or task-splitting) scheduling algorithms, most tasks are fixed to specific processors (like partitioned), while a few tasks migrate across processors (like global). This approach produces a better balance of the workload among processors than partitioning and consequently presents higher utilization bounds. Additionally, it also reduces the contention on shared queues and the number of migrations (by reducing the number of migratory tasks). However, it must ensure that migratory tasks never execute on two or more processors simultaneously.

For all of the categories of scheduling algorithms mentioned, it generally holds that the real-time scheduling theory is based on a set of concepts and assumptions that have little correspondence with practice. The gap between theory and practice is wide, and this compromises both the applicability and the reliability. Therefore, we believe that efficient scheduling algorithm implementations in real operating systems can play an important role in reducing the gap between theory and practice. We contend that (i) real-time theory cannot be detached from practical issues and (ii) the schedulability analysis must incorporate practical issues, such as context switch (or task switch) and task release overheads, just to mention a few. Therefore, the real-time theory must be overhead-aware in order to be reliable and must also consider the practical (in most cases, operating system-related) issues to increase its implementability in real operating systems.

In this dissertation, we deal with semi-partitioned scheduling algorithms that are categorized as slot-based task-splitting to achieve the above-mentioned claims. To this end, we define a set of design principles to efficiently implement those scheduling algorithms in a real operating system (a PREEMPT-RT-patched kernel version). Grounded on implementations of the slot-based task-splitting scheduling algorithms in the Linux kernel, we identify and model all run-time overheads incurred by those scheduling algorithms. We then incorporate them into a new schedulability analysis. This new theory is based on exact schedulability tests, thus also overcoming many sources of pessimism in the existing analysis. Additionally, since schedulability testing guides the task assignment under the schemes in consideration, we also formulate an improved task assignment procedure. The outcome is a new demand-based overhead-aware schedulability analysis that permits increased efficiency and reliability. We also devise a new scheduling algorithm for multiprocessor systems, called Carousel-EDF. Although it presents some similarities with slot-based task-splitting scheduling algorithms, we classify the Carousel-EDF scheduling algorithm as a reserve-based scheduling algorithm. The underlying theory is also overhead-aware, and we also implement it in the Linux kernel.

Resumo

Nowadays, the use of multiprocessor platforms to obtain greater computational power is quite common, and the number of processors per platform is expected to reach the hundreds in the near future. At the same time, real-time computing has been growing in importance as it is increasingly used in industry, in various infrastructures, and even by people in general. Given the specific characteristics of real-time systems, these platforms represent an opportunity to increase processing capacity at low cost and with low power consumption. All this computational power can, in principle, help satisfy the timing requirements of computationally demanding applications that cannot be satisfied by platforms equipped with a single processor. However, when engineering such real-time systems, suitable approaches must be adopted in order to take full advantage of these multiprocessor platforms. In this respect, task scheduling algorithms certainly play an important role.

Scheduling algorithms for real-time systems on multiprocessor platforms are generally categorized as global, partitioned, and semi-partitioned. Global scheduling algorithms keep all tasks in a global queue, shared by all processors. Tasks may migrate between processors; that is, the execution of a task may be interrupted (preempted) on one processor and resumed on another. At any instant, the m highest-priority tasks are executing on the m processors that compose the platform. Some global algorithms achieve utilization rates of 100%, however at the cost of many preemptions and migrations. Partitioned scheduling algorithms divide the tasks that compose the task set into partitions, one per processor. Tasks assigned to a processor may only execute on that processor. Task migrations are therefore not allowed, which reduces the number of preemptions. However, the utilization bound is 50%. Semi-partitioned algorithms create partitions with most of the tasks (as partitioned algorithms do), while the remaining (few) tasks may migrate between processors (as in global algorithms). On the one hand, this approach allows a better balancing of tasks among processors and consequently a higher utilization bound; on the other hand, it reduces the contention for access to the shared queues that store the migratory tasks and the number of migrations (because fewer tasks may migrate). Nevertheless, these algorithms must ensure that a migratory task never executes on two or more processors simultaneously.

Generally, the theory underlying all the categories of real-time scheduling algorithms described above is based on a set of concepts and assumptions that have little correspondence with practice; that is, they do not take into account the specific characteristics of the platforms and operating systems. The gap between theory and practice is enormous and consequently compromises both its applicability and its reliability. Therefore, we believe that efficient implementations of scheduling algorithms in an operating system can help narrow the gap between theory and practice. Accordingly, we establish that (i) real-time systems theory cannot be detached from implementation details and (ii) the schedulability analysis must incorporate practical details, such as the cost associated with switching tasks on the processor, with the arrival of tasks in the system, among others. Therefore, real-time systems theory must take these costs into account in order to be more reliable and also more implementable in operating systems.

In this dissertation, we focus on semi-partitioned algorithms; more specifically, on a set of real-time scheduling algorithms (for multiprocessor platforms) whose main characteristic is the division of time into temporal reserves. In order to meet the requirements listed above, we define a set of principles for an efficient implementation in an operating system (in this case, a version of the Linux operating system enhanced with features appropriate for real-time systems). Grounded on implementations of the algorithms in question in the Linux operating system, we identify and model all the operational costs associated with the execution of those algorithms in that operating system. We then incorporate all those operational costs into a new schedulability analysis. This new theory is based on exact schedulability tests, thereby eliminating many sources of pessimism of the existing analysis. This new analysis enables an improvement in the partitioning process, that is, more tasks per partition and therefore fewer partitions. The outcome of this new analysis is an increase in the efficiency and reliability of the algorithms. We also create a new scheduling algorithm for multiprocessor systems, named “Carousel-EDF”. This algorithm presents some similarities with the algorithms considered in this dissertation. The theory developed for this new algorithm is also based on exact schedulability tests that likewise incorporate the inherent operational costs.

Acknowledgements

The research work leading to this dissertation was developed at the Real-Time Computing Systems Research Centre (CISTER), at the School of Engineering of the Polytechnic Institute of Porto (ISEP/IPP).

I would like to express my thanks to Prof. Eduardo Tovar. He provided me with crucial support to successfully develop my research work. Prof. Eduardo Tovar offered constructive criticism and invaluable guidance that have significantly improved the quality of the work presented in this dissertation. I am also thankful to Prof. Luís Almeida, for accepting to be my supervisor and also for the great effort in reviewing this dissertation. Officially, I had two supervisors. Nevertheless, I would like to thank my third, unofficial, supervisor: Konstantinos Bletsas. I have no words to express my gratitude for his outstanding collaboration.

I would like to express my thanks to Prof. Pedro Souto for his insightful comments, discussions, and help, which far exceeded what anyone can ever ask for. Thanks also to Nuno Pereira for his collaboration in part of the work and especially for his encouragement and friendship. I would like to express my gratitude and appreciation to all my co-authors.

I spent the last 10 years as a CISTER collaborator. Therefore, I would like to thank all my colleagues at CISTER for all the great support at various levels and for their encouragement. My special thanks to Sandra Almeida and also to Inês Almeida for the administrative support.

I am a teaching assistant at the Department of Computer Engineering of ISEP. I am deeply thankful to all the coordinators of the courses I teach (Luís Lino Ferreira, Luís Nogueira, Alexandre Bragança and Luís Miguel Pinho) for their support, comprehension, and, especially, for freeing me from some teaching-related labour. I cannot forget my LaTeX helper and friend Paulo Matos, thank you.

Finally, I have to express my gratitude and, especially, my happiness for the unconditional love, continuous support, and patience that I have received from my lovely daughters, Rita and Matilde, and from my wife, Elisa. Elisa, this work that I am finishing now is one more step in our life’s journey that we started 17 years ago. Without you, I would not be able to finish this work. I love you so much.

This work was supported by FCT (Fundação para a Ciência e Tecnologia) under the individual study grant SFRH/BD/46199/2008.

Paulo Baltarejo Sousa

“All things are difficult before they are easy”

Dr. Thomas Fuller

Contents

1 Introduction
  1.1 Motivation
  1.2 Challenges
  1.3 Approach
  1.4 Thesis statement
  1.5 Contributions
  1.6 Outline

2 Background on multiprocessor platforms
  2.1 Introduction
  2.2 Hardware in multiprocessor systems
    2.2.1 Single-core processors
    2.2.2 Multiprocessor systems
    2.2.3 Sources of unpredictability
  2.3 Software in multiprocessor systems
    2.3.1 Linux kernel overview
      2.3.1.1 Clock and timers
      2.3.1.2 Tasks and run-queues in the Linux kernel
      2.3.1.3 Linux modular scheduling framework
      2.3.1.4 The real-time scheduling class of Linux
    2.3.2 The Linux PREEMPT-RT patch
      2.3.2.1 Priority inversion and priority inheritance
      2.3.2.2 Fully preemptible kernel
      2.3.2.3 Locking primitives
      2.3.2.4 Threaded ISRs
      2.3.2.5 Scheduling policies
    2.3.3 Real-time scheduling add-ons
    2.3.4 Micro kernel-based approaches
  2.4 Summary

3 Background on real-time systems
  3.1 Introduction
  3.2 Basic concepts
    3.2.1 System model
    3.2.2 Fundamental concepts of scheduling
  3.3 Review of relevant work on multiprocessor scheduling
  3.4 Comparison of scheduling algorithms
    3.4.1 Global scheduling


    3.4.2 Partitioned scheduling
    3.4.3 Semi-partitioned scheduling
    3.4.4 Discussion
  3.5 Historical perspective of slot-based scheduling algorithms
  3.6 Summary

4 Slot-based task-splitting scheduling algorithms
  4.1 Introduction
  4.2 Assumptions about the hardware architecture
  4.3 Generic slot-based task-splitting
  4.4 Overview of specific scheduling schemes
    4.4.1 Discussion of S-EKG and its original theory
    4.4.2 Discussion of NPS-F and its original theory
    4.4.3 Reasoning about the parameter δ
  4.5 Instantaneous migration problem
  4.6 Implementation of the slot-based task-splitting dispatcher
    4.6.1 Scheduler implementation
    4.6.2 Adding support for slot-based task-splitting scheduling algorithms to the RT scheduling class
  4.7 Scheduling overheads
  4.8 Summary

5 Slot-based task-splitting overhead-aware schedulability analysis
  5.1 Introduction
  5.2 Basis for the overhead-aware schedulability analysis
  5.3 New demand-based and overhead-aware schedulability analysis
    5.3.1 New demand-based schedulability test for mapping tasks to servers
    5.3.2 New demand-based schedulability test for assigning servers to processors
      5.3.2.1 Non-split servers
      5.3.2.2 Split servers
  5.4 New server-to-processor assignment procedure
    5.4.1 Assignment rules
    5.4.2 Assignment procedure
      5.4.2.1 Task to processor assignment procedure in S-EKG
      5.4.2.2 New server-to-processor assignment for NPS-F
      5.4.2.3 Effect of assignment rules on the schedulability analysis
  5.5 Schedulability tests
  5.6 Summary

6 Reserve-based scheduling algorithm
  6.1 Introduction
  6.2 Basics of the reserve-based scheduling algorithm
    6.2.1 Off-line procedure
    6.2.2 Run-time task-dispatching algorithm
  6.3 Implementation of the Carousel-EDF task-dispatching algorithm
    6.3.1 Scheduler implementation
    6.3.2 Releasing jobs
  6.4 Overhead-aware schedulability analysis
    6.4.1 Server inflation

    6.4.2 Generation of the carousel
    6.4.3 Supporting other job release mechanisms
  6.5 Utilization bound
  6.6 Upper bounds on preemptions
  6.7 Summary

7 Evaluation
  7.1 Introduction
  7.2 The off-line schedulability analysis
    7.2.1 Quantification of overheads
    7.2.2 Experiment design
    7.2.3 Preliminary evaluation of the merits of slot-based task-splitting over partitioning, when overheads are considered
    7.2.4 Evaluation of the new analysis in the absence of overheads
      7.2.4.1 Experiments for the S-EKG scheduling algorithm
      7.2.4.2 Experiments for the NPS-F scheduling algorithm
    7.2.5 Evaluation of the new analysis in the presence of overheads
    7.2.6 Reliability of the schedulability analysis
    7.2.7 Slot-based vs. reserve-based
  7.3 The run-time task-dispatching algorithm
    7.3.1 Experimental setup
    7.3.2 Why overheads have to be accounted for
    7.3.3 Enhancing real-time capabilities of the Linux kernel
  7.4 Summary

8 Conclusions and future work
  8.1 Introduction
  8.2 Summary of results
  8.3 Future work

References

A Interrupts

B Other results
  B.1 Off-line task-to-processor algorithm
  B.2 Run-time task-dispatching algorithm

C Papers and materials
  C.1 List of papers by the author
  C.2 Materials

List of Figures

1.1 An example of the operation of a slot-based task-splitting multiprocessor scheduling. Task τ2 is a split task while tasks τ1, τ3, and τ4 are non-split tasks that execute only on one processor.
1.2 An example of the operation of the hierarchical approach for slot-based task-splitting multiprocessor scheduling.

2.1 Illustration of a computer according to the von Neumann architecture.
2.2 Illustration of a multiprocessor architecture according to the shared memory model.
2.3 Illustration of a multiprocessor architecture according to the distributed memory model.
2.4 The cache coherence problem for a single memory location (X), read and written by two processors (A and B).
2.5 Illustration of the use of an Advanced Programmable Interrupt Controller (APIC) in multiprocessor systems.
2.6 Linux-based computer system architecture.
2.7 Scheduling framework architecture.
2.8 RT scheduling class ready-queue.
2.9 Illustration of the priority inversion and priority inheritance.
2.10 Computer system architecture with the micro kernel layer.
2.11 Xenomai computer system architecture.

3.1 Illustration of the job timing parameters.
3.2 Illustration of Dhall’s effect.
3.3 Illustration of the partitioning problem.
3.4 Global-EDF execution time-line.
3.5 Global EDZL execution time-line.
3.6 PF execution time-line.
3.7 DP-Wrap execution time-line.
3.8 EDF-WM execution time-line.
3.9 S-EKG execution time-line.
3.10 NPS-F execution time-line.

4.1 Task-to-server mapping.
4.2 Server-to-processor assignment.
4.3 Run-time dispatching time-line produced by the slot-based task-splitting algorithm for the task set example.
4.4 The intuition behind the δ parameter.
4.5 Illustration of the adjacent time slots that cause the instantaneous migration problem.


4.6 Illustration of the time slot shifting to solve the instantaneous migration problem. Although it is not shown in the figure, there is no penalty for non-split servers; that is, the system supplies the same execution time to the non-split servers.
4.7 Experimental evaluation of the impact of the cache and memory bus contention on the reading time of an array. These experiments highlight the benefit of the staggered against the aligned versions.
4.8 The required data used to implement the task-dispatching algorithm and its organization.

5.1 Illustration of the execution demand of a task against the processing supplied by the system.
5.2 Illustration of the execution demand of a task against the processing supplied by the system considering time reserves.
5.3 Illustration of release jitter and release overhead.
5.4 Illustration of the release interference for non-split servers. In this example, server P̃q may suffer release interference from the arrivals of tasks served by P̃q−1 and P̃q+1 (if these occur at an instant when P̃q is mapped to processor Pp).
5.5 Illustration of the reserve overhead. In this example, the execution of job τi,j, inside reserve A, is delayed by ResLi,j, the overhead of that reserve.
5.6 Illustration of the IPI latency at the release of a split job. It does not include the time for context switching.
5.7 Illustration of the potential release interference exerted by neighbouring servers. In this example, server P̃q may suffer release interference from the arrivals of tasks served by P̃q−1 and P̃q−2 (if these occur at an instant when P̃q is mapped to processor Pp) but also from the arrivals of tasks served by P̃q+1 and P̃q+2 (if these occur at an instant when P̃q is mapped to processor Pp+1).
5.8 Illustration of the Ω and Ofake parameters.
5.9 Illustration of the assignment rule A1.
5.10 Illustration of the assignment rule A2.
5.11 Illustration of the S-EKG task-to-processor mapping.

6.1 Run-time dispatching time-line produced by the Carousel-EDF for the task set example presented in Section 4.3.
6.2 Data required to implement the Carousel-EDF scheduling algorithm in the Linux kernel.
6.3 Illustration of the reserve latency. The ResL incorporates the release of jobs whose arrival time has elapsed.

7.1 Comparison between NPS-F and P-EDF scheduling algorithms in the presence of overheads.
7.2 Comparison between the original S-EKG and the new schedulability analysis for task sets composed of mixed, ui ∈ [0.05, 0.95), tasks.
7.3 Comparison between the original S-EKG and the new schedulability analysis considering task sets composed of light, ui ∈ [0.05, 0.35), medium, ui ∈ [0.35, 0.65), and heavy tasks, ui ∈ [0.65, 0.95).
7.4 Comparison between the original NPS-F and the new schedulability analysis considering task sets composed of mixed, ui ∈ [0.05, 0.95), tasks.
7.5 Comparison between the original NPS-F and the new schedulability analysis considering task sets composed of light, ui ∈ [0.05, 0.35), medium, ui ∈ [0.35, 0.65), and heavy tasks, ui ∈ [0.65, 0.95).

7.6 Comparison between S-EKG and NPS-F scheduling algorithms according to the new demand-based schedulability analysis using δ equal to one.
7.7 Task-to-processor assignment patterns of the S-EKG and NPS-F.
7.8 Evaluation of the effect of the overheads on the new schedulability analysis.
7.9 Multiple time slot lengths in a system.
7.10 Fraction of task sets schedulable according to the utilization-based analysis, but not schedulable according to the new demand-based overhead-aware schedulability analysis, as a function of the type of tasks of the task set and of the design parameter δ.
7.11 Fraction of mixed task sets that are considered schedulable by the original schedulability analysis but are not by the schedulability analysis with overheads (CpmdO equal to zero).
7.12 Comparison between the NPS-F and Carousel-EDF based on the normalized inflation.
7.13 A comparison between the utilization-based and the demand-based schedulability analyses based on percentage of deadlines met.
7.14 A performance comparison among RR, FIFO, NPS-F, and Carousel-EDF based on percentage of deadlines met.

B.1 Comparison of the new NPS-F demand-based overhead-aware schedulability analysis for short-, mixed-, and long-period task sets.
B.2 The mean and the standard deviation of the time slot length for short-, mixed-, and long-period task sets according to the type of tasks.
B.3 A comparison between the utilization-based and the demand-based schedulability analyses based on percentage of deadlines met for mixed-period task sets, with periods in the range [5, 250] milliseconds.
B.4 A comparison between the utilization-based and the demand-based schedulability analyses based on percentage of deadlines met for long-period task sets, with periods in the range [50, 250] milliseconds.
B.5 A performance comparison among RR, FIFO, NPS-F, and Carousel-EDF based on percentage of deadlines met for mixed-period (in the range [5, 250] milliseconds) task sets.
B.6 A performance comparison among RR, FIFO, NPS-F, and Carousel-EDF based on percentage of deadlines met for long-period (in the range [50, 250] milliseconds) task sets.

List of Tables

3.1 Task set example.

7.1 Experimentally-derived values for the various scheduling overheads (in microseconds).

List of Algorithms

5.1 Pseudo-code of the new task-to-server mapping algorithm.
5.2 Pseudo-code algorithm of the inflate_sb_non_split function.
5.3 Pseudo-code of the new server-to-processor assignment algorithm.
5.4 Pseudo-code algorithm of the schedulability test functions.
6.1 Pseudo-code algorithm of the reserve sequence generator.
6.2 Pseudo-code algorithm of the first reserve generator.
7.1 Pseudo-code algorithm of the task set generator.

List of Code Listings

2.1 The required steps to implement a high resolution timer in the Linux kernel.
2.2 The struct task_struct data structure.
2.3 The struct rq data structure.
2.4 The rt_sched_class definition.
2.5 The schedule function.
4.1 The C language code of the task used to investigate the impact of the contention for shared cache and also for the memory bus.
4.2 Changes to the enqueue_task_rt function.
4.3 Changes to the check_preempt_curr_rt function.
4.4 Changes to the pick_next_task_rt function.
6.1 Timer callback function that releases all jobs whose ai,j is earlier than the beginning of the current reserve.
7.1 The C language code of the task for experiments.

Acronyms and Symbols

ADEOS Adaptive Domain Environment for Operating Systems
API Application Programming Interface
APIC Advanced Programmable Interrupt Controller
BF Best-Fit
BKL Big Kernel Lock
CFS Completely Fair Scheduling
CPMD Cache-related Preemption and Migration Delay
CPU Central Processing Unit
dbf demand bound function
DMA Direct Memory Access
DP-Fair Deadline Partitioning Fairness
EDF Earliest Deadline First
EDF-SS Earliest Deadline First scheduling of non-split tasks with Split tasks scheduled in Slots
EDF-US Earliest Deadline First Utilization Separator
EDF-WM Earliest Deadline First Window-constraint Migration
EDZL Earliest Deadline first until Zero Laxity
EKG Earliest deadline first with task splitting and K processors in a Group
FF First-Fit
FIFO First-In-First-Out
FPZL Fixed Priority until Zero Laxity
GPOS General Purpose Operating System
HPET High Precision Event Timer
HRT High-Resolution Timer
I/O Input/Output
I/O APIC Input/Output APIC
ILP Instruction-Level Parallelism
IPI Inter-Processor Interrupt
IRMOS Interactive Real-time Multimedia Applications on Service Oriented Infrastructures
ISR Interrupt Service Routine
LAPIC Local Advanced Programmable Interrupt Controller
LITMUS^RT LInux Testbed for MUltiprocessor Scheduling in Real-Time systems
LLF Least-Laxity-First
LXRT LinuX-RT
NF Next-Fit
NP-hard Non-deterministic Polynomial-time hard
NPS Notional Processor Scheduling


NPS-F Notional Processor Scheduling - Fractional
NUMA Non-Uniform Memory Access
P-EDF Partitioned Earliest Deadline First
P-Fair Proportionate Fairness
PF P-Fairness
PI Priority Inheritance
PIT Programmable Interval Timer
POSIX Portable Operating System Interface
PREEMPT-RT Linux kernel PREEMPT-RT patch
QoS Quality of Service
QPA Quick Processor-demand Analysis
RESCH REal-time SCHeduler
ReTAS Real-time TAsk-Splitting scheduling algorithms framework
RM Rate Monotonic
RM-US RM Utilization Separator
RR Round-Robin
RT Real-Time
RTAI Real-Time Application Interface
RTLinux Real-Time Linux
RTOS Real-Time Operating System
S-EKG Sporadic Earliest deadline first with task splitting and K processors in a Group
sbf supply bound function
SMT Simultaneous MultiThreading
TSC Time Stamp Counter
UMA Uniform Memory Access
WF Worst-Fit

Symbol: Interpretation (Definition)

τ: A task set
τi: The i-th task
Ci: The worst-case execution requirement of task τi (see Section 3.2)
Ti: The minimum inter-arrival time of task τi (see Section 3.2)
Di: The relative deadline of task τi (see Section 3.2)
ui: The utilization of task τi (see Section 3.2)
n: The number of tasks of τ
τi,j: The j-th job of task τi (see Figure 3.1)
ai,j: The arrival time of job τi,j (see Figure 3.1)
di,j: The absolute deadline of job τi,j (see Figure 3.1)
fi,j: The execution finishing time of job τi,j (see Figure 3.1)
Pp: The p-th processor
x[Pp]: The processor Pp’s x reserve length (see Equation 5.34)
N[Pp]: The processor Pp’s N reserve length (see Equation 5.34)
y[Pp]: The processor Pp’s y reserve length (see Equation 5.34)
U[Pp]: The utilization of processor Pp
m: The number of processors
P̃q: The q-th server
τ[P̃q]: The set of tasks assigned to the server P̃q
U[P̃q]: The utilization of server P̃q (see Equation 4.1)
Uinfl[P̃q]: The inflated utilization of server P̃q (see Algorithm 5.2 and Section 6.4.1)
Uinfl_x[P̃q]: Part of the inflated utilization of server P̃q assigned to the x reserve of the respective processor (see Equation 5.34)
Uinfl_y[P̃q]: Part of the inflated utilization of server P̃q assigned to the y reserve of the respective processor (see Equation 5.34)
k: The number of servers
S: The time slot length (see Equations 4.2 and 5.41)
δ: A designer-set integer parameter controlling the migration frequency of split tasks (see Section 4.4.3)
α: The inflation factor of S-EKG (see Equation 4.2)
Ω: The time interval between the two split server reserves (see Equation 4.8)
Us: Utilization of the system (see Equation 3.2)
UB_S-EKG: Utilization bound of S-EKG (see Equation 4.3)
UB_NPS-F: Utilization bound of NPS-F (see Equation 4.7)
U^{infl:orig}_{s:S-EKG}: The cumulative processing capacity requirements for all (inflated) servers under the original analysis for S-EKG (see Equation 4.5)
U^{infl:orig}_{s:NPS-F}: The cumulative processing capacity requirements for all (inflated) servers under the original analysis for NPS-F (see Equation 4.6)

Rule A1: The server should not be split (see Figure 5.9)
Rule A2: The minimum separation between two reserves of the same server (see Figure 5.10)
RelJ: An upper bound for the release jitter (see Section 4.7 and Figure 5.3)
RelO: An upper bound for the release overhead (see Section 4.7 and Figure 5.3)
ResJ: An upper bound for the reserve jitter (see Section 4.7 and Figures 5.5 and 6.3)
ResO: An upper bound for the reserve overhead (see Section 4.7 and Figures 5.5 and 6.3)
CtswO: An upper bound for the context (or task) switch overhead (see Section 4.7 and Figures 5.5 and 6.3)
ResL: An upper bound for the reserve latency (see Section 4.7 and Figures 5.5 and 6.3)
IpiL: An upper bound for the IPI latency (see Section 4.7 and Figure 5.6)
CpmdO: An upper bound of the CPMD overhead (see Section 4.7)

Chapter 1

Introduction

1.1 Motivation

The computer era has been characterised by a continuous (and probably unfinished) race between processor manufacturers and processor consumers. Every new performance advance (permitting computational tasks to be executed faster by the processor) leads to another level of greater performance demands from consumers. Traditionally, performance advances were mainly achieved by increasing the clock speed. However, such increases have become limited by power consumption and heat dissipation [Low06].

In this context, multi-core architectures (consisting of multiple processing units or cores integrated in a single chip) offer an ideal platform to accommodate the recurrent increase of performance demands from consumers and to overcome the performance limitation of single-core architectures. The increase in processing capacity can be achieved by incorporating multiple cores in one single chip, with each core able to run at a lower frequency, reducing heat and power consumption. Nowadays, multi-core platforms are commonplace for efficiently achieving high computational power [SMD+10], even in embedded systems, with the number of cores steadily increasing, expected to reach hundreds of cores per chip in the future [TIL12].

Nevertheless, what is the actual impact of multi-core platforms on current computing technology? In [Bak10], the author poses the following question: “What will we make of these new processors?”. He then continues with: “This question should be on the minds of the developers of every kind of software.” This leads us to another question: do multi-core platforms represent an evolution or a revolution? We believe that multi-core platforms are more than an evolution, since system designers must adopt suitable approaches for their systems that allow them to exploit the full capabilities of multi-core platforms. Therefore, this can imply giving up or, at least, reassessing the approaches followed in the context of single-cores during the last decades.

Real-time computing is becoming increasingly important and pervasive, as more and more industries, infrastructures, and people depend on it. For real-time systems too, multi-cores offer an opportunity for a considerable boost in processing capacity, at relatively low price and power.


This could, in principle, help with meeting the timing requirements of computationally intensive applications that could not be met on single-cores.

Real-time systems are defined as those systems in which the correctness of the system depends not only on the logical result of computation, but also on the time at which the results are produced [Sta88]; that is, the computation results must be delivered within a time bound, called the deadline. Therefore, one of the most characteristic features of real-time systems is the notion of time. The time granularity is not necessarily the same across all real-time systems. In other words, microseconds could be suitable for some real-time systems, while for others the timing requirement could be on the scale of milliseconds, seconds, minutes, or even hours. Therefore, the time granularity depends on the environment where the system operates.

Typically, real-time applications are composed of a set of tasks. Among other characteristics, each real-time task has an associated relative deadline. The results produced by each task must be delivered to the application during the execution time window defined by the arrival time (the time instant when a task becomes ready to be executed) of each task plus the respective relative deadline (which is typically different for each task). Naturally, each task requires some amount of time for computing its results (which is also typically different for each task). This imposes the use of some mechanism to schedule those tasks on the available processing units. In other words, real-time applications require a scheduling mechanism, usually denoted as a scheduling algorithm, to create a task execution order such that their temporal constraints are satisfied. One frequent metric used to evaluate scheduling algorithms is the utilization bound, which is defined as a threshold for the task set workload such that all tasks meet their deadlines when the task set workload does not exceed that threshold; a minimal formalization is sketched at the end of this section.

Real-time applications are employed in a wide variety of fields. Given the particularities of some fields, it must be assured, prior to run-time, that the applications will not fail, especially in application domains where a failure can have catastrophic consequences. This is assured by the schedulability test. The schedulability test establishes, at design time, whether the timing constraints are going to be met at run-time or not. The real-time systems research community has developed a comprehensive toolkit comprised of scheduling algorithms, schedulability tests, and implementation techniques, which have been very successful: they are currently taught at major universities world-wide; they are incorporated in design tools and they are widely used in industry. Unfortunately, the bulk of those results is limited to systems equipped with a single-core processor (or uniprocessor) only. Today, the increasing number of real-time systems running on multi-core (or multiprocessor) platforms imposes the need for developing an analogous toolkit for those platforms.
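
To state the utilization-bound metric with the symbols used throughout this dissertation (Ci, Ti, ui, and Us from the symbol list), the sporadic task model quantifies the workload as below. Whether Us is normalized by the number of processors m is fixed later by Equation 3.2; here the bound is written in its normalized form as a sketch:

\[ u_i = \frac{C_i}{T_i} \qquad\qquad U_s = \sum_{i=1}^{n} u_i \]

A scheduling algorithm with utilization bound UB then guarantees that every deadline is met whenever Us/m does not exceed UB; the 100% and 50% figures quoted in the abstract for global and partitioned algorithms are values of UB.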

1.2 Challenges

Multiprocessor scheduling algorithms have been categorized as global, partitioned, or semi-partitioned. Global scheduling algorithms store all tasks in one global queue, shared by all processors (or cores). At any time instant, the m highest-priority tasks among those are selected for execution on the m processors. Tasks can migrate from one processor to another during the execution; that is, the execution of a task can be stopped (or preempted) on one processor and resumed on another processor. Some scheduling algorithms [BGP95, BCPV96, AS00, AS04, CRJ06, RLM+11] of this class present a utilization bound of 100%. However, the global shared queue imposes the use of locking mechanisms to serialize the access to that queue. This compromises the scalability of global scheduling algorithms, as stated in [Bra11]: “it would be unrealistic to expect global algorithms to scale to tens or hundreds of processors”. Another disadvantage of global scheduling is related to the fact that the higher number of preemptions and migrations can cause cache misses; that is, failed attempts to read or write data from the processor cache, implying main memory accesses that have a much higher latency.

In contrast, partitioned scheduling algorithms partition the task set such that all tasks in a partition are assigned to the same core (or processor), and tasks are not allowed to migrate from one processor to another. This class of scheduling algorithms presents a utilization bound of at most 50%. However, it conceptually transforms a multiprocessor system composed of m processors into m single-core (or uniprocessor) systems, thus simplifying the scheduling problem. Therefore, partitioned scheduling schemes require two algorithms: the off-line task-to-processor assignment algorithm and the run-time task-dispatching algorithm. The first one assigns tasks to processors and the second one schedules tasks at run-time to execute on the processor(s). However, assigning tasks to processors is a bin-packing problem, which is known to be NP-hard [CGJ97]; a sketch of this assignment appears after this paragraph. The task-dispatching algorithm schedules the statically assigned tasks using a uniprocessor scheduling algorithm, such as Earliest Deadline First (EDF) [Liu69], which assigns the highest priority to the most urgent ready task; that is, it assigns the highest priority to the task with the least remaining time until the end of its execution time window. One advantage of partitioned scheduling algorithms compared with global scheduling algorithms is the absence of migrations and the respective overheads, namely those related to increased cache misses. However, processor manufacturers have largely addressed this issue using a hierarchical cache approach. Therefore, this allows migrative scheduling approaches to be considered for real-time multiprocessor systems. Hence, we share the opinion stated in [LFS+10]: “Newer multiprocessor architectures, such as multi-core processors, have significantly reduced the migration overhead. The preference for partitioned scheduling may no longer be necessary on these types of multiprocessors”.
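
To make the bin-packing nature of partitioning concrete, the following C sketch shows a first-fit task-to-processor assignment under the uniprocessor EDF utilization test (a partition is EDF-schedulable if its total utilization does not exceed 1.0, assuming implicit deadlines). The type and function names are illustrative, not taken from any of the cited works:

#include <stddef.h>

typedef struct {
    double c;   /* worst-case execution time, C_i */
    double t;   /* minimum inter-arrival time, T_i */
} task_t;

/*
 * First-fit assignment: place each task on the first processor whose
 * partition still passes the EDF utilization test (sum of u_i <= 1.0).
 * u[] holds the current utilization of each of the m processors and
 * owner[] receives the chosen processor index for each task.
 * Returns 0 on success, -1 if some task fits on no processor.
 */
static int first_fit(const task_t *tau, size_t n, double *u, size_t m, size_t *owner)
{
    for (size_t i = 0; i < n; i++) {
        double ui = tau[i].c / tau[i].t;    /* task utilization u_i = C_i / T_i */
        size_t p = 0;

        while (p < m && u[p] + ui > 1.0)    /* find the first bin with room */
            p++;
        if (p == m)
            return -1;                      /* bin-packing failure */
        u[p] += ui;
        owner[i] = p;
    }
    return 0;
}

The 50% bound quoted above is the worst case of exactly this fragmentation: m+1 tasks, each with utilization slightly above 0.5, total just above half of the platform capacity, yet no two of them fit together on one processor, so the assignment fails on m processors.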
Semi-partitioned (or task-splitting) scheduling algorithms for multiprocessor systems try to solve or reduce the drawbacks and limitations presented by global and partitioned scheduling algorithms. Typically, under task-splitting scheduling algorithms, most tasks (called non-split tasks) execute on only one processor (as in partitioning) while a few tasks (called split tasks) use multiple processors (as in global scheduling). Note that it is not the code of such tasks that is split, but the execution requirement of such tasks. This approach produces a better balance of the workload than partitioning. (Re-)using part of the processor capacity, which would otherwise be lost to bin-packing-related fragmentation, permits the design of scheduling algorithms with a higher utilization bound. Moreover, having just a few migrating tasks limits the migration-related overheads and reduces (or removes altogether) the need for locking mechanisms, by not requiring system-wide global shared queues. As with partitioned schemes, semi-partitioned scheduling schemes are defined in terms of two algorithms: one for the task-to-processor assignment at design time and another for task-dispatching at run-time.

However, for all the categories of scheduling algorithms described above, it generally holds that the real-time scheduling theory is based on a set of concepts and assumptions that have little correspondence with practice, as stated in [Bak10]: “Despite progress in theoretical understanding of real-time scheduling on multiprocessors, there is cause for concern about the validity of the theory for actual systems. The problem is that real multi-core processors do not fit the assumptions of the models on which the theory is based”. Therefore, we believe that efficient scheduling algorithm implementations in real operating systems can play an important role in reducing the gap between theory and practice. In other words, (i) real-time theory cannot be detached from practical issues and (ii) the schedulability analysis must incorporate practical issues, such as context switch (or task switch) and task release overheads, just to mention a few. Therefore, the real-time theory must be overhead-aware in order to be reliable. By incorporating those overheads in the schedulability tests, we will further be able to validate the above proclaimed advantages of task-splitting algorithms in terms of efficiency.

1.3 Approach

In this dissertation, we deal with semi-partitioned scheduling algorithms that are categorized as slot-based task-splitting. These algorithms subdivide time into time slots. Each processor time slot is composed of time reserves carefully positioned with a time offset from the beginning of the time slot. Time reserves are periodic and are used to execute tasks. Non-split tasks are executed within one periodic time reserve, while split tasks are executed within two time reserves located on different processors. The position of each processor reserve in time is statically assigned (relative to the beginning of a time slot) so that no two reserves serving the same split task overlap in time.

In the course of this work, we developed the ReTAS (Real-time TAsk-Splitting scheduling algorithms) framework [ReT12]. The ReTAS framework implements, in a unified way, slot-based task-splitting scheduling algorithms in the Linux kernel. These implementations allow us, on the one hand, to be aware of the specificities of operating systems in handling these types of scheduling algorithms and, on the other hand, to study such scheduling algorithms running in a real operating system. The combination of both allows us to make some improvements to the scheduling algorithms under consideration in order to overcome the difficulties imposed by a real operating system.

According to [SMD+10], Moore’s law (which projects that the density of circuits on a chip will double every 18 months) still applies to multi-core architectures. Therefore, the number of processing units in multi-core architectures will increase exponentially with time. Hence, we believe it is important that an implementation of a multiprocessor scheduling algorithm has a task-dispatching overhead that is low as a function of the number of processors (ideally, independent of the number of processors). To this end, the design principles for implementing slot-based task-splitting algorithms must follow one important rule: the interactions among processors must be reduced. Let us provide an example scenario to help stress this rule and also to provide an intuitive overview of the slot-based task-splitting scheduling algorithms.

Figure 1.1 shows a typical time-line for a slot-based task-splitting scheduling algorithm. Time is divided into time slots of length S, and each processor time slot is in turn divided into time reserves (denoted x, N, and y). Assume that task τ2 is a split task and executes on two processor reserves, y[P1] and x[P2].

Task τ1 is a non-split task that executes on only one processor reserve, N[P1], and tasks τ3 and τ4 are also non-split tasks that share the same processor reserve, N[P2]. When both are ready to be executed, the scheduler selects the highest-priority one according to the EDF scheduling algorithm.


Figure 1.1: An example of the operation of slot-based task-splitting multiprocessor scheduling. Task τ2 is a split task while tasks τ1, τ3, and τ4 are non-split tasks that execute only on one processor.

In terms of implementation, a hierarchical approach can be used. Instead of assigning tasks directly to processors, tasks are first assigned to servers, and then servers are instantiated by means of processor reserves. Therefore, each server has a set of tasks (one or more) to schedule, and one or two processor reserves to execute such tasks. Each server, P̃q, schedules its tasks according to the EDF scheduling algorithm. Figure 1.2 shows a typical time-line for a slot-based task-splitting scheduling algorithm using the hierarchical approach based on servers.


Figure 1.2: An example of the operation of the hierarchical approach for slot-based task-splitting multiprocessor scheduling.
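
A minimal C sketch of the data this hierarchy implies is given below. The structure and field names are illustrative guesses for exposition only; Chapter 4 presents the data structures actually used in the ReTAS implementation:

/* A ready task, queued in EDF order by its absolute deadline. */
struct task {
    unsigned long long abs_deadline;
    struct task *next;
};

/* One periodic processor reserve (x, N, or y) inside the time slot. */
struct reserve {
    unsigned long long offset;   /* start, relative to the time slot beginning */
    unsigned long long length;   /* duration of the reserve */
    struct server *srv;          /* server instantiated by this reserve */
};

/* A server groups tasks and schedules them by EDF within its reserves. */
struct server {
    struct task *ready;          /* head of the EDF-ordered ready queue */
    struct reserve *first;       /* reserve on processor P_p */
    struct reserve *second;      /* reserve on P_p+1; NULL for non-split servers */
};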

We assume that each processor is able to set up a timer to generate an interrupt signal at a certain time in the future. Then, the time slot is independently managed by each processor (usually, all processor architectures provide such a timer mechanism). This timer is used to trigger scheduling decisions at the beginning of each reserve. Furthermore, we assume that processor clocks are synchronized (note that this is a hardware issue that is out of the scope of this work).
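
As a sketch of this per-processor timer, the Linux high-resolution timer API (the subject of Listing 2.1 later in this dissertation) can be armed at each reserve boundary. The two extern helpers are assumptions introduced here for illustration; only the hrtimer calls are real kernel API, and start_slot_timer is assumed to run on the target CPU:

#include <linux/hrtimer.h>
#include <linux/ktime.h>
#include <linux/percpu.h>
#include <linux/smp.h>

static DEFINE_PER_CPU(struct hrtimer, slot_timer);

/* Hypothetical helpers: absolute start time of this CPU's next reserve,
 * and the switch of the locally active reserve/server. */
extern ktime_t next_reserve_boundary(int cpu);
extern void begin_next_reserve(int cpu);

/* Fires at the start of each reserve and re-arms itself for the next one;
 * each processor manages its own time slot with no cross-CPU interaction. */
static enum hrtimer_restart reserve_boundary(struct hrtimer *t)
{
    int cpu = smp_processor_id();

    begin_next_reserve(cpu);
    hrtimer_set_expires(t, next_reserve_boundary(cpu));
    return HRTIMER_RESTART;
}

static void start_slot_timer(int cpu)
{
    struct hrtimer *t = &per_cpu(slot_timer, cpu);

    hrtimer_init(t, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
    t->function = reserve_boundary;
    hrtimer_start(t, next_reserve_boundary(cpu), HRTIMER_MODE_ABS);
}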

From Figures 1.1 and 1.2, we can identify that the most problematic issue in satisfying the aforementioned rule is the migration of split tasks. Whenever a split task consumes its reserve on processor Pp, it has to immediately resume its execution on another reserve on processor Pp+1; we denote this issue as the instantaneous migration problem. From a practical point of view, that precision is not possible, due to the many sources of unpredictability in a real operating system. Note that, even with precise time clock synchronization, the problem could persist; for instance, a timer interrupt could be prevented from firing at a specific time instant, and thus delayed, if the interrupt system or the respective interrupt is disabled. Consequently, the dispatcher of processor Pp+1 can be prevented from selecting the task in consideration for execution, because processor Pp has not yet relinquished that task. One solution could be for processor Pp+1 to send an Inter-Processor Interrupt (IPI) to Pp to make it relinquish that split task. Another could be for Pp+1 to set up a timer x time units in the future to force the invocation of its dispatcher (or scheduler).

We chose the latter option for two reasons: (i) we know that if a processor has not yet relinquished the split task, it is because something is preventing it from doing so (e.g., the execution of an Interrupt Service Routine (ISR)); (ii) using IPIs in the way mentioned above introduces a dependency between processors that could compromise the scalability of the dispatcher. This approach allows us to define a new schedulability theory for slot-based task-splitting scheduling algorithms, and this new theory employs exact, processor demand-based, schedulability tests. Consequently, it is inherently more efficient than the original analysis for the respective algorithms, which employs utilization-based tests. We identify and model into the new analysis all types of scheduling overheads manifested under the considered scheduling algorithms. This renders the new, unified, schedulability analysis overhead-aware. Since schedulability tests are used to guide the off-line task-to-processor assignment in this class of algorithms, we also formulate a sophisticated overhead-aware accompanying task-to-processor assignment algorithm. This brings increased efficiency and reliability to slot-based task-splitting scheduling algorithms.
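
A sketch of the chosen retry logic follows, with hypothetical helper names (the real dispatcher is presented in Chapter 4); the only point illustrated is that processor Pp+1 backs off locally instead of interrupting Pp:

struct task;    /* opaque; defined by the dispatcher */

/* Hypothetical helpers assumed for this sketch. */
extern int task_still_running_elsewhere(const struct task *t);
extern void arm_dispatch_retry(int cpu, unsigned long long delay_ns);
extern void run_task(int cpu, struct task *t);

/* Invoked on P_p+1 when its reserve for a split task begins. */
static void pick_split_task(int cpu, struct task *split)
{
    if (split && task_still_running_elsewhere(split)) {
        /* P_p has not relinquished the task yet (e.g., an ISR runs there):
         * arm a short local timer, x time units ahead, and try again. */
        arm_dispatch_retry(cpu, 10000ULL /* illustrative 10 us back-off */);
        return;
    }
    run_task(cpu, split);
}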

1.4 Thesis statement

In this dissertation, we address slot-based task-splitting scheduling algorithms for multiprocessor systems. We bridge the gap between theory and practice by adapting the schedulability theory so that it accounts for the overheads that multiprocessor scheduling algorithms incur in a real system. The central proposition of this dissertation is:

Slot-based task-splitting scheduling algorithms can be implemented in a real operating system and work in such systems. However, the real-world con- straints cannot be ignored by the underlying theory.

1.5 Contributions

In this section, we overview the main contributions presented in this dissertation:

C1: The fundamental basis for implementing slot-based scheduling algorithms in a real operating system and the implementation of slot-based scheduling algorithms in the Linux kernel;

C2: A new schedulability analysis that takes real-world run-time overheads into account by incorporating them directly into the analysis;

C3: The Carousel-EDF scheduling algorithm for multiprocessor systems as well as the related overhead-aware schedulability analysis with its implementation in the Linux kernel.

Like any other variant of task-splitting scheduling algorithms, slot-based scheduling algorithms require an off-line task-to-processor assignment algorithm and a run-time task-dispatching algorithm. Contribution C1 concerns the task-dispatching algorithms; in other words, it concerns the implementation of the scheduling algorithms in the Linux operating system. We define the challenges and the design principles for implementing a slot-based task-splitting algorithm in the Linux kernel [SAT10a]. Additionally, we provide the first implementation of a slot-based task-splitting algorithm [SAT10b, SAT11], an invaluable contribution in demonstrating the capabilities and limitations of new multiprocessor scheduling techniques on actual hardware. We further evolved those implementations to support, in a unified way, slot-based task-splitting scheduling algorithms [SBTA11] in mainline Linux and later in a PREEMPT-RT-patched [PR12] Linux kernel [SPT12].

Contribution C2 concerns the off-line procedure. We define a new scheduling theory that incorporates the scheduling overheads of a slot-based task-splitting algorithm [SBAT11]. We also define a unified overhead-aware schedulability analysis based on processor demand [SBT+13].

Contribution C3 concerns Carousel-EDF, a new hierarchical scheduling algorithm for multiprocessor systems, and its overhead-aware schedulability analysis based on demand-bound functions [SSTB13].

As an overarching contribution, we show that considering real-world overheads is fundamental for the efficiency and especially for the reliability of the schedulability tests.

1.6 Outline

The rest of this dissertation is organized as follows. Chapter 2 and Chapter 3 are devoted to further providing the relevant context and motivation. The former deals with system platforms, namely hardware and operating system issues. The latter focuses on real-time scheduling concepts and methodologies. From Chapter 4 to Chapter 7, we present our innovative research work.

In Chapter 4, we describe in a unified way the slot-based task-splitting scheduling algorithms under consideration in this work, creating a system model that is used throughout this dissertation. We describe the implementation of those algorithms according to that model in the PREEMPT-RT-patched Linux kernel. We also identify and model the scheduling overheads incurred by those algorithms when running in a real system.

Chapter 5 presents the new overhead-aware and demand-based schedulability analysis. These two features of the new schedulability analysis increase its reliability and efficiency. Incorporating the scheduling overheads into the schedulability analysis allows us to remove the unrealistic assumptions of the original analysis. This new theory is based on exact schedulability tests, thus also overcoming many sources of pessimism in the existing analysis.

Chapter 6 presents Carousel-EDF, a new hierarchical scheduling algorithm for a system of identical processors, and its overhead-aware schedulability analysis based on demand bound functions. Carousel-EDF presents some similarities with the slot-based scheduling algorithms under consideration in this work. In a high-level view, it is a time-driven algorithm like the slot-based scheduling algorithms but, contrary to them, Carousel-EDF creates only one time reserve for each server. As a result, it reduces by up to 50% the number of preemptions incurred by the time-managing mechanism. This is achieved through a time reserve rotation scheme. In a low-level view, it is event-driven, since it schedules tasks according to the EDF scheduling algorithm, like the slot-based scheduling algorithms. We denote Carousel-EDF as a reserve-based scheduling algorithm, as opposed to the slot-based scheduling algorithms. We also describe the implementation of Carousel-EDF in the PREEMPT-RT-patched Linux kernel.

Chapter 7 evaluates the effectiveness and the reliability of the new schedulability theory developed for slot-based task-splitting scheduling algorithms by comparing it with the original theory. This evaluation is based on an extensive set of experiments focused on the task-to-processor assignment algorithm and also on the run-time task-dispatching algorithms. In this chapter, we also compare slot-based with reserve-based scheduling algorithms, and also highlight in detail the realities behind what amounts to a lack of support for any real-time scheduling algorithm by the present Linux kernel.

Finally, in Chapter 8, we provide a summary of the research results developed in the context of this work and outline future research lines.

Chapter 2

Background on multiprocessor platforms

2.1 Introduction

A processor is one of the computer components, arguably the fundamental one. A processor has the ability to execute instructions; an instruction is a single operation that, given an input, produces an output, and the output of one instruction can be the input of another. It is thus possible to create a logically ordered sequence (or list) of instructions that performs some kind of computation; this is usually denoted a computer program. In computer jargon, the set of all physical devices, such as the processor, memory, and graphics card, for example, is denoted hardware, and the collection of programs is denoted software.

In this chapter, we will review platform-related aspects that are relevant to the research described in this dissertation. We start by devoting some attention to hardware architecture topics, focusing on the hardware characteristics that contribute to the unpredictability of the system. Then, we proceed to software. Important operating system details are reviewed, and Linux and Linux-based real-time operating systems will be described. An initial big picture of the Linux kernel will be provided, with special attention given to the features that are relevant to satisfying the requirements of real-time systems. We will also review different approaches employed by Linux-based real-time operating systems. Because the Linux PREEMPT-RT patch is used as the base for our implementation work, a detailed description of the features that it introduces to the mainline Linux kernel is presented.

2.2 Hardware in multiprocessor systems

This section presents some relevant computer architecture concepts. As a starting point, we overview the organization of single-core processor systems. This provides the background for the evolution into multiprocessor architectures, which we briefly overview next.


Several architectural advances introduced in modern single processor and multiprocessor systems, such as memory caches and pipelines, were designed to increase the average performance. While, on average, these advances can provide significant performance improvements, they introduce significant unpredictability and make the task of providing timing guarantees much harder. We will discuss two important hardware features that introduce unpredictability and are of special relevance to the implementation of real-time scheduling algorithms: caches and interrupts.

2.2.1 Single-core processors

Briefly discussing single-core processor systems is useful to introduce some relevant hardware aspects. Therefore, we start with a generic overview of a single-core processor platform (see Figure 2.1). Many computational systems are based on von Neumann's architecture [EL98]. According to this architecture, a computational system consists of a CPU (which stands for Central Processing Unit), memory, and Input/Output (I/O) devices interconnected by a single bus. The CPU is the component responsible for executing applications. Applications are composed of data and instructions, both of which are stored in memory. Therefore, a CPU executing some application needs to fetch that application's information (data and instructions) from the main memory. Intuitively, the application execution speed depends not only on the CPU processing capacity, but also on the memory access times.

From the birth of CPUs, back in the early 1970s, until today, the CPU evolution has been tremendous; there is probably no comparable example in the engineering world. The increase in performance (measured by the amount of time needed to execute a given task) has been achieved through increases in CPU speed, Instruction-Level Parallelism (ILP) [RF93, Pai00] techniques, pipelines, and memory caches.

The increase in CPU speed decreases the duration of the clock cycle (in one clock cycle, a CPU can perform a basic operation such as, for instance, fetching an instruction). One of the ILP techniques used to increase the overall efficiency of a CPU is Simultaneous MultiThreading (SMT). SMT makes a single physical CPU appear as multiple CPUs. For that purpose, a physical CPU contains multiple logical CPUs. Each logical CPU consists of an almost complete set of CPU hardware, including registers, pipelines, and other elements, but shares the execution unit with the other logical CPUs. This provides a way of keeping, at the hardware level, multiple threads (or tasks) ready to be executed, thus reducing the context switch (or task switch) cost; that is, the cost of switching the execution unit from one task to another. From the operating system perspective, an SMT system is composed of multiple CPUs. However, in practice, only one task can execute at a given time; hence, an SMT system is, actually, a single-CPU system. One common implementation of SMT is Intel's hyperthreading [MBH+02].

Accesses to main memory are very slow compared to the CPU speed, which causes the CPU to stall while waiting for the information to be returned. The main purpose of caches is to reduce the memory access latencies. A cache is a smaller and faster memory that stores copies of the application's data/instructions recently used by the CPU. Caches are typically organized in a hierarchy of increasingly larger and slower caches, as follows. Whenever the CPU needs to read from or write to a location in main memory, it first checks the level 1 (L1) cache; if it hits (if the information is stored in the L1 cache), the CPU proceeds. If the L1 cache (the fastest in terms of access) misses (if the information is not stored in the L1 cache), the next larger cache (L2, slower than L1) is checked, and so on, before main memory is checked. Whenever the required information is not stored in any of the previously mentioned memory units (caches and main memory), the CPU has to retrieve the information from an I/O device, for instance the hard disk, which takes orders of magnitude more time to access than even main memory.

Figure 2.1: Illustration of a computer according to the von Neumann architecture.

2.2.2 Multiprocessor systems

During the last decades, each generation of CPUs grew smaller, faster, and computationally more powerful. However, each generation also dissipated more heat and consumed more power, which ultimately forced the CPU manufacturers to shift from single-core to multi-core processor architectures [GK06, Low06]. Before the multi-core processor architecture, which integrates several CPUs on a single chip (see Figure 2.2), there was no terminological distinction between CPU and processor. Generally, in the literature, the term processor was used to refer to a single-core system, as opposed to the term multiprocessor, which was used to refer to systems with two or more CPUs. In this dissertation, we use the term multiprocessor whenever we refer to more than one CPU (whether or not these CPUs are on the same chip) and processor whenever we refer to only one CPU. The term uniprocessor is used whenever we refer to a system composed of only one CPU.

Note that multiprocessor architectures are not a new concept, and there is no single common architecture for multiprocessor systems. A multiprocessor can be classified according to the types of processors that compose it as identical, heterogeneous, or uniform: identical, if all processors have the same features (clock speed, architecture, cache size, and so on); heterogeneous, if they have different capabilities and different speeds; finally, in uniform multiprocessors, one processor may be x times faster than another one, meaning that it executes all possible tasks x times faster.

According to the memory model, multiprocessor systems are classified as shared or distributed memory. In a shared memory model (see Figure 2.2), all processors share the main memory. One common implementation of the shared memory model is the multi-core architecture, which is now the direction that manufacturers are focusing on [SMD+10]. One important performance concern in a multi-core architecture is the communication among the different components: CPUs, caches, memories, and I/O devices. Initially, such communication was done through a shared bus (see Figure 2.2). However, due to the high latency and contention of the shared bus, processor manufacturers are instead developing advanced communication mechanisms. Examples of those mechanisms are AMD HyperTransport, Intel QuickPath, and Tilera iMesh, just to mention a few. Independently of the mechanism used to connect the main memory to the processors, in the shared memory model, access times to that memory are uniform from any processor. This is known as Uniform Memory Access (UMA).

Figure 2.2: Illustration of a multiprocessor architecture according to the shared memory model.

In the distributed memory model (see Figure 2.3), each processor has its private memory, and when remote data is required, the processor has to communicate with the respective remote processors through an interconnection network. This architecture benefits from its modular organization with multiple independent processors; that is, only the interconnection network is shared among processors. Because this makes adding new processors very easy, this architecture is straightforward to scale. The major drawback of the distributed model is that inter-processor communication is more difficult. Whenever a processor requires data from another processor's memory, it must exchange messages with that processor. This introduces two sources of overhead: (i) it takes time to construct and send a message from one processor to another; and (ii) the receiving processor must be interrupted in order to deal with messages from other processors.

There is another variant of the distributed memory model, called distributed shared memory. In this architecture, data sharing is implicit; that is, from the programmer's perspective, the distributed memory appears like a real shared memory model, because there is no need to invoke send and receive primitives. This architecture logically implements a shared memory model although the memory is physically distributed. The memory access times in these systems depend on the physical location of the processors and are not uniform. Due to this characteristic, these systems are termed Non-Uniform Memory Access (NUMA) systems.

Figure 2.3: Illustration of a multiprocessor architecture according to the distributed memory model.

Note that, from the scheduling point of view, the main difference between the shared and distributed memory models concerns the costs associated with task migration, the process of transferring a ready (or runnable) task from one processor to another. In a distributed memory model, task migration requires the transfer of the task context and of the memory contents that describe the state of the migrating task. Thus, task migration comes at the cost of a performance penalty, and frequent task migration can increase the interconnection network traffic. However, if a task is accessing a huge amount of data stored in a remote processor's memory, it could be more efficient, in terms of performance, to move such a task to the remote processor instead of transferring the data to the task's processor. In a shared memory model, task migration instead requires only the transfer of the task context.

2.2.3 Sources of unpredictability

Designing and implementing a system that provides real-time timing guarantees requires appropriate management of all sources of unpredictability in the system. At the hardware level, there are several sources of unpredictability, such as ILP (e.g., speculative execution and branch prediction), pipelines, bus access contention, and caches. All these hardware features make the execution path of tasks non-deterministic; thus, determining the execution time of a task becomes very difficult and is, in fact, a whole area of research [WEE+08].

Caches are a very useful hardware feature used to increase the performance of the processor by reducing the amount of time the processor needs to wait for data. From a real-time perspective, though, caches are a source of unpredictability, in the sense that the execution time of a task varies depending on the data currently stored in the cache. The existence of distributed caches (such as when some cache levels are private to each processor and others to the chip) causes what is known as the cache coherence problem; that is, the data stored in one processor's cache might not be the most up-to-date version. Figure 2.4 illustrates, in a simplistic way, the cache coherence problem (this example was taken from [HP06]). Let us assume that, at time t0, the caches of CPUs A and B are empty and the content of memory location X is 1. At time t1, CPU A reads the content of memory location X, and its cache stores that value. Next, at time t2, CPU B also reads the content of memory location X, and its cache also stores that value. At time t2, according to the informal definition presented by [HP06] (“Informally, we could say that a memory system is coherent if any read of a data item returns the most recently written value of that data item.”), the memory system is coherent. But, at time t3, CPU A stores 0 into memory location X. This operation updates the value of X in the cache of CPU A and also in main memory, but not in the cache of CPU B. If CPU B reads the value of X, it will receive 1. This problem can be addressed using two approaches: snooping- and directory-based protocols [HP06]. Cache coherence protocols are intended to manage these issues in order to keep the memory system coherent.

Time  Event                   Cache contents  Cache contents  Memory contents
                              for CPU A       for CPU B       for location X
t0                                                            1
t1    CPU A reads X           1                               1
t2    CPU B reads X           1               1               1
t3    CPU A stores 0 into X   0               1               0

Figure 2.4: The cache coherence problem for a single memory location (X), read and written by two processors (A and B).

Usually, real-time system designers account for the worst-case execution time of a task by taking into account the delay incurred by fetching data from main memory; that is, by assuming that the respective information is not inside the caches. However, this is not enough to solve the unpredictability caused by caches. For instance, cache misses due to task preemptions are not considered.

Interrupts are another source of hardware-related unpredictability that is very relevant in the context of this research. Naturally, processors have to interact with I/O devices (e.g., hard disk drives, graphics or network cards), which are usually much slower than the processor. Hence, if a processor issues a request to one device, waiting for the device's response would imply that the processor stays idle in the meantime. The usual approach to deal with this is the following: the processor issues the request and continues to perform other work; when the device completes the request, the processor handles the device's response. However, given that it is typically not trivial to know beforehand how much time the device takes to handle the processor's request, it is hard to know when the device will have completed the request. One approach to this problem is polling: periodically, the processor checks whether the device has issued some response. This incurs overhead, because the polling occurs repeatedly at regular intervals, which might be much smaller than the response time of the device. Another, usually preferred, approach is based on interrupts. The processor is linked to the devices by interrupt lines, and whenever a device needs to notify a processor, it signals the respective interrupt line. The processor then acknowledges the interrupt issuer (in order to free the interrupt line), saves the context of the currently executing task, loads the information about the Interrupt Service Routine (ISR) associated with that interrupt, and executes that ISR. When the ISR completes, the previously saved context of the interrupted task is restored, and its execution is resumed from the point of the interruption. Note that hardware devices are not the only source of interrupts: software modules are also able to issue interrupts, which are managed in a similar way.

Figure 2.5 shows a simplified illustration of Intel's approach [Int13] to handling interrupts on multi-core systems, designated the Advanced Programmable Interrupt Controller (APIC). It is composed of Local APICs (LAPICs) and an I/O APIC interconnected by a bus (which, depending on the processor model, can be shared or dedicated). Each processor is equipped with a LAPIC. A LAPIC is able to generate and accept interrupts, and also contains a timer. A system has at least one I/O APIC, which manages the external interrupt requests. Whenever a device issues an interrupt request, the I/O APIC receives that request and delivers the signal to the selected LAPIC via the bus; the LAPIC then issues an interrupt to its processor. Another functionality is the possibility of sending Inter-Processor Interrupts (IPIs): when a processor needs to signal another processor or processors, it issues a signal to the target LAPIC(s) via the bus.
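To make the ISR mechanism concrete, the following is a minimal sketch of how a device driver registers an interrupt handler in the Linux kernel using request_irq; the IRQ number (42), the function names, and the device name are illustrative placeholders, not values taken from this work.

#include <linux/interrupt.h>

/* The ISR: invoked by the kernel whenever the interrupt line fires. */
static irqreturn_t my_isr(int irq, void *dev_id)
{
    /* Acknowledge the device and do the minimum work needed here. */
    return IRQ_HANDLED;
}

static int my_device_init(void)
{
    /* 42 and "my_device" are placeholder values for illustration. */
    return request_irq(42, my_isr, 0, "my_device", NULL);
}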

Figure 2.5: Illustration of the use of an Advanced Programmable Interrupt Controller (APIC) in multiprocessor systems.

Typically, the system designer does not know the frequency of all interrupts. Some interrupts occur at a regular rate, but others occur in an unpredictable fashion. Intuitively, interrupts contribute to the unpredictability of the system. Given the complexity of obtaining information about interrupts, most schedulability tests simply do not consider the cost of interrupts.

2.3 Software in multiprocessor systems

“Without its software, a computer is basically a useless lump of metal”. This is the first sentence of the well-known reference book “Operating Systems Design and Implementation” [TW08], which effectively conveys that a computational system (a computer) is composed of both hardware and software. It is common to classify the software of a computer into two broad categories: application software and system software. The terms user-space and kernel-space are also used to distinguish the running context of software in a computational system: application software runs in user-space, while system software runs in kernel-space. Here, we focus on system software, and the most important piece of such software is the so-called operating system. The operating system is responsible for managing the hardware as well as all other software; it controls and coordinates the use of the hardware among application and system software. For security purposes, it can also provide some level of protection for executing application software. Given its role, the operating system significantly affects the overall performance of the system. Hence, some operating systems were designed for specific domains, such as Real-Time Operating Systems (RTOSs), while others are intended to be General Purpose Operating Systems (GPOSs).

One of the best-known and most successful GPOSs is Linux [Fou13], whose system architecture is illustrated in Figure 2.6. The term Linux typically refers to a system composed of the Linux kernel and a set of user-space tools and libraries. Every single day, 850 thousand mobile phones and 700 thousand televisions running Linux are activated, and nine out of ten of the world's supercomputers run Linux [Fou13]. In the following subsections, we review the main features of the Linux kernel considered fundamental in the context of this work.

The main goal of this work is to provide better real-time support for multiprocessor systems, and for that purpose, the underlying operating system must be predictable, which implies that its latencies must be predictable or at least time-bounded. Due to its success and open-source nature, several relevant efforts towards enabling real-time computing using the Linux kernel have been undertaken [RTA12, Xen12, LR12, Kat12, SCH12, PR12, AQu13], and later in this section we describe these efforts.

One effort to introduce better real-time support in the Linux kernel, the PREEMPT-RT patch, aims at increasing the preemptibility of the Linux kernel and is an important advance in this direction [PR12]. This effort is of special relevance for the research described in this dissertation, as a significant effort will be put into modifying the PREEMPT-RT-patched Linux kernel to support real-time slot-based task-splitting scheduling algorithms. Taking the PREEMPT-RT patch as the starting base for our work allows us to leverage the efforts of a large community of kernel developers and produce a system with better real-time properties than the mainline Linux kernel.

Another type of effort, also described in this section, has aimed to introduce to the Linux kernel support for several real-time scheduling algorithms (especially for multiprocessor systems) and some resource sharing protocols suited for real-time systems [LR12, Kat12, SCH12, AQu13]. There exist several of these efforts, most of them from academia and used as testbed frameworks or proofs-of-concept for theoretical work.
We distinguish a third class of approaches to introducing real-time properties into the Linux kernel [Yod99, RTA12, Xen12], which is also described in this section. These are the micro-kernel (or co-kernel) approaches, which use the Linux kernel for executing non-real-time tasks and other management operations, and use a real-time co-kernel for real-time purposes.

Figure 2.6: Linux-based computer system architecture.

2.3.1 Linux kernel overview

Linux is an open-source project whose development process follows a collaborative approach, with thousands of developers worldwide constituting the largest collaborative development project in the history of computing. Due to its wide adoption, Linux is well positioned to play an important role in enabling real-time computing on modern multiprocessor systems. Since we are interested in leveraging Linux to support real-time slot-based task-splitting scheduling algorithms, in this section we overview the features of the Linux kernel needed to implement those scheduling algorithms. More specifically, we will see how time is kept by the Linux kernel and how the Linux kernel allows the implementation of a new task scheduler.

2.3.1.1 Clock and timers

Time management is an important issue in the Linux operating system [BC05, Mau08], as well as in other operating systems. Many kernel activities are time-driven, such as balancing the processors' ready-queues (referred to as run-queues in the Linux documentation), process resource usage accounting, profiling, and so on. In the context of this work, accurate and precise timing is essential to implement the slot-based task-splitting scheduling algorithms.

The Linux kernel time management system is based on hardware devices that have oscillators of a known fixed frequency, counters, and comparators. Simpler devices increment their counters at a fixed frequency, which is in some cases configurable. Other devices decrement their counters (also at a fixed frequency) and are able to issue an interrupt when the counter reaches zero. More sophisticated devices are provided with comparison logic to compare the device counter against a specific value, and are also able to interrupt the processor when those values match. Some examples of such devices are: (i) the Programmable Interval Timer (PIT), a counter that is able to generate an interrupt when it reaches the programmed count; (ii) the High Precision Event Timer (HPET), which consists of a counter and a set of comparators; each comparator has a match register and can generate an interrupt when the associated match register value matches the main counter value; the comparators can be put into one-shot mode or periodic mode: in one-shot mode, the comparator generates an interrupt once, when the main counter reaches the value stored in the comparator's register, whereas in periodic mode interrupts are generated at specified intervals; (iii) the Time Stamp Counter (TSC), a processor register that counts the number of cycles since reset; and, finally, (iv) the LAPIC, which is part of the processor chip. As mentioned before, the LAPIC manages all interrupts received by the respective processor, either from external devices or internally generated by software. For that purpose, it is provided with mechanisms for queueing, nesting, and masking interrupts. Furthermore, it is able to send interrupts to external devices as well as to generate interrupts for the respective processor.
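As a concrete illustration of the TSC, the snippet below is a minimal sketch of reading it on an x86 processor through the rdtsc instruction, assuming GCC-style inline assembly; it is illustrative and not code from the implementation described in this dissertation.

/* Read the Time Stamp Counter (TSC) on x86: rdtsc places the low and
 * high 32 bits of the cycle counter in the EAX and EDX registers. */
static inline unsigned long long read_tsc(void)
{
    unsigned int lo, hi;

    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long)hi << 32) | lo;
}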

As one would expect, the Linux kernel is able to interact with all of these devices. In order to obtain better time accuracy, the Linux kernel gathers timing information from these devices, associates a quality metric with each device, and normalizes the values to the same time unit.

In multiprocessor systems, time management can be even more complex. In a multiprocessor, the different processors can have different notions of time, and this is an important problem for the scheduling algorithms under consideration in this work, as they assume a common notion of the time slots across processors. The assumption of this work is that this is managed at the hardware level. This is a reasonable assumption as, for example, recent multiprocessor chips from Intel implement what is known as an invariant TSC, which guarantees that the TSC is synchronized across all cores and runs at a constant rate for all processors in a single package or mainboard [Int13].

Independently of the way that time is managed, the Linux kernel employs a periodic timer interrupt, called the tick, to take control of the system; the tick can be viewed as the kernel's heartbeat. The frequency of this periodic timer interrupt (the tick rate) is defined by a kernel compilation option called CONFIG_HZ. In the kernel code, this macro is referred to simply as HZ, an acronym for hertz. Depending on the value of HZ, the resolution of the tick (the smallest time difference between consecutive ticks) is finer or coarser; for instance, with HZ set to 1000 the tick period is one millisecond, while with HZ set to 100 it is ten milliseconds. Unfortunately, the standard resolutions supported by the periodic timer are too coarse-grained for the scheduling algorithms under consideration in this work. From the panoply of options provided by the Linux kernel, the most suitable one for the implementation of the desired scheduling algorithms is the High-Resolution Timer (HRT) framework [GN06]. On one hand, HRT provides nanosecond resolution; on the other hand, the timers are set per processor.

Next, we describe how to implement an HRT. Conceptually, the use of a kernel timer encompasses three phases: (i) the initial phase sets up the timer, by specifying a function (generally called the timer callback) to execute at timer expiration; (ii) the second phase is the activation of the timer, when its expiration time is specified; and (iii) the third phase occurs at timer expiration, when the specified function executes. Listing 2.1 shows a snippet of the code required to implement an HRT in the Linux kernel. First, we have to define, per processor, a variable of type struct hrtimer and initialize it. For that purpose, we use the DEFINE_PER_CPU Linux kernel macro, which defines, in this case, a variable of type struct hrtimer, called my_hrtimer, on every processor; variables defined in this way are actually an array of variables. Then, in order to initialize my_hrtimer, it is required to invoke, on each processor, the my_hrtimer_init function. Note that it is not possible to execute this function on one processor to initialize a timer on another processor. The Linux kernel macro smp_processor_id returns the current processor number, between 0 and NR_CPUS (a Linux kernel macro that defines the maximum number of processors). To get a particular processor's variable, the per_cpu macro may be used. Then, the hrtimer_init function initializes the timer, my_hrtimer, binding it to a given clock (in this case, the monotonic clock); the last parameter specifies whether absolute or relative time values (relative to the current time) are used (in this case, absolute). In the second phase, the timer is started on a per-processor basis; that is, the my_hrtimer_start function has to be executed on each processor, and the timer expiration has to be specified. Finally, when the timer fires, the my_hrtimer_callback function is invoked. Note that the argument of the my_hrtimer_callback function is the timer itself. Two values can be returned by the my_hrtimer_callback function: HRTIMER_RESTART and HRTIMER_NORESTART. If it returns HRTIMER_RESTART, the timer is automatically restarted with the specified expiration at next_expire; if it returns HRTIMER_NORESTART, the timer is not restarted.

Note that, if the purpose of the timer is to trigger the invocation of the scheduler, then the timer callback function must mark the currently executing task for preemption (invoking the Linux native resched_task function). Hence, after the execution of the timer callback, the Linux scheduler is invoked to make scheduling decisions. Scheduling decisions are made according to some specific scheduling policy. Later in this section, the Linux scheduling framework will be described; that is, how the scheduler is implemented in Linux and, consequently, how scheduling decisions are made.
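Returning to the per-processor nature of the first two phases, the following sketch shows one way the functions of Listing 2.1 could be driven on every processor using the kernel's on_each_cpu helper; the wrapper functions are illustrative assumptions, not part of Listing 2.1.

/* Hedged sketch: running phases 1 and 2 of Listing 2.1 on all CPUs.
 * on_each_cpu() executes the given function on every online processor. */
static void init_wrapper(void *unused)
{
    my_hrtimer_init();              /* phase 1, on the local CPU */
}

static void start_wrapper(void *info)
{
    ktime_t *expires = info;

    my_hrtimer_start(*expires);     /* phase 2, on the local CPU */
}

static void setup_all_timers(ktime_t expires)
{
    on_each_cpu(init_wrapper, NULL, 1);
    on_each_cpu(start_wrapper, &expires, 1);
}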

2.3.1.2 Tasks and run-queues in the Linux kernel

Before describing the Linux scheduling framework, let us devote some attention to the two data structures of the Linux kernel that are fundamental in the context of scheduling: struct task_struct and struct rq. A Linux process is an instance of a program in execution [BC05]. From the Linux kernel point of view, a process is an instance of the struct task_struct structure (see Listing 2.2). This structure encompasses the run-state of the process (state), the process priority (prio), the scheduling class (sched_class), and the scheduling policy (policy), among other fields.

///////////////////////////////////////////////
// Initial phase
// variable declaration
DEFINE_PER_CPU(struct hrtimer, my_hrtimer);

void my_hrtimer_init(void)
{
    int cpu = smp_processor_id();
    struct hrtimer *timer = &per_cpu(my_hrtimer, cpu);

    hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
    timer->function = my_hrtimer_callback;
}

///////////////////////////////////////////////
// Second phase
void my_hrtimer_start(ktime_t expires)
{
    int cpu = smp_processor_id();
    struct hrtimer *timer = &per_cpu(my_hrtimer, cpu);

    hrtimer_start(timer, expires, HRTIMER_MODE_ABS);
}

///////////////////////////////////////////////
// Third phase
static enum hrtimer_restart my_hrtimer_callback(struct hrtimer *timer)
{
    ...
    if (cond) {
        hrtimer_set_expires(timer, next_expire);
        return HRTIMER_RESTART;
    }
    return HRTIMER_NORESTART;
}

Listing 2.1: The required steps to implement a high-resolution timer in the Linux kernel.

Note that, in the context of this dissertation, the meaning of process and task is the same.

struct task_struct {
    volatile long state;    /* -1 unrunnable, 0 runnable, >0 stopped */
    ...
    int prio;
    ...
    const struct sched_class *sched_class;
    ...
    unsigned int policy;
    ...
};

Listing 2.2: The struct task_struct data structure.

At boot time, the Linux kernel creates a ready-queue (or run-queue) for each available processor. Each processor ready-queue is an instance of the struct rq data structure (see Listing 2.3). Some fields of that structure are for management purposes, while others store the ready tasks currently assigned to the respective processor. The ready tasks are stored in sub-ready-queues according to their scheduling class, cfs and rt (these fields are discussed in the next subsection). The management information comprises the number of ready tasks (nr_running), a pointer to the currently running task (curr), a pointer to the idle task (idle), and a locking variable (lock).

struct rq {
    spinlock_t lock;
    ...
    unsigned long nr_running;
    ...
    struct cfs_rq cfs;
    struct rt_rq rt;
    ...
    struct task_struct *curr, *idle;
    ...
};

Listing 2.3: The struct rq data structure.

The lock field, of type spinlock_t (as its name implies, if the lock is not immediately acquired, the acquirer spins until it gets the lock), can be used to lock the ready-queue while updating its state (for instance, when a task is enqueued into or dequeued from the respective ready-queue). This ready-queue lock plays an important role in the scheduling framework: it is used to serialize access to the ready-queue of each processor. Whenever this lock is acquired by one processor, only that processor is allowed to manage that ready-queue; in other words, only the processor holding the lock can change the state of that ready-queue. In a multiprocessor system, whenever a task migrates from one processor to another, the source processor needs to lock both ready-queues in order to atomically dequeue the task from the ready-queue of the source processor and enqueue it in the ready-queue of the target processor. When multiple ready-queue locks are required, it is important to always obtain those locks in the same order, to avoid deadlocks. In the source code of the Linux kernel, the following comment can be found: “Locking rule: those places that want to lock multiple runqueues (such as the load balancing or the thread migration code), lock acquire operations must be ordered by ascending runqueue”. Therefore, Linux requires that ready-queue locks are always acquired in order of increasing memory address. Whenever a processor holds a ready-queue lock, it may only attempt to lock a ready-queue with a higher address. When a processor holding a ready-queue lock needs to acquire a second ready-queue lock with a lower address, it must first release the lock it already holds and then acquire both locks in ascending order. Due to this rule, the state of the ready-queues may change between lock acquisitions; considerable complexity in the Linux scheduler is devoted to detecting and reacting to such concurrent state changes.
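The following sketch illustrates this ascending-address locking rule; it is a simplified version, for illustration only, of the double ready-queue locking logic found in the Linux scheduler (the real code additionally deals with interrupt state and the concurrent state changes mentioned above).

/* Acquire the locks of two ready-queues in ascending address order. */
static void double_rq_lock(struct rq *rq1, struct rq *rq2)
{
    if (rq1 == rq2) {
        /* Same ready-queue: a single lock suffices. */
        spin_lock(&rq1->lock);
    } else if (rq1 < rq2) {
        /* Lower memory address first, as the locking rule demands. */
        spin_lock(&rq1->lock);
        spin_lock(&rq2->lock);
    } else {
        spin_lock(&rq2->lock);
        spin_lock(&rq1->lock);
    }
}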

2.3.1.3 Linux modular scheduling framework

The introduction of scheduling classes (in the Linux kernel version 2.6.23) made the core scheduler quite extensible. The kernel scheduling framework consists of a scheduler (implemented by means of the schedule function) and various scheduling classes (see Figure 2.7). The scheduler (or dispatcher) is the part of the kernel responsible, at run-time, for allocating processors to tasks, and the scheduling classes are responsible for selecting those tasks. The scheduling classes encapsulate scheduling policies. The scheduling classes are hierarchically organized by priority, and the scheduler inquires each scheduling class, in decreasing priority order, for a ready task. Currently, Linux has three main scheduling classes: Real-Time (RT), Completely Fair Scheduling (CFS), and Idle. The scheduler first inquires the RT scheduling class for a ready task. If the RT scheduling class does not have any ready task, it inquires the CFS scheduling class. If CFS does not have any ready task either, the Idle scheduling class is reached; this class is used for the idle task: every processor has an idle task in its ready-queue that is executed whenever there is no other ready task.

Figure 2.7: Scheduling framework architecture.

Any scheduling class is implemented as an instance of the struct sched_class data structure and, according to the rules of the Linux modular scheduling framework, each scheduling class must implement the set of functions specified by this data structure. Listing 2.4 shows the contents of rt_sched_class, which implements the RT scheduling class (described in the next subsection), that are most important in the context of this dissertation. The first field of the rt_sched_class structure, next, is a pointer to the next sched_class in the hierarchy; in this case, it stores the address of the fair_sched_class variable, which implements the CFS scheduling class. The other fields are hooks to functions for handling specific events.

Let us first explain how a task is associated with a scheduling class, and then continue with the description of the sched_class data structure. While it is possible to change the scheduling class of a task during its life cycle, at any time instant it belongs to only one scheduling class. A task is associated with one scheduling class through the sched_class pointer, a struct task_struct field (see Listing 2.2). This pointer contains the address of the sched_class variable that the task is currently associated with. Thus, whenever a task becomes ready to be executed, the function pointed to by sched_class->enqueue_task is invoked; as a default behaviour, it inserts the newly ready task into the respective processor ready-queue. Then, the function associated with sched_class->check_preempt_curr is invoked to check whether the currently executing task must be preempted (according to the scheduling policy of the newly ready task) and, if that is the case, the former is marked for preemption. This operation (marking the currently executing task for preemption) triggers the invocation of the scheduler.

Whenever a task is no longer ready, the function pointed to by sched_class->dequeue_task is invoked. As a default behaviour, it removes the task from the respective processor ready-queue. The scheduler can be periodically invoked at every expiration of the timer tick through the execution of the respective timer callback function, which invokes the scheduler_tick function. In turn, the scheduler_tick function invokes the sched_class->task_tick function of the scheduling class of the currently executing task. This gives the opportunity to implement time-driven scheduling policies.

static const struct sched_class rt_sched_class = {
    .next               = &fair_sched_class,
    .enqueue_task       = enqueue_task_rt,
    .dequeue_task       = dequeue_task_rt,
    ...
    .check_preempt_curr = check_preempt_curr_rt,
    .pick_next_task     = pick_next_task_rt,
    ...
    .pre_schedule       = pre_schedule_rt,
    .post_schedule      = post_schedule_rt,
    ...
    .task_tick          = task_tick_rt,
    ...
};

Listing 2.4: The rt_sched_class definition.

The scheduler is implemented through the schedule function. This function is, basically, executed whenever the currently executing task is marked for preemption, or whenever a task completes or voluntarily yields the processor (such as when a task waits for a signal to occur or wants to sleep). For instance, the completion of a task is done by invoking the exit system call (even if this is not explicit in the programmer's code); the exit system call performs a set of cleanup operations to free the task's resources and then invokes the schedule function. Listing 2.5 presents a code snippet of the schedule function. The first step therein is to disable preemption. Next, it retrieves the ready-queue of the current processor (the processor executing this code): it obtains, via the smp_processor_id macro, the identifier of the current processor and then, via the cpu_rq macro, it retrieves the ready-queue, rq, of that processor. The currently executing task is pointed to by the rq->curr pointer. In the context of the schedule function, whose purpose is to get the next task to be executed, the current task is marked as the previous executing task, via the prev = rq->curr assignment. The next step is to lock the current processor's ready-queue. After these steps, the pre_schedule function is executed, which invokes the sched_class->pre_schedule function of the task pointed to by prev. This is the hook for the scheduling class to execute any logic before the scheduler calls pick_next_task; currently, only the RT scheduling class implements this function (see Subsection 2.3.1.4 below). The next task selected for execution is retrieved by the pick_next_task function. This function iterates over the linked list of scheduling classes, starting with the highest-priority scheduling class; for each scheduling class, it invokes the function associated with its pick_next_task hook, and the iteration stops as soon as that function returns a non-null pointer. Note that, in the worst case, the idle task is selected. Next, the scheduler can switch the previous task (pointed to by prev) with the next task (pointed to by next) by invoking the context_switch function; note that this function unlocks the ready-queue. When the switching of tasks is done, the scheduler calls the post_schedule function, which invokes the function pointed to by sched_class->post_schedule of the next task (now pointed to by rq->curr); this function locks the ready-queue. Again, only the RT scheduling class (described next) implements this function; it is the hook that allows the RT scheduling class to push real-time tasks off the current processor. As a final step, preemption is enabled again.

static void __sched __schedule(void)
{
    ...
    preempt_disable();
    cpu = smp_processor_id();
    rq = cpu_rq(cpu);
    prev = rq->curr;
    ...
    raw_spin_lock_irq(&rq->lock);
    ...
    pre_schedule(rq, prev);
    ...
    next = pick_next_task(rq);
    ...
    rq->curr = next;
    ...
    context_switch(rq, prev, next); /* unlocks the rq */
    ...
    post_schedule(rq);
    ...
    preempt_enable_no_resched();
    ...
}

Listing 2.5: The schedule function.
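To make the class-iteration logic concrete, the sketch below shows a simplified version of pick_next_task; it is illustrative only (the actual kernel code uses the for_each_class macro and an optimization for the common CFS-only case), and sched_class_highest is assumed here to point to the head of the priority-ordered class list.

static struct task_struct *pick_next_task(struct rq *rq)
{
    const struct sched_class *class;
    struct task_struct *p;

    /* Walk the classes from highest (RT) to lowest (Idle) priority. */
    for (class = sched_class_highest; class; class = class->next) {
        p = class->pick_next_task(rq);
        if (p)
            return p;   /* the first class with a ready task wins */
    }
    /* Never reached: the Idle class always returns the idle task. */
    BUG();
}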

2.3.1.4 The real-time scheduling class of Linux

As mentioned before, each processor holds a ready-queue of all ready tasks currently assigned to it. Actually, the ready-queue of each processor has one sub-ready-queue for each scheduling class, as depicted in Listing 2.3. The sub-ready-queue of the CFS scheduling class is an instance of the struct cfs_rq data structure, while the sub-ready-queue of the RT scheduling class is an instance of the struct rt_rq data structure. This allows each scheduling class to organize its ready tasks per processor in an appropriate manner. As depicted in Figure 2.8, tasks in the RT scheduling class are organized by priority level. Within each priority level, the RT scheduling class implements two scheduling policies: SCHED_FIFO and SCHED_RR. SCHED_FIFO is based on the First-In-First-Out (FIFO) heuristic: whenever a SCHED_FIFO task is executing, it continues until it is preempted (by a higher-priority task) or blocks (e.g., on an I/O operation). SCHED_RR implements the Round-Robin (RR) heuristic: a SCHED_RR task executes (if it is not preempted or blocked) until it exhausts its time-slice. At the same priority level, the RT scheduling class favours SCHED_FIFO tasks over SCHED_RR tasks.
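From user-space, a task is placed under the RT scheduling class through the standard POSIX interface; the following minimal sketch (the priority value 50 is illustrative) assigns the SCHED_FIFO policy to the calling task.

#include <sched.h>
#include <stdio.h>

int main(void)
{
    /* Request SCHED_FIFO with real-time priority 50 (range 1..99 on Linux). */
    struct sched_param param = { .sched_priority = 50 };

    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
        perror("sched_setscheduler");   /* typically requires root privileges */
        return 1;
    }
    /* From here on, this task is handled by the RT scheduling class. */
    return 0;
}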

Figure 2.8: RT scheduling class ready-queue.

Each processor, at any time instant, executes one task stored in its own ready-queue. This may result in unbalanced workloads across processors. In order to balance the workload across processors, the RT scheduling class adopts an active push/pull strategy as follows: whenever the scheduler inquires the RT scheduling class (through the pick_next_task function), it first tries to pull the highest-priority non-executing tasks from the other processors' ready-queues, using the pre_schedule function; after selecting the next running task, it checks, using the post_schedule function, whether it can push the (freshly) preempted task to the ready-queue of another processor that is executing a task with lower priority than the preempted task. Observe that moving tasks between two ready-queues requires locking both ready-queues, which might introduce considerable overheads.

2.3.2 The Linux PREEMPT-RT patch

There exist several important sources of unpredictability in the mainline Linux kernel, namely: (i) interrupts are generated by the hardware, often in an unpredictable manner, and when an interrupt arrives, the processor switches to handle it; (ii) multiple kernel threads running on different processors in parallel can simultaneously operate on shared kernel data structures, requiring serialization of the access to such data; (iii) the disabling and enabling of preemption used in many parts of the kernel code can postpone some scheduling decisions. This unpredictability in the mainline Linux kernel can cause arbitrary delays to the real-time tasks running on the system. To address these problems, a kernel developer community [Mol04a, McK05, RH07] has been working on the PREEMPT-RT patch [PR12]. This patch (which aims to derive a fully preemptible kernel) adds some real-time capabilities to the Linux kernel. The following subsections present the different sources of unpredictability and how the PREEMPT-RT patch deals with them: priority inversion, large non-preemptive sections, locking mechanisms, and interrupt management.

2.3.2.1 Priority inversion and Priority inheritance

Let us illustrate the classical example of priority inversion (see inset (a) of Figure 2.9). Consider a task set composed of three tasks, τ1, τ2, and τ3, whose priorities are given by their indexes:

a lower index implies a higher priority; thus, τ1 is the highest-priority task, while τ3 is the lowest-priority task. Tasks τ1 and τ3 share a resource (graphically illustrated by a black rectangle). Task τ3 becomes ready at time instant t0 and immediately starts executing (because it is the only ready task). At time t1, task τ3 locks the shared resource. At time t2, task τ1 becomes ready, preempts task τ3, and executes until t4, when it blocks on the shared resource, which is held by the lowest-priority task, τ3. Meanwhile, at time t3, task τ2 becomes ready. Since task τ1 is blocked, the current highest-priority ready task is τ2. Thus, task τ2 executes until its completion at time t5.

At time t5, the only task ready to execute is task τ3; it then executes until it frees the resource, which occurs at time t6. At this time, task τ1 becomes ready again and preempts task τ3. Task τ1 executes until its completion at time t7, and task τ3 then resumes its execution until its completion at time t8. As can be seen from inset (a) of Figure 2.9, the highest-priority task, τ1, is prevented from executing by the lowest-priority task, τ3. This can be harmful if the execution time of task τ2 is unbounded, as it would indefinitely extend this blocking time. This is what is known as unbounded priority inversion.

One approach to avoiding priority inversion is the technique of priority inheritance. The PREEMPT-RT patch implements Priority Inheritance (PI). The underlying idea of PI is that a task holding the lock on a shared resource inherits the maximum priority of all tasks waiting for that resource, but only while it executes with the shared resource; afterwards, it returns to its original priority level. Thus, the highest-priority task can be blocked by a lower-priority task only during the intervals in which the latter executes holding the shared resource.

Inset (b) of Figure 2.9 illustrates this effect. Independently of whether the execution time of task τ2 is known or not, the highest-priority task, τ1, is blocked only while task τ3 executes with the shared resource. Therefore, with PI, the blocking time of the highest-priority task is bounded.
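As an aside, priority inheritance is also available to user-space applications on Linux: the sketch below shows how a POSIX mutex can be configured with the PTHREAD_PRIO_INHERIT protocol, so that the holder's priority is boosted as described above. This is illustrative code, not part of the PREEMPT-RT patch itself.

#include <pthread.h>

static pthread_mutex_t lock;

/* Initialize a mutex whose holder inherits the priority of its waiters. */
static void init_pi_mutex(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);
}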

Figure 2.9: Illustration of priority inversion (inset (a)) and priority inheritance (inset (b)).

2.3.2.2 Fully preemptible kernel

The PREEMPT-RT patch reduces the Linux kernel latencies by reducing its non-preemptible sections. A non-preemptible section is a series of operations that must be performed without any preemption. One of the most intrusive mechanisms was the Big Kernel Lock (BKL). The BKL is a system-wide locking mechanism used to protect non-preemptible sections from concurrent accesses by tasks running on separate processors. The BKL was implemented as a spin lock. Further, it was recursive, which means that the holder of the BKL was allowed to re-acquire it. Last and worst, the BKL holder could sleep. As expected, the BKL does not have many fans; according to [Mau08]: “Although the BKL is still present at more than 1,000 points in the kernel, it is an obsolete concept whose use is deprecated by kernel developers because it is a catastrophe in terms of performance and scalability.” The implementation of the BKL [Mol04b] was modified by replacing the spin lock with a mutex (more details in the next subsection), which allows the preemption of the BKL holder. Kernel preemptibility is also achieved by replacing most kernel spin locks with mutexes, which support PI, and by transforming all interrupt handlers into preemptible kernel tasks, scheduled by the RT scheduling class.

2.3.2.3 Locking primitives

The Linux kernel employs locking primitives to protect critical sections. A critical section is a code path that accesses a shared resource (a data structure or device) that must not be concurrently accessed by more than one task. In other words, it must be ensured that such a code path executes atomically; that is, its operations complete without interruption, as if the entire critical section were one indivisible instruction. Otherwise, if more than one task accesses a critical section at a given time, a race condition arises, so called because such tasks race to get there first. Intuitively, race conditions can be fatal in the literal sense of the word.

On uniprocessor systems, race conditions can be prevented by disabling preemption during the execution of critical sections: since tasks may then execute until the end of such critical sections without being preempted, at any given time only one task may access each shared resource. On multiprocessor systems, disabling preemption is not enough to protect a shared resource, because enabling and disabling preemption is performed on a per-processor basis. In other words, suppose that two tasks are executing on two different processors: disabling preemption on each respective processor does not prevent those tasks from accessing the same shared resource. Consequently, a system-wide locking mechanism is required.

The Linux kernel provides several locking primitives that can be used to avoid race conditions, such as spin locks, semaphores, and mutexes. The operation of a spin lock is as follows: whenever a task attempts to enter a critical section protected by a spin lock, it immediately acquires the lock if it is free; otherwise (as the name implies), it spins until the lock is free. A waiting task thus performs busy waiting, by means of a loop, to acquire the lock. Intuitively, this approach should only be employed when the expected waiting time is small. Spin locks are considered fast compared to sleeping locks, because they avoid context switch overheads (from switching the processor from one task to another). However, the use of spin locks to protect large sections (as unfortunately happens within the Linux kernel) and, especially, the use of nested spin locks can cause large and potentially unpredictable latencies.

The PREEMPT-RT patch converts most spin locks into sleeping locks called mutexes (a name that comes from mutual exclusion). Before describing mutexes, let us discuss the more general locking mechanism known as the semaphore. A semaphore is a locking mechanism that can control concurrent access to a shared resource. Semaphores can be either binary or counting. In the former case, they protect a shared resource from multiple accesses; that is, only one task can use such a resource at a time. In the latter case, multiple accesses are allowed; that is, many tasks can use the shared resource at a given time. Unlike what holds for spin locks, a task holding a semaphore (binary or counting) may sleep. Semaphores, allowing more than one task into a critical section, have no concept of an owner; indeed, one task may even release a semaphore acquired by another task. A semaphore is never linked or associated to a task. A mutex, on the other hand, always has a one-to-one relationship with its holder: a mutex may be owned by one, and only one, task at a time.
This is key to the PREEMPT-RT patch, as it is one of the requirements of the PI algorithm; that is, a lock may have one, and only one, owner. This ensures that a priority inheritance chain follows a single path and does not branch into multiple lock owners needing to inherit the priority of a waiting task.
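The contrast between the two kinds of primitives can be summarized in the minimal kernel-code sketch below; my_spinlock, my_mutex, shared_data, and the two functions are illustrative names. Note also that, under the PREEMPT-RT patch, most spinlock_t locks themselves become sleeping, PI-aware locks.

#include <linux/spinlock.h>
#include <linux/mutex.h>

static DEFINE_SPINLOCK(my_spinlock);    /* busy-waiting lock */
static DEFINE_MUTEX(my_mutex);          /* sleeping lock, with an owner */

static int shared_data;

static void short_critical_section(void)
{
    spin_lock(&my_spinlock);    /* waiters spin: keep the section short */
    shared_data++;
    spin_unlock(&my_spinlock);
}

static void long_critical_section(void)
{
    mutex_lock(&my_mutex);      /* waiters sleep; the owner may be boosted (PI) */
    shared_data++;
    mutex_unlock(&my_mutex);
}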

2.3.2.4 Threaded ISRs

In the mainline Linux kernel, interrupts are the highest-priority events. Whenever an interrupt fires, the processor stops the execution of the current task to execute the ISR associated with that interrupt. Suppose that the currently executing task is the highest-priority task in the system and that a low-priority task has requested hard disk I/O operations. What happens in this case is that the execution of the highest-priority task will be greatly affected by the low-priority task. Hard disk I/O operations are usually managed under Direct Memory Access (DMA), which allows data to be sent directly from the hard disk to main memory without processor intervention; however, the processor is notified whenever an operation ends. This kind of event arrives in an unpredictable manner and may therefore unpredictably prevent the execution of the highest-priority task, as well as of all other tasks. PREEMPT-RT creates a kernel thread (or task) for each interrupt line, and whenever a device fires a request, the respective kernel task is woken up; that is, it becomes ready to be executed. Thus, it is scheduled according to the RT scheduling policies: it has a priority level (50 by default), so it preempts lower-priority tasks and is preempted by higher-priority tasks. Despite the benefits of this approach, there are some cases where an ISR must not be converted into a kernel task; such ISRs have to be flagged with IRQ_NODELAY.

2.3.2.5 Scheduling policies

The RT scheduling class, as implemented in the PREEMPT-RT patch, supports the same scheduling policies as the mainline Linux kernel: SCHED_FIFO and SCHED_RR. While these scheduling policies are appropriate for uniprocessors, they are not adequate for multiprocessor systems, because (i) their performance is poor on multiprocessors (there exist task sets where a system with a load a little over 50% fails to meet deadlines) and (ii) they adopt an active push/pull approach for balancing tasks across processors, and such push/pull operations require locking multiple processor ready-queues, which causes an additional source of unpredictability. A scheduling algorithm suitable for multiprocessor systems would have to be supported by the RT scheduling class, but the PREEMPT-RT patch does not add any such algorithm. This is an important omission in the PREEMPT-RT patch, which is addressed in this work.

2.3.3 Real-time scheduling add-ons

In recent years, several works, mostly from academia, have provided the Linux kernel (some directly in the mainline and others using the PREEMPT-RT-patched Linux kernel) with support for algorithms suitable for real-time systems.

The LITMUSRT framework [LR12], whose name stands for LInux Testbed for MUltiprocessor Scheduling in Real-Time systems, focuses on real-time scheduling algorithms and synchronization protocols for multiprocessor systems, and its primary goal is to provide a useful experimental platform for applied real-time systems research. To this end, the mainline Linux kernel is modified and distributed as a patch file, with which the Linux kernel must be compiled. The first prototype of LITMUSRT was developed back in 2006, based on the Linux 2.6.9 kernel version [CLB+06], and since then many releases have been produced [BBC+07, BCA08, BA09c, Bra11]. The current version is based on the mainline Linux 3.0.0 kernel version. The project was initially focused on partitioned and global scheduling algorithms; more recently, it also incorporated clustered scheduling algorithms, and the current LITMUSRT version also supports semi-partitioned scheduling algorithms [BBA11]. The implementation approach followed by LITMUSRT is based on scheduler plugins that implement the respective scheduling algorithms. Architecturally, LITMUSRT is based on a generic scheduling class (Section 2.3.1.3 describes the Linux modular scheduling framework) called litmus_sched_class, positioned at the top of the Linux hierarchy of scheduling classes. This scheduling class does not implement any particular scheduling algorithm; it is a thin wrapper that delegates all scheduling decisions to the currently active scheduler plugin. The idea behind the scheduler plugins is the possibility of switching the scheduling algorithm at run-time.

The REal-time SCHeduler (RESCH) is a loadable suite for Linux [KY08a, KRI09, Kat12]. The primary objective of RESCH is to facilitate academic research on real-time systems for practical use and to make its results available in the real world. From the implementation point of view, it follows a different approach, as it does not require any kernel patch; consequently, there is no need to recompile the Linux kernel. The RESCH core is implemented as a character device driver. Note that developing a device driver module is similar to developing any user-space application, and in order to activate RESCH, it is only required to install the respective device driver module, which is also similar to installing a user-space application. As with the LITMUSRT framework, RESCH supports a variety of scheduling algorithms: partitioned, global, and semi-partitioned. Initially, those scheduling algorithms were implemented on top of the native real-time scheduling class of Linux, using the POSIX-compliant SCHED_FIFO scheduling policy (described in Section 2.3.1.4). Nowadays, RESCH is instead designed and implemented on top of SCHED_DEADLINE [SCH12].

Originally, SCHED_DEADLINE [FTCS09] supported only partitioned-EDF; nowadays, global- and clustered-EDF [LLFC11] are also supported. The SCHED_DEADLINE implementation is compliant with POSIX interfaces and follows the rules defined by the Linux modular scheduling framework (described in Section 2.3.1.3).
SCHED_DEADLINE is implemented as a scheduling class (called dl_sched_class) that is positioned at the top of the Linux scheduling class hierarchy and thus becomes the highest-priority scheduling class. Contrary to LITMUSRT, SCHED_DEADLINE does not use a shared global ready-queue to implement the global-EDF scheduler. Actually, in terms of implementation, there is no difference among the different categories of scheduling algorithms supported by SCHED_DEADLINE. It uses a ready-queue per processor (as the Linux native real-time scheduling class does), which is suitable for partitioning. Note that, under this approach, each processor can only execute tasks stored in its own ready-queue. Therefore, to support global and clustered scheduling, tasks are allowed to migrate among processors. This migration mechanism is, essentially, supported by pull/push operations (also used by the real-time scheduling class of mainline Linux; Section 2.3.1.4 describes such operations).

AQuoSA [AQu13], whose name stands for Adaptive Quality of Service Architecture, focuses on providing Quality of Service (QoS) management functionalities to the Linux kernel. For that purpose, it employs a resource reservation scheduler (which assures temporal protection in allocating the CPU to a set of tasks) and a feedback-based mechanism (which can be used to self-tune the scheduling parameters). By combining both, AQuoSA is capable of dynamically adapting the CPU allocation for QoS-aware tasks based on their run-time requirements. Originally, AQuoSA was limited to uniprocessor systems [PCML09], but in [CCFL09] it was extended to multiprocessor systems. Furthermore, it has been used to schedule virtual machines in the context of the IRMOS (Interactive Real-time Multimedia Applications on Service Oriented Infrastructures) European project [IRM13, CCK+10].

From the implementation point of view, AQuoSA is comprised of a middleware layer (positioned between kernel-space and user-space) and of a set of modules running in the Linux kernel. The resource reservation algorithm is implemented inside a Linux kernel module, while the management strategies can be implemented partly at the kernel level and partly at the middleware level. In order to interconnect the Linux scheduler and the AQuoSA kernel modules, a generic scheduler patch must be applied to the Linux kernel. This is a small patch, which extends the Linux scheduler functionality by intercepting some scheduling events. For each of these events, a function pointer is created and exported for use by the AQuoSA kernel modules. This approach requires applications to be coded against a specific API, which can be a problem for applications coded without that API, as happens with legacy applications. To address this issue, this adaptive scheduling framework was extended in [CAPC09] by combining tracing and a run-time identification mechanism (which extracts, from high-level observations, such important parameters as the execution rate) that identifies the correct amount of processing needed by each application.

2.3.4 Micro kernel-based approaches

A micro kernel-based approach is used in some Linux-based RTOSs [Yod99, RTA12, Xen12] to support the real-time system requirements. Under this approach, a software layer called micro kernel is created and placed between the Linux kernel and the hardware (see Figure 2.10). Thus, the micro kernel interfaces the hardware with the rest of the software. Therefore, it is in a privileged position to control the entire system, if it is provided with the required mechanisms for handling interrupts and scheduling tasks, for example.

Figure 2.10: Computer system architecture with the micro kernel layer (hardware at the bottom, the micro kernel above it, and the Linux kernel with real-time and non-real-time tasks on top).

The first implementation of such an approach was RTLinux [Yod99], which stands for Real-Time Linux. RTLinux handles the interrupts from peripherals and dispatches them to the Linux kernel when there are no ready real-time tasks. Whenever the Linux kernel disables interrupts, RTLinux queues the interrupts to be delivered after the Linux kernel has enabled interrupts once again. RTLinux implements real-time tasks as kernel modules: kernel modules are pieces of code which may be loaded into and unloaded from the kernel and do not require a kernel compilation. Hence, such tasks execute in kernel-space, like the Linux kernel. In order to guarantee the priority of the real-time tasks, RTLinux considers the Linux kernel as the lowest-priority task in the system, which may do its normal operations whenever there are no real-time tasks running. For scheduling real-time tasks, RTLinux implements its own scheduler as a module, fully preemptive and based on a fixed-priority scheme. However, given the modular nature of the scheduler, other scheduling policies can easily be added. The mainline Linux scheduler is not involved and, since RTLinux has overall control of the system, Linux cannot delay or preempt the execution of any real-time task. Implementing real-time tasks as kernel modules presents two drawbacks, though: first, it is very difficult to debug such tasks and, second, when executing in kernel-space, a software bug in a real-time task could be catastrophic for the entire system.

These drawbacks are addressed by a more recent RTOS, called RTAI (which stands for Real-Time Application Interface) [RTA12]. RTAI uses the same approach as RTLinux, since it also considers the Linux kernel as the lowest-priority task in the system, but it provides a complete Application Programming Interface (API) for developing real-time tasks, called LXRT (which stands for LinuX-RT). These tasks execute in user-space context. In essence, for each real-time task, LXRT creates a real-time agent that executes in kernel-space context. Whenever a real-time task starts executing, the real-time agent disables interrupts and preemptions, removes the task from the Linux ready-queue, and adds it to the RTAI scheduler queue. This way, such a task is protected from the other Linux processes.

More recently, ADEOS (which stands for Adaptive Domain Environment for Operating Systems) appeared. ADEOS consists of a resource virtualization layer placed on top of the hardware and below the kernel layer [Ade13]. It intends to provide the possibility of sharing the hardware among several operating systems, referred to as domains (see Figure 2.11), which are ordered by priority. Each domain can request ADEOS to be notified of some important events, such as interrupts, system call invocations, task switches, and task completions, just to mention a few. An event pipeline is formed according to the domains' priorities. Incoming events are pushed into the head of the pipeline and flow down to its tail. This way, the events are handled first by the highest-priority domain, then by the next one, and so on, until they reach the lowest-priority domain. The event propagation is governed by some rules. For instance, a domain can forward or stop an event. For critical operations, a domain can stall events in its own pipeline stage; this is equivalent to disabling events for the subsequent domains. The events can be unstalled later, and then they continue their flow to the subsequent domains.
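The pipeline mechanism just described can be sketched conceptually as follows (the types and names here are hypothetical and only illustrate the propagation logic; the actual ADEOS interfaces differ):

    #include <stdbool.h>

    #define NDOMAINS 3   /* e.g., a real-time domain, a shield, the Linux kernel */

    /* One pipeline stage; the array is kept in decreasing priority order. */
    struct domain {
        bool stalled;                 /* events are held at this stage       */
        bool (*handler)(int event);   /* returns false to stop the event     */
    };

    static void pipeline_dispatch(struct domain dom[NDOMAINS], int event)
    {
        for (int d = 0; d < NDOMAINS; d++) {  /* highest-priority domain first */
            if (dom[d].stalled)
                return;   /* event stays queued here until the stage unstalls  */
            if (dom[d].handler && !dom[d].handler(event))
                return;   /* the domain stopped the event's propagation        */
        }
    }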
Xenomai [Xen12] is a more recent implementation for supporting real-time system requirements. Xenomai makes use of ADEOS for its own purposes. As illustrated in Figure 2.11, a Xenomai system is composed of three domains: Xenomai, the interrupt shield, and the Linux kernel. In Xenomai's jargon, the Xenomai domain (the highest-priority domain) is said to provide the primary execution mode, while the Linux kernel domain (the lowest-priority domain) provides the secondary execution mode of real-time tasks. The interrupt shield domain, as its name implies, provides a mechanism for managing the interrupt flow across domains.

Figure 2.11: Xenomai computer system architecture (three ADEOS domains on top of the hardware: Xenomai, the interrupt shield, and the Linux kernel).

Real-time tasks can be executed in the primary execution mode as well as in the secondary execution mode; that is, a real-time task starts its execution on Xenomai, but it can jump to the Linux kernel whenever it invokes a Linux system call, and can then return once again to Xenomai. Whenever this happens, the real-time task is removed from Xenomai's ready-queue and is inserted into the Linux ready-queue, more specifically into the queue of the RT scheduling class. As mentioned before, the RT scheduling class organizes its tasks according to their priority level and, for each level, it uses two scheduling policies: SCHED_FIFO and SCHED_RR. In this case, such tasks are scheduled according to the SCHED_FIFO scheduling policy. However, this implies the use of a common priority scheme by Xenomai and the Linux kernel. Hence, to guarantee priority compatibility, the real-time tasks have a priority level in the same range as the Linux RT scheduling class (from zero to 99).

Xenomai provides mechanisms for guaranteeing the predictability of the execution of real-time tasks. However, whenever a real-time task jumps to the Linux kernel, it is out of Xenomai's control, and Xenomai can no longer guarantee the predictability of that execution. One type of event that contributes to such unpredictability is interrupts. When a real-time task is being executed on the Linux kernel, it should not be perturbed by non-real-time Linux interrupt activities. To isolate those tasks from such events, Xenomai employs a mechanism that starves the Linux kernel of interrupts while a real-time task is being executed on it. For that purpose, Xenomai instructs the intermediate domain, the interrupt shield domain, to stall such interrupts during the time that such tasks are being executed on the Linux kernel.

Another source of unpredictability is long non-preemptible (and time-unbounded) sections, which could considerably delay scheduling decisions. Whenever a real-time task jumps to the Linux kernel, it is only executed after a scheduling decision has been taken. Therefore, the responsiveness of a system increases if such scheduling decisions are frequent, and decreases otherwise. Hence, the probability of such a task experiencing long delays depends on the existence of long non-preemptible sections in the system. For this reason, Xenomai keeps an eye on the developments provided by the PREEMPT-RT patch [PR12], described earlier in this chapter, which aims for a fully preemptible Linux kernel.

One last source of unpredictability stems from the priority inversion phenomenon. Xenomai provides mechanisms for handling this phenomenon, but the Linux kernel does not. Once more, Xenomai uses the developments provided by the PREEMPT-RT patch, which implements the PI (priority inheritance) resource sharing protocol to reduce priority inversion.

2.4 Summary

In this chapter, we have addressed the most important platform-related issues of multiprocessor systems. We first discussed some hardware issues of multiprocessor systems, namely those that contribute to unpredictability, such as memory caches and interrupts. We then addressed software-related aspects and described the most important features of the mainline Linux kernel. This research aims at putting real-time slot-based task-splitting scheduling algorithms into practice, and, in this chapter, we described several enablers of our experimental work, such as the Linux kernel features that support the implementation of a real-time scheduler and the PREEMPT-RT patch. We note that the PREEMPT-RT patch does not yet properly support real-time multiprocessor scheduling policies, something addressed by this work. We have also reviewed several other approaches introducing real-time capabilities into Linux-based operating systems, namely those that use a co-kernel to guarantee the real-time system requirements and others that incorporate real-time features into the Linux kernel.

Chapter 3: Background on real-time systems

3.1 Introduction

Initially, real-time systems were restricted to some very specific domains, such as military, avionics, and industrial plants, just to mention a few. Nowadays, real-time systems are becoming increasingly important and pervasive, as more and more industries, infrastructures, and even ordinary people depend on them. Indeed, real-time domains and real-time systems have proliferated to such an extent that human beings use them daily, often without knowing it. Naturally, with the general proliferation of multi-core platforms, it is expected that real-time applications will start to be massively deployed on such platforms. A key factor for that, among other reasons, is the considerable boost in processing capacity offered by a relatively cheap, small, and low-power chip. Therefore, such platforms offer an opportunity to maximise performance and execute more complex and computing-intensive tasks whose stringent timing constraints cannot be guaranteed on single-core systems.

In this chapter, we aim to provide the required background in real-time concepts for multiprocessor systems. We start by presenting and discussing real-time concepts in general, and then we focus on those specific to multiprocessor systems. Next, we review some relevant real-time scheduling algorithms for multiprocessor systems. After that, using a task set example, we go through the details of several real-time scheduling algorithms for multiprocessor systems. We finish this chapter with a historical perspective of the scheduling algorithms under consideration in this work.

3.2 Basic concepts

In this section, we discuss the real-time concepts used throughout this dissertation. First, we describe the system model and then we address the fundamental concepts of real-time scheduling.


3.2.1 System model

Consider a system composed of m processors. Each processor is uniquely indexed in the range P1 to Pm, and U[Pp] denotes the utilization of processor Pp; that is, the current workload of processor Pp.

A real-time application is, typically, composed of a static set, τ, of n tasks (τ = {τ1, ··· , τn}). Each task, τi, generates a sequence of z jobs (τi,1, ··· , τi,z, where z is a non-negative number and potentially z → ∞), and is characterized by a three-tuple (Ci, Ti, Di).

The worst-case execution time, Ci, is the maximum time required by the processor to execute a job of task τi without any interruption. Ti defines the frequency at which jobs of task τi are released in the system and, according to the nature of Ti, systems are classified into three broad categories: (i) periodic, where jobs are released regularly at some known rate (called the period); (ii) sporadic, where jobs are released irregularly at some known rate (called the minimal inter-arrival time); and, finally, (iii) aperiodic, where jobs appear with irregular arrival times, typically at an unknown rate. Aperiodic tasks are out of the scope of this dissertation (the reader is referred to [AB98] for more details on how to handle aperiodic tasks). The temporal constraint of each task, τi, is defined by its relative deadline, Di, the size of the time window for executing a job of task τi. A task set, τ, is said to have implicit deadlines if the relative deadline of every task is equal to its period (Di = Ti), constrained deadlines if the relative deadline of every task is less than or equal to its period (Di ≤ Ti), and arbitrary deadlines if there is no such constraint; that is, Di can be less than, equal to, or greater than Ti.
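For later reference in code sketches, this task model maps directly onto a small data structure; a minimal sketch in C (the field names are our own choice):

    /* Periodic/sporadic task model used throughout this chapter. */
    struct task {
        double C;   /* worst-case execution time, Ci                     */
        double T;   /* period or minimal inter-arrival time, Ti          */
        double D;   /* relative deadline, Di (implicit deadline: D == T) */
    };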

For an implicit-deadline task set, the utilization, ui, of task τi is given by the ratio between its worst-case execution time and its period (or minimal inter-arrival time):

ui = Ci / Ti    (3.1)

The utilization of the system, Us, is the sum of the utilizations of all tasks, normalised by the number of processors (m):

Us = (1/m) · ∑_{i=1}^{n} ui    (3.2)

A task's density, λi, is an adaptation of its ui for constrained- or arbitrary-deadline task sets and is computed as:

λi = Ci / min(Di, Ti)    (3.3)

A task set, τ, is said to be synchronous if the first jobs of all tasks are released at the same time (usually this time instant is assumed to be time zero); otherwise, it is said to be asynchronous and, in this case, it is necessary to define an offset, Oi, of the first job arrival of each task τi relative to time zero. Typically, time zero is the time instant when the real-time system starts operating. Each job τi,j (this notation means the j-th job of task τi) becomes ready to be executed at its arrival time, ai,j, and continues until its finishing (or completion) time, fi,j. The absolute deadline, di,j, of job τi,j is computed as di,j = ai,j + Di. Therefore, a deadline miss occurs when fi,j > di,j. As mentioned before, we do not consider aperiodic tasks; hence, the time difference between the arrivals of two consecutive jobs of the same task τi must be equal to Ti (for periodic tasks) or at least equal to Ti (for sporadic tasks).

Figure 3.1 illustrates the relation among the timing parameters of job τi,j. The execution of job τi,j is represented by a gray rectangle, and the sum of all its execution chunks (ci,j^x), which together correspond to the execution time of that job, cannot exceed Ci.

Figure 3.1: Illustration of the job timing parameters.

According to its criticality level, a real-time task can be classified as hard real-time or soft real-time. A task is said to be a hard real-time task if it is not allowed to miss any job deadline; otherwise, undesirable or fatal results will be produced in the system. On the other hand, soft real-time tasks can miss some deadlines and the system is still able to work correctly. For example, a heart pacemaker (a medical device that uses electrical impulses to regulate the beating of the heart) is a hard real-time system, because a delay in the electrical impulses may kill a person, while a live audio-video system is categorized as a soft real-time system, because missing a deadline results in degraded quality, but the system can continue to operate. Real-time tasks can also be categorized as dependent, if they interact with other tasks and their execution can be blocked by those tasks, or independent, if they do not.

3.2.2 Fundamental concepts of scheduling

According to [But04], preemption is "the operation of suspending the executing task and inserting it into the ready-queue". Thus, in a preemptive system, the execution of the running task can be interrupted in order to execute a more important task on the processor. On the contrary, in a non-preemptive system, a task that has started executing on the processor cannot be interrupted before the end of its execution.

A scheduling algorithm is a method for defining an order/sequence for a set of ready jobs to have access to the processor or processors. Real-time scheduling algorithms should schedule ready jobs according to their demands such that their deadlines are met. A scheduling algorithm is said to be fixed-task priority if priorities are assigned per-task before run-time, applied to all jobs of each task, and do not change during run-time. One example is Rate Monotonic (RM) [LL73], where priorities are assigned according to the periodicity of each task. A scheduling algorithm is said to be fixed-job priority if the priority of a task might change during run-time but each job has a fixed priority. For example, the Earliest Deadline First (EDF) [LL73] scheduling algorithm assigns the highest priority to the job with the earliest absolute deadline. Finally, a scheduling algorithm is said to be dynamic-job priority if the priority of a job might change during run-time. For example, the Least-Laxity-First (LLF) [Der74] scheduling algorithm is a dynamic-job priority scheduling algorithm, since it assigns higher priority to the job with the least laxity, which is, at time t, a measure of urgency for a job; such urgency varies during the execution of the job.

A task set is said to be feasible if there exists some scheduling algorithm under which it is schedulable. A scheduling algorithm is said to be optimal if it is able to schedule any task set that is feasible.

Multiprocessor systems introduce an additional dimension to the scheduling algorithm classification, known as the migration degree. According to the migration degree, multiprocessor scheduling algorithms have traditionally been categorized as global or partitioned. Global scheduling algorithms store all ready tasks (or jobs) in one global queue that is shared by all processors. At any moment, the m highest-priority ready tasks among those are selected for execution on the m processors. Tasks can migrate from one processor to another during their execution; that is, the execution of a task can be preempted on one processor and resume on another processor. Some of these scheduling algorithms are said to be work-conserving, because it is not possible to have any processor in an idle state while there is a ready task waiting for execution. As a consequence, this category of scheduling algorithms naturally provides a good workload balance among processors, which, in overload situations, may help prevent deadline misses, because there is no constraint restricting on which processor a task can be executed. In contrast, partitioned scheduling algorithms divide the task set such that all tasks in a partition are assigned to the same processor, thus reducing a multiprocessor scheduling problem, at least conceptually, to a set of m uniprocessor problems. Since tasks are not allowed to migrate from one processor to another, this kind of scheduling algorithm is generally non-work-conserving, because it is possible to have a processor in an idle state when there is some ready task waiting for execution on another processor.

Partitioning implies assigning tasks to processors. Assigning tasks to processors is a bin-packing problem, which is known to be an NP-hard problem [CGJ97]. The main goal of bin-packing is to pack a collection of items with different sizes into the minimum number of fixed-size bins such that the total weight, volume, etc. does not exceed some maximum value. In the context of a real-time scheduling algorithm, each item is a task, τi, belonging to the task set, τ; the size of each item is the utilization of that task, ui; each bin is a processor, Pp; and the size of each bin is the capacity of a processor, usually considered to be 100%. There are several heuristics for this kind of problem. Examples of such heuristics are Next-Fit (NF), First-Fit (FF), Best-Fit (BF), and Worst-Fit (WF).
NF assigns tasks to the current processor; if a task does not fit on the current processor, it moves on and continues packing on the next processor. FF assigns each task to the first (lowest-indexed) processor that can accept that task. BF assigns a task to the lowest-indexed processor that will have the smallest remaining capacity after the assignment. WF is the inverse of the BF heuristic; that is, it assigns a task to the lowest-indexed processor that will have the greatest remaining capacity after the assignment. A sketch of these heuristics is given below.
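The following is a minimal, self-contained sketch in C of the FF, BF, and WF heuristics (NF is omitted, as it additionally keeps track of a moving "current" bin); it uses the task set of Table 3.1, introduced later in this chapter, and shows that under FF the fourth task cannot be assigned:

    #include <stdio.h>

    #define M 3          /* number of processors (bins) */
    #define N 4          /* number of tasks (items)     */

    /* Pick a processor for a task of utilization u, or -1 if none fits.
     * heuristic: 0 = First-Fit, 1 = Best-Fit, 2 = Worst-Fit.           */
    static int pick_cpu(const double cap[M], double u, int heuristic)
    {
        int best = -1;
        for (int p = 0; p < M; p++) {
            if (cap[p] < u)
                continue;                       /* task does not fit   */
            if (heuristic == 0)
                return p;                       /* FF: first that fits */
            if (best < 0 ||
                (heuristic == 1 && cap[p] < cap[best]) ||  /* BF: least slack */
                (heuristic == 2 && cap[p] > cap[best]))    /* WF: most slack  */
                best = p;
        }
        return best;
    }

    int main(void)
    {
        double u[N]   = { 0.900, 0.667, 0.571, 0.500 };  /* task utilizations  */
        double cap[M] = { 1.0, 1.0, 1.0 };               /* 100% per processor */

        for (int i = 0; i < N; i++) {
            int p = pick_cpu(cap, u[i], 0);              /* First-Fit */
            if (p < 0) {
                printf("task %d (u=%.3f) cannot be assigned\n", i + 1, u[i]);
            } else {
                cap[p] -= u[i];
                printf("task %d (u=%.3f) -> P%d\n", i + 1, u[i], p + 1);
            }
        }
        return 0;
    }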

In recent years, a new class of scheduling algorithms, called semi-partitioned or task-splitting, has been proposed. Under a semi-partitioned scheduling algorithm, some tasks are assigned to specific processors, as in partitioned schemes, while other tasks may migrate between processors, as in global schemes.

There is also an orthogonal variant of the global, partitioned, and semi-partitioned categories, called clustered scheduling. Clustered scheduling [BB07] is a hybrid of global, partitioned, and semi-partitioned [AT06, BA09b] scheduling that groups processors according to some criterion. The most common criterion is the cache level [GCB13]; that is, a cluster is formed out of the cores that share the same cache level. For example, tasks may be partitioned to clusters, and the tasks within each cluster may be scheduled using some global or some semi-partitioned algorithm. This avoids migrations among cores on different chips, which cause cache misses and imply extra accesses to the main memory, with the consequent performance degradation. From this point of view, partitioned scheduling uses clusters composed of only one processor, while global and semi-partitioned scheduling use a single cluster containing all processors in the system. However, there is an important aspect common to all these scheduling categories: at any given time instant, a task can be executed by only one processor; that is, execution parallelism is not allowed.

All of the above-defined concepts are used to characterize the task set model, the scheduling algorithm, and the host platform. A schedulability test is a formal tool for checking whether or not a task set is schedulable (no deadline miss occurs) using a specific scheduling algorithm running on a specific platform. Note that schedulability tests play an important role in real-time systems. In the words of Bertogna [Ber08]: “Having a good scheduler without being able to prove the correctness of the produced schedule is almost useless in the real-time field. Since designers are interested in finding efficient scheduling algorithms at least as much as they are interested in proving that no deadline will be missed, the tightness of existing schedulability tests is a key factor when selecting which class of schedulers to adopt”.

A common way to characterize the performance of scheduling algorithms is by using the no- tion of a utilization bound. The utilization bound of a real-time scheduling algorithm is a threshold for the task set utilization such that all tasks meet their deadlines when scheduled by the algorithm in consideration if the task set utilization does not exceed that threshold. This concept is commonly used to compare scheduling algorithms and it is also an indicator of the expected performance of a scheduling algorithm.

So far, we have described generic real-time concepts as well as those specific to multiprocessor systems. We next review the relevant work on real-time scheduling algorithms for multiprocessor systems.

3.3 Review of relevant work on multiprocessor scheduling

In this section, we present a global picture of the most relevant work on multiprocessor scheduling algorithms. For a survey on real-time scheduling algorithms and related issues, the reader is referred to [DB11b].

Real-time systems are computing systems that need to compute the right result at the right time. Real-time scheduling algorithms were designed, analysed, and used to schedule computer programs in the on-board computer during the first manned space flight to the moon [Liu69, LL73] in the 1960s. These algorithms, RM and EDF, were intended to be used in uniprocessor computer systems [LL73]. During the 1980s and 1990s, these algorithms were extended to take into account more realistic models of preemption overhead [KAS93], deadlines [Leh90], and synchronization [SRL90]. Today, these techniques are considered mature for uniprocessor systems. Unfortunately, real-time scheduling on multiprocessors did not experience the same level of success as it did on uniprocessors. As early as the 1960s, it was observed by Liu [Liu69], the inventor of RM and EDF, that: “Few of the results obtained for a single processor generalize directly to the multiple processor case; bringing in additional processors adds a new dimension to the scheduling problem. The simple fact that a task can use only one processor even when several processors are free at the same time adds a surprising amount of difficulty to the scheduling of multiple processors”.

Later, it was found that RM and EDF were not fit for multiprocessors; deadlines can be missed although close to 0% of the computing capacity is requested [DL78, Dha77]. This is known as Dhall’s effect, which occurs when a task set is composed of m low-utilization tasks and one high-utilization task. Consider the following synchronous periodic task set, composed of n = m + 1 tasks with implicit deadlines (Ti = 1 and Ci = 2ε for i ∈ [1, m], where ε is an arbitrarily small value, and Tm+1 = 1 + ε and Cm+1 = 1), to be scheduled on m processors according to the EDF scheduling algorithm. All tasks are released at time t = 0; hence, task τm+1 has the lowest priority, due to it having the latest deadline (at time 1 + ε). Figure 3.2 shows the time-line execution of just one job of each task to illustrate Dhall’s effect. The schedule produced by the scheduling algorithm is presented according to two equivalent perspectives: tasks (above) and processors (below). Task arrivals and deadlines are represented by upwards- and downwards-pointing arrows, respectively. Gray rectangles represent the execution of tasks, and black circles mark the end of their execution. For the sake of simplicity, in the description of the resulting schedule, the term job will be avoided in favour of the more generic term task, even if some job-related concepts are used.

The high-utilization task τm+1 misses its deadline due to the interference of the m low-utilization tasks. By letting ε → 0+, the utilization of the m tasks becomes close to 0%, and task τm+1 has a utilization of 100%; therefore, the total utilization of the task set, Uτ, is just over 100%. Considering m → ∞, the system utilization, Us, decreases to almost 0%. Thus, a deadline can be missed even if Us is almost 0%. For this reason, researchers initially shunned global scheduling and instead explored partitioning algorithms [DL78, Dha77, AJ03], where a set of tasks is partitioned such that all tasks in one partition are assigned to a dedicated processor and are not allowed to migrate.

Figure 3.2: Illustration of Dhall’s effect.

In this way, uniprocessor scheduling algorithms can be used on each core but, unfortunately, the utilization bound is limited to 50%, as the following example shows. Consider a task set composed of n = m + 1 tasks, each of which has a utilization factor barely higher than 50% (ui = 1/2 + ε), to be scheduled on m processors. Assuming that tasks are assigned to processors according to the FF bin-packing heuristic, there is no processor with enough utilization capacity to accommodate task τm+1. Figure 3.3 illustrates the partitioning problem. Assuming ε → 0+, the worst-case utilization of any partitioned scheduling algorithm is limited by (m + 1)/2 [LGDG00]. Further, with m → ∞, the least schedulable utilization of any partitioned scheduling algorithm decreases to 50%.

Figure 3.3: Illustration of the partitioning problem (τm+1 cannot be assigned to any processor).

In the meantime, a class of algorithms called Proportionate Fairness (P-Fair) scheduling emerged [BCPV93, BCPV96]. As its name implies, under P-Fair scheduling the execution of tasks progresses proportionally over time. For that purpose, these algorithms break a task into many unit-size sub-tasks that are scheduled with migration allowed at sub-task boundaries. Many P-Fair variants have been developed [BGP95, AS00, SA02, AS04]. These algorithms are optimal, however, at the cost of many preemptions and migrations. Recently, a class of scheduling algorithms denoted Deadline Partitioning Fairness (DP-Fair) has provided the same optimality as P-Fair, but with fewer preemptions and migrations. DP-Fair algorithms [ZMM03, CRJ06, FKY08, LFS+10] divide time into time slots bounded by two consecutive task deadlines. Then, within each time slot, a ready task executes during some amount of time proportional to the time slot length, in accordance with some criterion, such as the task laxity, the task utilization, or the task density.

With the intent of reducing the number of preemptions and migrations incurred by global scheduling algorithms, multiprocessor scheduling algorithms with migration and static task priority were proposed, like RM-US[m/(3m − 2)] [ABJ01] and EDF-US[m/(2m − 1)] [SB02]. Both algorithms employ a Utilization Separation (US) factor: for instance, EDF-US[m/(2m − 1)] classifies as heavy all tasks with ui > m/(2m − 1) and as light all others. Heavy tasks statically receive higher priority than light tasks; light tasks are scheduled according to the EDF scheduling algorithm. However, these algorithms suffer from the same limitation as partitioning in terms of utilization bounds; that is, with m → ∞, the utilization bounds of RM-US[m/(3m − 2)] and EDF-US[m/(2m − 1)] tend to m/3 (i.e., ≈ 33.3% of the platform capacity) and m/2 (50%), respectively. Despite this limitation, they generated an increasing interest [Lun02, AAJ+02, AAJ03, Bak03, LL03], because these algorithms cause a low number of preemptions. Another approach to reducing the number of preemptions is the Fixed Priority until Zero Laxity (FPZL) scheduling algorithm [DB11a]. This algorithm combines fixed-priority scheduling algorithms, like RM, with LLF. The main idea is to limit the number of job priority changes: under this approach, jobs are only allowed to change priority once. Tasks are scheduled according to the fixed-priority algorithm; however, whenever the laxity of a task reaches zero, it is given the highest priority.

In the context of global fixed-priority scheduling algorithms, Davis and Burns [DB11c] addressed the issue of finding a schedulable task priority assignment. Their work highlights the importance of an appropriate choice of task priority assignment.

Consider again a task set composed of n = m + 1 tasks, each of which has a utilization factor barely higher than 50% (ui = 1/2 + ε), to be scheduled on m processors. As mentioned before, this task set cannot be scheduled by any partitioned scheme. However, researchers [ABD05, AT06] observed that, if the execution time of a task could be split into two pieces, then it would be possible to meet deadlines. For example, consider assigning task τm+1 to two processors (for example, P1 and P2) so that task τm+1 executes 0.25 + ε/2 time units on one of the two processors and 0.25 + ε/2 time units on the other. This makes it possible to meet its deadline, assuming that the two pieces of task τm+1 are dispatched so that they never execute simultaneously.
This approach typifies semi-partitioning. It should be noted that, contrary to global scheduling algorithms, where tasks can dynamically migrate across all processors, the task migration rules in semi-partitioned or task-splitting scheduling algorithms are statically defined at the design phase. Hence, semi-partitioned scheduling algorithms employ an off-line task-to-processor assignment algorithm, which also splits tasks as necessary, and a run-time task-dispatching algorithm. Many recent algorithms are based on this task-splitting idea, and they differ in (i) how tasks are assigned to processors and split before run-time and (ii) how tasks (and, in particular, split tasks) are dispatched at run-time.

Table 3.1: Task set example.

Task   Ci   Ti   ui
τ1      9   10   0.900
τ2      6    9   0.667
τ3      4    7   0.571
τ4      3    6   0.500

In [ABD05], it is proposed that the second piece of a split task τi should arrive Ti time units later. This ensures that the two pieces of such a task do not execute simultaneously but, unfortunately, it requires that Di ≥ 2Ti, so this is only recommended for soft real-time tasks.

The remaining works that use task-splitting dispatching can be grouped according to two techniques: (i) job-based task-splitting dispatching and (ii) slot-based task-splitting dispatching. Job-based task-splitting dispatching [KY07, KY08b, KY09, LRL09, GSYY10, BDWZ12] splits a job into two or more sub-jobs and forms a sequence of sub-jobs; the arrival time of a sub-job is set equal to the absolute deadline of its preceding sub-job. Job-based task-splitting dispatching provides a utilization bound greater than 50% and few preemptions. The main drawback of job-based task-splitting dispatching is that utilization bounds greater than 72.2% have not been attained [SLB13]. Slot-based task-splitting dispatching [AT06, AB08, ABB08, BA09a, BA09b, BA11] sub-divides time into time slots whose beginning and end are synchronized across all processors; the end of a time slot of processor Pp contains a time reserve, and the beginning of a time slot of processor Pp+1 contains another time reserve, and these two reserves supply processing capacity for a split task. Slot-based task-splitting dispatching causes more preemptions than job-based task-splitting dispatching but, in return, it offers higher utilization bounds (higher than 72.2% and configurable up to 100%).

3.4 Comparison of scheduling algorithms

This section highlights the differences and similarities of the most significant real-time scheduling algorithms for multiprocessor systems. For that purpose, consider a preemptive system composed of three (m = 3) identical processors (P1, P2, and P3) and a synchronous periodic task set composed of four (n = 4) independent tasks (τ1, ··· , τ4) with implicit deadlines. Table 3.1 shows the parameters of each task; the last column presents the utilization, ui, of each task. The system utilization, Us = (1/m) · ∑_{i=1}^{n} ui, is equal to 0.879. We next present the schedule produced by different real-time scheduling algorithms (of the global, partitioned, and semi-partitioned categories) for the described system.

Figure 3.4: Global-EDF execution time-line (τ1 misses its deadline at time 10).

3.4.1 Global scheduling

Under the Global-EDF scheduling policy, all ready tasks are stored in a global ready-queue and, at each time t, the m highest-priority ready tasks are executed on the m processors. For the sake of presentation, please note that, in this and the following figures, the highest-priority task is assigned to the lowest-indexed processor, the next highest-priority task is assigned to the next lowest-indexed processor, and so on. It should be noted that this could cause more preemptions and migrations than if some optimization heuristic were adopted (such as, for instance, keeping a task executing on the same processor whenever possible). For example, in Figure 3.4, at time 3, task τ3 becomes the highest-priority ready task and migrates from processor P2 to processor P1. Task τ1 misses a deadline at time 10. Given that we will show that this task set can in fact be scheduled by other algorithms, this shows that Global-EDF is not optimal for multiprocessor systems.

The Earliest Deadline first until Zero Laxity (EDZL) scheduling algorithm for multiprocessor systems [Lee94] is a global scheduling algorithm that combines the features of two uniprocessor scheduling algorithms: EDF and LLF. LLF [Der74] is a scheduling algorithm that assigns the highest priority to the task with the least laxity. The laxity of a real-time task at time t is the difference between the time remaining until its deadline and the amount of execution time it still needs to complete. During execution, in addition to its Ci, Ti, and Di parameters, a task τi is also characterized by its remaining execution time, ci^rem(t), and by its remaining time to its absolute deadline, di^rem(t), at time t. The laxity, Li(t), of a task at time t is computed as Li(t) = di^rem(t) − ci^rem(t) and provides a measure of urgency for this task. Whenever Li(t) is zero, the task must be scheduled to execute immediately; otherwise, it will miss its deadline.
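A minimal sketch in C of this bookkeeping, assuming integer time ticks and a hypothetical per-job structure maintained by the dispatcher:

    #include <stdbool.h>

    /* Per-job state kept by the dispatcher (times in integer ticks). */
    struct job {
        long abs_deadline;   /* di,j                        */
        long remaining;      /* execution time still needed */
    };

    /* Laxity at time t: Li(t) = di^rem(t) - ci^rem(t). */
    static long laxity(const struct job *j, long t)
    {
        return (j->abs_deadline - t) - j->remaining;
    }

    /* EDZL promotion rule: a ready job whose laxity has reached zero
     * must execute immediately, overriding the plain EDF order.      */
    static bool must_run_now(const struct job *j, long t)
    {
        return laxity(j, t) <= 0;
    }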

Figure 3.5: Global EDZL execution time-line.

The EDZL scheduling algorithm integrates EDF and LLF in such a manner that it behaves like EDF if there is no task with Li(t) equal to zero. When the Li(t) of some task becomes zero, the task with positive laxity that has the latest absolute deadline among the currently executing tasks is preempted in favour of the zero-laxity task. Ties are broken by choosing the task with the smallest execution time [Lee94]. Figure 3.5 shows the time-line execution of the task set presented in Table 3.1 when scheduled under EDZL. As can be seen from Figure 3.5, no job misses its deadline. However, EDZL is not an optimal scheduling algorithm; that is, it is not able to schedule all feasible task sets.

Another approach is given by the P-Fair family of scheduling algorithms. The main idea of P-Fair scheduling algorithms is to provide proportionate progress according to the task utilization, and these algorithms are optimal. Figure 3.6 presents a time-line execution considering the P-Fair scheduling algorithm of [BCPV96], called P-Fairness (PF). This algorithm assures that, within every interval [0, t), each task is scheduled for either ⌊ui · t⌋ or ⌈ui · t⌉ time quanta. For that purpose, PF breaks each task into an infinite sequence of quantum-length sub-tasks, τi^k, and each sub-task has a pseudo-release, r(τi^k) = ⌊(k − 1)/ui⌋, and a pseudo-deadline, d(τi^k) = ⌈k/ui⌉. The time interval [r(τi^k), d(τi^k)) denotes the window of τi^k, and each sub-task must execute within its associated window. The windows of two successive sub-tasks can partially overlap, and the successor bit, b(τi^k) = ⌈k/ui⌉ − ⌊k/ui⌋, measures the number of overlapping quantum-length units between two successive sub-tasks τi^k and τi^{k+1}. Different sub-tasks of a task are allowed to execute on different processors, but cannot do so simultaneously.

Assuming that, at time t, sub-tasks τi^k and τj^x are both ready to execute, τi^k has higher priority than τj^x (denoted τi^k > τj^x) according to the following rules, applied sequentially until the tie is broken:

(i) d(τi^k) < d(τj^x);

(ii) b(τi^k) > b(τj^x);

(iii) τi^{k+1} > τj^{x+1}.

If, after applying all rules, no sub-task has priority over the other, then the tie can be broken arbitrarily.

Figure 3.6: PF execution time-line.
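The window computation lends itself to a direct sketch in C (illustrative only; the utilization u is assumed to be strictly positive):

    #include <stdio.h>
    #include <math.h>

    /* Pseudo-release, pseudo-deadline, and successor bit of the k-th
     * quantum-length sub-task of a task with utilization u, under PF. */
    static void pf_window(int k, double u)
    {
        long r = (long)floor((k - 1) / u);           /* pseudo-release  */
        long d = (long)ceil(k / u);                  /* pseudo-deadline */
        int  b = (int)(ceil(k / u) - floor(k / u));  /* successor bit   */
        printf("subtask %d: window [%ld, %ld), b = %d\n", k, r, d, b);
    }

    int main(void)
    {
        for (int k = 1; k <= 5; k++)
            pf_window(k, 0.571);   /* e.g., task τ3 of Table 3.1 */
        return 0;
    }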

Figure 3.6 shows the time-line execution of the task set presented in Table 3.1, where gray rectangles represent the execution of each sub-task and are labelled with its number. The figure also highlights two drawbacks of this algorithm: the high number of task preemptions and migrations. Furthermore, the amount of computation required to take scheduling decisions should also be noted.

DP-Fair scheduling algorithms are also optimal. DP-Fair divides time into time slots demarcated by two consecutive absolute job deadlines. For each time slot, DP-Fair allocates the workload of all ready tasks and then schedules that workload. The workload is computed according to the length of each time slot, and these workloads share the same deadline. Figure 3.7 shows the time-line execution of the task set example according to the DP-Wrap scheduling algorithm [LFS+10], which typifies the DP-Fair approach. DP-Wrap divides time into unequal time slots; then, at the beginning of each time slot, S^k, it computes the workload of each ready task, w(τi)^k = ui · S^k, for that time slot. Next, it follows an adaptation of the NF bin-packing heuristic to distribute these workloads to processors: it iterates over the set of workloads and assigns each task workload, w(τi)^k, to the lowest-indexed processor where it fits. Whenever the assigned workload exceeds the length of S^k, such workload is split between processors Pp and Pp+1.

Figure 3.7: DP-Wrap execution time-line.

In this case, processor Pp+1 is allocated such workload at the beginning of the time slot, while processor Pp is allocated it at the end of the time slot. As it can never occur that w(τi)^k > S^k, this ensures that execution parallelism of such workload is avoided. According to [ZMM03], this approach can reduce the number of scheduling points by up to 75% compared with P-Fair scheduling algorithms. However, it still requires a great deal of computation in order to take scheduling decisions.
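A compact sketch in C of this wrapping rule for a single slot, under the assumption that workloads are laid out contiguously on an "unrolled" time line of length m · S (the printed intervals are offsets within each processor's slot):

    #include <stdio.h>

    #define M 3
    #define N 4

    /* One scheduling slot of length S: give each task a workload
     * w_i = u_i * S and "wrap" the workloads around the processors,
     * splitting a workload between P_p and P_{p+1} when it crosses a
     * processor boundary (the two pieces never overlap in time).     */
    static void dp_wrap_slot(const double u[N], double S)
    {
        double t = 0.0;                  /* position on the unrolled time line */
        for (int i = 0; i < N; i++) {
            double w     = u[i] * S;     /* workload of task i in this slot    */
            int    p     = (int)(t / S); /* processor where the workload starts*/
            double start = t - p * S;    /* offset within P_p's slot           */

            if (start + w <= S) {        /* fits entirely on P_p               */
                printf("task %d: P%d [%.2f, %.2f)\n",
                       i + 1, p + 1, start, start + w);
            } else {                     /* split: tail of P_p, head of P_{p+1}*/
                printf("task %d: P%d [%.2f, %.2f) + P%d [0.00, %.2f)\n",
                       i + 1, p + 1, start, S, p + 2, start + w - S);
            }
            t += w;
        }
    }

    int main(void)
    {
        double u[N] = { 0.900, 0.667, 0.571, 0.500 };  /* Table 3.1 */
        dp_wrap_slot(u, 6.0);    /* first slot of the example: [0, 6) */
        return 0;
    }

Running it with S = 6 and the utilizations of Table 3.1 reproduces the first slot of Figure 3.7: τ1 on P1, τ2 split between P1 and P2, τ3 split between P2 and P3, and τ4 on P3.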

3.4.2 Partitioned scheduling

Partitioned scheduling algorithms are composed of two algorithms: the off-line task-to-processor assignment algorithm and the run-time task-dispatching algorithm. The first assigns tasks to processors, and the second schedules tasks to execute on each processor. In other words, partitioned scheduling algorithms statically assign tasks to processors, and tasks are then scheduled on each processor using a uniprocessor scheduling algorithm, like, for instance, RM or EDF.

Returning to the task set example presented in Table 3.1, it is easy to check that this task set cannot be partitioned onto three processors. Assuming that the assignment algorithm follows the FF bin-packing heuristic, which assigns tasks one by one to the lowest-indexed processor where each fits, tasks τ1 (with u1 = 0.900), τ2 (with u2 = 0.667), and τ3 (with u3 = 0.571) are assigned to processors P1, P2, and P3, respectively. Consequently, task τ4 (with u4 = 0.500) cannot be assigned to any processor, because none of them has enough capacity to accommodate this task. Nor is it possible to succeed with any other bin-packing heuristic, as no two of these tasks can fit together on the same processor.

3.4.3 Semi-partitioned scheduling

Like partitioned scheduling algorithms, semi-partitioned scheduling algorithms are also composed of an off-line task-to-processor assignment algorithm and a run-time task-dispatching algorithm, with the same purposes.

Figure 3.8 shows the time-line execution of the EDF with Window-constrained Migration (EDF-WM) scheduling algorithm [KY09]. Each task is assigned to an individual processor using any bin-packing heuristic; a task is split only when no individual processor has enough spare capacity to accommodate it. From Figure 3.8, we observe that tasks τ1, τ2, and τ3 are assigned to processors P1, P2, and P3, respectively. None of them has sufficient capacity to entirely accommodate task τ4. The execution of task τ4 is then spread over the processors with some spare processing capacity (utilization), starting from the lowest-indexed processor. Therefore, task τ4 is split among processors P1, P2, and P3. Consequently, task τ4 migrates across those processors, but it does so in such a way that it never migrates back to the same processor within the same period, nor is it executed on more than one processor simultaneously. For that purpose, D4 is divided into three (the number of processors on which task τ4 executes) equal windows. The length of each window is, in this case, equal to two time units (D4/3 = 6/3 = 2). The beginning and end of these windows define the pseudo-releases and pseudo-deadlines related to each processor. For instance, on processor P1, the relative pseudo-release and the relative pseudo-deadline are at times zero and two, respectively, while, on processor P2, the relative pseudo-release and the relative pseudo-deadline are at times two and four, respectively, and so on.

An important issue is how to compute how much time task τ4 can execute on each processor; that is, what are C4[P1], C4[P2], and C4[P3]. An obvious constraint here is that the execution of task τ4 on processors P1, P2, and P3 cannot violate the timing requirements of the already assigned tasks. In [KY09], it is defined how to compute that parameter; according to this, C4[P1], C4[P2], and C4[P3] are approximately equal to 0.5, 1.5, and 1 time units, respectively. The run-time task-dispatching algorithm schedules tasks on each processor under the uniprocessor EDF scheduling algorithm.

The Sporadic-EKG (S-EKG) scheduling algorithm [AB08] was designed to schedule real-time sporadic tasks; the name EKG stands for “EDF with task splitting and k processors in a Group”. It can be configured to achieve different levels of utilization bounds. It is categorized as semi-partitioned, since it assigns m − 1 tasks to two processors and the rest to only one processor. S-EKG divides time into slots of length S = (1/δ) · min_{τi ∈ τ}(Ti), where δ is a designer-set integer parameter that controls the frequency of migration of the tasks assigned to two processors. UB_{S-EKG} is the utilization bound of the S-EKG scheduling algorithm and is determined by the chosen value for δ (in Chapter 4, we provide more details). A task whose utilization exceeds UB_{S-EKG} (henceforth called a heavy task) is assigned to a dedicated processor. Then, the remaining tasks are assigned to the remaining processors in a way similar to NF bin-packing. Assignment is done in such a manner that the utilization of the processors is exactly UB_{S-EKG}, except for the last processor, where it could be less. Task splitting is performed whenever a task causes the utilization of a processor to exceed UB_{S-EKG}.


Figure 3.8: EDF-WM execution time-line.

In this case, this task (henceforth called a split task) is split between the current processor Pp and the next one, Pp+1. Then, on these processors, there are time windows (called reserves) where this split task has priority over the other tasks (henceforth called non-split tasks) assigned to these processors. The lengths of the reserves are chosen such that no overlap between them occurs, the split task can be scheduled, and all non-split tasks can also meet their deadlines. The non-split tasks are scheduled under the EDF scheduling algorithm.

Figure 3.9 shows the time-line execution of the S-EKG scheduling algorithm for the task set example presented in Table 3.1, with a value of δ equal to four. According to [AB08], the utilization bound of the S-EKG algorithm for δ equal to four is approximately 88.9% (UB_{S-EKG} = 4 · (√(δ · (δ + 1)) − δ) − 1 ≈ 0.889). Then, since τ1 is a heavy task, it is assigned to a dedicated processor, P1. τ2 is assigned to processor P2, but assigning task τ3 to processor P2 would cause the utilization of processor P2 to exceed UB_{S-EKG} (0.667 + 0.571 > 0.889). Therefore, task τ3 is split between processor P2 and processor P3. A portion of task τ3 is assigned to processor P2, just enough to make the utilization of processor P2 equal to UB_{S-EKG}, that is, 0.222; this part is referred to as y[P2]. The remaining portion (0.349) of task τ3 is assigned to processor P3 and is referred to as x[P3]. Finally, task τ4 is assigned to processor P3.

On a dedicated processor, the task-dispatching algorithm is very simple: whenever the assigned task is ready to be executed, the processor executes that task. Time is divided into time slots, and non-dedicated processors usually execute both split and non-split tasks. For that purpose, the time slot might be divided into at most three parts. The y[Pp] and x[Pp+1] time units are reserved for executing the split task shared by Pp and Pp+1. Note that a split task executes one portion on processor Pp and the remaining portion on processor Pp+1; this means that a split task will execute on both processors, but never simultaneously.
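As a check, the bound above expands, for δ = 4, as:

UB_{S-EKG} = 4 · (√(4 · 5) − 4) − 1 = 4 · (√20 − 4) − 1 ≈ 4 · 0.472 − 1 ≈ 0.889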


Figure 3.9: S-EKG execution time-line.

The remaining part of the time slot is reserved for executing non-split tasks on each processor. The run-time task-dispatching algorithm works over the time slot of each processor and, whenever the dispatcher runs, it checks the time elapsed within the current time slot: (i) if the current time falls within a reserve (x[Pp] or y[Pp]) and the assigned split task is ready to be executed, the split task is scheduled to run on the processor; otherwise, the non-split task with the earliest deadline is scheduled to execute on the processor; (ii) if the current time does not fall within a reserve, the non-split task with the earliest deadline is scheduled to run on the processor.

In the execution time-line illustrated in Figure 3.9, the time slot length is S = 1.5, because min_{τi ∈ τ}(Ti) is equal to six (the minimal inter-arrival time of task τ4) and δ is equal to four. The split task τ3 executes only within the x[P3] and y[P2] reserves. Outside its reserves, it does not use the processor, even if the processor is free. In contrast, the non-split tasks execute mainly outside the x[Pp] and y[Pp] reserves, but potentially also within the reserves, namely when there is no split task ready to be executed. There is one clear situation in Figure 3.9 that illustrates this: on processor P3, within the time slot that starts at time six, task τ4 executes in the x[P3] reserve, because the split task τ3 has already finished its execution.
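The dispatching rule above can be sketched as follows (a simplified illustration, not the actual implementation described later in this dissertation; edf_pick() and split_task_ready() are hypothetical helpers):

    /* Per-processor slot layout: a reserve of length x at the beginning of
     * the slot and a reserve of length y at its end (same time unit).     */
    struct cpu_slot {
        double S;   /* time slot length                           */
        double x;   /* reserve shared with the previous processor */
        double y;   /* reserve shared with the next processor     */
    };

    /* Nonzero when time t falls within one of this processor's reserves. */
    static int in_reserve(const struct cpu_slot *cs, double t)
    {
        double off = t - (long)(t / cs->S) * cs->S;  /* offset in the slot */
        return off < cs->x || off >= cs->S - cs->y;
    }

At each invocation, the dispatcher would run the split task when in_reserve() holds and the split task is ready (split_task_ready()), and would otherwise fall back to the earliest-deadline non-split task (edf_pick()).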

The Notional Processor Scheduling - Fractional capacity (NPS-F) algorithm [BA09b] uses a somewhat different approach from the scheduling algorithms previously presented in this section. While the previous scheduling approaches are based on the concept of a task, NPS-F uses an approach based on bins. One or more tasks are assigned to each bin, and there is a one-to-one relation between each bin and each notional processor, which is a sort of server. Each notional processor then schedules the tasks of the respective bin under the EDF scheduling policy. In turn, all notional processors are implemented upon the m physical processors, P1 to Pm, by means of time reserves. In Chapter 4, we present more details about the NPS-F scheduling algorithm.

Assume that there are four bins, b1 to b4, that are filled with the tasks of the task set presented in Table 3.1 in such a manner that task τ1 is in b1, τ2 is in b2, and so on. Then, the utilization of each bin is equal to the utilization of the associated task. According to [BA09b], to ensure the schedulability of all tasks, the utilization of each bin (denoted by U[bx]) must be inflated (the reasons for this are also explained in Chapter 4). After this, each bin is instantiated as one notional processor, Ψq.

Figure 3.10: NPS-F execution time-line.

The notional processors are assigned to the physical processors as follows. Ψ1 is assigned to processor P1, since P1 has some remaining capacity available (the capacity of each processor is one). Ψ2 is assigned to P1 and also to P2. The remaining capacity available on processor P2 is not enough for Ψ3; thus, Ψ3 is split between P2 and P3. Finally, Ψ4 is assigned to P3. To finish the assignment, the lengths of the reserves have to be computed. For that purpose, time is divided into fixed-length time slots S = (1/δ) · min_{τi ∈ τ}(Ti). In this example, min_{τi ∈ τ}(Ti) is equal to six and, consequently, S is equal to 6/4 = 1.5, because δ is set equal to four (note that, in the context of NPS-F, δ has the same purpose as in the context of S-EKG). The reserve of a notional processor on each physical processor is computed by multiplying the notional processor's processing requirement by S. The task-dispatching algorithm is very simple (see Figure 3.10): tasks are only allowed to execute within their respective reserves; that is, within the reserves of the respective notional processors.
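In code, the per-slot reserve computation is a one-liner; a sketch under the assumption that the inflated utilization of the notional processor is already known:

    /* Length of a notional processor's reserve within each time slot:
     * its (inflated) processing requirement times the slot length S.  */
    static double reserve_len(double inflated_util, double S)
    {
        return inflated_util * S;   /* e.g., 0.9 * 1.5 = 1.35 time units */
    }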

3.4.4 Discussion

Partitioned scheduling algorithms partition the set of tasks in the system such that all tasks belonging to a given partition are always executed on the same processor. Thus, after partitioning, the scheduling on a multiprocessor system reduces to multiple instances of the scheduling problem on a uniprocessor. However, we share the opinion of Baker [Bak10], who states: “A multiprocessor system composed by m processors cannot simply be broken down into m uniprocessor systems, because a multiprocessor system is not confined to processors. Cache and memory bus conflicts introduce other dimension to the scheduling problem that it is not present on uniprocessor systems”. As a consequence of the partitioning, such algorithms are generally not work-conserving, which in turn may cause low utilization: an idle processor cannot be used to run ready tasks that have been allocated to a different processor. As has been proven via the construction of pathological cases, their utilization bound is 50%. Global scheduling algorithms try to overcome this limitation by allowing tasks to migrate from one processor to another. At any time instant, the m highest-priority runnable tasks in the system can be selected for execution on the m processors. Some scheduling algorithms of this class are work-conserving and present a utilization bound of 100%, but at the cost of many preemptions and migrations. Although recent work has made some progress towards reducing the number of preemptions and migrations [ZMM03, FKY08, LFS+10] as well as scheduling overheads [KY11], the implementation of such algorithms requires the use of a global shared queue, which raises scalability issues because of the need to prevent race conditions in the access to that queue. Furthermore, some global scheduling algorithms require complex computations at run-time in order to take scheduling decisions. Semi-partitioned, or task-splitting, scheduling algorithms try to address these issues by limiting task migration. Typically, under task-splitting scheduling algorithms, most tasks execute only on one processor, as in partitioned algorithms, whereas a few split (or migratory) tasks may execute on several processors, as in global scheduling algorithms. This approach produces a better balance of the workload among processors than partitioning (and consequently presents a higher utilization bound) and also reduces the contention on shared queues and the number of migrations, compared to global scheduling (by reducing the number of migratory tasks). Furthermore, the run-time task-dispatching algorithm benefits from the static nature of the task-to-processor assignment performed pre run-time, i.e., off-line at the design stage. Generally, task-splitting approaches can be classified as job-based or slot-based. Slot-based task-splitting scheduling algorithms present the highest utilization bound among scheduling algorithms that do not share a global queue. However, [BBA11] presents a survey and experimental evaluation of semi-partitioned algorithms, both in terms of scheduling potential and in terms of implementation practicality; the authors also attempt to quantify the deterioration of (theoretical) schedulability due to overheads and working set loss (especially in the case of memory-intensive workloads).
In those experiments, this performance deterioration was more marked for slot-based (NPS-F) than for job-based (EDF-WM) scheduling algorithms, something which the authors largely attribute to the NPS-F design, as stated in Observation 6 in [BBA11]: “NPS-F schedulability is heavily constrained by the pessimistic assumptions made in the bin-packing heuristics of NPS-F’s assignment phase, and by higher preemption and migration delays”. In our view, some of the conclusions in [BBA11] regarding NPS-F are debatable. For instance, there exist many conceivable ways in which the various run-time overheads can be safely incorporated into the analysis. Some of these may be more penalising to one scheduling algorithm than to another, in terms of added pessimism. Indeed, it appears (although the exact approach is in fact not properly documented in [BBA11], hence our reservation) that the authors model all overheads as an increase to the execution time of each task. While this is a valid approach for some kinds of overheads, it is entirely unreasonable and unnecessarily pessimistic for others. For example, the overhead incurred at the start of each reserve in fact only affects one job per reserve per time slot; yet, in [BBA11] it is applied to all tasks in that reserve. By comparison, in the context of this dissertation, we formulate a much more efficient approach for incorporating that overhead. Another point of concern is that of task migrations; the approach in [BBA11] relies on the existing Linux mechanisms for task migration, which introduce large delays, because each task migration necessitates the simultaneous locking of both the destination and the source ready-queue. Even worse, the approach for performing task migrations in [BBA11] creates a scheduling dependency across processors, because it relies on IPIs (which, in the process, introduce additional delays); that is, the origin processor informs the target processor when it will free the migrating task. All in all, the evaluation in [BBA11] is an evaluation of NPS-F for that particular implementation (i.e. as emulated on top of LITMUSRT), hence the results have to be interpreted in that context. By comparison, our implementation, derived in the context of this dissertation, offers a much more efficient approach to task migration (without any need for ready-queue locking). Finally, [BBA11] examines the original form of NPS-F [BA09b], which employs time slot boundaries synchronised across all processors. This variant suffers from the instantaneous migration problem, discussed in depth in Section 4.5. Yet in [BA11], the staggered time slot approach is formulated, without any penalty in terms of schedulability, which eliminates or, in the worst case, greatly reduces such effects. In any case, we believe that the study presented in [BBA11] largely validates the semi-partitioned scheduling approaches for real-time systems and highlights the importance of the implementation for semi-partitioned scheduling algorithms. Still, we believe that the schedulability deterioration observed therein in the case of slot-based task-splitting is mainly due to the implementation and not the migration model. Since the scheduling algorithms under consideration in this dissertation are those classified as slot-based, in the next section we present a historical perspective of this category of algorithms.

3.5 Historical perspective of slot-based scheduling algorithms

In 2006, the first slot-based task-splitting scheduling algorithm, called EKG, was introduced (nowadays often retroactively referred to as “Periodic EKG” or “the original EKG”) [AT06]. Recall that we consider as slot-based task-splitting scheduling algorithms those that divide the time into time slots, map tasks to processors at the design stage, and then schedule tasks at run-time. Due to this view, we do not consider DP-Fair scheduling algorithms as slot-based task-splitting. EKG was limited to the scheduling of periodic tasks only. Under this scheme, time is divided into time slots of unequal (in the general case) duration, with the time boundaries of a given time slot corresponding to the time instants of two consecutive job arrivals (possibly by different tasks) in the system. Most tasks are partitioned, but at most m − 1 tasks (with m being the number of processors) are split – each between a corresponding pair of successively indexed processors, Pp and Pp+1. Within each time slot, the first piece of a split task is executed at the end of the time slot on the first processor, Pp, utilized by that task, and the second piece is executed at the start of the time slot on the other processor, Pp+1. All other tasks are executed under EDF on their respective processors. The basic form of the algorithm has a utilization bound of 100%. Clustered variants of EKG divide the system into m/k clusters of k processors each. Such clustering may be used to trade off utilization bound for fewer preemptions and migrations. However, the original EKG suffered from the limitation that, by design, it could not handle sporadically arriving tasks. This was because split task budgets in each time slot were proportional to the task utilization and the time slot length. However, given that time slots were formed between successive job arrivals, it was necessary to know the time of the next job arrival in advance in order to compute these budgets. With periodic tasks, this is not a problem, since arrival times are deterministic and may be computed in advance. With sporadic arrivals, however, this information is neither known in advance nor predictable. This is why, in 2008, Andersson and Bletsas [AB08] came up with an adapted design that came to be known as Sporadic-EKG (S-EKG). In order to accommodate sporadic tasks, this algorithm decouples the time slot boundaries from the time instants of job arrivals. Therefore, all time slots are of equal length. However, given that tasks can now arrive at unfavorable offsets relative to the time slot boundary, there is a penalty to be paid in terms of utilization bound: in order to ensure schedulability, processors can no longer be filled up to their entire processing capacity. Via a designer-set parameter, which controls the time slot length, S-EKG can be configured for a utilization bound from 65% to arbitrarily close to 100%, at the cost of more preemptions and migrations. Later in the same year, Andersson et al. [ABB08] came up with a version of S-EKG, named EDF-SS (which stands for EDF scheduling of non-split tasks with Split tasks scheduled in Slots), which can handle arbitrary-deadline tasks (whereas its predecessor was formulated in the context of implicit-deadline tasks). However, due to different task assignment heuristics, one version does not dominate the other. Moreover, in part due to this break from the previous variant, no utilization bound above 50% has been proven for EDF-SS.
The three EKG variants discussed above share a basic design: at most m − 1 tasks are split, each between two successively indexed processors. The first piece of a split task executes at the end of the time slot on the first processor used by that task and the second piece is executed at the start of the time slot on the other processor. However, a less prescriptive approach to splitting the execution time of tasks between processors, while at the same time maintaining slot-based dispatching, was soon devised. In 2009, Bletsas and Andersson [BA09a] presented NPS (which stands for Notional Processor Scheduling), rapidly superseded entirely by NPS-F [BA09b, BA11]. This algorithm (and its short-lived predecessor) employs a server-based approach. Each server (termed notional processor in the context of that work) serves one or more tasks employing an EDF scheduling policy. Under NPS-F, it is the execution time of these servers which is split – not directly that of the underlying tasks served. In principle, this allows improved efficiency in the utilization of a multiprocessor system. NPS-F has a utilization bound of 75%, configurable up to arbitrarily close to 100% at the cost of more preemptions and migrations. Compared to S-EKG, for corresponding configurations characterised by roughly the same number of preemptions, NPS-F has a higher utilization bound. However, a downside to splitting servers instead of tasks is that the number of migrating tasks is not bounded a priori and typically exceeds m − 1.

3.6 Summary

In this chapter, we have addressed several issues related to real-time scheduling for multiprocessors. We presented an overview of the real-time theory, in general, and also specifically for multiprocessor systems. We discussed the categories of scheduling algorithms for multiprocessor systems, highlighting the advantages and disadvantages of each category. We next focused on some relevant scheduling algorithms for multiprocessor systems. Finally, we presented a historical perspective of the subset of scheduling algorithms that is the focus of this work: slot-based task-splitting algorithms.

Chapter 4

Slot-based task-splitting scheduling algorithms

4.1 Introduction

The real-time research community has produced many scheduling algorithms for multiprocessor systems. Some of them are, theoretically, perfect, because they have the ability to schedule any feasible task set; that is, they are optimal. However, when we move from theory to practice, we are sometimes disappointed. Some of those scheduling algorithms are not easy to implement, and the number of practical issues that need to be addressed eliminates their theoretical benefits. The explanation is simple: many of them are based on a set of assumptions that have no correspondence with real systems. We believe that it is important to develop new scheduling algorithms, but they cannot be detached from practical issues. For that purpose, real implementations play an important role, because they are fundamental for testing, debugging, evaluating, and also for learning.

In this chapter, we aim to provide all details of slot-based task-splitting scheduling algorithms and their implementation in a PREEMPT-RT-patched Linux kernel version. We present our assumptions about the hardware architecture of the host system. We next present a unified description of the slot-based task-splitting scheduling algorithms under consideration, showing the generic principles of slot-based task-splitting and highlighting some particular features of specific slot-based task-splitting schemes. We address and discuss one practical issue of the slot-based task-splitting scheduling algorithms, the instantaneous migration problem. We describe the implementation of the slot-based task-splitting scheduling algorithms under consideration in this dissertation in the Linux kernel. Finally, the scheduling overheads incurred by these scheduling algorithms are described.


4.2 Assumptions about the hardware architecture

We assume identical processors, i.e. (i) all processors have the same instruction set and data layout (e.g. big-endian/little-endian) and (ii) execute at the same speed. We also assume that the execution speed of a processor does not depend on activities on another processor (e.g. whether the other processor is busy or idle, or which task it is busy executing) and also does not change at run-time. In practice, this implies that (i) if the system supports SMT – called hyperthreading by Intel – then this feature must be disabled, and (ii) features that allow processors to change their speed (e.g. power and thermal management) must be disabled. We assume that each processor has a local timer providing two functions: (i) one function allows reading the current real-time (i.e. not calendar time) as an integer; (ii) another function allows setting up the timer to generate an interrupt x time units in the future (x being configurable). We assume that the memory system follows the shared memory model. Hence, the memory access time is uniform for any processor. Finally, we assume that there is perfect synchronization among all processor clocks; that is, the local clocks of all processors are synchronized.
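From user space, the two timer functions we assume map naturally onto the POSIX clock API available in Linux; the following minimal sketch (our own illustration, not part of the kernel code described later) reads the current monotonic time and sleeps until an absolute instant x nanoseconds in the future:

#include <stdio.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

int main(void) {
    struct timespec now;
    /* function (i): read the current real-time (monotonic, not calendar) */
    clock_gettime(CLOCK_MONOTONIC, &now);

    /* function (ii): set up a wake-up x time units in the future */
    long long x = 500 * 1000 * 1000; /* 500 ms, configurable */
    long long target = (long long) now.tv_sec * NSEC_PER_SEC
                       + now.tv_nsec + x;
    struct timespec deadline = {
        .tv_sec  = target / NSEC_PER_SEC,
        .tv_nsec = target % NSEC_PER_SEC,
    };
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
    printf("woke up at the absolute target instant\n");
    return 0;
}

Inside the kernel, the analogous functionality is provided by the per-processor high-resolution timers used by our implementation.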

4.3 Generic slot-based task-splitting

In this section, we present a generic, unified, high-level description of the slot-based task-splitting scheduling algorithms, covering those that are most relevant from either a utilization bound or an implementation perspective: S-EKG and NPS-F. From this section onwards, whenever we refer to slot-based task-splitting algorithms, we are only considering algorithms where the time is divided into equal-duration time slots (more specifically, S-EKG and NPS-F). For example, we overlook the (original) EKG scheduling algorithm [AT06], whose applicability is limited to periodic task sets with implicit deadlines. That algorithm requires knowledge about future task arrivals to take scheduling decisions, which may be difficult to achieve in a real system. We also exclude DP-Fair [LFS+10] algorithms such as DP-Wrap, because such approaches use time slots of unequal duration which, in fact, are constructed dynamically (i.e. on-the-fly), during the execution of the system. The arrival of a job may cause the current time slot to terminate early and its remainder to become a new time slot, with task execution time budgets having to be calculated anew, so as to accommodate the remaining execution times of all tasks (which, therefore, need to be tracked at run-time). From a practical point of view, such dynamic time slot formation, the extensive associated book-keeping, and the need to recalculate reserves (potentially at every new job arrival) considerably increase the complexity of the scheduler.

1 We focus on S-EKG and not on EDF-SS, because the latter is a version of the former that explores different bin-packing heuristics to get fuller processors.

Figure 4.1: Task-to-server mapping.

A key concept of the generic slot-based task-splitting scheduling algorithm is that of a server. A server is a logical entity that provides computation services to tasks and has a maximum capacity equal to that of the underlying physical processors. Thus, in the generic template for a slot-based task-splitting scheduling algorithm, a task is first mapped to a server, which is then assigned to one or two processors. A processor may be allocated to multiple servers over time, but at any time a processor is allocated to only one server and one server is served by at most one processor. A time reserve is a time window during which a processor is exclusively allocated to a server, i.e. executes tasks of only that server. Therefore, time reserves on a processor must be non-overlapping. Furthermore, time reserves are periodic, and we call their period, which is the same for all reserves, the time slot. The scheduling of a set of tasks in the generic algorithm comprises two procedures: one that is performed off-line and another that is executed at run-time. The off-line procedure maps tasks to servers, determines the computation requirement of each server, and allocates reserves to the processors in order to ensure that each server has the required capacity for executing its tasks. The run-time procedure is a task-dispatching scheduling algorithm, running on each processor, which uses EDF to choose the task of the server associated to the currently active time reserve. We now describe the off-line procedure. The generic algorithm specifies a procedure composed of four steps and what is performed in each step, but it does not prescribe any particular algorithm for any of the steps. This is up to the specific scheduling algorithms. To illustrate the generic algorithm, we use an example. The figures illustrating its application were obtained by using the algorithms specified for NPS-F, later described in Subsection 4.4.2. The task set, τ, in our example comprises seven tasks, τ1 to τ7. Inset (a) of Figure 4.1 represents each task in that set by a rectangle whose height represents that task's utilization.

The first step of the off-line procedure is mapping tasks to servers, which we denote P˜q. The generic slot-based task-splitting algorithm does not prescribe how tasks are mapped to servers; each specific scheduling algorithm can use its own mapping. Inset (b) of Figure 4.1 shows the task-to-server mapping obtained by the FF bin-packing heuristic, as is done under NPS-F. The server capacity is 100%. The second step of the off-line procedure is to determine the (computation) capacity of each server. This is obtained by inflating the sum of the utilizations of the server's tasks. Utilization inflation is required to compensate for the time intervals during which a server may have ready tasks, but none of them can be executed. Such a scenario may arise because none of the server's time reserves are active, and a processor executes tasks of only the server associated to its current time reserve. Several methods can be used to determine by how much to inflate a server's capacity; some of these depend on the specific scheduling algorithm. In Section 5.4, we present one such method in the context of the new schedulability analysis. At this point, we assume that such a method exists, and illustrate its application in inset (c) of Figure 4.1. The third step of the off-line procedure is to allocate processors to servers. Again, the generic algorithm does not prescribe how this allocation is done. Figure 4.2 illustrates the server-to-processor assignment obtained by applying the algorithm used in NPS-F to our running example.

Figure 4.2: Server-to-processor assignment.

Servers P˜1 and P˜4 are assigned to only one processor each, and are hence classified as non-split servers; whereas servers P˜2, P˜3, and P˜5 are split servers, because they are assigned to two processors each. The fourth and last step of the off-line procedure is to define the time reserves for each processor. Again, the generic algorithm does not prescribe how this is done. Figure 4.3 illustrates the reserves determined by the application of an algorithm used by NPS-F to our running example.

In this case, all processors synchronize at the beginning of each time slot. On each processor Pp, the time slot can be divided into three reserves, at most: x[Pp], y[Pp], and N[Pp]. The x[Pp] reserve occupies the beginning of the time slot and is reserved for the split server shared by processor Pp and processor Pp−1, if any. The y[Pp] reserve occupies the end of the time slot and is reserved for the split server shared by processor Pp and processor Pp+1, if any. The remaining part, N[Pp], is reserved for the non-split server assigned to processor Pp, if one exists. Before describing the run-time task-dispatching algorithm, let us highlight that it executes on a per-processor basis. The run-time task-dispatching algorithm is a two-level hierarchical scheduling algorithm. The top level handles the division of time into reserves and identifies the currently active server for each processor, according to the server-to-processor assignment defined by the off-line procedure. The low-level scheduler schedules the tasks of the currently active server on each processor. For notational purposes, we consider an unlimited set of servers, equivalent to physical processors in terms of processing capacity, indexed in the range P˜1 to P˜k. The set of tasks that can be assigned to a server P˜q (denoted by τ[P˜q]) is limited by its processing capacity, which is equal to 1.0 (100%).


Figure 4.3: Run-time dispatching time-line produced by the slot-based task-splitting algorithm for the task set example.

The utilization of a server P˜q, U[P˜q], is given by:

U[P˜q] = ∑_{τi ∈ τ[P˜q]} ui    (4.1)

and the symbol U^{infl}[P˜q] is used to denote the inflated capacity of server P˜q. Having presented the generic slot-based task-splitting algorithm, in the next section we show how S-EKG and NPS-F relate to this generic algorithm.

4.4 Overview of specific scheduling schemes

The specific algorithms under consideration differ in how they implement the specific mechanisms discussed earlier, for example, in how they map tasks to servers or in how they map servers to processors. Therefore, before introducing the implementation (later in this chapter) and the new generalised schedulability analysis (in Chapter 5), it is worth highlighting, without going into too much detail, some of these aspects for S-EKG and NPS-F. Where deemed necessary, we also briefly discuss aspects of the original scheduling theory, which our new theory supersedes.

4.4.1 Discussion of S-EKG and its original theory

The S-EKG algorithm shares many features with the generic algorithm. Both are slot-based; both use an off-line procedure to map tasks to processors and a run-time algorithm that uses EDF to choose the running task. A major difference between the two is that S-EKG, as described in its original publication [AB08], does not explicitly use the concept of a server. Instead, it assigns tasks to processors directly, employing a procedure similar to the NF bin-packing heuristic, which we describe next: in S-EKG, the task-to-processor mapping procedure strives to ensure that the utilization of each processor is equal to UB_{S-EKG} (the theoretical utilization bound of the algorithm). It iterates over the set of tasks. If a task has a utilization that exceeds UB_{S-EKG}, it assigns that task to a dedicated processor. Otherwise, it assigns the task to the next available processor whose utilization is lower than UB_{S-EKG}. In this case, if task τi cannot be integrally assigned to the current processor, Pp, without exceeding that bound, it is split between that processor and the next one, Pp+1, so that Pp ends up utilized exactly at UB_{S-EKG} and Pp+1 receives the remaining share of τi.
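A sketch of this NF-like assignment under the stated rules follows (in C; the utilization values are illustrative, the value of UB would come from Equation 4.3, and, for simplicity, heavy tasks are handled first, each on a dedicated processor, whereas the original formulation treats them as they are encountered):

#include <stdio.h>

#define NTASKS 6
#define NPROCS 8

int main(void) {
    double UB = 0.65; /* UB for delta = 1; see Equation 4.3 */
    double u[NTASKS] = {0.70, 0.30, 0.45, 0.25, 0.50, 0.20}; /* illustrative */
    double load[NPROCS] = {0};
    int p = 0;

    /* heavy tasks (u > UB) each get a dedicated processor */
    for (int i = 0; i < NTASKS; i++)
        if (u[i] > UB)
            printf("task %d -> dedicated P%d\n", i + 1, ++p);

    for (int i = 0; i < NTASKS; i++) {
        if (u[i] > UB)
            continue;
        if (load[p] + u[i] <= UB) {
            load[p] += u[i]; /* fits integrally on the current processor */
            printf("task %d -> P%d\n", i + 1, p + 1);
        } else {
            double first = UB - load[p]; /* share left for P(p) */
            load[p] = UB;                /* P(p) filled up to exactly UB */
            load[p + 1] = u[i] - first;  /* remainder goes to P(p+1) */
            printf("task %d split between P%d and P%d\n", i + 1, p + 1, p + 2);
            p++;
        }
    }
    return 0;
}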

Consequently, the number of split tasks is at most m − 1, and there is at most one task split between each pair of successively indexed processors, Pp and Pp+1. Furthermore, in a schedulable system, the utilization of every non-dedicated processor (except possibly the last one) is exactly UB_{S-EKG}. S-EKG uses a designer-set integer parameter, δ, which determines the length of the time slot according to Equation 4.2.

S = (1/δ) · min_{τi∈τ}(Ti)    (4.2)

This parameter also affects the utilization bound, UB_{S-EKG}, and the inflation factor, α, as follows:

UB_{S-EKG} = 4 · (√(δ · (δ + 1)) − δ) − 1    (4.3)

α = 1/2 − √(δ · (δ + 1)) + δ    (4.4)

Depending on the chosen value for δ, UB_{S-EKG} varies from 65% (with δ equal to one) to arbitrarily close to 100% (for δ → ∞). (In Subsection 4.4.3 below, we reason further about this parameter.) Therefore, the value of δ can be used to trade off the target utilization bound against the number of preemptions and migrations. Although the original description of S-EKG [AB08] does not use the concept of a server, it is straightforward to equivalently map tasks to servers, which are then allocated time reserves as done in the generic algorithm. The rules to apply are as follows: (i) each task assigned to a dedicated processor is mapped to a server, which is then allocated exclusively the same dedicated processor; (ii) all non-split tasks that are assigned to one processor are mapped to a non-split server, which is then allocated to the same processor; (iii) each split task is mapped to a server that is split between the same processors that the split task is assigned to. With respect to the inflation of servers, under the original approach [AB08], each server is (safely, but inefficiently) inflated by the same amount, 2 · α – in other words:

U^{infl:orig}_{S-EKG}[P˜q] = U[P˜q] + 2 · α    (4.5)

with α calculated according to Equation 4.4.
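The following helper (ours, purely for illustration) evaluates Equations 4.3 to 4.5 for a given δ; for δ = 1 it yields UB ≈ 0.657 and α ≈ 0.086:

#include <math.h>
#include <stdio.h>

/* Equations 4.3 and 4.4: S-EKG utilization bound and inflation factor */
static double ub_s_ekg(int delta) {
    return 4.0 * (sqrt((double) delta * (delta + 1)) - delta) - 1.0;
}
static double alpha(int delta) {
    return 0.5 - sqrt((double) delta * (delta + 1)) + delta;
}

int main(void) {
    for (int delta = 1; delta <= 8; delta *= 2)
        printf("delta=%d: UB=%.3f, alpha=%.3f, inflated U for U=0.5: %.3f\n",
               delta, ub_s_ekg(delta), alpha(delta),
               0.5 + 2.0 * alpha(delta)); /* Equation 4.5 */
    return 0;
}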

4.4.2 Discussion of NPS-F and its original theory

It is rather straightforward to formulate NPS-F as an instance of the generic algorithm. Indeed, NPS-F is based on the same concepts as the generic algorithm, and these concepts even have the same name, except for the servers, which were called notional processors and gave NPS-F its name. Furthermore, NPS-F's off-line procedure comprises exactly the same four steps. Next, we summarize the algorithms used by NPS-F for each step of the off-line procedure. These are the algorithms that were used in the running example in Subsection 4.3 to illustrate the generic algorithm. For the first step, the mapping of tasks to servers, NPS-F uses the FF bin-packing heuristic. Inset (b) of Figure 4.1, in Subsection 4.3, shows the task-to-server mapping obtained with NPS-F. In the second step, the original paper on NPS-F used the following expression to inflate the capacity of each of the servers obtained in the first step:

U^{infl:orig}_{NPS-F}[P˜q] = ((δ + 1) · U[P˜q]) / (U[P˜q] + δ)    (4.6)

where δ is an integer designer-set parameter, which is also used to set the length of the time slot, as in S-EKG (see Equation 4.2). The algorithm used by NPS-F to allocate processors to servers, in the third step, just iterates over the set of servers and assigns each server to the next processor that still has some available capacity. If the processor's available capacity cannot accommodate the processing requirements of a server, the server is split. That is, the current processor's available capacity is allocated to partially fulfil the server's requirements, whereas the server's remaining requirements are fulfilled by the next processor. Finally, the algorithm used by NPS-F in the fourth and last step is also straightforward. For each processor, it allocates one reserve per server that uses it. Furthermore, the duration of each reserve is proportional to the processor capacity used by the corresponding server and is such that each server is periodic with a period equal to the time slot, S. We end this subsection with the utilization bound determined by the original schedulability analysis:

UB_{NPS-F} = (2 · δ + 1) / (2 · δ + 2)    (4.7)

which ranges from 75% (for δ equal to one, which is the most preemption- and migration-light setting) to arbitrarily close to 100% (for δ → ∞). Note that, since δ controls the length of the time slot, S (see Equation 4.2), its value can be used to trade off the target utilization bound against preemptions and migrations, as in S-EKG.
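A corresponding helper for NPS-F (again ours, purely illustrative) evaluates Equations 4.6 and 4.7:

#include <stdio.h>

/* Equation 4.6: original NPS-F inflation of a server with utilization u */
static double inflate_nps_f(double u, int delta) {
    return ((delta + 1) * u) / (u + delta);
}

/* Equation 4.7: original NPS-F utilization bound */
static double ub_nps_f(int delta) {
    return (2.0 * delta + 1) / (2.0 * delta + 2);
}

int main(void) {
    for (int delta = 1; delta <= 8; delta *= 2)
        printf("delta=%d: UB=%.3f, inflated capacity for U=0.5: %.3f\n",
               delta, ub_nps_f(delta), inflate_nps_f(0.5, delta));
    return 0;
}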

4.4.3 Reasoning about the parameter δ

In all S-EKG and NPS-F publications, an explanation can be found along the lines of “δ is a designer-set integer parameter, which controls the frequency of migration of split tasks and can be used to trade off the target utilization bound against preemptions and migrations”. The purpose of this subsection is to provide the reader with the required intuition about this parameter, because many parameters of these algorithms are computed based on δ.

Figure 4.4: The intuition behind the δ parameter: inset (a) δ = 1 (job τi,j misses its deadline), inset (b) δ = 2, inset (c) δ = 4; ai,j, di,j, and fi,j denote the arrival, deadline, and finishing time of job τi,j, respectively.

Originally, S-EKG and NPS-F compute S as S = (1/δ) · min_{τi∈τ}(Ti). This means that if the chosen δ is equal to one, then S is equal to min_{τi∈τ}(Ti); if δ is equal to two, S is half of min_{τi∈τ}(Ti); and so on. The intuition behind δ is illustrated by Figure 4.4. In the synthetic example presented in Figure 4.4, we assume that min_{τi∈τ}(Ti) is equal to three time units and that the inflated utilization of the server is equal to 0.5 (the reserve for server P˜q is graphically represented by a white rectangle). Let us consider the sporadic j-th job of task τi, whose Ci is equal to two time units and whose Ti is equal to 4.5 time units. From a deadline-miss point of view, the worst-case scenario for job τi,j is when its arrival, ai,j, happens at a time instant barely after the end of the associated reserve. The execution of job τi,j is graphically represented by a gray rectangle labelled with the job identifier (except in inset (c), where the rectangles are not labelled because the identifier does not fit into the rectangle). As can be seen from inset (a) of Figure 4.4, job τi,j misses its deadline; that is, fi,j is higher than di,j. In contrast, in both insets (b) and (c), job τi,j meets its deadline, because the duration of the longest continuous interval during which job τi,j is ready but cannot execute decreases as the value of δ increases. A higher δ means a higher utilization bound, however, at the cost of more preemptions and migrations due to the reduction of the time slot length.
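To make this concrete, here is a worked check of insets (a) and (b), under the assumptions stated above (our own arithmetic). In each case, the job alternates between waiting out the non-reserve part of the slot and executing within its reserve, starting from a worst-case arrival just after the reserve ends:

    δ = 1: S = 3, reserve = 0.5 · 3 = 1.5;
           fi,j = ai,j + 1.5 + 1.5 + 1.5 + 0.5 = ai,j + 5.0 > ai,j + 4.5 = di,j (miss)
    δ = 2: S = 1.5, reserve = 0.75;
           fi,j = ai,j + 5 · 0.75 + 0.5 = ai,j + 4.25 ≤ di,j (met)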

4.5 Instantaneous migration problem

In the original publications introducing the various slot-based task-splitting schemes under consideration [AB08, ABB08, BA09a, BA09b, BA11], a migrating server would migrate instantaneously from one processor to another. For example, observe the two consecutive time slots in Figure 4.5, where the reserves for the split server P˜q are temporally adjacent on processor Pp and processor Pp+1.

Figure 4.5: Illustration of the adjacent time slots that cause the instantaneous migration problem.

Each server always selects the highest-priority task for execution, so, in practice, the task executing on processor Pp has to immediately resume its execution in its reserve on processor Pp+1. Due to the many sources of unpredictability in a real-world operating system, this level of precision is not possible. Consequently, this can prevent the dispatcher of processor Pp+1 from selecting the task, because processor Pp has not yet relinquished that task. The result would be a waste of reserve time, possibly endangering the schedulability of the system. This could be avoided if the reserve on processor Pp+1 became available some time units later, by shifting the beginning of the time slot on processor Pp+1 (see Figure 4.6). In [SAT11] and [BA11], methods are proposed for calculating the optimal relative shifting for the time slots of processors that share split tasks, in the context of S-EKG and NPS-F, respectively. In the context of this dissertation, the method proposed in [BA11] was adopted, because it is, in fact, generalizable for any slot-based task-splitting scheduling scheme. According to that formulation, for a split server P˜q, which executes on two reserves x[Pp+1] and y[Pp], the optimal shifting is when the end of y[Pp] is separated from the start of x[Pp+1] by Ω (see Equation 4.8) time units. This means that the end of x[Pp+1] is then also separated from the start of y[Pp] by the same Ω time units. Note that this does not impose any penalty on non-split servers.

Ω = (S − y[Pp] − x[Pp+1])/2 (4.8)

We denote this approach as the staggered version of the slot-based task-splitting algorithms.


Figure 4.6: Illustration of the time slot shifting that solves the instantaneous migration problem. Although it is not shown in the figure, there is no penalty for non-split servers; that is, the system supplies the same execution time to the non-split servers.

Note that the Ω time units of shifting on each processor, Ω[Pp], are cumulative, and are computed as:

Ω[P1] = 0

Ω[Pp+1] = Ω[Pp] + (S − y[Pp] − x[Pp+1])/2 (4.9)
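As a sketch (ours; the reserve lengths are illustrative), the cumulative offsets of Equation 4.9 can be computed in one pass over the processors:

#include <stdio.h>

#define M 4

int main(void) {
    double S = 1.5; /* time slot length */
    /* y[p]: reserve at the end of the slot on P(p);
       x[p]: reserve at the start of the slot on P(p). Illustrative values. */
    double y[M] = {0.3, 0.4, 0.2, 0.0};
    double x[M] = {0.0, 0.3, 0.5, 0.2};
    double omega[M];

    omega[0] = 0.0;                 /* Omega[P1] = 0 */
    for (int p = 0; p + 1 < M; p++) /* Equation 4.9 */
        omega[p + 1] = omega[p] + (S - y[p] - x[p + 1]) / 2.0;

    for (int p = 0; p < M; p++)
        printf("Omega[P%d] = %.3f\n", p + 1, omega[p]);
    return 0;
}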

A collateral benefit of the staggered approach is that the required inflation for split servers decreases. As explained in Subsection 4.4.3, the reduction of the longest interval during which a split server cannot execute its tasks (resulting from the use of the staggered offsets) helps with schedulability, which permits a reduction of the server's utilization inflation. In this case, split servers receive a smoother supply of processor time for executing their tasks. Another weakness of synchronized (non-staggered) time slots is that they cause m task migrations at the same instant. Assuming that every task migration implies a cache miss event, this in turn implies m processors fetching data from main memory, causing contention for the cache memory and also for the memory bus. We empirically evaluated these effects. For that purpose, we ran a set of experiments on a machine equipped with an Intel i7 processor running Linux kernel version 3.2.11 (with the hyperthreading feature disabled). This is a 4-core processor with an L3 cache of 8 MB, shared by all cores. The code of the task used to run this set of experiments is shown in Listing 4.1. This task receives two parameters: cpu and time0. cpu is the logical identifier of the processor that the task is assigned to. time0 is used to define the initial absolute time instant of the experiment. There is an initial setup phase that comprises the following steps. The first is setting the processor affinity of the task; note that, with this feature, the task is not allowed to migrate. The second step is to set the scheduling policy. The task is scheduled according to the SCHED_FIFO policy (tasks scheduled according to this policy, in the Linux kernel, are given the processor for as long as they want it, subject only to the needs of higher-priority tasks) and it is the highest-priority task in the system. The third step is to initialize an array of 32 MB in size. The last step is to put the task into sleep mode until the absolute time of time0 plus 20 seconds. After this initial phase, the task periodically (with a period of two seconds) reads a 32 MB array in a non-sequential pattern. It reads the array 5000 times and, for each read, it measures the time consumed by that read. At the end, it computes the average read time. Note that, given the reading pattern and the size of the array, the processor has to constantly fetch the task's data from main memory. We created seven task sets. The first task set is composed of only one task, running on processor P1. Hence, there is no contention for either the L3 cache or the memory bus, because this is the lone task in the system and has all resources available to it. The remaining six task sets are composed of four tasks each. For these task sets, we applied the partitioned approach; that is, each task was executed on only one processor and was not allowed to migrate. In the second task set, we simulated the behaviour under synchronous time slots by setting the same time0 for all tasks, while in the third, fourth, fifth, sixth, and seventh task sets we simulated staggered time slots by setting time0 with a shift of 0.1, 0.2, 0.3, 0.4, and 0.5 seconds from the previous task, respectively.

2 http://www.intel.com/content/www/us/en/processors/core/core-i7-processor.html

...
#define NSEC_PER_SEC 1000000000L
#define VEC_SIZE 4194304

/* static: a 32 MB array would not fit on the stack */
static unsigned long long vector[VEC_SIZE];

int main(int argc, char* argv[])
{
    struct sched_param param;
    cpu_set_t cpu_set;
    struct timespec next_release, t, t1;
    unsigned int i, cpu;
    unsigned long long nj = 0, x, sum_exec = 0, release, time0;

    cpu = (unsigned int) atoi(argv[1]);
    time0 = (unsigned long long) atoll(argv[2]);
    srand(SEED);

    /* first step: pin the task to the given processor */
    CPU_ZERO(&cpu_set);
    CPU_SET(cpu, &cpu_set);
    sched_setaffinity(0, sizeof(cpu_set_t), &cpu_set);

    /* second step: highest-priority SCHED_FIFO task */
    param.sched_priority = 1;
    sched_setscheduler(0, SCHED_FIFO, &param);

    /* third step: initialize the array (loop body reconstructed) */
    for (i = 0; i < VEC_SIZE; i++)
        vector[i] = rand() % VEC_SIZE;

    /* last step: sleep until time0 plus 20 seconds */
    release = time0 + 20 * NSEC_PER_SEC;
    next_release.tv_sec = release / NSEC_PER_SEC;
    next_release.tv_nsec = release % NSEC_PER_SEC;
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next_release, NULL);

    for (nj = 0; nj < 5000; nj++) {
        clock_gettime(CLOCK_MONOTONIC, &t);
        /* non-sequential read of the whole 32 MB array
           (access pattern reconstructed) */
        x = vector[0];
        for (i = 1; i < VEC_SIZE; i++)
            x = vector[(x + i) % VEC_SIZE];
        clock_gettime(CLOCK_MONOTONIC, &t1);
        sum_exec += (t1.tv_sec - t.tv_sec) * NSEC_PER_SEC
                    + (t1.tv_nsec - t.tv_nsec);

        release += 2 * NSEC_PER_SEC;
        next_release.tv_sec = release / NSEC_PER_SEC;
        next_release.tv_nsec = release % NSEC_PER_SEC;
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next_release, NULL);
    }
    ...
    return 0;
}

Listing 4.1: The C language code of the task used to investigate the impact of the contention for the shared cache and the memory bus.


Figure 4.7: Experimental evaluation of the impact of cache and memory bus contention on the time to read an array. These experiments highlight the benefit of the staggered over the aligned versions.

Figure 4.7 plots the average array read time of each task. We use the average read time because it is a measure of central tendency. As can be seen from that figure, synchronising the reads of the tasks introduces a great penalty on the reading time. This is due to the fact that processors have to contend for the L3 cache and also for the memory bus. The impact of these effects decreases with asynchronous tasks. This becomes clearer when the shift is larger than the reading time, which happens for shifts larger than 0.3 seconds. For those task sets, the average array read time is very close to the value measured when there is only one task.

However, for shifts of 0.1 and 0.2 seconds, the tasks that execute on processors P2 and P3 present higher values than the tasks that execute on processors P1 and P4. In fact, the tasks executing on processors P2 and P3 never execute alone, while the tasks executing on processors P1 and P4 execute alone during some part of their reading time. In any case, the average array read time is smaller than for the synchronous task set.

4.6 Implementation of the slot-based task-splitting dispatcher

As mentioned before, the scheduling algorithms under consideration in this work require an off-line procedure that maps tasks to processors and a run-time task-dispatching algorithm. This section concerns the run-time task-dispatching algorithm, whose implementation has to be done inside the kernel of the operating system. Before describing the implementation of the slot-based task-splitting dispatcher, let us present our reasons for choosing a PREEMPT-RT-patched Linux kernel version for implementing the dispatcher. In recent years, a kernel developer community (composed of Ingo Molnar, Thomas Gleixner, and others) has been working on the PREEMPT-RT patch [PR12]. This patch (which aims at a fully preemptible kernel) adds some real-time capabilities to the Linux kernel.

The PREEMPT-RT patch addresses sources of timing unpredictability in the mainline Linux kernel by making most of the Linux kernel preemptible, by implementing priority inheritance (to avoid priority inversion phenomena), and by converting interrupt handlers into preemptible kernel tasks. Further, this community is closely linked to the mainline Linux kernel, and indeed several features from the PREEMPT-RT patch have been merged into the mainline Linux kernel. Other features remain in PREEMPT-RT, because the mainline Linux kernel is a GPOS and some PREEMPT-RT features are not suitable for a GPOS. For instance, the high preemption level of the PREEMPT-RT patch decreases the scheduling latencies but increases the overhead caused by such events, because they happen more often. However, it is a fact that real-time scheduling algorithms have been ignored by the PREEMPT-RT community (although we are not aware of the reasons for this). Therefore, we try to contribute by providing the PREEMPT-RT-patched Linux kernel with suitable real-time scheduling algorithms for multiprocessor systems. Furthermore, our implementations aim to be usable as a testbed framework for academic purposes. Incidentally, although the comparative evaluations between the Linux and the co-kernel approaches presented in [BLM+08, BM10] claim better performance for the co-kernel approaches, none of that work claims that Linux (mainline or PREEMPT-RT-patched) could not (be made to) support real-time systems. During the work described in this dissertation, we first implemented S-EKG [SAT10b, SAT11] and subsequently implemented both scheduling algorithms (S-EKG and NPS-F) [SBTA11] in a unified way; that is, using the same implementation for both. In all of the mentioned implementations, we used a scheduling class called retas_sched_class (ReTAS). The new scheduling class was added on top of the native Linux module hierarchy, thus making ReTAS the highest-priority scheduling class [ReT12]. However, such an approach cannot be employed in conjunction with the PREEMPT-RT patch, because important functionalities needed by the ReTAS scheduling class, such as timer interrupt handlers, are implemented within the RT scheduling class, and thus ReTAS cannot have a higher priority than the RT scheduling class implemented in the PREEMPT-RT patch. Thus, in this work, our focus is on the implementation where we added slot-based task-splitting scheduling algorithms to the RT scheduling class of the PREEMPT-RT-patched Linux 3.2.11-rt20 kernel version [SPT12]. Therefore, we added to the RT scheduling class support for scheduling algorithms that are suitable for real-time systems running on multiprocessor platforms.

4.6.1 Scheduler implementation

Recall that the run-time task-dispatching algorithm of the scheduling algorithms under consideration divides the time into equal-duration time slots. Each time slot is composed of at most three time reserves. The time slot is instantiated on a per-processor basis; that is, each processor divides its time into time reserves. Each processor reserve is allocated to only one server. Split servers have reserves assigned on two processors, while non-split servers have one reserve. Each server has one or more tasks assigned to it, which are scheduled according to EDF. From this description, we can view the task-dispatching algorithm as a two-level hierarchical algorithm. The top-level scheduling algorithm allocates processors to servers for the duration of the respective server reserve(s) and ensures that, for each server, at any time there is at most one processor allocated to it. The low-level scheduling algorithm employs the EDF policy to schedule each server's tasks.


Before describing the implementation, let us highlight how the scheduler works in a multiprocessor system. As mentioned before, the scheduler is implemented through the schedule function. This function is invoked directly by kernel code which, in a multiprocessor system, is being executed on some processor. For instance, consider a task that is executing on processor Pp and completes. Upon the completion of the task, the exit system call is invoked, which performs a set of clean-up operations to free the task's resources and, in turn, invokes the schedule function, which executes on processor Pp. Therefore, the scheduler works on a per-processor basis. As a rule of thumb, the implementation of the task-dispatching algorithm should avoid interactions across processors; that is, it should isolate the scheduling decisions on one processor from activities and decisions on other processors. Figure 4.8 shows the data required for implementing the task-dispatching algorithm. As can be seen from Figure 4.8, the data is divided into two parts: one part associated with processors and another associated with servers. For each processor, we added to the rq data structure a new data structure called retas_sb_rq.

Figure 4.8: The required data used to implement the task-dispatching algorithm and its organization.

Next, we discuss our implementation approach with respect to those data elements, in our implementation for Linux 3.2.11-rt20. Each server is an instance of a new data structure called server, which stores the following information: (i) a field that uniquely identifies the server; (ii) an atomic variable that exposes the server state; and, finally, (iii) a ready-queue, which is a priority queue wherein ready tasks are stored. The Linux kernel defines one ready-queue per processor. Such an arrangement is natural under partitioning, but potentially inefficient in the presence of migrating tasks. If tasks are stored locally on each processor (on a respective per-processor ready-queue), a task migration requires locking the ready-queues of both the origin and the destination processor. This may introduce big overheads, especially if the locking has to be serialised. In the case of the slot-based task-splitting scheduling algorithms, where migrations involve the entire server, this would typically imply moving multiple tasks (hence, even higher overheads). Given that the frequency of migration may be high, this is obviously not the best approach. We opt instead for one ready-queue per server (see Figure 4.8). Under this approach, all ready tasks assigned to a server are always stored on (i.e. inserted into/dequeued from) the same respective (per-server) ready-queue.
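The following declarations sketch these data elements (field names follow Figure 4.8; the exact declarations in our implementation may differ, and some fields are explained only in the paragraphs below, so this is illustrative rather than definitive):

#include <linux/types.h>
#include <linux/rbtree.h>
#include <linux/hrtimer.h>
#include <linux/atomic.h>

struct server {
    int id;                     /* unique logical server identifier */
    atomic_t flag;              /* 0: free; 1: locked by some processor */
    struct rb_root ready_queue; /* red-black tree keyed by absolute deadline */
};

struct reserve {
    u64 res_len;                /* length of this reserve */
    int srv_id;                 /* server mapped to this reserve */
};

/* embedded in the per-processor rq data structure */
struct retas_sb_rq {
    u64 time_slot;              /* time slot length, S */
    u64 begin_curr_time_slot;   /* advanced by S; per-processor, lock-free */
    struct hrtimer timer;       /* fires at the beginning of each reserve */
    struct server *curr_srv;    /* server of the currently active reserve */
    struct reserve composition[3]; /* at most three reserves: x, N, y */
};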

Then, when a server migrates (i.e. with all its tasks) from processor Pp to processor Pp+1, we simply change the ready-queues used by processors Pp and Pp+1. We implement server ready-queues as red-black trees. This data structure was deemed appropriate because: (i) the Linux kernel already implements red-black trees; (ii) they are balanced binary trees whose nodes are sorted by a key (which, in our implementation, is the absolute deadline of each job); (iii) most operations take O(log n) time (good for performance if they are frequent). In order to associate a task with a server, all that is needed is a link between each task and its corresponding server. Under Linux, a task is an instance of a program in execution, and the kernel uses one instance of a data structure (of type struct task_struct) for each task. Therefore, we simply added a field to that data structure to hold the (unique) logical identifier of the respective server. The top-level scheduler is responsible for the management of time reserves. A timer is used to trigger scheduling decisions at the beginning of each reserve. The Linux kernel offers high-resolution timers with nanosecond resolution, and timers are set on a per-processor basis. Each processor is configured with its time slot composition. The variable begin_curr_time_slot holds (as suggested by its name) the beginning of the current time slot and is incremented by S (the time slot length). Observe that no synchronization mechanism is required for updates to this variable. The time slot composition is defined by an array of 2-tuples (see Figure 4.8). Each element of this array maps a reserve of length res_len to the respective server, srv_id. For optimization purposes, there exists a per-processor variable curr_srv that points to the current server. In other words, whenever the timer expires at the beginning of a reserve, the curr_srv variable is updated with the corresponding server address. The low-level scheduler picks the highest-priority task from the server pointed to by curr_srv. However, if the current server is a split server, there is no guarantee that this server is not also the current server of another processor. At the design stage, we solved this problem in principle with the staggered time slot approach, but in practice there is no absolute guarantee that this cannot happen. To motivate the synchronization mechanism described below, let us assume a synchronized time slot scheme, which brings with it the instantaneous migration problem. In such a scheme, whenever a split server consumes its reserve on processor Pp, whichever task was executing at the time has to immediately resume execution on another reserve on processor Pp+1. However, due to many sources of unpredictability, the scheduler of processor Pp+1 can be prevented from selecting the task in consideration for execution, because processor Pp has not yet relinquished (the ready-queue associated with) that task. Two solutions could be used to solve this issue. One solution could be for processor Pp+1 to send an IPI to Pp, asking it to relinquish (the ready-queue of) server P˜q.

Another solution could be for Pp+1 to set up a timer to expire x time units in the future, forcing the invocation of its scheduler.

static void
enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags)
{
    struct sched_rt_entity *rt_se = &p->rt;

    /* ReTAS tasks are enqueued into their server's ready-queue */
    if (retas_policy(p->policy)) {
        _enqueue_task_retas(rq, p);
        return;
    }
    ...
}

Listing 4.2: Changes to the enqueue_task_rt function.

We chose the latter option for two reasons: (i) we know that, if a scheduler has not yet relinquished the server ready-queue, it is because something is preventing it from doing so (e.g. the execution of an ISR); and (ii) using IPIs in this manner introduces a dependency between processors, because the scheduling decision on one processor becomes dependent upon the reception of an interrupt sent by another processor. In our implementation, to serialise the access to the ready-queue of a server, a per-server atomic variable (called flag) states whether a task of that server is being executed by a processor. Whenever a processor's scheduler is invoked, it tries to increment that variable (of type atomic_t) by one, using the atomic_add_unless kernel function. The atomic_t data type defined in the Linux kernel is simply an integer with a set of operations guaranteed to be atomic without any need for explicit locking. The increment operation can only succeed if the value of the flag is not one (i.e. if it is zero); otherwise, it fails. In the first case (success), by atomically incrementing the flag, the scheduler locks the server to that processor. In the second case (failure), the scheduler cannot pick any task from this server; it then sets up a timer to expire some time later, which will trigger the invocation of the scheduler anew.
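As a minimal sketch of this locking protocol (the helper names are ours, not those of the actual implementation):

#include <linux/atomic.h>

/* Try to lock a server to the calling processor: atomically move the
   flag from 0 to 1. atomic_add_unless adds 1 unless the value is
   already 1, and returns non-zero on success. */
static int retas_try_lock_server(atomic_t *flag)
{
    return atomic_add_unless(flag, 1, 1);
}

/* Relinquish the server when leaving its reserve. */
static void retas_unlock_server(atomic_t *flag)
{
    atomic_set(flag, 0);
}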

4.6.2 Adding support for slot-based task-splitting scheduling algorithms to the RT scheduling class

Apart from the required code to manipulate servers (enqueueing and dequeueing ReTAS tasks, as well as getting the task with the earliest absolute deadline) and the time slot infrastructure, incorporating ReTAS into the RT scheduling class implies a set of modifications to some functions of the rt_sched_class scheduling class. Those functions are enqueue_task_rt, dequeue_task_rt, check_preempt_curr_rt, and pick_next_task_rt. The function enqueue_task_rt (see Listing 4.2) is called whenever a RT task enters a ready state. It receives two pointers: one to the ready-queue of the processor that is running this code, rq, and another to the task that is becoming ready, p. If the ready task is a ReTAS task (in which case it is also a RT task, with a priority level, but scheduled according to the SCHED_NPS_F or SCHED_S_EKG scheduling policies), then it is enqueued into the respective server ready-queue. When a RT task is no longer ready, the dequeue_task_rt function is called to remove the task from the respective server ready-queue.

static void
check_preempt_curr_rt(struct rq *rq, struct task_struct *p, int flags)
{
    if (p->prio <= rq->curr->prio) {
        if (retas_policy(p->policy)) {
            /* preempt only if the current reserve belongs to p's server */
            if (_check_preempt_curr_retas(rq)) {
                resched_task(rq->curr);
                return;
            }
        }
    }
    ...
}

Listing 4.3: Changes to the check_preempt_curr_rt function.

As the name suggests, the check_preempt_curr_rt function (see Listing 4.3) checks whether the currently running task must be preempted or not (e.g. when a RT task wakes up). It receives two pointers: one to the ready-queue of the processor that is running this code, rq, and another to the woken-up task, p. If the priority of the woken-up task is higher than or equal to that of the currently executing task (pointed to by rq->curr; lower prio values mean higher priority), it checks whether p is a ReTAS task and whether the current reserve is mapped to the server that task p is assigned to. If that is the case, then the currently running task is marked for preemption. The pick_next_task_rt function also needs a small modification. This function selects the task to be executed by the current processor and is called by the scheduler whenever the currently executing task is marked to be preempted or finishes its execution. It first gets the highest-priority RT task, then it gets the highest-priority ReTAS task. Afterwards, it selects the highest-priority task among them (if any exists); otherwise, it returns NULL, which forces the scheduler to inquire the CFS scheduling module. So far, we have presented the generic slot-based scheduling algorithm and addressed and discussed all relevant details related to the operating system as well as the implementation of slot-based algorithms. In the next section, we identify all run-time overheads incurred by such algorithms.

static struct task_struct *pick_next_task_rt(struct rq *rq)
{
        struct task_struct *p = _pick_next_task_rt(rq);
        struct task_struct *t = _pick_next_task_retas(rq);

        if (t) {
                if (p) {
                        if (t->prio <= p->prio)
                                return t;
                } else {
                        return t;
                }
        }
        ....
        return p;
}

Listing 4.4: Changes to the pick_next_task_rt function.

4.7 Scheduling overheads

In order to carry out an overhead-aware schedulability analysis (presented in the next chapter), we first need to identify the overheads that may be incurred at run-time because of the mechanisms used in the implementation of the scheduling algorithms. In this section, we provide an overview of the overheads that may arise in an implementation of a slot-based task-splitting scheduling algorithm. This overview is based on our implementation [SPT12] of S-EKG and NPS-F in the Linux kernel for the x86-64 architecture described above. The overheads that a system may incur because of a scheduling algorithm are related to the following five mechanisms: (i) interrupts; (ii) timers; (iii) ready-queues; (iv) context switching; and (v) caches. We examine the overheads of each mechanism in turn.

Most real-time systems interact with their environment and use interrupts whenever they need to react to environment events. We assume that interrupt handlers, or ISRs, are implemented as tasks, as supported by the PREEMPT-RT Linux kernel [PR12]. Nevertheless, the occurrence of an interrupt suspends the execution of the currently running task to release the task that will service this interrupt. Furthermore, depending on the deadline of the released task, it may cause a preemption of the currently running task. A special kind of interrupt is the inter-processor interrupt, IPI. As the name suggests, these interrupts are generated by one processor and handled on another, and may be used by a processor to notify another of the occurrence of events. The processing of an IPI by the target processor is similar to that of an interrupt generated by the environment. Our algorithms use the IPI only in the implementation of split servers; more specifically, when a job whose priority is higher than that of all ready jobs of its server arrives on a processor at a time instant that falls within the reserve of that server on the other processor. In this case, the newly arrived job should immediately start execution in the server's reserve on the other processor. We denote the delay incurred by the use of the IPI in the dispatching of a task the IPI Latency, IpiL.

Timers are a per-processor mechanism of the Linux kernel designed for scheduling computations some time in the future. Our algorithms use timers to release tasks and also to trigger server switches at the end of each time reserve. Timers are implemented using a priority queue and interrupts generated by some timer/counter device; they incur overheads related to the handling of these interrupts as well. Timer interrupts differ from other interrupts in that they are not handled by separate tasks, but immediately upon the occurrence of the interrupt. Thus, the expiration of a timer suspends the execution of the current task on that processor. Another imperfection associated with timers is that they do not measure time intervals precisely. We denote the delay incurred in the release of periodic tasks because of these imperfections the Release Jitter, RelJ.

The Linux kernel keeps released tasks that are ready to run in queues, known as ready-queues. Therefore, when a task is released, the kernel moves the task to a ready-queue, and the dispatcher is invoked to select the next task to run, which may be the task that was running before the release, the newly released task, or some other ready task.
In the case of the slot-based task-splitting algorithms considered, all these data structures are either private to a processor or shared by two processors. Nevertheless, the release of a task requires some processing, which we call the Release Overhead, RelO.

A context switch (or task switch) occurs whenever the dispatcher decides to change the running task on a processor. This entails saving the state of the processor to an operating system data structure associated with the task being evicted, and restoring the state of the processor from the corresponding data structure associated with the task being allocated to the processor. We use the Context-switch Overhead, CtswO, to account for this cost.

The worst-case execution time of a task is usually computed assuming that the task executes without being preempted. However, when a task is preempted, it may incur additional costs when it resumes, because the cache lines with its data may have been evicted by other tasks and need to be fetched again from main memory or from higher cache levels. Likewise, migrating a task from one processor to another requires the destination processor to fetch the task's cache footprint anew. These costs are known as Cache-related Preemption and Migration Delays (CPMD). To incorporate the CPMD, we pessimistically assume that every preemption incurs the worst-case CPMD Overhead, CpmdO. Furthermore, we do not distinguish between job preemption and job migration events. This simplification is not as pessimistic as it may seem, because there is evidence [BBA10a, BBA10b] to suggest that, in a heavily loaded system, the CPMD costs of preemptions and migrations can be similar.

Although in this section we have discriminated the different sources of overhead associated with slot-based task-splitting scheduling algorithms, in the analysis devised in the next chapter we sometimes lump into a single parameter overheads of different sources that occur together in sequence. The reasons for this are twofold. First, this leads to shorter expressions. Second, it simplifies the experimental measurement of the overheads and often leads to more precise experimental estimates.
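For later reference, the overhead bounds identified in this section can be gathered in a single structure; this is only an illustrative sketch for the analysis of the next chapter, not a data structure of our implementation:

    /* Upper bounds on the run-time overheads identified above,
     * all expressed in the same time unit (e.g. nanoseconds). */
    struct overhead_bounds {
            long long rel_j;   /* RelJ:  release jitter (timer imprecision)       */
            long long rel_o;   /* RelO:  release overhead (ready-queue insertion)  */
            long long ctsw_o;  /* CtswO: context-switch overhead                   */
            long long cpmd_o;  /* CpmdO: cache-related preemption/migration delay  */
            long long ipi_l;   /* IpiL:  inter-processor interrupt latency         */
    };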

4.8 Summary

In this chapter, we presented a generic description of slot-based task-splitting scheduling algorithms and defined the terminology used in the remainder of this dissertation. We showed how to relate the particular algorithms to the generic one. We also clarified the impact of one of the most important parameters of slot-based task-splitting scheduling algorithms, called δ. We identified one practical problem (the instantaneous migration problem) of slot-based task-splitting scheduling algorithms and showed how it can be solved. Since the Linux kernel does not support any suitable mechanism for frequent task migrations, we detailed how to address that issue. Moreover, since neither the mainline Linux kernel nor the PREEMPT-RT patch supports any scheduling algorithm suitable for real-time multiprocessor systems, we described how we incorporated support for slot-based task-splitting scheduling into the PREEMPT-RT-patched Linux kernel. Note that we follow a different approach from other related works: we incorporate the slot-based task-splitting scheduling algorithms into the RT scheduling class itself. Finally, we identified and described the run-time overheads incurred by the slot-based task-splitting scheduling algorithms running in Linux.

Chapter 5

Slot-based task-splitting overhead-aware schedulability analysis

5.1 Introduction

Real-time multiprocessor scheduling has seen, in recent years, the flourishing of semi-partitioned scheduling algorithms. This category of scheduling schemes combines elements of partitioned and migrative scheduling for the purpose of achieving efficient utilization of the system's processing resources with strong schedulability guarantees and with low dispatching overheads. The subclass of slot-based task-splitting scheduling algorithms, in particular, offers very good trade-offs between schedulability guarantees (in the form of high utilization bounds) and the number of preemptions and migrations involved. However, until now no unified schedulability theory existed for such algorithms; each was formulated with its own accompanying analysis.

In this chapter, we formulate a unified schedulability theory for slot-based task-splitting, applicable to the scheduling algorithms under consideration in this dissertation, S-EKG and NPS-F. This new theory is based on exact schedulability tests, thus also overcoming many sources of pessimism in existing analyses. Furthermore, as a response to the fact that many unrealistic assumptions present in the original theory tend to undermine the theoretical potential of such scheduling schemes, we identified and modelled into the new analysis all overheads incurred by the algorithms in consideration. The outcome is a new overhead-aware schedulability analysis that permits increased efficiency and reliability. We first provide the basic concepts used by the new theory, then present the new schedulability analysis, and end this chapter with a description of how to implement the new theory.

5.2 Basis for the overhead-aware schedulability analysis

The scheduling algorithms under consideration in this dissertation assume that servers schedule their tasks according to the EDF scheduling algorithm. For preemptive uniprocessor systems, any periodic task set with implicit deadlines is schedulable under the EDF scheduling policy if the condition \(\sum_{\tau_i \in \tau} u_i \leq 1.0\) holds [LL73], and EDF is optimal [Der74] among all uniprocessor scheduling algorithms. In other words, if a task set is schedulable (no job misses its deadline) under some scheduling algorithm on a preemptive uniprocessor system, it is also schedulable under EDF.

The EDF schedulability analysis for sporadic task sets with arbitrary deadlines is based on the concepts of processor demand and the demand bound function (dbf) [BMR90]. The dbf gives an upper bound (over every possible time interval [t0, t0 + t) of length t) on the processor demand, defined as the aggregate execution requirement of all jobs released at time t0 or later and whose absolute deadlines lie before time t0 + t.

In the context of pure partitioned scheduling with arbitrary-deadline task sets (meaning that Di may differ from Ti for any task), the execution demand of the set of tasks assigned to processor Pp, τ[Pp], in a time interval of duration t is computed as:

    dbf(\tau[P_p],t) = \sum_{\tau_i \in \tau[P_p]} \max\left(0, \left\lfloor \frac{t - D_i}{T_i} \right\rfloor + 1\right) \cdot C_i \qquad (5.1)

Similarly (in the context of pure partitioned scheduling), the supply bound function (sbf) gives a lower bound on the amount of execution time supplied to the task set assigned to processor Pp over a time interval of length t without any constraint and is computed as follows:

sbf(τ[Pp],t) = t (5.2)

Intuitively, a partitioned real-time system is schedulable if the following inequality holds:

    dbf(\tau[P_p],t) \leq sbf(\tau[P_p],t) \qquad (5.3)

on every processor Pp and for every interval length t > 0. Consider a task set composed of only one task with the following parameters: worst-case execution time, Ci, equal to two; minimum inter-arrival time, Ti, equal to seven; and deadline, Di, equal to five. Thus, it is an arbitrary-deadline task. Inset (a) of Figure 5.1 shows the execution demand of that task in an interval of 18 time units against the processing capacity supplied by the system (assuming a uniprocessor system). As expected, the execution demand until the first deadline (time instant five) is two time units, increases to four at the second deadline (time instant 12), and so on. Let us assume that the release jitter is non-negligible, which means that the release of such a task is delayed by RelJi. In practice, this means that the task is only ready for execution RelJi time units after its arrival time. In other words, the task has to complete within Di − RelJi time units.
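Before turning to the jitter-aware version, a quick numeric check of Equation 5.1 for this example (the helper below is ours, for illustration only):

    #include <stdio.h>

    /* dbf of a single task (Equation 5.1): max(0, floor((t - D)/T) + 1) * C */
    static long dbf_single(long C, long T, long D, long t)
    {
            return (t < D) ? 0 : ((t - D) / T + 1) * C;
    }

    int main(void)
    {
            /* C = 2, T = 7, D = 5: demand is 2 at t = 5 and 4 at t = 12 */
            for (long t = 0; t <= 18; t++)
                    printf("t=%2ld dbf=%ld\n", t, dbf_single(2, 7, 5, t));
            return 0;
    }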

In order to consider such overhead, the dbf(τ[Pp],t) is changed to [Spu96, ZB08]:

    dbf(\tau[P_p],t) = \sum_{\tau_i \in \tau[P_p]} \max\left(0, \left\lfloor \frac{t - (D_i - RelJ_i)}{T_i} \right\rfloor + 1\right) \cdot C_i \qquad (5.4)

Inset (b) of Figure 5.1 plots the execution demand considering that RelJi is equal to one. As can be observed, the processing demand of the task is shifted one time unit earlier due to the reduction of the execution time window.

[Plot: insets (a) without any overhead and (b) with release jitter overhead; each shows dbf and sbf versus t over [0, 18].]

Figure 5.1: Illustration of the execution demand of a task against the processing supplied by the system.

Let us now consider that the task executes only inside a reserve of length Reslen, made available periodically every S time units. The time supplied for the execution of such a task on processor Pp over a time interval of length t is lower-bounded by [SBAT11]:

    sbf(S, Res^{len}, t) = \left\lfloor \frac{t}{S} \right\rfloor \cdot Res^{len} + \max\left(0, \left(t - \left\lfloor \frac{t}{S} \right\rfloor \cdot S\right) - \left(S - Res^{len}\right)\right) \qquad (5.5)

Consider that Res^{len} is equal to three and S is equal to five. Inset (a) of Figure 5.2 plots the time supplied (sbf) against the time demanded (dbf) in a time interval of 18 time units. Inset (b) of Figure 5.2 illustrates graphically the relation between the time supplied and the time demanded. Intuitively, the worst-case phasing for task execution occurs when the task arrives an infinitesimal amount of time after the respective reserve has finished, and the worst-case phasing for supply is when the arrival time of a task coincides with the beginning of the reserve period.

The first term of Equation 5.5, ⌊t/S⌋ · Res^{len}, computes the amount of time supplied over the ⌊t/S⌋ periods integrally contained within the time interval under consideration: for t = 18, ⌊18/5⌋ · 3 = 3 · 3 = 9 time units. The second term, max(0, (t − ⌊t/S⌋ · S) − (S − Res^{len})), computes the time supplied over the remaining tail (the last, incomplete period): in our example, during this tail, the system supplies max(0, (18 − ⌊18/5⌋ · 5) − (5 − 3)) = max(0, 18 − 15 − 2) = max(0, 1) = 1 time unit. Therefore, this system supplies 9 + 1 = 10 time units in a time interval of 18 time units for executing the task. Next, we use demand- and supply-based functions to derive the new schedulability analysis for slot-based task-splitting scheduling algorithms.
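The following small C helper (again, illustrative only) implements Equation 5.5 and reproduces the computation above:

    /* sbf(S, Res_len, t) per Equation 5.5: each full period contributes
     * Res_len; the incomplete tail contributes whatever remains after the
     * (S - Res_len) gap at the start of the period. */
    static long sbf_reserve(long S, long res_len, long t)
    {
            long full = t / S;                          /* floor(t/S) */
            long tail = (t - full * S) - (S - res_len);
            return full * res_len + (tail > 0 ? tail : 0);
    }
    /* sbf_reserve(5, 3, 18) == 3*3 + max(0, 3 - 2) == 10 */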

[Plot and execution time-line: inset (a) dbf and sbf versus t over [0, 18] without any overhead; inset (b) execution time-line of a job (arrival ai,j, deadline di,j, finish fi,j) over consecutive time slots of length S.]

Figure 5.2: Illustration of the execution demand of a task against the processing supplied by the system considering time reserves.

5.3 New demand-based and overhead-aware schedulability analysis

The original schedulability analysis for slot-based task-splitting scheduling algorithms was utilization-based. While this simplifies the derivation of utilization bounds, it also entails pessimism. In [ABB08], the move towards processor-demand-based analysis was not carried out in a way that would preserve the most useful theoretical properties (namely, the utilization bound) of previous work (S-EKG). Therefore, in [SBAT11], the authors present a schedulability analysis based on processor demand specific to the S-EKG scheduling algorithm. In this section, a new schedulability analysis, based on processor demand, is introduced that can be applied to both S-EKG and NPS-F [SBT+13]. This new schedulability analysis supersedes all previous utilization-based analyses. Further, it defines new schedulability tests that incorporate all real-world overheads incurred by one implementation of the S-EKG and NPS-F algorithms [SPT12]. The schedulability analysis that we develop in this section has two stages, corresponding to the two main stages of the task-to-processor mapping algorithm presented in Section 4.3. In the first stage, the analysis focuses on the schedulability of the tasks assigned to each server, assuming that each server executes in isolation on a processor. The second stage examines whether there is enough capacity to accommodate all servers in the system. We present each stage of the new demand-based overhead-aware schedulability analysis in its own subsection.

5.3.1 New demand-based schedulability test for mapping tasks to servers

In this subsection, we derive a schedulability test for the tasks mapped to a server, based on demand-bound functions and taking into account the overheads described in Section 4.7. This leads to a new task-to-server mapping algorithm. For the purpose of mapping tasks to servers, we consider that a server is allocated a processor exclusively, i.e. it runs on a single processor that it does not share with any other server. Hence, we treat each server as a uniprocessor system. Our analysis is based on the concept of demand-bound functions, which specify an upper bound on the aggregate execution requirements of all jobs (of τ[P̃q]) over any possible interval of length t. Therefore, the demand-based schedulability test for a server P̃q is given by:

    dbf^{part}(\tilde{P}_q,t) \leq t, \quad \forall t > 0 \qquad (5.6)

We use the superscript “part” (which stems from “partitioned”) on all the dbfs of this stage to distinguish them from the functions of the second stage. Ignoring all overheads and assuming sporadic task sets with arbitrary deadlines, dbf^{part}(P̃q,t) can be computed as:

    dbf^{part}(\tilde{P}_q,t) = dbf^{part}(\tau[\tilde{P}_q],t) = \sum_{\tau_i \in \tau[\tilde{P}_q]} \max\left(0, \left\lfloor \frac{t - D_i}{T_i} \right\rfloor + 1\right) \cdot C_i \qquad (5.7)

Next, we incorporate each source of overhead into the new overhead-aware schedulability analysis, one at a time. First, we consider the overheads caused by the release of tasks. We assume that all tasks are periodic, because this corresponds to the worst case. For periodic tasks we need to take into account not only the release overhead, but also the release jitter caused by timers; therefore, the effects of timers and task releases are considered together. Next, we consider the effects of context switching and CPMD. Finally, we incorporate the effect of interrupts other than those caused by timers. Although other mechanisms may be used to release tasks, for the sake of clarity, we assume that task release is done by the use of timers. Therefore, the release of periodic tasks is affected by two of the overheads discussed in Section 4.7: the release overhead and the release jitter.

Figure 5.3 graphically shows these two overheads for job τi,j. (In all figures, the execution of a job is graphically represented by a rectangle labelled with the job identifier.) As illustrated, the effects of these two overheads are different: whereas both the release jitter of job τi,j, RelJi,j, and the release overhead of job τi,j, RelOi,j, reduce the amount of time that job τi,j has to complete its execution, only the release overhead actually requires processing time. Thus, we model the effect of these two overheads differently. Let RelJ and RelO be the upper bounds on the release jitter and on the release overhead, respectively. As shown in Figure 5.3, the release jitter decreases the amount of time available to complete a task, i.e., in the worst case, τi has Di − RelJ time units to complete. Therefore, we modify dbf^{part}(τ[P̃q],t) to:

    dbf^{part}(\tau[\tilde{P}_q],t) = \sum_{\tau_i \in \tau[\tilde{P}_q]} \max\left(0, \left\lfloor \frac{t - (D_i - RelJ)}{T_i} \right\rfloor + 1\right) \cdot C_i \qquad (5.8)

[Time-line on processor Pp: the timer interrupt for waking up τi,j fires RelJi,j after the arrival ai,j; inserting τi,j into the ready queue takes RelOi,j; di,j marks the deadline.]

Figure 5.3: Illustration of release jitter and release overhead.

Concerning the release overhead, one way of modelling it could be by increasing the execution demand of a task accordingly. However, that approach does not work properly when multiple tasks are released close together in time. The reason is that the release overhead contributes immediately to the processor demand; that is, to model the processor demand correctly, it should be increased by RelO time units at the time of the release, not at the deadline of the released task. Therefore, we instead model the release overhead as higher-priority interfering workload (as it is in reality). This way, we may compute the execution demand for releasing all jobs of τ[P̃q] in a time interval [0, t) as:

    dbf^{part}_{RelO}(\tau[\tilde{P}_q],t) = \sum_{\tau_i \in \tau[\tilde{P}_q]} \left\lceil \frac{t + RelJ}{T_i} \right\rceil \cdot RelO \qquad (5.9)

Modifying dbf^{part}(τ[P̃q],t) accordingly, we get:

    dbf^{part}(\tau[\tilde{P}_q],t) = dbf^{part}_{RelO}(\tau[\tilde{P}_q],t) + \sum_{\tau_i \in \tau[\tilde{P}_q]} \max\left(0, \left\lfloor \frac{t - (D_i - RelJ)}{T_i} \right\rfloor + 1\right) \cdot C_i \qquad (5.10)

We now consider the context-switching overhead, which is common to all schedulers. Every job causes at most two context switches: one when it is released and one when it completes (but not every job release causes a context switch). Therefore, the number of context switches over a time interval of length t is upper-bounded by twice the number of job releases during that interval. Let CtswO be an upper bound on the context-switch overhead. We amend the derivation of dbf^{part}(τ[P̃q],t), by increasing the execution demand of each job by twice CtswO, to:

    dbf^{part}(\tau[\tilde{P}_q],t) = dbf^{part}_{RelO}(\tau[\tilde{P}_q],t) + \sum_{\tau_i \in \tau[\tilde{P}_q]} \max\left(0, \left\lfloor \frac{t - (D_i - RelJ)}{T_i} \right\rfloor + 1\right) \cdot (C_i + 2 \cdot CtswO) \qquad (5.11)

In order to incorporate the cache-related overheads, i.e. the CPMD, we pessimistically assume that every preemption incurs the worst-case CPMD cost, CpmdO. Furthermore, we compute an upper bound on the number of preemptions for server P̃q in a time interval of length t as:

    nr^{part}_{pree}(\tilde{P}_q,t) = \sum_{\tau_i \in \tau[\tilde{P}_q]} \left\lceil \frac{t + RelJ}{T_i} \right\rceil \qquad (5.12)

That is, we assume that every task that may be released in a time interval of length t causes a preemption. Thus, the cumulative cost of CPMD over an interval of length t is:

    dbf^{part}_{CpmdO}(\tilde{P}_q,t) = nr^{part}_{pree}(\tilde{P}_q,t) \cdot CpmdO \qquad (5.13)

Intuitively, this increases the server execution demand. We therefore amend the expression of dbf^{part}(P̃q,t) to:

    dbf^{part}(\tilde{P}_q,t) = dbf^{part}(\tau[\tilde{P}_q],t) + dbf^{part}_{CpmdO}(\tilde{P}_q,t) \qquad (5.14)

In contrast with the other overheads, and also with a recent approach [LAMD13], the cache-related overheads are not assigned to a particular task. Indeed, the jobs of some tasks may never be preempted, whereas the jobs of other tasks may be preempted several times. This is the reason why we do not incorporate the CPMD overheads in dbf^{part}(τ[P̃q],t).

Finally, we consider the interrupt overheads. We assume that interrupt service tasks have higher priority than normal tasks. Thus, we model each sporadic interrupt as a task with worst-case execution time equal to C_i^{Int}, minimum inter-arrival time equal to T_i^{Int}, and zero laxity (C_i^{Int} = D_i^{Int}). Periodic interrupts are also modelled as zero-laxity tasks, but T_i^{Int} represents their period and they are additionally characterized by a release jitter J_i^{Int}, which accounts for deviations from strict periodicity. For sporadic interrupts, we set J_i^{Int} equal to zero, since any variability in their arrival pattern is already accounted for by T_i^{Int}. The interrupt execution demand for n^{Int} interrupts is then given by:

    dbf^{part}_{IntO}(\tilde{P}_q,t) = \sum_{i=1}^{n^{Int}} \max\left(0, \left\lfloor \frac{t - D_i^{Int} + J_i^{Int}}{T_i^{Int}} \right\rfloor + 1\right) \cdot C_i^{Int} \qquad (5.15)

Intuitively, the interrupt overhead increases the execution demand of a server. Thus, dbf^{part}(P̃q,t), incorporating all the overheads, becomes:

    dbf^{part}(\tilde{P}_q,t) = dbf^{part}(\tau[\tilde{P}_q],t) + dbf^{part}_{CpmdO}(\tilde{P}_q,t) + dbf^{part}_{IntO}(\tilde{P}_q,t) \qquad (5.16)
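To make the shape of this test concrete, the sketch below evaluates Equation 5.16 in C (reusing the overhead_bounds sketch from Section 4.7 and omitting the interrupt term of Equation 5.15, which needs per-interrupt parameters); all names are illustrative:

    struct task_params { long long C, T, D; };

    /* Sketch of Equation 5.16 for one server with n tasks. */
    static long long dbf_part(const struct task_params *tau, int n,
                              const struct overhead_bounds *oh, long long t)
    {
            long long demand = 0, releases = 0;

            for (int i = 0; i < n; i++) {
                    /* Eq. 5.9: ceil((t + RelJ)/T_i) releases, each costing RelO */
                    long long rel = (t + oh->rel_j + tau[i].T - 1) / tau[i].T;
                    releases += rel;
                    demand += rel * oh->rel_o;

                    /* Eq. 5.11: per-job demand inflated by two context switches */
                    long long x = t - (tau[i].D - oh->rel_j);
                    if (x >= 0)
                            demand += (x / tau[i].T + 1) * (tau[i].C + 2 * oh->ctsw_o);
            }
            /* Eqs. 5.12-5.13: every possible release is assumed to preempt */
            demand += releases * oh->cpmd_o;
            return demand;
    }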

Equation 5.16 can be used as a new schedulability test by the algorithm that maps tasks to servers. Algorithm 5.1 shows its pseudo-code. The algorithm iterates over the set of all tasks and, for each task τi, it checks whether the task fits in one of the opened servers (subject to the constraints of the bin-packing heuristic used, e.g., NF or FF). For each server P̃q checked (q being the server index), it provisionally adds task τi to it, then computes the length of the testing time interval t (computed as twice the least common multiple of the Ti of the tasks in τ[P̃q]) and, finally, applies the new schedulability test by invoking the dbf_part_check function. If the test succeeds for some server P̃q, then task τi is permanently mapped to it; otherwise, a new server is opened and task τi is added to it. The task set is considered unschedulable whenever the schedulability test fails for a server with only one task. This algorithm applies only to NPS-F. In the case of S-EKG, for reasons that will be explained later, the task-to-server mapping and the server-to-processor assignment are performed in a single step, using the algorithm outlined in Subsection 5.4.2.1.

Algorithm 5.1: Pseudo-code of the new task-to-server mapping algorithm.

Input: set of n tasks τi, with 1 ≤ i ≤ n
Output: set of k servers, with k ≥ 0 (k = 0 means failure)

k ← 0
for i ← 1 to n do
    scheduled ← 0
    for q ← 1 to k do
        add_task_to_server(τi, P̃q)
        t ← 2 · lcm_T(P̃q)
        if dbf_part_check(P̃q, t) then
            scheduled ← 1
            break
        else
            remove_task_from_server(τi, P̃q)
        end if
    end for
    if scheduled = 0 then
        k ← k + 1   {add a new server}
        add_task_to_server(τi, P̃k)
        t ← 2 · lcm_T(P̃k)
        if not dbf_part_check(P̃k, t) then
            k ← 0
            break   {failure}
        end if
    end if
end for

To summarise, in this subsection we have developed a new overhead-aware analysis for schedulability testing in the task-to-server mapping stage. However, this test considers each server in isolation; it does not encompass all the scheduling overheads that servers may incur when they share a processor with other servers. In the next subsection, we develop a new schedulability analysis for the server-to-processor assignment step.

5.3.2 New demand-based schedulability test for assigning servers to processors

To fully model all the overheads incurred by the use of periodic reserves, it is necessary to assign each server to one or more processors. Precisely modelling the impact of these overheads allows us to determine the processing capacity requirements of each server. In turn, this allows us to test whether or not all servers can be accommodated on the m physical processors. With the server-to-processor assignment described in Section 4.3, non-split servers are allocated just one processor reserve, whereas split servers must be allocated two reserves. Because each type of server incurs different overheads, we deal with each type separately.

5.3.2.1 Non-split servers

The approach we follow to check the schedulability of a server is to verify that the execution demand of all jobs assigned to the server (computed using the dbf) does not exceed the amount of time (computed using the sbf) that the system can provide for their execution, for every time interval of length t. Formally, we can express this schedulability test as:

    dbf^{sb:non\text{-}split}(\tilde{P}_q,t) \leq sbf^{sb:non\text{-}split}(\tilde{P}_q,t), \quad \forall t > 0 \qquad (5.17)

We use the superscript “sb” (an abbreviation of “slot-based”) to distinguish the functions and variables used in this subsection from similar ones used in the previous subsection. This superscript may be suffixed with either “:non-split” or “:split”, depending on whether the function or variable applies to non-split servers or to split servers, respectively.

We develop an analysis that allows us to apply the schedulability test in Equation 5.17 to non-split servers in two steps. First, we revisit the analysis developed in Subsection 5.3.1 to take into account the effect of the reserve mechanism on the computing demand of a non-split server. Second, we factor into our analysis the effect of the reserve mechanism on the computing supply of a non-split server.

In Equation 5.16, we decomposed the demand of a server, dbf^{part}(P̃q,t), into three components. The first, dbf^{part}(τ[P̃q],t), comprises the execution requirements induced by each task mapped to server P̃q, including not only its execution time, but also the overheads that may arise because of the mechanisms used by the scheduling algorithm, i.e. timers, task releases, and context switches. Clearly, these requirements are not affected by the use of reserves. However, we now also need to take into account the release interference, dbf^{sb:non-split}_{RelI}(P̃q,t), i.e. the overhead incurred by the release of tasks mapped to other servers that share the processor with P̃q. Furthermore, as we explain below, the other two components are also affected by the use of reserves. Hence, in a first approximation, we have:

    dbf^{sb:non\text{-}split}(\tilde{P}_q,t) = dbf^{part}(\tau[\tilde{P}_q],t) + dbf^{sb:non\text{-}split}_{CpmdO}(\tilde{P}_q,t) + dbf^{sb:non\text{-}split}_{IntO}(\tilde{P}_q,t) + dbf^{sb:non\text{-}split}_{RelI}(\tilde{P}_q,t) \qquad (5.18)

We now proceed with the development of analytical expressions for the dbf^{sb:non-split} terms on the right-hand side of Equation 5.18. The CPMD overheads now comprise not only the preemptions caused by tasks in the server, but also the preemptions incurred due to the reserve mechanism. In the worst case, the reserve mechanism preempts the last job that executes in the server's reserve. Thus, during an interval of duration S, a non-split server incurs at most one additional preemption due to the use of reserves:

    nr^{sb:non\text{-}split}_{pree}(\tilde{P}_q,t) = \left\lceil \frac{t + ResL}{S} \right\rceil + nr^{part}_{pree}(\tilde{P}_q,t) \qquad (5.19)

where ResL, the Reserve Latency, is an overhead akin to the release overhead that occurs at the beginning of a reserve; it is explained later in this subsection. Accordingly, the worst-case overall CPMD cost for the server in a time interval of length t is given by:

    dbf^{sb:non\text{-}split}_{CpmdO}(\tilde{P}_q,t) = nr^{sb:non\text{-}split}_{pree}(\tilde{P}_q,t) \cdot CpmdO \qquad (5.20)

Taking into account interrupts in the presence of reserves is somewhat harder than in the uniprocessor case. Indeed, whereas on a uniprocessor a sporadic interrupt can be modelled as a sporadic interfering task, this is not the case with reserves. This is because reserve boundaries behave like temporal firewalls, and therefore an interrupt affects only the reserve that was active at the time the interrupt task executed. Hence, each interrupt has to be modelled as a bursty periodic task. Given the complexity of such a formulation, we deal with it in Appendix A. Let dbf^{sb:non-split}_{IntO}(P̃q,t) denote the amount of time required for executing all fired interrupts inside the reserves of P̃q in a time interval of length t, as determined in Appendix A.

Finally, we consider the release overhead, i.e. the processor time required to handle the release of jobs. Under slot-based task-splitting scheduling algorithms, a server's tasks share the processor with tasks of other servers assigned to the same processor. Consistent with our implementation [SPT12], we assume that all jobs of a task are released on the processor(s) to which the task is assigned. As shown in Figure 5.4, non-split server P̃q can incur not only the release overheads of its own jobs, but also the release overheads of the jobs of both its immediate neighbour servers, P̃q−1 and P̃q+1.

[Time-line over two time slots of length S, showing the reserves of servers P̃q−1, P̃q, and P̃q+1 on processors Pp−1, Pp, and Pp+1.]

Figure 5.4: Illustration of the release interference for non-split servers. In this example server P˜q may suffer release interference from the arrivals of tasks served by P˜q−1 and P˜q+1 (if these occur at an instant when P˜q is mapped to processor Pp).

Recall that the release overhead cost of all jobs of τ[P̃q] in a time interval of length t is already accounted for in the derivation of dbf^{part}(τ[P̃q],t) (see Equation 5.10). Therefore, what remains is to incorporate the release interference, dbf^{sb:non-split}_{RelI}(P̃q,t), i.e. the release overhead cost from neighbouring servers (servers sharing the same processor):

    dbf^{sb:non\text{-}split}_{RelI}(\tilde{P}_q,t) = dbf^{part}_{RelO}(\tilde{P}_{q-1},t) + dbf^{part}_{RelO}(\tilde{P}_{q+1},t) \qquad (5.21)

where dbf^{part}_{RelO}(\tilde{P}_q,t) (see Equation 5.9) denotes the amount of time required to release all jobs assigned to server P̃q in a time interval of length t.

We now consider the effect of the reserve mechanism on the amount of time supplied to the execution of the tasks of a non-split server. In comparison with the analysis in Subsection 5.3.1, the amount of time supplied to the execution of a non-split server is reduced because of two factors. The first is the sharing of the processor with other servers. The second is the imprecision of the timers used to measure the duration of the reserves. We analyse the effect of each of these factors in turn.

In slot-based task-splitting scheduling algorithms, a non-split server P̃q is confined to execute within a single periodic reserve of length Res^{len}[P̃q], which is available every S time units:

    Res^{len}[\tilde{P}_q] = U^{infl}[\tilde{P}_q] \cdot S \qquad (5.22)

where U^{infl}[P̃q] represents the inflated processing capacity of server P̃q. Thus, for any time interval of length t, only a fraction of that interval is supplied for the execution of the server. We model the unavailability of the reserve as an interfering fake task with attributes:

    C^{fake} = S - Res^{len}[\tilde{P}_q], \quad T^{fake} = S, \quad D^{fake} = C^{fake} \qquad (5.23)

Hence, the supply-bound function for non-split servers can be expressed, in a first approxima- tion, as follows:

    sbf^{sb:non\text{-}split}(\tilde{P}_q,t) = t - \max\left(0, \left\lfloor \frac{t - D^{fake}}{T^{fake}} \right\rfloor + 1\right) \cdot C^{fake} \qquad (5.24)

The second source of the reduction in the amount of time supplied to the execution of a non-split server is the processing time required to switch from one reserve to the next. The switch of reserves is also associated with a delay between the time at which the current reserve should end and the time at which it actually ends. This delay is caused by timer drift, the execution of the timer callback (which performs the actions required to switch from one reserve to another), and the execution of the scheduler. To facilitate the experimental measurement of this parameter, we group these three components into a single one that we call the reserve latency. This is illustrated in Figure 5.5, which also shows that this parameter includes the time required to switch to the first job of the new reserve.

We model this reduction in the supply of processing time to the reserve as an increase in the execution demand of the fake task. Let ResL be an upper bound for the reserve latency.

[Time-line for the switch from reserve A − 1 to reserve A on processor Pp: the timer interrupt marking the start of reserve A, the invocation of the scheduler, and the switch to job τi,j together make up the reserve latency ResLi,j (comprising ResJi,j, ResOi,j, and CtswOi,j).]

Figure 5.5: Illustration of the reserve overhead. In this example, the execution of job τi, j, inside reserve A, is delayed by ResLi, j, the overhead of that reserve.

The expression for sbf^{sb:non-split}(P̃q,t) then becomes:

    sbf^{sb:non\text{-}split}(\tilde{P}_q,t) = t - \max\left(0, \left\lfloor \frac{t - D^{fake} + ResL}{T^{fake}} \right\rfloor + 1\right) \cdot (C^{fake} + ResL) \qquad (5.25)

Replacing this expression in Inequality 5.17 and moving some terms from the right-hand side to the left-hand side, we obtain the following schedulability test for non-split servers:

    dbf^{sb:non\text{-}split}(\tilde{P}_q,t) + dbf^{sb:non\text{-}split}_{Fake}(\tilde{P}_q,t) \leq t, \quad \forall t > 0 \qquad (5.26)

where dbf^{sb:non-split}(P̃q,t) is given by Equation 5.18 and dbf^{sb:non-split}_{Fake}(P̃q,t) is given by:

    dbf^{sb:non\text{-}split}_{Fake}(\tilde{P}_q,t) = \max\left(0, \left\lfloor \frac{t - D^{fake} + ResL}{T^{fake}} \right\rfloor + 1\right) \cdot (C^{fake} + ResL) \qquad (5.27)
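A small C rendering of Equation 5.27 (illustrative names; all parameters in the same time unit):

    /* Fake-task demand of a non-split server (Equation 5.27), where
     * C^fake = D^fake = S - Res_len and T^fake = S. */
    static long long dbf_fake_non_split(long long S, long long res_len,
                                        long long res_l, long long t)
    {
            long long c_fake = S - res_len;
            long long x = t - c_fake + res_l;   /* t - D^fake + ResL */

            return (x < 0) ? 0 : (x / S + 1) * (c_fake + res_l);
    }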

To complete the analysis of non-split servers, we provide an algorithm to compute the inflated utilization of server P̃q, U^{infl}[P̃q]. Indeed, evaluating dbf^{sb:non-split}_{Fake}(P̃q,t) depends on U^{infl}[P̃q], via Res^{len}[P̃q] and C^{fake} (see Equations 5.22 and 5.23). Furthermore, dbf^{sb:non-split}_{IntO}(P̃q,t) also depends on U^{infl}[P̃q], as shown in Appendix A. In order to achieve the highest possible schedulability, we are interested in determining the minimum inflated utilization required for server P̃q. For that purpose, we employ a binary-search approach: we use the schedulability test developed in this section to determine an interval that is guaranteed to include the minimum inflated utilization, and this interval can be made arbitrarily small. We start with the interval [U[P̃q], 1.0] and successively halve it in such a way that the sought utilization is guaranteed to lie in every generated interval. Algorithm 5.2 shows the pseudo-code of the inflate_sb_non_split function. In each iteration, it computes the current interval's midpoint and then applies the schedulability test, implemented in the dbf_sb_non_split_check function, to that utilization value. If the outcome of the test is positive, i.e. the server is schedulable with that utilization, the computed midpoint becomes the upper bound of the interval in the next iteration; otherwise, it becomes the lower bound. The algorithm converges rather rapidly: in roughly ten iterations, it generates an interval less than 0.1% wide that contains the minimum inflated capacity required for the server to be schedulable according to the schedulability test in Inequality 5.26. In Section 5.5, we provide some details on the implementation of the dbf_sb_non_split_check function.

Algorithm 5.2: Pseudo-code of the inflate_sb_non_split function.

Inputs: P̃q {server to analyse}
        ε {accuracy of the desired estimate}
        t {time interval for computing the demand bound function}
Output: U^{infl}[P̃q] {minimum inflated utilization, with an error smaller than ε, that ensures schedulability of P̃q}

Umin ← U[P̃q]
Umax ← 1.0
while Umax − Umin > ε do
    U^{infl}[P̃q] ← (Umin + Umax)/2
    if dbf_sb_non_split_check(P̃q, t) then
        Umax ← (Umin + Umax)/2
    else
        Umin ← (Umin + Umax)/2
    end if
    U^{infl}[P̃q] ← Umax
end while
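A compact C sketch of this binary search, with the schedulability test abstracted behind a function pointer (the real dbf_sb_non_split_check also takes the server and the interval length):

    /* Sketch of Algorithm 5.2: minimum inflated utilization by bisection. */
    static double inflate_sb_non_split(double u_server, double eps,
                                       int (*schedulable)(double u_infl))
    {
            double lo = u_server, hi = 1.0;

            while (hi - lo > eps) {
                    double mid = (lo + hi) / 2.0;
                    if (schedulable(mid))
                            hi = mid;   /* schedulable: try a smaller reserve   */
                    else
                            lo = mid;   /* not schedulable: needs more capacity */
            }
            return hi;                  /* safe (schedulable) end of interval   */
    }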

5.3.2.2 Split servers

In this subsection, we develop a schedulability analysis for split servers similar to the one developed in the previous subsection. Again, we use a schedulability test based on the demand-bound and supply-bound functions:

    dbf^{sb:split}(\tilde{P}_q,t) \leq sbf^{sb:split}(\tilde{P}_q,t), \quad \forall t > 0 \qquad (5.28)

We derive the expression for dbf^{sb:split}(P̃q,t) by revisiting the analysis developed in Subsection 5.3.1 to take into account the increase in the demand for processing time because of the reserve mechanism, and the expression for sbf^{sb:split}(P̃q,t) by accounting for the reduction in the amount of time supplied to the server because of the reserve mechanism. Based on the arguments used in the previous subsection, we can express dbf^{sb:split}(P̃q,t) as follows:

    dbf^{sb:split}(\tilde{P}_q,t) = dbf^{sb:split}(\tau[\tilde{P}_q],t) + dbf^{sb:split}_{CpmdO}(\tilde{P}_q,t) + dbf^{sb:split}_{IntO}(\tilde{P}_q,t) + dbf^{sb:split}_{RelI}(\tilde{P}_q,t) \qquad (5.29)

As for non-split servers, preemptions and migrations of tasks, interrupts, and the release of tasks of servers that share processors with the split server need to be taken into account, amended specifically for split servers. However, unlike with non-split servers, we also need to amend dbf^{part}(τ[P̃q],t), i.e. the processor demand of the server's tasks assuming they execute on their own processor, which accounts for the overheads incurred by the timers, the release of the server's tasks, and the context switches between the server's tasks. This is because the release of tasks of a split server may use an IPI, which, as we show below, affects the components of the demand accounted for in dbf^{part}(τ[P̃q],t). We now develop an expression for each term in Equation 5.29.

As described in Section 4.7, slot-based task-splitting scheduling algorithms may use IPIs to notify the dispatcher on another processor of the release of a task. As a result, the dispatching of a task may incur an IPI latency, IpiL. (Note that this parameter does not include the time required for context switching; that is already accounted for, as it will occur whether or not the release is via an IPI.) Figure 5.6 illustrates such a case. The arrival of a job of task τi, assigned to a split server shared between processors Pp and Pp−1, occurs at a time instant t and is handled on processor Pp, but this time instant falls inside the reserve of that server on the other processor, Pp−1. If this job is the highest-priority job of its server, Pp notifies Pp−1 of the new arrival via an IPI. Clearly, the overhead caused by the IPI, IpiLi,j, only delays the dispatch of job τi,j (and only if job τi,j is the highest-priority job of its server). Thus, the IPI latency has an effect similar to the release jitter.

[Time-line: a job of τi arrives at ai,j on Pp; after RelJi,j, the timer interrupt inserts τi,j into the ready queue (RelOi,j) and Pp sends an IPI to Pp−1; after the IPI latency IpiLi,j, τi,j starts executing on Pp−1, before its deadline di,j.]

Figure 5.6: Illustration of the IPI latency at the release of a split job. It does not include the time for context switching.

Let IpiL be an upper bound for the IPI latency. Incorporating this effect (in a pessimistic manner, since in practice not all jobs are affected) amounts to increasing the release jitter in dbf^{part}(τ[P̃q],t) (see Equation 5.11):

    dbf^{sb:split}(\tau[\tilde{P}_q],t) = dbf^{part}_{RelO}(\tau[\tilde{P}_q],t) + \sum_{\tau_i \in \tau[\tilde{P}_q]} \max\left(0, \left\lfloor \frac{t - (D_i - RelJ - IpiL)}{T_i} \right\rfloor + 1\right) \cdot (C_i + 2 \cdot CtswO) \qquad (5.30)

The cost of the CPMD is more of a concern for split servers than for non-split servers, because tasks may actually migrate between two processors. Nevertheless, in our analysis we assume a worst-case CPMD overhead, CpmdO, which accounts for both. Hence, compared with modelling CPMD overheads for non-split servers, the only difference is that, other than EDF preemptions, split servers incur two additional preemptions per time slot (vs. one for non-split servers), one for each reserve they use. Accordingly, nr^{sb:split}_{pree}(P̃q,t) is calculated as follows:

    nr^{sb:split}_{pree}(\tilde{P}_q,t) = 2 \cdot \left\lceil \frac{t + ResL}{S} \right\rceil + nr^{part}_{pree}(\tilde{P}_q,t) \qquad (5.31)

and the cost of the CPMD over a time interval of length t is:

    dbf^{sb:split}_{CpmdO}(\tilde{P}_q,t) = nr^{sb:split}_{pree}(\tilde{P}_q,t) \cdot CpmdO \qquad (5.32)

The interrupt overhead for split servers is modelled as for non-split servers; that is, each interrupt is modelled as a bursty periodic task. Given the complexity of such a formulation, we deal with it in Appendix A. Let dbf^{sb:split}_{IntO}(P̃q,t) be an upper bound on the amount of time required for executing all fired interrupts inside the reserves of P̃q in a time interval of length t.

Finally, we consider the release interference by servers that execute on the same processor. As illustrated in Figure 5.7, a split server P̃q can incur the release interference of, at most, the previous two servers, P̃q−1 and P̃q−2, and, at most, the next two servers, P̃q+1 and P̃q+2.

[Time-line over two time slots of length S, showing the reserves of servers P̃q−2 through P̃q+2 on processors Pp−1 through Pp+2.]

Figure 5.7: Illustration of the potential release interference exerted by neighbouring servers. In this example server P˜q may suffer release interference from the arrivals of tasks served by P˜q−1 and P˜q−2 (if these occur at an instant when P˜q is mapped to processor Pp) but also from the arrivals of tasks served by P˜q+1 and P˜q+2 (if these occur at an instant when P˜q is mapped to processor Pp+1).

Thus, the release interference on P̃q by its neighbour servers is computed as:

    dbf^{sb:split}_{RelI}(\tilde{P}_q,t) = dbf^{part}_{RelO}(\tilde{P}_{q-1},t) + dbf^{part}_{RelO}(\tilde{P}_{q-2},t) + dbf^{part}_{RelO}(\tilde{P}_{q+1},t) + dbf^{part}_{RelO}(\tilde{P}_{q+2},t) \qquad (5.33)

This concludes the analysis of the effect of the reserve mechanism on the processing demand of a split server. Before we analyse the effect of the reserve mechanism on the amount of time supplied to the server, sbf^{sb:split}(P̃q,t), recall the need for staggered time slots, due to the instantaneous migration problem (see Section 4.5). In that description, the reserves of a split server P̃q on different processors Pp and Pp+1 are temporally adjacent, as illustrated in Figure 4.5. In practice, because of the limitations in the measurement of the duration of a reserve, this layout would require explicit synchronization between the dispatchers on both processors to prevent simultaneous execution of the same task by both processors at the beginning of a time slot. This synchronization would lead to an additional overhead, which can be avoided by shifting the beginning of the time slot on processor Pp+1 in time, i.e. by staggering the time slots on consecutive processors.

In [BA11], the authors showed that the time shift Ω given by Ω = (S − x[Pp+1] − y[Pp])/2 is optimal with respect to utilization for a split server P̃q whose reserves are x[Pp+1] and y[Pp]. With this value, the end of x[Pp+1] is also separated from the start of y[Pp] by the same Ω time units, as illustrated in Figure 5.8. Therefore, Ω is also optimal with respect to the reserve overhead tolerated, i.e., it is the time shift that provides the maximum protection against race conditions caused by the reserve jitter that may arise among the schedulers of processors whose reserves are mapped to the same server.

Although this result was formulated in the context of NPS-F, it applies to any slot-based task-splitting scheduling algorithm. Therefore, in our analysis, we assume that the two reserves of a split server P̃q are Ω apart from each other.

[Time-line over two time slots of length S, showing the two reserves of split server P̃q (y on Pp and x on Pp+1) separated by the time shift Ω, and the offset O^fake.]

Figure 5.8: Illustration of the Ω and O f ake parameters.

We now quantify the reduction in the time supplied to execute the tasks of a split server because of the reserve mechanism. Let U_x^{infl}[P̃q] and U_y^{infl}[P̃q] be the fractions of U^{infl}[P̃q] assigned to processors Pp+1 and Pp, respectively. Then the durations of the reserves of P̃q are given by:

    U^{infl}[\tilde{P}_q] = U_x^{infl}[\tilde{P}_q] + U_y^{infl}[\tilde{P}_q]
    x[P_{p+1}] = U_x^{infl}[\tilde{P}_q] \cdot S
    y[P_p] = U_y^{infl}[\tilde{P}_q] \cdot S \qquad (5.34)

As in the analysis for non-split servers, we model the unavailability of the processor outside the reserves with fake tasks, now two per time slot, each with the following parameters:

    C^{fake} = \Omega, \quad T^{fake} = S, \quad D^{fake} = C^{fake} \qquad (5.35)

Although the two fake tasks have the same arrival rate, they arrive at a relative offset. To account for the worst case, we assume that the first fake task arrives at t = 0 and the second arrives at an offset of:

    O^{fake} = \Omega + \min(U_x^{infl}[\tilde{P}_q], U_y^{infl}[\tilde{P}_q]) \cdot S \qquad (5.36)

Thus, we can express the amount of time supplied to the execution of the tasks of the split server, sbf^{sb:split}(P̃q,t), as:

    sbf^{sb:split}(\tilde{P}_q,t) = t - \max\left(0, \left\lfloor \frac{t - D^{fake}}{T^{fake}} \right\rfloor + 1\right) \cdot C^{fake} - \max\left(0, \left\lfloor \frac{t - D^{fake} - O^{fake}}{T^{fake}} \right\rfloor + 1\right) \cdot C^{fake} \qquad (5.37)

Like the reserve of a non-split server, each of the two reserves of a split server incurs the reserve overhead. Let ResL be an upper bound for the reserve latency. To take this overhead into account, we proceed just as in the case of non-split servers, i.e. we add ResL to the execution demand of each of the two fake tasks, and sbf^{sb:split}(P̃q,t) becomes:

    sbf^{sb:split}(\tilde{P}_q,t) = t - \max\left(0, \left\lfloor \frac{t - D^{fake} + ResL}{T^{fake}} \right\rfloor + 1\right) \cdot (C^{fake} + ResL) - \max\left(0, \left\lfloor \frac{t - D^{fake} - O^{fake} + ResL}{T^{fake}} \right\rfloor + 1\right) \cdot (C^{fake} + ResL) \qquad (5.38)
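As an illustration, Equation 5.38 can be evaluated with the following C helper (names are ours; omega stands for C^fake = D^fake = Ω):

    /* Supply to a split server (Equation 5.38): two fake tasks of
     * execution omega and period S, the second offset by o_fake, both
     * inflated by the reserve latency res_l. */
    static long long sbf_sb_split(long long S, long long omega,
                                  long long o_fake, long long res_l,
                                  long long t)
    {
            long long cost = omega + res_l, supply = t;
            long long x1 = t - omega + res_l;            /* first fake task  */
            long long x2 = t - omega - o_fake + res_l;   /* second fake task */

            if (x1 >= 0)
                    supply -= (x1 / S + 1) * cost;
            if (x2 >= 0)
                    supply -= (x2 / S + 1) * cost;
            return supply;
    }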

Replacing this expression in Inequality 5.28 and moving some terms from the right-hand side to the left-hand side, we obtain the following schedulability test for split servers:

    dbf^{sb:split}(\tilde{P}_q,t) + dbf^{sb:split}_{Fake}(\tilde{P}_q,t) \leq t, \quad \forall t > 0 \qquad (5.39)

where dbf^{sb:split}(P̃q,t) is given by Equation 5.29 and dbf^{sb:split}_{Fake}(P̃q,t) is given by:

    dbf^{sb:split}_{Fake}(\tilde{P}_q,t) = \max\left(0, \left\lfloor \frac{t - D^{fake} + ResL}{T^{fake}} \right\rfloor + 1\right) \cdot (C^{fake} + ResL) + \max\left(0, \left\lfloor \frac{t - D^{fake} - O^{fake} + ResL}{T^{fake}} \right\rfloor + 1\right) \cdot (C^{fake} + ResL) \qquad (5.40)

To complete the analysis of split servers, we provide an algorithm to compute the inflated utilization of server P̃q, U^{infl}[P̃q]. Indeed, evaluating dbf^{sb:split}_{Fake}(P̃q,t) depends on U^{infl}[P̃q], via x[Pp+1], y[Pp], Ω, C^{fake}, and O^{fake} (see Equations 4.8, 5.34, 5.35, and 5.36). Furthermore, dbf^{sb:split}_{IntO}(P̃q,t) also depends on U^{infl}[P̃q], as shown in Appendix A. In order to achieve the highest possible schedulability, we are interested in determining the minimum inflated utilization required for server P̃q. The algorithm we use for split servers is similar to that used for non-split servers, presented in Algorithm 5.2, except that it uses the function dbf_sb_split_check, which implements the schedulability test in Equation 5.28, rather than the function dbf_sb_non_split_check. In Section 5.5, we provide some details on the implementation of these functions.

5.4 New server-to-processor assignment procedure

The application of the schedulability tests developed in the previous section raises two main issues. First, computing the inflation of the utilization of each server requires knowing whether or not the server is split, and which servers it shares the processor with. However, this depends, among other things, on the inflated utilization of the server itself. Second, when a server is split between two processors, in marginal cases, the additional overheads incurred would require the combined length of its two reserves to exceed S (an unworkable arrangement, given that they must not overlap in time). To prevent this undesirable outcome, we specify two assignment rules, which further exacerbate the first issue. Thus, in this section, we start by describing the assignment rules. After that, we address how to resolve the circularity associated with the first issue.

5.4.1 Assignment rules

The schedulability analysis formulated so far considers sporadic task sets with arbitrary deadlines. Therefore, the time slot length, S, cannot be computed according to Equation 4.2, because that equation only considers task inter-arrival times, Ti; it must also consider the deadlines, Di. Therefore, we amend Equation 4.2 to:

    S = \frac{1}{\delta} \cdot \min_{\tau_i \in \tau} \min(T_i, D_i) \qquad (5.41)
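A one-function C sketch of Equation 5.41, reusing the task_params sketch from Section 5.3 and assuming, as in Chapter 4, that δ is a positive integer:

    /* Time slot length (Equation 5.41): the smallest min(T_i, D_i)
     * over all tasks, divided by the designer-chosen parameter delta. */
    static long long time_slot_len(const struct task_params *tau, int n,
                                   long long delta)
    {
            long long s = (tau[0].T < tau[0].D) ? tau[0].T : tau[0].D;

            for (int i = 1; i < n; i++) {
                    long long m = (tau[i].T < tau[i].D) ? tau[i].T : tau[i].D;
                    if (m < s)
                            s = m;
            }
            return s / delta;
    }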

To prevent the creation of reserves too short to be useful, we add the following rule to the assignment algorithms presented below:

A1: Whenever a server would be split between two processors Pp and Pp+1 in such a way that the length of the second reserve (i.e. on Pp+1) would be larger than the length of its only reserve had it been assigned as non-split to Pp+1, then the server should not be split, but rather assigned as non-split to Pp+1.

Figure 5.9 illustrates rule A1, using aligned time slots for clarity. Clearly, if the size of the non-split reserve is smaller than that of the second reserve, not splitting the server leads to a lower computation demand by the server on both the first and the second processor. This means that there will be more computational resources for the remaining servers on the second processor. Although the computational resources not used on the first processor will not be used to satisfy the demand of the task set to schedule, they can be used by other (non-real-time) tasks. On the other hand, if the second reserve of the split server is shorter than the single reserve required if the server were not split, it must be the case that the first reserve is being used to satisfy the demand of the server's tasks, and therefore, for the sake of improving schedulability, the server should be split.

[Two time-lines contrasting the split assignment of P̃q+1 across processors Pp and Pp+1 with the non-split assignment to Pp+1 produced by rule A1.]

Figure 5.9: Illustration of the assignment rule A1.

Another issue concerns the case when the two reserves of a split server (possibly after application of rule A1) add up to almost S, or even surpass it. As a result, the schedulers on two processors might attempt to run the same task simultaneously. To prevent such a scenario, we specify the following rule:

A2: In cases where a server would be split such that (U_x^{infl}[P̃q] + U_y^{infl}[P̃q]) · S > S − ResL, the server should instead become a single server.

A single server is assigned to a processor, utilizing its entire processing capacity, without being confined to a time reserve. This arrangement amounts to partitioning. Figure 5.10 illustrates rule A2.

5.4.2 Assignment procedure

Section 4.3 suggests that server-to-processor assignment is straightforward once the servers have been inflated. However, with the schedulability tests developed in the previous section, this is not so. The challenge is that server inflation depends on the assignment of servers to processors, because the release interference overhead depends on which servers are allocated to the same processor.

[Two time-lines contrasting a split assignment of P̃q+1 with rule A2, under which P̃q+1 becomes a single server on a dedicated processor.]

Figure 5.10: Illustration of the assignment rule A2.

Therefore, we have a circularity issue: inflation depends on the assignment, and the assignment depends on the inflation. For example, when we first inflate a server P̃q−1, we do not yet know which servers will share a processor with it. We can assume that the next server, P̃q, will share the processor with the server currently being analysed, but later, because of the application of rule A2, server P̃q may be allocated its own processor (as a single server); server P̃q−1 will then share the processor not with that server but with the one that follows it, i.e. server P̃q+1, and it will have to be re-inflated. The approach we use to overcome this issue is backtracking. To limit the amount of backtracking, we merge several steps of the generic algorithm into a single step. In the next two subsections, we illustrate the application of this approach to S-EKG and NPS-F, respectively.

5.4.2.1 Task to processor assignment procedure in S-EKG

The distinctive feature of S-EKG is that the split servers, if any, have only one task. To ensure this, we merge the four steps of the generic algorithm into a single one. The full algorithm is somewhat complex; therefore, we provide only an overview of its main steps, which are illustrated in Figure 5.11.

[Diagram of one iteration of the S-EKG mapping (Steps 1 to 6): tasks τi and τi+1 are provisionally assigned to P̃q and P̃q+1 and checked; servers P̃q−1 and P̃q are inflated; if U[Pp] > 1.0 the iteration breaks, otherwise τi is committed and the iteration advances.]

Figure 5.11: Illustration of the S-EKG task-to-processor mapping.

The algorithm starts by assigning empty servers (an empty server is a server without any task assigned to it) to the processors. Each processor is allocated one non-split server, one split server shared with its predecessor processor, and one split server shared with its successor processor, so that the first and the last processors are allocated only two servers, whereas the other processors are allocated three servers. Then, the algorithm iterates over the set of tasks, two tasks at a time, if available, and assigns the tasks to the servers in an attempt to maximize the utilization of each server, subject to the constraint that each split server has at most one task. In the first step (Step 1), it provisionally assigns tasks τi and τi+1 to P̃q, the non-split server, and P̃q+1, the split server shared with the next processor, respectively, by invoking the add_task_to_server function. Then, it checks (Step 2) the schedulability of each server by invoking the dbf_part_check function. If some server with only one task is not schedulable, then the task set is also not schedulable. Otherwise, if the non-split server is not schedulable, the algorithm backtracks, assigns τi to P̃q+1, and moves to the next iteration (where it will map tasks τi+1 and τi+2 to servers P̃q+2 and P̃q+3, respectively, and check their schedulability). If both servers are schedulable, it proceeds by inflating (Step 3) the capacity of the previous server, P̃q−1, and the current server, P̃q, by invoking the inflate_sb_split and inflate_sb_non_split functions, respectively. It then checks (Step 4) whether U[Pp] = U_x^{infl}[P̃q−1] + U^{infl}[P̃q] is larger than 1.0. If yes, it proceeds as in Step 2, when the non-split server is not schedulable. Otherwise (Step 5), it assigns τi permanently to P̃q, removes τi+1 from server P̃q+1, and moves to the next iteration (Step 6), in which it will attempt to map task τi+1 to server P̃q and task τi+2 to server P̃q+1. For the sake of simplicity, this description omits many details, including those related to the application of rules A1 and A2.
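To make the control flow of Steps 1-6 concrete, below is a simplified, hypothetical C sketch of a single iteration. The helper names mirror those used above (add_task_to_server, dbf_part_check, inflate_sb_split, inflate_sb_non_split); remove_task_from_server and u_x_infl (an accessor for U_x^{infl}) are assumed for illustration, and the handling of rules A1 and A2 is omitted, as in the prose.

#include <stdbool.h>

typedef struct server server_t;
typedef struct task task_t;

/* helpers named as in the text; signatures are assumptions */
extern void   add_task_to_server(task_t *tsk, server_t *srv);
extern void   remove_task_from_server(task_t *tsk, server_t *srv);
extern bool   dbf_part_check(server_t *srv);
extern double inflate_sb_split(server_t *srv);      /* returns U^infl */
extern double inflate_sb_non_split(server_t *srv);  /* returns U^infl */
extern double u_x_infl(server_t *srv);              /* hypothetical: x-reserve share */

enum { UNSCHEDULABLE = -1, NEXT_PROCESSOR = 0, COMMITTED = 1 };

/* One iteration over tasks tau_i and tau_i1: srv_q is the non-split
 * server of the current processor, srv_prev the split server shared
 * with the previous processor, srv_next the one shared with the next. */
int sekg_iteration(task_t *tau_i, task_t *tau_i1, server_t *srv_prev,
                   server_t *srv_q, server_t *srv_next)
{
    double u_q;

    /* Step 1: provisional assignment */
    add_task_to_server(tau_i, srv_q);
    add_task_to_server(tau_i1, srv_next);

    /* Step 2: check each server; srv_next holds a single task, so its
     * failure means the whole task set is unschedulable */
    if (!dbf_part_check(srv_next))
        return UNSCHEDULABLE;
    if (!dbf_part_check(srv_q))
        goto backtrack;

    /* Step 3: inflate previous (split) and current (non-split) servers */
    (void)inflate_sb_split(srv_prev);
    u_q = inflate_sb_non_split(srv_q);

    /* Step 4: would the current processor be overloaded? */
    if (u_x_infl(srv_prev) + u_q > 1.0)
        goto backtrack;

    /* Step 5: commit tau_i; tau_i1 is retried on srv_q in the next
     * iteration (Step 6) */
    remove_task_from_server(tau_i1, srv_next);
    return COMMITTED;

backtrack:
    /* tau_i becomes the split task; the next iteration restarts on the
     * next processor with tau_i1 and tau_i2 */
    remove_task_from_server(tau_i, srv_q);
    remove_task_from_server(tau_i1, srv_next);
    add_task_to_server(tau_i, srv_next);
    return NEXT_PROCESSOR;
}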

5.4.2.2 New server-to-processor assignment for NPS-F

In the case of NPS-F, to limit the amount of backtracking, we keep the first step of the generic algorithm, i.e. the mapping of tasks to servers, separate, and merge the remaining steps into a single one. The mapping of tasks to servers is performed in a first step, as described in Algorithm 5.1, and is never undone. The backtracking can affect only the assignment of servers to processors, and therefore their inflation and the definition of the reserves. Algorithm 5.3 shows the pseudo-code of the new merged step. It assigns servers to processors (employing an NF bin-packing heuristic) and maps processor reserves to servers. The algorithm iterates over the set of servers created by the task-mapping algorithm. First, it tries to assign each server as a non-split server. For that purpose, it inflates the current server by invoking the inflate_sb_non_split function, which considers the interference of the previous and the next server. If U[Pp] (the utilization of the current processor already assigned to other servers) plus U^{infl}[P̃q] (the inflated utilization of the current server) is smaller than or equal to 1.0 (100%), the current server P̃q is assigned (non-split) to the current processor Pp and the algorithm moves to the next server. Otherwise, it will try to assign the current server, P̃q, as a split server. Thus, it computes the inflation of the server by invoking the inflate_sb_split function, which considers the interference of the previous two and also the next two servers. If rule A1 applies, then the server is assigned as a non-split server to the next processor, and the algorithm moves to the next server. If rule A2 does not apply, then the current server P̃q becomes a split server, is assigned to both the current and the next processor, and the algorithm moves to the next server.

Otherwise, i.e. if rule A2 applies, the server is classified as a single server and moved to the end of the server list (with the servers renumbered, for ease of description of the algorithm), so that it is later allocated a dedicated processor. Furthermore, the algorithm is restarted, because servers that have already been assigned to a processor may have to be re-inflated. For example, server P̃q−1, which was inflated assuming that P̃q would share the processor with it, will now share the processor with P̃q+1. This entails the possibility that P̃q−1 was not sufficiently inflated (since the release interference from tasks on P̃q+1 might be greater than what the schedulability test assumed). Thus, backtracking is performed only when rule A2 is applied. Furthermore, it is bounded by the number of servers: the application of rule A2 turns the server into a single server, which is never again subject to rule A2.

For ease of understanding, Algorithm 5.3 does not include some improvements that could make it more efficient or that could reduce the pessimism in the server inflation for some task sets. For example, when the algorithm applies rule A2 to a server, it moves the server to the end of the server list and restarts the assignment from the beginning. However, there is no need to backtrack all the way to the beginning: it would be enough to backtrack to the highest-numbered processor whose y-reserve mapping is not affected. Therefore, the amount of work that has to be redone can be limited by slightly changing the algorithm. Another improvement to the speed of the algorithm is to avoid attempting assignments that will surely fail. For example, if the current processor has already been assigned a non-split server, the current server cannot be assigned as non-split to that processor (or else the two servers' contents would have been lumped together into one server at the bin-packing stage). Therefore, in this case, the algorithm should immediately try to assign the server as a split server. Yet another example is the case where the sum of the size of the x-reserve, in terms of utilization, and the uninflated utilization of the server under analysis is larger than 1.0. Clearly, that server cannot be assigned to the N-reserve, and therefore the algorithm should immediately try to assign the server as a split server.

Algorithm 5.3 takes a pessimistic stance and considers that a non-split server always shares the processor with two other servers, and that a split server always shares the processors with four other servers; this is the worst case. In the best-case scenario, a non-split server may share the processor with only one other server, and a split server with two other servers. Thus, by assuming the best case, it is possible to eliminate any pessimism from the algorithm (all pessimism is included in the functions that inflate the servers). However, this comes at the cost of additional backtracking whenever an assumption is proved wrong. Still, it is possible to reduce the pessimism without adding backtracking by taking into account previous assignment decisions. For example, when inflating a non-split server, if the x-reserve of the current processor is empty, the algorithm need not consider the interference of the previous server, because they do not share processors.

Algorithm 5.3: Pseudo-code of the new server-to-processor assignment algorithm.

Input: set of k servers
Output: reserves for the m processors, allocating them to the input servers

S ← (1/δ) · min_{τi∈τ}(Ti, Di) {time slot}
restart ← true
while restart do
  restart ← false
  p ← 1 {processor index}
  q ← 1 {server index}
  U[Pp] ← 0
  while q ≤ k and Type[P̃q] ≠ SINGLE do
    Type[P̃q] ← NON-SPLIT
    t ← 2 · lcm_T(P̃q−1, P̃q, P̃q+1)
    U^{infl}[P̃q] ← inflate_sb_non_split(P̃q, t)
    if U[Pp] + U^{infl}[P̃q] ≤ 1.0 then {server need not be split – we need not check rule A1 at this point}
      U[Pp] ← U[Pp] + U^{infl}[P̃q]
      add_server_to_processor_N(Pp, P̃q, U^{infl}[P̃q])
    else {server cannot be NON-SPLIT on Pp: try to split it}
      U^{infl}_tmp ← U^{infl}[P̃q] {note that we are considering the interference by P̃q−1, but that is not needed}
      t ← 2 · lcm_T(P̃q−2, P̃q−1, P̃q, P̃q+1, P̃q+2)
      U^{infl}[P̃q] ← inflate_sb_split(P̃q, t)
      U_y^{infl}[P̃q] ← 1.0 − U[Pp]
      U_x^{infl}[P̃q] ← U^{infl}[P̃q] − U_y^{infl}[P̃q]
      if U_x^{infl}[P̃q] ≥ U^{infl}_tmp then {rule A1 – note that inflate_sb_non_split() always considers 3 servers, but in this case we need only consider 2}
        adjust_reserves(Pp) {the y-reserve becomes empty}
        p ← p + 1
        U^{infl}[P̃q] ← U^{infl}_tmp {non-split server inflated utilization, previously computed}
        add_server_to_processor_N(Pp, P̃q, U^{infl}[P̃q])
        U[Pp] ← U^{infl}[P̃q]
      else
        if (U_x^{infl}[P̃q] + U_y^{infl}[P̃q]) ≥ 1.0 − ResL/S then {rule A2}
          Type[P̃q] ← SINGLE
          move_to_last(P̃q) {so that split servers are assigned "neighbor" processors}
          restart ← true
          break {start all over: inflation of other servers may have been affected}
        else
          Type[P̃q] ← SPLIT
          add_server_to_processor_Y(Pp, P̃q, U_y^{infl}[P̃q])
          U[Pp] ← U[Pp] + U_y^{infl}[P̃q]
          p ← p + 1
          add_server_to_processor_X(Pp, P̃q, U_x^{infl}[P̃q])
          U[Pp] ← U_x^{infl}[P̃q]
        end if
      end if
    end if
    q ← q + 1
  end while
  if restart then
    continue
  end if
  p ← p + 1
  while q ≤ k do {handle SINGLE servers, if any}
    add_server_to_processor(Pp, P̃q)
    U[Pp] ← 1.0
    p ← p + 1
    q ← q + 1
  end while
end while

5.4.2.3 Effect of assignment rules on the schedulability analysis

As shown in Algorithm 5.3, the introduction of assignment rule A2 may lead to backtracking. Although, as we have argued, backtracking is limited, it can nevertheless be undesirable for some task sets, because it could take too much time. In such cases, one can avoid backtracking at the cost of some pessimism, by amending Equations 5.21 and 5.33 (employed by the schedulability test), respectively, to:

dbf_RelI^{sb:non-split}(P̃q, t) = dbf_RelO^{part}(prev_server(P̃q), t) + dbf_RelO^{part}(P̃_{qA(q,t)}, t)    (5.42)

and

dbf_RelI^{sb:split}(P̃q, t) = dbf_RelO^{part}(prev_server(P̃q), t) + dbf_RelO^{part}(prev_server(prev_server(P̃q)), t) + dbf_RelO^{part}(P̃_{qA(q,t)}, t) + dbf_RelO^{part}(P̃_{qB(q,t)}, t)    (5.43)

wherein prev_server denotes the previous server (not assigned a dedicated processor, i.e., not a single server) and the server indexes qA and qB are computed as:

qA(q,t) ∈ {q+1, …, k} : dbf_RelO^{part}(P̃_{qA(q,t)}, t) ≥ dbf_RelO^{part}(P̃p, t), ∀p ∈ {q+1, …, k}    (5.44)

and

qB(q,t) ∈ {q+1, …, k} \ {qA(q,t)} : dbf_RelO^{part}(P̃_{qB(q,t)}, t) ≥ dbf_RelO^{part}(P̃p, t), ∀p ∈ {q+1, …, k} \ {qA(q,t)}    (5.45)

That is, when inflating a server, rather than considering the release interference from the next server, we consider the maximum release interference that any of the servers not yet assigned may cause, thus taking a worst-case approach. Similarly for split servers, but in this case we need to consider the two largest values of the release interference that any of the servers not yet assigned may cause.

Note that the values of the indexes qA(q,t) and qB(q,t) may change with the value of t. However, since both dbf_RelO^{part}(P̃_{qA(q,t)}, t) and dbf_RelO^{part}(P̃_{qA(q,t)}, t) + dbf_RelO^{part}(P̃_{qB(q,t)}, t) are non-decreasing functions of t, the applicability of the Quick Processor-demand Analysis (QPA) for EDF, which is discussed in the next section, is not affected.
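As an illustration of how qA(q,t) and qB(q,t) can be obtained, the sketch below performs a single pass that keeps the two largest release-interference values among the servers not yet assigned; dbf_relo_part() stands for dbf_RelO^{part} and, like the server type, is an assumed interface (the non-split case of Equation 5.42 uses only top[0]; the split case of Equation 5.43 adds top[1]).

#include <stdint.h>

typedef struct server server_t;
extern uint64_t dbf_relo_part(const server_t *srv, uint64_t t);

/* Fills top[0] >= top[1] with the two largest dbf^part_RelO values
 * among srv[q+1..k] (1-based indexing, as in the text). */
static void two_largest_relo(server_t *const srv[], int k, int q,
                             uint64_t t, uint64_t top[2])
{
    top[0] = top[1] = 0;
    for (int p = q + 1; p <= k; p++) {
        uint64_t d = dbf_relo_part(srv[p], t);
        if (d > top[0]) {
            top[1] = top[0];    /* previous maximum becomes second largest */
            top[0] = d;
        } else if (d > top[1]) {
            top[1] = d;
        }
    }
}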

5.5 Schedulability tests

As discussed earlier, in slot-based task-splitting algorithms, overhead-aware schedulability testing has to be done at two stages: (i) during the task-to-server mapping and (ii) during the server-to-processor assignment. In our code, this testing is performed, respectively, by the C functions (i) dbf_part_check (implementing Equation 5.6) and (ii) dbf_sb_non_split_check or dbf_sb_split_check (implementing Equations 5.17 and 5.28, respectively).

All these functions check whether or not, at every instant within a time interval [1, t), where t is an argument of the functions, the supply of processor time satisfies the demand. Unlike what is possible for conventional uniprocessor EDF scheduling, where certain techniques allow the safe use of much shorter intervals [GRS96, HBJK06, RCM96, Spu96], in our case it is necessary to set t to twice the least common multiple of all Ti of the tasks of the server under consideration, including the fake task modelling the unavailability of the reserve(s). Therefore, the length t of this testing interval can be exceptionally long. This raises two difficulties. First, the value of t may exceed the range of a 64-bit integer. To overcome this limitation, we used the GNU Multiple Precision Arithmetic C library¹. Second, a longer testing interval means many more iterations, in order to test all integer values in the range [1, t). To speed up the analysis, we implemented the schedulability testing using Quick Processor-demand Analysis (QPA) [ZB09], which overcomes the need to test all values in the interval [1, t). This technique works by identifying large sub-intervals within which no deadline misses may occur and skipping them during testing. This way, for most cases, the analysis is significantly sped up. Algorithm 5.4 shows, in pseudo-code, how the QPA technique can be used with each of the schedulability tests we defined earlier (where dbf^{xxx} stands for any of them).

Algorithm 5.4: Pseudo-code algorithm of the schedulability test functions.

Input: P̃q {server to analyse}
Returns: true if P̃q is schedulable, false otherwise

t ← 2 · lcm_T(P̃q)
d_min ← min_{τi∈τ[P̃q]}(Di)
while dbf^{xxx}(P̃q, t) ≤ t and dbf^{xxx}(P̃q, t) > d_min do
  if dbf^{xxx}(P̃q, t) < t then
    t ← dbf^{xxx}(P̃q, t)
  else
    t ← t − 1
  end if
end while
return dbf^{xxx}(P̃q, t) ≤ d_min {true if P̃q is schedulable, false otherwise}
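As an illustration, a minimal C rendering of this QPA loop. The dbf() callback stands for whichever dbf^{xxx} test is being applied; plain 64-bit integers are used here for readability, even though, as noted above, the real implementation resorts to arbitrary-precision arithmetic (GMP) when t exceeds the 64-bit range.

#include <stdbool.h>
#include <stdint.h>

typedef struct server server_t;

/* assumed interfaces, named after the pseudo-code above */
extern uint64_t dbf(const server_t *srv, uint64_t t);
extern uint64_t lcm_T(const server_t *srv);        /* lcm of the server's periods */
extern uint64_t min_deadline(const server_t *srv); /* min Di over tau[P~q] */

bool qpa_schedulable(const server_t *srv)
{
    uint64_t t = 2 * lcm_T(srv);      /* start from the end of the testing interval */
    uint64_t dmin = min_deadline(srv);

    while (dbf(srv, t) <= t && dbf(srv, t) > dmin) {
        uint64_t d = dbf(srv, t);
        /* no deadline can be missed in (d, t], so skip the whole sub-interval */
        t = (d < t) ? d : t - 1;
    }
    return dbf(srv, t) <= dmin;       /* schedulable iff demand never exceeds supply */
}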

5.6 Summary

In this chapter, we proposed a unified schedulability analysis for slot-based scheduling algorithms. This new schedulability analysis is a processor demand-based analysis that supersedes the original utilization-based analysis. The new schedulability analysis incorporates the overheads incurred by slot-based scheduling algorithms. It is used by the off-line task-to-processor assignment algorithm. It is more efficient and also more reliable than the previous analysis.

¹ Available online at http://gmplib.org/

Chapter 6

Reserve-based scheduling algorithm

6.1 Introduction

Given the increasing use of multiple processing units in computing systems, the problem of scheduling tasks with real-time constraints on such systems is particularly relevant. Therefore, although an optimal algorithm for this problem is not possible, neither from a theoretical [FGB10] nor from a practical perspective, we believe that there is always room for more scheduling algorithms. In this chapter, we present Carousel-EDF [SSTB13], a new hierarchical scheduling algorithm for a system of identical processors, and its overhead-aware schedulability analysis based on demand-bound functions. We start with a description of the Carousel-EDF scheduling algorithm; then we proceed with a description of the implementation of Carousel-EDF in the Linux kernel. We formulate the demand-based overhead-aware schedulability analysis for the Carousel-EDF scheduling algorithm. We end this chapter with the derivations of the utilization bound and the preemption upper bound of the Carousel-EDF scheduling algorithm.

6.2 Basics of the reserve-based scheduling algorithm

Carousel-EDF is an offshoot of the generic slot-based task-splitting scheduling algorithm described in Chapter 4. It also relies on the concept of servers that are instantiated by means of time reserves. However, in contrast to slot-based task-splitting scheduling algorithms, which migrate tasks at time slot and reserve boundaries, Carousel-EDF migrates tasks only at reserve boundaries. Consequently, time is no longer constrained by time slot boundaries; it is divided only into time reserves. This approach preserves the utilization bounds of slot-based task-splitting scheduling algorithms, namely that of NPS-F, which is the highest among algorithms that are not based on a single dispatching queue and that have few preemptions. Furthermore, with respect to slot-based task-splitting scheduling algorithms, Carousel-EDF reduces by up to 50% the worst-case number of context switches and the worst-case number of preemptions caused by the time-division mechanism.


The Carousel-EDF scheduling algorithm comprises an off-line procedure and a run-time scheduling algorithm. The off-line procedure consists of (i) the partitioning of the task set into subsets that are mapped to servers; (ii) the sizing of the reserve of each server; (iii) the cycling sequence order of the server-to-processor assignment; and finally (iv) for each processor, the first server assigned to it and the size of the first reserve. The run-time scheduling algorithm is a two-level hierarchical algorithm, similar to the run-time scheduling algorithm under slot-based dispatching. The top-level scheduling algorithm allocates processors to servers for the duration of the corresponding reserves and ensures that for each server, at any time, there is at most one processor allocated to it. The low-level scheduling algorithm uses the EDF policy to schedule the tasks in a server during its reserves. In the next subsection, we describe the off-line procedure, and in the following one, the run-time task-dispatching algorithm.

6.2.1 Off-line procedure

Carousel-EDF employs the same off-line procedure as the slot-based task-splitting algorithm described in Section 4.3. Therefore, the first step of the off-line procedure is to partition the task set into subsets that are mapped to servers. Servers have unique identifiers in the range P̃1 to P̃k. The set of tasks in server P̃q is denoted τ[P̃q]. Recall that the utilization of server P̃q, U[P̃q], is the sum of the utilizations of its mapped tasks, i.e. U[P̃q] = Σ_{τi∈τ[P̃q]} ui. Because at any time a server can have at most one processor allocated to it, a server's utilization must not exceed 100%. Thus, the assignment of tasks to servers can be done by applying any bin-packing heuristic. For the task set example presented in inset (a) of Figure 4.1, the outcome is thus equal to that presented in inset (b) of the same figure. Once the servers have been defined, the off-line procedure computes for each server the size of its reserve, i.e. the length of the time window during which a processor is allocated to that server. The reserves are periodic, and we call this period the time slot, S, a term inherited from slot-based task-splitting scheduling algorithms. As in slot-based task-splitting scheduling algorithms, S = (1/δ) · min_{τi∈τ}(Ti, Di), where δ is an integer design parameter that allows a trade-off between the number of preemptions and the utilization bound of the algorithm. Having determined the length of the time slot, S, the size of each reserve, Res^{len}[P̃q], can be computed by multiplying S by the inflated utilization of that server, which we denote U^{infl}[P̃q]. The reason for inflation is the same as for slot-based task-splitting scheduling algorithms: server utilization inflation is required because the tasks of a server can run only when a processor is allocated to that server. Thus, a server may have ready tasks but be unable to execute them. By inflating the utilization of a server, we ensure that the tasks' deadlines are met in spite of that delay. Below, in Section 6.4, we present the server inflation algorithm. The last step defines the cycling sequence order of the server-to-processor assignment as well as the initial phasing of each processor. This assures that at any time instant there is a one-to-one relationship between processors and active servers. In other words, at any time instant, each processor is allocated to only one server and a server is assigned to at most one processor (see Figure 6.1).
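For concreteness, a small numerical illustration with hypothetical values: with δ = 1 and min_{τi∈τ}(Ti, Di) = 10 ms, the time slot is S = 10 ms, and a server with inflated utilization U^{infl}[P̃q] = 0.6 receives a reserve of Res^{len}[P̃q] = 0.6 · 10 ms = 6 ms in every time slot. Choosing δ = 2 instead halves the time slot to S = 5 ms (and the reserve to 3 ms), trading more preemptions for a higher utilization bound.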

6.2.2 Run-time task-dispatching algorithm

The top-level hierarchical scheduling algorithm allocates processors to servers. Each processor is allocated for the duration of the corresponding reserve in a round-robin fashion. Furthermore, the schedule of each processor is staggered so that each server is assigned to at most one processor at any time and is periodically assigned, with period S, to some processor. As a result, each server moves from one processor to the next also in a round-robin fashion. Figure 6.1 illustrates the run-time dispatching time-line produced by Carousel-EDF for the example shown in Figure 4.1. In this example, we have four processors and five servers. At time zero, processor P1 is allocated to server P̃1. When that server's reserve expires, processor P1 is allocated to server P̃2, and so on until it is allocated to the last server, P̃5. Processor P1 is allocated again to server P̃1 at time 4S; that is, at the next integer multiple of S after the reserve of the last server (in this case, P̃5) expires on P1. The cycling of servers through processors (which inspired the name Carousel) is also clear in Figure 6.1. Server P̃1 is assigned to processor P1 at time zero. At time S, it is assigned to processor P4, then, at time 2S, to processor P3, and so on (it will return to processor P1 every 4S time units). This ensures not only that every reserve has period S, but also that at any time there is at most one processor allocated to any server. When the sum of the durations of all reserves is not a multiple of the time slot, there is a time gap between the end of the last reserve and the beginning of the first one, which we name the empty reserve.

Figure 6.1: Run-time dispatching time-line produced by Carousel-EDF for the task set example presented in Section 4.3.

The low-level scheduler uses EDF to schedule the tasks in a server. Because each reserve instance always occurs on a single processor, the low-level scheduler operates on a local basis, i.e. there is no need for coordination among the low-level schedulers of the different processors.

6.3 Implementation of the Carousel-EDF task-dispatching algorithm

In this section, we provide some details regarding the implementation of Carousel-EDF's top- and low-level schedulers. We also describe a new implementation technique for the release of tasks. This technique eliminates the task release interference across reserves. This is important in order to better understand the overhead-aware schedulability analysis of Carousel-EDF that we develop in the following section. Following the same implementation approach as the slot-based task-splitting scheduling algorithm described in Section 4.6, we added the Carousel-EDF scheduler to the RT scheduling class of the PREEMPT-RT-patched Linux 3.2.11-rt20 kernel version [SSTB13]. Therefore, the Carousel-EDF implementation shares many issues with the slot-based task-splitting scheduling algorithm implementation. Actually, the changes introduced to the RT scheduling class are basically the same. Therefore, in this section, we opt for a high-level description of the Carousel-EDF implementation.

6.3.1 Scheduler implementation

In the context of this work, our implementations have followed one rule-of-thumb, which we believe is important for the implementation of scheduling algorithms for multiprocessor systems: the interactions among processors must be reduced. The implementation of the Carousel-EDF scheduler shares many issues with the slot-based scheduler described in Section 4.6. The Carousel-EDF scheduler implementation is also split into a top- and a low-level scheduler implementation, and it is supported by three data structures (see Figure 6.2): the per-processor retas_rb_rq; the carousel array; and server. Next, we discuss our implementation approach with respect to each such data element. We model each server as an instance of a new data structure called server, which stores the following information: (i) a field that uniquely identifies the server; (ii) a ready-queue; (iii) an atomic variable that exposes the server state, called flag; (iv) the length of its reserve, res_len; and finally, (v) the release-queue (whose purpose is explained later in this section). Tasks that are ready to run are kept in a ready-queue, ordered by their deadlines. However, rather than using a system-wide ready-queue, our implementation uses a per-server ready-queue, like the slot-based scheduler implementation described in Section 4.6. The purpose of flag is the same as in the slot-based scheduler implementation: its main goal is to ensure that a task of a server is executed by only one processor at any time instant. Carousel-EDF scheduling guarantees that at any time there is at most one processor allocated to any server, which eliminates contention in the access to the respective ready-queue. This is assured by the carousel array: an array, implementing a circular queue, that stores the server identifiers according to the cycling sequence order (in Section 6.4, we describe the algorithm that creates such a server cycling sequence order). Note that the server cycling sequence order is the same for all processors; the difference consists in the assignment of the first server to each processor. For each processor, we use a timer for handling the server reserves, a variable that stores the index into the carousel array indicating the currently allocated server, curr_car_idx, and a pointer to the currently allocated server, curr_srv.
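As a hypothetical C sketch, in the style of the kernel code shown later in Listing 6.1, the three data structures just described might look as follows; the field names follow the text, but the concrete types are assumptions, not the actual implementation.

#include <linux/types.h>
#include <linux/atomic.h>
#include <linux/rbtree.h>
#include <linux/hrtimer.h>

struct server {
    int srv_id;                    /* unique server identifier */
    atomic_t flag;                 /* exposes the server state */
    u64 res_len;                   /* reserve length, in nanoseconds */
    struct rb_root ready_queue;    /* ready jobs, ordered by absolute deadline */
    struct rb_root release_queue;  /* pending jobs, ordered by arrival time */
};

/* circular sequence of server identifiers, shared by all processors */
extern struct server *carousel[];

struct retas_rb_rq {               /* per-processor scheduling state */
    struct hrtimer timer;          /* fires at each reserve boundary */
    int curr_car_idx;              /* index of the current reserve in carousel[] */
    struct server *curr_srv;       /* shortcut to carousel[curr_car_idx] */
};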

The top-level scheduler works as follows. At setup time, each processor is configured with its first allocated server, through the index into the carousel array, curr_car_idx, and with the length of the first reserve (which in many cases is different from the server reserve length defined in res_len). Indeed, this initial phasing of every server (described in Section 6.4) assures that no server is allocated to two processors simultaneously. The timer is used to measure that initial reserve, and the index into the array is used to get the pointer to the currently allocated server (which is stored in curr_srv). At timer expiration, when the respective reserve ends, curr_car_idx is incremented (if it reaches the end of the array, it restarts from the beginning), the pointer to the currently allocated server is updated, and the timer is started for the new reserve. Then, the top-level scheduler triggers the invocation of the low-level scheduler so that the processor switches to a task belonging to the newly allocated server.

Figure 6.2: Data required to implement the Carousel-EDF scheduling algorithm in the Linux kernel.

Consider again Figure 6.1. As can be seen from that figure, during the first time slot of processor P4 there is some time not used by any server: the empty reserve. This continues on processor P3 in the second time slot, and on processor P2 in the third time slot. In practice, this unmapped reserve follows the same processor rotation pattern as the real servers. Hence, from the point of view of the implementation, the best option for handling this unmapped reserve is to allocate it to an empty server; that is, a server with a reserve length equal to the empty reserve length and without any task assigned to it. Note that, from a practical point of view, the rotation of such an empty server can be useful for the operating system, because this periodic idle time gives the processor some available processing capacity that the operating system can use to execute management work.

6.3.2 Releasing jobs

As mentioned before, each task generates a potentially infinite number of jobs. This generation is usually done either by means of interrupts or by means of timers. This is known as job release. From the implementation point of view, releasing a job consists in transferring the job from the release-queue to the ready-queue. Our implementation supports job release either by interrupts or by timers; however, we adopt the following release mechanism: each server also has its own release-queue (a red-black tree sorted by arrival time), and whenever a server, P̃q, is allocated to a processor (at the beginning of its reserve), its release-queue is polled to transfer all jobs with ai,j smaller than or equal to the beginning of the current reserve. Further, in case some pending job releases fall within the current reserve of server P̃q, a timer is set. In practice, this is performed by the top-level scheduler at timer expiration (see Listing 6.1).

static enum hrtimer_restart begin_reserve(struct hrtimer *timer)
{
    unsigned long long next_expire;
    struct rq *rq = cpu_rq(smp_processor_id());
    ...
    /* advance to the next reserve in the carousel */
    rq->retas_rb_rq.curr_car_idx = get_next_curr_car_idx(rq->retas_rb_rq.curr_car_idx);
    rq->retas_rb_rq.curr_srv = get_server_id(rq->retas_rb_rq.curr_car_idx);
    ...
    /* re-arm the timer for the end of the new reserve */
    next_expire = timer->expires + rq->retas_rb_rq.curr_srv->res_len;
    hrtimer_set_expires(timer, ns_to_ktime(next_expire));
    /* release all jobs whose arrival time has already elapsed */
    wake_up_jobs(&rq->retas_rb_rq.curr_srv->release_queue,
                 &rq->retas_rb_rq.curr_srv->ready_queue);
    resched_task(rq->curr);
    ...
    return HRTIMER_RESTART;
}

Listing 6.1: Timer callback function that releases all jobs whose ai,j is earlier than the beginning of the current reserve.

Thus, when a new reserve starts, the top-level scheduler updates the curr_srv pointer and transfers all pending jobs from the respective release-queue to the ready-queue. Then, it triggers the invocation of the low-level scheduler by flagging the currently executing task to be preempted.
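A hypothetical sketch of what the wake_up_jobs() helper invoked in Listing 6.1 might look like. struct job, its red-black tree linkage, and insert_job_by_deadline() are assumptions for illustration, and an explicit now argument (the start of the current reserve) is added here for clarity.

#include <linux/rbtree.h>
#include <linux/types.h>

struct job {
    struct rb_node rb;             /* linkage in the release queue */
    u64 arrival_time;              /* a_{i,j}, absolute, in nanoseconds */
    /* ... deadline, task pointer, etc. ... */
};

extern void insert_job_by_deadline(struct rb_root *ready, struct job *job);

static void wake_up_jobs(struct rb_root *release_queue,
                         struct rb_root *ready_queue, u64 now)
{
    struct rb_node *node;

    /* the release queue is ordered by arrival time, so the leftmost
     * node is always the earliest pending release */
    while ((node = rb_first(release_queue)) != NULL) {
        struct job *job = rb_entry(node, struct job, rb);

        if (job->arrival_time > now)
            break;                 /* future releases: a timer handles them */
        rb_erase(node, release_queue);
        insert_job_by_deadline(ready_queue, job);   /* EDF order */
    }
}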

6.4 Overhead-aware schedulability analysis

As mentioned before, Carousel-EDF comprises two algorithms: the off-line task-to-server assignment algorithm and the run-time task-dispatching algorithm. The purpose of the schedulability analysis is two-fold: (i) it checks whether the task set is schedulable; and (ii) it configures the system; that is, it computes the parameters used by the run-time task-dispatching algorithm. Therefore, the schedulability analysis is used by the off-line procedure. The first step in the off-line procedure is the partitioning of the task set into servers. Because at any time a server is allocated at most one processor, the processing demand of all tasks in a server, including overheads, must not exceed the processing capacity that can be supplied by one processor. Thus, we use the same demand-based and overhead-aware schedulability analysis used for mapping tasks to servers developed in Section 5.3.1, which ends with the task-to-server mapping presented in Algorithm 5.1. The second step consists in: (i) sizing the server reserves; (ii) creating the carousel; and finally, (iii) defining the initial phasing of each processor. To identify the formulation related to this stage, we superscript all functions with "rb", an abbreviation of "reserve-based".

6.4.1 Server inflation

Carousel-EDF guarantees that each server is assigned to a processor for the duration of its reserve in every time slot (whose size is S time units); that is, tasks of a server can execute only within a periodic reserve of length Res^{len}[P̃q] = U^{infl}[P̃q] · S. Hence, given a time interval of length t, only a fraction of t is supplied for the execution of a server. Thus, we can express this schedulability test as:

dbf^{rb}(P̃q, t) ≤ sbf^{rb}(P̃q, t), ∀t > 0    (6.1)

Starting the derivation of dbf^{rb}(P̃q, t) with dbf^{part}(τ[P̃q], t) (see Equation 5.10):

dbf^{rb}(P̃q, t) = dbf^{rb}(τ[P̃q], t) = dbf^{part}(τ[P̃q], t)    (6.2)

dbf^{part}(τ[P̃q], t) comprises the execution requirement of each task mapped to the server P̃q, including the context switch (or task switch) as well as the release overheads. Next, we derive the formulation taking into account the use of reserves. In order to model the execution requirement of the CPMD and interrupts, we follow the same approach used for slot-based non-split servers, described in Subsection 5.3.2.1. In fact, both the slot-based non-split and the reserve-based servers use only one reserve during each time slot. Then, the worst-case cost of the CPMD in a time interval of length t is:

dbf_CpmdO^{rb}(P̃q, t) = dbf_CpmdO^{sb:non-split}(P̃q, t)    (6.3)

On a uniprocessor, each interrupt Inti can be modelled as a sporadic interfering task with minimum inter-arrival time T_i^{Int} and an execution time (and deadline D_i^{Int}) equal to C_i^{Int}. However, under periodic reserves, the interrupts only exert overhead when present inside the reserves of the server under consideration. Outside its reserves, an interrupt contributes to the processor demand of some other server instead. Hence, each interrupt has to be modelled as a bursty periodic task. This was the argument used for the approach followed to model the interrupts for the slot-based task-splitting scheduling algorithm (see Equation A.8). Then, the execution demand of interrupts is given by:

dbf_IntO^{rb}(P̃q, t) = dbf_IntO^{sb:non-split}(P̃q, t)    (6.4)

Hence, the execution requirement of a reserve-based server is:

dbf^{rb}(P̃q, t) = dbf^{rb}(τ[P̃q], t) + dbf_CpmdO^{rb}(P̃q, t) + dbf_IntO^{rb}(P̃q, t)    (6.5)

Like the slot-based non-split servers, a reserve-based server P̃q is confined to execute within a single reserve of length Res^{len}[P̃q] every time slot, S. We model this reduction by amending sbf^{rb}(P̃q, t) to:

sbf^{rb}(P̃q, t) = t − dbf_Fake^{rb}(P̃q, t)    (6.6)

where dbf_Fake^{rb}(P̃q, t) is computed as:

dbf_Fake^{rb}(P̃q, t) = dbf_Fake^{sb:non-split}(P̃q, t) = max(0, ⌊(t − D^{fake} + ResL) / T^{fake}⌋ + 1) · (C^{fake} + ResL)    (6.7)

where C^{fake} = S − Res^{len}[P̃q], T^{fake} = S, and D^{fake} = C^{fake} are the attributes of the fake task that models the unavailability of each reserve. ResL is an upper bound on the reserve latency. However, the ResL of the reserve-based scheduling algorithm could potentially be larger than the ResL of the slot-based one (see Figure 5.5), due to the release mechanism adopted for the reserve-based scheduling algorithm. Figure 6.3 illustrates the overheads associated with switching from one reserve to another. Note that, due to the release approach followed for Carousel-EDF, the reserve latency includes the release overhead of the jobs whose arrival time has elapsed. However, this overhead is already accounted for in dbf^{part}(P̃q, t); in other words, we count this specific overhead twice. The reason is that, given the sporadic arrival nature of the tasks, we are not able to determine how many jobs are released inside or outside of the respective server reserves.
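As a sketch, Equation 6.7 translates directly into C. Plain 64-bit integers (e.g. nanoseconds) are assumed here, although, as discussed in Section 5.5, the real test may require arbitrary-precision arithmetic.

#include <stdint.h>

/* Demand of the fake task modelling reserve unavailability (Eq. 6.7):
 * C_fake = S - res_len, T_fake = S, D_fake = C_fake, all in the same
 * time unit; resl is the reserve latency upper bound ResL. */
static uint64_t dbf_rb_fake(uint64_t t, uint64_t S, uint64_t res_len,
                            uint64_t resl)
{
    uint64_t c_fake = S - res_len;
    uint64_t t_fake = S;
    uint64_t d_fake = c_fake;

    if (t + resl < d_fake)         /* the max(0, .) guard of Eq. 6.7 */
        return 0;
    return ((t + resl - d_fake) / t_fake + 1) * (c_fake + resl);
}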

Figure 6.3: Illustration of the reserve latency. The ResL incorporates the release of jobs whose arrival time has elapsed.

Replacing sbf^{rb}(P̃q, t) by the formulation presented in Equation 6.6 in Inequality 6.1 and moving some terms from the right-hand side to the left-hand side, we obtain the following schedulability test for reserve-based servers:

dbf^{rb}(P̃q, t) + dbf_Fake^{rb}(P̃q, t) ≤ t, ∀t > 0    (6.8)

We use the schedulability test in Inequality 6.8, with dbf^{rb}(P̃q, t) as given by Equation 6.5 and sbf^{rb}(P̃q, t) as given by Equation 6.6, to determine the server's inflated utilization. For that purpose, we define the inflate_rb function. The algorithm of the inflate_rb function is similar to the pseudo-code described in Algorithm 5.2. The only difference is the function dbf_rb_check, which implements the schedulability test in Equation 6.1, rather than the function dbf_sb_non_split_check. The dbf_rb_check function is implemented using QPA. Hence, the algorithm of that function is similar to Algorithm 5.4, which shows, in pseudo-code, how the QPA technique can be used with each of the schedulability tests we have defined so far. For the dbf_rb_check function, dbf^{xxx} is replaced by dbf^{rb}(P̃q, t).
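A minimal sketch of how inflate_rb can be organised, assuming a bisection search down to the accuracy ε; whether Algorithm 5.2 actually bisects or sweeps is not detailed here, so the search strategy, as well as the server_utilization and set_inflated_utilization accessors, is an assumption, while dbf_rb_check is the QPA-based test just described.

#include <stdbool.h>
#include <stdint.h>

typedef struct server server_t;
extern double server_utilization(const server_t *srv);          /* U[P~q], assumed */
extern void   set_inflated_utilization(server_t *srv, double u); /* assumed */
extern bool   dbf_rb_check(server_t *srv, uint64_t t);

double inflate_rb(server_t *srv, double eps, uint64_t t)
{
    double lo = server_utilization(srv);   /* inflation cannot shrink below U[P~q] */
    double hi = 1.0;

    while (hi - lo > eps) {
        double mid = (lo + hi) / 2.0;
        set_inflated_utilization(srv, mid);   /* Res_len = mid * S */
        if (dbf_rb_check(srv, t))
            hi = mid;            /* schedulable: try a smaller reserve */
        else
            lo = mid;            /* not schedulable: needs more capacity */
    }
    set_inflated_utilization(srv, hi);
    return hi;                   /* minimum sufficient inflated utilization */
}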

6.4.2 Generation of the carousel

The second step in the off-line procedure is the generation of the carousel. It comprises determining the size of the reserves and their sequence, and also the initial phasing of the different processors. Before describing the carousel generator algorithm, let us reason about the server inflation. The server inflation algorithm may lead to an inflated utilization close to 100% or even higher. If the inflated utilization is 100% or higher, it means that the overheads incurred by the use of reserves are higher than the unused processor capacity, and therefore the server should be assigned to a dedicated processor, becoming a single server (a single server is assigned to one dedicated processor, which is allocated only to that server; there is then no need for reserves). Even if the inflated utilization of a server is lower than 100%, it may be worth assigning it to a dedicated processor. This is the case when the size of the reserve of such a server is close to S, the period of the reserves: the uncertainty in the temporal behaviour of the operating system in general, and in the time measurement in particular, gives rise to the instantaneous migration problem described in Section 4.5, which may lead to a race condition in which two processors schedule the same task. To address this issue, we suggest the use of rule A2 described in Section 5.4, which states, in other words, that:

A2: if a server’s inflated utilization is larger than 1.0 − ResL/S, then the inflated utilization of the server should be set to 1.0, and the server assigned to a dedicated processor; that is, the server becomes a single server.

The rationale for this rule is to ensure that, when a processor switches to a new server reserve, the previous reserve of that server has terminated. In other words, it does not make sense to employ a reserve mechanism if its potential benefit is eliminated by the overheads incurred by such a mechanism. Note that this amounts to pure partitioning for the respective tasks. Algorithm 6.1 shows the algorithm that generates the sequence of servers. It has a single loop in which it iterates over the set of servers created by the task set partitioning algorithm (Algorithm 5.1). For each server, it inflates its utilization by invoking the inflate_rb function, identifies single servers, and assigns a sequence number to each server, which determines the order of the server in the carousel. Sequence number zero is assigned to single servers, i.e. servers that are assigned to dedicated processors and hence do not belong to the carousel. The algorithm returns true if the set of servers is schedulable on m processors, and false otherwise.

Algorithm 6.1: Pseudo-code algorithm of the reserve sequence generator.

Input: set of k servers P̃q, with 1 ≤ q ≤ k
       ε {accuracy in server inflation}
       m {number of processors}
Output: boolean {true if schedulable}
        Γ[] {server sequence}

S ← (1/δ) · min_{τi∈τ}(Ti, Di) {time slot}
s ← 1 {sequence order}
U ← 0 {system utilization}
for q ← 1 to k do {inflate servers}
  t ← 2 · lcm_T(P̃q)
  U^{infl}[P̃q] ← inflate_rb(P̃q, ε, t)
  if U^{infl}[P̃q] ≥ 1.0 − ResL/S then
    U^{infl}[P̃q] ← 1.0
    Type[P̃q] ← SINGLE
    Γ[q] ← 0 {0 means not in sequence}
  else
    Γ[q] ← s
    s ← s + 1
  end if
  U ← U + U^{infl}[P̃q]
end for
return U ≤ m

To complete the generation of the carousel, we need to determine the initial phasing of the carousel on each processor, i.e. to determine the first reserve of each processor. Let r be the number of processors used in the carousel. As illustrated in Figure 6.1, the schedule of each processor must be such that the schedule of processor Pp is S time units ahead of that of processor Pp−1, modulo r. Therefore, for each of the r processors, we need to determine the server of its first reserve and its duration. Algorithm 6.2 shows the pseudo-code of an algorithm that computes these parameters. It takes as inputs the carousel, including the server parameters, and the number of processors, r; the processor identifiers are assumed to range from 1 to r. The algorithm has a single loop in which it iterates over the carousel (and also over the r processors used to run the carousel). It determines the servers that are active on the first processor at each multiple of the time slot S, and the time left at that point until the end of the reserve, and assigns these values as the parameters of the first reserve of the corresponding processor.

Algorithm 6.2: Pseudo-code algorithm of the first reserve generator.

Input: set of k servers P̃q, with 1 ≤ q ≤ k
       Γ[] {server sequence}
       r {number of processors running the carousel}
Output: ρ[] {server of first reserve of each processor}
        Φ[] {duration of first reserve of each processor}

p ← 1 {processor index}
S ← (1/δ) · min_{τi∈τ}(Ti, Di) {time slot}
t ← 0 {accumulated schedule time}
for q ← 1 to k do
  if Γ[q] ≠ 0 then {only servers that belong to the carousel}
    t ← t + U^{infl}[P̃q] · S {expiration time of this reserve}
    if t ≥ (p − 1) · S then {reserve crosses a time slot boundary}
      ρ[p] ← q {assign first server}
      Φ[p] ← t − (p − 1) · S {duration of first reserve}
      p ← p + 1
    end if
  end if
end for
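As an illustration with hypothetical values, suppose r = 4 processors and carousel servers with inflated utilizations 0.9, 0.8, 0.7 and 0.8, in sequence order. The accumulated time t then takes the values 0.9S, 1.7S, 2.4S and 3.2S, so Algorithm 6.2 yields ρ[1] = 1 with Φ[1] = 0.9S, ρ[2] = 2 with Φ[2] = 0.7S, ρ[3] = 3 with Φ[3] = 0.4S, and ρ[4] = 4 with Φ[4] = 0.2S. Each processor thus starts with the tail of a different server's reserve, and the schedules of consecutive processors are staggered by S, as in Figure 6.1.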

6.4.3 Supporting other job release mechanisms

As mentioned before, the Carousel-EDF scheduler implementation also supports the release of jobs by timers or interrupts. Under the criterion that the jobs of a task can be released by any processor used by that task, this means that, under Carousel-EDF, any job may be released by any processor. In that case, we have to compute the execution demand of a server taking into account the release interference of the other servers. The release interference dbf_RelI^{rb}(P̃q, t) in a time interval of length t is computed as:

dbf_RelI^{rb}(P̃q, t) = Σ_{i=1 ∧ i≠q}^{k} dbf_RelO^{part}(P̃i, t)    (6.9)


where dbf_RelO^{part}(P̃i, t) gives the release interference of a server (see Equation 5.9) and dbf^{rb}(P̃q, t) (see Equation 6.5) is amended to:

dbf^{rb}(P̃q, t) = dbf^{rb}(τ[P̃q], t) + dbf_CpmdO^{rb}(P̃q, t) + dbf_IntO^{rb}(P̃q, t) + dbf_RelI^{rb}(P̃q, t)    (6.10)

Further, in this case, all tasks could incur the IPI latency, IpiL, as in the case of split tasks under slot-based semi-partitioned scheduling (see Equation 5.30). Then, dbf^{rb}(τ[P̃q], t) has to be modified to incorporate such overhead, as:

dbf^{rb}(τ[P̃q], t) = dbf^{sb:split}(τ[P̃q], t) = dbf_RelO^{part}(τ[P̃q], t) + Σ_{τi∈τ[P̃q]} max(0, ⌊(t − Di + RelJ + IpiL) / Ti⌋ + 1) · (Ci + RelO + 2·CtswJ)    (6.11)

6.5 Utilization bound

The utilization bound of an algorithm is a simple performance metric used to compare scheduling algorithms. In this section, we present the utilization bound of the Carousel-EDF scheduling algorithm based on the utilization factor. By design, Carousel-EDF offers, for the same value of the corresponding parameter δ, the same utilization bound as the original NPS-F algorithm. This bound is given by the following expression:

(2·δ + 1) / (2·δ + 2)

The following reasoning establishes this.

Lemma 1. A set of k servers is schedulable on m processors under Carousel-EDF if and only if

Σ_{q=1}^{k} U^{infl}[P̃q] ≤ m

Proof. This follows from the fact that Carousel-EDF successfully schedules all servers if and only if it provides U^{infl}[P̃q] · S time units to each server P̃q within every time window of length S (equal to the time slot). This processing time is provided to the k servers by the m processors (by design, in such a manner that each server receives processing time from at most one processor at any time instant).

Lemma 2. After the bin-packing stage of Carousel-EDF, it holds that (1/k) · Σ_{q=1}^{k} U[P̃q] > 1/2, for the k utilized bins (servers).

Proof. This follows from the bin-packing heuristic employed, which opens a new server (bin) only when a task does not fit into an already-open server. Hence, the utilizations of any two servers sum to more than 1 (otherwise, their contents could have been placed in a single server). Consequently, at most one server has utilization not exceeding 1/2, and the average utilization of the k servers exceeds 1/2.

Lemma 3. For any server P̃q, it holds that U^{infl}[P̃q] ≤ inflate(U[P̃q]).

Proof. This holds because Carousel-EDF inflates each server P̃q to the minimum U^{infl}[P̃q] sufficient for schedulability, according to an exact schedulability test. In contrast, the inflate function used by NPS-F potentially over-provisions.

Now we can prove the utilization bound of Carousel-EDF. The following theorem and its proof are analogous to Theorem 3 from [BA09b], which proved the utilization bound of NPS-F.

Theorem 1. The utilization bound of Carousel-EDF is (2·δ + 1) / (2·δ + 2).

Proof. An equivalent claim is that every task set with utilization not greater than ((2·δ + 1) / (2·δ + 2)) · m is schedulable under Carousel-EDF. We will prove this equivalent claim. From Lemma 1, Carousel-EDF successfully maps all k servers to a feasible online schedule if and only if

Σ_{q=1}^{k} U^{infl}[P̃q] ≤ m

Applying Lemma 3 to the above, a sufficient condition for the schedulability of all k servers is

Σ_{q=1}^{k} inflate(U[P̃q]) ≤ m    (6.12)

For the function inflate(U) = ((δ + 1) · U) / (U + δ), it holds that (d/dU) inflate(U) > 0 and (d²/dU²) inflate(U) < 0 over the interval [0, 1]. Therefore, from Jensen's Inequality [Jen06], we have:

Σ_{q=1}^{k} inflate(U[P̃q]) ≤ k · inflate(Ū)

where

Ū = (1/k) · Σ_{q=1}^{k} U[P̃q]

Hence, combining this with the sufficient condition of Inequality 6.12, a new sufficient condition for the successful mapping of the k servers is

k · inflate(Ū) ≤ m    (6.13)

Now, let the function α(U) denote the expression inflate(U) − U. Then, α(U) = (U · (1 − U)) / (U + δ) and Inequality 6.13 can be rewritten as
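For completeness, the algebra behind this expression:

α(U) = inflate(U) − U = ((δ + 1) · U) / (U + δ) − U = ((δ + 1) · U − U · (U + δ)) / (U + δ) = (U · (1 − U)) / (U + δ)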

k · (Ū + α(Ū)) ≤ m    (6.14)

Given that α(U) is a decreasing function of U over [1/2, 1], it holds that

α(U)/U < α(1/2)/(1/2) = 1/(2δ + 1), ∀U ∈ [1/2, 1]  ⇒  α(U) < (1/(2δ + 1)) · U, ∀U ∈ [1/2, 1]

Combining the above with the fact that (from Lemma 2) Ū > 1/2, we obtain from Inequality 6.13 the following new sufficient condition:

k · (Ū + (1/(2δ + 1)) · Ū) ≤ m    (6.15)

In turn, this can be equivalently rewritten as:

k · Ū ≤ ((2δ + 1) / (2δ + 2)) · m    (6.16)

But the left-hand side of the above inequality corresponds to the cumulative utilization of all tasks in the system. Therefore, Σ_{τi∈τ} ui ≤ ((2δ + 1) / (2δ + 2)) · m is a sufficient condition for schedulability under Carousel-EDF. This proves the theorem.

6.6 Upper bounds on preemptions

Under either of the two approaches to server mapping formulated in [BA09b] for NPS-F, an upper bound nr_pree^{NPS-F} on the preemptions (including migrations) related to the reserves within a time interval of length t is given by:

nr_pree^{NPS-F} < ⌈t/S⌉ · 3 · m    (6.17)

This follows from the fact that, within each time slot of length S, other than preemptions due to EDF, there occur: (i) one preemption per server (hence k in total), when its last (or only) reserve runs out; plus (ii) at most m − 1 migrations, corresponding to the case when some server migrates between a pair of successively indexed processors. Therefore

nr_pree^{NPS-F} = ⌈t/S⌉ · k + ⌈t/S⌉ · (m − 1)    (6.18)

and the bound of Inequality 6.17 is derived from the fact that m − 1 < m and also, as proven by Corollary 1 in [BA09b], k < 2 · m. By comparison, Carousel-EDF generates, within each time slot, at most k preemptions/migrations (other than those due to EDF). Each of those corresponds to the ending of the reserve of the corresponding server. Therefore

nr_pree^{Carousel-EDF} = ⌈t/S⌉ · k < ⌈t/S⌉ · 2 · m    (6.19)

By comparing the respective bounds (Equations 6.17 and 6.19), we observe that the upper bound on preemptions due to the time division under Carousel-EDF is 33.3% smaller than that under NPS-F. However, the above upper bounds are pessimistic with respect to the value of k (i.e. they assume that k is close to 2m, which is the unfavourable scenario in terms of preemptions/migrations generated). In fact, the number of servers k may range from m + 1 to 2m − 1: if it were less, we would use partitioning, and it cannot be 2m or higher, from Lemma 2 in Section 6.5. In many cases, therefore, k could in fact be close to m. And, by inspection (see Equation 6.18), for k = m + 1 and m → ∞, the reduction in the upper bound on reserve-related preemptions/migrations when using Carousel-EDF, as compared to NPS-F, would tend towards 50%. Therefore, Carousel-EDF involves 33.3% to 50% fewer reserve-related preemptions and migrations than NPS-F.
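As a small numerical illustration with hypothetical values, take m = 4 processors and k = 5 = m + 1 servers. Equation 6.18 gives ⌈t/S⌉ · (5 + 3) = ⌈t/S⌉ · 8 reserve-related preemptions/migrations for NPS-F, against ⌈t/S⌉ · 5 for Carousel-EDF (Equation 6.19): a reduction of 37.5%, within the 33.3% to 50% range just derived.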

6.7 Summary

In this chapter, we proposed a new scheduling algorithm for multiprocessor systems, called Carousel-EDF. It presents some similarities with the slot-based task-splitting scheduling algorithms under consideration in this dissertation. It also requires two algorithms: an off-line task-to-processor algorithm and a run-time task-dispatching algorithm. The schedulability analysis, which guides the off-line algorithm, is also processor demand-based and overhead-aware. Using this schedulability analysis, the off-line algorithm maps the tasks to servers, computes the server reserve lengths, defines the cycling server order, and defines the initial phasing of each processor. The run-time dispatching algorithm divides time into reserves that are mapped to servers. These server reserves rotate through all processors, creating a carousel, in such a way that no two processors are allocated the same server simultaneously. The tasks of each server are scheduled according to the EDF scheduling policy. One question arises: how should Carousel-EDF be categorized? Is it global, partitioned, or semi-partitioned? We observe that it does not easily fit in this taxonomy. It is a global algorithm in that it allows the migration of any task, yet it can be implemented with contention-free ready queues, like partitioned algorithms. Furthermore, like partitioned algorithms, it partitions the task set into servers, such that tasks from one server do not directly contend for processing resources with tasks from other servers. On the other hand, in many respects, Carousel-EDF is closer to NPS-F [BA09b], a semi-partitioned algorithm from which it evolved. In particular, the concept of server was inherited from NPS-F, where it is called a notional processor. Furthermore, servers, like notional processors, are allocated periodic reserves [AB08]: time intervals during which a processor is exclusively allocated to running tasks of the corresponding server. NPS-F has many attractive features: its utilization bounds range from 75% to 100%, by trading off with preemptions. Yet, it has lower upper bounds on preemptions than any known algorithm for the same utilization. Carousel-EDF preserves the utilization bounds of NPS-F, including their trade-off with preemptions, and further reduces the upper bounds on preemptions. This is achieved by never splitting a server. As a result, it is able to reduce the maximum number of preemptions, and therefore context switches, introduced by server reserves to half of those of NPS-F. An additional benefit of not splitting reserves is that the overhead-aware schedulability analysis of Carousel-EDF presented here is much simpler than that of the slot-based task-splitting scheduling algorithm described in Chapter 5, on which it is based.

Chapter 7

Evaluation

7.1 Introduction

As mentioned before, the scheduling algorithms under consideration in this work comprise two algorithms: an off-line task-to-processor assignment algorithm (in the form of task-to-server and server-to-processor assignments) and a run-time task-dispatching algorithm. In this chapter, we evaluate and compare both algorithms.

7.2 The off-line schedulability analysis

In this section, we apply the new schedulability theory developed so far in three studies. The goal of the first study is to compare the new schedulability theory for slot-based task-splitting scheduling algorithms with the originally proposed utilization-based schedulability theory. The goal of the second study is to show the practicability of the new schedulability theory when we consider overheads. Finally, the third study compares the slot-based task-splitting scheduling algorithms against the reserve-based scheduling algorithm.

7.2.1 Quantification of overheads

In order to account for the effect of scheduling overheads using the new theory, worst-case estimates for the various overheads themselves are required as input to the analysis. However, upper bounds on the worst-case values of the previously identified overheads cannot be determined via a purely analytical approach, because they depend in complex ways on the characteristics of both the hardware and the software, including the operating system, that are rarely documented with sufficient detail. Therefore, our approach was to experimentally measure (and log) the overheads of 100 randomly generated task sets, scheduled during 1000 seconds each, first under S-EKG and then under NPS-F. The corresponding maximum recorded values were then treated as safe upper bounds for the respective overheads. Although, arguably, there is always the possibility that worse values might be observed if the experiment ran for more time, we deem this level of accuracy sufficient for our purposes in this study. For a more detailed study, or in practice, the number of measured values will likely vary, and depend on such factors as the variability of the measured parameters or the level of safety required.

The 24-core platform used in our experiments is built from AMD Opteron 6168 chips [Dev] running at a frequency of 1.9 GHz. Each Opteron 6168 module has 12 cores and occupies one socket on the motherboard. The operating system was the modified 2.6.31 Linux kernel [SBTA11]. All parameters were determined in a way consistent with their definition in Section 4.7. The context switch overhead is measured from the time instant the scheduler starts executing until the time instant when it calls the assembly routine that switches from the currently executing job to the new one. To determine the release jitter, we measured the time interval between the (theoretical) job arrival time and the time instant when the timer actually expires, i.e., when the timer callback is invoked. The release overhead was determined by measuring the time interval between the time instant when the timer callback is invoked and the time instant when a task removed from the release-queue is inserted into the ready-queue. The reserve latency was estimated by measuring the time interval from the time at which a reserve should (theoretically) start until the time instant when a ready job (if one exists) starts to execute within the reserve. Finally, we measured the IPI latency as the time interval between the generation of the interrupt (by the emitting processor) and the time instant the corresponding handler starts executing (on the other processor).

Table 7.1: Experimentally-derived values for the various scheduling overheads (in microseconds).

          RelJ    RelO    ResL    CtswO   IpiL
Measured  17.45   8.56    30.24   35.21   19.30
Used      20.00   10.00   40.00   40.00   20.00

Table 7.1 presents the values of these parameters determined experimentally and the rounded-up values that were used as input to our experimental evaluation of the demand-based overhead-aware schedulability analysis. Essentially, we took a pessimistic stance and derived the values by rounding up the maximum values measured for each of the parameters.

Other than the various overheads identified earlier, we also collected measurements for the tick interrupt, which occurs on every processor. This is a periodic interrupt used by the operating system kernel for triggering various operations, such as the invocation of the scheduler. The worst-case execution time measured for this interrupt was 8.06 microseconds. Although its periodicity (approximately one millisecond in our setup) can be configured via the Linux kernel macro HZ, in practice this interrupt suffers from jitter. We estimated this jitter, by comparing the recorded inter-arrival times with the reference period, as 177 microseconds. These values were obtained with a Linux kernel compiled with both the tickless option (for suppressing the tick interrupts during idle intervals) and the CPU frequency scaling features disabled.

We did not derive estimates for the overheads of any interrupts other than the tick interrupt, because all other interrupts can be configured to be managed by one specific processor (preferably, the least utilized one). Hence, we deemed that, even if we had gone through that effort, their inclusion would not meaningfully change the overall picture. However, our analysis still allows the overheads related to any interrupt to be specified as input and factored in. Determining CPMD is a challenging research problem that exceeds the scope of this work. For the state-of-the-art, see [Bas11, ADM12, LAMD13, DSA+13].
Nevertheless, our new schedulability theory allows the incorporation of such effects. In the study with overheads, which we present below, we have chosen an arbitrary value of 100 microseconds for CpmdO. Although, in a strict sense, the measurement-based estimates characterize only the system in which the measurements were made, we believe that this particular contribution is important for the following reasons. First, it shows the feasibility of the new analysis, which in turn further validates the slot-based task-splitting approach for multiprocessor scheduling as practical and efficient. Second, by documenting how we derived the measurement-based estimates in a manner consistent with the earlier definitions of the respective overheads, it is possible to re-use the same approach in order to derive estimates for the overheads in different systems.

7.2.2 Experiment design

Given the goals of our studies, we have chosen as metric the normalized inflated system utilization, which is defined as follows:

U^{infl}_s = \frac{1}{m} \cdot \sum_{q=1}^{k} U^{infl}[\tilde{P}_q] \qquad (7.1)

where m is the number of processors in the system, k is the number of servers, and U^{infl}[P̃q] is the inflated utilization of each of the servers. A system is considered unschedulable whenever this metric exceeds 1.0. To compare schedulability analyses, this metric is used indirectly: a schedulability analysis is better than another if, for a given set of task sets, the former finds more task sets schedulable than the latter. Likewise, this metric can be used indirectly in the evaluation of a scheduling algorithm: an algorithm is better than another if it is able to schedule more task sets.
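As a minimal illustration of Equation 7.1 (the function and argument names are hypothetical), the metric is just the sum of the inflated server utilizations divided by the number of processors:

/* Normalized inflated system utilization (Equation 7.1).
 * infl_util[q] holds U^infl[P~q] for each of the k servers;
 * m is the number of processors. A result above 1.0 flags
 * the system as unschedulable. */
double normalized_inflated_utilization(const double *infl_util, int k, int m)
{
    double sum = 0.0;
    for (int q = 0; q < k; q++)
        sum += infl_util[q];
    return sum / m;
}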

In our studies, we consider different types of task sets. We characterize each task set by its normalized utilization and by the characteristics of its tasks. Because we use a synthetic load, generated randomly using an unbiased random number generator, we characterize the task set normalized utilization not by a single value but by an interval with minimum value Us:min and width inc, [Us:min, Us:min + inc). With respect to the characteristics of the tasks, the period of each task, Ti, is uniformly distributed over [Ti:min, Ti:max). All tasks generated have implicit deadlines (Di = Ti) in order to allow comparisons with the original analysis. The worst-case execution time of a task, Ci, is derived from Ti and the task's utilization, ui, which is also uniformly distributed over [ui:min, ui:max).

Algorithm 7.1 shows the pseudo-code of the task set generation procedure. It takes as inputs the minimum normalized system utilization, Us:min; the granularity of the normalized system utilization of each task set, inc; the number of task sets, nτ; the minimum and maximum values of the utilization of each task in all task sets, ui:min and ui:max, respectively; the minimum and maximum values of the period of each task in all task sets, Ti:min and Ti:max, respectively; and the number of processors in the system, m. The output of this procedure is nτ task sets, which are put in array Γ. The normalized system utilization of task set Γ[j] (for j between 1 and nτ) is in the range [Us:min + (j − 1) · inc, Us:min + j · inc), and the parameters of each task in these sets satisfy the values specified in the inputs of the procedure.

Algorithm 7.1: Pseudo-code algorithm of the task set generator.

Input:
    Us:min, minimum normalized system utilization
    inc, granularity of the normalized system utilization of the task sets
    nτ, number of task sets to generate
    [ui:min, ui:max], range of the utilization of each task to generate
    [Ti:min, Ti:max], range of the period of each task to generate
    m, number of processors/cores
Output:
    Γ[nτ], set of generated task sets

Us:max ← Us:min + inc
for j ← 1 to nτ do
    Γ[j] ← ∅
    i ← 0
    τ ← ∅
    Us ← 0
    in_range ← false
    while Us < Us:max do
        {generate a task}
        ui ← uniform(ui:min, ui:max)
        Ti ← uniform(Ti:min, Ti:max)
        Ci ← Ti · ui
        τi ← create_task(Ti, Ci)
        {add task to task set}
        add_task_to_taskset(τi, τ)
        Us ← Us + ui/m
        i ← i + 1
        if Us:min ≤ Us < Us:max then
            {Done with this task set: it has the target utilization}
            add_taskset_to_set(τ, Γ[j])
            in_range ← true
            break
        end if
    end while
    if ¬in_range then
        {Abort current task set generation: utilization is above target}
        j ← j − 1 {Try again}
    else
        {Generated task set successfully. Update target utilization range for next task set}
        Us:min ← Us:max
        Us:max ← Us:min + inc
    end if
end for
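Algorithm 7.1 assumes a uniform(min, max) helper; one possible C implementation, based on the POSIX drand48() generator (our choice of PRNG, not necessarily the one used in this work), is:

#include <stdlib.h>

/* Draw a double uniformly from [min, max); seeding with srand48()
 * corresponds to the per-run seed values mentioned below. */
static double uniform(double min, double max)
{
    return min + (max - min) * drand48();
}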

In all experiments, we used Us:min = 0.75, inc = 0.001 and nτ = 250, allowing us to evaluate the algorithms for a fairly loaded system, i.e. one whose load has a normalized utilization between 75% and 100%. Because of the random nature of the load used, we ran each experiment ten times, initializing the pseudo-random number generator with different seed values. Thus, we generated 2500 task sets. Indeed, for systems with a lighter load, we would expect no major differences, as all task sets would most likely be schedulable. To evaluate the effect of different types of tasks, we consider four classes of task sets according to the utilization of their tasks:

• Heavy: Tasks whose ui is in the range [0.65, 0.95).

• Medium: Tasks whose ui is in the range [0.35, 0.65).

• Light: Tasks whose ui is in the range [0.05, 0.35).

• Mixed: Tasks whose ui is in the range [0.05, 0.95).

Independently of their utilization, the periods of all tasks of all task sets are uniformly distributed in the range [5, 50] milliseconds, with a resolution of one millisecond. In Appendix B, we deal with other periods. Finally, in all experiments, we set m equal to 24, the number of processors in the system we used to measure the overheads.

For the sake of clarity, in the plots presented in the next subsections, task sets were grouped into utilization classes (each consisting of 100 task sets). For example, all task sets with Us in the range [0.75, 0.76) are in one class, all task sets with Us in the range [0.76, 0.77) are in the next class, and so on. The values shown in the plots are the average of each class. Given the goals of this study, we have chosen as metric the normalized inflated system utilization, U^{infl}_s = (1/m) · ∑_{q=1}^{k} U^{infl}[P̃q] (see Equation 7.1). A schedulability analysis is more efficient than another if its normalized inflated system utilization is lower. Moreover, other metrics are used to complement the information given by the normalized inflated system utilization metric.

7.2.3 Preliminary evaluation of the merits of slot-based task-splitting over partitioning, when overheads are considered

However, before going through the evaluation study, we believe it is important to assess the benefit of using slot-based task-splitting scheduling algorithms against a simpler approach like Partitioned-EDF (P-EDF).

First, let us note that the algorithms under consideration dominate P-EDF, even when overheads are taken into account. This is because they schedule tasks within each server using the EDF scheduling policy. Moreover, they also use partitioning, but they partition tasks to servers, not to processors. The overhead-aware task-to-server assignment algorithm employed by slot-based task-splitting scheduling algorithms (defined by Equation 5.6) treats each server as a uniprocessor (i.e. without the overheads related to reserve migrations and other related issues). This is equivalent to attempting to assign the task set using P-EDF. The difference is that, if more than m bins are needed, P-EDF will declare failure (assuming a system composed of m processors), whereas with, e.g., NPS-F there is still a chance that those task sets for which P-EDF fails will be schedulable. On the other hand, any task set using no more than m bins can still be scheduled with NPS-F using servers. (In the worst case, the overhead-aware server-to-processor mapping algorithm will assign each server to a dedicated processor, i.e. partitioning.)

Hence, it is, by design, impossible for slot-based task-splitting to do worse than P-EDF (with or without overheads). The question then becomes by how much the schedulability improves and whether this justifies the added complexity. An answer to this question can be given by the plots presented in Figure 7.1.
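To make the dominance argument concrete, the sketch below packs task utilizations into unit-capacity bins (servers) with first fit, ignoring inflation and overhead terms; it is illustrative, not the dissertation's implementation. Capped at m bins it behaves like P-EDF partitioning, whereas NPS-F may open more than m bins and still succeed by later mapping the resulting servers onto processor reserves:

#include <stddef.h>

/* First-fit packing of task utilizations u[0..n-1] into bins of
 * capacity 1.0 (the uniprocessor EDF bound for implicit deadlines).
 * Returns the number of bins used, or -1 if more than max_bins
 * would be needed. */
int first_fit(const double *u, size_t n, double *bin, size_t max_bins)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        size_t b = 0;
        while (b < used && bin[b] + u[i] > 1.0)
            b++;                   /* skip bins where the task does not fit */
        if (b == used) {
            if (used == max_bins)
                return -1;         /* P-EDF would declare failure here */
            bin[used++] = 0.0;     /* open a new bin (server) */
        }
        bin[b] += u[i];
    }
    return (int)used;
}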

[Figure 7.1 plots: perc^τ_sched (y-axis, 0-100%) versus normalized system utilization (x-axis, 0.75-1.00) for task sets composed of (a) mixed, (b) light, (c) medium, and (d) heavy tasks; series: NPS-F (δ = 1) and P-EDF.]

Figure 7.1: Comparison between NPS-F and P-EDF scheduling algorithms in the presence of overheads.

These plots show the schedulability success ratio. This metric gives, for each utilization class (recall that each utilization class is composed of 100 task sets in a range of 0.01), the percentage of task sets that are schedulable, and is computed as:

perc^{\tau}_{sched} = \frac{nr^{\tau}_{sched}}{nr^{\tau}_{sched} + nr^{\tau}_{unsched}} \cdot 100 \qquad (7.2)

where nr^τ_sched is the number of schedulable task sets and nr^τ_unsched is the number of unschedulable task sets; the sum nr^τ_sched + nr^τ_unsched gives the total number of task sets. In Figure 7.1, we show a schedulability comparison between NPS-F and P-EDF in the presence of overheads. NPS-F considerably outperforms P-EDF for all types of task sets, except for task sets composed of light tasks, where its lead is smaller. This is because P-EDF performs very well for such task sets to start with (since it is easier to partition a task set the smaller the maximum task utilization is).
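For reference, the success ratio of Equation 7.2 is trivially computed per utilization class (a minimal sketch, names ours):

/* Schedulability success ratio (Equation 7.2) for one utilization
 * class of task sets. */
double perc_sched(unsigned nr_sched, unsigned nr_unsched)
{
    return 100.0 * nr_sched / (nr_sched + nr_unsched);
}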

Hence, these results establish that, even when considering overheads, slot-based task-splitting is a valid option for real-time systems, with better performance than the traditional partitioning.

7.2.4 Evaluation of the new analysis in the absence of overheads

As a first step in the evaluation of the new analysis, we compare it to the original analysis published for both algorithms, so as to evaluate the improvements in processor utilization that stem from having less pessimism in the new analysis. Because the original, utilization-based analysis assumes no scheduling overheads, the results presented in this subsection were obtained considering all overheads equal to zero.

7.2.4.1 Experiments for the S-EKG scheduling algorithm

In order to apply the new analysis to S-EKG, we employed the task-to-processor mapping algorithm outlined in Section 5.4.2.1. A major difference between this algorithm and the original S-EKG algorithm is that it does not cap the utilization of each processor to the theoretical utilization bound, UB_{S-EKG}, but rather uses the new schedulability tests presented in Section 4.7. In our study, we considered the effect of the S-EKG design parameter δ, in addition to the workload itself, because in the original analysis this parameter has a major influence on the system utilization. Thus, for each workload, i.e. task set, we computed the normalized utilization for each of the following δ values: one, two, four, and eight.

Figure 7.2 provides a comparison between the original (utilization-based) and the new (processor demand-based) theory, for task sets generated under the mixed setup (ui ∈ [0.05, 0.95)) for different values of δ. As mentioned before, each point in the plots shown in this subsection represents an average of the normalized utilization over 100 randomly generated task sets satisfying the corresponding parameter values.

As shown in inset (a) of Figure 7.2, many task sets of relatively low utilization are not schedulable according to the original analysis, even with higher values of δ. The results are completely different when we apply the new schedulability analysis (see inset (b) of Figure 7.2), with almost all task sets being schedulable even with δ equal to one (the most preemption-light setting). Furthermore, the effect of δ on the schedulability of the task sets is much lower than in the original analysis. Indeed, the original S-EKG schedulability test fails for all task sets with δ equal to one. The explanation is that the original S-EKG task-to-processor assignment algorithm caps the utilization of each processor to the theoretical utilization bound (UB_{S-EKG}), and for δ equal to one, UB_{S-EKG} ≈ 0.65, which is less than the lowest Us, 0.75, of any task set used in the experiments.

As a complement to the information given by the average inflated utilization metric, U^{infl}[P̃q], insets (c) and (d) of Figure 7.2 show the schedulability success ratio. As can be seen from Figure 7.2, there is a correspondence between the average inflated utilization metric and the percentage of schedulable task sets, perc^τ_sched; that is, if the average inflated utilization metric indicates that some class of task sets is schedulable, the schedulability success ratio shows that most of those task sets are schedulable, or the contrary.

[Figure 7.2 plots: insets (a) and (b) show the average inflated utilization under the utilization-based and the time demand-based tests, respectively; insets (c) and (d) show perc^τ_sched; x-axis: classes of task sets composed of mixed tasks (0.75-1.00); series: δ = 1, 2, 4, 8.]

Figure 7.2: Comparison between the original S-EKG and the new schedulability analysis for task sets composed of mixed tasks, ui ∈ [0.05, 0.95).

Observing insets (a) and (c) shows that there is a relation between the average inflated utilization and the percentage of schedulable task sets for the results produced by the utilization-based schedulability analysis. The same can be observed, in insets (b) and (d), for the results produced by the time demand-based schedulability analysis.

Figure 7.3 further highlights the benefits of the new schedulability analysis. It compares the results of the new analysis with those of the original analysis for task sets generated according to the light, medium and heavy parameter setups. The same conclusions as before apply. The new analysis clearly improves the inflation efficiency in all cases. The improvement is so large that the inflated utilization is, at all points in the graph, very close to the uninflated utilization (drawn using a line without any mark), even for δ equal to one. Further, it is also clear that the effect of δ (in other words, the need for a high δ in order to keep inflation low) decreases. In the next subsection, we discuss the results according to the type of tasks, together with the results of NPS-F.

[Figure 7.3 plots: average inflated utilization under the utilization-based test (insets (a), (c), (e)) and the time demand-based test (insets (b), (d), (f)) for classes of task sets composed of light, medium, and heavy tasks, respectively; series: δ = 1, 2, 4, 8.]

Figure 7.3: Comparison between the original S-EKG and the new schedulability analysis considering task sets composed of light, ui ∈ [0.05, 0.35), medium, ui ∈ [0.35, 0.65), and heavy, ui ∈ [0.65, 0.95), tasks.

[Figure 7.4 plots: insets (a) and (b) show the average inflated utilization under the utilization-based and the time demand-based tests, respectively; insets (c) and (d) show perc^τ_sched; x-axis: classes of task sets composed of mixed tasks; series: δ = 1, 2, 4, 8.]

Figure 7.4: Comparison between the original NPS-F and the new schedulability analysis considering task sets composed of mixed tasks, ui ∈ [0.05, 0.95).

7.2.4.2 Experiments for the NPS-F scheduling algorithm

We performed the same set of experiments using NPS-F rather than S-EKG. Figure 7.4 compares the original, inset (a), and the new, inset (b), schedulability analyses for task sets generated under the mixed setup, ui ∈ [0.05, 0.95), and demonstrates a considerable improvement in mapping efficiency. In fact, using the new analysis, the points for the inflated and uninflated utilization almost coincide in the graph (even for δ equal to one).

Again, we complement the information given by the average inflated utilization metric, U^{infl}[P̃q], with information related to the percentage of schedulable task sets, perc^τ_sched, in insets (c) and (d). As can be seen from Figure 7.4, there is a correspondence between the two metrics.

These observations also apply to the experiments with the light, medium and heavy task utilization setups, shown in Figure 7.5. Focusing on the results from the new schedulability analysis, it is clear that NPS-F produces better results than S-EKG (note that the y-axis does not have the same scale across Figures 7.2, 7.3, 7.4, and 7.5).

[Figure 7.5 plots: average inflated utilization under the utilization-based test (insets (a), (c), (e)) and the time demand-based test (insets (b), (d), (f)) for classes of task sets composed of light, medium, and heavy tasks, respectively; series: δ = 1, 2, 4, 8.]

Figure 7.5: Comparison between the original NPS-F and the new schedulability analysis considering task sets composed of light, ui ∈ [0.05, 0.35), medium, ui ∈ [0.35, 0.65), and heavy, ui ∈ [0.65, 0.95), tasks.

[Figure 7.6 plots: perc^τ_sched versus normalized system utilization for task sets composed of (a) mixed, (b) light, (c) medium, and (d) heavy tasks; series: S-EKG and NPS-F.]

Figure 7.6: Comparison between S-EKG and NPS-F scheduling algorithms according to the new demand-based schedulability analysis using δ equal to one.

This is due to the FF bin-packing heuristic used by NPS-F, which produces fuller (in terms of processing capacity) servers, so that less inflation is required. Consequently, the number of servers is also smaller, because each server potentially has more tasks assigned to it. Recall that the inflation is used to compensate for the time when tasks are in the system but are not allowed to execute, because of the exclusive nature of the server reserves. Hence, if a reserve occupies almost the entire time slot, the required inflation is small, because the tasks assigned to that reserve are prevented from executing only during a small amount of time in each time slot. Conversely, the smaller the reserve, the higher the required inflation (as a fraction of that reserve).

Figure 7.6 presents a comparison between S-EKG and NPS-F based on the schedulability success ratio. For the sake of presentation, we opt to plot the results considering δ equal to one. From the results presented in Figure 7.6, we can conclude that NPS-F is able to schedule more task sets than S-EKG.

[Figure 7.7 schematic: (a) S-EKG assigns servers P̃1, P̃2 to P1; P̃2, P̃3, P̃4 to P2; and P̃4, P̃5 to P3. (b) NPS-F assigns P̃1, P̃2 to P1; P̃2, P̃3 to P2; and P̃3 to P3. The x-axis spans 0-100% of the time slot.]

Figure 7.7: Task-to-processor assignment patterns of the S-EKG and NPS-F.

Figure 7.7 illustrates the task-to-processor assignment pattern of S-EKG, inset (a), and of NPS-F, inset (b). As mentioned before, this difference stems from the bin-packing heuristics used by the two algorithms. S-EKG employs an adapted NF bin-packing heuristic that ensures only one task per split server, whereas NPS-F employs the FF bin-packing heuristic, which produces fuller servers. As a consequence of these heuristics, the processor time slot for NPS-F is most often divided into two reserves, whereas for S-EKG it is generally divided into three reserves.

7.2.5 Evaluation of the new analysis in the presence of overheads

In our previous experiments, the new analysis and task assignment were shown to be efficient both under S-EKG and NPS-F. Therefore, in the presence of overheads, we only provide results for scheduling under NPS-F, because under the new schedulability analysis S-EKG can be viewed as a special case of NPS-F in which each split server has only one task assigned.

Contrary to what holds in the absence of overheads, our results (see Figure 7.8) generally show that the schedulability improves as the value of parameter δ decreases. This is in accordance with some of the findings in [BBA11], which state that δ equal to one produces the best schedule. In other words, the most preemption-light setting is, contrary to the intuition associated with the parameter δ, also the best for schedulability once all overheads are factored in. This is because of the overheads associated with the implementation of the reserves themselves.

However, the results plotted in inset (b) of Figure 7.8 differ from the other results plotted in the same figure. There are two reasons for this. On one hand, such task sets are composed of light tasks, whose ui is uniformly distributed in the range [0.05, 0.35), and, as a consequence of the task-to-server assignment procedure, the capacity of the servers is almost full. On the other hand, the server-to-processor assignment procedure classifies servers as single if the inflation of such servers exceeds the threshold defined by rule A2. Note that single servers execute on dedicated processors; that is, a single server executes on one processor that has only one server assigned, and in this case there is no need to create reserves.

From the results presented and discussed in this subsection, it is clear that this kind of scheduling algorithm tends to require, in the presence of overheads, less inflation (i.e. less processing capacity lost) with a longer S, which is achievable with δ equal to one. Application of rules A1 and A2 could provide a way to use a longer S without any processing capacity loss. S is computed as S = (1/δ) · min_{τi ∈ τ}(Di, Ti) (see Equation 5.41); that is, considering all tasks of the task set.

[Figure 7.8 plots: average inflated utilization versus classes of task sets composed of (a) mixed, (b) light, (c) medium, and (d) heavy tasks; series: δ = 1, 2, 4, 8.]

Figure 7.8: Evaluation of the effect of the overheads on the new schedulability analysis.

Probably, we could get a longer S if the set of tasks, τ[P̃q], mapped to a single server according to rule A2 were excluded from this computation. Another strategy could be, whenever rule A1 is applied, to compute a new S taking into account only the set of tasks mapped to the remaining servers. Furthermore, with this strategy we could get a system with multiple time slot lengths. This is because rule A1 creates an assignment break in the system; that is, there is no split server shared between two subsequent processors when rule A1 is applied. Figure 7.9 illustrates this assignment break: there is no shared server between processors P2 and P3, so the time slot lengths could differ between processors {P3, P4} and processors {P1, P2}. Recall that this is achieved without any processing capacity penalty.
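A sketch of this computation follows, with a hypothetical flag implementing the first strategy (excluding from the minimum the tasks mapped to single servers under rule A2); the struct layout and field names are assumptions:

#include <float.h>
#include <stddef.h>

struct task {
    double D, T;        /* relative deadline and period */
    int single_server;  /* hypothetical: task mapped to a single server */
};

/* Time slot length of Equation 5.41: S = (1/delta) * min_i min(D_i, T_i).
 * With skip_single set, tasks mapped to single servers are left out,
 * potentially yielding a longer S. */
double time_slot_length(const struct task *tau, size_t n,
                        unsigned delta, int skip_single)
{
    double min_dt = DBL_MAX;
    for (size_t i = 0; i < n; i++) {
        if (skip_single && tau[i].single_server)
            continue;
        double dt = tau[i].D < tau[i].T ? tau[i].D : tau[i].T;
        if (dt < min_dt)
            min_dt = dt;
    }
    return min_dt / delta;
}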

7.2.6 Reliability of the schedulability analysis

The lower efficiency of the utilization-based analysis could conceivably (but naively!) be expected to provide a safety margin that compensates for the overheads occurring in real systems, instead of explicitly accounting for them in the analysis.

[Figure 7.9 schematic: processors P1 and P2 share split server P̃2 and operate with time slots of length S1, while P3 and P4 share split server P̃5 and operate with time slots of length S2; no server is shared between P2 and P3.]

Figure 7.9: Multiple time slot lengths in a system.

However, there is no guarantee that this over-provisioning is sufficient. It may well be the case that the utilization-based test considers a task set as schedulable when it really is not, because of the overheads incurred in real systems. To better evaluate this possibility, we carried out a study in which we assessed whether the task sets deemed schedulable using the utilization-based analysis (which ignores overheads) were unschedulable according to the new demand-based overhead-aware schedulability analysis. Therefore, the metric we used in this study was:

perc^{util!db}_{sched} = \frac{nr^{util}_{sched} - nr^{oa\_db|util}_{sched}}{nr^{util}_{sched}} \cdot 100 \qquad (7.3)

where nr^{util}_sched is the number of task sets deemed schedulable according to the utilization-based schedulability analysis and nr^{oa_db|util}_sched is the number of these task sets that are also schedulable according to the demand-based overhead-aware schedulability analysis.

In all experiments of this study, we kept all overheads constant, using the values presented in Section 7.2.1. For the CpmdO, given the dependence of this parameter on the load, we chose to perform a sensitivity analysis and experimented with two more values for this parameter: zero and 500 microseconds. The zero value represents a best case for the utilization-based analysis. The 500 microseconds value corresponds to a rather high value for the CPMD, taking into account that the minimum task period, and consequently the time slot duration, in any task set is not much higher than five milliseconds. For some light tasks, 500 microseconds may be larger than the task computation (or execution) time itself. The value of 100 microseconds, by comparison, should not penalize lighter tasks as much.

As in the previous study, for each task set generated according to Algorithm 7.1, we used all the values of the design parameter δ considered (one, two, four, and eight). Furthermore, we ignored all interrupts except the tick interrupts, which occur on a per-processor basis. (This is tantamount to assuming that interrupt handling is performed by a dedicated processor.) As in the previous subsection, each point in the plots represents an average over 100 randomly generated task sets satisfying the corresponding parameter values.

[Figure 7.10 plots: perc^{util!db}_sched for (a) CpmdO = 0, (b) CpmdO = 100, and (c) CpmdO = 500 microseconds, for the mixed, light, medium, and heavy task types; series for δ = 1, 2, 4, 8.]

Figure 7.10: Fraction of task sets schedulable according to the utilization-based analysis, but not schedulable according to the new demand-based overhead-aware schedulability analysis, as a function of the type of tasks of the task set and of the design parameter δ.

Figure 7.10 summarizes the results of this study. Each inset shows the results for a different value of the CpmdO parameter. The value of this parameter has a major effect, although it may not be that apparent at first sight, because the ranges of the y-axes differ. As expected, the higher the value of CpmdO, the higher the fraction of task sets deemed schedulable according to the utilization-based analysis but not schedulable by the new demand-based overhead-aware schedulability analysis, for the parameter values considered. Also clear is the effect of the design parameter δ: the higher the value of this parameter, the higher the fraction of task sets that are not schedulable.

Whereas Figure 7.10 aggregates over the utilization of the task sets, Figure 7.11 shows the fraction of non-schedulable task sets as a function of utilization, to make the dependence on the factors considered clearer. Inset (a) of Figure 7.11 shows the dependence on the utilization of the task sets, Us, for the mixed task sets with CpmdO equal to zero. As shown, for utilizations lower than a certain value, which depends on the value of δ, all task sets are schedulable according to both analyses. However, at a given point the fraction of non-schedulable task sets rises sharply to one, and remains around one until a point, which also depends on δ, at which it drops more or less sharply to zero. The value of δ determines the width of the plateau where the fraction is equal to one: the higher the value of δ, the earlier the fraction rises to one and the later it drops back to zero. For these parameter settings, for δ equal to one, the fraction of unschedulable task sets never reaches one; rather, it increases up to around 0.60 and then drops back to zero.

In any case, the pattern is clear, applies also to other types of task sets and different values of CpmdO, and can be easily explained with the help of inset (a) of Figure 7.4 and inset (b) of Figure 7.11, which show the average inflated utilization for mixed task sets, respectively for the utilization-based analysis and for the demand-based overhead-aware schedulability analysis with CpmdO equal to zero. Consider a given value of δ, say four. For task sets whose utilization is below 0.91, the overheads are small enough that virtually all task sets are considered schedulable by both analyses.

As the task set utilization, Us, increases from 0.91 to 0.95, the average inflated utilization according to the new analysis increases and becomes higher than one (see inset (b) of Figure 7.11), so that virtually all task sets are deemed unschedulable. On the other hand, in that range, for δ equal to four, the inflated utilization according to the utilization-based analysis (see inset (a) of Figure 7.4) is still below about 0.97, and many task sets are still deemed schedulable. Therefore, in that interval (from 0.91 to 0.95) the fraction of non-schedulable task sets rises from zero to one, and remains one until it drops sharply for task sets whose utilization is in the range [0.99, 1.0), which are all deemed unschedulable also by the utilization-based analysis.

Even though the new demand-based overhead-aware schedulability analysis is conservative, i.e. based on worst-case assumptions, so that a task set it considers non-schedulable may actually be schedulable, the parameter values we used are all values we measured in a real system, except for the CPMD overheads. For the latter, we assumed several values including zero, and even in this rather optimistic case the utilization-based analysis may consider a given task set schedulable when some tasks may miss their deadlines. This is unacceptable in safety-critical hard real-time systems, where the consequences of missing a deadline may be catastrophic.

[Figure 7.11 plots: inset (a) the fraction of non-schedulable task sets (perc^{util!db}_sched) and inset (b) the average inflated utilization under the demand-based schedulability test, both versus task set utilization Us (0.75-1.00); series: δ = 1, 2, 4, 8.]

Figure 7.11: Fraction of mixed task sets that are considered schedulable by the original schedulability analysis but not by the schedulability analysis with overheads (CpmdO equal to zero).

The demand-based overhead-aware schedulability analysis we developed makes it possible to account for all overheads incurred by an implementation, does so in a conservative way, and therefore ensures that its results are reliable, as long as the values of its parameters are valid.

7.2.7 Slot-based vs. reserve-based

In this subsection, we compare the schedulability of NPS-F against that of Carousel-EDF. We assume the schedulability analyses as formulated in Sections 5.3 and 6.4, respectively. However, for fairness, in the case of NPS-F we eliminate the release interference among servers; that is, dbf^{sb:non-split}_{RelI}(P̃q, t) and dbf^{sb:split}_{RelI}(P̃q, t), given by Equations 5.21 and 5.33, respectively, are both set to zero. Further, we assume that IpiL (used in Equation 5.30) is also equal to zero. In doing so, we assume that the release mechanism described in Subsection 6.3.2 can also be employed for the slot-based task-splitting scheduling algorithms.

Figure 7.12 shows the average inflated utilization for each algorithm as a function of the normalized utilization of the task set. The average normalized inflated utilization of Carousel-EDF is smaller than that of NPS-F, independently of the task set type considered; that is, Carousel-EDF is able to schedule more task sets. This was expected, because NPS-F incurs more overheads than Carousel-EDF. However, the pessimism of the analysis may be larger for NPS-F than for Carousel-EDF. Indeed, based on experimental evidence [BBA10b], the analysis assumes that the worst-case cache-related overheads caused by preemptions are the same whether or not there is a migration. On average, however, the costs incurred by preemptions with migrations are likely to be higher than without, and preemptions with migrations are more likely in Carousel-EDF than in NPS-F.

[Figure 7.12 plots: average inflated utilization versus classes of task sets composed of (a) mixed, (b) light, (c) medium, and (d) heavy tasks; series: NPS-F and Carousel-EDF.]

Figure 7.12: Comparison between NPS-F and Carousel-EDF based on the normalized inflation.

7.3 The run-time task-dispatching algorithm

In this section, we evaluate the run-time task-dispatching algorithm. As mentioned before, the off-line procedure checks the schedulability of a task set, i.e. whether it is schedulable or not. It also configures the run-time task-dispatching algorithm; in other words, the off-line procedure computes the settings of the task-dispatching algorithm, such as the set of tasks mapped to each server, the length of each reserve, and so on.

We evaluate the efficiency of the new schedulability analysis for slot-based task-splitting scheduling algorithms. For that purpose, we present a comparative study of the NPS-F run-time task-dispatching algorithm, in which we compare its efficiency when the task-dispatching settings are computed according to the original utilization-based and the new demand-based overhead-aware schedulability analyses. Next, we present an evaluation of the NPS-F and Carousel-EDF scheduling algorithms against those supported by the RT scheduling class of the PREEMPT-RT-patched Linux kernel.

All of these evaluations are based on the percentage of deadlines met, which is computed as:

perc^{jobs}_{deadline:met} = \frac{nr^{jobs}_{deadline:met}}{nr^{jobs}_{deadline:missed} + nr^{jobs}_{deadline:met}} \cdot 100 \qquad (7.4)

where nr^{jobs}_{deadline:met} is the number of jobs of all tasks that met their deadlines and nr^{jobs}_{deadline:missed} is the number of jobs of all tasks that missed their deadlines; their sum gives the total number of jobs of all tasks in a task set. We consider that one algorithm is better than another if it meets more deadlines.

Note that, in all experiments presented in this section, the run-time task-dispatching (or scheduler) implementation of the slot-based algorithms is as described in Section 4.6 and the implementation of the reserve-based algorithms is as described in Section 6.3.

7.3.1 Experimental setup

The task sets used to evaluate the run-time task-dispatching algorithms were generated using the same method as described in Subsection 7.2.2, and the host platform (a 24-core machine) is the same one described in Subsection 7.2.1. We generated 50 task sets with utilization varying from 50% to 100% with an increment of 1%, for each of four types of tasks: mixed, light, medium, and heavy (the settings of each type are as defined in Subsection 7.2.2). In total, we thus created 200 task sets. The periods of all tasks of all task sets are uniformly distributed in the range [5, 50] milliseconds, with a resolution of one millisecond. In Appendix B, we deal with other ranges of periods. We ran each experiment for 100 seconds.

In order to simulate the execution time of each task, we coded a task in the C language, a snippet of which is presented in Listing 7.1. In order to ensure that all tasks are ready for execution at the same time, we put all tasks to sleep until time0; the clock_nanosleep function allows the calling task to sleep until an instant specified with nanosecond precision. The task then executes a loop of nr_jobs iterations. In each iteration, which we consider a job, it executes a set of empty for-loops (through the invocation of the function do_work) to simulate the execution time of the task, which is defined by the exec_time variable. It then computes the release time of the next job, checks whether the current job missed its deadline (using the clock_gettime function to get the current time), and sleeps until the next release. Note that this task is coded using only functionality provided by the Linux kernel; that is, it does not depend on any special feature added to the Linux kernel.
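Listing 7.1 invokes do_work() without showing its body. A plausible sketch, under the assumption of a pre-measured calibration constant (loops_per_ns is hypothetical and would have to be calibrated on the target machine), is:

/* Sketch of do_work() (not listed in the dissertation): burn
 * approximately exec_time nanoseconds of CPU time with an empty
 * loop, as described above. */
static const double loops_per_ns = 0.5; /* hypothetical calibration */

static void do_work(unsigned long long exec_time)
{
    volatile unsigned long long i;
    unsigned long long iters = (unsigned long long)(exec_time * loops_per_ns);
    for (i = 0; i < iters; i++)
        ; /* empty loop body simulating computation */
}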

7.3.2 Why overheads have to be accounted for

In this subsection, we show why overheads must be incorporated into the schedulability analysis. To this end, we present the results of several experiments that measure the percentage of deadlines met, according to Equation 7.4. Figure 7.13 plots the results for a set of task sets whose run-time task-dispatching parameters were computed using the original NPS-F utilization-based schedulability analysis and the new demand-based overhead-aware schedulability analysis presented in Section 5.3 (note that the y-axes do not have the same scale across the insets of Figure 7.13).

struct timespec next_release;
struct timespec now;
unsigned long deadline_misses = 0;
unsigned long long release_time;
unsigned long j;

//task settings
unsigned long long exec_time = atoll(argv[...]);
unsigned long long period = atoll(argv[...]);
unsigned long long time0 = atoll(argv[...]);
unsigned long nr_jobs = atol(argv[...]);
...
release_time = time0;

//Converting nanoseconds to timespec
next_release.tv_sec = release_time / NSEC_PER_SEC;
next_release.tv_nsec = release_time % NSEC_PER_SEC;

//Sleeps until time zero
clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next_release, NULL);
for (j = 0; j < nr_jobs; j++) {

    //Executes its work
    do_work(exec_time);

    //Computes the next release
    release_time = release_time + period;

    //Converting nanoseconds to timespec
    next_release.tv_sec = release_time / NSEC_PER_SEC;
    next_release.tv_nsec = release_time % NSEC_PER_SEC;

    //Gets the current time
    clock_gettime(CLOCK_MONOTONIC, &now);

    //Checks if it missed the deadline,
    //that is, if the current time is higher than the next release
    if ((unsigned long long)now.tv_sec * NSEC_PER_SEC + now.tv_nsec > release_time)
        deadline_misses++;

    //Sleeps until next release
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next_release, NULL);
}
...

Listing 7.1: The C language code of the task for experiments.

For the new analysis, we consider the overhead values presented in Subsection 7.2.1, and for both analyses the parameter δ was set equal to one. We set δ equal to one because, according to the study pursued in Section 7.2, that value is best when practical issues, such as run-time overheads, are considered.

In these plots, we only present and compare the results for the task sets that are considered schedulable according to the respective schedulability analysis. These results allow us to highlight the importance of incorporating the run-time task-dispatching overheads into the schedulability theory. As can be seen from the figure, all task sets whose run-time task-dispatching input parameters were computed using the new demand-based overhead-aware schedulability analysis met all deadlines, as predicted by the schedulability analysis. By contrast, many of the task sets whose input parameters were computed using the original utilization-based analysis missed deadlines, something not predicted by the respective schedulability analysis; indeed, that analysis considers those task sets schedulable. We would like to highlight that the only difference between the experiments is the way the task-dispatching settings were computed; everything else, i.e. the code of each task, the scheduler implementation, the operating system, and the host platform, is the same.

Note that in Subsection 7.2.6 it was shown that some task sets considered schedulable according to the original utilization-based schedulability analysis are not schedulable according to the new demand-based overhead-aware schedulability analysis. Here, we show that, on one hand, some task sets considered schedulable according to the utilization-based schedulability analysis are in fact non-schedulable, based on the results collected from experiments running on a real system. On the other hand, all task sets considered schedulable by the new demand-based overhead-aware schedulability analysis are, in fact, schedulable. From this, we can conclude that real-world overheads have to be incorporated into the schedulability analysis to increase its reliability.

7.3.3 Enhancing real-time capabilities of the Linux kernel

The mainline Linux kernel is not designed for real-time systems, because none of its scheduling policies considers one fundamental real-time parameter: the deadline. We believe that the efforts of the PREEMPT-RT community to add real-time capabilities have one drawback: the absence of any real-time scheduling algorithm. In this subsection, we present a performance comparison between the scheduling algorithms supported by the PREEMPT-RT-patched Linux kernel (the RR and FIFO scheduling algorithms) and those added in the context of this work, NPS-F and Carousel-EDF.

As mentioned before, the NPS-F and Carousel-EDF scheduling algorithms require the execution of an off-line application, which implements the new schedulability analysis developed in the context of this work. Given that the release of tasks is performed using the native Linux clock_nanosleep function, which does not follow the task release mechanism defined for Carousel-EDF (described in Subsection 6.3.2), we use, for Carousel-EDF, the schedulability analysis with the changes introduced in Subsection 6.4.3. For each task set, that application computes the settings of the run-time task-dispatching algorithm, such as the servers and respective reserves as well as their tasks and the time slot length. Then, using a specific API, the run-time task-dispatching algorithm is configured. Finally, all tasks belonging to the task set are created, configured with their parameters, and launched.

The RR and FIFO algorithms have a simpler setup phase: there is no run-time task-dispatching algorithm to configure. The only thing required is to create all tasks of each task set, set their parameters, and launch them (a minimal sketch of this baseline setup is given below).

We ran each task set scheduled according to the RR, FIFO, NPS-F, and Carousel-EDF scheduling algorithms for 100 seconds on a 24-core machine. Figure 7.14 plots the percentage of deadlines met for each scheduling policy (note that the y-axes do not have the same scale across the insets of Figure 7.14).
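For concreteness, the baseline setup for the RR and FIFO tasks reduces to the standard POSIX scheduling API; a minimal sketch (the priority value is arbitrary, and the NPS-F/Carousel-EDF configuration instead goes through the work-specific API, not shown here) is:

#include <sched.h>
#include <sys/types.h>

/* Place a task under the FIFO policy (SCHED_RR is analogous). */
static int make_fifo(pid_t pid)
{
    struct sched_param sp = { .sched_priority = 50 };
    return sched_setscheduler(pid, SCHED_FIFO, &sp);
}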

[Figure 7.13 plots: perc^{jobs}_{deadline:met} versus task set utilization (0.50-1.00) for (a) mixed, (b) light, (c) medium, and (d) heavy tasks; series: utilization-based and demand-based.]

Figure 7.13: A comparison between the utilization-based and the demand-based schedulability analyses based on the percentage of deadlines met.

[Figure 7.14 plots: perc^{jobs}_{deadline:met} versus task set utilization (0.50-1.00) for (a) mixed, (b) light, (c) medium, and (d) heavy tasks; series: RR, FIFO, NPS-F, and Carousel-EDF.]

Figure 7.14: A performance comparison among RR, FIFO, NPS-F, and Carousel-EDF based on the percentage of deadlines met.

For the sake of presentation, we only present the RR and FIFO results for task sets that are considered schedulable according to NPS-F. Otherwise, the plots would be misleading, because the percentage of deadlines met decreases abruptly for task sets with Us higher than 89%. As can be seen by inspecting Figure 7.14, NPS-F and Carousel-EDF met all deadlines, as predicted by the theory. Note that only schedulable task sets were considered. The difference between NPS-F and Carousel-EDF is the number of task sets considered schedulable, which is higher for Carousel-EDF than for NPS-F, as predicted (and demonstrated in Subsection 7.2.7) by the schedulability analyses. Both RR and FIFO fail to meet all deadlines. Even though this is not visible in the plots, it should be noted that all task sets with a Us higher than 55% missed deadlines when scheduled under RR and FIFO.

It should also be noted that the slot-based task-splitting and the reserve-based scheduling implementations, apart from specific code for managing some specific data structures, require only minor modifications to the RT scheduling class. Therefore, in our opinion, adding support for appropriate scheduling algorithms to the Linux kernel (compatible with the PREEMPT-RT patch) is a practical and effective way to make this widely adopted operating system more suitable for real-time systems.

7.4 Summary

In this chapter, we presented an evaluation of the scheduling algorithms under consideration in this work. We evaluated the efficiency of the off-line task-to-processor algorithms and also of the run-time task-dispatching algorithm. We compared the utilization-based and the new demand-based overhead-aware schedulability analyses, using as comparison metric the inflated utilization, under which one analysis is better than another if it requires less inflated utilization.

From the study conducted in this chapter, we can conclude that the new demand-based overhead-aware schedulability analysis is better than the original utilization-based schedulability analysis. Using the new analysis, S-EKG is able to schedule task sets with utilizations much higher than its theoretical utilization bound, UB_{S-EKG}. For NPS-F, the new demand-based overhead-aware schedulability analysis is also more efficient than the original utilization-based schedulability analysis. We also showed that NPS-F is better than S-EKG, because it requires less inflation. This is due to the bin-packing heuristic employed, which tends to produce fuller servers for NPS-F compared to S-EKG. However, S-EKG presents one advantage: the number of split servers (and consequently also the number of split tasks) is known at design time, because S-EKG ensures that each split server is assigned only one task.

We also concluded that, contrary to what holds for the utilization-based schedulability analysis (where the schedulability improves with higher δ), the new demand-based overhead-aware schedulability analysis presents better schedulability as the parameter δ decreases. That is, better schedulability is achieved with δ equal to one, because this creates longer time slots, which in turn implies fewer run-time overheads related to the management of the time division into reserves. With the purpose of getting longer time slots, we defined two strategies that employ rules A1 and A2 to use longer time slots without any penalty.

We measured the number of task sets that are considered schedulable according to the original utilization-based schedulability analysis but not according to the new demand-based overhead-aware schedulability analysis. Note that the former does not take the overheads into account, whereas the latter incorporates the run-time overheads. We observed that in some cases almost 100% of the task sets considered schedulable by the utilization-based schedulability analysis are in fact unschedulable according to the new demand-based overhead-aware schedulability analysis, especially for higher values of the δ parameter. Furthermore, we presented a comparative study in a real system running on a 24-core machine, in which the run-time task-dispatching settings were computed by either the utilization-based or the new demand-based overhead-aware schedulability analysis. We observed many deadline misses for task sets configured according to the utilization-based schedulability analysis, whereas no deadline miss was observed for task sets configured according to the new demand-based overhead-aware schedulability analysis.

We also showed that the Linux kernel does not provide any suitable scheduling algorithm for multiprocessor real-time systems.
We presented a comparative study in a real system running on a 24-core machine, where we compared the scheduling algorithms supported by the PREEMPT-RT-patched Linux kernel, RR and FIFO, against the NPS-F and Carousel-EDF scheduling algorithms added to the Linux kernel in the context of this work. We observed that the scheduling algorithms supported by the PREEMPT-RT-patched Linux kernel miss deadlines (all task sets with Us higher than 55% missed deadlines), whereas no deadline misses occurred for task sets scheduled according to the NPS-F and Carousel-EDF scheduling algorithms. Finally, we observed that the Carousel-EDF scheduling algorithm performs a little better than the NPS-F scheduling algorithm, in the sense that it is able to schedule more task sets (with higher Us).

Chapter 8

Conclusions and future work

8.1 Introduction

The advent of multi-core chips has renewed the interest of the research community in real-time scheduling on multiprocessor systems. Real-time scheduling theory for single-cores is considered a mature research field, but real-time scheduling theory for multiprocessors is an emerging research field.

Multiprocessor systems may provide great computing capacity if appropriate scheduling algorithms are used. Unfortunately, much of the research work on scheduling algorithms for multiprocessor systems is based on a set of assumptions that have no correspondence with real systems; that is, only a few works have addressed the implementation of scheduling algorithms and the related operating system issues.

Focusing on slot-based task-splitting scheduling algorithms, we bridge the gap between theory and practice for those scheduling algorithms. For that purpose, we implemented those scheduling algorithms in a PREEMPT-RT-patched Linux kernel version. We also developed new scheduling theory for that type of scheduling algorithm that incorporates practical issues, to increase its reliability and efficiency.

In this chapter, we review and discuss the results and contributions presented in this dissertation. We also provide some directions for future work.

8.2 Summary of results

As mentioned above, in the course of this work we narrowed the gap between theory and practice for slot-based task-splitting scheduling algorithms for multiprocessor systems. To achieve this goal, we created a unified model that accommodates most scheduling algorithms of that class (Section 4.3). We identified one practical issue that could decrease the performance of the scheduling algorithm, which we denoted the instantaneous migration problem (Section 4.5). The identification of such practical issues guided us towards one implementation rule, followed throughout the implementation of the scheduling algorithms under consideration in this work, namely: the interactions among processors must be reduced.

Following this rule, we implemented slot-based task-splitting scheduling algorithms in which the scheduling decisions are taken on a per-processor basis; that is, our implementations isolate the scheduling decisions on one processor from activities and decisions on other processors. Further, given the potentially higher number of task migrations imposed by the slot-based task-splitting scheduling algorithms and the associated high overheads for migrating tasks, our implementations employ a lightweight task migration procedure that avoids the locking of multiple ready-queues (Section 4.6). For that purpose, we provided a ready-queue per server, in which all of the server's ready tasks are stored. An atomic variable marks the queue as busy if some ready task of that server is being executed on any processor, and as free otherwise.

The implementation of the scheduler follows a two-level hierarchical approach. Recall that the scheduler executes on a per-processor basis; that is, the scheduler executes on the processor on which it is invoked (through the invocation of the schedule function) and can execute simultaneously on multiple processors. For each processor, the top-level scheduler is responsible for the management of time (through the high-resolution timers): it divides the time into slots and each slot into reserves. Whenever a reserve starts, it sets the corresponding server for that reserve (by updating a pointer) and triggers the invocation of the low-level scheduler. The low-level scheduler picks the highest-priority task, if any, from the assigned server if the server is marked as free; otherwise, it forces its own re-invocation some time later (empirically set to five microseconds). This procedure repeats until the end of the reserve while the server has ready tasks and is marked as busy; otherwise, this mechanism is not triggered.

Being aware of the efforts of the PREEMPT-RT kernel developer community to add real-time capabilities to the Linux kernel, and of its wide acceptance, we made our implementation completely compatible with the PREEMPT-RT patch; it required only minor modifications to the PREEMPT-RT-patched Linux kernel. Using our implementation, we identified and modelled all the scheduling overheads incurred by slot-based task-splitting scheduling algorithms when running in a real operating system (Section 4.7).

In Chapter 5, we formulated a unified schedulability theory applicable to slot-based task-splitting scheduling algorithms. This new theory is based on exact schedulability tests, thus also overcoming many sources of pessimism in previous related work. Furthermore, as a response to the fact that many unrealistic assumptions present in the original theory tend to undermine the theoretical potential of such scheduling schemes, we identified and modelled in the new analysis all the overheads incurred by the algorithms under consideration. The outcome is a new overhead-aware schedulability analysis that permits increased efficiency and reliability.

In Chapter 6, we presented a reserve-based scheduling algorithm called Carousel-EDF. Carousel-EDF is an offshoot of the slot-based task-splitting scheduling algorithms described in Chapter 4. It also relies on the concept of servers that are instantiated by means of time reserves.
Being aware of the efforts of the PREEMPT-RT kernel developer community to add real-time capabilities to the Linux kernel, and of its wide acceptance, we made our implementation completely compatible with the PREEMPT-RT patch. It was implemented with minor modifications to the PREEMPT-RT-patched Linux kernel. Using our implementation, we identified and modelled all the scheduling overheads incurred by slot-based task-splitting scheduling algorithms when running in a real operating system (Section 4.7).

In Chapter 5, we formulated a unified schedulability theory applicable to slot-based task-splitting scheduling algorithms. This new theory is based on exact schedulability tests, thus also overcoming many sources of pessimism in the previous related work. Furthermore, as a response to the fact that many unrealistic assumptions present in the original theory tend to undermine the theoretical potential of such scheduling schemes, we identified and modelled into the new analysis all the overheads incurred by the algorithms under consideration. The outcome is a new overhead-aware schedulability analysis that permits increased efficiency and reliability.

In Chapter 6, we presented a reserve-based scheduling algorithm called Carousel-EDF. Carousel-EDF is an offshoot of the slot-based task-splitting scheduling algorithms described in Chapter 4. It also relies on the concept of servers that are instantiated by means of time reserves. However, in contrast to slot-based task-splitting scheduling algorithms, which migrate tasks at time slot and reserve boundaries, Carousel-EDF migrates tasks only at reserve boundaries. Consequently, time is no longer constrained to time slots; it is divided into time reserves instead.

Carousel-EDF preserves the utilization bounds of slot-based task-splitting scheduling algorithms, namely that of NPS-F, which is the highest among algorithms not based on a single dispatching queue and that incur few preemptions. Furthermore, with respect to slot-based task-splitting scheduling algorithms, Carousel-EDF reduces by up to 50% the worst-case number of context switches and the worst-case number of preemptions caused by the time division mechanism. We also formulated an overhead-aware schedulability analysis based on demand-bound functions, grounded on an implementation on a PREEMPT-RT-patched Linux kernel version.

In Chapter 7 (complemented with Appendix B), we showed that the new demand-based overhead-aware schedulability analysis for slot-based task-splitting scheduling algorithms is much more efficient and reliable than the original utilization-based one. Assuming that the cost of the overheads is negligible, the new demand-based overhead-aware schedulability analysis is much more efficient; that is, it is able to schedule more task sets with higher utilizations.

When the estimates for the various overheads are factored into the new demand-based overhead-aware schedulability analysis, it appears to perform worse (in terms of the fraction of task sets declared schedulable) compared to the original utilization-based (but still overhead-oblivious) analysis. However, when those task sets were executed on a real system (a 24-core machine), it was verified that many task sets deemed schedulable by the overhead-oblivious utilization-based analysis missed deadlines, whereas all task sets considered schedulable by the new demand-based overhead-aware schedulability analysis were, in fact, schedulable and did not miss any deadlines. This shows that incorporating overheads into the schedulability analysis of a scheduling algorithm is fundamental from the reliability perspective. Intuitively, this imposes the need to accurately measure the overheads in order to provide correct input to the schedulability analysis.

From the research conducted in Chapter 7, slot-based task-splitting scheduling algorithms benefit from longer time slots in the presence of reserve-related overheads. In that sense, we defined two strategies for using longer time slots without any penalty. Intuitively, the benefits of longer time slots come from the decrease in overheads.

We compared a slot-based task-splitting scheduling algorithm, NPS-F, against a reserve-based scheduling algorithm, Carousel-EDF. On average, Carousel-EDF performs better than NPS-F, because it is able to schedule task sets with higher utilization, and when executed on a real system no task misses its deadlines.

In our opinion, adding appropriate scheduling algorithms to the Linux kernel (compatible with the PREEMPT-RT patch) is an important step towards making this widely adopted operating system more suitable for real-time systems. We presented a comparative study in a real system running on a 24-core machine, where we compared the scheduling algorithms supported by the PREEMPT-RT-patched Linux kernel (RR and FIFO) against the NPS-F and Carousel-EDF scheduling algorithms added to the Linux kernel in the context of this work.
We observed that the scheduling algorithms supported by the PREEMPT-RT-patched Linux kernel miss deadlines (all task sets with a utilization higher than 55% miss deadlines), whereas no deadline miss was observed for task sets scheduled according to NPS-F and Carousel-EDF that were deemed schedulable by the new analysis.

8.3 Future work

There are several open topics for further research that have been identified: (i) support for different hardware platforms; (ii) addressing multi-criticality systems; (iii) use of different/variations of the scheduling algorithms; (iv) considering resource sharing; (v) implementation of slot- and reserve-based scheduling algorithms in co-kernel approaches; (vi) improving the run-time overhead measurements; and (vii) practical performance comparison with other real-time scheduling algorithms. Each of these topics is briefly developed in the following paragraphs.

Support for different hardware platforms. Supporting other hardware platforms would require the adaptation of the schedulability analysis developed in the context of this work as well as of the implementation techniques. Possibly, other platform-related details could require significant changes to both. Our hardware platform is based on the x86-64 architecture. It could be interesting to extend the work presented in this dissertation to other hardware platforms. In terms of memory model, our target platform is a UMA platform; that is, it has a shared memory model, which implies uniform memory access. It could be interesting to investigate distributed memory models, where the memory access times are not uniform (that is, NUMA platforms). NUMA platforms impose a set of challenges, especially for migrating tasks. Note that such platforms scale better (in terms of number of processors) than UMA platforms. Another platform-related issue to be investigated would be the consideration of platforms equipped with non-identical processors. This could prove to be very challenging work.

Addressing multi-criticality systems. In our work, no distinction of tasks according to their criticality was made: task sets composed of different types of tasks (in terms of criticality level) were not considered. It could be interesting to address task sets composed of hard and soft real-time tasks as well as of other types, such as best-effort tasks [BA07], for instance. The scheduling algorithms under consideration in this work could be a valid option for those task sets. In fact, some works have recently addressed mixed-criticality tasks [Ves07]. We believe that the scheduling algorithms under consideration in this work provide features well suited to mixed-criticality systems, namely the isolation provided by the time reserves. Additionally, S-EKG could be an interesting approach for systems with very strict requirements, such as hard real-time systems, because most tasks are assigned to one processor and only a few to two processors. In the course of this work, we did not investigate this feature.

Use of different/variations of the scheduling algorithms. All scheduling algorithms studied in the context of this work are EDF-based. It could be interesting to investigate slot-based task-splitting and reserve-based scheduling algorithms built on other types of scheduling algorithms, namely static-priority algorithms like Rate Monotonic or Deadline Monotonic. Some recent work on real-time systems has addressed the possibility of deferring scheduling decisions to as late as possible to increase the effectiveness of the system [BB10, DBM+13]. This approach tends to reduce the impact of the practical issues on the performance of the scheduling algorithms. Given the time division mechanism employed by the scheduling algorithms under consideration in this dissertation, we believe that those algorithms could benefit from the above-mentioned approach.
Let us consider the following scenario: a task is executing within its server's reserve, and another server's task, with higher priority, is released at a time instant very close to the end of that reserve. According to the currently defined rules, the lower-priority task must be preempted to give way to the higher-priority task. However, suppose that the time remaining until the end of the reserve is smaller than the cost of switching tasks. In such a case, it is better to keep the lower-priority task executing until the end of the reserve; when the subsequent reserve of that server starts, the higher-priority task will then start executing.

We did not investigate the cluster variants of slot-based task-splitting, in which each cluster is formed out of processors on the same chip. The purpose of those variants is to eliminate off-chip task migrations, which could be costly due to Cache-related Preemption and Migration Delay (CPMD). As mentioned in Section 7.2, the cost of the CPMD can degrade the performance of slot-based task-splitting and also of reserve-based scheduling algorithms. It could be interesting to evaluate the performance of both types of scheduling algorithms taking the CPMD into account.

In the slot-based task-splitting and reserve-based scheduling algorithms, the time slot length is computed according to Equation 5.41 ($S = \frac{1}{\delta} \cdot \min_{\tau_i \in \tau}(T_i, D_i)$); that is, it depends on the task parameters and also on the $\delta$ parameter. It could be interesting to investigate the impact of an arbitrarily chosen time slot length on those scheduling algorithms.

Considering resource sharing. Supporting resource sharing is fundamental for the wide acceptance of our scheduling algorithms [BSNN07, BA08, BA10, FLC10, Cuc13]. Usually, when several tasks share a resource, only one task at a time is allowed to use that resource. This requires the use of some locking protocol to ensure mutual exclusion. Moreover, real-time systems require predictability. Therefore, real-time locking protocols must ensure the correct use of the shared resource as well as predictable real-time behaviour. Some work has been done to address this issue on multiprocessor systems. However, to the best of our knowledge, there is no published work specific to the scheduling algorithms we proposed in this work. Therefore, adding support for resource sharing to slot- and reserve-based scheduling algorithms is a very relevant and challenging future research effort.

Implementation of slot- and reserve-based scheduling algorithms in co-kernel approaches. Our target platform for implementations runs Linux. It could be interesting to extend the work presented in this dissertation to other operating systems, namely those based on co-kernel approaches like Xenomai [Xen12] or RTAI [RTA12], for instance. Note that co-kernel approaches are able to offer more predictability than the Linux kernel; for that reason, they are employed in many real-time systems.

Improving the run-time overhead measurements. Measuring the run-time overheads requires intrusive operations in the operating system. Developing non-intrusive or lightly intrusive procedures to measure the run-time overheads could be an interesting addition to the work presented in this dissertation. Note that it is probably not possible to obtain overhead costs via a purely analytical approach.
As mentioned before, computational systems are complex, composed of many hardware and software components, including the operating system, that are rarely documented in sufficient detail to allow the derivation of the required overhead costs.

Practical performance comparison with other real-time scheduling algorithms. In the context of real-time systems, the comparison of scheduling algorithms is typically based on the utilization factor and other theoretically-derived parameters. Practical evaluations are much more difficult to perform than theoretical-only evaluations, since the former are subject to many practical issues imposed by the host platform (formed by the operating system and the hardware). Worse, some of these issues cannot be controlled. Even when it is possible to install the system that we want to evaluate on the same host platform, other issues arise. For instance, it is not possible to execute tasks whose code is presented in Listing 7.1 in the LITMUSRT [LR12] or RESCH [Kat12] frameworks, because those frameworks require the use of specific APIs for implementing the tasks. The same happens in systems based on co-kernel approaches. Therefore, some changes are required to the system (tasks or frameworks) in order to make them comparable. In any case, benchmarking against other real-time Linux kernel implementations is also a very relevant and challenging future research effort.

References

[AAJ+02] Tarek Abdelzaher, Björn Andersson, Jan Jonsson, Vivek Sharma, and Minh Nguyen. The aperiodic multiprocessor utilization bound for liquid tasks. In proc. of the 8th IEEE Real-time and Embedded Technology and Applications Symposium (RTAS’02), pages 173–184, San Jose, CA, USA, 2002.

[AAJ03] Björn Andersson, Tarek Abdelzaher, and Jan Jonsson. Global priority-driven aperiodic scheduling on multiprocessors. In proc. of the 17th IEEE International Parallel and Distributed Processing Symposium (IPDPS'03), page 8.1, Nice, France, 2003.

[AB98] Luca Abeni and Giorgio Buttazzo. Integrating multimedia applications in hard real-time systems. In proc. of the 19th IEEE Real-Time Systems Symposium (RTSS'98), pages 4–13, Madrid, Spain, 1998.

[AB08] Björn Andersson and Konstantinos Bletsas. Sporadic multiprocessor scheduling with few preemptions. In proc. of the 20th IEEE Euromicro Conference on Real-Time Systems (ECRTS’08), pages 243–252, Prague, Czech Republic, 2008.

[ABB08] Björn Andersson, Konstantinos Bletsas, and Sanjoy Baruah. Scheduling arbitrary-deadline sporadic tasks on multiprocessors. In proc. of the 29th IEEE Real-Time Systems Symposium (RTSS'08), pages 385–394, Barcelona, Spain, 2008.

[ABD05] James Anderson, Vasile Bud, and UmaMaheswari Devi. An EDF-based scheduling algorithm for multiprocessor soft real-time systems. In proc. of the 17th IEEE Euromicro Conference on Real-Time Systems (ECRTS'05), pages 199–208, Palma de Mallorca, Balearic Islands, Spain, 2005.

[ABJ01] Björn Andersson, Sanjoy Baruah, and Jan Jonsson. Static-priority scheduling on multiprocessors. In proc. of the 22nd IEEE Real-Time Systems Symposium (RTSS'01), pages 193–202, London, UK, 2001.

[ABR+93] Neil Audsley, Alan Burns, M. Richardson, K. Tindell, and A. J. Wellings. Applying new scheduling theory to static priority pre-emptive scheduling. Software Engineering Journal, 8(5):284–292, 1993.

[Ade13] Adeos. Adaptive Domain Environment for Operating Systems, Apr. 2013.

[ADM12] Sebastian Altmeyer, Robert Davis, and Claire Maiza. Improved cache related pre-emption delay aware response time analysis for fixed priority pre-emptive systems. Real-Time Systems, 48(5):499–526, September 2012.


[AJ03] Björn Andersson and Jan Jonsson. The utilization bounds of partitioned and Pfair static-priority scheduling on multiprocessors are 50%. In proc. of the 15th IEEE Euromicro Conference on Real-Time Systems (ECRTS'03), pages 33–41, Porto, Portugal, 2003.

[AQu13] AQuoSA. Adaptive Quality of Service Architecture, Apr. 2013.

[AS00] James Anderson and Anand Srinivasan. Early-release fair scheduling. In proc. of the 12th IEEE Euromicro Conference on Real-Time Systems (ECRTS’00), pages 35–43, Stockholm, Sweden, 2000.

[AS04] James Anderson and Anand Srinivasan. Mixed Pfair/ERfair scheduling of asynchronous periodic tasks. Journal of Computer and System Sciences, 68(1):157–204, 2004.

[AT06] Björn Andersson and Eduardo Tovar. Multiprocessor scheduling with few preemptions. In proc. of the 12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'06), pages 322–334, Sydney, Australia, 2006.

[BA07] Björn Brandenburg and James Anderson. Integrating hard/soft real-time tasks and best-effort jobs on multiprocessors. In proc. of the 19th IEEE Euromicro Conference on Real-Time Systems (ECRTS’07), pages 61–70, Pisa, Italy, 2007.

[BA08] Björn Brandenburg and James Anderson. An implementation of the PCP, SRP, D-PCP, M-PCP, and FMLP real-time synchronization protocols in LITMUSRT. In proc. of the 14th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'08), pages 185–194, Kaohisung, Taiwan, 2008.

[BA09a] Konstantinos Bletsas and Björn Andersson. Notional processors: an approach for multiprocessor scheduling. In proc. of the 15th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS’09), pages 3–12, San Francisco, CA, USA, 2009.

[BA09b] Konstantinos Bletsas and Björn Andersson. Preemption-light multiprocessor scheduling of sporadic tasks with high utilisation bound. In proc. of the 30th IEEE Real-Time Systems Symposium (RTSS'09), pages 385–394, Washington, DC, USA, 2009.

[BA09c] Björn Brandenburg and James Anderson. On the implementation of global real-time schedulers. In proc. of the 30th IEEE Real-Time Systems Symposium (RTSS’09), pages 214–224, Washington, DC., USA, 2009.

[BA10] Björn Brandenburg and James Anderson. Optimality results for multiprocessor real-time locking. In proc. of the 31st IEEE Real-Time Systems Symposium (RTSS'10), pages 49–60, San Diego, CA, USA, 2010.

[BA11] Konstantinos Bletsas and Björn Andersson. Preemption-light multiprocessor scheduling of sporadic tasks with high utilisation bound. Real-Time Systems, 47(4):319–355, 2011.

[Bak03] Theodore Baker. Multiprocessor EDF and deadline monotonic schedulability analysis. In proc. of the 24th Real-Time Systems Symposium (RTSS'03), pages 120–129, Cancun, Mexico, 2003.

[Bak10] Theodore Baker. What to make of multicore processors for reliable real-time systems? In proc. of the 15th Ada-Europe International Conference on Reliable Software Technologies (Ada-Europe'10), pages 1–18, Valencia, Spain, 2010.

[Bas11] Andrea Bastoni. Towards the Integration of Theory and Practice in Multiprocessor Real-Time Scheduling. PhD thesis, University of Rome "Tor Vergata", 2011.

[BB07] Theodore Baker and Sanjoy Baruah. Schedulability analysis of multiprocessor sporadic task systems. In Handbook of Real-Time and Embedded Systems. CRC Press, 2007.

[BB10] M. Bertogna and S. Baruah. Limited preemption EDF scheduling of sporadic task systems. IEEE Transactions on Industrial Informatics, 6(4):579–591, 2010.

[BBA10a] Andrea Bastoni, Björn Brandenburg, and James Anderson. Cache-related preemption and migration delays: Empirical approximation and impact on schedulability. In proc. of the 6th International Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT'10), pages 33–44, Brussels, Belgium, 2010.

[BBA10b] Andrea Bastoni, Björn Brandenburg, and James Anderson. An empirical comparison of global, partitioned, and clustered multiprocessor EDF schedulers. In proc. of the 31st IEEE Real-Time Systems Symposium (RTSS'10), pages 14–24, San Diego, CA, USA, 2010.

[BBA11] Andrea Bastoni, Björn Brandenburg, and James Anderson. Is semi-partitioned scheduling practical? In proc. of the 23rd IEEE Euromicro Conference on Real-Time Systems (ECRTS'11), pages 125–135, Porto, Portugal, 2011.

[BBC+07] Björn Brandenburg, Aaron Block, John Calandrino, UmaMaheswari Devi, Hennadiy Leontyev, and James Anderson. LITMUSRT: A status report. In proc. of the 9th Real-Time Linux Workshop (RTLWS'07), pages 107–123, Linz, Austria, 2007.

[BC05] D. Bovet and M. Cesati. Understanding the Linux Kernel. O'Reilly & Associates Inc., 2005.

[BCA08] Björn Brandenburg, John Calandrino, and James Anderson. On the scalability of real-time scheduling algorithms on multicore platforms: A case study. In proc. of the 29th IEEE Real-Time Systems Symposium (RTSS'08), pages 157–169, Barcelona, Spain, 2008.

[BCPV93] Sanjoy Baruah, N. Cohen, Greg Plaxton, and Donald Varvel. Proportionate progress: a notion of fairness in resource allocation. In proc. of the 25th Annual ACM Symposium on Theory of Computing (STOC'93), pages 345–354, New York, NY, USA, 1993.

[BCPV96] Sanjoy Baruah, N. Cohen, Greg Plaxton, and Donald Varvel. Proportionate progress: A notion of fairness in resource allocation. Algorithmica, 15:600–625, 1996.

[BDWZ12] Alan Burns, Robert Davis, P. Wang, and Fengxiang Zhang. Partitioned EDF scheduling for multiprocessors using a C = D task splitting scheme. Real-Time Systems, 48(1):3–33, January 2012.

[Ber08] Marko Bertogna. Real-Time Scheduling Analysis for Multiprocessor Platforms. PhD thesis, Scuola Superiore Sant'Anna, Pisa, 2008.

[BGP95] Sanjoy Baruah, Johannes Gehrke, and Greg Plaxton. Fast scheduling of periodic tasks on multiple resources. In proc. of the 9th IEEE International Symposium on Parallel Processing (IPPS’95), pages 280–288, Santa Barbara, CA, USA, 1995.

[BLM+08] A. Barbalace, A. Luchetta, G. Manduchi, M. Moro, A. Soppelsa, and C. Taliercio. Performance comparison of VxWorks, Linux, RTAI, and Xenomai in a hard real-time application. IEEE Transactions on Nuclear Science, 55(1):435–439, 2008.

[BM10] Jeremy H. Brown and Brad Martin. How fast is fast enough? choosing between Xenomai and Linux for real-time applications. In proc. of the 12th Real-Time Linux Workshop (RTLWS’12), pages 1–17, Nairobi, Kenya, 2010.

[BMR90] Sanjoy Baruah, Aloysius Mok, and Louis Rosier. Preemptively scheduling hard-real-time sporadic tasks on one processor. In proc. of the 11th IEEE Real-Time Systems Symposium (RTSS'90), pages 182–190, Lake Buena Vista, Florida, USA, 1990.

[Bra11] Björn Brandenburg. Scheduling and Locking in Multiprocessor Real-Time Operating Systems. PhD thesis, The University of North Carolina at Chapel Hill, 2011.

[BSNN07] Moris Behnam, Insik Shin, Thomas Nolte, and Mikael Nolin. SIRAP: a synchronization protocol for hierarchical resource sharing in real-time open systems. In proc. of the 7th ACM/IEEE International Conference on Embedded Software (EMSOFT'07), pages 279–288, Salzburg, Austria, 2007.

[But04] G. Buttazzo. Hard Real-time Computing Systems: Predictable Scheduling Algorithms And Applications (Real-Time Systems Series). Springer-Verlag TELOS, Santa Clara, CA, USA, 2004.

[CAPC09] T. Cucinotta, L. Abeni, L. Palopoli, and F. Checconi. The wizard of OS: a heartbeat for legacy multimedia applications. In proc. of the 7th IEEE/ACM/IFIP Workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia'09), pages 70–79, Grenoble, France, 2009.

[CCFL09] F. Checconi, T. Cucinotta, D. Faggioli, and G. Lipari. Hierarchical multiprocessor CPU reservations for the Linux kernel. In proc. of the 5th International Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT'09), pages –, Dublin, Ireland, 2009.

[CCK+10] T. Cucinotta, F. Checconi, G. Kousiouris, D. Kyriazis, T. Varvarigou, A. Mazzetti, Z. Zlatev, J. Papay, M. Boniface, S. Berger, D. Lamp, T. Voith, and M. Stein. Virtualised e-learning with real-time guarantees on the IRMOS platform. In proc. of the 3rd IEEE International Conference on Service-Oriented Computing and Applications (SOCA'10), pages 1–8, Perth, Australia, 2010.

[CGJ97] Edward Coffman, Michael Garey, and David Johnson. Approximation algorithms for bin packing: a survey. In Dorit S. Hochbaum, editor, Approximation algorithms for NP-hard problems, pages 46–93. PWS Publishing Co., 1997.

[CLB+06] John Calandrino, Hennadiy Leontyev, Aaron Block, UmaMaheswari Devi, and James Anderson. LITMUSRT: A testbed for empirically comparing real-time multiprocessor schedulers. In proc. of the 27th IEEE Real-Time Systems Symposium (RTSS'06), pages 111–126, Rio de Janeiro, Brazil, 2006.

[CRJ06] Hyeonjoong Cho, B. Ravindran, and E.D. Jensen. An optimal real-time scheduling algorithm for multiprocessors. In proc. of the 27th IEEE Real-Time Systems Symposium (RTSS'06), pages 101–110, Rio de Janeiro, Brazil, 2006.

[Cuc13] Tommaso Cucinotta. Priority inheritance on condition variables. In proc. of the 9th International Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT'13), pages –, Paris, France, 2013.

[DB11a] R.I. Davis and A. Burns. FPZL schedulability analysis. In proc. of the 17th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'11), pages 245–256, Chicago, IL, USA, 2011.

[DB11b] Robert Davis and Alan Burns. A survey of hard real-time scheduling for multiproces- sor systems. ACM Comput. Surv., 43(4):35:1–35:44, October 2011.

[DB11c] Robert I. Davis and Alan Burns. Improved priority assignment for global fixed priority pre-emptive scheduling in multiprocessor real-time systems. Real-Time Systems, 47(1):1–40, January 2011.

[DBM+13] R.I. Davis, A. Burns, J. Marinho, V. Nelis, S.M. Petters, and M. Bertogna. Global fixed priority scheduling with deferred pre-emption. In proc. of the 19th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'13), pages –, Taipei, Taiwan, 2013.

[Der74] M. Dertouzos. Control robotics: The procedural control of physical processes. In IFIP Congress, pages 807–813, 1974.

[Dev] Advanced Micro Devices. AMD Opteron Processor. http://products.amd.com/en-us/OpteronCPUDetail.aspx?id=645.

[Dha77] S. Dhall. Scheduling periodic-time-critical jobs on single processor and multiprocessor computing systems. PhD thesis, University of Illinois at Urbana-Champaign, Champaign, IL, USA, 1977.

[DL78] S. Dhall and C. Liu. On a real-time scheduling problem. Operations Research, 26:127–140, 1978.

[DSA+13] Rob Davis, Luca Santinelli, Sebastian Altmeyer, Claire Maiza, and Liliana Cucu-Grosjean. Analysis of probabilistic cache related pre-emption delays. In proc. of the 25th IEEE Euromicro Conference on Real-Time Systems (ECRTS'13), pages 168–179, Paris, France, 2013.

[EL98] Rudolf Eigenmann and David J. Lilja. Von Neumann computers, 1998.

[FGB10] Nathan Fisher, Joël Goossens, and Sanjoy Baruah. Optimal online multiprocessor scheduling of sporadic real-time tasks is impossible. Real-Time Systems, 45:26–71, 2010.

[FKY08] Kenji Funaoka, Shinpei Kato, and Nobuyuki Yamasaki. Work-conserving optimal real-time scheduling on multiprocessors. In proc. of the 20th IEEE Euromicro Conference on Real-Time Systems (ECRTS'08), pages 13–22, Prague, Czech Republic, 2008.

[FLC10] D. Faggioli, G. Lipari, and T. Cucinotta. The multiprocessor bandwidth inheritance protocol. In proc. of the 22nd IEEE Euromicro Conference on Real-Time Systems (ECRTS’10), pages 90–99, Brussels, Belgium, 2010.

[Fou13] The Linux Foundation. The Linux Foundation, Apr. 2013.

[FTCS09] D. Faggioli, M. Trimarchi, F. Checconi, and C. Scordino. An EDF scheduling class for the Linux kernel. In proc. of the 11th Real-Time Linux Workshop (RTLWS'09), pages 197–204, Dresden, Germany, 2009.

[GCB13] A. Gujarati, F. Cerqueira, and B.B. Brandenburg. Schedulability analysis of the Linux push and pull scheduler with arbitrary processor affinities. In proc. of the 25th IEEE Euromicro Conference on Real-Time Systems (ECRTS'13), pages 69–79, Paris, France, 2013.

[GK06] Pawel Gepner and Michal Kowalik. Multi-core processors: New way to achieve high system performance. In proc of 5th international symposium on Parallel Computing in Electrical Engineering (PARELEC’06), pages 9–13, Bialystok, Poland, 2006.

[GN06] T. Gleixner and D. Niehaus. Hrtimers and beyond: Transforming the Linux time subsystems. In proc. of the Linux Symposium (OLS’06), pages 333–346, Ottawa, Ontario, Canada, 2006.

[GRS96] L. George, N. Rivierre, and M. Spuri. Preemptive and nonpreemptive real-time uniprocessor scheduling. Technical Report 2966, INRIA, France, 1996.

[GSYY10] Nan Guan, Martin Stigge, Wang Yi, and Ge Yu. Fixed-priority multiprocessor scheduling with Liu & Layland's utilization bound. In proc. of the 16th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'10), pages 165–174, Stockholm, Sweden, 2010.

[HBJK06] H. Hoang, G. Buttazzo, M. Jonsson, and S. Karlsson. Computing the minimum EDF feasible deadline in periodic systems. In proc. of the 12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'06), pages 125–134, Sydney, Australia, 2006.

[HP06] John L. Hennessy and David A. Patterson. Computer Architecture, Fourth Edition: A Quantitative Approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2006.

[Int13] Intel, Santa Clara, CA, USA. Intel® 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes: 1, 2A, 2B, 2C, 3A, 3B and 3C, March 2013.

[IRM13] IRMOS. Interactive real-time multimedia applications on service oriented infrastructures, Jan. 2013.

[Jen06] J. Jensen. Sur les fonctions convexes et les inégalités entre les valeurs moyennes. Acta Mathematica, 30:175–193, 1906.

[KAS93] D. Katcher, H. Arakawa, and J. Strosnider. Engineering and analysis of fixed priority schedulers. Transactions on Software Engineering, 19(9):920–934, 1993.

[Kat12] Shinpei Kato. RESCH: A loadable real-time scheduler suite for Linux, Oct. 2012.

[KRI09] Shinpei Kato, R. Rajkumar, and Y. Ishikawa. A loadable real-time scheduler suite for multicore platforms. Technical report CMU-ECE-TR09-12, 2009.

[KY07] Shinpei Kato and Nobuyuki Yamasaki. Real-time scheduling with task splitting on multiprocessors. In proc. of the 13th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'07), pages 441–450, Daegu, Korea, 2007.

[KY08a] Shinpei Kato and Nobuyuki Yamasaki. Modular real-time Linux. In proc. of the 10th Real-Time Linux Workshop (RTLWS'10), pages 97–102, Guadalajara, Mexico, 2008.

[KY08b] Shinpei Kato and Nobuyuki Yamasaki. Portioned EDF-based scheduling on multiprocessors. In proc. of the 8th ACM/IEEE International Conference on Embedded Software (EMSOFT'08), pages 139–148, Atlanta, GA, USA, 2008.

[KY09] Shinpei Kato and Nobuyuki Yamasaki. Semi-partitioned scheduling of sporadic task systems on multiprocessors. In proc. of the 21st IEEE Euromicro Conference on Real-Time Systems (ECRTS'09), pages 239–248, Dublin, Ireland, 2009.

[KY11] Shinpei Kato and Nobuyuki Yamasaki. Global EDF-based scheduling with laxity-driven priority promotion. Journal of Systems Architecture, 57(5):498–517, May 2011.

[LAMD13] W. Lunniss, S. Altmeyer, C. Maiza, and R.I. Davis. Integrating cache related pre-emption delay analysis into EDF scheduling. In proc. of the 19th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'13), pages 75–84, Philadelphia, PA, USA, 2013.

[Lee94] S. Lee. On-line multiprocessor scheduling algorithms for real-time tasks. In IEEE Region 10's Ninth Annual International Conference. Theme: Frontiers of Computer Technology (TENCON'94), pages 607–611, 1994.

[Leh90] John Lehoczky. Fixed priority scheduling of periodic task sets with arbitrary deadlines. In proc. of the 11th IEEE Real-Time Systems Symposium (RTSS'90), pages 201–213, Lake Buena Vista, FL, USA, 1990.

[LFS+10] Greg Levin, Shelby Funk, Caitlin Sadowski, Ian Pye, and Scott Brandt. DP-Fair: A simple model for understanding optimal multiprocessor scheduling. In proc. of the 22nd IEEE Euromicro Conference on Real-Time Systems (ECRTS'10), pages 3–13, Brussels, Belgium, 2010.

[LGDG00] J.M. Lopez, M. Garcia, J.L. Diaz, and D.F. Garcia. Worst-case utilization bound for EDF scheduling on real-time multiprocessor systems. In proc. of the 12th IEEE Euromicro Conference on Real-Time Systems (ECRTS'00), pages 25–33, Stockholm, Sweden, 2000.

[Liu69] C. L. Liu. Scheduling algorithms for hard-real-time multiprogramming of a single processor. JPL Space Programs Summary, II(1):37–60, 1969.

[LL73] C. L. Liu and J. W. Layland. Scheduling algorithms for multiprogramming in a hard-real-time environment. Journal of the ACM, 20(1):46–61, 1973.

[LL03] Lars Lundberg and H. Lennerstad. Global multiprocessor scheduling of aperiodic tasks using time-independent priorities. In proc. of the 9th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'03), pages 170–180, 2003.

[LLFC11] Juri Lelli, Giuseppe Lipari, Dario Faggioli, and Tommaso Cucinotta. An efficient and scalable implementation of global EDF in Linux. In proc. of the 7th International Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT'11), pages 6–15, Porto, Portugal, 2011.

[Low06] G. Lowney. Why Intel is designing multi-core processors. In proc. of the 25th International Symposium on Parallelism in Algorithms and Architectures (SPAA'06) – Invited Talk, pages 113–113, Montreal, Canada, 2006.

[LR12] LITMUS-RT. Linux Testbed for MUltiprocessor Scheduling in real-time systems (LITMUSRT), Oct 2012.

[LRL09] K. Lakshmanan, R. Rajkumar, and J. Lehoczky. Partitioned fixed-priority preemptive scheduling for multi-core processors. In proc. of the 21st Euromicro Conference on Real-Time Systems (ECRTS’09), pages 239–248, Dublin, Ireland, 2009.

[Lun02] Lars Lundberg. Analyzing fixed-priority global multiprocessor scheduling. In proc. of the 8th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS’02), pages 145–153, San Jose, CA, USA, 2002.

[Mau08] Wolfgang Mauerer. Professional Linux Kernel Architecture. Wrox Press Ltd., Birmingham, UK, 2008.

[MBH+02] Deborah T. Marr, Frank Binns, David L. Hill, Glenn Hinton, David A. Koufaty, Alan J. Miller, and Michael Upton. Hyper-Threading Technology Architecture and Microarchitecture. Intel Technology Journal, 6(1):4–15, February 2002.

[McK05] Paul McKenney. A realtime preemption overview, August 2005.

[Mol04a] Ingo Molnar. CONFIG_PREEMPT_REALTIME, fully preemptible kernel, October 2004.

[Mol04b] Ingo Molnar. remove the BKL (big kernel lock), this time for real, September 2004.

[Pai00] Vijay S. Pai. Exploiting Instruction-Level Parallelism for Memory System Performance. PhD thesis, Rice University, 2000.

[PCML09] L. Palopoli, T. Cucinotta, L. Marzario, and G. Lipari. AQuoSA-adaptive quality of service architecture. Softw. Pract. Exper., 39(1):1–31, January 2009.

[PR12] PREEMPT-RT. Real-time Linux wiki, Sep 2012.

[RCM96] I. Ripoll, A. Crespo, and A.K. Mok. Improvement in feasibility testing for real-time tasks. Real-Time Systems, 11(1):19–39, 1996.

[ReT12] ReTAS. Real-time TAsk-Splitting scheduling algorithms framework, Oct 2012.

[RF93] B. Ramakrishna Rau and Joseph A. Fisher. Instruction-level parallel processing: history, overview, and perspective. The Journal of Supercomputing, 7(1-2):9–50, May 1993.

[RH07] Steven Rostedt and Darren V. Hart. Internals of the RT patch. In proc. of the Linux Symposium (OLS'07), pages 161–172, Ottawa, Ontario, Canada, 2007.

[RLM+11] P. Regnier, G. Lima, E. Massa, G. Levin, and S. Brandt. RUN: Optimal multiprocessor real-time scheduling via reduction to uniprocessor. In proc. of the 32nd IEEE Real-Time Systems Symposium (RTSS'11), pages 104–115, Vienna, Austria, 2011.

[RTA12] RTAI. Real Time Application Interface for Linux, Sep 2012.

[SA02] Anand Srinivasan and James Anderson. Optimal rate-based scheduling on multiprocessors. In proc. of the 34th ACM Symposium on Theory of Computing, pages 189–198, Montreal, Quebec, Canada, 2002.

[SAT10a] Paulo Baltarejo Sousa, Björn Andersson, and Eduardo Tovar. Challenges and design principles for implementing slot-based task-splitting multiprocessor scheduling. In proc. of the 31st IEEE Real-Time Systems Symposium (RTSS'10) – Work-in-Progress Session, San Diego, CA, USA, 2010.

[SAT10b] Paulo Baltarejo Sousa, Björn Andersson, and Eduardo Tovar. Implementing slot-based task-splitting multiprocessor scheduling. Technical report HURRAY-TR-100504, CISTER, Polytechnic Institute of Porto (ISEP-IPP), 2010.

[SAT11] Paulo Baltarejo Sousa, Björn Andersson, and Eduardo Tovar. Implementing slot-based task-splitting multiprocessor scheduling. In proc. of the 6th IEEE International Symposium on Industrial Embedded Systems (SIES'11), pages 256–265, Vasteras, Sweden, 2011.

[SB02] Anand Srinivasan and Sanjoy Baruah. Deadline-based scheduling of periodic task systems on multiprocessors. Information Processing Letters, 84:93–98, 2002.

[SBAT11] Paulo Baltarejo Sousa, Konstantinos Bletsas, Björn Andersson, and Eduardo Tovar. Practical aspects of slot-based task-splitting dispatching in its schedulability analysis. In proc. of the 17th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'11), pages 224–230, Toyama, Japan, 2011.

[SBT+13] Paulo Baltarejo Sousa, Konstantinos Bletsas, Eduardo Tovar, Pedro Souto, and Benny Akesson. Unified overhead-aware schedulability analysis for slot-based task-splitting. Technical report CISTER-TR-130201, CISTER, Polytechnic Institute of Porto (ISEP-IPP), 2013.

[SBTA11] Paulo Baltarejo Sousa, Konstantinos Bletsas, Eduardo Tovar, and Björn Andersson. On the implementation of real-time slot-based task-splitting scheduling algorithms for multiprocessor systems. In proc. of the 13th Real-Time Linux Workshop (RTLWS'13), pages 207–218, Prague, Czech Republic, 2011.

[SCH12] SCHED_DEADLINE. An implementation of the popular earliest deadline first (EDF) scheduling algorithm for the Linux kernel, Oct. 2012.

[SLB13] José Augusto Santos-Jr., George Lima, and Konstantinos Bletsas. On the processor utilisation bound of the C=D scheduling algorithm. In proc. of Real-Time Systems: The Past, the Present, and the Future (AlanFest 2013), http://www.cs.unc.edu/~baruah/AlanFest/Procs.pdf, pages 119–132, 2013.

[SMD+10] Angela C. Sodan, Jacob Machina, Arash Deshmeh, Kevin Macnaughton, and Bryan Esbaugh. Parallelism via multithreaded and multicore CPUs. Computer, 43:24–32, 2010.

[SPT12] Paulo Baltarejo Sousa, Nuno Pereira, and Eduardo Tovar. Enhancing the real-time capabilities of the Linux kernel. SIGBED Rev., 9(4):45–48, November 2012.

[Spu96] Marco Spuri. Analysis of deadline scheduled real-time systems. Technical report, INRIA, 1996.

[SRL90] Lui Sha, R. Rajkumar, and John Lehoczky. Priority inheritance protocols: An approach to real-time synchronization. IEEE Transactions on Computers, 39(9):1175–1185, 1990.

[SSTB13] Paulo Baltarejo Sousa, Pedro Souto, Eduardo Tovar, and Konstantinos Bletsas. The Carousel-EDF scheduling algorithm for multiprocessor systems. In proc. of the 19th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA’13), pages –, Taipei, Taiwan, 2013.

[Sta88] J. A. Stankovic. Misconceptions about real-time computing: A serious problem for next-generation systems. Computer, 21(10):10–19, 1988.

[TIL12] TILERA. Manycore without boundaries: delivering the world's first 100-core general purpose processor, Sep 2012.

[TW08] Andrew S. Tanenbaum and Albert S. Woodhull. Operating Systems Design and Implementation. Pearson, Upper Saddle River, NJ, 3rd edition, 2008.

[Ves07] S. Vestal. Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance. In proc. of the 28th IEEE Real-Time Systems Symposium (RTSS’07), pages 239–243, Tucson, Arizona, USA, 2007.

[WEE+08] Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, Stephan Thesing, David Whalley, Guillem Bernat, Christian Ferdinand, Reinhold Heckmann, Tulika Mitra, Frank Mueller, Isabelle Puaut, Peter Puschner, Jan Staschulat, and Per Stenström. The worst-case execution-time problem-overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst., 7(3):36:1–36:53, May 2008.

[Xen12] Xenomai. Real-time framework for Linux, Sep 2012.

[Yod99] Victor Yodaiken. The RTLinux manifesto. In proc. of the 5th Linux Expo, 1999.

[ZB08] Fengxiang Zhang and Alan Burns. Schedulability analysis for real-time systems with EDF scheduling. Technical report YCS-2008-426, University of York, Department of Computer Science, 2008.

[ZB09] Fengxiang Zhang and Alan Burns. Improvement to quick processor-demand analysis for EDF-scheduled real-time systems. In proc. of the 21st IEEE Euromicro Conference on Real-Time Systems (ECRTS'09), pages 76–86, Dublin, Ireland, 2009.

[ZMM03] Dakai Zhu, Daniel Mossé, and Rami G. Melhem. Multiple-resource periodic scheduling problem: how much fairness is necessary? In proc. of the 24th IEEE International Real-Time Systems Symposium (RTSS'03), pages 142–151, Cancun, Mexico, 2003.

Appendix A

Interrupts

Interrupts in an operating system are raised by any hardware or software component when it wants the processor's attention. Basically, when a processor receives an interrupt, it stops the execution of the current task to execute the interrupt service routine (ISR) associated with the received interrupt. We model each sporadic interrupt $Int_i$ as a sporadic interfering task with minimum inter-arrival time $T_i^{Int}$ and an execution time equal to $C_i^{Int}$, which runs at a higher priority than normal tasks. Some periodic interrupts (for example, the periodic tick) are also characterised by an arrival jitter $J_i^{Int}$. We assume that $C_i^{Int}$ is much smaller than $S$ ($C_i^{Int} \ll S$) and that the number of distinct types of interrupts is limited to $n^{Int}$. Modelling interrupts in this manner allows safely upper-bounding the cumulative execution demand by interrupts using conventional analysis for sporadic tasks (which we next formulate in detail). However, specifically for interrupts with $T_i^{Int} < S$, modelling such interrupts instead as bursty periodic tasks¹ is sometimes less pessimistic. Intuitively, this is because under slot-based task-splitting scheduling algorithms, interrupts only exert overhead when present inside the reserve(s) of the server under consideration. Outside its reserve(s), an interrupt contributes to the processor demand of some other server instead. Therefore, for each interrupt with $T_i^{Int} < S$, we consider both models and pick the least pessimistic value.

Next, we present in detail how the processor demand of interrupts is bounded under our analysis. Note that, depending on the server type (split or non-split), we model the execution demand of interrupts in a slightly different way. First, let us consider non-split servers.

A non-split server executes in a single reserve of length $Res^{len}[\tilde{P}_q]$ (see Equation 5.22). The cumulative execution demand by all interrupts on the server can be upper-bounded as

\[
dbf_{IntO}^{sb:non-split}(\tilde{P}_q, t) = \sum_{i=1}^{n^{Int}} dbf_{Int}^{sb:non-split}(Int_i, \tilde{P}_q, t) \tag{A.1}
\]

where $dbf_{Int}^{sb:non-split}(Int_i, \tilde{P}_q, t)$ is the respective upper bound on the processor demand by interrupt $Int_i$.

¹The bursty periodic arrival model was introduced in [ABR+93].


For every interrupt $Int_i$ (irrespective of whether $T_i^{Int} < S$ or $T_i^{Int} \ge S$), an upper bound for $dbf_{Int}^{sb:non-split}(Int_i, \tilde{P}_q, t)$ can be (pessimistically) computed as:

\[
dbf_{Int}^{(continuous)}(Int_i, t) = \left\lceil \frac{t + J_i^{Int}}{T_i^{Int}} \right\rceil \cdot C_i^{Int} \tag{A.2}
\]
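For illustration, Equation (A.2) maps directly onto integer arithmetic. The following is a minimal C sketch, assuming integer time values in a common unit (e.g., nanoseconds); the names (interrupt_t, dbf_int_continuous) are illustrative and not taken from the actual analysis tool.

#include <stdint.h>

typedef struct {
    uint64_t T; /* minimum inter-arrival time, T_i^Int */
    uint64_t C; /* worst-case execution time, C_i^Int */
    uint64_t J; /* arrival jitter, J_i^Int (zero for purely sporadic interrupts) */
} interrupt_t;

/* Equation (A.2): ceil((t + J) / T) * C, using the integer identity
 * ceil(a/b) = (a + b - 1) / b for a >= 0, b > 0. */
static uint64_t dbf_int_continuous(const interrupt_t *it, uint64_t t)
{
    return ((t + it->J + it->T - 1) / it->T) * it->C;
}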

The pessimism in this derivation lies in that even interrupts raised outside the reserves of the server in consideration are treated as interfering. However, for interrupts with $T_i^{Int} < S$, an alternative derivation of an upper bound for $dbf_{Int}^{sb:non-split}(Int_i, \tilde{P}_q, t)$ is possible, using the bursty periodic model, as explained earlier. Namely:

\[
dbf_{Int}^{(non-split:bursty)}(Int_i, \tilde{P}_q, t) = nr^S(t) \cdot dbf_{Int}^{N}(Int_i, \tilde{P}_q) + dbf_{Int}^{tail:N}(Int_i, \tilde{P}_q), \quad \forall i: T_i^{Int} < S \tag{A.3}
\]

where $nr^S(t)$ is an upper bound on the number of time slots fully contained in the time interval under consideration (of length $t$) and $dbf_{Int}^{N}(Int_i, \tilde{P}_q)$ is an upper bound on the demand by $Int_i$ inside the reserve (of length $Res^{len}[\tilde{P}_q]$) of server $\tilde{P}_q$ in a single time slot (of length $S$). Similarly for $dbf_{Int}^{tail:N}(Int_i, \tilde{P}_q)$, but over the remaining time interval (i.e. the "tail") of length $t^{tail}$. These terms, in turn, are derived as:

\[
nr^S(t) = \left\lfloor \frac{t + Res^J}{S} \right\rfloor \tag{A.4}
\]

\[
t^{tail} = t - nr^S(t) \cdot S \tag{A.5}
\]

\[
dbf_{Int}^{N}(Int_i, \tilde{P}_q) = dbf_{Int}^{(continuous)}(Int_i, Res^{len}[\tilde{P}_q]) \tag{A.6}
\]
and

\[
dbf_{Int}^{tail:N}(Int_i, \tilde{P}_q) = dbf_{Int}^{(continuous)}(Int_i, \min(t^{tail}, Res^{len}[\tilde{P}_q])) \tag{A.7}
\]

Often, though not always, $dbf_{Int}^{(non-split:bursty)}(Int_i, \tilde{P}_q, t)$ provides a less pessimistic estimate than $dbf_{Int}^{(continuous)}(Int_i, t)$, for interrupts with $T_i^{Int} < S$. Hence, in the general case, $dbf_{Int}^{sb:non-split}(Int_i, \tilde{P}_q, t)$ is computed as:

\[
dbf_{Int}^{sb:non-split}(Int_i, \tilde{P}_q, t) =
\begin{cases}
dbf_{Int}^{(continuous)}(Int_i, t) & \text{if } T_i^{Int} \ge S \\[4pt]
\min\left(dbf_{Int}^{(continuous)}(Int_i, t),\; dbf_{Int}^{(non-split:bursty)}(Int_i, \tilde{P}_q, t)\right) & \text{if } T_i^{Int} < S
\end{cases}
\tag{A.8}
\]
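Putting Equations (A.3) through (A.8) together, the bound for a non-split server can be sketched in C as follows, reusing interrupt_t and dbf_int_continuous() from the sketch above; the clamping of the tail length is a guard for the illustrative integer arithmetic, not part of Equation (A.5).

/* Sketch of Equations (A.3)-(A.8) for a non-split server. S is the time
 * slot length, res_len the reserve length Res^len[Pq], res_j the reserve
 * jitter Res^J, all in the same time unit as in dbf_int_continuous(). */
static uint64_t dbf_int_nonsplit(const interrupt_t *it, uint64_t t,
                                 uint64_t S, uint64_t res_len, uint64_t res_j)
{
    uint64_t continuous = dbf_int_continuous(it, t);      /* Equation (A.2) */

    if (it->T >= S)
        return continuous;                                /* (A.8), first case */

    uint64_t nr_s = (t + res_j) / S;                      /* (A.4): full slots */
    uint64_t tail = (t > nr_s * S) ? t - nr_s * S : 0;    /* (A.5), clamped at 0 */
    uint64_t per_slot = dbf_int_continuous(it, res_len);  /* (A.6) */
    uint64_t in_tail =
        dbf_int_continuous(it, tail < res_len ? tail : res_len); /* (A.7) */
    uint64_t bursty = nr_s * per_slot + in_tail;          /* (A.3) */

    return (bursty < continuous) ? bursty : continuous;   /* (A.8), second case */
}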

Next, we deal with split servers. For convenience we define:

\[
Res_x^{len}[\tilde{P}_q] = x[P_{p+1}] = U_x^{infl}[\tilde{P}_q] \cdot S
\qquad
Res_y^{len}[\tilde{P}_q] = y[P_p] = U_y^{infl}[\tilde{P}_q] \cdot S \tag{A.9}
\]

As mentioned before, a split server $\tilde{P}_q$ executes in two reserves (of lengths $Res_x^{len}[\tilde{P}_q]$ and $Res_y^{len}[\tilde{P}_q]$) separated by $\Omega$ time units. That is, it is idle during $\Omega$ time units, next it executes for $x[P_{p+1}]$ on processor $P_{p+1}$, then it is idle again during $\Omega$ time units, and finally it executes for $y[P_p]$ on processor $P_p$ (see Figure 4.6). The cumulative execution demand by all interrupts on the server reserves can be upper-bounded as

\[
dbf_{IntO}^{sb:split}(\tilde{P}_q, t) = \sum_{i=1}^{n^{Int}} dbf_{Int}^{sb:split}(Int_i, \tilde{P}_q, t) \tag{A.10}
\]

where $dbf_{Int}^{sb:split}(Int_i, \tilde{P}_q, t)$ is the respective upper bound on the processor demand by interrupt $Int_i$. For every interrupt $Int_i$ (irrespective of whether $T_i^{Int} < S$ or $T_i^{Int} \ge S$), an upper bound for $dbf_{Int}^{sb:split}(Int_i, \tilde{P}_q, t)$ can be (pessimistically) computed by $dbf_{Int}^{(continuous)}(Int_i, t)$. As for non-split servers, the pessimism in this derivation lies in that even interrupts raised outside the reserves of the server in consideration are treated as interfering. Then, for interrupts with $T_i^{Int} < S$, it is possible to employ the bursty periodic model, which may reduce the pessimism:

\[
dbf_{Int}^{(split:bursty)}(Int_i, \tilde{P}_q, t) = nr^S(t) \cdot \left(dbf_{Int}^{X}(Int_i, \tilde{P}_q) + dbf_{Int}^{Y}(Int_i, \tilde{P}_q)\right) + dbf_{Int}^{tail:X}(Int_i, \tilde{P}_q) + dbf_{Int}^{tail:Y}(Int_i, \tilde{P}_q), \quad \forall i: T_i^{Int} < S \tag{A.11}
\]

where $dbf_{Int}^{X}(Int_i, \tilde{P}_q)$ and $dbf_{Int}^{Y}(Int_i, \tilde{P}_q)$ are upper bounds on the demand by $Int_i$ inside the reserves (of lengths $Res_x^{len}[\tilde{P}_q]$ and $Res_y^{len}[\tilde{P}_q]$) of server $\tilde{P}_q$ in a single time slot (of length $S$). Similarly for $dbf_{Int}^{tail:X}(Int_i, \tilde{P}_q)$ and $dbf_{Int}^{tail:Y}(Int_i, \tilde{P}_q)$, but over the remaining time interval (i.e. the "tail") of length $t^{tail}$. These terms, in turn, are derived as:

\[
dbf_{Int}^{X}(Int_i, \tilde{P}_q) = dbf_{Int}^{(continuous)}(Int_i, Res_x^{len}[\tilde{P}_q]) \tag{A.12}
\]

\[
dbf_{Int}^{Y}(Int_i, \tilde{P}_q) = dbf_{Int}^{(continuous)}(Int_i, Res_y^{len}[\tilde{P}_q]) \tag{A.13}
\]

\[
dbf_{Int}^{tail:X}(Int_i, \tilde{P}_q) =
\begin{cases}
0 & \text{if } t^{tail} \le \Omega \\[4pt]
dbf_{Int}^{(continuous)}(Int_i,\, t^{tail} - \Omega) & \text{if } \Omega < t^{tail} \le \Omega + Res_x^{len}[\tilde{P}_q] \\[4pt]
dbf_{Int}^{(continuous)}(Int_i,\, Res_x^{len}[\tilde{P}_q]) & \text{if } \Omega + Res_x^{len}[\tilde{P}_q] < t^{tail} \le S
\end{cases}
\tag{A.14}
\]

\[
dbf_{Int}^{tail:Y}(Int_i, \tilde{P}_q) =
\begin{cases}
0 & \text{if } t^{tail} \le 2 \cdot \Omega + Res_x^{len}[\tilde{P}_q] \\[4pt]
dbf_{Int}^{(continuous)}(Int_i,\, t^{tail} - (2 \cdot \Omega + Res_x^{len}[\tilde{P}_q])) & \text{if } 2 \cdot \Omega + Res_x^{len}[\tilde{P}_q] < t^{tail} \le S
\end{cases}
\tag{A.15}
\]
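The case analysis of Equations (A.14) and (A.15) translates directly into code. A minimal sketch, again reusing interrupt_t and dbf_int_continuous() from the sketches above; tail stands for $t^{tail}$, omega for $\Omega$, and res_x for $Res_x^{len}[\tilde{P}_q]$.

/* Sketch of Equations (A.14)-(A.15): interrupt demand in the "tail" of the
 * interval for a split server whose two reserves are separated by omega. */
static uint64_t dbf_int_tail_x(const interrupt_t *it, uint64_t tail,
                               uint64_t omega, uint64_t res_x)
{
    if (tail <= omega)
        return 0;                                        /* (A.14), case 1 */
    if (tail <= omega + res_x)
        return dbf_int_continuous(it, tail - omega);     /* (A.14), case 2 */
    return dbf_int_continuous(it, res_x);                /* (A.14), case 3 */
}

static uint64_t dbf_int_tail_y(const interrupt_t *it, uint64_t tail,
                               uint64_t omega, uint64_t res_x)
{
    if (tail <= 2 * omega + res_x)
        return 0;                                        /* (A.15), case 1 */
    return dbf_int_continuous(it, tail - (2 * omega + res_x)); /* (A.15), case 2 */
}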

As for non-split servers, often, though not always, $dbf_{Int}^{(split:bursty)}(Int_i, \tilde{P}_q, t)$ provides a less pessimistic estimate than $dbf_{Int}^{(continuous)}(Int_i, t)$, for interrupts with $T_i^{Int} < S$. Hence, in the general case, $dbf_{Int}^{sb:split}(Int_i, \tilde{P}_q, t)$ is computed as:

\[
dbf_{Int}^{sb:split}(Int_i, \tilde{P}_q, t) =
\begin{cases}
dbf_{Int}^{(continuous)}(Int_i, t) & \text{if } T_i^{Int} \ge S \\[4pt]
\min\left(dbf_{Int}^{(continuous)}(Int_i, t),\; dbf_{Int}^{(split:bursty)}(Int_i, \tilde{P}_q, t)\right) & \text{if } T_i^{Int} < S
\end{cases}
\tag{A.16}
\]
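Combining the helpers above, the split-server bound of Equations (A.11) and (A.16) can be sketched as follows; res_y stands for $Res_y^{len}[\tilde{P}_q]$, and the tail clamping is again only a guard for the illustrative integer arithmetic.

/* Sketch of Equations (A.11) and (A.16) for a split server, reusing
 * dbf_int_continuous(), dbf_int_tail_x() and dbf_int_tail_y(). */
static uint64_t dbf_int_split(const interrupt_t *it, uint64_t t,
                              uint64_t S, uint64_t res_x, uint64_t res_y,
                              uint64_t omega, uint64_t res_j)
{
    uint64_t continuous = dbf_int_continuous(it, t);

    if (it->T >= S)
        return continuous;                               /* (A.16), first case */

    uint64_t nr_s = (t + res_j) / S;                     /* (A.4) */
    uint64_t tail = (t > nr_s * S) ? t - nr_s * S : 0;   /* (A.5), clamped at 0 */
    uint64_t per_slot = dbf_int_continuous(it, res_x)    /* (A.12) */
                      + dbf_int_continuous(it, res_y);   /* (A.13) */
    uint64_t bursty = nr_s * per_slot                    /* (A.11) */
                    + dbf_int_tail_x(it, tail, omega, res_x)
                    + dbf_int_tail_y(it, tail, omega, res_x);

    return (bursty < continuous) ? bursty : continuous;  /* (A.16), second case */
}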

Appendix B

Other results

This appendix complements the evaluation presented in Chapter 7. The results presented in Chapter 7 are all from task sets with short periods, in the range 5 to 50 milliseconds. Here, we also provide results for task sets with long periods, in the range 50 to 250 milliseconds, and with mixed periods, in the range 5 to 250 milliseconds. As the name implies, the utilization-based schedulability analysis relies on the ratio between $C_i$ and $T_i$. The task set generation procedure, whose pseudo-code is presented in Algorithm 7.1, first randomly generates the utilization, next randomly generates the period, and then computes the execution time of each task $\tau_i$ as $C_i = u_i \cdot T_i$.
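For illustration, the final step of this generation can be sketched in C as follows; the uniform draw of the period and the names used (rt_task_t, rand_in) are assumptions of this sketch, since Algorithm 7.1 itself is not reproduced here.

#include <stdlib.h>

typedef struct {
    double u; /* utilization u_i */
    double T; /* period T_i, in milliseconds */
    double C; /* execution time C_i */
} rt_task_t;

/* Draw a value uniformly from [lo, hi] (assumed distribution). */
static double rand_in(double lo, double hi)
{
    return lo + (hi - lo) * ((double)rand() / RAND_MAX);
}

/* Given a previously drawn utilization u, draw a period from the
 * mixed-period range [5, 250] ms and derive C_i = u_i * T_i. */
static rt_task_t generate_mixed_period_task(double u)
{
    rt_task_t tsk;
    tsk.u = u;
    tsk.T = rand_in(5.0, 250.0);
    tsk.C = tsk.u * tsk.T;
    return tsk;
}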

The system utilization ($U_s$) is used as a criterion for task set generation, which means that, for the same utilization-related input parameters, the generated task set is the same. Therefore, the results for mixed- and long-period task sets are equal to the results for short-period task sets presented in Chapter 7. In this appendix, we focus on the NPS-F and Carousel-EDF scheduling algorithms.

B.1 Off-line task-to-processor algorithm

Figure B.1 plots the results of the new overhead-aware schedulability analysis with $\delta$ equal to one, $Cpmd^O$ equal to 100 microseconds, and the remaining parameters set according to Table 7.1. Recall that, to compare schedulability analyses, this metric is used indirectly: a schedulability analysis is better than another if, for a given set of task sets, the former finds that more task sets are schedulable than the latter. Mixed- and long-period task sets present better results than short-period task sets. These results are due to: (i) the larger time slots created by mixed-period and, especially, long-period task sets and (ii) the larger task periods. Note that overhead costs do not change according to either the length of the time slots or the periodicity of tasks; then, with larger time slots, the associated overhead costs decrease. Figure B.2 shows the mean and the standard deviation of the time slot length for each type of task set (according to both the period ranges and task utilizations). Further, less frequently arriving tasks also decrease the costs of the task release and task switch overheads. However, CPMD overheads could change according to the working set size of each task.

It is possible that tasks with larger execution times could require more memory operations, which could imply larger CPMD overheads. As mentioned before, computing the CPMD values is out of the scope of this work.

[Figure B.1: Comparison of the new NPS-F demand-based overhead-aware schedulability analysis for short-, mixed-, and long-period task sets. Four insets plot the inflated system utilization $(U_{s:NPS-F}^{infl})$ for classes of task sets composed of mixed, light, medium, and heavy tasks.]

B.2 Run-time task-dispatching algorithm

Figure B.3 and Figure B.4 plot the percentage of deadlines met for mixed- and long-period task sets, respectively, scheduled under the NPS-F scheduling algorithm (please note that the y-axes do not have the same scale among all insets of Figure B.3 and Figure B.4). In those figures, we present, for each type of task (mixed, light, medium, and heavy), a reliability comparison of the NPS-F scheduling algorithm based on the percentage of deadlines met when the run-time task-dispatching settings are computed according to the original, utilization-based, and the new, demand-based,

overhead-aware schedulability analyses. Despite being considered schedulable by the utilization-based analysis, many task sets miss deadlines. We would like to highlight that all task sets composed of light tasks miss deadlines. In our opinion, this is due to the fact that these task sets are composed of more tasks than the other task sets. Thus, a higher number of tasks implies more task-related overheads, such as task releases and task switches, and the pessimism of the utilization factor of the utilization-based schedulability analysis is not sufficient to accommodate such overheads. By comparison, almost all task sets composed of heavy tasks meet their deadlines. These task sets are composed of a small number of tasks and therefore incur fewer task-related overheads; the pessimism associated with the utilization-based schedulability analysis is enough to absorb the overheads. From these results, we can observe that the type of tasks has an impact on the performance of the task-dispatching algorithm. In any case, no task set deemed schedulable by the new analysis missed any deadline. This comparative study shows that the overheads have to be incorporated into the schedulability analysis of the slot-based task-splitting scheduling algorithms to increase their reliability.

[Figure B.2: The mean and the standard deviation of the time slot length (in milliseconds) for short-, mixed-, and long-period task sets according to the type of tasks (mixed, light, medium, and heavy).]

Figure B.5 and Figure B.6 plot the percentage of deadlines met for mixed- and long-period task sets, respectively, scheduled under the RR, FIFO, NPS-F, and Carousel-EDF scheduling algorithms (please note that the y-axes do not have the same scale among all insets of Figure B.5 and Figure B.6). For the sake of presentation, we limit the plots to 90%; that is, we do not present the results for task sets that have a percentage of deadlines met below 90%. As can be observed, NPS-F and Carousel-EDF meet all deadlines, whereas RR and FIFO miss deadlines (and, in fact, even though this is not visible in the plots, this even happens for some task sets with $U_s$ equal to 50%).

[Figure B.3: A comparison between the utilization-based and the demand-based schedulability analyses based on the percentage of deadlines met for mixed-period task sets, in the range [5, 250] milliseconds. Four insets (mixed, light, medium, and heavy tasks) plot the percentage of deadlines met against task set utilization.]

[Figure B.4: A comparison between the utilization-based and the demand-based schedulability analyses based on the percentage of deadlines met for long-period task sets, in the range [50, 250] milliseconds. Four insets (mixed, light, medium, and heavy tasks) plot the percentage of deadlines met against task set utilization.]

[Figure B.5: A performance comparison among RR, FIFO, NPS-F, and Carousel-EDF based on the percentage of deadlines met for mixed-period (in the range [5, 250] milliseconds) task sets. Four insets (mixed, light, medium, and heavy tasks) plot the percentage of deadlines met against task set utilization.]

[Figure B.6: A performance comparison among RR, FIFO, NPS-F, and Carousel-EDF based on the percentage of deadlines met for long-period (in the range [50, 250] milliseconds) task sets. Four insets (mixed, light, medium, and heavy tasks) plot the percentage of deadlines met against task set utilization.]

Appendix C

Papers and materials

C.1 List of papers by the author

This is a list of publications that reflects the results achieved during the development of the research work presented in this dissertation. Most of the results are included in this dissertation.

1. Paulo Baltarejo Sousa, Björn Andersson, and Eduardo Tovar. Challenges and design principles for implementing slot-based task-splitting multiprocessor scheduling. In proc. of the 31st IEEE Real-Time Systems Symposium (RTSS'10) – Work-in-Progress Session, San Diego, CA, USA, 2010.

2. Paulo Baltarejo Sousa, Björn Andersson, and Eduardo Tovar. Implementing slot-based task-splitting multiprocessor scheduling. Technical report HURRAY-TR-100504, CISTER, Polytechnic Institute of Porto (ISEP-IPP), 2010.

3. Paulo Baltarejo Sousa, Björn Andersson, and Eduardo Tovar. Implementing slot-based task-splitting multiprocessor scheduling. In proc. of 6th IEEE International Symposium on Industrial Embedded Systems (SIES’11), pages 256–265, Vasteras, Sweden, 2011.

4. Paulo Baltarejo Sousa, Konstantinos Bletsas, Björn Andersson, and Eduardo Tovar. Practical aspects of slot-based task-splitting dispatching in its schedulability analysis. In proc. of the 17th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'11), pages 224–230, Toyama, Japan, 2011.

5. Paulo Baltarejo Sousa, Konstantinos Bletsas, Eduardo Tovar, and Björn Andersson. On the implementation of real-time slot-based task-splitting scheduling algorithms for multiprocessor systems. In proc. of the 13th Real-Time Linux Workshop (RTLWS'13), pages 207–218, Prague, Czech Republic, 2011.

6. Paulo Baltarejo Sousa, Nuno Pereira, and Eduardo Tovar. Enhancing the real-time capabilities of the Linux kernel.


• In proc. of the 24th Euromicro Conference on Real-Time Systems (ECRTS'12) – Work-in-Progress Session, Pisa, Italy, 2012.
• SIGBED Rev., 9(4):45–48, November 2012.

7. Paulo Baltarejo Sousa, Pedro Souto, Eduardo Tovar, and Konstantinos Bletsas. The Carousel-EDF scheduling algorithm for multiprocessor systems. In proc. of the 19th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'13), pages –, Taipei, Taiwan, 2013.

8. Paulo Baltarejo Sousa, Konstantinos Bletsas, Eduardo Tovar, Pedro Souto, and Benny Akesson. Unified overhead-aware schedulability analysis for slot-based task-splitting. Technical report CISTER-TR-130201, CISTER, Polytechnic Institute of Porto (ISEP-IPP), 2013. (Under submission to Journal of Real-time Systems).

C.2 Materials

In the course of this work, we developed the ReTAS (which stands for Real-time TAsk-Splitting scheduling algorithms) framework [ReT12]. The ReTAS framework implements, in a unified way, slot-based task-splitting scheduling algorithms in the Linux kernel. Further, it also implements one reserve-based scheduling algorithm, called Carousel-EDF. All Linux kernel patches and the related tools, which implement the schedulability analyses for slot-based task-splitting and reserve-based scheduling algorithms, can be freely downloaded from http://webpages.cister.isep.ipp.pt/~pbsousa/retas/.