EffiSha: A Software Framework for Enabling Efficient Preemptive Scheduling of GPU


EffiSha: A Software Framework for Enabling Efficient Preemptive Scheduling of GPU

Guoyang Chen, Yue Zhao, Xipeng Shen
Computer Science, North Carolina State University
{gchen11, yzhao30, [email protected]

Huiyang Zhou
Electrical and Computer Engineering, North Carolina State University
[email protected]

PPoPP '17, February 04-08, 2017, Austin, TX, USA.
DOI: http://dx.doi.org/10.1145/3018743.3018748

Abstract

Modern GPUs are broadly adopted in many multitasking environments, including data centers and smartphones. However, the current support for scheduling multiple GPU kernels (from different applications) is limited, forming a major barrier for GPUs to meet many practical needs. This work demonstrates, for the first time, that efficient preemptive scheduling of GPU kernels is possible on existing GPUs even without special hardware support. Specifically, it presents EffiSha, a pure software framework that enables preemptive scheduling of GPU kernels with very low overhead. The enabled preemptive scheduler offers flexible support for kernels of different priorities, and demonstrates significant potential for reducing the average turnaround time and improving the overall system throughput of programs that time-share a modern GPU.

1. Introduction

As massively parallel architectures, GPUs have attained broad adoption in modern computing systems. Most of these systems are multitasking, with more than one application running and requesting use of the GPU simultaneously. In data centers, for instance, many customers may concurrently submit requests, multiple of which often need to be serviced simultaneously by a single node. How GPU usage is managed in such environments is important for the responsiveness of the applications, the utilization of the GPU, and the quality of service of the computing system.

The default management of the GPU is through the undisclosed GPU drivers and follows a first-come-first-serve policy. Under this policy, the GPU, a system-level shared resource, may get used unfairly: consider two requests from applications A and B; even though A may have already used the GPU for many of its requests recently and B has only issued its first request, the default GPU management, which gives no consideration to applications' usage history, may still assign the GPU to A and keep B waiting if A's request arrives just slightly earlier than B's. Moreover, the default scheduling is oblivious to the priorities of kernels. Numerous studies [1-4] have shown that this problematic way of managing the GPU causes serious unfairness, response delays, and low GPU utilization.

The latest GPUs (e.g., Pascal from NVIDIA) are equipped with the capability to evict a GPU kernel at an arbitrary instruction. However, such kernel eviction incurs substantial overhead in state saving and restoring. Some recent work [2, 4, 5] proposes hardware extensions to help alleviate the issue, but they add hardware complexity.

Software solutions may benefit existing systems immediately. Prior efforts toward software solutions fall into two classes. The first is about trackability: they propose APIs and OS interception techniques [1, 3] that allow the OS or hypervisors to track each application's GPU usage. The improved trackability may help select GPU kernels to launch based on their past usage and priorities. The second class of work is about granularity: they use kernel slicing [6, 7] to break one GPU kernel into many smaller ones (a sketch of the idea follows this paragraph). The reduced granularity increases the flexibility in kernel scheduling, and may help shorten the time that a kernel has to wait before it can be launched.
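To make the granularity discussion concrete, below is a minimal CUDA sketch of kernel slicing; it only illustrates the general idea, not the exact scheme of [6, 7], and the names (vecAddSlice, blocksPerSlice, blockOffset) are made up for this example. The host launches the kernel repeatedly, each launch covering only a slice of the logical thread blocks, so a software scheduler gets a chance to intervene between launches.

    #include <algorithm>
    #include <cuda_runtime.h>

    // Each launch processes only a slice of the logical thread blocks,
    // identified by an explicit block offset passed from the host.
    __global__ void vecAddSlice(const float *A, const float *B, float *C,
                                int n, int blockOffset) {
        int logicalBlock = blockIdx.x + blockOffset;  // logical block ID
        int idx = threadIdx.x + logicalBlock * blockDim.x;
        if (idx < n) C[idx] = A[idx] + B[idx];
    }

    void launchSliced(const float *A, const float *B, float *C, int n,
                      int threadsPerBlock, int blocksPerSlice) {
        int totalBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        for (int off = 0; off < totalBlocks; off += blocksPerSlice) {
            int blocks = std::min(blocksPerSlice, totalBlocks - off);
            // Between two slice launches, a software scheduler may run a
            // kernel from a different (e.g., higher-priority) application.
            vecAddSlice<<<blocks, threadsPerBlock>>>(A, B, C, n, off);
        }
        cudaDeviceSynchronize();
    }

The dilemma discussed next falls out of this structure: a smaller blocksPerSlice shortens the wait for the GPU but multiplies the launch overhead and shrinks the parallelism available to each launch.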
Although these software solutions may enhance GPU management, they are all subject to one important shortcoming: none of them allows a running GPU kernel to be evicted before it finishes; that is, none of them enables preemptive GPU scheduling. A request for the GPU from an application, regardless of its priority, cannot be served before the finish of the GPU kernel that another application has already launched, and the length of the delay depends on the length of the running kernel. To reduce the delay, some prior proposals (e.g., kernel slicing [6, 7]) attempt to split a kernel into many smaller kernels so that the wait for a kernel to finish gets shortened. They however face a dilemma: the resulting increased number of kernel launches and the reduced parallelism in the smaller kernels often cause substantial performance loss (as much as 58% in our experiments in Section 8).

In this work, we propose a simple yet effective way to solve the dilemma. The key is a novel software approach that, for the first time, enables efficient preemption of kernels on existing GPUs. Kernels need not be sliced anymore; they voluntarily suspend and exit when it is time to switch kernels on the GPU.

Enabling efficient kernel preemption is challenging because of the large overhead of saving and restoring context for the massive number of concurrent GPU threads. Before this work, all previously proposed solutions relied on special hardware extensions [2, 4].

Our solution is purely software-based, consisting of novel program transformations and runtime machinery. We call our compiler-runtime synergistic framework EffiSha (for efficient sharing). As a pure software framework, it is immediately deployable in today's systems. Instead of slicing a kernel into many smaller kernels, EffiSha transforms the kernel into a form that is amenable to efficient voluntary preemption (see the sketch following the contribution list below). Little if any data need to be saved or restored upon an eviction. Compared to prior kernel-slicing-based solutions, EffiSha reduces the runtime eviction overhead from 58% to 4% on average, and removes the need to select an appropriate kernel size for slicing.

EffiSha opens opportunities for preemptive kernel scheduling on GPUs. We implement two priority-based preemptive schedulers upon EffiSha. They offer flexible support for kernels of different priorities, and demonstrate significant potential for reducing the average turnaround time (by 18-65% on average) and improving the overall system throughput (by 1.35X-1.8X on average) of program executions that time-share a GPU.

This work makes the following major contributions:

1) It presents EffiSha, a compiler-runtime framework that, for the first time, makes beneficial preemptive GPU scheduling possible without hardware extensions.

2) It proposes the first software approach to enabling efficient preemption of GPU kernels. The approach consists of multi-fold innovations, including the preemption-enabling program transformation, the creation of GPU proxies, and a three-way synergy among applications, a CPU daemon, and GPU proxies.

3) It demonstrates the potential of EffiSha-based schedulers for supporting kernel priorities and improving kernel responsiveness and system throughput.
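To convey the flavor of the preemption-enabling transformation, below is a minimal CUDA sketch written for this presentation rather than taken from EffiSha's actual output; the names (evictFlag, nextTask, blockTask, vecAddPreemptible) are made up, and many details (the GPU proxies, the daemon protocol) are omitted. Each persistent thread block repeatedly claims a block-task and, between block-tasks, polls an eviction flag that a CPU daemon can set; because the kernel yields only at block-task boundaries, essentially no register or shared-memory state needs to be saved.

    #include <cuda_runtime.h>

    __device__ int nextTask;            // next unclaimed block-task ID;
                                        // zeroed by the host before the first launch
    __device__ volatile int evictFlag;  // set by the CPU daemon to request eviction

    // The original kernel body, refactored as one block-task.
    __device__ void blockTask(int task, const float *A, const float *B,
                              float *C, int n) {
        int idx = threadIdx.x + task * blockDim.x;
        if (idx < n) C[idx] = A[idx] + B[idx];
    }

    __global__ void vecAddPreemptible(const float *A, const float *B,
                                      float *C, int n, int numTasks) {
        __shared__ int task;
        while (true) {
            if (threadIdx.x == 0) {
                // Exit voluntarily if eviction was requested; otherwise
                // claim the next block-task.
                task = evictFlag ? -1 : atomicAdd(&nextTask, 1);
            }
            __syncthreads();
            if (task < 0 || task >= numTasks) return;  // evicted or done
            blockTask(task, A, B, C, n);
            __syncthreads();  // all threads finish before claiming the next task
        }
    }

In this sketch, a host-side daemon would set evictFlag (e.g., via cudaMemcpyToSymbol or mapped memory) when another kernel should run, wait for the current kernel to drain, and relaunch it later; since nextTask persists across launches in the same context, the relaunched kernel resumes from the first unclaimed block-task.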
2. Terminology and Granularity

A GPU kernel launch usually creates a large number of GPU threads. They all run the same kernel program, which usually includes references to thread IDs to differentiate the behaviors of the threads and the data they work on. These threads are organized into groups called thread blocks. Figure 1 shows the kernel for matrix addition.

    % threadIdx.x: the index of a thread in its block
    % blockIdx.x: the global index of the thread block
    % blockDim.x: the number of threads per thread block
    idx = threadIdx.x + blockIdx.x * blockDim.x;
    C[idx] = A[idx] + B[idx];

    Figure 1. A kernel for matrix addition.

Note that in a typical GPU program¹, no synchronization across different thread blocks is supported. Different thread blocks can communicate by operating on the same data locations in global memory (e.g., reduction through atomic operations), but the communication must not cause dependence hazards (i.e., execution-order constraints) among the thread blocks. Otherwise, the kernel could suffer deadlocks due to the hardware-based thread scheduling. (A small example of the legal form of such communication appears at the end of this section.)

We call the set of work done by a thread block a block-task. Because of the aforementioned GPU property, the block-tasks of a kernel can run in an arbitrary order (even if they operate on some common locations in memory). Different block-tasks do not communicate through registers or shared memory, and the execution of each block-task must first set up the state of its registers and shared memory. Typically, the ID of a thread block is taken as the ID of its block-task.

Existing GPUs do not allow the eviction of running computing kernels. A newly arrived request for the GPU by a different application must wait for the currently running kernel to finish before it can use the GPU².

Scheduling Granularity. Scheduling granularity determines the time when a kernel switch can happen on the GPU. Figure 2 lists four possible levels of granularity; this section explains the choice we make.

    1. kernel (traditional GPU)
    2. predefined # of block-tasks (kernel slicing based work [5])
    3. single block-task (EffiSha)
    4. arbitrary segment of a block-task (impractical)

    Figure 2. Four levels of GPU scheduling granularity.

¹ Exceptions are GPU kernels written with persistent threads [8], which are discussed later in this paper.
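As a concrete instance of the legal, order-insensitive communication across thread blocks mentioned above, below is a small CUDA sketch of a reduction through atomic operations; the kernel name blockSum and its fixed block size are assumptions of this example. Each block-task contributes its partial sum whenever the hardware happens to schedule it, so no execution-order constraint, and hence no deadlock risk, arises.

    #include <cuda_runtime.h>

    // Cross-block communication that is safe on GPUs: block-tasks update a
    // shared location with an atomic, so the result is correct regardless
    // of the order in which the blocks run. Assumes blockDim.x == 256 and
    // that *out is zero-initialized by the host.
    __global__ void blockSum(const float *in, int n, float *out) {
        __shared__ float partial[256];
        int idx = threadIdx.x + blockIdx.x * blockDim.x;
        partial[threadIdx.x] = (idx < n) ? in[idx] : 0.0f;
        __syncthreads();
        // Tree reduction within the block; registers and shared memory
        // stay private to this block-task.
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (threadIdx.x < s)
                partial[threadIdx.x] += partial[threadIdx.x + s];
            __syncthreads();
        }
        if (threadIdx.x == 0) atomicAdd(out, partial[0]);
    }

By contrast, having one block spin-wait for a value that another block is expected to produce imposes an execution-order constraint; since the hardware scheduler may never run the producer block while the waiting block occupies the GPU, such a kernel can deadlock.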