CARNEGIE MELLON UNIVERSITY - 15418: PARALLEL COMPUTER ARCHITECTURE AND PROGRAMMING

sdgOS: Rotating Between Algorithms

Valerie Choung and Samuel Damashek

Abstract—

We extended the SouperDamGoodOS (sdgOS) from 15-410 to support symmetric multiprocessing (SMP). Additionally, we created a mechanism for a process to select which scheduling mode it would prefer to run under. The two scheduling modes currently supported are a normal round-robin scheduling algorithm with thread-level granularity and an approximation of gang scheduling. In this paper, we discuss the implementation details of our operating system and analyze the performance of some sample programs under both scheduling algorithms.

Keywords—Scheduling, Gang scheduling, OS, Autogroup.

1 Introduction

A very simple scheduling algorithm (which we call the "normal scheduling mode", or NSM) may treat all threads equally and rotate through threads in a round-robin fashion. However, this introduces process/task-level fairness issues: if one process creates many threads, threads in that process would take up more processor time slices in a given time frame when compared against other, less thread-intensive processes.

Gang scheduling is one way to ensure that a process experiences a reasonable number of clock ticks before the process is switched away from, to be run at a later time. A pure gang scheduler will keep track of a currently active process to be run, and at any given moment, all processors will be executing threads that belong to the currently active process (or they will be idle, if no additional threads are available). This also means that a thread in any process can assume that other processors will be working exclusively on related threads.

Also related is coscheduling. Coscheduling is very similar to gang scheduling, except that if a processor cannot find more work from the currently active process, it will steal work from another process's work queue.

Our project builds on the SouperDamGoodOS (sdgOS), which runs on an x86 machine supporting a PS/2 keyboard. This environment can be simulated on Simics. Our study primarily has three parts to it:

1) Implement symmetric multiprocessing (SMP).
2) Support scheduling algorithm rotations.
3) Implement a syscall to switch a process from one scheduling mode to another (process scheduling mode switching).

Currently, our project supports process scheduling mode switching from normal scheduling mode (NSM) to gang scheduling mode (GSM). The other direction (GSM to NSM) is unimplemented due to time constraints, but the logic is almost symmetrical.

From this project, we observe that the performance of programs under GSM is greatly influenced by the length of the timeslices between scheduler mode rotations. Additionally, each rotation incurs a non-negligible time cost due to inter-processor barriers.
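To make the two policies concrete, here is a minimal sketch of each selection rule. It is purely illustrative: the types and function names are hypothetical inventions for this sketch, not the actual sdgOS interfaces.

    #include <stddef.h>

    /* Hypothetical types for illustration; not the actual sdgOS structures. */
    typedef struct thread {
        struct thread *next;
        struct process *proc;       /* owning process */
    } thread_t;

    typedef struct process {
        thread_t *thread_runq;      /* this process's runnable threads */
    } process_t;

    /* NSM: one global round-robin queue in which all threads are equal,
     * so a process that spawns many threads receives more time slices. */
    thread_t *nsm_pick_next(thread_t **global_runq) {
        thread_t *t = *global_runq;
        if (t != NULL)
            *global_runq = t->next; /* the outgoing thread is re-enqueued
                                       at the tail when it is preempted */
        return t;
    }

    /* Pure gang scheduling: a processor may only pick a thread belonging
     * to the currently active process; otherwise it idles. (If it stole
     * work from another process instead, that would be coscheduling.) */
    thread_t *gsm_pick_next(process_t *active, thread_t *idle_thread) {
        thread_t *t = active->thread_runq;
        if (t == NULL)
            return idle_thread;
        active->thread_runq = t->next;
        return t;
    }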

1.1 Secrecy

The 15-410 course staff is notoriously secretive about the nature of many design decisions students commonly encounter during the kernel project. This is driven by a desire to cause students to find and come up with solutions to these decisions on their own. To preserve this secrecy, we occasionally censor design decisions that we made in our original kernel and in our SMP extension in our online report; censored passages appear as "[...]". However, nothing is censored in the final version that we uploaded to Gradescope.¹

¹ This section is paraphrased from Ben Blum's PhD dissertation [2].

2 Design and Implementation

2.1 Mutexes

Our original, single-processor kernel mutexes were [...]. [...] This makes sense on a single-processor system, since [...].² Thus, we adapted our kernel mutexes to SMP by [...].

² This assumption holds because we use mutexes only for short, bounded-length critical sections.

2.1.1 GSM adaptation

We designed a further extension of our kernel mutexes which considered gang scheduling, but did not end up implementing it for reasons explained below. Most notably, we realized it may go against the philosophy of gang scheduling if we were running under GSM and were to [...]. Unfortunately, solving this problem would have required a significant redesign of our concurrency primitives. It doesn't suffice to simply [...]; we believe a robust solution would be to [...]. We decided that this level of code redesign is not worth it for the scale of our project, though it would be necessary in a production system.
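Because the mutex design itself is censored above, the following is only a generic baseline of the kind of primitive an SMP port needs: a spin mutex built on x86's atomic XCHG. Spinning is defensible here because, per the footnote above, our mutexes guard only short, bounded-length critical sections. None of this code is taken from sdgOS.

    /* 0 = free, 1 = held. */
    typedef struct {
        volatile int locked;
    } spin_mutex_t;

    /* Atomically swap val with *addr and return the old value. x86's
     * XCHG with a memory operand is implicitly LOCKed. */
    static inline int xchg(volatile int *addr, int val) {
        asm volatile("xchgl %0, %1"
                     : "+r"(val), "+m"(*addr)
                     :
                     : "memory");   /* also a compiler barrier */
        return val;
    }

    void spin_lock(spin_mutex_t *m) {
        while (xchg(&m->locked, 1) != 0)
            ;                       /* spin: critical sections are short */
    }

    void spin_unlock(spin_mutex_t *m) {
        xchg(&m->locked, 0);        /* release with a full fence */
    }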

2.2 Scheduler lock and Context Switching

[...]

2.2.1 Abstraction vs. Implementation

As a programmer who is not implementing scheduler locks, one would only need to know that after a call to sched_lock, there needs to be a corresponding sched_unlock. This is a little deceiving, though. Consider a single processor that uses the scheduler lock: [...] This is actually fine, since after completing a context switch, the destination thread will call sched_unlock. sched_unlock will [...] on behalf of the source thread, [...]. In summary, after a context switch, the call to sched_unlock actually corresponds to the sched_lock performed by the thread that previously ran on the same processor.

2.2.2 Partial locks

Since the scheduler lock isn't initially held by a newly-created thread, a [...] to a new thread could be a problem. This is because [...]. This motivates partial_lock and partial_unlock, which will [...]. (It also turns out that partial locks can be used in conjunction with [...] barriers, which is nice.)

2.3 Context Switching to New Thread

[...] So, after we context switch to a new thread, the code that sets up the new thread for entering user space will perform a partial_unlock. To avoid the race condition where a timer interrupt could trigger a context switch before the partial unlock is performed, we flag a thread if it is new. If the flag is set, then the context switcher itself will force interrupts to be disabled. Later, the code that sets up the new thread for entering user space will resolve the interrupt flag (through iret).
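A sketch of that hand-off follows. Only partial_unlock is named in the text; the other identifiers (new_thread_setup, enter_user_space, the global sched_lock object) are hypothetical stand-ins.

    /* Hypothetical declarations for illustration. */
    struct sched_lock;
    extern struct sched_lock sched_lock;
    void partial_unlock(struct sched_lock *);        /* see Section 2.2.2 */
    void enter_user_space(void) __attribute__((noreturn));

    /* First kernel code run by a newly created thread. The context
     * switcher saw this thread's "new" flag and left interrupts disabled,
     * so no timer interrupt can trigger a context switch before the
     * partial unlock below is performed. */
    void new_thread_setup(void) {
        partial_unlock(&sched_lock);  /* release our share of the scheduler lock */
        enter_user_space();           /* the final iret restores EFLAGS.IF,
                                         re-enabling interrupts atomically
                                         with the drop to user mode */
    }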
2.3.1 A Failed Idea

Ultimately, we decided to keep our general context-switching logic roughly the same as in the original NSM-only sdgOS. At one point, we tried a form of "optimistic" context switching: essentially, a thread could be placed in the work queue while simultaneously running. The context switcher would then check whether the target thread was currently running on a different processor; if so, it would roll back and find a different thread to run. This mechanism did work, but we decided that this form of context switching was more detrimental to the development process than helpful: since threads could be both running and runnable, debugging became much harder, and the context switch now contained a potentially O(n) operation. While this is not dissimilar to spinning in an idle thread, it is objectively more wasteful. Furthermore, that O(n) operation can become very expensive when there are many runnable or running threads. This does not scale, so we eventually discarded the idea.

2.3.2 Timer

Initially, the bootstrapping processor (BSP, aka cpu0) would propagate timer interrupts to all the application processors (APs). This caused all the processors to attempt to context switch at the same time, increasing contention for the scheduler lock. Additionally, when running many short threads (something like slaughter print_basic 5 5 0)³, there would be a lot of contention for the mutexes used to kill threads. We reduce contention by offsetting the APs' timers. Contention for locks is still a bottleneck in some applications, but we did not explore optimizations for this.

³ slaughter print_basic 5 5 0 basically forkbombs with threads that print "Hello World".

2.4 Work Queues

Most multi-processor systems appear to use work-stealing schedulers, where each processor has its own work queue. While this would be possible to incorporate into our design, we leave it as future work.

Our design consists of two queues: one for NSM and one for GSM. The NSM queue is a basic round-robin queue. The GSM queue is a multi-tiered queue, where the high tier corresponds to processes; each process then contains a queue of its own threads. Figure 1 in the Appendix illustrates how the runlists work together.

The size of the timeslices between NSM and GSM can be configured easily. Within each scheduling mode, every timer tick triggers a context switch to a thread within the same scheduling mode. For GSM, the currently active process is rotated once each time we switch away from GSM to NSM. Rotating the currently active process when switching from NSM to GSM would work just as well, and it makes no practical difference, as long as the choice is consistent.
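The following sketch shows one plausible shape for these two runlists; the type and field names are illustrative inventions, not the actual sdgOS definitions.

    /* Illustrative shapes for the two runlists. */
    typedef struct thread_node {
        struct thread_node *next;
    } thread_node_t;

    typedef struct gsm_process {
        struct gsm_process *next;   /* high tier: ring of gang processes */
        thread_node_t *threads;     /* low tier: this process's thread queue */
    } gsm_process_t;

    typedef struct {
        thread_node_t *nsm_runq;    /* flat round-robin queue (NSM) */
        gsm_process_t *gsm_active;  /* current head of the GSM high tier */
    } runlists_t;

    /* A timer tick within GSM rotates inside gsm_active->threads only;
     * advancing the high tier (gsm_active = gsm_active->next) happens
     * once per switch away from GSM to NSM, as described above. */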

2.5 set_sched_mode Syscall

The set_sched_mode syscall suggests to the scheduler what scheduling mode the calling process would like to run under. Currently, we allow processes under the normal scheduling mode to switch themselves over to the gang scheduling mode. The other direction (gang to normal mode) is unimplemented, but the concept is symmetric.

Converting a process from one scheduling mode to another is surprisingly nuanced. Below we list some considerations:

1) Should we guarantee that all threads in the process are running under the new scheduling mode immediately after the syscall exits?
2) Should gang processes be allowed to fork?
3) When should the next barrier-and-context-switch be?

After answering these questions, we decided that GSM would probably be simpler to implement as an approximation of gang scheduling, rather than pure gang scheduling.

The syscall itself works roughly as follows, as sketched after this section: We check if the requested scheduling mode is the same as the current process's scheduling mode. If so, we do nothing. If the process's scheduling mode is NSM and it would like to run under GSM, then the syscall scours the NSM runlist to find all runnable threads corresponding to the process. It then puts those threads into the process's runlist, then puts the process into the GSM runlist.

While unimplemented, the reverse direction would be very similar: it would move all threads in the process's runlist into the NSM runlist, and remove the process from the GSM runlist.

With this implementation, we do not guarantee that all threads in the process are running under the new scheduling mode immediately after the syscall exits. Processes running under GSM are not allowed to fork; doing so will raise an error. Likewise, threads running under GSM are not allowed to create new threads. Practically speaking, this means that programs must know how many threads they will need before switching from NSM to GSM.

The syscall does not delay or trigger the next context switch (aside from missed timer interrupts due to the locked scheduler lock).
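Here is a sketch of the NSM-to-GSM path just described. The runlist helpers and the error convention are hypothetical; only the overall sequence comes from the text.

    #include <stddef.h>

    enum sched_mode { SCHED_NSM, SCHED_GSM };

    /* Hypothetical process type and runlist helpers (cf. Section 2.4). */
    typedef struct process process_t;
    struct process { enum sched_mode sched_mode; };
    void sched_lock(void);
    void sched_unlock(void);
    struct thread *nsm_runlist_remove_thread_of(process_t *p);
    void process_runlist_enqueue(process_t *p, struct thread *t);
    void gsm_runlist_enqueue(process_t *p);

    int set_sched_mode(process_t *p, enum sched_mode mode) {
        if (mode == p->sched_mode)
            return 0;                       /* same mode: do nothing */
        if (mode != SCHED_GSM)
            return -1;                      /* GSM -> NSM is unimplemented */

        sched_lock();
        /* Scour the NSM runlist for this process's runnable threads and
           move them onto the process's own runlist. */
        struct thread *t;
        while ((t = nsm_runlist_remove_thread_of(p)) != NULL)
            process_runlist_enqueue(p, t);
        gsm_runlist_enqueue(p);             /* join the GSM high tier */
        p->sched_mode = SCHED_GSM;
        sched_unlock();
        return 0;
    }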

2.6 Barriers

In our implementation, inter-processor barriers are used to rotate scheduling modes.

2.6.1 Failed Attempts

Initially, we used one barrier to rotate scheduling modes, and another barrier for switching a process's scheduling mode. However, we quickly ran into barrier-barrier deadlock, and so we changed our barrier into a sense-reversal barrier. It does not seem necessary, though, to actually use a barrier when switching a process's scheduling mode; acquiring the scheduler lock is sufficient.

We also attempted to use the MFENCE instruction in our barrier, but when testing on real hardware with a Pentium II processor, we realized that MFENCE was not a valid instruction. We address this problem in Section 4.1.1.

2.6.2 Final barrier design

For our kernel barriers, we ended up using the sense-reversal barrier for switching between scheduler modes. There is no particular reason for our choice of barrier, other than to make additional development on the kernel easier if we need to use a barrier for other purposes in the future.

One downside is that this barrier requires more space than a standard barrier. Additionally, our sense-reversal barrier does not space out the memory used for processor-local senses. This means that if run on a real system with many cores, the barrier would cause several cache misses due to false sharing. This could be fixed if we were to find out the size of an L1 cache line on our machine, at the cost of even more memory usage; we did not do so because of time constraints.
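For reference, a textbook sense-reversal barrier looks roughly like the following; this is the standard construction, not our exact kernel code. Note that the per-processor senses sit in one unpadded array, which is exactly the false-sharing concern mentioned above.

    #define MAX_CPUS 8

    typedef struct {
        volatile int count;                 /* processors arrived so far */
        int total;                          /* processors participating */
        volatile int global_sense;
        volatile int local_sense[MAX_CPUS]; /* unpadded: false-sharing risk */
    } sr_barrier_t;

    void sr_barrier_wait(sr_barrier_t *b, int cpu) {
        int sense = !b->local_sense[cpu];   /* flip this cpu's sense */
        b->local_sense[cpu] = sense;
        if (__sync_add_and_fetch(&b->count, 1) == b->total) {
            b->count = 0;                   /* last arriver resets the count */
            b->global_sense = sense;        /* ...and releases everyone */
        } else {
            while (b->global_sense != sense)
                ;                           /* spin until released */
        }
    }

Because each episode flips the sense, the barrier can be reused immediately, which is what avoids the barrier-barrier deadlock of the naive counter-reset design.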

2.7 Gang Scheduling Approximation

Our GSM mode is only an approximation of gang scheduling. We do not immediately trigger context switches after a call to set_sched_mode, nor do we immediately rotate the currently running scheduling algorithm. Instead, we allow the threads to continue running until the next timer tick. What this means is that right after a scheduler mode rotation, there will be a brief period (around 2 timer ticks) where some threads will still be running in the other scheduling mode. This just makes our GSM non-pure gang scheduling.

2.7.1 The Yield

One complication is that after a call to sched_yield (possibly through kernel scheduler logic or through the yield syscall), [...], though it is an imperfect solution. [...] It would be difficult to force threads to switch to a thread that is running in a different mode. Allowing an impure form of gang scheduling lets us ignore this complication.

2.8 Comparison With Linux Autogroup

Linux Autogroup is a semi-experimental scheduling feature that works with the Completely Fair Scheduler (CFS) and allows targeted processes to be approximately gang-scheduled as well (though it does not call this gang scheduling). It depends on a task's "niceness" factor to determine how to prioritize tasks within the gang scheduler. Our implementation of GSM is different because it is agnostic of the actual task being performed.

3 Other Considerations and Misc

We got a large amount of work done, considering the hurdles we ran into (see Section 4.2, Debugging the Kernel). However, due to time constraints, we skimped on a few aspects of the operating system that we considered nice-to-haves. In particular, we do not implement TLB shootdowns, because they are orthogonal to the question of scheduling. This primarily affects remove_pages correctness, but that is mostly irrelevant to our project.

3.0.1 Dedicated Thread

[...]

4 Testing

4.1 Testing Infrastructure

For development, we used Wind River Simics to simulate an x86 machine with 2 to 4 processors.

One problem we ran into was that Simics does not accurately generate MP tables: it underestimates the number of processors and maps them incorrectly into the table. Although Simics claims to support up to 16 processors, we found that it only supports 1, 4, and 8 processors. Moreover, when 4 processors are simulated, only processors 0 and 2 are mapped into the MP table (as cpu numbers 0 and 1), while processors 1 and 3 are completely lost. Likewise, when 8 processors are specified, processors 1, 3, 5, and 7 are lost.

Additionally, Simics in its default configuration does not simulate a cache (instead, any memory access occurs as fast as possible), so latency due to cache misses is not considered when testing on Simics. We did not optimize for this when running on the crash machine, however.

4.1.1 Running on the Crash Machine

Because many performance characteristics are unrealistic in Simics, we ran our final benchmarks on a real multi-processor system, the 15-410 "crash machine". This machine contains two 400 MHz Pentium II Xeon processors, 512 megabytes of RAM, a floppy drive, and a CD-ROM. In order to get our kernel to run properly on this machine, we had to make a couple of modifications, which required significant debugging (see Section 4.2.3).

We realized that we had been using the MFENCE instruction to ensure instruction ordering in our locking code, but this instruction was only introduced in Intel's Pentium 4 family. The code worked in Simics, but on the crash machine it likely resulted in an Invalid Opcode fault. Because of this, we changed our code to use the CPUID instruction as a memory fence. CPUID is both supported on Pentium II processors and guaranteed to be a serializing instruction by the Intel processor specification.
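The replacement fence is short to express; the following is a sketch of the usual construction rather than our exact kernel code. CPUID clobbers EAX, EBX, ECX, and EDX, and the "memory" clobber also stops the compiler from reordering memory accesses across the fence.

    /* CPUID-based memory fence for pre-Pentium 4 x86 (sketch). */
    static inline void cpuid_fence(void) {
        unsigned int eax = 0, ebx, ecx, edx;
        asm volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                     :
                     : "memory");   /* compiler-level barrier as well */
    }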
It's also relevant that, by default, instruction re-ordering does not occur in Simics for technical reasons (execution is sequentially consistent). However, instructions may be re-ordered on the crash box, which further complicates testing and requires debugging tricky race conditions on the crash machine instead of in Simics (where they may not occur).

Indeed, we ran into what we believe to be a tricky race condition on the crash machine which we did not have time to diagnose. While running our test programs, the programs would sometimes deadlock. The programs would not crash, and the kernel would continue to accept keyboard and timer interrupts and respond appropriately. We believe this to be an issue with our user-space concurrency primitives that, again, we did not have time to address. When this happened, we would discard that result, restart the machine, and restart the test.

4.2 Debugging the Kernel

4.2.1 Simics Fun

Debugging the kernel was quite an adventure. Simics documentation shows that backtrace is supported on multi-processor simulations; however, we found that it is only ever able to identify the latest stack frame, so backtrace simply does not work. Some of our tests run for upwards of 30 minutes, and adding debug statements makes this even slower. As such, printf debugging was not useful for any of the longer traces. We debugged the kernel primarily by stepping through assembly lines and manually inspecting the code. This is certainly one of the main reasons we did not progress as far in our project as we had hoped.

To run our kernel in Simics:

1) Navigate to the root folder of our project.
2) Run make.
3) Set SIMICS_CPU_COUNT to 1 if you want to run on a uniprocessor, 4 if you want to run with 2 processors, or 8 if you want to run with 4 processors (this odd mapping follows from the MP table behavior described in Section 4.1.1).
4) Run simics64.
5) Once Simics is ready, enter r to boot the kernel.

4.2.2 Simics Debugging Extension

We had a Simics debugging extension (damgood_tcbs) that allowed us to quickly figure out what threads were running on which processor, as well as the states they were in. This was useful for identifying whether or not our kernel was in an error state.

4.2.3 Debugging on a Real System

While the Simics debugging extension and the use of Simics debug prints were nice for development, we had to add some other debug utilities to our kernel so we could get our kernel working on a real x86 machine (see Section 4.1.1). Initially, we used "POST codes", which are displayed on a small display on the front of the crash machine. We inserted unique codes at important points in our kernel so we could tell which threads were being context-switched to, and when different parts of the kernel were initialized.

This was not scalable to more detailed debugging information, so we added a function in our console library which could print debugging information to the display without using any locks. We suspected there might be an issue in our locking code during debugging, so if our debug prints also required locks, then the debug prints might have been broken as well. Additionally, we called our debug print code in interrupt code, and acquiring mutexes in that code could have resulted in deadlock. For debugging, we printed the number of times any [...] occurred, the number of times any mutex lock occurred, and the number of times any context switch occurred.

4.3 How We Tested User Programs

We tested user programs by inserting calls to get_ticks() to retrieve the number of CPU ticks elapsed since boot. This provided us with rough benchmarks of the programs. One of the basic correctness tests we ran is agility_drill_gang. This test is adapted from the 15-410 test agility_drill, which is intended to stress-test user-mode mutexes after spawning several threads. We modified agility_drill to enter GSM after spawning its threads.
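A minimal sketch of this measurement pattern follows, assuming get_ticks() is exposed to user programs as described; the harness around it is hypothetical.

    #include <stdio.h>

    extern unsigned int get_ticks(void);  /* CPU ticks since boot (syscall) */

    /* Time a workload in ticks by bracketing it with get_ticks() calls. */
    static unsigned int time_workload(void (*workload)(void)) {
        unsigned int start = get_ticks();
        workload();
        return get_ticks() - start;
    }

    void report(const char *name, void (*workload)(void)) {
        printf("%s: %u ticks\n", name, time_workload(workload));
    }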
5 Analysis

5.1 Fine-grained Locking

We tested the performance of user programs with fine-grained locking under GSM and NSM. Our test program was mandelbrot, which was provided to us by 15-410. We based our GSM test on mandelbrot, modifying the original test by inserting a call to set_sched_mode after all the threads were spawned if gang is passed as an argument to the test.

On our physical test machine (see Section 4.1.1), we tested the performance of this mandelbrot fractal generator by first launching the largetest program, and then launching the mandelbrot program. We eventually encountered what seemed to be some boot failures, meaning that we did not acquire as many data points as we would have liked (see Section 4.2.3). This makes it difficult to draw conclusions about what our data suggests. For what it's worth, the average number of console writes performed by mandelbrot in 10,000 cycles was 1055.5 when running under GSM and 988 when running under NSM.

Using an unpaired t-test, we determined that the difference in the number of console writes between GSM and NSM is not statistically significant. Though we ran into the issue of insufficient data points, there are a few things we could have done to improve our testing. One idea is to run another process in GSM mode while mandelbrot is already running in GSM. This would have provided a fairer comparison against the case where mandelbrot was running under NSM alongside largetest. With the way our testing was actually performed, we also ended up indirectly measuring speedup due to priority scheduling (see Section 5.1.1), since the speedup due to GSM was not isolated.

5.1.1 Speedup due to Priority Scheduling

Since we realized that testing priority scheduling was fairly simple given our existing scheduling infrastructure, we decided to do so using the Simics platform. Again, we ran tests on mandelbrot under GSM and NSM, but without loading largetest beforehand. Without other processes running concurrently, GSM and NSM should theoretically run in exactly the same way. However, we could configure GSM to run for longer amounts of time between scheduler mode rotations. This configurable length of time is functionally equivalent to a priority in priority scheduling.

While we do not have concrete numbers for this test, we observed that a complete render of mandelbrot under high priority took significantly less time than a render of mandelbrot under low priority. Increasing the GSM time interval by 10× gave us roughly 25× speedup, and the speedup with respect to GSM time-interval increases grew somewhat linearly (by semi-qualitative examination). This suggests that a lot of the time costs are incurred during scheduling mode rotations.
5.2 Overhead Costs

Some timer interrupts become very expensive, since whenever a scheduling mode rotation is necessary, all processors must enter a barrier. We tried to mitigate this issue by only performing a scheduling mode rotation every few timer ticks (a configurable constant), so that not every single timer tick triggers a rotation.

6 Future Work

There are several things we would have liked to do, but did not have the time for. We list them below:

1) A dynamic (multi-tiered) work queue with work stealing. Each processor would have its own associated queue, to avoid contention for the scheduler lock.
2) Support for process scheduling mode switching from GSM to NSM, not only NSM to GSM.
3) Support for more scheduling algorithms.
4) Mutex support for gang scheduling.
5) Exploring the use of mutexes instead of spinning in our barrier implementation.
6) TLB shootdowns. This is less relevant to our analysis of scheduling algorithms, but would be nice to have as a correctness guarantee.

7 Conclusion

We successfully added support for symmetric multiprocessing in our 15-410 kernel. This support was confirmed through testing both in the Simics emulator and on real x86 hardware. As an extension of this work, we expanded our scheduler to add support for "gang execution" of processes, where a process has complete ownership of the processor for a certain time interval.

This also added a de facto priority system, where processes executing in GSM are not significantly impacted by a large number of threads running in NSM. We confirmed this empirically in Simics, noting significant speedup when a test was run in GSM versus NSM while another large workload ran in NSM. We did not reach an empirical conclusion on the impact of gang scheduling itself on program speed, but we have some ideas for future work which could reach such a conclusion.

8 List of Student Tasks

8.1 Valerie Choung

Much of GSM, including:

1) Context switching logic
2) Scheduler logic
3) Writing correctness tests
4) Barrier implementation
5) Syscall logic
6) A lot of this design document

8.2 Samuel Damashek

Much of SMP, including:

1) [...]
2) Simics debugging extension
3) Mutex lock implementation
4) Scheduler lock implementation
5) Timer/AP configurations
6) Benchmarking / real hardware

8.3 Total Distribution

Approximately 50/50.

9 Appendix

Figure 1: Selection of the next GSM thread.

10 Resources

We consulted with Professor Todd Mowry on our project idea and progress. We also consulted with Professor Dave Eckhardt on SMP design, existing gang scheduling algorithms, and debugging. We would like to thank both professors for their assistance throughout our project.

References

[1] SMP4: Symmetric Multiprocessing for Pebbles.
[2] Ben Blum. Practical Concurrency Testing. PhD dissertation, 2018.