TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

TCSS 422: OPERATING SYSTEMS

Introduction to CPU Scheduling, Multi-Level Feedback Queue, Proportional Share Scheduling

Wes J. Lloyd School of Engineering and Technology University of Washington - Tacoma

TCSS422: Operating Systems [Winter 2019] January 23, 2019 School of Engineering and Technology, University of Washington - Tacoma

FEEDBACK FROM 1/16

 I am unclear about the Ch. 6 material  What is limited direct execution?  CPU modes: what is user mode?  CPU modes: what is kernel mode?  Three kinds of traps “switch” processing to kernel mode . (calls for I/O, kernel API calls, ) . Exception (error in code) . : Maskable vs. Unmaskable  What is Cooperative multi-tasking?  What is Preemptive multi-tasking?  What is a context switch?  CPU scheduling: what is a time quantum (slice)?

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.2 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.1 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

FEEDBACK - 2

 What criteria is used to decide whether a context switch is appropriate following the end of a job’s time slice? . After time quantum timer interrupt always fires, OS regains control (context switch) to evaluate next scheduling task  Causes of context switches: . System call – in user mode context switches to kernel mode to perform a kernel (API) call. Example: I/O . Multi-tasking – process uses full time quantum (10 ms default), timer interrupt fires, OS scheduler regains control of CPU to evaluate next task . Exception – CPU exception running current task (e.g. memory access fault) TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.3 School of Engineering and Technology, University of Washington - Tacoma

FEEDBACK - 3

 Explain the difference between unmaskable (interrupt) and maskable (interrupt):  Define unmaskable  Define maskable

 What happens if a non-maskable interrupt lock in interrupted by a higher priority (interrupt) ? . Only maskable can be interrupted . Non maskable interrupt handles are not pre-emptible

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.4 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.2 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

FEEDBACK – 4

 What is “preemptibility” again? . OS: ability to be interrupted and replaced by a higher priority task

 What does “no ” mean for the Shortest Job First (SJF) CPU scheduler?

 Why is there no overhead / context switching for the FIFO scheduler? . Does the FIFO scheduler context switch? . Revisit example

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.5 School of Engineering and Technology, University of Washington - Tacoma

FEEDBACK - 5

 Where is the C tutorial?

 Link:  http://faculty.washington.edu/wlloyd/courses/tcss422/a ssignments/TCSS422_w2019_C_Tutorial.pdf

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.6 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.3 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

OBJECTIVES

 Active Reading Quiz 1 (Tuesday)  Assignment 0 / Linux Tutorial  C Tutorial  Assignment 1 – Soon…

 CPU Scheduling:  Chapter 7 – Scheduling Introduction  Chapter 8 – Multi-level Feedback Queue (MLFQ)  Chapter 9 – Proportional Share Scheduler Linux Completely Fair Scheduler (CFS)

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.7 School of Engineering and Technology, University of Washington - Tacoma

CHAPTER 7- SCHEDULING: INTRODUCTION

TCSS422: Operating Systems [Winter 2019] January 23, 2019 School of Engineering and Technology, University of Washington - Tacoma L5.8

Slides by Wes J. Lloyd L5.4 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

SCHEDULING METRICS

 Metrics: A standard measure to quantify to what degree a system possesses some property. Metrics provide repeatable techniques to quantify and compare systems.  Measurements are the numbers derived from the application of metrics

 Scheduling Metric #1: Turnaround time  The time at which the job completes minus the time at which the job arrived in the system

푻풕풖풓풏풂풓풐풖풏풅 = 푻풄풐풎풑풍풆풕풊풐풏 − 푻풂풓풓풊풗풂풍

 How is turnaround time different than execution time?

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.9 School of Engineering and Technology, University of Washington - Tacoma

SCHEDULING METRICS - 2

 Scheduling Metric #2: Fairness . Jain’s fairness index . Quantifies if jobs receive a fair share of system resources

 n processes

 xi is time share of each process  worst case = 1/n  best case = 1

 Consider n=3, worst case = .333, best case=1

 With n=3 and x1=.2, x2=.7, x3=.1, fairness=.62  With n=3 and x1=.33, x2=.33, x3=.33, fairness=1

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.10 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.5 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

SCHEDULERS

 FIFO: first in, first out . Very simple, easy to implement

 Consider . 3 x 10sec jobs, arrival: A B C, duration 10 sec each

ퟏퟎ + ퟐퟎ + ퟑퟎ 푨풗풆풓풂품풆 풕풖풓풏풂풓풐풖풏풅 풕풊풎풆 = = ퟐퟎ 풔풆풄 ퟑ

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.11 School of Engineering and Technology, University of Washington - Tacoma

SJF: SHORTEST JOB FIRST

 Given that we know execution times in advance: . Run in order of duration, shortest to longest . Non preemptive scheduler . This is not realistic . Arrival: A B C, duration a=100 sec, b/c=10sec

ퟏퟎ + ퟐퟎ + ퟏퟐퟎ 푨풗풆풓풂품풆 풕풖풓풏풂풓풐풖풏풅 풕풊풎풆 = = ퟓퟎ 풔풆풄 ퟑ

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.12 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.6 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

SJF: WITH RANDOM ARRIVAL

 If jobs arrive at any time: duration a=100s, b/c=10s  A @ t=0sec, B @ t=10sec, C @ t=10sec

ퟏퟎퟎ + ퟏퟏퟎ − ퟏퟎ +(ퟏퟐퟎ − ퟏퟎ) 푨풗풆풓풂품풆 풕풖풓풏풂풓풐풖풏풅 풕풊풎풆 = = ퟏퟎퟑ. ퟑퟑ 풔풆풄 ퟑ

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.13 School of Engineering and Technology, University of Washington - Tacoma

STCF - 2

 Consider: duration a=100sec, b/c=10sec

. Alen=100 Aarrival=0

. Blen=10, Barrival=10, Clen=10, Carrival=10

(ퟏퟐퟎ − ퟎ)+ ퟐퟎ − ퟏퟎ +(ퟑퟎ − ퟏퟎ) 푨풗풆풓풂품풆 풕풖풓풏풂풓풐풖풏풅 풕풊풎풆 = = ퟓퟎ 풔풆풄 ퟑ

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.14 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.7 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

SCHEDULING METRICS - 3

 Scheduling Metric #3: Response Time  Time from when job arrives until it starts execution

푻풓풆풔풑풐풏풔풆 = 푻풇풊풓풔풕풓풖풏 − 푻풂풓풓풊풗풂풍

 STCF, SJF, FIFO . can perform poorly with respect to response time

What scheduling algorithm(s) can help minimize response time?

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.15 School of Engineering and Technology, University of Washington - Tacoma

RR: ROUND ROBIN

 Run each job awhile, then switch to another distributing the CPU evenly (fairly)  Scheduling Quantum is called a time slice  Time slice mustRR is befair, but performs poorly on metrics a multiple of the such as turnaround time timer interrupt period.

Scheduling Quantum = 5 seconds

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.16 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.8 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

RR EXAMPLE

 ABC arrive at time=0, each run for 5 seconds OVERHEAD not considered

ퟎ + ퟓ + ퟏퟎ 푻 = = ퟓ풔풆풄 풂풗풆풓풂품풆 풓풆풔풑풐풏풔풆 ퟑ

ퟎ + ퟏ + ퟐ 푻 = = ퟏ풔풆풄 풂풗풆풓풂품풆 풓풆풔풑풐풏풔풆 ퟑ

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.17 School of Engineering and Technology, University of Washington - Tacoma

ROUND ROBIN: TRADEOFFS

Short Time Slice Long Time Slice

Fast Response Time Slow Response Time

High overhead from Low overhead from context switching context switching

 Time slice impact: .Turnaround time (for earlier example): ts(1,2,3,4,5)=14,14,13,14,10 .Fairness: round robin is always fair, J=1

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.18 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.9 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

SCHEDULING WITH I/O

 STCF scheduler . A: CPU=50ms, I/O=40ms, 10ms intervals . B: CPU=50ms, I/O=0ms . Consider A as 10ms subjobs (CPU, then I/O)

 Without considering I/O:

CPU utilization= 100/140=71%

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.19 School of Engineering and Technology, University of Washington - Tacoma

SCHEDULING WITH I/O - 2

 When a job initiates an I/O request . A is blocked, waits for I/O to compute, frees CPU . STCF scheduler assigns B to CPU  When I/O completes  raise interrupt . Unblock A, STCF goes back to executing A: (10ms sub-job)

Cpu utilization = 100/100=100%

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.20 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.10 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

TCSS422: Operating Systems [Winter 2019] January 23, 2019 School of Engineering and Technology, University of Washington - Tacoma L5.21

CHAPTER 8 – MULTI-LEVEL FEEDBACK QUEUE (MLFQ) SCHEDULER

TCSS422: Operating Systems [Winter 2019] January 23, 2019 School of Engineering and Technology, University of Washington - Tacoma L5.22

Slides by Wes J. Lloyd L5.11 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

MULTI-LEVEL FEEDBACK QUEUE

 Objectives: .Improve turnaround time: Run shorter jobs first

.Minimize response time: Important for interactive jobs (UI)

 Achieve without a priori knowledge of job length

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.23 School of Engineering and Technology, University of Washington - Tacoma

MLFQ - 2 Round-Robin within a Queue

 Multiple job queues  Adjust job priority based on observed behavior

 Interactive Jobs . Frequent I/O  keep priority high . Interactive jobs require fast response time (GUI/UI)

 Batch Jobs . Require long periods of CPU utilization . Keep priority low

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.24 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.12 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

MLFQ: DETERMINING JOB PRIORITY

 New arriving jobs are placed into highest priority queue

 If a job uses its entire time slice, priority is reduced (↓) . Jobs appears CPU-bound ( “batch” job), not interactive (GUI/UI)

 If a job relinquishes the CPU for I/O priority stays the same

MLFQ approximates SJF

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.25 School of Engineering and Technology, University of Washington - Tacoma

MLFQ: LONG RUNNING JOB

 Three-queue scheduler, time slice=10ms

Priority

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.26 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.13 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

MLFQ: BATCH AND INTERACTIVE JOBS

 Aarrival_time =0ms, Arun_time=200ms,

 Brun_time =20ms, Barrival_time =100ms

Priority

Scheduling multiple jobs (ms)

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.27 School of Engineering and Technology, University of Washington - Tacoma

MLFQ: BATCH AND INTERACTIVE - 2

 Continuous interactive job (B) with long running batch job (A) . Low response time is good for B . A continues to make progress

The MLFQ approach keeps interactive job(s) at the highest priority

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.28 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.14 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

MLFQ: ISSUES

 Starvation

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.29 School of Engineering and Technology, University of Washington - Tacoma

MLFQ: ISSUES - 2

 Gaming the scheduler . Issue I/O operation at 99% completion of the time slice . Keeps job priority fixed – never lowered

 Job behavioral change . CPU/batch process becomes an interactive process

Priority becomes stuck

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.30 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.15 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

RESPONDING TO BEHAVIOR CHANGE

Starvation

 Priority Boost . Reset all jobs to topmost queue after some time interval S

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.31 School of Engineering and Technology, University of Washington - Tacoma

RESPONDING TO BEHAVIOR CHANGE - 2

 With priority boost . Prevents starvation

With

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.32 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.16 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

KEY TO UNDERSTANDING MLFQ – PB

 Without priority boost:

 Rule 1: If Priority(A) > Priority(B), A runs (B doesn’t).  Rule 2: If Priority(A) = Priority(B), A & B run in RR.

 KEY: If time quantum of a higher queue is filled, then we don’t run any jobs in lower priority queues!!!

TCSS422: Operating Systems [Fall 2018] January 23, 2019 L5.33 School of Engineering and Technology, University of Washington - Tacoma

STARVATION EXAMPLE

 Consider 3 queues:  Q2 – HIGH PRIORITY – Time Quantum 10ms  Q1 – MEDIUM PRIORITY – Time Quantum 20 ms  Q0 – LOW PRIORITY – Time Quantum 40 ms

 Job A: 200ms no I/O  Job B: 5ms then I/O  Job C: 5ms then I/O

 Q2 fills up, Starvation starves Q1 & Q0

 A makes no progress

TCSS422: Operating Systems [Fall 2018] January 23, 2019 L5.34 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.17 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

PREVENTING GAMING

 Improved time accounting: . Track total job execution time in the queue . Each job receives a fixed time allotment . When allotment is exhausted, job priority is lowered

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.35 School of Engineering and Technology, University of Washington - Tacoma

MLFQ: TUNING

 Consider the tradeoffs: . How many queues? . What is a good time slice? . How often should we “Boost” priority of jobs? . What about different time slices to different queues?

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.36 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.18 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

PRACTICAL EXAMPLE

 Oracle Solaris MLFQ implementation . 60 Queues  w/ slowly increasing time slice (high to low priority) . Provides sys admins with set of editable table(s) . Supports adjusting time slices, boost intervals, priority changes, etc.

 Advice . Provide OS with hints about the process . Nice command  Linux

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.37 School of Engineering and Technology, University of Washington - Tacoma

MLFQ RULE SUMMARY

 The refined set of MLFQ rules:  Rule 1: If Priority(A) > Priority(B), A runs (B doesn’t).  Rule 2: If Priority(A) = Priority(B), A & B run in RR.  Rule 3: When a job enters the system, it is placed at the highest priority.  Rule 4: Once a job uses up its time allotment at a given level (regardless of how many times it has given up the CPU), its priority is reduced(i.e., it moves down on queue).  Rule 5: After some time period S, move all the jobs in the system to the topmost queue.

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.38 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.19 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

TCSS422: Operating Systems [Winter 2019] January 23, 2019 School of Engineering and Technology, University of Washington - Tacoma L5.39

EXAMPLE

 Question:  Given a system with a quantum length of 10 ms in its highest queue, how often would you have to boost jobs back to the highest priority level to guarantee that a single long-running (and potentially starving) job gets at least 5% of the CPU?

 Some combination of n short jobs runs for a total of 10 ms per cycle without relinquishing the CPU . E.g. 2 jobs = 5 ms ea; 3 jobs = 3.33 ms ea, 10 jobs = 1 ms ea . n jobs always uses full time quantum (10 ms) . Batch jobs starts, runs for full quantum of 10ms . All other jobs run and context switch totaling the quantum per cycle . If 10ms is 5% of the CPU, when must the priority boost be ??? . ANSWER  Priority boost should occur every 200ms

TCSS422: Operating Systems [Fall 2018] January 23, 2019 L5.40 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.20 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

CHAPTER 9 - PROPORTIONAL SHARE SCHEDULER

TCSS422: Operating Systems [Winter 2019] January 23, 2019 School of Engineering and Technology, University of Washington - Tacoma L5.41

PROPORTIONAL SHARE SCHEDULER

 Also called fair-share scheduler or lottery scheduler . Guarantees each job receives some percentage of CPU time based on share of “tickets” . Each job receives an allotment of tickets . % of tickets corresponds to potential share of a resource . Can conceptually schedule any resource this way . CPU, disk I/O, memory

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.42 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.21 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

LOTTERY SCHEDULER

 Simple implementation . Just need a random number generator . Picks the winning ticket . Maintain a data structure of jobs and tickets (list) . Traverse list to find the owner of the ticket . Consider sorting the list for speed

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.43 School of Engineering and Technology, University of Washington - Tacoma

LOTTERY SCHEDULER IMPLEMENTATION

1 // counter: used to track if we’ve found the winner yet 2 int counter = 0; 3 4 // winner: use some call to a random number generator to 5 // get a value, between 0 and the total # of tickets 6 int winner = getrandom(0, totaltickets); 7 8 // current: use this to walk through the list of jobs 9 node_t *current = head; 10 11 // loop until the sum of ticket values is > the winner 12 while (current) { 13 counter = counter + current->tickets; 14 if (counter > winner) 15 break; // found the winner 16 current = current->next; 17 } 18 // ’current’ is the winner: schedule it...

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.44 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.22 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

TICKET MECHANISMS

 Ticket currency / exchange . User allocates tickets in any desired way . OS converts user currency into global currency

 Example: . There are 200 global tickets assigned by the OS

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.45 School of Engineering and Technology, University of Washington - Tacoma

TICKET MECHANISMS - 2

 Ticket transfer . Temporarily hand off tickets to another process

 Ticket inflation . Process can temporarily raise or lower the number of tickets it owns . If a process needs more CPU time, it can boost tickets.

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.46 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.23 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

LOTTERY SCHEDULING

 Scheduler picks a winning ticket . Load the job with the winning ticket and run it

 Example: . Given 100 tickets in the pool . Job A has 75 tickets: 0 - 74 . Job B has 25 tickets: 75 – 99

Scheduled job:

 But what do we know about probability of a coin flip?

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.47 School of Engineering and Technology, University of Washington - Tacoma

COIN FLIPPING

 Equality of distribution (fairness) requires a lot of flips!

Similarly, Lottery scheduling requires lots of “rounds” to achieve fairness.

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.48 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.24 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

LOTTERY FAIRNESS

 With two jobs . Each with the same number of tickets (t=100)

When the job length is not very long, average unfairness can be quite severe.

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.49 School of Engineering and Technology, University of Washington - Tacoma

LOTTERY SCHEDULING CHALLENGES

 What is the best approach to assign tickets to jobs? . Typical approach is to assume users know best . Users are provided with tickets, which they allocate as desired

 How should the OS automatically distribute tickets upon job arrival? . What do we know about incoming jobs a priori ? . Ticket assignment is really an open problem…

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.50 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.25 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

STRIDE SCHEDULER

 Addresses statistical probability issues with lottery scheduling

 Instead of guessing a random number to select a job, simply count…

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.51 School of Engineering and Technology, University of Washington - Tacoma

STRIDE SCHEDULER - 2

 Jobs have a “stride” value . A stride value describes the counter pace when the job should give up the CPU . Stride value is inverse in proportion to the job’s number of tickets (more tickets = smaller stride)

 Total system tickets = 10,000

. Job A has 100 tickets  Astride = 10000/100 = 100 stride

. Job B has 50 tickets  Bstride = 10000/50 = 200 stride

. Job C has 250 tickets  Cstride = 10000/250 = 40 stride

 Stride scheduler tracks “pass” values for each job (A, B, C)

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.52 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.26 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

STRIDE SCHEDULER - 3

 Basic algorithm: 1. Stride scheduler picks job with the lowest pass value 2. Scheduler increments job’s pass value by its stride and starts running 3. Stride scheduler increments a counter 4. When counter exceeds pass value of current job, pick a new job (go to 1)

 KEY: When the counter reaches a job’s “PASS” value, the scheduler passes on to the next job…

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.53 School of Engineering and Technology, University of Washington - Tacoma

STRIDE SCHEDULER - EXAMPLE

 Stride values .Tickets = priority to select job .Stride is inverse to tickets .Lower stride = more chances to run (higher priority)

Priority C stride = 40 A stride = 100 B stride = 200

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.54 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.27 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

STRIDE SCHEDULER EXAMPLE - 2

 Three-way tie: randomly pick job A (all pass values=0)  Set A’s pass value to A’s stride = 100  Increment counter until > 100 Tickets C = 250  Pick a new job: two-way tie A = 100 B = 50

Initial job selection is random. All @ 0

C has the most tickets and receives a lot of opportunities to run…

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.55 School of Engineering and Technology, University of Washington - Tacoma

STRIDE SCHEDULER EXAMPLE - 3

 We set A’s counter (pass value) to A’s stride = 100  Next scheduling decision between B (pass=0) and C (pass=0) . Randomly choose B Tickets  C has the lowest counter for next 3 rounds C = 250 A = 100 B = 50

C has the most tickets and is selected to run more often …

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.56 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.28 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

STRIDE SCHEDULER EXAMPLE - 4

 Job counters support determining which job to run next  Over time jobs are scheduled to run based on their priority represented as their share of tickets… Tickets  Tickets are analogous to job priority C = 250 A = 100 B = 50

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.57 School of Engineering and Technology, University of Washington - Tacoma

LINUX: COMPLETELY FAIR SCHEDULER (CFS)

 Linux ≥ 2.6.23: Completely Fair Scheduler (CFS)  Linux < 2.6.23: O(1) scheduler

 Every /process has a scheduling class (policy):  Normal classes: SCHED_OTHER (TS), SCHED_IDLE, SCHED_BATCH . TS = Time Sharing  Real-time classes: SCHED_FIFO (FF), SCHED_RR (RR)

 Show scheduling class and priority:  ps –elfc  ps ax -o pid,ni,cls,pri,cmd

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.58 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.29 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

COMPLETELY FAIR SCHEDULER - 2

 Loosely based on the stride scheduler  CFS models system as a Perfect Multi-Tasking System . In perfect system every process of the same priority (class) receive exactly 1/nth of the CPU time  Scheduling classes each have a runqueue . Groups process of same priority . Process priority groups use different sets of runqueues for priorities . Scheduler picks task with lowest accumulative runtime to run . Time quantum varies based on how many jobs in shared runqueue . Time quantum is proportional to system CPU load in the runqueue . No fixed time quantum (e.g. 10 ms) TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.59 School of Engineering and Technology, University of Washington - Tacoma

COMPLETELY FAIR SCHEDULER – 3

 Runqueues are stored using a linux red-black tree . Self balancing binary tree - nodes indexed by vruntime  Leftmost node has lowest vruntime (approx execution time)  Walking tree to find left most node is ~O(log N) for N nodes  Completed processes removed

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.60 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.30 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

COMPLETELY FAIR SCHEDULER - 4

 CFS tracks virtual run time in vruntime variable  The task on a given runqueue with the lowest vruntime is scheduled next  struct sched_entity contains vruntime parameter . Describes process execution time in nanoseconds . Value is not pure runtime, but weighted based on priority

. Perfect scheduler  achieve equal vruntime for all processes of same priority

 Key takeaway identifying the next job to schedule is really fast!

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.61 School of Engineering and Technology, University of Washington - Tacoma

CFS: JOB PRIORITY

 Time slice: Linux “Nice value” . Nice value predates the CFS scheduler . Top shows nice values . Process command (nice & priority): ps ax -o pid,ni,cmd,%cpu, pri

 Nice Values: from -20 to 19 . Lower is higher priority, default is 0 . Vruntime is a weighted time measurement . Priority weights the calculation of vruntime within a runqueue to give high priority jobs a boost. . Influences job’s position in rb-tree

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.62 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.31 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

CFS: TIME QUANTUM

 Scheduling quantum is calculated at runtime based on targeted latency and total number of running processes

 Will vary between:  cat /proc/sys/kernel/sched_min_granularity_ns (3 ms – minimum quantum)  cat /proc/sys/kernel/sched_latency_ns (24 ms – target quantum)

 Target quantum (latency): . Interval during which task should run at least once . Automatically increases as number of jobs increase

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.63 School of Engineering and Technology, University of Washington - Tacoma

CFS: TIME QUANTUM - 2

 How do we map a nice value to an actual CPU time quantum (timeslice) (ms)? What is the best mapping?

 O(1) scheduler (< 2.6.23) . tried to map nice value to timeslice (fixed allotment)

 Linux completely fair scheduler . Nice value suggests priority to assign runqueue for job . Time proportion varies based on # of jobs in runqueue . With fewer jobs in runqueue, time proportion is larger

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.64 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.32 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

COMPLETELY FAIR SCHEDULER - 5

 More information:

 Man page: “man sched” : Describes Linux scheduling API  http://manpages.ubuntu.com/manpages/bionic/man7/sched. 7.html

 https://www.kernel.org/doc/Documentation/scheduler/sched- design-CFS.txt  https://en.wikipedia.org/wiki/Completely_Fair_Scheduler

 See paper: The Linux Scheduler – a Decade of Wasted Cores  http://www.ece.ubc.ca/~sasha/papers/eurosys16-final29.pdf

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.65 School of Engineering and Technology, University of Washington - Tacoma

QUESTIONS

Slides by Wes J. Lloyd L5.33 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

COMPUTER BOOT SEQUENCE: OS WITH DIRECT EXECUTION

 What if programs could directly control the CPU / system?

OS Program 1. Create entry for process list 2. Allocate memory for program 3. Load program into memory 4. Set up stack with argc / argv 5. Clear registers 7. Run main() 6. Execute call main() 8. Execute return from main()

9. Free memory of process 10. Remove from process list

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.67 School of Engineering and Technology, University of Washington - Tacoma

COMPUTER BOOT SEQUENCE: OS WITH DIRECT EXECUTION

 What if programs could directly control the CPU / system?

OS Program 1. Create entry for process list 2. Allocate memory for program Without limits on running programs, 3. Load programthe OSinto wouldn’t memory be in control of anything 4. Set up stack withand argc would/ “just be a library” argv 5. Clear registers 7. Run main() 6. Execute call main() 8. Execute return from main()

9. Free memory of process 10. Remove from process list

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.68 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.34 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

DIRECT EXECUTION - 2

 With direct execution:

How does the OS stop a program from running, and switch to another to support time sharing?

How do programs share disks and perform I/O if they are given direct control? Do they know about each other?

With direct execution, how can dynamic memory structures such as linked lists grow over time?

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.69 School of Engineering and Technology, University of Washington - Tacoma

CONTROL TRADEOFF

 Too little control: . No security . No time sharing

 Too much control: . Too much OS overhead . Poor performance for compute & I/O . Complex (system calls), difficult to use

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.70 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.35 TCSS 422 A – Winter 2019 1/22/2019 School of Engineering and Technology

CONTEXT SWITCHING OVERHEAD

Overhead

Time

TCSS422: Operating Systems [Winter 2019] January 23, 2019 L5.71 School of Engineering and Technology, University of Washington - Tacoma

Slides by Wes J. Lloyd L5.36