CS 3204 CPU Oppgyerating Systems

Lecture 14 Part II Godmar Back

Case Study: 2.6 Linux Scheduler Linux Scheduler (2) (pre 2.6.23) 140 nice=19 • Variant of MLFQS Processes scheduled nice=0 • Instead of recomputation loop, recompute • 140 priorities based on 120 priority at end of each timeslice dynamic – 0-99 “realtime” priority 100 nice=-20 – dyn_prio = nice + interactivity bonus (-5…5) SCHED_OTHER – 100-140 nonrealtime • Interactivity bonus depends on sleep_ avg • Dynamic priority “Realtime” processes – measures time a was blocked computed from static scheduled based on priority (nice) plus static priority • 2 priority arrays (“active” & “expired”) in “interactivity bonus” SCHED_FIFO each runqueue (Linux calls ready queues SCHED_RR 0 “runqueue”)

CS 3204 Fall 2008 3 CS 3204 Fall 2008 4

Linux Scheduler (3) Linux Timeslice Computation struct prio_array { /* Per CPU runqueue */ unsigned int nr_active; struct runqueue { • Linux scales static priority to timeslice unsigned long bitmap[BITMAP_SIZE]; prio_array_t *active; struct list_head queue[MAX_PRIO]; prio_array_t *expired; – Nice [ -20 … 0 … 19 ] maps to }; prio_array_t arrays[2]; [800ms … 100 ms … 5ms] typedef struct prio_array prio_array_t; … } • Various tweaks: /* find the hig hes t-priitiority rea dthd*/dy */ idx = sched_find_first_bit(array->bitmap); – “interactive processes” are reinserted into queue = array->queue + idx; next = list_entry(queue->next, task_t, run_list); active array even after timeslice expires • Unless processes in expired array are starving – processes with long timeslices are round- • Finds highest-priority ready thread quickly • Switching active & expired arrays at end of epoch is robin’d with other of equal priority at sub- simple pointer swap (“O(1)” claim) timeslice granularity

CS 3204 Fall 2008 5 CS 3204 Fall 2008 6

1 Proportional Share Scheduling

• Aka “Fair-Share” Scheduling • Idea: number tickets between 1…N • None of algorithms discussed so far provide a – every process gets pi tickets according to importance direct way of assigning CPU shares – process 1 gets tickets [1… p1-1] – E.g., give 30% of CPU to process A, 70% to process – process 2 gets tickets [p1… p1+p2-1] and so on. B • ShdliScheduling dec iiision: • Proportional Share algorithms do by assigning – Hold a lottery and draw ticket, holder gets to run for “tickets” or “shares” to processes next time slice – Process get to use resource in proportion of their shares to total number of shares • Nondeterministic algorithm • Lottery Scheduling, Weighted Fair • Q.: how to implement priority donation? Queuing/Stride Scheduling [Waldspurger 1995]

CS 3204 Fall 2008 7 CS 3204 Fall 2008 8

Weighted Fair Queuing WFQ Example (A=3, B=2, C=1) Ready Queue is sorted by Virtual Finish Time • Uses ‘per process’ virtual time (Virtual Time at end of quantum if a process were scheduled) Who • Increments process’s virtual time by a “stride” Time Task A Task B Task C Ready Queue Runs One after each quantum, which is defined as 0 1/3 1/2 1 A (1/3) B (1/2) C (1) A -1 scheduling (process_share) epoch. 12/31/2 1 B (1/2) A (2/3) C (1) B A ran 3 out • Choose process with lowest virtual finishing of 6 quanta, time 2 2/3 1 1 A (2/3) C(1) B(1) A B 2 out of 6, C 1 out of 6. – ‘virtual finishing time’ is virtual time + stride 3111 C(1) B(1) A(1) C This • Also known as stride scheduling 41 1 2 B(1) A(1) C(2) B process will repeat, • Linux now implements a variant of 5 1 3/2 2 A(1) B(3/2) C(2) A yielding WFQ/Stride Scheduling as its “CFS” proportional completely fair scheduler 6 4/3 3/2 2 A (4/3) B(3/2) C(2) fairness.

CS 3204 Fall 2008 9 CS 3204 Fall 2008 10

WFQ (cont’d) Linux SMP Load Balancing

• WFQ requires a sorted ready queue static void double_rq_lock( • Runqueue is per CPU runqueue_t *rq1, – Linux now uses R/B tree runqueue_t *rq2) – Higher complexity than O(1) linked lists, but appears • Periodically, lengths of { manageable for real-world ready queue sizes if (rq1 == rq2) { • Unblocked processes that reenter the ready queue are runqueues on different spin_lock(&rq1->lock); assigned a virtual time reflecting the value that their virtual } else { time count er would h ave if they’d rece ive d CPU time CPU is compared if (rq1 < rq2) { proportionally spin_lock(&rq1->lock); • Accommodating I/O bound processes still requires fudging – Processes are migrated to spin_lock(&rq2->lock); – In strict WFQ, only way to improve latency is to set number of balance load } else { shares high – but this is disastrous if process is not truly I/O spin_lock(&rq2->lock); bound • Aside: Migrating requires spin_lock(&rq1->lock); – Linux uses “sleeper fairness,” to identify when to boost virtual } time; similar to the sleep average in old scheduler locks on both runqueues } }

CS 3204 Fall 2008 11 CS 3204 Fall 2008 12

2 Real-time Scheduling EDF – Example

• Real-time systems must observe not only execution time, but a deadline as well Task T C A41 – Jobs must finish by deadline B84 – But turn-around time is usually less important C123 • Common scenario are recurringgj jobs Assume deadline equals period (T). – E.g., need 3 ms every 10 ms (here, 10ms is the recurrence period T, 3 ms is the cost C) A • Possible strategies – RMA (Rate Monotonic) B • Map periods to priorities, fixed, static C – EDF (Earliest Deadline First) 05 10152025 • Always run what’s due next, dynamic Hyper-period CS 3204 Fall 2008 13 CS 3204 Fall 2008 14

EDF – Example EDF – Example

Task T C Task T C A41 A41 B84 B84 C123 C123 Assume deadline equals period (T). Assume deadline equals period (T).

A A

B B

C C

05 10152025 05 10152025

CS 3204 Fall 2008 15 CS 3204 Fall 2008 16

EDF – Example EDF – Example

Task T C Task T C A41 A41 B84 B84 C123 C123 Assume deadline equals period (T). Assume deadline equals period (T). Lexical order tie breaker (C > B > A)

A A

B B

C C

05 10152025 05 10152025

CS 3204 Fall 2008 17 CS 3204 Fall 2008 18

3 EDF – Example EDF – Example

Task T C Task T C A41 A41 B84 B84 C123 C123 Assume deadline equals period (T). Assume deadline equals period (T).

A A

B B

C C

05 10152025 05 10152025

CS 3204 Fall 2008 19 CS 3204 Fall 2008 20

EDF – Example EDF – Example

Task T C Task T C A41 A41 B84 B84 C123 C123 Assume deadline equals period (T). Assume deadline equals period (T).

A A

B B

C C

05 10152025 05 10152025 Pattern repeats

CS 3204 Fall 2008 21 CS 3204 Fall 2008 22

EDF Properties Scheduling Summary

• Feasibility test: • OS must schedule all resources in a system – CPU, Disk, Network, etc. • CPU Scheduling affects indirectly scheduling of • U = 100% in example other devices • Bound theoretical • Goals for general purpose schedulers: • Sufficient and necessary – Minimizing latency (avg. completion or waiting time) • Optimal – Maximing throughput – Provide fairness • In Practice: some theory, lots of tweaking

CS 3204 Fall 2008 23 CS 3204 Fall 2008 24

4