CS194-24
Advanced Operating Systems Structures and Implementation
Lecture 9

How to work in a group / Synchronization (finished)

February 24th, 2014
Prof. John Kubiatowicz
http://inst.eecs.berkeley.edu/~cs194-24

Goals for Today

• Tips for Programming in a Design Team
• Synchronization (continued)
  – Lock Free Synchronization
  – Monitors

Interactive is important! Ask Questions!

Note: Some slides and/or pictures in the following are adapted from slides ©2013

2/24/14 Kubiatowicz CS194-24 ©UCB Fall 2014 Lec 9.2

Recall: Synchronization

• Atomic Operation: an operation that always runs to completion or not at all
  – It is indivisible: it cannot be stopped in the middle and state cannot be modified by someone else in the middle
  – Fundamental building block – if no atomic operations, then have no way for threads to work together
• Synchronization: using atomic operations to ensure cooperation between threads
  – For now, only loads and stores are atomic
  – We are going to show that it's hard to build anything useful with only reads and writes
• Critical Section: piece of code that only one thread can execute at once. Only one thread at a time will get into this section of code.
  – Critical section is the result of mutual exclusion
  – Critical section and mutual exclusion are two ways of describing the same thing.

Recall: Atomic Instructions

• test&set (&address) {            /* most architectures */
      result = M[address];
      M[address] = 1;
      return result;
  }
• swap (&address, register) {      /* x86 */
      temp = M[address];
      M[address] = register;
      register = temp;
  }
• compare&swap (&address, reg1, reg2) {   /* 68000 */
      if (reg1 == M[address]) {
          M[address] = reg2;
          return success;
      } else {
          return failure;
      }
  }
• load-linked&store conditional(&address) {   /* R4000, alpha */
  loop:
      ll   r1, M[address];
      movi r2, 1;              /* Can do arbitrary comp */
      sc   r2, M[address];
      beqz r2, loop;
  }

Recall: Portable Spinlock constructs in Linux

• Linux provides lots of synchronization constructs
  – We will highlight them throughout the term
• Example: Spin Lock support: Not recursive!
  – Only a lock on multiprocessors: Becomes simple preemption disable/enable on uniprocessors

      #include <linux/spinlock.h>
      DEFINE_SPINLOCK(my_lock);

      spin_lock(&my_lock);
      /* Critical section … */
      spin_unlock(&my_lock);

• Disable interrupts and grab lock (while saving and restoring state in case interrupts already disabled):

      DEFINE_SPINLOCK(my_lock);
      unsigned long flags;

      spin_lock_irqsave(&my_lock, flags);
      /* Critical section … */
      spin_unlock_irqrestore(&my_lock, flags);

Tips for Programming in a Project Team

• Big projects require more than one person (or long, long, long time)
  – Big OS: thousands of person-years!
• It's very hard to make software project teams work correctly
  – Doesn't seem to be as true of big construction projects
    » Empire State Building finished in one year: staging iron production thousands of miles away
    » Or the Hoover Dam: built towns to hold workers
  – Is it OK to miss deadlines?
    » We make it free (slip days)
    » Reality: they're very expensive as time-to-market is one of the most important things!

[Cartoon caption: “You just have to get your synchronization right!”]
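Returning to the atomic instructions recalled above, the following user-space sketch shows how test&set and compare&swap might be used in practice. It is not the Linux kernel spinlock: it assumes C11 `<stdatomic.h>` and POSIX threads, and every name in it (`tas_lock_t`, `cas_increment`, `atomic_demo`, and so on) is hypothetical. `atomic_flag_test_and_set` stands in for test&set, and `atomic_compare_exchange_weak` stands in for compare&swap.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names; atomic_flag_test_and_set plays the role of test&set. */
typedef struct { atomic_flag held; } tas_lock_t;

static void tas_lock(tas_lock_t *l) {
    /* Returns the old value and sets the flag in one atomic step.
       Spin for as long as the old value says someone else holds the lock. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void tas_unlock(tas_lock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* compare&swap used directly: a lock-free increment. */
static void cas_increment(atomic_long *p) {
    long old = atomic_load(p);
    /* On failure, old is refreshed with the current value, so just retry. */
    while (!atomic_compare_exchange_weak(p, &old, old + 1))
        ;
}

static tas_lock_t demo_lock = { ATOMIC_FLAG_INIT };
static long locked_counter;       /* protected by demo_lock */
static atomic_long cas_counter;   /* updated only through cas_increment */

static void *demo_worker(void *arg) {
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++) {
        tas_lock(&demo_lock);
        locked_counter++;         /* critical section */
        tas_unlock(&demo_lock);
        cas_increment(&cas_counter);
    }
    return NULL;
}

/* Runs nthreads workers (1..16); returns 1 if both counters are exact. */
int atomic_demo(int nthreads, long iters) {
    pthread_t t[16];
    assert(nthreads >= 1 && nthreads <= 16);
    locked_counter = 0;
    atomic_store(&cas_counter, 0);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&t[i], NULL, demo_worker, &iters);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(t[i], NULL);
    long expect = (long)nthreads * iters;
    return locked_counter == expect && atomic_load(&cas_counter) == expect;
}
```

Calling `atomic_demo(4, 10000)` from a `main` (compiled with `-pthread`) exercises both paths. The lock-based counter needs the critical section; the CAS counter needs no lock at all, which is the basic idea behind lock-free synchronization.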


Big Projects

• What is a big project?
  – Time/work estimation is hard
  – Programmers are eternal optimists (it will only take two days)!
    » This is why we bug you about starting the project early
    » Had a grad student who used to say he just needed “10 minutes” to fix something. Two hours later…
• Can a project be efficiently partitioned?
  – Partitionable task decreases in time as you add people
  – But, if you require communication:
    » Time reaches a minimum bound
    » With complex interactions, time increases!
  – Mythical person-month problem:
    » You estimate how long a project will take
    » Starts to fall behind, so you add more people
    » Project takes even more time!

Techniques for Partitioning Tasks

• Functional
  – Person A implements threads, Person B implements semaphores, Person C implements locks…
  – Problem: Lots of communication across APIs
    » If B changes the API, A may need to make changes
    » Story: Large airline company spent $200 million on a new scheduling and booking system. Two teams “working together.” After two years, went to merge software. Failed! Interfaces had changed (documented, but no one noticed). Result: would cost another $200 million to fix.
• Task
  – Person A designs, Person B writes code, Person C tests
  – May be difficult to find right balance, but can focus on each person’s strengths (Theory vs systems hacker)
  – Since debugging is hard, Microsoft has two testers for each programmer
• Most Berkeley project teams are functional, but people have had success with task-based divisions

Communication

• More people mean more communication
  – Changes have to be propagated to more people
  – Think about person writing code for most fundamental component of system: everyone depends on them!
• Miscommunication is common
  – “Index starts at 0? I thought you said 1!”
• Who makes decisions?
  – Individual decisions are fast but trouble
  – Group decisions take time
  – Centralized decisions require a big picture view (someone who can be the “system architect”)
• Often designating someone as the system architect can be a good thing
  – Better not be clueless
  – Better have good people skills
  – Better let other people do work

Coordination

• More people → no one can make all meetings!
  – They miss decisions and associated discussion
  – Example from earlier class: one person missed meetings and did something group had rejected
  – Why do we limit groups to 5 people?
    » You would never be able to schedule meetings otherwise
  – Why do we require 4 people minimum?
    » You need to experience groups to get ready for real world
• People have different work styles
  – Some people work in the morning, some at night
  – How do you decide when to meet or work together?
• What about project slippage?
  – It will happen, guaranteed!
  – Ex: phase 4, everyone busy but not talking. One person way behind. No one knew until very end – too late!
• Hard to add people to existing group
  – Members have already figured out how to work together


How to Make it Work?

• People are human. Get over it.
  – People will make mistakes, miss meetings, miss deadlines, etc. You need to live with it and adapt
  – It is better to anticipate problems than clean up afterwards.
• Document, document, document
  – Why Document?
    » Expose decisions and communicate to others
    » Easier to spot mistakes early
    » Easier to estimate progress
  – What to document?
    » Everything (but don’t overwhelm people or no one will read)
  – Standardize!
    » One programming format: variable naming conventions, tab indents, etc.
    » Comments (Requires, effects, modifies) – javadoc?

Suggested Documents for You to Maintain

• Project objectives: goals, constraints, and priorities
• Specifications: the manual plus performance specs
  – This should be the first document generated and the last one finished
  – Consider your specifications as one possibility
• Meeting notes
  – Document all decisions
  – You can often cut & paste for the design documents
• Schedule: What is your anticipated timing?
  – This document is critical!
• Organizational Chart
  – Who is responsible for what task?

Use Software Tools

• Source revision control software
  – Git – check in frequently. Tag working code
  – Easy to go back and see history/undo mistakes
  – Work on independent branches for each feature
    » Merge working features into your master branch
    » Consider using “rebase” as well
  – Communicates changes to everyone
• Redmine
  – Make use of Bug reporting and tracking features!
  – Use Wiki for communication between teams
  – Also, consider setting up a forum to leave information for one another about the current state of the design
• Use automated testing tools
  – Rebuild from sources frequently
  – Run Cucumber tests frequently
    » Use tagging features to run subsets of tests!

Test Continuously

• Integration tests all the time, not at 11pm on due date!
  – Utilize Cucumber features to test frequently!
  – Write dummy stubs with simple functionality
    » Lets people test continuously, but more work
  – Schedule periodic integration tests
    » Get everyone in the same room, check out code, build, and test.
    » Don’t wait until it is too late!
    » This is exactly what the autograder does!
• Testing types:
  – Integration tests: Use of Cucumber with BDD
  – Unit tests: check each module in isolation (CUnit)
  – Daemons: subject code to exceptional cases
  – Random testing: Subject code to random timing changes
• Test early, test later, test again
  – Tendency is to test once and forget; what if something changes in some other part of the code?


Administrivia

• Recall: Midterm I: 2½ weeks from today!
  – Wednesday 3/12
  – Intention is a 1.5 hour exam over 3 hours
  – No class on day of exam!
• Midterm Timing:
  – Probably 4:00-7:00PM in 320 Soda Hall
  – Does this work? (Still confirming room)
• Topics: everything up to the previous Monday
  – OS Structure, BDD, Process support, Synchronization, Memory Management, File systems

Computers in the News: Net Neutrality?

• Net Neutrality:
  – “The principle that Internet service providers and governments should treat all data on the Internet equally, not discriminating or charging differentially by user, content, site, platform, application, type of attached equipment, and modes of communication.” From Wikipedia
• Do we have Net Neutrality?
  – FCC has issued “rules” or “policies”, but these do not have a lot of teeth
  – Supporters want internet service providers to be declared “common carriers”
    » A common carrier holds itself out to provide service to the general public without discrimination for the "public convenience and necessity".
• In the news: Netflix
  – Comcast and/or others reported to have dropped Netflix bandwidth intentionally
  – Just announced: new “agreements” with Comcast and Verizon to get more bandwidth (i.e. pay more for access!)

Take 2: Mellor-Crummey-Scott Lock

• Another lock-free option – Mellor-Crummey Scott (MCS):

      public void lock() {
          QNode qnode = myNode.get();
          qnode.next = null;
          QNode pred = swap(tail,qnode);
          if (pred != null) {
              qnode.locked = true;
              pred.next = qnode;
              while (qnode.locked);   // wait for predecessor
          }
      }

Mellor-Crummey-Scott Lock (con’t)

      public void unlock() {
          QNode qnode = myNode.get();
          if (qnode.next == null) {
              if (compare&swap(tail,qnode,null))
                  return;
              // wait until successor fills in my next field
              while (qnode.next == null);
          }
          qnode.next.locked = false;
      }

[Figure: tail pointer and queue nodes 1, 2, 3 linked by pred/next pointers, each node marked locked or unlocked]
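The MCS lock and unlock procedures above can be sketched in C11 as follows. This is an illustrative translation under stated assumptions, not the lecture's code: the per-thread `myNode.get()` is replaced by a node the caller passes in, `swap` becomes `atomic_exchange`, `compare&swap` becomes `atomic_compare_exchange_strong`, and the `sched_yield()` calls in the spin loops are an addition so the demo behaves on machines with fewer cores than threads. All names (`mcs_lock_t`, `mcs_demo`, and so on) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
} mcs_node_t;

typedef struct { _Atomic(mcs_node_t *) tail; } mcs_lock_t;

static void mcs_lock(mcs_lock_t *l, mcs_node_t *me) {
    atomic_store(&me->next, (mcs_node_t *)NULL);
    mcs_node_t *pred = atomic_exchange(&l->tail, me);   /* swap(tail, qnode) */
    if (pred != NULL) {
        atomic_store(&me->locked, true);
        atomic_store(&pred->next, me);       /* link in behind predecessor */
        while (atomic_load(&me->locked))     /* spin on our own node only */
            sched_yield();
    }
}

static void mcs_unlock(mcs_lock_t *l, mcs_node_t *me) {
    mcs_node_t *succ = atomic_load(&me->next);
    if (succ == NULL) {
        mcs_node_t *expected = me;
        /* No visible successor: try to swing tail back to empty. */
        if (atomic_compare_exchange_strong(&l->tail, &expected,
                                           (mcs_node_t *)NULL))
            return;
        /* A successor did the swap but has not linked itself in yet. */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();
    }
    atomic_store(&succ->locked, false);      /* hand the lock over */
}

static mcs_lock_t mcs_demo_lock;   /* static storage: tail starts as NULL */
static long mcs_counter;           /* protected by mcs_demo_lock */

static void *mcs_worker(void *arg) {
    long iters = *(long *)arg;
    mcs_node_t me;                 /* one queue node per thread */
    for (long i = 0; i < iters; i++) {
        mcs_lock(&mcs_demo_lock, &me);
        mcs_counter++;
        mcs_unlock(&mcs_demo_lock, &me);
    }
    return NULL;
}

/* Runs nthreads workers (1..16) and returns the final counter value. */
long mcs_demo(int nthreads, long iters) {
    pthread_t t[16];
    assert(nthreads >= 1 && nthreads <= 16);
    mcs_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&t[i], NULL, mcs_worker, &iters);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(t[i], NULL);
    return mcs_counter;
}
```

Each waiter spins only on the `locked` field of its own node, which is why at most two processors ever touch one address: the waiter and the predecessor that releases it.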


Mellor-Crummey-Scott Lock (Con’t)

• Nice properties of MCS Lock
  – Never more than 2 processors spinning on one address
  – Completely fair – once on queue, are guaranteed to get your turn in FIFO order
    » Alternate release procedure doesn’t use compare&swap but doesn’t guarantee FIFO order
• Bad properties of MCS Lock
  – Takes longer (more instructions) than T&T&S if no contention
  – Releaser may be forced to spin in rare circumstances
• Hardware support?
  – Some proposed hardware queueing primitives such as QOLB (Queue on Lock Bit)
  – Not broadly available

Recall: Portable Semaphores in Linux

• Initialize general semaphore:

      struct semaphore my_sem;
      sema_init(&my_sem, count);     /* Initialize semaphore */

• Initialize mutex (semaphore with count = 1)

      static DECLARE_MUTEX(my_mutex);

• Acquire semaphore:

      down_interruptible(&my_sem);   // Acquire sem, sleeps with TASK_INTERRUPTIBLE state
      down(&my_sem);                 // Acquire sem, sleeps with TASK_UNINTERRUPTIBLE state
      down_trylock(&my_sem);         // Return ≠ 0 if lock busy

• Release semaphore:

      up(&my_sem);                   // Release lock

Recall: Portable (native) mutexes in Linux

• Interface optimized for mutual exclusion

      DEFINE_MUTEX(my_mutex);        /* Static definition */

      struct mutex my_mutex;
      mutex_init(&my_mutex);         /* Dynamic mutex init */

• Simple use pattern:

      mutex_lock(&my_mutex);
      /* critical section … */
      mutex_unlock(&my_mutex);

• Constraints
  – Same thread that grabs lock must release it
  – Process cannot exit while holding mutex
  – Recursive acquisitions not allowed
  – Cannot use in interrupt handlers (might sleep!)
• Advantages
  – Simpler interface
  – Debugging support for checking usage
• Prefer mutexes over semaphores unless mutex really doesn’t fit your pattern!

Recall: Completion Patterns

• One use pattern that does not fit mutex pattern:
  – Start operation in another thread/hardware container
  – Sleep until woken by completion of event
• Can be implemented with semaphores
  – Start semaphore with count of 0
  – Immediate down() – puts parent to sleep
  – Woken with up()
• More efficient: use “completions”:

      DECLARE_COMPLETION(my_comp);   /* Static definition */

      struct completion my_comp;
      init_completion(&my_comp);     /* Dynamic comp init */

• One or more threads to sleep on event:

      wait_for_completion(&my_comp); /* put thread to sleep */

• Wake up threads (can be in interrupt handler!)

      complete(&my_comp);
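The "semaphore with a count of 0" version of the completion pattern can be tried in user space. The sketch below is an analogue, not kernel code: it assumes POSIX unnamed semaphores (`sem_init`, available on Linux), with `sem_wait` standing in for `down()` and `sem_post` for `up()`. The names `completion_demo`, `child`, and `done` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

static sem_t done;     /* initialized to 0, so the first sem_wait() blocks */
static int result;     /* written by the child before it posts */

static void *child(void *arg) {
    result = *(int *)arg * 2;   /* the "operation" started in another thread */
    sem_post(&done);            /* up(): signal completion, wake the waiter */
    return NULL;
}

/* Starts the operation, sleeps until it completes, returns its result. */
int completion_demo(int input) {
    pthread_t t;
    int rc = sem_init(&done, 0, 0);          /* count of 0 */
    assert(rc == 0);
    rc = pthread_create(&t, NULL, child, &input);
    assert(rc == 0);
    (void)rc;

    /* down(): sleep until the child posts. Retry if a signal interrupts. */
    while (sem_wait(&done) == -1 && errno == EINTR)
        ;

    int r = result;             /* safe: the post happened after the write */
    pthread_join(t, NULL);
    sem_destroy(&done);
    return r;
}
```

With this sketch, `completion_demo(21)` returns 42 once the child has run. The kernel's completion API packages the same wait-then-wake handshake behind a dedicated type.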


Definition of Monitor

• Semaphores are confusing because dual purpose:
  – Both mutual exclusion and scheduling constraints
  – Cleaner idea: Use locks for mutual exclusion and condition variables for scheduling constraints
• Monitor: a lock and zero or more condition variables for managing concurrent access to shared data
  – Use of Monitors is a programming paradigm
• Lock: provides mutual exclusion to shared data:
  – Always acquire before accessing shared data structure
  – Always release after finishing with shared data
• Condition Variable: a queue of threads waiting for something inside a critical section
  – Key idea: allow sleeping inside critical section by atomically releasing lock at time we go to sleep
  – Contrast to semaphores: Can’t wait inside critical section

Programming with Monitors

• Monitors represent the logic of the program
  – Wait if necessary
  – Signal when change something so any waiting threads can proceed
• Basic structure of monitor-based program:

      lock
      while (need to wait) {      // Check and/or update state variables
          condvar.wait();         // Wait if necessary
      }
      unlock

      do something so no need to wait

      lock
      condvar.signal();           // Check and/or update state variables
      unlock
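The basic monitor structure above (lock, wait in a `while` loop, unlock; later lock, signal, unlock) can be sketched with POSIX threads. This is a minimal illustration under assumptions: the monitor guards a simple item count, and the names `item_monitor_t`, `monitor_put`, `monitor_take`, and `monitor_demo` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* A monitor: one lock, one condition variable, and the state they guard. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int items;
} item_monitor_t;

static item_monitor_t mon = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
};

static void monitor_put(item_monitor_t *m) {
    pthread_mutex_lock(&m->lock);
    m->items++;                          /* update state variable */
    pthread_cond_signal(&m->nonempty);   /* a waiter may now proceed */
    pthread_mutex_unlock(&m->lock);
}

static void monitor_take(item_monitor_t *m) {
    pthread_mutex_lock(&m->lock);
    while (m->items == 0)                /* re-check after every wakeup */
        pthread_cond_wait(&m->nonempty, &m->lock);  /* releases lock while asleep */
    m->items--;
    pthread_mutex_unlock(&m->lock);
}

static void *consumer(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        monitor_take(&mon);
    return NULL;
}

/* One consumer takes n items while the caller puts n; returns leftovers. */
int monitor_demo(int n) {
    pthread_t t;
    int rc = pthread_create(&t, NULL, consumer, &n);
    assert(rc == 0);
    (void)rc;
    for (int i = 0; i < n; i++)
        monitor_put(&mon);
    pthread_join(t, NULL);
    return mon.items;
}
```

The `while` loop around `pthread_cond_wait` matters: a woken thread must re-check the condition, because another thread may have consumed the item first or the wakeup may be spurious.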

Readers/Writers Problem

[Figure: several readers (R) and one writer (W) accessing a shared database]

• Motivation: Consider a shared database
  – Two classes of users:
    » Readers – never modify database
    » Writers – read and modify database
  – Is using a single lock on the whole database sufficient?
    » Like to have many readers at the same time
    » Only one writer at a time

Basic Readers/Writers Solution

• Correctness Constraints:
  – Readers can access database when no writers
  – Writers can access database when no readers or writers
  – Only one thread manipulates state variables at a time
• Basic structure of a solution:
  – Reader()
        Wait until no writers
        Access database
        Check out – wake up a waiting writer
  – Writer()
        Wait until no active readers or writers
        Access database
        Check out – wake up waiting readers or writer
  – State variables (Protected by a lock called “lock”):
    » int AR: Number of active readers; initially = 0
    » int WR: Number of waiting readers; initially = 0
    » int AW: Number of active writers; initially = 0
    » int WW: Number of waiting writers; initially = 0
    » Condition okToRead = NIL
    » Condition okToWrite = NIL


Code for a Reader

      Reader() {
          // First check self into system
          lock.Acquire();
          while ((AW + WW) > 0) {     // Is it safe to read?
              WR++;                   // No. Writers exist
              okToRead.wait(&lock);   // Sleep on cond var
              WR--;                   // No longer waiting
          }
          AR++;                       // Now we are active!
          lock.Release();
          // Perform actual read-only access
          AccessDatabase(ReadOnly);
          // Now, check out of system
          lock.Acquire();
          AR--;                       // No longer active
          if (AR == 0 && WW > 0)      // No other active readers
              okToWrite.signal();     // Wake up one writer
          lock.Release();
      }

Code for a Writer

      Writer() {
          // First check self into system
          lock.Acquire();
          while ((AW + AR) > 0) {     // Is it safe to write?
              WW++;                   // No. Active users exist
              okToWrite.wait(&lock);  // Sleep on cond var
              WW--;                   // No longer waiting
          }
          AW++;                       // Now we are active!
          lock.Release();
          // Perform actual read/write access
          AccessDatabase(ReadWrite);
          // Now, check out of system
          lock.Acquire();
          AW--;                       // No longer active
          if (WW > 0) {               // Give priority to writers
              okToWrite.signal();     // Wake up one writer
          } else if (WR > 0) {        // Otherwise, wake reader
              okToRead.broadcast();   // Wake all readers
          }
          lock.Release();
      }

pThreads Monitors (Mutex + Condition Vars)

• To create a mutex:

      pthread_mutex_t amutex = PTHREAD_MUTEX_INITIALIZER;   /* static init */
      pthread_mutex_init(&amutex, NULL);                     /* or dynamic init */

• To use it:

      pthread_mutex_lock(&amutex);
      pthread_mutex_unlock(&amutex);

• To create condition variables:

      pthread_cond_t mycondvar = PTHREAD_COND_INITIALIZER;  /* static init */
      pthread_cond_init(&mycondvar, NULL);                   /* or dynamic init */

• To use them:

      pthread_cond_wait(&mycondvar, &amutex);
      pthread_cond_signal(&mycondvar);
      pthread_cond_broadcast(&mycondvar);

C-Language Support for Synchronization

• C language: Pretty straightforward synchronization
  – Just make sure you know all the code paths out of a critical section

      int Rtn() {
          lock.acquire();
          …
          if (exception) {
              lock.release();
              return errReturnCode;
          }
          …
          lock.release();
          return OK;
      }

  – Watch out for setjmp/longjmp!
    » Can cause a non-local jump out of procedure
    » In example, procedure E calls longjmp, popping stack back to procedure B
    » If Procedure C had lock.acquire, problem!

[Figure: call stack growing from Proc A down to Proc E; Proc B calls setjmp, Proc C does lock.acquire … lock.release, Proc E calls longjmp]
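Putting the Reader/Writer pseudocode together with the pthreads calls gives the sketch below. It is one possible translation, not a definitive implementation: the "database" is assumed to be a pair of counters that a writer updates in two steps (so a reader overlapping a writer would see them differ), and `rw_demo`, `db_a`, `db_b`, and `violations` are hypothetical names. The state variables `AR`, `WR`, `AW`, `WW` and the two condition variables follow the slides.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t okToRead = PTHREAD_COND_INITIALIZER;
static pthread_cond_t okToWrite = PTHREAD_COND_INITIALIZER;
static int AR, WR, AW, WW;   /* active/waiting readers and writers */
static long db_a, db_b;      /* toy database: equal whenever no writer is active */
static int violations;       /* reads that saw db_a != db_b */

static void *reader(void *arg) {
    int iters = *(int *)arg;
    for (int i = 0; i < iters; i++) {
        pthread_mutex_lock(&lock);               /* check in */
        while (AW + WW > 0) {                    /* writers exist: wait */
            WR++;
            pthread_cond_wait(&okToRead, &lock);
            WR--;
        }
        AR++;
        pthread_mutex_unlock(&lock);

        int bad = (db_a != db_b);                /* read-only access */

        pthread_mutex_lock(&lock);               /* check out */
        AR--;
        violations += bad;
        if (AR == 0 && WW > 0)
            pthread_cond_signal(&okToWrite);     /* last reader wakes a writer */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

static void *writer(void *arg) {
    int iters = *(int *)arg;
    for (int i = 0; i < iters; i++) {
        pthread_mutex_lock(&lock);               /* check in */
        while (AW + AR > 0) {                    /* active users exist: wait */
            WW++;
            pthread_cond_wait(&okToWrite, &lock);
            WW--;
        }
        AW++;
        pthread_mutex_unlock(&lock);

        db_a++;                                  /* read/write access */
        db_b++;

        pthread_mutex_lock(&lock);               /* check out */
        AW--;
        if (WW > 0)
            pthread_cond_signal(&okToWrite);     /* priority to writers */
        else if (WR > 0)
            pthread_cond_broadcast(&okToRead);   /* otherwise wake all readers */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Returns the final db_a if every read was consistent, otherwise -1. */
long rw_demo(int nreaders, int nwriters, int iters) {
    pthread_t t[16];
    int n = nreaders + nwriters;
    assert(nreaders >= 0 && nwriters >= 0 && n >= 1 && n <= 16);
    AR = WR = AW = WW = violations = 0;
    db_a = db_b = 0;
    for (int i = 0; i < n; i++) {
        int rc = pthread_create(&t[i], NULL,
                                i < nreaders ? reader : writer, &iters);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < n; i++)
        pthread_join(t[i], NULL);
    return (violations == 0 && db_a == db_b) ? db_a : -1;
}
```

Note that this policy favors writers, so readers can starve while writers keep arriving; the demo terminates only because each thread does a bounded number of operations.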


C++ Language Support for Synchronization (con’t)

• Must catch all exceptions in critical sections
  – Catch exceptions, release lock, and re-throw exception:

      void Rtn() {
          lock.acquire();
          try {
              …
              DoFoo();
              …
          } catch (…) {            // catch exception
              lock.release();      // release lock
              throw;               // re-throw the exception
          }
          lock.release();
      }
      void DoFoo() {
          …
          if (exception) throw errException;
          …
      }

  – Even Better: auto_ptr facility. See C++ Spec.
    » Can deallocate/free lock regardless of exit method

Problem: Busy-Waiting for Lock

• Positives for this solution
  – Machine can receive interrupts
  – User code can use this lock
  – Works on a multiprocessor
• Negatives
  – This is very inefficient because the busy-waiting thread will consume cycles waiting
  – Waiting thread may take cycles away from thread holding lock (no one wins!)
  – Priority Inversion: If busy-waiting thread has higher priority than thread holding lock → no progress!
• Priority Inversion problem with original Martian rover
• For semaphores and monitors, waiting thread may wait for an arbitrary length of time!
  – Thus even if busy-waiting was OK for locks, definitely not ok for other primitives
  – Exam solutions should not have busy-waiting!

Busy-wait vs Blocking

• Busy-wait: i.e. spin lock
  – Keep trying to acquire lock until ready
  – Very low latency/processor overhead!
  – Very high system overhead!
    » Causing stress on network while spinning
    » Processor is not doing anything else useful
• Blocking:
  – If can’t acquire lock, deschedule process (i.e. unload state)
  – Higher latency/processor overhead (1000s of cycles?)
    » Takes time to unload/restart task
    » Notification mechanism needed
  – Low system overhead
    » No stress on network
    » Processor does something useful
• Hybrid:
  – Spin for a while, then block
  – 2-competitive: spin until have waited blocking time

What about barriers?

• Barrier – global (/coordinated) synchronization
  – simple use of barriers -- all threads hit the same one

      work_on_my_subgrid();
      barrier();
      read_neighboring_values();
      barrier();

  – barriers are not provided in all thread libraries
• How to implement barrier?
  – Global counter representing number of threads still waiting to arrive and parity representing phase
    » Initialize counter to zero, set parity variable to “even”
    » Each thread that enters saves parity variable and
      • Atomically increments counter if even
      • Atomically decrements counter if odd
    » If counter not at extreme value, spin until parity changes
      • i.e. Num threads if “even” or zero if “odd”
    » Else, flip parity, exit barrier
  – Better for large numbers of processors – implement atomic counter via combining tree
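The counter-plus-parity barrier described above can be sketched with C11 atomics. This is an illustration under assumptions rather than a reference implementation: it uses a single shared counter (no combining tree), the `sched_yield()` in the spin loop is an addition for machines with fewer cores than threads, and all names (`parity_barrier_t`, `barrier_wait`, `barrier_demo`, `slot`) are hypothetical. The demo follows the slide's usage pattern: write your own slot, barrier, read everyone's slot, barrier.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 16

typedef struct {
    atomic_int count;    /* arrivals so far; counts up, then back down */
    atomic_int parity;   /* 0 = "even" phase, 1 = "odd" phase */
    int nthreads;
} parity_barrier_t;

static void barrier_wait(parity_barrier_t *b) {
    int my_parity = atomic_load(&b->parity);   /* save the current phase */
    int last;
    if (my_parity == 0)   /* even: count up toward nthreads */
        last = (atomic_fetch_add(&b->count, 1) + 1 == b->nthreads);
    else                  /* odd: count down toward zero */
        last = (atomic_fetch_sub(&b->count, 1) - 1 == 0);
    if (last)
        atomic_store(&b->parity, !my_parity);  /* extreme value: flip, exit */
    else
        while (atomic_load(&b->parity) == my_parity)
            sched_yield();                     /* spin until parity changes */
}

static parity_barrier_t bar;
static int slot[MAX_THREADS];   /* slot[i] is written only by thread i */
static int rounds;
static atomic_int mismatches;

static void *worker(void *arg) {
    int me = *(int *)arg;
    for (int r = 1; r <= rounds; r++) {
        slot[me] = r;                          /* work_on_my_subgrid() */
        barrier_wait(&bar);
        for (int i = 0; i < bar.nthreads; i++) /* read_neighboring_values() */
            if (slot[i] != r)
                atomic_fetch_add(&mismatches, 1);
        barrier_wait(&bar);
    }
    return NULL;
}

/* Returns how many stale neighbor values were observed (expected: 0). */
int barrier_demo(int nthreads, int nrounds) {
    pthread_t t[MAX_THREADS];
    int ids[MAX_THREADS];
    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    atomic_store(&bar.count, 0);
    atomic_store(&bar.parity, 0);
    bar.nthreads = nthreads;
    rounds = nrounds;
    atomic_store(&mismatches, 0);
    for (int i = 0; i < nthreads; i++) {
        ids[i] = i;
        int rc = pthread_create(&t[i], NULL, worker, &ids[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&mismatches);
}
```

Alternating the counting direction by phase means the counter never needs a separate reset, so a fast thread that re-enters the barrier right after the flip cannot interfere with slower threads still leaving the previous phase.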


Summary

• Important concept: Atomic Operations
  – An operation that runs to completion or not at all
  – These are the primitives on which to construct various synchronization primitives
• Semaphores: Like integers with restricted interface
  – Two operations:
    » P(): Wait if zero; decrement when becomes non-zero
    » V(): Increment and wake a sleeping task (if exists)
    » Can initialize value to any non-negative value
  – Use separate semaphore for each constraint
• Monitors: A lock plus one or more condition variables
  – Always acquire lock before accessing shared data
  – Use condition variables to wait inside critical section
    » Three Operations: Wait(), Signal(), and Broadcast()
• Readers/Writers
  – Readers can access database when no writers
  – Writers can access database when no readers
  – Only one thread manipulates state variables at a time
