Advanced Operating Systems Structures and Implementation
CS194-24, Lecture 9
How to work in a group / Synchronization (finished)
February 24th, 2014
Prof. John Kubiatowicz
http://inst.eecs.berkeley.edu/~cs194-24

Goals for Today
• Tips for Programming in a Design Team
• Synchronization (continued)
  – Lock Free Synchronization
  – Monitors

Interactive is important! Ask Questions!

Note: Some slides and/or pictures in the following are adapted from slides ©2013

2/24/14 Kubiatowicz CS194-24 ©UCB Fall 2014 Lec 9.2

Recall: Synchronization
• Atomic Operation: an operation that always runs to completion or not at all
  – It is indivisible: it cannot be stopped in the middle and state cannot be modified by someone else in the middle
  – Fundamental building block: if there are no atomic operations, threads have no way to work together
• Synchronization: using atomic operations to ensure cooperation between threads
  – For now, only loads and stores are atomic
  – We are going to show that it’s hard to build anything useful with only reads and writes
• Critical Section: piece of code that only one thread can execute at once. Only one thread at a time will get into this section of code.
  – Critical section is the result of mutual exclusion
  – Critical section and mutual exclusion are two ways of describing the same thing.

Recall: Atomic Instructions
• test&set (&address) {            /* most architectures */
      result = M[address];
      M[address] = 1;
      return result;
  }
• swap (&address, register) {      /* x86 */
      temp = M[address];
      M[address] = register;
      register = temp;
  }
• compare&swap (&address, reg1, reg2) {   /* 68000 */
      if (reg1 == M[address]) {
          M[address] = reg2;
          return success;
      } else {
          return failure;
      }
  }
• load-linked&store conditional(&address) {   /* R4000, alpha */
      loop:
          ll   r1, M[address];     /* load-linked: read the current value */
          movi r2, 1;              /* Can do arbitrary comp */
          sc   r2, M[address];     /* store-conditional: r2 is 0 if the store failed */
          beqz r2, loop;           /* retry if the store failed */
  }
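As a rough user-space sketch of how the primitives above get used, the C11 code below builds a spin lock from test&set (via `atomic_flag_test_and_set_explicit`) and a lock-free increment from compare&swap (via `atomic_compare_exchange_weak`). The helper names (`spin_acquire`, `spin_release`, `cas_increment`, `worker`, `run_counters`) and the thread and iteration counts are made up for illustration. It assumes a C11 compiler with `<stdatomic.h>` and POSIX threads, and it is not meant to show how the Linux kernel implements its own spinlocks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NTHREADS 4        /* illustrative values, not from the lecture */
#define NITERS   100000

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;  /* clear means "lock is free" */
static long locked_counter;                       /* protected by lock_flag */
static atomic_long cas_counter;                   /* zero-initialized, updated without a lock */

/* test&set spin lock: keep setting the flag until the old value was "clear". */
static void spin_acquire(atomic_flag *f) {
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire)) {
        /* busy-wait: another thread holds the lock */
    }
}

static void spin_release(atomic_flag *f) {
    atomic_flag_clear_explicit(f, memory_order_release);
}

/* compare&swap retry loop: install old+1 only if nobody changed the value. */
static void cas_increment(atomic_long *p) {
    long old = atomic_load(p);
    while (!atomic_compare_exchange_weak(p, &old, old + 1)) {
        /* on failure, old now holds the current value; try again */
    }
}

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        spin_acquire(&lock_flag);
        locked_counter++;             /* critical section */
        spin_release(&lock_flag);

        cas_increment(&cas_counter);  /* lock-free update */
    }
    return NULL;
}

/* Runs the workers once; returns 1 if both counters saw every increment. */
static int run_counters(void) {
    pthread_t t[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return locked_counter == (long)NTHREADS * NITERS &&
           atomic_load(&cas_counter) == (long)NTHREADS * NITERS;
}
```

Calling `run_counters()` once from a `main` should return 1, since both the lock-protected counter and the CAS counter end at `NTHREADS * NITERS`. Taking the lock away from around `locked_counter++` will typically lose updates, which is the race that the critical section prevents.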
Recall: Portable Spinlock constructs in Linux
• Linux provides lots of synchronization constructs
  – We will highlight them throughout the term
• Example: Spin Lock support: Not recursive!
  – Only a lock on multiprocessors: Becomes simple preemption disable/enable on uniprocessors

      #include <linux/spinlock.h>
      DEFINE_SPINLOCK(my_lock);

      spin_lock(&my_lock);
      /* Critical section … */
      spin_unlock(&my_lock);

• Disable interrupts and grab lock (while saving and restoring state in case interrupts already disabled):

      DEFINE_SPINLOCK(my_lock);
      unsigned long flags;

      spin_lock_irqsave(&my_lock, flags);
      /* Critical section … */
      spin_unlock_irqrestore(&my_lock, flags);

Tips for Programming in a Project Team
• Big projects require more than one person (or long, long, long time)
  – Big OS: thousands of person-years!
• It’s very hard to make software project teams work correctly
  – Doesn’t seem to be as true of big construction projects
    » Empire State Building finished in one year: staging iron production thousands of miles away
    » Or the Hoover Dam: built towns to hold workers
  – Is it OK to miss deadlines?
    » We make it free (slip days)
    » Reality: they’re very expensive as time-to-market is one of the most important things!
“You just have to get your synchronization right!”

Big Projects
• What is a big project?
  – Time/work estimation is hard
  – Programmers are eternal optimists (it will only take two days)!
    » This is why we bug you about starting the project early
    » Had a grad student who used to say he just needed “10 minutes” to fix something. Two hours later…
• Can a project be efficiently partitioned?
  – Partitionable task decreases in time as you add people
  – But, if you require communication:
    » Time reaches a minimum bound
    » With complex interactions, time increases!
  – Mythical person-month problem:
    » You estimate how long a project will take
    » Starts to fall behind, so you add more people
    » Project takes even more time!

Techniques for Partitioning Tasks
• Functional
  – Person A implements threads, Person B implements semaphores, Person C implements locks…
  – Problem: Lots of communication across APIs
    » If B changes the API, A may need to make changes
    » Story: Large airline company spent $200 million on a new scheduling and booking system. Two teams “working together.” After two years, went to merge software. Failed! Interfaces had changed (documented, but no one noticed). Result: would cost another $200 million to fix.
• Task
  – Person A designs, Person B writes code, Person C tests
  – May be difficult to find right balance, but can focus on each person’s strengths (Theory vs systems hacker)
  – Since debugging is hard, Microsoft has two testers for each programmer
• Most Berkeley project teams are functional, but people have had success with task-based divisions

Communication
• More people mean more communication
  – Changes have to be propagated to more people
  – Think about person writing code for most fundamental component of system: everyone depends on them!
• Miscommunication is common
  – “Index starts at 0? I thought you said 1!”
• Who makes decisions?
  – Individual decisions are fast but trouble
  – Group decisions take time
  – Centralized decisions require a big picture view (someone who can be the “system architect”)
• Often designating someone as the system architect can be a good thing
  – Better not be clueless
  – Better have good people skills
  – Better let other people do work

Coordination
• More people means no one can make all meetings!
  – They miss decisions and associated discussion
  – Example from earlier class: one person missed meetings and did something group had rejected
  – Why do we limit groups to 5 people?
    » You would never be able to schedule meetings otherwise
  – Why do we require 4 people minimum?
    » You need to experience groups to get ready for real world
• People have different work styles
  – Some people work in the morning, some at night
  – How do you decide when to meet or work together?
• What about project slippage?
  – It will happen, guaranteed!
  – Ex: phase 4, everyone busy but not talking. One person way behind. No one knew until very end: too late!
• Hard to add people to existing group
  – Members have already figured out how to work together

How to Make it Work?
• People are human. Get over it.
  – People will make mistakes, miss meetings, miss deadlines, etc. You need to live with it and adapt
  – It is better to anticipate problems than clean up afterwards.
• Document, document, document
  – Why Document?
    » Expose decisions and communicate to others
    » Easier to spot mistakes early
    » Easier to estimate progress
  – What to document?
    » Everything (but don’t overwhelm people or no one will read)
  – Standardize!
    » One programming format: variable naming conventions, tab indents, etc.
    » Comments (Requires, effects, modifies): javadoc?

Suggested Documents for You to Maintain
• Project objectives: goals, constraints, and priorities
• Specifications: the manual plus performance specs
  – This should be the first document generated and the last one finished
  – Consider your Cucumber specifications as one possibility
• Meeting notes
  – Document all decisions
  – You can often cut & paste for the design documents
• Schedule: What is your anticipated timing?
  – This document is critical!
• Organizational Chart
  – Who is responsible for what task?

Use Software Tools
• Source revision control software
  – Git: check in frequently. Tag working code
  – Easy to go back and see history/undo mistakes
  – Work on independent branches for each feature
    » Merge working features into your master branch
    » Consider using “rebase” as well
  – Communicates changes to everyone
• Redmine
  – Make use of Bug reporting and tracking features!
  – Use Wiki for communication between teams
  – Also, consider setting up a forum to leave information for one another about the current state of the design
• Use automated testing tools
  – Rebuild from sources frequently
  – Run Cucumber tests frequently
    » Use tagging features to run subsets of tests!

Test Continuously
• Integration tests all the time, not at 11pm on due date!
  – Utilize Cucumber features to test frequently!
  – Write dummy stubs with simple functionality
    » Lets people test continuously, but more work
  – Schedule periodic integration tests
    » Get everyone in the same room, check out code, build, and test.
    » Don’t wait until it is too late!
    » This is exactly what the autograder does!
• Testing types:
  – Integration tests: Use of Cucumber with BDD
  – Unit tests: check each module in isolation (CUnit)
  – Daemons: subject code to exceptional cases
  – Random testing: Subject code to random timing changes
• Test early, test later, test again
  – Tendency is to test once and forget; what if something changes in some other part of the code?

Administrivia
• Recall: Midterm I: 2½ weeks from today!
  – Wednesday 3/12
  – Intention is a 1.5 hour exam over 3 hours
  – No class on day of exam!
• Midterm Timing:
  – Probably 4:00-7:00PM in 320 Soda Hall
  – Does this work? (Still confirming room)
• Topics: everything up to the previous Monday
  – OS Structure, BDD, Process support, Synchronization, Memory Management, File systems

Computers in the News: Net Neutrality?
• Net Neutrality:
  – “The principle that Internet service providers and governments should treat all data on the Internet equally, not discriminating or charging differentially by user, content, site, platform, application, type of attached equipment, and modes of communication.” From Wikipedia
• Do we have Net Neutrality?
  – FCC has issued “rules” or “policies”, but these do not have a lot of teeth
  – Supporters want internet service providers to be declared “common carriers”
    » A common carrier holds itself out to provide service to the general public without discrimination for the "public convenience and necessity".