OS, Base & Bounds, Virtual Memory Intro

Instructor: Steven Ho
7/30/2018 CS61C Su18 - Lecture 22

Review
• Warehouse Scale Computers
  – Support many of the applications we have come to depend on
  – Software must cope with failure, load variation, and latency/bandwidth limitations
  – Hardware sensitive to cost and energy efficiency
• Request Level Parallelism
  – High request volume, each request largely independent
  – Replication for better throughput, availability
• MapReduce
  – Convenient data-level parallelism on large datasets across a large number of machines
  – Spark is a framework for executing MapReduce algorithms

Word Count in Spark’s Python API
    # RDD: primary abstraction of a distributed collection of items
    file = sc.textFile("hdfs://…")
    # Two kinds of operations:
    #   Actions: RDD → Value
    #   Transformations: RDD → RDD
    #   e.g. flatMap, map, reduceByKey
    file.flatMap(lambda line: line.split()) \
        .map(lambda word: (word, 1)) \
        .reduceByKey(lambda a, b: a + b)

MapReduce Word Count Example
• Map phase: (doc name, doc contents) → list(word, count)
    // "I do I learn" → [("I",1),("do",1),("I",1),("learn",1)]
    map(key, value):
      for each word w in value:
        emit(w, 1)
• Reduce phase: (word, list(count)) → (word, count_sum)
    // ("I", [1,1]) → ("I",2)
    reduce(key, values):
      result = 0
      for each v in values:
        result += v
      emit(key, result)

Agenda
• OS Intro
• Administrivia
• OS Boot Sequence and Operation
• Multiprogramming/time-sharing
• Introduction to Virtual Memory

CS61C so far…
[Figure: layers covered so far: C Programs, RISC-V Assembly, CPU, Caches, Memory, alongside Labs and Projects 1-3]

So how is this any different?
[Figure: Screen, Keyboard, Storage]

Adding I/O
[Figure: the same layers as before (C Programs, RISC-V Assembly, CPU, Caches, Memory), now with I/O (Input/Output): Screen, Keyboard, Storage]

Raspberry Pi ($40 on Amazon)
[Figure: board with labeled parts]
• CPU+$s, etc.
• Memory
• Serial I/O (USB)
• Storage I/O (Micro SD Card)
• Network I/O (Ethernet)
• Screen I/O (HDMI)

It’s a real computer!

But wait…
• That’s not the same! When we run Venus, it only executes one program and then stops.
• When I switch on my computer, I get this: [image not reproduced]
• Yes, but that’s just software! The Operating System (OS)

Well, “just software”
• The biggest piece of software on your machine?
• How many lines of code? These are guesstimates: 84 million lines of code!
[Figure: Codebases (in millions of lines of code). CC BY-NC 3.0, David McCandless © 2015, http://www.informationisbeautiful.net/visualizations/million-lines-of-code/]

What does the OS do?
• One of the first things that runs when your computer starts (right after firmware/bootloader)
• Loads, runs and manages programs:
  – Multiple programs at the same time (time-sharing)
  – Isolate programs from each other (isolation)
  – Multiplex resources between applications (e.g., devices)
• Services: File System, Network stack, etc.
• Finds and controls all the devices in the machine in a general way (using “device drivers”)

Administrivia
• Proj4 due on Friday (8/03)
  – Project Party tonight, Soda 405/411, 4-6pm
• HW6 due tonight, HW7 released
• Guerilla Session on Wed. @ Cory 540AB
• The final will be 8/09 7-10PM @ VLSB 2040/2060!
• We’re almost there! :D

What happens at boot?
• When the computer switches on, it does the same as Venus: the CPU executes instructions from some start address (stored in Flash ROM)
[Figure: CPU next to its address space, with the Flash ROM memory mapped]
    PC = 0x2000 (some default value)
    0x2000: addi t0, x0, 0x1000
            lw t0, 4(s0)
            …
    (Code to copy firmware into regular memory and jump into it)
1. BIOS: Find a storage device and load the first sector (block of data)
2. Bootloader (stored on, e.g., disk): Load the OS kernel from disk into a location in memory and jump into it.
3. OS Boot: Initialize services, drivers, etc.
4. Init: Launch an application that waits for input in a loop (e.g., Terminal/Desktop/...)

Launching Applications
• Applications are called “processes” in most OSs.
• Created by another process calling into an OS routine (using a “syscall”, more details later).
  – Depends on the OS, but Linux uses fork (see OpenMP threads) to create a new process, and execve to load the application.
• Loads the executable file from disk (using the file system service), puts instructions & data into memory (.text, .data sections), and prepares the stack and heap.
• Sets argc and argv, then jumps into the main function.

Supervisor Mode
• If something goes wrong in an application, it can crash the entire machine. What about malware, etc.?
• The OS may need to enforce resource constraints on applications (e.g., access to devices).
• To protect the OS from the application, CPUs have a supervisor mode bit (also need isolation, more later).
  – In user mode (i.e., not in supervisor mode), you can only access a subset of instructions and (physical) memory.
  – You can change out of supervisor mode using a special instruction, but not into it (unless there is an interrupt).

Syscalls
• How to switch back to the OS? The OS sets a timer interrupt; when the interrupt triggers, the CPU drops into supervisor mode.
• What if we want to call into an OS routine? (e.g., to read a file, launch a new process, send data, etc.)
  – Need to perform a syscall: set up function arguments in registers, and then raise a software interrupt
  – The OS will perform the operation and return to user mode
• This way, the OS can mediate access to all resources, including devices, the CPU itself, etc.

Syscalls in Venus
• Venus provides many simple syscalls using the ecall RISC-V instruction
• How to issue a syscall?
  – Place the syscall number in a0
  – Place arguments to the syscall in the a1 register
  – Issue the ecall instruction
• This is how your RISC-V code has been able to produce output all along
• ecall details depend on the ABI (Application Binary Interface)

Example Syscall
• Let’s say we want to print an integer stored in s3:
  – Print integer is syscall #1
    li a0, 1
    add a1, s3, x0
    ecall

Venus’s Environmental Calls
[Table of Venus environment calls not reproduced]

Multiprogramming
• The OS runs multiple applications at the same time.
• But not really (unless there is a core per process)
• Switches between processes very quickly. This is called a “context switch”.
• When jumping into a process, set a timer interrupt.
  – When it expires, store PC, registers, etc. (process state).
  – Pick a different process to run and load its state.
  – Set timer, change to user mode, jump to the new PC.
• Deciding what process to run is called scheduling.
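
The timer-driven switching described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not how a real kernel is written: the Process class, the run_round_robin function, and the round-robin policy are all hypothetical choices made for this example, and the "PC" is just a counter of instructions executed so far.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Process:
    """Hypothetical saved process state (stands in for PC, registers, etc.)."""
    name: str
    remaining: int  # instructions this process still needs to execute
    pc: int = 0     # saved "program counter": instructions executed so far


def run_round_robin(processes, time_slice):
    """Simulate context switches; return (name, saved pc) at each switch."""
    ready = deque(processes)   # processes waiting for the CPU
    trace = []
    while ready:
        proc = ready.popleft()                 # scheduling: pick the next process
        ran = min(time_slice, proc.remaining)  # run until the timer expires or it finishes
        proc.pc += ran                         # store the process state at the switch
        proc.remaining -= ran
        trace.append((proc.name, proc.pc))
        if proc.remaining > 0:
            ready.append(proc)                 # not done: back of the queue
    return trace


trace = run_round_robin(
    [Process("A", 5), Process("B", 2), Process("C", 4)], time_slice=2
)
print(trace)
```

Round-robin is only one possible scheduling policy; it was picked here because it makes the order of context switches easy to follow by hand.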
