CS162 Operating Systems and Systems Programming
Lecture 17: Data Storage to File Systems
October 29, 2019
Prof. David Culler
http://cs162.eecs.Berkeley.edu
Read: A&D Ch 12, 13.1-3.2
10/29/19 CS162 © UCB Fa19

Recall: OS Storage Abstractions: Key Unix I/O Design Concepts
• Uniformity: everything is a file
– File operations, device I/O, and interprocess communication all go through open, read/write, close
– Allows simple composition of programs: find | grep | wc …
• Open before use
– Provides opportunity for access control and arbitration
– Sets up the underlying machinery, i.e., data structures
• Byte-oriented
– Even if blocks are transferred, addressing is in bytes
• Kernel buffered reads
– Streaming and block devices look the same; read blocks, yielding the processor to other tasks
• Kernel buffered writes
– Completion of an outgoing transfer is decoupled from the application, allowing it to continue
• Explicit close

What's below the surface?
[Layered diagram: Application / Service; High Level I/O (streams); Low Level I/O (handles); Syscall (registers); File System (descriptors); I/O Driver (commands and data transfers); Disks, Flash, Controllers, DMA. A file descriptor number is an int; behind it, the kernel keeps a file descriptor struct with all the info about the file.]

The File System Abstraction
• File
– Named collection of data in a file system
– POSIX file data: sequence of bytes; could be text, binary, serialized objects, …
– File metadata: information about the file
» Size, modification time, owner, security info
» Basis for access control
• Directory
– "Folder" containing files & directories
– Hierarchical (graphical) naming
» A path through the directory graph uniquely identifies a file or directory
» e.g., /home/ff/cs162/public_html/fa14/index.html
– Links and volumes (later)

Review: Memory-Mapped Display Controller
• Memory-mapped: hardware maps control registers and display memory into the physical address space
» Addresses set by HW jumpers or at boot time
– Simply writing to display memory (also called the "frame buffer") changes the image on screen
» Addr: 0x8000F000-0x8000FFFF
– Writing a graphics description to the command queue
» Say, a set of triangles describing some scene
» Addr: 0x80010000-0x8001FFFF
– Writing to the command register may cause the on-board graphics hardware to do something
» Say, render the above scene
» Addr: 0x0007F004
• Can protect with address translation
[Diagram: physical address space with the Graphics Command Queue at 0x80010000 (up to 0x80020000), Display Memory at 0x8000F000, the Command register at 0x0007F004, and the Status register at 0x0007F000.]

Review: Transferring Data To/From Controller
• Programmed I/O:
– Each byte transferred via processor in/out or load/store
– Pro: simple hardware, easy to program
– Con: consumes processor cycles proportional to data size
• Direct Memory Access (DMA):
– Give the controller access to the memory bus
– Ask it to transfer data blocks to/from memory directly
• [Figure: six-step sample interaction with a DMA controller, from the OSC book]

Recall: I/O Device Notifying the OS
• The OS needs to know when:
– The I/O device has completed an operation
– The I/O operation has encountered an error
• I/O interrupt:
– Device generates an interrupt whenever it needs service
– Pro: handles unpredictable events well
– Con: interrupts have relatively high overhead
• Polling:
– OS periodically checks a device-specific status register
» I/O device puts completion information in the status register
– Pro: low overhead
– Con: may waste many cycles on polling if I/O operations are infrequent or unpredictable
• Actual devices combine both polling and interrupts
– For instance, a high-bandwidth network adapter:
» Interrupt for the first incoming packet
» Poll for following packets until hardware queues are empty

Recall: Device Drivers
• Device driver: device-specific code in the kernel that interacts directly with the device hardware
– Supports a standard, internal interface
– Same kernel I/O system can interact easily with different device drivers
– Special device-specific configuration supported with the ioctl() system call
• Device drivers are typically divided into two pieces:
– Top half: accessed in the call path from system calls
» Implements a set of standard, cross-device calls like open(), close(), read(), write(), ioctl(), strategy()
» This is the kernel's interface to the device driver
» Top half will start I/O to the device; may put the thread to sleep until finished
– Bottom half: runs as interrupt routine
» Gets input or transfers the next block of output
» May wake sleeping threads if I/O is now complete

Recall: Kernel Device Structure
[Diagram: the System Call Interface sits atop five subsystems: Process Management (concurrency, multitasking), Memory Management (virtual memory), Filesystems (files and dirs: the VFS; file system types; block devices), Device Control (TTYs and device access; device control code; drivers), and Networking (connectivity; network subsystem; IF drivers), with architecture-dependent code and the memory manager underneath.]

Review: Life Cycle of an I/O Request
[Diagram: User Program → Kernel I/O Subsystem → Device Driver Top Half → Device Driver Bottom Half → Device Hardware, and back up.]

Basic I/O Performance Concepts
• Response time or latency: time to perform an operation
• Bandwidth or throughput: rate at which operations are performed (op/s)
– Files: MB/s; networks: Mb/s; arithmetic: GFLOP/s
• Startup or "overhead": time to initiate an operation
• Most I/O operations are roughly linear in b bytes:
– Latency(b) = Overhead + b / TransferCapacity

Example (Fast Network or SSD)
• Consider a 1 Gb/s link (B = 125 MB/s) with startup cost S = 1 ms
– Latency(b) = S + b/B
– Bandwidth = b/(S + b/B) = B*b/(B*S + b) = B/(B*S/b + 1)
• Half-power bandwidth: B/(B*S/b + 1) = B/2
– The half-power point occurs at b = S*B = 125,000 bytes

Example: at 10 ms Startup (like Disk)
[Figure: latency (µs) and bandwidth (MB/s) of a 1 Gb/s link with 10 ms startup, versus transfer length b; the half-power point is b = 1,250,000 bytes.]

What Determines Peak BW for I/O?
• Bus speed
– PCI-X: 1064 MB/s = 133 MHz x 64 bit (per lane)
– Ultra Wide SCSI: 40 MB/s
– Serial Attached SCSI & Serial ATA & IEEE 1394 (FireWire): 1.6 Gb/s full duplex (200 MB/s)
– USB 3.0: 5 Gb/s
– Thunderbolt 3: 40 Gb/s
• Device transfer bandwidth
– Rotational speed of disk
– Write/read rate of NAND flash
– Signaling rate of network link
• Whatever is the bottleneck in the path…

The Amazing Magnetic Disk
• Unit of transfer: sector
– A ring of sectors forms a track
– A stack of tracks forms a cylinder
– Heads position on cylinders
• Disk tracks are ~1 µm (micron) wide
– Wavelength of light is ~0.5 µm
– Resolution of the human eye: 50 µm
– ~100K tracks on a typical 2.5" disk
• Tracks are separated by unused guard regions
– Reduces the likelihood that neighboring tracks are corrupted during writes (still a small non-zero chance)
[Diagram: spindle, platters, surfaces, tracks, sectors, arm assembly, heads, motor.]

The Amazing Magnetic Disk
• Track length varies across the disk
– Outside: more sectors per track, higher bandwidth
– Disk is organized into regions of tracks with the same # of sectors/track
– Only the outer half of the radius is used
» Most of the disk area is in the outer regions of the disk
• Disks are so big that some companies (like Google) reportedly use only part of the disk for active data
– The rest is archival data

Shingled Magnetic Recording (SMR)
• Overlapping tracks yield greater density, capacity
• Restrictions on writing, complex DSP for reading
• Examples: Seagate (8 TB), Hitachi (10 TB)

Magnetic Disks: Performance
• Cylinder: all the tracks under the heads at a given arm position, across all surfaces
• Reading/writing data is a three-stage process:
– Seek time: position the head/arm over the proper track
– Rotational latency: wait for the desired sector to rotate under the r/w head
– Transfer time: transfer a block of bits (sector) under the r/w head
• Seek time = 4-8 ms; one rotation = 8-17 ms (7200-3600 RPM)

Magnetic Disks: Queuing
• Disk latency = queueing time + controller time + seek time + rotation time + transfer time
[Diagram: software (device driver) → request queue → controller hardware → media time (seek + rotation + transfer) → result.]

Typical Numbers for a Magnetic Disk
• Space/Density: 14 TB (Seagate), 8 platters, in a 3½ inch form factor! Areal density ≥ 1 Terabit/square inch (PMR, helium, …)
• Average seek time: typically 4-6 milliseconds.