TCSS 422: Operating Systems [Fall 2016] 12/7/2016 Institute of Technology, UW-Tacoma

TCSS 422: Operating Systems [Fall 2016] 12/7/2016 Institute of Technology, UW-Tacoma TCSS 422: OPERATING SYSTEMS OBJECTIVES ° Introduction to RAID File Systems and ° File systems – structure RAID ° File systems – inodes Wes J. Lloyd ° File systems - indexing Institute of Technology University of Washington - Tacoma TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.2 Institute of Technology, University of Washington - Tacoma RAID RAID CONTROLLER ° Redundant array of inexpensive disks (RAID) ° Special hardware RAID controllers offer ° Enable multiple disks to be grouped together to: ° Microcontroller ° Provides firmware to direct RAID operation ° Provide the illusion of one giant disk ° Volatile memory ° For performance improvements ° Striping: For mirrored disks we can increase read speeds splitting ° Non-volatile memory read transactions to run in parallel across two physical disks ° Buffers writes in case of power loss ° For redundancy ° Specialized logic to perform parity calculations ° Mirroring: duplication of data TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.3 December 7, 2016 L22.4 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma RAID LEVEL 0 - STRIPING RAID LEVEL 1 - MIRRORING ° RAID Level 0: Simplest form ° RAID 1 tolerates HDD failure ° Stripe blocks across disk in a round-robin fashion ° Two copies of each block across disks ° Excellent performance and capacity ° Capacity ° Capacity is equal to the sum of all disks ° Performance ° R/W are distributed in round-robin fashion across all disks ° Reliability ° RAID 10 (RAID 1 + RAID 0): Mirror then stripe data ° No redundancy ° RAID 01 (RAID 0 + RAID 1): Stripe then mirror data TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.5 December 7, 2016 L22.6 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma Slides by Wes J. Lloyd L22.1 TCSS 422: Operating Systems [Fall 2016] 12/7/2016 Institute of Technology, UW-Tacoma RAID 1 - EVALUATION RAID 5 – PARITY DISK ° Capacity: RAID 1 is expensive ° Raid 5 – trades off space requirement for redundancy ° The useful capacity is n/2 ° In a 5-disk array, you can only recover from the loss of 1 HDD ° Writes rotate across bit, distributing a parity bit ° Reliability: RAID-1 does well ° To rebuild data blocks you only need data from 4 disks ° Can tolerate the loss of disk(s) ° Any drive can fail, as long as it is only 1 ° Up to n/2 disk failures can be tolerated depending on which RAID fails ° Only need: 3 blocks + 1 parity block -or- ° Performance: RAID-1 is slow at writing 4 blocks ° Must wait for writes to complete to all disk(s) °RAID is not a backup! TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.7 December 7, 2016 L22.8 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma RAID 5 – EVALUATION RAID COMPARISON ° Capacity: Useful capacity is (n-1) disks ° A HDD must be dedicated as a parity disk ° Performance ° Writes are very slow: roughly = n/4 ° Reads are equivalent to a single disk ° Reliability ° In RAID 5, a disk may fail, and the RAID keeps running ° Rebuilds are slow !!! ° Depending on disk size 8-24 hours is not unheard of ° RAID 6: Adds a second parity disk for increased resilience TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.9 December 7, 2016 L22.10 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma FILE SYSTEMS ° Implemented by the OS as pure software ° Provide: ° Data structures: to describe disk content ° Arrays of blocks, index-nodes, trees FILESYSTEMS ° Access methods: provides mapping for OS calls open(), read(), write(), etc. ° Which structures are read? written? For each call? ° How efficiently does the structure support file operations? ° Many available file systems (A-Z) TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 December 7, 2016 L22.12 Institute of Technology, University of Washington - Tacoma L22.11 Institute of Technology, University of Washington - Tacoma Slides by Wes J. Lloyd L22.2 TCSS 422: Operating Systems [Fall 2016] 12/7/2016 Institute of Technology, UW-Tacoma FILE SYSTEMS - 2 FILE SYSTEM ORGANIZATION ° Numerous file systems abound (A-Z) ° Disk is divided into blocks ° ADFA, AdvFS, AFS, AFS, AosFS, AthFS, BFS, BFS, Btrfs, CFS, ° Block size supported by most HDDs is 512 bytes CMDFS, CP/M, DDFS, DTFS, DOS 3.x, EAFS, EDS, ext, etx2, etx3, ext4, ext3cow, FAT, VFAT, FATX, FFS, Fossil, Files-11, ° Typical FS block size is 4 KB Felx, HFS, HPFS, HTFS, IceFS, ISO 9660, JFS, JXFS, Lisa FS, LFS, MFS, Minix FS, NILFS, NTFS, NetWare FS, OneFS, OFS, OS- 9, PFS, ProDOS, Qnx5fs, Qnx6fs, ReFS, ReiserFS, Reiser4, ° An instance of a file system is typically called a partition Reliance, Reliance Nitro, RFS, S51K, SkyFS, SFS, Soup (Apple), SpadFS, STL, TRFS, Tux3, UDF, UFS, UFS2, VxFS, VLIR, WAFL, ° A single physical disk can have multiple partitions (file XFS, FS, ZFS systems) TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.13 December 7, 2016 L22.14 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma FILE SYSTEM EXAMPLE FILE SYSTEM STORAGE ° Consider a 64 block (4096KB block size) ° File system is stored using blocks on the disk disk, aka. a 256 KB disk ° This is considered a “reserved” region of the disk ° Corruption of the reserved region can destroy the file ° Legacy low density 5-¼“ floppy had tables causing data on the disk to by unaddressable 160KB single side, 360KB double sided capacity ° File system tracks: ° Which blocks comprise a file ° Where the blocks reside (are they contiguous?) ° The size of files ° The owner of files ° File permissions (e.g. R/W/X) TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.15 December 7, 2016 L22.16 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma FILE SYSTEM LOCATION FILE SYSTEM EXAMPLES - 2 ° Below the reserved region is at the front on a 64-block disk ° Consider 256kb disk, with 56 free data blocks partition ° i-node size 256 bytes each ° There are 56 blocks to track using “index-nodes” (inodes) ° 4KB block can contain 16 inodes ° Minimum of 4 – 4KB blocks required ° Here reserve 5 – 4KB blocks for files ° Provides some spare inodes TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.17 December 7, 2016 L22.18 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma Slides by Wes J. Lloyd L22.3 TCSS 422: Operating Systems [Fall 2016] 12/7/2016 Institute of Technology, UW-Tacoma FILE SYSTEM - FREE LIST SUPERBLOCK ° Allocation structures: ° Contains information about the file system, “S”: ° “Free list” of free inodes and blocks ° How many inodes? ° Example stores free list using bitmaps ° How many data blocks? ° Array of bits indicate if inode or FS block is in use (0/1) ° Location of inode table ° Inode bitmap: 80 bits for inode table ° File system identity code(s) ° Data bitmap: 56 bits for data blocks TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.19 December 7, 2016 L22.20 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma INODE EXAMPLE INODE EXAMPLE - 2 ° Every inode has an inode number (index value) ° Disks are addressed by sectors, not bytes ° Based on inode number, disk location can be calculated ° Disk stores large number of addressable sectors ° Example: inodeWhat number=32 is the inode location? ° Offset into 12KBinode region + (32 = 32x 256) ° Size of inode=256 bytes ° Inode number12KB x inode + 8KB size = =8192 20 KB ° Inode location = inode start addr + (inode no. x inode size) TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.21 December 7, 2016 L22.22 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma INODE EXAMPLE - 3 INODES – EXT2 LINUX FS ° Inodes store all information about a file: ° File type (e.g. directory, file, other) ° Size, and the number of blocks allocated to a file on disk ° R/W/X permissions ° Time information TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.23 December 7, 2016 L22.24 Institute of Technology, University of Washington - Tacoma Institute of Technology, University of Washington - Tacoma Slides by Wes J. Lloyd L22.4 TCSS 422: Operating Systems [Fall 2016] 12/7/2016 Institute of Technology, UW-Tacoma MULTI-LEVEL INDEX MULTI-LEVEL INDEX - 2 ° Inodes use multi-level index ° First level: include 12-direct block pointers ° Second level: include 1 indirect block pointer ° Points to an entire block (4096 KB / 4 bytes) = 1,024 block pointers ° Indirect pointer : one level ° Maximum file size ° (12 + 1,024) * 4KB = (1,036 x 4KB) = 4,144 KB TCSS422: Operating Systems [Fall 2016] TCSS422: Operating Systems [Fall 2016] December 7, 2016 L22.25 December 7, 2016 L22.26 Institute of Technology, University

TCSS 422: Operating Systems [Fall 2016] 12/7/2016 Institute of Technology, UW-Tacoma

CERIAS Tech Report 2017-5 Deceptive Memory Systems by Christopher N

Ext4 File System and Crash Consistency

CS 152: Computer Systems Architecture Storage Technologies

Filesystem Considerations for Embedded Devices ELC2015 03/25/15

State-Of-The-Art Garbage Collection Policies for NILFS2

The Ongoing Evolution of Ext4 File System

NOVA: the Fastest File System for Nvdimms

Provenance-Based Computing

I/O Stack Optimization for Smartphones

Android Boot Optimization on Dra7xx Devices (Rev. A)

(12) United States Patent (10) Patent No.: US 9,460,110 B2 Klughart (45) Date of Patent: *Oct

Filesystem Support for Continuous Snapshotting