TCSS 422: Operating Systems [Winter 2017] 03/08/2017 Institute of Technology, UW-Tacoma

TCSS 422: OPERATING SYSTEMS

File Systems and RAID

Wes J. Lloyd
Institute of Technology
University of Washington - Tacoma

OBJECTIVES

° Chapter 38, 39
° Introduction to RAID
° File systems – structure
° File systems –
° File systems – indexing

TCSS422: Operating Systems [Winter 2017] March 8, 2017 L18.2 Institute of Technology, University of Washington - Tacoma

RAID

° Redundant array of inexpensive disks (RAID)
° Enables multiple disks to be grouped together to:
  ° Provide the illusion of one giant disk
  ° For performance improvements
    ° Striping: data is spread across disks so requests can run in parallel. For mirrored disks, read speed can also be increased by splitting read transactions to run in parallel across two physical disks
  ° For redundancy
    ° Mirroring: duplication of data

RAID CONTROLLER

° Special hardware RAID controllers offer
  ° Microcontroller
    ° Provides firmware to direct RAID operation
  ° Volatile memory
  ° Non-volatile memory
    ° Buffers writes in case of power loss
  ° Specialized logic to perform parity calculations


RAID LEVEL 0 - STRIPING

° RAID Level 0: Simplest form
° Stripe blocks across disks in a round-robin fashion
° Excellent performance and capacity
° Capacity
  ° Capacity is equal to the sum of all disks
° Performance
  ° R/W are distributed in round-robin fashion across all disks
° Reliability
  ° No redundancy

RAID LEVEL 1 - MIRRORING

° RAID 1 tolerates HDD failure
° Two copies of each block across disks
° RAID 10 (RAID 1 + RAID 0): Mirror then stripe data
° RAID 01 (RAID 0 + RAID 1): Stripe then mirror data
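The round-robin striping described for RAID 0 can be sketched as a simple address mapping. This is a minimal illustration, assuming a chunk size of one block; the function name `raid0_locate` is hypothetical and not taken from the slides.

```python
def raid0_locate(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes a chunk size of one block, so consecutive logical blocks
    land on consecutive disks in round-robin order.
    """
    if num_disks < 1 or logical_block < 0:
        raise ValueError("need at least one disk and a non-negative block")
    return logical_block % num_disks, logical_block // num_disks


# Show where the first eight logical blocks land on a 4-disk array.
for block in range(8):
    disk, offset = raid0_locate(block, 4)
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Because neighboring blocks sit on different disks, a large sequential transfer keeps every disk busy at once, which is where the performance benefit comes from.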

Slides by Wes J. Lloyd

RAID 1 - EVALUATION

° Capacity: RAID 1 is expensive
  ° The useful capacity is n/2
° Reliability: RAID-1 does well
  ° Can tolerate the loss of disk(s)
  ° Up to n/2 disk failures can be tolerated, depending on which disks fail
° Performance: RAID-1 is slow at writing
  ° Must wait for writes to complete to all disk(s)

RAID 5 – PARITY DISK

° RAID 5 – trades off space requirement for redundancy
° In a 5-disk array, you can only recover from the loss of 1 HDD
° 5 disk RAID 5: Capacity is 80% of 5 disks
° Writes rotate across disks, distributing the parity blocks
° To rebuild data blocks you only need data from 4 disks
° Any drive can fail, as long as it is only 1
° Only need:
  ° 3 blocks + 1 parity block -or- 4 blocks
° RAID is not a backup!
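The "3 blocks + 1 parity block" rebuild can be illustrated with XOR parity, the usual way parity is computed for RAID 5. The sketch below assumes one stripe of four data blocks on a 5-disk array; the block contents and the helper name `xor_blocks` are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 5-disk array: four data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Simulate losing the disk that held data[2].
survivors = data[:2] + data[3:]

# XOR of the three surviving data blocks and the parity block
# reproduces the missing block.
rebuilt = xor_blocks(survivors + [parity])
print(rebuilt == data[2])
```

If the failed disk held the parity block instead, the four surviving data blocks are all that is needed, and parity can be recomputed from them.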


RAID 5 – EVALUATION

° Capacity: Useful capacity is (n-1) disks
  ° One HDD's worth of capacity must be dedicated to parity
° Performance
  ° Writes are very slow: small random writes achieve roughly n/4 times single-disk throughput
  ° Reads are equivalent to a single disk
° Reliability
  ° In RAID 5, a disk may fail, and the RAID keeps running
  ° Rebuilds are slow!
  ° Depending on disk size, 8-24 hours is not unheard of
° RAID 6: Adds a second parity disk for increased resilience

RAID COMPARISON

[Table: RAID level comparison]
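The capacity figures quoted for each RAID level can be compared side by side. This is a rough sketch assuming n identical disks; the function name `useful_disks` is hypothetical, and the RAID 6 figure of n-2 follows from its second parity disk.

```python
def useful_disks(level: int, n: int) -> float:
    """Return useful capacity, measured in disks, for an n-disk array."""
    if level == 0:
        return n          # striping only, no redundancy
    if level == 1:
        return n / 2      # every block is stored twice
    if level == 5:
        return n - 1      # one disk's worth of parity
    if level == 6:
        return n - 2      # two disks' worth of parity
    raise ValueError(f"RAID level {level} is not covered by this sketch")


n = 5
for level in (0, 5, 6):
    fraction = useful_disks(level, n) / n
    print(f"RAID {level} with {n} disks: {fraction:.0%} usable")
```

For the 5-disk RAID 5 case this gives 4 of 5 disks, matching the 80% figure above.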


FILE SYSTEMS

° Implemented by the OS as pure software
° Provide:
  ° Data structures: to describe disk content
    ° Arrays of blocks, index-nodes, trees
  ° Access methods: provide the mapping for OS calls open(), read(), write(), etc.
    ° Which structures are read? Written? For each call?
    ° How efficiently does the structure support file operations?
° Many available file systems (A-Z)


FILE SYSTEMS - 2

° Numerous file systems abound (A-Z)
° ADFA, AdvFS, AFS, AFS, AosFS, AthFS, BFS, BFS, CFS, CMDFS, CP/M, DDFS, DTFS, DOS 3.x, EAFS, EDS, ext, ext2, ext3, FAT, VFAT, FATX, FFS, Files-11, Felx, HFS, HPFS, HTFS, IceFS, ISO 9660, JFS, JXFS, Lisa FS, LFS, MFS, Minix FS, NILFS, NTFS, NetWare FS, OneFS, OFS, OS-9, PFS, ProDOS, Qnx5fs, Qnx6fs, ReFS, ReiserFS, Reliance, Reliance Nitro, RFS, S51K, SkyFS, SFS, (Apple), SpadFS, STL, TRFS, UDF, UFS, UFS2, VxFS, VLIR, WAFL, XFS, FS, ZFS

ORGANIZATION

° Disk is divided into blocks
  ° Block size supported by most HDDs is 512 bytes
  ° Typical FS block size is 4 KB
° An instance of a file system is typically called a partition
° A single physical disk can have multiple partitions (file systems)


FILE SYSTEM EXAMPLE

° Consider a 64-block disk (4 KB block size), a.k.a. a 256 KB disk
° Legacy low-density 5¼" floppy had 160KB single-sided, 360KB double-sided capacity

FILE SYSTEM STORAGE

° File system is stored using blocks on the disk
  ° This is considered a "reserved" region of the disk
  ° Corruption of the reserved region can destroy the file tables, causing data on the disk to be unaddressable
° File system tracks:
  ° Which blocks comprise a file
  ° Where the blocks reside (are they contiguous?)
  ° The size of files
  ° The owner of files
  ° File permissions (e.g. R/W/X)


FILE SYSTEM LOCATION

° In this example, the reserved region is at the front of a 64-block disk partition
° There are 56 blocks to track using "index-nodes" (inodes)

FILE SYSTEM EXAMPLES - 2

° Consider a 256 KB disk, with 56 free data blocks
° i-node size: 256 bytes each
° A 4KB block can contain 16 inodes
° Minimum of four 4KB blocks required (56 / 16 = 3.5, rounded up)
° Here, reserve five 4KB blocks for the inode table (80 inodes)
  ° Provides some spare inodes
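The inode table sizing above is simple arithmetic, worked through below. The constants come from the example (4 KB blocks, 256-byte inodes, 56 data blocks); the function name `inode_blocks_needed` is only illustrative.

```python
import math

BLOCK_SIZE = 4096    # bytes per file system block
INODE_SIZE = 256     # bytes per inode
INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE   # 16


def inode_blocks_needed(num_inodes: int) -> int:
    """Smallest number of whole blocks that can hold num_inodes inodes."""
    return math.ceil(num_inodes / INODES_PER_BLOCK)


data_blocks = 56
print("inodes per block:", INODES_PER_BLOCK)
print("minimum inode blocks:", inode_blocks_needed(data_blocks))
print("inodes available with 5 blocks:", 5 * INODES_PER_BLOCK)
```

One inode per data block is the worst case (every file one block long), so four blocks is the minimum and the fifth block leaves 24 spare inodes.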


FILE SYSTEM - FREE LIST

° Allocation structures:
  ° "Free list" of free inodes and blocks
° Example stores free list using bitmaps
  ° Array of bits indicates if an inode or FS block is in use (0/1)
  ° Inode bitmap: 80 bits for inode table
  ° Data bitmap: 56 bits for data blocks

SUPERBLOCK

° Contains information about the file system, "S":
  ° How many inodes?
  ° How many data blocks?
  ° Location of inode table
  ° File system identity code(s)
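A bitmap free list like the one in this example can be sketched in a few lines. The `Bitmap` class below is a hypothetical illustration, assuming 1 means "in use" and a first-fit search; real file systems use more elaborate allocation policies.

```python
class Bitmap:
    """Fixed-size bit array: bit i is 1 if item i is in use, 0 if free."""

    def __init__(self, nbits: int):
        self.nbits = nbits
        self.bits = bytearray((nbits + 7) // 8)   # round up to whole bytes

    def is_set(self, i: int) -> bool:
        return bool(self.bits[i // 8] & (1 << (i % 8)))

    def set(self, i: int) -> None:
        self.bits[i // 8] |= 1 << (i % 8)

    def clear(self, i: int) -> None:
        self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF

    def allocate(self):
        """First-fit: mark and return the lowest free index, or None if full."""
        for i in range(self.nbits):
            if not self.is_set(i):
                self.set(i)
                return i
        return None


inode_bitmap = Bitmap(80)   # one bit per inode in the inode table
data_bitmap = Bitmap(56)    # one bit per data block

first = data_bitmap.allocate()
second = data_bitmap.allocate()
data_bitmap.clear(first)            # free the first block again
reused = data_bitmap.allocate()     # first-fit hands the same block back
print(first, second, reused)
```

Both bitmaps fit easily in a single 4 KB block each, which is why the example spends one block on each.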


INODE EXAMPLE

° Every inode has an inode number (index value)
° Based on inode number, disk location can be calculated
° Example: inode number = 32
  ° Size of inode = 256 bytes
  ° Offset into inode region = inode number x inode size = 32 x 256 = 8192 (8KB)
  ° What is the inode location? 12KB + (32 x 256) = 12KB + 8KB = 20KB
° Inode location = inode start addr + (inode no. x inode size)

INODE EXAMPLE - 2

° Disks are addressed by sectors, not bytes
° Disk stores large number of addressable sectors
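The inode location formula, plus the conversion from a byte address to a sector number, can be worked as code. This sketch assumes the inode table starts at 12KB as in the example and that sectors are 512 bytes; the function names are illustrative.

```python
INODE_START_ADDR = 12 * 1024   # byte address where the inode table begins
INODE_SIZE = 256               # bytes per inode
SECTOR_SIZE = 512              # bytes per disk sector (assumed)


def inode_byte_address(inumber: int) -> int:
    """Inode location = inode start addr + (inode no. x inode size)."""
    return INODE_START_ADDR + inumber * INODE_SIZE


def inode_sector(inumber: int) -> int:
    """Sector holding the inode, since the disk is addressed by sector."""
    return inode_byte_address(inumber) // SECTOR_SIZE


addr = inode_byte_address(32)
print(addr, addr // 1024, inode_sector(32))   # 20480 bytes = 20 KB, sector 40
```

With 256-byte inodes and 512-byte sectors, each sector holds two inodes, so inodes 32 and 33 share sector 40.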


INODE EXAMPLE - 3

INODES – FS

° Inodes store all information about a file:
  ° File type (e.g. directory, file, other)
  ° Size, and the number of blocks allocated to a file on disk
  ° R/W/X permissions
  ° Time information
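On a Unix-like system, most of the inode fields listed above are visible from user space through `os.stat()`. The sketch below is one way to inspect them in Python; the helper name `describe` is hypothetical, and `st_blocks` is not available on every platform, so it is read defensively.

```python
import os
import stat
import tempfile
import time


def describe(path: str) -> dict:
    """Collect inode-style metadata for path using os.stat()."""
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        kind = "directory"
    elif stat.S_ISREG(st.st_mode):
        kind = "file"
    else:
        kind = "other"
    return {
        "inode": st.st_ino,
        "type": kind,
        "size_bytes": st.st_size,
        "blocks_512B": getattr(st, "st_blocks", None),  # Unix only
        "permissions": stat.filemode(st.st_mode),
        "modified": time.ctime(st.st_mtime),
    }


# Create a small scratch file and print what the file system records for it.
with tempfile.TemporaryDirectory() as scratch:
    path = os.path.join(scratch, "example.txt")
    with open(path, "w") as f:
        f.write("hello")
    for field, value in describe(path).items():
        print(f"{field}: {value}")
```

Note that the file name is absent: names live in directory entries, not in the inode.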


MULTI-LEVEL INDEX

° Inodes use multi-level index

MULTI-LEVEL INDEX - 2

° Indirect pointer: one level
  ° First level: includes 12 direct block pointers
  ° Second level: includes 1 indirect block pointer
    ° Points to an entire block of pointers: (4096 bytes / 4 bytes) = 1,024 block pointers
° Maximum file size
  ° (12 + 1,024) x 4KB = (1,036 x 4KB) = 4,144 KB
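The maximum file size arithmetic generalizes to any number of indirection levels. The sketch below assumes 4 KB blocks, 4-byte block pointers, and 12 direct pointers, as in the example; the function names are illustrative.

```python
BLOCK_SIZE = 4096      # bytes per block
POINTER_SIZE = 4       # bytes per block pointer
DIRECT_POINTERS = 12   # direct pointers stored in the inode
POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # 1,024


def max_file_blocks(indirect_levels: int) -> int:
    """Blocks addressable with 12 direct pointers plus one pointer each
    of single, double, ... indirection up to indirect_levels."""
    indirect = sum(POINTERS_PER_BLOCK ** level
                   for level in range(1, indirect_levels + 1))
    return DIRECT_POINTERS + indirect


def max_file_kb(indirect_levels: int) -> int:
    return max_file_blocks(indirect_levels) * BLOCK_SIZE // 1024


print(max_file_kb(1))   # single indirect: 4144 KB
print(max_file_kb(2))   # adding a double indirect pointer: 4198448 KB
```

Each extra level multiplies the reachable block count by 1,024, so a third level pushes the limit past 4 TB at the cost of up to three extra reads to locate a block.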


MULTI-LEVEL INDEX - 3

° Double indirect pointer
  ° First level: includes 12 direct block pointers
  ° Second level: includes 1 indirect block pointer
  ° Third level: includes 1 double indirect block pointer
  ° Maximum file size:
    ° (12 + 1,024 + (1,024 x 1,024)) x 4KB
    ° 1,049,612 x 4KB = 4,198,448 KB
    ° > 4GB
° Triple indirect pointer
  ° Maximum file size:
    ° (12 + 1,024 + 1,024^2 + 1,024^3) x 4KB > 4TB

EXTENTS

° Extents have a pointer with a stored length
° Each file has multiple extents
  ° A single extent would require contiguous file allocation
° In contrast to block pointers
  ° Extents conserve space better than multi-level indexes, but are less agile at representing file allocations scattered across the disk
  ° Multi-level indexes excel for scattered file allocations
° File indexing presents a space vs. flexibility tradeoff
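An extent-based index can be sketched as a short list of (start, length) runs. The `Extent` record and `lookup` function below are hypothetical and the block numbers are made up; the point is only that two extents describe twelve blocks that would otherwise need twelve block pointers.

```python
from typing import NamedTuple, Optional


class Extent(NamedTuple):
    logical_start: int    # first file block covered by this extent
    physical_start: int   # disk block where the run begins
    length: int           # number of contiguous blocks in the run


def lookup(extents, logical_block: int) -> Optional[int]:
    """Translate a file block number to a disk block, or None if unmapped."""
    for e in extents:
        if e.logical_start <= logical_block < e.logical_start + e.length:
            return e.physical_start + (logical_block - e.logical_start)
    return None


# A 12-block file stored as two contiguous runs on disk.
extents = [Extent(0, 100, 8), Extent(8, 300, 4)]

print(lookup(extents, 0))    # 100
print(lookup(extents, 9))    # 301
print(lookup(extents, 12))   # None: past the end of the file
```

If the same file were scattered one block at a time, it would need twelve extents, which is where the space advantage disappears.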


FILE INDEXING

° Multi-level indexing
  ° Ext2
° Extents
  ° Ext4 (default Ubuntu 16.04), XFS (default CentOS 7)
  ° NTFS, Btrfs (b-tree fs)
° Exhaustive file systems feature comparison
  ° https://en.wikipedia.org/wiki/Comparison_of_file_systems

COMMON FILE CHARACTERISTICS


DIRECTORIES

° Directory contains file name and i-number (index)
° Extra entries for the parent dir (..) and pwd (.)
° Can store dirs as linear list, often stored in inodes
° XFS uses B-trees to eliminate sequential search of filenames for duplicates when creating a new file

FILE I/O - READ

° Consider reading a file called "/foo/bar"
° Traverse starting at root "/" (inumber = 2) to find file
° Read each inode to dereference file block location on disk
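The traversal of "/foo/bar" can be mimicked with a toy in-memory inode table. Everything here except the root inumber of 2 is invented for illustration: the inode numbers 10 and 11, the data block numbers, and the `resolve` helper.

```python
ROOT_INUMBER = 2

# Toy inode table: inumber -> ("dir", {name: inumber}) or ("file", [blocks]).
inodes = {
    2: ("dir", {".": 2, "..": 2, "foo": 10}),
    10: ("dir", {".": 10, "..": 2, "bar": 11}),
    11: ("file", [40, 41, 42]),
}


def resolve(path: str) -> int:
    """Walk an absolute path from the root and return the final inumber."""
    inumber = ROOT_INUMBER
    for name in [part for part in path.split("/") if part]:
        kind, entries = inodes[inumber]
        if kind != "dir":
            raise NotADirectoryError(path)
        if name not in entries:
            raise FileNotFoundError(path)
        inumber = entries[name]   # follow the directory entry to the next inode
    return inumber


inumber = resolve("/foo/bar")
print(inumber, inodes[inumber][1])   # 11 [40, 41, 42]
```

Each step of the loop stands for two disk reads in a real file system: one for the directory's inode and one for the directory's data block.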


FILE I/O – READ OPERATIONS

° 3 block file: 11 reads, 3 writes (last access time)

FILE I/O - WRITE

° At least five I/Os to update an existing file
  ° one to read the data bitmap
  ° one to write the bitmap (to reflect its new state to disk)
  ° two more to read and then write the inode
  ° one to write the actual block itself
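The "11 reads, 3 writes" figure can be reproduced by counting I/Os for opening and reading "/foo/bar". The breakdown below is an assumption consistent with that total: nothing is cached, each directory fits in one data block, and every block read also rereads the file's inode and writes it back to update the last access time. The function name `read_io_counts` is hypothetical.

```python
def read_io_counts(path: str, num_blocks: int) -> tuple[int, int]:
    """Return (reads, writes) for opening path and reading num_blocks blocks."""
    components = [part for part in path.split("/") if part]
    if not components:
        raise ValueError("path must name a file below the root")

    # open(): root and every intermediate directory cost an inode read plus
    # a data block read; the file itself costs one inode read.
    dirs_traversed = len(components)   # root + intermediate directories
    open_reads = 2 * dirs_traversed + 1

    # read(): per block, read the inode, read the data block,
    # then write the inode to record the last access time.
    reads = open_reads + 2 * num_blocks
    writes = num_blocks
    return reads, writes


print(read_io_counts("/foo/bar", 3))   # (11, 3)
```

The cost of open() grows with path depth, which is one reason directory and inode caching matter so much in practice.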


FILE I/O – WRITE - 2

FREE SPACE MANAGEMENT

° Free Lists
  ° Linked list of free blocks
  ° Head node tracks first free block, each subsequent block is linked with a pointer
° Bitmaps
  ° Bit-wise arrays of free blocks
° B-trees (XFS)
  ° Represents free list in a more compact form, with better search performance
° Free list design impacts efficiency of finding free blocks
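The linked free list variant can be sketched as follows. This is a simplified model: a dictionary stands in for the "next free" pointer that would be stored inside each free block on disk, and the `FreeList` class is hypothetical.

```python
class FreeList:
    """Linked list of free blocks; head tracks the first free block."""

    def __init__(self, free_blocks):
        self.next_free = {}   # block -> next free block (None at the tail)
        self.head = None
        # Build the chain back to front so the head is the first block given.
        for block in reversed(list(free_blocks)):
            self.next_free[block] = self.head
            self.head = block

    def allocate(self):
        """Pop the head block, or return None when nothing is free."""
        if self.head is None:
            return None
        block = self.head
        self.head = self.next_free.pop(block)
        return block

    def free(self, block: int) -> None:
        """Push a released block onto the front of the list."""
        self.next_free[block] = self.head
        self.head = block


free_list = FreeList([8, 9, 10])
a = free_list.allocate()
b = free_list.allocate()
free_list.free(a)
c = free_list.allocate()
print(a, b, c)
```

Allocation and release are constant-time, but finding a run of contiguous free blocks means walking the chain, which is where bitmaps and B-trees do better.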


CACHING READS AND WRITES

° Two approaches to cache allocation
  ° Static partitioning
    ° Allocate a fixed size cache at system boot time
    ° For example: dedicate 10% of memory for disk R/W cache
  ° Dynamic partitioning
    ° Linux has a unified page cache
    ° Pages are cached to a unified page cache for multiple purposes
      ° Memory virtualization pages
      ° Inodes, disk pages

FILE CACHING

° Subsequent file opens to a cached file can eliminate reads
° Benefits of write caching
  ° Batch updates together to reduce HDD requests
  ° Writes can be scheduled intelligently in the future
  ° Some writes can be avoided altogether. For example: short-lived tmp files
° Typical write buffering is from 5 to 30 seconds
  ° Risk of data loss
° fsync(): force synchronization to disk
  ° Some apps, such as databases, use it to ensure immediate writes
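An application that cannot tolerate the write-buffering window can force its data out with fsync(). The sketch below shows the usual pattern from Python using `os.fsync()`; the helper name `durable_write` and the file name are made up for the example.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data and do not return until the OS has been asked to persist it."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()               # move Python's user-space buffer into the OS
        os.fsync(f.fileno())    # ask the OS to flush its cached pages to disk


with tempfile.TemporaryDirectory() as scratch:
    path = os.path.join(scratch, "journal.log")
    durable_write(path, b"commit 42\n")
    with open(path, "rb") as f:
        print(f.read())
```

The price is latency: each call waits for the device instead of letting the OS batch and schedule the write later.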


QUESTIONS
