File Systems: Fundamentals File Operations ¾ Create, Open, Read, Write, Seek, Delete, …

Files What is a file? ¾ A named collection of related information recorded on secondary storage (e.g., disks) File attributes ¾ Name, type, location, size, protection, creator, creation time, last- modified-time, … File Systems: Fundamentals File operations ¾ Create, Open, Read, Write, Seek, Delete, … How does the OS allow users to use files? ¾ “Open” a file before use ¾ OS maintains an open file table per process, a file descriptor is an index into this file. ¾ Allow sharing by maintaining a system-wide open file table 1 2 Fundamental Duality of File Systems Block vs. Sector Metadata The operating system may choose to use a larger ¾ The index node (inode) is the fundamental data structure block size than the sector size of the physical disk. ¾ The superblock also has important file system metadata, like block Each block consists of consecutive sectors. Why? size ¾ A larger bloc k s ize increases the trans fer e ffic iency (w hy ?) Data ¾ It can be convenient to have block size match (a multiple of) ¾ The contents that users actually care about the machine's page size (why?) Files ¾ Contain data and have metadata like creation time, length, etc. Some systems allow transferring of many sectors Directories between interrupts. ¾ Map file names to inode numbers Some systems interrupt after each sector operation (rare these days) ¾ “consecutive” sectors may mean “every other physical sector” to allow time for CPU to start the next transfer before the head moves over the desired sector 3 4 File System Functionality and Implementation File System Properties File system functionality: Most files are small. ¾ Pick the blocks that constitute a file. ¾ Need strong support for small files. Must balance locality with expandability. ¾ Block size can’t be too big. MtMust manage free space. Some files are very large. ¾ Provide file naming organization, such as a hierarchical ¾ Must allow large files (64-bit file offsets). name space. ¾ Large file access should be reasonably efficient. Most systems fit the following profile: File system implementation: 1. Most files are small ¾ File header (descriptor, inode): owner id, size, last modified time, and location of all data blocks. 2. Most disk space is taken up by large files. OS should be able to find metadata block number N without a 3. I/O operations target both small and large files. disk access (e.g., by using math or cached data structure). --> The per-file cost must be low, but large files must also have ¾ Data blocks. good performance. Directory data blocks (human readable names, permissions) File data blocks (data). ¾ Superblocks, group descriptors, other metadata… 5 6 How do we find and organize files on the disk? The information that we need: If my file system only has lots of big video files what file header points to data blocks block size do I want? fileID 0, Block 0 --> Disk block 19 fileID 0, Block 1 --> Disk block 4,528 … 1. Large Key performance issues: 2. Small 1. We need to support sequential and random access. 2. What is the right data structure in which to maintain file location information? 3. How do we lay out the files on the physical disk? 7 8 File Allocation Methods File Allocation Methods Contiguous allocation Linked allocation I I File header specifies starting block & length Files stored as a linked list of blocks Placement/Allocation policies File header contains a pointer to the first and last file ¾ First-fit, best-fit, ... blocks Pluses Minuses Pluses Minuses ¾ Best file read ¾ Easy to create, grow & shrink files ¾ Impossible to do true ¾ Fragmentation! random access performance ¾ No external fragmentation ¾ Problems with file growth ¾ Reliability ¾ Efficient sequential & Pre-allocation? random access Break one link in the chain On-demand allocation? and... 9 10 File Allocation Methods File Allocation Methods Linked allocation – File Allocation Table (FAT) (Win9x, OS2) Direct allocation Maintain linked list in a separate table ¾ A table entry for each block on disk I ¾ Each table entry in a file has a pointer to the next entry in that file (with a special “eof” marker) ¾ A “0” in the table entry Î free block File header points to each data block Comparison with linked allocation ¾ If FAT is cached Î better sequential and random access performance How much memory is needed to cache entire FAT? 400GB disk , 4KB/block Î 100M entries in FAT Î 400MB Pluses MiM nuses Solution approaches ¾ Easy to create, grow & ¾ Inode is big or variable size Allocate larger clusters of storage space shrink files ¾ How to handle large files? Allocate different parts of the file near each other Î better locality ¾ Little fragmentation for FAT ¾ Supports direct access 11 12 File Allocation Methods Indexed Allocation Indexed allocation Handling large files I IB Linked index blocks (IB+IB+…) I IB IB IB Create a non-data block for each file called the index block ¾ A list of pointers to file blocks File header contains the index block Multilevel index blocks (IB*IB*…) Pluses MiM nuses ¾ Easy to create, grow & ¾ Overhead of storing index IB IB IB IB shrink files when files are small I ¾ Little fragmentation ¾ How to handle large files? ¾ Supports direct access 13 14 Multi-level Indirection in Unix File header contains 13 pointers Why bother with index blocks? ¾ 10 pointes to data blocks; 11th pointer Æ indirect block; 12th pointer th ¾ A. Allows greater file size. Æ doubly-indirect block; and 13 pointer Æ triply-indirect block ¾ ¾ BFB. Fas ter to crea te fil es. Implications ¾ C. Simpler to grow files. ¾ Upper limit on file size (~2 TB) ¾ D. Simpler to prepend and append to files. ¾ Blocks are allocated dynamically (allocate indirect blocks only for ¾ E. Scott Summers is the X-men’s Cyclops large files) Features ¾ Pros Simple Files can easily expand Small files are cheap ¾ Cons Large files require a lot of seek to access indirect blocks 15 16 Indexed Allocation in UNIX Multilevel, indirection, index blocks 10 Data Blocks 1st Level Inode Indirection Block n How big is an inode? Data ¾ A. 1 byte Blocks ¾ B16btB. 16 bytes n2 ¾ C. 128 bytes IB Data ¾ D. 1 KB nd IB Blocks 2 Level ¾ E. 16 KB Indirection Block IB IB n3 Data Blocks IB IB IB IB 3rd Level Indirection Block IB IB IB IB 17 18 Allocate from a free list Free list representation Represent the list of free blocks as a bit vector: Need a data block 111111111111111001110101011101111... ¾ Consult list of free data blocks ¾ If bit i = 0 then block i is free, if i = 1 then it is allocated Need an inode ¾ Consult a list of free inodes Simple to use and vector is compact: 1TB disk with 4KB blocks is 2^28 bits or 32 MB Why do inodes have their own free list? If free sectors are uniformly distributed across the disk then ¾ A. Because they are fixed size the expected number of bits that must be scanned before ¾ B. Because they exist at fixed locations finding a “0” is n/r ¾ C. Because there are a fixed number of them where n = total number of blocks on the disk, r = number of free blocks If a disk is 90% full, then the average number of bits to be scanned is 10, independent of the size of the disk 19 20 Other Free List Representations Naming and Directories In-situ linked lists Files are organized in directories ¾ Directories are themselves files D ¾ Contain <name, pointer to file header> table Only OS can modify a directory ¾ Ensure integrity of the mapping Grouped lists ¾ Application programs can read directory (e.g., ls) Directory operations: D G ¾ List contents of a directory Next ¾ Search (find a file) group Linear search block Binary search Hash table ¾ Create a file ¾ Delete a file Allocated block Empty block 21 22 Directory Hierarchy and Traversal Every directory has an inode Directories are often organized in a hierarchy ¾ A. True Directory traversal: ¾ BFlB. False ¾ How do you find blocks of a file? Let’s start at the bottom Find file header (inode) – it contains pointers to file blocks Given only the inode number (inumber) the OS can To find file header (inode), we need its I-number To find I-number, read the directory that contains the file find the inode on disk But wait, the directory itself is a file ¾ A. True Recursion !! ¾ Example: Read file /A/B/C ¾ B. False C is a file B/ is a direc tory tha t con ta ins the I-numbffilCber for file C A/ is a directory that contains the I-number for file B How do you find I-number for A? “/” is a directory that contains the I-number for file A What is the I-number for “/”? In Unix, it is 2 23 24 Directory Traversal (Cont’d.) Naming and Directories Once you have the file header, you can access all blocks within How many disk accesses are needed to access file /A/B/C? a file 1. Read I-node for “/” (root) from a fixed location ¾ How to find the file header? Inode number + layout. 2. Read the first data block for root 3. Read the I-node for A Where are file headers stored on disk? 4. Read the first data block of A ¾ In early Unix: 5. Read the I-node for B Special reserved array of sectors 6. Read the first data block of B Files are referred to with an index into the array (I-node number) 7. Read I-node for C Limitations: (1) Header is not near data; (2) fixed size of array Æ fixed 8. Read the first data block of C number of files on disk (determined at the time of formatting the disk) ¾ Berkeley fast file system (FFS): Distribute file header array across cylinders.

Load more