COS 318: Operating Systems File Structure

11/11/19 Where Are We? • Covered: COS 318: Operating Systems ● Management of CPU & concurrency ● Management of main memory & virtual memory File Structure ● Management of I/O devices • Currently --- File Systems ● This lecture: File Structure Jaswinder Pal Singh and a Fabulous Course Staff Computer Science Department • Then: Princeton University • Naming and directories (http://www.cs.princeton.edu/courses/cos318/) • Efficiency and performance • Reliability and protection 1 2 The File System Abstraction File System • Open, close, read, write … named files, arranged in folders or directories ◆ Naming ● File name and directory File naming Physical Reality File System Abstraction ◆ File access File access ● Read, write, other operations block oriented byte oriented (char stream) Buffer cache ◆ Buffer cache ● Reduce client/server disk I/Os Disk allocation Management physical sector #’s named files ◆ Disk allocation Disk Drivers no protection users protected from ◆ Layout, mapping files to blocks each other ◆ Security, protection, reliability, durability data might be corrupted robust to machine failures ◆ Management tools if machine crashes 4 3 4 1 11/11/19 Topics Typical File Attributes ◆ File system structure • Name ◆ Disk allocation and i-nodes • Type – needed for systems that support different types ◆ Directory and link implementations • Location – pointer to file location on device. ◆ Physical layout for performance • Size – current file size. • Protection – controls who can read, write, execute • Time, date, and user identification – data for protection, security, and usage monitoring • Information about files are kept in the directory structure, which is maintained on the disk 2 5 6 Master Boot Record Typical Layout of a Disk Partition • Starts at first sector of disk ◆ Boot block Boot block ● Code to load and boot OS • End of record lists the partitions on the disk ● Every partition can have a different file system ◆ Super-block defines a file system Superblock ● File system info: type, no of blocks, ... • Upon boot: ● File metadata area File metadata ● BIOS reads in and executes MBR (i-nodes in Unix) ● Finds active disk partition from MBR ● Information about / ptr to free blocks ● First block of active partition (boot block) is loaded and executed ● Location of descriptor of root directory Directory data ● That loads in the OS from that partition ◆ File metadata ● Each descriptor describes a file • What does partition and file layout on it look like? ◆ Directories File data ● Directory data (directory and file names) ◆ File data ● Data blocks 2 3 7 8 2 11/11/19 File Types – Name, Extension Typical File Operations • Create File Type Usual extension Function Executable exe, com, bin or ready-to-run machine- • Write none language program Object obj, o complied, machine • Read language, not linked Source code c, p, pas, 177, source code in various • Reposition within file – file seek asm, a languages Batch bat, sh commands to the • Delete command interpreter • Truncate Text txt, doc textual data documents Word processor wp, tex, rrf, etc. various word-processor • Open(Fi) – search the directory structure on disk for formats entry Fi, and move the content of entry to memory. Library lib, a libraries of routines Print or view ps, dvi, gif ASCII or binary file • Close (Fi) – move the content of entry Fi in memory to Archive arc, zip, tar related files grouped directory structure on disk. into one file, sometimes compressed. 9 10 Open A File: Open(fd, name, access) Translating from user to system view • ◆ Various checking (directory and file name lookup, authenticate) User wants to read 10 bytes from file starting at byte 2? ◆ Copy the file descriptors into the in-memory data structure ● Seek byte 2, fetch the block, read 10 bytes ◆ Create an entry in the open file table (system wide) ◆ Create an entry in PCB • User wants to write 10 bytes to file starting at byte 2? File system ◆ Return user a pointer to “file descriptor” info ● Seek byte 2, fetch the block, write 10 bytes, write out block Process File control Open-file table descriptors File • Everything inside file system is in whole size blocks block (system-wide) (Metadata) metadata ● Even getc and putc buffers 4096 bytes Directories • From now on, file is collection of blocks. Open file File data . pointer . array . 5 11 13 3 11/11/19 File usage patterns File system design constraints • How do users access files? • For small files: ● Sequential: bytes read in order ● Small enough blocks for storage efficiency ● “Random”: read/write element out of middle of file ● Files used together should be stored together ● Content-based access: find me next byte starting with “COS318” • How are files used? • For large files: ● Most files are small ● Contiguous allocation for sequential access ● Large files use up most of the disk space ● Efficient lookup for random access ● Most transfers are small • ● Large files account for most of the bytes transferred May not know at file creation whether file will become small or large • Bad news ● Need everything to be efficient 14 15 File system design Data structures for disk management • A file header for each file (part of the file meta-data) • Data structures ● Disk sectors associated with each file ● Directories: file name -> file metadata • Store directories as files • A data structure to track free space on disk ● File metadata: used to find file data blocks of the file ● Bit map ● Free map: list of free disk blocks • 1 bit per block (sector) • blocks numbered in cylinder-major order, why? • How do we organize these data structures? ● Linked list ● Others? • What about allocation for the blocks associated with a file? 16 18 4 11/11/19 Contiguous Allocation Linked Files ◆ Allocate contiguous blocks of storage ◆ File structure (Alto) File header ● Bitmap: find N contiguous 0’s 3 ● File metadata points to 1st block on storage ● Linked list: find a region (size >= N) ● A block points to the next ◆ File metadata ● Last block has a NULL pointer ● First block in file ◆ Pros ● Number of blocks . ● Can grow files dynamically ◆ Pros ● File data tracked similarly to free ● Fast sequential access list of blocks ● Easy random access File A File B ● Doesn’t waste space ◆ Cons null ◆ Cons ● External fragmentation ● Random access: bad (what if file C needs 4 blocks) ● Unreliable: losing a block means ● Hard to grow files losing the rest 7 8 19 20 Linked files (cont’d) File Allocation Table (FAT) • Idea is to keep the linked list metadata foo 217 (pointers) in memory, rather than on disk • Allocation table at beginning of each volume 0 ◆ N entries for N blocks ◆ Want to keep it in memory • File structure (MS-DOS) 217 619 ● A file is a linked list of blocks ● File metadata points to first block of file 399 EOF ● The entry of first block points to next, … • Pros ● Simple 619 399 • Cons ● Random access: still not good ● Wastes space - table for each file expensive to keep in memory FAT Allocation Table 8 21 22 5 11/11/19 DEMOS (Cray-1) Single-level Indexed File File metadata ◆ Idea • User declares max size size0 ● Try contiguous allocation size1 • File header holds array of pointers ● Allow non-contiguous to disk blocks Disk size9 File header ◆ File structure blocks ● Small file metadata has 10 (base,size) pointers • Pros: ● Big file has 10 indirect pointers ● Can grow up to a limit ◆ Pros & Cons size0 ● Random access is fast size1 ● Can grow ● No external fragmentation ● Fragmentation size9 • Cons: size0 ● Clumsy to grow beyond limit size1 size0 size1 ● Still lots of seeks size9 size9 11 25 26 Single-level indexed files (cont’d) Multi-level Indexed Files ! outer-index index table file 27 28 6 11/11/19 Hybrid Multi-level Indexed Files (Unix) Original Unix i-node ◆ 13 Pointers in a header ◆ Mode: file type, protection bits, setuid, setgid bits ● 10 direct pointers data ◆ Link count: no. of directory entries pointing to this file ● 11: 1-level indirect ◆ Uid: uid of the file owner ● 12: 2-level indirect data ◆ Gid: gid of the file owner ● 13: 3-level indirect 1 2 . ◆ File size ◆ Pros & Cons 11 . 12 . data ◆ Times (access, modify, change) ● In favor of small files 13 . ● Can grow . ● Limit is 16G . data ◆ 10 pointers to data blocks ● Can have lots of seeking ◆ Single indirect pointer . data ◆ Double indirect pointer ◆ Triple indirect pointer 12 13 29 30 Extents Naming Files ◆ An extent is a variable number of blocks Block offset Can name files via: ◆ Main idea length ◆ Index (i-node number): Not easy for users to specify ● A file is a number of extents ● XFS uses 8Kbyte blocks Starting block ◆ Text name: Need to map it to index ● Max extent size is 2M blocks ◆ Icon: Need to map it to index or to text and then to index ◆ Index nodes need to have ● Block offset . ● Length ◆ Directories ● Starting block ◆ Table of file name, file index pairs • Microsoft NTFS, Linux EXT4, … ◆ Map name to file index (where to find the header) ◆ Pros: little metadata, fast seq access, can grow over time, less ◆ A directory is itself stored as a file fragmentation ◆ Cons: external fragmentation still problem 14 15 31 32 7 11/11/19 Naming Tricks Directory Organization Examples • Bootstrapping: Where do you start looking? ● Root directory ◆ Flat (CP/M) ● inode #2 on the system ● All files are in one directory ● 0 and 1 used for other purposes ◆ Hierarchical (Unix) • Special names: ● /u/cos318/foo ● Root directory: “/” (bootstrap name system for users) ● Directory is stored in a file containing (name, i-node) pairs ● ● Current directory: “.” The name can be either a file or a directory

Load more