File Systems
Chapter 4

What do we need to know?
• How are files viewed on different OS’s?
• What is a file system from the programmer’s viewpoint?
  – You mostly know this, but we’ll review the main points.
• How are file systems put together?
  – How is the disk laid out for directories? For files? What kind of memory structures are needed?
• What do some real file systems look like?
  – CP/M, MS-DOS (FAT-12/16/32), NTFS, NFS, ext2, …
• What directions are file systems going?

Long-term Information Storage
1. Must store large amounts of data
2. Information stored must survive the termination of the process using it
3. Multiple processes must be able to access the information concurrently

File Naming Issues
• Character set
• Length
• Extensions

File Naming
Typical file extensions.

File Structure
There are lots of file types. Here are three:
  – byte sequence, record sequence, tree

Sample Files
(a) An executable file (b) An archive

File Access
• Sequential access
  – read all bytes/records from the beginning
  – cannot jump around, could rewind or back up
  – convenient when medium was mag tape
• Random access
  – bytes/records read in any order
  – essential for database systems
  – read can be …
    • move file marker (seek), then read or …
    • read and then move file marker

File Attributes
Possible file attributes

File Operations
1. Create
2. Delete
3. Open
4. Close
5. Read
6. Write
7. Append
8. Seek
9. Get attributes
10. Set attributes
11. Rename

An Example Program Using Unix File System Calls (1/2)

An Example Program Using File System Calls (2/2)

Memory-Mapped Files
(a) Segmented process before mapping files into its address space
(b) Process after mapping existing file abc into one segment and creating a new segment for xyz

Directories
Single-Level Directory Systems
• A single-level directory system
  – contains 4 files
  – owned by 3 different people, A, B, and C

Two-level Directory Systems
Letters indicate owners of the directories and files

Hierarchical Directory Systems
A hierarchical directory system

Directory Operations
1. Create
2. Delete
3. Opendir
4. Closedir
5. Readdir
6. Rename
7. Link
8. Unlink

File System Implementation
A possible file system layout

Implementing Files (1)
(a) Contiguous allocation of disk space for 7 files
(b) State of the disk after files D and E have been removed

Implementing Files (2)
Storing a file as a linked list of disk blocks

Implementing Files (3)
A File Allocation Table (FAT) keeps the linked list of disk blocks in a table in memory

Implementing Files (4)
Combination of direct and indirect block pointers
Note: This is a simplified version of the Unix i-node

The UNIX V7 File System
A UNIX i-node

Implementing Directories (1)
(a) A simple directory: fixed-size entries, with disk addresses and attributes in the directory entry
(b) Directory in which each entry just refers to an i-node

Implementing Directories (2)
• Two ways of handling long file names in a directory
  – (a) In-line
  – (b) In a heap

Linking (1)
File system containing a file that is “shared” between two directories

Links (2)
(a) Situation prior to linking
(b) After the link is created
(c) After the original owner removes the file

Disk Space Management (1)
Block size
• Dark line (left-hand scale) gives the data rate of a disk
• Dotted line (right-hand scale) gives the disk space efficiency
• All files here are 2KB

Disk Space Management (2)
(a) Storing the free list on a linked list
(b) A bit map

File System Checking
• Possible results while running fsck
  (a) consistent
  (b) missing block
  (c) duplicate block in free list
  (d) duplicate data block

File System Performance (1)
The block cache data structures

File System Writes
• Unix
  – “Critical Blocks” are written immediately
  – Data blocks are written periodically or when the block is removed from the block cache
• MS-DOS
  – Uses a “Write-through cache”. All writes are immediate.

Read Ahead
• When block N is requested, the file system can also issue a read for block N+1.
• What if the file is not being read sequentially?
  – Initially assume it is, but monitor disk access and set a flag to non-sequential if needed. This can be used to disable read ahead.

File System Performance (2)
• I-nodes placed at the start of the disk
• Disk divided into cylinder groups
  – each with its own blocks and i-nodes

Log-Structured File Systems
• With CPUs faster, memory larger
  – disk caches can also be larger
  – increasing number of read requests can come from cache
  – thus, most disk accesses will be writes
• LFS strategy structures the entire disk as a log
  – have all writes initially buffered in memory
  – periodically write these to the end of the disk log
  – when file opened, locate i-node, then find blocks

Journaling
• What happens when you remove a file?
  – Remove the directory entry
  – Release the i-node
  – Free the disk blocks
• What happens if there is a crash after the first or second steps?
• How can you minimize the damage?
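One way to approach the journaling questions above is to record the whole removal in a log before touching any structure, and to redo unfinished log entries after a crash. The following is a minimal sketch under stated assumptions, not how any particular file system implements it: `ToyFS`, its fields, and the `crash_after` parameter are hypothetical names for a toy in-memory model, and the journal is simply assumed to survive the crash.

```python
class ToyFS:
    """Toy in-memory file system, used only to illustrate journaling (hypothetical)."""

    def __init__(self):
        self.directory = {}        # file name -> i-node number
        self.inodes = {}           # i-node number -> list of block numbers
        self.free_blocks = set()   # block numbers not in use
        self.journal = []          # pending operations; assumed to survive a crash

    def create(self, name, inode, blocks):
        self.directory[name] = inode
        self.inodes[inode] = list(blocks)
        self.free_blocks -= set(blocks)

    def remove(self, name, crash_after=None):
        inode = self.directory[name]
        # Log everything needed to redo the removal before changing anything.
        entry = {"name": name, "inode": inode, "blocks": list(self.inodes[inode])}
        self.journal.append(entry)
        self._apply(entry, crash_after)
        self.journal.remove(entry)  # commit: all three steps are done

    def _apply(self, entry, crash_after=None):
        # Every step is idempotent, so replaying a half-finished entry is safe.
        steps = [
            lambda: self.directory.pop(entry["name"], None),    # 1. remove the directory entry
            lambda: self.inodes.pop(entry["inode"], None),      # 2. release the i-node
            lambda: self.free_blocks.update(entry["blocks"]),   # 3. free the disk blocks
        ]
        for number, step in enumerate(steps, start=1):
            step()
            if crash_after == number:
                raise RuntimeError("simulated crash")

    def recover(self):
        # After a restart, redo every logged operation that never committed.
        for entry in list(self.journal):
            self._apply(entry)
            self.journal.remove(entry)
```

Calling `remove("a.txt", crash_after=1)` leaves the i-node and blocks allocated with no directory entry pointing at them; a later `recover()` finishes the job from the journal. The design choice that matters is that each step can be repeated without harm, so recovery does not need to know how far the original attempt got.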
The CP/M File System (1)
Memory layout of CP/M

The CP/M File System (2)
The CP/M directory entry format

File Allocation Table (FAT) Partition Layout
• Partition layout:
  – Boot block
  – FAT
  – FAT copy
  – Root directory
    • In FAT-12 and FAT-16, preassigned enough space for 256 directory entries
  – Other directories and files

FAT Table Sizes
• FAT-12
  – 2^12 clusters
  – Cluster size: 512 bytes to 8KB
  – Partition size up to 32MB (4K clusters * 8KB / cluster)
  – Windows default for volumes < 16MB, such as floppies
• FAT-16
  – 2^16 clusters
  – Cluster size: 512 bytes to 64KB
  – Partition size up to 4GB (64K clusters * 64KB / cluster)
• FAT-32
  – 2^28 clusters
  – Cluster size: 512 bytes to 32KB
  – Partition size in principle up to 8TB (256M clusters * 32KB / cluster), but Windows will only create FAT-32 partitions up to 32GB.
• Note that all FAT systems reserve the first two and last sixteen clusters in a partition, so actual partition sizes are slightly smaller than listed above.

Directory Entries
Original MS-DOS directory entry; directory entry used in Windows (field sizes in bytes)

The Windows 98 File System
An example of how a long name is stored in Windows 98

UNIX File System
Disk layout in classical UNIX systems

The UNIX File System
A UNIX V7 directory entry (old)

The UNIX V7 File System
A UNIX i-node

UNIX File System
Directory entry fields. Attributes in the i-node

The UNIX File System
Path Names

The UNIX File System
Some important directories found in most UNIX systems

Pathnames
• Absolute Pathname
  – Begins at the root directory: /
  – Contains all sub-directory names separated by slashes
  – Filename
  – ~jsterling
    • ~ is recognized as a short-cut for the absolute path to the home directory.
• Relative Pathname
  – Does not begin with the root directory.
  – Starts in the current working directory. Sometimes the current working directory is shown explicitly using: .
  – May move up the file tree using .. to refer to a parent directory.
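The pathname rules above can be sketched as a small function that turns any of the three forms (absolute, relative, or `~user`) into one absolute path. This is only an illustration: `to_absolute`, its parameters, and the home directory `/home/jsterling` used in the usage note are assumptions, and `posixpath.normpath` works purely on the text of the path, so it does not account for symbolic links.

```python
import posixpath


def to_absolute(path, cwd, home_dirs, current_user):
    """Return a normalized absolute UNIX path.

    cwd          -- current working directory (an absolute path)
    home_dirs    -- hypothetical mapping of user name -> home directory
    current_user -- user whose home a bare "~" refers to
    """
    if path.startswith("~"):
        # "~user/rest" or "~/rest": replace the prefix with a home directory.
        user, _, rest = path[1:].partition("/")
        path = posixpath.join(home_dirs[user or current_user], rest)
    elif not path.startswith("/"):
        # Relative pathname: interpret it starting at the working directory.
        path = posixpath.join(cwd, path)
    # Collapse "." and ".." components.
    return posixpath.normpath(path)
```

With `home_dirs = {"jsterling": "/home/jsterling"}`, the call `to_absolute("../ast/mbox", "/usr/jim", home_dirs, "jsterling")` gives `/usr/ast/mbox`, and `to_absolute("~jsterling", "/usr/jim", home_dirs, "jsterling")` gives `/home/jsterling`.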
The UNIX File System
The steps in looking up /usr/ast/mbox

The UNIX File System
(a) Before linking. (b) After linking.
Note that hard links
• must refer to other files in the same file system.
• are not permitted to refer to directories.
• are indistinguishable from the original file name.
Note that soft links (aka symbolic links)
• contain only a pathname.
• result in a “dangling pointer” if the original filename is deleted.

The UNIX File System
(a) Separate file systems before mounting. (b) After mounting.

UNIX File System (3)
The relation between the file descriptor table, the open file description and the i-node

The Linux File System
• Super block:
  – # of i-nodes, # of blocks, etc.
• Group Descriptor:
  – # of free i-nodes, # of free blocks, # of directories
• Bitmaps:
  – Each is one block long

Record Locking in Unix
• Can lock any range of bytes in a file
• There can be multiple overlapping locks on a file
• Locks can be
  – Exclusive (write): No other process can have a lock on the range.
  – Shared (read): Other shared locks may exist on the same bytes.
• A failed lock can be made to block or not (choice of system call).
• A process’s locks are released when
  a) The process terminates
  b) The process closes the file. Even if the process had it open more than once simultaneously!
• Locks are not inherited across fork.
• Locks can carry across exec.
• Deadlock possibility:
  – Competing locks could in principle result in deadlock,
  – but this is prevented in Unix.

System Calls for File Management
• s is an error code
• fd is a file descriptor
• position is a file offset

The stat System Call
• Mode: includes type and protection
• Inode #
• Device
• Link count
• Owner’s ID
• Group ID
• File size in bytes
• Access time
• Modification time
• Status change time
• Blocksize
• Block count (512-byte blocks)
Note that not all fields are stored in the i-nodes themselves.
From Solaris Man Page

System Calls for Directory Management
• s is an error code
• dir identifies a directory stream
• dirent is a directory entry

UNIX File System (4)
• A BSD directory with three files
• The same directory after the file voluminous has been removed

Unix File Protection
• Divide the world into categories (owner / group / world) and specify read / write / execute access for each.
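The owner / group / world scheme can be illustrated with the nine low-order mode bits. The sketch below is a simplified model, not kernel code: `describe_mode` and `may_access` are hypothetical helper names, the superuser is ignored, and exactly one category is chosen for each request, which is the classic UNIX behaviour.

```python
import stat


def describe_mode(mode):
    """Render the nine protection bits as an 'rwxrwxrwx'-style string."""
    bits = [
        (stat.S_IRUSR, "r"), (stat.S_IWUSR, "w"), (stat.S_IXUSR, "x"),  # owner
        (stat.S_IRGRP, "r"), (stat.S_IWGRP, "w"), (stat.S_IXGRP, "x"),  # group
        (stat.S_IROTH, "r"), (stat.S_IWOTH, "w"), (stat.S_IXOTH, "x"),  # world
    ]
    return "".join(ch if mode & bit else "-" for bit, ch in bits)


def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Decide whether a user may read ('r'), write ('w') or execute ('x').

    Only one category applies: owner if the user IDs match, otherwise
    group if the file's group is among the user's groups, otherwise world.
    """
    shift = {"r": 2, "w": 1, "x": 0}[want]
    if uid == file_uid:
        category = 6   # owner bits are the highest three of the nine
    elif file_gid in gids:
        category = 3   # group bits are the middle three
    else:
        category = 0   # world bits are the lowest three
    return bool((mode >> (category + shift)) & 1)
```

For a file with mode `0o640`, `describe_mode` gives `rw-r-----`: the owner may read and write, group members may only read, and everyone else has no access.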