THE BUFFER CACHE – Can Contain “Dirty” Blocks (With Modifications Not on Disk)

CS162 Operating Systems and Systems Programming
Lecture 21
Filesystems 3: Filesystem Case Studies (Con’t), Buffering, Reliability, and Transactions
November 9th, 2020
Prof. John Kubiatowicz
http://cs162.eecs.Berkeley.edu

Recall: Components of a File System
[Figure: File path, Directory Structure, File number (“inumber”), File Header Structure (“inode”), Data blocks, …]
• One Block = multiple sectors
  – Ex: 512 sector, 4K block

11/9/20 Kubiatowicz CS162 © UCB Fall 2020 Lec 21.2

Recall: FAT Properties
• File is collection of disk blocks (Microsoft calls them “clusters”)
• FAT is array of integers mapped 1-1 with disk blocks
  – Each integer is either:
    » Pointer to next block in file; or
    » End of file flag; or
    » Free block flag
• File Number is index of root of block list for the file
  – Follow list to get block #
  – Directory must map name to block number at start of file
• But: Where is FAT stored?
  – Beginning of disk, before the data blocks
  – Usually 2 copies (to handle errors)
[Figure: FAT and Disk Blocks arrays in memory, entries 0 to N-1, showing file number 31 with File 31, Blocks 0 to 3 and a free entry]

Recall: Multilevel Indexed Files (Original 4.1 BSD)
• Sample file in multilevel indexed format:
  – 10 direct ptrs, 1K blocks
  – How many accesses for block #23? (assume file header accessed on open)?
    » Two: One for indirect block, one for data
  – How about block #5?
    » One: One for data
  – Block #340?
    » Three: double indirect block, indirect block, and data
• UNIX 4.1 Pros and cons
  – Pros: Simple (more or less)
    Files can easily expand (up to a point)
    Small files particularly cheap and easy
  – Cons: Lots of seeks
    Very large files must read many indirect block (four I/Os per block!)

CASE STUDY: BERKELEY FAST FILE SYSTEM (FFS)

Fast File System (BSD 4.2, 1984)
• Same inode structure as in BSD 4.1
  – same file header and triply indirect blocks like we just studied
  – Some changes to block sizes from 1024 to 4096 bytes for performance
• Paper on FFS: “A Fast File System for UNIX”
  – Marshall McKusick, William Joy, Samuel Leffler and Robert Fabry
  – Off the “resources” page of course website – Take a look!
• Optimization for Performance and Reliability:
  – Distribute inodes among different tracks to be closer to data
  – Uses bitmap allocation in place of freelist
  – Attempt to allocate files contiguously
  – 10% reserved disk space
  – Skip-sector positioning (mentioned later)

FFS Changes in Inode Placement: Motivation
• In early UNIX and DOS/Windows’ FAT file system, headers stored in special array in outermost cylinders
  – Fixed size, set when disk is formatted
    » At formatting time, a fixed number of inodes are created
    » Each is given a unique number, called an “inumber”
• Problem #1: Inodes all in one place (outer tracks)
  – Head crash potentially destroys all files by destroying inodes
  – Inodes not close to the data that they point to
    » To read a small file, seek to get header, seek back to data
• Problem #2: When create a file, don’t know how big it will become (in UNIX, most writes are by appending)
  – How much contiguous space do you allocate for a file?
  – Makes it hard to optimize for performance

FFS Locality: Block Groups
• The UNIX BSD 4.2 (FFS) distributed the header information (inodes) closer to the data blocks
  – Often, inode for file stored in same “cylinder group” as parent directory of the file
  – makes an “ls” of that directory run very fast
• File system volume divided into set of block groups
  – Close set of tracks
• Data blocks, metadata, and free space interleaved within block group
  – Avoid huge seeks between user data and system structure
• Put directory and its files in common block group

FFS Locality: Block Groups (Con’t)
• First-Free allocation of new file blocks
  – To expand file, first try successive blocks in bitmap, then choose new range of blocks
  – Few little holes at start, big sequential runs at end of group
  – Avoids fragmentation
  – Sequential layout for big files
• Important: keep 10% or more free!
  – Reserve space in the Block Group
• Summary: FFS Inode Layout Pros
  – For small directories, can fit all data, file headers, etc. in same cylinder: no seeks!
  – File headers much smaller than whole block (a few hundred bytes), so multiple headers fetched from disk at same time
  – Reliability: whatever happens to the disk, you can find many of the files (even if directories disconnected)

UNIX 4.2 BSD FFS First Fit Block Allocation
• Fills in the small holes at the start of block group
• Avoids fragmentation, leaves contiguous free space at end

Attack of the Rotational Delay
• Problem 3: Missing blocks due to rotational delay
  – Issue: Read one block, do processing, and read next block. In meantime, disk has continued turning: missed next block! Need 1 revolution/block!
  – Solution 1: Skip sector positioning (“interleaving”)
    » Place the blocks from one file on every other block of a track: give time for processing to overlap rotation
    » Can be done by OS or in modern drives by the disk controller
  – Solution 2: Read ahead: read next block right after first, even if application hasn’t asked for it yet
    » This can be done either by OS (read ahead)
    » By disk itself (track buffers) - many disk controllers have internal RAM that allows them to read a complete track
• Modern disks + controllers do many things “under the covers”
  – Track buffers, elevator algorithms, bad block filtering
[Figure: Skip Sector; Track Buffer (Holds complete track)]

UNIX 4.2 BSD FFS
• Pros
  – Efficient storage for both small and large files
  – Locality for both small and large files
  – Locality for metadata and data
  – No defragmentation necessary!
• Cons
  – Inefficient for tiny files (a 1 byte file requires both an inode and a data block)
  – Inefficient encoding when file is mostly contiguous on disk
  – Need to reserve 10-20% of free space to prevent fragmentation

Administrivia
• Midterm 2: Graded!
  – Mean: 55.34, Stdev 15.09
  – Historical offset: +26
• No Class on Wednesday!
  – It is a holiday
• If you have any group issues going on, make sure you:
  – Make sure that your TA understands what is happening
  – Make sure that you reflect these issues on your group evaluations

Linux Example: Ext2/3 Disk Layout
• Disk divided into block groups
  – Provides locality
  – Each group has two block-sized bitmaps (free blocks/inodes)
  – Block sizes settable at format time: 1K, 2K, 4K, 8K…
• Actual inode structure similar to 4.2 BSD
  – with 12 direct pointers
• Ext3: Ext2 with Journaling
  – Several degrees of protection with comparable overhead
  – We will talk about Journaling later
• Example: create a file1.dat under /dir1/ in Ext3

Recall: Directory Abstraction
• Directories are specialized files
  – Contents: List of pairs <file name, file number>
• System calls to access directories
  – open / creat traverse the structure
  – mkdir / rmdir add/remove entries
  – link / unlink (rm)
• libc support
  – DIR * opendir (const char *dirname)
  – struct dirent * readdir (DIR *dirstream)
  – int readdir_r (DIR *dirstream, struct dirent *entry, struct dirent **result)
[Figure: directory tree with /usr, /usr/lib, /usr/lib4.3, /usr/lib4.3/foo]

Hard Links
• Hard link
  – Mapping from name to file number in the directory structure
  – First hard link to a file is made when file created
  – Create extra hard links to a file with the link() system call
  – Remove links with unlink() system call
• When can file contents be deleted?
  – When there are no more hard links to the file
  – Inode maintains reference count for this purpose
[Figure: directory tree with /usr, /usr/lib, /usr/lib4.3, /usr/lib4.3/foo]

Soft Links (Symbolic Links)
• Soft link or Symbolic Link or Shortcut
  – Directory entry contains the path and name of the file
  – Map one name to another name
• Contrast these two different types of directory entries:
  – Normal directory entry: <file name, file #>
  – Symbolic link: <file name, dest. file name>
• OS looks up destination file name each time program accesses source file name
  – Lookup can fail (error result from open)
• Unix: Create soft links with symlink syscall

Directory Traversal
• What happens when we open /home/cs162/stuff.txt?
• “/” - inumber for root inode configured into kernel, say 2
  – Read inode 2 from its position in inode array on disk
  – Extract the direct and indirect block pointers
  – Determine block that holds root directory (say block 49358)
  – Read that block, scan it for “home” to get inumber for this directory (say 8086)
• Read inode 8086 for /home, extract its blocks, read block (say 7756), scan it for “cs162” to get its inumber (say 732)
• Read inode 732 for /home/cs162, extract its blocks, read block (say 12132), scan it for “stuff.txt” to get its inumber, say 9909
• Read inode 9909 for /home/cs162/stuff.txt
• Set up file description to refer to this inode so reads / write can access the data blocks referenced by its direct and indirect pointers
• Check permissions on
[Figure: inode 2 → block 49358 (“home”:8086); inode 8086 → block 7756 (“cs162”:732); inode 732 → block 12132 (“stuff.txt”:9909); inode 9909 → Blocks of stuff.txt]
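The lookup steps in the Directory Traversal slide can be sketched as a small in-memory model. This is only an illustration: the inumbers and block numbers are the "say ..." values from the slide, while the dictionary layout and the names `INODE_TO_DIR_BLOCK`, `DIR_BLOCKS`, and `resolve` are assumptions made here, not a real on-disk format or kernel interface.

```python
# Toy model of the traversal for /home/cs162/stuff.txt.
# Numbers come from the slide; the data structures are hypothetical.

ROOT_INUMBER = 2  # inumber for "/" configured into the kernel (per the slide)

# inumber -> block number that holds that directory's entries
INODE_TO_DIR_BLOCK = {2: 49358, 8086: 7756, 732: 12132}

# block number -> directory contents as <file name, file number> pairs
DIR_BLOCKS = {
    49358: {"home": 8086},
    7756: {"cs162": 732},
    12132: {"stuff.txt": 9909},
}


def resolve(path):
    """Walk an absolute path one component at a time.

    Returns (inumber, trace), where trace lists every inode and
    directory block touched along the way.
    """
    inumber = ROOT_INUMBER
    trace = [("inode", inumber)]
    for name in [part for part in path.split("/") if part]:
        if inumber not in INODE_TO_DIR_BLOCK:
            # Current inode is not a directory in this toy model.
            raise NotADirectoryError(name)
        block = INODE_TO_DIR_BLOCK[inumber]   # read inode, find its block
        trace.append(("block", block))
        entries = DIR_BLOCKS[block]           # read block, scan for name
        if name not in entries:
            raise FileNotFoundError(path)
        inumber = entries[name]
        trace.append(("inode", inumber))
    return inumber, trace
```

Calling `resolve("/home/cs162/stuff.txt")` returns inumber 9909 with a trace of four inode reads and three directory block reads, which is why a cold open of a deep path costs several disk accesses before any file data is touched.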
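The access counts in the multilevel indexed files recap (block #23, #5, #340) can be checked with a short calculation. The 10 direct pointers and 1K blocks are from the slide; the figure of 256 pointers per indirect block assumes 4-byte block pointers, which the slide text does not state. The function name `accesses_for_block` is made up for this sketch.

```python
NUM_DIRECT = 10        # from the slide: 10 direct pointers
PTRS_PER_BLOCK = 256   # assumption: 1K block / 4-byte pointers


def accesses_for_block(n):
    """Disk accesses needed to read file block n, assuming the file
    header (inode) is already in memory from open()."""
    if n < 0:
        raise ValueError("block number must be non-negative")
    if n < NUM_DIRECT:
        return 1                      # data block only
    n -= NUM_DIRECT
    if n < PTRS_PER_BLOCK:
        return 2                      # indirect block + data
    n -= PTRS_PER_BLOCK
    if n < PTRS_PER_BLOCK ** 2:
        return 3                      # double indirect + indirect + data
    n -= PTRS_PER_BLOCK ** 2
    if n < PTRS_PER_BLOCK ** 3:
        return 4                      # triple indirect chain + data
    raise ValueError("block number beyond triply indirect range")
```

Under these assumptions blocks 0 to 9 are direct, 10 to 265 go through the single indirect block, and block 340 falls in the double indirect range, matching the slide's answers of one, two, and three accesses. The four-access case corresponds to the "four I/Os per block" cost listed under cons.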
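The first-free allocation policy from the FFS block group slides can be sketched against a free-block bitmap. This is a simplification under stated assumptions: the bitmap is a Python list of booleans for one block group, the fallback picks a single first free block rather than a new range of blocks, and `allocate_block` is a hypothetical name, not an FFS routine.

```python
def allocate_block(bitmap, last_block=None):
    """Allocate one block in a block group.

    bitmap: list of bools, True means the block is in use.
    last_block: index of the file's current last block, if any.
    Returns the allocated block index, or None if the group is full.
    """
    # To expand a file, first try the block right after its last block,
    # which keeps the file sequential on disk.
    if last_block is not None:
        successor = last_block + 1
        if successor < len(bitmap) and not bitmap[successor]:
            bitmap[successor] = True
            return successor
    # Otherwise take the first free block in the group. This fills the
    # small holes near the start and leaves long free runs at the end.
    for index, in_use in enumerate(bitmap):
        if not in_use:
            bitmap[index] = True
            return index
    return None
```

With a group laid out as `[True, False, True, True, False, False, False, False]`, a new file gets block 1 (the small hole), its next block cannot be 2, so it moves to block 4, and from there it grows sequentially into 5 and beyond. Keeping part of the group free, as the slides advise, is what makes such runs likely to exist.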
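The reference count described in the Hard Links slide can be observed from user space. The sketch below uses Python's `os.link` and `os.unlink`, which wrap the link() and unlink() system calls, and reads the count from `os.stat(...).st_nlink`. The file names and the helper `link_count_demo` are invented for illustration, and the exact counts assume a typical POSIX filesystem that supports hard links.

```python
import os
import tempfile


def link_count_demo():
    """Create a file, add a second hard link, remove the first name,
    and report the inode's link count at each step."""
    with tempfile.TemporaryDirectory() as directory:
        original = os.path.join(directory, "foo")
        alias = os.path.join(directory, "foo_alias")

        with open(original, "w") as handle:
            handle.write("hello")
        after_create = os.stat(original).st_nlink  # first link made at creation

        os.link(original, alias)                   # extra hard link, same inode
        after_link = os.stat(original).st_nlink

        os.unlink(original)                        # drop one name
        after_unlink = os.stat(alias).st_nlink

        # Contents survive because one hard link still refers to the inode.
        with open(alias) as handle:
            contents = handle.read()

        return after_create, after_link, after_unlink, contents
```

The data remains readable through the second name after the first is unlinked, which is the slide's point: contents can be deleted only when no hard links remain.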
