CS 423 – Operating Systems Design

Lecture 18 – File Systems and their Management and Optimization

Klara Nahrstedt Fall 2011

Based on slides by YY Zhou and Andrew S. Tanenbaum

CS 423 - Fall 2011 Overview

▪ Administrative announcements
  ◦ MP2 interviews today
  ◦ Homework 1 – posted today, October 3
  ◦ Homework 1 – deadline October 10, in class
▪ File Systems
  ◦ Log-Structured File Systems
  ◦ Journaling File Systems
▪ Disk Space Management
▪ Backups
▪ File System Consistency
▪ File System Performance
▪ Summary

Log-Structured File Systems

▪ CPUs are getting faster; disks are getting bigger and cheaper
▪ Disk seek time is not improving => performance bottleneck
▪ Utilize the fast CPU and large RAM/disk caches
▪ Most read requests can be satisfied directly from the FS cache, with no disk access needed
▪ Hence most disk accesses are writes, and many of them are "small writes":
  ◦ Consider creating a new file
    ▪ To write this file, the i-node for the directory, the directory block, the i-node for the file, and the file itself must be written
    ▪ While the writes can be delayed, doing so exposes the file system to serious consistency problems if a crash occurs before the writes are done
    ▪ Hence i-node writes are generally done immediately

Log-structured FS

▪ Solution:
  ◦ LFS – Log-structured FS
▪ Idea: structure the whole disk as a log
▪ Process:
  ◦ All pending writes are buffered in memory and collected into a single segment
  ◦ Periodically (or when needed) they are written to the disk as a single contiguous segment at the end of the log
  ◦ An i-node map, indexed by i-number, is maintained
    ▪ Entry i in this map points to i-node i on the disk
    ▪ The map is kept on disk, but it is also cached
▪ Opening a file now consists of
  ◦ Using the map to locate the i-node for the file
  ◦ Once the i-node has been located, finding the addresses of the blocks from it
▪ LFS has a cleaner thread that scans the log circularly to compact it, since over time segments fill up with blocks that are no longer in use
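The process above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the actual LFS implementation: the class name `TinyLFS`, the `segment_size` parameter, and the use of a Python list as the "disk" are all hypothetical, and only i-nodes are stored to keep the example short.

```python
class TinyLFS:
    """Toy log-structured store: the 'disk' is an append-only list of segments."""

    def __init__(self, segment_size=4):
        self.segment_size = segment_size  # items per segment (assumed value)
        self.log = []                     # the disk: segments in write order
        self.buffer = []                  # pending writes held in memory
        self.inode_map = {}               # i-number -> (segment index, offset)

    def write_inode(self, inumber, inode):
        """Buffer a write; flush once a full segment has been collected."""
        self.buffer.append((inumber, inode))
        if len(self.buffer) >= self.segment_size:
            self.flush()

    def flush(self):
        """Write all buffered items as one contiguous segment at the end of the log."""
        if not self.buffer:
            return
        seg_index = len(self.log)
        self.log.append(list(self.buffer))
        for offset, (inumber, _) in enumerate(self.buffer):
            self.inode_map[inumber] = (seg_index, offset)  # newest copy wins
        self.buffer.clear()

    def read_inode(self, inumber):
        """Use the map to locate the i-node, then read it from the log.

        Only flushed i-nodes are visible here; a fuller model would also
        consult the in-memory buffer.
        """
        seg_index, offset = self.inode_map[inumber]
        return self.log[seg_index][offset][1]

    def live_fraction(self, seg_index):
        """Fraction of a segment still referenced by the map (what a cleaner would inspect)."""
        segment = self.log[seg_index]
        live = sum(1 for off, (ino, _) in enumerate(segment)
                   if self.inode_map.get(ino) == (seg_index, off))
        return live / len(segment)


fs = TinyLFS(segment_size=2)
fs.write_inode(1, {"blocks": [10]})
fs.write_inode(2, {"blocks": [11]})   # second write fills the segment and triggers a flush
fs.write_inode(1, {"blocks": [12]})   # a newer version of i-node 1
fs.flush()
print(fs.read_inode(1), fs.live_fraction(0))
```

After the second version of i-node 1 is written, the old copy in segment 0 is dead space, which is why a cleaner is needed.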

Journaling File Systems

▪ Idea:
  ◦ Keep a log of what the FS is going to do before it does it
  ◦ If the system crashes before it can do its planned work, upon rebooting the system can look in the log to see what was going on at the time of the crash and finish the job
▪ Solution:
  ◦ JFS – journaling file systems, e.g., Microsoft NTFS
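The log-before-acting idea above can be sketched as follows. This is a minimal illustration, not how any real journaling file system is implemented: the class `JournaledFS`, its fields, and the `crash_after` parameter (which simulates a crash) are hypothetical. Each step is written so that repeating it is harmless, which is what makes replay after a crash safe.

```python
class JournaledFS:
    """Toy file system state plus a journal (stands in for the on-disk log)."""

    def __init__(self):
        self.directory = {"a.txt": 7}      # name -> i-number (assumed sample data)
        self.inode_blocks = {7: [20, 21]}  # i-number -> data blocks
        self.free_inodes = set()
        self.free_blocks = set()
        self.journal = []

    def log_remove(self, name):
        """Record the planned work before doing any of it."""
        ino = self.directory[name]
        entry = {"name": name, "inode": ino,
                 "blocks": list(self.inode_blocks[ino]), "done": False}
        self.journal.append(entry)
        return entry

    def apply(self, entry, crash_after=None):
        """Carry out the logged steps; stop early to simulate a crash."""
        steps = [
            lambda: self.directory.pop(entry["name"], None),   # safe to repeat
            lambda: self.free_inodes.add(entry["inode"]),      # safe to repeat
            lambda: self.free_blocks.update(entry["blocks"]),  # safe to repeat
        ]
        for i, step in enumerate(steps):
            if crash_after is not None and i >= crash_after:
                return False
            step()
        entry["done"] = True
        return True

    def recover(self):
        """After reboot: redo every logged operation that did not finish."""
        for entry in self.journal:
            if not entry["done"]:
                self.apply(entry)


fs = JournaledFS()
entry = fs.log_remove("a.txt")
fs.apply(entry, crash_after=1)   # crash after the first step
fs.recover()                     # replay finishes the job
print(fs.directory, fs.free_inodes, fs.free_blocks)
```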

JFS - Example

▪ Consider the operation of removing a file:
  1. Remove the file from its directory
  2. Release the i-node to the pool of free i-nodes
  3. Return all disk blocks to the pool of free disk blocks
▪ Suppose the first step completes and then the system crashes
  ◦ The i-node and file blocks are no longer accessible from any file, but are also not available for reassignment – a decrease in available resources
▪ If the crash occurs after the second step, only the blocks are lost
▪ What a JFS does:
  ◦ Create a log entry listing the three steps to be completed
  ◦ Write the log entry to disk
  ◦ Only after the log entry has been written, perform the individual steps
▪ JFS only works
  ◦ if the logged operations are idempotent

Disk Space Management (1)

Disk Block Size Decision – Small Blocks or Large Blocks? Trade-offs between space efficiency on the disk and access time (data rates)
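The trade-off can be made concrete with a small calculation. All numbers below (seek time, rotation time, track capacity, file size) are assumed illustrative values, not figures from the lecture. The model charges every block one seek plus half a rotation plus its transfer time.

```python
import math

SEEK_MS = 5.0          # assumed average seek time
ROTATION_MS = 8.33     # assumed time for one full rotation
TRACK_BYTES = 1 << 20  # assumed track capacity (1 MiB)
FILE_BYTES = 4096      # assumed size of a typical file


def data_rate_mb_per_s(block_bytes):
    """Throughput when each block costs a seek, half a rotation, and its transfer."""
    transfer_ms = (block_bytes / TRACK_BYTES) * ROTATION_MS
    total_ms = SEEK_MS + ROTATION_MS / 2 + transfer_ms
    return (block_bytes / 1e6) / (total_ms / 1e3)


def space_efficiency(block_bytes, file_bytes=FILE_BYTES):
    """Useful bytes divided by the bytes actually allocated for one file."""
    blocks = math.ceil(file_bytes / block_bytes)
    return file_bytes / (blocks * block_bytes)


for size in (1024, 4096, 65536, 1 << 20):
    print(f"{size:>8} B  rate={data_rate_mb_per_s(size):8.2f} MB/s  "
          f"efficiency={space_efficiency(size):6.1%}")
```

Under these assumptions the data rate rises with block size while the space efficiency for a small file falls, which is the tension the slide describes.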

Disk Space Management (2)

With small disk blocks, we get high space efficiency (little wasted space) but low performance (low data rates). With large blocks, we get high data rates (high performance) but low space utilization.

Free Space Management

▪ Bit vector
  ◦ A bit map of free blocks is kept
  ◦ Each bit in the vector represents one block
  ◦ If the block is free, the bit is zero
  ◦ Simple to find n consecutive free blocks
  ◦ Overhead: the bit map itself
  ◦ Example: BSD file system
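Finding n consecutive free blocks in a bit vector is a single linear scan. The sketch below follows the convention stated above (0 means free); the function names `find_free_run` and `allocate` are hypothetical, and a Python list of integers stands in for a packed bit map.

```python
def find_free_run(bitmap, n):
    """Return the index of the first run of n consecutive free (0) blocks, or -1."""
    if n <= 0:
        raise ValueError("n must be positive")
    run_start, run_len = 0, 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = i      # a new run of free blocks begins here
            run_len += 1
            if run_len == n:
                return run_start
        else:
            run_len = 0            # an allocated block breaks the run
    return -1


def allocate(bitmap, start, n):
    """Mark n blocks starting at `start` as in use (1)."""
    for i in range(start, start + n):
        bitmap[i] = 1


bitmap = [1, 0, 0, 1, 0, 0, 0, 1]
start = find_free_run(bitmap, 3)
allocate(bitmap, start, 3)
print(start, bitmap)
```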

Free Space Management

▪ Free list
  ◦ Keep a linked list of free blocks
  ◦ Not very efficient, because the linked list must be traversed
  ◦ Example: System V R1

Free Space Management

▪ Linked list of indices
  ◦ A linked list of index blocks is kept
  ◦ Each index block contains addresses of free blocks and a pointer to the next index block
▪ A large number of free blocks can be found quickly
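A rough sketch of why this scheme finds many free blocks quickly: one index block yields a whole batch of addresses per read, instead of one address per list node. The names `IndexBlock` and `take_free_blocks` are hypothetical, and Python objects stand in for disk blocks.

```python
class IndexBlock:
    """One index block: a batch of free-block addresses plus a link to the next index block."""

    def __init__(self, addresses, next_block=None):
        self.addresses = list(addresses)
        self.next = next_block


def take_free_blocks(head, count):
    """Remove up to `count` free-block addresses; return (addresses, new head)."""
    taken = []
    while head is not None and len(taken) < count:
        need = count - len(taken)
        taken.extend(head.addresses[:need])
        head.addresses = head.addresses[need:]
        if not head.addresses:
            head = head.next   # this index block is used up; move to the next one
    return taken, head


head = IndexBlock([10, 11, 12], IndexBlock([20, 21]))
blocks, head = take_free_blocks(head, 4)
print(blocks, head.addresses)
```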

Free Space Management

▪ Linked list of contiguous blocks that are free
  ◦ Each free list node consists of a pointer and the number of free blocks starting from that address
  ◦ Blocks are joined together into larger blocks as necessary
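The joining of adjacent free runs can be sketched as below. For brevity this uses a sorted Python list of `(start, length)` pairs rather than a linked list, and first-fit allocation is an assumption of this sketch, not something the slide specifies. The function names are hypothetical.

```python
def free_extent(extents, start, length):
    """Add a freed run (start, length) and merge it with any adjacent free runs."""
    ordered = sorted(extents + [(start, length)])
    merged = [ordered[0]]
    for s, n in ordered[1:]:
        last_s, last_n = merged[-1]
        if s <= last_s + last_n:   # touches or overlaps the previous run
            merged[-1] = (last_s, max(last_n, s + n - last_s))
        else:
            merged.append((s, n))
    return merged


def allocate_extent(extents, length):
    """First fit: carve `length` blocks out of the first run that is large enough."""
    for i, (s, n) in enumerate(extents):
        if n >= length:
            rest = [(s + length, n - length)] if n > length else []
            return s, extents[:i] + rest + extents[i + 1:]
    return None, extents


free = free_extent([(0, 2), (5, 3)], 2, 3)   # freeing blocks 2..4 bridges both runs
start, free = allocate_extent(free, 3)
print(start, free)
```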

Free Space Management (Example)

Free Space Management Issues

(a) Almost-full block of pointers to free disk blocks in RAM; three blocks of pointers on disk
(b) Result of freeing a 3-block file
(c) Alternative strategy for handling the 3 free blocks
Shaded entries are pointers to free disk blocks.

Disk Quota Management

Quotas for keeping track of each user’s disk use

File System Reliability (1)

▪ A file system to be dumped (logical dump of directories/files)
  ◦ Squares are directories, circles are files
  ◦ Shaded items have been modified since the last dump; unshaded items have not changed
  ◦ Each directory and file is labeled by its i-node number
▪ Bit maps used by the logical dumping algorithm

File System Reliability (2)

▪ Dump Algorithm:
  ◦ Phase 1 – for each modified file, its i-node is marked in the bitmap, and each directory is also marked (whether or not it has been modified)
  ◦ Phase 2 – recursively walk the tree again, unmarking any directories that have no modified files or directories in them or under them
  ◦ Phase 3 – scan i-nodes in numerical order and dump all directories that are marked for dumping
  ◦ Phase 4 – scan i-nodes in numerical order and dump files that are marked for dumping
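The four phases can be sketched as follows. The tree representation (a dict from i-number to a record with `dir`, `children`, and `modified` fields) and the function name `logical_dump` are hypothetical. One assumption beyond the slide: a directory that is itself modified stays marked in phase 2 even if nothing under it changed.

```python
def logical_dump(tree, root):
    """Return the i-numbers to dump, directories first, each group in numerical order."""
    marked = set()   # stands in for the bitmap indexed by i-number

    def phase1(ino):
        # Mark every directory, and every file that has been modified.
        node = tree[ino]
        if node["dir"]:
            marked.add(ino)
            for child in node["children"]:
                phase1(child)
        elif node["modified"]:
            marked.add(ino)

    def phase2(ino):
        # Unmark directories with nothing modified in or under them.
        node = tree[ino]
        if not node["dir"]:
            return ino in marked
        keep = node["modified"]
        for child in node["children"]:
            keep = phase2(child) or keep   # visit every child, no short-circuit
        if not keep:
            marked.discard(ino)
        return keep

    phase1(root)
    phase2(root)
    dirs = [i for i in sorted(marked) if tree[i]["dir"]]        # phase 3
    files = [i for i in sorted(marked) if not tree[i]["dir"]]   # phase 4
    return dirs + files


tree = {
    1: {"dir": True, "children": [2, 3, 4], "modified": False},
    2: {"dir": True, "children": [5], "modified": False},
    3: {"dir": True, "children": [6], "modified": False},
    4: {"dir": False, "modified": True},
    5: {"dir": False, "modified": True},
    6: {"dir": False, "modified": False},
}
print(logical_dump(tree, 1))
```

Directories are dumped before files so that, on restore, the directory skeleton exists before any file is placed into it.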

File System Consistency

▪ fsck utility in UNIX
▪ Block consistency – two tables, each with one counter per block
  ◦ Table 1 – counts how many times each block appears in a file (assigned blocks)
  ◦ Table 2 – counts how many times each block appears in the free list (free blocks)

File System Consistency

▪ File system states
  ◦ (a) Consistent
  ◦ After crash: (b) missing block
    ▪ Causes no harm, but wastes space and reduces the capacity of the disk
    ▪ Solution: just add the missing block to the free list
  ◦ After crash: (c) duplicate block in free list
    ▪ Happens only if the free list is kept as a list (not a bitmap)
    ▪ Solution: rebuild the free list
  ◦ After crash: (d) duplicate data block
    ▪ Solution: allocate a free block, copy the contents of the duplicated block (block 5 in the example) into it, and insert the copy into one of the files; the error should be reported so that the user can inspect the damage
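The block check can be sketched with two counter tables, one for blocks in use and one for blocks in the free list. This is an illustration of the idea only, not the fsck implementation; the function name `check_blocks` and the input format (a dict of file name to block list, plus a free list) are hypothetical.

```python
def check_blocks(num_blocks, files, free_list):
    """Classify blocks as missing, duplicated in the free list, or duplicated in files."""
    in_use = [0] * num_blocks   # how often each block appears in a file
    free = [0] * num_blocks     # how often each block appears in the free list

    for blocks in files.values():
        for b in blocks:
            in_use[b] += 1
    for b in free_list:
        free[b] += 1

    report = {"missing": [], "duplicate_free": [], "duplicate_data": []}
    for b in range(num_blocks):
        if in_use[b] == 0 and free[b] == 0:
            report["missing"].append(b)         # case (b): in neither table
        if free[b] > 1:
            report["duplicate_free"].append(b)  # case (c): listed as free twice
        if in_use[b] > 1:
            report["duplicate_data"].append(b)  # case (d): shared by two files
    return report


files = {"a": [0, 1], "b": [1, 2]}
print(check_blocks(6, files, free_list=[3, 3, 4]))
```

In a consistent file system every block has a 1 in exactly one of the two tables, so all three lists come back empty.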

File System Performance

▪ Access to disk – much slower than access to memory
  ◦ Read a memory word – 10 nsec
  ◦ Read from hard disk – transfer at 10 MB/s, plus 5-10 msec to seek to the track and wait for the sector
▪ Methods to speed up
  ◦ Cache data in memory
  ◦ Use the block read-ahead method
  ◦ Reduce disk arm motion

File System Performance (Caching)

▪ If the cache is full, use replacement techniques
  ◦ LRU – Least Recently Used
  ◦ FIFO
▪ The cache here is a collection of blocks
▪ Important method for multimedia playback

File System Performance (Block Read Ahead)

▪ Block read-ahead method means
  ◦ Try to get blocks into the cache before they are needed
  ◦ Increase the hit rate by prefetching anticipated blocks
▪ Approach:
  ◦ User requests block k
  ◦ FS gets block k
  ◦ FS checks if block k+1 is in the cache; if not, FS gets block k+1 from disk, anticipating that it will be needed in the future
▪ Advantage: if the user needs block k+1, the access is fast – great method for video playback
▪ Disadvantage: if the user does not need block k+1, unnecessary extra work has been done
▪ Recommendation:
  ◦ FS keeps track of access patterns to open files
    ▪ Sequential access mode
    ▪ Random access mode
  ◦ FS may use a bit associated with each file to keep track of the access pattern (1 for sequential access, 0 for random access)
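A block cache with LRU replacement and read-ahead guided by a per-file access bit can be sketched as follows. The class `BlockCache`, its fields, and the choice to start every file in sequential mode are assumptions of this sketch. A dict stands in for the disk, and requested blocks are assumed to exist on it.

```python
from collections import OrderedDict


class BlockCache:
    """LRU block cache that prefetches block k+1 while a file is read sequentially."""

    def __init__(self, disk, capacity):
        self.disk = disk              # block number -> data (stands in for the disk)
        self.capacity = capacity
        self.cache = OrderedDict()    # least recently used block first
        self.disk_reads = 0
        self.last_block = {}          # file id -> last block requested
        self.sequential = {}          # file id -> 1 (sequential) or 0 (random)

    def _load(self, k):
        """Bring block k into the cache, evicting the LRU block if the cache is full."""
        if k in self.cache:
            self.cache.move_to_end(k)        # mark as most recently used
            return
        if k not in self.disk:
            return                           # nothing to prefetch past the end
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used block
        self.cache[k] = self.disk[k]
        self.disk_reads += 1

    def read(self, file_id, k):
        """Return (data, hit) for block k and update the file's access bit."""
        last = self.last_block.get(file_id)
        if last is not None:
            self.sequential[file_id] = 1 if k == last + 1 else 0
        else:
            self.sequential[file_id] = 1     # assumed: start in sequential mode
        self.last_block[file_id] = k

        hit = k in self.cache
        self._load(k)
        data = self.cache[k]
        if self.sequential[file_id]:
            self._load(k + 1)                # read ahead
        return data, hit


disk = {i: f"data{i}" for i in range(10)}
cache = BlockCache(disk, capacity=4)
print(cache.read("video", 0), cache.read("video", 1), cache.read("video", 7))
```

The second read is a hit because block 1 was prefetched; the jump to block 7 clears the bit, so block 8 is not fetched.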

File System Performance (Reduce Disk Arm Motion)

▪ I-nodes placed at the start of the disk
▪ Disk divided into cylinder groups
  ◦ Each with its own blocks and i-nodes

Questions

▪ Which of the following free space management schemes allows a large number of free blocks to be found quickly?
  ◦ Bit vector
  ◦ Free list
  ◦ Linked list of indices
▪ Consider a system in which free space is kept in a free-space list. If the pointer to the free-space list is lost, the system cannot reconstruct the free-space list:
  ◦ Is this true or false?

Conclusion

▪ Performance optimization of file systems is crucial
▪ Pay attention to
  ◦ Block sizes
  ◦ Placement of i-nodes
  ◦ Free space management
  ◦ File system reliability
  ◦ File system performance (caching, prefetching, ...)
