File Systems

Chapter 4

What do we need to know?

• How are files viewed on different OSs?
• What is a file from the programmer’s viewpoint?
  – You mostly know this, but we’ll review the main points.
• How are file systems put together?
  – How is the disk laid out for directories? For files? What kind of memory structures are needed?
• What do some real file systems look like?
  – cp/m, ms-dos (fat-12/16/32), nfs, …
• What directions are file systems going?

Long-term Information Storage

1. Must store large amounts of data

2. Information stored must survive the termination of the process using it

3. Multiple processes must be able to access the information concurrently

File Naming Issues

• Character Set • Length • Extensions

File Naming

Typical file extensions.

File Structure

There are lots of file types. Here are three: byte sequence, record sequence, tree

Sample Files

(a) An executable file (b) An archive

File Access

• Sequential access
  – read all bytes/records from the beginning
  – cannot jump around, could rewind or back up
  – convenient when medium was mag tape
• Random access
  – bytes/records read in any order
  – essential for database systems
  – read can be …
    • move file marker (seek), then read or …
    • read and then move file marker
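The two access styles can be contrasted in a few lines. This is a minimal sketch, assuming a hypothetical file of fixed-size 4-byte records; the file name and record format are invented for illustration.

```python
import os
import tempfile

# Hypothetical demo file: ten 4-byte records "r000" .. "r009".
RECORD_SIZE = 4
path = os.path.join(tempfile.mkdtemp(), "records.dat")
with open(path, "wb") as f:
    for i in range(10):
        f.write(f"r{i:03d}".encode("ascii"))

def read_sequential(path):
    """Sequential access: read every record in order from the beginning."""
    records = []
    with open(path, "rb") as f:
        while True:
            rec = f.read(RECORD_SIZE)
            if not rec:            # empty read means end of file
                break
            records.append(rec)
    return records

def read_random(path, index):
    """Random access: move the file marker (seek), then read one record."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE, os.SEEK_SET)
        return f.read(RECORD_SIZE)

all_records = read_sequential(path)
seventh = read_random(path, 7)
```

After a read, the file marker has already advanced past the record, which is the "read and then move file marker" variant.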

File Attributes

Possible file attributes

File Operations

1. Create
2. Delete
3. Open
4. Close
5. Read
6. Write
7. Append
8. Seek
9. Get attributes
10. Set attributes
11. Rename

An Example Program Using File System Calls (1/2)

An Example Program Using File System Calls (2/2)
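The program shown on these slides is not reproduced in this text. As a stand-in, here is a minimal sketch of a file copy built from the open, read, write and close calls, assuming a POSIX system; the file names and buffer size are arbitrary, and this is not the original listing.

```python
import os
import tempfile

def copy_file(src, dst, buf_size=4096):
    """Copy src to dst using open/read/write/close on file descriptors."""
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(in_fd, buf_size)
                if not chunk:              # empty read means end of file
                    break
                while chunk:               # write may be partial, so loop
                    written = os.write(out_fd, chunk)
                    chunk = chunk[written:]
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)

# Hypothetical source and destination files in a scratch directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "abc")
dst = os.path.join(workdir, "xyz")
with open(src, "wb") as f:
    f.write(b"hello, file system\n" * 1000)
copy_file(src, dst)
```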

Memory-Mapped Files

(a) Segmented process before mapping files into its address space (b) Process after mapping existing file abc into one segment and creating a new segment for xyz
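A short sketch of the idea using Python's `mmap` module: once the file is mapped, it is read and written with ordinary memory operations instead of read and write calls. The file name abc follows the slide; its contents are invented for illustration.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "abc")
with open(path, "wb") as f:
    f.write(b"hello world")

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:   # length 0 maps the whole file
        first_word = m[0:5]               # read by slicing memory
        m[0:5] = b"HELLO"                 # write by slice assignment
        m.flush()                         # push changes back to the file

with open(path, "rb") as f:
    contents = f.read()
```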

Directories

Single-Level Directory Systems

• A single level directory system
  – contains 4 files
  – owned by 3 different people, A, B, and C

Two-level Directory Systems

Letters indicate owners of the directories and files

Hierarchical Directory Systems

A hierarchical directory system

Directory Operations

1. Create
2. Delete
3. Opendir
4. Closedir
5. Readdir
6. Rename
7. Link
8. Unlink

File System Implementation

A possible file system layout

Implementing Files (1)

(a) Contiguous allocation of disk space for 7 files (b) State of the disk after files D and E have been removed

Implementing Files (2)

Storing a file as a linked list of disk blocks

Implementing Files (3)

File Allocation Table (FAT) uses a linked list in memory
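A minimal sketch of how such a table is used: the directory entry gives the first block, and each table slot names the next block of the same file. The block numbers and the end-of-chain marker below are hypothetical.

```python
EOF = -1  # assumed end-of-chain marker

# Hypothetical table: fat[b] is the block that follows block b in its file.
fat = {4: 7, 7: 2, 2: 10, 10: 12, 12: EOF,
       6: 3, 3: 11, 11: 14, 14: EOF}

def file_blocks(fat, first_block):
    """Follow the chain from the first block to the end marker."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks

def nth_block(fat, first_block, n):
    """Random access still walks the chain, but the walk is all in memory."""
    block = first_block
    for _ in range(n):
        block = fat[block]
    return block
```

Because the chain lives in memory, finding block n needs no disk reads, which is the main gain over a linked list stored in the data blocks themselves.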

Implementing Files (4)

Combination of Direct and Indirect Pointers

Note: This is a simplified version of the Unix i-node
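To make the pointer scheme concrete, the sketch below works out which pointer chain reaches a given logical block of a file. The block size, address size and number of direct pointers are assumptions chosen for illustration, not values from the slides.

```python
BLOCK_SIZE = 1024                 # assumed block size in bytes
ADDR_SIZE = 4                     # assumed size of one disk address
N_DIRECT = 10                     # assumed number of direct pointers
PTRS = BLOCK_SIZE // ADDR_SIZE    # addresses per indirect block (256)

def locate(block_no):
    """Return (level, indices) for a logical block number within a file."""
    if block_no < N_DIRECT:
        return ("direct", [block_no])
    block_no -= N_DIRECT
    if block_no < PTRS:
        return ("single", [block_no])
    block_no -= PTRS
    if block_no < PTRS ** 2:
        return ("double", [block_no // PTRS, block_no % PTRS])
    block_no -= PTRS ** 2
    if block_no < PTRS ** 3:
        return ("triple", [block_no // PTRS ** 2,
                           (block_no // PTRS) % PTRS,
                           block_no % PTRS])
    raise ValueError("file too large for this i-node layout")

# Largest file, in blocks, that this layout can address.
max_blocks = N_DIRECT + PTRS + PTRS ** 2 + PTRS ** 3
```

Small files are reached through the direct pointers alone, so they cost no extra disk reads; only large files pay for the indirect levels.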

The UNIX V7 File System

A UNIX i-node

Implementing Directories (1)

(a) A simple directory with fixed-size entries; disk addresses and attributes are in the directory entry (b) Directory in which each entry just refers to an i-node

Implementing Directories (2)

• Two ways of handling long file names in a directory
  – (a) In-line
  – (b) In a heap

Linking (1)

File system containing a file that is “shared” between two directories

Linking (2)

(a) Situation prior to linking (b) After the link is created (c) After the original owner removes the file

Disk Space Management (1)

Block size

• Dark line (left hand scale) gives data rate of a disk • Dotted line (right hand scale) gives disk space efficiency

• All files here are 2KB

Disk Space Management (2)

(a) Storing the free list on a linked list (b) A bitmap
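The bitmap variant is easy to sketch: one bit per disk block. The class name and the convention that 0 means free are assumptions for illustration.

```python
class BlockBitmap:
    """One bit per disk block; 0 = free, 1 = in use (assumed convention)."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)

    def is_used(self, b):
        return bool(self.bits[b // 8] & (1 << (b % 8)))

    def allocate(self):
        """Return the lowest-numbered free block and mark it in use."""
        for b in range(self.n_blocks):
            if not self.is_used(b):
                self.bits[b // 8] |= 1 << (b % 8)
                return b
        raise OSError("disk full")

    def free(self, b):
        """Clear the bit for block b."""
        self.bits[b // 8] &= ~(1 << (b % 8)) & 0xFF
```

A bitmap needs one bit per block regardless of how full the disk is, whereas a linked free list shrinks as the disk fills.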

File System Checking

• Possible states found while running the checker: (a) consistent (b) missing block (c) duplicate block in free list (d) duplicate data block
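A block-level check of this kind can be sketched with two counters per block: how often the block appears in a file and how often it appears in the free list. The data layout and function name are hypothetical, and the "in use and free" case is an extra inconsistency not listed on the slide.

```python
from collections import Counter

def check_blocks(n_blocks, files, free_list):
    """Report every block whose two counters are inconsistent.

    files maps a file name to its list of block numbers.
    A consistent block has exactly one count of 1 and the other 0.
    """
    in_use = Counter(b for blocks in files.values() for b in blocks)
    free = Counter(free_list)
    report = {}
    for b in range(n_blocks):
        u, f = in_use[b], free[b]
        if u > 1:
            report[b] = "duplicate data block"
        elif u == 1 and f >= 1:
            report[b] = "in use and free"
        elif u == 0 and f == 0:
            report[b] = "missing block"
        elif u == 0 and f > 1:
            report[b] = "duplicate block in free list"
    return report
```

A missing block is harmless apart from wasted space, so a checker can simply add it to the free list; a duplicate data block is the dangerous case because two files share the same storage.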

File System Performance (1)

The block cache data structures

File System Writes

• Unix
  – “Critical Blocks” are written immediately
  – Data blocks are written periodically or when the block is removed from the block cache
• MS-DOS
  – Uses “Write-through cache”. All writes are immediate.

Read Ahead

• When block N is requested, the file system can issue a read for block N+1 also.
• What if the file is not being read sequentially?
  – Initially assume it is, but monitor disk access and set a flag to non-sequential if needed. This can be used to disable read ahead.
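The flag-based policy above might look like the following sketch. The class and the `read_block` callback are hypothetical, and for simplicity the flag is never switched back on and end-of-file is ignored.

```python
class ReadAheadFile:
    """Per-file read-ahead state with a sequential-access flag."""

    def __init__(self, read_block):
        self.read_block = read_block   # hypothetical callback: fetch one block from disk
        self.sequential = True         # initially assume sequential access
        self.last_block = None
        self.cache = {}

    def get(self, n):
        if self.last_block is not None and n != self.last_block + 1:
            self.sequential = False    # caller jumped around: disable read ahead
        self.last_block = n
        if n not in self.cache:
            self.cache[n] = self.read_block(n)
        if self.sequential and n + 1 not in self.cache:
            self.cache[n + 1] = self.read_block(n + 1)   # read ahead
        return self.cache[n]
```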

File System Performance (2)

• I-nodes placed at the start of the disk
• Disk divided into cylinder groups
  – each with its own blocks and i-nodes

Log-Structured File Systems

• With CPUs faster, memory larger
  – disk caches can also be larger
  – increasing number of read requests can come from cache
  – thus, most disk accesses will be writes
• LFS strategy structures entire disk as a log
  – have all writes initially buffered in memory
  – periodically write these to the end of the disk log
  – when file opened, locate i-node, then find blocks

Journaling

• What happens when you remove a file?
  – Remove the directory entry
  – Release the i-node
  – Free the disk blocks
• What happens if there is a crash after the first or second steps?
• How can you minimize the damage?
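One common answer to the last question is to log the intended steps before performing any of them, then redo unfinished entries after a crash. The sketch below models this with in-memory dictionaries; the data layout and function names are hypothetical, and a real journal would be forced to disk before step 2.

```python
def apply_entry(fs, entry):
    """Carry out all three removal steps. Each step is idempotent,
    so repeating the whole entry after a crash is harmless."""
    fs["dir"].pop(entry["name"], None)       # remove the directory entry
    fs["inodes"].pop(entry["inode"], None)   # release the i-node
    fs["free"].update(entry["blocks"])       # free the disk blocks

def remove_file(fs, journal, name):
    inode = fs["dir"][name]
    entry = {"name": name, "inode": inode,
             "blocks": list(fs["inodes"][inode])}
    journal.append(entry)      # 1. log the intent first
    apply_entry(fs, entry)     # 2. do the work; a crash may interrupt this
    journal.remove(entry)      # 3. erase the log entry once all steps are done

def recover(fs, journal):
    """After a crash, redo every operation still in the journal."""
    while journal:
        apply_entry(fs, journal[0])
        del journal[0]
```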

The CP/M File System (1)

Memory layout of CP/M

The CP/M File System (2)

The CP/M directory entry format

FAT Partition Layout

• Partition layout:
  – Boot block
  – FAT
  – FAT copy
  – Root directory
    • In FAT-12 and FAT-16, preassigned enough space for 256 directory entries
  – Other directories and files

FAT Table Sizes

• FAT-12
  – 2^12 clusters
  – Cluster size: 512 bytes to 8KB
  – Partition size up to 32MB (4K clusters * 8KB / cluster)
  – Windows default for volumes < 16MB, such as floppies
• FAT-16
  – 2^16 clusters
  – Cluster size: 512 bytes to 64KB
  – Partition size up to 4GB (64K clusters * 64KB / cluster)
• FAT-32
  – 2^28 clusters
  – Cluster size: 512 bytes to 32KB
  – Partition size in principle up to 8TB, but Windows will only create FAT-32 partitions up to 32GB.
• Note that all FAT systems reserve the first two and last sixteen clusters in a partition, so actual partition sizes are slightly smaller than listed above.
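The partition limits follow directly from the number of cluster indices times the largest cluster size. The short calculation below checks the three figures; the function name is made up, and the reserved clusters mentioned in the note are ignored.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB

def max_partition_bytes(index_bits, max_cluster_bytes):
    """Number of addressable clusters times the largest cluster size."""
    return (2 ** index_bits) * max_cluster_bytes

fat12 = max_partition_bytes(12, 8 * KB)    # 4K clusters * 8KB
fat16 = max_partition_bytes(16, 64 * KB)   # 64K clusters * 64KB
fat32 = max_partition_bytes(28, 32 * KB)   # 256M clusters * 32KB
```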

Directory Entries

Original MS-DOS directory entry:

Directory entry used in Windows:

The Windows 98 File System

An example of how a long name is stored in Windows 98

Disk layout in classical UNIX systems

The UNIX File System

A UNIX V7 directory entry (old)

The UNIX V7 File System

A UNIX i-node

UNIX File System

Directory entry fields.

Attributes in the i-node

The UNIX File System

Path Names

The UNIX File System

Some important directories found in most UNIX systems

Pathnames

• Absolute Pathname
  – Begins at the root directory: /
  – Contains all sub-directory names separated by slashes
  – Filename
  – ~jsterling
    • ~ is recognized as a short-cut for the absolute path to the home directory.
• Relative Pathname
  – Does not begin with the root directory.
  – Starts in the current working directory. Sometimes the current working directory is shown explicitly using: .
  – May move up the file tree using .. to refer to a parent directory.

The UNIX File System

The steps in looking up /usr/ast/mbox
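The lookup can be sketched as a walk from the root i-node, one path component at a time. The directory contents and i-node numbers below are hypothetical, and real directories live in disk blocks reached through the i-node rather than in a dictionary.

```python
ROOT_INODE = 1  # assumed i-node number of the root directory

# Hypothetical directories: i-node number -> {name: i-node number}
directories = {
    1: {"bin": 4, "usr": 6},
    6: {"ast": 26, "jim": 45},
    26: {"mbox": 60, "src": 92},
}

def lookup(path):
    """Resolve an absolute path to an i-node number."""
    if not path.startswith("/"):
        raise ValueError("absolute path expected")
    inode = ROOT_INODE
    for component in path.strip("/").split("/"):
        if component == "":
            continue                      # the path was just "/"
        entries = directories.get(inode)
        if entries is None:
            raise NotADirectoryError(path)
        if component not in entries:
            raise FileNotFoundError(path)
        inode = entries[component]        # step down one level
    return inode
```

A relative path would be resolved the same way, starting from the i-node of the current working directory instead of the root.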

The UNIX File System

(a) Before linking. (b) After linking.

Note that hard links
• must refer to other files in the same file system.
• are not permitted to refer to directories.
• are indistinguishable from the original file name.

Note that soft links (aka symbolic links)
• contain only a pathname.
• result in a “dangling pointer” if the original filename is deleted.
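The difference shows up as soon as the original name is removed. This sketch assumes a POSIX system where the temporary directory supports both link types; the file names are invented.

```python
import os
import tempfile

d = tempfile.mkdtemp()
original = os.path.join(d, "original.txt")
hard = os.path.join(d, "hard.txt")
soft = os.path.join(d, "soft.txt")

with open(original, "w") as f:
    f.write("data")

os.link(original, hard)       # hard link: a second name for the same i-node
os.symlink(original, soft)    # soft link: a small file holding only the pathname

same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
link_count = os.stat(original).st_nlink
target = os.readlink(soft)

os.unlink(original)           # remove the original name

with open(hard) as f:
    hard_still_readable = f.read() == "data"
# The symbolic link still exists, but the pathname it holds no longer resolves.
soft_is_dangling = os.path.islink(soft) and not os.path.exists(soft)
```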

The UNIX File System

(a) Separate file systems before mounting. (b) After mounting.

UNIX File System (3)

The relation between the file descriptor table, the open file description and the i-node

The File System

• Super block:
  – # of i-nodes, # of blocks, etc.
• Group Descriptor:
  – # of free i-nodes, # of free blocks, # of directories
• Bitmaps:
  – Each is one block long

Record Locking in Unix

• Can lock any range of bytes in a file
• Can be multiple locks overlapping on a file
• Locks can be
  – Exclusive (write): No other process can have a lock on the range.
  – Shared (read): Other shared locks may exist on the same bytes.
• A failed lock can be made to block or not (choice of system call).
• A process’s locks are released when
  a) The process terminates
  b) The process closes the file. Even if the process had it open more than once simultaneously!
• Locks are not inherited across fork.
• Locks can carry across exec.
• Deadlock possibility:
  – Competing locks could in principle result in deadlock,
  – but this is prevented in Unix.
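Byte-range locks of this kind are reachable from Python through `fcntl.lockf`. The sketch below assumes a POSIX system; the file name and byte ranges are hypothetical. With `LOCK_NB` a conflicting lock raises `OSError` instead of blocking.

```python
import fcntl
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "accounts.dat")
with open(path, "wb") as f:
    f.write(b"\0" * 100)

fd = os.open(path, os.O_RDWR)

# Exclusive (write) lock on the 20 bytes starting at offset 10, non-blocking.
fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 20, 10, os.SEEK_SET)
os.pwrite(fd, b"x" * 20, 10)          # update the locked range
fcntl.lockf(fd, fcntl.LOCK_UN, 20, 10, os.SEEK_SET)

# Shared (read) lock on a different range of the same file.
fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB, 10, 50, os.SEEK_SET)

os.close(fd)   # closing the file releases any locks the process still holds
```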

System Calls for File Management

• s is an error code • fd is a file descriptor • position is a file offset

The stat System Call

• Mode: includes type and protection
• I-node #
• Device
• Link count
• Owner’s ID
• Group ID
• File size in bytes
• Access time
• Modification time
• Status change time
• Blocksize
• Block count (512 byte blocks)

Note that not all fields are stored in the i-nodes themselves.

From Solaris Man Page
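The same fields are visible from Python through `os.stat`. This sketch assumes a POSIX system, where the block size and block count fields are available; the file is a throwaway created for the example.

```python
import os
import stat
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "wb") as f:
    f.write(b"12345")

st = os.stat(path)
info = {
    "is_regular": stat.S_ISREG(st.st_mode),    # type part of the mode
    "protection": stat.filemode(st.st_mode),   # e.g. "-rw-r--r--"
    "inode": st.st_ino,
    "device": st.st_dev,
    "links": st.st_nlink,
    "owner_id": st.st_uid,
    "group_id": st.st_gid,
    "size": st.st_size,
    "access_time": st.st_atime,
    "modification_time": st.st_mtime,
    "status_change_time": st.st_ctime,
    "blocksize": st.st_blksize,
    "blocks_512": st.st_blocks,                # counted in 512-byte units
}
```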

System Calls for Directory Management

• s is an error code • identifies a directory stream • dirent is a directory entry

UNIX File System (4)

• A BSD directory with three files
• The same directory after the file voluminous has been removed

Unix File Protection

• Divide the world into categories (owner / group / world) and specify read / write / execute access for each.
• Add read/write permissions for the owner/user and the group with:
  – ug+rw
  – u for user/owner; g for group; o for other/world.
  – + means add and - means remove.
  – r for read; w for write; x for execute.
• Adding / removing files requires write permission on their directory!
• Setuid
  – Executable files may have their setuid bit set to indicate that they execute with the permission of their owner.
  – An example is the program to change the password, which requires write access to the password file.
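The symbolic form ug+rw corresponds to OR-ing four permission bits into the mode. A minimal sketch using `os.chmod` on a throwaway file, assuming a POSIX system:

```python
import os
import stat
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")
open(path, "w").close()
os.chmod(path, 0o400)                         # start with owner read only

# ug+rw: add read and write for the user/owner and the group.
UG_RW = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IWGRP
mode = stat.S_IMODE(os.stat(path).st_mode)
os.chmod(path, mode | UG_RW)
after_add = stat.S_IMODE(os.stat(path).st_mode)

# g-w: remove write permission for the group again.
os.chmod(path, after_add & ~stat.S_IWGRP)
after_remove = stat.S_IMODE(os.stat(path).st_mode)
```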

NTFS Goals

• File / disk security • Disk quotas • File compression • Encryption

NTFS Features

• 64-bit cluster indices • Logging of metadata • Multiple data streams • Unicode-based names • Hard links • Sparse Files • Dynamic bad-cluster remapping

NTFS Master File Table (MFT)

• Can be anywhere. Boot sector says where
• Contains up to 2^48 records
• First 16 records reserved for metadata, describing
  – MFT itself
  – copy of the MFT
  – logging file
  – Root directory
  – Bitmap of free/used blocks
  – Bootstrap code
  – Bad blocks
  – Quotas, etc.

NTFS MFT Record

• Describes one file or directory
• Has a length of 1KB
• Header
  – Magic number
  – Sequence number
  – Reference count
  – Size
  – Etc.
• Attribute / value pairs
  – E.g., name, additional MFT records, if directory then how the entries are stored.
• May need more than one MFT record to describe a file.

NTFS MFT Data Attributes

• Data Attribute keeps track of runs of consecutive blocks
• There may be holes.
• Each entry consists of a header and a sequence of runs.
• If file is small, data may be stored in MFT Record
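Mapping a file block to a disk block through a run list is a short walk over the runs. The run list below is hypothetical, and representing a hole with `None` is an assumption for this sketch rather than the on-disk encoding.

```python
# Hypothetical run list: (first disk block, number of blocks).
# A first block of None marks a hole with no disk space allocated.
runs = [(20, 4), (None, 2), (64, 3)]

def disk_block(runs, file_block):
    """Translate a block number within the file to a disk block number.

    Returns None when the block falls inside a hole.
    """
    offset = file_block
    for start, length in runs:
        if offset < length:
            return None if start is None else start + offset
        offset -= length
    raise IndexError("block past end of file")
```

A file stored in a few long runs needs only a few entries, however large it is.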

NTFS Small Directory

• A small directory is an MFT record containing directory entries. • The entries contain the file’s name and the MFT index of the file (plus some additional flags). • Note: Large directories are stored as B+ trees.

NTFS File Name Lookup

• The file name above is C:\maria\web.htm
• The filename lookup first prepends \?? to the name.
• The name “\??\C:” is a symbolic link to an object that points to the root directory of the C: drive
• From there the filename lookup is similar to the process in Unix, except that MFT index numbers replace the i-node numbers.

NTFS File Protection

• Access Control Lists
  – Discretionary (DACL) says who has what permissions
  – System (SACL) says whose actions get audited.

Network File System (NFS)

• Goal: to allow file systems across a network to appear as one logical whole
• Client – Server Model
• Server “exports” one or more directories
• Client 1 is replacing its /bin directory
• Client 1 is also mounting the projects directory as /usr/ast/work.
• Client 2 is mounting the projects directory as /mnt.

NFS Protocols

• Mounting
  – Client sends pathname to server, requesting mount permission.
  – If the directory is listed as exported then server returns a file handle.
  – Remote directories might be mounted at boot time or automounted.
• Directory and file access
  – Most Unix system calls supported.
  – Stateless connection.
    • Server does not support open
    • Instead Client sends a lookup which returns a handle.
    • Each read has to say where to start reading from.
    • File locks not supported.
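The stateless design can be illustrated with a toy model in which the server remembers nothing between requests and the client tracks its own file position. The class names, handle format and in-memory files are hypothetical, and this is not the NFS wire protocol.

```python
class StatelessFileServer:
    """Keeps no per-client state: no open-file table, no file positions."""

    def __init__(self, files):
        self.files = dict(files)     # hypothetical store: name -> bytes
        self.by_handle = {f"fh{i}": name
                          for i, name in enumerate(sorted(self.files))}
        self.by_name = {name: h for h, name in self.by_handle.items()}

    def lookup(self, name):
        """Replaces open: return a handle and record nothing about the caller."""
        return self.by_name[name]

    def read(self, handle, offset, count):
        """Every read carries its own starting offset."""
        data = self.files[self.by_handle[handle]]
        return data[offset:offset + count]

class Client:
    """The client, not the server, remembers the current file position."""

    def __init__(self, server, name):
        self.server = server
        self.handle = server.lookup(name)
        self.offset = 0

    def read(self, count):
        data = self.server.read(self.handle, self.offset, count)
        self.offset += len(data)
        return data
```

Because each request is self-contained, a server that crashes and restarts can keep serving the same clients without any recovery of open-file state.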

NFS Implementation

• System call layer
  – Open, read, close, etc.
• Virtual File System (VFS) Layer
  – Maintain table of open files, using v-nodes
  – v-node may point to a standard i-node or to an NFS Client r-node.
• NFS Client Code
  – On mount, create r-node for NFS directory.
  – On open, issue lookup and create r-node.
  – Transfers in 8KB chunks.
  – Client does read-ahead.
  – Client caching used for efficiency
    • Block discarded after 3 sec for data and 30 sec. for directories.
    • Writes synced after 30 seconds.

Example File Systems

CD-ROM File Systems

• ISO9660
  – Designed for lowest common denominator in 1988 (i.e., msdos).
  – Header contains 16 blocks that are up for grabs (e.g., bootstrap block)
  – Multi-byte numeric fields in directories are presented twice, once in big endian and once in little endian.
  – File names were 8.3 + version
• Rock Ridge extension allowed various Unix features, such as
  – file protections,
  – longer names,
  – more time stamps,
  – arbitrary depth hierarchies
• Joliet extension (from Microsoft) added Unicode.
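The doubled numeric fields are easy to reproduce with Python's `struct` module. This sketch assumes a 32-bit field with the little-endian copy stored first; the function names are made up.

```python
import struct

def both_endian_32(value):
    """Encode a 32-bit number twice: little-endian copy, then big-endian copy."""
    return struct.pack("<I", value) + struct.pack(">I", value)

def read_both_endian_32(field):
    """Decode an 8-byte doubled field and check that the two copies agree."""
    (little,) = struct.unpack("<I", field[:4])
    (big,) = struct.unpack(">I", field[4:8])
    if little != big:
        raise ValueError("the two copies of the field disagree")
    return little
```

A reader on either kind of machine can pick up the copy in its native byte order without swapping, at the cost of doubling the field size.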
