<<

5/15/2017

File Systems: Semantics & Structure What is a File

11A. File Semantics • a file is a named collection of information 11B. Namespace Semantics • primary roles of : 11C. File Representation – to store and retrieve data – to manage the media/space where data is stored 11D. Free Space Representation • typical operations: 11E. Namespace Representation – where is the first block of this file 11L. – where is the next block of this file 11F. File System Integration – where is block 35 of this file – allocate a new block to the end of this file – free all blocks associated with this file

File Systems Semantics and Structure 1 File Systems Semantics and Structure 2

Data and Metadata Sequential Stream Access

• File systems deal with two kinds of information int infd = open(“abc”, O_RDONLY); int outfd = open(“xyz”, O_WRONLY+O_CREATE, 0666); • Data – the contents of the file if (infd >= 0 && outfd >= 0) { – e.g. instructions of the program, words in the letter int count = (infd, buf, sizeof buf); Metadata – Information about the file • while( count > 0 ) { e.g. how many are there, when was it created – write(outfd, buf, count); sometimes called attributes – count = read(infd, inbuf, BUFSIZE); • both must be persisted and protected } – stored and connected by the file system close(infd); close(outfd); }

File Systems Semantics and Structure 3 File Systems Semantics and Structure 4

Random Access Consistency Model void *readSection(int fd, struct hdr *index, int section) { struct hdr *head = &hdr[section]; • When do new readers see results of a write? off_t offset = head->section_offset; – read-after-write size_t len = head->section_length; • as soon as possible, data-base semantics void *buf = malloc(len); if (buf != NULL) { • this commonly called “POSIX consistency” lseek(fd, offset, SEEK_SET); – read-after-close (or sync/commit) if ( read(fd, buf, len) <= 0) { free(buf); • only after writes are committed to storage buf = NULL; – open-after-close (or sync/commit) } • each open sees a consistent snapshot } return(buf); – explicitly versioned files } • each open sees a named, consistent snapshot

File Systems Semantics and Structure 5 File Systems Semantics and Structure 6

1 5/15/2017

File Attributes – basic properties Extended File Types and Attributes

• thus far we have focused on a simple model • extended protection information – a file is a "named collection of data blocks" – e.g. access control lists • in most OS files have more state than this • resource forks – file (regular file, , device, IPC port, ...) – e.g. configuration data, fonts, related objects – file length (may be excess space at end of last block)) • application defined types – ownership and protection information – e.g. load modules, HTML, e-, MPEG, ... – system attributes (e.g. hidden, archive) – creation , modification time, last accessed time • application defined properties • typically stored in structure – e.g. compression scheme, encryption algorithm, ...

File Systems Semantics and Structure 7 File Systems Semantics and Structure 8

Databases Object Stores simplified file systems, cloud storage • a tool managing business critical data • – optimized for large but infrequent transfers • table is equivalent of a file system • bucket is equivalent of a file system • data organized in rows and columns – a bucket contains named, versioned objects – row indexed by unique key • objects have long names in a flat name space – columns are named fields within each row – object names are unique within a bucket • support a rich set of operations • an object is a blob of immutable bytes – multi-object, read/modify/write transactions – get … all or part of the object – SQL searches return consistent snapshots – put … new version, there is no append/update – insert/delete row/column operations – delete

File Systems Semantics and Structure 9 File Systems Semantics and Structure 10

Key-Value Stores File Names and Name Binding

• smaller and faster than an SQL database • file system knows files by internal descriptors – optimized for frequent small transfers • users know files by names • table is equivalent of a file system – names more easily remembered than disk addresses – a table is a collection of key/value pairs – names can be structured to organize millions of files • keys have long names in a flat name space • file system responsible for name-to-file mapping – associating names with new files – key names are unique within a table – changing names associated with existing files value is a (typically 64-64MB) string • – allowing users to search the name space – get/put (entire value) • there are many ways to structure a name space – delete

File Systems Semantics and Structure 11 File Systems Semantics and Structure 12

2 5/15/2017

What is in a Name? Flat Name Spaces

directory • there is one naming context per file system /home /mark /TODO. txt – all file names must be unique within that context suffix separator base name • all files have exactly one true name – these names are probably very long • suffixes and file types • file names may have some structure – file-to-application binding often based on suffix e.g. CAC101.CS111.SECTION1.SLIDES.LECTURE_13 • defined by system configuration registry – • configured per user, or per directory – this structure may be used to optimize searches – suffix may define the file type (e.g. Windows) – the structure is very useful to users – suffix may only be a hint (magic # defines type) – the structure has no meaning to the file system

File Systems Semantics and Structure 13 File Systems Semantics and Structure 14

Hierarchical Namespaces A rooted directory tree

• directory root – a file containing references to other files – it can be used as a naming context user_1 user_2 user_3 • each has a current • names are interpreted relative to directory file_a dir_a file_b file_c dir_a • nested directories can a tree (/user_1/file_a) (/user_1/dir_a) (/user_2/file_b) (/user_3/file_c) (/user_3/dir_a) – file name is a through that tree – directory tree expands from a root node file_a file_b (/user_1/dir_a/file_a) (/user_3/dir_a/file_b) • fully qualified names begin from the root – may actually form a directed graph

File Systems Semantics and Structure 15 File Systems Semantics and Structure 16

True Names vs. Path Names Hard Links: example

• Some file systems have “true names” root • DOS and ISO9660 have a single “path name” user_1 user_3 – files are described by directory entries

– data is referred to by exactly one directory entry dir_a file_c – each file has only one (character string) name file_a /user_3/dir_a/file_b /user_1/file_a • Unix (and ) … have named links file_b – files are described by I-nodes (w/unique I#) – directories associate names with I-node numbers – many directory entries can refer to same I-node Both names now refer to the same I-node

File Systems Semantics and Structure 17 File Systems Semantics and Structure 18

3 5/15/2017

Unix-style Hard Links Symbolic Links: example

• all protection information is stored in the file root ln –s /user_3/dir_a/file_b /user_1/file_a – file owner sets file protection (e.g. read-only) user_1 user_3 – all links provide the same access to the file – anyone with read access to file can create new link file_a dir_a file_c – but directories are protected files too

• not everyone has read or search access to every directory /user_3/dir_a/file_b file_b • all links are equal – there is nothing special about the owner‘s link – file is not deleted until no links remain to file – reference count keeps track of references /user_1/file_a is now a macro for /user_3/dir_a/file_b

File Systems Semantics and Structure 19 File Systems Semantics and Structure 20

Symbolic Links Generalized Directories: Issues

• another type of special file • Can there be multiple links to a single file? – an indirect reference to some other file – who can create new references to a file? – contents is a path name to another file – if one reference is deleted, does the file go away? • recognizes symbolic links – can the file's owner cancel other peoples' access? – automatically opens associated file instead • Can clients work their way back up the tree? – if file is inaccessible or non-existent, the open fails • Is the namespace truly a tree • is not a reference to the I-node – or is it an a-cyclic directed graph – symbolic links will not prevent deletion – or an arbitrary directed graph – do not guarantee ability to follow the specified path – Internet URLs are similar to symbolic links • Does namespace span multiple file systems?

File Systems Semantics and Structure 21 File Systems Semantics and Structure 22

File System Goals File System Structure • ensure the privacy and integrity of all files • disk volumes are divided into fixed-sized blocks • efficiently implement name-to-file binding – many sizes are used: 512, 1024, 2048, 4096, 8192 ... – file associated with this name • most of them will store user data – list the file names in this part of the name space • some will store organizing “meta-data” • efficiently manage data associated w/each file – description of the file system (e.g. layout and state) – file control blocks to describe individual files – return data at offset X in file Y – lists of free blocks (not yet allocated to any file) – write data Z at offset X in file Y • all operating systems have such data structures • manage attributes associated w/each file – different OS and FS often have very different goals – what is the length of file Y – these result in very different implementations – change owner/protection of file Y to be X

File Systems Semantics and Structure 23 File Systems Semantics and Structure 24

4 5/15/2017

Unix System 5 – Structure File Descriptor Structures

block 0 boot block • all file systems have file descriptor structures super block size and number of I-nodes are block 1 • contain all info about file UNIX I-node block specified in super block – type (e.g. file, directory, pipe) type protection block 2 owner group I-node #1 (traditionally) describes the – ownership and protection # links I-nodes – size (in bytes) last access time – other attributes last written time available data blocks begin immediately after the last I-node update time end of the I-nodes. blocks – location of data blocks data block pointers … • descriptor location/# is file’s true name

File Systems Semantics and Structure 25 File Systems Semantics and Structure 26

Ways to Manage Allocated Space Unix I-nodes and block pointers

block pointers tripple double-Indirect Indirect blocks data block s • a single large, contiguous (in I-node) 1st 1st – one pointer per file, very efficient I/O 2nd nd 3rd 2 hard to extend, external fragmentation, coalescing 4th ... – th 5 10 th 6th • a linked lists of blocks 7th 11 th 8th ... 9th ... – one pointer per file, one per extent 10 th 1034 th th 11 1035 th – potentially long searches 12 th ... 13 th ...... th • N block pointers per file 2058 2059 th – limits maximum file size to N blocks ...... – but maybe some blocks contain pointers F1

File Systems Semantics and Structure 27 File Systems Semantics and Structure 28

(Unix I-node block mapping) I-nodes – performance

• I-node contains 13 block pointers • I-node is in memory whenever file is open – first 10 point to first 10 blocks of file • first ten blocks can be found with no I/O – 11th points to an indirect block (e.g. 4k bytes = 1k blocks) • after that, we must read indirect blocks – 12th points to a double indirect block (w/1k indirect blocks) – the real pointers are in the indirect blocks – 13th points to a triple indirect block (w/1k double indirs) – sequential file processing will keep referencing it • assuming 4k bytes per block and 4-bytes per pointer – 10 direct blocks = 10 * 4K bytes = 40K bytes – block I/O will keep it in the buffer cache – indirect block = 1K * 4K = 4M bytes • 1-3 extra I/O operations per thousand pages – double indirect = 1K * 4M = 4G bytes – any block can be found with 3 or fewer reads – triple indirect = 1K * 4G = 4T bytes (finite, but large) • index blocks can support "sparse" files • block # width determines max file system size

File Systems Semantics and Structure 29 File Systems Semantics and Structure 30

5 5/15/2017

DOS FAT – Volume Structure Clusters in a DOS FAT File

boot block directory entry block 0 512 name: myfile.txt cluster size and FAT length are specified 1 x BIOS parameter Each FAT entry block 1 512 in the BPB length: 1500 bytes 2 block (BPB) x corresponds to a 1st cluster: 3 3 4 cluster, and block 2 512 File 4 contains the 5 Allocation cluster #3 number of the Table 5 data clusters begin immediately after first 512 bytes of file -1 next cluster. (FAT) 6 the end of the FAT 0 -1 = End of File cluster #4 cluster #1 root directory begins in the first data 0 = free cluster (root directory) cluster second 512 bytes of file

cluster #2 cluster #5 … last 476 bytes of file

File Systems Semantics and Structure 31 File Systems Semantics and Structure 32

(DOS FAT File Systems – Overview) FAT – Performance/Capabilities

• DOS file systems divide space into "clusters" • to find a particular block of a file – cluster size (multiple of 512) fixed for each file system – get number of first cluster from directory entry – clusters are numbered 1 though N – follow chain of pointers through FAT • File control structure points to first cluster of file • entire File Allocation Table is kept in memory • File Allocation Table (FAT), one entry per cluster – no disk I/O is required to find a cluster – has number of next cluster in file – for very large files the search can still be long – 0 -> cluster is not allocated • no support for "sparse" files – -1 -> end of file – if a file has a block n, it must have all blocks < n • width of FAT determines max file system size

File Systems Semantics and Structure 33 File Systems Semantics and Structure 34

Free Space Maintenance Bit Map Free Lists • file system manager manages the free space free block bit map: one bit per block • get/release chunk should be fast operations 0 01 0 1 1 … – they are extremely frequent – we'd like to avoid doing I/O as much as possible • unlike memory, it matters what chunk we choose – best to allocate new space in same cylinder as file block #1 block #2 block #3 block #4 block #5 block #6 – user may ask for contiguous storage (in use) (in use) (free) (in use) (free) (free) • file system free-list organization must address actual data blocks – speed of allocation and de-allocation – ability to allocate contiguous or near-by space BSD file systems use bit-maps to keep track of both free blocks and free I-nodes in each – ability to coalesce and de-fragment cylinder group

File Systems Semantics and Structure 35 File Systems Semantics and Structure 36

6 5/15/2017

(free space bit-maps) FFS I-nodes and Free Lists

block pointers • fixed sized blocks simplify free-lists (in I-node) 1st 1st tag">C.G. – equal sized blocks do not require size information 2nd nd summary 3rd 2 all blocks are fungible (modulo performance) 4th ... free – th 5 10 th I-node 6th map • bit maps are a very efficient representation 7th 11 th 8th ... 9th ... – minimal space to store the map 10 th 1034 th th free 11 1035 th – very code and cache efficient to search 12 th ... block 13 th ... map ... th • bit maps enable efficient allocation 2058 2059 th – easy to find chunks in a desired area ...... – easy to coalesce adjacent chunks

File Systems Semantics and Structure 37 File Systems Semantics and Structure 38

FFS create(/foo/bar) FFS 3 x write(bar, one block)

data I-node root foo bar root foo bar bar bar bitmap bitmap I-node I-node I-node data data data[0] data[1] data[2] data I-node root foo bar root foo bar bar bar bitmap bitmap I-node I-node I-node data data data[0] data[1] data[2] find bar[0] read search / read new block read new block write search / read

search foo read write bar[1] write search foo read update I-node write find bar[1] read new I-node read new block read new I-node write new block write update foo write write data write new I-node read update I-node write new I-node write find bar[2] read new block read update foo write new block write write bar[2] write update I-node write File Systems Semantics and Structure 39 File Systems Semantics and Structure 40

FAT Free Space FAT Free Space

boot BIOS File Allocation data clusters … • can search for free clusters in desired cylinder block parms Table – we can map clusters to cylinders • the BIOS Parameter Block describes the device geometry – look at first-cluster of file to choose desired cylinder ## ## ## ## -10 0 ## 0 ## … – start search at first cluster of desired cylinder – examine each FAT entry until we find a free one Each FAT entry corresponds to a cluster, and contains the • if no free clusters, we must garbage collect number of the next cluster. recursively search all directories for existing files A value of zero indicates a cluster that is not allocated to any – file, and is therefore free. – enumerate all of the clusters in each file – any clusters not found in search can be marked as free

File Systems Semantics and Structure 41 File Systems Semantics and Structure 42

7 5/15/2017

Clusters in a DOS FAT File (Extending a DOS/FAT file)

directory entry File Allocation Table • note cluster number of current last cluster in file name: myfile.txt 1 x Each FAT entry • search FAT to find a free cluster length: 1500 bytes 2 x corresponds to a 1st cluster: 3 – free clusters are indicated by a FAT entry of zero 3 4 cluster, and 4 contains the 5 – look for a cluster in same cylinder as previous cluster cluster #3 5 number of the -1 next cluster. – put -1 in FAT entry to indicate that this is the new EOF first 512 bytes of file 6 0 -1 = End of File – this has side effect of marking new cluster as not free cluster #4 0 = free cluster second 512 bytes of file • chain new cluster on to end of the file – put number of new cluster into FAT entry for last cluster #5 cluster last 476 bytes of file

File Systems Semantics and Structure 43 File Systems Semantics and Structure 44

Directories are usually files UNIX Directories

root directory, I-node #1 • directories are a special type of file I-node # file name – used by OS to map file names into the associated files 1 . 1 .. • a directory contains multiple directory entries 9 user_1 – each directory entry describes one file and its name 31 user_2 114 user_3 • user applications are allowed to read directories directory /user_3, I-node #114 – to get information about each file I-node # file name – to find out what files exist 114 . 1 .. • only Operating System is allowed to write them 194 dir_a – the file system depends on the integrity of directories 307 file_c

A2 File Systems Semantics and Structure 45 File Systems Semantics and Structure 46

(Example: UNIX Directories) Hard Links - example I-node #1, root directory

• file names separated by slashes 1 . – e.g. /user_3/dir_a/file_b 1 .. I-node #9, directory 9 user_1

• the actual file descriptors are the I-nodes 9 . 31 user_2 – directory entries only point to I-nodes 1 .. 114 user_3 – association of a name with an I-node is called a "link" 118 dir_a 29 file_a I-node #114, directory – multiple directory entries can point to the same I- 114 . node I-node #29, file 1 .. • contents of a Unix directory entry 194 dir_a 29 file_c – name (relative to this directory) – pointer to the I-node of the associated file

File Systems Semantics and Structure 47 File Systems Semantics and Structure 48

8 5/15/2017

Hard Links and De-allocation Symbolic Links - example I-node #1, root directory

• there can be many links to a single I-node I-node #9, directory 1 . 1 .. – all links are equivalent 9 . 9 user_1 1 .. – no link enjoys a preferential (e.g. master) status 31 user_2 118 dir_a 114 user_3 • file exists as long as at least one link exists 46 file_a

• most easily implemented w/reference counts I-node #114, directory – increment with every link operation 114 . 1 .. I-node #46, symlink I-node #29, file – decrement with every unlink operation 194 dir_a – delete file when reference count goes to zero /user_3/file_c 29 file_c • could be implemented with garbage collection

File Systems Semantics and Structure 49 File Systems Semantics and Structure 50

DOS Directories (Example: DOS Directories) root directory, starting in cluster #1 • File & directory names separated by back-slashes file nametype length … 1st cluster – e.g. \user_3\dir_a\file_b user_1 DIR 256 bytes … 9 user_2 DIR 512 bytes … 31 • directory entries are the file descriptors user_3 DIR 284 bytes … 114 – as such, only one entry can refer to a particular file • contents of a DOS directory entry Directory /user_3, starting in cluster #114 name (relative to this directory) file nametype length … 1st cluster –

.. DIR 256 bytes … 1 – type (ordinary file, directory, ...) dir_a DIR 512 bytes … 62 – location of first cluster of file file_c FILE 1824 bytes … 102 – length of file in bytes – other privacy and protection attributes

File Systems Semantics and Structure 51 File Systems Semantics and Structure 52

Unix File System Mounts Unix - Mounted File Systems

• goal root file system

many file systems appear to be one giant mount filesystem2 on /export/user1 mount filesystem3 on /export/user2 – users need not be aware of file system boundaries mount filesystem4 on /opt /export /opt /bin • mechanism user1 user2 – mount device on directory – creates a warp from the named directory to the top of the file system on the specified device – any file name beneath that directory is interpreted

relative to the root of the mounted file system file system 2 file system 3 file system 4

File Systems Semantics and Structure 53 File Systems Semantics and Structure 54

9 5/15/2017

File Systems and Disks Partitioning

• independence of multiple disks • divide physical disk into multiple logical disks – they can be turned on and off independently – perhaps in disk driver, perhaps through meta-driver – they can be backed-up and restored independently – rest of system sees partitions as separate devices – some are physically removable (diskette, , Flash) • typical motivations • file system spanning multiple (non RAID) disks is risky – permit multiple OS to coexist on a single disk – losing one disk could lose parts of many files • e.g. a notebook that can boot either Windows or Linux – better to lose all of some files and none of others – fire-walls for installation, back-up and recovery – disks can be checked, repaired, restored independently • e.g. separate personal files from the installed OS file system • do put multiple file systems on a single disk – fire-walls for free-space – partitioning a physical disk into multiple logical disks • running out of space on one file system doesn't affect others

File Systems: Semantics and Structure 55 File Systems: Semantics and Structure 56

Disk Partitioning Mechanisms example: Disk Partitioning

• some designed for use by a particular OS physical sector 0 () – e.g. Linux LVM partitions, understood by GRUB PBR disk linux • some designed to support multiple OS bootstrap partition program – e.g. DOS FDISK partitions, understood by BIOS A start end type PBR • there may be hierarchical partitioning FDISK 1 00:01:00 99:7:63 linux Windows partition 0 100:1:00 149:7:63 NTFS 0 150:1:00 199:7:63 OS X partition table 0 0 0 0 – e.g. logical volumes within an FDISK partition PBR OS X • should be possible to boot from any partition Note that the first sector of each logical partition partition also contains a Partition Boot – direct from BIOS, or w/help from L2 bootstrap Record, will be used to boot the operating system for that partition. E1

File Systems: Semantics and Structure 57 File Systems: Semantics and Structure 58

A Simple Bootstrap Procedure Open Files – Levels of Indirection

1. BIOS tries designated devices in a configurable order stdin stdin stdin open-file references stdout stdout stdout (UNIX user file descriptor) 2. Read MBR (Block 0) from chosen device and check for stderr stderr stderr in process descriptor validity, and branch to it. 3. Scan FDISK table to find ACTIVE partition, read PBR B1 offset offset offset offset offset open file instance (block 0) from chosen partition, and branch to it. options options options options options I-node ptr I-node ptr I-node ptr I-node ptr I-node ptr descriptors 4. PBR loader finds and loads 2 nd level bootstrap (e.g. GRUB), and branches to it. nd I-node I-node I-node I-node in-memory file descriptors 5. 2 level bootstrap (often a very sophisticated (UNIX struct ) program) finds and loads operating system. B2

6. all early-stage loading is done by calls to BIOS on-disk file descriptors I-node I-node I-node I-node I-node (UNIX struct dinode)

File Systems Semantics and Structure 60

10 5/15/2017

(Open Files – Levels of Indirection) File Systems

• open file references (UNIX user file descriptors) • file systems implemented on top of block I/O – array to associate open file index numbers w/files – should be independent of underlying devices • open file descriptors (UNIX file structures) • all file systems perform same basic functions – describes an open instance (session) of a file – map names to files • current offset, access (read/write), lock status – map into • in-memory file descriptors (UNIX I-nodes) – manage free space and allocate it to files – copy of on-disk file description – create and destroy files • on-disk file descriptors (UNIX dinodes) – get and set file attributes – file description (ownership, protection, etc) – manipulate the file name space – location (on disk) of the file's data • different implementations and options

File Systems Semantics and Structure 61 File Systems Semantics and Structure 62

Files: Layers of implementation (integration) Layer system calls • federation layer to generalize file systems file directory file operations operations I/O – permits rest of OS to treat all file systems as the same – support dynamic addition of new file systems virtual file system integration layer device socket • plug-in interface or file system implementations I/O I/O UNIX FS FS DOS DOS FS CD CD FS … … – DOS FAT, Unix, EXT3, ISO 9660, network, etc. – each file system implemented by a plug-in module – all implement same basic methods block I/O • create, delete, open, close, link, unlink, interfaces (disk-ddi) • get/put block, get/set attributes, read directory, etc implementation is hidden from higher level clients CD disk diskette flash • drivers drivers drivers drivers – all clients see are the standard methods and properties

File Systems Semantics and Structure 63 File Systems Semantics and Structure 64

Device Independent Block I/O Assignments

• simplifying abstraction – better than generic disks • Lab • an LRU buffer cache for disk data – Get started on Project 4B – hold frequently used data until it is needed again • Reading (73pp, last big reading assignment) – hold pre-fetched read-ahead data until it is requested A/D 41 FFS implementation • buffers for data re-blocking – – adapting file system block size to device block size – A/D 42 Crash-consistency – adapting file system block size to user request sizes – A/D 43 Logging File Systems • automatic buffer management – A/D 44 Data Integrity – allocation, deallocation – A/D Appx I (6-10) SSD – automatic write-back of changed buffers –

File Systems Semantics and Structure 65 File Systems Semantics and Structure 66

11 5/15/2017

Indexed Sequential Files

• record structured files for database like use – records may be fixed or variable length • all records have one or more keys Supplementary Slides – records can be accessed by their key – sample key: student ID number • new file update operations – insert record, delete record • performance challenges – efficient insertion, key searches

File Systems Semantics and Structure 67 File Systems Semantics and Structure 68

What is an Index MVS – Volume Structure

boot block type 4 DSCB block 0 512 • a table of contents for another file type 5 DSCB misc DSCB – lists keys, and pointers to associated records block 1 512 Volume misc DSCB – built by examining the real data file Table of Contents … • enables much faster access to desired records (VTOC) misc DSCB misc DSCB – search for records in index rather than file – index is much shorter, and sorted by key(s) Data Set Control Block types available 1. describes file & 1 st three extents • index is sometimes called an inverted file space … 3. describes additional file extents – files are accessed by record location, to find the data 4. describes volume parameters – index is accessed by data key, to find record location 5. describes free space chunks

File Systems Semantics and Structure 69 File Systems Semantics and Structure 70

MVS Files and Data Extents (MVS Volumes – Overview) type 1 Data Set Control Block • Data Set Control Blocks at start of volume name: myfile.txt large chunk of data each extent is the same – describe volume parameters size, typically several other attributes large chunk of data contiguous tracks. 1st extent – describe allocated files and allocated space 2nd extent large chunk of data – describe unallocated space 3rd extent continuted extents • file space E1 type 3 Data Set Control Block – allocated to files in variable length “extents” File attributes, which are 4th extent (extent size is determined when file is created) specified when the file is 5th extent large chunk of data created, include the size • file index blocks – 1 DSCBs … and alignment of the large chunk of data extents. – describe file attributes (and unit of space allocation) – points at first three extents, and extra extent DSCB

File Systems Semantics and Structure 71 File Systems Semantics and Structure 72

12 5/15/2017

MVS – Performance/Capabilities Indexed Sequential Files

• extent based allocation • record structured files for database like use – user is required to specify extent size and alignment – records may be fixed or variable length – enables large, efficient, contiguous, allocation • all records have one or more keys – potentially very efficient file reads and writes – records can be accessed by their key – large extents may mean large internal fragmentation – sample key: student ID number – variable size extents means external fragmentation new file update operations • finding the n'th extent involves following DSCBs • – if VTOC is in memory, this needn't involve I/O – insert record, delete record • no support for sparse files • performance challenges – efficient insertion, key searches

File Systems Semantics and Structure 73 File Systems Semantics and Structure 74

Virtual Sequential Access Method What is an Index (VSAM) • a table of contents for another file • flexible and very efficient data base access – lists keys, and pointers to associated records – designed for large commercial applications – built by examining the real data file • multiple access modes • enables much faster access to desired records – entry sequenced, relative record number – search for records in index rather than file – key sequenced (primary and secondary keys) index is much shorter, and sorted by key(s) – – linear (byte stream) • index is sometimes called an inverted file • multiple record types – files are accessed by record location, to find the data – fixed length, variable length – index is accessed by data key, to find record location – compressed, spanned (very long records)

File Systems Semantics and Structure 75 File Systems Semantics and Structure 76

MVS/VSAM Extents & Control Areas (MVS VSAM Files) type 1 DSCB • Data Organization Control Area #1 name, attributes Control Interval #1 – each contiguous extent is a "Control Area" Control Interval #2 each "Control" Area is divided into "Control Intervals" 1st extent Control Interval #3 – 2nd extent Control Area #2 – each "Control Interval" contains (keyed) data records rd 3 extent Control Interval #4 – each record has a key, records appear in key-order continuted extents Control Interval #5 Control Interval #6 • Index Structure Control Area #3 Control Interval #7 – indices are maintained as separate (inverted) files Control Interval #8 Control Interval #9 – index contains one record for each Control Interval … • location (within VSAM file) of control interval • number of highest key stored in that control interval

File Systems Semantics and Structure 77 File Systems Semantics and Structure 78

13 5/15/2017

VSAM Control Intervals & Records MVS VSAM files – rationale Control Interval

Index key D record D • Control Areas key CI key E record E – all records in CA can be read without head-motion

C 2 key F record F – thus, the order of CIs within a CA is not critical F 3 • Control Intervals K 4 free space … – index search identifies CI containing desired record F length – we may read or write an entire CI in one operation E length – records within a CI are in key-order, efficient searches D length • Leave free space in each CI, free CIs in each CA Control Interval Description – so we can do record insertions w/o I/O or head motion

File Systems Semantics and Structure 79 File Systems Semantics and Structure 80

MVS/VSAM Record Insertion MVS/VSAM Record Insertion (room in Control Interval) (Control Interval overflow) Control Interval Index Control Area #1 key CI Index key D record D Control Interval #1 C 2 Control Interval #2 key CI key EF recordrecord E F 35 Control Interval #3 Control Interval #4 C 2 K 4 Control Interval #5 F 3 K 4 free space

Insert Record E Insert Record E • find the appropriate CI • find the appropriate CI • check for adequate free space EF length • check for free space • allocate space for new record • check for a free CI in same CA • shift to make room for new record D length • move ½ of records from full CI to new one • initialize length & key fields Control Interval Description • update index to reflect new CI structure • copy in record contents • perform the insert as before • update index (if necessary)

File Systems Semantics and Structure 81 File Systems Semantics and Structure 82

MVS/VSAM Record Insertion (Control Area overflow) (MVS VSAM Record Insertion) Index Control Area #1 • Use index to find appropriate Control Interval key CI Control Interval #1 – insert new record into the existing Control Interval C 2 Control Interval #2 F 35 Control Interval #3 – may involve shifting existing records to the right K 46 Control Interval #4 • If Control Interval is is already full

Control Area #2 – find a free control interval within the control area Control Interval #5 – move half records in full control interval into new one Insert Record E Control Interval #6 • find the appropriate CI Control Interval #7 • if Control Area is completely full • check for free space in CI Control Interval #8 • check for a free CI in same CA – allocate a new extent/Control Area • allocate a new CA (extent) – move half of CIs from full CA into the new CA • move ½ of CIs in full CA into new CA • update index to reflect new CA/CI structure • update index accordingly • perform the insert as before

File Systems Semantics and Structure 83 File Systems Semantics and Structure 84

14 5/15/2017

MVS VSAM Free Space Management File Systems: which is better

• Dataset initially created with free space included • DOS file systems are very simple – free record space within each Control Interval – designed for simple sequential reads and writes – free Control Intervals within each Control Area – user specifies reserve percentages for each • MVS lets programs control file organization • After insertions/deletions dataset degenerates – extent sizes to minimize fragmentation – free space is no longer evenly distributed throughout – size and alignment to maximize I/O throughput it – many different file organizations are possible – consecutive records are no longer contiguous • UNIX combines flexibility with ease of use • Dataset can be copied out and reloaded – large & small files, random & sequential access – consecutive records will be allocated consecutively – free space will be well distributed throughout dataset – efficient allocation and I/O w/o help from program

File Systems Semantics and Structure 85 File Systems Semantics and Structure 86

Unix File Extension (Extending a BSD/UNIX file) block pointers (in I-node) • note the cylinder group for the file's i-node 1st 1st C.G. 2nd • note the cylinder for the previous block in the file nd summary 3rd 2 4th ... free th • find a free block in the desired cylinder 5 10 th I-node 6th map 7th 11 th – search the free-block bit-map for free block in right cyl 8th ... 9th ... – update bit-map to show the block has been allocated 10 th 1034 th th free 11 1035 th 12 th ... block • update the I-node to point to the new block 13 th ... map ... 2058 th – go to appropriate block pointer in I-node/indirect ...... 2059 th block – if new indirect block is needed, allocate/assign it first – update I-node/indirect to point to new block

File Systems Semantics and Structure 87 File Systems Semantics and Structure 88

MVS Free Space Management (MVS Free Space)

type 5 DSCB • free chunks are of variable length st chunk VTOC free list head 1 free chunk – due to variable size of allocated extents 2nd free chunk chunk … • described by format 5 DSCBs in VTOC 26 th free chunk chunk – DSCB points to 26 non-contiguous free chunks continued free list – pointer to next format 5 DSCB in free list type 5 DSCB • large extents make first-fit allocation practical 1st free chunk chunk 2nd free chunk chunk – if a chunk is the right size, its location is secondary … • allocation, freeing and coalescing require no 26 th free chunk chunk I/O continued free list – if the entire VTOC is kept in memory

File Systems Semantics and Structure 89 File Systems Semantics and Structure 90

15 5/15/2017

MVS Data and Free Extents (Extending an MVS file) type 1 DSCB type 5 DSCB • note the required extent size (from format 1 st name: myfile.txt 1 free chunk DSCB) nd extsize: 20 tacks 2 free chunk 1st extent … • find an appropriate free chunk of space th 2nd extent 26 free chunk – first-fit search the chain of format 5 DSCBs 3rd extent continued free list remove required extent size from chosen chunk continued extents – type 5 DSCB – update format 5 DSCB to point at remainder (if any) st type 3 DSCB 1 free chunk 2nd free chunk • attach the new chunk to the end of the file 4th extent … 5th extent – find the last extent pointer in the file th … 26 free chunk – if all full, allocate and attach a new format 3 DSCB continued free list – put pointer to new chunk in the last extent pointer

File Systems Semantics and Structure 91 File Systems Semantics and Structure 92

Unix System V Free Space Unix System V free space

Boot Block • superblock in memory for each active file system – contains list of first 100 free blocks I-nodes data blocks – last of these blocks contains list of next 100 free blocks super block • when you allocate the 100th block – move its pointer list into superblock parameters free blocks • when you free the 100th block free blocks ## – move superblock list into your block ## ## ## ## • performance ## ## ## ## – only one I/O operation per hundred allocates/frees ## ## ## – allocation in specific cylinder is impractical

File Systems Semantics and Structure 93 File Systems Semantics and Structure 94

16