File Systems: Semantics & Structure What Is a File

5/15/2017 File Systems: Semantics & Structure What is a File 11A. File Semantics • a file is a named collection of information 11B. Namespace Semantics • primary roles of file system: 11C. File Representation – to store and retrieve data – to manage the media/space where data is stored 11D. Free Space Representation • typical operations: 11E. Namespace Representation – where is the first block of this file 11L. Disk Partitioning – where is the next block of this file 11F. File System Integration – where is block 35 of this file – allocate a new block to the end of this file – free all blocks associated with this file File Systems Semantics and Structure 1 File Systems Semantics and Structure 2 Data and Metadata Sequential Byte Stream Access • File systems deal with two kinds of information int infd = open(“abc”, O_RDONLY); int outfd = open(“xyz”, O_WRONLY+O_CREATE, 0666); • Data – the contents of the file if (infd >= 0 && outfd >= 0) { – e.g. instructions of the program, words in the letter int count = read(infd, buf, sizeof buf); Metadata – Information about the file • while( count > 0 ) { e.g. how many bytes are there, when was it created – write(outfd, buf, count); sometimes called attributes – count = read(infd, inbuf, BUFSIZE); • both must be persisted and protected } – stored and connected by the file system close(infd); close(outfd); } File Systems Semantics and Structure 3 File Systems Semantics and Structure 4 Random Access Consistency Model void *readSection(int fd, struct hdr *index, int section) { struct hdr *head = &hdr[section]; • When do new readers see results of a write? off_t offset = head->section_offset; – read-after-write size_t len = head->section_length; • as soon as possible, data-base semantics void *buf = malloc(len); if (buf != NULL) { • this commonly called “POSIX consistency” lseek(fd, offset, SEEK_SET); – read-after-close (or sync/commit) if ( read(fd, buf, len) <= 0) { free(buf); • only after writes are committed to storage buf = NULL; – open-after-close (or sync/commit) } • each open sees a consistent snapshot } return(buf); – explicitly versioned files } • each open sees a named, consistent snapshot File Systems Semantics and Structure 5 File Systems Semantics and Structure 6 1 5/15/2017 File Attributes – basic properties Extended File Types and Attributes • thus far we have focused on a simple model • extended protection information – a file is a "named collection of data blocks" – e.g. access control lists • in most OS files have more state than this • resource forks – file type (regular file, directory, device, IPC port, ...) – e.g. configuration data, fonts, related objects – file length (may be excess space at end of last block)) • application defined types – ownership and protection information – e.g. load modules, HTML, e-mail, MPEG, ... – system attributes (e.g. hidden, archive) – creation time, modification time, last accessed time • application defined properties • typically stored in file descriptor structure – e.g. compression scheme, encryption algorithm, ... File Systems Semantics and Structure 7 File Systems Semantics and Structure 8 Databases Object Stores simplified file systems, cloud storage • a tool managing business critical data • – optimized for large but infrequent transfers • table is equivalent of a file system • bucket is equivalent of a file system • data organized in rows and columns – a bucket contains named, versioned objects – row indexed by unique key • objects have long names in a flat name space – columns are named fields within each row – object names are unique within a bucket • support a rich set of operations • an object is a blob of immutable bytes – multi-object, read/modify/write transactions – get … all or part of the object – SQL searches return consistent snapshots – put … new version, there is no append/update – insert/delete row/column operations – delete File Systems Semantics and Structure 9 File Systems Semantics and Structure 10 Key-Value Stores File Names and Name Binding • smaller and faster than an SQL database • file system knows files by internal descriptors – optimized for frequent small transfers • users know files by names • table is equivalent of a file system – names more easily remembered than disk addresses – a table is a collection of key/value pairs – names can be structured to organize millions of files • keys have long names in a flat name space • file system responsible for name-to-file mapping – associating names with new files – key names are unique within a table – changing names associated with existing files value is a (typically 64-64MB) string • – allowing users to search the name space – get/put (entire value) • there are many ways to structure a name space – delete File Systems Semantics and Structure 11 File Systems Semantics and Structure 12 2 5/15/2017 What is in a Name? Flat Name Spaces directory • there is one naming context per file system /home /mark /TODO. txt – all file names must be unique within that context suffix separator base name • all files have exactly one true name – these names are probably very long • suffixes and file types • file names may have some structure – file-to-application binding often based on suffix e.g. CAC101.CS111.SECTION1.SLIDES.LECTURE_13 • defined by system configuration registry – • configured per user, or per directory – this structure may be used to optimize searches – suffix may define the file type (e.g. Windows) – the structure is very useful to users – suffix may only be a hint (magic # defines type) – the structure has no meaning to the file system File Systems Semantics and Structure 13 File Systems Semantics and Structure 14 Hierarchical Namespaces A rooted directory tree • directory root – a file containing references to other files – it can be used as a naming context user_1 user_2 user_3 • each process has a current working directory • names are interpreted relative to directory file_a dir_a file_b file_c dir_a • nested directories can form a tree (/user_1/file_a) (/user_1/dir_a) (/user_2/file_b) (/user_3/file_c) (/user_3/dir_a) – file name is a path through that tree – directory tree expands from a root node file_a file_b (/user_1/dir_a/file_a) (/user_3/dir_a/file_b) • fully qualified names begin from the root – may actually form a directed graph File Systems Semantics and Structure 15 File Systems Semantics and Structure 16 True Names vs. Path Names Hard Links: example • Some file systems have “true names” root • DOS and ISO9660 have a single “path name” user_1 user_3 – files are described by directory entries – data is referred to by exactly one directory entry dir_a file_c – each file has only one (character string) name file_a ln /user_3/dir_a/file_b /user_1/file_a • Unix (and Linux) … have named links file_b – files are described by I-nodes (w/unique I#) – directories associate names with I-node numbers – many directory entries can refer to same I-node Both names now refer to the same I-node File Systems Semantics and Structure 17 File Systems Semantics and Structure 18 3 5/15/2017 Unix-style Hard Links Symbolic Links: example • all protection information is stored in the file root ln –s /user_3/dir_a/file_b /user_1/file_a – file owner sets file protection (e.g. read-only) user_1 user_3 – all links provide the same access to the file – anyone with read access to file can create new link file_a dir_a file_c – but directories are protected files too • not everyone has read or search access to every directory /user_3/dir_a/file_b file_b • all links are equal – there is nothing special about the owner‘s link – file is not deleted until no links remain to file – reference count keeps track of references /user_1/file_a is now a macro for /user_3/dir_a/file_b File Systems Semantics and Structure 19 File Systems Semantics and Structure 20 Symbolic Links Generalized Directories: Issues • another type of special file • Can there be multiple links to a single file? – an indirect reference to some other file – who can create new references to a file? – contents is a path name to another file – if one reference is deleted, does the file go away? • Operating System recognizes symbolic links – can the file's owner cancel other peoples' access? – automatically opens associated file instead • Can clients work their way back up the tree? – if file is inaccessible or non-existent, the open fails • Is the namespace truly a tree • symbolic link is not a reference to the I-node – or is it an a-cyclic directed graph – symbolic links will not prevent deletion – or an arbitrary directed graph – do not guarantee ability to follow the specified path – Internet URLs are similar to symbolic links • Does namespace span multiple file systems? File Systems Semantics and Structure 21 File Systems Semantics and Structure 22 File System Goals File System Structure • ensure the privacy and integrity of all files • disk volumes are divided into fixed-sized blocks • efficiently implement name-to-file binding – many sizes are used: 512, 1024, 2048, 4096, 8192 ... – find file associated with this name • most of them will store user data – list the file names in this part of the name space • some will store organizing “meta-data” • efficiently manage data associated w/each file – description of the file system (e.g. layout and state) – file control blocks to describe individual files – return data at offset X in file Y – lists of free blocks (not yet allocated to any file) – write data Z at offset X in file Y • all operating systems have such data structures • manage attributes associated w/each file – different OS and FS often have very different goals – what is the length of file Y – these result in very different implementations – change owner/protection of file Y to be X File Systems Semantics and Structure 23 File Systems Semantics and Structure 24 4 5/15/2017 Unix System 5 – Volume Structure File Descriptor Structures block 0 boot block • all file systems have file descriptor structures super block size and number of I-nodes are block 1 • contain all info about file UNIX I-node block specified in super block – type (e.g.

File Systems: Semantics & Structure What Is a File

Technical Standard

Allgemeines Abkürzungsverzeichnis

Active@ UNDELETE Documentation

Active @ UNDELETE Users Guide | TOC | 2

Partition Wizard About Minitool Partition Wizard Minitool Partition Wizard Is an Easy-To-Use Partitioning Software with High Security and Efficiency

MSD FATFS Users Guide

Computer Hardware

CIS 4360 Secure Computer Systems Attacks Against Boot And

Synchronizing Threads with POSIX Semaphores

Openextensions POSIX Conformance Document

So You Think You Know C? [Pdf]

IT Acronyms.Docx