COSC 6397 Big Data Analytics Distributed File Systems

Edgar Gabriel
Spring 2015

What is a file system
• A clearly defined method that the OS uses to store, catalog and retrieve files
• Manages the bits that make up a file itself, plus its metadata
• Metadata: “data about data”, e.g.
  – where data is logically placed on the hard drive
  – file name
  – organizational hierarchies (i.e. directories)
  – last modification date
  – permissions (read, write, execute, etc.)

UNIX File Model - overview
• A file is a sequence of bytes
• When a program opens a file, the file system establishes a file pointer. The file pointer is an integer indicating the position in the file where the next byte will be written or read.
• Disk drives read and write data in fixed-size units (disk sectors)
• File systems allocate space in blocks; a block is a fixed number of contiguous disk sectors.
• In UNIX-based file systems, the blocks that hold data are listed in an inode. An inode contains the information needed to find all the blocks that belong to a file.
• If a file is too large for the inode to hold the whole list of blocks, intermediate nodes (indirect blocks) are introduced.

Write operations
• Write:
  – The file system copies bytes from the user buffer into a system buffer.
  – If the buffer fills up, the system sends the data to disk
• System buffering
  + allows the file system to collect full blocks of data before sending them to disk
  + the file system can send several blocks at once to the disk (delayed write or write-behind)
  - data is not really saved in the case of a system crash
  - for very large write operations, the additional copy from the user buffer to the system buffer could/should be avoided

Read operations
• Read:
  – The file system determines which blocks contain the requested data
  – Reads the blocks from disk into the system buffer
  – Copies the data from the system buffer into user memory
• System buffering:
  + the file system always reads a full block (file caching)
  + if the application reads data sequentially, prefetching (read-ahead) can improve performance
  - prefetching hurts performance if the application has a random access pattern

Hiding disk latency: Caching and buffering
• Avoids repeated access to the same block
• Allows a file system to smooth out I/O behavior
• Helps to hide the latency of the hard drives
• Lowers the performance of I/O operations for irregular access
• Non-blocking I/O gives users control over prefetching and delayed writing
  – Initiate read/write operations as early as possible
  – Wait for the read/write operations to finish only when absolutely necessary

Journaling file systems
• Updating a file typically takes multiple steps.
  An interruption between the steps leaves the file system in an inconsistent state
• Example: deleting a file
  – Remove the directory entry
  – Mark the inode blocks as free in the space map
• A journaling file system records the changes it is about to make in a journal before committing them to the main file system
  – Entries are written to the journal before the file system is modified
• After a crash, the journal is replayed, and each entry is either
  – Successful: the entry can be completely replayed during recovery
  – Not replayed: the journal entry was not finished
• Journal entries often contain a per-entry checksum to detect corruption

Journaling file systems (II)
• Physical journal:
  – Data and metadata are written to the journal before modifying the file system
  – Large overhead -> data is written twice
• Logical journal:
  – Only metadata is written to the journal
  – Modifications to data are written to the file system directly
  -> worst-case scenario: data is garbage, but directory structure and file structure are consistent
  -> trade-off between performance and reliability

Log structured file systems
• Conventional file systems lay out files to optimize spatial locality
  – they make in-place changes to their data structures in order to perform well on magnetic disks (seek is slow)
• Log-structured file systems treat storage as a circular buffer
  – Writes always occur at the head of the log
• Writes create multiple, chronologically advancing versions of both file data and metadata
  – Can be used to make old file versions nameable and accessible (snapshotting)
• Recovery from crashes is simpler: upon its next mount, the file system can reconstruct its state from the last consistent point in the log
  – no need to walk all its data structures

Distributed File Systems
• The generic term for a client/server file system where the data is not locally attached to a host.
• Clients, servers, and storage are dispersed across machines.
• Configuration and implementation may vary
• Clients should view a DFS the same way they would a centralized FS; the distribution is hidden at a lower level.
• Performance is concerned with throughput and response time.
Slide based on a lecture by Jerry Breecher: http://web.cs.wpi.edu/~jb/CS502/lectures/Section17-Dist_File_Sys.ppt

Distributed File Systems - Characteristics
• Naming: mapping between logical and physical objects
  – Example: a filename maps to <cylinder, sector>.
  – In a conventional file system, it's understood where the file actually resides; the system and disk are known.
  – In a transparent DFS, the location of a file, somewhere in the network, is hidden.
• Location transparency: the name of a file does not reveal any hint of the file's physical storage location.
• Location independence: the name of a file doesn't need to be changed when the file's physical storage location changes.
Slide based on a lecture by Jerry Breecher: http://web.cs.wpi.edu/~jb/CS502/lectures/Section17-Dist_File_Sys.ppt

Distributed File Systems - Characteristics
• Caching
  – Reduces network traffic by retaining recently accessed disk blocks in a cache, so that repeated accesses to the same information can be handled locally.
  – If the required data is not already cached, a copy of the data is brought from the server to the user.
  – Accesses are performed on the cached copy.
  – Files are identified with one master copy residing at the server machine.
  – Copies of (parts of) the file are scattered in different caches.
• Cache consistency problem: keeping the cached copies consistent with the master file.
Slide based on a lecture by Jerry Breecher: http://web.cs.wpi.edu/~jb/CS502/lectures/Section17-Dist_File_Sys.ppt

Distributed File Systems - Characteristics
• Typical steps for a read operation:
  – The client makes a request for file access.
  – The request is passed to the server in message format.
  – The server makes the file access.
  – Return messages bring the result back to the client.
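The caching behavior and the read steps above can be sketched as a client that checks a local block cache before sending a request to the server. This is a minimal in-process sketch: the names `Server` and `CachingClient`, the tiny block size, and the absence of any consistency protocol are assumptions made purely for illustration and do not describe any real DFS.

```python
class Server:
    """Holds the master copy of each file (hypothetical, in-memory)."""

    def __init__(self, files):
        self.files = files          # filename -> bytes
        self.requests = 0           # counts simulated network round trips

    def read_block(self, name, block_no, block_size):
        self.requests += 1
        start = block_no * block_size
        return self.files[name][start:start + block_size]


class CachingClient:
    """Serves repeated reads of the same block from a local cache."""

    def __init__(self, server, block_size=4):
        self.server = server
        self.block_size = block_size
        self.cache = {}             # (filename, block number) -> bytes

    def read_block(self, name, block_no):
        key = (name, block_no)
        if key not in self.cache:   # cache miss: fetch a copy from the server
            self.cache[key] = self.server.read_block(
                name, block_no, self.block_size)
        return self.cache[key]      # all accesses use the cached copy


if __name__ == "__main__":
    server = Server({"a.txt": b"hello world!"})
    client = CachingClient(server)
    print(client.read_block("a.txt", 1))   # fetched from the server
    print(client.read_block("a.txt", 1))   # served from the local cache
    print(server.requests)                 # number of server round trips
```

Because the master copy on the server can change after a block has been cached, a real client would also need a way to validate or invalidate its entries, which is exactly the cache consistency problem.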
• Cache location:
  – Data can be kept in local memory or on the local disk.
  – Caching can be done on the client side and on the server side
Slide based on a lecture by Jerry Breecher: http://web.cs.wpi.edu/~jb/CS502/lectures/Section17-Dist_File_Sys.ppt

Distributed File Systems - Characteristics
• Stateful: the server keeps track of information about client requests.
  – Maintains which files are opened by a client
  – Memory must be reclaimed when the client closes the file or when the client dies.
  – Good for performance: no need to parse the filename each time, or to "open/close" the file on every request.
  – Bad for reliability: a stateful server loses everything on a crash
• Stateless: each client request provides the complete information needed by the server (i.e., filename, file offset).
  – The server does not have to maintain information on behalf of the client
  – A stateless server remembers nothing, so it can restart easily after a crash
Slide based on a lecture by Jerry Breecher: http://web.cs.wpi.edu/~jb/CS502/lectures/Section17-Dist_File_Sys.ppt

Example: NFS – The Network File System
• Protocol for a remote file service
• Stateless server (v3)
• Communication based on RPC (Remote Procedure Call)
• NFS provides session semantics
  – changes to an open file are initially only visible to the process that modified the file
• File locking is not part of the NFS protocol (v3) but is often available through a separate protocol/daemon
• Client caching is not part of the NFS protocol (v3)
  – implementation-dependent behavior
Image taken from a lecture by Jerry Breecher: http://web.cs.wpi.edu/~jb/CS502/lectures/Section17-Dist_File_Sys.ppt

Parallel File Systems
• Parallel file system: data blocks are striped across multiple storage devices on multiple storage servers.
• Support for parallel applications: all nodes can access the same files at the same time (concurrent read and write capabilities)
• Three relevant parameters:
  – Stripe factor: number of disks
  – Stripe size: size of each block
  – Which disk contains the first block of the file
[Figure: Block 1, Block 2, Block 3, … Block n of a file placed on Disk 1, Disk 2, Disk 3, Disk 4]

Parallel File Systems: Conceptual overview
[Figure: compute nodes, a metadata server, and storage servers 0 to 3]

Parallel File Systems - Concept
• Metadata server:
  – stores namespace metadata, such as filenames, directories, access permissions, and file layout.
  – is not necessarily involved in file I/O operations
• Distributed metadata server:
  – E.g. multiple metadata servers are available, each hosting a part of the namespace, split by
    • a hashing function on the file name, or
    • subtrees of the directory
• Write operations:
  – Require locking of the entire file or of a file block to ensure consistency
  – Distributed locking protocols can be used

Example: Parallel Virtual File System
• Open source project from Clemson University
• Lightweight server daemon to provide simultaneous access to storage
• Each node in the cluster can be a server, a client, or both.
• Best suited for providing large, fast temporary storage.
• The basic PVFS2 package consists of three components: a server, a client, and a kernel module.
• Default stripe size: 64 kB
  – In practice: often changed to 1 MB
  – Can be adjusted on a per-directory basis
Slides based on a talk by James W. Barker: http://www.slideshare.net/lystrata/survey-of-clusteredparallelfilesystems004lanlppt-10538039

Example: Parallel Virtual File System
• Stateless architecture
  – PVFS2 servers do not keep track of typical file system bookkeeping information such as which files have been opened, file positions, etc.
  – No shared lock state to manage
  – Servers can fail and resume without disturbing the system as a whole.
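The three striping parameters listed under Parallel File Systems (stripe factor, stripe size, and the disk holding the first block) are enough to work out where any byte of a file is stored. The following is a minimal sketch assuming plain round-robin placement; the function name `locate` and the round-robin rule are illustrative assumptions, not PVFS2's actual distribution code.

```python
def locate(offset, stripe_size, stripe_factor, first_disk=0):
    """Map a byte offset in a file to (disk index, byte offset on that disk).

    Assumes round-robin striping: block b is stored on disk
    (first_disk + b) % stripe_factor.
    """
    if offset < 0 or stripe_size <= 0 or stripe_factor <= 0:
        raise ValueError("invalid arguments")
    block = offset // stripe_size            # stripe block holding the byte
    disk = (first_disk + block) % stripe_factor
    local_block = block // stripe_factor     # earlier blocks of this file on that disk
    local_offset = local_block * stripe_size + offset % stripe_size
    return disk, local_offset


if __name__ == "__main__":
    KB = 1024
    # 64 kB stripes over 4 disks, with the file starting on disk 0
    for off in (0, 64 * KB, 300 * KB):
        print(off, locate(off, 64 * KB, 4))
```

With a 64 kB stripe size and four disks, consecutive 64 kB blocks land on consecutive disks, so a large sequential read can be served by all four disks at once.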
