Distributed Computing Systems Distributed File Systems Outline

4/4/2014

Distributed File Systems
• Early networking and files
  – Had FTP to transfer files
  – Telnet to remote login to other systems with files
• But want more transparency!
  – Local computing with remote file system
• Distributed file systems: one of earliest distributed system components
• Enables programs to access remote files as if local
  – Transparency
• Allows sharing of data and programs
• Performance and reliability comparable to local disk

Outline
• Overview (done)
• Basic principles (next)
  – Concepts
  – Models
• Network File System (NFS)
• Andrew File System (AFS)
• Dropbox

Concepts of Distributed File System
• Transparency
• Concurrent Updates
• Replication
• Fault Tolerance
• Consistency
• Platform Independence
• Security
• Efficiency

Transparency
Illusion that all files are similar. Includes:
• Access transparency: a single set of operations. Clients that work on local files can work with remote files.
• Location transparency: clients see a uniform name space. Relocate without changing path names.
• Mobility transparency: files can be moved without modifying programs or changing system tables
• Performance transparency: within limits, local and remote file access meet performance standards
• Scaling transparency: increased loads do not degrade performance significantly. Capacity can be expanded.

Concurrent Updates
• Changes to file from one client should not interfere with changes from other clients
  – Even if changes at same time
• Solutions often include:
  – File or record-level locking
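The record-level locking listed as a solution for concurrent updates can be illustrated with a small in-memory sketch. Everything here is hypothetical and for illustration only: the `RangeLockTable` class, its method names, and the client identifiers are assumptions, not part of any real file system API. A real server would also need lock timeouts, crash recovery, and shared (read) locks.

```python
from dataclasses import dataclass, field


@dataclass
class RangeLockTable:
    """Hypothetical table of exclusive byte-range locks for a single file."""

    # Each entry is (owner, start, end) with end exclusive.
    held: list = field(default_factory=list)

    def acquire(self, owner: str, start: int, length: int) -> bool:
        """Grant the lock unless the range overlaps one held by another client."""
        end = start + length
        for other, s, e in self.held:
            if other != owner and start < e and s < end:
                return False  # overlapping record is locked by someone else
        self.held.append((owner, start, end))
        return True

    def release(self, owner: str) -> None:
        """Drop every range held by this client (for example, on close)."""
        self.held = [h for h in self.held if h[0] != owner]


table = RangeLockTable()
got_a = table.acquire("client-a", 0, 100)        # bytes 0..99
got_b = table.acquire("client-b", 50, 10)        # overlaps client-a's range
got_c = table.acquire("client-b", 100, 50)       # disjoint record
table.release("client-a")
got_b_retry = table.acquire("client-b", 50, 10)  # free now that client-a released
print(got_a, got_b, got_c, got_b_retry)
```

Locking whole files is the special case where every request covers the entire byte range; finer ranges allow more concurrency at the cost of more state to track.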
Replication
• File may have several copies of its data at different locations
  – Often for performance reasons
  – Requires updating other copies when one copy is changed
• Simple solution
  – Change master copy and periodically refresh the other copies
• More complicated solution
  – Multiple copies can be updated independently at same time; needs finer-grained refresh and/or merge

Fault Tolerance
• Function when clients or servers fail
• Detect, report, and correct faults that occur
• Solutions often include:
  – Redundant copies of data, redundant hardware, backups, transaction logs and other measures
  – Stateless servers
  – Idempotent operations

Consistency
• Data must always be complete, current, and correct
• File seen by one process looks the same for all processes accessing it
• Consistency is a special concern whenever data is duplicated
• Solutions often include:
  – Timestamps and ownership information

Platform Independence
• Access even though hardware and OS are completely different in design, architecture and functioning, from different vendors
• Solutions often include:
  – Well-defined way for clients to communicate with servers

Security
• File systems must be protected against unauthorized access, data corruption, loss and other threats
• Solutions include:
  – Access control mechanisms (ownership, permissions)
  – Encryption of commands or data to prevent “sniffing”

Efficiency
• Overall, want same power and generality as local file systems
• Early days, goal was to share an “expensive” resource: the disk
• Now, allow convenient access to remotely stored files

File Service Models

Upload/Download Model
• Read file: copy file from server to client
• Write file: copy file from client to server
• Good
  – Simple
• Bad
  – Wasteful: what if client only needs small piece?
  – Problematic: what if client doesn’t have enough space?
  – Consistency: what if others need to modify file?

Remote Access Model
• File service provides functional interface
  – Create, delete, read bytes, write bytes, …
• Good
  – Client only gets what’s needed
  – Server can manage coherent view of file system
• Bad
  – Possible server and network congestion
    • Servers used for duration of access
    • Same data may be requested repeatedly

Semantics of File Service

Sequential Semantics
Read returns result of last write
• Easily achieved if
  – Only one server
  – Clients do not cache data
• But
  – Performance problems if no cache
  – Can instead write-through
    • Must notify clients holding copies
    • Requires extra state, generates extra traffic

Session Semantics
Relax sequential rules
• Changes to open file are initially visible only to process that modified it
• Last process to modify file “wins”
• Can hide or lock file under modification from other clients

Accessing Remote Files (1 of 2)
• For transparency, implement client as module under VFS
(Additional picture next slide)

Accessing Remote Files (2 of 2)
Virtual file system allows for transparency

Stateful or Stateless Design

Stateful
Server maintains client-specific state
• Shorter requests
• Better performance in processing requests
• Cache coherence possible
  – Server can know who’s accessing what
• File locking possible

Stateless
Server maintains no information on client accesses
• Each request must identify file and offsets
• Server can crash and recover
  – No state to lose
• No open/close needed
  – They only establish state
• No server space used for state
  – Don’t worry about supporting many clients
• Problems if file is deleted on server
• File locking not possible

Caching
• Hide latency to improve performance for repeated accesses
• Four places:
  – Server’s disk
  – Server’s buffer cache (memory)
  – Client’s buffer cache (memory)
  – Client’s disk
• Client caches risk cache consistency problems

Concepts of Caching (1 of 2)

Centralized control
• Keep track of who has what open and cached on each node
• Stateful file system with signaling traffic

Read-ahead (pre-fetch)
• Request chunks of data before needed
• Minimize wait when actually needed
• But what if data pre-fetched is out of date?

Concepts of Caching (2 of 2)

Write-through
• All writes to file sent to server
  – What if another client reads its own (out-of-date) cached copy?
• All accesses require checking with server
• Or … server maintains state and sends invalidations

Delayed writes (write-behind)
• Only send writes to files in batch mode (i.e., buffer locally)
• One bulk write is more efficient than lots of little writes
• Problem: semantics become ambiguous
  – Watch out for consistency: others won’t see updates!

Write on close
• Only allows session semantics
• If lock, must lock whole file

Outline
• Overview (done)
• Basic principles (done)
• Network File System (NFS) (next)
• Andrew File System (AFS)
• Dropbox

Network File System (NFS)
• Introduced in 1984 (by Sun Microsystems)
• Not first made, but first to be used as product
• Made interfaces in public domain
  – Allowed other vendors to produce implementations
• Internet standard is NFS protocol (version 3)
  – RFC 1813
• Still widely deployed, up to v4 but maybe too bloated so v3 widely used

NFS Overview
• Provides transparent access to remote files
  – Independent of OS (e.g., Mac, Linux, Windows) or hardware
• Symmetric: any computer can be server and client
  – But many institutions have dedicated server
• Export some or all files
• Must support diskless clients
• Recovery from failure
  – Stateless, UDP, client retries
• High performance
  – Caching and read-ahead

Underlying Transport Protocol
• Initially NFS ran over UDP using Sun RPC
• Why UDP?
  – Slightly faster than TCP
  – No connection to maintain (or lose)
  – NFS is designed for Ethernet LAN
    • Relatively reliable
  – Error detection but no correction
    • NFS retries requests

NFS Protocols
• Since clients and servers can be implemented for different platforms, need well-defined way to communicate
  – Protocol: agreed-upon set of requests and responses between client and servers
• Once agreed upon, an Apple-implemented Mac NFS client can talk to a Sun-implemented Solaris NFS server
• NFS has two main protocols
  – Mounting Protocol: request access to exported directory tree
  – Directory and File Access Protocol: access files and directories (read, write, mkdir, readdir …)

NFS Mounting Protocol
• Request permission to access contents at pathname
• Client
  – Parses pathname
  – Contacts server for file handle
• Server
  – Returns file handle: file device #, i-node #, instance #
• Client
  – Create in-memory VFS i-node at mount point
  – Internally point to r-node for remote files
    • Client keeps state, not server
• Soft-mounted: if client access fails, throw error to processes. But many do not handle file errors well
• Hard-mounted: client blocks processes, retries until server up (can cause problems when NFS server down)

NFS Architecture
• In many cases, on same LAN, but not required
  – Can even have client-server on same machine
• Directories available on server through /etc/exports
  – When client mounts, becomes part of directory hierarchy
[Figure: directory trees of Server 1, Client, and Server 2, with two remote mounts on the client]
File system mounted at /usr/students is sub-tree located at /export/people in Server 1, and file system mounted at /usr/staff is sub-tree located at /nfs/users in Server 2

Example NFS exports File
• File stored on server, typically /etc/exports

    # See exports(5) for a description.
    /public 192.168.1.0/255.255.255.0(rw,no_root_squash)

• Share the folder /public
• Restrict to 192.168.1.0/24 Class C subnet
  – Use ‘*’ for wildcard/any
• Give read/write

NFS Automounter
• Automounter: only mount when accessing empty NFS-specified dir
  – Attempt unmount every 5 minutes
  – Avoids long start up costs when many NFS mounts
  – Useful if users don’t need
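To make the exports example concrete, the following is a minimal sketch of how the subnet restriction in such an entry could be checked against a client address. It is not how an NFS server parses /etc/exports: `parse_export` and `may_mount` are hypothetical helpers, the parser handles only a single "path network(options)" entry, and the ‘*’ wildcard and hostnames are deliberately left out. Only Python's standard `ipaddress` module is used.

```python
import ipaddress

# Example entry in the style of the slide (assumed format: path, network, options).
ENTRY = "/public 192.168.1.0/255.255.255.0(rw,no_root_squash)"


def parse_export(line: str):
    """Split a simplified exports entry into (path, network, options)."""
    path, client = line.split(None, 1)
    host, _, opts = client.partition("(")
    # ip_network accepts both prefix (/24) and netmask (/255.255.255.0) forms.
    network = ipaddress.ip_network(host.strip())
    options = [o for o in opts.rstrip(") ").split(",") if o]
    return path, network, options


def may_mount(line: str, client_ip: str) -> bool:
    """Return True if client_ip falls inside the network named by the entry."""
    _, network, _ = parse_export(line)
    return ipaddress.ip_address(client_ip) in network


path, network, options = parse_export(ENTRY)
print(path, network, options)
print(may_mount(ENTRY, "192.168.1.42"))  # address inside the subnet
print(may_mount(ENTRY, "10.0.0.5"))      # address outside the subnet
```

The netmask form 255.255.255.0 and the prefix form /24 describe the same network, which is why the parsed network prints in /24 notation.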
