An Introduction to NFS

An Introduction to NFS

An Introduction to NFS Avishay Traeger IBM Haifa Research Lab Internal Storage Course ― November 2010 v1.2 Outline The Basics NFSv2 NFSv3 NFSv4 NFSv4.1 2 Typical Use ws-avishay ws-bob ws-carl mount -t nfs (NFS Client) (NFS Client) (NFS Client) nfsserv:/home /home 10.0.2.56 10.0.2.103 10.0.2.81 nfsserv (NFS Server) /etc/exports: / /home 10.0.2.*(rw) home … avishay bob carl Some benefits of NFS: 1. All clients have the same view 2. Centralized storage management RAID Storage 3 NFS Evolution NFS is a standardized protocol Version Year RFC # Pages Status NFSv2 1989 1094 27 Obsolete NFSv3 1995 1813 126 Most popular Available on several OSs, slowly NFSv4 2003 3530 275 but surely replacing NFSv3 NFSv4.1 2010 5661 617 Early adopters only 4 Design Goals OS independence & interoperability Simple crash recovery for clients and servers Transparent access (client programs do not know files are remote) Maintain local file system semantics Reasonable performance 5 Remote Procedure Call (RPC) NFS is defined as a set of RPCs – their arguments, results, and effects RPCs are synchronous The use of RPCs makes the protocol easier to understand 6 NFS Client/Server 7 Outline The Basics NFSv2 NFSv3 NFSv4 NFSv4.1 8 Stateless Protocol The server does not keep state for RPCs Each RPC contains the necessary information to complete the call This makes crash recovery easy Server crash: server does no crash recovery, clients resubmit requests Client crash: no crash recovery for client or server This is nice in theory, but Adds complexity Not really stateless... File locking adds state, provided by separate protocol & daemon Server keeps an RPC reply cache to handle duplicate non- idempotent RPC 9 File Handles The most common NFS procedure parameter is a structure called a file handle (fh, fhandle) Provided by the server and used by the client to reference a file The fhandle is opaque to the client New fhandles returned by LOOKUP, CREATE, MKDIR, ... The fhandle for the root of the file system is obtained by the client when it mounts the file system 10 Operations NULL() returns () Do nothing procedure to used for pinging the server LOOKUP(dirfh, name) returns (fh, attr) Returns a new fh and attributes for the named file in the directory specified by dirfh CREATE(dirfh, name, attr) returns (newfh, attr) Creates a new file name in the directory dirfh and returns the new fh and attributes. REMOVE(dirfh, name) returns (status) Removes the file name from directory dirfh. 11 Operations GETATTR(fh) returns (attr) Returns file attributes (similar to stat syscall) SETATTR(fh, attr) returns (attr) Sets the mode, uid, gid, size, access time, and modify time of a file. Setting the size to zero truncates the file. 12 Operations READ(fh, offset, count) returns (attr, data) Returns up to count bytes of data from a file starting offset bytes into the file. Returns the attributes of the file. WRITE(fh, offset, count, data) returns (attr) Writes count bytes of data to a file beginning at offset bytes from the beginning of the file. Returns the new attributes of the file after the write. 13 Operations RENAME(dirfh, name, tofh, toname) returns (status) Renames name in directory dirfh, to toname in directory tofh. LINK(dirfh, name, tofh, toname) returns (status) Creates a hard link toname in directory tofh, that points to name in directory dirfh. 14 Operations SYMLINK(dirfh, name, string) returns (status) Creates a symlink name in the directory dirfh with value string. The server does not interpret the string argument in any way, just saves it and makes an association to the new symlink file. READLINK(fh) returns (string) Returns the string which is associated with the symlink file. 15 Operations MKDIR(dirfh, name, attr) returns (fh, newattr) Creates a new directory name in the directory dirfh and returns the new fh and attributes. RMDIR(dirfh, name) returns (status) Removes the empty directory name from the parent directory dirfh. STATFS(fh) returns (fsstats) Returns file system information such as block size, number of free blocks, etc. 16 Operations READDIR (dirfh, cookie, count) returns (entries) Returns up to count bytes of directory entries from the directory dirfh. Each entry contains a file name, file id, and an opaque pointer to the next directory entry called a cookie. The cookie is used in subsequent readdir calls to start reading at a specific entry in the directory. A readdir call with the cookie of zero returns entries starting with the first entry in the directory. 17 The MOUNT Protocol The MOUNT protocol takes a directory pathname and returns an fhandle if the client has permissions to mount the file system Separate protocol Easier to plug in new permission check methods Separates the OS-dependent aspects of the protocol Other OS implementations can change the MOUNT protocol without having to change the NFS protocol 18 The Linux File Handle Remember that information contained in the fhandle is only meaningful on the server If the local FS on the server reuses an inode number, an NFS client could mistakenly use an old file handle and access the new file. File systems include generation numbers in the inode to avoid this. The value is usually taken from a counter used across the file system. Important file handle fields: Major/minor number of the exported device Inode number Generation number 19 Security NFSv2 uses UNIX-style permission checks The client passes uid/gid info in RPCs, and the server performs permission checks as if the user was performing the operation locally Problem – the mapping from uid/gid to user must be the same on the client and server Can be solved via Network Information Service (NIS) Another problem – should root on the client have root access to files on the server? Server specifies policy 20 Cache Consistency Problems Clients use caching and write buffering to improve performance, but this causes issues Problem: Update visibility; If client C1 buffers writes in its cache, client C2 will see the old version NFSv2 solution: Close-to-open consistency – Clients flush on close(), so other clients will see the latest version on open() Problem: Stale cache; If C1 has a file cached, it will see old data even if the file is updated by C2 NFSv2 solution: Send a GETATTR and check the file's modification time to see if it has been updated. Cache attributes for a few seconds to reduce the number of GETATTR calls. 21 Strong Semantics for Write Because the NFS server is stateless, when servicing an NFS request it must commit any modified data to stable storage before returning results The implication for UNIX based servers is that requests which modify the file system must flush all modified data & metadata to disk before returning from the call This can be a big performance bottleneck unless something is done to improve write performance (e.g., NetApp's WAFL file system) 22 Outline The Basics NFSv2 NFSv3 NFSv4 NFSv4.1 23 Major Changes from NFSv2 to v3 Sizes and offsets are widened from 32 bits to 64 bits A new COMMIT RPC allows for reliable asynchronous writes A new ACCESS RPC improves support for ACLs and super-user All operations now return attributes to reduce the number of subsequent GETATTR procedure calls The 8KB data size limitation on the READ and WRITE procedures is relaxed 24 Major Changes from NFSv2 to v3 A new READDIRPLUS RPC returns both file handle and attributes to eliminate LOOKUP calls when scanning a directory 25 Asynchronous Writes In NFSv3, the server can reply to WRITE RPCs immediately, without syncing to disk When the client wants to ensure that data is on stable storage, it sends a COMMIT RPC Asynchronous writes are optional, and negotiated at mount time 26 Asynchronous Writes: Crash Recovery The client must keep all uncommitted data in case of a server crash Replies for WRITE and COMMIT RPCs include a write verifier for server crash detection Write verifier: 8-byte value that the server must change if it crashes (many use boot time) The client must save verifiers returned by async WRITE RPCs, and compare them to the verifier returned by a leter COMMIT RPC If the verifiers don't match, the client must rewrite all uncommitted data 27 Outline The Basics NFSv2 NFSv3 NFSv4 NFSv4.1 28 Additional Goals for NFSv4 Improved access and good performance on the Internet Only TCP Easy to transit firewalls: uses one port (mount & lock protocols merged into NFS) COMPOUNDs, delegations, uid/gid issue resolved Strong security with negotiation built in Better cross-platform interoperability Better extensibility New security types, new attributes, etc. Big design change – NFSv4 is stateful 29 Security For previous versions, only UNIX permissions were widely adopted NFSv4 mandates the use of strong RPC security flavors that depend on cryptography Security type negotiation is done securely and in-band User and groups are identified with strings, not numbers Access control policies compatible with both UNIX and Windows The problematic MOUNT protocol is removed 30 RPCSEC_GSS A framework adopted by NFSv4 to provide authentication, integrity, and privacy at the RPC level The following mechanisms must be implemented: Kerberos v5, LIPKEY, SPKM3 Security options are negotiated at mount time The SECINFO operation allows a client to determine the security policy (usually on mount, but can be on a per-filehandle basis) RPCSEC_GSS can be used with previous versions of NFS, but in NFSv4 support is mandatory 31 Identifying

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    48 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us