Nache: Design and Implementation of a Caching Proxy for NFSv4

Ajay Gulati (Rice University), Manoj Naik (IBM Almaden Research Center), Renu Tewari (IBM Almaden Research Center)

Abstract

In this paper, we present Nache, a caching proxy for NFSv4 that enables a consistent cache of a remote NFS server to be maintained and shared across multiple local NFS clients. Nache leverages the features of NFSv4 to improve the performance of file accesses in a wide-area distributed setting by bringing the data closer to the client. Conceptually, Nache acts as an NFS server to the local clients and as an NFS client to the remote server. To provide cache consistency, Nache exploits the read and write delegation support in NFSv4. Nache enables the cache and the delegation to be shared among a set of local clients, thereby reducing conflicts and improving performance. We have implemented Nache in the Linux 2.6 kernel. Using Filebench workloads and other benchmarks, we present an evaluation of Nache and show that it can reduce the NFS operations at the server by 10-50%.

1 Introduction

Most medium to large enterprises store their unstructured data in filesystems spread across multiple file servers. With ever increasing network bandwidths, enterprises are moving toward distributed operations where sharing presentations and documents across office locations, multi-site collaborations, and joint product development have become increasingly common. This requires sharing data in a uniform, secure, and consistent manner across the global enterprise with reasonably good performance.

Data and file sharing has long been achieved through traditional file transfer mechanisms such as FTP or distributed file sharing protocols such as NFS and CIFS. While the former are mostly ad hoc, the latter tend to be "chatty", with multiple round-trips across the network for every access. Both NFS and CIFS were originally designed to work in a local area network, with low-latency, high-bandwidth access between servers and clients, and as such are not optimized for access over a wide area network. Other filesystem architectures such as AFS [17] and DCE/DFS [20] have attempted to solve the WAN file sharing problem through a distributed architecture that provides a shared namespace by uniting disparate file servers at remote locations into a single logical filesystem. However, these technologies, with proprietary clients and protocols, incur substantial deployment expense and have not been widely adopted for enterprise-wide file sharing. In more controlled environments, data sharing can also be facilitated by a clustered filesystem such as GPFS [31] or Lustre [1]. While these are designed for high performance and strong consistency, they are either expensive, or difficult to deploy and administer, or both.

Recently, a new market has emerged to serve the file access requirements of enterprises with outsourced partnerships, where knowledge workers are expected to interact across a number of locations over a WAN. Wide Area File Services (WAFS) is fast gaining momentum and recognition, with leading storage and networking vendors integrating WAFS solutions into new product offerings [5][3]. File access is provided through standard NFS or CIFS protocols with no modifications required to the clients or the server. To compensate for the high latency, low bandwidth, and lossy links of WAN access, the WAFS offerings rely on custom devices at both the client and server, with a custom protocol optimized for WAN access in between.

One approach often used to reduce WAN latency is to cache the data closer to the client. Another is to use a WAN-friendly access protocol. Version 4 of the NFS protocol added a number of features to make it more suitable for WAN access [29]. These include: batching multiple operations in a single RPC call to the server, enabling read and write file delegations to reduce cache consistency checks, and support for redirecting clients to other, possibly closer, servers.

USENIX Association FAST '07: 5th USENIX Conference on File and Storage Technologies 199

In this paper, we discuss the design and implementation of a caching file server proxy called Nache. Nache leverages the features of NFSv4 to improve the performance of file serving in a wide-area distributed setting. Basically, the Nache proxy sits between a local NFS client and a remote NFS server, bringing the remote data closer to the client. Nache acts as an NFS server to the local client and as an NFS client to the remote server. To provide cache consistency, Nache exploits the read and write delegation support in NFSv4. Nache is ideally suited for environments where data is commonly shared across multiple clients. It provides a consistent view of the data by allowing multiple clients to share a delegation, thereby removing the overhead of a recall on a conflicting access. Sharing files across clients is common for read-only data front-ended by web servers, and is becoming widespread for presentations, videos, documents, and collaborative projects across a distributed enterprise. Nache is beneficial even when the degree of sharing is small, as it reduces both the response time of a WAN access and the overhead of recalls.

In this paper, we highlight our three main contributions. First, we explore the performance implications of read and write open delegations in NFSv4. Second, we detail the implementation of an NFSv4 proxy cache architecture in the Linux 2.6 kernel. Finally, we discuss how delegations are leveraged to provide consistent caching in the proxy. Using our testbed infrastructure, we demonstrate the performance benefits of Nache using the Filebench benchmark and other workloads. For these workloads, Nache is shown to reduce the number of NFS operations seen at the server by 10-50%.

The rest of the paper is organized as follows. In the next section we provide a brief background on consistency support in various distributed filesystems. Section 3 analyzes the delegation overhead and benefits in NFSv4. Section 4 provides an overview of the Nache architecture. Its implementation is detailed in section 5 and evaluated in section 6 using different workloads. We discuss related work in section 7. Finally, section 8 presents our conclusions and describes future work.

2 Background: Cache Consistency

An important consideration in the design of any caching solution is cache consistency. Consistency models in caching systems have been studied in depth in various large distributed systems and databases [9]. In this section, we review the consistency characteristics of some widely deployed distributed filesystems.

• Network File System (NFS): Since perfect coherency among NFS clients is expensive to achieve, the NFS protocol falls back to a weaker model known as close-to-open consistency [12]. In this model, the NFS client checks for file existence and permissions on every open by sending a GETATTR or ACCESS operation to the server. If the file attributes are the same as those just after the previous close, the client assumes its data cache is still valid; otherwise, the cache is purged. On every close, the client writes back any pending changes to the file so that the changes are visible on the next open. NFSv3 introduced weak cache consistency [28], which provides a way, albeit imperfect, of checking a file's attributes before and after an operation to allow a client to identify changes that could have been made by other clients. NFS, however, never implemented distributed cache coherency or concurrent write management [29] to differentiate between updates from one client and those from multiple clients. Spritely NFS [35] and NQNFS [23] added stronger consistency semantics to NFS through server callbacks and leases, but these never made it into the official protocol.

• Andrew File System (AFS): Compared to NFS, AFS is better suited for WAN access as it relies heavily on client-side caching for performance [18]. An AFS client does whole-file caching (or caches large chunks in later versions). Unlike an NFS client, which checks with the server on a file open, in AFS the server establishes a callback to notify the client of updates that other clients may make to the file. The changes made to a file become visible at the server when the client closes the file. When there is a conflict with a concurrent close done by multiple clients, the file at the server reflects the data of the last client's close. DCE/DFS improves upon AFS caching by letting the client specify the type of file access (read, write) so that the callback is issued only when there is an open mode conflict.

• Common Internet File System (CIFS): CIFS enables stronger cache consistency [34] by using opportunistic locks (OpLocks) as a mechanism for cache coherence. The CIFS protocol allows a client to request three types of OpLocks at the time of file open: exclusive, batch, and level II. An exclusive oplock enables the client to do all file operations on the file without any communication with the server until the file is closed. In case another client requests access to the same file, the oplock is revoked and the client is required to flush all modified data back to the server. A batch oplock permits the client to hold on to the oplock even after a close if it plans to reopen the file very soon. Level II oplocks are shared and are used for read-only data caching, where multiple clients can simultaneously read the locally cached version of the file.

3 Delegations and Caching in NFSv4

[Figure: NFS operations seen at the server (in '000s), with no delegations and with read delegations]

The design of version 4 of the NFS protocol [29] includes a number of features to improve performance in a wide area network with high latency and low bandwidth links.
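The close-to-open model described in Section 2 can be illustrated with a small simulation. The sketch below is a toy model under our own naming (Server, Client, and a single integer change attribute standing in for the real attribute set); it is not the Linux NFS client implementation, but it reproduces the observable behavior: revalidation via GETATTR on every open, write-back on close, and no coherence in between.

```python
class Server:
    """Toy file server tracking, per file, the data and a change attribute."""
    def __init__(self):
        self.data = {}    # path -> bytes
        self.change = {}  # path -> change counter, bumped on every write

    def getattr(self, path):
        return self.change.get(path, 0)

    def read(self, path):
        return self.data.get(path, b"")

    def write(self, path, data):
        self.data[path] = data
        self.change[path] = self.change.get(path, 0) + 1


class Client:
    """Close-to-open consistency: revalidate with GETATTR on every open,
    flush pending writes on close."""
    def __init__(self, server):
        self.server = server
        self.cache = {}      # path -> locally cached bytes
        self.seen_attr = {}  # path -> change attribute at last validation
        self.dirty = set()

    def open(self, path):
        attr = self.server.getattr(path)      # GETATTR on every open
        if self.seen_attr.get(path) != attr:  # attributes changed since the
            self.cache[path] = self.server.read(path)  # last close: refetch
            self.seen_attr[path] = attr

    def read(self, path):
        return self.cache.get(path, b"")      # served from the local cache

    def write(self, path, data):
        self.cache[path] = data               # buffered locally until close
        self.dirty.add(path)

    def close(self, path):
        if path in self.dirty:                # write back pending changes so
            self.server.write(path, self.cache[path])  # they are visible
            self.dirty.discard(path)          # on the next open elsewhere
        self.seen_attr[path] = self.server.getattr(path)
```

Running two such clients against one server shows both halves of the model: a client that reopens a file after another client's close sees the new data, while a client that holds a file open across another's update continues to read its stale cached copy.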
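The CIFS oplock behavior sketched in Section 2 amounts to a grant/break decision at open time. The following is a deliberately simplified decision function of our own devising, not the full SMB oplock state machine (batch oplocks and the asynchronous break handshake are omitted): readers share level II oplocks, a writer gets an exclusive oplock only when it is alone, and an exclusive holder is broken down before any conflicting open proceeds.

```python
from enum import Enum

class Oplock(Enum):
    NONE = 0
    LEVEL2 = 1     # shared; multiple clients may cache for reading
    EXCLUSIVE = 2  # one client caches reads and writes until close

def grant(existing, want_write):
    """Decide which oplock a new open receives and what the existing
    holders are broken down to. `existing` lists the oplock levels
    already held on the file. Returns (granted, new_levels_for_existing).
    """
    if Oplock.EXCLUSIVE in existing:
        # An exclusive holder must flush and is broken before the new
        # open: down to level II for a reader, to none for a writer.
        broken = [Oplock.NONE if want_write else Oplock.LEVEL2
                  for _ in existing]
    else:
        # A writer also revokes any shared level II caching.
        broken = [Oplock.NONE if want_write else lvl for lvl in existing]
    if want_write:
        # A writer gets exclusive only when nobody else has the file open.
        return (Oplock.EXCLUSIVE if not existing else Oplock.NONE), broken
    # Readers share level II oplocks.
    return Oplock.LEVEL2, broken
```

For example, a lone writer is granted an exclusive oplock, a second reader arriving at an exclusively locked file breaks the holder down to level II, and a writer arriving at a level II file revokes the shared caching and itself gets no oplock.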
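The core idea behind Nache, that one delegation held at the proxy can be shared by many local clients, can be sketched as follows. This is a toy model with hypothetical names (OriginServer, NacheProxy), not the kernel implementation described later in the paper: the origin server grants a read delegation with a recall callback, and the proxy serves every subsequent local open from its cache, so only the first open pays the WAN round-trip.

```python
class OriginServer:
    """Toy origin server granting read delegations with recall callbacks."""
    def __init__(self):
        self.files = {}        # path -> bytes
        self.delegations = {}  # path -> list of recall callbacks

    def open_read(self, path, recall_cb):
        # Grant a read delegation: the holder may satisfy further
        # opens and reads locally until the delegation is recalled.
        self.delegations.setdefault(path, []).append(recall_cb)
        return self.files.get(path, b"")

    def write(self, path, data):
        # A conflicting write first recalls all outstanding delegations.
        for cb in self.delegations.pop(path, []):
            cb(path)
        self.files[path] = data


class NacheProxy:
    """NFS server to its local clients, NFS client to the origin server.
    One delegation held at the proxy is shared by all local clients."""
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}        # path -> bytes cached under a delegation
        self.origin_opens = 0  # WAN opens actually sent to the origin

    def _recalled(self, path):
        self.cache.pop(path, None)  # drop the cached copy on recall

    def open_read(self, path):
        if path not in self.cache:  # first client pays the WAN round-trip
            self.cache[path] = self.origin.open_read(path, self._recalled)
            self.origin_opens += 1
        return self.cache[path]     # later clients are served locally
```

With three local clients opening the same file, the origin sees a single open; a conflicting write at the origin recalls the proxy's delegation, and the next local open refetches fresh data. This is the effect the evaluation in section 6 quantifies as a 10-50% reduction in server-side NFS operations.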