Scale and Performance in a Distributed File System

John H. Howard et al., ACM Transactions on Computer Systems, 1988

Presented by Gangwon Jo, Sangkuk Kim

1 Andrew File System

. Andrew
  • Distributed computing environment for Carnegie Mellon University
  • 5,000 – 10,000 Andrew workstations at CMU
. Andrew File System
  • Distributed file system for Andrew
  • Files are distributed across multiple servers
  • Presents a homogeneous file name space to all the client workstations

2 Andrew File System (contd.)

[Figure: system architecture. Each server runs Vice on top of the Unix kernel with attached disks; each client runs user programs and Venus on top of the Unix kernel with a local disk; clients and servers communicate over the network]

3 Andrew File System (contd.)

. Design goal: Scalability
  • As much work as possible is performed by Venus
. Solution: Caching
  • Venus caches files from Vice
  • Venus contacts Vice only when a file is opened or closed
  • Reading and writing are performed directly on the cached copy

[Figure: one client's Venus communicating with one server's Vice over the network; the server has disks, and the client has a local disk for the cache]

4-15 Andrew File System (contd.)

. Example: opening, modifying, and closing a file A
  • open(A): Venus fetches A from Vice into the local file cache
  • read/write: the user program operates directly on the cached copy, producing a modified copy A'
  • close(A): Venus stores A' back to Vice, which updates the server's copy

[Figure: the client/server diagram animated across these slides, showing A moving from the server disk to the client cache on open(A), local reads and writes on the cached copy, and A' moving back to the server on close(A)]
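
The caching flow above can be summarized in a small sketch. This is illustrative only, not the real Venus code: VenusCache, the vice stub, and its fetch/store calls are hypothetical names, and cache validity checking (covered later) is omitted.

```python
# Illustrative sketch of whole-file caching in Venus (hypothetical names, not
# the real AFS code): fetch the whole file on open, do all reads and writes on
# the local cached copy, and ship the (possibly modified) copy back on close.
import os
import shutil

class VenusCache:
    def __init__(self, cache_dir, vice):
        self.cache_dir = cache_dir            # directory on the client's local disk
        self.vice = vice                      # stub object that talks to the Vice server

    def _local_path(self, path):
        return os.path.join(self.cache_dir, path.replace("/", "#"))

    def open_file(self, path):
        local = self._local_path(path)
        if not os.path.exists(local):         # cache miss: fetch the entire file
            with open(local, "wb") as dst:
                shutil.copyfileobj(self.vice.fetch(path), dst)
        return open(local, "r+b")             # reads and writes hit the local disk only

    def close_file(self, path, f):
        f.close()
        with open(self._local_path(path), "rb") as modified:
            self.vice.store(path, modified)   # Vice replaces its copy of the file
```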

15 Outline

. Building a prototype
  • Qualitative Observation
  • Performance Evaluation
. Changes for performance
  • Performance Evaluation
. Comparison with a Remote-Open File System
. Change for operability
. Conclusion

17 The Prototype

. Preserve directory hierarchy
  • Each server contained a directory hierarchy mirroring the structure of the Vice files

[Figure: the Vice name space (a/, a1, a2, b/, c/, c1/, ...) mirrored as a directory hierarchy on the server disk, with an .admin/ directory and with b/ and c1/ pointing to Server 2 and Server 3; the client disk holds Venus's file cache and status cache]

18 The Prototype (contd.)

. Preserve directory hierarchy
  • Each server contained a directory hierarchy mirroring the structure of the Vice files
  • .admin directories: contain Vice file status information
  • Stub directories: represent portions of the name space located on other servers

[Figure: the same mirrored hierarchy on the server disk, highlighting the .admin/ directory and the stub directories that point to Server 2 and Server 3]

19 The Prototype (contd.)

. Preserve directory hierarchy
  • The Vice-Venus interface names files by their full pathname (e.g., a/a1)

[Figure: Venus sends the full pathname a/a1 to Vice, which resolves it within the mirrored hierarchy on the server disk]

20 The Prototype (contd.)

. Dedicated processes
  • One process for each client

[Figure: the same server/client diagram as on the previous slides]

21 The Prototype (contd.)

. Use two caches
  • One for files, and the other for status information about files

[Figure: the client disk holds both a file cache and a status cache]

22 The Prototype (contd.)

. Verify the cached timestamp on each open
  • Before using a cached file, Venus verifies its timestamp against the one on the server

[Figure: Venus asks Vice whether a/a1 with cached timestamp 5 is still current; Vice answers OK]
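
A rough sketch of this per-open check, using hypothetical helper names (file_cache, get_timestamp, fetch_and_open) rather than the prototype's actual Vice calls:

```python
# Sketch of the prototype's validity check on open (hypothetical helper names):
# Venus compares the cached copy's timestamp with the server's before using it.
def open_with_validation(venus, path):
    entry = venus.file_cache.get(path)        # (local_path, cached_timestamp) or None
    if entry is not None:
        local_path, cached_ts = entry
        if venus.vice.get_timestamp(path) == cached_ts:   # one server round trip per open
            return open(local_path, "r+b")    # cached copy is still current
    return venus.fetch_and_open(path)         # miss or stale copy: fetch the whole file
```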

23 Qualitative Observation

. stat primitive
  • Testing the presence of files, obtaining status information, ...
  • Programs using stat run much slower than the authors expected
  • Each stat involves a cache validity check
. Dedicated processes
  • Excessive context switching overhead
  • High virtual memory paging demands
. File location
  • Difficult to move users' directories between servers

24 Performance Evaluation

. Experience: the prototype was used at CMU
  • The authors + 400 other users
  • 100 workstations and 6 servers
. Benchmark
  • A command script that operates on a collection of source files
  • MakeDir → Copy → ScanDir → ReadAll → Make
  • Multiple clients (load units) run the benchmark simultaneously
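
The benchmark's shape can be sketched as follows; the paths, phase bodies, and build step are placeholders, not the actual Andrew benchmark script.

```python
# Illustrative driver for the benchmark's five phases and for "load units"
# (placeholder paths and build step; this is not the actual Andrew benchmark).
import os
import shutil
import subprocess
import threading

SOURCE_TREE = "bench-src"                 # hypothetical source tree the benchmark copies

def make_dir(ws):                         # MakeDir: create the target directory skeleton
    os.makedirs(ws, exist_ok=True)

def copy(ws):                             # Copy: copy the source files into the target tree
    shutil.copytree(SOURCE_TREE, os.path.join(ws, "src"))

def scan_dir(ws):                         # ScanDir: examine the status of every file
    for root, _, files in os.walk(ws):
        for name in files:
            os.stat(os.path.join(root, name))

def read_all(ws):                         # ReadAll: read every byte of every file
    for root, _, files in os.walk(ws):
        for name in files:
            with open(os.path.join(root, name), "rb") as f:
                f.read()

def make(ws):                             # Make: compile and link the copied sources
    subprocess.run(["make"], cwd=os.path.join(ws, "src"), check=False)

def run_one_load_unit(ws):
    for phase in (make_dir, copy, scan_dir, read_all, make):
        phase(ws)

def run_benchmark(load_units):            # N load units = N clients running concurrently
    threads = [threading.Thread(target=run_one_load_unit, args=(f"bench-ws-{i}",))
               for i in range(load_units)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```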

25 Performance Evaluation (contd.)

. Cache hit ratio
  • File cache: 81%
  • Status cache: 82%

26 Performance Evaluation (contd.)

. Distribution of Vice calls in the prototype (average)

Call           Distribution (%)
TestAuth             61.7
GetFileStat          26.8
Fetch                 4.0
Store                 2.1
SetFileStat           1.8
ListDir               1.8
All others            1.7

[Figure: pie chart of the same call distribution]

27 Performance Evaluation (contd.)

. Server usage
  • CPU utilizations are up to about 40%
  • Disk utilizations are less than 15%
  • Server loads are imbalanced

Utilization (%)
Server      CPU    Disk 1   Disk 2
cluster0    37.8    12.0     6.8
cluster1    12.6     4.1     4.4
cmu-0        7.0     2.5
cmu-1       43.2    13.9    15.1

28 Performance Evaluation (contd.)

. Benchmark performance
  • Time per TestAuth call rises rapidly beyond a load of 5 units

[Figure: normalized overall benchmark time and normalized time per TestAuth call versus load units (1, 2, 5, 8, 10); the per-call TestAuth time climbs steeply beyond 5 load units]

29 Performance Evaluation (contd.)

. Caches work well!
. We need to:
  • Reduce the frequency of cache validity checks
  • Reduce the number of server processes
  • Require workstations, rather than the servers, to do pathname traversals
  • Balance server usage by reassigning users

30 Outline

. Building a prototype
  • Qualitative Observation
  • Performance Evaluation
. Changes for performance
  • Performance Evaluation
. Comparison with a Remote-Open File System
. Change for operability
. Conclusion

31 Changes for Performance

. Cache management: use callbacks
  • Vice notifies Venus if a cached file or directory is modified by another workstation
  • Cache entries are assumed valid unless otherwise notified
    − Per-open verification is no longer needed
  • Vice and Venus each maintain callback state information
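
A minimal sketch of the server-side callback bookkeeping, assuming hypothetical data structures and a client stub with an invalidate call (the real Vice code differs):

```python
from collections import defaultdict

class ViceCallbacks:
    """Hypothetical callback bookkeeping on the server (not the real Vice code)."""

    def __init__(self):
        self.holders = defaultdict(set)       # file id -> clients holding a callback

    def register(self, fid, client):
        self.holders[fid].add(client)         # granted when the client fetches fid

    def break_callbacks(self, fid, updating_client):
        # The file identified by fid was updated: tell every other holder that
        # its cached copy is no longer guaranteed to be valid.
        for client in self.holders[fid] - {updating_client}:
            client.invalidate(fid)
        self.holders[fid] = {updating_client}
```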

32 Changes for Performance (contd.)

. Name resolution and storage representation
  • CPU overhead is caused by the namei routine
    − Maps a pathname to an inode
  • Identify files by fids instead of pathnames
    − A volume is a collection of files located on one server
    − A volume contains multiple vnodes, which identify the files in the volume
    − The uniquifier allows reuse of vnode numbers

Fid = < Volume number (32 bits) | Vnode number (32 bits) | Uniquifier (32 bits) >

33 Changes for Performance (contd.)

. Name resolution and storage representation

[Figure: a client presents a fid <volume number, vnode number, uniquifier>; the volume location database, held on the servers, maps the volume number to a server (e.g., volume 0 → server 1, volume 1 → server 4, volume 2 → server 2); on that server, a vnode lookup table maps the fid to a vnode and then to the underlying inode]

36 Changes for Performance (contd.)

. Name resolution and storage representation
  • Identify files by fids instead of pathnames
  • Each entry in a directory maps a component of a pathname to a fid
    − Venus performs the logical equivalent of a namei operation
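
A sketch of fid-based resolution with hypothetical data structures: the client maps the volume number to a server through the volume location database, and the server maps the fid to an inode through its vnode lookup table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fid:
    volume: int       # 32-bit volume number
    vnode: int        # 32-bit vnode number within the volume
    uniquifier: int   # 32-bit uniquifier: lets vnode numbers be reused safely

def locate_server(volume_location_db, fid):
    """Client side: which server currently stores the volume?"""
    return volume_location_db[fid.volume]

def resolve_inode(vnode_table, fid):
    """Server side: map (volume, vnode) to the underlying inode."""
    vnode = vnode_table[(fid.volume, fid.vnode)]
    if vnode.uniquifier != fid.uniquifier:
        raise FileNotFoundError("stale fid: the vnode number has been reused")
    return vnode.inode
```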

37 Changes for Performance (contd.)

. Server process structure
  • Use lightweight processes (LWPs) instead of heavyweight processes
  • An LWP is not dedicated to a single client
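
A sketch of the idea using a thread pool as a stand-in for LWPs (illustrative only; the paper's LWPs are lightweight threads inside one server process):

```python
# Sketch of the LWP-style server organization: a small pool of workers serves
# requests from any client, instead of one dedicated process per client.
import queue
import threading

def start_worker_pool(handle_request, num_workers=5):
    requests = queue.Queue()                     # requests arriving from all clients

    def worker():
        while True:
            client, request = requests.get()     # any worker may serve any client
            handle_request(client, request)
            requests.task_done()

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()
    return requests                              # callers enqueue (client, request) pairs
```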

38 Performance Evaluation

. Scalability

[Figure: normalized benchmark time versus load units (1–20) for the prototype and the new system; the prototype's time rises steeply with load, while the new system's stays nearly flat]

39 Performance Evaluation (contd.)

. Server utilization during the benchmark

[Figure: server CPU and disk utilization (%) versus load units (0–20)]

40 Outline

. Building a prototype
  • Qualitative Observation
  • Performance Evaluation
. Changes for performance
  • Performance Evaluation
. Comparison with a Remote-Open File System
. Change for operability
. Conclusion

41 Comparison with a Remote-Open File System

. Caching in the Andrew File System
  • Locality makes caching attractive
  • The whole-file transfer approach contacts servers only on opens and closes
  • Most files in a 4.2BSD environment are read in their entirety
  • Disk caches retain their entries across reboots
  • Caching entire files simplifies cache management

Comparison with a Remote-Open File System

. Caching in the Andrew File System – drawbacks
  • Requires local disks
  • Files larger than the local disk cache cannot be handled
  • Strict emulation of 4.2BSD concurrent read/write semantics is impossible

Comparison with a Remote-Open File System

. Remote open
  • The data in a file are not fetched en masse
  • Instead, the remote site potentially participates in each individual read and write operation
  • The file is actually opened on the remote site rather than the local site
  • Example: NFS
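
For contrast with the whole-file sketch earlier, a remote-open client might look like the following (a hypothetical stub, not NFS's actual protocol): every read and write becomes a server interaction.

```python
# Illustrative remote-open client (hypothetical server stub): the server
# participates in each individual read and write, not just open and close.
class RemoteOpenFile:
    def __init__(self, server, handle):
        self.server = server          # stub for the remote file server
        self.handle = handle          # handle returned by the remote open

    def read(self, offset, length):
        return self.server.read(self.handle, offset, length)   # one request per read

    def write(self, offset, data):
        return self.server.write(self.handle, offset, data)    # one request per write

def remote_open(server, path):
    return RemoteOpenFile(server, server.open(path))            # the file is opened remotely
```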

Comparison with a Remote-Open File System

[Figure: overall benchmark time in seconds (roughly 400–1,400 s) versus load units (1, 2, 5, 7, 10, 15, 18) for Andrew with a cold cache, Andrew with a warm cache, and NFS]

Comparison with a Remote-Open File System

. Serious functional problems with NFS at high loads

Network traffic for Andrew and NFS

                                 Andrew      NFS
Total packets                     3,824   10,225
Packets from server to client     2,003    6,490
Packets from client to server     1,818    3,735

Comparison with a Remote-Open File System

[Figure: two panels showing percent CPU utilization and percent disk utilization versus load units (1, 2, 5, 7, 10, 15, 18) for Andrew with a cold cache, Andrew with a warm cache, and NFS (NFS disk 1 and disk 2 plotted separately in the disk panel)]

Comparison with a Remote-Open File System

. Advantage of a remote-open file system
  • Low latency

Latency of Andrew and NFS

Time (milliseconds)
File size    Andrew        Andrew
(bytes)      Cold Cache    Warm Cache     NFS    Stand-alone
3              160.0          16.1        15.7        5.1
1,113          148.0
4,334          202.9
10,278         310.0
24,576         515.0          15.9

Outline

. Building a prototype
  • Qualitative Observation
  • Performance Evaluation
. Changes for performance
  • Performance Evaluation
. Comparison with a Remote-Open File System
. Change for operability
. Conclusion

49 Change for Operability

. Volume
  • A collection of files forming a partial subtree of the Vice name space
  • Volumes are glued together at mount points
  • Operational transparency

[Figure: servers holding Volume 1, Volume 2, and Volume 3, with one volume mounted within another]

Change for Operability

. Volume movement
. Quotas
. Read-Only Replication
. Backup

[Figure: clients map a volume number to a server through the volume location database (e.g., volume 0 → server 1, volume 1 → server 4, volume 2 → server 2) held on the servers]

Change for Operability

. Volume movement

[Figure: Volume 1 is cloned from Server 3 to Server 4, and its entry in the volume location database is updated from Server 3 to Server 4]
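
A rough sketch of volume movement with hypothetical helper methods; the real mechanism is more involved (it must also capture updates made while the volume is being shipped), but the overall shape is: clone, ship, catch up, then repoint the volume location database.

```python
# Rough sketch of moving a volume between servers (hypothetical helpers).
def move_volume(volume, old_server, new_server, location_db):
    clone = old_server.clone(volume)           # cheap copy-on-write snapshot
    new_server.install(volume, clone)          # bulk transfer; users keep working
    delta = old_server.changes_since(volume, clone)
    new_server.apply(volume, delta)            # re-apply updates made during the move
    location_db[volume] = new_server           # clients now find the volume here
    old_server.remove(volume)
```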

Outline

. Building a prototype
  • Qualitative Observation
  • Performance Evaluation
. Changes for performance
  • Performance Evaluation
. Comparison with a Remote-Open File System
. Change for operability
. Conclusion

53 Conclusion

. Scale impacts Andrew in areas besides performance and operability
. Future goals
  • Performance optimization
  • Administration features

Thank you.
