AFS, a History and Survey 1980-2000 Derrick Brashear OpenAFS Project and Your File System, Inc. Timeline
• 1981: CMU “charge” • 1984: AFS 1 • 1985: AFS 2 • 1986: Coda forked • 1987: AFS 3 • 1989: Transarc founded • 1991: I entered CMU • 1994: IBM buys out Transarc • 2000: OpenAFS Laying the groundwork
• 1980: CMU President Cyert creates committee to increase prominence • 1981 Oct: Cyert convened CMU Task Force for the Future of Computing • 1982 Feb: Preliminary report presages Andrew Project • 1982 Oct: 5 year joint venture agreement with IBM Hooking IBM
• Newell report used to attract interest of IBM Chief Scientist Lewis Branscomb. • IBM would loan about a dozen employees, and fund a similar sized group of CMU employees. • Cyert’s business background led to terms where IBM owned IP and would commercialize. Early personnel
• IBM - Michael Conner • CMU - Jim Morris (from Xerox PARC) • M. Satyanarayanan (CMU - first tech hire) • Alfred Spector (CMU faculty) - consultant • Rick Rashid • Jerome Saltzer, David Clark (MIT) - consultants Early discussions
• Original charge did not specify a filesystem • VICE: Vast Integrated Computing Environment • Coined by Jim Morris for a blob connecting file system clients • File System Team: Satya, Rashid, Spector, Dave McDonald, Dave Gifford Bootstrapping
• Architecture design through early-mid 1983 • August 1983: ITC File System Subgroup report • First real “cloud computing” proposal • September 1983: File System Goals and Design • Months of debate on direction Design Goals
• Location Transparency (same path always) • User Mobility (use any computer) • Security • Performance • Scalability • Availability (avoid single points of failure) • Integrity (minimal risk of data loss) • Heterogeneity (use any kind of computer) Learning
• Some arguing on how to proceed • Throwaway prototype to gain experience • RPC, ACL mechanism, protection - Satya • Server - Mike West • Venus (client) - David Nichols AFS 1
• Whole file caching • Client outside the kernel (predated VFS) • TCP transport • fork() based server scaling • Polling for changes (Consistency Check) • File checkout allowed Moving on
• Became clear that the throwaway needed to be tossed. • Andrew Benchmark to profile for issues • New RPC system, reused ACL mech - Satya • New Cache Manager - Mike Kazar • Volume concept - Bob Sidebotham • New fileserver - Mike West AFS 2 Performance
• Cache Management (callbacks, LRU) • Name Resolution (fids) • Communication (RPC2) • Server architecture (LWP vs fork) • Server data storage (iopen) AFS2 Operability
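To make the callback and LRU bullets from the AFS 2 Performance slide concrete, here is a minimal sketch in Python. It is a toy model, not AFS code: the class names `Server` and `Client`, the method names, and the in-memory dictionaries are all hypothetical. The idea it illustrates is that the server remembers which clients hold a cached copy and notifies them on change, so cache hits need no server traffic (in contrast to the polling used in AFS 1).

```python
from collections import OrderedDict


class Server:
    """Toy file server that remembers which clients cached each file."""

    def __init__(self):
        self.files = {}      # fid -> bytes
        self.callbacks = {}  # fid -> set of clients holding a callback promise

    def fetch(self, client, fid):
        # Hand out the data together with a callback promise.
        self.callbacks.setdefault(fid, set()).add(client)
        return self.files[fid]

    def store(self, writer, fid, data):
        self.files[fid] = data
        # Break the callback for every other client that cached this file.
        for client in self.callbacks.pop(fid, set()):
            if client is not writer:
                client.break_callback(fid)
        self.callbacks[fid] = {writer}


class Client:
    """Toy cache manager: whole-file cache with LRU eviction."""

    def __init__(self, server, capacity):
        self.server = server
        self.capacity = capacity
        self.cache = OrderedDict()  # fid -> bytes, least recently used first

    def _insert(self, fid, data):
        self.cache[fid] = data
        self.cache.move_to_end(fid)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used

    def read(self, fid):
        if fid in self.cache:
            self.cache.move_to_end(fid)  # cache hit: no server traffic
            return self.cache[fid]
        data = self.server.fetch(self, fid)
        self._insert(fid, data)
        return data

    def write(self, fid, data):
        self._insert(fid, data)
        self.server.store(self, fid, data)

    def break_callback(self, fid):
        # The server says our copy is stale; drop it so the next read refetches.
        self.cache.pop(fid, None)
```

A real cache manager also has to handle callback expiry, server restarts, and lost messages, none of which are modeled here.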
• Volume, mountpoint concept for server-side location transparency • Quotas • Read-only replication • Volume-based backups • Process Authentication Groups Design tradeoffs
• No read-write replication • open()-close() rather than write() consistency • Advisory file locking AFS 2 Issues
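The open()-close() consistency tradeoff from the Design tradeoffs slide can be illustrated with a small Python sketch. This is an assumption-laden toy, not the AFS implementation: `ToyServer` and `OpenFile` are hypothetical names. Writes go to a private local copy and become visible to other clients only when the file is closed.

```python
class ToyServer:
    """Holds the authoritative copy of each file (hypothetical stand-in)."""

    def __init__(self):
        self.files = {}  # name -> bytes


class OpenFile:
    """One client's open file under open-to-close consistency."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        # open(): take a private snapshot of the current server contents.
        self.local = bytearray(server.files.get(name, b""))

    def read(self):
        return bytes(self.local)

    def write(self, data):
        # write(): only the local copy changes; nothing is sent to the server.
        self.local += data

    def close(self):
        # close(): the whole file is stored back; the last closer wins.
        self.server.files[self.name] = bytes(self.local)
```

In this model a reader that opened the file before the writer closed keeps seeing the old contents, which is the behavior the slide contrasts with write()-level consistency.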
• Whole file caching limited files to cache size • Scalability beyond 700 client nodes seemed unlikely • Self-administration of protection groups did not exist • Only a single administration domain was possible The “modern” era
• Different directions for research versus product: • Coda fork from AFS 2 late 1986 • Focused on issues from network unreliability • AFS 3 development began “Scale and Performance”
• Described AFS 2 • Most key features of modern AFS in place • 2008 SIGOPS Hall of Fame Award Changes for AFS 3
• New RPC system (first R, then Rx) • Cache manager in kernel • Chunked file caching • “Cellular Andrew” • Instant database replication (Ubik) • New authentication system (kaserver) Rx
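As a rough illustration of the chunked file caching bullet under Changes for AFS 3, the sketch below maps a byte range to the cache chunks that must be present to satisfy it. The function name `chunks_for_range` is hypothetical, and the 64 KiB chunk size is an illustrative assumption, not a value taken from these slides.

```python
CHUNK_SIZE = 64 * 1024  # assumed chunk size, for illustration only


def chunks_for_range(offset, length, chunk_size=CHUNK_SIZE):
    """Return the indices of the chunks covering [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return list(range(first, last + 1))
```

With chunking, a read touches only the chunks it needs, so a file larger than the cache is no longer a problem the way it was under whole-file caching.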
• R: prototype, with XDR and rpcgen • Rx: R, extended. • Added: • Once-only semantics • Multiple call channels • Retained: • Bulk transfer • Parallel calls (rxmulti) Cellular Andrew
• Allowed separate domains of control for the first time. • Paper on the topic already called out the need for delegating permissions. Ubik
• Instant database replication • Distributed atomic commit semantics • Floating (elected) master • Relied on Rx semantics and security • Used under all AFS databases Kerberos AuthServer
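The "floating (elected) master" bullet on the Ubik slide can be sketched as a simple majority vote. This is a deliberate simplification under stated assumptions: the function name `elect_master` is hypothetical, and the real Ubik protocol has additional rules (timeouts, tie handling, database version checks) that are not modeled here.

```python
from collections import Counter


def elect_master(votes, total_servers):
    """Return the candidate backed by a strict majority of all servers.

    votes maps each responding server to the candidate it voted for.
    Servers that did not respond simply do not appear in the mapping.
    Returns None when no candidate reaches a quorum.
    """
    tally = Counter(votes.values())
    if not tally:
        return None
    candidate, count = tally.most_common(1)[0]
    # Strict majority of the full server set, not just of the responders.
    return candidate if count * 2 > total_servers else None
```

Requiring a majority of all configured servers means two disjoint groups can never both elect a master, which is what lets writes be committed atomically through a single site.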
• MIT Project Athena running concurrently • AuthServer rewritten to use Kerberos • DES derived keys available for Rx security use (FCrypt, not DES) Other features
• Time service - predated NTP • ACLs - mode bits not rich enough • Directory level security carried over since AFS 1 • Self-administered protection groups Intellectual Property
• IBM owned rights to AFS • Willing customers at AFS 2 time • Without documentation, IBM wouldn’t sell • Another path needed Transarc
• Founded 1989 by several Andrew Project members • IBM an initial investor • AFS 3.0 released • 1990 DARPA grant to document and improve product. Backups
• Added for AFS 3.0 • Online (snapshot) backups - exactly 1 • Volume based backups to tape • Per-backup host tape coordinator (butc) • Per-backup host command interpreter • Shared (Ubik) database added later (3.1) Divided focus
• Talk began of AFS 4.0 with missing features • AFS 3 releases also followed • 3.1 beta in November • 3.1 final in February 1991 • A migration path to 4.0 was planned • But most feature requests “will be in AFS 4.0” “AFS 4.0”
• OSF founded 1988 to counter Sun/AT&T • DCE to counter NIS/NFS • Transarc saw it as “the future” and joined • Ended up a political vessel • Some tech from each member Outside Research
• CITI (University of Michigan) • 1991 • Multilevel caching (intermediate AFS) • Hijacking AFS (Rx security) • 1992 • Faster AFS (larger chunks) • Intermediate fileservers (AFP, NFS, AFS) • AFS Write Performance Outside Research
• CITI (University of Michigan) • 1993 • Disconnected Operation for AFS • The Rx Hex (catching up to IP advances) • AFS Server Logging (RPC instrumentation) • 2001 • Improving AFS Performance via Selective Caching and Native ATM (Cache Bypass, Split Path) Contributed Development
• Aklog (MIT Athena) / Cklog (CMU CS) • MIT SIPB ports to NetBSD and Linux (1994) • Nested PTS groups (MIT Athena, and U of Michigan) Comparing to peers
• AT&T RFS • NFSv2 • Coda • NFSv3 • DCE DFS • CIFS AT&T RFS
• Plus • Looked like a local filesystem - remote system calls. • Minus • Unix only • Requires STREAMS • No Cache (other than client buffer cache, with limited callbacks) NFSv2
• Plus • Free • Minus • No caching • Poor performance • Poor authentication • No management tools Coda
• Plus • Supports Read-Write replication • Supports disconnected operation • Minus • Not production quality • No cell support • Poor security integration NFSv3
• Plus • Unicode support • GSSAPI security • TCP support • Minus • Lacks single namespace • No native caching (not helped by stateless protocol) • Minimal Windows clients • No common management layer DCE DFS
• Plus • Tries to mimic local filesystem semantics • Per-file ACLs • Filesets for server data management • Minus • DCE adds much unnecessary complexity • Poor Windows support CIFS
• Plus • ACLs • Huge installed base • Open Source SAMBA available • OpLocks offer some caching advantages • Minus • Lock recovery and migration are complex • Management layer is proprietary The IBM era
• 1994: IBM purchases Transarc • 1996: Windows NT client with 3.4a release • 1998: Revised pricing • First push for open source • 1999: 3.5 release. pthreads support, NT servers, namei fileserver, Linux port, Windows 95 gateway • 2000: 3.6 release, End of Support planned. • OpenAFS announced at LinuxWorld What went wrong
• Early instability • Wrong UNIX horse: Sun beat IBM • DFS focus when 3.0 was out and barely stable. • DCE RPC incompatible with Rx • Poor pricing models • NFS was often “free” • Missed chance to beat “the web” What went right
• AFS 3.0 incorporated many forward-looking ideas from the start. • The “Design Goals” made it useful to this day. • Ubiquity - on most non-mainframe OSes. • Code base mostly modernized. • Open sourced before it was too late. Thanks
Chronologically, • M. Satyanarayanan (and again last week) • Craig Everhart • Lyle Seaman • Mike Kazar • Jim Morris • Al Spector