OpenZFS: a Community of Open Source ZFS Developers

Matthew Ahrens
Delphix
San Francisco, CA
[email protected]

Abstract—OpenZFS is a collaboration among open source ZFS developers on the illumos, FreeBSD, Linux, and Mac OS X platforms. OpenZFS helps these developers work together to create a consistent, reliable, performant implementation of ZFS. Several new features and performance enhancements have been developed for OpenZFS and are available in all open-source ZFS distributions.

I. INTRODUCTION

In the past decade, ZFS has grown from a project managed by a single organization to a distributed, collaborative effort among many communities and companies. This paper will explain the motivation behind creating the OpenZFS project, the problems we aim to address with it, and the projects undertaken to accomplish these goals.

II. HISTORY OF ZFS DEVELOPMENT

A. Early Days

In the early 2000's, the state of the art in filesystems was not pretty. There was no defense from silent data corruption introduced by bit rot, disk and controller firmware bugs, flaky cables, etc. It was akin to running a server without ECC memory. Storage was difficult to manage, requiring different tools to manage files, blocks, labels, NFS, mountpoints, SMB shares, etc. It wasn't portable between different operating systems (e.g. Linux vs Solaris) or processor architectures (e.g. x86 vs SPARC vs ARM). Storage systems were slow and unscalable. They limited the number of files per filesystem, the size of volumes, etc. When available at all, snapshots were limited in number and performance. Backups were slow, and remote-replication software was extremely specialized and difficult to use. Filesystem performance was hampered by coarse-grained locking, fixed block sizes, naive prefetch, and ever-increasing fsck times. These scalability issues were mitigated only by increasingly complex administrative procedures.

There were incremental and isolated improvements to these problems in various systems. Copy-on-write filesystems (e.g. NetApp's WAFL) eliminated the need for fsck. High-end storage systems (e.g. EMC) used special disks with 520-byte sectors to store checksums of data. Extent-based filesystems (e.g. NTFS, XFS) worked better than fixed-block-size systems when used for non-homogenous workloads. But there was no serious attempt to tackle all of the above issues in a general-purpose filesystem.

In 2001, Matt Ahrens and Jeff Bonwick started the ZFS project at Sun Microsystems with one main goal: to end the suffering of system administrators who were struggling to manage complex and fallible storage systems. To do so, we needed to re-evaluate obsolete assumptions and design an integrated storage system from scratch. Over the next 4 years, they and a team of a dozen engineers implemented the fundamentals of ZFS, including pooled storage, copy-on-write, RAID-Z, snapshots, and send/receive. A simple administrative model based on hierarchical property inheritance made it easy for system administrators to express their intent, and made high-end storage features like checksums, snapshots, RAID, and transparent compression accessible to non-experts.

B. Open Source

As part of the OpenSolaris project, in 2005 Sun released the ZFS source code as open source software, under the CDDL license. This enabled ports of ZFS to FreeBSD, Linux, and Mac OS X, and helped create a thriving community of ZFS users. Sun continued to enhance ZFS, bringing it to enterprise quality and making it part of the Solaris operating system and the foundation of the Sun Storage 7000 series (later renamed the Oracle ZFS Storage Appliance). The other platforms continually pulled changes from OpenSolaris, benefiting from Sun's continuing investment in ZFS.
Other companies started creating storage products based on OpenSolaris and FreeBSD, making open-source ZFS an integral part of their products. However, the vast majority of ZFS development happened behind closed doors at Sun. At this time, very few core enhancements were made to ZFS by non-Sun contributors. Thus although ZFS was open source and multi-platform, it did not have an open development model. As long as Sun continued maintaining and enhancing ZFS, this was not necessarily an impediment to the continued success of products and community projects based on open-source ZFS – they could keep getting enhancements and bug fixes from Sun.

C. Turmoil

In 2010, Oracle acquired Sun Microsystems, stopped contributing source code changes to ZFS, and began dismantling the OpenSolaris community. This raised big concerns about the future of open-source ZFS – without its primary contributor, would it stagnate? Would companies creating products based on ZFS flounder without Sun's engineering resources behind them?

To address this issue for both ZFS and OpenSolaris as a whole, the illumos project was created. illumos took the source code from OpenSolaris (including ZFS) and formed a new community around it. Where OpenSolaris development was controlled by one company, illumos creates common ground for many companies to contribute on equal footing. ZFS found a new home in illumos, with several companies basing their products on it and contributing code changes. FreeBSD and Linux treated illumos as their upstream for ZFS code. However, there was otherwise not much interaction between the platform-specific communities. There continued to be duplicated efforts between platforms, and surprises when code changes made on one platform were not easily ported to others. As the pace of ZFS development on FreeBSD and Linux increased, fragmentation between the platforms became a real risk.

III. THE OPENZFS COLLABORATION

A. Goals

The OpenZFS project was created to accomplish three goals:

1) Open communication: We want everyone working on ZFS to work together, regardless of what platform they are working on. By working together, we can reduce duplicated effort and identify common goals.

2) Consistent experience: We want users' experience with OpenZFS to be high-quality regardless of what platform they are using. Features should be available on all platforms, and all implementations of ZFS should have good performance and be free of bugs.

3) Public awareness: We want to make sure that people know that open-source ZFS is available on many platforms (e.g. illumos, FreeBSD, Linux, OSX), that it is widely used in some of the most demanding production environments, and that it continues to be enhanced.

B. Activities

We have undertaken several activities to accomplish these goals:

1) Website: The http://open-zfs.org website (don't forget the dash!) publicizes OpenZFS activities such as events, talks, and publications. It acts as the authoritative reference for technical work, documenting both usage and how ZFS is implemented (e.g. the on-disk format). The website is also used as a brainstorming and coordination area for work in progress. To facilitate collaboration, the website is a Wiki which can be edited by any registered user.

2) Mailing list: The OpenZFS developer mailing list[1] serves as common ground for developers working on all platforms to discuss work in progress, review code changes, and share knowledge of how ZFS is implemented. Before its existence, changes made on one platform often came as a surprise to developers on other platforms, and sometimes introduced platform compatibility issues or required new functions to be implemented in the Solaris Porting Layer. The OpenZFS mailing list allows these concerns to be raised during code review, when they can easily be addressed.
Note that this mailing list is not a replacement for platform-specific mailing lists, which continue to serve their role, primarily for end users and system administrators to discuss how to use ZFS, as well as for developers to discuss platform-specific code changes.

3) Office hours: Experts in the OpenZFS community hold online office hours[2] approximately once a month. These question and answer sessions are hosted by a rotating cast of OpenZFS developers, using live audio/video/text conferencing tools. The recorded video is also available online.

4) Conferences: Since September 2013, 6 OpenZFS developers have presented at 8 conferences. These events serve both to increase awareness of OpenZFS, and also to network with other developers, coordinating work in person. Additionally, we held the first OpenZFS Developer Summit[3] in November 2013 in San Francisco. More than 30 individuals participated, representing 14 companies and all the major platforms. The two-day event consisted of a dozen presentations and a hackathon. Ten projects were started at the hackathon, including the "best in show": a team of 5 who ported the TestRunner test suite from illumos to Linux and FreeBSD. Slides from the talks and video recordings are available on the open-zfs.org website[3].

C. New features

In this section we will share some recent improvements to the OpenZFS code. These changes are available on all OpenZFS platforms (e.g. illumos, FreeBSD, Linux, OSX, OSv).

1) Feature flags: The ZFS on-disk format was originally versioned with a linear version number, which was incremented whenever the on-disk format was changed. A ZFS release that supported a given version also had to understand all prior versions.

This model was designed for the original single-vendor world, and was copacetic with the OpenSolaris goals of community development while maintaining control over the essentials of the product. However, in the open development model of OpenZFS, we want different entities to be able to make on-disk format changes independently, and then later merge their changes together into one codebase that understands both features. In the version-number model, two companies or projects each working on their own new on-disk features would both use version N+1 to denote their new feature, but that number would mean different things to each company's software. This would make it very difficult for both companies to contribute their new features into a common codebase. The world would forever be divided into ZFS releases that interpreted version N+1 as company A intended, and those that interpreted it as company B intended.

To address this problem, we designed and implemented "feature flags" to replace the linear version number. Rather than having a simple version number, each storage pool has a list of features that it is using. Features are identified with strings, using a reverse-DNS naming convention (e.g. com.delphix:background_destroy). This enables on-disk format changes to be developed independently, and later be integrated into a common codebase.

With the old version numbers, once a pool was upgraded to a particular version, it couldn't be accessed by software that didn't understand that version number. This accomplishes the goal of on-disk versioning, but it is overly restrictive. With OpenZFS feature flags, if a feature is enabled but not actually used, the on-disk information reflects this, so software that doesn't understand the feature can still access the pool. Also, many features change the on-disk format in a way that older software can still safely read a storage pool using the new feature (e.g. because no existing data structures have been changed, only new structures added). OpenZFS feature flags also support this use case.
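The compatibility rules that feature flags make possible can be sketched in a few lines. This is a simplified model for illustration, not the actual OpenZFS implementation; the feature names under com.example and the `read_only_compatible` attribute are hypothetical stand-ins for the scheme described above:

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    name: str                          # reverse-DNS name, e.g. "com.example:big_blocks"
    read_only_compatible: bool = False # older software may still *read* the pool

@dataclass
class Pool:
    # Features active (actually in use) on this pool, keyed by name.
    active_features: dict = field(default_factory=dict)

def can_open(pool: Pool, supported: set, writable: bool) -> bool:
    """Software that understands the features in `supported` may open the
    pool if every active feature is either supported, or (for read-only
    access) marked read-only compatible."""
    for feat in pool.active_features.values():
        if feat.name in supported:
            continue
        if not writable and feat.read_only_compatible:
            continue  # unknown, but harmless for read-only access
        return False
    return True

pool = Pool({f.name: f for f in [
    Feature("com.example:async_destroy", read_only_compatible=True),
    Feature("com.example:big_blocks"),
]})

old_sw = {"com.example:async_destroy"}
new_sw = old_sw | {"com.example:big_blocks"}
print(can_open(pool, new_sw, writable=True))   # True: all features understood
print(can_open(pool, old_sw, writable=True))   # False: big_blocks unknown
print(can_open(pool, {"com.example:big_blocks"}, writable=False))  # True
```

Unlike a linear version number, two independently developed features never collide here: each has a globally unique name, so a merged codebase simply supports the union of both sets.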
2) LZ4 compression: ZFS supports transparent compression, using the LZJB and GZIP algorithms. Each block (e.g. 128KB) is compressed independently, and can be stored as any multiple of the disk's sector size (e.g. 68.5KB). LZJB is fairly fast and provides a decent compression ratio, while GZIP is slow but provides a very good compression ratio. In OpenZFS, we have also implemented the LZ4 compression algorithm, which is faster than LZJB (especially at decompression) and provides a somewhat better compression ratio (see Figure 1). For many workloads, using LZ4 compression is actually faster than not compressing, because it reduces the amount of data that must be read and written.

Fig. 1. Compression speeds and ratios compared (single core)
Fig. 2. Histogram of write latencies (log/log graph)
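The per-block storage model described above can be sketched as follows. This is an illustration only, using zlib as a stand-in compressor (LZ4 and LZJB are not in the Python standard library); the 512-byte sector size is an assumption:

```python
import os
import zlib

SECTOR = 512        # assumed disk sector size
BLOCK = 128 * 1024  # one 128KB logical block, compressed independently

def stored_size(block: bytes) -> int:
    """On-disk size of one compressed block: the compressed length rounded
    up to a whole number of sectors, or the full block size if compression
    does not actually save space."""
    compressed = zlib.compress(block)  # stand-in for LZJB/GZIP/LZ4
    rounded = -(-len(compressed) // SECTOR) * SECTOR  # ceil to sector multiple
    return rounded if rounded < BLOCK else BLOCK

compressible = b"abcdefgh" * (BLOCK // 8)  # highly repetitive data
incompressible = os.urandom(BLOCK)         # random data will not shrink

print(stored_size(compressible))    # a small multiple of 512
print(stored_size(incompressible))  # 131072: stored uncompressed
```

The rounding step is why any multiple of the sector size (e.g. 68.5KB) is a valid stored size, and the fallback is why incompressible data costs nothing extra.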

3) Smooth write throttle: If the disks can't keep up with the application's write rate, then the filesystem must intervene by causing the application to block, delaying it so that it can't queue up an unbounded amount of dirty data.

ZFS batches several seconds' worth of changes into a transaction group, or TXG. The dirty data that is part of each TXG is periodically synced to disk. Before OpenZFS, the throttle was implemented rather crudely: once the limit on dirty data was reached, all write system calls (and equivalent NFS, CIFS, and iSCSI commands) blocked until the currently syncing TXG completed. The effect was that ZFS performed writes with near-zero latency until it got "stuck" and all writes blocked for several seconds.[4]

We rewrote the write throttle in OpenZFS to provide much smoother, more consistent latency, by delaying each write operation a little bit. The trick was to find a good way of computing how large the delay should be. The key was to measure the amount of dirty data in the system, incrementing it as write operations come in and decrementing it as write i/o to the storage completes. The delay is a function of the amount of dirty data (as a percentage of the overall dirty data limit). As more write operations come in, the amount of dirty data increases, thus increasing the delay. For a given workload, this algorithm will seek a stable amount of dirty data and thus a stable delay. Crucially for ease of understanding, this works without taking into account historical behavior or trying to predict the future. This makes the algorithm very responsive to changing workloads; it can't get "stuck" doing the wrong thing because of a temporary workload anomaly.

As a result of this work, we were able to reduce the latency outliers for a random write workload by 300x, from 10 seconds to 30 milliseconds (meaning that 99.9% of all operations completed in less than 30 milliseconds). (See Figure 2.)
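The delay computation described above can be sketched as a pure function of the current dirty-data level. This is a simplified illustration, not the OpenZFS code: the 4 GiB limit, the 60% threshold, and the scale constant are hypothetical, though the shape (zero below a threshold, then growing without bound as dirty data approaches the limit) mirrors the design:

```python
DIRTY_LIMIT = 4 * 1024**3  # hypothetical dirty-data limit: 4 GiB
MIN_PCT = 0.60             # start delaying at 60% of the limit

def write_delay_us(dirty_bytes: int) -> float:
    """Per-operation delay (microseconds) as a function of how full the
    dirty-data limit is. No history, no prediction: just the current level."""
    frac = dirty_bytes / DIRTY_LIMIT
    if frac <= MIN_PCT:
        return 0.0             # disks are keeping up; no throttling
    if frac >= 1.0:
        return float("inf")    # at the limit: block until a TXG syncs
    # Hyperbolic curve: tiny near the threshold, unbounded near the limit,
    # so the system settles at a stable dirty level for a given workload.
    return (frac - MIN_PCT) / (1.0 - frac) * 100.0

for pct in (50, 70, 90, 99):
    print(pct, round(write_delay_us(DIRTY_LIMIT * pct // 100), 1))
```

Because the delay rises steeply as the limit nears, incoming writes are slowed just enough that dirty data stops growing, which is the feedback loop that replaces the old all-or-nothing stall.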
IV. FURTHER WORK

Here we will outline some of the projects that are in progress.

A. Platform-independent code repository

Currently, code is shared between platforms on an ad-hoc basis. Generally, Linux and FreeBSD pull changes from illumos. This is not as smooth as it could be. Linux and FreeBSD must maintain fairly tricky porting layers to translate the interfaces that the ZFS code uses on illumos to the equivalent interfaces on Linux and FreeBSD. It is rare that changes developed on other platforms are integrated into illumos, in part because of the technical challenges that newcomers to this platform face in setting up a development environment, porting, building, etc.

We plan to create a platform-independent code repository of OpenZFS source code that will make it much easier to get changes developed on one platform onto every OpenZFS platform. The goal is that all platforms will be able to pull the exact code in the OpenZFS repo into their codebase, without having to apply any diffs.

We will define the interfaces that code in the OpenZFS repo will use, by explicitly wrapping all external interfaces. For example, instead of calling cv_broadcast(kcondvar_t *), OpenZFS code would call zk_cv_broadcast(zk_condvar_t *). Each platform would provide wrappers which translate from the OpenZFS zk_ interfaces to platform-specific routines and data structures. This will allow the "Solaris Porting Layers" to be simplified.

The OpenZFS repo will only include code that is truly platform-independent, and which can be tested on any platform in userland (using the existing libzpool.so mechanism). Therefore it will include the DMU, DSL, ZIL, ZAP, most of the SPA, and userland components (/sbin/zfs, libzfs, etc). It will not include the ZPL, ZVOL, or vdev_disk.c, as these have extensive customizations for each platform. A longer-term goal is to split the ZPL into platform-independent and platform-dependent parts, and include the platform-independent part in the OpenZFS repo.

For more information, see the slides and video from the talk at the 2013 OpenZFS Developer Summit[3].

B. Conferences

Continuing the very successful 2012 ZFS Day and 2013 OpenZFS Developer Summit conferences, we plan to hold more OpenZFS-centered events. This will include annual OpenZFS Developer Summits, as well as more casual local meet-ups. We will also continue evangelizing OpenZFS at general technology conferences.

C. Resumable send and receive

ZFS send and receive is used to serialize and transmit filesystems between pools. It can quickly generate incremental changes between snapshots, making it an ideal basis for remote replication features. However, if the connection between the send and receive processes is broken (e.g. by a network outage or one of the machines rebooting), then the send must restart from the beginning, losing whatever data was already sent.

We are working on an enhancement that will allow a failed send to resume where it left off. This involves having the receiving system remember what data has been received. This is fairly simple, because data is sent in (object, offset) order. Therefore the receiving system need only remember the highest (object, offset) that has been received. This information will then be used to restart the send stream from that point.

The one tricky part is that we need to enhance the checksum that is stored in the send stream. Currently the checksum is only sent at the end of the entire send stream, so if the connection is lost, the data that was already received has not been verified by any checksum. We will enhance the send stream format to transmit the checksum after each record, so that we can verify each record as it is received. This will also provide better protection against transmission errors in the metadata of the send stream.

D. Large block support

ZFS currently supports up to 128KB blocks. This is large compared to traditional filesystems, which typically use 4KB or 8KB blocks, but we still see some circumstances where even larger blocks would increase performance. Therefore, we are planning to add support for blocks up to at least 1MB in OpenZFS.

We expect to see an especially large performance benefit when using RAID-Z, especially with very wide stripes (i.e. many devices in the RAID-Z group). RAID-Z breaks each block apart and spreads it out across all devices in the RAID-Z group. Therefore, under a random read workload, RAID-Z can deliver the IOPS of only a single device, regardless of the number of devices in the RAID-Z group. By increasing the block size, we increase the size of each IO, which increases the effective bandwidth of the random read workload. This is especially important when scrubbing or resilvering, which in the worst case creates a random read workload. By increasing the block size, we reduce the worst-case scrub or resilver time. For example, consider a RAID-Z group with eight 1-TB disks that can each do 100 random reads per second. With a 128KB block size, in the worst case we could resilver one drive in 186 hours (1TB * 8 drives / 128KB block size / 100 IOPS). Whereas with an 8MB block size, in the worst case we could resilver a drive in 2.8 hours. This corresponds to a rate of 104MB/second, which is close to the typical maximum sequential transfer rate of hard drives, thus matching the performance of LBA-based resilver mechanisms.
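The resilver arithmetic above can be reproduced directly. The sketch below follows the worst-case model in the text (one random read per block, the group limited to a single disk's IOPS, binary TB/KB/MB units); with these assumptions the 8MB case comes out at about 2.9 hours, in line with the 2.8-hour figure quoted:

```python
def worst_case_resilver_hours(disks: int, disk_tb: int,
                              block_kb: int, iops: int) -> float:
    """Worst-case resilver time for a RAID-Z group: every block in the
    group costs one random read, and the whole group sustains only a
    single device's IOPS because each block spans all devices."""
    total_bytes = disks * disk_tb * 2**40          # binary TB
    blocks = total_bytes / (block_kb * 2**10)      # one I/O per block
    return blocks / iops / 3600

print(round(worst_case_resilver_hours(8, 1, 128, 100)))          # 186 hours
print(round(worst_case_resilver_hours(8, 1, 8 * 1024, 100), 1))  # about 2.9 hours
```

The ratio of the two results is exactly the ratio of the block sizes (64x), which is why larger blocks help wide RAID-Z groups so dramatically.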

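The resume protocol described in the section on resumable send and receive, where the receiver remembers the highest (object, offset) received and each record carries its own checksum, can be sketched as a toy model. The record framing and the use of CRC-32 here are assumptions for illustration, not the real ZFS send-stream format:

```python
import zlib

def make_stream(records, start=(0, 0)):
    """Yield (object, offset, payload, checksum) records in (object, offset)
    order, skipping everything at or before the resume point `start`."""
    for obj, off, payload in sorted(records):
        if (obj, off) <= start:
            continue
        yield obj, off, payload, zlib.crc32(payload)

def receive(stream, state):
    """Verify each record as it arrives and track the high-water mark, so a
    broken connection can be resumed from state['highest']."""
    for obj, off, payload, cksum in stream:
        if zlib.crc32(payload) != cksum:
            raise IOError("corrupt record")
        state["data"][(obj, off)] = payload
        state["highest"] = (obj, off)

records = [(1, 0, b"aaa"), (1, 8192, b"bbb"), (2, 0, b"ccc")]
state = {"data": {}, "highest": (0, 0)}

# First attempt delivers only one record before the connection drops.
receive(list(make_stream(records))[:1], state)
# Resume: restart the send from the receiver's recorded high-water mark.
receive(make_stream(records, start=state["highest"]), state)
assert sorted(state["data"]) == [(1, 0), (1, 8192), (2, 0)]
```

Because records arrive in a strict total order, a single (object, offset) pair is all the receiver must persist; the per-record checksum is what lets already-received data be trusted across the interruption.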
V. PARTICIPATION

OpenZFS exists because of contributions of every type. There are a number of ways you can get involved:

If you are working with ZFS source code, join the developer mailing list[1]. Post there to get design help and feedback on code changes.

If your company is making a product with OpenZFS, tell people about it. Contact [email protected] to put your logo on the OpenZFS website. Consider sponsoring OpenZFS events, like the Developer Summit.

If you have enhanced OpenZFS, work with the community to contribute your code changes upstream. Besides benefiting everyone using OpenZFS, this will make it much easier for you to sync up with the latest OpenZFS enhancements from other contributors, with a minimum of merge conflicts.

If you are using OpenZFS, help spread the word by writing about your experience on your blog or social media sites. Ask questions at the OpenZFS Office Hours events. And of course, keep sharing your suggestions for how OpenZFS can be even better (including bug reports).

VI. CONCLUSION

ZFS has survived many transitions, and now with OpenZFS we have the most diverse, and yet also the most unified, community of ZFS contributors. OpenZFS is available on many platforms: illumos, FreeBSD, Linux, OSX, and OSv. OpenZFS is an integral part of dozens of companies' products.[5] A diverse group of contributors continues to enhance OpenZFS, making it an excellent storage platform for a wide range of uses.

REFERENCES

[1] Mailing list: [email protected]; see http://www.open-zfs.org/wiki/Mailing_list to join.
[2] Office Hours: see http://www.open-zfs.org/wiki/OpenZFS_Office_Hours
[3] Developer Summit: see http://www.open-zfs.org/wiki/OpenZFS_Developer_Summit_2013
[4] Old ZFS write throttle: see http://blog.delphix.com/ahl/2013/zfs-fundamentals-write-throttle/
[5] Companies using OpenZFS: see http://www.open-zfs.org/wiki/Companies