Wang Paper (Prepublication)

Total Page:16

File Type:pdf, Size:1020Kb

Wang Paper (Prepublication) Riverbed: Enforcing User-defined Privacy Constraints in Distributed Web Services Frank Wang Ronny Ko, James Mickens MIT CSAIL Harvard University Abstract 1.1 A Loss of User Control Riverbed is a new framework for building privacy-respecting Unfortunately, there is a disadvantage to migrating applica- web services. Using a simple policy language, users define tion code and user data from a user’s local machine to a restrictions on how a remote service can process and store remote datacenter server: the user loses control over where sensitive data. A transparent Riverbed proxy sits between a her data is stored, how it is computed upon, and how the data user’s front-end client (e.g., a web browser) and the back- (and its derivatives) are shared with other services. Users are end server code. The back-end code remotely attests to the increasingly aware of the risks associated with unauthorized proxy, demonstrating that the code respects user policies; in data leakage [11, 62, 82], and some governments have begun particular, the server code attests that it executes within a to mandate that online services provide users with more con- Riverbed-compatible managed runtime that uses IFC to en- trol over how their data is processed. For example, in 2016, force user policies. If attestation succeeds, the proxy releases the EU passed the General Data Protection Regulation [28]. the user’s data, tagging it with the user-defined policies. On Articles 6, 7, and 8 of the GDPR state that users must give con- the server-side, the Riverbed runtime places all data with com- sent for their data to be accessed. Article 17 defines a user’s patible policies into the same universe (i.e., the same isolated right to request her data to be deleted; Article 32 requires instance of the full web service). The universe mechanism a company to implement “appropriate” security measures allows Riverbed to work with unmodified, legacy software; for data-handling pipelines. Unfortunately, requirements like unlike prior IFC systems, Riverbed does not require devel- these lack strong definitions and enforcement mechanisms at opers to reason about security lattices, or manually annotate the systems level. Laws like GDPR provide little technical code with labels. Riverbed imposes only modest performance guidance to a developer who wants to comply with the laws overheads, with worst-case slowdowns of 10% for several while still providing the sophisticated applications that users real applications. enjoy. The research community has proposed information flow 1 INTRODUCTION control (IFC) as a way to constrain how sensitive data spreads In a web service, a client like a desktop browser or smart- throughout a complex system [35, 42]. IFC assigns labels to phone app interacts with datacenter machines. Although program variables or OS-level resources like processes and smartphones and web browsers provide rich platforms for pipes; given a partial ordering which defines the desired secu- computation, the core application state typically resides in rity relationships between labels, an IFC system can enforce cloud storage. This state accrues much of its value from server- rich properties involving data secrecy and integrity. Unfortu- side computations that involve no participation (or explicit nately, traditional IFC is too burdensome to use in modern, consent) from end-user devices. large-scale web services. The reason is that creating and main- By running the bulk of an application atop VMs in a com- taining a partial ordering of labels is too difficult—the average modity cloud, developers receive two benefits. First, develop- programmer or end-user struggles to reason about data safety ers shift the burden of server administration to professional via the abstraction of fine-grained label hierarchies. As a re- datacenter operators. Second, developers gain access to scale- sult, no popular, large-scale web service uses IFC to restrict out resources that vastly exceed those that are available to how sensitive data is processed and shared. a single user device. Scale-out storage allows developers to co-locate large amounts of data from multiple users; scale-out 1.2 Our Solution: Riverbed computation allows developers to process the co-located data In this paper, we introduce Riverbed, a distributed web plat- for the benefit of users (e.g., by providing tailored search form for safeguarding the privacy of user data. Riverbed pro- results) and the benefit of the application (e.g., by providing vides benefits to both web developers and end users. To web targeted advertising). developers, Riverbed provides a practical IFC system which Web content who share the same data flow policies. We call each copy a (x.com) universe. Since users in the same universe allow the same Browser Application code Universes types of data manipulations, any policy violations indicate HTTP POST upload.html true problems with the application (e.g., the application tried <Sensitive user data> HTTP response IO to transmit sensitive data to a server that was not whitelisted by the inhabitants of the universe). Riverbed Riverbed managed web proxy runtime (IFC) Before a user’s Riverbed proxy sends data to a server, the x.com User-defined policy Trusted hardware proxy employs remote attestation [9, 15] to verify that the Riverbed y.com server is running an IFC-enforcing Riverbed runtime. Impor- policies policy z.com tantly, a trusted server will perform next-hop attestation—the policy HTTP POST upload.html <Sensitive user data> server will not transmit sensitive data to another network + IO devices endpoint unless that endpoint is an attested Riverbed run- x.com policy time whose TLS certificate name is explicitly whitelisted by Figure 1: Riverbed’s architecture. The user’s client device is on the the user’s data flow policy. In this manner, Riverbed enables left, and the web service is on the right. Unmodified components are controlled data sharing between machines that span different white; modified or new components are grey. domains. allows developers to easily “bolt on” stronger security poli- 1.3 Our Contributions cies for complex applications written in standard managed To the best of our knowledge, Riverbed is the first distributed languages. To end users, Riverbed provides a straightforward IFC framework which is practical enough to support large- mechanism to verify that server-side code is running within a scale, feature-rich web services that are written in general- privacy-preserving environment. purpose managed languages. Riverbed preserves the tradi- Figure 1 describes Riverbed’s architecture. For each tional advantages of cloud-based applications, allowing devel- Riverbed web service, a user defines an information flow opers to offload administrative tasks and leverage scale-out policy using simple, human-understandable constraints like resources. However, Riverbed’s universe mechanism, coupled “do not save my data to persistent storage” or “my data may with a simple policy language, provides users with understand- only be sent over the network to x.com.” In the common able, enforceable abstractions for controlling how datacenters case, users employ predefined, templated policy files that are manipulate sensitive data. Riverbed makes it easier for devel- designed by user advocacy groups like the EFF. When a user opers to comply with laws like GDPR—users give explicit generates an HTTP request, a web proxy on the user’s de- consent for data access via Riverbed policies, with server-side vice transparently adds the appropriate data flow policy as a universes constraining how user data may be processed, and special HTTP header. where its derivatives can be stored. We have ported several non-trivial applications to Riverbed, Within a datacenter, Riverbed leverages the fact that many and written data flow policies for those applications. Our services run atop managed runtimes like Python, .NET, or experiments show that Riverbed enforces policies with worst- the JVM. Riverbed modifies such a runtime to automatically case end-to-end overheads of 10%. Riverbed also supports taint incoming HTTP data with the associated user policies. legacy code with little or no developer intervention, making As the application derives new data from tainted bytes, the it easy for well-intentioned (but average-skill) developers to runtime ensures that the new data is also marked as tainted. If write services that respect user privacy. the application tries to externalize data via the disk or the net- work, the externalization is only allowed if it is permitted by 2 RELATED WORK user policies. The Riverbed runtime terminates an application In this section, we compare Riverbed to representative in- process which attempts a disallowed externalization. stances of prior IFC systems. At a high level, Riverbed’s In Riverbed, application code (i.e., the code which the innovation is the leveraging of universes and human- managed runtime executes) is totally unaware that IFC is understandable, user-defined policies to enforce data flow con- occurring. Application developers have no way to read, write, straints in IFC-unaware programs. Riverbed enforces these create, or destroy taints and data flow policies. The advantage constraints without requiring developers to add security anno- of this scheme is that it makes Riverbed compatible with tations to source code. code that has not been explicitly annotated with traditional IFC labels. However, different end users will likely define 2.1 Explicit Labeling incompatible data flow policies. As a result, policy-agnostic In a classic IFC system, developers explicitly label program code would quickly generate a policy violation for some state, and construct a lattice which defines the ways in which subset of users; Riverbed would then terminate the application. differently-labeled state can interact. Roughly speaking, a To avoid this problem, Riverbed spawns multiple, lightweight program is composed of assignment statements; the IFC sys- copies of the back-end service, one for each set of users tem only allows a particular assignment if all of the policies involving righthand objects are compatible with the policies TaintDroid works on unmodified applications.
Recommended publications
  • Copy on Write Based File Systems Performance Analysis and Implementation
    Copy On Write Based File Systems Performance Analysis And Implementation Sakis Kasampalis Kongens Lyngby 2010 IMM-MSC-2010-63 Technical University of Denmark Department Of Informatics Building 321, DK-2800 Kongens Lyngby, Denmark Phone +45 45253351, Fax +45 45882673 [email protected] www.imm.dtu.dk Abstract In this work I am focusing on Copy On Write based file systems. Copy On Write is used on modern file systems for providing (1) metadata and data consistency using transactional semantics, (2) cheap and instant backups using snapshots and clones. This thesis is divided into two main parts. The first part focuses on the design and performance of Copy On Write based file systems. Recent efforts aiming at creating a Copy On Write based file system are ZFS, Btrfs, ext3cow, Hammer, and LLFS. My work focuses only on ZFS and Btrfs, since they support the most advanced features. The main goals of ZFS and Btrfs are to offer a scalable, fault tolerant, and easy to administrate file system. I evaluate the performance and scalability of ZFS and Btrfs. The evaluation includes studying their design and testing their performance and scalability against a set of recommended file system benchmarks. Most computers are already based on multi-core and multiple processor architec- tures. Because of that, the need for using concurrent programming models has increased. Transactions can be very helpful for supporting concurrent program- ming models, which ensure that system updates are consistent. Unfortunately, the majority of operating systems and file systems either do not support trans- actions at all, or they simply do not expose them to the users.
    [Show full text]
  • PDF, 32 Pages
    Helge Meinhard / CERN V2.0 30 October 2015 HEPiX Fall 2015 at Brookhaven National Lab After 2004, the lab, located on Long Island in the State of New York, U.S.A., was host to a HEPiX workshop again. Ac- cess to the site was considerably easier for the registered participants than 11 years ago. The meeting took place in a very nice and comfortable seminar room well adapted to the size and style of meeting such as HEPiX. It was equipped with advanced (sometimes too advanced for the session chairs to master!) AV equipment and power sockets at each seat. Wireless networking worked flawlessly and with good bandwidth. The welcome reception on Monday at Wading River at the Long Island sound and the workshop dinner on Wednesday at the ocean coast in Patchogue showed more of the beauty of the rather natural region around the lab. For those interested, the hosts offered tours of the BNL RACF data centre as well as of the STAR and PHENIX experiments at RHIC. The meeting ran very smoothly thanks to an efficient and experienced team of local organisers headed by Tony Wong, who as North-American HEPiX co-chair also co-ordinated the workshop programme. Monday 12 October 2015 Welcome (Michael Ernst / BNL) On behalf of the lab, Michael welcomed the participants, expressing his gratitude to the audience to have accepted BNL's invitation. He emphasised the importance of computing for high-energy and nuclear physics. He then intro- duced the lab focusing on physics, chemistry, biology, material science etc. The total head count of BNL-paid people is close to 3'000.
    [Show full text]
  • Master Boot Record Vs Guid Mac
    Master Boot Record Vs Guid Mac Wallace is therefor divinatory after kickable Noach excoriating his philosophizer hourlong. When Odell perches dilaceratinghis tithes gravitated usward ornot alkalize arco enough, comparatively is Apollo and kraal? enduringly, If funked how or following augitic is Norris Enrico? usually brails his germens However, half the UEFI supports the MBR and GPT. Following your suggested steps, these backups will appear helpful to restore prod data. OK, GPT makes for playing more logical choice based on compatibility. Formatting a suit Drive are Hard Disk. In this guide, is welcome your comments or thoughts below. Thus, making, or paid other OS. Enter an open Disk Management window. Erase panel, or the GUID Partition that, we have covered the difference between MBR and GPT to care unit while partitioning a drive. Each record in less directory is searched by comparing the hash value. Disk Utility have to its important tasks button activated for adding, total capacity, create new Container will be created as well. Hard money fix Windows Problems? MBR conversion, the main VBR and the backup VBR. At trial three Linux emergency systems ship with GPT fdisk. In else, the user may decide was the hijack is unimportant to them. GB even if lesser alignment values are detected. Interoperability of the file system also important. Although it hard be read natively by Linux, she likes shopping, the utility Partition Manager has endeavor to working when Disk Utility if nothing to remain your MBR formatted external USB hard disk drive. One station time machine, reformat the storage device, GPT can notice similar problem they attempt to recover the damaged data between another location on the disk.
    [Show full text]
  • Hardware-Driven Evolution in Storage Software by Zev Weiss A
    Hardware-Driven Evolution in Storage Software by Zev Weiss A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Sciences) at the UNIVERSITY OF WISCONSIN–MADISON 2018 Date of final oral examination: June 8, 2018 ii The dissertation is approved by the following members of the Final Oral Committee: Andrea C. Arpaci-Dusseau, Professor, Computer Sciences Remzi H. Arpaci-Dusseau, Professor, Computer Sciences Michael M. Swift, Professor, Computer Sciences Karthikeyan Sankaralingam, Professor, Computer Sciences Johannes Wallmann, Associate Professor, Mead Witter School of Music i © Copyright by Zev Weiss 2018 All Rights Reserved ii To my parents, for their endless support, and my cousin Charlie, one of the kindest people I’ve ever known. iii Acknowledgments I have taken what might be politely called a “scenic route” of sorts through grad school. While Ph.D. students more focused on a rapid graduation turnaround time might find this regrettable, I am glad to have done so, in part because it has afforded me the opportunities to meet and work with so many excellent people along the way. I owe debts of gratitude to a large cast of characters: To my advisors, Andrea and Remzi Arpaci-Dusseau. It is one of the most common pieces of wisdom imparted on incoming grad students that one’s relationship with one’s advisor (or advisors) is perhaps the single most important factor in whether these years of your life will be pleasant or unpleasant, and I feel exceptionally fortunate to have ended up iv with the advisors that I’ve had.
    [Show full text]
  • BSD Projects IV – BSD Certification • Main Features • Community • Future Directions a (Very) Brief History of BSD
    BSD Overview Jim Brown May 24, 2012 BSD Overview - 5/24/2012 - Jim Brown, ISD BSD Overview I – A Brief History of BSD III – Cool Hot Stuff • ATT UCB Partnership • Batteries Included • ATT(USL) Lawsuit • ZFS , Hammer • BSD Family Tree • pf Firewall, pfSense • BSD License • Capsicum • Virtualization Topics • Jails, Xen, etc. • Desktop PC-BSD II – The Core BSD Projects IV – BSD Certification • Main Features • Community • Future Directions A (Very) Brief History of BSD 1971 – ATT cheaply licenses Unix source code to many organizations, including UCB as educational material 1975 – Ken Thompson takes a sabbatical from ATT, brings the latest Unix source on tape to UCB his alma mater to run on a PDP 11 which UCB provided. (Industry/academic partnerships were much more common back then.) Computer Science students (notably Bill Joy and Chuck Haley) at UCB begin to make numerous improvements to Unix and make them available on tape as the “Berkeley Software Distribution” - BSD A (Very) Brief History of BSD Some notable CSRG • 1980 – Computer Science Research Group members (CSRG) forms at UCB with DARPA funding to make many more improvements to Unix - job control, autoreboot, fast filesystem, gigabit address space, Lisp, IPC, sockets, TCP/IP stack + applications, r* utils, machine independence, rewriting almost all ATT code with UCB/CSRG code, including many ports • 1991 – The Networking Release 2 tape is released on the Internet via anon FTP. A 386 port quickly follows by Bill and Lynne Jolitz. The NetBSD group is formed- the first Open Source community entirely on the Internet • 1992 – A commercial version, BSDI (sold for $995, 1-800-ITS-UNIX) draws the ire of USL/ATT.
    [Show full text]
  • Data Storage on Unix
    Data Storage on Unix Patrick Louis 2017-11-05 Published online on venam.nixers.net © Patrick Louis 2017 This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of the rightful author. First published eBook format 2017 The author has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate. Contents Introduction 4 Ideas and Concepts 5 The Overall Generic Architecture 6 Lowest Level - Hardware & Limitation 8 The Medium ................................ 8 Connectors ................................. 10 The Drivers ................................. 14 A Mention on Block Devices and the Block Layer ......... 16 Mid Level - Partitions and Volumes Organisation 18 What’s a partitions ............................. 21 High Level 24 A Big Overview of FS ........................... 24 FS Examples ............................. 27 A Bit About History and the Origin of Unix FS .......... 28 VFS & POSIX I/O Layer ...................... 28 POSIX I/O 31 Management, Commands, & Forensic 32 Conclusion 33 Bibliography 34 3 Introduction Libraries and banks, amongst other institutions, used to have a filing system, some still have them. They had drawers, holders, and many tools to store the paperwork and organise it so that they could easily retrieve, through some documented process, at a later stage whatever they needed. That’s where the name filesystem in the computer world emerges from and this is oneofthe subject of this episode. We’re going to discuss data storage on Unix with some discussion about filesys- tem and an emphasis on storage device.
    [Show full text]
  • Evolutionary Tree-Structured Storage: Concepts, Interfaces, and Applications
    Evolutionary Tree-Structured Storage: Concepts, Interfaces, and Applications Dissertation submitted for the degree of Doctor of Natural Sciences Presented by Marc Yves Maria Kramis at the Faculty of Sciences Department of Computer and Information Science Date of the oral examination: 22.04.2014 First supervisor: Prof. Dr. Marcel Waldvogel Second supervisor: Prof. Dr. Marc Scholl i Abstract Life is subdued to constant evolution. So is our data, be it in research, business or personal information management. From a natural, evolutionary perspective, our data evolves through a sequence of fine-granular modifications resulting in myriads of states, each describing our data at a given point in time. From a technical, anti- evolutionary perspective, mainly driven by technological and financial limitations, we treat the modifications as transient commands and only store the latest state of our data. It is surprising that the current approach is to ignore the natural evolution and to willfully forget about the sequence of modifications and therefore the past state. Sticking to this approach causes all kinds of confusion, complexity, and performance issues. Confusion, because we still somehow want to retrieve past state but are not sure how. Complexity, because we must repeatedly work around our own obsolete approaches. Performance issues, because confusion times complexity hurts. It is not surprising, however, that intelligence agencies notoriously try to collect, store, and analyze what the broad public willfully forgets. Significantly faster and cheaper random-access storage is the key driver for a paradigm shift towards remembering the sequence of modifications. We claim that (1) faster storage allows to efficiently and cleverly handle finer-granular modifi- cations and (2) that mandatory versioning elegantly exposes past state, radically simplifies the applications, and effectively lays a solid foundation for backing up, distributing and scaling of our data.
    [Show full text]
  • Freebsd Enterprise Storage Polish BSD User Group Welcome 2020/02/11 Freebsd Enterprise Storage
    FreeBSD Enterprise Storage Polish BSD User Group Welcome 2020/02/11 FreeBSD Enterprise Storage Sławomir Wojciech Wojtczak [email protected] vermaden.wordpress.com twitter.com/vermaden bsd.network/@vermaden https://is.gd/bsdstg FreeBSD Enterprise Storage Polish BSD User Group What is !nterprise" 2020/02/11 What is Enterprise Storage? The wikipedia.org/wiki/enterprise_storage page tells nothing about enterprise. Actually just redirects to wikipedia.org/wiki/data_storage page. The other wikipedia.org/wiki/computer_data_storage page also does the same. The wikipedia.org/wiki/enterprise is just meta page with lin s. FreeBSD Enterprise Storage Polish BSD User Group What is !nterprise" 2020/02/11 Common Charasteristics o Enterprise Storage ● Category that includes ser$ices/products designed &or !arge organizations. ● Can handle !arge "o!umes o data and !arge num%ers o sim#!tano#s users. ● 'n$olves centra!ized storage repositories such as SA( or NAS de$ices. ● )equires more time and experience%expertise to set up and operate. ● Generally costs more than consumer or small business storage de$ices. ● Generally o&&ers higher re!ia%i!it'%a"aila%i!it'%sca!a%i!it'. FreeBSD Enterprise Storage Polish BSD User Group What is !nterprise" 2020/02/11 EnterpriCe or EnterpriSe? DuckDuckGo does not pro$ide search results count +, Goog!e search &or enterprice word gi$es ~ 1 )00 000 results. Goog!e search &or enterprise word gi$es ~ 1 000 000 000 results ,1000 times more). ● /ost dictionaries &or enterprice word sends you to enterprise term. ● Given the *+,CE o& many enterprise solutions it could be enterPRICE 0 ● 0 or enterpri$e as well +.
    [Show full text]
  • 855400Cd46b6c9e7683b5ace58a55234.Pdf
    Ars TeAchrsn iTceachnica UK Register Log in ▼ ▼ Ars Technica has arrived in Europe. Check it out! × Encryption options are great, but Apple's attitude on checksums is still funky. by Adam H. Leventhal - Jun 26, 2016 1:00pm UTC KEEPING ON TRUCKING INDEPENDENCE, WHAT INDEPENDENCE? Two hours or so of WWDC keynoting and Tim Cook didn't mention a new file system once? Andrew Cunningham This article was originally published on Adam Leventhal's blog in multiple parts. Programmer Andrew Nõmm: "I had to be made Apple announced a new file system that will make its way into all of its OS variants (macOS, tvOS, iOS, an example of as a warning to all IT people." watchOS) in the coming years. Media coverage to this point has been mostly breathless elongations of Apple's developer documentation. With a dearth of detail I decided to attend the presentation and Q&A with the APFS team at WWDC. Dominic Giampaolo and Eric Tamura, two members of the APFS team, gave an overview to a packed room; along with other members of the team, they patiently answered questions later in the day. With those data points and some first-hand usage I wanted to provide an overview and analysis both as a user of Apple-ecosystem products and as a long-time operating system and file system developer. The overview is divided into several sections. I'd encourage you to jump around to topics of interest or skip right to the conclusion (or to the tweet summary). Highest praise goes to encryption; ire to data integrity.
    [Show full text]
  • Limits and the Practical Usability of Bsds, a Big Data Prospective
    Limits and the Practical Usability of BSDs, a Big Data Prospective Predrag Punosevacˇ [email protected] The Auton Lab Carnegie Mellon University June 11, 2016 1 / 22 Thanks Thanks to organizers for this great meeting and for giving me the op- portunity to speak. note 1 of slide 1 Intro ❖ Intro ● Who am I? ❖ Chronology ❖ Chronology II ❖ Genealogy Tree ❖ General Limitations ❖ Scientific Computing ❖ Continuation ❖ misc issues ❖ NetBSD ❖ OpenBSD ❖ pf.conf and pfctl ❖ OpenBSD cons ❖ FreeBSD ❖ TrueOS ❖ TurnKey Appliance ❖ FreeNAS ❖ pfSense ❖ DragonFly BSD ❖ HAMMER ❖ Dark Clouds ❖ References 2 / 22 Intro ❖ Intro ● Who am I? ❖ Chronology ❖ Chronology II ❖ Genealogy Tree ● What is the Auton Lab? ❖ General Limitations ❖ Scientific Computing ❖ Continuation ❖ misc issues ❖ NetBSD ❖ OpenBSD ❖ pf.conf and pfctl ❖ OpenBSD cons ❖ FreeBSD ❖ TrueOS ❖ TurnKey Appliance ❖ FreeNAS ❖ pfSense ❖ DragonFly BSD ❖ HAMMER ❖ Dark Clouds ❖ References 2 / 22 Intro ❖ Intro ● Who am I? ❖ Chronology ❖ Chronology II ❖ Genealogy Tree ● What is the Auton Lab? ❖ General Limitations ❖ Scientific ● Why don’t we just use SCS computing facilities? Computing ❖ Continuation ❖ misc issues ❖ NetBSD ❖ OpenBSD ❖ pf.conf and pfctl ❖ OpenBSD cons ❖ FreeBSD ❖ TrueOS ❖ TurnKey Appliance ❖ FreeNAS ❖ pfSense ❖ DragonFly BSD ❖ HAMMER ❖ Dark Clouds ❖ References 2 / 22 Intro ❖ Intro ● Who am I? ❖ Chronology ❖ Chronology II ❖ Genealogy Tree ● What is the Auton Lab? ❖ General Limitations ❖ Scientific ● Why don’t we just use SCS computing facilities? Computing ❖ Continuation ❖ misc issues ● How did
    [Show full text]
  • 08-Filesystems.Pdf
    ECE590-03 Enterprise Storage Architecture Fall 2017 File Systems Tyler Bletsch Duke University The file system layer User code open, read, write, seek, close, stat, mkdir, rmdir, unlink, ... Kernel VFS layer File system drivers ext4 fat nfs ... Disk driver NIC driver read_block, write_block packets Could be a single drive or a RAID HDD / SSD 2 High-level motivation • Disks are dumb arrays of blocks. • Need to allocate/deallocate (claim and free blocks) • Need logical containers for blocks (allow delete, append, etc. on different buckets of data – files) • Need to organize such containers (how to find data – directories) • May want to access control (restrict data access per user) • May want to dangle additional features on top Result: file systems 3 Disk file systems • All have same goal: • Fulfill file system calls (open, seek, read, write, close, mkdir, etc.) • Store resulting data on a block device • The big (non-academic) file systems • FAT (“File Allocation Table”): Primitive Microsoft filesystem for use on floppy disks and later adapted to hard drives • FAT32 (1996) still in use (default file system for USB sticks, SD cards, etc.) • Bad performance, poor recoverability on crash, but near-universal and easy for simple systems to implement • ext2, ext3, ext4: Popular Linux file system. • Ext2 (1993) has inode-based on-disk layout – much better scalability than FAT • Ext3 (2001) adds journaling – much better recoverability than FAT • Ext4 (2008) adds various smaller benefits • NTFS: Current Microsoft filesystem (1993). • Like ext3, adds journaling to provide better recoverability than FAT • More expressive metadata (e.g. Access Control Lists (ACLs)) • HFS+: Current Mac filesystem (1998).
    [Show full text]
  • Btrfs the Filesystemsswiss Army Knife of Storage JOSEF BACIK
    Btrfs The FILESYSTEMSSwiss Army Knife of Storage JOSEF BACIK Josef is the lead developer Btrfs is a new file system for Linux that has been under development for four years on Btrfs for Red Hat. He cut now and is based on Ohad Rodeh’s copy-on-write B-tree . Its aim is to bring more his teeth on the clustered file efficient storage management and better data integrity features to Linux . It has system GFS2, moving on to been designed to offer advanced features such as built-in RAID support, snapshot- help maintain ext3 for Red Hat until Btrfs was ting, compression, and encryption . Btrfs also checksums all metadata and will publicly announced in 2007. checksum data with the option to turn off data checksumming . In this article I [email protected] explain a bit about how Btrfs is designed and how you can use these new capabili- ties to your advantage . Historically, storage management on Linux has been disjointed . You have the traditional mdadm RAID or the newer dmraid if you wish to set up software RAID . There is LVM for setting up basic storage management capable of having separate volumes from a common storage pool as well as the ability to logically group disks into one storage pool . Then on top of your storage pools you have your file system, usually an ext variant or XFS . The drawback of this approach is that it can get complicated when you want to change your setup . For example, if you want to add another disk to your logical volume in order to add more space to your home file system, you first must initialize and add that disk to your LVM volume group, then extend your logical volume, and only then extend your file system .
    [Show full text]