Reducing Size and Complexity of the Security-Critical Code Base of File Systems

Total pages: 16
File type: PDF, size: 1020 KB

Reducing Size and Complexity of the Security-Critical Code Base of File Systems

Dissertation submitted for the academic degree of Doktoringenieur (Dr.-Ing.) to the Technische Universität Dresden, Fakultät Informatik, by Dipl.-Inf. Carsten Weinhold, born 30 September 1980 in Dresden.

Supervising professor: Prof. Dr. rer. nat. Hermann Härtig, Technische Universität Dresden
Reviewer: Prof. Dr. Scott Brandt, University of California, Santa Cruz
Internal examiner: Prof. Dr. Christof Fetzer, Technische Universität Dresden

Corrected version of 2 July 2014 (submitted 5 July 2013, defended 14 January 2014)

Acknowledgements

I want to thank my advisor Prof. Dr. rer. nat. Hermann Härtig for all the support and guidance over the years and for encouraging me in my work. I warmly thank Björn Döbel, Michael Roitzsch, Anna Krasnova, and Dr. Claude-Joachim Hamann for all the time and effort they put into proofreading draft versions of this thesis and for giving me so much valuable feedback for improving it. Thanks also go to Adam Lackorzynski and Alexander Warg for maintaining and continuously improving the platform of my work and the work of so many others. Everybody in the coffee area: thank you for all the interesting discussions (not only about work). Finally, my greatest thanks go to my family and to my friends for all their help, support, and patience, but especially for being so much fun to be with. Thank you all!

Contents

1 Introduction 13
  1.1 Motivation 13
  1.2 A Virtual Private File System 15
  1.3 Thesis Goals 16
  1.4 Non-Goals 17
  1.5 Contribution 18
2 Background and State of the Art 21
  2.1 Isolation Architectures and Legacy Reuse 21
  2.2 Trusted Computing for Software Integrity 23
  2.3 Trusted Computing for Secure Storage 24
  2.4 Threat Model and Platform Assumptions 25
  2.5 Related Work 27
    2.5.1 Decomposition of Local File System Stacks 27
    2.5.2 Cryptographic Protection of Storage 28
    2.5.3 Untrusted Storage 29
    2.5.4 Versioning and Tamper-Proof Storage 29
3 A Trusted File System Wrapper 31
  3.1 General Architecture 31
    3.1.1 How to Reuse Untrusted Infrastructure 31
    3.1.2 Cryptographic Protection 34
    3.1.3 Introducing the VPFS Subsystems 35
    3.1.4 Alternative TCBs for Confidentiality and Integrity 37
  3.2 Design Principles 38
    3.2.1 Metadata Integrity and Input Sanitization 38
    3.2.2 Simple Inter-Component Interfaces 39
    3.2.3 Handling and Surviving Errors 40
    3.2.4 Reuse of Generic Implementations 40
  3.3 Central Data Structure: The Block Tree 40
    3.3.1 The Case for a Generic Block Tree 41
    3.3.2 Buffer Registry 42
    3.3.3 Generic Tree Walker and Specific Strategies 44
  3.4 Optimizations 47
    3.4.1 Caching Buffer Pointers 47
    3.4.2 Optimal Block Tree Depth 47
    3.4.3 Write Batching and Read Ahead 48
  3.5 Summary 48
4 Metadata and Namespace Protection 49
  4.1 Protection of Metadata in File Systems 49
    4.1.1 Approaches to Namespace Protection and Freshness 49
    4.1.2 Requirements for VPFS 50
  4.2 Inode-Based Naming 51
  4.3 Untrusted Naming 52
    4.3.1 Potential of Reusing Untrusted Infrastructure 52
    4.3.2 Untrusted Lookup and Proofs 53
    4.3.3 Limitations of Untrusted Naming 54
    4.3.4 Verdict on Untrusted Naming 55
  4.4 Trusted Directory Hierarchy 56
5 Consistency and Robustness 57
  5.1 Consistency in Local File Systems 57
    5.1.1 Approaches to Ensuring Consistency 57
    5.1.2 A Consistency Paradigm for VPFS 59
  5.2 Extended Persistency Interface 60
  5.3 Dependency Tracking in VPFS Core 61
  5.4 Being Prepared for Crashes 62
    5.4.1 Journaling Metadata Operations 63
    5.4.2 Journaling Block Updates 64
    5.4.3 Checkpoints 65
    5.4.4 Securing the Journal 66
  5.5 Recovering From Crashes 66
    5.5.1 Replaying the Journal 66
    5.5.2 Recovery Results 68
  5.6 Garbage Collection 69
  5.7 Optimizations 70
  5.8 Summary 71
6 Backup and Restore 73
  6.1 Approaches to Backup and Restore 73
    6.1.1 Backup and Cloud Storage 73
    6.1.2 Availability of Remotely Stored Data 74
    6.1.3 Backup and Restore in VPFS 75
  6.2 Authentication and Transport Security 76
  6.3 Extended Threat Model 78
  6.4 Client-Side TCBs for Backup and Restore 79
  6.5 Backup Creation 81
    6.5.1 Enabling Backups in Background 81
    6.5.2 Updating a Remote Backup 82
    6.5.3 Impact on TCB Complexity 84
  6.6 Backup to Multiple Servers 84
    6.6.1 Redundancy Schemes 84
    6.6.2 Inconsistent Updates 85
    6.6.3 Recoverability Model 86
    6.6.4 Self-Attesting Erasure Coding 87
    6.6.5 Isolation of Independent Auth Components 89
  6.7 Restore from Backup 89
  6.8 Key and Credential Backup 91
  6.9 Summary 92
7 Evaluation 93
  7.1 Experimentation Platform 93
  7.2 Code Size and Complexity 94
    7.2.1 Code Size 95
    7.2.2 Functional Complexity 97
  7.3 Separation of TCBs and Attack Surface 100
    7.3.1 Size and Complexity of the Recoverability TCB 100
    7.3.2 Size and Complexity of the Attack Surface 101
  7.4 Efficiency 103
    7.4.1 Hardware and Software Configuration 104
    7.4.2 Microbenchmarks 105
    7.4.3 Application Workloads 108
    7.4.4 CPU Load 111
    7.4.5 Crash Recovery 113
8 Conclusions and Future Work 115

List of Figures

1.1 Isolation of private file system instances for mutually distrustful applications 15
2.1 VPFS in the Nizza Secure System Architecture 22
3.1 Conceptual view of three points in the design space for splitting a file system stack 32
3.2 VPFS architecture with trusted and untrusted components 36
3.3 Mapping of file contents and metadata to untrusted storage 37
3.4 Two examples for inter-block dependencies in the block tree 41
3.5 Pointer structure in VPFS Core's buffer registry 43
3.6 Addressing scheme for block tree nodes 44
3.7 C++ implementation of the dirty-block strategy of VPFS Core's buffer cache 45
4.1 Data structures enabling untrusted naming 54
5.1 Lazy updates of the Merkle hash tree 60
5.2 Dependencies among cached file system structures 62
5.3 Record groups as the basis of transactions in VPFS journal 67
6.1 TCBs for confidentiality, integrity, and recoverability 80
6.2 Application and backup instances of VPFS for creating backups in the background 81
6.3 Version-based identification of nodes in block tree that require backup 83
6.4 Functional components of VPFS backup architecture and TCB assignment 89
6.5 VPFS components involved in recovering a file system from backup 90
7.1 Measured throughput for sequential reads and writes using 'bonnie++' traces 106
7.2 Runtimes for microbenchmarks (PostMark, untar, and grep traces) 107
7.3 Benchmarks of short-running workload traces 110
7.4 Benchmarks of long-running workload traces 111
7.5 CPU load under timed playback of 'iPhoto4' trace for VPFS and dmcrypt 112
7.6 CPU load under timed playback of 'Keynote2' trace for VPFS and dmcrypt
Recommended publications
  • CERIAS Tech Report 2017-5 Deceptive Memory Systems by Christopher N. Gutierrez
    CERIAS Tech Report 2017-5 Deceptive Memory Systems by Christopher N. Gutierrez. Center for Education and Research in Information Assurance and Security, Purdue University, West Lafayette, IN 47907-2086. DECEPTIVE MEMORY SYSTEMS. A Dissertation Submitted to the Faculty of Purdue University by Christopher N. Gutierrez, In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy, December 2017, Purdue University, West Lafayette, Indiana. THE PURDUE UNIVERSITY GRADUATE SCHOOL STATEMENT OF DISSERTATION APPROVAL: Dr. Eugene H. Spafford, Co-Chair, Department of Computer Science; Dr. Saurabh Bagchi, Co-Chair, Department of Computer Science; Dr. Dongyan Xu, Department of Computer Science; Dr. Mathias Payer, Department of Computer Science. Approved by: Dr. Voicu Popescu and Dr. William J. Gorman, Head of the Graduate Program. This work is dedicated to my wife, Gina. Thank you for all of your love and support. The moon awaits us. ACKNOWLEDGMENTS: I would like to thank Professors Eugene Spafford and Saurabh Bagchi for their guidance, support, and advice throughout my time at Purdue. Both have been instrumental in my development as a computer scientist, and I am forever grateful. I would also like to thank the Center for Education and Research in Information Assurance and Security (CERIAS) for fostering a multidisciplinary security culture of which I had the privilege to be part. Special thanks to Adam Hammer and Ronald Castongia for their technical support and Thomas Yurek for his programming assistance for the experimental evaluation. I am grateful for the valuable feedback provided by the members of my thesis committee, Professor Dongyan Xu and Professor Mathias Payer.
  • The Google File System (GFS)
    The Google File System (GFS), as described by Ghemawat, Gobioff, and Leung in 2003, provided the architecture for scalable, fault-tolerant data management within the context of Google. These architectural choices, however, resulted in sub-optimal performance in regard to data trustworthiness (security) and simplicity. Additionally, the application-specific nature of GFS limits the general scalability of the system outside of the specific design considerations within the context of Google. SYSTEM DESIGN: The authors enumerate their specific considerations as: (1) commodity components with a high expectation of failure, (2) a system optimized to handle relatively large files, particularly multi-GB files, (3) most writes to the system are concurrent append operations, rather than internal modifications of extant files, and (4) a high rate of sustained bandwidth (Sections 1, 2.1). Using these considerations, this paper analyzes the success of GFS's design in achieving a fault-tolerant, scalable system while also considering the faults of the system with regard to data trustworthiness and simplicity. Data trustworthiness: In designing GFS, the authors made the conscious decision not to prioritize data trustworthiness, by allowing applications to access 'stale', or not up-to-date, data. In particular, although the system does inform applications of the chunkserver version number, the designers of GFS encouraged user applications to cache chunkserver information, allowing stale data accesses (Sections 2.7.2, 4.5). Although possibly troubling
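    The version-number mechanism mentioned above can be sketched in a few lines. This is a hypothetical illustration, not GFS code: the names `Replica` and `read_chunk` are invented here, and it only shows the idea that a client holding the master's current chunk version can reject replicas that lag behind it.

```python
# Hypothetical sketch: a client compares the chunk version reported by the
# master against each replica's version and rejects stale replicas.
from dataclasses import dataclass

@dataclass
class Replica:
    server: str
    version: int
    data: bytes

def read_chunk(master_version, replicas):
    """Return data from the first replica whose version is current."""
    for r in replicas:
        if r.version >= master_version:  # replica is up to date
            return r.data
    raise RuntimeError("all replicas are stale")

replicas = [Replica("cs1", 3, b"old"), Replica("cs2", 4, b"new")]
assert read_chunk(4, replicas) == b"new"   # stale cs1 is skipped
```

    A client that instead caches an old master version (as GFS encourages for performance) would accept the stale replica, which is exactly the trade-off the review criticizes.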
  • A Review on GOOGLE File System
    International Journal of Computer Science Trends and Technology (IJCST), Volume 4, Issue 4, Jul-Aug 2016. RESEARCH ARTICLE, OPEN ACCESS. A Review on Google File System. Richa Pandey [1], S.P Sah [2], Department of Computer Science, Graphic Era Hill University, Uttarakhand, India. ABSTRACT: Google is an American multinational technology company specializing in Internet-related services and products. The Google file system helps in managing the large amounts of data spread across various databases. A good distributed file system is one that can handle faults, replicate data, and use memory and storage efficiently. Large data, that is, big data, must be kept in a form that can be managed easily. Many applications, such as Gmail and Facebook, rely on file systems that organize data in a suitable format. In conclusion, the paper introduces approaches used in distributed file systems, such as spreading a file's data across storage, a single master, and append writes. Keywords: GFS, NFS, AFS, HDFS. I. INTRODUCTION: The Google file system is designed in such a way that data spread across databases is saved in an organized manner. Data is arranged so as to reduce the overhead load on the server; availability must be increased, throughput should be highly aggregated, and further services should make the available data more accurate and reliable. III. KEY IDEAS. 3.1. Design and Architecture: A GFS cluster consists of a single master and multiple chunk servers used by multiple clients. Since files to be stored in GFS are large, processing and transferring such huge files can consume a lot of bandwidth. To efficiently utilize bandwidth, files are divided into large 64 MB chunks, which are identified by a unique 64-bit chunk
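    The fixed 64 MB chunking described above determines which chunk serves any given file offset with simple integer arithmetic. A small sketch (the function names are illustrative, not part of any GFS API):

```python
# Illustrative only: locate a byte within a file chunked into 64 MB pieces,
# as the GFS review above describes.
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB per chunk

def chunk_index(offset):
    """Index of the chunk that holds the byte at this file offset."""
    return offset // CHUNK_SIZE

def chunk_offset(offset):
    """Position of that byte within its chunk."""
    return offset % CHUNK_SIZE

# A byte 200 MiB into a file lands in the fourth chunk (index 3),
# 8 MiB past that chunk's start.
assert chunk_index(200 * 1024 * 1024) == 3
assert chunk_offset(200 * 1024 * 1024) == 8 * 1024 * 1024
```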
  • Ext4 File System and Crash Consistency
    Ext4 file system and crash consistency (Changwoo Min)

    Summary of last lectures:
    • Tools: building, exploring, and debugging the Linux kernel
    • Core kernel infrastructure
    • Process management & scheduling
    • Interrupts & interrupt handlers
    • Kernel synchronization
    • Memory management
    • Virtual file system
    • Page cache and page faults

    Today: ext4 file system and crash consistency
    • File systems in the Linux kernel
    • Design considerations of a file system
    • History of file system design
    • On-disk structure of Ext4
    • File operations
    • Crash consistency

    File system in the Linux kernel: a user-space application (e.g., cp) issues syscalls such as open, read, and write; in kernel space these pass through the VFS (Virtual File System) layer to concrete file systems (ext4, FAT32, JFFS2), then through the block layer to the hardware (embedded flash, hard disk, USB drive).

    What is a file system fundamentally?

    int main(int argc, char *argv[])
    {
        int fd;
        char buffer[4096];
        struct stat stat_buf;
        DIR *dir;
        struct dirent *entry;

        /* 1. Path name -> inode mapping */
        fd = open("/home/lkp/hello.c", O_RDONLY);

        /* 2. File offset -> disk block address mapping */
        pread(fd, buffer, sizeof(buffer), 0);

        /* 3. File metadata operation */
        fstat(fd, &stat_buf);
        printf("file size = %ld\n", (long)stat_buf.st_size);

        /* 4. Directory operation */
        dir = opendir("/home");
        entry = readdir(dir);
        printf("dir = %s\n", entry->d_name);

        return 0;
    }

    Why do we care about the ext4 file system?
    • Most widely-deployed file system
    • Default file system of major Linux distributions
    • File system used in Google data centers
    • Default file system of the Android kernel
    • Follows the traditional file system design

    History of file system design

    UFS (Unix File System)
    • The original UNIX file system
    • Designed by Dennis Ritchie and Ken Thompson (1974)
    • The first Linux file system (ext) and Minix FS have a similar layout
    • Performance problem of UFS (and the first Linux file system): in particular, long seek times between an inode and its data blocks

    FFS (Fast File System)
    • The file system of BSD UNIX
    • Designed by Marshall Kirk McKusick, et al.
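    The second mapping in the C example above (file offset to disk block) reduces to integer division once the block size is fixed. A toy calculation, assuming the 4 KiB block size that is the ext4 default (the function name `block_of` is invented here for illustration):

```python
# Illustrative sketch of the "file offset -> block" mapping from the
# lecture's C example, assuming 4 KiB blocks (the ext4 default).
BLOCK_SIZE = 4096

def block_of(offset):
    """Map a file offset to (logical block number, offset within block)."""
    return offset // BLOCK_SIZE, offset % BLOCK_SIZE

# Offset 10000 falls 1808 bytes into logical block 2.
assert block_of(10000) == (2, 1808)
assert block_of(4096) == (1, 0)
```

    The file system's real job is then to translate that logical block number into a physical disk block address via the inode (directly, through indirect blocks, or with ext4 extents).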
  • F1 Query: Declarative Querying at Scale
    F1 Query: Declarative Querying at Scale. Bart Samwel, John Cieslewicz, Ben Handy, Jason Govig, Petros Venetis, Chanjun Yang, Keith Peters, Jeff Shute, Daniel Tenedorio, Himani Apte, Felix Weigel, David Wilhite, Jiacheng Yang, Jun Xu, Jiexing Li, Zhan Yuan, Craig Chasseur, Qiang Zeng, Ian Rae, Anurag Biyani, Andrew Harn, Yang Xia, Andrey Gubichev, Amr El-Helw, Orri Erling, Zhepeng Yan, Mohan Yang, Yiqun Wei, Thanh Do, Colin Zheng, Goetz Graefe, Somayeh Sardashti, Ahmed M. Aly, Divy Agrawal, Ashish Gupta, Shiv Venkataraman. Google LLC, [email protected]. ABSTRACT: F1 Query is a stand-alone, federated query processing platform that executes SQL queries against data stored in different file-based formats as well as different storage systems at Google (e.g., Bigtable, Spanner, Google Spreadsheets, etc.). F1 Query eliminates the need to maintain the traditional distinction between different types of data processing workloads by simultaneously supporting: (i) OLTP-style point queries that affect only a few records; (ii) low-latency OLAP querying of large amounts of data; and (iii) large ETL pipelines. F1 Query has also significantly reduced the 1. INTRODUCTION: The data processing and analysis use cases in large organizations like Google exhibit diverse requirements in data sizes, latency, data sources and sinks, freshness, and the need for custom business logic. As a result, many data processing systems focus on a particular slice of this requirements space, for instance on either transactional-style queries, medium-sized OLAP queries, or huge Extract-Transform-Load (ETL) pipelines. Some systems are highly extensible, while others are not. Some systems function mostly as a closed silo, while others can easily pull in data from other sources.
  • A Study of Cryptographic File Systems in Userspace
    Turkish Journal of Computer and Mathematics Education, Vol. 12, No. 10 (2021), 4507-4513. Research Article. A study of cryptographic file systems in userspace. Sahil Naphade, Ajinkya Kulkarni, Yash Kulkarni, Yash Patil, Sachin Pande (Department of Information Technology, PICT, Pune, India), Kaushik Lathiya (Veritas Technologies, Pune, India). Article History: Received: 10 January 2021; Revised: 12 February 2021; Accepted: 27 March 2021; Published online: 28 April 2021. Abstract: With the advancements in technology and digitization, data storage needs are expanding, along with the data breaches which can expose sensitive data to the world. Thus, the security of stored data is extremely important. Conventionally, there are two methods of storing data: the first is hiding the data and the second is encrypting it. However, hidden data is simple to find, and thus hiding is very unreliable. The second method, encryption, allows the data to be accessed only by the person who encrypted it using their passkey, thus providing higher security. Typically, a file system is implemented in the kernel of the operating system. However, with the increase in complexity of traditional file systems like ext3 and ext4, file systems based in the userspace of the OS now allow for additional features on top of them, such as encryption/decryption and compression.
  • CS 152: Computer Systems Architecture Storage Technologies
    CS 152: Computer Systems Architecture, Storage Technologies. Sang-Woo Jun, Winter 2019. Storage used to be a secondary concern: typically, storage was not a first-order citizen of a computer system, as alluded to by its name "secondary storage". Its job was to load programs and data into memory, and then disappear; most applications only worked with the CPU and system memory (DRAM). Extreme applications like DBMSs were the exception, because conventional secondary storage was very slow. Things are changing! Some (pre)history: magnetic core memory (1950s~1970s, 72 KiB per cubic foot!); rope memory (ROM, 1960s, 100s of KiB, 1024 bits in photo, hand-woven to program the Apollo guidance computer); drum memory (1950s). Some (more recent) history: floppy disk drives (1970s~2000s, 100 KiBs to 1.44 MiB); hard disk drives (1950s to present, MBs to TBs). Some (current) history: solid state drives (2000s to present, GBs to TBs); non-volatile memory (2010s to present, GBs). (Photos from Wikipedia.) Hard disk drives: the dominant storage medium for the longest time, and still the largest share of capacity. Data is organized into multiple magnetic platters; a mechanical head needs to move to where the data is in order to read it. Good sequential access, terrible random access: 100s of MB/s sequential, maybe 1 MB/s for 4 KB random reads. The time for the head to move to the right location ("seek time") may be milliseconds long, i.e., 1,000,000s of CPU cycles! Typically attached via "ATA" (including IDE and EIDE), and later "SATA", interfaces, connected via the "South bridge" chipset. Ding Yuan, "Operating Systems ECE344 Lecture 11: File
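    The sequential-versus-random gap quoted above ("100s of MB/s sequential, maybe 1 MB/s 4 KB random") follows directly from seek time: every random 4 KB read pays a full mechanical seek. A back-of-the-envelope check, with an assumed 8 ms average seek plus rotational delay (the exact number varies by drive):

```python
# Back-of-the-envelope HDD random-read throughput, assuming each 4 KiB
# read costs one full seek. The 8 ms figure is an assumption, not a spec.
SEEK_MS = 8.0
IO_SIZE = 4 * 1024  # 4 KiB per random read

def random_read_throughput_mb_s(seek_ms, io_size):
    ios_per_sec = 1000.0 / seek_ms       # seeks (and thus I/Os) per second
    return ios_per_sec * io_size / 1e6   # bytes/s converted to MB/s

rate = random_read_throughput_mb_s(SEEK_MS, IO_SIZE)
# ~125 IOPS * 4 KiB is about half a megabyte per second, matching the
# slide's "maybe 1 MB/s" order of magnitude.
assert 0.1 < rate < 1.0
```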
  • Cloud Computing Bible Is a Wide-Ranging and Complete Reference
    A thorough, down-to-earth look at cloud computing. The chance to lower IT costs makes cloud computing a hot topic, and it's getting hotter all the time. If you want a terra firma take on everything you should know about the cloud, this book is it. Starting with a clear definition of what cloud computing is, why it is, and its pros and cons, Cloud Computing Bible is a wide-ranging and complete reference. You'll get thoroughly up to speed on cloud platforms, infrastructure, services and applications, security, and much more. • Learn what cloud computing is and what it is not • Assess the value of cloud computing, including licensing models, ROI, and more • Understand abstraction, partitioning, virtualization, capacity planning, and various programming solutions • See how to use Google®, Amazon®, and Microsoft® Web services effectively • Explore cloud communication methods — IM, Twitter™, Google Buzz™, Facebook®, and others • Discover how cloud services are changing mobile phones — and vice versa. About the author: Barrie Sosinsky is a veteran computer book writer specializing in network systems, databases, design, development, and testing. Among his 35 technical books have been Wiley's Networking Bible and many others on operating systems, Web topics, storage, and application software. He has written nearly 500 articles for computer magazines and Web sites. www.wiley.com/compbooks
  • Filesystem Considerations for Embedded Devices ELC2015 03/25/15
    Filesystem considerations for embedded devices ELC2015 03/25/15 Tristan Lelong Senior embedded software engineer Filesystem considerations ABSTRACT The goal of this presentation is to answer a question asked by several customers: which filesystem should you use within your embedded design’s eMMC/SDCard? These storage devices use a standard block interface, compatible with traditional filesystems, but constraints are not those of desktop PC environments. EXT2/3/4, BTRFS, F2FS are the first of many solutions which come to mind, but how do they all compare? Typical queries include performance, longevity, tools availability, support, and power loss robustness. This presentation will not dive into implementation details but will instead summarize provided answers with the help of various figures and meaningful test results. 2 TABLE OF CONTENTS 1. Introduction 2. Block devices 3. Available filesystems 4. Performances 5. Tools 6. Reliability 7. Conclusion Filesystem considerations ABOUT THE AUTHOR • Tristan Lelong • Embedded software engineer @ Adeneo Embedded • French, living in the Pacific northwest • Embedded software, free software, and Linux kernel enthusiast. 4 Introduction Filesystem considerations Introduction INTRODUCTION More and more embedded designs rely on smart memory chips rather than bare NAND or NOR. This presentation will start by describing: • Some context to help understand the differences between NAND and MMC • Some typical requirements found in embedded devices designs • Potential filesystems to use on MMC devices 6 Filesystem considerations Introduction INTRODUCTION Focus will then move to block filesystems. How they are supported, what feature do they advertise. 
To help understand how they compare, we will present some benchmarks and comparisons regarding: • Tools • Reliability • Performances 7 Block devices Filesystem considerations Block devices MMC, EMMC, SD CARD Vocabulary: • MMC: MultiMediaCard is a memory card unveiled in 1997 by SanDisk and Siemens based on NAND flash memory.
  • MapReduce: Simplified Data Processing on Large Clusters
    MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat ([email protected], [email protected]), Google, Inc. Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real-world tasks are expressible in this model, as shown in the paper. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine ... given day, etc. Most such computations are conceptually straightforward. However, the input data is usually large and the computations have to be distributed across hundreds or thousands of machines in order to finish in a reasonable amount of time. The issues of how to parallelize the computation, distribute the data, and handle failures conspire to obscure the original simple computation with large amounts of complex code to deal with these issues. As a reaction to this complexity, we designed a new abstraction that allows us to express the simple computations we were trying to perform but hides the messy details of parallelization, fault tolerance, data distribution, and load balancing in a library. Our abstraction is inspired by the map and reduce primitives present in Lisp and many other functional languages.
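    The map and reduce primitives described in the abstract can be sketched in a few lines. This is a single-machine toy for word counting, not the distributed runtime the paper describes; the driver function `mapreduce` and its in-memory "shuffle" stand in for the partitioning, scheduling, and fault handling that the real library provides:

```python
# Toy word-count in the MapReduce style: a user-defined map emits
# (word, 1) pairs, a shuffle groups values by key, and a user-defined
# reduce sums each group. Single-process sketch only.
from collections import defaultdict

def map_fn(doc):
    """Map: emit an intermediate (word, 1) pair for every word."""
    for word in doc.split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: merge all intermediate values for one key."""
    return word, sum(counts)

def mapreduce(docs):
    groups = defaultdict(list)
    for doc in docs:                 # "map" phase
        for key, value in map_fn(doc):
            groups[key].append(value)  # in-memory "shuffle"
    return dict(reduce_fn(k, vs) for k, vs in groups.items())  # "reduce"

assert mapreduce(["a b a", "b c"]) == {"a": 2, "b": 2, "c": 1}
```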
  • Google File System 2.0: a Modern Design and Implementation
    Google File System 2.0: A Modern Design and Implementation. Babatunde Micheal Okutubo, Gan Tu, Xi Cheng (Stanford University). Abstract: The Google File System (GFS) presented an elegant design and has shown tremendous capability to scale and to tolerate faults. However, there is no concrete implementation of GFS being open sourced, even after nearly 20 years of its initial development. In this project, we attempt to build a GFS-2.0 in C++ from scratch by applying modern software design discipline. A fully functional GFS has been developed to support concurrent file creation, reads, and parallel writes. The system is capable of surviving chunk servers going down during both read and write operations. Various crucial implementation details have been discussed, and a micro-benchmark has been conducted and presented. ... build a GFS-2.0 from scratch, with modern software design principles and implementation patterns, in order to survey ways to not only effectively achieve various system requirements, but also provide great support for high-concurrency workloads and software extensibility. A second motivation of this project is that GFS has a single master, which served well back in 2003 when this work was published. However, as data volume, network capacity, and hardware capability have all drastically improved since that date, our hypothesis is that a multi-master version of GFS is advantageous, as it provides better fault tolerance (especially from the master servers' point of view), availability, and throughput. Although the
  • State-Of-The-Art Garbage Collection Policies for NILFS2
    State-of-the-art Garbage Collection Policies for NILFS2. DIPLOMA THESIS submitted in partial fulfillment of the requirements for the degree of Diplom-Ingenieur in Software Engineering/Internet Computing by Andreas Rohner, BSc, Registration Number 0502196, to the Faculty of Informatics at the TU Wien. Advisor: Ao.Univ.Prof. Dipl.-Ing. Dr.techn. M. Anton Ertl. Vienna, 23rd January, 2018. Technische Universität Wien, A-1040 Wien, Karlsplatz 13, Tel. +43-1-58801-0, www.tuwien.ac.at. Declaration of authorship (translated from the German): Andreas Rohner, BSc, Grundsteingasse 43/22. I hereby declare that I have written this thesis independently, that I have fully listed the sources and aids used, and that I have clearly marked all passages of the work, including tables, maps, and figures, that were taken from other works or from the Internet, whether verbatim or in substance, with an indication of their source. Vienna, 23 January 2018, Andreas Rohner. Acknowledgements (translated from the German): I thank my advisor Ao.Univ.Prof. Dipl.-Ing. Dr.techn. M. Anton Ertl for his patience and support. I also owe thanks to Ryusuke Konishi, the maintainer of the NILFS2 subsystem in the Linux kernel, for reviewing my patches and for the many valuable suggestions for improvement.