A Distributed File System (Dfs)

Total Page:16

File Type:pdf, Size:1020Kb

A Distributed File System (Dfs) A DISTRIBUTED FILE SYSTEM (DFS) By PRASHANT JAYARAMAN A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE UNIVERSITY OF FLORIDA 2006 Copyright 2006 by Prashant Jayaraman To my parents and to my brother Vivek. ACKNOWLEDGMENTS I thank Dr. Richard Newman for his help and guidance. I would like to express my gratitude to Dr. Randy Chow and Dr. Shigang Chen for serving on my supervisory committee and reviewing my work. I also acknowledge the contributions of all members of the Distributed Conferencing System group. Special thanks are due to my family and friends for their encouragement and support. iv TABLE OF CONTENTS page ACKNOWLEDGMENTS ................................................................................................. iv LIST OF FIGURES ......................................................................................................... viii LIST OF ABBREVIATIONS............................................................................................ ix ABSTRACT..................................................................................................................... xiii CHAPTER 1 INTRODUCTION ...................................................................................................1 Introduction..............................................................................................................1 Distributed Conferencing System Services .............................................................2 2 STUDY OF DISTRIBUTED FILE SYSTEMS ......................................................4 Introduction..............................................................................................................4 Network File System (NFS).....................................................................................4 Introduction..................................................................................................4 Protocol........................................................................................................4 Naming and Transparency ...........................................................................5 Concurrency Control and Consistency ........................................................6 Replication and Migration ...........................................................................6 Caching ........................................................................................................7 Security ........................................................................................................7 Andrew File System (AFS)......................................................................................8 Introduction..................................................................................................8 Architecture..................................................................................................8 Naming and Transparency ...........................................................................9 Concurrency Control and Consistency ........................................................9 Replication and Migration ...........................................................................9 Caching ......................................................................................................10 Security ......................................................................................................10 Common Internet File System (CIFS) ...................................................................11 Coda ....................................................................................................................11 Plan 9 ....................................................................................................................12 xFS ....................................................................................................................13 v Low Bandwidth File System (LBFS) ....................................................................14 Freenet....................................................................................................................15 3 REQUIREMENTS AND DESIGN .......................................................................18 Requirements .........................................................................................................18 Naming and Transparency .........................................................................18 Concurrency Control and Consistency ......................................................19 Replication and Migration .........................................................................19 Caching ......................................................................................................20 Security ......................................................................................................20 Design ....................................................................................................................21 Communication Architecture.....................................................................21 Stateful/Stateless Server.............................................................................21 File Access Model......................................................................................21 Concurrency Control and Consistency ......................................................22 Versioning..................................................................................................23 Forward deltas................................................................................23 Reverse deltas ................................................................................23 Previous design ..............................................................................24 Current design................................................................................24 Naming and Transparency .........................................................................26 Storing Metadata........................................................................................26 Symbolic links ...............................................................................28 File Information Table (FIT) .........................................................28 Previous design ..............................................................................29 Proposed design .............................................................................30 Cog Policy..................................................................................................33 User Policy.................................................................................................34 Migration....................................................................................................35 Replication .................................................................................................36 Replicate entire file........................................................................36 Replicate portions of file................................................................36 Replicate on create.........................................................................36 Replicate on open...........................................................................37 Design decision..............................................................................37 Caching ......................................................................................................38 Client caching ................................................................................38 Server caching................................................................................38 Security ......................................................................................................39 File System Interface .................................................................................39 Conclusion .............................................................................................................41 vi 4 IMPLEMENTATION............................................................................................45 Requirements .........................................................................................................45 Interaction with Distributed Conferencing System Services.................................45 Interaction with Conference Control Service ............................................45 Interaction with Access Control Service....................................................46 Interaction with Secure Communication Service ......................................46 Communication......................................................................................................46 Distributed File System Architecture.....................................................................47 File Information Table Manager (FITM)...............................................................47 Version Naming Manager (VNM).........................................................................48 Cog Policy Manager (CPM) ..................................................................................49 User Policy Manager (UPM) .................................................................................49 Version Control Manager (VCM)..........................................................................50 Creating a Delta .........................................................................................51
Recommended publications
  • Elastic Storage for Linux on System Z IBM Research & Development Germany
    Peter Münch T/L Test and Development – Elastic Storage for Linux on System z IBM Research & Development Germany Elastic Storage for Linux on IBM System z © Copyright IBM Corporation 2014 9.0 Elastic Storage for Linux on System z Session objectives • This presentation introduces the Elastic Storage, based on General Parallel File System technology that will be available for Linux on IBM System z. Understand the concepts of Elastic Storage and which functions will be available for Linux on System z. Learn how you can integrate and benefit from the Elastic Storage in a Linux on System z environment. Finally, get your first impression in a live demo of Elastic Storage. 2 © Copyright IBM Corporation 2014 Elastic Storage for Linux on System z Trademarks The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. AIX* FlashSystem Storwize* Tivoli* DB2* IBM* System p* WebSphere* DS8000* IBM (logo)* System x* XIV* ECKD MQSeries* System z* z/VM* * Registered trademarks of IBM Corporation The following are trademarks or registered trademarks of other companies. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
    [Show full text]
  • BESIII Physical Analysis on Hadoop Platform
    20th International Conference on Computing in High Energy and Nuclear Physics (CHEP2013) IOP Publishing Journal of Physics: Conference Series 513 (2014) 032044 doi:10.1088/1742-6596/513/3/032044 BESIII Physical Analysis on Hadoop Platform Jing HUO12, Dongsong ZANG12, Xiaofeng LEI12, Qiang LI12, Gongxing SUN1 1Institute of High Energy Physics, Beijing, China 2University of Chinese Academy of Sciences, Beijing, China E-mail: [email protected] Abstract. In the past 20 years, computing cluster has been widely used for High Energy Physics data processing. The jobs running on the traditional cluster with a Data-to-Computing structure, have to read large volumes of data via the network to the computing nodes for analysis, thereby making the I/O latency become a bottleneck of the whole system. The new distributed computing technology based on the MapReduce programming model has many advantages, such as high concurrency, high scalability and high fault tolerance, and it can benefit us in dealing with Big Data. This paper brings the idea of using MapReduce model to do BESIII physical analysis, and presents a new data analysis system structure based on Hadoop platform, which not only greatly improve the efficiency of data analysis, but also reduces the cost of system building. Moreover, this paper establishes an event pre-selection system based on the event level metadata(TAGs) database to optimize the data analyzing procedure. 1. Introduction 1.1 The current BESIII computing architecture High Energy Physics experiment is a typical data-intensive application. The BESIII computing system now consist of 3PB+ data, 6500 CPU cores, and it is estimated that there will be more than 10PB data produced in the future 5 years.
    [Show full text]
  • Globalfs: a Strongly Consistent Multi-Site File System
    GlobalFS: A Strongly Consistent Multi-Site File System Leandro Pacheco Raluca Halalai Valerio Schiavoni University of Lugano University of Neuchatelˆ University of Neuchatelˆ Fernando Pedone Etienne Riviere` Pascal Felber University of Lugano University of Neuchatelˆ University of Neuchatelˆ Abstract consistency, availability, and tolerance to partitions. Our goal is to ensure strongly consistent file system operations This paper introduces GlobalFS, a POSIX-compliant despite node failures, at the price of possibly reduced geographically distributed file system. GlobalFS builds availability in the event of a network partition. Weak on two fundamental building blocks, an atomic multicast consistency is suitable for domain-specific applications group communication abstraction and multiple instances of where programmers can anticipate and provide resolution a single-site data store. We define four execution modes and methods for conflicts, or work with last-writer-wins show how all file system operations can be implemented resolution methods. Our rationale is that for general-purpose with these modes while ensuring strong consistency and services such as a file system, strong consistency is more tolerating failures. We describe the GlobalFS prototype in appropriate as it is both more intuitive for the users and detail and report on an extensive performance assessment. does not require human intervention in case of conflicts. We have deployed GlobalFS across all EC2 regions and Strong consistency requires ordering commands across show that the system scales geographically, providing replicas, which needs coordination among nodes at performance comparable to other state-of-the-art distributed geographically distributed sites (i.e., regions). Designing file systems for local commands and allowing for strongly strongly consistent distributed systems that provide good consistent operations over the whole system.
    [Show full text]
  • Considerations When Choosing a Backup System for AFS
    Considerations when Choosing a Backup System for AFS By Kristen J. Webb President and CTO Teradactyl LLC. June 18, 2005 The Andrew File System® has a proven track record as a scalable and secure network file system. Organizations everywhere depend on AFS for the sharing of data both internally and externally. The most telling sign of the longevity of AFS is the adoption of the open source version of the software (OpenAFS). One of the important challenges facing AFS system administrators is the need to provide efficient, cost-effective backup strategies that can scale with ever increasing cell sizes. This paper provides an overview of the issues to consider when putting together a long term backup and recovery strategy. The topics include: Methods available for data backup and recovery; Impact of the backup function on cell performance and scale; Volume management; Centralized vs. Distributed backups; Disk-to-Tape, Disk-to-Disk and Disk-to-Disk-to-Tape comparisons; Disaster Recovery requirements; Advanced Solutions. The reader should have a basic understanding of backup and recovery concepts and familiarity with AFS architecture and cell administration. Introduction The purpose of this paper it to examine the general behavior of backup systems in relation to working with the Andrew File System (AFS). It is written as a guide for system administers to help make them aware of all of the considerations for backup, restore, and disaster recovery that they need to take into account when putting together a backup and recovery strategy for AFS. The paper includes discussions on the features and limitations of the native backup and recovery programs provided with AFS.
    [Show full text]
  • Distributed Systems 14
    Distributed Systems 14. Network File Systems (Network Attached Storage) Paul Krzyzanowski Rutgers University Fall 2018 October 22, 2018 © 2014-2018 Paul Krzyzanowski 1 Accessing files File sharing with socket-based programs HTTP, FTP, telnet: – Explicit access – User-directed connection to access remote resources We want more transparency – Allow user to access remote resources just as local ones NAS: Network Attached Storage October 22, 2018 © 2014-2018 Paul Krzyzanowski 2 File service models Upload/Download model Remote access model – Read file: copy file from server to client File service provides functional interface: – Write file: copy file from client to server – create, delete, read bytes, write bytes, etc… Advantage: Advantages: – Simple – Client gets only what’s needed – Server can manage coherent view of file system Problems: – Wasteful: what if client needs small Problem: piece? – Possible server and network congestion – Problematic: what if client doesn’t have enough space? • Servers are accessed for duration of file access – Consistency: what if others need to modify the same file? • Same data may be requested repeatedly October 22, 2018 © 2014-2018 Paul Krzyzanowski 3 Semantics of file sharing Sequential Semantics Session Semantics Read returns result of last write Relax the rules Easily achieved if • Changes to an open file are – Only one server initially visible only to the process – Clients do not cache data (or machine) that modified it. BUT • Need to hide or lock file under – Performance problems if no cache modification
    [Show full text]
  • CIFS SMB2 SMB3 Meets Linux a Year in Review
    The Future of File Protocols: CIFS SMB2 SMB3 Meets Linux A Year in Review Steve French Senior Engineer – File System Architect IBM Linux Technology Center 1 IBM, Linux, and Building a Smarter Planet © 2012 IBM Corporation Legal Statement – This work represents the views of the author(s) and does not necessarily reflect the views of IBM Corporation – A full list of U.S. trademarks owned by IBM may be found at http://www.ibm.com/legal/copytrade.shtml. – Linux is a registered trademark of Linus Torvalds. – Other company, product, and service names may be trademarks or service marks of others. 2 © 2012 IBM Corporation Who am I? – Steve French ([email protected] or [email protected]) – Author and maintainer of Linux cifs vfs (for accessing Samba, Windows and various SMB/CIFS based NAS appliances) – Wrote initial SMB2 kernel client prototype – Member of the Samba team, coauthor of SNIA CIFS Technical Reference and former SNIA CIFS Working Group chair – Architect: File Systems/NFS/Samba for IBM Linux Technology Center © 2012 IBM Corporation SMB3: Great Feature Set, Broad Deployment, Amazing Performance ● Introduction of new storage features in Windows 8 causing one of most dramatic shifts in storage industry as companies rapidly move to support “SMB3” (aka SMB2.2) ● “SMB2.2 (CIFS) screams over InfiniBand” (Storage CH Blog) • Is (traditional) SAN use going to die? – “Market trends show virtualization workloads moving to NAS” (Dennis Chapman, Technical Director NetApp, SNIA SDC 2011) – “Unstructured data (file-based) storage is growing faster than structured data” (Thomas Pfenning, Microsoft GM, SNIA SDC 2011) – Customers prefer “file” • SMB2/CIFS market share is MUCH larger than NFS.
    [Show full text]
  • Introduction to Distributed File Systems. Orangefs Experience 0.05 [Width=0.4]Lvee-Logo-Winter
    Introduction to distributed file systems. OrangeFS experience Andrew Savchenko NRNU MEPhI, Moscow, Russia 16 February 2013 . .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Outline 1. Introduction 2. Species of distributed file systems 3. Behind the curtain 4. OrangeFS 5. Summary . .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Introduction Why one needs a non-local file system? • a large data storage • a high performance data storage • redundant and highly available solutions There are dozens of them: 72 only on wiki[1], more IRL. Focus on free software solutions. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Introduction Why one needs a non-local file system? • a large data storage • a high performance data storage • redundant and highly available solutions There are dozens of them: 72 only on wiki[1], more IRL. Focus on free software solutions. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Species of distributed file systems Network Clustered Distributed Terminology is ambiguous!. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Network file systems client1 client2 clientN server storage A single server (or at least an appearance) and multiple network clients. Examples: NFS, CIFS. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Clustered file systems server1 server2 serverN storage Servers sharing the same local storage (usually SAN[2] at block level). shared storage architecture. Examples: GFS2[3], OCFS2[4]. .. .. .. .. .
    [Show full text]
  • Building Windows File Systems: a Look at the Openafs Client for Windows
    Building Windows File Systems: A Look at the OpenAFS Client for Windows Peter Scott Kernel Drivers, LLC Jeffrey Altman Your File System, Inc 2010 Storage Developer Conference. Kernel Drivers, LLC. All Rights Reserved. AFS History … on Windows The SMB Interface Used the Loopback Adapter for NetBIOS name registration All requests were processed by the AFS Service in user mode Minimal kernel level caching Lack of support for network browsing within Explorer 2010 Storage Developer Conference. Kernel Drivers, LLC. All Rights Reserved. 2 More History … ‘Relatively’ easy to implement but problems … Performance Generic solution to fit all situations Microsoft components, very difficult to get bugs fixed Several critical bugs found in SMB with no fix in sight Ex: Static thread worker pool count leads to dead locks Documentation is minimal, at best Reliant on the Windows redirector for network behavior … good and bad 2010 Storage Developer Conference. Kernel Drivers, LLC. All Rights Reserved. 3 A Native Approach … Custom File System Pros … Complete control of name space parsing Reparse Points Mount Points DFS Links Better optimization of IO pathways Minimize Kernel to User transitions Cache invalidation requirements Better system cache integration Support for write-through behavior when needed 2010 Storage Developer Conference. Kernel Drivers, LLC. All Rights Reserved. 4 A Native Approach … And the down side … Complex implementation Undocumented interfaces srvsvc, wkssvc, IPC$ Network Provider support Cache and Memory Manager Must deal with 3rd party products File system filters 2010 Storage Developer Conference. Kernel Drivers, LLC. All Rights Reserved. 5 Windows File System Model Windows IFS (Installable File System) Interface IRP Based model IO Request Packets ‘Fast IO’ Interface used for more than just IO Direct-call model, no IRPs … almost Network Provider Interface for Network Redirectors only Support for drive mappings UNC Browsing 2010 Storage Developer Conference.
    [Show full text]
  • The Evolution of File Systems
    The Evolution of File Systems Thomas Rivera, Hitachi Data Systems Craig Harmer, April 2011 SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies and individuals may use this material in presentations and literature under the following conditions: Any slide or slides used must be reproduced without modification The SNIA must be acknowledged as source of any material used in the body of any document containing material from these presentations. This presentation is a project of the SNIA Education Committee. Neither the Author nor the Presenter is an attorney and nothing in this presentation is intended to be nor should be construed as legal advice or opinion. If you need legal advice or legal opinion please contact an attorney. The information presented herein represents the Author's personal opinion and current understanding of the issues involved. The Author, the Presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information. NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK. The Evolution of File Systems 2 © 2012 Storage Networking Industry Association. All Rights Reserved. 2 Abstract The File Systems Evolution Over time additional file systems appeared focusing on specialized requirements such as: data sharing, remote file access, distributed file access, parallel files access, HPC, archiving, security, etc. Due to the dramatic growth of unstructured data, files as the basic units for data containers are morphing into file objects, providing more semantics and feature- rich capabilities for content processing This presentation will: Categorize and explain the basic principles of currently available file system architectures (e.g.
    [Show full text]
  • The Selinux Notebook - the Foundations
    The SELinux Notebook - The Foundations The SELinux Notebook The Foundations Page 1 The SELinux Notebook - The Foundations 0. Notebook Information 0.1 Copyright Information Copyright © 2009 Richard Haines. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNUFree Documentation License”. The scripts and source code in this Notebook are covered by the GNU General Public License. The scripts and code are free source: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version. These are distributed in the hope that they will be useful in researching SELinux, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with scripts and source code. If not, see <http://www.gnu.org/licenses/>. 0.2 Revision History Version Date Description of Change Changed By 1.0 20th Nov ‘09 First release. Richard Haines 0.3 Acknowledgements Logo designed by Máirín Duffy 0.4 Abbreviations Term Definition apol Policy analysis tool AV Access Vector AVC Access Vector Cache BLP Bell-La Padula CC Common Criteria CMW Compartmented Mode Workstation DAC Discretionary Access Control Page 2 The SELinux Notebook - The Foundations Term Definition F-10 Fedora 10 FLASK Flux Advanced Security Kernel - A security-enhanced version of the Fluke kernel and OS developed by the Utah Flux team and the US Department of Defence.
    [Show full text]
  • Around the Linux File System World in 45 Minutes
    Around the Linux File System World in 45 minutes Steve French IBM Samba Team [email protected] Abstract lines of code), most active (averaging 10 changesets a day!), and most important kernel subsystems. What makes the Linux file system interface unique? What has improved over the past year in this important part of the kernel? Why are there more than 50 Linux 2 The Linux Virtual File System Layer File Systems? Why might you choose ext4 or XFS, NFS or CIFS, or OCFS2 or GFS2? The differences are not al- The heart of the Linux file system, and what makes ways obvious. This paper will describe the new features Linux file systems unique, is the virtual file system in the Linux VFS, how various Linux file systems dif- layer, or VFS, which they connect to. fer in their use, and compare some of the key Linux file systems. 2.1 Virtual File System Structures and Relation- ships File systems are one of the largest and most active parts of the Linux kernel, but some key sections of the file sys- tem interface are little understood, and with more than The relationships between Linux file system compo- than 50 Linux file systems the differences between them nents is described in various papers [3] and is impor- can be confusing even to developers. tant to understand when carefully comparing file system implementations. 1 Introduction: What is a File System? 2.2 Comparison with other Operating Systems Linux has a rich file system interface, and a surprising number of file system implementations.
    [Show full text]
  • Openafs Quick Start Guide for UNIX I
    OpenAFS Quick Start Guide for UNIX i OpenAFS Quick Start Guide for UNIX OpenAFS Quick Start Guide for UNIX ii Copyright © 2000-2014 IBM Corporation and other contributors. All Rights Reserved This documentation is covered by the IBM Public License Version 1.0. OpenAFS Quick Start Guide for UNIX iii COLLABORATORS TITLE : OpenAFS Quick Start Guide for UNIX ACTION NAME DATE SIGNATURE WRITTEN BY February 12, 2016 REVISION HISTORY NUMBER DATE DESCRIPTION NAME BP--1.6.x-4781- gc0876-dirty OpenAFS Quick Start Guide for UNIX iv Contents 1 Installation Overview 1 1.1 The Procedures Described in this Guide . .1 1.1.1 Required Initial Procedures . .1 1.1.1.1 Incorporating AFS Into the Kernel . .1 1.1.1.2 Installing the First AFS Machine . .1 1.1.2 As-needed Procedures . .2 1.1.2.1 Upgrading the Operating System . .2 1.1.2.2 Installing Additional File Server Machines . .2 1.1.2.3 Configuring or Decommissioning Database Server Machines . .2 1.1.2.4 Installing Additional AFS Client Machines . .2 1.1.2.5 Building AFS from Source Code . .2 1.1.2.6 Configuring Legacy Components . .2 1.2 Recommended Reading List . .2 1.3 Requirements . .3 1.3.1 Login Identity . .3 1.3.2 General Requirements . .3 1.3.3 File Server Machine Requirements . .4 1.3.4 Client Machine Requirements . .4 1.4 Supported System Types . .4 1.5 About Upgrading the Operating System . .4 1.6 The OpenAFS Binary Distribution . .5 1.7 How to Continue . .5 2 Installing the First AFS Machine 6 2.1 Requirements and Configuration Decisions .
    [Show full text]