The GFS2 Filesystem, by Steven Whitehouse


Proceedings of the Linux Symposium
Volume Two
June 27th–30th, 2007
Ottawa, Ontario, Canada

Contents

Unifying Virtual Drivers (J. Mason, D. Shows, and D. Olien)
The new ext4 filesystem: current status and future plans (A. Mathur, M. Cao, S. Bhattacharya, A. Dilger, A. Tomas, & L. Vivier)
The 7 dwarves: debugging information beyond gdb (Arnaldo Carvalho de Melo)
Adding Generic Process Containers to the Linux Kernel (P.B. Menage)
KvmFS: Virtual Machine Partitioning For Clusters and Grids (Andrey Mirtchovski)
Linux-based Ultra Mobile PCs (Rajeev Muralidhar, Hari Seshadri, Krishna Paul, & Srividya Karumuri)
Where is your application stuck? (S. Nagar, B. Singh, V. Kashyap, C. Seethraman, N. Sharoff, & P. Banerjee)
Trusted Secure Embedded Linux (Hadi Nahari)
Hybrid-Virtualization—Enhanced Virtualization for Linux (J. Nakajima and A.K. Mallick)
Readahead: time-travel techniques for desktop and embedded systems (Michael Opdenacker)
Semantic Patches (Y. Padioleau & J.L. Lawall)
cpuidle—Do nothing, efficiently. (V. Pallipadi, A. Belay, & S. Li)
My bandwidth is wider than yours (Iñaky Pérez-González)
Zumastor Linux Storage Server (Daniel Phillips)
Cleaning up the Linux Desktop Audio Mess (Lennart Poettering)
Linux-VServer (H. Pötzl)
Internals of the RT Patch (Steven Rostedt)
lguest: Implementing the little Linux hypervisor (Rusty Russell)
ext4 online defragmentation (Takashi Sato)
The Hiker Project: An Application Framework for Mobile Linux Devices (David Schlesinger)
Getting maximum mileage out of tickless (Suresh Siddha)
Containers: Challenges with the memory resource controller and its performance (Balbir Singh and Vaidyanathan Srinivasan)
Kernel Support for Stackable File Systems (Sipek, Pericleous & Zadok)
Linux Rollout at Nortel (Ernest Szeideman)
Request-based Device-mapper multipath and Dynamic load balancing (Kiyoshi Ueda)
Short-term solution for 3G networks in Linux: umtsmon (Klaas van Gend)
The GFS2 Filesystem (Steven Whitehouse)
Driver Tracing Interface (David J. Wilder, Michael Holzheu, & Thomas R. Zanussi)
Linux readahead: less tricks for more (Fengguang Wu, Hongsheng Xi, Jun Li, & Nanhai Zou)
Regression Test Framework and Kernel Execution Coverage (Hiro Yoshioka)
Enable PCI Express Advanced Error Reporting in the Kernel (Y.M. Zhang and T. Long Nguyen)
Enabling Linux Network Support of Hardware Multiqueue Devices (Zhu Yi and PJ. Waskiewicz)
Concurrent Pagecache (Peter Zijlstra)

Conference Organizers

Andrew J. Hutton, Steamballoon, Inc., Linux Symposium, Thin Lines Mountaineering
C. Craig Ross, Linux Symposium

Review Committee

Andrew J. Hutton, Steamballoon, Inc., Linux Symposium, Thin Lines Mountaineering
Dirk Hohndel, Intel
Martin Bligh, Google
Gerrit Huizenga, IBM
Dave Jones, Red Hat, Inc.
C. Craig Ross, Linux Symposium

Proceedings Formatting Team

John W. Lockhart, Red Hat, Inc.
Gurhan Ozen, Red Hat, Inc.
John Feeney, Red Hat, Inc.
Len DiMaggio, Red Hat, Inc.
John Poelstra, Red Hat, Inc.

Authors retain copyright to all submitted papers, but have granted unlimited redistribution rights to all as a condition of submission.
Unifying Virtual Drivers

Jon Mason, 3Leaf Systems
Dwayne Shows, 3Leaf Systems
Dave Olien, 3Leaf Systems

Abstract

Para-virtualization presents a wide variety of issues to Operating Systems. One of these is presenting virtual devices to the para-virtualized operating system, as well as the device drivers which handle these devices. With the increase of virtualization in the Linux kernel, there has been an influx of unique drivers to handle all of these new virtual devices, and there are more devices on the way. The current state of Linux has four in-tree versions of a virtual network device (IBM pSeries Virtual Ethernet, IBM iSeries Virtual Ethernet, UML Virtual Network Device, and TUN/TAP) and numerous out-of-tree versions (one each for Xen, VMware, 3Leaf, and many others). There is a similar number of block device drivers.

This paper will go into why there are so many, and the differences and commonalities between them. It will go into the benefits and drawbacks of combining them, their requirements, and any design issues. It will discuss the changes to the Linux kernel to combine the virtual network and virtual block devices into two common devices. We will discuss how to adapt the existing virtual devices and write drivers to take advantage of this new interface.

1 Introduction to I/O Virtualization

Virtualization is the ability of a system, through hardware and/or software, to run multiple instances of Operating Systems simultaneously. This is done by abstracting the physical hardware layer through a software layer, known as a hypervisor. Virtualization is primarily implemented in two distinct ways: Full Virtualization and Para-virtualization.

Full Virtualization is a method that fully simulates hardware devices by emulating physical hardware components in software. This simulation of hardware allows the OS and other components to run their software unmodified. In this technique, all instructions are translated through the hypervisor into hardware system calls. The hypervisor controls the devices and I/O, and simulated hardware devices are exported to the virtual machine. This hardware simulation does have a significant performance penalty when compared to running the same software on native hardware, but allows the user to run multiple instances of a VM (virtual machine) simultaneously (thus allowing a higher utilization of the hardware and I/O). Examples of this type of virtualization are QEMU and VMware.

Para-virtualization is a method that modifies the OS running inside the VM to run under the hypervisor. It is modified to support the hypervisor and avoid unnecessary use of privileged instructions. These modifications allow the performance of the system to be near native. However, this type of virtualization requires that virtual devices be exposed for access to the underlying I/O hardware. UML and Xen are examples of such virtualization implementations. This paper considers only para-virtualized virtual machine abstractions; this is where virtual device implementations have proliferated.

To have better I/O performance in a virtual machine, the VMs have virtual devices exported to them and have the native device located in another VM or in the hypervisor. These virtual devices have virtual device drivers for them to appear in the VM as if they were native, physical devices. These virtual device drivers allow the data to be passed from the VM to the physical hardware. This hardware then does the required work, and may or may not return information to the virtual device driver. This level of abstraction allows for higher performance I/O, but the underlying procedure to perform the requested operation differs on each para-virtualization implementation.

2 Linux Virtualization support

There are many Linux-centered virtualization implementations, both in the kernel and outside the kernel. The implementations that are currently in the Linux kernel have been there for some time and have matured over that time to be robust. Those implementations are UML, IBM's POWER virtualization, and the tun/tap device driver. Those implementations that are currently living outside the kernel tree are gaining popularity, and may one day be included in the mainline kernel. The implementations of note are Xen and 3Leaf Systems virtualization (though many others exist).

2.1 In-tree Linux Virtualization support

2.1.1 User Mode Linux

User Mode Linux (UML) provides virtual block and network interfaces with a couple of predominant drivers.

Briefly, the network drivers recommended for UML are TUN/TAP. These are described in a later section. The virtual devices created can be used with other virtual devices on other VMs or with systems on the external network. To configure for forwarding to other VMs, UML comes with a switch daemon that allows user-level forwarding of packets. UML supports hot plug devices: new virtual devices can be configured dynamically while the system is up.

From the kernel address space the driver can use for user processes to mmap to their own address spaces.

All the guest reads and writes go through an API to the host and are translated to host reads and writes. As such, these operations use the buffer cache on the host. This causes a disproportionate amount of memory to be consumed by the UMLs, with no limit on how much they can utilize. As is characteristic of Linux, flushing modified buffers back to disk will be done when appropriate, based upon memory pressure in the host.

Another feature of the UML virtual block device is support for "copy on write" (COW) partitions. COW partitions allow a common image of a root device to be shared among many instances of UML. Writes to blocks on the COW partition are kept in a unique device for that UML instance. This is more convenient than the Xen recommendation of using the LVM to provide COW functionality.

UML provides a number of APIs for file I/O. These APIs hook into native versions of largely unmodified Linux code. These APIs provide the virtual interface to the physical device on the host. Examples of these interfaces are os_open_file, os_read_file, and os_write_file.

2.1.2 IBM POWER virtualization

IBM's PowerPC-based iSeries and pSeries hardware provides a native hypervisor in system firmware, allowing the system to be easily virtualized. The user-level tools allow the underlying hardware to be assigned to specific virtual machines, and in certain cases a virtual machine is assigned no I/O hardware at all. In the latter case, one virtual machine is assigned the physical adapters and manages all I/O to and from those adapters. The hypervisor then exposes virtual devices to the other virtual machines. The virtual devices are presented to the OS via the system's Open