A File System for Write-Once Media Simson L. Garfinkel and J. Spencer
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
A Large-Scale Study of File-System Contents
A Large-Scale Study of File-System Contents John R. Douceur and William J. Bolosky Microsoft Research Redmond, WA 98052 {johndo, bolosky}@microsoft.com ABSTRACT sizes are fairly consistent across file systems, but file lifetimes and file-name extensions vary with the job function of the user. We We collect and analyze a snapshot of data from 10,568 file also found that file-name extension is a good predictor of file size systems of 4801 Windows personal computers in a commercial but a poor predictor of file age or lifetime, that most large files are environment. The file systems contain 140 million files totaling composed of records sized in powers of two, and that file systems 10.5 TB of data. We develop analytical approximations for are only half full on average. distributions of file size, file age, file functional lifetime, directory File-system designers require usage data to test hypotheses [8, size, and directory depth, and we compare them to previously 10], to drive simulations [6, 15, 17, 29], to validate benchmarks derived distributions. We find that file and directory sizes are [33], and to stimulate insights that inspire new features [22]. File- fairly consistent across file systems, but file lifetimes vary widely system access requirements have been quantified by a number of and are significantly affected by the job function of the user. empirical studies of dynamic trace data [e.g. 1, 3, 7, 8, 10, 14, 23, Larger files tend to be composed of blocks sized in powers of two, 24, 26]. However, the details of applications’ and users’ storage which noticeably affects their size distribution. -
The New Standard for Data Archiving
The new standard for data archiving 1 Explosively increasing 2 Why data management digital data is important Ever-increasing volumes of digital data are mounting up every day due to rapidly growing Today, files can be lost from computers in any number of ways—you might accidentally internet technology, widespread use of SNS, data transmission between network- connected delete a file, a virus might wipe one out, or there could be a complete hard drive failure. devices, and other trends. Within the video production industry, data-heavy video content When a hard drive dies an untimely death, it can feel like a house has burnt down. (for example, 4K, 8K, and 4K/8K high-frame-rate video) is becoming a major source of video Important personal items are usually gone forever—photos, significant documents, broadcasting. Many companies and research institutes are creating high volumes of data downloaded music, and more. (big data) for use in AI systems. Somehow, these newly created assets need to be managed effectively, stored safely, and utilized along with the old assets. There are many options for backing up content without any sophisticated equipment—you can 163,000 EB use DVDs, external hard drives, optical discs, or even online storage. It’s a good idea to back up data to multiple places. Computer Natural Viruses Disasters 4% Other 2% 40,026 EB 3% 2,837 EB 8,591 EB 1,227 EB Software Corruption 9% 2010 2012 2015 2020 2025 System/ Hardware 1 EB = 1,000,000 TB Malfunction 56% User Error Performance 26% Source: Ontrack Data Recovery High Performance www.ontrack.co.uk/understandingdataloss E-commerce, Financial HOT SSD RAM Capacity Optimized Back Oce, General Service WARM HDD Sony's Optical Disc Archive storage system offers the solution, Long-Term Archive COLD with a low total cost of ownership through the use of long-life Auto Loader, O-line Tape Optical Disc Cost media, and it includes inter-generational compatibility based on the same optical disc technology used in DVDs and Blu-ray discs. -
Restoring in All Situations
Restoring in all situations Desktop and laptop protection 1 Adapt data restore methods to different situations Executive Summary Whether caused by human error, a cyber attack, or a physical disaster, data loss costs organizations millions of euros every year (3.5 million euros on average according to the Ponemon Institute). Backup Telecommuting Office remains the method of choice to protect access to company data and ensure their availability. But a good backup strategy must necessarily be accompanied by a good disaster recovery strategy. Nomadization The purpose of this white paper is to present the best restore practices depending on the situation you are in, in order to save valuable time! VMs / Servers Apps & DBs NAS Laptops Replication 2 Table of Contents EXECUTIVE SUMMARY .......................................................................................................................... 2 CONTEXT ................................................................................................................................................ 4 WHAT ARE THE STAKES BEHIND THE RESTORING ENDPOINT USER DATA? ................................. 5 SOLUTION: RESTORING ENDPOINT DATA USING LINA ................................................................... 6-7 1. RESTORE FOR THE MOST NOVICE USERS ................................................................................ 8-10 2. RESTORE FOR OFF SITE USERS ................................................................................................11-13 3. RESTORE FOLLOWING LOSS, -
IBM Connect:Direct for Microsoft Windows: Documentation Fixpack 1 (V6.1.0.1)
IBM Connect:Direct for Microsoft Windows 6.1 Documentation IBM This edition applies to Version 5 Release 3 of IBM® Connect:Direct and to all subsequent releases and modifications until otherwise indicated in new editions. © Copyright International Business Machines Corporation 1993, 2018. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents Chapter 1. Release Notes.......................................................................................1 Requirements...............................................................................................................................................1 Features and Enhancements....................................................................................................................... 2 Special Considerations................................................................................................................................ 3 Known Restrictions...................................................................................................................................... 4 Restrictions for Connect:Direct for Microsoft Windows........................................................................ 4 Restrictions for Related Software.......................................................................................................... 6 Installation Notes.........................................................................................................................................6 -
Latency-Aware, Inline Data Deduplication for Primary Storage
iDedup: Latency-aware, inline data deduplication for primary storage Kiran Srinivasan, Tim Bisson, Garth Goodson, Kaladhar Voruganti NetApp, Inc. fskiran, tbisson, goodson, [email protected] Abstract systems exist that deduplicate inline with client requests for latency sensitive primary workloads. All prior dedu- Deduplication technologies are increasingly being de- plication work focuses on either: i) throughput sensitive ployed to reduce cost and increase space-efficiency in archival and backup systems [8, 9, 15, 21, 26, 39, 41]; corporate data centers. However, prior research has not or ii) latency sensitive primary systems that deduplicate applied deduplication techniques inline to the request data offline during idle time [1, 11, 16]; or iii) file sys- path for latency sensitive, primary workloads. This is tems with inline deduplication, but agnostic to perfor- primarily due to the extra latency these techniques intro- mance [3, 36]. This paper introduces two novel insights duce. Inherently, deduplicating data on disk causes frag- that enable latency-aware, inline, primary deduplication. mentation that increases seeks for subsequent sequential Many primary storage workloads (e.g., email, user di- reads of the same data, thus, increasing latency. In addi- rectories, databases) are currently unable to leverage the tion, deduplicating data requires extra disk IOs to access benefits of deduplication, due to the associated latency on-disk deduplication metadata. In this paper, we pro- costs. Since offline deduplication systems impact la- -
A Survey of Distributed File Systems
A Survey of Distributed File Systems M. Satyanarayanan Department of Computer Science Carnegie Mellon University February 1989 Abstract Abstract This paper is a survey of the current state of the art in the design and implementation of distributed file systems. It consists of four major parts: an overview of background material, case studies of a number of contemporary file systems, identification of key design techniques, and an examination of current research issues. The systems surveyed are Sun NFS, Apollo Domain, Andrew, IBM AIX DS, AT&T RFS, and Sprite. The coverage of background material includes a taxonomy of file system issues, a brief history of distributed file systems, and a summary of empirical research on file properties. A comprehensive bibliography forms an important of the paper. Copyright (C) 1988,1989 M. Satyanarayanan The author was supported in the writing of this paper by the National Science Foundation (Contract No. CCR-8657907), Defense Advanced Research Projects Agency (Order No. 4976, Contract F33615-84-K-1520) and the IBM Corporation (Faculty Development Award). The views and conclusions in this document are those of the author and do not represent the official policies of the funding agencies or Carnegie Mellon University. 1 1. Introduction The sharing of data in distributed systems is already common and will become pervasive as these systems grow in scale and importance. Each user in a distributed system is potentially a creator as well as a consumer of data. A user may wish to make his actions contingent upon information from a remote site, or may wish to update remote information. -
File Server Scaling with Network-Attached Secure Disks
Appears in Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (Sigmetrics ‘97), Seattle, Washington, June 15-18, 1997. File Server Scaling with Network-Attached Secure Disks Garth A. Gibson†, David F. Nagle*, Khalil Amiri*, Fay W. Chang†, Eugene M. Feinberg*, Howard Gobioff†, Chen Lee†, Berend Ozceri*, Erik Riedel*, David Rochberg†, Jim Zelenka† *Department of Electrical and Computer Engineering †School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213-3890 [email protected] http://www.cs.cmu.edu/Web/Groups/NASD/ Abstract clients with efficient, scalable, high-bandwidth access to stored data. This paper discusses a powerful approach to fulfilling this By providing direct data transfer between storage and client, net- need. Network-attached storage provides high bandwidth by work-attached storage devices have the potential to improve scal- directly attaching storage to the network, avoiding file server ability for existing distributed file systems (by removing the server store-and-forward operations and allowing data transfers to be as a bottleneck) and bandwidth for new parallel and distributed file striped over storage and switched-network links. systems (through network striping and more efficient data paths). The principal contribution of this paper is to demonstrate the Together, these advantages influence a large enough fraction of the potential of network-attached storage devices for penetrating the storage market to make commodity network-attached storage fea- markets defined by existing distributed file system clients, specifi- sible. Realizing the technology’s full potential requires careful cally the Network File System (NFS) and Andrew File System consideration across a wide range of file system, networking and (AFS) distributed file system protocols. -
On the Performance Variation in Modern Storage Stacks
On the Performance Variation in Modern Storage Stacks Zhen Cao1, Vasily Tarasov2, Hari Prasath Raman1, Dean Hildebrand2, and Erez Zadok1 1Stony Brook University and 2IBM Research—Almaden Appears in the proceedings of the 15th USENIX Conference on File and Storage Technologies (FAST’17) Abstract tions on different machines have to compete for heavily shared resources, such as network switches [9]. Ensuring stable performance for storage stacks is im- In this paper we focus on characterizing and analyz- portant, especially with the growth in popularity of ing performance variations arising from benchmarking hosted services where customers expect QoS guaran- a typical modern storage stack that consists of a file tees. The same requirement arises from benchmarking system, a block layer, and storage hardware. Storage settings as well. One would expect that repeated, care- stacks have been proven to be a critical contributor to fully controlled experiments might yield nearly identi- performance variation [18, 33, 40]. Furthermore, among cal performance results—but we found otherwise. We all system components, the storage stack is the corner- therefore undertook a study to characterize the amount stone of data-intensive applications, which become in- of variability in benchmarking modern storage stacks. In creasingly more important in the big data era [8, 21]. this paper we report on the techniques used and the re- Although our main focus here is reporting and analyz- sults of this study. We conducted many experiments us- ing the variations in benchmarking processes, we believe ing several popular workloads, file systems, and storage that our observations pave the way for understanding sta- devices—and varied many parameters across the entire bility issues in production systems. -
Filesystems HOWTO Filesystems HOWTO Table of Contents Filesystems HOWTO
Filesystems HOWTO Filesystems HOWTO Table of Contents Filesystems HOWTO..........................................................................................................................................1 Martin Hinner < [email protected]>, http://martin.hinner.info............................................................1 1. Introduction..........................................................................................................................................1 2. Volumes...............................................................................................................................................1 3. DOS FAT 12/16/32, VFAT.................................................................................................................2 4. High Performance FileSystem (HPFS)................................................................................................2 5. New Technology FileSystem (NTFS).................................................................................................2 6. Extended filesystems (Ext, Ext2, Ext3)...............................................................................................2 7. Macintosh Hierarchical Filesystem − HFS..........................................................................................3 8. ISO 9660 − CD−ROM filesystem.......................................................................................................3 9. Other filesystems.................................................................................................................................3 -
An Advanced Solution for Long-Term Retention of Enterprise Digital
Data Management, Simplified. Enterprise Edition An Advanced Solution for Long-term Retention of Enterprise Digital Assets A centrally managed, secure archive is a key requirement for today’s enterprises, which must contend with exponential growth in data volumes, long-term retention requirements, and e-discovery mandates. Atempo Digital Archive, Enterprise Edition, is a comprehensive solution that addresses each of these data archiving challenges. Providing capabilities for both automatically and manually archiving data in long-term storage platforms, Atempo Digital Archive ensures that all corporate data is easy to find and access for as long as it needs to be retained. An Answer to Each Archiving Need Addressing Today’s Storage Management Challenges Enabling the Long-term Retention of Digital Assets Today, organizations must manage more, and more Digital information represents both the lifeblood of a critical, data than ever before—and data volumes continue business today, and the basis of records that may need to grow at an explosive pace. Traditional backup and to be accessed for generations. With its integrated recovery solutions have proven ill-equipped to keep up capabilities for full content indexing, search, and metadata with these demands. To ensure these increasing volumes indexing, Atempo Digital Archive allows users to retrieve of critical information are always available when needed, information quickly and easily. In addition, it offers organizations require advanced new archiving capabilities. refreshing mechanisms that verify whether storage Atempo Digital Archive is the one solution that enables media is still readable, and that ensure information IT organizations to address their storage management can continue to be accessed on an ongoing basis. -
Overview Addressing Today's Storage Management Challenges Enabling
DATA PROTECTION ATEMPO-DIGITAL ARCHIVE High performance file archiving software for large data volumes OVERVIEW A centrally managed, secure archive is a key requirement for today’s enterprises, which must contend with exponential growth in data volumes and long-term retention requirements. Atempo-Digital Archive is a comprehensive solution that addresses each of these data archiving challenges. Providing capabilities for both automatically and manually archiving large data volumes in long-term storage platforms, Atempo-Digital Archive ensures that all corporate data is easy to find and access for as long as it needs to be retained. ADDRESSING TODAY’S STORAGE MANAGEMENT CHALLENGES Today, organizations must manage more critical data than ever before - and data volumes continue to grow at an explosive pace. Traditional backup and recovery solutions have proven ill-equipped to keep up with these demands. To ensure these increasing volumes of critical information are always available when needed, organizations require advanced new archiving capabilities. Atempo-Digital Archive (ADA) is the one solution that enables IT organizations to address their storage management challenges today and in the long term. By managing data in a secure, centrally-managed archive, Atempo-Digital Archive represents the most efficient, reliable, and cost-effective way to protect critical business information. This solution ensures that an organization’s most valuable data is organized and easily accessible - while at the same time freeing up expensive disk space, lowering storage costs, and reducing the backup window for data that is infrequently accessed. ENABLING THE LONG-TERM RETENTION OF DIGITAL ASSETS Digital information represents both the lifeblood of a business today, and the basis of records that may need to be accessed for generations. -
CD ROM Drives and Their Handling Under RISC OS Explained
28th April 1995 Support Group Application Note Number: 273 Issue: 0.4 **DRAFT** Author: DW CD ROM Drives and their Handling under RISC OS Explained This Application Note details the interfacing of, and protocols underlying, CD ROM drives and their handling under RISC OS. Applicable Related Hardware : Application All Acorn computers running Notes: 266: Developing CD ROM RISC OS 3.1 or greater products for Acorn machines 269: File transfer between Acorn and foreign platforms Every effort has been made to ensure that the information in this leaflet is true and correct at the time of printing. However, the products described in this leaflet are subject to continuous development and Support Group improvements and Acorn Computers Limited reserves the right to change its specifications at any time. Acorn Computers Limited Acorn Computers Limited cannot accept liability for any loss or damage arising from the use of any information or particulars in this leaflet. Acorn, the Acorn Logo, Acorn Risc PC, ECONET, AUN, Pocket Acorn House Book and ARCHIMEDES are trademarks of Acorn Computers Limited. Vision Park ARM is a trademark of Advance RISC Machines Limited. Histon, Cambridge All other trademarks acknowledged. ©1995 Acorn Computers Limited. All rights reserved. CB4 4AE Support Group Application Note No. 273, Issue 0.4 **DRAFT** 28th April 1995 Table of Contents Introduction 3 Interfacing CD ROM Drives 3 Features of CD ROM Drives 4 CD ROM Formatting Standards ("Coloured Books") 5 CD ROM Data Storage Standards 7 CDFS and CD ROM Drivers 8 Accessing Data on CD ROMs Produced for Non-Acorn Platforms 11 Troubleshooting 12 Appendix A: Useful Contacts 13 Appendix B: PhotoCD Mastering Bureaux 14 Appendix C: References 14 Support Group Application Note No.