Practical File System Design: The Be File System, Dominic Giampaolo

Practical File System Design with the Be File System

Dominic Giampaolo
Be, Inc.

MORGAN KAUFMANN PUBLISHERS, INC.
San Francisco, California

Editor: Tim Cox
Director of Production and Manufacturing: Yonie Overton
Assistant Production Manager: Julie Pabst
Editorial Assistant: Sarah Luger
Cover Design: Ross Carron Design
Cover Image: William Thompson/Photonica
Copyeditor: Ken DellaPenta
Proofreader: Jennifer McClain
Text Design: Side by Side Studios
Illustration: Cherie Plumlee
Composition: Ed Sznyter, Babel Press
Indexer: Ty Koontz
Printer: Edwards Brothers

Designations used by companies to distinguish their products are often claimed as trademarks or registered trademarks. In all instances where Morgan Kaufmann Publishers, Inc. is aware of a claim, the product names appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and registration.

Morgan Kaufmann Publishers, Inc.
Editorial and Sales Office
340 Pine Street, Sixth Floor
San Francisco, CA 94104-3205
USA
Telephone 415/392-2665
Facsimile 415/982-2665
Email [email protected]
WWW http://www.mkp.com
Order toll free 800/745-7323

© 1999 Morgan Kaufmann Publishers, Inc.
All rights reserved
Printed in the United States of America

03 02 01 00 99    5 4 3 2 1

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic, mechanical, photocopying, recording, or otherwise—without the prior written permission of the publisher.

Library of Congress Cataloging-in-Publication Data is available for this book.
ISBN 1-55860-497-9

Contents

Preface

Chapter 1 Introduction to the BeOS and BFS
  1.1 History Leading Up to BFS
  1.2 Design Goals
  1.3 Design Constraints
  1.4 Summary

Chapter 2 What Is a File System?
  2.1 The Fundamentals
  2.2 The Terminology
  2.3 The Abstractions
  2.4 Basic File System Operations
  2.5 Extended File System Operations
  2.6 Summary

Chapter 3 Other File Systems
  3.1 BSD FFS
  3.2 Linux ext2
  3.3 Macintosh HFS
  3.4 Irix XFS
  3.5 Windows NT's NTFS
  3.6 Summary

Chapter 4 The Data Structures of BFS
  4.1 What Is a Disk?
  4.2 How to Manage Disk Blocks
  4.3 Allocation Groups
  4.4 Block Runs
  4.5 The Superblock
  4.6 The I-Node Structure
  4.7 The Core of an I-Node: The Data Stream
  4.8 Attributes
  4.9 Directories
  4.10 Indexing
  4.11 Summary

Chapter 5 Attributes, Indexing, and Queries
  5.1 Attributes
  5.2 Indexing
  5.3 Queries
  5.4 Summary

Chapter 6 Allocation Policies
  6.1 Where Do You Put Things on Disk?
  6.2 What Are Allocation Policies?
  6.3 Physical Disks
  6.4 What Can You Lay Out?
  6.5 Types of Access
  6.6 Allocation Policies in BFS
  6.7 Summary

Chapter 7 Journaling
  7.1 The Basics
  7.2 How Does Journaling Work?
  7.3 Types of Journaling
  7.4 What Is Journaled?
  7.5 Beyond Journaling
  7.6 What's the Cost?
  7.7 The BFS Journaling Implementation
  7.8 What Are Transactions?—A Deeper Look
  7.9 Summary

Chapter 8 The Disk Block Cache
  8.1 Background
  8.2 Organization of a Buffer Cache
  8.3 Cache Optimizations
  8.4 I/O and the Cache
  8.5 Summary

Chapter 9 File System Performance
  9.1 What Is Performance?
  9.2 What Are the Benchmarks?
  9.3 Performance Numbers
  9.4 Performance in BFS
  9.5 Summary

Chapter 10 The Vnode Layer
  10.1 Background
  10.2 Vnode Layer Concepts
  10.3 Vnode Layer Support Routines
  10.4 How It Really Works
  10.5 The Node Monitor
  10.6 Live Queries
  10.7 Summary

Chapter 11 User-Level API
  11.1 The POSIX API and C Extensions
  11.2 The C++ API
  11.3 Using the API
  11.4 Summary

Chapter 12 Testing
  12.1 The Supporting Cast
  12.2 Examples of Data Structure Verification
  12.3 Debugging Tools
  12.4 Data Structure Design for Debugging
  12.5 Types of Tests
  12.6 Testing Methodology
  12.7 Summary

Appendix A File System Construction Kit
  A.1 Introduction
  A.2 Overview
  A.3 The Data Structures
  A.4 The API

Bibliography

Index

Preface

Although many operating system textbooks offer high-level descriptions of file systems, few go into sufficient detail for an implementor, and none go into details about advanced topics such as journaling. I wrote this book to address that lack of information. This book covers the details of file systems, from low-level to high-level, as well as related topics such as the disk cache, the file system interface to the kernel, and the user-level APIs that use the features of the file system. Reading this book should give you a thorough understanding of how a file system works in general, how the Be File System (BFS) works in particular, and the issues involved in designing and implementing a file system.

The Be operating system (BeOS) uses BFS as its native file system. BFS is a modern 64-bit journaled file system.
BFS also supports extended file attributes (name/value pairs) and can index the extended attributes, which allows it to offer a query interface for locating files in addition to the normal name-based hierarchical interface. The attribute, indexing, and query features of BFS set it apart from other file systems and make it an interesting example to discuss.

Throughout this book there are discussions of different approaches to solving file system design problems and the benefits and drawbacks of different techniques. These discussions are all based on the problems that arose when implementing BFS. I hope that understanding the problems BFS faced and the changes it underwent will help others avoid mistakes I made, or perhaps spur them on to solve the problems in different or more innovative ways.

Now that I have discussed what this book is about, I will also mention what it is not about. Although there is considerable information about the details of BFS, this book does not contain exhaustive bit-level information about every BFS data structure. I know this will disappoint some people, but it is the difference between a reference manual and a work that is intended to educate and inform.

My only regret about this book is that I would have liked for there to be more information about other file systems and much more extensive performance analyses of a wider variety of file systems. However, just like software, a book has to ship, and it can't stay in development forever.

You do not need to be a file system engineer, a kernel architect, or have a PhD to understand this book. A basic knowledge of the C programming language is assumed but little else. Wherever possible I try to start from first principles to explain the topics involved and build on that knowledge throughout the chapters. You also do not need to be a BeOS developer or even use the BeOS to understand this book.
Although familiarity with the BeOS may help, it is not a requirement.

It is my hope that if you would like to improve your knowledge of file systems, learn about how the Be File System works, or implement a file system, you will find this book useful.

Acknowledgments

I'd like to thank everyone that lent a hand during the development of BFS and during the writing of this book. Above all, the BeOS QA team (led by Baron Arnold) is responsible for BFS being where it is today. Thanks, guys! The rest of the folks who helped me out are almost too numerous to mention: my fiancée, Maria, for helping me through many long weekends of writing; Mani Varadarajan, for taking the first crack at making BFS write data to double-indirect blocks; Cyril Meurillon, for being stoic throughout the whole project, as well as for keeping the fsil layer remarkably bug-free; Hiroshi Lockheimer, for keeping me entertained; Mike Mackovitch, for letting me run tests on SGI's machines; the whole BeOS team, for putting up with all those buggy versions of the file system before the first release; Mark Stone, for approaching me about writing this book; the people who make the cool music that gets me through the 24-, 48-, and 72-hour programming sessions; and of course Be, Inc., for taking the chance on such a risky project.