Reasoning About POSIX File Systems
Total Page:16
File Type:pdf, Size:1020Kb
Imperial College London Department of Computing Reasoning About POSIX File Systems Gian Ntzik September 2016 Supervised by Philippa Gardner Submitted in part fulfilment of the requirements for the degree of Doctor of Philosophy in Computing of Imperial College London and the Diploma of Imperial College London 1 2 Declaration I herewith certify that all material in this thesis which is not my own work has been properly acknowl- edged. Gian Ntzik 3 4 Copyright The copyright of this thesis rests with the author and is made available under a Creative Commons Attribution Non-Commercial No Derivatives licence. Researchers are free to copy, distribute or trans- mit the thesis on the condition that they attribute it, that they do not use it for commercial purposes and that they do not alter, transform or build upon it. For any reuse or redistribution, researchers must make clear to others the licence terms of this work. 5 6 Abstract POSIX is a standard for operating systems, with a substantial part devoted to specifying file-system operations. File-system operations exhibit complex concurrent behaviour, comprising multiple actions affecting different parts of the state: typically, multiple atomic reads followed by an atomic update. However, the standard's description of concurrent behaviour is unsatisfactory: it is fragmented; con- tains ambiguities; and is generally under-specified. We provide a formal concurrent specification of POSIX file systems and demonstrate scalable reasoning for clients. Our specification is based on a concurrent specification language, which uses a modern concurrent separation logic for reasoning about abstract atomic operations, and an associated refinement calculus. Our reasoning about clients highlights an important difference between reasoning about modules built over a heap, where the interference on the shared state is restricted to the operations of the module, and modules built over a file system, where the interference cannot be restricted as the file system is a public namespace. We introduce specifications conditional on context invariants used to restrict the interference, and apply our reasoning to lock files and named pipes. Program reasoning based on separation logic has been successful at verifying that programs do not crash due to illegal use of resources, such invalid memory accesses. The underlying assumption of separation logics, however, is that machines do not fail. In practice, machines can fail unpredictably for various reasons, such as power loss, corrupting resources or resulting in permanent data loss. Critical software, such as file systems and databases, employ recovery methods to mitigate these effects. We introduce an extension of the Views framework to reason about programs in the presence of such events and their associated recovery methods. We use concurrent separation logic as an instance of the framework to illustrate our reasoning, and explore programs using write-ahead logging, such a stylised ARIES recovery algorithm. 7 8 To my loved ones forever lost, but in my heart forgotten not. 9 10 Contents 1. Introduction 19 1.1. Contributions and Dissertation Outline.......................... 21 2. File Systems and POSIX 23 2.1. File Systems.......................................... 23 2.1.1. Structure and Implementation........................... 24 2.2. The POSIX Standard.................................... 27 2.2.1. POSIX File-System Structure and Interface Overview.............. 28 2.3. Concurrency in POSIX File Systems............................ 32 2.3.1. Thread Safety and Atomicity............................ 32 2.3.2. Path Resolution and Client Applications...................... 36 2.3.3. Atomicity of directory operations.......................... 38 2.3.4. Non-determinism................................... 39 2.3.5. File Systems are a Public Namespace....................... 39 2.4. Conclusions.......................................... 40 3. Reasoning about Concurrent Modules 42 3.1. Separation Logic....................................... 42 3.2. Concurrent Separation Logic................................ 44 3.3. Abstract Separation Logic and Views............................ 45 3.4. Fine-grained Concurrency.................................. 46 3.5. Atomicity........................................... 47 3.5.1. Contextual Refinement and Separation Logic................... 48 3.5.2. Atomic Hoare Triples................................ 49 3.6. Conclusions.......................................... 51 4. Formal Methods for File Systems 52 4.1. Model Checking........................................ 52 4.2. Testing............................................ 52 4.3. Specification and Refinement to Implementation..................... 53 4.4. Program Reasoning...................................... 53 5. Modelling the File System 55 6. Concurrent Specifications and Client Reasoning 59 6.1. Specifications......................................... 59 6.1.1. Operations on Links................................. 59 11 6.1.2. Operations on Directories.............................. 68 6.1.3. I/O Operations on Regular Files.......................... 70 6.2. Client Reasoning....................................... 75 6.3. Extending Specifications................................... 84 6.3.1. Symbolic Links and Relative Paths......................... 84 6.3.2. File-Access Permissions............................... 87 6.3.3. Threads and Processes................................ 90 6.4. Conclusions.......................................... 91 7. Atomicity and Refinement 93 7.1. Specification Language.................................... 93 7.2. Model............................................. 97 7.3. Operational Semantics.................................... 105 7.4. Refinement.......................................... 107 7.4.1. Contextual Refinement................................ 107 7.4.2. Denotational Refinement.............................. 108 7.4.3. Adequacy....................................... 111 7.5. Refinement Laws....................................... 112 7.6. Abstract Atomicity...................................... 117 7.7. Syntax Encodings....................................... 129 7.8. Conclusions.......................................... 130 8. Client Examples 132 8.1. CAP Style Locks....................................... 132 8.2. Coarse-grained vs Fine-grained Specifications for POSIX................. 134 8.2.1. Behaviour with Coarse-grained Specifications................... 135 8.2.2. Behaviour with POSIX Specifications....................... 140 8.2.3. Correcting the Email Client-Server Interaction.................. 149 8.3. Case Study: Named Pipes.................................. 156 8.3.1. Specification..................................... 156 8.3.2. Implementation.................................... 159 8.3.3. Verification...................................... 166 8.4. Conclusions.......................................... 187 9. Fault Tolerance 188 9.1. Faults and Resource Reasoning............................... 188 9.2. Motivating Examples..................................... 190 9.2.1. Naive Bank Transfer................................. 190 9.2.2. Fault-tolerant Bank Transfer: Implementation.................. 190 9.2.3. Fault-tolerant Bank Transfer: Verification..................... 191 9.3. Journaling File Systems................................... 196 9.4. FTCSL Program Logic.................................... 196 9.4.1. Unsound Rules.................................... 199 12 9.5. Example: Concurrent Bank Transfer............................ 199 9.6. Semantics and Soundness.................................. 202 9.6.1. Fault-tolerant Views................................. 202 9.6.2. Fault-tolerant Concurrent Separation Logic.................... 202 9.7. Case Study: ARIES..................................... 203 9.8. Related Work......................................... 215 9.9. Conclusions.......................................... 216 10.Conclusions 217 10.1. Future Work......................................... 217 10.1.1. Mechanisation and Automation........................... 218 10.1.2. Total Correctness................................... 218 10.1.3. Helping and Speculation............................... 218 10.1.4. File-System Implementation and Testing...................... 219 10.1.5. Fault-tolerance of File Systems........................... 219 10.1.6. Atomicity? Which Atomicity?........................... 219 A. POSIX Fragment Formalisation 228 A.1. Path Resolution........................................ 230 A.2. Operations on Links..................................... 230 A.3. Operations on Directories.................................. 234 A.4. I/O Operations on Regular Files.............................. 235 A.5. I/O Operations on Directories................................ 236 B. Atomicity and Refinement Technical Appendix 238 B.1. Adequacy Addendum.................................... 238 B.1.1. Operational traces are denotational traces..................... 246 B.1.2. Denotational traces are operational traces..................... 249 B.2. Proofs of General Refinement Laws............................. 256 B.3. Primitive Atomic Refinement Laws Proofs......................... 268 B.4. Proofs of Abstract Atomic Refinement Laws........................ 274 B.5. Proofs of Hoare-Statement Refinement Laws........................ 288 C. Fault-Tolerant Views Framework 291 C.1. General Framework and Soundness............................. 291 C.2. FTCSL...........................................