Evaluation of Performance and Space Utilisation When Using Snapshots in the ZFS and Hammer File Systems
UNIVERSITY OF OSLO
Department of Informatics

Evaluation of Performance and Space Utilisation When Using Snapshots in the ZFS and Hammer File Systems

Master thesis
Martin Christoffer Aasen Oppegaard
Network and System Administration
Oslo University College
Spring 2009

Abstract

Modern file systems implement snapshots: read-only, point-in-time representations of the file system. Snapshots can be used to keep a record of the changes made to the data, and to improve backups. Previous work has shown that snapshots decrease read and write performance, but it remained an open question how the number of snapshots affects the file system. This thesis studies that question on the ZFS and Hammer file systems. The study is done by running a series of benchmarks while creating snapshots of each file system. The results show that performance decreases significantly on both ZFS and Hammer, and that ZFS becomes unstable beyond a certain point: there is a steep decrease in performance, and an increase in latency and in the variance of the measurements. The performance of ZFS is significantly lower than that of Hammer, and its performance decrease is larger. For space utilisation, the results are linear for ZFS up to the point where the system turns unstable. The results for Hammer are not linear, but more work is needed to determine which function they follow.
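As a rough illustration of the two snapshot mechanisms the thesis compares, the sketch below shows the kind of command loop involved. The dataset and path names (tank/test, /hammer/test) are hypothetical examples, not the thesis's actual configuration; the real procedure is described in Chapter 5 (Snapshot Creation) and Appendix C.

    #!/bin/sh
    # Minimal sketch of a snapshot-creation loop, assuming a ZFS dataset
    # named tank/test. All names here are hypothetical.

    N=100                       # number of snapshots to create
    i=1
    while [ "$i" -le "$N" ]; do
        # ZFS (Solaris): each snapshot is a named, read-only view of the
        # dataset at the moment the command runs.
        zfs snapshot "tank/test@snap-$i"

        # Hammer (DragonFly BSD) retains fine-grained history by design;
        # its snapshot command instead creates a softlink referencing a
        # transaction id, e.g. (hypothetical path):
        #   hammer snapshot /hammer/test/snap-$i

        i=$((i + 1))
    done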
Acknowledgements

I wish to thank the following individuals, who have made possible this final work of a two-year master's programme in network and system administration. First of all, my project adviser, Simen Hagen, for supporting me through this project and giving me advice on everything from typography to file system aging; my colleague, Theodoros Nikolakopoulos, for restarting my computers and for fruitful discussions; my cat, Bruno, for being a cat; and finally, George Stobbart, Gabriel Knight and Garrett Quest.

Martin Christoffer Aasen Oppegaard

Contents

1 Introduction
  1.1 Motivation and Research Questions
  1.2 Related Work
  1.3 Hypotheses
  1.4 Type Conventions
  1.5 Thesis Outline
2 Background
  2.1 File Systems
    2.1.1 What is a File System
    2.1.2 Files
    2.1.3 File System Internals
    2.1.4 Volume Management
    2.1.5 RAID
    2.1.6 Different Types of File Systems
  2.2 Backup
  2.3 Snapshots
  2.4 ZFS
    2.4.1 Storage Pool Model
    2.4.2 Dynamic Block Sizes
    2.4.3 Strong Data Integrity
    2.4.4 Integrated Software RAID
    2.4.5 Snapshots
    2.4.6 Mirroring
    2.4.7 Command History
    2.4.8 Maximum Storage
  2.5 Hammer
    2.5.1 Crash Recovery and History Retention
    2.5.2 Snapshots
    2.5.3 Dynamic Block Sizes
    2.5.4 Data Integrity
    2.5.5 Decoupled Front-end and Back-end
    2.5.6 Mirroring
    2.5.7 Maximum Storage
3 Methodology
  3.1 Describing the Environment
    3.1.1 Running Services
    3.1.2 Warm or Cold Cache
    3.1.3 Aging the File System
    3.1.4 Location of Test Partition
  3.2 Running the Experiments
4 File System Benchmarking
  4.1 Overview of Benchmarking
  4.2 Types of Benchmarks
    4.2.1 Macrobenchmarks
    4.2.2 Microbenchmarks
    4.2.3 Trace Replays
  4.3 Reviews of Benchmarks
    4.3.1 Postmark
    4.3.2 Bonnie and Bonnie++
    4.3.3 SPECsfs
    4.3.4 IOzone
    4.3.5 Filebench
  4.4 Selected Benchmarks
5 Experiment
  5.1 Hardware Specifications
  5.2 Software Specifications
  5.3 Hard Disk Drive
    5.3.1 Partitions
    5.3.2 ZFS Properties
    5.3.3 Hammer Mount Options
    5.3.4 Aging the File Systems
  5.4 System Environment
    5.4.1 Secure Shell and Logged in Users
    5.4.2 Solaris
    5.4.3 DragonFly
  5.5 Experiments
    5.5.1 Space Utilisation
    5.5.2 Read and Write Performance
  5.6 Snapshot Creation
    5.6.1 The Creation Process
6 Results
  6.1 Sample Size
    6.1.1 Filebench
    6.1.2 IOzone
  6.2 ZFS
    6.2.1 Read and Write Performance
    6.2.2 Space Utilisation
  6.3 Hammer
    6.3.1 Read and Write Performance
    6.3.2 Space Utilisation
  6.4 Comparison of ZFS and Hammer
    6.4.1 Calculations
7 Discussion
  7.1 Hypotheses
    7.1.1 Read and Write Performance
    7.1.2 Space Utilisation
  7.2 ZFS
    7.2.1 Read and Write Performance
    7.2.2 Space Utilisation
  7.3 Hammer
    7.3.1 Filebench
    7.3.2 Space Utilisation
  7.4 How Many Snapshots?
  7.5 Comparison of ZFS and Hammer
    7.5.1 Calculations
8 Conclusion

Appendices
A Hardware Specifications
  A.1 Solaris
  A.2 DragonFly
B Configuration
  B.1 Auto-pilot
    B.1.1 DragonFly
    B.1.2 Solaris
  B.2 Filebench
C Scripts
  C.1 Automate Auto-pilot
  C.2 Auto-pilot
    C.2.1 Internal
    C.2.2 External
  C.3 File System Ager
  C.4 R
D Benchmark Ports
  D.1 DragonFly
    D.1.1 Filebench
    D.1.2 IOzone
    D.1.3 Auto-pilot
  D.2 Solaris
    D.2.1 IOzone
    D.2.2 Auto-pilot
E Plots

Acronyms
Bibliography

List of Figures

2.1 Hard Disk Drive layout
2.2 Network File System
2.3 Clustered File System
2.4 Copy-on-write
2.5 Redirect-on-write
3.1 Experiment Automation
5.1 Experiment Setup
6.1 Plot of ZFS: Filebench ‘Operations’
6.2 Plot of ZFS: ECDF and QQ Plots of Filebench ‘Operations’ on 0 snapshots
6.3 Plot of ZFS: ECDF and QQ Plots of IOzone ‘Write’ on 846 snapshots
6.4 Plot of ZFS: ECDF and QQ Plots of Filebench ‘Operations’ on 1811 snapshots
6.5 Plot of ZFS: Filebench ‘Efficiency’
6.6 Plot of ZFS: Filebench ‘Throughput’
6.7 Plot of ZFS: Filebench ‘Read and Write’
6.8 Plot of ZFS: IOzone ‘Write and Re-write’
6.9 Plot of ZFS: ECDF and QQ Plots of IOzone ‘Write’ on 0 snapshots
6.10 Plot of ZFS: ECDF and QQ Plots of IOzone ‘Write’ on 1811 snapshots
6.11 Plot of ZFS: Space Utilisation
6.12 Plot of ZFS: Snapshot Size
6.13 Plot of Hammer: Filebench ‘Operations’
6.14 Plot of Hammer: ECDF and QQ Plots of Filebench ‘Operations’ on 0 snapshots
6.15 Plot of Hammer: ECDF and QQ Plots of Filebench ‘Operations’ on 840 snapshots
6.16 Plot of Hammer: Filebench ‘Read and Write’
6.17 Plot of Hammer: IOzone ‘Write and Re-write’
6.18 Plot of Hammer: ECDF and QQ Plots of IOzone ‘Write’ on 0 snapshots
6.19 Plot of Hammer: ECDF and QQ