Masaryk University Faculty of Informatics

The Effects of Age on Performance

Bachelor’s Thesis

Samuel Petrovič

Brno, Spring 2017

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Samuel Petrovič

Advisor: Adam Rambousek


Acknowledgement

I would like to thank my consultant Matúš Kocka for his tremendous help and guidance during the writing of this thesis. I would also like to thank Red Hat for their collaboration and for providing the necessary testing equipment.

Abstract

The purpose of this thesis is to research the effects of age on file system performance. For testing purposes, two automated tests were implemented, which use the open source benchmarks fs-drift and FIO to simulate aging and to measure performance. The research was aimed at the effects of aging on the popular file systems XFS and ext4. Further research examined the effects of the underlying storage technology on the performance of aged file systems. The implemented file system aging test was used to expand the testing matrix of the Red Hat Kernel Performance team.

Keywords

file system, performance, aging, benchmarking, fragmentation, storage, trim, fs-drift, XFS, ext4


Contents

1 Introduction
2 Related work
 2.1 Aging file system using real-life data
 2.2 Synthetic aging simulation
3 File system and storage devices
 3.1 File systems
  3.1.1 XFS
  3.1.2 Ext4
 3.2 Storage
  3.2.1 Hard disk drive
  3.2.2 Solid state drive
4 Environment setup and benchmark tools
 4.1 Beaker
 4.2 Storage generator
 4.3 FIO
 4.4 Fs-drift
  4.4.1 Changes in original code
5 Aging the file system
 5.1 Fs-drift configuration
 5.2 File system images
 5.3 Implementation details of file system aging test
6 Performance testing of aged file system
 6.1 FIO configuration
 6.2 Implementation details of aged file system performance test
7 Results
 7.1 Testing environment
 7.2 Data processing
  7.2.1 Fragmentation
  7.2.2 Evaluation of data generated by fs-drift
 7.3 Testing on images of aged file system
 7.4 Performance of aged file system
 7.5 Differences between XFS and ext4
 7.6 Performance across different storage devices
 7.7 Testing with fstrim
8 Conclusion
Bibliography
A Reports
 A.1 Test of medium utilisation using XFS on HDD
 A.2 Test of high utilisation using XFS on HDD
 A.3 Test of high utilisation using ext4 on HDD
 A.4 Test of XFS on SSD with regular trimming
 A.5 Test of ext4 on SSD with regular trimming
 A.6 Test of XFS on SSD without regular trimming
B Examples
C Source code

List of Tables

7.1 Comparison of latencies at medium and high file system utilisation
7.2 Comparison of latencies gathered during testing with XFS and ext4 on SSD and HDD
7.3 Comparison of latencies gathered during the aging test on HDD and SSD
7.4 Differences in latencies of trimmed and untrimmed file systems (XFS)


List of Figures

4.1 Normal distribution of file access
4.2 Moving random distribution
4.3 File size distribution as measured by Agrawal et al. [20]
4.4 Log-normal distribution of file size
5.1 Activity diagram of file system aging test
6.1 Activity diagram of testing of aged file system
7.1 Evolution of sequential read of XFS during testing of high and medium utilisation of HDD
7.2 Evolution of latency of the create operation in an untrimmed file system (XFS)
A.1 Free space fragmentation of XFS during testing of medium utilisation of HDD
A.2 Size distribution of file extents of XFS during testing of medium utilisation of HDD
A.3 Usage of available space of XFS during testing of medium utilisation of HDD
A.4 Evolution of latencies of XFS during testing of medium utilisation of HDD
A.5 Evolution of free space fragmentation of XFS during testing of high utilisation of HDD
A.6 Size distribution of file extents of XFS during testing of high utilisation of HDD
A.7 Usage of available space of XFS during testing of high utilisation of HDD
A.8 Evolution of latencies of XFS during testing of high utilisation of HDD
A.9 Evolution of free space fragmentation of ext4 during testing of high utilisation of HDD
A.10 Size distribution of file extents of ext4 during testing of high utilisation of HDD
A.11 Usage of available space of ext4 during testing of high utilisation of HDD
A.12 Evolution of latencies of ext4 during testing of high utilisation of HDD
A.13 Evolution of free space fragmentation of XFS during testing on SSD with regular trimming
A.14 Size distribution of file extents of XFS during testing on SSD with regular trimming
A.15 Usage of available space of XFS during testing on SSD with regular trimming
A.16 Evolution of latencies of XFS during testing on SSD with regular trimming
A.17 Evolution of free space fragmentation of ext4 during testing on SSD with regular trimming
A.18 Size distribution of file extents of ext4 during testing on SSD with regular trimming
A.19 Usage of available space of ext4 during testing on SSD with regular trimming
A.20 Evolution of latencies of ext4 during testing on SSD with regular trimming
A.21 Evolution of free space fragmentation of XFS during testing on SSD without regular trimming
A.22 Size distribution of file extents of XFS during testing on SSD without regular trimming
A.23 Usage of available space of XFS during testing on SSD without regular trimming
A.24 Evolution of latencies of XFS during testing on SSD without regular trimming

1 Introduction

File systems remain an important part of modern storage solutions. Large, growing databases, multimedia and other storage-based applications need to be supported by a high-performing infrastructure layer for storing and retrieving information. Such infrastructure has to be provided by the operating system (OS) in the form of a file system. Originally, a file system was a simple tool developed to handle communication between the OS and a physical device, but today it is a very complex piece of software with a large set of accompanying tools and features. Performance testing is an integral part of the development cycle of most software. Because of the growing complexity of file systems, performance testing took off as an important part of file system evaluation.

The standard workflow of performance testing is called out-of-box testing. Its principle is to run a benchmark (i.e. a testing tool) on a clean instance of the OS and on a clean instance of the tested file system [1]. Generally, this workflow produces stable and meaningful results, yet it only gives an overall idea of file system behavior in the early stage of its life cycle. File systems, as well as other complex software, are subject to progressive degradation, referred to as software aging [2]. The causes of file system aging are many, but the most prominent are fragmentation of free and used space and unclustered data [3]. This degradation causes problems in performance and functionality over time. Understanding the performance changes of an aged file system can help developers implement various preventions of aging-related problems. Furthermore, aging tests can help developers with the implementation of features that affect long-term performance.

Research of file system aging can be done in two stages: aging the file system and testing of the aged instance. In the first stage, observations about the evolution of fragmentation and performance can be made. The second stage brings insight into the state of performance of the aged file system. In the first part, this thesis describes the implementation of two flexible tests, which correspond to the aforementioned stages.


In the first-stage testing, the file system is aged using the open-source benchmark fs-drift. While aging the file system, the fragmentation of free space and the latency of IO operations are periodically recorded. After the aging process, additional information about the file system's block layout is recorded as well. Once this information is gathered, an image of the aged file system instance is created. To save space, only the metadata of the created file system is saved, since the content of the created files is random and therefore irrelevant.

The created images are then used in second-stage testing. By using the created images, higher stability and consistency of results is achieved. By reloading the file system image onto the device, it is possible to bring the file system back to the original, aged state; therefore the same file system layout is available for every performance test. In the second-stage testing, the file system is brought to the aged state by reloading the image corresponding to the desired testing configuration. Before every performance test, some space can be released from the file system, so that the performance test has space to test on. After environment initialization (sync, fstrim), the performance test can begin. A fresh instance of the file system is tested as well, to determine whether any performance change occurs.

In the latter parts of this thesis, the usage of the developed tests on different configurations of file systems and storage is demonstrated. The subject of research is the difference between popular Linux file systems (XFS, ext4) and the differences between the used storage technologies (solid state and hard disk drives) in the context of aging. Because of the nature of the collected data, a processing tool was implemented to parse the large amount of information into human-readable reports with interactive charts.1

In Chapter 2, the text presents previously conducted research on the effects of age and fragmentation on file system performance. Chapter 3 describes the used storage and file systems and their main features. Chapter 4 introduces the tools used in the implementation of the tests while describing their relevant features. Chapters 5 and 6 describe the implementation of the mentioned tests. Chapter 7 describes the testing environment, the means of data evaluation and the results obtained by using the created tests. The research is focused on describing the effects of file system aging, the differences between file systems in terms of aging and the relation of the underlying storage to the performance of an aged file system.

1. Charts were created using the JavaScript charting library Highcharts [4].

In Chapter 8, I discuss the results achieved by this thesis. Furthermore, the future of testing of file system performance and aging is discussed. Appendix A contains all the charts generated by the automated reporting tool implemented for the purposes of this thesis. Appendix B contains examples of usage of the implemented tests in the Beaker environment and other examples of Beaker environment usage. Appendix C describes the files attached as an electronic appendix.


2 Related work

In this chapter I present different approaches to file system aging and fragmentation research described and implemented in the past. The first section discusses the usage of collected data to create an aging workload. The second section discusses the possibilities of aging the file system artificially, without pre-collected data.

2.1 Aging file system using real-life data

This approach is based on modeling the aged file system using data collected from file systems used in a real-life environment. Such data can be in the form of snapshots (i.e. images) of file systems, as was thoroughly described by Smith and Seltzer [3]. The snapshots were collected nightly over a long period of time (one to three years) on more than fifty file systems. By comparing pairs of snapshots in the sequence, the performed operations were estimated, resulting in a very realistic aging workload. However, as some studies suggest, most files have a life span shorter than 24 hours [5]. Therefore, as Smith and Seltzer admit, by snapshotting every night, this process does not account for most of the created files, resulting in the loss of an important part of the data. Furthermore, to age a file system sized 1024 MB, 87.3 GB of data had to be written, taking 39 hours to complete, rendering the workflow impractical for in-production testing needs.

The problem resulting from not tracking short-lived files can be solved by another approach called collecting traces. Traces are sequences of IO operations performed by the OS, captured at various levels (system calls, drivers, network, etc.). The sequence of operations can be replayed back to the file system, aging it in a realistic manner.

Overall, using real-life data to age file systems brings realistic results, but at the cost of higher expenses, such as storing the collected data. Additionally, to cover different types of file system usage, data from several such file systems have to be collected, expanding the amount of needed data even further. Such materials are not always available, rendering this type of approach useful only in cases where the researcher already possesses them.

2.2 Synthetic aging simulation

Synthetic aging is a type of aging that does not require real-life data. It relies on a purely artificial workload performed on a file system, invoking aging factors such as fragmentation.

Fast file system aging was described as a part of a trace replay benchmark called TBBT [6]. This type of aging consists of a sequence of interleaved append operations on a set of files. By controlling the number of files involved in the process, the researchers had great control over the final fragmentation. Such a workflow, while creating the desired fragmentation, is however quite unrealistic, making the results of testing on such a file system questionable [1].

Another attempt at inducing fragmentation was made in an empirical study of file system fragmentation in mobile storage systems [7]. The aging process consisted of filling the device by alternating creation of files larger than or equal to 100 MB and smaller than or equal to 100 kB. After 100% file system utilization was reached, 5% of the files were randomly deleted. However, for a truly realistic insight, a workload generator which tries to mimic real-life usage can be more suitable.

A workload generator such as fs-drift1 can (when carefully configured) be used to simulate the desired long-term real-life usage. While running, fs-drift creates requests for a variety of IO operations. The probability with which an operation is chosen can be controlled by a workload table. In addition, this tool offers different probability distributions of file access, making it easier to mimic the behavior of a natural user. Furthermore, the whole process is highly configurable, making it possible to simulate various types of file system usage. Such qualities predispose fs-drift to be capable of achieving more realistic results than previous attempts.

1. Mixed workload file system aging test [8].

3 File system and storage devices

In this chapter, I present basic information about the used file systems and storage devices and their features relevant to performance and aging.

3.1 File systems

A file system is a set of tools, methods, logic and structures that controls how data are stored on and retrieved from a device. The system stores files either contiguously or scattered across the device. The basic unit of data access is called a block, whose capacity can be set to various sizes. Blocks are labeled as either free or used. Non-contiguous files are stored in the form of extents: one or more blocks associated with the file but stored elsewhere.

Information about how many blocks a file occupies, as well as other information like the date of creation, the date of last access or access permissions, is known as metadata, i.e. data about stored data. This information is stored separately from the content of files. On modern file systems, metadata are stored in objects called index nodes (i.e. inodes). Each file a file system manages is associated with an inode, and every inode has its number in an inode table. On top of that, the file system stores metadata about itself (unrelated to any specific file), such as information about bad sectors, free space or block availability, in a structure called the superblock.

In this thesis, the targeted file systems are two popular Linux file systems [9], XFS1 and ext42. These file systems belong to the category of journaling file systems. A journaling file system keeps a structure called a journal, which is a buffer of changes not yet committed to the file system. After a system failure, these planned changes can be easily read from the journal, making it easy to bring the file system back to a fully operational, correct and consistent state [10].
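To illustrate, the metadata stored in a file's inode can be inspected with standard tools (the file name below is a placeholder):

$ ls -i notes.txt    # print the inode number of the file
$ stat notes.txt     # print size, block count, timestamps and permissions kept in the inode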

1. XFS file system [11].
2. Fourth extended file system [12].


3.1.1 XFS

XFS is a 64-bit journaling file system known for its high scalability (up to 9 EB) and great performance. This performance is achieved by an architecture based on allocation groups.

Allocation groups are equally sized linear regions within the file system. Each allocation group manages its own inodes and free space, thereby increasing parallelism. This design enables significant scalability of bandwidth, threading and file system (as well as file) size, simply because multiple processes and threads can access the file system simultaneously.

XFS allocates space as extents stored in pairs of B+ trees, one pair for each allocation group (improving performance especially when handling large files). One of the B+ trees is indexed by the length of the free extents, while the other is indexed by the starting block of the free extents. This dual indexing scheme allows efficient location of free extents for IO operations.

Prevention of file system fragmentation consists mainly of features called delayed allocation and online defragmentation. Delayed allocation, also called allocate-on-flush, is a feature that, when a file is written to the buffer cache, subtracts space from the free space counter but does not yet allocate it in the free space bitmap. The data is held in memory until it has to be stored because of a system call. This approach improves the chance that the file will be written in a contiguous group of blocks, avoiding fragmentation and reducing CPU usage as well.

3.1.2 Ext4

Ext4, also called the fourth extended file system, is a 48-bit journaling file system developed as a successor of ext3 for the Linux kernel, improving reliability and performance features. Ext4 is scalable up to 1 EiB (approximately 1.15 EB). The traditional ext2 and ext3 block mapping scheme was replaced by an extent-based approach similar to XFS, which positively affects performance.

Similarly to XFS, ext4 uses delayed allocation to increase performance and reduce fragmentation. For the cases of fragmentation that still occur, ext4 provides a tool for online defragmentation, e4defrag, which can defragment either a single file or the whole file system. Performance penalties were, however, recognized, and an extended online defragmentation workflow was proposed by Sato [13].
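For illustration, a minimal sketch of e4defrag usage (the mount point is a placeholder):

$ e4defrag -c /mnt/test    # only report the fragmentation score, do not defragment
$ e4defrag /mnt/test       # defragment all files under the given path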

3.2 Storage

3.2.1 Hard disk drive

A hard disk drive (HDD) is a type of storage device which uses one or more magnetic platters to store data. Data is retrieved by rotating the platters and positioning magnetic read-write heads. The platters rotate at a stable speed of around 7500 rpm (and more in enterprise-level hardware). Since parts of the HDD have to physically move to reach a desired location, there is a latency to data access. The time it takes a magnetic head to find the next relevant block of data is called the seek time. Because the length of the seek time has a significant impact on overall IO performance, the OS has to do a lot of optimising, such as pre-fetching.

Obviously, the block layout has a large impact on the performance of this kind of device. The amount of fragmentation (and thus aging) affects the number of performed seeks. This puts pressure on the file system to store data more contiguously and to cluster related data.

3.2.2 Solid state drive

A solid state drive (SSD) is a type of storage device which uses integrated circuits to store and retrieve data. An SSD has no moving parts, therefore data access is purely electronic, which results in lower access times and latency than an HDD [14].

However, on an SSD, data cannot be directly overwritten (as it can on an HDD); a cell of an SSD has to be erased before it can be written to again. Moreover, due to physical construction limits, a write operation can be conducted on one page (4–16 kB), but erasure has to be done on a whole block (128 to 512 pages) [15]. Therefore, if the OS has to rewrite some part of a page (e.g. update metadata), the page has to be read, modified and written back to an available part of the drive. The original page is then marked as discarded. This effect is known as write amplification.

9 3. File system and storage devices written to the device divided by amount of bytes requested to be written by user [15]. For example, if user updates 512 B of data on device that uses 8 kB pages, write amplification is then 16. In addition, the memory cell can be rewritten finite amount of times, therefore a form of wear leveling has to be employed. Wear leveling prevents frequently accessed blocks from exhaustion of cell life-cycle by moving files around the device. Static wear leveling rotates even unused files around the drive to ensure equal wear [16]. However, deleting file, in file system doesn’t always ensure its deletion on the device. Typically, the file is only marked as deleted, but this information is not submitted to underlying device itself. The problem is, the files which are not valid for file system anymore can still circulate on the device, increasing its wear and write amplification, since there is lower amount of available pages towrite to. To decrease this effect, trim commands were introduced [17]. Trim command communicate to SSD all the deletions that were realised in the file system, so the drive can erase blocks accordingly. Nevertheless this operation is reducing performance of IO operations while being conducted, it can increase overall performance and life time of an SSD.

4 Environment setup and benchmark tools

In this chapter, I present the tools which were used to implement the automated tests for creating and storing aged file systems and for measuring their performance. Furthermore, I describe their main features and the means of their usage. All the presented tools are open source projects.

4.1 Beaker

Beaker is an open source Red Hat community project aimed at automating the testing workflow [18]. The software can provision a system from a pool of labs, install the OS and packages, configure the environment and perform tasks. The whole process is guided by a sequence of instructions in an XML format. Examples of usage can be found in Appendix B.

4.2 Storage generator

Storage generator is a Beaker task developed by Jozef Mikovič as an internal Red Hat tool. It is capable of automated configuration of storage on a provisioned system. In single device mode, storage generator simply creates a new partition on the given device, then creates and mounts a file system. In recipe mode, storage generator follows a set of instructions to create more advanced configurations, such as merging multiple devices using LVM, creating an LVM cache or an encrypted volume. An example of usage can be found in Example B.3.

The creation of the XFS file system is standard, but the ext4 file system is created with an additional option disabling lazy initialisation. Lazy initialisation is a feature which allows for fast creation of a file system by not initialising all of its metadata structures at once; the initialisation continues in the background after the file system is mounted. Such additional work, however, would skew the data collected in the first hours of the test, and it is therefore disabled in these tests.
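The thesis does not give the exact invocation; one common way to disable lazy initialisation when creating an ext4 file system is via the extended options of mkfs.ext4 (the device path is a placeholder):

$ mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/sdb1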

4.3 FIO

The Flexible Input/Output benchmark is a workload generator written by Jens Axboe [19]. It is a tool well known for its flexibility, as well as its large group of users and contributors. The flexibility is integral for conducting less artificial, more natural performance tests. However, as the test behavior approaches more natural patterns, the stability of the results drops, so an equilibrium between these two requirements has to be found.

FIO accepts the workload specification as either a configuration file or a single command line. Multiple different jobs can be specified, as well as global options valid for every job. There is a possibility to choose from a variety of IO operations to be performed. Verification of the issued data is offered as well. The size of the generated files and the block size can be controlled too, and each can be either fixed or chosen from a given range. For cache-tiering workloads, different random distributions (e.g. Zipf, Gauss) can be specified. FIO also supports process forking and threading.

After the test, FIO generates an overall report of the measured performance. Logging of multiple properties can be enabled, giving researchers even more insight into the nature of file system performance.

4.4 Fs-drift

Fs-drift is a very flexible aging test, which can be used to simulate many different workloads [8]. The test is based on random file access and a randomly generated mix of requests. These requests can be writes, reads, creates, appends, truncates, renames and creations of hard and soft links.

At the beginning of the run time, the top directory is empty, therefore create requests succeed the most; other requests, such as read or delete, fail because not many files have yet been created. Over time, as the file system grows, create requests begin to fail and other requests are more likely to succeed. The file system will eventually reach a state of equilibrium, when requests are equally likely to execute. From this point, the file system does not grow anymore, and the test runs until one of the stop conditions is met.

12 4. Environment setup and benchmark tools

The mix of operation probabilities can be specified in a separate CSV file. Fs-drift will try to issue more create operations at the beginning of testing, so that other operations execute with higher likelihood.

The file to perform a request on is randomly chosen from a list of file indexes. If the type of random distribution is set to uniform, all indexes have the same probability of being chosen. If the type of random distribution is set to gaussian, the probability behaves according to a normal distribution with its center at index 0 and a width controlled by the parameter gaussian-stddev. This is useful for performing cache-tiering tests. The file index is computed modulo the maximal number of files; therefore, instead of accessing negative index values, the test accesses indexes from the other side of the spectrum, see Figure 4.1.

Furthermore, fs-drift offers another option to influence the random distribution. After setting the parameter mean-velocity, fs-drift will choose files by means of a moving random distribution. The principle relies on a simulated time, which runs inside the test. For every tick of the simulated time, the center of the bell curve moves along the file index array by the value specified by the mean-velocity parameter. By enabling this feature, the process of testing moves closer to reality by simulating more natural patterns of file system access (a user does not access the file system randomly, but rather works with some set of data at a time). In Figure 4.2, you can see the bell curve moving twice by 5 units.

Fs-drift offers even more parameters to control the test run, such as the number of directories, the number of levels of directories or enabling fsync/fdatasync to be called. To stop fs-drift, one of the stop conditions has to be met. The stop condition can be reaching a maximal number of performed operations, running out of time, or the appearance of a stop file.

For evaluation of the aging process, fs-drift can log the latency of the performed operations. However, the log does not differentiate between the operations, a distinction which could be useful for further research.

The configuration of fs-drift used for the aging tests is further described in Chapter 5.
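For illustration, a workload table favouring expanding operations might look as follows (the operation names and weights are illustrative; the exact set of recognised names depends on the fs-drift version in use):

create, 4
append, 2
random_write, 2
read, 1
random_read, 1
truncate, 1
delete, 1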

4.4.1 Changes in original code

Several changes had to be made in the original code prior to testing.


Figure 4.1: Normal distribution of file access

The most obvious problem with the tool was that it did not conduct delete operations. Deleting files is important for inducing fragmentation; therefore, this error had to be repaired.

Another problem emerged with the gathering of statistics on latency evolution through the aging process. Since the tool generates the IO requests at random, errors sometimes occur: most operations need the file to exist to succeed, but sometimes the file is non-existent. The problem with latency logging was that fs-drift logged the latency even if the operation did not complete, causing noise in the data. A further problem with latency logging was that it did not differentiate between operations. Such a distinction can greatly affect the way researchers find problems in file system evolution.

As mentioned, the possibility of specifying a pause file was added, making fs-drift easy to pause when additional manipulation of the file system is required.

Another feature added to fs-drift was a non-uniform distribution of file sizes. Originally, fs-drift used only a uniform distribution to choose a file size from zero to the specified maximum size. Such an implementation offers very little control over the file size distribution and is quite unrealistic. Therefore, the possibility to request a natural file size distribution was added.


Figure 4.2: Moving random distribution

As a reference for the file size distribution, a five-year study of file system metadata was used [20]. Figure 4.3 shows the measured file size distribution of the studied file systems. To mimic this layout, a log-normal distribution was modeled. The shape of the resulting distribution is shown in Figure 4.4.

The version used for testing in this thesis has all the aforementioned features implemented and the problems repaired.


Figure 4.3: File size distribution as measured by Agrawal et al. [20]

Figure 4.4: Log-normal distribution of file size.

5 Aging the file system

In this chapter, I describe the process of developing the file system aging workflow and its implementation in the form of an automated test.

5.1 Fs-drift configuration

As mentioned, fs-drift was used as the means of bringing file systems to an aged, fragmented state. Fs-drift is quite flexible, therefore a lot of parameters and their impact on the final layout had to be considered. First, the amount of fullness had to be taken into account. Heavily used file systems tend to be full at levels ranging from 65% to 100% [3, 20]. However, fs-drift does not offer an option to directly control the fullness of the file system. As the creator states in the README, to fill a file system, the maximum number of files and the mean file size should be defined such that their product is greater than the available space. Parameters that overload the volume are not difficult to come up with. The problem is that the random nature of the test does not allow for meaningful reproducibility of the reached equilibrium. In most cases, fs-drift plainly saturates the given volume, so utilization remains at 100% for the rest of the testing time. This drawback was overcome by a change in the fs-drift code1. With the added possibility to pause fs-drift, other processes can stop fs-drift from generating IO requests when file system usage reaches a specified amount, and release some space. This way, the desired maximal utilization can be reached.

Fs-drift offers control of the testing length by elapsed time. The elapsed time is computed as the starting time subtracted from the current time. However, pausing fs-drift to release space, examine the file system, etc. resulted in non-uniform testing times across runs. Therefore, the testing length is controlled by operation count in the conducted tests.

File size is another important factor of a successful simulation. To consider a simulation successful, the resulting layout should have a file size distribution similar to real-life file systems. For that reason, the natural file size distribution is used in all tests2.

1. For all the changes made in the original code of fs-drift, see Chapter 4.
2. The specifics of the natural file size distribution are described in Chapter 4.


During the test, more weight is placed on file system expanding operations (create, append, random write) to fill the file system faster. Non-altering operations are present to verify whether any performance drop occurs during the aging process.

5.2 File system images

File system images can be created using tools developed to inspect file systems in cases of emergency. For ext file systems, there is a tool called e2image, and for XFS, xfs_metadump. Both tools create images as sparse files, so compression is needed.

The e2image tool can save the whole contents of a file system or just its metadata, and offers compression of the image as well. Created images can be further compressed by tools such as bzip2 or tar. Such images can later be reloaded back onto a device; from that point, the file system can be mounted. Example 5.1 shows the creation of a file system image using the e2image tool. Example 5.2 shows reloading an image onto a device.

Example 5.1: Creating a compressed image using e2image.

$ e2image -Q $DEVICE $NAME.qcow2

Example 5.2: Reloading a compressed image using e2image.

$ e2image -r $NAME.qcow2 $DEVICE

Xfs_metadump saves XFS file system metadata to a file. Due to privacy reasons, file names are obfuscated (this can be disabled by the -o parameter). As with the e2image tool, the image file is sparse, but xfs_metadump does not offer a way to compress the output. However, the output can be redirected to standard output and compressed from there. Generated images, when uncompressed, can be reloaded back onto a device by the tool xfs_mdrestore. The file system can then be mounted and inspected as needed. Example 5.3 shows the creation of a file system image using xfs_metadump. Example 5.4 shows reloading an image onto a device using xfs_mdrestore.

Example 5.3: Creating a compressed image using xfs_metadump and bzip2.

$ xfs_metadump -o $DEVICE - | bzip2 > $NAME


Example 5.4: Reloading an image using xfs_mdrestore.

$ xfs_mdrestore $NAME $DEVICE

5.3 Implementation details of file system aging test

The workflow of image creation is contained in the Beaker task drift_job. After extracting fs-drift, the main script (runtest.sh) starts a Python script (run_drift.py), which handles the process of running fs-drift. The settings of fs-drift are passed as a parameter and then parsed inside the script. Before running fs-drift, the asynchronous thread async_worker is triggered. Async_worker offers ways to additionally control testing and to work with the file system while the test is running. The thread wakes up at specified intervals. Upon awakening, fs-drift is paused, then free space fragmentation and file system usage are logged. If the usage reaches a defined limit, a specified amount of space is randomly released from the file system. If the test runs on an SSD, fstrim can optionally be called. When all the mentioned steps are completed, the thread calls sync, unpauses fs-drift and goes back to sleep.

After fs-drift ends, the fragmentation of used space is logged, and an image of the file system is created using the presented tools and archived using bzip2. All the generated data and information about the environment are archived as well, and a text file describing the test is generated. These three files are then sent (via rsync) to a data collecting server. For a better overview of the test's functionality, its activity diagram is presented in Figure 5.1. An example of the execution of this task in the Beaker environment can be found in Example B.5.


Parameters available for drift_job:

1. sync, a flag to signal whether or not to send data to the server (useful for development purposes)
2. mountpoint
3. disk, the device used during the test
4. recipe, parameters to pass to fs-drift
5. tag, a string to distinguish different storage configurations
6. drifttype, a string to distinguish different aging configurations
7. maintain, a parameter specifying the maximum volume usage and the amount to be freed
8. fstrim, a parameter specifying whether fstrim should be called periodically


Figure 5.1: Activity diagram of file system aging test


6 Performance testing of aged file system

In this chapter, I describe the structure of the performance test which uses the images created by the aging test. In the first section, I present the configuration of the FIO benchmark, and in the second section, I describe the implementation of the testing workflow as an automated test.

6.1 FIO configuration

By default, FIO creates one file for every triggered thread. Furthermore, this file is created and opened before FIO begins issuing IO requests. This setting is unfit for file system aging testing, because the subject of interest is, among others, file allocation. The first step to make the FIO workload more relevant to file system aging is to ask for the creation of many files for every thread. The number of files was set in such a way that their individual size was approximately 4 kB. FIO tries to open all the files at once, which results in an error for more than 102 files; therefore, the maximal number of open files was set to 100. Furthermore, to include the creation of files and allocation in the performance measurement, create_on_open was set, so FIO creates a file at the moment of opening if the file does not exist.

Another important factor is the distribution with which FIO accesses the files. The default is round-robin, but it was set to gaussian, so that the distribution of file access is more similar to the file system aging test. Similarly to the aging test, FIO should perform fsync from time to time. This can be set as the number of write requests after which fsync should be performed.

The overall size is set in such a way that it consumes most of the space left after loading the image. The same size is used while testing on a fresh file system. The block size is set to 4 kB, as that is the default block size of the used file systems. To ensure a stable run time of the FIO test, time-based testing is used with a run time of 10 minutes. FIO is not expected to issue the total specified size, but to measure the performance effectively.
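A minimal job file along these lines might look as follows. This is a sketch using common FIO options; the values are illustrative, the gaussian file access distribution is configured by an additional option omitted here, and the actual configuration is passed to the test via the recipe parameter:

[global]
directory=/mnt/test   ; mount point of the tested file system (placeholder)
bs=4k                 ; block size matching the file system block size
nrfiles=1000          ; many small files instead of one large file per thread
openfiles=100         ; limit the number of simultaneously open files
create_on_open=1      ; create files at open time, so allocation is measured
fsync=32              ; fsync after every 32 write requests (illustrative)
time_based=1          ; keep running for the whole runtime
runtime=600           ; 10 minutes

[aged]
rw=randwrite          ; one of the tested IO patterns
size=10g              ; illustrative overall size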

6.2 Implementation details of aged file system performance test

Performance testing of the created images is contained in the Beaker task recipe_fio_aging. Upon installation of the necessary tools (libraries, FIO), the task finds and downloads the corresponding file system image according to the obtained parameters. As shown, images are stored compressed, therefore decompression is needed after download. Once these steps are successfully completed, the testing phase can begin. Before every test, initialization is performed by running sync and fstrim and by dropping caches (see the sketch below).

First, a performance measurement of a fresh file system is done. The test configurations of the fresh and aged tests are similar, with the exception that after testing the fresh file system, FIO deletes all the created files. This is disabled for aged tests, because post-test fragmentation is gathered afterwards. After testing the fresh file system, the image is loaded and mounted. If the mounted file system utilises too much space, some files can be randomly deleted. Afterwards, pre-test fragmentation is logged and the environment is initialized. When the test is over, post-test fragmentation is gathered.

The configuration of FIO can be passed as a parameter to the recipe_fio_aging task. Such an approach makes this testing workflow as flexible as the original benchmark. For statistical correctness, these tests can run in loops n times. Fresh and aged testing is divided into separate loops. After the last iteration, the results are archived and sent to a data collecting server. For a better overview of the test's functionality, its activity diagram is presented in Figure 6.1. An example of the execution of this task in the Beaker environment can be found in Example B.4.
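The environment initialization corresponds to commands along these lines (the mount point is a placeholder; dropping caches requires root privileges):

$ sync                                  # flush dirty pages to the device
$ fstrim /mnt/test                      # discard unused blocks (on SSD)
$ echo 3 > /proc/sys/vm/drop_caches     # drop the page cache, dentries and inodes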


Parameters available for recipe_fio_aging:

1. sync, a flag to signal whether or not to send data to the server (useful for development purposes)
2. numjobs, the number of test repetitions, for statistical stability
3. mountpoint
4. device, the device used during the test
5. recipe, parameters to pass to FIO
6. tag, a string to distinguish different tests

Figure 6.1: Activity diagram of testing of aged file system.


7 Results

This chapter presents the results of the implemented tests. In the first section, I describe the environment used for testing. In the second section, I describe the means of data evaluation. In the third section, I show elementary differences between fresh and aged file systems. In the fourth section, I describe differences between the two tested file systems (XFS, ext4) in terms of aging. In the fifth section, I demonstrate the effect file system aging has on the underlying storage.

7.1 Testing environment

The testing machines were available to me from a pool of systems in the Red Hat Beaker environment. For every used machine, its model, processor, memory and used storage devices are listed below. The OS installed on all machines was RHEL-7.3 with kernel 3.10.0-514.el7.x86_64. The tuned profile was set to performance. More information about the installed packages can be found in Example B.2.


Machine 1
  Model: Lenovo System x3250 M6
  Processor: Intel Xeon E3-1230 v5
  Clock speed: 3.40 GHz (4 cores)
  Memory: 16 384 MB
  Storage device: HP ProLiant hard drive, SAS, 600 GB

Machine 2
  Model: Lenovo System x3250 M6
  Processor: Intel Xeon E3-1230 v5
  Clock speed: 3.40 GHz (4 cores)
  Memory: 16 384 MB
  Storage device: Intel solid state drive, SATA, 120 GB

Machine 3
  Model: IBM System x3650 M3
  Processor: Intel Xeon X5650 v2
  Clock speed: 2.67 GHz (6 cores)
  Memory: 12 288 MB
  Storage device: IBM hard disk drive, SAS, 300 GB

7.2 Data processing

Accounting for the great amount of data generated by the tests, some kind of automatic processing had to be implemented to make evaluation easier for humans. Therefore, a set of scripts which can compare the results of two different runs of the test was implemented. The output of those scripts is a human-readable HTML report1. The report displays information about the testing environment and different charts2 described below. All the charts generated from the tests can be found in Appendix A.

7.2.1 Fragmentation

During the tests, two types of fragmentation were recorded, namely fragmentation of used space and fragmentation of free space.

To find out the fragmentation of used space, xfs_io fiemap was used. This tool can display the physical mapping of any file, using 512 B blocks. Therefore, if it is run on every file in the file system, a histogram of the fragments of used space can be computed. Files created by fs-drift which are smaller than the file system block size are nevertheless mapped to one file system block.

Generally, the fragmentation of free space is easier to obtain, since this information is stored in the metadata of the file system. However, the way of reading the information differs across file system implementations. For ext4, the tool e2freefrag displays a histogram of the free space of an ext file system. When looking for this information in XFS, the user first has to find the number of allocation groups in the file system instance. Then, for every allocation group, a histogram of free space can be obtained using xfs_db. A sketch of the relevant commands is shown below.

As stated for the aging test, the fragmentation of free space is logged periodically, allowing for an analysis of how fragmentation evolves in time. For the purpose of visualising this evolution, 3D charts are introduced.
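For illustration, the underlying commands can be invoked as follows (device and file paths are placeholders):

$ xfs_io -c "fiemap -v" /mnt/test/somefile    # extent mapping of a single file
$ e2freefrag /dev/sdb1                        # free space fragmentation histogram (ext4)
$ xfs_db -r -c "freesp -s" /dev/sdb1          # free space histogram of an XFS device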

1. The HTML code is automatically generated using the open source library html.py [21].
2. Charts were created using the JavaScript charting library Highcharts [4].


7.2.2 Evaluation of data generated by fs-drift

The main researched output of fs-drift is the evolution of the latency of different operations. This property is displayed as a chart with elapsed time on the X axis and measured latency on the Y axis. Since the data are quite noisy, interpolation and filtering using a Savitzky-Golay filter1 were used to display a smooth curve. Further analysis of the obtained data should be done in the future to understand the behavior of aged file systems even better. Such deep analysis, however, is beyond the scope of this thesis.

7.3 Testing on images of aged file system

Testing with the second implemented test yielded no meaningful results. The observed throughput seemed to be the same across runs on all generated images. The conclusion of this testing is that it simply does not work as intended. As for the testing workflow, images are successfully downloaded, loaded onto the device and mounted. Afterwards, the FIO test is successfully deployed. This can be confirmed from the logs generated during the test.

One cause of this problem could be that FIO is simply not suitable for revealing differences between fresh and aged file systems. Another cause could lie in the implementation of FIO itself. If there is some bug that causes FIO not to operate as specified by the parameters, e.g. FIO pre-allocating files before the performance measurement, the test could output results similar to those observed. Modifying or debugging the benchmark can be a subject of future related work; however, such extensive research is beyond the scope of this thesis.

7.4 Performance of aged file system

The file system aging test successfully induced fragmentation in all performed tests. From the data generated by the aging test, it is apparent that the aging process negatively affects the performance of file systems.

1. The Savitzky-Golay filter is a digital filter used for smoothing noisy data. It is implemented as part of the open source scientific tools for Python, SciPy [22].


Performance degradation varies with the used storage, file system implementation and maintenance techniques (e.g. fstrim). In Figure A.1, we can see the evolution of free space fragmentation gathered while testing at medium file system utilisation (80%). From Figure A.5, we can see that higher file system utilisation (99%) induces tremendous fragmentation of free space. Despite this, it is remarkable that the fragmentation of used space is very similar across these two test runs. In Figures A.2 and A.6, we can see that both file system instances have about 92% of files optimally allocated. Considering that the number of free space fragments differs by tens of thousands, it is remarkable that the number of non-contiguous file extents differs only by 33%. In Table 7.1, we can see an overall rise of latency for the highly utilised file system.

Furthermore, the append and read operations display signs of progressively larger fluctuations during both tests (Figures A.8 and A.4). The create and truncate operations show signs of performance degradation only on the highly utilised file system (Figure A.8). Figure 7.1 shows a comparison of sequential read latencies of the medium and highly utilised file systems.


[Two charts titled "Evolution of latency": time [s] on the X axis, latency [ms] on the Y axis; series: random_write, truncate, read, create, random_read, append, delete.]

Figure 7.1: These charts show the progressively larger performance fluctuation of sequential read observed on both the medium and the highly utilised file system.


Operation        Statistic  80%       99%
random write     median     11.40 ms  11.84 ms
                 mean       16.23 ms  17.86 ms
truncate         median     8.62 ms   8.93 ms
                 mean       10.59 ms  11.66 ms
sequential read  median     8.03 ms   8.54 ms
                 mean       11.20 ms  14.30 ms
create           median     11.73 ms  11.98 ms
                 mean       12.57 ms  16.49 ms
random read      median     6.42 ms   6.56 ms
                 mean       7.28 ms   8.05 ms
append           median     17.66 ms  17.93 ms
                 mean       22.00 ms  27.30 ms
delete           median     0.10 ms   0.13 ms
                 mean       3.63 ms   4.26 ms

Table 7.1: Comparison of latencies at medium and high file system utilisation.

7.5 Differences between XFS and ext4

From the test conducted on a highly utilised HDD, we can conclude that XFS fights free space fragmentation much better. In Figure A.5, which shows the last 6 hours of the high-utilisation test of XFS on HDD, we can see a relatively low number of free space extents. The peaks in the number of extents are caused by the random space releasing conducted by the aging test, which can be confirmed by examining the corresponding usage evolution chart (Figure A.7). After every peak, the number of small extents is progressively lowered until the next peak; XFS clearly defragments free space successfully between the volume releasing iterations. On the other hand, when examining the evolution of free space fragmentation of ext4 (Figure A.9, last 6 hours of the test), we can see that after volume releasing, the number of small extents of free space remains more or less the same.


Furthermore, through most of the testing time, ext4 contained about 20 thousand extents of free space, which is significantly more than XFS. Such free space fragmentation can have a significant impact on overall file system performance. This can be observed in block allocation operations such as create and append, as we can see in Table 7.2.

When testing on SSD, XFS and ext4 have very similar disk layouts in terms of free and used space fragmentation. After maximal disk utilisation is reached, both file systems have up to 50 000 fragments of free space (Figure A.13 and Figure A.17). At the end of the test, both file systems had 98% of files optimally allocated. The only difference in block layout seems to be in the size distribution of extents. As can be seen in Figure A.14 and Figure A.18, ext4 has more small fragments than XFS. Since the contiguity of related blocks may play a role even on an SSD, this could mean a potential performance penalty.

In Table 7.2, we can see the median and mean of all operations through the whole test. It is clear that all the operations except delete and truncate are slightly faster when performed by XFS. This small difference can be a result of the lower used space fragmentation in XFS.

7.6 Performance across different storage devices

The underlying storage has a significant impact on file system performance. As mentioned, the SSD's lack of moving parts allows the file system to perform IO faster. Even when connected through a slower interface (SATA) than the HDD (SAS), the SSD issues operations much faster, as we can see in Table 7.3. Despite the difference in overall speed of the devices, the underlying storage technology has apparently no effect on fragmentation, because the file systems deploy the same defragmentation strategies in both cases. However, the latency growth observed on HDD, as mentioned in Section 7.4, was not observed when testing on SSD.


                            SSD                  HDD
Operation        Statistic  XFS       ext4       XFS       ext4
random write     median     0.434 ms  0.594 ms   11.84 ms  11.20 ms
                 mean       1.40 ms   3.05 ms    17.86 ms  18.95 ms
truncate         median     0.70 ms   0.32 ms    8.93 ms   5.22 ms
                 mean       1.48 ms   1.35 ms    11.66 ms  5.84 ms
sequential read  median     1.37 ms   1.56 ms    8.54 ms   7.64 ms
                 mean       3.24 ms   3.47 ms    14.30 ms  10.96 ms
create           median     0.38 ms   0.40 ms    11.98 ms  18.19 ms
                 mean       4.02 ms   3.84 ms    16.49 ms  19.36 ms
random read      median     0.30 ms   0.46 ms    6.56 ms   5.50 ms
                 mean       1.32 ms   1.37 ms    8.048 ms  6.11 ms
append           median     1.49 ms   1.76 ms    17.93 ms  23.81 ms
                 mean       6.46 ms   6.65 ms    27.29 ms  28.36 ms
delete           median     0.097 ms  0.049 ms   0.128 ms  0.091 ms
                 mean       0.701 ms  0.346 ms   4.26 ms   1.32 ms

Table 7.2: Comparison of latencies gathered during testing with XFS and ext4 on SSD and HDD.


Operation        Statistic  HDD       SSD
random write     median     11.84 ms  0.43 ms
                 mean       17.86 ms  1.41 ms
truncate         median     8.93 ms   0.70 ms
                 mean       11.67 ms  1.47 ms
sequential read  median     8.54 ms   1.37 ms
                 mean       14.30 ms  3.24 ms
create           median     11.98 ms  0.33 ms
                 mean       16.49 ms  4.02 ms
random read      median     6.56 ms   0.29 ms
                 mean       8.04 ms   1.32 ms
append           median     17.93 ms  1.49 ms
                 mean       27.29 ms  6.47 ms
delete           median     0.128 ms  0.097 ms
                 mean       4.26 ms   0.70 ms

Table 7.3: Comparison of latencies gathered during the aging test on HDD and SSD.

7.7 Testing with fstrim

All the aforementioned results generated by tests conducted on SSD devices had regular trimming scheduled as a part of the test. However, experimental test runs without trimming were also conducted. From the tests, it is apparent that regular running of fstrim indeed has beneficial effects on file system performance. The most affected operation was create when testing XFS. The apparent latency growth of create can be seen in Figure 7.2. The overall difference in latencies can be found in Table 7.4.

Trimming should not have any effect on the fragmentation of file systems, but a difference can be seen in the overall used space fragmentation. While both file systems (trimmed and not trimmed) have a similar percentage of optimally allocated files, the file system that was not trimmed contains a significantly smaller number of fragments and a larger average fragment than the trimmed file system (Figures A.14 and A.22). This could be explained by the greater latency of the untrimmed device, which may have given the file system more time to schedule written blocks better.

This result is a great example of using the aging test to evaluate features that affect long-term performance, as well as of the importance of regular trimming of SSD devices.


Operation        Statistic  Trimmed   Not trimmed
random write     median     0.43 ms   0.52 ms
                 mean       1.40 ms   4.99 ms
truncate         median     0.70 ms   1.05 ms
                 mean       1.48 ms   3.17 ms
sequential read  median     1.37 ms   1.85 ms
                 mean       3.24 ms   4.74 ms
create           median     0.33 ms   0.34 ms
                 mean       4.02 ms   6.02 ms
random read      median     0.30 ms   0.38 ms
                 mean       1.32 ms   1.99 ms
append           median     1.49 ms   1.99 ms
                 mean       6.47 ms   8.33 ms
delete           median     0.097 ms  0.102 ms
                 mean       0.70 ms   0.97 ms

Table 7.4: Differences in latencies of trimmed and untrimmed file systems (XFS).


[Chart titled "Evolution of latency": time [s] on the X axis, latency [ms] on the Y axis; series: random_write, truncate, read, create, random_read, append, delete.]

Figure 7.2: Latency evolution of the create operation in untrimmed XFS.


8 Conclusion

The aim of this thesis was to introduce means of testing file system aging performance. This goal was reached by creating two automated tests, which use existing benchmarks for testing.

The file system aging test uses a workload generator to simulate aging. Because the test is automated, environment configuration is handled, as well as tool preparation. While aging, the test gathers a lot of data and statistics about file system usage, fragmentation levels and the performance of the conducted IO operations. After the test, metadata images of the file systems are automatically created and compressed. The file system aging test successfully induces fragmentation and aging of the used file systems. From the generated data, it can be observed that aging indeed causes a performance penalty on file systems. These effects were observed in both tested file systems.

A further aim of this thesis was to compare the performance of different aged file systems and the effect of the used storage technologies on performance. From the obtained data, it is apparent that with higher utilisation of file systems, performance-degrading effects become observable. Even when using flash-based devices (i.e. SSDs), aging can have a degrading effect on the performance of some IO operations. Furthermore, this test was used to demonstrate the effects of fstrim, a tool affecting long-term performance, displaying another use case of this test besides the evaluation of the file system implementation itself.

This thesis was aimed at testing so-called local file systems with a simple configuration, which means the whole drive is presented to the file system as a single block device. However, storage usage is evolving into new approaches. Cloud solutions are becoming more and more popular; therefore file systems which operate over storage clusters are already in production (e.g. CephFS1). Such file systems have to deal with many new problems, like working with metadata stored on a remote server, or distributing data evenly across the cluster.

Virtualisation also requires new solutions and additional performance control. With the development of tools like LVM, it is possible to merge multiple devices into huge, flexible arrays. Approaches like thin provisioning then offer further possibilities of providing storage space to users. With such growing complexity, file system aging (and software aging generally) can have a significant impact on the performance of these systems. It is, therefore, very important to research and understand the complex behavior of file system aging.

The automated file system aging test and the means of evaluation of the generated data are the main results of this thesis. Since the aging test was implemented as a Beaker task, it is compatible with the Red Hat Kernel Performance team workflow and was therefore included in the testing matrix of Red Hat Enterprise Linux 7.4 and Pegas and upstream kernels. Including this test in the regular testing matrix is an important step in expanding the ways of examining file systems in new kernel versions. Testing file system aging may reveal, and help repair, complex problems which can emerge during file system development.

The results of this thesis were presented to developers of file systems and to the open source community as well. Future plans for file system aging are being discussed with the developers of file systems.

1. Storage cluster file system [23].

Bibliography

[1] Avishay Traeger et al. "A Nine Year Study of File System and Storage Benchmarking". In: Trans. Storage 4.2 (May 2008), 5:1–5:56. ISSN: 1553-3077. DOI: 10.1145/1367829.1367831.

[2] Domenico Cotroneo et al. "A Survey of Software Aging and Rejuvenation Studies". In: J. Emerg. Technol. Comput. Syst. 10.1 (Jan. 2014), 8:1–8:34. ISSN: 1550-4832. DOI: 10.1145/2539117.

[3] Keith A. Smith and Margo I. Seltzer. "File System Aging; Increasing the Relevance of File System Benchmarks". In: SIGMETRICS Perform. Eval. Rev. 25.1 (June 1997), pp. 203–213. ISSN: 0163-5999. DOI: 10.1145/258623.258689.

[4] Torstein Hønsi et al. Highcharts charting library. 2017. URL: https://www.highcharts.com/.

[5] John K. Ousterhout et al. "A Trace-driven Analysis of the UNIX 4.2 BSD File System". In: Proceedings of the Tenth ACM Symposium on Operating Systems Principles. SOSP '85. Orcas Island, Washington, USA: ACM, 1985, pp. 15–24. ISBN: 0-89791-174-1. DOI: 10.1145/323647.323631.

[6] Ningning Zhu et al. "TBBT: Scalable and Accurate Trace Replay for File Server Evaluation". In: Proceedings of the 4th Conference on USENIX Conference on File and Storage Technologies - Volume 4. FAST '05. San Francisco, CA: USENIX Association, 2005, pp. 24–24. URL: http://dl.acm.org/citation.cfm?id=1251028.1251052.

[7] Cheng Ji et al. "An empirical study of file-system fragmentation in mobile storage systems". In: 8th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 16). USENIX Association, 2016.

[8] Ben England. fs-drift. 2017. URL: https://github.com/parallel-fs-utils/fs-drift.

[9] Lanyue Lu et al. "A Study of Linux File System Evolution". In: Proceedings of the 11th USENIX Conference on File and Storage Technologies. FAST '13. San Jose, CA: USENIX Association, 2013, pp. 31–44. URL: http://dl.acm.org/citation.cfm?id=2591272.2591276.


[10] Vijayan Prabhakaran et al. "Analysis and Evolution of Journaling File Systems". In: USENIX Annual Technical Conference, General Track. 2005, pp. 105–120.

[11] Adam Sweeney et al. "Scalability in the XFS File System". In: Proceedings of the 1996 Annual Conference on USENIX Annual Technical Conference. ATEC '96. San Diego, CA: USENIX Association, 1996, pp. 1–1. URL: http://dl.acm.org/citation.cfm?id=1268299.1268300.

[12] Avantika Mathur et al. "The new ext4 filesystem: current status and future plans". In: Proceedings of the Linux Symposium. Vol. 2. 2007, pp. 21–33.

[13] Takashi Sato. "ext4 online defragmentation". In: Proceedings of the Linux Symposium. Vol. 2. Citeseer, 2007, pp. 179–186.

[14] Vamsee Kasavajhala. "Solid state drive vs. hard disk drive price and performance study". In: Proc. Dell Tech. White Paper (2011), pp. 8–9.

[15] Xiao-Yu Hu et al. "Write Amplification Analysis in Flash-based Solid State Drives". In: Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference. SYSTOR '09. Haifa, Israel: ACM, 2009, 10:1–10:9. ISBN: 978-1-60558-623-6. DOI: 10.1145/1534530.1534544.

[16] Yuan-Hao Chang et al. "Endurance Enhancement of Flash-memory Storage Systems: An Efficient Static Wear Leveling Design". In: Proceedings of the 44th Annual Design Automation Conference. DAC '07. San Diego, California: ACM, 2007, pp. 212–217. ISBN: 978-1-59593-627-1. DOI: 10.1145/1278480.1278533.

[17] Tasha Frankie et al. "A Mathematical Model of the Trim Command in NAND-flash SSDs". In: Proceedings of the 50th Annual Southeast Regional Conference. ACM-SE '12. Tuscaloosa, Alabama: ACM, 2012, pp. 59–64. ISBN: 978-1-4503-1203-5. DOI: 10.1145/2184512.2184527.

[18] Red Hat community. Beaker. 2017. URL: https://beaker-project.org/.

[19] Jens Axboe. Flexible Input/Output benchmark. 2017. URL: https://github.com/axboe/fio.

[20] Nitin Agrawal et al. "A five-year study of file-system metadata". In: ACM Transactions on Storage (TOS) 3.3 (2007), p. 9.


[21] Richard Jones. Library for generating HTML code. 2017. URL: https://pypi.python.org/pypi/html.

[22] Eric Jones et al. SciPy: Open source scientific tools for Python. 2001–. URL: http://www.scipy.org/.

[23] Sage A. Weil et al. "Ceph: A scalable, high-performance distributed file system". In: Proceedings of the 7th Symposium on Operating Systems Design and Implementation. USENIX Association, 2006, pp. 307–320.


A Reports

A.1 Test of medium utilisation using XFS on HDD

∙ System: Machine 1
∙ OS: RHEL-7.3
∙ Kernel: 3.10.0-514.el7.x86_64

Figure A.1: This chart shows the tendency of XFS to merge free space blocks into very large extents (up to 63 GiB). Furthermore, low fragmentation can be observed throughout the test, with quite large blocks of free space remaining even at the end. The chart displays the last 7 hours of the test.
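Free space fragmentation reports of this kind can be obtained with the xfs_db utility from xfsprogs (one of the packages installed by the kickstart in Example B.2). A minimal sketch of querying it from Python follows; the device path is a placeholder, and this illustrates the general approach rather than the test's own collection code.

import subprocess

def xfs_free_space_report(device):
    """Read-only xfs_db query; the 'freesp' command prints a histogram
    of free extent sizes on the given XFS device."""
    result = subprocess.run(["xfs_db", "-r", "-c", "freesp", device],
                            capture_output=True, text=True, check=True)
    return result.stdout

print(xfs_free_space_report("/dev/sdb1"))  # hypothetical test device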


[Chart: Histogram of used space fragmentation; extents: 333238, total files: 2069805, optimal files: 92%. Frequency by extent size bin; series: extents, optimal files.]

Figure A.2: On this chart, we can see that despite low fragmentation of free space, about 333 thousand file extents were present in the final block layout. However, 92% of files were still optimally allocated. Furthermore, we can see that most of the created extents are of the smallest possible size (4 kB, i.e. the file system block size).
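The headline numbers of this histogram (total extents, total files, and the share of optimally allocated, i.e. single-extent, files) can be gathered with filefrag from e2fsprogs. The sketch below illustrates the idea; the mount point is a placeholder, and the test's actual data processing tools are in the attached archive.

import os
import re
import subprocess

def extent_statistics(root):
    """Count extents per regular file with filefrag; a file stored in a
    single extent counts as optimally allocated."""
    total_files = total_extents = optimal_files = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            out = subprocess.run(["filefrag", os.path.join(dirpath, name)],
                                 capture_output=True, text=True).stdout
            match = re.search(r"(\d+) extents? found", out)
            if not match:
                continue
            extents = int(match.group(1))
            total_files += 1
            total_extents += extents
            optimal_files += extents == 1
    return total_files, total_extents, optimal_files

files, extents, optimal = extent_statistics("/mnt/aged-fs")  # placeholder path
print(f"extents: {extents}, total files: {files}, "
      f"optimal files: {100 * optimal // max(files, 1)}%")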


Figure A.3: On this chart, system utilisation over time is displayed. We can see a slight curvature of the line, meaning the growth of the file system is slowing down. This can be caused by the progressively more successful delete operation (with more files, there is a greater probability of deleting) and by progressively slower block-allocating operations such as create, append and random write.
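A utilisation-over-time chart like this one only requires periodic sampling of the file system's used space. A minimal sketch using os.statvfs; the sampling interval, duration and mount point are illustrative assumptions.

import os
import time

def log_utilisation(mount_point, interval=60, duration=3600):
    """Print one (timestamp, used-space percentage) sample per interval."""
    end = time.time() + duration
    while time.time() < end:
        st = os.statvfs(mount_point)
        used = 1.0 - st.f_bavail / st.f_blocks
        print(f"{int(time.time())} {used * 100:.1f}")
        time.sleep(interval)

log_utilisation("/mnt/aged-fs")  # placeholder mount point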


[Charts: Evolution of latency (two scales); Latency [ms] vs. Time [s]; series: random_write, truncate, read, create, random_read, append, delete.]

Figure A.4: Evolution of latencies of IO operations while testing XFS with medium utilisation. Progressively larger fluctuations can be seen in sequential read. The latency of the append operation seems to be slowly rising in the last hour of the test.

A.2 Test of high utilisation using XFS on HDD

∙ System: Machine 3
∙ OS: RHEL-7.3
∙ Kernel: 3.10.0-514.el7.x86_64

Figure A.5: On this chart, we can see the evolution of free space fragmentation in the last 5 hours of the test. When comparing this chart with Figure A.7, we can see that a peak in the number of extents always appears when the test is randomly deleting space. After that, XFS tries to merge small free space extents into larger ones and successfully defragments free space.


[Chart: Histogram of used space fragmentation; extents: 258478, total files: 1481317, optimal files: 91%. Frequency by extent size bin; series: fragments, optimal files.]

Figure A.6: On this chart, we can see that despite high fragmentation of free space, 91% of files were optimally allocated. The distribution of used space extents is similar to the distribution observed while testing with medium utilisation.


Figure A.7: File system utilisation over time. We can see that deletion of used space was triggered five times. It took the test approximately 10.5 hours to fill the available space (300 GB).


[Charts: Evolution of latency (two scales); Latency [ms] vs. Time [s]; series: random_write, truncate, read, create, random_read, append, delete.]

Figure A.8: Evolution of latencies of IO operations while testing XFS with high utilisation. Progressively larger fluctuations can be seen in the create and append operations. Some other operations, such as sequential read or truncate, show progressive growth, mainly after the file system reached its highest utilisation.

A.3 Test of high utilisation using ext4 on HDD

∙ System: Machine 3
∙ OS: RHEL-7.3
∙ Kernel: 3.10.0-514.el7.x86_64

Figure A.9: This chart shows the evolution of free space fragmentation in the last 5 hours of the test. When comparing this chart with Figure A.11, we can see that a peak in the number of extents always appears when the test is randomly deleting space. After that, ext4 fails to merge small free space extents into larger ones as effectively as XFS was observed to do.
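For ext4, an analogous free space fragmentation report is provided by e2freefrag (part of e2fsprogs, which the kickstart in Example B.2 installs). A sketch of the same kind of read-only query as shown for XFS in Section A.1; the device path is again a placeholder.

import subprocess

def ext4_free_space_report(device):
    """e2freefrag prints free-extent counts and a size histogram for ext4."""
    result = subprocess.run(["e2freefrag", device],
                            capture_output=True, text=True, check=True)
    return result.stdout

print(ext4_free_space_report("/dev/sdc1"))  # hypothetical test device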


[Chart: Histogram of used space fragmentation; extents: 274191, total files: 1364453, optimal files: 91%. Frequency by extent size bin; series: extents, optimal files.]

Figure A.10: On this chart, we can see the distribution of file extent sizes. Despite high fragmentation of free space during the test, the final block layout of the file system has around 274 thousand extents. 91% of files were optimally allocated. Furthermore, we can see that most of the created extents are of the smallest possible size (4 kB, i.e. the file system block size).


Figure A.11: The chart displays file system utilisation over time. We can see that deletion of used space was triggered 7 times. It took the test approximately 10.5 hours to fill the available space (300 GB).


[Charts: Evolution of latency (two scales); Latency [ms] vs. Time [s]; series: random_write, truncate, read, create, random_read, append, delete.]

Figure A.12: Evolution of latencies of IO operations while testing ext4 with high utilisation. Progressively larger fluctuations can be seen in the truncate and append operations.

A.4 Test of XFS on SSD with regular trimming

∙ System: Machine 2
∙ OS: RHEL-7.3
∙ Kernel: 3.10.0-514.el7.x86_64

Figure A.13: This chart shows the evolution of free space fragmentation throughout the whole test. When comparing this chart with Figure A.15, we can see that a large rise in the number of extents occurred after the test consumed all the available volume.


[Chart: Histogram of used space fragmentation; extents: 157448, total files: 623814, optimal files: 98%. Frequency by extent size bin; series: fragments, optimal files.]

Figure A.14: On this chart, we can see the distribution of file extent sizes. Despite high fragmentation of free space during the test, the final block layout of the file system has around 157 thousand extents. 98% of files were optimally allocated. Furthermore, we can observe that the extent size distribution is similar to the file size distribution used in the test. Please note that the smallest possible extent size in a file system is its block size (4 kB by default).


Figure A.15: The chart displays file system utilisation over time. The test filled the available volume (120 GB) in half an hour. It is difficult to say how many volume-deleting iterations were triggered, because the test managed to fill the released space between usage logging intervals.


[Charts: Evolution of latency (two scales); Latency [ms] vs. Time [s]; series: random_write, truncate, read, create, random_read, append, delete.]

Figure A.16: Evolution of latencies of IO operations while testing XFS on an SSD device with regular trimming performed.

A.5 Test of ext4 on SSD with regular trimming

∙ System: Machine 2
∙ OS: RHEL-7.3
∙ Kernel: 3.10.0-514.el7.x86_64

Figure A.17: This chart shows the evolution of free space fragmentation throughout the whole test. When comparing this chart with Figure A.19, we can see that a large rise in the number of extents occurred after the test consumed all the available volume.


[Chart: Histogram of used space fragmentation; extents: 151415, total files: 587011, optimal files: 98%. Frequency by extent size bin; series: fragments, optimal files.]

Figure A.18: On this chart, we can see the distribution of file extent sizes. Despite high fragmentation of free space during the test, the final block layout of the file system has around 151 thousand extents. 98% of files were optimally allocated. Furthermore, we can observe that the extent size distribution has a lower number of larger extents (>128 kB) while having similar numbers of smaller extents (4-32 kB).


Figure A.19: The chart displays file system utilisation over time. The test filled the available volume (120 GB) in half an hour. It is difficult to say how many volume-deleting iterations were triggered, because the test managed to fill the released space between usage logging intervals.


[Charts: Evolution of latency (two scales); Latency [ms] vs. Time [s]; series: random_write, truncate, read, create, random_read, append, delete.]

Figure A.20: Evolution of latencies of IO operations while testing ext4 on an SSD device with regular trimming performed. Some large fluctuations can be seen in the sequential read operation, but otherwise no meaningful trend in performance can be observed.

A.6 Test of XFS on SSD without regular trimming

∙ System: Machine 2
∙ OS: RHEL-7.3
∙ Kernel: 3.10.0-514.el7.x86_64

Figure A.21: This chart shows the evolution of free space fragmentation throughout the whole test. When comparing this chart with Figure A.23, we can see that a large rise in the number of extents occurred after the test consumed all the available volume. We can also see that the untrimmed XFS file system has slightly more free space extents than the trimmed XFS file system (Figure A.13).


[Chart: Histogram of used space fragmentation; extents: 83701, total files: 680154, optimal files: 98%. Frequency by extent size bin; series: extents, optimal files.]

Figure A.22: On this chart, we can see the distribution of file extent sizes. Despite high fragmentation of free space during the test, the final block layout of the file system has around 83 thousand extents. 98% of files were optimally allocated. Furthermore, we can observe that the untrimmed XFS file system has more large file extents than the trimmed one. This can be caused by the underlying device being slow, which in turn gives the file system more time to schedule its requests better.


Figure A.23: The chart displays file system utilisation over time. The test filled the available volume (120 GB) in half an hour. Unlike in the trimmed tests, random file-deleting iterations are observable on this graph, because the device was not able to fill the volume between data collecting intervals. This is a clear sign that the untrimmed file system is significantly slower.


[Charts: Evolution of latency (two scales); Latency [ms] vs. Time [s]; series: random_write, truncate, read, create, random_read, append, delete.]

Figure A.24: Evolution of latencies of IO operations while testing XFS on an SSD device without regular trimming performed. Progressive degradation can be observed in the create operation. There may be a small growth and progressively larger fluctuations in sequential read, random read and random write.

B Examples

Example B.1: Specifying OS to be installed in Beaker environment
[Listing: Beaker job XML.]

Example B.2: Configuring environment and installing packages using kickstart
[Listing: kickstart embedded in Beaker job XML; the %packages tail follows.]


[...]
openssh-clients
bc
screen
nfs-utils
seekwatcher
sysstat
xfsprogs
e2fsprogs
hdparm
sdparm
gcc
tuned
cpufrequtils
cryptsetup-luks
vim-enhanced
rsync
lvm2
smartmontools
git
iotop
%end
]]>

Example B.3: Configuring single device and file system ext4 using storage generator
[Listing: storage generator configuration.]

Example B.4: Scheduling performance test of aged file system in Beaker environment
[Listing: Beaker job XML.]


Example B.5: Scheduling file system aging task in Beaker environment
[Listing: Beaker job XML.]


C Source code

The source code of the implemented tests, the data processing tools and the generated reports can be found in the archive Appendix.tar.xz. The archive contains the following directories:

∙ drift_job: Contains the implementation of the automated aging test workflow.
∙ recipe_fio_aging: Contains the implementation of the automated performance testing workflow for aged file systems.
∙ data_processing: Contains tools for automated data processing.
∙ fs-drift: Contains the used version of fs-drift.
∙ reports: Contains the generated reports with interactive charts.
