Ext4 File System Performance Analysis in Linux Environment
Recent Advances in Applied & Biomedical Informatics and Computational Engineering in Systems Applications
ISBN: 978-1-61804-028-2

BORISLAV DJORDJEVIC, VALENTINA TIMCENKO
Mihailo Pupin Institute, University of Belgrade
Volgina 15, Belgrade, SERBIA
[email protected], [email protected]

Abstract: This paper considers the characteristics and behavior of the modern 64-bit ext4 file system under the Linux operating system, kernel version 2.6. It also provides a performance comparison of the ext4 file system with the earlier ext3 and ext2 file systems. The testing procedures and subsequent analysis are performed using the Postmark benchmark application.

Key-Words: Linux, file systems, ext4/ext3/ext2, journaling, inodes, file block allocation, disk performance

1 Introduction
The ext4 file system is the successor of the ext3 file system. It is supported on today's most popular Linux distributions (RedHat, Ubuntu, Fedora). The 32-bit ext3 file system [1], [2] only added some features to its predecessor ext2 and kept the ext2 data structures; in contrast, ext4 integrates more substantial changes. Ext4 has an improved data structure and enhanced features, which bring more reliability and efficiency. It is a 64-bit file system, allowing file sizes of up to 16 TB [3], [4], [5], [6]. The great effort put into ext4 development resulted in new features and techniques: extents, journal checksumming, simultaneous allocation of multiple blocks, delayed allocation, faster fsck (file system check), and online defragmentation of small and large directories. Directories formed this way can hold up to 64,000 files.

2 Problem Formulation
The main objective of this study was to examine the performance of a modern file system, ext4, and to compare its characteristics and performance with its predecessors, the ext3 and ext2 file systems. The ext4 file system includes many improvements, especially compared to ext3, but being a 64-bit file system, ext4 is also more cumbersome. This characteristic opens the possibility of obtaining some unexpected results from the test procedures.

2.1 Workload specification
File-based workloads are designed for testing file system procedures and usually consist of a large number of file operations (creation, reading, writing, appending and file deletion). A workload comprises a large number of files and data transactions. It can be generated synthetically, by benchmark software, or by working with real data applications. Workload characterization is a hard research problem, as arbitrarily complex patterns can frequently occur. In particular, some authors choose to emphasize support for spatial locality in the form of runs of requests to contiguous data, and temporal locality in the form of bursty arrival patterns.

Some authors [7] model three different arrival processes:
• Constant: the interarrival time between requests is fixed
• Poisson: the interarrival time between requests is independent and exponentially distributed
• Bursty: some of the requests arrive sufficiently close to each other that their interarrival time is less than the service time.

2.2 Access time for ext4 and ext3 file systems
In this subsection we present the access time comparison for the 64-bit ext4 and its 32-bit predecessors, ext3 and ext2. The expected access time of a file system, for a specified workload, comprises the following components:

E[FS_access] = E[directory_op] + E[metadata_op] + E[free_list_mngm] + E[direct_file_access]    (1)

where FS_access represents the total time the workload needs to carry out all operations; directory_op represents the total time for all operations related to working with directories (search, creation of new objects and deletion of existing objects); metadata_op represents the total time needed for metadata operations (metadata search, metadata cache forming, and modification of metadata objects); free_list_mngm represents the total time needed for operations on the free block and inode lists (file expansion/shrinking, creation of new objects or deletion of existing objects); and direct_file_access is the time required for direct file block operations (read/write).

Directory operations, metadata operations and direct file accesses go through the cache, and their performance directly depends on it. Under these conditions, the expected effective access time would be:

E[Cache_Service_Time] ≈ Hit_prob · E[request_size] / Cache_Transfer_Rate + Miss_prob · E[Mech_Service_Time]    (2)

Mech_Service_Time is the total time needed to perform the disk data transfer. In the case of a cache miss, performance depends strictly on the disk characteristics and consists of a number of time-based components:

Mech_Service_Time = Taccess_time + Tmedia_time + TInterface_time    (3)

where Taccess_time is the total time needed by the mechanical components of the disk transfer, Tmedia_time is the total time needed for write/read operations on the disk medium, and TInterface_time is the total time needed for read/write operations on the disk cache buffer.

Taccess_time = CommandOverhead + SeekTime + SettleTime + RotationalLatency    (4)

where CommandOverhead is the time required for decoding disk commands, SeekTime is the time needed for positioning the disk servo system, SettleTime is the time required for disk head stabilization, and RotationalLatency is the time spent waiting on disk rotation.

There are three dominant components, whose sum can be presented as Mech_Service_Time, the service time for a request related to the disk mechanism. These components are: (1) the seek time (SeekTime), which is the amount of time needed to move the disk heads to the desired cylinder; (2) the rotational latency (RotationalLatency), which is the time the platter needs to rotate to the desired sector; and (3) the transfer time (TT), the same as the Tmedia time, which is the time needed to transfer the data from the disk mechanism to the next higher level. Since these are typically independent variables, we can approximate the expected value of the disk mechanism service time as:

E[Mech_Service_Time] = E[SeekTime] + E[RotationalLatency] + E[TT[request_size]]    (5)

The transfer time (TT) is a function of two parameters: Transfer_rate, which is the rate at which data is transferred off/onto the disk, and E[request_size]. The function can be approximated as:

TT[request_size] = E[request_size] / E[Transfer_rate]    (6)

Variations will occur as a result of track and cylinder switches and of different track sizes in different zones of the disk.

The SeekTime can be approximated as the following function of dis, the number of cylinders to be travelled:

SeekTime[dis] = 0             for dis = 0
                a + b·√dis    for 0 < dis ≤ e    (7)
                c + d·dis     for dis > e

where a, b, c, d, and e are disk-specific parameters.

If we assume that the requests are randomly distributed over the sectors of the given cylinder using a uniform distribution, then:

E[Rotational_latency] = revolution_time / 2    (8)

2.4.1 Expected behaviour of ext4/ext3 file systems
Formula (1) is appropriate for non-journaling file systems. For journaling-based file systems, such as ext4 and ext3, one additional member has to be added, E[journaling_time], which can increase reliability but can also have a negative impact on global performance:

E[FS_access] = E[directory_op] + E[metadata_op] + E[free_list_mngm] + E[direct_file_accesses] + E[journaling_time]    (9)

journaling_time is the total time needed to perform the journaling operations (metadata writes to the log, and discharge of the metadata log).

At first glance we could assume that journaling techniques decrease file system performance, but this statement cannot be taken for granted. Instead of a mathematically simulated workload, we applied a synthetically generated workload in the Postmark benchmark environment.

3 Problem Solution
For the purpose of testing the chosen file systems we used the Postmark benchmark [8] software, which simulates an Internet mail server load. Postmark creates a large initial set (pool) of randomly generated files and saves them at an arbitrary location in the file system. Over this set of files, Postmark and the operating system perform operations of creation, reading, appending and deletion of files, and determine the time required to perform these operations. The operations are performed

3.1 Postmark Test1 (small files)
The files used for this testing procedure are relatively small, ranging from 1 KB to 100 KB. The Postmark configuration used for this test was: file size range from 1,000 to 100,000 bytes, 4,000 generated files, and 50,000 performed transactions. Performance results for each file operation (creation, creation/alone, creation/with transactions, reading, appending, deleting, deleting/alone, deleting/with transactions) are given in Figures 3.1a and 3.1b. The operations delete/alone and create/alone are performed in special benchmark phases that do not involve transactions; even so, they show high values.

[Figure 3.1a: File operation results (files/s) for ext2, ext3 and ext4: creation, creation/with transactions, read, appending, deleting, deleting/with transactions]
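Postmark is driven by a small command language. Assuming the standard Postmark command set, the Test1 parameters described above (1,000-100,000 byte files, 4,000 files, 50,000 transactions) would correspond to a script along the following lines; the target directory path is an assumption for illustration:

```
set size 1000 100000
set number 4000
set transactions 50000
set location /mnt/testfs
run
quit
```

Running the same script against ext2, ext3 and ext4 mounts of the same partition keeps the workload identical across the compared file systems.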
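The access-time model of Eqs. (1) and (9) can be sketched as a simple sum of expected component times. This is a minimal illustrative sketch, not code from the paper; all names mirror the symbols in the equations.

```python
# Sketch of Eqs. (1) and (9): expected file-system access time as a sum of
# expected per-component times, all in the same time unit (e.g. seconds).
# journaling_time = 0 reproduces Eq. (1) (non-journaling file systems, ext2);
# a positive journaling_time gives Eq. (9) (journaling file systems, ext3/ext4).

def fs_access_time(directory_op, metadata_op, free_list_mngm,
                   direct_file_access, journaling_time=0.0):
    return (directory_op + metadata_op + free_list_mngm
            + direct_file_access + journaling_time)
```

Comparing the same workload with and without the journaling term makes the journaling overhead of ext3/ext4 relative to ext2 explicit in the model.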
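The cache model of Eq. (2) weights the cache transfer time by the hit probability and the mechanical service time by the miss probability. A minimal sketch, with purely illustrative parameter values:

```python
# Sketch of Eq. (2): expected effective access time for cache-mediated accesses.
# hit_prob is the cache hit probability; miss_prob = 1 - hit_prob.

def cache_service_time(hit_prob, request_size, cache_transfer_rate,
                       mech_service_time):
    miss_prob = 1.0 - hit_prob
    return (hit_prob * request_size / cache_transfer_rate
            + miss_prob * mech_service_time)

# Illustrative example: 90% hit rate, 64 KiB requests,
# 200 MiB/s cache path, 8 ms mechanical service time.
t = cache_service_time(0.9, 64 * 1024, 200 * 1024**2, 8e-3)
```

With hit_prob = 1 the expression reduces to the pure cache transfer time, and with hit_prob = 0 to the mechanical service time, matching the two limiting cases of Eq. (2).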
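The disk-mechanism model of Eqs. (5)-(8) can likewise be sketched directly. The seek parameters a, b, c, d, e below are assumed illustrative values, not measurements from the paper, and the square-root law for short seeks follows the standard piecewise seek-time approximation assumed in Eq. (7).

```python
import math

def seek_time(dis, a=1e-3, b=0.1e-3, c=4e-3, d=0.01e-3, e=400):
    """Eq. (7): piecewise seek time over `dis` cylinders (parameters assumed)."""
    if dis == 0:
        return 0.0
    if dis <= e:
        return a + b * math.sqrt(dis)   # short seeks: square-root law
    return c + d * dis                  # long seeks: linear law

def rotational_latency(rpm=7200):
    """Eq. (8): expected latency is half a revolution (uniform sector position)."""
    revolution_time = 60.0 / rpm
    return revolution_time / 2

def transfer_time(request_size, transfer_rate):
    """Eq. (6): transfer time as request size over transfer rate."""
    return request_size / transfer_rate

def mech_service_time(dis, request_size, transfer_rate, rpm=7200):
    """Eq. (5): sum of the three dominant mechanical components."""
    return (seek_time(dis) + rotational_latency(rpm)
            + transfer_time(request_size, transfer_rate))
```

For a 7200 rpm disk the expected rotational latency is (60/7200)/2 ≈ 4.17 ms, which typically dominates small-request service times.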