IJCSMS (International Journal of Computer Science & Management Studies), Vol. 14, Issue 11, November 2014 (An Indexed and Referred Journal), ISSN (Online): 2231-5268, www.ijcsms.com

File Systems Performance on Solid-State Drive

Julian Fejzaj1, Kristo Kapshtica2, Denis Saatciu3, Igli Tafa4 and Endri Xhina5

1 Department of Informatics, Faculty of Natural Sciences, University of Tirana [email protected]

2Computer Enginnering Department, Faculty of Information and Technology, Polytechnic University2

3 Department of Informatics, Faculty of Natural Sciences, University of Tirana [email protected]

4 Department of Informatics, Faculty of Natural Sciences, University of Tirana [email protected]

5Department of Informatics, Faculty of Natural Sciences, University of Tirana [email protected]

Abstract

Most people have to decide between a solid-state drive (SSD) and a hard disk drive (HDD) as the data storage for their computer. Who would not like a storage device that speeds the computer up in several ways and comes at a reasonable price? The SSD is such a device, offering the same functionality as a hard disk drive. Beyond that, I am interested in which file systems perform best on SSD architectures, so I will demonstrate this using Bonnie++ and the Linux file systems described below. It is important to know which of them we should use on our SSD.

Keywords: Linux file systems, solid-state drive, SSD, Bonnie++

1. Introduction

A solid-state drive is mechanically and electrically compatible with a conventional hard drive [4]. The difference is that the storage is not magnetic (HDD) or optical (CD) but a solid-state semiconductor such as RAM, PRAM or other electrically erasable RAM. This provides faster access than a hard drive, because SSD data can be accessed randomly in the same time regardless of its storage location [4]. A solid-state drive stores information in microchips, like a memory stick, so it has no moving parts, unlike a hard disk drive, which uses a mechanical arm with a read/write head that moves around a rotating storage platter to read information. This architecture makes the SSD faster than the HDD. Most solid-state drives use NAND-based flash memory, which is non-volatile: you can switch the drive off and it can still "remember" all the data stored on it many years later, unlike an HDD, which can lose data after only a few years. In short, data stored on an SSD can outlive you. (SSDs based on volatile RAM exist for applications requiring fast access but not necessarily data persistence after power loss; such devices may employ separate power sources, such as batteries, to maintain data after power loss.) Because of their architecture, SSDs consume less power (under 2 W versus about 6 W for an HDD), since no electricity is needed to rotate platters as in a hard disk; consequently there is less heat and noise, which also means longer battery life for notebooks. As said, an SSD is a memory chip built from integrated circuits (controller, cache and capacitor) with an interface connector, which makes it lighter than an HDD, which contains a platter (rotating disk), a spindle and a motor [8][9][10]. On the other hand, HDDs reach capacities of 500 GB-2 TB, better than SSDs, which rarely exceed 512 GB. The biggest disadvantage of the solid-state drive, however, is the large difference in price: an SSD costs about $1 per gigabyte compared with $0.075 per gigabyte for an HDD [5][6]. Nevertheless, solid-state drives are pushing powerfully into territory held for decades by hard drives. In conclusion, if money is secondary and computer performance and fast booting are primary, I suggest you use a solid-state drive.


2. Related Works

Besides Linux, the operating system used in this work, there are other operating systems such as Windows and Mac OS, each with its own file systems: FAT and FAT32 on Windows, HFS+ on Mac. Different operating systems have different ways of testing which of their file systems performs best on a solid-state drive. One such effort is the work [12] by Patrick Schmid and Achim Roos, who tested the Windows file systems FAT32, NTFS and exFAT with several programs (AS SSD, CrystalDiskMark, Iometer and PCMark 7); moreover, they used two different SSDs for greater accuracy. Almost all of these programs produced the same ranking of the file systems regardless of the SSD used. Another work, this time on Linux, which gave me great assistance, is the benchmarking done by Phoronix Media [7] with their program PTS (Phoronix Test Suite), which runs on Linux and serves to test computer hardware. They tested the most used Linux file systems, among them xfs and reiserfs, measuring read/write performance, creation and deletion of files or data, synchronization, number of threads, disk transactions, etc. Phoronix also used two SSDs to increase the accuracy of the test, and both SSDs showed almost the same results in all of the above tests [7]. Differently from Phoronix, I used Bonnie++ to test my Crucial SSD, because it is open source and more specific to Linux file systems on storage disks.

3. Theoretical Phase

Most file system code exists in kernel space, with a part in user space. The figure below shows the architecture of the relationship between file systems in kernel and user space.

Figure 1. Kernel-user architecture

In user space sit the applications and glibc, which provides the user interface for the file system calls (open, read, write, close). The system call interface works like a switch: it passes system calls from user space to the appropriate endpoints in the kernel [1]. The virtual file system (VFS) exports a group of interfaces and dispatches them to the individual file systems, each of which proceeds differently from the others. Two caches, for inodes and directory entries, keep a stack of recently used file system objects. Individual file systems such as ext4 and nilfs export a group of interfaces that the VFS uses. The buffer cache is managed as a group of LRU (least recently used) lists; it speeds up requests, such as reads and writes, between the file systems and the device drivers they drive.
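To make this user-space-to-kernel path concrete, the short session below is a minimal sketch (the traced command and file are arbitrary examples, not part of the original experiment) that uses strace to show the file-related system calls glibc issues on behalf of a simple command; each call crosses the system call interface into the VFS, which dispatches it to the concrete file system:

# trace only the file-related system calls made by cat
strace -e trace=open,read,write,close cat /etc/hostname
# abridged typical output:
#   open("/etc/hostname", O_RDONLY)  = 3
#   read(3, "myhost\n", 32768)       = 7
#   write(1, "myhost\n", 7)          = 7
#   close(3)                         = 0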

3.1. File Systems

A file system is an organization of data and metadata that an operating system uses to keep track of files on a storage device. The system used in this paper is Linux, which brings to mind the phrase: "On a UNIX system, everything is a file; if something is not a file, it is a process." The proc file system (a pseudo-filesystem that provides an interface to kernel data structures) is mounted on /proc; there, in the file /proc/filesystems, we can find which file systems our kernel currently supports. In order to use them, we have to mount them.

Figure 2. Linux file systems

Below I present a short description of some of the most available file systems:

minix is the file system used in the Minix operating system, the oldest and one of the most reliable, but quite limited in features. It remains useful for floppies and RAM disks.

ext is a modification of minix that lifts the limits on the file system size. It is not very popular, but works well. It has been removed from the kernel (in 2.1.21).

ext2 is the most featureful and highest-performing disk file system for fixed disks as well as removable media. It was designed as an extension of ext and to be easily compatible, meaning that the new version of the file system does not require remaking an existing one. ext2 offers the best performance in speed and CPU usage.

ext3 has all the characteristics of ext2 but adds journaling, which improves performance and recovery time when the system crashes. It is more popular than ext2.

ext4 is an advanced level of ext3 with a set of upgrades, including substantial performance and reliability enhancements and large increases in volume, file (64-bit) and directory size limits.

reiserfs, one of the most powerful file systems, designed by Hans Reiser, is a journaling file system that reduces loss of data. Journaling is a mechanism whereby all performed operations are logged; this allows the file system to rebuild itself after damage.

xfs is a journaling 64-bit file system developed by SGI. It was designed to maintain high performance with large files and was integrated into Linux in kernel 2.4.20.

jfs is a journaling file system developed by IBM to work in high-performance environments; it was integrated into Linux in kernel 2.4.24.

ntfs includes a number of user-space utilities called ntfsprogs, such as mkntfs and ntfsundelete.

btrfs is a new copy-on-write file system for Linux aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration.

nilfs is a new implementation of a log-structured file system (LFS) supporting continuous snapshotting [2][3].

zfs can support access control lists (ACLs); it was designed by Sun Microsystems and includes both a file system and a logical volume manager.
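Before moving on to the experiments, the point above about /proc/filesystems and mounting can be made concrete with a short shell session; this is an illustrative sketch (the device name /dev/sda6 matches the test partition used later, and the mount point /mnt/ssd is an assumption):

# list the file systems the running kernel supports;
# "nodev" marks pseudo-filesystems such as proc that need no block device
cat /proc/filesystems

# mount an ext4-formatted partition; noatime avoids an extra metadata
# write on every read, a common choice for SSDs
sudo mkdir -p /mnt/ssd
sudo mount -t ext4 -o noatime /dev/sda6 /mnt/ssd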

4. Experimental Phase

In this phase I demonstrate an experiment that tests some of the most available file systems and determines which of them performs best on the solid-state drive.

A. Hardware environment

The hardware environment I used has these parameters:
- CPU: Intel(R) Core(TM) i3 CPU 2.4 GHz (4 CPUs)
- RAM: 4 GB Kingston
- SSD: 128 GB Crucial, SATA-3

B. Software environment

As mentioned before, the operating system I used is Ubuntu 12.04 LTS, with Bonnie++ as the benchmarking tool for comparing Linux file system performance on the SSD. Bonnie++ works in two ways: it tests the I/O of one large file, trying to simulate how some hardware, database or application works, and it tests the reading, creation and deletion of many small files [11]. I installed the program with:

sudo apt-get install bonnie++

After installation, I defined where the results of the standard Bonnie++ test will be stored:

bonnie++ -d /dir/on/ssd/partition -n

Then I ran the read/write performance test in the directory where my partition (/dev/sda6) is mounted:

[root@slashroot2 ~]# bonnie++ -y -s

which reveals the results shown in Figure 3.
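For reference, a complete Bonnie++ invocation could look like the sketch below; the mount point /mnt/ssd, the label and the numeric values are illustrative assumptions, not the exact parameters of the runs reported here:

# -d: directory on the SSD partition under test
# -s: size of the I/O test file in MB (at least twice the RAM, here 2 x 4 GB)
# -n: number of files (in multiples of 1024) for the create/read/delete test
# -m: label that tags this run in the results
# -u: unprivileged user to run as when the command is issued by root
bonnie++ -d /mnt/ssd -s 8192 -n 128 -m ssd-ext4 -u nobody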


Figure 3. Read-write performance test

The final line shows sequential output, sequential input and random seeks. I used -y for synchronization of the data on the SSD and -s to specify the file size used for the test. The file system was created before each test on the prepared partition, and some data was copied to a small part of the partition to simulate a used file system.

C. The Best File System

As said above, Bonnie++ tests I/O. On a solid-state drive the preferred I/O scheduler is "noop", so no I/O scheduling is done in the kernel; in this case we rely on the scheduling logic in the hardware, which is the good choice for SSD disks. Some file systems I could not use: nilfs, because it is not ready for use yet and I could not test it (though it has a promising future), and reiser4, which is not supported by the Ubuntu kernel. Now let us see which of these Linux file systems performs best on the solid-state drive with this scheduler chosen. This benchmark was performed for the most available file systems on Ubuntu 12.04, and the tests I made cover the most important situations we need. In all tests the values are expressed in thousands of operations per second, and bigger is better.

The first test concerns random seeks, which are important for low-latency systems; the benchmark result is:

Figure 4. Random seek test

As we can see from the chart, the best file system for random seeks is ext4, ahead of xfs and one other file system that posted the same result. These file systems are the best choice for low-latency systems, while the others are slower: btrfs about 10% slower, then ext3 and ext2. Zfs is the worst file system, about six times slower than ext4.

The second test measures one of the most important aspects, read/write performance; the results are:

Figure 5. Write performance



Figure 6. Read performance

Looking at the two charts above, the results for reading and writing data are almost equal: the btrfs file system is much better for reading and writing data on the SSD than the rest of the file systems; after btrfs come reiserfs and ext4, which are good enough. Only ext2 and zfs show the worst performance, notably on random reads, which are essential in modern computer usage. The next test concerns the creation and deletion of files, which matters most for disk storage, USB storage and other storage devices. The charts are:

Figure 7. Creating file

Ext4 is the fastest file system for both sequential and random file creation, followed by ext3 for sequential creation and btrfs for random creation. Reiserfs is good enough, while the other file systems are really slow at creating files and are not suggested for SSD workloads that create many files.

Figure 8. Deleting file

On this chart it is impressive to see the ext2 file system doing much better than the others, for the first time in all of the benchmark tests: it is about seven times faster than every other file system at sequential deletion, which is remarkable. Even in this chart btrfs is among the best file systems, followed by ext3, ext4 and reiserfs, with xfs and zfs at the end. For random deletion of files btrfs is again the best; after it, ext3 and ext4 are almost equal, with a difference of only about 10%, while xfs and zfs are, as always, the slowest file systems.

5. Conclusion

Solid-state drives have been displacing hard disk drives in recent years because of the advantages mentioned at the beginning of this article. But it is important to know which file system we should use, in this case on the Ubuntu 12.04 Linux kernel, to get the best performance from the SSD. Based on the charts, two of the file systems I used should be considered the best of all: btrfs and ext4 perform excellently on an SSD disk, leaving consumers happy about the speed they provide compared with a normal disk. Btrfs is better for reading/writing data and for deletion of files, while ext4 is better for random seeks and creation of files. Reiserfs is good enough and the most mature of them. Xfs and ext3 do reasonably well in some of the tests, such as the random seek and reading/writing tests, while ext2 and zfs have the worst results and are not a good option at all. Ext4 has been the default file system since Ubuntu 9.10; on an SSD disk it boots really fast. But btrfs is becoming even faster than ext4, which promises an excellent future thanks to the features it offers: nice internal algorithms and structures, compression, integrated volume management, file system snapshots with roll-back and per-file version exploration, mirroring and striping. Eventually, after this benchmarking, I am going to migrate the system partition in this virtual machine to the btrfs file system and use it on all my SSD disks.

6. Future Work

I chose this work because of the large uptake of SSDs at the expense of HDDs, which had led the market for decades, even though SSDs cost more, and I wanted to know which file system performs best on a solid-state drive. In my next work I will examine which I/O scheduler works best with the file systems used here, again on the Linux kernel.
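For reference, the "noop" scheduler used in the experiments above is selected per block device through sysfs; a minimal session, assuming the SSD appears as /dev/sda, looks like this:

# show the available schedulers; the active one is in brackets
cat /sys/block/sda/queue/scheduler
# e.g.: noop deadline [cfq]

# switch to noop so the kernel passes requests straight through
# and ordering is left to the drive's own controller
echo noop | sudo tee /sys/block/sda/queue/scheduler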

References

[1] Anatomy of the Linux File System - M. Tim Jones.

[2] The Linux Programming Interface - Michael Kerrisk.

[3] eMMC/SSD File System Tuning Methodology - M. Filippov, K. Kozhevnikov, D. Semyonov.

[4] What's an SSD? (editor, 2002) - Zsolt Kerekes.

[5] Solid State Drive vs. Hard Disk Drive Price and Performance Study - Vamsee Kasavajhala.

[6] Hard disk drives vs. solid-state drives: Are SSDs finally worth the money? - Lucas Mearian.

[7] Phoronix Media - www.phoronix.com.

[8] Introducing Solid State Drives - Bertrand Dufrasne, Kerstin Blum, Uwe Dubberke.

[9] Your Guide To Solid State Drives - Lachlan Roy.

[10] SSD vs HDD - Why Solid State Drive - OCZ.

[11] Bonnie++ - Googluxe.

[12] Does Your SSD's File System Affect Performance? - Patrick Schmid, Achim Roos.



APPENDIX

Standard code used with Bonnie++ to benchmark Linux file system performance on the solid-state drive:

a) Configuration

# device (partition) that I want to test
DEV="/dev/sda6"
# path for saving results
RESULTS="/root/benchmark_results"
# file systems to test, in order
FS="btrfs ext2 ext3 ext4 reiserfs xfs zfs"
# benchmarks to run (Bonnie++ and a generic tar/cp/dd test)
BM="bonnie generic"
# kernel source tarball used by the generic test
TARBALL="/root/trash/linux-3.5.7.tar.bz2"
# size of the test file (MB)
SIZE="1024"
# loop device (for the optional loop-AES runs)
LOOP="/dev/loop5"
# the following definitions were lost in the extracted listing;
# the values below are reconstructed defaults
DIR="/mnt/benchmark"        # mount point for the file system under test
PWFILE="/root/bench.pw"     # loop-AES password file
KEYFILE="/root/bench.key"   # loop-AES key file
DEBUG=""                    # set to "echo" for a dry run
ALG=""                      # loop-AES ciphers to test; empty disables crypto runs
PREALLOC="125"              # loop driver preallocation
module="loop"               # extra module loaded by s_crypto_pre

b) Functions

# make a file system on $1 and record its mount options
m_btrfs () {
  # the extracted listing showed "mkbtrfs -qff" with reiserfs-style options;
  # mkfs.btrfs with plain noatime is the safe equivalent
  $DEBUG mkfs.btrfs $1 1>/dev/null
  MOUNTOPTIONS="-o noatime"
}
m_ext2 () {
  $DEBUG mkfs.ext2 -Fq $1
  MOUNTOPTIONS="-o noatime"
}
m_ext3 () {
  $DEBUG mkfs.ext3 -Fq $1
  MOUNTOPTIONS="-o noatime,data=ordered"
}
m_ext4 () {
  $DEBUG mkfs.ext4 -Fq $1
  MOUNTOPTIONS="-o noatime"
}
m_reiserfs () {
  $DEBUG mkreiserfs -qff $1 1>/dev/null
  MOUNTOPTIONS="-o noatime,notail"
}
m_xfs () {
  $DEBUG mkfs.xfs -fq $1
  MOUNTOPTIONS="-o noatime"
}
m_zfs () {
  # the mkfs name was lost in the extracted listing; the "integrity" mount
  # option and the fsck.jfs call in the info section suggest jfs tooling
  $DEBUG mkfs.jfs -q $1 1>/dev/null
  MOUNTOPTIONS="-o noatime,integrity"
}

# before each benchmark: mount the freshly made file system
b_pre () {
  $DEBUG dmesg -n1
  $DEBUG mount $MOUNTOPTIONS -t "$i" "$1" "$DIR"
  $DEBUG dmesg -n7
}

# Bonnie++ benchmark
b_bonnie () {
  if [ -x "`which bonnie++`" ]; then
    $DEBUG mkdir -p $DIR/bonnie 1>/dev/null
    $DEBUG chown -R nobody:nogroup $DIR/bonnie 1>/dev/null
    $DEBUG bonnie++ -f -s "$SIZE"m -r 0 -n 100:10240:10 -m $i-$j \
      -u nobody:nogroup -q -d $DIR/bonnie \
      1> $RESULTS/bonnie-$i-$j.csv 2>> $RESULTS/error_bonnie.log
  else
    echo "Bonnie++ skipped!"
  fi
}

# generic Linux test: untar, find, copy, remove, dd to and from disk
b_generic () {
  $DEBUG mkdir -p $DIR/generic/tarball 1>/dev/null
  $DEBUG cp $TARBALL $DIR/generic/linux.tar.bz2
  cd $DIR/generic 1>/dev/null
  /usr/bin/time -f "TAR -xjf @ $i-$j: %e real,%U user,%S sys" \
    $DEBUG tar -xjf linux.tar.bz2 -C tarball/ 2> $RESULTS/generic-$i-$j.txt
  $DEBUG sync 1>/dev/null
  /usr/bin/time -f "FIND . @ $i-$j: %e real,%U user,%S sys" \
    $DEBUG find tarball 1>/dev/null 2>> $RESULTS/generic-$i-$j.txt
  $DEBUG sync 1>/dev/null
  /usr/bin/time -f "CP @ $i-$j: %e real,%U user,%S sys" \
    $DEBUG cp -a tarball tarball_copy 1>/dev/null 2>> $RESULTS/generic-$i-$j.txt
  $DEBUG sync 1>/dev/null
  /usr/bin/time -f "RM -rf @ $i-$j: %e real,%U user,%S sys" \
    $DEBUG rm -rf tarball 1>/dev/null 2>> $RESULTS/generic-$i-$j.txt
  $DEBUG sync 1>/dev/null
  /usr/bin/time -f "DD "$SIZE"MB to disk @ $i-$j: %e real,%U user,%S sys" \
    $DEBUG \dd if=/dev/zero of=test.img bs=1M count="$SIZE" 1>/dev/null \
    2>> $RESULTS/generic-$i-$j.txt
  $DEBUG sync 1>/dev/null
  /usr/bin/time -f "DD "$SIZE"MB from disk @ $i-$j: %e real,%U user,%S sys" \
    $DEBUG \dd if=test.img of=/dev/null bs=1M count="$SIZE" 1>/dev/null \
    2>> $RESULTS/generic-$i-$j.txt
  $DEBUG sync 1>/dev/null
  cd $HOME && $DEBUG rm -rf $DIR/generic
}

# after each benchmark: flush and unmount
b_post () {
  $DEBUG sync 1>/dev/null
  $DEBUG sleep 5s 1>/dev/null
  $DEBUG umount $DIR
  echo ""
}

# set up loop-AES: load modules, generate key file and password
s_crypto_pre () {
  $DEBUG dmesg -n1
  $DEBUG rmmod loop_blowfish loop_serpent loop_twofish loop 2>/dev/null
  $DEBUG modprobe loop lo_prealloc="$PREALLOC"
  $DEBUG modprobe $module
  $DEBUG dmesg -n7 1>/dev/null
  # write keyfile and password
  echo "Pd1eXapMJk0XAJnNSIzE" > $PWFILE
  head -c 2925 /dev/urandom | uuencode -m - | head -n 66 | tail -n 65 | \
    gpg -q --passphrase-fd 3 --symmetric -a > $KEYFILE 3<$PWFILE
}

# attach the loop device with cipher $j
s_crypto () {
  $DEBUG losetup -p 3 -e $j -K $KEYFILE $LOOP $DEV 3<$PWFILE
}

# detach the loop device
s_crypto_post () {
  $DEBUG losetup -d $LOOP 2>/dev/null
}

c) Program

# the script must be called with its full path (it copies itself later)
if [ ! "`echo "$0" | grep ^\/`" ]; then
  echo "Please specify the full path to "$0"!"
  exit 1
fi

# reusing an old password is not advisable
if [ -f $KEYFILE -o -f $PWFILE ]; then
  echo "$KEYFILE and/or $PWFILE still exist, please delete them!"
  exit 1
fi

if [ "$1" = "-F" -o "$2" = "-F" ]; then
  echo "Execution forced with -F!"
else
  echo -n "This procedure will delete everything on $DEV, continue? (Yes|No) " && read a
  if [ ! "$a" = "Yes" ]; then
    echo "Aborted on your request"
    exit 1
  fi
fi

/sbin/losetup -a > /dev/null 2>&1 || (echo "Please install loop-aes-utils" && exit 1)

# the extracted listing inverted this test; bonnie++ must be present
if [ ! -x "`which bonnie++ 2>/dev/null`" ]; then
  echo "One or more of the following packages are missing:"
  echo " - bonnie++"
  exit 1
fi

# is DEV or LOOP already mounted?
if [ -n "`mount | grep $DEV`" -o -n "`mount | grep $LOOP`" ]; then
  echo "" && echo ""
  echo "Something went wrong!"
  echo "$DEV or $LOOP is mounted, please fix this before running this script again."
  echo ""
  exit 1
else
  echo "Beginning tests..."
fi

# create $DIR and rotate old results
$DEBUG mkdir -pm 0755 $DIR 1>/dev/null
$DEBUG mkdir -pm 0755 $RESULTS/old 1>/dev/null
$DEBUG rm -rf $RESULTS/old/* 1>/dev/null
$DEBUG mv $RESULTS/* $RESULTS/old >/dev/null 2>&1

# set up the loop device if ciphers were requested
if [ -n "$ALG" ]; then
  s_crypto_pre
else
  echo "loop-aes is NOT used!"
fi

# create file systems and run the benchmarks
for i in $FS; do
  echo "------ $i ------"
  m_$i $DEV
  for j in $BM; do
    echo "------ $i / $j"
    b_pre $DEV
    STATUS="$?"   # so we know if something went wrong
    # after successful mounting, execute the benchmark
    if [ "`grep $DIR /proc/mounts`" -o "$DEBUG" -a "$STATUS" = 0 ]; then
      b_$j
    else
      echo "Something went wrong prior to benchmarking $i-$j"
    fi
    b_post
  done
  # attach loop devices, then make/mount file systems on them
  for j in $ALG; do
    s_crypto_post  # detach loop device (just to be sure)
    s_crypto $j
    m_$i $LOOP
    for k in $BM; do
      echo ""
      echo "------ $i / $k - $j"



      b_pre $LOOP
      # after successful mounting, execute the benchmark
      if [ "`grep $DIR /proc/mounts`" -o "$DEBUG" ]; then
        b_$k
      else
        echo "Something went wrong upon benchmarking $i-$j-$k"
      fi
      b_post
    done
    s_crypto_post
  done
  echo ""
done

# record tool versions (the extracted listing reused fsck.ext2 for several
# entries; the intended per-tool commands are reconstructed here)
echo "**** btrfsprogs: `btrfs --version 2>&1 | head -n1`" >> $RESULTS/info
echo "**** e2fsprogs: `fsck.ext2 -V 2>&1 | head -n1`" >> $RESULTS/info
echo "**** reiserfsprogs: `fsck.reiserfs -V 2>&1 | head -n1`" >> $RESULTS/info
echo "**** xfsprogs: `xfs_repair -V`" >> $RESULTS/info
echo "**** jfsutils: `fsck.jfs -V | head -n1`" >> $RESULTS/info
echo "**** mount: `mount -V`" >> $RESULTS/info
echo "**** bonnie: `bonnie++ 2>&1 | grep -i version`" >> $RESULTS/info

# collect the final results for the file system performance comparison
cd $RESULTS
$DEBUG mkdir -p public 1>/dev/null
for i in generic-*-*; do
  echo && echo "------ $i" && cat "$i" | grep -v records
done > public/generic.txt
cat bonnie-* | bon_csv2html > public/bonnie.html
$DEBUG mkdir public/raw
$DEBUG mv *.txt *.csv public/raw
$DEBUG mv config.gz dmesg info public/
$DEBUG cp "$0" public/bench.sh.txt
# clean up
$DEBUG rm -f $PWFILE $KEYFILE
$DEBUG losetup -d $LOOP > /dev/null 2>&1
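For completeness, this is a sketch of how the script would be launched, assuming it was saved as /root/bench.sh (the path is an assumption; the absolute-path requirement and the -F force flag come from the checks in part c):

# the script refuses relative paths, wipes $DEV, and asks for
# confirmation unless -F is given
chmod +x /root/bench.sh
sudo /root/bench.sh -F
# results end up under /root/benchmark_results/public/
# (generic.txt for the timed operations, bonnie.html for Bonnie++)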
