Issues on Using SSD Under Linux

Issues on Using SSD Under Linux

About SSD Dongjun Shin Samsung Electronics Outline SSD primer Optimal I/O for SSD Benchmarking Linux FS on SSD Case study: ext4, btrfs, xfs Design consideration for SSD What's next? New interfaces for SSD Parallel processing of small I/O SSD Primer (1/2) Physical unit of flash memory PageNAND ± unit for read & write BlockNAND ± unit for erase (a.k.a erasable block) Physical characteristics Erase before re-write Sequential write within an erasable block LBA space (visible to OS) Flash Translation Layer Flash memory space NAND page (2-4kB) NAND block = 64-128 NAND pages SSD Primer (2/2) Internal organization: 2-dimensional (NxM parallelism) Similar to RAID-0 (stripe size = sector or pageNAND) Effective page & block size is multiplied by NxM (max) 0 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60 Ch0 32 36 40 44 48 52 56 60 64 68 72 76 80 84 88 92 Chip0 Chip1 Chip2 Chip3 1 5 9 13 17 21 25 29 Ch1 33 37 41 45 49 53 57 61 SSD Host I/F Controller N-channel running (ex. SATA) 2 6 10 14 18 22 26 30 (striping) F/W(FTL) Ch2 34 38 42 46 50 54 58 62 3 7 10 15 Ch3 35 39 43 47 M-way (pipelining) Optimal I/O for SSD Key points Parallelism · The larger the size of I/O request, the better Match with physical characteristics · Alignment with page or block size of NAND* · Segmented sequential write (within an erasable block) What about Linux? HDD also favors larger I/O read-ahead, deferred aggregated write Segmented FS layout good if aligned with erasable block boundary Write optimization FS dependent (ex. allocation policy) * Usually, partition layout is not aligned (1st partition at LBA 63) Test environment (1/2) Hardware Intel Core 2 Duo [email protected], 1GB RAM Software Fedora 7 (Kernel 2.6.24) Benchmark: postmark Filesystems No journaling - ext2 Journaling - ext3, ext4, reiserfs, xfs · ext3, ext4: data=writeback,barrier=1[,extents] · xfs: logbsize=128k COW, log-structured - btrfs (latest unstable, 4k block), nilfs (testing-8) SSD Vendor M (32GB, SATA): read 100MB/s, write 80MB/s Test partition starts at LBA 16384 (8MB, aligned) Test environment (2/2) Postmark workload Ref: Evaluating Block-level Optimization through the IO Path (USENIX 2007) Workload File size # of file # of Total app (work-set) transaction read/write SS 9-15K 10,000 100,000 630M/755M* SL 9-15K 100,000 100,000 600M/1.8G LS 0.1-3M 1,000 10,000 9.7G/12G LL 0.1-3M 4,250 10,000 9G/17G * Mostly write-only Benchmark results (1/2) Small file size (SS, SL) 2500 2000 1500 1000 transaction/sec 500 0 SS SL ext2 ext3 ext4 reiserfs xfs btrfs nilfs Benchmark results (2/2) Large file size (LS, LL) 30 25 20 15 transaction/sec 10 5 0 LS LL ext2 ext3 ext4 reiserfs xfs btrfs nilfs I/O statistics (1/2) Average size of I/O 140 120 100 80 60 Avgsize I/O (Kbytes) 40 20 0 SS SL LS LL SS SL LS LL read write ext2 ext3 ext4 reiserfs xfs btrfs nilfs I/O statistics (2/2) Segmented sequentiality of write I/O (segment: 1MB) 20.00% 100% 100% 100% 100% 18.00% 16.00% 14.00% 12.00% 10.00% 8.00% 6.00% 4.00% 2.00% 0.00% SS SL LS LL ext2 ext3 ext4 reiserfs xfs btrfs nilfs Case study - ext4 Condition data=ordered, allocation: default/noreservation/oldalloc 1200 1000 800 1. Almost no difference between allocation policies 600 2. Why data=ordered is transaction/sec 400 better for SL? 200 0 SS SL ext4-wb ext4-ord ext4-nores ext4-olda Case study - btrfs Condition Block size: 4k/16k, allocation: ssd option on/off 1800 1600 1400 1. 4k is better than 16k 1200 (sequentiality = 12% : 2%) 1000 2. ssd option is effective 800 transaction/sec (10-40% improvement) 600 400 200 0 SS SL LS LL btrfs-4k btrfs-16k btrfs-ssd-4k Case study - xfs Condition Mount with barrier on/off 800 700 Large barrier overhead... 600 500 400 transaction/sec 300 200 100 0 SS SL LS LL xfs-bar xfs-nobar Design consideration for SSD Lessons from flash FS (ex. logfs) Sequential writing at multiple logging points Wandering tree · Trace-off between sequentiality vs. amount of write · Cf. space map (Sun ZFS) Need to optimize garbage collection overhead · Either FS itself or FTL in SSD Next topic: End-to-end optimization Exchange info with SSD (trim, SSD identification) Make best use of parallelism New interfaces for SSD (t13.org) Trim command Let device know which LBA range is not used · This will be helpful for optimizing FTL Should be passed through: FS bio scsi libata · Passing bio with no data · What about I/O reordering & I/O queuing? SSD identification (added to ªATA identifyº) Report size of page and erasable block · Physical or effective? Useful for FS and volume manager Parallel processing of small I/O Make better use of I/O queuing (TCQ or NCQ) Parallel processing of small I/O Desktop environment? Barrier? request A B C D A B C D queue 1 2 3 4 1 2 A Ch0 A Ch0 B,C Ch1 B,C Ch1 Ch2 Ch2 D Ch3 D Ch3 without I/O queuing, 4 steps with I/O queuing, 2 steps chip is busy chip is idle Summary Optimization for SSD Alignment is important Segmented sequentiality Make better use of parallelism (either small or large) · I/O barrier may stall the pipelined processing What can you do? File system: alignment, allocation policy, design (ex. COW) Block layer: bio w/ hint, barrier, I/O queueing, scheduler(?) Volume manager: alignment, allocation Virtual memory: read-ahead References T13 spec for SSD http://www.t13.org/documents/UploadedDocuments/docs2007/e07153r0-Soild_State_Drive_Identify_Proposal.doc http://www.t13.org/documents/UploadedDocuments/docs2007/e07154r0-Notification_for_Deleted_Data_Proposal_for_ATA-ACS2.doc Introduction to SSD and flash memory http://download.microsoft.com/download/a/f/d/afdfd50d-6eb9-425e-84e1-b4085a80e34e/WNS-T432_WH07.pptx http://download.microsoft.com/download/d/f/6/df6accd5-4bf2-4984-8285-f4f23b7b1f37/WinHEC2007_Micron_NAND_FlashMemory.doc http://download.microsoft.com/download/a/f/d/afdfd50d-6eb9-425e-84e1-b4085a80e34e/SS-S486_WH07.pptx FTL description & optimization BPLRU: A Buffer Management Scheme for Improving Random Writes in Flash Storage (FAST '08) Appendix. I/O Pattern SS workload ± ext4, xfs Appendix. I/O Pattern SS workload ± btrfs, nilfs Appendix. I/O Pattern SL workload ± ext4, xfs Appendix. I/O Pattern SL workload ± btrfs, nilfs Appendix. I/O Pattern LS workload ± ext4, reiserfs, xfs Appendix. I/O Pattern LS workload ± btrfs, nilfs Appendix. I/O Pattern LL workload ± ext4, reiserfs, xfs Appendix. I/O Pattern LL workload ± btrfs, nilfs.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    27 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us