Improving Performance and Lifetime of the SSD RAID-based Host Cache through a Log-structured Approach

Yongseok Oh ([email protected]), Jongmoo Choi, Donghee Lee, Sam H. Noh
University of Seoul, Dankook University, Hongik University

SSD Data Cache

• SSD based caches are being deployed
  § Host side cache for network storage
  § SSD + HDD hybrid approach for servers and desktop PCs
• SSD caches are a cost-effective solution
  § Getting cheaper and cheaper

[Figure: storage hierarchy of DRAM, SSD cache, and primary storage. Upper tiers are faster; lower tiers are cheaper and have larger capacity.]

INFLOW‘13 (In Conjunction with SOSP’13)

SSD Cache Reliability

• Data loss may happen
  § Bit errors, SSD wear-out, FTL firmware bugs
  § Device failure, physical destruction
• Use of a write-back policy can cause disastrous data loss
  § The file system is left in an inconsistent state
• One solution is RAID!

[Figure: a file system on top of an SSD cache holding FS metadata, user data, and DB data, backed by primary storage. Data loss occurs at the SSD cache.]

Taking Advantage of RAID in the SSD Cache

• Multiple cheap SSDs
  § High performance, large capacity, low cost ($)
• Reliability by retaining redundancy
  § Drive failure, replacement, capacity management

[Figure: a file system on top of an SSD RAID cache built from four SSDs, backed by primary storage.]

Possible Solutions for SSD Caches

• RAID-0: No protection
  § Cached blocks: D0 D1 D2 D3
• RAID-1: Low hit ratio
  § Cached blocks: D0 D1, mirror: D0 D1
• RAID-4, -5: Reasonable
  § Cached blocks: D0 D1 D2, parity: P

Problem of RAID-5

• Small write problem (parity update overhead)
  § Degrades the performance of the SSD cache
  § Shortens the lifetime of the SSD cache

[Figure: Small write problem in a RAID-5 based SSD cache. Writing D0' to a stripe of cached blocks D0, D1, D2 and parity block P incurs additional I/Os in the SSD cache layer.]
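To make the additional I/Os concrete, here is a minimal sketch of an in-place small write on a parity-protected stripe. It assumes the standard read-modify-write parity update (new parity = old parity XOR old data XOR new data); the function names and the I/O counters are illustrative only and do not come from the slides.

```python
from functools import reduce

def xor_blocks(*blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def small_write(stripe, parity, index, new_block):
    """Overwrite one data block in place and patch the parity.

    Returns (new_parity, reads, writes) so the extra I/Os are visible.
    """
    old_block = stripe[index]          # read 1: old data block
    old_parity = parity                # read 2: old parity block
    new_parity = xor_blocks(old_parity, old_block, new_block)
    stripe[index] = new_block          # write 1: new data block
    return new_parity, 2, 2            # write 2: new parity block

# Hypothetical two-byte blocks standing in for D0, D1, D2.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\xf0"
stripe = [d0, d1, d2]
parity = xor_blocks(d0, d1, d2)

new_parity, reads, writes = small_write(stripe, parity, 0, b"\xaa\xbb")
print(reads, writes)
```

Under these assumptions, a single logical block write costs two reads and two writes on the SSDs, which is the overhead the slide refers to.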

• Our goal is to reduce parity overhead with reliability equivalent to RAID-5

Outline

• SSD RAID-based Cache (SRC)

• Performance Evaluation

• Conclusion

Basic Layout of SRC

[Figure: the SRC layer sits between the file system and primary storage and spans four cache SSDs (SSD 0 to 3). Stripe 0 through Stripe N-1 each hold three cached data blocks (D) and one parity block (P). Stripes are filled by log-structured writes.]

Basic Operations and Assumptions

• Cached data are categorized into two types
  § Read data are copied from primary storage
  § Write data are newly written by the upper layer
• Write-back policy is used instead of write-through policy
  § High random write performance
• Cache replacement is done in a stripe unit
  § Data blocks in the victim stripe are evicted together

Key Features of SRC

• Log-structured approach
  § Eliminates read-modify-writes
• Destage approach instead of garbage collection
  § Less data movement from SSDs to SSDs
• Separated striping
  § No parity for read (clean) data
• Fixed parity (similar to RAID-4)
  § Easier management

Log-structured Approach

• Goal: eliminate all read-modify-writes

[Figure: Stripe0 holds 0 1 2 P, Stripe1 holds 3 4 5 P, and Stripe2 holds 6 7 8 P. Incoming writes 0', 4', 8' are XORed to produce parity P and written together to a free stripe as one log-structured write (0' 4' 8' P). Parity overhead is minimized!]
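The log-structured write in the figure can be sketched as follows. This is a simplified model under stated assumptions, not the SRC implementation: the class name `LogStripeWriter`, its fields, and the in-memory mapping table are hypothetical, and a stripe is assumed to hold three data blocks plus one parity block as on the slide.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

class LogStripeWriter:
    """Buffers incoming writes and emits full stripes with fresh parity."""

    def __init__(self, data_per_stripe=3):
        self.data_per_stripe = data_per_stripe
        self.pending = []   # (logical block number, payload) awaiting a stripe
        self.stripes = []   # each entry: (block numbers, payloads, parity)
        self.mapping = {}   # block number -> (stripe index, slot)

    def write(self, lbn, payload):
        self.pending.append((lbn, payload))
        if len(self.pending) == self.data_per_stripe:
            self._flush()

    def _flush(self):
        lbns = [lbn for lbn, _ in self.pending]
        payloads = [payload for _, payload in self.pending]
        # Parity comes from the new data only: no old data or old parity is read.
        parity = xor_blocks(payloads)
        stripe_index = len(self.stripes)
        self.stripes.append((lbns, payloads, parity))
        for slot, lbn in enumerate(lbns):
            # Remapping implicitly invalidates the block's previous location.
            self.mapping[lbn] = (stripe_index, slot)
        self.pending = []

writer = LogStripeWriter()
for lbn, payload in [(0, b"\x01"), (4, b"\x02"), (8, b"\x04")]:
    writer.write(lbn, payload)
print(writer.stripes[0][2])
```

Because the three updated blocks land in a free stripe together, parity is computed once from data already in memory, which is why no read-modify-write is needed.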

Partial Stripe Parity Update

• If a stripe is not filled up with data
  § Partial stripe parity is periodically written

Incoming writes: 0', 1'
  0' ⊕ 1' = P' (partial stripe parity)
  Stripe: 0' 1' P'

Incoming writes: 2'
  P' ⊕ 2' = P'' (complete stripe parity, 0' ⊕ 1' ⊕ 2')
  Stripe: 0' 1' 2' P'' P'
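The incremental parity computation above can be checked with a short sketch. The class `PartialStripe` and the one-byte block values are made up for illustration; only the XOR relationship (P'' = P' ⊕ 2' = 0' ⊕ 1' ⊕ 2') is taken from the slide.

```python
def xor_pair(a, b):
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

class PartialStripe:
    """Keeps a running parity so each new block needs only one XOR."""

    def __init__(self, block_size):
        self.blocks = []
        self.parity = bytes(block_size)   # all zeros is the XOR identity

    def append(self, payload):
        self.blocks.append(payload)
        self.parity = xor_pair(self.parity, payload)
        return self.parity

stripe = PartialStripe(block_size=1)
stripe.append(b"\x03")                    # block 0'
partial = stripe.append(b"\x05")          # P'  = 0' xor 1'
complete = stripe.append(b"\x09")         # P'' = P' xor 2'
print(partial, complete)
```

The complete parity is derived from the partial parity and the newly arrived block alone, so the earlier blocks 0' and 1' do not have to be read back.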

Problem of Garbage Collection

• Garbage collection is the Achilles' heel of LFS
  § Worsens storage performance
  § Shortens the lifetime of SSDs

[Figure: a) Garbage collection: normal I/O and GC I/O both hit the SSDs in the SRC layer. b) Destage (our approach): normal I/O hits the SSDs, and destage I/O moves data from the SSDs to primary storage.]

Destage Scheme

• Per stripe clock replacement for read requests
• Goal
  § Minimize data movement from SSDs to SSDs
  § Improve read performance by keeping hot data

[Figure: clock bitmap with a victim pointer over four stripes. Stripe0: Invalid 1 2 P (bitmap 1 0). Stripe1: 3 Invalid 5 P (bitmap 1 0). Stripe2: 6 7 Invalid P (bitmap 0 0). Stripe3: 0' 4' 8' P (bitmap 0). The victim stripe is destaged to primary storage.]
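As a rough illustration of per-stripe clock replacement, the sketch below applies the textbook clock (second-chance) algorithm with one reference bit per stripe. The slides do not give SRC's exact policy, so the class `StripeClock`, its method names, and the reset-on-sweep behavior are assumptions.

```python
class StripeClock:
    """Textbook clock replacement with one reference bit per stripe."""

    def __init__(self, num_stripes):
        self.ref = [0] * num_stripes   # clock bitmap
        self.hand = 0                  # victim pointer

    def touch(self, stripe):
        """Mark a stripe as recently read."""
        self.ref[stripe] = 1

    def pick_victim(self):
        """Advance the hand until a stripe with a cleared bit is found."""
        while True:
            stripe = self.hand
            self.hand = (self.hand + 1) % len(self.ref)
            if self.ref[stripe]:
                self.ref[stripe] = 0   # give the stripe a second chance
            else:
                return stripe          # this stripe would be destaged

clock = StripeClock(4)
clock.touch(0)
clock.touch(1)
victim = clock.pick_victim()
print(victim)
```

Stripes that were read recently survive one sweep of the hand, which matches the stated goal of keeping hot data in the cache while evicting whole stripes at a time.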

Separated Striping Scheme

a) Mixed Striping Scheme (simple approach)
  § Read data and write data share stripes across SSD0 to SSD3
  § Mixed stripes: 0 4' 5' P and 1 2 6' P
  § Read data are excessively protected by parity
b) Separated Striping Scheme (our approach)
  § Read stripe: 0 1 2 3 (no parity)
  § Write stripe: 4' 5' 6' P
  § No parity for read data: the same contents are stored in primary storage

[Figure: in both cases primary storage holds blocks 0 to 7.]

Parity Distribution

a) Rotated Parity Distribution (general approach)
  • Similar to RAID-5
  • All SSDs are evenly worn out
  • Drive replacement is complex
b) Fixed Parity Distribution (our approach)
  • Similar to RAID-4, DiffRAID
  • Parity SSD is quickly worn out
  • Drive replacement is simple

[Figure: with rotated parity, parity writes are spread over all SSDs in the cache layer and all of them are replaced with new SSDs. With fixed parity, parity writes go to one SSD and only that SSD is replaced with a new one.]
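The difference in wear can be illustrated by counting where parity blocks land under each placement. This is only a sketch: the rotation formula below is one common RAID-5 layout chosen for illustration, and the function names are hypothetical.

```python
def parity_device(stripe, num_devices, rotated):
    """Index of the device that holds parity for a given stripe."""
    if rotated:
        # Assumed RAID-5 style rotation: parity moves one device per stripe.
        return (num_devices - 1) - (stripe % num_devices)
    # RAID-4 style: parity always lives on the last device.
    return num_devices - 1

def parity_writes_per_device(num_stripes, num_devices, rotated):
    """Count full-stripe parity writes that each device receives."""
    counts = [0] * num_devices
    for stripe in range(num_stripes):
        counts[parity_device(stripe, num_devices, rotated)] += 1
    return counts

rotated_counts = parity_writes_per_device(8, 4, rotated=True)
fixed_counts = parity_writes_per_device(8, 4, rotated=False)
print(rotated_counts, fixed_counts)
```

With rotation every device absorbs the same share of parity writes, while the fixed layout concentrates them on one device, which is the device that would be replaced first.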

Summary of Key Features

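[Figure: block types are Read (R), Write (W), Parity (P), and invalid (I). Across SSD0 to SSD3 the stripes shown are a read stripe R R R R, a write stripe I W W P, a read stripe R R I R, and a write stripe W I W P. Labels: ① Log-structured, ② Destage to HDD, ③ Separated striping, ④ Fixed parity.]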

Outline

• SSD RAID-based Cache (SRC)

• Performance Evaluation

• Conclusion

Architecture of SRC

[Figure: I/O requests enter the SRC layer, which consists of a Sequential Detector, Page Replacer, Mapping Manager, and Parity Writer on top of four SSDs. The SRC layer connects over SATA/iSCSI/SAN to the primary storage layer of five disks.]

Configurations

• CMU DiskSim + MSR SSD Extension
  § 4 x SSDs (SSD cache)
  § 5 x Disks (primary storage, 10,000 rpm)
• SSD cache schemes
  § RAID-0 scheme: caching space is managed by RAID-0
  § RAID-5 scheme: caching space is managed by RAID-5
  § SRC scheme that we propose
• Workload traces
  § Financial [UMASS] (19% reads)
  § Exchange [SNIA] (48% reads)
  § MSN [SNIA] (6% reads)
  § Web [UMASS] (99% reads)

SRC vs Conventional RAID

• SRC is on average 59% better than RAID-5 in response time
  § SRC provides almost the same reliability as RAID-5
• RAID-0 shows the best performance
  § Does not provide any data protection

[Figure: relative response time of RAID-0, RAID-5, and SRC for the Financial, Exchange, MSN, and Web workloads, with annotated improvements of 55%, 68%, 56%, and 59%.]

Analysis of Parity Overhead

• RAID-0 has no parity overhead
  § RAID-5 suffers from considerable parity overhead
• SRC shows reduced parity overhead

[Figure: aggregate I/O (GB) of parity reads and parity writes on SSD0 to SSD3 for RAID-0, RAID-5, and SRC under the Exchange workload. RAID-0 is annotated "No parity overhead" and SRC is annotated "Reduced parity!".]

Effect on SSD Lifetime

• SRC shows a reduced erase count compared to RAID-5

[Figure: average erase count of SSD0 to SSD3 for RAID-0, RAID-5, and SRC under the Exchange workload, annotated "38% average".]

Allocation Schemes for SRC

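Block placement on SSD0 SSD1 SSD2 SSD3 under each scheme:

a) Mixed-Fixed
  § Stripe 0: R W W P
  § Stripe 1: R W R P
b) Mixed-Rotated
  § Stripe 0: R W W P
  § Stripe 1: R W P R
c) Separated-Fixed (our approach)
  § Stripe 0: R R R R
  § Stripe 1: W W W P
d) Separated-Rotated
  § Stripe 0: R R R R
  § Stripe 1: W W P W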

Comparison of Allocation Schemes

• The Fixed scheme is better than the Rotated scheme
  § The Mixed-Fixed scheme is the best for write-intensive traces
  § The Separated-Fixed scheme is the winner for the read-dominant trace

[Figure: relative response time of Mixed-Fixed (M-F), Mixed-Rotated (M-R), Separated-Fixed (S-F), and Separated-Rotated (S-R) for the write-mostly workloads (Financial, Exchange, MSN) and the read-mostly workload (Web).]

Effect of Fixed Parity Distribution

• Fixing parity to one SSD improves performance
  § Heavy parity updates go to the parity SSD
• Rotating parity increases the response time of all devices

[Figure: device response time (ms) of SSD0 to SSD3 for Mixed-Fixed and Mixed-Rotated under the Financial workload, with devices labeled as serving parity I/Os, normal I/Os, or normal and parity I/Os.]

Effect of Separated Striping Scheme

• The Separated Striping scheme has less parity overhead
  § No parity for read stripes

[Figure: Web workload (99% read I/O requests). Left: aggregate I/O (GB) on SSD0 to SSD3, split into data read, data write, parity read, and parity write, for Mixed-Fixed and Separated-Fixed, annotated "Less parity overhead". Right: average erase count per SSD and on average for Mixed-Fixed and Separated-Fixed, annotated "Extended lifetime".]

Concluding Remarks

• SRC (SSD RAID-based Cache)
  § Log-structured approach
  § Destage policy
  § Separated striping scheme
  § Fixed parity distribution
• SRC is better than a RAID-5 cache
  § Performance: 59% better than RAID-5
  § Lifetime: 47% better than RAID-5
• Future direction
  § Memory efficient metadata management
  § Failure handling
  § Implementation (Linux MD layer)
• Our poster presentation
  § On Monday

Yongseok Oh, Donghee Lee: University of Seoul, {ysoh, dhl_express}@uos.ac.kr
Jongmoo Choi: Dankook University, [email protected]
Sam H. Noh: Hongik University, [email protected]

Acknowledgements

• Anonymous reviewers
  § https://sites.google.com/site/2013inflow/program-commitee
• CMU DiskSim team
  § http://www.pdl.cmu.edu/DiskSim
• MSR SSD extension team
  § http://research.microsoft.com/en-us/downloads/b41019e2-1d2b-44d8-b512-ba35ab814cd4/default.aspx
• DiskSim parameters provided by Anjo
  § http://www.mpi-sws.org/~vahldiek
• UMASS traces
  § http://traces.cs.umass.edu/index.php/Storage/Storage
• SNIA block traces
  § http://iotta.snia.org/tracetypes/3