Performance of WD 250GB SATA Drives + 3ware Controller
Jim Gray, 7 Oct 2003

We are building inexpensive bricks to host the TerraServer. The idea is to use massive redundancy: 2 sites, 3 replicas per site, and each replica is RAID10 (so 12 instances of every byte). So, cheap storage is important to us. For loading and for data mining we want good sequential bandwidth (1GB/s would be nice), and for online service, good random IO is desirable.

ATA drives typically cost 800$/TB (PATA) and 1.2k$/TB (SATA). Controllers, power, cooling, processors, and sheet metal bring the “packaged” price up to about 2k$/TB (or 3k$/TB for branded systems).

We have been concerned about reliability and performance of storage bricks built in this way. Tom Barclay and Wyman Chong are experimenting with high-end bricks (16 drives == 4TB per cabinet). I have been doing some more modest experiments in my office with:

    Dell 1GHz tower running Windows XP.
    3ware Escalade 6506-4: a 64/33 4-channel controller (245$ == 81$/port).
    4 x Western Digital 250GB SATA WD2500JD, 7,200 RPM, 8MB cache (299$).

This nets out to about 1.5k$/TB and rises to 2.1k$/TB when the Dell system price (609$) is included.
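The per-terabyte figures can be checked with simple arithmetic on the listed prices. The following is a minimal sketch, assuming decimal units (4 x 250GB = 1TB); the variable and function names are illustrative only. With the prices as listed it gives roughly 1.4k$/TB for drives plus controller and about 2.05k$/TB with the host included, close to the rounded figures quoted above.

```python
# Prices as listed in the text (US dollars); names are illustrative only.
DRIVE_PRICE = 299        # one WD2500JD
DRIVE_COUNT = 4
DRIVE_GB = 250           # decimal gigabytes, as drives are marketed
CONTROLLER_PRICE = 245   # 3ware Escalade 6506-4
HOST_PRICE = 609         # Dell tower


def dollars_per_tb(total_dollars, capacity_gb):
    """Cost per decimal terabyte (1 TB = 1000 GB)."""
    return total_dollars / (capacity_gb / 1000)


capacity_gb = DRIVE_COUNT * DRIVE_GB
storage_only = DRIVE_COUNT * DRIVE_PRICE + CONTROLLER_PRICE
with_host = storage_only + HOST_PRICE

print(f"storage only: {dollars_per_tb(storage_only, capacity_gb):.0f} $/TB")
print(f"with host:    {dollars_per_tb(with_host, capacity_gb):.0f} $/TB")
```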

I have been running an SQLIO stress test on the devices for a month now without any problems, and I did not do anything special about cooling: just the standard (not very cool) Dell packaging in a drive bay. The performance results are shown in the graphs and table below.

[Chart: WinXP, SATA Western Digital 250GB 7200 rpm, 3ware JBOD. Sequential bandwidth (MB/s), random read IO/s, and random write IO/s versus number of disks (0 to 4).]

Sequential IO

    disks  type   depth  BW (MB/s)  IOPS    %
    1      read   1      53         834     100%
    1      read   4      54         841     101%
    2      read   1      104        1,620   97%
    2      read   4      104        1,620   97%
    3      read   1      153        2,397   96%
    3      read   4      154        2,400   96%
    4      read   1      199        3,191   93%
    4      read   4      204        3,256   95%
    1      write  1      54         838     100%
    1      write  4      54         841     101%
    2      write  1      104        1,620   97%
    2      write  4      104        1,622   97%
    3      write  1      154        2,400   96%
    3      write  4      153        2,397   96%
    4      write  1      200        3,194   93%
    4      write  4      199        3,186   93%

Random IO

    disks  type   depth  BW (MB/s)  IOPS    %
    1      read   1      0.6        76      100%
    1      read   4      0.6        77      101%
    2      read   1      1.2        150     98%
    2      read   4      1.2        150     99%
    3      read   1      1.7        222     97%
    3      read   4      1.7        223     98%
    4      read   1      2.3        294     97%
    4      read   4      2.3        294     97%
    1      write  1      1.1        138     100%
    1      write  4      1.1        135     98%
    2      write  1      2.1        271     99%
    2      write  4      2.1        268     97%
    3      write  1      3.0        386     93%
    3      write  4      3.0        386     93%
    4      write  1      3.8        488     89%
    4      write  4      3.8        483     88%

Wyman Chong and Tom Barclay let me have access to their system that has an 8-drive 3ware 8506-8 controller and 8 Western Digital WD2500JD 250GB SATA drives. I wanted to see if the throughput continues to rise with more than 4 disks. In some other experiments I have seen the controller able to handle about 3300 IOps, but I was curious to see what the total bandwidth was: the bus is rated at 320 MBps, so would the 64-bit/64MHz PCI-X bus deliver that throughput for sequential scans? These experiments are done on a dual 2GHz Xeon running Windows Server 2003.

So, using SQLIO I filled the disks with data and then ran 4-deep read-write sequential-random tests each lasting 30 seconds. The perfmon picture below tells the story – sequential read saturates at about 230 MBps while sequential write saturates at 200MBps. If you are doing sequential IO, there is little reason to have more than 4 drives per 3ware controller. It also shows that the controller can easily manage the write traffic for up to 8 disks with random IO and that it is good at buffering and optimizing write traffic.

The first set of experiments shows 256KB sequential reads of 1 to 8 disks, all on one 3ware controller. The read rate starts at about 50MBps and ramps up to 230MBps at 5 disks, where the controller or bus saturates. Sequential writes show a similar story, with saturation at 4 disks. The processor is nearly idle during all this (the red line). The thin green line shows the disk queue length. The main features are that both reads and writes plateau at about 4 drives and at about 200MBps.
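One way to make "saturates" concrete is to look for the first disk count after which adding a drive yields almost no extra bandwidth. The sketch below is illustrative only: the MB/s figures are copied from this report's 8-disk spreadsheet, while the 5% threshold and the function name are assumptions chosen for the example, not something the measurements define.

```python
# Sequential MB/s for 1..8 disks, copied from the 8-disk spreadsheet in this report.
SEQ_READ_MBPS = [49, 100, 146, 197, 224, 227, 225, 226]
SEQ_WRITE_MBPS = [49, 99, 148, 198, 199, 201, 199, 201]


def saturation_point(mbps_by_disks, min_gain=0.05):
    """Return the disk count after which one more disk adds less than min_gain.

    min_gain is a hypothetical threshold (5%), not something the report defines.
    """
    for i in range(1, len(mbps_by_disks)):
        # mbps_by_disks[i - 1] is the rate with i disks, [i] with i + 1 disks.
        gain = mbps_by_disks[i] / mbps_by_disks[i - 1] - 1.0
        if gain < min_gain:
            return i
    return len(mbps_by_disks)  # never flattened within the measured range


print("read saturates at", saturation_point(SEQ_READ_MBPS), "disks")
print("write saturates at", saturation_point(SEQ_WRITE_MBPS), "disks")
```

With this threshold the reads flatten at 5 disks and the writes at 4, which agrees with the description above.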

The second set of experiments uses 8KB 4-deep random IO to 1, …, 8 disks on one controller. First the reads ramp up to about 650 IOps and then the writes ramp up to over 1000 IOps. The CPU is loafing and the MBps are in the 0.5 to 4.0 range (near zero compared to sequential IO bandwidth).
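The low bandwidth follows directly from the small transfer size: bandwidth is the IO rate times the bytes moved per IO. A minimal sketch of that relation, using IO rates from this report's 8-disk spreadsheet; the helper name is hypothetical, and the use of binary units (1 MB = 1024 KB) is an assumption that happens to match the published MB/sec figures.

```python
KB_PER_MB = 1024  # assumption: binary units, which match the spreadsheet figures


def mbps(iops, io_size_kb):
    """Bandwidth implied by an IO rate and a fixed transfer size."""
    return iops * io_size_kb / KB_PER_MB


# IO/sec values taken from the report's 8-disk spreadsheet.
for label, iops, size_kb in [
    ("random read, 1 disk", 74, 8),
    ("random read, 8 disks", 600, 8),
    ("random write, 8 disks", 1018, 8),
    ("sequential read, 8 disks", 906, 256),
]:
    print(f"{label}: {mbps(iops, size_kb):.2f} MB/s")
```

Even at more than 1000 IOps, 8KB transfers move only a few MB per second, while about 900 IOps of 256KB transfers is enough to saturate the controller or bus.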

The spreadsheets and charts for this follow.

SEQUENTIAL TESTS -- 4 deep 256KB IO no buffer
3ware Escalade 8506-8 SATA Ctlr, 8 Western Digital 250 GB SATA disks

    disks  type   IO/sec  MB/sec  %
    1      Read   196     49      100%
    2      Read   399     100     203%
    3      Read   585     146     298%
    4      Read   790     197     402%
    5      Read   896     224     456%
    6      Read   909     227     463%
    7      Read   900     225     458%
    8      Read   906     226     461%
    1      Write  196     49      100%
    2      Write  397     99      202%
    3      Write  591     148     301%
    4      Write  794     198     405%
    5      Write  794     199     405%
    6      Write  806     201     411%
    7      Write  795     199     405%
    8      Write  805     201     410%

[Chart: Sequential Throughput MBps, 3ware Escalade 8506-8 SATA Ctlr, 8 Western Digital 250 GB SATA disks. Sequential Read MBps and Sequential Write MBps versus disks (1 to 8).]

RANDOM TESTS -- 4 deep 8KB IO no buffer
3ware Escalade 8506-8 SATA Ctlr, 8 Western Digital 250 GB SATA disks

    disks  type   IO/sec  MB/sec  %
    1      Read   74      0.57    100%
    2      Read   148     1.15    201%
    3      Read   220     1.71    298%
    4      Read   292     2.28    396%
    5      Read   362     2.82    491%
    6      Read   441     3.44    597%
    7      Read   497     3.88    673%
    8      Read   600     4.69    814%
    1      Write  126     0.98    100%
    2      Write  256     2.00    203%
    3      Write  382     2.98    303%
    4      Write  515     4.02    408%
    5      Write  626     4.89    496%
    6      Write  764     5.96    605%
    7      Write  867     6.77    688%
    8      Write  1018    7.95    807%

[Chart: Random Throughput IOps, 3ware Escalade 8506-8 SATA Ctlr, 8 Western Digital 250 GB SATA disks. Random Read IOps and Random Write IOps versus disks (1 to 8).]
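The % column appears to be throughput relative to the one-disk case. The sketch below recomputes it from the rounded IO/sec figures in the spreadsheet; this reading of the column is an assumption, the names are illustrative, and the recomputed values can differ from the published ones by a few points, presumably because the spreadsheet worked from unrounded measurements.

```python
# IO/sec for 1..8 disks, copied from the spreadsheet above.
SERIES = {
    "sequential read":  [196, 399, 585, 790, 896, 909, 900, 906],
    "sequential write": [196, 397, 591, 794, 794, 806, 795, 805],
    "random read":      [74, 148, 220, 292, 362, 441, 497, 600],
    "random write":     [126, 256, 382, 515, 626, 764, 867, 1018],
}


def relative_throughput(iops_by_disks):
    """Throughput at each disk count as a percentage of the one-disk rate."""
    base = iops_by_disks[0]
    return [100.0 * x / base for x in iops_by_disks]


for name, iops in SERIES.items():
    pct = relative_throughput(iops)
    print(f"{name:17s}", " ".join(f"{p:4.0f}%" for p in pct))
```

Read this way, random IO scales almost linearly to 8 disks (about 8x), while sequential IO stops scaling at roughly 4x to 4.6x.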