Accelerating Data Center Workloads with Solid-State Drives
Total Page:16
File Type:pdf, Size:1020Kb
IT@Intel White Paper Intel IT IT Best Practices Solid-State Drives July 2012 Accelerating Data Center Workloads with Solid-State Drives Executive Overview In performance trials, Intel Interest continues to grow for the use of solid-state drives (SSDs) in data centers, IT determined that some of particularly for high random I/O applications where SSDs excel. The greatest obstacles the latest Intel® Solid-State to widespread adoption are cost and concerns about SSD endurance, specifically Drives provide significant cost, their ability to withstand large amounts of data writes. In performance trials in our reliability, performance, and test environment and with our security-compliance application database, Intel IT determined that the latest SSDs provide significant cost, reliability, performance, and endurance advantages over endurance advantages over traditional enterprise class hard disk drives (HDDs). traditional enterprise-quality hard disk drives. The impetus for our research was the • Justify higher initial costs through reduction observation that 15K revolutions per minute of IT time spent dealing with the effects HDDs presented a serious performance of maximum disk queue depths, drive bottleneck to the 100-percent random endurance equivalent to a 25-year workload generated by our security- lifespan, elimination of compliance issues compliance application database. The amount involving backlogs in recording monitoring of HDD head travel required for the patching data, and reduction of potential losses and reporting functions of this database from delays in patching monitored systems. slowed performance to an unacceptable level. • Lower disk power demands by more Our goal was to find a solution to speed up than 50 percent while producing one-third the database without incurring excessive less heat. costs or increasing complexity. In this paper, we discuss our test methodology, Our evaluation found that switching to the trial results, and actual results when deployed latest SSDs can: in our data center. We also provide guidance • Eliminate storage performance bottlenecks, on how to determine whether a particular increasing disk performance up to 5x on application is a good fit for running on SSDs. random disk I/O tasks. Based on our results, Intel IT is now researching • Reduce read latency by up to 10x, write additional SSD use cases, particularly for known latency by up to 7x, and maximum latency high random I/O workloads. by up to 8x, for faster response to patching Christian Black and compliance data read/write requests. Enterprise Architect, Intel IT • Provide faster transition from idle to active Darrin Chen state, plus display no I/O penalties (longer Senior Systems Engineer, Intel IT seek times) as drives fill to capacity. IT@Intel White Paper Accelerating Data Center Workloads with Solid-State Drives Contents BACKGROUND • Lower heat generation. Systems with SSDS have less heat dissipation. Executive Overview ............................. 1 As interest continues to grow for using solid-state drives (SSDs) in • Silent operation. SSDs have no moving Background ............................................ 2 data centers, there is also increasing parts to make noise. Comparing the Causes of focus on SSDs’ advantages and write A less well-known advantage of SSDs comes Hard Drive Failure .............................. 2 endurance—the ability of an SSD to in running database applications that rely Recent Improvements in withstand large amounts of data heavily on I/O operations per second (IOPS) to SSD Endurance .................................. 3 writes. In the past, endurance has determine performance. Because SSDs incur been a potential concern, but recent no head seeking to read or write data, they Solution ................................................. 3 improvement in SSD endurance and deliver higher IOPS and lower access times Test Methodology ............................. 3 other compelling SSD benefits now than enterprise—15K revolutions per minute Measuring the Existing seem to justify their greater use. (RPM)—HDDs. Workload ............................................. 4 One disadvantage often cited for SSDs is cost. Modeling the Workload in At Intel, as in many other organizations, A serial-attached SCSI (SAS) SSD can cost up Iometer ................................................ 5 data in servers is often stored on hard disk to three times more than a traditional 15K drives (HDDs), which write to and read from Test Results ........................................... 6 RPM SAS HDD of equivalent capacity. SSD magnetic disks. SSDs, on the other hand, prices continue to improve though, especially Validating Model Workload use semiconductor-based memory to store Against Real Workload ..................... 6 as the cost of NAND flash memory drops and data. This memory is usually NAND (short for production volumes increase. Determining Top Line Performance... 7 “NOT AND”) flash memory, the same storage Calculating Endurance ...................... 7 medium used for USB thumb drives. NAND Calculating the Uncorrectable memory is ideal because unlike RAM, it is Comparing the Causes of Bit Error Rate ..................................... 8 non-volatile and data is not lost when the Hard Drive Failure Power and Heat Comparisons ........ 9 device is powered down. The main source of failure for HDDs is mechanical. The most frequent failure is a While the form factor and interface of most Production Environment Results ...10 head crash where the actuator arm physically SSDs are compatible with HDDs, SSDs have contacts the disk platter causing data loss. Conclusion ............................................11 no moving parts. With no moving platters or Over time, other components simply wear an actuator arm to read and write data, there Acronyms ..............................................12 out: platters vibrate due to bearing wear, is nothing mechanical in an SSD to wear out. actuators lose precision, and lubricants This provides a number of advantages. evaporate. The result is more retries, more • High reliability. SSDs have a mean time corrupted data requiring error correction code between failure of 2 million hours. (ECC) recovery, higher drive temperatures, • Fast access. With SSDs, there is no greater power draw, and eventually failure. waiting for the drive to come up to speed The magnetic media itself has virtually from idle or perform head seek operations. no limit on the number of writes, but the magnetic bit strength has a half-life of just IT@INteL • Excellent resistance to impact and five to seven years. Nonetheless, a HDD The IT@Intel program connects IT vibration. SSDs can withstand shock and will most likely fail for mechanical reasons professionals around the world with their vibration while maintaining data integrity. peers inside our organization – sharing long before magnetic bit strength has any lessons learned, methods and strategies. • Low power consumption. SSDs consume appreciable effect on performance or data Our goal is simple: Share Intel IT best over 50 percent less power compared to integrity. practices that create business value and an HDD. make IT a competitive advantage. Visit us today at www.intel.com/IT or contact your local Intel representative if you’d like to learn more. 2 www.intel.com/IT Accelerating Data Center Workloads with Solid-State Drives IT@Intel White Paper . environments and other endurance-focused advantage of the faster read/write applications, extends the endurance of speeds of SSDs for this kind of data. SSDs, in comparison, can occasionally fail SSDs by using endurance-validated MLC because of data retention or read-error issues NAND. HET’s silicon-level and system-level For our tests, we selected high endurance on individual NAND flash cells. Each block of optimizations enable it to use beyond-ECC HET-based SSDs—the Intel® Solid-State Drive a flash-based SSD provides one million to error recovery steps to extend MLC NAND (Intel® SSD) 710 series. Our goal was to first five million write cycles—granted, a very large capability to a higher P/E cycle count and test the performance of these SSDs in a number—before it fails. This is known as write employ special programming sequences controlled environment, and then, based on the endurance. These errors occur only after to mitigate program state disturb issues data, measure their performance while running reaching the maximum number of program/ that may occur. Though a scheme called the production security-compliance database. erase (P/E) cycles for any individual block on an background data refresh, an SSD with SSD. For many use cases, this is hardly an issue. HET moves data around during periods of Test Methodology Recent Improvements in inactivity to re-allocate areas that have To determine the viability and advantages of incurred heavy reads. Additionally, an SSD using SSDs for a high I/O application such as SSD Endurance with HET comes with a spare area that our security-compliance database, we chose The endurance of an SSD is dependent on its lowers write amplification. The combined the following methodology: overall capacity and the amount of P/E cycles effect of these items enables an SSD with its NAND flash cells can support. When a host 1. Measure the existing database server HET to deliver the endurance and data issues a write command to an SSD, the SSD workload using Perfmon, which is a retention necessary for many data center data management scheme may consume performance monitoring tool. applications.