Optimizing NAND Flash-Based SSDs via Retention Relaxation
Ren-Shuo Liu∗, Chia-Lin Yang∗, and Wei Wu†
∗National Taiwan University and †Intel Corporation

Abstract

As NAND Flash technology continues to scale down and more bits are stored in a cell, the raw reliability of NAND Flash memories degrades inevitably. To meet the retention capability required for a reliable storage system, we see a trend of longer write latency and more complex ECCs employed in an SSD storage system. These greatly impact the performance of future SSDs. In this paper, we present the first work to improve SSD performance via retention relaxation. NAND Flash is typically required to retain data for 1 to 10 years according to industrial standards. However, we observe that many data are overwritten in hours or days in several popular workloads in datacenters. The gap between the specification guarantee and actual programs' needs can be exploited to improve write speed or ECCs' cost and performance. To exploit this opportunity, we propose a system design that allows data to be written in various latencies or protected by different ECC codes without hampering reliability. Simulation results show that via write speed optimization, we can achieve 1.8–5.7× write response time speedup. We also show that for future SSDs, retention relaxation can bring both performance and cost benefits to the ECC architecture.

1 Introduction

For the past few years, NAND Flash memories have been widely used in portable devices such as media players and mobile phones. Due to their high density, low power and high I/O performance, in recent years, NAND Flash memories have begun to make the transition from portable devices to laptops, PCs and datacenters [6, 35]. As the semiconductor industry continues scaling memory technology and lowering per-bit cost, NAND Flash is expected to replace the role of hard disk drives and fundamentally change the storage hierarchy in future computer systems [14, 16].

A reliable storage system needs to provide a retention guarantee. Therefore, Flash memories have to meet the retention specification in industrial standards. For example, according to the JEDEC standard JESD47G.01 [19], NAND Flash blocks cycled to 10% of the maximum specified endurance must retain data for 10 years, and blocks cycled to 100% of the maximum specified endurance have to retain data for 1 year. As NAND Flash technology continues to scale down and more bits are stored in a cell, the raw reliability of NAND Flash decreases substantially. To meet the retention specification for a reliable storage system, we see a trend of longer write latency and more complex ECCs required in SSDs. For example, comparing recent 2-bit MLC NAND Flash memories with previous SLC ones, page write latency increased from 200 μs [34] to 1800 μs [39], and the required strength of ECCs went from single-error-correcting Hamming codes [34] to 24-error-correcting Bose-Chaudhuri-Hocquenghem (BCH) codes [8, 18, 28]. In the near future, more complex ECC codes such as low-density parity-check (LDPC) [15] codes will be required to reliably operate NAND Flash memories [13, 28, 41].

To overcome the design challenge for future SSDs, in this paper, we present retention relaxation, the first work on optimizing SSDs via relaxing NAND Flash's retention capability. We observe that in typical datacenter workloads, e.g., proxy and MapReduce, many data written into storage are updated quite soon, thereby requiring only days or even hours of data retention, which is much shorter than the retention time typically specified for NAND Flash. In this paper, we exploit the gap between the specification guarantee and actual programs' needs for SSD optimization. We make the following contributions:

• We propose a NAND Flash model that captures the relationship between raw bit error rates and retention time based on empirical measurement data. This model allows us to explore the interplay between retention capability and other NAND Flash parameters such as the program step voltage for write operations.
• A set of datacenter workloads are characterized for their retention time requirements. Since I/O traces are usually gathered in days or weeks, to analyze retention time requirements in a time span beyond the trace period, we present a retention time projection method based on two characteristics obtained from the traces: the write amount and the write working set size. Characterization results show that for 15 of the 16 traces analyzed, 49–99% of writes require less than 1-week retention time.
• We explore the benefits of retention relaxation for speeding up write operations. We increase the program step voltage so that NAND Flash memories are programmed faster but with shorter retention guarantees. Experimental results show that 1.8–5.7× SSD write response time speedup is achievable.
• We show how retention relaxation can benefit ECC designs for future SSDs which require concatenated BCH-LDPC codes. We propose an ECC architecture where data are encoded by variable ECC codes based on their retention requirements. In our ECC architecture, time-consuming LDPC is removed from the critical performance path. Therefore, retention relaxation can bring both performance and cost benefits to the ECC architecture.

The rest of the paper is organized as follows. Section 2 provides background about NAND Flash. Section 3 presents our NAND Flash model and the benefits of retention relaxation. Section 4 analyzes data retention requirements in real-world workloads. Section 5 describes the proposed system designs. Section 6 presents evaluation results regarding the designs in the previous section. Section 7 describes related work, and Section 8 concludes the paper.

2 Background

NAND Flash memories comprise an array of floating gate transistors. The threshold voltage (Vth) of the transistors can be programmed to different levels by injecting different amounts of charge on the floating gates. Different Vth levels represent different data. For example, to store N bits of data in a cell, its Vth is programmed to one of its 2^N different Vth levels.

To program Vth to the desired level, the incremental step pulse programming (ISPP) scheme is commonly used [26, 37]. As shown in Figure 1, ISPP increases the Vth of NAND Flash cells step-by-step by a certain voltage increment (i.e., ΔVP) and stops once Vth is greater than the desired threshold voltage. Because NAND Flash cells have different starting Vth, the resulting Vth spreads across a range, which determines the precision of cells' Vth distributions. The smaller ΔVP is, the more precise the resulting Vth is. On the other hand, smaller ΔVP means more steps are required to reach the target Vth, thereby resulting in longer write latency [26].

Figure 1: Incremental step pulse programming (ISPP) for programming NAND Flash. (a) ISPP with large ΔVP. (b) ISPP with small ΔVP. In both panels the Vth axis shows the erase state (11) followed by the verify levels for states 10, 00, and 01; the step increment, and hence the precision of the resulting Vth, is ΔVP_large in (a) and ΔVP_small in (b).

NAND Flash memories are prone to errors. That is, the Vth level of a cell may be different from the intended one. The fraction of bits which contain incorrect data is referred to as the raw bit error rate (RBER). Figure 2(a) shows measured RBER of 63–72 nm 2-bit MLC NAND Flash memories under room temperature following 10K program/erase (P/E) cycles [27]. The RBER at retention time = 0 is attributed to write errors. Write errors have been shown to be mostly caused by cells with higher Vth than intended because the causes of write errors, such as program disturb and random telegraph noise, tend to over-program Vth. The increment of RBER after writing data (retention time > 0) is attributed to retention errors. Retention errors are caused by charge losses which decrease Vth. Therefore, retention errors are dominated by cells with lower Vth than intended. Figure 2(b) illustrates these two error sources: write errors mainly correspond to the tail at the high-Vth side; retention errors correspond to the tail at the low-Vth side. In Section 3.1, we model NAND Flash considering these error characteristics.

A common approach to handle NAND Flash errors is to adopt ECCs (error correction codes). ECCs supplement user data with redundant parity bits to form codewords. With ECC protection, a codeword with a certain number of corrupted bits can be reconstructed. Therefore, ECCs can greatly reduce the bit error rate. We refer to the bit error rate after applying ECCs as the uncorrectable bit error rate (UBER). The following equation gives the relationship between UBER and RBER [27]:

Figure 2 panels: (a) Measured RBER(t) and fitting to power-law trends for 63–72 nm 2-bit MLC NAND Flash (x-axis: retention time, 0–500 days; y-axis: raw bit error rate, 1.00E-09 to 1.00E-04; measured data from three manufacturers with fits of the form y = a + bx^m, m = 1.08, 1.33, and 1.25). (b) Write errors and retention errors (P(v) versus Vth around the lower, intended, and higher levels).
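As a rough illustration of the power-law trend fitted in Figure 2(a), the sketch below evaluates y = a + bx^m at a one-year and a one-week retention time for each of the three fitted exponents. The exponents are those listed in the figure legend; the coefficients A_WRITE and B_RETENTION and the function names are hypothetical placeholders chosen only to make the example run, not values taken from the measurements.

```python
def rber(t_days, a, b, m):
    """Power-law retention trend of the form y = a + b * x**m.

    a is the RBER at retention time 0 (write errors); the b * x**m term
    is the growth attributed to retention errors.
    """
    if t_days < 0:
        raise ValueError("retention time must be non-negative")
    return a + b * t_days ** m


# Hypothetical coefficients for illustration only (not from the paper).
A_WRITE = 1e-6      # assumed RBER at t = 0
B_RETENTION = 1e-8  # assumed retention-error growth coefficient

# Exponents listed in the Figure 2(a) legend.
FITTED_EXPONENTS = (1.08, 1.33, 1.25)


def compare(m):
    """Return the modeled RBER after one year and after one week."""
    one_year = rber(365, A_WRITE, B_RETENTION, m)
    one_week = rber(7, A_WRITE, B_RETENTION, m)
    return one_year, one_week


if __name__ == "__main__":
    for m in FITTED_EXPONENTS:
        one_year, one_week = compare(m)
        print(f"m={m}: 1 year -> {one_year:.3e}, 1 week -> {one_week:.3e}")
```

Under these assumed coefficients, and for any positive b, a shorter retention target gives a lower modeled RBER, which is consistent with the gap between specified and required retention that the paper sets out to exploit.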