Autossd: an Autonomic SSD Architecture Bryan S

Total Page:16

File Type:pdf, Size:1020Kb

Autossd: an Autonomic SSD Architecture Bryan S AutoSSD: an Autonomic SSD Architecture Bryan S. Kim, Seoul National University; Hyun Suk Yang, Hongik University; Sang Lyul Min, Seoul National University https://www.usenix.org/conference/atc18/presentation/kim This paper is included in the Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC ’18). July 11–13, 2018 • Boston, MA, USA ISBN 978-1-939133-02-1 Open access to the Proceedings of the 2018 USENIX Annual Technical Conference is sponsored by USENIX. AutoSSD: an Autonomic SSD Architecture Bryan S.Kim Hyun Suk Yang Sang Lyul Min Seoul National University Hongik University Seoul National University 1.2 Abstract 1 From small mobile devices to large-scale storage arrays, 0.8 flash memory-based storage systems have gained a lot of 0.6 popularity in recent years. However, the uncoordinated 0.4 use of resources by competing tasks in the flash trans- 0.2 lation layer (FTL) makes it difficult to guarantee pre- Normalized throughput 0 dictable performance. Time In this paper, we present AutoSSD, an autonomic SSD Figure 1: Performance drop and variation under 4KB random architecture that self-manages FTL tasks to maintain a writes. high-level of QoS performance. In AutoSSD, each FTL task is given an illusion of a dedicated flash memory sub- low latency and collectively massive parallelism make system, allowing tasks to be implemented oblivious to flash storage suitable for high-performance storage for others and making it easy to integrate new tasks to han- mission-critical applications. As multi-level cell tech- dle future flash memory quirks. Furthermore, each task is nology [5] and 3D stacking [38] continue to lower the allocated a share that represents its relative importance, cost per GB, flash storage will not only remain compet- and its utilization is enforced by a simple and effective itive in the data storage market, but also will enable the scheduling scheme that limits the number of outstand- emergence of new applications in this Age of Big Data. ing flash memory requests for each task. The shares are Large-scale deployments and user experiences, how- dynamically adjusted through feedback control by mon- ever, reveal that despite its low latency and massive par- itoring key system states and reacting to their changes to allelism, flash storage exhibits high performance insta- coordinate the progress of FTL tasks. bilities and variations [9, 17]. Garbage collection (GC) We demonstrate the effectiveness of AutoSSD by has been pointed out as the main source of the problem holistically considering multiple facets of SSD internal [9, 25, 28, 29, 45], and Figure 1 illustrates this case. It management, and by evaluating it across diverse work- shows the performance degradation of our SSD model loads. Compared to state-of-the-art techniques, our de- under small random writes, and it closely resembles mea- sign reduces the average response time by up to 18.0%, sured results from commercial SSDs [2,23]. Initially, the the 3 nines (99.9%) QoS by up to 67.2%, and the 6 nines SSD’s performance is good because all the resources of (99.9999%) QoS by up to 76.6% for QoS-sensitive small the flash memory subsystem can be used to service host reads. requests. But as the flash memory blocks are consumed by host writes, GC needs to reclaim space by compact- 1 Introduction ing data spread across blocks and erasing unused blocks. Consequently, host and GC compete for resources, and Flash memory-based storage systems have become pop- the host performance inevitably suffers. ular across a wide range of applications from mobile However, garbage collection is a necessary evil for systems to enterprise data storages. Flash memory’s the flash storage. Simply putting off space reclamation small size, resistance to shock and vibration, and low or treating GC as a low priority task will lead to larger power consumption make it the de facto storage medium performance degradations, as host writes will eventually in mobile devices. On the other hand, flash memory’s block and wait for GC to reclaim space. Instead, garbage USENIX Association 2018 USENIX Annual Technical Conference 677 collection must be judiciously scheduled with host re- thetic and real I/O workloads. x 6 discusses our design quests to ensure that there is enough free space for fu- in relation to prior work, and finally x 7 concludes. ture requests, while meeting the performance demands of current requests. This principle of harmonious coexis- 2 Background tence, in fact, extends to every internal management task. Map caching [15] that selectively keeps mapping data in For flash memory to be used as storage, several of its memory generates flash memory traffic on cache misses, limitations need to be addressed. First, it does not allow but this is a mandatory step for locating host data. Read in-place updates, mandating a mapping table between scrubbing [16] that preventively migrates data before its the logical and the physical address space. Second, corruption also creates traffic when blocks are repeatedly the granularities of the two state-modifying operations— read, but failure to perform its duty on time can lead to program and erase—are different in size, making it nec- data loss. As more tasks with unique responsibilities are essary to perform garbage collection (GC) that copies added to the system, it becomes increasingly difficult to valid data to another location for reclaiming space. These design a system that meets its performance and reliability internal management schemes, collectively known as the requirements [13]. flash translation layer (FTL) [11], hide the limitations In this paper, we present an autonomic SSD archi- of flash memory and provide an illusion of a traditional tecture called AutoSSD that self-manages its manage- block storage interface. ment tasks to maintain a high-level of QoS performance. The role of the FTL has become increasingly im- In our design, each task is given a virtualized view of portant as hiding the error-prone nature of flash mem- the flash memory subsystem by hiding the details of ory can be challenging when relying solely on hard- flash memory request scheduling. Each task is allo- ware techniques such as error correction code (ECC) cated a share that represents the amount of progress it and RAID-like parity schemes. Data stored in the flash can make, and a simple yet effective scheduling scheme array may become corrupt in a wide variety of ways. enforces resource arbitration according to the allotted Bits in a cell may be disturbed when neighboring cells shares. The shares are dynamically and automatically ad- are accessed [12, 41, 44], and the electrons in the float- justed through feedback control by monitoring key sys- ing gate that represent data may gradually leak over tem states and reacting to their changes. This achieves time [6, 35, 44]. Sudden power loss can increase bit er- predictable performance by maintaining a stable system. ror rates beyond the error correction capabilities [44,47], We show that for small read requests, AutoSSD reduces and error rates increase as flash memory blocks wear the average response time by up to 18.0%, the 3 nines out [6, 12, 19]. As flash memory becomes less reliable (99.9%) QoS by up to 67.2%, and the 6 nines (99.9999%) in favor of high-density [13], more sophisticated FTL al- QoS by up to 76.6% compared to state-of-the-art tech- gorithms are needed to complement existing reliability niques. Our contributions are as follows: enhancement techniques. Even though modern flash storages are equipped with • We present AutoSSD, an autonomic SSD architec- sophisticated FTLs and powerful controllers, meeting ture that dynamically manages internal housekeep- performance requirements have three main challenges. ing tasks to maintain a stable system state. (x 3) First, as new quirks of flash memory are introduced, • We holistically consider multiple facets of SSD in- more FTL tasks are added to hide the limitations, thereby ternal management, including not only garbage col- increasing the complexity of the system. Furthermore, lection and host request handling, but also mapping existing FTL algorithms need to be fine-tuned for ev- management and read scrubbing. (x 4) ery new generation of flash memory, making it difficult to design a system that universally meets performance • We evaluate our design and compare it to the state- requirements. Second, multiple FTL tasks generate se- of-the-art techniques across diverse workloads, an- quences of flash memory requests that contend for the alyze causes for long tail latencies, and demonstrate resources of the shared flash memory subsystem. This the advantages of dynamic management. (x 5) resource contention creates queueing delays that increase response times and causes long-tail latencies. Lastly, de- The remainder of this paper is organized as follows. pending on the state of the flash storage, the importance x 2 gives a background on understanding why flash stor- of FTL tasks dynamically changes. For example, if the ages exhibit performance unpredictability. x 3 presents flash storage runs out of free blocks for writing host data, the overall architecture of AutoSSD and explains our de- host request handling stalls and waits for garbage col- sign choices. x 4 describes the evaluation methodology lection to reclaim free space. On the other hand, with and the SSD model that implements various FTL tasks, sufficient free blocks, there is no incentive prioritizing and x 5 presents the experimental results under both syn- garbage collection over host request handling. 678 2018 USENIX Annual Technical Conference USENIX Association Autonomic SSD Task queues Flash Flash channel Flash Translation Layer memory subsystem Flash Flash Flash Host queue chip chip chip Host system request Flash channel handling Garbage Flash Flash Flash collection Scheduling Flash memory chip chip chip Subsystem Subsystem Other mgmt tasks Flash channel Read scrubbing Flash Flash Flash chip chip chip Key system states Share (# of clean blocks, weight read count, etc) Share controller Figure 2: Overall architecture of AutoSSD and its components.
Recommended publications
  • Advanced NAND Flash Memory Single-Chip Storage Solution
    SOLUTION SHEET: ADVANCED SOLID STATE MEMORY CONTROLLER Advanced NAND Flash Memory Single-Chip Storage Solution ? The growing complexity of flash memory management algorithms has made controller designs complex and has impacted the performance and the diversity of NAND devices that a single controller can support. As the complexity of the controller design increases, there comes a point where hardware acceleration is necessary. We see this trend for partial FTL (Flash Translation Layer) layers, addressing and wear leveling, and command queuing. At the same time, algorithms and flash device characteristics constantly change through the life of the controller. But data center managers expect the controller to constantly make design tradeoffs in order to improve in-system performance. These changes are also impacted by the traffic patterns of the application. The challenge is to provide hardware-accelerated algorithms that can change frequently in the field. The Context Solid-state storage based on NAND flash technology has become an important new lever for improving data-center performance and predictability. But the solid-state drives also present cost and predictability issues of their own. Surprisingly, these issues can be addressed through design of the flash memory controller. The NAND array has become the dominating factor for the cost of the drive. Increasing the life expectancy of the NAND array obviously reduces the total cost of ownership of the system. Also, the ability to reconfigure the controller to implement improved algorithms or to adapt to a new generation of flash chips extends the system life. Performance predictability and throughput are key concerns for data-center managers.
    [Show full text]
  • Using Data Postcompensation and Predistortion to Tolerate Cell-To-Cell
    2718 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 57, NO. 10, OCTOBER 2010 Using Data Postcompensation and Predistortion to Tolerate Cell-to-Cell Interference in MLC NAND Flash Memory Guiqiang Dong, Student Member, IEEE, Shu Li, and Tong Zhang, Senior Member, IEEE Abstract—With the appealing storage-density advantage, mul- than 1 bit in each memory cell (or floating-gate MOS tran- tilevel-per-cell (MLC) NAND Flash memory that stores more than sistor) by programming the cell threshold voltage into one 1 bit in each memory cell now largely dominates the global Flash of voltage windows, has been widely used to further memory market. However, due to the inherent smaller noise NAND margin, the MLC NAND Flash memory is more subject to various improve the Flash memory storage density. Because of device/circuit variability and noise, particularly as the industry is its obvious advantages in terms of storage density and, hence, pushing the limit of technology scaling and a more aggressive use cost, MLC NAND Flash memory now largely dominates the of MLC storage. Cell-to-cell interference has been well recognized global Flash memory market. In current design practice, most as a major noise source responsible for raw-memory-storage MLC NAND Flash memories store 2 bits per cell, while 3- and reliability degradation. Leveraging the fact that cell-to-cell in- terference is a deterministic data-dependent process and can even 4-bit-per-cell NAND Flash memories have been recently be mathematically described with a simple formula, we present reported in the open literature [2]–[6].
    [Show full text]
  • Speeding Data Access with the Sun Flash Accelerator F20 Pcie Card
    An Oracle White Paper June 2010 Speeding Data Access with the Sun Flash Accelerator F20 PCIe Card Speeding Data Access with the Sun Flash Accelerator F20 PCIe Card Introduction..........................................................................................1 Accelerating Application Performance with Enterprise-Quality Sun FlashFire Technology ..........................................................................................3 Sun Flash Accelerator F20 PCIe Card Highlights ...........................4 Sun Flash Accelerator F20 PCIe Card Architecture ............................5 Sun Flash Accelerator F20 PCIe Card Components...........................5 Disk on Module Design....................................................................6 Enterprise-Quality NAND Flash for Reliability........................................ 8 Serial-Attached SCSI Controller ......................................................8 Twelve-Port Serial-Attached SCSI Expander ..................................8 Supercapacitor Module....................................................................8 Supercapacitors versus Batteries .......................................................... 8 Reliability, Availability, and Serviceability Features.............................9 Deploying the Sun Flash Accelerator F20 PCIe Card .......................11 Flash Technology Considerations .................................................11 Aligning Data Transfers and Block Size............................................... 12 Database Acceleration
    [Show full text]
  • Using Test Beds to Probe Non- Volatile Memories' Dark
    Beyond the Datasheet: Using Test Beds to Probe Non- Volatile Memories’ Dark Secrets Laura M. Grupp*, Adrian M. Caulfield*, Joel Coburn*, John D. Davis†, and Steven Swanson* *University of California, San Diego †Microsoft Research, Silicon Valley Lab [lgrupp, acaulfie, jdcoburn, swanson]@cs.ucsd.edu [email protected] latency, bandwidth, bit error rate, and energy efficiency will all combine to determine which designs are viable Abstract and which are not. Second, they must understand how Non-volatile memories (such as NAND flash and phase the non-volatile memory array will affect the rest of the change memories) have the potential to revolutionize system. If a new storage technology significantly computer systems. However, these technologies have increases performance, it may expose bottlenecks in complex behavior in terms of performance, reliability, other parts of the system. and energy consumption that make fully exploiting Addressing the first challenge requires a their potential a complicated task. As device engineers comprehensive understanding of non-volatile memory push bit densities higher, this complexity will only performance, failure modes, and energy consumption. increase. Managing and exploiting the complex and at This information is difficult to come by. Device times surprising behavior of these memories requires a manufacturers hide the details of their devices behind deep understanding of the devices grounded in non-disclosure agreements, and even when datasheets experimental results. Our research groups have are freely available, they provide scant details: average developed several hardware test beds for flash and performance characteristics, vague recommendations other memories that allow us to both characterize these about error correction requirements, and inexact memories and experimentally evaluate their bounds on energy consumption.
    [Show full text]
  • Flash Memory Guide
    Flash Memory Guide Portable Flash memory for computers, digital cameras, mobile phones and other devices Flash Memory Guide Kingston®, the world’s leading independent manufacturer of memory products, offers a broad range of Flash cards, USB Flash drives and Solid-State Drives (SSD) (collectively called Flash storage devices) that employ Flash memory chips for storage. The purpose of this guide is to explain the various technologies and Flash memory offerings that are available. Note: Due to Flash technology changes, specifications listed in this document are subject to change without notice 1.0 Flash Memory: Empowering A New Generation of Flash Storage Devices Toshiba invented Flash memory in the 1980s as a new memory technology that allowed stored data to be saved even when the memory device was disconnected from its power source. Since then, Flash memory technology has evolved into the preferred storage media for a variety of consumer and industrial devices. In consumer devices, Flash memory is widely used in: • Notebook computers • Personal computers • Tablets • Digital cameras • Global Positioning Systems (GPS) • Mobile phones • Solid-state music players such as • Electronic musical instruments MP3 players • Television set-top boxes • Portable and Home Video game consoles Flash memory is also used in many industrial applications where reliability and data retention in power-off situations are key requirements, such as in: • Security systems/IP Cameras • Military systems • Embedded computers • Set-top boxes • Networking and communication products • Wireless communication devices • Retail management products • Point of Sale Devices (e.g., handheld scanners) Please Note: Most Kingston Flash memory is designed and tested for compatibility with consumer devices.
    [Show full text]
  • Flash Memory Controller 1TB SSD Controller Gbdriver RS4 Series with 3 Gbps Serial ATA
    Flash Memory Controller 1TB SSD Controller GBDriver RS4 Series with 3 Gbps Serial ATA • Supports latest 16 kB/page structure as well as 8 kB/page and 4 kB/page NAND type flash memory, latest 1 TB SSD • Highly reliable high-speed SATA SSD controller with Enhanced ECC and read retry functions November 6, 2012 TDK Corporation has developed the GBDriver RS4 series of NAND type flash memory controllers with support for 3 Gbps serial ATA. Sales will begin in January 2013. With an effective access speed of 180 MB/sec, the GBDriver RS4 is an advanced high- speed SATA controller chip. It is fully compatible not only with the latest 1xnm generation and 2xnm generation 16 kB/page SLC (single level cell) and MLC (multi level cell) NAND type flash memory, but also with 8 kB/page and 4 kB/page structures. This allows the customer to select the optimal flash memory for the respective application, providing the flexibility to configure high-speed SATA storage with a capacity ranging from 512 MB to 1 TB. Along with high-speed control, a read retry function has been added which is essential for ensuring data reliability when using the latest MLC type NAND flash memory. Together with the auto-recovery, data randomizer, and auto-refresh functions of the existing GBDriver series, this design accommodates also future developments in the flash memory format. Data reliability is significantly enhanced by the highly acclaimed GBDriver power interruption tolerance algorithm. For customers requiring an even higher level of data reliability, the Enhanced ECC function makes it possible to expand ECC to 71 bits per 512 byte block.
    [Show full text]
  • 88ALP01 PCI to NAND, SD and Camera Host Controller Datasheet
    Cover 88ALP01 PCI to NAND, SD and Camera Host Controller Datasheet Doc. No. MV-S103921-00, Rev. – July 17, 2007 Marvell. Moving Forward Faster Document Classification: Proprietary Information 88ALP01 Datasheet Document Conventions Note: Provides related information or information of special importance. Caution: Indicates potential damage to hardware or software, or loss of data. Warning: Indicates a risk of personal injury. Document Status Doc Status: Preliminary Technical Publication: x.xx For more information, visit our website at: www.marvell.com Disclaimer No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, for any purpose, without the express written permission of Marvell. Marvell retains the right to make changes to this document at any time, without notice. Marvell makes no warranty of any kind, expressed or implied, with regard to any information contained in this document, including, but not limited to, the implied warranties of merchantability or fitness for any particular purpose. Further, Marvell does not warrant the accuracy or completeness of the information, text, graphics, or other items contained within this document. Marvell products are not designed for use in life-support equipment or applications that would cause a life-threatening situation if any such products failed. Do not use Marvell products in these types of equipment or applications. With respect to the products described herein, the user or recipient, in the absence of appropriate U.S. government authorization, agrees: 1) Not to re-export or release any such information consisting of technology, software or source code controlled for national security reasons by the U.S.
    [Show full text]
  • NAND Flash Controller 13 2014.08.18
    NAND Flash Controller 13 2014.08.18 a10_54010 Subscribe Send Feedback The hard processor system (HPS) provides a NAND flash controller to interface with external NAND flash memory in Altera® system-on-a-chip (SoC) FPGA systems. You can use external flash memory to store a processor boot image, software, or as extra storage capacity for large applications or user data. The HPS NAND flash controller is based on the Cadence® Design IP® NAND Flash Memory Controller. NAND Flash Controller Features The NAND flash controller provides the following functionality and features: • Supports one x8 or x16 NAND flash device Note: The HPS supports booting only from a x8 NAND flash device. • Supports Open NAND Flash Interface (ONFI) 1.0 • Supports NAND flash memories from Hynix, Samsung, Toshiba, Micron, and STMicroelectronics • Supports programmable 512 byte (4-, 8-, or 16-bit correction) or 1024 byte (24-bit correction) error correction code (ECC) sector size • Supports pipeline read-ahead and write commands for enhanced read/write throughput • Supports devices with 32, 64, 128, 256, 384, or 512 pages per block • Supports multiplane devices • Supports page sizes of 512 bytes, 2 kilobytes (KB), 4 KB, or 8 KB • Supports single-level cell (SLC) and multi-level cell (MLC) devices with programmable correction capabilities • Provides internal direct memory access (DMA) • Provides programmable access timing • Provides three NAND FIFOs—ECC buffer, Write FIFO, and Read FIFO, using single-error detection/correction and double-error detection NAND Flash Controller Block Diagram and System Integration The flash controller receives commands and data from the host through the command and data slave interface.
    [Show full text]
  • 2013 Annual Report
    SILICON MOTION TECHNOLOGY CORPORATION Annual Report 2013 Financial Highlights Product Mix % of Net Sales (2013) Mobile Storage 82 % Mobile Communications 14 % Others 4 % Net Sales US$ million 2009 89 2010 133 2011 224 2012 281 2013 225 Contents Financial Highlights 01 Letter to Shareholders 02 About Silicon Motion 04 Form 20-F Directors and Executives Gross Margin % of Net Sales 2009 41% 2010 47% 2011 48% 2012 47% 2013 47% Diluted Earnings per ADS US$ 2009 - 2.54 2010 - 0.18 2011 1.28 2012 1.40 2013 0.81 01 Letter to Shareholders Dear Shareholders, 2013 was a transitional year for Silicon Motion as sales of removable storage controllers for mature card and USB flash drive markets declined while sales of SSD plus embedded storage controllers, especially for eMMC growth markets, grew rapidly. 2013 was also a transitional year for our LTE transceiver products as the market quickly moved to the significantly enhanced LTE-Advanced standard. While the negative effects of the transitions were dramatic from the perspective of revenue and profitability, we believe we have built a strong foundation for renewed, long-term growth by focusing on eMMC, SSD, and other embedded storage applications and are confident this will lead to strong growth in upcoming years. In 2013, sales of our SSD plus embedded products grew almost 100% year-over-year to account for almost half of our Mobile Storage sales. Sales of eMMC controllers, our largest SSD plus embedded product, are now already larger than our card and USB flash drive controllers sales, and account for roughly a third of our total revenue.
    [Show full text]
  • Flash Memory Guide
    Flash Memory Guide Portable Flash memory for computers, digital cameras, mobile phones and other devices Flash Memory Guide Kingston®, the world’s leading independent manufacturer of memory products, offers a broad range of Flash cards, USB Flash drives and Solid-State Drives (SSD) (collectively called Flash storage devices) that employ Flash memory chips for storage. The purpose of this guide is to explain the various technologies and Flash memory offerings that are available. Note: Due to Flash technology changes, specifications listed in this document are subject to change without notice 1.0 Flash Memory: Empowering A New Generation of Flash Storage Devices Toshiba invented Flash memory in the 1980s as a new memory technology that allowed stored data to be saved even when the memory device was disconnected from its power source. Since then, Flash memory technology has evolved into the preferred storage media for a variety of consumer and industrial devices. In consumer devices, Flash memory is widely used in: • Notebook computers • Personal computers • Tablets • Digital cameras • Global Positioning Systems (GPS) • Mobile phones • Solid-state music players such as • Electronic musical instruments MP3 players • Television set-top boxes Flash memory is also used in many industrial applications where reliability and data retention in power-off situations are key requirements, such as in: • Security systems • Military systems • Embedded computers • Solid-state drives • Networking and communication products • Wireless communication devices • Retail management products • Point of Sale Devices (e.g., handheld scanners) 2.0 SSD, Flash Card and USB Flash Drive Capacity Some of a Flash storage device’s listed capacity is used for formatting and other functions and thus is not available for data storage.
    [Show full text]
  • X1 SSD Flash Memory Controller
    X1 SSD Flash Memory Controller Hyperstone GmbH Hyperstone Inc. - USA Hyperstone Asia Pacific - Taiwan Line-Eid-Strasse 3 486 N Patterson Ave Suite 301 3F., No. 501, Sec.2, Tiding Blvd. 78467 Konstanz Winston-Salem, NC 27101 Neihu District, Taipei City 114 Germany USA Taiwan, R.O.C. Phone: +49 7531 980 30 Phone: +1 336 744 0724 Phone: +886 2 8751 0203 Email: [email protected] Email: [email protected] Email: [email protected] X1 www.hyperstone.com Content is subject to change without prior notice. X1, hyReliability and hyMap are trademarks of Hyperstone GmbH. Other brand, product, or company names are property of the respective holder. Warranties, implied or expressed as well as liabilities for any damage resulting from using provided information in this document are excluded until part of a separate written contract. HS-Mkt-LH-19-01-25 X1 SSD Flash Memory Controller ■ The Hyperstone X1 family of SATA SSD NAND flash controllers together with provided firmware, reference designs and tools offers an X1 Block Diagram Controller & CPU easy-to-use turnkey solution for industrial, high endurance and robust Solid State Drives, or modules compatible to host systems with ■■High performance dual-core microprocessor based on the ■■Flash memory power down logic and write protect control SATA interface. The X1 family enables SATA storage systems with the latest flash memory, advanced security and highest reliability. Hyperstone architecture ■■Firmware storage in flash, loaded into controller by boot ROM ■■16 GPIO pins provide flexibility
    [Show full text]
  • Scalable Parallel Flash Firmware for Many-Core
    Scalable Parallel Flash Firmware for Many-core Architectures Jie Zhang and Miryeong Kwon, KAIST; Michael Swift, University of Wisconsin-Madison; Myoungsoo Jung, KAIST https://www.usenix.org/conference/fast20/presentation/zhang-jie This paper is included in the Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST ’20) February 25–27, 2020 • Santa Clara, CA, USA 978-1-939133-12-0 Open access to the Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST ’20) is sponsored by Scalable Parallel Flash Firmware for Many-core Architectures Jie Zhang1, Miryeong Kwon1, Michael Swift2, Myoungsoo Jung1 Computer Architecture and Memory Systems Laboratory, Korea Advanced Institute of Science and Technology (KAIST)1, University of Wisconsin at Madison2 http://camelab.org Abstract requests are sent to tens or hundreds of flash packages. This enables assigning queues to different applications; multiple NVMe is designed to unshackle flash from a traditional stor- deep NVMe queues allow the host to employ many threads age bus by allowing hosts to employ many threads to achieve thereby maximizing the storage utilization. higher bandwidth. While NVMe enables users to fully exploit An SSD should handle many concurrent requests with its all levels of parallelism offered by modern SSDs, current massive internal parallelism [12,31,33,34,61]. However, it firmware designs are not scalable and have difficulty in han- is difficult for a single storage device to manage the tremen- dling a large number of I/O requests in parallel due to its dous number of I/O requests arriving in parallel over many limited computation power and many hardware contentions.
    [Show full text]