MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices
Arash Tavakkol, Juan Gómez-Luna, and Mohammad Sadrosadati, ETH Zürich; Saugata Ghose, Carnegie Mellon University; Onur Mutlu, ETH Zürich and Carnegie Mellon University

https://www.usenix.org/conference/fast18/presentation/tavakkol

This paper is included in the Proceedings of the 16th USENIX Conference on File and Storage Technologies. February 12–15, 2018 • Oakland, CA, USA. ISBN 978-1-931971-42-3. Open access to the Proceedings of the 16th USENIX Conference on File and Storage Technologies is sponsored by USENIX.

Abstract

Solid-state drives (SSDs) are used in a wide array of computer systems today, including in datacenters and enterprise servers. As the I/O demands of these systems continue to increase, manufacturers are evolving SSD architectures to keep up with this demand. For example, manufacturers have introduced new high-bandwidth interfaces to replace the conventional SATA host–interface protocol. These new interfaces, such as the NVMe protocol, are designed specifically to enable the high amounts of concurrent I/O bandwidth that SSDs are capable of delivering.

While modern SSDs with sophisticated features such as the NVMe protocol are already on the market, existing SSD simulation tools have fallen behind, as they do not capture these new features. We find that state-of-the-art SSD simulators have three shortcomings that prevent them from accurately modeling the performance of real off-the-shelf SSDs. First, these simulators do not model critical features of new protocols (e.g., NVMe), such as their use of multiple application-level queues for requests and the elimination of OS intervention for I/O request processing. Second, these simulators often do not accurately capture the impact of advanced SSD maintenance algorithms (e.g., garbage collection), as they do not properly or quickly emulate steady-state conditions that can significantly change the behavior of these algorithms in real SSDs. Third, these simulators do not capture the full end-to-end latency of I/O requests, which can incorrectly skew the results reported for SSDs that make use of emerging non-volatile memory technologies. By not accurately modeling these three features, existing simulators report results that deviate significantly from real SSD performance.

In this work, we introduce a new simulator, called MQSim, that accurately models the performance of both modern SSDs and conventional SATA-based SSDs. MQSim faithfully models new high-bandwidth protocol implementations, steady-state SSD conditions, and the full end-to-end latency of requests in modern SSDs. We validate MQSim, showing that it reports performance results that are only 6%–18% apart from the measured actual performance of four real state-of-the-art SSDs. We show that by modeling critical features of modern SSDs, MQSim uncovers several real and important issues that were not captured by existing simulators, such as the performance impact of inter-flow interference. We have released MQSim as an open-source tool, and we hope that it can enable researchers to explore directions in new and different areas.

1 Introduction

Solid-state drives (SSDs) are widely used in today's computer systems. Due to their high throughput, low response time, and decreasing cost, SSDs have replaced traditional magnetic hard disk drives (HDDs) in many datacenters and enterprise servers, as well as in consumer devices. As the I/O demand of both enterprise and consumer applications continues to grow, SSD architectures are rapidly evolving to deliver improved performance.

For example, a major innovation has been the introduction of new host interfaces to the SSD. In the past, many SSDs made use of the Serial Advanced Technology Attachment (SATA) protocol [67], which was originally designed for HDDs. Over time, SATA has proven to be inefficient for SSDs, as it cannot enable the fast I/O accesses and millions of I/O operations per second (IOPS) that contemporary SSDs are capable of delivering. New protocols such as NVMe [63] overcome these barriers as they are designed specifically for the high throughput available in SSDs. NVMe enables high throughput and low latency for I/O requests through its use of the multi-queue SSD (MQ-SSD) concept. While SATA exposes only a single request port to the OS, MQ-SSD protocols provide multiple request queues to directly expose applications to the SSD device controller. This allows (1) an application to bypass OS intervention for I/O request processing, and (2) the SSD controller to schedule I/O requests based on how busy the SSD's resources are. As a result, the SSD can make higher-performance I/O request scheduling decisions.

As SSDs and their associated protocols evolve to keep pace with changing system demands, the research community needs simulation tools that reliably model these new features. Unfortunately, state-of-the-art SSD simulators do not model a number of key properties of modern SSDs that are already on the market. We evaluate several real modern SSDs, and find that state-of-the-art simulators do not capture three features that are critical to accurately model modern SSD behavior.

First, these simulators do not correctly model the multi-queue approach used in modern SSD protocols. Instead, they implement only the single-queue approach used in HDD-based protocols such as SATA. As a result, existing simulators do not capture (1) the high amount of request-level parallelism and (2) the lack of OS intervention in modern SSDs.

Second, many simulators do not adequately model steady-state behavior within a reasonable amount of simulation time. A number of fundamental SSD maintenance algorithms, such as garbage collection [11–13, 23], are not executed when an SSD is new (i.e., no data has been written to the drive). As a result, manufacturers design these maintenance algorithms to work best when an SSD reaches the steady-state operating point (i.e., after all of the pages within the SSD have been written to at least once) [71]. However, simulators that cannot capture steady-state behavior (within a reasonable simulation time) perform these maintenance algorithms on a new SSD. As such, many existing simulators do not adequately capture algorithm behavior under realistic conditions, and often report unrealistic SSD performance results (as we discuss in Section 3.2).

Third, these simulators do not capture the full end-to-end latency of performing I/O requests. Existing simulators capture only the part of the request latency that takes place during intra-SSD operations. However, many emerging high-speed non-volatile memories greatly reduce the latency of intra-SSD operations, and, thus, the uncaptured parts of the latency now make up a significant portion of the overall request latency. For example, in Intel Optane SSDs, which make use of 3D XPoint memory [9, 25], the overhead of processing a request and transferring data over the system I/O bus (e.g., PCIe) is much higher than the memory access latency [16]. By not capturing the full end-to-end latency, existing simulators do not report the true performance of SSDs with new and emerging memory technologies.

Based on our evaluation of real modern SSDs, we find that these three features are essential for a simulator to capture. Because existing simulators do not model these features adequately, their results deviate significantly from the performance of real SSDs. Our goal in this work is to develop a new SSD simulator that can faithfully model the features and performance of both modern multi-queue SSDs and conventional SATA-based SSDs.

To this end, we introduce MQSim, a new simulator that provides an accurate and flexible framework for evaluating SSDs. MQSim addresses the three shortcomings we found in existing simulators, by (1) providing detailed models of both conventional (e.g., SATA) and modern

down each flow unequally) in modern SSDs. This is a major concern, as fairness is a first-class design goal in modern computing platforms [4, 17, 19, 31, 37, 56–60, 66, 73–76, 80, 84, 88]. Unfairness reduces the predictability of the I/O latency and throughput for each flow, and can allow a malicious flow to deny or delay I/O service to other, benign flows.

We have made MQSim available as an open source tool to the research community [1]. We hope that MQSim enables researchers to explore directions in several new and different areas.

We make the following key contributions in this work:

• We use real off-the-shelf SSDs to show that state-of-the-art SSD simulators do not adequately capture three important properties of modern SSDs: (1) the multi-queue model used by modern host–interface protocols such as NVMe, (2) steady-state SSD behavior, and (3) the end-to-end I/O request latency.

• We introduce MQSim, a simulator that accurately models both modern NVMe-based and conventional SATA-based SSDs. To our knowledge, MQSim is the first publicly-available SSD simulator to faithfully model the NVMe protocol. We validate the results reported by MQSim against several real state-of-the-art multi-queue SSDs.

• We demonstrate how MQSim can uncover important issues in modern SSDs that existing simulators cannot capture, such as the impact of inter-flow interference on fairness and system performance.

2 Background

In this section, we provide a brief background on multi-queue SSD (MQ-SSD) devices.
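
As a rough illustration of the single-queue versus multi-queue distinction drawn in the introduction, the following Python sketch compares the order in which requests from two flows are served. It is a toy model under stated assumptions, not MQSim's code and not a statement of what any particular SSD controller does: the round-robin draining of per-flow queues, the flow names "A" and "B", the arrival trace, and all function names are hypothetical choices made for this example.

```python
from collections import deque


def single_queue_order(requests):
    """Single shared queue (SATA-like assumption): serve strictly in arrival order."""
    return list(requests)


def multi_queue_order(requests):
    """One queue per flow, drained round-robin (an assumed arbitration policy)."""
    queues = {}  # flow id -> deque of that flow's requests, in first-arrival order
    for flow, req in requests:
        queues.setdefault(flow, deque()).append((flow, req))
    order = []
    while any(queues.values()):
        for q in queues.values():
            if q:  # take at most one request per flow in each round
                order.append(q.popleft())
    return order


def position_of(order, flow):
    """Index at which the first request of `flow` is served."""
    return next(i for i, (f, _) in enumerate(order) if f == flow)


# Hypothetical trace: flow "A" issues a burst of four requests, then flow "B" issues one.
trace = [("A", 0), ("A", 1), ("A", 2), ("A", 3), ("B", 0)]

sq = single_queue_order(trace)
mq = multi_queue_order(trace)
print(position_of(sq, "B"))  # 4: B waits behind the whole burst from A
print(position_of(mq, "B"))  # 1: B is served in the first round
```

In this toy trace, the shared queue makes flow B wait behind every request of flow A, while per-flow queues let the device interleave the two flows. A real controller would also weigh how busy its internal resources are, which this sketch does not model.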