Flash-aware Database Management Systems Hardock, Sergej (2020)

DOI (TUprints): https://doi.org/10.25534/tuprints-00014476

License:

CC-BY-SA 4.0 International - Creative Commons, Attribution Share-Alike
Publication type: Ph.D. Thesis
Division: 20 Department of Computer Science
Original source: https://tuprints.ulb.tu-darmstadt.de/14476

Flash-aware Database Management Systems

Zur Erlangung des akademischen Grades Doktor-Ingenieur (Dr.-Ing.)
Genehmigte Dissertation von Sergej Hardock aus Kiew, Ukraine
Tag der Einreichung: 30.7.2020, Tag der Prüfung: 4.11.2020

1. Gutachten: Prof. Dr. Carsten Binnig
2. Gutachten: Prof. Dr.-Ing. Ilia Petrov
3. Gutachten: Alejandro Buchmann, Ph.D.

Darmstadt

Computer Science Department
Technical University of Darmstadt
Data Management Lab

Flash-aware Database Management Systems

Accepted doctoral thesis by Sergej Hardock

1. Review: Prof. Dr. Carsten Binnig
2. Review: Prof. Dr.-Ing. Ilia Petrov
3. Review: Alejandro Buchmann, Ph.D.

Date of submission: 30.7.2020 Date of thesis defense: 4.11.2020

Darmstadt

Please cite this document as:
URN: urn:nbn:de:tuda-tuprints-144769
URL: http://tuprints.ulb.tu-darmstadt.de/id/eprint/14476

This document is provided by tuprints, the e-publishing service of TU Darmstadt.
http://tuprints.ulb.tu-darmstadt.de
[email protected]

This publication is licensed under the following Creative Commons license:
Attribution - ShareAlike 4.0 International
https://creativecommons.org/licenses/by-sa/4.0

This dissertation is dedicated to my parents who encouraged me to follow my dreams.

Declarations according to the Doctoral Degree Regulations (Promotionsordnung)

§8 Abs. 1 lit. c PromO

I hereby confirm that the electronic version of my dissertation is identical to the written version.

§8 Abs. 1 lit. d PromO

I hereby confirm that no doctorate has been attempted at an earlier point in time. Otherwise, details of the date, university, dissertation topic and result of that attempt would have to be disclosed.

§9 Abs. 1 PromO

I hereby confirm that this dissertation was written independently and only using the cited sources.

§9 Abs. 2 PromO

This thesis has not yet been used for examination purposes.

Darmstadt, 30.7.2020 S. Hardock


Acknowledgments

First, I would like to thank my supervisors Ilia Petrov and Alejandro Buchmann. My interest in database management systems, which has continuously evolved over the years, began back in 2010 with your exciting lectures. After different projects and a Master's thesis done under your supervision, it was a great pleasure to get an offer to work as a PhD student in your team. Without your support, help and guidance I would not have been able to finish this dissertation. I wish to thank all other members of the defense committee, Carsten Binnig, Andreas Koch, Christian Bischof and Max Mühlhäuser, for meaningful and fruitful discussions, comments and suggestions. Many thanks to my former colleagues Robert Gottstein, Daniel Bausch, Dejan Petkov, Alexander Frömmgen and Robert Rehner. Your support helped me a lot during this research. I really appreciate the work and spare time we spent together at the university and at conferences. I would like to express my gratitude to Maria Tiedemann for her help with many organisational matters. I had the great pleasure of working with the undergraduate students Sruthi Parvathy Subramanian, Mohan Kanth Dayanandan and Elena-Madalina Cososchi. Thank you for your contribution to this project. Special thanks to my family, especially to my dear parents. Your support and belief in me helped me through various difficult moments over these six years. You did not let me give up at the final, most difficult steps.


Abstract

Flash SSDs are becoming the primary storage technology for single servers and large data centers. In contrast to conventional magnetic disks, which dominated the storage market for more than 40 years, Flash offers significantly higher performance, consumes less energy and has a lower cost per IOPS (I/O Operations Per Second). Besides these advantages, an important role in the establishment and rapid proliferation of Flash storage was played by the black-box design of SSDs, which guaranteed their backwards compatibility with traditional hard disk drives. This makes the replacement of HDDs seamless, as the software stack, including the application, does not require any adjustment. However, such a design of SSDs has multiple disadvantages, which become especially critical for database management systems. The backwards compatibility of SSDs is encapsulated in the so-called Flash translation layer (FTL). The FTL is a set of Flash management tasks that typically run on the device and mask the native behavior of Flash memory. In other words, the FTL creates a black box over Flash memory and emulates the behavior of HDDs. The fact that the database system has no knowledge about the FTL, and has no control over the physical data placement on Flash, results in high I/O overhead, which is caused by suboptimal realization of Flash management tasks and functional redundancy along the critical I/O path. Thus, the write-amplification of conventional SSDs used in the traditional ’cooked’ storage architecture (i.e., with file system indirection) can be as high as 15x, i.e., a single 4KB write request submitted by the DBMS can turn into 60KB being physically written to Flash storage. As a result, the effective I/O throughput and longevity expectations of SSDs are significantly lower than those of the Flash memory encapsulated in these SSDs. In this work we describe our approach - the NoFTL storage architecture - which aims to eliminate the aforementioned disadvantages of modern Flash SSDs. The basic idea behind NoFTL is to give the database management system full control over the underlying Flash storage, which in turn requires the elimination of all intermediate abstraction layers (file system, block device layer and FTL) between the DBMS and the physical storage. NoFTL consists of three main elements - (i) a native Flash interface; (ii) the integration of Flash management into subsystems of the DBMS; and (iii) the concept of configurable Flash storage.

The interplay of these three elements allows us to realize the whole performance potential of Flash memory. The native Flash interface allows the DBMS to control physical data placement on Flash storage, and to utilize the computational power of the SSD to perform near-data processing. The integration of typical Flash management tasks (address translation, garbage collection and wear-leveling) into different subsystems of the DBMS leads to an optimization of these tasks and of native DBMS algorithms. The concept of configurable Flash storage is a unique approach to organize and manage data on Flash SSDs. With the help of novel storage abstractions, the database system can perform intelligent data placement by clustering objects into different regions. Moreover, for each such region the DBMS can apply a separate set of Flash management algorithms, which would be optimal for the data assigned to that region. All this reduces the write-amplification of SSDs to a minimum (up to 15x reduction for OLTP workloads), improves the overall system performance, and significantly increases the lifetime of Flash SSDs (up to 30x improvement). We have realized the NoFTL prototype on an open-source database engine and evaluated it under various scenarios and on different testbeds.

Zusammenfassung

Flash-SSDs werden zur primären Speichertechnologie für einzelne Server und große Rechenzentren. Im Gegensatz zu herkömmlichen Festplatten, die mehr als 40 Jahre lang den Speichermarkt dominierten, bietet Flash deutlich mehr Leistung, verbraucht weniger Energie und hat geringere Kosten pro IOPS (I/O-Operationen pro Sekunde). Eine wichtige Rolle bei der Etablierung und schnellen Verbreitung von Flash-Speichern spielte neben diesen Vorteilen auch das Black-Box-Design von SSDs, wodurch deren Abwärtskompatibilität mit den herkömmlichen Festplatten garantiert wurde. Dies macht den Austausch von Festplatten nahtlos, da der Software-Stack einschließlich der Anwendung keine Anpassung erfordert. Ein solches Design von SSDs weist jedoch mehrere Nachteile auf, die für Datenbankverwaltungssysteme besonders kritisch werden. Die Abwärtskompatibilität von SSDs ist in der sogenannten Flash Translation Layer (FTL) eingekapselt. FTL ist eine Reihe von Flash-Verwaltungsaufgaben, die normalerweise auf dem Gerät ausgeführt werden und das native Verhalten des Flash-Speichers maskieren. Mit anderen Worten, FTL erstellt eine Blackbox über dem Flash-Speicher und emuliert das Verhalten von Festplatten. Die Tatsache, dass das Datenbanksystem keine Kenntnisse über FTL und auch keine Kontrolle über die physische Datenplatzierung auf dem Flash hat, führt zu einem hohen I/O-Overhead, der durch die suboptimale Realisierung von Flash-Verwaltungsaufgaben und funktionale Redundanz entlang des kritischen I/O-Pfades verursacht wird. Somit kann der Schreibfaktor (Write-Amplification) herkömmlicher SSDs, die in der üblichen „cooked“-Speicherarchitektur verwendet werden (d.h. mit Dateisystem-Indirektion), bis zu 15x betragen. So kann eine einzelne 4KB-Schreibanforderung, die vom DBMS gesendet wird, zu bis zu 60KB werden, die physisch auf den Flash-Speicher geschrieben werden. Infolgedessen sind der effektive I/O-Durchsatz und die Langlebigkeitserwartungen von SSDs erheblich niedriger als die des in diesen SSDs eingekapselten Flash-Speichers. In dieser Arbeit beschreiben wir unseren Ansatz - die NoFTL-Speicherarchitektur -, der darauf abzielt, die oben genannten Nachteile moderner Flash-SSDs zu beheben. Die Grundidee hinter NoFTL besteht darin, dem Datenbankverwaltungssystem die vollständige Kontrolle über den zugrunde liegenden Flash-Speicher zu geben, was wiederum die Eliminierung aller zwischengeschalteten Abstraktionsschichten

(Dateisystem, Blockgeräteschnittstelle und FTL) zwischen dem DBMS und dem physischen Speicher voraussetzt. NoFTL besteht aus drei Hauptelementen: (i) native Flash-Schnittstelle; (ii) Integration des Flash-Managements in Subsysteme des DBMS; und (iii) das Konzept des konfigurierbaren Flash-Speichers. Das Zusammenspiel dieser drei Elemente ermöglicht es uns, das gesamte Leistungspotential des Flash-Speichers auszuschöpfen. Mit der nativen Flash-Schnittstelle kann das DBMS die Platzierung physischer Daten auf dem Flash-Speicher steuern und die Rechenleistung der SSD für die datennahe Verarbeitung nutzen. Die Integration typischer Flash-Verwaltungsaufgaben (Adressumsetzung, Garbage Collection und Wear-Leveling) in verschiedene Subsysteme des DBMS führt zu einer Optimierung dieser Aufgaben sowie nativer DBMS-Algorithmen. Das Konzept des konfigurierbaren Flash-Speichers ist ein einzigartiger Ansatz zum Organisieren und Verwalten von Daten auf Flash-SSDs. Mithilfe neuartiger Speicherabstraktionen kann das Datenbanksystem eine intelligente Datenplatzierung durchführen, indem Objekte in verschiedenen Regionen gruppiert werden. Darüber hinaus kann das DBMS für jede dieser Regionen einen separaten Satz von Flash-Verwaltungsalgorithmen anwenden, die für die Daten, die dieser Region zugewiesen sind, optimal wären. All dies reduziert den Schreibfaktor von SSDs auf ein Minimum (bis zu 15-fache Reduzierung für OLTP-Workloads), verbessert die Gesamtsystemleistung und verlängert die Lebensdauer von Flash-SSDs erheblich (bis zu 30-fache Verbesserung). Wir haben den NoFTL-Prototyp auf einer Open-Source-Datenbank-Engine realisiert und unter verschiedenen Szenarien und auf verschiedenen Testumgebungen evaluiert.

Contents

Glossary
List of Figures
List of Tables

1. Introduction
   1.1. Scope
   1.2. Problem Statement
   1.3. Proposed Approach
   1.4. Contributions
   1.5. Structure of the Thesis

2. Background
   2.1. DBMS
      2.1.1. Evolution
      2.1.2. Architecture of Relational DBMS
   2.2. DBMS Storage Alternatives and I/O Stack
      2.2.1. Cooked Storage
      2.2.2. Raw Storage
      2.2.3. Block-Device Layer
   2.3. Flash
      2.3.1. NAND Flash Memory
      2.3.2. Flash SSD

3. NoFTL Approach

4. Native Flash Interface
   4.1. Command Set
   4.2. User-Defined Commands
   4.3. Addressing and Granularity

5. DBMS Integration
   5.1. Address Translation
   5.2. Atomicity of Writes
   5.3. Garbage Collection and Wear-Leveling
   5.4. Parallelism
   5.5. Recovery
   5.6. Evaluation Results

6. Configurable Flash storage
   6.1. NoFTL Regions
      6.1.1. Data Placement with Regions
      6.1.2. Region-Specific Flash Management
   6.2. NoFTL Groups
   6.3. Evaluation Results

7. In-Place Appends
   7.1. Revisiting Erase-Before-Overwrite Principle
   7.2. Design and Implementation Details of IPA
      7.2.1. Page Layout for IPA
      7.2.2. Page Operations under IPA
      7.2.3. WRITE_DELTA Command
      7.2.4. Error Detection and Correction
   7.3. Variations of IPA
      7.3.1. IPA for Indexes
      7.3.2. IPA on Different Flash Types
   7.4. Evaluation
      7.4.1. Basic IPA
      7.4.2. IPA for Indexes

8. Auto-Tuning
   8.1. Selection of Multi-Region Configuration
   8.2. Change of Multi-Region Configuration
   8.3. Wear-Leveling in Multi-Region Configuration

9. Related Work
   9.1. FTL-based SSDs
   9.2. Native Flash Storage
   9.3. IPA Approach
   9.4. In-Storage Processing

10. Conclusions

11. Future Work

A. Appendix
   A.1. OpenSSD Jasmine

Glossary

BBM bad-block management

BCH Bose–Chaudhuri–Hocquenghem codes

BLM block-level address mapping

DBMS database management system

DDL data definition language

ECC error-correction codes

FS file system

FTL Flash translation layer

GC garbage collection

HDD hard disk drive

I/O input/output operation

ISP in-storage processing

ISPP incremental step-pulse programming

LDPC low-density parity-check code

MLC multi-level cell

NDP near-data processing

NFI native Flash interface

NVM non-volatile memory

OS operating system

PLM page-level address mapping

QLC quad-level cell

SLC single-level cell

SQL structured query language

SSD solid-state drive

TLC triple-level cell

WL wear-leveling

List of Figures

1.1. Pillars of NoFTL approach.
1.2. Performance comparison of DBMS on different storage alternatives under TPC-C benchmark.

2.1. Module-based architecture of the relational DBMS (based on [80]).
2.2. DBMS storage alternatives.
2.3. Floating-gate transistor making a single Flash memory cell.
2.4. Cell-array architecture of the NAND Flash memory.
2.5. Latencies of basic operations for different NAND types. Source: AnandTech [98]
2.6. Four different types of NAND Flash memory.
2.7. Simplified architecture of SSD.
2.8. Main tasks of FTL.
2.9. I/O bandwidth fluctuations of an enterprise SSD. Source: Petrov et al. [77].
2.10. Example of data placement in page-level FTL.
2.11. Example of read operation in block-level FTL.
2.12. Example of write operation in block-level FTL.
2.13. Address translation in hybrid FTL.
2.14. Example of merge operation in hybrid FTL.

3.1. NoFTL as a storage alternative for DBMS. Source: Hardock et al. [34].
3.2. Integration of Flash management into the modules of the DBMS.
3.3. Example of a region configuration.
3.4. Cumulative distribution of update-sizes in TPC-B benchmark under default eager-eviction strategy with different sizes of DBMS buffer. Source: Hardock et al. [31]
3.5. Performance comparison of DBMS on different storage alternatives under TPC-C benchmark. Figure is also presented in Chapter 1.
3.6. Comparison of write amplification and intensity of erase operations for different storage alternatives under TPC-C benchmark.

5.1. NoFTL architecture and integration of Flash management in the DBMS. Source: Hardock et al. [34].
5.2. NoFTL with traditional and Flash-aware strategies for background writes under TPC-C benchmark.
5.3. NoFTL with traditional and Flash-aware strategies for background writes under TPC-B benchmark.

6.1. Example of reclaiming a victim block by garbage collection.
6.2. Hot/cold data separation using NoFTL regions.
6.3. Control of SSD parallelism using NoFTL regions.
6.4. Selective Flash management with NoFTL Regions.
6.5. Hot/cold data separation using NoFTL groups.
6.6. Comparison of write amplification and intensity of erase operations for different storage alternatives under TPC-B benchmark.

7.1. Write-amplification under OLTP workloads. Source: Hardock et al. [31].
7.2. Database page layout for IPA on Flash. Source: Hardock et al. [31]
7.3. Example of database page with two delta records. Source: Hardock et al. [31]
7.4. Implementation of write_delta command on OpenSSD Jasmine board. (1) up-to-date version of a page in database buffer pool; (2) "remembered" changes since last fetch from SSD encoded as delta record in delta-record area; (3) storage manager issues write_delta command only for the delta record, i.e., only the delta record is transmitted to the SSD; (4) controller copies the delta record into its original position in an empty page (initialized with "1"s) in the DRAM buffer of the SSD (write cache); (5) original page on Flash with empty delta-record area; (6) through overwriting the original page (PPN=7) only the delta-record area becomes updated, while the body of the page remains unchanged.
7.5. Modification of an entry in B-Tree. Traditional approach.
7.6. Page layout and format of delta record in IPA-IDX. Source: Hardock et al. [33]
7.7. Modification of an entry in B-Tree using IPA-IDX.
7.8. Memory organization of MLC Flash.
7.9. Program interference in MLC Flash during PROGRAM operation. Cell M is being programmed; cells G, H, I, Q, R, S can experience program interference errors.
7.10. Cumulative distribution of update sizes in TPC-C under eager eviction strategy.
7.11. Cumulative distribution of update sizes in TPC-C under non-eager buffer eviction strategy.
7.12. CDF of update sizes for index on NewOrder table in TPC-C benchmark.

8.1. Excerpt from an example of summary output of NoFTL advisor.
8.2. Example of a migration to target NoFTL configuration.

A.1. OpenSSD Jasmine board.


List of Tables

2.1. Comparison of NAND and NOR Flash memory types.
2.2. Variety of SSDs on the storage market (Part 1).
2.3. Variety of SSDs on the storage market (Part 2).

5.1. TPC-C benchmark under different storage alternatives on OpenSSD Jasmine board.
5.2. TPC-B benchmark under different storage alternatives on OpenSSD Jasmine board.
5.3. DFTL-based SSD versus NoFTL under TPC-C/B benchmarks.

6.1. TPC-C benchmark under three different NoFTL configurations on OpenSSD Jasmine board.
6.2. TPC-B benchmark under three different NoFTL configurations on OpenSSD Jasmine board.
6.3. Comparison of NoFTL with and without regions under TPC-C on Flash emulator.
6.4. Data placement configuration for TPC-C.
6.5. Data placement configuration with selective, region-specific Flash management for TPC-C benchmark.
6.6. Comparison of NoFTL approach with a single- and multi-FMS under TPC-C benchmark.

7.1. TPC-C benchmark on OpenSSD: traditional approach vs. IPA in pSLC and odd-MLC modes. Source: Hardock et al. [31]
7.2. TPC-C: traditional (no IPA [0×0]) vs. [2×48] schemes with large buffer pools (eager eviction). Source: Hardock et al. [31]
7.3. TPC-C: traditional (no IPA) vs. [2×M] schemes with large buffer pools (non-eager eviction). Source: Hardock et al. [31]
7.4. Evaluation results of IPA-IDX on TPC-C benchmark. Source: Hardock et al. [33]
7.5. Evaluation results of IPA-IDX on TPC-B benchmark. Source: Hardock et al. [33]

1. Introduction

1.1. Scope

Data volume and the speed and quality of data processing have become crucial for many domains, such as communication, security, entertainment, sales, mobility and health care. For more than forty years the hard disk drive was the only feasible solution for the secondary storage of large volumes of data with reasonable access speed. However, portable devices, such as MP3 players, game consoles, digital cameras, smartphones, and a huge number of small embedded devices, required a compact, shock-resistant and energy-efficient memory. The solution was Flash memory. It took more than a quarter of a century for Flash memory to evolve from a small-capacity storage for portable devices to the mass storage media for data-intensive applications and data centers. Questions about moving from HDDs to Flash-based storage, durability, cost, and where Flash memory should be used in the storage architecture have shifted to a debate about what kind of Flash storage is best and how its properties can best be exploited by today's data processing software. The major reason behind the limited use of Flash in the early 2010s was its price. The price per GB of Flash storage was on average 10-20x higher than the price per GB of HDD. Today this difference is about 5x, which is, however, still a lot. Nevertheless, Flash has won the competition on the storage market by another, more important metric – price/performance ($/GB/s). In other words, it is not simply about the amount of stored data, but rather about the ability to process this data fast. Besides its better performance, Flash storage consumes significantly less energy (~3x), typically has a much smaller form factor per GB, produces less heat and noise, and due to the absence of moving parts is much more robust and shock-resistant than HDDs. This makes it a perfect match for both large data centers and small portable or embedded devices. Thus, Flash is nowadays the primary technology for non-volatile storage for all kinds of data management systems. The newest developments in Flash technology, as well as in Non-Volatile Memory, lead to the conclusion that in the next five years the market share of Flash will continuously increase from its current level of about 50%, and the dominance of Flash as a mass storage will remain for at least 10 more years.

However, despite all its advantages, the design of modern Flash-based storage faces multiple issues that have kept Flash memory from realizing its full potential in data-intensive applications. In this work, we focus on those issues, and propose a novel design along with multiple optimizations based on it, which significantly increase the performance and longevity of Flash storage while simultaneously decreasing its overall cost.

1.2. Problem Statement

Although both software and hardware evolve continuously, they rarely evolve at the same rate. Software was usually leading, and thus was often hungering for more powerful hardware. However, during the last decade there has been a clear trend towards a role reversal. The power of computational units with a high degree of parallelism and the performance of storage in modern systems are typically orders of magnitude higher in their hardware specifications than their real performance in running systems. The reason for this typically lies in legacy software, which is unable to utilize the full potential of the hardware. The common characteristic of most database systems until about the mid-2000s was the limited amount of main memory, which made them IO-bound¹. This created the situation where the gap between CPU and HDD became the major performance bottleneck for DBMSs. Not surprisingly, the design of those systems was strongly influenced by the IO infrastructure. In all DBMS subsystems, data structures and processing algorithms it is easy to find design decisions which are based on the assumption of having an HDD as the secondary storage, and are meant to reduce the IO costs. The buffer manager, query optimizer, storage and free space managers, recovery, access paths, page layout and the whole IO stack were strongly influenced by the characteristics of spinning storage media. These “deep roots” of the HDDs inside the DBMS, as well as their clear monopoly on the storage market, made the deployment of Flash storage for database systems challenging. Flash memory differs from magnetic disk significantly. Different physical principles of operation and properties of the media also result in behavioral differences of Flash as compared to HDDs. The most important of those are the following:

• Data access on Flash memory is much faster than on the HDD. Thus, reading a 4KB data block from Flash memory typically requires less than 50µs, while a high-end enterprise HDD would need at least 2ms, resulting in more than a 40x access time difference for reading.

¹ An IO-bound system is one in which the total time spent accessing the data is significantly larger than the time spent processing it.

• Flash memory is asymmetric. Reading a page is usually 10x faster than writing a page, while erasing a block usually costs as much as 10 page writes. The latencies of read and write operations on the HDD are equal.

• There are no seek times on Flash, i.e., the access time to an arbitrary location on Flash does not depend on the location itself and is constant in time². On the HDD, in contrast, positioning of the moving head for a particular data access causes a location-dependent seek time, making the performance of the drive strongly dependent on the workload's access pattern (random vs. sequential). Thus, access “sequentialization” became the major instrument for DBMS designers to increase the I/O performance on magnetic storage.

• Flash memory follows a so-called erase-before-overwrite principle. Once data is written to a certain location on Flash, it cannot be updated in place without a preceding erase operation. Furthermore, while the minimal unit of reading and writing on Flash is a Flash page (typically 4-16KB), the erase operation is performed on Flash blocks, which are composed of hundreds of pages. This makes a simple update operation on Flash even more challenging (a minimal sketch after this list illustrates the resulting out-of-place update pattern). This is quite different from the overwrite behavior of the HDD, where updated data is simply written “on top” of the old one, and an erase operation as such is not defined at all.

• Flash blocks wear with the increasing number of erase cycles performed on them. After a certain number of erases, which might vary from 3,000 to 100,000 depending on the Flash type, the block with all its pages cannot store data reliably anymore, and thus becomes invalid for further read/write operations. The magnetic surface of the HDD platters is not affected equally by its use³.

• Flash storage typically provides a high level of IO parallelism, supporting dozens to hundreds of different IO operations in parallel. The “built-in” parallelism of Flash is made possible by its multi-level memory hierarchy (see Chapter 2.3.2). The parallelism of the HDD is very limited. The multiple heads and platters of HDDs are still connected to a common axis, allowing parallelism only through splitting large requests into chunks, which are then processed in parallel.

These physical properties of Flash memory influence the access strategy, and require management functionality for the efficient and reliable use of Flash storage.

² Although sequential access to Flash storage is still faster than random access, this is caused by optimized caching and other Flash management algorithms described in more detail in Chapter 2.3.2.
³ However, the moving parts of spinning disks make this media more susceptible to failures as compared to Flash memory with its wear-out nature.
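To make these properties concrete, the following minimal C sketch models a Flash block as an array of pages that may only be programmed after an erase of the whole block; an update of a page therefore has to be written out of place, with the old copy merely marked invalid. All identifiers (flash_block, program_page, update_page, the sizes) are illustrative assumptions and do not belong to any real device API.

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #define PAGE_SIZE        4096  /* unit of read and program operations  */
    #define PAGES_PER_BLOCK   256  /* erase works only on whole blocks     */

    enum page_state { FREE, VALID, INVALID };

    struct flash_block {
        uint8_t         data[PAGES_PER_BLOCK][PAGE_SIZE];
        enum page_state state[PAGES_PER_BLOCK];
        unsigned        erase_count;        /* wear: limited P/E cycles    */
    };

    /* A page can be programmed only while it is still FREE (erased). */
    static bool program_page(struct flash_block *b, int page, const uint8_t *buf)
    {
        if (b->state[page] != FREE)
            return false;                   /* no in-place overwrite       */
        memcpy(b->data[page], buf, PAGE_SIZE);
        b->state[page] = VALID;
        return true;
    }

    /* Erase resets all pages of the block at once and wears the cells. */
    static void erase_block(struct flash_block *b)
    {
        memset(b->data, 0xFF, sizeof b->data);
        for (int i = 0; i < PAGES_PER_BLOCK; i++)
            b->state[i] = FREE;
        b->erase_count++;
    }

    /* An "update" is an out-of-place write: program the new version on a
     * free page and invalidate the old one; garbage collection reclaims
     * the invalidated space later by copying valid pages and erasing.   */
    static bool update_page(struct flash_block *old_b, int old_page,
                            struct flash_block *new_b, int new_page,
                            const uint8_t *new_data)
    {
        if (!program_page(new_b, new_page, new_data))
            return false;
        old_b->state[old_page] = INVALID;
        return true;
    }

The invalidated pages accumulated by such out-of-place updates are exactly what garbage collection must later reclaim, which is where the write-amplification discussed below originates.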

The combination of these three factors: (1) the behavioral specifics of Flash memory, which would require significant architectural changes in the HDD-defined design of the DBMS and IO stack; (2) the “immature” technological level of Flash development in the early 2000s; as well as (3) the high price difference between Flash and HDD back then - together created a firewall for Flash on the way to becoming a mass storage for data-intensive applications. To overcome these obstacles and start the competition with HDDs, Flash storage was made backwards compatible with spinning drives. Flash Solid-State Drives (hereafter SSDs) provided the same physical interface and transmission protocols as HDDs, and thus did not require any additional driver, and were recognized by the operating system and applications as traditional HDDs. This made the replacement of HDDs with SSDs seamless, and stimulated the establishment and proliferation of Flash on the storage market. To achieve this compatibility with traditional hard disk drives, SSDs were designed as black boxes over the Flash memory inside them. Through the use of an additional level of indirection, which is encapsulated in the storage device, all native specifics of Flash memory become hidden from the upper I/O layers. This indirection is realized as software called the Flash Translation Layer (FTL) running on the SSD's internal controller. The FTL executes all management functionality for Flash memory transparently to the host system, and thus it provides the outward illusion of operating on a typical (although 2-5x faster) HDD, i.e., no erases, no out-of-place updates, no wear-out of Flash, no highly parallel internal architecture (see more about FTL functionality in Chapter 2.3.2). A black-box design of SSDs is the standard nowadays. While the current storage market provides SSDs in a variety of form factors, with different physical interfaces and transmission protocols, and with a different “place of residence” of the FTL (device vs. OS), the common feature of probably 99.9%⁴ of all Flash storage is still the FTL. Looking back, we can certainly say that this backwards-compatible design of SSDs did its job well – the Flash SSD is the storage of today, and this storage is FTL-based. However, for all its advantages, the backwards compatibility in general and the FTL in particular are something of a double-edged sword. Masking the native behavior of Flash behind the additional abstraction layer, and thereby emulating legacy storage, results in two main disadvantages: overhead and underutilization. These are themselves manifold. The overhead is of computational and IO nature, and it is rooted in many layers along the IO path (DBMS, file system, OS and FTL). The underutilization relates primarily to the memory and computational resources of Flash SSDs. The most important of those are briefly described below.

⁴ We speculate on this number since, due to the clear dominance of FTL-based SSDs on the market, statistics regarding the popularity of alternative SSD designs are (to the best of our knowledge) not available. The very few FTL-less alternatives are mostly research projects and are covered later in Chapter 9.

• Typically, SSDs feature a rather limited amount of volatile DRAM⁵, varying from a few hundred megabytes (e.g., 64-256MB) in consumer SSDs to a maximum of 2GB in expensive enterprise devices. The lion's share of this memory is used by the controller for queuing and caching of incoming write requests (e.g., the write cache), which helps to smooth the negative effects of GC during periods of relatively low IO load. The minor part of the on-device DRAM is used for caching FTL metadata. Among others this includes the mapping information required for the logical-to-physical address translation. However, since the remaining DRAM does not allow caching the complete mapping table at the desired fine granularity (a short back-of-the-envelope calculation after the next list item illustrates the problem), FTL designers are forced to implement alternative mapping schemes (e.g., hybrid mapping). As a result, the write-amplification and the intensity of erases produced by GC increase manifold (2-3x), which negatively impacts the foreground performance of the SSD and its longevity (see more in Chapters 2.3.2 and 5.6). To overcome those disadvantages, several SSD manufacturers offer enterprise SSDs in which the FTL is moved from the device to the host system (implemented as a device driver in the OS). This solves the problem of limited on-device DRAM resources, since the FTL then uses the main memory of the host.

• Another typical source of overhead is the functional redundancy along the IO path. In the traditional “cooked” storage alternative for the DBMS (see Figure 2.2), functions like buffering/caching, address mapping, free space management and recovery are performed separately by the DBMS, the file system and the FTL. This results in computational overhead and significant write-amplification on the SSD. For instance, we have experienced more than 3x write-amplification from the ext4 file system for OLTP workloads (see Tables 5.1 and 5.2 in Chapter 5.6). Similar and even more dramatic results (up to 11x write-amplification) are presented in [59]. To eliminate the overhead from the file system some popular DBMSs (e.g., Oracle, IBM DB2) offer the so-called “raw” storage alternative. It eliminates the file system as an intermediate layer and gives the DBMS direct control over the storage. However, if the storage is based on a Flash SSD, the terms “raw storage” and “direct control” are not valid anymore because of the indirection introduced by the FTL.
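The DRAM limitation mentioned above can be quantified with a simple back-of-the-envelope calculation; the device capacity, page size and mapping-entry size below are assumptions chosen for illustration, not the parameters of a specific product.

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* Assumed device parameters (illustrative only). */
        uint64_t capacity    = 1024ULL * 1024 * 1024 * 1024;  /* 1 TB            */
        uint64_t page_size   = 4096;                          /* 4 KB Flash page */
        uint64_t entry_size  = 4;       /* bytes per logical-to-physical entry   */

        uint64_t pages       = capacity / page_size;          /* ~268M pages     */
        uint64_t table_bytes = pages * entry_size;            /* ~1 GiB          */

        printf("full page-level mapping table: %.0f MiB\n",
               table_bytes / (1024.0 * 1024.0));
        /* Roughly 1024 MiB, far beyond the 64-256MB of DRAM in consumer
         * SSDs, which is why block-level and hybrid mappings are used.   */
        return 0;
    }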

While the above overheads could be eliminated by using SSDs with an OS-resident FTL and eliminating the file system, the major disadvantage of the FTL is present in all modern Flash SSDs regardless of their design or the configuration of the IO stack.

⁵ In modern SSDs the on-device DRAM is backed by large capacitors (big CAPs), which supply the device with enough power to write the data from volatile DRAM to durable Flash memory in case of a power outage.

By creating a black-box abstraction over the Flash memory, the FTL builds a wall between the DBMS and the SSD. This wall is opaque from both sides. On one side, the DBMS has almost no knowledge about the underlying storage, as well as no control over it. In the eyes of the DBMS, the SSD is just a storage with a continuous address space, supporting reads and writes on immutable addresses. On the other side of the wall, the Flash management algorithms of the SSD's FTL have no access to the rich DBMS knowledge about the data being stored on the Flash. This lack of information and control on both sides creates a lose-lose situation.

• The GC is a major influencer of the overall SSD performance. The GC overhead itself is characterized by the amount of performed page migrations (i.e., the write-amplification of the SSD) and the erases required to reclaim the space occupied by invalidated versions of data. The mapping scheme, as already mentioned, is one of the factors determining the GC overhead. Another, more significant factor is the proper data placement strategy. The ability to place data with similar update frequency (data temperature) together is the major knob we can use to reduce the number of page migrations performed by the GC. This is often called the hot-cold data separation strategy; a small sketch at the end of this section illustrates greedy victim selection and the write-amplification it tries to minimize. Under the FTL-based design of modern SSDs, however, efficient data placement is almost impossible. The DBMS, which has the comprehensive metadata required for proper hot-cold data separation, has no control over the physical data placement on Flash. When writing data to a certain logical address (LBA), the DBMS is completely unaware of the resulting physical location of the data on Flash. On the other side, the FTL algorithms responsible for placing the data have no access to the metadata of the DBMS. Thus, the only way for the SSD to provide hot-cold data separation is to collect and analyze device-internal statistics about the update frequencies of individual pages. However, the limited on-device DRAM resources prevent this approach from being realized efficiently. As a result, on average, SSDs experience a write-amplification of about 3.5x, i.e., every single incoming write request results in 3.5 physical writes on the Flash. This increases IO response times and shortens the lifetime of the SSD.

• Further, the address indirection provided by the FTL makes the typical data placement strategies of the DBMS and file system useless. In the case of SSD-based storage, extents, tablespaces, files and partitions become solely logical data structures and have no impact on the physical placement of data on Flash memory. In other words, data supposed to be placed contiguously (or separately from each other) by an application might in reality be distributed randomly over the whole Flash memory. Therefore, a large part of the effort of the DBMS or file system for placing data contiguously (crucial for the HDD) is pointless for SSDs.

• The lack of information about the SSD's internals and the inability of the DBMS to control physical data placement also have a negative impact on the efficient utilization of the available Flash parallelism. To utilize the IO parallelism of the underlying Flash memory, the typical approach of the FTL is to distribute the data evenly (based on its logical address - LBA) over the independent units of the Flash (data channels, Flash chips/dies, planes). Although this strategy typically results in perfect load balancing and simplifies wear-leveling algorithms, it usually loses to smart data placement, delivering in the general case 2-3x lower SSD performance. Smart data placement uses data semantics and access statistics to distribute the data on Flash in a way that allows minimizing write-amplification (GC overhead) and maximizing the gain from the available IO parallelism by managing it in a demand-based manner (e.g., while hot data often requires a high level of IO parallelism, cold data does not). Only the DBMS (not the FS nor the OS) has all the required information to perform smart data placement. However, the FTL-based, black-box design of SSDs provides no means to realize such strategies.

• The efficiency of the separate FTL algorithms (address translation, GC, WL, ECC) was a hot topic for both the database and storage communities in recent years. Although many research approaches were proposed, none of them could be marked as universally optimal, since their efficiency is strongly workload dependent. On the other side, every SSD nowadays utilizes only one (which one is the manufacturer's top secret) predetermined variation of those algorithms. This makes each particular FTL a “one size fits all” solution. This is another consequence of the SSD's black-box design. As a result, the overall performance of the storage becomes workload dependent. Not surprisingly, it is common to experience an order of magnitude difference between the numbers in a device's specification and those measured under various real workloads. Besides the workload dependency, the SSD's performance is often unpredictable for the system, due to the complete unawareness of the host about the internal FTL processes on the SSD.

• Another disadvantage of the FTL and the backwards compatibility of SSDs with HDDs is the underutilization of the computational power of SSDs. Modern SSDs are equipped with various powerful processing units (multi-core controllers, FPGAs, GPUs). Currently those are utilized solely for the execution of FTL tasks. However, they might be used efficiently for data processing on behalf of the DBMS. This would significantly minimize data transfer, as well as the “pollution” of the host's DRAM and CPU cache. The realization of such near-data processing is, however, impossible under the black-box design of SSDs and the use of the traditional block-device interface.


This list of drawbacks and disadvantages of modern FTL-based SSDs is not exhaustive. A deeper look into the DBMS subsystems shows many places that might be significantly optimized by having direct control over Flash memory. The query optimizer, buffer manager, transaction manager and recovery algorithms would be the first candidates for such optimizations.
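To illustrate why data placement governs the GC overhead described above, the sketch below shows the classic greedy victim selection (reclaim the block with the fewest valid pages) and how write-amplification follows from the page migrations it causes. The data structures are hypothetical and not taken from any particular FTL; they only make the cost model explicit.

    #include <stddef.h>

    struct gc_block {
        unsigned valid_pages;      /* pages that must be copied before erase */
        unsigned pages_per_block;  /* capacity of the block in pages         */
    };

    /* Greedy policy: pick the block with the fewest valid pages, because
     * every valid page has to be migrated (re-written) before the erase. */
    static size_t pick_victim(const struct gc_block *blocks, size_t n)
    {
        size_t victim = 0;
        for (size_t i = 1; i < n; i++)
            if (blocks[i].valid_pages < blocks[victim].valid_pages)
                victim = i;
        return victim;
    }

    /* Write-amplification = physical page writes / host page writes.
     * GC migrations are the extra physical writes on top of host writes.
     * If hot pages are clustered in the same blocks, victim blocks tend
     * to contain almost no valid pages, so migrations (and WA) shrink.   */
    static double write_amplification(unsigned long host_writes,
                                      unsigned long gc_migrations)
    {
        return (double)(host_writes + gc_migrations) / (double)host_writes;
    }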

1.3. Proposed Approach

If the disadvantages and unrealized optimizations for Flash storage are caused by the FTL and the compatibility mode supported by modern SSDs, then the solution to those problems lies in eliminating the FTL. In this work we present a concept called NoFTL, which gives the DBMS full and direct control over the underlying Flash storage by removing the typical abstraction layers on top of it, such as the file system, the block device abstraction and the FTL. The efficient realization of this concept is based upon three basic principles: a native Flash interface, the integration of Flash management into the DBMS subsystems, and configurable Flash storage (Figure 1.1). Once the traditional block device abstraction is removed, the DBMS is confronted with the first question: how to “speak” to Flash storage without an FTL. To answer this question we introduce the concept of a native Flash interface (an illustrative sketch of such a command set is given after Figure 1.1). The interface is open and flexible, and thus might be developed further depending on the needs and system resources. Its major purpose is to enable the DBMS to control the physical data placement on Flash, and to utilize the on-device computational resources to perform near-data processing tasks. Since the removal of the FTL does not eliminate the necessity of the Flash management tasks, this raises the second question: who takes over the responsibility for them? Under the NoFTL design the main high-level tasks of Flash management are integrated into the subsystems of the DBMS. This integration creates a win-win situation for both the DBMS and the Flash management. On one side, Flash management algorithms and data structures can be significantly optimized by utilizing the comprehensive metadata and statistics of the DBMS, as well as the rich memory and (typically underutilized) computational resources of the host system. On the other side, the ability to perform direct physical data placement on Flash and to control Flash management tasks leads to optimizations and simplifications in many places inside the DBMS. For instance, control over the out-of-place update strategy and garbage collection on Flash allows for a significant reduction in the overhead required for the support of IO and transactional atomicity by the DBMS.

[Figure: the three pillars of NoFTL - configurable Flash storage (Section 6), native Flash interface (Section 4), and integration of Flash management into the DBMS (Section 5) - resting on FTL-less Flash storage.]

Figure 1.1.: Pillars of NoFTL approach.
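To give a first impression of what the native Flash interface pillar could look like, the following C sketch declares a physical address type and a small command set. The names echo the concepts treated in Chapters 4 and 7 (including a write_delta-style command), but the exact types and signatures are illustrative assumptions, not the definitive NoFTL API.

    #include <stdint.h>
    #include <stddef.h>

    /* Physical address exposed to the DBMS: the device no longer hides
     * its geometry behind a flat logical block address space.           */
    struct flash_addr {
        uint16_t channel;   /* independent data channel                  */
        uint16_t chip;      /* Flash chip/die on that channel            */
        uint32_t block;     /* erase unit                                */
        uint32_t page;      /* read/program unit within the block        */
    };

    /* Illustrative native Flash command set (cf. Chapter 4). */
    int flash_read_page  (struct flash_addr a, void *buf, size_t len);
    int flash_write_page (struct flash_addr a, const void *buf, size_t len);
    int flash_erase_block(struct flash_addr a);  /* page field is ignored */

    /* Example of a user-defined command in the spirit of the in-place
     * appends of Chapter 7: only a small delta is shipped to the device
     * and merged into the stored page there.                            */
    int flash_write_delta(struct flash_addr a, const void *delta,
                          size_t offset, size_t len);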

It is worth noting that the implementation of Flash management inside the DBMS does not add much complexity to the latter. Every DBMS has its own address translation, free space management, and garbage collection; the corresponding functions and data structures of Flash management can be smoothly integrated into them. This results in a reduction of the functional redundancy along the IO path. It is important to observe that only the high-level Flash management functionality is integrated into the DBMS; the low-level Flash management remains the responsibility of the on-device controller. The open native Flash interface can be seen as a toolbox containing different tools allowing the DBMS to operate on the FTL-less Flash storage. The integration of Flash management into the DBMS subsystems is then a set of techniques and methods to work with those tools on Flash. The third guiding principle of this dissertation answers the question: how to apply these techniques efficiently, and which technique is best suited for a particular system and workload. Those questions are answered by the concept of configurable Flash storage. In contrast to the “one size fits all” design of modern SSDs, NoFTL optimizes the usage and management of Flash storage by clustering data with similar properties physically on Flash, and by applying the most appropriate set of Flash management algorithms for the particular cluster.

In order to realize this concept we introduce two novel data structures – region and group. These structures allow us to manage the physical address space of Flash and to selectively apply Flash management algorithms. Regions and groups are efficiently coupled with the traditional DBMS concepts of tablespaces, files and extents, thereby returning to the DBMS the power over physical data placement. Depending on the properties of the data assigned to a particular region, the latter is managed by the most appropriate set of Flash management techniques, i.e., variants of address translation, GC and WL. These local optimizations significantly reduce the overall overhead of Flash management. Moreover, using regions the DBMS can control the utilization of the available Flash parallelism, with the purpose of maximizing its benefit. Thus, configurable Flash storage gives the DBMS flexibility in how data is placed on Flash and how the storage is managed. The native Flash interface, the integration of Flash management functionality into the DBMS subsystems, and configurable Flash storage are the three pillars of the NoFTL concept, which enables the DBMS to obtain maximum performance from Flash storage, avoiding the disadvantages of the FTL-based, black-box SSD design. To simplify the configuration of Flash storage for the database administrator, two solutions called Advisor and Migrator are provided. The NoFTL Advisor monitors the IO statistics of the DBMS, as well as the statistics of Flash management, and based on this information assists the database administrator in selecting an appropriate configuration. The transformation of the storage from one configuration to another is carried out by the Migrator module. The migration process is performed in the background and can be configured by the DBA via tuning knobs.
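A possible shape of such a multi-region configuration is sketched below in C. The region descriptor, the enum values and the example bindings are hypothetical; they are only meant to convey how objects with similar properties could be clustered into regions that each get their own set of Flash management algorithms (the actual mechanisms are presented in Chapter 6).

    #include <stdint.h>

    /* Region-specific choices of Flash management algorithms. */
    enum mapping_scheme { MAP_PAGE_LEVEL, MAP_BLOCK_LEVEL, MAP_HYBRID };
    enum gc_policy      { GC_GREEDY, GC_COST_BENEFIT, GC_NONE_APPEND_ONLY };
    enum wl_policy      { WL_NONE, WL_STATIC, WL_DYNAMIC };

    /* A region clusters objects with similar properties on a dedicated set
     * of physical Flash units and manages them with its own algorithms.   */
    struct noftl_region {
        const char         *name;       /* e.g. bound to a tablespace        */
        uint16_t            channels;   /* degree of IO parallelism granted  */
        uint32_t            blocks;     /* capacity in erase blocks          */
        enum mapping_scheme mapping;
        enum gc_policy      gc;
        enum wl_policy      wl;
    };

    /* Hypothetical example: hot OLTP tables get fine-grained mapping and
     * high parallelism; cold history data gets cheap block-level mapping. */
    static const struct noftl_region example_config[] = {
        { "hot_oltp_tables", 8, 4096, MAP_PAGE_LEVEL,  GC_GREEDY,       WL_DYNAMIC },
        { "cold_history",    2, 8192, MAP_BLOCK_LEVEL, GC_COST_BENEFIT, WL_STATIC  },
    };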

1.4. Contributions

We see the main contribution of this research in describing a novel storage alternative for the DBMS – NoFTL, which assumes operation on FTL-less Flash storage. NoFTL eliminates all significant disadvantages resulting from the black-box design of modern SSDs, and enables the DBMS to utilize the whole performance potential of Flash SSDs. To prove this concept we have implemented the NoFTL prototype in an open-source DBMS and evaluated it under various scenarios on real Flash storage, as well as on a configurable and precise Flash emulator. NoFTL should not be seen as a finished solution. We believe that NoFTL opens the door for many even more significant DBMS optimizations on modern Flash storage. We further believe that the main idea of NoFTL – DBMS on native storage – is going to prove itself for the next generation of storage based on non-volatile memory technologies. This assessment is based not only on our own results, but is also confirmed by developments in industry and academia.

The various optimization approaches we designed, implemented and evaluated in this research under the general NoFTL concept are rooted in different subsystems of the DBMS, mainly the buffer, storage, transaction, and recovery managers. Performance gains achieved via those optimizations are primarily due to the reduction of IO and computational overheads, as well as better utilization of the available Flash parallelism and on-device computational resources.

[Bar chart for Figure 1.2: TPC-C throughput (TpmC, transactions per minute) for five storage alternatives - Cooked SSD (110), Raw SSD (251), NoFTL basic (422), NoFTL regions+groups (675), NoFTL regions+groups+IPA (756).]

Figure 1.2.: Performance comparison of DBMS on different storage alternatives under TPC-C benchmark.

For instance, by applying smart data placement and Flash management we could reduce the write-amplification of the Flash SSD for OLTP workloads by 2-15x depending on the system configuration. Another example is the reduction of performed erase operations for a particular workload by up to 80% thanks to one of the most significant approaches we provide – in-place appends (IPA). These reductions of GC overhead have two main implications. The first is the reduction of IO response times and consequently the increase of the overall performance of the storage. The extent to which the improved IO performance positively impacts the system performance depends on the IO-boundedness of the latter. Thus, we experienced an improvement of the system's throughput by up to 7x for IO-bound configurations (see Figure 3.5). The second important implication is the prolongation of the SSD's lifetime. The NoFTL approach can improve the longevity of an SSD by a factor of two or even more under OLTP-like workloads.
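The link between write-amplification and longevity can be made explicit with a commonly used rough estimate; the capacity, P/E cycle rating and daily write volume in the sketch below are assumptions chosen only for illustration.

    #include <stdio.h>

    /* Rough SSD lifetime estimate in days:
     *   lifetime = (capacity * P/E cycles) / (daily host writes * WA)   */
    static double lifetime_days(double capacity_gb, double pe_cycles,
                                double daily_writes_gb, double wa)
    {
        return (capacity_gb * pe_cycles) / (daily_writes_gb * wa);
    }

    int main(void)
    {
        /* Assumed: 512 GB MLC device, 3000 P/E cycles, 200 GB written per day. */
        printf("WA 3.5: %.0f days\n", lifetime_days(512, 3000, 200, 3.5));
        printf("WA 1.2: %.0f days\n", lifetime_days(512, 3000, 200, 1.2));
        return 0;
    }

Under these assumptions, cutting the write-amplification from 3.5x to 1.2x stretches the expected lifetime by roughly a factor of three, which is the mechanism behind the longevity improvements reported above.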


Continuing the already mentioned observation that the FTL-based Flash SSD is the storage of today, we strongly believe that the principles underpinning the proposed NoFTL approach will characterize the Flash storage for data-intensive applications of the future. We dare to forecast that FTL-less Flash SSDs will hold the dominance on the storage market for the next decade.

1.5. Structure of the Thesis

On the whole, the flow of the thesis corresponds to the chronological evolution of the research underlying it. As the construction of a building follows a certain order, so the research and realization of the NoFTL concept followed its plan. To start our construction project we first must survey the area, its past and its current developments. We do this in the next chapter by providing the necessary background information regarding the DBMS, the IO stack and Flash. We then continue in Chapter 3 with a general overview of the proposed NoFTL concept – the architectural plan for our building. As already mentioned, NoFTL is founded on three main pillars – the native Flash interface, the DBMS integration of Flash management, and the concept of configurable Flash storage. Their design, implementation and evaluation details are provided in Chapters 4, 5 and 6, respectively. The sustainable base of the NoFTL concept allows the realization of various optimizations for the Flash management algorithms and in different subsystems of the DBMS. While several of those optimizations are described during the building of the corresponding pillars, in Chapter 7 we present the approach called IPA. This approach was designed and developed in the final phase of this research and is especially interesting because it shows the whole NoFTL concept with its three pillars “in action”. Chapter 8 describes the idea and example solutions that help to find the proper configuration of the Flash storage under NoFTL. The analysis of related work is provided in Chapter 9. We conclude and discuss future work in Chapters 10 and 11.

2. Background

The following chapter is dedicated to the necessary background knowledge on database management systems and Flash storage. It consists of short surveys of the relevant topics, which allow us to provide quick, in-place references to other chapters. Please note that related research work is described separately in Chapter 9, which is placed at the end of the thesis to enable a better comparison with the proposed approach.

2.1. DBMS

2.1.1. Evolution

For more than thirty years since the 1980s the term DBMS was primarily associated with relational database management systems (RDBMSs), which had their beginning with the pioneering System R and Ingres projects based on Codd's seminal paper on the theoretical aspects of relational systems. RDBMSs were designed for the management of structured data. The data here is organized according to a schema in tabular format and is queried via a query language, the Structured Query Language (SQL) being the de facto standard. Through extensions the RDBMSs could successfully cover almost all possible data domains and satisfy the needs of the corresponding data-intensive applications. The database market soon grew to the billion-dollar range and is estimated to reach $50 billion this year. The three major vendors of RDBMS software, holding together more than 85% of the total database market's revenue, are Oracle (Oracle DBMS), IBM (DB2) and Microsoft (Microsoft SQL Server). There are also multiple open-source alternatives, which successfully compete with the proprietary software and continuously increase their market share. The most popular examples of open-source RDBMSs are MySQL, PostgreSQL and SQLite. However, about a decade ago the monopoly of relational databases ended with the establishment of so-called NoSQL database management systems. The major reason stimulating the (rushed) development of those systems was the tremendous increase in the amount and importance of unstructured and semi-structured data.

Petabytes of video and audio files, documents in dozens of possible formats, event logs, crawling data and web indexing are the main examples thereof. Regardless of the physical location, variety of formats and purpose, the Internet is by far the major place of residence for this data. For YouTube, Facebook, Google, Amazon and many other Internet companies the ability to store and process it is crucial for the business model. The use of RDBMSs for the management of this data is, however, typically impractical. First, the relational model is less suitable for semi-structured data, and almost useless for unstructured data. Second, the amount of the data is typically far beyond the capacities of a single server and often even beyond a single data center. Thus, highly distributed data management systems are required. Although RDBMSs can scale well by increasing the resources of a single machine (scale-up), they are typically weak in distributed environments. With horizontal scaling (scale-out) the provision of basic RDBMS guarantees becomes challenging. Especially the transactional model, which provides the atomicity and consistency properties (A and C from the basic ACID properties of transactions), is hard to realize efficiently in an environment with hundreds and thousands of machines distributed all over the world. The strict consistency of changes over a high (and dynamic) number of replicas or shards would cause high locking overhead and increase response times drastically, which in turn slows down the system performance and throughput. In contrast to RDBMSs, NoSQL data management systems are specifically designed to manage mixed data (unstructured and structured); they provide lower guarantees regarding data consistency (and sometimes durability), while increasing availability and processing performance via ease of horizontal scaling (ACID vs. BASE). There are four main types of NoSQL systems: key-value pair systems, document management systems, (wide-)column stores, and graph databases. NoSQL database systems, their comparison and popular examples are well described in [19] and [25]. However, most of the enterprise data is still structured and is also transactional, which means that only the RDBMSs with their strong consistency guarantees, powerful query language and schema-based data model will be the data management systems of choice in the foreseeable future. The revenue numbers support this statement: while the market for commercial DBMSs (the primary choice of enterprises) is continuously increasing, the NoSQL market share is estimated to be less than 5% nowadays for all NoSQL products together. The non-revenue market share of NoSQL is, however, significantly larger than that, since currently the majority of NoSQL products are open-source. That said, it is important to understand that it is not about “Who wins?”, because SQL and NoSQL data management systems are not really competitors, but rather complementary systems. Both are good in a certain data domain (structured vs. unstructured) and for certain requirements (consistency vs. scalability). Therefore, both will co-exist in the future. Already today the major vendors of traditional RDBMSs have started offering NoSQL products coupled with their relational systems.

The problem of sub-optimal use of modern storage hardware considered in this work is a general problem for both SQL and NoSQL systems. Although the same is true for the basic concept of the proposed solution, the implementation details might differ significantly. In this work we concentrate primarily on traditional relational database management systems (referring to them simply as DBMSs).

2.1.2. Architecture of Relational DBMS

Figure 2.1 presents the module-based architecture of the classical relational DBMS. Despite the variety of today's database systems and their design and implementation differences, the basic functional components are present in all of them. Each of those is complex and consists of multiple sub-modules. Although the detailed description of the DBMS architecture and the interplay of its modules can easily be found in classical database textbooks ([80], [22], [38]), we decided to provide a brief recap here for the purpose of completeness.

Figure 2.1.: Module-based architecture of the relational DBMS (based on [80]). [Figure: layered view of the DBMS — the query evaluation engine on top, followed by files and access methods, the buffer manager and the storage manager, with concurrency control and recovery as cross-cutting components; below the DBMS lies the storage holding the data/index files and the catalog/metadata of the database.]

Processing of incoming SQL queries begins with the Query Processor. The key competence of this module is the construction of query plans and the control of their execution.

In order to find an optimal execution strategy the DBMS utilizes various metadata, such as catalog information, access statistics, histograms, runtime data, etc. The query plan decisions concern not only the kind of operators, but also their execution order. Emerging computational and storage hardware with a high level of supported parallelism forces DBMS designers to revisit traditional operators and data structures in order to increase intra-query and intra-operator parallelism. Thus, an efficient query optimizer must be able to recognize possible ways of parallelizing the query execution based on the available hardware resources. Flash storage is a perfect example of a hardware development which offers the possibility to significantly speed up the execution of major query operators. Read/write asymmetry and I/O parallelism are the main properties of Flash SSDs allowing these improvements. Further, the rich on-device computational resources of modern SSDs allow for the execution of query operators on the storage device itself, thereby increasing computational parallelism, saving data transmission costs, and reducing CPU cache and buffer pool pollution on the host system. Those optimizations require, however, a re-design of large parts of the logic of the DBMS's modules, as well as additional support from the storage device. Once the query plan is defined and the query processor starts with the execution of operators (e.g., selection, projection, joins, modification, aggregation, etc.), the second layer of the DBMS comes into play - Files and Access Methods. This layer is responsible for the organization of data in logical database structures such as files and indexes, as well as the internal organization of data records within those structures. It provides, therefore, interfaces for the query operators to locate, access and modify the data. The standard data organization layouts supported by the majority of DBMSs today are heap, sorted and hashed files, as well as B-Tree indexes. Bitmap, UB-Tree or R-Tree indexes are further alternatives supported by some database systems. The key challenge of introducing a novel data structure or access method is the efficient support of concurrency and recovery. In recent years the database community has actively looked into the data access layer in order to better address the properties of Flash SSDs. Prioritization of reads over writes, parallel access and out-of-place update strategies are common ideas underlying the proposed novel or modified data structures. In this work we touch this layer multiple times while implementing the NoFTL architecture. Especially the approach of In-Place Appends (IPA) (see Chapter 7) required modifications to the page layout for table and index data. The next essential module of the DBMS is the Buffer Manager. Once the data requested by the operators has been located using the appropriate access routines, the Buffer Manager is asked to provide access to the corresponding database page. If the look-up in the internal hash table is successful, the reference to the buffer frame containing this page is returned. Otherwise (i.e., on a cache miss), the Buffer Manager issues an I/O request to the underlying Storage Manager to read the page from storage and place it into a reserved free buffer frame.
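This look-up path can be sketched in a few lines of C. The structures (buf_pool_t, frame_t) and the storage_read()/alloc_free_frame() calls are hypothetical placeholders for this illustration, not the code of any particular engine, and latching, pinning and eviction are omitted:

```c
#include <stdint.h>
#include <stddef.h>

typedef uint64_t page_id_t;

typedef struct frame {
    page_id_t     pid;        /* page currently held in this frame        */
    int           dirty;      /* set when the page was modified in memory */
    unsigned char data[8192]; /* assumed 8 KB database page               */
    struct frame *hash_next;  /* chain within one hash bucket             */
} frame_t;

typedef struct {
    frame_t *buckets[1024];   /* hash table: page id -> buffer frame      */
} buf_pool_t;

/* Hypothetical lower layers (Storage Manager I/O, free-frame allocation). */
extern void     storage_read(page_id_t pid, unsigned char *dst, size_t len);
extern frame_t *alloc_free_frame(buf_pool_t *bp);   /* may trigger eviction */

/* Return a frame holding page 'pid'; read it from storage on a cache miss. */
frame_t *buf_fix_page(buf_pool_t *bp, page_id_t pid)
{
    frame_t *f = bp->buckets[pid % 1024];
    for (; f != NULL; f = f->hash_next)          /* hash-table look-up     */
        if (f->pid == pid)
            return f;                            /* hit: return the frame  */

    f = alloc_free_frame(bp);                    /* miss: get a free frame */
    storage_read(pid, f->data, sizeof f->data);  /* I/O via Storage Manager */
    f->pid = pid;
    f->dirty = 0;
    f->hash_next = bp->buckets[pid % 1024];      /* insert into hash chain */
    bp->buckets[pid % 1024] = f;
    return f;
}
```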

Meanwhile, the replacement policy is responsible for the decisions regarding when and what data should be removed from the buffer pool. Thereby, unmodified data is simply evicted from the cache, while modified pages are written back (flushed) to the storage. The flushing API of the Buffer Manager is also used by other database modules, such as the Recovery and Transaction Managers, for guaranteeing the atomicity and durability properties of the DBMS. The influence of the buffer's replacement strategy on database performance correlates with the system's dependency on the I/O stack. In systems where the working set cannot be cached completely in the buffer, an improper replacement strategy can easily become a significant bottleneck. Especially if the workload mixes large table scans with small OLTP-like accesses, an intelligent Buffer Manager is vitally important. An efficient replacement strategy must take into account not only simple statistics about frequency (e.g., LFU) or recency (e.g., LRU) of page accesses, but also information about the workload, the query plan and the current operator causing those accesses, as well as the type of database pages (e.g., index or table pages). Also, the type and characteristics of the storage have an influence on the Buffer Manager. The clear latency asymmetry of read and write requests on Flash storage (e.g., 10x and more) encourages a revision of the replacement strategy. Recent approaches addressing those issues are covered in Chapter 9. In this work we have looked at the Buffer Manager considering yet another important property of Flash SSDs - parallelism. We do not change the way the replacement strategy decides what and when to write back to storage, but rather how the selected pages are flushed. We show that (i) by utilizing information about the internal architecture of the Flash storage, and (ii) by having control over the physical data placement on Flash - both of which first become available in the NoFTL architecture - the DBMS can speed up the flushing process by about 50%. This is achieved by better utilization of the device's parallelism and by reducing the contention for physical resources between DBMS flusher threads (page cleaners). We made further modifications to the Buffer Manager module in connection with the IPA approach. The detailed descriptions of those approaches and the required modifications are presented in Chapters 5.4 and 7, respectively. The module nearest to the physical storage is the Storage Manager. It is responsible for space management on the storage, which includes, among others, storage access routines, allocation and deallocation of space for database structures, and the mapping of those structures to physical storage addresses (or to files and offsets within them in case a file system is used).
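Concretely, such a mapping can be as simple as a per-object extent table. The following sketch translates a logical page number of a table or index into a byte offset on a raw device or within a backing file; the extent_t/object_map_t structures and the 8 KB page size are assumptions of this illustration, not the layout of any particular engine:

```c
#include <stdint.h>

#define PAGE_SIZE 8192ULL      /* assumed database page size */

/* One contiguous extent of an object (table or index) on the device/file. */
typedef struct {
    uint64_t first_page;       /* first logical page number of the extent  */
    uint64_t num_pages;        /* extent length in pages                   */
    uint64_t base_offset;      /* byte offset of the extent on the raw     */
                               /* device, or within the backing file       */
} extent_t;

typedef struct {
    extent_t *extents;         /* allocation map maintained by the         */
    int       num_extents;     /* Storage Manager for one database object  */
} object_map_t;

/* Translate a logical page number of the object into a byte offset.
 * Returns -1 if the page has not been allocated yet. */
int64_t map_page_to_offset(const object_map_t *m, uint64_t page_no)
{
    for (int i = 0; i < m->num_extents; i++) {
        const extent_t *e = &m->extents[i];
        if (page_no >= e->first_page && page_no < e->first_page + e->num_pages)
            return (int64_t)(e->base_offset +
                             (page_no - e->first_page) * PAGE_SIZE);
    }
    return -1;                 /* not allocated */
}
```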

This module has received the least attention from the DBMS research community and is, therefore, the least modified one over the last couple of decades. The reason for this is the assumption of operating on storage devices which support the block device interface. Having the hard disk drive as the only reasonable storage option for more than three decades made this assumption an indisputable rule. Flash as a viable storage option was made possible mainly due to the support of a block device interface, i.e., SSDs were made backwards compatible with traditional HDDs. In contrast to this, the NoFTL architecture removes the backwards compatibility of Flash storage and gives the DBMS full and direct control over the Flash by means of the novel native Flash interface. As a consequence, the Storage Manager of the DBMS has been rewritten to support FTL-less Flash SSDs. More information about the changes in this module is provided in Chapters 4 and 5. The next important module in the simplified DBMS architecture is the Transaction Manager. Its main task is to provide concurrency control. The efficient support of parallel execution of transactions is the most critical and important functionality of modern DBMSs. Not surprising is, therefore, the attention this module has received from industry and academia over the last decade, motivated by the continuously growing gap between hardware parallelism and the concurrency level of the DBMS. In fact, the majority of today's database systems utilize traditional concurrency models, which often go back to the pioneering systems of the late 80s. The typical mechanisms to support different levels of transaction isolation are locking, multi-versioning and tracking of transactional access history (e.g., in an optimistic concurrency control model). Although, compared to other modules, this module is only weakly coupled to the characteristics of the physical storage, careful consideration of Flash specifics allows us to optimize traditional approaches. For instance, the implementation of multi-version concurrency control can be significantly improved by considering the out-of-place update principle of Flash SSDs. More details about recent research in this area are provided in Chapter 9. The last module in our simplified architecture of the DBMS is the Recovery Manager, which is, however, definitely not less important than the others. Although its core competence is to guarantee the durability of data after failures or system crashes ("D" in the ACID properties of RDBMS transactions), it typically also assists the Transaction Manager in realizing the atomicity and isolation properties of a transaction's execution ("A" and "I" in ACID). The common principle for all systems to provide the recovery functionality is based on maintaining a certain level of data redundancy. Database systems realize this via logging. Log records are small records describing a single modification of data item(s), which can be used to recover the corresponding data in case of loss or data corruption. The majority of today's relational database systems implement a kind of ARIES-like recovery. Its main principle is based on persisting the log records before writing the corresponding modified data items to stable storage, the so-called Write-Ahead Logging (WAL).
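The WAL principle can be stated in a few lines of code: before a dirty page leaves the buffer pool, all log records up to that page's last change must already be on stable storage. A minimal sketch, in which page_t, log_flush_up_to() and storage_write() are hypothetical helpers rather than parts of a real recovery manager:

```c
#include <stdint.h>

typedef uint64_t lsn_t;

typedef struct {
    uint64_t      pid;
    lsn_t         page_lsn;     /* LSN of the last log record that changed the page */
    unsigned char data[8192];
} page_t;

/* Hypothetical helpers: durable log flush and page write via the Storage Manager. */
extern lsn_t flushed_lsn;                       /* highest LSN already on stable storage */
extern void  log_flush_up_to(lsn_t lsn);        /* forces the log tail to storage        */
extern void  storage_write(uint64_t pid, const unsigned char *src);

/* Write-Ahead Logging rule: log first, data second. */
void flush_dirty_page(page_t *p)
{
    if (p->page_lsn > flushed_lsn)
        log_flush_up_to(p->page_lsn);  /* persist log records describing the change */
    storage_write(p->pid, p->data);    /* only now may the page itself be written   */
}
```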

Flash storage, with its "native" support of data redundancy caused by the out-of-place update strategy, allows for a significant simplification of these traditional recovery mechanisms. In other words, when a modified database page is written out of the buffer pool to the Flash SSD, there is always a previous unmodified version of the page present on the storage. This "backup" page can be used for recovery purposes. However, modern FTL-based SSDs completely hide the out-of-place update behavior from the host system, thereby making this "free" recovery support of Flash inaccessible to the DBMS. In contrast, under the NoFTL architecture the DBMS performs direct physical data placement, and thus has full control over the out-of-place update strategy, which allows us to use it for recovery optimizations. We discuss similar approaches proposed in recent years, as well as our ideas, in Chapter 9. Our modifications to the Recovery Manager required for the basic realization of NoFTL are provided in Chapter 5.5. Apart from the mentioned six modules, database systems must also implement other important functionalities, which might be realized as separate modules or be part of the above. Those include authorization, metadata management, memory management, administration, etc. These modules are typically completely independent of the underlying storage infrastructure and are, therefore, out of the scope of this work. In our NoFTL prototype implemented on top of an open-source storage engine they are left unchanged.

2.2. DBMS Storage Alternatives and I/O Stack

In this chapter we continue to follow the path of user requests submitted to the database, but now move our attention outside the DBMS. As we saw, the last stop of a request within the database system is the Storage Manager, where I/O requests are formed and sent down in the direction of the storage device. However, the way to the destination is typically quite long and full of further stops. Because this path is the major cause of disadvantages and limitations of modern Flash SSDs, we describe it in more detail. There are two traditional ways for the DBMS to operate on physical storage - with and without a file system on top of a block device. The storage alternative with a file system - also known as "cooked" storage (Figure 2.2-A,B) - was, and continues to be, the most widely used, and thus it is supported by all of today's database systems. The alternative without a file system is typically referred to as "raw storage" (Figure 2.2-C), and is also available as an option in the main DBMSs on the market. Common to both alternatives is, however, the utilization of the block device interface and the block I/O infrastructure (encapsulated in the so-called block-device layer). The backwards compatibility of modern Flash SSDs allows (or rather forces) them to utilize the very same I/O stack as used by traditional HDDs. That means that the block device interface is common for both SSDs and HDDs, and consequently the two mentioned storage alternatives (cooked and raw) are applicable to both types of storage devices as well. In the following we describe the major characteristics of these alternatives and of the block-device layer from the perspective of HDDs.

Figure 2.2.: DBMS storage alternatives. [Figure: three configurations of the DBMS I/O stack — (A) cooked storage with one or more files per database object (tables and indexes mapped to File 1, File 2, ..., File N plus metadata), (B) cooked storage with multiple objects stored in few files, and (C) raw storage on a raw partition; in all cases block I/O passes through the block-device layer down to the storage (HDD or FTL-based SSD).]

In the subsequent chapter we concentrate on the properties of Flash memory and Flash SSDs, and thereafter we come back to the I/O stack and revisit it from the perspective of Flash SSDs.

2.2.1. Cooked Storage

The use of a file system allows the DBMS to organize all of its data as a set of files and directories. The main advantage of this approach is the simplification of data administration on the DBMS side, achieved by delegating the responsibility for space management (at least most of it) to the file system. Further, the file system simplifies the management of data access permissions; makes it easier for the DBMS to support portability across multiple OSs; and allows the use of external tools to access database data (e.g., external data backup). However, using a file system also has its downsides for the DBMS.

The negative influence of the FS on database performance might vary significantly depending on the particular file system, its configuration, the storage media and the DBMS data placement strategies. The major reasons for possible FS-related bottlenecks can be classified into the following groups:

• partial or complete loss of control over the physical placement of data on the storage;

• partial or complete loss of control over the time when the data becomes physically written on the storage;

• computational, memory and storage overheads due to redundant or unnecessary functionality.

While a detailed discussion of the bottlenecks caused by file systems can be found in [68], [59] and [73], in the following we briefly summarize them based on the above grouping. As already mentioned in the previous chapter, the traditional DBMS design is strongly influenced by the characteristics of HDDs. Thus, algorithms to keep I/O requests as sequential as possible are present in almost all modules of the DBMS: query processor, access methods, buffer manager, storage manager and recovery. This is not surprising, since the difference between random and sequential accesses on modern HDDs is typically in the range of one order of magnitude. However, having a file system as an additional abstraction layer between the DBMS and the storage typically has a negative influence on the DBMS's efforts to place the data in close physical proximity. For instance, depending on the allocation strategy performed by the FS (e.g., contiguous, linked, indexed, extent-based, hybrid) and the alignment of FS blocks to database pages, the FS might map and place contiguous blocks of data at random positions on the storage. Another bottleneck the file system can produce for the DBMS is caused by the utilization of the write buffer1. By delaying the write requests of the DBMS, the file system might potentially violate the durability ("D" in ACID) property guaranteed by the database system. In turn, forcing the FS to perform immediate writes (e.g., via fsync) might result in an additional overhead [73]. Another group of disadvantages is caused by performing some functions redundantly in both the DBMS and the FS. The typical examples here are caching (read cache), prefetching and recovery. If the file system (or OS) caches the DBMS data which has recently been read from the storage, this means that two copies of the same data are kept in the system's memory - one in the DBMS internal buffer and another in the OS cache (e.g., the page cache in Linux).
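Both effects just mentioned — writes delayed by the FS write buffer and pages cached twice — are commonly worked around by bypassing the OS page cache and forcing durability explicitly. A minimal Linux sketch; the file name is illustrative, most error handling is omitted, and O_DIRECT additionally requires aligned buffers:

```c
#define _GNU_SOURCE            /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE 8192

int main(void)
{
    /* O_DIRECT bypasses the OS page cache (no second copy of the page);
     * fsync() makes the write durable at a point chosen by the DBMS.     */
    int fd = open("tablespace.dat", O_RDWR | O_CREAT | O_DIRECT, 0600);
    if (fd < 0) return 1;

    void *page;
    if (posix_memalign(&page, 4096, PAGE_SIZE) != 0) return 1;  /* alignment required */
    memset(page, 0, PAGE_SIZE);

    if (pwrite(fd, page, PAGE_SIZE, 0) != PAGE_SIZE) return 1;  /* write page 0 in place */
    fsync(fd);                                                  /* durability point      */

    free(page);
    close(fd);
    return 0;
}
```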

1 Although it is typically implemented as part of the OS, file systems rely on it, or might even extend it.

Apart from the obvious memory overhead, this caching redundancy also produces computational overhead resulting from the maintenance of the second cache and from copying the data from the FS/OS cache to the DBMS buffer [73]. Moreover, the FS prefetching strategies might in some cases pollute the system's memory with irrelevant data. The typical example of such a situation is performing a sequential scan of linked B-Tree leaf nodes to answer a certain range query [38]. The DBMS can easily predict the sequence of nodes to be scanned, and thus the corresponding pages can be efficiently prefetched. In contrast, neither the OS nor the file system can correctly estimate which pages are going to be read next, because commonly the linked leaves of a B-Tree are not logically sequential, i.e., their LBAs are not contiguous. As a result, the file system will try to prefetch "useless" pages, thereby polluting the system's memory and producing an unnecessary overhead of read I/Os.
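Because only the DBMS knows which leaf pages the scan will touch next, it can pass that knowledge to the kernel as explicit prefetch hints instead of relying on generic read-ahead. A sketch using posix_fadvise(); the leaf_offsets array is a hypothetical product of the preceding index traversal:

```c
#include <sys/types.h>
#include <fcntl.h>
#include <stddef.h>

#define PAGE_SIZE 8192

/* Hint the kernel to prefetch exactly the leaf pages the scan will visit,
 * in the order given by the B-Tree leaf chain (their file offsets are
 * usually NOT contiguous, so sequential read-ahead would miss them).     */
void prefetch_leaf_pages(int fd, const off_t *leaf_offsets, size_t n)
{
    for (size_t i = 0; i < n; i++)
        posix_fadvise(fd, leaf_offsets[i], PAGE_SIZE, POSIX_FADV_WILLNEED);
}
```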

2 Analyzing a small journal file takes a negligible amount of time compared to performing a scan over the whole file system, e.g., with fsck.

Even more significant overhead for the DBMS might be caused by a file system's recovery strategy. Modern file systems typically offer multiple recovery modes, allowing the user to trade off between performance (influenced by additional I/O overhead) and different recovery possibilities in the case of system crashes. Thus, for instance, the ext4 and ext3 file systems provide three recovery modes, known as journal, ordered and writeback. In the recommended, "safest" journal mode the user data and the internal file system metadata are first written sequentially to a dedicated region (file), called the journal, and only after successful "journaling" is the data written again to its corresponding addresses. After a crash, the file system can easily2 determine inconsistencies in its metadata or user data which occurred due to an uncompleted write sequence. By re-doing these requests with the data from the journal, or by simply ignoring them in case of incomplete writes to the journal, the file system is returned to a consistent state. This journaling concept is very similar to the WAL strategy traditionally used in early relational DBMSs (which is no coincidence, since it was inherited from the latter). However, journaling alone (even in its most consistent mode) would be insufficient for the DBMS: although it can guarantee the consistency of the file system, it cannot guarantee the atomicity, consistency and durability properties of database transactions, and thus of the database data as a whole. First of all, this is because the file system has no access to transaction semantics (e.g., which I/Os belong to a particular transaction) and no control over their execution. As a result, no database system relies on FS journaling, but rather always "cares" itself about the consistency of its data (see the Recovery Manager in Chapter 2.1.2). This makes the journaling of database data by the FS redundant, and thus unnecessary, since it does not give the DBMS any further guarantees for data recovery after system crashes. But this redundancy might produce a significant I/O overhead for the whole system. For instance, if the ext4 file system is configured in journal mode, every single data update by the DBMS might be written practically four times to the storage: (i) write of the database log record to the file system's journal; (ii) write of the same record from the journal to the end of the database log file; (iii) write of the actual modified database data to the journal; (iv) write of this data from the journal to the corresponding database file. Besides at least doubling3 the amount of data written to storage, FS journaling can cause further performance degradation by influencing the pattern of I/O requests. Especially if the I/O requests submitted by the DBMS are sequential, journaling might turn them into random ones by introducing additional writes to the journal in between. Although the largest part of the write amplification produced by the file system can be eliminated by turning off journaling for the application data (e.g., the ordered and writeback recovery modes in ext3 or ext4), the journaling of the file system's metadata is typically unavoidable. Thus, even in the lightest configuration file systems produce a certain additional write amplification (e.g., 25% more I/Os) and influence the pattern of I/O requests. There are two basic approaches of how database systems utilize the file abstraction for storing data. The first practice is to assign every database object to one or multiple files, i.e., a one-to-many relationship (Figure 2.2-A). For instance, every table/index is initially stored in a separate file. As objects grow in size, new files for storing their data might be allocated, so that large tables or indexes span multiple files. The alternative strategy is to store multiple or even all database objects (or their parts) in a single file (Figure 2.2-B), while multiple such files can exist, i.e., a many-to-many relation between objects and files. Without going deeper into the discussion of pros and cons, it is worth mentioning that in recent years there has been a clear trend towards the latter strategy. By storing the data in only a few large files the DBMS can reduce certain disadvantages resulting from the use of a file system. For instance, the initial allocation of a large file (especially on a relatively empty storage) increases the probability that logically contiguous blocks of data also get contiguous device addresses assigned by the file system. If the underlying storage is a hard disk drive, the data is then also placed contiguously on the physical medium, i.e., the DBMS can to a large extent control the physical placement of data. Consequently, the DBMS can better leverage the performance gains of sequential I/O patterns on HDDs. Thus, this strategy has become a widely used trade-off for DBMSs on HDDs, which allows combining some advantages of file system-based and raw storage (described below).

3 The actual write amplification of the ext3 or ext4 file systems in journal mode can be as high as 5x in terms of the amount of written data, and more than 3x in terms of I/O count for OLTP workloads [68], [59].

Berkeley DB, SQLite and Shore-MT are just some examples of popular DBMSs that utilize this strategy and store a complete database in a single file.

2.2.2. Raw Storage

Most of the commercial database systems on the market today offer the possibility to operate without the use of a file system as storage abstraction. This storage alternative is also known as "raw storage" (Figure 2.2-C). The rationale behind it is manifold. First, the DBMS gets more control over the physical data placement. As already mentioned, the performance of HDDs varies drastically depending on the access pattern, e.g., sequential accesses can be up to 100x faster than random ones. To leverage this property the data should be placed on the drive in a way which "sequentializes" the access pattern of the current workload as much as possible. For this task raw storage has a significant advantage over cooked storage - the direct mapping from logical to physical block addresses. This mapping requires no address translation table, but is rather realized by a simple constant offset calculation4. This ensures that blocks with contiguous LBAs also have contiguous physical addresses on the storage5. Thus, by managing the logical address space, the DBMS fully controls the physical data placement. In contrast, under the cooked storage alternative the file system manages the physical address space on its own, creating therefore an additional level of address translation between the DBMS and the physical storage. As a result, the DBMS can only indirectly and to a certain extent influence the physical data placement. Another advantage of raw storage is the lean I/O stack and the minimal memory and computational overhead. As already mentioned, caching (buffering), prefetching and recovery are the typical functions of the file system that become redundant and overwhelmingly expensive for applications like DBMSs. Moreover, the elimination of FS/OS caching gives the DBMS under raw storage better control over the time when certain data gets physically persisted, which solves the issues related to the DBMS's durability guarantees in the cooked storage stack. A general performance comparison of both storage alternatives is, however, difficult to perform, as there are many factors that have a significant influence on it. The DBMS and its configuration, the file system, the workload, the storage and the amount of RAM are the main parameters, which define a huge space of possible system configurations. For instance, under a workload with a completely random access pattern, the direct control over data placement in raw storage would not bring any advantage over the cooked alternative; but it would still outperform the latter due to the reduced memory and computational overhead.

4 offset = partition_offset + (LBA * block_size)
5 With the exception of cases where, due to "dead" sectors, the HDD internally remaps them to reserved sectors.
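The constant-offset mapping from footnote 4 translates directly into code; the following sketch reads one logical block from a raw partition (the device name in the usage comment and the 4 KB block size are illustrative assumptions):

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define BLOCK_SIZE 4096ULL

/* Raw storage: no address translation table, just offset arithmetic.
 * offset = partition_offset + (LBA * block_size)                      */
ssize_t read_block(int fd, uint64_t partition_offset, uint64_t lba, void *buf)
{
    off_t offset = (off_t)(partition_offset + lba * BLOCK_SIZE);
    return pread(fd, buf, BLOCK_SIZE, offset);
}

/* Usage (illustrative device name):
 *   int fd = open("/dev/sdb1", O_RDWR);
 *   read_block(fd, 0, 42, buffer);
 */
```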

Yet, for DBMSs with ample memory that are CPU-bound, the performance advantages of raw storage would be significantly smaller than for an I/O-bound system. Some performance numbers can be found in [73], [59], [68], as well as in Chapter 5 of this thesis. Despite these performance advantages of the raw storage alternative on HDDs, many database setups today still opt for the cooked I/O stack with a file system between the DBMS and the storage. The reasons for this are diverse. One of them is that database data stored in files can be easily accessed by applications other than the DBMS. For instance, separate backup applications typically work only with file abstractions, and not with raw disk partitions. Also, under raw storage the whole disk partition must be dedicated solely to the DBMS, which might under certain circumstances result in poor space utilization. Assume, for example, a single HDD with 5TB capacity, and that the initial size of the database is just 100GB. If the database is relatively static or has a very slow growth rate, then creating a 200GB raw partition and dedicating it to the DBMS might work pretty well. However, what if the database has a higher growth rate, e.g., we estimate it to be approximately 1TB in one year? How large should the database partition be - 1TB, 2TB or the whole drive? If we decide on 2TB, then 1TB of our HDD is "reserved" for the second year and cannot be used by other applications (not even for temporary storage of data). And what if the growth rate of the data cannot be estimated in advance? Note that resizing non-empty disk partitions is a "no go" option, due to the high risk of data corruption or loss. However, those two issues are typically relevant only for small setups. Large and enterprise database instances typically run on separate machines dedicated strictly to the DBMS anyway (e.g., database servers, nodes in data centers). Further, modern DBMSs (commercial and open-source) typically provide enough mechanisms themselves to guarantee high reliability and availability, so that there is no need for third-party backup applications. Another, often mentioned, reason for choosing cooked over raw storage might be encapsulated in the adjective simplicity. By shifting the responsibility for physical space management to the file system, the DBMS storage manager becomes leaner. Moreover, to achieve more efficient data placement on raw storage, the DBMS needs to know some physical characteristics of that storage (e.g., the track alignment of the HDD [86]). This makes the data placement strategy to a certain degree device-specific, which is often seen as a disadvantage compared to the abstraction level given by the file system. However, these reasons are rather subjective. The additional code complexity of the storage manager required for the support of raw storage is usually over-estimated. Yet, the main reason for the low popularity of the raw storage alternative is the widespread use of virtual storage techniques, such as RAID, SAN and LVM. All of them create an additional level of abstraction over the physical storage, which, in turn, makes control over the physical data placement by the DBMS difficult.

Since those issues are also relevant for the proposed approach on Flash SSDs, we will address them in detail later in this work.

2.2.3. Block-Device Layer

Independent of the storage alternative used by the DBMS, all its I/O requests pass through the generic block layer of the operating system. This layer consists of a complex set of mechanisms, which together provide a block device interface common to a variety of storage devices, including physical (HDDs, SSDs) and logical (RAID, LVM) devices. Thus, the block device is an abstraction which hides specific device characteristics and gives the upper layers (file system, DBMS, etc.) a generic view of the underlying storage. Every such device is represented as a contiguous logical address space, built of fixed-sized blocks, which are accessed via immutable logical block addresses (LBAs). The block size is defined by the operating system and its value is (i) a power of two, and (ii) between the sector size6 and the page size7. Apart from being an abstraction, the block layer typically performs caching (queuing) and, if appropriate, the re-arrangement of incoming I/O requests. Traditionally, these tasks are encapsulated in the I/O scheduler module of the block layer. Depending on the strategy utilized by the I/O scheduler, requests might be merged (coalesced), sorted and subsequently cached in a single queue (e.g., elevator, noop) or in multiple software queues (e.g., 3 queues in deadline, per-process queues in cfq). By doing so, the scheduler tries to satisfy multiple goals simultaneously: minimizing the I/O latency and maximizing the throughput, while avoiding the starvation of individual requests. Afterwards, the I/O requests are added to the hardware dispatch queue, which is their final station in the block-device layer. From here they are passed to the device driver layer (e.g., in Linux the so-called SCSI layer is responsible for all SATA and SAS HDDs and SSDs, as well as RAID systems). More details about the I/O scheduler and the block-device layer in the Linux kernel can be found in [12], [18], [58].

6 Smallest addressable unit of the storage device
7 Smallest allocation unit of main memory managed by the operating system
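The merging and sorting performed by the scheduler can be illustrated with a toy, single-queue elevator pass over pending requests; real kernel schedulers (noop, deadline, cfq) are considerably more elaborate, so this is only a conceptual sketch:

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t lba;      /* start address of the request */
    uint32_t nblocks;  /* request length in blocks     */
} io_req_t;

static int by_lba(const void *a, const void *b)
{
    uint64_t la = ((const io_req_t *)a)->lba, lb = ((const io_req_t *)b)->lba;
    return (la > lb) - (la < lb);
}

/* Sort the pending queue by LBA and coalesce adjacent requests;
 * returns the new number of requests. */
size_t elevator_pass(io_req_t *q, size_t n)
{
    if (n == 0) return 0;
    qsort(q, n, sizeof *q, by_lba);                 /* "elevator" ordering    */

    size_t out = 0;
    for (size_t i = 1; i < n; i++) {
        if (q[i].lba == q[out].lba + q[out].nblocks)
            q[out].nblocks += q[i].nblocks;         /* merge adjacent request */
        else
            q[++out] = q[i];
    }
    return out + 1;
}
```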

2.3. Flash

The general principles of Flash memory were born in the early 1980s in the group of Dr. Fujio Masuoka at Toshiba. After he presented the innovative idea8 in 1984 in [63], the market immediately realized its potential and started the intensive development of the new non-volatile memory. Intel worked on the development of the NOR type of Flash memory, and already in 1988 introduced the first commercial product - a 256K NOR Flash chip. In 1987 the father of Flash memory, Dr. Masuoka, published the paper about NAND Flash [64], and two years later Toshiba entered the new market with the world's first NAND Flash chip [84]. From this point on, the Flash market exploded. The invention of Flash memory has basically revolutionized large portions of the IT industry. Many portable devices have been re-designed for Flash memory, or became possible only through the use of Flash memory. Digital cameras, video and audio recorders, MP3 players, simple mobile devices and modern smartphones, tablets, e-books, ultrabooks, desktop PCs and servers, routers, TVs, various embedded devices and sensors in all possible spheres of our life (transportation, medicine, meteorology, manufacturing, etc.), and today even household devices like washers and fridges use Flash memory. It is the main long-term persistent storage nowadays. The reasons for the huge popularity of Flash are manifold: small form factor, low power consumption, noiseless operation and shock resistance. While those are also important for desktop PCs and servers, the dominant advantage there is the performance characteristics of Flash storage. This is also the major Flash characteristic considered in this work. There are two basic types of Flash memory: NAND and NOR Flash9. NOR Flash is typically used as code execution storage (e.g., XIP, BIOS ROM, memory inside microcontrollers)10, while NAND Flash is designed to be a high-density data storage. A brief comparison of both Flash types is presented in Table 2.1. However, since in this work our focus is data storage for large databases, we will consider only NAND Flash.

8 The full name of the new memory type back then was Flash EEPROM (electrically erasable programmable read-only memory). The name "Flash" was suggested by a colleague of Dr. Masuoka, who found the block-wise erasure process of the memory cells to be similar to the flash of a camera.
9 The names NOR and NAND were chosen because of the similarity of the internal memory organization to CMOS gates. Thus, in NAND Flash the memory cells are organized in series, like in NAND CMOS gates, while in NOR Flash they are connected in parallel, similarly to NOR CMOS gates.
10 As of today, due to the lower prices for RAM (DRAM, SRAM), as well as the need for larger storage capacity for firmware, the use of NOR Flash for XIP is typically substituted by a combination of NAND Flash and RAM. The code is read from NAND Flash and cached in RAM, from where it is then executed (e.g., BIOS shadowing). Since RAM is still an order of magnitude faster than Flash, this configuration also has performance advantages over XIP on NOR Flash.

The following two chapters provide more details about NAND Flash memory and modern storage devices based on it.

                   NAND            NOR
Density            High            Low
Read speed         Medium          High
Write speed        High            Low
Erase speed        High            Low
Endurance          Medium          High
Cost per bit       Low             High
Power (active)     Low             High
Power (standby)    Middle          Low
Access             Page-based      Byte-based
Application        Data storage    Code storage

Table 2.1.: Comparison of NAND and NOR Flash memory types.

2.3.1. NAND Flash Memory

Physics of Flash Memory

The core of Flash memory is the floating-gate transistor (FGT) (Figure 2.3), which constitutes a single Flash cell. The FGT differs from a normal transistor (e.g., the one used in a DRAM cell) by having an additional gate - a floating gate. This is located between the control gate and the channel, and is separated from them via two insulating layers11, i.e., it is completely unconnected. This gate is basically what makes Flash memory persistent. By manipulating the voltages on the control gate and the channel, the floating gate can be charged to a certain level, or be completely discharged12. After removing the applied voltages, the charge in the floating gate remains, and can be "read" at any time later on.

11 The insulating layer between the floating gate and the channel of the FGT is made of a special tunnel oxide, which allows electrons to move through it in both directions. This is opposite to the blocking oxide insulator, which separates the floating gate from the control gate and does not allow electrons to flow through it.
12 The process of moving electrons (i.e., charge) to and from the floating gate is known as Fowler-Nordheim tunneling.


The amount of this negative charge inside the floating gate represents the bit value stored in this Flash cell.

Figure 2.3.: Floating-gate transistor making a single Flash memory cell. [Figure: schematic of the FGT showing the control gate (connected to the wordline), the floating gate enclosed by insulating layers, the source and drain terminals, the channel (substrate) and the bitline.]

The cells in NAND Flash are connected in series (Figure 2.4), also called strings, such that two adjacent cells share their drain and source connections. This enables a significant reduction of the size of a single cell (by saving on wiring13), and thus very high memory densities. Each string is connected via special control transistors to a bitline (vertical wire) on one end, and to the source line on the other end. Row-wise the cells are connected to the corresponding wordlines (horizontal wires), i.e., the control gates of all FGTs in a certain row are connected to one wordline. The number of cells in a string determines the number of pages in a Flash block14 (erase unit), while the number of strings crossing each particular wordline corresponds to the size of a Flash page in bits. The three possible operations - read, program and erase - are performed by manipulating the voltages on the word-, bit- and source-line wires.

• To program a Flash page, the controller applies a high voltage (Vpgm) to the corresponding wordline, and either no (0V) or a low voltage (Vcc) to the bitlines, depending on whether the corresponding cells should receive a charge or not. Applying 0V to the bitline creates a high potential difference between the control gate and the channel of the corresponding FGT, which creates a current flow between them. As a result the floating gate receives a certain amount of charge (electrons are "trapped" on their way from the channel to the control gate).

13 Thus, the cell size of NAND Flash is about 60% smaller than the cell size of NOR Flash [93].
14 The number of pages in a Flash block equals K times the number of cells in a string, where K is 1 for SLC NAND Flash, 2 for MLC NAND Flash, and 3 for TLC NAND Flash.

Figure 2.4.: Cell-array architecture of the NAND Flash memory. [Figure: NAND cell array with strings connected between the string-select line (SSL) and the ground-select line (GSL), crossed by wordlines WL0-WL31 and bitlines BL0-BL127; during programming the selected wordline receives the high program voltage (e.g., 20V) while the other wordlines receive a pass voltage (10V); an inset shows the cell's charge increasing over successive ISPP loops in steps of ∆Vpgm.]

Supplying the bitline with Vcc prevents the current flow between the control gate and the channel15, and thus leaves the floating gate of the FGT in an unchanged state. The common approach nowadays is to utilize the so-called Incremental Step Pulse Programming (ISPP) technique to program Flash cells [90, 66]. ISPP programs cells in multiple iterations, each time increasing the cell's charge by a certain amount. After each ISPP iteration the charge in the cells is checked, and if it is lower than the desired charge it is increased in a subsequent iteration. The procedure repeats until each cell in the current wordline has the desired level of charge. In general, ISPP achieves better precision compared to the alternative "program at once" techniques, and it is gentler on the Flash because of the lower voltages used during programming.

• In order to read a Flash page, a high voltage (Vread) is applied to all wordlines in the block, except the one corresponding to the desired page. Vread turns the FGTs of all unselected wordlines into "transfer" mode, letting the current flow along the string.

15This technique is known as self-boosted program inhibit (see Chapter 3.4.1 in [66]).

At the same time, applying no voltage (0V) to the selected wordline causes the corresponding cells to behave differently depending on the charge stored in their floating gates. The cells with no charge allow the current to flow through them, and thus through the whole string. If, however, the cells in the selected wordline contain some charge in their floating gates, the current does not flow through them. By sensing the current flow at the end of each bitline, the controller can determine the charge (and thus the logical value) of every cell in the selected wordline (page).

• To erase a block, a high voltage is applied to all bitlines, and zero voltage is applied to all wordlines in the current block. This creates a high potential difference (similar to the one during the program operation, but in the opposite direction) between the control gate and the channel in every cell of the block, which forces the electrons from the floating gates (if present) to move to the channel. The erase operation is often also implemented using an iterative process similar to ISPP. (A minimal interface sketch of these three operations is given below.)
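The granularities just described can be summarized in a minimal NAND command interface; the flash_io() backend, the sizes and all names are assumptions of this sketch, not a real device driver:

```c
#include <stdint.h>
#include <stddef.h>

/* Granularities stated above: read/program work on pages, erase on blocks.
 * The concrete sizes are illustrative.                                     */
#define FLASH_PAGE_SIZE   8192      /* bytes per page (typically 4-16 KB)   */
#define PAGES_PER_BLOCK   128       /* pages per erase block                */

typedef struct { uint32_t block; uint32_t page; } flash_addr_t;

extern int flash_io(int op, flash_addr_t addr, void *buf);  /* device backend */
enum { OP_READ, OP_PROGRAM, OP_ERASE };

int nand_read_page(flash_addr_t a, void *dst)
{
    return flash_io(OP_READ, a, dst);
}

/* A page may only be programmed if its block was erased since the last
 * program of that page (erase-before-overwrite).                         */
int nand_program_page(flash_addr_t a, const void *src)
{
    return flash_io(OP_PROGRAM, a, (void *)src);
}

/* Erase always affects the whole block, never a single page. */
int nand_erase_block(uint32_t block)
{
    flash_addr_t a = { .block = block, .page = 0 };
    return flash_io(OP_ERASE, a, NULL);
}
```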

Properties of Flash Memory

The "physics" of the FGT, the organization of NAND Flash, and the procedures of the three basic operations together define the major properties of NAND Flash memory. The key characteristic of the program operation on Flash is that it can change the cell's charge only upwards, i.e., increase it using the ISPP technique. Since there is no way to decrease the charge of an FGT to a certain level, the only possibility to re-program a cell to a lower charge is to perform an erase operation (i.e., remove the cell's charge completely) and subsequently program the cell to the desired level. This is known as the erase-before-overwrite principle of Flash memory. Since the probability that during an arbitrary update all bits in a Flash page require either no change of the corresponding cell's charge or only its increase is negligibly small, an overwrite is never performed without a preceding erase operation. This write behavior of Flash memory is fundamentally different from the one of HDDs, where modified data can simply be written "on top" of the old data, and an erase operation is not defined. As already mentioned, NAND Flash supports read and program operations on a page basis (typically 4-16KB), while the smallest unit of erase is a Flash block (typically 128 or 256 pages, i.e., 512KB-4MB). However, even a brief look into the Flash memory architecture and the procedure of the basic operations can easily raise a question about those units. Why is the smallest unit of the program operation the whole Flash page? Actually, NAND Flash is capable of programming just a single bit of information (e.g., apply Vpgm to the corresponding wordline, and Vcc to all bitlines except the one holding the cell to be programmed).

Further, when reading we can sense only the string of interest and omit the others. Although this in fact works, there are certain reasons why it is almost never utilized. One of them is that the difference in power consumption between programming the whole page and programming just one bit or byte would be negligible (the same holds for reading a page vs. reading a single bit/byte). Other reasons are management complexity and certain reliability issues (e.g., the implementation of ECC, an increase of program disturb errors). In other words, although it is possible, it is considered impractical for common Flash storage devices. However, the proposed NoFTL architecture allows us to revisit both properties (the minimal program unit and the erase-before-overwrite principle of Flash), and to efficiently utilize them to speed up the system performance and to prolong the lifetime of Flash storage (see Chapter 7). One important property of NAND Flash memory is the latency asymmetry of the three basic operations. Although there is a significant discrepancy in latencies between different manufacturers, rough estimates for the three Flash types are presented in Figure 2.5. Thus, the rough average ratio between read, program and erase operations is 1:15:70. The latency asymmetry is a characteristic property not only of Flash memory, but also of the majority of non-volatile memories.

                 SLC          MLC          TLC
Bits per Cell    1            2            3
P/E Cycles       100,000      3,000        1,000
Read Time        25us         50us         ~75us
Program Time     200-300us    600-900us    ~900-1350us
Erase Time       1.5-2ms      3ms          ~4.5ms

Figure 2.5.: Latencies of basic operations for different NAND types. Source: AnandTech [98]
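A rough back-of-the-envelope calculation illustrates why these latencies, combined with the erase-before-overwrite principle described above, make strict in-place updates unattractive. The MLC numbers are taken from Figure 2.5 (rounded), while the 128-page block and the assumption that all other pages of the block still hold valid data are illustrative:

```c
#include <stdio.h>

int main(void)
{
    /* MLC numbers from Figure 2.5 (rounded). */
    const double read_us = 50, program_us = 750, erase_us = 3000;
    const int pages_per_block = 128;     /* assumed block geometry */

    /* Updating one page strictly in place: read the remaining valid pages,
     * erase the block, re-program everything.                             */
    double in_place = (pages_per_block - 1) * read_us
                    + erase_us
                    + pages_per_block * program_us;

    /* Out-of-place update: program the new version into a free page
     * elsewhere; the erase is deferred and amortized by garbage collection. */
    double out_of_place = program_us;

    printf("in-place     : %.1f ms\n", in_place / 1000.0);      /* ~105 ms  */
    printf("out-of-place : %.3f ms\n", out_of_place / 1000.0);  /* ~0.75 ms */
    return 0;
}
```

This two-orders-of-magnitude gap is one reason why Flash devices perform updates out of place.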

Another characteristic of Flash memory is that all operations require a constant time to locate the unit to be accessed (page, block), as opposed to the variable seek times of HDDs. Because of this property Flash memory is seen as storage with constant I/O latencies, independent of the location of the data, making it especially beneficial compared to HDDs for workloads with random access patterns. While generally this is true, there are a few exceptions and remarks regarding this statement. For MLC and TLC Flash types the latency of a program operation may vary depending on the offset of a page within its block [27], [111]. For instance, on MLC Flash roughly half of the pages in every block are so-called fast pages (LSB pages), while the other half are slow pages (MSB pages).


The programming latency for slow pages can be up to 6 times higher than for fast pages [27]. More details on this are provided in Chapters 7 and 6.1, where we utilize this behavior for improving the write performance of the database system. Similar effects are often also observed for the erase operation. One might also doubt the statement of location-independent access times on NAND Flash when looking at the specifications of SSDs, where the manufacturers clearly distinguish between random read and sequential read latencies. The difference between the two access patterns can be multiple orders of magnitude (e.g., 25µs for a random read, and 25ns for a sequential read), which seems to be even more significant than for HDDs. However, the meaning of the adjectives random and sequential is different for Flash memory than for spinning drives. While for HDDs they characterize the physical contiguity of the data chunks to be accessed, for Flash memory they rather describe whether the chunks to be accessed are known in advance. This knowledge can be utilized by the Flash controller to: (i) perform read operations in parallel on independent units of Flash memory; (ii) utilize interleaving techniques; and (iii) optimize the usage of cache registers and on-device buffers [16]. This decreases the overall latency for reading a group of pages, and thus lowers the average latency for a single Flash page from this group. In other words, reading 100 Flash pages with contiguous physical addresses takes about the same time as reading 100 pages with random physical addresses, if in both cases the controller knows the addresses of all requested pages in advance. The latter case might perform even faster, because pages randomly distributed over the Flash memory are more likely to be readable in parallel than those with contiguous addresses. Thus, the adjectives "single" and "bulk" might be more descriptive for Flash memory than "random" and "sequential". Note that the utilization of optimized read operations is in practice very complex, due to the legacy software protocols and the black-box design of modern Flash storage. We elaborate more on this in the following chapters. Yet another important property of Flash memory is its limited lifetime, which is typically expressed as the maximum number of program/erase (P/E) cycles every Flash block can undergo until its pages cannot store data reliably anymore. The reason behind this limitation lies in the degradation of the insulating layer (tunnel oxide) between the floating gate and the channel of the FGT, caused by the high voltages and the current flow through this layer during program and erase operations. The damaged insulating layer can itself trap and accumulate electrons, which gives it a certain electrical field. This, in turn, interferes with the field created to program or erase a cell, and after a certain point the interference does not allow these operations to be performed correctly [24, 14]. Thus, the wear of Flash memory is a continuous process, while the limit of P/E cycles is based on aggregated statistics and provides a rough estimation of a block's lifespan. Different Flash types (see below) also differ significantly in their longevity.

While there is also a difference between manufacturers, the rough numbers as of today are: SLC Flash - 100,000, MLC Flash - 3,000 and TLC Flash - 1,000 P/E cycles (Figure 2.5). Apart from program and erase failures caused by the wear of blocks, there are three other types of errors that might occur on Flash memory: retention, program interference and read disturb errors. Retention errors appear when the small amount of charge stored in the floating gate leaks through the insulating layer. While retention causes a decrease of a cell's charge over time, program interference and read disturb errors, conversely, result in an unintentional increase of a cell's charge during program and read operations. According to the analysis provided in [14], retention errors dominate in frequency, followed by program interference errors, and then read disturb errors. However, common to all types of errors is the correlation of their frequency with the Flash wear - the more P/E cycles a certain block has undergone, the more vulnerable its cells are to those errors. A detailed study of this topic can be found in [13] and [14].

Types of NAND Flash

Flash memory has undergone significant technological changes over the last three decades. The most significant force in its evolution was the demand for reducing the price of memory, which could be achieved mainly by increasing its density. In 2016 the technology was able to store as much as one terabyte of data in a single SD card the size of a cent coin16, while 10 years earlier the maximum capacity of the same card type was 4GB17, which amounts to roughly a 250-fold capacity increase for this product over a decade. This technological progress was achieved by following two different directions in the development of NAND Flash. First, as is common in the whole semiconductor industry, the increase of memory density is a result of the so-called "process shrink" - decreasing the size of transistors via better fabrication processes or materials, which roughly follows Moore's law18. Today Flash memory chips are primarily fabricated using 1Xnm nodes (the distance between memory cells is in the 10 to 19 nanometer range), which back in 2008 seemed technologically impossible.

16 https://www.sandisk.com/about/media-center/press-releases/2016/western-digital-demonstrates-prototype-of-the-worlds-first-1terabyte-SDXC-card
17 https://www.sandisk.de/about/media-center/press-releases/2006/2006-09-26-sandisk-introduces-the-first-4-gigabyte-sandisk-ultra-ii-sdhc-card%E2%80%94fast-performance-large-capacity-ideal-for-digital-cameras-and-camcorders
18 "The number of transistors incorporated in a chip will approximately double every 24 months.", Gordon Moore [69]

While going beyond this range seems technically unreasonable (at least as of today), the new 3D memory organization, together with further improvements in wafer lithography, allows the growth of memory density to continue in the future. The second NAND Flash development resulting in an increase of density is known as "bit growth". As we already know, the information is stored on Flash by means of an electrical charge captured in the memory cells. Bit growth is therefore about how much information can be encoded in a single cell, and today's market offers four different types of NAND Flash based on this criterion (Figure 2.6). Originally, all Flash memory was Single-Level Cell Flash (SLC), where every cell encodes just a single bit of information. Thus, a logical 1 is associated with a cell having a charge below a certain threshold level (ideally no charge, although due to read disturb and program interference errors such cells might carry some parasitic charge), while a charge above the threshold level indicates a logical 0. In Multi-Level Cell Flash (MLC) the controller differentiates between four different thresholds of the cell's charge, and is thus able to encode two bits of information in every cell. Triple-Level Cell Flash (TLC) can store three bits of data in a cell, which requires eight different threshold voltages. The densest type of Flash memory is Quad-Level Cell Flash (QLC), which was first introduced in 2018 ([91]) and, as its name implies, encodes four bits of information in every floating-gate transistor. Thus, MLC Flash offers twice as much capacity as SLC Flash, TLC increases this again by 1.5x, and QLC offers four times the capacity of SLC with the same number of Flash cells.

Figure 2.6.: Four different types of NAND Flash memory. [Figure: the voltage range of the floating gate (FG) divided into 2, 4, 8 and 16 threshold levels, encoding 1 bit (SLC), 2 bits (MLC), 3 bits (TLC) and 4 bits (QLC) per cell, with the corresponding bit patterns from 0/1 up to 0000-1111.]

Despite the clear dominance of TLC and QLC Flash in terms of density, all four types of memory are manufactured today. This is because the increase of memory density through the bit growth process does not come for free. The price paid is a decrease in performance and longevity. In general, there is no structural difference between the memory cells of the different Flash types; rather, they differ in how those cells are read and programmed. While for SLC Flash the controller must differentiate only between two voltage thresholds, for MLC there are already four different thresholds, for TLC eight, and for QLC 16.

In other words, the program and read operations on MLC, TLC and QLC Flash have to be more and more "accurate" and "sensitive" compared to SLC Flash (e.g., during programming more ISPP cycles are performed, each increasing the cell's charge only by a small delta).

On the other hand, more voltage thresholds mean that the distance between two distinct values becomes smaller, and thus even small parasitic injections of electrons (read disturb or program interference errors), or a small leakage of them (retention errors), can lead to an erroneous interpretation of the stored bit values. So it is not that these errors occur more frequently on MLC and TLC Flash, but rather that their impact on these NAND types is larger compared to SLC Flash. And since the frequency of errors is correlated with the wear of the memory, TLC Flash will "give up" earlier than MLC Flash, and MLC Flash again earlier than SLC Flash (compare the P/E cycles in Figure 2.5).

Therefore, the bit growth development created Flash types that trade off the characteristics of the memory - density on one axis, and performance together with longevity on the other. While each memory type has found its consumers, MLC Flash is leading the storage market today, but TLC Flash might catch up soon thanks to 3D NAND technology.

The latest breakthrough in Flash technology is 3D Flash. Along with process shrink and bit growth, it can be seen as the third direction of NAND development. The basic idea behind 3D Flash is to add a third dimension to the Flash memory organization (the Z-dimension), which is done by placing the series of memory cells (strings) vertically on the silicon wafer. While in traditional Flash (2D Flash) a string of cells "lies" in the horizontal plane of the die, and thus occupies space proportional to the number of cells in it, in 3D Flash this string is folded in the middle and then stood upright, therefore occupying minimal space in the horizontal plane of the die. This fabrication process allows the manufacturers to increase memory density without process shrink (and even to go back to 2Xnm or 3Xnm spacing between Flash cells), i.e., the memory grows in the Z-dimension instead of shrinking in the X-Y plane. Apart from the clear density benefit, 3D Flash offers significant advantages in terms of reliability and performance. The new architecture and materials19 enable the use of TLC Flash with performance and endurance characteristics close to MLC Flash. A more detailed description of 3D Flash technology can be found in [2].

19 3D Flash often uses cells based on charge trapping (CT) technology. CT transistors have a special charge trapping layer (a silicon nitride film) instead of the floating gate (polycrystalline silicon) used in the FGT.

2.3.2. Flash SSD

The market nowadays offers multiple classes of storage devices based on NAND Flash memory. Those include (i) removable/portable devices, like eMMC Flash20 cards (e.g., SDHC, miniSD) and USB Flash drives; (ii) embedded storage, e.g., Flash memory chips in smartphones, routers, etc.; (iii) Flash Solid State Disks (Flash SSDs, or just SSDs hereafter) - mass storage devices for desktop PCs, laptops and servers; and (iv) NVDIMMs21 - a kind of non-volatile RAM made by combining a Flash SSD and traditional DRAM in one module. Each class, in turn, is represented by a variety of devices differing in form factor, physical interface, software protocol, hardware architecture and firmware. As this work is dedicated to the optimal use of Flash storage for DBMSs on servers, we are interested only in SSDs. In this chapter we briefly describe the architecture, working principles, characteristics and diversity of modern SSDs. An SSD is a storage device which uses NAND Flash memory as a persistent medium, but internally hides most of its native behavioral characteristics in order to provide the standard block device access interface to the outside. In other words, an SSD creates an abstraction level over the Flash memory, emulating the behavior of a traditional HDD. Since this abstraction is completely transparent (hidden) to the device users (OS, FS, DBMS, etc.), SSDs are often referred to as black-box devices. The software layer (firmware) responsible for the abstraction is called the Flash Translation Layer (FTL), and it runs either on the on-device controller (most products) or on the host system as a device driver (very few products). The SSD's controller (Figure 2.7) might vary from a simple single-core controller (e.g., ARM) to multi-core controllers combined with FPGA or GPU units. To execute the code and to cache frequently accessed metadata, SSDs commonly have an SRAM module of small capacity (e.g., 128KB), while for caching other FTL metadata and buffering request data (read and write cache) the controller uses an on-device DRAM module (e.g., 1GB in enterprise SSDs). A common SSD nowadays has multiple memory chips (Figure 2.7), where each chip has either two or four NAND dies (dual- or quad-die NAND chips). Each die, in turn, has two or four separate planes of blocks, with one page buffer and one cache buffer register per plane. The fixed-size blocks typically contain 64, 128 or 256 Flash pages. A page is logically divided into the main area (e.g., 2-8KB), which is available to the host for storing user data, and a small (e.g., 64-256 bytes) out-of-band area (OOB), which is used only internally by the FTL algorithms. Groups of chips are connected to the SSD controller via a shared I/O bus (channel).

20 Embedded Multi-Media Controller (eMMC) Flash - Flash storage where the NAND Flash memory and the controller are integrated on the same silicon wafer.
21 NVDIMM - non-volatile dual in-line memory module.

[Figure 2.7 sketches the SSD architecture: the controller (ARM/FPGA) with SRAM and an attached DRAM module, connected to the host via SATA, PCIe, DIMM or M.2, and to groups of NAND chips via shared I/O buses; each chip contains dies, each die contains planes, and each plane contains blocks.]

Figure 2.7.: Simplified architecture of an SSD.

This architecture supports multiple levels of I/O parallelism. Different operations can be executed simultaneously on chips belonging to different channels. Further, each die can execute I/O operations independently, but must coordinate data transfers over the shared I/O bus with the other dies in its channel group (interleaving of operations). The smallest unit of parallelism is the plane. Operations can be executed on all planes of a die simultaneously, but those operations must be of the same type (e.g., a read operation performed on all four planes of a certain die at once), and the addresses of the accessed pages must conform to the requirements of multi-plane operations (e.g., have identical offsets within each plane). Altogether, this gives the SSD many possibilities to parallelize the execution of incoming I/O requests. For instance, if the SSD consists of 16 quad-die chips (64 dies in total), where each die has two planes, then the controller can theoretically execute as many as 128 operations simultaneously. In practice, however, the available I/O parallelism of SSDs is typically underutilized. The main reasons for this and the proposed solutions will be discussed in Chapters 5 and 6.
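To make the parallelism arithmetic explicit, the following minimal Python sketch (using the geometry values from the example above; they are illustrative, not tied to a particular product) derives the theoretical upper bound from the SSD geometry:

def theoretical_parallelism(num_chips: int, dies_per_chip: int, planes_per_die: int) -> int:
    # Upper bound on concurrently executing Flash operations:
    # in the best case, every plane of every die works on one operation.
    return num_chips * dies_per_chip * planes_per_die

# Example from the text: 16 quad-die chips, 2 planes per die.
print(theoretical_parallelism(16, 4, 2))   # 16 * 4 * 2 = 128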

Today's storage market offers SSDs in a variety of form factors, with a dozen or so possible bus interfaces (connectors), supporting multiple data transport standards and protocols. We summarize the most popular options in Tables 2.2 and 2.3. Because the choice of physical and logical interfaces used by SSD manufacturers is rather orthogonal to this research, and because a comprehensive analysis would exceed the scope of this work, we omit a detailed description of those options.

Bus interface (Connector) | SATA             | mSATA   | SAS          | PCIe: x1, x2, x4, x8, x16, x20 | SATAe
Form factor               | 2.5'', 3.5''     | PCB     | 2.5'', 3.5'' | PCIe cards                     | 2.5'', PCB
Transport*                | SATA 3G, SATA 6G | SATA 6G | SCSI         | PCIe 1.0, 2.0, 3.0             | PCIe, SATA 6G
Protocol / Driver         | AHCI             | AHCI    | SCSI         | AHCI, NVMe                     | AHCI, NVMe

Table 2.2.: Variety of SSDs on the storage market (Part 1).

FTL

The performance characteristics of SSDs are basically defined by three major aspects: the hardware architecture (e.g., Flash memory, controller, physical interface), the FTL, and the communication protocol (e.g., ATA, SCSI, NVMe). The role of the FTL in the overall SSD architecture is probably the most important, since it defines to which extent the performance potential of the underlying Flash memory can be utilized. Consider that there are only a few (about five) manufacturers of Flash memory chips and a dozen alternatives for hardware interfaces and protocols, but several hundred manufacturers of SSDs. Thus, the software layer - the FTL - is the key aspect that allows them to coexist and compete with each other. This is also the reason why SSD manufacturers never publish the details of their FTL and keep them as top-secret information. Both aspects - the importance of the FTL and the lack of information from industry - have forced academia to put a lot of effort into the research and development of FTLs over the past two decades, and even today this topic remains relevant and actively researched. The research community has proposed dozens of different FTL variants (often called FTL schemes). It is important to mention that none of the proposed FTL schemes can be seen as ultimately better than the others, and even a pair-wise comparison cannot clearly identify the generally better scheme of the two. There are several reasons for this. First, an FTL scheme is characterized by multiple criteria, such as performance numbers and predicted device longevity, memory and storage footprints, computational complexity and reliability. Those are usually mutually dependent, which means that optimizing one parameter (e.g., better longevity guarantees) leads to degradation



Bus interface (Connector) | m.2           | u.2              | DDR DIMM | FC
Form factor               | PCB           | 2.5'', 3.5''     | DIMM     | 2.5'', 3.5''
Transport*                | SATA 6G, PCIe | SATA, SAS, PCIe  | DDR3     | FC
Protocol / Driver         | AHCI, NVMe    | AHCI, SCSI, NVMe | AHCI     | FCP-SCSI, FC-NVMe

* Maximal bandwidth of the transport protocols: SATA 3G = 3 Gb/s; SATA 6G = 6 Gb/s; PCIe 1.0 = 250 MB/s per lane per direction; PCIe 2.0 = 500 MB/s per lane per direction; PCIe 3.0 = 1 GB/s per lane per direction; SAS-3 = 12 Gb/s.

Table 2.3.: Variety of SSDs on the storage market (Part 2).

of another (e.g., decreased performance and higher complexity). Since customers have different priorities and requirements for their storage, a certain FTL scheme might be the best choice for one system, while being less useful for another. The second reason is that the performance and longevity numbers of a certain FTL are highly workload-dependent. The read/write ratio of I/O requests, their frequency, size, locality and data skew are the typical workload parameters whose interplay makes one (or several) FTL scheme(s) preferable.

An FTL scheme consists of multiple algorithms (Figure 2.8), each of which covers a certain property of Flash memory. Address translation and garbage collection (GC) deal with the erase-before-overwrite property. Wear-leveling (WL) and bad-block management (BBM) address the wear property of Flash memory. Error detection and correction techniques (ECC) are needed due to the different types of errors present on Flash memory. Queuing and scheduling of requests are required to exploit the latency asymmetry and the available degree of parallelism. Below we provide a brief description of those FTL algorithms and their main variants.
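As a compact summary of this decomposition, the following small Python sketch (purely descriptive; the names are illustrative, not an actual FTL interface) maps each FTL module to the Flash property it addresses:

# Which Flash property each FTL module addresses (descriptive sketch only).
FTL_MODULES = {
    "address translation":        "erase-before-overwrite",
    "garbage collection (GC)":    "erase-before-overwrite",
    "wear-leveling (WL)":         "limited P/E cycles (wear)",
    "bad-block management (BBM)": "limited P/E cycles (wear)",
    "error correction (ECC)":     "bit errors (retention, program, read disturb)",
    "queuing/scheduling":         "latency asymmetry and I/O parallelism",
}
for module, prop in FTL_MODULES.items():
    print(f"{module:28s} -> {prop}")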


[Figure 2.8 depicts the main FTL tasks: queuing/scheduling/caching of requests; collecting and maintaining metadata and statistics; bad-block management (BBM); address translation (page-based, block-based, hybrid); error detection and correction (ECC, e.g., BCH); garbage collection (GC: active, passive); and wear-levelling (WL: static, dynamic).]

Figure 2.8.: Main tasks of FTL.

Address Translation and Garbage Collection

One of the main challenges FTL designers face is how to mask the erase-before-overwrite property of Flash memory, so that the SSD can be accessed through the traditional block-device interface and behave like an HDD. The naive approach would be the following: each time the SSD receives a write request that modifies previously written data, the FTL first transparently erases the corresponding Flash block (i.e., the block where the old data resides), and then writes the modified data in the original place. This would work perfectly if data were always updated in multiples of whole Flash blocks. For instance, if a block consists of 128 4KB Flash pages (512KB in total), this would mean that each update overwrites all 128 pages. While there are some application scenarios with such an update pattern (e.g., circular DBMS logs written in chunks equal to a Flash block), in general applications tend to update much smaller chunks of data. Common units of read and write I/Os are an OS page (e.g., 4KB), a DBMS page (e.g., 4KB-32KB) or a single disk sector (512B). Consider, for instance, the case where just one Flash page needs to be updated. Since the corresponding Flash block also contains many other pages (e.g., 128), the erase operation would destroy their content. Thus, before erasing the block, the FTL must read and temporarily store (e.g., in on-device DRAM) all of the block's pages except the modified one, then perform the erase operation, and afterwards write back those pages from the temporary store together with the modified page. Thereby, for a single page update the FTL would need to perform 127 read, one erase and 128 write operations, which would obviously introduce a huge overhead and make even the fastest Flash memory slower than the slowest HDD today in terms of write request latency.
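A back-of-the-envelope model makes this overhead concrete. The per-operation latencies below are illustrative assumptions only (the program latency lies within the 250-750µs range cited further below; the read and erase latencies are rough ballpark values):

PAGES_PER_BLOCK = 128
READ_US, PROGRAM_US, ERASE_US = 50, 500, 3500   # assumed latencies in microseconds

def naive_update_cost_us() -> int:
    reads  = PAGES_PER_BLOCK - 1   # save the 127 unmodified pages
    erases = 1                     # erase the whole block
    writes = PAGES_PER_BLOCK       # write back 127 saved pages plus the modified one
    return reads * READ_US + erases * ERASE_US + writes * PROGRAM_US

print(naive_update_cost_us() / 1000, "ms per single-page update")   # ~73.85 ms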

Apart from the latency overhead, the naive approach can easily lead to a significant issue with Flash wear. Typically every workload has a certain locality of requests, which means that some portions of the data are accessed and modified more frequently than others. Analysts often describe the access patterns of database workloads (especially OLTP workloads) with the Pareto 80/20 rule, meaning that 80% of all user requests touch only 20% of the data. With such a workload, under the naive approach the Flash blocks containing pages with hot data would be erased frequently, while the blocks with outdated or static data (cold blocks) might undergo only a few erases over the same period. This skew in erase counts would result in an uneven wear-out of the Flash memory, and soon the parts holding hot blocks might become invalid (due to burn-out), leading to premature failure of the whole SSD. Thus, both issues make the naive approach completely unsuitable for SSDs.

The common solution applied in all modern SSDs is to implement a variant of an out-of-place update strategy. Its basic idea is to decouple the way the host system (OS, FS, DBMS, etc.) organizes the data (logical data placement) from how the SSD does it (physical data placement). That is done via an address indirection level, which is typically realized as an address translation table. For every logical address (at a certain granularity) it stores the corresponding physical address pointing to the place where the data actually resides on Flash memory. When the host submits a write request to the SSD (e.g., write a 4K page at address 100), the FTL decides on its own (see later how) where to place the data on Flash memory (e.g., write the data to physical page 478), and consequently updates the corresponding mapping information (e.g., 100 -> 478). If the host modifies the same data later on, it is again written to a new location on Flash (e.g., 100 -> 1987). In other words, the data is written/updated out of its original place (out-of-place). To read data at a certain logical address, the FTL consults its translation table to obtain the physical address where the data is currently placed.

This address indirection has two main purposes. First, the FTL designers gain complete freedom regarding where and how to place the data on Flash memory, which, in turn, gives them the possibility to mitigate the performance and wear issues resulting from the erase-before-overwrite principle. Second, since address translation is completely invisible to the system outside the SSD, the host loses control over the physical data placement, and thus is also freed from the responsibility of dealing with the constraints of Flash memory, like the erase-before-overwrite principle and memory wear. Thus, address indirection is the key strategy of the FTL to mask all native behavioral characteristics of Flash memory and make the SSD behave like a traditional HDD.

By writing the data to a new location on Flash memory at every update, the previous version of the data in its original place becomes outdated, and thus invalid. To delete this data, the FTL must perform an erase on the corresponding Flash block. However, for

performance reasons the FTL typically tries to postpone those erase operations, and thus the invalidated versions of data items are kept on Flash memory for a certain time along with their up-to-date versions. To avoid running out of space due to the accumulation of outdated data, the FTL periodically runs a process called garbage collection (GC), which selects blocks containing invalidated data and erases them, while moving the valid data in those blocks to new locations. Finding victim blocks, performing multiple read/write operations to copy valid pages to new destinations and erasing the blocks requires significant computational, memory and especially I/O overhead, which makes GC the most resource- and time-consuming process of the FTL.

Since every die can perform only one operation at a time (e.g., one multi-plane read operation), the FTL must efficiently coordinate the usage of the Flash memory by executing on every available die either a GC operation or a read/write operation submitted by the host. When GC holds exclusive control over a certain part of the Flash memory, the incoming user requests to those dies are postponed. This interleaving of GC with the processing of user requests results in two significant issues all SSD manufacturers are faced with. First, it introduces an additional delay for user requests, and thus slows down the I/O throughput of the device. For instance, although the latency of a page program operation on Flash memory is typically in the range of 250µs to 750µs, a single write request can take as much as 80ms (Figure 2.9) and more (up to 680ms were measured in [16]) when it interferes with GC. Second, the amount of work performed by GC varies significantly depending on several factors, like the current load, the access patterns, the state of the SSD, etc. Thus, the interference with the user requests is also variable, which often results in significant fluctuations in the I/O throughput provided by the SSD.

The granularity of the address translation table, the algorithms responsible for finding a new location for modified data, as well as the GC algorithms defining which blocks to select for erasing and where to place the valid data contained in them, are the most significant and fundamental characteristics of all FTL schemes. There are basically three different groups of FTL schemes: page-level, block-level and hybrid FTL schemes. They are named after the granularity of the address translation table, because this parameter also influences the other algorithms of the FTL.

Page-Level FTL Scheme

In a page-level FTL scheme the address translation table has an entry for every Flash page of the underlying memory. The simplest implementation of such a table would be a one-dimensional array with a size equal to the number of Flash pages in the SSD. The offsets in the array indicate the immutable logical addresses of the pages (logical page numbers - LPNs), while the elements at these offsets store the current physical addresses of those pages (PPNs). Because the page is the smallest addressable unit in NAND

[Figure 2.9 plots the I/O latency in ms over the sequence of I/O requests for a run of ≈10000 x 4KB random read I/Os.]

Figure 2.9.: I/O bandwidth fluctuations of an enterprise SSD. Source: Petrov et al. [77].

Flash memory, having a slot in the address translation table for every page gives this FTL scheme the highest possible flexibility in placing the data. Consider the example in Figure

2.10, which assumes a simple implementation of the page-level FTL scheme. Here, two consecutive write requests modify the pages with LPNs 578 and 579. Each of those pages can basically be written to an arbitrary free location on the SSD. Typically, however, the FTL keeps track of only one current block on every die and appends the incoming pages to it. Once this block gets full, the FTL selects another block and continues. In our example, the first page is written to the current block 7, into the Flash page with PPN 1022. Then the FTL finds a new empty block (PBN = 1), which becomes the current one, and writes the second page into this block (PPN = 190). The mapping table is updated after every request, and the pages holding the original versions of the modified data are marked as invalid (the pages with PPNs 2026 and 356). When only a few empty blocks are left on the SSD, the FTL starts garbage collection. GC selects a victim block (e.g., the one with the largest number of invalid pages), copies its valid pages to the current block, and then erases the victim block and adds its address to the pool of free blocks. This process continues until the number of free blocks reaches a certain threshold. It is easy to see that in this FTL scheme the amount of work performed by GC is proportional to the number of valid pages in the victim blocks. For instance, for a block with 5 valid pages the garbage collector performs 5 copybacks and one erase operation, while for a block with 25 valid pages already 25 copybacks and one erase are needed. Thus, the key factor for reducing the overhead of GC, and thereby improving the performance characteristics of an SSD, is reducing the average number of valid

[Figure 2.10 illustrates the steps: (1) the requests write(LPN=578) and write(LPN=579) arrive; (2) each page is written to the current block, and if it becomes full a free block becomes the new current block (calling GC if needed); (3) the old physical pages (PPNs 2026 and 356) are invalidated; (4) the logical-to-physical address mapping (L2PAM) is updated (578 -> 1022, 579 -> 190).]

Figure 2.10.: Example of data placement in page-level FTL.

pages in the victim blocks chosen by GC. The common way to achieve this is a proper data placement strategy that considers the update frequencies of the data (the temperature of the data). Assume a simple example of storing two equally sized files on an SSD, where one is update-heavy (a hot file), while the other is relatively cold with only rare updates. If the pages of both files are evenly mixed over the physical address space, then the average victim block selected by GC would have half of its pages hot and the other half cold. With high probability the majority of the hot pages would be invalid, since the high update frequency of the hot file would most likely have produced newer versions of those pages elsewhere on the Flash memory. In contrast, the cold pages would rarely be invalidated by updates, and thus would be subject to the copyback operations performed by GC on this block. Copying half of the block's pages on average for every victim block would result in a significant GC overhead and low SSD performance. Consider now a data placement strategy that keeps hot pages physically separated from cold ones. In that case, GC would mostly select victim blocks from those with hot pages, since these would have a much smaller number of valid pages compared to blocks holding cold pages. With high probability GC would need to perform only a few copybacks on average. This would result in a lean GC with low I/O overhead, and consequently in a high-performance

SSD. This technique for reducing the GC overhead is known as hot-cold data separation, and it has been intensively researched in recent years (e.g., [89], [106]). The page-level mapping scheme is the most suitable for this strategy. The full associativity between logical and physical page addresses gives this FTL the highest possible flexibility during data placement, which is needed for efficient hot-cold data separation. However, this technique can easily become a double-edged sword: while it improves the performance characteristics of the SSD, it can be disadvantageous for the longevity of the device. In the previous example, the hot-cold data separation would lead to a situation in which mainly the Flash blocks containing hot pages would be erased by GC, while the blocks with cold pages would experience only a few P/E cycles. As a result, one part of the Flash memory would wear out much faster than the other. To avoid this situation the FTL applies different wear-leveling approaches.

While the flexibility of the page-level FTL allows optimal SSD performance to be achieved under various workloads, it comes at a price: the memory requirements. Due to the fine-grained mapping entries, the size of the address translation table becomes relatively large. For instance, for an SSD with 1TB storage capacity the complete mapping table would require about 1GB of memory (assuming 4KB Flash pages). Since the on-device DRAM is typically limited (e.g., a 1TB enterprise SSD with 512MB of on-device DRAM), and its major part is dedicated to the read/write caches, holding the whole mapping table in the SSD's DRAM is usually impossible (or at best impractical22). Several approaches have been proposed [28], [62] to overcome the issue of the limited amount of available memory. The basic idea behind them is to cache only a small part of the address translation table in DRAM, and to load missed mapping entries on demand from Flash (where the whole table is stored), thereby replacing other cached entries. However, those approaches introduce additional I/O and computational overheads. This makes them suitable only for workloads with certain properties (see more in Chapters 5 and 6).
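The core of the page-level scheme described above can be condensed into a short Python sketch. It is a simplification for illustration (in-memory lists instead of Flash commands, no wear-leveling, no over-provisioning) and not a real FTL implementation:

# Simplified page-level FTL: flat LPN->PPN array, append-style writes, greedy GC.
PAGES_PER_BLOCK, NUM_BLOCKS, INVALID = 128, 1024, -1

l2p = [INVALID] * (PAGES_PER_BLOCK * NUM_BLOCKS)   # LPN -> PPN mapping table
valid = {}                                         # PPN -> LPN for all valid pages
free_blocks = list(range(1, NUM_BLOCKS))
current_block, next_off = 0, 0

def write(lpn):
    """Out-of-place update: append to the current block and remap the LPN."""
    global current_block, next_off
    if next_off == PAGES_PER_BLOCK:                # current block is full
        current_block, next_off = free_blocks.pop(0), 0
    if l2p[lpn] != INVALID:
        valid.pop(l2p[lpn], None)                  # invalidate the old version
    ppn = current_block * PAGES_PER_BLOCK + next_off
    l2p[lpn], valid[ppn] = ppn, lpn
    next_off += 1

def read(lpn):
    return l2p[lpn]                                # current physical location of the page

def garbage_collect():
    """Greedy GC: reclaim the used block with the fewest valid pages."""
    used = {ppn // PAGES_PER_BLOCK for ppn in valid} - {current_block}
    if used:
        victim = min(used, key=lambda b: sum(p // PAGES_PER_BLOCK == b for p in valid))
        for ppn in [p for p in valid if p // PAGES_PER_BLOCK == victim]:
            write(valid[ppn])                      # copy valid pages to the current block
        free_blocks.append(victim)                 # erase victim, return it to the free pool

In such a structure, the hot-cold data separation discussed above would correspond to maintaining several current blocks (one per temperature class) instead of a single one.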

Block-Level FTL Scheme

As its name reveals, the block-level FTL scheme maintains an address translation table at the granularity of a single Flash block. Because every block consists of multiple pages (e.g., 128-512), the total size of the mapping information is small. Assuming a Flash block with 128 pages (4KB each), only 16MB of memory is needed to store the complete address translation table for a 1TB SSD. Such a small table can easily be cached completely in on-device DRAM (which makes this scheme especially suitable for portable or embedded

22 Enlarging the capacity of the SSD's DRAM increases the cost of the device and introduces some reliability issues.

Flash storage devices). However, while solving the problem on one side, it introduces a new one on the other. The block-grained translation table maps logical block addresses to their current physical locations. But how can the physical location of a single Flash page be found? This is done by keeping the pages within a block logically and physically contiguous, which allows logical page addresses to be transformed into physical addresses via a simple formula. Consider the example in Figure 2.11, where we assume a Flash block containing 128 pages. To find the physical location of the page with LPN 1234, the FTL first derives (i) the logical block number (LBN), by dividing the LPN by the number of pages in a block (LBN = 1234/128 = 9); and (ii) the offset of the page within the block, by taking the LPN modulo the number of pages per block (OFFSET = 1234%128 = 82). Then it consults the address translation table to get the current physical block address (LBN-9 -> PBN-389), and finally uses the same offset within the block (82) to get the physical page address (PPN = 389*128 + 82).

[Figure 2.11 illustrates the steps: (1) the request read(LPN=1234) arrives; (2) the FTL computes LBN = 1234/128 = 9 and offset = 1234%128 = 82 and looks up the logical-to-physical block address mapping (LBN 9 -> PBN 389); (3) the page at offset 82 in the block with PBN=389 is read.]

Figure 2.11.: Example of read operation in block-level FTL.
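The address arithmetic of the block-level scheme boils down to a few lines; the sketch below uses the values of the example (128 pages per block, LBN 9 mapped to PBN 389):

PAGES_PER_BLOCK = 128
block_map = {9: 389}                 # logical block number (LBN) -> physical block number (PBN)

def translate(lpn: int) -> int:
    lbn, offset = divmod(lpn, PAGES_PER_BLOCK)    # 1234 -> LBN 9, offset 82
    return block_map[lbn] * PAGES_PER_BLOCK + offset

print(translate(1234))               # 389 * 128 + 82 = 49874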

While this address translation introduces no overhead for read operations, it becomes really problematic when it comes to writes. Assume now that the SSD gets a write request to update the logical page with LPN 1234 (see Figure 2.12). Since the update is performed in an out-of-place manner, the FTL selects a new empty block (using a WL algorithm), where the modified version of the page will be stored (e.g., PBN = 2). However, the page cannot simply be placed at the beginning of the block; rather, it must be written at exactly the same relative location within the block as the original page to keep the page offset constant

(e.g., at offset 1234%128=82). Moreover, all remaining (unchanged) pages of the original block (PBN-389) must be copied to the new block as well, because the mapping of the logical block is updated (LBN-9 -> PBN-2) and the whole original block (PBN-389) is invalidated. Finally, the original block is erased and returned to the pool of free blocks. Thus, a simple page update requires in our example 127 copyback operations, 1 write operation and 1 erase. Basically, the performance of block-level mapping is identical to the naive approach described at the beginning of this chapter. The only difference between the two is that in the block-level FTL the address translation table allows an arbitrary free block to be used for placing the modified data, which, together with a proper wear-leveling strategy, solves the wear issue present in the naive approach. Note, however, that although the block-level FTL has a very high I/O overhead in the general case, under certain conditions it can become the best FTL scheme. For instance, for read-only data, for log-based data structures, or for data that is updated in chunks equal to or bigger than the size of a Flash block (e.g., separate tables storing large BLOB fields, like images, audio, video, etc.), the block-level FTL would introduce no overhead at all; it would perform the same as the page-level FTL (or even slightly better), but with a minimal memory footprint and a lean implementation.

[Figure 2.12 illustrates the steps: (1) the request write(LPN=1234) arrives; (2) the FTL computes LBN = 1234/128 = 9 and offset = 1234%128 = 82; (3) the new page and all remaining 127 pages of block 389 are written to the empty block 2, keeping the original page order; (4) the block mapping is updated (LBN 9 -> PBN 2); (5) block 389 is erased.]

Figure 2.12.: Example of write operation in block-level FTL.
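For completeness, the write path just described can be sketched in the same style (illustrative only; it merely tracks the mapping update and counts the Flash operations involved):

PAGES_PER_BLOCK = 128

def block_level_update(lpn, block_map, free_blocks):
    """Single-page update: rewrite the whole block, keeping page offsets constant."""
    lbn = lpn // PAGES_PER_BLOCK
    old_pbn, new_pbn = block_map[lbn], free_blocks.pop(0)
    block_map[lbn] = new_pbn                        # remap the LBN to the new block
    free_blocks.append(old_pbn)                     # the old block is erased and freed
    return {"copybacks": PAGES_PER_BLOCK - 1,       # 127 unchanged pages
            "writes": 1,                            # the modified page at its offset
            "erases": 1}                            # the old block

print(block_level_update(1234, {9: 389}, [2]))      # {'copybacks': 127, 'writes': 1, 'erases': 1}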

Summarizing the above, the page-level FTL is generally the most efficient FTL scheme, while simultaneously being the most memory-consuming; the block-level FTL is, in contrast, the

one with the highest I/O overhead, but it requires the least amount of memory. Practically, the disadvantages of each make both almost unsuitable for general-purpose SSDs. SSD manufacturers therefore typically apply some kind of hybrid FTL scheme.

Hybrid FTL Scheme

Although there are many variants of the hybrid FTL scheme [48], [52], [56], all of them share some common characteristics. These schemes virtually divide the physical address space into two segments (Figure 2.13). The larger segment (typically 85% or more of the available storage space) is called the data area, while the smaller segment (15% or less) is referred to as the log area. Although both areas consist of a constant number of blocks, they are dynamic in their composition, i.e., a block currently belonging to the data area might, after its erasure, be used by the log area. For each segment the FTL maintains a separate address translation table. Blocks belonging to the data area are referenced through a block-level translation table, while the log area uses a fine-grained page-level mapping table (hence the name hybrid FTL). Since only a small amount of physical space uses page-level mapping, the total size occupied by both address translation tables remains relatively small and can thus be completely cached in on-device DRAM. The log area represents an over-provisioning of the device, i.e., it is not available to the host system and is used only internally by the FTL. The host can utilize only the capacity of the data area.

[Figure 2.13 shows the two mapping tables of a hybrid FTL: (1) blocks in the data area are mapped using the block-level (logical-to-physical block address) mapping; (2) blocks in the log area are mapped using the page-level (logical-to-physical page address) mapping.]

Figure 2.13.: Address translation in hybrid FTL.

New data is written to empty blocks of the data area according to the rules of block-level mapping (i.e., within each block the pages are logically and physically contiguous). In contrast, write requests that update previously written data are first absorbed by the log area. How the modified data is written to the log area depends on the concrete implementation of the hybrid FTL. For instance, in the FAST [52] and FASTer [56] FTL schemes the pages of update requests are simply appended to the current block in the log area, just as in a pure page-level FTL scheme. At a certain point in time, the modified data stored in the log area is merged with the unmodified data from the data area. When the log area gets almost full, the GC selects a victim block from this area and merges it with the corresponding blocks in the data area. Assume that GC has chosen the victim block with PBN 478 from the log area. This block contains 90 valid pages (e.g., LPNs 466 and 788) and 38 invalid pages (the newer versions of which are stored in other log blocks). The invalid pages are simply ignored, but every valid page must be merged with the corresponding block in the data area. The merge operation is identical to a page update in the block-level FTL. Thus, the merge procedure for the page with LPN-466 might look like this (Figure 2.14; a short code sketch of the merge follows the list):

Figure 2.14.: Example of merge operation in hybrid FTL.

1. GC locates the data block that holds the original version of the page (LBN=466/128=3, PBN=78);

2. A new empty block is taken from the pool of free blocks (e.g., PBN=2);

3. All pages with relative offsets from 0 to 81 (LPNs: 384-465) are copied from the original data block (PBN=78) to the new data block (PBN=2);

4. The modified page (LPN=466) is copied from the log block (PBN=478) to the new data block (PBN=2) at offset 82 (offset = 466 mod 128 = 82);

5. Remaining pages with relative offsets 83-127 (LPNs: 467-511) are copied from original data block (PBN=78) to new data block (PBN=2);

6. Finally, GC erases the original data block, returns its address to a pool of free blocks, and updates the block-level address translation table.
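The following Python sketch reproduces these six merge steps for the example values (LPN 466, log block PBN 478, data block PBN 78, new block PBN 2); the copyback and erase functions are mere placeholders for the corresponding Flash operations:

PAGES_PER_BLOCK = 128

def copyback(src_pbn, dst_pbn, offsets):   # placeholder for on-device copyback operations
    pass

def erase(pbn):                            # placeholder for the Flash erase operation
    pass

def merge(lpn, log_pbn, block_map, free_blocks):
    lbn, off = divmod(lpn, PAGES_PER_BLOCK)              # 466 -> LBN 3, offset 82
    old_pbn = block_map[lbn]                             # (1) data block: LBN 3 -> PBN 78
    new_pbn = free_blocks.pop(0)                         # (2) take an empty block, e.g., PBN 2
    copyback(old_pbn, new_pbn, range(0, off))            # (3) offsets 0..81 from PBN 78
    copyback(log_pbn, new_pbn, [off])                    # (4) modified page from log block 478
    copyback(old_pbn, new_pbn, range(off + 1, PAGES_PER_BLOCK))   # (5) offsets 83..127
    erase(old_pbn)                                       # (6) erase PBN 78, ...
    free_blocks.append(old_pbn)                          #     ... return it to the free pool
    block_map[lbn] = new_pbn                             #     ... and update the block mapping

merge(466, log_pbn=478, block_map={3: 78}, free_blocks=[2])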

This procedure is then performed for the remaining 89 valid pages in the selected victim block, which is finally also erased and returned to the pool of free blocks. The log area therefore acts as a persistent buffer for incoming updates. The use of a page-level mapping table for this segment allows the expensive merge operations to be postponed for a certain time (e.g., until the log area gets full), so that the incoming write requests can be executed without delays during this interval. In other words, until garbage collection starts, the hybrid FTL performs identically to a pure page-level FTL scheme. However, when the "buffer" cannot absorb further update requests any more, GC must evict modified data from it to free space using merge operations. This is where the hybrid FTL starts behaving similarly to the block-level FTL, with significant overhead and performance issues. Nevertheless, in most cases the performance of the hybrid FTL is far better than that of the block-level FTL (although typically much worse than that of the page-level FTL). The reason is that buffering update requests in the log area is not a simple deferral of the erase and copyback operations that would be needed in a pure block-level FTL. Keeping data (especially hot data) in the log area for a certain time saves a certain amount of those I/O operations. Assume, for instance, that some hot page was kept in the log area for 10 minutes. If during this period it was updated 25 times, then at the time of merging it with the corresponding data block only its latest (25th) version is considered. Thus, for this page alone, the GC in a pure block-level FTL would need to perform approximately 24 times more erases and copybacks than the hybrid FTL does. Some additional details about the page-level, block-level, DFTL and FASTer FTL schemes, as well as a comparison of SSDs using those FTL schemes with the proposed NoFTL approach, will be covered in Chapters 5 and 6. A comprehensive overview of those and other popular FTL schemes can be found in [61].

Wear-Leveling and ECC

In order to hide the wear-out issue of Flash memory from the host system, the FTL uses three main approaches: wear-leveling (WL), bad-block management (BBM) and error-correction codes (ECC).

Wear-Leveling

As already mentioned, every Flash block wears out individually, depending on the number of P/E cycles it undergoes. A static mapping of logical to physical addresses would create a situation where some parts of the Flash memory (those storing frequently updated data) wear out fast and soon become invalid, while other regions experience only slow wear (with cold data) or even none (with static data). As a result, the whole SSD becomes inoperable. Thus, the task of WL is to make the SSD's Flash memory wear out evenly over the whole physical address space. A perfect WL would guarantee at every moment in time that all available Flash blocks have roughly the same number of P/E cycles performed on them. However, the practical value and efficiency of a particular WL algorithm is determined not only by its longevity guarantees, but also by its overhead and influence on the SSD's performance. Various WL algorithms introduce computational, memory and especially I/O overheads. The additional copyback and erase operations needed to keep block erasures evenly distributed over the whole Flash memory are performed either by the GC or by a separate WL process.

Different WL strategies can be grouped into two major categories: dynamic and static WL. Dynamic WL is generally simpler to implement and introduces a small overhead. The name "dynamic" indicates that this type of WL targets only those regions of Flash memory which store dynamic data, i.e., data that is frequently modified. The simplest implementation only influences the algorithm for selecting the next free block to write data to (either from incoming user requests or from GC): the FTL picks the block with the lowest number of P/E cycles from the pool of free blocks. If the updates are not skewed, but rather evenly distributed over the whole address space, dynamic WL provides strong longevity guarantees, while keeping the additional overhead negligible. However, such an access pattern is rather artificial and is rarely found in real-life applications. Typically, the write requests are very skewed, and different data entities (files, tables) or even parts of them have different update frequencies (e.g., static, cold, warm and hot data regions). In this scenario the effect of having a dynamic WL would be almost equal to having no WL at all, because regions of Flash memory holding static or cold data (e.g., up to 80% in OLTP-like workloads) would not be considered by the WL, thus leading to uneven wear-out of the Flash memory.

Static WL solves the above problem by considering the complete physical address space

regardless of the properties of the data stored there. These WL algorithms keep track of the wear-out picture of the whole SSD and try to keep the erase counts of all available blocks within a certain interval. If certain blocks have fewer erases than the lower bound of the current interval, the FTL picks those blocks as the next victim blocks for GC. Even if there are no invalidated pages in those blocks (so that there is actually nothing to garbage collect), GC copies the block's pages (most probably holding cold data) to another block, while the victim block, after its erasure, is used for storing new data (probably hot data). By doing so, the hot data (as well as the cold data) migrates through the complete address space, making the Flash memory wear out evenly. Clearly, depending on the workload, static WL might introduce a significant I/O overhead, which interferes with the user requests and thus slows down the SSD performance.
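A minimal sketch of the two policies (the erase counters and the threshold window are assumptions for illustration):

erase_count = {}      # PBN -> number of P/E cycles, maintained by the FTL

def pick_free_block_dynamic(free_blocks):
    """Dynamic WL: allocate the free block with the fewest P/E cycles."""
    return min(free_blocks, key=lambda pbn: erase_count.get(pbn, 0))

def static_wl_victims(all_blocks, window=50):
    """Static WL: blocks lagging the maximum erase count by more than `window`
    become the next GC victims, even if they contain no invalid pages."""
    top = max(erase_count.get(pbn, 0) for pbn in all_blocks)
    return [pbn for pbn in all_blocks if erase_count.get(pbn, 0) < top - window]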

Bad-Block Management Despite the even wear out of Flash memory ensured by proper WL, some Flash blocks can (and will) become invalid much earlier than their estimated lifespan defined in P/E cycles. The reason is typically small defects in FGTs made during the manufacturing process. Even the out-of-box Flash chips usually have some few invalid blocks. For instance, in OpenSSD boards we have used in our tests throughout this work, each Flash chip had about a dozen of invalid blocks (out of 4152) before the first SSD use. That is why all FTL schemes keep track of current invalid blocks, so that they are not repeatedly considered for storing new data. This is called a bad-block management, and it is typically implemented as a simple bit-vector with one bit per physical Flash block, which indicates its validity.
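The bad-block bit-vector mentioned above can be sketched in a few lines (the chip size of 4152 blocks is taken from the OpenSSD example; the helper names are hypothetical):

BLOCKS_PER_CHIP = 4152
bad = bytearray((BLOCKS_PER_CHIP + 7) // 8)        # one bit per physical block, all valid initially

def mark_bad(pbn):
    bad[pbn // 8] |= 1 << (pbn % 8)

def is_bad(pbn):
    return bool(bad[pbn // 8] & (1 << (pbn % 8)))

mark_bad(17)
assert is_bad(17) and not is_bad(18)               # block 17 is never used for new data again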

Error-Correction Codes

Flash memory is vulnerable to different types of errors, such as retention errors, program interference errors and read disturb errors (see Chapter 2.3.1). Since the frequency of their occurrence is directly correlated with the wear of the Flash memory, proper WL is very important for keeping the overall error rate low. However, since bit errors are inevitable, the FTL must provide a mechanism for their detection and correction. This is done via the implementation of a certain kind of error-correction code (ECC). Depending on the bit error rate (BER) common for a certain type of NAND Flash, and on the computational characteristics of the SSD controller, some ECC algorithms might be more applicable than others. Moreover, a particular ECC algorithm provides different guarantees (ECC strength) depending on its configuration parameters (e.g., the number of parity bits per fixed-size data chunk). The common ECC algorithms used for MLC and TLC Flash types are BCH and LDPC, with a code strength of approximately 64 parity bits per 2KB data chunk. Some additional details about BCH ECC are provided in Chapter 7, where we propose a modified algorithm to support in-place appends on Flash. A

comprehensive analysis of the ECC codes used in modern SSDs can be found in [15].
As we can see, the FTL is a complex software layer whose modules are highly interdependent and often have opposite influences on the characteristics of the SSD. Moreover, all the aforementioned FTL modules have multiple variants, each of which might be more efficient than the others under certain types of workloads. However, the SSD manufacturers typically do not know the details of the workloads under which their products will be used, and usually have only the limited hardware resources of the SSD on which the FTL runs. Thus, they try to play it safe and compose the FTL scheme such that it performs "good" under the majority of workload types. But as we show later in this thesis, this "good" is generally far away from the performance the SSD could provide. One of the key contributions of this thesis is a solution to that "one size fits all" problem, which we describe in Chapter 6.

3. NoFTL Approach

In this chapter we provide an end-to-end overview of the NoFTL approach for better orientation. All aspects and evaluation results are then discussed in detail in the four following chapters.

The Flash SSD is the main storage device for high-performance database systems today. Although the HDD is, and certainly will remain for a while, a part of the majority of storage systems, its role has changed irrevocably. HDDs are nowadays used primarily for storing cold and archived data, while SSDs are the first choice for operational data. Performance, price per IOPS, energy efficiency, robustness and a small form factor are the main reasons behind the success of SSDs. It took about 30 years for Flash technology to reach this state, and it is definitely not the endpoint of its development.

The "performance hunger" of data-intensive applications today often goes beyond the possibilities of even main-memory DBMSs, not to mention the I/O-bound systems dominating the world of big data. This demand forces both industry and academia to continuously look for faster storage. While the new generation of non-volatile memories (NVM) is being actively researched and developed, it will probably take a decade until it becomes economically attractive and replaces Flash. But even then Flash storage will not disappear, but rather take over the niche currently occupied by HDDs. Thus, the further development and improvement of Flash SSDs is a highly relevant topic today.

As we have seen earlier (see Chapter 2.3.2), the SSD is a complex device whose performance depends equally on hardware and software. In this work, we concentrate on the software side and propose an approach called NoFTL, which allows the performance (>2x) and longevity characteristics (>2x) of Flash SSDs used by database systems to be significantly improved. Those results are achieved by solving multiple issues and drawbacks present in almost all modern SSDs (see Chapter 1.2). While those problems originate at different levels of the I/O stack and differ in nature and impact, they all have a common root: the black-box architecture of SSDs and their backwards compatibility with the traditional block-device interface, which are both realized by the FTL (see Chapter 2.3.2). As the name of our approach reveals, we propose to remove the FTL from the I/O stack, together with the other abstraction layers between the storage device and the DBMS, such as the file system and the block-device layer. By doing so, we let the DBMS operate directly on

the raw Flash. All required Flash memory functionality that was previously encapsulated and hidden in the on-device or on-host FTL now becomes visible and the responsibility of the DBMS (Figure 3.1).

[Figure 3.1 compares three storage stacks for a DBMS: (a) 'cooked' storage - the DBMS storage manager on top of a file system, kernel block-device support and an FTL inside the Flash SSD; (b) RAW storage - the DBMS storage manager on top of kernel block-device support and the on-device FTL, accessed via the legacy block interface (READ(LBA), WRITE(LBA)); (c) NoFTL storage - the DBMS storage manager with raw Flash access via the native Flash interface (READ(PPN), WRITE(PPN), COPYBACK(PPN1, PPN2), ERASE(PBN), ...) directly on the Flash memory.]

Figure 3.1.: NoFTL as a storage alternative for DBMS. Source: Hardock et al. [34].

It is important to note that removing the intermediate layers from the I/O stack is not by itself the solution to the issues with SSDs; rather, it is a prerequisite for various optimizations to be realized. Without the FTL, FS and block-device layer, an application acquires full control over the Flash memory. However, unlimited control requires the proper knowledge and techniques to exercise it. That is why, in the NoFTL approach, we give this control over the Flash memory to the DBMS - the only application that has all the important knowledge about the data and the workload. Recently, a few similar approaches have been proposed that also try to reduce the disadvantages of the SSD's black-box architecture (see Chapter 9). However, they either provide only very limited control over the storage to the application and/or give this control to the OS or FS. As a result, due to the lack of control and/or the lack of information, they do not exploit the available performance of SSDs completely. Only the combination of full control and knowledge about the data and the workload makes this possible. Having both of those, the only missing part is the proper techniques. In this work we have implemented and evaluated several of them, thereby proving the effectiveness of

the concept by showing significant improvements over the traditional FTL-based SSDs. Further approaches and optimizations under the NoFTL architecture are outlined as future work in Chapter 11.

Native Flash Interface

The first important milestone in the realization of the NoFTL concept is the design and implementation of the native Flash interface. It is utilized by the DBMS to operate directly on an FTL-less SSD, i.e., without a block-device layer and file system in between. The traditional block-device interface and its infrastructure are unsuitable for operating on native Flash memory. The block-device layer manages only two basic I/O commands - the read and write of a chunk of data. While this is sufficient for HDDs, Flash memory requires additional commands. Thus, the operation on native Flash demands a third basic command - the erase of a Flash block. While it is possible to pass management commands from applications directly to the device driver (e.g., via ioctl), this solution needs additional synchronization between the queued I/O requests in the block-device layer and the bypassed erase commands1, which would introduce significant overhead.

However, the erase command is only the tip of the iceberg. The continuously growing importance and popularity of analytical requests, which process huge amounts of data, introduces significant issues for modern DBMSs. It gets especially challenging when OLAP and OLTP workloads are mixed and carried out by the same database systems and servers, which is often done today in order to reduce deployment costs and avoid synchronization complexity. The footprint of analytical queries on CPU, memory, storage and network has a significant negative impact on the transactional throughput. As a result, the system demands more hardware power, and the DBMS must do a good job of scheduling the requests and ensuring fair consumption of the available resources. But even in "pure" OLAP servers, growing data volumes and more stringent response-time requirements make storage and network the most limiting factors. On the other hand, modern Flash and NVM storage devices are equipped with powerful computational units, such as multi-core ARM controllers, FPGAs and GPUs. While currently this computational power is used solely to deal with the high I/O parallelism of Flash and to perform various management tasks (FTL), it also offers great potential for data processing. The approach of moving certain data processing tasks down to the

1 The command introduced in the ATA protocol, which might be submitted by the application, does not require synchronization with the I/O requests because it provides just a hint to the SSD about the invalidity of a certain range of LBAs.

storage (aka near-data processing (NDP)) is not new; it was intensively researched in academia in the past and has recently been rediscovered (see Chapter 9.4). However, despite the significant performance improvements shown so far, as well as the enormous potential of NDP on new generations of Flash and NVM devices, the industry still lags behind in the development of NDP-capable storage. Thus, to the best of our knowledge there is no device available on the market today that supports data processing functionality. It is, however, clear that NDP will be an essential part of the storage of the future, and the current industry activity in NDP research is proof of that.

One of the main obstacles to supporting NDP lies in the block-device interface and the impossibility of efficiently extending it with new data processing commands. Therefore, we propose the concept of the native Flash interface (NFI) - an open and extendable interface for direct communication between the application and FTL-less Flash storage. The NFI gives the application direct control over the underlying Flash memory, as well as the possibility to utilize the computational resources of the SSD. It is important to mention that, although the NFI is a significant and essential part of the NoFTL architecture, in this work we provide only a basic description of the interface. We have implemented a simplified version of it in our prototype, which was enough to prove the validity and potential of NoFTL. The extensive analysis and development of the NFI (or a native storage interface in general) is the topic of a separate, self-contained and substantive work (more details on the native storage interface can be found in [101]).
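To give a flavor of such an interface, the following stubs list the basic commands shown in Figure 3.1; the signatures are hypothetical illustrations, not the concrete NFI specification developed in this thesis or in [101]:

def nfi_read(ppn: int) -> bytes:
    """Read one physical Flash page."""
    raise NotImplementedError

def nfi_write(ppn: int, data: bytes) -> None:
    """Program one (previously erased) physical Flash page."""
    raise NotImplementedError

def nfi_copyback(src_ppn: int, dst_ppn: int) -> None:
    """Copy a page within the device without transferring it to the host."""
    raise NotImplementedError

def nfi_erase(pbn: int) -> None:
    """Erase one physical Flash block."""
    raise NotImplementedError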

DBMS Integration

The NFI enables the DBMS to operate on FTL-less SSDs. The next step in the realization of the NoFTL concept is the integration of the Flash management functionality into the DBMS. Removing the FTL (as well as the file system and the block-device layer) from the I/O stack requires the database system to take full responsibility for tasks such as wear-leveling, bad-block management, logical-to-physical address translation and garbage collection. The DBMS becomes the sole decision-maker for physical data placement on the Flash storage. It is important to mention that NoFTL is not about a simple shift of the FTL from the on-device controller to the DBMS. Rather, Flash management is integrated into the different subsystems of the database system (Figure 3.2), such as the buffer manager, the storage and free-space managers, and the transaction manager. From the very beginning of the NoFTL design, we were aware of, and always kept an eye on, the fact that this approach would introduce additional code and management overheads for the DBMS, and that it would consume extra memory and computational resources of the host system compared to the traditional storage architecture on FTL-based SSDs. However, our NoFTL prototype, realized in an open-source DBMS and

[Figure 3.2 shows the DBMS modules (buffer manager with flushers, storage manager, free-space manager, transaction manager with transactional atomicity and recovery) extended with the NoFTL Flash management components: address translation (page, block and hybrid mapping), garbage collection, wear-leveling, bad-block management, out-of-place updates, atomic writes and regions.]

Figure 3.2.: Integration of Flash management into the modules of the DBMS.

evaluated extensively on emulated and real Flash storage, has shown that some of those risks did not materialize, while the effects of the others are negligibly small and do not become an issue. On the other hand, the advantages of the NoFTL architecture are significant and valuable. There are several reasons for this.

• The integration of the Flash management functionality into the subsystems of the DBMS is generally "light" and "natural". In fact, a typical database system already maintains its own kind of address translation and garbage collection. Also, some database engines utilize out-of-place update strategies (e.g., log-based storage, the copy-on-write technique). Thus, the corresponding Flash management tasks are not newly added to the DBMS, but rather cause just a modification of existing modules and techniques.

• Giving the DBMS exclusive control over the data placement (in both the logical and the physical address space) allows the functional redundancy along the I/O path to be significantly reduced, which consequently lowers the memory footprint and CPU utilization. In the traditional stack, tasks like data placement, address mapping, free-space management, garbage collection, caching and recovery are performed redundantly by the DBMS, the FS and the FTL. Moreover, apart from the savings in resources, removing those redundancies has a clear advantage in reducing the write amplification on SSDs, which, in turn, improves the I/O throughput and longevity characteristics of the device.

• Through the elimination of the abstraction layers, valuable cross-layer optimizations become possible under NoFTL. On the one hand, the Flash management tasks integrated into the subsystems of the DBMS get access to the rich meta information and statistics maintained in the database system. This data is typically hidden from the FTL and cannot be accessed via the common block-device interface. In other words, modern SSDs know only the logical addresses of the pages to be stored, and have no clue whether those pages belong to a cold, static relation or to a frequently accessed, dynamic B-Tree index, etc. However, as we will show in our evaluations, utilizing this metadata allows garbage collection, address translation strategies and wear-leveling algorithms to be significantly optimized. On the other hand, under NoFTL the DBMS gets information about the physical resources of the SSD, i.e., the memory layout (boundaries of its parallel units, number of data channels, etc.) and the computational resources available for executing near-data processing commands. This information allows the database system, through the NFI, to perform physical data placement such that the available SSD parallelism is fully utilized, and to delegate certain data processing tasks to the device, thereby reducing the load on host resources (CPU, memory) and the network.

Configurable Storage

Although the evaluation results of our initial NoFTL prototype showed significant improvements (up to 3x) in system performance and device longevity, which were based on the optimizations enabled by the DBMS integration mentioned above, it became clear that this is not enough to realize the full potential of the Flash memory. As we have already mentioned (see Chapter 2.3.2), every Flash management task (i.e., garbage collection, wear-leveling and address translation) has multiple alternative realizations. Which one is best depends on the workload properties. Thus, a certain FTL scheme might perform best in terms of performance characteristics, memory and computational footprint, as well as stress on Flash wear under one workload, while being the worst choice for another. And the difference between this "best" and "worst" can be as much as an order of magnitude or even more (e.g., measured in I/O response time or write amplification). That is why the FTL designers are facing such a hard challenge. All SSDs today are shipped with a single, static FTL scheme. At the same time, the manufacturers neither know in advance the details of the workload on the client side, nor do they give the client an opportunity to change the FTL scheme. Furthermore, they do not provide any details about the FTL scheme used in their SSDs that would allow a client to choose the most appropriate product for their system. Thus, the FTL must be chosen to perform reasonably well for a variety of workloads. As a result, a generally good FTL scheme is

typically much worse than the optimal one for a particular workload. In contrast to this "one size fits all" approach of modern SSDs, NoFTL has the ability to select the most appropriate set of Flash management algorithms. Thus, for instance, based on its own metadata and statistics the DBMS can decide to use a page-level address translation (e.g., DFTL [28]) or a variant of hybrid mapping (e.g., FASTer [56]). Further, it can tune the parameters of the selected scheme to better match the current workload (e.g., vary the size of the cached mapping table in DFTL depending on the load and access pattern).

Databases consist of dozens or hundreds of different objects (i.e., relations and indexes), and each of those typically has unique properties. The size of an object (small, large, constant, dynamic), its access frequency (hot, warm, cold), its read-write ratio (read-only, read-intensive, write-intensive), the distribution of accesses (random, sequential, skewed) and the stability of the access pattern are the main characteristics of database objects. OLTP-like workloads usually contain the full range of those objects and access patterns. Thus, a single Flash management scheme applied to the whole database would be more suitable for one group of objects, less efficient for another group, and could be an outright poor choice for yet another part of the database. As a result, the more diverse a database and its workload are, the more overhead (memory, CPU, I/O) is caused by the Flash management. The NoFTL architecture offers a unique solution to this problem by allowing multiple Flash management schemes to be utilized simultaneously for a single SSD. This becomes possible due to the combination of two main factors: 1) the DBMS has full control over the Flash memory and Flash management; and 2) the DBMS has complete knowledge about the data and the workload running on it.

The challenge, however, is realizing these concepts. How can a certain FTL scheme be used only for a particular group of objects? How can objects be separated on Flash memory? Furthermore, the latter is closely related to the following questions: how to control the utilization of the available I/O parallelism of SSDs, and how to provide an efficient hot-cold data separation, which is crucial for the GC overhead, and thus for the performance and longevity characteristics of the device. To answer these questions we introduced two novel physical storage structures - regions and groups.

Regions. A region is a set of NAND chips which is constant in size, but dynamic in its composition, i.e., chips might change their owner region. One or multiple database objects are assigned to a region, which makes them physically stored on the corresponding chips. Thus, every object (or its declared partition) is assigned to exactly one region, while every region can store multiple objects. Based on the properties of the objects assigned to a region, the DBMS selects the most appropriate set of Flash management algorithms for that particular region. For instance, the database system might decide to put multiple frequently accessed, dynamic indexes together in one region and maintain it with the most

flexible page-level mapping scheme. Furthermore, the parameters of garbage collection and a variant of local (per-region) wear-leveling would be chosen appropriately. Yet another region can be dedicated to objects that are updated in an append manner or in large, logically sequential chunks. For this region, the block-level or the hybrid mapping scheme would be the perfect choice (Figure 3.3).

[Figure 3.3 sketches an example configuration with two regions of a Flash SSD: Region #1 (NUM_CHIPS=4, ADDR_MAPPING=FASTer, IPA_MODE=odd-MLC, ...) and Region #2 (NUM_CHIPS=2, ADDR_MAPPING=PAGE, IPA_MODE=pSLC, ...), each region mapping a set of database objects (tables and indexes) to its own set of NAND chips.]

Figure 3.3.: Example of a region configuration.
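To make the notion of a region configuration more tangible, the following sketch shows how such per-region parameters could be represented inside the storage manager. It is a minimal illustration with hypothetical names and types (the prototype's actual data structures are not shown here); the parameter values mirror those in Figure 3.3.

#include <stdint.h>

/* Hypothetical enumerations of the per-region choices discussed in this chapter. */
enum addr_mapping { MAPPING_PAGE, MAPPING_BLOCK, MAPPING_FASTER /* hybrid */ };
enum ipa_mode     { IPA_OFF, IPA_ODD_MLC, IPA_PSLC };

/* A region: constant in size (number of chips), dynamic in composition. */
struct region_config {
    uint8_t           num_chips;      /* NUM_CHIPS                                         */
    uint8_t           chip_ids[16];   /* physical chips currently owned by the region      */
    enum addr_mapping mapping;        /* ADDR_MAPPING                                      */
    enum ipa_mode     ipa;            /* IPA_MODE                                          */
    uint8_t           num_groups;     /* groups for finer hot/cold separation              */
    uint32_t          gc_threshold;   /* free-block watermark that triggers per-region GC  */
};

/* The two regions of Figure 3.3, expressed as a static configuration. */
static const struct region_config example_regions[] = {
    { .num_chips = 4, .chip_ids = {0, 1, 2, 3}, .mapping = MAPPING_FASTER,
      .ipa = IPA_ODD_MLC, .num_groups = 2, .gc_threshold = 8 },
    { .num_chips = 2, .chip_ids = {4, 5},       .mapping = MAPPING_PAGE,
      .ipa = IPA_PSLC,    .num_groups = 1, .gc_threshold = 8 },
};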

As we will see in Chapter 6, apart from choosing and configuring the general Flash management algorithms (address translation, GC, WL), regions allow us to select the appropriate

1. NAND mode for the corresponding chips (e.g., MLC Flash can be used in pseudo-SLC mode or in MLC mode);

2. update strategy (e.g., out-of-place updates or in-place appends, see Chapter 7); and

3. error-correction scheme.

Regions also serve two other important functions. First, they allow us to control the utilization of the available I/O parallelism of the Flash storage, and second, they can be used as a means to perform hot-cold data separation. The common approach applied in modern SSDs is to distribute the logical address space evenly over the whole Flash memory. Typically, a simple modulo function

Chip_Id = LPN mod Total_Num_Chips

is used to assign logical addresses to a certain chip (or die). This strategy has multiple advantages. It ensures very good load balancing, because all data (e.g., all database objects) is evenly distributed over the whole storage. Consequently, every object can theoretically utilize the maximum access parallelism. Moreover, such data placement can simplify the wear-leveling algorithms, due to the fact that data with different update frequencies is mixed in the Flash blocks. However, this simple approach also has multiple disadvantages, which usually dominate. Mixing data with different properties physically on the Flash memory inevitably leads to an increased overhead of garbage collection. As a result, the on-device write amplification rises significantly, which in turn negatively impacts the I/O response times. Apart from the slowdown in performance, this data placement strategy causes faster wear of the SSD. In contrast, regions enable us to physically separate database objects with different access characteristics, i.e., to perform a hot-cold data separation. Depending on the requirements of the objects the DBMS can decide on the proper number of chips (dies) composing a region. Cold objects, or hot but small objects that are normally completely cached in the DBMS buffer, do not require much parallelism, while frequently accessed large indexes and tables demand a high level of parallel access. The evaluation results presented in Chapter 6.1 show that intelligent data placement using regions allows us to utilize the available resources of the Flash memory much more efficiently than the traditional approach with perfect load balancing. It results in better performance and longevity characteristics of the device. Groups. Summarizing the above, a region is a physical storage structure which enables selective Flash management, hot-cold data separation and control over the Flash parallelism. However, it is not always possible (due to size restrictions), and typically not even reasonable (due to potentially low access parallelism), to place every database object in a separate region. Thus, multiple objects commonly share a region. While the DBMS tries to group objects with similar access properties into regions, every object has a unique access pattern. This motivated us to look for a possibility to perform finer hot-cold data separation within a region. We solved the problem with another physical storage structure - the group. Every region has one or multiple groups, and each database object within this region is assigned to a certain group. The idea behind groups is very simple - every Flash block contains pages from only one group. For instance, if there are five relations assigned to a certain region, and each of those relations is further assigned to a separate group, then every utilized Flash block in this region will contain data from only one relation. Whenever the GC selects a victim block for space reclamation, with very high probability it will contain pages with similar update frequency. This lowers the amount of work needed to copy the valid data within the block. As a result, the GC overhead is minimized, leading to lower response times and increasing the device longevity. More details about

63 groups, their implementation, influence on GC and WL, as well as evaluation results are presented in Chapter 6.2. Advisor and Migrator. While regions and groups are means for realizing the concept of configurable storage and solving the one size fits all issue of FTL-based SSDs, one question still remains - how to select an appropriate configuration for the database. This question is of particular relevance in systems where the workload is dynamic. Even the optimal data placement configuration will become potentially suboptimal once the workload changes. We address this issue by proposing two modules, Advisor and Migrator, as part of the general NoFTL concept. Advisor is responsible for suggesting multiple preferred data placement configurations based on the DBMS statistics and metadata. Advisor might be run on demand (e.g., in systems with relatively static workload characteristics) or automatically at a selected interval. Since finding an optimal NoFTL configuration is similar to the data placement problem in distributed systems, which is known to be NP-hard [8], the core of the Advisor is an algorithm based on heuristics. Research on these heuristics is a complex and vast research topic. To prove the concept of the Advisor module, we limited ourselves to a simple heuristic algorithm in this thesis. Despite its simplicity the Advisor was able to suggest multiple data placement configurations, which were significantly better than the naive strategy applied in SSDs. Research into more advanced heuristics is left as future research. For certain systems (e.g., databases with several dozens of objects and simple transac- tional profile) choosing an appropriate data placement configuration might be easily done manually based on the visualized statistics provided by the Advisor. However, for databases with complex workload and hundreds of objects the role of the Advisor in supporting the database administrator (DBA) becomes critical. In those cases the DBA would solely select one of several configurations proposed by the Advisor. It is worth noting, that the introduction of two novel storage structures does not overcom- plicate the work of the DBA. This is because regions and groups perfectly match with the existing structures utilized by the DBMS. Thus, regions are coupled to tablespaces, while groups are linked to extents. Operating on traditional HDDs, tablespaces and extents have a direct (raw storage) or indirect (cooked storage) impact on the physical data placement. Extents serve as a means to provide a trade-off between efficient space utilization and advantageous sequential accesses, while tablespaces allow the separation of different groups of objects for better logical and physical manageability. However, on FTL-based SSDs these are becoming purely logical storage structures, which do not have any impact on the physical data placement on the underlying Flash memory. This makes extents useless, and lessens the value of tablespaces. By coupling those structures to regions and groups under the NoFTL architecture we basically restore their power to influence and

control the physical placement of data. While the DBA, assisted by the Advisor, selects the data placement policy, these policies must also be executed. The Migrator executes and enforces the desired Flash management and data placement configurations. It is a module integrated into the DBMS that takes care of the transitions from one configuration to another. The key characteristic of the Migrator is its ability to perform configuration changes in parallel to the normal DBMS work, with minimal interference with the user load. The Migrator is also used by the global wear-leveling strategy to exchange chips between different regions in order to prolong the lifetime of the SSD. More details about the Advisor and the Migrator are presented in Chapter 8.
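As a rough illustration of the kind of simple heuristic the Advisor could use (the concrete algorithm of the prototype is discussed in Chapter 8 and is not reproduced here), the sketch below greedily splits objects by update temperature so that each region receives a comparable share of the write load. All names, fields and the scoring are assumptions made for this sketch only.

#include <stdlib.h>

struct db_object {
    const char *name;
    double      write_freq;   /* writes/sec from DBMS statistics */
    double      read_freq;    /* reads/sec from DBMS statistics  */
    int         region;       /* output: suggested region id     */
};

/* Sort objects by update temperature (hottest first). */
static int by_write_freq(const void *a, const void *b) {
    const struct db_object *x = a, *y = b;
    return (y->write_freq > x->write_freq) - (y->write_freq < x->write_freq);
}

/*
 * Greedy sketch: walk the objects from hot to cold and advance to the next
 * region once the current one has accumulated its share of the write load,
 * so that objects of similar temperature end up in the same region.
 */
void advise_placement(struct db_object *objs, int n, int num_regions) {
    double total = 0.0, acc = 0.0;
    qsort(objs, n, sizeof(objs[0]), by_write_freq);
    for (int i = 0; i < n; i++) total += objs[i].write_freq;
    for (int i = 0; i < n; i++) {
        int r = (int)((acc / (total + 1e-9)) * num_regions);
        if (r >= num_regions) r = num_regions - 1;
        objs[i].region = r;
        acc += objs[i].write_freq;
    }
}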

In-Place Appends

One of the optimizations that we have realized within the NoFTL architecture is the approach of in-place appends (IPA). The charm of IPA (apart from its results), and the reason why we dedicated a separate chapter at the end of this work solely to this approach (Chapter 7), is its ability to perfectly show the complete NoFTL concept in action. The basic idea of IPA is based on the fact that, although NAND Flash memory is well known to be addressable in Flash pages and to follow the erase-before-overwrite principle, under certain conditions it is possible to update an already written page physically in place without a preceding erase operation. For instance, when a write request does not change the previously written data on the page, but rather appends new data to the page's free space, then such an update is a good candidate to be performed in place. While in traditional FTL-based SSDs the utilization of this property is hard to realize and is associated with extra overhead, the NoFTL architecture makes the realization easy and efficient. Our analysis of typical OLTP workloads has shown that the lion's share of updates performed in those systems modify only a few, at most several dozen bytes of a tuple's data on a page (see Figure 3.4). The dominant operations on records performed by transactions are as simple as changing a single numeric field (e.g., increasing a customer's balance), updating a date or a status attribute. However, despite the small size of the changed net data, the DBMS submits the whole page (e.g., 4-16K) to the storage. If a file system is used on top of the SSD, then it typically amplifies the volume of written data further. Yet, as if this were not enough, the SSD's FTL contributes on its own to the final volume of data physically written on Flash memory. Thus, these three types of write-amplification along the I/O path turn 100 changed bytes into up to 60 thousand bytes written on Flash. This results in significant stress on the network and storage, and consequently negatively impacts system performance and the longevity characteristics of the Flash. The desire to modify Flash pages without needing to erase the block, and to avoid the huge write amplification of OLTP workloads, led us to propose the IPA approach.

[Figure 3.4 is a CDF plot: x-axis - total number of changed bytes in net data in a 4KB page (log scale); y-axis - percent of the total number of updates; one curve per DBMS buffer size (10%, 20%, 50%, 75%, 90%).]

Figure 3.4.: Cumulative distribution of update-sizes in TPC-B benchmark under default eager-eviction strategy with different sizes of DBMS buffer. Source: Hardock et al. [31]

Small changes to a database page are captured in a special delta record, which is then appended to the very same physical Flash page the original data is placed in. To ensure that the Flash page has enough space to accommodate a certain number of delta records, we initially reserve a small amount of space at the end of the page (e.g., 1-5%). The technique of transforming updates within a page into appends is not new, has many variations, and is widely applied by database systems (e.g., [71], [50]). However, the unique benefit of IPA is the application of this technique for modifying the original Flash pages without performing an erase operation on the corresponding Flash block. When using IPA we re-use the original Flash pages and save on multiple operations that are typically performed in combination with an update: (i) invalidation of the original Flash page; (ii) out-of-place write of the modified page; (iii) multiple page migrations and erase operations performed later by garbage collection to reclaim the space occupied by invalidated pages. As a result, applying IPA can reduce the write amplification and the number of erases on Flash under OLTP workloads by up to 80%. These savings, in turn, affect I/O response times (e.g., -50%) and the expected longevity of the SSD (e.g., +2x). Although the IPA approach can also be applied to traditional black-box SSDs, its realization under the NoFTL architecture is easier and more efficient. There are several reasons for this.


1. Using the extensibility of the native Flash interface we added a new command to support IPA, delta_write, which allows the DBMS to submit only small delta records to the SSD instead of whole pages (a sketch of such a delta record follows after this list). This helps to further reduce the write amplification and relieve the pressure on the storage bus. In contrast, the rigidity of a block device interface would not allow the introduction of a modified write command that can transfer data chunks of arbitrary size (e.g., 100 bytes) to the SSD.

2. Direct control over the physical data placement and integration of Flash management into the DBMS enable the DBMS to recognize if the page can be updated using IPA, and then decide on the type of delta record used depending on the object type and its modification pattern (e.g., offset-value pairs for relations, logical operations for B-Tree indexes, etc.).

3. Using the notion of regions we can enable and configure the application of IPA for every region independently, based on the objects stored in it. The decision regarding whether and which type of IPA should be utilized for any particular region is easily guided by suggestions of the Advisor module.
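To illustrate what such a delta record and the corresponding delta_write call might look like, here is a minimal sketch. The record layout (offset/length followed by the new bytes) and the function names are assumptions for illustration only; they are not the exact on-flash format or interface of the prototype.

#include <stdint.h>
#include <string.h>

/*
 * One delta record: a small patch appended into the reserved tail of the
 * very same physical Flash page that holds the original database page.
 */
struct ipa_delta {
    uint16_t page_offset;   /* where in the 4-16KB database page the change starts */
    uint16_t length;        /* number of changed bytes                             */
    uint8_t  payload[64];   /* the new bytes (small OLTP updates fit easily)       */
};

/* Stub standing in for the NFI command dispatch of the prototype. */
static int delta_write(uint32_t chip, uint32_t block, uint32_t page,
                       const void *delta, uint16_t delta_len) {
    (void)chip; (void)block; (void)page; (void)delta; (void)delta_len;
    return 0;   /* in the prototype this would issue the WRITE_DELTA command */
}

/* Capture a small in-page update as a delta and append it in place. */
int ipa_update(uint32_t chip, uint32_t block, uint32_t page,
               uint16_t offset, const void *new_bytes, uint16_t len) {
    struct ipa_delta d;
    if (len > sizeof(d.payload))
        return -1;   /* too large for IPA: fall back to a regular out-of-place write */
    d.page_offset = offset;
    d.length      = len;
    memcpy(d.payload, new_bytes, len);
    return delta_write(chip, block, page, &d,
                       (uint16_t)(sizeof(d) - sizeof(d.payload) + len));
}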

Summary

Native Flash interface, integration of Flash management functionality into the subsystems of the DBMS, as well as the concept of configurable Flash storage are the main elements building the core of the NoFTL architecture. The combination of them enables various op- timizations in different modules of the DBMS (buffer manager, storage manager, recovery, query optimizer and query processing engine). In this thesis we realized and evaluated multiple of those, resulting in significant performance speed-ups and improved longevity of the Flash storage. We want to emphasize that the set of optimizations described in the following chapters is neither complete nor exclusive. Our main goal was to provide the conceptual description of the NoFTL, design and implement its prototype, and prove its advantage over the traditional, FTL-based, black-box and backwards compatible design of modern SSDs. Thus, further improvements under the NoFTL architecture are possible, and some of them are outlined in Chapter 11. To give a first impression of the performance improvements achieved under the NoFTL architecture consider the following evaluation results. Figure 3.5 shows the transactional throughput for the tests performed on real Flash SSD2, where the DBMS was running the TPC-C benchmark [96] under an I/O-bound configuration. Figure 3.6 depicts write amplification and intensity of erase operations for the same tests.

2Jasmine OpenSSD platform, see Appendix A.1

[Figure 3.5 is a bar chart of the TPC-C throughput in tpmC (Tx. per min): Cooked SSD 110, Raw SSD 251, NoFTL basic 422, NoFTL regions+groups 675, NoFTL regions+groups+IPA 756.]

Figure 3.5.: Performance comparison of DBMS on different storage alternatives under TPC-C benchmark. Figure is also presented in Chapter 1.

The first two bars on the left in both figures show the results for the FTL-based SSD under the traditional storage alternatives - cooked storage (with ext4 in journaling mode) and raw storage (see Chapter 2.2). As one can see, the file system alone can almost triple the overall write amplification, which has a direct impact on the longevity of the Flash storage, as well as on the database performance, especially in I/O-bound systems. Youyou Lu et al. [59] have reported similar results for the ext2 and ext3 file systems running the TPC-C benchmark, and they have also shown that under certain workloads the FS write-amplification might be as high as 11, meaning that on average a single write I/O submitted by the DBMS is turned into eleven write requests performed by the file system. Yet, simply eliminating the file system and utilizing the raw storage alternative on the FTL-based SSD does not solve the problem. The backwards compatibility and black-box design of modern Flash SSDs cause significant overhead, and do not allow us to utilize the full potential of Flash memory. This becomes visible by comparing the bars Raw SSD and NoFTL basic. The latter corresponds to an earlier stage of our NoFTL prototype with only a few optimizations, which were realized through the integration of Flash management into the DBMS. But even this implementation further reduces (>2x) the system's write amplification and the intensity of erase operations. The next bar shows the effect of applying the concept of configurable Flash storage. Here, through intelligent data placement and selective Flash management achieved by applying regions and groups, the write amplification of the SSD can be reduced to 1.31.


In the last configuration, the GC overhead is further reduced by applying the IPA approach for certain database objects, which in turn positively impacts the device longevity and the transactional throughput. More details and analysis of these tests and further evaluation results are presented in the following chapters.

[Figure 3.6 plots, for the same five storage alternatives (Cooked SSD, Raw SSD, NoFTL basic, NoFTL regions+groups, NoFTL regions+groups+IPA), the write amplification and the number of GC erases per DBMS write under TPC-C.]

Figure 3.6.: Comparison of write amplification and intensity of erase operations for different storage alternatives under TPC-C benchmark.


4. Native Flash Interface

One cornerstone of the NoFTL architecture is the native Flash interface (NFI). Through the elimination of multiple abstraction layers along the I/O path (FTL, block-device interface and file system) the DBMS gets direct access to the Flash storage. Novel storage structures (Flash pages, blocks, dies, etc.), the behavior of Flash memory that differs from HDDs (the erase-before-overwrite principle), as well as access to the computational resources of the SSD (FPGA, GPU) also require a new communication interface with a corresponding set of commands. This interface differs from the traditional block device interface in the set of I/O operations, the request granularity, the addressing paradigm, extensibility and pushdown support.

4.1. Command Set

The block device interface relies on a basic set of SEEK, READ and WRITE commands. Protocols such as SATA additionally introduce the TRIM command. The native Flash interface defines READ, PROGRAM, ERASE, COPYBACK, WRITE_DELTA, GET_ADDR_TABLE, as well as a set of management commands and near-data processing (NDP) commands. The basic set of commands used for evaluation throughout this work is:

• PAGE_READ – reads a physical Flash page from a specified physical address and transfers it back to the host.

• PAGE_PROGRAM – transfers a page to the Flash device and writes it to a specified physical Flash address.

• BLOCK_ERASE – erases a block of Flash pages.

• COPYBACK – copies data within the SSD without involving a host. Basically, there are two variations of how this command is executed on SSD. In the first case, when the data needs to be moved within the same Flash plane, SSD uses the native NAND operation copyback. This uses only the plane’s page register as temporal storage (NAND page → plane’s page register → NAND page), and is the fastest way to

copy a NAND page to a new location. In the second case, when data is copied from one plane/die/chip to another, the SSD's DRAM must be used as additional temporary storage (NAND page → plane's page register → DRAM → plane's page register → NAND page). Despite a certain latency difference between these two variations, we will use the term COPYBACK command without further differentiation (if not clearly stated), generalizing a data move within the SSD without data transfer to/from the host. The typical use case for the COPYBACK command is page migrations performed by garbage collection (see Chapter 2.3.2). Erase and Copyback commands do not transfer any data to or from the storage device; rather, they issue the analogous commands on the corresponding Flash chip.

• GET_GEO – an identification command (similar to HDIO_GETGEO for HDDs), which allows the DBMS to receive detailed information about the architecture and geometry of the Flash device (channels, LUNs, Flash type, etc.)

• GET_ADDR_TABLE – retrieves a part of the latest consistent address mapping table and transfers it to the host (see Recovery in NoFTL in Chapter 5).

• WRITE_DELTA – performs an append to a specified physical page (see Chapter 7 for more detail).

Variants of the PAGE_READ and PAGE_PROGRAM commands that support reading or writing a series of pages (not necessarily physically adjacent) are also meaningful. Such variants are, for example, READ_CACHE_RANDOM, READ_CACHE_SEQUENTIAL, PROGRAM_CACHE_RANDOM and PROGRAM_CACHE_SEQUENTIAL. These commands are mapped onto appropriate optimized commands according to the Flash specification (e.g., ONFI NAND). Under NoFTL the database page size is a multiple of the physical Flash page size. The Flash page metadata is used by the DBMS and the Flash controller for storing Flash maintenance data (ECC, write count, LPN, etc.).
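For illustration, the command set above could be surfaced to the DBMS roughly as follows; the exact command encoding of the prototype and of the OpenSSD firmware is not reproduced here, so the structure and field names are assumptions.

#include <stdint.h>
#include <stddef.h>

/* Physical addressing used by all NFI commands (there are no stable LBAs). */
struct flash_addr {
    uint16_t chip;     /* LUN/chip */
    uint16_t plane;
    uint32_t block;
    uint32_t page;
};

enum nfi_opcode {
    NFI_PAGE_READ,
    NFI_PAGE_PROGRAM,
    NFI_BLOCK_ERASE,
    NFI_COPYBACK,
    NFI_WRITE_DELTA,
    NFI_GET_GEO,
    NFI_GET_ADDR_TABLE,
    /* cache variants for series of pages */
    NFI_READ_CACHE_RANDOM,
    NFI_READ_CACHE_SEQUENTIAL,
    NFI_PROGRAM_CACHE_RANDOM,
    NFI_PROGRAM_CACHE_SEQUENTIAL,
};

struct nfi_command {
    enum nfi_opcode   op;
    struct flash_addr addr;     /* source/target address                            */
    struct flash_addr dst;      /* only used by COPYBACK                            */
    void             *buf;      /* host buffer for READ/PROGRAM/WRITE_DELTA         */
    size_t            len;      /* transfer size; may be well below the page size   */
    uint8_t           oob[16];  /* per-page metadata (LPN, erase count, ...)        */
};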

4.2. User-Defined Commands

Another subset of operations of the native Flash interface are the so-called push-down commands, which aim to (i) leverage the computational power of the Flash storage device, and (ii) reduce the amount of data transferred to the host by performing data processing tasks (e.g., selection, join) and certain Flash management tasks on the Flash device itself. Near-data processing under the NoFTL architecture is at the moment actively researched

in our group. The detailed description of the push-down commands, their usage by the DBMS and the evaluation results are within the scope of another thesis.

4.3. Addressing and Granularity

One key principle of the block device interface is that it supports stable host/logical addresses – the so-called LBAs. These are defined once a device is connected and its geometry retrieved, and they remain immutable. Moreover, the granularity of the addressed units (so-called sectors) is uniform. All operations are performed on those stable LBAs and have uniform granularity. All NFI commands, in contrast, address Flash pages or blocks with their physical addresses. In a prototype we have demonstrated that byte granularity is also possible (see Chapter 7). This gives the DBMS full control over the physical data placement on the Flash storage. It is important to note, however, that although the physical addresses of the database units change continuously on the Flash memory (out-of-place updates, wear-leveling), their logical addresses always remain stable. This is achieved by integrating the logical-to-physical address indirection into the existing mapping tables of the storage manager (see Chapter 5). The native Flash interface also reveals to the DBMS the new structures describing the physical architecture of the Flash storage, such as chips, dies, planes, blocks, channels, on-device compute units (e.g. SoC ARM controllers, FPGAs, GPUs) and the on-device cache. How those are efficiently managed is explained in Chapter 6.


5. DBMS Integration

The second pillar of the NoFTL architecture is integration of Flash management into the DBMS. Native Flash interface gives the DBMS full control over the Flash storage underneath. However, in order to utilize this control efficiently and realize the performance potential of Flash, the DBMS must take the responsibility for multiple Flash management tasks. The four major tasks are logical-to-physical address translation, garbage collection, wear-leveling, and bad-block management. In traditional Flash SSDs these tasks are encapsulated in the FTL scheme, which is executed internally on the SSD and is completely transparent (hidden) for the host system. However, it is important to note, that the NoFTL architecture does not assume simply moving this functionality from the SSD’s controller to the DBMS, but rather a deep integration of these tasks into subsystems of the DBMS. This integration is based on two important observations:

• Implementation of Flash management in the DBMS will create a win-win situation for both parties. On the one hand, through access to the database system's metadata, Flash management tasks can be significantly optimized. On the other hand, by giving the DBMS control over Flash memory, various traditional algorithms of database systems can be improved and simplified.

• Integration of Flash management into the DBMS is relatively lightweight and does not overcomplicate the system. In most cases, the corresponding sub-systems of the DBMS already implement similar functionality themselves. Thus, the integration typically requires small changes to existing algorithms and data structures, and removes functional redundancy along the I/O path.

In this chapter, we look into the integration details of the main Flash management functionality, and at the end of the chapter we present a performance evaluation that shows the potential of this approach. The general integration of Flash management into different sub-systems of the DBMS is presented in Figure 5.1. Depending on the properties of a concrete database system the NoFTL integration might also vary. Thus, the type of storage and free-space managers, the concurrency control and the buffer realization are the main determinants.

[Figure 5.1 sketches the NoFTL architecture: the DBMS modules (buffer manager with its flushers, storage manager, free-space manager, transaction manager) incorporate the Flash management functions (address translation with page, block and hybrid mapping, out-of-place updates and atomic writes, garbage collection, wear-leveling, bad-block management, recovery, regions) and access the NAND chips over the native Flash interface (Read/Program Page, Copyback, Erase Block, Page Metadata) via the Flash channel (DIMM, PCIe, NAND low-level controller).]

Figure 5.1.: NoFTL architecture and integration of Flash management in the DBMS. Source: Hardock et al. [34].

It is important to note that although Figure 5.1 depicts a clear association of Flash management functions to database modules, it does not show a comprehensive picture of the integration. In practice, even a single Flash management task (e.g., address mapping, GC) is typically spread over (or influenced by) multiple DBMS modules. Thus, Figure 5.1 provides only a rough architecture, showing rather the conceptual place of residence for each functionality that was integrated.

5.1. Address Translation

The key component of all FTL schemes is the logical-to-physical address translation. Its realization determines to a large extent also the implementation of the garbage collection and wear-leveling algorithms. Thus, the performance characteristics of Flash SSDs are highly dependent on this functionality. The biggest challenge in the realization of address mapping for modern SSDs is the limited capacity of on-device DRAM, which is needed to cache mapping entries. In order to reduce the memory footprint, SSD designers typically use hybrid mapping schemes (e.g., FASTer [56]) or modified versions of page-level mapping (e.g., DFTL [28]). These schemes, however, cause significantly larger write amplification and negatively impact device longevity (see more in Chapter 2.3.2). Address translation is also an inevitable part of the NoFTL architecture. On the one hand, out-of-place updates, wear-leveling and garbage collection result in mutable physical addresses. On the other hand, the DBMS assumes and assigns immutable record IDs (RID)

or Tuple IDs (TID) for the lifetime of the data records. Thus, logical-to-physical address translation is essential. However, the fact that under NoFTL address translation becomes the responsibility of the DBMS alone solves many of the issues present in modern Flash SSDs. As the host system operates with much larger memory resources, it becomes possible to utilize the most effective and flexible mapping scheme - pure page-level mapping - for the whole Flash storage. For instance, for a 1TB Flash storage such a mapping table would require about 1GB of DRAM. While this amount of memory is hardly feasible for modern SSDs, it is typically not an issue for large data centers. However, as we will see in Chapter 6, NoFTL does not even require that amount of memory to deliver the best performance. In contrast to the common "one size fits all" strategy applied by traditional SSDs, where a single mapping scheme is used for the whole Flash storage, NoFTL follows the concept of configurable Flash storage. The latter allows multiple different address translation schemes to be supported simultaneously and applied selectively to the data that optimally matches each scheme. With this, the total memory consumption needed for address translation is significantly lower than it would be for pure page-level mapping applied to the whole Flash SSD. This is a clear example of how the DBMS's information about the data can be utilized to optimize Flash management tasks (see more in Chapter 6). Another advantage of address translation being integrated into the DBMS is the ability to leverage existing data structures. The majority of modern DBMSs already implement some kind of logical-to-physical address translation. For instance, different variants of Log-based Storage Managers (LbSM, also known as append-based storage), which are especially popular for use with Flash storage, contain a high-resolution mapping of logical page numbers to physical block addresses to ensure append I/O behavior. In these systems the integration of the address translation required for Flash management is basically a no-op at no additional memory cost: NoFTL can simply rely on the existing mapping table. Giving the DBMS control over the address translation, and thus over the physical data placement, has yet another advantage for such sub-systems as the storage and transaction managers, namely the atomicity of write requests.
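As a quick sanity check of the 1GB estimate above (the page size and the entry width are not stated explicitly here, so assume 4KB Flash pages and 4-byte physical page numbers):

1 TB / 4 KB = 2^28 pages, and 2^28 mapping entries x 4 bytes per entry = 1 GB of mapping table.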

5.2. Atomicity of Writes

Atomicity of write requests is very important for all relational DBMSs to guarantee physical data integrity. The general approach is based on redundancy (e.g., doublewrite buffer

in MySQL1). However, Flash memory itself has native support for write atomicity. The common out-of-place update strategy, designed to handle overwrites on Flash, allows the original, unmodified copy of a data page to be maintained on persistent storage, while the modified version of that page is written to a new location. However, the black-box architecture of modern SSDs hides all details of out-of-place updates from the DBMS and makes it impossible to make use of them. Therefore, running a DBMS on Flash SSDs in a traditional architecture would result in doubling the real data redundancy (as well as the corresponding overhead) without improving the physical data integrity guarantees. Both academia [46] and industry [74] have proposed solutions for utilizing the native redundancy of Flash SSDs in a DBMS. Kang et al. [46] offload the responsibility for providing atomic writes to the drive's FTL. This puts additional load on the limited on-device resources of Flash SSDs and makes the approach suitable only for small databases with a low level of concurrency. The approach used in [74] manages atomic writes comprising multiple data pages in an FTL that executes on the host system. This approach does not suffer from limited on-device resources, but due to the black-box design and legacy protocols the storage device cannot differentiate between different atomic writes performed simultaneously, and therefore it supports only one atomic write of a set of pages at a time. With the increasing level of transactional concurrency in modern DBMSs and the supported I/O parallelism of Flash SSDs this drawback becomes more prominent. NoFTL combines the advantages of both approaches, leveraging atomic writes, shadow paging, wear-leveling and physical-to-logical address mapping. It avoids resource limitations, while maintaining the complete information about atomic transaction write I/Os. In contrast to [46], such transaction information does not need to be passed down the I/O stack. The direct control of the out-of-place updates allows performing atomic writes with a high level of transactional concurrency.
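A minimal sketch of how out-of-place updates naturally yield atomic page writes once the DBMS owns the mapping. The helpers are a hypothetical storage-manager interface (declarations only, so the fragment compiles but is not a complete program); locking, error handling and multi-page atomicity are omitted.

#include <stdint.h>

/* Hypothetical helpers assumed to be provided by the NoFTL storage manager. */
uint64_t alloc_physical_page(int region);                   /* pick a free Flash page        */
int      nfi_page_program(uint64_t ppn, const void *page);  /* NFI PAGE_PROGRAM              */
void     mapping_set(uint64_t lpn, uint64_t ppn);           /* atomic in-memory update       */
void     mark_invalid(uint64_t old_ppn);                    /* old version becomes GC-able   */
uint64_t mapping_get(uint64_t lpn);

/*
 * Atomic page write: the old version stays intact on Flash until the new
 * version is fully programmed; only then is the mapping switched. A crash
 * before the switch leaves the old, consistent version visible.
 */
int atomic_page_write(uint64_t lpn, const void *page, int region) {
    uint64_t old_ppn = mapping_get(lpn);
    uint64_t new_ppn = alloc_physical_page(region);
    if (nfi_page_program(new_ppn, page) != 0)
        return -1;                 /* the old version is still the valid one        */
    mapping_set(lpn, new_ppn);     /* switch point: the new version becomes visible */
    mark_invalid(old_ppn);         /* reclaimed later by the region's GC            */
    return 0;
}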

5.3. Garbage Collection and Wear-Leveling

Every database management system executes on its own one or several garbage collection tasks. It might be a defragmentation job that coalesces free space on database pages; or a GC job in multi-version DBMSs to purge old, invisible versions of tuples; or garbage collection in log-based storage managers that reclaims space occupied by old block/page versions; or compaction jobs in storage engines based on LSM trees (e.g., LevelDB or RocksDB). Based on the current implementations of storage and free-space managers in the particular DBMS, the garbage collection as part of Flash management might be coupled to the existing cleaning tasks. 1https://dev.mysql.com/doc/refman/5.7/en/innodb-doublewrite-buffer.html

One advantage of integrating garbage collection and wear-leveling into the DBMS (as part of the free-space or storage managers) is the ability to provide proper scheduling of these jobs. GC and WL are the most resource- and time-consuming tasks of Flash management, and thus they are the major source of delay for incoming I/O requests. Consider the measurement of the I/O throughput of an enterprise-class SSD provided in Figure 2.9. The clearly visible periodic outliers are a good example of how GC and WL can influence the I/O performance of SSDs (see more about the GC overhead and its influence in Chapter 2.3.2). The DBMS has better knowledge about the current workload, which can be used to determine intervals when intensive background GC or WL jobs would not significantly interfere with the foreground I/O load. For instance, many database systems utilize some kind of synchronization points (e.g., checkpoints in traditional DBMSs, savepoints in SAP HANA2). The duration of these synchronizations is critical, as during this operation typically no modification of data is allowed. On the other hand, these intervals often correspond to the highest I/O load of the system (e.g., flushing of dirty data and metadata). Under NoFTL the DBMS can postpone the expensive GC and WL jobs during this synchronization work, and thus guarantee a stable and high I/O throughput of the Flash storage. Further examples of proper (region-specific) GC/WL scheduling are discussed in Chapter 6.1.2. The overhead of the GC depends on two main factors: the address translation scheme and the data placement. For instance, using a hybrid mapping scheme (e.g., FASTer) for hot data with a random update pattern would result in a large number of full merges performed by the GC, and thus significantly increase the write amplification (see more in Chapter 2.3.2). A similar effect on the GC overhead is caused by the mixture of hot and cold data in Flash blocks. NoFTL achieves a low overhead of garbage collection by optimizing both of these factors. With the concept of configurable Flash storage (see Chapter 6) the DBMS can cluster the data based on its properties into regions and groups, which ensures proper hot/cold data separation. Moreover, for each such region the DBMS selects the most appropriate set of Flash management algorithms (address translation, GC, WL), thus keeping the GC overhead at a minimum.
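The scheduling idea can be sketched as follows; the flags, thresholds and helper names are hypothetical (the prototype's actual, region-specific policy is discussed in Chapter 6.1.2).

#include <stdbool.h>

/* Assumed to be maintained by the DBMS and the per-region Flash management. */
extern volatile bool checkpoint_in_progress;    /* set by the checkpoint/recovery component */
extern int  free_blocks(int region);            /* current number of free Flash blocks      */
extern void gc_reclaim_one_block(int region);   /* one incremental GC step                  */

#define GC_SOFT_LIMIT 32   /* start background GC below this many free blocks          */
#define GC_HARD_LIMIT 4    /* must run GC even during a synchronization point below this */

/* Called periodically by the background worker of a region. */
void maybe_run_gc(int region) {
    int free = free_blocks(region);
    if (free >= GC_SOFT_LIMIT)
        return;                                   /* nothing to do */
    if (checkpoint_in_progress && free > GC_HARD_LIMIT)
        return;                                   /* postpone GC: the checkpoint flush gets the bandwidth */
    gc_reclaim_one_block(region);
}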

5.4. Parallelism

Many DBMS algorithms can be optimized based on the architecture of the underlying Flash SSDs. Direct control over the physical data placement allows us to efficiently exploit the native Flash parallelism. For instance, SATA2 allows for at most 32 concurrent I/O commands, whereas a commodity Flash SSD with 8 to 10 chips is able to execute up to 160 concurrent I/Os 2https://blogs.sap.com/2017/12/04/sap-hana-savepoint-mechanism-internal-stages/

(each chip runs 8 to 16 commands). NoFTL addresses this by creating regions with the desired level of I/O concurrency, varying the number of NAND chips and data channels in the region (see Chapter 6.1). Within a region the parallelism is utilized by applying appropriate striping and data placement techniques. For example, for database objects with highly skewed accesses, larger striping units reduce the garbage collection overhead. In contrast, hot, uniformly accessed objects might benefit from smaller striping units, due to the higher parallelism [88]. Another technique to utilize the available Flash parallelism is to incorporate the knowledge about the Flash architecture into the logic of the database writer processes (db-writers). The basic idea is to remove the contention for physical resources among db-writers. Instead of having multiple db-writers, where each is responsible for a subset of dirty pages from the whole address space, we have assigned each db-writer to a NoFTL region. Therefore, each db-writer receives a distinct subset of dirty pages that belongs to a certain region, and does not compete for physical storage with db-writers assigned to other regions. Depending on the workload and the size of the regions it is also possible to assign several db-writers to a single region.
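A sketch of this Flash-aware db-writer assignment; the queue and buffer-manager hooks are simplified and hypothetical, and shutdown handling is omitted.

#include <pthread.h>
#include <stdlib.h>

struct dirty_page_queue;                              /* per-region queue of dirty buffer pages */
extern struct dirty_page_queue *region_queue(int r);  /* assumed buffer-manager hook            */
extern void flush_next_page(struct dirty_page_queue *q, int region);  /* blocks while empty     */

struct writer_arg { int region; };

/* Each db-writer drains only the dirty pages of "its" region, so writers
 * never compete for the same Flash chips. */
static void *db_writer(void *p) {
    struct writer_arg *arg = p;                        /* ownership passes to the thread */
    struct dirty_page_queue *q = region_queue(arg->region);
    for (;;)
        flush_next_page(q, arg->region);
    return NULL;
}

void start_db_writers(int num_regions) {
    for (int r = 0; r < num_regions; r++) {
        pthread_t t;
        struct writer_arg *arg = malloc(sizeof(*arg));
        arg->region = r;
        pthread_create(&t, NULL, db_writer, arg);
    }
}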

5.5. Recovery

Multiple critical Flash-maintenance structures are memory resident under NoFTL: logical-to-physical address mapping tables, block erase counts and other wear-leveling statistics, bad-block tables, etc. How are these protected against loss in the event of a system crash? NoFTL guarantees complete recovery from system crashes at negligible cost and runtime overhead. Each Flash page consists of two parts: the data area for storing page data, and the out-of-band area (OOB) for FTL metadata, write counts and error-correction codes. The OOB area can be accessed in two different ways. First, it is always accessed as a part of the regular page access, i.e., whenever a page is read/programmed the OOB area is read/programmed accordingly. Second, the OOB area can be accessed independently, without touching the data area. For the latter type of operations controllers use a special addressing strategy, providing very fast low-level access. Under NoFTL a small part of the OOB area of each page (4-8 bytes) is reserved to store the logical page number (LPN), the block's erase count (only in the block's first page) and the page invalidation bit. These data are written to Flash together with the page data, piggybacking on the regular programming mode. Note that storing this information does not require any additional I/O and does not influence the programming latency. The space requirement of 4-8 bytes in each OOB is insignificant and will not reduce the size of the ECC, since there is usually enough space in the OOB reserved

for FTL-specific data. Also note that the process of invalidating an old page version on Flash is unchanged and atomic, i.e., it can never happen that a partial write of a new page version invalidates the old version of that page. In the event of a system crash, prior to initiating recovery, the DBMS under NoFTL issues a special mapping table recovery command GET_ADDR_TABLE to the Flash controller. The controller then performs a full scan of all the OOB areas in parallel using the special OOB access mode, i.e., all Flash dies are scanned in parallel, and within a die the OOB areas of all pages are read sequentially utilizing an optimized NAND command for serial reads. For further processing only a few bytes out of each OOB area (64-512 bytes) are retrieved. This data is then packed together into units the size of a Flash page by the controller; these are placed in the read cache of the device and subsequently transferred to the server (e.g. using DMA). Hence, the server can easily recover the mapping tables, wear-leveling statistics and bad-block tables. Depending on the mapping scheme used in a particular region (e.g. page-/block-level mapping, hybrid mapping) different Flash metadata might need to be recovered. In no case, however, are more than 12 bytes of a page's OOB required. In all our experiments the time to recover the NoFTL metadata for the 64GB drive was about 0.65 seconds, while the complete DBMS recovery cycle (Shore-MT uses ARIES recovery) took 100 to 200 seconds. Thus, NoFTL recovery adds less than 1% to the traditional DBMS recovery.
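The host-side part of this recovery can be sketched as follows; the record layout returned by GET_ADDR_TABLE and the helper names are assumptions made for illustration (the prototype stores only a few bytes per page in the OOB, as described above).

#include <stdint.h>
#include <stddef.h>

/* One per-page record as returned (packed) by GET_ADDR_TABLE. */
struct oob_record {
    uint32_t lpn;          /* logical page number stored in the page's OOB */
    uint32_t erase_count;  /* meaningful only for the first page of a block */
    uint8_t  invalidated;  /* page invalidation bit                         */
};

/*
 * Rebuild the in-memory mapping table and wear-leveling statistics from the
 * OOB scan. `mapping` has one slot per logical page, `erase_counts` one per
 * Flash block; `pages_per_block` is taken from GET_GEO.
 */
void recover_flash_metadata(const struct oob_record *recs, size_t num_pages,
                            uint64_t *mapping, uint32_t *erase_counts,
                            uint32_t pages_per_block) {
    for (size_t ppn = 0; ppn < num_pages; ppn++) {
        if (ppn % pages_per_block == 0)
            erase_counts[ppn / pages_per_block] = recs[ppn].erase_count;
        if (!recs[ppn].invalidated)
            mapping[recs[ppn].lpn] = ppn;   /* latest valid physical location of this LPN */
    }
}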

5.6. Evaluation Results

The following experiments investigate the performance of traditional FTL-based SSDs compared to the basic NoFTL approach. First, we analyze the overhead caused by a modern file system (ext4 with journaling enabled) in the traditional cooked storage architecture. Then, we eliminate the file system from the I/O stack in order to analyze in isolation the overhead caused by different FTL schemes used in common black-box SSDs. We compare these results to the basic NoFTL architecture, which does not utilize the concept of configurable Flash storage (discussed in detail in Chapter 6). Thus, the NoFTL scenario uses a simple single-region, single-group configuration with a pure page-level address translation table. Tables 5.1 and 5.2 show the performance results for the TPC-C (SF=400) and TPC-B (SF=5500) benchmarks, respectively. After the ramp-up phase (after which statistics were reset) both benchmarks ran for 2 hours, and in each scenario three tests were performed, based on which the aggregated values were calculated. All these tests were performed on a real Flash storage, the OpenSSD Jasmine board (see Appendix A.1). The relatively low values for transaction throughput across all scenarios are due to the limited I/O parallelism

of the OpenSSD and the I/O-bound workload.

TPC-C                       Cooked SSD     Raw SSD   NoFTL basic
Tpm-C (Tx. per min)                110         251           422
Tx. Completed                   29,940      67,577       113,151
DBMS Reads (16KB)              821,056   1,883,277     3,236,547
DBMS Writes (16KB)             215,491     489,834       795,378
File System Reads              840,302           -             -
File System Writes             917,153           -             -
GC Page Migrations           2,321,498   2,264,782     1,332,795
... per DBMS Write               10.77        4.62          1.68
GC Erases                       25,641      21,588        16,628
... per DBMS Write               0.119       0.044         0.021
Write Amplification              15.03        5.62          2.68

Table 5.1.: TPC-C benchmark under different storage alternatives on OpenSSD Jasmine board.

File System Overhead The I/O overhead caused by the ext4 file system becomes evident by comparing the numbers for the first two scenarios [Cooked SSD] and [RAW SSD] in Tables 5.1 and 5.2. With an ext4-based DBMS storage the garbage collector performs 2.7x (TPC-C) and 2x (TPC-B) more erases per DBMS write and approximately twice as many page migrations. The increase in GC overhead is caused by two main factors. First, the file system issues additional write I/Os (e.g. metadata, journaling, etc.), resulting in higher write-amplification. Thus, under TPC-C the ext4 write-amplification is about 4.2 (similar results, but for ext3, were reported in [59]). Second, the update pattern becomes more random, which causes the clear dominance of full merges over partial merges in the FASTer FTL. For instance, on Raw SSD storage the fraction of full merges for TPC-C is about 55%, while on ext4 cooked storage it rises to more than 90%.

NoFTL versus FTL-based SSD Basic NoFTL without regions (NoFTL basic) outperforms the FTL-based SSD used as raw storage (RAW SSD) delivering 68% higher transactional throughput under TPC-C, and more than 2x higher throughput under TPC-B. Moreover,



TPC-B                       Cooked SSD     Raw SSD   NoFTL basic
TPS (Tx. per sec)                 18.9        49.0         101.8
Tx. Completed                  135,813     352,918       733,055
DBMS Reads (16KB)            1,311,241     873,485     1,566,834
DBMS Writes (16KB)             159,388     351,919       757,797
File System Reads            1,324,520           -             -
File System Writes             678,365           -             -
GC Page Migrations           1,962,340   2,460,320       775,865
... per DBMS Write               12.31        6.99          1.02
GC Erases                       20,714      21,939        11,978
... per DBMS Write               0.130       0.062         0.016
Write Amplification              16.57        7.99          2.02

Table 5.2.: TPC-B benchmark under different storage alternatives on OpenSSD Jasmine board.

the garbage collection under NoFTL performs 2.75x fewer page migrations per host write, resulting in 2x lower write-amplification under TPC-C. For the TPC-B benchmark the reduction in GC overhead is even more drastic: 6.8x fewer page migrations and 4x lower write-amplification. The FTL-based SSD without a file system (RAW SSD) also performs 2.1x (TPC-C) and 3.9x (TPC-B) more erase operations per host write than NoFTL basic, which negatively impacts the longevity of the Flash SSD. The main cause of the I/O overhead of the FTL-based SSD (RAW SSD) is the full merges under FASTer, which are typical for all hybrid FTLs. In hybrid FTLs like FASTer the larger part of the Flash memory is mapped at block-level granularity (data block area), while only a small part (the log-block area) uses page-level address translation. Throughout our experiments the size of the log-block area was set to 10% of the total storage capacity. All updates and write requests are first performed in the log-block area, and as soon as the free space in that region runs out those updates are merged with the corresponding blocks in the data block area. The more random the workload, the higher the GC overhead and the write-amplification. Each full merge requires at least two erase operations and many page migrations. However, the I/O overhead in [RAW SSD] is only partly due to hybrid FTL schemes


like FASTer alone. FTL schemes based on page-level mapping, like DFTL [28], also have higher write-amplification compared to NoFTL. In DFTL additional I/O load is caused by the fetching and eviction of mapping pages to/from the mapping cache. We have recorded TPC-C and TPC-B traces in Shore-MT under NoFTL and replayed them on the DFTL simulator3 (the mapping cache size is 15% of all mappings). The results in Table 5.3 show that under NoFTL the GC performs about 3x fewer page migrations and 2.3x fewer erases. Moreover, NoFTL is free from the regular fetching and eviction of mapping pages. In NoFTL the complete mapping table (or tables in the case of regions) is loaded during the device initialization, whereas to write out modified mapping entries we piggyback on the regular page writes (see Chapter 5).

                                TPC-C                     TPC-B
                          DFTL       NoFTL          DFTL       NoFTL
Trace Reads                 15,609,311                 17,384,094
Trace Writes                 1,379,696                  1,334,973
Mapping Table Reads   2,123,957          -     2,725,963          -
Mapping Table Writes    850,258          -       918,259          -
GC Page Migrations    3,888,166  1,248,777     3,189,524  1,012,418
GC Erases                95,596     41,071        85,043     36,678
Write Amplification        4.43       1.91          4.08       1.76

Table 5.3.: DFTL-based SSD versus NoFTL under TPC-C/B benchmarks.

Buffer Manager The following experiments are designed to analyze the optimization of background writer processes of the DBMS buffer manager. The tests have been performed on the Flash emulator under TPC-C and TPC-B benchmarks varying the number of Flash chips 2-32 [34]. The average results for the NoFTL with the traditional strategy and the NoFTL with the Flash-aware strategy of db-writers are presented in Figures 5.2, 5.3. The Flash-aware strategy of db-writers improves the throughput by up to 1.5x under TPC-C benchmark. With an increasing amount of Flash parallelism and more db-writers leveraging the parallelism, the difference in the transactional throughput increases. Under the standard approach the response time for each single db-writer increases, due to the higher contention on Flash chips.

3http://csl.cse.psu.edu/a-simulator-for-various-ftl-schemes/



[Figure 5.2 plots the transactional throughput (Tx. per sec) against the number of Flash chips on the SSD (2, 4, 8, 16, 32), comparing the traditional approach with the Flash-aware db-writers.]

Figure 5.2.: NoFTL with traditional and Flash-aware strategies for background writes under TPC-C benchmark.

[Figure 5.3 plots the transactional throughput (Tx. per sec) against the number of Flash chips on the SSD (2, 4, 8, 16, 32), comparing the traditional approach with the Flash-aware db-writers.]

Figure 5.3.: NoFTL with traditional and Flash-aware strategies for background writes under TPC-B benchmark.



6. Configurable Flash storage

This chapter is dedicated to the third pillar of the NoFTL architecture - the concept of configurable storage. Its realization becomes possible only with the combination of the other two NoFTL components, the native Flash interface and integration of Flash management into the DBMS. By giving the DBMS a direct control over the Flash memory we can answer two important questions: (1) how to place data efficiently on raw Flash memory, and (2) how to manage Flash storage efficiently considering data placement. The emphasis on efficiency becomes obvious by looking at the solutions applied in modern Flash storage.

Data Placement on Modern SSDs In contrast to magnetic drives, Flash SSDs have a complex hierarchical memory organization, which comprises data channels, chips, dies, planes and blocks (see Chapter 2.3.1). Those units are responsible for different levels of I/O parallelism, which in turn give the SSD a very high performance potential. However, the black-box design of modern SSDs does not utilize this potential efficiently. Due to the lack of information about the data stored on the SSD, the FTL simply distributes all the data evenly across the whole physical address space. Although this naive data placement strategy achieves ideal load balancing over all memory units of the SSD, it simultaneously results in a huge I/O overhead produced by the garbage collection of the FTL. To illustrate this problem let us consider an example. The most common naive data placement strategy applied in SSDs is based on a simple even distribution of logical addresses. For instance, if the Flash SSD consists of 32 dies (16 dual-die chips), the formula to determine the target die for new data might be as simple as die index = logical page address % 32. Within the die the new page is then typically written into the current write block. Under this approach every logical address is always assigned to a specific physical Flash die. Within the die, however, its location is mutable and might be changed by garbage collection, wear-leveling or simply by a new update to the page. It is clear that under this strategy the whole data of the database (and even of the whole system, i.e., from various applications) is mixed physically on the Flash storage. Therefore, an average Flash block contains pages with different access properties. Frequently modified data (hot data) is mixed with cold, read-only data. Although the concrete "thermal image"

of an average Flash block strongly depends on the workload and the particular FTL scheme, we might assume that for write-intensive OLTP workloads the portion of cold data within a block is typically higher than that of hot data. This assumption can be reasoned as follows. The common 80/20 rule for OLTP workloads says that 80% of the requests touch 20% of the data, i.e. only 20% of the whole database is hot. A typical wear-leveling and garbage collection strategy of the FTL would ensure that this hot data is roughly evenly distributed over the whole address space. Assume that the Flash block depicted on the left in Figure 6.1 is selected by the garbage collection as a victim block (one with a large number of invalid pages) to be reclaimed next. Since pages containing hot data are modified frequently, with high probability they will dominate among the invalid pages. Pages containing cold data are rarely updated and thus mainly remain valid in such victim blocks. To reclaim this block the garbage collector must first copy all valid pages to a new empty block, which in our example results in 37 page migrations. Afterwards, the victim block is erased and returned to the pool of empty blocks. These page migrations make up the main part of the I/O overhead produced by the FTL, and thus they are one of the key factors determining the performance of the whole SSD.

[Figure 6.1 illustrates the reclamation of physical block PBN=345: (1) the GC selects the victim block, (2) copies its 37 valid pages to the current or an empty block, and (3) erases the victim block.]

Figure 6.1.: Example of reclaiming a victim block by garbage collection.
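The reclamation step illustrated in Figure 6.1 corresponds roughly to the following loop; the structures and helpers are hypothetical, and wear-leveling and error handling are omitted.

#include <stdbool.h>
#include <stdint.h>

#define PAGES_PER_BLOCK 128

struct flash_block {
    uint32_t pbn;                      /* physical block number             */
    bool     valid[PAGES_PER_BLOCK];   /* per-page validity                 */
    uint32_t lpn[PAGES_PER_BLOCK];     /* logical page stored in each page  */
};

extern void copy_page(uint32_t src_pbn, uint32_t src_page, uint32_t lpn); /* COPYBACK + mapping update */
extern void erase_block(uint32_t pbn);                                    /* BLOCK_ERASE               */

/* Reclaim one victim block: migrate the still-valid pages, then erase.
 * The fewer valid pages the victim contains (good hot/cold separation),
 * the cheaper this loop is. */
int reclaim_block(struct flash_block *victim) {
    int migrations = 0;
    for (uint32_t p = 0; p < PAGES_PER_BLOCK; p++) {
        if (victim->valid[p]) {
            copy_page(victim->pbn, p, victim->lpn[p]);
            migrations++;              /* 37 such migrations in the example of Figure 6.1 */
        }
    }
    erase_block(victim->pbn);          /* the block returns to the pool of empty blocks */
    return migrations;
}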

The main technique to reduce the GC overhead is known as hot/cold data separation. Its basic idea is to avoid physically mixing data with different update frequencies on the Flash storage. If we can ensure that the data in an average Flash block has roughly the same temperature (e.g., only pages storing hot data), then when such a block is selected as a victim by the GC it will, with high probability, contain only a few valid pages, and thus only a few page migrations would be needed in order to reclaim that block. In recent

years researchers have proposed multiple approaches attempting to improve the hot/cold data separation on Flash SSDs [89], [56], [106]. Although their results show a significant reduction of the GC overhead and thereby an improvement of an SSD's performance, their practical application in modern SSDs is often difficult. Some of those approaches (e.g., [89]) require the FTL to maintain additional statistics regarding the update frequencies of the stored data. However, the memory footprint and computational overhead of keeping those statistics are usually critical for the limited on-device resources of Flash SSDs. The efficiency of other approaches is typically highly workload dependent. One example of an explicit hot/cold data separation policy applied by an industrial product is the Multi-Stream Write SSD from Samsung [17], [81]. This SSD allows the application to provide a hint to the SSD regarding the temperature of the written data. Thus, by specifying a certain stream number in a write I/O an application can tell the FTL that this data has different access characteristics than data submitted to other streams. Although we cannot know the implementation details of the Multi-Stream SSD, nor were we able to perform experiments with this product, we assume that this approach might have several bottlenecks. One of them is its application in dynamic workloads, where the access properties of data change with time. Furthermore, although this approach addresses one important problem (efficient data placement through hot/cold data separation), it does not solve other problems of modern SSDs, which are caused by their black-box architecture. The main problem of the proposed approaches for hot/cold data separation is rooted in the fact that they are targeting traditional FTL-based SSDs. The black-box design of SSDs separates and hides the knowledge of the DBMS about data properties, which is required for efficient hot/cold data separation, from the control over the physical data placement, which is exercised exclusively by the FTL. Returning this control to the database system under NoFTL allows us to realize hot/cold data separation to its full extent and thereby minimize the overhead of garbage collection.

Flash Management on Modern SSDs As mentioned in Chapter 2.3.2, the FTL consists of multiple Flash management tasks, such as address translation, garbage collection, wear-leveling, error-correction and bad-block management. Since the performance of the whole SSD depends directly on the efficiency of the Flash management, research and industry have invested a large effort into the analysis and development of efficient algorithms for the FTL. This evolution has provided multiple implementation alternatives for each of the Flash management tasks. However, there is no ultimate winner. Each FTL scheme has its advantages and disadvantages in a concrete environment. This environment is a combination of the workload, the available hardware resources of the Flash SSD and the requirements on the I/O

characteristics of the storage. An overview of the available FTL schemes is provided in Chapter 2.3.2, while a research analysis of FTL performance under different workloads can be found in [88]. SSD manufacturers are facing a difficult problem when selecting an FTL scheme for their product. With no prior knowledge about the system and the workload in which their SSD will be used, they need to choose an FTL that would perform reasonably well under the whole range of workloads. The performance of such a generic FTL scheme under a certain workload is usually significantly lower than the performance of the optimal FTL for a particular case. That difference might easily go beyond an order of magnitude (see [88] for an FTL comparison under different workloads). To compensate for these performance gaps SSD designers typically increase the size of the SSD's write-cache (dedicating the lion's share of the on-device DRAM to it) and use larger values for over-provisioning (e.g., up to 40% in enterprise devices), which dampens the influence of the GC overhead on the throughput. However, as soon as these "buffers" become full (e.g., under workloads with long write-intensive phases) the SSD's performance decreases. The result is unstable performance, the need for large capacitors to protect the write-cache in case of power failure, and inefficient utilization of the available Flash memory due to the higher over-provisioning. The NoFTL architecture allows us to realize new approaches to efficient data placement and Flash management. Our solution concept, configurable storage, is based on two new storage structures: regions and groups.

6.1. NoFTL Regions

A region is a set of Flash chips that is constant in size but can change in its composition. Regions are the means to perform data placement with respect to hot/cold data separation, and they are also the means to apply selective Flash management. Regions are coupled to the existing DBMS logical structure of tablespaces (sometimes known as segments, e.g., in Oracle), which reduces the complexity overhead of the new structure to a minimum.

6.1.1. Data Placement with Regions

Every database object (e.g., a table, an index, or a partition of those) is assigned to one region, while every region might hold one or many database objects. Assigning an object to a region means that all data belonging to that object is stored within the Flash chips composing that region. The assignment of all database objects to the set of regions is called a configuration. Many large productive databases consist of hundreds or thousands of database objects. Assigning each of those to a separate "individual" region would not be efficient and

often not even feasible due to the alignment of regions to Flash chips (modern Flash SSDs have only about a dozen chips). Instead, database objects with similar access patterns are typically assigned to a single common region. Thus, regions allow us to physically separate data with different update temperatures from each other (see Figure 6.2). As a result, the overhead of garbage collection in each particular region, as well as the overall overhead, is reduced. Depending on the database and workload characteristics (e.g., the heterogeneity of database objects) this reduction can vary from a few percent to severalfold in terms of write-amplification, number of erases and I/O throughput (see Chapter 6.3).

[Figure 6.2 depicts a database whose objects are assigned to two regions of the Flash SSD: Region #1 (Table A, Index A1, Table C) and Region #2 (Table B, Index A2, Index C1). Each region occupies its own set of NAND chips, whose blocks contain valid and invalid pages.]

Figure 6.2.: Hot/cold data separation using NoFTL regions.

Another important function of regions is better utilization of the available I/O parallelism of Flash storage. Naive data placement, as applied by modern SSDs, where all data is distributed evenly over the whole physical address space, results in perfect load balancing among all independent units of Flash memory. Since every database object is uniformly striped over all available Flash chips, it can be accessed with the maximal level of parallelism. However, the inevitable disadvantage of naive data placement, the increased overhead of garbage collection, significantly outweighs the positive effect of load balancing. Let us now look at parallelism in the case of intelligent data placement by means of regions. At first sight, placing data objects within a certain set of Flash chips, which compose the corresponding region, limits the level of parallelism with which data might be accessed,

as compared to distributing those objects evenly over all available chips. However, besides the significant reduction of GC overhead achieved through proper hot/cold data separation, regions allow us to control the utilization of the available Flash parallelism. This is done on demand. Database objects that demand a higher level of I/O parallelism (hot objects) should be assigned to regions with a higher number of chips (see Figure 6.3). Cold objects, which are seldom accessed, might be assigned to regions with a number of chips derived from a tight estimate of the space required for those objects. Together, the reduced overhead of garbage collection, the reduced contention for physical resources (Flash chips) between different database objects, and the on-demand distribution of the available Flash parallelism result, for a particular region with a limited set of Flash chips, in significantly better I/O throughput than for the same set of database objects evenly distributed over the whole SSD (see Chapter 6.3).

[Figure 6.3 shows an I/O queue of the database workload in which Region #1 (Table A, Index A1, Index A2) clearly dominates the I/O load and is therefore assigned more NAND chips to ensure lower response times, while Region #2 (Table B) occupies fewer chips of the Flash SSD.]

Figure 6.3.: Control of SSD parallelism using NoFTL regions.

Another aspect of intelligent data placement is that within a region data can be distributed using different striping sizes. Depending on the properties of the database objects assigned to a certain region (e.g., temperature of objects, type of access pattern) the striping size might be chosen as the size of a Flash page, the size of a database page, or any other multiple of a Flash page. An evaluation of striping on Flash storage can be found in [88]. In brief, hot database objects with a rather random access pattern benefit from smaller stripe sizes, i.e., the size of a single Flash page. Objects with a skewed access pattern should rather use large striping sizes, as this lowers the overhead of garbage collection. More implementation details regarding the striping size follow in Chapter 6.1.2.
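To make the striping idea concrete, the following sketch shows how a region could translate a region-relative logical Flash page number into a target die and a die-local page offset for a given stripe size. The structure and names are illustrative assumptions, not code of the NoFTL prototype:

#include <cstdint>

// Illustrative sketch (hypothetical names): resolve a region-relative logical
// Flash page number to one die of the region and a die-local page offset,
// for a configurable stripe size given in Flash pages.
struct StripeTarget {
    uint32_t die;        // index within the region's set of dies
    uint64_t pageInDie;  // page offset local to that die
};

StripeTarget resolveStripe(uint64_t logicalPage, uint32_t diesInRegion,
                           uint32_t stripeSizePages) {
    const uint64_t stripeUnit = logicalPage / stripeSizePages;  // which stripe unit
    const uint64_t inStripe   = logicalPage % stripeSizePages;  // offset inside the unit
    StripeTarget t;
    t.die       = static_cast<uint32_t>(stripeUnit % diesInRegion);          // round-robin over dies
    t.pageInDie = (stripeUnit / diesInRegion) * stripeSizePages + inStripe;  // local page offset
    return t;
}

With a stripe size of one Flash page, consecutive logical pages are spread round-robin over all dies of the region, maximizing parallelism for random accesses; with a stripe size of several Flash pages, logically adjacent pages stay on the same die, which suits skewed and sequential patterns and reduces the GC overhead, as argued above.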

Details about how to select the proper data placement configuration, what statistical data to use for this, as well as how this process can be automated, are discussed in Chapters 6.1.2 and 8.

6.1.2. Region-Specific Flash Management

There are dozens of different Flash management schemes (FMS)1 proposed in recent years. However, our NoFTL prototype and experiments have shown that it is typically sufficient to choose among only a few of them to achieve better performance than the one-size-fits-all approach used by the FTL. For instance, in our NoFTL prototype we have integrated four different FMS into the DBMS: pure block-level, pure page-level, DFTL [28], and FASTer [56]. Thus, all major types of FMS are present (see Chapter 2.3.2), which allows us to cover the majority of different workloads. We realized that selecting a single Flash management scheme is not sufficient. Although every workload is characterized by its dominant objects and their access patterns, it is at the same time very diverse and consists of hundreds to thousands of different objects, which typically can be grouped by their access patterns into dozens of groups. Thus, considering the workload as a whole and choosing only one FMS based on its dominant characteristics is still suboptimal. Therefore, under the NoFTL architecture we propose to select the most appropriate FMS for each region individually. Thus, a NoFTL region becomes not only a means of performing data placement and controlling the available Flash parallelism, but also a unit for applying selective Flash management. Fortunately, the criteria for clustering objects and assigning them to regions are the same for both effective data placement and optimal Flash management. In both cases database objects are grouped based on their access characteristics, such as read/write ratio, locality of accesses, access frequency and access pattern (sequential, random, append-based), level of required parallelism, role in the workload (e.g., logs vs. background/lazy writes), etc. This information is typically either already available in the statistics and metadata of the DBMS, or can be easily collected. In the following, we describe the major aspects and algorithms of Flash management that might be configured individually for every region.

1We use the term Flash management scheme (FMS) because using the term FTL scheme under the NoFTL architecture would be incorrect. FMS is the general term, denoting a set of Flash management algorithms (address translation, garbage collection, wear-leveling, bad-block management), while an FTL scheme is a special case of an FMS. An FTL scheme assumes that the Flash management algorithms are implemented as an isolated (black-box), intermediate layer between Flash memory and the system/application. In contrast, an FMS used in the NoFTL architecture assumes deep integration of the same Flash management algorithms into the different subsystems of the DBMS (see more in Chapter 5).

• Address translation. The translation of logical addresses (LBAs) to physical addresses on Flash memory is the key task of every FMS. The address translation scheme is also decisive for other FMS algorithms, such as garbage collection and wear-leveling. As described in Chapter 2.3.2, there are basically three main types of address translation applied in Flash SSDs: page-level, block-level and hybrid. Each of those has, in turn, many variations, which under a certain workload might be more appropriate than others due to reduced GC overhead or memory footprint. NoFTL allows us to use the type of address translation that is most suitable for the objects assigned to a region. Thus, for instance, a region containing hot, write-intensive database objects with a rather random access pattern (e.g., B-Tree indexes) is typically a good candidate for pure page-level address mapping. Hot, write-intensive data characterized by a high level of access locality can use the mapping scheme proposed in the DFTL [28] approach, which provides the same level of placement flexibility as pure page-level mapping, but saves on the DRAM needed to cache mapping entries. Hybrid mapping schemes (such as FASTer [56]) are generally a good choice for database objects dominated by large sequential accesses (e.g., database objects storing LOBs, where every insert or modification causes multiple logically sequential pages to be updated). Especially interesting are the use cases for block-level mapping under NoFTL. Typically, pure block-level mapping is not even considered as a choice for modern Flash SSDs. The reason is that for random write accesses (which are unavoidable in almost any workload) it produces the highest garbage collection overhead. However, selective Flash management enables the use of block-level mapping in an advantageous manner, i.e., for regions where this issue with GC overhead does not come into play but the whole system can benefit from the smallest memory footprint needed to store address mappings. Typical examples for the application of block-level mappings are: (i) read-only and read-mostly data; (ii) data that is modified in an append-based manner (e.g., tables storing historical data, objects that are modified under the MVCC policy, objects in log-based storage engines, etc.); (iii) data that is modified in multiples of the Flash block size (e.g., tables storing LOB data with fixed-size records of multiples of the Flash block size). Even objects with a high rate of random modifications can be effectively maintained with block-level mapping if the total size of modified data per Flash page is small (e.g., data where only one numeric, enumerate or boolean attribute changes per record). In this case block-level mapping is applied in combination with the IPA approach (see Chapter 7).

Although, thanks to its access to the memory resources of the host system, NoFTL makes it possible to apply the most flexible pure page-level mapping to the whole Flash SSD,

in practice this is not necessary. Address translation applied selectively for each region makes it possible to achieve performance which is comparable to (and in some cases even better than) pure page-level mapping for the whole storage. At the same time, the overall memory consumption of the address mappings of all regions becomes comparable to the consumption of a single hybrid mapping scheme used in modern SSDs. This is achieved because regions with higher memory consumption (e.g., page-level mapping) are typically compensated by regions with a lower memory footprint (e.g., block-level mapping); a back-of-envelope estimate is sketched after this list. The selective approach allows us not only to choose the most appropriate address translation scheme per region, but also to configure the selected scheme optimally. Many translation schemes have configurable parameters that have a direct influence on their efficiency. For instance, the main parameters of the DFTL scheme are the sizes of the two cache buffers that determine the maximal number of mapping entries stored in DRAM. Depending on the size of the working set, the sizes of the two caches might have to be increased in order to avoid additional I/O operations needed to fetch mapping entries from Flash. Similarly, the level of request locality is important to correctly adjust the ratio between both caches. For hybrid mapping schemes, like FASTer, important parameters are the size of the over-provisioning (the size of the log-block area, which uses page-level mapping), as well as the size of the isolation area.

• Garbage collection and wear-leveling. Although garbage collection and wear-leveling are largely determined by the selection of the address translation, some behavioral aspects and parameters of those algorithms can still be adjusted based on specific properties of the database objects assigned to a region. For instance, garbage collection can be configured to be triggered only on demand (i.e., when the remaining free space falls below a threshold), or to work "proactively", where GC is triggered at constant time intervals or depending on the amount of write requests submitted to that region. The amount of work that GC should perform on every start is also configurable. For the wear-leveling strategy the most important configuration parameters are: (i) its type (static, dynamic or mixed); (ii) the area of responsibility (a single chip or the region as a whole); (iii) the type of execution (coupled to the GC function for selecting a victim block, or as a separate process). Furthermore, if wear-leveling is configured to work as a separate process it can be triggered in different ways. For instance, it can be started by a monitoring process when a predefined threshold for the difference between minimum and maximum erase counts is reached. NoFTL allows for another, interesting alternative for triggering wear-leveling and garbage collection. As NoFTL gives the database

system the whole responsibility for performing Flash management, the DBMS can choose the most appropriate time to trigger these functions (see Chapter 5). Thus, during heavy I/O loads regions might use a "light" version of wear-leveling, while during times when the intensity of the workload is minimal the DBMS can explicitly trigger global, static WL (see Chapter 8). A good candidate for such time-scheduled WL and GC is a database system where writes to storage are performed periodically. For instance, although productive systems typically run in a 24/7 mode, in many cases there are daily periods when the intensity of the workload becomes minimal, e.g., for online shopping platforms these would be the night hours if most accesses exhibit geographic locality. The potential issue of uneven wear between regions due to data separation is solved by utilizing one important property of NoFTL regions, namely, that the set of physical Flash units (e.g., dies) comprising a region is dynamic. Thus, special wear-leveling policies in NoFTL ensure even wear-out of Flash dies across all regions. The switch of dies from one region to another (i) does not require additional space reservation; (ii) is performed in the background, internally on-device; and (iii) is highly configurable by the DBMS, e.g., depending on the current I/O load it might be configured to trade transition speed against additional I/O overhead. More details on this global wear-leveling strategy are presented in Chapter 8.

• NAND mode. Nowadays, Flash memory comes with either single-level (SLC), multi-level (MLC) or triple-level cells (TLC), which can store 1, 2 or 3 bits per cell, respectively. While more bits per physical cell increase the capacity of the drive, the performance characteristics and longevity deteriorate significantly. Due to the increasing demand for high-volume Flash SSDs, there is a trend towards MLC or TLC Flash. It is noteworthy that MLC can be dynamically configured in a pseudo-SLC mode (pSLC), while TLC can be configured in both MLC and SLC modes. In these cases the performance and endurance characteristics are similar to real SLC (MLC), but the capacity is reduced accordingly. Current research focuses on maintaining a small partition of MLC or TLC in pseudo-SLC mode to accumulate pending update requests [40], [41], [103]. NoFTL utilizes the DBMS run-time information and statistics to directly control data placement, and builds upon the above property. Depending on the access patterns of different database objects, the DBMS under NoFTL might use the most appropriate NAND cell mode for the corresponding logical regions. The pSLC mode might be used for frequently accessed (hot) database objects, due to its best write/read performance. MLC or TLC modes are more appropriate for colder, less frequently accessed objects. Under NoFTL, transitions between the different modes can be performed dynamically.

• Update strategy. It is a well-known fact that Flash memory follows the so-called erase-before-overwrite principle, which requires an erase operation to be performed on a Flash block before pages within it can be overwritten. The common approach to avoid these erases on every single update, and thus reduce the overhead, is to utilize an out-of-place update strategy, where updated data is always written to a new, free location on Flash, while a separate garbage collection process takes care of reclaiming the space occupied by outdated versions of the data. However, it is a commonly ignored fact that under certain conditions Flash memory does support in-place modification of written pages without a preceding erase operation. The obscurity of this property is not surprising, since the traditional, black-box FTL-based design of SSDs makes the use of this approach almost impossible. In contrast, NoFTL allows the DBMS to utilize it effectively, and thereby gain a significant decrease in the garbage collection overhead, which further results in increased performance and longevity of the Flash storage. The detailed description of this approach - in-place appends (IPA) - is provided in Chapter 7. It is important that the DBMS can apply IPA selectively, i.e., only to regions that contain database objects with appropriate update behavior (the dominant part of updates changes only a small portion of a page’s data). Moreover, the choice of whether to apply IPA, and if so, which IPA mode, type and configuration would be appropriate, can be well automated via the IPA-advisor (see Chapter 8), thus introducing negligible additional overhead for the database administrator.
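To make the memory argument from the address translation item above concrete, a back-of-envelope estimate can be written down. The parameters used here (4 KB Flash pages, 4-byte mapping entries, 128 pages per Flash block) are illustrative assumptions, not figures taken from this work:

\[
M_{\text{page}} \approx \frac{C_{\text{region}}}{4\,\text{KB}} \cdot 4\,\text{B},
\qquad
M_{\text{block}} \approx \frac{C_{\text{region}}}{128 \cdot 4\,\text{KB}} \cdot 4\,\text{B} .
\]

Under these assumptions a 50 GB region would need roughly 50 MB for a pure page-level mapping table, but well below 1 MB for block-level mapping. Mixing page-level regions for hot objects with block-level regions for cold, append-only objects therefore keeps the total footprint of all regions close to that of a single hybrid scheme.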

Consider the example DDL statement shown below, which defines a region with certain configuration parameters. Here, the region named rgIndex comprises 8 NAND chips, which are used in MLC mode and managed with the page-level address translation scheme and a lazy variant of garbage collection. Furthermore, the region utilizes the IPA approach for B-Tree indexes (see Chapter 7.3.1).

CREATE REGION rgIndex ( NUM_CHIPS=8, NAND_MODE=MLC, ADDR_MAPPING=PAGE, GC=LAZY, IPA_MODE=odd-MLC, IPA_TYPE=IPA-IDX, IPA_CONF=2x270);

Regions can be naturally coupled to the established logical storage structure tablespace (Figure 6.4). Thus, by assigning a tablespace to a certain region, all database objects that belong to that tablespace are physically stored in the Flash chips of that region. In other words, regions give tablespaces the ability to control physical data placement and Flash management for the enclosed data.

[Figure 6.4 shows two region/tablespace pairs of a DBMS mapped to disjoint sets of NAND chips of the Flash SSD, defined as follows:]

CREATE REGION rgIndex (NUM_CHIPS=8, NAND_MODE=pSLC, ADDR_MAPPING=PAGE, GC=LAZY, IPA_MODE=pSLC, IPA_TYPE=IPA-IDX, IPA_CONF=2x270);
CREATE TABLESPACE tblSpaceIndex (EXTENT SIZE 128K, REGION=rgIndex);

CREATE REGION rgHistory (NUM_CHIPS=1, NAND_MODE=MLC, ADDR_MAPPING=BLOCK);
CREATE TABLESPACE tblSpaceHistory (EXTENT SIZE 128K, REGION=rgHistory);

Figure 6.4.: Selective Flash management with NoFTL Regions

6.2. NoFTL Groups

As we have already seen, regions are a means of (i) performing data placement, (ii) applying selective Flash management, and (iii) controlling the available I/O parallelism. Regions store data with similar access properties, which ensures proper hot/cold data separation and allows the most appropriate set of Flash management algorithms to be selected for each particular region. However, the database objects assigned to one region are typically not completely homogeneous and still differ in their properties. Especially in large systems with hundreds or thousands of database objects the diversity between the objects in a region might be significant. One of the most important properties for Flash management is the update frequency of data. The ability to physically separate data with different update temperatures is the key factor in minimizing the overhead of garbage collection and prolonging the lifetime of an SSD. Although regions help to separate database objects with different classes of access temperature (e.g., hot and cold objects), they are typically not suitable for providing more fine-grained separation. In order to further improve hot/cold data separation we introduce another storage structure - the group. A group is a mutually exclusive subset of the database objects within a certain region. The number of groups in each region is individual and varies from 1 to N, where N is the number of objects in that region. Data that belongs to different groups is not mixed in any

particular Flash block. In other words, every Flash block contains pages from only one group (see Figure 6.5). Note that, in contrast to regions, groups are not used for selective Flash management, i.e., all region-specific settings (e.g., address translation, garbage collection, wear-leveling, NAND mode, etc.) are identical for all groups within a region.

[Figure 6.5 refines Figure 6.2: within Region #1 (Table A, Index A1, Table C) the data is separated into Group #1 (Table A), Group #2 (Index A1) and Group #3 (Table C); within Region #2 (Table B, Index A2, Index C1) it is separated into Group #1 (Table B), Group #2 (Index A2) and Group #3 (Index C1). Every Flash block of a NAND chip holds valid and invalid pages of a single group only.]

Figure 6.5.: Hot/cold data separation using NoFTL groups

Thus, by clustering the database objects within one region into different groups based on their update frequencies, we ensure that every Flash block in that region holds data of roughly homogeneous temperature. As a result, the average number of valid pages within the victim blocks selected by GC is significantly lower than for blocks containing heterogeneous data. This, in turn, reduces write amplification (fewer page migrations) and slows down the wear-out of the Flash storage (fewer erase operations). The application of groups produces neither management nor computational overhead. The memory overhead is typically negligibly small. It is sufficient for each independent unit of Flash memory (chip/die) to keep track only of the currently written Flash block for every group. For instance, if we assume a Flash SSD with 32 dual-die chips, and that on average every region has 10 groups, then the overall memory footprint of applying groups would be less than 3KB of memory. A simple one-dimensional array with one element per group is a sufficient data structure to hold the information about the current Flash blocks of all groups, e.g., group identifiers are indexes in that array, and

block numbers are the elements. An example of DDL statements assigning database objects to a certain group in a region can look as follows:

CREATE REGION rgPage ( NUM_CHIPS=8, NAND_MODE=MLC, ADDR_MAPPING=PAGE, GC=LAZY, IPA_MODE=odd-MLC, IPA_TYPE=IPA-IDX, IPA_CONF=2x270);

CREATE TABLESPACE tblPage ( REGION=rgPage, EXTENT SIZE 128K);

CREATE TABLE A ( t_id NUMBER(3) ) TABLESPACE tblPage ( GROUP #1 );

CREATE INDEX A1 ON A( t_id ) TABLESPACE tblPage ( GROUP #2 );

There is no pre-assignment of Flash blocks to groups. Whenever a region receives a write request, it checks to which group the incoming data belongs. This is done via a quick lookup in a small hash table, which maps database objects to groups. If the current Flash block (an array access) in the corresponding chip/die2 has enough space, the data is written there. Otherwise, an empty Flash block is taken, and the corresponding element of the array is updated with the number of the new Flash block (a sketch of this write path is given after the extents discussion below). Blocks reclaimed by garbage collection can be used for data of any group in that region. As blocks can be re-used for any group, the wear-leveling strategy does not require any additional techniques to ensure even wear-out across one particular chip/die, or across a whole region. The idea of groups is quite similar to the established and common concept of extents applied in database systems. Extents are equally sized chunks of data (e.g., 8 database pages) with sequential logical addresses. For instance, database pages with logical page numbers 8 to 15 are stored in their logical order in one extent (e.g., extent number 47), while the next extent (#48) might contain the db-pages with numbers 800 - 807. A common practice is to use extents as the growth unit for database objects. Whenever a new db-page should be appended to a certain object, the DBMS checks whether there is free space in the last extent used for that object; if so, the page is written there, otherwise a new extent is allocated, e.g., 8 contiguous pages are reserved for that db-object. Thus, within every extent all pages belong to the same database object; however, the extents comprising one database object are not necessarily sequential. That means that each database object

2The target chip/die for new data is identified based on the data striping technique within the particular region.

can be seen as a set of extents (e.g., Table #1 consists of the extents with numbers 34, 56, 84, 232, ...). Extents are used by database systems for multiple purposes. Their primary goal is to improve the I/O performance of storage systems based on spinning drives. As the data within an extent is logically contiguous, it can be accessed with sequential I/Os, which can be up to 100x faster than random accesses to disk storage. For instance, assume a small database object that consists of 1000 extents, and each extent contains 8 db-pages. A scan operation over that object would in the worst case (extents are randomly distributed) result in 1000 random accesses, while without extents the number of random I/Os could be as high as 8000. However, this advantage of extents is often diminished and even completely nullified in modern systems. This is because of the software and hardware layers between the DBMS and the physical storage, which introduce additional address indirection. For instance, if the DBMS uses HDDs as secondary storage, file systems typically provide their own address translation, which might sometimes turn sequential I/Os (within extents) submitted by the DBMS into random accesses on disk. In cases when the storage is based on modern Flash SSDs, there is no advantage of extents with regard to the I/O pattern at all. The FTL and its out-of-place update strategy turn all incoming sequential accesses into random requests on Flash memory.
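The write path for groups sketched above (a hash lookup of the target group followed by an append into that group's currently open block on the target die) can be illustrated as follows. This is a simplified sketch under assumed names and structures, not the Shore-MT/NoFTL implementation:

#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Illustrative sketch only: per-die bookkeeping of the currently open Flash
// block for every group of a region, plus the object-to-group hash table.
struct DieState {
    std::vector<uint32_t> freeBlocks;    // blocks reclaimed by GC, usable by any group
    std::vector<uint32_t> openBlock;     // group id -> block currently being filled
    std::vector<uint32_t> nextFreePage;  // group id -> next free page in that block
};

class Region {
public:
    Region(uint32_t numDies, uint32_t numGroups, uint32_t pagesPerBlock)
        : dies_(numDies), pagesPerBlock_(pagesPerBlock) {
        for (auto& d : dies_) {
            d.openBlock.assign(numGroups, UINT32_MAX);
            d.nextFreePage.assign(numGroups, pagesPerBlock);  // force allocation on first write
        }
    }
    void mapObject(uint32_t objectId, uint32_t group) { objectToGroup_[objectId] = group; }
    void addFreeBlock(uint32_t die, uint32_t block)   { dies_[die].freeBlocks.push_back(block); }

    // Append one page of the given object on the given die; returns (block, page).
    // Assumes GC keeps the per-die free-block pool non-empty.
    std::pair<uint32_t, uint32_t> placeWrite(uint32_t objectId, uint32_t die) {
        const uint32_t group = objectToGroup_.at(objectId);  // quick hash lookup
        DieState& d = dies_[die];
        if (d.nextFreePage[group] == pagesPerBlock_) {        // open block is full
            d.openBlock[group] = d.freeBlocks.back();         // take an empty block
            d.freeBlocks.pop_back();
            d.nextFreePage[group] = 0;
        }
        return { d.openBlock[group], d.nextFreePage[group]++ };
    }

private:
    std::unordered_map<uint32_t, uint32_t> objectToGroup_;  // db-object id -> group id
    std::vector<DieState> dies_;                            // one entry per die in the region
    uint32_t pagesPerBlock_;
};

Because reclaimed blocks return to a per-die pool shared by all groups, no extra wear-leveling machinery is needed to even out the wear within a region, as noted above.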

6.3. Evaluation Results

The concept of configurable Flash storage based on the novel storage structures regions and groups was realized in the open-source database engine Shore-MT [42]. The experimental evaluation is performed on real Flash hardware (the OpenSSD Jasmine Flash research platform [94]) and on a data-driven Flash emulator [30]. The original implementation of Shore-MT organizes all database data in one file, and thus provides a single address space for all database objects. In order to introduce the concept of regions we had to implement support for tablespaces, which are common in multi-file DBMSs. This allows us to partition the logical address space and assign database objects to appropriate partitions. Consequently, we coupled the novel storage structure regions to tablespaces. The NoFTL architecture allows for flexible configuration of regions with the most appropriate Flash management algorithms. We integrated multiple popular variations of those algorithms into the storage and free space management subsystems of the DBMS. We provide address mapping with page and block granularity, as well as a hybrid one; multiple garbage collection and wear-leveling algorithms enable us to study their effects (see more in Chapter 5).

We extend the results presented in Chapter 5.6 in Tables 5.1 and 5.2 with new experiments for configurable Flash storage. Thus, Tables 6.1 and 6.2 now present the results for the TPC-C and TPC-B benchmarks tested in three different scenarios: (i) [NoFTL basic] – NoFTL storage with a single region; (ii) [NoFTL regions groups] – a configuration with multiple regions and groups for controlling data placement and Flash management; and (iii) [NoFTL regions groups IPA] – which enhances the previous configuration with the use of the IPA approach for certain regions. All these tests are performed on real Flash memory - the OpenSSD Jasmine board (see Chapter A.1).

TPC-C                    NoFTL basic    NoFTL regions groups    NoFTL regions groups IPA
TPS                      15.6           25.1                    25.1
Tpm-C (Tx. per min)      422            675                     756
Tx. Completed            113,151        181,190                 203,476
DBMS Reads (16KB)        3,236,547      5,189,878               5,937,220
DBMS Writes (16KB)       795,378        1,270,529               1,444,591
GC Page Migrations       1,332,795      397,776                 219,970
… per DBMS Write         1.68           0.31                    0.15
GC Erases                16,628         13,036                  6,909
… per DBMS Write         0.021          0.010                   0.005
Write Amplification      2.68           1.31                    1.15

Table 6.1.: TPC-C benchmark under three different NoFTL configurations on OpenSSD Jasmine board.
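The write-amplification rows in Tables 6.1 and 6.2 follow directly from the reported counters, assuming write amplification is measured as physical page writes per DBMS write:

\[
\text{WA} = \frac{\text{DBMS Writes} + \text{GC Page Migrations}}{\text{DBMS Writes}},
\qquad \text{e.g.}\quad
\frac{1{,}444{,}591 + 219{,}970}{1{,}444{,}591} \approx 1.15
\]

for the [NoFTL regions groups IPA] column of Table 6.1.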

Regions & Groups, Data Placement and Parallelism In this chapter we investigate the performance impact of the novel storage structures regions and groups. We focus on the question: would an even distribution of all data over all Flash chips (regardless of the data properties) not be better in terms of I/O parallelism and wear-out? Based on the experimental results we show in this chapter that, in most cases, the use of regions significantly increases the I/O concurrency and almost doubles the longevity of the Flash SSD. We compare the basic NoFTL scenario with a single region (columns [NoFTL basic] in



TPC-B                    NoFTL basic    NoFTL regions groups    NoFTL regions groups IPA
TPS (Tx. per sec)        101.8          124.8                   151.4
Tx. Completed            733,055        898,243                 1,090,467
DBMS Reads (16KB)        1,566,834      1,919,821               2,476,569
DBMS Writes (16KB)       757,797        927,590                 1,085,280
GC Page Migrations       775,865        259,840                 137,715
… per DBMS Write         1.02           0.28                    0.13
GC Erases                11,978         9,271                   4,635
… per DBMS Write         0.016          0.010                   0.004
Write Amplification      2.0            1.3                     1.1

Table 6.2.: TPC-B benchmark under three different NoFTL configurations on OpenSSD Jasmine board.

Tables 6.1, 6.2, and 6.3) to a scenario involving data placement according to multi-region, multi-group configurations (columns [NoFTL regions groups]). In both of these scenarios we do not apply region-specific Flash management, i.e., all regions utilize the same Flash management scheme (e.g., page-level address translation). Thus, we can evaluate the effect of smart data placement based on applying the two main functions of regions and groups - hot/cold data separation and control of I/O parallelism. The evaluation of selective Flash management follows later in this chapter. The columns [NoFTL basic] and [NoFTL regions groups] in Tables 6.1 and 6.2 show the corresponding performance results under TPC-C and TPC-B on OpenSSD, while Table 6.3 contains the results for TPC-C on the Flash emulator. On OpenSSD the use of regions and groups improves the transactional throughput by 60% for TPC-C, and by 22% for TPC-B. As the OpenSSD Jasmine board supports only very limited I/O parallelism, the achieved performance improvement is primarily the result of hot/cold data separation. By clustering database objects with similar access properties the overhead of garbage collection is reduced significantly. Thus, the average number of page migrations per host write is reduced 5x under TPC-C, and 3.6x under TPC-B. The number of erase operations per host write, as well as the overall write-amplification, are decreased 2x under TPC-C, and by


about 37% under the TPC-B benchmark. The columns [NoFTL regions groups IPA] in Tables 6.1 and 6.2 show the results under the same multi-region, multi-group configurations as used for the tests in [NoFTL regions groups], but with one region additionally utilizing IPA. In TPC-C, IPA was enabled for the region containing the Stock table, while in TPC-B it was the region with the Account table. IPA allows updating the original Flash page without a preceding erase operation. Consequently, the number of page invalidations decreases, and thus the garbage collector needs to perform only about half as many page migrations and erases (see more about the IPA approach in Chapter 7).

TPC-C                    NoFTL basic     NoFTL regions    ∆ [%]
TPS (Tx. per sec)        769.2           1,140.0          48.2
Tx. Completed            6,958,868       10,315,234       48.2
DBMS Reads (16KB)        210,025,055     312,005,523      48.6
DBMS Writes (16KB)       48,304,777      71,016,177       47.0
GC Page Migrations       49,423,253      29,798,578       -39.7
… per DBMS Write         1.02            0.42             -59.0
GC Erases                1,526,977       1,575,255        3.2
… per DBMS Write         0.0316          0.0222           -29.8
Write Amplification      2.0             1.4              -29.8

Table 6.3.: Comparison of NoFTL with and without regions under TPC-C on the Flash emulator.

In addition to the lower GC overhead, the experiments on the Flash emulator allow us to analyze the effect of regions on the utilization of I/O parallelism and the reduced contention for physical resources. In the next experiment (Table 6.3) we emulated a storage device with 128GB of Flash memory comprising 64 dies. As in all experiments, after the ramp-up phase we ran TPC-C for two hours. Under the scenario [NoFTL regions] with 4 regions (Table 6.4) we observe a 48% higher transactional throughput than under [NoFTL basic]. The GC overhead is reduced by 59% in terms of page migrations per host write and by approx. 30% for erases per host write and the total system write-amplification.



Storage Structures         Database objects                                                                  NAND chips    Address Mapping
Tablespace 1 / Region 1    Warehouse, District, History, NewOrder, NewOrder-IDX, Order, Order-IDX,           6             Page-Level
                           Order-Cust-IDX
Tablespace 2 / Region 2    Warehouse-IDX, District-IDX, Item, Item-IDX, Stock-IDX, Customer-IDX,             5             Block-Level
                           Cust-Name-IDX
Tablespace 3 / Region 3    OrderLine, Stock                                                                  33            Page-Level
Tablespace 4 / Region 4    Customer, OrderLine-IDX                                                           20            Page-Level

Table 6.4.: Data placement configuration for TPC-C.

Region-Specific Flash Management This experiment investigates the performance impact of assigning a specific Flash management scheme to a NoFTL region, depending on the properties of the db-objects placed in it. By doing so we relax the current SSD vendor strategy of employing a single set of Flash management algorithms for the whole storage device ("one size fits all"). NoFTL currently contains page-level, block-level and hybrid (FASTer and DFTL) address mapping schemes. While page-level mapping (PLM) offers maximum flexibility for data placement and is the best choice for update-intensive data, it also has the largest memory footprint. Block-level mapping (BLM) consumes a negligible amount of memory and has excellent performance for db-objects with read-only or append-based patterns, but is ill-suited for randomly updated db-objects due to its high maintenance overhead. Hybrid schemes offer a resource and performance trade-off between PLM and BLM and are suitable for read-mostly db-objects with some degree of write-skew. Currently two hybrid schemes are implemented, DFTL and FASTer: DFTL handles strong write-skew better, while FASTer excels with general I/O patterns (see more about Flash management schemes in Chapter 2.3.2). Standard TPC-C is ill-suited for demonstrating the advantages of multi-FMS since the size and write-behavior of the STOCK table practically dominate the I/O behavior of the whole benchmark. Therefore, we extend TPC-C by increasing the size of the HISTORY


table, as a counterweight to the STOCK table, with read-mostly and append-write I/O patterns. As a result, TPC-C runs on a dataset that would have been produced by the original TPC-C transaction mix after running for two days instead of two hours. Based on their I/O properties, the TPC-C db-objects have been placed in four regions as shown in Table 6.5. Tables like Warehouse or District are small and thus typically fit completely in the DBMS buffer. This results in lower I/O activity, making them relatively cold compared to other db-objects (consider the "% Writes" and "% Reads" columns of Table 6.5). Therefore, they were placed in a hybrid FMS region managed by FASTer. DB-objects like the Stock and Customer tables or the o_cust_idx index take the lion’s share of the I/O with random updates, hence they are placed in a PLM-based region (HOT). Furthermore, these objects require a lot of parallelism, hence HOT comprises more chips/dies than the aggregate object sizes require. The new History table exhibits read-mostly and append-based I/O patterns, making it an ideal candidate for a region that utilizes block-level address translation. Objects like no_idx or o_idx, which are primary key indexes of the NewOrder and Order tables, exhibit relatively frequent updates (e.g., insertion of new records) and can be placed in a DFTL-managed region (WARM). The comparison of the performance results for this configuration against different single-region configurations is provided in Table 6.6.

Region                                                   Min. chips required    Assigned chips    Mapping    Groups    % Writes    % Reads
Read-only/mostly, append-based: History, Item, …         21.81                  23                BLM        1         2.3         30.2
COLD: Metadata, District, Warehouse                      0.06                   1                 FASTer     1         0.1         0.1
WARM: NO_IDX, O_IDX, OL_IDX                              11.90                  14                DFTL       1         8.6         1.4
HOT: Stock, Customer, …                                  12.66                  26                PLM        3         89.1        68.2

Table 6.5.: Data placement configuration with selective, region-specific Flash management for the TPC-C benchmark.

Columns [Single-FMS BLM/FASTer/DFTL/PLM] in Table 6.6 show the results for tests with a



Flash SSD 50GB, TPC-C (SF = 150), 2 hours:
                                     Single-FMS     Single-FMS     Single-FMS     Single-FMS     Multi-FMS      Multi-FMS
                                     BLM            FASTer         DFTL           PLM            4 regions      4 regions
                                     1 region       1 region       1 region       1 region       3 groups       3 groups
                                     1 group        1 group        1 group        1 group        No IPA         IPA
                                     No IPA         No IPA         No IPA         No IPA
Host Reads (4KB)                     6,998,704      30,264,560     33,833,853     57,634,830     66,890,004     74,345,031
Host Writes (4KB)                    3,327,108      24,177,351     28,590,405     48,224,391     56,432,211     60,388,800
GC Page Migrations                   197,686,609    87,263,511     32,894,397     14,285,105     9,583,656      5,507,492
GC Erases                            3,138,778      1,741,134      940,719        869,957        952,206        440,659
GC Page Migrations per Host Write    59.4169        3.6093         1.1505         0.2962         0.1698         0.0912
GC Erases per Host Write             0.9434         0.0720         0.0329         0.0180         0.0169         0.0073
I/O Response Time [µs]: READ         4,522          808            734            244            179            140
I/O Response Time [µs]: WRITE        21,050         1,507          472            391            354            324
Trx. throughput                      116            600            677            1,146          1,330          1,414
Memory (MB)                          0.78           5.78           2.50           50.00          21.23          21.23

Table 6.6.: Comparison of the NoFTL approach with single- and multi-FMS configurations under the TPC-C benchmark.

single-region configuration, where that region was using (i) pure block-level mapping (BLM), (ii) the hybrid address translation scheme FASTer [56], (iii) a variant of page-level mapping, DFTL [28], and (iv) pure page-level address translation (PLM), respectively. Columns [Multi-FMS No-IPA/IPA] correspond to tests with the multi-region configuration shown in Table 6.5; the only difference between these two tests is that in the [Multi-FMS IPA] case the HOT region was additionally configured with IPA support (see more about IPA in Chapter 7). Among the four uni-region configurations, the scheme with pure page-level address translation (PLM) achieves the highest transactional throughput (1146 TPS), which is almost 70% higher than with DFTL, 90% higher than with hybrid mapping (FASTer), and almost 10x higher than the throughput of the region with block-level mapping (BLM). The major factor for the performance differences between the schemes is the overhead of the garbage collection. Thus, under the hybrid FASTer scheme GC performed 12x more page migrations and 4x more erase operations than under the scheme with pure page-level mapping. Although the DFTL scheme has a significantly lower GC overhead than FASTer (3x fewer page migrations and 2x fewer erases),


it has additional CPU overhead caused by lookups in the cached mapping tables (see Chapter 2.3.2), which also has a negative impact on system performance. On the other hand, the uni-region PLM case has the highest memory consumption (see row [Memory (MB)]) among all schemes. It requires 20x as much memory for the mapping table as the DFTL scheme, and 8.6x more memory than the FASTer scheme. This is the main reason why modern FTL-based Flash SSDs cannot utilize pure page-level address translation, although it produces the lowest GC overhead. As on-device DRAM is typically quite limited, its capacity is not sufficient to cache the complete mapping table at page-level granularity and simultaneously support reasonable sizes of the read/write I/O caches. That is why modern SSDs typically use FTLs based on hybrid mapping schemes. Under the NoFTL architecture the issue of insufficient DRAM for Flash management is solved through the use of host memory for this purpose. Thus, pure page-level address translation can typically be applied for NoFTL-based Flash storage without sacrificing other memory-resident data structures (e.g., the DBMS buffer). However, in practice, NoFTL does not even require that amount of memory. With a proper multi-region configuration the total memory footprint of NoFTL is much smaller than it would be for a single-region configuration with pure page-level address translation. Column [Multi-FMS 4 regions 3 groups No IPA] in Table 6.6 shows the results for the data placement configuration presented in Table 6.5. The total memory consumption for Flash management data (e.g., the address translation tables) in all four regions is 2.4x smaller than in the case of the uni-region configuration with PLM. Note that in this case the reduction of the memory footprint does not sacrifice performance. On the contrary, the transactional throughput is 16% higher than for [Single-FMS PLM], and the write-amplification (GC page migrations per host write) is 43% lower. Under workloads that are more realistic than TPC-C, with a broader range of I/O patterns and a more diversified DB-schema, the memory savings and performance improvements from flexible Flash management would be significantly higher. The use of the IPA approach for the HOT region allows the GC overhead to be decreased further (column [Multi-FMS 4 regions 3 groups IPA]): 46% fewer page migrations and 57% fewer erase operations per host write as compared to the configuration without IPA support (more about IPA in Chapter 7).

Write-Amplification By reducing the GC overhead, employing regions and all the other factors already discussed, NoFTL performs fewer writes and erases per DBMS write. Low write-amplification leads to better I/O performance and energy efficiency, but also to higher Flash longevity and performance stability over time. Figures 3.6 and 6.6 summarize some of the numbers already discussed to highlight the impact of NoFTL on write-amplification.

[Figure 6.6 consists of two panels for the TPC-B benchmark, comparing the storage alternatives Cooked SSD, Raw SSD, NoFTL basic, NoFTL regions+groups and NoFTL regions+groups+IPA: the upper panel shows the transactional throughput (TPS: 19, 49, 102, 125 and 151, respectively), the lower panel the write amplification and the GC erases per DBMS write.]

Figure 6.6.: Comparison of write amplification and intensity of erase operations for different storage alternatives under the TPC-B benchmark.



7. In-Place Appends

This chapter is dedicated to the approach of in-place appends (IPA), which we have designed and implemented within the NoFTL architecture. IPA allows us to significantly reduce the write-amplification under OLTP workloads (by up to 70%), as well as to improve the longevity of Flash storage devices (>2x). IPA and its results were presented in [31], [36] and [32]. Below we provide a more detailed description of the approach, its implementation and evaluation results, which partially go beyond the scope of the aforementioned publications. The motivation for the IPA approach is based on the observation of huge write-amplification, which is common for the majority of OLTP workloads running on modern Flash storage. The reasons behind such write-amplification are manifold: 1) outdated assumptions regarding the storage rooted deep in multiple subsystems of the DBMS; 2) the cooked storage alternative; 3) the backwards compatibility of SSDs; and 4) their black-box architecture. Consider the example depicted in Figure 7.1, where a transaction modifies a single field of a tuple (e.g., withdrawing a certain amount from a user’s bank account). Although the modification is actually limited to just a few bytes on a page (Figure 7.1.a), many DBMSs overwrite the whole tuple (b) and consequently modify the page’s metadata (c). Thus, an update of 4-10 bytes has already at this stage turned into a modification of hundreds of bytes, i.e., a write-amplification of at least one order of magnitude. Further, due to the block-device interface and the common page-based data management of the DBMS, the whole database page (e.g., 4KB) is submitted to the storage on eviction from the buffer pool (Figure 7.1.d). If the cooked storage alternative is used, then the file system, depending on its type and configuration, might add its own contribution to the overall write-amplification. Previous studies have reported the write-amplification of common file systems to be in the range of 2.5-5 for OLTP workloads (e.g., [59]). Our own measurements for the TPC-C and TPC-B benchmarks have verified those numbers (see Tables 5.1, 5.2). At this stage, the write-amplification in written bytes has reached two to four orders of magnitude. For instance, four bytes of a numeric attribute being changed by a transaction have turned into 4K to 20K bytes submitted to the SSD. As if the story were not dramatic enough at this point, the FTL-based SSD adds its own part to the general picture. The backwards compatibility of modern SSDs creates an illusion of "free" in-place updates on Flash, simulating thereby a traditional HDD. In turn,

[Figure 7.1 traces the write-amplification chain for a 10-byte update by a transaction: (a) the net changed data (10 bytes), (b) the whole updated tuple, (c) the updated page metadata (header/footer), (d) the 4KB database page submitted to the device, (e) the file-system level (ext3: a 4KB write yields about 20KB written [24]), and (f) the SSD level with GC page migrations (each write causes on average 3 physical writes), resulting in up to about 60KB (~15 pages) physically written.]

Figure 7.1.: Write-amplification under OLTP workloads. Source: Hardock et al. [31].

the DBMS and the file system take this behavior of the storage for granted and submit modified pages to the same logical addresses. Consequently, the FTL invalidates the old versions of the pages and writes the updates to new locations on the Flash. In the meantime, the garbage collection running in the background reclaims the space occupied by the invalidated pages. The invalidation of pages and the page migrations performed by GC cause additional write-amplification. Depending on the state of the SSD and the applied FTL scheme, the on-device write amplification can vary between 1.5 and 3, i.e., one page arriving at the SSD might result in up to 3 physical page writes. Thus, while in the best case a few modified bytes result in 4K bytes being physically written and another 4K bytes invalidated, in the worst case up to 60K bytes (15 pages) are written on Flash. This sums up to a write-amplification of 1000 to 15000 in terms of bytes (I/O size), and 1 to 15 in terms of written pages (I/O count). In the example above we have considered a single case of write-amplification, where only a few bytes of data on a page are modified upon its eviction from the buffer pool. This case is, however, representative for the majority of update-intensive workloads. Updating a user balance or the product count in stock, setting a flag or changing the status of an online transaction, as well as other simple operations, dominate OLTP workloads, and they typically

change only one or two fields of a single tuple on a page. Moreover, the randomness of requests and the large sizes of relations often cause modified pages to be evicted from the buffer before the next change to the page occurs. Even in database systems where the working set fits completely in main memory, cold pages with only a few changed bytes are regularly flushed to storage in order to reduce the recovery time in case of failure. Thus, our analysis of typical OLTP benchmarks (TPC-B, TPC-C) has shown that up to 70% of all updates modify 10 or fewer bytes of a tuple’s data on a page (excluding modifications of the page’s metadata). The detailed analysis of these and other benchmarks is provided in Chapter 7.4. This write amplification in OLTP workloads obviously has a significant influence on the whole system. The two major parameters mirroring this impact are the I/O response times and the longevity of the Flash storage. Even if the former factor can be more or less important depending on the system configuration (e.g., in an I/O-bound system the I/O response time is directly proportional to the transactional throughput, while in main-memory DBMSs this correlation is weak), the SSD’s longevity is critical in all systems because of cost considerations. Apart from this, high write-amplification has a negative impact on the available storage link (or network) bandwidth, the stability of the I/O throughput and the power consumption of the storage. The IPA approach can effectively reduce the write-amplification in OLTP workloads, thereby prolonging the lifetime of SSDs and improving the performance characteristics of the system. But before we describe the approach in detail, let us look into one interesting (and commonly ignored) property of Flash memory, which lies at the core of IPA.

7.1. Revisiting Erase-Before-Overwrite Principle

One of the fundamental properties of Flash memory is the erase-before-overwrite principle (see Chapter 2.3.1), which means that once a Flash page is written (programmed) it cannot be overwritten until the corresponding Flash block is erased. However, it is worth looking a bit deeper into the write process on Flash and checking this property more closely. Modern SSDs utilize the so-called Incremental-Step-Pulse-Programming (ISPP) technique [90], [66], [3] to perform a program operation on Flash. Herewith, the charge of every Flash cell within the selected page (i.e., wordline) is incrementally increased to its desired value in multiple iterations (see Figure 2.4). In the first iteration of ISPP programming some initial amount of charge is stored in the selected cells1. After this, each cell in the wordline is sensed (read), and based on the difference between its actual charge

1Flash cells are selected during programming through manipulation of voltages on corresponding bitlines and wordlines (see Chapter 2.3.1).

and the target value, the next iteration of ISPP is performed, which puts an additional portion of charge into the cells. This process repeats until all cells in the wordline are programmed to their target values. Note that some cells might need more iterations than others. This is especially the case for MLC and TLC Flash types, where cells can have different target charge values depending on their initial state2. So, as we can see, Flash cells are actually modified multiple times during a single program operation (SLC NAND) or even during multiple separate operations (MLC and TLC NAND). And those modifications do not require any erase operation to be performed in between. Then why is there a need to erase at all? The key point here is that the erase-free modification of Flash cells is possible only if their charges are increased during the subsequent programming. It is, however, not possible to decrease a cell's charge by a certain value; it can only be removed completely during an erase operation. Thus, if the current level of charge in a cell is higher than the target value, the only way to modify it is to erase the whole Flash block containing this cell and consequently re-program it with the desired value. This constraint is the main reason why the erase-free modification of already written Flash pages is almost never utilized by SSD manufacturers. The probability that an arbitrary page update would result in the cells' charges only being increased is negligibly small in the general case. Assume, for instance, that an application has updated only one numeric value on a certain page, e.g., from the original value 357 to the value 2372 (in binary, from 0000 0001 0110 0101 to 0000 1001 0100 0100). Thus, in the whole page (e.g., 4K) only three bits of data are changed. However, because there is one bit that is flipped from 0 (an SLC cell with a charge) to 1 (an SLC cell with no charge), such an update cannot be performed on the original Flash page without a preceding erase operation. Clearly, when a larger amount of data is modified on a page (e.g., 50 bytes of a 4K page), the probability of an erase-free overwrite becomes minimal. In some scenarios, however, the above constraint of erase-free overwrites is easily satisfied, for instance, in the case of appends to a page. When data is inserted into the page, cells storing the original content remain unchanged, while all Flash cells that correspond to the newly appended data either get their charges increased or are left in their initial state without a charge3. Thus, whenever the modifications on a page are limited to appends, the SSD could perform such an update in an erase-free manner by directly overwriting the original Flash page4, i.e., perform an in-place update. In contrast to the common

2In the MLC (TLC) Flash type each cell stores two (three) bits of data, which belong to different Flash pages that are programmed individually at different times.
3Assuming that during the initial program operation the free space on a page is filled with logical 1s, so that the Flash cells corresponding to that area are not programmed.
4On MLC and TLC Flash there is an additional constraint which needs to be satisfied for in-place updates (see Chapter 7.3.2).

out-of-place update (see Chapter 2.3.2), this would save the need (1) to find a new location for the modified page; (2) to invalidate the original Flash page; and (3) to reclaim the invalidated space via GC, which in turn causes multiple page migrations and erase operations. However, the possibility of in-place updates is almost never utilized by modern SSDs. There are multiple reasons for this, which are again rooted in the block-device interface and the black-box design of Flash storage. To realize support for in-place updates a common SSD would need to solve the following challenges. First, on every write request the SSD would need to compare the submitted page with its original version stored on Flash in order to determine whether it can be overwritten in place. This introduces computational (comparing two 4K pages) and I/O (reading the original version of a page from Flash) overheads, which can easily wipe out the benefits of in-place updates. Second, this approach requires a modification of the error-correction techniques, which is hard to implement under the black-box architecture of the SSD (see Chapter 7.2, ECC paragraph). We are aware of only one case in which in-place updates are utilized by commercial SSDs, and of one other use case proposed by academia. The former is known as partial writes, supported by some SSDs based on the SLC Flash type. The idea here is to split a physical Flash page into N equal parts and to allow each part to be written separately. Thus, it is possible to perform up to N write requests on the same physical Flash page without the need to erase the block in between. The only constraint is that those parts can only be written in order, i.e., each subsequent write to the same Flash page must be an append. On the one hand, by fixing the granularity of partial writes (e.g., 512 bytes) and aligning it to the sector size of the block-device interface, the two main challenges in the realization of in-place updates mentioned above can easily be solved. On the other hand, however, this significantly reduces the flexibility of in-place updates. Thus, in order to append a small record (e.g., 64 bytes) to the end of an existing Flash page, the whole unit of a partial write (512 bytes or more) must still be written. Moreover, due to the specifics of MLC and TLC Flash (Chapter 7.3.2) the option of partial writes is only available on SLC Flash, which is rarely used nowadays. Multiple research approaches, like [70] and [51], utilize partial writes in order to reduce the overhead of garbage collection. The second application case of erase-free in-place updates on Flash memory was proposed by Cai et al. in [109]. In their approach "Correct-and-Refresh" they apply in-place updates to mitigate retention errors on MLC Flash. Basically, they periodically read Flash pages, correct on the fly the bit errors caused by retention (i.e., the leakage of the cells' charges over time), and consequently re-write the same page with the corrected data. Because in this correction the charges of the cells can only increase, such an overwrite is possible without a preceding erase operation.
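The constraint discussed in this section - cells may only gain charge, so with the SLC encoding used above (erased bit = 1, programmed bit = 0) a bit may flip from 1 to 0 but never back from 0 to 1 without an erase - boils down to a simple bitwise test. The following is an illustrative, SLC-level sketch, not code from the prototype (MLC/TLC add the further constraint mentioned in footnote 4):

#include <cstddef>
#include <cstdint>

// Sketch (SLC view, assuming erased bits read as 1 and programming can only
// clear bits to 0): an erase-free, in-place overwrite of oldData with newData
// is possible only if no bit has to go from 0 back to 1.
bool canOverwriteInPlace(const uint8_t* oldData, const uint8_t* newData, size_t len) {
    for (size_t i = 0; i < len; ++i) {
        // A bit set in newData but already cleared in oldData would require an erase.
        if (newData[i] & ~oldData[i]) {
            return false;
        }
    }
    return true;
}

For the example above, updating the 16-bit value 357 (0000 0001 0110 0101) to 2372 (0000 1001 0100 0100) fails this test, because one bit would have to flip from 0 back to 1; an append into previously unwritten, all-ones space always passes it.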

7.2. Design and Implementation Details of IPA

Motivated by the observation of massive write amplification, and exploiting the ability of Flash memory to perform erase-free in-place updates as long as cell charges only increase, we propose the IPA approach. The basic idea of IPA is the following. A small space at the end of a database page is reserved and left unprogrammed during the initial write operation. This area is called the delta-record area and its size is typically less than 5% of the page size, e.g., 100 bytes on 4K or 8K database pages. Later on, when the page is modified and gets evicted from the buffer pool, the changes that were performed are extracted and accumulated in a special delta record. Then, through the novel write_delta command only this delta record is transferred to the SSD, where it is written (appended) into the delta-record area of the very same physical Flash page the original data remains in. In other words, by transforming the changes on a page into appends, IPA can utilize the above-mentioned property of Flash memory and "reuse" (overwrite) the original Flash pages, thereby saving the need to perform an out-of-place update with all its disadvantages. When a database page that was previously updated through IPA is fetched from the SSD, the delta records are applied on the fly to the body of the page, and this way the up-to-date version of the page is placed into the buffer frame. It is worth noting that the DBMS operates on buffered pages in a traditional manner, i.e., all changes are performed as usual in-place.

The IPA approach is flexible and can be optimally configured depending on the properties of the Flash memory and the characteristics of the update pattern. The size of the delta-record area, the format of delta records, and the maximum number of consecutive in-place appends are the main configuration parameters. The combination of IPA with the concept of configurable storage in NoFTL allows us to achieve the maximum benefit: each region can have its own specific IPA configuration (including the choice of whether to utilize IPA at all), which is selected based on the properties of the database objects placed in that region. The NoFTL Advisor assists the database administrator in choosing the optimal region-specific IPA configuration. IPA is of special benefit for OLTP-like workloads, which are characterized by high update rates and small sizes of performed changes (e.g., changing a numeric field or setting a flag). In those scenarios, by sacrificing a very small percentage of page space for the delta-record area, it is possible to reduce the write amplification and the number of performed erase operations more than twofold, which in turn has a positive impact on the performance of the DBMS and the longevity characteristics of the SSDs. But before we look closely into the evaluation results, let us first dive deeper into the important details of the IPA approach.

7.2.1. Page Layout for IPA

Figure 7.2 shows the database page layout modified for supporting the IPA approach. The new page format is based on the traditional NSM layout (slotted page), since it is the primary choice for DBMSs serving OLTP workloads. However, it is worth mentioning that IPA can be applied in a similar manner to other page formats, like DSM and PAX [1].

[Figure omitted: NSM-formatted database page with header, PageLSN, slot directory, tuples, free space and page footer; the delta-record area at the end of the page holds up to N delta records of M bytes each, while the 128-byte OOB area on Flash stores ECC_initial and ECC_delta_rec 1..N.]
Figure 7.2.: Database page layout for IPA on Flash. Source: Hardock et al. [31]

As already mentioned, under IPA we reserve a fixed-size delta-record area at the end of the free space of each database page (if that page belongs to a region with enabled IPA). The size and structure of this area is determined by three major configuration parameters. 1. Parameter M specifies the size in bytes of a single delta record5. This parameter primarily depends on the access properties of the database objects in a NoFTL region, to

5 We use here a different definition of parameter M than in [31], where parameter M describes the maximum number of changed bytes in the tuple data (net data) of a database page that can be encoded in one delta record.

which IPA is applied. Properties such as the size, frequency, and locality of updates are the most relevant for the choice of M. In most cases a simple analysis of the distribution of update sizes in a region is enough to select an appropriate value for M. In Chapter 7.4 we explain such an analysis in more detail and provide multiple examples.

2. Format of the delta record. It determines how changes within a database page are encoded inside a delta record. In our prototype we have realized and evaluated two basic formats. The offset-value pair format stores modifications on a page as a list of tuples, where each tuple consists of a modified byte and its offset within the page. For instance, the tuple [1024, 77] means that the field starting at byte offset 1024 has the value 77 (Figure 7.3); a minimal code sketch of this encoding follows after this list. Depending on the maximal supported size of a database page, the offset can be two or three bytes in size. Hereafter, we assume that two bytes are enough, as pages larger than 64KB are rarely used for OLTP workloads. Thus, each single byte that was changed in a page requires three bytes in the delta record. In other words, an M-byte delta record can store up to M/3 modified bytes of net data. This simple format is especially applicable for database objects dominated by random, small-sized updates. However, for database objects such as B-Tree indices the offset-value format is sub-optimal, as it would result in a significant space overhead produced by IPA. Therefore, we designed the log-based format of delta records for B-Tree indices. The detailed description of that format is provided in Chapter 7.3.1.

3. Parameter N specifies the maximal number of delta records that can be stored in the delta-record area. The choice of N depends primarily on the physical characteristics of the Flash memory (e.g., Flash type, resistance to program interference errors). The detailed description of those characteristics and their influence on parameter N is provided in Chapter 7.3.2. Another factor that impacts the selection of N is the tradeoff between the space overhead of IPA (and its implications) and the expected advantages of applying IPA (I/O performance and longevity of Flash SSDs).

Therefore, the total size occupied by the delta-record area is the product of the two main parameters N and M. To describe a certain IPA configuration we use the notion of an [NxM] scheme. Thus, IPA [2x48] denotes an IPA configuration whose delta-record area can contain at most two delta records in offset-value pair format with at most 48 bytes per record. Similarly, IPA-IDX [2x48] corresponds to the configuration where delta records use the log-based format.
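To make the offset-value encoding and the [NxM] parameters concrete, the following small C sketch shows one possible in-memory representation of a delta record under a [2x48] scheme. It is purely illustrative and uses our own names (IPA_N, IPA_M, delta_record_t, delta_add_change); it is not code from the NoFTL prototype.

    #include <stdbool.h>
    #include <stdint.h>

    #define IPA_N 2            /* max. number of delta records per page ([2x48] scheme) */
    #define IPA_M 48           /* max. size of a single delta record in bytes           */

    typedef struct {
        uint16_t used;               /* bytes already used in this delta record        */
        uint8_t  data[IPA_M];        /* sequence of (2-byte offset, 1-byte value) pairs */
    } delta_record_t;

    /* Encode one changed byte as an offset-value pair; returns false if the record
     * is full, which is exactly the case in which the page is marked "No-IPA". */
    static bool delta_add_change(delta_record_t *dr, uint16_t page_off, uint8_t new_val)
    {
        if (dr->used + 3 > IPA_M)
            return false;
        dr->data[dr->used++] = (uint8_t)(page_off & 0xFF);   /* offset, low byte  */
        dr->data[dr->used++] = (uint8_t)(page_off >> 8);     /* offset, high byte */
        dr->data[dr->used++] = new_val;                      /* new byte value    */
        return true;
    }

With two such 48-byte records, the reserved delta-record area of a 4KB page is 96 bytes, i.e., well below the 5% bound mentioned above.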

[Figure omitted: an INSERT and two UPDATE transactions modify attributes A7 and A8 of the tuples on page 4711 of relation R (values 9/7 → 3/7 → 3/5); after each eviction the changed bytes and the page-metadata changes are appended as Delta Record 1 and Delta Record 2 to the page's delta-record area, while the initially written tuple data on Flash stays unchanged, with ECC_initial, ECC_delta_rec 1 and ECC_delta_rec 2 kept in the OOB area.]
Figure 7.3.: Example of a database page with two delta records. Source: Hardock et al. [31]

7.2.2. Page Operations under IPA

In order to realize the IPA approach, three main operations on a database page must be modified. • Fetching of a page from Flash storage into the database buffer. Whenever a page that belongs to a region with enabled IPA is read from storage, the storage manager checks if there are delta records in the delta-record area. If so, those delta records are applied in FIFO order to the body of the page. For instance, if the offset-value pair

format is used, a simple algorithm iterates over all such pairs stored in the delta records and substitutes the bytes at the corresponding offsets with their new values. Herewith, the up-to-date version of the page in the database buffer is restored, and it can be accessed by transactions. This additional step during a fetch operation produces no memory overhead and only a very low to negligible computational overhead. Thus, in all our experiments we could not measure any increased CPU overhead related to the process of applying delta records (e.g., replacing 30-50 bytes in a 4KB page). If the delta-record area is already full, the page is marked as "No-IPA" (a bit in the page header), which means that next time it will be written out to storage in an out-of-place manner.

• Modifications on a page. If the page belongs to a region with enabled IPA, and the delta-record area has free space (the page is not marked as "No-IPA"), the buffer manager performs tracking of changes on that page. On every update the buffer manager checks if the modified content can fit into the remaining space of the delta-record area. If so, it encodes those changes in the selected format and appends them to the current delta record. Note that the content of the page is modified as usual, i.e., in-place. In other words, IPA solely "remembers" what has been changed on a page while it is being cached in the buffer, and it keeps those "notes" in a small area at the end of the page. The delta-record area of the page is completely transparent to the database operations, which work with the up-to-date content stored in the buffer frame. If the current changes cannot fit into the delta-record area, the page is marked as "No-IPA", the content of the delta-record area is ignored, and no further tracking of modifications on that page is performed.

• Flushing a page to Flash storage. When the database system decides to flush a modified page to storage (e.g., due to the work of a background write process), we check if the page can be written out using the IPA approach. If an IPA write is not possible (the "No-IPA" flag is set), the page is written out in a traditional out-of-place manner to a new location on the Flash SSD, and the delta-record area is left empty. Otherwise, if an IPA write is possible, the storage manager issues a write_delta command, which transfers to the Flash storage only the relevant delta record(s) (those added since the last fetch), which in turn are appended to the very same Flash page the original data is stored in. Remember that appending delta records to the original Flash page is done without a preceding erase operation. If after the flush the page is not evicted from the buffer, we check whether future changes on that page might also be saved using IPA on the next flush operation. This is the case when the delta-record area has free space and the number of already performed consecutive writes on that

page using IPA is less than parameter N. If both conditions are fulfilled, the buffer manager simply continues tracking new changes. When a consecutive IPA write is not possible, the "No-IPA" flag is set, and thus further changes are not tracked.

It is important to note that apart from the modifications in the storage and buffer managers described above, the IPA approach does not change any further database functionality. Thus, logging, recovery, the commit protocol, the buffer replacement strategy, etc. are not influenced by the use of IPA. Moreover, IPA does not influence the database's decisions about when to fetch and flush pages. IPA does not cause any additional I/O requests. When the delta-record area is full, no I/O is triggered; the content of this area (the "notes" about performed changes) is simply ignored.
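The fetch path from the first bullet above can be condensed into a few lines. The sketch below is our illustration (reusing the delta_record_t type from the earlier sketch), not the prototype's storage-manager code; it applies the delta records of a buffered 4KB page in FIFO order.

    /* Restore the up-to-date page image in the buffer frame: apply all delta
     * records, oldest first, substituting the bytes at the encoded offsets. */
    static void apply_delta_records(uint8_t *page /* 4KB buffer frame */,
                                    const delta_record_t *recs, int num_recs)
    {
        for (int r = 0; r < num_recs; r++) {                 /* FIFO order */
            const delta_record_t *dr = &recs[r];
            for (uint16_t i = 0; i + 3 <= dr->used; i += 3) {
                uint16_t off = (uint16_t)dr->data[i] | ((uint16_t)dr->data[i + 1] << 8);
                page[off] = dr->data[i + 2];                 /* replace the changed byte */
            }
        }
    }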

7.2.3. WRITE_DELTA Command

To enable the database system to perform in-place appends on Flash pages we introduce a novel I/O command under NoFTL: write_delta{pno, offset, size, data}. With this command, data of a certain size is appended to the Flash page with the physical page number pno, starting at the specified offset. For an IPA region with a certain [NxM] scheme, data represents one or multiple delta records of total size k * M, where k is the number of submitted delta records, while offset is the offset within the delta-record area at which the new delta records should be appended. Correspondingly, the Flash storage device must also provide support for the new command. The physical implementation details of write_delta on the side of a Flash SSD might differ depending on the type of Flash memory and controller specifics. As an example of one possible implementation, let us look at how we implemented this command on the OpenSSD Jasmine board [94], which we used for the evaluation of IPA in our NoFTL prototype. The challenge with the OpenSSD Jasmine board is that the on-device ARM controller allows only the FTL logic to be reprogrammed, while a large part of the firmware, including the Flash controller and the on-device DRAM controller, cannot be changed. Because of this, in our implementation of write_delta we were forced to utilize the standard PROGRAM_PAGE command of the Flash controller, which accepts and writes only whole Flash pages. However, we implemented an effective workaround. When issuing a write_delta command the database system submits only the relevant delta records. As this chunk of data arrives at the OpenSSD Jasmine board, we copy it into an empty buffer frame at the offset specified in the command. The remaining empty space in the frame we fill with logical "1"s (see Figure 7.4). This frame is then written using the standard PROGRAM_PAGE

command to the original physical address (PPN). In other words, a page consisting almost completely of "1"s and only a small portion (typically <5%) of delta records is written "on top" of the existing original page. As logical "1"s do not cause any increase of the voltage thresholds in the corresponding cells, the already existing data in the Flash page remains unchanged during this program operation. At the same time, the region of the Flash page that corresponds to the new delta records is programmed appropriately (see Chapters 7.1 and 2.3.1). In this way, we achieve the desired result of the write_delta command - an in-place modification of a Flash page without a previous erase operation.
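The following C sketch condenses this workaround. It is an assumption-laden illustration: program_page() merely stands in for the firmware's PROGRAM_PAGE routine, and the real buffer handling on the board is more involved.

    #include <stdint.h>
    #include <string.h>

    #define FLASH_PAGE_SIZE 4096

    /* Stand-in for the firmware's PROGRAM_PAGE routine (assumed signature). */
    extern void program_page(uint32_t ppn, const uint8_t *frame);

    /* Emulated write_delta: append 'size' bytes of delta record(s) at 'offset'
     * of the Flash page 'ppn' without a preceding erase. */
    static void write_delta_emulated(uint32_t ppn, uint32_t offset,
                                     const uint8_t *delta, uint32_t size,
                                     uint8_t *dram_frame /* page-sized SSD write-cache frame */)
    {
        memset(dram_frame, 0xFF, FLASH_PAGE_SIZE);   /* logical "1"s: leave existing cells untouched */
        memcpy(dram_frame + offset, delta, size);    /* place the delta record(s) at their position  */
        program_page(ppn, dram_frame);               /* re-program the ORIGINAL physical page        */
    }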

[Figure omitted: data flow between the NoFTL DBMS (buffer pool, storage manager), the I/O queue, the DRAM write cache of the OpenSSD Jasmine board and the NAND chip for a write_delta on PPN=7.]
Figure 7.4.: Implementation of the write_delta command on the OpenSSD Jasmine board. (1) up-to-date version of a page in the database buffer pool; (2) "remembered" changes since the last fetch from the SSD, encoded as a delta record in the delta-record area; (3) the storage manager issues a write_delta command only for the delta record, i.e., only the delta record is transmitted to the SSD; (4) the controller copies the delta record into its original position in an empty page (initialized with "1"s) in the DRAM buffer of the SSD (write cache); (5) original Flash page on Flash with empty delta-record area; (6) through overwriting the original (PPN=7) page only the delta-record area becomes updated, while the body of the page remains unchanged.

The only overhead of this implementation is caused by the initialization of an empty buffer frame with logical "1"s. In a native implementation of write_delta supported directly by the Flash controller, this step can be eliminated or optimized in hardware. Additional details regarding the implementation of the write_delta

command on the OpenSSD Jasmine board are provided in Chapter 7.3.2. It is important to mention the difference between the novel write_delta command and the so-called partial_write command available on some types of SLC Flash memory. Partial writes allow us to perform multiple program operations on the same Flash page without performing an erase operation in between. The Flash page is split into equal fixed-size chunks (e.g., four 2KB chunks in an 8KB Flash page), and each partial write appends the next chunk to the Flash page. Both the write_delta and partial_write commands are based on the same physical properties of Flash memory and utilize the ISPP technique. However, write_delta is more general and flexible: (i) it is available for all types of Flash memory, such as SLC, MLC and TLC (see Chapter 7.3.2); and (ii) the size of the appended data is not fixed and may change over time.

7.2.4. Error Detection and Correction

The introduction of the write_delta command also requires an adjustment of the error detection and correction techniques used by the Flash controller. There are several approaches used by SSD manufacturers to calculate and store error correction codes (ECC). Thus, there might be a single ECC calculated for the whole Flash page, or the page can be split into equal parts, each having its own ECC. Correction codes might be stored in a single place in the OOB area, or distributed over several locations in a Flash page. To support IPA we need several error-correction codes: one for the initial content of the page excluding the delta-record area, and a separate ECC for each delta record appended via the write_delta command. Those codes might also be stored in a single dedicated area in the OOB (see Figure 7.2), or, for instance, at the beginning of the corresponding sections. However, in contrast to equally sized parts, the size of the main part of the page (with the initial content) and the size of the delta records are not fixed over the whole Flash SSD, as they are determined by the concrete [NxM] scheme applied for a particular region. Thus, in order to properly read and apply the correction codes, the Flash controller must be aware of the sizes and placement of those parts. There are several options how this can be realized. The simplest one is to store the relevant [NxM] scheme directly in every page: two bytes suffice to encode the parameters N and M, which are written into a dedicated place in the page's OOB area during the initial program operation. As the OpenSSD Jasmine board offers no opportunity to change the hardware implementation of ECC, and we cannot access the page OOB area, we shifted this task to the server side. Thus, in our prototype the calculation of the error correction code for a Flash page, the ECC storage, the detection of errors and their correction are done by the database system. Fortunately for us, the default implementation of ECC on the OpenSSD board performs only the detection of errors, while correction is not done. This fact is very important, since it has

allowed us to realize and evaluate IPA to its full extent on real Flash storage. Whenever we modify a page using the IPA approach, the original ECC code generated by the Flash controller for the whole page becomes invalid. On reading such a page, the controller reports the bit errors it finds, but does not try to correct them. Thus, we simply ignore these errors and perform our own ECC check and correction in the storage manager of the DBMS.
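As a sketch of the bookkeeping this requires - our illustration, not controller firmware - the byte range covered by each correction code can be derived from the [NxM] scheme, for example when N and M are stored as two bytes in the page's OOB area:

    #include <stdint.h>

    typedef struct { uint16_t start; uint16_t len; } ecc_section_t;

    /* idx < 0 selects the initial page content (covered by ECC_initial);
     * idx = 0..n-1 selects delta record idx (covered by ECC_delta_rec idx+1). */
    static ecc_section_t ipa_ecc_section(uint16_t page_size, uint8_t n, uint8_t m, int idx)
    {
        uint16_t area = (uint16_t)(n * m);            /* delta-record area at the page end */
        ecc_section_t s;
        if (idx < 0) {
            s.start = 0;
            s.len   = (uint16_t)(page_size - area);   /* body without the delta-record area */
        } else {
            s.start = (uint16_t)(page_size - area + (uint16_t)idx * m);
            s.len   = m;
        }
        return s;
    }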

7.3. Variations of IPA

7.3.1. IPA for Indexes

The basic IPA approach, which stores delta records in the form of offset-value pairs, is the perfect choice for database objects dominated by random updates of small sizes. Good examples are tables in which transactions modify one or multiple numeric, enumeration, boolean or timestamp attributes per record (e.g., the STOCK table in the TPC-C benchmark, the ACCOUNT table in the TPC-B benchmark). In those cases, a few to several dozen changed bytes per database page turn into a delta record of about 20 to 100 bytes in size (the offset-value pair format requires 3 bytes per modified byte on a page). Thus, the total size of a delta-record area comprising two or three delta records would occupy less than 5% of a 4KB database page. Taking into consideration that IPA is typically applied selectively using regions, the overall space overhead caused by IPA becomes negligible (typically below 1% of the database size). However, the offset-value format of a delta record is poorly suited for database objects such as B-Tree indices. To illustrate this, let us consider a simple example. Assume a B-Tree index Idx1 on a numeric field Attr1 of database table Tbl1. Idx1 is declared as unique. Transaction Tx1 modifies a single record of Tbl1 by changing the value of Attr1 from 10 to 100. As a result, a two-step update operation on Idx1 is triggered (Figure 7.5).

1. The index entry with the key 10 is deleted from the leaf page 456. There are multiple techniques to realize the deletion of an index record. One of the most common approaches is to remove only the corresponding entry in the slot directory of the page, while the record itself is left unchanged until its space is either reused by a subsequent insert operation or reclaimed during page compaction. The slot directory is always kept sorted and often also compacted, which is needed to support efficient binary search during a lookup within an index page. Thus, removing a single entry shifts the remaining slots to its right by one position. On average, half of the entries in the slot directory must be shifted.

[Figure omitted: the transaction checks Attr1 of the record with Id 77 and updates it from 10 to 100; this touches Table1 page 123 and the Index1[Attr1] leaf pages 456 (keys [0; 50), entry marked deleted in the slot table) and 789 (keys [50; 120), new entry and slot-table pointer inserted); the original pages on the SSD (LPN 123, 456, 789) are invalidated and the modified versions are written to new Flash locations, so GC later reclaims the invalidated pages, resulting in on-device write amplification.]
Figure 7.5.: Modification of an entry in a B-Tree. Traditional approach.

2. A new index entry with the key 100 is inserted into the leaf page 789. The insertion requires modification of both the slot directory and the page body. The new entry is inserted into the slot directory according to the sorted order of keys, which on average requires half of the slots to be shifted by one position. The record itself is typically placed without any special ordering; it is either appended to the end of the page body, or inserted into the hole left by a previous delete operation.

Thus, the one-field modification of a single record in Tbl1 causes three database pages to be updated (e.g., pages 123, 456 and 789). The changes in page 123 of the heap file of Tbl1 are small, i.e., a few dozen changed bytes in the page body and metadata. If such changes are dominant for this database object, it is a good candidate for the application of the IPA approach with the standard offset-value format for delta records. In this case, a delta-record area with multiple records occupies only a few percent (e.g., 1-5%) of the page, resulting in a negligible space overhead of the IPA approach. However, due to the reorganization of the slot directory, the changes to the index pages 456 and 789 are more complex, and typically cause several hundreds of bytes on each page to be changed (see Figure 7.12). Applying the IPA approach with the offset-value format would result in delta records of 500 or more bytes in size. Thus, the total space occupied by a delta-record area with just two delta records would already be more than 1KB, which is about 25% of a 4KB page. The increased storage consumption itself is typically not a big problem, especially taking into account that through the use of selective IPA the overall overhead for a database would be much lower, as only the relevant database objects

will grow in size. However, more important is the impact of a larger delta-record area on the database buffer. "Stealing" a large portion of an index page for the IPA approach means that the number of index records per page is reduced correspondingly. This leads to a higher ratio of cache misses and more frequent I/O requests, which would compete with the reduced write amplification and number of erases achieved through IPA. On the other hand, indexes are typically the hottest and most write-intensive objects in the database, and therefore they would especially benefit from an IPA approach. That is why we propose a modified version of IPA for B-Tree indexes - IPA-IDX. The main difference to the traditional IPA approach is the format of a delta record. Instead of storing modified bytes (offset-value pairs), IPA-IDX creates the delta records using physiological log records (see Figure 7.6)6. The usage of log records allows IPA-IDX to reduce the space requirements for the delta-record area by a factor of two or more, as compared to the traditional IPA approach. The behavior of IPA-IDX is similar to the general IPA approach (see Figure 7.7). On each modification of an index page in the buffer, the buffer manager checks if the current change should be tracked by IPA-IDX. For this, two conditions are required: (i) the page can be overwritten in-place, i.e., the number of already performed in-place overwrites is less than N; (ii) the corresponding log record for this change fits into the remaining space in the delta-record area. If both requirements are fulfilled, the log record is copied to the delta-record area of the buffered page. This process is repeated for the subsequent changes on that page. If one of the above conditions is not satisfied, the page is marked as one that will be written out using an out-of-place strategy, and no further IPA-IDX tracking is performed until the page is flushed. When the database decides to evict the page from the buffer, the storage manager checks if the page should be written using an in-place or out-of-place approach. If an in-place update is possible, the page is written using the write_delta command; otherwise the normal write to a new Flash address is triggered. Note that, like the general IPA, IPA-IDX does not trigger any additional I/O requests, nor does it influence the time an index page is cached by the database buffer, nor the point at which it is flushed to storage. IPA-IDX changes the traditional behavior of the database buffer only to the extent of remembering the corresponding log records in a page's delta-record area as the page is being modified. When the page is fetched from storage, the storage manager checks if the page contains delta records. If so, it applies them in FIFO order by performing the logical operations of the log records stored in those delta records. Applying delta records in IPA-IDX is similar to

6 Physiological logging is the common type of logging applied to B-Tree indexes in most database systems. A log record of this type captures the logical operation performed within a certain database page. This allows one to keep the flexibility of physical logging and the space efficiency of logical logging.

[Figure omitted: NSM-formatted index page with header, index records, free space, footer and delta-record area; each of the N delta records (M bytes) holds a sequence of physiological log records (log_record LSN=128, LSN=217, ..., LSN=297), and the OOB area on Flash stores ECC_initial and ECC_delta_rec 1..N.]
Figure 7.6.: Page layout and format of a delta record in IPA-IDX. Source: Hardock et al. [33]

executing the REDO phase of recovery for one particular page.
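For illustration only, a delta record in IPA-IDX could carry physiological log records laid out as below; the exact record format of the prototype is not reproduced here, and page_insert_at()/page_delete_at() are assumed slotted-page primitives of the storage manager. Applying the records is then a page-local REDO loop.

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical layout of one physiological log record inside a delta record. */
    typedef struct {
        uint64_t lsn;    /* LSN of the logged operation                    */
        uint8_t  op;     /* 0 = insert index entry, 1 = delete index entry */
        uint16_t slot;   /* target slot in the index page's slot directory */
        uint16_t len;    /* payload length (key + RID) for inserts, else 0 */
        /* 'len' payload bytes follow the header */
    } idx_log_rec_t;

    /* Assumed slotted-page primitives (not from the thesis). */
    extern void page_insert_at(uint8_t *page, uint16_t slot, const uint8_t *entry, uint16_t len);
    extern void page_delete_at(uint8_t *page, uint16_t slot);

    /* Apply the log records of one delta record in FIFO order - a page-local REDO. */
    static void idx_apply_delta(uint8_t *page, const uint8_t *buf, uint16_t used)
    {
        uint16_t pos = 0;
        while (pos + sizeof(idx_log_rec_t) <= used) {
            idx_log_rec_t hdr;
            memcpy(&hdr, buf + pos, sizeof hdr);      /* avoid unaligned reads */
            const uint8_t *payload = buf + pos + sizeof hdr;
            if (hdr.op == 0)
                page_insert_at(page, hdr.slot, payload, hdr.len);
            else
                page_delete_at(page, hdr.slot);
            pos = (uint16_t)(pos + sizeof hdr + hdr.len);
        }
    }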

7.3.2. IPA on Different Flash Types

The vulnerability of Flash memory to program interference errors also requires special attention during the realization of an IPA approach. As different types of Flash memory differ in their reliability guarantees (see Section 2.3.1), we propose several modes of the IPA approach that allow the safe application of IPA on each particular Flash type. Region-specific Flash management (see Chapter 6.1.2) makes it possible to use multiple IPA modes (depending on the NAND mode of the region) simultaneously.

IPA on SLC Flash

SLC Flash is the most durable type of Flash memory, which is aligned with its property to be the least sensitive to program interference and read disturb errors, as compared to

other Flash types. This immunity to bit errors is a result of the large distance between the two threshold voltages which indicate the different bit codes of a page (logical "1" or "0"). In other words, Flash cells can accumulate many small shifts in a cell's voltage, which happen occasionally during programming or sensing operations on neighboring cells, without the risk of leaving the voltage interval that belongs to the corresponding bit value. IPA can therefore be applied on SLC Flash without restrictions. Parameter N, which defines the maximal number of delta records (i.e., the maximal number of consecutive IPA writes), is on SLC Flash limited primarily by the space utilization factor, and not by the vulnerability of Flash to bit errors.

[Figure omitted: the same transaction as in Figure 7.5, now handled with IPA/IPA-IDX; the changes to Table1 page 123 and Index1 pages 456 and 789 are encoded as an offset-value delta record and two log-record deltas and written via write_delta into the pages' delta-record areas (about 2-5% of each page).]
Figure 7.7.: Modification of an entry in a B-Tree using IPA-IDX.

IPA on MLC/eMLC

MLC Flash is nowadays the most used type of Flash memory in SSDs. Although its durability and performance characteristics are worse than those of SLC Flash (Chapter 2.3.1), it has a better cost/performance tradeoff as it doubles the density. Due to the higher vulnerability to bit errors, the application of IPA on MLC Flash requires some type-specific implementation solutions. Before we describe two different modes of using IPA on MLC Flash, let us look into some physical details of this Flash type. MLC Flash can store twice as much information in the same number of Flash cells as SLC Flash due to the introduction of four different threshold voltages (instead of two in SLC). Thus, the main difference between SLC and MLC lies in the Flash controller, which performs the programming and sensing operations, and not in the physical structure of the cells (they can be absolutely the same). The fact that the controller in MLC Flash must differentiate four different values between the states V0 (no charge) and Vmax (maximum level of a cell's charge) is the main reason for the performance degradation of I/O operations and the increased sensitivity to interference between neighboring cells. To minimize the effect of those interferences manufacturers apply several techniques. The two bits that are encoded in an MLC Flash cell belong to two different physical pages: the LSB (least significant bit) page and the MSB (most significant bit) page. Moreover, those pages do not have consecutive addresses. Thus, LSB pages have odd numbers (2*N - 1), while MSB pages follow with even numbers (2*N + 2), where N is the index of the corresponding wordline (see Figure 7.8)7. Within a block, Flash pages must be written in the order of their physical numbers. That is, the sequence of programming operations is the following: WL0(#0-LSB) → WL1(#1-LSB) → WL0(#2-MSB) → WL2(#3-LSB) → WL1(#4-MSB) → WL3(#5-LSB) → WL2(#6-MSB) → ... As you can see, Flash cells on a certain wordline are modified twice, with two program operations on neighboring wordlines in between. This order ensures that after the last programming operation on a certain wordline (e.g., #4-MSB on WL1), a cell on this wordline can shift in its voltages only once more, by programming the MSB page on the next wordline (#6-MSB on WL2). Taking these specifics into account, we offer two different modes for IPA application on MLC Flash: pSLC and odd-MLC.
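As a small worked example of this numbering (our illustration, valid only for the two-pages-per-wordline layout quoted above and shown in Figure 7.8), the following helpers compute the LSB/MSB page numbers of a wordline and test whether a physical page is an LSB page:

    #include <stdbool.h>
    #include <stdint.h>

    /* Two pages per wordline: WL-0 holds #0-LSB/#2-MSB, WL-1 holds #1-LSB/#4-MSB, ... */
    static uint32_t lsb_page_of_wordline(uint32_t wl) { return wl == 0 ? 0 : 2 * wl - 1; }
    static uint32_t msb_page_of_wordline(uint32_t wl) { return 2 * wl + 2; }

    /* Page #0 and all odd-numbered pages are LSB pages in this numbering. */
    static bool is_lsb_page(uint32_t page_in_block)
    {
        return page_in_block == 0 || (page_in_block % 2) == 1;
    }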

pSLC. Every MLC Flash can also be used in the so-called pseudo-SLC (pSLC) mode, in which only LSB pages are utilized, while MSB pages are simply ignored. In this mode MLC becomes similar to traditional SLC Flash, as the controller must differentiate

7 The formulas for the page numbers of LSB and MSB pages might differ for MLC Flash memory of different manufacturers. For instance, some manufacturers produce MLC Flash chips with four Flash pages per wordline: two LSB pages (LSB-odd, LSB-even) and two MSB pages (MSB-odd, MSB-even). Each cell still encodes two bits of data, which correspond to two different pages (LSB and MSB); there are simply twice as many cells per wordline, which results in four physical pages instead of the traditional two.

[Figure omitted: NAND block with wordlines WL-0 ... WL-N between GSL and SSL and bitlines BL0 ... BL127; physical page numbers within the block (program order) and page types per wordline: WL-0: #0-LSB/#2-MSB, WL-1: #1-LSB/#4-MSB, WL-2: #3-LSB/#6-MSB, WL-3: #5-LSB/#8-MSB, ..., WL-N: #(2N-1)-LSB/#(2N+2)-MSB.]
Figure 7.8.: Memory organization of MLC Flash.

only between the two threshold voltages of the cell, which have a large distance between their values. The application of IPA in this case is similar to SLC Flash - parameter N is defined not based on the vulnerability of Flash to bit errors, but rather by the tradeoff between the space overhead of IPA and the performance benefits of its application. Thus, in our tests on the OpenSSD Jasmine board with MLC Flash modules used in pSLC mode we tried extremely high values of N = 10 (a [10x48] IPA scheme), i.e., we allowed up to 10 consecutive in-place appends on pages, and there was no meaningful increase in bit errors. However, based on the space overhead of IPA, the access patterns of the workloads, and the performance benefits of IPA, we found the value N = 2 to be a good tradeoff across all OLTP benchmarks. Another advantage of pSLC mode is that the performance of programming and sensing operations for LSB pages is similar to their performance on SLC Flash. On the other hand, the disadvantage of this mode is the reduction of the useful capacity of the Flash memory by a factor of two. So, why do we propose a pSLC mode, if it cuts half

of the storage space, which was basically the main reason for choosing an SSD based on MLC Flash? The answer is the region-specific utilization of IPA. The pSLC mode of MLC Flash can be utilized selectively only for certain memory regions, i.e., some Flash chips might be used in pSLC mode, while other chips run in normal MLC mode. In its simplest implementation pSLC does not require any change or reconfiguration of the Flash controller. In NoFTL, where we have direct control over the physical address space on Flash memory, we simply ignore the even-numbered pages (MSB pages) and use only LSB pages for storage. The manufacturers of Flash SSDs can, however, provide further optimizations for chips used in pSLC mode through specific I/O or maintenance commands. For instance, for those chips the complexity of the ECC can be significantly reduced. Thus, the pSLC mode of IPA is especially attractive for very write-intensive database objects with small-sized updates. Those objects can then be grouped into one or several regions, for which MLC Flash is used in pSLC mode and IPA is configured respectively. In this case the performance advantage is twofold: (i) faster I/O commands due to the use of LSB pages8; and (ii) reduced GC overhead through the IPA approach. As such hot database objects usually represent only a small part of a database (e.g., the 80/20 rule), the space overheads in these regions caused by the pSLC mode (half the capacity) and the IPA approach (<5% of the pages for the delta-record area) will still be acceptable for the whole database or Flash SSD.

odd-MLC. In this mode the full capacity of the MLC Flash memory is utilized; however, IPA is applied only to LSB pages, while MSB pages are always written in a traditional out-of-place manner. The space reservation for the delta-record area (the IPA page layout) is applied for both LSB and MSB pages. When the database system fetches an MSB page9, the storage manager sets the "No-IPA" flag, which turns off the tracking of modifications on this page and signals that when the page is flushed it should be written to a new location on Flash. Programming of an MLC Flash cell can lead to program interference errors in surrounding (neighboring) cells on the adjacent wordlines. Thus, programming of cell M on wordline WL2 shown in Figure 7.9 might lead to small voltage shifts in cells G, H, I on WL1 and Q, R, S on WL3. The probability that cells apart from those (e.g., cells A, B, C, D, E, J, K, L, N, O, P, T, U, V, W, etc.) are influenced by programming M is negligibly small (see [109], [13]). Therefore, as the write_delta command programs only the cells within the delta-record area, the voltage shifts can correspondingly happen only

8 Programming of LSB pages can be as much as 6x faster than programming of MSB pages [27].
9 For instance, in Flash organizations with two pages per wordline, MSB pages are the pages with even physical numbers, except the first and last pages in each Flash block.

[Figure omitted: grid of cells A ... T on wordlines WL-0 ... WL-3 and bitlines BL0 ... BL4 between GSL and SL.]
Figure 7.9.: Program interference in MLC Flash during a PROGRAM operation. Cell M is being programmed; cells G, H, I, Q, R, S can experience program interference errors.

in cells belonging to the delta-record areas of the pages on the previous and next wordlines. For instance, appending a delta record to LSB page #3 on WL2 might produce interference errors on pages LSB#1, MSB#4, LSB#5 and MSB#8 in the corresponding sections of their delta-record areas. It is important to emphasize that since during IPA appends the cells with original data (the main area of the page) are not programmed, no voltage shifts occur in the data areas on neighboring wordlines. But how do we solve the problem of program interference errors that might occur in the delta-record areas of surrounding Flash pages during IPA appends? Because in odd-MLC mode we allow write_delta commands to be performed only on LSB pages, only the delta-record areas of LSB pages are relevant, while the delta-record areas of all MSB pages are completely ignored. At the same time, as for LSB pages the Flash controller differentiates only between two threshold voltages, which have a large distance between them, small voltage shifts there typically do not lead to bit errors. This immunity of LSB pages to small voltage shifts is similar to that of Flash cells in pure SLC Flash or MLC Flash used in pSLC mode. Note that IPA appends are performed in both pSLC and odd-MLC modes on LSB pages in an arbitrary order. Thus, initially pages are written using an in-order programming strategy, but for the write_delta command there is no restriction regarding the order of the modified pages.
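To illustrate the interference pattern just described (a sketch building on the wordline helpers above, not thesis code), an IPA append to the LSB page of wordline wl can disturb at most the four pages on wordlines wl-1 and wl+1, and there only inside their delta-record areas:

    /* For an append on the LSB page of wordline wl (wl >= 1), only these four
     * neighboring pages can see small voltage shifts. */
    static void pages_disturbed_by_append(uint32_t wl, uint32_t out[4])
    {
        out[0] = lsb_page_of_wordline(wl - 1);   /* e.g., wl = 2 (page #3): LSB #1 */
        out[1] = msb_page_of_wordline(wl - 1);   /*                         MSB #4 */
        out[2] = lsb_page_of_wordline(wl + 1);   /*                         LSB #5 */
        out[3] = msb_page_of_wordline(wl + 1);   /*                         MSB #8 */
    }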

Summarizing the important points of odd-MLC mode:

1. The delta-record area is reserved on all Flash pages in a particular NoFTL region;

2. IPA appends are performed only on LSB pages, in any order (no in-order programming is needed);

3. These appends can cause small voltage shifts in the delta-record areas of neighboring pages (LSB and MSB pages);

4. As the delta-record areas of MSB pages are not utilized, possible bit errors there are irrelevant;

5. Program interference in LSB pages generally does not lead to bit errors, because LSB pages have a much higher immunity to small voltage shifts (similar to SLC Flash and pSLC mode).

While in pSLC mode the capacity of the corresponding NoFTL region is reduced by half, odd-MLC utilizes almost the full capacity of MLC Flash, except for the space occupied by the delta-record areas of MSB pages. On the other hand, in odd-MLC mode the IPA approach can be applied only to half of the Flash pages. However, this does not necessarily reduce the performance and longevity benefits of IPA. If the size of a database page is a multiple of the size of a Flash page, then even in odd-MLC mode the IPA approach can be applied to every database page. As every database page will comprise multiple Flash pages, we just need to ensure that the Flash page that stores the last part of a database page is IPA capable, i.e., that it is an LSB page. Under the NoFTL architecture it is quite easy to fulfill this constraint, as we have full control over the logical-to-physical address translation. For instance, assume that a database page is 8KB in size and a Flash page is 4KB. Then storing every db-page in two physically consecutive Flash pages would mean that the first half of a db-page is stored in an MSB page, while the second part is stored in an LSB page. In turn, the LSB page can be updated in-place via a write_delta command. Note that such a placement policy can be selected for each particular NoFTL region individually.
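A minimal sketch of this placement rule for the 8KB/4KB example (again an assumption about how it could be encoded, reusing is_lsb_page() from the earlier sketch):

    /* An 8KB database page occupies two consecutive 4KB Flash pages; it stays
     * IPA-capable in odd-MLC mode if its second half - which holds the
     * delta-record area - lands on an LSB page. */
    static bool db_page_is_ipa_capable(uint32_t first_flash_page_in_block)
    {
        return is_lsb_page(first_flash_page_in_block + 1);
    }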

IPA on TLC Flash

Nowadays, TLC Flash is mostly used in 3D NAND architectures. 3D NAND is a very promising technology, which makes it possible to further increase the density of the memory without sacrificing the performance and endurance characteristics of Flash. First, this is achieved by adding multiple layers of Flash cells vertically on top of each other. This eliminates the need to shrink the size of the individual cells, and thus avoids a negative impact on the robustness and

endurance of Flash. Second, in 3D TLC Flash most manufacturers use the so-called Charge Trap Flash (CTF) technology. In contrast to the Floating-Gate Transistor (FGT), which is the common technology for SLC and MLC Flash, CTF cells have a higher reliability and immunity to interference errors [83]. Unfortunately, we did not have the opportunity to evaluate IPA on TLC Flash, as we could not find any TLC-based SSD that would allow us to re-program the Flash controller in order to implement NoFTL. However, based on the above characteristics, as well as on the statements of SSD manufacturers that 3D Flash is "Bitline Interference Free" and "Wordline Interference Almost Free" [82], we assume that IPA can be applied on TLC Flash with 3D organization in both pSLC and odd-MLC modes without any limitations.

7.4. Evaluation

The detailed evaluation results of IPA were presented in [31], [36], [32] and [33]. The tests were performed with the common OLTP benchmarks TPC-C [96], TPC-B [95] and TATP [92], as well as with the popular web-oriented social graph benchmark LinkBench [4]. We ran these benchmarks with different scenarios, characterized by the following factors:

• Flash SSD. As the underlying storage we used both a real Flash SSD - the OpenSSD Jasmine board [94] (see A.1) - and the Flash emulator [30]. Although the OpenSSD Jasmine board allowed us to prove and evaluate the concept of IPA on real hardware, it was unsuitable for test scenarios with a high level of I/O parallelism, because the firmware of the OpenSSD does not support Native Command Queuing (NCQ), which is essential for the effective processing of parallel I/O requests. That is why, in tests with a high level of I/O parallelism, we used the Flash emulator.

• Level of I/O boundedness. To investigate the influence of the reduced write amplification and GC overhead on system performance, we varied the configuration parameters of the database system to achieve workloads with different levels of I/O boundedness. Thus, by selecting the size of the database buffer between 10% and 90% of the initial data size, we could cover the whole spectrum from a strongly I/O-bound to a CPU-bound (in-memory) system. Apart from the size of the database buffer, the amount of I/O requests also depends on its eviction strategy. We performed tests with both eager and non-eager eviction strategies.

• IPA mode. We tested IPA in three available modes: SLC (Flash emulator), pSLC (OpenSSD), odd-MLC (OpenSSD).

134 • IPA scheme and region configuration. After analysis of a particular workload we selected an appropriate configuration of NoFTL regions, and for relevant regions we tried different IPA schemes. With this it was possible to investigate the influence of IPA configuration parameters (N and M) on the system performance and longevity characteristics of the Flash SSD.

• Benchmark parameters. We further varied benchmark configuration parameters such as the scaling factor and the duration.

In order to provide a short summary of this evaluation we will use below some of the results published in [31], [36], [32] and [33]. As the numbers for all OLTP benchmarks (TPC-C, TPC-B, TATP) were relatively similar, we will consider here mostly the tests with the TPC-C benchmark. The first part is dedicated to the evaluation of the IPA approach applied to tables, i.e., IPA with the offset-value format of delta records. The second part describes the results for IPA-IDX. Please note that in this work we use a different notion of the IPA scheme than in the above publications. In [31], [36] and [32] the IPA parameter M describes the maximum number of changed bytes in the tuple data (net data) of a database page that can be encoded in one delta record. In contrast, in this work M corresponds to the maximum size of the delta record itself. Thus, for instance, the IPA scheme [2x3] used in [31] corresponds to the scheme [2x48] in this work (16 offset-value pairs, each 3 bytes in size: 3 pairs to encode changes in tuple data, and 13 pairs to encode changes in page metadata). We have changed the notion of M in order to unify it for both IPA variants: basic IPA and IPA-IDX.

7.4.1. Basic IPA

To figure out whether the application of the IPA approach is reasonable for a particular workload, which type and scheme of IPA to use, and what region configuration would be optimal, an analysis of this workload must be performed. This analysis can be performed either offline or online. We have implemented a tool for the offline analysis of a workload, the IPA advisor, which was presented in [36]. The discussion and initial implementation of workload analysis is presented in Chapter 8. One of the basic criteria for the efficiency of the IPA approach for a particular workload (or a separate database object) is the dominance of small-sized updates. To assess this, we typically collect and analyze cumulative distribution functions of the update sizes of database pages. Such an example for a TPC-C workload is presented in Figure 7.10. It is important to note that by the update size of a page we mean the total number of changed bytes on a page upon

its flushing to storage, i.e., the cumulative size of all changes performed on a page while it was in the database buffer. Figure 7.10 shows the cumulative distribution of update sizes in the TPC-C benchmark, which was running for two hours on the Shore-MT storage engine10 [42]. The buffer size varies between 10% and 90% of the initial data size. As we can see, across all configurations 75% of all update I/Os modify 16 or fewer bytes on a 4KB page. These numbers can be explained by a simple analysis of the TPC-C transaction profile and the default behavior of the buffer and log managers in the Shore-MT storage engine.
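A minimal sketch of this kind of analysis (our illustration, not the IPA advisor itself): collect the number of changed bytes per flushed page and read a percentile off the empirical CDF.

    #include <stdlib.h>

    static int cmp_uint(const void *a, const void *b)
    {
        unsigned x = *(const unsigned *)a, y = *(const unsigned *)b;
        return (x > y) - (x < y);
    }

    /* Smallest update size (changed bytes per flushed 4KB page) that covers
     * 'fraction' of all observed update writes; e.g., fraction = 0.75 yields
     * 16 bytes for the TPC-C run in Figure 7.10, suggesting M = 16 * 3 = 48. */
    static unsigned update_size_percentile(unsigned *sizes, size_t n, double fraction)
    {
        qsort(sizes, n, sizeof sizes[0], cmp_uint);
        size_t idx = (size_t)(fraction * (double)(n - 1));
        return sizes[idx];
    }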

[Figure omitted: CDF curves for buffer sizes of 10%, 20%, 50%, 75% and 90%; x-axis: total number of changed bytes in a 4KB page (log scale); y-axis: percent of the total number of updates.]
Figure 7.10.: Cumulative distribution of update sizes in TPC-C under the eager eviction strategy11.

The most write-intensive database object in TPC-C is the Stock table (see the analysis in [55]). This table is modified by the New Order transaction in such a way that, per transaction, on average ten records are changed. Each such record update involves the modification of three numeric attributes. These attributes are modified by adding a small delta (typically <=10) to the previous value, which in the general case results in only 3 bytes per record being changed. Additionally, each such change requires an update of the page's metadata (e.g., the LogSN), which in Shore-MT results on average in another 12 bytes being modified. Records to be modified are selected randomly, and thus are usually located in

10 https://sites.google.com/site/shoremt/
11 CDF of update sizes only in net data (tuple data) is provided in [31].

different database pages. Therefore, each New Order transaction changes on average ten database pages of Stock, and within each page 16 or fewer bytes. Such small-sized updates are also typical for the Customer, Warehouse and District tables in this benchmark. But why do those changes not accumulate before the pages are written out to storage? The reason for this lies in the configuration of the buffer and log managers in Shore-MT. The buffer manager in Shore-MT uses an eager eviction strategy by default: background flusher threads start to write out dirty pages from the buffer pool when their amount is larger than 12.5% of the buffer size. Moreover, Shore-MT utilizes so-called eager log space reclamation. Under this policy, when the log space is filled to 25% or more, the background flushers start to write out the dirty pages that correspond to the oldest log segments. These strategies are primarily aimed at reducing the time needed to recover the database after a crash [85]. Both of these strategies determine what pages are flushed to stable storage and when. It is clear that under these default configurations of the buffer and log managers the I/O load produced by the database will not decrease with an increasing buffer size. In contrast, buffer sizes of 75% and 90% will make the workload CPU-bound, which will significantly increase the transactional throughput. This results in a high rate of modifications on buffered pages and a high rate of new log entries. Thus, eager buffer eviction and eager log space reclamation will flush more pages to storage as compared to configurations with lower buffer sizes (see the [Host Writes (4KB)] row in Table 7.2). It also explains why modifications on the STOCK table can rarely be accumulated. As this table is the largest in the database, and modifications to it touch random pages, most of these pages are seen as "cold" by the buffer manager and flushed (not necessarily evicted) to storage. Such a distribution of update sizes as shown in Figure 7.10 makes the decision regarding the proper IPA configuration quite easy. Remember that every changed byte in a page is encoded with three bytes in a delta record with the offset-value pair format. Thus, by choosing the M parameter of the IPA scheme (the maximal size of a delta record) to be 48 bytes (16*3), more than 75% of all updates can fit into a single delta record. This, however, does not mean that 75% of all update I/Os can be performed using in-place appends. The fraction of delta writes is further influenced by the parameter N, the write pattern of the workload, and the selected IPA mode (e.g., pSLC or odd-MLC). In the evaluation tests presented below the parameter N equals 2. Our experiments [31] on real MLC NAND Flash have shown that higher values of N are also meaningful and do not cause reliability issues. But as this parameter depends on the physical characteristics of the Flash memory, and we were not able to perform evaluations on different types of Flash memory, we have chosen the lower value of N, which should be applicable to all Flash types. Table 7.1 presents the evaluation results of the IPA approach on the OpenSSD Jasmine board running the TPC-C benchmark for two hours. The column [Traditional Absolute]

                              Traditional    IPA [2x48] pSLC           IPA [2x48] odd-MLC
                              [0x0] Absolute Absolute    Relative [%]  Absolute    Relative [%]
Out-of-Place Writes vs.
In-Place Appends                             49/51                     70/30
Host Reads                    4,977,335      6,390,032     +28         5,671,727     +14
Host Writes                   1,347,515      1,768,552     +31         1,524,552     +13
GC Page Migrations              422,753         79,718     -81           230,497     -45
GC Erases                         7,151          2,862     -60             3,819     -47
GC Page Migrations
per Host Write                   0.3137         0.0451     -86            0.1512     -52
GC Erases per
Host Write                       0.0053         0.0016     -70            0.0025     -53
Tx. Throughput                       25             37     +46                28     +11

Table 7.1.: TPC-C benchmark on OpenSSD: traditional approach vs. IPA in pSLC and odd-MLC modes. Source: Hardock et al. [31]

shows the absolute numbers of performed operations and the transactional throughput for the configuration where the IPA approach is not utilized, and thus every host write results in an out-of-place update on the Flash storage with the potential overhead of the garbage collection (page migrations and block erases). The columns [IPA [2x48] ...] correspond to the tests with the IPA approach enabled and configured with the [2x48] scheme, either in pSLC or odd-MLC mode. Here, the [... Relative] column shows the relative difference (in %) compared to the configuration without the IPA approach ([Traditional Absolute]). As we can see from Table 7.1, through the utilization of IPA in pSLC mode 51% of all database writes can be performed using in-place appends, while in odd-MLC mode this value decreases to 30%. Note that here we consider the fraction of all database writes, which include both update writes and writes of new pages, while the CDF in Figure 7.10 considers only update writes. To clearly evaluate the difference between the two IPA modes, we have selected the size of a database page to be equal to the Flash page size. That is why the lower fraction of IPA writes in odd-MLC mode is expected, as in that case the application of IPA is possible only for database pages that are stored in LSB Flash pages (every second Flash page). The direct implication of performing this percentage of writes using IPA is the reduced overhead of the garbage collection. Thus, the number of page migrations per host write was reduced by 86% in pSLC and by 52% in odd-MLC mode as compared to the traditional approach. Consequently, the reduced GC overhead results in lower I/O response times, which


in turn, depending on the level of I/O boundedness of the system, has an impact on performance. Due to hardware limitations, the tests on the OpenSSD had a small database buffer (about 1.5% of the total database size), which made the system I/O-bound. That is why in pSLC mode the transactional throughput increased with IPA by 46%, and in odd-MLC mode by 11%. The higher increase of throughput in pSLC mode has two reasons. First, as more writes could be performed via the write_delta command, the GC overhead and the I/O response times are lower as compared to odd-MLC mode. Second, the average response time of write I/Os is further decreased as in pSLC mode only LSB Flash pages are utilized, which have a significantly lower program latency. Another important advantage of the IPA approach is the improved longevity of the Flash SSD. As the wear of an SSD is directly correlated with the number of performed program/erase cycles (P/E cycles), the reduction of erase operations per host write by 70% and 53% for the pSLC and odd-MLC modes would at least result in a doubling of the lifetime of the Flash storage.

                         Buffer 20%             Buffer 50%             Buffer 75%             Buffer 90%
                         0x0 Abs.     2x48      0x0 Abs.     2x48      0x0 Abs.     2x48      0x0 Abs.     2x48
                                      Rel.[%]                Rel.[%]                Rel.[%]                Rel.[%]
Out-of-Place Writes
vs. IPAs                 51/49                  54/46                  56/44                  56/44
Host Reads (4KB)         25,120,600   25.89     3,275,550    11.44     614,639      4.43      279,258      0.44
Host Writes (4KB)        39,530,323   16.25     51,570,434   9.41      62,627,983   9.81      64,345,377   0.54
GC Page Migrations       27,886,888   -36.00    37,521,497   -31.74    46,874,908   -29.08    47,558,375   -28.51
GC Erases                1,018,624    -39.51    1,357,349    -37.67    1,676,376    -34.83    1,713,844    -33.77
GC Page Migrations
per Host Write           0.7055       -44.95    0.7276       -37.61    0.7485       -35.42    0.7391       -28.89
GC Erases per
Host Write               0.0258       -47.97    0.0263       -43.03    0.0268       -40.65    0.0266       -34.13
Response   READ I/O      0.77         -31.60    3.90         -31.07    8.44         -21.34    9.10         -2.89
Time [ms]  WRITE I/O     0.53         -21.36    0.53         -19.17    0.54         -17.88    0.53         -15.38
Tx. Throughput           1,001        15.42     1,480        6.28      1,984        1.22      2,191        0.21

Table 7.2.: TPC-C: traditional (no IPA [0×0]) vs. [2×48] schemes with large buffer pools (eager eviction). Source: Hardock et al. [31]

To analyze the effect of IPA on systems with a high level of I/O parallelism and larger database buffers, we also performed tests using the real-time Flash emulator


[30] on a server12 with 128GB RAM. The Flash emulator was configured to comprise 16 SLC NAND chips, making up a 50GB Flash SSD in total. Table 7.2 shows the aggregated evaluation results for the TPC-C benchmark with scaling factor 150 (roughly 30GB) running for two hours on this testbed. We varied the database buffer size from 20% to 90% of the initial database size, and for each such configuration performed three tests without the IPA approach and three tests with IPA enabled. In all these tests IPA was configured with the [2x48] scheme. Note that in all tests the I/O measurements relate only to the database data, and do not include the I/O that corresponds to the database logs. Moreover, we store the database logs in an in-memory partition, which further helps to separate the database performance from the characteristics of the storage device used for the logs. With buffer sizes of 75% and 90% we can evaluate scenarios close to in-memory database systems. In these tests the transactional performance becomes CPU-bound and almost independent of the performance of the I/O subsystem. That is why the reduced garbage collection overhead (35% and 29% fewer page migrations), and as a consequence the lower response times for write requests (18% and 15%), have almost no impact on the system performance. For smaller buffer sizes the improvement of transactional throughput varied between 6% and 15%. In these tests we have used the default eager eviction strategy of the buffer manager, as well as the default eager space reclamation policy of the log manager. As already mentioned, these strategies cause the I/O rate of write requests to increase with the growth of the buffer size and the resulting increase of transactional throughput. This can be clearly seen by comparing the absolute numbers of performed host writes (row [Host Writes (4KB)]) for configurations with different buffer sizes. These default policies of the log and buffer managers also explain why even with a 90% database buffer 44% of all write requests can be performed using the IPA approach with the [2x48] scheme. This is because before pages can accumulate multiple small updates, they are flushed to the stable storage (but not evicted from the buffer). Although the positive impact of IPA on the system's performance in CPU-bound scenarios is low, the reduction of performed erase operations by 34% even with a 90% buffer would significantly improve the longevity characteristics of Flash SSDs. To provide comprehensive evaluation results we also tested the behavior of the IPA approach with a non-eager eviction strategy of the buffer and the default space reclamation of the log manager turned off. By doing so, modified pages will rarely be flushed to storage, and thus they will be able to accumulate multiple updates. Obviously, under these conditions we need to re-evaluate the appropriate configuration scheme for the IPA approach. For this let us look at the corresponding distribution of update sizes presented in Figure 7.11.

¹²Intel Xeon server with 32 E7-4830 CPU cores, each running at 2.13 GHz

[Figure: cumulative distribution plot; x-axis: total number of changed bytes in a 4KB page (log scale); y-axis: percent of total number of updates; one curve per buffer size (10%, 20%, 50%, 75%, 90%).]

Figure 7.11.: Cumulative distribution of update sizes in TPC-C under non-eager eviction strategy¹³.

For a smaller buffer size (20%) the accumulation effect is less visible: about 85% of all update writes modify 22 or fewer bytes of a 4KB page. Having a buffer capable of caching about 50% of the data would require a delta record that can encode at least 42 modified bytes in order to cover 75% of all updates. Large buffers (75% and 90%) need a delta record sized for about 52 bytes of changed data to account for about 70% of updates. As every changed byte requires 3 bytes to be encoded in a delta record with offset-value format, the IPA configurations we selected for these tests are: [2x66] for buffer size 20%; [2x126] for the 50% buffer; and [2x156] for the buffers of 75% and 90%. Thus, for the largest buffer the total reserved size per page for the delta-record area is almost 8% of a 4KB page.

Table 7.3 presents the evaluation results for the TPC-C benchmark with turned-off default eviction and log-space reclamation policies and varying buffer sizes. As we can see, for buffer sizes of 75% and 90% the total number of writes performed by the database within two hours is almost 8 times lower compared to the configuration with the default eager eviction and log-space reclamation policies. With the [2x156] IPA scheme more than 33% of these writes were performed using the write_delta command. This allowed us to reduce the GC overhead by more than 22% and the response times of write requests by 8%. The impact of these savings on the transactional throughput is low (1-3%) due to the

¹³CDF of update sizes only in net data (tuple data) is provided in [31].


                               |      Buffer 20%      |      Buffer 50%      |      Buffer 75%      |      Buffer 90%
                               | 0x0 Abs.   | 2x66    | 0x0 Abs.   | 2x126   | 0x0 Abs.   | 2x156   | 0x0 Abs.   | 2x156
                               |            | Rel.[%] |            | Rel.[%] |            | Rel.[%] |            | Rel.[%]
Out-of-Place Writes vs. IPAs   |       44/56          |       51/49          |       63/37          |       67/33
Host Reads (4KB)               | 39,383,139 |   16.29 |  4,462,332 |    5.33 |    676,580 |    0.50 |    265,543 |    3.19
Host Writes (4KB)              | 28,591,074 |   20.16 | 11,767,036 |    4.43 |  8,486,996 |    3.46 |  8,802,867 |    3.25
GC Page Migrations             | 16,595,099 |  -40.27 |  5,027,818 |  -30.98 |  2,877,442 |  -20.07 |  2,982,080 |  -19.52
GC Erases                      |    670,807 |  -46.07 |    227,215 |  -36.13 |    142,339 |  -21.63 |    148,312 |  -19.10
GC Page Migrations/Host Write  |     0.5804 |  -50.29 |     0.4273 |  -33.91 |     0.3390 |  -22.75 |     0.3388 |  -22.06
GC Erases per Host Write       |     0.0235 |  -55.12 |     0.0193 |  -38.84 |     0.0168 |  -24.25 |     0.0168 |  -21.65
Response Time [ms], READ I/O   |       0.30 |  -19.46 |       0.41 |  -16.95 |       0.43 |  -19.30 |       0.64 |  -11.53
Response Time [ms], WRITE I/O  |       0.49 |  -21.56 |       0.43 |  -12.33 |       0.43 |   -7.87 |       0.56 |   -8.21
Tx. Throughput                 |      1,291 |    6.96 |      2,220 |    3.26 |      2,360 |    1.06 |      2,259 |    3.67

Table 7.3.: TPC-C: traditional (no IPA) vs. [2×M] schemes with large buffer pools (non- eager eviction). Source: Hardock et al. [31]

CPU-boundedness of the workload. However, being able to save between 20% and 61% of erase operations in these tests shows a clear benefit of IPA for the longevity characteristics of Flash SSDs. It is worth noting that in these tests with turned-off default eviction and log-space reclamation policies the total recovery time of the database after a simulated crash increased by about 4 times as compared to the corresponding tests with the default configuration of the buffer and log managers. More IPA evaluation results for TPC-C and other popular benchmarks were presented in [31, 36, 32].
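To make the arithmetic behind the scheme selection above explicit, the following minimal sketch (in C, with illustrative names; it is not part of the Shore-MT prototype) computes the delta-record size required for a given number of changed bytes under the 3-bytes-per-changed-byte offset-value encoding, and the resulting space overhead of an [NxM] scheme on a 4KB page.

#include <stdio.h>

/* Sketch of the space accounting behind an IPA [NxM] scheme with offset-value
 * delta records: every changed byte costs 3 bytes (offset-value pair), and N
 * delta records of M bytes each are reserved per database page. The function
 * names are illustrative; only the 3-byte encoding follows the text above. */

#define PAGE_SIZE        4096
#define BYTES_PER_DELTA  3      /* encoded bytes per changed byte */

/* Smallest delta-record size M that can encode `changed_bytes` bytes. */
static int delta_record_size(int changed_bytes) {
    return changed_bytes * BYTES_PER_DELTA;
}

/* Fraction of a page reserved for the delta-record area of an [NxM] scheme. */
static double space_overhead(int n, int m) {
    return (double)(n * m) / PAGE_SIZE;
}

int main(void) {
    /* 90% buffer: ~52 changed bytes should be covered -> [2x156] */
    int m = delta_record_size(52);
    printf("[2x%d] scheme reserves %.1f%% of a 4KB page\n",
           m, 100.0 * space_overhead(2, m));
    return 0;
}

Run on the example above, this yields the [2x156] scheme with roughly 7.6% of a 4KB page reserved, matching the "almost 8%" figure.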

7.4.2. IPA for Indexes

Figure 7.12 shows the distribution of update sizes for a typical B-Tree index in the TPC-C benchmark. It is a composite index on three attributes of the NewOrder table, namely the foreign keys no_w_id, no_d_id and no_o_id. The index is modified only in an append-based manner, i.e., adding a new record to the NewOrder relation also causes a new index entry to be inserted. Applying the basic IPA approach to that database object would require a single delta-

record to be at least 450 bytes in size ([150 bytes] * [3 bytes per offset-value pair]), and this scheme would theoretically cover at most 50% of all updates. In order to cover 75% of index updates, the size of the delta record must be increased to about 840 bytes (280 * 3). Thus, for storing two delta records, IPA must reserve about 22% or 41% of a 4KB page for 50% and 75% coverage of index updates, respectively. This example clearly shows that the basic IPA approach with delta records in offset-value format over gross data is suboptimal for B-Tree indexes due to insufficient coverage and/or an extremely large delta-record area.

[Figure: CDF plot for the composite index NO_IDX (no_w_id, no_d_id, no_o_id); x-axis: number of changed bytes in gross data (log scale); y-axis: percent of total number of updates.]

Figure 7.12.: CDF of update sizes for index on NewOrder table in TPC-C benchmark.

The theoretical coverage of index updates mentioned above serves only as a first, rough guideline. For instance, a coverage of 75% does not mean that 75% of all write I/Os to that particular database object will be performed using the IPA approach. In reality, the fraction of in-place writes will typically vary between 40% and 50%. There are several reasons for this. First, due to the accumulation of updates during the time an index page stays in the database buffer, the total size of the corresponding delta records might overflow the remaining space in the delta-record area. In this case, the page is written to the Flash storage in an out-of-place manner. Second, the limited number of times a page can be written consecutively using the IPA approach (e.g., N=2) further reduces the total fraction of delta writes. We have implemented IPA-IDX as an extension to the basic IPA approach in the Shore-MT database engine. IPA-IDX supports selective application, i.e., it can be applied for selected regions only, and within those regions specifically for selected database objects. The first step in the evaluation of IPA-IDX is to determine the right size of the delta-record


area. In Shore-MT, log records for insert and delete operations on a B-Tree have an identical structure. Each such log record consists of metadata, key and element (value). The metadata is 56 bytes in size and comprises the following fields: total length, type, category, page id, transaction id, volume id, page tag, object id, previous lsn, lsn, page id of root, slot index, key length and element length. The element is a tuple identifier and is 16 bytes in size. The size of the key depends on the concrete B-Tree index. For instance, for the index on the NewOrder table in Figure 7.12 the key size is 12 bytes, because the key consists of three numeric values. Since the majority of indexes in the TPC-C and TPC-B benchmarks that we used for evaluation are composite indexes over 2 to 4 numeric fields, the corresponding log records are 80 to 88 bytes. Based on these statistics, for all our experiments with IPA-IDX we set the size of a delta record to 270 bytes, which is enough to accumulate 3 log records (i.e., 3 changes on an index page). Further, we limit the number of allowed consecutive IPA writes to 2, which results in a [NxM]=[2x270] scheme for the selected B-Tree indexes. Therefore, the delta-record area can accumulate in total up to 6 delta records (i.e., log records). In the case of 4KB database pages, the space overhead for each particular index that uses IPA-IDX with this scheme is 13%. This is about 2 to 3 times less than the space overhead of delta records with offset-value format. Furthermore, this scheme offers 100% theoretical coverage for all possible changes on an index page, as absolutely all log records can potentially be stored in a delta record.

Table 7.4 presents the performance results of IPA-IDX for the TPC-C benchmark. In this test we enabled the IPA-IDX approach for five B-Tree indexes: NO_IDX(no_w_id, no_d_id, no_o_id), O_IDX(o_w_id, o_d_id, o_id), O_CUST_IDX(o_w_id, o_d_id, o_c_id, o_id), OL_IDX(ol_w_id, ol_d_id, ol_o_id, ol_number), S_QUANTITY_IDX(s_w_id, s_i_id, s_quantity). As the first four indexes are modified only in an append-only manner, we added the additional S_QUANTITY_IDX index to have an example of a B-Tree in which records are changed (i.e., removed and added). All these indexes together occupy about 30% of the database space. Applying the [2x270] scheme for them results in a total space overhead of IPA-IDX of less than 4% (13% increase in size for 30% of the database size). This configuration allowed us to perform 40% of all database writes using the IPA-IDX approach, while within the five selected indexes the fraction of delta_writes is about 65%. Although the improvement of transactional throughput was only 9% as compared to the scenario without IPA-IDX, a noticeably larger benefit of our new approach is seen in the reduction of GC overhead. The average number of page migrations per 4KB host write was reduced by 40%, the number of erases by 44%.

An example of the simultaneous use of basic IPA and IPA-IDX is shown in Table 7.5. Here, we first applied the basic IPA with the [NxM]=[2x48] scheme to the Account table in the TPC-B benchmark. This improved transactional throughput by 12% and decreased the garbage collection overhead by 37%. In this test scenario we have introduced an additional index


Flash SSD 50GB, TPC-C, SF = 150, 2 hours  | Baseline (no IPA) | IPA-IDX [2x270] | Relative [%]
Out-of-Place Writes vs. In-Place Appends  |        -          |      60/40      |      -
Host Reads (4KB)                          |    56,284,772     |    64,457,410   |     +15
Host Writes (4KB)                         |    56,977,326     |    62,821,498   |     +10
GC Page Migrations                        |    18,117,128     |    12,000,279   |     -34
GC Erases                                 |     1,173,341     |       718,641   |     -39
GC Page Migrations per Host Write         |         0.318     |         0.191   |     -40
GC Erases per Host Write                  |         0.021     |         0.011   |     -44
I/O Response Time [ms], READ              |          0.37     |          0.30   |     -19
I/O Response Time [ms], WRITE             |          0.44     |          0.38   |     -14
Tx. Throughput                            |           557     |           610   |      +9
Space overhead [%]                        |        -          |                 |    +4.0

Table 7.4.: Evaluation results of IPA-IDX on TPC-C benchmark. Source: Hardock et al. [33]

on the History table (H_IDX[h_a_id, h_date]). Since every transaction adds a new entry to the History table, this index becomes write-intensive and accounts for about 45% of the database writes (another 54% of the write I/Os target pages of the Account table). Thus, adding this index almost doubles the write load of the benchmark. In the next test scenario, in addition to basic IPA, we enabled IPA-IDX for the new index and used the same [NxM]=[2x270] configuration scheme as in the case of the TPC-C benchmark. With this, the total fraction of write I/Os performed using in-place appends (delta_writes) increased to 64%. Consequently, the GC overhead decreases by a further 30%, and is in total 66%-68% lower as compared to the baseline where IPA is not applied. Transactional throughput improves with the basic IPA by 12%, and with both IPA variants by 28%. The total space overhead of IPA and IPA-IDX together is about 2%.

As we can see, IPA-IDX provides similar performance and longevity advantages for indexes as the basic IPA does for tables. The ability to use IPA-IDX for certain regions and objects, while simultaneously applying basic IPA for other regions, allows us to lower the


Flash SSD 50GB, TPC-B, SF = 2000, 2 hours | Baseline (no IPA) | IPA [2x48] | Rel.[%] | IPA [2x48] + IPA-IDX [2x270] | Rel.[%]
Out-of-Place Writes vs. In-Place Appends  |        -          |   62/38    |         |            36/64             |
Host Reads (4KB)                          |    41,901,849     | 47,943,305 |   +14   |          55,618,681          |  +33
Host Writes (4KB)                         |    43,523,639     | 49,535,369 |   +14   |          57,262,170          |  +32
GC Page Migrations                        |    23,767,131     | 17,139,595 |   -28   |          10,422,430          |  -56
GC Erases                                 |     1,030,365     |    726,637 |   -29   |             460,509          |  -55
GC Page Migrations per Host Write         |         0.546     |      0.346 |   -37   |               0.182          |  -67
GC Erases per Host Write                  |         0.024     |      0.015 |   -38   |               0.008          |  -66
I/O Response Time [ms], READ              |          0.52     |       0.42 |   -19   |                0.32          |  -39
I/O Response Time [ms], WRITE             |          0.52     |       0.43 |   -16   |                0.37          |  -29
Tx. Throughput                            |         3,668     |      4,119 |   +12   |               4,677          |  +28
Space overhead [%]                        |        -          |       +0.6 |         |                +2.1          |

Table 7.5.: Evaluation results of IPA-IDX on TPC-B benchmark. Source: Hardock et al. [33]

overall GC overhead by over 60%, while keeping the total space overhead of the IPA approach below 5%.

It is important to mention that in the current IPA-IDX implementation we store complete and unchanged Shore-MT log entries in the delta records. We did this for simplicity of implementation. However, most of the metadata stored in a log record is irrelevant for IPA (e.g., category, transaction id, page id, root id, etc.). Removing this data from the delta records would save 40 bytes per log entry, i.e., about 50 bytes per "lean" log record would be needed instead of 90 bytes. Thus, with this optimization the [2x270] scheme applied above could be replaced with a [2x150] scheme, which would reduce the space overhead of IPA-IDX by 44%.
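The sizing argument behind the [2x270] scheme can be summarized in a short sketch. The 56-byte metadata header, the 16-byte element and the packing of three index log records per delta record follow the description above; the constants and helper names below are otherwise illustrative and do not reproduce the actual Shore-MT definitions.

#include <stdio.h>

/* Simplified sketch of the sizing logic behind the [2x270] IPA-IDX scheme. */

#define METADATA_SIZE   56   /* total length, type, category, page id, ...   */
#define ELEMENT_SIZE    16   /* tuple identifier                             */
#define DELTA_RECORD_SZ 270  /* one delta record = up to 3 index log records */
#define N_DELTA_RECORDS 2    /* [NxM] = [2x270]                              */
#define PAGE_SIZE       4096

/* Size of one B-Tree insert/delete log record for a given key size. */
static int index_log_record_size(int key_size) {
    return METADATA_SIZE + key_size + ELEMENT_SIZE;
}

int main(void) {
    /* Composite index over three numeric attributes, e.g. NO_IDX: key = 12 bytes. */
    int rec       = index_log_record_size(12);
    int per_delta = DELTA_RECORD_SZ / rec;   /* log records per delta record */
    double ovh    = 100.0 * N_DELTA_RECORDS * DELTA_RECORD_SZ / PAGE_SIZE;

    printf("log record: %d bytes, %d records per delta record, "
           "space overhead: %.0f%% of a 4KB page\n", rec, per_delta, ovh);
    return 0;
}

For a 12-byte key this gives 84-byte log records, three records per delta record and a space overhead of roughly 13% of a 4KB page, consistent with the numbers reported above.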

8. Auto-Tuning

Chapter 6 discusses one of the most important and fundamental features of the NoFTL architecture - the concept of configurable Flash storage. The evaluation results at the end of that chapter, and in our previous work (e.g., [34], [35], [29]), show the efficiency of this approach with regard to the performance and longevity characteristics of Flash storage. However, there are three important questions about the utilization of multi-region configurations that were left untouched in Chapter 6. These questions are:

1. How to find the optimal data placement configuration scheme for a particular workload and system?

2. How to transform a database from one configuration scheme to another?

3. How to guarantee even wear-out of the whole Flash storage in a multi-region configuration?

8.1. Selection of Multi-Region Configuration

The process of selecting a proper NoFTL configuration is based on an analysis of the database data and, consequently, on finding an optimal match between this data, the units of Flash storage (e.g., dies/chips) and the different Flash management schemes. This task is to a certain degree similar to the general problem of finding an optimal data placement for given database objects on a set of distributed storage units. It is known to be NP-hard [8], and the typical approach to solving such problems is based on heuristics, e.g., [65], [5]. The search for a data placement configuration in NoFTL is even more complex, as the properties of the storage units (in our case regions) are not known in advance. Thus, the number of chips (i.e., capacity, parallelism), the Flash management scheme and the NAND mode of every region must also be determined. As a small part of this research work we have also tried to develop a heuristic for finding an optimal data placement configuration. Although the algorithm is demonstrated to provide good results for TPC benchmarks, we consider its scope limited, and see the

development of more advanced approaches as a matter of future research in the context of self-driving DBMS.

In this work and in the current implementation of the NoFTL prototype the data placement configuration must be selected manually by the database administrator. However, this process is supported by the proposed NoFTL Advisor. The advisor performs two major tasks: (i) it collects all relevant information about the workload; and (ii) it aggregates and presents this information to the database administrator in a suitable form.

There are two main views in the advisor. The summary view gives an overview of the whole database and workload. It allows the DBA to see which database objects dominate the workload based on their sizes and their I/O read and write loads (Figure 8.1). Moreover, this view contains information about the transactional profile of the workload. The per-object view provides detailed information about a selected database object. The typical information here is: (i) the type of the database object; (ii) I/O counts, sizes and response times; (iii) the distribution of I/O requests over the address space occupied by this object; and (iv) the distribution of update sizes. One part of this information was already available in the DBMS; for another part we integrated additional statistics, which are updated online¹; other information (e.g., access skew, update sizes) is collected by offline analysis of the database log files.

The NoFTL advisor can also provide certain suggestions regarding some options of Flash management for each particular database object. For instance, based on the clear dominance of a certain access pattern (e.g., read-mostly, append-only) the framework suggests an appropriate address translation scheme (e.g., BLM, FASTer, etc.). Moreover, the advisor can provide hints about suitable IPA configurations (see our demonstration of the IPA advisor in [36]). From our practical experience of choosing an appropriate configuration for several common benchmarks (TPC-C, TPC-B and TATP), we can say that this process of manual search assisted by the advisor is quite easy and straightforward. The summary view gives a good initial picture of the possible regions, while the detailed information for each object allows us to select the appropriate set of Flash management parameters (see Chapter 6.1.2).
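As an illustration of the kind of rule-based hint the advisor gives, the following sketch aggregates hypothetical per-object I/O counters and suggests an address translation scheme when one access pattern clearly dominates. The struct, the thresholds and the pattern-to-scheme mapping are illustrative assumptions only; the advisor's actual implementation is not shown here.

/* Minimal sketch of a rule-based advisor hint (illustrative only). */

typedef struct {
    const char   *name;          /* database object (table or index)     */
    unsigned long reads;         /* observed 4KB read I/Os               */
    unsigned long writes;        /* observed 4KB write I/Os              */
    double        append_ratio;  /* fraction of writes that are appends  */
} object_stats_t;

static const char *suggest_mapping(const object_stats_t *s) {
    double total = (double)s->reads + (double)s->writes;
    if (total == 0.0 || (double)s->reads / total > 0.95)
        return "coarse block-level mapping (read-mostly object)";
    if (s->append_ratio > 0.90)
        return "append/log-friendly translation scheme (append-only object)";
    return "fine-grained or hybrid mapping (mixed update pattern)";
}

/* Usage: fill object_stats_t from the advisor's counters and print
 * suggest_mapping(&stats) next to the per-object view. */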

8.2. Change of Multi-Region Configuration

Another important question is how to change the database from one NoFTL configuration (be it a single- or multi-region configuration) to another. Apart from applying the initial configuration, in some cases it might be necessary to change the existing configuration later on. One such reason is the evolution of the workload. If, for instance, the access patterns or the temperature of large database objects change significantly, it might become reasonable

¹Integrated statistical counters are striped per core to avoid CPU contention during their update.


[Figure 8.1 shows an excerpt of the advisor's summary output. It contains three parts: per-store I/O statistics (columns ST-ID, WTR_US, WTW_US, ETR_US, ETW_US, RTR_US, RTW_US, NUM_R, NUM_W), per-region statistics (RG-ID with the same I/O columns plus GC_WR and GC_ER), and per-transaction-type latency statistics (mean, variance and standard deviation in ms and µs for each transaction type).]

Figure 8.1.: Excerpt from an example of summary output of NoFTL advisor.

to adapt the region configuration to reflect these changes. Another reason for switching the NoFTL configuration might be changes in the database schema (e.g., the deletion or creation of database objects). Our solution for this is encapsulated in a special database module called the migrator. The main characteristic properties of the migrator are the following.

• Migration from one NoFTL configuration to another happens in parallel to the normal database work, i.e., online, without stopping or pausing the workload. Obviously, the prerequisite for this property is that at any point in time during the migration the database remains in a consistent state. Although for some systems it would be possible to take the database offline for a short time and then switch to a new configuration, for the majority of production systems even a few minutes of downtime are unacceptable. That is why we developed only the online version of the migrator.

• Migration itself can be restarted. For instance, if the system crashes during the transformation to another configuration, then after restart and normal recovery the migration process is resumed from the point where it was when the crash occurred. Moreover, it is possible to pause and resume the migration job manually.

• The duration of the migration process can be planned and configured. Migration is based on copying the relevant data from one place on Flash memory to another. Although this process happens in the background, it still influences the I/O throughput of the storage and thus (depending on the I/O-boundedness of the system) might negatively impact the overall performance of the system (e.g., transactional throughput). By configuring the number of background threads that perform the migration in parallel, it is possible to select the most appropriate tradeoff between the migration time and the I/O overhead during that time. For instance, if the migration task is scheduled for a time when the database is under low load (e.g., service windows, during the night, etc.), the number of migration threads can be increased to speed up the change of configuration. If, however, there is no such opportunity, the migration process should be configured with only a few threads, which results in a very small background I/O overhead. It is always possible to pause the migration process and resume it with a modified number of background threads.

• Migration touches only the relevant database objects. Upon triggering a background job, the migrator performs an analysis of both configurations (current and target). Based on this it determines which database objects need to be migrated to the new configuration. For instance, if a certain object is deleted from the database schema, but the overall region configuration remains, the migration process is a simple modification of the metadata describing the configuration (no data movement is needed). Similarly, if a newly created database object should be added to an existing region without changing its properties (e.g., size, mapping scheme, etc.), no data transfer is necessary. Thus, data that belongs to regions which remain unchanged in the target configuration is not touched during the migration process.

Let us now look deeper into the migration process. Consider the simplified example in Figure 8.2. Here, 5 database objects and 8 available Flash chips are clustered into two regions in the current NoFTL configuration. Region #1 should be split into two regions in the target configuration (regions #1 and #3), while region #2 remains unchanged.

[Figure 8.2 depicts the current and the target configuration side by side: in the current configuration the database objects and NAND chips are clustered into two regions, each with its own mapping table (e.g., FASTer and PLM); in the target configuration region #1 is split into regions #1 and #3 (the new region with its own mapping table, e.g., DFTL), while region #2 keeps its database objects, NAND chips and mapping table.]

Figure 8.2.: Example of a migration to target NoFTL configuration.

At the beginning, the Migrator creates an auxiliary, transient data structure to track the migration process - the MS (migration status) vector with N elements, where N is the total number of database pages. Each element has a size of two bits (e.g., a 1TB database with 8KB pages would produce a 32MB MS vector), which determine the current state of the corresponding database page (LPN = offset in the vector). The states are the following: (i) 00 - the page is referenced by the current configuration and is neither in read nor in write I/O; (ii) 01 - the page is referenced by the current configuration and is currently in read I/O; (iii) 10 - the page is currently migrating to the target configuration; (iv) 11 - the page is referenced by the target configuration, i.e., the page was migrated to the target NoFTL configuration. Two bits per database page ensure that pages can always be correctly located by the database during the migration process.

Read I/O during Migration. When the DBMS issues a fetch request for a certain page (see Algorithm 8.1) it first checks the page's current status. If the status is 11, the I/O is processed by consulting the target configuration. Otherwise, the DBMS tries to atomically change the status to 01 (compare-and-swap operation (cas)), assuming the previous value being 00. If this cas succeeds, the I/O is processed by consulting the current configuration, and at the end of the I/O the status is atomically set back to 00. If, however, the cas did not succeed, then the page had another (other than 00) status upon comparison. It might be that it is either in migration (10), or it was already migrated (11), or another thread is currently performing a read I/O on this page. In all these cases we simply try to repeat the whole procedure. Thus, in the rare case when the DBMS tries to fetch a page that is currently migrating, the system will wait until the migration is finished (typically a delay of one COPYBACK operation on NAND, e.g., 250µs), and then fetch the page by consulting the target configuration.
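Algorithms 8.1 and 8.2 below operate on abstract helpers (GetPageMigrationStatusRef, CAS). The following minimal sketch shows one way such a two-bit-per-page MS vector could be backed by a packed atomic array; the packing scheme and helper names are assumptions made for illustration and do not reproduce the prototype's code.

#include <stdatomic.h>
#include <stdint.h>
#include <stdbool.h>

/* Two-bit migration-status (MS) vector, four pages per byte.
 * 1TB database / 8KB pages = 2^27 pages -> 32MB of status bits. */

enum { MS_IDLE = 0x0, MS_IN_READ = 0x1, MS_MIGRATING = 0x2, MS_MIGRATED = 0x3 };

typedef struct {
    _Atomic uint8_t *bytes;   /* packed 2-bit entries */
    uint64_t         npages;
} ms_vector_t;

static inline uint8_t ms_get(ms_vector_t *v, uint64_t lpn) {
    uint8_t b = atomic_load(&v->bytes[lpn / 4]);
    return (b >> ((lpn % 4) * 2)) & 0x3;
}

/* Atomically move one page entry from `expected` to `desired`. Returns false
 * if the entry (or a neighbour sharing the same byte) changed concurrently,
 * in which case the caller simply retries, as described above. */
static inline bool ms_cas(ms_vector_t *v, uint64_t lpn,
                          uint8_t expected, uint8_t desired) {
    unsigned shift = (lpn % 4) * 2;
    uint8_t old = atomic_load(&v->bytes[lpn / 4]);
    if (((old >> shift) & 0x3) != expected)
        return false;
    uint8_t updated = (uint8_t)((old & ~(0x3u << shift)) | ((uint8_t)desired << shift));
    return atomic_compare_exchange_strong(&v->bytes[lpn / 4], &old, updated);
}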

Algorithm 8.1 Reading a page during migration process
Input: lpn - logical number of the page to be read; buffer - buffer to place the read page
Output: return code of the read operation

1:  function ReadPageDuringMigration(lpn, buffer)
2:      curStatusRef ← GetPageMigrationStatusRef(lpn)
3:      if curStatusRef = 11 then                    ▷ already migrated
4:          return ReadPageFromConf(targetConf, lpn, buffer)
5:      else
6:          newStatus ← 01                           ▷ in READ I/O
7:          expStatus ← 00                           ▷ neither in READ nor in WRITE I/O
8:          statusChanged ← CAS(curStatusRef, newStatus, expStatus)
9:          if statusChanged = True then
10:             ret ← ReadPageFromConf(curConf, lpn, buffer)
11:             newStatus ← 00
12:             expStatus ← 01
13:             statusChanged ← CAS(curStatusRef, newStatus, expStatus)
14:             if statusChanged = True then
15:                 return ret
16:             else
17:                 return ERROR_WRONG_STATUS
18:             end if
19:         else
20:             return ERROR_PAGE_IN_USE
21:         end if
22:     end if
23: end function

Write I/O during Migration. Upon a write request the DBMS also first consults the MS vector (see Algorithm 8.2). If the current status of the corresponding logical address is 11, the write request is delegated to the target configuration. Otherwise, the DBMS tries to perform a cas operation to set the element to 10, assuming the current value being 00. If this operation succeeds (i.e., the address becomes marked as "in migration"), the page is written to a physical location determined by the target configuration, then the status is changed to 11 ("migrated"), and afterwards (if necessary) the old page is invalidated (its physical address is obtained by consulting the current configuration). If the cas operation failed (i.e., the status of the address was other than 00), we repeat the above procedure.

One important point about write requests is that they also contribute to the migration process, which is happening in the background. All pages flushed by the DBMS (e.g., due to buffer eviction or a checkpoint) are written to storage with respect to the target configuration, and thus they are automatically migrated at no additional cost. For instance, if the workload tends to overwrite large portions of the database data within a short time, it makes sense to use only a few migration threads. In this case, the migration will be performed primarily by normal DBMS writes (with no additional migration costs), while a few background threads migrate the remaining parts of the data.

Migration Threads. Before a background migration process begins, the Migrator checks for pages that do not require data movement. For instance, these are pages of database objects that belong to regions which remain unchanged between the current and the target configuration (e.g., region #2 in Figure 8.2). In this case, the relevant transient data structures that correspond to such regions are copied (or moved) to the target configuration. There are also other cases when no physical data migration is needed and only mapping entries are copied between transient data structures (e.g., when the region composition is only partially changed). For all these pages, the corresponding entries in the MS vector are marked as 11. Afterwards, the remaining entries are divided equally between the K background threads of the Migrator. Each migration thread iterates over its set of logical page addresses and (if not already done) performs a migration of the corresponding pages from the current configuration to the target one. For this, it follows the same procedure as the normal write requests (see Algorithm 8.2); the only difference is that no data transfer to/from the host is needed - Flash pages are copied to the new physical location via the COPYBACK command (see Chapter 4). After checking all pages in its set, the thread either terminates normally if all pages have been migrated, or it repeats the check again if there were skipped pages during the last iteration (e.g., pages that were in read I/Os).

Algorithm 8.2 Writing a page during migration process
Input: lpn - logical number of the page to be written; buffer - buffer holding the page's content
Output: return code of the write operation

1:  function WritePageDuringMigration(lpn, buffer)
2:      curStatusRef ← GetPageMigrationStatusRef(lpn)
3:      if curStatusRef = 11 then                    ▷ already migrated
4:          return WritePageFromConf(targetConf, lpn, buffer)
5:      else
6:          newStatus ← 10                           ▷ in migration
7:          expStatus ← 00                           ▷ neither in READ nor in WRITE I/O
8:          statusChanged ← CAS(curStatusRef, newStatus, expStatus)
9:          if statusChanged = True then
10:             ret ← WritePageFromConf(targetConf, lpn, buffer)
11:             if ret = OK then                     ▷ written successfully ⇒ mark "migrated" & invalidate old copy
12:                 newStatus ← 11
13:                 expStatus ← 10
14:                 statusChanged ← CAS(curStatusRef, newStatus, expStatus)
15:                 if statusChanged = True then
16:                     ret ← InvalidatePage(curConf, lpn)
17:                     return ret
18:                 else
19:                     return ERROR_WRONG_STATUS
20:                 end if
21:             else                                 ▷ write failed ⇒ remove "in migration"
22:                 newStatus ← 00
23:                 expStatus ← 10
24:                 statusChanged ← CAS(curStatusRef, newStatus, expStatus)
25:                 if statusChanged = True then
26:                     return ret                   ▷ ERROR_WRITE_FAILED
27:                 else
28:                     return ERROR_WRONG_STATUS
29:                 end if
30:             end if
31:         else                                     ▷ cannot set "in migration"
32:             return ERROR_PAGE_IN_USE
33:         end if
34:     end if
35: end function

When all migration threads terminate normally, the migration process is finished, the auxiliary data structures (e.g., the MS vector) are removed, and the target configuration becomes the current one.
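A background migration thread essentially performs the same state transitions as Algorithm 8.2, but copies pages on the device via COPYBACK instead of transferring data through the host. The following sketch illustrates this loop; all helper functions and their signatures are illustrative placeholders, not the prototype's real API.

#include <stdint.h>
#include <stdbool.h>

extern uint8_t  ms_status(uint64_t lpn);                          /* read MS entry       */
extern bool     ms_try_transition(uint64_t lpn, uint8_t f, uint8_t t); /* CAS on MS entry */
extern uint64_t ppn_lookup_current(uint64_t lpn);                 /* current config      */
extern uint64_t ppn_alloc_target(uint64_t lpn);                   /* target config       */
extern bool     flash_copyback(uint64_t src_ppn, uint64_t dst_ppn);
extern void     invalidate_current(uint64_t lpn);

/* Returns the number of pages that had to be skipped (e.g., still in read I/O);
 * the caller re-runs the pass until this drops to zero. */
uint64_t migrate_range(uint64_t first_lpn, uint64_t last_lpn) {
    uint64_t skipped = 0;
    for (uint64_t lpn = first_lpn; lpn <= last_lpn; lpn++) {
        if (ms_status(lpn) == 0x3)              /* 11: already migrated          */
            continue;
        if (!ms_try_transition(lpn, 0x0, 0x2)) {/* claim: 00 -> 10               */
            skipped++;                          /* in read I/O or changed concurrently */
            continue;
        }
        uint64_t src = ppn_lookup_current(lpn);
        uint64_t dst = ppn_alloc_target(lpn);
        if (flash_copyback(src, dst)) {         /* on-device copy, no host transfer */
            ms_try_transition(lpn, 0x2, 0x3);   /* 10 -> 11: migrated            */
            invalidate_current(lpn);
        } else {
            ms_try_transition(lpn, 0x2, 0x0);   /* copy failed: release claim    */
        }
    }
    return skipped;
}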

Resuming Migration Job. The migration process can be paused and resumed at any point in time. It is also possible to change the number of background threads upon resuming. The migration is also safe against a system crash, after which it continues from the last state before the crash. To ensure this property, two implementation details were necessary.

First, the flag indicating that the system is in migration, as well as the description of the target configuration (in the form of the DDL statements defining every region), are persisted (e.g., as part of the database catalog) before the migration starts. With this, after a system restart the DBMS can identify whether there is an unfinished migration.

Second, in order to correctly reconstruct the address translation tables in both configurations, as well as the MS vector, we reserve one bit in the OOB area of every Flash page (the so-called configuration bit), which allows us to determine under which NoFTL configuration this page was last written to Flash storage. As only two concurrent configurations are possible, a single bit is enough to differentiate between them. For instance, logical zero in this bit might indicate the current configuration, while logical one indicates the target configuration; in the next migration this is reversed, i.e., logical one will indicate the current configuration. Thus, when the system starts and the DBMS has identified that a migration process was interrupted and must be resumed, it starts the normal NoFTL recovery (see Chapter 5.5) with the only difference that, depending on the configuration bit, either the transient mapping table of the current configuration or that of the target configuration is updated. Consequently, the status of the page in the MS vector is either set to 00 (pages not migrated before the crash) or to 11 (pages already migrated before the crash). Once the NoFTL recovery and the DBMS recovery finish, the DBMS goes online and the background migration is resumed.
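The role of the configuration bit during recovery can be sketched as follows. The OOB layout and all helper names are assumptions made for illustration; only the decision logic - update the current or the target mapping table and set the MS entry accordingly - follows the description above.

#include <stdint.h>
#include <stdbool.h>

extern bool     oob_config_bit(uint64_t ppn);   /* configuration bit stored with the page */
extern uint64_t oob_lpn(uint64_t ppn);          /* logical address recorded in the OOB    */
extern void     map_update_current(uint64_t lpn, uint64_t ppn);
extern void     map_update_target(uint64_t lpn, uint64_t ppn);
extern void     ms_set(uint64_t lpn, uint8_t status);

/* `target_bit` is the bit value assigned to the target configuration of this
 * particular migration (it flips with every migration). Called for every valid
 * Flash page scanned during NoFTL recovery. */
void recover_page(uint64_t ppn, bool target_bit) {
    uint64_t lpn = oob_lpn(ppn);
    if (oob_config_bit(ppn) == target_bit) {
        map_update_target(lpn, ppn);   /* page was written under the target config */
        ms_set(lpn, 0x3);              /* 11: already migrated before the crash    */
    } else {
        map_update_current(lpn, ppn);  /* page still belongs to the current config */
        ms_set(lpn, 0x0);              /* 00: not yet migrated                     */
    }
}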

Memory and Storage Overheads of Migration. The memory overhead of the migration process consists of two main parts: (i) the MS vector holding the migration status of every database page (two bits per page); and (ii) the memory consumed by the transient data structures of the current configuration (e.g., the address translation tables of every region). In the worst case, if all regions of the current configuration use pure page-level address translation, every terabyte of Flash storage requires an additional 512MB for mapping tables and 32MB for the MS vector during migration (assuming 8KB Flash pages). After the migration all transient data structures associated with the current (old) configuration are deleted. Note, however, that configurations typically require much less memory due to utilization of the configurable Flash storage concept (see Chapter 6.1.2).

The migration process does not require any additional or reserved space on Flash storage. Every Flash page is referenced by only one configuration (current or target). Only during the short time when a page is being migrated to the target configuration and its copy in the current configuration is not yet reclaimed by garbage collection do two copies of that particular page exist on Flash storage. But this situation is natural for Flash memory due to the out-of-place update strategy: every normal GC write in any Flash SSD has the same behavior - two identical copies of the page exist on Flash memory for a very short time. With this, one can see the whole migration process as a global garbage collection that touches all regions changed in the target configuration.

The migration process can be properly configured (number of background migration threads) and scheduled by the database administrator based on the following information: (i) the total I/O overhead of the migration, which can be easily estimated by comparing the current and target configurations; (ii) the performance characteristics of the Flash storage; and (iii) the typical I/O load produced by the database over time. It is important to remember that the migration process causes no data transfer between the host (DBMS) and Flash storage. Database pages are migrated to their new physical locations via Flash-internal COPYBACK commands. Thus, neither the database buffer nor the I/O link is influenced by the migration.
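The quoted worst-case figures can be checked with a small back-of-the-envelope computation; the 4-byte page-level mapping entry used below is an assumption that reproduces the 512MB number.

#include <stdio.h>
#include <stdint.h>

int main(void) {
    const uint64_t capacity  = 1ULL << 40;            /* 1TB              */
    const uint64_t page_size = 8 * 1024;              /* 8KB Flash pages  */
    const uint64_t pages     = capacity / page_size;  /* 2^27 pages       */

    uint64_t mapping_bytes = pages * 4;       /* 4B entry per page -> 512MB */
    uint64_t ms_bytes      = pages * 2 / 8;   /* 2 bits per page   ->  32MB */

    printf("pages: %llu, mapping: %lluMB, MS vector: %lluMB\n",
           (unsigned long long)pages,
           (unsigned long long)(mapping_bytes >> 20),
           (unsigned long long)(ms_bytes >> 20));
    return 0;
}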

8.3. Wear-Leveling in Multi-Region Configuration

Chapters 5.6, 6.3 and 7.4 show that the integration of Flash management into the DBMS, the application of the concept of configurable Flash storage, and the utilization of the IPA approach together allow the overall number of performed erase operations to be decreased by an order of magnitude as compared to traditional FTL-based Flash SSDs (e.g., 9x-24x fewer erases under the TPC-C benchmark, as reported in Table 6.1). As the wear-out of Flash memory directly depends on the number of performed program/erase (P/E) cycles (Chapter 2.3.1), we can clearly state that the NoFTL architecture significantly improves the longevity characteristics of Flash storage. However, proper wear-leveling is still essential. Region-specific (local) wear-leveling and data placement strategies ensure even wear-out of the Flash blocks in all chips of that region (see Chapter 6.1.2). However, as regions typically cluster database data based on its access properties, the wear-out speed in every region (average number of P/E cycles across the chips in this region within a certain time) might differ from the others. Although these differences are typically small, we need to provide a mechanism to ensure roughly even wear-out of the whole Flash storage. In this chapter we discuss the details of a cross-region (global) wear-leveling strategy.

Similar to local WL, global WL can also be realized in different ways. Analysis and evaluation of WL strategies is a separate, self-contained topic. In the following we concentrate on one approach to global WL that is based on a slight modification of existing garbage collection algorithms. Remember that one of the main characteristics of a NoFTL region is that it is defined as a flexible composition of NAND chips, i.e., although the number of chips in a region is fixed, the composition of chips can change over time (see Chapter 6.1). Thus, by monitoring the average erase counts in every chip, global WL can decide to switch two chips of different regions. For instance, one chip from a region that contains cold data might be replaced with a chip from a region containing hot data.

The simplified scheme for switching two chips between different regions is as follows. To track the exchange procedure we use two block-state bit vectors (BS) - one per chip. Each vector stores the state of all blocks in the corresponding chip. As only one bit per block is required, the total size of both vectors is very small (e.g., for NAND chips containing 4096 blocks both vectors together require 1KB of memory). Logical zero means that the corresponding block contains data that needs to be moved to the other chip, while logical one means that this block was already moved to the target chip. Initially all entries corresponding to non-empty blocks are marked with zeros, while empty blocks are marked with ones.

The core of the exchange procedure is based on a simple modification of the function that returns the next empty block in a chip (get_next_free_block). Typically, for every chip there is a list of free blocks. Whenever the current block of a particular group is full, this function returns (based on the local WL) the next block from this list. However, in case the chip is marked to be exchanged with another one, this function returns an empty block from the free list of that other chip. This function is used for both normal DBMS writes (host writes) and the page migrations of garbage collection. Thus, all DBMS writes that correspond to chips that are currently in exchange are placed on the correct (target) chip. Whenever one of these chips runs out of free blocks, garbage collection on this chip is triggered (or it is triggered proactively as a background process). Now, in contrast to the normal algorithm, GC selects a victim block only from those that have a status of logical zero (i.e., that need to be moved) in the BS vector. All valid pages from this block are written to the desired target chip. Afterwards, the block is erased and marked with logical one in the BS vector. Once both BS vectors contain no zeros, the exchange procedure is finished and the auxiliary data structures are removed. The key advantage of this algorithm is that normal host writes help to exchange data between the chips. Moreover, the data being copied by garbage collection contains only valid Flash pages, thus saving the GC overhead that would be needed anyway to reclaim the space occupied by invalidated pages.
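The core of the exchange procedure - the per-chip block-state vector and the redirected get_next_free_block - can be sketched as follows. The data layout and helper names are illustrative assumptions rather than the prototype's actual code.

#include <stdint.h>
#include <stdbool.h>

#define BLOCKS_PER_CHIP 4096

typedef struct chip {
    uint64_t     bs[BLOCKS_PER_CHIP / 64]; /* 0 = data still to move, 1 = moved/empty */
    struct chip *exchange_with;            /* NULL if the chip is not being exchanged */
    /* ... free-block list, erase counters, ... */
} chip_t;

extern uint32_t pop_free_block(chip_t *c);   /* local WL picks from the free list */

/* Modified allocation: while a chip is being exchanged, new blocks are taken
 * from the partner chip, so host writes and GC migrations land on the target. */
uint32_t get_next_free_block(chip_t *c) {
    chip_t *source = (c->exchange_with != NULL) ? c->exchange_with : c;
    return pop_free_block(source);
}

/* GC victim selection during an exchange: only blocks whose BS bit is still 0
 * (i.e., they hold data that must be moved) are eligible. */
bool needs_move(const chip_t *c, uint32_t block) {
    return ((c->bs[block / 64] >> (block % 64)) & 1ULL) == 0;
}

void mark_moved(chip_t *c, uint32_t block) {
    c->bs[block / 64] |= 1ULL << (block % 64);
}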


9. Related Work

The main purpose of this research work is optimizing the usage of Flash storage for database management systems. For this, we propose the NoFTL architecture, which allows us to significantly increase the performance and longevity characteristics of Flash SSDs. Different aspects of this approach have been partially presented in our earlier publications [30], [34], [35], [29], [32], [36], [31], [33], as well as in the tutorials [77] and [78]. Separate research, which took the NoFTL architecture as a foundation and a baseline, is now in progress in our group. It concentrates on the direction of native storage processing for Flash and NVM-based storage, and has already shown significant results [101].

Since the topic of the effective use of Flash storage has been a hot trend in research for some time, there is a considerable body of related work done by academia and industry. Various optimization solutions show significant performance improvements of Flash-based storage in different scenarios. That work was both the base and the motivation for our own research in this field. The largest part of the related work concentrates on improving the performance of traditional, FTL-based SSDs (e.g., the design of new FTLs, proper tiering of Flash in the storage hierarchy, optimization of DBMS algorithms to leverage fast random reads, etc.). Others recognized the problems associated with the black-box architecture of SSDs and tried to solve them. In most cases, they extend the traditional interface with new primitives, which allow the DBMS to leverage a certain property of native Flash or of the on-device FTL. Only recently have solutions been proposed which, similarly to NoFTL, completely revisit the I/O stack and propose radical changes in its design. In the following we describe several approaches from each group and discuss their relation to the NoFTL architecture.

9.1. FTL-based SSDs

FTLs In the past numerous designs of FTLs have been proposed. These approaches primarily address the type and behavior of the logical-to-physical address translation, as this is the key functionality of every FTL scheme, which to a large extent also determines the other management tasks (garbage collection and wear-leveling). Based on the granularity of the mapping table all solutions can be classified into three groups: (i) block-level mapping ("constrained" mode in [11], AFTL in [54], [47]); (ii) page-level mapping ([28], [39], [62], [46], [113]); and (iii) hybrid mapping solutions ([48], [45], [52], [75], [56], [37]). A brief discussion of the major characteristics, advantages and bottlenecks of all three groups, as well as an analysis of some of the above schemes, is provided in Chapter 2.3.2. A further evaluation and comparison of different FTLs is provided in [16] and [61].

Also interesting is the work of Bonnet et al. in [11], where they proposed the design of a bimodal SSD. It should be able to operate in two modes: (i) if the DBMS issues "constrained" I/O patterns (i.e., no in-place updates, no random writes) the SSD uses a minimal FTL, which provides only block-level mapping and wear-leveling; (ii) for all "unconstrained" I/O patterns the SSD switches to a traditional FTL. While such bimodal SSDs would be beneficial for decision support systems with mostly read-only workloads and batch updates, in OLTP systems the DBMS would issue unconstrained I/O patterns most of the time.

Similar to the "constrained" mode of bimodal SSDs [11] is also the idea of application-managed Flash (AMF) presented by Lee et al. in [54]. They suggest that applications should issue only append-only write requests, and allow logical addresses to be re-used only after explicitly triggering erases on those addresses (via a TRIM command). With these constraints the on-device FTL scheme (called AFTL) can be significantly relaxed. AFTL requires only a very small segment-based address translation table (similar to block-level mapping), simplified wear-leveling and bad-block management. As erases are directly controlled by the applications, no garbage collection on the device is needed. This approach is a good match for applications that rely on the log-structured approach, for instance, log-structured file systems and storage engines based on LSM-Trees.

Although FTL schemes might differ significantly from each other, they all share one common property - their performance and efficiency are workload-dependent [88]. There are FTLs that are optimized for read-mostly workloads, or for write-intensive workloads with random or sequential update patterns; for skewed or uniform workloads; with strong or weak response time guarantees, etc. Consequently, the overall performance of SSDs also becomes workload-dependent, with frequent and unpredictable performance outliers due to background FTL processes [16].

To overcome the issues derived from the workload-dependent efficiency of the FTL and workloads that are unknown in advance, NoFTL allows the dynamic selection of the most appropriate set of Flash management algorithms. Thus, Flash management can evolve over time in response to changes in the workload. Furthermore, the utilization of regions allows us to provide multiple such algorithms simultaneously, each being most suitable for a particular group of database objects.

Leveraging Out-of-Place Updates Another group of proposed approaches is based on leveraging the out-of-place update strategy, which is applied in all Flash SSDs. The fact that modified data is always written to a new physical location on Flash memory, while the old version of the data is kept for a certain time, can be efficiently used to provide I/O and transactional atomicity, as well as the widely used concept of snapshots.

Kang et al. in [46] have suggested to enrich the responsibility of the on-device FTL in order to utilize internal out-of-place updates for providing DBMS transactional atomicity. Through a modified I/O interface the DBMS notifies the SSD (i) on every read and write I/O request to which transaction it belongs, and (ii) about the end or abort of every transaction. The latter maintains an internal transaction table and keeps obsolete versions of modified database pages belonging to running transactions for recovery purposes. The authors have integrated X-FTL into the SQLite database engine¹ and evaluated it on multiple benchmarks using the OpenSSD Jasmine board. The tests have shown a significant performance improvement, which is the result of the reduced number of write requests performed by the DBMS (no write-ahead logging is needed any more). However, the approach has several drawbacks. First, due to the additional memory (transaction table, page-level mapping) and computational overheads on on-board SSD resources, it is suitable only for systems with a quite low level of concurrency. Second, the disadvantages of the block device interface and the black-box architecture of SSDs (see Chapter 1.2) are not considered by this solution.

A similar approach was also proposed by Prabhakaran et al. in [79]. They integrate into the SSD additional functionality that takes over the responsibility for the atomicity, isolation and recovery properties provided by file systems. The interface of such an SSD (TxFlash SSD) is enhanced with two new commands, WriteAtomic and Abort, through which the file system notifies the storage about single transactions. This approach, however, was designed specifically for file systems, and is not suitable in that form for common database systems because of an insufficient interface, locking-based isolation, and no deadlock detection.

Ouyang et al. in [74] introduced a new I/O primitive "atomic-write", which is utilized by the MySQL InnoDB storage engine to eliminate redundant writes (the DoubleWrite buffer in MySQL) for recovery purposes. The approach, however, does not allow several atomic-writes to be executed simultaneously, which might be a problem in highly concurrent OLTP systems.

Weiss et al. in [102] propose the general virtualization layer ANViL for Flash SSDs. This layer, as a part of the FTL scheme, enriches the SSD with functionality that allows applications to realize (i) snapshot techniques, (ii) atomic writes, and (iii) transactional atomicity without additional I/O overhead. For this, they extend the normal block device

¹https://www.sqlite.org/index.html

interface with three additional commands: clone, move and delete, which are defined on a range of logical addresses. For instance, let us consider the behavior of the clone operation [102], which is the major one for providing the above-mentioned functionality. It takes as arguments two logical address ranges and their size. One of these ranges (the source) is assumed to be already mapped to a physical range, i.e., this range corresponds to valid data on the Flash SSD. The other range (the destination) is a newly allocated address range, which is not mapped to any data on storage. The implementation of clone simply assigns (copies) the physical mappings that correspond to the source range to the destination range. In other words, after calling this operation the application has two logical address ranges, both pointing to the very same data on the Flash SSD (i.e., both are mapped by the SSD to the same physical addresses). This is the basic behavior of creating snapshots. Assume a small file spanning contiguous logical addresses from 0 to 99. An application (e.g., the file system) creates a snapshot of this file by issuing clone(0..99, size, 400..499). This operation does not cause any I/O for copying the data of that file; only the mapping entries are copied. If, later on, the application issues a write request that updates a certain piece of data in the destination address range (400..499), the SSD writes this updated data as usual in an out-of-place manner, but it updates the mappings of only this destination range, while the mappings of the source address range remain unchanged. Thus, the data corresponding to logical addresses 0..99 remains unchanged. Similarly, the atomicity of write requests and transactional atomicity are also implemented with the help of the clone operation. Also Oh et al. in [72] have proposed to extend the interface of Flash SSDs with a novel share command, which has basically the same signature, purpose and implementation as the clone command mentioned above.

Although leveraging out-of-place updates on Flash storage can significantly improve performance by removing redundant I/Os, all the approaches mentioned above have some common bottlenecks. First, by shifting the responsibility for providing atomicity and snapshots to the FTL scheme, they increase the pressure on the limited on-device resources (memory, controller). For instance, some of these approaches assume a fine-grained address translation table (e.g., pure page-level mapping), which is typically impractical for common SSDs due to its memory requirements. Second, the problems resulting from the FTL-based, black-box architecture of modern SSDs (see Chapter 1.2) are not addressed by these solutions. In contrast, in the NoFTL architecture the control over the out-of-place updates lies completely with the DBMS, which allows it to leverage them for guaranteeing atomicity and providing snapshots without the need for extra interface commands (e.g., WriteAtomic, clone, share), and without the aforementioned bottlenecks (see Chapter 5).
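The effect of clone can be modeled as a pure mapping-table operation. The following sketch is a simplified model of the semantics described above, not ANViL's actual implementation or interface.

#include <stdint.h>

typedef struct {
    uint64_t *l2p;      /* logical-to-physical mapping table */
    uint64_t  entries;
} address_space_t;

/* clone(src, len, dst): afterwards both ranges point to the same physical
 * pages; a later write into the destination range is redirected out-of-place
 * and updates only the destination mapping, leaving the snapshot intact. */
void clone_range(address_space_t *as, uint64_t src, uint64_t len, uint64_t dst) {
    for (uint64_t i = 0; i < len; i++)
        as->l2p[dst + i] = as->l2p[src + i];
}

void write_page(address_space_t *as, uint64_t lpn, uint64_t new_ppn) {
    as->l2p[lpn] = new_ppn;   /* out-of-place update touches only this entry */
}

/* Example from the text: clone_range(as, 0, 100, 400) snapshots a file that
 * occupies logical pages 0..99 into pages 400..499 without any data copy.  */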

162 Other Optimizations Research work was also done for optimizing the applications, and DBMSs in particular, which are running on traditional Flash SSDs. For instance, Gottstein et al. in [26] have proposed the SIAS approach that allows to significantly reduce the number of write requests performed in database systems with multi-version concurrency control. The basic idea is based on avoiding explicit invalidation of old tuple versions. Instead of setting invalidation timestamps on previous versions, SIAS suggests to implement an implicit invalidation by storing a pointer to the previous version in the current version of the tuple. This approach allows SIAS to eliminate all write requests that were previously done for explicit invalidation of tuple versions. Various suggestions regarding optimizing the database buffer eviction strategy for Flash SSDs were presented in [60], [21] and [112]. The Flash-aware cost model for the DBMS optimizer, which is based on awareness of Flash I/O asymmetry, was proposed in [7]. The performance of the storage interface for Flash SSDs is another important topic in research. [97] argues for the use of RDMA as general high performance storage interface. IBM describes the use of OFED RDMA interfaces under their BlueGene Active Storage initiative [23]. All these approaches, however, are based on using the traditional, FTL-based SSDs, and thus do not address issues resulting from their black-box architecture.

9.2. Native Flash Storage

This group of related research work looks deeper into the disadvantages and limitations of modern Flash SSDs, and seeks the root causes.

LightNVM Bjorling et al. in [9] discuss the unsuitability of the block device interface for Flash SSDs. Four years later, in [10] they described their solution - the open-channel SSD subsystem called LightNVM. In order to overcome the aforementioned disadvantages of today’s SSDs they propose to shift Flash management to the host. The physical page address (PPA) I/O interface of LightNVM resembles our native Flash interface (NFI), as both allow the application to execute read, write and erase commands on physical addresses on Flash memory. However, NFI, which was first introduced in [30], is not limited to only those commands. Our current prototype of the interface offers the DBMS further commands like: copyback, write_delta, get_addr_table, etc, which allow for significant reduction of GC overhead and amount of transmitted data to/from storage. Further, NFI contains a set of commands enabling near-data processing by the DBMS under NoFTL architecture (see Chapter 4).

163 Although [10] describes the LightNVM subsystem and PPA I/O interface, it does not show how this can be utilized by the DBMS to increase its performance. In this sense, the NoFTL approach can be seen as what the authors of [10] described as "future work" in regard of "tuning SQL databases for performance on open channel SSDs" and "characterizing the potential of application specific FTLs with open-channel SSDs". NoFTL did this "tuning" of the database system. It shows how Flash management can be effectively integrated into the DBMS subsystems; and how a DBMS can efficiently manage physical Flash space by means of novel storage structures. Thus, regions allow us to control available I/O parallelism of Flash storage, and apply different Flash management algorithms simultaneously depending on characteristics of database objects, while groups can further reduce the overhead of garbage collection (see Chapter 6.1).

BlueDBM Jun et al. in [43] described the architecture of the distributed Flash storage BlueDBM. It is a cluster of nodes, each containing a Flash storage device, which is accessed by the host as raw Flash memory. For this, the Flash controller provides an interface that is similar to NFI (see Chapter 4) and the PPA I/O in [10], as it defines read, write and erase commands on physical addresses. However, in contrast to the NoFTL architecture, the responsibility for Flash management is shifted to the file system (the log-structured, flash-aware file system RFS [53]). Although removing the on-device FTL is an important step in optimizing Flash storage, we argue that it is more beneficial to integrate the Flash management into the DBMS, and thus let the DBMS control the physical placement on Flash. Only the DBMS has comprehensive knowledge about the data, which allows it to optimize both the Flash management and the traditional DBMS algorithms (see Chapters 5, 6). Another important concept of BlueDBM is that every storage device is equipped with a hardware accelerator, which can perform in-storage processing. However, as the queries for the on-device FPGA must be defined on physical addresses, before sending such a query to storage the application (DBMS) must consult the file system about the physical address of each logical address touched in the query. These calls to the file system produce extra overhead, which is not present in the NoFTL architecture.

CORFU Balakrishnan et al. in [6] presented the design of a shared log called CORFU. The log is implemented on top of a cluster of Flash SSDs. Distributing the log over multiple SSDs allows the I/O bandwidth of the system to scale efficiently. Every client in the system maintains a replica of a small map (called the projection), which maps log positions to addresses on the Flash SSDs in the cluster. In order to read from a certain log position, a client consults this map and then issues I/O requests to the relevant SSDs.

To append to the log, the client first requests the position from a separate central node (called the sequencer). The sequencer returns the synchronized last log position (the log tail), which is then used by the client to write the log entry. Although the abbreviation CORFU stands for Cluster of Raw Flash Units, CORFU SSDs are, in the general case, not FTL-less (which is what we understand as raw Flash). CORFU SSDs differ from conventional SSDs in the following points. First, they provide the application with write-once semantics, exposed through an API consisting of append, read, fill and trim commands. Second, these SSDs offer the host an infinite address space. The latter requirement results in a special FTL scheme based on a page-level mapping organized as a hash table (needed to map infinite logical addresses to a limited number of physical addresses). This, however, might become an issue, as maintaining a pure page-level mapping is typically impractical for conventional SSDs due to limited on-device DRAM resources, while implementing a partially cached page-level mapping (e.g., DFTL [28]) results in significant I/O overhead (see Table 5.3 in Chapter 5.6). Similar to conventional SSDs, CORFU SSDs also include garbage collection, bad-block management and most probably wear-leveling in their FTL scheme. Consequently, CORFU does not change the black-box design of SSDs, and thus it does not address their drawbacks.
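
The append path described above can be summarized in a few lines of C. The types and helper functions below are illustrative assumptions, not CORFU's actual implementation; they only capture the protocol steps: obtain the tail from the sequencer, resolve it via the projection, and write the entry exactly once.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct { uint32_t ssd_id; uint64_t offset; } log_target;

/* Hypothetical helpers mirroring the described protocol. */
extern uint64_t   sequencer_next_tail(void);                /* central sequencer  */
extern log_target projection_map(uint64_t log_pos);         /* client-local map   */
extern int        ssd_append(uint32_t ssd_id, uint64_t offset,
                             const void *buf, size_t len);  /* write-once append  */

/* Append one log entry and return its log position (UINT64_MAX on failure). */
uint64_t corfu_append(const void *entry, size_t len) {
    uint64_t   pos = sequencer_next_tail();  /* (1) reserve the log tail         */
    log_target t   = projection_map(pos);    /* (2) log position -> SSD/offset   */
    if (ssd_append(t.ssd_id, t.offset, entry, len) != 0)
        return UINT64_MAX;                   /* (3) position already written      */
    return pos;
}
```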

File Systems for Flash Several file systems have been proposed for Flash-based storage. Some of them, such as [67] and [49], operate on conventional FTL-based Flash SSDs. By addressing the main properties of Flash (e.g., asymmetric latencies of I/O operations, the out-of-place update strategy) they can reduce the overhead caused by traditional file systems (see Chapter 5.6). However, the problems and limitations resulting from the black-box design of modern SSDs are not solved by these file systems. Another group of proposed file systems targets raw (native) Flash storage; examples are [107], [57], [44], [76], [110], [108]. These file systems typically apply a log-structured design and assume the responsibility for all or most Flash management tasks (e.g., garbage collection, wear-leveling). Exposing native Flash to the host and integrating Flash management into the file system allow them to solve multiple problems of conventional SSDs. Thus, (i) the redundant functionality between file system and FTL is removed (e.g., garbage collection, mapping); (ii) support for journaling does not require additional I/O overhead; and (iii) the richer resources of host systems allow them to use fine-grained address translation tables. However, all these approaches share one important limitation: for database systems the Flash storage remains a black box. As we have shown in this work, much more significant improvements and optimizations can be achieved by giving the DBMS (and not the file system) full control over the Flash storage. Only the DBMS has comprehensive

knowledge about the data and the workload, which is necessary to effectively manage Flash storage. Based on this knowledge the DBMS can utilize multiple Flash management schemes simultaneously, depending on the properties of the data (see Chapter 6). This solves the so-called "one size fits all" problem, which is typical not only for all modern SSDs, but also for all Flash-aware file systems. Moreover, the DBMS can better control the available Flash parallelism (see Chapters 5 and 6.1), and further eliminate redundancies along the I/O path.

9.3. IPA Approach

A detailed analysis of previous research related to the approach of in-place appends (IPA), which is discussed in Chapter 7, can be found in our earlier publication [31]. In brief, Lee et al. in [50] proposed an approach called In-Page Logging (IPL) that follows a similar idea for reducing write-amplification on Flash storage. Both solutions (IPL and IPA) avoid, to a certain extent, expensive out-of-place writes by writing out not complete Flash pages, but only the data that is actually modified (the delta). However, the key difference between IPL and IPA is where and how the delta records are written on Flash storage. IPL accumulates the delta records separately for every page, and when this per-page, in-memory delta buffer is full, it is written to a separate Flash page within the Flash block that contains the original page. When the database fetches a single database page, it must therefore read multiple Flash pages from the Flash SSD: (i) the Flash page(s) that contain the original version of the db-page; and (ii) the Flash page(s) that contain the delta records for that particular db-page. If we assume that a db-page equals a Flash page, the IPL approach will at least double the number of performed read operations. Under read-intensive workloads this read overhead on the storage and the storage link becomes a significant performance bottleneck. In contrast, IPA utilizes the ISPP technique to append the delta records directly to the very same physical Flash page that contains the original data (see Chapter 7). Thus, IPA produces no additional read overhead. Both approaches also differ significantly with regard to garbage collection overhead and space reservation. Please refer to [31] for a more detailed analysis and an evaluation comparing IPL and IPA. IPL B+-tree [70] and IPA-IDX (see Chapter 7.3.1 and [33]) are variations of the basic approaches (IPL and IPA) designed for database indexes. The major similarities and differences between them mirror those discussed above. Cai et al. in [109] proposed an approach called "Correct-and-Refresh", which utilizes the ISPP technique to correct retention errors on Flash memory. Basically, they suggest periodically overwriting Flash pages with the same data without performing erase operations.

Thus, if some Flash cells in the page suffer from a small charge leakage (retention errors), it is corrected by this overwrite operation; cells without leakage remain unchanged. More information about ISPP can be found in [3].
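
The reason an already programmed page can accept such appends or refreshes without an erase lies in the programming physics of NAND: programming can only move bits from 1 to 0, while only a block erase resets them to 1. The following check is a minimal illustration of that constraint, assuming (as IPA does by reserving an append area in each page) that the target bytes are still erased or that the new content never requires a 0-to-1 transition; it is not the thesis' actual implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if `desired` can be programmed over `current` without an
 * erase, i.e. if no bit would have to change from 0 back to 1.  An erased
 * area reads as all 0xFF, so a delta appended into a reserved, still-erased
 * part of the page always passes this check. */
bool can_program_in_place(const uint8_t *current, const uint8_t *desired, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if ((~current[i] & desired[i]) != 0)   /* a 0 -> 1 transition is needed */
            return false;                      /* would require a block erase   */
    }
    return true;
}
```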

9.4. In-Storage Processing

The significant increase in computational power of modern Flash storage devices has naturally raised the research community's interest in in-storage processing. For instance, in [20] Do et al. presented the prototype of a "Smart SSD", which is capable of executing certain simple analytical queries. The authors describe in detail both the communication protocol (based on sessions) and the API used by the DBMS to issue and control the execution of queries on the SSD. The evaluation shows an improvement in read bandwidth of more than 2.5x. Similarly, Seshadri et al. in [87] proposed the prototype of a user-programmable SSD called Willow, which allows users to augment the storage device with application-specific logic. In order to provide new SSD functionality for a certain application, three subsystems must be modified to support a new set of RPC commands: the application, the kernel driver, and the operating system running on each storage processing unit inside the Flash SSD. Woods et al. in [105] and [104] presented an intelligent storage engine called Ibex. It can offload complex multi-predicate filtering as well as aggregation tasks to the on-device FPGA. The evaluation against multiple MySQL storage engines has shown a significant performance improvement and a decrease in the energy consumed by the system. Another example is BlueDBM [43], which allows for in-storage processing in a distributed Flash storage with an efficient link between the storage devices. We discuss some aspects of BlueDBM in Chapter 9.2. In-storage processing allows the DBMS to utilize the processing resources of the storage device. Furthermore, performing certain types of queries (e.g., selection, projection, join) on the device itself significantly reduces the amount of data transferred to the host, which relieves the bus, lowers the required bandwidth and improves cache hit rates. However, the proposed protocols for in-storage processing enable only partial access to the on-device processing units, while the general black-box architecture of the SSD, with all its disadvantages, remains. In such a smart SSD the DBMS still has no control over either the physical data placement or the Flash management. Moreover, the on-device FTL (or the Flash management in the file system) cannot access the rich DBMS status information (statistics, metadata) needed to optimize its algorithms for a particular workload and its data.
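
To make the offloading idea concrete, the sketch below shows a hypothetical session-style host interface in the spirit of the protocol described for Smart SSDs [20]. All type and function names are invented for illustration; they are not the actual Smart SSD, Willow or Ibex APIs. The key property is that only qualifying tuples cross the host interface.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical session-based offload interface (illustration only). */
typedef struct ssd_session ssd_session;

extern ssd_session *ssd_open_session(int device_fd);
extern int          ssd_submit_filter(ssd_session *s, uint32_t table_id,
                                      uint32_t column_id,
                                      int64_t lower, int64_t upper);
extern long         ssd_fetch_results(ssd_session *s, void *buf, size_t buflen);
extern void         ssd_close_session(ssd_session *s);

/* Offload a range predicate: the device scans and filters, the host only
 * receives the qualifying tuples instead of the whole table. */
long offloaded_range_scan(int fd, uint32_t table, uint32_t column,
                          int64_t lo, int64_t hi, void *out, size_t outlen) {
    ssd_session *s = ssd_open_session(fd);
    if (s == NULL)
        return -1;
    long n = -1;
    if (ssd_submit_filter(s, table, column, lo, hi) == 0)
        n = ssd_fetch_results(s, out, outlen);   /* collect filtered tuples */
    ssd_close_session(s);
    return n;
}
```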

NoFTL attempts to solve all the main problems resulting from the black-box architecture of modern SSDs by giving the DBMS full and exclusive control over the Flash storage (native Flash memory and on-device processing resources). This architecture provides a uniform solution for all workloads and database objects with different characteristics. Vincon et al. in [99] recently presented the first results of the nativeNDP concept: near-data processing on native storage under the NoFTL architecture. This is an ongoing follow-up research project.

10. Conclusions

Flash SSDs are becoming ever more popular as storage devices, and their market share will continue to grow in the coming years. High performance, low energy consumption, and the price measured in terms of Price/IOPS are the main advantages of Flash-based SSDs over conventional magnetic disks, which dominated the storage market for more than forty years. However, while all-Flash and Flash-dominant data centers are becoming common nowadays, the increasing requirements on storage also put the development of Flash SSDs under high pressure. Thus, the demand for higher density has led Flash technology in the past years from planar NAND to 3D-NAND. By introducing more and more vertical layers of Flash cells on a single chip, 3D-NAND enables higher memory density without the need to shrink individual cells, and thus without the associated performance and reliability disadvantages. Far more challenging, however, is tackling the performance requirements of today’s applications. The use of high-performance hardware interfaces (e.g., PCIe) and protocols (e.g., NVMe) significantly improves the I/O bandwidth of SSDs, but it does not remove the major barrier to unlocking the full potential of Flash memory: the software stack. Almost all Flash SSDs nowadays are designed to be backwards compatible with magnetic disks. This makes the replacement of HDDs with SSDs seamless, and together with the advantageous properties of Flash storage it has favored their proliferation. However, masking the native behavior of Flash memory behind a compatibility layer, and the use of a legacy I/O stack, negatively impact the performance and longevity characteristics of Flash. Thus, one of the major disadvantages of modern Flash SSDs is the high overhead of the management algorithms encapsulated in the Flash translation layer (FTL). FTL tasks such as address translation and garbage collection, which essentially hide the erase-before-overwrite principle of Flash memory from the host, produce significant write-amplification (e.g., 5x). In combination with the traditional ’cooked’ storage architecture (using a file system), write-amplification further increases up to 15x. That means that a single 4KB page write issued by an application can turn into 20 to 60KB being written to the Flash SSD. As a result, the effective I/O throughput and the lifetime of the storage decrease dramatically. The reasons for this overhead can be clustered into several groups: limited computational and DRAM memory resources of the SSD; lack of information on the side

of the FTL about the properties of the stored data; lack of information on the side of applications about the FTL and the physical organization of Flash; and redundant functionality along the I/O path (application, file system, FTL). These drawbacks of modern Flash SSDs, resulting from their FTL-based, black-box architecture, have motivated this work. Our goal was to optimize the use of Flash for one of the most important ’customers’ of storage, one that demands ever higher I/O performance: the database management system. To achieve this goal we propose the novel NoFTL storage architecture. The basic idea of NoFTL is to give the database system full control over the underlying Flash SSD. To achieve this, all intermediate layers between the DBMS and physical storage, such as the file system, the block I/O layer and the FTL, are eliminated. By exposing the native behavior and properties of Flash memory to the DBMS, the latter can provide significant optimizations for multiple internal subsystems. Thus, many algorithms and data structures that were originally designed for conventional magnetic disks, and that remain unchanged due to the backwards compatibility of modern Flash SSDs, can be completely revisited under the NoFTL architecture. On the other hand, the information about data and workload available in the DBMS, as well as the richer resources of the host system, allow for the optimization of typical Flash management tasks. Moreover, NoFTL removes the functional redundancy present in the different layers of the traditional ’cooked’ storage architecture. The topic of the effective use of Flash storage has found widespread interest in recent years. Both academia and industry have proposed various approaches to tackle the disadvantages of modern Flash SSDs. However, most of them address only a part of the existing problems, while leaving other issues unaddressed. Although a few approaches, similarly to NoFTL, propose to completely redesign the I/O stack, to the best of our knowledge the NoFTL architecture is the first approach that not only opens up the SSD by removing legacy and compatibility layers, but also provides a complete design of a Flash-aware database system, which allows the performance potential of Flash-based storage to be unveiled and fully utilized. Another important contribution of this work is a complete, functional prototype of a database storage engine following the NoFTL architecture, built on an open-source database engine. Different aspects of the NoFTL architecture, and live demonstrations of the NoFTL prototype, have been presented to the database community in [30], [34], [35], [29], [32], [36], [31], [33], as well as in the tutorials [77] and [78]. The NoFTL storage architecture consists of three main integral parts. The first is the native Flash interface (NFI), which exposes full control over the Flash memory and the computational resources of the SSD to the DBMS (see Chapter 4). In order to operate via NFI on an FTL-less SSD, the DBMS must take responsibility for the Flash management functionality. Thus, the integration of typical Flash management tasks into different subsystems of

the DBMS, as well as the mutual optimization of these tasks and the DBMS algorithms, represents the second part of the NoFTL architecture (see Chapter 5). These two parts, which we call basic NoFTL, allowed us to decrease the write-amplification of the Flash SSD by more than 5.5x under OLTP workloads as compared to the traditional ’cooked’ storage architecture. Lower write-amplification has two major implications. First, it results in smaller response times for read and write operations, which, in turn, has a direct impact on transactional throughput. Thus, on I/O-bound systems this basic NoFTL approach can increase the transactional throughput of OLTP workloads by more than 3.5x. Obviously, for in-memory systems this advantage at the I/O layer has significantly less impact on database performance. The second advantage of smaller write-amplification is the improved longevity of Flash storage. The wear-out of Flash SSDs is primarily determined by the number of program/erase cycles performed on the Flash memory. Thus, for MLC Flash each particular Flash block can "survive" only a few thousand P/E cycles (typically 3000-5000). The basic NoFTL approach allows us to reduce the number of performed P/E cycles by more than 5x for OLTP workloads as compared to the traditional ’cooked’ storage architecture. As a result, the lifetime of an SSD under the same database load increases correspondingly (a back-of-the-envelope sketch of this effect is given at the end of this chapter). Last but not least, the improved performance and longevity of Flash storage enable us to decrease the overall cost of the storage infrastructure. To further improve the performance and longevity of SSDs we extended the basic NoFTL architecture with a third component: the concept of configurable Flash storage. With the help of two novel storage abstractions, regions and groups, the DBMS under NoFTL can (i) perform intelligent physical data placement; (ii) apply multiple Flash management schemes simultaneously; and (iii) control the utilization of the available I/O parallelism of the SSD (see Chapter 6). Thus, a proper NoFTL configuration of the SSD using multiple regions with different management schemes can outperform the basic NoFTL approach by more than 2x, resulting in total in up to 12x lower write-amplification for OLTP workloads as compared to cooked storage. The overall reduction in the number of performed P/E cycles is also about 10x compared to the traditional approach with an FTL-based SSD and a file system. Although the three pillars of the NoFTL architecture mentioned above provide a complete concept of a Flash-aware DBMS on FTL-less SSDs, they should rather be seen as a foundation for further optimizations. A good example of this is the approach of in-place appends (IPA) presented in Chapter 7, as it perfectly shows the interplay of the NoFTL components. IPA optimizes the write behavior on Flash storage for small updates, which typically dominate OLTP workloads. It completely revisits the traditional erase-before-overwrite principle of Flash memory by allowing already written Flash pages to be updated by small appends without the need to perform an erase operation. IPA becomes possible only because NoFTL gives the DBMS full control over the Flash storage. All three components of NoFTL contribute to the realization of IPA: the native Flash interface

offers a special command for in-place appends; the integration into the Flash management tasks, as well as the support by the buffer and storage managers of the DBMS, ensures the proper IPA logic; and the concept of configurable Flash storage allows IPA to be applied selectively, only to the relevant database objects. IPA further decreases the number of erase operations by about 2x as compared to a configuration without IPA, which positively impacts the performance and longevity characteristics of the Flash storage. A number of other significant optimizations based on the NoFTL storage architecture are possible, and some of them are currently being investigated in ongoing projects. However, these are separate, complex and self-contained topics, which are beyond the scope of this work.
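
The back-of-the-envelope sketch announced above ties the numbers quoted in this chapter together: a 4KB logical write multiplied by the write-amplification factor, and a lifetime estimate derived from the P/E budget. The formula is the standard total-write-budget estimate and deliberately ignores over-provisioning and uneven wear; the capacity value is an arbitrary example, while the write-amplification factors (5x-15x) and the 3000 P/E cycles come from the text.

```c
#include <stdio.h>

/* Rough lifetime estimate: the total physical write budget (capacity * P/E
 * cycles) divided by the write amplification gives the amount of logical
 * data the application can write before the Flash wears out. */
static double lifetime_logical_tb(double capacity_gb, double pe_cycles, double wa) {
    return capacity_gb * pe_cycles / wa / 1024.0;   /* result in TB */
}

int main(void) {
    /* A 4KB page write amplified by 5x..15x becomes 20..60KB on Flash. */
    printf("4KB logical write -> %.0f..%.0fKB physical writes\n", 4.0 * 5, 4.0 * 15);

    /* Example device: 64GB of MLC Flash with a 3000 P/E cycle budget. */
    double wa_cooked = 15.0;          /* 'cooked' storage, upper bound     */
    double wa_noftl  = 15.0 / 12.0;   /* up to 12x lower under NoFTL       */
    printf("cooked storage (WA ~%.0f): %.1f TB of logical writes\n",
           wa_cooked, lifetime_logical_tb(64, 3000, wa_cooked));
    printf("NoFTL w/ regions (WA ~%.2f): %.1f TB of logical writes\n",
           wa_noftl, lifetime_logical_tb(64, 3000, wa_noftl));
    return 0;
}
```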

11. Future Work

In this work we have focused on the effective use of Flash storage for relational database management systems. However, the NoFTL storage architecture would also be beneficial for NoSQL database systems, such as key-value stores, document stores and graph databases. While the basic principles and various optimizations described in this work would remain unchanged, each particular type of DBMS would require its own design and implementation of NoFTL. Thus, for instance, a separate ongoing project presented in [101] currently investigates the realization of NoFTL for key-value stores. Another interesting direction of future research is near-data processing (NDP) under the NoFTL architecture. The topic of NDP for Flash SSDs is very popular nowadays, and recently multiple approaches have been proposed by academia and industry (see Chapter 9.4). NDP uses the computational power of SSDs (e.g., ARM controllers, FPGAs) by delegating certain database tasks and simple queries to the device. This significantly reduces the amount of data transmitted between host and storage, speeds up data processing and decreases the power consumption of the system. However, most of the proposed approaches are designed for FTL-based SSDs, which suffer from the significant performance and longevity bottlenecks described in this work. Combining the NoFTL architecture with extensive NDP support would allow us to use both the memory and the computational resources of SSDs in the most effective way. For instance, the fact that under NoFTL the database system has full control over the physical data placement can significantly simplify the specification of tasks for SSDs and enlarge their spectrum. Vincon et al. in [99] and [100] present the first promising results of realizing NDP support for key-value stores under the NoFTL architecture. As follow-up future work we also see the realization of the idea of finding an optimal NoFTL data placement configuration in an automated or semi-automated mode (see Chapter 8). This can be realized with the help of appropriate heuristics; developing these is a complex and self-contained research topic.


Bibliography

[1] Anastassia Ailamaki, David J. DeWitt, and Mark D. Hill. “Data Page Layouts for Relational Databases on Deep Memory Hierarchies”. In: The VLDB Journal 11.3 (Nov. 2002), pp. 198–215. issn: 1066-8888. doi: 10.1007/s00778-002-0074-9. url: http://dx.doi.org/10.1007/s00778-002-0074-9.
[2] S. Aritome. “NAND Flash Memory Revolution”. In: 2016 IEEE 8th International Memory Workshop (IMW). May 2016, pp. 1–4. doi: 10.1109/IMW.2016.7495285.
[3] Seiichi Aritome. NAND flash memory technologies. IEEE Press series on microelectronic systems. 2016.
[4] Timothy G. Armstrong et al. “LinkBench: A Database Benchmark Based on the Facebook Social Graph”. In: Proc. SIGMOD’13.
[5] Ivan Baev, Rajmohan Rajaraman, and Chaitanya Swamy. “Approximation Algorithms for Data Placement Problems”. In: SIAM Journal on Computing 38.4 (Aug. 2008), pp. 1411–1429. issn: 0097-5397. doi: 10.1137/080715421. url: http://dx.doi.org/10.1137/080715421.
[6] Mahesh Balakrishnan et al. “CORFU: A Shared Log Design for Flash Clusters”. In: Proc. USENIX NSDI’12. 2012.
[7] Daniel Bausch, Ilia Petrov, and Alejandro Buchmann. “Making Cost-based Query Optimization Asymmetry-aware”. In: Proceedings of the Eighth International Workshop on Data Management on New Hardware. DaMoN ’12. Scottsdale, Arizona: ACM, 2012, pp. 24–32. isbn: 978-1-4503-1445-9. doi: 10.1145/2236584.2236588. url: http://doi.acm.org/10.1145/2236584.2236588.
[8] David Bell. “Difficult Data Placement Problems”. In: The Computer Journal 27 (Apr. 1984), pp. 315–320. doi: 10.1093/comjnl/27.4.315.
[9] Mathias Bjorling et al. “The Necessary Death of the Block Device Interface”. In: 6th Biennial Conference on Innovative Data Systems Research (CIDR). 2013.
[10] Matias Bjørling, Javier Gonzalez, and Philippe Bonnet. “LightNVM: The Linux Open-Channel SSD Subsystem”. In: Proc. FAST’17. 2017.

[11] Philippe Bonnet and Luc Bouganim. “Flash Device Support for Database Management”. In: Proc. CIDR. 2011.
[12] Daniel Bovet and Marco Cesati. Understanding The Linux Kernel. Oreilly & Associates Inc, 2005. isbn: 0596005652.
[13] Y. Cai et al. “Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques”. In: 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). Feb. 2017, pp. 49–60. doi: 10.1109/HPCA.2017.61.
[14] Yu Cai et al. “Error Patterns in MLC NAND Flash Memory: Measurement, Characterization, and Analysis”. In: Proceedings of the Conference on Design, Automation and Test in Europe. DATE ’12. Dresden, Germany: EDA Consortium, 2012, pp. 521–526. isbn: 978-3-9810801-8-6. url: http://dl.acm.org/citation.cfm?id=2492708.2492838.
[15] Yu Cai et al. “Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery”. In: CoRR abs/1711.11427 (2017). arXiv: 1711.11427. url: http://arxiv.org/abs/1711.11427.
[16] Feng Chen, David A. Koufaty, and Xiaodong Zhang. “Understanding intrinsic characteristics and system implications of flash memory based SSDs”. In: Proc. SIGMETRICS’09. 2009.
[17] Changho Choi. Multi-Stream Write SSD. https://www.flashmemorysummit.com/English/Collaterals/Proceedings/2016/20160809_FC12_Choi.pdf. 2016.
[18] Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman. Linux Device Drivers, 3rd Edition. O’Reilly Media, Inc., 2005. isbn: 0596005903.
[19] Ali Davoudian, Liu Chen, and Mengchi Liu. “A survey on NoSQL stores”. In: ACM Computing Surveys 51 (Apr. 2018), pp. 1–43. doi: 10.1145/3158661.
[20] Jaeyoung Do et al. “Query Processing on Smart SSDs: Opportunities and Challenges”. In: Proc. SIGMOD’13. 2013.
[21] Paul Dubs et al. “FBARC: I/O Asymmetry Aware Buffer Replacement Strategy”. In: Proceedings of International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS). ADMS ’13. Jan. 2013, pp. 58–69.
[22] Ramez Elmasri and Shamkant Navathe. Fundamentals of Database Systems. 6th. USA: Addison-Wesley Publishing Company, 2010. isbn: 9780136086208.

[23] Blake G. Fitch et al. “Exploring the Capabilities of a Massively Scalable, Compute-in-Storage Architecture.” In: Proc. HPDC’13. 2013.
[24] Flash Memory Wear, Reliability & Life. Radio-Electronics.com. url: http://www.radio-electronics.com/info/data/semicond/memory/flash-wear-reliability-lifetime.php (visited on 07/02/2020).
[25] Felix Gessert et al. “NoSQL Database Systems: A Survey and Decision Guidance”. In: Comput. Sci. 32.3-4 (July 2017), pp. 353–365. issn: 1865-2034. doi: 10.1007/s00450-016-0334-3. url: https://doi.org/10.1007/s00450-016-0334-3.
[26] Robert Gottstein, Ilia Petrov, and Alejandro Buchmann. “Append Storage in Multi-Version Databases on Flash”. In: Proc. BNCOD’13.
[27] Laura M. Grupp, John D. Davis, and Steven Swanson. “The Harey Tortoise: Managing Heterogeneous Write Performance in SSDs”. In: Proceedings of the 2013 USENIX Conference on Annual Technical Conference. USENIX ATC’13. San Jose, CA: USENIX Association, 2013, pp. 79–90. url: http://dl.acm.org/citation.cfm?id=2535461.2535472.
[28] Aayush Gupta, Youngjae Kim, and Bhuvan Urgaonkar. “DFTL: A Flash Translation Layer Employing Demand-based Selective Caching of Page-level Address Mappings”. In: Proc. ASPLOS. 2009.
[29] Sergej Hardock et al. “Effective DBMS space management on native Flash”. In: Datenbanksysteme für Business, Technologie und Web (BTW 2017). Ed. by Bernhard Mitschang et al. Gesellschaft für Informatik, Bonn, 2017, pp. 605–608.
[30] Sergej Hardock et al. “NoFTL: Database Systems on FTL-less Flash Storage”. In: Proceedings of the VLDB Endowment 6.12 (Aug. 2013), pp. 1278–1281. issn: 2150-8097. doi: 10.14778/2536274.2536295. url: http://dx.doi.org/10.14778/2536274.2536295.
[31] Sergey Hardock et al. “From In-Place Updates to In-Place Appends: Revisiting Out-of-Place Updates on Flash”. In: Proceedings of the 2017 ACM International Conference on Management of Data. SIGMOD ’17. Chicago, Illinois, USA: ACM, 2017, pp. 1571–1586. isbn: 978-1-4503-4197-4. doi: 10.1145/3035918.3035958. url: http://doi.acm.org/10.1145/3035918.3035958.
[32] Sergey Hardock et al. “In-Place Appends for Real: DBMS Overwrites on Flash without Erase”. In: Proc. EDBT’17.

[33] Sergey Hardock et al. “IPA-IDX: In-Place Appends for B-Tree Indices”. In: Proceedings of the 15th International Workshop on Data Management on New Hardware. DaMoN’19. Amsterdam, Netherlands: ACM, 2019, 18:1–18:3. isbn: 978-1-4503-6801-8. doi: 10.1145/3329785.3329929. url: http://doi.acm.org/10.1145/3329785.3329929.
[34] Sergey Hardock et al. “NoFTL for Real: Databases on Real Native Flash Storage”. In: Proc. EDBT. 2015, pp. 517–520.
[35] Sergey Hardock et al. “Revisiting DBMS Space Management for Native Flash”. In: Proc. EDBT. 2016.
[36] Sergey Hardock et al. “Selective In-Place Appends for Real: Reducing Erases on Wear-prone DBMS Storage”. In: Proc. ICDE’17.
[37] Dan He et al. “Improving Hybrid FTL by Fully Exploiting Internal SSD Parallelism with Virtual Blocks”. In: ACM Transactions on Architecture and Code Optimization (TACO) 11.4 (Dec. 2014), 43:1–43:19. issn: 1544-3566. doi: 10.1145/2677160. url: http://doi.acm.org/10.1145/2677160.
[38] Joseph M. Hellerstein, Michael Stonebraker, and James Hamilton. “Architecture of a Database System”. In: Found. Trends databases 1.2 (Feb. 2007), pp. 141–259. issn: 1931-7883. doi: 10.1561/1900000002. url: http://dx.doi.org/10.1561/1900000002.
[39] Yang Hu et al. “Achieving Page-Mapping FTL Performance at Block-Mapping FTL Cost by Hiding Address Translation”. In: June 2010, pp. 1–12. doi: 10.1109/MSST.2010.5496970.
[40] Soojun Im and Dongkun Shin. “ComboFTL: Improving Performance and Lifespan of MLC Flash Memory Using SLC Flash Buffer”. In: Journal of Systems Architecture 56.12 (Dec. 2010), pp. 641–653. issn: 1383-7621. doi: 10.1016/j.sysarc.2010.09.005. url: http://dx.doi.org/10.1016/j.sysarc.2010.09.005.
[41] Xavier Jimenez, David Novo, and Paolo Ienne. “Software Controlled Cell Bit-density to Improve NAND Flash Lifetime”. In: Proc. DAC’12.
[42] Ryan Johnson et al. “Shore-MT: A Scalable Storage Manager for the Multicore Era”. In: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology. EDBT ’09. Saint Petersburg, Russia: ACM, 2009, pp. 24–35. isbn: 978-1-60558-422-5. doi: 10.1145/1516360.1516365. url: http://doi.acm.org/10.1145/1516360.1516365.

[43] Sang-Woo Jun et al. “BlueDBM: Distributed Flash Storage for Big Data Analytics”. In: ACM Transactions on Computer Systems (TOCS) 34.3 (June 2016), 7:1–7:31. issn: 0734-2071. doi: 10.1145/2898996. url: http://doi.acm.org/10.1145/2898996.
[44] Dawoon Jung et al. “ScaleFFS: A scalable log-structured flash file system for mobile multimedia systems.” In: TOMCCAP 5 (Jan. 2008).
[45] Jeong-Uk Kang et al. “A Superblock-based Flash Translation Layer for NAND Flash Memory”. In: Proceedings of the 6th ACM & IEEE International Conference on Embedded Software. EMSOFT ’06. Seoul, Korea: ACM, 2006, pp. 161–170. isbn: 1-59593-542-8. doi: 10.1145/1176887.1176911. url: http://doi.acm.org/10.1145/1176887.1176911.
[46] Woon-Hak Kang et al. “X-FTL: Transactional FTL for SQLite Databases”. In: Proc. SIGMOD. 2013.
[47] Hyukjoong Kim, Kyuhwa Han, and Dongkun Shin. “StreamFTL: Stream-level address translation scheme for memory constrained flash storage”. In: 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE) (2018), pp. 617–620.
[48] Jesung Kim et al. “A Space-efficient Flash Translation Layer for CompactFlash Systems”. In: IEEE Trans. on Consum. Electron. 48.2 (May 2002), pp. 366–375. issn: 0098-3063. doi: 10.1109/TCE.2002.1010143. url: http://dx.doi.org/10.1109/TCE.2002.1010143.
[49] Changman Lee et al. “F2FS: A New File System for Flash Storage”. In: Proceedings of the 13th USENIX Conference on File and Storage Technologies. FAST’15. Santa Clara, CA: USENIX Association, 2015, pp. 273–286. isbn: 978-1-931971-201. url: http://dl.acm.org/citation.cfm?id=2750482.2750503.
[50] Sang-Won Lee and Bongki Moon. “Design of Flash-based DBMS: An In-page Logging Approach”. In: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. SIGMOD ’07. Beijing, China: ACM, 2007, pp. 55–66. isbn: 978-1-59593-686-8. doi: 10.1145/1247480.1247488. url: http://doi.acm.org/10.1145/1247480.1247488.
[51] Sang-Won Lee and Bongki Moon. “Design of Flash-based DBMS: An In-page Logging Approach”. In: Proc. SIGMOD’07.
[52] Sang-Won Lee et al. “A log buffer-based flash translation layer using fully-associative sector translation”. In: TECS (July 2007).

[53] Sungjin Lee, Jihong Kim, and Arvind Mithal. “Refactored Design of I/O Architecture for Flash Storage”. In: IEEE Computer Architecture Letters 14 (2015), pp. 70–74.
[54] Sungjin Lee et al. “Application-managed Flash”. In: Proceedings of the 14th Usenix Conference on File and Storage Technologies. FAST’16. Santa Clara, CA: USENIX Association, 2016, pp. 339–353. isbn: 978-1-931971-28-7. url: http://dl.acm.org/citation.cfm?id=2930583.2930609.
[55] Scott T. Leutenegger and Daniel Dias. “A Modeling Study of the TPC-C Benchmark”. In: Proc. SIGMOD’93.
[56] Sang-Phil Lim, Sang-Won Lee, and Bongki Moon. “FASTer FTL for Enterprise-Class Flash Memory SSDs”. In: Proc. SNAPI. 2010.
[57] Seung-Ho Lim and Kyu Park. “An efficient NAND flash file system for flash memory storage”. In: IEEE Transactions on Computers 55 (Aug. 2006), pp. 906–912. doi: 10.1109/TC.2006.96.
[58] Robert Love. Linux System Programming: Talking Directly to the Kernel and C Library. O’Reilly Media, Inc., 2007. isbn: 0596009585.
[59] Youyou Lu, Jiwu Shu, and Weimin Zheng. “Extending the Lifetime of Flash-based Storage Through Reducing Write Amplification from File Systems”. In: Proc. FAST’13.
[60] Yanfei Lv et al. “Operation-aware Buffer Management in Flash-based Systems”. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data. SIGMOD ’11. Athens, Greece: ACM, 2011, pp. 13–24. isbn: 978-1-4503-0661-4. doi: 10.1145/1989323.1989326. url: http://doi.acm.org/10.1145/1989323.1989326.
[61] Dongzhe Ma, Jianhua Feng, and Guoliang Li. “A Survey of Address Translation Technologies for Flash Memories”. In: ACM Comput. Surv. 46.3 (2014), 36:1–36:39.
[62] Dongzhe Ma, Jianhua Feng, and Guoliang Li. “LazyFTL: A Page-level Flash Translation Layer Optimized for NAND Flash Memory”. In: Proc. SIGMOD. 2011.

180 [65] S. I. McClean, D. A. Bell, and F. J. McErlean. “Heuristic Methods for the Data Placement Problem”. In: Journal of the Operational Research Society 42.9 (Sept. 1991), pp. 767–774. issn: 1476-9360. doi: 10.1057/jors.1991.147. url: https://doi.org/10.1057/jors.1991.147. [66] Rino Micheloni, L Crippa, and Alessia Marelli. Inside NAND flash memories. Springer Publishing Company, Incorporated, Jan. 2010. doi: 10.1007/978-90-481- 9431-5. [67] Changwoo Min et al. “SFS: random write considered harmful in solid state drives”. In: FAST. USENIX Association, 2012, p. 12. [68] Jayashree Mohan, Rohan Kadekodi, and Vijay Chidambaram. “Analyzing IO Am- plification in Linux File Systems”. In: ArXiv abs/1707.08514 (2017). [69] G. E. Moore. “Cramming More Components Onto Integrated Circuits”. In: Proceed- ings of the IEEE 86.1 (Jan. 1998), pp. 82–85. issn: 0018-9219. doi: 10.1109/ JPROC.1998.658762. [70] Gap-Joo Na, Bongki Moon, and Sang-Won Lee. “In-Page Logging B-Tree for Flash Memory”. In: Proc. DASFAA’09. [71] Patrick O’Neil et al. “The Log-structured Merge-tree (LSM-tree)”. In: Acta In- formatica 33.4 (June 1996), pp. 351–385. issn: 0001-5903. doi: 10.1007/ s002360050048. url: http://dx.doi.org/10.1007/s002360050048. [72] Gihwan Oh et al. “SHARE Interface in Flash Storage for Relational and NoSQL Databases”. In: Proceedings of the 2016 International Conference on Management of Data. SIGMOD ’16. San Francisco, California, USA: ACM, 2016, pp. 343–354. isbn: 978-1-4503-3531-7. doi: 10.1145/2882903.2882910. url: http: //doi.acm.org/10.1145/2882903.2882910. [73] Oracle. A Quantitative Comparison between Raw Devices and File Systems for implementing Oracle Databases. 2004. url: https : / / www . oracle . com / technetwork/database/performance/twp-oracle-hp-files-130020. pdf. [74] Xiangyong Ouyang et al. “Beyond block I/O: Rethinking traditional storage primi- tives”. In: Proc. HPCA. 2011. [75] Chanik Park et al. “A reconfigurable FTL architecture for NAND flash-based appli- cations”. In: TECS 7.4 (2008), 38:1–38:23.

[76] Youngwoo Park et al. “PFFS: A Scalable Flash Memory File System for the Hybrid Architecture of Phase-change RAM and NAND Flash”. In: Proceedings of the 2008 ACM Symposium on Applied Computing. SAC ’08. Fortaleza, Ceara, Brazil: ACM, 2008, pp. 1498–1503. isbn: 978-1-59593-753-7. doi: 10.1145/1363686.1364038. url: http://doi.acm.org/10.1145/1363686.1364038.
[77] Ilia Petrov, Robert Gottstein, and Sergey Hardock. “DBMS on Modern Storage Hardware”. In: Proc. ICDE. 2015.
[78] Ilia Petrov et al. “Native Storage Techniques for Data Management”. In: Proc. ICDE. 2019.
[79] Vijayan Prabhakaran, Thomas L. Rodeheffer, and Lidong Zhou. “Transactional Flash”. In: Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation. OSDI’08. San Diego, California: USENIX Association, 2008, pp. 147–160. url: http://dl.acm.org/citation.cfm?id=1855741.1855752.
[80] Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. 3rd ed. New York, NY, USA: McGraw-Hill, Inc., 2003. isbn: 9780072465631.
[81] Samsung Multi-stream Technology. https://www.samsung.com/semiconductor/global.semi.static/Samsung_Multistream_Technology-1.pdf. 2017.
[82] Samsung V-NAND technology. http://www.samsung.com/us/business/oem-solutions/pdfs/V-NAND_technology_WP.pdf. 2014.
[83] C. Sandhya et al. “Effect of SiN on Performance and Reliability of Charge Trap Flash (CTF) Under Fowler–Nordheim Tunneling Program/Erase Operation”. In: IEEE Electron Device Letters 30.2 (Feb. 2009), pp. 171–173. issn: 0741-3106.
[84] Brian R. Santo. “25 Microchips That Shook the World”. In: IEEE Spectrum (May 2009).
[85] Caetano Sauer, Goetz Graefe, and Theo Härder. “An empirical analysis of database recovery costs”. In: RDSS. 2014.
[86] Jiri Schindler, Anastassia Ailamaki, and Gregory R. Ganger. “Lachesis: Robust Database Storage Management Based on Device-specific Performance Characteristics”. In: Proceedings of the 29th International Conference on Very Large Data Bases - Volume 29. VLDB ’03. Berlin, Germany: VLDB Endowment, 2003, pp. 706–717. isbn: 0-12-722442-4. url: http://dl.acm.org/citation.cfm?id=1315451.1315512.

[87] Sudharsan Seshadri et al. “Willow: A User-programmable SSD”. In: Proc. OSDI’14. 2014.
[88] Ji-Yong Shin et al. “FTL Design Exploration in Reconfigurable High-performance SSD for Server Applications”. In: Proc. ICS’09. 2009.
[89] Radu Stoica and Anastasia Ailamaki. “Improving Flash Write Performance by Using Update Frequency”. In: Proceedings of the VLDB Endowment 6.9 (July 2013), pp. 733–744. issn: 2150-8097. doi: 10.14778/2536360.2536372. url: http://dx.doi.org/10.14778/2536360.2536372.
[90] Kang-Deog Suh et al. “A 3.3 V 32 Mb NAND flash memory with incremental step pulse programming scheme”. In: Proc. ISSCC’95.
[91] Billy Tallis. Intel And Micron Launch First QLC NAND: Micron 5210 ION Enterprise SATA SSD. https://www.anandtech.com/show/12744/micron-launches-first-qlc-nand-micron-5210-ion-enterprise-sata-ssd. 2018.
[92] TATP Benchmark Description (Version 1.0). http://tatpbenchmark.sourceforge.net/TATP_Description.pdf. 2009.
[93] Technical Note. NAND Flash 101: An Introduction to NAND Flash and How to Design It Into Your Next Product. Micron Technology, Inc. Apr. 1, 2010. url: https://www.ece.umd.edu/~blj/CS-590.26/micron-tn2919.pdf (visited on 07/02/2020).
[94] The OpenSSD Project. http://www.openssd-project.org/wiki/The_OpenSSD_Project. 2014.
[95] TPC BENCHMARK™ B Specification. http://www.tpc.org/tpcb/spec/tpcb_current.pdf. 1994.
[96] TPC BENCHMARK™ C Specification. www.tpc.org/tpcc/spec/tpcc_current.pdf. 2010.
[97] Animesh Trivedi et al. “Unified High-performance I/O: One Stack to Rule Them All”. In: Proc. HotOS’13. 2013.
[98] Kristian Vättö. Samsung SSD 840 (250GB) Review. https://www.anandtech.com/show/6337/samsung-ssd-840-250gb-review/2. 2012.
[99] Tobias Vincon et al. “nativeNDP: Processing Big Data Analytics on Native Storage Nodes”. In: Proc. ADBIS. 2019.
[100] Tobias Vincon et al. “nKV in Action: Accelerating KV-Stores on Native Computational Storage with Near-Data Processing”. In: Proc. PVLDB. 2020.

[101] Tobias Vincon et al. “NoFTL-KV: Tackling Write-Amplification on KV-Stores with Native Storage Management”. In: Proc. EDBT. 2018.
[102] Zev Weiss et al. “ANViL: Advanced Virtualization for Modern Non-volatile Memory Devices”. In: Proceedings of the 13th USENIX Conference on File and Storage Technologies. FAST’15. Santa Clara, CA: USENIX Association, 2015, pp. 111–118. isbn: 978-1-931971-201. url: http://dl.acm.org/citation.cfm?id=2750482.2750491.
[103] Bill Wong. Pseudo-SLC Flash Provides Design Flexibility. 2013. url: http://electronicdesign.com/site-files/electronicdesign.com/files/uploads/2013/09/FAQs-Toshiba-September.pdf.
[104] Louis Woods, Zsolt István, and Gustavo Alonso. “Ibex: An Intelligent Storage Engine with Support for Advanced SQL Offloading”. In: Proc. VLDB Endow. 7.11 (July 2014), pp. 963–974. issn: 2150-8097. doi: 10.14778/2732967.2732972. url: http://dx.doi.org/10.14778/2732967.2732972.
[105] Louis Woods, Jens Teubner, and Gustavo Alonso. “Less Watts, More Performance: An Intelligent Storage Engine for Data Appliances”. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. SIGMOD ’13. New York, New York, USA: ACM, 2013, pp. 1073–1076. isbn: 978-1-4503-2037-5. doi: 10.1145/2463676.2463685. url: http://doi.acm.org/10.1145/2463676.2463685.
[106] Wei Xie and Yong Chen. “An Adaptive Separation-Aware FTL for Improving the Efficiency of Garbage Collection in SSDs”. In: May 2014, pp. 552–553. isbn: 978-1-4799-2784-5. doi: 10.1109/CCGrid.2014.52.
[107] YAFFS - A NAND flash filesystem. https://yaffs.net/sites/yaffs.net/files/yaffs.pdf. 2007.
[108] Jinsoo Yoo et al. “OrcFS: Orchestrated File System for Flash Storage”. In: ACM Transactions on Storage (TOS) 14.2 (Apr. 2018), 17:1–17:26. issn: 1553-3077. doi: 10.1145/3162614. url: http://doi.acm.org/10.1145/3162614.
[109] Cai Yu et al. “Error Analysis and Retention-aware Error Management for NAND Flash Memory”. In: Intel Tech. Journal 17.1 (2013), pp. 140–164.
[110] Jiacheng Zhang, Jiwu Shu, and Youyou Lu. “ParaFS: A Log-structured File System to Exploit the Internal Parallelism of Flash Devices”. In: Proceedings of the 2016 USENIX Conference on Usenix Annual Technical Conference. USENIX ATC ’16. Denver, CO, USA: USENIX Association, 2016, pp. 87–100. isbn: 978-1-931971-30-0. url: http://dl.acm.org/citation.cfm?id=3026959.3026968.

[111] Wenhui Zhang et al. “PA-SSD: A Page-Type Aware TLC SSD for Improved Write/Read Performance and Storage Efficiency”. In: Proceedings of the 2018 International Conference on Supercomputing. ICS ’18. Beijing, China: ACM, 2018, pp. 22–32. isbn: 978-1-4503-5783-8. doi: 10.1145/3205289.3205319. url: http://doi.acm.org/10.1145/3205289.3205319.
[112] Zhengdong Xia and Tianming Bu. “The implementation of flash-aware buffer replacement algorithms in PostgreSQL”. In: 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). Aug. 2015, pp. 1215–1219. doi: 10.1109/FSKD.2015.7382115.
[113] You Zhou et al. “An Efficient Page-level FTL to Optimize Address Translation in Flash Memory”. In: Proceedings of the Tenth European Conference on Computer Systems. EuroSys ’15. Bordeaux, France: ACM, 2015, 12:1–12:16. isbn: 978-1-4503-3238-5. doi: 10.1145/2741948.2741949. url: http://doi.acm.org/10.1145/2741948.2741949.


A. Appendix

A.1. OpenSSD Jasmine

The Jasmine Board (see Figure A.1) was developed under the OpenSSD project [94]. The board was designed to enable Flash research and to evaluate different Flash management schemes and algorithms on real Flash hardware. It comprises the following components: two Flash memory modules, each with four dual-die K9LCG08U1M 8GB MLC Flash packages, 64MB of DRAM, 96KB of SRAM and an Indilinx ARM controller running at 87.5MHz. The board connects to the host via a SATA 2.0 (3 Gbps) interface. The firmware contains several popular FTL schemes and allows custom ones to be implemented.

Figure A.1.: OpenSSD Jasmine board.

The implementation of the native Flash interface (see Chapter 4) on the OpenSSD board led to several challenges, since the underlying SATA protocol is inherently block-device oriented and hard to extend. To implement the new native Flash commands we partly used reserved command codes and partly "overloaded" some of the original SATA commands. Although the internal architecture of the on-board Flash supports I/O parallelism, the current version of the firmware does not allow it to be utilized. According to the specification the on-board SATA controller supports NCQ, but the firmware does not implement it. Furthermore, the absence of additional documentation about the SATA controller overcomplicates a custom NCQ implementation. In other words, the board only supports serial, synchronous execution of SATA commands. The workaround provided by the OpenSSD developers, which comprises a combination of a special use of the Write Cache and two internal I/O queues for READ and WRITE I/Os respectively, does not really solve the problem, since READ I/Os (which dominate in OLTP workloads) are still executed synchronously. Two other limitations of the OpenSSD Jasmine board are: (1) the inability to access the OOB area of Flash pages; and (2) the inability to modify the ECC engine of the Flash controller. Fortunately for us, the default ECC implementation provides only error detection and does not correct the detected bit errors. This has allowed us to implement and evaluate the IPA approach (see Chapter 7), while the modified BCH-ECC needed for IPA has been implemented on the server side within the DBMS.
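
As a purely illustrative sketch of how native Flash commands can be tunneled through the block-oriented SATA protocol, the table below maps NFI operations to command codes. The opcode values, the subcode field and the mapping itself are invented for illustration and do not reflect the actual firmware encoding used in the prototype.

```c
#include <stdint.h>

/* Native Flash operations that must be carried over SATA. */
enum nfi_op { NFI_READ, NFI_WRITE, NFI_ERASE, NFI_COPYBACK, NFI_WRITE_DELTA };

/* One entry per NFI operation: either a standard command is reused
 * ("overloaded") and a discriminator is hidden in an otherwise unused field,
 * or a reserved command code is claimed outright. */
struct sata_encoding {
    uint8_t ata_command;   /* command byte placed in the host-to-device FIS */
    uint8_t subcode;       /* discriminator for commands sharing a code     */
};

/* Hypothetical mapping table (all values are placeholders). */
static const struct sata_encoding nfi_to_sata[] = {
    [NFI_READ]        = { 0xC8, 0x00 },  /* reuse of a standard read command  */
    [NFI_WRITE]       = { 0xCA, 0x00 },  /* reuse of a standard write command */
    [NFI_ERASE]       = { 0xF7, 0x01 },  /* reserved code claimed for NFI     */
    [NFI_COPYBACK]    = { 0xF7, 0x02 },
    [NFI_WRITE_DELTA] = { 0xF7, 0x03 },
};
```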
