Adopting Zoned Storage in Distributed Storage Systems Abutalib Aghayev

Total Page:16

File Type:pdf, Size:1020Kb

Adopting Zoned Storage in Distributed Storage Systems Abutalib Aghayev Adopting Zoned Storage in Distributed Storage Systems Abutalib Aghayev CMU-CS-òý-Ôçý August òýòý Computer Science Department School of Computer Science Carnegie Mellon University Pittsburgh, PA Ô òÔç esis Committee: George Amvrosiadis, Chair Gregory R. Ganger Garth A. Gibson Peter J. Desnoyers, Northeastern University Remzi H. Arpaci-Dusseau, University of Wisconsin–Madison Sage A. Weil, Red Hat, Inc. Submitted in partial fulllment of the requirements for the degree of Doctor of Philosophy. Copyright © òýòý Abutalib Aghayev is research was sponsored by the Los Alamos National Laboratory award number: çÀ¥Àýç; by a Hima and Jive Fellowship in Computer Science for International Students; and by a Carnegie Mellon University Parallel Data Lab (PDL) Entrepreneurship Fellowship. e views and conclusions contained in this document are those of the author and should not be interpreted as representing the ocial policies, either expressed or implied, of any sponsoring institution, the U.S. government or any other entity. Keywords: le systems, distributed storage systems, shingled magnetic recording, zoned names- paces, zoned storage, hard disk drives, solid-state drives Abstract Hard disk drives and solid-state drives are the workhorses of modern storage sys- tems. For the past several decades, storage systems soware has communicated with these drives using the block interface. e block interface was introduced early on with hard disk drives, and as a result, almost every storage system in use today was built for the block interface. erefore, when ash memory based solid-state drives recently became viable, the industry chose to emulate the block interface on top of ash mem- ory by running a translation layer inside the solid-state drives. is translation layer was necessary because the block interface was not a direct t for the ash memory. More recently, hard disk drives are shiing to shingled magnetic recording, which in- creases capacity but also violates the block interface. us, emerging hard disk drives are also emulating the block interface by running a translation layer inside the drive. Emulating the block interface using a translation layer, however, is becoming a source of signicant performance and cost problems in distributed storage systems. In this dissertation, we argue for the elimination of the translation layer—and consequently the block interface. We propose adopting the emerging zone interface instead—a natural t for both high-capacity hard disk drives and solid-state drives— and rewriting the storage backend component of distributed storage systems to use this new interface. Our thesis is that adopting the zone interface using a special- purpose storage backend will improve the cost-eectiveness of data storage and the predictability of performance in distributed storage systems. We provide the following evidence to support our thesis. First, we introduce Sky- light, a novel technique to reverse engineer the translation layers of modern hard disk drives and demonstrate the high garbage collection overhead of the translation layers. Second, based on the insights from Skylight we develop ext¥-lazy, an extension of the popular ext¥ le system, which is used as a storage backend in many distributed storage systems. Ext¥-lazy signicantly improves performance over ext¥ on hard disk drives with a translation layer, but it also shows that in the presence of a translation layer it is hard to achieve the full potential of a drive with evolutionary le system changes. ird, we show that even in the absence of a translation layer, the abstrac- tions provided by general-purpose le systems such as ext¥ are inappropriate for a storage backend. To this end, we study a decade-long evolution of a widely used dis- tributed storage system, Ceph, and pinpoint technical reasons that render general- purpose le systems unt for a storage backend. Fourth, to show the advantage of a special-purpose storage backend in adopting the zone interface, as well as the advan- tages of the zone interface itself, we extend BlueStore—Ceph’s special-purpose storage backend—to work on zoned devices. As a result of this work, we demonstrate how having a special-purpose backend in Ceph enables quick adoption of the zone inter- face, how the zone interface eliminates in-device garbage collection when running RocksDB (a key-value database used for storing metadata in BlueStore), and how the zone interface enables Ceph to reduce tail latency and increase cost-eectiveness of data storage without sacricing performance. iv Acknowledgments My doctoral journey has been far from the ordinary: it spanned two schools, ve advisors, and more years than I care to admit. In hindsight, it was a blessing in disguise—I met many great people, some of whom became life long friends. Below, I will try to acknowledge them, and I apologize if I miss anyone. I am grateful to George Amvrosiadis and Greg Ganger for agreeing to advise me at a critical stage of my studies and for helping me cross the nish line. It was a packed two years of work, but George and Greg’s laid-back and cheerful attitude made them bearable and fun. I thank Garth Gibson and Eric Xing for agreeing to advise me upon my arrival at CMU and for introducing me to the eld of machine learning systems. Garth was instrumental in my coming to CMU, and I received his unwavering support throughout the years, for which I am grateful. I am grateful to Peter Desnoyers for ad- vising me at Northeastern, for supporting my decision to reapply to graduate schools late into my studies, and for always being there for me. Remzi Arpaci-Dusseau has been my unocial advisor since that lucky day when I rst met him at SOSP ’Ôç. I am grateful for his continuous support inside and outside of academia and for his serving in my thesis committee. While at CMU, I also briey worked with Phil Gibbons and Ehsan Totoni, and I am grateful to them for the opportunity. I was also fortunate to have great collaborators from the industry. Sage Weil and eodore Ts’oare two super hackers, respectively responsible for leading the develop- ment of a distributed storage system and a le system used by millions. Yetthey found time to meet with me regularly, to patiently answer my questions, and to discuss re- search directions. Towards the end of my studies, I asked Sage to serve on my thesis committee, and he kindly agreed. I am grateful to Sage and Ted for their support. Matias Bjørling turned my request for his slides into an internship, and thus started our collaboration on zoned storage, which turned out to be crucial to my dissertation. Tim Feldman thoroughly answered my questions as I was getting started with my re- search on shingled magnetic recording disks. I am grateful to Matias, Tim, and many other brilliant engineers and hackers, including Samuel Just, Kefu Chai, Igor Fedo- tov, Mark Nelson, Hans Holmberg, Damien Le Moal, Marc Acosta, Mark Callaghan, and Siying Dong, whose answers to my countless queries greatly contributed to my technical knowledge base. As a member of the Parallel Data Lab (PDL) I beneted immensely from the op- portunities it provided, and I am therefore grateful to all who make the PDL a great lab to be a part of. ank you Jason Boles, Chad Dougherty, Mitch Franzos, and Chuck Cranor for keeping the PDL clusters running and for xing my never-ending techni- cal problems. ank you Joan Digney for always polishing my slides and posters and proofreading my papers. ank you Karen Lindenfelser, Bill Courtright, and Greg Ganger for organizing the wonderful PDL events and for keeping it all running, and Garth Gibson, for founding the PDL. ank you all the organizations that contribute to the PDL, and thank you Hugo Patterson for the PDL Entrepreneurship Fellowship. I am thankful to the sta of the Computer Science Department (CSD), including Deb Cavlovich for her help with the administrative aspects of the doctoral process, and Catherine Copetas for her help with organizing CSD events. I am thankful to the anonymous donor who established the Hima and Jive Fellowship in Computer Science for International Students. During my studies I made many friends. It was a privilege to share an oce with Samuel Yeom from whom I learned so much, ranging from the ner details of the U.S. political system and why Ruth Bader Ginsburg is not retiring yet, to countless obscure facts, thanks to our daily ritual of solving e New York Times crossword aer lunch. (To be fair, Sam would mostly solve it and I would try to catch up and feel elated when contributing an answer or two.) Michael Kuchnik, Jin Kyu Kim, and Aurick Qiao were the best collaborators one could wish for: smart, humble, hard working, and fun. Saurabh Kadekodi and Jinliang Wei showed me the ropes around the campus and the department when I rst arrived at CMU. As I was preparing to leave CMU, Anuj Kalia shared with me his experience of navigating the academic job market and gave me useful tips. I thank all my friends, colleagues, and professors at CMU and Northeastern, including Umut Acar, Maruan Al-Shedivat, David Andersen, Joy Arul- raj, Kapil Arya, Mehmet Berat Aydemir, Ahmed Badawi, Nathan Beckmann, Naama Ben-David, Daniel Berger, Brandon Bohrer, Sol Boucher, Harsh Raju Chamarthi, Do- minic Chen, Andrew Chung, Gene Cooperman, Henggang Cui, Wei Dai, Omar De- len, Chris Fallin, Pratik Fegade, Matthias Felleisen, Mohammed Hossein Hajkazemi, Mor Harchol-Balter, Aaron Harlap, Kevin Hsieh, Angela Jiang, David Kahn, Rajat Kateja, Ryan Kavanagh, Jack Kosaian, Yixin Luo, Lin Ma, Pete Manolios, Sara McAl- lister, Prashanth Menon, Alan Mislove, Mansour Shafaei Moghaddam, Todd Mowry, Ravi Teja Mullapudi, Onur Mutlu, Guevara Noubir, Jun Woo Park, İlqar Ramazanlı, Alexey Tumanov, David Wajc, Christo Wilson, Pengtao Xie, Jason Yang, Hao Zhang, Huanchen Zhang, and Qing Zheng, for their help and support and random fun chats over the years.
Recommended publications
  • News, Information, Rumors, Opinions, Etc
    http://www.physics.miami.edu/~chris/srchmrk_nws.html ● Miami-Dade/Broward/Palm Beach News ❍ Miami Herald Online Sun Sentinel Palm Beach Post ❍ Miami ABC, Ch 10 Miami NBC, Ch 6 ❍ Miami CBS, Ch 4 Miami Fox, WSVN Ch 7 ● Local Government, Schools, Universities, Culture ❍ Miami Dade County Government ❍ Village of Pinecrest ❍ Miami Dade County Public Schools ❍ ❍ University of Miami UM Arts & Sciences ❍ e-Veritas Univ of Miami Faculty and Staff "news" ❍ The Hurricane online University of Miami Student Newspaper ❍ Tropic Culture Miami ❍ Culture Shock Miami ● Local Traffic ❍ Traffic Conditions in Miami from SmartTraveler. ❍ Traffic Conditions in Florida from FHP. ❍ Traffic Conditions in Miami from MSN Autos. ❍ Yahoo Traffic for Miami. ❍ Road/Highway Construction in Florida from Florida DOT. ❍ WSVN's (Fox, local Channel 7) live Traffic conditions in Miami via RealPlayer. News, Information, Rumors, Opinions, Etc. ● Science News ❍ EurekAlert! ❍ New York Times Science/Health ❍ BBC Science/Technology ❍ Popular Science ❍ Space.com ❍ CNN Space ❍ ABC News Science/Technology ❍ CBS News Sci/Tech ❍ LA Times Science ❍ Scientific American ❍ Science News ❍ MIT Technology Review ❍ New Scientist ❍ Physorg.com ❍ PhysicsToday.org ❍ Sky and Telescope News ❍ ENN - Environmental News Network ● Technology/Computer News/Rumors/Opinions ❍ Google Tech/Sci News or Yahoo Tech News or Google Top Stories ❍ ArsTechnica Wired ❍ SlashDot Digg DoggDot.us ❍ reddit digglicious.com Technorati ❍ del.ic.ious furl.net ❍ New York Times Technology San Jose Mercury News Technology Washington
    [Show full text]
  • Technology, Media, and Telecommunications Predictions
    Technology, Media, and Telecommunications Predictions 2019 Deloitte’s Technology, Media, and Telecommunications (TMT) group brings together one of the world’s largest pools of industry experts—respected for helping companies of all shapes and sizes thrive in a digital world. Deloitte’s TMT specialists can help companies take advantage of the ever- changing industry through a broad array of services designed to meet companies wherever they are, across the value chain and around the globe. Contact the authors for more information or read more on Deloitte.com. Technology, Media, and Telecommunications Predictions 2019 Contents Foreword | 2 5G: The new network arrives | 4 Artificial intelligence: From expert-only to everywhere | 14 Smart speakers: Growth at a discount | 24 Does TV sports have a future? Bet on it | 36 On your marks, get set, game! eSports and the shape of media in 2019 | 50 Radio: Revenue, reach, and resilience | 60 3D printing growth accelerates again | 70 China, by design: World-leading connectivity nurtures new digital business models | 78 China inside: Chinese semiconductors will power artificial intelligence | 86 Quantum computers: The next supercomputers, but not the next laptops | 96 Deloitte Australia key contacts | 108 1 Technology, Media, and Telecommunications Predictions 2019 Foreword Dear reader, Welcome to Deloitte Global’s Technology, Media, and Telecommunications Predictions for 2019. The theme this year is one of continuity—as evolution rather than stasis. Predictions has been published since 2001. Back in 2009 and 2010, we wrote about the launch of exciting new fourth-generation wireless networks called 4G (aka LTE). A decade later, we’re now making predic- tions about 5G networks that will be launching this year.
    [Show full text]
  • PC Gamers Win Big
    CASE STUDY Intel® Solid-State Drives Performance, Storage and the Customer Experience PC Gamers Win Big Intel® Solid-State Drives (SSDs) deliver the ultimate gaming experience, providing dramatic visual and runtime improvements Intel® Solid-State Drives (SSDs) represent a revolutionary breakthrough, delivering a giant leap in storage performance. Designed to satisfy the most demanding gamers, media creators, and technology enthusiasts, Intel® SSDs bring a high level of performance and reliability to notebook and desktop PC storage. Faster load times and improved graphics performance such as increased detail in textures, higher resolution geometry, smoother animation, and more characters on the screen make for a better gaming experience. Developers are now taking advantage of these features in their new game designs. SCREAMING LOAD TiMES AND SMOOTH GRAPHICS With no moving parts, high reliability, and a longer life span than traditional hard drives, Intel Solid-State Drives (SSDs) dramatically improve the computer gaming experience. Load times are substantially faster. When compared with Western Digital VelociRaptor* 10K hard disk drives (HDDs), gamers experienced up to 78 percent load time improvements using Intel SSDs. Graphics are smooth and uninterrupted, even at the highest graphics settings. To see the performance difference in a head-to- head video comparing the Intel® X25-M SATA SSD with a 10,000 RPM HDD, go to www.intelssdgaming.com. When comparing frame-to-frame coherency with the Western Digital VelociRaptor 10K HDD, the Intel X25-M responds with zero hitching while the WD VelociRaptor shows hitching seven percent of the time. This means gamers experience smoother visual transitions with Intel SSDs.
    [Show full text]
  • Lecture 1: CS/ECE 3810 Introduction
    Lecture 1: CS/ECE 3810 Introduction • Today’s topics: . Why computer organization is important . Logistics . Modern trends 1 Why Computer Organization 2 Image credits: uber, extremetech, anandtech Why Computer Organization 3 Image credits: gizmodo Why Computer Organization • Embarrassing if you are a BS in CS/CE and can’t make sense of the following terms: DRAM, pipelining, cache hierarchies, I/O, virtual memory, … • Embarrassing if you are a BS in CS/CE and can’t decide which processor to buy: 4.4 GHz Intel Core i9 or 4.7 GHz AMD Ryzen 9 (reason about performance/power) • Obvious first step for chip designers, compiler/OS writers • Will knowledge of the hardware help you write better and more secure programs? 4 Must a Programmer Care About Hardware? • Must know how to reason about program performance and energy and security • Memory management: if we understand how/where data is placed, we can help ensure that relevant data is nearby • Thread management: if we understand how threads interact, we can write smarter multi-threaded programs Why do we care about multi-threaded programs? 5 Example 200x speedup for matrix vector multiplication • Data level parallelism: 3.8x • Loop unrolling and out-of-order execution: 2.3x • Cache blocking: 2.5x • Thread level parallelism: 14x Further, can use accelerators to get an additional 100x. 6 Key Topics • Moore’s Law, power wall • Use of abstractions • Assembly language • Computer arithmetic • Pipelining • Using predictions • Memory hierarchies • Accelerators • Reliability and Security 7 Logistics
    [Show full text]
  • Ceph: Distributed Storage for Cloud Infrastructure
    ceph: distributed storage for cloud infrastructure sage weil msst – april 16, 2012 outline ● motivation ● practical guide, demo ● overview ● hardware ● ● how it works installation ● failure and recovery ● architecture ● rbd ● data distribution ● libvirt ● rados ● ● rbd project status ● distributed file system storage requirements ● scale ● terabytes, petabytes, exabytes ● heterogeneous hardware ● reliability and fault tolerance ● diverse storage needs ● object storage ● block devices ● shared file system (POSIX, coherent caches) ● structured data time ● ease of administration ● no manual data migration, load balancing ● painless scaling ● expansion and contraction ● seamless migration money ● low cost per gigabyte ● no vendor lock-in ● software solution ● commodity hardware ● open source ceph: unified storage system ● objects ● small or large ● multi-protocol Netflix VM Hadoop ● block devices radosgw RBD Ceph DFS ● snapshots, cloning RADOS ● files ● cache coherent ● snapshots ● usage accounting open source ● LGPLv2 ● copyleft ● free to link to proprietary code ● no copyright assignment ● no dual licensing ● no “enterprise-only” feature set distributed storage system ● data center (not geo) scale ● 10s to 10,000s of machines ● terabytes to exabytes ● fault tolerant ● no SPoF ● commodity hardware – ethernet, SATA/SAS, HDD/SSD – RAID, SAN probably a waste of time, power, and money architecture ● monitors (ceph-mon) ● 1s-10s, paxos ● lightweight process ● authentication, cluster membership, critical cluster state ● object storage daemons (ceph-osd)
    [Show full text]
  • Survey and Benchmarking of Machine Learning Accelerators
    1 Survey and Benchmarking of Machine Learning Accelerators Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, and Jeremy Kepner MIT Lincoln Laboratory Supercomputing Center Lexington, MA, USA freuther,pmichaleas,michael.jones,vijayg,sid,[email protected] Abstract—Advances in multicore processors and accelerators components play a major role in the success or failure of an have opened the flood gates to greater exploration and application AI system. of machine learning techniques to a variety of applications. These advances, along with breakdowns of several trends including Moore’s Law, have prompted an explosion of processors and accelerators that promise even greater computational and ma- chine learning capabilities. These processors and accelerators are coming in many forms, from CPUs and GPUs to ASICs, FPGAs, and dataflow accelerators. This paper surveys the current state of these processors and accelerators that have been publicly announced with performance and power consumption numbers. The performance and power values are plotted on a scatter graph and a number of dimensions and observations from the trends on this plot are discussed and analyzed. For instance, there are interesting trends in the plot regarding power consumption, numerical precision, and inference versus training. We then select and benchmark two commercially- available low size, weight, and power (SWaP) accelerators as these processors are the most interesting for embedded and Fig. 1. Canonical AI architecture consists of sensors, data conditioning, mobile machine learning inference applications that are most algorithms, modern computing, robust AI, human-machine teaming, and users (missions). Each step is critical in developing end-to-end AI applications and applicable to the DoD and other SWaP constrained users.
    [Show full text]
  • A Guide to Smartphone Astrophotography National Aeronautics and Space Administration
    National Aeronautics and Space Administration A Guide to Smartphone Astrophotography National Aeronautics and Space Administration A Guide to Smartphone Astrophotography A Guide to Smartphone Astrophotography Dr. Sten Odenwald NASA Space Science Education Consortium Goddard Space Flight Center Greenbelt, Maryland Cover designs and editing by Abbey Interrante Cover illustrations Front: Aurora (Elizabeth Macdonald), moon (Spencer Collins), star trails (Donald Noor), Orion nebula (Christian Harris), solar eclipse (Christopher Jones), Milky Way (Shun-Chia Yang), satellite streaks (Stanislav Kaniansky),sunspot (Michael Seeboerger-Weichselbaum),sun dogs (Billy Heather). Back: Milky Way (Gabriel Clark) Two front cover designs are provided with this book. To conserve toner, begin document printing with the second cover. This product is supported by NASA under cooperative agreement number NNH15ZDA004C. [1] Table of Contents Introduction.................................................................................................................................................... 5 How to use this book ..................................................................................................................................... 9 1.0 Light Pollution ....................................................................................................................................... 12 2.0 Cameras ................................................................................................................................................
    [Show full text]
  • Comparative Analysis of Distributed and Parallel File Systems' Internal Techniques
    Comparative Analysis of Distributed and Parallel File Systems’ Internal Techniques Viacheslav Dubeyko Content 1 TERMINOLOGY AND ABBREVIATIONS ................................................................................ 4 2 INTRODUCTION......................................................................................................................... 5 3 COMPARATIVE ANALYSIS METHODOLOGY ....................................................................... 5 4 FILE SYSTEM FEATURES CLASSIFICATION ........................................................................ 5 4.1 Distributed File Systems ............................................................................................................................ 6 4.1.1 HDFS ..................................................................................................................................................... 6 4.1.2 GFS (Google File System) ....................................................................................................................... 7 4.1.3 InterMezzo ............................................................................................................................................ 9 4.1.4 CodA .................................................................................................................................................... 10 4.1.5 Ceph.................................................................................................................................................... 12 4.1.6 DDFS ..................................................................................................................................................
    [Show full text]
  • Performance and Scalability of a Sensor Data Storage Framework
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Aaltodoc Publication Archive Aalto University School of Science Degree Programme in Computer Science and Engineering Paul Tötterman Performance and Scalability of a Sensor Data Storage Framework Master’s Thesis Helsinki, March 10, 2015 Supervisor: Associate Professor Keijo Heljanko Advisor: Lasse Rasinen M.Sc. (Tech.) Aalto University School of Science ABSTRACT OF Degree Programme in Computer Science and Engineering MASTER’S THESIS Author: Paul Tötterman Title: Performance and Scalability of a Sensor Data Storage Framework Date: March 10, 2015 Pages: 67 Major: Software Technology Code: T-110 Supervisor: Associate Professor Keijo Heljanko Advisor: Lasse Rasinen M.Sc. (Tech.) Modern artificial intelligence and machine learning applications build on analysis and training using large datasets. New research and development does not always start with existing big datasets, but accumulate data over time. The same storage solution does not necessarily cover the scale during the lifetime of the research, especially if scaling up from using common workgroup storage technologies. The storage infrastructure at ZenRobotics has grown using standard workgroup technologies. The current approach is starting to show its limits, while the storage growth is predicted to continue and accelerate. Successful capacity planning and expansion requires a better understanding of the patterns of the use of storage and its growth. We have examined the current storage architecture and stored data from different perspectives in order to gain a better understanding of the situation. By performing a number of experiments we determine key properties of the employed technologies. The combination of these factors allows us to make informed decisions about future storage solutions.
    [Show full text]
  • Open Source Software
    OPEN SOURCE SOFTWARE Product AVD9x0x0-0010 Firmware and version 3.4.57 Notes on Open Source software The TCS device uses Open Source software that has been released under specific licensing requirements such as the ”General Public License“ (GPL) Version 2 or 3, the ”Lesser General Public License“ (LGPL), the ”Apache License“ or similar licenses. Any accompanying material such as instruction manuals, handbooks etc. contain copyright notes, conditions of use or licensing requirements that contradict any applicable Open Source license, these conditions are inapplicable. The use and distribution of any Open Source software used in the product is exclusively governed by the respective Open Source license. The programmers provided their software without ANY WARRANTY, whether implied or expressed, of any fitness for a particular purpose. Further, the programmers DECLINE ALL LIABILITY for damages, direct or indirect, that result from the using this software. This disclaimer does not affect the responsibility of TCS. AVAILABILITY OF SOURCE CODE: Where the applicable license entitles you to the source code of such software and/or other additional data, you may obtain it for a period of three years after purchasing this product, and, if required by the license conditions, for as long as we offer customer support for the device. If this document is available at the website of TCS and TCS offers a download of the firmware, the offer to receive the source code is valid for a period of three years and, if required by the license conditions, for as long as TCS offers customer support for the device. TCS provides an option for obtaining the source code: You may obtain any version of the source code that has been distributed by TCS for the cost of reproduction of the physical copy.
    [Show full text]
  • Digitalocean
    UPDATE SAGE WEIL – RED HAT SC16 CEPH BOF – 2016.11.16 2 JEWEL APPAPP APPAPP HOST/VMHOST/VM CLIENTCLIENT RADOSGWRADOSGW RBDRBD CEPHCEPH FSFS LIBRADOSLIBRADOS AA bucket-based bucket-based AA reliable reliable and and fully- fully- AA POSIX-compliant POSIX-compliant AA library library allowing allowing RESTREST gateway, gateway, distributeddistributed block block distributeddistributed file file appsapps to to directly directly compatiblecompatible with with S3 S3 device,device, with with a a Linux Linux system,system, with with a a accessaccess RADOS, RADOS, andand Swift Swift kernelkernel client client and and a a LinuxLinux kernel kernel client client withwith support support for for QEMU/KVMQEMU/KVM driver driver andand support support for for C,C, C++, C++, Java, Java, FUSEFUSE Python,Python, Ruby, Ruby, andand PHP PHP AWESOME AWESOME NEARLY AWESOME AWESOME RADOSRADOS AWESOME AA reliable, reliable, autonomous, autonomous, distributed distributed object object store store comprised comprised of of self-healing, self-healing, self-managing, self-managing, intelligentintelligent storage storage nodes nodes 3 2016 = FULLY AWESOME OBJECT BLOCK FILE RGW RBD CEPHFS S3 and Swift compatible A virtual block device with A distributed POSIX file object storage with object snapshots, copy-on-write system with coherent versioning, multi-site clones, and multi-site caches and snapshots on federation, and replication replication any directory LIBRADOS A library allowing apps to direct access RADOS (C, C++, Java, Python, Ruby, PHP) RADOS A software-based,
    [Show full text]
  • SLA-Aware Adaptive Mapping Scheme in Bigdata Distributed Storage Systems
    SMA 2020, September 17-19, Jeju, Republic of Korea, S. Chum et al. SLA-Aware Adaptive Mapping Scheme in Bigdata Distributed Storage Systems Sopanhapich Chum Jerry Li Department of Computer Science Memory Solutions Lab. Dankook University Samsung Semiconductor Inc. Yongin, Korea San Jose, CA, USA [email protected] [email protected] Heekwon Park Jongmoo Choi Memory Solutions Lab. Department of Computer Science Samsung Semiconductor Inc. Dankook University San Jose, CA, USA Yongin, Korea [email protected] [email protected] ABSTRACT by 2025, with a compounded annual growth rate of 61% [23]. In As data are processed by diverse clients ranging from urgent time- addition, these data are processed instantly to extract information critical to best-effort, supporting different QoS (Quality of Service) for decision making, recommendation and autonomous control becomes a vital component in a distributed storage system. In this using various analytic models and frameworks [1, 8, 11, 20, 27, 30]. paper, we propose a novel SLA (Service Level Agreement)-aware We indeed live in Bigdata era [24]. adaptive mapping scheme that can differentiate between urgent Distributed storage (also called as Cloud storage) plays a key and normal clients based on their I/O requirements. The scheme ba- role in Bigdata era. There are various distributed storage systems sically divides storage into two regions, normal and urgent, which including GFS [14], HDFS [26], Ceph [2, 33], Azure Storage [16], makes it feasible to isolate urgent clients from normal ones. In Amazon S3 [22], Openstack Swift [18], Haystack [6], Lustre [35], addition, it changes the size of the isolated region in an adaptive GlusterFS [17] and so on.
    [Show full text]