eTRIKS Analytical Environment: A Practical Platform for Medical Big Data Analysis

Total Pages: 16

File Type: PDF, Size: 1,020 KB

Imperial College of Science, Technology and Medicine
Department of Computing

eTRIKS Analytical Environment: A Practical Platform for Medical Big Data Analysis

Axel Oehmichen

Submitted in part fulfilment of the requirements for the degree of Doctor of Philosophy in Computing of Imperial College London, December 2018.

Abstract

Personalised medicine and translational research have become sciences driven by Big Data. Healthcare and medical research are generating more and more complex data, encompassing clinical investigations, 'omics, imaging, pharmacokinetics, Next Generation Sequencing and beyond. In addition to traditional collection methods, economical and numerous information-sensing IoT devices such as mobile devices, smart sensors, cameras or connected medical devices have created a deluge of data that research institutes and hospitals struggle to deal with. While the collection of data is greatly accelerating, improving patient care by developing personalised therapies and new drugs depends increasingly on an organization's ability to rapidly and intelligently leverage complex molecular and clinical data from a variety of large-scale heterogeneous data sources. As a result, the analysis of these datasets has become increasingly computationally expensive and has laid bare the limitations of current systems.

From the patient perspective, the advent of electronic medical records, coupled with so much personal data being collected, has raised concerns about privacy. Many countries have introduced laws to protect people's privacy; however, many of these laws have proven less effective in practice. Therefore, along with the capacity to process the enormous amount of medical data, the addition of privacy-preserving features to protect patients' privacy has become a necessity.

In this thesis, our first contribution is the development of a new platform, the eTRIKS Analytical Environment (eAE), as an answer to those needs: analysing and exploring massive amounts of medical data in a privacy-preserving fashion, with the constraint of enabling the broadest audience, ranging from medical doctors to advanced coders, to easily and intuitively exploit this new resource. We present the use of location data in the context of public health research, the work done on data privacy for location data, and the extension of the eAE to support privacy-preserving analytics. Our second contribution is the implementation of new workflows for tranSMART that leverage the eAE, and the support of novel life-science approaches for feature extraction using deep learning models in the context of sleep research. Finally, we demonstrate the universality and extensibility of the architecture to other research domains by proposing a model aimed at identifying relevant features for characterizing political deception on Twitter.

Copyright Declaration

The copyright of this thesis rests with the author. Unless otherwise indicated, its contents are licensed under a Creative Commons Attribution-NonCommercial 4.0 International Licence (CC BY-NC). Under this licence, you may copy and redistribute the material in any medium or format. You may also create and distribute modified versions of the work. This is on the condition that you credit the author and do not use it, or any derivative works, for a commercial purpose.
When reusing or sharing this work, ensure you make the licence terms clear to others by naming the licence and linking to the licence text. Where a work has been adapted, you should indicate that the work has been changed and describe those changes. Please seek permission from the copyright holder for uses of this work that are not included in this licence or permitted under UK Copyright Law.

Acknowledgements

I would like to take this opportunity to express my thanks to all of those who have always been by my side and supported me through this adventure. Firstly, I must thank my supervisor, Professor Yi-ke Guo, without whom none of this would have been possible. I am deeply grateful for his professional guidance and for sharing his wisdom. I would like to give a special thanks to Florian Guitton, who has been a close collaborator and friend from whom I have learned and shared so much. I am thankful to Dr Heinis and Dr de Montjoye for their invaluable support and guidance. All my friends and colleagues at Imperial College London: Diana O'Malley, Kai Sun, Miguel Molina-Solana, Shubham Jain, Arnaud Tournier, Florimond Houssiau, Akara Supratak, Ioannis Pandis, Lei Nie, Hao Dong, Paul Agapow, Susan Mulcahy, Juan Gómez Romero, Jean Grizet, Kevin Hua, Julio Amador Díaz López, Pierre Richemond, Ali Farzaneh, David Akroyd, Shicai Wang, Chao Wu, Bertan Kavuncu and Ibrahim Emam. I would like to thank Cédric Wahl, who encouraged me to follow this path. I would like to express my gratitude to the eTRIKS and OPAL projects for supporting this work. Finally, I would like to give my deepest thanks to Cécile and my mother for their constant support, patience, love and encouragement.

Dedication

To my mother

Vi Veri Veniversum Vivus Vici

Contents

Abstract
Copyright Declaration
Acknowledgements

1 Introduction
  1.1 Motivation and objectives
  1.2 Contributions
  1.3 Impact and adoption of the research
  1.4 Thesis organisation
  1.5 Statement of Originality
  1.6 Publications

2 Background
  2.1 Towards large scale data analysis in Life Science
    2.1.1 A deluge of data
    2.1.2 Moving away from a pure symptom-based medicine
    2.1.3 Complexity of computing infrastructures in Life Science
  2.2 Scalability in distributed systems
    2.2.1 Introduction
    2.2.2 Scheduling and management scalability
    2.2.3 Storage scalability
    2.2.4 Computational scalability
  2.3 Architectures to support machine intelligence
    2.3.1 Machine Learning
    2.3.2 Deep Learning
    2.3.3 Hardware acceleration for AI research
  2.4 Compliance and security in distributed systems
    2.4.1 GDPR and privacy of patient data
    2.4.2 Security of the data
    2.4.3 Privacy of companies
  2.5 General-purpose analytical platforms for Life Science
    2.5.1 Introduction
    2.5.2 Existing architectures
    2.5.3 Conclusion

3 eTRIKS Analytical Environment: Design Principles and Core Concepts
  3.1 Introduction and users' needs
  3.2 Existing knowledge management platforms and their limitations
  3.3 eTRIKS Analytical Environment
    3.3.1 Introduction
    3.3.2 General Environment
    3.3.3 Endpoints Layer
    3.3.4 Storage Layer
    3.3.5 Management Layer
    3.3.6 Computation Layer
    3.3.7 Interaction between Layers
    3.3.8 Security of the architecture

4 Implementation of the eTRIKS Analytical Environment
  4.1 Implementation
    4.1.1 General Environment
    4.1.2 Endpoints Layer
    4.1.3 Storage Layer
    4.1.4 Management Layer
    4.1.5 Computation Layer
  4.2 Benchmarking and Scalability
    4.2.1 Resource usage
    4.2.2 Scheduler
    4.2.3 Compute Scalability
    4.2.4 Storage Scalability
    4.2.5 Summary
  4.3 TensorDB: Database Infrastructure for Continuous Machine Learning
    4.3.1 Introduction
    4.3.2 Related work
    4.3.3 Architecture
    4.3.4 Application Evaluation
    4.3.5 Conclusion

5 eTRIKS Analytical Environment with Privacy
  5.1 Building Privacy capabilities
    5.1.1 Location data as a support for public health
    5.1.2 Attempts at sharing location data
    5.1.3 Sensitivity of location data
  5.2 Privacy preserving eTRIKS Analytical Environment
    5.2.1 New services and features
    5.2.2 Scalability of the platform
    5.2.3 Privacy of the platform
    5.2.4 Algorithms on the platform
    5.2.5 Privacy module for density
    5.2.6 Related work
  5.3 Discussion and future work

6 Analytics Developed using the eTRIKS Analytical Environment
  6.1 Analytics for tranSMART
    6.1.1 Iterative Model Generation and Cross-validation Pipeline
    6.1.2 General statistics
    6.1.3 Pathway Enrichment
  6.2 DeepSleepNet
    6.2.1 Introduction
    6.2.2 Tackle class imbalances
    6.2.3 Results
  6.3 Characterizing Political Deception on Twitter
    6.3.1 Background
    6.3.2 Data and Methodology
    6.3.3 Feature Selection
    6.3.4 Fake news classification
    6.3.5 Conclusion

7 eTRIKS Analytical Environment supporting Open Science
  7.1 Sustainability of the platform
    7.1.1 Hosting of the project and supporting the users
    7.1.2 Continuous integration and system deployment …
Recommended publications
  • Accordion: Better Memory Organization for LSM Key-Value Stores
    Accordion: Better Memory Organization for LSM Key-Value Stores. Edward Bortnikov, Anastasia Braginsky, Eshcar Hillel (Yahoo Research); Idit Keidar (Technion and Yahoo Research); Gali Sheffi (Yahoo Research).

    ABSTRACT: Log-structured merge (LSM) stores have emerged as the technology of choice for building scalable write-intensive key-value storage systems. An LSM store replaces random I/O with sequential I/O by accumulating large batches of writes in a memory store prior to flushing them to log-structured disk storage; the latter is continuously re-organized in the background through a compaction process for efficiency of reads. Though inherent to the LSM design, frequent compactions are a major pain point because they slow down data store operations, primarily writes, and also increase disk wear. Another performance bottleneck in today's state- […]

    […] of applications for which they are used continuously increases. A small sample of recently published use cases includes massive-scale online analytics (Airbnb/Airstream [2], Yahoo/Flurry [7]), product search and recommendation (Alibaba [13]), graph storage (Facebook/Dragon [5], Pinterest/Zen [19]), and many more. The leading approach for implementing write-intensive key-value storage is log-structured merge (LSM) stores [31]. This technology is ubiquitously used by popular key-value storage platforms [9, 14, 16, 22, 4, 1, 10, 11]. The premise for using LSM stores is the major disk access bottleneck, exhibited even with today's SSD hardware [14, 33, 34].
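The mechanics described in this excerpt, buffering writes in a memory store, flushing them as sorted immutable runs, and periodically compacting those runs, can be illustrated with a minimal sketch. The class name `ToyLSM` and all parameters below are invented for the example; real LSM stores persist runs to disk, keep a write-ahead log and use Bloom filters, none of which is modelled here.

```python
# Minimal, illustrative LSM-style store: writes go to an in-memory buffer
# (memtable); when it fills up it is flushed as a sorted, immutable run;
# runs are periodically merged (compacted) so reads stay cheap.
class ToyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable = {}          # mutable in-memory buffer
        self.runs = []              # newest-first list of sorted, immutable runs
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value  # random-looking writes become memory updates
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # In a real store this is the sequential write of one sorted batch.
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:       # newest run wins
            for k, v in run:
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all runs into one, keeping only the newest value per key.
        merged = {}
        for run in reversed(self.runs):   # oldest first, newer values overwrite
            merged.update(dict(run))
        self.runs = [sorted(merged.items())]

store = ToyLSM()
for i in range(10):
    store.put(f"k{i}", i)
store.compact()
print(store.get("k3"))   # -> 3
```

The compaction cost the excerpt describes shows up even here: `compact()` rewrites every stored key-value pair, which is exactly the write and wear overhead that designs such as Accordion try to reduce by reorganising the in-memory component.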
  • Artificial Intelligence for Understanding Large and Complex Datacenters
    Artificial Intelligence for Understanding Large and Complex Datacenters, by Pengfei Zheng. Department of Computer Science, Duke University, 2020. Advisor: Benjamin C. Lee; committee: Bruce M. Maggs, Jeffrey S. Chase, Jun Yang. Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science in the Graduate School of Duke University. Copyright © 2020 by Pengfei Zheng. All rights reserved except the rights granted by the Creative Commons Attribution-Noncommercial Licence.

    Abstract: As the democratization of global-scale web applications and cloud computing advances, understanding the performance of a live production datacenter becomes a prerequisite for making strategic decisions related to datacenter design and optimization. Advances in monitoring, tracing, and profiling large, complex systems provide rich datasets and establish a rigorous foundation for performance understanding and reasoning. But the sheer volume and complexity of the collected data challenge existing techniques, which rely heavily on human intervention, expert knowledge, and simple statistics. In this dissertation, we address this challenge using artificial intelligence and make the case for two important problems, datacenter performance diagnosis and datacenter workload characterization. The first thrust of this dissertation is the use of statistical causal inference and Bayesian probabilistic models for datacenter straggler diagnosis.
  • Learning Key-Value Store Design
    Learning Key-Value Store Design. Stratos Idreos, Niv Dayan, Wilson Qin, Mali Akmanalp, Sophie Hilgard, Andrew Ross, James Lennon, Varun Jain, Harshita Gupta, David Li, Zichen Zhu (Harvard University).

    ABSTRACT: We introduce the concept of design continuums for the data layout of key-value stores. A design continuum unifies major distinct data structure designs under the same model. The critical insight and potential long-term impact is that such unifying models 1) render what we consider up to now as fundamentally different data structures to be seen as "views" of the very same overall design space, and 2) allow "seeing" new data structure designs with performance properties that are not feasible by existing designs. The core intuition behind the construction of design continuums is that all data structures arise from the very same set of fundamental design principles, i.e., a small set of data layout design concepts out of which we can synthesize any design that exists in the literature as well as new ones. We show how to construct, evaluate, and expand design continuums, and we also present the first continuum that unifies major data structure designs, i.e., B+tree, Btree, LSM-tree, and LSH-table. The practical benefit of a design continuum is that it cre- […]

    [Figure 1: From performance trade-offs to data structures, key-value stores and rich applications.]
  • MyRocks in MariaDB
    MyRocks in MariaDB. Sergei Petrunia, MariaDB. Shenzhen Meetup, November 2017.

    What is MyRocks: #include <Yoshinori's talk>; this talk is about MyRocks in MariaDB. MyRocks lives in Facebook's MySQL branch (github.com/facebook/mysql-5.6, referred to here as "FB/MySQL"), under storage/rocksdb. FB/MySQL is easy to use if you are Facebook, not so easy if you are not.

    FB/mysql-5.6 from a user perspective: no binaries and no packages, so you compile it yourself from source (dependencies, etc.); no releases (is the latest git revision ok?); it has extra features, e.g. extra counters that "confuse" monitoring tools.

    FB/mysql-5.6 from a developer perspective: it targets a CentOS-type OS (compiler, cmake version, etc.); other setups may or may not [periodically] work, and MariaDB/Percona file pull requests to fix them. There is a special command to compile (https://github.com/facebook/mysql-5.6/wiki/Build-Steps) and a special command to run tests; the test suite assumes a big machine, and some tests even assume a release build.

    Putting MyRocks in MariaDB: the goals are wider adoption, ease of use, ease of development, and having MyRocks in MariaDB so it can be used together with MariaDB features. The means are porting MyRocks into MariaDB and providing binaries and packages.

    Status of MyRocks in MariaDB: MariaDB 10.2 is GA (as of May 2017) and includes an ALPHA version of the MyRocks plugin, with work ongoing to improve maturity. It is a loadable plugin (ha_rocksdb.so). Packages: bintar, deb, rpm, win64 zip + MSI; the deb/rpm packages ship the MyRocks .so and tools in a separate package.

    MyRocks and the RocksDB library: MyRocks is tied to a specific RocksDB revision (RocksDB is a git submodule), with no compatibility with other versions; RocksDB is always compiled with MyRocks and linked in statically. Distros have a RocksDB package, but it is not used.
  • MyRocks Deployment at Facebook and Roadmaps
    MyRocks deployment at Facebook and Roadmaps. Yoshinori Matsunobu, Production Engineer / MySQL Tech Lead, Facebook. February 2018, #FOSDEM, #mysqldevroom.

    Agenda: MySQL at Facebook; MyRocks overview; production deployment; future plans.

    MySQL "User Database (UDB)" at Facebook: stores the social graph; massively sharded; low latency; automated operations; pure flash storage (constrained by space, not by CPU/IOPS).

    What is MyRocks: MySQL on top of RocksDB (a RocksDB storage engine); open source, also distributed by MariaDB and Percona. MySQL clients talk to the SQL layer (connector, parser, optimizer, replication, etc.), which runs on top of either InnoDB or RocksDB. http://myrocks.io/

    [Chart: MyRocks initial goal at Facebook, comparing CPU, IO and space usage against the machine limit for InnoDB versus MyRocks in the main database.]

    MyRocks features: clustered index (same as InnoDB); Bloom filters and column families; transactions, including consistency between the binlog and RocksDB; faster data loading, deletes and replication; dynamic options; TTL; online logical and binary backup.

    MyRocks vs InnoDB. MyRocks pros: much smaller space (half compared to compressed InnoDB), which gives a better cache hit rate; writes are faster, so replication is faster; much smaller bytes written (can use more affordable flash storage). MyRocks cons (improvements in progress): lack of several features (no SBR, gap locks, foreign keys, fulltext indexes or spatial indexes; a case-sensitive collation is needed for performance); reads are slower, especially if your data fits in memory; more dependent on the filesystem.
  • Optimal Bloom Filters and Adaptive Merging for LSM-Trees∗
    Optimal Bloom Filters and Adaptive Merging for LSM-Trees. Niv Dayan, Manos Athanassoulis, Stratos Idreos (Harvard University, USA).

    In this paper, we show that key-value stores backed by a log-structured merge-tree (LSM-tree) exhibit an intrinsic trade-off between lookup cost, update cost, and main memory footprint, yet all existing designs expose a suboptimal and difficult to tune trade-off among these metrics. We pinpoint the problem to the fact that modern key-value stores suboptimally co-tune the merge policy, the buffer size, and the Bloom filters' false positive rates across the LSM-tree's different levels. We present Monkey, an LSM-tree based key-value store that strikes the optimal balance between the costs of updates and lookups with any given main memory budget. The core insight is that worst-case lookup cost is proportional to the sum of the false positive rates of the Bloom filters across all levels of the LSM-tree. Contrary to state-of-the-art key-value stores that assign a fixed number of bits-per-element to all Bloom filters, Monkey allocates memory to filters across different levels so as to minimize the sum of their false positive rates. We show analytically that Monkey reduces the asymptotic complexity of the worst-case lookup I/O cost, and we verify empirically using an implementation on top of RocksDB that Monkey reduces lookup latency by an increasing margin as the data volume grows (50%–80% for the data sizes we experimented with). Furthermore, we map the design space onto a closed-form model that enables adapting the merging frequency and memory allocation to strike the best trade-off among lookup cost, update cost and main memory, depending on the workload (proportion of lookups and updates), the dataset (number and size of entries), and the underlying hardware (main memory available, disk vs. […]
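The core insight quoted above, that worst-case lookup cost tracks the sum of the Bloom filters' false positive rates, so memory should be skewed toward the smaller levels, can be made concrete with a small calculation. The sketch below uses the standard approximation FPR ≈ exp(-(ln 2)² · b) for a Bloom filter with b bits per entry and an optimal number of hash functions; the level sizes and the "skewed" bit allocation are hand-picked for illustration and are not Monkey's actual optimum.

```python
import math

def bloom_fpr(bits_per_entry):
    # Standard approximation for a Bloom filter with an optimal number of
    # hash functions: FPR ~= exp(-(ln 2)^2 * bits_per_entry).
    return math.exp(-(math.log(2) ** 2) * bits_per_entry)

# Illustrative leveled LSM-tree: each level holds 10x more entries than the one above.
entries = [10_000, 100_000, 1_000_000, 10_000_000]

# Scheme 1: a fixed 10 bits per entry everywhere (the common default).
uniform = [10, 10, 10, 10]

# Scheme 2: more bits per entry on the small levels, fewer on the largest level,
# the direction Monkey argues for; numbers chosen so total memory is comparable.
skewed = [16, 14, 12, 9.76]

for name, alloc in [("uniform", uniform), ("skewed", skewed)]:
    memory_bits = sum(b * n for b, n in zip(alloc, entries))
    total_fpr = sum(bloom_fpr(b) for b in alloc)   # expected wasted I/Os per zero-result lookup
    print(f"{name:8s} memory = {memory_bits / 8 / 2**20:6.1f} MiB, "
          f"sum of false positive rates = {total_fpr:.5f}")
```

Running this shows the skewed allocation achieving a noticeably lower sum of false positive rates for essentially the same memory budget, which is the qualitative effect the paper formalises and optimises analytically.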
  • µTune: Auto-Tuned Threading for OLDI Microservices
    µTune: Auto-Tuned Threading for OLDI Microservices. Akshitha Sriraman, Thomas F. Wenisch (University of Michigan).

    ABSTRACT: Modern On-Line Data Intensive (OLDI) applications have evolved from monolithic systems to instead comprise numerous, distributed microservices interacting via Remote Procedure Calls (RPCs). Microservices face sub-millisecond (sub-ms) RPC latency goals, much tighter than their monolithic counterparts that must meet ≥ 100 ms latency targets. Sub-ms-scale threading and concurrency design effects that were once insignificant for such monolithic services can now come to dominate in the sub-ms-scale microservice regime. We investigate how threading design critically impacts microservice tail latency by developing a taxonomy of threading models, a structured understanding of the implications of how microservices manage concurrency and interact with RPC interfaces under wide-ranging loads.

    […]ment, protocol routing [25], etc. Several companies, such as Amazon [6], Netflix [1], Gilt [37], LinkedIn [17], and SoundCloud [9], have adopted microservice architectures to improve OLDI development and scalability [144]. These microservices are composed via standardized Remote Procedure Call (RPC) interfaces, such as Google's Stubby and gRPC [18] or Facebook/Apache's Thrift [14]. Whereas monolithic applications face ≥ 100 ms tail (99th+%) latency SLOs (e.g., ∼300 ms for web search [126, 133, 142, 150]), microservices must often achieve sub-ms tail latencies (e.g., ∼100 µs for protocol routing [151]) as many microservices must be invoked serially to serve a user's query. For example, a Facebook news feed service [79] query may flow through a serial pipeline of many microservices, such as (1) Sigma [15]: […]
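The threading-model distinction the abstract alludes to, for example handling an RPC in-line on the thread that received it versus dispatching it to a worker pool, can be sketched briefly. The example below is a generic Python illustration of those two models, not µTune's actual taxonomy or code; the function names are invented, and Python's GIL means it cannot reproduce the sub-millisecond effects the paper measures, only the structural difference.

```python
# Two simple models for serving small requests: "in-line" handles each request
# on the receiving thread; "dispatched" hands requests to a worker pool.
from concurrent.futures import ThreadPoolExecutor

def handle(request):
    return request * 2          # stand-in for the actual RPC work

def serve_inline(requests):
    # In-line model: no hand-off, so the lowest per-request overhead,
    # but one slow request blocks everything queued behind it.
    return [handle(r) for r in requests]

def serve_dispatched(requests, workers=4):
    # Dispatched model: the receiving thread only enqueues work; a pool of
    # worker threads executes it, adding hand-off latency at low load but
    # isolating slow requests and exploiting more cores at high load.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle, requests))

if __name__ == "__main__":
    reqs = list(range(8))
    assert serve_inline(reqs) == serve_dispatched(reqs)
    print("both models produce:", serve_inline(reqs))
```

The load-dependent trade-off between these two shapes (hand-off overhead dominating at low load, head-of-line blocking dominating at high load) is precisely what motivates auto-tuning the threading model at runtime.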
  • Optimizing Space Amplification in RocksDB
    Optimizing Space Amplification in RocksDB. Siying Dong, Mark Callaghan, Leonidas Galanis, Dhruba Borthakur, Tony Savor (Facebook) and Michael Stumm (Department of Electrical and Computer Engineering, University of Toronto).

    ABSTRACT: RocksDB is an embedded, high-performance, persistent key-value storage engine developed at Facebook. Much of our current focus in developing and configuring RocksDB is to give priority to resource efficiency instead of giving priority to the more standard performance metrics, such as response time latency and throughput, as long as the latter remain acceptable. In particular, we optimize space efficiency while ensuring read and write latencies meet service-level requirements for the intended workloads. This choice is motivated by the fact that storage space is most often the primary bottleneck when using Flash SSDs under typical production workloads at Facebook. RocksDB uses log-structured merge […]

    Facebook has one of the largest MySQL installations in the world, storing many 10s of petabytes of online data. The underlying storage engine for Facebook's MySQL instances is increasingly being switched over from InnoDB to MyRocks, which in turn is based on Facebook's RocksDB. The switchover is primarily motivated by the fact that MyRocks uses half the storage InnoDB needs, and has higher average transaction throughput, yet has only marginally worse read latencies. RocksDB is an embedded, high-performance, persistent key-value storage system [1] that was developed by Facebook after forking the code from Google's LevelDB [2, 3]. RocksDB was open-sourced in 2013 [5].
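Space amplification, the metric this work optimises, is simply the ratio of bytes physically occupied on storage to the logical size of the live data. A back-of-envelope sketch follows; the numbers are invented for illustration and are not measurements from the paper, although the 2:1 ratio is in the spirit of its "half the storage" claim.

```python
# Space amplification = bytes physically used on storage / logical size of the
# live data. Toy numbers, chosen only to illustrate the metric.
def space_amplification(bytes_on_disk, logical_bytes):
    return bytes_on_disk / logical_bytes

logical_tb = 1.0    # 1 TB of live key-value data

# A B+Tree engine with fragmented pages and per-row overhead might occupy,
# say, 2.1 TB for that 1 TB of logical data (assumed figure) ...
btree_disk_tb = 2.1
# ... while a compacted, compressed LSM engine might occupy 1.1 TB (assumed).
lsm_disk_tb = 1.1

print("B+Tree space amplification:", space_amplification(btree_disk_tb, logical_tb))
print("LSM    space amplification:", space_amplification(lsm_disk_tb, logical_tb))
```

When flash capacity, rather than IOPS, is the binding constraint, halving this ratio roughly halves the number of SSDs needed for the same dataset, which is why the paper treats space efficiency as the primary optimisation target.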
  • MinervaDB: The Full-Stack Database Infrastructure Operations Experts for Web-Scale
    MinervaDB: the full-stack database infrastructure operations experts for web-scale. MinervaDB Inc., PO Box 2093, Philadelphia Pike #3339, Claymont, DE 19703.

    MinervaDB introduction: an open source, vendor neutral and independent consulting, support and managed remote DBA services provider for MySQL, MariaDB, MyRocks, PostgreSQL and ClickHouse, with core expertise in performance, scalability, high availability, database reliability engineering and database security. Our consultants have several years of experience in architecting and building web-scale database infrastructure operations for internet properties from diversified verticals like CDN, mobile advertising networks, e-commerce, social media applications, SaaS, gaming and digital payment solutions.

    Flexible, on-demand database operations team: hire our MySQL / MariaDB / PostgreSQL consultants only when you need them; we don't insist on long-term contracts and our client engagement policies are friendly. Not every business needs a resident full-time MySQL DBA, and hiring and retaining MySQL DBAs is expensive and exhausting. This is where engaging MinervaDB adds value: we are pioneers in hiring and managing senior-level MySQL / MariaDB / PostgreSQL consultants worldwide with several years of experience in performance, scalability, high availability and database reliability engineering.

    Why engage MinervaDB for database infrastructure operations? Vendor neutral and independent, enterprise-class consulting, 24*7 support and remote DBA services for MySQL, MariaDB, MyRocks, PostgreSQL and ClickHouse. A virtual corporation with a global team of seasoned professionals: we have consultants operating from multiple locations worldwide; all of us work from home and stay connected via email, Google Hangouts, Skype, private IRC, WhatsApp, Telegram and phone.
  • Succinct Range Filters
    Succinct Range Filters. Huanchen Zhang, Hyeontaek Lim, David G. Andersen, Andrew Pavlo (Carnegie Mellon University); Viktor Leis (Friedrich Schiller University Jena); Michael Kaminsky (Intel Labs); Kimberly Keeton (Hewlett Packard Labs).

    ABSTRACT: We present the Succinct Range Filter (SuRF), a fast and compact data structure for approximate membership tests. Unlike traditional Bloom filters, SuRF supports both single-key lookups and common range queries. SuRF is based on a new data structure called the Fast Succinct Trie (FST) that matches the point and range query performance of state-of-the-art order-preserving indexes, while consuming only 10 bits per trie node. The false positive rates in SuRF for both point and range queries are tunable to satisfy different application needs. We evaluate SuRF in RocksDB as a replacement for its Bloom filters to reduce I/O by filtering requests before […]

    […] that LSM tree-based designs often use prefix Bloom filters to optimize certain fixed-prefix queries (e.g., "where email starts with com.foo@") [2, 20, 32], despite their inflexibility for more general range queries. The designers of RocksDB [2] have expressed a desire to have a more flexible data structure for this purpose [19]. A handful of approximate data structures, including the prefix Bloom filter, exist that accelerate specific categories of range queries, but none is general purpose. This paper presents the Succinct Range Filter (SuRF), a fast and compact filter that provides exact-match filtering and range filtering.
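The role a range filter plays in an LSM store, answering "could any key in [lo, hi] exist in this file?" before touching storage, can be illustrated with a much cruder stand-in: a sorted list of the stored keys queried with binary search. This is emphatically not SuRF's Fast Succinct Trie (it is exact rather than approximate and is nowhere near 10 bits per node); it only shows the point and range interface such a filter exposes. The class name is invented for the example.

```python
# Crude stand-in for a range filter: keep stored keys sorted and answer
# "might any key fall in [lo, hi]?" with binary search. Unlike SuRF this is
# exact and not succinct; it only illustrates the filtering interface.
import bisect

class SortedRangeFilter:
    def __init__(self, keys):
        self.keys = sorted(keys)

    def may_contain(self, key):
        # Point lookup: is this exact key present?
        i = bisect.bisect_left(self.keys, key)
        return i < len(self.keys) and self.keys[i] == key

    def may_contain_range(self, lo, hi):
        # Range query: does some stored key k satisfy lo <= k <= hi?
        i = bisect.bisect_left(self.keys, lo)
        return i < len(self.keys) and self.keys[i] <= hi

f = SortedRangeFilter(["apple", "banana", "cherry"])
print(f.may_contain("banana"))            # True
print(f.may_contain_range("ba", "bz"))    # True  (banana falls in the range)
print(f.may_contain_range("d", "e"))      # False -> this SSTable can be skipped
```

A "False" answer lets the store skip reading the corresponding SSTable entirely; SuRF's contribution is delivering that answer in a few bits per key, with a tunable false positive rate, rather than by storing the full key set as this toy does.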
  • MyRocks: LSM-Tree Database Storage Engine Serving Facebook's Social Graph
    MyRocks: LSM-Tree Database Storage Engine Serving Facebook's Social Graph. Yoshinori Matsunobu, Siying Dong, Herman Lee (Facebook).

    ABSTRACT: Facebook uses MySQL to manage tens of petabytes of data in its main database named the User Database (UDB). UDB serves social activities such as likes, comments, and shares. In the past, Facebook used InnoDB, a B+Tree-based storage engine, as the backend. The challenge was to find an index structure using less space and write amplification [1]. LSM-tree [2] has the potential to greatly improve these two bottlenecks. RocksDB, an LSM-tree-based key/value store, was already widely used in a variety of applications but had a very low-level key-value interface. To overcome these limitations, MyRocks, a new MySQL storage engine, was built on top of RocksDB by adding relational capabilities. With MyRocks, using the RocksDB API, significant efficiency gains were achieved while […]

    1. INTRODUCTION: The Facebook UDB serves the most important social graph workloads [3]. The initial Facebook deployments used the InnoDB storage engine with MySQL as the backend. InnoDB was a robust, widely used database and it performed well. Meanwhile, hardware trends shifted from slow but affordable magnetic drives to fast but more expensive flash storage. Transitioning to flash storage in UDB shifted the bottleneck from Input/Output Operations Per Second (IOPS) to storage capacity. From a space perspective, InnoDB had three big challenges that were hard to overcome: index fragmentation, compression inefficiencies, and space overhead per row (13 bytes) for handling transactions.
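The space and write-amplification argument for moving from a B+Tree to an LSM-tree can be made concrete with rough arithmetic. All figures below (page size, row size, compaction fanout, level count, and the per-level rule of thumb) are illustrative assumptions, not measurements from the paper.

```python
# Back-of-envelope write amplification, with invented but plausible parameters.

# B+Tree: in the worst case, changing one small row causes the whole page it
# lives in to be rewritten to flash.
page_size_bytes = 16 * 1024      # a typical B+Tree page size (assumed)
row_size_bytes = 128             # a small social-graph row (assumed)
btree_write_amp = page_size_bytes / row_size_bytes          # ~128x

# Leveled LSM-tree: a frequently quoted rough estimate is that each byte is
# rewritten about (fanout / 2) times per level as it is compacted downward.
fanout = 10                      # size ratio between adjacent levels (assumed)
levels = 5                       # number of on-disk levels (assumed)
lsm_write_amp = levels * fanout / 2                          # ~25x

print(f"B+Tree write amplification ~ {btree_write_amp:.0f}x")
print(f"LSM    write amplification ~ {lsm_write_amp:.0f}x")
```

Under these assumptions the LSM-tree writes several times fewer bytes per logical change, which, together with denser on-disk layout and better compression, is the efficiency argument the abstract summarises for flash-constrained deployments like UDB.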
  • µTune: Auto-Tuned Threading for OLDI Microservices, Akshitha Sriraman and Thomas F. Wenisch (OSDI '18)
    µTune: Auto-Tuned Threading for OLDI Microservices. Akshitha Sriraman and Thomas F. Wenisch, University of Michigan. https://www.usenix.org/conference/osdi18/presentation/sriraman

    This paper is included in the Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI '18), October 8–10, 2018, Carlsbad, CA, USA. ISBN 978-1-939133-08-3. Open access to the Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation is sponsored by USENIX.