Deterministic, Mutable, and Distributed Record-Replay for Operating Systems and Database Systems

Total Page:16

File Type:pdf, Size:1020Kb

Deterministic, Mutable, and Distributed Record-Replay for Operating Systems and Database Systems Deterministic, Mutable, and Distributed Record-Replay for Operating Systems and Database Systems Nicolas Viennot Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences COLUMBIA UNIVERSITY 2017 c 2017 Nicolas Viennot All Rights Reserved ABSTRACT Deterministic, Mutable, and Distributed Record-Replay for Operating Systems and Database Systems Nicolas Viennot Application record and replay is the ability to record application execution and replay it at a later time. Record-replay has many use cases including diagnosing and debugging applications by capturing and re- producing hard to find bugs, providing transparent application fault tolerance by maintaining a live replica of a running program, and offline instrumentation that would be too costly to run in a production environ- ment. Different record-replay systems may offer different levels of replay faithfulness, the strongest level being deterministic replay which guarantees an identical reenactment of the original execution. Such a guarantee requires capturing all sources of nondeterminism during the recording phase. In the general case, such record-replay systems can dramatically hinder application performance, rendering them unpractical in certain application domains. Furthermore, various use cases are incompatible with strictly replaying the original execution. For example, in a primary-secondary database scenario, the secondary database would be unable to serve additional traffic while being replicated. No record-replay system fit all use cases. This dissertation shows how to make deterministic record-replay fast and efficient, how broadening replay semantics can enable powerful new use cases, and how choosing the right level of abstraction for record-replay can support distributed and heterogeneous database replication with little effort. We explore four record-replay systems with different semantics enabling different use cases. We first present SCRIBE, an OS-level deterministic record-replay mechanism that support multi-process applications on multi-core systems. One of the main challenge is to record the interaction of threads running on different CPU cores in an efficient manner. SCRIBE introduces two new lightweight OS mechanisms, rendezvous point and sync points, to efficiently record nondeterministic interactions such as related system calls, signals, and shared memory accesses. SCRIBE allows the capture and replication of hard to find bugs to facilitate debugging and serves as a solid foundation for our two following systems. We then present RACEPRO, a process race detection system to improve software correctness. Process races occur when multiple processes access shared operating system resources, such as files, without proper synchronization. Detecting process races is difficult due to the elusive nature of these bugs, and the het- erogeneity of frameworks involved in such bugs. RACEPRO is the first tool to detect such process races. RACEPRO records application executions in deployed systems, allowing offline race detection by analyzing the previously recorded log. RACEPRO then replays the application execution and forces the manifesta- tion of detected races to check their effect on the application. Upon failure, RACEPRO reports potentially harmful races to developers. Third, we present DORA, a mutable record-replay system which allows a recorded execution of an appli- cation to be replayed with a modified version of the application. Mutable record-replay provides a number of benefits for reproducing, diagnosing, and fixing software bugs. Given a recording and a modified appli- cation, finding a mutable replay is challenging, and undecidable in the general case. Despite the difficulty of the problem, we show a very simple but effective algorithm to search for suitable replays. Lastly, we present SYNAPSE, a heterogeneous database replication system designed for Web applica- tions. Web applications are increasingly built using a service-oriented architecture that integrates services powered by a variety of databases. Often, the same data, needed by multiple services, must be replicated across different databases and kept in sync. Unfortunately, these databases use vendor specific data repli- cation engines which are not compatible with each other. To solve this challenge, SYNAPSE operates at the application level to access a unified data representation through object relational mappers. Additionally, SYNAPSE leverages application semantics to replicate data with good consistency semantics using mecha- nisms similar to SCRIBE. Table of Contents List of Figures v List of Tables ix 1 Introduction 1 2 SCRIBE: Transparent Deterministic Record-Replay 11 2.1 Introduction . 11 2.2 Architecture Overview . 14 2.3 System Calls . 17 2.3.1 Record . 17 2.3.2 Replay . 18 2.3.3 Go Live . 19 2.4 Shared Memory . 21 2.5 Rendezvous Points . 23 2.6 Sync Points . 26 2.6.1 Signal Delivery . 28 2.6.2 Page Ownership Transfer . 29 2.6.3 Signature Record and Replay . 30 2.7 Performance Evaluation . 33 2.8 Related Work . 39 2.9 Summary . 42 i 3 RACEPRO: Detection of Process Races in Deployed Systems 43 3.1 Introduction . 43 3.2 Process Race Study . 46 3.2.1 Findings . 47 3.2.2 Process Race Examples . 49 3.3 Architecture Overview . 52 3.4 Recording Executions . 52 3.5 Detecting Process Races . 54 3.5.1 The Happens-Before Graph . 55 3.5.2 Modeling Effects of System Calls . 57 3.5.3 Race Detection Algorithms . 60 3.6 Validating Races . 64 3.6.1 Creating Execution Branches . 64 3.6.2 Replaying Execution Branches and Going Live . 67 3.6.3 Checking Execution Branches . 70 3.7 Experimental Results . 72 3.7.1 Bugs Found . 73 3.7.2 Bug Statistics . 74 3.7.3 Performance Overhead . 75 3.8 Differences from SCRIBE .................................... 77 3.9 Related Work . 79 3.10 Summary . 81 4 DORA: Transparent Mutable Record-Replay 82 4.1 Introduction . 82 4.2 Mutable Replay Concept . 85 4.3 Recorder . 87 4.4 Replayer . 89 4.4.1 Additions . 90 4.4.2 Deletions . 92 4.4.3 Going Live . 93 ii 4.5 Explorer . 93 4.6 Properties . 97 4.7 Limitations . 98 4.8 Evaluation . 99 4.8.1 Debugging and Diagnosis Techniques . 101 4.8.2 Patch Validation . 105 4.8.3 Release Upgrades . 106 4.8.4 Performance . 107 4.9 Related Work . 108 4.10 Comparison with SCRIBE and RACEPRO ........................... 110 4.11 Summary . 111 5 SYNAPSE: Distributed Record-Replay for Database Systems 112 5.1 Introduction . 112 5.2 Background . 114 5.3 SYNAPSE API.......................................... 116 5.3.1 SYNAPSE Abstractions . 117 5.3.2 Comparison with SCRIBE ............................... 120 5.3.3 SYNAPSE Delivery Semantics . 122 5.3.4 SYNAPSE Programming by Example . 124 5.4 SYNAPSE Architecture . 127 5.4.1 Model-Driven Replication . 130 5.4.2 Enforcing Delivery Semantics . 131 5.4.3 Live Schema Migrations . 136 5.4.4 Bootstrapping and Reliability . 136 5.4.5 Testing Framework . 137 5.4.6 Supporting New DBs and ORMs . 138 5.5 Applications . 138 5.5.1 SYNAPSE at Crowdtap . 138 5.5.2 Integrating Open-Source Apps with SYNAPSE .................... 140 5.5.3 Splitting a monolithic app into services with SYNAPSE ................ 141 iii 5.6 Evaluation . 142 5.6.1 Sample Executions . 142 5.6.2 Application Overheads (Q1) . 143 5.6.3 Scalability (Q2) . ..
Recommended publications
  • Cloudikoulaone
    PRÉSENTE CLOUDIKOULAONE Le succès est votre prochaine destination MIAMI SINGAPOUR PARIS AMSTERDAM FRANCFORT ___ CLOUDIKOULAONE est une solution de Cloud public, privé et hybride qui vous permet de déployer en 1 clic et en moins de 30 secondes des machines virtuelles à travers le monde sur des infrastructures SSD haute performance. www.ikoula.com [email protected] 01 84 01 02 50 NOM DE DOMAINE | HÉBERGEMENT WEB | SERVEUR VPS | SERVEUR DÉDIÉ | CLOUD PUBLIC | MESSAGERIE | STOCKAGE | CERTIFICATS SSL LINUX PRATIQUE est édité par Les Éditions Diamond 10, Place de la Cathédrale - 68000 Colmar - France Tél. : 03 67 10 00 20 | Fax : 03 67 10 00 21 édito E-mail : [email protected] Linux Pratique n°102 [email protected] Service commercial : [email protected] Sites : http://www.linux-pratique.com http://www.ed-diamond.com Directeur de publication : Arnaud Metzler Chef des rédactions : Denis Bodor Rédactrice en chef : Aline Hof Responsable service infographie : Kathrin Scali Responsable publicité : Tél. : 03 67 10 00 27 Service abonnement : Tél. : 03 67 10 00 20 Photographie et images : http://www.fotolia.com Impression : pva, Landau, Allemagne Distribution France : (uniquement pour les dépositaires de presse) MLP Réassort : Plate-forme de Saint-Barthélemy-d’Anjou Au moment où je rédige ces lignes, la température extérieure affiche une tren- Tél. : 02 41 27 53 12 taine de degrés et une furieuse envie de troquer ma place au bureau devant Plate-forme de Saint-Quentin-Fallavier mon PC contre un transat au bord de la mer (à remplacer évidemment par ce Tél. : 04 74 82 63 04 qui vous fait plaisir lorsque la canicule pointe le bout de son nez) commence à Service des ventes : Distri-médias : Tél.
    [Show full text]
  • Nethserver-201 Cahier-11
    Micronator NethServer-201 Cahier-11 NethServer & diaspora* Version: RC-001 / vendredi 18 septembre 2020 - 13:14 © 2020 RF-232 6447, avenue Jalobert, Montréal Qc H1M 1L1 Tous droits réservés RF-232 AVIS DE NON-RESPONSABILITÉ Ce document est uniquement destiné à informer. Les informations, ainsi que les contenus et fonctionnalités de ce do- cument sont fournis sans engagement et peuvent être modifiés à tout moment. RF-232 n'offre aucune garantie quant à l'actualité, la conformité, l'exhaustivité, la qualité et la durabilité des informations, contenus et fonctionnalités de ce do- cument. L'accès et l'utilisation de ce document se font sous la seule responsabilité du lecteur ou de l'utilisateur. RF-232 ne peut être tenu pour responsable de dommages de quelque nature que ce soit, y compris des dommages di- rects ou indirects, ainsi que des dommages consécutifs résultant de l'accès ou de l'utilisation de ce document ou de son contenu. Chaque internaute doit prendre toutes les mesures appropriées (mettre à jour régulièrement son logiciel antivirus, ne pas ouvrir des documents suspects de source douteuse ou non connue) de façon à protéger le contenu de son ordinateur de la contamination d'éventuels virus circulant sur la Toile. Toute reproduction interdite Vous reconnaissez et acceptez que tout le contenu de ce document, incluant mais sans s’y limiter, le texte et les images, sont protégés par le droit d’auteur, les marques de commerce, les marques de service, les brevets, les secrets industriels et les autres droits de propriété intellectuelle. Sauf autorisation expresse de RF-232, vous acceptez de ne pas vendre, dé- livrer une licence, louer, modifier, distribuer, copier, reproduire, transmettre, afficher publiquement, exécuter en public, publier, adapter, éditer ou créer d’oeuvres dérivées de ce document et de son contenu.
    [Show full text]
  • Reference Coupling: an Exploration of Inter-Project Technical Dependencies and Their Characteristics Within Large Software Ecosystems
    Reference Coupling: An Exploration of Inter-project Technical Dependencies and their Characteristics within Large Software Ecosystems Kelly Blincoea,∗, Francis Harrisonb, Navpreet Kaurb, Daniela Damianb aUniversity of Auckland, New Zealand bUniversity of Victoria, BC, Canada Abstract Context: Software projects often depend on other projects or are developed in tandem with other projects. Within such software ecosystems, knowledge of cross-project technical dependencies is important for 1) practitioners under- standing of the impact of their code change and coordination needs within the ecosystem and 2) researchers in exploring properties of software ecosystems based on these technical dependencies. However, identifying technical depen- dencies at the ecosystem level can be challenging. Objective: In this paper, we describe Reference Coupling, a new method that uses solely the information in developers online interactions to detect technical dependencies between projects. The method establishes dependencies through user-specified cross-references between projects. We then use the output of this method to explore the properties of large software ecosystems. Method: We validate our method on two datasets | one from open-source projects hosted on GitHub and one commercial dataset of IBM projects. We manually analyze the identified dependencies, categorize them, and compare them to dependencies specified by the development team. We examine the types of projects involved in the identified ecosystems, the structure of the iden- ∗Corresponding author Email addresses: [email protected] (Kelly Blincoe), [email protected] (Francis Harrison), [email protected] (Navpreet Kaur), [email protected] (Daniela Damian) Preprint submitted to Elsevier February 2, 2019 tified ecosystems, and how the ecosystems structure compares with the social behaviour of project contributors and owners.
    [Show full text]
  • A Deep Dive Into Docker Hub's Security Landscape
    A Deep Dive into Docker Hub’s Security Landscape A story of inheritance? Emilien Socchi Jonathan Luu Thesis submitted for the degree of Master in Network and System Administration 30 credits Department of Informatics Faculty of Mathematics and Natural Sciences UNIVERSITY OF OSLO Spring 2019 A Deep Dive into Docker Hub’s Security Landscape A story of inheritance? Emilien Socchi Jonathan Luu © 2019 Emilien Socchi, Jonathan Luu A Deep Dive into Docker Hub’s Security Landscape http://www.duo.uio.no/ Printed: Reprosentralen, University of Oslo Abstract Docker containers have become a popular virtualization technology for running multiple isolated application services on a single host using minimal resources. That popularity has led to the cre- ation of an online sharing platform known as Docker Hub, hosting images that Docker containers instantiate. In this thesis, a deep dive into Docker Hub’s security landscape is undertaken. First, a Python based software used to conduct experiments and collect metadata, parental and vul- nerability information about any type of image available on Docker Hub is developed. Secondly, our tool allows analyzing the most recent image found in each Certified, Verified and Official repository, as well the most recent image found in 500 random Community repositories among the most popular ones. Using our software named Docker imAge analyZER (DAZER), the fol- lowing discoveries were made: (1) the Certified and Verified repositories introduced by Docker Inc. in December 2018 do not improve the overall Docker Hub’s security
    [Show full text]
  • Pythia: Grammar-Based Fuzzing of REST Apis with Coverage-Guided Feedback and Learning-Based Mutations
    Pythia: Grammar-Based Fuzzing of REST APIs with Coverage-guided Feedback and Learning-based Mutations Vaggelis Atlidakis Roxana Geambasu Patrice Godefroid Marina Polishchuk Baishakhi Ray Columbia University Columbia University Microsoft Research Microsoft Research Columbia University Abstract—This paper introduces Pythia, the first fuzzer that input format, and may also specify what input parts are to augments grammar-based fuzzing with coverage-guided feedback be fuzzed and how [41], [47], [7], [9]. From such an input and a learning-based mutation strategy for stateful REST API grammar, a grammar-based fuzzer then generates many new fuzzing. Pythia uses a statistical model to learn common usage patterns of a target REST API from structurally valid seed inputs, each satisfying the constraints encoded by the grammar. inputs. It then generates learning-based mutations by injecting Such new inputs can reach deeper application states and find a small amount of noise deviating from common usage patterns bugs beyond syntactic lexers and semantic checkers. while still maintaining syntactic validity. Pythia’s mutation strat- Grammar-based fuzzing has recently been automated in the egy helps generate grammatically valid test cases and coverage- domain of REST APIs by RESTler [3]. Most production-scale guided feedback helps prioritize the test cases that are more likely to find bugs. We present experimental evaluation on three cloud services are programmatically accessed through REST production-scale, open-source cloud services showing that Pythia APIs that are documented using API specifications, such as outperforms prior approaches both in code coverage and new OpenAPI [52]. Given such a REST API specification, RESTler bugs found.
    [Show full text]
  • Structure and Feedback in Cloud Service API Fuzzing
    Structure and Feedback in Cloud Service API Fuzzing Evangelos Atlidakis Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy under the Executive Committee of the Graduate School of Arts and Sciences COLUMBIA UNIVERSITY 2021 © 2021 Evangelos Atlidakis All Rights Reserved ABSTRACT Stucture and Feedback in Cloud Service API Fuzzing Evangelos Atlidakis Over the last decade, we have witnessed an explosion in cloud services for hosting software applications (Software-as-a-Service), for building distributed services (Platform- as-a-Service), and for providing general computing infrastructure (Infrastructure-as-a- Service). Today, most cloud services are programmatically accessed through Applica- tion Programming Interfaces (APIs) that follow the REpresentational State Trans- fer (REST) software architectural style and cloud service developers use interface- description languages to describe and document their services. My thesis is that we can leverage the structured usage of cloud services through REST APIs and feedback obtained during interaction with such services in order to build systems that test cloud services in an automatic, efficient, and learning-based way through their APIs. In this dissertation, I introduce stateful REST API fuzzing and describe its imple- mentation in RESTler: the first stateful REST API fuzzing system. Stateful means that RESTler attempts to explore latent service states that are reachable only with sequences of multiple interdependent API requests. I then describe how stateful REST API fuzzing can be extended with active property checkers that test for violations of desirable REST API security properties. Finally, I introduce Pythia, a new fuzzing system that augments stateful REST API fuzzing with coverage-guided feedback and learning-based mutations.
    [Show full text]
  • Synapse: a Microservices Architecture for Heterogeneous-Database Web Applications
    Synapse: A Microservices Architecture for Heterogeneous-Database Web Applications Nicolas Viennot, Mathias Lecuyer,´ Jonathan Bell, Roxana Geambasu, and Jason Nieh fnviennot, mathias, jbell, roxana, [email protected] Columbia University Abstract added features, such as recommendations for what to search for, what to buy, or where to eat. Agility and ease of inte- The growing demand for data-driven features in today’s Web gration of new data types and data-driven services are key applications – such as targeting, recommendations, or pre- requirements in this emerging data-driven Web world. dictions – has transformed those applications into complex Unfortunately, creating and evolving data-driven Web ap- conglomerates of services operating on each others’ data plications is difficult due to the lack of a coherent data without a coherent, manageable architecture. We present architecture for Web applications. A typical Web applica- Synapse, an easy-to-use, strong-semantic system for large- tion starts out as some core application logic backed by a scale, data-driven Web service integration. Synapse lets in- single database (DB) backend. As data-driven features are dependent services cleanly share data with each other in added, the application’s architecture becomes increasingly an isolated and scalable way. The services run on top of complex. The user base grows to a point where performance their own databases, whose layouts and engines can be com- metrics drive developers to denormalize data. As features are pletely different, and incorporate read-only views of each removed and added, the DB schema becomes bloated, mak- others’ shared data. Synapse synchronizes these views in ing it increasingly difficult to manage and evolve.
    [Show full text]
  • Serendipity and Strategy in Rapid Innovation T
    Serendipity and strategy in rapid innovation T. M. A. Fink∗†, M. Reevesz, R. Palmaz and R. S. Farry yLondon Institute for Mathematical Sciences, Mayfair, London W1K 2XF, UK ∗Centre National de la Recherche Scientifique, Paris, France zBCG Henderson Institute, The Boston Consulting Group, New York, USA Innovation is to organizations what evolution is to organisms: it while, Bob chooses pieces such as axels, wheels, and small base is how organisations adapt to changes in the environment and plates that he noticed are common in more complex models, improve [1]. Governments, institutions and firms that innovate even though he is not able to use them straightaway to produce are more likely to prosper and stand the test of time; those new toys. We call this a far-sighted strategy. that fail to do so fall behind their competitors and succumb Who wins. At the end of the day, who will have innovated to market and environmental change [2, 3]. Yet despite steady the most? That is, who will have built the most new toys? We advances in our understanding of evolution, what drives inno- find that, in the beginning, Alice will lead the way, surging vation remains elusive [1, 4]. On the one hand, organizations ahead with her impatient strategy. But as the game progresses, invest heavily in systematic strategies to drive innovation [5{ fate will appear to shift. Bob's early moves will begin to look 8]. On the other, historical analysis and individual experience serendipitous when he is able to assemble a complex fire truck suggest that serendipity plays a significant role in the discovery from his choice of initially useless axels and wheels.
    [Show full text]