Coordination Avoidance in Distributed Databases


By Peter David Bailis

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley

Committee in charge:
Professor Joseph M. Hellerstein, Co-Chair
Professor Ion Stoica, Co-Chair
Professor Ali Ghodsi
Professor Tapan Parikh

Fall 2015

Copyright 2015 by Peter David Bailis

Abstract

The rise of Internet-scale geo-replicated services has led to upheaval in the design of modern data management systems. Given the availability, latency, and throughput penalties associated with classic mechanisms such as serializable transactions, a broad class of systems (e.g., “NoSQL”) has sought weaker alternatives that reduce the use of expensive coordination during system operation, often at the cost of application integrity. When can we safely forego the cost of this expensive coordination, and when must we pay the price?

In this thesis, we investigate the potential for coordination avoidance (the use of as little coordination as possible while ensuring application integrity) in several modern data-intensive domains. We demonstrate how to leverage the semantic requirements of applications in data serving, transaction processing, and web services to enable more efficient distributed algorithms and system designs. The resulting prototype systems demonstrate regular order-of-magnitude speedups compared to their traditional, coordinated counterparts on a variety of tasks, including referential integrity and index maintenance, transaction execution under common isolation models, and database constraint enforcement. A range of open source applications and systems exhibit similar results.

To my family

Contents

List of Figures
List of Tables
Acknowledgments

1 Introduction
  1.1 Coordination Avoidance
  1.2 Primary Contributions
  1.3 Outline and Previously Published Work

2 Coordination: Concepts and Costs
  2.1 Coordination and Correctness in Database Systems
  2.2 Understanding the Costs of Coordination
    2.2.1 Latency
    2.2.2 Throughput and Scalability
    2.2.3 Availability and Failures
    2.2.4 Summary: Costs
    2.2.5 Outcome: NoSQL, Historical Context, Safety and Liveness
  2.3 System Model

3 Invariant Confluence and Coordination
  3.1 Invariant Confluence: Criteria Defined
  3.2 Invariant Confluence and Coordination-Free Execution
  3.3 Discussion and Limitations
  3.4 Summary

4 Coordination Avoidance and Weak Isolation
  4.1 ACID in the Wild
  4.2 Invariant Confluence Analysis: Isolation Levels
    4.2.1 Invariant Confluent Isolation Guarantees
    4.2.2 Sticky Availability
    4.2.3 Non-Invariant Confluent Semantics
    4.2.4 Summary
  4.3 Implications: Existing Algorithms and Empirical Impact
    4.3.1 Existing Algorithms
    4.3.2 Empirical Impact: Isolation Guarantees
  4.4 Isolation Models
  4.5 Summary

5 Coordination Avoidance and RAMP Transactions
  5.1 Overview
  5.2 Read Atomic Isolation in the Wild
  5.3 Semantics and System Model
    5.3.1 RA Isolation: Formal Specification
    5.3.2 RA Implications and Limitations
    5.3.3 RA Compared to Other Isolation Models
    5.3.4 RA and Serializability
    5.3.5 System Model and Scalability
  5.4 RAMP Transaction Algorithms
    5.4.1 RAMP-Fast
    5.4.2 RAMP-Small: Trading Metadata for RTTs
    5.4.3 RAMP-Hybrid: An Intermediate Solution
    5.4.4 Summary and Additional Details
    5.4.5 Distribution and Fault Tolerance
    5.4.6 Additional Semantics
    5.4.7 Further Optimizations
  5.5 Experimental Evaluation
    5.5.1 Experimental Setup
    5.5.2 Experimental Results: Comparison
    5.5.3 Experimental Results: CTP Overhead
    5.5.4 Experimental Results: Scalability
  5.6 Applying and Modifying the RAMP Protocols
    5.6.1 Multi-Datacenter RAMP
    5.6.2 Quorum-Replicated RAMP Operation
    5.6.3 RAMP, Transitive Dependencies, and Causal Consistency
  5.7 RSIW Proof
  5.8 RAMP Correctness and Independence
  5.9 Discussion
  5.10 Summary

6 Coordination Avoidance for Database Constraints
  6.1 Invariant Confluence of SQL Constraints
    6.1.1 Invariant Confluence for SQL Relations
    6.1.2 Invariant Confluence for SQL Data Types
    6.1.3 SQL Discussion and Limitations
  6.2 More Formal Invariant Confluence Analysis of SQL Constraints
  6.3 Empirical Impact: SQL-Based Constraints
    6.3.1 TPC-C Invariants and Execution
    6.3.2 Evaluating TPC-C New-Order
    6.3.3 Analyzing Additional Applications
  6.4 Constraints from Open Source Applications
    6.4.1 Background and Context
    6.4.2 Feral Mechanisms in Rails
    6.4.3 Rails Invariant Confluence Analysis
  6.5 Quantifying Integrity Violations in Rails
  6.6 Other Frameworks
  6.7 Implications for Databases
    6.7.1 Summary: Database Shortcomings Today
    6.7.2 Domesticating Feral Mechanisms
  6.8 Detailed Validation Behavior, Experimental Workload
    6.8.1 Uniqueness Validation Behavior
    6.8.2 Association Validation Behavior
    6.8.3 Uniqueness Validation Schema
    6.8.4 Uniqueness Stress Test
    6.8.5 Uniqueness Workload Test
    6.8.6 Association Validation Schema
    6.8.7 Association Stress Test
    6.8.8 Association Workload Test
  6.9 Summary

7 Related Work

8 Conclusions
  8.1 Design Patterns for Coordination Avoidance
  8.2 Limitations
  8.3 Future Work
    8.3.1 Automating Coordination Avoidance
    8.3.2 Comprehending Weak Isolation
    8.3.3 Emerging Application Patterns
    8.3.4 Statistical Coordination Avoidance
  8.4 Closing Thoughts

Bibliography

List of Figures

1.1 An illustration of a distributed, replicated database and its relation to application servers and end users. In modern distributed databases, data is stored on several servers that may be located in geographically distant regions (e.g., Virginia and Oregon, or even different continents) and may be accessed by multiple database clients (e.g., application servers, analytics frameworks, database administrators) simultaneously. The key challenge that we investigate in this thesis is how to minimize the amount of synchronous communication across databases while providing “always on,” scalable, and high performance access to each replica.

1.2 In this thesis, we develop the principle of Invariant Confluence, a necessary and sufficient condition for safe, convergent, coordination-free execution, and apply it to a range of application domains at increasing levels of abstraction: database isolation, database constraints, and safety properties from modern database-backed applications. Each guarantee that is invariant confluent is guaranteed to have at least one coordination-free implementation; we investigate the design of several implementations in this work, which operate at the database infrastructure tier (Figure 1.1).

2.1 CDF of round-trip times for slowest inter- and intra-availability zone links compared to cross-region links.

2.2 Microbenchmark performance of coordinated and coordination-free execution of transactions of varying