Tell: an Elastic Database System for Mixed Workloads

Total Page:16

File Type:pdf, Size:1020Kb

Tell: an Elastic Database System for Mixed Workloads DISS. ETH NO. 24147 Tell: An Elastic Database System for Mixed Workloads A thesis submitted to attain the degree of DOCTOR OF SCIENCES of ETH ZURICH (Dr. sc. ETH Zurich) presented by MARKUS PILMAN Master of Science ETH in Computer Science, ETH Zurich born on 18. September 1983 citizen of Amriswil, Thurgau accepted on the recommendation of Prof. Dr. Donald Kossmann (ETH Zürich), examiner Dr. Philip A. Bernstein (Microsoft Research), co-examiner Prof. Dr. Peter Boncz (Vrije Universiteit Amsterdam), co-examiner Prof. Dr. Timothy Roscoe (ETH Zürich), co-examiner 2017 Abstract It is an exciting time to do database research. Two movements dominated the field for the last few years: Big Data and NoSQL. Both movements aroseoutof necessity, as cloud computing imposes new requirements on database systems. Cloud computing makes scalability and elasticity more important than ever. A user does not want to pay for computing and storage resources she does not use, but she expects to be able to get these resources as soon as they are needed. Traditional database management systems, however, are not able to meet these requirements. Early NoSQL systems provided elasticity and scalability by massively simplifying the provided consistency guarantees and the underlying data model. Most notably key value stores can scale to thousands of machines and allow resizing their cluster at runtime. However, their simplicity is also their greatest weakness: The lack of transactions makes it difficult to reason about concurrency, and the simple data model makes them difficult to use. Key value stores push most of their complexity into the application. As a result, more recent solutions try not only to add transactions, but they also implement complex operations in a layer above the underlying NoSQL storage. This layering is often referred to as SQL over NoSQL. Big Data, on the other hand, is about the analytical processing of massive amounts of data in the cloud. The Hadoop ecosystem and, more recently, Spark are the most prominent systems that play in this field. These systems allow for massive parallelization of complex analytical queries and are elastic and scalable. They achieve this by implementing a shared data architecture which decouples computing resources from storage resources. However, these Big Data platforms still have a problem: bringing the data from the online NoSQL (or SQL) database into Hadoop is a complex issue. Traditionally, this is solved like traditional data warehousing which is a heavy weight solution. 4 A system like Spark also can not simply use a key value store for its underlying storage, because current key value stores perform poorly when they have to deliver high volumes of data. This thesis introduces Tell, a distributed shared-data database management system that fills the gap between NoSQL and Big Data. Tell implements theSQL over NoSQL design principle: it performs transaction processing on top of a high- performance key-value store. At the same time, its key value store is heavily optimized for scan queries, allowing data processing engines to fetch their data directly from the online database. Zusammenfassung Big Data und NoSQL waren und sind Bewegungen die die Datenbankwelt in den letzten Jahren stark geprägt haben. Beide Bewegungen entstanden aus der Notwendigkeit neuen Anforderungen, die vor allem Cloud Computing an moderne Datenbanksysteme stellt, zu genügen. Die wichtigste Eigenschaft die ein System innerhalb der Cloud haben muss ist Elastizität. Ein Kunde will nicht für Ressourcen bezahlen die er nicht nutzt. Der Anbieter muss jedoch in der Lage sein, zusätzliche Ressourcen zur Verfügung zu stellen sobald sich die Anforderungen der Kunden ändern. Elastizität und Skalierbarkeit wurden von NoSQL Systemen zu beginn erreicht indem die Konsistenzgarantien und das unterstützte Datenmodel stark vereinfacht wurden. Dank diesen Vereinfachungen können, zum Beispiel, Key-Value Stores auf tausende von Maschinen skalieren und zur Laufzeit Machinen aus dem Cluster entfernen oder zum Cluster hinzufügen. Leider ist jedoch diese Vereinfachung sowohl eine Stärke als auch eine Schwäche: Die fehlenden Transaktionen und das stark vereinfachte Datenmodel erschweren die Benutzung dieser Systeme, da die Komplexität von der Datenbank in die App- likation verlagert wird. Aus diesen Gründen versuchen neuere Systeme Transak- tionen und komplexe Operationen eine Schicht oberhalb eines NoSQL storages zu implementieren. Dieser Ansatz wird häufig als SQL over NoSQL bezeichnet. Big Data ist ein anderer Trend der letzten Jahre. Wir produzieren immer mehr Daten und wollen mit analytischen Abfragen Informationen aus diesen Daten gewinnen. Hadoop, und seit ein paar wenigen Jahren Spark, sind die bekan- ntesten Produkte die aus diesem Trend hervor gingen. Diese Systeme erlauben mit massiver Parallelisierung komplexe Anfragen über grosse Datenmengen zu beantworten. Beide Systeme skalieren und sind elastisch. Dies wird erreicht indem die Recheneinheiten vom Storage getrennt werden. 6 Leider haben diese Systeme einen grossen Nachteil: die Daten die verarbeitet werden sollen befinden sich meistens in einem Datenbanksystem, entweder einer NoSQL oder einer traditionellen SQL Datenbank. Die Daten müssen dann kopiert werden damit sie von Spark oder Hadoop gelesen werden können. Leider können Spark und Hadoop die Daten nicht direkt aus dem Online-System lesen, da Key- Value Stores nicht optimiert sind um grosse Datenmengen zu lesen. Diese Dissertation stellt Tell vor, eine verteilte shared-data Datenbanklösung die die Kluft zwischen NoSQL und Big Data füllt. Tell arbeitet nach dem SQL over NoSQL Prinzip: es implementiert Transaktionen eine Schicht oberhalb eines schnellen Key-Value Stores. Der Key-Value Store ist zusätzlich in der Lage grosse Datenmengen direkt an analytische Systeme zu liefern. Dies erlaubt sowohl Echtzeit-Transaktionsverarbeitung und analytische Anfragen auf den selben Daten. Acknowledgments Dicebat Bernardus Carnotensis nos esse quasi nanos gigantum umeris insidentes, ut possimus plura eis et remotiora videre, non utique proprii visus acumine, aut eminentia corporis, sed quia in altum subvehimur et extollimur magnitudine gigantea (John of Salisbury: Metalogicon 3,4,46-50) Working on this thesis was a tremendous learning experience and this work would not have been possible without those who worked in the field before me and with me. I want to express my deepest gratitude towards Donald, who advised me not only during my doctorate but virtually throughout all my years at ETH. Donald granted me a lot of freedom, but he was always there when he was needed. I also want to thank Mothy for his support. Discussions with him were always insightful. Mothy’s knowledge within and outside of his field is impressive. I also want to thank him for his life-advice to my wife and me. Throughout my doctoral studies, I had the chance to meet and work together with brilliant people. I want to thank all the people in the Systems Group for interesting discussions and new insights. While working on a particular problem, it was always helpful to have people who presented me the problem from a different angle. At this point, a special thank goes to my office mates, Simon, Lucas, and Jonas. Sharing the workplace with them was always pleasant. During my studies, I did an internship at Microsoft Research under the supervi- sion of Phil Bernstein. I want to thank Phil who was an awesome adviser. Even though I sometimes felt dumb while talking to him, he was always patient and never condescending. During my short time at Microsoft, I also worked a lot with Sudipto Das, from whom I also learned a lot and who was a very pleasant person to work with. Microsoft Research is filled with brilliant people, and I am glad Ihad the chance to talk to many of them. Tell was quite a big project and I would not have been able to do it alone. A first thanks goes to Simon Loesing, who worked closely together with me on Tell for nearly three years. I had the rare luck to supervise only excellent master students during my doctorate. Thomas Etter and Kevin Bocksrocker did an enormous amount of the implementation work - during their master thesis and while working for the Systems Group as Software Engineers. Thanks goes to Daniel Bucher who did some important work on the Bd-Tree and Florian Köhl who did excellent work in the field of temporal query processing (which was an important side-project during my doctoral studies but is not part of the work presented in this thesis). I also want to thank Lucas and Renato who worked on Tell and TellStore. Lucas’ work on AIM (Analytics in Motion) influenced my work a lot, and he helped during the implementation and the experimentation. Last but not least, my biggest gratitude goes to my wife, Sonja. Her support was essential for my work as well as for my private life. Contents 1 Introduction . 15 1.1 Background and Motivation 15 1.2 Problem statement 17 1.3 Contributions 17 1.4 Overview of the thesis 18 2 The Shared Data Architecture . 19 2.1 Introduction 19 2.1.1 Finding the Right Architecture for the Right Job . 19 2.1.2 Requirements . 19 2.1.3 Contributions . 20 2.2 Shared Nothing vs Shared Disk 20 2.2.1 Shared Nothing . 20 2.2.2 Shared Disk . 22 2.2.3 Copying data . 23 2.3 Shared Data 24 2.4 Potential Bottlenecks in a Shared Data System 26 2.4.1 Network bottleneck . 27 2.4.2 CPU . 28 2.4.3 Memory Controller . 29 I TellStore - A Storage for mixed Workloads 3 Designing a Storage for Mixed Workloads . 33 3.1 Introduction 33 3.2 Requirements 33 3.3 Why is this difficult? 37 4 TellStore . 39 4.1 Features 40 4.1.1 In-Memory . 41 4.1.2 Shared Scans . 41 4.1.3 Predictability . 41 4.1.4 Lock- And Latch-Freeness . 41 4.1.5 Consistency . 42 4.1.6 Durability .
Recommended publications
  • An Opinionated Guide to Technology Frontiers
    TECHNOLOGY RADARVOL. 21 An opinionated guide to technology frontiers thoughtworks.com/radar #TWTechRadar Rebecca Martin Fowler Bharani Erik Evan Parsons (CTO) (Chief Scientist) Subramaniam Dörnenburg Bottcher Fausto Hao Ian James Jonny CONTRIBUTORS de la Torre Xu Cartwright Lewis LeRoy The Technology Radar is prepared by the ThoughtWorks Technology Advisory Board — This edition of the ThoughtWorks Technology Radar is based on a meeting of the Technology Advisory Board in San Francisco in October 2019 Ketan Lakshminarasimhan Marco Mike Neal Padegaonkar Sudarshan Valtas Mason Ford Ni Rachel Scott Shangqi Zhamak Wang Laycock Shaw Liu Dehghani TECHNOLOGY RADAR | 2 © ThoughtWorks, Inc. All Rights Reserved. ABOUT RADAR AT THE RADAR A GLANCE ThoughtWorkers are passionate about ADOPT technology. We build it, research it, test it, 1 open source it, write about it, and constantly We feel strongly that the aim to improve it — for everyone. Our industry should be adopting mission is to champion software excellence these items. We use them and revolutionize IT. We create and share when appropriate on our the ThoughtWorks Technology Radar in projects. HOLD ASSESS support of that mission. The ThoughtWorks TRIAL Technology Advisory Board, a group of senior technology leaders at ThoughtWorks, 2 TRIAL ADOPT creates the Radar. They meet regularly to ADOPT Worth pursuing. It’s 108 discuss the global technology strategy for important to understand how 96 ThoughtWorks and the technology trends TRIAL to build up this capability. ASSESS 1 that significantly impact our industry. Enterprises can try this HOLD 2 technology on a project that The Radar captures the output of the 3 can handle the risk.
    [Show full text]
  • Table of Contents
    Table of Contents Introduction and Motivation Theoretical Foundations Distributed Programming Languages Distributed Operating Systems Distributed Communication Distributed Data Management Reliability Applications Conclusions Appendix Distributed Operating Systems Key issues Communication primitives Naming and protection Resource management Fault tolerance Services: file service, print service, process service, terminal service, file service, mail service, boot service, gateway service Distributed operating systems vs. network operating systems Commercial and research prototypes Wiselow, Galaxy, Amoeba, Clouds, and Mach Distributed File Systems A file system is a subsystem of an operating system whose purpose is to provide long-term storage. Main issues: Merge of file systems Protection Naming and name service Caching Writing policy Research prototypes: UNIX United, Coda, Andrew (AFS), Frangipani, Sprite, Plan 9, DCE/DFS, and XFS Commercial: Amazon S3, Google Cloud Storage, Microsoft Azure, SWIFT (OpenStack) Distributed Shared Memory A distributed shared memory is a shared memory abstraction what is implemented on a loosely coupled system. Distributed shared memory. Focus 24: Stumm and Zhou's Classification Central-server algorithm (nonmigrating and nonreplicated): central server (Client) Sends a data request to the central server. (Central server) Receives the request, performs data access and sends a response. (Client) Receives the response. Focus 24 (Cont’d) Migration algorithm (migrating and non- replicated): single-read/single-write (Client) If the needed data object is not local, determines the location and then sends a request. (Remote host) Receives the request and then sends the object. (Client) Receives the response and then accesses the data object (read and /or write). Focus 24 (Cont’d) Read-replication algorithm (migrating and replicated): multiple-read/single-write (Client) If the needed data object is not local, determines the location and sends a request.
    [Show full text]
  • Learning Key-Value Store Design
    Learning Key-Value Store Design Stratos Idreos, Niv Dayan, Wilson Qin, Mali Akmanalp, Sophie Hilgard, Andrew Ross, James Lennon, Varun Jain, Harshita Gupta, David Li, Zichen Zhu Harvard University ABSTRACT We introduce the concept of design continuums for the data Key-Value Stores layout of key-value stores. A design continuum unifies major Machine Databases K V K V … K V distinct data structure designs under the same model. The Table critical insight and potential long-term impact is that such unifying models 1) render what we consider up to now as Learning Data Structures fundamentally different data structures to be seen as \views" B-Tree Table of the very same overall design space, and 2) allow \seeing" Graph LSM new data structure designs with performance properties that Store Hash are not feasible by existing designs. The core intuition be- hind the construction of design continuums is that all data Performance structures arise from the very same set of fundamental de- Update sign principles, i.e., a small set of data layout design con- Data Trade-offs cepts out of which we can synthesize any design that exists Access Patterns in the literature as well as new ones. We show how to con- Hardware struct, evaluate, and expand, design continuums and we also Cloud costs present the first continuum that unifies major data structure Read Memory designs, i.e., B+tree, Btree, LSM-tree, and LSH-table. Figure 1: From performance trade-offs to data structures, The practical benefit of a design continuum is that it cre- key-value stores and rich applications.
    [Show full text]
  • Mergers in the Digital Economy
    2020/01 DP Axel Gautier and Joe Lamesch Mergers in the digital economy CORE Voie du Roman Pays 34, L1.03.01 B-1348 Louvain-la-Neuve Tel (32 10) 47 43 04 Email: [email protected] https://uclouvain.be/en/research-institutes/ lidam/core/discussion-papers.html Mergers in the Digital Economy∗ Axel Gautier y& Joe Lamesch z January 13, 2020 Abstract Over the period 2015-2017, the five giant technologically leading firms, Google, Amazon, Facebook, Amazon and Microsoft (GAFAM) acquired 175 companies, from small start-ups to billion dollar deals. By investigating this intense M&A, this paper ambitions a better understanding of the Big Five's strategies. To do so, we identify 6 different user groups gravitating around these multi-sided companies along with each company's most important market segments. We then track their mergers and acquisitions and match them with the segments. This exercise shows that these five firms use M&A activity mostly to strengthen their core market segments but rarely to expand their activities into new ones. Furthermore, most of the acquired products are shut down post acquisition, which suggests that GAFAM mainly acquire firm’s assets (functionality, technology, talent or IP) to integrate them in their ecosystem rather than the products and users themselves. For these tech giants, therefore, acquisition appears to be a substitute for in-house R&D. Finally, from our check for possible "killer acquisitions", it appears that just a single one in our sample could potentially be qualified as such. Keywords: Mergers, GAFAM, platform, digital markets, competition policy, killer acquisition JEL Codes: D43, K21, L40, L86, G34 ∗The authors would like to thank M.
    [Show full text]
  • CURRICULUM FRAMEWORK and SYLLABUS for FIVE YEAR INTEGRATED M.Sc (DATA SCIENCE) DEGREE PROGRAMME in CHOICE BASED CREDIT SYSTEM
    Five year Integrated M.Sc (Data Science) Degree Programme 2019-2020 CURRICULUM FRAMEWORK AND SYLLABUS FOR FIVE YEAR INTEGRATED M.Sc (DATA SCIENCE) DEGREE PROGRAMME IN CHOICE BASED CREDIT SYSTEM FOR THE STUDENTS ADMITTED FROM THE ACADEMIC YEAR 2019-2020 ONWARDS THIAGARAJAR COLLEGE OF ENGINEERING (A Government Aided Autonomous Institution affiliated to Anna University) MADURAI – 625 015, TAMILNADU Phone: 0452 – 2482240, 41 Fax: 0452 2483427 Web: www.tce.edu BOS Meeting Approved: 21-12-2018 Approved in 57th Academic Council meeting on 05-01-2019 Five year Integrated M.Sc (Data Science) Degree Programme 2019-2020 THIAGARAJAR COLLEGE OF ENGINEERING, MADURAI 625 015 DEPARTMENT OF APPLIED MATHEMATICS AND COMPUTATIONAL SCIENCE VISION “Academic and research excellence in Computational Science” MISSION As a Department, We are committed to Achieve academic excellence in Computational Science through innovative teaching and learning processes. Enable the students to be technically competent to solve the problems faced by the industry. Create a platform for pursuing inter-disciplinary research among the faculty and the students to create state of art research facilities. Promote quality and professional ethics among the students. Help the students to learn entrepreneurial skills. BOS Meeting Approved: 21-12-2018 Approved in 57th Academic Council meeting on 05-01-2019 Five year Integrated M.Sc (Data Science) Degree Programme 2019-2020 Programme Educational Objectives (PEO) Post graduates of M.Sc.(Data Science) program will be PEO1: Utilizing strong quantitative aptitude and domain knowledge to apply quantitative modeling and data analysis techniques to provide solutions to the real world business problems. PEO2: Applying research and entrepreneurial skills augmented with a rich set of communication, teamwork and leadership skills to excel in their profession.
    [Show full text]
  • Mac Os X Database Application
    Mac Os X Database Application Splashy Moses always degum his Politburo if Barr is unprovident or unswathing but. Corny Ashton enervating hinderingly or evite ergo when Weylin is faceless. Butcherly Maurits sometimes cognizes his alodiums hard and rebelled so submissively! New platform for the next section names of your data source you to It tedious really disappointing the heir that amount has been zero progress with this issue, could this time. Also many question are using databases on their Macs such as. Expert users may configure the ODBC. This application that you. Check the app from zero progress with a tabbed format of applications that this, transforming raw data! DBeaver Community Free Universal Database Tool. Provide the administrator username and password. You exhibit even export your bay as an html-table and print labels. Understanding at precious glance. Best Database Management Software for Mac 2021 Reviews. What does Texas gain for not selling electricity across state lines and therefore avoiding Federal Power and oversight? Take this open snaptube will get into chartable form at first mac os x application functioning of your experience with live without using app. Transform all kinds of files into optimized for various displays PDFs with water motion. However, four of the defining features of this crime is it it comes with native TLS encryption to ensure that important business success never gets into these wrong hands. Get stomp to legal one million creative assets on Envato Elements. Fuzzee allows to mac os application has been easier for free file to the appropriate odbc data synchronization tool.
    [Show full text]
  • Confidentiality and Perfomance for Cloud Databases
    Research Collection Doctoral Thesis Confidentiality and Performance for Cloud Databases Author(s): Braun-Löhrer, Lucas Victor Publication Date: 2017 Permanent Link: https://doi.org/10.3929/ethz-a-010866596 Rights / License: In Copyright - Non-Commercial Use Permitted This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use. ETH Library DISS. ETH NO. 24055 Confidentiality and Performance for Cloud Databases A thesis submitted to attain the degree of DOCTOR OF SCIENCES of ETH ZURICH (Dr. sc. ETH Zurich) presented by LUCAS VICTOR BRAUN-LÖHRER Master of Science ETH in Computer Science, ETH Zurich born on June 18, 1987 citizen of Aadorf, Thurgau accepted on the recommendation of Prof. Dr. Donald A. Kossmann, examiner Prof. Dr. Timothy Roscoe, co-examiner Prof. Dr. Renée J. Miller, co-examiner Prof. Dr. Thomas Neumann, co-examiner 2017 Typeset with LATEX © 2017 by Lucas Victor Braun-Löhrer Permission to print for personal and academic use, as well as permission for electronic reproduction and dissemination in unaltered and complete form are granted. All other rights reserved. Abstract The twenty-first century is often called the century of the information society. Theamount of collected data world-wide is growing exponentially and has easily reached the order of several million terabytes a day. As a result, everybody is talking about “big data” nowadays, not only in the research communities, but also very prominently in enterprises, politics and the press. Thanks to popular services, like Apple iCloud, Dropbox, Amazon Cloud Drive, Google Apps or Microsoft OneDrive, cloud storage has become a (nearly) ubiquitous and widely-used facility in which a huge portion of this big data is stored and processed.
    [Show full text]
  • Mergers in the Digital Economy
    2020/01 DP Axel Gautier and Joe Lamesch Mergers in the digital economy CORE Voie du Roman Pays 34, L1.03.01 B-1348 Louvain-la-Neuve Tel (32 10) 47 43 04 Email: [email protected] https://uclouvain.be/en/research-institutes/ lidam/core/discussion-papers.html Mergers in the Digital Economy∗ Axel Gautier y& Joe Lamesch z January 13, 2020 Abstract Over the period 2015-2017, the five giant technologically leading firms, Google, Amazon, Facebook, Amazon and Microsoft (GAFAM) acquired 175 companies, from small start-ups to billion dollar deals. By investigating this intense M&A, this paper ambitions a better understanding of the Big Five's strategies. To do so, we identify 6 different user groups gravitating around these multi-sided companies along with each company's most important market segments. We then track their mergers and acquisitions and match them with the segments. This exercise shows that these five firms use M&A activity mostly to strengthen their core market segments but rarely to expand their activities into new ones. Furthermore, most of the acquired products are shut down post acquisition, which suggests that GAFAM mainly acquire firm’s assets (functionality, technology, talent or IP) to integrate them in their ecosystem rather than the products and users themselves. For these tech giants, therefore, acquisition appears to be a substitute for in-house R&D. Finally, from our check for possible "killer acquisitions", it appears that just a single one in our sample could potentially be qualified as such. Keywords: Mergers, GAFAM, platform, digital markets, competition policy, killer acquisition JEL Codes: D43, K21, L40, L86, G34 ∗The authors would like to thank M.
    [Show full text]
  • Making Transactional Key-Value Stores Verifiably
    Cobra: Making Transactional Key-Value Stores Verifiably Serializable Cheng Tan, Changgeng Zhao, Shuai Mu?, and Michael Walfish NYU Department of Computer Science, Courant Institute ?Stony Brook University Abstract. Today’s cloud databases offer strong properties, of its operation. Meanwhile, any internal corruption—as could including serializability, sometimes called the gold standard happen from misconfiguration, operational error, compromise, database correctness property. But cloud databases are compli- or adversarial control at any layer of the execution stack—can cated black boxes, running in a different administrative domain cause a serializability violation. Beyond that, one need not from their clients. Thus, clients might like to know whether adopt a paranoid stance (“the cloud as malicious adversary”) the databases are meeting their contract. To that end, we intro- to acknowledge that it is difficult, as a technical matter, to pro- duce cobra; cobra applies to transactional key-value stores. vide serializability and geo-distribution and geo-replication It is the first system that combines (a) black-box checking, of and high performance under various failures [40, 78, 147]. (b) serializability, while (c) scaling to real-world online trans- Doing so usually involves a consensus protocol that inter- actional processing workloads. The core technical challenge acts with an atomic commit protocol [69, 96, 103]—a com- is that the underlying search problem is computationally ex- plex combination, and hence potentially bug-prone. Indeed, pensive. Cobra tames that problem by starting with a suitable today’s production systems have exhibited serializability vio- SMT solver. Cobra then introduces several new techniques, lations [1, 18, 19, 25, 26] (see also §6.1).
    [Show full text]
  • Mergers in the Digital Economy
    3136 REPRINT Axel Gautier, Lamesch Joé Mergers in the Digital Economy Information Economics and Policy CORE Voie du Roman Pays 34, L1.03.01 B-1348 Louvain-la-Neuve Tel (32 10) 47 43 04 Email:[email protected] https://uclouvain.be/en/research-institutes/ lidam/core/reprints.html Mergers in the Digital Economy∗ Axel Gautier y& Joe Lamesch z June 2, 2020 Abstract Over the period 2015-2017, the five giant technologically leading firms, Google, Amazon, Facebook, Apple and Microsoft (GAFAM) acquired 175 companies, from small startups to billion dollar deals. In this paper, we provide detailed information and statistics on the merger activity of the GAFAM and on the characteristics of the firms they acquire. One of the most intriguing features of these acquisitions is that, in the majority of cases, the product of the target is discontinued under its original brand name post acquisition and this is especially true for the youngest firms. There are three reasons to discontinue a product post acquisition: the product is not as successful as expected, the acquisition was not motivated by the product itself but by the target's assets or R&D effort, or by the elimination of a potential competitive threat. While our data does not enable us to screen between these explanations, the present analysis shows that most of the startups are killed in their infancy. This important phenomenon calls for tighter intervention by competition authorities in merger cases involving big techs. Keywords: Mergers, GAFAM, platform, digital markets, competition policy, killer acquisition JEL Codes: D43, K21, L40, L86, G34 ∗The authors would like to thank P.
    [Show full text]
  • COSC349—Cloud Computing Architecture David Eyers Learning Objectives
    Cloud Middleware and MBaaS COSC349—Cloud Computing Architecture David Eyers Learning objectives • Outline the usefulness of middleware for developing applications that use cloud computing • Contrast Apple CloudKit and Google Firebase in terms of relationships between provider, tenants and clients • Describe typical MBaaS services • MBaaS—Mobile Back-end as a Service (seen in XaaS lecture) • Sketch pricing approach of CloudKit versus Firebase COSC349 Lecture 20, 2020 2 Middleware for cloud computing • Middleware: OS-like functionality for apps beyond OS • e.g., OS accesses local files; middleware accesses cloud files • (assumption that OS isn’t cloud aware is less realistic over time…) • Middleware often eases translation between platforms • Typically use middleware via a software library • … contrast this with programming directly against cloud APIs • Focus on two (Mobile) Backend as a Service offerings: • Apple's CloudKit (2014)—iCloud launched 2011 • Google’s Firebase (~2014)—Firebase launched in 2011 COSC349 Lecture 20, 2020 3 MBaaS versus typical AWS services • Amazon has a relationship with the tenant • … but does not explicitly have any link with tenants’ clients • Contrast MBaaS from Apple, Google, etc.: • company has relationship with the tenant (e.g., iOS app dev.) • company also has a relationship with the client • Differences between Apple, Google, etc. • Apple has a financial link by selling kit to tenants’ clients • Google interests in cross-service linkage—across search, etc. • (Amazon doesn’t seem to leverage Amazon+AWS
    [Show full text]
  • Paper Title (Use Style: Paper Title)
    International Association of Scientific Innovation and Research (IASIR) ISSN (Print): 2279-0047 (An Association Unifying the Sciences, Engineering, and Applied Research) ISSN (Online): 2279-0055 International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS) www.iasir.net BIG DATA ANALYTICS ON THE CLOUD Dr. Tamanna Siddiqui, Mohammad Al Kadri Department Of Computer Science Aligarh Muslim University, Aligarh, INDIA Abstract: The late advancements in innovations and construction modeling made the big data –which is information examination procedure – one of the sultriest points in our present studies. For that big data applications get to be more the most required programming in our life, in light of the fact that all our business, social, instructive and medicinal studies consider this idea as nightmare specially when we discuss storing and accessing client information in business clouds, or mining of social information. Big data needs a gigantic hardware equipment and processing assets, this needs made the little and medium estimated organizations be perplexed from these expenses. Anyhow, cloud computing offers all these needs for big data implementation to those organizations. It gives services to internet by using assets of the figuring framework to give diverse services of the web. It makes customers and organizations have the capacity to control their applications without establishment and retrieve their own data at any computer through internet access. The significant concerns according to cloud computing are security and loss of control. This paper presents an overview of big data concept, its famous interactive analysis tools (Apache Hive, BigQuery, CitusDB, ..), big data software (MATLAB, R, SAS, STATA, … ) and their comparisons on the basis of specific parameters .This paper also focuses on Cloud computing services and its four models (PaaS, SaaS, IaaS, HaaS).
    [Show full text]