Management of Polyglot Persistent Environments for Low Latency Data Access in Big Data

Dutch title: Beheer van heterogene opslagtechnologieën voor snelle datatoegang in 'Big Data'-omgevingen

Thomas Vanhove

Supervisors: prof. dr. ir. F. De Turck, dr. G. Van Seghbroeck
Dissertation submitted to obtain the degree of Doctor of Engineering: Computer Science (Doctor in de ingenieurswetenschappen: computerwetenschappen)
Department of Information Technology (Vakgroep Informatietechnologie), chair: prof. dr. ir. B. Dhoedt
Faculty of Engineering and Architecture (Faculteit Ingenieurswetenschappen en Architectuur)
Academic year 2017-2018
ISBN 978-94-6355-084-0
NUR 988, 995
Legal deposit: D/2018/10.500/2

Members of the examination board:
  • prof. dr. ir. Filip De Turck (supervisor), Universiteit Gent - imec
  • dr. Gregory Van Seghbroeck (supervisor), Universiteit Gent - imec
  • prof. dr. ir. Daniël De Zutter (chair), Universiteit Gent
  • prof. dr. ir. Frank Gielen, Universiteit Gent
  • prof. dr. Guy De Tré, Universiteit Gent
  • prof. dr. ing. Erik Mannens, Universiteit Gent - imec
  • ir. Werner Van Leekwijck, Universiteit Antwerpen
  • dr. Anthony Liekens, IO Lab

Acknowledgments

You would think that five and a half years is a long time, but now that I am writing down the last words of this book, August 2012 does not seem so far away after all. I am grateful that I can close this chapter, but it also means saying goodbye to people I was fortunate to get to know over the past years. Before I do, I want to thank everyone who, to a lesser or greater extent, influenced this book or my life in general during my time at Ghent University. There are also a few people I wish to thank explicitly.

The first people I want to thank are my supervisors, Filip De Turck and Gregory Van Seghbroeck. Filip, thank you for giving me the opportunity to start a PhD. Thank you for being available time and again for a conversation or feedback, and for giving me the flexibility to bring this PhD to a successful conclusion. Gregory, thank you for shaping my PhD together with me and for being a sounding board for my ideas. I think it rarely happens that you also get to found and lead a spin-off together with one of your supervisors. Thank you for embarking on this adventure with me; I look forward to what the coming years with Qrama will bring.

Besides my supervisors, there are a few people who cannot go unmentioned, because without them this work and the past five years would have looked very different: Tim Wauters and Bruno Volckaert. Thank you for being there for me from day one, for proofreading papers, and for supporting me to this day. My vocabulary is too limited to express how grateful I am for your support. Bruno, your bizarre humor and the countless stories during the car rides to and from Leuven are legendary. I hope to hear many more of them.
And sorry about the San Francisco bike ride!

There are also a few icons who deserve a mention in their own right. Our Department of Information Technology and the IDLab Ghent research group are run by the admirable Bart Dhoedt and Piet Demeester. They are the driving force behind the constant progress and growth of our groups. As a PhD student you often have practical questions, especially in the first and last months of the PhD, but I would stake my reputation on it that Martine and Davinia always have the answer. I truly wonder what everyone would do if you were not on hand every day for anyone with a small or big question. Thank you! An internet technology research group is worth little without a technical team, so many thanks also go to Brecht, Joeri, Bert, Vicent and Simon of the A-team. Thanks guys!

Thank you to Bernadette and Joke of the financial administration and, last but not least, to Sabrina for always keeping our workplace clean and for the spontaneous chats in the corridors.

An acknowledgment is not complete without putting the spotlight on offices 2.21 and 200.012. Bram, Jerico, Jeroen, Kristof, Laurens, Leandro, Maxim, Merlijn, Niels, Olivier, Philip, Piet, Sander, Stefano, Steven B, Steven VC, Thijs, Tim and Wim ... thank you! Thank you for making it worthwhile to come to the office every day, good or bad. I will long look back with a smile on the many pleasant office moments and on the lunches at which Merlijn hurled one existential question or another in our direction. Like, is this even real life?

It is also nice to get to know your thesis advisors as colleagues. Thank you, Femke and Femke, for your guidance in that final master's year and for the many enjoyable board game moments afterwards.
Sometimes it was necessary to step away from the desk, and then it is nice to have colleagues with whom you can let off steam at the coffee machine. Thank you for that, Thomas and Wannes. Thank you also to all other colleagues for making this experience so unique!

Speaking of unique experiences, I have to dwell for a moment on the Tengu story. Over the course of my PhD, Tengu has taken on a life of its own, and that is all thanks to the Tengu team and its supporters. Gregory, Jeroen, Merlijn, Sander, Leandro, Tim, Bruno, Filip, thank you for making Tengu into what it has already become today. Since October 2016, Sebastien, Mathijs, Dixan and Michiel have also contributed through Qrama. Thank you for being part of this adventure and for the wonderful atmosphere you create every single day. It is a privilege to build Tengu together with you in the tech start-up world.

Sometimes it was easy to forget that there was still a life outside the PhD and Qrama, but fortunately there are friends you can call on to leave everything behind for a while. Bros (Nicolas, Thibaut, Lino and Ewout), thank you for the monthly hangouts, the yearly New Year's Eves and the many other moments we got to share, with or without our better halves (Teresa, Elke, Inne and Inge). The many unmaskings of wannabe dictators, fighting viruses on a global scale and UNO with points are also moments from the many board game nights (aka REDACTED) that I cherish thanks to friends such as Bart, Melissa, Nicolas and Teresa. My sincere thanks to all of you! I would like to add a special mention for Maxim, because it is extraordinary to be able to call a colleague a friend. I got to know you as the joker of the office, but behind that is someone you can truly count on and who would go through fire for his friends.
Thank you for the Tomorrowlands, for forwarding templates for just about everything, and for all the wonderful moments we got to experience.

Without my family, this story would probably have taken a different turn as well. Thank you, mom and dad, for encouraging me in everything I undertake and for being there when things do not turn out quite as I had expected. I look up to you enormously, and to everything you have achieved, and can only hope to one day reach a fraction of it. Nic, I do not think you realize what you have meant to me in these last years, and I do not think I will ever be able to put it into words. I am very grateful to have a brother and best friend like you. Thank you also to Filip, Sabine, Mémé, Michiel and An-Sofie for the many pleasant moments.

Finally, I want to thank a very special someone in my life: Stephanie. Bollie, I count myself lucky to have had you in my life for almost 10 years. This last year has taken up much of my time, but without complaining or nagging you were always there to support me in the difficult moments and to take so much stress away from me. Thank you for making my life better over these past 10 years; I cannot wait to see what the rest of our life has in store. I love you!

Ghent, January 2018
Thomas Vanhove

Table of Contents

Acknowledgments (Dankwoord)
Samenvatting (summary in Dutch)
Summary
1 Introduction
  1.1 The Impact of Big Data
  1.2 Problem Statement
  1.3 Research Contributions
  1.4 Dissertation Outline
  1.5 Publications
    1.5.1 A1: Journal publications indexed by the ISI Web of Science "Science Citation Index Expanded"
    1.5.2 P1: Proceedings included in the ISI Web of Science "Conference Proceedings Citation Index - Science"
    1.5.3 C1: Other publications in international conferences
  1.6 Spin-offs
  References
2 Managing the Synchronization in the Lambda Architecture for Optimized Big Data Analysis
  2.1 Introduction
  2.2 Lambda architecture: overview and challenges
  2.3 Synchronization

Recommended publications
  • NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence
    NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. Pramod J. Sadalage, Martin Fowler. Upper Saddle River, NJ • Boston • Indianapolis • San Francisco • New York • Toronto • Montreal • London • Munich • Paris • Madrid • Capetown • Sydney • Tokyo • Singapore • Mexico City.
    Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals. The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.
    The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact: U.S. Corporate and Government Sales (800) 382–3419 [email protected] For sales outside the United States please contact: International Sales [email protected] Visit us on the Web: informit.com/aw
    Library of Congress Cataloging-in-Publication Data: Sadalage, Pramod J. NoSQL distilled : a brief guide to the emerging world of polyglot persistence / Pramod J Sadalage, Martin Fowler. p. cm. Includes bibliographical references and index.
  • Polyglot Persistency: An Adaptive Data Layer for Digital Systems
    Polyglot persistency: an adaptive data layer for digital systems.
    Table of contents: Executive summary; Why NoSQL databases?; Persistence store families; Assessing NoSQL technologies; Polyglot persistency in Amdocs Digital; About Amdocs.
    Executive summary. The digital era has changed the way users consume software. They want to access it from everywhere, at any time, get all the capabilities they need in one place, and complete transactions with a single click of a button. To meet these high standards, enterprises embrace cloud principles and tools that will allow them to be geographically distributed, intensely transactional, and continuously available. Furthermore, they understand that their software must be architected to be agile, distributed and broken up into independent, loosely coupled services, also known as microservices.
    To properly serve such demand, the underlying data layer must adapt from what has been used for the past three decades into a new adaptive data infrastructure capable of handling the multiple types of data models that exist in the modern application design, and providing an optimal persistence method, also called polyglot persistence. A polyglot is "someone who speaks or writes several languages". Neal Ford first introduced the term 'polyglot persistence' in 2006 to express the idea that NoSQL applications would, by their nature, require differing persistence stores based on the type of data and access needs.
    This paper introduces the concept of polyglot persistence along with the guidelines that can be utilized to map various application use-cases to the appropriate persistence family. The document also provides examples of how Amdocs Digital is leveraging different persistence
  • Multi-Model Database Management Systems - A Look Forward
    Multi-Model Database Management Systems - a Look Forward. Zhen Hua Liu (1), Jiaheng Lu (2), Dieter Gawlick (1), Heli Helskyaho (2,3), Gregory Pogossiants (4), Zhe Wu (1). (1) Oracle Corporation, (2) University of Helsinki, (3) Miracle Finland Oy, (4) Soulmates.ai
    Abstract. The existence of the variety of data models and their associated data processing technologies makes data management extremely complex. In this paper, we envision a single Multi-Model DataBase Management System (MMDBMS) providing declarative access to a variety of data models. We briefly review the history of the evolution of DBMS technology to derive the requirements of MMDBMSs, and then we illustrate our ideas for building MMDBMSs that satisfy those requirements. Since relational algebra is not powerful enough to provide a mathematical foundation for MMDBMSs, we promote category theory, a generalization of set theory, as a new theoretical foundation. We also suggest a set of shared data infrastructure services among data models to support "Just-In-Time" multi-model data access autonomously.
    1 INTRODUCTION: why MMDBMS? Here is a short history of databases. Initially, database management systems supported the hierarchical and the network model (e.g., IBM's IMS and GE's IDS, respectively). These databases evolved very fast and developed the core infrastructures, such as journaling, transactions, locking, 2PC (group and fast commit), recovery, restart, fault tolerance, high performance, TP-monitors, messaging, main storage databases, and much, much more. We still use these concepts today. In the 1980s and 1990s, these databases were widely replaced by relational database management systems (RDBMS). The main argument is their solid theoretical foundation: a set-based relational data model and a declarative query language (SQL) based on abstract algebra over set processing.
  • NoXperanto: Crowdsourced Polyglot Persistence
    NoXperanto: Crowdsourced Polyglot Persistence. Antonio Maccioni, Orlando Cassano, Yongming Luo, Juan Castrejón, Genoveva Vargas-Solar.
    To cite this version: Antonio Maccioni, Orlando Cassano, Yongming Luo, Juan Castrejón, Genoveva Vargas-Solar. NoXperanto: Crowdsourced Polyglot Persistence. Polibits, 2014, 50, pp.43 - 48. 10.17562/PB-50-6. hal-01584798
    HAL Id: hal-01584798 https://hal.archives-ouvertes.fr/hal-01584798 Submitted on 11 Sep 2017
    HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
    Abstract: This paper proposes NOXPERANTO, a novel crowdsourcing approach to address querying over data collections managed by polyglot persistence settings. The main contribution of NOXPERANTO is the ability to solve complex queries involving different data stores by exploiting queries from expert users (i.e. a crowd of database administrators, data engineers, domain experts, etc.), assuming that these users can submit meaningful queries.
    [...] modern application development [2], [3]. They enable the integration of these data stores for managing big datasets in a scalable and loosely coupled way. This approach seems to be a pertinent strategy to deal with big datasets integration. Nonetheless, the combination of these heterogeneous data stores, flexible schemas and non-standard APIs, represent an
  • PDDM: a Database Design Method for Polyglot Persistence
    American Scientific Research Journal for Engineering, Technology, and Sciences (ASRJETS) ISSN (Print) 2313-4410, ISSN (Online) 2313-4402 © Global Society of Scientific Research and Researchers http://asrjetsjournal.org/
    PDDM: A Database Design Method for Polyglot Persistence. Cristofer Zdepski (a), Tarcizio Alexandre Bini (b, corresponding author), Simone Nasser Matos (c). (a, b, c) Federal University of Technology – Parana (UTFPR), Address Doutor Washington Subtil Chueire, 330 - Jardim Carvalho, Ponta Grossa, 84017-220, Parana, Brazil. (a) Email: [email protected] (b) Email: [email protected] (c) Email: [email protected]
    Abstract: Databases by Web 2.0 has revealed the limitations of the relational model related to scalability. This led to the emergence of NoSQL databases, with data storage models other than relational ones. These databases propose solutions to such limitations through horizontal scalability and partially compromise data consistency. The combination of multiple data models, called polyglot persistence, extends these solutions by providing resources for the implementation of complex systems that have components with distinct requirements that would not be possible by the use of only one data model in a satisfactory way. However, there are no consolidated methods for the NoSQL database design and neither methods for design systems that apply the polyglot persistence. This work proposes a database design method applied to systems that use polyglot persistence, combining different data models. This method can be applied to the relational model and aggregate-oriented NoSQL data models. The method defines a set of sub-steps based on the existing concepts of database design. The goal is to define a formal process to assist in defining the data models to be used and to transform the conceptual design into a logical design.
  • A Big Data Roadmap for the DB2 Professional
    A Big Data Roadmap For the DB2 Professional. Sponsored by: align http://www.datavail.com Craig S. Mullins, Mullins Consulting, Inc. http://www.MullinsConsulting.com © 2014 Mullins Consulting, Inc.
    Author: This presentation was prepared by Craig S. Mullins, President & Principal Consultant, Mullins Consulting, Inc., 15 Coventry Ct, Sugar Land, TX 77479. Tel: 281-494-6153 Fax: 281.491.0637 Skype: cs.mullins E-mail: [email protected] http://www.mullinsconsultinginc.com
    This document is protected under the copyright laws of the United States and other countries as an unpublished work. This document contains information that is proprietary and confidential to Mullins Consulting, Inc., which shall not be disclosed outside or duplicated, used, or disclosed in whole or in part for any purpose other than as approved by Mullins Consulting, Inc. Any use or disclosure in whole or in part of this information without the express written permission of Mullins Consulting, Inc. is prohibited. © 2014 Craig S. Mullins and Mullins Consulting, Inc. (Unpublished). All rights reserved.
    Agenda: Uncover the roadmap to Big Data… the terminology and technology used, use cases, and trends.
      • Gain a working knowledge and definition of Big Data (beyond the simple three V's definition)
      • Break down and understand the often confusing terminology within the realm of Big Data (e.g. polyglot persistence)
      • Examine the four predominant NoSQL database systems used in Big Data implementations (graph, key/value, column, and document)
      • Learn some of the major differences between Big Data/NoSQL implementations vis-a-vis traditional transaction processing
      • Discover the primary use cases for Big Data and NoSQL versus relational databases
  • Multi-Model Databases: A New Journey to Handle the Variety of Data
    Multi-model Databases: A New Journey to Handle the Variety of Data. JIAHENG LU, Department of Computer Science, University of Helsinki; IRENA HOLUBOVÁ, Department of Software Engineering, Charles University, Prague.
    The variety of data is one of the most challenging issues for the research and practice in data management systems. The data are naturally organized in different formats and models, including structured data, semi-structured data and unstructured data. In this survey, we introduce the area of multi-model DBMSs which build a single database platform to manage multi-model data. Even though multi-model databases are a newly emerging area, in recent years we have witnessed many database systems to embrace this category. We provide a general classification and multi-dimensional comparisons for the most popular multi-model databases. This comprehensive introduction on existing approaches and open problems, from the technique and application perspective, make this survey useful for motivating new multi-model database approaches, as well as serving as a technical reference for developing multi-model database applications.
    CCS Concepts: Information systems → Database design and models; Data model extensions; Semi-structured data; Database query processing; Query languages for non-relational engines; Extraction, transformation and loading; Object-relational mapping facilities.
    Additional Key Words and Phrases: Big Data management, multi-model databases, NoSQL database management systems.
    ACM Reference Format: Jiaheng Lu and Irena Holubová, 2019. Multi-model Databases: A New Journey to Handle the Variety of Data. ACM CSUR 0, 0, Article 0 (2019), 38 pages. DOI: http://dx.doi.org/10.1145/0000000.0000000
    1.
  • Polyglot Persistence: Handling Multiple Datastores
    Vol-2 Issue-6 2016 IJARIIE-ISSN(O)-2395-4396
    Polyglot Persistence: Handling Multiple Datastores. Ms. Namrata Rawal (1), Ms. Vatika Sharma (2). (1) PG Student, Network Security, GTU PG School, Ahmedabad, Gujarat, India. (2) I-verve Infoweb Company, Ahmedabad, Gujarat, India.
    ABSTRACT: Handling Big Data means handling huge databases. We have seen, however, that with the MapReduce framework we handle either relational or non-relational data at a time. In other words, multiple datastores cannot be handled at the same time. Polyglot persistence comes into place to handle this. This paper focuses on polyglot persistence, the term used to describe using different data storage technologies to handle multiple data stores at the same time.
    Keywords: Polyglot persistence, Hadoop, MR, oracle cloud, NoSQL
    1. INTRODUCTION. Handling Big Data means handling a huge amount of data without the restrictions of relational databases (data storage in the form of rows and columns); this gives rise to the concept of NoSQL databases, which handle non-relational data on an oracle cloud with no such restrictions.
    2. MOTIVATION. Honestly speaking, there is nothing new under the Sun. Integration has been a problem that we have faced for 40-plus years in IT. But there are many new ways to handle a problem. So the role is shaped from SOA (Service Oriented Architecture) on the cloud. Another perspective is that big data is also coming in, so the problem is how to integrate that data, whether from a historical perspective or from a data mining and analytical perspective. So a question arises: how do we deal with or handle huge volumes of data?
  • Polyglot Database Architectures = Polyglot Challenges
    Polyglot database architectures = polyglot challenges. Lena Wiese, Institute of Computer Science, University of Göttingen, Goldschmidtstraße 7, 37077 Göttingen, Germany. [email protected]
    Abstract. We categorize polyglot database architectures into three types (polyglot persistence, lambda architecture and multi-model databases) and discuss their advantages and disadvantages.
    1 Polyglot Database Architectures. When designing the data management layer for an application, several database requirements may be contradictory. For example, regarding access patterns, some data might be accessed by write-heavy workloads while others are accessed by read-heavy workloads. Regarding the data model, some data might be of a different structure than other data; for example, in an application processing both social network data and order or billing data, the former might usually be graph-structured while the latter might be semi-structured data. Regarding the access method, a web application might want to access data via a REST interface while another application might prefer data access with a query language. It is hence worthwhile to consider a database and storage architecture that includes all these requirements. We describe three options for polyglot database architectures in the following three sections.
    1.1 Polyglot Persistence. Instead of choosing just one single database management system to store the entire data, so-called polyglot persistence could be a viable option to satisfy all requirements towards a modern data management infrastructure. Polyglot persistence (a term coined in [4]) denotes that one can choose as many databases as needed so that all requirements are satisfied. Polyglot persistence can in particular be an optimal solution when backward-compatibility with a legacy application must be ensured.
  • Polyglot Persistence for Addressing Data Management on the Cloud
    POLYGLOT PERSISTENCE FOR ADDRESSING DATA MANAGEMENT ON THE CLOUD. Genoveva Vargas-Solar, CNRS, LIG-LAFMIA; Juan Carlos Castrejon, U. de Grenoble; Christine Collet, Grenoble INP; Rafael Lozano, ITESM-CCM. [email protected] http://vargas-solar.imag.fr
    DATA MANAGEMENT WITH RESOURCES CONSTRAINTS (storage support, RAM; architecture and resources aware algorithms and systems): efficiently manage and exploit data sets according to given specific storage, memory and computation resources.
    DATA MANAGEMENT WITHOUT RESOURCES CONSTRAINTS (elastic, cost-aware algorithms and systems): costly manage and exploit data sets according to unlimited storage, memory and computation resources.
    DEALING WITH HUGE AMOUNTS OF DATA [Figure: data volume scale, Peta (10^15), Exa (10^18), Zetta (10^21), Yota (10^24), set against storage supports (disk, RAID, cloud); a second version adds the data models (relational, graph, key value, columns) and the properties concurrency, consistency and atomicity]
    ROADMAP: Data all around in the era of the cloud; Polyglot persistence; Putting polyglot persistence in practice; Conclusion and outlook.
    POLYGLOT PERSISTENCE
      • Polyglot programming: applications should be written in a mix of languages to take advantage of the fact that different languages are suitable for tackling different problems
      • Polyglot persistence: any decent sized enterprise will have a variety of different data storage technologies for different kinds of data
      • a new strategic enterprise application should no longer be built assuming a relational persistence support
      • the relational option might be the right one - but you should seriously
  • An Approach for Modeling Polyglot Persistence
    An Approach for Modeling Polyglot Persistence. Cristofer Zdepski, Tarcizio Alexandre Bini and Simone Nasser Matos. Federal Technological University of Parana, Av Monteiro Lobato, s/n - Km 04, 84016210, Ponta Grossa, Parana, Brazil.
    Keywords: NoSQL, Database Design, Polyglot Persistence, Logical Level Design.
    Abstract: The emergence of NoSQL databases has greatly expanded database systems in both storage capacity and performance. To make use of these capabilities many systems have integrated these new data models into existing applications, making use of multiple databases at the same time, forming a concept called "Polyglot Persistence". However, the lack of a methodology capable of unifying the design of these integrated data models makes design a difficult task. To overcome this lack, this paper proposes a modeling methodology capable of unifying design patterns for these integrated databases, bringing an overview of the system, as well as a detailed view of each database design.
    1 INTRODUCTION. The popularization of Web 2.0 brought a dramatic increase in data generation. Nowadays databases easily exceed petabyte scale, with daily increase rates at terabyte scale. As an example we can take the Facebook database, which in 2014 had a size of 300 petabytes and a daily increase rate of 600 terabytes, according to (Wilfong and Vagata, 2014). [...] applications also brings out some weaknesses of the relational model by working with static schemas.
    Inspired by these limitations, NoSQL databases have emerged as the solution to the growing need of systems for information. As they expand over the Internet its data increases even faster. According to (Sadalage and Fowler, 2012) and (Cattell, 2011) the biggest goal of NoSQL databases is their