Object Migration in a Distributed, Heterogeneous SQL Database Network
Linköping University | Department of Computer and Information Science
Master’s thesis, 30 ECTS | Computer Engineering (Datateknik)
2018 | LIU-IDA/LITH-EX-A--18/008--SE

Swedish title: Datamigrering i ett heterogent nätverk av SQL-databaser

Joakim Ericsson

Supervisor: Tomas Szabo
Examiner: Olaf Hartig

Linköpings universitet
SE-581 83 Linköping
+46 13 28 10 00, www.liu.se

Upphovsrätt (Copyright)
This document is kept available on the Internet, or its future replacement, for 25 years from the date of publication, provided that no extraordinary circumstances arise. Access to the document implies permission for anyone to read, download, and print single copies for personal use, and to use it unchanged for non-commercial research and for teaching. A later transfer of the copyright cannot revoke this permission. All other use of the document requires the consent of the author. Technical and administrative solutions are in place to guarantee authenticity, security, and accessibility. The author's moral rights include the right to be named as the author to the extent required by good practice when the document is used as described above, as well as protection against the document being altered or presented in a form or context that is offensive to the author's literary or artistic reputation or distinctive character. For further information about Linköping University Electronic Press, see the publisher's home page http://www.ep.liu.se/.

Copyright
The publishers will keep this document online on the Internet, or its possible replacement, for a period of 25 years starting from the date of publication barring exceptional circumstances. The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes.
Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/.

© Joakim Ericsson

Abstract
There are many different database management systems (DBMSs) on the market today, each with its own strengths and weaknesses. What if all of these DBMSs could be used together in a heterogeneous network? The purpose of this thesis is to explore ways of connecting different DBMSs to one another, including suitable architectures, features, and the performance of such a network. The work is carried out in the context of Ericsson’s wireless communication network. This has not been done in this context before, and a large part of the thesis explores whether it is possible at all. The results show that it is not possible to find a solution that fulfills the requirements of such a network in this context.

Acknowledgments
Thanks to my family, who have encouraged and supported me through all my years of education. It has not always been easy, but looking back it was entirely worth it. This thesis marks a milestone in my formal education, but it is only the start of lifelong learning.

Contents
Abstract
Acknowledgments
Contents
List of Figures
List of Tables
List of Listings
List of Acronyms
1 Introduction
  1.1 Motivation
  1.2 Old System
  1.3 Aim
  1.4 Initial System Description
    1.4.1 Requirements
      1.4.1.1 Heterogeneous Databases
      1.4.1.2 SQL Database
      1.4.1.3 Data Migration
      1.4.1.4 Performance Requirements
  1.5 Research Questions
  1.6 Delimitations
2 Method
  2.1 Pre-study
  2.2 System Design and Implementation
  2.3 Evaluation
3 Theory
  3.1 Relational Database Management Systems and SQL
    3.1.1 SQL
    3.1.2 ACID
  3.2 OLAP versus OLTP
  3.3 Distributed Database Management Systems
  3.4 ODBC, JDBC, and OLE DB
  3.5 Microsoft Linked Servers
  3.6 Distributed Query Engines
    3.6.1 PrestoDB
    3.6.2 Apache Drill
  3.7 Multistore and Polystore Systems
    3.7.1 CloudMdsQL and CoherentPaaS
    3.7.2 BigDAWG
  3.8 Data Warehousing
4 Distributed Database
  4.1 Database Schema
  4.2 Distribution Schema
5 System Architecture
  5.1 Centralized Approach
  5.2 Distributed Approach
  5.3 Possible Architectures
    5.3.1 CellApp Architectures
      5.3.1.1 Architecture A
      5.3.1.2 Architecture B
      5.3.1.3 Architecture C
    5.3.2 Machine Learning Application
      5.3.2.1 Architecture D
      5.3.2.2 Architecture E
6 Results
  6.1 Performance Measurements
    6.1.1 Test Setup
      6.1.1.1 Mock Application
      6.1.1.2 Database Management Systems (DBMSs) Tested
      6.1.1.3 Test Parameters
    6.1.2 Result using ODBC
      6.1.2.1 ODBC Using a High-performance Machine
    6.1.3 Using JDBC
    6.1.4 Introducing a Network Delay
    6.1.5 Introducing a Distributed Query Engine
    6.1.6 Comparing Relational Database Management Systems (RDBMSs) to a non SQL (NoSQL) Alternative
  6.2 The Machine Learning Application
  6.3 Data Migration
  6.4 Final System
7 Discussion
  7.1 Literature Study
  7.2 Performance Measurements
  7.3 Uniform Structured Query Language (SQL) Syntax
  7.4 Data Migration
  7.5 Method
  7.6 Final System
  7.7 Future Work
  7.8 Societal and Ethical Considerations
8 Conclusion
Bibliography
A Protobuf Model

List of Figures
1.1 Data format
1.2 Initial sketch of the proposed system architecture
4.1 SQL database schema
4.2 States table distribution schema
4.3 Neighbors table distribution schema
5.1 Architecture A - Simple direct connection
5.2 Architecture B - Simple middleware
5.3 Architecture C - Simple
5.4 Architecture D - Middleware
5.5 Architecture E - Database access component
6.1 Test setup 1
6.2 MySQL Memory engine read and write operations
6.3 SQLite read and write operations
6.4 SQLite spike read and write operations
6.5 SQLite main memory read and write operations
6.6 Test setup 2
6.7 Test setup 3
6.8 Redis read and write operations
6.9 Proposed solution

List of Tables
6.1 Test parameters
6.2 Test machine
6.3 ODBC - Initial measurements of query read operations
6.4 Test machine 2
...