Performance Comparison of the Most Popular Relational and Non-Relational Database Management Systems
Master of Science in Software Engineering
February 2018

Performance comparison of the most popular relational and non-relational database management systems

Kamil Kolonko

Faculty of Computing
Blekinge Institute of Technology
SE-371 79 Karlskrona, Sweden

This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering. The thesis is equivalent to 20 weeks of full-time studies.

Contact Information:
Author(s): Kamil Kolonko
E-mail: [email protected], [email protected]
External advisor: IF APPLICABLE
University advisor: Javier González-Huerta, Department of Software Engineering

Faculty of Computing
Blekinge Institute of Technology
SE-371 79 Karlskrona, Sweden
Internet: www.bth.se
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

ABSTRACT

Context. A database is an essential part of any software product. With an emphasis on application performance, database efficiency becomes one of the key factors to analyze in the process of technology selection. With the development of new data models and storage technologies, the need for a comparison between relational and non-relational database engines is especially evident in the software engineering domain.

Objectives. This thesis investigates current knowledge on database performance measurement methods and the popularity of relational and non-relational database engines, defines characteristics of databases, approximates their average values, and compares the performance of two selected database engines.

Methods. This study uses a number of research methods, including a literature review, a review of Internet sources, and an experiment. Literature datasets used in the research incorporate over 100 sources, including IEEE Xplore and the ACM Digital Library. The YCSB benchmark was used in an experiment as a direct method of comparing the performance of OracleDB and MongoDB.

Results.
A list of database performance measurement methods has been defined as a result of the literature review. The two most popular database management engines, one relational and one non-relational, have been identified. A set of database characteristics and a database performance comparison methodology have been identified. The performance of the two selected database engines has been measured and compared.

Conclusions. The performance comparison between the two selected database engines indicated superior results for MongoDB under the experimental conditions. This database proved to be more efficient in terms of average operation latency and throughput for each of the measured workloads. OracleDB, however, presented stable results in each category, leaving the final choice of database to the specifics of a software engineering project. The activities required to define a database performance comparison methodology proved challenging and require further study.

Keywords: database, performance comparison, MongoDB, OracleDB

CONTENTS

ABSTRACT
CONTENTS
1 INTRODUCTION
  1.1 DEFINITIONS
  1.2 HISTORY
  1.3 PERFORMANCE
  1.4 TAXONOMY
  1.5 RELATIONAL VS NON-RELATIONAL
2 RELATED WORK
3 METHODOLOGY
4 DATABASE PERFORMANCE MEASUREMENT METHODS
  4.1 RESEARCH METHODOLOGY
  4.2 RESEARCH QUESTION
  4.3 THE NEED FOR A REVIEW
  4.4 REVIEW
    4.4.1 Search methodology
    4.4.2 Selection of articles
    4.4.3 Results
  4.5 CONCLUSIONS
    4.5.1 Method choice
5 CHOICE OF COMPARED DBMSS
  5.1 CHOICE CRITERIA
  5.2 INITIAL STUDY
  5.3 RESEARCH METHODOLOGY
  5.4 RESEARCH QUESTION
  5.5 RESEARCH
    5.5.1 Search procedure
    5.5.2 Result
  5.6 CONCLUSIONS
6 DATABASE CHARACTERISTICS DEFINITION
  6.1 DATA SOURCES
  6.2 DATA ACQUISITION
  6.3 MEASURED DATABASE
7 PERFORMANCE COMPARISON
  7.1 SCHEMA
    7.1.1 Schema translation
  7.2 MEASUREMENT METHOD
  7.3 WORKLOAD DEFINITION
  7.4 HARDWARE OUTLINE
8 RESULTS
  8.1 MONGODB
  8.2 ORACLE
9 ANALYSIS AND DISCUSSION
  9.1 WORKLOAD A
  9.2 WORKLOAD B
  9.3 WORKLOAD C
  9.4 WORKLOAD D
  9.5 WORKLOAD E
  9.6 WORKLOAD F