Performance Aspect on Databases and Virtualized Real-Time Applications

Performance Aspect on Databases and Virtualized Real-time Applications Christine Niyizamwiyitira Blekinge Institute of Technology doctoral dissertation series No 2018:02 Performance Aspects of Database and Virtualized Real-time Applications Christine Niyizamwiyitira Doctoral Dissertation in Computer Systems Engineering Department of Computer Science and Engineering Blekinge Institute of Technology SWEDEN 2018 Christine Niyizamwiyitira Department of Computer Science and Engineering Publisher: Blekinge Institute of Technology, SE-371 79 Karlskrona, Sweden ISBN: 978-91-7295-348-2 ISSN: 1653-2090 urn:nbn:se:bth-15758 Abstract Context: High computing system performance depends on the interaction between software and hardware layers in modern computer systems. Two strong trends that effect different layers in computer systems are that single processors are now more or less completely replaced by multiprocessors, which are often organized into clusters, and virtualization of resources. The performance evaluation of different software on such physical and virtualized resources, is the focus of this thesis. Objectives: The objectives of this thesis are to investigate the performance evaluation of SQL and No SQL database management systems, namely Cassandra, CouchDB, MongoDB, PostgreSQL, and RethinkDB; and soft real-time application namely, voice-driven web. Scheduling algorithms for resource allocation for hard real-time applications on virtual processor are also investigated. Methods: Experiment is used to measure the performance of SQL and No SQL management systems on cluster. It is also used to develop a prototype and predicts processor performance of voice-driven web on multiprocessors. Theoretical methods are used to model and design algorithms to schedule real-time applications on the virtual processor machine. Simulation is used to quantify the performance implications of certain parameter values in our theoretical results and to compare expected performance with theoretical bounds in our schedulability tests. Results: The performance of SQL and NoSQL database management systems namely, Cassandra, CouchDB, MongoDB, PostgreSQL, and RethinkDB is evaluated in terms of writing and reading throughput and latencies in cluster computing. For reading throughput, all database systems are horizontally scalable as the cluster’s nodes number increases, however, only Cassandra and couchDB exhibit scalability for data writing. The overall evaluation shows that Cassandra has the most writing scalable throughput as the number of nodes increases with a relative low latency, whereas PostgreSQL has the lowest writing latency, and MongoDB has the lowest reading latency. i The architectures’ tradeoffs of voice-driven web show that the voice engine should be installed on the server instead of being on the mobile device, and performance evaluations show that speech engine scales with respect to the number of cores in the multiprocessor with and without hyperthreading. The thesis presents scheduling techniques for real-time applications that runs in virtual machines which are time sharing the processor. Each virtual machine’s period and execution time that allow real-time applications to meet their deadlines can be defined using these techniques. Simulation results show the impact of the length of different VM periods with respect to overhead. The tradeoffs between resources consumption and period length are also given. Furthermore, a utilization based test for scheduling real-time application on virtual multiprocessor is presented. This test determines if a task set is schedulable or not. If the task set is schedulable the algorithm provides the priority for each task. This algorithm avoids Dhall’s effect, which may cause task sets with even very low utilization to miss deadlines. Conclusions: The thesis presented the performance evaluation of reading and writing throughput and latencies for SLQ and NoSQL management systems on cluster. The thesis quantifies the tradeoffs of voice-driven web architectures and the performance scalability of the speech engine with respect to number of cores of the multiprocessor. Furthermore, this thesis proposes scheduling algorithms for real-time application with hard deadline on virtual processors, either as a single core processor or as a multicore processor. Keywords: SLQ and NoSQL database, Bigdata management systems, Structured and non-structured Database Evaluation, Voice-driven web, Multicore performance prediction, Hard real-time Scheduling, Virtual Multiprocessor Scheduling. ii Acknowledgements I would like to thank my main supervisor, Professor Lars Lundberg. His guidance, generous and continuous support have been a great source of encouragement through the course towards the Ph.D. degree. I am particularly grateful for his patience and humility in dealing with me. I would like also to thank my supervisor Dr. Mikael Svahnberg. I am particularly grateful for his inputs during the supervisory meetings, which have brought new perspectives and understanding during my PhD study. I would like to thank my senior reviewer Professor Håkan Lennerstad for his very good mathematical explanation as a part of my research. My time at Blekinge Tekniska Högskola (BTH) was made enjoyable in large part due to colleagues, friends and BTH staff that have become a part of my life. I would like to thank Sogand Shirinbab, Siva Krishna Dasari and Dr. Thi My Chinh Chu for many constructive discussions. Also, I would like to extend my acknowledgement to Eva-Lotta Runesson, Monica Nilsson, and the entire DIDD department for all the practical help they provided me. I would like to thank Dr. Louis Sibomana, Dr.Charles Kabiri and Dr. Said Ngoga Rutabayiro for supporting me socially and career wise both in Sweden and Rwanda. I am grateful to Anita Kayihura for enjoyable life in Sweden. I would like to thank Dr. Felix K. Akorli, for introducing me about BTH. I gratefully acknowledge the funding source that made my Ph.D. work possible, the University of Rwanda (UR) in partnership with the Swedish International Development Agency (SIDA). I thank the project research “Scalable resource-efficient systems for big data analytics” funded by the Knowledge Foundation (grant: 20140032) at BTH for giving me the opportunity to work on industrial problem through Telenor Sverige. Last but not least, I would like to express my deepest gratitude to my family, who have always been encouraging me throughout my studies. Karlskrona, March 2018 Christine Niyizamwiyitira iii Dedicated to Mucyo Faustin and Mucyo Isimbi Christa iv List of publications The thesis is based on the following publications: 1. C. Niyizamwiyitira and L. Lundberg, “Performance Evaluation of SQL and NoSQL Database Management Systems in a Cluster,” International Journal of Database Management Systems (IJDMS) Vol.9, No.6, December 2017 2. C. Niyizamwiyitira, L. Lundberg, and M. Svahnberg, “Evaluation of Voice-driven Web Application Architecture,” in Signal Image Technology and Internet Based Systems (SITIS), 2012 Eighth International Conference on, 2012, pp. 555-562 3. C. Niyizamwiyitira and L. Lundberg, “Performance evaluation and prediction of open source speech engine on multicore processors,” in Proceedings of the Fifth International Conference on Management of Emergent Digital EcoSystems, 2013, pp. 345-352 4. C. Niyizamwiyitira and L. Lundberg, “Real-time scheduling of multiple virtual machines,” International journal of Computers and their applications (IJCA) , Vol. 24, No. 3, pp.91-109, Sept. 2017 5. C. Niyizamwiyitira and L. Lundberg, “Period assignment in real-time scheduling of multiple virtual machines,” in Proceedings of the 7th International Conference on Management of computational and collective intElligence in Digital EcoSystems, 2015, pp. 180-187 6. C. Niyizamwiyitira, L. Lundberg, and H. Lennerstad, “A Utilization- based Schedulability Test of Real-time Systems Running on a Multiprocessor Virtual Machine,” Submitted to the journal of computer The following publications are associated with, but not included in this thesis: 1. C. Niyizamwiyitira, L. Lundberg, and H. Lennerstad, “Utilization- Based Schedulability Test of Real-Time Systems on Virtual Multiprocessors,” in Parallel Processing Workshops (ICPPW), 2015 v 44th International Conference on, pp. 267-276, Oct. 2015 2. C. Niyizamwiyitira and L. Lundberg, “Performance Evaluation of Trajectory Queries on Multiprocessor and Cluster,” Computer Science & Information Technology (CS & IT), v.6, pp. 145-163, May 2016 vi Table of Contents Abstract ...................................................................................................... i Acknowledgements ..................................................................................iii List of publications ................................................................................... v Table of Contents ................................................................................... vii Chapter One .............................................................................................. 1 1.1 Motivation ................................................................................................... 1 1.2 Historical Background ................................................................................. 2 1.2.1 Cluster Computing ................................................................................... 4 1.2.2 Virtualization ........................................................................................... 4 1.2.3 Database

Performance Aspect on Databases and Virtualized Real-Time Applications

Bigchaindb Server Documentation Release 1.2.0

Package 'Rethinker'

Linux on Z Platform ISV Strategy Summary

Dmitry Omelechko

Rethinkdb on Wandboard

Performance Evaluation of Sql and Nosql Database Management Systems in a Cluster

Vorlesung Web Services Und Workflows

Back-End Developer with Demonstrated Experience in Developing Scalable Web Applications Using Best Practices and Methodologies

Slides: Slides.Baqend.Com

IBM Linuxone and Linux on Z Systems SOA

Empirical Investigation of Trends in Nosql-Based Big-Data Solutions in the Last Decade

Rethinkdb As a High Performance Nosql Engine for Real-Time Applications