Performance Evaluation of Relational Embedded Databases: an Empirical
Performance evaluation of relational embedded databases: an empirical study
Evaluación del rendimiento de bases de datos embebida: un estudio empírico

Authors: Hassan B. Hassan 1, Qusay I. Sarhan 2

SCIENTIFIC RESEARCH

How to cite this paper: Hassan, B. H., and Sarhan, Q. I., Performance evaluation of relational embedded databases: an empirical study, Kurdistan, Irak. Innovaciencia. 2018; 6(1): 1-9. http://dx.doi.org/10.15649/2346075X.468

Reception date: Received: 22 September 2018. Accepted: 10 December 2018. Published: 28 December 2018.

ABSTRACT

Introduction: With the rapid deployment of embedded databases across a wide range of embedded devices, such as mobile devices and Internet of Things (IoT) devices, the amount of data generated by such devices is also growing rapidly. For this reason, performance is considered a crucial criterion when selecting the most suitable embedded database management system to store and retrieve the data of these devices. Currently, many embedded databases are available for use in this context. Materials and Methods: In this paper, four popular open-source relational embedded databases, namely H2, HSQLDB, Apache Derby, and SQLite, have been compared experimentally with each other to evaluate their operational performance in terms of creating database tables, retrieving data, inserting data, updating data, and deleting data. Results and Discussion: The experimental results of this paper are presented in Table 4. Conclusions: The experimental results and analysis showed that HSQLDB outperformed the other databases in most evaluation scenarios.

Keywords: Embedded devices, Embedded databases, Performance evaluation, Database operational performance, Test methodology.

1 Software Engineering and Embedded Systems (SEES) Research Group, University of Duhok, Duhok, Kurdistan Region, Iraq, Email: [email protected].
2 Software Engineering and Embedded Systems (SEES) Research Group, Department of Computer Science, College of Science, University of Duhok, Duhok, Kurdistan Region, Iraq, Email: [email protected].

INTRODUCTION

Recent advances in data-driven software applications and systems that need to store their own data within the machines they reside in, rather than on remote machines/servers, have paved the way for different types of software applications and devices to use embedded databases. An embedded database is a small database integrated with a software application that requires access to its own data. Thus, the database system is "hidden" from the application's end-user and requires little or no maintenance (1). Embedded database systems can be used for various purposes. For example, they can be used for storing system logging activities, email archive activities, contact information in mobile devices, and sensor data in constrained and IoT devices (17). To ensure database consistency, the data are stored and updated in short-running transactions, which increases write latency and amplification (2). This is a determining factor in the performance of applications.

Because the data in an embedded database reside on the same machine that runs the application, the database engine is always running and accessible. Besides, it is not shared or networked (only one software application can access the embedded database file at any time). By comparison, traditional databases (also called client-server databases), which are often used in web applications, web services, and distributed software applications (16), are more complicated. For example, they require monitoring and management of data sharing mechanisms, server configuration, server failure handling, and server maintenance (3). In this paper, four well-known open-source relational embedded databases, namely H2 (4), HSQLDB (5), Apache Derby (6), and SQLite (7), have been compared experimentally with each other to evaluate their operational performance.

Database developers and researchers are concerned about the factors affecting database performance in real-world usage (8). In addition, database benchmarking helps organisations select the most suitable and compatible database for their businesses. Thus, many studies in the literature, such as (9-14), have introduced, explored, and evaluated the performance of various traditional and embedded databases along many directions. All the aforementioned studies were crucial in explaining different types of databases, database architectures, and database-based applications. They were also extremely valuable in providing a theoretical background and general evaluation metrics for this experimental study. However, no previously published study has experimentally compared the selected embedded databases in terms of data storing and processing capabilities. This was the rationale for conducting this experimental study.

The rest of this paper is organized as follows. Section 2 presents the evaluation methodology (evaluation conditions, evaluation scenarios, evaluation metrics, and software/hardware setups) used in this experimental study. Section 3 presents the experimental results of comparing the selected relational embedded databases with each other according to various test scenarios. Finally, conclusions and future work are given in Section 4.

EVALUATION METHODOLOGY

The evaluation methodology used in this study is adapted from (18). It includes the evaluation conditions, evaluation scenarios, evaluation metrics, and software/hardware setups that have been used to compare experimentally the performance of each embedded database in terms of data storing and processing capabilities.

1. Evaluation Conditions

The following evaluation conditions have been considered in this study:

• All selected embedded databases have been tested with their default settings and configurations.
• Every test scenario has been applied to each embedded database using Structured Query Language (SQL) statements (15) with the same test scenario and its related parameters.
• Each database table used in the evaluation process consists of 1000 records (rows) and 10 fields (columns). The table with its specifications has been created using the SQL statement in Scenario 1.
• All test data in this paper use an identical data structure, as shown in Table 1.

Table 1. Database structure

Attribute Name    Data type
efname            varchar(10)
elname            varchar(10)
email             varchar(40)
job               varchar(9)
mgr               int
hiredate          date
sal               numeric(7,2)
comm              numeric(7,2)
gender            varchar(10)
empno             int

• The hardware configuration of the test environment is not the primary concern of this paper; rather, the scenarios have been designed to evaluate the performance of the different embedded databases within a fixed environment. Based on our experience in the field, the test scenarios have been selected as the most commonly used scenarios in different types of software applications.
• Test applications (to store/receive data to/from embedded databases) have been programmed and executed on the same computer system to ensure that the same software and hardware specifications are used.
• Before starting the testing and measurements, all user applications (except the applications used to test the databases) have been closed.
• During the whole period of testing and measurements, the computer system was disconnected from the Internet.
• Every evaluation scenario has been repeated 10 times, and the measurements have then been averaged to ensure more accuracy.
• Evaluation results have been recorded after initializing the database driver and establishing connections to the selected embedded databases.

2. Evaluation Scenarios

The operational performance of each relational embedded database is compared experimentally via different evaluation tests taken from real-world database scenarios, as follows:

● Scenario 1
The test application creates a single database table using the following SQL statement:
    CREATE TABLE empTable "+i+" (empno INT PRIMARY KEY, efname VARCHAR(10), elname VARCHAR(10), email VARCHAR(40), job VARCHAR(9), mgr INT, hiredate DATE, sal NUMERIC(7,2), comm NUMERIC(7,2), gender VARCHAR(10))

● Scenario 2
The test application creates 50 database tables using the following SQL statement in a loop (iteration variable i goes from 1 to 1000):
    CREATE TABLE empTable "+i+" (empno INT PRIMARY KEY, efname VARCHAR(10), elname VARCHAR(10), email VARCHAR(40), job VARCHAR(9), mgr INT, hiredate DATE, sal NUMERIC(7,2), comm NUMERIC(7,2), gender VARCHAR(10));

● Scenario 3
The test application deletes the database table with no inserted records using the following SQL statement:
    DROP TABLE empTable;

● Scenario 4
The test application deletes the database table with 1000 records using the following SQL statement:
    DROP TABLE empTable;

● Scenario 8
The test application reads only the last record from the database table using the following SQL statement:
    SELECT * FROM empTable WHERE empno=1000;

● Scenario 9
The test application deletes all data in the database table using the following SQL statement:
    DELETE FROM empTable;

● Scenario 10
The test application deletes only the first record from the database table using the following SQL statement:
    DELETE FROM empTable WHERE empno=1;

● Scenario 11
The test application deletes only the middle record from the database table using the following

● Scenario 5
The test application reads