Enhancement of The Performance of Monitoring Tool

for Small Footprint

MTP Stage-I Report

Submitted in partial fulfillment of the requirements

for the degree of

Master of Technology

Under the guidance of

Prof. Deepak B. Phatak

Submitted By

Sukh deo, Roll No: 123050061

Department of Computer Science and Engineering

Indian Institute of Technology Bombay,

Mumbai

Acknowledgement

I would like to sincerely thank my guide, Prof. Deepak B. Phatak, for his consistent direction and guidance throughout the project. His constant encouragement and sound direction made this project work possible. I would like to thank Mr. Nagesh Karmali for his continuous inputs to my work. I would also like to thank everyone else who supported me in this work.

Abstract

Nowadays, mobile devices are used not only for basic purposes such as calls, but for many different purposes and applications. We can say that mobile devices are siblings of personal computers. Mobile device usage patterns are very different from those of conventional PCs, as the devices are targeted at general users. Mobile devices have some great features, e.g., light weight and efficiency, but they also have drawbacks such as small memory and battery issues. People want to use many applications as the number of purposes grows, but managing many applications on a mobile device is a challenge. The performance of a mobile device depends on how its applications and their data are managed. There is a need to measure the performance of various commonly used applications.

Our work starts with simple questions: Does storage affect the performance of a mobile device? What does a mobile device require? What are the parameters for measuring the performance of databases? How can the performance of a mobile device be measured, and with which tools? Finally, we discuss possible ways to optimize storage as well as the monitoring tool.

Contents

Acknowledgement

1 Introduction
1.1 Motivation
1.2 Small footprint databases
1.3 Monitoring Tool

2 Literature Survey of Small Footprint Databases
2.1 Introduction
2.2 Berkeley DB
2.2.1 Access Methods
2.2.2 Use of Berkeley DB
2.2.3 Limitation
2.3 Perst DB
2.4 SQLite DB
2.4.1 Architecture of SQLite
2.5 Comparison among Berkeley DB, Perst DB, and SQLite DB

3 Literature Survey of Monitoring Tool
3.1 Introduction
3.2 IO Characteristics
3.3 IOzone
3.3.1 Limitation
3.4 AndroBench
3.4.1 Microbenchmarks
3.4.2 SQLite Benchmarks
3.4.3 Limitation of AndroBench
3.5 AndroStep

4 Problem Statement and Proposal
4.1 Possible optimization in SQLite
4.2 Possible Enhancement in Monitoring Tool

5 Conclusion and Future Work

List of Figures

2.1 Architecture of Berkeley DB[1]
2.2 Architecture of SQLite[2]
3.1 Structure of AndroBench Measure Tab[7]
3.2 Structure of AndroBench Setting Tab[8]
3.3 Structure of Measure tab[16]
3.4 Structure of Mobibench[16]

Chapter 1

Introduction

1.1 Motivation

Today's mobile devices are multifunction devices. We can say that a mobile device is a sibling of the personal computer, i.e., the laptop. It runs many applications for both business and consumer purposes. Smart phones and tablets allow people to access the Internet for email, messaging, and web browsing. A business person can install business applications that streamline workflow, save time, keep data accessible, and automate daily tasks. A consumer can look up goods and their prices on the Internet, and many more tasks can be done on mobile devices, i.e., smart phones and tablets. We can work on the road or away from the office, synchronize with our personal computer, and get updated information.

Because mobile devices offer so many facilities and are lightweight, there is a huge market for them. People want to use many applications, but mobile devices have limitations, i.e., small memory and limited battery backup. Managing many applications on a mobile device is a challenge, so we have to analyze the possible optimization techniques. Research on mobile devices can broadly be divided into applications and services, device architecture, and operating systems[11]. Traditionally, storage has not been viewed as a critical component of mobile devices in terms of expected performance.

The performance of mobile devices must improve continuously to fulfil users' desire to run more applications. It depends heavily on storage performance: for example, the quality of audio/video playback while applications are being downloaded in the background, the storage of web browser history, and browsing speed all depend on storage performance. Most mobile devices use an Embedded Multimedia Card (eMMC), which is based on NAND flash memory. Good file systems and I/O schedulers are also required to optimize storage performance. In mobile devices, SQLite is used for both system and application data, for example contacts, messages, and web browser bookmarks. It stores the entire database as one file. The performance of SQLite is therefore a matter of storage performance, i.e., of how the data is handled and accessed.

Given the huge demand for mobile devices, their performance needs to be measured. The performance of a mobile device depends closely on the storage of that device. Based on this motivation, we discuss small footprint databases and their monitoring tools, and also possible ways to enhance the performance of mobile devices.

1.2 Small footprint databases

Small footprint databases are also called lightweight databases or embedded databases. They are specially designed for mobile devices, because mobile devices differ from personal computers in various respects and cannot run heavyweight databases. Mobile devices have the following limitations[5]:

• Limited Memory: Most mobile devices use NAND-based flash memory, whose capacity is limited. Memory is kept small to keep the device light and to reduce battery consumption.

• Limited Processing Power: Mobile devices have little processing power, yet many services, such as gaming, music, and video, are provided over the Internet. Applications must therefore be managed efficiently to give fast responses.

• Battery consumption: The battery is an important part of a mobile device, and it runs out very quickly. Reducing battery consumption is an active research field.

• Limited bandwidth: As the number of users and applications increases, the share of wireless bandwidth and CPU time available to each decreases.

Based on these limitations, mobile devices require a database that uses little memory and power; such a database is called a small footprint database. In the second chapter, we discuss Berkeley DB, Perst DB, and SQLite DB. These are small footprint databases; each has specific features, and we select a suitable database based on those features.

1.3 Monitoring Tool

Storage performance directly influences the overall performance of a mobile device, and it depends on the kind of storage medium and database the device uses. Because of its low power consumption and high performance, most devices use NAND flash memory. In a mobile device, storage IO is one of the key factors that govern overall system performance, and each application generates its own specific load. Before measuring the performance of a particular application, we have to analyze the database transactions it issues and the metadata of the storage medium. Based on the measured throughput, we can enhance the performance of that application and, in turn, the overall performance of the mobile device. A monitoring tool measures the load of each application running on the mobile device. Some monitoring tools do not give correct results, because they do not analyze the data types and file system used by the running application. We have to choose a good monitoring tool so that we can enhance the performance of the mobile device.

In the first chapter, we introduce mobile databases, i.e., lightweight databases, and monitoring tools. In the second chapter, we discuss the features and limitations of lightweight databases, i.e., Berkeley DB, Perst DB, and SQLite DB. In the third chapter, we discuss several monitoring tools, i.e., IOzone, AndroBench, and AndroStep. IOzone measures disk performance, but it does not work on Android devices. AndroBench is specially designed for Android devices to measure the performance of SQLite, but because it lacks workload analysis, it does not give correct results. AndroStep combines the features of IOzone and AndroBench and analyzes IO types properly, so it gives better results than AndroBench. In the fourth chapter, we discuss possible enhancements to the monitoring tool. In the fifth chapter, we conclude the report and discuss future work.

Chapter 2

Literature Survey of Small Footprint Databases

2.1 Introduction

This chapter presents a literature study of several small footprint databases. Each database has its own characteristics, so which database is best can be decided only after studying them against the requirements. We have analyzed the features of Berkeley DB, Perst DB, and SQLite DB.

2.2 Berkeley DB

Berkeley DB is an embedded database[3]. It is open source and written in C, with callable APIs for C#, C++, PHP, and most other languages[10]. Berkeley DB originated at the University of California, Berkeley, and was later developed by Oracle Corporation. It supports concurrent access by multiple users. It stores data as key/value pairs. For example, to store user names and passwords, the user name is the key and the password is the value. It is not a relational database, but multiple data items can be stored under one key. The architecture of Berkeley DB is shown in figure 2.1. It has three layers; the first layer is the application program, which is directly connected to the access methods and recovery.

2.2.1 Access Methods

Berkeley DB provides three types of access methods:

• B+tree: A B+tree stores records in leaf pages. Keys in the tree are stored in sorted order. A B+tree reduces the tree size and gives better search performance.

• Hash: Berkeley DB uses extended linear hashing.

• Fixed or variable length records: In this method, each record has a record number, which serves as the key. For example, if we want to store a text file, then each line is treated as a record.

Figure 2.1: Architecture of Berkeley DB[1]

2.2.2 Use of Berkeley DB

Berkeley DB provides concurrent access to databases[14]. It can be used by a standalone application or linked into many cooperating applications. For example, on a mobile device, Berkeley DB can store names and mobile numbers, and the same contact data can be read by the dialer, the received-call list, and the messaging application. As further examples, a directory server can provide naming and directory lookup services using Berkeley DB, and Postfix uses Berkeley DB to store administrative information.

2.2.3 Limitation

Berkeley DB has some limitations:

• It is not a relational database (RDBMS). It has no schema.

• It stores records only as key/value pairs, so data cannot be compared or searched on anything other than the key.
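The key/value limitation can be illustrated with a minimal sketch. It uses Python's standard `dbm.dumb` module purely as a stand-in for a key/value store; it is not Berkeley DB, and the file name and contact data are hypothetical. Lookup by key is direct, while finding a record by its value requires scanning every key.

```python
import dbm.dumb
import os
import tempfile


def key_value_demo():
    """Store two contacts, then look one up by key and one by value."""
    with tempfile.TemporaryDirectory() as tmp:
        # dbm.dumb is a stand-in key/value store, not Berkeley DB itself.
        with dbm.dumb.open(os.path.join(tmp, "contacts"), "c") as db:
            db[b"alice"] = b"555-0100"   # name is the key, number is the value
            db[b"bob"] = b"555-0101"

            by_key = db[b"alice"]        # direct lookup on the key
            # No index on values: every key has to be visited.
            by_value = [k for k in db.keys() if db[k] == b"555-0101"]
    return by_key, by_value
```

Calling `key_value_demo()` returns the number stored for `alice` and the list of keys whose value matches the second number.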

2.3 Perst DB

Perst DB is an object-oriented, open source embedded database written in Java[13]. It supports the following features[18]:

• Classic B-Tree implementation

• R-tree indexes for spatially-oriented applications such as GIS and navigation

• Time Series class to efficiently deal with small fixed-size objects

• Specialized versions of collections for thick indices (indices with many duplicates), and bit indices (keys with a restricted number of possible values).

It is an application-specific database for storing information about real objects. For example, a restaurant search application can use Perst DB, because information can be stored and accessed as objects.

Figure 2.2: Architecture of SQLite[2]

2.4 SQLite DB

SQLite is an embedded database. It does not have all the features of a full RDBMS, because it must run in a small memory footprint. It does have features that a conventional RDBMS lacks: it is a zero-configuration, self-contained, and serverless database[6]. SQLite is a native database. It has a schema, so we can design our own tables as we want.

2.4.1 Architecture of SQLite

The architecture of SQLite explains how SQLite works[2]. This information is very useful if we want to change the internal code of SQLite. The block diagram is shown in figure 2.2.

• Interface: This module provides the external interface to the SQLite library. It is implemented in main.c.

• SQL command processor: This module has three parts.

– Tokenizer: This module breaks the SQL statement into tokens and passes them to the parser.

– Parser: This module assigns meaning to the tokens based on their context.

– Code generator: After the parser has assembled the meaning of the tokens, the code generator produces the code that will run in the virtual machine.

• Virtual Machine: It executes the code produced by the code generator.

• B-tree: SQLite maintains its data in B-trees. A separate B-tree is created for each table. All B-trees are stored in the same file.

• Pager: The B-tree module requests data from disk in fixed-size chunks (pages); the default size is 1024 bytes. The page cache also takes care of the rollback journal and rollback information.

• OS Interface: It provides access to OS services, e.g., disk access and the handling of critical sections.
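The path from SQL text to virtual machine code can be observed directly. The following is a minimal sketch using Python's standard `sqlite3` module; the `contacts` table and the function name `show_bytecode` are hypothetical and exist only for illustration. Prefixing a statement with `EXPLAIN` makes SQLite return the program produced by the code generator instead of executing it.

```python
import sqlite3


def show_bytecode(sql):
    """Return the virtual machine opcodes SQLite generates for one statement."""
    conn = sqlite3.connect(":memory:")  # zero configuration: no server, no setup
    try:
        # Hypothetical table, used only so the statement can be compiled.
        conn.execute(
            "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
        )
        # EXPLAIN returns one row per virtual machine instruction.
        rows = conn.execute("EXPLAIN " + sql).fetchall()
        return [row[1] for row in rows]  # column 1 holds the opcode name
    finally:
        conn.close()
```

For example, `show_bytecode("SELECT name FROM contacts WHERE id = 1")` lists the instructions the virtual machine would run; the exact opcodes vary between SQLite versions.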

2.5 Comparison among Berkeley DB, Perst DB, and SQLite DB

SQLite is a native database. It has a schema, so we can design our own tables as we want. Berkeley DB is an embedded database[4] that stores data as key/value pairs. Perst DB is an object-oriented embedded database; it is an application-specific database for storing information about real objects. The three cannot really be compared directly, because each has its own special properties. However, SQLite has a schema, so we can use it as we want. Overall, SQLite gives the best performance among them.

Chapter 3

Literature Survey of Monitoring Tool

3.1 Introduction

The IO characteristics of Android-based mobile devices are different from those of desktop computers, so a different tool is required to measure the storage performance of a mobile device. I have studied different monitoring tools that are used to monitor embedded databases. Based on the existing monitoring tools, we discuss which features can be enhanced.

3.2 IO Characteristics

IO characteristics can generally be described along four dimensions:

• IO type (read vs. write).

• IO size (KB).

• Spatial locality (sequential vs. random).

• Temporal locality (hot vs. cold).

Synchronous IO: Works on a polling basis[9]. While polling, if data is available the application gets the data; otherwise the application is blocked and retries.

Buffered IO: By default, program IO is buffered, i.e., data passes through temporary storage on its way to the requesting program, which increases the throughput of the application.

We should analyze all of these characteristics on an Android device. Based on them, we can measure the workload of an application process correctly. For example, we have to understand how many writes go to metadata and how many are synchronous, and whether the IO is for a journal file or another file. Some existing workload tools do not identify file types, so their accuracy is poor. We discuss workload tools based on the above four characteristics.
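The four characteristics can be computed from an IO trace. The following is a minimal sketch, assuming a hypothetical trace format of `(op, offset, size)` tuples with sizes in bytes; it is an illustration, not the method of any tool discussed in this chapter. A request counts as sequential when it starts where the previous one ended, and an offset counts as hot when it is accessed more than once.

```python
from collections import Counter


def characterize(trace):
    """Summarise a list of (op, offset, size) requests; op is "R" or "W"."""
    if not trace:
        raise ValueError("trace must contain at least one request")

    ops = Counter(op for op, _, _ in trace)          # IO type

    sequential = 0                                   # spatial locality
    prev_end = None
    for _, offset, size in trace:
        if prev_end is not None and offset == prev_end:
            sequential += 1
        prev_end = offset + size

    touched = Counter(offset for _, offset, _ in trace)   # temporal locality
    hot = sum(1 for count in touched.values() if count > 1)

    return {
        "reads": ops["R"],
        "writes": ops["W"],
        "avg_size_kb": sum(size for _, _, size in trace) / len(trace) / 1024,
        "sequential_fraction": sequential / max(len(trace) - 1, 1),
        "hot_offsets": hot,
    }
```

Given a trace with two back-to-back writes, a read of the first block, and a write far away, the summary reports one read, three writes, one sequential transition out of three, and one hot offset.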

3.3 IOzone

IOzone is a file system benchmark tool[15]. It is good for measuring file system performance on computer platforms and is generally used by computer vendors. It measures the I/O performance of read, write, fread, and fwrite operations on a single file.

• Write: This test measures the performance of writing a new file. It covers both the data of the new file and the metadata. Metadata is the overhead information, such as directory information and space allocation.

• Re-write: This test measures the performance of writing a file that already exists. It costs less, because there is no metadata overhead.

• Read: This test measures the performance of reading a file.

• Re-Read: This test measures the performance of reading a file that was recently read.

• Random Read and Random Write: These tests measure the performance of reads and writes that access random locations within a file. They include the overhead of seek time and latency.

• Fread and Fwrite: These tests use the library functions fread() and fwrite(). The Freread and Frewrite tests repeat them on a recently read or written file.

• Mmap: mmap() maps a file into a user's address space. This gives applications easy access to the data, without having to call msync() and wait for the data.
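The first four tests in the list above can be sketched in a few lines. This is a simplified illustration in Python, not IOzone itself; the function names, file name, and default sizes are assumptions. It goes through the operating system's page cache, so the re-read figure mostly reflects cached data.

```python
import os
import tempfile
import time


def timed(fn):
    """Run fn and return the elapsed time in seconds (never zero)."""
    start = time.perf_counter()
    fn()
    return max(time.perf_counter() - start, 1e-9)


def run_tests(file_size=4 * 1024 * 1024, record_size=64 * 1024):
    """Return throughput in KB/s for write, re-write, read, and re-read."""
    data = os.urandom(record_size)
    records = file_size // record_size
    total_bytes = records * record_size
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "iozone_sketch.dat")

        def write(mode):
            with open(path, mode) as f:
                for _ in range(records):
                    f.write(data)

        def read():
            with open(path, "rb") as f:
                while f.read(record_size):
                    pass

        steps = [
            ("write", lambda: write("wb")),      # creates the file and its metadata
            ("re-write", lambda: write("r+b")),  # overwrites the existing file in place
            ("read", read),
            ("re-read", read),                   # same file, read again immediately
        ]
        for name, fn in steps:
            results[name] = total_bytes / timed(fn) / 1024
    return results
```

`run_tests()` returns a dictionary with one throughput value per test, in the order in which the tests ran.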

3.3.1 Limitation

IOzone is a file system benchmark tool[16], and it has the following limitations on Android:

• It does not include fsync(). However, SQLite forces data to be committed via fsync() in its journal modes. So, IOzone cannot measure the performance of an Android device correctly.

• On Android devices[12], there is no way to get superuser permission at user level for remounting and unmounting file systems, but IOzone needs superuser permission. So, it cannot measure the performance of Android devices.

• IOzone must perform a sequential write operation before running any other benchmark, which does not match the IO pattern of SQLite in journal mode.

Because of these limitations, IOzone cannot be used to measure the performance of SQLite on an Android device.
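Why the missing fsync() matters can be seen from a small experiment. The sketch below is an illustration with assumed names and sizes, not part of IOzone or SQLite: it appends small records to a file and optionally calls `os.fsync()` after each one, the way a journaled commit forces data to storage.

```python
import os
import tempfile
import time


def commit_cost(commits=50, payload=4096, use_fsync=True):
    """Return the average time in seconds per small appended write."""
    data = b"x" * payload
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "journal_sketch.dat")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            start = time.perf_counter()
            for _ in range(commits):
                os.write(fd, data)
                if use_fsync:
                    os.fsync(fd)  # wait until the data has reached the device
            elapsed = time.perf_counter() - start
        finally:
            os.close(fd)
    return elapsed / commits
```

Comparing `commit_cost(use_fsync=True)` with `commit_cost(use_fsync=False)` on the same device gives a rough idea of the cost that a benchmark without fsync() leaves out; the actual ratio depends on the storage and file system.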

3.4 AndroBench

AndroBench is a storage benchmarking tool for Android mobile devices[12]. It measures sequential and random IO performance and the throughput of various SQLite transactions, i.e., insert, delete, and update. The methodology of AndroBench[12] has two parts, i.e., microbenchmarks and SQLite benchmarks.

3.4.1 Microbenchmarks

AndroBench uses four microbenchmarks to measure the performance of sequential and random IO. To measure sequential read performance, it follows the steps below:

• AndroBench creates a file (32MB by default) in the target partition.

• The file is read sequentially with a fixed buffer size (256KB by default).

• Sequential write is measured in the same way as read, but the file size is reduced to 2MB, because write operations take much longer.

• For random read/write operations, AndroBench measures the number of IO operations per second. Each read/write operation is 4KB in size by default; a 32MB file is used for reads and a 2MB file for writes.

To measure performance correctly, it takes the average of three runs. Java does not provide a way to bypass the buffer cache, so the benchmark is implemented in C.
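The read side of the microbenchmarks can be sketched as follows. This is a simplified illustration in Python using the default sizes quoted above; the function and parameter names are assumptions. Unlike AndroBench's C implementation, this sketch does not bypass the operating system's buffer cache, so its numbers are optimistic.

```python
import os
import random
import tempfile
import time


def micro_benchmark(read_file_mb=32, buffer_kb=256, io_kb=4, random_ops=1000, seed=0):
    """Measure sequential read throughput and random read IOPS on a temp file."""
    size = read_file_mb * 1024 * 1024
    buf = buffer_kb * 1024
    io = io_kb * 1024
    rng = random.Random(seed)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bench.dat")
        with open(path, "wb") as f:
            f.write(os.urandom(size))

        # Sequential read with a fixed buffer size.
        total = 0
        start = time.perf_counter()
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(buf)
                if not chunk:
                    break
                total += len(chunk)
        seq_elapsed = max(time.perf_counter() - start, 1e-9)

        # Random reads of io bytes at aligned offsets.
        slots = size // io
        start = time.perf_counter()
        with open(path, "rb", buffering=0) as f:
            for _ in range(random_ops):
                f.seek(rng.randrange(slots) * io)
                f.read(io)
        rand_elapsed = max(time.perf_counter() - start, 1e-9)

    return {
        "bytes_read": total,
        "seq_read_mb_s": total / seq_elapsed / (1024 * 1024),
        "rand_read_iops": random_ops / rand_elapsed,
    }
```

The write benchmarks would follow the same pattern with a 2MB file; averaging several calls gives a more stable result.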

3.4.2 SQLite Benchmarks

To test the performance of SQLite transactions, AndroBench creates a table with 17 columns (twelve integer columns and five text columns) in the target location. It performs three types of tests, i.e., insert, update, and delete. By default, the number of transactions is set to 300, but this can be changed from the settings. First, it releases the page cache by calling SQLiteDatabase.releaseMemory().

Changing Parameters: The user can change the target partition, file size, and buffer size used in the microbenchmarks and the SQLite benchmarks. After all benchmarks complete successfully, AndroBench transmits the measurement results to a central server. Results are associated with the model name and parameters. Performance is measured in transactions per second.

Figure 3.1: Structure of AndroBench Measure Tab[7]

Figure 3.2: Structure of AndroBench Setting Tab[8]

For the structure of the measure tab, please refer to figure 3.1. We can measure the overall performance, the performance of various SD cards, and SQLite transactions by clicking the corresponding button. Figure 3.2 shows the setting tab, which offers several options: we can change the buffer size, file size, and partition location as we want.
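The SQLite benchmark described in this subsection can be approximated with Python's standard `sqlite3` module. This is a sketch under assumptions, not AndroBench's code: the column names, the row contents, and the choice of one commit per statement are hypothetical, and only the table shape (twelve integer and five text columns) and the default of 300 transactions come from the description above.

```python
import os
import sqlite3
import tempfile
import time

INT_COLS = [f"i{n}" for n in range(12)]   # twelve integer columns
TEXT_COLS = [f"t{n}" for n in range(5)]   # five text columns


def sqlite_benchmark(transactions=300):
    """Return transactions per second for insert, update, and delete."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "bench.db"))
        try:
            cols = ", ".join(
                [f"{c} INTEGER" for c in INT_COLS] + [f"{c} TEXT" for c in TEXT_COLS]
            )
            conn.execute(f"CREATE TABLE bench ({cols})")
            conn.commit()
            placeholders = ", ".join("?" * 17)

            def insert(n):
                row = [n] * 12 + [f"text-{n}"] * 5
                conn.execute(f"INSERT INTO bench VALUES ({placeholders})", row)

            def update(n):
                conn.execute("UPDATE bench SET t0 = ? WHERE i0 = ?", (f"new-{n}", n))

            def delete(n):
                conn.execute("DELETE FROM bench WHERE i0 = ?", (n,))

            for name, op in (("insert", insert), ("update", update), ("delete", delete)):
                start = time.perf_counter()
                for n in range(transactions):
                    op(n)
                    conn.commit()  # one transaction per statement
                elapsed = max(time.perf_counter() - start, 1e-9)
                results[name] = transactions / elapsed

            results["rows_left"] = conn.execute("SELECT COUNT(*) FROM bench").fetchone()[0]
        finally:
            conn.close()
    return results
```

Because every inserted row is later deleted, `rows_left` is zero at the end, which is a quick check that all three phases ran over the same rows.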

3.4.3 Limitation of AndroBench

Although AndroBench is a powerful tool for measuring the performance of Android devices, it has some limitations[5].

• It does not allow changing synchronization options such as O_SYNC, O_DIRECT, and Mmap.

• It has no method for measuring the performance of the fsync() call.

• AndroBench does not support a multi-threaded benchmark environment.

Because of the above limitations, AndroBench cannot measure performance correctly.

3.5 AndroStep

AndroStep is a storage performance benchmarking tool[16]. It combines the characteristics of both IOzone and AndroBench. It has two parts:

• Workload generator (Mobibench)

• Workload analyzer, called Mobile Storage Analyzer (MOST).

Mobibench: It analyzes the IO behavior of different Android applications and generates the file system workload of those applications. The file system workload covers random and sequential IO as well as synchronous and buffered IO. It is implemented in two versions: the first is a shell application and the second is an Android application. Both versions are implemented in C. The Android version uses the Java Native Interface (JNI) to call the C code, since Java cannot call C code directly. Mobibench provides an environment similar to that of smart phones in order to create the workload environment.

Figure 3.3: Structure of Measure tab[16]

It measures the insert, update, and delete database operations in SQLite. The performance of SQLite depends on PRAGMA options, e.g., synchronous mode or journal mode. Mobibench generates three results: throughput, CPU utilization, and number of context switches. The unit of throughput for SQLite operations is transactions per second.

MOST: AndroStep implements two features on Android devices:

• Identify the file type for a given block: It identifies the inode number of the file from the logical block address and then uses the inode number to identify the file type.

• Identify the application for a given block: It identifies application-dependent IO and file-type-dependent access characteristics. By separating these characteristics, we can measure the workload accurately.

MOST collects the IO trace at the block device driver level and then processes the information for each block. It addresses three mappings:

• LBA-to-file mapping: It accepts a logical block number as input and produces a file name.

• LBA-to-process mapping: MOST identifies the original process that issued a given IO.

• Retrospective LBA mapping: In SQLite, many temporary files are created and deleted. It is very important to understand how these files are used.
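The LBA-to-file mapping and the resulting block classification can be illustrated with a small lookup table. This is a sketch with hypothetical data: the extent table, the file names, and the `<fs-metadata>` marker are invented for illustration, and MOST itself resolves blocks through inode numbers rather than a fixed table. The `-journal` suffix is the name SQLite gives to its rollback journal file.

```python
import bisect

# Hypothetical extent table: (first_lba, last_lba, file_name), sorted by first_lba.
EXTENTS = [
    (0, 99, "<fs-metadata>"),
    (100, 199, "contacts.db"),
    (200, 219, "contacts.db-journal"),
    (220, 399, "photo.jpg"),
]
STARTS = [extent[0] for extent in EXTENTS]


def lba_to_file(lba):
    """Return the file that owns a logical block address, or None."""
    i = bisect.bisect_right(STARTS, lba) - 1   # last extent starting at or before lba
    if i >= 0 and EXTENTS[i][0] <= lba <= EXTENTS[i][1]:
        return EXTENTS[i][2]
    return None


def classify(lba):
    """Classify a block as Metadata, Journal, or Data."""
    name = lba_to_file(lba)
    if name is None:
        return "Unknown"
    if name == "<fs-metadata>":
        return "Metadata"
    if name.endswith("-journal"):
        return "Journal"
    return "Data"
```

With such a mapping, each request in a block-level trace can be tagged before the workload statistics are computed, so that journal and metadata writes are counted separately from data writes.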

Figure 3.3 shows the setting tab, where we can select various file IO operations and SQLite transactions, i.e., insert, delete, and update. Figure 3.4 shows the measure tab, which offers several measurement options, i.e., all, file IO, and SQLite.

AndroStep gives better results, because Mobibench analyzes IO well and MOST categorizes logical blocks into three types, i.e., Metadata, Journal, and Data.
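Since Mobibench's SQLite results depend on the PRAGMA options, it helps to see how those options are set and read back. The sketch below uses Python's standard `sqlite3` module; the function name and database file name are assumptions. `PRAGMA journal_mode` returns the mode actually in effect as a lowercase string, and `PRAGMA synchronous` returns an integer (0 for OFF, 1 for NORMAL, 2 for FULL).

```python
import os
import sqlite3
import tempfile


def apply_pragmas(journal_mode="WAL", synchronous="NORMAL"):
    """Set both PRAGMAs on a fresh database and return the values in effect."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "pragma.db"))
        try:
            # SQLite answers with the journal mode it actually switched to.
            mode = conn.execute(f"PRAGMA journal_mode={journal_mode}").fetchone()[0]
            conn.execute(f"PRAGMA synchronous={synchronous}")
            sync = conn.execute("PRAGMA synchronous").fetchone()[0]
            return mode, sync
        finally:
            conn.close()
```

A benchmark would call this kind of setup once per configuration and then run the same insert, update, and delete workload under each one, so that the effect of the journal and synchronous settings can be compared.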

Figure 3.4: Structure of Mobibench[16]

Chapter 4

Problem Statement and Proposal

4.1 Possible optimization in SQLite

In NAND-based flash memory, a write operation is ten times slower than a read operation, because of erase-before-write and garbage collection overheads. SQLite uses a rollback journal file[?]. It is good for small transactions and sequential writes, but costly for random write transactions. Shadow paging is a better solution for large transactions. It will not only speed up transactions in SQLite but also extend the life of NAND flash memory, because shadow paging resembles the concept of wear leveling. It handles random write transactions very efficiently, because it only requires remapping a pointer in the B-tree. Some applications, for example video games, do not require a recovery technique; only speed matters.

My Proposal: We can add a module to SQLite that chooses the best technique among journaling, shadow paging, and an unreliable mode (i.e., no recovery technique at all), based on the application.
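The decision logic of such a module could look like the following sketch. It is purely illustrative: the `Workload` fields, the function name, and both thresholds are hypothetical values chosen for the example, not measured or proposed numbers. It only encodes the reasoning above: no recovery when the application does not need it, shadow paging for large or mostly random write transactions, and the rollback journal otherwise.

```python
from dataclasses import dataclass
from enum import Enum


class Technique(Enum):
    JOURNAL = "rollback journal"
    SHADOW_PAGING = "shadow paging"
    UNRELIABLE = "no recovery"


@dataclass
class Workload:
    needs_recovery: bool          # e.g. False for a video game
    avg_transaction_pages: int    # typical pages written per transaction
    random_write_fraction: float  # share of writes that are random, 0.0 to 1.0


def choose_technique(workload, large_txn_pages=64, random_threshold=0.5):
    """Pick a technique; both thresholds are hypothetical tuning knobs."""
    if not workload.needs_recovery:
        return Technique.UNRELIABLE
    if (workload.avg_transaction_pages >= large_txn_pages
            or workload.random_write_fraction >= random_threshold):
        return Technique.SHADOW_PAGING
    return Technique.JOURNAL
```

In practice the thresholds would have to come from measurements with a monitoring tool, which is one reason the tool's accuracy matters.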

4.2 Possible Enhancement in Monitoring Tool

After analyzing IOzone, AndroBench, and AndroStep, we found that AndroStep gives the best results among them. However, it has some limitations that need to be analyzed.

• It has no procedure for analyzing the IO characteristics of hot and cold data. If we want to measure the performance of P-SQLite[17], it will not give correct results, because P-SQLite uses a hot and cold storage technique to enhance performance.

• It supports no technique other than journal mode. If we use WAL, shadow paging, or adaptive logging, it will not give correct results.

• It only measures throughput, in transactions per second, and context switching. If we want to measure how much memory and battery each application consumes, it will not work.

• It measures only the insert, delete, and update transactions of SQLite; if we want to measure select transactions, it gives no result.

If we remove the above limitations, the tool will measure the performance of SQLite databases accurately. My Proposal: We can add modules after analyzing the characteristics of each limitation.

Chapter 5

Conclusion and Future Work

We have discussed small footprint databases, i.e., Berkeley DB, Perst DB, and SQLite DB. We have also discussed possible ways to enhance the performance of SQLite DB, i.e., implementing shadow paging and adaptive logging. We have discussed monitoring tools for small footprint databases, i.e., IOzone, AndroBench, and AndroStep. We found some limitations in AndroStep, for which we will discuss solutions in MTP Stage-II.

Bibliography

[1] Architecture of Berkeley DB. URL:http://www.aosabook.org/en/bdb.html [last accessed: Sept 29, 2013].

[2] Architecture of SQLite. URL:http://www.sqlite.org/arch.html [last accessed: Sept 29, 2013].

[3] Berkeley DB. URL:http://www.oracle.com/technetwork/products/berkeleydb/overview/index.html [last accessed: Oct 2, 2013].

[4] Berkeley DB vs. SQLite DB. URL:http://db-engines.com/en/system/Berkeley+DB%3BSQLite [last accessed: Oct 2, 2013].

[5] Limitation of mobile device. URL:http://www.wireless-center.net/Mobile-and-Wireless/2505.html [last accessed: Oct 2, 2013].

[6] SQLite. URL:https://www.sqlite.org/ [last accessed: Sept 29, 2013].

[7] Structure of AndroBench measure tab. URL:http://www.androbench.org/wiki/File:Androbench_bench.png [last accessed: Oct 2, 2013].

[8] Structure of AndroBench setting tab. URL:http://www.androbench.org/wiki/File:Androbench_setting.png [last accessed: Oct 2, 2013].

[9] Synchronous IO vs. buffered IO. URL:http://stackoverflow.com/questions/1450551/buffered-i-o-vs-unbuffered-io [last accessed: Oct 2, 2013].

[10] Wikipedia, the free encyclopedia. Berkeley DB. URL:http://en.wikipedia.org/wiki/Berkeley_DB [last accessed: Sept 29, 2013].

[11] Hyojun Kim, Nitin Agrawal, and Cristian Ungureanu. Revisiting storage for smartphones. TOS, 8(4):14, 2012. URL:http://doi.acm.org/10.1145/2385603.2385607 [last accessed: Aug 30, 2013].

[12] Je-Min Kim and Jin-Soo Kim. AndroBench: Benchmarking the storage performance of Android-based mobile devices. In Sabo Sambath and Egui Zhu, editors, ICFCE, volume 133 of Advances in Soft Computing, pages 667–674. Springer, 2011. URL:http://csl.skku.edu/papers/fce12a.pdf [last accessed: Aug 30, 2013].

[13] McObject. Perst DB. URL:http://www.mcobject.com/ [last accessed: Sept 29, 2013].

[14] Michael A. Olson, Keith Bostic, and Margo Seltzer. Berkeley DB. Proceedings of the FREENIX Track: 1999 USENIX Annual Technical Conference, 1999. URL:http://static.usenix.org/event/usenix99/full_papers/olson/olson.pdf [last accessed: Sept 29, 2013].

[15] Best of open source in storage. IOzone filesystem benchmark. URL:http://www.iozone.org/docs/IOzone_msword_98.pdf [last accessed: Sept 29, 2013].

[16] Sooman Jeong, Kisung Lee, Jungwoo Hwang, Seongjin Lee, and Youjip Won. AndroStep: Android storage performance analysis tool. 2013. URL:http://www.esos.hanyang.ac.kr/files/publication/conferences/international/AndroStep.pdf [last accessed: Sept 29, 2013].

[17] Woong Choi, Sung Kyu Park, Seong Min Kim, Min Kyu Maeng, Ki-Woong Park, and Kyu Ho Park. P-SQLITE: PRAM-Based Mobile DBMS for Write Performance Enhancement. 2013.

[18] Dr. Dobb's. McObject releases Perst vs. SQLite benchmark on Android. URL:http://www.drdobbs.com/database/mcobject-releases-perst-vs-sqlite-benchm/205207102 [last accessed: Sept 29, 2013].
