Comparative Study on Data Searching in Linked List & B-Tree

COMPARATIVE STUDY ON DATA SEARCHING IN LINKED LIST & B-TREE AND B+TREE TECHNIQUES AHMED ESHTEWI S GIUMA A dissertation submitted in partial fulfillment of the requirement for the award of the Degree of Master of Computer Science (Software Engineering) The Department of Software Engineering Faculty of Computer Science and Information Technology Universiti Tun Hussein Onn Malaysia MARCH 2015 v ABSTRACT There are many methods of searching large amount of data to find one particular piece of information. Such as finding the name of a person in a mobile phone record. Certain methods of organizing data make the search process more efficient. The objective of these methods is to find the element with the least time. In this study, the focus is on time of search in large databases, which is considered an important factor in the success of the search. The goal is choosing the appropriate search techniques to test the time of access to data in the database and what is the ratio difference between them. Three search techniques are used in this work namely; linked list, B- tree, and B+ tree. A comparison analysis is conducted using five case databases studies. Experimental results reveal that after the average times for each search algorithms on the databases have been recorded, the linked list requires lots of time during search process, with B+ tree producing significantly low times. Based on these results, it is clear that searching in B- tree is faster than linked list at a ratio of (1: 5). The searching time in a B+ tree is faster than B- tree at the ratio of (1: 2). The searching time in a B+ tree is faster than linked list at the ratio of (1: 8). With that, it can be concluded that B+ tree is the fastest technique for data access. vi ABSTRAK Terdapat banyak kaedah dalam pencarian suatu maklumat dari satu kumpulan data yang banyak. Contohnya seperti mencari nama dalam telefon bimbit. Sestengah kaedah menguruskan data bagi menjadikan proses pencarian lebih efisien. Objektif kaedah yang dibincangan adalah untuk mencari data dengan cepat. Dalam kajian ini, tumpuan kajian adalah pada masa carian dalam pengkalan data yang besar dimana ia adalah satu factor penting dalam menentukan kejayaan dalam carian. Matlamatnya adalah memilih teknik yang paling sesuai dalam carian data didalam pengkalan data dan perbandingan dalam peratus masa capaian diantara teknik teknik tersebut. Tiga jenis carian dikaji iaitu linked list, B-tree dan B+ tree satu analisa perbandingan dibuat dengan menggunakan lima kajian kes. Hasil kajian telah laporkan dimana linked list memerlukan banyak masa dalam carian berbanding B+ tree. Berdasarkan keputusan ini telah menunjukkan carian dalam B- tree adalah pantas berbanding linked list dengan kadar (1:5). Carian masa dalam B+ tree adalah lebih baik berbanding linked list dengan kadar (1:2). Sementara itu carian masa dalam B+ tree adalah lebih laju berbanding linked list dengan nisbah (1:8). Dengan itu, dapatlah dirumuskan B+ tree adalah teknik yang paling laju dalam capaian data. vii CONTENTS TITLE i DECLARATION ii DEDICATION iii ACKNOWLEDGEMENT iv ABSTRACT v ABSTRAK vi CONTENTS vii LIST OF TABLES xiii LIST OF FIGURES xv CHAPTER 1 INTRODUCTION 1 1.1 Background 1 1.2 Problem Statement 2 1.3 Project Objectives 3 1.3 Project Scope 3 1.4 Outline of the Report 3 CHAPTER 2 LITERATURE REVIEW 5 2.1 Introductions 5 2.2 Data Structure of Linked Lists 6 2.2.1 Searching in Linked List 7 2.2.2 Advantages and Disadvantages for Linked List 8 viii 2.2.2.1 Advantages 8 2.2.2.2 Disadvantages 9 2.2.3 Implementing Linked Lists 9 2.3 Data Structure of B-Tree 10 2.3.1 Advantages and Disadvantages for B-Tree 11 2.3.1.1Advantages 11 2.3.1.2 Disadvantages 12 2.3.2 Implementing B-Tree 12 2.3.3 Searching in B-Tree 12 2.4 Data Structure of B+Tree 13 2.4.1 Advantages and Disadvantages for B+Tree 14 2.4.1.1 Advantages 14 2.4.1.2 Disadvantages 14 2.4.2 Implementing B+ Tree 14 2.4.3 Searching in B+ Tree 15 2.5 Related Work 16 2.6 Chapter Summary 18 CHAPTER 3 RESEARCH METHODOLOGY 19 3.1 Introduction 19 3.2 The Proposed Methodology for Comparative Study on Database Speed Searching 20 3.2.1 Load of Data from the Source 21 3.2.3 Load Data to B-Tree 22 3.2.3 Calculate Searching Time Using B-Tree 22 3.2.4 Load Data to Linked List 22 3.2.5 Calculate Searching Time Using Linked List 22 3.2.6 Load Data to B+Tree 23 ix 3.2.7 Calculate Searching Time Using B+tree 23 3.2.8 Comparative Study between Linked List & B- Tree and B+ Tree 23 3.2.9 Calculate the Result 23 3.3 Performance Measure 23 3.4 Chapter Summary 24 CHAPTER 4 IMPLEMENTATION AND DISCUSSIONS OF RESULTS 25 4.1 Introduction 25 4.1.1 The Complexity 26 4.1.1.1 The Complexity of Using the Link List Algorithm 26 4.1.1.2 The Complexity of Using the B-Tree Algorithm 29 4.1.1.3 The Complexity of Using the B+Tree Algorithm 31 4.1.2 The Inputs 32 4.1.3 The Queries 33 4.2 Data Source 33 4.2.1 Data Source for First Case Study ( The World ) 34 4.2.2 Data Source for Second Case Study(Employees) 35 4.2.3 Data Source for Third Case Study(Immigration) 36 4.2.4 Data Source for Fourth Case Study (Libyana) 38 4.2.5 Data Source for Fifth Case Study (Staff) 39 4.2.6 The Size Difference between the Databases 40 4.3 Load of Data from the Source 41 4.3.1 Load of World Database from the Source 41 4.3.2 Load of Employees Database from the Source 42 x 4.3.3 Load of Immigration Database from the Source 44 4.3.4 Load of Libyana Database from the Source 45 4.3.5 Load of Staff Database from the Source 47 4.4 Calculate Searching Time Using World Database 48 4.4.1 Calculate Searching Time Using Linked List 48 4.4.2 Calculate Searching Time Using B-Tree 50 4.4.3 Calculate Searching Time Using B+Tree 52 4.5 Calculate Searching Time Using Employees Database 53 4.5.1 Calculate Searching Time Using Linked List 53 4.5.2 Calculate Searching Time Using B-Tree 55 4.5.3 Calculate Searching Time Using B+Tree 57 4.6 Calculate Searching Time Using Immigration Database 58 4.6.1 Calculate Searching Time Using Linked List 58 4.6.2 Calculate Searching Time Using B-Tree 60 4.6.3 Calculate Searching Time Using B+Tree 62 4.7 Calculate Searching Time Using Libyana Database 63 4.7.1 Calculate Searching Time Using Linked List 63 4.7.2 Calculate Searching Time Using B-Tree 65 4.7.3 Calculate Searching Time Using B+Tree 67 4.8 Calculate Searching Time Using Staff Database 68 4.8.1 Calculate Searching Time Using Linked List 68 4.8.2 Calculate Searching Time Using B-Tree 70 4.8.3 Calculate Searching Time Using B+Tree 72 xi 4.9 Results Discussions 73 4.9.1 Search Time Using World Database Case Study 74 4.9.1.1 The Application of Linked List on the World Database 74 4.9.1.2 The Application of B-Tree on the World Database 75 4.9.1.3 The Application of B+Tree on the World Database 75 4.9.2 Analysis of the Results of the Application of Algorithms on World Database 76 4.9.3 Search Time Using Employees Database Case Study 78 4.9.3.1 The Application of Linked List on the Employees Database 78 4.9.3.2 The Application of B-Tree on the Employees Database 78 4.9.3.3 The Application of B+Tree on the Employees Database 79 4.9.4 Analysis of the Results of the Application of Algorithms on Employees Database 80 4.9.5 Search Time Using Immigration Database Case Study 81 4.9.5.1 The Application of Linked List on the Immigration Database 81 4.9.5.2 The Application of B-Tree on the Immigration Database 82 4.9.5.3 The Application of B+Tree on the Immigration Database 83 4.9.6 Analysis of the Results of the Application of Algorithms on Immigration Database 83 xii 4.9.7 Search Time Using Libyana Database Case Study 85 4.9.7.1 The Application of Linked List on the Libyana Database 85 4.9.7.2 The Application of B-Tree on the Libyana Database 86 4.9.7.3 The Application of B+Tree on the Libyana Database 87 4.9.8 Analysis of the Results of the Application of Algorithms on Libyana Database 87 4.9.9 Search Time Using Case Study Staff Database 89 4.9.9.1 The Application of Linked List on the Staff Database 89 4.9.9.2 The Application of B-Tree on the Staff Database 90 4.9.9.3 The Application of B+Tree on the Staff Database 91 4.9.10 Analysis of the Results of the Application of Algorithms on Staff Database 91 4.11 Chapter Summary 93 CHAPTER 5 CONCLUSIONS 94 5.1 Objectives Achievement 94 5.2 Conclusion 94 5.3 Future Work 95 REFERENCES 96 APPENDIX 98 VITA 156 xiii LIST OF TABLES 4.1 Complexity Connected to the Database and Load Data 27 4.2 Complexity of Linked List Algorithm 28 4.3 Complexity Connected to the Database and Load Data 29 4.4 Complexity of B-Tree Algorithm 30 4.5 Complexity Connected to the Database and Load Data 31 4.6 Complexity of B+ Tree Algorithm 32 4.7 Specifications for the Database World 35 4.8 Specifications for the Database Employees 36 4.9 Specifications for the Database Immigration 37 4.10 Specifications for the Database Libyana 38 4.11 Specifications for the Database Staff 39 4.12 The Size of the Databases 40 4.13 Application of Linked List on the World Database Results 74 4.14 Application of B-Tree on the World Database Results 75 4.15 Application of B+Tree on the World Database Results 75 4.16 The Results of the Application of Algorithms on World Database 76 4.17 Application of Linked List on the Employees Database Results 78 4.18 Application of B-Tree on the Employees Database Results 79 4.19 Application of B+Tree on the Employees Database Results 79 4.20 The Results of the Application of Algorithms on Employees Database 80 4.21 Application of Linked List on the Immigration Database Results 82 4.22 Application of B-Tree on the Immigration Database Results 82 4.23

Comparative Study on Data Searching in Linked List & B-Tree

Application of TRIE Data Structure and Corresponding Associative Algorithms for Process Optimization in GRID Environment

CS 106X, Lecture 21 Tries; Graphs

Introduction to Linked List: Review

University of Cape Town Declaration

Linux Kernel and Driver Development Training Slides

Implementing the Map ADT Outline

Algorithms & Data Structures: Questions and Answers: May 2006

Highly Parallel Fast KD-Tree Construction for Interactive Ray Tracing of Dynamic Scenes

Singly-Linked Listslists

The TREE LIST – Introducing a Data Structure

Algorithm for Character Recognition Based on the Trie Structure

B+ TREES Examples