A Study on Real-Time Database Technology and Its Applications

Total Page:16

File Type:pdf, Size:1020Kb

A Study on Real-Time Database Technology and Its Applications Eastern Illinois University The Keep Masters Theses Student Theses & Publications Summer 2020 A Study on Real-Time Database Technology and Its Applications Geethmi Nimantha Dissanayake Eastern Illinois University Follow this and additional works at: https://thekeep.eiu.edu/theses Part of the Databases and Information Systems Commons, and the Data Science Commons Recommended Citation Dissanayake, Geethmi Nimantha, "A Study on Real-Time Database Technology and Its Applications" (2020). Masters Theses. 4822. https://thekeep.eiu.edu/theses/4822 This Dissertation/Thesis is brought to you for free and open access by the Student Theses & Publications at The Keep. It has been accepted for inclusion in Masters Theses by an authorized administrator of The Keep. For more information, please contact [email protected]. A STUDY ON REAL-TIME DATABASE TECHNOLOGY AND ITS APPLICATIONS A Study on Real-Time Database Technology and Its Applications Geethmi Nimantha Dissanayake Eastern Illinois University Dr. Peter Ping Liu Dr. Jerry Cloward Dr. Wutthigrai Boonsuk A STUDY ON REAL-TIME DATABASE TECHNOLOGY AND ITS APPLICATIONS Table of Contents 1.Introduction..............................................................................................................................6 1.1 Purpose .........................................................................................................................7 1.2 Statement of Problem ....................................................................................................8 1.3 Limitation .....................................................................................................................9 1.4 Delimitation ..................................................................................................................9 2. Literature Review .................................................................................................................. 11 2.1 Evolution of real-time databases.................................................................................. 11 2.2 Early misconceptions about real-time databases .......................................................... 12 2.3 Types of Storage Models in Real-time Databases ........................................................ 13 2.4 Performance Evaluation of Real-Time Databases ........................................................ 18 2.5 Trending Usage and Popularity of Real-time databases ............................................... 21 2.6 Ridesharing concept .................................................................................................... 25 3. Research Methodology .......................................................................................................... 30 3.1 System Setup .............................................................................................................. 30 3.2 Design of the Study ..................................................................................................... 30 3.2.1 Feature comparison .............................................................................................. 31 3.2.2 Case Study ........................................................................................................... 33 3.2.3 Ride share application .......................................................................................... 33 4. Feature Comparison .............................................................................................................. 35 A STUDY ON REAL-TIME DATABASE TECHNOLOGY AND ITS APPLICATIONS 4.1 Google Firebase : A key-value based store .................................................................. 35 4.2 MongoDB : A document-based store........................................................................... 37 4.3 Cassandra: A column-based store ................................................................................ 39 4.4 Orient DB : A graph based store .................................................................................. 41 5. Case Study ............................................................................................................................ 42 6. Results and Discussion .......................................................................................................... 50 7. Conclusion ............................................................................................................................ 60 8. Future Work .......................................................................................................................... 61 References ................................................................................................................................ 63 A STUDY ON REAL-TIME DATABASE TECHNOLOGY AND ITS APPLICATIONS Table of figures Figure 1. Illustration of a key-value model ................................................................................ 14 Figure 2. Illustration of a document based database model ........................................................ 15 Figure 3. Illustration of a document (column-based?) database model ...................................... 17 Figure 4. Illustration of a graph database model ........................................................................ 18 Figure 5.Proposed database comparison framework (Kumar, Srividya, & Mohanavalli 2017) ... 20 Figure 6. Most important metrics tracked for database performance (2019) ............................... 21 Figure 7. Most popular databases (2019) ................................................................................... 22 Figure 8. Database engines ranking (2019) ................................................................................ 23 Figure 9. SQL & NoSQL Multiple Database Combinations (2019) ........................................... 24 Figure 10. Most popular multiple database type combinations (2019) ........................................ 25 Figure 11: Ridesharing patterns (Furuhata, et al.,2013) .............................................................. 27 Figure 12: Ride hailing popularity (Iqbal, 2020) ........................................................................ 28 Figure 13. Uber’s hypergrowth (Brickell, 2019) ........................................................................ 29 Figure 14: Key-value structure .................................................................................................. 35 Figure 15: Key-value example ................................................................................................... 35 Figure 16: Document based store .............................................................................................. 37 Figure 17: MongoDB example .................................................................................................. 38 Figure 18: Column based model ................................................................................................ 39 Figure 19:Cassandra example .................................................................................................... 40 Figure 20: Ridesharing .............................................................................................................. 42 Figure 21: Rideshare flow ......................................................................................................... 43 Figure 22: ER model for the ridesharing platform...................................................................... 44 A STUDY ON REAL-TIME DATABASE TECHNOLOGY AND ITS APPLICATIONS Figure 23: Class diagram for the Ride share application ............................................................ 46 A STUDY ON REAL-TIME DATABASE TECHNOLOGY AND ITS APPLICATIONS 1.Introduction The advancement of popular technological appliances such as mobile phones and GPS trackers have increased the demand for real-time applications. These applications usually require up-to-the-second information by the users as well as the large volume of data to be processed along the speed of the actual application. These computing systems must react within precise time constraints while responding to the events in the operating environment (Buttazzo, 1997). The databases for these applications are expected to facilitate instant and massive data storage, instant access and delivery. Cloud computing is a powerful technology to perform massive-scale and complex computing. It can be a part of the solutions where a massive amount of data and instant access are required such as in mobile computing applications. It eliminates the need to maintain expensive computing hardware, dedicated space, and software. Massive growth in the scale of data or big data generated through cloud computing has been observed (Hashem, et al., 2015). Big data technology and the real-time databases were introduced as a solution in the emergence of larger and complex data sets, to offer better solutions and features to handle massive data sets with higher performance. Oracle defines big data as data that contains greater variety arriving in increasing volumes and with ever-higher velocity (Oracle Corp.). It helps organizations to integrate, manage and analyze data originated from various sources. Use of big data technology can be highly beneficial for a business, because of its ability to analyze the behaviors inside an organization, its actors and use those analyses to make important predictions. The traditional relational database model was no longer capable of fulfilling all above
Recommended publications
  • Beyond Relational Databases
    EXPERT ANALYSIS BY MARCOS ALBE, SUPPORT ENGINEER, PERCONA Beyond Relational Databases: A Focus on Redis, MongoDB, and ClickHouse Many of us use and love relational databases… until we try and use them for purposes which aren’t their strong point. Queues, caches, catalogs, unstructured data, counters, and many other use cases, can be solved with relational databases, but are better served by alternative options. In this expert analysis, we examine the goals, pros and cons, and the good and bad use cases of the most popular alternatives on the market, and look into some modern open source implementations. Beyond Relational Databases Developers frequently choose the backend store for the applications they produce. Amidst dozens of options, buzzwords, industry preferences, and vendor offers, it’s not always easy to make the right choice… Even with a map! !# O# d# "# a# `# @R*7-# @94FA6)6 =F(*I-76#A4+)74/*2(:# ( JA$:+49>)# &-)6+16F-# (M#@E61>-#W6e6# &6EH#;)7-6<+# &6EH# J(7)(:X(78+# !"#$%&'( S-76I6)6#'4+)-:-7# A((E-N# ##@E61>-#;E678# ;)762(# .01.%2%+'.('.$%,3( @E61>-#;(F7# D((9F-#=F(*I## =(:c*-:)U@E61>-#W6e6# @F2+16F-# G*/(F-# @Q;# $%&## @R*7-## A6)6S(77-:)U@E61>-#@E-N# K4E-F4:-A%# A6)6E7(1# %49$:+49>)+# @E61>-#'*1-:-# @E61>-#;6<R6# L&H# A6)6#'68-# $%&#@:6F521+#M(7#@E61>-#;E678# .761F-#;)7-6<#LNEF(7-7# S-76I6)6#=F(*I# A6)6/7418+# @ !"#$%&'( ;H=JO# ;(\X67-#@D# M(7#J6I((E# .761F-#%49#A6)6#=F(*I# @ )*&+',"-.%/( S$%=.#;)7-6<%6+-# =F(*I-76# LF6+21+-671># ;G';)7-6<# LF6+21#[(*:I# @E61>-#;"# @E61>-#;)(7<# H618+E61-# *&'+,"#$%&'$#( .761F-#%49#A6)6#@EEF46:1-#
    [Show full text]
  • CWIC: Using Images to Passively Browse the Web
    CWIC: Using Images to Passively Browse the Web Quasedra Y. Brown D. Scott McCrickard Computer Information Systems College of Computing and GVU Center Clark Atlanta University Georgia Institute of Technology Atlanta, GA 30314 Atlanta, GA 30332 [email protected] [email protected] Advisor: John Stasko ([email protected]) Abstract The World Wide Web has emerged as one of the most widely available and diverse information sources of all time, yet the options for accessing this information are limited. This paper introduces CWIC, the Continuous Web Image Collector, a system that automatically traverses selected Web sites collecting and analyzing images, then presents them to the user using one of a variety of available display mechanisms and layouts. The CWIC mechanisms were chosen because they present the images in a non-intrusive method: the goal is to allow users to stay abreast of Web information while continuing with more important tasks. The various display layouts allow the user to select one that will provide them with little interruptions as possible yet will be aesthetically pleasing. 1 1. Introduction The World Wide Web has emerged as one of the most widely available and diverse information sources of all time. The Web is essentially a multimedia database containing text, graphics, and more on an endless variety of topics. Yet the options for accessing this information are somewhat limited. Browsers are fine for surfing, and search engines and Web starting points can guide users to interesting sites, but these are all active activities that demand the full attention of the user. This paper introduces CWIC, a passive-browsing tool that presents Web information in a non-intrusive and aesthetically pleasing manner.
    [Show full text]
  • Concept-Oriented Programming: References, Classes and Inheritance Revisited
    Concept-Oriented Programming: References, Classes and Inheritance Revisited Alexandr Savinov Database Technology Group, Technische Universität Dresden, Germany http://conceptoriented.org Abstract representation and access is supposed to be provided by the translator. Any object is guaranteed to get some kind The main goal of concept-oriented programming (COP) is of primitive reference and a built-in access procedure describing how objects are represented and accessed. It without a possibility to change them. Thus there is a makes references (object locations) first-class elements of strong asymmetry between the role of objects and refer- the program responsible for many important functions ences in OOP: objects are intended to implement domain- which are difficult to model via objects. COP rethinks and specific structure and behavior while references have a generalizes such primary notions of object-orientation as primitive form and are not modeled by the programmer. class and inheritance by introducing a novel construct, Programming means describing objects but not refer- concept, and a new relation, inclusion. An advantage is ences. that using only a few basic notions we are able to describe One reason for this asymmetry is that there is a very many general patterns of thoughts currently belonging to old and very strong belief that it is entity that should be in different programming paradigms: modeling object hier- the focus of modeling (including programming) while archies (prototype-based programming), precedence of identities simply serve entities. And even though object parent methods over child methods (inner methods in identity has been considered “a pillar of object orienta- Beta), modularizing cross-cutting concerns (aspect- tion” [Ken91] their support has always been much weaker oriented programming), value-orientation (functional pro- in comparison to that of entities.
    [Show full text]
  • Technical Manual Version 8
    HANDLE.NET (Ver. 8) Technical Manual HANDLE.NET (version 8) Technical Manual Version 8 Corporation for National Research Initiatives June 2015 hdl:4263537/5043 1 HANDLE.NET (Ver. 8) Technical Manual HANDLE.NET 8 software is subject to the terms of the Handle System Public License (version 2). Please read the license: http://hdl.handle.net/4263537/5030. A Handle System Service ​ ​ Agreement is required in order to provide identifier/resolution services using the Handle System technology. Please read the Service Agreement: http://hdl.handle.net/4263537/5029. ​ ​ © Corporation for National Research Initiatives, 2015, All Rights Reserved. Credits CNRI wishes to thank the prototyping team of the Los Alamos National Laboratory Research Library for their collaboration in the deployment and testing of a very large Handle System implementation, leading to new designs for high performance handle administration, and handle users at the Max Planck Institute for Psycholinguistics and Lund University Libraries ​ ​ ​ NetLab for their instructions for using PostgreSQL as custom handle storage. ​ Version Note Handle System Version 8, released in June 2015, constituted a major upgrade to the Handle System. Major improvements include a RESTful JSON-based HTTP API, a browser-based admin client, an extension framework allowing Java Servlet apps, authentication using handle identities without specific indexes, multi-primary replication, security improvements, and a variety of tool and configuration improvements. See the Version 8 Release Notes for details. Please send questions or comments to the Handle System Administrator at [email protected]. 2 HANDLE.NET (Ver. 8) Technical Manual Table of Contents Credits Version Note 1 Introduction 1.1 Handle Syntax 1.2 Architecture 1.2.1 Scalability 1.2.2 Storage 1.2.3 Performance 1.3 Authentication 1.3.1 Types of Authentication 1.3.2 Certification 1.3.3 Sessions 1.3.4 Algorithms TODO 3 HANDLE.NET (Ver.
    [Show full text]
  • Preview Orientdb Tutorial
    OrientDB OrientDB About the Tutorial OrientDB is an Open Source NoSQL Database Management System, which contains the features of traditional DBMS along with the new features of both Document and Graph DBMS. It is written in Java and is amazingly fast. It can store 220,000 records per second on commodity hardware. In the following chapters of this tutorial, we will look closely at OrientDB, one of the best open-source, multi-model, next generation NoSQL product. Audience This tutorial is designed for software professionals who are willing to learn NoSQL Database in simple and easy steps. This tutorial will give a great understanding on OrientDB concepts. Prerequisites OrientDB is NoSQL Database technologies which deals with the Documents, Graphs and traditional database components, like Schema and relation. Thus it is better to have knowledge of SQL. Familiarity with NoSQL is an added advantage. Disclaimer & Copyright Copyright 2018 by Tutorials Point (I) Pvt. Ltd. All the content and graphics published in this e-book are the property of Tutorials Point (I) Pvt. Ltd. The user of this e-book is prohibited to reuse, retain, copy, distribute or republish any contents or a part of contents of this e-book in any manner without written consent of the publisher. We strive to update the contents of our website and tutorials as timely and as precisely as possible, however, the contents may contain inaccuracies or errors. Tutorials Point (I) Pvt. Ltd. provides no guarantee regarding the accuracy, timeliness or completeness of our website or its contents including this tutorial. If you discover any errors on our website or in this tutorial, please notify us at [email protected].
    [Show full text]
  • Pointers Getting a Handle on Data
    3RLQWHUV *HWWLQJ D +DQGOH RQ 'DWD Content provided in partnership with Prentice Hall PTR, from the book C++ Without Fear: A Beginner’s Guide That Makes You Feel Smart, 1/e by Brian Overlandà 6 Perhaps more than anything else, the C-based languages (of which C++ is a proud member) are characterized by the use of pointers—variables that store memory addresses. The subject sometimes gets a reputation for being difficult for beginners. It seems to some people that pointers are part of a plot by experienced programmers to wreak vengeance on the rest of us (because they didn’t get chosen for basketball, invited to the prom, or whatever). But a pointer is just another way of referring to data. There’s nothing mysterious about pointers; you will always succeed in using them if you follow simple, specific steps. Of course, you may find yourself baffled (as I once did) that you need to go through these steps at all—why not just use simple variable names in all cases? True enough, the use of simple variables to refer to data is often sufficient. But sometimes, programs have special needs. What I hope to show in this chapter is why the extra steps involved in pointer use are often justified. The Concept of Pointer The simplest programs consist of one function (main) that doesn’t interact with the network or operating system. With such programs, you can probably go the rest of your life without pointers. But when you write programs with multiple functions, you may need one func- tion to hand off a data address to another.
    [Show full text]
  • Database Solutions on AWS
    Database Solutions on AWS Leveraging ISV AWS Marketplace Solutions November 2016 Database Solutions on AWS Nov 2016 Table of Contents Introduction......................................................................................................................................3 Operational Data Stores and Real Time Data Synchronization...........................................................5 Data Warehousing............................................................................................................................7 Data Lakes and Analytics Environments............................................................................................8 Application and Reporting Data Stores..............................................................................................9 Conclusion......................................................................................................................................10 Page 2 of 10 Database Solutions on AWS Nov 2016 Introduction Amazon Web Services has a number of database solutions for developers. An important choice that developers make is whether or not they are looking for a managed database or if they would prefer to operate their own database. In terms of managed databases, you can run managed relational databases like Amazon RDS which offers a choice of MySQL, Oracle, SQL Server, PostgreSQL, Amazon Aurora, or MariaDB database engines, scale compute and storage, Multi-AZ availability, and Read Replicas. You can also run managed NoSQL databases like Amazon DynamoDB
    [Show full text]
  • Oracle Nosql Database EE Data Sheet
    Oracle NoSQL Database 21.1 Enterprise Edition (EE) Oracle NoSQL Database is a multi-model, multi-region, multi-cloud, active-active KEY BUSINESS BENEFITS database, designed to provide a highly-available, scalable, performant, flexible, High throughput and reliable data management solution to meet today’s most demanding Bounded latency workloads. It can be deployed in on-premise data centers and cloud. It is well- Linear scalability suited for high volume and velocity workloads, like Internet of Things, 360- High availability degree customer view, online contextual advertising, fraud detection, mobile Fast and easy deployment application, user personalization, and online gaming. Developers can use a single Smart topology management application interface to quickly build applications that run in on-premise and Online elastic configuration cloud environments. Multi-region data replication Enterprise grade software Applications send network requests against an Oracle NoSQL data store to and support perform database operations. With multi-region tables, data can be globally distributed and automatically replicated in real-time across different regions. Data can be modeled as fixed-schema tables, documents, key-value pairs, and large objects. Different data models interoperate with each other through a single programming interface. Oracle NoSQL Database is a sharded, shared-nothing system which distributes data uniformly across multiple shards in a NoSQL database cluster, based on the hashed value of the primary keys. An Oracle NoSQL Database data store is a collection of storage nodes, each of which hosts one or more replication nodes. Data is automatically populated across these replication nodes by internal replication mechanisms to ensure high availability and rapid failover in the event of a storage node failure.
    [Show full text]
  • Object Databases As Data Stores for High Energy Physics
    OBJECT DATABASES AS DATA STORES FOR HIGH ENERGY PHYSICS Dirk Düllmann CERN IT/ASD & RD45, Geneva, Switzerland Abstract Starting from 2005, the LHC experiments will generate an unprecedented amount of data. Some 100 Peta-Bytes of event, calibration and analysis data will be stored and have to be analysed in a world-wide distributed environment. At CERN the RD45 project has been set-up in 1995 to investigate different approaches to solve the data storage problems at LHC. The focus of RD45 soon moved to the use of Object Database Management Systems (ODBMS) as a central component. This paper gives an overview of the main advantages of ODBMS systems for HEP data stores. Several prototype and production applications will be discussed and a summary of the current use of ODBMS based systems in HEP will be presented. The second part will concentrate on physics data analysis based on an ODBMS. 1 INTRODUCTION 1.1 Data Management at LHC The new experiments at the Large Hadron Collider (LHC) at CERN will gather an unprecedented amount of data. Starting from 2005 each of the four LHC experiments ALICE, ATLAS, CMS and LHCb will measure of the order of 1 Peta Byte (1015 Bytes) per year. All together the experiments will store and repeatedly analyse some 100 PB of data during their lifetimes. Such an enormous task can only be accomplished by large international collaborations. Thousands of physicists from hundreds of institutes world-wide will participate. This also implies that nearly any available hardware platform will be used resulting in a truly heterogeneous and distributed system.
    [Show full text]
  • The Java API Quick Reference You Can Create Both Local (That Is, Without Using a Remote Server) and Remote Database with the Java API
    The Java API quick reference You can create both local (that is, without using a remote server) and remote database with the Java API. Each kind of database has a specific related class, but they expose the same interface: • To create a local document database, use the ODatabaseDocumentTx class: ODatabaseDocumentTx db = new ODatabaseDocumentTx ("local:<path>/<db-name>").create(); • To create a local graph database, use the OGraphDatabase class: OGraphDatabase db = new GraphDatabase("local:<path>/<db-name>"). create(); • To create a local object database, use the OObjectDatabaseTx class: OGraphDatabase db = new GraphDatabase("local:<path>/<db-name>"). create(); • To create a remote database: new OServerAdmin("remote:<db-host>").connect(<root- username>,<root-password>).createDatabase(<db-name>,<db- type>,<storage-type>).close(); • To drop a remote database: new OServerAdmin("remote:<db-host>/<db-name>").connect(<root- username>,<root-password>).dropDatabase(); Where: • path: This specifies the path where you wish to create the new database. • db-host: This is the remote host. It could be an IP address or the name of the host. • root-user: This is the root username as defined in the server config file. • root-password: This is the root password as defined in the server config file. • db-name: This is the name of the database. • db-type: This specifies the type of database. It can be "document" or "graph". • storage-type: This specifies the storage type. It can be local or memory. local means that the database will be persistent, while memory means that the database will be volatile. Appendix Open and close connections To open a connection, the open API is available: <database>(<db>).open(<username>,<password>); Where: • database: This is a database class.
    [Show full text]
  • Database Performance Study
    Database Performance Study By Francois Charvet Ashish Pande Please do not copy, reproduce or distribute without the explicit consent of the authors. Table of Contents Part A: Findings and Conclusions…………p3 Scope and Methodology……………………………….p4 Database Performance - Background………………….p4 Factors…………………………………………………p5 Performance Monitoring………………………………p6 Solving Performance Issues…………………...............p7 Organizational aspects……………………...................p11 Training………………………………………………..p14 Type of database environments and normalizations…..p14 Part B: Interview Summaries………………p20 Appendices…………………………………...p44 Database Performance Study 2 Part A Findings and Conclusions Database Performance Study 3 Scope and Methodology The authors were contacted by a faculty member to conduct a research project for the IS Board at UMSL. While the original proposal would be to assist in SQL optimization and tuning at Company A1, the scope of such project would be very time-consuming and require specific expertise within the research team. The scope of the current project was therefore revised, and the new project would consist in determining the current standards and practices in the field of SQL and database optimization in a number of companies represented on the board. Conclusions would be based on a series of interviews with Database Administrators (DBA’s) from the different companies, and on current literature about the topic. The first meeting took place 6th February 2003, and interviews were held mainly throughout the spring Semester 2003. Results would be presented in a final paper, and a presentation would also be held at the end of the project. Individual summaries of the interviews conducted with the DBA’s are included in Part B. A representative set of questions used during the interviews is also included.
    [Show full text]
  • (UCP) in a Tomcat Application Container Requires the Following Steps
    Design and Deploy TomCat AppliCations for Planned, Unplanned Database Downtimes and Runtime Load BalanCing with UCP In OraCle Database RAC and Active Data Guard environments ORACLE WHITE PAPER NOVEMBER 2017 DESIGN AND DEPLOY TOMCAT SERVLETS FOR PLANNED AND UNPLANNED DATABASE DOWNTIME AND LOAD BALANCING WITH UCP Table of Contents IntroduCtion 3 Issues to be addressed 3 OraCle Database 12C High-Availability and Load BalanCing ConCepts 4 Configure TomCat for UCP 4 Create a New Database Resource 4 Create a JNDI lookup in the servlet 5 Create a web.xml for the servlet 6 Hiding Planned MaintenanCe from TomCat Servlets 6 Web Applications Steps 6 DBA or RDBMS Steps 7 Hiding Unplanned Database downtime from TomCat Servlets 9 Developer or Web Application Steps 9 DBA or RDBMS Steps 10 Application Continuity Checklist 10 Runtime Load BalanCing (RLB) with TomCat Servlets 12 Web AppliCation steps 12 Appendix 13 Enable JDBC & UCP logging for debugging 13 ConClusion 14 2 DESIGN AND DEPLOY TOMCAT SERVLETS FOR PLANNED AND UNPLANNED DATABASE DOWNTIME AND LOAD BALANCING WITH UCP IntroduCtion Achieving maximum appliCation uptime without interruptions is a CritiCal business requirement. There are a number of requirements such as outage detection, transparent planned maintenanCe, and work load balanCing that influenCe appliCation availability and performance. The purpose of this paper is to help Java Web applications deployed with ApaChe TomCat, achieve maximum availability and sCalability when using Oracle. Are you looking for best praCtiCes to hide your tomCat applications from database outages? Are you looking at, smooth & stress-free maintenanCes of your tomCat appliCations? Are you looking at leveraging Oracle Database’s runtime load balancing in your tomCat web appliCations? This paper Covers the configuration of your database and tomCat servlets for resilienCy to planned maintenanCe, unplanned database outages, and dynamic balancing of the workload across 1 database instances, using RAC, ADG, GDS , and UCP.
    [Show full text]