International Journal of Advanced Information Technologies (IJAIT), Vol. 12, No.2

NoSQL Selection - A Case Study of an Enterprise
Jeang-Kuo Chen1*, Wei-Zhe Lee2
Dept. Information Management, Chaoyang University of Technology, Taichung, Taiwan
1 [email protected] 2 [email protected]

Abstract: From the 1940s to the present, several database models have been developed, and the relational database model is the one most used by businesses. The storage of enterprise data is changing more and more because of the popularization of big data collection technology and its application fields. Users always expect to access a database as fast as possible, but the speed of a relational database is limited by massive join operations. More and more enterprises have therefore changed to NoSQL databases to store data. However, there are more than 200 kinds of NoSQL databases. How to choose an appropriate NoSQL database is an important issue because it affects the performance of enterprise operations. In this paper, we analyze some of the NoSQL database models. In addition, we propose some principles for enterprises to choose a suitable NoSQL database.

Keywords: Big Data, Database, RDB, NoSQL databases, Database Model

1. INTRODUCTION

From the 1940s to the present, different kinds of database models have been developed, such as File System, Hierarchical, Network, Relational, and Object-Oriented. In particular, the Relational Database Model (RDM) has the advantages of data consistency, minimal data repetition, and powerful RDBMS support, and is therefore the most widely used by enterprises [1].

With the popularization of big data collection technology and its application fields, enterprises need to store more and more data. Users expect application programs to access data as quickly as possible. In addition, the formats of stored data are increasingly varied, for example key-value pairs, document-oriented data, and time series, so more and more enterprises use NoSQL databases to store data [2][4][12]. However, "NoSQL database" is a general term rather than a specific product, and it covers more than 225 kinds of databases [2]. How to choose an appropriate NoSQL database is an important issue for the enterprise because the database can affect the performance of business operations.

This paper provides a basic introduction to each NoSQL database model, compares the data formats and characteristics of each database model, and lists the actual products of NoSQL databases for each model. In addition, the paper proposes a set of principles for enterprises to choose an appropriate NoSQL database to address their business problems and challenges. Finally, a business case shows that, given the characteristics of the enterprise and its data volume, the NoSQL database HBase is the best choice for the enterprise.

2. RELATED WORK

This section describes the development history of database models and the basic concepts of NoSQL databases.

2.1 The Development History of Database Models

A wide variety of database models have been developed since the 1940s. The development process is described below [1].

1) The File Processing System was developed from the 1940s to the 1950s. Data is stored on punched cards or magnetic tapes. The File Processing System cannot manage and process data efficiently because of data duplication and inconsistency. Besides, it cannot provide appropriate access control for users.
2) The Hierarchical Database Model was developed from the 1960s to the 1970s. Each record in the Hierarchical Database Model is considered a node in a tree data structure. The most famous hierarchical DB is IMS (Information Management System), which was developed by IBM.
3) The Network Database Model was developed from the 1960s to the 1970s. Each record in the Network Database Model is considered a vertex in a graph. The most famous network DB is IDMS (Integrated Database Management System), which was marketed by Cullinane (later Cullinet).
4) The Relational Database Model (RDM) was proposed by E. F. Codd in the 1970s [1]; the details are described in Section 2.2.
5) The Object-Oriented Database Model (OODM) was developed in the late 1980s. This model combines the features of object-oriented programming languages and databases. Although RDM is the repository for most business use, it is not suitable for some application areas, such as Computer Aided Design (CAD) and Office Automation. The OODM includes about 25 databases such as db4o, Versant, and so on [2][14].
6) The Object-Relational Database Model (ORDM) was developed by Won Kim and Michael Stonebraker in the 1990s. The ORDM integrates the concepts of OODM into RDM, and is the evolution of OODM. The ORDM includes about 15 databases such as PostgreSQL [3].

2.2 Relational Database Model (RDM)

A Relational Database (RDB) is a set of relations, which are two-dimensional tables used to organize data. Each relation is composed of a Relation Schema and a Relation Instance, where the Relation Schema includes the relation name, attribute names, and attribute domains, and the Relation Instance refers to the data stored in a relation at a specific point in time [1].

Table 1 shows an example of a relation for RDM as follows [1]:

1) The name of this relation is "Students."
2) The attribute names in this relation are "SID," "Name," "Telephone," and "Birthday."
3) The domain is the set of acceptable values for a specific attribute. For example, the acceptable values of the attribute "Birthday" are reasonable dates.
4) The Relation Instance contains three tuples of students.

TABLE 1
AN EXAMPLE OF A RELATION

Table Name: Students
SID: char(4) | Name: char(10) | Telephone: char(12) | Birthday: date
S001 | Mary | 02-33344556 | 1993/1/1
S002 | John | 03-77889900 | 1994/2/2
S003 | Zoe | 04-35465768 | 1995/3/3

In addition, the Integrity Constraints of RDM are part of a database design. They establish the basis for checking data stored in the database and ensure the accuracy of the data. Integrity Constraints therefore not only prevent authorized users from storing invalid data in the database but also avoid data inconsistency between relations. The four common Integrity Constraints are as follows [1].

1) Key Constraint: a relation must have a unique and minimal Primary Key (PK).
2) Domain Constraint: each attribute value of a relation must be an atomic value from the attribute's domain.
3) Entity Integrity Constraint: describes the rules that a PK must follow.
4) Referential Integrity Constraint: describes the rules that a foreign key must follow.

2.3 Entity-Relationship Model (ER-Model)

The Entity-Relationship Model (ER-Model) uses geometric symbols for entities, relationships, etc., to describe the relationships among entities, and it is a tool used to analyse and design the architecture of a DB. In the ER-Model, an entity denotes an object that can be identified in the real world, such as an order, customer, product, or supplier, while a relationship describes an association among some entities [1].

The common symbols of the ER-Model are shown in Table 2 [1], and each symbol is described as follows.

1) Entity Type is a general term for a group of entities with the same characteristics, and an entity is a data subject unit that can be recognized in the real world.
2) Weak Entity Type is a kind of entity type that depends on another entity type; their relationship is established with an Identifying Relationship Type.
3) Relationship-Entity Type, also known as Bridge Entity Type, is a transformation of a many-to-many relationship.


4) Relationship Type is a general term for a group of relationships with the same characteristics, and a relationship is an association established among two or more entities.
5) Identifying Relationship Type is a kind of relationship type that connects a weak entity type and an entity type.
6) Attributes can be divided into Atomic Attribute Types, Composite Attribute Types, Multivalued Attribute Types, Derived Attribute Types, and Key Attribute Types, where (a) Atomic Attribute Types are the most basic attribute types, (b) Composite Attribute Types are composed of multiple Atomic Attribute Types, (c) Multivalued Attribute Types mean that the attribute can contain more than one data value, (d) Derived Attribute Types mean that the value is derived from the values of other attributes, and (e) Key Attribute Types are used to identify an entity.

TABLE 2
THE COMMON SYMBOLS OF ER-MODEL

Types of elements (the symbol column is graphical and is not reproduced here):
Entity; Weak Entity; Relationship-Entity (Bridge Entity); Relationship; Identifying Relationship; Attribute; Key Attribute; Composite Attribute; Multivalued Attribute; Derived Attribute

The diagram of a database drawn with the symbols of the ER-Model is called an Entity-Relationship Diagram (ERD). An example of an ERD is shown in Fig. 1 [1]. This ERD is a simple school DB with four entities: Students, Selections, Courses, and Employees. The relationships among these entities are described as follows. A student can select multiple courses, and a course can be selected by many students. An employee can teach multiple courses, but a course can be taught by only one employee.

Fig. 1 An example of ERD

2.4 NoSQL Databases

NoSQL is an abbreviation of "Not Only SQL." This means that if an RDB is suitable, we use it; if it is not suitable, we do not have to use it and can consider another, more suitable database [4]. The features of NoSQL databases are described as follows [2][4][12].

1) Non-relational: NoSQL databases do not use RDM, nor do they support SQL join operations.
2) Distributed: Data in NoSQL databases is usually stored on different servers, and metadata manages the location of the data.
3) Horizontally scalable: NoSQL databases increase their capacity by increasing the number of servers. For example, a relational database (RDB) needs a server with good performance and large capacity to store huge amounts of data, and the cost of such an RDB server may be 5 to 10 times higher than that of a normal server. A NoSQL database, however, only needs several normal servers to store a large amount of data.
4) High data processing rate: The data processing rate of NoSQL databases is higher than that of RDB. For example, Google uses MapReduce to process the data stored in BigTable at 20 petabytes per day.
5) NoSQL databases are suitable for applications that need fast queries but can tolerate query results that are not strictly accurate.

According to the statistics of the NoSQL database website [2], there are currently more than 225 NoSQL databases. In addition, NoSQL databases are widely used by Google, Yahoo, Facebook, Twitter, Taobao, Amazon, and so on [4].

3. THE DATABASE MODELS OF NOSQL DATABASES

This section introduces each of the NoSQL database models and the data formats or features for which each database model is suitable.

3.1. Key Value Store

The data of a key value store is saved in the format Key→Value [5], where

1) Key is a unique string used to identify the Value;
2) Value is the actual data value, which is stored as a number, a string, or in JSON (JavaScript Object Notation) format;
3) Users can search for a Value by a specific Key.

Suppose a store builds a product information search system so that users can scan a barcode to query the product name, price, and other related information. We can use a key value store to create the data, where the Key is the barcode of the goods and the Value stores the product name, price, pricing unit, and other information in JSON format, as shown in Fig. 2.

Key (the barcode of each product) | Value (JSON format)
9876543210987 | {'title': 'banana', 'price': 39, 'unit': 'bunch'}
9876543210988 | {'title': 'apple', 'price': 69, 'unit': 'kilogram'}
9876543210989 | {'title': 'tomato', 'price': 59, 'unit': 'kilogram'}
… | …

Fig. 2 An example of key value store

There are around 50 packages belonging to the Key Value Store Model, and these packages include Redis, MemcacheDB, Riak, Aerospike, Oracle NoSQL Database, and so on [2][14].

3.2. Document Store

A Document Store stores data in semi-structured documents; such a document is usually stored in a specific format, such as XML (eXtensible Markup Language), JSON, or BSON (Binary JSON). Each record can have different attributes, and there is no fixed data type [5][7]. For example, a Document Store archives school curriculum data in JSON files, an example of which is shown in Fig. 3.

There are around 30 packages belonging to the Document Store Model, and these packages include MongoDB, CouchDB, RethinkDB, IBM Cloudant, and so on [2][14].

    {
      "Courses": {
        "C001": {
          "title": "Business English",
          "credits": 2,
          "remark": "This course has a group discussion."
        },
        "C002": {
          "title": "Software Engineering",
          "credits": 3
        }
      }
    }

Fig. 3 An example of the Document Store

3.3. Graph Databases

Graph Databases use a graph structure to organize the data; the details are as follows [5][10].

1) The graph structure includes "vertices" and "edges."
2) Each "vertex" and "edge" can store data.
3) An "edge" describes the relationship between "vertices."

Fig. 4 shows an example of flight durations between four cities, describing the relationship between "vertices" and "edges" in Graph Databases [5]. Among them:

1) The "vertices" describe country and city data.
2) The "edges" describe how long it takes for the plane to travel from its departure point to its destination.
3) Users can check the following information through Graph Databases:
   a. All the routes from the point of departure to the destination of the aircraft;
   b. The minimum time it takes for the aircraft to travel from the point of departure to the destination;
   c. Whether a transfer is needed to reach a destination.
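The three queries listed above can be sketched with a small in-memory graph. This is only a minimal illustration in Python and not the API of any graph database. The cities and durations are taken from Fig. 4, but which duration connects which pair of cities is an assumption, and the names FLIGHTS, all_routes, min_duration, and needs_transfer are hypothetical.

```python
import heapq

# Vertices are cities; directed edges carry flight durations in hours.
# ASSUMPTION: the pairing of durations with city pairs is illustrative only.
FLIGHTS = {
    "Vancouver": {"San Francisco": 2.3, "New York": 5.5},
    "San Francisco": {"New York": 6.0},
    "New York": {},
}


def all_routes(graph, start, goal, path=None):
    """Query (a): every route from start to goal that visits no city twice."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    routes = []
    for nxt in graph[start]:
        if nxt not in path:
            routes.extend(all_routes(graph, nxt, goal, path))
    return routes


def min_duration(graph, start, goal):
    """Query (b): minimum total flight time (Dijkstra), or None if unreachable."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, city = heapq.heappop(heap)
        if city == goal:
            return dist
        if dist > best.get(city, float("inf")):
            continue  # stale heap entry
        for nxt, hours in graph[city].items():
            cand = dist + hours
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(heap, (cand, nxt))
    return None


def needs_transfer(graph, start, goal):
    """Query (c): True if goal is reachable but there is no direct flight."""
    reachable = min_duration(graph, start, goal) is not None
    return reachable and goal not in graph[start]
```

A dedicated graph database would answer the same questions through its own query language and indexes; the sketch only shows that the data in the vertices and edges is sufficient to answer them.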


Graph Databases are suitable for recording information on social networks, recommendation systems, etc. [12]. There are around 20 packages belonging to the Graph Database Model, and these packages include Neo4j, TITAN, Sparksee, and so on [2][14].

Fig. 4 An example of the Graph Database
[Vertices recovered from the figure: (National: Canada, City: Vancouver), (National: U.S., City: New York), (National: U.S., City: San Francisco). Edge labels recovered: Duration: 5.5 hr, Duration: 2.3 hr, Duration: 6 hr.]

3.4. Wide Column Store

A Wide Column Store is a NoSQL database with a complex table schema, the outline of which is described below [5][13][16][17].

1) A Row Key is used as a unique value to identify a specific record in the Wide Column Store. It is similar to the primary key of a table in an RDB.
2) A Timestamp (ts) is an integer used to identify a specific version of a data value.
3) A data table is divided into multiple Column Families; each Column Family stores data in the "Family:Qualifier = Value" format, where
   a. "Family" is a column family name;
   b. "Qualifier" is a column qualifier name;
   c. "Value" is the actual data value, stored as text.

Table 3 shows an example that illustrates the table schema of a Wide Column Store DB as follows.

1) The table is named "Products_Inventories" and allows users to query product and/or inventory information.
2) The table contains two records and uses the product codes P001 and P002 as Row Key values, respectively.
3) The timestamp (abbreviated here as ts) values t1, t2, etc. are ordered by the time the data was entered.
4) The table has two Column Families, Products and Inventories.
5) Take the string Products:title = "Cotton T-shirt" as an example: Products is the name of the Column Family, title is the name of the Column Qualifier, and Cotton T-shirt is the data value.

TABLE 3
AN EXAMPLE OF WIDE COLUMN STORE

Table Name: Products_Inventories
Row Key | ts | Column family Products | Column family Inventories
P001 | t1 | Products:classes = "Men" |
P001 | t2 | Products:title = "Cotton T-shirt" |
P001 | t3 | Products:descriptions = "TBD" |
P001 | t4 | Products:price = "329" |
P001 | t5 | | Inventory:quantity = "10"
P001 | t6 | | Inventory:place = "1A"
P002 | t7 | Products:classes = "Sports" |
P002 | t8 | Products:title = "Elastic Sport Pants" |
P002 | t9 | Products:descriptions = "TBD" |
P002 | t10 | Products:price = "349" |
P002 | t11 | | Inventory:quantity = "20"
P002 | t12 | | Inventory:place = "2A"

There are around 18 packages belonging to the Wide Column Store Model, and these packages include Apache Cassandra, Apache HBase, Apache Accumulo, and so on [2][14].

3.5. Event Sourcing

Event Sourcing stores the events that happened in the past in order to track the status of something. Events have the following characteristics [6].

1) Events cannot be changed or undone.
2) An event may affect the status established by an earlier event. For example, "cancel booking" may affect the earlier result of the event "booking success."
3) Events must clearly describe the time, place, people, and other information.

Suppose a seminar needs a registration system for a certification in order to track registrations and the current number of applicants. The registration system uses Event Sourcing to store registrations, as shown in Table 4. The three fields "Registration Time", "Person", and "Summary" together can be regarded as one event, and the "Current Enrollment Number" field gives the current enrollment of the seminar [6]. There are 2 packages belonging to the Event Sourcing Database Model: Event Store and Eventsourcing for Java (es4j) [2].

TABLE 4
AN EXAMPLE OF EVENT SOURCING

Registration Time | Person | Summary | Current Enrollment Number
2018/09/22 09:00 | A | Applied | 1
2018/09/25 10:00 | B | Applied | 2
2018/09/28 11:30 | C | Applied | 3
2018/09/29 12:00 | D | Applied | 4
2018/09/29 13:00 | E | Applied | 5

3.6. Time Series Databases (TSDB)

The TSDB is used to process time-series data. A time series is a time-ordered set of random variables that change over time, for example gross domestic product, consumer price index, temperature, rainfall volume, exchange rate, etc. [9].

Suppose an observation station measures temperature and rainfall once per hour and transmits the results to a TSDB. An example of the stored TSDB data is shown in Table 5, which holds the temperature and rainfall records of a particular observation station for 2017. There are around 7 packages belonging to the TSDB Model, and these packages include influxdata, kdb+, OpenTSDB, eXtremeDB, Riak TS, and so on [2][8][14].

TABLE 5
AN EXAMPLE OF THE TSDB

Measurement Time | Temperature (℃) | Rainfall (mm)
2017/01/01 00:00 | -3 | 11
2017/01/01 01:00 | -4 | 8
2017/01/01 02:00 | -5 | 5
… | … | …
2017/12/31 23:00 | -1 | 0

3.7. Multi-model Databases

A Multi-model Database covers two or more database models, and some Multi-model Databases and their related database models are listed in Table 6 [2][11][14].

TABLE 6
MULTI-MODEL DATABASES AND RELATED DATABASE MODELS

Database name | Database models
OrientDB | Object-Database, Document Store (JSON), Graph Database, Key Value Store
ArangoDB | Document Store (JSON), Graph Database, Key Value Store
Oracle NoSQL Database | Document Store (JSON), Graph Database, Key Value Store

3.8. Summary

This section has illustrated the basic concepts of each NoSQL database model and listed the databases that each model contains. Based on the above descriptions of the seven database models, we analyse which NoSQL database model is suitable for which characteristics or formats of data. The results are collated in Table 7. Please refer to Table 8 for a complete list of the actual databases included in each NoSQL database model.

TABLE 7
NOSQL DATABASE MODEL SUITABLE FOR PROCESSING DATA

Database models | Suitable data features to process
Key Value Stores | ⚫ Data which are stored in key-value pairs. ⚫ Data which have few columns (about three or fewer). ⚫ Applications which need to process a large amount of data in a short time.
Document Stores | ⚫ Data which are stored in semi-structured documents with a specific format, such as XML, JSON, etc.
Graph DB | ⚫ Data which are stored in a graph data structure, such as social network relations.
Wide Column Stores | ⚫ Data which have many columns. ⚫ Applications which need to search the data in a specific column frequently.
Event Sourcing DB | ⚫ Records of events which happened in the past, such as the general ledger in accounting, the registration records of a conference, and so on.
TSDB | ⚫ Time series data, such as temperature, exchange rate, and so on.
Multi-model DB | ⚫ Depends on the database models included in the specific database.
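Table 7 can be read as a lookup from data features to candidate database models. The following Python sketch illustrates that reading under stated assumptions: the feature labels are hypothetical shorthand for the rows of Table 7, the ranking by number of matched features is an illustrative choice rather than a rule from the table, and the Multi-model row is omitted because its suitability depends on the models a given product includes.

```python
# Feature labels are hypothetical shorthand for the bullet points of Table 7.
TABLE_7 = {
    "Key Value Store": {"key_value_pairs", "few_columns", "high_throughput"},
    "Document Store": {"semi_structured_documents"},
    "Graph DB": {"graph_structure"},
    "Wide Column Store": {"many_columns", "frequent_column_search"},
    "Event Sourcing DB": {"past_event_records"},
    "TSDB": {"time_series"},
}


def suggest_models(features):
    """Return the models matching at least one feature, best match first.

    ASSUMPTION: ranking by the number of matched features (ties broken
    alphabetically) is an illustrative heuristic, not part of Table 7.
    """
    wanted = set(features)
    scores = {model: len(traits & wanted) for model, traits in TABLE_7.items()}
    matched = [model for model, score in scores.items() if score > 0]
    return sorted(matched, key=lambda model: (-scores[model], model))
```

For instance, calling suggest_models(["many_columns", "frequent_column_search"]) returns only "Wide Column Store", which corresponds to the two bullets of that row in Table 7.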


TABLE 8
LIST OF NOSQL DATABASES [2]

Database models | Databases
Key Value Stores | DynamoDB, Azure Table Storage, Riak, Redis, Aerospike, LevelDB, RocksDB, Berkeley DB, GenieDB, BangDB, Chordless, Scalaris, Tokyo Cabinet / Tyrant, Scalien, Voldemort, Dynomite, KAI, MemcacheDB, Faircom C-Tree, LSM, KitaroDB, upscaledb, STSdb, Tarantool/Box, Chronicle Map, Maxtable, quasardb, Pincaster, RaptorDB, TIBCO Active Spaces, allegro-C, nessDB, HyperDex, SharedHashFile, Symas LMDB, Sophia, NCache, TayzGrid, PickleDB, Mnesia, LightCloud, Hibari, OpenLDAP, Genomu, BinaryRage, Elliptics, DBreeze, TreodeDB, BoltDB, Serenety, Cachelot, filejson, InfinityDB, SCR Siemens Common Repository
Document Stores | Elastic, ArangoDB, OrientDB, gunDB, MongoDB, Cloud Datastore, Azure DocumentDB, RethinkDB, Couchbase Server, CouchDB, ToroDB, SequoiaDB, NosDB, RavenDB, MarkLogic Server, Clusterpoint Server, JSON ODM, NeDB, Terrastore, AmisaDB, JasDB, RaptorDB, djondb, EJDB, densodb, SisoDB, SDB, NoSQL embedded db, ThruDB, iBoxDB, BergDB, ReasonDB, IBM Cloudant, BagriDB, EMC Documentum xDB, eXist, Sedna, BaseX, Qizx, Berkeley DB XML, JEntigrator
Graph DB | Neo4J, ArangoDB, OrientDB, Infinite Graph, Sparksee, TITAN, InfoGrid, HyperGraphDB, GraphBase, Trinity, AllegroGraph, BrightstarDB, Bigdata, Meronymy, WhiteDB, Onyx Database, OpenLink Virtuoso, VertexDB, FlockDB, weaver, Execom IOG, Fallen 8
Wide Column Stores | HBase, MapR, Hortonworks, Cloudera, Cassandra, Scylla, Hypertable, Accumulo, Amazon SimpleDB, Cloudata, MonetDB, HPCC, Apache Flink, IBM Informix, Splice Machine, eXtremeDB Financial Edition, ConcourseDB, Druid, KUDU, Elassandra
Event Sourcing DB | Event Store, Eventsourcing for Java (es4j)
TSDB | Axibase, Riak TS, Informix Time Series Solution, influxdata, pipelinedb, kdb+, eXtremeDB
Multi-model DB | ArangoDB, OrientDB, Datomic, gunDB, CortexDB, Oracle NOSQL Database, AlchemyDB, WonderDB, RockallDB, FoundationDB

4. CHOOSE THE APPROPRIATE DATABASE

So far, we have briefly discussed the features of each NoSQL database model and have listed the available packages for each model. Next, we show how to select a database suitable for a business based on the above knowledge and the problems faced by the enterprise, and illustrate the process with a case.

4.1. The Principle of Database Selection

Address the following points according to the corporate culture and characteristics.

1) Understand the problems, goals, and challenges that the business operation database currently faces.
2) Based on the business requirements and on the NoSQL database characteristics of being non-relational, decentralized, and capable of high-volume data processing, assess whether to continue using the current RDB or to switch to a NoSQL database.
3) If using NoSQL, decide which NoSQL database model to use based on the data characteristics and formats that each NoSQL database model suits.
4) When selecting a particular database, first identify the most commonly discussed databases on the web based on the statistics and ratings of the DB-Engines Ranking website [14]. Then select the most appropriate database among them based on their strengths and the business needs.

4.2. Database Selection Case

4.2.1 Problem Description

An online shopping website uses an RDB to store operational data, the features of which are described below.

1) The RDB has a total of 15 data tables, and each data table has 20 fields on average.
2) An average of about 500,000 data records are generated daily in the RDB.
3) When searching for information, users often need to join several related tables with a large amount of data.

Because the amount of data in the RDB keeps increasing and users expect data processing to be as fast as possible, the business owner hopes the IT department staff can solve this problem.

4.2.2 Solution

Following the instructions of the business owner, the director of the Enterprise Information Department investigated why the data processing speed is slowing down. In addition to the large amount of data generated daily, the most important reason is that many users often need to JOIN several tables with large amounts of data when they search for information. The director therefore intends to suggest a NoSQL database as the solution.

The remaining question is which NoSQL database to use. The analysis is as follows.

1) The suitable category of NoSQL DBs is Wide Column Store because the tables of the online shopping website have many columns (about 20 fields each) and the website needs to search a specific column of data frequently.
2) According to the statistics and reviews of the DB-Engines Ranking website [14], Apache Cassandra and Apache HBase are the more widely discussed Wide Column Stores on the Internet.
3) In addition, according to the experimental results of Chen et al. [15], Apache HBase has a shorter data access time than Apache Cassandra. It is therefore recommended to use Apache HBase as the database for the enterprise.

5. CONCLUSIONS

This paper gives a basic introduction to each NoSQL database model and compares the data formats and characteristics that each database model is suitable for processing. In addition, a shopping website is used as an example to illustrate how to select a suitable NoSQL database for the enterprise. Hopefully, the content of this paper will help readers choose a suitable NoSQL database to store huge amounts of data.

REFERENCES

[1] H. A. Chen, Database System: Concept, Design, and Implementation, 3rd ed., Taipei, Republic of China: XBOOK MARKETING Co., Ltd., 2013.
[2] (2011) NoSQL databases. [Online]. Available: http://nosql-database.org/
[3] (2017) Comparison of object-relational database management systems (Wikipedia). [Online]. Available: https://en.wikipedia.org/wiki/Comparison_of_object-relational_database_management_systems
[4] S. J. Pi, Establish the cornerstone of Big Data: NoSQL Database technique, 2nd ed., Taipei, Republic of China: TopTeam Information Co., Ltd., 2016.
[5] S. Dan, NoSQL for Mere Mortals, 1st ed., London, England: Pearson P T R, 2015.
[6] Microsoft Corp. (2018) Introducing to Event Sourcing. [Online]. Available: https://msdn.microsoft.com/en-us/library/jj591559.aspx#sec1
[7] (2018) Document-oriented database (Wikipedia). [Online]. Available: https://en.wikipedia.org/wiki/Document-oriented_database
[8] (2017) Time series database (Wikipedia). [Online]. Available: https://en.wikipedia.org/wiki/Time_series_database
[9] (2018) Time series (Wikipedia). [Online]. Available: https://en.wikipedia.org/wiki/Time_series
[10] (2018) Graph Databases (Wikipedia). [Online]. Available: https://en.wikipedia.org/wiki/Graph_database
[11] (2017) Multi-model databases (Wikipedia). [Online]. Available: https://en.wikipedia.org/wiki/Multi-model_database
[12] J. H. Lu, Challenge big data, how to process Big Data in Facebook, Google, Amazon? Use NoSQL to get 10 billion annual hard disk data, 2nd ed., Taipei, Republic of China: TopTeam Information Co., Ltd., 2015.
[13] N. Dimiduk and A. Khurana, HBase in Action, 1st ed., New York, USA: O'Reilly & Associates Inc., 2012.
[14] SOLID IT Team (2018) DB-Engines Ranking. [Online]. Available: https://db-engines.com/en/ranking
[15] C. Y. Chen, B. R. Chang, H. F. Tsai, and C. L. Guo, "Empirical Analysis of High Efficient Remote Cloud Data Center Backup Using HBase and Cassandra," Scientific Programming, vol. 2015, article ID 294614, p. 1-10, Dec. 2014.
[16] J. H. Lu, Hadoop: Practical Technical Handbook, 2nd ed., Taipei, Republic of China: TopTeam Information Co., Ltd., 2014.
[17] L. George, HBase: The Definitive Guide, 1st ed., California, USA: O'Reilly & Associates Inc., 2011.
