International Journal of Advanced Research in Computer Science, Volume 5, No. 5, May-June 2014, ISSN No. 0976-5697. Research Paper. Available online at www.ijarcs.info

A Comparative Study of NoSQL Databases

Abhishek Prasad, Bhavesh N. Gohil
S.V. National Institute of Technology, Ichchhanath, Surat, India

Abstract: In today's scenario, web applications are facing new challenges while serving millions of users. It can easily be understood and appreciated that, while accessing these web applications from all over the world, users expect the services to be always available, with high performance and reliability. The rate of growth of successful web services is much faster than the increase in the performance of computer hardware. Therefore, after some time, a web application which attempts to provide a large number of web services and support a large number of users needs the ability to scale. This study explores various NoSQL (Not only Structured Query Language) databases and attempts to compare them based on different criteria. NoSQL databases were created to offer high performance and high availability, especially for large-scale and high-concurrency applications, at the expense of losing the ACID (Atomicity, Consistency, Isolation, Durability) properties of traditional databases in exchange for the weaker BASE (Basic Availability, Soft state, Eventual consistency) guarantees.

Keywords: Big Data; NoSQL; Scalability; Sharding; Replication; Comparison

I. INTRODUCTION TO NOSQL DATABASE

A. Understanding of NoSQL database:

a. What is a NoSQL database?: Over the last 15 years or so, interactive applications have changed dramatically, which has led to a change in the data management requirements of those applications. Nowadays, three interrelated megatrends (Big Data, Big Users and Cloud Computing) are driving the adoption of NoSQL technology. NoSQL is therefore being increasingly considered a feasible alternative to relational databases, especially as it can simplify operation at large scale and provide better results on clusters of standard, commodity servers. Further, it is opined that a schema-less data model is better suited for capturing the variety of data processed today.

b. Why does NoSQL have large numbers of users?: Today, with the rise in global Internet use, there has been a considerable increase in the number of users and the time they spend online. Smartphones and tablets have gained popularity, as the Internet is easily accessible anywhere through these easy-to-handle devices.

It is important to support large numbers of concurrent users. At the same time, it is also important to dynamically support a rapidly growing (or shrinking) number of concurrent users. To support large numbers of users in addition to the dynamic nature of application usage patterns, there is a need for more easily scalable database technology. With relational technologies, it is difficult to obtain dynamic scalability while maintaining the performance users demand from their applications.

c. NoSQL can handle vast data: With the advent of ever-growing technology, it is easier to capture data by accessing third-party sources such as Facebook. With a single click one can capture personal user information, geo-location data, social graphs, machine logging data, and sensor-generated data. This data is very large and ever-expanding. Use of the data can rapidly change the nature of communication, shopping, advertising, entertainment, and relationship management. Applications which do not leverage it quickly and in a timely manner will lag behind in no time.

d. NoSQL is better suited for cloud computing: Presently, most new applications use the three-tier Internet architecture. These applications run in a public or private cloud and support large numbers of users. In the three-tier architecture, applications are accessed through a web browser or mobile application connected to the Internet. In the cloud, a load balancer directs the incoming traffic to a scale-out tier of web or application servers that processes the logic of the application.

Traditionally, relational databases were the popular choice for the database tier, and they are still used. However, relational databases are centralized, share-everything technologies that scale up rather than scaling out, and this creates problems: they are not a good fit for applications that require easy and dynamic scalability. NoSQL databases are better suited to such applications since they provide distributed, scale-out technology. Therefore, NoSQL is a better option for the highly distributed nature of the three-tier Internet architecture.

B. Common Characteristics of NoSQL:

NoSQL databases share a common set of characteristics, as mentioned below:

a. No schema required: Data can be inserted into a NoSQL database without first defining a rigid database schema. Also, the format of the data being inserted can be changed at any time without disrupting the application. This characteristic provides immense application flexibility.

b. Auto-sharding: A NoSQL database automatically spreads data across servers, without requiring applications to participate. Servers can be added or removed from the data layer without application downtime, with data automatically spread across the servers. Most NoSQL databases also support data replication, storing multiple copies of the same data across the cluster, and even across data centers, to ensure high availability and support disaster recovery.

c. Distributed query support: "Sharding" a relational database can reduce its ability to perform complex data queries. NoSQL database systems retain their full query expressive power even when data is distributed across a huge number of servers.

d. Integrated caching: To increase data throughput and reduce latency, advanced NoSQL database technologies transparently cache data in system memory.

e. Reliability (fault tolerance): If some machines within the system crash, the rest of the machines remain unaffected and work continues without any stoppage.

f. Scalability: In distributed computing, more machines can easily be added to the system according to the requirements of the user and the application.

g. Sharing of resources: Just as data is shared in a distributed system, other resources, such as expensive printers, can also be shared.

h. Speed: A distributed computing system can provide more computing power, since it can perform computations in parallel.

i. Performance: The collection of processors in the system can provide higher performance than a centralized computer. This also gives a better price/performance ratio.

C. High Level Comparison with SQL Databases:

Table 1: Comparison between SQL and NoSQL databases [4]

Criterion | SQL Databases | NoSQL Databases
Types | One type, with minor variations between products. | Many different types of databases, including key-value, wide-column, document, and graph databases.
Data Storage Model | Records are stored as rows and columns, where each column stores specific data about that record. Separate data types are stored in separate tables, which are joined together when complex queries are executed. | Varies based on the type of database. For example, document databases store all relevant data together in a single "document" in JSON, XML, or another format, which can nest values hierarchically.
Scaling | Vertical scaling: a single server is made increasingly powerful to meet the increased demand. It is possible to spread SQL databases over many servers, but this requires significant additional engineering. | Horizontal scaling: simply add more commodity servers or cloud instances to increase capacity. The database automatically spreads data across servers as necessary.
Supports Transactions | Yes; updates can be configured to complete entirely or not at all. | In certain circumstances and at certain levels (e.g., document level vs. database level).
Data Manipulation | Structured query language using Select, Insert, and Update statements. | Through object-oriented APIs.
Consistency | Can be configured for strong consistency. | Depends on the product. Some provide strong consistency (e.g., MongoDB) whereas others offer eventual consistency (e.g., Cassandra).
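
To make the document model in Table 1 concrete, the short Python fragment below stores two differently shaped records in the same collection of a document database. It is only an illustrative sketch using MongoDB (reviewed in Section II) through the PyMongo driver; the server address, database, collection and field names are assumptions made for the example, not details taken from any system described here.

    # Illustrative only: a document store accepts records without a predeclared schema.
    # Assumes a MongoDB server on localhost:27017 and the PyMongo driver installed.
    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)
    users = client["demo"]["users"]          # database and collection are created lazily

    # Two documents with different shapes can coexist in one collection,
    # and new fields can be added later without a schema migration.
    users.insert_one({"name": "Alice", "email": "alice@example.com"})
    users.insert_one({"name": "Bob", "geo": {"lat": 21.17, "lon": 72.83}, "tags": ["mobile"]})

    for doc in users.find({"name": "Alice"}):
        print(doc)
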

II. REVIEW OF NOSQL DATABASES

A. SimpleDB:

Amazon provides a web service in the form of SimpleDB, a distributed database. SimpleDB does not need any rigid structure and is easy to use. Since SimpleDB is structure free, it decides on its own about the requirement of indexes and accordingly provides a simple SQL-like query interface. [2]

In SimpleDB, to ensure the safety of data and also to increase performance, all data are replicated onto different machines in different datacenters. Because of this design there is no automatic sharding, so the data is not partitioned for scaling by the database itself; scaling is possible only where the application layer performs the data partitioning by itself. [2]

Some of the salient features of SimpleDB which encourage or discourage its use are mentioned below:

a. Consistency: SimpleDB provides eventual consistency but does not provide MVCC, due to which conflicts cannot be detected on the client side.

b. Limitations: Like any system, SimpleDB also has some limitations in terms of the capacity of domains, query time, and the number of domains it can handle. These limits are not rigid and are likely to change as long as the system is in use. Some of the restrictions are tabulated below:

Table 2: Parameters and limitations of SimpleDB [2]

Parameter | Limitation
Domain size | 10 GB per domain / 1 billion attributes per domain
Domains per account | 100
Attribute value length | 1,024 bytes
Maximum items in a Select response | 2,500
Maximum query execution time | 5 s
Maximum response size for one Select statement | 1 MB

B. Dynamo:

Amazon uses Dynamo for its own internal applications. Dynamo is a distributed key-value storage system. Dynamo provides a query API that allows clients to retrieve the value for a unique key and to put key-value pairs into the store, with values smaller than one megabyte. [2]

In Dynamo, the workload is dispersed based on the capacity of the nodes; irrespective of position, each node automatically performs the same tasks. To safeguard against the catastrophic loss of a complete data centre, every key-value pair is duplicated with a geographical distribution over several data centres throughout the world. [2]

Dynamo can provide availability, consistency, cost-effectiveness and performance. Dynamo uses optimistic replication with multi-version concurrency control (MVCC) to achieve a form of eventual consistency. It has properties of both databases and distributed hash tables (DHTs).

a. Sharding: Dynamo uses a variant of consistent hashing to partition data. Virtual nodes are created by subdividing each node, and these virtual nodes are linked to a random position in the key space. The key space can be defined as a group of numbers onto which every key can be mapped with a hash function. A node is accountable for all keys positioned between itself and its predecessor. [2]
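
The following Python fragment is a toy sketch of the partitioning idea described above (consistent hashing with virtual nodes). It is illustrative only and is not Dynamo's actual implementation; the hash function, number of virtual nodes and node names are assumptions made for the example.

    # Toy consistent-hash ring with virtual nodes; illustrative, not Dynamo's code.
    import bisect
    import hashlib

    def position(key):
        # Map any string onto the key space (here, 160-bit integers from SHA-1).
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    class ConsistentHashRing:
        def __init__(self, nodes, vnodes=64):
            # Each physical node is subdivided into several virtual nodes,
            # each placed at a pseudo-random position in the key space.
            self.ring = sorted((position(f"{n}#{v}"), n) for n in nodes for v in range(vnodes))
            self.points = [p for p, _ in self.ring]

        def node_for(self, key):
            # A node is responsible for all keys between its predecessor and itself,
            # so walk clockwise to the first position at or after hash(key), wrapping around.
            i = bisect.bisect_left(self.points, position(key)) % len(self.ring)
            return self.ring[i][1]

    ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
    print(ring.node_for("user:42"))
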

b. Replication: A replication factor N dictates how many replicas a key should have. The node responsible for a key k replicates k to its N-1 successors. MVCC is then used to allow asynchronous synchronization between the replicas: every update creates a new revision, and vector clocks are used to implement Dynamo's revision control. [2]

c. Consistency: Dynamo uses a configurable, quorum-like approach. The client can specify the number of replicas R from which a key must be read and the number of replicas W that must be written during a write. A configuration with R + W > N is equivalent to a quorum system with strong consistency, while R + W < N favours latency and availability over consistency. [2]
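
As a simple illustration of the quorum rule above (which reappears for Cassandra later in this section), the Python fragment below merely encodes the R + W > N condition; the sample values are arbitrary.

    # Encodes the quorum rule discussed above: with N replicas, reading from R and
    # writing to W replicas forces read and write quorums to overlap when R + W > N.
    def is_strongly_consistent(n, r, w):
        return r + w > n

    print(is_strongly_consistent(n=3, r=2, w=2))   # True:  quorums overlap, strong consistency
    print(is_strongly_consistent(n=3, r=1, w=1))   # False: favours latency and availability
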
C. MongoDB:

MongoDB is a structure-free, cross-platform, document-oriented database developed by 10gen as an open-source project [5]. The name MongoDB originates from "humongous". The database is meant to be scalable and fast, and is written in C++. Large binary files, such as images and videos, can be stored and distributed using MongoDB. Each document is assigned an ID field, which acts as a primary key. MongoDB stores documents as BSON (Binary JSON) objects, which are binary-encoded JSON-like objects [6]. MongoDB can also be used effectively for indexing over embedded objects and arrays.

a. Replication: A replica set can be defined as a group of mongod instances that host the same data set. One mongod, the primary, receives all write operations; all other instances, the secondaries, apply operations from the primary so that they have the same data set. The primary accepts all write operations from clients, and a replica set can have only one primary, so replica sets provide strict consistency. The primary logs all changes to its data sets in its oplog to support replication.

b. Sharding: MongoDB utilises auto-sharding, which takes care of the problems normally created by manual sharding. The cluster handles the division of data and rebalances automatically. A MongoDB cluster has three components: shard nodes, configuration servers, and routing services known as mongos. The mongos instances are independent and can therefore be deployed to work in parallel.

c. Consistency: MongoDB provides eventual consistency, so a read may return an old version of the data while an update operation is still being applied [2]. MongoDB has no multi-version concurrency control and no transaction management.

d. Failure Handling: MongoDB is designed so that writes are committed to the journal within 100 milliseconds [5]. Journalled writes are durable, meaning the data is still recoverable even after a pull-the-plug-from-the-wall failure and a hard restart. Although the journal commit is nearly instantaneous, MongoDB takes longer to write the data files; by default it may take up to a minute to write data to the data files [5]. This slower pace does not affect durability, because the journal has adequate material to ensure recovery from a sudden loss [5]. Depending upon the requirement, it is possible to alter the time gap for writing to the data files.

D. CouchDB:

CouchDB is a schema-free, document-oriented database with a powerful replication mechanism. The project is written in Erlang and is part of the Apache Foundation [2]. CouchDB uses JSON to store data, JavaScript as its query language using MapReduce, and HTTP for an API. It has a notable feature of multi-master replication. In CouchDB a unique identifier is used for each document, and updates are carried out on whole documents. A revision id is used for concurrency control of the document. Another interesting feature of CouchDB is the option to specify validation functions in JavaScript.

a. Replication: In the replication process, CouchDB compares the source and the destination database to determine which documents differ between them. Destination documents of the same revision which already exist are not transferred.

A replication task will finish once it has reached the end of the changes feed. If its continuous property is set to true, it will instead wait for new changes to appear until the task is cancelled. Replication tasks also generate checkpoint documents on the destination.

Both nodes can be used for writing and reading in a master-master setup; such a setup has no effect on partitioning. It is possible to synchronize CouchDB with more than one other database [2].

The shortcoming of optimistic replication is its inability to handle conflicts between different revisions of the same object [2]. CouchDB uses hash histories to identify duplicate versions of the same document; this means that an update of a document only changes the revision id [2].

b. Sharding: CouchDB does not yet have its own built-in sharding mechanism, but at present there are two projects which provide sharding support for CouchDB. The first project is The Lounge [2], a tool set that contains proxies which can distribute the data to different CouchDB instances; The Lounge is developed and used by Meebo.com. Cloudant [2] is another project that provides data partitioning for CouchDB; Cloudant provides a hosting service with automatic sharding functionality for CouchDB databases.

c. Consistency: CouchDB's consistency depends on how CouchDB is used. CouchDB provides eventual consistency if its built-in replication feature is used in a master-master configuration. The replication can also be used in a master-slave setup, which provides strong consistency. [2]

d. Failure Handling: CouchDB has no method to deal with failures of nodes; currently it only provides the replication mechanism. CouchDB uses an append-only B+tree as its file structure, to be robust against power disruptions and to minimize the number of seek operations of the hard drive while writing to the database. The copy-on-write characteristic of CouchDB makes sure that the files on disk are always in a consistent state, even if a power failure occurs. Because of this, CouchDB does not need a transaction log file and can always be interrupted without the risk of damaging the database files. [2]
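
The fragment below sketches CouchDB's revision-based update cycle over its HTTP API using the Python requests library. It assumes a CouchDB server on localhost:5984 in which a database named "articles" already exists; the document id and fields are illustrative.

    # Sketch of CouchDB's revision-based (optimistic) concurrency over HTTP.
    # Assumes a CouchDB server on localhost:5984 with an existing "articles" database.
    import requests

    base = "http://localhost:5984/articles"

    # Create a document; CouchDB assigns it an initial _rev.
    requests.put(f"{base}/doc1", json={"title": "NoSQL survey", "views": 0})

    # Read it back; the returned document carries its current _rev.
    doc = requests.get(f"{base}/doc1").json()
    doc["views"] += 1

    # The update is accepted only if the supplied _rev is still current;
    # a stale _rev is rejected with 409 Conflict instead of silently overwriting.
    resp = requests.put(f"{base}/doc1", json=doc)
    print(resp.status_code, resp.json())
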

E. BigTable:

BigTable is one of Google's techniques for dealing with its vast amount of data. It is built on top of Google's distributed file system GFS [2] and is used by Google for several applications with different demands on the latency of the database. BigTable is designed to be scalable and distributed and is tolerant to hardware failures. Google's own implementation is not available to anyone outside Google, apart from its use inside Google App Engine.

a. Sharding: The tables within a BigTable database are divided by their row key into several key ranges, which are called tablets. Every tablet is allocated to only one tablet server at any instant. A master server stores the meta-information regarding the current assignment of each tablet and allocates the currently unassigned tablets to the currently available tablet servers. [2]

b. Replication: BigTable does not directly replicate the database, because one tablet can only be assigned to one tablet server at a time; instead it uses Google's distributed file system GFS [2] for the storage of the tablets and log files, which handles replication at the file level.

c. Consistency: BigTable can provide strong consistency because each tablet is assigned to only a single tablet server. This limits availability, since tablets may be temporarily inaccessible during recovery from the loss of a tablet server. BigTable permits atomic operations on one row and also offers the ability to run transactions inside a row; it is not possible in BigTable to run transactions over multiple rows. [2]

d. Failure Handling: In BigTable, log files are used for all write operations, akin to the transaction log files of relational databases. Each tablet server writes its own log file to GFS. Therefore, during the recovery phase, each tablet server that takes over a tablet of a previously lost tablet server has to read that log file and search it for write operations on its tablets.

There is another failure scenario in which the master fails. In that situation, the tablet servers choose a new master server using the Chubby service and restore the metadata tablets; this is carried out by using the root tablet file in the Chubby service. Five servers ensure the availability of the Chubby service, and at least three of them are required to obtain an exclusive lock on a Chubby file. [2]

F. Cassandra:

Cassandra is basically a hybrid between a key-value and a tabular database. A column family is similar to a table in an RDBMS; unlike a table in an RDBMS, however, different rows in the same column family do not have to share the same set of columns, and a column may be added to one or multiple rows at any time. [7]

In Cassandra, each key corresponds to a value which is an entity. Each key has values as columns, and columns are clustered together into sets called column families, so these column families can be regarded as tables. A table in Cassandra is a distributed multi-dimensional map indexed by a key. [7] Furthermore, applications can specify the sort order of columns within a Super Column or Simple Column family.

a. Replication: Replication is used in Cassandra to achieve high availability and durability. Cassandra offers the client various alternatives for how data is to be replicated; the replicas of a key are chosen based on the replication policy selected by the application. Cassandra elects a leader amongst its nodes using a system called ZooKeeper [8]. A node is responsible for certain key ranges, and this metadata is cached locally at each node and stored in a fault-tolerant manner inside ZooKeeper; in this way, a node which has crashed and subsequently recovered is aware of the ranges it was responsible for.

By relaxing the quorum requirements, Cassandra offers durability guarantees in the case of node failures and network partitions. Data centre failures may happen due to power disruptions, cooling failures, network failures, and natural disasters. The preference list of a key is therefore constructed in such a manner that its storage nodes are spread across multiple data centres.

b. Consistency: Cassandra can be anything from a strictly consistent data store to a highly eventually consistent system. In the case of a strictly consistent data store every read request always gets the latest data, whereas in the case of a highly eventually consistent system old data is sometimes returned for a large gain in availability and performance. Any configuration where R + W > N is considered strongly consistent; here, R is the number of replicas which are queried for a read operation, W is the number written during a write, and N is the replication count. [1]

c. Sharding: A distinctive design feature of Cassandra is its ability to scale incrementally. This requires the ability to dynamically partition the data over the set of nodes (i.e., storage hosts) in the cluster. Cassandra uses consistent hashing to partition data across the cluster, using an order-preserving hash function for this purpose. [8]

The basic algorithm is oblivious to the heterogeneity in the performance of nodes. This issue can be addressed in two ways: in the first method, nodes are assigned to multiple positions in the circle (as in Dynamo); in the second method, the load information on the ring is analyzed and lightly loaded nodes are moved on the ring to relieve heavily loaded nodes. Cassandra prefers the second method, as it keeps the design and implementation simple and helps in making deterministic choices about load balancing. [8]
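
The fragment below sketches how the table/column-family model and the tunable consistency discussed above surface to an application, using the DataStax Python driver. It assumes a reachable Cassandra cluster large enough to satisfy QUORUM (2 of 3) for a keyspace with replication factor 3; the keyspace, table and column names are illustrative.

    # Sketch of Cassandra's table model and tunable (quorum) consistency via the
    # DataStax Python driver; assumes a cluster that can satisfy QUORUM for RF = 3.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect()
    session.execute("CREATE KEYSPACE IF NOT EXISTS demo WITH replication = "
                    "{'class': 'SimpleStrategy', 'replication_factor': 3}")
    session.set_keyspace("demo")
    session.execute("CREATE TABLE IF NOT EXISTS users (user_id text PRIMARY KEY, name text)")

    # QUORUM reads and writes give R + W > N for N = 3, i.e. strong consistency.
    write = SimpleStatement("INSERT INTO users (user_id, name) VALUES (%s, %s)",
                            consistency_level=ConsistencyLevel.QUORUM)
    session.execute(write, ("42", "Alice"))

    read = SimpleStatement("SELECT name FROM users WHERE user_id = %s",
                           consistency_level=ConsistencyLevel.QUORUM)
    print(session.execute(read, ("42",)).one().name)
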

G. Riak:

Riak was written by former Akamai engineers and is published both as a feature-reduced open-source version and as a full-featured commercial version. The company Basho markets the commercial version along with support, and maintains and writes almost all of the source code. For the most part Riak is written in Erlang, while some parts are also written in JavaScript. Clients can communicate with Riak using a REST API or via Google's Protocol Buffers. [1]

a. Sharding: Riak scales linearly with the addition of nodes, which results in improved reliability, performance and throughput for larger clusters. It is recommended to deploy five nodes or more to provide a base for high performance and growth as the cluster size increases. When a new node is added to a Riak cluster, the data is rebalanced automatically without any downtime. A user does not have to deal with the underlying complexity of data location, as any node can accept or route requests. [9]

b. Replication: Riak Enterprise provides multi-datacenter replication, monitoring and 24x7 support. Riak Enterprise is an extension to Riak that replicates data across availability zones. Users can use multi-datacenter replication to serve global traffic, run secondary analytics clusters, meet disaster recovery and regulatory requirements, and maintain active backups; it can be used across more than one site. Riak Enterprise provides two modes of multi-datacenter replication: 1) full sync and 2) real-time sync. In full sync, data is replicated at scheduled intervals between two clusters; the default is six hours. In real-time sync, updating the primary data center triggers replication to the secondary data center: after an object is written to the primary cluster, the write is sent to the secondary cluster via a post-commit hook. All multi-datacenter replication takes place over TCP connections and supports SSL. [9]

c. Consistency: Riak supports eventual consistency.

d. Failure Handling: Data is not lost when access to many nodes is lost due to a network partition or hardware failure. Riak replicates key/value pairs across a cluster of nodes with a default n_val of three. Owing to its "masterless" peer-to-peer architecture, data can be written to a neighbouring node beyond the initial three, and read back, in the case of node outages caused by network partitions or hardware failures [10].
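
The short fragment below sketches the REST interface mentioned above, storing and fetching one object over Riak's HTTP API with the Python requests library. It assumes a node listening on localhost:8098; the bucket and key names are illustrative.

    # Sketch of Riak's HTTP (REST) interface; assumes a node on localhost:8098.
    import requests

    url = "http://localhost:8098/buckets/carts/keys/user42"

    # Store an object; Riak replicates it to n_val nodes (three by default).
    requests.put(url, json={"items": ["book", "pen"]},
                 headers={"Content-Type": "application/json"})

    # Read it back; any node in the cluster could accept or route this request.
    resp = requests.get(url)
    print(resp.status_code, resp.json())
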
H. Apache HBase:

The first release of HBase was made in 2007 as part of the Apache Hadoop project. It became an Apache top-level project in 2010 and has since been released independently of Hadoop.

Most of the concepts and services used in BigTable have counterparts in HBase, such as GFS, the data model and Chubby. All HBase modules are written in Java. Testing of the HBase modules is performed on Unix-like systems and not on Windows.

a. Replication: Asynchronous replication is used in HBase: the clusters can be geographically distant from each other, the links between the clusters can be offline for some time, and rows inserted on the master cluster will not be immediately available on the slave clusters. This is also called eventual consistency. HBase replication on each region server is based on HLogs; these HLogs are stored in HDFS and are used to replicate data to any slave cluster. The sizes of the clusters participating in replication can be asymmetric, and the master cluster does its best to balance the stream of replication onto the slave clusters by using randomization.

b. Scaling Mechanisms: In HBase there is no concept of a replication count; each row of a table is stored on exactly one region server. The setup can be scaled by adding more nodes to the cluster, which helps in reducing load on the servers and provides additional space. Adding more nodes speeds up both read and write operations on the cluster. The performance and availability of HBase depend on the performance and availability of the underlying HDFS. [1]

c. Consistency: HBase can guarantee strong consistency using its single-master setup. As all read and write operations are immediately visible to all the clients connected to the master node, no outdated data is returned. Both the region servers and the master are single points of failure: if one fails, part or all of the data is unavailable for a short period of time until that node is replaced by a new node. [1]

d. Failure Handling: There are two different types of errors that can occur in an HBase system.
a) Master fails: A new master can be elected; no downtime is required, owing to automatic failover.
b) Region server fails: A new node is assigned to that region by the master. This new machine will read the data image and the commit logs from HDFS and can then take over. [1]

I. Redis:

Redis made its first release as an open-source project in 2008, written in the C language. The main idea was to provide a key-value store. Redis provides a simple interface where different data types can be assigned to a key; later on, the data can be requested again using the key. To improve performance, Redis tries to keep all the data in the system's main memory; if enough space is not available, some parts can be swapped to disk using a virtual memory system.

a. Scaling Techniques: Redis is designed as a single-node system, with some extensions to make the system scalable.

b. Replication: Redis supports unidirectional replication. The slave replicates all the data from the master node asynchronously, so the slave is outdated for some time; this makes the slave a read-only node. Multiple slaves can be set up for each master node, and a multi-level replication can also be set up by replicating the slave nodes again. In replication all data is copied; there is no selective replication of a subset of the data. [1]

c. Sharding: Sharding can be used to reduce the load on a Redis setup or to store more data than can be handled by a single node. A sharded cluster can be set up by dividing the key space into as many parts as there are nodes in the cluster. To ensure availability, additional nodes can be deployed and set up as replicas of the primary nodes. [1]
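
The fragment below sketches the "different data types per key" interface described above, using the redis-py client (version 3.5 or newer for the hash mapping call). It assumes a Redis server on localhost:6379; all key names are illustrative.

    # Sketch of Redis's typed keys via redis-py; assumes a server on localhost:6379.
    import redis

    r = redis.Redis(host="localhost", port=6379, db=0)

    r.set("page:views", 1)                                           # string used as a counter
    r.incr("page:views")
    r.rpush("recent:users", "alice", "bob")                          # list
    r.hset("user:alice", mapping={"name": "Alice", "visits": "3"})   # hash

    print(r.get("page:views"))                 # b'2'
    print(r.lrange("recent:users", 0, -1))     # [b'alice', b'bob']
    print(r.hgetall("user:alice"))
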

III. NOSQL DATABASE COMPARISON

All the examined NoSQL databases share the same restrictions, which derive from their distributed architecture. All of them are subject to the CAP theorem and can therefore either provide only eventual consistency or sacrifice some amount of availability.

The examined NoSQL databases have small but very significant differences.

A. General comparison:

Table 3: General comparison of the systems examined

System | Programming Language | License | First Release | Platform
SimpleDB | - | Proprietary | 2007 | Linux, Mac OS X, Windows
Dynamo | Java | Part of AWS | - | Linux
Apache Cassandra | Java | Apache License 2 | July 2008 | Linux, Mac OS X, Windows
Basho Riak | Erlang | Apache License 2 | 2009 | Linux, Mac OS X, Unix
Apache HBase | Java | Apache License 2 | 2007 | Linux, Mac OS X, Windows
Redis | C | New BSD | March 2009 | Linux, Mac OS X, Windows
CouchDB | Erlang | Apache License 2 | 2005 | Linux, Mac OS X, Windows
MongoDB | C++ | GNU AGPL v3.0 | 2008 | Linux, Mac OS X, Windows
Google BigTable | C, C++ | Proprietary | 2005 | Linux, Mac OS X, Windows

B. Feature-based comparison:

Table 4: Feature-based comparison of the systems

System | Replication | Sharding | Consistency | Partition Tolerance
SimpleDB | Yes | Yes (at application layer) | No | No
Dynamo | Yes | Yes | Yes | No
Cassandra | Yes | Yes | No | Yes
Riak | Yes | Yes | No | Yes
HBase | By underlying DFS | Yes | Yes | Yes
MongoDB | Yes | Yes | Yes | Yes
Redis | Yes (unidirectional) | Using external tools | Yes | Yes
CouchDB | Yes (bidirectional) | Using external tools | No | Yes
BigTable | Yes | Yes | Yes | Yes

C. Summary:

Table 5: Summary based on major users, storage type, best use and key points

System | Major Users | Storage Type | Best Use | Key Points
SimpleDB | - | Document | Solutions for simple databases | Provides Amazon support and documentation. Limits on domain size and query time.
Dynamo | - | Key-Value | Large to very large database solutions | -
Cassandra | Facebook, Twitter, Digg | Column | Write often, read less | A cross between BigTable and Dynamo. High availability.
Riak | Mozilla, Comcast, AOL | Key-Value | High availability | A truly fault-tolerant system with no single point of failure.
HBase | Facebook | Column | Random read/write to a large database | Capable of storing huge quantities of data. Modeled after BigTable.
MongoDB | Craigslist, Foursquare, Shutterfly, Intuit | Document | Statistical data rarely read but frequently written, and dynamic queries | Retains some properties of SQL, such as indexes and queries.
Redis | StackOverflow | Key-Value | Statistical data rarely read but frequently written; rapidly changing data | Very fast.
CouchDB | LotsOfWords.com | Document | Changing data with pre-defined queries | DB consistency; easy to use.
Google BigTable | Google | Column | Scaling across thousands of machines | Currently not used or distributed outside Google, but accessible through Google App Engine.

IV. CONCLUSION

Different NoSQL databases were compared based on different aspects. Different properties, such as replication, sharding, consistency and failure handling, were discussed for the various NoSQL databases. The best NoSQL database for each of the features was selected and is shown below.

Table 6: Best NoSQL databases for different features

Aspect | Best Database(s)
High availability | Riak, Cassandra, Google BigTable, CouchDB
Partition tolerance | MongoDB, Cassandra, Google BigTable, CouchDB, Riak, HBase
High scalability | Google BigTable
Consistency | MongoDB, Google BigTable, Redis, HBase
Auto-sharding | MongoDB
Write frequently, read less | MongoDB, Redis, Cassandra
Fault tolerance (no single point of failure) | Riak
Concurrency control (MVCC) | Riak, Dynamo, CouchDB, Cassandra, Google BigTable
Concurrency control (locks) | MongoDB, Redis, Google BigTable


V. ACKNOWLEDGMENTS

It has been a great experience undertaking research on "A Comparative Study of NoSQL Databases", because it gave me an opportunity for immense learning and value addition, and it also enriched my knowledge. I take the opportunity to express my profound gratitude and deep regards to Mr. Bhavesh N. Gohil, my mentor, for his keen and sustained interest; he kept guiding, monitoring and continuously encouraging me towards the qualitative completion of this work.

VI. REFERENCES

[1]. Dominik Bruhn, "Comparison of Distribution Technologies in Different NoSQL Database Systems", Institute of Applied Informatics and Formal Description Methods (AIFB), Karlsruhe Institute of Technology (KIT).
[2]. Kai Orend, "Analysis and Classification of NoSQL Databases and Evaluation of their Ability to Replace an Object-relational Persistence Layer", Master Thesis, Technical University of Munich, Munich, 2010.
[3]. http://nosql.findthebest.com/ [Online; accessed 10-April-2014]
[4]. MongoDB, "NoSQL Explained", http://www.mongodb.com/nosql-explained [Online; accessed 14-April-2014]
[5]. 10gen.com: Home - MongoDB, http://mongodb.org/ [Online; accessed 6-April-2014]
[6]. 10gen.com: BSON - MongoDB, http://www.mongodb.org/display/DOCS/BSON, 2009 [Online; accessed 10-April-2014]
[7]. Wikipedia: en.wikipedia.org/wiki/Apache_Cassandra [Online; accessed 20-March-2014]
[8]. http://www.datastax.com/documentation/articles/cassandra/cassandrathenandnow.html [Online; accessed 25-March-2014]
[9]. Brian Holcomb, "NoSQL Database in the Cloud: Riak on AWS", June 2013.
[10]. Wikipedia: en.wikipedia.org/wiki/Riak [Online; accessed 5-April-2014]
