Science, Engineering & Education, 1, (1), 2016, 75-82
Development of a Node.js module for working with an effective temporal model in NoSQL databases
Dimitar Pilev, Ventzislav Nikolov *
University of Chemical Technology and Metallurgy, 8 Kl. Ohridski, 1756 Sofia, Bulgaria
Received 15 March 2016, Accepted 05 September 2016
ABSTRACT
The ever-growing user demand for access to information requires the use of temporal database models, in which not only the data but also the time intervals of their validity are stored. The present communication describes the development of a Node.js software module for processing temporal data in the non-relational DBMS MongoDB. The asynchronous software model leads to fast processing of large amounts of data in temporal databases.
Keywords: temporal databases, Node.js, NoSQL database, MongoDB, asynchronous programming.
* Correspondence to: Ventzislav Nikolov, University of Chemical Technology and Metallurgy, 8 Kl. Ohridski, 1756 Sofia, Bulgaria

INTRODUCTION
The continuous increase in the number of users who rely on the internet for communication, payment, shopping, etc. calls for new technologies in the implementation of web-based applications. Such applications have to direct a large number of user queries towards the large amount of information stored in databases. One of the main requirements for any contemporary web application is a short response time: the response time of submitted user queries should stay low regardless of their number and complexity.
At present there are a number of platforms for building web-based applications. Among them are the well-known Java EE [1], Apache PHP [2] and ASP.NET [3], as well as the newer and increasingly popular Node.js [4, 5].
The Node.js [6] platform is quickly gaining popularity among web developers thanks to the ease and speed with which it processes user queries, and its use is growing rapidly. Leading companies such as PayPal, LinkedIn and WalMart [7] have rewritten their applications using Node.js.
In many cases the database is the primary source of delay in processing user queries, and the delay grows in proportion to the amount of data processed. This is why leading companies such as Google, Facebook and Amazon are moving towards a new type of database, the non-relational database (NoSQL - Not only Structured Query Language) [8, 9].
A NoSQL database provides a mechanism for storing and retrieving data that uses a free-form data model instead of the relational model. A non-relational database is an optimized store of key-value information. It is designed to make the retrieval and addition of information easy and to remain efficient when large amounts of data are entered and stored.
Traditional database models and DBMS are capable of storing and processing only the present state of the modeled subject area; old values are destroyed whenever a change is made.
The constantly increasing demand for information calls for temporal databases, which store not only the data but also the period of its validity. One of the major reasons why changes of data state over time go unrecorded is the lack of adequate support for the temporal factor in DBMS. Until recently the existing DBMS did not allow temporal processing of data. Given the importance of the problem, however, commercial DBMS vendors such as Oracle and Teradata have already published [10], [11] new specifications of DBMS with temporal support. An additional layer on top of the MySQL DBMS has been developed in [12] to implement it.
The purpose of the development described in this communication is to implement a module for the Node.js platform that provides effective-time temporal support for data stored in a MongoDB database. The module processes queries for the addition, deletion, modification and searching of temporal information in the database.

An effective temporal model
The effective temporal model (ETM) uses effective time, a combination of valid and transactional time. The beginning of the effective period is set at the current time, i.e. the moment of recording the tuple (cortege) in the DB (the beginning of the transactional time period, Ts), and its end is set at the end of the valid time period, Ve.
Fig. 1. Updating data in relation: a) addition of data, insert(r, (a1, …, an), te′); b) deletion of data, delete(r, (a1, …, an)); c) modification of data, update(r, (a1, …, an), te′) = insert(delete(r, (a1, …, an)), (a1, …, an), te′).
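As a rough illustration of the operations that Fig. 1 summarises, the sketch below models a relation as an in-memory JavaScript array. It is not the authors' implementation. The period representation ({ start, end } as numbers), the helper sameAttrs, the name remove (chosen because delete is a reserved word in JavaScript), the explicit lower bound in overlaps and the separate old and new attribute values in update (borrowed from the module's update(oldObj, newObj, pe, callback)) are assumptions made for the example. Only the names meets, overlaps and changeETP, and the rule that an update is a delete followed by an insert, come from the text.

```javascript
// Minimal in-memory sketch of the ETM operations (illustration only).
// A period te is a half-open interval { start, end }; a tuple is { attrs, te }.

// Hypothetical helper: attribute equality (sensitive to key order).
const sameAttrs = (a, b) => JSON.stringify(a) === JSON.stringify(b);

// te ends exactly where te2 begins (Ee = Es').
const meets = (te, te2) => te.end === te2.start;

// te2 begins inside te (Es <= Es' < Ee); the lower bound is an assumption.
const overlaps = (te, te2) => te.start <= te2.start && te2.start < te.end;

// Merged period [Es, Ee').
const changeETP = (te, te2) => ({ start: te.start, end: te2.end });

// Addition: extend a meeting or overlapping period, otherwise add a new tuple.
function insert(r, attrs, te2) {
  const i = r.findIndex(t => sameAttrs(t.attrs, attrs) &&
    (meets(t.te, te2) || overlaps(t.te, te2)));
  if (i === -1) return r.concat([{ attrs, te: te2 }]);
  return r.map((t, j) => (j === i ? { attrs, te: changeETP(t.te, te2) } : t));
}

// Logical deletion: close the period of a tuple that is current at `now`.
function remove(r, attrs, now) {
  return r.map(t =>
    sameAttrs(t.attrs, attrs) && t.te.start <= now && now < t.te.end
      ? { attrs: t.attrs, te: { start: t.te.start, end: now } }
      : t);
}

// Modification: delete followed by insert.
function update(r, oldAttrs, newAttrs, te2) {
  return insert(remove(r, oldAttrs, te2.start), newAttrs, te2);
}

let r = insert([], { x: 1 }, { start: 0, end: 10 });
r = insert(r, { x: 1 }, { start: 5, end: 20 });           // overlapping period is extended
r = update(r, { x: 1 }, { x: 2 }, { start: 8, end: 30 }); // old tuple is closed, new one added
console.log(JSON.stringify(r));
```

Each operation returns a new array rather than changing the old one, which mirrors the statement that recording data returns a new, updated version of the relation.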
The addition of data to the relation under ETM is performed when facts (a1, …, an) that are not yet recorded in the DB have to be registered. Such facts are usually valid for a definite period of time. The effective time coincides with the valid time, which starts when the facts are recorded in the DB; that is, the facts are valid in the modeled reality from the moment they are recorded. Recording the data returns a new, updated version of the relation.
Functions on effective temporal intervals are defined in the model in order to express the semantics of data updating. For two effective periods te = [Es, Ee) and te′ = [Es′, Ee′) they are:
- meets(te, te′), which tests whether the two intervals meet: it returns true if Ee = Es′ and false otherwise;
- overlaps(te, te′), which tests for partial overlap of the two intervals: it returns true if Es′ < Ee and false otherwise.
There are three cases for the addition of new data to the relation (Fig. 1a).
Case one: The values of the time-independent attributes (a1, …, an) have not been recorded in the relation so far, or belong only to its previous states. In that case no record exists whose effective period overlaps the new period, and a new tuple is added to the relation.
Case two: If the values of the attributes (a1, …, an) are recorded in the relation and a record exists whose period of validity te meets the new period te′, i.e. the effective period used to mark the tuple, that period is updated to [Es, Ee′).
Case three: If the values of the attributes are part of the current state of the relation (the effective period marking the data in the tuple is still active), then data modification, rather than data addition, is required.
Updating data (a1, …, an) that is part of the current state of the relation requires the function changeETP(te, te′), which changes the effective time period marking the data in the tuple. It is applied in the same way regardless of whether the old and the new periods (te and te′) meet or overlap.
The deletion of a tuple consists of its logical removal from the current state of the relation (Fig. 1b). The logical removal is performed by correcting the time period marking the non-temporal attributes (a1, …, an): the end of the effective period is set to the time that is current at the moment of deletion. For this purpose a function of the effective period te is used, which closes the period at the current time CT when CT lies within te. If the values of the attributes (a1, …, an) do not exist, or exist but are not part of the current state of the relation, the deletion produces no effect.
The modification of an existing tuple is performed through the "update" operation (Fig. 1c), defined as the consecutive execution of the operations "delete" and "insert". The proposed ETM significantly reduces the volume of stored data by reducing redundant information in the DB, while still allowing a precise evaluation of data validity over time.

An asynchronous programming model with Node.js
Node.js is a platform built on the Google V8 JavaScript engine [13] for easy building of fast and scalable network applications. Node.js uses an event-driven, asynchronous input/output model of operation based on a single process, which makes it lightweight and efficient for real-time applications.
Node.js allows the development of applications written in JavaScript that are executed on the server. The platform supports the most popular operating systems, such as Windows, Mac and Linux. The environment can interact with input/output devices through its API written in C++, which also makes it possible to use external libraries written in other languages from JavaScript code.
Node.js has a modular architecture that simplifies the creation of complex applications. It contains built-in asynchronous input/output libraries for working with files, sockets and HTTP communication.
With traditional, well-known platforms such as Java EE and PHP, queries are processed by creating a separate thread or process for each query. In contrast, Node.js operates with a single thread based on asynchronous query processing (Fig. 2). For this purpose the so-called event cycle (event loop) is run, which allows tens of thousands of simultaneous queries to be processed without the memory cost of one thread per query and without the overhead of switching between separate threads.
The Node.js platform is only an environment, which means that the programmer must do everything himself. By default there is neither an HTTP server nor any other server; a single script contains the entire communication with the users. This significantly reduces the resources used by the web application.

The data model of MongoDB
MongoDB [14] is a document-oriented database management system that stores structured information in JSON format [15] with a dynamic schema. This makes the integration of information in certain applications much easier and faster. MongoDB has gradually become one of the most popular non-relational databases, often referred to as NoSQL.
Table 1 shows the terminology used with MongoDB compared to that of the relational model.
Data in MongoDB is stored in the form of documents in JSON format (Fig. 3). Within a document, the value of a field may be of any type, including another document, an array, or an array of documents.
MongoDB stores all documents in collections. A collection (Fig. 4) is a group of related documents that share a set of common indexes. Collections are analogous to tables in relational databases. The advantage of MongoDB is that it stores the information for a particular object in a single document (Fig. 5), instead of dividing it among several tables as is typical for the relational model.
Fig. 2. Node.js event cycle.
Fig. 3. MongoDB document.
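To make the notion of a document more tangible, the sketch below writes one as a plain JavaScript object of the kind that could be stored as a single MongoDB document. The invoice structure and all field names and values are invented for the example and are not taken from Fig. 3 or Fig. 5.

```javascript
// Hypothetical invoice kept as one document: the customer is an embedded
// document and the items are an array of embedded documents.
const invoice = {
  _id: 1,
  number: 'INV-0001',
  customer: { name: 'Example Ltd.', city: 'Sofia' },
  items: [
    { name: 'Item A', quantity: 2, price: 10.5 },
    { name: 'Item B', quantity: 1, price: 4 }
  ]
};

// Everything needed for the total is inside the document itself,
// so nothing has to be combined from other tables or collections.
const total = invoice.items.reduce(
  (sum, item) => sum + item.quantity * item.price, 0);

console.log(JSON.stringify(invoice, null, 2));
console.log('total:', total);
```

In a relational design the same data would typically be split into an invoice table, a customer table and an item table that are joined at query time.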
Table 1. Terminology used with MongoDB.
SQL            MongoDB
Database       Database
Table          Collection
Row            Document
Column         Field
Index          Index
Table joins    Embedded documents
Primary key    Primary key
The disadvantage of non-relational DB, however, is the lack of, or only minimal support for, joins between collections. Extracting information stored in several different collections has to be done by the application.
MongoDB processes the queries submitted to the database asynchronously, which significantly increases the efficiency of the system when working with large data sets. This makes it suitable for storing and processing temporal data.

Fig. 4. MongoDB collection.
Fig. 5. MongoDB storage of data regarding an issued invoice.

Design and implementation of the module for working with ETM under MongoDB
During the development of the module, a normalized data model was selected in which each document in the collection is marked with effective time. For this purpose an additional field, "period", is added to the document data; it sets the beginning and end of the effective time period marking the document (Fig. 6).

Fig. 6. Storage of data under the model supported by the temporal module.

The first step in developing a Node.js module is the creation of the package.json file, which provides information about the module developed. Fig. 7 shows the contents of the package.json file used. The first two rows contain the name and the version, i.e. the mandatory fields for each Node.js module. The description is followed by the keywords and the name of the developer. The MongoDB interface for Node.js is used to connect to MongoDB; it is included in the developed module as a dependency (Fig. 7, row 16).

Fig. 7. Contents of the package.json file describing the module developed.

The functionality of the module is distributed among the following functions:
- connect(mongoUrl, collection, callback) - establishes a connection to the MongoDB server. The input consists of the connection parameters of the MongoDB server, the collection, and a callback function that starts after the connection is established;
- insert(insObj, pe, callback) - adds data marked with effective time to the collection. The function requires the data to be added to the collection (in the form of a document), the end of the time period marking the data, and the callback function;
- remove(removeObj, callback) - deletes data from the collection. Two parameters are set: the criteria used to perform the deletion and a callback function, as is typical for the asynchronous programming model;
- update(oldObj, newObj, pe, callback) - modifies data in the collection. The criteria for the update, the new values, the end of the marking time period and the callback function are set;
- find(query, options, callback) - extracts temporal data from the collection. The "query" parameter sets the criteria for extracting data from the collection. The second parameter, "options", is optional and provides additional settings such as sorting, a limit on the number of documents returned by the query, etc. The callback function processes the query result;
- close() - terminates the connection to the MongoDB server.
All operations related to data updating and/or extraction are performed on a single collection.
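The callback-based interface listed above can be mimicked without a database. The following sketch is a stand-in written for illustration only: it keeps documents in an array instead of a MongoDB collection, the factory createStore is hypothetical, and the layout of the "period" field as { start, end } is an assumption, since the text states only that the field holds the beginning and end of the effective period. It follows the described behaviour that remove is a logical deletion and that update runs insert inside the callback of remove.

```javascript
// In-memory stand-in for the callback API (no MongoDB involved).
function createStore() {
  const docs = [];
  const matches = (doc, query) =>
    Object.keys(query).every(key => doc[key] === query[key]);

  // insert: stamp the document with the period [now, pe).
  function insert(insObj, pe, callback) {
    docs.push(Object.assign({}, insObj,
      { period: { start: new Date(), end: pe } }));
    setImmediate(callback, 'Done');
  }

  // remove: logical deletion, closes the period of matching current documents.
  function remove(removeObj, callback) {
    const now = new Date();
    docs
      .filter(d => matches(d, removeObj) &&
        d.period.start <= now && now < d.period.end)
      .forEach(d => { d.period.end = now; });
    setImmediate(callback, 'Done');
  }

  // update: remove first, then insert from inside the remove callback.
  function update(oldObj, newObj, pe, callback) {
    remove(oldObj, function (res) {
      if (res === 'Done') insert(newObj, pe, callback);
      else callback('Cannot update collection!');
    });
  }

  // find: "options" is optional; only a limit is supported in this sketch.
  function find(query, options, callback) {
    if (typeof options === 'function') { callback = options; options = {}; }
    let result = docs.filter(d => matches(d, query));
    if (options.limit) result = result.slice(0, options.limit);
    setImmediate(callback, result);
  }

  return { insert, remove, update, find };
}

const store = createStore();
const pe = new Date(Date.now() + 24 * 60 * 60 * 1000); // period end: one day ahead
store.insert({ x: 1, y: 2 }, pe, function () {
  store.update({ x: 1 }, { x: 1, y: 3 }, pe, function () {
    store.find({ x: 1 }, function (found) {
      found.forEach(d =>
        console.log(d.y, d.period.end < pe ? 'closed' : 'current'));
    });
  });
});
```

Nesting each call inside the previous callback is what enforces the order insert, update, find; calling the three functions one after another at the top level would not guarantee that order in an asynchronous environment.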
    var update = function(oldObj, newObj, pe, callback){
        remove(oldObj, function(res){
            if(res == 'Done' || res.result.ok == 1)
                insert(newObj, pe, callback);
            else
                callback('Cannot update collection!');
        });
    };

The implementation of the "update" function complies with the requirements of ETM and with the asynchronous programming model of Node.js. In ETM, the "update" operation is reduced to the consecutive execution of the operations "delete" and "insert". The new data is added in the callback function of the "remove" function in order to preserve the order of the two operations, which guarantees that the new values are added only after the existing ones have been deleted.
Fig. 8 shows code that demonstrates the use of the module developed. A connection to the MongoDB server is established after the module is loaded and the connection parameters are set (rows 1 and 2). Then an anonymous callback function is executed to add the document {x:1,y:2}, marked with a time period from the current time of execution of the query to 15.12.2016 (row 6). The "find" function is executed (row 8) only after the document has been stored in the DB. As a result, all documents stored in the "proba" collection are printed.

    var etm = require("./lib/index.js");
    var mongoUrl = 'mongodb://127.0.0.1:27017/test';
    etm.connect(mongoUrl, 'proba', function(){
        console.log('Connected correctly to server');
        etm.insert({x:1,y:2}, new Date(2016,2,5),
            function(res){
                console.log(res.result);
                etm.find({}, function(docs){
                    docs.toArray(function(err,data){
                        console.log(data);
                        etm.close();
                    });
                });
            });
    });

Fig. 8. Usage of the module developed.

CONCLUSIONS
The present article introduces Node.js, a new platform for building network applications. The environment allows quick and effective processing of user queries thanks to asynchronous processing. The advantages and disadvantages of MongoDB, one of the most popular non-relational DBMS, are reviewed. A module has been developed for the Node.js platform that processes temporal data in a MongoDB environment. The module allows the update and extraction of data marked with effective time. The asynchronous software model leads to fast processing of large amounts of data in temporal databases.

REFERENCES
1. Oracle, Java EE Documentation, available: http://www.oracle.com/technetwork/java/javaee/documentation/index.html
2. The Apache Software Foundation, available: http://www.apache.org/
3. PHP, PHP Documentation, available: http://php.net/docs.php
4. ASP.NET, Get Started with ASP.NET, available: http://www.asp.net/get-started
5. NodeJS, available: http://nodejs.org/
6. JSConf Berlin, The European JavaScript Conference, Berlin, November 7 & 8, 2009, available: http://jsconf.eu/2009/
7. WalMart, available: http://www.walmart.com/
8. P. Sadalage, NoSQL Databases: An Overview, ThoughtWorks, October 2014.
9. Nosql Database, available: http://nosql-database.org/
10. Oracle Database, Workspace Manager Developer's Guide, September 2010.
11. Teradata Database, Temporal Table Support, Teradata Labs, 2012.
12. D. Pilev, A. Georgieva, Effective Time Temporal Database Model, International Journal on Information Technologies and Security, 2012, N2: 33-46, ISSN 1313-8251.
13. Google V8, V8 JavaScript Engine, available: https://code.google.com/p/v8/
14. MongoDB, available: https://www.mongodb.org/
15. Introducing JSON, available: http://json.org/