<<

Conceptual Mappings to Convert Relational into NoSQL

Myller Claudino de Freitas1, Damires Yluska Souza2 and Ana Carolina Salgado1 1Center for Informatics, Federal University of Pernambuco, Professor Luis Freire ave, Recife, Brazil 2Academic Unit of Informatics, Federal Institute of Education, João Pessoa, Brazil

Keywords: Relational Databases, NoSQL Systems, Conceptual Mappings, Data Conversion.

Abstract: Sometimes, data belonging to Relational databases need to be transferred to NoSQL ones. However, the data conversion process between Relational to NoSQL databases is considered as not trivial, since it is necessary to have considerable knowledge about the data models at hand. Regarding the structural heterogeneity underlying this problem, we propose an approach, named as R2NoSQL, which defines conceptual mappings to enhance the data conversion process. In this paper, we present our approach and some implementation and experimental results, which show that, by using the defined conceptual mappings, we obtain a consistent target NoSQL with respect to a source Relational one.

1 INTRODUCTION establish conceptual mappings between RDB elements and the different data structures Due to the increasing amount of data generated by underlying NSDBi in such a way that RDB can be user interactions on the Web or by big data converted to NSDBi. requirements, some companies are focusing on using With this in mind, and considering the structural non-relational databases, usually referred to as heterogeneity between Relational and NoSQL NoSQL systems, standing for ’Not only SQL’ (Han models, this paper presents the R2NoSQL approach et al., 2011). This term has been used to categorize for converting data between the referred models. To databases characterized by horizontal scalability, this end, it compares the data structures belonging to less constrained structure or schema-less, and faster the Relational model with the four main NoSQL access compared to traditional relational databases approaches (key-value, columns, documents and (RDBMS) (McMurtry et al., 2013). graphs), identifying a of possible conceptual Experts comment that despite the rise of NoSQL mappings between RDB and a NSDBi. Also, it databases during the past years, NoSQL is not provides a tool prototype, which implements a case necessarily a replacement for relational databases study with a and a Document (McMurtry et al., 2013). Instead, NoSQL databases based NoSQL system. Experiments have been done comply with big or social data demands or specific to evaluate the consistency of the generated projects which strain Relational ones. Nevertheless, mappings by analysing the results obtained from the sometimes, data belonging to Relational databases same set of queries executed on both systems. need to be transferred to NoSQL ones in order to be This paper is organized as follows: Section 2 used in specific projects. However, the data introduces some concepts; Section 3 presents the conversion process is not trivial, since it is necessary approach; Section 4 describes some obtained results. to have considerable knowledge about the data Related work is discussed in Section 5. Section 6 models at hand. draws our conclusions and points out future work. In this scenario of structural heterogeneity, we define our research problem as follows: Let RDB be a Relational database and NSDB = 2 NoSQL MODELS {NSDB1, ..., NSDBn} a set of databases belonging to NoSQL models, where each NSDBi uses one NoSQL systems are a category of databases that do of the following NoSQL approaches A = {Key- not follow principles of the Relational Model (Han value, , Document, Graph}. We need to et al., 2011). The term “NoSQL” does not relate to a

174

Freitas, M., Souza, D. and Salgado, A. Conceptual Mappings to Convert Relational into NoSQL Databases. In Proceedings of the 18th International Conference on Enterprise Systems (ICEIS 2016) - Volume 1, pages 174-181 ISBN: 978-989-758-187-8 Copyright c 2016 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved

Conceptual Mappings to Convert Relational into NoSQL Databases

specific , but to a group of data models atomic attributes through the representation of value that differ from the relational approach and may lists. Instances may have a different number of have in common some features such as: they are attributes, since there is no need to book storage usually open-source, distributed and horizontally space for values. Also, there is no need to use scalable, and they present schema flexibility or even operations in order to query diverse entities. no schema (Han et al., 2011). NoSQL systems are classified in some categories in which the four main 2.3 Document Model are: Key-value, Columns, Documents, and Graph. Indeed, their implementations may differ from each In Document Model, the data entities are grouped in other, even when the systems belong to a similar documents as objects, which are composed by keys category. In order to base our descriptions, we have (properties) and values. These documents are usually chosen one example of each NoSQL category. They serialized in JSON syntax (McMurtry et al., 2013). are briefly discussed in the following. A document is a collection of objects that are related to a data instance. The various documents 2.1 Key-value Model belonging to the same are stored in a collection of documents.Considering the MongoDB The Key-value Model is the one with the simplest document system (Mongo, 2015), an instance key representation. Its structure consists of a list of pairs (called as an “objectId”) can be set at persistence composed by a key and a value (Istvan et al., 2013). time, or may have its value generated randomly by Usually, Key-value NoSQL implemented the database. It can provide uniqueness values for systems allow, besides simple data types (e.g., other fields by the specification of an index. numerals and strings), the use of lists and sets of This model allows more complex queries values of simple types. This is what happens, for involving different collections of documents. To this instance, in Redis (Redis, 2015), our example of end, it is necessary that a document has a DBRef Key-value system. A Key-value system such as (Database Reference) to another related document or Redis tends to support large volumes of data. Since establish a reference. Despite allowing references, it does not present data schemas, the developer may, DBRef does not guarantee by hand, introduce some metadata by naming the constraint. A query may consider these references or keys. On the other hand, it does not support queries use data embedded within the same document. to be performed on the data, but only on the search keys. Thus, all access is done through the search 2.4 Graph Model keys and only with the key it is possible to access the value. This access usually is accomplished with The Graph model is mainly concerned with lower response times, one of its main benefits. This representation and access, where data items are model does not support relationships in terms of connected by relationships by means of a graph reference keys and no referential integrity constraint. structure (McMurtry et al., 2013). The elements underlying a graph are, namely (McMurtry et al., 2.2 Column Model 2013): nodes, edges and properties. Nodes correspond to data instances, edges refer to At a first sight, this model may be considered as maintained relationships among node instances, and similar to the Relational one, since it is also properties relate to data values. Some systems of organized in terms of rows and columns. However, such category allow the definition of their properties this approach deals with data in a non normalized with the guarantee of unique values. One example way, i.e., by allowing nesting of tables inside tables regards the Neo4j system (Neo4j, 2015). (Lakshman and Malik, 2010). In this approach, rows Nodes and edges can contain labels (terms which do not store a , but a set of attributes of the indicate a category) that classify them into more same type, while the set of attributes of a column specific groups. For the nodes, these labels can be contains the information from a given instance. Such used to differentiate instances. On edges, labels may feature allows queries to be performed more also be used to determine the type of relationship efficiently, although when recovering a complete that is occurring. instance it may become more costly. An edge has an input node and an output node Another important concept regards a “family of linking them. This feature, besides supporting columns”, which means a set of instances of a given references, also guarantees referential integrity by entity. In this structure, it is possible to have non- ensuring that the input node always makes reference

175 ICEIS 2016 - 18th International Conference on Enterprise Information Systems

to the output node. The access keys to the nodes are Definition 7 - Data Item. A data item is an automatically set by the system. However, it is instance or an individual of a real entity in the data possible to establish unique constraints for other set at hand. node properties. In the Relational Model, a data item is a tuple. In NoSQL approaches, it can be a data node, a data document, a column of a column family or simply a 3 THE R2NoSQL APPROACH value from the key value model.

In this section, we present some definitions along 3.2 Our Proposal with the proposed approach. As discussed in the previous sections, each database 3.1 Some Definitions model has specific data structures and concepts, what provides structural heterogeneity conflicts At first, we provide some definitions regarding the among them. These conflicts occur because different concepts underlying E-R and Relational models that choices of construct representation or integrity we consider in our approach. Since we need to constraints are adopted in accordance with the think about mappings between concepts, we define options underlying each data model. Thereby, in this what we consider by “Concept” in each one of the work, the task we are dealing with is concerned with working data models. Regarding the E-R Model, we what is necessary to convert concepts of a given may summarize a Concept as follows. RDB (a Relational database) to a NSDBi (a NoSQL Definition 1 – E-R Concept. The set of one). Thus, it becomes necessary to specify concepts of an E-R conceptual model we are dealing conceptual mappings between concepts Cr ∈ RDB and concepts Cn ∈ NSDBi. with are CE = {Entity, Simple Attribute, Multi-valued Attribute, Composed Attribute, Relationship, Our proposal, named as R2NoSQL approach, is Specialization}. based on three aspects: (i) defining conceptual In the light of the Relational Model, we define a mappings between RDB and NSDBi; (ii) using these Concept as follows. conceptual mappings to allow metadata and data Definition 2 – Relational Concept. Concepts of conversion between the referred databases, and (iii) classifying source tables to help understanding their a relational structure are CR = {, Simple Attribute, (PK), (FK)}. meaning in the . In the following, we As discussed in Section 2, we observe that each provide the definitions underlying these issues. NoSQL has specific data structures. Thereby, we provide the definition of the main 3.2.1 Conceptual Mappings Concepts of the four categories of NoSQL systems Our approach deals with the structural heterogeneity described previously. of the data models and some aspects of database Definition 3 – Key-value NoSQL Concept. A design. In order to cope with these issues, our concept in a Key-value Model may be C = {Search K mapping language handles the different existing Key, Value, Value List, Value Set}. concepts, which belong to the data models, but as a Definition 4 – Column NoSQL Concept. In a design reference, we consider some concepts not Column Model, a concept may be C = {Column C only from the Relational model but also from the E- Family, Line, Column, Value Set, Value List, R conceptual model. Thereby, we consider concepts Primary Key (PK)}. from the Conceptual E-R model, which are not Definition 5 – Document NoSQL Concept. In a directly implemented in a Relational database, but Document Model, a concept may be C = D they are close to real world and can be implemented {Document Collection, Document, Field, Embedded in NoSQL systems. This conceptual mapping is the Field, Field List, ObjectId, DBRef}. base for our conversion solution and without it, the Definition 6 – Graph NoSQL Concept. A process could not happen. These concepts regard Graph model concept C = {Label, Data Node, G particularly the composed and multi-valued Property, Property Set, Id, Edge, Value Set, Value attributes, and also specializations. By establishing List}. that, we deal with a source model and a target model Data indeed are instantiated differently in each and we define the set of possible source Concepts, to one of the referred databases. Nevertheless, we can be considered in a Mapping, as the following: think about a data item or a data instance, in a general way, as follows.

176 Conceptual Mappings to Convert Relational into NoSQL Databases

Definition 8 – Source Concept. A source concept RDB:FK ≡ NSDBK: ∄ Cs is the set of possible E-R or Relational concepts RDB:Specialization ≡ NSDBK:ValueSet which may be used to compose a Mapping. Thus, CS Regarding data items, we may establish a = CE U CR. Proceeding with the union , the mapping in the following way: final set results in CS = {Entity, Simple Attribute, RDB:DataItem ≡ NSDBK:Value | Multi-valued Attribute, Composed Attribute, NSDBK:ValueList Relationships, Specialization, Table, Primary key Although there is no corresponding concept to a (PK), Foreign key (FK)}. Since a conceptual Entity Table, it is possible to simulate such concept by always results in a relational Table, we abstract both using composed search keys. In this case, keys are ones only in the concept Table. composed by a prefix together with the name of a In the same way, we establish a target Concept, given property, in such a way that we may identify as the following. to which entity it is associated. Indeed, it is not a Definition 9 – Target Concept. A target concept defined standard, but one of our proposals. CT is the set of possible NoSQL concepts which may The representation of relationships occurs with the be used to compose a Mapping. CT = CK | CC | CD | storage of the search key values of a given data item CG. inside another one. In many-to-many relationships, Thus, the set of target concepts is composed by this happens in both sides of the data items at hand. the possible concepts which belong to one of the Now let RDB be a source database composed by NoSQL systems instantiated by a specific model. CS and NSDBC, a Column NoSQL system, composed With these definitions in mind, we define, firstly, by CC. Specific structural conceptual mappings in a general way, a Conceptual Mapping, as follows. between CS and CC are defined, as follows. Definition 10 – Conceptual Mapping. A RDB:Table ≡ NSDBC:ColumnFamily conceptual mapping M represents an association RDB:SimpleAttribute ≡ NSDBC:Column between a concept CS and a concept CT of a given RDB:ComposedAttribute ≡ NSDBC:ValueSet NSDSi, where NSDSi ∈ A, and A = {Key-value, RDB:Multi-valuedAttribute ≡ NSDBC:ValueList Column, Document, Graph}. M defines a level of RDB:PK ≡ NSDBC:PK similarity between CS and CT. RDB:FK ≡ NSDBC: ∄ A conceptual mapping M may be understood as a RDB:Specialization ≡ NSDBK: ValueSet way of converting a given CS into a CT. Depending Regarding data items, we may establish a on the target NSDSi, to a given CS, there may be no mapping in the following way: corresponding CT, i.e., there may be no concept in RDB:DataItem ≡ NSDBC:Line the target model that can be used for data In terms of relationships, a NSDBC allows their conversion. When this fact happens, we point it as implementation by two options: (i) a column family an empty or non existing target concept (∄). may compose information from different but related Based on the previous definitions, we establish tables; or (ii) data items may have a reference to specific conceptual mappings between CS and CT, other data items by storing the target primary key. according to the NSDSi at hand. To this end, we The former is the most common option, since it consider the possibility of employing a table allows a better response time. technique, which is the process of Now let RDB be a source database composed by adding redundant data or grouping data, previously CS and NSDBD, a Document NoSQL system, fragmented in a number of relational tables. composed by CD. Structural conceptual mappings Thereby, we may have nested tables or multi-valued between CS and CD are defined, as follows. attributes in one or more target structures, which RDB:Table ≡ NSDBD:DocumentCollection may be sets, lists, documents or other ones, RDB:SimpleAttribute ≡ NSDBD:Field depending on the data model. RDB:ComposedAttribute ≡ Let RDB be a source database composed by CS NSDBD:EmbeddedField and NSDBK, a Key-value NoSQL system, composed RDB:Multi-valuedAttribute ≡ NSDBD:FieldList by CK. Specific structural conceptual mappings RDB:PK ≡ NSDBD:ObjectId between CS and CK may be defined, as follows. RDB:FK ≡ NSDBD:DBRef | RDB:Table ≡ NSDBK: ∄ NSDBD:EmbeddedField RDB:SimpleAttribute ≡ NSDBK:Value RDB:Specialization ≡ NSDBD: EmbeddedField RDB:ComposedAttribute ≡ NSDBK:ValueList Regarding data items, we may establish a RDB:Multi-valuedAttribute ≡ NSDBK:ValueList mapping in the following way: RDB:PK ≡ NSDBK:SearchKey RDB:DataItem ≡ NSDBD:Document

177 ICEIS 2016 - 18th International Conference on Enterprise Information Systems

Relationships are implemented by defining generates a NSDBi as output. object references between objects belonging to ------documents. Thereby, queries may take into account Algorithm1: Data Conversion. these references to get related objects information. ------Now let RDB be a source database composed by Input: RDB ; C and NSDB , a Graph NoSQL system, composed Output: NSDSi ns; S G Begin by CG. Specific structural conceptual mappings //Looks for main tables between CS and CG are defined, as follows. 1: For Each table of rel Do table RDB:Table ≡ NSDBG:LabelNode 2: If ( .classification() is “main”) Then RDB:SimpleAttribute ≡ NSDBG:Property 3: get all table attributes to object RDB:ComposedAttribute ≡ NSDBG:ValueSet 4: For Each table referencing table Do RDB:Multi-valuedAttribute ≡ NSDBG:ValueList //looks for related tables 5: goDeep(table, object); RDB:PK ≡ NSDBG:Id //persists object on ns RDB:FK ≡ NSDBG:Edge 6: persist(object) ; RDB:Specialization ≡ NSDBK:ValueList 7: End For; Regarding data items, we may establish a 8: End If; mapping in the following way: 9: End For; End Data_Conversion; RDB:DataItem ≡ NSDBG:Node ------In the next section, we provide the way we In our approach, the input RDB is composed by classify identified tables in a given RDB. its metadata and data. Tables were already classified 3.2.2 Table Classification and this classification is also used as input. Based on that, the algorithm verifies the kinds of existing tables and, for each type, extracts the set of data Regarding the set of concepts CR ∈ RDB, the main one is always a Table. Since a table may be the items. Through references (FKs), data tables are result of E-R conceptual entities, relationships, traversed and analyzed. At such verification time, specializations, multi-valued or composed attributes, decisions are taken according to each type of table. ------we need to understand what a Table means, and its importance, to the RDB at hand. We have defined a Algorithm2: goDeep. classification of the source Tables as follows. ------Input: table, object; ƒ Main Tables: These are the main tables in a Output: table, object; RDB design. They usually correspond to Begin entities found in the conceptual model. //Looks for tables that reference table ƒ Subclasses: These tables are a complement to 1: table2 := findTableDeep(table); 2: Do Switch table2.classification the definition of a main table. They represent 3: case “common”: specializations of the main tables, but do not 4: get all table2 attributes to object; exist independently. 5: goDeep(table2); 6: case “subclass”: ƒ Relationships: This classification typifies a 7: get all table2 attributes to object; specific kind of table, which implements a 8: goDeep(table2); many-to-many (N:N) relationship in the 11: End Switch; conceptual model. //looks for related tables 10: goUp(table, object); ƒ Common Tables: The other types of tables are End goDeep; defined as common in a source RDB schema. ------Our proposed algorithms take into account such Main tables constitute the basis for the analysis. table classification in order to identify the With a main table at hand, the algorithm verifies if it conceptual mappings to be used. is related with other ones. In this case, it calls another algorithm (Algorithm2–goDeep) where 3.2.3 Conversion Algorithm existing relating tables are identified. Regarding Based on the specified conceptual mappings these selected tables, some options may be taken, namely: (i) If the table is another main table, no between CS and CT, and on the table classification, some algorithms have been developed to allow data other procedure is accomplished because, later, the conversion. A main algorithm, named as opposite direction of the relationship will be Algorithm1-Data Conversion, receives a RDB considered. At this later time, the second table will enriched with a Table Classification as input and refer to the first one; (ii) If the table is a relationship

178 Conceptual Mappings to Convert Relational into NoSQL Databases

table, no action is taken because, only when all the the data items that are referenced. For each type of data items are persisted, relationships can be implemented target database, the algorithm must analyzed; (iii) If the table is a common one, it is implement a specific procedure. In this work, we possible to add its attributes in that main table as show one regarding the MongoDB system. This part of its structure; (iv) If the table is a subclass, algorithm, named as oneTo, was implemented to then its attributes are added to the main table. It is provide DBRef storage. It stores in the understood that it is a specialization of the main one. corresponding document of Table 1 a reference to Each time a new table is found in-depth analysis, list2, and in the one corresponding to Table 2, a the algorithm selects this table and repeats the reference to list1. In terms of many-to-many process until there are no more tables, or a stop relationships, it is necessary to identify the double condition happens. This process is responsible for meaning of these references (manyToMany the denormalization of the data, in which the data algorithm). In MongoDB case, a DBRef of each related to the main table are included in a data item. document involved in the corresponding document ------is stored. For instance, the student Bill Gates held a Algorithm3: goUp. publication. Thus there is a publication document ------reference in the student's document, and there is a Input: table, object; student document reference in the publication Output: object, reference_list, document. Other solution would be to embed all referenced_list; Begin related data into one document. However, this //Looks for tables that reference table approach can let the execution of queries harder. For 1: table2 := findTableUp(table); instance, if student documents contain publication 2: Do Switch table2.classification 3: case “common”: information, more effort would be necessary to 4: get all table2 attributes to object; retrieve people involved in a specific publication. 5: goUp(table2); 6: case “subclass”: //saves data items to set relationship 7: reference_list.add(table); 4 RESULTS AND EXPERIMENTS 8: referenced_list.add(table2); 8: goDeep(table2); 9: case “relationship”: break; In this section, we present some implementation and 10: case “main”: experimental results. 11: reference_list.add(table); 12: referenced_list.add(table2); 13: End Switch; 4.1 Implementation and Example of use End goUp; ------We have developed the R2NoSQL approach in the After the identification of in-depth relationships, Java language. The main functional requirements the tables that the main table refers are searched up underlying the tool’s development are the following: (Algorithm3 – goUp). According to the identified ƒ Set Source Database: it includes the definition relationships, one of the following options will be of the relational DBMS to be used as well as considered: (i) to capture the attributes and move up the metadata and data extraction step. or (ii) to save identified instances in a list. This ƒ Classify Table: The tables extracted from the function separates tables in which there could be Relational database will be classified by the user. composed or multi-valued attributes. It may also ƒ Set Target Database: The user chooses the show that a new entity has been found, and a target NoSQL database. relationship should happen. The procedure is ƒ Execute Data Conversion: It analyzes the repeated until a stop condition is reached. extracted metadata and data from the source After the main table and its related tables of a database, verifies tables’ classification and data item are analyzed, the whole set of attributes is possible conceptual mappings, identifies the persisted as an entity in the target database target NoSQL concepts and persists (Algorithm1, line 6). Just after all instances have corresponding data in the target database. been persisted, the algorithm must define the In this current version, we have developed a relationships among them. prototype which deals with a RDB (e.g., MySQL) as To establish the one-to-one or one-to-many a source database and a NSDBD as a target one. To relationships, generated lists (when a main table the latter, we have used the MongoDB system. mentioned another one) are used. These lists are We provide an example in the following. As included in the data items that have references, and source RDB, we have used a database with 15 tables

179 ICEIS 2016 - 18th International Conference on Enterprise Information Systems

(e.g., Person, Publication, Student). Among them, used mappings and algorithms have generated a there are some relationships (e.g., between Person consistent target NSDBD. In order to verify the latter and Publication), and some specializations (e.g., goal, we checked both obtained query results and Student is a specialization of Person). compared the ones obtained in a NSDBD with respect The Table classification is accomplished by the to the corresponding ones from a RDB. user, since, in this version, we have a semi- automated tool with respect to this functionality. Thus, after extracting the source RDB metadata, the tool asks the user to classify the extracted tables. After table classification, the R2NoSQL tool is able to proceed with the data conversion, in accordance with the defined algorithms (Section 3.2.3). The tool selects instances of each table classified as main, storing the table attributes and their associated values. This happens according to the existing mappings. When this process finishes, the tool looks for instances related to what was analyzed as a complement to the main tables. These related tables can be a representation of a complex value as a composed attribute, a multi-valued attribute or even a specialization. Considering a fragment of the source database at hand, we show some instances belonging to tables Figure 1: Document obtained after data conversion. Person and Student in Table 1 and Table 2, respectively. Taking into account the data presented The working data scenario was the same in Table 1 and Table 2, the tool produces a presented in the example of Section 4.1. This document as depicted in Figure 1. The resulting database was populated with 45 , and queries collection of documents is named by using the name were specified to be executed over them. Table 3 of the main table of that entity. The attributes and shows an examples of query used in our values of the tables were converted into document experiments, which shows all professor attributes fields. However, new fields were introduced as according to his name. It is presented in SQL and in depicted in Figure 1: AR_master regards the key MongoDB . attribute of table Master, a specialization of Student; The first experiment goal was accomplished, Person_X_Publication represents the many-to-many since, by using the tool, we could submit and relationship between Person and Publication. execute the same set of queries in both source and target databases. 4.2 Experiments Regarding the latter goal, we have compared the obtained query results in terms of data items We have conducted some experiments to verify the (instances) and their properties. For each query effectiveness of our approach. The goal of our submitted and executed in MongoDB, we measured experiments was twofold: (i) to check whether a set the degree of similarity of the answers with respect of queries formulated and executed in a RDB may be to the set of answers obtained in the relational either formulated and executed in a generated database (which acted as a gold standard). NSDBD, and (ii) to verify if the query results All queries returned similar instances with obtained from both databases are similar, i.e., if the identical property values (100%). Differences were Table 1: Table Person with a tuple.

Natural CPF RG name Bdate nationality e_mail url user pword profile From Bill 10-28- Washing gates@ms. http://gatesnotes. 74852963214 10987312 American gates gates U Gates 1955 ton com com

Table 2: Table Student with a tuple.

AR CPF Additional_info Adm_semester Adm_year Egressiondate 790099 74852963214 Co author of 19 articles. 1 1981 03-19-1981

180 Conceptual Mappings to Convert Relational into NoSQL Databases

obtained only in terms of query results presentation, databases. This approach is based on conceptual but not regarding the set of obtained data items. mappings defined between structural concepts from Therefore, we can see that the goals of this relational and NoSQL ones. experiment were achieved. It was possible to require Experiments have shown that obtained NoSQL the same information from both source and target database is consisted with the source relational one, databases. In addition, it was possible to obtain the by executing the same set of queries in both source same set of query results from both ones. and target databases. In fact, they produced similar query results. Table 3: A query example used in the experiment. As future work, some enhancements will be MongoDB done: (i) the tool will be extended to accomplish SQL query language data conversion by considering other categories of select pe.*, pf.* NoSQL systems, and (ii) an automated query db.Person.find({cpf: from Person pe inner join Professor "98632541754", conversion process will also be taken into account. pf on pe.cpf=pf.cpf $or:[{type: "ic"},{ inner join IC_Professor i on type: "invited"}]}, pf.cpf=i.cpf {cpf:1, rg:1, where pe.cpf = '95175368429' name:1, birth_date:1, union REFERENCES naturalness:1, select pe.*, pf.* nationality:1, user:1, from Person pe inner join Professor Han, J., Haihong, E., Le, G., Du, J., 2011. Survey on password:1, pf on pe.cpf=pf.cpf profile:1, e_mail:1, NoSQL database. In: Pervasive computing and inner join Invited_Professor ip on type:1, applications (ICPCA), 2011 6th international pf.cpf= ip.cpf additional_info:1}) conference on. IEEE. p. 363-366. where pe.cpf = '95175368429' Istvan, Z., Alonso, G., Blott, M., Vissers, K., A flexible hash table design for 10Gbps key-value stores on FPGAs. In: Field Programmable Logic and 5 RELATED WORK Applications (FPL), 2013 23rd International Conference on. IEEE. p. 1-8. Karnitis, G. and Arnicans, G., 2015. Migration of relational Data conversion approaches regarding Relational and database to document-oriented database: Structure NoSQL models have been tackled. Zhao et al. (2014) denormalization and data transformation. Communication propose an automatic approach for converting Systems and Networks (CICSyN), 7th International relational database schemas to NoSQL ones, which Conference on Computational Intelligence. p. 114–118. establishes conceptual rules for the denormalization of Lakshman, A., Malik, P., 2010. Cassandra: a decentralized the original data. Potey et al. (2015) provide a tool to structured storage system. ACM SIGOPS Operating perform data conversion, in which the target database is Systems Review, v. 44, n. 2, p. 35-40. an equivalent relational schema in a Document McMurtry, D., Oakley, A., Sharp, J., Subramanian, M., and Zhang, H., 2013. Data access for highly-scalable structure. Karnitis and Arnicans (2015) instead provide solutions: Using , , and polyglot persistence. a semi-automatic approach, which allows a Microsoft patterns & practices. comprehension of the relationships that the tables carry MongoDB, 2015. Available at https://www.mongodb.org/. one over the other by a classification strategy. Mpinda Last access on December, 2015. et al. (2015) present a data conversion process that Mpinda, S. A. T., Maschietto, L. G., and Bungama, P. A., aggregates data tables, which are analyzed along with 2015. From relational database to columnoriented the established relationships. nosql database: Migration process. International Our proposal extends some of these concepts. Journal of Engineering Research & Technology We provide a denormalization technique and we (IJERT), 4. p. 399–403. Neo4j, 2015. Available at http://neo4j.com. Last access on deal with some kinds of conceptual relationships, by December, 2015. producing references when possible. We have a Potey, M., Digrase, M., Deshmukh, G., and Nerkar, M., table classification strategy to enrich the overall 2015. Database migration from structured database to process. Finally, our approach may be applied to any non-structured database. International Conference on of the target NoSQL models. Recent Trends & Advancements in Engineering Technology (ICRTAET 2015), p. 1–3. Redis, 2015. Available at http://redis.io/. Last access on December, 2015. 6 CONCLUSIONS Zhao, G., Lin, Q., Li, L., Li, Z. 2014. Schema Conversion Model of SQL Database to NoSQL. In: P2P, Parallel, We presented the R2NoSQL approach, which allows Grid, Cloud and Internet Computing (3PGCIC), 2014 data conversion between relational and NoSQL Ninth International Conference on. IEEE. p. 355-362.

181