SQL, Nosql Or Newsql – Comparison and Applicability for Smart Spaces

SQL, noSQL or newSQL – comparison and applicability for Smart Spaces Christoph Rudolf Advisors: Stefan Liebald, M.Sc. and Dr. Marc-Oliver Pahl Seminar Innovative Internet Technologies and Mobile Communication (WS 16/17) Chair for Network Architectures and Services Department of Informatics, Technische Universität München Email: [email protected] ABSTRACT While this paper focuses specifically on the data structure In this paper, we analyze database architectures in terms used in DS2OS, we start by introducing key architecture of their applicability as a data storage for distributed data components which are necessary for understanding the data in smart spaces. As a concrete example, the paper takes structure. The central element and middleware of DS2OS is the Distributed Smart Space Orchestration System (DS2OS) the Virtual State Layer (VSL), a peer-to-peer system built which acts as a management system for smart spaces and of so called Knowledge Agents (KA) as its peers [16]. The implements a peer-to-peer network. We develop key require- VSL is self-organizing in a sense that peers automatically ments for a database solution based on the data characteris- locate and connect to each other. In addition to that, the tics of DS2OS. These requirements allow us to analyze rep- VSL handles synchronization of data between agents. In resentatives of database architectures regarding their appli- a real-world scenario, these interconnected agents run on cability. We conclude that the simple data models provided computing devices in different physical locations of a smart by NoSQL databases are suitable for modeling the desired space. They provide an interface between a diverse set of data structure. Based on their performance and features, services and the VSL. Services can, for example, manage a we propose PostgreSQL, Redis and InfluxDB as suitable so- smart device and encapsulate its complexity, collect sensor lutions for DS2OS. data or output information to the user. One of the major aspects of a system like DS2OS is the de- Keywords sign of a suitable data model for all possible services and NoSQL, NewSQL, database types, DS2OS, IoT, smart spaces devices. This is especially important to allow for previ- ously unsupported devices to be introduced into the system. 1. INTRODUCTION Therefore, domain-specific data has to be added dynami- Smart spaces are real-world spaces with embedded devices cally.[16] To do so, DS2OS provides a directory of context that capture data about their environment and control as- models that can be extended by developers. A context model pects of it [14]. The data that is shared in smart spaces is a data structure that holds properties of a service and acts originates from distributed sources connected to each other. as an interface. A standard context model for services which This offers challenges regarding the storage of data handled are, for example operating a lamp, is desired to be adopted by a common middleware. The Distributed Smart Space by many services for lamps to decouple service implementa- Orchestration System (DS2OS) is such a middleware and tion from the specific devices. Context models are currently is used as a representative to asses different paradigms for stored in an XML-format.[15] database management regarding their suitability as a data storage in smart spaces. This paper starts by introducing The context model of a service is instantiated when a ser- special characteristics of data in DS2OS. Further, it gives vice initially connects to a KA. The context models consist an overview over database architectures. We then develop of “hierarchically structured typed key-value pairs (...) with requirements to be met by a suitable database solution and additional management metadata” [15] and can be thought select representatives of each architecture type for evalua- of as a tree data structure that allows to address each at- tion. tribute, similar to files in a filesystem. Figure 1 visualizes the interconnected agents, their services, the data as well as 2. DATA STRUCTURES IN THE DS2OS sample attributes for a lamp as service S1. The Distributed Smart Space Orchestration System (DS2OS) is a management system for smart devices. It was started All information depicted in Figure 1 is known to every KA in the Ph.D. thesis of Marc-Oliver Pahl [14] and is actively in the VSL as meta data. However, only the KA that developed at the Chair of Network Architectures and Ser- is directly connected to a service stores the actual data. vices at the Technical University of Munich1. The system Other agents have to actively query the respective KA for contains a middleware for the uniform abstraction of smart this data by referring to its path in the hierarchic structure /KA /S /switchedOn/ spaces. It targets the heterogenic field of today’s smart de- (e.g. 1 1 ) which forms a unique identi- vices and makes them manageable by a central software.[16] fier. They can also subscribe to an address and be notified of changes. These notifications are sent by the KA in charge 1www.ds2os.org of storing the data. Storing the data as well as the structure Seminars FI / IITM WS 16/17, 39 doi: 10.2313/NET-2017-05-1_06 Network Architectures and Services, May 2017 / in a mathematical sense. Each table models an n-ary relation, by specifying n columns with each storing a certain type of information for all entries. The column definition provides the scheme of the table. The data itself is stored as row entries in form of n-tuples.[3] This design approach KA1 KA2 KA3 makes relational databases adaptable to many application domains. They allow to define certain columns as unique key values and to create references between relations. If utilized correctly, this reduces redundancy of data. S1 coloredLamp S2 S3 S4 Relational databases provide powerful querying capabilities based on relational algebra. Relational algebra defines a set switchedOn of operations working on relations. Every output generated brightness by a query combining relations with their operations is a color new relation in itself. This enables complex nested queries. It is within the responsibility of the user or an optimizer Figure 1: Visualization of the DS2OS data structure integrated into a DBMS to make them as efficient as possi- known to each KA. ble. The features of relational algebra are implemented by SQL which acts as a de facto standard query language for modern relational databases.[5] is currently implemented by using the relational database HSQLDB with a database schema of three tables: Relational databases usually do well in terms of concurrency and reaction to failures as they provide ACID transactions. structure : [ address, type, readers, writers, restriction, By being ACID compliant, transactions are supposed to ful- { cachedParameters ] fill the following principles [10]: version : [ address, timestamp,} version ] { } data : [ address, timestamp, value ] { } Atomictiy: transaction are considered as a single unit • The address inside the tree structure references specific en- Consistency: results are consistent database states tries in all three tables. All addresses are associated with a • Isolation: no side-effects among parallel transactions semantic data type in the context of the VSL that is inde- • pendent of the type in the database. Durability: their results are persistent • The structure table is responsible for storing the complete structure depicted in Figure 1 and additional metadata like This is necessary if a series of critical changes is made that restrictions and permissions. The version table receives can only be allowed to complete fully or have no effect at new entries for every change made to a value, as the data ta- all. While transactions are no inherent attribute of rela- ble is not updated in case of a value being changed. Instead, tional databases, it usually distinguishes them from newer a new entry with a timestamp is added, in addition to the approaches. Whether or not transactions are needed is sub- new version in the other table. The version update via insert ject to the application domain. does also happen for all prefixes of the changed address up to the service level. For this reason, separated tables help to insert versions for all parent nodes without duplicating their 3.2 NoSQL values. By inserting new versions, the database for DS2OS While relational databases are adaptable to many scenarios, has to handle mainly insert operations. not all data models fit well with a relational schema. Re- lational databases are inflexible when the schema has to be changed due to redesigns in the underlying application. In 3. DATABASE ARCHITECTURES addition to that, their sophisticated features, like ensuring Due to new database architectures, the field of database data consistency, add overhead which is sometimes unnec- management systems (DBMS) has been diversified over re- essary and critical in contexts of massive amounts of data cent years. While relational databases are prevalent in the being handled. This has become an issue for online services majority of all use cases, unstructured data calls for new ap- over the recent years, which are in need of database systems proaches [11]. Not all DBMS can be assigned solely to a sin- that can be distributed.[7, 9] gle architecture type of databases, as many provide features that are characteristics of other groups. This section pro- NoSQL databases solve these problems with different ap- vides a brief overview over current database architectures. proaches to data organization and an increased focus on distributability. The term NoSQL does not describe a spe- 3.1 Relational database cific group of databases, instead, it is a collective term for Relational databases are currently the most widespread type systems that break the classic relational approach.[8] With- of databases [6], with well-known DBMS like PostgreSQL out the relational basis, NoSQL databases do not provide and the briefly mentioned HSQLDB used in DS2OS.

Load more