Using Semantic Web Technologies in Heterogeneous Distributed Database System: a Case Study for Managing Energy Data on Mobile Devices
Total Page:16
File Type:pdf, Size:1020Kb
View metadata, citation and similar papers at core.ac.uk brought to you by CORE International Journal of New Computer Architectures and their Applications (IJNCAA)provided 4(2): by Directory 56-69 of Open Access Journals The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2220-9085) Using Semantic Web Technologies in Heterogeneous Distributed Database System: A Case Study for Managing Energy Data on Mobile Devices Zhan Liu, Anne Le Calvé, Fabian Cretton, Nicole Glassey Institute of Business Information Systems University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland {zhan.liu, anne.lecalve, fabian.cretton, nicole.glassey}@hevs.ch ABSTRACT and influences the direction of future performance. Reducing energy consumption in the residential Managing energy data from multiple distributed and sector is important as well, because energy heterogeneous sources is an important issue demand in this sector is notably increasing worldwide. This study focused on using semantic web recently [2]. technologies in an energy data management system. Specifically, we provided a federated approach - a However, such information processing is not mediator server - that allows users to access to multiple managed by an integrated database where all of heterogeneous data sources, including four typical the data relevant to an organization would be types of databases in energy data resources: relational stored and managed in one single unified and database, Triplestore, NoSQL database, and XML. A integrated database. Rather, databases are non- proposed architecture based on this approach is then integrated, distributed and heterogeneous [3]. This presented and our solution can realize the data is especially evident in the context of an energy acquisition and integration without the need to rewrite database management system due to several or transform the local data into a unified data. We reasons. First and foremost, an energy database further examined our architecture by a case study in a Swiss energy company and tested the system based on management system was built according to the a mobile platform. characteristics of the energy usage in a specific region. Thus, the requirements are diverse and as a consequence, database systems in the field of KEYWORDS energy are rather distributed and complicated. For example, electricity consumption function requires Semantic web, heterogeneous distributed database its information system to quickly convert and store system, SPARQL, RDF wrapper, database integration, a large body of non-relational data; thus, a NoSQL energy management, mobile application database like MongoDB [4] fits very well in this case. However, in other functions where the structure of the data is stable with low variability, 1 INTRODUCTION and if such data have relationship with other data in the same data source, a relational database Along with the contributing global warming, should be a good choice. Moreover, Triplestore is social concern over environmental problems has selected if the consumption data are necessary to increasingly become a hot issue in recent years. integrate with other remote data resources, like Companies must systematically manage energy geographic and weather information systems. In use and handle as much energy information as the case of a semi-structured data model, XML possible to get deep and quantitative knowledge of serves well and it is usually used to store and the process of energy consumption [1]. As an exchange information of configuration for important part of information resource, energy different systems. Second, an integrated system information resource supports energy efficiency was not the main goal at the time the database 56 International Journal of New Computer Architectures and their Applications (IJNCAA) 4(2): 56-69 The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2220-9085) systems were built [3]. Third, energy database heterogeneous distributed database management systems that differ from each other may be caused systems. According to Tim Berners-Lee et al. [10], by changes in technology. Last but not the least, in the Semantic Web "provides a common contemporary urban environments and at a framework that allows data to be shared and household level, energy management requires that reused across application, enterprise, and the design of systems be able to integrate remote community boundaries." While rapidly evolving, and spatially distributed monitoring data while it is only recently that semantic web technologies being open, low cost, easy to use and flexible [5]. are becoming available and stable, and practical All these characteristics indeed set barriers to solutions emerge and flourish in many fields. The getting accurate energy information in a global idea involves the concept of Linked Data, which perspective. Only using the existing tools cannot aims at enabling the same kind of possibilities for solve these problems. data, as well as creating a universal medium for Nowadays, the increasing globalization has exchanging information based on the meaning of encouraged waves of mergers and acquisitions, content on the Web [11] in a way that is usable presenting new difficulties for companies to directly by machines. Resource Description Framework (RDF) is a general language to handle huge amounts of complex and disparate describe resources, especially on the web, and information across regions [6]. Simply exchanging SPARQL is a query language for RDF that can basic information today may involve accessing and interpreting a wide variety of formats, data join data from different databases, as well as language, data models, and protocols that go documents, inference engines, or anything else beyond just text. To achieve coordination of that might express its knowledge as a directed labeled graph [12]. diverse computerized operations, it is necessary for a company to have database systems that can To this end, we proposed a uniform approach for operate over a distributed network and can SPARQL querying a heterogeneous distributed encompass a heterogeneous mix of hardware, database system in energy data management. This operating systems, local database management federated method provides transparent query systems and even data models for different access to multiple heterogeneous data sources, databases. Consequently, integrating and querying including relational database, Triplestore, NoSQL data from heterogeneous sources has become a hot database and XML, thus realizing the process data research topic among information researchers. In acquisition and integration without the necessity to general, there are two possible approaches to the rewrite or transform the local data. Most extant architecture of a heterogeneous distributed studies in heterogeneous distributed database database: namely warehouse approach (e.g., [7]) systems only consider a single language (e.g., [3]) and federated approach (e.g., [8]). The separation or only focus on relational data (e.g., [12]). Our is sometimes called centralized and decentralized approach is different from them in two ways: on systems. The first method typically provides a one hand, we do not only look at one specific uniform interface to materialize the integrated database model (e.g., relational database), but also view. The latter approach, on the other hand, is a provide solutions to integrate other database form of virtual integration – the data are brought models. On the other hand, our mediator server together as needed [9]. In this study we focus on does not require local databases to translate or the federated approach, as under this architecture transfer to a unified language; instead, all local local databases can continue their local operations sources remain at the original level thus it has cost and transactions without changing the features of advantages. Moreover, our solution fulfills the local databases; but at the same time participate in energy information calculation from the integrated the federation. Therefore, this approach is more data: e.g., daily, monthly, quarterly, etc. stable and reliable in our case. This study aims to provide a better understanding The burgeoning semantic web technology has for a heterogeneous distributed database system in provided new methods for integration of 57 International Journal of New Computer Architectures and their Applications (IJNCAA) 4(2): 56-69 The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2220-9085) energy data management. More precisely, it efficiency (e.g., [13]; [14]; [2]), both on the contributes by: industrial level (e.g., [15]; [1]) and resident level (1) Proposing architecture for SPARQL (e.g., [2]). For example, the building sector is querying a heterogeneous distributed reported as responsible for 40% of the total database system. This system includes European energy consumption and 36% of the multiple heterogeneous data sources, such as European Union’s total CO2 emission [16]. But at the same time, it offers technological opportunities relational database, Triplestore, NoSQL database and XML. to improve energy efficiency and greenhouse gas emission. Accordingly, a number of researches (2) Implementing the system according to the have been conducted. For example, [17] proposed approach by using semantic established an integrated control system to technologies. It also performs query optimize energy consumption and user comfort in optimization to speed search executions.