FP7-ICT-2013-EU-Japan ClouT: Cloud of Things for empowering the citizen clout in smart cities FP7 contract number: 608641

NICT management number: 167 ア Project deliverable

D3.1 - Reusable components, techniques and standards for City Platform as a Service (CPaaS)

ABSTRACT

The objective of this document is to describe the reusable components in the City Platform-as-a-Service (CPaaS) layer. The services and the APIs exposed by the CPaaS infrastructure depend on the requirements and on the architecture defined in WP1. WP3 also describes non-functional requirements, such as security requirements. This WP focuses on the city information infrastructure and its components, together with their dependent interfaces, the security layer and all the infrastructure necessary to access and manage data. WP3 also covers the access control and fault-tolerance methods used to access and manage data.

D3.1 - Reusable components and techniques for CPaaS

Disclaimer

This document has been produced in the context of the ClouT Project which is jointly funded by the European Commission (grant agreement n° 608641) and NICT from Japan (management number 167ア). All information provided in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability. This document contains material, which is the copyright of certain ClouT partners, and may not be reproduced or copied without permission. All ClouT consortium partners have agreed to the full publication of this document. The commercial use of any information contained in this document may require a license from the owner of that information. For the avoidance of all doubts, the European Commission and NICT have no liability in respect of this document, which is merely representing the view of the project consortium. This document is subject to change without notice.

The ClouT consortium is composed of the following institutions

No. | Participant organisation name | Short name | Country
1 | Commissariat à l’énergie atomique et aux énergies alternatives (coordinator) | CEA | France
2 | Engineering Ingegneria Informatica SpA | ENG | Italy
3 | University of Cantabria | UC | Spain
4 | STMicroelectronics S.r.l. | ST | Italy
5 | Santander City Municipality | SAN | Spain
6 | Genova Municipality | GEN | Italy
7 | Nippon Telegraph and Telephone East Corporation (NTT East) (coordinator) | NTTE | Japan
8 | Nippon Telegraph and Telephone Corporation (NTT R&D) | NTTRD | Japan
9 | Keio University | KEIO | Japan
10 | Panasonic System Solution | PANA | Japan
11 | National Institute of Informatics | NII | Japan

ClouT – 31.07.2013 Page 2

EU Editor: Cosimo Greco, ENG
JP Editor: Kenji Tei, NII
Authors: [Cosimo Greco, ENG], [Levent Gurgen, CEA], [Yazid Benazzouz, CEA], [Stefania Manca, GEN], [Takuro Yonezawa, Keio], [Kenji Tei, NII], [Fuyuki Ishikawa, NII], [Jose Antonio Galache, UC], [Fernando Mons Nunez, SAN]
Internal reviewer: Marco GRELLA, ST
Deliverable type: R
Dissemination level (Confidentiality): PU
Contractual Delivery Date: 31/07/2013
Actual Delivery Date: 31/07/2013
Keywords: ClouT, Cloud Computing, IoT, Smart Cities, CPaaS, Reusable Components

Revision history

Revision | Date | Description | Author (Organisation)
V0.1 | 10.06.2013 | Table of Contents created | ENG
V0.2 | 08.07.2013 | First draft with contributions from all partners | ENG
V0.3 | 12.07.2013 | Second draft with contributions from all partners | ENG
V0.4 | 19.07.2013 | Added Santander contributions. | ENG
V0.5 | 19.07.2013 | First consolidated version sent for internal review | ENG
V0.6 | 26.07.2013 | Reviewed version | ST
V1.0 | 31.07.2013 | Final version | ENG

TABLE OF CONTENTS

TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS AND DEFINITIONS
EXECUTIVE SUMMARY
1. INTRODUCTION
  1.1. SCOPE OF THE DOCUMENT
  1.2. TARGET AUDIENCE
  1.3. STRUCTURE OF THE DOCUMENT
2. SERVICE COMPOSITION PLATFORMS FOR CITIZEN’S APPLICATIONS
  2.1. INTRODUCTION
  2.2. SERVICE COMPOSITION AND MASH-UP TOOLS REUSABLE COMPONENTS
    Wirecloud
    MyCocktail
    Enterprise Mashup Markup Language
    OpenSocial
    Yahoo! Pipes
    Apache Shindig
    iPojo
    BPMN
    BPEL
  2.3. DEPENDABLE SERVICE COMPOSITIONS REUSABLE COMPONENTS
    Robust Service Composition Method
    QoS-based Service/Cloud Selection Method
    Metadata-based Behavior Insertion Framework
    Verification Framework of Time and Resource Constraints on Business Process
    Verification Framework of ECA Specification on Physical Interactions
3. BIG DATA PROCESSING
  3.1. INTRODUCTION
  3.2. DATA/EVENT PROCESSING AND DECISION MAKING REUSABLE COMPONENTS
    Esper
    FI-WARE Gateway Data Handling
    JBoss Drools Expert
    MongoDB
  3.3. SELF-HEALING FOR DATA/EVENT STREAMING REUSABLE COMPONENTS
    Self-healing Framework for Sensory Data
    Fault Classification Model
4. SECURE AND DEPENDABLE ACCESS TO CITY DATA
  4.1. INTRODUCTION
  4.2. OPEN CITY DATA HOSTING AND ACCESS REUSABLE COMPONENTS
    Hypertable

    Hbase
    HDFS
    Open Stack Swift
    Ceph
    Object Storage GE - FI-WARE Implementation
    CDMI Proxy
    REST/SOAP APIs for IDAS access
    Testbed Runtime
    SOA3
  4.3. DEPENDABILITY TOOLS FOR ACCESSING CITY DATA REUSABLE COMPONENTS
    D-Case Monitoring tool
5. OTHER REUSABLE CPAAS ASSETS FROM CITIES
  5.1. INTRODUCTION
  5.2. GENOA REUSABLE CPAAS ASSETS
    RoadVisior - eMixer Platform
    Weather Station
  5.3. SANTANDER REUSABLE CPAAS ASSETS
    GIS Platform
    Open Data Platform
    Oracle Database Platform
    SAE Platform
    Incisis Platform
  5.4. FUJISAWA REUSABLE CPAAS ASSETS
    Disaster prevention GIS system
    Digital signage system
  5.5. MITAKA REUSABLE CPAAS ASSETS
    GIS system
6. CONCLUSIONS
  6.1. SERVICE COMPOSITION PLATFORMS FOR CITIZEN’S APPLICATIONS
  6.2. BIG DATA PROCESSING
  6.3. SECURE AND DEPENDABLE ACCESS TO CITY DATA
  6.4. CITY RESOURCES
7. APPENDIX
  7.1. TG3.1 REUSABLE COMPONENTS
  7.2. TG3.2 REUSABLE COMPONENTS
  7.3. TG3.3 REUSABLE COMPONENTS
REFERENCES

LIST OF FIGURES

Figure 1. ClouT City Architecture for Smart City Ecosystem
Figure 2. Wirecloud Architecture
Figure 3. MyCocktail web tool
Figure 4. Open MyCocktail
Figure 5. Yahoo Pipes mashup tool
Figure 6. Apache Shindig overall architecture
Figure 7. Apache Shindig server architecture
Figure 2-8. iPojo Component example
Figure 9. Example of BPD (Stephen A. White)
Figure 10. Robust Service Composition Method
Figure 11. QoS-based Service
Figure 12. Metadata-based Behaviour Insertion Framework
Figure 13. Verification Framework of Time and Resource Constraints on Business Process
Figure 14. Verification Framework of ECA Specification on Physical Interactions
Figure 15. CEP principle (Angelo Corsaro)
Figure 16. Esper integration to OSGi
Figure 17. Data Handling GE Architecture
Figure 18. High-Level production rule system
Figure 19. Self-Healing Framework for Sensory Data
Figure 20. Fault Classification Model
Figure 21. Hypertable Architecture Overview
Figure 22. Hbase architecture overview
Figure 23. Hbase cyclic replication overview architecture
Figure 24. HDFS architecture overview
Figure 25. Apache Hadoop Cluster Server Roles
Figure 26. Ceph Unified Storage
Figure 27. Screenshot of D-Case Monitoring Tool
Figure 28. System Overview of D-Case Monitoring Tool
Figure 29. Web Page for Detailed Monitoring Data
Figure 30. Santander cluster database architecture
Figure 31. Santander INCISIS implementation architecture
Figure 32. Santander INCISIS functional schema
Figure 33. INCISIS service example schema
Figure 30. Fujisawa disaster prevention GIS system
Figure 31. Overview of Fujisawa Signage
Figure 32. Mitaka GIS system

LIST OF TABLES

Table 1. List of Abbreviations
Table 2. List of Key Terms
Table 3. List of Available Monitoring Programs
Table 4. Open Stack Swift
Table 5. Wirecloud
Table 6. MyCocktail
Table 7. Enterprise Mashup Markup Language
Table 8. OpenSocial
Table 9. Yahoo! Pipes
Table 10. Apache Shindig
Table 11. iPojo
Table 12. BPMN
Table 13. BPEL
Table 14. Robust Service Composition Method
Table 15. QoS-based Service/Cloud Selection Method
Table 16. Metadata-based Behavior Insertion Framework
Table 17. BPMN Verification
Table 18. ECA Verification
Table 19. Esper
Table 20. FI-WARE Gateway Data Handling
Table 21. JBoss Drools Expert
Table 22. MongoDB
Table 23. REST/SOAP APIs for IDAS access
Table 24. Testbed Manager
Table 25. D-Case Monitoring Tool
Table 26. Self-Healing Framework for Sensory Data
Table 27. Fault Classification Model
Table 28. Hypertable
Table 29. Hbase
Table 30. Hadoop HDFS
Table 31. Open Stack Swift
Table 32. Ceph
Table 33. Object Storage GE - FI-WARE Implementation
Table 34. CDMI Proxy
Table 35. SOA3

LIST OF ABBREVIATIONS AND DEFINITIONS

TABLE 1. LIST OF ABBREVIATIONS

CIaaS: City Infrastructure as a Service
CPaaS: City Platform as a Service
CSaaS: City Service as a Service
IoT: Internet of Things
IoT-A: Internet of Things - Architecture
TG: Task Group
WSAN: Wireless Sensor and Actuator Network
RDF: Resource Description Framework
OSGi: Open Services Gateway initiative
API: Application Programming Interface
REST: REpresentational State Transfer
JSON: JavaScript Object Notation [RFC 4627]
NGSI: Next Generation Services Interface
iPOJO: injected Plain Old Java Object
BPMN: Business Process Modeling Notation
BPD: Business Process Diagram
UML: Unified Modeling Language
BPEL: Business Process Execution Language
GIS: Geographic Information System
RSS Feed: Really Simple Syndication Feed

TABLE 2. LIST OF KEY TERMS

Cloud Storage: a way of abstracting the physical infrastructure, based on silos (federated servers, cluster servers, etc.), where the data are stored.
Sensors: equipment used to measure and convert a physical quantity (e.g. temperature) or human information (e.g. social data).
Actuators: mechanical devices that perform an action on command; social networks can also be used as actuators.

EXECUTIVE SUMMARY

This document collects existing reusable components, techniques and standards, focusing on the City Platform as a Service (CPaaS) layer. They belong to various categories, such as cloud storage systems, databases and standards used to access and manage data stored in the Cloud.

Each partner has described the reusable components that could be used in this phase of the project. For each component, the document provides a detailed description, an assessment, the known limitations and the license of use. Some components are already used by the cities involved in the project: Santander and Genova in Europe, Mitaka and Fujisawa in Japan.

The content of this deliverable will be used to define the Reference Architecture for ClouT and the subsequent implementation of the Smart City Ecosystem. Therefore, not all the components presented here will be used in the final version of the architecture, but only those that best meet the requirements.

1. Introduction

1.1. Scope of the Document

This document describes initiatives, tools, solutions and available assets from cities that can be leveraged to build the CPaaS layer of the ClouT architecture.

As this deliverable, along with D2.1, will serve as the starting point for the initial definition of the ClouT architecture (deliverable D1.2), a short description of the main features of each reusable asset is given, along with several useful details. Furthermore, a short assessment is given in terms of opportunities, capabilities, known limits and threats in adopting the reusable component as part of the ClouT architecture.

Figure 1. ClouT City Architecture for Smart City Ecosystem

1.2. Target Audience

The document targets the ClouT platform developers.

1.3. Structure of the Document

Chapters 2 to 4 introduce all candidate components that are part of TG3.1, TG3.2 and TG3.3, with reference to the Description of Work. Chapter 5 lists a set of other existing assets from the cities involved in ClouT that may also be included as part of CPaaS. Conclusions are drawn in Chapter 6. Finally, the Appendix (Chapter 7) reports, for each reusable component, a table containing additional details of the component itself.

Each chapter is dedicated to a single topic, as listed below:

• Chapter 2 covers the service composition and mash-up components.
• Chapter 3 describes the components dedicated to Big Data processing.
• Chapter 4 covers secure access and all the components involved.
• Chapter 5 covers the city reusable components.

In detail, Chapter 5 lists a set of further existing assets from the cities involved in ClouT: Genoa (Italy), Santander (Spain), Mitaka (Japan) and Fujisawa (Japan). Finally, conclusions are presented in Chapter 6, while the Appendix reports a table for each reusable component, containing general and additional details about it.

2. Service Composition Platforms for citizen’s applications

2.1. Introduction

This section provides a list of tools that are potentially reusable for the service composition and data mashup objective of the ClouT project. The main goal is to provide users (experienced developers, non-technical users, etc.) with a tool for building their own services, thereby making them an active part of the city ecosystem. The list ranges from service component models to frameworks that provide verification and validation of the composition.

The main idea behind the platform is to enable users who come with their own IoT device to create a service, publish it in the cloud and share (or un-share) it within a given community. Other, higher-level services can be created by composing existing base services and platform services.

This section focuses on the service composition platform (at the CPaaS layer) that will give users the opportunity to build their own IoT+Cloud services. A variety of tools exists for different types of users, from the most experienced (professionals) to inexperienced ones (end-users). With a service-oriented approach, ClouT will define modular IoT and Cloud service models and identify the ways in which these services interact. The interaction will be event-based, to better reflect the IoT device interaction model. The tool will let users define the actions to take when specific events occur under given conditions (e.g., Event-Condition-Action rules), which will determine the behavior of the created services.
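To make the Event-Condition-Action idea more concrete, the following is a minimal sketch in plain Java. It is not the API of any ClouT tool: the class names `SensorEvent`, `EcaRule` and `EcaDemo`, the `"temperature"` event type and the 35.0 threshold are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical event emitted by an IoT device (illustrative, not a ClouT type).
final class SensorEvent {
    final String type;
    final double value;

    SensorEvent(String type, double value) {
        this.type = type;
        this.value = value;
    }
}

// One Event-Condition-Action rule: on an event of the given type,
// if the condition holds, run the action.
final class EcaRule {
    private final String eventType;
    private final Predicate<SensorEvent> condition;
    private final Consumer<SensorEvent> action;

    EcaRule(String eventType, Predicate<SensorEvent> condition, Consumer<SensorEvent> action) {
        this.eventType = eventType;
        this.condition = condition;
        this.action = action;
    }

    // Returns true when the rule fired for this event.
    boolean handle(SensorEvent event) {
        if (!eventType.equals(event.type) || !condition.test(event)) {
            return false;
        }
        action.accept(event);
        return true;
    }
}

final class EcaDemo {
    // Applies a single "heat alert" rule to a batch of events and returns
    // the messages produced by its action.
    static List<String> run(List<SensorEvent> events) {
        List<String> log = new ArrayList<>();
        EcaRule heatAlert = new EcaRule(
                "temperature",                     // event
                e -> e.value > 35.0,               // condition (assumed threshold)
                e -> log.add("ALERT " + e.value)); // action
        for (SensorEvent e : events) {
            heatAlert.handle(e);
        }
        return log;
    }
}
```

Keeping the event type, the condition and the action as three separate parts mirrors the rule structure described above and would let an end-user tool edit each part independently.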

Some of the tools presented here provide validation and optimization techniques for dependability assurance in the service composition platform. A foundational model of service composition is provided for analysis and reasoning; it supports event-driven, context-aware behavior in a complex and dynamic environment. Functional validation is conducted to detect and prevent undesirable situations, possibly caused by feature interactions. Quality of service is also optimized and assured against user-oriented, end-to-end criteria, through efficient exploration of the different possible service compositions. These frameworks thus also provide the underlying components that help users who develop service compositions to refine and configure them so as to achieve dependability properties.

2.2. Service composition and mash-up tools reusable components

This section lists the reusable mash-up components. For each of them a short description is given, with figures where needed, together with a reference to the corresponding component table in the Appendix.

WIRECLOUD

The Wirecloud Mashup Platform is a framework that allows end-users to create mashups by composing widgets, the building blocks of new applications. The widgets (also called gadgets) can be downloaded from a catalogue and can be provided by third parties. The Wirecloud Mashup Platform is composed of three components: the Application Mashup Editor, the Mashup Execution Engine and the Catalogue.

The Application Mashup Editor is a web-based composition editor used by end users to compose their own applications. The editor is a workspace where users can graphically connect the widgets downloaded from the catalogue by mapping outputs to inputs. The widgets can be connected to back-end services or data sources through an extensible set of operators, including filters, aggregators and adapters.
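The input-output mapping through operators can be pictured with a small sketch. The code below is plain Java and does not use the Wirecloud API: the `Wiring` class, its `connect` and `emit` methods and the endpoint names are hypothetical, and an operator is modelled simply as a function that may drop a value (filter) or transform it (adapter).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical wiring table: maps an output endpoint name to the input
// endpoints connected to it, each reached through one operator.
final class Wiring {
    private final Map<String, List<Consumer<String>>> connections = new HashMap<>();

    // Connects an output endpoint to an input endpoint through an operator.
    // The operator returns an empty Optional to filter a value out.
    void connect(String output,
                 Function<String, Optional<String>> operator,
                 Consumer<String> input) {
        connections.computeIfAbsent(output, key -> new ArrayList<>())
                   .add(value -> operator.apply(value).ifPresent(input));
    }

    // Pushes a value from an output endpoint to every connected input.
    void emit(String output, String value) {
        List<Consumer<String>> targets =
                connections.getOrDefault(output, Collections.emptyList());
        for (Consumer<String> target : targets) {
            target.accept(value);
        }
    }
}
```

For example, connecting a hypothetical `"search.results"` output to a list-filling input through an operator that drops empty strings and upper-cases the rest would deliver only the transformed, non-empty values.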

The Mashup Execution Engine is the component that provides the mashup functionalities: coordination of gadget execution and communication, mashup state persistence, and a cross-domain proxy. The engine functionalities are exposed to the editor through a set of APIs, and new external modules can be added to extend them.

The gadgets used in the editor can be found in and downloaded from the Catalogue, which is independent of the other two components. The gadgets published in the catalogue are described in a standard description language: either an XML widget description language or an RDF widget description language. The mashups created by users can be published in the same catalogue. Additionally, the catalogue allows users to tag and rate widgets, to foster discoverability and shareability.

FIGURE 2. WIRECLOUD ARCHITECTURE

The Wirecloud Mashup Platform was developed in the FI-WARE project as a Generic Enabler reference implementation; it supports Linked USDL and is integrated with FI-WARE's Marketplace, Stores and Repositories.

For further information please refer to the table in the appendix.

MYCOCKTAIL

MyCocktail provides a graphical user interface for building mashups easily. It offers two web tools: the mashup builder and the page editor. The mashup builder is a graphical environment for combining and modifying, by means of specific filters, information coming from REST services, and for connecting these data with web gadgets to create mashups. The gadgets can be exported as Google or Netvibes gadgets, compliant with the OpenSocial specification. Three types of elements can be combined in the builder:

FIGURE 3. MYCOCKTAIL WEB TOOL

• Services: information obtained from specific services can be used in the mashup. MyCocktail provides a set of predefined services to get specific information from several popular services or social networks such as Amazon, Delicious, Flickr, Google, Twitter etc. For example, it is possible to perform a search on Google or get information about Twitter followers. It is also possible to import new custom REST services into the editor using their WADL (Web Application Description Language) descriptions.
• Operators: the data objects obtained from service invocations can be manipulated using a set of operators. These operators work on objects, arrays or strings and perform functions such as sorting, parsing, splitting etc.
• Renderers: graphical elements and widgets that render the information obtained and manipulated using services and operators. The final graphical output can be exported in different formats, such as Google or Netvibes gadgets.
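The service, operator and renderer roles listed above can be sketched as three small functions chained together. This is an illustration in plain Java, not MyCocktail code: `MashupSketch`, `fetchTags`, `splitSortDistinct` and `renderHtmlList` are hypothetical names, and the "service" is a stub that returns fixed data instead of calling a REST endpoint.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class MashupSketch {
    // Service stand-in: a real mashup would call a REST service here.
    static String fetchTags() {
        return "smart city,iot,cloud,iot";
    }

    // Operators: split the string, trim, remove duplicates and sort.
    static List<String> splitSortDistinct(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    // Renderer: turns the manipulated data into an HTML list.
    static String renderHtmlList(List<String> items) {
        StringBuilder html = new StringBuilder("<ul>");
        for (String item : items) {
            html.append("<li>").append(item).append("</li>");
        }
        return html.append("</ul>").toString();
    }
}
```

Chaining the three calls, `renderHtmlList(splitSortDistinct(fetchTags()))`, corresponds to wiring a service to operators and then to a renderer in the builder.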

The Page Editor allows the user to design a web page through a GUI, integrating mashups created with MyCocktail with other HTML elements. It is based on FCKEditor, a web text editor. The page editor has a complete toolbar for selecting the elements and properties to include in the design area, which shows the resulting page.

FIGURE 4 OPEN MYCOCKTAIL

For further information please refer to the table in the appendix.

ENTERPRISE MASHUP MARKUP LANGUAGE

EMML (Enterprise Mashup Markup Language) is an XML markup language for creating enterprise mashups, promoted by the Open Mashup Alliance. EMML is a declarative mashup domain-specific language (DSL) that eliminates the need for complex, time-consuming and repetitive procedural programming logic to create enterprise mashups.

EMML provides a mashup-domain vocabulary to consume different data sources. Additionally, EMML provides a uniform syntax to invoke heterogeneous service technologies (e.g. Web services, REST services, RSS etc.) and to combine different data formats (JSON, XML, JDBC etc.).

EMML documents can be considered scripts that describe the process flow of a mashup: these scripts must be processed by an EMML engine that interprets EMML statements to perform the mashup. The EMML language provides several functionalities, such as:

• Data manipulation using filters
• Connection with heterogeneous services
• Semantic annotation of services
• Merging and splitting of datasets
• Support for scripting languages (e.g. JavaScript, JRuby, Groovy, XQuery)
• Conditional statements
• Data scraping from HTML pages

The Open Mashup Alliance has made the EMML schema available for download as well as an EMML reference runtime implementation that processes mashup scripts written in EMML.

For further information please refer to the table in the appendix.

OPENSOCIAL

OpenSocial is a set of common APIs (Application Programming Interfaces), initially developed by Google, that provides a single programming model for social applications, such as gadgets, able to operate on any social network that uses this standard.

Based on HTML and JavaScript, as well as the Google Gadgets framework, OpenSocial includes multiple APIs that let social software applications access data and core functions on participating social networks. Each API addresses a different aspect. It also includes APIs for contacting arbitrary third-party services on the web, using a proxy system and OAuth for security.

In particular, OpenSocial provides four different categories of APIs that can be used to:

• develop Social Applications, such as gadgets, that run in specified OpenSocial containers;
• implement Web Containers, i.e. host social networking environments in which the social applications can be executed;
• allow interaction between web sites and social networks that support the OpenSocial standard. In particular, existing sites can access several kinds of information, for instance:
  o the user profile (user data);
  o information about the user’s friends (social graph);
  o profile activities (posts, photos, videos, news feeds etc.).

The goal of the project is, therefore, to:

• increase the power and pervasiveness of social Web capabilities, providing developers with the tools needed to build applications that can be accessed by an ever-increasing number of users, regardless of the social networks used;
• increase interoperability between applications;
• promote the portability of web-based components;
• stimulate the creativity of users, in order to gather new ideas "from below";
• create an open and free "framework" that is compatible with the highest possible number of social networks.

OpenSocial is not a product or a service distributed by Google; it is an open standard supported by a large community of developers. Furthermore, the project differs significantly from what Facebook has achieved in recent years: despite its huge community of developers, Facebook has focused mainly on proprietary protocols and languages, maintaining a more conservative approach.

For further information please refer to the table in the appendix.

YAHOO! PIPES

Yahoo! Pipes is a visual programming environment that lets people create web mashups and web-based applications combining data from multiple sources. The name comes from Unix pipes, which allow data sources, filters and utilities of different types to be connected together. Yahoo! Pipes does not require any installation because it is an online service, available at http://pipes.yahoo.com; only a Yahoo! account is required.

It is a free service that allows the management of RSS feeds and the creation of mashups.

FIGURE 5. YAHOO PIPES MASHUP TOOL

It also includes an intuitive and user-friendly graphical editor.

The core component of Yahoo! Pipes is the Pipe editor, which is composed of three panes: the canvas, the library and the debugger. The user creates a pipe in these panes by drag and drop, composing modules from existing services. With Yahoo! Pipes it is possible, for example, to combine many feeds into one, to sort and filter them, and to geocode favorite feeds and browse their items on an interactive map. After creation, the user can publish the pipes on the website http://pipes.yahoo.com to make them visible to other people, who can reuse or modify them.

The output can be in different formats: widgets, RSS, JSON, PHP.
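The "combine many feeds into one, sort and filter" example can be expressed as a short pipeline. The sketch below is plain Java written for illustration; it is not generated by, and does not call, Yahoo! Pipes. `FeedItem`, `PipeSketch` and the sample fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical, minimal representation of one feed entry.
final class FeedItem {
    final String title;
    final long publishedEpochSec;

    FeedItem(String title, long publishedEpochSec) {
        this.title = title;
        this.publishedEpochSec = publishedEpochSec;
    }
}

final class PipeSketch {
    // Merges two feeds, keeps the items whose title contains the keyword
    // (case-insensitive) and returns their titles, newest first.
    static List<String> run(List<FeedItem> feedA, List<FeedItem> feedB, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<FeedItem> merged = new ArrayList<>(feedA);
        merged.addAll(feedB);
        return merged.stream()
                .filter(item -> item.title.toLowerCase(Locale.ROOT).contains(needle))
                .sorted(Comparator.comparingLong((FeedItem item) -> item.publishedEpochSec).reversed())
                .map(item -> item.title)
                .collect(Collectors.toList());
    }
}
```

Each stream stage plays the role of one module on the canvas: a union module, a filter module and a sort module connected in sequence.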

For further information please refer to the table in the appendix.

APACHE SHINDIG

Shindig is an open source project of the Apache Software Foundation that aims to implement an open container compliant with the OpenSocial gadgets specifications. According to the Apache Foundation, the project's goal "is to allow new sites to start hosting the social apps in less than an hour's work", by creating an infrastructure able to render gadgets, process proxy requests, and handle REST and RPC requests. On the market there are many examples of containers based on Shindig, such as LinkedIn, hi5, Partuza, WSO2, Jartuza. From the architectural point of view, Shindig consists of four basic components:

• Gadget Container JavaScript: a component written in JavaScript that manages gadget functionalities such as security, communication, graphical user interface and access to the OpenSocial API.
• Gadget Rendering Server: a server that renders the gadget XML file into JavaScript and HTML so that the container can expose the gadget through the Container JavaScript.
• OpenSocial Container JavaScript: the JavaScript environment that sits on top of the Gadget Container JavaScript and provides OpenSocial-specific functionality, such as profiles, lists of friends, activities and datastore.
• OpenSocial Data Server: the implementation of a server interface for access to container-specific information; it includes the OpenSocial REST API.

From the architectural point of view, Shindig has a client component and a server component. The client part of the system is composed of three elements:

• The container for gadgets, fully compliant with the OpenSocial gadgets specifications (GadgetsContainer).
• The OpenSocial container itself (opensocial.Container).
• The container that supports the exchange of data in the JSON and Caja formats according to the REST standard (RestfulContainer).

FIGURE 6. APACHE SHINDIG OVERALL ARCHITECTURE

In particular, the latter is an extension of the OpenSocial container component and passes all calls to the OpenSocial REST API endpoint, which handles both the direct HTTP calls and those coming from the OpenSocial API.

All calls to the server are executed by the RestfulContainer and the GadgetsContainer through gadgets.io, which instantiates an XMLHttpRequest object in order to send requests to the HTTP server.

The Shindig components of the server side are:

• Persistent Data Access Layer: the loading mechanism for persistent data.
• Gadget Rendering Server Components: the infrastructure for rendering gadgets.
• OpenSocial Server Components: the server-side implementations of the OpenSocial API.

FIGURE 7. APACHE SHINDIG SERVER ARCHITECTURE

Apache Shindig currently has two implementations, written in Java and PHP; a third, written in .NET, is external to the project.

For further information please refer to the table in the appendix.

IPOJO

iPOJO (injected Plain Old Java Object) is a service component runtime that aims to simplify OSGi application development and to provide a service-oriented component model.

Creating an OSGi-based application with services is challenging: the OSGi API is complex, and a lot of knowledge about its internal mechanisms is needed to avoid synchronization issues. Early efforts, such as the Service Binder and Service Tracker, attempted to give more flexibility to OSGi developers, and more recent efforts, such as Declarative Services and Spring Dynamic Modules, alleviate some of these issues. iPOJO is another such effort in this area. It has two goals:

• providing services, using services, and dealing with dynamism should require as little effort as possible from the developer;
• component configuration, synchronization, and composition should be integrated with the OSGi mechanisms.

iPOJO provides a very simple development model. The component (Figure 2-8) is a central concept in iPOJO. In the core iPOJO model, a component describes service dependencies, provided services, and callbacks; this information is recorded in the component's metadata. A service is an object that implements a given Java interface. In addition, iPOJO introduces a callback concept to notify a component about various state changes.

To provide a service:

    @Component
    @Provides
    public class MyServiceImplementation implements MyService {
        //....
    }

In this way, the component is declared as the MyService provider and the corresponding OSGi service is created automatically.

To require a service:

    @Component
    public class MyServiceConsumer {
        @Requires
        private MyService myservice;

        // Just use your required service as any regular field!
    }

After components, the next most important concept in iPOJO is the component instance. A component instance is a special version of a component. By merging component metadata and instance configuration, the iPOJO runtime is able to discover and inject required services, publish provided services, and manage the component's life cycle.


FIGURE 2-8. IPOJO COMPONENT EXAMPLE

iPOJO works on any R4.1 OSGi implementation. It also works on many Java virtual machines such as Oracle JRockit, JamVM, Dalvik (Android), and Mika. iPOJO only requires a J2ME Foundation 1.1 virtual machine. So, iPOJO can be embedded inside mobile phone applications or inside your washing machine. iPOJO is small and was designed to stay small. The core size of iPOJO is approximately 205k (compared to 816k for Guice-Peaberry and 2112k for the minimal Spring-DM configuration). In addition to the core, you only deploy the features you require. For example, if you need proxy injection, just deploy the temporal dependency bundle (less than 70Kb). The run-time overhead of iPOJO is also small. On the Peaberry benchmark, iPOJO has the following performance results:

Guice-Peaberry: 276.00 ns/call
iPOJO Service Dependency: 118.00 ns/call
Spring-DM: 2384.00 ns/call
iPOJO Temporal Dependency: 159.00 ns/call
iPOJO Temporal Dependency w/ proxy: 173.00 ns/call

For further information please refer to the table in the appendix.


BPMN

The Business Process Modeling Notation (BPMN) (BPMN) is a graphical notation that depicts the steps in a business process. A business process spans multiple participants, and its coordination can be complex. A well-supported standard modeling notation reduces confusion among business and IT end-users. BPMN depicts the end-to-end flow of a business process. The notation has been specifically designed to coordinate the sequence of processes and the messages that flow between different process participants in a related set of activities. The following figure shows a Business Process Diagram (BPD).

FIGURE 9. EXAMPLE OF BPD (STEPHEN A. WHITE)

A BPD is made up of a set of graphical elements. These elements enable the easy development of simple diagrams that will look familiar to most business analysts (e.g., a flowchart diagram). The elements were chosen to be distinguishable from each other and to utilize shapes that are familiar to most modelers. For example, activities are rectangles and decisions are diamonds. It should be emphasized that one of the drivers for the development of BPMN is to create a simple mechanism for creating business process models, while at the same time being able to handle the complexity inherent to business processes. The approach taken to handle these two conflicting requirements was to organize the graphical aspects of the notation into specific categories. This provides a small set of notation categories so that the reader of a BPD can easily recognize the basic types of elements and understand the diagram. Within the basic categories of elements, additional variation and information can be added to support the requirements for complexity without dramatically changing the basic look-and-feel of the diagram. The four basic categories of elements are: Flow Objects, Connecting Objects, Swimlanes and Artifacts.

A key goal in developing BPMN, to help close the technical gap in modeling, was to create a bridge from the business-oriented process modeling notation to the IT-oriented execution languages that implement the processes within a business process management system. The graphical objects of BPMN, supported by a rich set of object attributes, have been mapped to the Business Process Execution Language for Web Services (BPEL4WS v1.1), the standard for process execution. For example, the core of jBPM (jBPM, 2013) is a light-weight, extensible workflow engine written in pure Java that allows you to execute business processes using the latest BPMN 2.0 specification. It can run in any Java environment, embedded in your application or as a service.

Alternatives to BPMN exist, for example UML Activity Diagram, UML EDOC Business Processes, IDEF, ebXML BPSS, Activity-Decision Flow (ADF) Diagram, RosettaNet, LOVeM, and Event-Process Chains (EPCs). The following section illustrates the difference between UML and BPMN, which helps to clarify the relationships between these solutions.

BPMN & UML (BPMN)

The unified modeling language (UML) takes an object-oriented approach for modeling applications, while BPMN takes a process-oriented approach for modeling systems. Where BPMN has a focus on business processes, the UML has a focus on software design and therefore the two are not competing notations but are different views on systems. The BPMN and the UML are compatible with each other. A business process model does not necessarily have to be implemented as an automated business process in a process execution language. Where this is the case, business processes and participants can be mapped to constructs such as use cases and behavioral models in the UML.

BPMN TO XML (BPMN 2.0 by Example, 2010)

The XML serialization for BPMN is provided in machine-readable form under the OMG Document Number dtc/2010-06-03. It is an international standard formally approved by OMG. The XML code is generated when a person creates a process model.

For further information please refer to the table in the appendix.

BPEL

BPEL (BPEL) is the standard for assembling a set of discrete services into an end-to-end process flow, radically reducing the cost and complexity of process integration initiatives.

BPEL (BPEL Definition) is an orchestration language, not a choreography language. The primary difference between orchestration and choreography lies in executability and control. An orchestration specifies an executable process that involves message exchanges with other systems, such that the message exchange sequences are controlled by the orchestration designer. A choreography specifies a protocol for peer-to-peer interactions, defining, e.g., the legal sequences of messages exchanged with the purpose of guaranteeing interoperability. Such a protocol is not directly executable, as it allows many different realizations (processes that comply with it). A choreography can be realized by writing an orchestration (e.g., in the form of a BPEL process) for each peer involved in it.

BPEL has no graphical notation, which has led to a variety of different notations that cause ambiguities and misunderstandings. BPMN is used to fill this gap.

The following figure (Leymann, 2010) further explains the relationship between BPEL and BPMN using a concrete example.

ORACLE BPEL

Oracle BPEL Process Manager (ORACLE BPEL Process Manger, 2009) is a tool for designing and running business processes. This product provides a comprehensive, standards-based and easy to use solution for creating, deploying and managing cross-application business processes with both automated and human workflow steps, all in a service-oriented architecture. Oracle BPEL Process Manager can be used in isolation to implement business processes, but the real power comes when it is used in conjunction with other SOA components. For example, you can instrument your BPEL processes using the Oracle BPEL Process Manager’s sensor framework. Sensors can fire events under conditions specified by the administrators and send events to any endpoint that administrators choose. This makes it easy to monitor processes when there are hundreds or thousands running in parallel.

For further information please refer to the table in the appendix.


2.3. Dependable Service Compositions reusable components

This section lists the reusable components for Dependable Service Composition. For each component, it gives a short description and a link to the corresponding component table in the Appendix.

ROBUST SERVICE COMPOSITION METHOD

This method provides foundations for service composition that deal with differences in service functionality and quality as well as with service faults or changes. It can be instantiated as an analysis and design method for human developers, or as an algorithm for automated service composition. The core of the method is a modeling structure to efficiently represent and analyze slightly different service functions. It can be used to:

• understand the differences;
• efficiently find matches between service functions;
• efficiently find alternatives for a certain function;
• assess the feasibility or sustainability of realizing a certain function through external providers.

For automated usage, an efficient algorithm is provided: it uses the model to construct a robust service composition plan by analyzing alternatives and risks of lock-in. The algorithm customizes a genetic algorithm with the information on alternatives so that semi-optimal solutions are obtained efficiently, even though considering alternatives makes the problem much more complex than the standard one.

FIGURE 10. ROBUST SERVICE COMPOSITION METHOD

For further information please refer to the table in the appendix.


QOS-BASED SERVICE/CLOUD SELECTION METHOD

This method provides an algorithm for QoS-based selection of services or clouds to use. It follows and handles the standard problem of QoS-based service selection, which selects the services to use in a service composition according to a variety of QoS (Quality of Service) criteria. The method considers both preferences over multiple criteria (e.g., price is more important than execution time) and end-to-end constraints (e.g., total price must be less than $1 per invocation). The method extends this standard setting by also considering network QoS between interacting services. It thus includes a model that defines the aggregated quality of the services and networks involved in a service composition. It also provides an efficient algorithm for QoS optimization by customizing a genetic algorithm to reflect the location of services in a network model.

FIGURE 11. QOS-BASED SERVICE
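The selection problem described in this section can be illustrated with a minimal sketch. It is not the ClouT method: it replaces the customized genetic algorithm with a plain exhaustive search, and it assumes a sequential composition, a fixed delay between services in different locations, and a weighted sum of price and time as the preference model. All class, field and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical candidate service; all fields are illustrative assumptions.
final class Candidate {
    final String name;
    final double price;     // price per invocation, in dollars
    final double timeMs;    // execution time, in milliseconds
    final String location;  // coarse network location of the service

    Candidate(String name, double price, double timeMs, String location) {
        this.name = name;
        this.price = price;
        this.timeMs = timeMs;
        this.location = location;
    }
}

final class QosSelectionSketch {
    // Assumed network model: a fixed delay between services in different locations.
    static final double REMOTE_DELAY_MS = 50.0;

    static double totalPrice(Candidate[] plan) {
        double sum = 0.0;
        for (Candidate c : plan) sum += c.price;
        return sum;
    }

    // End-to-end time of a sequential composition: execution times plus network delays.
    static double totalTime(Candidate[] plan) {
        double sum = 0.0;
        for (int i = 0; i < plan.length; i++) {
            sum += plan[i].timeMs;
            if (i > 0 && !plan[i - 1].location.equals(plan[i].location)) sum += REMOTE_DELAY_MS;
        }
        return sum;
    }

    // Exhaustive search over one candidate per task (each task needs at least one candidate).
    // Returns the feasible plan with the lowest weighted cost, or null if no plan is feasible.
    static Candidate[] select(List<List<Candidate>> tasks, double wPrice, double wTime,
                              double maxPrice, double maxTimeMs) {
        Candidate[] best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        int[] idx = new int[tasks.size()];
        while (true) {
            Candidate[] plan = new Candidate[tasks.size()];
            for (int i = 0; i < plan.length; i++) plan[i] = tasks.get(i).get(idx[i]);
            double price = totalPrice(plan);
            double time = totalTime(plan);
            if (price <= maxPrice && time <= maxTimeMs) {       // end-to-end constraints
                double cost = wPrice * price + wTime * time;    // preferences as weights
                if (cost < bestCost) { bestCost = cost; best = plan; }
            }
            int pos = 0;                                        // advance to the next combination
            while (pos < idx.length) {
                idx[pos]++;
                if (idx[pos] < tasks.get(pos).size()) break;
                idx[pos] = 0;
                pos++;
            }
            if (pos == idx.length) return best;
        }
    }

    public static void main(String[] args) {
        List<List<Candidate>> tasks = Arrays.asList(
            Arrays.asList(new Candidate("a1", 0.2, 100, "eu"), new Candidate("a2", 0.1, 300, "jp")),
            Arrays.asList(new Candidate("b1", 0.5, 100, "eu"), new Candidate("b2", 0.2, 150, "jp")));
        Candidate[] plan = select(tasks, 1000.0, 1.0, 1.0, 400.0);
        for (Candidate c : plan) System.out.println(c.name);
    }
}
```

With the sample data, the cheapest candidates (a2 and b2) violate the 400 ms time limit, so the search settles on a1 followed by b2 despite the extra cross-location delay. Exhaustive search grows exponentially with the number of tasks, which is why a heuristic such as a genetic algorithm is attractive for realistic compositions.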

For further information please refer to the table in the appendix.


METADATA-BASED BEHAVIOR INSERTION FRAMEWORK

This framework provides a way to insert additional behaviors into service compositions to achieve value-added functions, often for non-functional requirements. It defines an algorithm to properly insert additional behaviors according to constraints or rules. These constraints or rules work on metadata attached to service compositions. They are therefore simple, and one specification item can trigger insertions at multiple execution points in multiple compositions (e.g., insert logging behavior before "PRIVACY" data is sent to a "THIRD PARTY").

FIGURE 12. METADATA-BASED BEHAVIOUR INSERTION FRAMEWORK
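The logging example above can be sketched as follows. This is only an illustration under simplifying assumptions: a composition is modeled as a linear list of steps, each step carries two metadata tags, and a rule is a pair of tag values plus the name of the behavior to insert. The names `Step`, `BehaviorInsertionSketch` and `insertBefore` are hypothetical and not part of the framework.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical composition step with metadata tags (assumed model, not the framework's).
final class Step {
    final String name;
    final String dataTag;      // e.g. "PRIVACY" or "PUBLIC"; null for inserted behaviors
    final String receiverTag;  // e.g. "THIRD PARTY" or "INTERNAL"; null for inserted behaviors

    Step(String name, String dataTag, String receiverTag) {
        this.name = name;
        this.dataTag = dataTag;
        this.receiverTag = receiverTag;
    }
}

final class BehaviorInsertionSketch {
    // Returns a new composition in which a step named behaviorName is inserted
    // before every step whose metadata matches both tags of the rule.
    static List<Step> insertBefore(List<Step> composition, String dataTag,
                                   String receiverTag, String behaviorName) {
        List<Step> result = new ArrayList<>();
        for (Step step : composition) {
            if (dataTag.equals(step.dataTag) && receiverTag.equals(step.receiverTag)) {
                result.add(new Step(behaviorName, null, null));
            }
            result.add(step);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Step> composition = Arrays.asList(
            new Step("collect", "PUBLIC", "INTERNAL"),
            new Step("share", "PRIVACY", "THIRD PARTY"),
            new Step("archive", "PRIVACY", "INTERNAL"));
        for (Step step : insertBefore(composition, "PRIVACY", "THIRD PARTY", "log")) {
            System.out.println(step.name);
        }
    }
}
```

Because the rule refers only to metadata, the same rule can be applied unchanged to any number of compositions, which is the property the framework relies on.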

For further information please refer to the table in the appendix.


VERIFICATION FRAMEWORK OF TIME AND RESOURCE CONSTRAINTS ON BUSINESS PROCESS

This framework provides a capability to verify time and resource constraints on service compositions or business processes through model checking. It provides a small annotation syntax on BPMN (Business Process Modeling Notation) for time and resource constraints. It has conversion rules from annotated BPMN into the input of UPPAAL, a timed model checker, which exhaustively explores possible state changes to verify the constraints.

FIGURE 13. VERIFICATION FRAMEWORK OF TIME AND RESOURCE CONSTRAINTS ON BUSINESS PROCESS

For further information please refer to the table in the appendix.


VERIFICATION FRAMEWORK OF ECA SPECIFICATION ON PHYSICAL INTERACTIONS

This framework provides a capability to verify complex combinations of ECA (Event-Condition-Action) rules that have effects on shared spaces, often by multiple users. It provides a language for high-level specification of physical interactions (e.g., visual, audio). It also defines conversion rules from this description into the input of SPIN, a model checker. Several templates are provided for the formulas to check, which are difficult to define properly in temporal logic.

FIGURE 14. VERIFICATION FRAMEWORK OF ECA SPECIFICATION ON PHYSICAL INTERACTIONS

For further information please refer to the table in the appendix.


3. Big data processing

3.1. Introduction

This section presents the current state of the art in reusable data processing components that can be considered when developing ClouT’s data processing subsystem. Given the ubiquity and increasing number of devices in the future Internet of Things, scalable data processing infrastructures are necessary to collect and process a “terabyte torrent” coming from trillions of devices. The ClouT data processing subsystem will provide an event processing engine that collects events from the city and processes them to obtain higher-level information that can be accessed by other services and applications (e.g., subscribers). Traditional SQL-based database languages are no longer adequate for managing the high-volume stream of data from the IoT. The querying and processing techniques should be revised in order to take into account the real-time nature of the applications. Where possible, part of the processing should be done at the device level, thus exploiting the devices’ processing capabilities.

Data need to be processed by data stream management systems in real time rather than with the typical store-and-process paradigm. This section thus analyses NoSQL solutions for data processing. In addition, since ClouT will provide an event processing mechanism with a publish/subscribe facility, existing solutions for that are also listed in this section. The ClouT project also deals with non-functional issues of the data processing system, such as fault detection. The data should be “clean” in order to be properly used by real-time applications. Faulty data should be detected and isolated, and necessary measures should be taken if the faults are frequent. One of the big challenges is that this should be done in real time on a large quantity of data. Some solutions for online fault detection are presented in this section.


3.2. Data/event processing and decision making reusable components

This section describes the reusable components for data/event processing and decision making. For each of them, it gives a short description and a link to the corresponding component table in the Appendix.

ESPER

The Esper engine (Inc, 2012) has been developed to address the requirements of applications that analyze and react to events. The Esper engine works a bit like a database turned upside-down. Instead of storing the data and running queries against stored data, the Esper engine allows applications to store queries and run the data through them. The engine responds in real time when conditions occur that match the queries. The execution model is thus continuous, rather than being triggered only when a query is submitted.

Esper provides two principal mechanisms to process events: event patterns and event stream queries. Event patterns are expressions that match expected sequences of the presence or absence of events, or combinations of events, including time-based correlation of events. Event stream queries address the event stream analysis requirements of CEP applications: they provide the windows, aggregation, joining and analysis functions for use with streams of events. These queries follow the Event Processing Language (EPL) syntax. EPL has been designed for similarity with the SQL query language but differs from SQL in its use of views rather than tables. Views represent the different operations needed to structure data in an event stream and to derive data from an event stream.
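The "database turned upside-down" idea can be illustrated without Esper itself. The following plain-Java sketch does not use the Esper API or EPL; it only mimics a stored query that keeps a length window of the last N readings, aggregates them into an average, and notifies a listener whenever the average exceeds a threshold. The class name `ContinuousAvgQuery` and its parameters are assumptions made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.DoubleConsumer;

// Plain-Java illustration only; this is NOT the Esper API.
final class ContinuousAvgQuery {
    private final int windowSize;
    private final double threshold;
    private final DoubleConsumer listener;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    ContinuousAvgQuery(int windowSize, double threshold, DoubleConsumer listener) {
        this.windowSize = windowSize;
        this.threshold = threshold;
        this.listener = listener;
    }

    // Called for every incoming event; the stored query is re-evaluated immediately.
    void onEvent(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst();   // evict the oldest event from the length window
        }
        double average = sum / window.size();
        if (average > threshold) {
            listener.accept(average);      // push the result to the subscriber
        }
    }

    public static void main(String[] args) {
        List<Double> alerts = new ArrayList<>();
        ContinuousAvgQuery query = new ContinuousAvgQuery(3, 30.0, v -> alerts.add(v));
        for (double reading : new double[] {10, 20, 60, 70}) {
            query.onEvent(reading);
        }
        System.out.println(alerts);   // prints [50.0]
    }
}
```

The query is registered once and the data flows through it, so results are produced as events arrive instead of when a query is submitted.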

Figure 15 shows the event processing principle that Esper follows. First, the data we would like to analyze need to be structured in the form of objects before they can be thrown down the pipe to the CEP engine. Esper provides multiple choices for representing an event, for example a Java object or a DOM node, so there is no absolute need to create new Java classes to represent an event. Then, we need to inform the engine about the kind of objects it will have to handle. Events can then be thrown down the pipe of the CEP engine. The CEP engine filters the data it receives and triggers events whenever the data meet the selection rule or fulfil the pattern defined in the EPL statement.

FIGURE 15. CEP PRINCIPLE (ANGELO CORSARO)


ESPER & OSGI

Esper is a server-side component that can be deployed on platforms that perform data gathering and processing. It is possible to deploy Esper as a bundle in an OSGi gateway. Figure 16 shows a possible integration of Esper into OSGi.

FIGURE 16. ESPER INTEGRATION TO OSGI

An integration of Esper into OSGi has been developed by the FI-WARE project. The next section briefly presents that component.

For further information please refer to the table in the appendix.


FI-WARE GATEWAY DATA HANDLING

FI-WARE Gateway Data Handling (Fiware Data Handeling) is a complex event processing component proposed by Orange Labs France. It is based on the Esper4FastData CEP engine. It is able to collect vast amounts of asynchronous events of different types and correlate them into single events, called Complex Events. It can read from and write to numerous channels using various protocols. It is driven using a domain specific language called “Dolce”.

The Gateway Data Handling GE shown in Figure 17 is also the first stage of intelligence, transforming data into events using smart rules. Applications can collect large amounts of data in real time while receiving only the relevant data, which avoids tedious, asynchronous data analysis. Instead of managing only raw data, you can define local rules that add value to the raw data and send only relevant events when a typical situation happens. You can also add and change the rules.

FIGURE 17. DATA HANDLING GE ARCHITECTURE

The Data Handling API is NGSI compliant. It accepts events from any NGSI compliant Event Producer. In the IoT Service Enablement architecture, the Gateway Device Management GE publishes device events and data towards the Data Handling GE. In this setup, the Gateway Device Management GE registers itself as an NGSI context provider by calling the NGSI-9 registerContext method exposed by the Data Handling GE. After context registration, the Gateway Device Management GE sends events by calling the NGSI-10 updateContext method of the Data Handling GE.

The Data Handling GE implementation allows rules to be built by graphically wiring blocks together. The resulting diagram is then converted into EPL syntax, which is the CEP rule language. The graphical rules editor is optional, but useful for managing CEP rules in a user-friendly environment.

For further information please refer to the table in the appendix.


JBOSS DROOLS EXPERT

Drools Expert is a declarative, rule based, coding environment. This allows you to focus on "what it is you want to do", and not on the "how to do this".

Drools is a specific type of rule engine called a Production Rule System (PRS) and is based around the Rete algorithm (usually pronounced as two syllables, e.g., REH-te or RAY-tay). The Rete algorithm, developed by Charles Forgy in 1974, forms the brain of a Production Rule System and is able to scale to a large number of rules and facts. A Production Rule is a two-part structure:

when
    <conditions>
then
    <actions>;

The engine matches facts and data against Production Rules, also called Productions or just Rules, to infer conclusions which result in actions. The process of matching the new or existing facts against Production Rules is called pattern matching. The figure below shows the high-level architecture of a production system.

FIGURE 18. HIGH-LEVEL PRODUCTION RULE SYSTEM

The following is a simple "reactive" monitoring example that sends a message every four hours when the alarm is raised. The calendar attribute ensures the rule only executes on weekdays. Monitoring examples like this would be a long running application.

rule "Weekday Alarm Response"
    timer(int 4h)
    calendar "weekday"
when
    a : Alarm( )
then
    sendMessage( "There is an alert" + a);
end

Drools can also be integrated with jBPM. The Drools Camel server (drools-camel-server) (Drools-Camel Server) module is a WAR (Web application ARchive) file which you can deploy to execute Knowledge Bases remotely for any sort of client application. This is not limited to JVM application clients: any technology that can use HTTP can access it through a REST interface. This version of the execution server supports stateless and stateful sessions in a native way.

For further information please refer to the table in the appendix.

MONGODB

MongoDB (MongoDB) (from "humongous") is an open-source document database and a leading NoSQL database. Data in MongoDB has a flexible schema. Collections do not enforce document structure, so you can use different structures within a single data set. In MongoDB, different data models may have significant impacts on application performance.

MongoDB (MongoDB Overview) features: a document data model with dynamic schemas, full and flexible index support, auto-sharding for horizontal scalability, built-in replication for high availability, text search, rich queries and advanced security.

Instead of storing data in rows and columns as one would with a relational database, MongoDB stores a binary form of JSON documents (BSON). Relational databases impose flat, rigid schemas across many tables. By contrast, MongoDB is an agile NoSQL database that allows schemas to vary across documents and to change quickly as applications evolve, while still providing the functionality developers expect from relational databases, such as secondary indexes, a full query language and strong consistency.

In addition, MongoDB is built for scalability, performance and high availability. Auto-sharding allows MongoDB to scale from single server deployments to large, complex multi-data center architectures. Leveraging native caching and RAM, MongoDB provides high performance for both reads and writes. Built-in replication with automated failover enables enterprise-grade reliability and operational flexibility. MongoDB also provides native, idiomatic drivers for all popular programming languages and frameworks to make the development natural.

For further information please refer to the table in the appendix.


3.3. Self-healing for data/event streaming reusable components

This section lists the reusable components for self-healing of data/event streaming. For each of them, it gives a short description and a link to the corresponding component table in the Appendix.

SELF-HEALING FRAMEWORK FOR SENSORY DATA

This framework provides runtime detection, classification, and correction capabilities for faults that appear in sensory data. It includes a fault classification model and a model-learning component based on statistical pattern matching.

FIGURE 19. SELF-HEALING FRAMEWORK FOR SENSORY DATA

Figure 19 gives an overview of the framework. The framework addresses the full cycle of a self-healing mechanism, which includes detection, diagnosis, resiliency mechanisms and fault models.

The detection phase is self-explanatory: a neighborhood vote combined with statistical analysis of the data is sufficient for successful detection. Classification is a part of the diagnosis. It relies on the duration of faults and the impact they have on the data behavior. To complete the classification, there is a phase of fault model learning. The learned models are later applied to correct readings from the respective faulty nodes. Each of these phases can be implemented with a different set of applicable algorithms.

In the absence of ground truth, data modeling is vital. A good set of models enables grouping data into good or faulty, including the type of fault. The challenge here is to distinguish between faulty behavior and an outlier caused by a new event. If the expected range of readings is known, a reading that falls outside that range is not necessarily faulty; it might indicate the occurrence of a new event. We focus on the case where this knowledge is not available and propose model learning from the past behavior of the network. Expectations of correct behavior are established based on readings collected over the initial phase of operation. In this way, the network is not bound by a-priori assumptions and has the freedom to establish what ground truth is. The network learns a fault model for each faulty node and can apply that model for fault correction.

Both modeling and classification rely on data features, which are usually statistical in nature. Since the fault can belong only to a limited number of proposed classes, statistical pattern recognition to learn an appropriate model of the fault is suitable. This model is then applied to correct the readings of a respective node.
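The detection and correction phases can be sketched under strong simplifying assumptions. In this illustration, the neighborhood vote is approximated by comparing a reading with the median of its neighbors' readings, and the learned fault model is restricted to a constant offset estimated as the mean deviation from the neighbor medians. These choices, the tolerance parameter and all names are assumptions made for the sketch; the framework itself leaves the choice of algorithms open.

```java
import java.util.Arrays;

final class NeighborVoteSketch {
    // Median of a non-empty array of neighbor readings.
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Detection: a reading is suspect if it deviates from the neighbors' median
    // by more than the given tolerance (assumed stand-in for the neighborhood vote).
    static boolean isSuspect(double reading, double[] neighbors, double tolerance) {
        return Math.abs(reading - median(neighbors)) > tolerance;
    }

    // Model learning (constant-offset model only): mean deviation of the node's
    // readings from the neighbors' median at the same time steps.
    static double learnOffset(double[] readings, double[][] neighborReadings) {
        double sum = 0.0;
        for (int i = 0; i < readings.length; i++) {
            sum += readings[i] - median(neighborReadings[i]);
        }
        return sum / readings.length;
    }

    // Correction: remove the learned offset from a faulty reading.
    static double correct(double reading, double offset) {
        return reading - offset;
    }

    public static void main(String[] args) {
        double[][] neighbors = { {20.0, 21.0, 19.0}, {21.0, 22.0, 20.0} };
        double[] faulty = {25.0, 26.0};
        System.out.println(isSuspect(faulty[0], neighbors[0], 2.0));
        double offset = learnOffset(faulty, neighbors);
        System.out.println(correct(faulty[0], offset));
    }
}
```

A median is used here because a single faulty neighbor does not shift it much; a real deployment would also need to separate genuine new events from faults, as discussed above.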

For further information please refer to the table in the appendix.


FAULT CLASSIFICATION MODEL

This model provides a decision tree to classify data faults into four types: bias fault, drift fault, malfunction fault and random fault. Applied to sensory readings, this model proposes a complete and consistent classification based on frequency and the continuity of occurrence and observable and learnable patterns.

FIGURE 20. FAULT CLASSIFICATION MODEL

Figure 20 illustrates this classification. From here we can define types of faults. If a sensor reading is represented as $r_i + \varepsilon_i$, where $r_i$ is the true value of a measured phenomenon and $\varepsilon_i$ is a fault, we can define the fault classification as follows:

• Discontinuous - The fault occurs from time to time; the occurrence of $\varepsilon_i$ is discrete.

– Malfunction – Frequent occurrence of faulty readings, $\varepsilon_i > \tau$, where $\tau$ is a threshold frequency. Also, there is no observable pattern in the fault occurrences.

– Random – Infrequent occurrence of faulty readings, $\varepsilon_i \le \tau$.

• Continuous - After a certain point in time, a sensor constantly returns inaccurate readings, and it is possible to observe a pattern in the form of a function:

$\varepsilon_i = f(t, [\alpha_1, \alpha_2, \ldots])$

where $\alpha_1, \alpha_2, \ldots$ are coefficients of the function and $t$ is time.

– Bias - The function of the error is a constant, $\varepsilon_i = \mathrm{const}$. This can be a positive or a negative offset.

– Drift - The deviation of data follows a learnable function, such as a polynomial change:

$\varepsilon_i = \alpha_1 t^{n} + \alpha_2 t^{n-1} + \cdots + \alpha_{n+1}$

The classification is independent of the underlying cause of a fault.
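The decision tree can be summarized in a few lines of code. The sketch below assumes that the features it branches on (whether the fault is continuous, the observed frequency of faulty readings, and whether the error is a constant offset) have already been extracted from the data; how they are extracted is outside the sketch. All names are hypothetical.

```java
// Illustrative sketch of the fault classification decision tree.
// The names and the pre-computed input features are assumptions.
enum FaultType { MALFUNCTION, RANDOM, BIAS, DRIFT }

final class FaultClassifierSketch {
    /**
     * @param continuous     true if the sensor constantly returns inaccurate readings after some point in time
     * @param faultFrequency observed frequency of faulty readings (used for discontinuous faults only)
     * @param tau            threshold frequency separating malfunction from random faults
     * @param constantError  true if the error is a constant offset (used for continuous faults only)
     */
    static FaultType classify(boolean continuous, double faultFrequency,
                              double tau, boolean constantError) {
        if (!continuous) {
            // Discontinuous branch: frequent faults are malfunctions, infrequent ones are random.
            return faultFrequency > tau ? FaultType.MALFUNCTION : FaultType.RANDOM;
        }
        // Continuous branch: a constant error is a bias, any other learnable function is a drift.
        return constantError ? FaultType.BIAS : FaultType.DRIFT;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, 0.4, 0.1, false));
        System.out.println(classify(true, 0.0, 0.1, true));
    }
}
```

The first call falls into the discontinuous branch with a frequency above the threshold, and the second into the continuous branch with a constant error.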

For further information please refer to the table in the appendix.


4. Secure and dependable access to city data

4.1. Introduction

The goal of this task is twofold. Firstly, it aims at providing a City Data Storage and Access framework and at integrating APIs for securely hosting and accessing the large amount of data produced by the City applications and the available sensors. Secondly, it will provide dependability assurance tools that can be used for monitoring the dependability of running systems.

With regard to Open City data hosting, several cloud storage solutions, among Object Storage, Distributed Storage and Database solutions, have been identified as candidates for building the Data Storage and Access Services of the ClouT platform. These distributed data management technologies, with their different features and limits, can be combined to enable fast storage and retrieval of the variety of data produced by virtualized sensors and city applications connected to the Cloud. These data may differ in terms of size (for example, from a single sensor measurement to a large image file), amount, format, expected frequency of access, and type of usage (processing, long-term archiving).

Cloud databases are web-based services conceived for executing queries on structured data stored on cloud data services. Compared with traditional approaches, cloud databases provide the main capabilities of a database, namely simple querying of structured data and real-time lookup, while being easy to use. Furthermore, most cloud databases follow a distributed deployment model: information is distributed among hardware in different locations, but this is transparent to the client, which sees the data as if it resided in a single place.

Storing and processing sensor measurements should support advanced queries and correlations of information (for analytic purposes). For these reasons, this preliminary study found cloud database services to be the best technology for storing sensor data. Nevertheless, historical data should be kept for a reasonably long period to enable statistical analysis over time, even over years or decades, and databases are not suitable for archiving large sets of data for such long periods.

In this sense, different platforms allow not only gathering measurements and storing historical values retrieved from the network, but also sending commands to the deployed nodes and subscribing to different events in the network.

Regarding network management, it is important to be able to remotely access and control the deployed platform, in order to send commands to and receive data from the nodes as well as to change their behavior through remote flashing procedures.

The cloud object storage model was introduced quite recently in the world of distributed storage as an alternative to file system storage, to face the explosion of mostly unstructured data being stored on storage systems. Compared with a file system, cloud object storage is much more scalable and simpler, and it provides important resiliency and reliability features. It is based on a flat organization of containers and objects and uses unique identifiers to retrieve them. This data model is flat in the sense that objects are organized in containers with no relationship between each other. Storage objects are replicated across multiple commodity-hardware servers in different locations in order to ensure reliability. Cloud object storage provides limited functionality, such as CRUD (create, read, update, delete) operations on objects through REST APIs, which allows users to access files from anywhere.
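The flat data model can be illustrated with a small in-memory sketch. It only shows containers, objects addressed by unique identifiers, and the four CRUD operations; replication, persistence and the REST interface are deliberately left out. The class `FlatObjectStoreSketch` and its methods are hypothetical and do not correspond to any particular object storage product.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// In-memory illustration of a flat container/object model (no replication, no REST layer).
final class FlatObjectStoreSketch {
    // container name -> (object identifier -> object content)
    private final Map<String, Map<String, byte[]>> containers = new HashMap<>();

    // Create: stores the data in the container and returns its unique identifier.
    String create(String container, byte[] data) {
        String id = UUID.randomUUID().toString();
        containers.computeIfAbsent(container, k -> new HashMap<>()).put(id, data.clone());
        return id;
    }

    // Read: returns a copy of the object content, or null if it does not exist.
    byte[] read(String container, String id) {
        Map<String, byte[]> objects = containers.get(container);
        byte[] data = objects == null ? null : objects.get(id);
        return data == null ? null : data.clone();
    }

    // Update: replaces the whole object; returns false if it does not exist.
    boolean update(String container, String id, byte[] data) {
        Map<String, byte[]> objects = containers.get(container);
        if (objects == null || !objects.containsKey(id)) return false;
        objects.put(id, data.clone());
        return true;
    }

    // Delete: removes the object; returns false if it does not exist.
    boolean delete(String container, String id) {
        Map<String, byte[]> objects = containers.get(container);
        return objects != null && objects.remove(id) != null;
    }

    public static void main(String[] args) {
        FlatObjectStoreSketch store = new FlatObjectStoreSketch();
        String id = store.create("city-images", new byte[] {1, 2, 3});
        System.out.println(store.read("city-images", id).length);
        System.out.println(store.delete("city-images", id));
    }
}
```

Objects are replaced as a whole and located only by container and identifier, which mirrors the absence of hierarchy and of relationships between objects in the model described above.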

Among the disadvantages of Cloud Object storage with respect to file systems, it is worth mentioning that throughput is usually lower and that, when an object is updated, the change has to be propagated to all of the replicas before requests return the latest version (lack of consistency). For this reason, while object storage is the best solution for data that is updated infrequently, like multimedia files or backups, it is not suitable for data that changes frequently.
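The flat container/object model and the replica propagation issue described above can be illustrated with a minimal in-memory sketch. It is not the interface of any specific product: the class `FlatObjectStore`, its methods and the explicit `propagate` step are hypothetical names chosen for illustration, and a real service would expose these operations through REST calls.

```python
import uuid


class FlatObjectStore:
    """Toy object store: flat containers, unique ids, lazily synced replicas."""

    def __init__(self, replica_count=3):
        # Each replica is a flat map: object id -> bytes.
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # writes not yet copied to the other replicas

    def create(self, container, data):
        object_id = f"{container}/{uuid.uuid4().hex}"
        self.update(object_id, data)
        return object_id

    def read(self, object_id, replica=0):
        return self.replicas[replica].get(object_id)

    def update(self, object_id, data):
        self.replicas[0][object_id] = data       # primary copy is written now
        self.pending.append((object_id, data))   # the rest catch up later

    def delete(self, object_id):
        self.pending = [(oid, d) for oid, d in self.pending if oid != object_id]
        for replica in self.replicas:
            replica.pop(object_id, None)

    def propagate(self):
        """Copy all pending writes to the secondary replicas."""
        for object_id, data in self.pending:
            for replica in self.replicas[1:]:
                replica[object_id] = data
        self.pending.clear()


store = FlatObjectStore()
oid = store.create("backups", b"v1")
store.propagate()
store.update(oid, b"v2")
stale = store.read(oid, replica=2)   # replica 2 has not seen the update yet
store.propagate()
fresh = store.read(oid, replica=2)   # now every replica holds the new version
```

Reading from a secondary replica between `update` and `propagate` returns the old version, which is why frequently changing data is a poor fit for this model.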

Interoperability and standardized access to structured and unstructured data in the cloud are considered urgent requirements for Cloud storage. Furthermore, Cloud storage should support complex queries from applications that store data or objects in the cloud.

For this purpose, in 2010 SNIA (Storage Networking Industry Association) introduced the Cloud Data Management Interface standard [CDMI], which is now designated by ISO/IEC as an international standard. By adding a CDMI layer on top of heterogeneous and distributed storage services, we enable interoperability and provide support for querying the storage cloud based upon regular expressions. For this reason, CDMI has been identified as a key asset on which to build the Data Access layer of the ClouT platform.
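As an illustration of what a CDMI layer looks like from a client, the following sketch builds (without sending) an HTTP request that creates a data object. The host name, container and object names are hypothetical, and the specification version string is an assumption that should be replaced by the version the target server supports.

```python
import json
import urllib.request

CDMI_VERSION = "1.0.2"  # assumption: set to the version the server supports


def build_cdmi_put(base_url, container, name, text):
    """Build (but do not send) a CDMI request that creates a data object."""
    body = json.dumps({"mimetype": "text/plain", "metadata": {}, "value": text})
    return urllib.request.Request(
        url=f"{base_url}/{container}/{name}",
        data=body.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/cdmi-object",
            "Accept": "application/cdmi-object",
            "X-CDMI-Specification-Version": CDMI_VERSION,
        },
    )


# Hypothetical endpoint and names, used only to show the request shape.
request = build_cdmi_put("http://cdmi.example.org", "sensors", "reading.txt", "21.5")
```

Because the same request shape applies whatever storage back-end sits behind the CDMI layer, client code does not need to change when the back-end does.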

As the second point, this task will provide dependability assurance tools that can be used for monitoring the dependability of running systems. City data provided through the Internet of Things and cloud platforms is expected to penetrate into social infrastructure systems. Some of these are critical systems, such as structure monitoring and medication management, where a failure to obtain data and resources may endanger human life, health, and property. System support for better monitoring, fault detection, and fault recovery is thus required to keep the infrastructure healthy. The infrastructure is, however, a complicated system consisting of wireless sensor nodes, database servers, network appliances, etc. There is no existing system that can manage such infrastructure in a comprehensive manner.

To tackle the issue of defining effective monitoring, techniques based on the dependability case (D-Case) are examined. A D-Case allows managers to describe a management strategy, notably with a graphical tool, and links it to the analysis of dependability requirements as well as to action programs.

The best Cloud Data Management approach for ClouT will be delineated in the next project phase, in light of the requirements collected in D1.1 and of other specific needs that may surface when the first ClouT reference architecture is identified.

4.2. Open city data hosting and access reusable components

This section lists the cloud storage systems, distributed file systems, security platforms and an implementation of the Cloud Data Management Interface. For each component, there is also a link to the corresponding component table in the Appendix.


HYPERTABLE

Hypertable is a “NoSQL” database based on Google File System technology. It can use Hadoop HDFS (an open source implementation of the Google File System) as its underlying distributed file system.

Files are distributed across all the data nodes and replicated on backup nodes, so the service remains available even if one data node fails. Computation is based on the Hadoop MapReduce framework: Hypertable can process large amounts of data in parallel by pushing the code out to the machines where the data resides.

The final aggregation provides a way to re-order the data based on any arbitrary field. Hypertable uses the Google technology designed to create big tables with an index key. Applications that use this technology include YouTube and many more.
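The processing pattern described above (code runs next to each data partition, then a final aggregation combines and re-orders the results) can be sketched in plain Python. This is a generic illustration of the MapReduce idea, not Hypertable or Hadoop code; the node partitions and sensor readings are hypothetical.

```python
from collections import defaultdict

# Hypothetical readings, already partitioned across three "data nodes".
NODES = [
    [("s1", 20.0), ("s2", 18.5)],
    [("s1", 22.0), ("s3", 30.0)],
    [("s2", 19.5), ("s3", 31.0)],
]


def map_phase(records):
    """Runs where the data lives: per-sensor (sum, count) for one node."""
    partial = defaultdict(lambda: [0.0, 0])
    for sensor, value in records:
        partial[sensor][0] += value
        partial[sensor][1] += 1
    return {sensor: tuple(acc) for sensor, acc in partial.items()}


def reduce_phase(partials):
    """Final aggregation: merge the per-node partial results into averages."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for sensor, (total, count) in partial.items():
            totals[sensor][0] += total
            totals[sensor][1] += count
    return {sensor: total / count for sensor, (total, count) in totals.items()}


averages = reduce_phase(map_phase(node) for node in NODES)
# Re-order the aggregated data on an arbitrary field, here the average value.
ranking = sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

Only the small per-node summaries travel over the network, which is the reason for moving the code to the data rather than the data to the code.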

Security and data encryption are delegated to the back-end file system (physical disks, Hadoop HDFS or Ceph). No security is implemented for access to the data through the Hypertable API or the Hypertable built-in clients.

Data is accessed using an SQL-like language. A JDBC connector is also supported. Hypertable provides APIs for the following languages: Java, PHP, Python, Perl, Ruby and C++.

High-performance access to data is possible thanks to the indexing mechanism, based on hash tables in the database. Hypertable performs best in random access. As such, this database implementation is well suited to real-time scenarios.

FIGURE 21 - HYPERTABLE ARCHITECTURE OVERVIEW

For further information please refer to the table in the appendix.


HBASE

HBase is a distributed "NoSQL" database. HBase is designed primarily as a "data store" but, with the functionality developed on top of it, it can also be considered a "database". Features familiar from an RDBMS (typed columns, triggers, indexes, etc.) are not all provided natively, but comparable behaviour, including complex queries, can be implemented.

HBase has been developed to scale well both horizontally (adding new nodes) and vertically (adding more hardware to each node). HBase clusters can be expanded by adding new RegionServers, each of which can be hosted on low-cost hardware. By adding new hardware (new cluster nodes) it is possible to increase performance and data storage space. An RDBMS can also scale well, but only up to the size of a single database server; to increase performance, an RDBMS requires specialized, expensive hardware and storage devices.

HBase, built on top of HDFS, provides high performance for record-level operations (searches, updates and inserts) on large tables. Internally, HBase stores the data in indexed files that are kept on HDFS for high-speed access. HDFS, in turn, distributes and replicates the data across the data nodes.

With the default HBase configuration, anyone connected to the system can read and write the tables in the system. This, of course, is not acceptable. HBase can be configured to provide user authentication, so that only authorized users can perform actions with HBase. Authentication is implemented at the RPC level and is based on the Simple Authentication and Security Layer (SASL), which supports Kerberos. For each connection, SASL provides the following services: authentication, encryption negotiation and message integrity verification.

Because HBase runs on top of HDFS and ZooKeeper, a secure HBase deployment relies on secure HDFS and secure ZooKeeper. For this purpose, HBase servers create a secure service session to communicate with HDFS and ZooKeeper.

HDFS access control is based, as in UNIX file systems, on users, groups and permissions. The files created by HBase are put in HDFS under the “hbase” user (the default user created by the installation). Access control is based on the username on the system. HDFS ensures, through authentication steps, that the user used in the installation is secure and trusted.

Access control also relies on Apache ZooKeeper, an open source framework that coordinates distributed applications. ZooKeeper implements Access Control Lists (ACLs) to control the read and write operations of users. This mechanism is similar to the HDFS authentication mode.


FIGURE 22 - HBASE ARCHITECTURE OVERVIEW

FIGURE 23 - HBASE CYCLIC REPLICATION OVERVIEW ARCHITECTURE

For further information please refer to the table in the appendix.


HDFS

Apache Hadoop HDFS is a distributed file system designed to run on low-cost hardware. It has many similarities with existing distributed file systems (like Ceph, GlusterFS, etc.), but it is natively designed to be fault tolerant. HDFS provides high-throughput access to application data and is designed for applications that need access to large data sets. HDFS also allows data to be accessed in streaming mode.

An HDFS instance is deployed on a large number of machines, each of which stores part of the file system’s data. Because each installation comprises a large number of hardware components, failures of individual components are to be expected; the probability of data loss is nevertheless kept very low because data is replicated on more than one node. Fault detection is delegated directly to a core, native component of HDFS.

HDFS has a master/slave architecture. A typical HDFS cluster consists of a single NameNode, a server that manages the file system namespace and controls access to files by clients. The other components of the HDFS architecture are the DataNodes (in general one per node in the cluster), which manage the storage attached to the nodes they run on. HDFS exposes a file system namespace that allows users to store data in files. In HDFS a file is split into one or more blocks of data, and these blocks are stored on a set of DataNodes. The NameNode executes the file system namespace operations (opening, closing and renaming files and directories) and determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system’s clients, and they perform the creation, deletion and replication of data blocks.

HDFS is designed to manage very large files stored across the servers of a big cluster. HDFS stores each file as a sequence of data blocks; all blocks in a file, with the exception of the last one, have the same size. The data blocks are replicated and stored on more than one server node for fault tolerance. The block size and the replication factor are configurable per file. Each application using HDFS can specify the number of replicas of a file, and therefore its level of fault tolerance. The replication factor can be specified when a file is created and can be changed later. Files are created or rewritten as a whole; it is not possible, for example, to append data to a previously created file. The NameNode is responsible for all decisions regarding the replication of blocks.
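The block and replica bookkeeping described above can be sketched as follows. The round-robin placement is a deliberate simplification chosen for illustration and is not the placement policy HDFS itself applies; the function names, node names and sizes are hypothetical.

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of file_size bytes occupies."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(block_count, datanodes, replication):
    """Assign each block to `replication` distinct DataNodes, round-robin."""
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    placement = {}
    for block in range(block_count):
        placement[block] = [
            datanodes[(block + offset) % len(datanodes)]
            for offset in range(replication)
        ]
    return placement


MB = 1024 * 1024
# A 200 MB file with 64 MB blocks: three full blocks plus a shorter last one.
blocks = split_into_blocks(file_size=200 * MB, block_size=64 * MB)
block_map = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"], replication=3)
```

With a replication factor of three, any single DataNode can fail and every block still has two live copies, which is the fault tolerance level the application chose for this file.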

Hadoop HDFS implements a permissions model for files and directories that shares much of the POSIX model. Each file and directory is associated with an “owner” and a “group” (as in UNIX file systems). A file or directory has separate permissions for the owner, for users that are members of the group, and for all other users. For files, the read permission is required to read the file, and the write permission is required to write or append to the file. For directories, the read permission is required to list the contents of the directory, the write permission is required to create or delete files or directories, and the execute permission is required to access a child of the directory. Unlike in POSIX, there is no notion of executable files.
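The owner/group/other check described above can be written down in a few lines. This is a minimal sketch of the POSIX-style rule, not HDFS code; the `Inode` class, user names and modes are hypothetical.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 4, 2, 1


@dataclass
class Inode:
    owner: str
    group: str
    mode: int            # octal permission bits, e.g. 0o640
    is_dir: bool = False


def is_allowed(inode, user, user_groups, wanted):
    """Pick the owner, group or other bits, then test the wanted permission."""
    if user == inode.owner:
        bits = (inode.mode >> 6) & 7
    elif inode.group in user_groups:
        bits = (inode.mode >> 3) & 7
    else:
        bits = inode.mode & 7
    return (bits & wanted) == wanted


report = Inode(owner="hbase", group="analytics", mode=0o640)
data_dir = Inode(owner="hbase", group="analytics", mode=0o750, is_dir=True)

owner_can_write = is_allowed(report, "hbase", set(), WRITE)
member_can_read = is_allowed(report, "alice", {"analytics"}, READ)
member_can_write = is_allowed(report, "alice", {"analytics"}, WRITE)
other_can_read = is_allowed(report, "bob", set(), READ)
# For a directory, EXECUTE means "may access a child of the directory".
member_can_traverse = is_allowed(data_dir, "alice", {"analytics"}, EXECUTE)
```

Note that exactly one class of bits applies to a given user: an owner is never evaluated against the group or other bits.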

HDFS is Amazon S3 [S3] compliant. Secure access is based on Kerberos v5.


FIGURE 24 - APACHE HADOOP HDFS ARCHITECTURE OVERVIEW

FIGURE 25 - APACHE HADOOP CLUSTER SERVER ROLES

For further information please refer to the table in the appendix.


OPEN STACK SWIFT

OpenStack Swift is a distributed cloud object store. It is conceptually similar to Amazon S3 [S3] and implements a subset of its API as WSGI middleware. Like S3, Swift can neither be mounted as a file system nor be used as a raw block device. Rather, it allows storing, retrieving and deleting objects and related metadata in virtual containers (which in Amazon S3 are called “buckets”) via a RESTful HTTP interface.

The distributed, multi-tenant architecture can potentially scale up to thousands of storage nodes with dozens of petabytes of storage and has no single point of failure. Distributed means that each data blob is replicated across a cluster of storage nodes, where the number of replicas is configurable (at least three replicas are recommended for production infrastructures).

Objects in Swift are accessed via the REST interface, and can be stored, retrieved, and updated on demand. The object store can be easily scaled across a large number of servers. It is composed of the following main elements (the following description has been extracted from the project’s documentation website¹):

 Proxy Server - it orchestrates the storage access requests by searching for the location of the account, container, or object in the ring and by routing the request accordingly;
 Ring - it maps the names of elements stored on disk to their physical location in the cluster. There are separate rings for accounts, containers, and objects;
 Object Server - it is a blob storage server that can store, retrieve and delete objects stored on local devices, where objects are stored as binary files on the file system and metadata in the file’s extended attributes (xattrs);
 Container Server and Account Server - they allow listing, respectively, the objects in a container (the listings are stored as SQLite database files) and the containers (per cluster, per tenant);
 Replication - it manages the consistency of replicated objects in the cluster in case of system failure;
 Updater - it processes the failed updates of container and data (for example in failure scenarios or periods of high load).

With regard to the back-end file system, the only requirement is that it supports extended attributes (xattrs), a file system feature that allows files to be associated with metadata not interpreted by the file system itself.
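The role of the ring, mapping a name to a physical location, can be sketched as a hash-to-partition lookup followed by a partition-to-device table. This is a simplified illustration under stated assumptions, not Swift's ring builder: the partition power, device names, round-robin assignment and the account/container/object names are all hypothetical.

```python
import hashlib

PART_POWER = 8   # 2**8 partitions; an assumption for this sketch
REPLICAS = 3
DEVICES = ["node1/sdb", "node2/sdb", "node3/sdb", "node4/sdb"]  # hypothetical


def partition_for(account, container, obj):
    """Hash the object path and keep the top PART_POWER bits as partition."""
    path = f"/{account}/{container}/{obj}".encode("utf-8")
    digest = hashlib.md5(path).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - PART_POWER)


# Precomputed table: partition -> devices holding its replicas.
PARTITION_TABLE = {
    part: [DEVICES[(part + r) % len(DEVICES)] for r in range(REPLICAS)]
    for part in range(2 ** PART_POWER)
}


def locate(account, container, obj):
    """What a proxy would do: name -> partition -> replica devices."""
    return PARTITION_TABLE[partition_for(account, container, obj)]


location = locate("AUTH_city", "sensors", "reading-001")
```

Because the lookup is a pure function of the name plus a small static table, any proxy can route a request without consulting a central directory.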

The security of the data is based on Swauth and Keystone implementations.

For further information please refer to the table in the appendix.

1 http://docs.openstack.org/developer/swift/overview_architecture.html


CEPH

Ceph is a distributed object store and file system developed to provide high performance in accessing data objects; it relies on a reliable, scalable and fault-tolerant architecture.

The following description and figures have been extracted from the project’s website².

FIGURE 26 - CEPH UNIFIED STORAGE

Ceph allows thousands of clients to access exabytes of data. A Ceph node leverages low-cost hardware and daemons, and a Ceph cluster manages a large number of nodes, which communicate with each other to replicate and redistribute data dynamically. Ceph monitors can also run as a cluster to oversee the Ceph nodes in the Ceph Storage Cluster (a monitor cluster guarantees the high availability of the service).

Ceph provides highly scalable clusters based on RADOS, a distributed object store that provides a scalable storage service for objects of widely varying sizes. Ceph components use the CRUSH algorithm to compute data locations efficiently, instead of depending on a centralized table. Ceph’s high-level stack includes a native interface to the Ceph Storage Cluster. It also exposes service interfaces built on top of the system, making it accessible through APIs for various languages (Java, Perl, Python, etc.).
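The idea of computing a location instead of looking it up in a central table can be illustrated with rendezvous (highest-random-weight) hashing. This is explicitly a stand-in technique and not the CRUSH algorithm itself, which additionally takes the cluster hierarchy and weights into account; the storage daemon names and the object name are hypothetical.

```python
import hashlib


def score(object_name, osd):
    """Deterministic pseudo-random score for an (object, storage daemon) pair."""
    digest = hashlib.sha256(f"{object_name}:{osd}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name, osds, replicas=3):
    """Every client computes the same answer from the name and the node list."""
    ranked = sorted(osds, key=lambda osd: score(object_name, osd), reverse=True)
    return ranked[:replicas]


OSDS = ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4"]
before = place("sensor-42/2013-07-31", OSDS)

# If the first-choice daemon fails, the remaining choices keep their order,
# so only one replica has to be re-created elsewhere.
survivors = [osd for osd in OSDS if osd != before[0]]
after = place("sensor-42/2013-07-31", survivors)
```

The property shown at the end, limited data movement when the set of nodes changes, is the practical benefit of computed placement over a static table.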

The Ceph architecture is organized in two layers. The first is the RADOS layer introduced above. The second is the Ceph file system, built on top of that abstraction: file data is divided over objects, and the metadata server cluster provides distributed access to a POSIX file system namespace (directory hierarchy) that is ultimately backed by more objects.

Secure access to Ceph objects can be configured in various modes: no security (all users can access all objects) or user authentication (similar to Kerberos authentication). Cephx, the authentication subsystem, uses shared secret keys for user authentication: a copy of the secret key is held by both the client and the monitor. The authentication protocol is such that both parties are able to prove to each other that they hold a copy of the key without revealing it. This provides mutual authentication, which means the cluster is sure the user possesses the secret key, and the user is sure that the cluster has a copy of the secret key.

2 http://www.ceph.com/
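Mutual authentication with a shared secret can be illustrated with a generic HMAC challenge-response exchange. This sketch shows only the principle (proving possession of a key without sending it) and is not the cephx wire protocol; all names are hypothetical.

```python
import hashlib
import hmac
import secrets


def prove(secret_key, challenge):
    """Answer a challenge using the key; the key itself is never transmitted."""
    return hmac.new(secret_key, challenge, hashlib.sha256).digest()


def verify(secret_key, challenge, proof):
    """Check a proof with a constant-time comparison."""
    return hmac.compare_digest(prove(secret_key, challenge), proof)


shared_key = secrets.token_bytes(32)  # a copy is held by client and monitor

# The monitor challenges the client ...
monitor_challenge = secrets.token_bytes(16)
client_proof = prove(shared_key, monitor_challenge)
monitor_trusts_client = verify(shared_key, monitor_challenge, client_proof)

# ... and the client challenges the monitor in turn.
client_challenge = secrets.token_bytes(16)
monitor_proof = prove(shared_key, client_challenge)
client_trusts_monitor = verify(shared_key, client_challenge, monitor_proof)

# A party holding a different key cannot produce a valid proof.
impostor_proof = prove(b"not the shared key", monitor_challenge)
impostor_rejected = not verify(shared_key, monitor_challenge, impostor_proof)
```

Fresh random challenges on each side prevent an eavesdropper from replaying an earlier proof.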

For further information please refer to the table in the appendix.


OBJECT STORAGE GE - FI-WARE IMPLEMENTATION

This Generic Enabler provides robust, scalable object storage functionality through an open, standardized interface: it exposes a CDMI interface on top of OpenStack Swift. These RESTful APIs can be accessed from any client technology that can communicate over HTTP.

By building on top of OpenStack Swift, all the benefits of this rapidly maturing open-source cloud storage solution can be realized. The highly available, distributed and scalable features of OpenStack Swift can be provided on commodity hardware.

For further information please refer to the table in the appendix.

CDMI PROXY

CDMI Proxy has been developed within the FP7 European Research Project VENUS-C.

CDMI Proxy is a free implementation of the SNIA CDMI standard³.

It is a storage server exposing CDMI-compliant interfaces for (currently) the following storage back-ends:

 Local disk: storing contents on a POSIX file system⁴
 AWS S3: storing contents in the Amazon public cloud
 Azure Blob: storing contents in the Azure public cloud

The main features of CDMI Proxy are:

 Python based
 It supports CDMI Blobs and multi-level containers (also for single-level back-ends, e.g. AWS or Azure)
 It stores metadata of the stored objects in CouchDB

The CDMI Clients that work with CDMI Proxy include:

 libcdmi⁵: Java SDK for running CDMI calls
 libcdmi⁶: Python SDK for running CDMI calls
 cdmifs⁷: FUSE [FUSE] based file system using the CDMI standard (v1.0)
 r2ad⁸: demo clients for OCCI [OCCI] and CDMI

For further information please refer to the table in the appendix.

3 http://www.snia.org/tech_activities/standards/curr_standards/cdmi

5 https://github.com/livenson/libcdmi-java
6 https://github.com/livenson/libcdmi-python
7 https://github.com/koenbollen/cdmifs
8 http://r2ad.net/


REST/SOAP APIS FOR IDAS ACCESS

The IDAS platform offers two APIs based on two different Web Services technologies, SOAP and REST. They provide functionality organized into five main blocks:

 Data Model: IDAS organizes data into a hierarchy of different components. This block offers functionality to create, update, query and delete information about such components (services, assets, devices and groups).
 Sensor Data: for operations related to retrieving the last measure and the historical measures published by a device.
 Command: for operations related to sending commands and receiving results.
 Subscription: for operations related to subscription to events.
 Provision: for operations related to configuring internal components.

Every block has several entities. Every entity has a specific and unique URI and supports several operations (GET to read, POST to create, PUT to update, DELETE to delete), although not all entities support all operations. All API requests and responses use the JSON format, except notifications, which are received in XML format by the subscribed applications.
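The mapping between operations and HTTP methods described above can be sketched as follows. The base URL, the entity path and the payload fields are hypothetical and do not reproduce the actual IDAS URI scheme; the requests are built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://idas.example.org/api"   # hypothetical endpoint

METHODS = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


def build_request(operation, entity_path, payload=None):
    """Build (but do not send) a JSON request for one entity URI."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=f"{BASE_URL}/{entity_path}",
        data=data,
        headers=headers,
        method=METHODS[operation],
    )


# Hypothetical entity path and payload, shown only for the request shape.
read_device = build_request("read", "devices/dev-001")
update_device = build_request("update", "devices/dev-001", {"name": "Lamp post 7"})
```

Since not every entity supports every operation, a client should be prepared for the server to reject a method on a given URI.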

IDAS is a main asset of the FI-WARE IoT Back-end Device Management Generic Enabler (GE) within the Open Innovation Lab initiative.

For further information please refer to the table in the appendix.

TESTBED RUNTIME

Dense deployments need efficient modules and mechanisms for managing deployed nodes remotely. The Testbed Runtime module offers this set of functionalities through the entities described below:

 Sensor Network Authentication and Authorization (SNAA): the SNAA component offers the basic functions for access control, through Shibboleth-based authentication and authorization. This allows users to access the different resources of the platform according to the rights associated with each user.
 Reservation System (RS): this module allows authenticated users to make, query and edit reservations of IoT nodes and supports the persistence of these reservations. In this sense, a set of nodes can be selected to receive a given command or to be remotely flashed using the MOTAP (Multihop Over The Air Programming) procedure.
 Wireless Sensor Network API (WSN API): the WSN API represents the back-end implementation of the set of functionalities required for interacting with the IoT nodes, with commands such as reset, reprogram, check if a node is alive, add/remove virtual links and many others. The iWSN API also provides the implementation of a channel for exchanging debug and control messages with IoT nodes.
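The core rule a reservation system has to enforce, namely that a node cannot be reserved twice for overlapping time slots, can be sketched as follows. This is an illustrative in-memory model, not the Testbed Runtime implementation; class names, user names and node identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    user: str
    nodes: frozenset
    start: int   # seconds since some epoch
    end: int


class ReservationSystem:
    def __init__(self):
        self.reservations = []

    def conflicts(self, candidate):
        """Reservations sharing a node and overlapping in time."""
        return [
            r for r in self.reservations
            if r.nodes & candidate.nodes
            and r.start < candidate.end and candidate.start < r.end
        ]

    def make(self, user, nodes, start, end):
        candidate = Reservation(user, frozenset(nodes), start, end)
        if self.conflicts(candidate):
            raise ValueError("nodes already reserved in this time slot")
        self.reservations.append(candidate)
        return candidate

    def query(self, user):
        return [r for r in self.reservations if r.user == user]


rs = ReservationSystem()
rs.make("alice", {"node-1", "node-2"}, start=0, end=3600)
try:
    rs.make("bob", {"node-2"}, start=1800, end=5400)   # overlaps with alice
    rejected = False
except ValueError:
    rejected = True
rs.make("bob", {"node-2"}, start=3600, end=7200)       # back-to-back is fine
```

Treating the end time as exclusive lets consecutive reservations share a boundary without being reported as conflicts.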

SOA3

Service Oriented Authorization, Authentication and Accounting (SOA3) is a set of RESTful web services providing User Management, Authentication, Authorization, Accounting and Identity Federation.

The User Management Service is based on LDAP and provides REST interfaces for performing CRUD operations on users, groups and roles. On these identities, the Authentication Service provides REST and Java methods to easily perform username/password authentication.

The Authorization Service is based on the ABAC (Attribute Based Access Control) model. It takes authorization decisions on the basis of stored policies composed of four sets of attributes:

 Subject, defining who requests to do something (e.g. userid, roles, groups...)

 Action, defining what he/she wants to do (e.g. read, write...)

 Resource, defining what the action should be applied to (e.g. a certain image, document...)

 Environment, defining the external context in which the action should be performed (e.g. hour, day...)
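A policy built from the four attribute sets listed above can be evaluated with a few lines of code. The sketch below is a generic illustration of attribute-based access control, not the SOA3 policy format or API; the `Policy` class, attribute names and values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    subject: dict       # attribute name -> set of accepted values
    action: set
    resource: dict
    environment: dict = field(default_factory=dict)


def matches(required, actual):
    """Every required attribute must be present with an accepted value."""
    return all(actual.get(key) in allowed for key, allowed in required.items())


def is_permitted(policies, subject, action, resource, environment):
    """Permit if at least one stored policy matches; deny by default."""
    return any(
        matches(p.subject, subject)
        and action in p.action
        and matches(p.resource, resource)
        and matches(p.environment, environment)
        for p in policies
    )


POLICIES = [
    Policy(
        subject={"role": {"operator", "admin"}},
        action={"read"},
        resource={"type": {"sensor-data"}},
        environment={"hour": set(range(8, 18))},   # office hours only
    )
]

operator = {"userid": "u17", "role": "operator"}
daytime_read = is_permitted(POLICIES, operator, "read", {"type": "sensor-data"}, {"hour": 10})
night_read = is_permitted(POLICIES, operator, "read", {"type": "sensor-data"}, {"hour": 22})
daytime_write = is_permitted(POLICIES, operator, "write", {"type": "sensor-data"}, {"hour": 10})
```

The same subject obtains different decisions depending on the environment attributes, which is what distinguishes this model from purely role-based access control.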

The Accounting Service, also available as a Java API, provides several options for record retrieval and report generation with a simple and flexible data model. It also provides a billing module.

The Identity Federation Service can be considered as a part of the Authentication Service, providing the possibility to compose a federation of domains in which a user of one domain can access the others by using the same credentials. The technology used is SAML, which in SOA3 is used with the extension SAML Condition to Delegate for providing Access Delegation.

For further information please refer to the table in the appendix.


4.3. Dependability tools for accessing city data reusable components

D-CASE MONITORING TOOL

D-Case Monitoring Tool enables developers of ClouT applications to demonstrate their dependability with statically captured evidence, such as the results of testing, fault injection, verification, etc. It also enables managers of ClouT applications to monitor their dependability at run time by collecting dynamic evidence from the running systems. It is provided as an Eclipse plug-in, with which users can author and execute D-Case documents, an extended form of assurance cases written using Goal Structuring Notation. In this tool, static evidence is represented by an evidence node with which one or more supporting materials are associated. Dynamic evidence is represented by a monitor node with which a piece of program is associated.

FIGURE 27 SCREENSHOT OF D-CASE MONITORING TOOL

Figure 27 shows a screenshot of D-Case Monitoring Tool. It is a plug-in for the well-known development tool Eclipse and provides two functionalities: editing and running a D-Case. The whole configuration is depicted in Figure 28.


FIGURE 28 SYSTEM OVERVIEW OF D-CASE MONITORING TOOL

D-Case is an extended form of Assurance Cases, which are widely used in European countries to demonstrate the safety of infrastructure. The idea is to (1) set the top goal, which is the condition that the developers need to meet, (2) divide the goal iteratively, based on strategies, into a tree of smaller sub-goals, and (3) attach static evidence to the leaf sub-goals in the tree. If all the leaf sub-goals have their evidence, the top goal is considered to be met. The evidence can be shown in a D-Case using the evidence node, with which any form of material can be associated.
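The rule stated above, that the top goal is met when every leaf sub-goal has evidence, amounts to a recursive check over the goal tree. The following sketch is an illustration only and does not reflect the data model of the D-Case tool; the `Goal` class, goal statements and evidence file names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    statement: str
    subgoals: list = field(default_factory=list)
    evidence: list = field(default_factory=list)   # only used by leaf goals


def is_met(goal):
    """A leaf is met if it has evidence; an inner goal if all sub-goals are."""
    if not goal.subgoals:
        return bool(goal.evidence)
    return all(is_met(sub) for sub in goal.subgoals)


def unsupported_leaves(goal):
    """List the leaf sub-goals that still lack evidence."""
    if not goal.subgoals:
        return [] if goal.evidence else [goal.statement]
    return [s for sub in goal.subgoals for s in unsupported_leaves(sub)]


gateway = Goal("Gateway answers ping")
top = Goal(
    "Sensor data service is dependable",
    subgoals=[
        Goal("Stored data is free of faults", evidence=["test-report.pdf"]),
        Goal(
            "Network is reachable",
            subgoals=[Goal("Switches forward traffic", evidence=["snmp-log.txt"]), gateway],
        ),
    ],
)

met_before = is_met(top)
missing = unsupported_leaves(top)
gateway.evidence.append("ping-results.txt")
met_after = is_met(top)
```

Listing the unsupported leaves is useful in practice because it tells the developers exactly which evidence is still to be produced.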

The new scheme we propose is a dynamic way of obtaining evidence from running systems. Since ubiquitous sensor networks are highly dynamic systems, static evidence, such as the results of test cases, is not always useful. Rather, values observed in running ubiquitous sensor networks are more important for knowing whether the networks are in a healthy state.

TABLE 3 LIST OF AVAILABLE MONITORING PROGRAMS

Name            Lines  Monitoring Target
SQLQuery        111    Faults in stored sensor data
SNMPSwitch      328    Traffic of network switches
SNMPSwitchPort  132    Ports of a network switch
MuninClient     82     Wrapper of Munin
DNS             43     DNS entry
ping.sh         35     Reachability of an IP node
df.sh           28     Disk capacity of a host
lsusb.sh        51     USB device connectivity of a host
uptime.sh       30     Load of a host
ifconfig.sh     43     Network interface of a host

To this end, we enable programmers to associate a piece of program with the monitor node in a D-Case. This piece of program is what we term a Monitoring Program. We then enable developers to execute a D-Case, so that all the monitoring programs associated with monitor nodes are executed. We have successfully implemented monitoring programs such as those that obtain values from SNMP-enabled devices, those that detect failures in a series of sensor data, and so on.

Figure 27 shows a D-Case being executed, in which the colored oval nodes are monitor nodes. The red ones mean that the monitoring program associated with the node detected some faults, while the blue ones mean that the targeted system is running in a healthy state within the scope of the monitoring programs. By looking at the whole D-Case, developers can easily find faults in running ubiquitous sensor networks. They can also obtain the detailed information gathered by the monitoring programs on web pages exported by the D-Case runtime, a sample of which is shown in Figure 29.

FIGURE 29 WEB PAGE FOR DETAILED MONITORING DATA

In addition to enabling developers to find faults visually, D-Case allows them to attach an action node to monitor nodes. The role of an action node is to automatically resolve or mitigate the effect of the faults; to this end, a piece of recovery program is associated with the action node. In Figure 27, the node at the bottom connected to all the monitor nodes is the action node. Simple examples of actions include notifying developers/managers via e-mail, remotely rebooting a sensor node, modifying routing paths in a sensor network, etc.
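The interplay between monitor nodes, their monitoring programs and an action node can be sketched as follows. This is an illustration of the concept and not the D-Case runtime: the classes, the `stuck_sensor` fault detector, the feed names and the readings are hypothetical, and the recovery program here only records a notification instead of sending an e-mail.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MonitorNode:
    name: str
    program: Callable[[], bool]   # returns True when the target is healthy
    status: str = "unknown"

    def run(self):
        self.status = "blue" if self.program() else "red"
        return self.status


@dataclass
class ActionNode:
    recover: Callable[[list], None]   # receives the names of faulty monitors


def execute_d_case(monitors, action):
    """Run every monitoring program; trigger the action if any node is red."""
    faulty = [m.name for m in monitors if m.run() == "red"]
    if faulty:
        action.recover(faulty)
    return faulty


def stuck_sensor(values, window=3):
    """Hypothetical fault detector: the last `window` readings are identical."""
    tail = values[-window:]
    return len(tail) == window and len(set(tail)) == 1


notifications = []
monitors = [
    MonitorNode("temperature feed", lambda: not stuck_sensor([20.1, 20.4, 20.3])),
    MonitorNode("humidity feed", lambda: not stuck_sensor([55.0, 55.0, 55.0])),
]
faults = execute_d_case(monitors, ActionNode(recover=notifications.extend))
```

Keeping the recovery program behind a plain callable makes it easy to swap a notification for a stronger action, such as rebooting a node, without touching the monitors.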

Since D-Case leverages natural language and structured diagrams, developers can make a D-Case as detailed, consistent, and exhaustive as they can. Conversely, a D-Case may be too abstract, inconsistent, or defective if the developers fail to use good strategies to divide goals into sub-goals. To ease the development of a D-Case, we recommend that developers first conduct a risk analysis and then put its results into a D-Case. This increases the exhaustiveness of a D-Case with the help of established risk analysis methods like FTA and FMEA.

For further information please refer to the table in the appendix.

5. Other Reusable CPaaS Assets from Cities

5.1. Introduction

This section describes the components used by the platforms of the cities involved in the ClouT project.

5.2. Genoa Reusable CPaaS assets

This section lists the reusable components proposed by the municipality of Genoa.

ROADVISIOR - EMIXER PLATFORM

e-miXer provides a service-oriented middleware infrastructure enabling the integration of data/services supplied by different operators in the domain of Traffic and Travel Information. The key interoperability feature of e-miXer is a multi-standard interface, which is able to interconnect and combine data from different systems and map them to a common data and service model. Diversified content/service offerings are then made accessible and suitable for use by Traffic Information Service Providers (TISPs) via a standardized access layer: a B2B interface implementing several European ITS reference standards in the various addressed mobility domains (traffic, public transport, parking, etc.).

The e-miXer core drives the process of data acquisition based on parameters hosted by the registry. The storage connector enables a temporary caching or permanent storage of the information. e-miXer engines process data for specific purposes (e.g. routing calculation, traffic data processing etc.). e-miXer provides both B2C and B2B services:

 The e-miXer client interface enables the provision of services for the end users (both public and restricted access)
 B2B interfaces are available for external service providers whenever necessary (e.g. providers of IVR, SMS, email services)

GeoServer is an open source software server written in Java that allows users to share and edit geospatial data. Designed for interoperability, it publishes data from any major spatial data source using open standards.

Being a community-driven project, GeoServer is developed, tested, and supported by a diverse group of individuals and organizations from around the world.

GeoServer is the reference implementation of the Open Geospatial Consortium (OGC) Web Feature Service (WFS) and Web Coverage Service (WCS) standards, as well as a high performance certified compliant Web Map Service (WMS). GeoServer forms a core component of the Geospatial Web.

GeoServer can create maps in a variety of output formats. OpenLayers, a free mapping library, is integrated into GeoServer, making map generation quick and easy. GeoServer is built on Geotools, an open source Java GIS toolkit.
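As an illustration of how a client asks a WMS-compliant server for a map image, the following sketch assembles a WMS 1.1.1 GetMap URL. The server address, the layer name and the bounding box (roughly the Genoa area) are hypothetical and are not taken from the municipality's installation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def wms_getmap_url(base_url, layer, bbox, width, height, srs="EPSG:4326"):
    """Assemble a WMS 1.1.1 GetMap request URL (the request is not sent)."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",                              # default style
        "srs": srs,
        "bbox": ",".join(str(v) for v in bbox),    # minx,miny,maxx,maxy
        "width": width,
        "height": height,
        "format": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical server, layer and extent.
url = wms_getmap_url(
    "http://geoserver.example.org/geoserver/wms",
    "safety:civil_protection",
    (8.85, 44.38, 9.0, 44.45),
    width=800,
    height=400,
)
query = parse_qs(urlsplit(url).query)   # parsed back only to inspect the result
```

Because the request is a plain URL, the same map can be fetched by a browser, a mobile application or a server-side process without any GeoServer-specific client library.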

GeoServer is free software.

The municipality of Genoa has implemented a GeoServer solution to handle geo-data related to safety (Civil Protection, Industrial Areas, Public Works, Safety).

WEATHER STATION

The Weather Station System, developed in collaboration with Civil Protection, can export detailed information about temperature, humidity level and weather in general for more than twenty areas of the Municipality.

The Weather Station uses a server-side scripting language (PHP) to produce data. The platform is composed of a Web server with a PHP processor module, which generates the resulting Web page. The platform’s scripts act as filters, taking a series of parameters as input and outputting another stream of data. The output is HTML or WAP. The data is updated every five minutes.

Data returned from Weather Station

Measure          Unit      Description
Temperatura      °C        Temperature
Umidità          %         Humidity
Vento            km/h da   Wind
Vento medio      km/h      Average wind
Pioggia giorno   mm        Daily rain
Rain Rate        mm/h      Rain rate
Dewpoint         °C        Dew point
Pressione        hPa       Pressure
Temp minima      °C        Minimum temperature
Temp massima     °C        Maximum temperature
Raffica max      km/h      Maximum wind speed
Rain rate max    mm/h      Maximum rain rate
Pioggia mese     mm        Monthly rain
Pioggia anno     mm        Yearly rain
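A consumer of the Weather Station data would typically map the Italian labels listed above to language-neutral field names. The following minimal Python sketch shows one way to do this. It assumes that the labels and values have already been extracted from the HTML page into a dictionary; the English field names and the `normalize_reading` function are hypothetical, and only a subset of the measures is mapped.

```python
# Mapping from the labels returned by the Weather Station to
# (hypothetical English field name, unit). Only some measures are listed.
FIELD_MAP = {
    "Temperatura": ("temperature", "°C"),
    "Umidità": ("humidity", "%"),
    "Vento medio": ("average_wind", "km/h"),
    "Pioggia giorno": ("daily_rain", "mm"),
    "Dewpoint": ("dew_point", "°C"),
    "Pressione": ("pressure", "hPa"),
}


def normalize_reading(raw):
    """Translate a {label: value} dictionary into English-keyed fields.

    Labels may carry a trailing colon or stray spaces, as in the source
    page; unknown labels are ignored.
    """
    result = {}
    for label, value in raw.items():
        key = label.strip().rstrip(":").strip()
        if key in FIELD_MAP:
            name, unit = FIELD_MAP[key]
            result[name] = {"value": float(value), "unit": unit}
    return result


# Made-up sample values, for illustration only.
reading = normalize_reading({"Temperatura:": "21.5", "Dewpoint :": 12, "Altro": 1})
print(reading)
```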

5.3. Santander reusable CPaaS assets

This section lists the reusable components proposed by the municipality of Santander.

GIS PLATFORM

Santander City Council has a GIS platform for storing and processing georeferenced data according to the city's needs. This platform uses ESRI technology and offers a high degree of interoperability.

Some of the georeferenced data reused in this project are hosted on this platform, so it is also considered a reused asset in the project. Here is a simplified view of the platform architecture:

ArcGIS Server is a powerful and flexible platform for providing a variety of spatial data and services to GIS users and applications. It enables our organization to implement server-based spatial functionality for focused applications that use the rich functionality of ArcObjects. Building robust, scalable applications is not a simple task, and proper application design is required for systems to perform well.

ArcGIS Server is a distributed system consisting of several components that can be distributed across multiple machines. Each component in the ArcGIS Server system plays a specific role in the process of managing, activating, deactivating, and load balancing the resources that are allocated to a given server object or set of server objects. The components of ArcGIS Server can be summarized as follows:

• GIS server: responsible for hosting and managing server objects. It is the set of objects, applications, and services that makes it possible to run ArcObjects components on a server. The GIS server is divided into the following components:

o Server Objects: A server object is a software object that manages and serves a GIS resource such as a map or a locator. For example, a server object named SantanderMap may serve a map document of data for the city of Santander, while the server object SantanderGeocode may serve an address locator for geocoding addresses. ArcGIS Server objects are themselves ArcObjects components.

Server objects are managed and run within the GIS server. A server object may be preconfigured and preloaded in the server and can be shared between applications. Server applications make use of server objects and may also use other ArcObjects components that are installed on the GIS server.

o Server Object Manager (SOM): The GIS server is composed of a SOM, which is a service running on a single machine, and SOCs (described below), which run on one or more machines (container machines). The SOM manages the set of server objects that are distributed across one or more container machines. When an application makes a direct connection to a GIS server over a LAN or WAN, it is making a connection to the SOM, so the parameter provided for the connection is the name or Internet Protocol address of the SOM machine (URL).

o Server Object Container (SOC): The container machine or machines actually host the server objects that are managed by the SOM. Each container machine is capable of hosting multiple container processes. (A container process is a process in which one or more server objects are running.) Container processes are started and shut down by the SOM. The objects hosted within the container processes are ArcObjects components that are installed on the container machine as part of the installation of ArcGIS Server. All server objects have the potential to run on all container machines and are balanced equally across all container machines. Therefore, it is important that all container machines have access to the resources and data necessary to run each server object. It is also important to note that the GIS server assumes that all container machines are configured equally, such that they are all capable of hosting the same number of server objects. Server object resources and data are discussed in more detail in the next section.

o Server Directory: The server directory is a location on a file system. The GIS server is configured to clean up any files it writes to a server directory. By definition, a server directory can be written to by all container machines. The GIS server hosts and manages server objects and other ArcObjects components for use in applications. In many cases, the use of those objects requires writing output to files. For example, when a map server object draws a map, it writes images to disk on the Web server machine. Other applications may write their own data; for example, an application that checks out data from a geodatabase may write the checkout personal geodatabase to disk on the server.

Typically, these files are transient and need only be available to the application for a short time, for example, the time for the application to draw the map or the time required to download the checkout database. As applications do their work and write out data, these files can accumulate quickly. The GIS server spawns a special SOC process that will automatically clean up its output if that output is written to a server directory.

A server directory can be configured such that files created by the GIS server in it are cleaned based on either file age or time since they were last accessed. The maximum file age is a property of a server directory. All files created by the GIS server that are older than the maximum age or have not been accessed during the time defined by the maximum age are automatically cleaned up by the GIS server.

In addition to supporting output of image files to the file system, ArcGIS Server also supports streaming of images from the SOC (via the Web server) to the user. This is useful since the streaming occurs over port 80, which avoids having to open additional TCP ports for drive mapping from the SOC to the Web server.
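The clean-up rule for server directories described above (files removed either by file age or by time since last access, with the maximum age as a property of the directory) can be sketched in a few lines of Python. This is a simplified model and not ArcGIS Server code: the `OutputFile` record, the `files_to_clean` function and the policy names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OutputFile:
    path: str
    created: float        # creation time, seconds since the epoch
    last_accessed: float  # last access time, seconds since the epoch


def files_to_clean(files, max_age, now, policy="age"):
    """Return the paths of files that exceed the directory's maximum age.

    policy="age" compares against the creation time;
    policy="last_access" compares against the last access time.
    """
    if policy not in ("age", "last_access"):
        raise ValueError("policy must be 'age' or 'last_access'")
    expired = []
    for f in files:
        reference = f.created if policy == "age" else f.last_accessed
        if now - reference > max_age:
            expired.append(f.path)
    return expired


# Made-up timestamps: "now" is 1000 s, the maximum age is 600 s.
outputs = [
    OutputFile("map_001.png", created=0, last_accessed=950),
    OutputFile("map_002.png", created=900, last_accessed=990),
]
print(files_to_clean(outputs, max_age=600, now=1000))                  # by file age
print(files_to_clean(outputs, max_age=600, now=1000, policy="last_access"))
```

With the sample data, the first file is selected under the age policy but kept under the last-access policy, because it was read recently.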

• Web server: The Web server hosts server applications and Web services written using the ArcGIS Server application program interface (API). These server applications use the ArcGIS Server API to connect to a SOM, make use of server objects, and create other ArcObjects for use in their applications. These Web services and Web applications can be written using the ArcGIS Server Application Development Framework (ADF), which is available for both .NET and Java developers. Examples of Web applications include mapping applications, disconnected editing applications, and any other application that makes use of ArcObjects and is appropriate for Web browsers. Examples of Web services include those for exposing map and geocode server objects that desktop GIS users can connect to and consume over the Internet. It is possible for users to create their own native .NET or Java Web services whose parameters are not ArcObjects types but do perform a specific GIS function. For example, it is possible to write a Web service called FindNearestHospital that accepts (x,y) coordinates as input and returns an application-defined Hospital object that has properties such as the address, name, and number of beds.

Web applications connect to GIS servers within their organization over the LAN. In this sense, the Web application or Web service is a client of the GIS server. Users connect to Web applications and Web services over the Internet or Intranet, but all the Web applications' logic runs in the Web server and sends HyperText Markup Language (HTML) output to the browser client. The Web application itself makes use of objects and functionality running within the GIS server. This allows the development of Web applications to make use of ArcObjects in the server as would a desktop application connecting to the GIS server in client/server mode over the LAN or WAN.

As users interact with their browsers, they make requests to the Web application, which in turn makes requests to the SOM. The SOM returns a proxy to a server object or server objects that are running within the GIS server. The Web application uses this proxy to work with the object as if it existed in the Web application's process, but all execution happens on the GIS server.
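The FindNearestHospital example mentioned above can be sketched as follows. This is only the core logic of such a service, written in Python rather than .NET or Java and without any ArcObjects call. It assumes planar coordinates, so a straight-line distance is meaningful; the hospital records are invented for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Hospital:
    """Application-defined result type: not an ArcObjects type."""
    name: str
    address: str
    beds: int
    x: float
    y: float


def find_nearest_hospital(x, y, hospitals):
    """Return the hospital closest to (x, y) by straight-line distance."""
    if not hospitals:
        raise ValueError("no hospitals available")
    return min(hospitals, key=lambda h: hypot(h.x - x, h.y - y))


# Invented records, for illustration only.
HOSPITALS = [
    Hospital("Hospital A", "1 First Street", 120, x=0.0, y=0.0),
    Hospital("Hospital B", "2 Second Street", 300, x=5.0, y=5.0),
]

nearest = find_nearest_hospital(4.0, 3.5, HOSPITALS)
print(nearest.name, nearest.address, nearest.beds)
```

In a real deployment the distance computation would more likely be delegated to a server object (for example along the street network), while the Web service would keep the same simple input and output types.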

• Desktop applications: Users can connect to ArcGIS Server using ArcGIS Desktop applications to make use of map and geocode server objects running in the server. Users use ArcCatalog to connect to a GIS server directly on the LAN. They can also specify the URL of a Web service catalog to indirectly connect to a GIS server over the Internet to make use of map and geocode server objects exposed by that Web service catalog. In addition, the set of server objects and their properties are managed by the GIS server administrator using ArcCatalog. Our administrators can connect to the GIS server over the LAN and use ArcCatalog to add and remove map and geocode server objects. They can also configure how server objects should be run, including the set of container machines that are available for the server and the directories on the server that they can use to write any output.

• ArcSDE: ArcSDE is ESRI's technology for accessing and managing geospatial data within relational databases. ArcSDE technology supports reading and writing of multiple standards including (among other data storage options) the Open Geospatial Consortium, Inc. (OGC) standards for simple features, the International Organization for Standardization (ISO) standard for spatial types, and the Oracle Spatial format. ArcSDE capabilities are as follows:

o It is open and interoperable across multiple database management systems (DBMS). At Santander City Council it runs on the Oracle platform.

o It is standards based, using as its native data structure the OGC binary simple features standard and the ISO spatial type.

o It supports full, open SQL access to geodatabases stored in Oracle, IBM DB2, IBM Informix, and PostgreSQL.

o It provides high performance and scales to a large number of users. ArcSDE geodatabases outperform all other solutions for storage and retrieval of spatial data.

OPEN DATA PLATFORM

Santander City Council has recently deployed an Open Data platform to open to citizens all public data that resides in its internal databases. The platform is aimed mainly at companies and entrepreneurs, so that they can take advantage of this data to create products and services, thereby promoting business and job creation. It also aims to provide data to citizens proactively, giving them a better understanding of how the administration works internally and eliminating, in some cases, slow and costly administrative procedures to access data that, although public, is not readily available to citizens.

The architecture of the platform is shown in the following figure:

For a better understanding of the platform, its main systems are described below:

Front-End:

The Front-End is the visible part of the platform: the graphical interface for the end user. From a technological point of view, it is composed of three popular components developed by the open source community, which have been reused and adapted to the specific needs of the platform. These components are:

1. WordPress: A well-known CMS, originally aimed at building blogs, which over time has incorporated features that have made it one of the most widely used content management systems on the Web. This component is responsible for implementing the final graphical user interface that provides access to the data and the Open Data Portal.

2. CKAN: A CMS focused on Open Government projects, used notably by the United Kingdom Government and reused by major Open Data portals worldwide. This component provides all the infrastructure for defining datasets and resources and attaching metadata to them, and also supplies APIs that allow developers to automate data consumption.

3. Virtuoso: Virtuoso is a Web tool for defining terminology, supporting the creation and promotion of the Semantic Web. Within the platform, its role is to define those terms specific to a municipality or local authority that have not yet been defined by any standardization body or by other Open Data platforms.
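To illustrate how a developer could automate data consumption through the CKAN APIs mentioned above, the following minimal Python sketch builds a dataset search request for the CKAN Action API (`package_search`) and extracts the resource URLs from the JSON response. It assumes that the portal exposes version 3 of the Action API, which has not been verified for the Santander deployment; the portal address and the sample response are invented.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal_url, query, rows=10):
    """Build a CKAN Action API dataset search URL."""
    params = urlencode({"q": query, "rows": rows})
    return portal_url.rstrip("/") + "/api/3/action/package_search?" + params


def resource_urls(response_text):
    """Return (dataset name, resource URL) pairs from a search response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    return [
        (dataset["name"], resource["url"])
        for dataset in payload["result"]["results"]
        for resource in dataset.get("resources", [])
    ]


# Hypothetical portal address and an invented response, for illustration only.
print(package_search_url("http://opendata.example.org/", "census", rows=5))
sample_response = json.dumps({
    "success": True,
    "result": {"count": 1, "results": [
        {"name": "census",
         "resources": [{"url": "http://opendata.example.org/census.csv",
                        "format": "CSV"}]},
    ]},
})
print(resource_urls(sample_response))
```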

Back-End

The Open Data platform also has a Back-End subsystem in charge of data gathering tasks and of supporting the SPARQL engine. This subsystem is composed of two components, also developed by the open source community and adapted to the particular needs of the platform. These components are:

1. iCMS: A data gathering system developed by the Government of Andalucía, whose main function is to keep the Front-End constantly fed with updated data. It should be mentioned that the data contained on the website are not mere snapshots of the data available at a given time: the system is constantly updated from the municipal databases so that it always provides current data. Therefore, this component plays a crucial role in enabling a real-time open data platform.

As an example, in the case of the census dataset, if we query the number of people on two different days, the total will vary depending on the registrations and cancellations that occurred in that time interval.

This real-time updating is performed by ad-hoc drivers developed for each municipal data producer.

2. Marmota/Neogolsim: This tool is responsible for providing the SPARQL engine to the platform. The purpose of this engine is, on the one hand, to provide data in RDF format (the format of choice for reuse) and, on the other, to allow cross-data queries with other Open Data platforms worldwide, thus providing the platform with advanced features that allow it to follow the Semantic Web path.

Finally, this platform is a reusable component within the project because it already serves information that is valuable for the Multimodal City Transport use cases. It can also be a good channel for making available any additional municipal data needed by this project.
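As an illustration of how a client could query the SPARQL engine described above, the following minimal Python sketch builds a generic SPARQL query and the corresponding HTTP GET URL, using the `query` parameter defined by the SPARQL protocol. The endpoint address is hypothetical, and the query only samples arbitrary triples because the vocabularies used by the platform are not detailed here.

```python
from urllib.parse import urlencode


def sample_triples_query(limit=10):
    """Return a SPARQL query that samples arbitrary triples."""
    return f"SELECT ?s ?p ?o WHERE {{ ?s ?p ?o }} LIMIT {int(limit)}"


def sparql_get_url(endpoint, query):
    """Build the GET URL that submits `query` to a SPARQL endpoint."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical endpoint address, for illustration only.
query = sample_triples_query(5)
print(query)
print(sparql_get_url("http://opendata.example.org/sparql", query))
```

A cross-platform query of the kind mentioned above would follow the same pattern, with a more specific graph pattern in place of the generic `?s ?p ?o`.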

ORACLE DATABASE PLATFORM

Santander City Council currently has a municipal database cluster based on Oracle technology that stores and processes most of the city's data, including data valuable for the ClouT project such as traffic measurements and Participatory Sensing events. Other, secondary database platforms such as MySQL and Microsoft SQL Server are also in use, but we focus on the Oracle one, which offers the best reuse capabilities for this project. Here is a global view of the architecture:

FIGURE 30 SANTANDER CLUSTER DATABASE ARCHITECTURE

The Oracle cluster comprises multiple interconnected servers that appear to end users and applications as if they were one server. The cluster is built on Oracle RAC technology, which does the hard work: it enables us to cluster Oracle databases and to bind multiple servers so that they operate as a single system.

Oracle RAC also provides high availability and scalability for all application types, and enables us to implement a grid computing architecture. Having multiple instances access a single database prevents the server from being a single point of failure. Applications deployed on Oracle RAC do not need code changes to support it.

Our Oracle cluster has two database instances, each of which contains memory structures and background processes. Oracle RAC unifies these instances: the resulting database has the same processes and memory structures as a single-instance database, as well as additional processes and memory structures that are specific to Oracle RAC. Any one instance's database view is nearly identical to any other instance's view in the same Oracle RAC database; the view is a single system image of the environment.

Each instance has a buffer cache in its System Global Area (SGA). With Cache Fusion, however, Oracle RAC environments logically combine each instance's buffer cache to enable the instances to process data as if the data resided on a logically combined, single cache.

To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction as fast as possible, and to ensure data integrity, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances, which increases the size of the SGA for an Oracle RAC instance.

After one instance caches data, any other instance within the same cluster database can acquire a block image from another instance in the same database faster than by reading the block from disk. Therefore, Cache Fusion moves current blocks between instances rather than re-reading the blocks from disk. When a consistent block is needed or a changed block is required on another instance, Cache Fusion transfers the block image directly between the affected instances. Oracle RAC uses the private interconnect for inter-instance communication and block transfers. The GES Monitor and the Instance Enqueue Process manage access to Cache Fusion resources and enqueue recovery processing.
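The lookup order implied by Cache Fusion (local buffer cache first, then another instance's cache over the interconnect, and only then the disk) can be illustrated with a toy Python model. This is a deliberately simplified sketch and not Oracle code: it ignores the Global Resource Directory, locking and block consistency, and the class and method names are hypothetical.

```python
class Instance:
    """Toy model of one database instance with its own buffer cache."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk      # shared storage: {block_id: data}
        self.cache = {}       # this instance's buffer cache
        self.peers = []       # other instances in the same cluster

    def read_block(self, block_id):
        """Return (data, source) following the Cache Fusion lookup order."""
        if block_id in self.cache:
            return self.cache[block_id], "local cache"
        for peer in self.peers:
            if block_id in peer.cache:
                # Block image shipped over the interconnect, no disk read.
                self.cache[block_id] = peer.cache[block_id]
                return self.cache[block_id], "interconnect from " + peer.name
        self.cache[block_id] = self.disk[block_id]
        return self.cache[block_id], "disk"


# Two instances sharing one database, as in the Santander cluster.
shared_disk = {1: "traffic block"}
node_a = Instance("A", shared_disk)
node_b = Instance("B", shared_disk)
node_a.peers = [node_b]
node_b.peers = [node_a]

first = node_a.read_block(1)    # nobody has the block yet: read from disk
second = node_b.read_block(1)   # A already caches it: shipped from A
third = node_b.read_block(1)    # now cached locally on B
print(first, second, third)
```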

This platform, which currently provides database support for reusable components of this project such as traffic data and Participatory Sensing events, should also be reused to meet relational data storage needs.

SAE PLATFORM

The transportation service of Santander City Council is equipped with an Exploitation Support System. This system, also known as the SAE Platform, is a set of solutions that merges several technologies to improve the service and management of transportation. The main component of this platform is the on-board equipment, composed of a GPS locator, a central processing unit and a communication system capable of communicating the vehicle's position to the control center in real time (or at a configurable frequency). Another major component is the control center, which receives all the data generated by the on-board equipment and allows our staff, through an operation console, to make decisions and take actions on the related vehicle. The applications of the SAE platform are diverse; in the Santander municipality, however, they are focused on managing the bus fleet. This system allows the Municipality Transportation Service:

• To track the stopping frequency of buses on regular lines.

• To inform passengers of the estimated arrival time of the next bus on a line.

• To optimize the number of vehicles on each line according to current needs.

• To locate all vehicles through a GUI overlaid on the municipal cartography.

The SAE platform provides tools for comprehensive tracking of the transportation network, in order to enhance the social service offered to citizens and to optimize the planning of metropolitan transportation lines.

All the information generated by this system is a valuable asset for the ClouT project. For instance, the log of actual bus stop frequencies, the estimated bus arrival times, the planned line timetables and the bus stop locations are clear examples of candidate data to be reused in the project.
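As an example of how the stop frequency log mentioned above could be reused, the following minimal Python sketch computes the headways (time between consecutive buses) at one stop from a list of arrival timestamps. The input format is an assumption: the actual SAE log format is not described in this document, and the timestamps are invented.

```python
from datetime import datetime


def headways_minutes(arrivals):
    """Return the minutes elapsed between consecutive arrivals at a stop."""
    ordered = sorted(arrivals)
    return [
        (later - earlier).total_seconds() / 60
        for earlier, later in zip(ordered, ordered[1:])
    ]


# Invented arrival times of one line at one stop, for illustration only.
arrivals = [
    datetime(2013, 7, 31, 8, 12),
    datetime(2013, 7, 31, 8, 0),
    datetime(2013, 7, 31, 8, 20),
]
gaps = headways_minutes(arrivals)
print(gaps)
print(sum(gaps) / len(gaps))  # average headway in minutes
```

Comparing these observed headways with the planned timetable is one simple way to derive a service regularity indicator from the data.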

INCISIS PLATFORM

Within the SmartSantander project, Santander City Council implemented a Participatory Sensing app, named Pace of the City, to extend the IoT deployment to citizens' mobile devices. This initiative allows citizens to send measurements and events to the SmartCity platform. Measurements feed the system with more valuable data, while events help build a picture of what is happening in the city at each moment and make it possible to study human behavior within an urban area.

FIGURE 31 SANTANDER INCISIS IMPLEMENTATION ARCHITECTURE

Some of these reported events require Santander City Council to take action in order to respond to matters under its responsibility. For example, if a hole in the street is reported, the City Council must repair it as soon as possible to avoid major damage or injuries to citizens.

For this purpose, Santander City Council deployed an internal platform, named INCISIS, which allows municipal workers to manage each event. The functional schema is as follows:

[Figure labels: Participatory Service; Services; Filtering; Normalization; Task Allocation; Reception]

FIGURE 32 SANTANDER INCISIS FUNCTIONAL SCHEMA

The Citizen Participatory Municipality Service receives events in the platform and acts as a first-level helpdesk. This service filters the events and normalizes the information received. In those cases where a fast action or response can be given, this service resolves and closes the event. If the event requires further action, this service allocates it to a more specialized Municipality Service, and the event is treated at the second level of the helpdesk.

After receiving the task, the service can perform one of several actions (Resolve & Close, Resolve & Reallocation, Reallocation, Cancel), identified in the following figure:

FIGURE 33 INCISIS SERVICE EXAMPLE SCHEMA

• Resolve and Close: When the solution involves only the allocated service, that service does the job and closes the event, describing the actions taken.

• Resolve and Reallocation: For events affecting multiple services, the service performs the work necessary to solve its part of the task and reassigns the event to a new service, reporting the actions taken.

• Reallocation: A service has been allocated an event by mistake, or considers the event outside its scope, so it reallocates the event to a new service, reporting the reason for the reallocation.

• Cancel: The service cancels the incident and reports a justified reason.
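The four actions listed above can be modelled as operations on an event record. The following Python sketch is a hypothetical model written for illustration: it does not reflect the INCISIS implementation or its SOAP interface, and all class, method and service names are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    OPEN = "open"
    CLOSED = "closed"
    CANCELLED = "cancelled"


@dataclass
class Event:
    description: str
    service: str                      # service currently in charge
    status: Status = Status.OPEN
    history: list = field(default_factory=list)

    def _require_open(self):
        if self.status is not Status.OPEN:
            raise ValueError("event is already " + self.status.value)

    def resolve_and_close(self, actions):
        self._require_open()
        self.history.append((self.service, "resolve and close", actions))
        self.status = Status.CLOSED

    def resolve_and_reallocate(self, actions, new_service):
        self._require_open()
        self.history.append((self.service, "resolve and reallocation", actions))
        self.service = new_service

    def reallocate(self, reason, new_service):
        self._require_open()
        self.history.append((self.service, "reallocation", reason))
        self.service = new_service

    def cancel(self, reason):
        self._require_open()
        self.history.append((self.service, "cancel", reason))
        self.status = Status.CANCELLED


# Invented example: an event allocated by mistake, then resolved.
event = Event("Hole in the street", service="Parks")
event.reallocate("outside our scope", new_service="Public Works")
event.resolve_and_close("hole repaired")
print(event.status, event.history)
```

Every action records which service acted and why, which mirrors the requirement that each step reports the actions taken or the reason given.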

To conclude, the INCISIS platform is valuable for the ClouT project because it is closely related to the Participatory Sensing use cases planned for this project. In addition, the platform offers SOAP APIs to query or register events, so data can be exchanged with the other platforms of the project.

5.4. Fujisawa reusable CPaaS assets

This section lists the reusable components proposed by the municipality of Fujisawa.

DISASTER PREVENTION GIS SYSTEM

Among the possible CPaaS assets in Fujisawa is the city's disaster prevention GIS system. The system provides services such as a quick listing of disaster information, an equipment map for disaster prevention, an evacuee search engine and a communication BBS. However, during the 3.11 earthquake disaster in Japan, the GIS system did not work perfectly. Thus, it is important to build this kind of system with dependability in mind. In addition, it is necessary to combine this kind of GIS system with a wider variety of sensors.

Figure 34: Fujisawa disaster prevention GIS system

DIGITAL SIGNAGE SYSTEM

Fujisawa, Keio University and several other organizations have been experimenting with a digital signage system in Fujisawa, called Fujisawa Signage. Fifteen digital signs are installed in 15 different places in Fujisawa, such as the city office, citizen centers and cultural centers. Each sign can show different content that is useful for the citizens who visit the place where it is installed. In addition, citizens themselves can post their own information to the digital signage after passing several authentication steps.

Figure 35. Overview of Fujisawa Signage

5.5. Mitaka reusable CPaaS assets

This section lists the reusable components proposed by the municipality of Mitaka.

GIS SYSTEM

Mitaka city provides publicly available Web-based GIS systems. They offer 24 kinds of maps in 6 categories: city status; living; welfare and health; security and untroubled life; sightseeing, culture and nature; and child-raising and education. For example, in the traffic information category, observations about traffic such as "there are many bicycles on this road" and "it is hard to see the surroundings on this sloping road" are shown on the map. In terms of sensor-related information, the map shows radiation levels at many points in Mitaka, because citizens' concern about radiation has increased since the 3.11 disaster. However, the information is static and is not updated in real time. Moreover, data from other kinds of sensors, such as temperature, humidity or noise level, are not currently provided. Thus, it is important to provide real-time IoT information to citizens by using this kind of GIS visualization system.

Figure 36: Mitaka GIS system

6. Conclusions

This document presents different solutions to different problems using heterogeneous technologies. Summary tables are provided to facilitate the analysis of these components that we will carry out in order to choose the most suitable ones in the light of the project use cases and objectives. Further analysis (e.g., compatibility between the components in terms of languages used, data models, etc., maturity analysis, API compatibility) and selection will be done in the project and reflected in the next deliverable of Work Package 1, D1.2: First version of Reference Architecture and reusable components.

As described in the introduction of this document, each chapter has been dedicated to a specific topic (with regard to the Infrastructure layer); for each topic one or more reusable objects have been identified, as shown in the following schema:

6.1. Service composition platforms for citizen’s applications

The components presented in the Service composition platforms section include solutions from two main domains: data mashup tools (SWIFT, WireCloud, Mycocktail, EMML, OpenSocial, Yahoo Pipes and Apache Shindig) and service/business process composition solutions (iPOJO, BPMN and BPEL). In addition, the section presented five methods for making compositions robust, respecting QoS properties, verifying time and resource constraints and detecting conflicting business rules. While the data mashup tools target non-technical end users who want to create applications rapidly from existing data sources, service composition tools target more experienced developers who require a minimum level of dependability and Quality of Service. While mashup technologies are mostly Web-based platforms, service composition tools may require more sophisticated tooling in order to provide the required properties to the services. Other criteria for selecting the components will be the programming languages used, the licenses (free, open) and the compatibility with neighboring components in the architecture.

6.2. Big data processing

The components presented in the big data processing section can similarly be grouped into three classes: processing of data (Esper, FI-WARE Gateway Data Handling, JBoss Drools Expert), storage of data (MongoDB) and solutions for self-healing of faulty data coming from IoT devices. These components represent the data processing features that the platform needs in order to process IoT device data. Note that data storage and secure access to city data in the cloud are covered in the section "Secure and dependable access to city data".

6.3. Secure and dependable access to city data

The components in the Secure and dependable access to city data section are presented in two groups: open city data hosting and tools for accessing city data. Data hosting is performed by NoSQL database software (Hypertable and HBase) and by cloud storage and distributed file systems (OpenStack Swift, Ceph and Hadoop HDFS). Access to the resources (data stored in databases or files stored in the storage systems) is centralized through the CDMI Proxy framework. Secure access to the resources is enforced by the SOA3 framework. The component presented under dependability tools for accessing city data is the D-Case Monitoring Tool, which is used to monitor the resources.

6.4. City resources

The city resources section presents the components used by the municipalities. These include tools used to store data (for example, the Oracle database) and tools used to monitor the risk of catastrophic events (for example, the disaster prevention system). Most of the tools presented in this section are GIS systems.

7. Appendix

7.1. TG3.1 Reusable Components

This section contains the tables describing the reusable components introduced in chapter 2 "Service Composition Platform for Citizen's applications", chapter 3 "Big data Processing", chapter 4 "Secure and dependable access to city data" and chapter 5 "Other Reusable CPaaS Assets from Cities".

TABLE 4 – OPEN STACK SWIFT

Proposing Partner: ENG

ID Card
Name: Open Stack SWIFT
Owner(s): OpenStack Foundation
Link to code: https://github.com/openstack/swift
Project, plus link: Open Stack Swift: http://www.openstack.org/software/openstack-storage/
Patents or other IPR exploitation (licenses): The OpenStack project is provided under the Apache 2.0 license
Description: Open Stack Swift is cloud distributed object storage….

Anatomy
Type: -
Programming language/based technologies: Python
Exposed APIs: RESTful APIs provided by OpenStack services
Target User(s): All end user applications and ClouT architectural components that need to use ClouT storage and are able to interact with RESTful APIs
References: Rackspace: http://www.rackspace.com/cloud/; Companies Supporting The OpenStack Foundation: http://www.openstack.org/foundation/companies/
Adopted Standard(s): RESTful; HTTP

Assessment
Capabilities/Opportunities: The goal of Open Stack Swift is the capability to store petabytes of data in a distributed, standard infrastructure of clustered servers. Open Stack Swift is not a file-based storage system, but a highly distributed cloud storage infrastructure with redundancy capabilities. Thanks to the replication of data (stored and replicated on more than one server), Open Stack Swift offers a high level of safety in case of failure of one server.
Known Limitations/Threats: The object stored must be less than 5 GB………
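A common way to live with the per-object size limit noted in the table is to split large content into segments that are stored as separate objects. The following Python sketch shows only the arithmetic of such a split, together with the general shape of a Swift object path (`/v1/{account}/{container}/{object}`). It assumes that "5 GB" means 5 x 1024^3 bytes; the function names, the segment size and the account, container and object names are hypothetical, and no Swift call is made.

```python
# Assumption: the 5 GB limit is interpreted as 5 * 1024**3 bytes.
MAX_OBJECT_BYTES = 5 * 1024 ** 3


def plan_segments(total_bytes, segment_bytes=1024 ** 3):
    """Return (offset, length) pairs covering `total_bytes` of content."""
    if not 0 < segment_bytes <= MAX_OBJECT_BYTES:
        raise ValueError("segment size must be between 1 byte and the object limit")
    segments = []
    offset = 0
    while offset < total_bytes:
        length = min(segment_bytes, total_bytes - offset)
        segments.append((offset, length))
        offset += length
    return segments


def object_path(account, container, name):
    """Return the REST path of an object in a Swift-style store."""
    return f"/v1/{account}/{container}/{name}"


# A hypothetical 12 GiB archive split into 5 GiB segments.
plan = plan_segments(12 * 1024 ** 3, segment_bytes=MAX_OBJECT_BYTES)
for index, (offset, length) in enumerate(plan):
    print(object_path("AUTH_demo", "archive_segments", f"archive/{index:04d}"), offset, length)
```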

TABLE 5- WIRECLOUD

Proposing Partner: ENG

ID Card Info
Name: Wirecloud
Owner(s): UPM CoNWeT Lab (Universidad Politécnica de Madrid's School of Computer Science)
Link to code: https://github.com/Wirecloud/
Project, plus link:
- FI-WARE project Wirecloud page: http://catalogue.fi-ware.eu/enablers/documentation-6
- Wirecloud homepage: http://conwet.fi.upm.es/wirecloud/
Patents or other IPR exploitation (licenses): Affero General Public License version 3 (AGPL v.3): http://www.gnu.org/licenses/agpl-3.0.html
Description: The Wirecloud Mashup Platform is a framework that allows end users to create mashups by composing widgets, which are the building blocks of new applications. The platform is composed of three components: the Application Mashup Editor, the Mashup Execution Engine and the catalogue.

Anatomy
Type: Web application
Programming language/based technologies:
- Server side: Python
- Client side: JavaScript, HTML, CSS
Exposed APIs: The platform provides two types of API:
- The Widget API is a JavaScript API that allows widgets deployed in a Mashup Execution Engine to gain access to its functionalities
- The Application Mashup API is a RESTful API that provides the functionality to create and modify workspaces and to manage the resources available for building these workspaces
Target User(s): End users, software developers
References: The Wirecloud platform has been developed and used in the FI-WARE project (www.fi-ware.eu). Wirecloud is also currently used in different EU research projects such as Outsmart (http://www.fi-ppp-outsmart.eu/) and Envirofi (http://www.envirofi.eu)
Adopted Standard(s): RESTful, HTTP

Assessment
Capabilities/Opportunities: The Wirecloud Mashup Platform provides several functionalities that could be useful for the mashup tool to be developed in ClouT. In particular, the graphical editor, which allows connecting gadget data, is simple to use for non-technical users, and the catalogue allows sharing mashups with other stakeholders of the ecosystem, fostering reuse of services in different contexts (for example, the different pilots of the ClouT project). The AGPL license also allows a possible reuse of this platform.
Known Limitations/Threats: Some problems could arise from technical incompatibilities with other components of the ClouT architecture: a deep study of the API will be conducted to understand how far the platform functionalities can really be reused from external components. The platform does not allow creating or modifying widgets using the graphical editor: each widget has to be created through software programming and then imported into the catalogue. It also seems that widgets cannot be exported or shared in different containers (e.g. OpenSocial gadget containers), limiting the use of the mashup to the Wirecloud platform.

TABLE 6- MYCOCKTAIL

Proposing Partner: ENG

ID Card Info
Name: MyCocktail - Romulus Mashup Builder
Owner(s): Informática Gesfor, Romulus Project
Link to code:
- Binary: http://sourceforge.net/projects/mycocktail/files/releases/v1/MyCocktail_1.5.0.zip/download
- Source: https://mycocktail.svn.sourceforge.net/svnroot/mycocktail/trunk/MyCocktail/
- Online instance: http://www.ict-romulus.eu/MyCocktail/#
Project, plus link:
- Romulus project: http://www.ict-romulus.eu
- MyCocktail website: http://www.ict-romulus.eu/web/mycocktail/home
Patents or other IPR exploitation (licenses): Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
Description: MyCocktail provides two web tools: the mashup builder and the page editor. The mashup builder is a graphical environment that allows combining and modifying, by means of specific filters, information coming from REST services, and connecting these data with web gadgets to create mashups. The page editor allows designing a web page through a GUI, integrating mashups created with MyCocktail with other HTML elements.

Anatomy
Type: Web Application
Programming language/based technologies:
- Server side: Java
- Client side: JavaScript, HTML, CSS
Exposed APIs: None
Target User(s): End users, software developers
References: The MyCocktail platform was developed in the European ICT Romulus project. It has been used by other European research projects such as the ICT Omelette project (http://www.ict-omelette.eu)
Adopted Standard(s): OpenSocial specifications, WADL, REST

Assessment
Capabilities/Opportunities: The MyCocktail web application has several features that could be interesting in the ClouT context. In particular, the possibility to import and use REST services to create mashups could be useful, considering that many of the services in the CPaaS layer can be provided as REST. The ability to export gadgets to other platforms that support the OpenSocial specifications also widens the possibility to reuse the mashups in external environments.
Known Limitations/Threats: The MyCocktail GUI does not provide a clear representation of the mashups, such as arrows that connect the different elements: this could confuse end users, in particular non-technical ones. Also, the platform does not expose an API for external systems that would use the server-side engine to perform mashups.


TABLE 7- ENTERPRISE MASHUP MARKUP LANGUAGE

Proposing Partner: ENG

ID Card Info
Name: Enterprise Mashup Markup Language (EMML)
Owner(s): Open Mashup Alliance (OMA) (http://www.openmashup.org/)
Link to code: EMML reference runtime implementation: http://www.openmashup.org/download/
Project, plus link: Open Mashup Alliance: http://www.openmashup.org
Patents or other IPR exploitation (licenses): Creative Commons License of Attribution No Derivatives: http://creativecommons.org/licenses/by-nd/3.0/
Description: EMML (Enterprise Mashup Markup Language) is an XML markup language for creating enterprise mashups, promoted by the Open Mashup Alliance. EMML is a declarative mashup domain-specific language (DSL) that eliminates the need for complex, time-consuming, and repeatable procedural programming logic to create enterprise mashups.

Anatomy
Type: Domain Specific Language based on XML
Programming language/based technologies: XML
Exposed APIs: N/A
Target User(s): Software developers
References: EMML is an open language specification promoted by the Open Mashup Alliance, which includes several international IT market leaders such as Adobe, HP and Intel.
Adopted Standard(s): EMML itself is not a standard, but the objective of the OMA is to submit the EMML specification to a recognized industry standards body.

Assessment
Capabilities/Opportunities: EMML represents one of the few attempts at standardization in the field of mashups. Additionally, the language is sponsored by a strong consortium that includes big international IT actors. The language seems to cover most of the possible mashup needs in the ClouT context, considering that it supports heterogeneous services and data formats. The possible adoption in ClouT will be investigated in the next project phases.
Known Limitations/Threats: The main limitation that can be identified at this stage is related to the licence of EMML: the Creative Commons License of Attribution No Derivatives implies that the language can be freely reused but cannot be modified, for example to create an extended version that could be more suitable for the ClouT requirements.


TABLE 8- OPENSOCIAL

Proposing Partner: ENG

ID Card Info
Name: OpenSocial
Owner(s): OpenSocial Foundation
Link to code: http://docs.opensocial.org/display/OSD/Specs
Project, plus link: OpenSocial Foundation: http://opensocial.org
Patents or other IPR exploitation (licenses): Apache License 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
Description: OpenSocial is a set of common APIs (Application Programming Interfaces), based on HTML and JavaScript, for social software applications, giving access to data and core functions on participating social networks.

Anatomy
Type: Web application framework
Programming language/based technologies: Java, PHP, C#, JavaScript, HTML
Exposed APIs: OpenSocial defines a set of common application programming interfaces for web-based applications.
Target User(s): Software Developers
References: Initially developed by Google, it has been supported and implemented by several platforms and social networks. In particular, different enterprise social platforms are compliant with the OpenSocial specifications: some examples are Liferay, Jive, and SugarCRM
Adopted Standard(s): OpenSocial is itself an open standard

Assessment
Capabilities/Opportunities: OpenSocial is a mature standard that has been adopted by several enterprise platforms in recent years. The gadget approach that can be implemented through its specification is in line with the presentation layer of the mashup editor planned for development in ClouT.
Known Limitations/Threats: No particular limitation has been identified at this stage


TABLE 9- YAHOO! PIPES

Proposing Partner: ENG

ID Card Info
Name: Yahoo! Pipes
Owner(s): Yahoo!
Link to code: N/A
Project, plus link: http://pipes.yahoo.com/pipes/
Patents or other IPR exploitation (licenses): Proprietary License: http://info.yahoo.com/legal/us/yahoo/pipes/pipes-4396.html
Description: Yahoo! Pipes is a visual programming environment that gives people the ability to create web mashups and web-based applications that combine data from multiple sources. The core component of Yahoo! Pipes is the Pipe editor, which is composed of three panes: the canvas, the library and the debugger. The output pipes can be in different formats: widgets, RSS, JSON, PHP.

Anatomy
Type: Web Application
Programming language/based technologies: RSS, JSON
Exposed APIs: N/A
Target User(s): End Users
References: Yahoo! Pipes can be considered a mature service, having been supported since 2007 by a big IT player such as Yahoo.
Adopted Standard(s):

Assessment
Capabilities/Opportunities: Yahoo! Pipes is a stable and widely used platform. The particularly easy user interaction approach used in its editor could be replicated in the ClouT mashup editor.
Known Limitations/Threats: Yahoo! Pipes is not an open source platform, so it will not be possible to reuse its components in the ClouT project. Also, Yahoo! Pipes cannot be considered a complete mashup editor but rather a limited data mashup tool that is not designed to perform service mashup to create complex applications.


TABLE 10- APACHE SHINDIG

Proposing Partner: ENG

ID Card Info
Name: Apache Shindig
Owner(s): Apache Software Foundation
Link to code: https://svn.apache.org/repos/asf/shindig/trunk/
Project, plus link: Apache Shindig, http://shindig.apache.org
Patents or other IPR exploitation (licenses): Apache License 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
Description: Shindig is an open source project of the Apache Software Foundation that aims to implement an open container compliant with the OpenSocial gadgets specifications. There are currently two versions of Apache Shindig, written in Java and PHP, and a third, written in .NET, that is external to the project.

Anatomy
Type: Application framework
Programming language/based technologies: JavaScript, PHP and Java
Exposed APIs: OpenSocial REST API
Target User(s): Software Developers
References: Apache Shindig is used in several OpenSocial sites and projects such as: Apache Rave (http://rave.apache.org/), LinkedIn (http://www.linkedin.com/), hi5 (http://www.hi5.com/), Yahoo! (http://apps.yahoo.com/), Orkut (http://www.orkut.com/), Zing (http://code.google.com/p/zing/)
Adopted Standard(s): OpenSocial specifications

Assessment
Capabilities/Opportunities: Apache Shindig is one of the most stable and widely used reference implementations of the OpenSocial specification. Its implementation is available in both Java and PHP, giving the possibility to choose the more suitable language for the ClouT needs: the reuse of this component will be strongly considered.
Known Limitations/Threats: No particular limitation has been identified at this stage.


TABLE 11- IPOJO

Proposing Partner: CEA

ID Card Info
Name: iPOJO
Owner(s): Apache Software Foundation
Link to code: http://felix.apache.org/site/apache-felix-ipojo.html
Project, plus link: http://felix.apache.org/documentation/subprojects/apache-felix-ipojo.html
Patents or other IPR exploitation (licenses): http://www.apache.org/licenses/
Description: iPOJO (Plain Old Java Object) is a service component runtime aiming to simplify OSGi application development and to build a service-oriented component model.

Anatomy
Type: Service component model
Programming language/based technologies: Java
Exposed APIs:
Target User(s): Service/application developers
References: http://felix.apache.org/documentation/subprojects/apache-felix-ipojo/apache-felix-ipojo-why-choose-ipojo.html
Adopted Standard(s): OSGi

Assessment
Capabilities/Opportunities: iPOJO simplifies the development of OSGi services and adds features, including component configuration, synchronization, and composition. It is designed to be small: the core size of iPOJO is approximately 205k.
Known Limitations/Threats: It requires learning its component approach


TABLE 12- BPMN

Proposing Partner: CEA

ID Card Info
Name: BPMN
Owner(s): OMG (Microsoft, IBM, etc.)
Link to code: Among others, an open source implementation: http://activiti.org/download.html
Project, plus link: http://www.bpmn.org/
Patents or other IPR exploitation (licenses): Specifications are open. Open source and commercial implementations exist (e.g., Activiti is open source and cross platform, http://activiti.org/download.html)
Description: The Business Process Modeling Notation (BPMN) is a graphical notation that depicts the steps in a business process. A business process spans multiple participants, and coordination can be complex. A well-supported standard modeling notation reduces confusion among business and IT end users.

Anatomy
Type: Graphical process modeling
Programming language/based technologies: Language agnostic. Activiti is written in Java
Exposed APIs: Deploying the process, starting a process instance, completing tasks, suspending and activating a process, query API, exception management, unit testing, etc.
Target User(s): Application development tool provider
References: -
Adopted Standard(s): UML compatible

Assessment
Capabilities/Opportunities: Can be easily serialized into XML. Provides a complete specification. Facilitates integration of IoT processes into existing business processes.
Known Limitations/Threats: Can be too complex for simple IoT applications


TABLE 13- BPEL

Proposing Partner: CEA

ID Card Info
Name: BPEL
Owner(s): OASIS
Link to code: http://www.oracle.com/technetwork/middleware/bpel/overview/index.html
Project, plus link: https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsbpel
Patents or other IPR exploitation (licenses): OASIS standard, open specification, open source and commercial implementations exist
Description: BPEL (BPEL s.d.) is the standard for assembling a set of discrete services into an end-to-end process flow, radically reducing the cost and complexity of process integration initiatives.

Anatomy
Type: Process Execution Language
Programming language/based technologies: Implementations in Java and .NET exist. XML for data exchange, WSDL for service description, XPath for manipulating XML data
Exposed APIs: A complete API from the Oracle implementation can be found at: http://docs.oracle.com/cd/E11036_01/integrate.1013/b28986/toc.htm
Target User(s): Application development tool provider
References:
Adopted Standard(s): Related to BPMN

Assessment
Capabilities/Opportunities: Mature technology with many implementations: http://en.wikipedia.org/wiki/Comparison_of_BPEL_engines
Known Limitations/Threats: Can be too complex for simple IoT applications


TABLE 14 – ROBUST SERVICE COMPOSITION METHOD

Proposing Partner: NII

ID Card Info
Name: Robust Service Composition Method
Owner(s): NII
Link to code: N/A
Project, plus link: QAS, http://grace-center.jp/research/research_projects/prj_perqas-html?lang=en
Patents or other IPR exploitation (licenses): N/A
Description: This method provides foundations for service composition to ensure validity of functionality as well as optimization and constraint satisfaction in quality. The method also takes into consideration the possibility of failures and slight differences in the functions of similar services.

Anatomy
Type: Method / Tool Specification and Proof-of-Concept Implementation
Programming language/based technologies: None
Exposed APIs: N/A
Target User(s):
- Developers and users who design service compositions
- System components that trigger automated composition
References:
Adopted Standard(s): N/A

Assessment
Capabilities/Opportunities: Effective in situations with any of the following characteristics:
- Service functions of the same kind actually differ in their interfaces or in how general the inputs/outputs they require/produce are
- Services are somewhat fragile and tend to change or stop in operation or quality, due to technical aspects or dependencies on third-party providers
- Compositions require complex workflows with careful consideration of the pre/post-conditions of each service
- The space of possible service combinations is too large to understand and to decide on manually
Known Limitations/Threats: Needs ontology-based descriptions of service functions and quality as input, supplied by providers/mediators of the involved services or by developers of service compositions
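The two concerns named in the table, failures and slightly different interfaces among similar services, can be illustrated with a small fallback wrapper. This is only a sketch under stated assumptions, not the NII method: the services, the adapters and the function `invoke_with_fallback` are hypothetical, and the alternatives are simply tried in a fixed order.

```python
def invoke_with_fallback(alternatives, request):
    """Try similar services in order until one succeeds.

    alternatives: list of (name, adapter, service), where adapter converts the
    common request into the input format that the service expects.
    """
    errors = []
    for name, adapter, service in alternatives:
        try:
            return name, service(adapter(request))
        except Exception as exc:  # a failed service triggers the next one
            errors.append((name, repr(exc)))
    raise RuntimeError(f"all alternatives failed: {errors}")


# Hypothetical services with slightly different interfaces.
def primary_weather(req):
    raise ConnectionError("primary provider is down")


def backup_weather(req):
    return f"weather for {req['location']}"


alternatives = [
    ("primary", lambda r: {"city": r["city"]}, primary_weather),
    ("backup", lambda r: {"location": r["city"]}, backup_weather),
]
result = invoke_with_fallback(alternatives, {"city": "Santander"})
```

Here the adapter absorbs the interface difference (`city` versus `location`), while the loop absorbs the failure of the first provider.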


TABLE 15 – QOS-BASED SERVICE/CLOUD SELECTION METHOD

Proposing Partner: NII

ID Card Info
Name: QoS-based Service/Cloud Selection Method
Owner(s): NII
Link to code: N/A
Project, plus link: QAS, http://grace-center.jp/research/research_projects/prj_perqas-html?lang=en
Patents or other IPR exploitation (licenses): N/A
Description: This method provides an algorithm for QoS-based selection of services or clouds to use, considering network QoS in addition to service (computation) QoS.

Anatomy
Type: Method / Tool Specification and Proof-of-Concept Implementation
Programming language/based technologies: None
Exposed APIs: N/A
Target User(s):
- Developers and users who select service providers/plans to use and cloud providers/plans on which to deploy the components of service compositions
- System components that trigger automated selection of services and clouds
References:
Adopted Standard(s): N/A

Assessment
Capabilities/Opportunities: Effective in situations with any of the following characteristics:
- There are many choices of service providers and plans to use, or of cloud providers and plans on which to deploy components
- There are many quality criteria (price, availability, response time, etc.) to consider when selecting providers and plans
- There are end-to-end constraints on the quality criteria (constraints on the whole composition, such as total price), and it is difficult to select a whole combination that satisfies them
- Attractive providers beyond continental boundaries are considered (e.g., a US cloud provider used from Japan) and network latency is not negligible
Known Limitations/Threats: Needs a concrete model of quality of service and its values for each service
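To make the end-to-end constraint in the table concrete, the sketch below picks one plan per task so that the total price stays within a budget while the summed service and network latency is minimal. It uses plain exhaustive search, which is an assumption made for illustration and not the algorithm of the NII method; all plan names and figures are invented.

```python
from itertools import product


def select_plans(candidates, max_total_price):
    """Pick one plan per task, minimising latency within a price budget.

    candidates: {task: [(plan_name, price, service_ms, network_ms), ...]}
    Returns (total_latency_ms, total_price, {task: plan_name}) or None.
    """
    tasks = list(candidates)
    best = None
    for combo in product(*(candidates[t] for t in tasks)):
        total_price = sum(plan[1] for plan in combo)
        if total_price > max_total_price:
            continue  # violates the end-to-end price constraint
        latency = sum(plan[2] + plan[3] for plan in combo)
        if best is None or latency < best[0]:
            choice = {t: plan[0] for t, plan in zip(tasks, combo)}
            best = (latency, total_price, choice)
    return best


# Hypothetical plans: (name, price, service latency, network latency).
candidates = {
    "storage": [("jp-basic", 10, 40, 5), ("us-cheap", 4, 30, 120)],
    "analytics": [("jp-fast", 20, 20, 5), ("eu-mid", 12, 25, 90)],
}
selection = select_plans(candidates, max_total_price=25)
```

With the budget of 25, the cheapest remote storage plan loses to the local one because its network latency outweighs its faster service time, which is the effect the method accounts for.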


TABLE 16 – METADATA-BASED BEHAVIOR INSERTION FRAMEWORK

Proposing Partner: NII

ID Card Info
Name: Metadata-based Behavior Insertion Framework
Owner(s): NII
Link to code: N/A
Project, plus link: QAS, http://grace-center.jp/research/research_projects/prj_perqas-html?lang=en
Patents or other IPR exploitation (licenses): N/A
Description: This framework provides a way to insert additional behaviors into service compositions to achieve value-added functions, often for non-functional requirements. It defines an algorithm that properly inserts additional behaviors according to constraints or rules, which are efficiently described by abstract metadata.

Anatomy
Type: Method / Tool Specification and Proof-of-Concept Implementation
Programming language/based technologies: None
Exposed APIs: N/A
Target User(s):
- Developers and users who design service compositions
- System components that trigger automated composition
References:
Adopted Standard(s):
- WS-BPEL (Web Services Business Process Execution Language, OASIS, https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsbpel)
- WS-CDL (Web Services Choreography Description Language, W3C, http://www.w3.org/TR/ws-cdl-10/)

Assessment
Capabilities/Opportunities: Effective in situations with any of the following characteristics:
- Non-functional requirements are achieved by inserting additional behaviors such as logging, authentication, payment, etc.
- These behaviors are scattered across various points of execution, inside one service composition or across multiple service compositions
- Insertion of such behaviors is switched on or off, depending on the environments or preferences
Known Limitations/Threats: Careful analysis and definition are necessary to define the metadata, especially so that requirements that appear later can be dealt with efficiently
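The idea of attaching behaviors through metadata rather than editing each composition can be sketched as follows. This is a simplified illustration under assumptions, not the framework's algorithm: steps carry metadata tags, each rule names a behavior, the tag it applies to and whether it runs before or after the step, and a set of enabled behaviors models switching insertion on or off. All names are hypothetical.

```python
def insert_behaviors(steps, rules, enabled):
    """Return the step sequence with enabled behaviors woven in.

    steps:   list of (step_name, set_of_metadata_tags)
    rules:   list of (behavior, tag, position), position is "before" or "after"
    enabled: set of behaviors that are currently switched on
    """
    result = []
    for name, tags in steps:
        active = [r for r in rules if r[0] in enabled and r[1] in tags]
        result.extend(b for b, _, pos in active if pos == "before")
        result.append(name)
        result.extend(b for b, _, pos in active if pos == "after")
    return result


# Hypothetical composition and rules.
steps = [
    ("search", {"public"}),
    ("pay", {"sensitive", "billable"}),
    ("notify", set()),
]
rules = [
    ("authenticate", "sensitive", "before"),
    ("log", "billable", "after"),
]
woven = insert_behaviors(steps, rules, enabled={"authenticate", "log"})
```

Switching a behavior off only requires removing it from `enabled`; the composition itself and its metadata stay unchanged.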


TABLE 17 – BPMN VERIFICATION

Proposing Partner: NII

ID Card Info
Name: Verification Framework of Time and Resource Constraints on Business Process
Owner(s): NII/JAIST
Link to code: N/A
Project, plus link: QAS, http://grace-center.jp/research/research_projects/prj_perqas-html?lang=en
Patents or other IPR exploitation (licenses): N/A
Description: This framework provides a capability to verify time and resource constraints on service compositions or business processes through model checking. It provides a small annotation syntax on BPMN (Business Process Model and Notation) for time and resource constraints.

Anatomy
Type: Method / Tool Specification and Proof-of-Concept Implementation
Programming language/based technologies: UPPAAL Model Checker
Exposed APIs: N/A
Target User(s):
- Developers who design or verify service compositions
References:
Adopted Standard(s): BPMN (Business Process Model and Notation 2.0, OMG, http://www.omg.org/spec/BPMN/2.0/)

Assessment
Capabilities/Opportunities: Effective in situations with any of the following characteristics:
- Services rely on non-sharable resources, including human experts/crowds, so resource constraints should be considered
- Time constraints such as deadlines are significant
- State transitions are too complex for manual analysis, such as worst-case execution analysis
Known Limitations/Threats: Some abstraction or simplification may be necessary to avoid state explosion
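For the simplest case of the deadline check mentioned above, a worst-case execution time can be computed as the longest path through the process graph. The sketch below assumes an acyclic process with fixed task durations and ignores resource constraints, so it is far weaker than model checking with UPPAAL; task names and durations are invented.

```python
from functools import lru_cache


def worst_case_time(durations, successors, start):
    """Longest total duration from start to any end task.

    durations:  {task: time units}
    successors: {task: [next tasks]}; the graph is assumed to be acyclic
    """
    @lru_cache(maxsize=None)
    def longest(task):
        following = successors.get(task, ())
        return durations[task] + max((longest(n) for n in following), default=0)

    return longest(start)


# Hypothetical process: a manual review and an automatic check in parallel.
durations = {"receive": 1, "review": 5, "auto_check": 2, "approve": 1}
successors = {
    "receive": ["review", "auto_check"],
    "review": ["approve"],
    "auto_check": ["approve"],
}
wcet = worst_case_time(durations, successors, "receive")
meets_deadline = wcet <= 8
```

Once tasks compete for non-sharable resources, the worst case can exceed this longest path, which is where the model-checking approach of the framework becomes necessary.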


TABLE 18 – ECA VERIFICATION

Proposing Partner: NII

ID Card Info
Name: Verification Framework of ECA Specification on Physical Interactions
Owner(s): NII
Link to code: N/A
Project, plus link: QAS, http://grace-center.jp/research/research_projects/prj_perqas-html?lang=en
Patents or other IPR exploitation (licenses): N/A
Description: This framework provides a capability to verify complex combinations of ECA (Event-Condition-Action) rules that have effects on shared spaces, often triggered by multiple users.

Anatomy
Type: Method / Tool Specification and Proof-of-Concept Implementation
Programming language/based technologies: SPIN Model Checker
Exposed APIs: N/A
Target User(s):
- Developers who design or verify service compositions
References:
Adopted Standard(s): N/A

Assessment
Capabilities/Opportunities: Effective in situations with any of the following characteristics:
- Services with complex preconditions and postconditions are considered, especially physical effects on shared spaces
- Postconditions (effects) of services act on shared spaces such as real places, and thus require very careful consideration of feature interactions or undesirable situations caused by the activation of multiple services, often by multiple users
Known Limitations/Threats: Requires some knowledge of model checking and temporal logic to go beyond the default verification items
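The kind of feature interaction described above can be shown on a toy rule set: two ECA rules react to the same event and try to drive a shared device to different values. The sketch below finds such conflicts by enumerating every state of a small, finite variable domain. This brute-force enumeration is an illustrative assumption and not how the framework or SPIN works; rules, variables and values are invented.

```python
from itertools import combinations, product


def find_conflicts(rules, variables):
    """List rule pairs that fire together and set one target differently.

    rules:     list of (name, event, condition, (target, value))
    variables: {variable: [possible values]}, used to enumerate states
    """
    names = list(variables)
    conflicts = []
    for values in product(*(variables[n] for n in names)):
        state = dict(zip(names, values))
        for a, b in combinations(rules, 2):
            both_fire = a[1] == b[1] and a[2](state) and b[2](state)
            same_target = a[3][0] == b[3][0]
            if both_fire and same_target and a[3][1] != b[3][1]:
                conflicts.append((a[0], b[0], state))
    return conflicts


# Hypothetical rules acting on a shared room.
variables = {"temp": ["hot", "cold"], "meeting": [True, False]}
rules = [
    ("cool", "enter", lambda s: s["temp"] == "hot", ("window", "open")),
    ("quiet", "enter", lambda s: s["meeting"], ("window", "closed")),
    ("heat", "enter", lambda s: s["temp"] == "cold", ("heater", "on")),
]
conflicts = find_conflicts(rules, variables)
```

The single conflict found corresponds to a hot room during a meeting, where one rule opens the window and another closes it.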

7.2. TG3.2 Reusable Components

This section reports the tables for all the reusable components introduced in chapter 2.

TABLE 19- ESPER

Proposing Partner: CEA

ID Card Info
Name: ESPER
Owner(s): Esper Team and EsperTech Inc.
Link to code: http://esper.codehaus.org/
Project, plus link: http://esper.codehaus.org/
Patents or other IPR exploitation (licenses): GNU General Public License (GPL) (GPL v2)
Description: The Esper engine (Inc 2012) has been developed to address the requirements of applications that analyze and react to events. The Esper engine works a bit like a database turned upside-down. Instead of storing the data and running queries against stored data, the Esper engine allows applications to store queries and run the data through. The response from the Esper engine is real-time when conditions occur that match queries. The execution model is thus continuous, rather than triggered only when a query is submitted.

Anatomy
Type: Software for complex event processing
Programming language/based technologies: Implementations exist in Java and .NET
Exposed APIs: The reference implementation API can be found at: http://esper.codehaus.org/esper-4.6.0/doc/api/index.html
Target User(s): Data processing service developer, Application developer
References: -
Adopted Standard(s): -

Assessment
Capabilities/Opportunities: One of the rare open source complex event detection engines. It is complete and reusable.
Known Limitations/Threats: Compared with commercial solutions, Esper is still missing (http://blog.octo.com/en/the-esper-cep-ecosystem/):
- more user-friendly interfaces, that is, a graphical language as an alternative to EPL authoring
- enhanced simulation and back-testing features, which are expected in the upcoming 5.0 version with the announced event capture and replay features
- true HA functionalities (native load balancing and cluster management), which still require custom development and architectural design, including careful integration with an external messaging system between checkpoints
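The "database turned upside-down" model in the description can be sketched in a few lines: queries are stored first, and each incoming event is run through them. This is a conceptual illustration only. It does not use the Esper API or EPL, and the class, method and event field names are hypothetical.

```python
class StandingQueryEngine:
    """Stores queries and pushes every incoming event through them."""

    def __init__(self):
        self._queries = []

    def register(self, predicate, listener):
        # The query is stored once, before any data arrives.
        self._queries.append((predicate, listener))

    def send_event(self, event):
        # Evaluation is continuous: it happens per event, not per request.
        for predicate, listener in self._queries:
            if predicate(event):
                listener(event)


# Hypothetical usage: alert on high temperature readings.
alerts = []
engine = StandingQueryEngine()
engine.register(
    lambda e: e["type"] == "temperature" and e["value"] > 30,
    alerts.append,
)
for reading in (
    {"type": "temperature", "value": 22},
    {"type": "humidity", "value": 80},
    {"type": "temperature", "value": 35},
):
    engine.send_event(reading)
```

A real CEP engine adds time windows, joins across event streams and pattern matching on top of this basic push model.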


TABLE 20- FI-WARE GATEWAY DATA HANDLING

Proposing Partner: CEA

ID Card Info
Name: FI-WARE Gateway Data handling
Owner(s): FI-WARE Project
Link to code: http://catalogue.fi-ware.eu/enablers/gateway-data-handling-ge-esper4fastdata
Project, plus link: http://catalogue.fi-ware.eu/enablers/gateway-data-handling-ge-solcep
Patents or other IPR exploitation (licenses): http://catalogue.fi-ware.eu/enablers/terms-and-conditions-8
Description: FI-WARE Gateway Data Handling is a complex event processing component proposed by Orange Labs France. It is based on the Esper4FastData CEP engine. It is able to collect vast amounts of asynchronous events of different types and correlate them into single events, called Complex Events. It can read from and write to numerous channels using various protocols. It is driven by a domain-specific language called "Dolce".

Anatomy
Type: Software component
Programming language/based technologies: Implemented in Java (OSGi). A version for Android is also available
Exposed APIs: The Data Handling API is NGSI compliant.
Target User(s): Data processing service developer, Application developer
References: -
Adopted Standard(s): The Gateway Data Handling GE is fully integrated with the other enablers of FI-WARE, especially through the Open Mobile Alliance (OMA) Next Generation Service Interface Context Enabler (NGSI 9 / NGSI 10)

Assessment
Capabilities/Opportunities: Easy integration into OSGi environments. The RESTful management API needs only a servlet container, without depending on a third-party framework.
Known Limitations/Threats: Restricted access to the FI-WARE enablers. Difficult to test and evaluate. IPR issues can arise.


TABLE 21- JBOSS DROOLS EXPERT

Proposing Partner: CEA

ID Card Info
Name: JBoss Drools Expert
Owner(s): JBoss, RedHat
Link to code: http://www.jboss.org/drools/drools-expert.html
Project, plus link: http://www.jboss.org/drools/drools-expert
Patents or other IPR exploitation (licenses):
- EJ-Technologies provides licenses for JProfiler for free for the JBoss Drools project.
- JetBrains has donated an IntelliJ license for use on open source Drools projects
Description: Drools Expert is a declarative, rule-based coding environment. This allows you to focus on "what you want to do", and not on "how to do it".

Anatomy
Type: Rule description language
Programming language/based technologies: Java
Exposed APIs: Complete API on Drools: http://docs.jboss.org/jbpm/v5.1/javadocs/
Target User(s): Data processing service developer, Application developer
References: -
Adopted Standard(s):

Assessment
Capabilities/Opportunities: Can be integrated with BPMN. Implementations with BPMN already exist.
Known Limitations/Threats: -


TABLE 22- MONGODB

Proposing Partner: CEA

ID Card Info
Name: MongoDB
Owner(s): http://www.10gen.com/
Link to code: http://www.mongodb.org/
Project, plus link: http://www.mongodb.org/
Patents or other IPR exploitation (licenses): Open-source
Description: MongoDB is an open-source document database, and the leading NoSQL database. Data in MongoDB has a flexible schema. Collections do not enforce document structure, so different structures can be used within a single data set.

Anatomy
Type: Database
Programming language/based technologies: MongoDB distribution platform: C++. OS: Windows, Linux, OS X, Solaris. Drivers and clients: C, C++, Java, Python, etc.
Exposed APIs: http://api.mongodb.org/
Target User(s): Data processing service developer, Application developer
References: -
Adopted Standard(s): -

Assessment
Capabilities/Opportunities:
- MongoDB stores a binary form of JSON documents (BSON)
- MongoDB is an agile NoSQL database
- Auto-sharding allows MongoDB to scale from single server deployments to large, complex multi-data center architectures
- Free cloud-based service for monitoring MongoDB deployments
Known Limitations/Threats: Specific to documents


7.3. TG3.3 Reusable Components

This section reports the tables for all the reusable components introduced in chapter 2.

TABLE 23- REST/SOAP API’S FOR IDAS ACCESS

Proposing Partner: UC

ID Card Info
Name: REST/SOAP APIs for IDAS access
Owner(s): https://m2m.telefonica.com/
Link to code: http://forge.fi-ware.eu/plugins/mediawiki/wiki/fiware/index.php/IDAS
Project, plus link: http://forge.fi-ware.eu/plugins/mediawiki/wiki/fiware/index.php/IDAS
Patents or other IPR exploitation (licenses): Closed source. The IDAS platform has been internally developed by Telefónica based on previously developed in-house technology. It is based on a generic architecture for integrating heterogeneous and disperse M2M devices, sensor & actuator technologies. TID is the owner of the platform. It will be available for use in the Core Platform through its API. The terms of use of IDAS would need to be further discussed and agreed.
Description: IDAS has been conceived as an end-to-end open platform intended to be used in a broad range of Internet-of-Things application scenarios and services. Besides providing basic support to IoT service providers, different types of service users are envisaged, varying from 'individual end-users' to 'industrial users', automating the acquisition and management of the information retrieved from generic wireless sensor and actuator networks.

Anatomy
Type: Database
Programming language/based technologies: The IDAS platform has been built on Linux Redhat 5.5.
Exposed APIs: REST and SOAP web service interfaces
Target User(s): Service provider, Application developer
References: -
Adopted Standard(s): SensorML, O&M, SOS, SWE (Sensor Web Enablement), SIP communication protocol, Open Mobile Alliance (OMA) Service Environment (OSE)

Assessment
Capabilities/Opportunities:
Known Limitations/Threats: Specific to documents

TABLE 24 – TESTBED MANAGER

Proposing Partner: UC

Info

Name: Testbed Manager

Owner(s): http://www.smartsantander.eu/

ID Card

Link to code: http://www.smartsantander.eu/wiki/

Project, plus link: http://www.smartsantander.eu

Patents or other IPR exploitation (licenses): To be determined during the project.

Anatomy

Description: Testbed Manager module provides efficient procedures and mechanisms for managing dense IoT deployments in a remote and efficient way.

Type: Management

Programming language/based technologies: Testbed manager is developed in Java.

Exposed APIs: REST web service; web-based client called wisegui and written in JavaScript

Target User(s): Researcher/Experimenter, Network manager

References: Wisebed project, http://wisebed.eu/site/

Adopted Standard(s):

Assessment

Capabilities/Opportunities:

Known Limitations/Threats: Specific to documents

TABLE 25 – D-CASE MONITORING TOOL

Proposing Partner: KEIO

Info

Name: D-Case Monitoring Tool

Owner(s): D-Case Consortium / Keio University

ID Card

Link to code: http://www.il.is.s.u-tokyo.ac.jp/deos/dcase/

Project, plus link: http://www.dependable-os.net/osddeos/index-e.html

Patents or other IPR exploitation (licenses): Copyright Fuji Xerox Co., Ltd. 2011-2012 (http://www.il.is.s.u-tokyo.ac.jp/deos/dcase/Download.html)

Anatomy

Description: An Eclipse-based graphical tool that uses dependability cases, a kind of assurance case, to show that a system is ready to avoid or address its potential faults (risks). To make city data dependable (i.e., available, reliable and consistent) for the applications accessing it, D-Case can be used to ensure the dependability of communications between the applications and the running systems that disseminate the city data to the network. D-Case achieves this by monitoring the applications, the running systems, and the networks between them.

Type: Software

Programming language/based technologies: Java, Goal Structuring Notation (GSN)

Exposed APIs: N/A

Target User(s): Developers of ClouT applications; managers of ClouT applications and the platform

References: Matsuno, Y.; Nakazawa, J.; Takeyama, M.; Sugaya, M.; Ishikawa, Y., "Towards a Language for Communication among Stakeholders," Dependable Computing (PRDC), 2010 IEEE 16th Pacific Rim International Symposium on, pp. 93-100, 13-15 Dec. 2010

Adopted Standard(s): Goal Structuring Notation

Assessment

Capabilities/Opportunities: Capable of (1) authoring assurance cases to specify strategies and evidence to achieve dependability in arbitrary systems, and (2) executing assurance cases to collect evidence from running systems.

Known Limitations/Threats: The current implementation does not allow authoring complex D-Case documents, such as those including inter-document relationships, combinations of multiple conditions, etc.

TABLE 26 – SELF-HEALING FRAMEWORK FOR SENSORY DATA

Proposing Partner: NII

Info

Name: Self-Healing Framework for Sensory Data

Owner(s): National Institute of Informatics

ID Card

Link to code: N/A

Project, plus link: N/A

Patents or other IPR exploitation (licenses): N/A

Anatomy

Description: This framework provides runtime detection, classification, and correction capabilities for faults that appear in sensory data. It includes a fault classification model and a model-learning component based on statistical pattern matching.

Type: Conceptual Framework/Tool Specification and Proof-of-Concept Implementation

Programming language/based technologies: N/A

Exposed APIs: N/A

Target User(s): Developers who develop self-healing components in CPaaS

References: [SHF]

Adopted Standard(s): N/A

Assessment

Capabilities/Opportunities: This framework provides detection, classification, and correction capabilities for sensory data.

Known Limitations/Threats: This framework targets densely deployed sensor networks. It will not work well in sparsely deployed sensor networks, because it adopts fault detection based on neighborhood voting.
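The limitation above follows from how neighborhood voting works in general: a reading is judged against the readings of nearby sensors, so a node with too few neighbors cannot gather a meaningful vote. The following Python sketch illustrates the general idea only. It is not the framework's implementation, and the function name, the tolerance and the minimum-neighbor quorum are hypothetical choices made for illustration.

```python
from typing import Optional, Sequence


def vote_on_reading(
    reading: float,
    neighbor_readings: Sequence[float],
    tolerance: float = 2.0,   # hypothetical: largest difference counted as agreement
    min_neighbors: int = 3,   # hypothetical: quorum needed for a valid vote
) -> Optional[bool]:
    """Return True if the reading is voted faulty, False if voted normal,
    and None when there are too few neighbors to hold a vote."""
    if len(neighbor_readings) < min_neighbors:
        return None  # sparse deployment: no reliable verdict
    disagree = sum(1 for n in neighbor_readings if abs(n - reading) > tolerance)
    # Faulty when a strict majority of neighbors disagree with the reading.
    return disagree * 2 > len(neighbor_readings)


if __name__ == "__main__":
    dense = [20.1, 19.8, 20.4, 20.0, 19.9]
    print(vote_on_reading(35.0, dense))    # outlier among many neighbors
    print(vote_on_reading(20.2, dense))    # consistent with its neighbors
    print(vote_on_reading(35.0, [20.1]))   # too few neighbors to decide
```

With a single neighbor the function declines to decide, which mirrors why such a scheme suits dense deployments better than sparse ones.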

TABLE 27 – FAULT CLASSIFICATION MODEL

Proposing Partner: NII

Info

Name: Fault Classification Model

Owner(s): National Institute of Informatics

ID Card

Link to code: N/A

Project, plus link: N/A

Patents or other IPR exploitation (licenses): N/A

Anatomy

Description: This model provides a decision tree to classify data faults into four types: bias fault, drift fault, malfunction fault, and random fault. Applied to sensory readings, this model proposes a complete and consistent classification based on the frequency and continuity of occurrence and on observable and learnable patterns.

Type: Model

Programming language/based technologies: N/A

Exposed APIs: N/A

Target User(s): Self-healing framework for sensory data

References: [CM]

Adopted Standard(s): N/A

Assessment

Capabilities/Opportunities: It provides a decision tree for sensory fault classification.

Known Limitations/Threats: This model applies only to sensory data faults from physical sensors, not to those from virtual sensors.

TABLE 28 – HYPERTABLE

Proposing Partner: ENG

Info

Name: Hypertable

Owner(s): Hypertable Inc.

ID Card

Link to code: http://hypertable.com/download/0977

Project, plus link: Hypertable: http://hypertable.com/

Patents or other IPR exploitation (licenses): Open source licence: GPL Version 3

Anatomy

Description: Hypertable is a NoSQL database based on Google BigTable technology. It is a highly scalable, non-relational database implementation written in C++, and it achieves high performance thanks to the different distributed file systems that can be attached to it (Hadoop HDFS or Ceph).

Type: Software

Programming language/based technologies: C++

Exposed APIs: Drivers for the following languages: Java, PHP, Python, Perl, Ruby, C++

Target User(s): Data Access Services of ClouT CPaaS

References:

- Belvedere Trading (financial)
- eBay (Internet search)
- Rediff.com (Internet email)
- Stratoshear (mobile advertising)
- Tiscali S.p.A. (telecommunications)
- Wirth Research (automotive engineering)

An exhaustive list of Hypertable implementations and partners is available at http://hypertable.com/customers/

Adopted Standard(s): Google File System (GFS), Hadoop MapReduce, Google Bigtable, Sawzall

Assessment

Capabilities/Opportunities: Very high performance if used on top of Hadoop HDFS or Ceph storage. It provides data versioning based on timestamps (which, for example, is not a feature of object storage such as Ceph). It could be used to archive small data such as single sensor measurements and as a search engine for this kind of data. Well suited for analyzing log data.

Known Limitations/Threats: SQL is not completely supported; only a subset of the SQL language is implemented. No security schemes are implemented on top of the system; security is delegated to the lower layer (file system or object storage). To implement access security, it is necessary to build a system on top of Hypertable.

TABLE 29 – HBASE

Proposing Partner: ENG

Info

Name: Hadoop HBase

Owner(s): Apache Foundation

ID Card

Link to code: http://svn.apache.org/viewvc/hbase/trunk

Project, plus link: Apache HBase: http://hbase.apache.org/

Patents or other IPR exploitation (licenses): Apache Software License, Version 2.0, http://hbase.apache.org/license.html

Anatomy

Description: HBase is a NoSQL database implementation, part of the Apache Hadoop project. It is a distributed database modeled on Google BigTable technology and written in Java. It permits random access to billions of data items. The goal of the project is to host very large tables (billions of rows × millions of columns). Apache HBase provides Bigtable capabilities to Hadoop and HDFS.

HBase is not an RDBMS but a column-oriented database, and it does not support SQL queries. Each table is based on columns and rows, where each row must have a primary key and access must be performed using that key.

Type: Software

Programming language/based technologies: Java

Exposed APIs: RESTful APIs

Target User(s): Systems that interact with RESTful APIs

References:

- Adobe
- Facebook
- Twitter

An exhaustive list of HBase implementations is available at http://wiki.apache.org/hadoop/Hbase/PoweredBy

Adopted Standard(s): RESTful APIs, HDFS, MapReduce paradigm

Assessment

Capabilities/Opportunities: High performance and data throughput. HBase is designed for cases where a large amount of data must be accessible in real time. It is possible, as Adobe did in its system using cheap hardware, to access data in 50 ms.

Hadoop MapReduce is natively supported in HBase, which means that processing is distributed across the servers. MapReduce makes it possible to increase performance simply by adding hardware (clusters).

Known Limitations/Threats: Designed for large-scale data analysis. Not suited for real-time data analysis. The columns are not sorted (the concept of sorted columns does not exist).
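Because access must go through the row key, the key design decides which queries are cheap. The sketch below models a table as a sorted list of row keys in plain Python (it does not use any HBase client) and shows how a composite key lets a prefix lookup retrieve all measurements of one sensor. The key layout `sensor_id#timestamp`, the class and the function names are assumptions made for illustration, not taken from this deliverable.

```python
import bisect
from typing import Dict, List, Tuple


class ToyTable:
    """In-memory stand-in for a table whose rows are ordered by row key."""

    def __init__(self) -> None:
        self._keys: List[str] = []                  # row keys kept sorted
        self._rows: Dict[str, Dict[str, str]] = {}  # row key -> columns

    def put(self, row_key: str, columns: Dict[str, str]) -> None:
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def get(self, row_key: str) -> Dict[str, str]:
        """Point lookup by row key; an unknown key yields an empty row."""
        return self._rows.get(row_key, {})

    def scan_prefix(self, prefix: str) -> List[Tuple[str, Dict[str, str]]]:
        """Return all rows whose key starts with the prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            result.append((key, self._rows[key]))
        return result


def make_row_key(sensor_id: str, epoch_seconds: int) -> str:
    # Zero-padding keeps lexicographic order equal to chronological order.
    return f"{sensor_id}#{epoch_seconds:010d}"
```

Scanning with the prefix `temp-01#` (including the separator) returns that sensor's rows in time order without touching the rows of `temp-010` or `temp-02`.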

TABLE 30 – HADOOP HDFS

Proposing Partner: ENG

Info

Name: Hadoop HDFS

Owner(s): Apache

ID Card

Link to code: http://mirror.nohup.it/apache/hadoop/common/hadoop-1.2.0/

Project, plus link: Apache Hadoop: http://hadoop.apache.org/

Patents or other IPR exploitation (licenses): Apache Software License, Version 2.0, http://projects.apache.org/projects/hadoop.html

Anatomy

Description: The Hadoop Distributed File System (HDFS) is a distributed, general-purpose file system designed to run on commodity hardware, which enables real-time streaming in an efficient way.

It has a highly fault-tolerant architecture and can run on low-cost hardware. It is designed to support large files: a typical file stored on HDFS is gigabytes in size and can reach terabytes.

In a typical corporate installation, HDFS is distributed over hundreds or thousands of machines, which minimizes the impact of hardware failures. Because it is built in Java, Hadoop HDFS can be ported to any operating system that supports Java.

The architecture of HDFS is based on one node that contains the metadata and separate node(s) containing the data files. The data blocks are also replicated on more than one node for fault tolerance. Node rebalancing: if the free space on one node reaches a defined threshold, the data are rebalanced between the nodes.

Type: Software Component

Programming language/based technologies: Java

Exposed APIs: The Hadoop HDFS APIs, in Perl, Python, Ruby and PHP, are exposed through Thrift (http://wiki.apache.org/hadoop/HDFS-APIs)

Target User(s): Software systems that need to store large quantities of data safely.

References:

- Cloudera
- IBM
- Talend

An exhaustive list of Hadoop HDFS implementations is available at http://wiki.apache.org/hadoop/PoweredBy

Adopted Standard(s): HTTP, HTTPS, FTP, POSIX

Assessment

Capabilities/Opportunities: High-performance, extremely scalable and fault-tolerant storage. It could be used alone as back-end storage for Hypertable (a stand-alone deployment) and as the storage layer for the MapReduce framework for data processing.

Hadoop HDFS is the right choice if you need high performance in file storage scenarios.

In scenarios where it is necessary to store data in a NoSQL database associated with big data (for example, large pictures), Hadoop HDFS and HBase are the right choice because of the high interoperability of the two systems. This combination makes it possible to create a system with high performance and high scalability using low-cost hardware.

It could also be used as a file system implementation for the OpenStack Swift object store, though this feature is currently under development [9].

Known Limitations/Threats: HDFS is inefficient at handling small files. Another limitation lies in its single-node namespace server architecture: since the name-node is the single container of the file system metadata, it becomes a bottleneck for file system growth. For example, in order to store 100 million files, a name-node should have at least 60 GB of RAM.

[9] https://issues.apache.org/jira/browse/HADOOP-8545
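The sizing example in the HDFS table (100 million files requiring at least 60 GB of name-node RAM) implies roughly 600 bytes of name-node memory per file. The short Python sketch below extrapolates from that single data point. The linear scaling and the use of decimal gigabytes are simplifying assumptions, and the function name is hypothetical; real memory usage also depends on factors such as block counts and file name lengths.

```python
BYTES_PER_GB = 10**9  # decimal gigabytes, assumed for simplicity

# Derived from the figure quoted in the table: 60 GB of RAM for 100 million files.
BYTES_PER_FILE = 60 * BYTES_PER_GB / 100_000_000


def estimate_namenode_ram_gb(num_files: int) -> float:
    """Linear extrapolation of name-node RAM (in GB) for a given file count."""
    return num_files * BYTES_PER_FILE / BYTES_PER_GB


if __name__ == "__main__":
    for files in (10_000_000, 100_000_000, 250_000_000):
        print(files, "files ->", estimate_namenode_ram_gb(files), "GB")
```

The estimate makes the small-file problem concrete: the memory cost is paid per file, regardless of how little data each file holds.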

TABLE 31 – OPEN STACK SWIFT

Proposing Partner: ENG

Info

Name: Open Stack SWIFT

Owner(s): OpenStack Foundation

ID Card

Link to code: https://github.com/openstack/swift

Project, plus link: Open Stack Swift: http://www.openstack.org/software/openstack-storage/

Patents or other IPR exploitation (licenses): Swift is open source software released under the Apache 2 license.

Anatomy

Description: Open Stack Swift, a cloud storage platform that is part of the Open Stack project, is one of the most notable Cloud Object Storage platforms as well as….. It allows object data to be stored and retrieved using virtual containers. The architecture of Open Stack Swift is modular and scalable, since container nodes can be added and removed dynamically to improve performance. Access to data is controlled by the proxy server, which balances the requests; its security can be managed either by the Open Stack Identity Service (code-named Keystone) or by the Swauth subsystem. The data are distributed and encrypted with the algorithm chosen by the end-user (among AES, DES, RSA, and some others).

Type: Software

Programming language/based technologies: Python

Exposed APIs: RESTful APIs provided by OpenStack services

Target User(s): All end-user applications and ClouT architectural components that need to use ClouT storage and are able to interact with RESTful APIs

References: Several commercial storage services are based on Swift, including Rackspace [10] (from which Swift originated), Cloudscaling [11], KTucloud Services [12], HP [13] and Internap [14]. Among the companies supporting the OpenStack Foundation [15] are Ubuntu, HP, RedHat and Suse.

Adopted Standard(s): RESTful; HTTP; MD5 [16]

Assessment

Capabilities/Opportunities: The goal of Open Stack Swift is to store petabytes of data in a distributed, standard infrastructure of clustered servers.

Open Stack Swift is not a file-based storage system but a highly distributed cloud storage infrastructure with redundancy capabilities.

Thanks to the replication of data (stored and replicated on more than one server), Open Stack Swift offers a high level of safety in case of failure of one server.

Known Limitations/Threats: A stored object must be smaller than 5 GB; nevertheless, there is no limit on data streams. Its model is not fully CDMI compliant: it does not support a multi-level container structure and it currently does not provide CDMI APIs (for a CDMI layer on top of SWIFT, see Object Storage GE - FI-WARE Implementation).

[10] http://www.rackspace.com/cloud/
[11] http://www.cloudscaling.com
[12] https://en.ucloudbiz.olleh.com/portal/ktcloudportal.epc.productintro.ss.info.html
[13] https://www.hpcloud.com/sites/default/files/Right-storage-for-you.pdf
[14] http://www.internap.com/agile/flexible-cloud-hosting-solutions/cloud-storage/
[15] http://www.openstack.org/foundation/companies/
[16] http://en.wikipedia.org/wiki/MD5
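One common way to work within a per-object size limit such as the 5 GB one noted above is to split a large payload into segments that each stay under the limit and to store them as separate objects. The Python sketch below only computes the segment boundaries. It makes no calls to Swift, and the segment naming scheme, the helper name and the reading of "5 GB" as 5 GiB are assumptions for illustration.

```python
from typing import List, Tuple

MAX_OBJECT_BYTES = 5 * 1024**3  # the quoted 5 GB limit, taken here as 5 GiB (assumption)


def plan_segments(
    total_bytes: int,
    object_name: str,
    segment_bytes: int = MAX_OBJECT_BYTES - 1,  # stay strictly below the limit
) -> List[Tuple[str, int, int]]:
    """Return (segment_name, start_offset, end_offset_exclusive) tuples
    that cover the byte range [0, total_bytes) without gaps or overlap."""
    if total_bytes < 0 or segment_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and segment_bytes must be > 0")
    segments = []
    start = 0
    index = 0
    while start < total_bytes:
        end = min(start + segment_bytes, total_bytes)
        # Hypothetical naming: zero-padded index so segments sort in upload order.
        segments.append((f"{object_name}/{index:08d}", start, end))
        start = end
        index += 1
    return segments


if __name__ == "__main__":
    for name, start, end in plan_segments(12 * 1024**3, "archive.tar"):
        print(name, start, end)
```

A client would then upload each byte range as its own object and keep the ordered list of segment names so that the payload can be reassembled on download.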

TABLE 32 – CEPH

Proposing Partner: ENG

Info

Name: CEPH

Owner(s): Inktank, Inc.

ID Card

Link to code: https://github.com/ceph/ceph

Project, plus link: CEPH Storage, www.ceph.com

Patents or other IPR exploitation (licenses): Ceph is provided under the LGPL version 2.1 license

Anatomy

Description: Ceph Storage is a free cloud storage system designed to present objects of all types (for example blocks of data or files of any type) from a single distributed node cluster. The Ceph storage architecture is completely distributed; this means that there is no single point of failure and that a high level of availability is achieved.

CEPH supports up to exabyte object size. The data replication mechanism across the distributed nodes makes its architecture very robust and fault tolerant. It is possible to scale the object storage environment using inexpensive hardware and to replace that hardware easily when a malfunction occurs.

Ceph provides clients with a rich library to access the storage; the libraries are available in C, C++, Java, Python and PHP.

Type: Software

Programming language/based technologies: C++

Exposed APIs: Ceph exposes APIs in C, C++, Java, Python and PHP.

Target User(s): Software that needs to store big objects.

References:

- Ceph: Reliable, Scalable, and High-Performance Distributed Storage (http://ceph.newdream.net/papers/weil-thesis.pdf)
- RADOS: A Fast, Scalable, and Reliable Storage Service for Petabyte-scale Storage Clusters (http://ceph.newdream.net/papers/weil-rados-pdsw07.pdf)

Adopted Standard(s): RESTful

Assessment

Capabilities/Opportunities: Highly scalable storage platform that supports exabytes of data. It could be used as back-end storage attached to CDMI Proxy.

The Ceph storage platform can be used as a back-end system for the Hypertable database. If there is a need to store big files and associated metadata, this solution, combined for example with a Cloud DB for storing the metadata (such as Hypertable or HBase), can be desirable because it offers high performance on low-cost hardware.

Known Limitations/Threats: The main limitation of CEPH is its single name-node architecture, tracking where data is stored on the data nodes in the cluster. Currently, it does not provide CDMI interfaces; however, this feature is on the roadmap.

TABLE 33 – OBJECT STORAGE GE - FI-WARE IMPLEMENTATION

Proposing Partner: ENG

Info

Name: Object Storage GE - FI-WARE Implementation

Owner(s):

ID Card

Link to code: https://github.com/osaddon/cdmi

Project, plus link: FP7 EU Research Project FI-WARE. This component is included in the FI-WARE Generic Enabler catalogue, http://catalogue.fi-ware.eu/

Patents or other IPR exploitation (licenses): The code is under Apache License Version 2.0.

It is an implementation of the FI-WARE ObjectStorage Specification, which is Copyright © 2013 INTEL. Usage terms are specified in the FI-WARE Open Specifications Interim Legal Notice [17].

The implementation of this Generic Enabler is available to FI-PPP partners according to the terms and conditions specified in the FI-PPP Collaboration Agreement [18].

Anatomy

Description: This is an implementation of the FI-WARE Object Storage Specification, developed in the context of FP7 EU Research Project FI-WARE and included in the FI-WARE Generic Enabler (GE) catalogue. This GE provides an implementation of CDMI interfaces for OpenStack Swift.

Type: Software component

Programming language/based technologies: Python

Exposed APIs: RESTful APIs

Target User(s): All ClouT services, end-user applications and software components at sensor level that need to store and retrieve data from the Cloud

References:

Adopted Standard(s): CDMI

Assessment

Capabilities/Opportunities: This component allows attaching OpenStack SWIFT to a CDMI layer (for example the one provided by CDMI Proxy), since this object storage solution currently does not provide such a standard interface for accessing its objects and containers.

Known Limitations/Threats: CDMI allows containers to be nested, that is, a container can have other containers inside. CDMI containers and data objects form a tree-like structure.

This CDMI implementation, in its current version, supports only the container, data object and capability specifications. In line with the OpenStack container and data object model, it only allows a user to create a container at the top level; once a container is created, it can only hold data objects. This flat model could limit how data can be organized.

[17] https://forge.fi-ware.eu/plugins/mediawiki/wiki/fiware/index.php/Open_Specifications_Interim_Legal_Notice
[18] http://www.fi-ppp.eu/guidance-notes-for-proposers/
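A frequently used workaround for a flat container model is to keep a single top-level container and encode the rest of the hierarchy in the object name. The Python sketch below shows that mapping and a prefix-based listing that emulates a nested container. It is an illustration only: the function names and the example paths are hypothetical, and the code does not call the Generic Enabler or Swift.

```python
from typing import Iterable, List, Tuple


def to_flat(path: str) -> Tuple[str, str]:
    """Map a nested path such as '/city/sensors/temp/r1.json' to
    (top-level container, object name carrying the remaining path)."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("path needs a container and an object name")
    return parts[0], "/".join(parts[1:])


def list_children(object_names: Iterable[str], prefix: str) -> List[str]:
    """Emulate listing a nested container: return the distinct names found
    directly under 'prefix' (emulated sub-containers end with '/')."""
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    children = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue  # the name is the prefix itself, nothing beneath it
        head, sep, _ = rest.partition("/")
        children.add(head + sep)
    return sorted(children)


if __name__ == "__main__":
    container, obj = to_flat("/city/sensors/temp/r1.json")
    print(container, obj)
    names = ["sensors/temp/r1.json", "sensors/noise/r1.json", "readme.txt"]
    print(list_children(names, "sensors"))
```

The trade-off is that the hierarchy exists only by naming convention, so operations such as renaming an emulated sub-container require rewriting every object name under it.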

TABLE 34 – CDMI PROXY

Proposing Partner: ENG

Info

Name: CDMI Proxy

Owner(s): Ilja Livenson, KTH PDC

ID Card

Link to code: https://github.com/livenson/vcdm

Project, plus link: FP7 European Research Project VENUS-C, http://resources.venus-c.eu/cdmiproxy/docs/intro.html

Patents or other IPR exploitation (licenses): The terms of use of the software are governed by the Apache 2 license. Copyright (c) 2011, Ilja Livenson, KTH PDC. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

- Neither the name of the KTH PDC nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

Anatomy

Description: CDMI Proxy is an open source implementation of the SNIA CDMI specification. CDMI (Cloud Data Management Interface) defines a standard interface to access objects (creating, retrieving, updating and deleting) in cloud storage.

CDMI Proxy, in its latest implementation, can manage objects in the following types of cloud storage:

- Local disk
- AWS S3
- Azure Blob

Type: Software component

Programming language/based technologies: CDMI-Proxy has been written with the Twisted network engine (http://twistedmatrix.com/trac/wiki/TwistedProject) in Python (v2.6+) and has been tested on a number of platforms: Linux, Windows and MacOS X. Required components: CouchDB (v. 1+), PyCrypto (https://www.dlitz.net/software/pycrypto/), OpenSSL.

Exposed APIs: CDMI-Proxy is a server exposing a CDMI-compliant interface (see CDMI table)

Target User(s): All ClouT services, end-user applications and software components at sensor level that need to store and retrieve (large amounts of) data from the Cloud.

References:

Adopted Standard(s): CDMI

Assessment

Capabilities/Opportunities: Other storage back-ends could be attached to CDMI Proxy to address different ClouT storage needs: (a) an open source Cloud Object Storage (see for example Open Stack SWIFT and CEPH) or a Distributed File System (see for example Hadoop HDFS) for storing large amounts of data; (b) a powerful Cloud NoSQL database, which could be more suitable for storing small data (such as single measurements from sensor devices) that need to be accessed very frequently (for analytics purposes) and for which this type of DB offers more search and analysis support than an object store does (see for example the Hypertable and Hadoop HBase component descriptions).

Known Limitations/Threats: Currently neither open-source object storage nor Distributed File System back-ends are supported. Furthermore, it should be updated in order to support more recent versions of CouchDB (the metadata store).

TABLE 35 – SOA3

Proposing Partner: ENG

Info

Name: SOA3

Owner(s): ENG

ID Card

Link to code: N/A

Project, plus link: N/A

Patents or other IPR exploitation (licenses): N/A

Anatomy

Description: A set of services for User Management, Authentication, Authorization, Accounting, Identity Federation and Delegation.

Type: Web Applications/Java API

Programming language/based technologies: Java/J2EE

Exposed APIs: RESTful APIs

Target User(s): End Users (for User Management and Policies definition) and Applications (for Authentication and Authorization queries)

References: Vision Cloud (www.visioncloud.eu), Imarine (www.i-marine.eu)

Adopted Standard(s): RESTful; HTTP; SAML; XACML

Assessment

Capabilities/Opportunities: SOA3 provides a set of services that fully manage Authentication, Authorization and Accounting, supporting CRUD operations on users, roles, groups, and authorization policies. It also supports SAML-based Identity Federation and Access Delegation.

Known Limitations/Threats: Does not support X.509 authentication or ACL-based authorization.

References

IoT. Strategic Research Agenda. Available online: http://www.internet-of-things-research.eu/pdf/IoT_Cluster_Strategic_Research_Agenda_2011.pdf

Cloud Computing. The future of Cloud Computing, Opportunities for European Cloud Computing beyond 2010. Online: cordis.europa.eu/fp7/ict/ssai/events-20100126-cloud-computing_en.html

World Future Council. Uexküll, Jakob. Shaping our future: Creating the World Future Council. Foxhole, Devon, United Kingdom. UNT Digital Library. http://digital.library.unt.edu/ark:/67531/metadc13722/. Accessed November 14, 2012.

Drools-Camel Server. (n.d.). Retrieved from http://docs.jboss.org/drools/release/6.0.0.Beta3/droolsjbpm-integration-docs/html_single/index.html

EUROTECH. (n.d.). Retrieved 07 02, 2013, from http://www.eurotech.com/en/products/software+services/everyware+device+cloud/edc+what+it+is

Fiware Data Handling. (n.d.). Retrieved 07 05, 2013, from http://catalogue.fi-ware.eu/enablers/gateway-data-handling-ge-solcep

FI-WARE PUB/SUB. (n.d.). Retrieved 07 03, 2013, from https://forge.fi-ware.eu/plugins/mediawiki/wiki/fiware/index.php/FIWARE.OpenSpecification.Data.PubSub

Fiware-training Webinars (12 2012).

IBM. (2012). Building Smarter Planet Solutions with MQTT and IBM WebSphere MQ Telemetry. IBM redbooks.

Esper. Inc, E. T. (2012). Esper Reference. Version 4.7.0.

iPojo. (n.d.). Retrieved 07 03, 2013, from http://felix.apache.org/documentation/subprojects/apache-felix-ipojo.html

jBPM. (2013). 03: 07.

JSON. (n.d.). Retrieved 07 03, 2013, from http://www.json.org/

JSON ENGINE. (n.d.). Retrieved 07 03, 2013, from http://code.google.com/p/jsonengine/

BPEL. Leymann, P. D. (2010). BPEL vs. BPMN 2.0 should you care? Germany.

M2M.IO. (n.d.). Retrieved 07 02, 2013, from http://help.m2m.io/entries/21577233-connecting-to-m2m-io

MongoDB. (n.d.). Retrieved 07 03, 2013, from http://www.mongodb.org/

MongoDB Overview. (n.d.). Retrieved from http://www.10gen.com/products/mongodb

Mosquitto. (n.d.). Retrieved 07 02, 2013, from http://mosquitto.org/

MQTT. (n.d.). Retrieved 07 02, 2013, from http://mqtt.org/

Native JSON. (n.d.). Retrieved 07 03, 2013, from https://developer.mozilla.org/en-US/docs/Using_native_JSON

NGSI API. (n.d.). Retrieved 07 03, 2013, from https://forge.fi-ware.eu/plugins/mediawiki/wiki/fiware/index.php/Publish/Subscribe_Context_Broker_-_Context_Awareness_Platform_-_User_and_Programmer_Guide

NGSI10. (n.d.). Retrieved 07 03, 2013, from http://www.openmobilealliance.org/Technical/release_program/docs/NGSI/V1_0-20101207-C/OMA-TS-NGSI_Context_Management-V1_0-20100803-C.pdf

BPEL. (2009). ORACLE BPEL Process Manager. ORACLE DATA SHEET.

QEST. (n.d.). Retrieved 07 02, 2013, from http://qest.me/

Really Small Message Broker. (n.d.). Retrieved 07 02, 2013, from https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=d5bedadd-e46f-4c97-af89-22d65ffee070

RFC4627. (n.d.). Retrieved 07 03, 2013, from http://www.ietf.org/rfc/rfc4627.txt

BPMN. Stephen A. White, I. C. Introduction to BPMN.

UBJSON. (n.d.). Retrieved 07 02, 2013, from http://ubjson.org/

WSBPEL. (n.d.). Retrieved 07 02, 2013, from https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsbpel

XIVELY. (n.d.). Retrieved 07 02, 2013, from http://xively.com/

Event-Driven Application Servers, Alexandre Vasseur, T. B. (2007).

MQTTs. Andy Stanford-Clark, H. L. (2007). MQTT For Sensor Networks (MQTTs). Protocol Specification Version 1.0.

DDS and CEP. Angelo Corsaro, P. (n.d.). Stream processing with DDS and CEP .

BPEL. (n.d.). Retrieved 07 05, 2013, from http://www.oracle.com/technetwork/middleware/bpel/overview/index.html

BPEL. Definition. (n.d.). Retrieved from http://en.wikipedia.org/wiki/Business_Process_Execution_Language

BPMN. (n.d.). Retrieved 07 02, 2013, from http://www.bpmn.org/

BPMN. (2010). BPMN 2.0 by Example. OMG.

IOT. Butler email Discussion on IOT model.

Context Broker. (n.d.). Retrieved 07 03, 2013, from http://catalogue.fi-ware.eu/enablers/publishsubscribe-context-broker-context-awareness-platform

IOT-A. (IoT-A (257521)). D1.2 Internet-of-Things Architecture IOT-A. Project Deliverable D1.2 – Initial Architectural Reference Model for IoT.

Drools Expert. (n.d.). Retrieved 07 03, 2013, from http://www.jboss.org/drools/drools-expert

Drools Expert . (n.d.). Retrieved 07 05, 2013, from http://docs.jboss.org/drools/release/6.0.0.Beta3/drools-expert-docs/html_single/index.html

DoW. FP7 EU Research Project “Cloud of Things for empowering the citizen clout in smart cities ", Grant agreement no: 608641, Version date: 2013-03-26 - Annex I - "Description of Work"

DoW [D2.1]. FP7 EU Research Project “Cloud of Things for empowering the citizen clout in smart cities", Grant agreement no: 608641, Version date: 2013-03-26 - Deliverable D2.1 - Reusable components, techniques and standards for the CIaaS

ROBUSTCOMP. Florian Wagner, Benjamin Kloepper, Fuyuki Ishikawa, Shinichi Honiden, Towards Robust Service Compositions in the Context of Functionally Diverse Services, The 21st International World Wide Web Conference (WWW 2012), pp.969-978, April 2012

QoSSEL. Adrian Klein, Fuyuki Ishikawa, Shinichi Honiden, Towards Network-aware Service Composition in the Cloud, The 21st International World Wide Web Conference (WWW 2012), pp.959-968, April 2012

BPMNVeri. Kenji Watahiki, Fuyuki Ishikawa, Kunihiko Hiraishi, Formal Verification of Business Processes with Temporal and Resource Constraints, The 2011 IEEE International Conference on Systems, Man, and Cybernetics (IEEE SMC 2011), pp.1173-1180, October 2011

ECAVeri. Fuyuki Ishikawa, Basem Suleiman, Kayoko Yamamoto, Shinichi Honiden, Physical Interaction in Pervasive Computing: Formal Modeling, Analysis and Verification, The ACM International Conference on Pervasive Services (ICPS 2009), pp.133-140, July 2009

FUSE. Filesystem in Userspace, http://en.wikipedia.org/wiki/Filesystem_in_Userspace

OCCI. Open Cloud Computing Interface, http://occi-wg.org/

S3. Amazon Simple Storage Service, http://aws.amazon.com/documentation/s3/

INSERT. Ryuichi Takahashi, Fuyuki Ishikawa, Kenji Tei, Yoshiaki Fukazawa, Intention-based Automated Composition Approach for Coordination Protocol, The IEEE 11th International Conference on Web Services (ICWS 2013 Application & Experience Track), June 2013 (to appear)

SHF. Valentina Baljak, Kenji Tei, and Shinichi Honiden, Fault Classification and Model Learning from Sensory Readings Framework for Fault Tolerance in Wireless Sensor Networks, IEEE Eighth International Conference on Intelligent Sensors, Sensor Networks and Information Processing (IEEE ISSNIP 2013), 2013

SFC. Valentina Baljak, Tei Kenji, and Shinichi Honiden, Faults in Sensory Readings: Classification and Model Learning, Sensors & Transducers Journal, vol.18, pp.177-187, 2013.

LBC. Ehsan Ullah Warriach, Kenji Tei, and Marco Aiello, A Machine Learning Approach for Identifying and Classifying Faults in Wireless Sensor Networks, the 10th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC12), pp. 618-625, 2012

CM. Valentina Baljak, Kenji Tei, and Shinichi Honiden, Classification of Faults in Sensor Readings with Statistical Pattern Recognition, The Sixth International Conference on Sensor Technologies and Applications (SENSORCOMM 2012), pp.270-276 , 2012.
