Industrial Energy Management in the Cloud

Hugo André Gomes Sequeira

Thesis to obtain the Master of Science Degree in Information Systems and Computer Engineering

Supervisor: Prof. Dr. Paulo Jorge Fernandes Carreira

Examination Committee
Chairperson: Prof. Dr. Ernesto José Marques Morgado
Supervisor: Prof. Dr. Paulo Jorge Fernandes Carreira
Member of the Committee: Prof. Dr. Mário Serafim dos Santos Nunes

November 2014

Acknowledgments

I would like to begin by thanking my supervisor, Prof. Dr. Paulo Carreira, for his excellent guidance, which inspired everyone throughout the writing of this thesis and of all the others he supervised this year. Thank you for all the support, all the criticism, all the praise, and for believing in our potential.

I would also like to thank my supervisor, Dr. Thomas Goldschmidt, for his wisdom, his supervision, and the support he provided during my stay at ABB. I will be forever grateful that he believed in me and accepted me into this great company. Otherwise, I would have missed out on a great deal, both personally and academically.

To my family, especially my parents and grandparents, for all the affection they have always given me and for all their support in the pursuit of my dreams. To João Loff, Alexandre Almeida, Viteche Ashvin, Hugo Ramos, Sérgio Isidoro, Tiago Aguiar, Edgar Santos, Nuno Teles, and all the other friends and colleagues who were close companions throughout my school life. Without them, I would never have made it here, and for that I will be eternally grateful.

To all the friends I made in the Netherlands and in Germany during my time abroad. In particular, to my German friend Philipp Piroth, for the support and kindness from the very first day, which became indispensable for living happily in Germany and carrying out my work successfully.

To all those I have lost, but who were always by my side and ready to support me in bad times and good. I will never forget you.

To all the co-workers I had the pleasure of meeting at ABB, who helped me so much in achieving my goals.

To all of you, and to everyone else who has been part of my life,

a BIG THANK YOU; to you I dedicate this thesis.

Hugo Sequeira

Resumo

Industrial organizations use Energy Management Systems (EMS) to monitor, control, and optimize their energy consumption. Industrial systems such as these are complex and expensive because of their demanding performance, reliability, and interoperability requirements. Industry also faces difficulties in operating current EMS when it seeks centralized monitoring of energy consumption and CO2 emissions across its production sites, integration of energy and automation data, and comparative analysis of energy efficiency across different production operations. In addition, industry faces big data problems caused by the technological evolution of metering equipment, which produces ever more measurements, with more detail and at higher frequency. The resulting volumes of data make it difficult to manage all this information in real time. This thesis therefore proposes a cloud-based EMS solution to address these difficulties and to derive new and richer information in real time. The impact of this thesis is far reaching, opening innovative possibilities for industrial organizations to detect patterns of inefficiency in their energy consumption and to react more quickly to changes in their environment. The relevance of the proposed solution was confirmed by evaluating how it addresses use cases that are missing from these industrial systems. Its implementation feasibility and its performance were also evaluated by implementing a prototype and assessing its behavior under different stress tests.

Keywords: Energy Efficiency, Industrial Energy Management, Industrial Energy Management Systems, Demand Response (DR), Cloud Computing, Real-Time Computing

Abstract

Industrial organizations use Energy Management Systems (EMS) to monitor, control, and optimize their energy consumption. Industrial EMS are complex and expensive systems due to their unique requirements of performance, reliability, and interoperability. Moreover, industry is facing challenges with current EMS implementations, such as cross-site monitoring of energy consumption and CO2 emissions, integration between energy and production data, and meaningful energy efficiency benchmarking. Additionally, recent advances in field instrumentation have led to the generation of large quantities of machine data, with much more detail and higher sampling rates, giving rise to big data and creating a challenge for real-time analytics. To address these needs and challenges, this thesis proposes a cloud-native industrial EMS solution with cloud computing capabilities to enable the extraction of actionable knowledge from large amounts of real-time data. The impact of this work is far reaching, as it enables organizations to detect hidden patterns of inefficient energy use and to react to changing events in real time. The feasibility of our proposal was verified by implementing a proof of concept. Its usability was validated by evaluating how it addresses important use cases that the industry currently lacks, and its performance by evaluating how it handles different workloads.

Keywords: Energy Efficiency (EE), Industrial Energy Management, Energy Management Systems (EMS), Demand Response (DR), Cloud Computing, Real-time Computing

Contents

Acknowledgments
Resumo
Abstract
List of Tables
List of Figures
List of Acronyms

1 Introduction
  1.1 Motivation
  1.2 Problem statement and objectives
  1.3 Research methodology and contributions
  1.4 Document organization

2 Research Background
  2.1 Energy Demand Management
    2.1.1 Smart Grids
    2.1.2 Liberalised Electricity Markets
    2.1.3 Demand-Side Management
    2.1.4 Energy Efficiency
    2.1.5 Energy Management
    2.1.6 Energy Management Systems
    2.1.7 Industrial Energy Management Systems
  2.2 Demand Response
    2.2.1 Demand Response Programs
    2.2.2 Demand Response Standards
    2.2.3 Survey on Energy Management Systems
  2.3 Industrial Automation Systems
    2.3.1 Industrial Manufacturing
    2.3.2 Industrial Control Systems
    2.3.3 Industrial Networks
    2.3.4 Cyber-Physical Systems
    2.3.5 Internet of Things
    2.3.6 Industry 4.0
  2.4 Cloud Computing
    2.4.1 Cloud Computing Concepts
    2.4.2 Cloud Computing Service Models
    2.4.3 Cloud Computing Deployment Methods
    2.4.4 Benefits of Cloud Computing
    2.4.5 Risks and Concerns of Cloud Computing
    2.4.6 Service-Level Agreements
    2.4.7 Big Data and Real-Time Analytics

3 Solution
  3.1 Scope Analysis
  3.2 Requirement Analysis
    3.2.1 Big Data Requirements
    3.2.2 Real-Time Requirements
    3.2.3 Functionalities
    3.2.4 Quality Attributes
    3.2.5 Use Case Diagrams
  3.3 Conceptualization
    3.3.1 Energy Monitoring (Real-time computing)
    3.3.2 Energy Analytics (Batch processing)
  3.4 Energy Cloud
    3.4.1 Dashboards
    3.4.2 Analytics API
    3.4.3 Message-oriented Middleware (MOM)
  3.5 Energy Monitoring
    3.5.1 Storm Topologies
    3.5.2 Storm Architecture
    3.5.3 Storm Clusters
    3.5.4 Energy Storm
    3.5.5 Real-Time Messaging Servers
  3.6 Energy Analytics
    3.6.1 Hadoop Cluster
    3.6.2 Timeseries DB and Distributed Historical Storage
  3.7 Deployment

4 Evaluation
  4.1 Conceptual Evaluation
    4.1.1 Use Cases
    4.1.2 Results and Discussion
  4.2 Performance Evaluation
    4.2.1 Data Sets
    4.2.2 Virtual Energy Cloud
    4.2.3 Test Cases
    4.2.4 Results and Discussion

5 Conclusions
  5.1 Achievements
  5.2 Future Work

Bibliography

A Virtual Energy Cloud Models

B Energy Cloud

List of Tables

1.1 Challenges and opportunities for industrial energy management using the cloud

2.1 Demand Response programs used by energy providers and consumers
2.2 Survey of Energy Management Systems
2.3 Automation protocols and standards

4.1 Energy key performance indicators for the industry
4.2 Results of the solution proposal conceptual evaluation

List of Figures

1.1 Current industrial energy management model
1.2 Research methodology

2.1 Smart grid environment
2.2 Impacts of Demand-Side Management programs in production
2.3 The real-time requirement in the industrial sector
2.4 Architecture of Energy Management Systems
2.5 The PDCA cycle model
2.6 Load shifting technique
2.7 Industrial energy power load curves
2.8 Demand Response events
2.9 Architecture of Industrial Automation Systems
2.10 Automation pyramid model
2.11 Architecture of the Cloud
2.12 Responsibilities of cloud providers and users
2.13 Costs of on-premises systems versus cloud systems
2.14 The Lambda Architecture

3.1 Energy Cloud use case diagram
3.2 Energy Monitoring use case diagram
3.3 Energy Analytics use case
3.4 Data Collection use case diagram
3.5 Architecture of a cloud-native industrial EMS
3.6 Energy Monitoring dashboard
3.7 Energy Analytics dashboard
3.8 Storm Topologies
3.9 Storm Architecture
3.10 Energy Storm topology

4.1 Virtual Energy Cloud multi-site view
4.2 Virtual Energy Cloud metrics view
4.3 Storm sampling validation
4.4 Storm test cases
4.5 Storm benchmarking

A.1 Virtual Energy Cloud domain and service layer
A.2 Virtual Energy Cloud gateway layer
A.3 Detailed Virtual Energy Cloud Dashboard - Sites
A.4 Detailed Virtual Energy Cloud Dashboard - Metrics

B.1 Energy Cloud Performance Results
B.2 Detailed Energy Monitoring Dashboard
B.3 Detailed Energy Analytics Dashboard

List of Acronyms

CPS Cyber-physical system
CS Control Server
CSA Cloud Security Alliance
CRUD Create, Read, Update, and Delete
DCS Decision Control System
DER Distributed Energy Resources
DLC Direct Load Control
DR Demand Response
DSM Demand Side Management
EAF Electric Arc Furnace
EC Energy Conservation
EE Energy Efficiency
EMS Energy Management Systems
FERC U.S. Federal Energy Regulatory Commission
HMI Human-Machine Interfaces
HRIS Human Resources Information System
HVAC Heating, Ventilation, and Air conditioning System
IAS Industrial Automation Systems
IaaS Infrastructure as a Service
ICS Industrial Control Systems
ICT Information and Communications Technology
ISO Independent System Operators
IoT Internet of Things
IT Information Technology
KDD Knowledge Discovery in Databases
KPI Key Performance Indicator
LM Load Management
MES Manufacturing Execution Systems
MTU Master Terminal Unit
NIST National Institute of Standards and Technology
PaaS Platform as a Service
PLC Programmable Logic Controllers
RTP Real-time Pricing
RTU Remote Terminal Unit
SaaS Software as a Service
SCADA Supervisory Control and Data Acquisition
SEP Smart Energy Profile
SG Smart grid
SLA Service Level Agreement
SMB Small and Medium-sized Businesses
SOA Service Oriented Architecture
SR Spinning Reserves
TOU Time-of-Use
WSDL Web Service Description Language

Chapter 1

Introduction

The industrial sector produces the most CO2 and is one of the largest consumers of electricity worldwide, and its consumption continues to grow annually (International Energy Agency, 2008). However, due to limited resources and high costs, energy production is not growing at the same rate, resulting in a demand-supply mismatch (Inamdar and Hasabe, 2009). In an effort to close this ever-widening gap, energy suppliers and consumers are working together to keep demand within acceptable and secure levels. Energy suppliers run a set of Demand Response (DR) programs that encourage consumers to adjust their energy consumption, through changes in the price of electricity or through financial incentives (Mohagheghi and Raji, 2012). Energy consumers, in turn, can use their available energy more efficiently on their own initiative. To accomplish this, industries must find inefficiencies and reduce energy consumption without affecting their business and production processes. In the residential and commercial sectors, this essentially involves using energy-efficient equipment or dimming lights and turning down heaters. In contrast, energy efficiency initiatives encounter unique difficulties in the industrial sector, including production and quality constraints, multiple energy tariffs, and consumption and emission restrictions, which make the task of saving energy more complex.

Energy Management Systems (EMS) are tools that monitor, control, and optimize energy consumption (Fiedler and Mircea, 2012). Nevertheless, the literature, research projects, studies, and industry expertise make clear that there is a need for a novel and robust platform capable of providing more and better energy information monitoring, integration, storage, and analytics towards future energy-efficient manufacturing (see Table 1.1). In addition, recent advances in the hardware, networking, and software of sensor and control equipment have resulted in a massive increase of machine data in terms of volume, variety, and velocity (the three V's of big data), which hinders the capability to collect, monitor, and analyze all these data (Saha and Srivastava, 2014).

In the meantime, cloud computing is coming to the forefront and being applied to various fields. Its main application is solving large-scale computation problems by optimizing and combining distributed resources (Jadeja and Modi, 2012). Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction (Mell and Grance, 2011). It provides several benefits, such as savings in IT costs and maintenance, strong integration capabilities, short time-to-benefit, and scalable on-demand computation that keeps up with customer needs (Voorsluys et al., 2011). These benefits have pushed many residential and commercial EMS solutions to the cloud (Motegi et al., 2003; Byun et al., 2012; Hong et al., 2012). Hence, based on our review of publications, research projects, and industry trends (e.g., IMC-AESOP, IMS2020, Industry 4.0, Internet of Things, Cyber-Physical Systems), we believe that the industrial sector will also incorporate cloud technologies in its energy management and production processes in the near future.

Therefore, the central motivation for this thesis is to study the migration of industrial EMS to the cloud, to evaluate the possibilities for achieving greater energy and cost savings by taking advantage of the latest cloud computing and big data technologies, and finally to propose a solution. The conclusions of this thesis will be validated by developing, deploying, and evaluating the feasibility and performance of a cloud-native industrial EMS proof of concept with big data capabilities.

Industry challenges (first two columns) and cloud solutions (last two columns):

| Challenges | Implications | Cloud benefits | Added value |
| Industries have complex infrastructures and operations | Industrial EMS are expensive and harder to implement | Cloud solutions reduce investment and maintenance costs and with less time-to-benefit | Organizations can now afford to have an EMS with less investment and maintenance costs |
| Sites are often geographically distributed and self-managed | Hinders the ability to have a multi-site energy management | Cloud solutions are centralized with easy access to data | Achieve energy monitoring and analytics across sites |
| Weak integration between energy and production data | Manufacturing is not energy-aware and decision making is not informed nor based on real-time data | Cloud can provide interoperability and integrate data from external systems | React faster to changes with data driven and informed decisions |
| Weak and decentralized energy efficiency and cost analysis tools | Harder to benchmark energy usage and strategies across sites | Centralized correlation of data from all production levels | Derive knowledge from energy use and identify inefficiencies |
| Huge amounts of machine and energy data to analyze and process | Guesswork to find consumption inefficiencies and optimal schedules | Cloud can optimize the performance of scheduling algorithms | Find hidden patterns and produce new, faster and richer knowledge |

Table 1.1: Summary of the challenges and needs that the industry is facing with current energy management solutions and how the cloud could address them (adapted from (Walawalkar et al., 2010; Cannata and Taisch, 2010; Bunse et al., 2011; Thollander and Ottosson, 2010; Inamdar and Hasabe, 2009; Givehchi et al., 2013)).

1.1 Motivation

Companies with multi-site production usually follow the traditional energy management model, in which energy is managed on site (Ates and Durakbasa, 2012). In practice, an EMS has to be deployed and maintained at each facility. Moreover, these EMS may not even be from the same vendor, and each may use different proprietary protocols, resulting in heterogeneity across the organization and integration issues later on. This deployment method also implies downtime and propagation delays whenever a system update is needed, which often results in complications.

Achieving global energy and CO2 emissions management, and evaluating energy efficiency performance across equipment, production processes, departments, and facilities, may involve many contacts, integrations, and constant on-site assessments over time. Along with the energy information systems, someone has to be responsible for managing energy consumption on the premises.

In conclusion, this management model hinders global energy management through the following inefficiencies (see Figure 1.1): the same resources and costs are spent at multiple sites to provide the same functionality, data integration across locations is inefficient, and knowledge sharing is harder and slower, all of which affect the business processes and competitiveness of organizations.

[Figure 1.1. Panels: "Conventional industrial EMS deployment model" and "Proposed cloud-based industrial EMS model". Labels: IEMS, Energy Managers, Corporate Managers, Cloud-based IEMS.]

Figure 1.1: Conceptualization that compares the current industrial energy management model and the one proposed in this thesis.
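
To make the contrast concrete, the following minimal Python sketch illustrates the kind of cross-site roll-up that is laborious when each facility runs its own EMS and straightforward once readings are centralized. The site names, the figures, the `SiteReading` type, the `summarize` helper, and the emission factor are all hypothetical values chosen for illustration; they are not taken from any data set used in this thesis.

```python
from dataclasses import dataclass

# Hypothetical grid emission factor (kg CO2 per kWh); real factors vary by country and year.
EMISSION_FACTOR_KG_PER_KWH = 0.5


@dataclass
class SiteReading:
    site: str            # production site identifier (illustrative)
    energy_kwh: float    # energy consumed in the reporting period
    units_produced: int  # production output in the same period (assumed > 0)


def summarize(readings):
    """Aggregate energy, CO2, and specific energy consumption across sites."""
    total_kwh = sum(r.energy_kwh for r in readings)
    total_units = sum(r.units_produced for r in readings)
    # Specific energy consumption per site (kWh per unit produced).
    per_site = {r.site: r.energy_kwh / r.units_produced for r in readings}
    return {
        "total_kwh": total_kwh,
        "total_co2_kg": total_kwh * EMISSION_FACTOR_KG_PER_KWH,
        "kwh_per_unit": total_kwh / total_units,
        "least_efficient_site": max(per_site, key=per_site.get),
    }


readings = [
    SiteReading("site-a", 1200.0, 400),
    SiteReading("site-b", 900.0, 450),
    SiteReading("site-c", 1500.0, 300),
]
summary = summarize(readings)
print(summary)
```

In this invented example, site-c uses 5 kWh per unit produced against 3 and 2 for the other sites, so it is flagged as the least efficient. The point of the sketch is only that such a comparison needs all sites' data in one place.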

1.2 Problem statement and objectives

EMS have evolved a great deal in recent years, but there is still a gap between current solutions and industry needs (see Table 1.1). Recent studies show that the solution lies in better energy monitoring and control systems, integration of energy efficiency into production information systems, and useful energy usage and cost benchmarking across equipment and production sites to evaluate EE (Fiedler and Mircea, 2012; Kyusakov and Eliasson, 2012; Bunse et al., 2011; Arinez and Biller, 2010; O’Driscoll and O’Donnell, 2013). Therefore, the main objective of this thesis is to propose a solution that addresses these needs. The solution should make problem identification, decision-making, and the exploration of energy-saving opportunities as easy as possible. By following current industry standards and future trends, it aims to be a sustainable solution that remains relevant for many years. In addition, the proposal is intended to be affordable for industries of any size.

1.3 Research methodology and contributions

The challenging goal of this thesis requires an efficient research methodology to achieve the proposed objectives on time. It is necessary to review wide-ranging topics and analyze how they could be combined to obtain a unified solution (see Figure 1.2). The research of this thesis was also influenced by the following external contributions:

IMS2020¹ was an EU-funded research project that studied road maps towards Intelligent Manufacturing Systems (IMS) until 2020. It concluded that research is needed in five key areas: sustainable manufacturing, energy-efficient manufacturing, key technologies, standards, and education. Subsequent studies identified the main needs of the industrial sector, such as energy-aware manufacturing processes (better measurement and control systems to improve energy efficiency) and tighter integration of energy efficiency into production information systems (Bunse et al., 2011).

IMC-AESOP² was an EU-funded research and development project that studied the concept of a cloud of services in Industrial Automation Systems (IAS). Service Oriented Architecture (SOA) was the platform chosen to provide system interoperability, since it provides an excellent platform for developing systems with various services offered by real-time controllers, data acquisition systems, and legacy systems (Mora et al., 2012).

ABB Corporate Research³ in Germany contributed industry expertise and feedback in the areas of energy management, cloud computing, and Industrial Automation Systems (IAS). ABB is one of the largest engineering companies in the world and a leader in power and automation technologies that enable utility and industry customers to improve their performance while lowering environmental impact.

Defining the research road map of this thesis involved an initial analysis of the scope of work. This analysis revealed that an intensive review of the literature, research projects, industry surveys, and external expertise was needed, as follows:

1. Perform a comprehensive literature review on the main concepts that could improve energy efficiency: Energy Management, EMS, and DR.

2. Study research projects and industry trends to understand where the industry is heading, and provide a sustainable solution that could work on future smart manufacturing architectures.

3. Analyse current developments in cloud computing that can support the solution proposed in this thesis.

4. Review current literature, surveys, and trends together with industry experts from ABB, in order to identify the fundamental needs of industrial organizations and what this thesis should provide.

5. Consolidate and refine the topics from all information sources, combining the gathered research and using it to support the proposed objectives.

6. Demonstrate the claims and feasibility of this thesis by developing a proof of concept and evaluating it with a fleet simulator that simulates virtual energy metering devices operating at multiple sites.

[Figure 1.2 chart: research workload (less to more) over the research timeline (weeks 0 to 16) for Energy Management (Energy Efficiency), Demand Response, Industrial Automation Systems, and Cloud Computing, with the main focus of this work and the research centers marked.]

Figure 1.2: Illustration of the research workload of this thesis and the research centers involved during its development.

¹ http://www.ims2020.net/
² http://www.imc-aesop.eu/
³ ABB is a multinational corporation headquartered in Zurich, Switzerland, operating in robotics and mainly in the power and automation technology areas. ABB has operations in around 100 countries, with approximately 150,000 employees (November 2013). http://www.abb.com/

This thesis brings the following contributions to the academic field:

1. An extensive review of the existing literature on the state of the art, benefits, and risks of the main topics discussed: Energy Efficiency (EE), Demand Response (DR), Industrial Automation Systems (IAS), and cloud computing.

2. An analysis of the different topics in automation and cloud computing that lead towards a more energy-aware and efficient industry.

3. A proposed solution to the problems described earlier, following current industry trends and standards.

4. A conceptual and performance evaluation of the proposed solution.

1.4 Document organization

This thesis is divided into 5 chapters. Chapter 2 describes the research background that supports this thesis. It starts by describing the status of the energy management domain and current approaches to tackling Energy Efficiency (EE) and Demand Response (DR) in the industrial sector. It then presents a survey of EMS in use today and introduces Industrial Automation Systems (IAS) and modern trends towards smarter manufacturing. The chapter concludes with a review of cloud computing and the recent developments that influenced this thesis. The proposed solution can be found in Chapter 3 and its evaluation in Chapter 4. Finally, Chapter 5 presents the conclusions of this thesis and summarizes its most important aspects.

Chapter 2

Research Background

2.1 Energy Demand Management

This section describes the concepts that define the scope of this work in the energy domain. It follows a top-down approach, starting from the outermost concept, the Smart grid (SG), and moving down to Demand Side Management (DSM). The DSM discussion then focuses on two sub-approaches, Energy Efficiency (EE) and Demand Response (DR), which are the main research topics of this work in the energy field.

2.1.1 Smart Grids

The rigidity of the traditional grid was a major obstacle to overcoming the demand-supply mismatch. Nowadays, utilities and customers work as partners and need ways to communicate and help each other. The SG, a computerized power grid that provides many advanced services through a two-way communication and information infrastructure linking utilities and customers, has been proposed as the answer (Report et al., 2011; Kyusakov and Eliasson, 2012). The end goal of the SG is to enhance the reliability of electricity distribution, reduce peak demand, shift usage to off-peak hours, and lower total energy consumption and the carbon dioxide footprint (Kyusakov and Eliasson, 2012). However, due to the enormous costs involved in this upgrade, grid operators are looking for ways to leverage the new services provided by the SG to offset those costs. The economic benefits realized from the wide adoption of DR are expected to pay for the largest share of the investment in the SG (Faruqui et al., 2010).
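
The load-related goals of the SG (lower peak demand and usage moved to off-peak hours) can be illustrated with a small sketch. The hourly profile, the choice of peak and off-peak hours, and the `shift_load` helper are assumptions made for illustration only; they do not describe any specific grid or tariff.

```python
def shift_load(profile_kw, peak_hours, off_peak_hours, flexible_kw):
    """Move up to `flexible_kw` of load out of each peak hour and spread the
    removed load evenly over the off-peak hours, so total energy is preserved."""
    shifted = list(profile_kw)
    moved = 0.0
    for h in peak_hours:
        delta = min(flexible_kw, shifted[h])
        shifted[h] -= delta
        moved += delta
    for h in off_peak_hours:
        shifted[h] += moved / len(off_peak_hours)
    return shifted


# Hypothetical 6-hour load profile in kW (one value per hour).
profile = [40.0, 50.0, 90.0, 100.0, 60.0, 40.0]
shifted = shift_load(profile, peak_hours=[2, 3], off_peak_hours=[0, 5], flexible_kw=20.0)

print("peak before/after:", max(profile), max(shifted))
print("energy before/after:", sum(profile), sum(shifted))
```

With these invented numbers the peak falls from 100 kW to 80 kW while the total over the six hours stays at 380 kWh. Shifting alone reduces peak demand and not total consumption.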

2.1.2 Liberalised Electricity Markets

In traditional vertically-integrated electricity systems, supplies are maintained by a monopoly provider who has the responsibility to ensure that adequate generating capacity is available. Prices are generally regulated whether electricity is scarce or not. Electricity market liberalization was introduced with the intention of creating a reliable, economically efficient electricity sector and increasing price-setting transparency. Several countries have liberalized their energy supply system through energy markets where different power suppliers can coexist and offer energy to customers through bidding. These markets are usually regulated by national and international authorities to protect consumer rights and avoid oligopolies¹. With energy market liberalization, prices are formed through complex interactions between buyers and sellers, i.e., between the demand and supply sides of the market. However, most buyers do not participate actively in the price-setting process, and thus the process is far from complete. As a result, prices fail to play their normal role of balancing natural swings in supply and demand, leading to excessive instability. Notwithstanding, liberalization of electricity tends to substantially benefit large consumers, such as industrial customers, since these are more energy dependent and more willing to adapt their load on request.

In liberalized systems, in contrast to vertically-integrated ones, the function of balancing supply and demand is performed in almost real time, normally through a wholesale electricity market, where information about the current and future supply and demand balance is signaled by electricity prices. Generally, efficient market prices are formed by interactions between suppliers and customers, which determine the value of supply at any point in time. However, in some liberalized electricity markets, nearly all retail customers are exposed to prices that are fixed for relatively long periods, regardless of the supply-demand balance in the market. Under such conditions, customers have no incentive to vary their consumption in response to actual market conditions.

[Figure 2.1 diagram. Domains: Markets, Operations, Service Provider, Bulk Generation, Transmission, Distribution, Consumer & Producer. Legend: secure data communication flows, electrical flows, domain.]

Figure 2.1: Conceptual model that describes the SG environment and the interactions between the different actors (adapted from (Locke and Gallagher, 2010)).

¹ An oligopoly is a structured market where only a few dominant producers operate and where any action of one producer has an influence on the overall market, prices, and payoffs (David and Wen, 2001).
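
The incentive argument for customers on fixed prices can be made concrete with a little arithmetic. The sketch below bills the same consumption under a flat tariff and under hourly real-time prices. All prices and loads are invented for illustration, and `bill` is a hypothetical helper; prices are expressed in integer cents per kWh to keep the arithmetic exact.

```python
def bill(load_kwh, prices_cents):
    """Cost in cents of an hourly load profile, given one price per hour."""
    return sum(load * price for load, price in zip(load_kwh, prices_cents))


# Hypothetical four-hour example (prices in cents per kWh, loads in kWh).
real_time = [10, 10, 30, 30]  # cheap hours followed by expensive hours
fixed = [20, 20, 20, 20]      # flat retail tariff with the same average price
baseline = [10, 10, 10, 10]   # consumption spread evenly
shifted = [15, 15, 5, 5]      # same total energy, moved to the cheap hours

for name, prices in (("fixed", fixed), ("real-time", real_time)):
    saving = bill(baseline, prices) - bill(shifted, prices)
    print(f"{name}: saving from shifting = {saving} cents")
```

In this example, moving consumption to the cheap hours saves nothing under the fixed tariff (800 cents either way) but saves 200 cents under real-time prices (800 versus 600). This is the sense in which fixed retail prices remove the incentive to respond to market conditions.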

2.1.3 Demand-Side Management

DSM is an important function of the SG that helps energy providers reduce peak load demand and reshape the load demand profile. DSM includes everything that is done on the demand side of an energy system, ranging from improving energy efficiency by using energy-efficient materials, through smart energy tariffs with incentives for certain changes in consumption patterns, up to real-time control of on-site power generation systems. There are different conceptions of DSM programs, but generically they can be categorized as follows (Palensky and Dietrich, 2011; Walawalkar et al., 2008; Albadi and El-Saadany, 2007; Mohagheghi and Raji, 2012):

Energy Conservation (EC) focuses on user energy consumption behavioral changes to use less en- ergy usually driven by education (e.g., use of natural lighting over electrical lighting).

Energy Efficiency (EE) means using building materials, equipment or techniques that are more energy efficient, i.e., use less power to perform the same tasks (e.g., replacing an incandescent lamp with a compact fluorescent lamp which uses much less energy to produce the same amount of light).

Demand Response (DR) refers to changes in customers’ energy end-use from their nominal consumption patterns. These changes in consumption are often made in response to changes in the price of electricity over time, or to incentive payments (e.g., dimming lights to comply with a utility request to keep energy consumption under a certain level, in exchange for compensation).

Kyusakov and Eliasson (2012) affirm that one of the main challenges for DSM and SG is on the Information and Communications Technology (ICT) side. It is ICT interoperability, scalability, information security, and network management that pose the greatest challenges. Fortunately, several international research projects are working on smart grid, communication, security, interoperability, and smart-metering standards for energy management (Palensky and Dietrich, 2011; Kyusakov and Eliasson, 2012).

Manufacturing processes such as paper pulping and steel smelting using an Electric Arc Furnace (EAF)2 consume large amounts of energy. The industrial sector is often the largest portion of the total load served by utilities. At many utilities, industrial customers (2-10% of total customers) account for at least 80% of the electricity usage (Mohagheghi and Raji, 2012). This further emphasizes the importance of the role of the industrial sector in the measures described above.

2An EAF is a furnace that heats charged material by means of an electric arc. The use of EAFs allows steel to be made from a 100% scrap metal feedstock. As EAFs require large quantities of electrical power, many companies schedule their operations to take advantage of off-peak electricity pricing.

Figure 2.2: Impacts of DSM programs in the quality of industrial production processes (adapted from (Palensky and Dietrich, 2011)).

2.1.4 Energy Efficiency

Efficient energy use, sometimes simply called Energy Efficiency (EE), is the goal of reducing the amount of energy required to provide products and services without compromising them (Palensky and Dietrich, 2011). Achieving this goal may involve using more energy-efficient equipment, methods, processes, energy management techniques, or others. For example, insulating a building allows it to use less heating and cooling energy to achieve and maintain a comfortable temperature inside. Another typical example would be the use of fluorescent lights or natural skylights instead of traditional incandescent light bulbs to reduce the amount of energy needed for lighting. Improvements in energy efficiency are usually achieved by adopting more efficient technologies or production processes, or by applying commonly accepted methods to reduce energy losses. These measures imply immediate and permanent energy and emissions savings, and are therefore the most accepted methods.

The EE potential and operation of physical parts (motors, lights, Heating, Ventilation, and Air conditioning System (HVAC), equipment, etc.) are certainly important, but they are relatively well researched and are out of the scope of this work (Palensky and Dietrich, 2011). Hence, this work is only concerned with EE measures driven by Information Technology (IT) systems and processes in the industrial sector. There are important drivers to introduce EE in this sector (Bunse et al., 2011):

• Rising energy prices make the reduction of energy consumption more and more important to manufacturing companies, especially in energy-intensive industries (e.g., steel, cement, pulp and paper, chemicals), where energy can account for up to 60% of operating costs, turning energy costs into a strong factor for competitiveness.

• New environmental regulations with environmental taxes, subsidies, emission permits, and green certificates for CO2 emissions. For example, in Germany, the implementation of a certified Energy Management System (EMS) allows companies to save tens or hundreds of thousands of euros per year through environmental tax reductions (Fiedler and Mircea, 2012).

• Changing purchasing behaviors based on products or services that have been manufactured in a more environmentally friendly way, known as “green products”.

2.1.5 Energy Management

Energy management stands for all the measures and activities which are planned or executed in order to minimize the energy consumption of a company or institution (Fiedler and Mircea, 2012). It influences the organizational and technical processes as well as patterns of behavior and labor in order to reduce, within economic constraints, the consumption of energy and increase energy efficiency. More specifically, energy management includes control, monitoring, and improvement activities for energy efficiency. In the end, energy management is beneficial for industrial companies for economic, environmental and societal reasons (Bunse et al., 2011).

Nevertheless, several studies have identified a low status of energy management as a barrier to energy efficiency (Bunse et al., 2011). One of the reasons lies with the energy managers who perform energy management. Usually these managers are not qualified for the job or are not fully committed to the role. A survey performed by Siemens among the top 600 leading companies in the UK showed that only 1 in 10 energy managers spend up to 50% of their time on energy-related issues and that 94% of them do not have any qualification in energy management (Uk, 2011).

Small and Medium-sized Businesses (SMB) may not have the necessary capital or the need to hire an energy manager. But sometimes even big companies with the necessary resources to recruit lack energy managers and tools. Ates and Durakbasa (2012) surveyed 120 large companies from the top 2000 industrial companies in Turkey, each with a total annual energy consumption of 1000 toe or more and 80% of them with more than 500 employees. The study revealed that 18% of the surveyed organizations do not have energy managers. More importantly, it discovered that only 24% of those large companies actually practice energy management. As for SMB, it estimated a rate significantly below 20%.

2.1.6 Energy Management Systems

An EMS is an energy management tool used in a wide variety of applications to effectively monitor, optimize and control power generation, distribution and consumption. The main goal of these systems is to increase energy efficiency and thus achieve energy savings, through continuous monitoring and maintenance of the facilities, improving the operation of equipment and decreasing energy consumption without compromising the customer needs (Arinez and Biller, 2010). The cost-saving argument is probably the major driver for the majority of organizations implementing an EMS (Fiedler and Mircea, 2012). Since lowering energy costs increases profit, the search for energy-saving potential is always worthwhile. On the other hand, one of the barriers to the adoption of EMS is the capital investment necessary to deploy these systems.

Figure 2.3: Real-time is a requirement in the industrial sector because sudden disturbances affect several domains such as energy, production, and quality simultaneously (adapted from (Ma et al., 2010)). [Figure labels: Energy, Production, Quality; Steady-state operation; Sudden change in operating conditions; Timeline.]

Today, these systems offer a broad scope of capabilities and features. They can be found at energy suppliers (e.g., in electrical generation plants and power transmission supervision centers) and at energy customers (e.g., in the industrial, commercial and residential sectors). Since both suppliers and customers nowadays produce and consume energy and data, both need these systems to increase awareness of energy consumption and to support informed decisions. Independent of the magnitude and application, each kind of EMS has its own unique requirements depending on the user’s needs.

2.1.7 Industrial Energy Management Systems

The focus of this work is on the EMS found at industrial customers, i.e., small to large manufacturing facilities. Research shows that this sector is lacking complex and intelligent energy monitoring and control systems when compared to the EMS found in other sectors (Arinez and Biller, 2010; Kyusakov and Eliasson, 2012; Bunse et al., 2011). While in other sectors, such as the residential and commercial ones, energy management may only involve using more energy-efficient equipment, dimming lights or switching off air conditioning equipment, the industrial sector has some unique challenges that make the task more difficult. To find out which activities can be dimmed or suspended in response to a DR request, or how power demand can be reduced by rescheduling power-dependent activities in response to a DSM initiative, we first need to understand how the facility is spending its energy and which constraints apply.

Energy Management Systems Architectures

EMSs are intended to help manage and reduce energy consumption in any facility infrastructure. To achieve this purpose, a standard EMS is usually architected as a multi-layer application as follows (Ma et al., 2010):

Figure 2.4: Conceptualization of the generic architecture for EMS composed by a data acquisition layer to gather data from the field and transmit it to the upper layers, an integration layer to transform data into internal representations, and an application layer to analyze these data (adapted from (Ma et al., 2010)). [Figure labels, top to bottom: Application Layers (Energy-use optimization, Performance evaluation); Integration Layers (Energy-use interpretation); Data Acquisition Layers (Data transmission, Energy data acquisition); Network; Hardware (Sensors, Energy Meters).]

Energy data acquisition layer includes the modules responsible for communicating with sensors and metering devices and retrieving their data, such as status, temperature, humidity, illumination intensity and the current amount of energy consumed. This can be performed by polling the devices periodically or in an event-driven fashion, in which the device is responsible for sending the data whenever some value changes.

Data transmission is a middle layer that processes the data from the field level (aggregating it) and establishes connections with the central system. Industrial networks usually use standard communication protocols such as M-Bus, CAN, AS-I bus, Interbus or Profibus to exchange data between meters and servers (O’Driscoll and O’Donnell, 2013; Feuerhahn et al., 2011; Bayindir et al., 2011; Kyusakov and Eliasson, 2012).

Energy-use interpretation contains the module responsible for evaluating, integrating, transforming and mapping retrieved data into the EMS energy data models repository.

Performance evaluation layer is responsible for benchmarking energy performance, such as energy efficiency, energy consumption, energy costs and other energy Key Performance Indicators (KPI) specific to the customer’s business.

Energy-use optimization layer contains the modules responsible for reducing energy consumption and optimizing the operation of equipment to increase energy efficiency.
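The layering above can be illustrated with a minimal sketch in which each layer is one function and data flows upwards from the field. All names, meter identifiers and values are hypothetical and chosen only for illustration; they do not correspond to the architecture of any particular EMS.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    meter_id: str    # hypothetical meter identifier
    hour: int        # hour index of the sample
    power_kw: float  # average power drawn during that hour


def acquire(meters):
    """Energy data acquisition: poll every meter once per hour."""
    return [Reading(meter_id, hour, kw)
            for meter_id, samples in meters.items()
            for hour, kw in enumerate(samples)]


def transmit(readings):
    """Data transmission: group field data per meter before forwarding it."""
    batches = {}
    for reading in readings:
        batches.setdefault(reading.meter_id, []).append(reading)
    return batches


def interpret(batches):
    """Energy-use interpretation: map raw batches into an internal model.

    The model here is energy in kWh per meter; with one-hour samples the
    average power in kW equals the energy in kWh for that hour.
    """
    return {meter_id: sum(r.power_kw for r in readings)
            for meter_id, readings in batches.items()}


def evaluate(energy_kwh, units_produced):
    """Performance evaluation: a simple KPI, energy per unit produced."""
    return sum(energy_kwh.values()) / units_produced


def optimize(energy_kwh):
    """Energy-use optimization: pick the largest consumer as first candidate."""
    return max(energy_kwh, key=energy_kwh.get)


# Hypothetical hourly samples (kW) for two meters.
meters = {"oven": [50.0, 60.0, 55.0], "lighting": [10.0, 10.0, 10.0]}
model = interpret(transmit(acquire(meters)))
kpi_kwh_per_unit = evaluate(model, units_produced=100)
candidate = optimize(model)
```

Keeping each layer behind its own function mirrors the separation of concerns in the layered architecture: the upper layers only see the internal energy model, never the raw meter samples.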

The ISO 50001 standard

It is only through the use of standards that credibility can be verified and requirements of interconnectivity and interoperability can be assured. With an issued EMS certificate, a company demonstrates a sustainable strategy and a reasonable use of energy, which strengthens its image.

ISO 50001 establishes an international framework for industrial plants and companies to manage energy, including all aspects of processes and the energy management system model (Fiedler and Mircea, 2012). Based on the Plan, Do, Check, Act (PDCA) model, this standard provides an energy management implementation strategy which involves (i) establishing an energy management policy, (ii) forming an energy management team to effectively implement an energy management system, (iii) conducting an energy review, (iv) identifying and analyzing opportunities for improving energy performance, (v) establishing a baseline and energy performance indicators for tracking progress, (vi) helping and guiding the setting of energy performance improvement targets, and (vii) implementing action plans to achieve customer targets.

Like all International Organization for Standardization (ISO)3 standards, such as the quality management ISO 9001 or the environmental management ISO 14001, ISO 50001 was designed to be implemented by any type of organization, independently of its size, business, or geographical location. It does not impose any energy performance improvement targets. The strategic and operative energy targets are rather up to the organization itself. In other words, any organization, regardless of its current level of energy management, can implement the ISO 50001 standard and achieve an improvement baseline (Fiedler and Mircea, 2012).

PDCA (Plan, Do, Check, Act) is the model for energy management when employing ISO 50001. The PDCA cycle provides a framework for continuous improvement of processes or systems. It is a dynamic and repeating model in which the results of one cycle are the input for the following one (Fiedler and Mircea, 2012). This structure enables a continuous reassessment of the energy consumption and a sustainable optimization and reduction:

1. Plan: conduct the energy review and establish the energy-use baseline, energy performance indicators, objectives, targets and action plans necessary to deliver results in accordance with opportunities to improve energy performance and the organization’s energy policy.

2. Do: implement the energy management action plans.

3. Check: monitor and measure processes and the key characteristics of its operations that determine energy performance against the energy policy and objectives and report the results.

4. Act: take actions to continually improve energy performance and the EMS.
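The repeating nature of the cycle, where the result of one iteration is the input of the next, can be sketched as follows. This is only an illustration under simplifying assumptions: the baseline, the 3% target and the achieved reductions are hypothetical numbers, and ISO 50001 itself does not prescribe any targets.

```python
def pdca_cycle(baseline_kwh, target_reduction, achieved_reduction):
    """Run one simplified PDCA iteration.

    Returns the measured consumption, which serves as the baseline of the
    next cycle, and whether the planned target was met.
    """
    # Plan: set a target relative to the current energy baseline.
    target_kwh = baseline_kwh * (1 - target_reduction)
    # Do: implement the action plan (modelled here as a given saving rate).
    measured_kwh = baseline_kwh * (1 - achieved_reduction)
    # Check: compare the measured consumption with the target.
    target_met = measured_kwh <= target_kwh
    # Act: the measured consumption becomes the next cycle's baseline.
    return measured_kwh, target_met


baseline = 1000.0  # hypothetical annual consumption in kWh
history = []
for achieved in (0.05, 0.02, 0.04):  # hypothetical savings per cycle
    baseline, met = pdca_cycle(baseline, target_reduction=0.03,
                               achieved_reduction=achieved)
    history.append((round(baseline, 2), met))
```

In this sketch the second cycle misses its target, yet its measured consumption still becomes the baseline for the third cycle, which reflects the continuous reassessment described above.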

3http://www.iso.org/

Figure 2.5: Illustration of the different stages of the PDCA cycle model in ISO 50001 to perform energy management efficiently (adapted from (Chiu et al., 2012; Fiedler and Mircea, 2012)). [Figure labels: Plan (Energy policy and energy-use review; Objectives and action planning); Do (Implementation and operational control; Awareness and training of staff); Check (Monitoring and analysis; Corrective and preventive actions; Internal audit of the EMS); Act (Management Review; Optimisations).]

Load management

Load Management (LM) is the process of balancing the energy consumption over an electric network in order to avoid consumption during high price periods and optimize utilization of valuable resources like fuels, power generators, power transmission networks, and network distribution capacity (International Union for Electricity applications, 2009).

Power demand during system on-peak periods is more expensive, since it requires expensive power generation stations. If a customer can reduce demand during an on-peak period, hence reducing the supplier’s requirement for network capacity, the customer lowers the total electricity charges, since this saves costs for the supplier, the distributor and the producer. This also allows the postponement of the need for additional capacity, while at the same time increasing the operating efficiency of the energy system.

Load management may include using on-site power generation or energy storage equipment, shifting demand, such as lighting and heating, to a less expensive period of the day, or temporarily shutting down one or more processes.

Some of the most common demand-change techniques to reduce energy costs without sacrificing the performance and quality of the manufacturing processes are the following (International Union for Electricity applications, 2009; Guntermann, 1982; Raghavendra Nagesh et al., 2010; Piette et al., 2004):

Figure 2.6: Load shifting example where the cooling activity is shifted from an on-peak period (left side) to off-peak periods (right side) to keep consumption under the billing energy demand limit.

Load shifting is a technique where the energy consumption period is shifted to periods of the day with lower energy consumption, or in particular, with lower prices. Although the same amount of energy is used, the overall costs associated with the energy consumption will be reduced, because the consumption will be shifted from on-peak to off-peak time slots.

Load shedding, or Demand Limiting, is a technique that simply curtails energy loads, i.e., it reduces current energy consumption by forcing equipment or process shutdowns.

Load priority systems prevent large loads from operating simultaneously, e.g., motors and ovens starting at the same time.

Energy storage units, e.g., batteries, are charged during off-peak periods and used during peak hours.

On-site generation, also called Distributed Generation, refers to small-scale power generation technologies used to provide an alternative to or an enhancement of the traditional electric power system, e.g., solar panels.
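As an illustration of how load shedding and load priorities can be combined, the following sketch sheds the least critical loads until the total demand falls under a demand limit. The load names, power values, priorities and the limit are hypothetical; a real system would also have to respect the process constraints discussed later in this chapter.

```python
def shed_loads(loads, demand_limit_kw):
    """Shed the lowest-priority loads until demand is within the limit.

    `loads` maps a load name to (power_kw, priority); a higher priority
    number means the load is more critical and is shed later.
    Returns the list of shed loads and the remaining demand in kW.
    """
    total_kw = sum(power for power, _ in loads.values())
    shed = []
    # Visit loads from least to most critical.
    for name, (power, _) in sorted(loads.items(), key=lambda kv: kv[1][1]):
        if total_kw <= demand_limit_kw:
            break
        total_kw -= power
        shed.append(name)
    return shed, total_kw


# Hypothetical loads: name -> (power in kW, priority).
loads = {
    "oven": (60.0, 3),
    "ventilation": (15.0, 2),
    "hot_water": (20.0, 1),
    "lighting": (25.0, 2),
}
shed, remaining_kw = shed_loads(loads, demand_limit_kw=90.0)
```

With these values the total demand is 120 kW, so the hot water and ventilation loads are shed and the remaining demand is 85 kW, while the oven, which has the highest priority, keeps running.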

Load curve profiling

Complex manufacturing facilities consume a significant amount of the industrial sector’s electrical energy to power motors, compressors and machine tools, and to maintain adequate heating, ventilation and air conditioning (O’Driscoll and O’Donnell, 2013). Industrial energy use can be classified as (Rahimifard et al., 2010):

Indirect energy is used to maintain the environment that surrounds the production processes, e.g. energy used to power lights, sensors or HVAC.

Figure 2.7: Example of an industrial power load curve, describing all the energy-dependent activities and their consumption over the day (adapted from (International Union for Electricity applications, 2009)). [Axes: Power (kW), 0 to 170, against Time (h), 0 to 24. Plotted activities: Ventilation, Permanent processes in production, Hot water, Heating, Lighting, Oven.]

Direct energy is defined as the energy used by the various processes (e.g., casting, machining, spray painting, inspection, etc.) required to manufacture products. Hence, this is the essential energy that production facilities depend on, making it the hardest one to reduce, because doing so can jeopardize the whole production process, so that the improvement in power demand may not be worth its cost.

Figure 2.7 provides an example of the load curve of a production factory, showing all the power-dependent activities of the factory and their consumption over time. The energy used in the activities ‘Permanent processes in production’ and ‘Oven’ is classified as direct energy, since these activities power the production process. The other activities use indirect energy, since they support the production.
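The classification can be expressed as a short computation that splits the consumption of a facility into direct and indirect energy. The activity names follow the load curve example, but the daily consumption values below are hypothetical and are not read from Figure 2.7.

```python
# Activities that power the production process itself (direct energy).
DIRECT_ACTIVITIES = {"permanent_processes", "oven"}


def split_energy(daily_kwh):
    """Split daily consumption per activity into direct and indirect energy."""
    direct = sum(kwh for name, kwh in daily_kwh.items()
                 if name in DIRECT_ACTIVITIES)
    total = sum(daily_kwh.values())
    return {
        "direct_kwh": direct,
        "indirect_kwh": total - direct,
        "direct_share": direct / total,
    }


# Hypothetical daily consumption per activity, in kWh.
daily_kwh = {
    "permanent_processes": 480.0,
    "oven": 120.0,
    "lighting": 150.0,
    "heating": 100.0,
    "hot_water": 50.0,
    "ventilation": 100.0,
}
split = split_energy(daily_kwh)
```

In this hypothetical case 60% of the energy is direct, which leaves the remaining 40% of indirect energy as the more flexible part of the load.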

Load curve optimisation

After obtaining the site load curve, it is necessary to analyse the curve and determine how it can be enhanced. This process includes the following tasks (International Union for Electricity applications, 2009):

Definition of the load curve objectives according to the pricing mechanisms in effect. From the customer point of view, a flattened curve may not be the best solution. Industrial customers often want to reduce the load during on-peak time and to increase off-peak consumption.

Incentives and opportunities are identified by checking the electricity bills to determine whether any DSM promotion exists at the moment and in which time periods.

Archiving load curves for a long period of time provides a proper history of the actions taken and of how well they were reflected in the total energy efficiency. It may also be useful for load forecasting in similar situations.

Analysis of the energy consumption of the processes and supporting operations characterises the energy load. This makes it possible to determine the amount of load reduction that can be achieved by interrupting or deferring each operation measured.

Constraints analysis of the application of LM, that is, making sure that no constraints prevent interrupting or rescheduling loads with respect to: safety of operations, impacts on quality and quantity of production, preservation of the integrity of the equipment, and mutual interactions with other facilities in the factory.
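A first pass over an archived load curve can be sketched as below: it computes the peak, the load factor (average demand divided by peak demand, a common measure of how flat a curve is) and the on-peak hours in which a demand limit is exceeded. The hourly values, the on-peak window and the limit are hypothetical.

```python
def profile_load_curve(curve_kw, on_peak_hours, demand_limit_kw):
    """Summarise an hourly load curve given in kW, one value per hour."""
    peak_kw = max(curve_kw)
    average_kw = sum(curve_kw) / len(curve_kw)
    return {
        "peak_kw": peak_kw,
        "average_kw": average_kw,
        # A load factor close to 1 indicates a flat curve.
        "load_factor": average_kw / peak_kw,
        # On-peak hours whose demand exceeds the limit are the first
        # candidates for shifting or shedding.
        "on_peak_overruns": [hour for hour in on_peak_hours
                             if curve_kw[hour] > demand_limit_kw],
    }


# Hypothetical 24-hour load curve: night, working day, evening.
curve_kw = ([40] * 8
            + [100, 120, 160, 160, 120, 100, 100, 120, 140, 100]
            + [40] * 6)
summary = profile_load_curve(curve_kw, on_peak_hours=range(9, 18),
                             demand_limit_kw=130)
```

For this curve the overruns occur at hours 10, 11 and 16, which is where the constraints analysis above would have to confirm that the corresponding loads can actually be interrupted or rescheduled.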

2.2 Demand Response

Electricity supply and demand must remain in balance in real time to ensure stability on the electricity grid. Since seasons and weather influence electricity demand and since electricity cannot be stored in large quantities, it is necessary to plan energy supply availability in advance. Without this planning, supply interruptions in the form of brownouts and blackouts would be common, causing considerable economic damage.

Utilities rely on peaking power plants to meet demand peaks, known as on-peak periods. However, these power plants are very expensive to run, so suppliers try to keep energy demand under control to avoid running these plants or installing extra capacity, either of which would increase electricity prices (Mohagheghi and Raji, 2012; International Union for Electricity applications, 2009). For instance, the capital cost needed to produce 1 MW of DR (see Section 2.2) capacity in collaboration with the customers is about $240,000 vs. $400,000 for using a gas-fired peaking power plant. DR capacity is also faster, since it can potentially be dispatched in less than 5 minutes, whereas a peaking power plant can take up to 30 minutes to ramp up to full capacity (Mohagheghi and Raji, 2012).

The solution relies on conserving more electricity, i.e., the demand side needs to participate and reduce its energy use, in what is commonly called virtual generation (Carreira et al., 2011). This can happen either on the customers’ own initiative, by performing energy management on-premise to minimize their energy consumption, or on the initiative of the energy utility. The fundamental idea is that during critical on-peak periods the grid requests the consumers to reduce their loads, so that they act as if they had power generating capacity of their own (though some may actually have it), and compensates them economically for their participation. It is a win-win situation for both, because utilities defer investments and continue to have an available and efficient power supply system, while customers benefit from reduced costs and extra compensation.
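The capital cost comparison quoted above can be turned into a small worked example. The two cost figures per MW are the ones cited in the text; the 50 MW capacity and the assumption that costs scale linearly with capacity are hypothetical and serve only to illustrate the arithmetic.

```python
DR_COST_PER_MW = 240_000      # USD per MW of DR capacity (figure cited above)
PEAKER_COST_PER_MW = 400_000  # USD per MW of gas-fired peaking capacity


def capital_saving(capacity_mw):
    """Capital saved by covering `capacity_mw` with DR instead of a peaker.

    Assumes, for illustration only, that costs scale linearly with capacity.
    """
    return capacity_mw * (PEAKER_COST_PER_MW - DR_COST_PER_MW)


# Hypothetical case: 50 MW of peak demand covered through DR.
saving_usd = capital_saving(50)
# Relative saving per MW, independent of the capacity considered.
relative_saving = 1 - DR_COST_PER_MW / PEAKER_COST_PER_MW
```

Under these assumptions, covering 50 MW through DR instead of a peaking plant avoids 8 million USD of capital cost, a 40% reduction per MW.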

Figure 2.8: Overview of the interactions between energy providers and consumers during Demand Response events (adapted from (Dam et al., 2008)).

This is where DR comes into play. While the goal of EE is to reduce energy use (kWh), the goal of Demand Response (DR) is the dynamic reduction of peak electricity demand (kW) (Kiliccote and Piette, 2005). DR is a DSM solution targeted at residential, commercial and industrial customers, with the purpose of reducing power demand or shifting it to a specific time for a specific duration, when wholesale energy market prices are high or when system reliability is jeopardised. DR induces alterations in energy demand through changes in the price of electricity or through incentive payments (Manuel and Cardoso, 2013; Mohagheghi and Raji, 2012; Cappers et al., 2010; Granderson and Piette, 2011; Albadi and El-Saadany, 2007; Cardoso, 2012; Dam et al., 2008). In other words, DR includes all intentional modifications to the electricity consumption patterns of end-use customers that are intended to alter the timing, the level of instantaneous demand, or the total electricity consumption (Albadi and El-Saadany, 2007).
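The difference between the two goals can be made concrete by computing both quantities for the same load profile. The profiles below are hypothetical hourly values: the EE case lowers consumption in every hour, while the DR case moves demand out of the peak hours without changing the total energy.

```python
def energy_kwh(profile_kw, step_hours=1.0):
    """Energy use: power integrated over time (rectangle rule)."""
    return sum(profile_kw) * step_hours


def peak_kw(profile_kw):
    """Peak demand: the highest power drawn at any time step."""
    return max(profile_kw)


# Hypothetical hourly demand in kW over six hours.
baseline = [50, 50, 120, 120, 50, 50]
# EE measure: every hour consumes 10% less.
after_ee = [45, 45, 108, 108, 45, 45]
# DR event: 40 kW are moved from the two peak hours to the next two hours.
after_dr = [50, 50, 80, 80, 90, 90]
```

Here the EE measure lowers energy use from 440 kWh to 396 kWh, whereas the DR event leaves energy use at 440 kWh but lowers the peak from 120 kW to 90 kW.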

2.2.1 Demand Response Programs

There are many classifications for DR programs, but they can be roughly grouped into the ones based on incentives and the ones based on time tariffs (see Table 2.1). Another way to look at the various demand response programs is to distinguish between (i) Market DR, for the plans that involve wholesale energy market price signals and incentives, and (ii) Physical DR, for the plans with utility grid load management signals and utility emergency signals (Palensky and Dietrich, 2011).

Industrial customers are usually billed according to Time-of-Use (TOU) rates (International Union for Electricity applications, 2009). This means that the cost of the energy consumed depends on the hour of the day and the season. Prices are higher during on-peak time and lower at off-peak time, so these “price signals” stimulate customers to change their consumption.
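The effect of such price signals can be sketched with a two-rate tariff. The on-peak window, the prices and the consumption profile are hypothetical, and the seasonal dimension of TOU rates is left out for brevity.

```python
def tou_cost(profile_kwh, on_peak_hours, on_peak_price, off_peak_price):
    """Cost of an hourly consumption profile under a two-rate TOU tariff."""
    return sum(
        kwh * (on_peak_price if hour in on_peak_hours else off_peak_price)
        for hour, kwh in enumerate(profile_kwh)
    )


ON_PEAK_HOURS = range(12, 18)  # hypothetical on-peak window, 12:00 to 18:00
ON_PEAK_PRICE = 0.20           # hypothetical price per kWh
OFF_PEAK_PRICE = 0.10          # hypothetical price per kWh

# Hypothetical profile: 10 kWh every hour plus a 30 kWh cooling load
# running for two hours, either on-peak or shifted to the night.
baseline = [10.0] * 24
baseline[14] += 30.0
baseline[15] += 30.0
shifted = [10.0] * 24
shifted[2] += 30.0
shifted[3] += 30.0

cost_baseline = tou_cost(baseline, ON_PEAK_HOURS, ON_PEAK_PRICE, OFF_PEAK_PRICE)
cost_shifted = tou_cost(shifted, ON_PEAK_HOURS, ON_PEAK_PRICE, OFF_PEAK_PRICE)
```

Both profiles consume 300 kWh, but moving the cooling load out of the on-peak window lowers the bill from 42 to 36 currency units, which is the incentive a TOU tariff is meant to create.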

Type             | Program                   | Description
Incentive Based  | Direct Load Control (DLC) | Utility or grid operator gets direct access to customer utilities.
Incentive Based  | Curtailable rates         | Customers get special contract with limited loads.
Incentive Based  | Emergency signals         | Voluntary response to direct emergency signals from the utilities.
Incentive Based  | Capacity markets          | Customers guarantee to pitch in when the grid is in need.
Time-Based Rates | Time-of-Use (TOU)         | Fixed-price tariffs based on periods of the day with high and low power demand.
Time-Based Rates | Critical peak pricing     | Price tariff based on seasonal demand peaks (e.g., 3-6 p.m. on a hot summer weekday).
Time-Based Rates | Real-time Pricing (RTP)   | Wholesale energy market prices are forwarded to end customers.

Table 2.1: Summary of Demand Response (DR) programs used by energy providers and consumers to keep energy demand under control (adapted from (Palensky and Dietrich, 2011; Dam et al., 2008)).

2.2.2 Demand Response Standards

Customers can stabilize the power grid by increasing or decreasing their electricity consumption based on the amount of energy available in the utility grid. The problem is that this requires a large number of customers. Hence, standards are required to make this work. A networking standard for demand response, such as OpenADR 2.0B, will help grow demand response and enable large numbers of very different customers to stabilize the SG (Kyusakov and Eliasson, 2012).

OpenADR 2.0 (Open Automated Demand Response) is a network protocol standard for energy information and communication exchange, targeted at the SG to standardize, automate and simplify DR. It contains a set of data models and interfaces (exchange patterns) that define standard DR signals and the interfaces between utilities, energy markets (dynamic and transaction pricing information), Independent System Operators (ISO), Distributed Energy Resources (DER), and energy consumers (industrial or residential buildings). The communication interfaces are based on SOA and are defined using the Web Service Description Language (WSDL) with SOAP Web Services. Unfortunately, the use of these technologies raises technical challenges for resource-constrained devices due to their computing resource requirements; therefore, the OpenADR scope does not cover Internet of Things (IoT) devices.

Smart Energy Profile (SEP) 2.0 is an application layer specification targeted at IoT devices for on-premise DR and LM management. Created by the Consortium for SEP Interoperability (CSEP), SEP has been identified by the National Institute of Standards and Technology (NIST) as a primary candidate specification for energy information and control on the consumer side. The specification includes smart metering, pricing, DR, and LM applications for devices in residential and light commercial buildings operating on a Home Area Network (HAN), sometimes called Premises Area Network. SEP 2.0 runs on top of the IP protocol and therefore supports Ethernet, WiFi, powerline, and low-power radio communications. Unlike OpenADR, which uses SOAP web services, SEP 2.0 relies on RESTful web services and Create, Read, Update, and Delete (CRUD) operations.
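To illustrate what relying on RESTful services and CRUD operations means in practice, the sketch below maps the four operations onto the HTTP methods conventionally used for them, using an in-memory store instead of a real server. The resource paths and the event payload are hypothetical and are not taken from the SEP 2.0 specification.

```python
class ResourceStore:
    """In-memory stand-in for a RESTful server; no network is involved."""

    def __init__(self):
        self._resources = {}
        self._next_id = 0

    def handle(self, method, path, body=None):
        """Dispatch a request and return (status code, payload)."""
        if method == "POST":                  # Create under a collection
            self._next_id += 1
            new_path = f"{path}/{self._next_id}"
            self._resources[new_path] = body
            return 201, new_path
        if path not in self._resources:
            return 404, None
        if method == "GET":                   # Read
            return 200, self._resources[path]
        if method == "PUT":                   # Update
            self._resources[path] = body
            return 200, body
        if method == "DELETE":                # Delete
            del self._resources[path]
            return 204, None
        return 405, None


store = ResourceStore()
# Hypothetical DR event resource: reduce load by 50 kW.
created_status, event_path = store.handle("POST", "/events", {"reduce_kw": 50})
read_status, event = store.handle("GET", event_path)
update_status, _ = store.handle("PUT", event_path, {"reduce_kw": 30})
delete_status, _ = store.handle("DELETE", event_path)
missing_status, _ = store.handle("GET", event_path)
```

Because every operation is a plain request on a resource identified by its path, a constrained device only needs a small HTTP stack, without the SOAP envelope and WSDL tooling mentioned for OpenADR.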

2.2.3 Survey on Energy Management Systems

Energy Management Systems (EMS) have existed for a few decades now, but the lack of standards and demand in previous years led to many proprietary implementations and to outdated systems that have not been updated in years (Fan et al., 2005).

The proliferation of smart energy and the demand for EMS have pushed these systems further. Even so, without standards on functionalities and architectures, many different implementations with different purposes have emerged in the market.

This work presents a survey of current EMS that demonstrates exactly this phenomenon. There are thousands of EMS in the market now. For this work, the focus rested on those that either have cloud capabilities or focus on the industrial sector. The final set of EMS chosen was the following: McKinstry - EEMSuite 4, KGS - ClockWorks 5, Powertech - EMS 6, Enernoc - DemandSMART 7, GE - XA/21 8, and ABB - cpmPlus EM 9. This set results from the top search findings for specific EMS with Cloud technologies and from the collaboration with ABB.

This has proved to be a hard task, because it is difficult to make an apples-to-apples evaluation of solutions targeted at different industries, even when they provide the same features. As intended, this survey shows that current cloud EMS are still more focused on the residential and building sectors. In addition, it shows that there are some gaps between cloud-based systems and industrial systems that this work proposes to close. As we can see, industrial EMS are stronger in monitoring and controlling capabilities, but not so strong in integration and benchmarking capabilities.

4http://www.mckinstryeem.com/
5http://www.kgsbuildings.com/clockworks.aspx
6http://goo.gl/X4ElbA
7http://www.enernoc.com/for-businesses/demandsmart
8http://goo.gl/16hqj5
9http://goo.gl/CjcsLw, http://goo.gl/Ky6wv9

DemandSMART is the best example in this survey of a cloud-based EMS with DR capabilities using OpenADR 2.0 (Deliso, 2013). Created by EnerNOC 10, a leading demand response company, it ensures that participating commercial, industrial, and institutional entities receive maximum payments for their participation in demand response. EnerNOC manages more than 8,500 MW of demand response capacity worldwide on behalf of utility and grid operator clients.

According to EnerNOC, moving to the cloud allowed DemandSMART to handle larger amounts of demand response and energy efficiency than before. In line with the solution proposed in this work, EnerNOC believes that widening the scope of energy management from peripheral management to facility management, and responding to grid events and energy prices, is the natural evolution of energy management and energy intelligence.

Features surveyed

Several different sets of features were analysed while performing this survey: (i) Cloud Technologies, (ii) Systems Integration, (iii) Data Mining and Analysis, (iv) Consumption Measurement and Benchmarking, (v) Monitor and Control, and (vi) Load Management capabilities. Each of the following sets of features is summarized in Table 2.2:

• Cloud Technologies refers to cloud-based functionalities provided by the solution;

• Data Integration refers to the ability of the solution to integrate data from other systems;

• Data Mining and Analysis refers to features that extract information from raw data and present it to the end-user;

• Energy-use refers to the ability to measure energy consumption over time and compare it against key performance indicators;

• Monitor and Control refers to the capability to monitor data on different equipment and plants;

• Load Management capabilities refers to the ability of the solution to manage loads, e.g., through load shifting and load shedding.

10EnerNOC, Inc. engages in the business of providing energy management applications, services and products for the smart grid, which include comprehensive demand response, data-driven energy efficiency, energy price and risk management and enterprise carbon management applications and services. http://www.enernoc.com/

Table 2.2: Summary of general features provided by current EMS solutions (✓: feature supported, ✗: feature not supported, -: unknown information). The table compares McKinstry - EEMSuite, KGS - ClockWorks, Powertech - EMS, Enernoc - DemandSMART, GE - XA/21, and ABB - cpmPlus EM across the following features:

Target Market: Building Sector (HVAC); Industrial Sector.
Cloud Technologies: Cloud storage; Cloud capabilities (scalability, etc.); Cloud deployment (SaaS, PaaS, IaaS).
Data Integration: Energy meter data gathering; Equipment status gathering; Environmental data gathering; Production data integration.
Data Mining and Analysis: Unusual pattern detection; Efficiency advisory; Cost allocation; Demand forecast.
Energy-use Evaluation: Equipment efficiency benchmarking; Process efficiency benchmarking; Multi-site benchmarking; KPI for energy efficiency.
Monitor and Control: Real-time monitor; Remote control; Load Management capabilities; Demand Response capabilities; Alarms.

2.3 Industrial Automation Systems

Industrial Automation Systems (IAS) are complex combinations of automated systems and technologies that work together to achieve automated production. The common automation pyramid model of IAS (see Figure 2.10) arranges these systems in hierarchical layers with different rights and purposes (Givehchi, Givehchi). The pyramid shape reflects how the characteristics of information change between levels, i.e., size of data packages (higher layers carry less data), frequency of transmission, real-time requirements, availability requirements, etc. Each level can be briefly described as follows (Department of Electrical Engineering IIT Kharagpur and Iit, Department of Electrical Engineering IIT Kharagpur and Iit):

Enterprise level deals with Enterprise Resource Planning (ERP) systems, which are the least technical and focus on commercial activities such as supply and demand management, accounting, product marketing, etc.

Management level includes the Manufacturing Execution Systems (MES), which are in charge of managing production and solving problems such as production targets, resource allocation, task allocation to machines, maintenance management, etc.

Supervision level comprises the supervision and control systems such as Supervisory Control and Data Acquisition (SCADA), Distributed Control System (DCS), and Human-Machine Interfaces (HMI), which are responsible for supervising and controlling the overall production process.

Control level comprises the automatic control systems such as Programmable Logic Controllers (PLC) that monitor and drive field devices such as sensors and actuators to fulfill localized tasks.

Field level includes the devices in the field of operation that translate electrical signals into actions in the physical world (actuators) and the devices responsible for measuring the environment (sensors).

The automation pyramid describes the total automation functions and includes all devices from the field to the enterprise level: from the bottom of the pyramid, where the information is framed by the technical process, up to the top level, where the enterprise resource planning systems for business management are found. It is strictly hierarchical, and the different layers represent functions of similar type. The model also highlights the decreasing amount of data: the higher the level on the automation pyramid, the less data there is. From level to level, data is condensed and transformed into knowledge. At the bottom, many small data items from field devices (sensors and actuators) need to be gathered and transferred to the control level, where they are used to control the production process. From the control level to the supervision level, only summarized control data relevant for the operator needs to be transferred. Typically, this data is transferred less frequently and in bigger packets. At the top of the pyramid, only orders from the business IT are transferred down to the control level, while shift protocols and production KPIs are transferred up to the business IT (Givehchi, Givehchi; Bowers, 2013).
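The condensation of data from one level to the next can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `summarize_for_supervision`, the sample layout, and the chosen summary fields are hypothetical and are not taken from any of the cited systems.

```python
from statistics import mean


def summarize_for_supervision(samples):
    """Condense raw field-level samples into one summary record.

    `samples` is a list of (timestamp_s, value) tuples, as a control-level
    device might collect them from one sensor (hypothetical layout).
    """
    values = [value for _, value in samples]
    return {
        "start_s": samples[0][0],
        "end_s": samples[-1][0],
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


# Ten raw readings at the field level, one every 100 ms (made-up values).
field_samples = [(i * 0.1, 20.0 + i) for i in range(10)]

# One condensed record is what would travel up to the supervision level.
summary = summarize_for_supervision(field_samples)
print(summary)
```

In this sketch ten frequent, small readings become a single, larger record that is sent less often, which mirrors the change in packet size and transmission frequency described above.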

2.3.1 Industrial Manufacturing

Manufacturing is the process of transforming raw materials into finished goods. This involves several industrial processes which, through a set of tasks, transform the input material into the desired outputs. Goods are produced by the following manufacturing processes (Fraser, 2000):

• Process-based manufacturing is the production of goods in bulk quantities which cannot be distilled back into the original basic components due to a production recipe. Process manufacturing industries typically utilize two main processes:

– Continuous Manufacturing Processes are processes running continuously, often with transitions to make different grades of a product. E.g., fuel or steam flows in a petroleum refinery, chemical distillation, etc.

– Batch Manufacturing Processes have stage by stage production, conducted on a quantity of material. There is a distinct start and end step to a batch process with pauses in between. E.g., pharmaceutical ingredients, water purification, and inks.

• Discrete-based manufacturing industries typically conduct a series of steps on products that can be individually counted and labeled. E.g., production of automobiles, smart-phones, and airplanes.

Both process-based and discrete-based industries need control systems, sensors, and networks to orchestrate their processes in mass production.

2.3.2 Industrial Control Systems

Industrial Control Systems (ICS) are the part of IAS that encompasses the control systems, denoted as Controllers, used in industrial production to orchestrate several process-control activities through Sensors and Actuators (Stouffer et al., 2006; Department of Electrical Engineering IIT Kharagpur and Iit, Department of Electrical Engineering IIT Kharagpur and Iit). These controllers are essential elements that operate at the field level in a Control Loop, i.e., they continuously measure the physical world through sensors, decide what to do next using control hardware such as PLCs, and act on the gathered data by interacting with their environment through actuators.

Production processes can be monitored by operators and engineers using Human-Machine Interface (HMI) devices. These are used to display process status information and historical information, and to adjust parameters in the controllers.
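The measure, decide, and act steps of a control loop can be sketched as follows. This is a simplified illustration, not the logic of any particular PLC: the on/off control law with hysteresis, the temperature readings, and the names `decide` and `run_control_loop` are all assumptions made for the example.

```python
def decide(temperature_c, setpoint_c, hysteresis_c, heater_on):
    """On/off decision with hysteresis (a deliberately simple control law)."""
    if temperature_c < setpoint_c - hysteresis_c:
        return True   # too cold: switch the heater on
    if temperature_c > setpoint_c + hysteresis_c:
        return False  # too hot: switch the heater off
    return heater_on  # inside the dead band: keep the previous state


def run_control_loop(readings, setpoint_c=50.0, hysteresis_c=1.0):
    """Run one measure-decide-act cycle per reading and log the commands."""
    heater_on = False
    commands = []
    for temperature_c in readings:  # measure: value delivered by the sensor
        heater_on = decide(temperature_c, setpoint_c, hysteresis_c, heater_on)
        commands.append(heater_on)  # act: command sent to the actuator
    return commands


# Hypothetical sequence of sensor readings in degrees Celsius.
commands = run_control_loop([47.0, 48.5, 50.5, 51.5, 50.0, 48.0])
print(commands)
```

A real controller would execute each cycle within a fixed, bounded period; the list of readings only stands in for the sensor input so that the sketch can run on its own.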

Supervisory Control and Data Acquisition (SCADA)

SCADA systems supervise distributed control sub-systems, such as DCSs and PLCs, which are usually geographically dispersed (Galloway and Hancke, 2013; Stouffer et al., 2006). There are many types of SCADA systems offering different features, sometimes including remote control functionalities, but by definition these systems are tailored towards the monitoring of remote sites. To achieve this, SCADA control centers use a complex HMI to visualize the status of the remote sites and a Master Terminal Unit (MTU), which communicates with a Remote Terminal Unit (RTU) at each site. The RTU can be an individual device or incorporated within a Control Server or a PLC. Due to their purpose, these systems are usually used in distribution grids such as power, water, oil, and gas grids.

Figure 2.9: Illustration of an Industrial Automation Systems (IAS) architecture where typically a SCADA system supervises multiple DCS systems which control production using PLC systems. On top of those, other management systems control the other areas of the organization.

Distributed Control System (DCS)

DCS systems are used to control production systems within the same geographic location. Each DCS uses a centralized control loop to supervise a process or a discrete part of a production facility, which operates using a localized group of controllers. Modularizing the manufacturing into many DCS systems reduces the impact of a single fault on the overall system. These control loops are managed by a real-time Control Server (CS), which is responsible for directly gathering all the data from the controllers in the network and commanding them to execute actions. Since field controllers communicate through fieldbus protocols, the CS may sometimes need a gateway to translate between and interconnect the CS network and the fieldbus network (see Section 2.3.3).

Programmable Logic Controllers (PLC)

PLCs are small industrial computers designed to perform process logic functions by controlling the connected sensors and actuators. These systems form the core of industrial control systems and operate in hard real-time. A PLC consists of a power supply, processor, input/output, and communication modules, and offers multiple input and output arrangements, extended temperature ranges, immunity to electrical noise, and resistance to vibration and impact.

Domain: Protocols and standards
Power system automation: IEC 60870, DNP3, IEC 62351, Modbus, Profibus
Automatic meter reading: ANSI C12.18, IEC 61107, M-Bus, Modbus, ZigBee
Process automation: CIP, CAN bus, ControlNet, DeviceNet, DF-1, DirectNET, EtherCAT, EtherNet/IP, GE SRTP, HART, Honeywell SDS, HostLink, Modbus, PieP, Profibus, PROFINET IO, SERCOS interface, SERCOS III
Industrial control system: OPC DA, OPC HDA, OPC UA, MTConnect
Building automation: BACnet, C-Bus, DALI, DSI, KNX, LonTalk, Modbus, oBIX, ZigBee

Table 2.3: Summary of protocols and standards used in automated domains.

2.3.3 Industrial Networks

Industrial networks differ from the conventional networks found in the residential and building sectors. They have unique requirements such as the need for strong determinism (bounded and low latency variance), real-time data transfer (250 µs-10 ms), and fixed sampling periods. Hence, there are different network characteristics for each layer within the IAS hierarchy (Galloway and Hancke, 2013):

Control Network connects the supervisory control level (SCADA, DCS, HMI) to lower-level control modules such as PLCs.

Fieldbus Network links field devices to a PLC or other controller. Use of fieldbus technologies eliminates the need for point-to-point wiring between the controller and each device. The field device communicates with the fieldbus controller using an industrial control protocol. The messages sent between the sensors and the controller uniquely identify each of the devices.

Gateway/Router is a communications device that transfers messages between two networks. Common uses for routers include connecting a LAN to a WAN, and connecting MTUs and RTUs to a long-distance network medium for SCADA communication.
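The determinism requirement mentioned above (bounded latency with low variance) can be made concrete with a small check over measured transfer times. The sketch below is illustrative only: the function name `check_determinism`, the measurement values, and the jitter limit of 50 µs are assumptions; only the 10 ms upper bound is taken from the range quoted in the text.

```python
def check_determinism(latencies_us, max_latency_us, max_jitter_us):
    """Return (worst_latency, jitter, ok) for a series of measured latencies.

    Jitter is taken here as the spread between the slowest and the fastest
    transfer, which is one of several common definitions.
    """
    worst = max(latencies_us)
    jitter = worst - min(latencies_us)
    ok = worst <= max_latency_us and jitter <= max_jitter_us
    return worst, jitter, ok


# Hypothetical transfer times in microseconds.
measured_us = [310, 295, 305, 320, 300]

# 10 ms upper bound from the text; the 50 us jitter limit is an assumption.
result = check_determinism(measured_us, max_latency_us=10_000, max_jitter_us=50)
print(result)
```

A network can satisfy the latency bound and still fail this check if the spread between transfers is too large, which is why determinism is stated as a separate requirement from raw speed.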

2.3.4 Cyber-Physical Systems

The term Cyber-physical system (CPS) refers to a new generation of control systems with integrated computational and communication capabilities to monitor and control the physical world (Rajkumar et al., 2010). These systems interact with the physical world and must operate dependably, safely, securely, efficiently, and in real-time. CPS is the confluence of technologies such as embedded systems, real-time systems, distributed sensor systems, and controls. CPS are often referred to as embedded systems. But unlike traditional embedded systems, a CPS is typically designed as a network of interacting elements with physical input and output instead of as standalone devices (Lee, 2008).

Regarding industrial control systems, this means merging controllers such as PLCs with physical devices such as sensors and actuators, adding intelligent, real-time computational capacity capable of predicting and adapting the behavior of the system upon changes in the environment. Examples of CPS are aerospace systems, transportation vehicles and intelligent highways, defense systems, robotic systems, process control, factory automation, building and environmental control, etc.

Although the individual components of CPSs, such as sensor networks, controllers, actuators, distributed systems, etc., have reached research maturity, research on the integrated whole called CPS is still in its infancy. Thus, CPSs are considered to be an emerging discipline of research.

Figure 2.10: Illustration of the current industrial trend to move from the traditional automation pyramid architecture to Cyber-physical system (CPS) (adapted from (Givehchi, Givehchi)).

2.3.5 Internet of Things

The Internet of Things (IoT) is another novel paradigm gaining ground in academia and industry. Evidence of this momentum is the large number of definitions of IoT. The basic idea of this vision is that technology is moving towards a world where a pervasive variety of things or objects, with unique identification capabilities, are able to interact with each other and cooperate with their neighbors to reach a common goal (Atzori et al., 2010).

In this scenario, Things such as objects, machines, or people are provided with unique identifiers and the ability to exchange data over a network such as the Internet without requiring human-to-human or human-to-computer interaction, and are therefore considered to be Smart. IoT is the confluence of technologies such as machine-to-machine (M2M) communication, wireless technologies, micro-electromechanical systems (MEMS), and devices with connection to the Internet.

2.3.6 Industry 4.0

The term “Industry 4.0” refers to the emerging fourth industrial revolution promoted by the German government under the premise of “Smart Factories” (April, 2013). Its basic principle is that connecting machines, work pieces, and systems creates intelligent networks along the entire value chain that can control each other autonomously. Industry 4.0 can be a reality in about 10 to 20 years, and it will address and solve some of the challenges facing the world today, such as resource and energy efficiency, urban production, and demographic change.

The first three industrial revolutions resulted from the evolution of technology: mechanizing production, using electricity, and adopting industrial IT. Now, the introduction of the IoT, CPS, Big Data, and Cloud Computing into the manufacturing environment is ushering in a fourth industrial revolution to increase productivity, quality, and flexibility within the manufacturing industry (SmartFactoryKL, 2014).

Using these novel technologies in the manufacturing environment means employing smart machines, storage systems, and production facilities capable of autonomously exchanging information, triggering actions, and controlling each other independently (April, 2013).

2.4 Cloud Computing

The Cloud is a concept that interests many companies that need less maintenance overhead, lower costs, unlimited resources, quick deployment, and easy scalability. Thus, many business sectors are trying to incorporate the cloud in their processes. The manufacturing industry is one of them (Gilart-Iglesias, 2007; Givehchi, Givehchi; Givehchi et al., 2013; Langmann et al., 2012; Luo et al., 2011; Macia-perez et al., 2012; Staggs and Mclaughlin, 2010; Qu and Yingjun, 2014; Tao et al., 2011; Xu, 2012). But there are some common questions surrounding this new technology: What kind of information should be stored there? What are the benefits and risks involved? Is moving toward cloud computing right for the industry? How could it help to perform energy management?

The cloud is no “silver bullet” solution. It has strengths and weaknesses, and it should not be applied without thinking. Understanding the trade-offs is necessary to make the right decision.

2.4.1 Cloud Computing Concepts

The cloud itself is a pool of resources (networks, servers, applications, data storage, and services) which the end user can access and use on-demand (Mell and Grance, 2011; Meenakshi, 2012). The cloud is an umbrella term that gathers diverse technologies into one. Technologies such as clusters, grids, and now cloud computing have all aimed at allowing access to large amounts of computing power in a fully virtualized manner, by aggregating resources and offering a single system view. The aim of the cloud is to provide computing as a utility (Voorsluys et al., 2011).

Many authors and entities have attempted to define what exactly cloud computing is and what its characteristics are. The most accepted definition of cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction (Mell and Grance, 2011).

The objective of cloud computing is to make better use of distributed resources, combining them to achieve higher throughput and to solve large-scale computation problems (Jadeja and Modi, 2012). This is possible due to the following characteristics of the cloud (Mell and Grance, 2011):

On-demand self-service means that a consumer can unilaterally acquire computing capabilities, such as server time and network storage, as needed and automatically, without requiring human interaction with each service provider.

Broad network access in the sense that capabilities are available over the network and accessed through standard mechanisms that promote heterogeneous use by thin and thick client platforms (e.g., mobile phones, tablets, laptops, and workstations).

Rapid elasticity means that computing capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.

Computing resources pool serves multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. There is a sense of location independence in that the customer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter). Examples of resources include storage, processing, memory, and network bandwidth.

Measurement services control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service.

Figure 2.11: Conceptualization of the cloud architecture with the different layers and services of software and hardware depicted (source (Givehchi, Givehchi)).

2.4.2 Cloud Computing Service Models

The cloud computing model is service-oriented. In this model, hardware and software resources are delivered on-demand as services. The fundamental cloud services are categorized as (Mell and Grance, 2011):

Software as a Service (SaaS) is the model in which the consumer has the capability to use the provider’s applications on a cloud infrastructure. The applications are accessible from various client devices through either a thin client interface, such as a web browser (e.g., web-based email), or a program interface. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings. Famous examples are SalesForce.com, BaseCamp.com, and Microsoft Office 365 (Givehchi, Givehchi). Google Apps is the most widely used SaaS (Jadeja and Modi, 2012).

Figure 2.12: Illustration of the responsibilities of cloud vendors and customers in different cloud service models (adapted from (Kalakota, 2013)). The figure compares Private Cloud, IaaS, PaaS, and SaaS across the layers Applications, Data, Runtime, Middleware, O/S, Virtualization, Servers, Storage, and Networking, marking each layer as managed by the customer or managed by the vendor.

Platform as a Service (PaaS) designates the model in which the consumer has the capability to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment. Key examples are Google App Engine, Heroku, and Microsoft’s Azure (Jadeja and Modi, 2012).

Infrastructure as a Service (IaaS) specifies the model in which the consumer has the capability to acquire processing, storage, networks, and other fundamental computing resources on which the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications, and possibly limited control of select networking components (e.g., host firewalls). Examples are Amazon EC2, GoGrid, Flexiscale, CloudFoundry, Joyent and Rackspace (Jadeja and Modi, 2012).
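The division of control described in the three definitions above can be summarized in a small lookup structure. The sketch below is an illustrative encoding of the wording of those definitions only; the dictionary `CONSUMER_CONTROLS`, the resource labels, and the helper names are hypothetical and do not correspond to any provider API.

```python
# What the consumer controls in each service model, following the wording of
# the definitions above (labels are illustrative, not a standard vocabulary).
CONSUMER_CONTROLS = {
    "SaaS": {"user-specific application settings"},
    "PaaS": {"deployed applications", "hosting environment settings"},
    "IaaS": {
        "operating systems",
        "storage",
        "deployed applications",
        "select networking components",
    },
}


def consumer_controls(model, resource):
    """Return True if the consumer controls `resource` under `model`."""
    return resource in CONSUMER_CONTROLS[model]


def models_allowing(resource):
    """List the service models in which the consumer controls `resource`."""
    return sorted(
        model for model, controls in CONSUMER_CONTROLS.items()
        if resource in controls
    )


print(consumer_controls("PaaS", "operating systems"))
print(models_allowing("deployed applications"))
```

Such a table makes the trade-off explicit: moving from IaaS to SaaS reduces what the consumer has to manage, but also what the consumer is able to control.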

2.4.3 Cloud Computing Deployment Methods

The cloud provides the means through which the computation can be delivered as a service. These services can be deployed in several different ways (Mell and Grance, 2011):

Private cloud system infrastructure is provisioned for exclusive use by a single organization comprising multiple consumers (e.g., business units). It may be owned, managed, and operated by the organization, a third party, or some combination of them, and it may exist on or off premises.

Community cloud system infrastructure is provisioned for exclusive use by a specific community of consumers from organizations that have shared concerns (e.g., mission, security requirements, policy, and compliance considerations). It may be owned, managed, and operated by one or more of the organizations in the community, a third party, or some combination of them, and it may exist on or off premises.

Public cloud system infrastructure is provisioned for open use by the general public. It may be owned, managed, and operated by a business, academic, or government organization, or some combination of them. It exists on the premises of the cloud provider. Examples of public cloud providers include Salesforce.com, Amazon Web Services, and Microsoft.

Hybrid cloud system infrastructure is a composition of two or more distinct cloud infrastructures (private, community, or public) that remain unique entities, but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).

2.4.4 Benefits of Cloud Computing

The features of the cloud and cloud computing bring many benefits for vendors and customers, such as the following (Voorsluys et al., 2011):

Saving on IT costs and maintenance because it avoids overhead costs for acquiring and maintaining hardware, software, and IT staff. Consumption is billed as a utility, usually by hour slots, with minimal upfront costs. Thus, customers only pay for what they need at each moment. Since the solution stretches and grows without the need to buy hardware, extra software licenses, or programs, cloud computing solutions become more affordable over time (see SAP11 case study in Figure 2.13). In many cases, customers are even offered the latest updates as long as they continue to subscribe to the service.

Easy access and up-to-date data because applications can be easily accessed from anywhere in the world with an Internet connection and a browser, i.e., without having to download or install anything. The cloud keeps all files in one central location and everyone accesses the same repository.

Less time-to-benefit with quick deployment of IT infrastructures and applications. The software deployment times and resource needs associated with rolling out end-user cloud solutions are significantly lower than with on-premises solutions (Berggren et al., 2013).

11 SAP AG is a German multinational software corporation that makes enterprise software to manage business operations and customer relations. Headquartered in Walldorf, Baden-Württemberg, Germany, with regional offices around the world, SAP is the leader in the market of enterprise applications in terms of software and software-related services. http://www.sap.com/

Figure 2.13: Study of the typical costs of an on-premises SAP Human-Resources information system for a 10,000-employee company versus the costs of an equivalent SAP cloud solution. The on-premises solution cost increases in year 5 because there are upgrades, re-implementation, and license costs involved that do not exist in cloud deployments (adapted from (Berggren et al., 2013)).

Improving business processes with better and faster integration of information between different entities and processes.

Scalability on demand to cope with constantly changing environments and usage. To scale vertically (or scale up) means to add resources to a single node in a system, typically involving the addition of CPUs or memory to a single computer. To scale horizontally (or scale out) means to add more nodes to a system, such as adding a new computer to a distributed software application.
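The difference between scaling out and scaling up can be expressed as two simple sizing calculations. This is a rough sketch under the assumption that load and node capacity can each be described by one number (requests per second here); the function names and the figures are hypothetical.

```python
import math


def nodes_needed(load_rps, capacity_per_node_rps):
    """Scale out: number of identical nodes needed to serve the load."""
    return math.ceil(load_rps / capacity_per_node_rps)


def node_capacity_needed(load_rps, nodes):
    """Scale up: capacity each node must reach if the node count is fixed."""
    return load_rps / nodes


# Hypothetical load of 2500 requests/s and nodes that handle 400 requests/s.
scale_out_nodes = nodes_needed(2500, 400)

# Keeping two nodes instead means each one must be upgraded to this capacity.
scale_up_capacity = node_capacity_needed(2500, 2)

print(scale_out_nodes, scale_up_capacity)
```

In practice the choice also depends on whether the application can be distributed across nodes at all, which this sketch does not model.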

2.4.5 Risks and Concerns of Cloud Computing

The adoption of a cloud approach in a traditional and sensitive domain such as the industrial domain raises some valid concerns. The major concerns about the cloud are:

Security and privacy concerns are the first to arise when handing over sensitive control and data (containing data of customers, consumers, and employees, business know-how, and intellectual property) to a third party (Xu, 2012). Having to share an infrastructure with unknown outside parties requires a high level of assurance in the security mechanisms used for logical separation. One way to handle this concern is to use a proper cloud deployment architecture, such as a hybrid cloud (with sensitive data kept on-premise) or a private cloud, together with proper security techniques (Combs, Combs) such as secure and encrypted connections (TLS/SSL with X.509 digital certificates), encrypted data storage (using AES-256 military-grade encryption), authentication and identity management (using Active Directory/LDAP services), end-to-end data integrity (using SHA-1, Secure Hash Algorithm), and privately retained keys (ensuring that all information requests must involve the owner) (Mohamed, 2012; Archer and Boehm, 2009). A more in-depth treatment of security lies outside the scope of this work and should be researched in future work.

Network performance is a major concern for the industrial sector (Inductive Automation, 2011). Real-time monitoring systems are hard to implement in the cloud due to latency issues. The term latency refers to the time between an interaction and the final response. In a local network, data is constantly flowing back and forth through servers, routers, switches, and other hardware. Moving data to or from a cloud data center additionally involves passing through the cloud provider network, whose speed and quality of service are decided by the provider and which can be overloaded, and through extra security layers, e.g., firewalls. The increased and unpredictable latency can lead to a very unsatisfactory real-time experience, cause errors, or affect the productivity of production lines.

Reliability is also a concern for the industry (Inductive Automation, 2011). Servers can crash, connections can go down, and the more connections there are, the more possibilities there are for disconnections. The more the customer’s critical production processes depend on the cloud, the more the customer depends on the cloud Quality of Service (QoS). If something goes wrong with the cloud system, the customer must wait for the cloud provider’s IT staff to fix it, and in the meantime production processes can stop and cause lost revenue.
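The observation that more connections mean more possibilities for disconnection can be quantified with the standard formula for elements in series: if every element must be up and failures are assumed to be independent, the end-to-end availability is the product of the individual availabilities. The sketch below applies this formula; the chain of three elements and their availability figures are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days, ignoring leap years


def series_availability(availabilities):
    """Availability of a chain in which every element must be up.

    Assumes that the elements fail independently of each other.
    """
    return math.prod(availabilities)


def expected_downtime_hours(availability):
    """Expected downtime per year for an availability between 0 and 1."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical chain: plant gateway, Internet link, cloud service.
chain = [0.999, 0.995, 0.9995]
total = series_availability(chain)

print(f"end-to-end availability: {total:.4%}")
print(f"expected downtime: {expected_downtime_hours(total):.1f} h/year")
```

The end-to-end figure is always lower than that of the weakest element, so each additional link between the production process and the cloud reduces the availability the customer can expect.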

As with any new technology, issues must be addressed. But if the correct service model (IaaS, PaaS, or SaaS) and the right provider are selected, the payback can far outweigh the risks and challenges. The cloud’s performance and its ability to scale up or down with ease mean that companies can react to changing requirements faster than ever before.

2.4.6 Service-Level Agreements

To protect providers and customers, Service Level Agreement (SLA) contracts are signed, which capture the agreed-upon guarantees between the service provider and the customer (Sakr and Liu, 2012). SLAs for cloud services focus on characteristics of the data center and characteristics of the network to support end-to-end communication. SLA management encompasses the SLA contract definition, which includes a basic schema with the QoS parameters, SLA negotiation, SLA monitoring, and SLA enforcement according to defined policies.

2.4.7 Big Data and Real-Time Analytics

Over the last years we have witnessed an astonishing increase in the data produced by social networks, monitoring and control systems, scientific projects, financial transactions, mobile devices, and others. This evolution is mostly due to recent technological advances. This growth has affected businesses in several ways. On the bright side, it made it possible to do many things that could not be done before: identify business trends, explore medical data, understand the universe, and so on. On the other hand, it created a series of new problems, such as insufficient storage and processing capacity and uncertainties about data security and privacy.

The term big data was then coined to define a collection of data so large and complex that it becomes difficult to process using traditional database management tools and processing applications. This includes the capture, storage, search, sharing, transfer, analysis, and visualization of the data (Mark Beyer, 2012). Beyer (Mark Beyer, 2012) characterizes big data as high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization. Big data is characterized by three metrics (the V’s of big data):

• Volume stands for the size of the data under consideration.

• Variety deals with the number of different data types and sources.

• Velocity refers to the speed of data generation or how fast data is processed.

These are the metrics most widely accepted by the community, but several authors have extended them with more dimensions (more V’s), such as veracity (the messiness or trustworthiness of the data), variability (the variance of lexical meaning), value (the monetary amount that can be produced out of raw data), and others.

Traditional database systems are among the systems affected by this evolution. They have been pushed to the limit, and in an increasing number of cases they have failed to cope with this growth (Marz, Marz). Traditional database systems handle internal and structured data sources, whereas big data systems also handle unstructured and semi-structured data as well as external data sources. This is particularly interesting for applications that need to process data to provide features such as real-time monitoring or real-time analytics.

The Lambda Architecture

A recent development entitled Lambda Architecture (LA) proposes an innovative architecture to provide real-time analytics in big data systems (Marz, Marz). This architectural style emerged from a need that the authors identified in previous experiences with big data systems: robust systems that are fault-tolerant against both hardware failures and human mistakes, that serve a wide range of workloads and use cases, and that support low-latency reads and updates. The LA describes a set of principles to enable both batch and real-time (stream) data processing in the cloud. The architecture consists of three layers, as follows:

Batch layer processes high volumes of data by collecting, processing, and outputting a group of transactions over a period of time. The storage in this layer is managed by Apache Hadoop 12 (an open source platform for storing massive amounts of data). The batch layer stores the master data set using HDFS and computes arbitrary result views by performing MapReduce operations (MapReduce is a programming model for processing large data sets with parallel, distributed algorithms, using Map and Reduce procedures).

Speed layer processes views in real time using distributed and fault-tolerant stream processing solutions, such as Storm 13 and S4 14. The processing is done automatically every time new data enters the system.

12 http://hadoop.apache.org/ 13 http://storm-project.net/ 14 http://incubator.apache.org/s4/

Figure 2.14: Conceptualization of the lambda architecture to achieve real-time analytics, composed of a batch layer and a speed layer that respectively perform batch processing and real-time computing, and a serving layer that merges and serves the output of the other two layers.

Serving layer indexes and exposes pre-computed views so that they can be queried ad hoc with low latency by Hadoop query implementations such as Cloudera Impala 15.
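The interplay of the three layers can be illustrated with a minimal Python sketch. It is not taken from the Lambda Architecture reference material: the function names (`batch_view`, `speed_view`, `serve`) and the per-meter energy totals are assumptions chosen only for illustration. The batch layer recomputes a view from the whole master data set, the speed layer covers only the records that arrived after the last batch run, and the serving layer merges both views at query time.

```python
from collections import defaultdict

def batch_view(master_data):
    """Batch layer: recompute totals per key from the whole master data set."""
    view = defaultdict(float)
    for key, value in master_data:
        view[key] += value
    return dict(view)

def speed_view(recent_data):
    """Speed layer: totals for data not yet absorbed by a batch run."""
    view = defaultdict(float)
    for key, value in recent_data:
        view[key] += value
    return dict(view)

def serve(key, batch, speed):
    """Serving layer: merge the pre-computed batch view with the real-time view."""
    return batch.get(key, 0.0) + speed.get(key, 0.0)

# Hypothetical (meter id, kWh) records.
master = [("meter-1", 10.0), ("meter-2", 4.0), ("meter-1", 5.0)]
recent = [("meter-1", 1.5)]

batch = batch_view(master)
speed = speed_view(recent)
total_meter_1 = serve("meter-1", batch, speed)
print(total_meter_1)
```

In this sketch a query never waits for a batch run: stale batch results are complemented by the small real-time view, which is discarded once the next batch run has absorbed the same records.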

Service-Oriented Architectures

Service Oriented Architecture (SOA) is a loosely-coupled architectural style that supports service orientation, i.e. it provides functionality as services to other systems; therefore, it is often used to integrate legacy and external systems (Microsoft, 2014a; Mora et al., 2012). A service acts as a black box to its consumer. A service is a self-contained logical representation of an activity that has a specified outcome.

SOA targets system autonomy (i.e. systems whose functionality is independent from others), interoperability (i.e. heterogeneous systems that are capable of sharing information with each other), and extensibility (i.e. systems that can be changed or enhanced with minimal costs). An SOA consists of a service provider, a service requester, a service broker, and a service description or interface, together with architectural principles and patterns. Each service provider publishes to a broker server a description, also called a contract or interface, which describes its services, capabilities and invocation requirements. The broker server is responsible for managing the list of available services and for replying to service consumers whenever they ask about a service; this is also known as a Discovery Service. Service consumers must follow the description of a service to be able to establish communication and use it.

This architectural style is widely used in other sectors to interconnect heterogeneous systems, where each component belongs to a different vendor and offers different features using different system specifications. Industrial Automation Systems (IAS) have exactly the same environment. Thus, it is not surprising that current literature in industrial automation (Delsing et al., 2011; Karnouskos and Colombo, 2011; Mora et al., 2012) and research projects such as IMC-AESOP look at SOA as the solution for more interoperable and efficient manufacturing.

15 http://www.cloudera.com/
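The publish and discover roles described above can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: the `ServiceBroker` class, the `energy-reading` service name and the returned values are hypothetical, and a real SOA deployment would use network endpoints and a standard description format rather than Python callables.

```python
class ServiceBroker:
    """Hypothetical in-memory broker that keeps published service descriptions."""

    def __init__(self):
        self._registry = {}

    def publish(self, name, description, endpoint):
        # The provider publishes its contract and how the service is invoked.
        self._registry[name] = {"description": description, "endpoint": endpoint}

    def discover(self, name):
        # The requester asks the broker; None means the service is unknown.
        return self._registry.get(name)


def energy_reading_service(meter_id):
    """Hypothetical provider implementation hidden behind the contract."""
    return {"meter": meter_id, "kwh": 42.0}


broker = ServiceBroker()
broker.publish(
    "energy-reading",
    description="energy_reading(meter_id) -> {'meter': str, 'kwh': float}",
    endpoint=energy_reading_service,
)

# The requester relies only on the published description, not on the implementation.
entry = broker.discover("energy-reading")
result = entry["endpoint"]("meter-7")
print(result)
```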

Chapter 3

Solution

This thesis intends to provide a sustainable and conceptual architectural solution, common to any cloud-native industrial EMS. Knowing that there are endless ways of implementing the same idea and that technologies change quickly over time, the main focus of this thesis is the architectural aspects of such a cloud system. Thus, business and external technical aspects are only lightly covered and are left to the organization providing or buying the system. For instance, it is up to the organizations to decide which cloud deployment setup (private, public or hybrid cloud) and which business model (Software as a Service (SaaS), Platform as a Service (PaaS), or Infrastructure as a Service (IaaS)) best fits their needs. Nevertheless, the authors developed a proof of concept to demonstrate the feasibility of this conceptual solution, using currently available technology and implementing only a specific set of use cases that clearly demonstrate the application of the cloud in this domain.

The proposed architecture was developed following a standard software development methodology with the following stages: scope analysis, requirement analysis, conceptualization, implementation, deployment, and evaluation. At the same time, a higher-level project management activity managed, planned and controlled its execution.

3.1 Scope Analysis

The industry sector uses complex heterogeneous systems that include networks, protocols, and automation systems. Industrial organizations operate in multiple domains at the same time, such as energy, quality, and production. Therefore, industrial EMS are more complex than EMS in other sectors, due to all the constraints and requirements needed to operate in these domains. Even so, as described earlier in this thesis, industrial EMS are not at their full potential because the industry still has needs that current EMS implementations do not fulfil.

This chapter proposes a conceptual solution of a cloud-based industrial EMS capable of solving the needs and challenges that the industrial sector faces today (see Chapter 1).

3.2 Requirement Analysis

This section presents a high-level analysis of the requirements that the solution proposed in this thesis needs to address to achieve its objectives.

3.2.1 Big Data Requirements

Recent advances in control and sensor devices have resulted in the generation of large quantities of data (the volume of big data). This equipment now produces data in much more detail and at a much higher frequency (the velocity of big data). Therefore, time series data automatically originating from thousands of different sensors and customers (the variety of big data) presents a technical challenge in terms of collecting, storing, and processing these data in real time while producing meaningful knowledge. Our proposed solution applies big data techniques, such as batch and real-time computation, to provide real-time analytics on large quantities of incoming energy and machine data streams.

3.2.2 Real-Time Requirements

Various domains require real-time data processing for faster decision making: credit card fraud analytics, network fault prediction from sensor data, security threat prediction, and others. As described before, monitoring in today’s energy management systems is usually performed every minute or every 15 minutes. We propose introducing real-time monitoring in these systems to enable organizations to react faster to quick changes of events and to improve decision-making with actions based on real-time data.

The concept of real-time sometimes has different definitions for different people. Therefore, it is important to define the real-time requirement that this thesis discusses before any further development. Whenever the term real-time computing is used, it means that the system guarantees that the processing of an event is completed within a short amount of time. In the case of this thesis, the real-time requirement is in fact near real-time, which refers to the time delay introduced by automated data processing and network transmission. Therefore, the real-time requirement is in the order of a few seconds, i.e. a delay of 1 to 10 seconds at most.
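The near real-time requirement can be expressed as a simple check on the end-to-end delay of a measurement. The Python sketch below is only illustrative: the function names and the example timestamps are assumptions, and the only value taken from the text is the upper bound of 10 seconds.

```python
MAX_DELAY_SECONDS = 10.0  # upper bound of the 1 to 10 second interval stated above

def end_to_end_delay(measured_at, displayed_at):
    """Delay in seconds between a measurement at the meter and its display."""
    return displayed_at - measured_at

def meets_near_real_time(measured_at, displayed_at, max_delay=MAX_DELAY_SECONDS):
    """True when the processing and transmission delay stays within the bound."""
    delay = end_to_end_delay(measured_at, displayed_at)
    return 0.0 <= delay <= max_delay

# Hypothetical timestamps in seconds.
print(meets_near_real_time(1000.0, 1003.2))
print(meets_near_real_time(1000.0, 1015.0))
```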

3.2.3 Functionalities

The functionalities proposed for a cloud-native industrial EMS are based on the research background described before and on ABB’s expertise. They represent the fundamental functionalities necessary to address the needs and challenges that the industry is facing and that are described in this thesis. Thus, not every possible feature is described here, only the ones that the authors find essential:

Energy Monitoring

• Real-time monitoring of energy information such as consumption, efficiency, intensity, quality, cost and others, integrated with production data, to enable more energy-aware manufacturing and facilitate quicker, more informed decisions.

• Real-time monitoring of energy consumption of multiple sites and meters to dynamically assess the viability of participating in DR initiatives.

• Alarm and event notification management to keep managers up to date with the latest events.

Energy Analytics

• Energy efficiency benchmarking across entities such as metering and production equipment (sensors, actuators, motors, machines, etc.), areas (manufacturing floors, departments, production sites, etc.), and production processes to enhance energy efficiency.

• Energy efficiency KPI evaluation to analyze the success or failure of on-going energy and manufacturing strategies.

• Cost allocation to characterize energy costs by entity (equipment, production processes, areas, etc.) in order to provide transparency and to identify energy cost saving opportunities.

• Pattern analytics and forecasting to be able to identify and predict anomalies and demand peaks.

Data integration

• Integration of automation data from external IAS to derive more knowledge and achieve a more energy-aware production.

• Integration of external data sources such as weather, day-ahead and real-time energy price markets, and others.

3.2.4 Quality Attributes

The solution proposed by this thesis, allied with cloud computing, will provide the following quality attributes:

System qualities

• Availability and reliability – Any system component failure is recovered by a redundant copy. The complete cloud system can also be made redundant by deploying it in multiple availability zones.

• Modifiability and portability – The proposed architecture offers low change costs by providing separation of concerns and coarse-grained modules through multiple decoupled components. Based on SOA, this architecture also enables system modifiability by separating service interfaces from service implementations.

• Performance – With virtually infinite and redundant computation resources, the architecture is able to quickly respond to requests and to enhance some types of mathematical algorithms.

• Security – The software and system architecture proposed enables security measures from end-to-end such as built-in firewalls and secure connections.

• Interoperability – The adoption of communication standards enables communication between heterogeneous systems.

Business qualities

• Time-to-benefit and Costs – A cloud solution that is already running and available benefits the customer with quick deployment and cost savings, because the customer avoids acquiring, deploying and maintaining the necessary IT infrastructure and staff, and pays only for the service in use. From the vendor’s point of view, with a multi-tenant architecture in place, the solution is ready to handle more customers without any architectural or system change, resulting in recurring revenue over time.

• Targeted market – This adaptable, scalable, and modular solution is targeted at manufacturing companies of any kind and size.

• Integration – Data is centralized in the cloud, thus the system is able to integrate all kinds of data and to perform data mining 24/7. In addition, with its web services online, it is able to integrate external data into the system at any time. In contrast to on-premises EMS, this cloud solution enables the vendor to evaluate the usability and performance of the solution over time, making changes to the system if necessary, and to correlate data from multiple customers if needed.

3.2.5 Use Case Diagrams

A set of use case diagrams was developed to portray the different types of users and the various ways they can interact with the system. These provide a simplified, graphical, higher-level view of the capabilities of the system. Due to their simple nature, they are an ideal communication tool to explain the solution proposed here. These use case diagrams convey the requirements described before in the form of use cases. A use case is represented by a circular shape and addresses a set of requirements. Each use case performs a list of steps that typically require an action from an actor or a system (represented by a human icon) to achieve a certain goal. The more complex use cases, namely Energy Monitoring, Energy Analytics, and Data Collection, are described in sub-use case diagrams.

Figure 3.1: Use case diagram with a high-level view of the Energy Cloud system that describes its main capabilities and actors.

Figure 3.2: Use case diagram that describes the Energy Monitoring use case and depicts its various monitoring capabilities and the actors that interact with this part of the system.

Figure 3.3: Use case diagram that describes the Energy Analytics use case and depicts its various analytic functionalities and the actors that interact with this part of the system.

Figure 3.4: Use case diagram that describes the Data Collection use case and depicts how the system gathers, handles, and stores data originating from the different systems.

3.3 Conceptualization

The proposed architecture is based on Marz and Warren’s lambda architecture (Marz, Marz) and on the time-series cloud architecture for industrial processes of Goldschmidt et al. (Goldschmidt et al., 2014). Hence, it uses two parallel layers that process continuous streams of time series data from energy metering devices at different speeds (see Figure 3.5). The system also responds to requests from users through Dashboards. These requests are processed according to their type. For example, on-demand jobs use the Analytics API to trigger batch processing jobs in the Energy Analytics layer, while historical data visualization uses it to request historical data from the Timeseries DB API.

The point of input is the Data Collector (DC) component. Data is pushed from the multiple sites to this component at any frequency (e.g. one measurement per minute or per second). Then, the DC forwards these data to the system’s Message-Oriented Middleware (MOM) to be handled by the following layers:

3.3.1 Energy Monitoring (Real-time computing)

This high-speed pipeline layer processes all incoming time-series data, applies simple data transformations, and outputs the processed data without storing it. The main concern of this layer is to transfer data from one end of the system (sites) to the other (desktop and mobile clients) as fast as possible, so that users can react faster to quick changes of events.

Its Data Consumers are constantly listening for and consuming incoming data. They then push these data to another stage where they are converted, normalized, and sampled in configurable, very short time windows (e.g. 1 or 5 seconds) that keep only the minimum and maximum values measured inside the window. This prevents data floods from meters that are incorrectly configured and are measuring or pushing data at a frequency higher than accepted or even necessary. Thus, only the essential data packages travel through the layer, keeping performance under control. In the end, the layer calculates the top energy consumers per site and transfers these results, together with all the individual sampled energy metering values, to the Real-Time Messaging Servers, so that they can be propagated to all connected desktop and mobile clients through WebSockets.
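The window sampling step can be sketched as follows. This is a minimal Python illustration and not the code of the prototype: the function name `sample_min_max`, the tuple layout of a reading and the example values are assumptions. Each reading is assigned to a fixed time window, and only the minimum and maximum values of each window are kept.

```python
def sample_min_max(readings, window_seconds):
    """Group (timestamp, value) readings into fixed windows and keep min/max.

    Returns a dict mapping the window start time to (min value, max value).
    """
    windows = {}
    for timestamp, value in readings:
        start = int(timestamp // window_seconds) * window_seconds
        if start in windows:
            low, high = windows[start]
            windows[start] = (min(low, value), max(high, value))
        else:
            windows[start] = (value, value)
    return windows

# Hypothetical meter pushing more often than necessary (values in kW).
readings = [(0.0, 5.0), (1.2, 7.5), (4.9, 6.0), (5.0, 3.0), (9.9, 4.5)]
sampled = sample_min_max(readings, window_seconds=5)
print(sampled)
```

Whatever the push frequency of a meter, at most two values per window leave this stage, which is what bounds the load on the rest of the layer.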

3.3.2 Energy Analytics (Batch processing)

This slow-speed layer also processes all incoming data. However, it autonomously stores and analyzes these data using a series of pre-defined programs (“jobs”). Its concern is to run complex algorithms that extract knowledge from all available data sources, a process also known as Knowledge Discovery in Databases (KDD). Its objective is to produce valuable knowledge that is hard to unveil. This is made possible because data are fed in parallel to several cloud computing clusters that perform different data mining jobs. Hence, this layer takes more time to complete due to the complexity of the algorithms in use. The partial outputs are stored in a Distributed Historical Storage and re-used in further computations.

[Figure 3.5 diagram. Recoverable labels: Clients (web browser, connected over HTTPS and WebSockets); Cloud Cluster with Dashboards, Real-Time Messaging Servers, Analytics API, Timeseries DB, Message-oriented Middleware, Data Collector and Sites Simulator; Energy Analytics (Hadoop Cluster) with Data Aggregation & KPI Engine, Energy Efficiency Benchmarking, Abnormal Energy Usage Detection, Batch Processing Cluster, Knowledge Discovery Cluster, Pattern Recognition Jobs Cluster and Distributed Historical Storage; Energy Monitoring (Storm Cluster) with Data Consumers, Filtering, Normalization & Sampling, and Top Energy Consumers; Industrial Sites with Energy Meters and Energy Cloud Connectors. Technology labels: KairosDB, Spring, Freeboard.IO, Node.JS, RabbitMQ, Apache Storm, Cassandra, HBase, Spark, Mahout, OPC UA. Legend: system component, storage component, request/response communication channel, bi-directional communication channel, read/write access, technology used.]

Figure 3.5: Conceptual architecture proposed for cloud-native industrial EMS with the Energy Monitoring layer computing streams of data in real-time and calculating the top consumers per site (right side). The Energy Analytics batch processing layer computes the same data but derives knowledge autonomously through a set of pre-defined algorithms and archives the data in full detail (center). Various other components enable interaction in real-time or on demand with users or external systems (top area) or with the multiple sites and meters of the organization (bottom area).

3.4 Energy Cloud

This thesis intends to provide a sustainable and conceptual architectural solution common to any cloud- native industrial EMS. Nevertheless, to prove its feasibility and claims, the authors developed a proof of concept, entitled Energy Cloud. Running entirely in the cloud, this implementation is a scalable and real- time industrial energy management system capable of monitoring (Energy Monitoring) and analyzing (Energy Analytics) time series energy metering data from external or simulated energy meters. The following sections depict the technologies and implementations for each architectural component and how they work together as a unified solution.

3.4.1 Dashboards

The interaction between the user and the system is made through a presentation tier called Dashboards. Its concern is to display the energy status of the fleet of sites and to provide data exploration functionalities (see Figure 3.6). It is essentially a collection of web pages that use JavaScript libraries such as Freeboard.io (an open source library to create real-time IoT dashboards) to visualize data in real time with lines (to display the variation of energy consumption over time), pies (to depict the ratio of consumption per site), and columns (to provide a side-by-side comparison of energy usage per site). These pages are served by a cluster of web servers.

The dashboard page comes with a fixed widget that shows an overview of the energy consumption per site and the top energy consumers. The user can then dynamically customize the rest of the screen with other widgets. Each widget subscribes to a set of independent data sources and reacts to incoming data by updating its visualization in real time. For this project, additional widgets were developed apart from the ones that come with the library itself. In particular, a line chart widget was developed to display time-series data over time using the Highcharts charting library. This widget is especially useful to visualize and compare the energy load profile curves of individual equipment or sites, or the evolution of a certain KPI, over a period of time. An additional pie chart widget was developed to display the ratios of energy consumption within one site.

In the Freeboard.io library, visualization widgets are decoupled from data sources, meaning that the same widget can display data from different types of data sources, e.g. JSON files, WebSockets, web services, third-party messaging systems, etc.
In this prototype, out-of-the-box JSON data source plugins were used to collect data from external systems such as energy exchange markets (to obtain day-ahead energy prices), production systems, and weather web services. These could also be used to autonomously pull data (such as computed energy KPI or other variables) from the Analytics API at a certain refresh rate. A custom WebSockets data source plugin was developed to receive data from the internal Real-Time Messaging Servers. These data sources were then bound to the visualization widgets that display real-time data.

The analytics page allows the user to interact with the Analytics API to pull and visualize historical data. On this page the user specifies the time range to investigate and which metrics to query. In this implementation, the available metrics are energy consumption measurements and previously calculated KPI. Once the user confirms the query, the page sends an asynchronous request to the Analytics API to handle this call (see Figure 3.7).

3.4.2 Analytics API

The Analytics API is the interface between the Dashboard, or any external system, and the system components of the Energy Analytics layer. Decoupling the Dashboard from internal components brings many benefits. The first one is security. This intermediate layer acts as a black box, i.e. the user does not know how the system works internally and does not have any direct access to internal systems.

Figure 3.6: Energy monitoring dashboard where the user can visualize the current energy consumption of multiple sites and equipment with line charts and gauges, and monitor the most significant energy consumers with pie charts and columns.

Secondly, by abstracting data from presentation, we enable other HMI besides the Dashboard, such as mobile and desktop applications, to be developed later. These all obtain the same data but present it in different ways.

It also allows the addition of extra logic per operation. For example, a simple request to query for existing metrics in the system can perform additional actions like sorting and filtering. Another example is when requests require the use of transactions, i.e. complex operations that follow a series of steps and that roll back if some step fails.

Furthermore, it creates interoperability by enabling the exchange of information with external systems that can use these data in their processes.

On the other hand, this extra layer adds round-trip time to get data from the persistent data stores. However, given the speed of today’s internet connections, this time is so short that it becomes irrelevant. As for the additional time it takes to process each request, it depends on the specifications of the machines running the API and on the implementation of each operation. Another point to take into consideration is that this single entry point to the system might become an attractive target for security attacks. Therefore, proper security and authentication measures must be used.

In this proof of concept, the authors used the Spring Web Framework to create the Analytics API. It offers a set of operations to pull historical energy consumption measurements and KPI from the Timeseries DB. It also offers options to execute on-demand Energy Analytics computation jobs. Background daemon threads hold on to these requests and reply to them as soon as the operations are completed.
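The on-demand job mechanism can be illustrated with a small sketch. Note that the prototype was built with the Spring Web Framework; the Python code below is not that implementation, and the `AnalyticsApi` class and the `average_consumption` job are hypothetical names used only to show the idea of a facade that hands a job to a background thread and replies once the job has completed.

```python
from concurrent.futures import ThreadPoolExecutor

def average_consumption(values):
    """Hypothetical analytics job: mean of a list of kWh measurements."""
    return sum(values) / len(values)

class AnalyticsApi:
    """Hypothetical facade: clients submit jobs and never touch internal systems."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def submit_job(self, job, *args):
        # The job runs on a background thread; the caller receives a future.
        return self._pool.submit(job, *args)

    def shutdown(self):
        self._pool.shutdown(wait=True)

api = AnalyticsApi()
future = api.submit_job(average_consumption, [10.0, 20.0, 30.0])
job_result = future.result(timeout=5)  # the reply is available once the job completes
api.shutdown()
print(job_result)
```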

Figure 3.7: Energy analytics dashboard where the user can query, visualize, and correlate the evolution of historical energy data or previously calculated KPI.

3.4.3 Message-oriented Middleware (MOM)

The message-oriented middleware (MOM) is the cross-layer component that enables the exchange of messages between several distributed systems. There are many open source options to choose from. Since the AMQP industry standard protocol was needed to support a wide variety of developer platforms, the authors chose RabbitMQ because it is a robust, yet easy to use and deploy, message queuing system. A collection of exchanges and queues was created to queue all input data that comes from the Data Collector, to be consumed by both computing layers, and all output data from the Energy Monitoring layer, to be consumed by the Real-Time Messaging Servers.
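The way one published message reaches both computing layers can be sketched with an in-memory stand-in for an exchange and its bound queues. The sketch below does not use RabbitMQ or any AMQP client, and it assumes a fan-out style of routing, which the text does not specify; the class and queue names are hypothetical.

```python
from collections import deque

class FanoutExchange:
    """In-memory stand-in for an exchange that copies messages to all bound queues."""

    def __init__(self):
        self._queues = {}

    def bind(self, queue_name):
        # Create a queue bound to this exchange and return it to the consumer.
        self._queues[queue_name] = deque()
        return self._queues[queue_name]

    def publish(self, message):
        # Every bound queue receives its own copy of the message.
        for queue in self._queues.values():
            queue.append(message)

exchange = FanoutExchange()
monitoring_queue = exchange.bind("energy-monitoring")  # hypothetical queue names
analytics_queue = exchange.bind("energy-analytics")

# The Data Collector publishes once; each layer consumes from its own queue.
exchange.publish({"meter": "meter-1", "kwh": 3.2})

monitoring_message = monitoring_queue.popleft()
analytics_message = analytics_queue.popleft()
print(monitoring_message == analytics_message)
```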

3.5 Energy Monitoring

The concern of the Energy Monitoring layer is to process streams of timestamped energy data in real time. To implement this requirement, the authors needed an open source, distributed and scalable real-time processing system capable of processing large amounts of data. Storm, a distributed real-time computation system designed to be scalable, fault tolerant (with at-least-once or exactly-once processing guarantees) and programming language agnostic, seemed like the perfect choice. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop does for batch processing. Unlike batch jobs, however, Storm topologies run forever or until they are explicitly killed or un-deployed.

[Figure 3.8 diagram: Data Sources feed Spouts, whose streams pass through a graph of Bolts to an Output System.]

Figure 3.8: Conceptualization of a Storm topology with various Spouts generating streams of data from some data source and Bolts processing the data from these Spouts or other Bolts.

3.5.1 Storm Topologies

In Storm, the structure of a distributed computation is referred to as a Topology. A topology is a graph of stream computation where each node is either a spout or a bolt. Spouts essentially produce data and feed it into the topology in the form of tuples (ordered lists of elements). They can read data from HTTP streams, databases, files, message queues, etc. For example, a spout may connect to the Twitter API and emit a stream of tweets. A bolt is a component that performs stream transformations or operations. Bolts may subscribe to any number of streams emitted by spouts and other bolts, and produce new streams. Therefore, it is possible to create complex networks of stream transformations. Typical operations that bolts perform include filtering, joining, and aggregating tuples, calculations, and external database reads and writes.
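The spout and bolt roles can be mimicked with plain Python generators. The sketch below does not use the Storm API and runs in a single process; the function names and the comma-separated input format are assumptions made only to show how a spout emits tuples and how bolts filter and aggregate them.

```python
def meter_spout(raw_lines):
    """Spout: emits (meter id, value) tuples read from a data source."""
    for line in raw_lines:
        meter_id, value = line.split(",")
        yield (meter_id, float(value))

def filter_bolt(stream, minimum):
    """Bolt: drops tuples whose value is below a threshold."""
    for meter_id, value in stream:
        if value >= minimum:
            yield (meter_id, value)

def sum_bolt(stream):
    """Bolt: aggregates the values per meter id."""
    totals = {}
    for meter_id, value in stream:
        totals[meter_id] = totals.get(meter_id, 0.0) + value
    return totals

# Hypothetical data source; the chained calls form a small linear topology.
source = ["m1,2.0", "m2,0.0", "m1,3.5", "m2,1.0"]
totals = sum_bolt(filter_bolt(meter_spout(source), minimum=0.5))
print(totals)
```

In a real Storm topology the stream is unbounded and each component runs as many parallel instances, so an aggregating bolt would emit partial results periodically instead of returning a final dictionary.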

3.5.2 Storm Architecture

Storm clusters are composed of two kinds of nodes: master and worker nodes. The master node runs a daemon called Nimbus, responsible for distributing code around the cluster, assigning tasks to machines, and monitoring for failures. Inside each worker node, there are several processes running: a single Supervisor process and multiple Worker processes, i.e. Java Virtual Machine (JVM) processes. The Supervisor starts and stops worker processes as necessary based on what Nimbus has assigned to it. Each worker process executes a subset of a certain topology. Several topologies may run at the same time, using worker processes spread across many machines.

[Figure 3.9 diagram: a Nimbus node, three ZooKeeper nodes (one elected leader and two backups), and several Supervisor nodes, each running Worker (JVM) processes that contain Executors (threads) running Tasks (spout/bolt instances).]

Figure 3.9: Conceptualization of the software and hardware architecture of a Storm cluster deployment. The Zookeeper nodes do not belong to Storm but they are used to perform coordination and machine discovery.

Executors are threads that are spawned by a worker process and run within the worker’s JVM. An executor may serially run one or more tasks for the same component (spout or bolt).

Tasks are bolt or spout instances, and they perform the actual data processing. The number of tasks for a component (bolt or spout) is always the same throughout the lifetime of a topology, but the number of executors (threads) for a component can change over time, which allows the system to scale.
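The relation between a fixed number of tasks and a variable number of executors can be illustrated as follows. The round-robin assignment used here is an assumption made for illustration and is not a description of Storm’s actual scheduler; the function and task names are hypothetical.

```python
def assign_tasks(task_ids, executor_count):
    """Distribute a fixed list of tasks over executors in round-robin order."""
    executors = [[] for _ in range(executor_count)]
    for index, task_id in enumerate(task_ids):
        executors[index % executor_count].append(task_id)
    return executors

tasks = ["task-0", "task-1", "task-2", "task-3"]  # fixed for the topology lifetime

before = assign_tasks(tasks, executor_count=2)  # each thread runs two tasks serially
after = assign_tasks(tasks, executor_count=4)   # after scaling: one task per thread
print(before)
print(after)
```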

3.5.3 Storm Clusters

Storm relies on ZooKeeper to coordinate the Nimbus process and the Supervisor processes. Through ZooKeeper, Nimbus automatically discovers new Supervisor nodes and integrates them into the cluster without any interaction from the user.

Additionally, both the Nimbus and Supervisor daemons are fail-fast and stateless, because all their state is kept in ZooKeeper or on local disk. The result is a stable cluster with high performance. In a recent benchmark, a Storm topology processed one million 100-byte messages per second per node (2x Intel E5645 2.4 GHz processors and 24 GB of memory).

3.5.4 Energy Storm

The authors developed a custom topology (a graph of computation nodes), entitled Energy Storm, to perform the necessary real-time computation on streams of data. The first nodes of this topology consume data in parallel from the MOM using the AMQP protocol; messages are delivered in a round-robin way because of the binding established with RabbitMQ. Each message is then converted from its raw data type (JSON, CSV, text, etc.) to the internal data representation (Java objects). The individual messages are then grouped by their unique ids and transferred to the node responsible for sampling that individual meter. Hence, data originating from one meter always goes to the same sampling node, which therefore has all the data it needs to sample. Nevertheless, each node can handle data from meters of different sites, which balances the load of incoming data.

The sampling nodes collect the data in time windows. For the sake of simplicity, these time windows are essentially classes that hold only the minimum and maximum values read. Storm’s supervision mechanism then orders these classes to flush their data to the next nodes after a configurable time. Finally, the following nodes transfer the sampled energy measurements to the MOM. At the same time, a set of parallel nodes also receives these data and calculates the top energy consumers per site. These nodes hold the computed data in memory according to a configurable Time-To-Live (TTL) variable. This enables further calculations until a new measurement arrives and also tolerates brief periods of missing data. Like the time-window sampling nodes, these nodes are ordered by Storm to continuously transfer their calculations to the MOM for further distribution.
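Two of the ideas above can be sketched in a few lines: routing every message of a meter to the same sampling node, and keeping top-consumer data in memory until a TTL expires. This is a plain-Python illustration for a single site, not the Energy Storm code; `sampling_node_for`, `TopConsumers`, and the CRC32-based hash are assumptions.

```python
import zlib


def sampling_node_for(meter_id, num_nodes):
    """Stable hash, so one meter always maps to the same sampling node."""
    return zlib.crc32(meter_id.encode("utf-8")) % num_nodes


class TopConsumers:
    """Keeps the latest reading per meter and forgets it after ttl seconds."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.latest = {}  # meter_id -> (value, timestamp)

    def update(self, meter_id, value, now):
        self.latest[meter_id] = (value, now)

    def top(self, n, now):
        # Drop readings older than the TTL, then rank the remaining ones.
        self.latest = {
            m: (v, t) for m, (v, t) in self.latest.items() if now - t <= self.ttl
        }
        ranked = sorted(self.latest.items(), key=lambda item: item[1][0], reverse=True)
        return [(m, v) for m, (v, _) in ranked[:n]]


consumers = TopConsumers(ttl_seconds=60)
consumers.update("oven", 500.0, now=0)
consumers.update("heater", 200.0, now=30)
```

Because a reading survives until its TTL runs out, the ranking stays available during a brief gap in the data, and a meter that stops reporting eventually drops out of the list.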

3.5.5 Real-Time Messaging Servers

Real-time messaging servers are network servers that consume data from the MOM and distribute them to any subscribed client in real time through WebSockets. The authors used Node.js to manage all the WebSocket connections and to push any incoming data from the MOM queues, in real time, to every listening client.

3.6 Energy Analytics

The batch processing cluster performs complex computations. There are endless opportunities here, and much research has been carried out in recent years in fields such as data mining, machine learning, and pattern recognition. These computations might even already be performed by active on-premises systems. In that case, they must be migrated to reliable cloud computation systems, such as Apache Hadoop, that can scale and process incoming data in batches against historical data.

3.6.1 Hadoop Cluster

[Figure 3.10 diagram: Energy Monitoring (Storm topology). Labels: Raw Input Data Queue; Data Collection; Conversion (JSON to Java); Sampling (Max/Min. Windows); Top Consumers (Expiring Windows); Sampling Archiving (Java to JSON); Output Data Queues. Legend: spout, a source of data stream; bolt, a processing unit; raw data; computed data.]

Figure 3.10: Conceptualization of the Storm topology developed in the Energy Cloud project with streams of data coming from the RabbitMQ queue, being processed through several stages by a set of distributed bolts that output data to various other queues.

In this proof of concept, a simple use case was chosen to demonstrate the inherent possibilities. Based on the work of Koufakou et al. (Koufakou et al., 2008), the authors developed a simple outlier detection system (MR-AVF) to identify energy consumption peaks per site. The user is thereby made aware of significant energy usage that might be inefficient or unexpected and that could later lead to higher energy costs or, worse, to penalties from the energy provider for exceeding a contracted limit. Therefore, it is of the utmost importance to keep these peaks under control.

MR-AVF is based on the MapReduce paradigm for parallel programming. It provides fast and scalable outlier detection by analyzing each individual point and scoring it by the average rate of occurrence of its values. The more infrequent or irregular a value is, the more likely it is to be an outlier. This rather simple approach is easy to parallelize and implement. Even so, it is faster and sometimes more efficient than other, more complex calculations.
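The scoring idea described above can be sketched as a small map/reduce pair followed by a scoring pass. This is a single-process illustration of the general attribute-value-frequency idea, not the MR-AVF code of the proof of concept; the discretized attributes (`"low"`/`"high"`, `"day"`/`"night"`) are hypothetical.

```python
from collections import Counter


def map_phase(points):
    """Map: emit ((attribute index, value), 1) for every value of every point."""
    for point in points:
        for attr_index, value in enumerate(point):
            yield (attr_index, value), 1


def reduce_phase(pairs):
    """Reduce: sum the counts per (attribute index, value) key."""
    counts = Counter()
    for key, one in pairs:
        counts[key] += one
    return counts


def avf_scores(points):
    """Score each point by the average frequency of its attribute values."""
    counts = reduce_phase(map_phase(points))
    return [
        sum(counts[(i, v)] for i, v in enumerate(point)) / len(point)
        for point in points
    ]


# Hypothetical discretized readings: (consumption level, time of day).
points = [("low", "day"), ("low", "day"), ("low", "night"), ("high", "night"), ("low", "day")]
scores = avf_scores(points)
outlier_index = scores.index(min(scores))
```

The point with the lowest score contains the rarest values and is the most likely outlier; here that is the single `("high", "night")` reading.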

3.6.2 Timeseries DB and Distributed Historical Storage

Data from the computation layer are persistently stored in distributed storage. Following the principles of the lambda architecture, all detailed raw data are stored by time and are immutable. The authors used KairosDB, a fast, distributed, and scalable time series database that works on top of Cassandra, an industry standard for distributed NoSQL databases. KairosDB provides an easy-to-use abstraction layer (API) to push and pull time series data to and from the Cassandra cluster.
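To give an idea of what pushing time series data looks like, the sketch below builds a request body in the JSON shape that KairosDB's REST API accepts on its `/api/v1/datapoints` endpoint (a metric name, a list of `[timestamp in ms, value]` pairs, and tags). The metric name, tag names, and sample values are hypothetical, and the code only builds the body without sending it.

```python
import json


def build_datapoints_payload(site_id, meter_id, readings):
    """Build the JSON body for a batch of readings of one meter.

    readings is a list of (timestamp_ms, value) pairs.
    """
    return [
        {
            "name": "energy.consumption",  # hypothetical metric name
            "datapoints": [[ts, value] for ts, value in readings],
            "tags": {"site": site_id, "meter": meter_id},  # hypothetical tags
        }
    ]


payload = build_datapoints_payload(
    "site-1", "meter-7", [(1400000000000, 12.5), (1400000060000, 13.1)]
)
body = json.dumps(payload)  # would be sent with an HTTP POST
```

Tagging each point with its site and meter is one possible design; it lets later queries filter or group by either dimension without changing the metric name.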

This setup was shown to be very robust, scalable, and suitable for industrial processes by Goldschmidt et al. (Goldschmidt et al., 2014). A cluster of 24 KairosDB nodes could handle the workload of a large city (6 million smart meters).

3.7 Deployment

This proof of concept was deployed on ABB’s development cluster, which runs OpenStack, an open source software platform for building private and public clouds. All the components described before, except the Storm cluster, were deployed on top of Cloud Foundry, a PaaS system that made it much easier to deploy the applications developed in Spring and Node.js, as well as services such as KairosDB, RabbitMQ, and Cassandra. Furthermore, it provided easy-to-use commands to scale all the components vertically and horizontally, independently of each other.

Chapter 4

Evaluation

The major contribution of this thesis is a conceptual solution for a modern cloud-native industrial EMS. It intends to address the needs and challenges that the industry is facing with current EMS implementations and to provide extra capabilities through the use of the cloud. The evaluation of this proposal is based on three criteria: feasibility, relevance, and performance.

Its feasibility is demonstrated through the implementation of a proof of concept, which shows that the envisioned solution can indeed be implemented using today’s technologies. Although it is not a full product with every possible aspect implemented, the proof of concept developed by the authors implements the components necessary to support the claims of this thesis.

The second criterion validates the relevance of the proposed solution. Here, an evaluation framework assesses the importance of this work and how it contributes to the community by testing it against a set of important use cases and decisive questions.

Finally, the third part validates its performance. In this part, the authors subject the proof of concept that sustains this proposal to a series of benchmarking tests to assess its capabilities under different workloads. It is essential that the solution continues to work properly under any kind of circumstances in order to prove its reliability, robustness, and scalability.

4.1 Conceptual Evaluation

Evaluating the proposed solution from a conceptual point of view is important to assess whether the decisions taken and the proposals made here are in fact relevant to the community and valid for the objectives discussed. To accomplish this evaluation, the authors gathered a set of use cases and analyzed how the proposal approaches each of them.

4.1.1 Use Cases

To prove the application and benefits of the cloud in current EMS, the authors focused on five use cases, extrapolated from recent literature and from the expertise of industry experts, that represent innovative functionalities that the industry lacks or that could not be easily obtained in multi-site management without the cloud.

| KPI | Focus | Description |
|---|---|---|
| Power | Energy consumption | Instantaneous or average power used by a process |
| Energy consumption | Energy consumption | Energy input into a process during a defined time period |
| Production energy consumption | Energy consumption | Energy consumption per manufactured product (items or units) |
| Energy costs | Energy costs | Monetary cost of energy used including fixed and variable components |
| Production energy costs | Energy costs | Energy costs per manufactured product |
| Energy losses | Energy efficiency | Energy use associated with non-value adding process steps or operating states |
| Energy efficiency | Energy efficiency | Ratio between the total energy used and the one used only in production |

Table 4.1: Summary of the energy key performance indicators (KPI) that the industry is in need of and that were implemented in the Energy Cloud project (adapted from (Vikhorev et al., 2013)).

UC1 - Monitor the most significant energy consumers

Monitoring is necessary to derive knowledge of current energy use. In particular, the most significant energy consumers need to be autonomously identified, monitored, and analyzed in real time, to facilitate the use of the system and increase industrial energy efficiency, e.g. by supporting the judgment as to whether anticipated energy savings such as DR could be achieved or not, the scheduling of tasks to avoid peak loads, and the use of on-site power generation (Bunse et al., 2011). This requires standardization of data collection, on-line data processing, and visualization techniques (Vikhorev et al., 2013).

UC2 - Calculation of energy performance indicators (KPI)

The system must calculate energy-related KPIs to enable actors within an organization to react to negative developments. Nowadays, there is a need for effective energy efficiency KPIs to track changes and improvements at both the process and the plant level (Bunse et al., 2011) (see Table 4.1).
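A few of the KPIs of Table 4.1 can be computed directly from sampled power values, as the sketch below shows. It assumes evenly spaced samples and a simple tariff with one fixed fee and one price per kWh; the function names and all numbers are hypothetical.

```python
def energy_kwh(power_kw_samples, interval_minutes):
    """Energy consumption: sum of power x time for evenly spaced samples."""
    return sum(power_kw_samples) * interval_minutes / 60.0


def energy_cost(kwh, fixed_fee, price_per_kwh):
    """Energy costs: fixed component plus variable component."""
    return fixed_fee + kwh * price_per_kwh


samples_kw = [30.0, 60.0, 60.0, 30.0]  # one sample every 15 minutes
units_produced = 90

kwh = energy_kwh(samples_kw, interval_minutes=15)
kwh_per_unit = kwh / units_produced            # production energy consumption
cost = energy_cost(kwh, fixed_fee=10.0, price_per_kwh=0.2)
cost_per_unit = cost / units_produced          # production energy costs
```

With these inputs the process used 45 kWh in one hour, i.e. 0.5 kWh per manufactured unit.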

UC3 - Historical data visualization and correlation

Besides monitoring, the system must also provide targeting. With targeting, one compares the current energy consumption behaviour of sites or individual pieces of equipment with a set of targets, in order to identify management priorities for action, e.g. a certain percentage reduction over a given period. The energy consumption behaviours, usually known as load curves or power profiles, should be analyzed using different timescales to develop holistic energy efficiency strategies. For example, at plant level such analysis may be useful to identify peak loads that would attract a surcharge from the energy utility company (Vikhorev et al., 2013).

UC4 - Pattern matching and data usage analysis

The system must derive real-time intelligence from data streams to enable low-latency decisions in response to changing conditions. Using pattern matching techniques, the system must be able to unveil patterns in a stream of events, such as abnormal peaks and troughs or deviations of energy use from a reference operating state, which can indicate an equipment malfunction, and to identify the different stages of production in order to quantify idle time, among others (Vikhorev et al., 2013).

UC5 - Benchmark energy usage and efficiency

Benchmarking of similar equipment should be facilitated. Benchmarks should be available that state where other sites, or even other companies with the same challenges, stand, in order to increase energy efficiency with the same process quality (Bunse et al., 2011). Due to the complexity involved in this use case and to time restrictions, this task was left as future work.

4.1.2 Results and Discussion

After gathering the use cases necessary to conceptually evaluate this proposal, the authors, together with ABB’s industry experts, evaluated how the proposal addresses each one of them by answering the following questions:

• Q1 - What is the value added by the use case?

• Q2 - How easy is it to obtain the same results without the use of the cloud?

• Q3 - Which aspects of the cloud does this use case highlight?

• Q4 - How is this use case implemented?

The answers to these questions can be found in Table 4.2. These results show that the proposed solution supports and addresses every depicted use case and provides additional value that could not be obtained before or could not be obtained with such ease. Therefore, we conclude that this proposal is in fact an innovative, beneficial, and relevant solution that could enhance the processes of today’s organizations in many ways.

4.2 Performance Evaluation

The industrial sector is a very sensitive domain regarding the performance of every system in use. Industrial processes operate in a supply chain in which every system affects the following one across all levels of the organization, from production to distribution and management. Systems are more and more connected and dependent on each other, and their performance directly affects the ongoing business processes. Therefore, it is important that these systems keep running and operating successfully, so that these processes are not disrupted.

Energy management is an important part because it directly affects production and quality, especially if the production processes are highly dependent on energy. In that case, in the worst scenario, a failure of the EMS can force the interruption of a production process, which can have a serious impact on the organization. The solution proposed here therefore has to comply with these requirements and prove to be robust and reliable enough to operate in such an environment.

In addition, this solution might be deployed in a multi-tenant setting, i.e. an organization implementing and deploying this proposal may run the system to serve multiple customers in the same cluster. It can also be the case that only one customer operates in a single cluster. In any case, the system must be ready to scale with the number of meters managed, because the number of customers, or the number of meters that one customer has, can increase unexpectedly.

The architectural decisions that led to the solution proposed in this thesis intend to address these performance requirements. Evaluating its performance means evaluating the proof of concept developed, i.e. if the proof of concept copes with these requirements, then it shows that it is a correct implementation of the conceptual solution and that the conceptual solution is in fact valid for the industrial sector. Thus, the authors developed a series of test cases that put the prototype under different workload situations and evaluated its performance according to a set of metrics.

4.2.1 Data Sets

Energy and automation data are the fundamental input to the solution detailed in this thesis. Hence, to evaluate its proof of concept, energy and automation data originating from metering equipment have to be provided. These data can be simulated, but they have to follow the characteristics of real data to be accepted as valid. Taking that into consideration, the authors conceived a data set based on typical industrial energy load curves and used it to evaluate the proof of concept. This data set was also essential during the development of the latter, to test the system during the different stages of development. In addition, real data provided by ABB was also used in this prototype.

Simulated Data

Simulating the operation of industrial organizations involved the creation of a data set of fictional data with sites, meters, and energy curves. The data regarding the hierarchical relation between sites and meters was provided by an ABB web service. From all the information provided by this service, only the data regarding the unique identification of each site and meter was used.

The energy curves, on the other hand, were manually created by the authors and were based on typical industrial energy consumers, e.g. industrial ovens with sudden peaks of energy consumption and heating systems with a peak during the hottest part of the day. To create these curves, the authors adapted the use of LIMBO, an Eclipse-based tool for modeling load variation curves (von Kistowski et al., 2014). The main purpose of this tool is to generate curves for benchmarking distributed systems by recreating the variation in the number of requests from multiple clients. The authors adapted these curves so that, instead of representing the number of requests per minute, they represent energy load curves. Each generated curve contained a list of 1440 Cartesian points, i.e. one point per minute during a full day, with an incremental delta in the x-coordinate and a generated energy value in the y-coordinate, e.g. [(0, 10),(1, 30)...(1440, 23)].

For this data set we created a collection of 100 sites with at least two meters per site: one meter simulating the operation of an oven and another simulating a heating system. Each simulator instance represented one customer.
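To make the shape of such curves concrete, the following sketch generates two synthetic load curves with one point per minute of a day. It does not use LIMBO; the generator functions, their parameters, and the resulting values are all hypothetical and only mimic the two consumer types mentioned above.

```python
import math

MINUTES_PER_DAY = 1440


def oven_curve(base=10.0, peak=80.0, period=240, peak_minutes=20):
    """Flat base load with a sudden peak at the start of every period."""
    return [
        (minute, peak if minute % period < peak_minutes else base)
        for minute in range(MINUTES_PER_DAY)
    ]


def heating_curve(base=5.0, amplitude=30.0, peak_minute=840):
    """Smooth curve with a single daily peak at peak_minute (14:00)."""
    points = []
    for minute in range(MINUTES_PER_DAY):
        phase = 2 * math.pi * (minute - peak_minute) / MINUTES_PER_DAY
        points.append((minute, base + amplitude * (1 + math.cos(phase)) / 2))
    return points


oven = oven_curve()
heating = heating_curve()
```

Each list holds 1440 `(minute, value)` pairs, so a virtual meter can replay one point per simulated minute.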

Real Data

The performance of the proof of concept when dealing with real data was also taken into consideration. To perform this evaluation, ABB provided a data set with real energy data from one of their sites, measured over a period of time. In fact, this data set did not differ much from the simulated one. The main difference was that the number of energy measurements, i.e. the number of points per energy curve, was much higher.

4.2.2 Virtual Energy Cloud

To simulate the operation of energy meters in multiple industrial sites, the authors developed a software tool, entitled Virtual Energy Cloud, that generates time-series data streams to emulate physical energy meters. This multithreaded application enabled the concurrent operation of multiple virtual energy meters on one machine. This was fundamental to dynamically change the number of active sites and meters and to evaluate the proposed solution easily.

In this case, the authors developed a Java desktop application with a stacked three-layer software architecture. This application shows that the cloud can also be used as a remote service from the desktop. The three-layer model is a software architecture pattern aimed at enterprise business applications with complex system and communication requirements (Microsoft, 2014b). It consists of coarse-grained modules in a unidirectional relation with each other, which allows any layer to be changed independently. Hence, this design pattern promotes modifiability, portability, and code reuse. The three-layer architecture has the following layers:

• Presentation layer is the topmost level of the application. It communicates with the other layers and outputs their results to the front end of the application.

• Business layer contains the business logic that controls the functionality of the application.

• Domain layer consists of database servers and service gateways that store and retrieve data. This layer keeps data independent from application servers or business logic.

Figure 4.1: User interface of the Virtual Energy Cloud simulator to visualize the active sites and meters (left side) and the data being generated by the active threads (right side).

This architecture is a big advantage in fields where requirements and features change constantly. In this case, it proved to be particularly beneficial during the development of this thesis’ proof of concept, because it enabled experimentation with different techniques in each layer without affecting the others.

This simulator was created using the Spring Framework, an open source application development framework and inversion of control container for the Java platform. The core features of the Spring Framework provide functionality that is common to most Java applications and thus facilitated the development of this project.

Domain Layer

The domain layer was implemented with Hibernate, an open source Java persistence framework. This Object-Relational Mapping (ORM) library was important to abstract the databases used by the simulator from its internal object representation. The library takes care of mapping Java classes to database tables and provides data query and retrieval operations. Hence, if the database system changes, there is no need to change the simulator code. Internally, data and tables are treated as normal Java objects. The final set of classes used to represent the domain entities of the simulator consisted of (see Figure A.1):

• Metering class to represent the individual points of a load curve.

• LoadCurve class to represent kinds of energy curves, each with a collection of Metering objects.

• MeterType class to represent different types of metering equipment.

• Meter class to represent energy meters, each with its LoadCurve association to characterize the curve measured by the meter and an association with MeterType to depict its type.

• SiteMeters class to represent the different sites, each with a collection of Meter objects.
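The relations between these entities can be sketched as plain data classes. The class names follow the list above, but the field names are hypothetical, and the sketch omits the Hibernate mapping used in the actual Java implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metering:
    """One point of a load curve."""
    minute: int
    value: float


@dataclass
class LoadCurve:
    """A kind of energy curve, made of Metering points."""
    name: str
    points: List[Metering] = field(default_factory=list)


@dataclass
class MeterType:
    """A type of metering equipment."""
    name: str


@dataclass
class Meter:
    """An energy meter with the curve it measures and its type."""
    meter_id: str
    load_curve: LoadCurve
    meter_type: MeterType


@dataclass
class SiteMeters:
    """A site with its collection of meters."""
    site_id: str
    meters: List[Meter] = field(default_factory=list)


# Two meters of one site can share the same LoadCurve object.
oven_curve = LoadCurve("oven", [Metering(0, 10.0), Metering(1, 30.0)])
electricity = MeterType("electricity")
site = SiteMeters("site-1", [Meter("m1", oven_curve, electricity),
                             Meter("m2", oven_curve, electricity)])
```

Sharing one curve object between several meters is what makes it worthwhile to cache common curves in memory.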

The final data storage configuration used under Hibernate to persistently store the simulated data set involved a web service, a MySQL server, an H2 database, and Ehcache. The web service was a common service used by ABB to provide data about sites and equipment. Several different ABB projects use this service to avoid re-implementing and re-creating these data in their own systems. The MySQL server, an open-source relational database management system, ran on a local machine and stored all the points of all the energy curves. This allowed the authors to run multiple simulators simultaneously on different machines while sharing the same data from the same MySQL server. Each instance running the simulator had its own H2 database, an open-source in-memory relational database management system, and its own Ehcache, an open-source in-memory cache. The H2 database is created and loaded on start-up with data from the MySQL server.

Without this local database in each simulator, the MySQL server would be a bottleneck for simulators running simultaneously, because their requests to load the energy curves would take so long that they would stall any other concurrent request. This way, the data is transferred from the server only once, and once all the simulators are loaded they can operate without affecting each other. Ehcache was used to boost performance by caching in memory the common energy curves, i.e. collections of Java objects, used by multiple meters. Thereby, all the data needed was loaded into memory. Thus, the execution of these simulators was very fast, with each meter generating and publishing data at intervals as short as a couple of hundred milliseconds if needed.

Business Layer

The business layer of the Virtual Energy Cloud project performs the simulation operations based on the data provided by the domain layer and also exchanges data with the upper layers through services. Separating its implementation and logic from the other layers enabled us to change these inner operations without affecting the surrounding layers.

Hence, the presentation layer is not dependent on the service layer, and the only way they communicate is through the events that are exchanged between them. Therefore, the only thing that the methods in the presentation layer need to know is the specification of the events that the service layer expects.

In this application, the business layer is divided into two parallel layers with distinct concerns. The service layer is responsible for mediating data, in the form of objects, with the upper layers, e.g. getting the list of available sites and meters or getting the energy curve of one meter. The gateway layer, however, provides the mechanisms to publish data out of the simulation tool.

Internally, the gateway layer is composed of classes that manage the active sites. The ActiveSitesManager class keeps track of all the active sites with internal Java concurrent HashMap collections. Each entry is identified by the site id and contains an ActiveSite object. The latter represents an active site, and its concern is to keep track of all the active meters of the site. Like the previous class, the ActiveSite class also uses Java concurrent HashMaps to manage the collection of active meters (see Figure A.2).

The active meters are in fact threads. Once a thread is activated, it starts publishing data at the defined rate to the RabbitMQ system. As soon as an event to disable a certain meter arrives at this layer, the ActiveSitesManager catches the event and forwards the order to the corresponding ActiveSite instance, which kills the running thread.
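The structure described above can be sketched as follows. The original is a Java application built on concurrent HashMaps and publishing to RabbitMQ; this Python illustration uses dictionaries guarded by locks and a hypothetical `publish` callback instead, and all method names are assumptions.

```python
import threading


class ActiveMeter(threading.Thread):
    """A virtual meter: a thread that publishes one value per interval."""

    def __init__(self, meter_id, interval_seconds, publish):
        super().__init__(daemon=True)
        self.meter_id = meter_id
        self.interval = interval_seconds
        self.publish = publish
        self._stop_event = threading.Event()

    def run(self):
        tick = 0
        while not self._stop_event.is_set():
            self.publish(self.meter_id, tick)
            tick += 1
            # Sleep for one interval, but wake up at once when stopped.
            self._stop_event.wait(self.interval)

    def stop(self):
        self._stop_event.set()
        self.join()


class ActiveSite:
    """Keeps track of the active meters (threads) of one site."""

    def __init__(self, site_id):
        self.site_id = site_id
        self._meters = {}
        self._lock = threading.Lock()

    def enable(self, meter_id, interval_seconds, publish):
        with self._lock:
            if meter_id not in self._meters:
                meter = ActiveMeter(meter_id, interval_seconds, publish)
                self._meters[meter_id] = meter
                meter.start()

    def disable(self, meter_id):
        with self._lock:
            meter = self._meters.pop(meter_id, None)
        if meter is not None:
            meter.stop()

    def active_meters(self):
        with self._lock:
            return sorted(self._meters)


class ActiveSitesManager:
    """Keeps track of the active sites, keyed by site id."""

    def __init__(self, publish):
        self._publish = publish
        self._sites = {}
        self._lock = threading.Lock()

    def enable_meter(self, site_id, meter_id, interval_seconds):
        with self._lock:
            site = self._sites.setdefault(site_id, ActiveSite(site_id))
        site.enable(meter_id, interval_seconds, self._publish)

    def disable_meter(self, site_id, meter_id):
        with self._lock:
            site = self._sites.get(site_id)
        if site is not None:
            site.disable(meter_id)

    def active_meters(self, site_id):
        with self._lock:
            site = self._sites.get(site_id)
        return site.active_meters() if site is not None else []
```

In this sketch a meter is stopped cooperatively through an event rather than killed, so the thread always finishes its current publication before it exits.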

Presentation Layer

During the development of the proof of concept, there was a need to enhance this simulator and change its output from log files to a user interface, to facilitate the debugging and testing of the system. It was hard to keep track of all the data generated by each meter with so many outputs at the same time. Therefore, a user interface was created to provide a visual representation of the energy consumed in each active site. This way, it was easy to visualize the past, present, and future energy measurements generated and to compare them with the data that the proof of concept system had in storage and in its real-time monitoring dashboards (see Figure 4.1).

In a secondary area of the interface, one can configure the number of messages per minute that are published. Additionally, two tables show all the active sites with their total energy consumption and a list of all the active meters ordered by consumption, to quickly identify the top consumers. On top of that, these tables are updated as soon as the measurements are published. Thereby, this view concentrates all the data necessary to quickly evaluate the real-time aspect of the proposed system (see Figure 4.2).

4.2.3 Test Cases

Evaluating the performance of the proof of concept involved subjecting it to different kinds of conditions and evaluating how it performs. To do so, the authors created test cases in which the ability of the system to deal with its workload is tested under different configurations. These tests had multiple variables changing over time and different metrics to assess performance. The goal of these tests is to understand whether the system sustains its claims.

Variables

In a distributed system like the one depicted here, the number of nodes is one of the variables that can influence performance. Most distributed systems achieve higher throughput and can deal with far bigger workloads when the number of nodes increases. Hence, this is one of the important variables to take into consideration.

The next variable is the number of active meters. Without knowing the architecture in use, one could argue that the number of connections could be limited or that it could in fact have some impact on the link and operation of a remote system. Thus, the number of active meters should vary over time to try to bring the system to its limits.

Another important variable to take into consideration is the number of messages per second that are fed into the system. Although there might be many active meters, that does not directly imply that the frequency of messages is higher; conversely, the meters can be configured to send messages very often. Therefore, the solution proposed here must be ready to deal with an uncertain number of messages per second, because in a real-life deployment the configuration of the metering sampling, or even the equipment itself, can suddenly change. This is actually the real workload for the proposed solution.

Figure 4.2: User interface of the Virtual Energy Cloud simulator to quickly visualize the current generated totals of energy consumption per site (left side) and per meter (right side).

Metrics

A series of metrics was defined to measure the performance of the proof of concept during the test cases. The critical computation of this proof of concept is performed in the real-time computing layer (Energy Monitoring). Therefore, the performance of each Storm node was evaluated by monitoring its hardware status. The metrics monitored in each node were the CPU activity, the memory usage, the network traffic, and the number of packets exchanged. In addition, the Storm UI provides further metrics for the running topology that were also taken into consideration: the number of messages emitted, the number of messages transferred, and the average time taken to process a tuple along the entire topology. These metrics were important to tune the configuration of the topology used by the proof of concept and to evaluate it afterwards.

4.2.4 Results and Discussion

Since the solution proposed in this thesis intends to provide real-time analytics using real-time computation, the Energy Monitoring layer is the main target of evaluation. As for the Energy Analytics layer, its performance depends on the algorithms that run inside it, and their performance is discussed in the literature that describes them. Therefore, the part of the work implemented in this thesis that is relevant to evaluate is in fact the Energy Monitoring layer. It is evaluated according to its efficiency, performance, and scalability.

Efficiency

The first aspect under evaluation was the efficiency of the real-time computing system. This requires a test case in which a data set known to the authors is fed into the system and its output is saved over time to be compared with the initial data. To achieve this, the authors developed a plugin for KairosDB that autonomously pulls from RabbitMQ the data whose id follows a specific pattern. This plugin was then used to record the input and output data of the Storm cluster.

In this first test case, we loaded the Virtual Energy Cloud simulator with the simulated data set, deployed the Storm topology, and activated a series of meters for 10 minutes. After this time, everything was stopped and the results were analyzed.

The results were very good. During this time, a collection of 1440 points from one meter (circa 2.5 points per second) entered the system. From those, Storm’s output sample had only 254 points. The result is a much simpler energy curve with much less detail. However, if we look at both curves (see Figure 4.3), we see that no important point was lost, i.e. every maximum and minimum peak point was output by Storm, and all the points in between were ignored because they do not add any more information. As a result, far fewer packets circulate through the system, which results in much higher performance. If we compare the times needed to obtain both curves from the persistent storage, we need 6,107 ms to obtain the raw data and 290 ms to obtain the sampled data (21x faster).

The conclusion is that Storm was successfully used for sampling and aggregation operations. Hence, only the necessary points are sent to the desktop clients, avoiding extra overhead that could hinder the real-time communications and the performance of Storm, the intermediate systems, and the clients’ browsers.
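The sampling behaviour discussed above can be illustrated with a small min/max window sampler, followed by the arithmetic behind the reported figures. The sampler is a sketch, not the Storm bolt of the proof of concept: it uses a fixed number of points per window (the window configuration used in the experiment is not stated here), and `minmax_sample` and the sample data are hypothetical.

```python
def minmax_sample(points, window_size):
    """Keep only the minimum and maximum point of each window, in time order."""
    sampled = []
    for start in range(0, len(points), window_size):
        window = points[start:start + window_size]
        lowest = min(window, key=lambda p: p[1])
        highest = max(window, key=lambda p: p[1])
        # A set removes the duplicate when a window is completely flat.
        sampled.extend(sorted({lowest, highest}))
    return sampled


raw = [(0, 10), (1, 30), (2, 20), (3, 5), (4, 5), (5, 5), (6, 7), (7, 50)]
sampled = minmax_sample(raw, window_size=4)

# Figures reported in the text.
speedup = 6107 / 290                      # retrieval time, raw vs. sampled
points_dropped = 1 - 254 / 1440           # share of points removed by sampling
```

The sampled list keeps the global extremes of the raw data while halving the number of points; for the reported experiment the same idea removed roughly 82% of the points and made retrieval about 21 times faster.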

Scalability

Evaluating the scalability of the Energy Monitoring layer is an important task to assess whether the real-time and big data requirements can be met. The solution relies on the distributed system used in the Energy Monitoring layer, which must scale horizontally and increase its throughput linearly. In the Energy Cloud project, the authors used Storm to implement this layer. Thus, two test cases were created to test the scalability of the Storm topology. As described before, Storm runs on multiple nodes with multiple threads. These nodes share multiple topologies on their available slots. Thereby, scaling in Storm means scaling the number of slots (threads) used. To prepare this test, the MOM queues were pre-loaded with thousands of messages to make sure that the topology could consume as many messages as possible.

Figure 4.3: Comparison between the raw data (1440 points) that is fed into the system (top area) and the data output by the Energy Monitoring layer (254 points) after being sampled and normalized (bottom area). As we can see, the sampled data has much less detail but retains all the important points that define the energy curve. The sampled data is 21x faster to obtain than the raw data.

In the first test case, the topology was deployed with one executor (thread) per component and ran for about 5 minutes. The metrics of each component were read at regular intervals. The initial results of the first test case indicated that the RabbitMQ spout and the Converter bolt were taking a considerable amount of time in comparison with the other components. Therefore, in the second test case these two components were scaled to two executors each instead of one. As we can see in Figure 4.4, the number of messages processed by each component grew linearly over time in both test cases. For example, in the first test case the RabbitMQ spout emitted an average of about 35 messages per second, with a total of 10k messages. In the second test case, the same spout with two executors emitted an average of about 80 messages per second, with a total of 23k. This is an increase of roughly 2.3 times.

The conclusion is that horizontally scaling the topology scales the throughput of the system linearly. This means that if the system reaches its limits, scaling the topology provides additional capacity. In Storm, it is possible to scale either by re-deploying the topology with a new configuration or by issuing a re-balancing command from the command line.
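The arithmetic behind this conclusion can be checked with a short sketch. The helper names `scaling_factor` and `scales_at_least_linearly` are hypothetical; the input values are the approximate figures quoted in the text (about 35 and 80 messages per second, 10k and 23k messages in total, and one versus two executors on the spout).

```python
def scaling_factor(before: float, after: float) -> float:
    """Return how many times `after` is larger than `before`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return after / before


def scales_at_least_linearly(resource_factor: float, throughput_factor: float) -> bool:
    """True when throughput grew at least as much as the resources did."""
    return throughput_factor >= resource_factor


if __name__ == "__main__":
    executors = scaling_factor(1, 2)          # spout executors: 1 -> 2
    rate = scaling_factor(35, 80)             # messages emitted per second
    total = scaling_factor(10_000, 23_000)    # total messages emitted
    print(f"rate x{rate:.2f}, total x{total:.2f}")
    print(scales_at_least_linearly(executors, rate))
```

Both ratios come out slightly above 2, which is consistent with the claim that doubling the executors of the bottleneck components at least doubles the throughput in this test.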

Figure 4.4: Storm topology test cases to evaluate scalability. In the first test case, every component has only one executor (thread) (left side). In the second test case, the RabbitMQ Spout and the Converter Bolt have two executors each (right side). The difference is roughly 2.3 times more throughput.

Performance

Another point to take into consideration is the amount of resources used by the system. It is important to evaluate how the resources of such a system are used in order to understand the degree of resource consumption versus the workload. In an optimal solution, doubling the resources should cover at least double the workload. Otherwise, extra resources have to be allocated whenever the system scales.

To perform this evaluation, the authors collected the status of the hardware used by a simple topology of one Nimbus (Storm master) and one Supervisor (Storm slave). Both components had Collectd installed, a popular software daemon that collects system performance statistics periodically. All statistics were pushed every 10 seconds to a Graphite server, a scalable real-time graphing system. In this case, the number of active meters, and hence the number of messages per second, was increased over time by a power of 2 each minute. The master node used in this test case had 1 vCPU and 1 GB of memory, whereas the slave node had 2 vCPUs and 2 GB of memory.

The results are presented in Figure 4.5 and show the evolution of CPU, memory, and network use for both components over a period of almost 10 minutes, after which the topology was killed. As we can see, the master's resources remained unchanged during the entire time, until it received the order to kill the topology. As for the slave node, its resource use grew linearly over time. The results show that its CPU use increases at almost the same ratio as the number of packets received, which means that the CPU followed the workload at almost the same pace. As for the memory, there was an increase of about 45 megabytes at a certain point, but the results do not show that it is directly related to the workload. Therefore, the conclusion is that this topology is CPU intensive, i.e. the CPU is the resource that matters when deploying this topology, because its use increases almost linearly with the workload, in contrast to the memory. Hence, scaling the CPU resources of this topology results in a computational gain of about the same ratio as the scale itself.
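For readers unfamiliar with how statistics reach Graphite, the following minimal sketch shows the plaintext line format that a Graphite server accepts (metric path, value, and Unix timestamp, one metric per line, by default on TCP port 2003). In the experiment above this was done by Collectd itself; the function names and the metric path used here are hypothetical and only illustrate the format.

```python
import socket
import time
from typing import Iterable


def format_graphite_line(path: str, value: float, timestamp: int) -> str:
    """Build one line of the Graphite plaintext protocol."""
    return f"{path} {value} {timestamp}\n"


def send_to_graphite(lines: Iterable[str], host: str, port: int = 2003,
                     timeout: float = 5.0) -> None:
    """Send already formatted lines to a Graphite server over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    # Hypothetical metric path; a real deployment defines its own naming.
    line = format_graphite_line("storm.supervisor.cpu.user", 42.5, int(time.time()))
    print(line, end="")
    # send_to_graphite([line], host="graphite.example.org")  # needs a reachable server
```

Pushing one such line per metric every 10 seconds yields the time series from which charts like those in Figure 4.5 can be drawn.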

Figure 4.5: Storm resource (CPU, memory, network) benchmarking with the number of active meters increasing constantly.

UC1 - Monitor the most significant energy consumers
Q1: Increase industrial energy efficiency by monitoring and identifying unaware high consumers of energy
Q2: Moderate, because with current on-premises systems only results from individual sites can be obtained. Performance is also limited to the system's capabilities
Q3: Real-time analytics
Q4: The Energy Monitoring layer computes the top consumers per site on a real-time basis

UC2 - Calculation of energy performance indicators (KPI)
Q1: Increase industrial energy efficiency by identifying energy improvement opportunities, increasing energy consumption awareness, and assisting the development of strategies
Q2: Hard for large and complex sites because performance is limited to the local system's capabilities
Q3: Data integration and cloud computing
Q4: Both computing layers compute and archive different KPI

UC3 - Historical data visualization and correlation
Q1: Increase industrial energy efficiency by deriving knowledge from energy use, identifying energy management inefficiencies, and increasing energy consumption awareness
Q2: Hard with multi-site correlation, because without the cloud this involves having to transfer and import data between systems
Q3: Data integration and easy access to data
Q4: With the Dashboard, the user can correlate and visualize external and historical data stored in the Distributed Persistent Storage

UC4 - Pattern matching and data usage analysis
Q1: Increase industrial energy efficiency by deriving knowledge from energy use, identifying energy management inefficiencies, and increasing energy consumption awareness
Q2: Hard for large and complex sites because performance is limited to the local system's capabilities
Q3: Data integration and cloud computing
Q4: The Energy Analytics layer continuously analyzes incoming data and performs complex computations (e.g. MR-AVF) to autonomously derive more knowledge

UC5 - Benchmark energy usage and efficiency
Q1: Increase energy efficiency by providing benchmark metrics to improve production processes and energy strategies
Q2: Very hard, because on premises the statistical population needed for benchmarking may not exist
Q3: Data integration and cloud computing
Q4: On-demand data mining jobs in the Energy Analytics layer can browse through the multiple datasets and compare energy usage profiles

Table 4.2: Summary of the answers to the questions of each use case used to evaluate the solution proposal from a conceptual point of view.
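As an illustration of the answer to Q4 of UC1 (computing the top consumers per site), the sketch below aggregates readings per meter and ranks them within each site. It is a batch approximation under stated assumptions, not the windowed Storm bolts of the Energy Monitoring layer; the `(site_id, meter_id, energy)` tuple layout and the function name `top_consumers` are hypothetical.

```python
import heapq
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical reading layout: (site_id, meter_id, energy consumed).
Reading = Tuple[str, str, float]


def top_consumers(readings: Iterable[Reading],
                  n: int = 3) -> Dict[str, List[Tuple[str, float]]]:
    """Return, for every site, the `n` meters with the highest total energy."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for site, meter, energy in readings:
        totals[site][meter] += energy

    return {
        site: heapq.nlargest(n, meters.items(), key=lambda item: item[1])
        for site, meters in totals.items()
    }


if __name__ == "__main__":
    sample = [
        ("site-A", "m1", 5.0),
        ("site-A", "m2", 7.0),
        ("site-A", "m1", 4.0),
        ("site-B", "m3", 1.0),
    ]
    print(top_consumers(sample, n=1))
```

In a streaming setting the same ranking would be recomputed over a sliding window of recent readings instead of over the whole input.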

Chapter 5

Conclusions

Following the current industry trends regarding smart and future production, this thesis focuses on how industrial EMS can be enhanced with the use of the cloud to fulfill the needs that this sector faces today. To achieve the defined goals, a research methodology was created around the main concepts of this thesis: Energy Efficiency (EE), Demand Response (DR), Industrial Automation Systems (IAS), and Cloud Computing. The research included academic literature, research projects, ABB's expertise, and surveys of current EMS solutions. The conclusion of this research is that the gap between the industry's needs and current EMS products can be narrowed by incorporating cloud technologies. In addition, big data is a present challenge due to technological advances in field equipment that have led to the generation of larger amounts of data, with much more detail and higher sampling rates. Therefore, this thesis proposes a novel cloud-native architecture for future EMS solutions to address these concerns and to cope with the big data challenges involved. The proposed solution provides means to unveil patterns of inefficiency and to enhance decision-making with real-time analytics. The feasibility, relevance, and performance of this solution were validated through the implementation and evaluation of a proof of concept called Energy Cloud.

5.1 Achievements

The development of this thesis proved to be very successful, with many achievements accomplished along the way. The greatest of them was the publication of an article entitled “Energy Cloud: real-time cloud-native Energy Management System to monitor and analyse energy consumption in multiple industrial sites”. This paper was accepted for the 7th IEEE/ACM International Conference on Utility and Cloud Computing (UCC) in London, England. UCC is the premier IEEE/ACM conference covering all areas related to Cloud Computing as a Utility. The publication is forthcoming.

On top of that, the solution proposed in this thesis and included in that paper was selected as a finalist for the UCC 2014 Cloud Challenge. The Cloud Challenge is a competition within the UCC conference where participants develop solutions for real-world problems by utilizing virtualization technologies and cloud computing. The competitors have to submit proposals outlining the nature of the problem, which might range from business and scientific to socio-lifestyle applications, the methodology used to solve the problem, and the means of validating the solution. Furthermore, the competitors have to outline how characteristics of the solution can lead to business value or improve existing processes, practices, tools, and applications. Later this year at the conference, all the finalists will present their solutions and be judged according to these criteria.

Furthermore, this article was also featured in the conference Energia@IST2014 as a workshop publication. This conference is organized by the Instituto Superior Técnico university to share ongoing research work in the energy domain.

Besides academic publications, part of the work developed in the Energy Cloud project was also accepted by the community. Namely, the Freeboard.io widgets and plugins developed for the Energy Monitoring dashboard were included in the source of this library by the company running this open source project¹. In addition, a plugin developed to extend KairosDB was also accepted and added to KairosDB's plugin directory². This is a scalable-ready plugin for KairosDB that autonomously pulls time series data from RabbitMQ³.

5.2 Future Work

This thesis presents the fundamental solution for modern cloud-native EMS. Therefore, there is still much room for improvement and development. With the goal of enabling more energy-aware and smart production, much research can be done in the Energy Analytics part of the proposed solution. New algorithms can be developed and techniques applied to further improve these systems. This, however, requires the involvement of experts from different areas such as mathematics, machine learning, and data mining. The solution depicted here can of course be fully implemented with features that were not discussed due to their low importance for this thesis. For example, a back-end system to manage the information about sites and meters, a dashboard system to save and load custom dashboards from an application database, and many more could be implemented. Furthermore, more advanced visualization widgets could be developed to provide more and better visual and exploratory capabilities.

¹ http://goo.gl/qDSQme
² http://goo.gl/xQU5ka
³ http://goo.gl/uwwSFq

70 Bibliography

Albadi, M. H. and E. F. El-Saadany (2007, June). Demand Response in Electricity Markets: An Overview. In 2007 IEEE Power Engineering Society General Meeting, pp. 1–5. IEEE.

April, W. G. (2013). Recommendations for implementing the strategic initiative Industrie 4.0. (White Paper) (April).

Archer, J. and A. Boehm (2009). Security guidance for critical areas of focus in cloud computing. Cloud Security Alliance, 0–176.

Arinez, J. and S. Biller (2010). Integration requirements for manufacturing-based Energy Management Systems. In Innovative Smart Grid Technologies (ISGT), pp. 1–6.

Ates, S. A. and N. M. Durakbasa (2012). Evaluation of corporate energy management practices of energy intensive industries in Turkey. Energy 45(1), 81–91.

Atzori, L., A. Iera, and G. Morabito (2010, October). The Internet of Things: A survey. Computer Networks 54(15), 2787–2805.

Bayindir, R., E. Irmak, Colak, and A. Bektas (2011, January). Development of a real time energy monitoring platform. International Journal of Electrical Power & Energy Systems 33(1), 137–146.

Berggren, E., E. Bjorn,¨ and F. Froschl¨ (2013). The Cloud: When & Why? (SucessFactors an SAP Company).

Bowers, N. (2013, November). The perfect pyramid of automation. Available at http://www. electronicspecifier.com/design-automation/the-perfect-pyramid-of-automation (visited in Jan. of 2014).

Bunse, K., M. Vodicka, P. Schonsleben,¨ M. Brulhart,¨ and F. O. Ernst (2011). Integrating energy effi- ciency performance in production management - gap analysis between industrial needs and scientific literature. Journal of Cleaner Production 19(6–7), 667–679.

Byun, J., I. Hong, and S. Park (2012). Intelligent Cloud Home Energy Management System Using Household Appliance Priority Based Scheduling Based on Prediction of Renewable Energy Capability. 58(4), 1194–1201.

71 Cannata, A. and M. Taisch (2010). Introducing Energy Performances in Production Management: To- wards Energy Efficient Manufacturing. Advances in Production Management Systems New Chal- lenges New Approaches 338, 168–175.

Cappers, P., C. Goldman, and D. Kathan (2010, April). Demand response in U.S. electricity markets: Empirical evidence. Energy Volume 35, Issue 4, April 2010, 35(4), 1526–1535.

Cardoso, T. (2012). Efficient Integration of Energy Data. Ph. D. thesis, Instituto Superior Tecnico.´

Carreira, P., R. Nunes, and V. Amaral (2011, October). SmartLink: A hierarchical approach for con- necting smart buildings to smart grids. 11th International Conference on Electrical Power Quality and Utilisation, 1–6.

Chiu, T.-Y., S.-L. Lo, and Y.-Y. Tsai (2012, December). Establishing an Integration-Energy-Practice Model for Improving Energy Performance Indicators in ISO 50001 Energy Management Systems. Energies 5(12), 5324–5339.

Combs, L. Cloud Computing for SCADA.

Dam, Q. B., S. Mohagheghi, and J. Stoupis (2008, November). Intelligent Demand Response Scheme for Customer Side Load Management. 2008 IEEE Energy 2030 Conference, 1–7.

David, A. K. and F. Wen (2001). Market power in electricity supply. Energy Conversion, IEEE . . . 16(4), 352–360.

Deliso, R. (2013, September). Openadr: Transforming the smart grid through standards. Available at http://energysmart.enernoc.com/bid/334732/ OpenADR-Transforming-the-Smart-Grid-Through-Standards (visited in Jan. of 2014).

Delsing, J., J. Eliasson, R. Kyusakov, A. W. Colombo, F. Jammes, J. Nessaether, S. Karnouskos, and C. Diedrich (2011, November). A migration approach towards a SOA-based next generation process control and monitoring. IECON 2011 - 37th Annual Conference of the IEEE Industrial Electronics Society, 4472–4477.

Department of Electrical Engineering IIT Kharagpur and E. E. Iit. Architecture of Industrial Automation Systems. pp. 1–21.

Fan, R., L. Cheded, and O. Toker (2005). Internet-based SCADA: a new approach using Java and XML. Computing and Control Engineering.

Faruqui, A., D. Harris, and R. Hledik (2010). Unlocking the EUR53 billion savings from smart meters in the EU: How increasing the adoption of dynamic tariffs could make or break the EU’s smart grid investment. Energy Policy 38(10), 6222–6231.

Feuerhahn, S., M. Zillgith, C. Wittwer, and C. Wietfeld (2011, October). Comparison of the communi- cation protocols DLMS/COSEM, SML and IEC 61850 for smart metering applications. 2011 IEEE International Conference on Smart Grid Communications (SmartGridComm) (c), 410–415.

72 Fiedler, T. and P.-M. Mircea (2012, October). Energy management systems according to the ISO 50001 standard — Challenges and benefits. 2012 ICATE.

Fraser, R. E. (2000). Process Measurement and Control: Introduction to Sensors, Communication, Adjustment, and Control. Premtc HaII.

Galloway, B. and G. P. Hancke (2013). Introduction to Industrial Control Networks. IEEE Communica- tions Surveys & Tutorials 15(2), 860–880.

Gilart-Iglesias, V. (2007). Industrial Machines as a Service: Modelling industrial machinery processes. Industrial Informatics (INDIN), 2012 10th IEEE International Conference, 737–742.

Givehchi, O. Industrial Automation Services as part of the Cloud: First Experiences. Conference Pro- ceeding.

Givehchi, O., H. Trsek, and J. Jasperneite (2013, September). Cloud computing for industrial automation systems — A comprehensive overview. 2013 IEEE 18th Conference on Emerging Technologies & Factory Automation (ETFA), 1–4.

Goldschmidt, T., A. Jansen, H. Koziolek, J. Doppelhamer, and H. Pei-Breivold (2014, July). Scalabil- ity and robustness of time-series databases for cloud-native monitoring of industrial processes. In Proceedings 7th IEEE Int. Conf. on Cloud Computing (IEEE CLOUD 2014) Industry Track. IEEE.

Granderson, J. and M. Piette (2011). Energy Information Handbook: Applications for Energy-Efficient Building Operations. Lawrence Berkeley National Laboratory.

Guntermann, A. E. (1982, November). Are Energy Management Systems Cost Effective? IEEE Trans- actions on Industry Applications IA-18(6), 616–625.

Hong, I., J. Byun, and S. Park (2012). Cloud Computing-Based Building Energy Management System with ZigBee Sensor Network. In Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS), 2012 Sixth International Conference on, pp. 547–551.

Inamdar, H. P. and R. P. Hasabe (2009). It based energy management through demand side in the industrial sector. In INCACEC 2009.

Inductive Automation (2011). Cloud-Based SCADA Systems: The Benefits & Risks. White Paper..

International Energy Agency (2008). World energy outlook 2008.

International Union for Electricity applications (2009). Electric Load Management in Industry.

Jadeja, Y. and K. Modi (2012, March). Cloud computing - concepts, architecture and challenges. In 2012 International Conference on Computing, Electronics and Electrical Technologies (ICCEET). IEEE.

Kalakota, R. (2013). Cloud taxonomy. Available at http://cloudblueprint.wordpress.com/ cloud-taxonomy/ (visited in Dec. of 2013).

73 Karnouskos, S. and A. Colombo (2011). Architecting the next generation of service-based SCADA/DCS system of systems. IECON 2011-37th, 359–364.

Kiliccote, S. and M. A. Piette (2005). Advanced Control Technologies and Strategies Linking Demand Response and Energy Efficiency.

Koufakou, A., J. Secretan, J. Reeder, K. Cardona, and M. Georgiopoulos (2008, June). Fast parallel outlier detection for categorical datasets using mapreduce. In IJCNN 2008.

Kyusakov, R. and J. Eliasson (2012). Emerging energy management standards and technolo- gies—Challenges and application prospects. Emerging Technologies & Factory Automation (ETFA), 2012 IEEE 17th Conference.

Langmann, R., O. Makarov, L. Meyer, and S. Nesterenko (2012, July). The WOAS project: Web-oriented Automation System. 2012 9th International Conference on Remote Engineering and Virtual Instru- mentation (REV), 1–3.

Lee, E. a. (2008, May). Cyber Physical Systems: Design Challenges. 2008 11th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC), 363– 369.

Locke, G. and P. Gallagher (2010). NIST framework and roadmap for smart grid interoperability stan- dards, release 1.0. National Institute of Standards and Technology.

Luo, Y. L., L. Zhang, D. J. He, L. Ren, and F. Tao (2011, February). Study on Multi-View Model for Cloud Manufacturing. Advanced Materials Research 201-203, 685–688.

Ma, X., R. Cui, Y. Sun, C. Peng, and Z. Wu (2010, August). Supervisory and Energy Management Sys- tem of large public buildings. 2010 IEEE International Conference on Mechatronics and Automation, 928–933.

Macia-perez, F., J. V. Berna-martinez, D. Marcos-jorquera, I. Lorenzo-fonseca, and A. Ferrandiz- colmeiro (2012). A new paradigm: cloud agile manufacturing. 45, 47–54.

Manuel, T. and F. Cardoso (2013). Efficient Integration of Energy Computer Science and Engineering Examination Committee. (August).

Mark Beyer, D. (2012). The importance of ’big data’: A definition.

Marz, N. Big Data: Principles and best practices of scalable realtime data systems.

Meenakshi, M. (2012). An overview on cloud computing technology. Advances in Computing and Information Technology (3), 244–246.

Mell, P. and T. Grance (2011). The NIST Definition of Cloud Computing, Recommendations of the National Institute of Standards and Technolog. National Institute of Standards and Technology.

74 Microsoft (2014a, January). Service oriented architecture (soa). Available at http://msdn.microsoft. com/en-us/library/bb833022.aspx (visited in Dec. of 2014).

Microsoft (2014b, January). Three-layered services application. Available at http://msdn.microsoft. com/en-us/library/ff648105.aspx (visited in Dec. of 2014).

Mohagheghi, S. and N. Raji (2012). Intelligent demand response scheme for energy management of industrial systems. In Industry Applications Society Annual Meeting (IAS), 2012 IEEE, pp. 1–9.

Mohamed, E. (2012). Enhanced data security model for cloud computing. Informatics and Systems (INFOS), 2012 8th International Conference, 12–17.

Mora, D., M. Taisch, A. W. Colombo, and J. M. Mendes (2012). Service-oriented architecture approach for industrial System of Systems: State-of-the-Art for Energy Management. In Industrial Informatics (INDIN), 2012 10th IEEE International Conference on, pp. 1246–1251.

Motegi, N., M. A. Piette, S. Kinney, and K. Herter (2003). Web-based Energy Information Systems for Energy Management and Demand Response in Commercial Buildings. Lawrence Berkeley National Laboratory.

O’Driscoll, E. and G. E. O’Donnell (2013). Industrial power and energy metering – a state-of-the-art review. Journal of Cleaner Production 41(0), 53–64.

Palensky, P. and D. Dietrich (2011). Demand Side Management: Demand Response, Intelligent Energy Systems, and Smart Loads. IEEE Transactions on Industrial Informatics 7, 381–388.

Piette, M., O. Sezgen, and D. Watson (2004). Development and evaluation of fully automated demand response in large facilities. (January).

Qu, X. and W. Yingjun (2014). Software Resource Re-sharing in Middle-sized Enterprise Cloud Manu- facturing System. 12(1), 711–717.

Raghavendra Nagesh, D. Y., J. V. Vamshi Krishna, and S. S. Tulasiram (2010, January). A real-time architecture for smart energy management. 2010 Innovative Smart Grid Technologies (ISGT), 1–4.

Rahimifard, S., Y. Seow, and T. Childs (2010, January). Minimising Embodied Product Energy to support energy efficient manufacturing. CIRP Annals - Manufacturing Technology 59(1), 25–28.

Rajkumar, R., I. Lee, L. Sha, and J. Stankovic (2010). Cyber-physical systems: the next computing revolution. Design Automation Conference (DAC), 2010 47th ACM/IEEE, 731–736.

Report, T., U. P.Mar, and Y. Simmhan (2011). An informatics approach to demand response optimization in smart grids. Proceding Conference.

Saha, B. and D. Srivastava (2014, March). Data quality: The other face of big data. In ICDE Data Engineeringa 2014, pp. 1294–1297.

75 Sakr, S. and A. Liu (2012, June). SLA-Based and Consumer-centric Dynamic Provisioning for Cloud Databases. In 2012 IEEE Fifth International Conference on Cloud Computing, pp. 360–367. IEEE.

SmartFactoryKL (2014, January). Industry 4.0 - from hype to realization. Available at http: //smartfactory.dfki.uni-kl.de/en/content/news/2013/innotag-nachlese (visited in Jan. of 2014).

Staggs, K. P. and P. F. Mclaughlin (2010). Cloud computing for an industrial automation and manufac- turing system.

Stouffer, K., J. Falco, and K. Kent (2006). Guide to Supervisory Control And Data Aquisition (SCADA) and industrial control systems security. Recommendations of the National Institute of Standards and Technology Keith.

Tao, F., L. Zhang, V. C. Venkatesh, Y. Luo, and Y. Cheng (2011, August). Cloud manufacturing: a computing and service-oriented manufacturing model. Proceedings of the Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture 225(10), 1969–1976.

Thollander, P. and M. Ottosson (2010). Energy management practices in Swedish energy-intensive industries. Journal of Cleaner Production 18(12), 1125–1133.

Uk, S. (2011, January). The green league: How businesses are reacting to the green agenda. Available at http://w3.siemens.co.uk/home/uk/en/aboutus/Documents/greenleaguereport.pdf (visited in Sept. of 2013).

Vikhorev, K., R. Greenough, and N. Brown (2013, March). An advanced energy management framework to promote energy awareness. Journal of Cleaner Production 43, 103–112. von Kistowski, J. G., N. R. Herbst, and S. Kounev (2014, March). LIMBO: A Tool For Modeling Variable Load Intensities. In Proceedings of the 5th ACM/SPEC International Conference on Performance Engineering (ICPE 2014), ICPE ’14, New York, NY, USA, pp. 225–226. ACM.

Voorsluys, W., J. Broberg, and R. Buyya (2011). Introduction to cloud computing. Cloud computing: Principles and Paradigms.

Walawalkar, R., S. Blumsack, J. Apt, and S. Fernands (2008, October). An economic welfare analysis of demand response in the PJM electricity market. Energy Policy 36(10), 3692–3702.

Walawalkar, R., S. Fernands, N. Thakur, and K. R. Chevva (2010, April). Evolution and current status of demand response (DR) in electricity markets: Insights from PJM and NYISO. Energy 35(4), 1553– 1560.

Xu, X. (2012, February). From cloud computing to cloud manufacturing. Robotics and Computer- Integrated Manufacturing 28(1), 75–86.

Appendix A

Virtual Energy Cloud Models

Figure A.1: UML diagram to describe the domain and service layer of the Virtual Energy Cloud project.

Figure A.2: UML diagram to describe the gateway layer of the Virtual Energy Cloud project.

Figure A.3: Actual size view of the Virtual Energy Cloud dashboard to manage active sites and meters.

Figure A.4: Actual size view of the Virtual Energy Cloud dashboard to visualize the metrics simulated.

Appendix B

Energy Cloud

[Figure B.1 data table; the individual values are not recoverable from the extracted text. For each test case it lists, per reading: Storm uptime; RabbitMQ get speed and messages queued; Spout RMQ (1 or 2 executors) emitted count and complete latency; and, for Bolt-Archiver (1), Bolt-Converter (1 or 2), Bolt-TopConsumers (1), and Bolt-Window (4), the executed count, execute latency (ms), and process latency (ms).]

Figure B.1: Detailed results of the test cases to evaluate the Energy Cloud project performance.

Figure B.2: Actual size view of the Energy Monitoring dashboard.

Figure B.3: Actual size view of the Energy Analytics dashboard.