A Platform Architecture for Sensor Data Processing and Verification in Buildings

Jorge Ortiz

Electrical Engineering and Computer Sciences University of California at Berkeley

Technical Report No. UCB/EECS-2013-196 http://www.eecs.berkeley.edu/Pubs/TechRpts/2013/EECS-2013-196.html

December 3, 2013

Copyright © 2013, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

A Platform Architecture for Sensor Data Processing and Verification in Buildings

by

Jorge Jose Ortiz

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy

in

Computer Science

in the

Graduate Division

of the

University of California, Berkeley

Committee in charge:

Professor David E. Culler, Chair
Professor Randy H. Katz
Professor Paul Wright

Fall 2013

A Platform Architecture for Sensor Data Processing and Verification in Buildings

Copyright 2013 by Jorge Jose Ortiz

Abstract

A Platform Architecture for Sensor Data Processing and Verification in Buildings

by

Jorge Jose Ortiz

Doctor of Philosophy in Computer Science

University of California, Berkeley

Professor David E. Culler, Chair

This thesis examines the state of the art of building information systems and evaluates their architecture in the context of emerging technologies and applications for deep analysis of the built environment. We observe that modern building information systems are difficult to extend, do not provide general services for application development, do not scale, and are difficult to set up and manage. We assert that a new architecture must be designed with four system properties – extensibility, generalizability, scalability, and ease of management – in order to address these shortcomings. Our system, StreamFS, embodies these system properties through a filesystem abstraction and a set of data services. Data services are made available to applications through an overloaded pipe abstraction. This allows for dataflow specification of processing streams to clean and analyze the streaming sensor data. We deploy StreamFS in seven different buildings and compose several applications on top of it. One of the driving applications is a phone application called the Mobile Energy Lens. The Energy Lens provides occupants with mechanisms for collecting building information in a unified platform and provides a way to view aggregate energy consumption data associated with the spatial deployment configuration of plug-load devices. We present a three-layer architecture, where one of the main layers is implemented entirely with the data management and processing services offered by StreamFS. We introduce the notion of verification of physical relationships through empirical data. We partition the verification problem into three subproblems: 1) functional verification, 2) spatial verification, and 3) categorical verification. We show how empirical mode decomposition, correlation, and standard machine learning techniques can give us information about how the sensors are related to each other, statistically and physically. We demonstrate an extensible, generalizable, scalable, and easy-to-manage system for supporting the “appification” of the built environment.

To Aileen Cruz. The love of my life.

My accomplishments are not possible without your love, support, and complete faith in my abilities.

Contents

Contents ii

List of Figures v

List of Tables x

1 The Vision of Soft Buildings 1
1.1 The Built Environment ...... 2
1.2 Pervasive Computing ...... 2
1.3 Cloud Computing, Ubiquitous Connectivity, and Mobile Phones ...... 3
1.4 Applications in Buildings ...... 4
1.5 Research Statement And Hypothesis ...... 4
1.6 Thesis Roadmap ...... 5
1.7 Statement of Joint Work ...... 6

2 Sensing in the Built Environment 8
2.1 Tightly Integrated Building Information System Architecture ...... 9
2.2 From Supervisory Control to Application Development in Buildings ...... 11
2.3 BMS Architectural Shortcomings for Supporting Emerging Application Development ...... 12
2.4 Addressing BMS Shortcomings ...... 18
2.5 Contextual Accuracy ...... 20
2.6 Experimental Setting in Real Buildings ...... 22
2.7 Summary ...... 24

3 StreamFS System Architecture 25
3.1 Overview ...... 25
3.2 Name Management ...... 27
3.3 Time-series Data Store ...... 30
3.4 Publish-Subscribe Subsystem ...... 31
3.5 Data Cleaning and Real-time Processing ...... 33
3.6 Entity-relationship Model ...... 35

3.7 Related Work ...... 37
3.8 Summary ...... 38

4 StreamFS Files and Process Mechanisms 39
4.1 Process Management ...... 39
4.2 Internal Processes ...... 40
4.3 External Processes ...... 47
4.4 Freshness Scheduling ...... 48
4.5 Dynamic Aggregation Example and Freshness Scheduling Results ...... 51
4.6 Naming and The Filesystem Metaphor ...... 54
4.7 File Abstraction ...... 54
4.8 Supporting Multiple Names ...... 59
4.9 Related Work ...... 61
4.10 Summary ...... 61

5 API and an Architectural Evaluation Through Applications 63
5.1 API Overview ...... 63
5.2 Energy Auditing With Mobile Phones ...... 65
5.3 Energy Lens Architecture and System Challenges ...... 66
5.4 Energy Lens Experience and Results ...... 74
5.5 Mounted Filesystem and Matlab Integration ...... 81
5.6 Related Work ...... 83
5.7 Summary ...... 85

6 Empirical Verification of System Functionality and Metadata 86
6.1 Verification through Sensor Data ...... 86
6.2 Types of Verification ...... 87
6.3 Functional Verification Methodology ...... 92
6.4 Functional Verification Experimental Results ...... 101
6.5 Spatial Verification Methodology ...... 106
6.6 Spatial Verification Results ...... 109
6.7 Categorical Verification Methodology ...... 118
6.8 Categorical Verification Results ...... 118
6.9 Related Work ...... 120
6.10 Summary ...... 124

7 Conclusions 128
7.1 Lessons Learned ...... 128
7.2 Future Work ...... 131
7.3 Thesis Summary ...... 132

A StreamFS Process Code 135

B StreamFS HTTP/REST Tutorial 138
B.1 Terminology ...... 139
B.2 Creating a resource ...... 139
B.3 Creating a stream file ...... 142
B.4 Pushing data to a stream file ...... 142
B.5 Bulk data insertion ...... 145
B.6 Queries ...... 145
B.7 Bulk default/stream file creation ...... 147
B.8 Creating symbolic links ...... 152
B.9 Stream Processing ...... 154
B.10 Configuration ...... 154
B.11 Start the processing element ...... 155
B.12 Creating a processing job ...... 155
B.13 Start the process ...... 157
B.14 Stopping the process ...... 159

Bibliography 161

List of Figures

1.1 Building application model. Building sensor deployments send data to the cloud and applications access it as it streams in. Applications may also feed data to the cloud and make it available to other applications...... 7

2.1 General building control loop. ...... 9
2.2 High-level control board architecture. ...... 10
2.3 BMS network architecture. ...... 11
2.4 BACNet device example. ...... 12
2.5 Screenshot of the Soda Hall Building Management System interface. ...... 13
2.6 Emerging Application: Hierarchical MPC for a cluster of buildings. ...... 16
2.7 Emerging Application: Mobile phone interfacing with the physical infrastructure. ...... 17
2.8 MPC example where metadata must be verified to maintain correct behavior. ...... 21
2.9 Distribution of energy consumption in buildings by end-use. Table reproduced here from the 2011 Buildings Energy Data Book [117]. ...... 23

3.1 StreamFS system architecture. The four main components – name register, subscription manager/forwarding engine, process manager, and timeseries datastore – are shown. It also shows the application layer at the top. ...... 26
3.2 Name management layer implemented behind HAProxy. Name servers handle individual requests and use the “name registration table” to handle name-lookup requests accordingly. ...... 29
3.3 The timeseries data store. We use OpenTSDB, a timeseries data-store that runs in a cluster of HBase instances. ...... 31
3.4 Subscription manager and forwarding engine. These components manage the mapping from sources to sinks and forward data between client applications and internal/external processes or external subscribers. ...... 32
3.5 The process manager manages a cluster of processing elements and connections to external processing units. It works closely with the subscription manager to forward data between elements. ...... 34

3.6 This figure shows how we translate the OLAP cube to a hierarchical ERG. Note how the dimensions of the cube translate to the graph. The level of the subtree is the category, the unit is specified at the node, and there are values at each node for every time slice...... 36

4.1 This figure shows the internal structure of a process element (PE). The PE consists of several javascript sandboxes where internal process code runs. It also contains a scheduling loop, message router, and instance monitor. ...... 41
4.2 This figure shows how a “slice” operation is translated from the cube to the ERG. The user queries across all streams or aggregation points at a certain level, specified by a star level query with the level-specific prefix. The corresponding slice query is query.slice(’/4F/R*’).start(t1).end(t2). ...... 44
4.3 This figure shows how a “dice” operation is translated from the cube to the ERG. The user queries across all streams or aggregation points at a certain level, specified by a star level query with the level-specific prefix. The corresponding dice query is query.dice(’/4F/R*’).units([’F’]).start(t1).end(). ...... 44
4.4 This shows an illustration of the aggregation tree used by dynamic aggregation. Data flows from the leaves to the root through user-specified aggregation points. When the local buffer is full the streams are separated by source, interpolated, and summed. The aggregated signal is forwarded up the tree. ...... 45
4.5 External process stub. Note, it contains similar components to a process element and functions much the same way, managing the buffers, scheduling, errors, and communication on the client side like the PE. ...... 47
4.6 This figure shows an example of two streams with different sampling frequencies. Since we do not know the underlying fundamental frequency of the phenomenon, our algorithm attempts to minimize error by minimizing staleness (or the average time a data point is in the receive buffer). ...... 49
4.7 Multiple streams in a subscription and their associated parameters. ...... 51
4.8 The power consumed by a laptop in Room 1 is shifted to Room 2 at time t=7. Notice the aggregate drops in Room 1 while it rises in Room 2. ...... 52
4.9 This figure shows the tradeoff between staleness and the number of streams being consumed by the job. Note that our algorithm reduces the staleness of the buffer. ...... 53
4.10 This figure shows that the min buffer algorithm provides a similar average execution period but generally at the cost of higher variance in delivery times. ...... 54
4.11 Everything is a file. Temperature sensor represented as a file in a folder that contains folders for each room. Note, the file that represents a temperature sensor producing a stream is given a unique identifier. The user may also decorate the file with extra metadata for searching purposes. ...... 55
4.12 StreamFS console. The tool allows the user to view the namespace as a set of files, interact with the system, and view stream data. ...... 59
4.13 MPC example where metadata must be verified to maintain correct behavior. ...... 60

5.1 Swiping gestures in the mobile application. The registration swipe requires only a single swipe. The linking and registration gestures require two swipes, and the look-up requires a single swipe. ...... 68
5.2 This diagram shows the relationships captured between the objects and locations in the building for the energy audit application. Children of a space node have an “is-in” relationship with the space. An item with another item as a child has an “is-attached” relationship, and meters attached to items are bound to each other. Note, this is a subset of the relationship diagrams generated across our three applications. ...... 71
5.3 Standard mechanisms for consistency management on the phone. All READ requests go to the local cached version of the ERG. All WRITES must go through the OpLog. They are eventually applied to the cache if successful and logged if StreamFS is unreachable. These components are built directly into the Energy Lens application. ...... 73
5.4 Power traces obtained from power meters attached to various plug loads on one of the floors of a building on campus. These show screen shots of the Energy Lens timeseries data display. ...... 74
5.5 5.5a resolves to the same URL as 5.5b, after resolution and redirection is complete. The short label resolves to http://tinyurl.com/6235eyw. 5.5b encodes about half the characters of 5.5a. We used tinyUrl to reduce the QR code image complexity and scan time. ...... 76
5.6 The header of the response from tinyUrl when resolving a QR code. The ‘Location’ attribute is used to extract the unique identifier for the object this QR code tags. It is also used to re-direct users without the phone application to a meaningful web address for the object. ...... 77
5.7 Screen shots of the mobile application. The screens on the left are for editing the state of the deployment. The graph on the right shows a live feed of the sensor that is attached to the item that was scanned with the ‘Scan To View Services’ option in the mobile application. It can also be resolved by scanning the QR code and following the re-direct to the URL. ...... 78
5.8 A snapshot of the connectivity graph between the wireless plug-load ACmes deployed in the 4th floor of SDH. ...... 79
5.9 SFSFuse implementation. By mapping access and operational semantics to POSIX file operations we enable legacy desktop applications to interact with the deployment directly. ...... 82

6.1 Generalized Zero Crossing: the local mean period at the point p is computed from one quarter period T4, two half periods T2^x, and four full periods T1^y (where x = 1, 2 and y = 1, 2, 3, 4). ...... 89

6.2 (a) EMD decomposes a signal and exposes intrinsic oscillatory components; (b) Aggregation of IMFs within a pre-defined frequency range makes seemingly similar signals from different locations more distinguishable; (c) IMF aggregation makes seemingly distinct signals of different sensors in the same room show high correlation. ...... 92
6.3 Correlation coefficients of the raw traces from the Todai dataset. The matrix is ordered such that the devices serving same/adjacent rooms are nearby in the matrix. ...... 93
6.4 Auto-correlation of a typical signal from the Building 1 dataset. The signal features daily and weekly patterns (resp. x = 24 and x = 168). ...... 94
6.5 Strip and Bind using two raw signals standing for one week of data from two different HVACs. (1) Decomposition of the signals in IMFs using EMD (top to bottom: c1 to cn); (2) aggregation of the IMFs based on their time scale; (3) comparison of the partial signals (aggregated IMFs) using the correlation coefficient. ...... 95
6.6 Reference matrices for the four time scale ranges (the diagonal x = y is colored in black for better reading). The medium frequencies highlight devices that are located next to each other and thus intrinsically related. The low frequencies contain the common daily pattern of the data. The residual data permits visual identification of devices of a similar type. ...... 99
6.7 Number of reported alarms for various threshold values (τ = [3, 10]). ...... 102
6.8 Example of alarms (red rectangles) reported by SBS on the Eng. Bldg 2 dataset ...... 104
6.9 Example of alarms (red rectangles) reported by SBS on the Cory Hall dataset ...... 105
6.10 The ROC curves depict the sensitivity of the raw signal and mid-frequency IMFs to the threshold value. We choose the 0.2 FPR point as the boundary threshold for each room. ...... 109
6.11 We collect data from 15 sensors in five rooms sitting on four different floors. This is a map of a section of the 3rd floor in Sutardja Dai Hall. ...... 109
6.12 Decomposition of the EHP and light trace using bivariate EMD. IMF correlation coefficients highlight the intrinsic relationship of the two traces. ...... 111
6.13 Distribution of the correlation coefficients of the raw traces and correlation coefficient averages of the corresponding IMFs using 3 weeks of data from 674 sensors. ...... 112
6.14 Map of the floor that the analyzed EHP serves (room C2). The locations of the sensors identified as related by the proposed approach are highlighted, showing a direct relationship between IMF correlation and spatial proximity. ...... 113
6.15 Two populations are examined for our threshold analysis. A solid line connects sensors in the same room while a dotted line connects pairs in different rooms. ...... 114
6.16 CDF of correlation coefficients between IMFs of sensor feeds: the dotted lines point to some threshold which divides the distribution and produces a TPR and FPR. ...... 115
6.17 The threshold values all converge to a similar value and we can derive the optimal value with as little as 14 days of data. ...... 116

6.18 Clustering with k-means on the corrcoeff matrix after applying multidimensional scaling (MDS): The EMD-based set achieves an accuracy of 80% while the raw-trace result is only 53.3% classification accuracy. ...... 117
6.19 ASO versus AGN. There is a clear value-based boundary between the two sets of traces at around 3 in the mean. ...... 119
6.20 The VR traces span a wide range; however, any mean above 100 is a VR trace. ...... 120
6.21 All temperature streams. Note, these are much more difficult to tease apart. A Gaussian mixture model can separate them with approximately 77% accuracy, but it may not generalize. ...... 121
6.22 Centers for our Gaussian mixture model. Note, 3 of the 5 centers are very close to each other. This makes these traces very difficult to tease apart and accurately classify. ...... 122

List of Tables

4.1 Summary of the four main file types and their valid operations in StreamFS. ...... 56
4.2 File operations, the file types that support them, and their general semantics. ...... 57
4.3 Summary of the 6 special-file sub-types and their valid operations in StreamFS. ...... 58

5.1 Overview of StreamFS file-related API calls. Library written in Java, PHP, and C. ...... 64
5.2 Summary of control interface callbacks in StreamFS. Library written in Java, PHP, and C. ...... 64
5.3 Summary of control interface callbacks in StreamFS. Library written in Java, PHP, and C. ...... 65
5.4 Shows the time to scan a long QR code versus a short QR code in light and dark conditions (loosely defined). Notice that short QR codes scan faster and with less variance than long ones. ...... 75
5.5 Shows the time to fetch nodes based on the size of the fetch. The fetch time increases linearly with the number of nodes. Caching maintains fetch time near that of fetching a single node. A callback is used when the cache is invalidated. ...... 79
5.6 Average operation execution time in StreamFS. ...... 80
5.7 Overview of StreamFS file-related API calls. Library written in Java, PHP, and C. ...... 82
5.8 This table summarizes the deployment statistics for StreamFS over two years. ...... 85

6.1 Classification of the alarms reported by SBS for both datasets (and the number of corresponding anomalies). ...... 103
6.2 Room Specs ...... 109
6.3 Correlation coefficients of the analyzed trace and their IMFs uncovered by EMD ...... 110
6.4 Clustering result using the thresholding method: a “1” means the sensor is classified as inside the room. We get the “X” and “×” by comparing the clustering results with ground truth. ...... 117
6.5 Categorical classification results for two data traces. ...... 119

B.1 Terminology ...... 139
B.2 Parameters ...... 147

Acknowledgments

I want to thank my advisor, David Culler, for guiding my research work and giving me the freedom to choose my own problems and go deep. David has an amazing ability to provide deep insights into any problem and has helped guide my vision in the context of all the established work, while fearlessly pushing me to examine completely new domains. I want to thank my wife Aileen Cruz, my mother and father, Judy Zamora and Rudy Ortiz, my grandmother Lilia Fajardo, and my sister Patricia Rivera. They have supported me throughout my entire career in ways that are too long to list here. I have spoken to my family almost every day since I moved to California in 2005, and they have been with me throughout; separated by many miles but never closer. Aileen has offered unwavering confidence in me since my years at M.I.T., through my arduous transition from Oracle to Berkeley, and for all my years in the program. We arrived as two and leave as a family of four, with our baby Elias on the way and our wonderful dog Juani. A special thanks to Professor Randy Katz, for always being encouraging, offering deep insights, and excellent baseball conversation. Ever the die-hard SF Giants fan, I will always remember the games we attended at AT&T Park and our discussions about the statistics of the game. I also want to thank Professor Paul Wright for giving me great feedback during my qualifying exam that guided the direction of the work from its inception. I want to thank my colleagues David Zats, Dezhi Hong, and Romain Fontugne for the work we did together. I will cherish my interactions with them and I am happy to call them friends. I also want to thank Michael Andersen for his feedback on my dissertation and for taking the time to find grammatical errors throughout. A special thanks goes to Jay Taneja, Prabal Dutta, Jaein Jeong, Fred Xiang, Sukun Kim, and Arsalan Tavokoli, the original wireless embedded systems group. Jay, without your pushing me to sit in 410, I might not have had the opportunity to be as productive as I was those early years. Prabal, thank you for all those long conversations, early guidance, and helpful advice. You served as the kind of mentor I hope every person gets to have at some point in their career. I also want to thank Professors Ion Stoica and Scott Shenker for giving me those early opportunities to transition from an employee at Oracle to a PhD student at Berkeley. To Rodrigo Fonseca, Chris Baker, Daekeong Moon, Jinyang Li, Joel Weinberger, Michael Heinrich, Jonathan Chang, and Professor Hiroshi Esaki: thank you for patiently working with me and giving me an opportunity to succeed. A very special thanks to Albert Goto, for our long talks and for all your help throughout my time at Berkeley. An extended thank you to Larry Rudolph at M.I.T. and Josh Jacobs at Drexel, who got me started on research many years ago, while I was at M.I.T. Finally, I want to thank the organizations that funded the work in this thesis. These include the National Science Foundation, Samsung, Intel, and Nokia. To the block in Woodside, Queens. It took a long time but we in here! Time to go home.

Chapter 1

The Vision of Soft Buildings

We begin with a vision for the future of buildings. We would like to turn all buildings into “smart” buildings that provide services to their occupants, the grid, and the environment. Currently, buildings are largely built in isolation from one another – each unique in its own right, from the material the walls are made of to the systems and software that control them. With the falling cost of memory, the ubiquity of connectivity and sensing, and the low cost of computation in the cloud, it should be possible to make buildings “smarter” through software guided by the principles of extensibility and clean, well-defined software interfaces. We would like to move towards a vision of building software systems that

1. Support the notion of applications: programs that make use of and directly control the environment.

2. Provide a clean set of abstractions for application writers who wish to build both analytical and control applications.

3. Accurately capture the state of the building, even as it evolves.

Recent trends in research and technology set the stage for our work. We give a brief history of the work that has led us to the vision of a smart built environment. We also describe the technology that allows us to realize our vision and show how the falling cost of sensors, the ubiquity of connectivity and embedded computation, and the falling cost of computation in the cloud inform our architectural choices. In the next section, we discuss the built environment and how energy efficiency has become a prime research focus. Then, we give an overview of pervasive computing work and explain how it motivates the kinds of applications we look to support in the built environment. Finally, we state our research goal and give an overview of the rest of the thesis.

1.1 The Built Environment

Humans spend a large portion of their lives in buildings, and there are known problems related to energy consumption, comfort, and operational visibility. Buildings consume 40% of the energy produced in the United States and nearly three quarters of the electricity produced [117]. With the specter of global warming and the continued decrease in the cost of storage and communication, buildings have become a major target for improved energy efficiency. Buildings have long been the subject of study for architects and mechanical and building engineers. Recently, there has been interest in buildings among computer scientists, as buildings present a family of interesting challenges related to cyber-physical systems. Historically, building managers and contractors have worked together with a third-party provider to embed sensors throughout the systems and spaces in the building, allowing them to observe and manage the day-to-day operations of the building from a central location. The building manager is the primary user who interacts with the deployment information to diagnose and fix problems from a central location. Problem identification primarily occurs through occupant complaints. The building manager diagnoses problems through physical inspection and simple inspection of the associated data through the primary user interface of the building management system. These BMS’s are common in large buildings. Over 70% of large commercial buildings (100,000 square feet or larger) have a building management system [2]. These systems are installed primarily for supervisory control and centralized observation of the building. They help deal with the management complexity and supervisory control of a highly distributed mechanical system through a graphical view of the sensors and actuators it contains.

1.2 Pervasive Computing

The proliferation of cheap, networked, embedded sensors is moving us towards a future where our infrastructure is populated with computing that enables smart environments. We march towards Mark Weiser’s vision of the future of computing [122], where the environment contains sensors and people interact with the physical environment through their personal devices and related services. These services also allow us to optimize the performance of our infrastructure, uncover problems proactively, and share information with others to provide greater insight into the world around us. The research in this area starts from the vision for the field. What could the world be like if we were surrounded by computing? How will we interact in that world? What can we learn from the world and from each other? Hypothetical scenarios guide the early work. A common scenario is one that makes use of a personal handheld device, “smart” objects, and ubiquitous connectivity. Much of the early work aimed at constructing a scenario that captures some aspect of the envisioned future and can be used to highlight and explore fundamental challenges in realizing the vision. For example, Christensen et al. [24] explore how pervasive computing will play a role in assisting office workers in their day-to-day activities.

They imagine a scenario where office occupants use their mobile devices to interact and share information with each other through serendipitous work-related activities. Through this scenario they draw attention to fundamental issues related to mobility, interrupted operations, and activity scheduling. Many fundamental challenges were discovered and addressed in this fashion. Pervasive computing work eventually branched off into several sub-domains, each with its own focus. Some examples are work related to localization [18, 70, 101, 126], mobility and mobile devices [48, 121, 63], context deduction and wearable computing [52, 79, 24, 103], and sensor networks [25, 80]. These and related topic areas have guided our thinking in addressing the problems in the built environment. Looking back at the pervasive computing work, we observe that much of it has focused on users but not as much on the challenges in the environment itself. There is little emphasis on the infrastructure and on issues with the accuracy of the contextual representation of the physical environment. There is also very little work on how a changing physical environment affects the accuracy with which applications can deduce the state of the world around them. Also, there is very little emphasis on control of the physical environment. The kinds of software processes that run in buildings are primarily for control of the environment. These issues are addressed neither in the building science community nor in the pervasive computing and broader computer science community. We look at these issues specifically in this thesis.

1.3 Cloud Computing, Ubiquitous Connectivity, and Mobile Phones

Cloud computing has changed the way we design systems that bring together the physical and computational world. Coupled with the continued maturation of networking technologies – both indoors through Wifi and outdoors through cellular – and the explosion of mobile phone use, today’s systems are designed with the mobile phone as an access tool for information in the cloud. It is mostly safe to assume that this information will be accessible. Moreover, connection speeds can reach up to 300 Mbps, so serving sophisticated applications from the cloud and designing them in a highly interactive manner is commonplace. The main bottleneck is the form of interaction, which is still a challenge given the small screen of a mobile phone. These technologies also make it easier to design centrally managed, globally accessible systems. Services in the cloud can serve many clients simultaneously, providing a unified view of the state of the service and the objects in it. There is little difference between the desktop application and the mobile application, other than service presentation. Moreover, all the data lives in the cloud and is fetched from the cloud by all clients. The cloud service mediates interaction between all participants in the system.

1.4 Applications in Buildings

We introduce the notion of building “app-ification”. Technology enables mobile phone “apps”: mobile applications that closely interact with content in the cloud. We assert that a good way to address problems in buildings is to open up the building’s sensor data streams and provide building services that make use of that data. A system should provide uniform access to the data and facilities for cleaning and aggregation. It should be extensible, so that disparate systems can join and share information to enable new kinds of applications. Figure 1.1 illustrates a high-level view of the building “app-ification” architecture. Building scientists and architects use a family of products and software packages that take building data and analyze it. We need to support those as well. The main idea is to expose building sensor streams and related information to applications through the cloud. This allows us to adopt application-development design patterns and inherit the benefits of centralized management. It also provides an environment where a diverse set of applications and solutions can be designed and tested to address fundamental challenges in building management and control. Such a service should offer data-processing primitives as well. It should mediate access to sensor data and actuators. The system should support holistic control applications that combine disparate data sources through the service itself. We describe the design of such a system in Chapter 2. Privacy is a major concern as we consider the “app-ification” of buildings. Detailed information about energy usage can reveal information about the activity in the building and can be used to predict the behavior of individuals and organizations. For example, right before a product launch, employees typically spend longer hours at work. This could be used by a rival company to anticipate the release of a new product. Individual usage data can be used by thieves to determine when a particular occupant is home and what their movement patterns are. We only partially address the topic in this thesis, through mechanisms in our architecture that allow the user to control access to individual and aggregate pieces of information. However, privacy is a much broader topic and there are many techniques available for adoption. We leave this exploration for future work.
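To make the idea of uniform, cloud-mediated access concrete, the following is a minimal sketch of how an application might read a sensor stream from such a service over HTTP. The base URL, path layout, and JSON shape are illustrative assumptions, not the actual StreamFS API (which is described in Chapter 5 and Appendix B).

```python
# Hypothetical sketch: an application reading a plug-load stream from a
# cloud building service over HTTP. Endpoint and JSON shape are assumed.
import json
import urllib.request

BASE = "http://buildings.example.org"  # assumed service URL

def read_stream(path, start, end):
    """Fetch the [start, end] window of a stream named by a hierarchical path."""
    url = f"{BASE}{path}?start={start}&end={end}"
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())  # e.g. [{"ts": ..., "value": ...}, ...]

# A phone app and a desktop modeling tool could share the same call:
# readings = read_stream("/4F/R410/plug_meter", 1380000000, 1380086400)
```

Because every client goes through the same mediated interface, access control and cleaning can be applied in one place rather than in each application.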

1.5 Research Statement And Hypothesis

We formally state the thesis that guides our work: how can we incorporate filesystem and database constructs, and what are the technical challenges, in a system that supports applications in the built environment? Given the emerging applications in the built environment, it is clear that the old information system design is not sufficiently open, flexible, or scalable to support them. Old information systems are tightly integrated from the field-level sensor to the central supervisory control system. There are two integration points in traditional systems that we argue are either fundamentally flawed or insufficient for emerging applications. We describe the components that currently exist and identify those that are missing. We show how these components/services enable emerging applications. We also discuss the technical challenges that must be solved in order to provide the correct semantics for these services. Furthermore, we discuss a component that is fundamental for providing correct information to applications and formalize the notion of verification in the context of the built environment. We provide several algorithmic solutions to these problems, which lay the foundation for a fundamental service in the broader architecture.

1.6 Thesis Roadmap

In the next chapter, we discuss building information systems today. We dive into their architectural components and introduce some terminology used in the building space. We also describe the motivating scenario that they were built for and examine how well the architecture can support the “app-ification” of buildings. We give examples of specific applications that we would like to support and walk through the components that are useful for this purpose and those that are missing. We propose a set of necessary components in an architectural re-design that could provide the same support as BMS’s do today, as well as support the emerging applications that we describe, and argue that this should be the design of a building app-ification system. In Chapter 3, we give an overview of a system that contains the components we propose in the previous chapter. The name of the system is StreamFS. StreamFS is a cloud-based service that combines filesystem and database constructs to organize the streaming sensor data, clean and process streams, and provide a unified access layer. We discuss the overall structure of each component in the architecture and how they interact with one another. Chapter 4 dives into the details of the components and mechanisms in StreamFS. Section 4.1 focuses on the process management component in StreamFS. The process management component manages the streams and the processes that consume those streams. It is designed to support processes that are specified by the user and managed by StreamFS, as well as integrating external processing components and representing them through the file system in StreamFS. We discuss some of the challenges and present solutions to a scheduling problem related to providing fresh data to processing jobs. We also describe how various kinds of aggregation can be performed on the data, using the filesystem and pipe abstractions presented and managed by this component. Section 4.6 discusses how we represent the world as a collection of files. StreamFS follows the filesystem principle where everything is a file, and this allows us to provide a unified management layer for the entire set of building deployment data and metadata. We discuss the individual file types that we support and their semantics. We also introduce the fundamental challenge of dealing with evolving metadata. Specifically, we show how changes in the physical world can present problems with the structure and relationships between the files in the system. We evaluate StreamFS in Chapter 5. We describe our experience in deploying StreamFS in several buildings at UC Berkeley and the University of Tokyo. These buildings represent a broad range of characteristic “office buildings” in both countries. Office buildings consume nearly 20% of all energy in the United States and similar proportions

in Japan. Moreover, there are differences between buildings in the two countries, such as usage patterns and HVAC architecture (centralized vs. distributed). We show how our architecture and techniques work well across both and describe these deployments in the context of two applications: 1) a real-time visualization and aggregation mobile-phone application for energy auditing, and 2) a real filesystem mount and direct integration with legacy applications. We also give an overview of the application programming interface and discuss how StreamFS can help build generalizable, scalable, re-usable software within buildings. Finally, we discuss verification of building systems and metadata in Chapter 6. We introduce three types of verification and describe why each of them is crucial to building applications that interact with the physical world through a layer of software that represents it. We discuss our work on the Strip, Bind and Search methodology for functional verification, our use of mode decomposition for spatial verification, and timeseries clustering techniques for type classification. We discuss future work in Chapter 7 and conclude in Section 7.3.

1.7 Statement of Joint Work

The work in Chapter 6 is part of a collaboration with my colleagues Romain Fontugne, Dezhi Hong, Nicolas Tremblay, Pierre Borgnat, Patrick Flandrin, Kensuke Fukuda, David Culler, Hiroshi Esaki, and Kamin Whitehouse. We published several papers in the area of functional [41] and spatial [40, 53] verification. Their input was instrumental to the exploration and development of the work.

[Figure: sensors such as AC plug meters, vibration/humidity sensors, light switches, and temperature/PAR/TSR sensors connect through an EBHTTP/IPv6/6LoWPAN wireless mesh network and edge router to the cloud, where applications consume the data.]

Figure 1.1: Building application model. Building sensor deployments send data to the cloud and applications access it as it streams in. Applications may also feed data to the cloud and make it available to other applications.

Chapter 2

Sensing in the Built Environment

Building management systems consist of thousands of sensors embedded in the systems that control the internal environment and the spaces occupied by people. Many different sensors take disparate physical measurements, continuously. For example, temperature sensors take Fahrenheit readings in the thermostats, valve-position sensors measure the position angle of the valves in the pipes, pressure sensors record pressure measurements in the vents, etc. The BMS presents these through a graphical interface to the building managers. The graphical interface contains graphical sketches of the systems and spaces in the building, with icon images of the sensors at their approximate physical locations in the sketch. The sketch is generated from building schematics, so the interface is representative of the building at the time of construction. Building managers can quickly locate any sensor in the building through a series of clicks. Although BMS’s have, to some degree, been an effective tool for centralized management of buildings, they lack the extensibility and analytical capabilities necessary to support an ecosystem of tools for analyzing the building. In general, software practices in buildings are non-standardized and non-systematic. This presents major challenges with respect to software re-use and scalability. In this thesis we discuss how we address these challenges through architectural design choices and analytical methodology. The architectural choices reconstruct the software layer that sits on top of existing building management systems and present a unified interface and standard API for analytical and control building applications. The analytical methodology offers a general approach for verifying the construction of point names – the naming scheme for sensors distributed throughout the building. The thesis is presented in the context of BMS systems and the building applications we aim to support. In the next section, we describe the state-of-the-art practices followed by vendors of building information systems. We describe the architectural features and design principles both implicitly and explicitly implemented in them. We also present the pros and cons of these decisions and their implications, and include a high-level description of our approach.

Figure 2.1: General building control loop.

2.1 Tightly Integrated Building Information System Architecture

The first building information systems became commercially available in the 1970s [43]. Historically, building management systems were constructed as a collection of control loops, which progressed from pneumatic to analog to digital. These control loops largely form the foundation for the design decisions made in building information systems. This section gives a quick overview of the architecture, bottom-up, and describes how each stage is built around the concept of loops and supervisory control. We then describe some of the shortcomings of this architecture and give an overview of how we address them in a system called StreamFS, described in more detail in the remaining chapters.

Control Loops and The Outstation

Each loop is defined by a control domain consisting of a sensor, an actuator, and a control mechanism. The control mechanism became logic-based when signals from sensors moved to the digital domain. However, the basic control principle is based entirely on local control loops, with the implicit assumption that these loops are independent. Figure 2.1 shows a high-level control loop. A simple control loop in the building is one that controls the temperature in a space. It has a temperature sensor as the input and uses the temperature set-point parameter to decide when and which actuators to activate. For temperature control, this actuation controls the vent that lets cool air into the space. This causes the temperature to fall until a lower bound is reached and the control logic re-activates the fans and heating/cooling system. The figure also shows the basic structure inside an outstation. An outstation is a box that contains up to several control boards, each wired to one or more sensors and one or more actuators. The outstation is typically close to the sensors and actuators (in the same room) and contains all the control logic for the local plant. Inside the control logic there is a CPU and some memory. The memory contains the control program and some space for sensor readings.

Figure 2.2: High-level control board architecture.

It is directly wired to the sensors and actuators through a series of buses, as shown in Figure 2.2. As readings from the sensors are taken, they are placed in RAM. The amount of RAM is limited and can fill up, so it is important to schedule periodic collection tasks from the central station – the building management system (BMS). The control logic is typically written in ROM and can only be changed by the equipment or BMS vendor. The input parameters are set at the BMS and they dictate the operational dynamics of the control scheme in reaction to the input [75]. Outstations are distributed throughout the building and essentially run independently of one another. In order to enable centralized monitoring and control, they are networked together and report some of the sensor readings and control-logic state to a central outstation.
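To make the local loop concrete, here is a minimal sketch of the hysteresis-style logic described above, the kind of program an outstation's ROM might hold. The set-point and dead-band values are illustrative assumptions, standing in for the parameters a BMS would set.

```python
# Hypothetical sketch of a local temperature control loop: open the
# cooling vent above the set-point, hold it until the lower bound is hit.
SETPOINT_F = 72.0   # assumed cooling set-point (a BMS-settable parameter)
DEADBAND_F = 2.0    # assumed lower bound: set-point minus 2 F

def control_step(temp_f, vent_open):
    """One sense-decide-actuate iteration; returns the new vent state."""
    if temp_f > SETPOINT_F:
        return True                       # too warm: let cool air in
    if temp_f < SETPOINT_F - DEADBAND_F:  # lower bound reached
        return False                      # close vent; temperature recovers
    return vent_open                      # inside the band: hold state
```

The key property is locality: the loop reads one sensor and drives one actuator, with no knowledge of any other loop in the building.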

Central Outstation and Communication Protocols

The central outstation is typically a Microsoft Windows-based PC connected to the outstations through either RS-485/Modbus or Ethernet. The user interacts with the system through a graphical interface, constructed from the schematics for the building or the schematics for the component in the system that is being monitored. The BMS running on the PC communicates with outstations through either a vendor-specific, proprietary protocol or an open one like BACNet [4] or LonTalk [35]. Note, we focus on BACNet, but similar features exist in other protocols, such as LonTalk. Both protocols define a wire protocol and packet structure for communicating with outstations and each other. They also define a high-level naming scheme for sensors in the building, called ‘points’. A point can also refer to a non-physical object, like a schedule of operation. It is common for both lights and temperature sensors to run on daily, weekly, and seasonal schedules. Such schedules are captured by the schedule object in BACNet. Each object contains a set of properties that can be read and/or written.

Figure 2.3: BMS network architecture.

A device is identifiable through a name or address on the network; each object has a unique identifier and is one of many types. Examples of object types include the following: input, output, value, and analog [4]. These protocols also provide a mechanism for discovery. Each [device, object name/id, property name/id] tuple forms a name. This name is exposed by the protocol server to the application. All the names are set by the vendor and are shown through the graphical interface of the BMS. The building manager is the primary user of the BMS, so rather than expose the underlying protocol, he/she interacts with the building via the graphical interface. In order to interact with the underlying sensor and actuator layer, an application must use a stub that communicates directly with the sensor/actuator through the BACNet stack. External communication stubs are recognized similarly to sensors/actuators. They are represented as a collection of objects with readable/writable properties. Examples of services provided in BACNet are WhoIs and EventNotification. The former is a broadcast service used for discovery of other objects; the latter is used for setting alarms on the sensor data reported by the BACNet-enabled devices on the network in the outstation layer. There are many other types of events that are supported and over 50 object types in the baseline protocol, which is extensible. Device and object names that are added have no restriction on either the number of characters (specified by the vendor) or the encoding.
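As a sketch, the naming scheme above can be modeled as a small data structure. The class and example values below are illustrative assumptions, not a real BACNet stack or its wire format.

```python
# Sketch of the point naming described above: each
# [device, object name/id, property name/id] tuple forms a point name.
from dataclasses import dataclass

@dataclass(frozen=True)
class PointName:
    device_id: int      # network-unique device identifier
    object_type: str    # e.g. "analog-input", "schedule" (illustrative)
    object_id: int      # object instance within the device
    property_id: str    # e.g. "present-value" (illustrative)

    def __str__(self):
        return f"{self.device_id}.{self.object_type}:{self.object_id}.{self.property_id}"

# e.g. the present value of an analog input on a hypothetical device 1201:
zone_temp = PointName(1201, "analog-input", 3, "present-value")
```

Note that nothing in the tuple itself says what the point measures or where it is; that metadata lives in vendor-chosen strings and the graphical interface, which is the root of the verification problem taken up in Chapter 6.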

2.2 From Supervisory Control to Application Development in Buildings

Features for interoperability were designed at two interface layers: 1) the protocol layer and 2) the presentation layer, through a data-export feature. The protocol layer provides services for enabling devices to talk to one another through the network. Several features were explicitly designed around the notion of interoperability: trending, scheduling, management services, alarms and events, and direct sharing. The graphical interface layer is mainly focused on providing periodic reports in a comma-separated value (CSV) file, which contains point-name information and time-value pairs of data.

Figure 2.4: BACNet device example.

For these protocols, interoperability means adding new devices or exporting the data in a common format. Building information systems themselves were built mainly to support the day-to-day management and supervisory control of the building. Analysis does not extend far beyond univariate plotting and individual assessment of equipment and control. Most tuning of control parameters is manual. Hard-wired control logic at the outstation is rarely updated. Over the last few years, however, there has been an increased interest in energy management and comfort as a primary objective in the design of new building applications [119, 127, 82]. Moreover, there is a broad interest in buildings using global control schemes to optimize their performance and to interact more efficiently in response to renewable sources and their generation volatility [110, 8, 78]. Model-predictive control has introduced new ways of controlling the components in the building to make them more energy efficient [81], and there is interest in performing dynamic, real-time analysis of building health and efficiency [32]. There has even been interest in improving the visibility of the state of the building to the occupants through various modalities, including touch-screens and mobile phones [71, 54, 95]. These and other emerging applications have pushed the boundaries of demand beyond what a modern building information system can provide, and a re-design must be considered.

2.3 BMS Architectural Shortcomings for Supporting Emerging Application Development

We examine today’s BMS architecture in the context of four potential services with distinct operational requirements. The first two services are features in today’s BMS’s. We describe how emerging requirements are driving the evolution of these services and how BMS’s are struggling to meet the new requirements due to limitations in their architectural design. The next two are emerging services that BMS’s cannot support today. We describe which architectural components must be included, or how current components must be modified, in order to support them.

Figure 2.5: Screenshot of the Soda Hall Building Management System interface.

We also make the broader argument that building systems should be built to support a much wider range of applications that we cannot currently anticipate. We will show why this requires a fundamental re-design and propose an architectural composition for such a system. In the rest of the thesis, we will examine an instance of our architecture and describe the challenges in realizing the use and effectiveness of our system in real building deployments.

Monitoring and Supervisory Control

The primary objective in the design of building information systems is centralized monitoring and supervisory control. Control algorithms are left “to the expert” and embedded in the outstation control board. The intended user of the system is a building manager – a user whose expertise is much broader than that of the designer of the control algorithm that runs on a particular system component. The manager is expected to monitor the health of building systems and quickly diagnose problems when they occur. The tool is mainly in place to save the building manager time, and it is very effective at doing so. The extent of the building manager’s control decisions is altering control-algorithm parameter settings through the building management interface itself. Even these decisions typically go through the vendor, through consultation. Figure 2.5 shows a screenshot of the BMS in Soda Hall at UC Berkeley. This specific image captures a schematic for one of the air handling units. It shows the various sensors embedded at different locations on the component – temperature sensors on either side of the supply/exhaust fans and temperature sensors at the supply/return air ducts and at the inlet vent, measuring the outside air temperature. Accompanying real-time readings are juxtaposed with the sensor image. The user can double-click on a sensor or reading to get more information about that particular measurement point. For example, if you double-click on a temperature sensor, it will give you the exact name of the point and accompanying information about related points, such as the set-point, which effectively drives the behavior of the underlying system. If an occupant makes a complaint about not getting any air from the vents, for example, the building manager can find the screen for the vents that serve the room the occupant is in and observe the current pressure readings, or look for value-based alerts on any of the readings, typically displayed on the same screen. If there is a malfunctioning component or something stuck in the vent, the readings should “look off” to the building manager. If the problem recurs often, the astute building manager may be able to characterize the fault through a series of alarms. They can then be proactive about finding and fixing the problem(s) before they occur. Alarms can be set through interaction with the graphical interface, in much the same way that a look-up on a measurement point occurs – by double-clicking on the point in question and following instructions for setting an alarm. In some cases, the problem may be driven by a faulty setting, and adjustments can be made to the control parameters through the associated control points. The scope of control is limited to local control loops. Recall our discussion of control loops in Section 2.1. The building manager can, typically with the help of the vendor, decide on the best control strategy setting. If the control strategy cannot be met, due to flaws in the control algorithm itself, the vendor may step in and re-image the controller at the outstation and expose the necessary parameters through the graphical interface. These kinds of changes are rare but do happen occasionally, and they are generally expensive, since the cost is not usually included with the purchase of the system. The decision is made after close inspection and analysis, which a BMS enables through the data export feature.
For example, the sense/control points in question may be placed in “trend” mode. This means that readings from those streams are stored in the local memory buffer at the outstation for some period of time. If a report is specifically set up at the central system, a report period is associated with the point of interest. This allows the saved data to be drained from the local buffer at the outstation. The data are written to a file for observation and graphing by the building manager. Although this feature is not necessary in order to change control parameters, it is useful for observing how parameter changes affect the behavior of the system. The building manager can, in principle, experiment with different settings and allow empirical observations to guide her future decisions.
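A sketch of what consuming such an export might look like, assuming a simple point,ts,value column layout; real vendor exports vary and this is not a specific BMS's format.

```python
# Sketch: group an exported trend CSV of (point, timestamp, value) rows
# by point name so each stream can be plotted or compared.
import csv
from collections import defaultdict

def load_trend(csv_path):
    """Return {point_name: [(ts, value), ...]} from a trend export."""
    series = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):  # assumed headers: point,ts,value
            series[row["point"]].append((int(row["ts"]), float(row["value"])))
    return series

# A building manager's experiment: compare a week of readings before and
# after a set-point change, e.g. by plotting series for a hypothetical
# point name such as "AHU1.SAT".
```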

Energy Auditing and Building Modeling

Recently there has been renewed interest in the energy consumption of buildings. In particular, several studies [31, 44] show that buildings consume a large fraction of the energy produced in the United States and that as much as 80% of it is wasted [44, 90]. As such, several companies and services have emerged for assessing the health of commercial buildings with respect to their energy consumption. Organizations such as LEED [118] provide certification of buildings, specifically rating the energy efficiency of the building. Building modeling software, such as EnergyPlus [26], is used extensively by building scientists. EnergyPlus and simulators like it are part of an ecosystem of software that models various aspects of the operation of the building. They allow the designer to construct detailed models of the building, from construction materials to operational usage and occupancy. Some parameters include construction material, longitude and latitude, zone-based usage (office building, bathroom, storage room), window size, etc. There are literally thousands of parameters that can be set. All LEED certification relies on the construction of an accurate model that demonstrates energy-efficient performance. Model construction can take several months. In order to construct an accurate model, model output is compared with empirical data. Most BMS systems provide a way to export the trended data. The file-export feature and the ability to “trend” points serve as an interface for applications. Another option is to obtain the data directly from the system through the network. Third-party vendors provide systems that will join the building network of devices and eavesdrop on the network for sensor data reports. In many instances, this is the only way to obtain truly real-time readings from the sensors on the network. BMS vendors do not like this, since such devices may generate traffic that overwhelms the network. Many buildings use RS-485 rather than Ethernet, and there is a general, albeit unfounded, concern that the network will become overwhelmed if all the points are trended and report simultaneously. Modeling and real-time analysis have been separated because of these constraints. The constraints are largely not fundamental; the current architecture is simply not designed to provide real-time readings for all the points simultaneously. Also, it is clear, even from the fairly simple workloads generated by these analysis applications, that a history of readings is needed. BMS’s, as currently designed, require the end-user to manage the history of each data point individually. When BMS’s were first designed, there were certainly concerns about bandwidth and storage limits. However, today those concerns are a non-issue. A few hundred bytes produced on the order of minutes, even from several thousand sensors, is simply not that much data. For example, 10,000 sensors producing a 200-byte report every 15 minutes amounts to only about 17 Kbps. Even if the streams are synchronized, the total amount of data in a burst is 2 MB. Ethernet links and switches can handle Gbps transfers.
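A quick back-of-envelope check of these numbers:

```python
# Verify the bandwidth claim: 10,000 sensors, 200 bytes per report,
# one report every 15 minutes.
sensors = 10_000
report_bytes = 200
period_s = 15 * 60

avg_bps = sensors * report_bytes * 8 / period_s
print(f"average rate: {avg_bps / 1000:.1f} Kbps")   # ~17.8 Kbps

burst = sensors * report_bytes                      # fully synchronized
print(f"synchronized burst: {burst / 1e6:.1f} MB")  # 2.0 MB
```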

Holistic Building Optimization

An emerging class of applications is in holistic control of the building using a new technique called model-predictive control (MPC) [81]. Rather than rely on specific changes to control logic at the local-loop level, MPC techniques observe and learn a model of the behavior of a component, multiple components, or the whole building, based on historical data. Once the model is learned, constraints can be specified to drive the behavior of the system to an optimal region in the trade-off space, solving it as a constraint-optimization problem.

Figure 2.6: Emerging Application: Hierarchical MPC for a cluster of buildings.

Figure 2.6, reproduced from [81], shows an example of hierarchical MPC control. It decomposes a large optimization process – high-level MPC – into individual control decisions to be made at the control-loop level – low-level MPC. In order to enable the low-level MPC, the process must know the mapping between the points and their location. The user must connect the right data streams and control points to the algorithm’s inputs by either manually going through the schematics or locating the schematic representation in the BMS graphical interface. There is no query interface to a BMS. This task requires sitting with the building manager or vendor in order to set up the trending and reporting and to enable the necessary control permissions. Needless to say, it is very time consuming and does not scale. This lack of generalizability and scalability hinders widespread deployment of innovative solutions. In most cases the mappings of sensors/actuators to their placement in space/system are captured in a combination of the schematics, the graphical interface, and the point name. Moreover, buildings are one-off constructions and their associated digital information has followed the same pattern. Although MPC can yield significant savings, setup complexity and building uniqueness make it nearly impossible to replicate quickly. It is fundamentally difficult to generalize.

Personal Energy Viewer

Imagine having the ability to walk through a building and see live, detailed energy data as you point your phone at various things and locations. As you enter the building, you scan its tag and see the live breakdown of energy consumption traces, including HVAC, lighting, and plug-loads. You continue your walk through the building as you head to your office. When you arrive at your floor you scan the tag for the floor and observe similar figures, only this time they are in relation to that floor alone. Since there are several meeting rooms on that floor, you are curious how much is consumed by occupants versus visitors. You choose

Figure 2.7: Emerging Application: Mobile phone interfacing with the physical infrastructure.

to view the total load curve co-plotted with the occupant load curve, specifically for that floor. You see that approximately half the total energy is consumed by visitors during the day. Curious about what portion of the total is attributed to you, you select the personalized attribution option and you see your personal load curve plotted with the total load curve – as well as accompanying statistics, such as the percent of total over time. As you quickly examine the data on your phone, you see that you consumed energy during hours that you were not there. You choose to see a more detailed breakdown. You enter your office, scan various items that you own, and see that your computer did not shut down properly and your light switch was set to manual. You immediately correct these.

Being able to interact with your environment and get a complete energy breakdown can provide a useful tool for tracing and correcting rampant energy consumption. In buildings, having the occupants actively participate allows for localized, personal solutions to efficiency management and is crucial to scaling to large buildings. However, providing this detailed level of attribution is challenging. There is a lot of data coming from various systems in the building, and integrating it in real time is difficult. Furthermore, attribution is non-trivial. We must be able to answer the following: How much of the total consumed on this floor went to charging laptops? How many of those charging laptops belong to registered occupants of this floor? For centralized systems, multiple locations are served simultaneously. It is non-trivial to determine the exact breakdown for each location. At the plug-load level, some plug loads move from place to place throughout the building over the course of the day. Tracking where they are at any given time is difficult.

Answering these queries is relatively easy once the information is available; however, collecting the information is non-trivial, especially over time. Historically, it has been difficult to collect plug-load information. Various studies have used wireless power meters to accomplish just this [28, 72, 94]. All previous work collected the data and performed post-processing to analyze it. We want to take the next natural set of steps: perform processing in real time and present the occupants with live information.

Currently, in order to enable this application, a detailed digital model of the building is necessary, data streams from the building must be easy to query, and it should work across buildings. Also, there needs to be a way to localize the user. Localization technology and information must be made available to the mobile application to provide in-situ services. It is clear that the current information infrastructure cannot provide these. The interface to the network does not have a strict naming mechanism, there is no explicit representation of each building that the application could interpret, the sensor/actuator deployment is not dense enough, and adding new sensors is cumbersome. Furthermore, the data itself can be quite dirty. Cheap sensors are unreliable. They produce erroneous data and randomly stop and start at times. Missing/erroneous data is common. Moreover, even within building information systems provided by a single vendor, there is no time synchronization across sensors, so aggregation, filtering, and re-sampling are common operations that must be performed on the data in order to summarize and display it.
The mobile energy viewer application not only requires these capabilities but requires that they be performed in real time.

2.4 Addressing BMS Shortcomings

The main layer of interaction between applications and BMS’s is the underlying network layer and the data-export component. The BMS export feature decouples the protocol from the information about each sense/actuation point: time-value pairs and the name of the point. As auditing applications emerged and energy became a prime target for reduction in buildings, these interface choices became insufficient. Moreover, as the need to construct new classes of applications emerges, the architectural pieces that are missing become clearer. The following is a list of some of them:

1. The narrow waist should be above the network layer.

2. A time-series store is necessary.

3. Mechanisms to distill the readings must be available.

4. Real-time data forwarding should be available, especially for control applications.

5. Contextual relationships between sensors should be verified.

The first four items are commonly built and re-built in emerging applications. Therefore, we argue that they are fundamental to the future architecture of building information systems. Moreover, we observe that dealing with network-protocol-specific calls is not only cumbersome, but usually circumvented in order to deal directly with the data. Most applications that do use the underlying protocol expose a name-time-value (NTV) tuple to the layers above. This observation leads us to believe that this is where the interface should be.

The NTV layer allows us to decouple the data from the network protocol. This makes it easier to include new sensors that may not be directly on the building network, since the only information we need is the point name and the data it produces. For example, wireless plug-load power meters [62] can join the NTV layer by registering the individual points, while a translation layer between the NTV layer and the wireless meter router provides the transformation of read/write requests to/from points in the network. The same is true for BACnet or any point protocols for sensors/actuators. Like many problems in computer science, this one can be solved through layer separation and a level of indirection and translation. (We sketch the NTV representation in code after the list of properties below.)

Each of the services that require the end user to have a deeper understanding of the underlying dynamics of the building must capture the notion of time. Almost all analytical processes or control decisions need a set of readings over time. Therefore, a time-series data store must be part of future BMS design. The service should be made available through the NTV layer.

Point 3 is motivated by the observation that sensor data, especially from cheap sensors, is dirty and typically goes through a cleaning process before being forwarded to the application. There are various operations that are commonly performed on the data that should be available as primitives. These include re-sampling, filtering, missing-data identification, and aggregation. Re-sampling refers to taking a set of streams and interpolating missing values to align their timestamps. This is usually performed before aggregation, especially for generating time-varying aggregate statistics. Filtering removes certain values based on a threshold value(s). Since data is often missing, due to intermittent connectivity problems or faulty sensor equipment, it becomes important to get a summary of missing time intervals in order to adjust the fetch parameters. Finally, the data is usually more useful in aggregate than as a univariate signal, for example, for generating a load curve. Simple operators for combining values of various streams are key to enabling this procedure.

Finally, in order to enable control, real-time mechanisms must be exposed to the control application. In addition, we observe the need to provide real-time services for analytical applications. For example, LEED proposes the use of building data to provide dynamic performance metering [100]. There are also many dashboard companies that make use of streaming data to provide real-time statistics on the performance of the building. The mobile energy-audit application, from Section 2.3, also requires a real-time forwarding and processing service.

The design of a new system must provide the features highlighted above and contain the following properties:

1. Extensibility: The system should be able to accommodate different kinds of sensors and actuators, and it should be simple to add and remove them.

2. Scalability: The system should scale with the size of the deployment and the number of applications.

3. Generalizability: The system should provide a general set of primitives for application designers. It should support applications described in this chapter and emerging applications that we cannot currently anticipate.

4. Ease of Management: The system should make it easier to manage large deployments and their associated applications.
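As promised above, here is a minimal sketch of the NTV idea. The field names and the translation function are illustrative assumptions, not an existing BMS or StreamFS interface:

    // A name-time-value (NTV) tuple: the only information the layers
    // above the narrow waist need to know about a point.
    function ntv(name, value){
        return { name: name, time: Date.now(), value: value };
    }

    // A hypothetical translation layer for a wireless plug-load meter:
    // it registers each point at the NTV layer and converts raw radio
    // reports into NTV tuples, hiding the network protocol entirely.
    function translate(rawReport){
        return ntv("/plugload/meter" + rawReport.meterId, rawReport.watts);
    }

    console.log(translate({ meterId: 42, watts: 63.5 }));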

Modern BMS architectures do not have these properties. They are difficult to extend, as new sensors and actuators must physically join the network and follow both a high-level and a low-level protocol in order to do so. They are not scalable. Most BMS’s have a limit on how fast they can obtain data from sensors and limit the amount of trending that the system can do. The central outstation is the only machine handling incoming data. The entire code base runs on a single machine. There are bottlenecks throughout the system with regard to data storage – including the outstation memory and disk storage on the local machine that houses the BMS. BMS’s are also not generalizable. They only support one “application”: the graphical interface. The GUI does have a trending/plotting option, but extending the BMS to provide other kinds of services is impossible. Finally, the scope of management is quite limited in BMS’s, and although they do provide ease of management of sensors/actuators on the system through centralized access, we contend that the scope is simply not broad enough.

In this thesis, we describe a new system, StreamFS, which provides the properties missing in the current architecture. We demonstrate the existence of these properties through a series of applications that were built over it in several settings. We describe the API and the scope and usage of the applications, and draw out how the capabilities enabled by StreamFS in those apps demonstrate the properties highlighted above.

2.5 Contextual Accuracy

Contextual accuracy is the notion that the context – physical location, type, etc. – of the data we are analyzing must be accurate in order to interpret the results of the analysis accurately. For example, an application that provides aggregate statistics on the power consumption of plug-load items on each floor of a building must be sure that all the data used for the analysis is power-meter data from the specified floor. If power meter A on floor 1 is moved to floor 2, the code doing the aggregation should discover the change and adjust the aggregates for floor 1 and floor 2. Another example is related to model-predictive control processes that assume contextual relationships among a set of sensors to derive the state of a physical space and make control decisions that affect that state. If sensors are updated, moved, added, or changed, the queries made by such processes will be inaccurate and lead to incorrect control decisions. Many such processes will exist in future smart buildings, so automating the verification process as much as possible is crucial.

All of the metadata for each point is entered by a human being. Given the scale of the task – thousands of sensors per building – it is highly error prone. So what may seem like a trivial problem for a single instance (as described above) leads to gross miscalculations at scale, in the number of points and over time. The building and the deployment within it go through a natural evolution, and this will impact processes that depend on knowing the context of the readings in order to make the right automated decision.


Figure 2.8: MPC example where metadata must be verified to maintain correct behavior. In the illustration, a new wall added at time t1 splits a vented space.

Figure 2.8 shows an example where this will have a more direct impact. This example shows a simplified illustration of the relationship between a chiller and a space in the building. The building of the future will optimize the space using a model-predictive control (MPC) strategy based on Equation 2.1. The equation is used to model the temperature dynamics of the room. In this equation, C is the thermal capacitance (a constant), T is the temperature in the space, u is the heating/cooling power input, P_d is the internal load, T_oa is the outside air temperature, and R is the resistivity of the walls.

C∆T = u + P_d + (T_oa − T)/R    (2.1)

This model is used for optimization and combined with actual data coming from the associated temperature sensors in each room. The particular mapping that is important in this example is for the variable u. It is used to determine which vents feed which rooms. If a wall is added, there needs to be an automated way to capture this change, because it changes the mapping from vents to rooms. Over time there are many changes that occur in the building. All the physical changes are recorded, but typically many years pass before the software is updated to reflect the changes that have occurred. For a building that is using an MPC-based controller, this is problematic. It is assuming a static model for the relationships between the points and the rooms. If a wall is added later, there is a new room, and a new controller process should be started for that room. This is not difficult to fix and can be done by hand, but it is a typical problem in buildings and, over the entire building stock, occurs very often. Buildings evolve slowly, but in aggregate there are many changes that occur that go unaccounted for. Moreover, these changes add up over time and lead to huge miscalculations in energy consumption and gross accounting errors in computing efficiency. As applications become more widespread, an automatic verification process is necessary to alert the building manager, or the software directly, that changes have occurred. This will allow the system to remain accurate over time, leading to more energy savings and more accurate virtual models of the building.

Any approach that is used for verifying context information must be scalable and generalizable. In order to have significant impact across the building stock, it must be able to work well across very different kinds of buildings, with different equipment, sensors, climates, architectural designs, and climate-system designs. We discuss our approach for verifying various aspects of the building context captured in the metadata, through the data. We also describe how we identify malfunctions in a general fashion across very different buildings.
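To make Equation 2.1 concrete, the sketch below steps it forward in time with forward-Euler integration. All parameter values are assumed for illustration; this is not the controller from [81]:

    // One forward-Euler step of C*dT/dt = u + P_d + (T_oa - T)/R.
    function stepTemperature(T, u, Pd, Toa, dtSec){
        var C = 2.0e6;   // thermal capacitance, J/K (assumed)
        var R = 0.005;   // wall resistivity, K/W (assumed)
        var dTdt = (u + Pd + (Toa - T) / R) / C;
        return T + dTdt * dtSec;
    }

    // If a wall splits the room at time t1, u no longer maps to a single
    // room: a second model/controller instance is needed for the new room.
    var T = 295;                                 // 22 C, in kelvin
    T = stepTemperature(T, -500, 200, 303, 60);  // one minute of cooling
    console.log(T);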

2.6 Experimental Setting in Real Buildings

In this thesis we describe the architecture and implementation of a system that supports emerging applications and the “app-ification” of buildings. We construct a system called StreamFS and deploy it and/or analyze data from several buildings, including Cory Hall, Sutardja Dai Hall, and Soda Hall at UC Berkeley, and Engineering Building 2 at the University of Tokyo (Todai).

Cory Hall, at UC Berkeley, is a five-story building hosting mainly classrooms, meeting rooms, laboratories, and a datacenter. The HVAC system in the building is centralized and serves several floors per unit. There is a separate unit for an internal fabrication laboratory inside the building. Sutardja Dai Hall is a large building at Berkeley. It is a seven-story, 141,000 square-foot building which contains classrooms, meeting rooms, laboratories, a nano-fabrication laboratory, and a cafe. The HVAC system in the building is centralized and serves several floors per unit, with a separate unit for the internal fabrication laboratory. It was built in 2009. Soda Hall is another building at Berkeley. It is the main building for the computer science department. It was built in 1994 and also has seven floors and a centralized HVAC system. It contains classrooms, offices, server rooms, and open office spaces for several labs. The Todai Engineering building is a 12-story building on the main campus of the University of Tokyo in Tokyo, Japan. It was built in 2005 and contains shared office space, individual offices, lecture classrooms, and small classrooms.

These buildings represent a wide sampling of characteristics common in an office setting within the commercial building sector. Moreover, this allows us to examine generality across climates and HVAC architectures, since the Tokyo building does not use a centralized HVAC system. Cory Hall was built in the 1950’s and has been through many changes since then.

Figure 2.9: Distribution of energy consumption in buildings by end-use. Table reproduced here from the 2011 Buildings Energy Data Book [117].

Originally, the fifth floor did not exist, and several structural aspects of the building have been altered. Moreover, with changes in technology and research focus, the activities the building supports have also changed. This is reflected in both the structure of the building and the design of the HVAC systems inside it. In all, there are about 15 different HVAC systems within Cory Hall. Soda Hall was built in the 1990’s. It supports mainly small labs, shared space, and a few small datacenters – clusters of machines in a server room. The Tokyo building is different from the other three in that Tokyo experiences four distinct seasons, the HVAC system is distributed, and all the data we examine is post-Fukushima disaster [109]. These events drastically altered the energy usage profile of the building, as mandates for energy reduction were disseminated across all of Japan.

In the context of the rest of the building stock in the United States, Soda Hall, Cory Hall, and Sutardja Dai Hall are representative buildings within the office-building category, although both Cory and Sutardja Dai support fabrication laboratories that consume higher-than-average energy compared to most activities in a typical office building. Moreover, all three host several machine clusters with most of the machines consuming power continuously. More broadly, they all fall into the “Commercial Sector” category according to the Department of Energy [117]. Commercial buildings consume > 40% of total energy in the United States. More specifically, these buildings also fall into the Office category, which consumes about 19% of total consumption for the broader “commercial” category. In addition, Cory Hall and Sutardja Dai Hall are > 100,000 square feet while Soda Hall is about 60,000 square feet. 28% of all buildings in the United States fall in the range from 50-200k square feet [117]. The distribution of consumption in these buildings is similar in many ways to the distribution across the United States. In Japan, buildings consume 33% of total energy [116], which is higher than either the industrial or transportation sectors.

Figure 2.9 shows the distribution of energy expenditures by end-use category in buildings, presented in [117] and reproduced here. Note that about 50-60% is used by the HVAC system and lighting, and about a third goes to other and plug-loads. These figures motivate the applications presented in this chapter. In particular, we show how energy audits of miscellaneous loads are enabled and can be coupled with total consumption data through the ‘Mobile Energy Lens’ application, presented in Chapter 5. For StreamFS and related applications, we set up deployments in each of these buildings to collect sensor data. Our verification processes run on StreamFS as a mixture of external and internal processes, which we discuss in Chapter 4. We describe the specific setup of the infrastructure and deployment in their respective chapters. We give details about the sensors and the size of the data sets that are collected as well. We also ran four small deployments that we do not describe in the thesis: deployments in small office spaces at Nokia, Intel, and Samsung, as well as one in an apartment with a few appliances and meters.

2.7 Summary

In this chapter we discussed the historical motivation for the design and implementation of building management systems. We showed how they were built with the goal of supporting the building manager in her primary job: to maintain the building and deal with occupant complaints. A BMS takes the data from a large sensor deployment and displays it in the context of the building schematics, at a central location (i.e., her office). This makes it easier to find problems through a series of clicks, rather than by walking around the building and inspecting individual sensors and systems.

We introduced the notion of building “app-ification”. We separated the BMS into three distinct layers and focused on the graphical interface as one possible interactive modality for buildings. We explained that if we treat it as one of many possible interfaces, we can raise the narrow waist of interaction to the naming layer. We also discussed several emerging building applications: the graphical supervisory control app, the holistic building optimization app, and the mobile auditing app. We showed that current systems do not contain the necessary layering or architectural components and mechanisms to support emerging applications, and we proposed a set of necessary components.

We also introduced the notion of verification of building functionality and metadata through software. We talked about the relationship between systems and spaces, expressed in the naming of sensor streams, and how these relationships are used to interpret context. We argued that verification should be part of any architecture that manages the building. We gave an MPC controller scenario as an example and discussed how the misrepresentation or staleness of information can lead to gross control/energy-accounting errors, especially long term.

Chapter 3

StreamFS System Architecture

StreamFS is a system that addresses some of the shortcomings in the BMS architecture. StreamFS uses filesystem constructs to represent all building information. This includes sensors, actuators, location, processing, streams, categorical organization, etc. It borrows several mechanisms used in a Unix-style filesystem: files, folders, and the pipe abstraction for processing streaming data. It also employs the principle that everything is a file and that all interactions are through the filesystem. This eases management of both the raw data and the processing elements that produce derivative streams for further processing. In this chapter, we give an overview of the architecture – all the components, their organization, and their interaction. Throughout this chapter, we refer back to the list in Section 2.4, where we enumerate the shortcomings in the design of the BMS architecture in the context of building “app-ification”. We also discuss how each component provides one or more of the system properties we aim to provide – extensibility, scalability, generalizability, and ease of management.

3.1 Overview

Each component in StreamFS addresses the fundamental shortcomings discussed in Chapter 2. The four main components are highlighted below:

1. Name Register: The name register maintains both the object-id namespace and the hierarchical namespace that it exposes to external applications. It also maintains an entity-relationship graph that is used to support indirect relationships between objects. The name register manages various object types as well, which we will discuss in Chapter 4.

2. Subscription Manager and Forwarding Engine: The subscription manager manages the input stream and output paths to data-processing sinks. It is part of the publish-subscribe subsystem. The forwarding engine is used for internal and external data processing. It queries the subscription manager and name register to determine where an input datapoint should be forwarded.


Figure 3.1: StreamFS system architecture. The four main components – name register, subscription manager/forwarding engine, process manager, and timeseries datastore – are shown, along with the application layer at the top. (NS: name server; PE: processing element.)

3. Process Manager: The process manager manages the internal and external processes running in StreamFS.

4. Timeseries datastore: The timeseries datastore functions as the NTV storage engine discussed in Section 2.4.

StreamFS is built as a web service that resides in the cloud. It has several interfaces: one is direct TCP socket communication and the other is RESTful [98] over HTTP. We use HAProxy [47] to scale the service as it grows. We also design each component to be horizontally scalable, so that it grows with the size of the deployment and the number of applications. Figure 3.1 gives an overview of our architecture as well as the application layer.
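As a purely hypothetical illustration of the RESTful interface – the host, path, and query parameters here are assumptions for illustration, not the documented StreamFS API – a client might read a stream file over HTTP as follows:

    // Hypothetical read of a stream file over the RESTful interface.
    fetch("http://streamfs.example.org/soda/4F/410/temp?ts_start=0&ts_end=now")
        .then(function(resp){ return resp.json(); })
        .then(function(data){ console.log(data); });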

3.2 Name Management

The name management layer addresses Point 1 in Section 2.4. It provides a high-level narrow waist for accessing sensors in a context specified in the name itself. StreamFS manages two namespaces. The first is a flat namespace that identifies a particular object instance. The second is a hierarchical namespace that identifies the current instance of a particular object in some context, specified by the path for the object. We support two namespaces in order to uniquely identify sensors and actuators while supporting multiple names. Supporting multiple names for points in the building is a requirement in applications. Sensors and actuators can be accessed in various ways, depending on the application. Some applications access a sensor in the context of its placement in space. For example, in Soda Hall, the path /soda/4F/410/temp refers to a temperature sensor in Room 410. The same temperature sensor drives the fans and heating/cooling sub-system in the HVAC system that serves the room. Other applications refer to this sensor through its association with that sub-system. For example, /soda/hvac/ahu1/temp gives that contextual information in the name by using a path name that refers to the specific air-handling unit in the HVAC system that the temperature sensor drives. In the rest of this section we discuss how both namespaces are managed and implemented.
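The two namespaces can be pictured as a pair of tables: a flat table keyed by object id and a hierarchical table of paths – including symbolic links – that resolve to those ids. A minimal sketch using the Soda Hall example (the table layout is an illustration, not the StreamFS implementation):

    // Flat namespace: one record per unique object id.
    var objects = { "550e8400": { type: "stream", units: "F" } };

    // Hierarchical namespace: many paths may resolve to the same object.
    var paths = {
        "/soda/4F/410/temp":    "550e8400",  // spatial context
        "/soda/hvac/ahu1/temp": "550e8400"   // system context (symlink)
    };

    function resolve(path){ return objects[paths[path]]; }
    console.log(resolve("/soda/4F/410/temp") === resolve("/soda/hvac/ahu1/temp")); // true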

Object identifier namespace

When a new object is created, it is assigned a 128-bit identifier that uniquely identifies it. This flat namespace is large enough to support many objects with low probability of collisions, even across StreamFS instances. StreamFS only assigns a unique identifier to stream and control files, since these represent unique channels for a specific object in the deployment. We discuss StreamFS file types in Section 4.6.
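A minimal sketch of such an assignment using a standard 128-bit UUID generator; the thesis does not specify the generator, so this is an assumption:

    // Node.js: mint a 128-bit identifier for a new stream or control file.
    var crypto = require("crypto");

    function newObjectId(){
        return crypto.randomUUID();  // 128 bits of identifier space
    }

    // The chance of a collision among even billions of random 128-bit ids
    // is negligible, which is what permits reuse across StreamFS instances.
    console.log(newObjectId());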

Hierarchical Namespace

Hierarchical naming schemes are an effective way to organize information, particularly for a relatively small amount of information where the access patterns are well defined and groups across buildings have a lot of overlap. For example, upon close examination of the naming scheme for points across Sutardja Dai Hall, Soda Hall, Cory Hall, and the University of Tokyo Engineering Building 2, we observe that there are two overlapping group types. All the point names refer to the location of the sensor and the system that it is associated with. For example, ‘SODA1R430A ART’ encodes the name of the building and the room number, but also encodes the HVAC subsystem id – referred to by the 5th character, which is a ‘1’. The other common encodings include the type of sensor, which implies the S.I. units of measure. Based on our experience with analysis jobs on building sensor data, we decided this was less important from a naming perspective than an interpretive one.

Because the number of sensors in a building is on the order of one to ten thousand, we adopt the principles articulated in [106], which asserts that hierarchical organization of files is ineffective when dealing with a large number of files and that databases are poor at providing direct access to the data but provide a good way to find the information we are looking for. We combine these two, as suggested by the authors. We expose a hierarchical namespace that gives the user direct access to the data through a familiar organization of that data. The organization itself is directly traversable. Moreover, we separate the metadata from the naming structure, so that users looking for various kinds of information can quickly locate it.

The decision to separate these also gives our implementation better scalability. The namespace, metadata, and data grow at different rates. For example, files are often added and removed, but after creation, changes to the metadata occur less often than writes to stream files (data produced by sensors). Moreover, since the three kinds of meta/data are accessed in different ways, we tailor acquisition to the information being fetched and separate where and how we store each. For example, if a query is metadata-related, we send it to the metadata management cluster, which not only stores the metadata but also indexes it accordingly. The data is stored in a time-series database, and the namespaces are stored in a relational database. This allows each to grow separately and maximizes fetch efficiency. Overall, the hierarchical namespace provides extensibility and ease of management. We discuss our naming structure in more detail in Section 4.6 and give an overview of fundamental challenges that emerge when naming must reflect physical associations and the physical environment is changing.

Implementation Details

The name management layer is implemented behind HAProxy, an open-source load balancer. The implementation includes a name registry and several name servers; requests are forwarded from HAProxy to one of the name servers. Each name server knows of each of the databases that contain names. In all our deployments, we only had a single server. However, for large deployments, we put the names in multiple, replicated databases with a write-through update policy. Reads are done from any of the database servers, chosen randomly, since they all contain the same information.


Figure 3.2: Name management layer implemented behind HAProxy. Name servers handle individual requests and use the “name registration table” to handle name-lookup requests accordingly.

The write-through policy is implemented with a write lock. Whenever a name server receives a request to create or delete a file, it informs the other name servers that it wishes to acquire a lock. To prevent deadlock, we force a lock-acquisition order. A lock is not acquired unless every name server agrees to give up the lock to the requesting name server. Once the lock is acquired, the name server performs the same write on each server. The name server then releases the lock by contacting the other name servers in reverse order. If a name server has given up a lock and not received a release, the lock is released automatically after some time. If a name server goes down, the name server that acquired the lock assumes the release was successful. The name server list is immutable; in practice, name servers are restarted if they go down.

A layer of memcached [83] is used to reduce the load on the databases. Writes immediately invalidate any entries in memcached. We also include file metadata in the memcached layer, so its use reduces the load on both the name register and the metadata store. The security manager essentially maintains an access-control list and the set of operations that are supported by each file. By default, security is disabled, but some of our deployments do enable it.

The name management layer consists of three dependent components, each following the principles of horizontal scalability. The namespaces are managed in a single, replicable relational database. The metadata is managed in a separate MongoDB [88] database, which is itself shardable. The data associated with streams is managed in a timeseries database. We follow the principle of scale-out in each sub-component, for scalability.
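The deadlock-avoidance rule is the classic total-order lock-acquisition scheme. A minimal sketch, where acquireLock, applyWrite, and releaseLock are hypothetical RPC helpers standing in for the inter-name-server protocol:

    // Acquire peer locks in a fixed global order (preventing deadlock),
    // apply the same write on every replica, then release in reverse order.
    async function writeThrough(peers, op){
        var ordered = peers.slice().sort();      // fixed acquisition order
        for (var i = 0; i < ordered.length; i++){
            await acquireLock(ordered[i]);       // hypothetical RPC
        }
        for (var j = 0; j < ordered.length; j++){
            await applyWrite(ordered[j], op);    // same write everywhere
        }
        for (var k = ordered.length - 1; k >= 0; k--){
            await releaseLock(ordered[k]);       // reverse order; peers also
        }                                        // time out stale locks
    }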

3.3 Time-series Data Store

The timeseries data store addresses Point 2 in Section 2.4. Data collected from sensors is timeseries in nature: a sensor produces data periodically. The important aspects of a stream are the name of the feed, the time the reading was received, and the value of that reading. There is also metadata that needs to be stored about the stream. For example, we want to know what the units of measure are, the make/model of the sensor, the date it was installed, and any calibration parameters or other information that will help the user locate the sensor or interpret its readings correctly. We actively separate the storage of the metadata from the storage of the data. In constructing a design for the data store, we considered three main questions:

1. What is the typical access pattern or what are the top queries?

2. Should we compress it?

3. How is the data stored long-term?

The typical access pattern is that of scans. Many of the applications we consider that make use of historical data fetch the data in a temporally meaningful manner. The query specifies the interval of time over which to fetch the data from a particular feed and either performs cleaning operations on the data, displays it, or adjusts the scan parameters for a subsequent query. The data is largely self-similar and highly compressible. Simple compression tests we ran on real data showed a compression factor between 15 and 30. Also, the data is essentially append-only, forever. It can grow quite large, but grows at a fairly slow rate, especially after compression. For example, the total footprint of the SDH deployment, uncompressed, is nearly 100 GB; after compression it is only about 4 GB. All timeseries data is stored as a three-tuple that includes the name of the stream, a timestamp, and a value. The name we use in the datastore is the unique id that is generated by StreamFS.
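Concretely, each stored reading can be pictured as the following record, and the dominant query as a range scan over one stream (a sketch; the field names are illustrative):

    // One timeseries record: (stream id, timestamp, value). The id is the
    // StreamFS-generated unique identifier, not the human-readable path.
    var record = { id: "550e8400", ts: 1371868800, value: 72.4 };

    // Range scan over a single stream between two timestamps.
    function scan(records, id, tStart, tEnd){
        return records.filter(function(r){
            return r.id === id && r.ts >= tStart && r.ts < tEnd;
        });
    }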

Implementation Details

We use OpenTSDB [93] as our primary data store. We enable compression and index on the name and timestamp of the feed. OpenTSDB is a timeseries data store built on HBase [6]. HBase is designed to scale horizontally for very large data sets. OpenTSDB is a good choice because the compression features keep the footprint small and fast, while the append-only workload requires a scalable solution.


Figure 3.3: The timeseries data store. We use OpenTSDB, a timeseries data store that runs on a cluster of HBase instances.

3.4 Publish-Subscribe Subsystem

StreamFS uses a flexible construction of the publish/subscribe model in order to support a wide range of applications. It addresses Point 4 in Section 2.4. It includes both the process manager and processing features that support online analytical processing (OLAP). Publish/subscribe is necessary to support physical-data application development at scale. The publish/subscribe model in StreamFS provides mechanisms that enable a flexible combination of space and time decoupling, which enables StreamFS to support a wide range of application requirements, as described by Eugster et al. [38]. These features are also fundamentally necessary for scalability. Our pub/sub engine is also tightly coupled with the namespaces exposed to users, and this design choice allows an application to control the space coupling between the publisher and the subscriber (similar to TIBCO [113]).

Space decoupling

Space decoupling is achieved when the publisher and subscriber do not know of each other. By its very design, StreamFS achieves space decoupling: publishers do not hold references to subscribers, and subscribers do not hold references to publishers. However, because of the coupling of a full pathname and an object, subscriptions to topics expressed as a full pathname refer to a single publisher. Since we maintain a one-to-one mapping between the name and an object, only a single stream can fulfill the subscription request. Space decoupling is achieved when the topic is generalized using the star operator in a regular expression match. For example, if the user specifies to obtain all the feeds for /soda/4F/*, then the data from all publishers whose names match that prefix will be forwarded to the subscriber sink. Space decoupling is achieved because the subscriber and the publishers are unknown to each other.
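A minimal sketch of the star-operator match, assuming topics are compiled to regular expressions (illustrative; not the StreamFS matcher itself):

    // Compile a subscription topic such as "/soda/4F/*" into a regular
    // expression and test publisher pathnames against it.
    function topicMatches(topic, pathname){
        var pattern = "^" + topic
            .replace(/[.+?^${}()|[\]\\]/g, "\\$&")  // escape metacharacters
            .replace(/\*/g, ".*") + "$";            // '*' matches any suffix
        return new RegExp(pattern).test(pathname);
    }

    console.log(topicMatches("/soda/4F/*", "/soda/4F/410/temp"));        // true
    console.log(topicMatches("/soda/4F/410/temp", "/soda/4F/410/temp")); // exact: true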

Time decoupling

Time decoupling is achieved when the publisher(s) and subscriber(s) are not interacting at the same time. Time decoupling is achievable through the timeseries data store. Publishers push data to StreamFS whether or not subscribers are online. Moreover, data may be received at the subscriber even if the publisher becomes disconnected. Currently, subscribers do not receive all information that was missed. In order to achieve full time decoupling, we allow the subscription target to enable or disable the option to buffer all missed readings for an associated subscription target while it is offline.

Synchronization decoupling

Synchronization decoupling is when publishers are not blocked while producing events, and subscribers are asynchronously notified of an event while engaged in another process (i.e., not blocking on a wait). Synchronization decoupling is achieved by publishers and subscribers through StreamFS. Events may be received out of sequence relative to their arrival at StreamFS. This is true even when the subscription target is a processing element. The thread that buffers incoming data for each processing element is separate from the thread where the process is executed.

Implementation Details


Figure 3.4: Subscription manager and forwarding engine. These components manage the mapping from sources to sinks and forward data between client applications and internal/external processes or external subscribers.

The pub-sub system manager works with the name register to determine where to forward incoming data. When data arrives, it arrives with a tag that contains the name of the publisher. The subscription manager resolves the name to an object identifier and resolves all the names for the object. Then, it scans each of the names and matches them against subscriptions. If a subscription matches any name, the data unit is marked with the subscription id and sent to the forwarding engine. The forwarding engine then contacts the forwarding sink and sends it the data unit. The subscription sink may be a processing element managed by the process manager. If so, the data unit is forwarded to the process manager’s buffer, and the process manager copies it either to an internal buffer for a process that is running locally or to an external process stub, which copies it into an internal buffer on the client side.
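Pulling those steps together, the forwarding path can be sketched as follows, where resolveId, namesFor, and deliver are hypothetical helpers standing in for the name register and forwarding engine, and topicMatches is the matcher sketched earlier:

    // On arrival: resolve the publisher tag to an object id, expand to all
    // of the object's names, match against subscriptions, and forward.
    function onDataArrival(dataUnit, subscriptions){
        var oid = resolveId(dataUnit.tag);       // name register lookup
        var names = namesFor(oid);               // every path for the object
        subscriptions.forEach(function(sub){
            var hit = names.some(function(n){
                return topicMatches(sub.topic, n);
            });
            if (hit){
                dataUnit.subId = sub.id;         // mark with subscription id
                deliver(sub.sink, dataUnit);     // hand off to the engine
            }
        });
    }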

3.5 Data Cleaning and Real-time Processing

StreamFS provides sophisticated mechanisms to process data in real time. Processing features in StreamFS address Point 3 in Section 2.4. Sensor data is fundamentally challenging to deal with because much of it must be cleaned before it can be processed. For example, it is not uncommon to receive readings that are out of operational range or erroneous with respect to the previously observed trend, or to stop receiving readings altogether. This implies the need for processing jobs to provide a level of filtering over the raw streams. Once the data is cleaned, it is typically consumed by more sophisticated processes that aggregate it or use it for control of equipment. We provide the mechanisms for handling both classes of processing jobs with our process management layer.

We address re-sampling and processing models. The incoming data is not synchronized, so combining the signals meaningfully involves interpolation. We provide various options for performing the interpolation, chosen by the user depending on the units of the data. For example, temperature data may involve fitting a heat model to the data to obtain missing values in time.

For jobs that need to clean the data or wish to run small, simple operations, we provide an interface for internal processing. The user submits a job and we schedule it on a machine in the processing cluster. For jobs that are more complex and require client-side libraries, we offer a facility where the process is allowed to run on the client side but is entirely managed by StreamFS. We provide a client stub that essentially runs like a mini job scheduler on the client side and communicates with StreamFS to execute file operations that affect locally-running jobs. We discuss the details of both kinds of jobs in Sections 4.2 and 4.3. Finally, some processes want the freshest data from all the streams they are subscribing to, while minimizing the average time that the data for each respective stream has been waiting in the buffer. We address this problem through a freshness scheduler that is presented in Section 4.4.
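As one example of the re-sampling primitive, the sketch below linearly interpolates a sorted stream of (ts, value) readings onto a uniform timebase so that unsynchronized streams can be aggregated. Linear interpolation is only one of the options discussed above:

    // Resample readings (sorted by ts) onto a uniform grid by linear
    // interpolation between the readings that bracket each grid point.
    function resample(readings, tStart, tEnd, stepSec){
        var out = [];
        for (var t = tStart; t <= tEnd; t += stepSec){
            var i = 0;
            while (i < readings.length - 1 && readings[i + 1].ts < t) i++;
            var a = readings[i];
            var b = readings[Math.min(i + 1, readings.length - 1)];
            var v = (b.ts === a.ts) ? a.value
                  : a.value + (b.value - a.value) * (t - a.ts) / (b.ts - a.ts);
            out.push({ ts: t, value: v });
        }
        return out;
    }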

Implementation Details


Figure 3.5: The process manager manages a cluster of processing elements and connections to external processing units. It works closely with the subscription manager to forward data between elements.

The process manager works closely with the name register to manage process definition files and process instances. Process definition files are those submitted to the server by the user that define a function to run on the streaming data. Once data is piped to a process definition file, the process manager spawns an instance of the definition file on one of the processing-element (PE) execution servers. The PE creates a buffer for incoming data and sets up a job to run periodically according to the specification for the internal processing job. The internal processing job is mapped to a file that is accessible in StreamFS. It contains various statistics about the job that is running, such as the streams that feed it, the last time it ran, the amount of time it took to run, the period of execution, etc. If the user deletes the file, the process manager contacts the corresponding PE server that contains the job, and the job is killed. Once the job is killed, the PE informs the process manager, which informs the name register to remove the file. For jobs that are more complex and need to run externally, we created a client stub that runs like a mini-PE. It spawns a job on the client side when a user pipes data into it. It manages all instances of running jobs on the client server and processes requests associated with operations on the corresponding instance file represented in StreamFS. Figure 3.5 shows the components of the StreamFS architecture that handle all processing elements.

3.6 Entity-relationship Model

StreamFS supports symbolic links, which allow the user to specify non-hierarchical relationships. Indeed, they can be used to express relationships that form a directed graph. We find that many applications, such as control applications and applications that perform aggregation, precede their timeseries queries with multiple queries to determine indirect relationships between streams. Fundamentally, these queries are used to ascertain a multi-hop relationship between streams and could be easily and efficiently answered with a single graph query. StreamFS maintains an entity-relationship graph (ERG) that uses the names and symbolic links to construct a queryable graph. The ERG is used throughout our architecture, both for answering graph queries posed by the user and for subscription topic-matching. The latter is slightly different from the topic-matching mechanism traditionally available in pub-sub systems. We discuss the details of our approach in Section 3.4.

ERG to OLAP

Many of the operations that users perform are OLAP-style (online analytical processing) operations [46]. OLAP databases build a logical hypercube along multiple dimensions. The values in the cells are called measures. Queries make use of the cube to aggregate data along those dimensions. We provide OLAP-style queries using the ERG. This is similar to the work proposed by Chen et al. [21]. The time dimension is fixed, the categorical dimension is determined by the hierarchy, and the units are specified by the user. Below is a typical list of questions that can be answered efficiently with OLAP-style queries.

1. How much energy is consumed in this room/floor/building? On average?

2. What is the current power draw by this pump? Cooling tower? Heating sub-system? Over the last month?

3. How much energy does the computing equipment in this building consume?

These questions require the ability to answer a series of questions about energy flow – energy data aggregated across multiple categories to determine how, where, and how much is used. The ability to slice and dice the data allows the analyst to gain better insight into how the energy is being used. Notice how the questions translate into categorical and spatio-temporal queries. There is also a hierarchical grouping characteristic to them. For example, to answer the first question we must aggregate the data from the individual rooms up to the whole building (if whole-building data is not available). Also, the cooling system consists of the set of pumps, cooling towers, and condensers in the HVAC system that push condenser fluid and water to remove heat from spaces in the building. We can model this as a set of objects and inter-relationships which inform how to drill down, roll up, and slice and dice the data


Figure 3.6: This figure shows how we translate the OLAP cube to a hierarchical ERG. Note how the dimensions of the cube translate to the graph. The level of the subtree is the category, the unit is specified at the node, and there are values at each node for every time slice.

– traditional OLAP operations. Figure 3.6 shows how we translate an OLAP cube to a hierarchical arrangement of nodes in the ERG. Time-range queries on any node translate to a slice in the cube along the time dimension. Pivoting along any dimension can be done by taking slices along the “cube” in the bottom of that figure.

Unlike a typical OLAP cube, our “cube” is dynamic. As new data comes in, it is immediately used. Its arrival triggers the aggregator to interpolate values for all other related streams at that point in time, aggregate them, and update all the measures in the cube. We find that many applications that require analytics typically require OLAP-style queries. We describe how this functionality is useful in performing energy audits and describe its use in a mobile energy-auditing application presented in Section 5.2.

The entity-relationship model helps simplify these problems, both as an interface to the user and as a data structure for the aggregation processes. We argue that using the ERG in the building context reduces the cognitive load and makes the formulation of such queries easier. The inter-relationships are explicitly specified, and this allows users to maintain a structure that is more natural. Also, it has been shown that a relational model loses this information [107].
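A minimal sketch of a roll-up over the ERG hierarchy, assuming leaf nodes carry per-timeslice measures (the node structure is an illustration, not the StreamFS ERG format):

    // Roll a measure (e.g., kWh) up from the leaves of an ERG subtree.
    // Node shape: { name, measures: { timeslice: value }, children: [...] }.
    function rollup(node, timeslice){
        var total = (node.measures && node.measures[timeslice]) || 0;
        (node.children || []).forEach(function(child){
            total += rollup(child, timeslice);
        });
        return total;
    }

    // "How much energy was consumed on this floor?" is a subtree roll-up:
    var floor4 = { name: "/soda/4F", children: [
        { name: "/soda/4F/410", measures: { "2013-06-01T00": 1.2 } },
        { name: "/soda/4F/411", measures: { "2013-06-01T00": 0.8 } }
    ]};
    console.log(rollup(floor4, "2013-06-01T00"));  // 2 kWh for the floor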

Implementation Details

The graph is maintained in memory. A typical deployment contains about 10k nodes, so the memory footprint is not that large. In our deployments we kept the unit on its own machine with 4 GB of memory; the footprint is typically several hundred megabytes to about 1 GB in size. We used the JUNG graph library [61] to maintain the graph. To achieve horizontal scalability, we propose that the ERG be partitioned across distinct nodes in a cluster and that graph queries be answered across the cluster. This is a topic for future work, however, since a typical building deployment does not require that level of scale. A collection of buildings, managed through a single instance of StreamFS, would likely require a clustered implementation of the ERG.

3.7 Related Work

StreamFS borrows ideas from many places. Naturally, in adopting the filesystem as the main abstraction, we borrow directly from the design of Unix-style filesystems and pipes [99]. We translate these ideas into our implementation, transforming them into a web-services architecture combined with multiple databases. Our filesystem is also hierarchical, but it is completely divorced from the approach to storage. File information is spread across multiple databases in StreamFS. The name is made accessible through a RESTful interface or socket connection, but only the name and its structure are used to identify the object. Moreover, the notions of symbolic and hard links are used as a way to support multiple names for the same unique object. In traditional filesystems, these inform the access pattern to the file on disk. Our filesystem API is not POSIX compliant, since we introduce new file types and associated file-specific semantics. We discuss these in detail in the next chapter.

StreamFS constructs an entity-relationship graph [22] from the files in the system, in order to represent different physical and semantic relationships. We use it throughout the platform for analytical processing and querying. We couple the namespace that is exposed through the filesystem with an in-memory graph that represents the inter-relationships between the files. This graph is used to provide OLAP-style queries, similar to the work by Chen et al. [21]. These authors describe how they translate OLAP operations to a graph. Their decomposition is different in that nodes in their graph explicitly represent measures and dimensions. We do represent the time dimension as a separate snapshot of the graph at another point in time. By separating the raw data from the graphical representation, this becomes a trivial transformation.

The use of a filesystem representation has been explored by researchers. Tilak et al. [114] use the filesystem interface as a way to represent different naming conventions for a deployment of sensors. StreamFS is similar in that it also adopts the construct for naming purposes. StreamFS also adopts it as a way to expose a uniform access layer for sensors and actuators in the world. Actuators, in many ways, are also modeled like peripheral devices. However, StreamFS is more general and adopts more filesystem features and principles. They do not support symbolic links, and they couple the network organization of a wireless sensor network with the naming structure. Filipponi et al. [124] create a mountable filesystem that is POSIX compliant, making the deployment nodes look like a peripheral device that can be directly written to. In contrast to both pieces of work, we do not attempt to make a POSIX-compliant filesystem. We adopt conceptual filesystem constructs and re-interpret them as features in a web architecture. We tightly link the notion of a sensor and its data through special file types and operations that handle sensor-data queries and organization.

The foundation of our processing framework is a pub/sub system architecture, similar to [37, 102, 125, 113]. StreamFS has a flexible pub/sub coupling architecture, where user options can separate certain dependencies between producers and consumers. We also make use of the ERG to do topic-matching. This is crucial for building applications, where analysis is tightly coupled with context.
In addition, our processing framework revisits many issues related to dataflow processing systems [27, 73, 16]. StreamFS is focused on providing a tool that cleans the incoming data and provides real-time data to consumers/applications for the building. It is not focused on true real-time constraints or modeling.

3.8 Summary

In this chapter we gave an overview of the main components in StreamFS. Each of the components addresses the concerns stated in Section 2.4. The filesystem name server exposes a uniform namespace for accessing sensors and actuators deployed throughout the building. The timeseries database serves to store streaming physical information and is optimized for the scan-style queries posed by applications. These address Points 1 and 2. We also include a pub-sub system which serves multiple purposes. It provides real-time data forwarding for external applications and forwards data internally to processing units that are specified or linked by the user. This addresses Point 4. Finally, we introduce processing elements, both internal and external, to address Point 3. We also introduce an entity-relationship graph to deal with indirect relationships that are expressed in the construction of names in the system. In the next chapter we talk more about processing and discuss the details of the scheduler that help enable applications that have certain delivery requirements.

Chapter 4

StreamFS Files and Process Mechanisms

This chapter focuses on StreamFS mechanisms for process management, scheduling, and piping, and gives details about the filesystem abstraction and naming. Processing of sensor data is a fundamental component of any analytics engine. Sensor data is often dirty [11] and must be cleaned before sophisticated processing jobs can consume it. In this chapter we discuss the details of our process manager and process scheduler. We describe how our process manager handles both internal and external processes and manages the buffers for all the jobs. We give a detailed description of the job-specification API and describe how the user interacts with StreamFS in order to activate, manage, and de-activate jobs. We also describe a class of applications that require the freshest data and present a new algorithm for providing the freshest set of data points to those jobs in a timely fashion. This chapter also describes the use of the filesystem abstraction for representing streams in space. The filesystem naming convention provides a unified namespace to applications for accessing physical resources and streams. Moreover, we support multi-naming through symbolic links – an important requirement for building applications.

4.1 Process Management

The process management unit in StreamFS manages jobs submitted by users. There are two different kinds of jobs: internal jobs and external jobs. Internal jobs are managed internally in a process-element cluster. StreamFS accepts javascript code from the user and maintains it until it is activated. Activation occurs when a pipe process to the specified job is instantiated. External jobs are written in any language, run on the client machine, and communicate with StreamFS through the external-process job stub. An external processing job is one that interacts with external code. StreamFS creates an associated file for managing the external jobs from a central location. Processes are managed through the filesystem. Our design is guided by the principle that everything is a file and every entity is represented in the filesystem. We use the terms subscription and pipe interchangeably; however, for explicit disambiguation, we define a subscription as a one-way forwarding process of stream data to an external target. A pipe is a type of subscription to a stream from an instance of a process file. The process can be internal or external and always has its output represented by a stream file. The latter allows us to construct process chains that can be linked via their associated stream files. This section discusses the difference between internal and external processing jobs.

4.2 Internal Processes

Internal processing jobs are short jobs submitted by users that are managed by StreamFS. When a user submits a job they specify the name of the job in the request. The process definition is checked for syntactic correctness and, if it passes, it is accepted. All newly created job definitions are placed in /proc.

Listing 4.1: Simple aggregator process job.

    function(buffer, state) {
        var outObj = new Object();
        var sum = 0;
        for (var i = 0; i < buffer.length; i++)
            sum += buffer[i].value;               // sum the buffered readings
        state.cnt = (state.cnt || 0) + 1;         // run count persists across executions
        outObj.ts = buffer[buffer.length - 1].ts; // timestamp of the newest point
        outObj.value = sum / buffer.length;       // report the window average
        return outObj;
    }

The code in Listing 4.1 gives a snippet of example code that aggregates the data passed to it in the buffer. If the user names this definition simpleagg, then the file that corresponds to the job definition is /proc/simpleagg. We do not check the size and complexity of the job; however, we strongly recommend that jobs be kept small and simple. Generally, we recommend that a job pipeline be established rather than feeding one large chunk of code; it makes the pipeline easier to debug once it is activated. In order to activate the process, the user must pick a set of streams and initiate a subscription pipe. The subscription manager notes that the sink is a process definition file and informs the process manager. The process manager fetches the code and passes it to a node in the process cluster. The process manager generates an id and passes it back to the subscription manager. That id names a new instance file that represents the output of the process, and that instance file is created in the process definition file's folder. So if the id generated is 550e8400 the corresponding file is /proc/simpleagg/550e8400.

This file instance is a stream file (which we discuss in detail later in this chapter), which allows the user to pipe it to another sink. There is also some metadata information associated with each of the files created. The instance file contains information about the streams it is consuming and various statistics, such as execution time and execution period. If the user deletes the file, the subscription is deleted and the process manager sends a kill message to the corresponding process element. The function must have the signature shown above, where the first parameter is the buffer and the second parameter is a state variable. It must also return a variable of type object. The buffer is an array of data objects. Each data object contains two fields: a ts field that is the timestamp for the data point and a value field which is the value for the data point. An optional setting is to also include all attribute-value pairs that are part of the metadata associated with the corresponding stream file for the stream. The state variable is an object that is passed across executions of the function. This way the function can maintain any necessary state without losing it after a single run of the job. Upon creation, the user specifies other parameters that drive the execution period and/or execution conditions for the job. The user specifies a window size and/or a timeout parameter. If the window is of size k then the job will run when the incoming buffer for the job has at least k elements in it. The timeout sets the minimum period for the job to run. If both are specified the timeout always runs and the job will also run when/if the buffer has at least k elements.
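To make the activation flow concrete, the sketch below pipes two streams into the simpleagg definition with a window of 10 points and a 30-second timeout. The host name, endpoint layout, and field names (op, streams, winsize, timeout) are our own illustrative assumptions, not the exact StreamFS wire format.

    // Hypothetical pipe-activation request; endpoint and field names are assumed.
    fetch('http://streamfs.example.org/proc/simpleagg', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        op: 'pipe',
        streams: ['/room/410/temp', '/room/420/temp'],
        winsize: 10,   // run when at least 10 points are buffered...
        timeout: 30    // ...and at least every 30 seconds regardless
      })
    }).then(function (resp) {
      // On success, StreamFS would return the generated instance id,
      // e.g. one naming a file such as /proc/simpleagg/550e8400.
      return resp.json();
    }).then(function (body) {
      console.log(body);
    });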

Figure 4.1: This figure shows the internal structure of a process element (PE). The PE consists of several javascript sandboxes where internal process code runs. It also contains a scheduling loop, message router, and instance monitor.

The process element is a stub that runs on a separate machine. Its main job is to manage the jobs being executed on that machine. Figure 4.1 shows the components of a process element. There are four main components in the unit.

1. Scheduler: The scheduler manages the execution schedule for each job running in a sandbox. The scheduler uses the timeout as the execution period of the job.

2. Sandbox: The sandbox is where the javascript code runs. The sandbox contains an input and output buffer and simply waits either for an execution signal from the scheduler or until the input buffer has window elements in it. The output buffer is of size 1. The object returned by the job is written to the output buffer.

3. Message Router: The message router communicates with the process manager and sends control and data messages between the PE and the process manager. It monitors the output buffer for each of the elements and forwards them to the process manager when it is populated.

4. Instance Monitor: The instance monitor manages the state of the sandbox. If a sandbox dies, the instance monitor re-starts it. If it continues to die, it kills the sandbox process and forwards an error message through the message router to the process manager. The process manager annotates the instance file with the error.

The process manager keeps a map of which process elements own which jobs. If a process element goes down, its jobs are automatically migrated to another process element and the failed process element is removed from the active list. Each PE and job has a unique identifier. The messages received from the message router in the PE are annotated with these ids and the process manager uses them to update the associated instance file in StreamFS. Again, everything is managed through the filesystem, so errors and data are exposed to external applications through the metadata and data in the files. Note, the process manager maintains a mapping between job instances and PEs. It also notes the machine the PE is running on. The instance monitor allows the process manager to make decisions about job migration in case of failure. PEs are designed to function independently from one another. This allows the process cluster to scale linearly with the load – in terms of the number of streams being processed and the number of active jobs. Like the other components in StreamFS, it is horizontally scalable. The process manager is a single point of failure: if it fails it must be restarted manually, and any messages sent from the PE to the process manager during the down period are lost.

OLAP-style Aggregation

Online analytical processing (OLAP) provides summarization of data from a set of underlying data repositories (data warehouses). Traditionally, OLAP is used to process business data. Business data summarization allows an analyst to ask targeted questions about aggregates and trends in their data. The data is typically multidimensional in nature and operations can be performed with respect to those dimensions and their inter-relationships. In our deployments, we find that most queries are similar to OLAP queries, with scan-heavy queries across the time dimension.

We introduce a mechanism called aggregation points that can perform hierarchical aggregates across a unit of measure, used in combination with the timeseries data store to support scan-heavy queries. We discuss these in more detail later in this section. However, before discussing aggregation points, we give details on how we make use of the entity-relationship graph and timeseries database to support OLAP-style queries. OLAP schemas logically construct a hypercube with different dimensions along each axis. We presented a visual translation of an OLAP cube to an entity-relationship graph in Section 3.6. Specifically, we showed in Figure 3.6 that each dimension is essentially a unit of measure, the hierarchy is explicitly constructed, and time is a dimension that is stored in a timeseries database. We show how the terminology translates, along with examples of certain classes of OLAP queries, how they are constructed, and how they are satisfied by StreamFS.

Measures, Dimensions, and Levels

Measures refer to the actual values, located somewhere in the cube along the intersection of several dimensions; in StreamFS a measure is simply a reading. Dimensions are labels for an axis – in StreamFS a dimension is a unit of measure (e.g. Fahrenheit) or time; an example is shown in Figure 3.6. The hierarchical level is explicit in the construction of the names. So in the subtree that organizes streams according to their location in space, the floor level would be above the room level. These points can be designated as aggregation points, whereby all streams that have the specified units are aggregated at that point and the data is stored like any other stream.

Operations: drill-down and roll-up

Drill-down and roll-up are explicit in the structure. You can drill down to individual readings or roll them up into an aggregation point at a particular level in the hierarchy. Essentially, the level is explicitly specified in the name of either the raw stream or the aggregation point, as the sketch below illustrates.
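For illustration, using the query interface shown in the next subsection, a roll-up reads from an aggregation point while a drill-down reads an individual raw stream; the paths here are assumed examples rather than paths from our deployments.

    // Roll-up: read a floor-level aggregation point (assumed path).
    query.slice('/4F/agg').start(t1).end(t2);
    // Drill-down: read one raw stream in a single room (assumed path).
    query.slice('/4F/R410/temp').start(t1).end(t2);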

Operations: Slice and Dice

Slice and dice queries either take a slice of the cube along a dimension or pick a sub-cube from the cube. Slice and dice operations are translated as traversals of the hierarchical structure across levels and units. Figure 4.2 shows how a slice query is satisfied on a hierarchical structure. The corresponding slice query is query.slice('/4F/R*').start(t1).end(t2);. Figure 4.3 shows how a dice query is satisfied on a hierarchical structure. The corresponding dice query is query.dice('/4F/R*').units(['F']).start(t1).end();.

Figure 4.2: This figure shows how a “slice” operation is translated from the cube to the ERG. The user queries across all streams or aggregation points at a certain level, specified by a star level query with the level-specific prefix. The corresponding slice query is query.slice('/4F/R*').start(t1).end(t2).

Figure 4.3: This figure shows how a “dice” operation is translated from the cube to the ERG. The user queries across all streams or aggregation points at a certain level, specified by a star level query with the level-specific prefix. The corresponding dice query is query.dice('/4F/R*').units(['F']).start(t1).end().

Operations: Pivot

Pivot operations are not explicitly supported, although if you cut across a hierarchical level across all dimensions, you can effectively construct a pivot-style query.

Aggregation Points

StreamFS makes use of OLAP-style mechanisms to provide the end-user with aggregation points in the hierarchy. StreamFS distinguishes between nodes that represent streaming data sources and those that do not. Those that do not, however, can be tagged as aggregation points. When a node is tagged as an aggregation point, all the nodes rooted at that node in the tree are set as aggregation points and all streams are aggregated and saved to the timeseries database. Since streams are not synchronized (i.e. they do not produce data at the same time), each time a value arrives from a sensor the value for all other streams is interpolated and aggregated along the unit-level dimension. This propagates up to the aggregation point and all of it is saved in the timeseries database. The aggregation function can be specified by the user, with the default set to sum. Other options include avg, max, min, and a custom function that can be specified by the user.
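A request to tag a node as an aggregation point might look like the following sketch. The endpoint and attribute names (aggpoint, units, func) are our assumptions for illustration, since the text above fixes only the semantics, not the wire format.

    // Hypothetical request tagging the container /4F/R410 as an aggregation
    // point that sums all KW streams beneath it.
    fetch('http://streamfs.example.org/4F/R410', {
      method: 'PUT',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ aggpoint: true, units: 'KW', func: 'sum' })
    });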

Figure 4.4: This shows an illustration of the aggregation tree used by dynamic aggregation. Data flows from the leaves to the root through user-specified aggregation points. When the local buffer is full the streams are separated by source, interpolated, and summed. The aggregated signal is forwarded up the tree.

Consider the following example: we need to compute aggregates of KW data, so we declare the node for a particular room as the point of aggregation, accept data from all children of that node whose units are in KW, and add the streams together over a pre-defined window size or pre-defined timeout. The scheme is hierarchical, so a node only accepts data from its children and only sends data to its parent. StreamFS checks for cycles before node insertion and prevents double-counting errors by only allowing aggregation points that are roots of a tree that is a sub-graph of the entity-relationship graph. In our deployment, each view is managed as an independent hierarchy. So the hierarchy of spaces is separate from the inventory hierarchy or the taxonomy hierarchy. This allows us to ask questions with a particular view in mind, without conflict, and is a natural fit for our aggregation scheme.

Although there are different semantics applied to different file types at the application layer, dynamic aggregation mainly deals with two types of files: (1) container nodes and (2) stream nodes. The main difference is that container nodes are not explicitly associated with data coming from a sensor and stream nodes are. Furthermore, container nodes can have children, while stream nodes cannot. In our application, meters are represented by container nodes and each stream of data they produce is a stream node. When an aggregation point is enabled, dynamic aggregation places a buffer at the node for the type of data that should be aggregated. If we want to aggregate KW data, we specify the type and send an enable-aggregation request to StreamFS. The flow of data starts at the leaves when a stream node receives data from a sensor. As data arrives it is immediately forwarded upstream to the parent(s). Ignoring the timeouts for now, let's imagine the parent is a point of aggregation and its buffer is full. At this point the parent separates data into bins for each source and cleans it for aggregation through interpolation. The main operation is to stretch and fill the data with linearly interpolated values. The stretch operation orders all the timestamps in increasing order and, for each bin, interpolates the values using the first (last) pair of data points. If there is only a single data point, the stretch uses it as the missing value. The fill operation finds the nearest timestamps that are less than and greater than the missing sampling time, uses their values to determine the equation of a line between them, and interpolates the missing value using that equation. Once this is done for each signal, the values are added together for each unique timestamp and the aggregated signal is reported to the parent, where the operation occurs recursively to the root. Figure 4.4 shows an illustration of this processing structure.
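The fill step described above amounts to ordinary linear interpolation per bin. A minimal sketch, assuming each bin is a non-empty, time-sorted array of {ts, value} objects:

    // Interpolate the value of one bin at a missing sample time t.
    function fill(bin, t) {
      var lo = null, hi = null;
      for (var i = 0; i < bin.length; i++) {
        if (bin[i].ts <= t) lo = bin[i];            // nearest point at or before t
        if (bin[i].ts >= t) { hi = bin[i]; break; } // nearest point at or after t
      }
      if (lo === null) return hi.value;  // t precedes the bin: stretch the first value
      if (hi === null) return lo.value;  // t follows the bin: stretch the last value
      if (hi.ts === lo.ts) return lo.value;         // exact hit or single point
      var slope = (hi.value - lo.value) / (hi.ts - lo.ts);
      return lo.value + slope * (t - lo.ts);        // equation of the line between them
    }

Summing fill(bin, t) across bins for each unique timestamp t then yields the aggregated signal that is reported to the parent.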

Dealing with dynamics

This approach deals with changes in the graph quite naturally. All aggregation points deal only with local data, so a node is only concerned about the children that give it data and the parent to send data to. As objects in the environment move from place to place and these changes are captured, the entity-relationship graph also changes to reflect the move. This change in aggregation constituents is naturally accounted for in the aggregate. Since the underlying pub-sub mechanism drives the forwarding (see Section 3.4), the removal of a child means any new data that arrives for the removed node is rejected. As such, it will not be accounted for in the aggregate. If a new child is added, its data is forwarded up the tree, since the name associated with the data point will indicate its position in the hierarchy. Note, however, that changes in the entity-relationship graph are indistinguishable from energy-consuming items that have been turned off. For the purposes of aggregation, that is okay. OLAP-style aggregation is an expensive feature. Each point in the subtree rooted at the aggregation point is automatically activated to produce streaming data. Special internal processing elements are enabled to compute the aggregates and run the pre-processing, and communication between the processing unit and name server increases as data is returned from the process elements to be routed to the timeseries data store. The amount of processing and storage scales linearly with the number of streams and the number of aggregation points in the sub-tree.

4.3 External Processes

In real deployments, users do not want to be limited to the particular libraries available to them in javascript, or they have already made a significant investment in time writing and testing their own processing jobs. For these users, we provide an external client stub that re-directs data from standard in/out through a network connection to/from the StreamFS process manager. The stub also interprets process management commands to spawn and kill jobs and associate different subscriptions with different instances of jobs running on the client side.

Figure 4.5: External process stub. Note, it contains similar components to a process element and functions much the same way, managing the buffers, scheduling, errors, and communication on the client side like the PE.

Figure 4.5 shows the internal components of a client stub. Note, it contains very similar components to a process element and works in a similar fashion. The client stub contains four major components:

1. Scheduler: The scheduler schedules the execution of jobs on the client machine. It uses the window and timeout parameters to set when and how often the job is set to run.

2. Process: The process component is simply in charge of managing communication with the process that is spawned on the local machine. It maintains a pipe to each local process. The processes run independently and share no memory directly. Data from the input buffer is copied to the pipe connection to the process.

3. Message Router: The message router maintains a socket connection to the process manager and writes to the associated buffer for specific jobs.

4. Instance Monitor: The instance monitor re-starts jobs when/if they fail and forwards local errors to the process manager to annotate the associated files in StreamFS.

On startup, the client stub reads a local configuration file that specifies the path to the job and metadata that describes the job. These are used to register the job with StreamFS and set the metadata attributes. The registration on the StreamFS server is exposed through an external process file. The user interacts with an external job exactly the same way they interact with an internal process job. In order to spawn a job on the client, the user simply “pipes” a stream file to the external process file. The creation of the pipe/subscription sends a spawn message to the client and starts the associated job on the client. Once the process is started, data from the stream(s) is forwarded to the client, which writes it to the standard-in of the client job. Starting a job also creates a stream file on the StreamFS server. Any data that is produced by the job and written to standard-out is re-directed to the server and made available through the stream file. This design is consistent with the semantics of pipe/subscription management and functionality. Recall, internal processes work the same way, and this allows us to stay consistent with the file-centric principle whereby everything is managed through the filesystem itself as a file.
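An external job itself can be as simple as a filter over standard-in/standard-out. The sketch below is a Node.js job that emits a running average; the newline-delimited JSON framing is an assumption about the stub's format, which the text above does not pin down.

    // Minimal external job: read data points on stdin, write aggregates to stdout.
    var readline = require('readline');
    var rl = readline.createInterface({ input: process.stdin });
    var sum = 0, n = 0;
    rl.on('line', function (line) {
      var point = JSON.parse(line);      // expects objects like {ts: ..., value: ...}
      sum += point.value;
      n += 1;
      process.stdout.write(JSON.stringify({ ts: point.ts, value: sum / n }) + '\n');
    });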

4.4 Freshness Scheduling

Process execution parameters are set by the user when a process instance is created. Generally, job scheduling is strictly driven by these parameters. However, in our deployments we found there were job pipe sinks that required a minimum freshness property for the set of data points in the buffer upon consumption. In this section we discuss our scheduling algorithm in relation to providing this property.

Maximizing Data Freshness

Data comes in at different, independent rates from sensors and is produced asynchronously by internal processing elements. Certain processes need the freshest data from all the streams they subscribe to, while minimizing the average time that the data for each respective stream has been waiting in the buffer. Some streams show lots of variability in their value distribution over time, driven by the underlying dynamics of the system being monitored. For example, the power consumption of an active server or laptop tends to have a varying power-draw profile over short time scales. For jobs doing aggregation of streaming data, it is often the case that the time when the last reading was received is very different across streams. For example, consider two rapidly changing streams, as shown in Figure 4.6. Each line represents the fundamental frequency of a different physical phenomenon. The vertical lines represent report times, as observed at the StreamFS server – the times when the data points from those sensors are received.

    Given a full buffer b[n]:
      for each element i in b:
        (1) Calculate the staleness of element i and add it to the total staleness, Sn.
        for each other element in the buffer:
          (1) Determine the next report time, Di, for this element.
          (2) Determine the staleness of all the elements if we wait until Di.
          (3) If it is the smallest staleness figure calculated so far, replace the
              minimum cost, Sl.
      if Sl is less than Sn:
        (1) Wait until later to consume.
      else:
        (1) Consume now.

Algorithm 1: min buffer algorithm.


Figure 4.6: This figure shows an example of two streams with different sampling frequencies. Since we do not know the underlying fundamental frequency of the phenomenon, our algorithm attempts to minimize error by minimizing staleness (or the average time a data point is in the receive buffer).

The circular dots show the value of the measurement that is in the buffer. Assuming we have a subscriber that subscribes to only these two streams, the point labeled with a ‘1’ shows the first time that we can aggregate the two readings, ‘2’ shows the second time, etc. Note, the ideal aggregator minimizes aggregation error in the sum, as specified in the figure. Since a measurement was received at time t=2 and t=3, there will be some error associated with the sum – the difference between the actual value of the phenomenon at time t=3 and the value in the buffer (received at time t=2). The longer we wait to compute the aggregate, the higher the error, until it resets when a new reading arrives. StreamFS does not know the underlying fundamental frequency of the phenomenon being measured, so it uses the staleness of the measurement to approximate the error and a running report average for each stream to decide when it is best to compute the aggregate operation. Since the fundamental frequency is unknown, the driving assumption is that the more stale the reading, the higher the measurement error. Our measurement scheduler attempts to minimize aggregation error by minimizing average buffer staleness – the time between when the computation is taken and when the data point was received. Note again from the figure that the best time to take a measurement is at t=6, since both data points come in at the same time and will have an average staleness of 0. For applications that wish to display the freshest aggregate (and approximate minimum error) we provide an algorithm called min buffer. The min buffer algorithm discards readings until two conditions are met:

1. There is at least one data point from each stream in the subscription buffer.

2. The staleness factor is minimized within the immediate time window.

Note, there is a fundamental tradeoff between the staleness factor and variability of consumption. It is sometimes better to wait for the next incoming data point than it is to use what is currently in the buffer, as waiting will decrease the overall staleness factor. Other times, it is better to consume the buffered data immediately. This causes a certain amount of variability in the delivery period to the control process. However, for some applications, this is a reasonable tradeoff to make. Making use of the freshest data is desirable for minimizing errors, either in the control of a system or the calculation of some aggregate state. Generally, the error grows with staleness; therefore the goal of this mechanism is to continuously minimize the error associated with staleness through scheduling. Let A_i be the arrival time of the last data point received from stream i and D_i be the expected arrival time of the next data point from stream i; their relationship is described in Equation 4.1, where T(i) is the average period between arrivals from stream i.

D_i = A_i + T(i)    (4.1)

Periodically, our algorithm runs and checks if there is a data point for each stream in the subscription. If so, the min buffer algorithm runs and effectively decides whether to execute the job on the current buffer immediately or whether to wait until later, when the staleness factor of the buffer will be at a minimum. This decision is driven by Equation 4.3, whereby we find the next deadline, computed with Equation 4.2, for each stream in the set and determine what the staleness factor of the entire buffer will be if we wait until that deadline arrives.


Figure 4.7: Multiple streams in a subscription and their associated parameters.

t_{L,i} = A_i + \lfloor (D_k - A_i) / T(i) \rfloor \, T(i)    (4.2)

If there is no deadline D_k for some stream k such that Equation 4.3 holds, then we execute now. Otherwise we choose to wait until D_k for the stream whose next deadline minimizes the staleness factor of the buffer.

\sum_{i=1}^{k-1} (D_k - t_{L,i}) < \sum_{i=1}^{k} (t_{now} - A_i)    (4.3)

Algorithm 1 shows the pseudocode for the min buffer algorithm.
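A compact sketch of the decision, following Equations 4.1 through 4.3; all identifiers are ours, and the real scheduler additionally maintains the running averages T(i) from observed arrivals.

    // Decide whether to consume the buffer now or wait for a later deadline.
    // A[i] is the last arrival time of stream i, T[i] its average period.
    // Returns the deadline to wait for, or null to consume immediately.
    function minBufferDecision(A, T, tnow) {
      var k = A.length;
      var waitUntil = null;
      var bestCost = 0;
      for (var i = 0; i < k; i++) bestCost += tnow - A[i]; // staleness if we consume now
      for (var j = 0; j < k; j++) {
        var D = A[j] + T[j];                 // next expected arrival, Eq. 4.1
        if (D <= tnow) continue;             // already due; waiting gains nothing
        var cost = 0;
        for (var m = 0; m < k; m++) {
          if (m === j) continue;             // stream j is fresh at its own deadline
          // last arrival at or before D, Eq. 4.2
          var tL = A[m] + Math.floor((D - A[m]) / T[m]) * T[m];
          cost += D - tL;
        }
        if (cost < bestCost) { bestCost = cost; waitUntil = D; } // Eq. 4.3 comparison
      }
      return waitUntil;                      // null means consume the buffer now
    }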

4.5 Dynamic Aggregation Example and Freshness Scheduling Results

We present a demonstration of dynamic aggregation and describe our methodology for and evaluation of our freshness scheduling algorithm. The first demonstration shows how dynamic aggregation works in a real-world scenario, as a mobile sensor is moved from one room to another. The evaluation sets up a realistic scenario and measures how the algorithm performs in comparison to the standard scheduling approach.

Dynamic Aggregation Scenario

We illustrate dynamic aggregation with a common usage scenario. It is typical to consider widespread deployment of wireless meters when performing an energy audit in a building. Because such meters are expensive, they are often re-used in order to capture a broad sample of energy-consuming items. We construct an energy auditing application that provides the user with a mobile mechanism for virtually “binding” meters with items, so that accounting of energy can be done correctly. Without recording the explicit association between the meter and the item it is metering, the data stream collected from the meter is meaningless. Moreover, it is not just meaningless with respect to the item but meaningless in the broader context of aggregation (i.e. room or floor-level consumption statistics). For a more specific scenario, imagine there are a number of people in a building, each owning a number of plug-load appliances and a laptop. When a person is in a room their laptop is plugged in, and when they leave the room they unplug their laptop and take it with them. As people come and go they attach their laptop to a registered meter and the association is automatically recorded in StreamFS. The meter is constantly reporting readings to StreamFS as well. We set up this experiment in a home office environment with two rooms and set up the room-level nodes as aggregation points for all the meters in the room. We monitored 2 laptops and 2 lamps, and we demonstrate the aggregate power draw of both rooms as we move one of the laptops from Room 1 at tick 7, walk to Room 2, and register the laptop in Room 2.

Figure 4.8: (a) Room 1 object and aggregate streams; (b) Room 2 aggregate. The power consumed by a laptop in Room 1 is shifted to Room 2 at time t=7. Notice the aggregate drops in Room 1 while it rises in Room 2.

Figure 4.8 shows the process of dynamic aggregation. Notice that around tick 5 there is a drop in the total power draw curve. As we walk to the new location, the energy consumption in Room 1 remains steady at the lower level. Then around tick 12, the energy consumption of the Room 2 aggregate rises. Note, the level of rise and fall of both curves is the same: 30 Watts.

Figure 4.9: This figure shows the tradeoff between staleness and the number of streams being consumed by the job. Note that our algorithm reduces the staleness of the buffer.

Maximum Freshness Methodology and Results

We simulated the effects of the min buffer algorithm shown in Algorithm 1. We start up multiple streams and have them feed a subscription with the algorithm option enabled. A single run of the experiment consists of 10 runs with k active streams, where k is between 1 and 20. We let it run for some time and record the total staleness for each run, which is the sum, over all data points, of the time between when the data point arrived and the current time. We compare these against the default case where data is delivered as soon as at least one data point from every stream arrives. Figure 4.9 shows the results of our experiment. Note, the staleness increases for both cases. The variance of both is also similar; however, the average staleness is lower when the min buffer algorithm is used. It is important to note that this is always true, since the decision is explicitly bounded by the “now” case: the algorithm chooses between “now” and waiting for a better staleness calculation. We also examine the predictability in the report rate of the job that enables this feature. Most streams in the system report data periodically with a very small variance around the delivery mean. However, because the algorithm may decide that waiting is better, the delivery period variance is larger. Observe in Figure 4.10 the effects of min buffer versus the default case. We see that the averages are practically the same; however, the error bars on the graph on the right are larger. This should be considered when the feature is enabled. If there is a job that consumes feeds from the output of a process that has enabled this feature, it should tolerate variable delivery rates.

Figure 4.10: This figure shows that the min buffer algorithm provides a similar average execution period but generally at the cost of higher variance in delivery times.

4.6 Naming and the Filesystem Metaphor

Most applications construct the notion of context using the naming convention ascribed to a sensor stream. The name conflates the notions of system, space, and type information. At the very least, these three should be supported; however, often other categorical needs must be met to perform various kinds of aggregate statistics, analytics, and control. In addition, we need to support the management of processing jobs that process stream data and provide integrated management facilities for them. Building applications are essentially monitoring and control applications built on the streams generated by sensors embedded throughout the building, or on distillates of them. As the number of applications and streams increases, it becomes desirable to manage them in a centralized fashion. Moreover, the centralized approach allows all applications to make use of a uniform naming convention and can allow applications to be interoperable. Systems that wish to support such applications require the following properties:

1. Logically accessible physical resources.

2. Representation of data producing and data consuming elements.

3. Representation of inter-relationships between elements.

4. Uniform naming and access.

4.7 File Abstraction

Our naming scheme is hierarchically structured, like traditional filesystem naming, with support for symbolic links, allowing arbitrary links between sub-trees. We argue that this naming scheme is crucial, as it exposes the inter-relationships which inform the aggregation semantics intended by the user. We introduce a naming scheme for physical objects and their inter-relationships.

Figure 4.11: Everything is a file. A temperature sensor is represented as a file in a folder; there is a folder for each room. Note, the file that represents a temperature sensor producing a stream is given a unique identifier. The user may also decorate the file with extra metadata for searching purposes.

We represent everything as a file. This provides ease of management for our deployments. Similar requirements to those aforementioned have been addressed in the design and implementation of filesystems. Filesystems provide logical access to physical resources through files, with different files and associated semantics, exposed to applications through a shell or programmatically. Filesystems represent collections of bits, encapsulated by a file, and grouped with folders. Symbolic links support the notion of multi-naming: a single file or folder can have multiple names that lead to the same underlying object. Filesystems even support the notion of streaming data through character and block device files. Moreover, pipe files allow programs to communicate with each other through shared memory, where the source application writes to the pipe and the sink application consumes from the pipe. We assert that these constructs should be directly adopted for supporting applications in buildings. Our approach adopts the Unix file philosophy where everything is represented as a file. Each object created in StreamFS is assigned two names by default: one which uniquely identifies the object and is not human-readable, and a second which is changeable and human-readable. Consider the example shown in Figure 4.11. In this example, the user is creating a temperature stream file in every room of the building. The name of the file, given by the user, is temp. Upon creation, the file is uniquely identified by the system using a unique identifier, as shown. As in a Unix filesystem, the file is created within a folder. Ideally, the name of the folder would encode the placement of the sensor. In the figure, the user creates a temperature stream file in room 410 and room 420. Note the full file path for the stream file is /room/410/temp. During creation, the user may also decorate the file with extra metadata, also shown in the figure. In this example, they have annotated the file with information about the owner and when the sensor was installed. This metadata is used for quickly locating the file or grouping files that contain similar tags.
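The creation in Figure 4.11 might be issued as follows; the host, endpoint layout, and JSON fields are our illustrative assumptions about the RESTful interface.

    // Hypothetical creation of /room/410/temp with metadata tags attached.
    fetch('http://streamfs.example.org/room/410/', {
      method: 'PUT',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        name: 'temp',
        type: 'stream',
        tags: { owner: 'jorge', installed: '06/28/10' }  // searchable decoration
      })
    });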

File types and operations

As we map the filesystem abstraction into this problem space, we need to consider the various kinds of files our system will contain, their semantics, and how our system will expose and manage them. There are essentially four types of files and six sub-types. We summarize these in Table 4.1. There are also different kinds of operations that each file type supports. Operational semantics are file dependent. For example, when you read a folder, you obtain the metadata associated with the folder and the names of its children. When you read a stream, you obtain its metadata and the last timestamp-value pair it produced. Writing to a stream is a bit different. You can write to a stream to update its metadata tags, and the stream source can write a value to it. The stream source is identified with a publish identifier (pubid). The stream source includes the pubid in the write operation for the specified stream file. Without the pubid, the source cannot write to the file. Any other writer is not allowed to write to a stream file either.

type        description                                  valid operations
container   Container file. Used to group other          read, write, delete
            kinds of files within it.
stream      Represents a data stream.                    read, write, delete,
                                                         subscribe, query
controller  Represents a controller.                     read, write, subscribe
special     There are several kinds of special files     read, delete
            for management of jobs and pipes.

Table 4.1: Summary of the four main file types and their valid operations in StreamFS.

Similar to a traditional filesystem, StreamFS includes special files. There are six special files and five of them are for management purposes. The only one that is not is the symbolic link file, which is essentially used to support multi-naming and inherits the operational semantics of the file it points to. The delete operation on a symlink, however, only deletes the symlink. A description of these files and the operations they support is given in Table 4.3. A detailed description and examples will be presented in later sections.

Container, Stream, and Controller Files

A container or folder file serves primarily as a container for other kinds of files. It is used to group together different kinds of files and to represent a common attribute of the files within it.

operation   file type(s)                              semantics
read        folder, stream, ipd, ipi, epd, epi, sub   Read the metadata and tags for the
                                                      associated file.
write       stream                                    Write to a stream file; only the
                                                      appropriate stream source is permitted.
delete      folder, stream, ipd, ipi, epd, epi, sub   A folder must be empty; the others can
                                                      be directly deleted.
query       stream, all                               Streams support time-range queries; all
                                                      files support metadata-tag queries.
subscribe   stream                                    Forwards data from a stream to the
                                                      specified destination.

Table 4.2: File operations, the file types that support them, and their general semantics.

For example, a container file is usually used to construct the spatial hierarchy. Each file at the top level represents a floor; its children are also a set of container files, representing each of the rooms on that floor. A container file cannot be deleted unless it is empty (i.e. has no children). Files that represent streams are called stream files. They are tightly associated with the stream data in the timeseries data-store. A user creates a stream file and the sensor stream “writes” to it in order to have its data saved. StreamFS also forwards the data to the subscription manager in case any subscription sink has subscribed to the feed. When a stream file is created an id is returned to the user. This id must be used by the stream that wishes to push its data to StreamFS through this file. If this id is incorrect or not included, the write operation fails. Because controllers accept many kinds of input, we design a file that represents a controller, similar to the external processing stub discussed in Section 4.3. Writes to a controller file are forwarded directly to the control stub running at the controller itself or at a proxy machine that communicates with the controller. Any reply is set as metadata in the controller file. Controller files also have an associated output stream. If a controller process wishes to report its internal state, it does so through the controller file output – a stream file itself. Table 4.1 lists the files in StreamFS and the operations they support. The operational semantics are listed in Table 4.2.
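A write from a stream source might then look like the sketch below, where the pubid returned at creation accompanies every data point; the endpoint and field names are assumptions for illustration.

    // Hypothetical source write to a stream file; fails without a valid pubid.
    fetch('http://streamfs.example.org/room/410/temp', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ pubid: 'd73a3040e095', ts: Date.now(), value: 72.4 })
    });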

Special Files

There are 6 types of special files. In Section 4.2 we alluded to the various kinds of files that are created when a user creates an internal or external process.

An internal process definition (ipd) file is created when the user submits a script to StreamFS. When a (set of) stream(s) is piped through the definition file, an internal process instance (ipi) file is created that represents the output of the process. A subscription instance (sub) file is also created. The sub file contains information about which streams feed the file, a reference to the ipi file, and statistics about the file. If either the sub file or the ipi file is deleted, the process ends. The ipi file is also a stream file: it can be used to pipe the output of the process to another process or to an external URL.

type                     description                                    valid operations
internal process         Javascript process definition.                 read, write, delete
definition (ipd)
internal process         Management file used for managing active       read, delete
instance (ipi)           processing of this script.
external process         Gives information about where an external      read, write, delete
definition (epd)         process lives.
external process         An active processing stream to an external     read, write, delete
instance (epi)           process.
subscription instance    An instance of a subscription. Contains        read, delete
(sub)                    information about the subscription, such as
                         source/sink and related statistics.
symbolic link (symlink)  Similar to a symbolic link in Unix.

Table 4.3: Summary of the 6 special-file sub-types and their valid operations in StreamFS.

The same set of files is created when an external process is defined and started. When the client stub is started on the client machine, it creates an external process definition file for each process that was listed in the configuration file for the stub on that machine. Similarly, when streams are piped to the definition files, the client starts the processes on the client and creates their associated external process instance (epi) files. Those files are also a sub-type of the stream file and can be used to pipe the output to another process (internal or external) or an external URL. Any subscription or pipe that is instantiated creates an associated sub file in the /sub directory. These always contain information about the subscription and kill the forwarding process when deleted. Finally, we use symbolic links (symlinks) the same way they are used in a traditional filesystem. They are also used to generalize the inter-relationship structure in the ERG. They are important for multi-naming.
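Putting the special files together, a two-stage pipeline could be built as in the following sketch, which pipes raw streams through one definition and then pipes the resulting instance file onward; the paths, the instance id, and the field names are assumed for illustration.

    // Stage 1: pipe a raw stream through an internal process definition (ipd).
    // This creates an ipi file, say /proc/clean/1a2b3c, and a sub file in /sub.
    fetch('http://streamfs.example.org/proc/clean', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ op: 'pipe', streams: ['/room/410/temp'] })
    });
    // Stage 2: the ipi file is itself a stream file, so it can be piped again.
    fetch('http://streamfs.example.org/proc/simpleagg', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ op: 'pipe', streams: ['/proc/clean/1a2b3c'] })
    });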

Interfaces

We built several interfaces to interact with a StreamFS deployment. Because StreamFS has its own file types and semantics, we use a native shell application and a web-application console. The web-application console is displayed in Figure 4.12. The console has various features, all mirrored in the shell application. It provides a graphical interface for viewing the files in the deployment and changing the attribute-value pairs associated with files.

Figure 4.12: StreamFS console. The tool allows the user to view the namespace as a set of files, interact with the system, and view stream data.

Also available to the user are the ability to create and manage symbolic links, submit small processing jobs, create and manage pipes to external targets or processing jobs, and a plotting engine. The plotter is the basic mechanism to check if streams are active.

4.8 Supporting Multiple Names

One of the goals of the naming scheme in StreamFS is to support multiple names for sensors and actuators. The inclusion of symlinks provides this ability. A file can be named and linked to from multiple hierarchies. This allows applications to refer to the same physical entity, with a unique object id, through multiple names. It is also used by our pub-sub system when names are resolved and plays a crucial role in dynamic aggregation. For example, a temperature sensor may have at least two names that are exposed to the end user. It may have a name that refers to it through the context of its spatial placement, such as /soda/4F/410R/temp, and it may have a name that refers to it through the context of its association with a component in the HVAC system, such as /soda/hvac/ahu1/vent1/temp. Either of those names should access the same item and that item's associated data. The underlying stream may actually write the data to a stream file named /strms/temp; the other two names are just symlinks to this one. The pub-sub system uses these names when deciding which data streams match a subscription topic. For example, in a typical pub-sub system, if a job wanted to subscribe to the temp stream, it would have to know the name a priori, or simply pick a single name. However, we support the notion of multi-topic stream tags. So, if a subscriber requests all the streams with the topic /soda/4F/410R/* and a data point is written to /strms/temp, the subscription manager would list all the aliases for that name, which include /soda/4F/410R/temp, and see that it matches the topic request for that subscription. Notice how naming explicitly affects how we group sensors according to some hierarchical organization. Some of those groupings are physical associations with one another that are important to keep accurate. Let's re-examine the example presented in Section 2.5. We present the figure from that section again in Figure 4.13.


Figure 4.13: MPC example where metadata must be verified to maintain correct behavior.

Recall that the equation that drives the control process depends on knowing the mapping between vents and rooms. Therefore, if the room changes and a vent is added or removed, the control algorithm must be updated to account for the change. The naming structure would contain a reference to the vent in some group, linked to a sensor that belongs to a certain room; if a wall is put up, we should be able to automatically detect that this has occurred and update the control process. In summary, the naming scheme is used to express different kinds of organizational patterns for the sensors and their data. As the building goes through changes and evolves, an automatic scheme or family of schemes is necessary to alert the user or process of the change. In the next section we discuss the mathematical tools we use to verify that the groups specified by the naming convention are correct.

4.9 Related Work

FIAP [92] is a web-services architecture for exposing building-specific communication protocols. It essentially serves to map BACnet and/or LonTalk to an HTTP resource architecture. It is not RESTful, however: you must download and use a communication stub in order to obtain resource information and related data. Essentially it tunnels building protocols over HTTP. Their approach to naming is ad-hoc and protocol specific; ours is also ad-hoc, but we offer canonical constructions and allow the user to directly access the data through multiple interfaces – including a RESTful one. sMAP [30] proposes a RESTful architecture for streaming sensor data. Its resource-oriented architecture defines a hierarchical resource structure for a sensor or actuator and provides facilities for reading and writing that data RESTfully. In contrast, they do not take a filesystem approach and only focus on sensors and actuators. They leave the processing jobs as a function outside of sMAP. Also, by adopting filesystem constructs, we support multi-naming through symlinks – a crucial feature that is missing from both of these systems.

4.10 Summary

In this chapter we discussed the mechanisms that drive the pub-sub system and the OLAP-related features in StreamFS. We gave details about the inner workings of the process manager and how it manages both internal and external processes. Internal processes are distributed in a cluster, where each cluster node contains an internal process stub that manages instances of running code. The code, written in javascript, is managed in an independent sandbox for isolation. Each stub also monitors the state of its sandboxes and forwards data to and from the process manager. A corresponding stub is used to manage external processing elements. It contains very similar components for managing external process instances on client machines and communicating with the StreamFS server. We also introduced an important scheduling feature for forwarding only the freshest data to data sinks. In this chapter, we also discussed the file abstraction in more detail. We presented the 4 different types of files and how they are used in StreamFS. We summarized their operations and semantics. In the next chapter we will evaluate the architecture with respect to its system properties. We explore these properties by presenting applications that it supports that could not be supported by legacy systems. We described the details of and motivation for the process management and related components. We introduced the notion of internal and external processing. The former is used for small, simple data-cleaning jobs while the latter is for integrating external processing jobs written in the client's native language. We also showed how we combine the entity-relationship graph with the timeseries store to provide the infrastructure necessary to support OLAP-style queries. This is an important feature, since many of the queries posed in the building domain have the following properties:

1. Temporally-driven, scan-heavy queries.

2. Hierarchical, unit-specific aggregates.

Dynamic aggregation is an efficient design for these kinds of queries. Unlike traditional OLAP, where the timestamps are uniform across other dimensions, we must interpolate the values to keep the “OLAP cube” populated with data at all intervals. It is also necessary to provide accurate aggregates in time. Finally, we articulate our observation of the importance of scheduling for jobs that want a set of readings that are collectively the latest – where the collective buffer freshness is maximized. We formalize the problem and present an algorithmic solution and evaluation. In the next chapter we discuss the files and associated semantics in StreamFS. We show how they relate to traditional filesystems and discuss the motivation for their design. We also present the mathematical tools for verifying the relationships between sensors that are constructed through the namespace.

Chapter 5

API and an Architectural Evaluation Through Applications

In this chapter we present an architectural evaluation through our deployment experience. We give an overview of our application programming interface (API) and discuss its use in the context of two applications. We describe how we are able to provide extensibility, generalizability, scalability, and ease-of-management through a description of the API and the applications that use it. The primary application that we focus on is the Mobile Energy Lens. It is designed to collect building inventory and metering information and provide various live, aggregate views of the energy consumption of devices throughout the building. The application makes use of several StreamFS features and provides a working, real-world example of one of our target emerging applications. The second application is a mounted Unix filesystem interface for legacy applications. We discuss how StreamFS can support legacy applications with minimal change to application code. We discuss the mapping from StreamFS file semantics to Unix file semantics. The mounted FS demonstrates the generalizability of our architecture. We also provide a detailed description of each application. We highlight components of their architecture where StreamFS provides a critical service. For problems for which StreamFS only provides a partial solution, we demonstrate how that partial solution is extended into a full solution design and we present an associated evaluation.

5.1 API Overview

In this section we give an overview of the application programming interface. We provide a description of function calls and the corresponding HTTP/REST version. A full tutorial of the HTTP/REST interface can be found in Appendix B. Table 5.1 gives an overview of API calls that are available through the standard library, available in several languages. It provides functions for creating and deleting files as well as decorating them with attribute-value pairs.

function   description
create     Create a file. Overlaps with several other calls.
delete     Delete a file.
register   Join as a data source.
push       Write to the associated stream file.
label      Add an attribute-value pair.
rlabel     Remove an attribute-value pair.

Table 5.1: Overview of StreamFS file-related API calls. Library written in Java, PHP, and C.

The Energy Lens application makes extensive use of this library interface for dealing with consistency-management features and “atomic” update operations. Applications that run on the phone typically use the HTTP/REST interface.

callback   description
search     General search of terms through file name and content. Similar to grep.
filter     Used to filter the returned list by attribute-value pair or path.
query      Timeseries query.

Table 5.2: Summary of control interface callbacks in StreamFS. Library written in Java, PHP, and C.

The search API is simple (summarized in Table 5.2). It includes three main calls, depending on what kind of data the application is searching for. If the application wishes to make a general search based on the name of a file or the metadata it is decorated with, the search call accepts a set of terms to look for or a set of specific attribute-value pairs. It performs the search as a filter-forward. It also provides a path sub-option that constrains matches to paths satisfying a regular expression. For example, search(type:plug-load).path(/410/*) returns all files that contain the “type:plug-load” attribute-value pair in their metadata and /410/ as a prefix of any of their names; this example AND’s the search criteria. The filter call can be combined with the search call to filter on metadata values returned from the path search. For example, search(type:plug-load).path(/410/*).filter(owner:jorge) fetches the same files as before but keeps only those that contain the attribute-value pair “owner:jorge”. Finally, the query call runs a timeseries query on a specified path or on all stream files that match a regular expression specified through the path option.

The pub-sub library is used extensively in all our applications. It is specifically used in the Energy Lens for initiating aggregation processing. Table 5.3 summarizes the two main API calls for dealing with subscriptions and pipes.

callback    description
pipe        Pipes the list of specified streams to either an active process or a process definition file.
subscribe   Creates a subscription from a set of streams to an external target through a callback function.

Table 5.3: Summary of control interface callbacks in StreamFS. Library written in Java, PHP, and C.

Internally, the mechanism for dealing with both is similar; however, the semantics of each call are different. Pipes direct streams to user-defined internal/external processing elements, while subscriptions are for external sinks running on the client. Subscriptions are managed through an HTTP POST operation. If the user is using a client stub, the POST is unmarshalled and a callback is triggered with the body of the POST submission.
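As an illustration of the client-stub path, the sketch below stands up a local HTTP sink for a subscription and dispatches each unmarshalled POST to a callback. The REST rendering of the subscribe call and the payload fields are assumptions for this example, not the library's exact interface.

// Hedged sketch: a client-side sink for a StreamFS subscription.
// Runnable in Node 18+ (built-in http and fetch).
const http = require("http");

function onReading(reading) {
  // Callback triggered with the unmarshalled body of each POST.
  console.log(`stream=${reading.path} ts=${reading.ts} value=${reading.value}`);
}

http.createServer((req, res) => {
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => {
    onReading(JSON.parse(body)); // unmarshal and dispatch
    res.end("ok");
  });
}).listen(8080);

// Ask StreamFS to stream matching readings to our sink; a hypothetical
// REST rendering of the `subscribe` call in Table 5.3.
fetch("http://streamfs.example.org/subscribe", {
  method: "POST",
  body: JSON.stringify({
    source: "/sdh/4F/*",
    filter: { unit: "KW" },
    destination: "http://client.example.org:8080/",
  }),
}).catch(console.error);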

5.2 Energy Auditing With Mobile Phones

Mobile phone penetration continues to rise in the United States and abroad. As such, phones serve as a convenient interface between people and their environment. The combination of mobile network coverage and indoor wifi connectivity provides near-ubiquitous connectivity at all times. Coupled with the fact that mobile phones are personal devices that can serve as proxies for individuals, they provide a valuable point of interaction between the physical world, people, and computation. Through a combination of cloud technology, ubiquitous network access, and mobile phones, we can provide new ways to do in-situ diagnosis of operational problems in the building, monitor personal energy consumption, and share related information with other occupants. The phone becomes a lens through which we can directly observe the dynamics of our surroundings.

There are several systems challenges that must be overcome in order to achieve this vision. Some can be solved easily through the primitives made available by StreamFS. We examine the challenges in capturing and maintaining a view of the physical world. We track people and things, deal with consistency-management issues, and provide mechanisms for dealing with occasional disconnection. We deploy a number of wireless power meters, use QR codes as a tagging mechanism, and build an Android application that interacts directly with StreamFS and plug-load devices. We use mobile phones to construct the entity-relationship graph of the locations, meters, and plug-loads in the building and use the processing features in StreamFS to provide detailed energy-attribution statistics. The ‘Mobile Energy Lens’ uses QR-code swipe gestures to collect this information. For example, if a computer is inside room Y and connected to meter Z, the user can register the computer, the room, and their “is-in” and “attached-to” relationships explicitly through a series of gestures that build the associated files and inter-relationships. We use these relationships to guide our analytics. For example, the load curve of room Y consists of the sum of all the power traces for loads inside room Y.

Our work examines three main challenges in setting up and deploying a tag infrastructure throughout the building to support this application. The first challenge is related to tracking. Mobile phones present classical, fundamental challenges related to mobility. Typically, mobility refers to the phone, as the person carrying it moves from place to place. However, in the energy-attribution context, we refer to the movement of energy-consuming objects. Tracking their relationships to spaces and people is as important as tracking people. We describe how we deal with both moving people and moving objects and show that these historically difficult problems can be addressed relatively easily, if the proper infrastructure and services are in place. StreamFS plays an important role in offering the necessary services.

The second challenge is about capturing the inter-relationship semantics and having these inform our analytics. We adopt a general mechanism where physical tags identify objects in the world. Our system uses QR codes to tag things and locations in the building. Once tagged, there are three types of swipe interactions – registration, linking, and scanning – which establish important relationships. Registration is the act of creating a virtual object to represent a physical one.
Linking captures the relationship between pairs of objects. Scanning is the act of performing an item lookup. Each of these interactions requires a set of swipe gestures: linking requires two tag swipes, while the other two actions require a single tag swipe. StreamFS maintains an entity-relationship graph (ERG) and we use it extensively in the Energy Lens to record things, people, and locations through these gestures.

The third challenge appears in dealing with indoor network connectivity and access. In order to connect these components, we rely on ‘ubiquitous’ network connectivity. However, in practice, network availability is intermittent and our system must deal with the challenges of intermittency. We discuss how caching and logging are used to address these challenges. Moreover, when connectivity is re-established, we must apply to StreamFS the updates recorded by the phone while disconnected. Conflicts may also occur during an update. For example, two updates may conflict about which items are attached to which meters. We implement a very simple conflict-resolution scheme, described in Section 5.3. Finally, physical state transitions are represented as a set of updates to StreamFS that must be applied atomically. However, StreamFS does not provide atomicity across multiple file operations. We implement transactions through log replay and a transaction manager. Our ‘Energy Lens’ system is deployed in Sutardja Dai Hall at UC Berkeley. We discuss its architecture and design.

5.3 Energy Lens Architecture and System Challenges

The Energy Lens application aims to approximate the vision described in Section 2.3. We tag items with QR codes and allow users to 1) tag and register new items, 2) tell us which meters are attached to which items, and 3) scan QR codes to view their load curve over a 24-hour period. Both individual items and spaces are scannable – spaces present aggregate information for the energy-consuming items within them.

The architecture consists of three layers: the sensing and tag layer, the data management and processing layer, and the application layer. In this section we discuss how aspects of each layer address the aforementioned challenges. In deploying the application we ran into various issues that inform our design. For example, QR code reading times vary substantially across phones and lighting conditions; one must design for the least-common denominator in terms of camera quality and lighting. Another aspect to consider is network access. Within our building, although connectivity is mostly ubiquitous, network access can be intermittent. Network access may be unavailable for several reasons, including disassociation from an access point due to idleness, dead spots in the building where connectivity to both wifi and 3G/4G is unavailable, multipath-induced destructive interference, and various other causes. Dealing with these throughout the data collection and update phases is especially troublesome. We discuss mechanisms and algorithms for dealing with disconnected operation.

StreamFS serves as the data management and processing layer. It must be extensible in order to add and remove sensors and contextual information throughout the contextual-capture process – the collection of device inventory, descriptive information, and associations between plug-loads and other items and locations in the deployment. It must scale with the number of sensors and rooms in the building, since all the sensors produce periodic readings and live, aggregate statistics are the main goal of the application. Also, the application is interactive, so many mobile phone users are constantly performing lookups. StreamFS must handle these requirements.

Challenge 1: Tracking People and Things

The Energy Lens application consists of a web application that displays timeseries data and an Android-based smartphone app. The Android app is relatively simple, consisting of a menu with only two options: update deployment state, and scan to view services. Swipe gestures manipulate a local portion of the entity-relationship graph – local with respect to a user’s current location. Since each location (room, floor) has a QR code attached to it and items are associated with those locations, we can identify the location by name (/buildings/SDH/spaces/4F/toaster). Figure 5.1 goes through the various sets of swipes. The first set is called a ‘registration’ swipe and we use it to register new items. The user scans a QR code and the item it is attached to. This creates an ‘attached-to’ link between them. Adding, removing, binding, and attaching items is done with a pair of swipes. A lookup is done by swiping the QR code attached to an item.

We have designed a set of heuristics for setting the location during an update. It piggybacks on the swipe gesture. The following is a list of rules for automatically setting the location of people and things (a sketch of this logic follows the list):

[Figure 5.1 illustration: three gesture flows. (A) Registration swipe: 1. swipe QR code, 2. enter information. (B) Add/remove/bind/attach swipes: 1. swipe QR code, 2. swipe QR code, 3. select add, remove, bind, or attach. (C) Item swipe: 1. swipe QR code, 2. select service, 3. view data.]

Figure 5.1: Swiping gestures in the mobile application. The registration gesture requires a single swipe, the linking gestures (add, remove, bind, attach) require two swipes, and the look-up requires a single swipe.

• When a user swipes at a location L, they are presumed to be at L for a fixed period τ. An “association timer” is set to release this association after τ seconds.

• If the user swipes anything that is associated with a location l at time t ≤ τ, and l ≠ L, then we set the new location of the thing they swiped to L and reset the association timer.

• If the user swipes anything at location l at time t ≥ τ, we set the location of the person to l(t). We reset the association timer to τ.

• If a user registers a new location, they are presumed to be at that location.
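A minimal sketch of these rules, assuming a simplified in-memory model; the names (AssociationTracker, itemLocation) are hypothetical, and the actual application additionally asks the user for confirmation, as described next.

// Hedged sketch of the location heuristics; names and state layout are
// illustrative, not the application's actual implementation.
const TAU_MS = 10 * 60 * 1000; // association timeout τ (assumed value)

class AssociationTracker {
  constructor() {
    this.userLocation = null;      // L: the user's presumed location
    this.associatedAt = 0;         // when the association timer was set
    this.itemLocation = new Map(); // item id -> location name
  }

  timerLive(now) {
    return this.userLocation !== null && now - this.associatedAt <= TAU_MS;
  }

  swipeLocation(loc, now) {
    // Rules 1 and 4: a location swipe (or registration) places the user at L.
    this.userLocation = loc;
    this.associatedAt = now;
  }

  swipeItem(itemId, now) {
    const l = this.itemLocation.get(itemId);
    if (this.timerLive(now)) {
      // Rule 2: within τ the user is presumed at L, so a swiped item whose
      // recorded location differs must have moved to L.
      if (l !== this.userLocation) this.itemLocation.set(itemId, this.userLocation);
      this.associatedAt = now; // reset the association timer
    } else if (l !== undefined) {
      // Rule 3: after τ, the item's known location tells us where the user is.
      this.userLocation = l;
      this.associatedAt = now;
    }
  }
}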

For each of these rules, we provide an interactive option that asks the user for location-change confirmation. So if we think the user/item has moved but they have not, the preset action can be overridden. The guiding principle we follow in our design is to leverage the swipe gesture for as much contextual information as possible. Furthermore, we do not explicitly track users. Context is only set on the phone and used in operations sent to the server. Because location swipes give us direct confirmation of a user’s location, they could be coupled with wifi localization mechanisms and supervised learning algorithms to adjust the localization model as the user interacts with their environment.

We construct the entity-relationship graph through naming in StreamFS. StreamFS uses filesystem constructs, such as symbolic links and hierarchical naming, which are useful for expressing an acyclic graph structure (StreamFS checks for cycles when symlinks are created). The following general path-naming patterns are used to express different portions of the ERG: /path/to/device or item, /path/to/qrc, /path/to/space, /path/to/taxonomy, and /path/to/users. Registered meters are placed in the device path, /dev. Items are stored in /inventory. QR codes are stored in /qrc. When an item is registered, a symbolic link is created from the specific QR code directory to the item. /spaces contains a hierarchy of floors, rooms, and sub-spaces. /users contains the list of usernames. We also have a /tax directory, where we construct a device hierarchy for access by plug-load category. Placement (location) is also captured with symbolic links.

Each hierarchy not only represents the physical relationships between items but also provides a structure for enabling hierarchical aggregation and querying. We define a load-curve-generating process called loadcurve, displayed in Listing 5.1. The full process job can be found in the Appendix, Listing A.1.

Listing 5.1: Partial load curve code used to generate aggregate load curves in the Energy Lens application.

function(buffer, state) {
  var outObj = new Object();
  var timestamps = new Object();
  outObj.msg = 'processed';

  // On first invocation, install helper functions on the persistent state
  // object: the slope and intercept of the line between two data points,
  // used to linearly interpolate missing readings.
  if (typeof state.slope == 'undefined') {
    state.slope = function(p1, p2) {
      if (typeof p1 != 'undefined' && typeof p2 != 'undefined' &&
          typeof p1.value != 'undefined' && typeof p1.ts != 'undefined' &&
          typeof p2.value != 'undefined' && typeof p2.ts != 'undefined') {
        if (p1.ts == p2.ts)
          return 'inf'; // vertical line; avoid division by zero
        return (p2.value - p1.value) / (p2.ts - p1.ts);
      }
      return 'error:undefined data point parameter';
    };

    state.intercept = function(slope, p1) {
      if (typeof p1 != 'undefined' &&
          typeof p1.value != 'undefined' && typeof p1.ts != 'undefined') {
        return p1.value - (slope * p1.ts);
      }
      return 'error:undefined data point parameter';
    };
  }
}

When a new meter is measuring a device in a room, the Energy Lens app activates the loadcurve process through a general subscription instance. For example, sub.source(/sdh/4F/473R/*).filter(unit:KW).destination(/sdh/4F/473R/lc 324hb) creates a subscription for streams with the prefix specified in the source call, filtered on the metadata so that only streams producing KW are included, and pipes them into an instance of the loadcurve process. Note that /sdh/4F/473R/lc 324hb is a symbolic link to /proc/loadcurve/324hb, the instance file that represents the aggregator for this room. The initial instance file can only be created if the source has a match. The app checks whether there are any streams in the room every time a new meter is added. If it is the first stream in the room, the pipe and corresponding symbolic link are created. If all the meters are removed from the room, the process is killed and the symlink becomes a dangling pointer that needs to be cleaned up. The cleanup is done by the Energy Lens application; StreamFS does not provide a clean-up mechanism.
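The app-side bookkeeping just described might look like the following sketch; the stub calls (listStreams, pipe, link, unlink, kill) are hypothetical names for the operations discussed above.

// Hedged sketch of the per-room loadcurve lifecycle managed by the app.
async function onMeterChange(sfs, room) {
  // room is e.g. "/sdh/4F/473R"
  const streams = await sfs.listStreams(`${room}/*`, { unit: "KW" });

  if (streams.length === 1) {
    // First stream in the room: instantiate the loadcurve process and
    // symlink the instance file into the room.
    const inst = await sfs.pipe(`${room}/*`, "/proc/loadcurve", { unit: "KW" });
    await sfs.link(`${room}/lc_${inst.id}`, inst.path); // e.g. /proc/loadcurve/324hb
  } else if (streams.length === 0) {
    // Last meter removed: kill the process and clean up the dangling
    // symlink ourselves; StreamFS does not do this for us.
    await sfs.kill(`${room}/lc_*`);
    await sfs.unlink(`${room}/lc_*`);
  }
}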

Challenge 2: Consistency Management

We use an eventual-consistency model for maintaining the ERG over time. The consistency we must maintain is the mapping between sensors, people, and spaces. The mechanism for inputting these relationships is specified in the previous section. This section focuses on the mechanism for capturing the nature of these relationships and how the application lets users provide input for consistency maintenance. The Energy Lens uses file metadata and directional associations in the ERG (i.e., what type of parent/child does this file have?) to capture the nature of relationships. In order to maximize consistency, we offer two options: 1) periodically re-scan items and their locations, essentially re-capturing the inventory and placement information, or 2) allow building occupants to participate as auditors, capturing their own personal and shared items. This provides at least as much value as a periodic energy audit and can be completed in a fraction of the time [94].

Placement (location) is captured with symbolic links. Node types – specified in the metadata as an attribute-value pair – inform the application of the nature of the relationship. We define five distinct types: item, meter, location, category, and tag. The following relationships are constructed with symlinks between different node types:

• Owned-by: When a meter/item/location is tagged as belonging to a user.

• Bound-to: When a meter is attached to an item and taking physical measurements associated with that device, we say that the meter is “bound-to” the device.

• Attached-to: When a meter/QR code/item is attached to another meter/QR code/item but is not taking any physical measurements for that item, we say that the meter/item is “attached-to” the other meter/QR code/item. QR codes cannot be attached to each other.

• Is-in: When a meter/item is inside a location, we say that the meter/item “is-in” that location.

• Type-of: When an item is labeled as a known, specific type, we say that the item is a “type-of” the thing specified by its label.

All symlinks are interpreted based on these rules of association. As items move, symlinks are removed and re-created in a different folder. We determine an object’s current location by following the path from the QR code directory, across a symlink, to a file in the space directory. Items associated with a space have a floor or room folder pointing to the item via a symlink. This is how we record the location of things throughout the building.
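As a concrete illustration, moving an item between rooms under this scheme amounts to re-pointing one symlink, and binding a meter is a link plus a metadata update. The sfs stub below is hypothetical; link and unlink stand in for the underlying create/delete calls in Table 5.1.

// Hedged sketch of how the ERG records placement with symlinks.
async function moveItem(sfs, item, fromRoom, toRoom) {
  // e.g. item = "/inventory/toaster-12",
  //      fromRoom = "/spaces/4F/473R", toRoom = "/spaces/4F/465"
  const name = item.split("/").pop();

  // Remove the old "is-in" link and re-create it under the new room.
  await sfs.unlink(`${fromRoom}/${name}`);
  await sfs.link(`${toRoom}/${name}`, item);
}

// Binding a meter to an item: a "bound-to" symlink plus a metadata update.
async function bindMeter(sfs, meter, item) {
  await sfs.link(`${item}/meter`, meter); // meter is "bound-to" the item
  await sfs.label(item, { "bound-to": meter, bindTs: Date.now() });
}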

[Figure 5.2 diagram: a space node with “is-in” children; meters and items connected by explicit “bind” and “attach” links, with implicit bind and attach relationships between the items beneath them.]

Figure 5.2: This diagram shows the relationship capture between the objects and locations in the building for the energy-audit application. Children of a space node have an “is-in” relationship with the space. An item with another item as a child has an “is-attached” relationship, and meters attached to items are bound to each other. Note that this is a subset of the relationship diagrams generated across our three applications.

Challenge 3: Disconnected Operation

Although connectivity is ubiquitous, network access is not. Gaps occur due to dead zones, idle-disconnects, and failed hand-offs between access points. When encountered in practice, especially while editing deployment state, this can be quite frustrating and can discourage use of the application. We design a mechanism that does smart caching, not only to improve performance, but also to enable disconnected operation.

The Energy Lens application downloads a portion of the ERG when a user enters a new floor. The application fetches the portion of the graph rooted at the floor. The root is determined by either an item or a floor scan, initiated by the user. A prefetch populates the cache with all entity nodes on that floor, their associated metadata, and 30 minutes of data from the stream entities. The full fetch of the 4th-floor data set includes 176 nodes and about 1 MB of meter data. In total, the app downloads 1.2 MB of data upon re-connection. Prefetching allows users to continue interacting with the application as if they were still connected (as long as they remain on the same floor). Without it, the application is not functional until a network connection is established.

The Energy Lens periodically syncs with StreamFS to obtain updates to the ERG and maintain a locally consistent view. Let OP_R be an operation performed on the subgraph rooted at node R and t_i be the current time on the phone. After a set period, the phone sends the server its last performed operation and the time that operation was performed. The server responds with any operations that have taken place since then. The prefetch loop is summarized in Algorithm 2.

while true do
    if connected and active then
        req ← [t_(i−n), OP_R]
        resp ← send(req) = [t_(i−k), OP_R], ..., [t_now, OP_R]    (n ≥ k)
        for j = 1 → size(resp) do
            op ← resp[j] = [t_(i−k+e), OP_R]
            apply(op)
        end
    end
    if active then
        sleep for 10 minutes
    else
        sleep for 1 hour
    end
end

Algorithm 2: Prefetch Loop.

This server process runs on top of StreamFS. All operations performed by the mobile application run through this process for logging. Essentially, this process maintains a timestamped write-log. The prefetcher queries the write log for updates since the last down time. The client applies those operations internally to a cached version of the ERG on the phone in order to maintain consistency. The “active” parameter check is for energy savings: if the phone application is active, the check occurs every 10 minutes; otherwise it occurs every hour.

The consistency process that runs on the phone is shown in Figure 5.3. The components shown are the ERG cache, the operation log (OpLog), and the prefetcher. We separate the steps in the figure into a READ sequence and a WRITE sequence. All reads go to the cache (steps 1 and 2 on the left-hand side of the figure). Writes go through the OpLog (steps 1–5 on the right side of the figure). For writes, the application makes a write request (1) and it is forwarded to StreamFS (2). If StreamFS is reachable and the write is successful (3), the operation is applied to the ERG cache (4) and the response is sent to the application (5). If the operation is not successful, step 4 is skipped. If StreamFS could not be reached, step 3 is skipped and the operation is written to the OpLog. The OpLog is flushed to the server, by the prefetcher, upon re-connection. The OpLog contains records of operations that are eventually applied to StreamFS.
Some of those operations are actually groups of operations that need to be applied atomically. For example, when a bind or attach operation occurs, we append the timestamp to the item(s) being connected, as well as create a link between them in the graph. The application uses both the link and the added metadata to fetch the appropriate graph for display. These operations must be applied atomically or not at all. When the log is dumped, the global transaction manager (GTXM) – the layer that handles log dumps and transaction processing – attempts to apply the log in timestamp order. The GTXM also runs as a process on top of StreamFS. It assures atomicity by applying each operation in timestamp order and explicitly checking that the writes were successfully applied after each operation is initiated.
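The write path in Figure 5.3 might be rendered as the following sketch: write-through when StreamFS is reachable, append to the OpLog when it is not. The class and method names are hypothetical, and the sfs stub stands in for the remote client.

// Hedged sketch of the phone's write path (Figure 5.3).
class WritePath {
  constructor(sfs, cache) {
    this.sfs = sfs;     // remote StreamFS client (hypothetical stub)
    this.cache = cache; // local ERG cache
    this.opLog = [];    // operations awaiting replay
  }

  async write(op) {
    try {
      const ok = await this.sfs.apply(op); // steps 2-3: forward to StreamFS
      if (ok) this.cache.apply(op);        // step 4: update the ERG cache
      return ok;                           // step 5: respond to the app
    } catch (unreachable) {
      this.opLog.push(op);                 // step 3 skipped: log for replay
      return "pending";
    }
  }

  async flush() {
    // Called by the prefetcher upon re-connection.
    while (this.opLog.length > 0) {
      const op = this.opLog[0];
      if (!(await this.sfs.apply(op))) break; // stop on failure; retry later
      this.cache.apply(op);
      this.opLog.shift();
    }
  }
}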

[Figure 5.3 diagram: the app interface issues READ requests (1, 2) against the ERG cache, and WRITE requests (1–5) through the OpLog, which forwards them to the network interface; the prefetcher replays logged writes upon re-connection.]
Figure 5.3: Standard mechanisms for consistency management on the phone. All READ requests go to the local cached version of the ERG. All WRITEs must go through the OpLog; they are eventually applied to the cache if successful and logged if StreamFS is unreachable. These components are built directly into the Energy Lens application.

When the Energy Lens application is started, it contacts the server and obtains the server’s local clock time. It notes its local time as being equal to the timestamp of the server and calculates all subsequent timestamps using its local clock. When an operation or transaction is added to the OpLog, a timestamp is appended. Let t_s be the timestamp on the server, t_l be the timestamp on the phone when t_s is recorded, and t_now be the current time on the phone. Each operation/transaction is timestamped with t_approx, where:

t_approx = t_s + (t_now − t_l)    (5.1)

This timestamp is a general approximation of when the operation should be applied on the server. The server applies these in ascending timestamp order.

A conflict occurs if there is any operation or transaction that was applied after the timestamp of the current operation being processed. A typical transaction manager must roll back the state of the database, apply the operation, and replay the log. However, conflict resolution is much simpler in this context. The latest operations reflect the state of the world because they are updates induced by direct interaction with the world at that point in time. Therefore, if there is a conflict between a set of operations, the old ones can all be discarded. Discarded operations are dropped silently. We make failures silent for two reasons: 1) there is no way to contact the app when a failure occurs, since mobile phones often do not have addresses reachable from the server; 2) a failure implies the operation was based on a false assumption about the state of the world.
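The timestamping of Equation 5.1 and the "latest wins" discard policy can be sketched as follows; the function names, the per-node log layout, and applyOp are hypothetical.

// Client side: timestamp each queued operation against the server clock.
function makeTimestamper(serverTime /* t_s */, localAtSync /* t_l */) {
  return () => serverTime + (Date.now() - localAtSync); // t_approx, Eq. 5.1
}

// Server side: apply a dumped log in ascending timestamp order, silently
// discarding any operation older than the newest already-applied write
// that touches the same node.
function applyLog(log, lastApplied /* Map: nodeId -> timestamp */, applyOp) {
  log.sort((a, b) => a.ts - b.ts);
  for (const op of log) {
    const newest = lastApplied.get(op.nodeId) || 0;
    if (op.ts < newest) continue; // conflict: stale update, drop silently
    applyOp(op);
    lastApplied.set(op.nodeId, op.ts);
  }
}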

5.4 Energy Lens Experience and Results

Figure 5.4 shows three screen shots of power traces obtained from the ACme deployment and displayed through the Energy Lens. Notice that there is some data missing in the graph. This occurs because of losses in network transmission, reboots of the data management layer, or failed scripts that are automatically restarted. In all cases we get holes in the data. To compute aggregates, interpolation is necessary. StreamFS offers a real-time processing facility, whereby javascript operators can be applied to streaming data. This allows us to clean the data as it comes in and compute the aggregate traces. We are currently experimenting with various aggregation models to give occupants deeper insights.

We distribute several ACmes [62] throughout a single floor in our building and register various plug loads. We also tag hundreds of items and locations throughout the entire building. Our audit includes 20 meters, 20 metered items, 351 un-metered items, and 139 rooms over seven floors.

Figure 5.4: Power traces obtained from power meters attached to various plug loads on one of the floors of a building on campus. These are screen shots of the Energy Lens timeseries data display.

Condition      Average (sec)   Variance (sec)
Short, light   1.66            0.33
Short, dark    2.08            0.35
Long, light    2.26            0.71
Long, dark     2.82            0.50

Table 5.4: The time to scan a long QR code versus a short QR code in light and dark conditions (loosely defined). Notice that short QR codes scan faster and with less variance than long ones.

In our initial deployment we find our tracking scheme to be effective, especially in conjunction with interactive confirmation. The ERG is effective at capturing deployment state, although highly mobile items, such as laptops, are particularly difficult to keep track of without explicit user re-registration. Our disconnected-operation mechanism is effective at masking intermittent connectivity – although our data show that disconnection is rare in this building.

QR code design

Our most surprising observation is that tag design is one of the most important aspects of the Energy Lens architecture. We use QR codes as our tag mechanism because they are cheap to produce; they can be printed and attached to items with tape or sticky paper. Figure 5.5b shows an example QR code used in our deployment. Given the number of physical objects and their placement in the building, we must rely on the occupants to scale our deployment and manage it. Using QR codes should not burden the user in any way, lest they stop using the application.

QR code scan times vary by lighting conditions, camera quality, and hand movement. In poor conditions scanning is cumbersome, ultimately de-motivating continued use, so QR codes must be designed to minimize scanning time. In our deployment and experiments we observe that complex QR codes not only have a longer average scanning time but also experience a larger variance in scanning time. The more complex the pixel design in the QR code, the harder it is for the camera to focus on and capture it. We scanned each QR code shown in Figure 5.5 under light and dark lighting conditions (off a laptop screen). Each experiment was run 10 times and Table 5.4 shows a statistical summary. Scanning the simple QR code under well-lit conditions performed the best; the complex QR code under the same conditions takes about 28–36% longer to scan. Perhaps even more important is the variance: the variance with the simple QR code is smaller and more stable under either condition. QR code image complexity increases with the amount of information encoded, so it was important to decrease the amount of information we encoded, placing the complexity in the lookup rather than the tag.

(a) Long QR Code (b) Short QR Code.

Figure 5.5: 5.5a resolves to the same URL as 5.5b after resolution and redirection are complete. The short label resolves to http://tinyurl.com/6235eyw. 5.5b encodes about half as many characters as 5.5a. We used tinyURL to reduce QR code image complexity and scan time.

On a generic QR code scanner, as used here, a portion of the scan time is independent of the code complexity. As these scanners become more heavily used, this portion is expected to shrink substantially and the difference in acquisition complexity will become even more pronounced. In our experience, large variance in scan time is a major problem for complex QR codes. Thus we decided to re-design our codes and push more information into the lookup process, as network access was more reliable than the focus of the camera on various mobile devices. Tags are placed on all types of devices in all kinds of locations with varying degrees of lighting; simple QR codes are vital for widespread use.

This design choice forced us to examine related ones. Not being able to encode much information in our QR codes means we are more reliant on the network: it must provide the bulk of the information, be very reliable, and be widespread enough that disconnection is not problematic. Moreover, there are a number of clients that can be used to access and display the information, and the tag has to be meaningful for all of them. In order to meet these criteria we (1) shrunk URLs using tinyURL [1] as a level of indirection and (2) designed two classes of applications: shallow applications and deep-inspection applications. Shallow applications interact with the web application directly, while deep-inspection applications use the URL of the web application to extract a unique identifier and provide deeper inspection and update capabilities over the entity-relationship graph. This is an example URL we used in our deployment: http://tinyurl.com/6235eyw. When resolved, we get an empty response body, but we use the header to extract the QR code identifier that we associate with the item. The response header is shown in Figure 5.6.

Figure 5.6: The header of the response from tinyURL when resolving a QR code. The ‘Location’ attribute is used to extract the unique identifier for the object this QR code tags. It is also used to re-direct users without the phone application to a meaningful web address for the object.

Notice the ‘Location’ attribute in the header. This is the location of the re-direct. This approach gives us flexibility in several ways:

1. It allows us to encode less information in the QR code, decreasing its visual complexity and making it more robust across phones with different camera quality, poor lighting conditions, and shaky hands.

2. It allows the added layer of indirection to serve two versions of the applications: the native application, where users can deeply explore and edit the entities and their relationships, and the shallow lookup, which re-directs the mobile phone to a read-only view of the item that was scanned – such as a power trace or a description.

It provides a web address for users to be re-directed to, where they can find information and various read-only services for the object. However, because the URL also contains a unique identifier, qrc, it can be used to provide more sophisticated services and capabilities. An example is the ability to change the virtual structure of inter-relationships between this object and other objects. Once items are tagged, they can be added and removed by swiping the tag and pressing the button for the desired action. Users also check into locations, either explicitly with a location-tag swipe or implicitly with an item swipe.

Shallow applications use the URL directly. The qrc URL is a unique identifier for the item that the tag is attached to. A shallow application can obtain mostly read-only service through our web applications, for example fetching either item-specific data or item-aggregated data with respect to the user making the request (i.e., the total energy consumed by my devices). Deep-inspection applications are native to the phone, so we can do much more with the tag. Our energy-auditing application allows users to relate the item to other items by maintaining swipe-history state; this is more difficult with the web application. We can also couple the tag and item information with sensor information coming from sensors on the phone itself. For example, we could determine the direction an object is pointing by using the phone’s directional sensor and negating its direction (i.e., if the phone is facing east, the tag on the item must be facing west).
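A deep-inspection client might extract the identifier as follows. This is a sketch assuming the tinyURL behavior described above (empty body, identifier in the Location header) and a Node-style runtime where a manual redirect exposes that header; the qrc query-parameter convention is an assumption for the example.

// Hedged sketch: resolve a scanned tinyURL without following the
// redirect, then pull the QR-code identifier out of the Location header.
async function resolveTag(shortUrl /* e.g. "http://tinyurl.com/6235eyw" */) {
  const resp = await fetch(shortUrl, { redirect: "manual" });
  const location = resp.headers.get("location"); // the re-direct target
  if (!location) throw new Error("no redirect header; not a tag URL?");

  // Assumed convention: the redirect URL carries the identifier in a
  // `qrc` query parameter, e.g. http://webapp.example.org/item?qrc=abc123
  const qrc = new URL(location).searchParams.get("qrc");
  return { qrc, location };
}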

Figure 5.7: Screen shots of the mobile application. The screens on the left are for editing the state of the deployment. The graph on the right shows a live feed of the sensor attached to the item that was scanned with the ‘Scan To View Services’ option in the mobile application. It can also be reached by scanning the QR code and following the re-direct to the URL.

ACme meters are IP-enabled and forward their data through a router that runs sMAP. sMAP then forwards the incoming data to StreamFS, running on a machine in Amazon’s EC2. StreamFS is a web service that organizes streaming data and metadata using a hierarchical naming convention. It also provides a pub/sub facility for streaming data. We construct a canonical naming convention within StreamFS to express the entity relationships between people, things, and meters. The pub/sub mechanism allows us to combine the ERG with streaming data, and it feeds our timeseries data viewers.

Prefetching and Transaction Measurements

We measure prefetch download times and discuss strategies for providing scalability. We also look at the transaction manager and discuss conflict resolution.

Sensing and tag layer

Prefetching occurs when a user enters a new floor, as detected by a floor scan or an item scan. Table 5.5 shows that prefetch times scale linearly with the number of items (and data) to prefetch. Each node holds approximately 100 bytes of information, and for a 20-node deployment of power meters, producing 100 bytes of data per stream (three streams per ACme) every 20 seconds, we fetch approximately 1 MB of data. These prefetch times are non-trivial to deal with, especially since they cause the phone application to slow down until the data is received and loaded into the local cache.

Figure 5.8: A snapshot of the connectivity graph between the wireless plug-load ACmes deployed on the 4th floor of SDH.

No. nodes   Fetch time (sec)   Std. error (sec)
1           0.8902             0.0756
10          5.7342             1.7087
100         52.3145            14.1146

Table 5.5: The time to fetch nodes as a function of fetch size. The fetch time increases linearly with the number of nodes. Caching maintains the fetch time near that of fetching a single node; a callback is used when the cache is invalidated.

The overhead is dominated by the query in StreamFS that constructs the entire sub-graph to send to the application. For future work, we will implement a callback facility and pass the application a reference to it. The app can then periodically check back until the query completes and the data is ready to be downloaded. We can also include partial responses to the query in the prefetch-loop response. This also allows users to continue using the application without any frustrating waits.

Table 5.6 shows the operations that the transaction manager calls on the StreamFS server. Log replay and transaction processing are entirely dependent on the time to execute these operations on StreamFS. There are five types of transactions: move, bind, unbind, attach, and unattach. A move is a combination of a ‘delete’ and a ‘create link’; a bind is a ‘create link’ and an ‘update tags’; an unbind or unattach is a ‘delete’ and an ‘update tags’.

Operation     Avg. exec. time (ms)
fetch         250
delete        326
update tags   267
create link   250
create node   336

Table 5.6: Average operation execution time in StreamFS.

The transaction latency is the sum of these operation times. By far the most expensive operation is ‘create node’, which occurs when a user adds a new item/space/person to the graph. The time to apply a log also scales linearly with its size. All log dumps are currently processed sequentially; for future work we plan to parallelize this into concurrent processes updating different portions of the graph. For example, log updates rooted at different floors could be applied simultaneously.

The global transaction manager implements three main high-level operations – rollback, apply, and replay. Our current implementation is limited by the time it takes to perform an operation on StreamFS. Usually rollbacks consist of delete operations and applies consist of creates. In a worst-case analysis of performance, we expect the total conflict-resolution time to be roughly bounded by rollback time + apply time + replay time; since replay time ≈ 3 × rollback time and apply time is negligible, the total time is approximately 4 × rollback time. The overhead is therefore driven by how many new links were created that have to be deleted and then re-created.

As an optimization, we limit the scope of a rollback. The naive approach is to blindly undo all operations up to a certain time. However, we can use the location of the node in the ERG to limit the conflict scope to a sub-graph, rather than the entire graph. The simplest approach is to check whether the operation on a node either shares an immediate parent with the node that will have an operation undone on it, or whether the operation is on the same node. By limiting the conflict scope we minimize the number of operations that get executed and, hence, minimize the cost of resolution.
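The scope-limiting check might be sketched as follows; the accessors are hypothetical and stand in for lookups against the ERG.

// Hedged sketch of the conflict-scope check: an operation is in scope
// for rollback only if it touches the same node, or a sibling (shares an
// immediate parent) of the node whose operation is being undone.
function inConflictScope(op, undoneOp, parentOf /* hypothetical ERG lookup */) {
  if (op.nodeId === undoneOp.nodeId) return true;           // same node
  return parentOf(op.nodeId) === parentOf(undoneOp.nodeId); // siblings
}

// During resolution, only in-scope operations are rolled back and
// replayed, minimizing the number of StreamFS operations executed.
function scopedRollback(log, undoneOp, parentOf) {
  return log.filter((op) => inConflictScope(op, undoneOp, parentOf));
}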

Discussion

StreamFS supports the additions and operations needed to record all the deployment information. It also supports active aggregates for 31 rooms on the 4th floor of the deployment and one for the entire building. The number of active users was no more than 10 at any time. Because these numbers are not that large, they do not truly test the scalability of the system. However, they do demonstrate both extensibility and generalizability. Without the first property we could not have added the sensors, related information, user context, and spatial configuration information and updated them with ease. Without the second property we could not have supported both the collection of metadata and data about the deployment and the data services that are absolutely crucial to the success of the Energy Lens application. Also, we use a single function and instantiate 32 instances of it in the deployment. This demonstrates the ease-of-management property we set out to provide through StreamFS. Because we follow the principle that “everything is a file”, the entire deployment’s data, metadata, and analytics can be accessed and controlled from the same deployment.

An important aspect of our work that we have not explored is privacy. Privacy is the ability of a group or individual to hide their information and to selectively share it with other groups or individuals. StreamFS uses an access-control list and adopts Unix-style file security for sharing information. Users of StreamFS can create users and groups and control the access rights to individual files in the system, and their corresponding file-specific operations. If a user wishes to share information about their personal aggregates, they may share the aggregation point and make all the individual streams inaccessible. In the Energy Lens application, people and objects are tracked by associating individuals with locations according to their scans. This information is not explicitly logged and is inaccessible through StreamFS. Scan history is also not logged on the phone, so privacy and security should not be a major concern. Moreover, HTTPS is an option for communicating with the StreamFS servers and for the transaction-log layer that sits above them. That said, privacy is a much broader topic that must be explored in future iterations of our system. Work on differential privacy [87, 50, 65, 76] may also be a good fit for providing accurate query responses without revealing the constituent identity information. These methods could be applied at the database layer within the timeseries datastore and made available through the filesystem as anonymous, globally accessible stream files.

5.5 Mounted Filesystem and Matlab Integration

Although StreamFS files and semantics are not POSIX compliant, we wrote a FUSE [39] implementation that allows legacy applications to interact directly with StreamFS files and operate on them as if they were locally mounted.

Figure 5.9 shows a screen shot of Matlab reading and writing to local files that communicate with the StreamFS server. This implementation represents an exercise and proof-point that the file abstraction can serve as a convenient interface to building deployments. Although not an exact fit, the semantics of StreamFS can be translated to make its files look like regular Unix-style files. Moreover, the security semantics are the same, since those are adopted directly.

Table 5.7 gives an overview of the operations that we translate from StreamFS to a standard Unix file. Note that we did not translate all of the operations; these were simply a convenient set for client-based applications that interact with StreamFS through a filesystem mount. Part of the verification work is implemented entirely through the StreamFS-FUSE implementation.

Figure 5.9: SFSFuse implementation. By mapping access and operational semantics to POSIX file operations, we enable legacy desktop applications to interact with the deployment directly.

operation           description
stream file read    Read the data from a timeseries query.
stream file write   Only accepts query parameters.
container read      Folder read semantics.
container write     Folder write semantics.
tail -f             Starts a stream subscription and writes to the file.
pipe                Writes file contents to a pipe.

Table 5.7: Overview of the mapping from StreamFS operations to standard Unix file operations in the FUSE layer.

The work on functional verification, discussed in Section 6.3, uses a mounted-FS version of StreamFS to fetch chunks of the data for bulk, client-side analysis of the streams. This interface made interaction with a large amount of data very simple. If the operations are mainly fetches, the implementation is simple enough that the semantics translate cleanly. Moreover, the suite of Unix tools, like grep and pipes, translates directly, so we can leverage that family of tools and security mechanisms. This also demonstrates the extensibility and generalizability of the system. This layer offers a fraction of the useful mechanisms that StreamFS makes available and can leverage a large number of legacy analytical applications.

5.6 Related Work

Our work touches on several areas, from smart home research [69] to logistics [45] to context-aware mobile applications [89]. In the building space, there has been some interest in building various kinds of energy-related visualization and control applications. This work focuses on the object definition, tracking, and management component of the architecture proposed by Hsu et al. [54]. Their work stratified the set of challenges that one could potentially face if the application were deployed at scale. Our work, in contrast, bases its design rationale on a real deployment that is taking place at scale in a building on our campus. We focus on solving fundamental systems challenges in dealing with intermittent connectivity and conflict resolution in tracking people and things over time. We also focus on leveraging gestures to minimize the cost of interaction for users, while maximizing the information we can attain about the state of the world.

HBCI [54] proposes a high-level architecture that also relies on QR codes, mobile phones, and ubiquitous network access. HBCI introduces the notion of object capture through the mobile phone and individual services provided by the object, accessible via an object lookup. The proposed service model is object-centric, such as individual power traces or direct control access. Their “query” service is a tag-lookup mechanism realized through QR code scanning of items. The ‘Energy Lens’ also embodies the “query” via a tag lookup; however, we focus on context-related services rather than object-centric services. We build and maintain an entity-relationship graph (ERG) to capture the inter-relationships between items. The ERG informs our analytical processing. We use an eventual-consistency model to maintain the inter-relationship graph over time. HBCI does not address the challenges faced in realizing an indoor, interactive application that relies on ubiquitous network connectivity. Our architecture directly addresses this challenge, as we observe that indoor connectivity characteristics do not comply with the ubiquitous-connectivity requirement for this class of application.

An important aspect of the Energy Lens is determining when people and things have moved. This requires some form of indoor localization. There is a large body of literature in the area of indoor localization with mobile phones, ranging from using wifi [10], to sonar [97], to ambient noise [112], to a combination of sensors on the phone [9]. Dita [108] uses acoustic localization of mobile phones and also uses the infrastructure to determine gestures in free space that are classified into pre-defined control actions. Each of these requires relatively complex software and/or infrastructure. We take a radically different, simple approach. We use cheap, easy-to-reproduce tags (QR codes), incrementally place them on things in the environment, and use the natural swiping gesture that users make when interacting with the Energy Lens application to track when they have moved or when the objects around them have moved. The guiding principle is to attain as much information from a gesture as possible to determine that a move has occurred. We discuss our heuristics in Section 5.3. ACE [89] uses various sensors on the phone to infer a user’s context.
The context domain consists of a set of user activities and semantic locations. For example, ACE distinguishes between Running, Driving, AtHome, and InOffice. ACE can also infer one from the other: if a user is AtHome then they are not InOffice. Energy Lens uses inference to determine when a person or thing has moved. Certain swipe combinations give us information about whether the user moved or whether an item moved. The main difference is that we only infer context when a user is actively swiping, rather than taking a continuous approach.

Prefetching is a fundamental technique used in many domains. The cost of a prefetch for a mobile application outweighs the benefits if the prefetched data is not useful. Informed mobile prefetching [51] uses cost-benefit analysis to determine when to prefetch content for the user. We prefetch data based on location swipes. Caching prefetched content not only improves performance and interactivity, but is necessary to sustain normal operation during periods of disconnectedness.

Logistics systems focus on tracking objects as they move from distribution sites, to warehouses, and across purchase counters. Items are tracked through their unique bar codes, often embedded in an RFID. The workload in logistics systems is very structured and well defined. The authors of [45] describe this structure and leverage it to minimize storage requirements and optimize query-processing performance. The Energy Lens uses QR codes as tags and the phone as an active reader. As objects move, users scan those items into their new location. However, objects may belong to one or many people, they can be metered by multiple meters a day, and their history in the system is on-going. In contrast, a typical logistics workload has a start (distribution site) and end point (leaving the store after a sale); our workload is less well defined. Furthermore, relationship semantics are important: for example, we interpret a bind relationship differently from an attach relationship, as discussed in Section 5.3. We also take advantage of the natural gestures the user makes with the phone while scanning QR codes to extract information about the current location of the user or things.

The key idea in the HP Cooltown [120, 69] work is to web-enable ‘things’ in the world, grouped by ‘place’, and accessed by ‘people’ via a standardized acquisition protocol (HTTP) and format (HTML, XML). Cooltown creates a web presence for things in the world either directly (embedded web server) or indirectly (URL-lookup tag) as a web page that displays the services the thing provides. Many of the core concepts in Cooltown also show up in Energy Lens. The main overlap is the use of tags in the world that contain a reference to a virtual resource, accessible via HTTP through a network connection. Cooltown, however, explicitly chooses not to maintain a centralized relationship graph; it leverages the decentralized linking structure of the web to group associated web pages together. Furthermore, things are assumed not to move; people are the main mobile entities. The kinds of applications we wish to support must track where things are and their specific inter-relationships. We impose a richer set of semantics on our centrally maintained relationship graph and use it to provide detailed energy information.
No. deployments   7
Total files       > 10k
Locations         University of Tokyo, UCB Cory, SDH, Stanford Y2E2, Intel, Nokia, Samsung
Total data        1 TB over 2 years

Table 5.8: This table summarizes the deployment statistics for StreamFS over two years.

5.7 Summary

We demonstrate the versatility of StreamFS in its ability to serve a diverse set of applications. This clearly demonstrates its generalizability and ease of management through the filesystem abstraction. In the Energy Lens application we examined three system challenges – mobility, consistency management, and disconnected operation – for enabling energy-analytics applications in buildings. StreamFS plays a crucial role in providing the kinds of services necessary for collecting and managing deployment data and metadata as well as providing live aggregate statistics on the deployment to the end-user. Furthermore, we demonstrated the extensibility of the system, as all the StreamFS features were agile enough to deal with the evolving dynamics of the deployment and the services built to address fundamental challenges with consistency management and disconnected operation.

Altogether, StreamFS was deployed across seven buildings in two countries, with very different climates, systems, and usage patterns; Table 5.8 summarizes these deployments. Our largest deployment has > 7k feeds simultaneously feeding a single StreamFS deployment. Several hundred derivative streams were generated in these deployments as well. They require a cluster of processing elements and datastore elements, typically no larger than two machines per component. This demonstrates the scalability of StreamFS in practice.

Chapter 6

Empirical Verification of System Functionality and Metadata

6.1 Verification through Sensor Data

Every system that manages data in the building specifies the building model manually at some point in the life cycle of the metadata. Moreover, as the building and the physical configuration of the deployment change, all changes in the virtual representation of the building are applied manually. As we move more and more towards software-controlled spaces and systems that rely on accurate state capture of the physical environment, it becomes critically important to automate the process that verifies correct configuration specification. In the architecture we propose, we see verification as a concurrent process that runs in parallel with all active deployments.

The hierarchical naming scheme explicitly informs the verification process of “group-by” relationships between sensors. Such “group-by” relationships typically specify physical relationships, such as grouping sensors that are in the same room, or on the same floor. Ideally, a verification process could run several clustering algorithms, using the empirical data collected from the sensors, to infer such physical relationships. Most importantly, it can alert the user when previously statistically clustered sensors no longer show the same inter-relationship.

In this chapter, we discuss the analytical underpinnings of a verification mechanism that could eventually become a component in StreamFS – or any other building software system. We introduce the various kinds of verification in depth, discuss the mathematical tools that help us both formulate and solve related verification problems, and present results for the empirical verification of various physical relationships between sensors. In the next section we discuss the various types of verification; then we describe the tools, methodologies, and results for solving each.

6.2 Types of Verification

There are three kinds of verification that we discuss in this section:

1. Functional: verifying whether the behavior of the components is changing.

2. Spatial: verifying the spatial relationship between components.

3. Categorical: verifying the unit of measure and the phenomenon being observed.

Before we discuss the details and approach for each, let us first examine the mathematical tools that we use in our solutions. Then we discuss each type of verification in more detail. We give a detailed description of the methodology for each and present the results of our analysis.

Correlation

Throughout this work, we make extensive use of the correlation coefficient function defined as:

$$ r(X,Y) = r_{X,Y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} $$

where $X$, $Y$ are separate sets of values, $n$ is the total number of sample points in each set, and $\bar{X}$ is the mean value of $X$ (likewise $\bar{Y}$ for $Y$). For each pair of sensors, we compute the corrcoeff to ascertain the relationship between them.
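To make the computation concrete, the following is a minimal sketch in Python using NumPy (the function name is ours):

    import numpy as np

    def corrcoeff(x, y):
        """Pearson correlation coefficient r(X, Y) as defined above."""
        xc, yc = x - x.mean(), y - y.mean()
        return np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2))

    # Equivalently, np.corrcoef(x, y)[0, 1] computes the same quantity.

For sensor traces, x and y are the aligned sample vectors of two feeds over the same time window.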

Empirical Mode Decomposition

Another important tool that we use is empirical mode decomposition. We use it to partition the empirical data into its constituent sub-signals and examine what those sub-signals tell us about its behavior. We also use it to examine how signals behave relative to one another at certain bands of importance that contain important physical information. Empirical Mode Decomposition (EMD) [58] is a technique that decomposes a signal and reveals intrinsic patterns, trends, and noise. The technique has been widely applied to a variety of datasets, including climate variables [74], medical data [14], speech signals [55, 49], and image processing [91]. EMD’s effectiveness relies on its empirical, adaptive, and intuitive approach. The technique is designed to efficiently decompose both non-stationary and non-linear signals without requiring any a priori basis functions or tuning. EMD decomposes a signal into a set of oscillatory components called intrinsic mode functions (IMFs). An IMF satisfies two conditions: (1) it contains the same number of extrema and zero crossings (or they differ at most by one); (2) the two IMF envelopes defined by its local maxima and local minima are symmetric with respect to zero. Consequently, IMFs are functions that directly convey the amplitude and frequency modulations. EMD is an iterative algorithm that extracts IMFs step by step using the so-called sifting process [58]; each step seeks the IMF with the highest frequency by sifting, then the computed IMF is removed from the data and the residual data are used as input for the next step. The process stops when the residual data becomes a monotonic function from which no more IMFs can be extracted. We formally describe the EMD algorithm as follows:

1. Sifting process: For a current signal $h_0 = X$, let $m_0$ be the mean of its upper and lower envelopes as determined from a cubic-spline interpolation of its local maxima and minima.

2. The estimated local mean $m_0$ is removed from the signal, giving a first component: $h_1 = h_0 - m_0$.

3. The sifting process is iterated, $h_1$ taking the place of $h_0$. Using its upper and lower envelopes, a new local mean $m_1$ is computed and $h_2 = h_1 - m_1$.

4. The procedure is repeated $k$ times until $h_k = h_{k-1} - m_{k-1}$ is an IMF according to the two conditions above.

5. This first IMF is designated as $c_1 = h_k$, and contains the component with the shortest periods. We extract it from the signal to produce a residual: $r_1 = X - c_1$. Steps 1 to 4 are repeated on the residual signal $r_1$, providing IMFs $c_j$ and residuals $r_j = r_{j-1} - c_j$, for $j$ from 1 to $n$.

6. The process stops when the residual $r_n$ contains no more than 3 extrema.

The result of EMD is a set of IMFs $c_i$ and the final residue $r_n$, such that:

$$ X = \sum_{i=1}^{n} c_i + r_n $$

where the size of the resulting set of IMFs ($n$) depends on the original signal $X$, and $r_n$ represents the trend of the data (see the IMFs in Figure 6.5). For this work we implemented a variant of EMD called Complete Ensemble EMD [115]. This algorithm computes EMD several times with additional noise, which allows us to efficiently analyze signals that have flat sections (i.e., sections consuming no electricity, in our case).
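For illustration, here is a toy Python sketch of the sifting loop described above, built on SciPy spline envelopes. It uses a fixed number of sifting iterations rather than the formal IMF stopping criteria, ignores boundary effects in the envelopes, and is not the Complete Ensemble EMD implementation used in this work:

    import numpy as np
    from scipy.signal import argrelextrema
    from scipy.interpolate import CubicSpline

    def sift_once(h, t):
        """One sifting step: subtract the mean of the upper/lower envelopes."""
        maxima = argrelextrema(h, np.greater)[0]
        minima = argrelextrema(h, np.less)[0]
        if len(maxima) < 4 or len(minima) < 4:
            return None  # too few extrema: the residual is only a trend
        upper = CubicSpline(t[maxima], h[maxima])(t)
        lower = CubicSpline(t[minima], h[minima])(t)
        return h - (upper + lower) / 2.0

    def emd(x, max_imfs=10, sift_iters=8):
        """Toy EMD: returns (imfs, residue) with x == sum(imfs) + residue."""
        t = np.arange(len(x), dtype=float)
        imfs, r = [], np.asarray(x, dtype=float).copy()
        for _ in range(max_imfs):
            h = r.copy()
            for _ in range(sift_iters):  # fixed iteration count for simplicity
                h_next = sift_once(h, t)
                if h_next is None:
                    return imfs, r  # no more IMFs can be extracted
                h = h_next
            imfs.append(h)  # c_i: the extracted IMF
            r = r - h       # continue sifting on the residual
        return imfs, r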

Reaggregation of Intrinsic Mode Functions

By applying EMD to energy consumption signals we obtain a set of IMFs that precisely describe the devices’ consumption patterns at different frequency bands. Therefore, we can focus our analysis on the smaller time scales, ignoring the dominant patterns that prevent us from effectively analyzing the raw signals.


Figure 6.1: Generalized Zero Crossing: the local mean period at the point $p$ is computed from one quarter period $T_4$, two half periods $T_2^x$, and four full periods $T_1^y$ (where $x = 1, 2$ and $y = 1, 2, 3, 4$).

However, comparing the IMFs obtained from different signals is also non-trivial: because EMD empirically uncovers IMFs from the data, there is no guarantee that two IMFs $c_i^1$ and $c_i^2$ obtained from two distinct signals $S^1$ and $S^2$ represent data in the same frequency domain. Directly comparing $c_i^1$ and $c_i^2$ is meaningless unless we confirm that they belong to the same frequency domain. There are numerous techniques to retrieve IMF frequencies [57]. In this work we take advantage of the Generalized Zero Crossing (GZC) [56] because it is a simple and robust estimator of the instantaneous IMF frequency [57]. GZC is a direct estimation of IMF instantaneous frequency using critical points defined as the zero crossings and local extrema (round dots in Figure 6.1). Formally, given a data point $p$, GZC measures the quarter ($T_4$), the two half ($T_2^x$), and the four full periods ($T_1^y$) that $p$ belongs to (see Figure 6.1), and the instantaneous period is computed as:

$$ T = \frac{1}{7}\left\{ 4T_4 + (2T_2^1 + 2T_2^2) + (T_1^1 + T_1^2 + T_1^3 + T_1^4) \right\} $$

Since all points $p$ between two critical points have the same instantaneous period, GZC is local down to a quarter period. Hereafter, we refer to the time scale of an IMF as the average of the instantaneous periods along the whole IMF. Because the time scale of each IMF depends on the original signal, we propose the following to efficiently compare IMFs from different signals: we cluster IMFs with respect to their time scales and partially reconstruct each signal by aggregating its IMFs from the same cluster. Then, we directly compare the partial signals of different devices. EMD yields distinct components at different time scales, and we compute the instantaneous frequencies [57] of the IMFs using Generalized Zero Crossing [56]. We break the time scales into four frequency bands:

• High Frequency: a time scale smaller than 30 minutes, mainly reflecting the operational characteristics of devices and noise in the system.

• Medium Frequency: a time scale between 30 minutes and 6 hours, which is within the time span of daily activities inside a building.

• Low Frequency: a time scale between 6 hours and 7 days.

• Residue: everything with a time scale longer than 7 days, showing long-term patterns such as seasonal changes.

A sketch of this banding procedure appears below. Later in this thesis we will explain how we use the medium-frequency band for both spatial verification and functional verification. We discuss the three types of verification in the next section.
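As a rough illustration of the banding step, the sketch below estimates each IMF’s time scale from its zero crossings alone – a simplification of the full GZC estimator, which also folds in the extrema – and sums the IMFs into the four bands (the names and the 5-minute sampling assumption are ours):

    import numpy as np

    def mean_period_minutes(imf, dt_min=5.0):
        """Rough IMF time scale: the mean gap between consecutive zero
        crossings is half a period (GZC also uses the local extrema)."""
        sign = np.signbit(imf).astype(np.int8)
        zc = np.flatnonzero(np.diff(sign) != 0)
        if len(zc) < 2:
            return np.inf  # no oscillation: treat as residue
        return 2.0 * np.mean(np.diff(zc)) * dt_min

    def reaggregate(imfs, dt_min=5.0):
        """Sum the IMFs that fall into each of the four frequency bands."""
        bands = {b: np.zeros_like(imfs[0]) for b in ("high", "med", "low", "res")}
        for imf in imfs:
            T = mean_period_minutes(imf, dt_min)
            if T < 30:
                bands["high"] += imf
            elif T < 6 * 60:
                bands["med"] += imf
            elif T < 7 * 24 * 60:
                bands["low"] += imf
            else:
                bands["res"] += imf
        return bands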

Functional Verification

With an increased push for operational efficiency, operators are relying more on historical data processing to uncover opportunities for energy savings. However, they are overwhelmed by the deluge of data and seek more efficient ways to identify potential problems. In this thesis, we present an approach called Strip, Bind and Search (SBS), a method for uncovering abnormal equipment behavior and in-concert usage patterns. SBS uncovers relationships between devices and constructs a model for their usage pattern relative to other devices. It then flags deviations from the model. The intuition behind the proposed approach is that each service provided by the building requires a minimum subset of devices. The devices within a subset are used at the same time when the corresponding service is needed, and a savings opportunity is characterized by the partial activation of the devices. For example, office comfort is attained through sufficient lighting, ventilation, and air conditioning. These are controlled by the lighting and HVAC (Heating, Ventilation, and Air Conditioning) systems. Thus, when the room is occupied both the air conditioner (heater on a cold day) and the lights are used together, and both should be turned off when the room is empty. In principle, if a person leaves the room and turns off only the lights, then the air conditioner (or heater) is a source of electricity waste. Following this basic idea we propose Strip, Bind and Search (SBS), an unsupervised methodology that systematically detects electricity waste. Our proposal consists of two key components:

Strip and Bind The first part of the proposed method mines the raw sensor data, identifying inter-device usage patterns. We first strip the underlying traces of occupancy-induced trends. Then we bind devices whose underlying behavior is highly correlated. This allows us to differentiate between devices that are used together (high correlation), used independently (no correlation), and used mutually exclusively (negative correlation).

Search The second part of the method monitors device relationships over time and reports deviations from the norm. It learns the normal inter-device usage using a robust, longitudinal analysis of the building data and detects anomalous usage. Such abnormalities usually represent an opportunity to reduce electricity waste, or events that deserve careful attention (e.g., a faulty device).

SBS overcomes several challenges. First, noisy sensor traces all share a similar trend, making direct correlation analysis non-trivial. Because device energy consumption is mainly driven by occupancy and weather, all the devices display a similar daily pattern, in roughly overlapping time intervals and phases. Therefore, one of the main contributions of this work is uncovering the intrinsic device relationships by filtering out the dominant trend. For this task we use Empirical Mode Decomposition [58], a known method for de-trending time-varying signals. Another key contribution of this work is in using SBS to practically monitor building energy consumption. Moreover, the proposed method is easy to use and functions in any building, as it requires neither prior knowledge of the building nor extra sensors. It is also tuned through a single intuitive parameter.

Spatial Verification

Typically, placement information is embedded in the name or associated metadata of each sensor in the building. These are used to group sensors by location. For example, in our building data, all sensors that contain the string ‘410’ in their name are in room 410. Processes typically group streams in this fashion, using regular-expression matching or field-matching queries on the characters in the sensor name or metadata. If these are not updated to reflect changes, then such group-by query results will not accurately represent the true spatial relationships. We observe that spatial associations can be derived empirically. We start with this approach in our work and explore, more deeply, the extent to which it can be used as a verification tool for corroborating the groups constructed from character-matching queries. We refer to this process as spatial verification. We start our analysis by extending the methodology used for functional verification, based on empirical mode decomposition (EMD). In our analysis, we collect traces from several sensors and run EMD on them. This produces a set of “intrinsic mode functions” (IMFs), which we separate by frequency range and re-aggregate into distinct bands. Then, we inspect the relationship between the sensors by computing the corrcoeff within a particular band, which gives us the spatial information we are interested in. Finally, we separate the result set into sub-sets and closely examine their statistical characteristics.

Figure 6.2: (a) EMD decomposes a signal and exposes intrinsic oscillatory components; (b) Aggregation of IMFs within a pre-defined frequency range makes seemingly similar signals from different locations more distinguishable; (c) IMF aggregation makes seemingly distinct signals of different sensors in the same room show high correlation.

Categorical Verification

Categorical verification is the ability to cluster sensor traces by the type of sensor (i.e., its unit of measure) and the phenomenon it is measuring. Ideally, we should be able to separate feeds that measure different physical attributes and/or are placed in different parts of the building. For example, if there are sensors measuring temperature in a pipe and temperature in a room, we should be able to separate them from one another, as well as separate them from a sensor measuring pressure in a valve. We will show that this can be done by examining their distributions. For small data sets with distinct sensor types our approach works quite well; however, we encounter challenges on a larger, more realistic deployment data set, and we discuss why that setting is so difficult to deal with.
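As a preview of the distribution-based approach, one plausible realization – the function name and the choice of the Kolmogorov–Smirnov statistic as the distance are ours – compares the empirical value distributions of two traces directly:

    from scipy.stats import ks_2samp

    def category_distance(trace_a, trace_b):
        """Two-sample Kolmogorov-Smirnov distance between the empirical
        value distributions of two traces (0 = identical distributions)."""
        return ks_2samp(trace_a, trace_b).statistic

    # Traces with a small distance (e.g., two room-air temperature feeds)
    # are candidates for the same category; a pipe-temperature or
    # valve-pressure feed should sit far from both.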

6.3 Functional Verification Methodology

We run SBS on two sets of building sensor traces – each containing on the order of a hundred sensors – reporting data over a combined 18 weeks, from two separate buildings with fundamentally different infrastructures. We demonstrate that, in many cases, SBS uncovers misbehavior corresponding to


Figure 6.3: Correlation coefficients of the raw traces from the Todai dataset. The matrix is ordered such that devices serving the same or adjacent rooms are nearby in the matrix.

inefficient device usage that leads to energy waste. The waste uncovered is as high as 2500 kWh per device. We validate the effectiveness of our approach using 10 weeks of data from a modern Japanese building containing 135 sensors and 8 weeks of data from an older American building containing 70 sensors. These experiments highlight the effectiveness of SBS in uncovering device relationships in a large deployment of 135 sensors. Furthermore, we inspect the SBS results and show that the reported alarms correspond to significant opportunities to save energy. The major anomaly reported in the American building lasts 18 days and accounts for a waste of 2500 kWh. SBS also reports numerous small anomalies, hidden deep within the building’s overall consumption data. Such errors are very difficult to find without SBS. The primary objective of SBS is to determine how device usage patterns are correlated across all pairs of sensors and to discover when these relationships change. The naive approach is to run correlation analysis on pairs of sensor traces, recording their correlation coefficients over time and examining when there is a statistically significant deviation from the norm. However, this approach does not yield any useful information when applied to raw data traces. For example, the two raw signals shown in Figure 6.5 are from two independent HVAC systems, serving different rooms on different floors. Since each space is independently controlled, we expect their power-draw signals to be uncorrelated (or at least distinguishable from other signal pairs). However, their correlation coefficient (0.57) is not particularly informative – it is statistically similar to the correlation between either signal and the other signals in the trace. Using a larger set of devices, Figure 6.3 shows a correlation matrix with 135 distinct lighting and HVAC systems serving numerous rooms in a building. The indices are selected


Figure 6.4: Auto-correlation of a typical signal from the Building 1 dataset. The signal features daily and weekly patterns (x = 24 and x = 168, respectively).

such that their index-difference is indicative of their relative spatial proximity. For example, a device in location 1 is closer in the building to a device in location 2 than it is to a device in location 135. The color of each cell is the average pairwise correlation coefficient for the devices at the row-column index; the higher the value, the lighter the color. Devices serving the same room are along the diagonal. Because these devices are used simultaneously, we expect high average correlation scores – lighter shades – along the diagonal of the figure. However, we observe no such pattern. Most of the signals are correlated with all the others, and we see no discernible structure. An explanation for this is that the daily occupant usage patterns drive these results. Figure 6.5 demonstrates this more clearly. It shows two one-week raw signal traces which feature the same diurnal pattern. This trend is present in almost every sensor trace, and it hides the smaller fluctuations that provide more specific patterns driven by local occupant activity. Upon deeper inspection, we uncovered several dominant patterns, common among energy-consuming devices in buildings [123]. Figure 6.4 depicts the auto-correlation of a typical electric power signal for a device. The two highest values in the figure correspond to lags of 24 hours and 168 hours (one week). Therefore, the signal has some periodicity, and similar (though not equal) values are seen at daily and weekly time scales. The daily pattern is due to daily office hours, and the weekly pattern corresponds to weekdays and weekends. Correlation analysis on raw signals cannot be used to determine meaningful inter-device relationships because the periodic components act as non-stationary trends for high-frequency phenomena, rendering the correlation function uninformative. Such trends must be removed in order to make meaningful progress toward our aforementioned goals. In the next section we describe SBS. We discuss strip and bind in Section 6.3, which addresses de-trending and relationship discovery. Then, we describe how we search for changes

Figure 6.5: Strip and Bind using two raw signals standing for one week of data from two different HVACs. (1) Decomposition of the signals into IMFs using EMD (top to bottom: $c_1$ to $c_n$); (2) aggregation of the IMFs based on their time scale; (3) comparison of the partial signals (aggregated IMFs) using the correlation coefficient.

in usage patterns, in section 6.3, to identify potential savings opportunities.

Strip and Bind

Discovering devices that are used in concert is non-trivial. SBS decomposes each signal into an additive set of components, called intrinsic mode functions (IMFs), that reveal the signal patterns at different frequency bands. IMFs are obtained using Empirical Mode Decomposition (see Figure 6.5 and Section 6.2). We only consider IMFs with time scales shorter than a day, since we are interested in capturing short-scale usage patterns. Consequently, SBS aggregates the IMFs that fall into this specific time scale (see IMF agg. in Figure 6.5). The resulting partial signals of different device power traces are compared pairwise to identify the devices that show correlated (or uncorrelated) usage patterns (see Corr. Coeff. in Figure 6.5). The IMFs are clustered using four time scale ranges:

• The high frequencies are all the IMFs with a time scale shorter than 20 minutes. These IMFs capture the noise.

• The medium frequencies are all the IMFs with a time scale between 20 minutes and 6 hours. These IMFs convey the detailed device usage.

• The low frequencies are all the IMFs with a time scale between 6 hours and 6 days. These IMFs represent daily device patterns.

• The residual data is all data with a time scale longer than 6 days. This is mainly the residual obtained after applying EMD. It also highlights the main device trend.

These time scale ranges are chosen based on our experiments and goals. The 20-minute boundary is tied to the sampling period of our dataset (5 minutes) and permits us to capture IMFs with very short periods. The 6-hour boundary allows us to analyze all patterns that have a period shorter than the usual office hours. The 6-day boundary allows us to capture daily patterns and weekday patterns. Aggregating the IMFs within each time scale range results in 4 partial signals representing different characteristics of the device’s energy consumption (see Partial Signals in Figure 6.5). We then do a pairwise device trace comparison, calculating the correlation coefficient of their partial signals. In the example shown in Figure 6.5, the correlation coefficient of the raw signals suggests that they are highly correlated (0.57). However, the comparison of the corresponding partial signals provides new insights: the two devices are poorly correlated at high and medium frequencies (−0.01 and −0.04, respectively) but highly correlated at low frequencies (0.79), meaning that these devices are not “intrinsically” correlated – they only share a similar daily pattern. All the devices are compared pairwise at the four different time scale ranges. Consequently, we obtain four correlation matrices that convey device similarities at different time scales. Each row of these matrices (or column, since the matrices are symmetric) reveals the behavior of a device – its relationships with the other devices at a particular time scale. The matrices form the basis for tracking the behavior of devices and searching for misbehavior.
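A minimal sketch of this step, assuming each band’s partial signals are stacked as the rows of a d x T array (the names are ours):

    import numpy as np

    def band_correlation_matrices(partial_signals):
        """partial_signals: dict mapping a band name to a (d, T) array of
        the d devices' partial signals for that band. Returns one
        symmetric d x d correlation matrix per band."""
        return {band: np.corrcoef(X) for band, X in partial_signals.items()}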

Search

Search aims at identifying misbehaving devices in an unsupervised manner. Device behavior is monitored via the correlation matrices presented in the previous section. Using numerous observations, SBS computes a reference that captures the normal inter-device usage pattern. Then, SBS compares the computed reference with the current data and reports devices that deviate from their usual behavior.

Reference Matrix

We define four reference matrices, which capture normal device behavior at the four time scale ranges defined in Section 6.2. The references are computed as follows: (1) we retrieve the correlation matrices for $n$ consecutive time bins; (2) for each pair of devices we compute the median correlation over the $n$ time bins and obtain a matrix of the median device correlations. Formally, for each time scale range, the computed reference matrix for $d$ devices and $n$ time bins is:

$$ R_{i,j} = \mathrm{median}(C_{i,j}^1, \ldots, C_{i,j}^n) $$

where $i$ and $j$ range over $[1, d]$. Because anomalies are rare by definition, we assume the data used to construct the reference matrix is an accurate sample of the population: it is unbiased and accurately captures the range of normal behavior. Abnormal correlation values that could appear during model construction are ignored by the median operator thanks to its robustness to outliers (50% breakdown point). However, if that assumption does not hold (more than 50% of the data is anomalous), our model will flag the opposite – labeling abnormal as normal and vice versa. From close inspection of our data, we believe our primary assumption is sound.
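In code, the reference is simply an element-wise median over the per-bin correlation matrices; a sketch (the names are ours):

    import numpy as np

    def reference_matrix(corr_matrices):
        """corr_matrices: list of n per-bin correlation matrices (each
        d x d). Returns the d x d matrix R of element-wise medians."""
        return np.median(np.stack(corr_matrices), axis=0)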

Behavior change

We compare each device’s behavior, for all time bins, to the one given by the reference matrix. Consider the correlation matrix $C^t$ obtained from the data for time bin $t$ ($1 \leq t \leq n$). The vector $C_{i,*}^t$ is the behavior of the $i$-th device for this time bin. Its normal behavior is given by the corresponding vector of the reference matrix, $R_{i,*}$. We measure the device’s behavior change at time bin $t$ with the following weighted Minkowski distance:

$$ l_i^t = \left( \sum_{j=1}^{d} w_{ij} \left| C_{i,j}^t - R_{i,j} \right|^p \right)^{1/p} $$

where $d$ is the number of devices and the weight $w_{ij}$ is:

$$ w_{ij} = \frac{R_{i,j}}{\sum_{k=1}^{d} R_{i,k}} $$

The weight $w$ enables us to highlight relationship changes between device $i$ and the devices highly correlated to it in the reference matrix. In other words, our definition of behavior change is mainly driven by the relationships among devices that are usually used in concert. We also set $p = 4$ in order to inhibit small differences between $C_{i,j}^t$ and $R_{i,j}$ but emphasize the important ones. By monitoring this quantity over several time bins, abnormal device behaviors are easily identified as outlier values. In order to identify these outlier values we implement a robust detector based on the median absolute deviation (MAD), a dispersion measure commonly used in anomaly detection [59, 19]. It robustly estimates the variability of the data by computing the median of the absolute deviations from the median of the data.

Let $l_i = [l_i^1, \ldots, l_i^n]$ be a vector representing the behavior changes of device $i$ over $n$ time bins; then its MAD value is defined as:

$$ \mathrm{MAD}_i = b \cdot \mathrm{median}(|l_i - \mathrm{median}(l_i)|) $$

where the constant $b$ is usually set to 1.4826 for consistency with the usual parameter $\sigma$ for Gaussian distributions. Consequently, we flag anomalous behavior for device $i$ at time $t$ when the following inequality is satisfied:

$$ l_i^t > \mathrm{median}(l_i) + \tau \cdot \mathrm{MAD}_i $$
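Putting the pieces together, a compact sketch of the Search step under the definitions above (the function names are ours):

    import numpy as np

    def behavior_change(C_t, R, p=4):
        """Weighted Minkowski distance l_i^t between a time bin's
        correlation matrix C_t and the reference R (both d x d)."""
        w = R / R.sum(axis=1, keepdims=True)  # w_ij, row-normalized
        return np.sum(w * np.abs(C_t - R) ** p, axis=1) ** (1.0 / p)

    def mad_alarms(L, tau=5.0, b=1.4826):
        """L: d x n matrix of behavior-change scores (devices x time bins).
        Returns a boolean d x n matrix marking the (device, bin) alarms."""
        med = np.median(L, axis=1, keepdims=True)
        mad = b * np.median(np.abs(L - med), axis=1, keepdims=True)
        return L > med + tau * mad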

The parameter $\tau$ makes SBS more or less sensitive. The final output of SBS is a list of alarms of the form $(t, i)$, meaning that device $i$ has abnormal behavior at time bin $t$. The priority of the alarms in this list is selected by the building administrator by tuning the parameter $\tau$.

We evaluate SBS using data collected from buildings in two different geographic locations. One is a new building on the main campus of the University of Tokyo and the other is an older building at the University of California, Berkeley. Data pre-processing is not generally required for the proposed approach. Nevertheless, we observe a few exceptional cases where sensors report excessively high values (i.e., values higher than the device’s actual capacity) that greatly alter the performance of SBS by inducing a large bias in the computation of the correlation coefficient. Therefore, we remove values that are higher than the maximum capacity of the devices from the raw data.

The Todai dataset we use contains 10 weeks of data starting from June 27, 2011 and ending on September 5, 2011. This period of time is particularly interesting for two reasons: 1) in this region, the summer is the most energy-demanding season, and 2) the building manager actively worked to curtail energy usage as much as possible due to the Tohoku earthquake and Fukushima nuclear accident. Furthermore, this dataset is a valuable ground truth for evaluating the Strip and Bind portions of SBS. Since the lights and HVAC of the rooms are directly controlled by the rooms’ occupants, we expect SBS to uncover verifiable device relationships.

The Cory Hall dataset we use consists of 8 weeks of energy consumption traces measured by 70 sensors starting on April 5th, 2011. In contrast to the other dataset, a variety of devices are monitored, including electric receptacles on certain floors, most of the HVAC components, power panels, and whole-building consumption. These two building infrastructures are fundamentally different. This enables us to evaluate the practical efficacy of the proposed unsupervised method in two very different environments.

In this section we evaluate SBS on our building traces. We demonstrate the benefits of stripping the data by monitoring patterns captured at different time scales. Then, we thoroughly investigate the alarms reported by SBS.

Figure 6.6: Reference matrices for the four time scale ranges (the diagonal x = y is colored in black for readability): (a) high frequencies, (b) medium frequencies, (c) low frequencies, (d) residual data. The medium frequencies highlight devices that are located next to each other and thus intrinsically related. The low frequencies contain the common daily pattern of the data. The residual data permits visual identification of devices of similar type.

Device behavior at different time scales

The Strip and Bind part of SBS is evaluated using the data from Eng. Bldg 2. This dataset is appropriate for measuring SBS’s performance, since lighting and HVAC systems serving the same room are usually used simultaneously. Consequently, we analyze this data using SBS and verify that the higher correlations at medium frequencies correspond to devices located in the same room. The dataset is split into 10 one-week bins and each bin is processed by SBS. Using the 10 correlation matrices at each time scale range, SBS uncovers the four reference matrices depicted in Figure 6.6.

High frequencies In this work the high frequencies correspond to signal noise; therefore, we do not expect any useful information from the corresponding matrix (Figure 6.6a). Indeed, the corresponding reference matrix does not provide any help in determining a device’s relative location. Thus, we emphasize that high-frequency data should be ignored for uncovering device relationships (in contrast to [40]). Interestingly, we find that the sensors monitoring the lights generate consistent noise.

Medium frequencies Our main focus is on the medium frequencies, as this band is designed to capture the intrinsic device relationships. Figure 6.6b shows the correlation matrix at medium frequencies. It is significantly different from the one obtained with the raw signals (Figure 6.3): high correlation coefficients are concentrated along the matrix diagonal. Since devices serving the same or adjacent rooms are placed nearby in the matrix, this validates our hypothesis: high correlation scores within the medium-frequency band indicate strong inter-device relationships. Considering this reference matrix as the adjacency matrix of a graph in which the nodes are the devices, we identify the clusters of correlated devices using a community mining algorithm [15]. As expected, we obtain mainly clusters of only two devices (the light and HVAC serving the same room), but we also find clusters that are composed of more devices. For example, one cluster contains the 3 HVAC systems serving the three server rooms. Although these server rooms are located on different floors, SBS shows a strong correlation between these devices; coincidentally, they are managed similarly. Interestingly, we also observe a couple of clusters that consist of independent devices serving adjacent rooms belonging to the same lab. The biggest cluster contains 33 devices: 2 GHP devices and the corresponding lights. This correlation matrix and the corresponding clusters highlight the ability of SBS to identify such hidden inter-device usage relationships.

Low frequencies Low frequencies capture the daily patterns embedded in all the device traces. Figure 6.6c depicts the corresponding reference matrix, which is similar to the one of the raw signal traces (Figure 6.3) and shows no particular structure. These partial signals are discarded, as they do not help us in identifying inter-device usage patterns.

Residual data The residual data shows the weekly trend, which gives us no information about device relationships. But, surprisingly, by reordering the correlation matrix based on the type of the devices (Figure 6.6d) we can visually identify two major clusters. The first cluster consists of HVAC devices (see EHP and GHP in Figure 6.6d) and the second one contains only lights. An in-depth examination of the data reveals that long-term trends are inherent to the device types. For example, as the consumption of both the EHP and GHP devices is driven by the building occupancy and the outside temperature, these two types of devices follow the same trend. However, the use of lights is independent of the outside temperature, so the lighting systems follow a common trend different from that of the EHP and GHP devices. We conducted the same experiments by splitting the dataset into 70 one-day bins and observed analogous results at high and medium frequencies, but not at lower frequencies. This is because the bins are too short to exhibit daily oscillations, and the residual data captures only the daily trend.

Methodological Shortcomings

Because our analysis is done on historical data, some of the faults found by SBS could not be fully corroborated. In order to fully examine the effectiveness of our approach, we would have to run it in real time and physically check that each reported problem is actually occurring. When a problem is detected in a historical trace, months after it has occurred, the current state of the building may no longer reflect what is in the traces. Some of the anomalies discussed in this section uncover interpretable patterns that are difficult to find in practice. For example, simultaneous heating and cooling is a known, recurring problem in buildings, but it is very hard to identify when it is occurring. Some of the anomalies we could not interpret might be interpretable by a building manager; however, we did not consult either building manager for this study. Therefore, the results of this study do not examine the true/false positive rate exhaustively. The true/false negative rate is impractical to assess. It might be examined through synthetic stimulation of the building via the control system; however, getting cooperation from a building manager to hand over control of the building for experimentation is non-trivial. Therefore, we forgo a full true/false negative analysis in our evaluation. Because of these challenges, the evaluation of SBS focuses on comparing the output with known fault signatures. We examine anomalies, in either building, where the anomaly is easily interpretable but difficult for the building manager to find. We forgo a comparison of SBS with competing algorithms because related algorithms require detailed knowledge of the building a priori. The advantage of SBS is that it requires no such information to provide immediate value.

6.4 Functional Verification Experimental Results

We evaluate the search performance of SBS using the traces from Eng. Bldg 2 and Cory Hall. Due to the lack of historical data, such as room schedules or reports of energy waste, the evaluation is non-trivial. Furthermore, getting ground truth data from a manual inspection of the hundreds of traces in our data sets is impractical. The lack of ground truth data prevents us from producing a systematic analysis of the anomalies missed by SBS


Figure 6.7: Number of reported alarms for various threshold values (τ ∈ [3, 10]).

(i.e., the false-negative rate). Nevertheless, we demonstrate the relevance of the anomalies uncovered by SBS (i.e., a high true-positive rate and a low false-positive rate) by manually checking the output of SBS.

Anomaly classification To validate the SBS results we manually inspect the anomalies detected by the algorithm. For each reported alarm $(t, i)$ we investigate the device trace $i$ and the devices correlated to it to determine the reason for the alarm. Specifically, we retrieve the major relationship change that causes the alarm (i.e., $\max_j |w_{ij}(C_{i,j}^t - R_{i,j})|$; see Section 6.3) and examine the metadata associated with the corresponding device. This investigation allows us to classify the alarms into five groups:

• High power usage: alarms corresponding to electricity waste.

• Low power usage: alarms representing the abnormally low electricity consumption of a device.

• Punctual abnormal usage: alarms standing for a short-term (less than 2.5 hours) rise or drop in electricity consumption.

• Missing data: alarms raised due to a sensor failure.

• Other: alarms whose root cause is unclear.

Experimental setup For each experiment, the data is split into one-day time bins starting at 09:00 a.m. – approximately the time offices open. We avoid having bins start at midnight, since numerous anomalies appear at night and they are better highlighted when they do not span two time bins. Only the data at medium frequencies is analyzed; the other frequency bands are ignored, and the reference matrix is computed from all time bins.

The threshold τ tunes the sensitivity of SBS and, hence, the number of reported alarms. By plotting the number of alarms against the value of τ for both datasets (Figure 6.7), we observe an elbow in the graph around τ = 5. With thresholds lower than this pivot value (τ < 5), the number of alarms increases significantly, causing less important anomalies to be reported. For higher values (τ > 5), the number of alarms slowly decreases, providing more conservative results that consist of the most important anomalies. This pivot value provides a good trade-off for either data set.

Table 6.1 classifies the alarms reported by SBS on both datasets. Anomalies spanning several time bins (or involving several devices) may raise several alarms. We display these in Table 6.1 as numbers in parentheses – the number of anomalies corresponding to the reported alarms.

Table 6.1: Classification of the alarms reported by SBS for both datasets (with the number of corresponding anomalies in parentheses).

              High    Low    Punc.  Missing  Other
Eng. Bldg 2   9 (5)   6 (5)  1 (1)  36 (1)   3 (3)
Cory Hall     25 (7)  7 (3)  4 (4)  0 (0)    3 (3)

Alarms in Todai SBS reported 55 alarms over the 10 weeks of the Eng. Bldg 2 dataset. However, 36 alarms are set aside because of sensor errors; one GHP has missing data for the first 18 days. Since this device is highly correlated with another GHP in the reference matrix, their relationship is broken for the first 18 bins, and for each bin one alarm per device is raised. In spite of the post-Fukushima measures to reduce Eng. Bldg 2’s energy consumption, SBS reported nine alarms corresponding to high power usage (Table 6.1). Figure 6.8a depicts the electricity consumption of the light and EHP in the same room where two alarms are raised. Because the EHP was not used during daytime (but was turned on at night, when the light was turned off), the relationship between the two devices is “broken” and an alarm is raised for each device. Figure 6.8b shows another example of energy waste: the light is on at night while the EHP is off. The top-priority anomaly reported by SBS is caused by the 10-day-long constant use of an EHP (Figure 6.8d); this waste of electricity accounts for 165 kWh. SBS partially reports this anomaly, and lower values of τ permit us to identify most of it. We observed six alarms corresponding to abnormally low power use. Upon further inspection we notice that they correspond to energy-saving initiatives from the occupants – likely due to electricity concerns in Japan. This behavior is displayed in Figure 6.8c: the room is occupied at the usual office hours (indicated by light usage) but the EHP is kept off in order to save electricity.

Figure 6.8: Examples of alarms (red rectangles) reported by SBS on the Eng. Bldg 2 dataset: (a) high power usage where the HVAC (EHP) is turned on at night; (b) high power usage where the light is left on at night; (c) low power usage where the HVAC (EHP) is not used during office hours; (d) long-term high power usage, partially detected.

Alarms in Cory Hall SBS reported 39 alarms for the Cory Hall dataset (Table 6.1). Seven are classified as low power usage; however, our inspection revealed that the root causes are different from those in the Eng. Bldg 2 dataset. We observe that the low power usage usually corresponds to device failures or misconfiguration. For example, Figure 6.9a depicts the electricity consumption of the 2nd floor chiller and a power riser that comprises the consumption of multiple systems, including the chiller. As the chiller suddenly stops working, the correlation between the two measurements is significantly altered and an alarm for each device is raised. SBS also reports 25 alarms corresponding to high power usage. One of the identified anomalies is particularly interesting. We indirectly observe abnormal usage of a device from the power consumption of the elevator and a power panel for equipment on the 1st to 4th floors. Figures 6.9b and 6.9c show the electricity consumption of both devices. SBS uncovers the correlation between these two signals, as the amount of electricity going through the panel

Figure 6.9: Examples of alarms (red rectangles) reported by SBS on the Cory Hall dataset: (a) low power usage due to a chiller failure; (b) high power usage highlighted by the elevator usage; (c) normal power and elevator usage; (d) long-term high power usage due to competing heating and cooling.

fluctuates along with the elevator power consumption (Figure 6.9c). In fact, the elevator is a good indicator of the building’s occupancy. Anomalous energy consumption is identified during a weekend, as the consumption measured at the panel fluctuates independently of the elevator usage. These fluctuations are caused by a device that is not directly monitored; therefore, we could not identify the root cause more precisely. Nevertheless, the alarm is worthwhile for building operators to start investigating. The most important anomaly identified in Cory Hall is shown in Figure 6.9d. This anomaly corresponds to the malfunctioning of the HVAC heater serving the 4th and 5th floors. The heater runs constantly for 18 consecutive days, regardless of the underlying occupant activity. Moreover, in order to maintain an appropriate temperature, this also results in an increase in the power consumption of the 4th floor HVAC chiller and of several fans, such as the one depicted in Figure 6.9d. This situation is indicative of simultaneous heating and cooling – whereby the heating and cooling systems are competing – a well-known problem in building management that leads to significant energy waste. For this example, the electricity waste is estimated at around 2500 kWh for the heater. Nevertheless, as the anomaly spans 18 days, it is hidden in the building’s overall consumption and is thus difficult for building administrators to detect without SBS.

6.5 Spatial Verification Methodology

We examine the use of EMD for verifying spatial relationships. We run separate analyses on two data sets: the first is from Todai and the other is from Sutardja Dai Hall at UC Berkeley. For the Todai data set, we use a simple methodology whereby we run a pairwise correlation analysis on the IMFs of the traces in that building. We create clusters of traces that share a high correlation value and examine the spatial characteristics of the clusters as we sweep through the acceptance threshold on the correlation values. The second analysis is on a separate data set from Sutardja Dai Hall. There, we expand the methodology from the Todai dataset and apply machine learning techniques to systematize the clustering process. We also examine the effectiveness of our clustering algorithm as we sweep through a series of threshold values. We present both methodologies and their results in this section.

Todai Data Set Analysis Methodology

For the first investigation, we focus on a three-week span in the summer of 2011 (from July 4th to July 24th). The dataset captures regular work days, weekends, and one holiday (July 18th). This timeframe captures the typical usage of the equipment, triggered by occupant activity. For the initial analysis, we focus on three sensors: two pumps – an electric heat pump (EHP) and a gas heat pump (GHP) – and a light feed that measures the light level in lumens. The room lighting system serves the same room as the EHP; the GHP serves a different room on the same floor. The expanded portion of this analysis pivots on the EHP and does a pairwise comparison between it and all other sensors in the building. We use EMD to detrend each of the traces and pay particularly close attention to the high-frequency IMFs. Our hypothesis is that correlating at the higher frequencies will yield more meaningful comparisons. In buildings, metadata is poorly and unsystematically managed, even within a single system domain. Moreover, with the ever-growing number of additional sub-meters, it is important to quickly integrate sensor data from multiple systems to understand the full state of the building. It is also important to understand how sensors are used in concert; anomalies in usage may indicate underlying problems with the equipment or inefficient/incorrect usage.

Sutardja Dai Hall Analysis Methodology

The methodology used in SDH is more extensive and consists of several steps; each is discussed in detail in this section.

Distribution Analysis For the second part of our evaluation, we perform an empirical study on sensor data collected from 15 sensors across five rooms. Each room has three sensors: a temperature sensor, a CO2 sensor, and a humidity sensor. The data from these is reported to a sMAP [30] archiver. The data set used comes from two separate deployments. One is a deployment [111] lasting over six months on several floors of Sutardja Dai Hall (SDH) at UC Berkeley, where one sensor box – which contains a thermometer, a humidity sensor, and a CO2 sensor – is placed in each room; the box reports data over 6LoWPAN [60] to a sMAP archiver every 15 seconds. The other is a long-term deployment comprising thousands of sensors in dozens of buildings on campus. We choose the portion of the SDH data set where the sensor devices, accessible via BACnet, report data to the archiver every few minutes. Due to intermittent data loss, we pick a time span without interruption, from January until mid-February 2013, for evaluation. Let $ts_{j,t}^i$ be a time series for sensor $j$ in room $i$ observed over some time interval $t$. For simplicity, we ignore $t$ in defining subsequent functions and re-introduce it where necessary. For each trace we run EMD and obtain a set of $n$ IMFs, denoted as follows:

$$ \Phi_j^i = \mathrm{EMD}(ts_j^i) = \{\mathrm{IMF}_{1 \sim n}\} $$

IMFs are traces themselves, so we divide and re-aggregate them into the four bands $B$.

$$ B = \{H(igh), M(edium), L(ow), R(esidue)\} $$

Let the re-aggregation of the bands be denoted as:

$$ \mathrm{Aggr}(\Phi_j^i) = \left\{ \mathrm{IMF}_{f,j}^i \right\} $$

where $f \in B$. We pick the medium-frequency band ($M$) to compute the pairwise corrcoeff of the sensor traces. In order to understand and characterize the boundary between sensors, we consider two sets of corrcoeffs for each room – the “intra”-room set and the “inter”-room set – defined as:

$$ R_{intra,t}^i = \left\{ r(\mathrm{IMF}_{M,j,t}^i, \mathrm{IMF}_{M,k,t}^i) \right\}, \ \text{s.t.} \ \forall j, k \in S_i $$

The intra set only contains pairs of sensors in the same room, so both $ts_{j,t}^i$ and $ts_{k,t}^i$ are traces from sensors in room $i$.

$$ R_{inter,t}^i = \left\{ r(\mathrm{IMF}_{M,j,t}^i, \mathrm{IMF}_{M,k,t}^{i'}) \right\}, \ \text{s.t.} \ \forall j \in S_i, \ \forall k \in S_{i'}, \ i \neq i' $$

By contrast, the inter set contains pairs across rooms, meaning $ts_{j,t}^i$ is a trace from a sensor in room $i$ and $ts_{k,t}^{i'}$ is a sensor trace from some other room $i'$. Note the use of $t$ in the definitions: we re-introduce $t$ here to denote that the construction of each set is performed with respect to a specific time interval.

Finally, we examine the populations $R_{intra}^i$ and $R_{inter}^i$ across multiple time intervals (in days):

$$ R_{intra}^i = \bigcup_{\forall t} R_{intra,t}^i, \ \text{s.t.} \ t \in \{1, 3, 5, 7, 14, 21, 28\} $$

$$ R_{inter}^i = \bigcup_{\forall t} R_{inter,t}^i, \ \text{s.t.} \ t \in \{1, 3, 5, 7, 14, 21, 28\} $$

We generate a CDF for each of the two populations with respect to each room. This allows us to closely examine the statistical characteristics of the relationship between sensors in the same space and those in different spaces. Each room offers a potentially different perspective on this relationship.
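A sketch of how the two populations could be assembled for one time interval, assuming each sensor’s medium-band partial signal and room label are known (the names are ours):

    import numpy as np

    def intra_inter_sets(band_m, room_of):
        """band_m: dict sensor_id -> 1-D medium-band partial signal.
        room_of: dict sensor_id -> room label.
        Returns the intra-room and inter-room corrcoeff populations."""
        intra, inter = [], []
        ids = sorted(band_m)
        for a in range(len(ids)):
            for b in range(a + 1, len(ids)):
                r = np.corrcoef(band_m[ids[a]], band_m[ids[b]])[0, 1]
                same = room_of[ids[a]] == room_of[ids[b]]
                (intra if same else inter).append(r)
        return np.array(intra), np.array(inter)

    def empirical_cdf(values):
        """Sorted values and cumulative fractions for plotting a CDF."""
        v = np.sort(values)
        return v, np.arange(1, len(v) + 1) / len(v)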

Threshold Analysis In order to understand the statistical properties, we generate two corrcoeff distributions by computing the corrcoeff between pairs of traces within and across each room, as detailed in the previous section. Figure 6.15 shows how we divide the corrcoeff values into two sets; the figure shows two intra and two inter sets. Specifically, we examine how the choice of cut-off threshold affects the ability to separate the sets, when their separation is not known a priori, relative to each room. Our hypothesis is that there exists a computable, statistical boundary between sensors in different rooms. To test our hypothesis, we choose a threshold value relative to the distribution of corrcoeffs. All pairs with a corrcoeff larger than the threshold are classified as being in the same room. To closely analyze the threshold parameter, we generate a receiver operating characteristic (ROC) curve by varying the threshold value. Then, we look for a good trade-off point between the true-positive and false-positive rates: one that maximizes the difference between the TPR and FPR. We compare the ROCs generated for our “medium” frequency band IMFs against raw-signal cross-correlation values, in order to ascertain the extent to which the SBS [41] methodology is advantageous for discovering a statistical separation analogous to a physical one. We also examine whether there is a uniform boundary between clusters across all the rooms.
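Given the two populations, the ROC sweep itself is straightforward; a sketch (our naming):

    import numpy as np

    def roc_points(intra, inter, thresholds):
        """Sweep the same-room threshold over the intra/inter corrcoeff
        sets. TPR: co-located pairs correctly accepted; FPR: cross-room
        pairs wrongly accepted."""
        return [(np.mean(inter > th), np.mean(intra > th))
                for th in thresholds]

    # Example sweep over the full corrcoeff range:
    # points = roc_points(intra, inter, np.linspace(-1.0, 1.0, 201))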

Sutardja Dai Hall Data Set Analysis Methodology

We perform our empirical study on sensor data collected from 15 sensors across five rooms on four different floors of Sutardja Dai Hall, as detailed in Table 6.2. As described above, each room has three sensors – a temperature sensor, a CO2 sensor, and a humidity sensor – packaged in a single sensor box that reports over 6LoWPAN [60] to a sMAP [30] archiver every 15 seconds, from a deployment [111] lasting over six months. Due to intermittent data loss, we pick an uninterrupted time span, from January until mid-February 2013, for evaluation.

Figure 6.10: The ROC curves depict the sensitivity of the raw signal and mid-frequency IMFs to the threshold value. We choose the 0.2 FPR point as the boundary threshold for each room.

Figure 6.11: We collect data from 15 sensors in five rooms sitting on four different floors. This is a map of a section of the 3rd floor in Sutardja Dai Hall.

Table 6.2: Room Specs

Room#  Orientation  Floor  Type
A      West         2      Computer Lab
B      South        4      Conference Room
C      No Window    2      Classroom
D      North        7      Conference Room
E      South        5      Conference Room

6.6 Spatial Verification Results

Our initial results on the Todai dataset were not surprising. The diurnal pattern dominates the comparison between the sensors. Weather is the main driver of this behavior, and it affects the readings of almost all the sensors in our dataset. Cross-correlation on raw sensor data is insufficient for identifying intrinsically related behavior.

Table 6.3: Correlation coefficients of the analyzed traces and of their IMFs uncovered by EMD

            Raw trace  1st IMF    2nd IMF   3rd IMF  Residual
EHP, Light  0.7715     0.43909    0.49344   0.63469  0.82132
EHP, GHP    0.6370     0.0060274  0.063546  0.16764  0.79378

Upon closer examination of the data we assess the following:

• The main underlying diurnal trend occurs in almost all the traces.

• Occupancy and room activities occur at random times during the day and change at a higher frequency than weather patterns.

• Sensors that serve the same location observe the same activities. Therefore, their underlying measurements should be correlated.

In order to uncover these relationships we must remove the low-frequency trends in the traces and compare the readings at high frequencies. We test our hypothesis in this section by using EMD to remove low-frequency trends in the data and running correlation calculations at overlapping IMF timescales. We discover that EMD allows us to find and compare high-frequency intrinsic behavior that is spatially correlated across sensors. We begin with a small set of three sensors (EHP, GHP, light) and expand our scope to include all the sensors in the dataset.

Initial Todai Analysis Results

Figure 6.12 shows the raw traces for the three devices discussed in the previous section (EHP, GHP, light). All three exhibit a diurnal usage pattern; on weekends, each draws less power. For our initial analysis, we calculated the pairwise correlation coefficient for all sensors in the set. The correlation coefficient for the EHP and light is 0.7715 and the correlation coefficient for the EHP and GHP is 0.6370. Running correlation across them yields high correlation coefficients, mostly due to their underlying daily usage pattern. We would like to know whether the EHP trace is correlated with the two other traces. Recall that the correlation coefficients of the raw feeds were 0.7715 and 0.6370, corresponding to the light and GHP, respectively. As stated in the previous section, this result is correct but not particularly meaningful, since most of the traces display the same diurnal pattern. Figure 6.12 shows the EMD decomposition of the three traces. For each trace, EMD has retrieved three IMFs that highlight the higher frequencies of the traces. Figure 6.12 shows the normalized raw trace (top) and the EMD output IMFs and residual, as well as the correlation coefficients calculated on the IMFs for the EHP and light traces.


Figure 6.12: Decomposition of the EHP and light traces using bivariate EMD. The IMF correlation coefficients highlight the intrinsic relationship of the two traces.

The correlation coefficients are 0.43909, 0.49344, and 0.63469, corresponding to IMF1, IMF2, and IMF3, respectively. Notice the high correlation between the high-frequency IMFs. We know that the light and EHP serve the same room, and their high-frequency IMF correlation corroborates our prior knowledge.

Initial Observations

We analyze the same three-week time span for all 674 sensors deployed at Todai. For each trace S we compute two scores: (1) the correlation coefficient between S and the EHP trace, and (2) the average value of the IMF correlation coefficients. Figure 6.13a shows the distribution of correlation coefficients. Notice that a large fraction of the dataset is correlated with the EHP trace; half the traces have a correlation coefficient higher than 0.36. As expected, the underlying trend is shared by a large number of devices. Although the highest score (i.e., 0.7715) corresponds to the light in the same room that the EHP serves, there are 118 pumps, serving all areas of the building, with a correlation higher than 0.6. Using only these results, it is not clear where the threshold should be set. The distribution is close to uniform, making it difficult to know how well a given threshold

Figure 6.13: Distribution of the correlation coefficients of the raw traces and the average correlation coefficients of the corresponding IMFs, using 3 weeks of data from 674 sensors. (a) Raw-trace correlation coefficients; (b) average IMF correlation coefficients.

Figure 6.13b shows the distribution of the average correlation value for the IMFs of each trace and the EHP. The number of traces correlated in the high-frequency IMFs is significantly smaller than in the previous results. It is clear from the distribution that only a small set of devices is intrinsically correlated with the EHP. In fact, only 10 traces out of 674 yielded a score higher than 0.25. This allows us to easily rank traces by correlation. Upon closer inspection of the 10 most correlated IMF traces, we find that there is a spatial relationship between the EHP and the ten devices. In fact, there is a direct relationship between score and distance from the areas served by the EHP. Figure 6.14 shows a map of the floor that contains the rooms served by this EHP. The EHP directly serves room C2. We introduce a correlation threshold to cluster correlated traces by score and highlight rooms according to the threshold setting on the IMF correlation score. When we set the threshold at 0.5, the sensors with a higher correlation fall within room C2 – the room served directly by the EHP. As we relax the threshold, lowering it to 0.25 and 0.1, we see radial expansion from C2. The trace with the highest score, 0.522, corresponds to the lighting system in the same room. The two highest scores for this floor (i.e., 0.316 and 0.279) are the light and EHP traces from next door, room C1. Lower values correspond to sensors measuring activities in other rooms that have no specific relationship to the analyzed trace. The results show a direct relationship between IMF correlation and spatial proximity and support our initial hypothesis. EMD is useful for finding underlying behavioral relationships between traces of sensor data. However, when we set the timescales smaller than a day, the results were not as strong. The trace has to be long enough to capture the trend. For this dataset, the underlying trend is daily, so a significant number of samples over many days is required. Although this was a limitation for this dataset, it really depends on the underlying phenomenon that the sensors are measuring. That underlying trend is ultimately what EMD will be able to separate from the intrinsic modes of the signal.
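A minimal sketch of the scoring step, reusing imf_correlations from the earlier sketch. Here traces is assumed to be a mapping from sensor name to an equal-length array; ranking by the second score surfaces the spatially related traces.

import numpy as np

def score_all(ref, traces):
    """Two scores per trace: raw corrcoeff with the reference trace,
    and the mean of the IMF-wise corrcoeffs."""
    scores = {}
    for name, sig in traces.items():
        raw = float(np.corrcoef(ref, sig)[0, 1])
        avg_imf = float(np.mean(imf_correlations(ref, sig)))
        scores[name] = (raw, avg_imf)
    return sorted(scores.items(), key=lambda kv: kv[1][1], reverse=True)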

Figure 6.14: Map of the floor served by the analyzed EHP (room C2). The locations of the sensors identified as related by the proposed approach are highlighted, showing a direct relationship between IMF correlation and spatial proximity.

SDH Spatial Clustering Results

We conduct two sets of experiments. First, we quantify the sensitivity of our method for different threshold values and examine the effect of different time spans on the threshold. We then cluster the traces based on our threshold analysis and compare the result with a baseline approach using multidimensional scaling and k-means.

Baseline and Metrics

As a baseline, after we generate the two distributions described previously, we apply multidimensional scaling (MDS) to the corrcoeff matrix, in order to transform the original high-dimensional relative space to a 3-D space with an absolute origin, and then run the k-means clustering algorithm. We choose the true-positive rate (TPR, also known as recall) and false-positive rate (FPR) as metrics to evaluate the performance of our method against the naive approach, which correlates the raw traces. A true positive (TP) occurs when a sensor pair in a room is classified as co-located, while a false positive (FP) occurs when a sensor that is not in the room is classified as being so.
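These metrics reduce to simple threshold comparisons over the two corrcoeff populations, as in the following sketch.

import numpy as np

def tpr_fpr(intra, inter, thresh):
    """TPR/FPR for a given boundary threshold.
    intra: corrcoeffs for pairs known to share a room (positives)
    inter: corrcoeffs for pairs known to be in different rooms (negatives)"""
    tpr = float(np.mean(np.asarray(intra) >= thresh))  # recall on co-located pairs
    fpr = float(np.mean(np.asarray(inter) >= thresh))  # misclassified inter pairs
    return tpr, fpr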

Figure 6.15: Two populations are examined for our threshold analysis. A solid line connects sensors in the same room ("intra" pairs), while a dotted line connects pairs in different rooms ("inter" pairs).

Characterizing the Boundary

To corroborate our boundary-existence hypothesis, we first need to characterize the boundary between sensors in different rooms. We compute the pairwise correlation coefficients (corrcoeffs) between sensor traces in both of the populations depicted in Figure 6.15, over different time spans – ranging from one day to one month. After generating points over the different time spans for each room, we accumulate the corrcoeffs to obtain distributions, as shown in Figure 6.16, for each of the five rooms. The dashed vertical lines in Figure 6.16 represent an arbitrary threshold that partitions the distribution into two sets. Pairs of sensors to the right of the line are classified as being in the same room; pairs of sensors to the left are classified as being in different rooms. The CDFs in the left column show the distribution of corrcoeffs for pairs known to be in the same room, and the CDFs on the right show the distribution of corrcoeffs for pairs in different rooms. Note that in the figure we set the threshold to the same value on both the left and right sides, in order to observe the effect on the true/false positive rates. By adjusting the threshold, we obtain different TPRs/FPRs parameterized by the threshold. Figure 6.10 captures this tradeoff in a corresponding ROC curve. Figure 6.10 illustrates the TPR/FPR sensitivity to different threshold values for our method and the naive approach. A good cluster achieves a high TPR and a low FPR. As we vary the threshold, our approach achieves a TPR between 52%–93% and an FPR between 5%–59%. We can see that the average TPR for the ROC graph on the right is higher than for the ROC graph on the left. Moreover, the corresponding average FPR is lower on the right than on the left. In general, as the TPR rises, the FPR also goes up – a tradeoff exists between maximizing TPR and maintaining a low FPR. The “boundary” is represented as the corrcoeff that produces a “good” TPR with an “acceptable” FPR. In Figure 6.10, we choose 0.2 FPR as the boundary threshold. This point represents the largest difference between TPR and FPR – an acceptable tradeoff point. Looking at Figure 6.16, the 0.2 FPR corresponds roughly to the 80th-percentile correlation coefficient on the “inter” set (the set of CDFs on the right).
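A sketch of the threshold sweep that traces out the ROC curve, reusing tpr_fpr from the previous sketch; the selection rule (maximize TPR minus FPR) mirrors the tradeoff point chosen above.

import numpy as np

def sweep(intra, inter, grid=np.linspace(-0.2, 0.8, 101)):
    """ROC points over a grid of thresholds, plus the threshold
    maximizing TPR - FPR."""
    roc = [(t,) + tpr_fpr(intra, inter, t) for t in grid]
    best = max(roc, key=lambda r: r[1] - r[2])   # (thresh, tpr, fpr)
    return roc, best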


Figure 6.16: CDFs of correlation coefficients between IMFs of sensor feeds for the intra sets (left) and inter sets (right): the dotted lines mark a threshold that divides each distribution and produces a TPR and FPR.

The recall rate for each room – using the 80th-percentile corrcoeff as the threshold value – ranges between 62%–86%, and the threshold value falls into a narrow interval between 0.1 and 0.12. This shows that we are able to choose a uniform value for all the rooms regardless of the sensor type.

Convergence Over Time


Figure 6.17: The threshold values for all rooms converge to a similar value; we can derive a near-optimal value with as little as 14 days of data.

Using the threshold that the roughly 80th-percentile corrcoeff corresponds to in the distribution, we examine how it affects the classification rate across traces that span different lengths of time. Convergence and consistency across different time spans are critical to automating the parameter-selection process. Observe how the threshold values initially differ quite significantly in Figure 6.17. However, the threshold values gradually converge as the length of training data increases from one day to one month. The values derived after 14 days of data are approximately the same as the final convergence value (around 0.07). In other words, we can determine a threshold from two weeks of data.
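The convergence check amounts to recomputing the percentile threshold over growing training windows. In this sketch, inter_by_days is a hypothetical mapping from window length in days to the accumulated inter-room corrcoeffs for that window.

import numpy as np

def threshold_convergence(inter_by_days, q=80):
    """80th-percentile corrcoeff per training-window length; the values
    should flatten near the final threshold (about 0.07 here) once the
    window reaches roughly 14 days."""
    return {d: float(np.percentile(cc, q))
            for d, cc in sorted(inter_by_days.items())}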

Clustering Results

We cluster the sensor traces over the entire one-month period, and use the roughly 80th-percentile corrcoeff (0.07) as the boundary threshold. A sensor is classified into the cluster with the largest corrcoeff. The clustering result is shown in Table 6.4. A “1” means the sensor is classified as inside the corresponding room. In general, after obtaining the sensor clusters, we do not know which room each cluster corresponds to without further information, such as the metadata of the sensors. The labels “A–E” in Table 6.4 indicate the ground truth of where each sensor is physically placed, since we have such information. Overall, the classification accuracy is 93.3%. We do not cluster on the corrcoeffs obtained among raw signals because the 80th-percentile corrcoeff values do not converge across rooms. The reason that we are able to get such high accuracy, which seemingly differs from the statistics in Figure 6.16 and Figure 6.10, is that the statistics in the two figures are generated from the corrcoeffs accumulated over different time spans (the same intervals as in Figure 6.17), while the clustering here is performed on the corrcoeffs from the entire one-month period.

Sensor   A  B  C  D  E   Correct?
A1       1  0  0  0  0   X
A2       1  0  0  0  0   X
A3       1  0  0  0  0   X
B1       0  1  0  0  0   X
B2       0  1  0  0  0   X
B3       0  1  0  0  0   X
C1       0  0  1  0  0   X
C2       0  0  1  0  0   X
C3       0  0  1  0  0   X
D1       0  0  0  1  0   X
D2       0  0  0  1  0   X
D3       0  0  1  0  0   ×
E1       0  0  0  0  1   X
E2       0  0  0  0  1   X
E3       0  0  0  0  1   X

Table 6.4: Clustering result using the thresholding method: a “1” means the sensor is classified as inside the corresponding room. The “X” (correct) and “×” (incorrect) marks come from comparing the clustering results with ground truth.
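The assignment rule behind Table 6.4 is a one-liner per sensor. In this sketch, corr_to_room is a hypothetical mapping {sensor: {room: corrcoeff}} computed over the full month.

def assign_rooms(corr_to_room, thresh=0.07):
    """A sensor joins the room with its largest corrcoeff, provided
    that value clears the boundary threshold."""
    assignment = {}
    for sensor, by_room in corr_to_room.items():
        room, cc = max(by_room.items(), key=lambda kv: kv[1])
        assignment[sensor] = room if cc >= thresh else None  # None = unassigned
    return assignment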

Figure 6.18: Clustering with k-means on the corrcoeff matrix after applying multidimensional scaling (MDS): the EMD-based set achieves an accuracy of 80%, while the raw-trace set achieves only 53.3% classification accuracy.

To compare with our threshold-based method, we also cluster using a baseline approach. The pairwise corrcoeff for sensors in different rooms can be interpreted as a “distance” between them: a larger coefficient indicates a closer “distance,” and vice versa. However, since the distances between pairs are relative, we use multidimensional scaling [34] to find a common basis in three dimensions, re-map the relative distance metric (feature vector) into this three-dimensional grid, and use k-means to classify the traces. We set k equal to the number of rooms, since the goal of the approach is to verify spatial placement at room-level granularity. Generally, we believe that k should equal the number of rooms into which you wish to classify the sensors. The clustering results are shown in Figure 6.18. Ground truth is shown through different markers (x, o, +, star, box); each marker stands for one room. The cluster each sensor is assigned to is denoted with a number. The classification accuracy of the baseline approach on the corrcoeff matrix of re-aggregated IMFs is 80%. For raw traces, the baseline approach achieves an accuracy of only 53.3%.
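A minimal sketch of this baseline using scikit-learn; the conversion from corrcoeff to dissimilarity (1 minus the coefficient) is one natural choice, not necessarily the exact one used here.

import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

def mds_kmeans(corrcoef_matrix, n_rooms):
    """Embed the relative 'distances' in 3-D with metric MDS, then run
    k-means with k set to the number of rooms."""
    dist = 1.0 - np.asarray(corrcoef_matrix, dtype=float)
    np.fill_diagonal(dist, 0.0)               # a trace is distance 0 from itself
    coords = MDS(n_components=3, dissimilarity="precomputed",
                 random_state=0).fit_transform(dist)
    return KMeans(n_clusters=n_rooms, n_init=10,
                  random_state=0).fit_predict(coords)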

6.7 Categorical Verification Methodology

For categorical classification we use a very simple approach. For every trace, we partition the value range into 10 bins and take the average. We sort the bins, take the top two, and combine them with the average; this combination forms our feature vector with three components. We run our type analysis on three datasets from separate buildings. The first is from the University of Tokyo; it contains six types of sensors measuring power, pressure, temperature, CO2, light, and occupancy. The second is a deployment in Sutardja Dai Hall at UC Berkeley, which measures four different types: lumens, CO2, temperature, and humidity. Finally, we use a dataset from Soda Hall at UC Berkeley, which contains 23 different types. These datasets were chosen so that we could use their categorical information to verify our approach for classifying them according to statistical markers of categorical difference. All the data either comes from sensors embedded in a space or a sub-system taking a physical reading, or represents a set-point setting for an actuator that controls the environment. In some cases, we are easily able to separate the streams categorically, using simple statistical summaries, while other streams – particularly the ones not actually generated by a physical phenomenon (e.g., a temperature set point) – are statistically indistinguishable from their physical-measurement counterparts; their differences are semantic, not behavioral. We present our analysis and results in this section.
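The description of the feature vector is compact, so the sketch below is one plausible reading of it: histogram each trace's value range into 10 bins, keep the centers of the two most-populated bins, and append the trace mean.

import numpy as np

def categorical_features(trace, n_bins=10):
    """3-component feature vector under the interpretation above."""
    trace = np.asarray(trace, dtype=float)
    counts, edges = np.histogram(trace, bins=n_bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    top2 = centers[np.argsort(counts)[-2:]]   # two densest bins
    return np.array([top2[-1], top2[0], trace.mean()])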

6.8 Categorical Verification Results

Categorical verification works quite well using the simple method described in the methodology. Table 6.5 gives a summary of a pair of large traces with few categories. The simple methodology is able to separate them quite easily. It is clear that the mean and standard deviation provide enough information for the classifier to differentiate between the different categories of traces.

Building   No. Sensors   No. Types   Accuracy
Todai      400           3           89%
KETI       400           4           97%

Table 6.5: Categorical classification results for two data traces.

Figure 6.19: ASO versus AGN. There is a clear value-based boundary between the two sets of traces at around 3 in the mean.

The Soda Hall data traces were much more challenging to deal with. Soda Hall was a much larger trace, with 63 different categorical tags on the traces. Figure 6.19 shows the AGN vs. ASO categories. Because the metadata for these is not available, we do not have any semantic information about the meaning of the tags. Note that there is a boundary between mean values 3 and 4. Our classifier was able to correctly classify the types with relatively few samples, giving us over 99% classification accuracy in the range presented in the graph. Similar results were obtained for separating the values presented in Figure 6.20. There is a clean boundary observable at around a mean value of 100: every trace with a mean greater than 100 was labeled as a VR trace. The accuracy is 100%. However, note the temperature traces in Figure 6.21. The categorical clusters overlap significantly. We examined this distribution more closely by constructing a Gaussian mixture model and plotting the cluster centers. The results are presented in Figure 6.22. Note that the cluster centers are not easily distinguishable.

Figure 6.20: The VR traces span a wide range; however, any mean above 100 is a VR trace.

The classification accuracy varies, leading us to believe that these sets of traces are mostly indistinguishable. Moreover, our approach for this trace is very tightly tailored to the particulars of this trace. More exploration is necessary.
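A sketch of the mixture-model check using scikit-learn; comparing predicted components against ground-truth tags would additionally require aligning component indices to labels, which is omitted here.

import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_centers(features, n_types):
    """Fit a GMM with one component per claimed category and inspect
    the component means; near-coincident means (as with the temperature
    traces here) indicate categories the features cannot separate."""
    gmm = GaussianMixture(n_components=n_types, random_state=0).fit(features)
    return gmm.means_, gmm.predict(features)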

6.9 Related Work

The research community has addressed the detection of abnormal energy consumption in buildings in numerous ways [66, 67]. Rule-based techniques rely on a priori knowledge; they assert the sustainability of a system by identifying a set of undesired behaviors. Using a hierarchical set of rules, Schein et al. propose a method to diagnose HVAC systems [104]. In comparison, state-machine models take advantage of historical training data and domain knowledge to learn the states and transitions of a system. The transitions are based on measured stimuli identified through domain expertise. State machines can model the operation of HVAC systems [96] and make it possible to predict or detect the abnormal behavior of HVAC components [12]. However, the deployment of these methods requires expert knowledge, and they are mostly applied to HVAC systems. In [105], the authors propose a simple unsupervised approach to monitor the average and peak daily consumption of a building and uncover outliers; nevertheless, the misbehaving devices are left unidentified.

Figure 6.21: All temperature streams. Note that these are much more difficult to tease apart. A Gaussian mixture model can separate them with approximately 77% accuracy, but it may not generalize.

Using regression analysis and weather variables, the devices’ energy consumption is predicted and abnormal usage is highlighted. The authors of [17] use kernel regression to forecast device consumption, and devices that behave differently from the predictions are reported as anomalous. Regression models are also used with performance indices to monitor HVAC components and identify inefficiencies [128]. The implementation of these approaches in real situations is difficult, since it requires a training dataset and non-trivial parameter tuning. Similar to our approach, previous studies identify abnormal energy consumption using frequency analysis and unsupervised anomaly-detection methods. The device’s consumption is decomposed using the Fourier transform and outlier values are detected using clustering techniques [13, 123, 20]. However, these methods assume a constant periodicity in the data, and this causes many false positives in alarm reporting. We do not make any assumption about the device usage schedule; we only observe and model device relationships. Reducing a building’s energy consumption has also received a lot of attention from the research community. The most promising techniques are based on occupancy-model predictions, as they ensure that empty rooms are not needlessly over-conditioned. Room occupancy is usually monitored through sensor networks [3, 36] or computer network traffic [68]. These approaches are highly effective for buildings that have rarely-occupied rooms (e.g., conference rooms), and studies show that such approaches can achieve up to 42% annual energy savings.

Figure 6.22: Centers for our Gaussian mixture model. Note that 3 of the 5 centers are very close to each other, which makes these traces very difficult to tease apart and accurately classify.

SBS is fundamentally different from these approaches. SBS identifies the abnormal usage of any device rather than optimizing the normal usage of specific devices. Nevertheless, the two approaches are complementary, and energy-efficient buildings should take advantage of the synergy between them.

Recently, there has been increased interest in minimizing building energy consumption. Our approach differs quite substantially from related work. Agarwal et al. [5] present a parameter-fitting approach that fits a Gaussian occupancy model to the occupancy data using a small dataset. The model is then used to drive HVAC settings to reduce energy consumption. We ignore occupancy entirely in our approach; it appears as a hidden factor in the correlation patterns we observe. Kim et al. [68] use branch-level energy monitoring and IP traffic from users’ PCs to determine the causal relationships between occupancy and energy use. Their approach is most similar to ours: understanding how IP traffic, as a proxy for occupancy, correlates with energy use can help determine where inefficiencies may lie. In each of these studies and others [42, 84], occupancy is used as a trigger that drives efficient resource-usage policies. Efficiency when unoccupied means shutting everything off, and efficiency when a space is occupied means anything can be turned on. There is no question that this is an excellent way to identify savings opportunities; however, we take a fundamentally different approach. We are agnostic to the underlying cause or driver for efficient policies to be implemented. More generally, we look to understand how the equipment is used in concert. This may help uncover unexpected underlying relationships and can be used in an anomaly-detection application to establish “(in)efficient” and “(ab)normal” usage patterns. The latter should identify savings opportunities in cases where the space is unoccupied as well as occupied, because it has to do with the underlying behavior of the machines and how they generally work together. Our approach could help achieve both generality and scale for such an application. This chapter focuses on the first step of this application: the identification of correlated devices. SBS is a practical method for mining device traces, uncovering hidden relationships and abnormal behavior. We validate the efficacy of SBS using the sensor metadata (i.e., device types and location); however, these tags are not needed by SBS to uncover device relationships. Furthermore, SBS requires no prior knowledge about the building, and deploying our tool to other buildings requires no human intervention – neither extra sensors nor a training dataset is needed. SBS is a best-effort approach that takes advantage of all the existing building sensors. For example, our experiments revealed that SBS indirectly uncovers building occupancy through device use (e.g., the elevator in Building 2). The proposed method would benefit from existing sensors that monitor room occupancy as well (e.g., those deployed in [3, 36]).
Savings opportunities are also observable with as few as two monitored devices, and building energy consumption can be better understood after using SBS. SBS constructs a model for normal inter-device behavior by looking at usage patterns over time; thus, we run the risk that a device that constantly misbehaves is labeled as normal. Nevertheless, building operators are able to quickly identify such perpetual anomalies by validating the clusters of correlated devices uncovered by SBS. The inspection of these clusters is effortless compared to the investigation of the numerous raw traces. Although this kind of scenario is possible, it was not observed in our experiments. In this chapter, we analyze only the data at medium frequencies; however, we observe that data at the high frequencies and residual data (Figure 6.6) also permit us to determine the device type. This information is valuable for automatically retrieving and validating device labels – a major challenge in building metadata management.

There has been much research work on sensor stream clustering and trace analysis. Chen and Tu [23] investigate how to cluster data streams in real time using a density-based approach with a two-tiered framework. The first tier captures the dynamics of a data stream with a density-decaying technique and then maps it to a grid; the second tier computes a grid density based on how it clusters the grid. Their approach differs from ours in that they focus on decreasing algorithm complexity for real-time sensor stream clustering, whereas we run our analysis on historical traces and use correlation analysis in our clustering algorithm. Kapitanova et al. [64] describe a technique to monitor sensor operations in the home and identify sensor failures. The classifier is trained on historical sensor data to obtain the relationships between sensors, assuming the number and location of sensors is known. When a failure or removal of a sensor occurs, the classifier’s behavior deviates and the event is captured. Our method does not require any prior knowledge and instead tries to cluster feeds to discover their relative placement.

Lu and Whitehouse [77] formulate a new algorithm, particularly leveraging the semantic constraints interpreted from sensor data to determine sensor locations. The algorithm identifies how many rooms are present using motion sensors and determines room position based on physical constraints. Finally, it maps each sensor into the associated room. Our efforts focus on using intrinsic patterns typically pre-existing in building system sensor feeds to uncover physical relationships. Fontugne et al. [40] propose a new method to decompose sensor signals with EMD. They extract the intrinsic usage pattern from the raw traces and show that sensors close to each other have higher intrinsic correlation. However, they do not explore the observation more deeply by answering whether there is a statistically discoverable boundary between sensor clusters in different rooms, or whether there is a uniform threshold in the correlation coefficients that can be generalized to different rooms. Fontugne et al. [41] carry on this work and propose an unsupervised method to monitor sensor behavior in buildings. They construct a reference model out of the underlying patterns obtained with EMD and use it to compare future activity against. They report an anomaly whenever a device deviates from the reference.
This work exploits EMD as a method to detrend the signals and capture the inter-device relationships. Much work utilizes EMD on medical data [7], speech analysis [55], image processing [86], and climate analysis [74]. Our method adopts EMD to determine whether a discoverable statistical boundary exists between traces from sensors in different rooms and whether such a boundary can be generalized across rooms with various kinds of sensors.

6.10 Summary

This chapter aims to establish a set of methodologies for classifying traces and verifying that relationships specified by users are accurate and continue to stay accurate over time. We examine a set of buildings that are representative of the “commercial” building category within the building taxonomy used by the U.S. Department of Energy [117]. We also examine a building with a drastically different climate and HVAC architectural design. The building at the University of Tokyo is representative of a typical modern building in Asia, with its distributed HVAC design and low energy footprint per square foot. We present empirical techniques to identify abnormalities in device power traces and inter-device usage patterns. We proposed an unsupervised method to systematically detect abnormal energy consumption in buildings: the Strip, Bind, and Search (SBS) method. SBS uncovers inter-device usage patterns by stripping dominant trends off the devices’ energy-consumption traces. It then monitors device usage and reports devices that deviate from the norm. Our main contribution is an unsupervised technique to uncover the true inter-device relationships that are hidden by noise and the dominant trends inherent to the sensor data. SBS was used on two sets of traces captured from two buildings with fundamentally different infrastructures. The abnormal consumption identified in these two buildings is mainly energy waste; the most important instance is a competing heater and cooler that caused the heater to waste around 2500 kWh. EMD allows us to effectively identify fundamental relationships between sensor traces. We believe that identifying meaningful usage-correlation patterns can help reduce oversights by the occupants and faults that lead to energy waste. A direct application of this is the identification of simultaneous heating and cooling [85]. Simultaneous heating and cooling occurs when the heating and cooling systems either compete with one another or compete with the incoming air from outside. If their combined usage is incorrect, there is major energy waste. This problem is notoriously difficult to identify, since the occupants do not notice changes in temperature and building management systems do not perform cross-signal comparisons. For future work, we intend to run our analysis on the set of sensors that will allow us to identify this problem: the outside temperature sensors, the cooling-coil temperature, and the air-vent position sensor. If their behavior is not correlated as expected, an alarm will be raised. We can also apply the method to other usage scenarios. In our traces, we found an instance where the pump was on but the lights were off, where typically they are active simultaneously: the air conditioning was pumping cool air into a room without occupants. With our approach this could have been identified and corrected. In future work, we intend to package our solution to serve these kinds of applications. In this chapter we also set out to examine the underlying relationships between sensor traces to find interesting correlations in use. We used data from a large deployment of sensors in a building and found that direct correlation analysis on the raw traces was not discriminatory enough to find interesting relationships. Upon closer inspection, we noticed that the underlying trend was dominating the correlation calculation. In order to extract meaningful behavior, this trend has to be removed.
We show that empirical mode decomposition is a helpful analytical tool for detrending non-linear, non-stationary data, attributes inherent to our traces. We ran our correlation analysis across the IMFs extracted from each trace by the EMD process and found that the pump and light that serve the same room were highly correlated, while the other pump was not correlated with either. In order to corroborate the applicability of our approach, we compared the pump trace with all 674 sensor traces and found a strong correlation between the relative spatial position of the sensors and their IMF correlations. The most highly correlated IMFs were serving the same area in the building. As we relax the admittance criteria, we find that the spatial correlation expands radially from the main location served by the reference trace. We plan to examine the use of this method in applications that help discover changes in underlying relationships over time in order to identify opportunities for energy savings in buildings. We will use it to build inter-device correlation models and use these models to establish “(ab)normal” usage patterns. We hope to take it a step further and include a supervised learning approach to distinguish between “(in)efficient” usage patterns as well. From the results illustrated in Figure 6.16, we observe a bi-modality in the corrcoeff distribution for the two population sets. Sensors in the same room correlate with each other more strongly (typically a corrcoeff of 0.4 or higher) than sensors in different rooms. This bi-modal distribution may provide insight for understanding the boundary and searching for an effective discriminator more broadly. To further validate the effectiveness of the proposed method, we should consider using data from different sources. For example, in room B in Sutardja Dai Hall, there are two different sets of temperature sensors reporting data at different rates and granularities. We demonstrate our ability to classify sensor streams on the same platform (recall the sensor box we used to collect data). It would be more convincing to verify the effectiveness of our method with sensor streams generated from devices on different systems – since separate systems are independent. For instance, we can use temperature data from the second deployment and the CO2 and humidity sensor data from the first deployment, and compare the results to what we have gathered. In our results, the boundary threshold parameter converges to a narrow interval as the dataset expands over a longer time range. This may suggest that our method generalizes across rooms in a building, although further validation on a larger, more representative dataset is necessary. This study looked at 5 different rooms with a large physical separation from one another. A more representative dataset would consider all the rooms and pay special attention to rooms that share a common orientation and are separated by a single wall or floor slab. We conjecture that local activity modulates various types of physical signals – captured by the various kinds of physical sensors embedded throughout the building – and that those signals are attenuated over distance and by physical boundaries (such as walls). We believe that this is what drives our observations. If the conjecture is true, the effects will be less pronounced in larger rooms, such as an auditorium or a large laboratory space.
As our approach performs slightly better than traditional learning techniques, we must further evaluate its robustness versus the baseline method, across the entire building and across multiple buildings. In future work, we will examine the two approaches across larger intra-building datasets and compare results across multiple buildings. A key factor is the variance of classification accuracy – smaller variance demonstrates robustness. We present a new method for spatial-placement clustering. We first characterize the corrcoeff distributions of medium-frequency IMFs between sensors in the same and different rooms, and then we learn the tradeoff between achieving a higher TPR and maintaining a lower FPR by manipulating a discriminator parameter within these two distributions. For a preliminary sample of relatively well-separated rooms, we find that there is a clear boundary between sensor clusters in terms of their spatial placement and that the boundary can be probed statistically. We also find that a uniform discriminator can be learned and generalized across these rooms. For this initial study, our method is able to classify the sensors with 93.3% accuracy, which is 13% higher than a traditional k-means approach, with a TPR between 62%–86% and an FPR less than 20%. These results are very encouraging. However, we recognize that they are far from definitive. While the rooms in the study were picked arbitrarily, they are neither comprehensive nor a systematic sampling. While they are clearly separated by our approach, and not by analyses of the raw time series, they do differ substantially in placement and usage. A key question going forward is, “how well will highly similar rooms be separated?” – say, adjacent rooms facing the same side of the building and with similar occupancy. Will these techniques hold, will more powerful techniques be required, or is further discrimination intractable? In future work, we will examine how far this method takes us and explore how it may be used in combination with other techniques to improve the results more generally. Automated metadata verification is important to include in the lifecycle of building data management. We also attempt to address the categorical classification problem. With fairly simple approaches we can use the mean and standard deviation of a trace to classify its category, as labeled by the user. However, for large traces with many overlapping categories, we observed that the traces are very similar and cannot be distinguished. In order to tell them apart, we may need out-of-band information; statistically, they are indistinguishable with the techniques we present.

Chapter 7

Conclusions

Our experience with the design, implementation, and widespread deployment of StreamFS teaches us several lessons about the mechanisms necessary to “app-ify” the building and the value of opening up the building as a platform for application development, in the interest of 1) reducing energy consumption and increasing operational efficiency, 2) providing deeper insights about the operational dynamics of the building, and 3) enabling the building to participate in a broader software ecosystem for more intelligent use of resources.

7.1 Lessons Learned

In this section we discuss the lessons learned from deploying our system. We present five lessons based on our experience with the design and implementation of StreamFS, the design and implementation of functional, spatial, and categorical verifiers, and seven different deployments and applications.

Data Services Fundamental for Building Applications

Fundamentally, applications in the building must deal with data coming from a distributed, diverse deployment of sensors taking physical measurements. The vast majority of building applications are analytical in nature, and as we couple streaming sensor data with models for analysis and control, more data services and jobs will be used across several applications simultaneously. For example, the loadcurve process in StreamFS was used in both the Energy Lens application and a building dashboard viewer that runs on StreamFS. We implemented and deployed seven applications on top of StreamFS. Some applications provide instance management for StreamFS deployments, while others provide services to the end user. For example, the viewer console application presents time-series query results and displays them to the user, simple control applications trigger cascading re-activation sequences of home appliances when total energy consumption rises above a user-defined threshold, and the Energy Lens application provides the user with aggregate statistics based on the spatial configuration of the deployment. All these applications share a small set of processing elements: one removes statistical outliers, another interpolates values, one computes the aggregate load curve, and another computes a moving average of consumption. Because StreamFS provides the ability to define a function once, it can be used across applications. The hierarchical model-prediction application presented in Chapter 2 requires similar cleaning and aggregation capabilities before the data is fed into the model. For jobs that require prediction or machine learning, the same processing is necessary.
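The payoff of shared processing elements can be illustrated with a small Python stand-in (StreamFS's actual process elements are JavaScript functions, as in Appendix A; the stage names here are illustrative): each element is defined once and per-application pipelines are composed from the same parts.

import numpy as np

def drop_outliers(x, k=3.0):
    x = np.asarray(x, dtype=float)
    mu, sd = x.mean(), x.std()
    return x[np.abs(x - mu) <= k * sd]        # simple k-sigma outlier filter

def moving_average(x, w=4):
    return np.convolve(x, np.ones(w) / w, mode="valid")

def pipeline(x, *stages):
    for stage in stages:                      # each stage is defined once,
        x = stage(x)                          # shared by every application
    return x

# e.g., both the Energy Lens and a dashboard viewer could reuse:
# cleaned = pipeline(raw_readings, drop_outliers, moving_average)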

Analytics is Dependent on Inter-relationships

In StreamFS and the associated applications, we demonstrate the value of coupling the inter-relationships explicitly with our aggregation jobs. Many jobs require that the streams be fetched in the context of the metadata of the deployment and what the metadata captures about the placement or category of the stream. We find it convenient to couple these associations explicitly, since many applications require them. We find the use of aggregation points to be a convenient tool for analysis, both from a historical perspective and from a live-feed perspective. Moreover, queries for determining which sensors are where and how many sensors there are in a particular space are crucial for achieving generality across buildings – a property that is rare to find in any aspect of building design and operation. Providing the ability to traverse the relationship structure and discover, through the structure, the relationships between the sensors allows applications from one building to be dropped into a new building more easily. This capability allows users to write applications for one building and port them across many buildings.

Centralized Management is Better than Distributed Management

StreamFS adopts the Unix philosophy where everything is represented as a file. This makes it simple to manage the raw and application-specific derivative data. It also simplifies sharing of processing code across applications. In addition, access control is provided from a central location. Application writers decide which streams and processes they want to share – from the individual resource level to groups of resources organized as a collection within a container file. Generally, when there are many disparate devices that make up a system, there is a tension between extensibility and ease of management. Large deployments typically provide protocols for joining the network, but do not yield systems that are easy to manage. Those that are easy to manage typically do not handle other kinds of sensors or devices easily joining the deployment. When the deployment is presented as a distributed system to the end user, there are usually complex mechanisms in place for doing discovery. Discovery over a centrally managed system is much easier, since it is clear where to target the search. The argument against centralized management is a single point of failure; however, the web-services model deals with centralized services well. StreamFS presents a centralized service to applications, while internally it is actually distributed: sharded for load management and replicated for failure tolerance. StreamFS follows these principles throughout the design of each of its components. Similar to the proposal by Dixon et al. [33], we believe that the abstraction should be raised from the network to the operating-system level. The Unix philosophy allows us to present a unified namespace and access to the building deployment information at various levels of granularity. This allows us to manage many applications simultaneously, control information sharing between them, and provide a unified view to the application writer, making it clear what resources are available to her. We believe this centralized view is the right one. Although StreamFS is a distributed system in the cloud, it presents a unified layer for access and control, which makes application development and management simpler.

Accurate Capture of Physical Configuration is Crucial

As we move towards software-defined building infrastructure, and more analytical and control applications rely on building state to make detailed control decisions, it becomes important to accurately capture the physical configuration of the building and provide mechanisms for discovering inconsistencies between the virtual representation and the corresponding physical state. There is a lot of work in the vision community on accurately constructing a virtual representation of the physical world. In our work, we show ways the data already being collected from the building sensor deployment can be used to ascertain physical relationships between sensors. We introduce three kinds of verification – functional, spatial, and categorical – however, there are other ways we could explore these physical relationships. For example, we could combine these techniques with vision-related work, which uses cameras to determine the physical layout of a space, together with physical models of heat transfer, to determine whether the readings we are collecting are accurate. Ultimately, these should inform the layer presented to the application about the accuracy of the deployment information. We explain how inaccuracies lead to errors in aggregation and control. For software interfaces in the built environment to become more widespread, we must solve these issues related to verification.

OS Abstractions for Managing Truly Physical Resources

For the jobs of organizing and controlling access to physical resources, the use of operating-system abstractions allows us to reason about how to componentize a management architecture and lets us frame how solutions can be constructed as applications that make use of the primitives provided by the operating-system components. In our work we presented a filesystem abstraction that adopted several data and management services for organizing the information in the building. Related work in the home [33] and in buildings [29] takes a similar approach.

Questions remain about how building models fit into an operating-system services architecture. Perhaps they can be included with the verification services. Ideally, we can start moving towards automated plug-and-play building applications with guarantees for service quality and efficient management of physical resources. We believe that the best way to make this a reality is to view the building as a hardware platform and to move towards a truly distributed operating system for managing the systems, devices, and applications that run on the building platform.

7.2 Future Work

We explored many aspects of a design for building information systems. However, there are still many open questions that have not been tackled in this work. We discuss three main future or ongoing project topics in this section.

Explore More Diverse Verification Methods

There is a lot of work in capturing the physical state of the environment and building a virtual representation of it. Although SBS shows a lot of promise in terms of its effectiveness and generality, spatial verification shows poor generality under certain conditions and categorical classification seems to only work with small datasets. We look to expand our exploration in the two pieces of work that showed fractional success. We must characterize the conditions under which these verification approaches work well and formulate algorithms to detect whether those conditions hold in the data prior to initiating those verification processes. We can partition the characterization to discover the statistical or semantic properties or pieces of information that must be known about the deployment beforehand, and tackle those problems only. For the others, we must explore different techniques that generalize better. Although it is valuable to solve the problem for a small set of homogeneous buildings, the real value comes through generalization, since in order to have widespread impact, solutions must be brought to a large fraction of the building stock, quickly.

Deeper Exploration of Control Applications

We did not get a chance to support the kind of control application proposed at the beginning of the dissertation, so we were not able to experiment with various kinds of actuator interfaces and integrate them more generally into the architecture. We intend to expand our control-application work by implementing model-predictive control processes on building deployments that use StreamFS. We also need to closely examine a diverse set of control interfaces in order to generalize the control interface exposed through StreamFS.

We only explored the class of controllers that consume binary signals. Specifically, we integrated with the ACme [62] wireless power meter to turn devices on and off remotely. There are other kinds of controllers that do not accept binary input: variable-input controllers, set-point-driven controllers, parameterized controllers, time-based controllers, etc. For example, controllers based on physical models may use physical-configuration-based parameters to determine how to drive the load for the system controlling the space. These must be considered, and applications should be explored in this context, before a determination can be made.

Version Control for Buildings

Provenance checking is important in many systems. The building has many actors interacting with it, and distributed changes cause once-efficient configurations to slowly deviate back to an inefficient state. Version control would allow us to track changes in the deployment and the associated configuration decisions. It also introduces the notion of state rollback, whereby we roll the associated state information back to a previous, safe, efficient state. We would like to explore how rollbacks manifest themselves in the physical environment, since a rollback involves not just restoring the settings, register values, processes, etc., but also affects how the physical resources in the environment are activated. Also, certain conditions cannot be rolled back, such as the weather, so a rollback operation needs to check whether the proper rollback conditions are in place before the rollback is committed and executed. In addition, how do we determine commit conflicts between two committers? In a traditional version control system, conflict detection is based on the actual data being written to the file. In the building context, conflicts must be determined by models – models based on first principles of the underlying physics, or statistical models that use several empirically derived parameters to project the state of the system at some point in time. For example, if two commits are made to change the set point of a thermostat, how do we determine that the set points will conflict and how do we resolve them? Both commits are writing to the same device, so that cannot be the sole criterion. Any approach must consider how the set point affects the state of the room temperature that is controlled by that setting, or the behavior of the air-handling unit when commits are happening quickly. Either involves the use of a model to determine what the correct behavior is and to project whether the new set point will lead to the right behavior. We look to explore these and other related concepts further and hope that this leads to smart, software-defined buildings.

7.3 Thesis Summary

This thesis examines the state of the art of building information systems and evaluates their architecture in the context of emerging technologies and applications for deep analysis of the built environment. With the increasing interest in attaining a deeper understanding of the operational dynamics of buildings and an increased interest in energy efficiency, we assert that the only way to enable the wide spread of solutions across the entire stock of buildings is to build a platform that “app-ifies” the built environment. We observe that modern building information systems hinder widespread development of building applications. They are difficult to extend, do not provide general services for application development, do not scale, and are difficult to set up and manage. We assert that a new architecture must be designed with four system properties – extensibility, generalizability, scalability, ease of management – in order to address these shortcomings. This new architecture embodies these system principles through a filesystem abstraction and a set of data services. We decompose all deployment data and metadata into three types of data – structural inter-relationships and naming, attribute-value descriptive pairs, and timeseries data – and expose them through a filesystem abstraction. We adopt the Unix philosophy that “everything is a file” in a system called StreamFS. We introduce four types of files – containers, streams, controllers, and special – that can be arbitrarily composed to reflect information about the physical configuration of the sensors and actuators relative to the systems and spaces in the building. The filesystem makes these physical resources directly accessible, in a centralized and controlled fashion. We also adopt filesystem security that mediates access between the client-side application and the sensors in the deployment. Because containers can be used to group files together, the security layer can also explicitly mediate access to collections of files that represent physical resources. StreamFS also provides a number of data services, made available to applications through the pipe abstraction. Applications can designate a set of streams to flow through user-defined process elements. The outputs of process elements are made available through the filesystem as stream files themselves, allowing applications to construct complex processing pipelines. We also leverage the relationship structure to inform common processing pipelines to support OLAP-style aggregation queries. We describe how the underlying entity-relationship graph (ERG) is used to support the notion of aggregation points. We show how this graphical structure and its associated mechanisms map to a traditional OLAP cube. Portions of the hierarchy that have enabled aggregation points can then support drill-up/drill-down, pivot, and slice-and-dice queries. We deploy StreamFS in seven different buildings with very different settings and wrote several applications for it as well. One of the driving applications is the Mobile Energy Lens. The Energy Lens app provides occupants with mechanisms for collecting building information in a unified platform and provides a way to view aggregate energy consumption associated with the spatial deployment of plug-load devices. We present a three-layer application architecture, where one of the main layers is implemented entirely through StreamFS data management and data processing services.
We also discuss the challenges that had to be overcome in order to provide a better user experience, and discuss how each of the mechanisms used to solve those problems was also implemented on top of StreamFS. Finally, we introduce the notion of verification of physical relationships through empirical data. We characterize and motivate the problem in buildings as a fundamental one that must be solved as we move toward software-based control of the built environment.

We partition the verification problem into three sub-problems: 1) functional verification, 2) spatial verification, and 3) categorical verification. We show how empirical mode decomposition, correlation, and simple machine learning techniques can give us information about how the sensors are related to each other, physically. We propose the use of this information to verify the manually specified physical configuration settings. Through our deployments we demonstrate the importance of specific features and, overall, we demonstrate an extensible, generalizable, scalable, and easy-to-manage system for supporting the “appification” of the built environment. There are still many questions to explore more deeply, but we believe that StreamFS and verification provide a solid foundation on which to continue our exploration.

Appendix A

StreamFS Process Code

Listing A.1: Load curve code used to generate aggregate load curves in the Energy Lens application. (Portions of the listing were lost in extraction; elided bodies are marked and reconstructed loop bounds are noted.)

function(buffer, state){
  var outObj = new Object();
  var timestamps = new Object();
  outObj.msg = 'processed';
  if(typeof state.slope == 'undefined'){
    state.slope = function(p1, p2){
      if(typeof p1 != 'undefined' && typeof p2 != 'undefined' &&
         typeof p1.value != 'undefined' && typeof p1.ts != 'undefined' &&
         typeof p2.value != 'undefined' && typeof p2.ts != 'undefined'){
        if(p1.ts == p2.ts)
          return 'inf';
        return (p2.value - p1.value) / (p2.ts - p1.ts);
      }
      return 'error:undefined data point parameter';
    };
    state.intercept = function(slope, p1){
      if(typeof p1 != 'undefined' &&
         typeof p1.value != 'undefined' && typeof p1.ts != 'undefined'){
        return p1.value - (slope * p1.ts);
      }
      return 'error:undefined data point parameter';
    };
  }
  if(typeof state.multibuf == 'undefined'){
    state.multibuf = new Object();
  }
  outObj.inputs = new Array();
  var noted = new Object();
  // Gather the incoming data points per stream.
  for(i = 0; i < buffer.length; i++){   // loop bound reconstructed
    // ... (body elided in the source)
  }

  var ts_per_stream = new Object();
  if(streamids.length >= 2){
    // Index each stream's points by timestamp.
    for(j = 0; j < streamids.length; j++){   // loop bound reconstructed
      // ... (body elided in the source)
    }
  }

  var cleaned = new Object();
  for(j = 0; j < streamids.length; j++){   // loop bound reconstructed
    // ... (elided) For each expected timestamp, linearly interpolate a
    // value when the stream has no reading at that time:
    if(typeof ts_per_stream[this_streamid][timestamp] == 'undefined'){
      var p1 = dpts[0];
      var p2 = dpts[dpts.length - 1];
      var slope = state.slope(p1, p2);
      if(slope != 'inf' || slope.indexOf('error:') < 0){
        var intercept = state.intercept(slope, p1);
        var newdpt = new Object();
        newdpt.ts = timestamp;
        newdpt.value = (slope * timestamp) + intercept;
        cleaned[this_streamid].push(newdpt);
      } else {
        outObj.slope = slope;
      }
    } else {
      for(idx = 0; idx < dpts.length; idx++){   // loop bound reconstructed
        // ... (body elided in the source)
      }
    }
  }

  // Sum the cleaned, time-aligned points into the aggregate load curve.
  var loadcurve = new Array();
  var cleaned_keys = Object.keys(cleaned);
  var pts_per_key = cleaned[cleaned_keys[0]].length;
  for(ts_idx = 0; ts_idx < pts_per_key; ts_idx++){   // loop bound reconstructed
    // ... (remainder of listing elided in the source)
  }
}

Appendix B

StreamFS HTTP/REST Tutorial

This tutorial is meant to help you get started quickly with StreamFS. StreamFS integrates all kinds of sensor data and organizes it for easy access, processing, and integration with external applications. In this tutorial we will go through creating and deleting files in StreamFS, as well as accessing stream data from incoming data streams.

• Creating a resource

• Creating a stream file

– Streaming data through a stream file
– Bulk data insertion

• Bulk file creation

• Queries

• Subscriptions

• Symlinks

• Move

• Stream Processing

– Creating a process file
– Starting the process
– Viewing the output
– Stopping the process

B.1 Terminology

container/default file: A logical object that represents a physical or logical entity. Example: /hvac/heater refers to a heater in the HVAC system.

stream file: A file that represents a data stream. Data is pushed into the stream file and queried through the same file path. Example: /room1/temperature/ is the name for a temperature stream coming from room1.

control file: A file that represents a control channel for an associated actuator. Example: /hvac/heater/switch refers to the switch for the heater; writing a 1 to that file sends an 'ON' signal to the heater, 0 is 'OFF'.

subscriber: An external target URL to which data is forwarded as it comes into StreamFS. Subscriptions are used to process incoming data in real time in external applications. Example: /subs/550e8400 represents a subscription file created after a subscription request is satisfied. The id is a unique id for the subscriber; it can be used to manage the subscription (to see which streams are pushing data to which URLs) and its deletion removes the subscription.

Table B.1: Terminology.

B.2 Creating a resource

A clean installation of StreamFS comes with a set of core resources described in the StreamFS documentation. The resource to start with is /. To make sure that StreamFS is up and running, do a GET on that resource. You should receive a reply with some information about the instance as well as the child resources of that resource. Observe the example below:

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json

Connection: close
Date: Thu, 19 Jan 2012 12:44:54 GMT

{ "status":"success", "children":[ "ibus", "sub", "resync", "pub", "time", "models", "admin", ], "uptime": 890, "uptime_units":"seconds", "activeResources": 37 }

Now let's create a simple file that we'll use as our working directory for the tutorial. Create a temporary directory where you want to save the file you'll create for this tutorial. Open a text file and copy-paste the json below (not the PUT line) into the file, then use curl to PUT the document to the StreamFS server.

echo "{\"operation\":\"create_resource\",\
\"resourceName\":\"temp\",\"resourceType\":\"default\"}"\
> create_def.json

This creates a text file with json in it that specifies the type of resource file you want to create and what its name is.

curl -i -X PUT "http://localhost:8080/" -d@create_def.json

The -d parameter in curl specifies the data portion of the PUT request; in this case we're sending the request to the root directory of the running StreamFS instance. If created successfully, you should get a reply that looks like the following:

HTTP/1.1 201 Created
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Thu, 19 Jan 2012 12:35:25 GMT

Notice that it's a '201 Created' HTTP status. That means StreamFS was able to create a default resource for you under /. Now let's check to make sure

it's there.

curl -i "http://localhost:8080/"

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Thu, 19 Jan 2012 12:44:54 GMT

{ "status":"success", "children":[ "ibus", "sub", "resync", "pub", "time", "models", "admin", "temp", ], "uptime": 990, "uptime_units":"seconds", "activeResources": }

Notice that when we issue the same GET request to the root directory in StreamFS, we see “temp” in the children array associated with /. Now we can treat the “temp” resource as a directory and work there (i.e., all requests for resource creation, deletion, etc., will be forwarded to that file).

curl -i "http://localhost:8080/temp"

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Thu, 19 Jan 2012 12:50:12 GMT

{ "status":"success", "type":"DEFAULT", "properties": {}, "children": [] APPENDIX B. STREAMFS HTTP/REST TUTORIAL 142

}

As we populate this directory, the names of the newly created files will show up in the “children” array. Now, let's create a stream file that represents a real-time data stream. We'll create it as a child of the “temp” folder.

B.3 Creating a stream file

This process is very similar to creating a regular file. Let's create another json file with the command to create a stream file.

echo "{\"operation\":\"create_generic_publisher\",\
\"resourceName\":\"stream1\"}" > create_stream.json

The operation that we're running this time is to create a generic publisher. In StreamFS, that's understood to mean a stream file. Once the file is ready, let's send it to StreamFS, but this time let's send it to the directory we created in the previous section:

curl -i -X PUT "http://localhost:8080/temp" -d@create_stream.json

HTTP/1.1 201 Created
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Thu, 19 Jan 2012 12:58:31 GMT

{ "status":"success", "is4_uri":"/temp/stream1", "PubId":"789cf943-bbc8-428e-97ce-03e7cfe5fc12" }

The reply is shown above. You should receive explicit confirmation from StreamFS that the stream file was created, and you should note the associated publisher ID. The publisher identifier is used when data is pushed to this resource: the underlying stream uses it when it POSTs data to this new file.

B.4 Pushing data to a stream file

In order to push data to the stream file, we have to use the pubid associated with the stream. We can either copy-paste it from the response we received when we created the file, or we can simply call GET on the file in StreamFS to obtain it:

curl -i "http://localhost:8080/temp/stream1"

HTTP/1.1 200 OK Transfer-encoding: chunked Content-type: application/json Connection: close Date: Thu, 19 Jan 2012 13:31:55 GMT

{ "status":"success", "pubid":"789cf943-bbc8-428e-97ce-03e7cfe5fc12", "head": {}, "properties": {} }

Now that we have it noted, let's create a fake data object. For simplicity, we leave out any complicated information other than the value that we wish to save. The units are omitted for now; I'll explain why at the end.

echo "{\"value\":123}" > datapt.json

Finally, we POST the newly created file to the stream file as follows. To inform StreamFS how to save the data correctly, we need to include the “type” and “pubid” as URL parameters. The type is always equal to “generic” and the pubid is set to the pubid we noted earlier. If either is missing, or the pubid does not match the pubid associated with this stream, the POST will fail.

curl -i -X POST \
  "http://localhost:8080/temp/stream1?type=generic&pubid=789cf943-bbc8-428e-97ce-03e7cfe5fc12" \
  [email protected]

HTTP/1.1 200 OK Transfer-encoding: chunked Content-type: application/json Connection: close Date: Thu, 19 Jan 2012 13:35:25 GMT

{"status":"success"}

If successful, you should get the response above. This response can be used by the publisher as an acknowledgement that the data has been successfully saved. We can also check by calling GET on the stream file again and looking at the “head” attribute. The “head” attribute is the last received data object saved and the timestamp associated with it.

curl -i "http://localhost:8080/temp/stream1"

HTTP/1.1 200 OK
Transfer-encoding: chunked

Content-type: application/json Connection: close Date: Thu, 19 Jan 2012 13:36:23 GMT

{ "status":"success", "pubid":"789cf943-bbc8-428e-97ce-03e7cfe5fc12", "head":{ "value": 123, "ts": 1326980179 }, "properties": {} }

Notice the “properties” attribute in the GET response to all resources. This is where the user can place arbitrary information about the object this file represents. For streams, however, the “units” attribute in the properties object is of particular importance, and we'll discuss its importance later. For now, I'll show you how to update the properties. Recall from the creation of a fake data point that we did not specify the type of data (i.e. the units of measurement). Let's create a simple json document with the “properties” attribute defined as a json object with the “units” attribute. Below, I create a properties object with “psi” (pressure) units.

echo "{\"operation\":\"overwrite_properties\",\
\"properties\":{\"units\":\"psi\"}}" > overwrite_props.json

Then we POST it to the stream file.

curl -i -X POST "http://localhost:8080/temp/stream1" -d@overwrite_props.json

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Thu, 19 Jan 2012 22:52:57 GMT

{"status":"success"}

If successful, we should get the preceding reply, and we can check to make sure that everything is set up correctly.

curl -i "http://localhost:8080/temp/stream1"

HTTP/1.1 200 OK
Transfer-encoding: chunked

Content-type: application/json Connection: close Date: Thu, 19 Jan 2012 22:53:07 GMT

{ "status":"success", "pubid":"789cf943-bbc8-428e-97ce-03e7cfe5fc12", "head":{ "value": 123, "ts": 1326980315 }, "properties":{ "units":"psi" } }

We can set any properties on the object, as long as it's a valid json object. The only attribute currently reserved on the properties object is “units”. The fields added here are used for properties-related queries: they effectively serve as arbitrary tags on the files, so that we can find specific files later without traversing the entire tree.
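For example, a richer properties object might tag a stream with its location as well as its units, so that either field can later be matched by a properties query. The “room” and “building” field names below are illustrative, not reserved by StreamFS:

{
  "operation":"overwrite_properties",
  "properties":{
    "units":"psi",
    "room":"410",
    "building":"soda"
  }
}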

B.5 Bulk data insertion

For efficiency, we can also push multiple values in a single request as shown below:

{ "path":"/temp/stream1", "pubid":"789cf943-bbc8-428e-97ce-03e7cfe5fc12", "data":[{"value":0},{"value":1},{"value":2,"ts":1347307033198},{"value":3, "ts":1347307033199}] }

Note that each element in the data array includes at least the ‘value’ field. The ‘ts’ field is optional; if it is not included, StreamFS will add it during processing.
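A sketch of how such a payload might be assembled in Node.js, assuming hypothetical readings sampled once per second; note that ‘ts’ here is in milliseconds, matching the example above:

// Hypothetical readings sampled once per second.
var readings = [0, 1, 2, 3];
var now = Date.now();  // milliseconds, as in the 'ts' values above

var bulk = {
  path: '/temp/stream1',
  pubid: '789cf943-bbc8-428e-97ce-03e7cfe5fc12',
  data: readings.map(function(v, i){
    // Attach an explicit timestamp; omit 'ts' to let StreamFS add one.
    return { value: v, ts: now + (i * 1000) };
  })
};
console.log(JSON.stringify(bulk));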

B.6 Queries

This section will show you how to run queries on the data and queries on the properties that are set on the files.

Timeseries

Let's start with a simple query that is timeseries in nature. We'll run a query that returns any data point that was saved by this stream file in the last 10000 seconds. We'll go over the query syntax in the next section, but for now, run the following:

curl -i "http://localhost:8080/temp/stream1?query=true&ts_timestamp=gt:now-10000"

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Thu, 19 Jan 2012 13:52:39 GMT

{ "path":"/temp/stream1/", "ts_query_results":[ { "value": 123, "ts": 1326980315 } ], "props_query_results":{ "errors":[ "Empty or invalid query" ] } }

The “ts_query_results” is really what we care about here. It's an array of json objects, where each json object is the object that was saved in the underlying database holding data for this stream. Notice that the timestamp is also included. We can obtain much larger results using this functionality; we'll go more in depth in the next few sections.

We're fundamentally dealing with timeseries data and there's a very simple syntax for acquiring the data you need. The ts_timestamp URL parameter accepts the following sub-parameters:

gt: greater than.
lt: less than.
gte: greater than or equal to.
lte: less than or equal to.
now: the time on the StreamFS server when the query is submitted.

These are URL parameters that should be included in the GET request URL to StreamFS. Notice that the “query” parameter must be set to “true” in order for the query to be processed. Below we have included several query examples.

1. curl -i "http://localhost:8080/temp/stream1?query=true&ts_timestamp=lt:1327017501"
2. curl -i "http://localhost:8080/temp/stream1?query=true&ts_timestamp=lte:now+1"
3. curl -i "http://localhost:8080/temp/stream1?query=true&ts_timestamp=gte:1326980040,lt:1326980315"

query: Must be set to “true” for the query to be processed.
ts_timestamp: The timestamp constraint for the timeseries query; takes one or more of the sub-parameters below.

Sub-parameters:
gt: greater than.
lt: less than.
gte: greater than or equal to.
lte: less than or equal to.
now: the time on the StreamFS server when the query is submitted.

Table B.2: Timeseries query parameters.

The first query fetches all values with timestamps less than 1327017501. The second query fetches all values with timestamps less than or equal to “now+1”. The keyword “now” is interpreted by StreamFS as the current time on the StreamFS server when the query is submitted, and can be used as a variable in the query. The third query is a typical range query, where we want values with timestamps greater than or equal to 1326980040 and less than 1326980315. Notice the use of the comma in the URL parameter value: the comma implies an “AND” condition for the timeseries query.

Properties

Querying properties is more complex and you have more options. There are essentially two ways: the first sets the “props_” URL parameter, while the other submits a set of keywords. The latter is the one we'll go over in this tutorial.

echo "{\"\$or\":[{\"_keywords\":\"units\"}]}" > props_query.json
curl -i -X POST "http://localhost:8080/*?query=true" -d@props_query.json
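Returning to the timeseries syntax, the sketch below (a hypothetical helper, assuming the tutorial's local instance) issues a range query for the last hour of data, combining the “now” keyword with the comma “AND” condition:

var http = require('http');

// Fetch the last hour of data from /temp/stream1: timestamps
// greater than or equal to now-3600 AND less than now.
var query = '/temp/stream1?query=true&ts_timestamp=gte:now-3600,lt:now';
http.get({ host: 'localhost', port: 8080, path: query }, function(res){
  var body = '';
  res.setEncoding('utf8');
  res.on('data', function(chunk){ body += chunk; });
  res.on('end', function(){
    console.log(JSON.parse(body).ts_query_results);
  });
});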

B.7 Bulk default/stream file creation

Creating one file at a time can incur high overhead when there are many files to create, mainly because each request is established over a new HTTP connection. Therefore, StreamFS supports a bulk-creation request:

{ "operation":"create_resources", "list":[ { "path":"/temp/one/two/stream3", "type":"stream" }, APPENDIX B. STREAMFS HTTP/REST TUTORIAL 148

{ "path":"/temp/one/three/stream4", "type":"stream" }, { "path":"/temp/one_four/two/", "type":"default" } ] }

The operation name is ‘create_resources’ and it includes a ‘list’ array, where each element in the list is an object with the path and type attributes set. The path is the path of the resource you want to create and the type is its type. For each path in the list, StreamFS will create the intermediate files that eventually lead to the creation of the file listed. For example, if only /temp exists, /temp/one and /temp/one/two will be created as “default” files and /temp/one/two/stream3 will finally be created as a stream file. The response is shown below:

HTTP/1.1 201 OK Content-Type: application/json Connection: close Last-Modified: Sun, 09 Sep 2012 21:14:46 GMT Server: StreamFS/2.0 (Simple 4.0) Date: Sun, 09 Sep 2012 21:14:46 GMT

{ "/temp/one": {}, "/temp/one/two": {}, "/temp/one/two/stream3":{ "status":"success", "is4_uri":"/temp/one/two/stream3", "PubId":"22ee31ce-5975-434e-9836-762050544d3e" }, "/temp/one/three": {}, "/temp/one/three/stream4":{ "status":"success", "is4_uri":"/temp/one/three/stream4", "PubId":"864a4dd7-fff3-4fc5-8cfb-ca55f86aed7f" }, "/temp/one_four": {}, "/temp/one_four/two": {} } APPENDIX B. STREAMFS HTTP/REST TUTORIAL 149

This shows the results of creating the file(s); each value is the body of the response for the corresponding creation.

Subscribing to a stream

StreamFS offers a subscription facility: it POSTs incoming data to the URL specified by the “target” field in the subscription request as soon as the data comes into StreamFS. Let's set one up. First, let's create the subscription request object and POST it to the /sub file, which is where the subscription handler lives.

echo "{\"s_uri\":\"/temp/stream1\",\
\"target\":\"http://localhost:1337\"}" > subreq.json
curl -i -X POST "http://localhost:8080/sub" [email protected]

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Fri, 20 Jan 2012 03:51:49 GMT

{ "operation":"subscribe", "status":"success", "subid":"eda7f7ee-99a1-4808-8d68-4be2562dd3bd" }

Notice that the reply includes a unique identifier for this subscription. StreamFS also creates a file in the /sub directory as an active file for managing the subscription and obtaining information about it.

curl -i -X GET "http://localhost:8080/sub"

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Fri, 20 Jan 2012 03:53:07 GMT

{
  "status":"success",
  "type":"DEFAULT",
  "properties": {},
  "children":[
    "0828106",
    "all"
  ]
}

curl -i -X GET "http://localhost:8080/sub/0828106"

HTTP/1.1 200 OK Transfer-encoding: chunked Content-type: application/json Connection: close Date: Fri, 20 Jan 2012 03:53:31 GMT

{ "status":"success", "subid":"eda7f7ee-99a1-4808-8d68-4be2562dd3bd", "destination":"http://localhost:1337", "sourceId":"789cf943-bbc8-428e-97ce-03e7cfe5fc12", "sourcePath":"/temp/stream1" }

Notice that by calling GET on the subscription resource we can obtain information about it. Now let's test the subscription. Here's a small Node.js script you can use to act as a simple receiver for the incoming data from the stream.

var http = require('http');

function handlePost(req, res){
  req.on('data', function(chunk){
    console.log("got some data here");
    //console.log("Receive_Event::" + chunk.toString());
  });
  res.writeHead(200, {'Content-Type': 'text/json'});
  res.end("{\"status\":\"success\"}");
}

var server = http.createServer(function(req, res){
  req.setEncoding('utf8');

  console.log(req.headers);

  req.on('data', function(chunk){
    console.log("Receive_Event::" + chunk);
  });

  req.on('end', function(){
    console.log('on end');
    console.log("Bytes received: " + req.socket.bytesRead);
    if(req.method == 'POST'){
      handlePost(req, res);
    } else {
      res.writeHead(200, {'Content-Type': 'text/plain'});
      res.end();
    }
  });
});
server.listen(1337, "localhost");
console.log('Server running at http://localhost:1337/');

You can copy-paste this code into a file and run it using Node.js. Once it's started, POST data to the stream resource.

curl -i -X POST "http://localhost:8080/temp/stream1\
?type=generic&pubid=789cf943-bbc8-428e-97ce-03e7cfe5fc12"\
-d "{\"data\":123}"

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Fri, 20 Jan 2012 03:59:32 GMT

{"status":"success"}

$ node dumblistener.js

Server running at http://127.0.0.1:1337/

Receive_Event:: { "data": 123, "ts": 1327032612, "pubid":"789cf943-bbc8-428e-97ce-03e7cfe5fc12", "timestamp": 1327032612, "PubId":"789cf943-bbc8-428e-97ce-03e7cfe5fc12", "is4_uri":"/temp/stream1/" }

Notice that the data object is received successfully by the listener script. You can write code that consumes incoming data through HTTP POSTs and collects data for whatever streams you have in your StreamFS instance. Now, if we want to cancel the subscription, to stop incoming data from being forwarded to the external script, we simply delete the resource with an HTTP DELETE call:

curl -i -X DELETE "http://localhost:8080/sub/0828106"

HTTP/1.1 200 OK

Transfer-encoding: chunked Content-type: application/json Connection: close Date: Fri, 20 Jan 2012 04:14:48 GMT

{"status":"success"}

B.8 Creating symbolic links

A very important feature in StreamFS is the ability to create symbolic links – files that point to other files. Why is this important? In the case of sensor data management, we want to be able to access the data source through multiple names. So if we have a temperature sensor that is both inside a room and belongs to a set of things I own, I can place it in the room directory and symbolically link to it from my personal directory. Let's go ahead and create a symbolic link. As usual, we create a json document with the operation that will be carried out by StreamFS. Let's name the symlink “stream1_link”.

echo "{\"operation\":\"create_symlink\",\
\"uri\":\"/temp/stream1\",\"name\":\"stream1_link\"}">\
create_symlink.json

Now let's POST it to the “temp” file to create it.

curl -i -X POST "http://184.106.109.119:8080/temp" -d@create_symlink.json

HTTP/1.1 201 Created Transfer-encoding: chunked Content-type: application/json Connection: close Date: Fri, 20 Jan 2012 23:21:24 GMT

If everything went well, you should get a response with the HTTP status code 201. Now that it's created, let's go ahead and see how it is listed when we call the parent directory.

curl -i "http://184.106.109.119:8080/temp"

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Fri, 20 Jan 2012 23:21:26 GMT

{ "status":"success", "type":"DEFAULT", "properties": {}, "children":[ "stream1", "stream1_link ->/temp/stream1" ] }

Notice the child named “stream1_link”. The arrow indicates that it is a symbolic link that points to /temp/stream1. Therefore all HTTP requests to /temp/stream1_link will be forwarded to the /temp/stream1 file and the response will look as if it had come from there. Let's check that firsthand.

curl -i"http://184.106.109.119:8080/temp/stream1_link" HTTP/1.1 200 OK Transfer-encoding: chunked Content-type: application/json Connection: close Date: Fri, 20 Jan 2012 23:21:38 GMT

{ "status":"success", "pubid":"789cf943-bbc8-428e-97ce-03e7cfe5fc12", "head": {}, "properties":{ "units":"psi" } }

Compare this to the response we received from /temp/stream1; notice that it's essentially the same response. Again, this is a useful way to give multiple names to the same file, and we use it in other StreamFS features, for example to perform various aggregation procedures.

Moving a resource

Sometimes you either make a mistake in naming a file or simply want to place it in another directory. StreamFS supports a move operation that allows you to move any file from one location to another, or to rename an existing file.

echo"{\"operation\":\"move\",\ \"src\":\"/temp/stream1\",\"dst\":\"/temp/stream2\"}">\ move.json curl -i -X PUT"http://localhost:8080/temp" [email protected] APPENDIX B. STREAMFS HTTP/REST TUTORIAL 154

HTTP/1.1 200 OK Transfer-encoding: chunked Content-type: application/json Connection: close Date: Tue, 10 Apr 2012 03:07:24 GMT

{"status":"success"}

B.9 Stream Processing

This section gets into the stream processing components of StreamFS. StreamFS supports the ability to run JavaScript processing scripts on streaming data and return the result via a stream file/resource. The user can either query or subscribe to the output of the resource to see the results. In this section we're going to use the previously created stream file in /temp/stream1 as the source stream to feed through the processing element. We will define a processing element, install it, pipe /temp/stream1 through it, and observe the output as it's being processed.

B.10 Configuration

The processing engine in StreamFS communicates with processing elements meant to run on remote machines. Note the contents of the configuration file in lib/local/rest/resources/proc/config/serverlist.json.

{ "procservers":[ { "name":"proc1", "host":"127.0.0.1", "port":1337 } ] }

This configuration file consists of a list of process-element servers. StreamFS automatically load balances between the servers while trying to maintain the highest level of efficiency. In other words, it will use all the resources of a single machine until it decides to spawn jobs on a new one due to decreasing performance. For testing, let's use the default configuration, which starts a process-element server on the localhost.
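For instance, a deployment that spreads jobs across the localhost and one additional machine might use a serverlist.json like the sketch below (the second host is hypothetical):

{
  "procservers":[
    {
      "name":"proc1",
      "host":"127.0.0.1",
      "port":1337
    },
    {
      "name":"proc2",
      "host":"10.0.0.2",
      "port":1337
    }
  ]
}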

B.11 Start the processing element

If you're administering your own copy of StreamFS, you're going to have to fire up the processing layer and update the configuration files in StreamFS to point to its location. Each processing element runs in Node.js. Let's start it with the following command in a new terminal:

node lib/local/rest/resources/proc/js/runner.js

B.12 Creating a processing job

The script below is a request to create a new process. The ‘name’ is the name of the file/resource to be created in /proc. The ‘winsize’ is the window size that will induce the process to run: a winsize of 10 means that when the window has 10 elements in it, the window of values is passed to the function defined by ‘func’. The ‘materialize’ keyword is a boolean that determines whether the output of the function is saved in the database for querying. The ‘timeout’ parameter is the timeout, in milliseconds, for the process to run. This is a special case where both the winsize and the timeout are specified; the timeout beats out the winsize parameter. In other words, if the timer fires before the buffer has reached winsize, the function runs on the data that is currently in the buffer. Let's save this in a file called saveproc.json.

{ "operation":"save_proc", "name":"testproc", "script":{ "winsize":10, "materialize":"false", "timeout":20000, "func":function(buffer, state){ var outObj = new Object(); outObj.tag ="processed"; return outObj; } } }

You can use it as a template for getting started with creating different types of processing scripts. Make sure that the processing script doesn't have any errors in it before running, and do not surround the function definition in quotes. The function must take a “buffer” variable, which is an Array of Objects; it also accepts an optional ‘state’ parameter, a generic object that holds state associated with the process and is passed to the function each time it runs. The returned element must be an Object, and it can contain any number of elements. This object will be sent back to StreamFS and made available through the associated stream element for this process's output.

The script that you wish to run on the buffer is defined by func. This function takes a buffer, where buffer.length <= winsize, performs some operation on it, and outputs a buffer. If the materialize option is set to true and the output object contains the “timestamp” and “value” fields, it is saved for processing later on. We create the process by POSTing it to /proc.

curl -i -X POST http://localhost:8080/proc -d@saveproc.json

This returns an HTTP 201 response. We can check that the new resource has been created with a GET request to /proc.

curl -i http://localhost:8080/proc

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Fri, 11 May 2012 12:24:41 GMT

{ "status":"success", "type":"DEFAULT", "properties":{ "status":"active" }, "children":[ "testproc" ] }

Notice that one of the children is named testproc. Let's call GET on that new resource. Notice the properties object: it contains a variant of the script that was entered. It's the basic code that will be run on the data directed through a running instance of the script.

curl -i http://localhost:8080/proc/testproc

HTTP/1.1 200 OK Transfer-encoding: chunked Content-type: application/json Connection: close Date: Fri, 11 May 2012 12:25:06 GMT

{

"status":"success", "type":"PROCESS_CODE", "properties":{ "operation":"save_proc", "name":"testproc2", "script":{ "winsize": 10, "materialize":"false", "timeout": 20000, "func":{ "params":[ "inbuf" ], "text":"var outObj= new Object(); outObj.tag=\"processed\"; return outObj;" } } }, "children":[ "15a73498cfed" ] }

B.13 Start the process

Let's get a test process started. First let's create a request object and POST it to StreamFS to get the process ‘installed’.

echo '{"path":"/temp/stream1","target":"/proc/testproc"}' > subreq.json

Now let's post it to the subscription path in /sub to create it.

curl -i -X POST "http://localhost:8080/sub/" [email protected]

The response looks like a regular subscription response. However, this is a special subscription: it directs incoming data through a running process.

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Fri, 11 May 2012 12:23:37 GMT

{ "operation":"subscribe", "status":"success", "subid":"bc2e1035-7f19-4a34-baa2-60e49470a988" }

Below is a Node.js script that you can copy-paste in order to view the output of the running process. Copy-paste it and fire it up with node.

var http = require('http');

function handlePost(req, res){
  console.log("Handling post event");
  res.writeHead(200, {'Content-Type': 'text/json'});
  res.end("{\"status\":\"success\"}");
}

var server = http.createServer(function(req, res){
  req.setEncoding('utf8');
  console.log(req.headers);
  req.on('data', function(chunk){
    console.log("Receive_Event::" + chunk);
    if(req.method == 'POST'){
      handlePost(req, res);
    }
  });

  req.on('end', function(){
    console.log('on end');
  });

  console.log("Bytes received: " + req.socket.bytesRead);
  if(req.method != 'POST'){
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end();
  }
}).listen(1338, "localhost");
console.log('Server running at http://localhost:1338/');

Notice that the stream resource is a child of the testproc resource that was created when you saved the script. The stream resource was created once the initial subscription was installed, which pipes incoming data through a running instance of the process.

echo '{"path":"/proc/testproc/15a73498cfed/","target":"http://localhost:1338"}' > subreq2.json

Now we want to subscribe to this resource in order to observe the output of the process as data runs through it.

curl -i -X POST "http://localhost:8080/sub/" [email protected]

HTTP/1.1 200 OK Transfer-encoding: chunked Content-type: application/json Connection: close Date: Fri, 11 May 2012 12:30:00 GMT

{ "operation":"subscribe", "status":"success", "subid":"14637d35-68d2-4757-bdf8-4438b12fce1f" }

Viewing the output

As the process runs, the output should look similar to the output shown below:

{ 'content-type': 'application/json',
  'cache-control': 'no-cache',
  pragma: 'no-cache',
  'user-agent': 'Java/1.7.0_03',
  host: 'localhost:1338',
  accept: 'text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2',
  connection: 'close',
  'content-length': '66' }
Bytes received: 247
Receive_Event::{
  "tag":"processed",
  "PubId":"a58227c6-2324-4a6a-bca8-72411d27340f"
}
Handling post event
on end

B.14 Stopping the process

To stop the process altogether, simply delete the stream resource, the subscription, or even the source, /temp/stream1. Any of those will stop the process.

curl -i -X DELETE http://localhost:8080/proc/testproc2/15a73498cfed

HTTP/1.1 200 OK
Transfer-encoding: chunked
Content-type: application/json
Connection: close
Date: Fri, 11 May 2012 12:35:41 GMT
