Secure LwM2M IoT Streaming Data Pipelines in Hopsworks: Master Thesis
UNIVERSIDAD POLITÉCNICA DE MADRID
ESCUELA TÉCNICA SUPERIOR DE INGENIERÍA Y SISTEMAS DE TELECOMUNICACIÓN
EIT DIGITAL MASTER IN INTERNET TECHNOLOGY AND ARCHITECTURE

Secure LwM2M IoT streaming data pipelines in Hopsworks
Master Thesis
Kajetan Maliszewski
Madrid, July 2019

Contents

1 Introduction
  1.1 Problem description
  1.2 Purpose
  1.3 Goals
  1.4 Outline
2 Background
  2.1 IoT Architecture
  2.2 IoT Nodes
  2.3 IoT Gateway
  2.4 Hopsworks Ecosystem
  2.5 Apache Kafka
  2.6 Stream Processing
  2.7 Security
3 Architecture
  3.1 Components
  3.2 IoT Gateway in the Hopsworks Ecosystem
  3.3 IoT Gateway Architecture
    3.3.1 LeshanService
    3.3.2 DatabaseService
    3.3.3 ProducerService
    3.3.4 HopsworksService
  3.4 Hopsworks Architecture
    3.4.1 Hopsworks Database
    3.4.2 User Interface
    3.4.3 IotGatewayResource
    3.4.4 Data Storage
    3.4.5 Streaming Jobs
  3.5 IoT Nodes
    3.5.1 Endpoint Client Name
    3.5.2 Measurements Timestamping
    3.5.3 Measurement Life Cycle
  3.6 Security
4 Implementation
  4.1 IoT Nodes
  4.2 IoT Gateway
    4.2.1 LeshanService
    4.2.2 DatabaseService
    4.2.3 ProducerService
    4.2.4 HopsworksService
  4.3 Hopsworks
    4.3.1 Hopsworks Database
    4.3.2 IoTGatewayResource
    4.3.3 User Interface
    4.3.4 Streaming Jobs
  4.4 Installation
5 Evaluation
  5.1 Verification
  5.2 Validation
    5.2.1 Test setup
    5.2.2 Test with an IoT simulator
    5.2.3 Test with a real IoT device
    5.2.4 Multiple gateways test
    5.2.5 Failure test
    5.2.6 Anomaly Detection Test
  5.3 Benchmarking
    5.3.1 Latency in a local setup
    5.3.2 Latency in a remote setup
    5.3.3 Latency results analysis
    5.3.4 Cold and warm startup
6 Conclusion
  6.1 Goals Achieved
  6.2 Future Work
  6.3 Reflections
Bibliography

List of Figures

  2.1 Example of an IoT architecture [7].
  2.2 Hopsworks ecosystem schema [18].
  3.1 Project Architecture.
  3.2 IoT registration procedure.
  3.3 Example of tables generated for DatabaseService.
  3.4 IoT Gateway state in Hopsworks.
  3.5 New gateways table in Hopsworks database.
  3.6 Measurement life cycle.
  3.7 System Security Architecture.
  4.1 Zolertia Firefly (top) and Thunderboard Sense 2 (bottom).
  4.2 Sequence diagram of a REST call getting the list of IoT Nodes.
  4.3 UI - Enter IoT Gateway Details window.
  4.4 UI - Overview of IoT tab.
  4.5 UI - IoT Gateway Details window.
  4.6 UI - IoT Nodes window.
  5.1 Screenshots of running IoT Gateway (top) and IoT Node simulator (bottom).
  5.2 Screenshot of running Eclipse Leshan server.
  5.3 IoT simulator data retrieved from HopsFS.
  5.4 Kafka ACL after detection of too high traffic on a gateway.
  5.5 Measurement delivery time for local setup.
  5.6 Measurement delivery time for remote setup.
  5.7 Average latency benchmark result comparison.
  5.8 Measurement latency with cold and warm startup.

List of Tables

  3.1 HopsworksService REST API.
  3.2 IotGatewayResource REST API.
  5.1 bbc2 test machine specifications.
  5.2 computer test machine specifications.
  5.3 Software branches used for tests.
  5.4 Average latency benchmark results.

List of Acronyms

  6LoWPAN     IPv6 over Low-Power Wireless Personal Area Networks
  ACK         Acknowledgement
  ACL         Access Control List
  API         Application Programming Interface
  CoAP        Constrained Application Protocol
  CoAPS       DTLS-Secured Constrained Application Protocol
  DTLS        Datagram Transport Layer Security
  DDoS        Distributed Denial of Service
  EUI         Extended Unique Identifier
  FS          File System
  GPU         Graphics Processing Unit
  Hops        Hadoop Open Platform-as-a-Service
  HDFS        Hadoop Distributed File System
  HTTP        Hypertext Transfer Protocol
  HTTPS       Hypertext Transfer Protocol Secure
  IMEI        International Mobile Equipment Identity
  IP          Internet Protocol
  IPSO        Internet Protocol for Smart Objects
  IoT         Internet of Things
  JDBC        Java Database Connectivity
  JSON        JavaScript Object Notation
  JVM         Java Virtual Machine
  JWT         JSON Web Token
  MAC         Media Access Control
  ML          Machine Learning
  MVC         Model-View-Controller
  MVVM        Model-View-ViewModel
  NAT         Network Address Translation
  OMA LwM2M   Open Mobile Alliance Lightweight Machine-to-Machine
  PKI         Public Key Infrastructure
  PSK         Pre-Shared Key
  REST        Representational State Transfer
  RPK         Raw Public Key
  SQL         Structured Query Language
  TLS         Transport Layer Security
  TSDB        Time-Series Database
  UI          User Interface
  URN         Uniform Resource Name
  UUID        Universally Unique Identifier
  VM          Virtual Machine

Summary

The number of Internet-connected devices has already far surpassed the number of human beings, and growth remains so rapid that the number will double within the next five years. The ecosystem of these devices, collectively called the Internet of Things (IoT), is a source of a tremendous amount of data and creates several unprecedented challenges for researchers and companies. New, unconventional ways of storing, analyzing, and processing this data had to be proposed. One such solution is Hadoop Open Platform-as-a-Service (Hops), the result of years of joint research by KTH Royal Institute of Technology in Stockholm and RISE SICS AB.
It is a platform that enables the analysis of extremely large volumes of data with cutting-edge, open-source technologies for Big Data and Machine Learning (ML). This master thesis provides support for connecting these two environments. It provides instruments for secure and reliable ingestion of IoT data into the Hops platform. Moreover, it provides tools for maintaining the level of security by supporting mitigating measures, such as the automated exclusion of misbehaving devices and the dropping of traffic from sources of Distributed Denial of Service (DDoS) attacks.

To allow data ingestion, a new element was introduced to the ecosystem: the IoT Gateway. It is a platform to which authenticated IoT devices can stream data. Furthermore, Hopsworks, one of the main components of Hops, was extended with a REST API that allows the gateways to connect securely to the Hops ecosystem. A testbed, including an IoT software simulator and a real IoT device with dedicated hardware, was built and comprehensively tested and benchmarked. The architecture is based on open and widely used security mechanisms: Raw Public Key (RPK) and Hypertext Transfer Protocol Secure (HTTPS). It is shown that the proposed solution is performant, scalable, and highly reliable in a real-life scenario. To the best of our knowledge, the work done in this thesis makes Hopsworks the world's first open-source Big Data platform with secure IoT data ingestion.

Resumen

The number of devices connected to the Internet has already surpassed the number of human beings. The pace of growth is so high that the number will double in the next five years. The ecosystem of these devices, collectively called the Internet of Things (IoT), is a source of a large amount of data and creates several unprecedented challenges for researchers and companies. New and unconventional ways of operating on the data have been proposed.
One of these solutions is Hadoop Open Platform-as-a-Service (Hops), the result of research between KTH Royal Institute of Technology in Stockholm and RISE SICS AB. It is also a platform that enables the analysis of extremely large amounts of data with innovative, open-source Big Data and Machine Learning (ML) technologies. This master's thesis provides support for connecting these two technologies. It also provides instruments for ingesting IoT data into the Hops platform securely and reliably. In addition, it provides tools to maintain the level of security by enabling the execution of mitigation measures, such as the automated exclusion of sources of Distributed Denial of Service (DDoS) attacks. To enable data ingestion, a new element has been introduced to this technology: the IoT Gateway. It is a platform to which already authenticated IoT devices can transmit data. Hopsworks, a component of Hops, has been extended through a REST API, which