Scalable and Reactive Data Management for Mobile Internet-Of-Things Applications with Actor-Oriented Databases

UNIVERSITY OF COPENHAGEN PhD in Computer Science Scalable and Reactive Data Management for Mobile Internet-of-Things Applications with Actor-Oriented Databases Yiwen Wang Supervised by Marcos Antonio Vaz Salles April 2021 Yiwen Wang Scalable and Reactive Data Management for Mobile Internet-of-Things Applications with Actor-Oriented Databases PhD in Computer Science, April 2021 Supervisors: Marcos Antonio Vaz Salles University of Copenhagen Faculty of Science PhD Degree in Computer Science Sigurdsgade 41 2200 Copenhagen 四载春秋，畏作黄粱，虽非寒窗，亦是苦读。知己无柳絮高才，幸有大贤焉而为其徒，博学慎思，人十能之而己百之，笃行亦则足恃矣。今停笔止言之际，回首旦暮，赠以诗酒共年华，赚得知交满天下。愿今后去往之地，皆为热土，期明朝漫漫修远之路，皆伴长风济沧海。 iii Acknowledgements My PhD work conducted in the context of the Future Cropping partnership (Fu- ture Cropping partnership website 2018), supported by Innovation Fund Den- mark. Experimental evaluation partially supported by the AWS Cloud Credits for Research program. In addition, this work was partly supported by the International Network Programme project "Modeling and Developing Actor Database Applications", funded by the Danish Agency for Science and Higher Education (number 7059-000528) and by FAPESP CEPID CCES 13/08293-7. Additional funding provided by FAPESP project 17/02325-5 and by CNPq- Brazil, Department of Computer Science, University of Copenhagen and Pro- gramming technology foundations for Accountability, Privacy-by-design & Robustness in Context-aware Systems (case number: 9131-00077B). Throughout the working of this nearly four years PhD pursuing adventure, I have received a great deal of support and assistance from many aspects. First and foremost, I would like to express my special appreciation to my supervisor, Professor Marcos Antonio Vaz Salles, whose expertise was invaluable in guiding me in the research. I would like to thank you for your insightful feedback pushed me to sharpen my thinking as a researcher and brought my work to a higher level. During the special time when the world been attacked by Covid-19, everything has become more complicated and challenging. Thank you for providing me supports in all aspects and even arrange time to help me every week during your parental leave and vacations. I have deep gratitude for that in my heart. You are the best supervisor in the world. Meanwhile, I would like to thank Professor Yongluan Zhou, Professor Claudia Bauzer Medeiros, Professor Julio Cesar Dos Reis, Vivek Shah and Kasper Myrtue Borggrens, for their extraordinary cooperation, inspirations, insightful comments, and suggestions during my PhD study. Special thanks to Professor Christian S. Jensen for hosting me at Aalborg University for my environmental v exchange. I would like to acknowledge my colleagues in Sigurdsgade 41 for our discussion, brainstorming, and wonderful joy time. I would also like to thank all my friends. I would like to offer my special thanks to Ziming Luo, who has provided company for me from high school to university, and also in my current PhD study. Thank you for providing happy distractions to rest my mind outside of my research, cooking delicious food, and taking care of my pet fish, Panda. I would particularly like to thank my beloved husband. I cannot thank you enough for encouraging me throughout my PhD experience. I really appreciate your wise counsel and sympathetic ear. You are the victim of my crying, screaming, disordering, and weird behaviors in my stressed time. Meanwhile, I would like to thank my lovely family-in-law for embracing me with the family atmosphere when I am thousands of kilometers away from my hometown. Finally, I could not have completed my PhD study without the support of my parents, my sister, and my deceased grandparents. Thank you for your unwavering support and belief in me. I know you are always there for me. Also, I think this dissertation is a perfect prenatal education material for my unborn niece. vi Abstract Development and utilization of Internet-of-Things (IoT) applications are increasing at an unprecedented speed in many domains, such as supply chain and logistics tracking, smart agriculture, e-health, and intelligent transporta- tion systems, to name a few. Among these applications, mobile IoT applications have received special attention by virtue of having an ever more significant impact on people’s day-to-day and society. Mobile IoT applications employ widely used portable and movable IoT entities that collect data, share information, and interact with each other to achieve common goals. While a host of data management solutions may be employed in the architecture of IoT applications, ranging from edge data processing to historical analytical database systems, a particularly interesting component is the IoT data platform, which is a cloud-resident infrastructure for online management of IoT data. The growing connection coverage and increasing requirements in mobile IoT applications bring about special problems and challenges to be explored in IoT data platforms. Firstly, scalability is a necessary requirement for an IoT data platform because of the explosive growth of the IoT. Secondly, tight low-latency reactivity in an IoT data platform is required while managing a massive amount of highly concurrently generated data. Thirdly, support for heterogeneous data types is demanded in an IoT data platform due to the variety of IoT devices. Fourthly, elasticity is also required in an IoT data platform to handle dynamically changing IoT workloads. Last but not least, guidance on ensuring the dynamic and flexible development of IoT data platforms is crucial for application developers. Existing approaches have explored how to support different properties required in IoT data platforms, but they fall short of fulfilling our goal of providing programmable, scalable, and reactive data management for mobile IoT applications. Recently, Actor-Oriented Databases (AODBs) have been proposed vii as a new cloud-based data management architecture for distributed, scalable, stateful, and interactive applications. We find that AODBs are naturally suit- able for satisfying the requirements and solving the challenges of building IoT data platforms. Therefore, we inspect the distinct requirements of building IoT data platforms through two case studies, and we investigate how to ef- fectively model and build IoT data platforms with AODBs. Then, we focus on easing the complexity of building mobile IoT data platforms with AODBs while providing scalability and reactivity. We propose a novel class of data management systems called Moving Actor-Oriented Databases (M-AODBs), which integrate the new abstraction of moving actors for reactive moving objects with two data concurrency semantics for different application use scenarios. A first implementation, Dolphin, is presented to validate the design of M-AODBs. Afterward, to handle the typically skewed spatial distributions in mobile IoT applications, we apply varied classic spatial partitioning techniques in Dolphin to evaluate, analyze, and manage bottlenecks due to data skew. Our experimental study illustrates the impact of spatial partitioning in Dolphin and unveils several promising directions for future work. viii Resumé Udvikling og anvendelse af Internet-of-Things (IoT) applikationer sker med en hidtil uset hastighed på flere anvendelsesområder, såsom forsyningskæde og logistiksporing, smart landbrug, e-sundhed og intelligente transportsystemer. Blandt disse applikationer har mobile IoT-applikationer fået særlig opmærk- somhed i kraft af at have en stadig større indflydelse på samfundet og folks dagligdag. Mobile IoT-applikationer anvender meget bærbare og bevægelige IoT-enheder, der indsamler data, deler information og interagerer med hinan- den for at nå et større fælles mål. Mens en række datastyringsløsninger kan anvendes i arkitekturen af IoT-applikationer, der spænder fra kantdatabehan- dling til historiske analytiske databaser, så er en særlig interessant komponent IoT-dataplatformen, som er en cloud-resident infrastruktur til online styring af IoT-data. Den voksende forbindelsesdækning og stigende krav i mobile IoT applikationer medfører specielle problemer og udfordringer, der skal udforskes i IoT-dataplatforme. For det første er skalerbarhed et nødvendigt krav til en IoT-dataplatform på grund af den eksplosive vækst i IoT. For det andet kræves en lav-latency-reaktivitet i en IoT-dataplatform, når der skal administreres en massiv mængde stærkt samtidig genererede data. For det tredje kræves det at heterogene datatyper understøttes i en IoT-dataplatform på grund af forskelligheden blandt IoT-enhederne. For det fjerde kræves elasticitet i en IoT- dataplatform for at kunne håndtere dynamisk skiftende IoT-arbejdsbelastninger. Sidst men ikke mindst, så er der behov for vejledning til at sikre en dynamisk og fleksibel udvikling af IoT-dataplatforme for applikationsudviklere. Eksisterende løsninger har udforsket, hvordan de forskellige egenskaber der kræves i IoT-dataplatforme kan understøttes, men opfylder ikke vores mål om at sikre programmerbar, skalerbar og reaktiv geodatastyring til mobile ix IoT-applikationer. For nylig er Actor-Oriented Databaser (AODBs) blevet fores- lået som en ny skybaseret datastyringsarkitektur for distribuerede, skalerbare, stateful og interaktive applikationer. Vi finder, at AODBs naturligvis er velegnet til at opfylde kravene og løse udfordringerne der er relateret til opbyggelsen af IoT-dataplatforme. Derfor undersøger vi de specifikke krav der er relateret til opbygningen af IoT-dataplatforme gennem to casestudier, samt hvordan

Scalable and Reactive Data Management for Mobile Internet-Of-Things Applications with Actor-Oriented Databases

Advantages of Schema Less Database

An Evaluation of Compilation-Based PL/PGSQL Execution Tanuj Nayak CMU-CS-21-101 February 2021

CMU-CS-21-106 May 2021

AWS Glue Studio User Guide AWS Glue Studio User Guide

732A54 / TDDE31 Big Data Analytics Topic: Dbmss for Big