Aalto University School of Science Department of Computer Science

Seminar on Network Security and Internetworking, Spring 2015. Mario Di Francesco, Sanja Šćepanović (eds.)

Tutors: Çağatay Ulusoy, Deng Yang, Vu Ba Tien Dung, Jiang Dong, Jukka K. Nurminen, Keijo Heljanko, Mehrdad Bagheri Majdabadi, Mario Di Francesco, Manik Madhikermi, Matti Siekkinen, Nguyen Trung Hieu, Otto Huhta, Sanna Suoranta, Sanja Šćepanović, Sakari Luukkainen, Sandeep Tamrakar, Thomas Nyman, Zhonghong Ou


Aalto-yliopisto / Aalto-universitetet

Distribution: Aalto University School of Science, Department of Computer Science, P.O. Box 15400, FI-00076 Aalto. Tel. +358-9-470 23228, Fax +358-9-470 23293

Preface

The Seminar on Network Security and Seminar on Internetworking are Master’s level courses in computer science at Aalto University. These seminar series have been running continuously since 1995. From the beginning, the principle has been that the students take one semester to perform individual research on an advanced technical or scientific topic, write an article on it, and present it on the seminar day at the end of the semester. The articles are printed as a technical report. The topics are provided by researchers, doctoral students, and experienced IT professionals, usually alumni of the university. The tutors take the main responsibility of guiding each student individually through the research and writing process.

The seminar course gives the students an opportunity to learn deeply about one specific topic. Most of the articles are overviews of the latest research or technology. The students can make their own contributions in the form of a synthesis, analysis, experiments, implementation, or even novel research results. The course gives the participants personal contacts in the research groups at the university. Another goal is that the students will form a habit of looking up the latest literature in any area of technology that they may be working on. Every year, some of the seminar articles lead to Master's thesis projects or joint research publications with the tutors.

Starting from the Fall 2014 semester, we have merged the two alternating courses, one on security and one on internetworking, into one seminar that runs in both semesters. Therefore, the theme of the seminar is broader than before. All the articles address timely issues in security, privacy and networking technologies. Many of the topics are related to mobile and cloud computing and to the new applications enabled by ubiquitous computing platforms and network connectivity.

These seminar courses have been a key part of the Master's studies in several computer-science major subjects at Aalto, and a formative experience for many students. We will do our best for this to continue. Above all, we hope that you enjoy this semester's seminar and find the proceedings interesting.

Mario Di Francesco, Professor
Sanja Šćepanović, Editor

Table of Contents

1. Hussnain Ahmed. Design trade-offs for building a real-time Big Data system based on Lambda architecture. Tutor: Keijo Heljanko. (p. 1)
2. Dmytro Arbuzin. Cloud datastores: NewSQL solutions. Tutor: Keijo Heljanko. (p. 9)
3. Filippo Bonazzi. Security-Enhanced policy analysis techniques. Tutor: Thomas Nyman. (p. 17)
4. Erik Berdonces Bonelo. Bacteria Nanonetworks. Tutor: Mario Di Francesco. (p. 25)
5. Christian Cardin. Survey on indoor localization methods using radio fingerprint-based techniques. Tutor: Jiang Dong. (p. 31)
6. Markku Hinkka. Big Data Platforms Supporting SQL. Tutor: Keijo Heljanko. (p. 37)
7. Antti-Iivari Kainulainen. Review of energy profiling methods for mobile devices. Tutor: Dung Vu Ba Tien. (p. 45)
8. Sami Karvonen. User trajectory recognition in an indoor environment. Tutor: Jiang Dong. (p. 51)
9. Maël Kimmerlin. Virtual Machine Consolidation with Multi-Resource Usage Prediction. Tutor: Nguyen Trung Hieu. (p. 57)
10. Pranvera Kortoçi. Multimedia Streaming over Cognitive Radios. Tutor: Mario Di Francesco. (p. 63)
11. Lauri Luotola. IPv6 over networks of resource-constrained nodes. Tutor: Yang Deng. (p. 73)
12. Toni Mustajärvi. New applications to reduce energy consumption of cellular network using Smart Grid. Tutor: Jukka K. Nurminen. (p. 79)
13. Kari Niiranen. Security and privacy in smart energy communities. Tutor: Sanja Šćepanović. (p. 85)
14. Ari Oinonen. Software technology challenges in 3D printing. Tutor: Jukka K. Nurminen. (p. 89)
15. Jan Pennekamp. MOOCs and Authentication. Tutor: Sanna Suoranta. (p. 97)
16. Ashok Rajendran. How dense are cell towers? An experimental study of cell tower deployment. Tutor: Zhonghong Ou. (p. 105)
17. Sowmya Ravidas. User Authentication or Identification Through Heartbeat Sensing. Tutor: Otto Huhta. (p. 111)
18. Martijn Roo. A Survey on Performance of Scalable Video Coding Compared to Non-Scalable Video Coding. Tutor: Matti Siekkinen. (p. 119)
19. Juho Saarela. Biometric Identification Methods. Tutor: Sanna Suoranta. (p. 125)
20. Pawel Sarbinowski. Survey of ARM TrustZone applications. Tutor: Thomas Nyman. (p. 131)
21. Dawin Schmidt. Secure Public : A survey. Tutor: Sandeep Tamrakar. (p. 139)
22. Junyang Shi. A survey on performance of SfM and RGB-D based 3D indoor mapping. Tutor: Jiang Dong. (p. 151)
23. Gayathri Srinivaasan. A survey on communication protocols and standards for the IoT. Tutor: Manik Madhikermi. (p. 159)
24. Sridhar Sundarraman. Website reputation and classification systems. Tutor: Otto Huhta. (p. 165)
25. Jan van de Kerkhof. Delay-sensitive cloud computing and edge computing for road-safety systems. Tutor: Mehrdad Bagheri Majdabadi. (p. 171)
26. Hylke Visser. More ICT to Make Households More Green. Tutor: Sanja Šćepanović. (p. 177)
27. Aarno Vuori. Software market of network functions virtualization. Tutor: Sakari Luukkainen. (p. 183)
28. Rui Yang. Something you need to know about bluetooth smart. Tutor: Çağatay Ulusoy. (p. 189)
29. Can Zhu. A survey of password managers. Tutor: Sanna Suoranta. (p. 195)

Design trade-offs for building a real-time Big Data system based on Lambda architecture

Hussnain Ahmed
Student number: 281557
[email protected]

Abstract

Major Big Data technologies, such as MapReduce and Hadoop, rely on the batch processing of large data sets in a distributed parallel fashion. The latencies due to batch processing techniques are unsuitable for use in real-time or interactive applications. Real-time stream processing engines can process data in real-time but lack the capacity for handling large volumes of data. Lambda architecture has emerged as a powerful solution to provide real-time processing capability over large volumes of data. Lambda architecture combines both batch and stream processing, working together in a single system transparent to the end user. It provides the basic guidelines of a construct for such a data system but allows flexibility in using different components to achieve real-time Big Data processing capability. In our study, we provide a working Lambda architecture implementation while discussing the underlying trade-offs for the design of real-time data systems based on this architectural paradigm.

KEYWORDS: Big Data, analytics, lambda architecture, streaming, distributed computing

1 Introduction

Ubiquitous computing, the availability of fast and mobile Internet, and the phenomenal growth in the use of social media have generated a major surge in the growth of data. Advancements in distributed parallel computing have become the major means of balancing this punctuated equilibrium. A strong collaboration between industry and open source software communities has resulted in new programming models and software frameworks, such as MapReduce and Hadoop, to handle Big Data in a distributed parallel fashion. A generation of new tools and frameworks is emerging within the same ecosystem as building blocks to enable end-to-end Big Data platforms. The Hadoop framework provides scalability, reliability and flexibility to handle large volumes of data in a variety of different formats. However, Hadoop is designed for distributed parallel batch processing. It can run batch computations on very large amounts of data, but the batch computations have high latencies [14]. Many real-life business applications of Big Data, such as web analytics, online advertisements, the Internet of Things, social media analytics, and operational monitoring, require real-time processing of large streams of data. The problem is that the data processing latencies can lower the efficacy of such applications. Byron Ellis (2014) differentiates streaming data from other types of data on the basis of three major characteristics: the "always on, always flowing" nature of the data, the loose and changing data structures, and the challenges presented by high-cardinality dimensions [6]. These three characteristics also dictate the design and implementation choices for handling streaming data. Another important requirement for such systems is the ability to analyze live streaming data along with large volumes of stored historical data. The final outputs of such data systems are usually combined results, derived from both streaming and stored data processing. Recently we have seen some new tools and techniques to manage such data processing. We have already mentioned Hadoop and its ability to batch process Big Data in a distributed parallel manner. Hadoop 2.0 (YARN) and in-memory distributed batch processing within the Apache Spark framework were introduced to reduce data processing latencies. Similarly, tools such as Apache Storm have become very popular as the answer for distributed processing of data streams. Various other tools for functional components such as data collection, aggregation, and distributed database systems are also available. However, significant efforts are required to make appropriate architectural choices to combine these components in the form of a real-time Big Data analytics platform.

Lambda architecture [13] is a design approach that recommends combining distributed batch processing with stream processing to enable real-time data processing. This approach dissects data processing systems into three layers: a batch layer, a serving layer and a speed layer [1]. The stream of data is dispatched to both the batch and speed layers. The former manages the historical data sets and pre-computes the batch views. The serving layer indexes the batch views in order to serve queries at low latencies. The Lambda architecture can be implemented using various combinations of the available tools, such as the Hadoop File System (HDFS) and Apache Spark for batch view generation, Apache Kafka and Apache Storm in the speed layer, and HBase in the serving layer. Although Lambda architecture provides a working model to facilitate real-time Big Data analytics, there are some weaknesses associated with it. As highlighted by Jay Kreps (2014) [10], the real cost of this approach is maintaining the same business logic in two different systems: once in the batch views and then in the speed layer. Such an approach adds operational overheads and raises questions regarding the efficiency of this approach.

1 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

Our study focuses on Lambda architecture in detail. The key element is the continuous delivery of real-time Big Data analytics using distributed parallel computing. We discuss the design choices for building end-to-end real-time Big Data analytics pipelines, examine the underlying trade-offs of distributed computing, and present a working model of Lambda architecture.

2 Lambda architecture

Lambda architecture provides an architectural paradigm for real-time Big Data systems. Nathan Marz presented Lambda architecture in his book "Big Data: Principles and best practices of scalable real-time data systems" [14]. The main idea behind Lambda architecture is to run ad hoc queries on large data sets [2]. The large data sets may contain large amounts of stored data as well as real-time streaming data. Lambda architecture simplifies this by precomputing data views. The queries run on these precomputed views as an abstraction of the full data set [14].

query = function(all data)

The overall data system consists of three layers: the batch, speed and serving layers. The batch layer runs batch processing jobs periodically on the large historical data sets in order to create the batch view(s). From a data ingestion point of view, the streaming data appends to the historical data as a master copy of the data. For instance, if the batch computations are computed on a daily basis, today's data will be stored with past data, and the next batch that runs tomorrow will also include today's data to generate the most recent batch view. Nathan Marz describes this in the equations below [14].

query = function(batch view)

The speed layer takes care of the real-time streaming data. If we continue with our previous example of daily batch view generation, the speed layer will process today's data that was not in the scope of the previous batch run. Thus, the speed layer takes care of the data between two consecutive batch runs to generate the real-time view. The major difference between the batch layer and the speed layer is that the speed layer processes the most recent data on its arrival, while the batch layer processes all of the stored data at once. The other feature of the real-time view is that it keeps on updating with the new data [14].

real-time view = function(real-time view, new data)

The serving layer stores the latest batch views and serves the results of the query. The final result of the query is the merged view of the most recent batch and stream views. The query result can be represented as follows [14].

query = function(batch view, real-time view)

From the system's point of view, each layer represents a separate subsystem, as shown in Figure 1. The data feed goes to both the batch and the speed layer. The master data set persists in the batch layer, while the serving and speed layers serve the final query. In some cases, the serving layer can also keep the latest real-time views [12], but this needs some additional features inside the serving layer that we discuss in Section 3.3.

Figure 1: Lambda architecture high level overview [14]

2.1 Functional requirements of real-time Big Data systems

Some characteristics, such as scalability, fault tolerance, extensibility, maintainability, and interactivity, are associated with all kinds of real-time Big Data systems. Scalability refers to the ability of such systems to scale with the data volumes and other system requirements. The ability of data systems to scale horizontally, or scale out(1), provides a cost-effective way of extending systems according to needs. Secondly, in distributed systems, it is also important to tolerate and recover automatically from machine or subsystem failures. In case of such failures, systems should be able to ensure the completion of tasks as well as auto-recover. Similarly, human mistakes are almost inevitable, either in the deployment of systems or during the implementation of business logic; there should be some inbuilt mechanism in Big Data systems to rectify such mistakes.

Big Data systems evolve with changes in the existing requirements as well as with new features. In such cases, our target system should be flexible enough to extend easily [14]. Maintenance is the work required to keep the system running smoothly, and Big Data systems should be designed to minimize this overhead. Lambda architecture promotes the idea of keeping complexity out of the core components and putting it into the components with discardable output [14], which makes debugging very easy. Interactivity refers to the response time of a query in a particular data system. Various studies on human-computer interaction design recommend a response time of less than 10 seconds for any real-time interactive system [15].

(1) When the need for computing power increases, the tasks are divided among a large number of less powerful machines with (relatively) slow CPUs, moderate memory amounts, and moderate hard disk counts.


3 Design trade-offs and implementation choices

Lambda architecture provides the basic architectural principles and guidelines for real-time Big Data systems. For building a real-time data system, there are many available tools that we can use in the subsystems, or layers, of the Lambda architecture. The design aspects of all the components should be considered to make the right architectural choices, and the data analysis use cases drive the requirements. In our study, we have considered the requirements of each subsystem, evaluated different choices of components, and provided recommendations along with a prototype implementation of Lambda architecture while discussing the trade-offs.

3.1 Data collection and ingestion

Data collection from multiple sources and ingestion into some persistent storage is typically the first step in building a Big Data system. Real-time streaming data requires continuous collection. Such a data system should have inbuilt resilience against delays and other imperfections such as missing, duplicated or out-of-order data [18]. It should also be able to collect data from multiple sources and channel the collected data towards stream processing subsystems and data storage. In either case, data ingestion must ensure that it sends data at least once. Scalability in data collection and ingestion means the ability to add more capacity as well as more data sources, while extensibility in such systems refers to the ability to handle multiple data formats and serialization/deserialization mechanisms.

In the Lambda architecture, collected data is sent to both the batch and the speed layers. The data on the batch layer resides in a distributed file system or distributed database. On the other hand, a stream processing engine consumes the data on the speed layer in real-time. Thus, the data collection and ingestion mechanism needs to guarantee the delivery of data on both channels.

Message brokers and data collection tools such as Apache Kafka, Apache Flume, RabbitMQ, etc. can be used inside the Lambda architecture. All of these tools handle data in different ways, and their design aspects need consideration before using them in Lambda architecture. In our prototype implementation, we used Apache Kafka for data collection in combination with Apache Flume for data ingestion. Both of these tools are distributed, scalable, and fault tolerant.

Apache Kafka is a message broker that provides a pull-based model for data delivery [11]. The pull-based model helps applications consume data at their own pace. Apache Kafka also guarantees that data is delivered at least once [11]. If the consumer system runs without any faults, then Kafka can also ensure exactly-once delivery. However, if a process dies at the consumer end, then Kafka can send duplicate data to the process that takes over the tasks of the dead process. Apache Zookeeper [8] provides the coordination for running Kafka in distributed mode. The data replicates to multiple servers registered with Zookeeper for the purpose of fault tolerance. Zookeeper itself runs in a fault-tolerant configuration and ensures high availability.

Using Apache Flume with Apache Kafka provides an added advantage for data ingestion because of its ease of integration with Apache Hadoop (in the batch layer), which simplifies the maintenance of the whole system. One can argue that the addition of a system must increase the maintenance overheads, but since Apache Kafka does not provide an inbuilt mechanism for integration with Apache Hadoop, we need an add-on for this purpose. Using only Apache Flume is another option, but the Flume memory channel has weak protection against data loss in case of failures [7]. In this way, we can use the strengths of both systems while overcoming their weaknesses.

3.2 Batch layer

There are two main tasks of this layer: (i) maintain the master copy of data and (ii) precompute the batch views.

Storing data requires a distributed file system or a distributed database that can ensure scalability on demand, fault tolerance, and easy maintainability. The Hadoop filesystem (HDFS) is one of the most commonly used file systems in the Big Data landscape. HDFS is highly scalable, and it can scale up to thousands of nodes for parallel processing of data [16]. Reliability and fault tolerance are provided by keeping multiple copies of data on different servers. Tools such as Apache Hive, Pig, and Apache Spark can be used to extract, transform and load data on top of HDFS.

In Lambda architecture, each batch run includes the complete data set [14], which is a resource-intensive task but provides extra fault tolerance and a way to implement new business logic as seamlessly as possible; we discuss this in Section 4. Lambda architecture also recommends creating a new view as the latest batch view instead of updating the previous one, because updating requires random writes, which are more expensive than the sequential writes needed to create a new view.

In our prototype implementation, we are using HDFS to keep the master copy of data. On top of HDFS, we are using Apache Hive for data warehousing and Apache Spark for data processing and batch view creation. Apache Hive runs on top of Apache Hadoop and can load projections of data directly from HDFS in the form of tables, similar to SQL-based database management systems. It provides a query language called Hive Query Language (HQL) [20], which is also very similar to SQL, and HQL can be used to transform data by running MapReduce jobs in the background. In our setup, we use Hive for housekeeping activities like partitioning and loading the data. We then use Apache Spark to process this partitioned data. Apache Spark is much faster than Apache Hive or MapReduce because it provides a data abstraction that can run in-memory computations in a distributed parallel fashion [21]. This data abstraction is called the Resilient Distributed Dataset (RDD). Apache Spark also satisfies all the basic requirements of a distributed Big Data system that we mentioned earlier, and it comes with support for multiple programming languages, i.e. Scala, Java, and Python.

The batch view created as an output of the batch layer loads into the serving layer for temporary storage until the next most recent batch view is available.
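Because Kafka guarantees at-least-once rather than exactly-once delivery after a consumer failure, downstream processing has to tolerate duplicates. The following purely illustrative Python sketch shows offset-based de-duplication; the `(offset, payload)` record format and function names are our own assumptions for the example, not Kafka's API:

```python
def consume(records, seen_offsets):
    """De-duplicate an at-least-once stream by message offset.

    records: iterable of (offset, payload) pairs from one partition.
    seen_offsets: mutable set of offsets already processed; offsets
    redelivered after a consumer crash are skipped.
    """
    fresh = []
    for offset, payload in records:
        if offset in seen_offsets:
            continue  # duplicate redelivered after a failure
        seen_offsets.add(offset)
        fresh.append(payload)
    return fresh

seen = set()
first = consume([(0, "a"), (1, "b")], seen)
# offsets 1 and 2 arrive again after a simulated consumer restart
second = consume([(1, "b"), (2, "c")], seen)
print(first + second)  # ['a', 'b', 'c']
```

In practice the "seen offsets" state would itself have to be stored durably alongside the processed results, which is exactly the operational overhead the at-least-once model trades for simplicity.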

3.3 Serving layer

The main purpose of the serving layer is to provide the results of a query with minimum latency. As mentioned by Nathan Marz, in the case of the serving layer the tooling lags behind the theory [14]. Thus, any generic distributed database system that can serve with low latency can be a candidate for the serving layer. Use cases and the CAP theorem(2) impact the choice of tools. The serving layer tightly integrates with the batch layer and contains the latest batch views. Due to the latency of batch execution, the serving layer usually lags behind real-time; the speed layer covers this lag.

In our implementation of Lambda architecture, we used HBase with OpenTSDB (Open Time Series Database). HBase is an open-source implementation of Google's BigTable [9]. It is a large-scale distributed columnar database(3). It provides a very efficient table scan mechanism, which can serve queries with quick responses. To gain even faster responses, we have added OpenTSDB as a lightweight projection on top of HBase. OpenTSDB implements a smart way of storing time series data in HBase [17], and it simplifies HBase schema design and data maintenance through its very simple to use HTTP API. It is important to remember that OpenTSDB is best for time series data only.

3.4 Speed layer

The purpose of the speed layer is to process real-time streaming data. The speed layer takes care of the data for the time between two consecutive batches. The real-time processing requires incremental computing, which adds some extra challenges compared to the batch and serving layers. The first and most obvious challenge is the requirement for frequent view updates. Moreover, unlike in the batch layer, real-time views need random writes for incremental changes, which can have higher latency than sequential writes. The scalability and fault tolerance requirements demand distribution and replication of views across the cluster(s).

For use in Lambda architecture, there are various choices available as stream processing engines, such as Apache Storm, Apache Spark Streaming, and Microsoft Trill. In our study, we have compared Apache Storm with Apache Spark Streaming and selected Apache Storm for our prototype implementation. Although both of these systems have the capability to process real-time streams in a distributed parallel way, Apache Spark Streaming differs from most other stream processing engines in its basic design ideology: it processes the real-time stream in the form of micro batches using a data abstraction called discretized streams (D-Streams) [22]. Both streaming engines are capable of handling real-time streams at a sub-second level and provide a handful of useful transformation functions to process data. They both also provide support for multiple programming languages, with Storm capable of interfacing with more languages than Apache Spark, e.g. Ruby, Node.js, Clojure, etc.

We selected Apache Storm for use in our prototype setup because of its better fault tolerance mechanisms compared to Apache Spark. In Apache Storm, if a node dies then the Nimbus service (similar to the job tracker in Hadoop) reassigns the job to another worker node. The Nimbus and supervisor (similar to the task tracker in Hadoop) services are designed to fail fast and recover automatically. Storm uses Apache Zookeeper to keep the Nimbus and supervisor states. While the Nimbus service is being restarted, the data processing job keeps on running; the only degradation in such a case is the unavailability of Nimbus for new tasks until it is running again [19]. Apache Spark Streaming also provides fast recovery for its driver (master node). In some reported cases, it was observed to be faster than Storm [22] in this regard, but the state of the driver component can be lost during the failure, which results in loss of data during that brief time. Apache Spark has released a write-ahead log feature for Apache Spark Streaming with version 1.3 [5]; however, enabling this feature needs some compromises on data processing latency, and the feature also needs further evaluation.

3.5 Prototype implementation

We have implemented a prototype real-time Big Data system as a proof of concept of Lambda architecture. We used Apache Kafka for data collection. It was integrated with Apache Flume for data ingestion to the batch layer, while a Kafka-Storm spout was configured to send data to the speed layer. The batch layer was implemented using HDFS for storing data, Apache Hive for data warehousing and Apache Spark for batch processing. We used Apache HBase with OpenTSDB to store the batch views in the serving layer. The speed layer consisted of Apache Storm as the real-time data stream processing engine. The incremental real-time views from the speed layer also resided inside Apache HBase. The R project along with R Shiny was used to merge the views and provide real-time data visualizations in the form of an interactive dashboard. Figure 2 shows our prototype Lambda architecture implementation.

Figure 2: Lambda architecture demo setup

(2) The CAP theorem states that, in a distributed computing environment, it is impossible to guarantee consistency, availability, and partition tolerance simultaneously [3].
(3) A database that stores data tables as sections of columns instead of rows.
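The design difference between per-tuple processing (Storm) and micro-batching (Spark Streaming's D-Streams) can be illustrated with a small Python sketch. The batching function and interval below are our own illustration of the idea, not either engine's API:

```python
def micro_batches(stream, interval):
    """Group (timestamp, value) tuples into fixed-interval micro batches,
    roughly how D-Streams discretize a continuous stream before
    running a batch computation on each slice."""
    batches = {}
    for ts, value in stream:
        batches.setdefault(int(ts // interval), []).append(value)
    return [batches[k] for k in sorted(batches)]

# vehicle speeds arriving at 0.2 s spacing, cut into 0.5 s micro batches
stream = [(0.0, 80), (0.2, 82), (0.4, 81), (0.6, 79), (0.8, 84)]
print(micro_batches(stream, 0.5))  # [[80, 82, 81], [79, 84]]
```

A per-tuple engine would instead invoke the processing logic once for each of the five tuples as it arrives; micro-batching amortizes overhead per slice at the cost of up to one interval of extra latency.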

We used the Cloudera Hadoop (CDH-5.3.3) distribution as an integrated environment for the batch and serving layer components. To benefit from the new DataFrame API and Apache Spark SQL features, we upgraded Apache Spark to version 1.3.1. For the serving layer, OpenTSDB was installed separately on top of HBase in CDH. The Apache Spark cluster was installed to run with a supervisor [1]. We used OpenStack cloud infrastructure for the deployment of the complete prototype system. Table 1 and Table 2 summarize the prototype implementation software and infrastructure setups, respectively.

Subsystem        Software Component  Version
Data Collection  Apache Kafka        0.8.1
                 Apache Flume        1.5
Batch Layer      Apache Hive         0.13.10
                 Apache Spark        1.3
                 Apache Hadoop       2.5
Serving Layer    Apache HBase        0.98.6
                 OpenTSDB            v2.1RC
Speed Layer      Apache Storm        9.3

Table 1: Summary of software

Subsystem                                    Infrastructure
Data Collection, Batch Layer, Serving Layer  1 x cluster with 1 x 4 vCPU, 15 GB RAM, 200 GB HPC storage;
                                             3 x 1 vCPU, 3.5 GB RAM, 200 GB HPC storage
Speed Layer                                  1 x cluster (1 x Nimbus + 3 x worker nodes),
                                             4 x 1 vCPU, 3.5 GB RAM, 200 GB HPC storage

Table 2: Summary of infrastructure

3.6 Data processing and work-flows

The data set processed in our prototype setup was traffic data collected by sensors installed on a highway. Each sensor provided data related to one lane of the highway. Out of the various parameters provided by the sensors, we used the speed of the vehicles for our selected use case. As a demo use case, we continuously calculated the rolling medians of the vehicle speed per lane in real-time, with rolling median time windows of 1, 5 and 10 seconds. This particular use case was selected to test our system's capability to process data in parallel, as well as to identify the computation functions that cannot execute in a distributed parallel fashion.

3.6.1 Work-flows

The continuous data processing pipeline consisted of the following work-flows.

Data Collection: We collected the data using Apache Kafka. Within the Kafka producer, we had implemented a web scraping bot that can pull the data from an HTML table. The frequency and the time window for data collection were configurable. It was important to tune our collection mechanism in a way that we do not miss any data; replication of data is easy to handle in the data processing steps. As mentioned in Section 3.1, the data was sent simultaneously to two channels, i.e. the batch and speed layers. This was done by configuring the Kafka producer to send data to two different topics.

Batch Layer Storage: On the batch layer, data is consumed using Apache Flume. Flume consumed the data from a given topic and ingested it directly into HDFS in a configured path. It was also configured to pull data from Apache Kafka as soon as new data was available. We used Apache Flume agent configurations to set up the size and rotation policy for the HDFS files.

Speed Layer Consumption: A Kafka spout was used to integrate Apache Kafka with Apache Storm. Similar to Flume, the Kafka spout pulled the data from Kafka as soon as data was available and provided it to the Storm aggregate bolt.

Batch Layer Data Warehousing: Apache Hive loads data from HDFS into a partitioned table. The partitions were created on a daily basis and contained one day of data. This is an extremely simple and efficient method to control what data to include while running the batch view. The crontab functionality of the Linux operating system was used to load the data in an automated way. As an improvement, the work-flow scheduling for Hadoop systems could be managed through Apache Oozie.

Batch views and serving layer: The data processing for our use case was done using PySpark (Python for Apache Spark). Apache Spark can load data directly from Apache Hive tables and transform the data as required. For this use case, we utilized Spark to pre-process the data and then used the Python Pandas library to calculate the rolling medians sequentially. The calculated values with their timestamps were sent to OpenTSDB as part of the batch view using its HTTP API. OpenTSDB stores the data in HBase and simplifies schema generation and database management.

Speed view and serving layer: Data processing on the speed layer was done using an aggregate bolt. The data transformation logic was similar to the batch layer, and we also used the Python Pandas library for the rolling median calculation in the streaming layer.

Merging the views and visualizations: The merging of the batch and real-time views formulates the final result of the query. The batch jobs were run daily, while the real-time view took over for the data that had arrived after the start of the last batch. We implemented the merging logic in the R programming language. R connects to OpenTSDB via the HTTP API and pulls the recent versions of both views. An interactive dashboard was created using R Shiny on top of R to provide an interactive environment for users. The dashboard showed the most recent rolling median speed time series graphs (one graph per highway lane). Other interactivities, such as rolling median time window selection and time-of-day selection, were also possible. Each interaction generated a query for the most recent batch view and the most recent real-time view; OpenTSDB served the results in real-time, and R merged the data and created the respective visualization. This interactive dashboard is illustrated in Figure 3.

4 Discussion

Lambda architecture provides a workable model for a real-time Big Data system by combining the power of batch processing and the speed of stream processing systems.


Figure 3: Live interactive dashboard
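As a concrete illustration of the pipeline behind this dashboard, the per-lane rolling median and the batch/real-time view merge can be sketched with plain pandas. This is a hedged sketch, not the paper's code: the prototype used PySpark, Storm and R, and the column names (`timestamp`, `lane`, `speed`) and helper names here are assumptions.

```python
import pandas as pd

def rolling_median_per_lane(df: pd.DataFrame, window: str = "5s") -> pd.DataFrame:
    """Time-based rolling median of vehicle speed for each lane.

    Expects columns: 'timestamp' (datetime), 'lane', 'speed'.
    """
    df = df.sort_values("timestamp").set_index("timestamp")
    out = (
        df.groupby("lane")["speed"]
          .rolling(window)          # time-based window, e.g. "1s", "5s", "10s"
          .median()
          .reset_index()
          .rename(columns={"speed": "median_speed"})
    )
    return out

def merge_views(batch_view: pd.DataFrame, realtime_view: pd.DataFrame,
                last_batch_start: pd.Timestamp) -> pd.DataFrame:
    """Answer a query from the batch view up to the start of the last batch
    run, and from the real-time view for everything that arrived later."""
    old = batch_view[batch_view["timestamp"] < last_batch_start]
    new = realtime_view[realtime_view["timestamp"] >= last_batch_start]
    return pd.concat([old, new], ignore_index=True).sort_values("timestamp")
```

In the actual prototype the batch job would run this transformation over the full Hive-backed history, while the speed layer applies the same window incrementally and the serving layer (OpenTSDB) stores both views for the dashboard to merge.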

processing and the speed of stream processing systems. It achieves the overall aim of low latency while processing large amounts of data, and provides real-time interactivity. The individual components of Lambda architecture can scale independently on demand. The architecture itself is very flexible and allows a variety of available tools to fit in the framework. The data analysis use cases drive the requirements, and thus the choice of tools within the different layers of Lambda architecture. As an example, we could have used another datastore instead of Apache HBase in our prototype implementation if our use case required availability and partition tolerance more than consistency.

Another very important and positive feature of Lambda architecture is its inbuilt mechanism for recomputing data over and over again [10]. Firstly, this allows easy changes to the business logic, and secondly, it provides an extra fault tolerance mechanism against human mistakes and coding bugs. With the re-computation of the next batch view, the required changes and rectifications become effective. This feature of Lambda architecture adds a lot to the extensibility of a data system. The master dataset also stays intact, and the immutability of data is ensured in Lambda architecture [2].

With all the advantages of Lambda architecture, there are a few weak points associated with it as well. By far the most highlighted inherent weakness of Lambda architecture is the requirement of maintaining the business logic in two different layers. That means coding the same use case twice for possibly two different systems; e.g. in our prototype, we had to code in PySpark for the batch views and then re-code it in the speed layer over Apache Storm. Using Apache Spark and Apache Spark Streaming together provides an option to mitigate this weakness. However, in Section 3.4 we have discussed the fault tolerance issue of Apache Spark Streaming. The ongoing work on Spark Streaming's fault tolerance seems very promising, and Apache Spark may be able to provide an enhanced unified environment in the near future. This is also a new direction for further research on Lambda architecture and the fault tolerance of streaming engines.

Another common critique of the Lambda architecture is its principle of recomputing batch views constantly [10]. As highlighted before, re-computing is good for fault tolerance and change management. The question arises: do we need to recompute everything in every batch run, or do we only recompute for a change in business logic? The counter-argument in favor of Lambda architecture is that constant re-computation acts as an immune system against faults in the speed layer.

Lambda architecture provides the basic construct of a real-time Big Data system. There is flexibility to choose the components within the three defined layers. There are no specified limitations on using any pattern of components. In light of the CAP theorem [4], on a very conceptual level, there may exist some patterns and anti-patterns of using various components inside Lambda architecture. For example, in the batch layer and serving layer we typically emphasize consistency over availability, and vice versa in the speed layer. So in case we use a single AP (available and network partition tolerant) component in the batch or serving layer, then we may lose consistency for the complete batch view. Similarly, if we use a consistent component in the speed layer, then we may lose real-time in the real-time view. Mapping this concept onto our demo setup, we are using Kafka in an AP configuration, which may compromise the consistency of our batch views. Lambda architecture neither dictates nor specifies any such theoretical principles.

It is worthwhile here to mention that there are some emerging architectures that tend to solve the problems of Lambda architecture. One of them is the Kappa Architecture, proposed by Jay Kreps, who is also one of the main contributors to the Apache Kafka project. He proposes a powerful all-stream processing engine instead of the two-layered architecture [14].
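The dual-codebase weakness discussed above can be reduced by factoring the transformation into one shared, framework-agnostic function that both layers import. The names below are hypothetical (the paper's actual PySpark and Storm code is not shown); this only sketches the idea of writing the logic once even though it is deployed twice.

```python
import statistics
from typing import Iterable, List

def rolling_median(values: Iterable[float], window: int = 5) -> List[float]:
    """Single definition of the business logic: a rolling median over the
    last `window` samples."""
    buf: List[float] = []
    out: List[float] = []
    for v in values:
        buf.append(v)
        if len(buf) > window:
            buf.pop(0)
        out.append(statistics.median(buf))
    return out

def batch_job(history: List[float]) -> List[float]:
    # Batch layer (e.g. a PySpark job) replays the full history.
    return rolling_median(history)

class AggregateBolt:
    """Toy stand-in for a Storm bolt: applies the same logic one sample
    at a time in the speed layer."""
    def __init__(self, window: int = 5):
        self.buf: List[float] = []
        self.window = window

    def process(self, v: float) -> float:
        self.buf.append(v)
        if len(self.buf) > self.window:
            self.buf.pop(0)
        return statistics.median(self.buf)
```

Because both entry points delegate to `rolling_median`-style logic, a change in the business rule touches one place, which is essentially what a unified Spark/Spark Streaming deployment would also buy.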


References

[1] Supervisor: A process control system.

[2] The lambda architecture: principles for architecting realtime big data systems, 2013.

[3] E. Brewer. A certain freedom: thoughts on the CAP theorem. In Proceedings of the 29th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, pages 335–335. ACM, 2010.

[4] E. A. Brewer. Towards robust distributed systems. In PODC, volume 7, 2000.

[5] T. Das. Improved fault-tolerance and zero data loss in Spark Streaming, January 2015.

[20] A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy. Hive: A warehousing solution over a map-reduce framework. Proc. VLDB Endow., 2(2):1626–1629, Aug. 2009.

[21] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: cluster computing with working sets. In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, pages 10–10, 2010.

[22] M. Zaharia, T. Das, H. Li, T. Hunter, S. Shenker, and I. Stoica. Discretized streams: Fault-tolerant streaming computation at scale. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 423–438. ACM, 2013.

[6] B. Ellis. Real-time Analytics: Techniques to Analyze and Visualize Streaming Data. John Wiley & Sons, 2014.

[7] G. Shapira. Flafka: Apache Flume meets Apache Kafka for event processing.

[8] P. Hunt, M. Konar, F. P. Junqueira, and B. Reed. Zookeeper: Wait-free coordination for internet-scale systems. In USENIX Annual Technical Conference, volume 8, page 9, 2010.

[9] A. Khetrapal and V. Ganesh. HBase and Hypertable for large scale distributed storage systems. Dept. of Computer Science, Purdue University, 2006.

[10] J. Kreps. Questioning the lambda architecture, 2014.

[11] J. Kreps, N. Narkhede, J. Rao, et al. Kafka: A dis- tributed messaging system for log processing. In Pro- ceedings of 6th International Workshop on Networking Meets Databases (NetDB), Athens, Greece, 2011.

[12] MapR. Lambda architecture: Making sense of it all.

[13] N. Marz. How to beat the CAP theorem, 2011.

[14] N. Marz and J. Warren. Big Data: Principles and best practices of scalable realtime data systems. Manning Publications Co., 2015.

[15] J. Nielsen. Usability engineering. Elsevier, 1994.

[16] K. Shvachko, H. Kuang, S. Radia, and R. Chansler. The Hadoop distributed file system. In Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, pages 1–10. IEEE, 2010.

[17] B. Sigoure. OpenTSDB: The distributed, scalable time series database. Proc. OSCON, 11, 2010.

[18] M. Stonebraker, U. Çetintemel, and S. Zdonik. The 8 requirements of real-time stream processing. SIGMOD Rec., 34(4):42–47, Dec. 2005.

[19] Apache Storm Team. Apache Storm: Fault tolerance, 2015.

Cloud datastores: NewSQL solutions

Dmytro Arbuzin
Student number: 415158
[email protected]

Abstract

Distributed data stores have undergone a huge change over the last decade: from the first NoSQL databases to modern scalable cloud systems, which serve the major part of today's Internet data. The release of Google's BigTable paper in 2006 inspired many developers and companies to move towards NoSQL solutions. That generated a huge push in developing distributed data stores and significantly improved their performance and scalability. However, the industry could not overcome the limitations of distributed databases, specifically the lack of ACID semantics. During the last few years many so-called NewSQL databases have been presented, which combine the scalability and performance of NoSQL data stores with the sustainability and reliability of traditional databases. This paper evaluates a number of existing ACID-compliant cloud datastores and discusses the main difficulties and challenges in implementing ACID transactions in distributed data stores.

KEYWORDS: Database, NewSQL, ACID, transaction, NoSQL, CAP


1 Introduction

The database industry used to be the most stable and steady in the whole IT world. That was the case until the beginning of the 2000s, when some data sets could no longer be stored on a single computer. This caused a huge shift from traditional relational SQL database management systems towards highly scalable distributed systems. New design solutions were required due to a number of limitations that distributed systems possess. Those limitations were initially formulated by Eric Brewer back in 2000 and are now widely known as the CAP theorem [10].

In 2004 Google started to deploy their own distributed data storage to maintain most of their internal and external services. Later, in 2006, they published their BigTable paper [13], finally introducing the first successful fully distributed, scalable data storage to the external world. One year later, Amazon presented their distributed, highly available key-value storage called Dynamo [17]. Now, many experts state that BigTable has inspired the whole industry to actively move towards NoSQL solutions. Rick Cattell in his work [12] wrote that BigTable, Dynamo and Memcached (a scalable distributed memory caching system) had provided a "proof of concept" of NoSQL solutions and motivated many data stores that exist today. For example, the open-source, distributed Apache HBase system, which is currently a platform for a messaging service [20], was a direct clone of BigTable.

However, even being a huge success, BigTable could not provide enough of the features that are required for modern web services. Current Internet applications, among other components, require high scalability: the ability to grow fast while serving millions of users; low latency: the response time of a website should be as low as possible; a consistent view of data: users should be able to read their writes immediately; and finally the application should be highly available: websites should run 24/7 despite all potential problems. However, all these features together create a conflict: most of them can be easily achieved with traditional centralized SQL databases, but such databases can not scale horizontally to a large number of users. On the other side, NoSQL datastores like BigTable or HBase can not ensure strong consistency, low latency and other critical features at the same time. In other words, NoSQL databases do not follow ACID semantics, like relational ones, but they support horizontal scaling, or the scaling-out design principle. Scaling out basically means the ability to process over many hardware nodes.

Google tried to resolve this conflict and launched Megastore [6] in 2006: a database system built on top of BigTable with support for full ACID semantics within partitions, but only limited consistency guarantees across them. The idea was that one partition inside cloud storage should be enough for most Internet applications. Megastore made the life of developers easier, even though it did not solve the conflict completely. It was complex and comparatively slow for operations across partitions.

In 2012, Google presented a key/value store called Spanner [15], which used to be the first distributed database with consistent transactions across nodes. One year earlier, in 2011, the NewSQL term was introduced by Matthew Aslett [3].

This paper discusses the main design challenges of NewSQL datastores and describes the architecture of some existing solutions.

The remainder of the paper is structured as follows: Section 2 explains core concepts of database architecture, such as the CAP theorem and ACID principles. It also introduces vocabulary used in the paper. Section 3 discusses the main design challenges and trade-offs in achieving ACID semantics in NoSQL databases. Section 4 discusses the Spanner and F1 systems from Google. Section 5 discusses CockroachDB: a new open-source NewSQL solution. Section 6 summarises the paper.


2 Main concepts of database architecture

This section introduces and explains some of the most important terms and core concepts of database design.

2.1 Terminology

Because of the huge variety of products and technologies there is often a terminology collision, which leads to misunderstanding and confusion. It is not a secret that many companies use "trendy" words such as "cloud", "NoSQL", "distributed", etc. for marketing purposes. There is no common terminology in academia either. Thus, it is necessary to clarify some basic vocabulary, which is going to be widely used in the following discussion.

• Traditional database (RDBMS) is a relational SQL database system which follows ACID requirements.

• In his article [12] Rick Cattell has described NoSQL or "Not only SQL" [18] with six key features. Based on that description, we can define NoSQL as a scalable distributed database system, designed mainly for OLTP (Online Transaction Processing) handling, using a concurrency model weaker than ACID transactions. This definition is quite broad and might be arguable, but it satisfies us in the context of this paper.

• Based on the previous two definitions, we can describe NewSQL as a class of databases which provide the same scalability and throughput performance as NoSQL systems while still maintaining the ACID guarantees of traditional database systems.

2.2 ACID

The current concept of a transaction was formulated in the 1970s, in the era of early database development. We can define it as a set of operations grouped into one working unit. Later, Theo Haerder completed the transaction's description using the acronym ACID [19], which is one of the most important concepts in database theory. It stands for atomicity, consistency, isolation and durability. Each database must strive to achieve all four of these features; only then can the database be considered reliable [14].

• Atomicity. The transaction should follow the "all-or-nothing" rule. In other words, all changes must be stored to the system, or no changes at all in case of any error or conflict during the transaction execution.

• Consistency. Each successful transaction by definition commits only legal results. The data view state is the same for all database connections.

• Isolation. Events within a transaction should not impact other transactions running concurrently. As concurrency is a key feature of distributed systems, isolation is the point where the most trade-offs are made. See Section 3.3.

• Durability. Once the transaction has been executed, the results will be stored permanently.

Almost all modern SQL databases have ACID-compliant transactions. They provide the essential level of reliability for the most important industry areas, such as banking systems.

2.3 CAP theorem

NoSQL systems generally strive for scalability and performance, and as a result they do not match ACID requirements. This phenomenon was first formulated by Eric Brewer back in 2000 and today it is widely known as the CAP theorem [10]. It states that in a distributed system, you can only have two out of the following three guarantees across a write/read pair: Consistency, Availability, and Partition Tolerance - one of them must be sacrificed.

• Consistency. A read is guaranteed to return the most recent write for a given client. All nodes have the same state of the data view. It is important to understand that CAP consistency is not the same as ACID consistency.

• Availability. A non-failing node will return a reasonable response within a reasonable amount of time (no error or timeout).

• Partition tolerance. The system will continue to function even when parts of it fail or the network between nodes partitions.

For distributed systems the acronym BASE is widely used in contrast to ACID. BASE stands for Basically Available, Soft state, Eventually consistent.

All distributed database systems are willing to ensure all three characteristics of the CAP theorem. By definition they ensure partition tolerance, and it seems that the choice is straightforward: consistency or availability. However, with the development of NoSQL, the trade-offs have become more complex. Currently, many distributed databases declare themselves as eventually consistent and/or highly available. The initial CAP principle of "2 out of 3" is no longer accurate. In 2012, Eric Brewer revisited his CAP theorem [9] and stated that the "2 out of 3" principle was misleading from the very beginning. Brewer also showed that by using the latest techniques, design approaches and flexible settings, NoSQL systems can achieve a certain level of sustainability regarding all three principles. In other words, the CAP theorem still applies to distributed systems. However, with the usage of the latest design techniques, it is possible to achieve ACID semantics by a sophisticated balancing between Availability and Consistency. This is because both Consistency and Availability are not monolithic, but rather aggregated options consisting of many settings and rules; by relaxing some of them on the Availability side, the system can achieve stronger Consistency, and vice versa.


3 ACID transactions in NoSQL: main difficulties

As discussed in the previous parts of the paper, it is possible to achieve ACID transactions in distributed systems by using certain design techniques. However, it was also mentioned that

the architecture of such systems is quite advanced and the trade-offs are complex. This section discusses the main challenges, and also the compromises that are made at the architecture level to achieve ACID semantics in distributed systems.

3.1 Latency

The main point of ACID semantics is to ensure the same behaviour and the same guarantees for distributed systems as for single-node databases. However, there is a fundamental obstacle: the latency between nodes. Latency is a physical parameter which primarily depends on the speed of light and can not be neglected by definition. In the best case, two servers on opposite sides of the Earth, connected via cable, will have a round-trip time of 133.7 ms. Moreover, in real deployments of distributed systems, there are many secondary parameters such as routing, congestion, server-side overheads, etc., which increase the communication time between servers. Peter Bailis in his blog [4] demonstrates the average RTT between Amazon EC2 instances: the minimum is 22.5 ms and the maximum is 363 ms. Taking into consideration that any operation execution over the partition also requires a certain time, the overall transaction time can easily reach 1 second or even more.

All ACID options are interconnected and affect each other; however, each of them also requires some specific set of actions to be achieved. Atomicity depends on latency more than the other options. For instance, if a transaction executes over multiple nodes, its overall speed depends on the slowest node, and the system can not make the decision to commit or to rollback until all nodes have responded. This might leave the state of the data in an unsustainable mode for a relatively long time and ruin consistency, which ruins durability in the end. Appealing to the CAP theorem, we can state that the main trade-off is not even Consistency versus Availability, but rather Consistency versus Response Time (RT). The more nodes a transaction involves, the higher the Response Time. RT affects Availability in direct proportion, because to ensure a consistent view of the data the system needs to use various concurrency control mechanisms and methods.

3.2 Concurrency control

Concurrency control is the activity of coordinating the actions of processes that operate in parallel, access shared data, and therefore potentially interfere with each other. In database theory, concurrency control ensures that database transactions are performed concurrently without violating the data integrity of the databases. Thus concurrency control is an essential element for correctness in any system where two or more database transactions, executed with a time overlap, can access the same data - virtually any general-purpose database system.

Basically, concurrency control ensures the ACID requirements through certain mechanisms. This section describes the most important principles, methods and concepts that are used to achieve ACID not only on single-node servers but in distributed systems as well. We can divide all concurrency mechanisms into three main categories:

• Optimistic - the family of methods which assume that transaction conflicts are highly unlikely. They do not block (lock) any data during transaction execution, and check whether a transaction meets the isolation and other integrity rules at the end of its execution [23]. If there were any rule violations, the transaction rolls back and re-executes; otherwise it commits. As many current systems prefer Performance and Availability over Consistency, most NoSQL databases use the optimistic approach. However, if conflicts are frequent and rollbacks occur often, it is not sufficient to use optimistic methods.

• Pessimistic - blocks entire data units which are involved in a transaction until it ends. Blocking operations typically reduces performance and can also lead to deadlocks.

• Semi-optimistic - blocks operations in some situations, if they may cause a violation of some rules, and does not block in other situations, while delaying rule checking to the transaction's end.

The crucial difference between these mechanisms is performance, which consists of many factors, such as average transaction completion rates (throughput), the mix of transaction types, the level of parallelism, and others.

Mutual blocking between two or more transactions results in a deadlock, where the transactions involved are stalled and cannot reach completion. Most non-optimistic mechanisms are prone to deadlocks, which are resolved by an intentional abort of a stalled transaction (which releases the other transactions in that deadlock) and its immediate restart and re-execution. The likelihood of a deadlock is typically low. Blocking, deadlocks, and aborts all result in performance reduction, and hence the trade-offs between the categories [30].

To achieve a consistent view of the data, concurrency control has many methods and algorithms. The most popular is two-phase locking. However, for distributed systems, more advanced algorithms and even combinations of several are often required.

Two-phase locking (2PL) is a traditional algorithm that guarantees Serializability - the highest level of data isolation. The algorithm consists of two phases: in the first, expanding phase, it locks all data units one by one; in the second phase, it releases the locks. Figure 1 illustrates the main principle of this algorithm. 2PL is pessimistic and reduces performance heavily. In its original state it is also not protected against deadlocks.

Multiversion concurrency control (MVCC) is a concept that can guarantee Snapshot Isolation - the highest level of data isolation in distributed systems. Being optimistic, MVCC does not lock data, but provides a different data view to overlapping transactions by taking snapshots and marking them with timestamps. In this way, it is possible to store different versions of the same data unit.

The main aim of concurrency control algorithms is to ensure a certain level of data Isolation in terms of ACID terminology.


Figure 1: Two-phase locking algorithm [21]

3.3 Isolation

Isolation is probably the most important ACID option in terms of distributed transactions. The reason behind this is that Atomicity, Consistency and Durability are well-defined values and always ensure the same result. In contrast to them, the Isolation level can differ. According to the ISO standard, there are 4 levels of isolation [7]. Furthermore, there are a few more levels which are not described by the standard, but are commonly used in industry. Table 1 explains the different levels according to the read phenomena which can occur on each level. These phenomena are beyond this paper's scope; however, they are widely known in database theory and defined by the ISO SQL-92 standard.

Isolation Level      Dirty Read   Fuzzy Read          Phantom
Read uncommitted     Possible     Possible            Possible
Read committed       Impossible   Possible            Possible
Cursor stability     Impossible   Sometimes possible  Possible
Repeatable read      Impossible   Impossible          Possible
Snapshot isolation   Impossible   Impossible          Sometimes possible
Serializable         Impossible   Impossible          Impossible

Table 1: Levels of isolation in terms of the three original phenomena. Italic - not defined by the ISO standard

Database                 Default   Max
Actian Ingres 10.0/10S   S         S
Aerospike                RC        RC
Akiban Persistit         SI        SI
Clustrix CLX 4100        RR        RR
Greenplum 4.1            RC        S
IBM DB2 10 for z/OS      CS        S
IBM Informix 11.50       Depends   S
MySQL 5.6                RR        S
MemSQL 1b                RC        RC
MS SQL Server 2012       RC        S
NuoDB                    CR        CR
Oracle 11g               RC        SI
Oracle Berkeley DB       S         S
Oracle Berkeley DB JE    RR        S
Postgres 9.2.2           RC        S
SAP HANA                 RC        SI
ScaleDB 1.02             RC        RC
VoltDB                   S         S

RC: read committed, RR: repeatable read, SI: snapshot isolation, S: serializability, CS: cursor stability, CR: consistent read

Table 2: Default and maximum isolation levels for ACID DBMS as of January 2013 [5]

Table 1 shows that the Serializable level is the most stable and ensures the highest safety of the data. This level means that transactions can not overlap with each other, and it usually uses a two-phase locking mechanism [8]. Database designers realized a long time ago that Serializability can not be achieved in distributed systems [16]. However, Table 2 shows that many current industrial RDBMS do not provide Serializability by default or even do not have it at all. This is because such an isolation level limits concurrency options tremendously even on a single node, which is unacceptable for modern applications.

The information in Table 2 is also particularly surprising when we consider the widespread deployment of many of these non-serializable databases, like Oracle 11g, which are well known to power major business applications. Taking into consideration that most RDBMS vendors have already agreed to relax the Isolation level to achieve better concurrency performance, we can state that the inability to achieve Serializability in NoSQL systems appears to be non-critical for practical applications.

Table 1 also shows that the second most reliable Isolation level is SI (Snapshot Isolation). A transaction executing with Snapshot Isolation always reads data from a snapshot of the (committed) data as of the time the transaction started, called its Start-Timestamp [7]. When the transaction concludes, it will successfully commit only if the values updated by the transaction have not been changed externally since the snapshot was taken. Such a write-write conflict will cause the transaction to abort. Even though SI provides strong data consistency, some anomalies such as write skew may still occur [7]. In 2009, Michael Cahill showed [11] that write skew anomalies at the SI level can be prevented by detecting and aborting "dangerous" triplets of concurrent transactions. The new level is known as Serializable Snapshot Isolation (SSI). SI and SSI are usually implemented within MVCC.

3.4 Consensus Protocols

A major part of distributed systems theory is consensus protocols. The two most popular protocols are Paxos and two-phase commit (2PC).

Two-phase commit protocol

The 2PC protocol is one of the most popular and well-known atomic commitment protocols in distributed systems. It was introduced already in the 1970s and was widely utilized by industry due to its relative simplicity and cheap cost in terms of the number of operations [32]. This protocol can effectively solve a number of possible problems and conflicts during transaction execution, including process, network node, communication and other failures. Many protocol variants exist. They primarily differ in logging strategies and recovery mechanisms.

The basic protocol operates in the following manner: the system has one primary node, which is designated as the coordinator; all other nodes operate as slaves or cohorts. The whole process is divided into two phases. The first one is the commit request phase, or voting phase, during which the coordinator sends a query-to-commit message to all nodes and waits for their answers. All nodes execute the transaction up to the moment when they need to commit, and reply with an agreement message stating whether the execution was successful or not (basically, they vote to commit or rollback the transaction). Each node also performs logging for solving all possible problems. The second phase is the commit phase, or completion phase. If during the voting phase all nodes answered positively, the coordinator sends the commit message to all cohorts. Each cohort completes the operation, releases all locks and resources held during the transaction, and sends an acknowledgment back. The coordinator completes the transaction when all acknowledgments are gathered. If during the voting phase even one node has answered NO in its agreement message, the coordinator sends a rollback message to the cohorts; they roll back all operations using their logs and send the acknowledgement back. The coordinator rolls back the transaction after all acknowledgements have been gathered. This protocol has a couple of critical disadvantages. First of all, it is a blocking protocol, which means that it reduces at least write performance during the voting phase. Also, with this protocol the whole system is as fast as its slowest node. But the main problem is that 2PC has a single point of failure, which is the coordinator. In case the coordinator fails permanently, some cohorts will never resolve their transaction. There are variants of 2PC (dynamic two-phase commit, for example) that solve the coordinator problem; however, they bring extra complexity and slow the algorithm down.

Paxos

Paxos is a popular consensus protocol first proposed by Leslie Lamport [25]. It is provably correct in asynchronous networks that eventually become synchronous, does not block if a majority of participants are available (it withstands n/2 faults), and has provably minimal message delays in the best case.

Nodes can be in three roles: proposers, acceptors and leaders. The proposer sends 'prepare' requests to the acceptors. When the acceptors have indicated their agreement to accept the proposal, the proposer sends a commit request to the acceptors. Finally, the acceptors reply to the proposer, announcing the success or failure of the commit request. Once enough acceptors have committed the value and informed the proposer, the protocol terminates. Many nodes can act as proposers at the same time. Each proposal is unique and has a sequence number. In this way nodes can order the proposals by sequence number (time) and accept only new proposals. In practical Paxos implementations, one of the nodes is elected as a leader. The leader is responsible for making progress. In this way the system operates asynchronously. Paxos does not require all nodes to vote positively, but only a majority. Nearly half the nodes can fail to reply and the protocol will still continue correctly. Such an effect is possible due to the fact that any two majority sets of acceptors will have at least one acceptor in common. Therefore, if two proposals are agreed by a majority, there must be at least one acceptor that agreed to both. This means that when another proposal is made, a third majority is guaranteed to include either the acceptor that saw both previous proposals or two acceptors that saw one each. This method solves the blocking problem of 2PC. Also, Paxos does not have a single point of failure: if the leader fails, the system can select a new one.

Many versions and different implementations of Paxos exist. However, even the fast ones are slower than 2PC, which is basically the fastest protocol. Paxos also sacrifices liveness, i.e. guaranteed termination, when the network is behaving asynchronously; however, this is only a theoretical problem.

Raft

Raft is a relatively new consensus protocol [29], which was developed as an alternative to Paxos. The primary reason for it was that Paxos is too complicated to understand and does not efficiently solve the current needs of distributed systems, because it was designed in 1989 as a solution for mostly theoretical problems. In general, Raft is a successor of Paxos and follows the same logic; the main difference is in the leader election process. Raft requires leader election to occur strictly before any new values can be appended to the log, whereas a Paxos implementation would make the election process an implicit outcome of reaching agreement.


4 Google F1

Google F1 was created as a challenge to serve the company's most important business - AdWords. F1 is a database system which is built on top of the Google key/value store called Spanner.

4.1 Spanner

Spanner is a multi-version, globally distributed and synchronously-replicated database [15]. In Spanner, inside one datacenter, data is organized in a special way: rows are partitioned into clusters called directories using ancestry relationships in the schema. Directories contain fragments of data. Fragments of one directory are stored by groups. Each group has at least one replica tablet per datacenter. Data is replicated synchronously using the Paxos algorithm, and all tablets for a group store the same data. One replica tablet is elected as the Paxos leader for the group, and that leader is the entry point for all transactional activity for the group. There can also be read-only replicas, but they do not participate in leader election and cannot be leaders.

Spanner provides serializable pessimistic transactions using a strict two-phase locking mechanism inside one group. Across multiple groups, Spanner provides transactions using a two-phase commit protocol on top of Paxos. 2PC adds an extra network round trip, so it usually doubles the observed commit latency. 2PC scales well up to 10s of participants, but the abort frequency and latency increase significantly with 100s of participants [31]. Google's philosophy behind this was stated in the Spanner paper: "We believe it is better to have application programmers deal with performance problems due

to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions" [15].

In general, Spanner has a set of interesting features, some of which were introduced for the first time ever. Firstly, the replication configuration for data can be dynamically controlled at a fine grain by applications, which allows users to control durability, availability and read performance by setting the number of replicas and their locations. Secondly, Spanner was the first system to provide externally consistent reads and writes, and globally-consistent reads across the database at a timestamp (snapshot isolation). The timestamps reflect serialization order. In addition, the serialization order satisfies external consistency (or, equivalently, linearizability): if a transaction T1 commits before another transaction T2 starts, then T1's commit timestamp is smaller than T2's. On a large scale, such guarantees are possible thanks to the TrueTime API.

TrueTime API

For distributed systems that use timestamps for synchronization, time is a critical issue. Researchers have introduced several solutions, such as Logical Clocks [24] and Vector Clocks [27]. However, in practice most current Internet applications use physical time and the Network Time Protocol (NTP) for synchronization [28]. Since none of the above systems is suitable for maintaining consistent distributed transactions, for different reasons that are beyond this paper's scope, Google decided to introduce its own solution.

TrueTime (TT) is a system for time synchronization which consists of a complex mix of software and hardware. TrueTime uses a set of time master machines at each datacenter. The majority of masters use a GPS clock. The remaining masters (Armageddon masters) are equipped with atomic clocks. This is done because GPS and atomic clocks have different failure modes. All masters' time references are regularly compared against each other. Each master also cross-checks the rate at which its reference advances time against its own local clock, and evicts itself if there is substantial divergence. Between synchronizations, Armageddon masters advertise a slowly increasing time uncertainty that is derived from conservatively applied worst-case clock drift. GPS masters advertise an uncertainty that is typically close to zero [15].

Every daemon polls a number of masters to reduce vulnerability to errors from any one master. Both types of masters (GPS and Armageddon) are picked from different locations. Daemons apply a variant of Marzullo's algorithm [26] to detect and reject liars, and synchronize the local machine clocks to the non-liars.

This complex architecture allows the system to reduce time uncertainty to 7 ms, which is enough to provide concurrency guarantees for external transactions.

4.2 F1 key features

Besides the scalability, synchronous replication, strong consistency and ordering properties that F1 inherits from Spanner, F1 itself adds additional properties [31] such as:

• distributed SQL queries, including joining data from external data sources;
• transactionally consistent secondary indexes;
• asynchronous schema changes, including database reorganizations;
• optimistic transactions;
• automatic change history recording and publishing.

Figure 2 represents the basic architecture of the F1 system. F1 is built on top of Spanner, and F1 servers are located in the same datacenters to reduce latency. The Spanner servers retrieve their data from the Colossus File System (CFS) in the same datacenter. However, F1 can also communicate with Spanner servers in other datacenters to ensure availability and load balancing. F1 masters monitor the health state of a slave pool and distribute the list of available slaves to F1 servers. Operations from all clients go through F1 servers, except MapReduce processes, which execute directly at the Spanner level to increase performance.

Figure 2: The basic architecture of the F1 system, with servers in two datacenters [31]

F1 provides three types of ACID transactions, all based on Spanner transactional support. A typical F1 transaction consists of a number of reads and optionally one write to commit the result.

Snapshot transactions are read-only transactions, which are possible thanks to the Spanner timestamp feature. Snapshot transactions allow multiple client servers to see consistent views of the entire database at the same timestamp.

Pessimistic transactions map directly onto Spanner transactions.

Optimistic transactions consist of a non-locking read phase and then a short write phase. To resolve conflicts, F1 stores the modification timestamp of each row in a special hidden locked column. In the end, the F1 server checks all row timestamps in a short-lived pessimistic transaction and sends the data to Spanner to commit the result if no conflicts occur.

Optimistic transactions bring a lot of benefits and performance to the system, but there are some trade-offs that come with them. Phantom writes are possible, and they can lead to low throughput under high contention.
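The optimistic commit check described above (a lock-free read phase that records row timestamps, followed by a short pessimistic re-check at commit time) can be sketched as follows. This is a minimal single-node illustration: `Store`, `ConflictError` and the counter-based clock are assumptions made for the sketch, not F1's actual interfaces.

```python
# Hedged sketch of an optimistic transaction with per-row modification
# timestamps. All names here are illustrative, not F1 APIs.

class ConflictError(Exception):
    pass

class Store:
    """Toy row store: each key maps to (value, last_modified_timestamp)."""
    def __init__(self):
        self.rows = {}
        self.clock = 0  # monotonically increasing commit timestamp

    def read(self, key):
        return self.rows.get(key, (None, 0))

    def commit(self, read_set, writes):
        # Short "pessimistic" phase: re-check every timestamp seen at read time.
        for key, ts_seen in read_set.items():
            _, ts_now = self.read(key)
            if ts_now != ts_seen:         # row changed since the read phase
                raise ConflictError(key)  # abort; the caller may retry
        self.clock += 1
        for key, value in writes.items():
            self.rows[key] = (value, self.clock)

store = Store()
store.commit({}, {"x": 1})        # initial write

# Lock-free read phase: remember the timestamp of every row we read.
value, ts = store.read("x")
read_set = {"x": ts}

store.commit({}, {"x": 99})       # a concurrent writer bumps x's timestamp

try:
    store.commit(read_set, {"x": value + 1})
except ConflictError:
    print("conflict on x, transaction aborted")
```

Under high contention on a single row (the counter example above), almost every such re-check fails, which is exactly why throughput collapses.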


This happens, for example, when many clients increment a single counter concurrently.

In the end, Spanner together with F1 brings database development to a new level, and erases the boundary between NoSQL and relational systems more than ever before.

5 CockroachDB

CockroachDB is a new scalable, transactional, geo-replicated datastore [1]. It is an open-source project led by a team of "ex-Googlers" who were originally inspired by Google's success and want to create a database as powerful as Spanner, but not as complex to deploy and maintain. The project is currently in alpha stage, and all documentation and progress can be found in its GitHub repository [2].

Cockroach is a distributed key/value datastore. Figure 3 represents the basic architecture of the system, which has a layered structure. On the bottom level, the system uses another open-source project, RocksDB, to store the actual data. The raw data is stored, sorted by key, within ranges inside stores. Ranges are replicated using the Raft consensus protocol; one range has at least 3 replicas. One store is located on one physical device, and sets of stores form nodes. The distributed key/value store communicates with any number of physical Cockroach nodes, forming the first level of abstraction by handling the details of range addressing. The structured data API then provides familiar relational concepts such as schemas, tables, columns, and indexes. The SQL level is not implemented yet.

Figure 3: The basic architecture of CockroachDB consisting of 2 nodes

Cockroach provides optimistic distributed transactions. It has two levels of isolation: SI and SSI. Cockroach uses MVCC together with the 2PC protocol to provide strong consistency. Each Cockroach transaction is assigned a random priority and a "candidate timestamp" at start. The candidate timestamp is the provisional timestamp at which the transaction will commit, and is chosen as the current clock time of the node coordinating the transaction. This means that a transaction without conflicts will usually commit with a timestamp that, in absolute time, precedes the actual work done by that transaction. In the course of organizing the transaction between one or more distributed nodes, the candidate timestamp may be increased, but will never be decreased. The core difference between the two isolation levels SI and SSI is that the former allows the commit timestamp to increase and the latter does not. Timestamps are generated by combining a physical and a logical component [22], which provides low latency without using atomic and GPS clocks.

6 Conclusion

In this paper, we have discussed some core principles of database theory and demonstrated some core design challenges in implementing transactions. Fast ACID-compliant transactions in distributed systems were an unachievable dream even 10 years ago. However, using the latest engineering solutions, Google was able to implement ACID transactions on a global scale and provided a proof of concept to the whole world. Currently, more and more systems strive to achieve NewSQL features. CockroachDB is a good example of such a system. Even though Cockroach is still under development and far from a final release, it provides strong evidence that NewSQL datastores can be implemented without additional hardware setups, simply by using the latest algorithms and techniques such as HybridTime and the Raft protocol.

References

[1] CockroachDB. http://cockroachdb.org/. [Online; accessed: 2015-04-11].

[2] CockroachDB GitHub. https://github.com/cockroachdb/cockroach. [Online; accessed: 2015-04-13].

[3] M. Aslett. How will the database incumbents respond to NoSQL and NewSQL. San Francisco, The, 451:1–5, 2011.

[4] P. Bailis. Communication costs in real-world networks. http://www.bailis.org/blog/communication-costs-in-real-world-networks. [Online; accessed: 2015-03-10].

[5] P. Bailis, A. Fekete, A. Ghodsi, J. M. Hellerstein, and I. Stoica. HAT, not CAP: towards highly available transactions. In Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems, pages 24–24. USENIX Association, 2013.


[6] J. Baker, C. Bond, J. C. Corbett, J. Furman, A. Khorlin, J. Larson, J.-M. Leon, Y. Li, A. Lloyd, and V. Yushprakh. Megastore: Providing scalable, highly available storage for interactive services. In Proceedings of the Conference on Innovative Data Systems Research (CIDR), pages 223–234, 2011.

[7] H. Berenson, P. Bernstein, J. Gray, J. Melton, E. O'Neil, and P. O'Neil. A critique of ANSI SQL isolation levels. In ACM SIGMOD Record, volume 24, pages 1–10. ACM, 1995.

[8] P. A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency control and recovery in database systems, volume 370. Addison-Wesley, New York, 1987.

[9] E. Brewer. CAP twelve years later: How the "rules" have changed. Computer, 45(2):23–29, 2012.

[10] E. A. Brewer. Towards robust distributed systems. In PODC, volume 7, 2000.

[11] M. J. Cahill, U. Röhm, and A. D. Fekete. Serializable isolation for snapshot databases. ACM Transactions on Database Systems (TODS), 34(4):20, 2009.

[12] R. Cattell. Scalable SQL and NoSQL data stores. ACM SIGMOD Record, 39(4):12–27, 2011.

[13] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems (TOCS), 26(2):4, 2008.

[14] M. Chapple. The ACID model. http://databases.about.com/od/specificproducts/a/acid.htm. [Online; accessed: 2015-02-05].

[15] J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, et al. Spanner: Google's globally distributed database. ACM Transactions on Computer Systems (TOCS), 31(3):8, 2013.

[16] S. B. Davidson, H. Garcia-Molina, and D. Skeen. Consistency in a partitioned network: a survey. ACM Computing Surveys (CSUR), 17(3):341–370, 1985.

[17] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's highly available key-value store. In ACM SIGOPS Operating Systems Review, volume 41, pages 205–220. ACM, 2007.

[18] M. Fowler. NosqlDefinition. http://martinfowler.com/bliki/NosqlDefinition.html, 2012. [Online; accessed: 2015-02-05].

[19] T. Haerder and A. Reuter. Principles of transaction-oriented database recovery. ACM Computing Surveys (CSUR), 15(4):287–317, 1983.

[20] T. Harter, D. Borthakur, S. Dong, A. S. Aiyer, L. Tang, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. Analysis of HDFS under HBase: a Facebook Messages case study.

[21] P. Kieun. All about Two-Phase Locking and a little bit MVCC. http://www.cubrid.org/blog/cubrid-life/all-about-two-phase-locking-and-a-little-bit-mvcc/, 2012. [Online; accessed: 2015-03-18].

[22] S. S. Kulkarni, M. Demirbas, D. Madeppa, B. Avva, and M. Leone. Logical physical clocks and consistent snapshots in globally distributed databases.

[23] H.-T. Kung and J. T. Robinson. On optimistic methods for concurrency control. ACM Transactions on Database Systems (TODS), 6(2):213–226, 1981.

[24] L. Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558–565, 1978.

[25] L. Lamport. The part-time parliament. ACM Transactions on Computer Systems (TOCS), 16(2):133–169, 1998.

[26] K. Marzullo and S. Owicki. Maintaining the time in a distributed system. In Proceedings of the second annual ACM symposium on Principles of distributed computing, pages 295–305. ACM, 1983.

[27] F. Mattern. Virtual time and global states of distributed systems. Parallel and Distributed Algorithms, 1(23):215–226, 1989.

[28] D. L. Mills. A brief history of NTP time: Memoirs of an internet timekeeper. ACM SIGCOMM Computer Communication Review, 33(2):9–21, 2003.

[29] D. Ongaro and J. Ousterhout. In search of an understandable consensus algorithm. In Proc. USENIX Annual Technical Conference, pages 305–320, 2014.

[30] K. Roebuck. Extreme Transaction Processing: High-impact Emerging Technology - What You Need to Know: Definitions, Adoptions, Impact, Benefits, Maturity, Vendors. Emereo Publishing, 2012.

[31] J. Shute, R. Vingralek, B. Samwel, B. Handy, C. Whipkey, E. Rollins, M. Oancea, K. Littlefield, D. Menestrina, S. Ellner, et al. F1: A distributed SQL database that scales. Proceedings of the VLDB Endowment, 6(11):1068–1079, 2013.

[32] D. Skeen. Nonblocking commit protocols. In Proceedings of the 1981 ACM SIGMOD international conference on Management of data, pages 133–142. ACM, 1981.

16 Security-Enhanced Linux policy analysis techniques

Filippo Bonazzi Student number: 472052 [email protected]

Abstract

SELinux is an implementation of Mandatory Access Control on Linux, where the access control is enforced according to a centralized security policy. The policy definition language is very flexible and powerful, allowing for extremely fine-grained policy control over the target system; as a result, SELinux policies are usually complex, and their analysis requires expert knowledge and advanced methods and tools. This paper gives an overview of the main existing tools and formal methods dedicated to the analysis of the SELinux policy, with a focus on deficiencies and areas of possible future improvement.

KEYWORDS: SELinux, policy, analysis, tools, formal methods, information flow

1 Introduction

Security Enhanced Linux (SELinux)1 is an implementation of Mandatory Access Control (MAC) on Linux. SELinux allows fine-grained control over system resources, defining the permissions in a centralized, administratively-set policy. This centralization makes the policy more practical to analyse compared to traditional UNIX file system permissions, since the analysis can be focused in a single place without traversing the whole file system; however, the finer-grained control results in a considerable policy size, often rendering its analysis daunting.

SELinux policy analysis is usually performed by different parties, with different goals. The SELinux project and Linux distributions such as RedHat design policies for real-world systems: their primary interest is practical tools to generate, parse, modify and verify policies. This has acquired a particular weight in the past few years, after the introduction of SELinux on the Android mobile platform. The problem of policy analysis has also received interest in academia: SELinux policy analysis constitutes a fairly high-profile use case for formal verification, where the correctness of a particular policy can be proved with respect to given formal security requirements. These two groups are not necessarily disjoint, and these efforts exist side-by-side.

This paper gives a survey of the main tools and approaches devised in the last 15 years. Section 2 gives a brief introduction on SELinux and the SELinux policy. Section 3 presents practical tools designed to parse, analyse and verify policies; Section 4 presents formal verification tools designed to evaluate an SELinux policy with regard to some given constraints; Section 5 presents some tools devoted to security policy information visualisation; Section 6 presents some more tools related to the topic. Finally, Section 7 identifies some promising areas for future development and provides some concluding remarks.

2 SELinux

SELinux is a Linux Security Module (LSM) [1] providing support for access control policies, usually configured to provide MAC. It was developed in 2001 by the U.S. National Security Agency (NSA), building on previous research projects such as the Flux Advanced Security Kernel (FLASK) architecture and the Fluke research operating system [18]. The main contribution of this architecture was the separation of policy from enforcement: the security logic would be integrally defined in the policy, and a security server would be designed to apply the policy when making security decisions. This conceptual separation has been carried over to SELinux: therefore, it is only necessary to analyse the SELinux policy to obtain a complete description of an SELinux system.

2.1 Security Models

SELinux supports two main security models to regulate access to system resources: Type Enforcement (TE) [8][5] and Role-Based Access Control (RBAC) [10][19]. It also supports Multi-Level Security (MLS) [7][6], which, however, is seldom used, and is therefore left outside the scope of this survey.

All access control can be generalized in terms of subjects, objects, and actions performed by a set of subjects on a set of objects. An access control system controls which actions are allowed by a particular subject on a particular object based on a security policy. The policy describes the manner in which subjects may interact with objects. In the following sections, this terminology will be used as reference.

2.1.1 Type Enforcement

In SELinux, system resources are categorized in classes, which represent the equivalent operating system entities: example classes are file, dir, socket, process and more. To control the actions that can be performed on the system resources, permissions are mapped to each operating system

1 http://selinuxproject.org/


ALLOW [domain] [type] : [class] {[ allowed permissions ]}

allow my_process my_file_t : file {ioctl read getattr lock open} allow my_process my_file_t : dir {ioctl read getattr search open} allow mydomain my_process : process {getpgid setpgid dyntransition}

Table 1: Type Enforcement allow rules

primitive: example permissions are read, write, open, getattr and more. A given system resource will have a finite set of actions that can be performed on it: the associated class will therefore have a finite set of permissions that can be granted over it. System resources are the objects in the SELinux security model. In order to perform access control, SELinux has to address objects with some stricter granularity than by class: to this end, objects are assigned a type that semantically defines their purpose.

Conversely, processes are the subjects in the SELinux security model: in order to perform access control, SELinux assigns them a domain that semantically defines their purpose.

Once subjects and objects are labelled, performing access control is simply a matter of writing rules controlling which domains are allowed to access which types. In detail, an allow rule allows a subject in a certain domain a certain set of permissions on an object of a certain class labelled with a certain type.

The format of Type Enforcement rules can be seen in Table 1 along with some sample rules. In the first rule, a process in the my_process domain is allowed to perform ioctl, read, getattr, lock and open on a file labelled with the my_file_t type. In the second, another set of permissions (ioctl, read, getattr, search, and open) is granted on a dir labelled with the same my_file_t type. The third rule allows a process in the mydomain domain to perform getpgid, setpgid and dyntransition on a process labelled with the my_process type.

In accordance with the principle of safe defaults, SELinux follows a whitelisting approach where every action is denied by default, and explicit rules in the policy grant access to a set of permissions on an object.

2.1.2 Role-Based Access Control

Role-Based Access Control (RBAC) is an abstraction over the concept of "user", where sets of permissions are assigned to administratively defined roles; users are subsequently assigned one or more roles, as deemed necessary by the functions they need to perform. This decoupling of users from permissions simplifies user creation and management, and allows for much more concise policy definition and analysis. RBAC is a widely used access control model in large corporations [17], since it naturally mimics an organization's behaviour, where employees change roles and are dynamically assigned to different tasks over time.

Figure 1: Apol

2.2 Security policy

SELinux security decisions are made according to a security policy, which describes the complete state of the security system. The separation of the policy from the Security Server enforcing it is one of the main tenets of the FLASK architecture: the effort of the SELinux research communities can therefore focus mainly on the definition and analysis of the policy, taking for granted the actual implementation of the security architecture and the enforcement of the policy.

A SELinux policy is written as a set of plaintext files specifying the entities used in Type Enforcement and Role-Based Access Control: types, domains, classes, permissions, rules, users, roles, etc. These plaintext files are compiled into a single binary representation called a binary policy: this is done to facilitate efficient policy handling by the kernel when taking security decisions at runtime. SELinux policies are usually distributed in this binary format.

2.3 SEAndroid

SELinux has been ported in recent years to the Android platform: work started in 2011 with a paper by Smalley at NSA [20], and culminated in 2013 [21], when Android 4.4 "KitKat" was released containing a fully functional SELinux port called SEAndroid. However, in Android 4.4 SELinux was configured in permissive mode (actions would be logged but not blocked), and it remained so until Android 5.0 "Lollipop" in late 2014, which featured SELinux in enforcing mode.
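The whitelisting semantics of Type Enforcement described in Section 2.1.1 reduce to a set lookup: an action is permitted only if some allow rule grants it. A minimal sketch, using the sample rules of Table 1; the `ALLOW` table and `check()` helper are illustrative stand-ins, not actual SELinux APIs.

```python
# Hedged sketch of a Type Enforcement access decision.
# (domain, type, class) -> set of allowed permissions, from Table 1.
ALLOW = {
    ("my_process", "my_file_t", "file"):
        {"ioctl", "read", "getattr", "lock", "open"},
    ("my_process", "my_file_t", "dir"):
        {"ioctl", "read", "getattr", "search", "open"},
    ("mydomain", "my_process", "process"):
        {"getpgid", "setpgid", "dyntransition"},
}

def check(domain, obj_type, obj_class, perm):
    """Whitelist semantics: everything is denied unless a rule grants it."""
    return perm in ALLOW.get((domain, obj_type, obj_class), set())

print(check("my_process", "my_file_t", "file", "read"))   # -> True
print(check("my_process", "my_file_t", "file", "write"))  # -> False (no rule)
```

The default-deny behaviour falls out of the empty-set fallback: any (domain, type, class) triple with no rule grants nothing.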


While SEAndroid features some additions and differences with respect to SELinux, its basic mechanism and policy definition language are the same. This allows tools and methods for policy analysis to be used across the two domains, the only caveat being the different expert knowledge required for each of the two.

The main difference between the two use cases is that SELinux is primarily used on servers, where applications are usually well-known (Apache, Postfix, . . . ) and individually covered by the policy. In SEAndroid, there is a rather strong distinction between system services and user apps: the former are part of the system, and therefore individually targeted by the policy, while the latter are usually grouped in a common, rather restrictive domain.

SELinux usage has traditionally been somewhat limited, and only specific Linux distributions (RedHat, Fedora) provide an SELinux-enabled system by default. Android "Lollipop" devices already account for 5.4% of all Android devices as of April 20152: this number is rather significant and only destined to increase, leading to a very widespread usage of SELinux on the Android platform.

2 https://developer.android.com/about/dashboards/index.html

3 Practical policy analysis: SETools

Practical policy analysis tools are important to security professionals designing policies for real systems.

The SETools library [3] is a set of tools and libraries for SELinux policy analysis, developed by Tresys Technology. It is the de facto standard for handling SELinux policies in text and binary format; the tools it offers are shown in Table 2.

apol: graphical policy analysis tool
seaudit: graphical audit log analysis tool
sediffx: graphical tool to compare two policies
seinfo: tool to get components from a policy
sesearch: tool to get rules from a policy
findcon: tool to find files matching a context
replcon: tool to replace a file's context with another
indexcon: tool to index the contexts in a policy
sechecker: tool to perform modular checks on a policy
sediff: tool to semantically compare two policies

Table 2: SETools tools

For the sake of brevity, we will give a short overview of the most relevant tools: apol and sediff.

3.1 Apol

Apol is a graphical tool to extract information from a SELinux policy in several formats. It can interactively query for policy elements, and perform information flow and other high-level analyses. The program is shown in Figure 1 performing a Type query.

3.2 Sediff/sediffx

Sediff is a command-line tool that can find semantic differences between two policies, such as added, removed or modified types, users, roles and, more importantly, rules. This is very useful when analysing policies expanded from a base version, as is the case e.g. in SEAndroid policy development. Sediffx is a graphical application that performs the same functions in a graphical environment.

The sediffx program is shown in Figure 2 reporting some statistics about the semantic difference between two SEAndroid policies.

Figure 2: Sediffx

4 Formal policy analysis

Formal policy analysis tools are often developed as proof of concept of a wider analysis methodology, and do not usually result in actual publicly available software.

4.1 Gokyo

Figure 3: Gokyo example access control model [15]


Gokyo [15] is a policy analysis tool designed to identify and resolve conflicting policy specifications. It has been developed by IBM Research.

The tool takes a higher-level approach to policy design by imposing some constraints on the policy writer to respect the intended security behaviour of the policy. This ensures that the policy specification is correct, and in turn its implementation. The tool helps to prevent problems such as integrity conflicts, contrasting rules, and developer oversights.

Gokyo models a policy as a graph, where Classes, Permissions, Roles and Domains are represented by graph nodes; policy constraints are represented in the form of graph properties. Policy analysis consists of asserting whether the corresponding properties hold for the graph generated from the policy under analysis. An example of an access control specification using this model is shown in Figure 3.

4.2 HRU Formal Verification

The HRU security model is a computer security model which deals with the integrity of access rights in an operating system [13]. It is named after its authors, Harrison, Ruzzo and Ullman, and is primarily used for model safety analysis, with a number of tools and techniques devised for this purpose.

A 2011 paper by Amthor, Kuhnhauser and Polck [4] proposes an approach to SELinux access control policy analysis taking advantage of these techniques. In order to use HRU model safety analysis techniques to analyse SELinux policies, these have to be mapped to an equivalent HRU model first; analysis is then performed on the equivalent model, and the results are used to correct the original policy.

An HRU model is a state machine, where each state is a snapshot of a system's Access Control Matrix (ACM). State transitions happen when applications modify the subject set, object set, or permissions of the ACM. This way, security properties such as right proliferation can be analysed by employing a relatively traditional technique such as state reachability analysis: if a state is reachable starting from a given state with some set of rights, it is termed not safe with respect to that given state. Conversely, if a state can never be reached from a given state, it is termed safe with respect to that state's set of rights.

The authors propose a procedure to perform the isomorphic mapping of an SELinux policy onto an HRU model, noting that such a procedure has to guarantee equivalence - the two systems must behave in the same way after the mapping - and reversibility - all the information about the SELinux policy must be preserved in the HRU model. The first major step of this procedure is rewriting the SELinux policy elements as a single composed matrix having the semantics of an ACM. Then the actual HRU model is built on top of this matrix, adding an authorization scheme.

Finally, the authors developed a tool (sepol2hru) to automatically perform this model transformation. The tool takes input data from the policy source files containing all the rules, classes and permissions, and from lists which encode an initial system configuration; it outputs the HRU model description in a single XML-based file.

4.3 Information flow

Operating systems security is usually built to guarantee a degree of both data integrity and confidentiality, amongst other requirements. In systems security theory, these two goals can be summed up as information flow control: a secure system must be able to control, and possibly restrict, the flow of information between entities.

To preserve integrity, the flow from untrusted sources to trusted destinations must be controlled by a trusted agent able to verify and sanitize the incoming information based on the specific untrusted source. To preserve confidentiality, the flow from trusted sources to untrusted destinations must be controlled by a trusted agent able to sanitize and tailor the outgoing information to the specific untrusted destination.

A 2005 paper by MITRE Corporation researchers Guttman, Herzog, Ramsdell and Skorupka [12] proposes an analysis method for SELinux policies based on information flow analysis. SELinux domain transitions are first mapped into an equivalent information flow diagram, then the resulting model is checked to determine whether a given integrity or confidentiality goal is enforced by a particular SELinux configuration.

The researchers developed automatic tools to convert the SELinux labeled transition system and a set of diagrams into input for the model checking software NuSMV (http://nusmv.fbk.eu/). The software produces output showing whether the security goal is met by the policy files, and if not, what is the information flow that breaks the security model. Sample output of the NuSMV model checker is shown in Figure 4.

    -- specification
    !(t = user_t
      & E[t != httpd_admin_t U t = httpd_sys_script_t
          & EF (k = TRUE & t = httpd_sys_script_t)])
    is false
    -- as demonstrated by the following execution sequence
    -> State 1.1 <-
      t = user_t
      r = system_r
      u = system_u
      c = netif_c
      p = rawip_send_p
      k = 1
    -> State 1.2 <-
      t = netif_ipsec2_t
      r = object_r
      u = jdoe_u
      p = udp_recv_p
    -> State 1.3 <-
      t = dpkg_t
      r = system_r
      u = system_u
      c = fifo_file_c
      p = append_p
    -> State 1.4 <-
      t = httpd_sys_script_t
      r = object_r
      u = jdoe_u
      c = netif_c
      p = accept_p

Figure 4: NuSMV sample output [12]
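The HRU safety question of Section 4.2 - can a subject ever acquire a given right? - reduces to state reachability over ACM snapshots. The following is a minimal sketch with an invented authorization scheme (grant_write) and invented subjects; it is a toy illustration of the technique, not the sepol2hru tool:

```python
# Toy HRU-style safety analysis: states are Access Control Matrix
# snapshots, commands are the authorization scheme, and safety is
# decided by breadth-first state reachability.
from collections import deque

def acm_key(acm):
    """Hashable snapshot of an ACM: {(subject, object): set of rights}."""
    return frozenset((cell, frozenset(rights)) for cell, rights in acm.items())

def reachable_states(initial_acm, commands):
    """Enumerate every ACM reachable from the initial one."""
    seen = {acm_key(initial_acm)}
    queue = deque([initial_acm])
    while queue:
        acm = queue.popleft()
        yield acm
        for cmd in commands:
            nxt = cmd(acm)
            if nxt is not None and acm_key(nxt) not in seen:
                seen.add(acm_key(nxt))
                queue.append(nxt)

def is_safe(initial_acm, commands, subject, obj, right):
    """Safe iff no reachable state grants `right` in cell (subject, obj)."""
    return all(right not in acm.get((subject, obj), set())
               for acm in reachable_states(initial_acm, commands))

# Invented authorization scheme: anyone holding 'own' on an object may
# grant 'write' on it to every subject (deliberate right proliferation).
def grant_write(acm):
    new = {cell: set(r) for cell, r in acm.items()}
    changed = False
    for (s, o), rights in acm.items():
        if 'own' in rights:
            for (s2, o2) in acm:
                if o2 == o and 'write' not in new[(s2, o2)]:
                    new[(s2, o2)].add('write')
                    changed = True
    return new if changed else None

acm0 = {('alice', 'f'): {'own'}, ('bob', 'f'): {'read'}}
print(is_safe(acm0, [grant_write], 'bob', 'f', 'write'))  # False: 'write' leaks to bob
```

Real policies yield state spaces far too large for such naive enumeration, which is why the authors rely on dedicated HRU analysis techniques rather than brute-force search.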

20 Aalto University T-110.5191 Seminar on Internetworking Spring 2015
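The information flow goals of Section 4.3 can likewise be phrased as reachability questions on the flow graph. Below is a toy sketch, with invented domain names, of checking an integrity goal - every untrusted-to-trusted flow must pass through a sanitizer domain; it is only a stand-in for the NuSMV-based machinery of [12]:

```python
# Check an integrity goal on an information-flow graph: no flow from an
# untrusted domain may reach a trusted one while bypassing the sanitizer.
from collections import deque

def flows_avoiding(graph, start, avoid):
    """All nodes reachable from `start` along paths that skip `avoid`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt != avoid and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def integrity_goal_holds(graph, untrusted, trusted, sanitizer):
    return all(t not in flows_avoiding(graph, u, sanitizer)
               for u in untrusted for t in trusted)

# Invented flow graph: the direct tmp_t -> etc_t edge bypasses the sanitizer.
flow = {
    'user_t':      ['tmp_t'],
    'tmp_t':       ['sanitizer_t', 'etc_t'],
    'sanitizer_t': ['etc_t'],
}
print(integrity_goal_holds(flow, {'user_t'}, {'etc_t'}, 'sanitizer_t'))  # False
```

Removing the offending direct edge makes the goal hold, which mirrors the kind of counterexample-driven policy fix the NuSMV traces enable.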

5 Policy visualisation

SELinux policy analysis is made more complex by the fact that it requires accurate, in-depth understanding of the relationships between apparently distant rules, potentially hidden by the imposing size of the policy. Even an expert eye would benefit from an easier way to analyse and browse an SELinux policy: time and effort saved on exploring the policy could be better spent analysing the most sensitive areas. The research community has naturally taken this direction by using extensive procedure automation, as presented in the previous sections. Another fundamental technique, destined to become more and more relevant as SELinux-enabled systems become more widespread, is data visualisation. Data visualisation allows key policy information to be understood visually, without reading through large amounts of text, and allows higher-level abstractions to be used in policy analysis.

5.1 SPTrack

A 2012 paper by Clemente, Kaba, Rouzaud-Cornabas, Alexandre, and Aujay [9] presents a method and tool (SPTrack) to allow the manipulation of real SELinux security policies and to visualize potential security violations. The researchers represent an SELinux policy as a graph, where nodes are SELinux objects (including subjects) and edges are the interactions between the objects; SELinux classes and permissions are associated with each edge of the graph. Given the somewhat unwieldy size of such a visualisation, they also present a reduced graph displaying only information flows. In this second graph the researchers resort to a notion of criticality to classify flows; they informally define criticality as "the potential erasability provided by the interaction (syscall) for the subject to alter the integrity or the confidentiality of the object" [9]. Following this definition, flows are assigned the colours green, blue, yellow and red by increasing criticality.

These graphs are realized with the SPTrack tool developed for the purpose. The two kinds of graphs - full policy and information flow - are shown in Figure 5. The information flow graph on the right has been reduced to show only medium and high criticality flows.

Figure 5: Full policy and information flow graph [9]

5.2 SEGrapher

A 2011 paper by Marouf and Shehab [16] proposes a clustering technique and visualisation tool (SEGrapher) to analyse SELinux policies. The authors propose a clustering algorithm that groups SELinux types based on their allowed permissions. Clusters provide a simple abstraction that allows humans to visually discover interesting and inherent relations between types, without getting lost in the complexity of the policy definition.

The SEGrapher tool implements the proposed clustering algorithm and provides a way to visualize the clusters and their interactions. The tool is based on the Java JDK 1.6, and uses the SETools libraries (Section 3) for parsing SELinux policies. A sample cluster graph generated by SEGrapher is shown in Figure 6.

Figure 6: SEGrapher cluster graph [16]

6 Related work

Many more tools have been written to work with SELinux policies: we give here a brief overview of some tools that are somewhat relevant to the topic, but do not deserve a more in-depth presentation.

6.1 Polgen

Polgen is a tool for semi-automated SELinux policy generation, developed by the MITRE Corporation [22]. It is aimed at system administrators - who may be tasked with integrating unfamiliar applications into existing SELinux policies - and at application authors - who may want to build policies for their applications without being overly experienced in SELinux.

Polgen appears to be no longer in active development, and most authoritative sources are no longer available on the Internet.
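The clustering idea of Section 5.2 can be approximated in a few lines: group types whose sets of allowed (class, permission) pairs are similar. The policy fragment, the greedy grouping strategy and the Jaccard threshold below are invented for illustration and are not the authors' exact algorithm:

```python
# Group SELinux types by similarity of their allowed permission sets.
def jaccard(a, b):
    """Similarity of two sets; 1.0 for two empty sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

def cluster_types(allowed, threshold=0.5):
    """Greedy clustering: a type joins the first cluster whose
    representative permission set is similar enough to its own."""
    clusters = []  # list of (representative permission set, [types])
    for t, perms in sorted(allowed.items()):
        for rep, members in clusters:
            if jaccard(rep, perms) >= threshold:
                members.append(t)
                break
        else:
            clusters.append((perms, [t]))
    return [members for _, members in clusters]

# allow rules condensed to type -> set of (class, permission) pairs
allowed = {
    'httpd_t':        {('file', 'read'), ('tcp_socket', 'listen')},
    'httpd_script_t': {('file', 'read'), ('tcp_socket', 'listen')},
    'passwd_t':       {('shadow', 'write')},
}
print(cluster_types(allowed))  # [['httpd_script_t', 'httpd_t'], ['passwd_t']]
```

The resulting clusters are the nodes one would then lay out as a graph, which is the abstraction SEGrapher visualizes.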


6.2 audit2allow

Audit2allow is a tool for automated SELinux policy generation, developed in 2006 by the NSA as part of the SELinux project userspace tools [2]. Using audit2allow, policy writers can convert SELinux audit messages into rules, allowing quick and easy policy generation for unfamiliar applications. The generated policy, however, is not necessarily correct, complete or secure, since it is not designed to be so: the rules depend on the code paths taken by the execution flow, and there is no way to distinguish intended and possibly malicious application behaviour.

6.3 Shrimp

There has been some research in applying Domain Specific Languages (DSL) [11] to SELinux policy development and verification.

A 2009 paper by Hurd et al. [14] proposes two Domain Specific Languages, Lobster and Symbion, to specify SELinux policies at a high abstraction level, and then render them more specific without loss of the original meaning through a process termed refinement. In the Lobster language, a policy is represented as a program: when interpreted, the program generates the equivalent information flow graph, from which the SELinux policy is in turn generated. The Symbion language is an assertion language for information flows in Lobster policies. Symbion flow assertions are attached to Lobster policies: when the Lobster policy is interpreted into the equivalent information flow graph, the resulting flows are checked against the assertions, ensuring the correct behaviour of the model.

The researchers then propose a tool (shrimp) to analyze and find errors in the SELinux Reference Policy, much akin to the Lint tool for the C programming language. Shrimp outputs HTML policy documentation and analysis results.

7 Conclusions

The SELinux policy analysis field deals with a complex and sensitive subject, which is becoming more and more relevant as SELinux adoption increases across desktop and mobile platforms. This increasing importance is bound to cause larger involvement by corporations - such as security companies and Android OEMs - in policy development: this will indubitably lead to the development of better and more advanced tools to deal with SELinux policies, from the original design to quality assurance and security analysis.

Some of these improvements are already taking place today: the SETools library presented in Section 3 is currently being rewritten in Python as SEToolsv4, and also being expanded to better deal with the SEAndroid use case.

The greatest developments will come in the area of automated policy security analysis, where the lack of a tool to perform heuristic policy analysis and reporting is evident. Such a tool, working on an expert-provided knowledge base of anti-patterns and common mistakes, would prove invaluable in SELinux policy quality assurance; it would also be an interesting first step towards checking whether a policy is minimal or not with regard to its intended behaviour. Future developments of such a tool could also include higher-level policy verification, as proposed by Gokyo in Section 4.1 and Shrimp in Section 6.3, to guarantee that the actual policy is correct and complete in line with its original specification.

Significant developments are also to be expected in the policy visualisation area, most likely as built-in additions to existing policy analysis tools such as those in the SETools library.

References

[1] Linux Security Modules Kernel Documentation. https://www.kernel.org/doc/Documentation/security/LSM.txt. Accessed: 2015-04-06.

[2] SELinux Userspace. https://github.com/SELinuxProject/selinux/wiki. Accessed: 2015-04-06.

[3] SETools Policy Analysis Tools for SELinux. https://github.com/TresysTechnology/setools3/wiki. Accessed: 2015-02-06.

[4] P. Amthor, W. Kuhnhauser, and A. Polck. Model-based safety analysis of SELinux security policies. In Network and System Security (NSS), 2011 5th International Conference on, pages 208–215. IEEE, 2011.

[5] L. Badger, D. F. Sterne, D. L. Sherman, K. M. Walker, and S. A. Haghighat. Practical domain and type enforcement for UNIX. In Security and Privacy, 1995. Proceedings., 1995 IEEE Symposium on, pages 66–77. IEEE, 1995.

[6] D. E. Bell and L. J. La Padula. Secure computer system: Unified exposition and Multics interpretation. Technical report, DTIC Document, 1976.

[7] D. E. Bell and L. J. LaPadula. Secure computer systems: Mathematical foundations. Technical report, DTIC Document, 1973.

[8] W. E. Boebert and R. Y. Kain. A practical alternative to hierarchical integrity policies. NIST Special Publication SP, pages A–10, 1989.

[9] P. Clemente, B. Kaba, J. Rouzaud-Cornabas, M. Alexandre, and G. Aujay. SPTrack: Visual analysis of information flows within SELinux policies and attack logs. In Active Media Technology, pages 596–605. Springer, 2012.

[10] D. Ferraiolo, D. R. Kuhn, and R. Chandramouli. Role-based access control. Artech House, 2003.

[11] M. Fowler. Domain-specific languages. Pearson Education, 2010.

[12] J. D. Guttman, A. L. Herzog, J. D. Ramsdell, and C. W. Skorupka. Verifying information flow goals in security-enhanced Linux. Journal of Computer Security, 13(1):115–134, 2005.


[13] M. A. Harrison, W. L. Ruzzo, and J. D. Ullman. Protection in operating systems. Commun. ACM, 19(8):461–471, Aug. 1976.

[14] J. Hurd, M. Carlsson, S. Finne, B. Letner, J. Stanley, and P. White. Policy DSL: High-level specifications of information flows for security policies. 2009.

[15] T. Jaeger, R. Sailer, and X. Zhang. Analyzing integrity protection in the SELinux example policy. In Proceedings of the 12th conference on USENIX Security Symposium - Volume 12, pages 5–5. USENIX Association, 2003.

[16] S. Marouf and M. Shehab. SEGrapher: Visualization-based SELinux policy analysis. In Configuration Analytics and Automation (SAFECONFIG), 2011 4th Symposium on, pages 1–8. IEEE, 2011.

[17] A. C. O'Connor and R. J. Loomis. 2010 Economic Analysis of Role-Based Access Control. NIST, Gaithersburg, MD, 20899, 2010.

[18] N. Peter Loscocco. Integrating flexible support for security policies into the Linux operating system. In Proceedings of the FREENIX Track:... USENIX Annual Technical Conference, page 29. The Association, 2001.

[19] R. S. Sandhu. Role-based access control. Advances in Computers, 46:237–286, 1998.

[20] S. Smalley. The case for SE Android. Linux Security Summit, 2011.

[21] S. Smalley and R. Craig. Security Enhanced (SE) Android: Bringing flexible MAC to Android. In NDSS, volume 310, pages 20–38, 2013.

[22] B. T. Sniffen, D. R. Harris, and J. D. Ramsdell. Guided policy generation for application authors. In SELinux Symposium, 2006.

Bacteria Nanonetworks

Erik Berdonces Bonelo Aalto University School of Science [email protected]

Abstract

As nanotechnologies are being developed, nano-scale devices can execute more tasks and a new field appears: how to coordinate and interconnect swarms of nanodevices. The progress in this field will allow new solutions in nanomaterial creation and drug delivery.

This article describes new fundamental methods to create nanonetworks and the major related challenges. While there are many different approaches, we specifically focus on bacteria conjugation and plasmid transmission as foundations of bacteria communications. This paper provides an exhaustive description of the different steps for transmitting information through bacteria, and surveys the evolution and future of nanonetwork technology as well as its uses.

KEYWORDS: bacteria, nanonetworks, DNA, nanodevices

1 Introduction

In 1959, the Nobel laureate physicist Richard Feynman gave his famous speech titled "There's Plenty of Room at the Bottom", in which he detailed the paradigm of working with individual atoms and molecules. Nowadays, nanotechnology is an emerging field that is starting to grow. In the last decade it has enabled the creation of nanoscale devices which allow new procedures in the healthcare, industrial and military fields. These nanomachines can perform many tasks, such as encapsulating drugs and delivering them to specific areas, as well as building intelligent materials.

However, recent research has pointed out the need to develop systems to interconnect these swarms of nanodevices, since traditional networking techniques are not applicable at the nano scale [2]. Different techniques are being developed in order to solve this new paradigm of nanoscale exchange of information; this article especially focuses on networks of nanodevices engineered from bacteria. The main idea is to use the communication between bacteria to transmit the desired data to the desired node.

This article first explains the motivations to investigate bacteria nanonetworks and nanotechnology itself. Second, it describes the different possible architectures in bacteria nanonetworks. Next, we discuss the performance and benefits of each design. Finally, we describe the main issues and the obstacles to overcome in order to make this technology feasible.

2 Motivation

Recent developments in nanotechnology have allowed the creation of machines at the nanoscale. The nanoscopic scale (or nanoscale) usually refers to structures with a length scale applicable to nanotechnology, usually cited as 1–100 nanometers, a nanometer being a billionth of a meter. The nanoscopic scale is (roughly speaking) a lower bound to the mesoscopic scale for most solids. There is a huge variety of areas where nanomachines can be useful. Use-case examples are new vaccines, biological and chemical sensors, development of new manufacturing processes, and drug delivery systems. The creation of these small machines has been possible as a result of the development of production techniques for new materials, such as graphene, whose characteristics allow us to operate at a nanometric scale both easily and efficiently.

The structure of these nanomachines is based on naturally occurring bacteria, with which they can interact. Moreover, they can include DNA parts which interact with, relay and retrieve DNA information. Also, they can include components that facilitate mobility, or even small graphene antennas which work over the terahertz band [3].

However, the main problem is the communication and coordination between the different individual nanomachines. Even if it is possible to use graphene antennas over a terahertz bandwidth, the drawback is that nanomachines do not possess a convenient energy source, and molecules and bacteria could still interfere with the signal transmission, as they are not negligible compared to the wavelength.

Because of these limitations, scientists have looked at other approaches. The main one under development consists in using the systems that nature has already created: molecules. The inspiration to use molecular communications is drawn from the communications existing in nature [9]. The approach is to code the message in the change of the concentration of certain molecules (signals) in the medium. This system is used in short-range communications between bacteria.

Unfortunately, molecular communications have huge drawbacks, such as very low capacity or the need for a complex infrastructure [8]. Also, this technique is not useful for long-range communications between bacteria. In order to overcome these issues, scientists have started to use bacteria to carry messages encoded in their DNA. These bacteria move from the transmitter to the receiver nodes and could be routed as in traditional data networks [6].

25 Aalto University T-110.5191 Seminar on InterNetworking Autumn 2014

3 Background

3.1 Bacteria motility and propagation through chemotaxis

Motility is the term used to define mobility in bacteria and other unicellular and simple multicellular organisms. They can move by slugging or by using cilia and flagella. Cilia are short appendages extending over the surface of the cell. Flagella are long appendages which work by rotating as a propeller to thrust the cell. Different bacteria have different motility speeds depending on the method used to propel themselves. They can reach speeds ranging from 2 µm/s (Beggiatoa) to 200 µm/s (Vibrio cholerae) [6].

Bacteria choose their motility direction through a sensing process called chemotaxis. They sense the concentration gradient of some specific beneficial chemicals (called chemoattractants) in their surroundings and alter their motility path accordingly. For instance, E. coli adopts a random walk when there is no chemoattractant in effect. However, within a chemoattractant field, it will move in a biased random walk towards the attractant [14]. Depending on the chemoattractant, the random walk bias is different.

Bacteria can also move as a swarm. When moving as such, bacteria group and help each other to advance and propagate. Depending on the bacteria species, the propagation of a swarm can affect the swarm itself, limiting the direction of propagation.

3.2 Plasmids

Plasmids are circular DNA strings which bacteria hold inside them. They must not be confused with the chromosomal DNA, as they are contained in a different molecule. A plasmid can replicate independently of the bacterial nucleus (self-replication) and eventually transfer the encoded information to other bacteria (self-transfer). The transfer of the plasmid to other bacteria takes place through hairlike appendages called pili. These appendages connect both bacteria and transfer the genetic material in the plasmid. This process, called bacteria conjugation, allows the interchange of information between bacteria.

Figure 1: Illustration of the conjugation process involving two bacteria. A copy of the plasmid is transferred through the connection provided by the pili.

The DNA of a plasmid consists of a double-stranded DNA molecule which comprises two nucleotide polymers. Each nucleotide contains one of four possible bases: adenine (A), cytosine (C), guanine (G) or thymine (T) [8]. Each of them determines the base of the other nucleotide in the pair, thus the combinations are either A-T or C-G. Each couple of nucleotides forms a base pair (bp), thereby encoding 2 bits of information. Plasmids can be up to 1.6 Mbp (mega base pairs) long [8].

Plasmids can be transferred to species different from that of the original information carrier. To do this, both bacteria need to be in contact. Also, through the transmission of plasmids, special characteristics of a bacterium can be shared with other bacteria, such as vulnerability to a certain antibiotic or fluorescence. This paper focuses on the transmission of the coded information in the DNA. It is important to note that the success rate of the transmission depends on the specific species combination. Usually, the conjugation between bacteria of the same species is the combination that has the best probability to be performed.

3.3 Nanomachines as nodes

Nanomachines are functional devices made of nanoscale components that can perform certain tasks. An interconnected swarm or group of them can form a nanonetwork. Nanomachines can be created by using man-made components or by reusing biological entities found in nature. There are three approaches to creating nanodevices [4]:

• Top-down: nanomachines are developed through downscaling current microelectronic technology to a nanoscale size.

• Bottom-up: nanomachines are realized through the assembly of molecular and synthesized nanomaterials.

• Bio-hybrid: man-made nanomachines developed using the previous methods are combined with existing biological components from bacteria.

The technology for man-made nanomachines has been evolving due to the development of classical lithography and the study and production of nanomaterials. However, at the current state of the art, it is more feasible to use or copy components already made by nature. One example is using Adenosine TriPhosphate (ATP) batteries, inspired by the behavior of mitochondria.

In bacteria nanonetworks, nanomachines could be used as the nodes that transmit and receive the routed information. A Data Processing Unit (DPU) component for nanomachines is being developed, which can encode and decode strings of DNA, such as a DNA-based Turing machine [7].

3.4 Actual challenges

Current challenges in the field are related to the different behavior that bacteria exhibit in different environments. Examples of problems include the selfish behavior that appears when different breeds of bacteria meet in a scenario with a low level of nutrients [11]. Some bacteria will try not only to kill off other species but also members of their own species. The density of the bacteria population can also trigger this behavior switching and thus stop cooperation. Part of this behavior can be cut down using antibiotics that could be dispensed by nanomachines located in the environment. However, this raises another problem: bacteria can increase their resistance to these antibiotics, and can also obtain immunity to some antibiotics as a result of bacteria conjugation.

Figure 2: a) illustrates bacteria behaviour in cooperative mode. In contrast, b) illustrates a case with non-cooperative behaviour, where bacteria cheat and compete.

Another challenge related to nanomachines is the assembly and grouping of their different components. Even if the different pieces forming the nanomachines can be created with our current technology, we are still limited in moving them to the proper position so that they can perform their function, for instance inside organ tissues. Solutions to some of these challenges already exist, such as using antibiotics to kill undesired bacteria; however, better and more efficient approaches should be found.

4 Communication Techniques

New investigations in bacteria communication technologies have developed different solutions. This section covers the most relevant ones, describing the basic operations as well as the supporting and opposing arguments for each of them.

4.1 Molecular signaling: diffusion

Molecular signaling is a technique used by cells to communicate with each other through the diffusion of molecules. Cells close to each other deploy specific kinds of molecules to warn or communicate with surrounding cells. Each type of molecule symbolizes a different message. Also, the concentration gradient of the molecule can indicate different messages or the intensity of an event. An example is given by calcium signaling in intracellular communication, where calcium molecules travel through the small gap junction between adjacent cells.

Another example is given by engineered organs-on-a-chip. In a small petri dish with interconnected cavities, each cavity can hold different bacteria. The communication between the different cavities or parts of the organ is achieved through the connecting channel. This system allows, among other uses, testing of the effects of bacteria on the digestive tract.

Diffusion is frequently used for bacteria communication in laboratories. However, it has some drawbacks in extensive bacteria nanonetworks. First, transmission speed and range are limited. Second, there is no control over the amount of information or its routing, because it is encoded in diffused molecules. Finally, diffusion relies on a liquid medium in which to diffuse the signaling.

One sample use of diffusion is in nanofluidics, where bacteria are isolated by a membrane composed of a large number of parallel nanocapillaries. Bacteria are not able to move through the membrane, but they can still communicate using diffusion because molecules are small enough to move across the membrane's nanocapillaries.

4.2 Molecular motors communication

One existing solution for intracellular communication employs molecular motors. This approach relies on a grid of tracks made of cytoskeleton. A cytoskeleton is a network of protein fibers realizing tracks that can be organized in star or mesh topologies. Then, kinesin and myosin motors [13] can be used to carry molecules between the nodes. This system mimics the actual wires used to connect computers and allows point-to-point intracellular communication. However, the drawback of this solution is that it cannot be applied to inter-cellular communications.

Figure 3: Intracellular communications using molecular motors to carry molecules between nodes.

4.3 Bacteria conjugation

Bacteria conjugation is based on the transmission of data between bacteria in the form of plasmids. As an overview, bacteria transport plasmids that encode both the message and receiver information. Nodes attract the bacteria by diffusing chemoattractants in the medium. An extension of this system can even allow a multi-hop architecture where the information is carried by the same bacteria. This system is detailed next.
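As a side note on capacity, the 2-bits-per-base-pair figure from Section 3.2 can be made concrete with a small codec. The particular bit-to-base mapping below is arbitrary; the cited work does not fix one:

```python
# Map bit pairs onto the four DNA bases: 2 bits per base pair.
BASE_FOR_BITS = {0b00: 'A', 0b01: 'C', 0b10: 'G', 0b11: 'T'}
BITS_FOR_BASE = {v: k for k, v in BASE_FOR_BITS.items()}

def encode(data: bytes) -> str:
    """One byte becomes four bases, most significant bit pair first."""
    return ''.join(BASE_FOR_BITS[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))

def decode(strand: str) -> bytes:
    """Inverse of encode(): four bases back into one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BITS_FOR_BASE[base]
        out.append(byte)
    return bytes(out)

strand = encode(b'hi')
print(strand)  # CGGACGGC
assert decode(strand) == b'hi'
```

At this density, a plasmid of the maximum 1.6 Mbp length mentioned in Section 3.2 could in principle carry about 400 kB of payload, before accounting for the transfer and routing regions described in Section 5.3.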


5 Conjugation-based nanocommunication

This section details bacteria nanonetworks based on conjugation. The first steps are executed before the bacteria transmission and consist in setting up the address information in the nanomachine nodes. We then describe the encoding, encapsulation, propagation and relay process using the bacteria and nanomachines described in the previous section. The use of bacteria conjugation and coding of the information in the DNA allows higher throughput than the other methods described in Section 4.

5.1 Growing bacteria

The first step in building bacteria nanonetworks is choosing which kind of bacteria will be carrying the information. Each bacterium has different specifications in terms of motility and of difficulty in loading and programming its behavior. Another option is using different species at the same time, but this can end in bacteria killing each other.

Bacteria can mutate and deviate from their programmed behavior. For example, E. coli has a mutation rate of 10^-8 per base pair per generation. Such mutants are rare, but they can still be present in a large population.

One of the solutions to avoid bacteria that carry wrong information or that are disobedient is using antibiotics. Bacteria can be programmed to be resistant to a certain type of antibiotic by enabling some specific plasmids in them. Thus antibiotics only affect the other, undesired bacteria which do not have such resistance and which carry undesired mutations and information. Furthermore, this technique reduces the "noise" of the information, thus increasing the share of bacteria in the population that carry the same information. One of the locations where to deploy the antibiotics are the nanomachines: the nodes that attract bacteria to be used as carriers kill at the same time the ones that do not fulfill the requirements to relay the information.

Besides its benefits, this procedure also creates a problem: bacteria can develop resistance to a certain antibiotic. If bacteria develop this mutation, the chosen antibiotic can be ineffective. This is one of the reasons to choose a specific breed of bacteria when performing the transmission of data.

5.2 Nanomachine address configuration

Nanomachines use remote attractants to attract the bacteria carriers to them, thus guiding the message towards them. A node willing to transmit information instead releases a transmission attractant that attracts the surrounding bacteria it will transfer information to.

Nanomachines use an address configuration similar to traditional information networks. They are identified with a two-tier address system, consisting of a physical address and a unique network address. The network address identifies the final receiver of the information. The physical address contains the information of the chemoattractant of the next node. Because a remote attractant can be reused, and thus a physical address can be duplicated, the network address removes this ambiguity.

Each node contains a routing table that matches addresses with remote attractants. Every time an information carrier bacterium reaches the node, the physical address is updated to the new remote attractant found in the routing table, so that it points to the next hop in the multi-hop architecture.

5.3 Encoding

A nanomachine node encodes the information into a DNA string which can be transmitted in plasmids. This plasmid can be divided in three basic parts:

• Transfer region: contains the genes that stimulate the plasmid to self-replicate and transfer.

• Routing region: contains the genes which make the bacteria able to carry the message. These genes allow to: deactivate the chemotaxis towards the transmitter and enable it towards the receiver (update the physical address); inhibit bacterial replication (to control the size of the swarm); and enable a programmed death on timeout (which can be seen as the time to live of the bacteria) that prevents delivery of messages with a long delay [8].

• Message region: contains the network address and the body of the message to be transmitted.

Figure 4: Encoding of a message in a plasmid through the combination of different proteins.

The encoding also poses a new problem: some sequences enable functions inside the bacteria rather than containing a message. These randomly created proteins can interfere with the message delivery and disable some of the expressions in the routing region. So, when encoding the message, these specific sequences have to be avoided. Such sequences may vary between different bacteria.

5.4 Encapsulation and propagation

In this step, the plasmid generated at the node in the previous step is transferred to the bacteria. Empty carriers are attracted to a node by using the transmission attractant. Then, through bacteria conjugation, the plasmid is transferred to each bacterium. Once the conjugation is complete, the active part of the plasmid changes the behavior of the carrier in order to match the one described in Section 5.3.
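The per-hop behaviour described so far - a network address naming the final receiver, a physical address naming the next hop's chemoattractant, and a routing table rewriting the latter at each node - can be sketched as a toy relay. The node names, topology and plasmid fields below are invented for illustration:

```python
# Toy model of the two-tier addressing and multi-hop relaying.
class Node:
    def __init__(self, name, routing_table):
        self.name = name
        # network address of destination -> remote attractant of next hop
        self.routing_table = routing_table
        self.delivered = {}  # network address -> message body

    def relay(self, plasmid):
        """Rewrite the physical address, or deliver if we are the target."""
        if plasmid['network_addr'] == self.name:
            # keep only the first copy: duplicates from re-deliveries are dropped
            self.delivered.setdefault(plasmid['network_addr'], plasmid['body'])
            return None  # the carrier bacterium is killed after delivery
        plasmid['physical_addr'] = self.routing_table[plasmid['network_addr']]
        return plasmid  # the carrier is re-released towards the next hop

n1 = Node('n1', {'n3': 'attractant_n2'})
n2 = Node('n2', {'n3': 'attractant_n3'})
n3 = Node('n3', {})

msg = {'network_addr': 'n3', 'physical_addr': 'attractant_n1', 'body': 'ACGT'}
msg = n1.relay(msg)  # hop 1: physical address now points at n2's attractant
msg = n2.relay(msg)  # hop 2: physical address now points at n3's attractant
assert n3.relay(msg) is None and n3.delivered == {'n3': 'ACGT'}
```

Killing the carrier on delivery models both the dead-end of the path and the duplicate suppression discussed later for the relaying step.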


Propagation starts after the configuration of the bacterium. If a bacterium does not reach the receiver within the Time To Live (TTL) recorded in the plasmid, it kills itself. We assume that a bacterium dies by breaking its membrane and releasing its content into the medium. This poses a new problem: other bacteria can absorb such content, including the DNA plasmids, in a process called competence. To avoid this problem, only non-competent bacteria should be used when performing transmissions. A bacterium dying during propagation thus contributes to the message loss probability.

5.5 Decapsulation and relaying

To transfer the information from the bacteria to the node, bacterial conjugation is performed again. However, a bacterium may try to re-deliver a message, so the node kills the bacterium right after the delivery. If several bacteria try to transmit the same message, message duplication may result. The node can exclude duplicated messages so that it keeps only one copy of the plasmid. This mechanism is already used by bacteria to avoid accumulating the same plasmid several times.

Figure 5: a) bacteria travel to the remote node by following the chemoattractant; b) the node loads new routing information, so the bacteria follow a new chemoattractant; c) bacteria travel to their new destination.

In case several nodes with the same attractant exist in the medium, they can be distinguished by their physical address, so only the real receiver will accept the message. In some cases the nodes can be mobile, so protocols for opportunistic routing must be used [6]. After decapsulation, the receiver checks the network address. If it matches the local address, it decodes the message. Otherwise, the plasmid is refactored so that the physical address matches the next hop in the transmission, depending on the node's routing contents [1]. The final node then decodes the message and performs the desired actions with the information.

5.6 Performance and analysis

When using E. coli, some of the relevant figures [8] are:

• Delay: the estimated delay is 10 minutes per relay.

• Capacity: it is determined by the replication rate of the bacteria and the probability of loss of the carriers.

• Range: the maximum range of each hop is tens of centimetres.

Overall, the technology is limited because the speed is not very high, but for small networks and nano-level communications it can be really useful.

6 Conclusion

Bacteria nanonetworks are a technology still in development, with many issues to be solved, but one that will have a great impact in the future. Scientists are developing architectures and systems that would enable the future commercialization and global use of the technology. However, the hardware is still under development and there are other challenges and paradigms to be solved. There are several uses besides networking and information transmission, such as drug delivery and the creation of intelligent nanomaterials, even if the transmission delay is extensive and the range is limited. In summary, bacteria nanonetworks will enable the possibility of productively using, for industry and health, the technology that nature has developed during millennia.

References

[1] J. Adler. Chemotaxis in bacteria. Science, 153(3737):708–716, 1966.

[2] I. F. Akyildiz, F. Brunetti, and C. Blázquez. Nanonetworks: A new communication paradigm. Computer Networks, 52(12):2260–2279, 2008.

[3] I. F. Akyildiz and J. M. Jornet. Electromagnetic wireless nanosensor networks. Nano Communication Networks, 1(1):3–19, 2010.

[4] I. F. Akyildiz, J. M. Jornet, and M. Pierobon. Nanonetworks: A new frontier in communications. Communications of the ACM, 54(11):84–89, 2011.

[5] S. Balasubramaniam et al. Opportunistic routing through conjugation in bacteria communication nanonetwork. Nano Communication Networks, 3(1):36–45, 2012.

[6] S. Balasubramaniam and P. Lio. Multi-hop conjugation based bacteria nanonetworks. IEEE Transactions on NanoBioscience, 12(1):47–59, 2013.

[7] Y. Benenson, T. Paz-Elizur, R. Adar, E. Keinan, Z. Livneh, and E. Shapiro. Programmable and autonomous computing machine made of biomolecules. Nature, 414(6862):430–434, 2001.

[8] L. C. Cobo and I. F. Akyildiz. Bacteria-based communication in nanonetworks. Nano Communication Networks, 1(4):244–256, 2010.

[9] M. Gregori and I. F. Akyildiz. A new nanonetwork architecture using flagellated bacteria and catalytic nanomotors. IEEE Journal on Selected Areas in Communications, 28(4):612–619, 2010.


[10] M. Gregori, I. Llatser, A. Cabellos-Aparicio, and E. Alarcón. Physical channel characterization for medium-range nanonetworks using flagellated bacteria. Computer Networks, 55(3):779–791, 2011.

[11] M. Hasan, E. Hossain, S. Balasubramaniam, and Y. Koucheryavy. Social behavior in bacterial nanonetworks: Challenges and opportunities. arXiv preprint arXiv:1411.4214, 2014.

[12] G. J. d. A. Soler-Illia, C. Sanchez, B. Lebeau, and J. Patarin. Chemical strategies to design textured materials: from microporous and mesoporous oxides to nanonetworks and hierarchical structures. Chemical Reviews, 102(11):4093–4138, 2002.

[13] R. D. Vale and R. A. Milligan. The way things move: looking under the hood of molecular motor proteins. Science, 288(5463):88–95, 2000.

[14] Z. Wang, M. Kim, and G. Rosen. Validating models of bacterial chemotaxis by simulating the random motility coefficient. In BioInformatics and BioEngineering (BIBE 2008), 8th IEEE International Conference on, pages 1–5. IEEE, 2008.

Survey on indoor localization methods using radio fingerprint-based techniques

Christian Cardin Student number: 397179 [email protected]

Abstract

The need for precise location-aware applications is increasing along with the technological advancement of the current generation of mobile devices, not only for user navigation but also for accessing services based on location or automatic machine motion. Due to GPS limitations, satellites cannot be used for indoor navigation purposes; for this reason, various indoor positioning solutions have been studied and proposed in recent years. One of the most popular techniques is localization based on radio fingerprints, which exploits the existing radio infrastructure in the building (such as WLAN, Bluetooth or other mediums), offers sufficient precision and does not require additional costs. This paper surveys the currently available technology for indoor localization based on fingerprints, including methodologies for data collection, algorithms for fingerprint pattern recognition and methods for reducing prediction errors.

KEYWORDS: indoor positioning system, location techniques, fingerprint, WLAN

1 INTRODUCTION

Indoor localization using fingerprinting is part of a larger group of methodologies called Indoor Positioning Systems (IPS), used for inferring the position of a device inside an indoor environment, for example a building [8]. Many popular applications available for smart devices take advantage of the user's position to offer location-based services such as navigation (Google Maps), finding the closest point of interest (Foursquare) or knowing friends' activities in the area (the newly developed Finnish app NAU). All these services require geo-coordinates from the device's GPS sensor, which are generally not available in indoor areas without a direct line of sight to the satellites. An indoor positioning system could overcome this reachability problem by offering a localization service where standard GPS is not normally available due to environmental conditions, while also improving the accuracy from several meters down to a few centimetres [10]. In addition, it could provide the basis for a completely new set of services that target the indoor environment, offering important benefits to fields like healthcare, accessibility, safety and machine automation. A simple smartphone application could give guidance to blind people walking inside a hospital; it could also help to track personnel or equipment inside a large facility and enable robots to move autonomously across rooms and corridors [13]. Drones and auto-piloted robots are gaining popularity, and in the near future they will be used for logistics and transportation.

IPSs can work with different wireless technologies, including IR, RFID, ultrasound, Bluetooth, WLAN and the magnetic field. Every medium offers unique advantages and drawbacks in performing location sensing. The majority of the already available solutions utilize the building's WLAN network, because it does not require additional costs for a dedicated infrastructure and WiFi is available almost everywhere. Another difference that distinguishes indoor localization technologies is the type of algorithm and procedure used to estimate the position: triangulation, fingerprinting, proximity and vision analysis.

Triangulation is based on the geometric property of triangles that, knowing the coordinates of three points in space, makes it possible to infer the position of a fourth point from the distances or directions to the other points. In practice, the reference points are known wireless Access Points (AP), and the unknown point (the device) calculates the distance from them by measuring the time of arrival (TOA) of a special ping message [8]. Alternatively, the measuring device could use the Angle of Arrival (AOA), which requires only two known points but is generally less accurate than the previous method, because the signal may bounce off the walls, compromising the result [10].

Proximity localization uses RFID tags and is useful for determining whether or not a target is present inside a limited area, such as a room. A certain number of sensors are placed in known positions, and when the device enters the range of one sensor the system maps the device's location to the sensor's location [2]. Vision-based localization uses pictures recorded by the device's camera to recognize known places in a database; alternatively, the user can be tracked by known cameras using face recognition techniques [5].

Indoor localization based on radio fingerprints is usually preferred because it does not need dedicated hardware, is easy to implement, requires less processing power and has good accuracy. It works in two separate phases: the offline phase and the online phase. The first phase consists of creating a map of the building by collecting fingerprints from known locations and saving them in a database: for every reachable AP at the location, collect the ID and the Received Signal Strength Indicator (RSSI), combine them together and save them in the database along with the location information. The online phase consists of collecting a fingerprint from an unknown position and checking it against the database. If the fingerprint is registered in the database, the user can be tracked. A drawback is that if the network topology changes, the radio map must be built again; thus this method is not suitable for dynamic environments where the APs change often.

This paper introduces the technique of indoor localization using fingerprints collected from the area of interest, compares its characteristics and focuses on the differences between implementations in terms of sensing medium and algorithms. Then, it shows a typical architecture for an indoor localization service using fingerprinting, including all the intermediate phases and the challenges of the initial data collection. Finally, it considers some open challenges: error mitigation, security and efficient data collection.

2 CATEGORIES OF FINGERPRINT LOCALIZATION TECHNIQUES

This section compares the most common algorithms and technologies used in fingerprint localization techniques. There are several attributes used for comparing different localization techniques: security, cost, performance, robustness, complexity, user preference, availability and limitations [8].

2.1 Mediums

The first way to distinguish between fingerprint techniques is to look at the sensing medium used to collect fingerprint values.

A) WLAN: The most common sensing medium used in indoor localization is the existing WLAN infrastructure, because it usually covers most of the building and does not involve additional installation costs. Many existing indoor localization services [12], such as RADAR, COMPASS and Ekahau, use WLAN for its convenience and availability. The localization accuracy varies greatly depending on the geometry of the area and the algorithms used, but it ranges from 1 meter to 5 meters. As explained in Section 2.2, some algorithms rely on the time the signal takes to reach the user; if there are walls between the AP and the user, the signal can bounce and the measured time of arrival can be longer than with a direct line of sight, resulting in a wrong distance estimation and, as a consequence, an incorrect localization. Generally, implementations based on WLAN are the most cost-effective for the level of accuracy reached. Moreover, WLAN can be combined with methods using a different medium to increase robustness.

B) Cellular network: Accurate GSM-based indoor localization is possible thanks to wide signal-strength fingerprints that sense the six strongest GSM cells and up to 29 GSM channels, which are normally too weak for communicating but strong enough to be detected and localized. The cell IDs are then registered as a fingerprint and treated in the standard way. For this method to work, the area should be covered by a GSM/CDMA cellular network. The accuracy depends on the network density: the denser the network, the more accurate the localization. Experiments show that in densely covered areas the precision of GSM-based localization reaches 2.5 meters [14]. It is possible to exploit the public phone network, but a proprietary setup based on GSM is not convenient, due to the high cost of a network cell.

C) Bluetooth: With the latest versions of Bluetooth it is possible to query the RSSI from known Bluetooth stations, along with other control values such as the link quality and the transmitted power control, and calculate the distance from the device in order to localize it. Applying triangulation and a correction algorithm called the Kalman filter, it was possible to increase the accuracy to 2.11 meters [15]. Recent improvements in Bluetooth technology are pushing towards better robustness while decreasing the cost of the transmitters.

D) Magnetic field: Experiments [4] show that it is possible to use the magnetic field as a fingerprint inside a building. This method is considered one of the most reliable because the magnetic field of a building does not change over time. The magnetic field is influenced by the structure of the building and varies enough to offer discrete precision and accuracy. It is even possible to locate a device in overlapping corridors on different floors. The minimum accuracy was 4.7 meters, but in 50% of the experiments the target was localized with 2.5 meters of accuracy. The authors found that high-power electrical machinery and devices that produce electromagnetic waves, such as laptops and smartphones, could interfere with the real magnetic field and introduce errors in the readings. Fortunately, the error drops into an acceptable range at 12 cm from these devices.

2.2 Algorithms

Once the fingerprints are collected and organized in a database after the offline phase, pattern-recognition algorithms are needed to match the fingerprint read by the user of the service with the correct location stored in the database. This is the most delicate part of the whole system, because of the ambiguous nature of the available data. The first problem is that the offline map is made of discrete points, while the user moves in a continuous space; the algorithms should be able to correctly interpolate between the reference points on the map and make an intelligent guess in case the same fingerprint appears more than once in the database. The following list explains the most common algorithms used for fingerprint indoor localization.

A) K Nearest Neighbour: K Nearest Neighbour, or simply K-NN, is an algorithm that influences the construction of the offline map in order to reduce errors in the online phase. The area to analyse is first subdivided into a rectangular grid of

32 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

“candidate blocks”; in each block, RSSI samples are collected at a desired resolution. Every sample collected is then marked with the coordinates of the block (x, y) and the list of RSSIs/IDs present at that specific point. During the online phase, the user's device collects an array of the RSSI values sensed at its location and asks the base station to compare the array with the values stored in the database. The algorithm finds the K closest matching points of known locations in signal space in the previously built database and returns the average of their coordinates as the user's current location. The K-NN algorithm is preferred due to its easy implementation and relatively high performance [9].

B) Probabilistic method: While deterministic algorithms, such as K-NN, estimate the position only by considering measurements of RSSs or the average and/or variance of these values, probabilistic algorithms compute the position by treating the measurements as part of a random process, trying to exploit all the information contained within the acquired signals. They model the location fingerprint with conditional probabilities and use Bayesian inference to estimate the user position, together with other algorithms such as Nearest Neighbour [3]. During the online phase, the device to be localized collects the RSSIs from the environment and the base station calculates the probability of the user being at each of the reference points in the map. The rule for choosing the candidate node N_i is: choose N_i if P(s | N_i) > P(s | N_j) for all i, j = 1, 2, 3, ..., n with j ≠ i [10]. That is, for all the recorded points j in the radio map, calculate the probability that the user is at point i by using a Gaussian distribution, then choose the point with the highest probability, or use K-NN with the K points of highest probability. The probabilistic method is accurate in most cases, reaching 1.6 meters of accuracy.

C) Neural networks: The database collected in the offline stage can be used to train a neural network that takes a number of RSS/APN pairs as input and returns a vector of two or three elements, which are the user's coordinates in 2D or 3D respectively [7].

D) Cluster Filtered K-NN: Cluster Filtered K-NN (CFK) [11] is a technique that enhances the standard K-NN algorithm by using clustering to partition neighbouring points into sets, filtering some of the matching neighbours to increase the accuracy. There are two kinds of clustering: agglomerative and divisive. The first method calculates the K-NN neighbours of a fingerprint, creates as many groups as there are resulting neighbours and then merges the closest groups two by two; if the minimum distance d_min for each pair of groups is greater than a defined threshold T, the partitioning stops; otherwise, another merge stage proceeds. The second method does the inverse: it starts with one big group which includes all the neighbouring points and splits it in two until the distance condition is met. At the end of the partitioning, there is a handful of potential clusters in which the exact location is contained. The next step is to apply one or more rules from a set R to select the candidate cluster, for example "select the cluster with more elements" or "select the cluster with the smaller average RSS distance". Once the final cluster is selected, the coordinates are taken as the average of all the samples in the set. Experiments [11] show that CFK reduces the prediction error of the plain K-NN algorithm.

3 CASE STUDY: proposal of an indoor localization application

This section shows the use cases and the architectural overview of an experimental application for an indoor localization service using fingerprints. The application will be developed for smartphones and made available to Aalto students as beta testers; the reference building for the test will be the Aalto CS building at Konemiehentie 2. The application should fulfill the following requirements:

• The client device (smartphone) should be able to read the WiFi APNs and the RSSI (Received Signal Strength Indicator) from the environment using the internal wireless adapter.

• The smartphone should read the magnetic field from the environment using the internal magnetometer.

• The smartphone should be used to collect data from the building to create the radio map, and send the data to a server in real time or later.

• The smartphone should establish a continuous connection with the server during the online stage and query the location at defined intervals of time.

• A server should accept raw radio data from a client device and store it in a database organized into meaningful fingerprints.

• A server should run a pattern-recognition algorithm to infer the client's position, either as coordinates (x, y) or with its corresponding label ("Room A123"), and send it back.

A scheme of the application architecture is shown in Fig. 1, separated into distinct server and client components. The server is agnostic about the client devices, and its only task is to receive and process data from the devices that request the service. The client has two main operational modes: online and offline. An example of the user interface is shown in Fig. 3. In the offline mode the client can work without communicating with the server; it should store the collected samples in temporary storage and then synchronize with the server once the collection is done. To facilitate the construction of the radio map, the user manually assigns a unique name to the recorded samples. For example, if the user wants to collect data samples from room A123, she must write "Room A123" in the UI text field before pressing the button to record a sample. Then, all the samples collected will be saved with the given reference.

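The probabilistic selection rule from Section 2.2 can also be sketched briefly. The per-point training statistics below are invented for illustration, and modelling each AP's RSSI as an independent Gaussian is an assumption of this sketch.

```python
# Sketch of the probabilistic rule (illustrative data): each reference
# point stores the mean and standard deviation of the RSSI from each AP;
# the candidate point is the one maximizing P(s | N_i) under an
# independent Gaussian model per AP (computed in log space for stability).
import math

# point -> {AP: (mean RSSI, std dev)} (hypothetical training statistics)
STATS = {
    "A": {"ap1": (-40.0, 4.0), "ap2": (-70.0, 6.0)},
    "B": {"ap1": (-60.0, 5.0), "ap2": (-50.0, 4.0)},
}

def log_likelihood(sample, point_stats):
    """Sum of per-AP Gaussian log densities for the observed RSSIs."""
    total = 0.0
    for ap, (mu, sigma) in point_stats.items():
        rssi = sample[ap]
        total += -math.log(sigma * math.sqrt(2 * math.pi)) \
                 - ((rssi - mu) ** 2) / (2 * sigma ** 2)
    return total

def most_likely_point(sample):
    """Choose the reference point with the highest likelihood."""
    return max(STATS, key=lambda p: log_likelihood(sample, STATS[p]))

print(most_likely_point({"ap1": -43.0, "ap2": -68.0}))  # → A
```

A variant, as noted in Section 2.2, would keep the K most likely points and average their coordinates instead of returning a single winner.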

Figure 2: Fingerprints Database
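As a concrete illustration of a record in the fingerprints database of Figure 2, a stored entry might look like the sketch below. The field names are hypothetical, assembled from the description in this section rather than taken from the actual implementation.

```python
# Hypothetical fingerprint record matching the proposed schema: a label,
# a floor, an optional precise position, the RSSI/AP-id pairs as a JSON
# array, and the magnetic field reading. Field names are illustrative.
import json

fingerprint = {
    "location": "Room A123",            # label assigned by the collector
    "floor": 1,
    "position": {"x": 12.4, "y": 3.1},  # optional precise point on the map
    "wifi": [                           # RSSI values and AP ids
        {"ap_id": "aa:bb:cc:dd:ee:01", "rssi": -48},
        {"ap_id": "aa:bb:cc:dd:ee:02", "rssi": -67},
    ],
    "magnetic_field": {"x": 21.3, "y": -4.8, "z": 40.1},  # microteslas
}

print(json.dumps(fingerprint, indent=2))
```

Storing the readings as a JSON array keeps the record self-describing, so the server-side algorithms can be swapped without changing the stored data.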

of the optimized radiomaps edited by the researchers. For protecting the user’s privacy and grant security, he/she need Figure 1: High level architecture to register and be logged in order to use the service. More- over, a role subdivision mechanism assign grades to the users and restricts their actions accordingly.

The Figure 2 shows the proposed database structure for saving the fingerprints. Firstly, the localization service needs to organize the collected fingerprints in a way that allows different levels of precision inside the same building, at the same time the service should be able to save information rel- ative to multiple buildings. The Building entity stores the world X and Y coordinates of the building so there could be an integration with existing map services. A Location is a 2D representation of a space in a map: it can be a room, a corridor, an hall and anything else that needs to be described on a map; it has a floor number and a geometry description, saved in GeoJSON format[1], an open standard used for en- coding a variety of geographic data structures. There are tools for creating, exporting and visualizing maps that sup- port GeoJSON, thus facilitating the process of creating the maps of the buildings. A location may have one or multiple Fingerprints associated, containing all RSSI values and IDs registered at a specific location in a JSON array. A finger- print also includes information of the magnetic field and, if available, the precise X and Y position of the point in the map where the fingerprint is collected. With this organiza- Figure 3: Application’s UI mockup tion, the application can localize the position of a user in three different precision levels (on a floor, inside a room, at a precise point). A Radiomap is a set of fingerprints col- In this experimental application both RSSIs/APNs and lected in the building, kept separated for convenience and magnetic field information are collected in order to increase testing purposes. The server can switch radiomap to run the the accuracy of the localization using only WiFi informa- pattern matching algorithm and try different approaches us- tion. 
For flexibility reasons the client will not perform ing different configurations of fingerprints collected from the any pre data processing, so the server can be changed to environment. In addition, there can be multiple Users that adopt different data analysis algorithms without having to create radiomaps, and is important to prevent the pollution change the client software. After that the data collection

34 Aalto University T-110.5191 Seminar on Internetworking Spring 2015 phase is finished, the client can send all the information include gamification elements to encourage the students collected at once to save energy. Before the beginning of to participate in the experiment and involve even more the online phase, the server should perform a preliminary participants. data analysis in order to check the validity of the data, optimize it for being stored and eventually run a filtering Security and privacy are important issues for IPSs, since algorithm to exclude points which may compromise the they focus mainly on user’s activity. At any moment the user accuracy of the prediction. In [6] is shown a technique to should be able to decide when to start and stop the tracking discard fingerprints and APs from the samples, increase the service, and the system should prevent uncontrolled access overall application performance following simple rules. For to the user’s personal data, such as location history. The en- example an AP which has a bad distribution over the radio hancements of security and privacy could be carried out from should be discarded, or if it has a high percentage of missing the software side: for example, by establishing a secure con- RSSI values compared to other APs. nection with the server, encrypt user data and grant access only after a login process. A self-localized position system After the filtering, the raw samples collected from the architecture, instead, can ensure the privacy by performing client will be converted in fingerprints identified by their la- location estimations directly in the target device if it knows bel and stored in the database. Since all database operations the whole radio map. Unless the target device gives its lo- require a call to the file system, the fingerprints are chached cation information to an entity, no one can access the infor- in memory for a faster access. The client initiates the on- mation. 
Thus IPSs with self-localized location computation line phase opening a communication with the server using architecture can offer a high degree of security and privacy the TCP protocol for a fast and continuous stream of in- for the users, even if other problems rise, for example the formation. During the online phase, the client will collect computational power required to run the pattern matching raw samples of RSSIs/APNs and magnetic field information algorithm in the device and the storage space needed to con- from the environment and send them to the server. At this tain the radio map of the building. point, the server can run one pattern recognition algorithm of choice (K-NN, Probabilistic or CFK), which is useful for experimental purposes and for collecting data about the per- 5 CONCLUSION formance of different algorithms. After the elaboration, the algorithm engine should have recognized the position of the This paper provides a clear survey of different techniques for fingerprint recorded by the client, and the server can send it indoor positioning systems using fingerprinting, comparing back. If the prediction happens to be wrong, the user can cor- them in terms of cost, accuracy, robustness and reliability. rect it by selecting the correct position from a proposed list Several sensing mediums such as WLAN, bluetooth, mag- of locations, the server should register the correct informa- netic fiel, are explained in section 2 highlighting the differ- tion and update its prediction algorithm to take the correction ences between their characteristics and drawbacks; finally into consideration. A simple neural network can be trained there is a discussion about the most commonly used algo- by means of user feedbacks and help the pattern recognition rithms for inferring the user’s location such as K Nearest algorithms to calculate the exact location. Neighbour, probability method and clustering. 
The paper proposes and describes a simple demo application for local- izing a user inside the Aalto CS building that implements an 4 MAIN CHALLENGES indoor positioning system collecting WLAN and magnetic fingerprints from the environment, useful for testing different This section describes the challenges for a typical local- combination of algorithms and evaluate their performances. ization application using fingerprinting. A common issue The application will be further developed as part of a the- addressed many times in the literature is the long and tedious sis project as a tool for testing new algorithms and indoor process of collecting the first samples for creating the radio localization methods. map of the building. Another important issue that affects the indoor localization in general is the error mitigation and error prevention: a wireless signal is not fully reliable due to References its physical nature, thus the algorithms should try to actively reduce the error while the clients are using the service. One [1] GeoJSON open standard. http://http:// method is to run a filter to exclude unnecessary fingerprints geojson.org/. from the radio map, as explained in Section 3; a second [2] A. Bekkali, H. Sanson, and M. Matsumoto. Rfid indoor method could be to have direct feedback from prediction positioning based on probabilistic rfid map and kalman errors from the users of the service, and then train a neural filtering. In Wireless and Mobile Computing, Network- network to recognize and correct the result in real time. The ing and Communications, 2007. WiMOB 2007. Third most part of the effort should be put in the development of IEEE International Conference on, pages 21–21. IEEE, pattern-recognition algorithms aimed to minimize the errors 2007. in predicting the location. [3] I. Bisio, F. Lavagetto, M. Marchese, and A. Sciarrone. 
To facilitate the work of collecting fingerprints for the Performance comparison of a probabilistic fingerprint- radio map, the data can be crowd-sourced with the help based indoor positioning system over different smart- of Aalto students. For this purpose, the application could phones. In Performance Evaluation of Computer and

35 Aalto University T-110.5191 Seminar on Internetworking Spring 2015


Big Data Platforms Supporting SQL

Markku Hinkka Student number: 44157B [email protected]

Abstract

This paper surveys the most widely used Structured Query Language (SQL) supporting Big Data platforms. It discusses their design principles and identifies their possible strengths and weaknesses. The platforms are also categorized and compared against each other. The paper also includes a brief look into how the selection of a platform can be performed, and discusses the current trends and the future of SQL Big Data platforms.

KEYWORDS: SQL, Platform, Big Data, Hadoop, Dremel, Spark, Presto, Impala, Hive, Tajo, Drill, Amazon RedShift, Pivotal HD:HAWQ, Oracle Big Data SQL, IBM BigSQL, Teradata (SQL-H), Microsoft Azure HDInsight, Google BigQuery, Google F1

1 Introduction

In the last ten years, the amount of data stored into databases and data warehouses has increased dramatically. Companies' operations rely more and more on systems that generate all kinds of logs and other new data. This data is stored into their own data warehouses for analysis and to be used, for example, to monitor the as-is processes of the company and to detect bottlenecks. As a result, every two days we create as much information as we did from the dawn of man up to the year 2003 [8]. Storing and managing this ever-growing mountain of data becomes harder and harder for traditional platforms that were not designed for extreme scalability [6]. Analyzing these vast amounts of data is usually very expensive, time consuming and often even impossible using the more traditional database management systems. However, in the last few years, various new platforms have been published that allow complex analysis of even very large data sets.

This paper investigates the approaches and designs of some of the most widely used Big Data platforms and compares their differences with each other. It focuses only on platforms used to store structured data; NoSQL platforms are thus excluded.

The rest of this paper is organized as follows: Section 2 describes the meaning of Big Data as a term and the problems that Big Data platforms help to solve. Section 3 discusses the most widely used platforms, their implementation principles and architecture. Section 4 presents a tabular representation of the open source platforms and their differences. Section 5 explains briefly how the selection of a platform could be performed. Finally, Section 6 explores the direction in which Big Data platforms seem to be moving at the time of writing. This includes considerations of both those platforms that are gaining popularity and those that are declining in popularity. This section also explores whether there are any new technologies that have the potential to become the next "Big Thing" in Big Data.

2 Big Data

Big Data is a term that has been gaining more and more popularity lately. Usually it refers to large amounts of data that are extremely hard to analyze using traditional methods. Adam Jacobs in his article [6] suggests the following definition for Big Data at any point in time: "data whose size forces us to look beyond the tried-and-true methods that are prevalent at that time". The HACE theorem [14] describes the Big Data characteristics as follows: "Big Data starts with large-volume, heterogeneous, autonomous sources with distributed and decentralized control, and seeks to explore complex and evolving relationships among data".

Big data often consists of data that has been copied from companies' operative systems to data warehouses for reporting and analysis purposes. The data in data warehouses is also often denormalized, resulting in it taking more resources to store the data than it did in the original operative system [9]. The purpose of denormalization is to optimize the analysis tasks performed on the data.

Since traditional methods are no longer effective, new Big Data platforms have been developed to cope with these large amounts of data. Big Data platforms usually provide means to both store and analyze large amounts of data efficiently by distributing the load evenly to varying numbers of worker hosts.

3 Platforms

One of the first and most widely known such platforms is Apache Hadoop [5], which is based on the MapReduce computing framework [4] originally developed at Google for generating the indexes needed for their Web search engine. The Hadoop implementation is used by many companies, including Yahoo and Facebook, for processing data in their massive data warehouses.

Apache Hadoop and many similar platforms include, in one form or another, four modules: coordination for scheduling and resource management, processing for parallel processing, a file system for accessing application data, and support utilities. In Apache Hadoop, parallel processing has historically been performed mostly using the MapReduce [4] algorithm. Whether an operation can be parallelized efficiently using the MapReduce algorithm depends heavily on the upper limit of the number of inputs a reducer may receive [12]. In other terms, this can be generalized by saying that the more source data is required to generate one result data item, the fewer possibilities there are for parallelization.

Big Data platforms can be divided into two categories based on the type of data they support: structured and unstructured. A structured platform provides the means for storing and accessing data in a similar manner to traditional relational database management systems (RDBMS), with similar constructs such as tables and rows. Accessing data in a structured platform is usually performed using a language similar to Structured Query Language (SQL). In unstructured platforms, data is usually not stored as tables having rows and columns. Instead, there is no predefined data model the stored data must follow.

A study by Y. Chen et al. [1] indicates that, at least for the selected few business-critical deployments examined in the study, only a handful of different analysis platforms were used, and quite a big fraction of the tasks were run using structured platforms such as Apache Hive. It also seems that structured platforms make it easier for enterprises to transition from traditional RDBMS-based platforms to structured SQL supporting Big Data platforms, since algorithms can often be migrated almost without any changes. Enterprise users are often also more familiar with SQL queries than with, e.g., MapReduce algorithms written in Java.

This paper compares a few selected SQL supporting Big Data platforms. The selection was mostly based on their popularity. Some interesting platforms were also scoped out due to their immaturity (e.g., BlinkDB, which has only a developer alpha available). The following subsections introduce each platform and describe some potential motivators for their use. Finally, a comparison of the open source platforms is presented in tabular format, listing the most distinctive properties of the selected platforms. The popularity of a platform is measured by querying the number of matching web pages for the name of the platform at the time of writing.

The platforms are categorized based on their licenses: platforms having an open source license and platforms having a proprietary license.

3.1 Open Source Platforms

Traditionally the most popular Big Data platforms (e.g., Hadoop) have been released under free or open-source software licenses. The following subsections give a brief summary of the design guidelines of some of the most popular SQL supporting open-source platforms. All of the selected platforms are released under the Apache 2.0 license (https://www.apache.org/licenses/LICENSE-2.0).

3.1.1 Apache Hive

Being one of the first distributed platforms supporting SQL, selecting Hive was quite natural. Hive is considered one of the de facto tools installed on almost all Hadoop installations (http://blog.matthewrathbone.com/2014/06/08/sql-engines-for-hadoop.html). It is still very actively developed (version 1.0 was released in February 2015) and has probably the biggest feature set of all the selected platforms. Hive was originally developed for processing long-running data-heavy batch jobs, and thus it is not really well suited for interactive use. Hive works on top of an existing Hadoop cluster. It transforms all the SQL queries given to it into one or many MapReduce tasks that are passed to Hadoop for final processing [10]. Every task has an inherent latency originating from the overhead associated with using Hadoop's MapReduce services (https://amplab.cs.berkeley.edu/benchmark/), which makes interactive usage problematic.

Hive is best used for summarizing, querying, transforming and analyzing large sets of structured data. It is usually not the best choice for queries requiring fast response, and especially not for interactive use (http://opensource.com/business/15/3/three-open-source-projects-transform-hadoop).

3.1.2 Apache Spark SQL

Apache Spark claims to be "a fast and general engine for large-scale data processing" that can "run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk" (https://spark.apache.org/). It is based on Resilient Distributed Datasets (RDD), a fault-tolerant collection of elements partitioned across the nodes of a cluster that can be operated on in parallel [15][16]. Spark programs are implemented using, e.g., the Java, Scala or Python programming languages. Spark itself can be run in standalone mode or, e.g., on a Hadoop cluster. Spark SQL is the second SQL on Spark platform, Shark being the first. Shark has been discontinued in favor of Spark SQL and Hive on Spark (https://phdata.io/the-truth-about-sql-on-hadoop-part-2/).

Spark SQL is an additional programming library on top of Spark's own libraries which allows mixing SQL queries into Spark programs. It claims to allow the user to run unmodified Hive queries on an existing Hive warehouse (https://spark.apache.org/sql/), but in practice a few non-supported features are still missing. Based on the on-line documentation (http://spark.apache.org/docs/1.2.0/sql-programming-guide.html), it is hard to say whether these Hive functionalities are also available without a Hive back end. However, it would seem that the current implementation is still less complete than what is claimed (http://www.aproint.com/aggregation-with-spark-sql/, http://apache-spark-user-list.1001560.n3.nabble.com/SQL-Is-RANK-function-supposed-to-work-in-SparkSQL-1-1-0-td16909.html).

Spark, like most in-memory data set based solutions, is best suited for situations where multiple different analyses need to be performed on the same data set. In this kind of situation, performing the first analysis loads the data set from disk into Spark's in-memory RDDs. All the subsequent analyses are then performed using these in-memory RDDs, making them finish much faster than if the data had been read from disk for every analysis.
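As a concrete illustration of the map-shuffle-reduce pattern that both plain Hadoop jobs and Hive-generated tasks follow, here is a minimal single-process Python sketch of a word count; the three phases are simplified stand-ins for the distributed machinery, and the toy records are illustrative only:

```python
from collections import defaultdict

def map_phase(records):
    # Map: each input record is processed independently,
    # emitting (key, value) pairs - here, (word, 1).
    for record in records:
        for word in record.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all values emitted under the same key,
    # so that one reducer sees every value for its key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values; the more values a single
    # key accumulates, the less the reduce step can be parallelized.
    return {key: sum(values) for key, values in groups.items()}

records = ["big data platforms", "big data analysis"]
counts = reduce_phase(shuffle(map_phase(records)))
print(counts)  # {'big': 2, 'data': 2, 'platforms': 1, 'analysis': 1}
```

A `SELECT word, COUNT(*) ... GROUP BY word` query in Hive would be compiled into roughly this shape, with the map and reduce phases distributed over the worker hosts; this also makes concrete the reducer-input limit discussed above, since every occurrence of one word must reach the same reducer.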


3.1.3 Apache Drill

Apache Drill claims to be "a low latency distributed query engine for large-scale datasets, including structured and semi-structured/nested data" (https://cwiki.apache.org/confluence/display/DRILL/Architectural+Overview). In the same way as Cloudera Impala, which is explained later in the paper, it is inspired by Google's Dremel. Dremel is an interactive ad-hoc query system for the analysis of read-only nested data stored in a columnar storage [7]. Queries are executed using multi-level execution trees, making it capable of running aggregation queries efficiently.

It is recommended that Drill is installed into an already existing Hadoop cluster. The minimum requirement, however, is that Apache Zookeeper (http://zookeeper.apache.org/) is installed into the cluster. Drill can also be used to query some NoSQL databases such as MongoDB (https://cwiki.apache.org/confluence/display/DRILL/MongoDB+Plugin+for+Apache+Drill).

According to the documentation, Drill supports the ANSI SQL 2003 standard. It has some extended functionality available for accessing nested and more complex data stored in table columns using map and array data types. It has several extensibility points, such as pluggable query languages as well as support for user-defined functions (UDF). Drill's optimizers can be extended, and support for new data sources and file formats can be implemented via an API. At the time of writing, Drill is still in the beta phase of development (version number 0.7).

Drill is best suited for low latency queries, especially on semi-structured data or data stored in supported NoSQL databases. It is also a useful solution when there are no other platforms supporting the data source types that are needed.

3.1.4 Apache Tajo

Apache Tajo claims to be "designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load process) on large-data sets stored on HDFS (Hadoop Distributed File System) and other data sources" (http://tajo.apache.org/). It runs on top of a Hadoop cluster. The online documentation is lacking, and it seems that the user base is not very big outside South Korea. It is mostly being developed by a South Korean startup called Gruter (http://siliconangle.com/blog/2015/03/11/apache-tajos-big-data-warehouse-comes-to-hadoop/).

Tajo is ANSI SQL standard compliant. It has support for some more complex SQL functionality such as windowing functions and regular expression support for strings, and it also supports GeoIP functions out of the box (http://dev.maxmind.com/geoip/legacy/downloadable/). Tajo also provides dynamic load balancing and fault-tolerance for long running queries.

Tajo is probably best suited for performing low-latency ETL related tasks on data stored in HDFS. Its set of supported SQL functions seems to be quite comprehensive, especially for ETL purposes. On the other hand, the seemingly small user base and poor documentation raise some questions regarding the platform's future.

3.1.5 Facebook Presto

Facebook Presto claims to be "an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes" (https://prestodb.io/). Presto was designed to handle data warehousing and analytics: data analysis, aggregating large amounts of data and producing reports (https://prestodb.io/docs/current/overview/use-cases.html).

It allows querying data from various sources such as Hive and Cassandra, and even from more traditional relational databases such as MySQL and PostgreSQL. It also provides an interface for building new custom connectors to different kinds of data sources. Presto was initially created by Facebook, and it is still undergoing development by Facebook internal developers and a number of third party developers in Presto's community.

Presto employs a custom query and execution engine with operators designed to support SQL semantics. All processing is performed in-memory and pipelined across the network between stages, thus avoiding unnecessary I/O and the associated latency overhead [11]. Presto supports standard ANSI SQL, including an extensive set of data types and functions, among them windowing functions and approximation based aggregation functions. The current version does not support UPDATE operations; INSERT, however, is supported.

Presto is best suited for making quick, interactive, relatively simple queries on supported data sources, or when a custom data source needs to be used. I personally have had problems with some more complex queries in Presto (https://groups.google.com/d/msg/presto-users/5BI9Pb0mGN0/6ioSQn9O9SYJ). Also, it seems that if the query processing worker host runs out of memory, there is really no fallback to, e.g., using disk storage, and the query will just fail.

3.1.6 Cloudera Impala

Cloudera Impala claims to be "a fully integrated, state-of-the-art analytic database architected specifically to leverage the flexibility and scalability strengths of Hadoop - combining the familiar SQL support and multi-user performance of a traditional analytic database with the rock-solid foundation of open source Apache Hadoop and the production-grade security and management extensions of Cloudera Enterprise." (http://www.cloudera.com/content/cloudera/en/products-and-services/cdh/impala.html).

Like Apache Drill, Cloudera Impala is inspired by the ideas in Google's Dremel [7] technology. Impala utilizes the same meta data, ODBC driver, SQL syntax and user interface as Hive. It is designed to complement the use of Apache Hive when interactive use is desired. It employs its own massively parallel processing (MPP) architecture on top of HDFS, and thus it requires Hadoop to be deployed into the same cluster. It is also an integrated part of the Cloudera distribution of Apache Hadoop (CDH), which bundles various tools into a single package providing, among others, a common user interface and access control. In addition to Impala, it includes, e.g., Hadoop, Hive, Pig and Spark (http://www.cloudera.com/content/cloudera/en/products-and-services/cdh.html). Cloudera is the only contributor and developer of the Impala code (http://www.slideshare.net/NicolasJMorales/big-sql-competitive-summary-aug-2014).

Impala is best suited for interactive SQL analysis cases, especially when commercial support is required or when other tools included in CDH need to be installed into the same cluster, e.g., for stream processing purposes. According to tests performed by Wouw et al. [13], Impala seems to be best suited for input sizes smaller than 500 GiB. However, there seem to be also somewhat contradicting, slightly older results in similar performance tests [2].

3.2 Commercial Platforms

There are quite a few commercial platforms, licensed under proprietary licenses, competing with the open source platforms. The following subsections give a brief summary of some of the most popular ones.

3.2.1 Amazon RedShift

Amazon RedShift claims to be a "Fast, fully managed, petabyte-scale data warehouse solution that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools." (http://aws.amazon.com/redshift/). It is Amazon's hosted data warehouse product, which is part of the Amazon Web Services computing platform. Redshift is based on the massively parallel processing (MPP) data warehouse ParAccel (http://www.zdnet.com/article/amazon-redshift-paraccel-in-costly-appliances-out/). Its performance and capacity can be elastically scaled on demand without downtime, it supports data updates and transactions, and its pricing is based on actual usage. It performed consistently well in AmpLab's Big Data benchmark (https://amplab.cs.berkeley.edu/benchmark/).

3.2.2 Pivotal HD:HAWQ

Pivotal's HD:HAWQ claims to be "a parallel SQL query engine that combines the key technological advantages of the industry-leading Pivotal Analytic Database with the scalability and convenience of Hadoop." (http://pivotal.io/big-data/pivotal-hd). HAWQ is part of Pivotal HD, which is Pivotal's proprietary Hadoop distribution, and it runs on top of a Hadoop cluster that uses that distribution. It is based on the Greenplum database, with modifications that enable data to be stored in HDFS. It has an MPP SQL processing engine optimized for analytics, with full transaction support (http://pivotalhd.docs.pivotal.io/doc/2010/HAWQOverview.html). It is a quite mature product, having been under development for over 10 years (http://blog.matthewrathbone.com/2014/06/08/sql-engines-for-hadoop.html). It performed quite well in an already quite old benchmark against Hive and Impala (http://www.dataintoresults.com/2013/09/big-data-benchmark-impala-vs-hawq-vs-hive/).

3.2.3 Oracle Big Data SQL

Oracle's Big Data SQL claims that it "Extends Oracle SQL to Hadoop and NoSQL and the security of Oracle Database to all your data." (http://www.oracle.com/us/products/database/big-data-sql/overview/index.html). Big Data SQL uses a traditional database as the front end for all SQL queries. It requires both Oracle Exadata, which contains a traditional RDBMS database, and Oracle Big Data Appliance, which contains a Cloudera Hadoop cluster (http://www.slideshare.net/NicolasJMorales/big-sql-competitive-summary-aug-2014). It supports multiple data sources, including Hadoop, NoSQL and Oracle Database. It enables users to write a single SQL query combining data from Oracle Database (versions 12c and up), Hadoop and a NoSQL database.

3.2.4 IBM BigSQL

IBM BigSQL claims to be a "Massively parallel processing (MPP) SQL engine that deploys directly on the physical Hadoop Distributed File System (HDFS) cluster" (http://www-01.ibm.com/support/knowledgecenter/SSPT3X_3.0.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_sql_access.html). BigSQL is part of BigInsights, which is IBM's proprietary Hadoop distribution. It runs on top of an existing Hadoop cluster and uses the Hive Metastore for data definitions. It does not require any proprietary Big Data Appliance and does not support transactions.

3.2.5 Teradata (SQL-H)

Teradata Enterprise Access to Hadoop claims to be "a portfolio of tools designed to give any Teradata user easy, secure access to data stored in Hadoop." (http://www.teradata.com/Teradata-Enterprise-Access-for-Hadoop/). Teradata runs on top of an existing Hadoop cluster, which does not have to be Teradata's own distribution; Teradata also has its own Hadoop distribution optimized for Teradata SQL-H. It enables users to issue a remote query to Hadoop and bring the data into the traditional Teradata RDBMS for analysis or for integration with the data already stored there. It does not require any proprietary Big Data Appliance.

3.2.6 Microsoft Azure HDInsight

Microsoft Azure HDInsight claims to be "A 100% Apache Hadoop service deployed in the cloud that can elastically scale out to big data within minutes" (download.microsoft.com/download/F/7/C/F7C2E119-A2EA-4660-8D8C-C6C55BB844EF/AzureHDInsightInfographic2015.pdf). It is the Microsoft Azure cloud implementation of Hadoop, providing on-demand scaling (http://azure.microsoft.com/fi-fi/documentation/articles/hdinsight-hadoop-introduction/). It uses the Hortonworks Data Platform Hadoop distribution and can be deployed using Linux or Windows as the underlying OS. HDInsight includes, among others, an implementation of Hive, which makes it possible to query data using HiveQL. Users have to pay only for the actual compute. It integrates with Microsoft's tools such as Excel, SQL Server Analysis Services and SQL Server Reporting Services. HDInsight also makes it possible to develop MapReduce jobs using .NET languages.

3.2.7 Google BigQuery

Google's BigQuery claims to "Analyze Big Data in the cloud with BigQuery. Run fast, SQL-like queries against multi-terabyte datasets in seconds. Scalable and easy to use, BigQuery gives you real-time insights about your data." (https://cloud.google.com/bigquery/). It is infrastructure as a service hosted in the Google Cloud Platform. It uses Google Storage as its data source. BigQuery is Google's own public implementation of Dremel [7] (https://cloud.google.com/files/BigQueryTechnicalWP.pdf). BigQuery charges for data storage and for querying data, but loading and exporting data are free of charge (https://cloud.google.com/bigquery/pricing). There is also no need to pay for provisioning servers, since the service is always on (http://www.quora.com/How-good-is-Googles-Big-Query-as-compared-to-Amazons-Red-Shift). Google BigQuery also has a free trial period during which there are some product limitations.

3.2.8 Google F1

Google's F1 claims to be "a distributed relational database system built at Google to support the AdWords business. F1 is a hybrid database that combines high availability, the scalability of NoSQL systems like Bigtable, and the consistency and usability of traditional SQL databases." (http://static.googleusercontent.com/media/research.google.com/fi//pubs/archive/41344.pdf). F1 is built on top of Spanner, which is a globally distributed NewSQL database and the successor of BigTable. Its key design goals are scalability, availability, consistency and usability. Scalability in this context means that the platform must be able to scale up just by adding resources. Availability means that the system must never go down for any reason. Consistency means that the system must provide ACID transactions. Usability means that the system provides SQL query support and the other functionality users expect from a SQL database. F1 seems to be used only internally by Google, which uses it at least for its AdWords business.

4 Comparison

Table 1 at the end of the paper lists some of the most distinctive properties of all the selected open source platforms, shown as columns. The selected properties, shown as rows, are:

• Community: Developer community developing the platform.
• Chief advocate: Primary commercial advocate funding the development.
• Required back end software: Software that is required in the system in order to be able to use the platform.
• SQL variant: Supported SQL variant.
• Fault tolerance: Does the platform have built-in fault tolerance for queries, or does the user have to handle faults?
• Commercial support: Does the platform have commercial support available?
• Low latency: Is the platform suitable for low latency and interactive queries?
• In-memory: Does the platform primarily rely on main memory for data storage?
• Data shapes: Supported shapes for the queried data.
• Data sources: Supported data sources.
• ODBC/JDBC driver: Does the platform have ODBC and JDBC drivers?
• Supported data modifications: What kind of data modifications are supported?
• Sandbox available: Prepared, easy-to-use sandboxes available for evaluation purposes.
• Google keyword: Keyword used for estimating the popularity in the Google matches column.
• Google matches: Number of web pages matching the Google keyword in a search made using Google search (https://www.google.com/).

All the information in this seminar paper is based on the situation in April 2015. Since most of these platforms seem to be evolving very rapidly, some information given in this paper may already be obsolete when this seminar paper is published.

5 Selecting a Suitable Platform

Since deploying a Big Data SQL platform into a cluster is usually a quite complex task, it is very important to find the platform best suited to your requirements. This is often done by weighting the pros and cons of the platforms against each other. Questions you could ask include, e.g.:

• Are you ready to pay for the platform?
• Are you already familiar with some platform or the ecosystem of some platform vendor?
• Are you willing to dedicate a cluster of computers for this purpose? If not, do you want to pay for provisioning servers or only for the usage of actual computation resources?


• Are you expecting that the same platform will also be used for some still more or less unknown future purposes?
• If commercial support is required, does the platform have it?
• Are you expecting to use the platform for a long time? Does the platform seem popular and good enough to stay alive longer than you need it?
• Do you require fast response times and user interaction, or could your workloads be processed as batch jobs?
• Do you need to be able to run multiple queries concurrently?
• Does the platform support a data source suitable for your needs?
• Does the platform have all the SQL query features that are required?
• Does the platform support the needed data shape and/or the required operations?
• Do you already have a cluster that has to be used? Does it have some back end software that sets constraints for the selection of the platform?

It is often also a good idea to do platform evaluations first in small scale, e.g., using prepared virtual machines that have an operating standalone version of the platform pre-installed (a.k.a. a sandbox).

6 The Future

It would seem, based on the amount of resources and the variety of available platforms, that SQL has already established its place as the de facto language for specifying analysis tasks also for Big Data. There are many reasons for this, such as the ease of transforming legacy applications from traditional RDBMS to Big Data SQL platforms and the familiarity of the SQL language to data analysts around the world.

Also, based on the research done for this paper, it seems that the era of MapReduce is coming to its end, at least as the back end implementation of Big Data SQL platforms. Of the 14 products explored, only 2 were using MapReduce for processing SQL queries. MapReduce seems not to be optimal for processing SQL (http://vision.cloudera.com/impala-v-hive/), especially when considering interactive usage (http://blog.mikiobraun.de/2013/02/big-data-beyond-map-reduce-googles-papers.html).

Several competing paradigms such as Resilient Distributed Datasets (Spark) and Dremel (Drill, Impala) have proven their place, especially for running quick interactive analyses on large datasets. Partly this speedup comes from the fact that these new paradigms are based on in-memory processing, where all the available system memory in every worker host is exploited to as great a degree as possible. Processing analyses in-memory is usually much more efficient than processing from a shared storage such as disk. Also, constantly decreasing memory prices enable improving cluster performance just by adding more memory to the worker hosts.

It remains to be seen what will eventually be the "next MapReduce". However, it seems quite interesting that even Cloudera, which is the chief advocate of the Dremel-based Impala, has added Spark to their Hadoop distribution. They have also been writing in favor of Spark (http://vision.cloudera.com/mapreduce-spark/).

Also, since there are currently lots of small platforms with very similar sets of features, it is quite probable that some of them will not survive for long, especially if there is nothing that makes them stand out from the crowd.

Quite a few of the selected platforms have inter-query fault-tolerance built in, which allows building reliable systems from less reliable parts. In the future, large companies will also want to improve system responsiveness and to build predictably responsive systems having small latencies out of less predictable parts [3]. In fact, there are also some platforms being developed, such as BlinkDB (http://blinkdb.org/), that actually make it possible for users to trade off query accuracy for response time by employing some more complex dynamic sampling techniques.

References

[1] Y. Chen, S. Alspaugh, and R. H. Katz. Interactive analytical processing in big data systems: A cross-industry study of mapreduce workloads. PVLDB, 5(12):1802–1813, 2012.

[2] Y. Chen, X. Qin, H. Bian, J. Chen, Z. Dong, X. Du, Y. Gao, D. Liu, J. Lu, and H. Zhang. A study of sql-on-hadoop systems. In Big Data Benchmarks, Performance Optimization, and Emerging Hardware, pages 154–166. Springer, 2014.

[3] J. Dean and L. A. Barroso. The tail at scale. Communications of the ACM, 56(2):74–80, 2013.

[4] J. Dean and S. Ghemawat. Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.

[5] J. Dittrich and J.-A. Quiané-Ruiz. Efficient big data processing in hadoop mapreduce. Proceedings of the VLDB Endowment, 5(12):2014–2015, 2012.

[6] A. Jacobs. The pathologies of big data. Communications of the ACM, 52(8):36–44, 2009.

[7] S. Melnik, A. Gubarev, J. J. Long, G. Romer, S. Shivakumar, M. Tolton, and T. Vassilakis. Dremel: interactive analysis of web-scale datasets. Proceedings of the VLDB Endowment, 3(1-2):330–339, 2010.

[8] E. Schmidt. In Techonomy conference, August 2010.

[9] S. K. Shin and G. L. Sanders. Denormalization strategies for data retrieval from data warehouses. Decision Support Systems, 42(1):267–282, 2006.

42 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

[10] A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy. Hive: A warehousing solution over a map-reduce framework. Proc. VLDB Endow., 2(2):1626–1629, Aug. 2009.

[11] M. Traverso. Presto: Interacting with petabytes of data at Facebook. 2013. [Online; accessed 12-March-2015].

[12] J. D. Ullman. Designing good mapreduce algorithms. XRDS: Crossroads, The ACM Magazine for Students, 19(1):30–34, 2012.

[13] S. v. Wouw, J. Viña, A. Iosup, and D. Epema. An em- pirical performance evaluation of distributed sql query engines. In Proceedings of the 6th ACM/SPEC Interna- tional Conference on Performance Engineering, ICPE ’15, pages 123–131, New York, NY, USA, 2015. ACM. [14] X. Wu, X. Zhu, G.-Q. Wu, and W. Ding. Data mining with big data. Knowledge and Data Engineering, IEEE Transactions on, 26(1):97–107, 2014.

[15] M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstrac- tion for in-memory cluster computing. In Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, pages 2–2. USENIX As- sociation, 2012.

[16] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: cluster computing with working sets. In Proceedings of the 2nd USENIX conference on Hot topics in cloud computing, pages 10–10, 2010.


Feature                  | Hive                         | Spark SQL                            | Drill                         | Tajo           | Presto                    | Impala
-------------------------|------------------------------|--------------------------------------|-------------------------------|----------------|---------------------------|---------------
Community                | Apache                       | Apache                               | Apache                        | Apache         | Facebook                  | Cloudera
Chief advocate           | Hortonworks                  | Databricks                           | MapR                          | Gruter         | Facebook                  | Cloudera
Required backend software| Hive Metastore               | Hadoop                               | None                          | Zookeeper      | Hadoop                    | None
SQL variant              | HiveQL                       | Rudimentary ANSI SQL subset & HiveQL | Extensible, SQL 2003, ...     | SQL/HiveQL     | ANSI SQL                  | ANSI SQL
Fault tolerance          | Yes                          | Yes                                  | Yes                           | Yes            | No                        | No
Commercial support       | Yes                          | Yes                                  | Yes                           | Yes            | No                        | Yes
Low latency              | No                           | Yes                                  | Yes                           | Yes            | Yes                       | Yes
In-memory                | No                           | Yes                                  | Yes                           | Yes            | Yes                       | Yes
Concurrent jobs          | Yes^42                       | Yes^43                               | Yes^44                        | No^45          | No?^46                    | Yes^47
Data shapes              | Nested, tabular              | RDD                                  | Nested, tabular               | Tabular        | Nested, tabular           | Tabular
Data sources             | HDFS, HBase                  | Any data source supported by Hadoop  | Extensible, e.g. HDFS, HBase, ... | HDFS, Hive, HBase | Hive, Cassandra, MySQL, ... | HDFS, HBase
ODBC/JDBC driver         | ODBC+JDBC                    | ODBC+JDBC                            | ODBC+JDBC                     | JDBC           | JDBC                      | ODBC+JDBC
Supported modifications  | Transactions, Append, Update | Spark Streaming data                 | None                          | Append         | Append                    | Append
Sandbox available        | Yes                          | Yes                                  | Yes                           | No             | No                        | Yes
Google keyword           | apache hive                  | apache spark sql                     | apache drill                  | apache tajo    | facebook presto           | cloudera impala
Google matches           | 226,000                      | 50,100                               | 130,000                       | 28,300         | 42,100                    | 77,200

Table 1: Open source platform comparison matrix
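The selection checklist above can be applied mechanically against a feature matrix like Table 1. The sketch below is a minimal illustration in Python; the boolean feature values are a simplified transcription of a few Table 1 rows, and the `required` set is a hypothetical example workload profile, not part of the original comparison.

```python
# Sketch: filtering the platforms of Table 1 against checklist requirements.
# Feature values are a simplified transcription of a few Table 1 rows; the
# `required` dictionary below is a hypothetical example workload profile.

platforms = {
    "Hive":      {"low_latency": False, "fault_tolerance": True,  "commercial_support": True},
    "Spark SQL": {"low_latency": True,  "fault_tolerance": True,  "commercial_support": True},
    "Drill":     {"low_latency": True,  "fault_tolerance": True,  "commercial_support": True},
    "Tajo":      {"low_latency": True,  "fault_tolerance": True,  "commercial_support": True},
    "Presto":    {"low_latency": True,  "fault_tolerance": False, "commercial_support": False},
    "Impala":    {"low_latency": True,  "fault_tolerance": False, "commercial_support": True},
}

# Hypothetical requirements: interactive response times and commercial support.
required = {"low_latency": True, "commercial_support": True}

candidates = [name for name, features in platforms.items()
              if all(features[key] == value for key, value in required.items())]
print(candidates)
```

With these example requirements the filter keeps Spark SQL, Drill, Tajo and Impala; adding a fault-tolerance requirement would drop Impala as well.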

Review of energy profiling methods for mobile devices

Antti-Iivari Kainulainen Student number: 62833A [email protected]

Abstract

This paper reviews the different parts of mobile device energy estimation and different implementations of them.

KEYWORDS: mobile device, consumption, energy efficiency, consumption model, consumption estimation

1 Introduction

In recent years, many kinds of powerful mobile devices have become very popular and common items, and they are expected to be usable all the time. However, as the devices have transformed into powerful, full-featured computers running very resource-demanding applications, use times have been reduced, which significantly diminishes their usefulness.

The devices are designed to be quite energy efficient, but the problem arises from poorly designed, energy-hungry applications. For a long time, the problem has been that energy-efficient programming is hard, since there have been very few usable tools available for assessing energy efficiency. Fortunately, in recent years there has been more research on the subject, and many tools and methods have been developed for measuring and estimating consumption.

Scientific publications usually propose a certain way of estimating consumption, and usually a tool developed alongside the paper is presented. Many of the papers do not present a whole new approach to the problem but rather improvements to a previous study. This is why there is a wide variety of papers available which concentrate on improving some feature, but there seem to be very few papers about the overall picture of the whole process in general, which would help those who are new to the subject. This paper tries to help with this problem: it presents an overall picture of the current state of energy estimation and of what has been done, why and how.

2 Background

There are many papers about the energy estimation of mobile devices. However, this paper concentrates on building a general overview of the whole estimation process and of how and why different features have been implemented in different papers. Table 1 presents all the tools which have been analyzed in this paper and how they differ from each other. All of the details presented in the table are covered separately in this paper.

2.1 Consumption estimation as a whole

It is good to note at the start that energy consumption estimation does not cover only the actual estimation but the whole process, which can be divided roughly into three parts: consumption measurement, model building and estimation using a model.

Estimation using a model estimates the device consumption by monitoring the activities of device components^1 and then using a model which describes precisely how much the components consume when performing different activities. The amount of energy consumed is estimated based on the extent to which they were used. The measurement of actual consumption is used to build this model for estimation.

^1 In this paper, "component" refers to some subsystem of the device, like the 3G modem, WiFi modem, CPU or display.

Next, these are examined in more detail: first the consumption measurement from hardware, after that how a model is built from the measured data, and finally how this model is used to estimate consumption.

3 Hardware measurements

This section examines why most of the papers seem to base their modeling on measuring whole-device consumption. After that, we look in more detail at the different characteristics of internal and external measurements.

3.1 Device vs. Component measurement

All the consumption models are, of course, based on knowledge of the actual consumption. The ideal approach would be if the consumption of each component could be measured in real time without any external hardware, which would not only make building models easier but also pointless, because the consumption could simply be measured directly. Unfortunately, none of the components are currently capable of that. Usually, the only part able to measure consumption is the Battery Management Unit (BMU). The BMU monitors the battery temperature, voltage and the discharge current[5].

Another approach would be to attach external current meters to the circuit board of the device. This has been done at least with the OpenMoko Freerunner by A. Carroll and G. Heiser[2]. Even in this case, the OpenMoko had open hardware to ease the process, so this would be much harder on some other device which does not have hardware schematics available.


Table 1: The characteristics of different estimation programs.

Tool                      | AppScope[3] (AS) / DevScope[5] (DS)      | PowerTutor[12] / PowerBooter | Eprof[7][8]                               | PowerProf[6]                                     | Sesame[4]
--------------------------|------------------------------------------|------------------------------|-------------------------------------------|--------------------------------------------------|--------------------------------
Measurement of consumption|                                          |                              |                                           |                                                  |
Internal/external         | Internal (DS)                            | Internal, External           | Internal?                                 | Internal                                         | Internal
Measured value            | Current (DS)                             | Voltage                      | ???                                       | Voltage/current                                  | Current and voltage
Model construction        |                                          |                              |                                           |                                                  |
Model type                | System call, events (DS)                 | Utilization                  | System call                               | API, genetic algorithm                           | Utilization
Intrusion level           | Userspace (DS)                           | Userspace only               | Kernel modification                       | Userspace                                        | Implements new syscall
Online?                   | Yes (DS)                                 | Yes                          | Yes                                       | Yes                                              | Yes
Generality                | Unit specific (DS)                       | Unit, battery dependent      | ???                                       | Unit                                             | Unit and use pattern
Training type             | Exercising application, synchronized update (DS) | Exercising application | Exercising application, automatic FSM transition | Genetic-algorithm exercise to optimize fitness function | User usage pattern
Building                  | Automatic (DS)                           | Automatic                    | Automatic                                 | Automatic                                        | Fully automatic, manual validity checks
Estimation                |                                          |                              |                                           |                                                  |
Model type                | System call, events (AS)                 | Utilization (procfs)         | Instrumented source code, system call logging | API, application instrumentation             | Utilization
Intrusion level           | Kernel module (AS)                       | Userspace                    | Kernel modification                       | Userspace                                        | Userspace
Granularity               | Application (UID) (AS)                   | Whole system level           | Routine, thread, application              | API calls to start/stop                          | Whole system level
Restrictions              |                                          |                              | Source code required                      | Source code required                             |

Note: Some papers do not necessarily provide both estimation and modeling, but if the other is implemented for evaluation in the paper, the tool is classified based on it.
Note: AppScope uses the model generated by DevScope, which is why they are grouped together. AS and DS are used to differentiate the exact characteristics.
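The measurement / model construction / estimation split that organizes Table 1 can be made concrete with a toy, tool-agnostic sketch: fit a linear power model to (utilization, power) training samples, then use it to estimate consumption from a new utilization reading. All numbers below are hypothetical, and the single-component model is a deliberate simplification of what the surveyed tools actually do.

```python
# Toy sketch of the pipeline behind Table 1: measurement -> model -> estimate.
# All sample values are hypothetical; real tools fit against BMU readings or
# an external power monitor, over many components rather than just the CPU.

# "Measurement": (CPU utilization, measured power in mW) pairs, e.g. logged
# while an exercising application steps through different load levels.
samples = [(0.0, 120.0), (0.25, 245.0), (0.5, 370.0), (1.0, 620.0)]

# "Model construction": ordinary least squares for P = base + beta * util.
n = len(samples)
sum_u = sum(u for u, _ in samples)
sum_p = sum(p for _, p in samples)
sum_uu = sum(u * u for u, _ in samples)
sum_up = sum(u * p for u, p in samples)
beta = (n * sum_up - sum_u * sum_p) / (n * sum_uu - sum_u * sum_u)
base = (sum_p - beta * sum_u) / n

# "Estimation": predict power for a newly observed utilization sample.
estimate = base + beta * 0.75
print(base, beta, estimate)  # base power, per-utilization coefficient, estimate
```

On this noise-free toy data the fit recovers a base of 120 mW and a coefficient of 500 mW per unit of utilization, so 75% utilization is estimated at 495 mW. Real sensor readings are noisy and sparse, which is why the surveyed tools resort to averaging readings or synchronizing with the sensor's update rate.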

Many modern device designs favor System in Package or System on Chip (SiP or SoC) solutions. These compound chips contain multiple chips in the same package[11], for example a Bluetooth radio and a WiFi radio inside the same chip. This of course prevents measuring the current drawn by a single component. It is also quite obvious that this approach would require a laboratory environment and would still be really tedious to implement, making it impossible for the majority of mobile device users.

Of course, it would also be possible to statically analyze the hardware and drivers to calculate the exact consumption, but since this would require a deep understanding of the hardware as well as access to closed-source drivers and chip documentation, it would be infeasible if not impossible to conduct.

Finally, the only feasible choices for measuring the consumption are either to use the internal voltage/current sensor of the BMU or to replace the battery with an external power supply and measure the current drawn by the device externally. Both of these are valid options, and each has great advantages compared to the other. Section 3.2 looks at them in more detail.

3.2 External measurement

External measurement means measuring the consumption of the device externally, by adding a current meter between the mobile device and the battery or by replacing the battery with a power supply which is capable of measuring the current drawn by the device. One such device is the Monsoon Mobile Device Power Monitor[1], which is used in many papers for modeling or for evaluating the model: for example, Pathak et al.[7] used this device for model evaluation, Zhang et al.[12] used it for initial model building for PowerTutor, and Jung et al.[5] used it to validate their results.

However, there are disadvantages which make these kinds of measurements impractical, at least from the application development point of view. Firstly, the user must acquire the measurement device, which is usually expensive. Secondly, the mobile device might have to be opened, because many modern devices, like iPhones[9], have an internal battery which cannot be replaced by the user. Thirdly, basic knowledge of electronics is required. Lastly, because an external device is used, online model building is not possible, since another device is needed to conduct the measurements. From the point of view of the estimation tool, all of these together make this approach unfeasible for most end users, and also for software developers who do not have access to the required hardware and do not have the knowledge to use such devices.

However, even if hardware measurements are impractical from the user's point of view, it is worth noting that they have an important role in evaluating research results. Most of the papers which present an estimation approach, such as the one which presents AppScope[3], also use hardware measurements to validate their results. Since external measurements can be very accurate and may have a very high acquisition rate, they are a good reference point for comparison. Usually the estimates are evaluated by measuring consumption with such a device and comparing the estimated consumption to the measured one.

3.3 Device internal measurements

A more convenient approach is to use the internal BMU chip of a mobile device, which is responsible for charging the battery and indicating the battery status. The actual consumed power is measured by the BMU by measuring the voltage or the current drawn by the mobile device. Usually, current sensors give accurate readings of the consumption. However, many mobile devices only have a voltage sensor, which estimates the power from the voltage drop of the battery and is much more inaccurate. Fortunately, most modern devices have current sensors, but there might still be some which do not. For example, PowerBooter automatically builds a model using a voltage sensor for this reason[12].

The internal current/voltage sensors usually have a low update rate, and voltage sensors especially are quite inaccurate for measuring the consumption[5]. The same restrictions do not naturally apply to the selection of external measurement devices; the Monsoon device, for example, gives more accurate readings, and faster, too. Both of the internal sensors, however, suffer from low update rates, which makes it more difficult to identify the exact relationship between consumption and actions[5]. There are methods to overcome these limitations. These methods are closely linked to the details of the exercising program which is used to exercise the hardware to extract its different power characteristics. Because of this, they are covered in more detail in section 4.1.2, which describes the different approaches to exercising the hardware.

With these improvements, internal measurements are one possible option for constructing a fine-grained and accurate power model. This is a very appealing option, since it makes it possible to do the measurements without external hardware and to build the model online inside the device, even completely without any user interaction.

Section 4 considers how to turn the measurement data into an estimation model which describes the component consumptions, and how to overcome the limitations caused by internal measurements.

4 Consumption Model

The most essential component in estimating power consumption is the consumption model. It links consumption to a measurable unit of work performed by the hardware. The exact method varies between the different approaches.

In practice, this means that a program is run on the mobile device which monitors certain functionalities of the hardware while measuring the current drawn by the device. Next, the acquired measurement data is processed, and by using a method such as linear regression, the hardware activity states or events are linked to the actual consumption. The model constructed this way is then used to estimate the consumption.

This section first looks in more detail at the consumption measurement from the model building point of view, and after that at how the measured data can be turned into an actual consumption model.

4.1 Measuring the Consumption

This section is about measuring the consumption while monitoring the component activity. This is a bit more complicated than it first sounds. There are two challenges: the inaccuracy of internal power sensor measurements and the complex consumption characteristics of the hardware. Section 4.1.1 investigates the power sensor related challenges more closely, and after that we look at what kind of challenges the functional characteristics of the devices produce.

4.1.1 Handling the BMU Inaccuracy

One of the biggest challenges, which is related only to internal measurements, is the inaccuracy of the measurement hardware. The sensors might give inaccurate readings, and the update rates are low. Many of the papers present two possible solutions to this.

Both of these solutions are based on training the model with a specific training program, which does not only log the component activities and the device consumption. It also activates different parts of the hardware based on a schedule. The actual characteristics of these programs are discussed more closely in the next section, but in this context it is sufficient to know that these programs set the components into some specific state and then measure how much the device consumes.

The first method to overcome the low update rate of the sensor is to average the readings over a long period of time[4].

The second solution is to synchronize the state changes to the update rate of the BMU. This is how DevScope handles the problem: it automatically figures out the update rate and synchronizes the test scenarios to the BMU updates[5].

Another problem, related to devices which have only a voltage sensor, is that the voltage of the battery does not actually tell anything about the consumption. Fortunately, the consumption information can be extracted from it: the voltage of lithium-ion batteries changes based on their charge level[10]. This way it is possible to calculate the difference in battery charge level, which equals the consumed energy. The readings are inaccurate, however, which can be handled as above, but another problem is that the discharge curve is battery specific; even worse, age and use affect the discharge curve too, which makes the curve unit specific[12].

Of course, the discharge curve could be measured with an external measurement device, but then the advantage gained by using internal measurement would be lost. Fortunately, the discharge curve can be acquired for some battery models, so the model can be generated for new units by using this information. Also, the lack of a discharge curve might not be a problem when considering the consumption. At least PowerBooter is capable of creating a model which does not measure the actual consumption but consumption relative to the battery[12]. This kind of approach can be sufficient, because the exact current value would not add much value to the results. Usually it is more important to see the difference between different components, programs or functions.

4.1.2 Handling the Component Characteristics

Most of the consumption comes from the activity of the different components of a mobile device, such as the 2G/3G/4G modem, WiFi, Bluetooth, display, CPU and GPS. Most of these components do one specific task which is not dependent on the others, so the behavior of many of the components is also almost independent, and it is therefore convenient to measure the activity of these components separately. However, the most notable exceptions to this are the memory and the CPU, which are, of course, required to control the other components.

On the component level, the exact way of dealing with a component depends on its type. Some characteristics may be determined at runtime, but most of the behavior is modeled based on prior knowledge about the internal workings of the device.

Most of the referenced papers, like Pathak et al. presenting Eprof[7], use finite state machines to model most of the components' consumption. This is a good approach, since the consumption of a component heavily correlates with its power state in most cases. For example, GPS has an almost static consumption when in use and zero consumption when inactive. Another good example is the 3G modem, which can be in three different states: only listening for beacons, in a slow and low-power state, and in a fast high-power state[12].

The exact transitions from one state to another depend on many factors, like the energy saving policy of the mobile device or, on 3G, the network policies defined by the operator[5]. For example, the device energy saving policy for GPS might govern that the GPS is turned on only when the location is requested and that it remains on for a short time after each request to serve consequent location requests faster, for example from navigator software, while allowing the component to go to sleep when it is no longer used.

When building a model, the knowledge about the different power states can be used when building the training program for the device. Most of the papers do the training by generating a training schedule for the hardware which goes through all the different power states of the components, one at a time, while measuring the consumption[4][5][12][7]. The exact characteristics vary between the different implementations.

It is also worth noting that the device also has a base consumption, which comes from all the other parts of the device that are not directly related to any component.

4.2 Turning Measurements Into a Model

Regression is a very common way of presenting the estimated consumption, and it is used at least by PowerBooter[12] and DevScope[5]. It is a simple model for fitting a predictive model to observed data. In Equation (1), which is how it was described by Yoon et al.[3], x_i represents the usage of component i and β_i is the power coefficient for component i. P_base is the base consumption of the device and P_ε is the noise which cannot be estimated by the model.

    P = Σ_i (β_i × x_i) + P_base + P_ε        (1)

It is also possible to use a non-linear regression model. A non-linear model would also capture the dependencies among components. However, non-linear models do not significantly outperform the linear models[3].

The regression models are not the only considerable choice for modeling the consumption. Kjærgaard et al.[6] use a genetic model to construct the power model. However, it requires heavy processing of the data, which is impractical to do online on the device, so the actual model building has to be done offline or offloaded to a server, as with PowerProf. This of course requires another computer or server which does the calculation. This affects usability, but it is worth considering when assessing the improvements gained by using a more complex algorithm.

5 Estimation with model

When the model has been built, the actual estimation of consumption can be done. There are numerous approaches to the actual estimation. As with model building, there is really no single solution which would be ultimately the best. Most of the approaches have some really good characteristic while lacking something else; for example, really fine-grained system call tracking requires low-level access to the device while offering very useful results.

Both training the model and estimating with it share some characteristics, since both of them monitor component states, but one of them monitors consumption while the other tries to estimate it. However, the estimation of the consumption is not as straightforward as building the model. When estimating, the granularity of the data is a very important factor. In this context, granularity refers to how accurately the start point, end point and owner of the component consumption can be determined.

The first two subsections below look at different approaches for monitoring the component activities to get fine-grained estimates, and after that compare the effects of the implementation on the user experience.

5.1 Estimate granularity

The term granularity in consumption estimation is used to describe how accurately the consumption can be linked to the functionality of the device. The granularity level depends on what kind of features are monitored by the estimation program; for example, the program can poll CPU activity from /proc on Linux-based systems, or the estimation program can set breakpoints on system calls to know when a certain component is used. The exact estimate types are discussed in more detail in section 5.2.

It is also worth noting that the purpose of the estimation tool also affects the choice of monitored features. Especially older papers, like the Zhang et al. paper about PowerTutor[12], concentrate on the problem of estimating consumption in general, while more recent papers, like the Yoon et al. paper about AppScope[3], concentrate on refining the process and choosing the features which give the most value to the user. For example, a developer is very likely interested in function-call-level consumption, while an end user is more likely interested in how much the use of some specific application consumes energy.

The exact granularity depends, of course, heavily on the implementation and the model type of the estimation program, and there are many minor differences in their characteristics; the same program or estimation framework can also offer more than one option. However, the granularity of an estimate can be simplified and classified into one of the following levels.

• Device level: This is an estimate of the amount of energy the whole mobile device has consumed when used. This is the only level which can also be measured directly, so this level of estimate has an important role when evaluating the accuracy of a model type.

• Application level: Application level is about estimating the consumption of a single process or application. This is useful for end users who want to evaluate the energy efficiency of an application to save battery or to prefer energy efficient applications.

• Thread level: Thread level is estimating the consumption of a thread inside an application. This is also very useful for application developers. For example, Pathak et al.[8] noticed, when evaluating their program, that Angry Birds, which is a very popular commercial game,
uses a third-party tracking module which is responsible for most of the consumption, and it was easy to identify with their tool because the module was running on a separate thread.

• (System) call level: Call level refers to pinpointing the consumption to a single system call or some other function. This is most useful for application developers who would like to identify exactly which parts of the program are the bottlenecks.

5.2 Estimate types

The estimation type defines what features of the system the estimation program monitors and how the consumption is linked to the monitored features. One possible classification is utilization based and event based.

5.2.1 Utilization based

The utilization-based method is about monitoring the utilization of the components and linking it directly to the consumption. At least PowerTutor and Sesame estimate the consumption by monitoring the utilization of the hardware[12][4].

This is a very simple and intuitive approach, since the consumption is directly bound to the activity levels of the components, for example transmitted bytes per second. However, the usefulness of this method is very limited. It is obvious that monitoring only the activity level of a component does not reveal what exactly caused the activity, so this approach can only estimate consumption at the device level.

5.2.2 Event based

The event-based method can be considered an extension of the utilization-based method. The idea of estimating the consumption based on the activity of the components is still the same, but instead of just monitoring the activity level, the changes are detected by monitoring the hardware events which cause the changes in activity.

This method has great advantages. The first and most obvious one is that if we notice an event triggered by an action, for example via a breakpoint on a system call, it is possible to trace what exactly caused the event. For example, AppScope uses KProbes, which are dynamic breakpoints offered by the Linux kernel for debugging, to trace system calls, and monitors Android Binder remote procedure calls to know when the components are used and by whom[3]. This allows linking the changes in consumption to an exact point in the program. This kind of method can be used to bind consumption to a specific application, thread and point in the code where the operation was initiated, so it is possible to get a very fine-grained estimate of the consumption.

The second advantage is that the event-based method allows capturing asynchronous component power behavior, which is not directly related to the utilization of a component. For example, sending many packets can trigger the component to change into a higher power state, but when the sending has finished, the component might remain in this state for a while to prevent unnecessary transitions between states, which can cause delays and more energy consumption^2[8]. The event-based method can capture this kind of behavior easily, since the system call which is used to actually change the power state can be monitored.

^2 This kind of consumption is called tail energy.

However, it is worth noting that even if the exact cause of the consumption can be tracked down to a call made by an application, it is not always completely clear which application caused the consumption, because of this asynchronous behavior. For example, when the location is requested, the GPS will be turned on, and it will remain on for a while in case the location is requested again. The request can easily be traced to a specific location request, and the events used by the operating system to enable and disable the GPS interface can be traced as well. In this case the consumption is easy to link to one application. However, the situation gets more complicated if another application requests the location after the first one. The second request would not cause as much consumption as it would if requested separately, because the interface was already turned on, but it does affect how long the interface will remain on. In this case it is hard to say how the consumption should be calculated.

AppScope handles this by logging when an application started transmitting data and how much data it transmitted until the component power state changes back to idle. After that, it estimates the consumption of the component over the whole timespan. If multiple applications were using the same component simultaneously, it divides the consumption among all the applications which were using the component at the same time. The division is weighted by the amount of data each application transmitted[3].

5.3 Requirements

The different ways of measuring and estimating the consumption require different levels of access to a device. These are not really characteristics of an estimation model but consequences of certain choices. They are presented here because they have an important role in deciding what kind of technique to use. In this paper, software access defines how much control the estimation process demands from the device.

5.3.1 System Requirements

The operating system requirements in this paper refer to the intrusion level required by the tool.

• User space: This is the most user friendly and the most constrained choice. It means that the application can be run on any device without having to root^3 the device. It allows the use of the program on any supported device without modifications. However, this is usually possible only with utilization-based models.

^3 I.e., get full access to the device without restrictions.

• Rooted access: Rooted access means access to the full device without restrictions. The user is able to access or modify the system without any of the limitations which normally exist to prevent malware and to protect digital rights. In this context, this means that there has to be

49 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

access to the whole device to run the software, but the system does not have to be modified.

• Modifications: This is like rooted access, but the difference is that it also requires modifications to the operating system or libraries. This is the most permissive choice, but it restricts users to those who have full access to the device, have enough knowledge, and accept the risk of breaking the operating system.

5.3.2 Application Requirements

Some of the estimation techniques, like Eprof [8] and PowerProf [6], require access to the application source code. There is also a version of Eprof which relied on a modified API that allowed the estimation application to work without source code for the parts which ran on the virtual machine [8]. PowerProf requires the source code because it works by instrumenting the application source code with its API.

6 Conclusions

There are many papers published about energy consumption estimation on mobile devices. Each of them takes a slightly different approach to the subject, and some present the topic quite differently. However, they all present a solution to the same problem, and the general structure is almost the same in all of them. The real differences are usually improvements to some part of the process.

References

[1] Mobile Device Power Monitor Manual. Monsoon Solutions Inc., 2014.

[2] A. Carroll and G. Heiser. An analysis of power consumption in a smartphone. In USENIX Annual Technical Conference, pages 1–14, 2010.

[3] C. Yoon, D. Kim, W. Jung, C. Kang, and H. Cha. AppScope: Application energy metering framework for Android smartphones using kernel activity monitoring. In 2012 USENIX Annual Technical Conference (USENIX ATC '12), June 2012.

[4] M. Dong and L. Zhong. Self-constructive high-rate system energy modeling for battery-powered mobile systems. In MobiSys '11. ACM, June 2011.

[5] W. Jung, C. Kang, C. Yoon, D. Kim, and H. Cha. Non-intrusive and online power analysis for smartphone hardware components. Technical Report MOBED-TR-2012-1, Yonsei University, 2012.

[6] M. B. Kjaergaard and H. Blunck. Unsupervised power profiling for mobile devices. In MobiQuitous, 2011.

[7] A. Pathak, Y. C. Hu, M. Zhang, P. Bahl, and Y. M. Wang. Fine-grained power modeling for smartphones using system call tracing. pages 153–168, 2011.

[8] A. Pathak, Y. C. Hu, and M. Zhang. Fine grained energy accounting on smartphones with Eprof. ACM, April 2012.

[9] Wikipedia. iPhone — Wikipedia, the free encyclopedia, 2015. [Online; accessed 14-April-2015].

[10] Wikipedia. Lithium-ion battery — Wikipedia, the free encyclopedia, 2015. [Online; accessed 14-April-2015].

[11] Wikipedia. System on a chip — Wikipedia, the free encyclopedia, 2015. [Online; accessed 14-April-2015].

[12] L. Zhang, B. Tiwana, Z. Qian, Z. Wang, R. Dick, Z. Mao, and L. Yang. Accurate online power estimation and automatic battery behavior based power model generation for smartphones. In Proceedings of the Eighth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, pages 105–114. ACM, 2010.
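As an illustration of the weighted division described in Section 5.2.2, the following sketch splits the energy of one component-active period among concurrent applications in proportion to the data each transmitted. This is a minimal illustration of the idea under stated assumptions, not AppScope's actual code; the function name and the joule-denominated interface are invented for the example.

```python
def divide_component_energy(total_energy_joules, bytes_by_app):
    """Split the energy of one component-active period among the
    applications that used the component, weighted by the amount
    of data each application transmitted."""
    total_bytes = sum(bytes_by_app.values())
    if total_bytes == 0:
        # No traffic recorded: attribute nothing (a policy choice).
        return {app: 0.0 for app in bytes_by_app}
    return {app: total_energy_joules * sent / total_bytes
            for app, sent in bytes_by_app.items()}
```

For example, a 10 J active period shared by two applications that sent 3 MB and 1 MB would be attributed 7.5 J and 2.5 J respectively.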

User trajectory recognition in an indoor environment

Sami Karvonen Student number: 78884T [email protected]

Abstract

This paper is a detailed survey of state of the art methods for user trajectory recognition using inertial sensors on commodity smartphones. The basic components needed for indoor tracking are an accelerometer, a gyroscope, a digital compass and the floor plan of the building. User trajectories are useful, for example, for navigating inside a building, and applications such as targeted advertising are also possible. Recording user trajectories is separated into three different tasks, which are step detection, step length estimation and heading detection. The main challenges in these tasks are small errors in the sensors, which can accumulate into significant errors in the user's path. A possible solution to these errors is particle filtering. Other challenges come from users' individual differences in the usage patterns of their smartphones and differences in walking patterns.

KEYWORDS: Wi-Fi fingerprinting, indoor localization, trajectory recognition, step recognition, heading recognition, particle filters

1 Introduction

Indoor navigation is a useful technology with many applications, such as navigating inside a building or locating people in medical emergencies. Since GPS does not work inside a building, multiple different solutions have been developed for indoor localization. Applications using technologies such as WiFi fingerprinting [3], GSM fingerprinting [10] and methods using the Earth's magnetic field [9] have been developed. These methods use prerecorded fingerprint databases for localization, which requires a lot of manually gathered data. The methods also have problems with differently calibrated or inaccurate sensors on mobile devices.

This paper focuses on user trajectory tracking using inertial sensors such as the accelerometer and gyroscope. The user trajectory is recorded as a displacement from a known starting point, and floor plans are used to assist in determining the user's path inside the building. When the path is known, it can be used to establish the user's current location inside the building; also, gathering fingerprint data along the path is effortless. For example, the Zee system [8] employs user trajectory recognition to crowdsource the gathering of Wi-Fi fingerprint data.

User traces are mainly used for indoor navigation. They can be used to help current Wi-Fi fingerprinting-based applications. These applications require an extensive fingerprint database which is very time consuming to collect. Recording the steps between fingerprints to measure the distance between them can reduce the number of separate fingerprints needed for deploying an indoor navigation system [11]. Indoor navigation is useful in itself, but it also enables many different kinds of applications which rely on it. For example, shops inside a mall could use the data for a targeted advertising system which sends discount advertisements to people who are just walking past the shop. Another application could be organizing the evacuation of a building in an emergency situation.

The background section introduces the main components, some reasons why the trajectories are useful, and the main challenges in recording user trajectories. The solution section provides solutions to the main challenges, such as those encountered with accumulating sensor errors and differences in usage patterns of the mobile device. It also introduces some algorithms from different papers and compares how their solutions differ from each other.

2 Background

This section introduces the key components necessary for user trajectory tracking and the main challenges involved.

2.1 Key components for trajectory tracking

Mobile devices have multiple inertial sensors to detect the phone's movement and orientation. A 3-axis accelerometer detects the acceleration of the phone, which is very important for detecting the steps of the user. When we add a compass and a gyroscope, it also becomes possible to detect the direction of the movement. With these three sensors user trajectory recognition is possible, but in reality the sensors of a typical off-the-shelf smartphone are inaccurate and the errors in them make path recognition difficult.

In addition to the phone's sensors, a floor map of the building in which the user is moving is also important for two reasons. Firstly, the map is obviously necessary to show the user his location in the building, and secondly, the map can be used in combination with the sensor readings to rectify any mismatch between the real path and that shown by the sensor readings. For example, if a user is walking along a straight corridor, and the sensor readings indicate a slightly different trajectory, such as walking through a wall, which is impossible, there must be an error. This error can be corrected by comparing the path indicated by the sensors and the position of walls shown by the map.

The actual recording of a user trajectory requires knowledge of two things. First, we need the heading of the user.


For this purpose we need the built-in compass to determine the heading and the gyroscope to determine the phone's orientation. Secondly, we need to determine the distance travelled. In dead reckoning, the distance is determined by the steps taken by the user. Each individual step can be recognized by using the phone's accelerometer, but since steps are not a very reliable distance metric, we also need to estimate the length of each step. Once we know the heading of each step and the length of the steps, we can calculate the full trajectory of the user.

Wi-Fi fingerprints are used for a different kind of indoor navigation, where the fingerprints are first manually recorded inside the building and then used to locate the user. The goal of user trajectory recognition is to render it unnecessary to gather this fingerprint database, but we can still make use of the technique. We do not have this database to start with, but at any time when tracking the user's path inside the building, we can record the fingerprint data and use it in the future as a known starting point in the building in order to make the trajectory tracking more precise. The fingerprints provide points in the building where the user's position is known, and these points can be used to guide the tracking of the user's trajectory. They might also be helpful in correcting errors in step length estimation by providing two known points with a known distance between them. This would require, though, that we could use the map to measure the distance between the points, because it is impossible to calculate the distance between two fingerprints from the fingerprint data alone.

2.2 Challenges

User trajectory recognition usually starts with detecting the user's steps and their direction. Since mobile devices such as phones and tablets are used in very different manners, it is sometimes hard to detect delicate movements such as steps. Some people browse the web while they walk and others just keep the phone in a pocket or a bag. All these different locations have to be taken into account while detecting the user's steps; moreover, random movements such as putting the phone in the pocket can be hard to differentiate from steps. Also, steps are not a very accurate measure of length, thus algorithms to measure the length of each step in different situations are necessary. [5]

The inertial sensors mentioned above are not very accurate and even a small error might be a problem. A small error in a sensor is not a big problem in itself, but when we are continuously tracking the user's trajectory, even a small error can accumulate over the iterations and become a very significant error in the overall path of the user [8]. For this reason we need some way to refine the sensor readings.

The digital compass is necessary for detecting the user's heading. In typical off-the-shelf mobile devices it is even more inaccurate than the inertial sensors. A lot of magnetic interference also affects it, which makes it hard to use in a useful manner. In most cases this is the most difficult challenge to solve in trajectory recognition.

3 Solutions

User trajectory recognition is a difficult challenge in an indoor environment, and it is easier to solve when it is divided into smaller independent tasks. This section surveys the state of the art solutions to these tasks and compares them. Usually user trajectory tracking is divided into four tasks: step detection, step length estimation, heading detection, and filtering out the errors from the sensors.

3.1 Step Detection

The basis for step detection on a smartphone comes from the embedded accelerometer sensor. The readings from the sensor are recorded to a buffer and generated into a graph. Figure 1 shows these similar patterns, which can be recognized as steps. The problem with reliably recognizing steps on the graph is that the phone is also subject to other motions, such as the user sitting down or turning. These motions can be similar to steps in the accelerometer graph and may lead to false positives in step detection. [6]

Figure 1: Step graph from accelerometer. Source [6]

Detecting steps reliably is an important part of tracking the user trajectory and many different solutions have been proposed. Y. Liu et al. [6] propose a solution which compares neighbouring sequences in the graph, and if they are similar enough they are recognized as steps. A slightly different solution in [5] focuses on detecting the peaks and valleys of the graph and filtering out false peaks using a given threshold. Some heuristics are also added to decrease the number of false positives. In my opinion, these solutions are not mutually exclusive and might actually benefit from each other if used together.

In trajectory recognition, steps are also used to measure the distance which the user has travelled, but the problem is that step length can vary greatly in different circumstances. Obviously, different people have different step lengths, but step length can also differ depending on a person's health status, the type of ground the person is walking on, and many other factors [5]. Some generalized algorithms for step length estimation have been proposed in [4], but these

algorithms require too much information, such as the user's foot length, and make too many assumptions, such as assuming the phone is located in the user's pocket. These assumptions make the algorithms too inconvenient for use in an actual indoor localization application. This is why we need a more suitable way to estimate step length. One approach is to use particle filters to gather data from the user's walking patterns and continuously adapt to changes in step length. Particle filters are discussed more thoroughly in Section 3.3.

A. Pratama et al. [7] present an experimental setup to evaluate four equations, (1)-(4), for step length estimation. The experiment was performed over distances of 10, 20 and 30 meters by 15 test subjects. The static method (1) assumes that step length can be calculated from the user's height:

    step_size = k · height    (1)

The Weinberg approach (2) assumes that the vertical bounce of walking is proportional to the length of the steps:

    step_size = k · (a_max − a_min)^(1/4)    (2)

The Scarlet approach (3) considers that the length of the steps can be calculated from the spring of the steps:

    step_size = k · ((Σ_{k=1..N} |a_k|) / N − a_min) / (a_max − a_min)    (3)

The Kim approach proposes an equation which represents a relation between step length and the average acceleration during the step:

    step_size = k · ((Σ_{k=1..N} |a_k|) / N)^(1/3)    (4)

As we see in Table 1, the Scarlet method produces the best results in this experiment, with an average error of only 1.3913 meters. This level of accuracy should be enough for at least some applications of indoor localization. This experiment was performed with the assumption that the user is holding the phone in his hand. This is not always the case in the real world, thus it would be interesting to see how these equations perform when the phone's location is not known.

    Table 1: Results from the experiment in paper [7]
    Method     Estimation error average (m)
    Static     2.5815
    Kim        1.6917
    Weinberg   1.4357
    Scarlet    1.3913

3.2 Heading Detection

Determining the heading of each step is as important for dead reckoning as detecting the steps and their length. Smartphones have a built-in compass which can be used to determine the heading of the phone, although it is even more prone to errors than the accelerometer. These errors are caused by objects that affect the surrounding magnetic fields, such as power lines and iron reinforcements in buildings. According to [2], the typical accuracy of the built-in compass is 5 degrees or less. At the level of single steps this is not much, but these errors can accumulate over multiple steps and cause significant errors in the position after multiple iterations.

D. Gusenbauer et al. [2] attempt to solve this problem by presenting an adaptive filtering function. The function relies on the fact that the faster the user is moving, the less likely he is to change direction. Thus, false changes in direction can be filtered out when the user is moving fast, and the filtering can be tuned down when the movement is slower. However, this does not help with the fact that at turning points of the movement, such as intersecting corridors, we need to change direction, and the errors in the sensor might lead to an error in the estimation of the angle of the turn.

A different approach to this problem is introduced in [5]: it uses particle filters and floor plans to correct the errors in the sensors. More about particle filters in Section 3.3.

3.3 Particle filtering

Particle filtering is a general way to filter out errors in sensor readings. Simply put, in a user trajectory context, particle filtering uses random particles to represent the user's position. The particles move according to data acquired from the sensors. Each of these samples moves independently, and the information obtained from a floor plan of the building is used to weight the samples. For example, when a particle hits a wall, it dies and another particle is generated near a living particle. When this process goes through multiple iterations, the particles that collide with walls because of sensor errors die out, and new particles are spawned near living particles which represent a possible path of the user. The combined weight of the living particles represents the probable position of the user, and the impossible paths are filtered out when particles die. Using a particle filter can significantly improve user trajectory tracking compared to using only the available sensor data. Figure 2 shows the improvement made to an estimated path when particle filtering is applied to the sensor readings. [1]

Figure 2: Walking simulation. Blue lines are building walls. Red (continuous line) is the real path. Green (squares) is the path with only sensor data. Cyan (stars) is the estimated path when particle filtering is applied. Source [1].

Particle filters can also be used in a different way to improve user trajectory recognition. An approach to using particle filtering for step length personalization is proposed by F. Li et al. in [5]. The basic idea is the same as before, but in this case a personalization model is applied to each particle as it is generated. This works well in long corridors, where the main reason for particles to die is the fact that the personalization model makes the particles collide with a wall; in other words, when the particles do not match the real path they die out. This makes the particles of the less accurate models disappear, which, in turn, means the aggregation of the remaining particles represents a more accurate model of reality. A problem arises at turn points, where particles die because of the turn and not the model. This is why it is important to also implement a turn detection algorithm and use it to decide whether a particle died because of a turn or because of its personalization model.

Particle filtering is a powerful method which has been used in many applications of user trajectory recognition. For example, F. Li et al. [5] use it for step length personalization and P. Davidson et al. [1] use it for overall error correction in the path. The problem with particle filtering is that it is a simulation method which requires a lot of processing power. This is because every particle's path needs to be calculated separately. Since we are working in a mobile environment, the power requirements should be taken into account, with careful consideration of where the high power requirements are necessary and where they are not.

3.4 Observations of different solutions

The step detection part of the problem seems to be sufficiently solved in many of the research papers. For example, F. Li et al. [5] report their step detection algorithm to have only 1.6% error and A. Rai et al. [8] report 0.6% error. The comparison of step length estimation algorithms presented before shows that a 30 meter distance can be measured by steps with an average error of 1.4 meters. The overall error in the path length is a combination of these. The combined error margins should be small enough for a working application of indoor trajectory recognition. Thus this part of the problem should be considered solved.

The problem comes with accurate heading detection. On many occasions [8] [5] [1] it is mentioned that the error of the digital compass in typical off-the-shelf mobile devices is very high. It can be as high as 12 degrees on average. This is the main source of errors in the path in most of the applications of trajectory recognition. P. Davidson et al. [1] seem to be successful in correcting these errors with particle filtering, but the data in their research paper is not very extensive, thus further research is needed.

4 Conclusion

This paper surveys the state of the art indoor trajectory recognition techniques and presents the main challenges in recording user traces. User trajectory tracking in an indoor environment is a challenging problem with many separate tasks that all affect the accuracy of the whole system. The three main tasks are step detection, step length estimation and heading detection, which all require data from the fairly inaccurate sensors of a smartphone. There is a lot of research on the subject offering many solutions to these problems, but many of the systems still seem inaccurate as a whole. The most challenging problem seems to be heading detection, because of the inaccuracy of the digital compass and the large amount of magnetic interference which also affects the compass readings. Errors in compass readings are challenging because they accumulate over time and cause a large drift in the overall path. For future work, more research on heading detection and compass accuracy is necessary. The main questions that require research are: is there any way to refine the data from the digital compass to make it more accurate, and does particle filtering work well enough in every situation to correct the errors of inaccurate compass readings? Furthermore, some larger scale practical experimentation on the solutions to each separate problem would be useful to determine which of them work best, and to combine the efforts made in all of them.

References

[1] P. Davidson, J. Collin, and J. Takala. Application of particle filters for indoor positioning using floor plans. In Ubiquitous Positioning Indoor Navigation and Location Based Service (UPINLBS), pages 1–4, October 2010.

[2] D. Gusenbauer, C. Isert, and J. Krösche. Self-contained indoor positioning on off-the-shelf mobile devices. In Indoor Positioning and Indoor Navigation (IPIN), 2010 International Conference on, pages 1–9, Sept 2010.

[3] M. N. Husen and S. Lee. Indoor human localization with orientation using wifi fingerprinting. In Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication, ICUIMC '14, pages 109:1–109:6, New York, NY, USA, 2014. ACM.

[4] J. Jahn, U. Batzer, J. Seitz, L. Patino-Studencka, and J. Gutiérrez Boronat. Comparison and evaluation of acceleration based step length estimators for handheld devices. In Indoor Positioning and Indoor Navigation (IPIN), 2010 International Conference on, pages 1–6, Sept 2010.

[5] F. Li, C. Zhao, G. Ding, J. Gong, C. Liu, and F. Zhao. A reliable and accurate indoor localization method using phone inertial sensors. In the 2012 ACM Conference on Ubiquitous Computing (UbiComp '12), pages 421–430, September 2012.


[6] Y. Liu, M. Dashti, and J. Zhang. Indoor localization on mobile phone platforms using embedded inertial sensors. In Positioning Navigation and Communication (WPNC), 2013 10th Workshop on, pages 1–5, March 2013.

[7] A. Pratama, Widyawan, and R. Hidayat. Smartphone-based pedestrian dead reckoning as an indoor positioning system. In System Engineering and Technology (ICSET), 2012 International Conference on, pages 1–6, Sept 2012.

[8] A. Rai, K. K. Chintalapudi, V. N. Padmanabhan, and R. Sen. Zee: zero-effort crowdsourcing for indoor localization. In the 18th Annual International Conference on Mobile Computing and Networking (Mobicom '12), pages 293–304, August 2012.

[9] K. P. Subbu, B. Gozick, and R. Dantu. Locateme: Magnetic-fields-based indoor localization using smartphones. ACM Trans. Intell. Syst. Technol., 4(4):73:1–73:27, Oct. 2013.

[10] A. Varshavsky, A. LaMarca, J. Hightower, and E. de Lara. The skyloc floor localization system. In Proceedings of the Fifth IEEE International Conference on Pervasive Computing and Communications, PERCOM '07, pages 125–134, Washington, DC, USA, 2007. IEEE Computer Society.

[11] X. Zhang, Z. Yang, C. Wu, W. Sun, Y. Liu, and K. Liu. Robust trajectory estimation for crowdsourcing-based mobile applications. Parallel and Distributed Systems, IEEE Transactions on, 25(7):1876–1885, July 2014.
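As a compact summary of the step length estimators compared in Section 3.1 (Eqs. (1)-(4)), the sketch below implements all four in one function. The calibration constant k and the preprocessing of the accelerometer magnitudes a_k into a per-step list are illustrative assumptions, not taken from any of the surveyed papers.

```python
def step_length(acc, k, method="scarlet", height=1.75):
    """Estimate one step's length from the accelerometer magnitudes a_k
    sampled during the step, using the estimators of Eqs. (1)-(4).
    k is a per-user calibration constant."""
    a_max, a_min = max(acc), min(acc)
    a_avg = sum(abs(a) for a in acc) / len(acc)
    if method == "static":    # Eq. (1): proportional to the user's height
        return k * height
    if method == "weinberg":  # Eq. (2): fourth root of the vertical bounce
        return k * (a_max - a_min) ** 0.25
    if method == "scarlet":   # Eq. (3): normalized average acceleration
        return k * (a_avg - a_min) / (a_max - a_min)
    if method == "kim":       # Eq. (4): cube root of the average acceleration
        return k * a_avg ** (1.0 / 3.0)
    raise ValueError("unknown method: " + method)
```

Summing these per-step estimates along the detected steps yields the travelled distance used in dead reckoning.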

Virtual Machine Consolidation with Multi-Resource Usage Prediction

Kimmerlin Maël Student number: 399724 [email protected]

Abstract

One of the major concerns currently in cloud computing is the energy efficiency in the use of data centers. In this context, cloud providers aim at avoiding underutilization and overutilization of their infrastructure, thus reducing both the power consumption and the Service Level Agreement (SLA) violations. The live Virtual Machine migration technology enables the consolidation of virtual machines, thus allowing cloud providers to reallocate virtual machines onto a few servers and switch off unused physical machines, depending on the load level of the current host. However, the migrations have to be triggered carefully due to the diversity and the actual variability of the virtual machines' workload. This paper focuses on the resource utilization prediction and on the overload or underload detection. We survey several paradigms that are effective in forecasting the next values, such as Linear Regression and Neural Networks, to estimate the prediction function based on the history of the host. We then present a virtual machine consolidation algorithm with prediction for improving the energy efficiency of cloud data centers. We use the CloudSim simulation toolkit to evaluate our proposed solution.

KEYWORDS: VM consolidation, load prediction, CloudSim, energy efficiency

1 Introduction

Data centers and cloud computing raise several concerns, mostly about energy efficiency. A problem associated with the high speed development of Information and Communication Technologies (ICT) is the high carbon footprint it generates. Indeed, this footprint forms a relatively high part of the global carbon emissions, and concerns have been raised about the efficiency of data centers, which are always overprovisioned to sustain peak demand [12]. Furthermore, overprovisioning a data center affects the Quality of Service (QoS), i.e. the response time, of the hosted applications. Therefore, cloud providers need to fulfill the Service Level Agreement (SLA) for assessing the QoS level offered to their customers. On the other hand, underprovisioning consumes resources uselessly, resulting in unnecessary activation of physical servers and thus increasing the actual costs.

The main challenge is to decide whether a host is overloaded or underutilized. When a host is underutilized, all virtual machines (VMs) from this host are selected for migration to suitable hosts in order to turn off the idle host to save energy [3]. Specifically, underutilized host migrations should not impact the currently fully utilized hosts, nor should they create overloads on the other hosts. Similarly, when a host is overloaded, the orchestrator should migrate some VMs to reduce the load [6]. The orchestrator can trigger both underutilization and overutilization migrations by using threshold values, so that when the resource consumption reaches a low or high threshold, it starts migrating VMs [1]. However, the current CPU or memory load is not an adequate indicator, as it might be biased by temporary peaks in the VM resource consumption. Thus it may cause unnecessary migrations, leading to an increased load on the hosts and the network. Consequently, it is important to base the migration decision on the trend and take into account the history, rather than short events. The orchestrator can predict the resource utilization in order to trigger migrations only for hosts that are effectively overloaded or underloaded. Moreover, Beloglazov et al. have shown that maximizing the intervals between relocations improves the VM consolidation [3]. However, even if resource prediction has been emphasized lately in research, it mostly takes into account a single CPU resource. Mastroianni et al. have shown that it is more efficient to take several resources, such as CPU, memory and bandwidth utilization, into account, since in cloud data centers the CPU is not the only critical resource [12]. Some hosts (such as caches or in-memory databases, for example) store the most requested content in memory and serve it on demand, thus the requirements for such hosts are on the memory and not on the CPU, as they do not require intensive computations. Other hosts could have their main requirement on the bandwidth, such as web servers for which the bandwidth is as important as the CPU capacities. Thus basing the migration decision on CPU only does not include all the parameters. A host might not be overloaded considering the CPU, but its memory could be depleted, thus creating SLA violations because of swapping or because of the killing of processes. Thus it is necessary to take all the parameters into account.

The main contributions of this article are as follows.

• We survey the current research on prediction based on history for multiple resources, and on overloaded and underloaded server detection.

• We then present an efficient multi-resource usage prediction (MUP) approach to estimate the short-term and long-term future utilization in terms of multiple resource types, based on the history of those resources.

• Using MUP, we propose an efficient multi-resource overloaded and underloaded host detection mechanism based on history.


• We perform simulations to validate the proposed solution.

The remainder of the article is organized as follows. Section 2 discusses the state-of-the-art research on resource utilization prediction and on overload and underload detection. Section 3 details a study of the methods to predict the next utilization level of the resources of the servers, such as linear regression and neural networks. Section 4 presents the detection mechanisms for overloaded and underloaded hosts for efficient VM migration. Section 5 presents the experimental setup and compares the results obtained by our proposed solution with existing solutions for VM consolidation. Finally, Section 6 concludes the article.

2 Related Work

Several approaches have been taken to solve the problem of overloaded or underloaded host detection. One of the simplest approaches was proposed by Gmach et al. [7] on VM consolidation. In detail, the authors set a static threshold and used the current CPU utilization to determine whether a host is overloaded. Specifically, the static threshold is set to 85% based on a statistical analysis of the workload traces. However, setting a static threshold and using the current resource usage are not efficient in environments with dynamic workloads, as short-term variations are taken into account and cause unnecessary migrations. Beloglazov et al. proposed an algorithm to dynamically adapt the value of the threshold [2]. In order to make more accurate predictions of the workload, the authors proposed a model based on Markov chains that enables them to formalize the placement problem using QoS metrics [3]. However, this approach is not adequate for variable workloads, thus they adapted the work of Luiz et al. on Multisize Sliding Windows to predict the next workload more accurately [11]. This approach is limited to a single resource; it does not consider other parameters that might cause an overload on servers. Thus, Mastroianni et al. proposed another approach to the multi-resource placement problem, using ant algorithms to solve the complex bin packing problem with multiple resources using multiple simple operators [12]. This approach is rather complex, as the bin packing problem is NP-hard and requires high computation power. Thus, we consider a simpler approach, using load prediction to temper the current utilization statistics.

3 Predicting Resource Utilization Based on History

In statistical analysis, several methods exist to predict the next value of a set of values, such as ordinary least squares, logistic regression, and vector auto-regression models. This section presents two efficient and effective forecasting paradigms, namely linear regression and the feedforward neural network, to estimate the prediction function according to the history of the VM's past resource utilization. In addition, we incorporate a sliding window technique in the training and a long-term prediction strategy.

3.1 Linear Regression

Linear regression consists of modeling the relationship between one or more input variables and the output variable. For example, the CPU load can be predicted using a simple linear regression (SLR) by taking the CPU load at time t as input variable and the CPU load at time t+1 as output. SLR can be written as in Eq. (1).

y(x) = α + βx + ε_x   (1)

ε_x represents the error for load x, equivalent to the difference between the predicted and the exact values; α and β are the regression coefficients that are estimated according to the resource history in order to minimize the overall error ε, using the least squares method.

The ordinary least squares (OLS) method is the most popular method to estimate the α and β parameters. The OLS is defined over a data set of N + 1 values by minimizing the sum of the squared errors as in Equation 2.

S(α, β) = Σ_{i=0}^{N} ε_i²   (2)

S(α, β) = Σ_{i=0}^{N} (y(x_i) − α − βx_i)²   (3)

The two gradients of the function should be equal to zero, as shown in Equations 4 and 5.

∂/∂α Σ_{i=0}^{N} (y(x_i) − α − βx_i)² = −2 Σ_{i=0}^{N} (y(x_i) − α − βx_i) = 0   (4)

∂/∂β Σ_{i=0}^{N} (y(x_i) − α − βx_i)² = −2 Σ_{i=0}^{N} x_i (y(x_i) − α − βx_i) = 0   (5)

Thus α and β can be defined as in Eqs. (6) and (7) as follows.

α = Ȳ − βX̄   (6)

β = [ Σ_{i=0}^{N} (x_i − X̄)(y(x_i) − Ȳ) ] / [ Σ_{i=0}^{N} (x_i − X̄)² ]   (7)

where X̄ and Ȳ are the average values of the series of x and y(x). This method has been extensively described by Farahnakian et al. [5].

In our prediction model, we focus on multi-resource load prediction, i.e., CPU, memory, and network, thus we consider Multiple Linear Regression (MLR). The principle of the MLR method is similar to SLR, but there is more than one input variable. In the MLR prediction function, all inputs can be combined to define an overall load of the host. MLR could then be used to predict the future overall resource load of the host based on the current load of all resource types. However, the relationship between multiple types of resources is not linear, due to the diverse set of user applications: general-purpose applications use a balanced amount of CPU, memory, storage and network resources; compute-intensive applications need more CPU resources; and database and memory caching applications need more memory resources. So we consider multiple simple linear regressions, whereby the CPU, memory, storage and network loads are computed separately. The overload or underload is then determined considering each indicator separately, with separate triggers.

3.2 Feedforward Neural Network

Neural networks allow non-linear predictions and are designed like a brain, with neurons and synapses. The neurons are entities able to perform simple operations based on simple conditions. In our prediction model, we use a fully-connected three-layer feedforward neural network (FFNN) [8]. The number of input neurons corresponds to the input data, the number of output neurons to the output data, and the number of hidden neurons can be selected empirically.

The neurons of an FFNN have several characteristics: the weights of the links connecting them and their activation function. An error-correction method enables tuning the weights. The activation function computes the output of the neuron considering the inputs. Usually the activation function is a sigmoid that outputs a value between 0 and 1. The sigmoid activation function is defined by Rumelhart et al. in [15] as:

f : R → R, f(x) = 1 / (1 + e^(−x))   (8)

Let w_{i,j} be the weight of the link between neuron i and neuron j; e.g., w_{I3,H2} is the weight of the link between neuron 3 of the input layer and neuron 2 of the hidden layer. There are several learning methods for neural networks: some of them take advantage of non-linearity, some sample the data sets randomly. During the training, we use the back-propagation approach to estimate the synaptic weights of the network that minimize the sum of squared errors in the output neurons [14]. The iterations of the training phase are discussed in the following.

First, the weights are initialized randomly. Then, for every set of data in the learning phase x_j(i), with i the identifier of the data set (up to N) and j the identifier of the input neuron, the network is fed with each set of data and an output value is calculated for all the neurons. The output of each neuron n is computed using Eq. (9).

o_n(i) = f( Σ_{j=1}^{H} w_{n,j} x_j(i) )   (9)

We then calculate E_n(i) as the difference between the real output Ō_n and the calculated output, and we update the weights with the formula given in Eq. (10).

w_n(t+1) = w_n(t) − γ Σ_{i=1}^{N} E_n(i) · f′( Σ_{j=1}^{H} w_{n,j} x_j(i) )   (10)

The condition to halt the training of a neural network is usually based on the sum of the squared errors, which should be under a defined limit. γ represents the learning rate of the network. This value should be chosen carefully: if it is too low, the network will converge too slowly, and if it is too high, the network might oscillate around its optimal values. Once the network has been adjusted with the whole training data set, it can be given the current data set to predict the next values [10].

3.2.1 Sliding Window Prediction

The sliding window technique maps multiple input sequences to an individual output by using a prediction model. For example, when predicting the value for time t + 1, the values used are from the history in [t − k; t] for a sliding window of interval k [10]. Thus the size of the problem remains reasonable, even if the size of the history increases.

3.3 Long-term Prediction Strategies

In order to make consistent decisions about the load induced by a VM, a long-term prediction mechanism is needed. There are several strategies to extend the short-term prediction methods to long-term predictions, as described in [8]. We consider the prediction function as presented in Eq. (11) [15].

U_{x,t+1} = f(U_{x,t}, U_{x,t−1}, ...)   (11)

3.3.1 Recursive Strategy

The recursive strategy predicts the n steps ahead separately; each of the predictions includes all the previously predicted values, as described in Eq. (12). This method may increase the errors through the prediction steps.

U_{x,t+n+1} = f(U_{x,t+n}, U_{x,t+n−1}, ..., U_{x,t}, U_{x,t−1}, ...)   (12)

where t is the current time, U_x the utilization of resource x, and n the number of steps ahead for the prediction. All the future resource utilization levels are calculated using this formula; thus it is a recursive approach.

3.3.2 Direct Strategy

Instead of predicting n steps and using them to predict step n + 1, the direct strategy uses only a single step, but based on predictors separated by n steps, as presented in Eq. (13). Therefore, the interval between each value of the history grows to match the step between the last real value and the first predicted one. Nonetheless, this model also has important drawbacks, as it requires a longer history and may hide spikes. It is highly probable that short-term spikes occur during the interval between two samplings and thus are not reflected in the prediction. Thus, this method can provide a good average prediction only if it is fed with enough data. The complexity of this method also increases linearly with the number of predictions, as the history data set has to be changed for every new prediction.

U_{x,t+n} = f(U_{x,t}, U_{x,t−n}, U_{x,t−2n}, ...)   (13)

3.3.3 Combined Strategy

The combined strategy, namely DirRec, has been proposed to overcome both issues mentioned above. In detail, the DirRec method uses a direct strategy to predict the first value and a recursive strategy for the next ones, as presented in Eq. (14), thus reducing the complexity and the possible error.

U_{x,t+k·n} = f(U_{x,t+(k−1)·n}, ..., U_{x,t}, U_{x,t−n}, ...)   (14)

where k is the number of n-step-ahead predictions of the approach.

4 Overloaded and Underloaded Host Detection

Predicting the load of a host can be achieved by focusing on multiple types of resources. It is possible to focus on the host load history, for example, and predict the next values out of it. However, this is not reliable in the case of virtualization and VM migration, as the load is not continuous. When a VM is migrated to or from a host, the load of the host can increase or decrease greatly. Thus, focusing on the host itself is not the right solution. However, the history of a VM does not change wherever it is placed, so it is possible to predict the load of a VM more accurately (as long as there is no disruption caused by the migration). It is thus possible to predict the load of a host by predicting the load of all its VMs. Indeed, the load on a host is equal to the load of all its VMs plus an overhead for the virtualization system and the operating system of the host. Thus, we focus the predictions on the VMs, while focusing the triggers of underload or overload detection on the physical host.

We consider the overhead to be non-linear with regard to the number and load of the VMs. Considering the memory, for example, the overhead consists of the memory of the operating system and the memory used by the virtualization system (both a fixed amount and a variable amount depending on the number of VMs and their capacities). In order to determine the order of the overhead, Tong et al. ran experiments to measure it [16]. They found that the overhead is related to the number of VMs running and has a minimum due to the footprint of the system.

We consider overloaded host detection and underloaded host detection with similar approaches. For instance, if a method uses a maximum threshold to determine whether a host is overutilized, a similar approach can be taken for underloaded host detection, using a minimum threshold to determine whether a host is underutilized. As the detection method for overloaded hosts can be applied to underutilized hosts, we mainly focus on overloaded host detection.

4.1 Parameters Correlation

The correlation of parameters such as the CPU, memory, storage or network loads is highly variable depending on the purpose of the VM [12]. For some tasks, VMs require mainly memory (e.g., the Hyrise in-memory database [9]) without needing high CPU performance. Other types require computing power, storage or bandwidth. However, in particular cases, some parameters can be highly correlated. Thus, we expect that predicting the next steps using all the parameters increases the accuracy of the prediction. Using an FFNN would enable us to learn the correlation between the parameters during the training phase. However, due to the linearity of the linear regression method, we expect the accuracy of the prediction to decrease when taking all the parameters into account.

4.2 Static Threshold

A first method to detect an overload is to use static thresholds, as described in [3]. In that case, a static threshold was used and set to 85% to determine whether a host is overloaded. This method can be extended to multiple resources by setting a static threshold for each parameter. However, this method presents some drawbacks; for example, in case of a spike in the load, migrations will be triggered while the host load decreases, thus creating more load than needed. Thus improvements of this method are needed.

4.3 Adaptive Threshold

The previous method presents a drawback. If the load of the VMs varies widely, it is highly probable that some unexpected peaks happen. Those peaks could create a high temporary overload that was not expected, as they happen between the verifications. A solution would be to adapt the threshold. If the VMs are known to have a highly variable load, the threshold can be reduced for overloaded host detection; thus, in case of a peak, its impact is lower, as there are more free resources. Similarly, for underloaded host detection, if the VMs have highly variable loads, the threshold is increased, so that a temporary reduction of the load does not cause a consolidation that would need to be undone when the load comes back to its normal level. Thus the variance of the load is a good indicator for adapting the threshold and performing more efficient migrations, by removing the unneeded ones. However, the energy consumption increases slightly [13].

5 Performance Evaluation

5.1 Experimental Setup

The CloudSim toolkit is a powerful simulation tool that allows testing seamlessly the performance of a new application or feature, as it models and simulates large clouds (from cloud resources to datacenters) [4]. The CloudSim toolkit integrates traces of workload from PlanetLab servers. We used the traces of 10 different days. The results presented here are averaged.


CloudSim uses allocation policies to determine whether a host is overloaded. We extended the static threshold policy to integrate multiple resources and load prediction. We implemented the FFNN and SLR prediction methods for the load prediction with the sliding window method.

5.2 Evaluation Metrics and Comparison Benchmarks

We considered several metrics for comparing our solution to the existing ones. The first metric we used is the energy consumption of the datacenter. The energy consumption varies as servers are turned off or on to sustain the load. Our aim is to minimize this parameter to reduce the footprint of the datacenters. The second metric is the number of VM migrations. Migrating VMs consumes resources, thus reducing this number helps to reduce the energy consumption. The third metric is the overall SLA violations. This describes the service provided to the user; the aim is to make it null. Between those three metrics, there is a tradeoff to balance. It is not possible to minimize all of them: when minimizing the SLA violations, for example, the energy consumption increases, as more hosts are turned on to avoid overutilization.

We ran several test cases. For the SLR, we used different sizes of sliding window, from 5 samples to 100 samples. We expect that increasing the number of samples moves the prediction closer to the average load of the VM, thus hiding peaks and increasing the SLA violations. We also changed the number of predicted steps to study its impact. For the FFNN, we also varied the size of the sliding window from 5 to 100 samples and the number of steps ahead in the prediction. We used two different ways of predicting with the FFNN. The first one was to consider every parameter separately and use an FFNN for each of them, thus not capturing the interactions between the parameters. The second was to use a single FFNN with all the parameters as input and several outputs, in order to capture those interactions.

5.3 Experimental Results

5.3.1 FFNN Results

Unfortunately, none of our multiple implementations and test cases using FFNN succeeded. In most of the cases, the FFNNs never converged, even when using high error rates. When they did converge, the time needed to train them was far too high considering the time interval between the samplings of the resource utilization: the prediction mechanism was not fast enough to produce usable output. Thus using FFNN in this context was a failure. However, more research is needed before discarding this solution completely.

5.3.2 Simple Linear Prediction Results

The energy consumption is depicted in Fig. 1. When using SLR, the hosts are detected as overloaded less often, as SLR hides the peaks in the resource consumption. Thus the energy consumption is highly limited, as fewer hosts are switched on.

Figure 1: Datacenter energy consumption (kWh)

Similarly, the number of VM migrations is much lower when using the prediction algorithm, as shown in Fig. 2.

Figure 2: Number of VM migrations

However, the drawback of this high energy efficiency is that the SLA violations percentage nearly doubles, as depicted in Fig. 3. Thus this solution is not optimal, as the users perceive the resource exhaustion.

Figure 3: Average SLA violation percentage


5.4 Limitations of the Simulation

This simulation presents several limitations. The main one is that in the traces used from PlanetLab, the load is mainly on the processing capacities, thus the impact of the other parameters on the overloaded or underloaded host detection is not visible. In order to validate the model based on CPU, memory and bandwidth, other traces would be needed. However, despite an active search for such traces, the PlanetLab traces were the only ones available; the simulation thus has a severe drawback on that point.

A second limitation is that the VM selection for the migration process was not taken into account, because it was defined as such in the subject; for the sake of completeness, it should be included in the study. However, since the same selection policy was used for all the test cases, we believe that it does not invalidate the results obtained.

6 Conclusion

Energy efficiency is the new target in datacenters. Several methods can be used to tend towards this objective. The main one is to migrate VMs from underutilized hosts to others, while ensuring that none of them gets overutilized, in order to avoid SLA violations. However, the current methods focus on the processing capacities and do not consider other parameters of the physical host. Thus we propose some modifications to integrate the other parameters in order to achieve a better consolidation of the VMs. We also present prediction methods to optimize the detection of overloaded and underutilized hosts. Those methods are Feedforward Neural Networks and Simple Linear Regression. We implemented both of them and tried them in a simulation using real-world traces from PlanetLab. Even though the FFNN method did not succeed, SLR gives interesting results in terms of energy efficiency. However, it increases the SLA violations. Thus, further research might be needed to improve the SLR method by finely tuning its parameters, or to investigate other kinds of neural networks.

References

[1] A. Beloglazov and R. Buyya. Adaptive threshold-based approach for energy-efficient consolidation of virtual machines in cloud data centers. In Proceedings of the 8th International Workshop on Middleware for Grids, Clouds and e-Science, MGC '10, pages 4:1–4:6, New York, NY, USA, 2010. ACM.

[2] A. Beloglazov and R. Buyya. Optimal online deterministic algorithms and adaptive heuristics for energy and performance efficient dynamic consolidation of virtual machines in cloud data centers. Concurrency Computat.: Pract. Exper., 24:1397–1420, 2012.

[3] A. Beloglazov and R. Buyya. Managing overloaded hosts for dynamic consolidation of virtual machines in cloud data centers under quality of service constraints. Parallel and Distributed Systems, IEEE Transactions on, 24(7):1366–1379, July 2013.

[4] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, and R. Buyya. CloudSim: A toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Softw. Pract. Exper., 41(1):23–50, Jan. 2011.

[5] F. Farahnakian, P. Liljeberg, and J. Plosila. LiRCUP: Linear regression based CPU usage prediction algorithm for live migration of virtual machines in data centers. In Software Engineering and Advanced Applications (SEAA), 2013 39th EUROMICRO Conference on, pages 357–364, Sept. 2013.

[6] H. Ferdaus and M. Murshed. Energy-aware Virtual Machine Consolidation in IaaS Cloud Computing, chapter 55, page 33. Springer, 2014.

[7] D. Gmach, J. Rolia, L. Cherkasova, and A. Kemper. Resource pool management: Reactive versus proactive or let's be friends. Comput. Netw., 53(17):2905–2922, Dec. 2009.

[8] A. Grigorievskiy, Y. Miche, A.-M. Ventela, E. Severin, and A. Lendasse. Long-term time series prediction using OP-ELM. Neural Networks, 51:50–56, 2014.

[9] M. Grund, J. Krüger, H. Plattner, A. Zeier, P. Cudre-Mauroux, and S. Madden. HYRISE: A main memory hybrid storage engine. Proc. VLDB Endow., 4(2):105–116, Nov. 2010.

[10] S. Islam, J. Keung, K. Lee, and A. Liu. Empirical prediction models for adaptive resource provisioning in the cloud. Future Gener. Comput. Syst., 28(1):155–162, Jan. 2012.

[11] S. Luiz, A. Perkusich, and A. Lima. Multisize sliding window in workload estimation for dynamic power management. Computers, IEEE Transactions on, 59(12):1625–1639, Dec. 2010.

[12] C. Mastroianni, M. Meo, and G. Papuzzo. Probabilistic consolidation of virtual machines in self-organizing cloud data centers. Cloud Computing, IEEE Transactions on, 1(2):215–228, July 2013.

[13] K. Maurya and R. Sinha. Energy conscious dynamic provisioning of virtual machines using adaptive migration thresholds in cloud data center. International Journal of Computer Science and Mobile Computing, 2(3):74–82, March 2013.

[14] R. Rojas. Neural Networks: A Systematic Introduction. Springer-Verlag, New York, NY, USA, 1996.

[15] D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, editors. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations. MIT Press, Cambridge, MA, USA, 1986.

[16] G. Tong, H. Jin, X. Xie, W. Cao, and P. Yuan. Measuring and analyzing CPU overhead of virtualization system. In Services Computing Conference (APSCC), 2011 IEEE Asia-Pacific, pages 243–250, Dec. 2011.

Multimedia Streaming over Cognitive Radios

Pranvera Kortoçi
Student number: 505709
[email protected]

Abstract

Mobile phones, tablets, and laptops have become widely used wireless communication devices that cover most Internet-related activities. These devices are deployed in various areas and aspects of daily life. As a result, mobile traffic is facing a high demand from end-users. Video now represents more than 50% of all mobile traffic, and is expected to increase further in the near future. In order to cope with the demand for Radio Frequency (RF) spectrum, current approaches to access this resource call for conceptual modifications. This paper provides an overview of traditional techniques that provide access to the RF spectrum, and investigates cognitive radio (CR) technology as a more efficient methodology of RF access. In addition, it details emerging models that enable sharing of the scarce RF resources, thereby increasing the availability of spectrum resources to the users. Finally, the paper presents the current multimedia protocols and the ways in which multimedia streaming leverages the CR concept.

KEYWORDS: cognitive radio, spectrum sensing, adaptive video streaming, dynamic spectrum access

1 Introduction

Typical applications for smartphones, tablets, and other mobile devices range from multimedia messaging and video telephony/conferencing to video streaming [19]. As a consequence, network operators predict an increase of 1000 times in mobile data usage [8]. The RF spectrum is insufficient to satisfy the constantly increasing demand from end-users to access it. Specifically, the RF spectrum suitable for mobile communications ranges from 30 to 3000 MHz. There are particular frequency bands within this spectrum that are underutilized either in terms of frequency, geo-location, or time.

The literature provides related case studies. For instance, Figure 1 shows the occupancy of the 2.3–2.4 GHz frequency band over several days in the city of Turku, Finland [13]. In Finland, this band is allocated for amateur services and wireless cameras. During a 24-hour period, the percentage utilization amounts to less than 30%. Scenarios like this call for improved models that provide efficient utilization of resources.

Figure 1: Percentage of bandwidth occupancy in Turku (10–17 Oct 2013) [13]

Both commercial and government clients share the RF spectrum. These clients access the spectrum either under a license or not. Clients that acquire a license to access a specific frequency band are provided with several rights, such as the availability of RF resources at any time and location, and guaranteed protection from interference. Other frequency bands require no license, hence allowing clients to access the unlicensed band on an opportunistic basis. No protection rights are offered to these clients.

For instance, the Industrial, Scientific, and Medical (ISM) wireless frequency bands require no access license. The ISM bands comprise the 902–928 MHz, 2400–2483.5 MHz, and 5725–5875 MHz bands [21], which are internationally reserved for industrial, scientific, and medical applications. Devices operating in these bands must cause no harmful interference to licensees and are granted no interference-protection rights, thus no specific performance of such devices is guaranteed. In order for ISM devices to avoid interference to licensees, the Federal Communications Commission (FCC) set rules and regulations in Part 15 of Title 47 of the Code of Federal Regulations (47 CFR) [10]. These rules consist of monitoring and, where needed, limiting transmission parameters such as the maximum output power and spurious emissions.

Several wireless communication technologies for wireless local and personal area networks operate in the ISM band. For instance, Bluetooth (IEEE 802.15) and Wireless LAN IEEE 802.11b/g operate in the 2.4 GHz ISM band, whereas Wireless LAN IEEE 802.11a operates in the 5.8 GHz ISM band.

Static spectrum allocation policies assign frequency bands to licensees on a long-term basis covering a specific geo-location, leading to poor spectrum utilization [6]. The authors in [15] address the issue by introducing the Dynamic Shared Access (DSA) system, which allocates the RF spectrum under a licensed and unlicensed basis, as well as a combination of these. DSA suggests allocating part of the RF spectrum to secondary clients whenever a licensed frequency band is not employed. Users can opportunistically access these free portions of the spectrum, also known as white spaces. Moreover, mobile user equipment necessitates

to embed cognitive-radio capabilities in order to exploit dynamic access technologies. The term cognitive radio characterizes the ability of the CR user to explore and localize available spectrum portions, select the most appropriate ones, establish communication with other CR users to coordinate the access, and release the frequency portion if needed by a licensed user. More generally, CR defines "a radio that can change its transmitter parameters based on interaction with its environment" [11]. Figure 2 overviews the concept of a cognitive-radio system.

Figure 2: Overview of a cognitive-radio system [6]

The benefits of CRs consist of a higher utilization of the RF spectrum, increasing the responsiveness over time of current communication services, and promoting the interoperability of different communication systems in terms of transmission formats and operating frequency range. In detail, CRs incorporate several capabilities. Specifically, frequency adaptivity enables the CR user to dynamically select a more appropriate operating frequency in terms of performance. Moreover, modulation adaptivity modifies transmission characteristics such as the waveform in order to accommodate appropriate modulation. A CR can employ other capabilities, such as transmission power adaptivity and geo-location of other CR users, to increase the efficiency of accessing the RF spectrum and to reduce interference to users. Figure 3 shows the CR network architecture.

Figure 3: Cognitive and primary network architecture [6]

The CR network and the active primary network cooperate. The primary network refers to the existing network infrastructure used by technologies such as 3G and 4G. CR users may access the licensed and unlicensed spectrum via their own base station, via ad hoc CR access where CR users cooperate on both spectrum bands, and via the base station of the primary network. In the third case, the CR user employs roaming functionalities within the Medium Access Control (MAC) protocol in order to profit from the primary network infrastructure (Figure 3). The CR network foresees the use of a spectrum broker, which performs scheduling among CR users that attempt to access the same available spectrum resources [14].

This paper gives a comprehensive understanding of spectrum sharing approaches and multimedia streaming protocols. To date, a crucial challenge of multimedia delivery consists in addressing the intrinsically sensitive nature of the quality of service (QoS) to the available bandwidth. CR incorporates appealing capabilities for media content delivery; thereby, current studies in the field investigate the feasibility of embodying such capabilities into mobile devices. Achieving such interoperability would make the performance of multimedia streaming reliable and efficient.

The rest of this paper is organized as follows. Section 2 introduces novel models of dynamic allocation that are supported by the existing infrastructure. Section 3 presents several multimedia streaming protocols, whereas Section 4 illustrates multimedia streaming over cognitive radios. Finally, Section 5 concludes the paper.

2 Background

The FCC put considerable effort into gaining a comprehensive understanding of the impact of efficiently using the RF spectrum. The FCC report [3] argues that "spectrum should be managed not by fragmenting it into ever more finely divided exclusive frequency assignments, but by specifying large frequency bands that can accommodate a wide variety of compatible uses and new technologies that are more efficient with larger blocks of spectrum". It recommends sharing the underutilized federal spectrum and implementing a new architecture that would enable the first shared use of the spectrum. The major argumentation relates to the growth that the US economy would experience in terms of social benefits and the number of jobs created.

2.1 Spectrum Allocation and Sharing

Terms such as Authorized Shared Access (ASA), Licensed Shared Access (LSA), and Priority Access (PA) denote scenarios of sharing the RF spectrum among incumbent, secondary licensed, and priority-authorized users [15]. The licensed incumbent users are granted full interference-protection rights and share the spectrum band with additional licensed users, referred to as secondary, in a mutually non-interfering policy. The priority-authorized user necessitates an authorization to access a specific frequency band over a specified time interval.

The first attempt to pursue spectrum sharing comprises two tiers of users: an incumbent and a secondary licensed class of users. The model is called ASA/LSA and provides interference-protection rights to both tiers. Although this model already provides sharing capabilities, it does not include the existence of an unlicensed user that would access the spectrum in an opportunistic manner, possibly increasing the spectrum access efficiency. Clearly, another class of

64 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

users with fewer rights could be included in a shared access model. This new class of unlicensed users is entitled to access the available RF spectrum at a certain time and/or geolocation, subject to certain rules. These users are referred to as General Authorized Access (GAA) users, and are not granted interference-protection rights.

Spectrum sharing aims at increasing the efficiency of accessing the RF spectrum. Thereby, the FCC recommendation report to the President of the US proposed to open the government spectrum band to commercial users, arguing for the benefits it would have for economic development [3, 15]. The FCC names the spectrum sharing among three different classes of users the Citizens Broadband Radio Service, which requires a database that manages spectrum access. The database constantly updates the spectrum availability status, and GAA users consult it prior to accessing the RF spectrum. The government spectrum is available partly to priority users in licensed form, and partly to general authorized users who may access the spectrum on an opportunistic basis. The federal and non-federal incumbent users would have interference protection rights to address the possible interference that Citizens Broadband Radio Service users might cause. Figure 4 shows the rights and responsibilities of the three tiers as proposed by the FCC [1].

Figure 4: The three tiers' rights and responsibilities [1]

Most of the current schemes for RF spectrum access are based on licenses. This is the case for network operators: they acquire a license which entitles them to operate exclusively in a certain frequency range. For instance, mobile operators guarantee to their clients the possibility of making a phone call at a certain time and location. Consequently, the novel models of accessing the RF spectrum must guarantee to these operators the allocation of such resources, in order to fulfill the clients' demands at any time and under any circumstance. These users are classed as incumbent and are provided with interference protection rights. PA users operating within the shared access system are ensured interference protection from GAA users, while causing no harmful interference to the incumbent users. For example, mobile broadband operators could access the available spectrum under the PA model by requesting a priority access license for a specific frequency band. GAA users access the spectrum opportunistically whenever free spectrum is detected, and no interference to either incumbent or PA users is allowed; they have no interference protection rights from either incumbent or PA users. WiFi hot spot service providers are GAA users with no interference protection rights [15]. Furthermore, the performance of the service is subject to the available spectrum band, the proximity of the incumbent and PA users, and the network conditions.

Unlicensed access to the TV band White Space (TVWS) is an example of the coexistence of incumbent and GAA users [9]. Television operators transmit over frequency ranges that are specified in their license. GAA users might access these bands under certain rules and limitations. Ultra Wide Band (UWB) falls under the GAA user class. This technique enables transmission over a wide range of frequencies at low power. Users that implement this technique cause low interference, comparable to the noise floor of the licensee devices. This approach guarantees interference protection for the licensed devices.

In the following, we overview the ongoing work on implementing shared access protocols in Europe and in the US.

2.2 Standardization Efforts

The standardization of shared access techniques is an ongoing process. Sharing access to the RF spectrum implies designing a network architecture that supports heterogeneous access modes, with appropriate tools to establish and coordinate such access among different users. The Spectrum Access System (SAS) denotes such a network. The SAS is a reference point for clients attempting to access a frequency band. The FCC comprises all possible approaches to access the spectrum in the ASA, LSA, PA, and GAA models.

Nokia and Qualcomm were the first industry partners to propose the ASA concept. ASA aims at extending access to the International Mobile Telecommunications (IMT) band. The IMT band converged national mobile network bands into one, hence harmonizing the corresponding users to access a globally available frequency band. The user accesses the IMT band under a licensed form and is provided with an expected QoS.

The Radio Spectrum Policy Group (RSPG) further developed the ASA model in terms of accessible frequency bands and allowed users, leading to the definition of an LSA model: "An individual licensed regime of a limited number of licensees in a frequency band, already allocated to one or more incumbent users, for which the additional users are allowed to use the spectrum (or part of the spectrum) in accordance with sharing rules included in the rights of use of spectrum granted to the licensees, thereby allowing all the licensees to provide a certain level of QoS" [16].

The major contribution of the LSA system is expanding the access beyond the IMT bands already foreseen by the ASA model. Additional licensed users might access the band under regulatory policies that harmonize the coexistence of the initial and new users in terms of QoS and interference management. The LSA provides new users with QoS rights, hence it guarantees a predictable QoS [16]. These users are subject to the regulations of national authorities; therefore, they might or might not be granted the same rights as the initial users.

The PA model identifies an access scheme for specific users located in determined geographic areas with distinct QoS requirements for performing their activities. The PA class includes users such as hospitals, utilities, and government institutions [1].

An overview of the ongoing standardization process in the EU and the US follows.
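The three-tier access logic described above can be caricatured in a short sketch. Everything below is an invented simplification (the band names, the database layout, and the `may_transmit` helper are not part of any FCC proposal): a GAA device consults a spectrum-availability database before transmitting, a PA user relies on its priority license, and an incumbent always has access.

```python
# Invented sketch of three-tier shared access: incumbents always transmit,
# PA users check their priority licenses, and GAA users consult a
# spectrum-availability database before any opportunistic transmission.

def may_transmit(tier, band, availability_db):
    """Return True if a user of the given tier may access the band now."""
    if tier == "incumbent":
        return True  # full interference-protection rights, no lookup needed
    if tier == "priority":
        return band in availability_db.get("priority_licenses", set())
    # GAA: opportunistic access only where the database marks the band free
    return availability_db.get("free_bands", {}).get(band, False)

db = {
    "priority_licenses": {"3550-3560 MHz"},
    "free_bands": {"3560-3570 MHz": True, "3550-3560 MHz": False},
}
print(may_transmit("priority", "3550-3560 MHz", db))  # True
print(may_transmit("gaa", "3560-3570 MHz", db))       # True
print(may_transmit("gaa", "3550-3560 MHz", db))       # False
```

A real Citizens Broadband Radio Service database is, of course, far richer: it tracks geolocation, time, and emission limits rather than a flat list of free bands.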


2.2.1 EU efforts toward ASA/LSA in the 2.3 GHz band

In 2013, the RSPG revised the definition of LSA as follows: "A regulatory approach aiming to facilitate the introduction of radiocommunication systems operated by a limited number of licensees under an individual licensing regime in a frequency band already assigned or expected to be assigned to one or more incumbent users. Under the Licensed Shared Access (LSA) approach, the additional users are authorised to use the spectrum (or part of the spectrum) in accordance with sharing rules included in their rights of use of spectrum, thereby allowing all the authorized users, including incumbents, to provide a certain Quality of Service (QoS)" [17].

The 2.3-2.4 GHz band in Europe is a good example of a poorly utilized band. The World Radio Conference assigned this band to IMT in 1997; however, Europe poorly utilizes it due to the high activity of incumbent users. Consequently, Europe has mostly focused on defining the ASA/LSA model in the 2.3-2.4 GHz RF band. In [2], the European Telecommunication Standard Institute (ETSI), after starting a standardization process for ASA/LSA, defined in its report the technical and operational conditions to implement such a model in this band. The ASA/LSA model that the RSPP and ETSI defined for the 2.3-2.4 GHz band could extend to the 1.7 GHz and the 3.5 GHz bands.

In Europe, Finland was the first country to realize spectrum sharing under the ASA/LSA model, in an LTE network operating in the 2.3 GHz band in April 2013. The United Kingdom, like the rest of Europe, defined a shared access method within underutilized bands. In November 2010, the UK published the Implementing Geo-Location consultation. Ofcom contributed to the statement regarding White Space Devices (WSD), which was published in September 2011. WSD devices are allowed to access the TV White Space as long as no harmful interference is caused to the existing operators in the TVWS band. Ofcom explicitly contributed to defining a geo-location database which would monitor and control the emission levels of the WSD devices within the TVWS band, limiting or ceasing transmissions if required [16]. Trials were performed in Cambridge, aiming at supplementing services such as wireless broadband and at fulfilling the high demand for mobile data. The tested WSD devices support Wi-Fi, rural broadband, and machine-to-machine (M2M) communication services. Moreover, the UK explicitly proposed merging the UK and the European shared access methodologies into one, arguing the benefits to both citizens and consumers.

2.2.2 US 3.5 GHz shared band under ASA/LSA

The US efforts concentrate on enabling commercial users to access government bands, such as the federal radar system band. This is the case for the 3550-3650 MHz band, which is known as the 3.5 GHz Small Cells band. The FCC proposed the 3.5 GHz band as suitable for exploitation by small cells, also called femtocells. The base stations of these cells operate at low power, and are typically employed to extend wireless coverage to small indoor or outdoor areas.

The FCC report [1] overviews the 3.5 GHz band characteristics. In the US this band is allocated to the Radiolocation Service (RLS) and to the Aeronautical Radionavigation Service (ARNS) for federal use on a primary basis. The advantage of deploying small cells consists in reducing the risk of interference, which derives from the smaller coverage area of small cells compared to macrocells. Consequently, the frequency reuse would increase along with the network capacity. Moreover, small cells respond well to spectrum sharing on a geographical basis: they operate at low power and with limited signal propagation, thus enabling spectrum sharing with a predictable QoS.

The small cell concept fits well within a three-tier architecture accessing the 3.5 GHz federal band (Figure 4). The FCC framework encloses directives that would enable the implementation of a SAS scheme for the three tiers. Moreover, it suggests investigating and implementing mitigation techniques in order to reduce the interference risk factor.

3 Multimedia Streaming Protocols

Multimedia refers to a variety of content such as audio, video, animation, and text. Multimedia streaming is regulated by protocols that define the entities of such a system, and the ways of exchanging information among them in order to provide multimedia content to the users. These protocols undergo continuous conceptual variations. Current protocols aim at harmonizing a wide range of device capabilities and multimedia streaming features. End-users demand a high quality of experience (QoE) over a wide range of transmission systems, such as unicast (Video on Demand delivery) and multicast (mobile TV delivery), with QoE being a performance indicator of the service experienced by the user.

Multimedia streaming is a power-hungry application that shortens the battery life of mobile devices. The work in [12] investigates streaming techniques from a mobile device perspective. It explores the power-saving mechanisms that wireless network interfaces such as Wi-Fi, WCDMA/HSPA, and LTE employ. These technologies adopt various streaming techniques, such as: encoding rate streaming, where the streaming rate equals the encoding rate; rate throttling, where the streaming rate is higher than the encoding rate; buffer-adaptive streaming, which delivers multimedia content only when the playback buffer drains to a certain threshold; fast caching, in which the user downloads the entire video content upon first connecting to the server; and rate adaptive streaming over HTTP, where the streaming rate adapts to the available channel bandwidth. The findings in [12] claim that servers mainly adopt fast caching and throttling, while video players favor encoding rate and buffer-adaptive mechanisms; that the network operator has no role in choosing the streaming technique, because the wireless interface plays no role in the decision; that low quality videos provide a shorter joining time (initial playback delay), with Wi-Fi offering a shorter delay than HSPA or LTE; that the video quality does not significantly affect the energy consumption; and that fast caching and throttling reduce the energy consumption in an uninterrupted streaming scenario.

In the following, we detail the major multimedia streaming protocols and methodologies to cope with the bandwidth limitations.
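A minimal sketch of the buffer-adaptive technique listed above (the watermark, chunk duration, and one-second drain model are illustrative assumptions, not values measured in [12]): the client fetches a chunk only when the playback buffer falls below a threshold, letting the radio idle the rest of the time.

```python
# Illustrative buffer-adaptive download loop: fetch a 4-second chunk only when
# fewer than 10 seconds of video are buffered; otherwise let the radio idle.
# All constants are invented for the sake of the example.

def should_fetch(buffer_s, low_watermark_s=10.0):
    """Fetch the next chunk only when the buffer drains below the watermark."""
    return buffer_s < low_watermark_s

buf, trace = 12.0, []
for _ in range(6):            # one loop iteration = one second of playback
    if should_fetch(buf):
        buf += 4.0            # radio wakes up and downloads one chunk
        trace.append("fetch")
    else:
        trace.append("idle")  # radio can stay in a power-saving state
    buf -= 1.0                # playback consumes one second of buffered video
print(trace)  # ['idle', 'idle', 'idle', 'fetch', 'idle', 'idle']
```

Grouping downloads into bursts like this is precisely what lets the wireless interface exploit its power-saving states between fetches.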


3.1 Dynamic Adaptive Streaming over HTTP

The Third Generation Partnership Project (3GPP) specifies in its draft the Progressive Download over HTTP and the Dynamic Adaptive Streaming over HTTP (DASH) protocols [4].

Traditional streaming techniques employ protocols that keep track of the state of the client. For example, the client and the streaming server remain connected while the streaming server delivers the multimedia content to the client as a continuous stream of packets [20]. Alternatively, clients may download the media content via the HTTP protocol and play it from local storage; thus, it is possible to rely on existing HTTP servers and caches to store the media content. Progressive Download over HTTP enables the user to sequentially download the media content as a media file consisting of continuous media, so the user can start playing the content prior to the full download [5]. The client uses the HTTP GET or partial GET commands to request and obtain a certain media file. Progressive Download over HTTP treats the media file as a continuous stream, thus leading to poor performance. Moreover, interrupting the streaming of the media content results in resource wastage, because the progressive download has already started and the media content is stored in the HTTP servers/caches. Furthermore, Progressive Download over HTTP supports neither bit rate adaptivity nor live media services [20].

The DASH protocol overcomes the resource wastage and the poor adaptivity of the Progressive Download over HTTP protocol. Unlike its predecessors, such as the Real-Time Transport Protocol (RTP over UDP), DASH stands for a stateless server architecture, moving the control logic from the server to the client. Figure 5 shows the 3GPP-DASH architecture specified in [5]. The 3GPP-DASH client establishes an HTTP-URL connection with the server in order to provide streaming data to the user. The Media Presentation Description (MPD) provides the client with the metadata required to establish such a connection. The MPD is an XML document containing information about the multimedia content on the server and the corresponding HTTP-URLs.

Figure 5: DASH Protocol Architecture [5]

The architecture in Figure 5 also shows a 3GPP-DASH client accessing the MPD. The 3GPP-DASH client requests segments of multimedia data from an HTTP server by exploiting the MPD metadata. The delivery of the multimedia content employs the HTTP/1.1 delivery protocol.

The Progressive Download and DASH protocols support QoE reports. These reports are optional and may be triggered via the MPD. The quality metrics supported by both Progressive Download and DASH [5] are the list of HTTP request-response transactions, the average throughput, the initial playout delay, and the buffer level, while DASH supports two more metrics, namely the list of representation switch events and the MPD information. The HTTP request-response transaction metric evaluates whether the server, based on the channel conditions, fulfills the client's request to download multimedia content at a certain bit rate and quality. The average throughput estimates the channel's ability to cope with the client's demands, whereas the initial playout delay is the time interval between the request and the playout of the media content. DASH retrieves such metrics and maps them into a QoE map. 3GPP-DASH also offers on-demand, live, and time-shift services [5, 20].

3.2 Scalable Video Coding

Scalable Video Coding (SVC) is an extension of the H.264/AVC video coding standard. The H.264/AVC conceptual blocks comprise the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL), where the first produces an encoded version of the source content and the second takes over the encoded version of the source. The NAL formats the VCL data and provides it with header information, enabling it to be employed under various channel conditions.

H.264/AVC organizes a picture into coding units: macroblocks and slices. Each picture consists of small rectangular blocks called macroblocks, which are grouped into slices. These slices host either spatial or temporal prediction information to be employed by the macroblocks. H.264/AVC supports three slice coding types: the I-slice (intra-picture), with spatial prediction from adjacent slices; the P-slice, with one-directional prediction from intra- and inter-pictures, where inter-pictures refer to adjacent pictures used for encoding; and the B-slice, with bidirectional prediction. The H.264/AVC capabilities come at the cost of a high decoder complexity [19]. Therefore, SVC addresses the decoding complexity of this standard and increases the scalability of H.264/AVC.

The key feature of SVC consists in dividing the source bit stream into portions. Instead of encoding and transmitting the entire bit stream at a specific rate, these portions are encoded at several temporal or spatial resolutions, enabling users to request the most appropriate encoded version [19]. The rate of the individual partitions is lower than that of the source bit stream; however, their reconstruction leads to a higher rate than the original bit stream exhibits. SVC encodes the video stream into a Base Layer (BL) and Enhancement Layers (EL). These layers can be stored independently, and the encoded versions of the ELs are made available for download at a later moment, whenever the channel throughput allows. By dividing the original bit stream into portions, SVC increases the degree of intervention and adaptation that can be applied, by leveraging temporal, spatial, and quality scalability.

• Spatial scalability. The picture size of the substreams varies. Each substream/layer supports one spatial resolution. Each spatial layer employs intra and motion-compensated prediction.

• Temporal scalability. The frame rate of the substreams varies. The substream consists of one temporal base layer and one or more ELs, where the ELs are obtained by encoding B-slices. Figure 6 shows ways of enabling temporal scalability. In Figure 6(a), there are two reference pictures, denoted as 0 and 1, corresponding to temporally preceding and succeeding pictures. Tk represents a set of temporal layers. The temporal layer of the two reference pictures is T0, which is lower than the temporal layer identifier of the predicted pictures. Therefore, a temporal layer can be encoded independently of the temporal layers whose layer identifier is greater than k. The EL pictures are obtained by encoding B-pictures in a hierarchical prediction structure.

Figure 6: Temporal scalability: coding with hierarchical B-pictures [19]

• Quality scalability. The substreams provide the same spatio-temporal scalability as the original one at a lower fidelity, intended as the Signal-to-Noise Ratio (SNR) of the bit streams.

The term scalability represents the capability of partitioning a bit stream and performing several operations on these partitions. The data contained in each of the partitions is less than the theoretical proportion; however, each partition inherits the crucial information needed to ensure that the partitions can be merged back together to form a new stream. This stream has a lower quality than the original one; relative to the reduced amount of data it contains, however, the quality of the substreams is comparatively high.

SVC encodes only one bit stream, at the highest picture size and bit rate, and then applies scalability in order to obtain numerous versions of the bit stream with different resolutions or bit rates, hence enabling adaptation to the end-user device and channel capabilities.

3.3 Improved Dynamic Adaptive Streaming over HTTP (iDASH)

DASH provides clients with the most appropriate video chunk for a specific end-user device capability and network conditions, whereas SVC encodes the multimedia content into a BL and ELs, enabling users to improve the QoE by downloading ELs whenever the channel conditions allow it. The authors in [18] merged the concept of SVC into the DASH technique. SVC adds one degree of adaptation to the DASH technique by structuring the media content into two layers, one corresponding to the most appropriate chunk in terms of bit rate at a specific time, and the other corresponding to the most appropriate representation of such a chunk in terms of quality for a specific channel condition. Therefore, the client prioritizes between multiple representations of the same data chunk. The iDASH client sends, within a time interval, several HTTP GET requests that correspond to the base and enhancement layers. However, if the channel conditions allow no ELs to be downloaded, the corresponding requests are omitted. Therefore, iDASH achieves higher responsiveness and quality adaptation in scenarios of fast variation of the channel conditions over time.

Figure 7: iDASH: SVC encoding of temporal and quality layers [18]

iDASH offers multiple Operation Points (OP) within a bit stream. An OP defines an SVC sub-stream at a specific bit rate and quality level. SVC allows for on-the-fly adaptation, adapting the number of layers of the sub-stream to the channel conditions. Figure 7 shows the concept of the OPs. A sub-stream consists of a base layer (bottom) and two enhancement layers, Q1 (center) and Q2 (top). In order to obtain several OPs, starting from the higher temporal levels (Tn), the higher quality layers are dropped, in turns Q2 and then Q1. Consequently, the bit rate decreases, allowing the user to smoothly adapt to the channel conditions.

The iDASH technique outperforms plain DASH in terms of responsiveness over time to the bit rate and congestion control. iDASH achieves the latter by gradually downgrading the requests for higher ELs, adapting the bit rate of the sub-stream to the ongoing congestion rate.

The authors in [7] investigate ways of selecting a proper quality of multimedia content in a DASH with SVC scenario, under both constant and variable download rates. DASH splits the media content into several quality profiles, whereas SVC splits the raw media content into subsets of the bit stream. These bit streams can be further split into segments. Merging DASH with SVC allows the user to download segments before the playback deadline and increase their quality.
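The segment-level choice that such a merge enables can be sketched as follows. The decision rule and the `target_bl` threshold are invented for illustration and are not the policy evaluated in [7]: the client first secures base layers for upcoming segments, then spends spare bandwidth raising the quality of segments it already holds.

```python
# Invented DASH-with-SVC download rule: secure base layers (BL) for a few
# future segments first, then backfill enhancement layers (EL) of segments
# that are already buffered. `target_bl` is an arbitrary illustrative value.

def next_download(buffered_bl, target_bl=4):
    """Choose what to request next given how many base layers are buffered."""
    if buffered_bl < target_bl:
        return ("prefetch", "BL")  # base layer of the next future segment
    return ("backfill", "EL")      # enhancement layer of a buffered segment

print(next_download(2))  # ('prefetch', 'BL')
print(next_download(4))  # ('backfill', 'EL')
```

With SVC, a wrong guess is cheap: a backfilled EL can simply be discarded if its segment's playback deadline passes, while the BL guarantees uninterrupted playback.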

Figure 9: Spectrum sharing among three tiers of users [22]
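As a toy illustration of the tiered white-space sharing in Figure 9, the sketch below maps the distance from an unlicensed Tier-3 node to the nearest licensed user onto a transmit power; the thresholds and power levels are invented numbers, not taken from [22].

```python
# Invented power-adaptation rule for an unlicensed (Tier-3) node: the closer
# the nearest licensed user, the lower the transmit power. Distances are in
# meters and powers in dBm; all values are purely illustrative.

def t3_tx_power_dbm(distance_to_licensed_m):
    """Map the distance to the nearest licensed user onto a transmit power."""
    if distance_to_licensed_m < 100:
        return 0   # licensed user nearby: shrink the coverage area
    if distance_to_licensed_m < 500:
        return 10
    return 20      # no licensed user in the vicinity: full power

print(t3_tx_power_dbm(50), t3_tx_power_dbm(1000))  # 0 20
```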

Figure 8: Decision policy for a client in a DASH with SVC scenario [7]

Figure 8 shows how the user chooses either to increase the quality of the current segment by downloading vertically, or to download horizontally for future segments. The first decision denotes a backfilling approach, whereas the second denotes prefetching. The user adapts its downloading policy to a diagonal one, which is influenced by the variations of the download rate. [7] argues that the slope becomes flatter under low, highly variable, and highly persistent download rates. Predicting the available bandwidth would allow users to modify the slope of the decision policy accordingly, thus experiencing a flatter and higher QoE.

4 Multimedia Streaming over CRs

So far we have introduced cognitive radios and multimedia streaming independently. We now present multimedia streaming specifically targeted to cognitive radios. Multimedia streaming suffers from the scarce availability of bandwidth, thereby exposing the user to a highly sensitive QoE. Multimedia delivery over CR networks consists of sensing a spectrum band and opportunistically accessing the detected free spectrum, thereby increasing the available resources. Consequently, users experience a more stable QoE over time.

4.1 SVC for 3-Tiered spectrum sharing

The multimedia streaming QoE is subject to the bandwidth allocated to deliver the service. As mentioned in Section 1, the RF spectrum is a common resource for several entities with different spectrum rights. Figure 9 shows a three-tier model sharing the white space to deliver adaptive video streaming [22].

The three tiers are provided with different rights. Tier-1 (T1) and Tier-2 (T2) are licensed users of the spectrum band, and are thereby guaranteed interference protection rights and available bandwidth for access. The unlicensed Tier-3 (T3) opportunistically accesses the white space on an access-by-rules approach. T3 users are lower in priority than T1 and T2, hence they provide adaptation in terms of transmission power to avoid harmful interference to the licensees. Such adaptation shapes the coverage area of the T3 network: the closer the licensed users, the lower the power transmitted by the T3 network. Embedding the T2 mobility pattern information into the T3 adaptation mechanism leads T3 users to respond accordingly by decreasing the transmission power, thus reducing the overlap area of the T2 and T3 Access Points (APs), as shown in Figure 9.

As mentioned in Section 3.1, DASH consists of progressively requesting portions of the video stream (segments) at various resolutions; it thus adapts to the channel conditions by following the average throughput, which is estimated from previous observations. DASH adaptation performs satisfactorily when channel conditions are stable over time; however, it performs poorly under fast-varying channel throughput. Partial information about the mobility of T2 users would enable the estimation of the channel throughput, though DASH lacks mechanisms to exploit this information and perform such a prior evaluation. In contrast, SVC already provides on-the-fly adaptation of the bit rate and QoE. Embedding information about the mobility pattern into SVC would increase the adaptation degree, allowing the user to smoothly downgrade the bit rate and reduce QoE fluctuations. The SVC client chooses the most appropriate encoded version of the bit stream for download. The EL download adapts to the channel throughput estimate. Furthermore, T3 users adapt their transmission power accordingly.

Figure 10: SVC within a whitespace scenario [22]

The system shown in Figure 10 merges the concept of T2 mobility prediction in a white space environment into the

69 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

SVC protocol [22]. It consists of three basic components: the video server, the channel access, and the video player. The video server stores the SVC video formats along with the corresponding MPD files. On the client side, the video player implements and runs an algorithm to request the appropriate video segment in terms of its resolution. The algorithm evaluates the T3 user throughput based on the measured SNR of the T2 signal. Consequently, the video player sends a request to the video server to download a specific version of a segment via the tiered access channel. The decision is based on the T2 SNR and on the prior and predicted white space channel throughput. The white space channel history is updated each time a segment request is sent, and the collected information is fed back into the algorithm in a loop.

Figure 11: Video quality as a function of the time the T2 user is ON [22]

Figure 11 shows the video quality as a function of the time the T2 user is ON. The video quality is expressed by the Structural SIMilarity (SSIM) index, which measures the similarity between two images, where one of the images provides the perfect quality. The SSIM takes decimal values in the -1 to +1 range, where +1 represents the case of two identical images. The SSIM is then normalized to the average SSIM of the experiment video data. The authors in [22] compare the performance of four algorithms: the optimal algorithm assumes that the throughput of the white space channel is known throughout the entire video duration; the DASH algorithm adapts to the current throughput variations, although it has no means of exploiting predicted throughput variations; DASH with T2 info constitutes a modified version of conventional DASH that embeds information regarding the ON/OFF state of T2 over the next τ seconds of the video data; and the MDP with SVC algorithm employs a two-state Markov chain to represent the presence of the T2 user, either ON or OFF. The duration of the T2 user in either state is modeled by a geometric distribution. Based on a Markov Decision Process (MDP), the algorithm then selects the most appropriate video segment and quality level to request for download.

Figure 11 shows that the video quality decreases considerably as the time period during which T2 is ON increases. The performance of the DASH and DASH with T2 info algorithms degrades faster than that of MDP with SVC, which closely follows the optimal one. In the worst case of the T2 user being in the ON state for the entire duration of the video data, MDP with SVC achieves a higher SSIM by downloading enhancement layers whenever possible. Figure 12, however, shows that the video quality is less reactive to variations in the number of T2 users in the ON state. The gap between the optimal and the MDP with SVC performance remains constant, and is relatively small compared to the gap between the optimal and the DASH algorithm.

Figure 12: Video quality as a function of the density of T2 users in ON state [22]

5 Conclusion

Multimedia streaming over CRs consists of sensing and accessing free portions of the RF spectrum to increase the available bandwidth for multimedia delivery. Implementing CR capabilities into mobile devices would enable current streaming techniques to leverage features such as RF utilization prediction and feedback data regarding the history of the channel conditions. Merging them would lead to higher responsiveness and adaptation of the multimedia streaming quality over time and RF utilization.

In order to exploit the white space, end-user devices must provide real-time signal processing capabilities that would expose these devices to high energy consumption, thus limiting their battery life. Current research in the field investigates the feasibility of transparently embedding CR capabilities into mobile devices. Research aims at enabling such capabilities with no significant impact on the battery life of mobile devices, while increasing the QoE of multimedia streaming.

References

[1] Amendment of the Commission's rules with regard to commercial operations in the 3550-3650 MHz band. http://apps.fcc.gov/ecfs/document/view?id=7022080889, December 2012.

[2] ETSI reconfigurable radio systems workshop. http://www.etsi.org/news-events/news/410-reconfigurable-radio-systems-workshop-12-december-2012-sophia-antipolis-france, December 2012.

70 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

[3] Report to the president: realizing the full potential of government-held spectrum to spur economic growth. http://www.whitehouse.gov/sites/default/files/microsites/ostp/pcast_spectrum_report_final_july_20_2012.pdf, July 2012.

[4] 3GPP. Transparent end-to-end Packet-switched Streaming Service (PSS); Progressive Download and Dynamic Adaptive Streaming over HTTP (3GP-DASH). http://www.3gpp.org/DynaReport/26247.htm.

[5] 3GPP. 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Transparent end-to-end Packet-switched Streaming Service (PSS); Progressive Download and Dynamic Adaptive Streaming over HTTP (3GP-DASH) (Release 10). http://www.3gpp.org/DynaReport/26247.htm, December 2014.

[6] I. F. Akyildiz, W.-Y. Lee, M. C. Vuran, and S. Mohanty. A survey on spectrum management in cognitive radio networks. Communications Magazine, IEEE, 46(4):40–48, 2008.

[7] T. Andelin, V. Chetty, D. Harbaugh, S. Warnick, and D. Zappala. Quality selection for dynamic adaptive streaming over HTTP with scalable video coding. In Proceedings of the 3rd Multimedia Systems Conference, pages 149–154. ACM, 2012.

[8] T. Cisco. Cisco visual networking index: global mobile data traffic forecast update, 2012–2017. Cisco Public Information, 2013.

[9] Coops and Martin. First report and order, in the matter of revision of part 15 of the commission's rules regarding ultra-wideband transmission systems. Technical report, Federal Communications Commission, April 2002. https://apps.fcc.gov/edocs_public/attachmatch/FCC-02-48A1.pdf.

[10] FCC. Rules and Regulations for Title 47. http://www.fcc.gov/encyclopedia/rules-regulations-title-47.

[11] FCC. Notice of proposed rule making and order. https://apps.fcc.gov/edocs_public/attachmatch/FCC-03-322A1.pdf, December 2003.

[12] M. A. Hoque, M. Siekkinen, J. K. Nurminen, M. Aalto, and S. Tarkoma. Mobile multimedia streaming techniques: QoE and energy saving perspective. Pervasive and Mobile Computing, 16:96–114, 2015.

[13] M. Hoyhtya, M. Matinmikko, X. Chen, J. Hallio, J. Auranen, R. Ekman, J. Roning, J. Engelberg, J. Kalliovaaras, T. Taher, et al. Measurements and analysis of spectrum occupancy in the 2.3–2.4 GHz band in Finland and Chicago. In Cognitive Radio Oriented Wireless Networks and Communications (CROWNCOM), 2014 9th International Conference on, pages 95–101. IEEE, 2014.

[14] O. Ileri, D. Samardzija, and N. B. Mandayam. Demand responsive pricing and competitive spectrum allocation via a spectrum server. In New Frontiers in Dynamic Spectrum Access Networks, 2005. DySPAN 2005. 2005 First IEEE International Symposium on, pages 194–202. IEEE, 2005.

[15] W. Lehr. Toward more efficient spectrum management, new models for protected shared access. Technical report, MIT Communications Futures Program. http://cfp.mit.edu/publications/CFP_Papers/CFP%20Spectrum%20Sharing%20Paper%202014.pdf.

[16] Radio Spectrum Policy Group. Report on Collective Use of Spectrum (CUS) and other spectrum sharing approaches. http://rspg-spectrum.eu/_documents/documents/meeting/rspg26/rspg11_392_report_CUS_other_approaches_final.pdf, November 2011.

[17] Radio Spectrum Policy Group. RSPG Opinion on Licensed Shared Access. https://circabc.europa.eu, November 2013.

[18] Y. Sánchez de la Fuente, T. Schierl, C. Hellge, T. Wiegand, D. Hong, D. De Vleeschauwer, W. Van Leekwijck, and Y. Le Louédec. iDASH: improved dynamic adaptive streaming over HTTP using scalable video coding. In Proceedings of the second annual ACM conference on Multimedia systems, pages 257–264. ACM, 2011.

[19] H. Schwarz, D. Marpe, and T. Wiegand. Overview of the scalable video coding extension of the H.264/AVC standard. Circuits and Systems for Video Technology, IEEE Transactions on, 17(9):1103–1120, 2007.

[20] T. Stockhammer. Dynamic adaptive streaming over HTTP: standards and design principles. In Proceedings of the second annual ACM conference on Multimedia systems, pages 133–144. ACM, 2011.

[21] U.S. Government Publishing Office. Operation within the bands 902-928 MHz, 2400-2483.5 MHz, and 5725-5850 MHz. http://www.ecfr.gov/cgi-bin/text-idx?SID=dde1f198f15744f94c43c74c703b773f&node=se47.1.15_1247&rgn=div8.

[22] X. Wang, J. Chen, A. Dutta, and M. Chiang. Adaptive Video Streaming over Whitespace: SVC for 3-Tiered Spectrum Sharing. In IEEE INFOCOM, 2015.

IPv6 over networks of resource-constrained nodes

Lauri Luotola Student number: 84703B [email protected]

Abstract

This paper describes the key technologies and challenges involved in transmitting IPv6 packets over Low-Power Wireless Personal Area Networks (LoWPANs). LoWPANs are in use especially in sensor networks that are being used for applications such as environmental monitoring and home automation. Bringing IPv6 to these networks is essential, as the number of devices can be expected to grow exponentially. Various implementations of LoWPANs exist, of which ZigBee, NFC and Bluetooth Low Energy are examined and compared. The majority of these technologies are still being actively developed, and not all of them are direct competitors; instead, they have their own use cases.

KEYWORDS: IoT, 6lo, IPv6, 6LoWPAN, BLE, ZigBee, NFC, IEEE 802.15.4

1 Introduction

The Internet of Things (IoT) is about extending the scope of the Internet to cover physical objects. Wireless nodes are expected to outnumber the conventional computer nodes that we think of as the Internet of today. These wireless devices, often referred to as sensors, are typically strictly resource constrained and must operate unattended for long periods of time; hence it is essential for them to transmit their data to the Internet efficiently, with low power consumption. IoT can be applied to numerous fields and offers wide-ranging possibilities; applications have been deployed in medicine, agriculture, environmental matters, military, toys and many others. Typical applications include devices such as health monitors, environmental sensing and proximity sensors. [10] These devices may be deployed in the magnitude of thousands, thus cost savings quickly add up and they are aimed to be produced at low cost.

The IP (Internet Protocol) architecture was arguably not designed for resource-constrained devices, having its origins in the 1970s for connecting general purpose computers using wired networking technologies such as Ethernet. IPv4, a version of the Internet Protocol, has been widely and very successfully deployed on hundreds of millions of hosts and routers in private and public networks alike. Considering it was initially designed in 1982, this growth rate is remarkable. The Internet Engineering Task Force (IETF), responsible for the standardization efforts of the IP protocol suite, identified the need for a new revision to address the problems brought by the huge success of the protocol. This led to the specification of IPv6 in 1998. [10] IPv6 is an evolution of IPv4 with no change in the fundamental and architectural principles of the IP protocol suite. The reason IPv4 is still prevalent is mainly the cost and complexity of migration, which has made the adoption rate rather slow.

A Low-power Wireless Personal Area Network (LoWPAN) is a simple low-cost communication network consisting of devices conforming to the IEEE 802.15.4 standard. They are typically used to connect resource-constrained devices such as wireless sensors. Characteristics of LoWPANs include small packet size, low bandwidth, and the low cost and low reliability of the connected devices. The devices often tend to sleep for longer periods of time in order to save energy, and the connection may be unreliable. [5, 10]

There is an IETF working group that aims to adapt IPv6 for IoT devices, namely 6LoWPAN, which also refers to the effort itself [6]. Prior to 6LoWPAN, many vendors embraced proprietary protocols because the IP architecture was thought to be too resource-intensive to be operated on such devices [10]. A typical smart object has very limited memory, not enough to operate the existing implementations of the IP protocol family; for this reason a number of non-IP stacks were developed [10]. However, the field has since matured, and 6LoWPAN brings significant improvements over the older protocols, introducing an adaptation layer that enables efficient IPv6 communication over various links, such as IEEE 802.15.4 or Bluetooth Low Energy. [5, 7] This paper discusses these different applications and their differences.

2 The advantages of IPv6

IPv6 aims to overcome the limitations of IPv4 and eventually replace it, thus enabling the Internet to scale further. One of the most important reasons for this revision is to overcome the problem of the exhausting address space. IPv6 expands the IP address space from 32 to 128 bits and, as a result, increases the required maximum transmission unit (MTU) from 576 to 1280 bytes compared to IPv4. Even though the benefits of IPv6 are clear, the majority of Internet traffic is still handled by IPv4.

Considering the potential for exponential growth in the number of Internet of Things applications, and connected devices in general, IPv6 is an ideal protocol due to the large address space it provides. In addition, IPv6 provides tools for stateless address autoconfiguration; the limited processing power and large number of devices in a LoWPAN network make automatic network configuration and statelessness highly desirable.
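The stateless address autoconfiguration mentioned above can be made concrete: with the modified EUI-64 scheme of RFC 4291, a node derives its IPv6 interface identifier, and hence a link-local address, directly from its 48-bit link-layer address, without any configuration server. The following Python sketch illustrates the derivation; the MAC address used is a made-up example.

```python
import ipaddress

def mac_to_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive a link-local IPv6 address from a 48-bit MAC address
    using the modified EUI-64 scheme: flip the universal/local bit,
    insert 0xFFFE in the middle, and prepend the fe80::/64 prefix."""
    octets = bytearray(int(x, 16) for x in mac.split(":"))
    octets[0] ^= 0x02                                # invert the U/L bit
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + iid)

print(mac_to_link_local("00:17:0d:30:47:19"))  # fe80::217:dff:fe30:4719
```

Because the identifier is computed locally and is unique whenever the MAC address is, a sensor can number itself with no DHCP-like state kept anywhere in the network, which is exactly the statelessness argued for above.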


2.1 IPv6 over Low-power Wireless Personal Area Networks

Mapping the IPv6 network to the IEEE 802.15.4 network presents several design challenges: IEEE 802.15.4 devices are characterized by short range, low bit rate, low power, and low cost. IP protocols generally assume that the link is always on and constantly listening for packets. However, this behaviour of idle listening is not suitable in the context of the low-power requirement for IoT devices. Also, many of the devices employing IEEE 802.15.4 radios will have limited computational power, memory, and energy availability. IEEE 802.15.4 networks are usually ad-hoc networks, since their location is usually not predetermined. The connection is usually unreliable, especially when compared to wired links such as Ethernet. IEEE 802.15.4 has a maximum data rate of 250,000 bits/s and a maximum output power of 1 mW. The devices have a nominal range on the order of a few tens of meters [10].

Devices within LoWPANs are expected to be deployed in large numbers and to have limited capabilities. The large number of deployed devices in the network requires scalable technologies, raising the need for a large address space, which is well met by IPv6. Due to the devices often being used as sensors, their location may be hard to reach. This poses the need for automatic configuration and management; thus implementations of LoWPAN should preferably work with minimal configuration and be able to automatically adjust to network problems. IPv6 provides tools for stateless address autoconfiguration, which is particularly suitable for sensor network applications and nodes with limited processing power.

The common underlying goal is also to reduce processing requirements, bandwidth and power consumption; therefore packet overhead should be minimized. Given the limited packet size, headers for IPv6 and the layers above must be compressed whenever possible.

Another challenge IPv6 faces is the required MTU of 1280 bytes: low-power radio links typically do not support such a large payload, as an IEEE 802.15.4 frame only supports a payload of 127 bytes. To overcome this, there has to be an adaptation layer to allow the transmission of IPv6 datagrams. The MTU size of IEEE 802.15.4 is purposely small to cope with limited buffering capabilities and to limit the packet error rate, since the bit error rate can be relatively high. Fragmentation of IEEE 802.15.4 frames may be required at the 6LoWPAN adaptation layer when the IPv6 payload exceeds the MTU. [10]

Figure 1: Star topology is the only possibility if the sensors never have the radio on for listening. Nodes only communicate with a central router.

2.2 Network topologies

IPv6 over LoWPANs has been implemented using various technologies that utilize different topologies for node-to-node communication. The IEEE 802.15.4 standard [6] defines two types of devices: full function devices (FFD) and reduced function devices (RFD). FFDs can act as network coordinators and function in any topology, whereas RFDs are limited to star topology, communicate only with the network coordinators and can be applied to resource-constrained devices. In most cases, the RFDs are battery powered; therefore, in order to prolong the lifetime of the network, it is essential to minimize the power consumption in the communication between nodes without compromising network connectivity. This has implications on the routing protocols, because the shortest path may not always be the most energy efficient.

In a star topology, every node is connected to a central router, forming a graph with the topology of a star, as can be seen from Figure 1. Star topologies include provisioning only a subset of devices with packet forwarding functionality. If these devices use various kinds of network interfaces, the goal is to seamlessly integrate the networks built over those different technologies. Star networks are the only types of networks possible if the devices never have the radio on to listen for transmissions from neighbors. [10] Star topology is simple and useful, but constrains the range of the network to that of the physical transmission range of the transceivers.

To allow the network range to extend beyond this, individual nodes must be able to receive transmissions from each other. Figure 2 is an example of a mesh network topology. A mesh network is a network topology in which each node relays data for the network. All nodes cooperate in the distribution of data in the network. Mesh topologies imply multi-hop routing to a desired destination. Intermediate devices act as packet forwarders at the link layer, similarly to routers at the network layer. Mesh topology also provides added reliability, as it can construct redundant paths through the network: if a node goes down, the network traffic can be rerouted through other nodes. To be able to form mesh networks, the radio transceivers of the individual nodes have to be turned on periodically to listen for neighbours' communication. [10]

A tree topology connects multiple star networks to other star networks. If the connection between each of the star topology networks fails, those networks would be unable to communicate with each other. However, computers on the


same star topology would still be able to communicate with each other.

Figure 2: In a mesh topology, all the nodes can talk with each other, allowing the network to dynamically extend and increase redundancy.

3 Implementations of low-power networking

In the late 1990s, there was a strong movement toward defining a new network architecture, designed to provide a standards-based protocol for interoperability of sensor networks, which was named ZigBee [10]. The design focused on control applications, such as home automation, over a low-power wireless communication medium. ZigBee initially defined its own networking stack that was incompatible with existing network standards such as IP; however, it later moved towards adopting IP as its communication mechanism. The current version of ZigBee is a networking layer built on top of the IEEE 802.15.4 standard, which defines the physical and MAC layers. The ZigBee standard has existed since 2004, and it was implemented with different profiles for each operating environment. [1]

That saved manufacturers from having to implement a broad set of capabilities, but it also means there are separate networks for different uses of ZigBee in many cases. ZigBee identifies three node types: end device, router and coordinator. A coordinator is an FFD that manages the whole network, routers have routing capabilities, and the end devices act as peers that only transmit data. ZigBee supports star, tree and mesh topologies. A benefit of ZigBee is that its nodes can remain in sleep mode most of the time, thus extending battery life. [9, 1] Figure 3 shows an example of a ZigBee network.

Figure 3: An example of a ZigBee network. The black node is the coordinator, the gray ones are routers and the white ones are end devices.

Z-Wave is an alliance that developed its own patented low-power RF technology for home automation. The technology is not IP-based and thus specifies its own network stack from the physical layer to the application layer. The main applications of Z-Wave include home automation such as garage doors and alarm systems, among other things. The application layers have been tailored to suit those needs; hence the technology is quite specific to that market segment. Connecting Z-Wave devices to the Internet has to be done through a protocol translation gateway. [10]

6LoWPAN [6] is a competing standard to ZigBee that has the added benefit of interoperability with other IP-based systems. 6LoWPAN introduces an adaptation layer between the link and network layers, allowing IPv6 packets to be transmitted over IEEE 802.15.4 based networks. Moreover, it defines a header encoding to support fragmentation and compression in case the datagrams do not fit within a single frame. [3]

3.1 Bluetooth Low Energy

The standard Bluetooth radio has been widely implemented and is available in mobile phones, laptop computers, audio headsets and many other devices today. Bluetooth Low Energy (BLE) is a low energy variant of the standard that enables the use of the interface with resource-constrained devices such as sensors [4]. The low power variant of Bluetooth was introduced in revision 4.0 of the Bluetooth specifications, was enhanced in Bluetooth 4.1, and has been developed even further in successive versions. IPv6 over Bluetooth LE depends on Bluetooth 4.1 or newer. BLE is designed for transferring small amounts of data infrequently at modest data rates at a very low cost per bit, which makes it attractive especially for Internet of Things applications. Bluetooth LE has a possible range of over 100 meters.

Every Bluetooth LE device is identified by a 48-bit device address. The link layer uses a star topology, consisting of a central router and peripheral nodes. The router can connect to multiple peripherals, and it is assumed to be less constrained than the peripherals. Direct communication only takes place

between a central and a peripheral; thus there is no direct communication between peripherals. In a primary deployment scenario, the central router will act both as a 6LoWPAN border router and a 6LoWPAN node.

Bluetooth LE nodes have to find each other and establish a link-layer connection before any IP-layer communication can take place. The Bluetooth LE technology sets strict requirements for low power consumption and thus limits the allowed protocol overhead. The 6LoWPAN standards provide useful functionality for reducing overhead which can be applied to Bluetooth LE. This functionality comprises link-local IPv6 addresses and stateless IPv6 address autoconfiguration, neighbor discovery and header compression.

A significant difference between IEEE 802.15.4 and Bluetooth LE is that the former supports both star and mesh topology and requires a routing protocol, whereas Bluetooth LE does not currently support the formation of multi-hop networks at the link layer.

The Bluetooth Special Interest Group has introduced two trademarks: Bluetooth Smart refers to single-mode devices that only support Bluetooth LE, whereas Bluetooth Smart Ready devices support both Bluetooth and Bluetooth LE. [4]

3.2 Near Field Communication

Near Field Communication (NFC) is a set of standards for portable devices to establish radio communication with each other by bringing them into close proximity. NFC typically requires a distance of 10 cm or less between the devices. Devices using NFC may be active or passive: passive ones are only able to send information, while active devices can send and receive data. The NFC technology has been widely implemented and is available especially in mobile phones and laptop computers. NFC always involves an initiator and a target; the communication is node-to-node only. The initiator generates a radio frequency field that can power a passive target, which enables passive NFC targets to be applied on very small form factors such as stickers or key cards that do not require any batteries. NFC also supports bidirectional communication, provided that both peers are active.

One of the differences between IEEE 802.15.4 and NFC is that the former supports both star and mesh topology, whereas NFC can only support direct peer-to-peer connections and a simple mesh-like topology because of the very short transmission distance. Due to this characteristic, 6LoWPAN functionality, such as addressing, auto-configuration and header compression, is specialized for NFC. [2]

4 Comparison

4.1 Energy efficiency

Energy in sensor networks is provided either by a battery or by scavenging energy from the environment, such as solar power: in either case, it is a constrained resource. For radio-equipped sensors, the radio transceiver is the most power-consuming component, as Figure 4 shows. It also shows that there is little difference in consumption between receiving and transmitting mode operation. Therefore it is undesirable to have the radio always on due to power constraints, and consequently different approaches to radio resource management have been investigated.

Figure 4: Power consumption of the microcontroller and radio transceiver on a Tmote Sky prototyping board. [10]

The IEEE 802.15.4 standard supports many features that result in significant power savings. However, achieving a desired data rate and maximizing the lifetime of individual sensors are often conflicting goals. One of the most important energy efficiency features is the possibility of turning the transceivers off most of the time and activating them only when required; idle listening is a major source of energy waste. Depending on the device, it can be in various sleep mode states that have a different impact on the energy consumption and the time it takes for the device to wake up [10]. For applications with timing constraints, timely delivery may be more crucial than energy saving.

The energy efficiency of the ZigBee standard is mainly at the physical and MAC layers. ZigBee supports two operating modes: one that is very effective but limited in scope to star topology, and another which basically tries to reduce power consumption by using very low duty cycles. ZigBee's low power consumption limits transmission distances to 10-100 meters line-of-sight, depending on power output and environmental characteristics. [9, 1]

4.2 Security

Smart objects typically have slightly different security and threat models than general purpose computing systems due to their various applications. Security is important for smart objects because they are often deployed in important infrastructures such as the electrical power grid. They are often deployed in places that make them vulnerable to intrusion attempts and in places where security breaches cannot be tolerated. Sensors used in home automation can lead to intrusion attempts. It has also been shown that remote reprogramming of pacemakers has been possible, making security deficiencies possibly even lethal. [10] A widely accepted model for determining security consists of confidentiality, integrity and availability.

Confidentiality is perhaps the most evident notion of security: data is confidential if and only if the right parties

can access it. Confidentiality is not easy to ensure for sensor networks, as data is transmitted mainly via wireless channels. Authentication in low-power networks is challenging, as the system is non-centralized and there is no central server to verify identities [10]. Security models for low-power devices in general require a way to securely transmit keys to the sensors. As a solution to this, a model called the resurrecting duckling model [8] has been developed, where devices are imprinted to match their mothers, which they then blindly follow.

Integrity is kept if the data cannot be tampered with before reaching its recipient. Even though integrity and confidentiality are related to each other, they are different concepts. Availability makes sure that the data is available to the right parties at the time it is needed. If availability is breached, the system is said to be suffering from a Denial of Service (DoS) attack. For low-power devices, availability can be breached by so-called sleep deprivation attacks: an attacker may spoof the device into keeping its radio on and thus depleting its battery. [10]

Security is often confused with encryption. Even though encryption is an important part of security, strong encryption alone is not a security model: most breaches happen due to other problems than cryptographic failures [10]. Applications of LoWPAN often require confidentiality and integrity protection, which present their own challenges. For example, the microcontrollers used in low-power sensors may not be able to execute asymmetric decryption operations within a reasonable time; hence the encryption algorithms have to be computationally efficient. IEEE 802.15.4 mandates link-layer security based on AES, but omits high-level details such as key management. Both ZigBee and 6LoWPAN use 128-bit AES encryption.

ZigBee includes methods for key establishment and exchange, frame protection and device management. ZigBee's AES encryption is possible at the network or device level. Network level encryption is achieved by using a common network key; device level encryption by using unique link keys between pairs of devices. End-to-end security is provided so that the keys are shared only between the source and destination nodes; routing can be applied independently of security considerations. [9, 1]

The transmission of IPv6 over Bluetooth LE has similar requirements for security as IEEE 802.15.4. The Bluetooth LE link layer supports encryption and authentication by using the Counter with CBC-MAC mechanism [11] and the 128-bit AES block cipher. Upper layer security mechanisms may exploit this functionality when it is available. Key management in Bluetooth LE is provided by the Security Manager Protocol. [4]

NFC's short communication distance can be considered a security advantage, as it makes third-party eavesdropping difficult: any attempt to hack into the RF between the NFC devices must happen within the 10 cm operating radius [2].

4.3 Interoperability

Interoperability is the ability of systems from different vendors to operate together. It is one of the leading factors when choosing a wireless protocol; applications should not need to know the constraints of the physical links that carry their packets. Especially for sensor networks it is essential, as the devices emerge at a large scale. At the physical layer, sensors must agree on matters such as the frequencies of transmission, the type of modulation and the data rate. At the network level, nodes must agree on the data format and the addressing of nodes, as well as how the network topology functions.

ZigBee builds on the 802.15.4 standard and defines new upper layers on top of it. This means ZigBee devices can interoperate with other ZigBee devices, assuming they utilize the same profile. [9, 1] 6LoWPAN offers interoperability with other wireless 802.15.4 devices as well as with devices on any other IP network link with a simple bridge device. Bridging between ZigBee and non-ZigBee networks requires a more complex application layer gateway.

5 Conclusion

The Internet of Things is still emerging, with applications ranging from home automation to medical and even military applications, making the field extremely broad and the number of use cases and operating conditions wide-ranging. Before the consensus on adopting IP for low-power devices became a reality, several non-IP based solutions were deployed, such as ZigBee and Z-Wave. ZigBee gained significant popularity and has been widely adopted by the industry, as it became the most mature solution.

There are a number of implementations of IPv6 over low-power networks, but not all of them directly compete with each other, and most are still being actively developed. Adapting IPv6 to low-power networks has several challenges that these technologies have tried to overcome; the low-power requirement makes it important for the underlying network to be fault-tolerant and easily configurable, while still supporting a large number of connected nodes. 6LoWPAN, the effort to adapt IPv6 for low-power networks, is attractive for its interoperability, and it has been used by various technologies such as Bluetooth Low Energy. Unlike 6LoWPAN, ZigBee cannot easily communicate with other protocols. NFC is ideal for extremely short-range and secure communication, and the possibility to have completely passive nodes offers numerous use cases.

References

[1] P. Baronti, P. Pillai, V. W. Chook, S. Chessa, A. Gotta, and Y. F. Hu. Wireless sensor networks: A survey on the state of the art and the 802.15.4 and ZigBee standards. Computer Communications, 30(7):1655–1695, 2007.

[2] Y.-G. Hong, Y.-H. Choi, J.-S. Youn, D.-K. Kim, and J.-H. Choi. Transmission of IPv6 Packets over Near Field Communication. Active Internet Draft, 2015.

[3] J. Hui and P. Thubert. Compression format for IPv6 datagrams over IEEE 802.15.4-based networks. 2011.


[4] M. Isomaki, J. Nieminen, C. Gomez, Z. Shelby, and T. Savolainen. Transmission of IPv6 Packets over Bluetooth Low Energy. 2015.

[5] N. Kushalnagar, G. Montenegro, C. Schumacher, et al. IPv6 over low-power wireless personal area networks (6LoWPANs): overview, assumptions, problem statement, and goals. RFC 4919, August 2007.

[6] G. Montenegro, N. Kushalnagar, J. Hui, and D. Culler. Transmission of IPv6 packets over IEEE 802.15.4 networks. Internet proposed standard RFC 4944, 2007.

[7] M. Siekkinen, M. Hiienkari, J. K. Nurminen, and J. Nieminen. How low energy is Bluetooth Low Energy? Comparative measurements with ZigBee/802.15.4. In Wireless Communications and Networking Conference Workshops (WCNCW), 2012 IEEE, pages 232–237. IEEE, 2012.

[8] F. Stajano. The resurrecting duckling. In Security Protocols, pages 183–194. Springer, 2000.

[9] The ZigBee Alliance. ZigBee specification, 2006.

[10] J.-P. Vasseur and A. Dunkels. Interconnecting Smart Objects with IP: The Next Internet. Morgan Kaufmann, 2010.

[11] D. Whiting, N. Ferguson, and R. Housley. Counter with CBC-MAC (CCM). 2003.

New applications to reduce energy consumption of cellular network using Smart Grid

Toni Mustajärvi Student number: 84026K [email protected]

Abstract

In this paper, the energy consumption of mobile cellular base stations is studied with the aim of finding new ways to reduce network power consumption. Base stations use a lot of energy, and the power costs for operators are significant. Additionally, the usage of cellular networks is rapidly increasing, forcing the operators to deploy new base stations. This paper tries to find new ways in which the whole cellular network could save energy, especially using the new smart grid, without going deeply into network technology. The discussed solutions are delayed content distribution, communication with content providers about how the data can be transmitted, automatic shutdown of base stations, and payments based on the used power type.

Figure 1: Diagram of cellular network with smart grid [2]

… energy consumption of the base stations could be improved.

KEYWORDS: base station, cellular network, smart grid, energy saving methods

1 Introduction

Mobile cellular base stations (BSs) are the most energy-consuming components in a cellular network [7]. It has been shown that more than 80% of an operator's total energy consumption originates from the network, and of this, more than 50% comes from the base stations [2]. Taking into account that usage and bandwidth requirements in mobile cellular networks are rising [2], there is a need for more energy-efficient solutions.

Smart grids are rapidly replacing the traditional power grids that power cellular networks and base stations. These grids offer new possibilities for a network to adapt when trying to lower costs while keeping power usage as close to optimal as possible. For example, smart grids can have an energy price that varies at an hourly level depending on grid usage. They can also predict future costs, which the cellular network can use to create usage models. The smart grid also provides information about energy producers and their emission levels, which is vital information if green energy is an important aspect.

In this paper, new theoretical applications are discussed for improving cellular network energy efficiency. The focus is on solutions that can be achieved programmatically with the help of the smart grid. First, the paper discusses the smart grid and its characteristics, how base stations use energy, and what it means if they include their own power sources. Finally, four new solutions are discussed for how the energy consumption of the base stations could be improved.

2 Smart Grid and Cellular Network Energy Consumption

In this chapter, new smart grids are discussed and the new possibilities arising from them are evaluated against cellular networks.

2.1 Smart Grid

The smart grid is the next generation of the power grid system. It incorporates an information flow in the grid, and by doing so adds many new possibilities for operating and using the grid. Electricity consumption has been increasing, and demand for renewable energy has also increased. For example, with a smart grid it is possible to select the electricity provider instantly, or to apply dynamically changing electricity pricing based on total grid consumption and other variables. The same also applies to energy producers, who can adapt to changing energy requirements by lowering or increasing their generation. The smart grid provides real-time information about the grid, which can be used by all connected parties. [2]

The smart grid also brings an interesting new opportunity for ordinary energy consumers to sell energy into the grid. For example, if a consumer is able to store energy and the grid is in need of energy, the consumer can sell their stored energy to the grid.

Figure 1 presents a simplified view of the smart grid and how it is used by the cellular network. The usual power flow is from the grid to the cellular network, but there are situations when this is reversed, if the network has battery power available and the grid is in need of extra energy.

Aalto University T-110.5191 Seminar on Internetworking Spring 2015

In the figure, red and white arrows represent the information flow that the smart grid supports and provides. This information is provided to service providers, but also to consumers.

2.2 Smart Grid and varying energy price

Power grids are heavily shifting from traditional grids towards smart grids. The smart grid brings more intelligence to power grids and allows new possibilities for optimizing power usage. The smart grid controls power generation and distribution based on demand. It also allows users to choose the energy source; for example, base stations could buy renewable energy produced by wind turbines.

One of the main characteristics of smart grids is the varying energy price. In traditional power grids, prices were determined long beforehand. Because of this, the prices consumers saw were very steady, and no sudden shifts were possible. Smart grids, on the other hand, have the information flow incorporated into the grid, which allows rapid changes in the energy price and in the methods by which power is generated. It is even possible to have a negative energy price when much more energy is produced than consumed [4]. This happens because it is not always feasible to shut down power plants: shutting down can be a very expensive operation, and bringing a plant back up can take a long time. This dynamic nature of smart grids can be exploited in cellular networks; one possible use case is discussed in section 3.1.

2.3 Base station energy generation and storage

Base stations are not tied only to the power grid for the energy they need to operate. A base station can have its own power source, such as solar panels or a wind turbine. This also means the base station needs to be able to store energy, to tolerate changing power requirements and fluctuations in energy generation. These technologies allow new ways to utilize the smart grid when managing base stations [6]. This aspect is discussed further in section 3.4.

3 Solutions for energy-efficient cellular networks

This chapter discusses different promising solutions for utilizing smart grids to achieve better energy consumption in base station usage. The content is mostly on a theoretical level.

3.1 Delayed content distribution

It is known that data transfer rates in mobile cellular networks are growing rapidly [3]. Many applications affect this, but mostly it is due to the rapid growth of the smartphone market. New smartphones are better and better at consuming large data and video streams.

A common theme for these applications is that they require the data instantly, because the user is waiting for the content to be delivered. However, there are many tasks that can be done in the background, where the scheduling is not so strict. One good example of this kind of data is mobile phone software updates. Currently, phone operating system updates can take almost 1 GB of data, and transferring this amount of information to many phones in a cellular network takes a huge amount of bandwidth. Games and other applications are also growing in size, and updates happen frequently. For mobile devices, the most energy-consuming operation is the active network connection. To limit network activity over longer periods, content can be loaded into memory as fast as possible and then consumed at a much slower speed. The clear advantage of this is the possibility of closing the connection, thus saving a lot of battery. [8]

With the new intelligent smart grid, these data transfers could be scheduled to happen when it is most suitable for the network. There are several things that must be taken into consideration when scheduling the data transfers with the aim of reducing the total energy consumption of the cellular network.

First, the grid needs to know how, and by when at the latest, the transfer should be commenced. To achieve this, distributors need to provide this information to the smart grid. This aspect is discussed further in section 3.2.

Second, the grid should take into account the current network usage. Cellular base stations have the characteristic that power consumption does not rise linearly with the used capacity [1]. Instead, they consume almost full power even when lightly utilized. Because of this, it would be efficient to use the remaining capacity for something useful, for example transmitting updates and other non-time-sensitive information. On the other hand, the quality of the network should be guaranteed for other users whose data access is needed immediately.

Third, the energy price and the emission levels of its generation should be taken into account. Because the smart grid can predict the future energy price, it is possible to schedule data transfers to the time when the price is lowest. For example, if the price is low during nights, when usage is also at its lowest, then instead of shutting down the cellular nodes, the low energy price could be used to transfer the updates. Of course, this automated process should be studied further, and it should have a good algorithm for determining the optimum usage of the cellular network. Another aspect is that base stations can themselves choose the energy provider, taking into account the type of production and the prices. Especially the emission level of the energy producer can be an important criterion and should thus be taken into account when choosing the provider [2]. When green and renewable energy sources are prioritized, they should also be included in the calculations.

Although the previous discussion mostly concentrated on mobile phone usage, the same principles and ideas can be applied to other use cases. One good example would be automated data collection devices in the network. They could be instructed by the network to send their information at a specific time of day.

Figure 2: Cellular network normalized traffic profile [1]

Figure 3: Battery powered base station [5]

3.2 Communication with content providers

The cellular network and base stations cannot by themselves determine the type of the content or how it should be delivered. They need this metadata from the content providers.

This information is even more critical when considering the real-time nature of smart grids. The situation can change constantly; for example, when the grid generates more energy than is consumed at a given time, base stations could utilize this information to do more work. But for the network to know what work can and should be saved for later, it needs information about this. One simplified idea could be to mark the content as supporting a different delivery tactic, with information such as "should be delivered within a week". With this information, the cellular networks and individual base stations could delay the sending to a more optimal time.
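Combining the scheduling criteria of section 3.1 (deadline, spare capacity, energy price) with the delivery metadata of section 3.2, a simple greedy planner could assign each delayed transfer to the cheapest feasible hour. The sketch below is purely illustrative: the function and class names, the 24-hour price and load vectors, and the crude "10 GB per unit of capacity" model are all hypothetical assumptions, not taken from any cited system.

```python
from dataclasses import dataclass

@dataclass
class Transfer:
    name: str
    size_gb: float
    deadline_hour: int  # latest hour (0-23) at which the transfer may start

def schedule_transfers(transfers, price, load, capacity=1.0):
    """Assign each delayed transfer to the cheapest hour before its
    deadline that still has spare base-station capacity.

    price: list of 24 hourly energy prices from the smart grid
    load:  list of 24 normalized traffic loads (0.0-1.0)
    """
    spare = [capacity - l for l in load]  # unused capacity per hour
    plan = {}
    # Handle the most urgent transfers first.
    for t in sorted(transfers, key=lambda t: t.deadline_hour):
        # Hypothetical capacity model: one unit of spare capacity moves 10 GB.
        need = t.size_gb / 10
        candidates = [h for h in range(t.deadline_hour + 1) if spare[h] >= need]
        if not candidates:
            continue  # no feasible slot; deliver immediately instead
        best = min(candidates, key=lambda h: price[h])
        spare[best] -= need
        plan[t.name] = best
    return plan

# Night hours are cheap and lightly loaded, so updates land there.
price = [2, 2, 1, 1, 2, 3, 5, 7, 8, 8, 7, 6, 6, 7, 8, 8, 7, 7, 6, 5, 4, 3, 3, 2]
load  = [0.2, 0.1, 0.1, 0.1, 0.2, 0.3, 0.5, 0.8, 0.9, 0.9, 0.8, 0.7,
         0.7, 0.8, 0.9, 0.9, 0.8, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
updates = [Transfer("os_update", 1.0, 23), Transfer("game_patch", 2.0, 12)]
print(schedule_transfers(updates, price, load))
```

A real planner would also have to respect quality-of-service guarantees for interactive users, as noted above; this sketch only captures the price-and-capacity part of the decision.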
3.3 Automatic shutdown of base stations

Because base station energy utilization is not linear, and even low utilization draws almost full power, shutting down base stations during low-traffic times is a good way to save energy.

Another important point is the traffic profile of the cellular network during the week. An example profile is presented in figure 2. Traffic is at its highest in the middle of the day, when people are working. During nights, the traffic is at its lowest. Weekends also have slightly lower traffic compared with working days. The density of deployed base stations needs to be high in more populated areas and in centers where people work. Considering that base station power consumption correlates very little with the used capacity, there is a lot of potential to shut down some of the base stations during low-traffic times. This is, however, not possible to achieve by considering only a single base station; instead, the whole cellular network needs to be taken into consideration, and communication between base stations is needed.

Base stations are shut down not only because traffic is low, but also when energy prices in the smart grid are too high, or when the pollution level associated with the provider is considered too high [2]. The smart grid enables the cellular network to adapt to these fast-changing situations by providing the necessary information about the grid and the energy being produced [5].

When base stations are shut down, the network also needs to consider coverage and the quality of service for its users. Coordinated multipoint (CoMP) is a technology for making sure the network requirements are satisfied [2].

3.4 Battery power assisted base stations

One of the main benefits of the smart grid is its support for a two-way flow of energy. This means that energy generated or stored at the base station can also be sold to the grid if there is demand for it. Combined with real-time information about prices and providers, it is possible to create battery-powered base stations that can themselves determine the optimum combination of energy usage from the grid and from the base station's own energy reserve.

Figure 3 presents a battery-powered base station. The power grid provides a constant flow of energy to the base station. Additionally, a solar panel is used to generate energy on site, and the energy is stored in the battery. Using intelligent power management algorithms, the base station can determine which energy to use.

Cellular network usage fluctuates during the day, and there can be large peaks in usage. Normally, the energy for these is drawn from the grid, but a battery-supported base station can use its own energy reserves. This makes it possible to level out the energy consumption, and with it the price of the energy. The same situation can occur on the smart grid side: the price can suddenly increase for a short time, for example because of a sudden over-demand for energy relative to production. During these situations, the base station can also rely on its own battery capacity. Additionally, it is possible to sell the available energy to the grid. Leithon et al. [6] showed that significant cost savings are possible with this technology. There are, however, many parameters to consider, for example the battery capacity, the charging rates, and the correlation between price and consumption profiles.

The base station can also be equipped with its own energy source, such as solar panels. This was proven to be a working solution at Mobile World Congress 2010, where a 100% solar-powered base station was deployed [9]. Using intelligent energy management algorithms, the solar panels are able to maintain the battery power during the base station's operation.

3.5 Payments based on used power type

The smart grid provides very precise information about the energy providers: how the energy is generated and the pollution levels of each provider. This information could also be provided to the end users of the network. It would allow new types of contracts in which cellular operators provide network access whose energy is generated with green energy. The same concept has already been applied to consumers and households by traditional energy providers.

For example, an operator could offer two contracts that are similar from a technical perspective, but where the difference is that the energy for one is produced with renewable methods. This might affect the prices, since renewable energy is usually more expensive, but many people are environmentally conscious and willing to pay more for green solutions.

4 Discussion

In this paper, we have discussed possible new ways to make cellular networks and base stations more energy efficient with the help of the new smart grid.

The bandwidth requirement in mobile cellular networks is rapidly increasing [3]. This is mostly because smartphones and other mobile devices are better and better at consuming rich media. This places more demand on the cellular network and makes it harder to optimize energy consumption under the increasing traffic.

The Internet of Things (IoT) is a coming trend in the markets, and the basic idea behind it is that every device is connected to the Internet. This allows new possibilities for control and automation in many areas. These devices, however, need some kind of wireless communication method, and cellular networks are the most suitable candidate for most use cases. These devices will most probably cause a large amount of traffic in the cellular network, making it considerably harder to achieve energy savings if traffic keeps increasing. This aspect needs to be taken into account when designing new green base stations. The cellular network and individual base stations should have some sort of control over the IoT devices and how they communicate with the Internet: always-open connections drastically lower the possibilities of achieving energy savings in the base stations.

5 Limitations

This paper was written by conducting a literature review and collecting possible solutions (in section 3) for improving the energy efficiency of cellular network base stations powered by the smart grid. A limited number of articles were found on this topic, which might affect the results. Also, some of the solutions were not based on any specific study; the ideas were drawn from similar topics.

6 Conclusion

In this paper, cellular networks powered by the new smart grid were discussed. We found that the new smart grid opens new possibilities for cellular networks to adapt their operation to achieve energy savings and lower emissions. Smart grids offer a near real-time information flow of energy prices and consumption. With this information, even individual base stations can select which provider to buy electricity from.

Delayed content distribution, with the help of content providers, could provide large energy savings when the parties work together. Larger file transfers in mobile cellular networks could be scheduled for times when the energy price is low or when base stations have plenty of green energy stored in their own batteries.

On the other hand, base stations could optimize their own operation with the help of the smart grid and intelligent power management algorithms. Utilizing energy generation and batteries, base stations have been demonstrated to be able to work completely on their own. Base stations could also be operated at the network level, allowing stations to be shut down in places where utilization is low, with their work divided among other nearby stations.

Finally, pricing based on the green energy used in the network could be offered to the customers. With smart grids, base stations are able to select providers based on numerous parameters, so providing new pricing options would be possible.

Future studies should be conducted on how the cellular network could adapt to varying energy prices when powered by smart grids. For this study, very limited material was found, and none of the papers described this aspect. Secondly, delayed content distribution should be studied further, including how it could be utilized with a smart grid whose energy prices change heavily during the day.

References

[1] O. Blume, H. Eckhardt, S. Klein, E. Kuehn, and W. M. Wajda. Energy savings in mobile networks based on adaptation to traffic statistics. Bell Labs Technical Journal, 15(2):77–94, Aug. 2010.

[2] S. Bu, F. R. Yu, Y. Cai, and X. P. Liu. When the smart grid meets energy-efficient communications: Green wireless cellular networks powered by the smart grid. IEEE Transactions on Wireless Communications, pages 1–11, 2012.

[3] Y. Chen, S. Zhang, S. Xu, and G. Li. Fundamental trade-offs on green wireless networks. IEEE Communications Magazine, 49(6):30–37, June 2011.

[4] F. Genoese, M. Genoese, and M. Wietschel. Occurrence of negative prices on the German spot market for electricity and their influence on balancing power markets. pages 1–6. IEEE, June 2010.

[5] T. Han and N. Ansari. Powering mobile networks with green energy. IEEE Wireless Communications, 21(1):90–96, Feb. 2014.

[6] J. Leithon, S. Sun, and T. J. Lim. Energy management strategies for base stations powered by the smart grid. In Global Communications Conference (GLOBECOM), 2013 IEEE, pages 2635–2640. IEEE, Dec. 2013.

[7] N. Nupponen. Energy efficiency development of LTE network authentication protocol. Master's thesis, Aalto University School of Electrical Engineering, September 2013.


[8] K. Pentikousis. In search of energy-efficient mobile networking. IEEE Communications Magazine, 48(1):95–103, Jan. 2010.

[9] D. Valerdi, Q. Zhu, K. Exadaktylos, S. Xia, M. Arranz, R. Liu, and D. Xu. Intelligent energy managed service for green base stations. In GLOBECOM Workshops (GC Wkshps), 2010 IEEE, pages 1453–1457. IEEE, Dec. 2010.

Security and privacy in smart energy communities

Kari Niiranen Student number: 66925J [email protected]

Abstract

Smart energy has been rolled out largely with a focus on possible energy savings and efficiency improvements. However, the privacy and security aspects have not usually been considered during this deployment. This paper looks at the current situation and highlights possible privacy and security concerns. Proposed solutions are also examined, and the future of privacy and security in smart energy grids and communities is discussed.

KEYWORDS: security, privacy, smart energy

1 Introduction

Smart energy grids aim to solve problems that traditional energy distribution experiences due to increased demand and the introduction of renewable sources. Traditional grids react slowly to changing power requirements, which can lead to situations where parts of the grid have an abundance of energy while others are starved of it. Smart energy grids solve these problems by permitting the grid to react to the changing situations faster and automatically [1]. Smart energy grids can provide an option for customers who generate their own energy to sell any excess back to the grid, while allowing the customer to buy energy to cover peak requirements that the customer's own generation capacity cannot handle.

In addition to the smart grid, modern home appliances can form a network of smart devices that are able to communicate with each other and with the grid. This means that appliances with high energy requirements can ask the grid about current energy costs and inform the grid about future energy requirements. Additionally, the grid can inform the customer's appliances about possible usage limitations, leading to a situation where the networked appliances decide which of them has priority over the others. This can help lower the utility bill by running such appliances only when energy is cheap. It can also improve the efficiency of the grid and help in situations where the grid would otherwise become overloaded and fail.

By combining many neighbouring networks, a smart energy community is formed, in which many customers can work together to save energy. This could allow the community to purchase shared generating capacity, thus further reducing their utility bills.

As smart energy communities and the usage of smart home appliances become more widespread, security and privacy issues are going to become more widespread too. This is because most migrations from the traditional grid to a smart grid are done to reduce costs and improve efficiency; security and privacy are usually an afterthought for the utility company. For example, a malicious individual might use monitoring tools provided by the utility to find out when a home or business is empty. Insecure appliances and smart meters might allow an attacker to lie about energy consumption and production, or to change the power state of an appliance. However, when used correctly, the same tools allow energy saving and a more efficient distribution of power, and could help save money on the electricity bill [8].

The main challenge is therefore providing users with accurate information about their energy consumption while at the same time keeping that information secure from others. Additionally, the same information should be hard to modify, or at least modifications should be obvious, in order to prevent fraud.

This paper looks at proposed and current solutions to the privacy and security issues smart grids experience. Section 2 gives an overview of currently used solutions and their security risks. Section 3 focuses on proposed solutions, and finally section 4 discusses the future of smart energy.

2 Current solutions in practice and security risks

In this section we look at what solutions are currently in use and what security risks they face. Smart grids provide many improvements to how energy usage is controlled and monitored. However, this also means that there are new ways to abuse the system and violate the customer's privacy. Additionally, the smart grid allows the customer to change role from a pure consumer to a consumer-producer, who can provide additional energy generation capacity by connecting their own solar panels or another energy source to the grid.

2.1 Tracking of energy usage

Accurate metering of energy consumption is important, as this information is used to bill the customer. In traditional grids, the meter was read by a technician at certain intervals. This meant that the customer was either billed based on estimated energy consumption, or the customer sent the meter reading to the utility company. If the amount the technician read from the meter differed from the estimated or reported consumption, the customer was either billed for the extra usage (if the estimate was below actual usage) or given credit (if the estimate was above actual consumption). Modern smart grids use remotely readable smart meters that store accurate usage information, with detailed consumption statistics available at a precision of at least one hour.
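Hourly metering data of this kind makes billing at the hourly spot price straightforward: each hour's consumption is simply priced at that hour's rate. The sketch below is illustrative only; the function name, the margin parameter, and the example prices are assumptions, not taken from any utility's actual tariff.

```python
def hourly_bill(consumption_kwh, spot_price_eur_kwh, margin_eur_kwh=0.002):
    """Bill a customer from hourly smart-meter readings: each hour's
    consumption is charged at that hour's spot price plus a fixed
    (hypothetical) retailer margin. Both lists must cover the same hours."""
    if len(consumption_kwh) != len(spot_price_eur_kwh):
        raise ValueError("readings and prices must cover the same hours")
    return sum(c * (p + margin_eur_kwh)
               for c, p in zip(consumption_kwh, spot_price_eur_kwh))

# Three hours of readings: a cheap night hour, then two daytime hours.
usage  = [0.5, 2.0, 1.5]     # kWh per hour, as read from the smart meter
prices = [0.01, 0.05, 0.04]  # EUR/kWh hourly spot prices
print(round(hourly_bill(usage, prices), 4))  # prints 0.173
```

Because the meter records actual hourly usage, there is nothing to estimate and nothing to correct afterwards, which is exactly the advantage over the traditional estimate-and-adjust billing cycle described above.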

This hourly precision has the advantage that the utility company can offer to bill the customer at the hourly electricity price, thus possibly reducing the utility bill. Even if the customer is billed a fixed sum per kilowatt-hour used, the customer is always billed for exactly the amount of energy used. This means there is no need to adjust the billing in case of incorrect estimates.

Some utility companies, such as Helen (previously known as Helsingin Energia) and Fortum in Finland, even provide their customers with an interface where it is possible to see and track accurate usage information. An example screenshot from Helen's Sävel Plus service is shown in figure 1. This allows the customer to see in near real time how new appliances affect energy usage. However, this same information can be used to build a profile of the customer's habits. For example, lower energy consumption could be interpreted as the usage location being empty.

Given accurate usage information, the customer may use it to compete with others in how much energy they are able to save. For example, Opower [10] provides methods for the customers of a utility company to be more engaged with their energy consumption. While they recently concluded that a stand-alone application connected to the utility company's data did not provide the widespread results they had hoped for, they believe that behavioural change and customer engagement are the way to reduce energy utilisation. Similar solutions exist for larger customers who might manage multiple buildings; for example, Granlund provides a software solution called Granlund Manager [6] that performs energy consumption monitoring at the building level.

Additionally, all of this relies on the remotely readable meter being secure. For this to be true, automated testing frameworks should be available. Dantas et al. propose in their paper [4] a tester based on sending random data on the meter's communications channel and observing how the meter and connected systems cope with incorrect commands. This allows hardening the system against deliberate attempts to crash monitoring systems by feeding them unexpected data. It can also reveal vulnerabilities that could allow unauthorised alteration of usage data.

2.2 Smart appliances

Smart appliances allow remote control or automation of their functions by the user. For example, the user could instruct the washer to start washing clothes when the solar panels reach a certain energy generation level or, if that never happens, at a predetermined time. Thus the user has clean laundry waiting when returning home. Other possibilities include sequencing different appliances to keep energy usage below a certain threshold. This is especially important when generating one's own energy and trying to avoid buying energy from the grid if at all possible. Appliances could also communicate with the grid and run only when energy is cheap or the grid has enough spare capacity.

However, considering how current networked smart devices are often vulnerable to exploits and can expose user information to the Internet, this has possible security and privacy implications. For example, remotely controllable appliances should use authentication that is unique to that specific unit, not credentials shared with all units of the same model; even worse, some have no authentication at all. A second issue is that these kinds of devices tend to be configured once and then forgotten, resulting in vulnerable firmware versions being used long after a fixed version has been made available. Usually the user is completely oblivious to the fact that the cheap webcam he installed to watch his pet during work is actually publicly available on the Internet [7], or that his new smart television is used to spy on him [9]. Additionally, the user may connect their air conditioning to their smart network, thus allowing an attacker to cause physical discomfort by turning the temperature to extreme values while the user sleeps.

Thus, new appliances should be made easy to configure, with automatic firmware updates enabled by default. However, this poses the risk of new firmware accidentally breaking the appliance. It also does not remove the issue for old, no longer supported appliances, as these will not receive new updates fixing security issues. And even though an appliance may no longer be supported, it may still work for its intended purpose, making it questionable why it should be replaced just because the software it runs is no longer maintained.

Smart appliances pose other risks as well: even if the management interface is secure, the appliance might be installed in some permanent manner. Thus, if the current owner sells the property, the new owners might not realise that the appliance can be managed remotely. This issue has no easy solutions, as the user usually wants appliances to do their job without any additional hassle. Requiring the user to change the password at regular intervals can lead to dissatisfied customers for the manufacturer. If there is a way to reset the password, it needs to be hidden well enough that it is not accidentally triggered, which leads to the issue that the new owners most likely are not aware of the existence of such a function.
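The appliance-automation idea at the start of this section (start the washer when solar generation reaches a threshold, or unconditionally at a predetermined fallback time) amounts to a small decision rule. The sketch below is a hypothetical illustration; the function name, threshold, and fallback hour are assumed values, not from any real appliance API.

```python
def should_start(hour, solar_kw, threshold_kw=1.5, fallback_hour=14):
    """Decide whether a smart appliance should start now: start once the
    solar panels reach the generation threshold, or unconditionally at
    the fallback hour so the job is never skipped entirely."""
    return solar_kw >= threshold_kw or hour >= fallback_hour

# Sunny morning: generation passes the threshold, so the washer starts.
print(should_start(hour=10, solar_kw=2.0))   # True
# Overcast day: nothing happens until the fallback time is reached.
print(should_start(hour=10, solar_kw=0.3))   # False
print(should_start(hour=14, solar_kw=0.3))   # True
```

The security discussion above applies directly to a rule like this: whatever interface lets the user set the threshold and fallback time is exactly the interface that needs per-unit authentication and maintained firmware.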

86 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

Figure 1: Screenshot from Helen's Sävel plus service.

2.3 Smart energy communities

Smart energy communities are communities where a group of nearby people decide to work together for a shared goal. A smart energy community might decide to lower the overall energy bill by purchasing a solar panel and selling any excess generated power to the grid. Any profit generated is then put into purchasing more solar panels or other improvements to shared areas, such as a new grill for the building's yard. As the aim is to lower the whole community's energy usage, usually some sort of usage tracking is provided. This could be an adaptation of the monitoring system detailed in section 2.1. To encourage lowering energy usage, a ranking system could be employed where one could track their energy savings compared to others.

However, ranking systems have privacy concerns. If everyone's information is available publicly, it could be used to see if somebody is not at home. Also, if the target is to save energy as a community, rankings could cause conflicts if someone does not meet the targets set. The simple solution would be to only show the user their own rank and numbers. To make this more secure, the data used to build this ranking should be handled in a manner that preserves the privacy of individual members of the community. One way of doing this could be to use Sharemind [3] or a similar system. The aim is that, after running the data through the system to generate the rankings, it is hard to derive from the rankings which member of the community each data point corresponds to.

3 Proposed research solutions

3.1 Obfuscation of information

To solve the issue of reporting exact usage data to the smart energy community, obfuscated data is sent. The obfuscation is implemented using an algorithm that guarantees that the actual numbers cannot be easily extracted from the sent value, but when used in calculations the correct totals can be derived. Dimitriou and Karame propose [5] a method of obfuscating the amounts and prices at which different consumer-producers in a smart energy grid are willing to buy and sell energy. This is done so that members of the grid are unable to easily game the system for monetary gains.

The same system could be utilised to protect the numbers in the smart energy community detailed in section 2.3. This would move the storage of the actual values to the user, storing only the obfuscated values centrally for calculations. Doing it this way protects the system from accidental information leakage. Another way of doing this is adapting the research done by Bogdanov et al., who used Sharemind to combine two different data sources in a way that preserved the privacy of the individuals in said data sources [2].

The problem with these solutions is that they cost money. Thus the cost of adopting either or both methods needs to be low enough that everyone can adopt them without the cost being the deciding factor. Additionally, the average person might not see the benefits of either system. The problem with proposing such a solution is that it can be seen as stating that the other involved parties cannot be trusted. Thus the argument for implementing a solution like these needs to be constructed carefully.
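The core property described above, that individual values stay hidden while totals remain exact, can be illustrated with a toy additive-masking sketch. This is not Sharemind's actual protocol or the scheme of Dimitriou and Karame; all names and numbers below are invented for illustration:

```python
import random

def split_reading(value, n_shares=3, rng=random.Random(42)):
    """Split a meter reading into additive shares: no single share
    reveals the reading, but the shares still sum to it exactly."""
    masks = [rng.randrange(-10**6, 10**6) for _ in range(n_shares - 1)]
    return masks + [value - sum(masks)]

# Hypothetical per-household kWh readings in a small community.
readings = [420, 315, 512]
# Each household sends one share to each of three servers.
shares_per_home = [split_reading(r) for r in readings]
# Every server sums only the shares it received ...
server_totals = [sum(col) for col in zip(*shares_per_home)]
# ... and the community total is still exact.
assert sum(server_totals) == sum(readings)
```

Real systems such as Sharemind build secure multi-party computation on top of this idea, so that even the ranking computation never reconstructs any individual's values.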
3.2 Standardisation of appliances

As there are almost as many different management interfaces for smart appliances as there are manufacturers, a user might not want to learn every appliance's own special interface. By providing standard ways of managing certain common functions, such as resetting user credentials or enabling automatic firmware updates, the user is more likely to actually perform said actions, and thus more likely to keep the appliance secure. However, this does not guarantee that the user actually performs the required actions to keep the appliance secure.

The standardisation of interfaces is also important when buying a property or used appliances, as it is likely that the manuals for the appliances are lost. If the interface is standardised, the appliance can still be managed without difficulty, assuming the user has at some point used an appliance with the standardised interface. The problem with this approach is that manufacturers are not keen on sharing functionality with others.

4 The future

In the future, everything will be connected to a network. This means that unless care is taken to ensure information is stored securely, user privacy and security could easily be compromised. Looking back at earlier technologies, security and privacy have been secondary concerns that only get dealt with once they are exploited. Going forward, the first implementations are almost guaranteed to contain privacy and security issues that are only solved as the systems mature. However, people tend to avoid making the same mistakes twice, so new appliances do not usually contain the same issues as older models. For example, wireless access points no longer use a default unencrypted wireless network like the devices from the early 2000s did. Modern access points use encrypted networks by default to protect users who tend to plug devices in and forget about them.

When working with privacy and security, one has to maintain a fine balance between security and usability. In most cases, making a system more secure hurts its usability, and vice versa. Thus the question becomes how much damage the information could cause to the involved parties if it leaked, rather than how secure the system is. For instance, requiring multiple authentication sources to remotely adjust the thermostat of the air conditioning would be prohibitive to the usability of that function. Even if someone managed to access it without rights, the worst that happens is that the inhabitants feel uncomfortable for a while and some excess energy might be used.

References

[1] S. M. Amin and B. F. Wollenberg. Toward a smart grid: power delivery for the 21st century. Power and Energy Magazine, IEEE, 3(5):34–41, 2005.

[2] D. Bogdanov, L. Kamm, S. Laur, P. Pruulmann-Vengerfeldt, R. Talviste, and J. Willemson. Privacy-preserving statistical data analysis on federated databases. In Proceedings of the Annual Privacy Forum, APF'14, volume 8450 of LNCS, pages 30–55. Springer, 2014.

[3] Cybernetica AS. Sharemind. https://sharemind.cyber.ee/, 2015. [Online; accessed 17-April-2015].

[4] H. Dantas, Z. Erkin, C. Doerr, R. Hallie, and G. v. d. Bij. eFuzz: A fuzzer for DLMS/COSEM electricity meters. In Proceedings of the 2nd Workshop on Smart Energy Grid Security, SEGS '14, pages 31–38, New York, NY, USA, 2014. ACM.

[5] T. Dimitriou and G. Karame. Privacy-friendly planning of energy distribution in smart grids. In Proceedings of the 2nd Workshop on Smart Energy Grid Security, SEGS '14, pages 1–6, New York, NY, USA, 2014. ACM.

[6] Granlund. Granlund Manager. http://www.granlundmanager.fi/, 2015. [Online; accessed 11-April-2015].

[7] C. Heffner. Exploiting surveillance cameras - like a Hollywood hacker. In Black Hat USA 2013, February 2013.

[8] P. McDaniel and S. McLaughlin. Security and privacy challenges in the smart grid. Security & Privacy, IEEE, 7(3):75–77, 2009.

[9] B. Michéle and A. Karpow. Watch and be watched: Compromising all smart TV generations. In Consumer Communications and Networking Conference (CCNC), 2014 IEEE 11th, pages 351–356. IEEE, 2014.

[10] Opower. Opower. http://opower.com/, 2015. [Online; accessed 10-April-2015].

Software technology challenges in 3D printing

Ari Oinonen Student number: 53768V [email protected]

Abstract

3D printing is predicted to revolutionise manufacturing by enabling custom-made products at low cost. This seminar paper explores the use of software technology in 3D printing. The various steps in 3D printing, from acquiring a 3D model to the actual fabrication, are covered from a software technology perspective. The biggest challenges in 3D printing technology might be on the hardware side, but software technology will also play a big role in the advancement of the technology. Combining individual research and knowledge will be necessary in order to achieve a highly advanced printing process.

KEYWORDS: software technology, software technology challenges, 3D printing, additive manufacturing

Figure 1: The operating principle of Fused Deposition Modelling. [20]

1 Introduction

1.1 Objective of the paper

The objective of this seminar paper is to look into 3D printing from a software technology perspective. The main topics covered include the various steps in 3D printing where software technology is used, and the main software challenges in those areas.

1.2 Overview of 3D printing

3D printing is often called additive manufacturing (AM), referring to the fabrication method in which the object is built from scratch. This is as opposed to subtractive manufacturing methods with CNC (computer numerical control) machines such as milling, where the process starts from a block of material that is then partly removed until only the desired shape remains. [20]

3D printing is predicted to revolutionise manufacturing by enabling custom-made products at low cost [16]. In 2012, the global market for AM products and services increased by 28.6 % from the previous year, reaching USD 2.2 billion [26].

1.3 3D printers

3D printing technology has been around since the 1980s. However, it was not until the 2000s that 3D printers arrived on the entry-level market. Affordable, desktop-sized, personal 3D printers allow access to additive manufacturing capabilities for those of all skill and interest levels, including amateurs and hobbyists along with those focused on professional usage. Research about 3D printers has primarily focused on professional machines, although more recent research covers entry-level 3D printers as well. [9, 20]

3D printing can be done using several different techniques. Two common ones are fused deposition modelling (FDM) and selective laser sintering (SLS). Fused deposition modelling printers have an extruder head that moves on two axes, while the building platform moves on one axis. The operating principle can be seen in Fig. 1. A heated thermoplastic material is extruded to build the object in layers. Selective laser sintering also manufactures the object in layers, but in a different fashion: the pre-determined shape is formed by using a laser beam to cure a photo-curable resin powder in a container, and afterwards the unused powder surrounding the cured object can be collected and used in another printing. [20]

Various materials can be used to 3D print objects. Entry-level machines generally use ABS (acrylonitrile butadiene styrene) or PLA (polylactic acid) plastics as their material. Some commercial printers are also able to print objects using metals [16].

As synthetic materials have an environmental impact, environmentally friendly printing with wood flour and wood pulp is being tested [10]. Ceramics and edible materials are also being printed [16]. A great deal of research is being done in the medical field, but according to Bose et al., we are still a long way from completely printing functioning human tissue [3].

3D printing can be subcontracted on the Internet or from brick-and-mortar businesses, such as MR MAKE in Germany. However, subcontracting is slower and the costs are higher. [20]

2 3D printing process

2.1 Overview

The 3D printing process is made up of three main phases: 3D model creation, process planning, and the actual fabrication of the physical part or parts.

The 3D model can be created using a CAD (computer-aided design) tool, generated with scripts, or scanned in 3D from a real-life object. Section 2.2 looks at the various methods and approaches of creating 3D models for printing.

The model may have errors or any number of issues that prevent it from being properly 3D printed, make the printing sub-optimal, or cause the print not to function as planned. The model geometry should be corrected if need be, and the model can be balanced and orientated for optimal printing performance.

The printing process must be thoroughly planned before the actual printing. This includes slicing the model and generating tool-paths for the printer. This results in machine instructions called G-code that are given to the printer. The printing process is covered in section 2.3.

The actual fabrication of the physical part is done based on the G-code instructions. There are various ways the object can be fabricated.

The model usually needs manual post-processing after it is printed, where support structures and any defects on the surface have to be removed. The model might also be painted.

2.2 3D model creation

3D printing starts by creating a 3D model, for which there are two main approaches. One approach is to create the 3D model from scratch by using a CAD (computer-aided design) tool or by generating the model with code. The second approach is to create the model from images or point clouds based on a real-life physical object.

The model you need may have already been created by someone else and been made available for download on a website such as Thingiverse. Thingiverse is a website for sharing 3D models that are meant to be created physically using 3D printers, laser cutters, milling machines, and many other technologies. [25]

2.2.1 Computer-aided design tools

A traditional method for creating 3D models is using a CAD tool. Common CAD tools have not been designed to support the special requirements that 3D printing and specific 3D printers have. For example, certain surface angles and shapes may not print well on some 3D printers.

2.2.2 Scripting and customisable models

Most CAD tools are not designed to support parameters in models. However, customisable models can be created by writing scripts. One of 3D printing's biggest strengths is being able to offer personalised designs with minimal waste, with customisability offering the possibility of altering the model to fit the individual needs of people.

OpenSCAD is a non-visual, programmer-oriented solid modelling tool. It is much like a 3D compiler, rendering 3D models from script files. [24] Thingiverse extended the OpenSCAD language to allow users to design parametric objects by adding various parameters in the script. The end user can then change the parameters, such as dimensions or optional parts, customising the object for themselves. An example of the OpenSCAD language and parameters can be seen in Fig. 2. [1]

Figure 2: .scad script: excerpt of global parameters (1) and a module (2). [1]

Acher et al. conducted a study of Thingiverse in order to establish a possible connection with software product line (SPL) engineering. They found hints that SPL-like techniques are used in 3D printing. However, there are many limitations in OpenSCAD parameters, with OpenSCAD not having a mechanism for specifying constraints between parameters or restricting some values of a parameter. SPL techniques can improve the reusability of 3D models, but the complexity induced by the tools and languages can reduce the intended benefits. Reusing parts of a model and creating reusable building blocks for new models may turn out to be a powerful method for additive manufacturing, but the possibilities appear rather limited for the time being. [1]
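The parametric idea can be sketched in a few lines: because OpenSCAD models are plain text, a customisable design reduces to substituting user-chosen parameters into a script template. The template below is a hypothetical example, not the script shown in Fig. 2:

```python
# Hypothetical customisable-box template in OpenSCAD source form;
# Python only substitutes the user's parameters into the text.
SCAD_TEMPLATE = """\
// customisable open box (illustrative only)
difference() {{
    cube([{width}, {width}, {height}]);
    translate([{wall}, {wall}, {wall}])
        cube([{width} - 2*{wall}, {width} - 2*{wall}, {height}]);
}}
"""

def customise(width=40, height=20, wall=2):
    """Render .scad source for the chosen dimensions (in mm)."""
    return SCAD_TEMPLATE.format(width=width, height=height, wall=wall)

scad = customise(width=60, height=25, wall=3)
assert "cube([60, 60, 25]);" in scad
```

Thingiverse's extension goes further, declaring parameter metadata in the script itself so that a web form can be generated; as noted above, constraints between parameters are still not expressible.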

2.2.3 3D scanning

3D scanning can be done using photographs or videos of an object. Medical imaging data can also be used. [15, 18]

Photogrammetry is the technology of deriving 3D data, characteristics, and attributes from 2D images [20]. Madracevic et al. present a method that focuses on detecting, grouping, and extracting features, such as edges and faces, present in a given picture and trying to interpret them as 3D clues [15]. Scanning the whole surface of a 3D object can be done using three methods. One method is to rotate the object on a rotating stage while a single camera is stationary, a second method is to move the camera around a still object, and a third method is to use multiple cameras to scan the object from multiple angles simultaneously. [11]

Another approach to creating a 3D model is to gather point clouds with a laser scanner and create a model from the data. Static terrestrial laser scanning (TLS) can produce dense point clouds and can reach an accuracy of 8 mm or better at ranges of 10 m to 50 m. Mobile laser scanning is an emerging technique, in which a laser scanning system is mounted on a vehicle. Point clouds obtained using laser scanning techniques are typically very large, and can contain hundreds of millions of points. Processing them requires a lot of computational resources and can be time-consuming. Typical problems encountered with this approach include gaps in the data, often caused by obstructions in the scanning, and varying point densities in the data set. [20]

2.2.4 Error checking

Especially when scanned, the model may need some correcting and adjusting before it can be used. It may have surfaces that do not connect, or undesirable gaps in the model. The mechanics of the 3D printer or the method used can also cause some complications in the printing of the model. For example, walls or shapes at some specific angles can potentially have issues printing optimally due to the physical limitations of the mechanical process. [20]

2.2.5 Balancing and hollowing

Balancing or hollowing the model can also be necessary, as the object may not stand on its own when printed, may need too much material to be practical, or may be too heavy. [13, 6]

The cost of materials used for 3D printing remains high, despite 3D printers becoming popular even for home users, and solutions are desirable. Wang et al. present an automatic solution for designing a skin-frame structure for 3D objects, with the resulting object consisting of a hollow frame inside the object, covered with a solid skin on the outside. An example of this can be seen in Fig. 3. The frame structure generated by the algorithm is guaranteed to be physically stable and printable. The biggest limitation of their solution appears to be scaling it to large objects that exceed the tray size of the printer. In this case, segmentation of the objects would be required. Assembling parts of various frame structures while maintaining strength and stiffness may turn out to be challenging in some situations. [22]

Figure 3: Printed model with part of the skin removed to show the inside structure. [22]

2.3 Process planning

Before the physical fabrication process, the geometric model undergoes process planning, in which it is converted into instructions for the printer. In general, this means first converting the model to the STL file format, optimising the position of the model, slicing the model into layers, and finally calculating the tool-path for the printer, which is transferred to the printer as G-code instructions. [7]

2.3.1 File formats

The physical fabrication of the model is done by layering continuous slices on top of each other. The Standard Tessellation Language (STL) file format is a standard format used by AM machines as an input to generate the slices. Some of the reasons for its popularity are the simplicity of the format and the ease of file generation without requiring complicated CAD software [4]. The STL format uses planar triangular facets to approximate the surfaces of the part, which introduces errors in the part representation, especially for highly curved surfaces. This approximation leads to errors in parts manufactured by AM machines. The approximation error can be reduced by increasing the tessellation and using smaller triangles, but this increases the number of triangles exponentially, leading to increased file size and possibly slowing down the entire process. [17]

ASTM, an international standards organisation, recently introduced a new file format, Additive Manufacturing File (AMF). It utilises curved triangles based on second-degree Hermite curves. The curved triangles are sub-divided back into planar triangles for slicing, so the same approximation error might still occur. Anand et al. introduce a new file format based on Steiner patches, which are used also in the slicing stage. Steiner patches are bounded Roman surfaces that can be parametrically represented by rational Bezier equations. Steiner surfaces are of higher order, and thus the file format is more accurate. Their results show that the Steiner format is able to significantly reduce geometric dimensioning and tolerancing (GD&T) errors in parts manufactured by AM processes, compared to the STL and AMF formats. The new format is currently incomplete, with many aspects still requiring further research and development. [17]
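The trade-off between tessellation density and accuracy can be quantified with a small sketch. For a circular cross-section of radius r approximated by an inscribed regular n-gon, the maximum chordal deviation is r(1 - cos(pi/n)); the numbers below are illustrative, not taken from [17]:

```python
import math

def chord_error(radius, n_segments):
    """Maximum deviation between a circle of the given radius and its
    inscribed regular n-gon: r * (1 - cos(pi / n))."""
    return radius * (1 - math.cos(math.pi / n_segments))

radius = 10.0  # mm
errors = {n: chord_error(radius, n) for n in (16, 32, 64, 128)}

# Doubling the segment count roughly quarters the deviation, so each
# halving of the tolerance multiplies the facet count (and the STL
# file size) rather than adding to it.
assert errors[32] < errors[16] / 3.5
assert errors[64] < errors[32] / 3.5
```

This is why formats with curved primitives, such as AMF's curved triangles or the Steiner patches above, can hit the same tolerance with far fewer elements.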

2.3.2 Build orientation and placement

Build orientation, or the orientation that the object will be printed in, is one of the most important process planning tasks, since it directly affects many factors such as surface quality, build time, cost, and the complexity of the required support structures. Since the printing is done in layers and each layer is generally produced in an optimal tool-path pattern, the surface quality of the object might vary depending on the orientation. [28, 19]

Armillotta et al. developed a method for the optimal selection of build orientation based on widely accepted selection criteria. Since the build orientation is an important factor, there are many things to consider when choosing the optimal orientation. Their method consists of a two-step selection procedure, where near-optimal orientations are generated and then evaluated visually. The method was implemented in an interactive tool. Compared to existing methods, their solution focuses on improving the surface quality. Further work is still needed to take into account the different aspects in which surface quality can be specified, as well as to improve the balancing between the various selection criteria. [2]

Zhang et al. studied the orientation optimisation of multi-part production, where a group of parts in the same build chamber should be simultaneously oriented optimally. Their solution first generates a finite set of optimal alternative orientations for each part; a genetic algorithm is then applied to search for a combination of orientations that minimises the total build time and cost at a globally optimal level. Further research is still needed on many aspects of the problem. [28]

Dedoussis et al. developed algorithms for the simultaneous fabrication of multiple parts, where the layout of the parts on the printer platform requires optimisation. The solution assumes the build orientation to be already fixed. They employed a genetic algorithm technique for the 2D nesting of parts on the platform. Their solution leads to satisfactory layouts, resulting in a substantial improvement in 3D printer utilisation and thus time and cost savings. [5]

2.3.3 Support structures

Printers based on Fused Deposition Modelling (FDM) technology require support structures in order to print the models, with support structures connecting overhanging parts to lower parts of the object or to the ground. Since the support material needs to be printed first and then discarded, optimising its volume leads to savings in material and printing time. Vanek et al. introduce a geometry-based optimisation framework for reducing the need for support structures. An example of the support structures can be seen in Fig. 4. [19]

Figure 4: Left: Support structures generated by Vanek et al. Right: The finished object after the support structures have been removed. [19]

Orientation is a major factor here, causing more support to be needed if the object is not oriented optimally. Their approach starts by orienting the object into a position where only a minimal area requires support. Next, tree-like support structures are generated for the points that require support, while attempting to minimise the overall length of the support structures. Vanek et al. manage to reduce the printing time and the amount of material by over 10 % compared to previously existing solutions. [19]

According to them, the most serious limitation of the approach is that it does not provide any structural evaluation, being purely geometry-based. Also, it was developed for just one printer model, so some parameters would probably have to be varied for a more general solution. [19]

2.3.4 Slicing the model

The physical fabrication of the model in 3D printing is done in overlapping flat layers. The data for the object comes from a 3D model, usually represented by a triangle mesh. The geometric data is divided into layers in a step called slicing, and many different strategies for slicing meshes exist. According to Gregori et al., most of the current literature is concerned with issues such as the quality of the model, specific improvements in the slicing process, and memory usage. Gregori et al. approach the problem from an algorithmic complexity perspective, proposing an algorithm that is asymptotically optimal under certain common assumptions. [7]

The manufacturing speed of 3D printers is still very slow, which often limits usage. To achieve the best quality result, an object is sliced uniformly with the finest resolution of the printer. Wang et al. present an adaptive slicing method where the thickness of the layers varies while the printed result preserves the visual quality of a print made with the finest resolution. The solution might, however, cause minor visual artifacts along the vertical boundaries. [21]

2.3.5 Generating the tool-path and G-code instructions

3D printers cannot actually print an object without a special algorithm that creates the computer numerical control (CNC) instructions, commonly known as G-code. Brown et al. present an algorithm for slicing and tool-path generation, resulting in a G-code file for an entry-level 3D printer. The work is part of a larger effort focusing on the development of a low-cost AM platform. [4]
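The two steps just described, slicing and G-code generation, can be caricatured in a few lines. This is a toy sketch under strong assumptions (no degenerate triangles, no vertices lying exactly on the cutting plane, an invented feed rate), not the algorithm of Brown et al.:

```python
def slice_triangle(tri, h):
    """Intersect one triangle with the plane z = h; return the cut
    segment's (x, y) endpoints, or None if the plane misses it."""
    pts = []
    for (x1, y1, z1), (x2, y2, z2) in zip(tri, tri[1:] + tri[:1]):
        if (z1 - h) * (z2 - h) < 0:              # edge crosses plane
            t = (h - z1) / (z2 - z1)
            pts.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return tuple(pts) if len(pts) == 2 else None

def segment_to_gcode(seg, h, feed=1800):
    """Emit naive G1 (linear move) commands for one cut segment."""
    (xa, ya), (xb, yb) = seg
    return [f"G1 X{xa:.2f} Y{ya:.2f} Z{h:.2f} F{feed}",
            f"G1 X{xb:.2f} Y{yb:.2f} Z{h:.2f} F{feed}"]

tri = [(0, 0, 0), (10, 0, 10), (0, 10, 10)]
seg = slice_triangle(tri, 5.0)
assert seg == ((5.0, 0.0), (0.0, 5.0))
moves = segment_to_gcode(seg, 5.0)
assert moves[0] == "G1 X5.00 Y0.00 Z5.00 F1800"
```

A real slicer additionally chains such segments into closed contours, fills the interior with a pattern such as the crisscross fill discussed below, and computes extrusion amounts per move.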

Tool-paths are generated from the layers. Brown et al. use a crisscross-based approach to generating the tool-paths. The crisscross pattern can be made with various slope angles. The type of fill and tool-path, such as the slope angle used, can affect the fabrication time and quality. [4]

G-codes are commands that control the 3D printer, and each G-code has a distinct function. Instructions such as the printer's extruder speed and temperature also have to be considered. An example of G-code can be seen in Fig. 5. [4]

Figure 5: Sample G-code and its representation. [4]

2.4 Printing the model

There are two common methods for the actual physical fabrication of the model. Fused deposition modelling produces the model by extruding small beads of material which harden immediately to form layers. Another approach is the selective fusing of materials in a granular bed. [23]

The resulting object might have various defects, such as uneven surfaces. [20] These will be covered in more detail in section 2.5. The typical 3D printing process is done in an open-loop manner, in which the printing process is determined in advance and does not change during the process. Lu et al. sought to fix this problem in ink-jet 3D printers by developing a closed-loop layer-to-layer control algorithm, measuring the height of the printed layers and adjusting the printing accordingly. Their experimental results show that using the closed-loop algorithm improves the printing quality, resulting in more consistent shapes and smoother surfaces. [14]

Lee et al. developed a hybrid rapid prototyping system combining 3D printing and machining in a five-axis machine tool. The approach makes it possible to work on the object after 3D printing in order to achieve more accurate dimensions or a better surface finish. In addition, it removes the need for support structures and improved the build time by around 50 % in the tests performed. [12] This combined approach opens up a lot of new possibilities with the combination of added axes and machining capabilities. Getting the most out of these possibilities will require new approaches for software as well.

2.5 Post-processing

The model usually needs post-processing after it is printed. Possible support structures and any defects on the surface must be removed. The model may also be painted, since 3D printing is often done using only a single colour. Examples of artifacts caused by 3D printers can be seen in Fig. 6. [20]

Figure 6: Various artifacts caused by 3D printers: (a) Seam; (b) Hanging layers; (c) Visible layers; (d) Gaps in layers; (e) Over/under extrusion; (f) Step pattern. [20]

2.6 Multiple colours

Several printers have multiple extruders, which allows objects to be formed from multiple materials or colours. The extruders are mounted side by side on the printer. However, according to Hergel et al., the print quality suffers when objects with colour patterns are printed. The most severe issue is the oozing of plastic from the idle extruders, causing plastics of different colours to bleed onto each other. There are multiple ways to improve the quality, but they do not come without downsides. Hergel et al. aim to solve the issue with software, introducing three techniques that complement each other in improving the print quality significantly. They first reduce the impact of oozing plastic by choosing a better orientation for the part. Secondly, they build a disposable rampart in close proximity to the part, giving the extruders an opportunity to wipe oozing strings. Finally, they present a tool-path generation algorithm that avoids and hides most of the defects due to oozing and seamlessly integrates the rampart. According to them, the work makes it easier to get satisfactory results with multiple-colour prints. [8]

Zhang et al. developed a 3D true-colour printing robot, including a specialised file format and network communication protocol. They successfully printed a 3D true-colour model of the terrain of Taiwan. [27]

3 Conclusion

3D printing technology has been around since the 1980s. However, it was not until the 2000s that 3D printers arrived on the entry-level market. The 3D printing process

consists of multiple detailed steps, many of which require algorithms and optimisation problem solving. These steps include, but are not limited to, 3D scanning, customisation of the model, orienting and positioning the model, preparing the model for printing, and converting it to instructions for the 3D printer.

The biggest challenges in 3D printing technology may be on the hardware side, but software technology will also play a big role in the advancement of the technology. Even the basic steps of the process have not yet been implemented perfectly. Most research appears to focus on a single point of the process. However, the steps are connected, and changes in one will often require changes in other parts of the process as well. Combining all this research and knowledge will be necessary in order to achieve a highly advanced printing process.

3D printing applications are in demand in many fields, which all have their specialised requirements. Printing in the medical field is likely the most demanding of them all, with a great deal of specialised software needed before we are able to conquer the greatest challenges, such as printing fully functioning human organs.

References

[1] M. Acher, B. Baudry, O. Barais, and J.-M. Jézéquel. Customization and 3D printing: A challenging playground for software product lines. ACM International Conference Proceeding Series, 1:142–146, 2014.

[2] A. Armillotta, M. Cavallaro, and S. Minnella. A tool for computer-aided orientation selection in additive manufacturing processes. pages 469–475, 2014.

[3] S. Bose, S. Vahabzadeh, and A. Bandyopadhyay. Bone tissue engineering using 3D printing. Materials Today, 16(12):496–504, 2013.

[4] A. Brown and D. De Beer. Development of a stereolithography (STL) slicing and G-code generation algorithm for an entry level 3-D printer. 2013.

[5] V. Canellidis, J. Giannatsis, and V. Dedoussis. Efficient parts nesting schemes for improving stereolithography utilization. CAD Computer Aided Design, 45(5):875–886, 2013.

[6] A. N. Christiansen, R. Schmidt, and J. A. Bærentzen. Automatic balancing of 3D models. CAD Computer Aided Design, 58:236–241, 2015.

[7] R. Gregori, N. Volpato, R. Minetto, and M. Silva. Slicing triangle meshes: An asymptotically optimal algorithm. pages 252–255, 2014.

[8] J. Hergel and S. Lefebvre. Clean color: Improving multi-filament 3D prints. Computer Graphics Forum, 33(2):469–478, 2014.

[9] 3D Printing Industry. History of 3D printing: The free beginner's guide, 2014. [Online; accessed 7-February-2015].

[10] L. R. Julien Gardan. 3D printing device for numerical control machine and wood deposition. Int. Journal of Engineering Research and Applications, 4:123–131, Dec 2014.

[11] S. Klein, M. Avery, G. Adams, S. Pollard, and S. Simske. From scan to print: 3D printing as a means for replication. HP Laboratories Technical Report, (30), 2014.

[12] W.-C. Lee, C.-C. Wei, and S.-C. Chung. Development of a hybrid rapid prototyping system using low-cost fused deposition modeling and five-axis machining. Journal of Materials Processing Technology, 214(11):2366–2374, 2014.

[13] L. Lu, A. Sharf, H. Zhao, Y. Wei, Q. Fan, X. Chen, Y. Savoye, C. Tu, D. Cohen-Or, and B. Chen. Build-to-last: Strength to weight 3D printed objects. ACM Transactions on Graphics, 33(4), 2014.

[14] L. Lu, J. Zheng, and S. Mishra. A layer-to-layer model and feedback control of ink-jet 3-D printing. IEEE/ASME Transactions on Mechatronics, 2014. Article in Press.

[15] L. Madračević and S. Šogorić. 3D modeling from 2D images. pages 1351–1356, 2010.

[16] N. Notman. High precision 3D printing of metals warms up. Materials Today, 18(1):5–6, 2015.

[17] R. Paul and S. Anand. A new Steiner patch based file format for additive manufacturing processes. CAD Computer Aided Design, 63:86–100, 2015.

[18] G. Taubin, D. Moreno, and D. Lanman. 3D scanning for personal 3D printing: Build your own desktop 3D scanner. 2014.

[19] J. Vanek, J. Galicia, and B. Benes. Clever support: Efficient support structure generation for digital fabrication. Computer Graphics Forum, 33(5):117–125, 2014.

[20] J.-P. Virtanen, H. Hyyppä, M. Kurkela, M. Vaaja, P. Alho, and J. Hyyppä. Rapid prototyping - a tool for presenting 3-dimensional digital models produced by terrestrial laser scanning. ISPRS International Journal of Geo-Information, 3(3):871–890, 2014.

[21] W. Wang, H. Chao, J. Tong, Z. Yang, X. Tong, H. Li, X. Liu, and L. Liu. Saliency-preserving slicing optimization for effective 3D printing. Computer Graphics Forum, 2015. Article in Press.

[22] W. Wang, T. Wang, Z. Yang, L. Liu, X. Tong, W. Tong, J. Deng, F. Chen, and X. Liu. Cost-effective printing of 3D objects with skin-frame structures. ACM Transactions on Graphics, 32(6), 2013.

[23] Wikipedia. 3D printing — Wikipedia, the free encyclopedia, 2015. [Online; accessed 7-February-2015].

[24] Wikipedia. OpenSCAD — Wikipedia, the free encyclopedia, 2015. [Online; accessed 20-March-2015].

[25] Wikipedia. Thingiverse — Wikipedia, the free encyclopedia, 2015. [Online; accessed 19-March-2015].

[26] C.-C. Yeh. Trend analysis for the market and application development of 3D printing. International Journal of Automation and Smart Technology, 4(1):1–3, 2014.

[27] S. Zhang, H. Wang, X. Chen, H. Qian, J. Liu, and S. Lin. Design and implementation of a modular software system for a three-dimensional true color printing robot. pages 6454–6459, 2013.

[28] Y. Zhang, A. Bernard, R. Harik, and K. Karunakaran. Build orientation optimization for multi-part production in additive manufacturing. Journal of Intelligent Manufacturing, 2015. Article in Press.

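The preparation steps listed above are, at their core, geometric algorithms. As a hedged, minimal sketch (not drawn from any of the cited papers), the step of converting a model into layered printer instructions starts by intersecting each triangle of the mesh with a sequence of horizontal cutting planes:

```python
# Minimal slicing sketch (illustrative only, not from any surveyed paper):
# each triangle is intersected with the plane z = h, producing a 2D segment;
# the segments of a layer are later chained into contours and turned into
# printer instructions such as G-code.

def slice_triangle(tri, h):
    """Return the 2D segment where triangle `tri` crosses the plane z = h,
    or None if it does not cross. `tri` is three (x, y, z) vertices."""
    points = []
    for (p, q) in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
        if (p[2] - h) * (q[2] - h) < 0:        # edge strictly crosses plane
            t = (h - p[2]) / (q[2] - p[2])     # linear interpolation factor
            points.append((p[0] + t * (q[0] - p[0]),
                           p[1] + t * (q[1] - p[1])))
    return tuple(points) if len(points) == 2 else None

def slice_mesh(triangles, layer_height):
    """Collect crossing segments per layer: {layer index: [segments]}."""
    zs = [v[2] for tri in triangles for v in tri]
    layers = {}
    n = int((max(zs) - min(zs)) / layer_height)
    for i in range(1, n + 1):
        h = min(zs) + i * layer_height
        segs = [s for s in (slice_triangle(t, h) for t in triangles) if s]
        if segs:
            layers[i] = segs
    return layers
```

A real slicer would additionally chain the segments of each layer into closed contours, generate infill and support structures, and only then emit G-code; degenerate cases (vertices lying exactly on a cutting plane) are ignored in this sketch.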
MOOCs and Authentication

Jan Pennekamp
Student number: 472 706
[email protected]

Abstract

With the increasing number of Massive Open Online Courses (MOOCs) and corresponding students, the value of MOOCs for professional careers is gaining greater attention. Currently, some MOOC providers already offer certificates by using advanced authentication approaches. However, these certificates are usually not accepted officially, because the students are not authenticated properly. This paper examines existing and proposed approaches for MOOC authentication. Furthermore, it evaluates their feasibility and usefulness from the point of view of the students, the providers, and the authorities. Based on this evaluation, the paper identifies five key aspects that are important for proper MOOC authentication: the trade-off between security and usability, privacy concerns, inaccuracies, time of authentication, and costs.

KEYWORDS: MOOC, authentication, feasibility, signature track, SSO, stylometry, proctoring

1 Introduction

Massive Open Online Courses (MOOCs) first appeared in 2008. After the unexpected success (2,000 enrolled students) of the first MOOC, offered by the University of Manitoba, Canada, more Canadian institutions started experimenting with this new approach to online learning. The first approaches basically offered regular university courses to non-affiliated students in the form of video recordings and online exercise submissions. U.S.-based universities followed this trend in 2011. The first U.S. MOOC (University of Illinois Springfield) attracted over 2,500 students. From that point on, the first MOOC providers, such as Khan Academy, were founded. They also offer their own content that is not directly related to university courses. Due to the rapid growth of interest, more providers followed in 2012: for example, Coursera, edX, and Udacity. Section 2.1 gives a more detailed introduction and categorization. To highlight the growth, Coursera offers a good impression: in 2013, there were already over 3.7 million registered students [16]. The content, the target audience, and the required level of expertise of the offered MOOCs differ between providers and courses.

The main reason for students to enroll in a MOOC is obtaining knowledge that they can later reference in their professional career. In developed countries, education is the key qualification for getting a job. At the same time, the costs of education are increasing in these countries [16]. Hence, students are exploring new approaches, such as MOOCs, to improve their CV. Furthermore, there are fundamental differences when comparing MOOCs with regular university courses. The students can work through the lectures at their own speed, at their preferred location, and at their preferred time. Thus, they can arrange their studies according to their convenience. The reason why companies and universities do not accept MOOCs officially is that, in theory, anyone could have taken the course. Although some MOOC providers offer enhanced authentication and identification services, the courses are not yet accepted as replacements for university courses in a curriculum. Hence, the following questions arise: Do we really need authentication, and who benefits from it?

There are several reasons for introducing authentication. First, the existing approach of providing certificates after passing a course would be properly validated. The offered certificates would acquire an official status. Therefore, MOOCs could be used in official capacities: e.g., as selection tests or as an accepted form of higher education [17]. Second, proper authentication utilizes the benefits of MOOCs even more: large parts of the population have access to the Internet and therefore have access to "free" higher education. Naturally, providers might introduce fees to cover the costs of implementing and operating their authentication service. While this is a downside to the openness of MOOCs, the data gathered about participating students might improve compared with non-authenticated MOOCs, because more serious customers participate in the courses and, at the same time, cheating is contained more effectively. Finally, increasing the amount of gathered valuable personal information could compensate for the costs. Nowadays, utilizing personal information is part of the business model of various successful companies such as Facebook, Google, and Twitter.

2 Background

This section introduces MOOCs in general and well-known MOOC providers. Furthermore, it defines what tamper-proof authentication is and what challenges have to be tackled to ensure it. The following sections build on the presented information.

2.1 MOOCs

Massive Open Online Courses (MOOCs) are courses of study that are accessed via the Internet and are usually free of charge. Hence, they target a high number of participants. Additional fee-based services, such as certification or tutoring services, are offered as well. There are both commercial and non-profit providers. A large number of the offered courses are based on regular university courses that have been adapted (in terms of assessment) to operate as a massive open online course. Most course providers reward the participant with a certificate of participation upon completion. MOOCs are expected to have an enormous impact on higher education in the future [16].

A common sequence of events in MOOCs is illustrated in Figure 1. As discussed later, there are currently two approaches to providing authentication in MOOCs. First, the early approach offers effectively authenticated participation and examination. The identity verification occurs right after course enrollment; hence, the student is properly authenticated for the whole course. Second, the late approach only authenticates the examination securely. The course submissions before the examination are not linked with advanced authentication; hence, only a fraction of the student's commitment is authenticated properly. Finally, both approaches end with grading. Certificates are handed out whenever offered by the provider.

Figure 1: Usual Authentication Process used by Presented MOOC Providers (early approach: enrollment, identification, authenticated participation and examination, certification; late approach: enrollment, participation, identification, authenticated examination, certification)

In recent years, various MOOC providers have established themselves in the market [5]. The main characteristic of MOOC providers is their commercial focus, but the license under which they share course content also sets them apart. The major commercial MOOC providers are Coursera (https://www.coursera.org/), Udacity (https://www.udacity.com/), and Udemy (https://www.udemy.com/). EdX (https://www.edx.org/) and FutureLearn (https://www.futurelearn.com/) are examples of non-commercial providers that are affiliated with various well-known universities. Besides these, there are providers which offer their content under a free license, for instance Khan Academy (https://www.khanacademy.org/) and P2PU (https://p2pu.org/). In Finland, both the University of Helsinki and Aalto University offer courses on MOOC platforms as well; they use http://mooc.fi and Eliademy (https://eliademy.com/).

2.2 Authentication

Authentication can be divided into two subtasks [10]. The first is identity verification, which is used to link a profile a student created at a MOOC provider to a real-world identity. This verification usually has to take place only once, because afterwards the provider stores this matching. A common approach is to check the submitted information against an ID card or student records. This only results in successful verification if the records are available to the provider. The second is identity authentication, which is used to check that the verified student is actually the student participating in the MOOC. Ideally, this authentication should take place whenever the student submits information that is used for grading. This check is necessary to ensure that the enrolled student is fulfilling the course's tasks. Currently, identity authentication is mostly performed based on rather weak criteria, such as password authentication. This mechanism does not prevent cheating at all. However, there are situations (e.g., exams) for which the providers require stronger authentication. In the following, we will refer to both subtasks as authentication.

The stakeholders that are interested in authentication can be divided into three major groups. The first group are the students, who have an interest in ensuring that their performance is valued and accepted in their subsequent professional career. For that reason, many students will even accept privacy-intrusive mechanisms for authentication, which will be presented in Section 3. A second group of stakeholders are companies and universities that want to be able to officially acknowledge the effort some applicants invested in the past. With the rising number of people wanting to pursue an academic education, MOOCs would be an alternative to traditional selection tests. As MOOCs offer academic education to a new part of the population, they could be very useful if the involved parties would credit the invested effort. A third group are the MOOC providers, which are interested in providing proper authentication to their customers, because more students ultimately mean more value creation. In addition, the commercial providers have the chance to charge their customers for these authentication services. The value of MOOCs to the customer mostly correlates with the acceptance of the certificates by companies and universities.
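The two subtasks can be summarized in a minimal provider-side sketch; all names are illustrative and do not correspond to any real MOOC platform's API. The key point is that identity verification is stored once, while identity authentication must be repeated for every graded submission:

```python
# Illustrative sketch of the two authentication subtasks (all names are
# hypothetical, not a real provider API). Verification links an account to a
# real-world identity once; authentication is re-checked per graded submission.

class Enrollment:
    def __init__(self, account):
        self.account = account
        self.verified_identity = None     # set once by identity verification
        self.graded_submissions = []

    def verify_identity(self, real_world_name):
        """Subtask 1: link the account to a real-world identity (done once).
        A real provider would check an ID card or student records here."""
        if self.verified_identity is None:
            self.verified_identity = real_world_name

    def submit(self, work, authenticated_as):
        """Subtask 2: accept graded work only if the current session is
        authenticated as the verified identity (e.g. typing pattern, photo)."""
        if self.verified_identity is None:
            raise PermissionError("identity was never verified")
        if authenticated_as != self.verified_identity:
            raise PermissionError("session not authenticated as enrollee")
        self.graded_submissions.append(work)
```

The sketch also mirrors the weakness discussed above: with password-only authentication, `authenticated_as` is trivially satisfied by anyone who knows the credentials.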


2.3 Challenge

The general challenge is to determine how tamper-proof the authentication and identification offered by MOOC providers for their customers has to be. This raises the question at what point a MOOC is no longer a MOOC, because, for example, the person has to be physically present at the MOOC provider to verify his identity. At which point does the student stop using MOOCs, be it due to privacy issues or authentication overhead? Will it be acceptable for the stakeholders that there are still ways to obtain a certificate with illegal help or cheating, while the provided certificates are accepted as valid proof of accomplishment?

Section 3 considers some answers to these questions by presenting both existing and proposed approaches. Whenever possible, the paper refers to the major MOOC providers that were listed in Section 2.1.

3 MOOC Authentication

Currently, there is no solution to the challenge of proper authentication for MOOCs, because the existing authentication approaches detract from some of the advantages introduced by MOOCs: either they are not accessible to all interested students, or they are not exclusively on the Internet, or they are not free. The remaining approaches are not judged as sufficiently trustworthy; otherwise, the number of officially accepted MOOCs would be significantly higher. The proposed solutions either are not usable in their current state or are too costly to deploy. Nevertheless, we will present these approaches and discuss the corresponding advantages and drawbacks.

3.1 Existing Approaches

The most important existing approaches the providers use for authenticating students in MOOCs are presented in this section, together with the identified drawbacks of each approach.

Proctored Exams is the most traditional approach when comparing MOOCs to regular university courses. In order to participate in a proctored exam, the student has to enroll for the exam and then take an exam which is offered on-site at the MOOC provider or in one of the regional testing centers. Before taking the exam, the student's identity is verified in the same way as in traditional university exams [14]. Providers which offer this approach are edX and Udacity.

This approach seems to be quite "secure" in ensuring that the right student takes the exam. However, there are certainly some disadvantages as well. First, these proctored exams are only offered at a limited number of locations at certain times; this possibility is usually limited to the United States. Furthermore, this approach is costly for the student: Udacity charges $89 per exam [14] and edX $95 [1]. Hence, the openness and online nature of MOOCs are negated. Students from around the world do not have the money to participate in these proctored exams (travel, distance). For an already employed student, the fixed time is an issue as well. Overall, the student's flexibility is limited. However, the MOOC provider also has expenses, because proctors have to be available to invigilate the exams. Finally, this approach is not completely secure, as there is no way to verify that the student participating in the exam is the same student who participated in the MOOC. This might be an issue for a traditional university course as well.

(Coursera) Signature Track is another approach, introduced by Coursera in 2013. Signature Track uses the student's typing pattern to identify him. A special phrase is used to create the typing pattern, which should uniquely identify the student in later sessions. The typing pattern is linked to the student's account as soon as the student verifies his identity with an ID; this identity verification is performed using a webcam. Whenever the student wants to use the platform, he has to authenticate by typing the special phrase and taking a current picture with a webcam [7].

This approach is used to authenticate the student throughout the course. However, there is no protection against the contribution of "third parties", such as fellow students or the Internet, throughout the course or in examinations [1]. As it is not possible to verify the typing patterns individually, this step is performed by software. Additionally, the picture verification of whether the same student is currently participating in the course is done automatically as well. This process reveals multiple drawbacks. First, it has not been extensively evaluated how accurate or inaccurate the typing and photo matching is [10]: Can another student imitate the typing pattern? Krause [8] states that he was able to develop an implementation which is robust against imitation attempts. Is a photograph in front of the webcam sufficient for authentication? Do varying lighting conditions affect authentication? Second, not every student participating in MOOCs has access to a webcam whenever studying. Third, Signature Track costs $30 to $100 per course [1]. These points clearly limit the course's reach. Finally, students might not accept such a privacy-invasive approach.

Single-Sign-On (SSO) is another approach to offer authentication to the student participating in a MOOC. In theory, this approach works well, because the student does not have to register with each provider, but rather can use an existing account (identity) to authenticate [15]. However, in practice, there is no global single-sign-on solution that offers proper authentication. Hence, only students known to and already identified at the provider can log in. As most MOOCs were offered by universities, their students could use an SSO system to authenticate: their university credentials are used for granting access. A related method which is open to the public is OpenID, a decentralized authentication system that is used to provide user information to co-operating providers. P2PU offers login authentication based on OpenID.

Major MOOC providers no longer offer an SSO based on university credentials. Eliademy, for example, grants access to users having a Microsoft Live account. EdX implements this approach for Google and Facebook accounts. While this approach does not restrict the target audience significantly, it does not provide proper authentication, because the identity is not really verified. The SSO approach basically limits access to MOOCs to a certain group that is already identified; currently, this is usually the case for regularly enrolled students. Hence, MOOCs offering authentication by SSO are not really open. Another drawback is that there are multiple SSO providers available. Cooperation among them seems unlikely, and thus the student might not have access to all MOOC providers. Furthermore, this approach might authenticate the student, but it has not been designed to prevent cheating: account sharing is still possible, and it is not possible to check whether the authenticated student actually dealt with the tasks. Nevertheless, it is remarkable that this approach is usually free of charge.

Online Proctor Services are similar to the proctored exams approach presented above. In detail, these companies offer exam supervision through a webcam. Before taking the exam, the student is identified with his ID. Then the student is monitored while taking the exam [16]. Enhanced solutions broadcast the student's screen to the proctoring service as well in order to confine cheating. ProctorU (http://www.proctoru.com/) is one example. It offers proctoring services to various universities and the MOOC provider Udacity. The price is $15 per hour.

Online proctor services seem to be a real alternative to proctored exams, because they enable the exam to be taken from almost any location. However, this requires a lot of personnel, because every student has to be supervised by a single proctor. Assuming the used software is secure, this approach is the most suitable one to prevent cheating. However, due to the price, it is only feasible for exams and not for frequent use. Hence, the course assignments can still be handed in by a different participant than the student taking part in the exam. Another advantage of this approach is that the proctor can even offer individual help to the student.

Most current approaches still do not authenticate the student's identity effectively. Real-world identity proof is only guaranteed in proctored exams and not throughout the course. Hence, any other assignment is prone to either account sharing or at least collaboration. However, these are problems regular university courses have to deal with as well. Nevertheless, the required physical presence of (non-online) proctored exams is a major drawback.

The other providers presented in Section 2.1 have not relied on a special authentication approach so far. Nonetheless, Eliademy, Khan Academy, and Udemy offer a digital certificate of participation; FutureLearn offers a hardcopy for a fee. P2PU pursues a different approach: the participants are encouraged to track their invested time and submit the overall time spent. Then they obtain official certification of the number of hours invested. Furthermore, P2PU proposed that all participants could sign each other's certificates to convey credibility. Internally, the students can use a badge system to present their achievements to other students [3]. The provider MOOC.fi does not offer any certificates at this point. A summary of advanced authentication mechanisms and the possibility to obtain a certificate is given in Table 1.

Table 1: Current Authentication Approach used by Presented MOOC Providers

Provider      | Adv. Authentication                       | Certificate
------------- | ----------------------------------------- | -----------
Coursera      | Signature Track                           | digital
Udacity       | Proctored Exams, Online Proctor Services  | digital
Udemy         | -                                         | digital
edX           | Proctored Exams                           | digital
FutureLearn   | -                                         | hardcopy
Khan Academy  | -                                         | digital
P2PU          | (Single-Sign-On)                          | self-made
Eliademy      | -                                         | digital
Mooc.fi       | -                                         | -

Overall, all providers rely on login credentials which consist of an email address and a password. Most providers do not require the student's name during the registration process, but instead a user-chosen display name. The student has to have access to the email address, because the providers require the students to verify it. Other than that, no special security measures are taken.

3.2 Upcoming Approaches

This section introduces and evaluates newly proposed approaches for authentication in MOOCs. The included approaches differ greatly: some are only theoretical (e.g., multi-level authentication), some have been used in other areas (e.g., queries on personal identifiable information), and some exist only as proof-of-concept implementations (e.g., stylometry). Unfortunately, there is no solution which solves the existing problems completely.

Biometric/Voice authentication has been suggested to identify students. This is an approach similar to Coursera's Signature Track. In the proposed scenarios, the authentication relies on (unique) biometric features to authenticate the student. Likely features for authentication include fingerprints, the student's face, voice recognition, and the already deployed typing pattern [2]. However, for these authentication features the problem of initial registration and identification exists. For example, comparing the student's picture with his ID card's picture could be the setup for face recognition. Again, this proposed approach is only feasible if every participant has access to a microphone, a fingerprint reader, or a camera, respectively. Another challenge is the processing and storing of private data and information. Some students might not be content with sharing this amount of personal identifiable information (PII).

Until now, there have not been any serious attempts to offer authentication based on biometric features other than the typing pattern. With the increasing availability of the required sensors, this is expected to change. A likely scenario would include the student's smartphone in the authentication process, because it usually includes all necessary sensors.
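Coursera has not published the matcher behind Signature Track, so the following is only a rough sketch of how a typing-pattern (keystroke-dynamics) check of a fixed phrase could work; the timings and the tolerance threshold are invented for illustration:

```python
# Hedged sketch of typing-pattern matching in the spirit of Signature Track
# (Coursera's actual algorithm is not public). A fixed phrase yields a vector
# of flight times (delays between consecutive key presses); a new sample is
# accepted if its mean absolute deviation from the enrolled profile is small.

def flight_times(press_timestamps_ms):
    """Delays (ms) between consecutive key presses of the special phrase."""
    return [b - a for a, b in zip(press_timestamps_ms, press_timestamps_ms[1:])]

def matches_profile(profile_ms, sample_ms, tolerance_ms=40.0):
    """True if the sample's timings deviate little from the enrolled profile.
    The 40 ms tolerance is an invented value, not an evaluated one."""
    if len(profile_ms) != len(sample_ms):
        return False
    deviation = sum(abs(p - s) for p, s in zip(profile_ms, sample_ms)) / len(profile_ms)
    return deviation <= tolerance_ms

# Enrollment: the student types the special phrase once (timestamps in ms).
profile = flight_times([0, 110, 260, 390, 520])
```

Even in this toy form, the open questions from above are visible: the threshold trades false rejections of the legitimate student against acceptance of imitators, which is exactly the inaccuracy concern discussed in Section 4.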


Stylometry pursues the idea of identifying students based The reason for proposing this approach is that secure au- on their linguistic skills throughout the course. Hence, the thentication might not offer sufficient usability to the student. student’s submissions are constantly compared to a known Nonetheless, Miguel et al. [11], who proposed this concept, linguistic style. Features for classification include word even suggest that multi-factor authentication might be intro- length, vocabulary richness, word shape, and frequency of duced to combine multiple weaker approaches. They believe characters and function words. In general, this approach suf- that alternating authentication methods helps to identify un- fers from the same problems as the previously introduced intended access or even cheating. Signature Track: Initial identification and a text source for However, there is no implementation available yet. Fur- indexing a linguistic style are the main open challenges. thermore, the interaction between different authentication Currently, Narayanan et al. [13] claim that they are able to methods remains to be evaluated. Hence, it is difficult to narrow down possible writers based on a single text of each estimate, whether this concept is feasible for advanced writer. With an increasing amount of text, the probability MOOC authentication. of identifying individuals increases. Nevertheless, they are aware that the current approach is not feasible yet to provide Another proposed approach is described in Patent authentication in MOOCs. For that reason, they recommend US20140157371 A1 [9]. The authentication should be per- to link their proposed approach with another form of authen- formed based on the student’s metadata. It describes how tication. Additionally, they recommend follow-up evalua- a provider can setup a secure testing environment bundled tions to improve their classifier. Overall, the effect on pri- with student authentication. 
They attempt to identify the stu- vacy remains to be evaluated on a large scale. They warn that dent with images as well. Additionally, they monitor browser due to the gathered features, unintended application might metadata and other aspects to detect cheating. Unfortunately, offer the possibility to identify the author of nearly every a publicly available implementation is not available. Never- published text [13]. theless, patenting shows how important solving the issue of insufficient authentication is. Queries on PII as an approach has not been evaluated in context of MOOCs yet and it is unlikely that it will be ap- plied in the future. Basically, the idea is that the student 4 Discussion authenticating, has to answer questions only the registered student could answer [6]. However, this form is considered As seen in the previous sections, most approaches are insuf- to be insecure, because nowadays, PII can be found in social ficient for effective authentication for MOOCs. We identi- networks such as Facebook or Twitter. Another problem for fied the most important aspects which can be used to evalu- the providers is how to generate an initial “database” of ques- ate each approach: trade-off between security and usability, tions to base the authentication on: If the student has to pro- privacy concerns, inaccuracies, time of authentication, and vide this information, there is no identity check performed. costs. Currently, there is no solution available that satisfies Furthermore, this approach is rather privacy invasive. If it each of the five aspects. Nevertheless, the most important would still be considered secure, the queries would have constraint regarding MOOC authentication is the security to include information the student would not share under [4], which is (at least indirectly) included in all presented usual circumstances. This raises the question, why should aspects. Next, we go through each aspect in more detail. the MOOC provider get access to it. 
Trade-off: Security and Usability The first aspect we Peer supervision is related to online proctoring services. identified deals with the trade-off between security and us- Each student is assigned a peer and it is expected that the ability. The main questions that have to be answered are assigned partners authenticate each other. This approach is as follows. How much time and effort is the student will- mainly based on trust and might work, because the peers are ing to spend for authentication; How much information is assigned randomly. Myers [12] even expects learning ben- the student willing to share with the provider or the authen- efits from this approach due to the encouraged knowledge tication authority. If the security is too weak, authenticat- exchange. Unfortunately, it is unlikely that official institu- ing the student is useless, because it is simple to circumvent tions will accept this trust based approach as it is prone to the security. If the security is too strong, the student might cheating. Another drawback is that the peers have to study be discouraged to participate in the MOOC. The invested for the MOOC at the same time to authenticate each other. effort might not be worth the participation. The providers A solution is to introduce multiple peers. However, this in- need to find a balance to encourage the students to authen- creases the probability of cheating. ticate themselves, while not being able to compromise the approach. We will cover the provider’s costs of storing and obtaining information separately in a following paragraph. Multi level authentication aims to improve the authenti- cation by offering a set of approaches for different situations. It is not a method independently supporting authentication. Privacy Concerns Another important aspect, which was The provider should adapt the level of security based on the already introduced in the trade-off between security and us- accessed resource. 
For example, it is more important that ability, is the amount of personal identifiable information the student is effectively authenticated when taking an exam that the approach requires. Basically, the question is how than when submitting an assignment. Hence, the authen- much user data has to be gathered to provide the authen- tication approach is assigned with respect to the use case. tication method and how much data the student is willing

Aalto University T-110.5191 Seminar on Internetworking Spring 2015

to share. In that regard, well-educated students might be more strict about sharing that kind of information with the provider. Especially in the case of "free" MOOCs, the providers might try to increase their revenue by utilizing privacy-sensitive information. Related is the question of how long the provider stores the data and whether it will be used again if the student participates in another MOOC of the same provider. Furthermore, the student might not be willing to share his PII with a third party responsible for the authentication, as courses with online proctoring currently require. As seen before, there are even approaches available, for example stylometry, which might be able to identify the student on the Internet.

Inaccuracies: Related to the security properties of the approach are the inaccuracies it features, and this is a vital criterion. Most solutions compare data of the current session to information that was gathered in earlier sessions or to documents that have been issued in the past. The crux is at which point the used approach is too inaccurate to lead to a rejection of MOOC certificates. If a used approach is too inaccurate, the usefulness of the issued certificates will decrease significantly. Furthermore, students might switch to another provider if, occasionally, the authentication system does not recognize them. Therefore, the provider has to evaluate how error-prone the applied authentication mechanisms are, for the sake of both legitimacy and usability.

Time of Authentication: Time of authentication describes at which stage of the MOOC the authentication and identification of the student take place. There are approaches which are feasible for course-long authentication, while others might only be used for authentication before an exam. The provider has to decide when the authentication and identification should take place. Usual stages are, for example, upon course enrollment, before the exam, or for each course submission. However, this mostly depends on the used technology: some approaches have a learning curve, hence they need to be used from the outset. Other approaches are too costly to be used frequently.

Costs: The cost aspect is important for both the student and the MOOC provider. The main question is how much they are willing to pay for proper authentication and MOOC certificate acceptance. Non-profit providers might be able to offer the same authentication for a lower fee than commercial ones, because they do not have to add a profit margin. However, the students might still be discouraged from participating if the fees are too high. The drop-out rate of MOOCs shows that not every student who enrolls finishes the course. For that reason, providers started to offer authentication as an additional service, which the student can pay for if desired. Related to this aspect is the distinction between variable and fixed costs: does a provider have to charge a student who already took an authenticated course the fixed costs again, or does the student only have to pay the variable costs? In general, providers have to keep in mind that introducing any kind of fee decreases the number of likely participants. Furthermore, fees introduced by providers are not the only costs for the students; there are also the costs for hardware that is required for the authentication, e.g. a webcam, a microphone, or a fingerprint reader.

Our identified aspects for effective authentication in MOOCs cover all areas that are important for the stakeholders. While privacy concerns mostly affect the students, the time of authentication is important for the authorities accepting the certificates. The costs and the trade-off between security and usability are important for both students and providers. Finally, all stakeholder groups are affected by the inaccuracies a solution might exhibit.

5 Conclusion

As we have seen in the presented approaches, there is currently no solution available that satisfactorily solves the authentication problem in MOOCs. Most current approaches convey security to the stakeholders. Even most of the proposed solutions feature drawbacks; thus they are not really feasible for guaranteeing the student's identity. Either the solutions are not usable, or too inaccurate, or too insecure, or just too costly. Another important aspect deals with the time of the student's authentication, i.e., which part of the course is (effectively) authenticated. Furthermore, students might be discouraged from participating because the deployed approach is too privacy-invasive.

It is beyond dispute that authentication for students of MOOCs is an important factor. It remains to be seen how the acceptance rates of MOOC certificates will develop with the presented approaches, or whether one of the broad number of MOOC providers manages to create an entirely new concept. Authentication in MOOCs is a research topic that many scientists currently look into. As of today, the major research breakthrough has not yet occurred. Surely, official acceptance by universities or companies will increase the boom of MOOCs even more: MOOCs can generate plenty of economic value and revenue.



How dense are cell towers? An experimental study of cell tower deployment

Ashok Rajendran
Aalto University School of Science
[email protected]

Abstract

In the past five years, there has been an amazing growth in the computational capability of smart phones when compared to the phones of ten years ago. This rapid development has impacted mobiles with respect to power consumption and performance, because the capacity of batteries has failed to keep pace with the other advances. To overcome these constraints, much research was performed to re-design mobile applications such that power consumption is minimal. Cell-id positioning is one such advancement in location positioning, where location is determined based on a phone's signal strength vis-à-vis nearby cell towers. For an efficient cell-id positioning implementation, we first need to analyze the signal strength distribution in a particular area to find the density of cell towers. We deployed a platform to collect cellular network information, including cell-id and signal strength, and we gathered data from different locations in the Helsinki area. This paper analyses the collected data and compares the density of cell towers in different areas of Helsinki. It also discusses several interesting phenomena observed from the data with respect to cell tower deployment.

KEYWORDS: cell-id; positioning; cellular network

1 Introduction

In recent years, the functionality of mobile phones has increased rapidly. While still primarily intended for communication purposes, phones also possess PC-like capabilities. Various functionalities of a smart phone, such as voice communication, SMS and media playback, consume battery power. This makes the need for efficient energy management essential, and it has prompted academia and industry to explore ways to improve the energy efficiency of mobiles. Research covers important aspects such as improving the transmission efficiency of wireless interfaces, designing smart interaction methods and ingenious algorithms to reduce the working time of the screen, CPU and sensors, and inventing new materials and power-saving components [18].

A recent study by Carroll and Heiser [1] reveals that wireless communication interfaces, mainly the cellular network interface, contribute considerably to the overall power consumption of a mobile. According to Schulman et al. [13], the power consumption of the cellular network component depends on the signal strength. A poor signal consumes more power and vice versa. This has led to the development of many applications based on signal strength, which in turn consume less power in mobiles. One major application is mobile positioning.

A positioning system is used to find the location of a mobile and, having found it, many location-based services are provided to it. A range of positioning systems exists, such as indoor and outdoor, depending on their usage scenarios. Indoor positioning systems usually exploit Wi-Fi access points and sensor networks found inside a building to locate the mobile. Outdoor positioning locates mobiles in outside environments; based on the underlying techniques used, these systems are divided into three categories: satellite-based, triangle-based, and cell-id aided positioning systems.

1.1 Satellite based positioning system

In a satellite-based positioning system, the mobiles are located in outdoor environments based on the signals acquired from satellites. Mobiles need a specialized radio receiver to receive, recognize and analyze the satellite signals. This method can cover most of the surface area of the Earth, as well as near-Earth space [18]. A few of the current satellite positioning systems are the Global Positioning System (GPS), GLONASS [10], Galileo [3], and the Beidou navigation system. Among these, GPS is fully operational and widely used by many people. With a GPS receiver enabled in a phone, a user will obtain his longitude and latitude after acquiring information from more than four satellites. GPS has the advantage of high accuracy, where the accuracy of positioning varies from several meters to tens of meters. Thus, the GPS receiver is commonly used in aircraft, navigators, mobile phones, tablets and wearable devices. However, it also has some weaknesses. First, users have to wait 10 to 30 seconds before they acquire coordinates from satellites. If the environment is complex, with many obstructions between the device and the satellites, then the initial searching process will be even longer. Second, GPS receivers are battery-intensive and consume more power when enabled. Many experiments revealed that the GPS component in smartphones consumes more than 300 mW of power [4][9]. Thus, this method presents a problem for mobiles due to their limited battery capacity.

1.2 Triangle based positioning system

In a triangle-based positioning system, the mobiles are located by estimating the distance between the mobile and three nearby base stations. Using these values, a triangulation algorithm

Aalto University T-110.5291 Seminar on Network Security Autumn 2014

locates the exact position of the mobile. This positioning system can be further classified into three methods, namely the time of arrival (ToA) method, the angle of arrival (AoA) method, and the time difference of arrival (TDOA) method [18]. Existing observed time difference (E-OTD) [14] is one of the TDOA methods, which is widely used in the US's Enhanced 911 (E911) service Phase II implementation [2] for GSM networks. In a triangle-based positioning system, the accuracy of distance estimation is affected by many factors, such as multipath fading and channel conditions. This system has some disadvantages, including low positioning accuracy and the need for extra infrastructure for deployment. These problems reduce the usability of this system.
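To make the lateration idea concrete, here is a minimal numerical sketch (entirely illustrative: the coordinates, ranges and function name are invented here, and a real system must cope with noisy range estimates). Subtracting the first circle equation from the other two yields a linear system in the unknown position:

```python
import math

def trilaterate(stations, dists):
    """Estimate (x, y) from three known station positions and
    range estimates by linearizing the three circle equations."""
    (x1, y1), (x2, y2), (x3, y3) = stations
    d1, d2, d3 = dists
    # Subtracting circle 1 from circles 2 and 3 gives A @ [x, y] = b.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21  # non-zero when stations are not collinear
    return ((b1 * a22 - b2 * a12) / det,
            (a11 * b2 - a21 * b1) / det)

# A mobile at (1, 1) with exact ranges to three base stations:
x, y = trilaterate([(0, 0), (4, 0), (0, 3)],
                   [math.sqrt(2), math.sqrt(10), math.sqrt(5)])
print(round(x, 6), round(y, 6))  # 1.0 1.0
```

With noisy measurements the three circles do not meet in a point, which is one reason this family of methods suffers from the accuracy problems mentioned above.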

1.3 Cell-id aided positioning system

Cell-id positioning locates the position of the mobile with the help of the cell id from the received signal. This method was recommended by the 3rd Generation Partnership Project (3GPP) [19], as it is energy efficient and also does not require extra infrastructure or software modifications. Jonas Willaredt et al. [17] analysed the standards and protocols used in cell-id positioning systems. This paper shows that there is no need for extra infrastructure for implementing this positioning system. Thus, the implementation of this method is very cheap and does not require any further upgrades. The energy efficiency of a cell-id based system was studied by Jeongyeup Paek et al. [11]. A prototype of a cell-id based system is developed and tested in this paper. The result reveals that this system consumes less power than a GPS positioning system. However, the accuracy of the cell-id based method strictly depends on the size of the cell sectors. Stephan von Watzdorf et al. [16] studied the accuracy of different positioning systems using smartphones. Their study revealed that the accuracy of positioning in a cell-id based system is above 500 metres, whereas the accuracy is below 300 metres for a GPS system. Thus, the cell-id positioning system works efficiently in small cells and not in large cells. This system is best applicable in urban areas, as they have many small cells.

Due to its energy-efficient approach to positioning a mobile and its applicability in urban areas, we analyse this system further. The received signal strength plays an important role in this system in locating a mobile in a particular cell. Received signal strength depends on the deployment of cell towers and the density of cell towers in a particular area. This paper analyses the density of cell towers and their deployment by collecting signal strengths from different locations of the Greater Helsinki area and performing a comparative study for a deeper understanding.

The rest of the paper is organized as follows. Section 2 describes the signal collecting platform that was used in this experimental study for collecting signal strength. Section 3 describes the experiments that were performed in the Helsinki city center, specifically Kamppi, whereas section 4 discusses the results of the experiments and the interesting phenomena observed in the cellular network. Section 5 compares the results obtained from Otaniemi and the Kamppi city centre, and the last section discusses future work on the signal collecting platform and cell-id positioning.

Figure 1: App for collecting signal data

2 Signal collecting platform

The density of cell towers in a given location can be calculated by collecting signal strengths from mobiles at that location. To collect the signal strength from the mobile, we need a mobile application. An Android application has been developed by Jun Wu [18] for this purpose. Figure 1 illustrates the app, which collects data by a process called wardriving [8]. Wardriving is the act of searching for Wi-Fi wireless networks by a person in a moving vehicle using a portable computer or smart phone. Wardrivers use a Wi-Fi-equipped device together with a GPS device to record the locations of wireless networks. The results are then mapped with respect to their geolocation in Google Maps. We used the same approach in our experiment, where we collected cellular network information instead of Wi-Fi networks.

A signal collecting app collects data by enabling GPS in mobiles to obtain a reference position. Then, it updates real-time cellular network information (cell id and signal strength) once per second. Furthermore, this application stops collecting data when GPS becomes invalid, e.g., when walking into indoor environments [18]. In addition to cell id and signal strength, this app also collects data such as latitude, longitude, and time stamp. The data collected are listed below:

• time: timestamp while collecting the sample;
• latitude: latitude value in the WGS84 standard;
• longitude: longitude value in the World Geodetic System established in 1984 (WGS84) [5];
• gpsAltitude: height (h) above the reference ellipsoid that approximates the earth's surface [6];


• cellid: unique number used to identify each base transceiver station (BTS);
• lac: location area code;
• ss: average signal strength.

When the application is activated, it collects the above cellular information and updates it in a file. The file is in CSV format and is stored in the file path allocated for this application. Each row in the file is called a record. Figure 2 shows sample records in a file. At a particular location, if a device can connect to more than one base station, then a corresponding number of records is generated in the file. Whenever the application is switched off, it stops updating the file. Later, this file is copied from the mobile and processed to determine the density of cell towers.

Figure 2: Sample records

3 Experimental design

After the implementation of the signal collecting application, we started to collect data in Helsinki. We already had analyzed results from Otaniemi, which is a suburban area of Espoo. In this experiment, we collected data from the Helsinki city center and compared it with our previous results. Our motivation was to find the distribution of cell towers in the urban areas of Helsinki, where the population is dense. Therefore we chose areas near Kamppi, which is the city center. Figure 3 shows the streets and pedestrian routes covered by our experiment.

Figure 3: Path of our experiment

To get accurate results, we chose a single mobile operator for our analysis. Operator Elisa was used throughout our experiment to collect data. We gathered datasets by performing experiments with different mobiles many times. These mobile phones were the Samsung Galaxy Nexus and the Samsung Galaxy S4. Figure 4 shows the specifications of these mobiles. We also considered the network type used by the phones while gathering our data. Records were collected separately using the 3G network and the GSM network.

Figure 4: Statistics of collected data

4 Results and Observation

We collected a total of 17,505 records at the end of the experiments from the different mobile phones. Figure 4 shows the exact number of records collected from each mobile phone. Analysis of these data showed some interesting phenomena with respect to the density of cell towers. The experiment was performed in the Kamppi city center area, whose diameter is approximately 2 kilometers. However, in this small region we detected nearly 74 different cell-ids for a single operator, Elisa. If the samples of all mobile operators are considered, then the total number of base stations deployed in the city center will be even higher. Generally, it is considered that cell size is measured in kilometers, but the above result shows that there are 74 base stations within this small area. This reveals that the density of cell towers is high in urban areas.

4.1 Distribution of base stations

We analyse the results further to find each cell-id's coverage, which reveals more interesting facts. For a given cell-id, the number of records found in the dataset represents the "perceived coverage area" of that cell-id [18]. Figure 5 shows the number of records and the number of base stations on the X-axis and Y-axis respectively. The graph depicts the size distribution of the cell-ids' "perceived coverage area". Further analysis of the graph shows that half of the base stations at Kamppi cover fewer than 200 points in the dataset. However, there is also one large cell which has approximately 800 records.

4.2 Small cells and large cells

Cells can be divided into five types based on their dimension, namely nanocell, picocell, microcell, small macrocell and large macrocell. The size of these cells varies from a few meters to kilometers. The size of a microcell is in the range of 0.1 to 1 km, whereas a small macrocell's range is 1 to 3 km [15]. Figure 6 shows one large cell and one micro cell found in the Kamppi area. Measurements in the Kamppi area show that most cells are either small macrocells or microcells. This observation reveals an important fact that
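The counting behind this "perceived coverage" distribution is straightforward. The sketch below is an illustration only (the cell-ids and record counts are made up, not taken from the actual dataset): it tallies records per cell-id and reports the share of cells observed in fewer than a given number of points.

```python
from collections import Counter

def coverage_distribution(cell_ids, threshold=200):
    """Tally records per cell-id (its 'perceived coverage area') and
    compute the share of cells seen in fewer than `threshold` records."""
    counts = Counter(cell_ids)
    small = sum(1 for n in counts.values() if n < threshold)
    return counts, small / len(counts)

# Toy record stream: each entry is the cell-id of one collected record.
records = ["396046"] * 800 + ["426737"] * 150 + ["400123"] * 40
counts, small_share = coverage_distribution(records)
print(counts["396046"], round(small_share, 2))  # 800 0.67
```

Run over the real CSV records, the per-cell-id counts are exactly the quantity plotted in Figure 5.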


the area of coverage of each base station has decreased a lot in urban areas. Reduced cell size improves the accuracy of positioning mobile phones in cell-id based methods and also decreases the positioning error [18]. In addition, we observed one more phenomenon, where large cells are perceived as several small cells. There are many reasons for this behavior. In this case, buildings block signals in the middle of the road; hence cell 396046 is divided and shown as three sections in figure 6a.

Figure 5: Distribution of base stations

Figure 6: Types of cells: (a) large cell (396046), (b) micro cell (426737)

4.3 Intersection of multiple base stations

The third observation from the dataset reveals another fact: a smart phone may connect to more than one base station at a given position. Figure 7 depicts the coverage of multiple base stations in the Kamppi area. The red points in this figure show positions where only two cell-ids were detected. Green points represent positions where exactly three cell-ids were detected, and blue points mean that more than three cell-ids were detected at that position. It is observed that around 42 percent of the locations in the given area detected two base stations, whereas the remaining 58 percent of locations detected three or more base stations. These results further confirm that a phone may connect to more than one base station. However, this observation contradicts network planning theory. According to this theory, cells are designed with a hexagonal shape such that the limited frequencies are used efficiently [12]. A base station is located in the center of each cell, so a smartphone will always connect to the closest base station to obtain the strongest signal strength. In other words, given a position, the optimally connected cell-id is unique. However, our result showed that at a given position, a smart phone may detect more than one cell-id [18]. We also found that the base stations detected by phones of different network types differ. This means that at a particular location, the cell id detected by a 3G network phone is distinct from the cell id detected by a GSM network phone. The collected datasets clearly show this result.

Figure 7: Coverage of multiple base stations

5 Comparison and Discussion

This section compares the datasets of the Kamppi and Otaniemi areas. Similar experiments were performed in the Otaniemi area, and the data were collected earlier as part of the thesis work by Jun Wu [18]. These datasets are used for our comparative study. Figure 8 shows the paths covered in Kamppi and Otaniemi. Both paths cover the same length of around 3.5 kilometers. For this same distance, the number of base stations detected in Otaniemi was around 67 [18], whereas in Kamppi it was 74 cell-ids. This shows that the density of cell towers in urban areas is slightly higher than in suburban areas. However, the distribution of base stations is similar in both Otaniemi and Kamppi. Figure 9 shows the comparison of the base station distribution in both Otaniemi and Kamppi. We can observe that more than half of the cell-ids detected in both areas cover fewer than 200 points in the dataset. Additionally, most cells detected in these areas are either small macrocells or microcells. We also observe that a large cell is perceived as several small cells in both areas. Comparison of the datasets also reveals the common fact that a smart phone may connect to more than two different towers at some locations.

This comparative study reveals that cell sizes
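The per-location classification above (two, three, or more detected cell-ids) can be reproduced with a simple grouping pass. This is an illustrative sketch with made-up coordinates, not the actual analysis code used for the dataset: records are bucketed by rounded latitude/longitude, and the distinct cell-ids seen in each bucket are counted.

```python
from collections import defaultdict

def cells_per_location(records, ndigits=4):
    """Group (lat, lon, cellid) records by rounded coordinates and
    count the distinct cell-ids observed at each location."""
    seen = defaultdict(set)
    for lat, lon, cellid in records:
        seen[(round(lat, ndigits), round(lon, ndigits))].add(cellid)
    return {loc: len(ids) for loc, ids in seen.items()}

# Two sample points: one hears two towers, the other hears three.
records = [
    (60.1686, 24.9315, "396046"), (60.1686, 24.9315, "426737"),
    (60.1702, 24.9331, "396046"), (60.1702, 24.9331, "400123"),
    (60.1702, 24.9331, "426737"),
]
print(sorted(cells_per_location(records).values()))  # [2, 3]
```

Bucketing by rounded coordinates is a crude spatial join; four decimal places correspond to roughly 10 m of latitude, which is adequate for this kind of coarse map coloring.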


are decreasing rapidly in urban areas. Cell size reduction helps in implementing cell-id positioning, as the positioning error is smaller in small cells. The cell-id based method is energy efficient when compared to other techniques such as GPS positioning. Additionally, Jakob Hoydis et al. [7] proposed the concept of small-cell networks (SCNs), which is built around the idea of cell-size reduction. The future telecommunication industry is also moving towards deploying dense, self-organizing, low-cost, and low-power small cells to replace existing macrocells.

Figure 8: Paths covered in Kamppi and Otaniemi: (a) Otaniemi, (b) Kamppi

Figure 9: Distribution of base stations

6 Future work

As of now, the experiments were performed in selected areas of Espoo and Helsinki with the operator Elisa. Future work can be continued by performing the following experiments to obtain an in-depth understanding of the density of cell towers. First, we need to collect data from all mobile operators and analyse it. Larger datasets provide more accurate results on how cell phone towers are deployed at a particular location. A comparative study among the operators is also needed to analyse the characteristics of each operator's cell towers. Second, we need to perform experiments using different network types, such as 3G, 4G, and GSM, and compare the results. Third, it would be interesting to repeat the experiments using more smart phones and different models. This analysis would lead to more accurate results.

7 Conclusion

In this paper, we used the signal collecting platform developed by Jun Wu [18] and performed experiments in the Kamppi city center. The collected data were processed and compared with the results obtained in the Otaniemi area. The comparative study reveals many interesting facts about the deployment of base stations. The findings are summarized as follows:

• The density of cell towers is higher than expected in both urban and suburban areas. We detected approximately 70 base stations in a small area whose diameter is 2 kilometers.

• The deployment of cell towers and their behavior were the same in both urban and suburban areas. We observed many macrocells and microcells in these areas and concluded that cell sizes are being reduced in urban areas.

• Cell size reduction plays an important role in implementing cell-id positioning. This paves the way for energy-efficient positioning systems in the near future.

References

[1] A. Carroll and G. Heiser. An analysis of power consumption in a smartphone. In USENIX Annual Technical Conference, pages 1–14, 2010.

[2] F. C. Commission. Wireless 911 services, June 2014.

[3] CONSORTIUM. The Galilei project: Galileo design consolidation. European Commission, August 2003.

[4] I. Constandache, S. Gaonkar, M. Sayler, R. Choudhury, and L. Cox. EnLoc: Energy-efficient localization for mobile phones. In INFOCOM 2009, IEEE, pages 2716–2720, April 2009.

[5] B. L. Decker. World geodetic system. Tech. rep., DTIC Document, 1986.

[6] W. Fraczek. Mean sea level, GPS, and the geoid, July 2003.

[7] J. Hoydis, M. Kobayashi, and M. Debbah. Green small-cell networks. Vehicular Technology Magazine, IEEE, 6(1):37–43, March 2011.


[8] C. Hurley. WarDriving: Drive, Detect, Defend: A Guide to Wireless Security. Elsevier Science, 2004.

[9] M. Kjaergaard. Minimizing the power consumption of location-based services on mobile phones. IEEE Pervasive Computing, 8(4), 2010.

[10] W. Lechner and S. Baumann. Global navigation satellite systems. Computers and Electronics in Agriculture, 2000.

[11] J. Paek, K.-H. Kim, J. P. Singh, and R. Govindan. Energy-efficient positioning for smartphones using cell-id sequence matching. In Proceedings of the 9th International Conference on Mobile Systems, Applications, and Services, pages 293–306. ACM, 2011.

[12] M. Rahnema. Overview of the GSM system and protocol architecture. Comm. Mag., 31(4):92–100, Apr. 1993.

[13] A. Schulman, V. Navda, R. Ramjee, N. Spring, P. Deshpande, C. Grunewald, K. Jain, and V. N. Padmanabhan. Bartendr: A practical approach to energy-aware cellular data scheduling. In Proceedings of the Sixteenth Annual International Conference on Mobile Computing and Networking, MobiCom '10, pages 85–96, New York, NY, USA, 2010. ACM.

[14] S. Tekinay. Wireless geolocation systems and services. Communications Magazine, IEEE, 36(4):28–28, April 1998.

[15] E. Trevisani and A. Vitaletti. Cell-ID location technique, limits and benefits: an experimental study. In Mobile Computing Systems and Applications, 2004. WMCSA 2004. Sixth IEEE Workshop on, pages 51–60, Dec 2004.

[16] S. von Watzdorf and F. Michahelles. Accuracy of positioning data on smartphones. In Proceedings of the 3rd International Workshop on Location and the Web, page 2. ACM, 2010.

[17] J. Willaredt. WiFi and cell-ID based positioning: protocols, standards and solutions. SNET Project WT, 2011.

[18] J. Wu. Signal Collecting Platform and Handprint Positioning System. Master's thesis, Aalto University, June 2014.

[19] Y. Zhao. Standardization of mobile phone positioning for 3G systems. Communications Magazine, IEEE, 40(7):108–116, Jul 2002.

User Authentication or Identification Through Heartbeat Sensing

Sowmya Ravidas
Student number: 451125
[email protected]

Abstract

One of the most common methods for authenticating people is passwords. However, passwords are difficult to remember and are also susceptible to attacks. An alternative to passwords is the traditional biometric system, for example, fingerprints. Although such biometrics provide better security compared to passwords, they are not suitable for people with disabilities. In this paper, we discuss ECG-based authentication systems that use the heartbeat for authenticating people. Compared to other biometrics, ECG-based authentication systems are secure, as heartbeats are difficult to forge, and they are also suitable for people with disabilities. Hence, such systems can be a potential alternative to traditional biometrics. In this paper we discuss the process of measuring ECG, explain the workings of ECG-based authentication systems, and also discuss a commercial product that uses this technology. The main contribution of the paper is our analysis of the suitability of such systems for the public; we also discuss the associated challenges.

KEYWORDS: Heartbeat, Heart-rate Authentication, Identification, ECG.

1 Introduction

Passwords are a common method for the authentication and identification of people. We use at least 4-5 passwords a day for various purposes, including logging into our computers and email, opening electronic door locks, etc. One problem for users is remembering these passwords; a second is their vulnerability to attacks such as intrusion attacks, social engineering attacks and denial of service attacks.

One solution to both of these drawbacks is biometric authentication. This approach to authentication is based on physical traits such as fingerprints, eye scans and facial features. Though biometric authentication is expensive, it is widely used in places such as passport offices and visa service centres. However, these authentication methods do not scale well for all users. For example, fingerprint-based authentication is not a feasible method for physically challenged people, or for people with temporary cuts or wounds on their fingers. It is possible that fingerprints fade away over time due to moisture levels or due to diseases such as cancer. Current biometric systems require the users to authenticate every time they use a service, which is time consuming.

This report focuses on using one's heartbeat information for secure authentication and identification. It has been found that biometrics from the human heart, such as heartbeat information, show uniqueness, because it is inherited from the individuality of DNA (deoxyribonucleic acid).

ECG-based authentication methods could be one of the potential methods for efficient authentication. Fahim et al. [19] describe how heartbeat measurement by ECG (electrocardiogram) can be used as a standalone biometric system for authentication.

ECG measures the electrical signals of the heart via sensors that are connected to the body. From an ECG, we can measure the inter-pulse interval, which is the time lapse between any two nerve impulses, and also the heart rate variability (HRV). HRV shows the beat-to-beat alterations of the heart rate. The inter-pulse interval or the heart rate variability can be efficiently used to identify individuals. Another advantage is that it is possible to collect the inter-pulse interval from any part of the body, which makes the process easier.

There are also many other ways to measure heartbeat information besides ECG. Balakrishnan et al. [13] explain how to detect pulse measurements from the head motions of a person, which they capture in a video. Further analysis revealed that the extracted pulse information shows similarity to that obtained using ECG.

In addition to the above, there are also several products that come with an embedded heart rate monitor, which helps in measuring the pulse and also gives health advice to the users. Polar heart rate monitors [8] are an example.

Wearable devices for seamless authentication, such as Nymi [6], are one of the latest developments in this field. Nymi is a wristband that uses heartbeat information to perform seamless authentication of people to devices in various environments. According to the researchers behind Nymi, the heart rate is unique even when we exercise [1]. With a click on the band, users can authenticate to their computers, doors, cars, etc., and are no longer required to memorize their passwords.

The remaining part of the report is structured as follows. Section 2 explains how ECG is measured. The subsequent section concentrates on authentication based on ECG and also discusses the workings of the Nymi band. We then look into some of the security challenges in using ECG-based methods for authentication or identification. The last section concludes the report.

111 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

2 Measuring Heart-rate using ECG

In this section we look into measuring the ECG, a traditional but still widely used method in hospitals for diagnosing cardiac patients. An ECG records the electrical activity of the heart by using sensors placed on the body to detect these electrical changes. Usually 10 electrodes are used: 4 placed on the limbs and 6 on the chest. The heart's electrical potential is measured at different angles and recorded as a time series graph.

The electrocardiogram report consists of this graph of voltage versus time, which shows the overall magnitude of the heart's electrical potential. This gives accurate information about heart rate and rhythm.

The human heart consists of four chambers: the left atrium, left ventricle, right atrium and right ventricle. The atria and ventricles contract and relax together in order to pump blood throughout the body. An electrical system makes this possible by creating electrical impulses. The impulse occurs in cells located in the right atrium and spreads across the walls of the atrium, causing the contraction. This group of cells, commonly known as the SA node, is responsible for setting the heart rate and rhythm. Each heartbeat is represented on the ECG as a PQRST complex, where each of these letters represents a part of the complex, as seen in Figure 1.

Figure 1: PQRST Wave [9]

1. P wave
The P wave is the first part of the complex and represents the atrial depolarization. It is followed by atrial contraction.

2. QRS complex
The next part of the complex is the QRS wave, which represents the ventricular contraction. A negative deflection in the QRS complex is referred to as the Q wave; a Q wave may or may not be present in an ECG. The R wave is the largest wave, as it reflects the depolarization of the main parts of the ventricles. The negative S wave follows the positive R wave and represents the final depolarization of the ventricles.

3. T wave
The last component is the T wave, which represents the re-polarization of the ventricles. Usually, it is the T wave that provides most information about cardiac abnormalities.

If the P wave precedes the QRS complex, the rhythm is said to be correct. If not, the person may be suffering from cardiac arrhythmia, an irregular heartbeat or other cardiac problems.

There are also techniques that measure the ECG from the fingertips and avoid placing leads on the chest. Lourenço et al. [16] found that only 3 leads on the index finger are enough to identify a person. This consumes less time and is also easier for clinical experts to measure.

However, these ECG measurements still require contact with the body. Balakrishnan et al. [13] showed that it is possible to measure heart rate and beat lengths from head motions captured in a video. They extract the relevant cardiac information from the head oscillations by applying filtering and PCA techniques.

In the next section, we discuss how ECG data is used for authenticating people.

3 Authentication and Identification through Heartbeat

This section discusses the ECG based biometric system and how it enables secure authentication. We will also look in detail at a commercial product, Nymi, and evaluate its security.

3.1 Uniqueness of Heart-rate

The heart-rate is unique and differs from person to person. This uniqueness occurs because our heart inherits it via the DNA. DNA gives distinctive properties that help in distinguishing people and in identifying them accurately. This genetic information flows from DNA to RNA to protein, and protein is responsible for the uniqueness of fingerprints, the iris and other biometric data [19]. The shape of the heart, face and other organs also exhibits unique features that are derived from the individuality of the DNA and can be used for the identification of people. The ECG shows the electrical activity of the heart, and it can also be used for successful identification because it is, in turn, derived from the DNA.

3.2 ECG based Biometric

The inter-pulse interval and the heart rate variability can be used to identify individuals. This method was first introduced in the paper by Biel et al. [14], where they collected the ECG of people and performed a multivariate analysis. Their results revealed the possibility of using ECG as a biometric system.

Shen et al. [17] used a one-lead ECG, where the leads are kept on the hands, and further analysed the QRS complex and the T wave. Using template matching and neural


Figure 2: ECG Biometric from DNA

network classifiers, they claimed to have achieved an accuracy of up to 100% for 20 subjects.

These results show the accuracy of ECG based authentication and also its ease of use. These methods are suitable for everyone, including physically challenged people. Also, the heart rate can be measured from any part of the body, which makes the process much easier.

In an ECG based biometric system, an enrolment stage is conducted to collect a user's ECG and store it in a database. During the authentication stage, a one-to-one mapping of the ECG templates is performed. A person can authenticate himself with a PIN or a smart card which holds his details, such as his name. The system then captures his ECG and maps it against the ECG originally captured during the enrolment stage for this person. If there is a match, the authentication is considered successful.

On the other hand, identification is achieved by performing a one-to-many mapping. The acquired ECG is mapped against all the ECG templates in the database. The individual can be identified if any of the templates is matched with high probability. During the enrolment stage, the unique features of the ECG are identified, and during identification, a pattern matching of these features is performed against all the other ECGs in store.

3.2.1 Classification of ECG based Biometric

There are two kinds of features which can be extracted.

1. Time Domain Feature Extraction
The unique features are extracted from the ECG, and the feature wave duration, amplitude, direction and other details are obtained. These component values are saved during the enrolment stage. During authentication, all these data are extracted and a one-to-one template matching is performed. The features extracted can be further classified into two types:

(a) Extracting Morphological Features
This reveals the time domain features from the ECG, such as the P wave duration and amplitude, the QRS duration and amplitude, and the T wave duration and amplitude. The collected data is used for identification, although most of it is traditionally used for cardiac diagnosis.

(b) Uniqueness of Beating Patterns
Consecutive heartbeats can also reveal unique information such as the heart rate variability or the inter-beat interval. These features depend on the breathing pattern and heart rhythm, but can be used for identification purposes.

2. Frequency Domain Feature Extraction
The ECG signal is converted from the time domain to the frequency domain. To extract the feature wave, transformations such as the Fourier transform, the wavelet transform and other discrete transforms are applied.

In general, it is difficult to achieve a successful mapping due to the time-varying nature of the ECG, as mentioned earlier. An abnormality in the ECG or the PQRST signature can result in erroneous template matching, which can cause a denial of service. According to a few scientific works, a common situation in which the ECG changes is when the stress level of the user varies.

However, Israel et al. [15] explained from their experiments that the ECG signal does not really change under anxiety or stress. The authors simulated 7 states of mental stress in people and were able to successfully verify their identity under varied stress levels. From the results, we can say that in most cases it is possible to identify an individual under varying stress levels, but we should also keep in mind that it is not 100% accurate. Also, this experiment did not consider any physical activities of the person.

It is difficult to collect an ECG in a short time and to provide unique features in real time. Hence, an ECG based biometric can take much longer to collect than fingerprint or iris data. Also, there is not much experimental data available compared to the amount of fingerprint or iris data. Researchers would need more ECG data to analyse the system further and make it globally acceptable.

3.3 Nymi Band

The Nymi Band is a product from the company Nymi, based in Toronto. The idea behind this product is to perform seamless authentication in various environments using a single wristband. The band uses heartbeat information for authentication, and hence the researchers at Nymi claim that it is not only easy to use but also highly secure. With a click on the band,

the users can authenticate to their computers, doors, cars, etc., and are no longer required to memorize passwords.

3.3.1 Components

HeartID is the core of the Nymi band and helps users authenticate to computers and other environments. HeartID is attached using sensors that record the heart's signature and confirm the identity of the individual.

There is a Nymi Band Core component that contains the encryption hardware protecting the communication. It also contains motion sensing components, such as a gyroscope for gesture recognition, and a haptic feedback motor.

The LEDs on the Nymi Band increase its usability. They notify users about the band's current state, battery life and other messages. This communication is performed using LED patterns.

Also, the band is worn on the wrist, which increases usability. If the band is cut, the circuit is disabled, which prevents unauthorized access: the band detects that it has been detached from the body and disables the system. They also claim that, since the band is required to be tied to the wrist, it is made of hypoallergenic materials, which enhances the comfort of the users.

3.3.2 Security Analysis of Nymi

The authentication used in Nymi is a 3-factor authentication, which consists of the heartbeat, the Nymi band and a smartphone.

Figure 3: Three Factor Authentication [3] [4] [5]

1. Heartbeat based authentication
The authentication is performed based on the collected ECG data. ECGs are difficult to copy or forge. This makes an attack harder, because in order to perform it, the attacker has to simulate the user's ECG.

2. The Nymi band
If the band is broken or stolen, the circuit is cut and the band stops working. Even if the band is stolen intact, the band detects its detachment from the body and invalidates the authentication. Hence, an attacker cannot gain access even if the band is stolen.

3. Smartphone
Let us assume that the attacker has succeeded in spoofing the band. He would still be unsuccessful in gaining access, because, in addition to the band, a smartphone is required for successful authentication. The phone runs the Nymi application that enables pairing of the band and the authentication system. The phone is required for the initial authentication and is not required in later stages.

It is difficult to capture a user's heartbeat or ECG, and hence it is not easy for an attacker to replicate it. If the attacker breaks the band or steals it, the device stops working, as there is no match between the heartbeat patterns. However, it is possible that the attacker steals the smartphone while it is authenticated. Since this is a 3-factor authentication, the attacker would also have to spoof the wristband in addition to the phone to be able to access the system.

According to Nymi [7], the product is based on heartbeat based authentication and hence can be more secure than traditional biometrics, because it is difficult to forge someone's heartbeat. Experts at Nymi say that Nymi protects the user from impersonation attacks and passive eavesdropping. It is not possible for someone to impersonate the user, as the heartbeat is unique and differs between people. It is also hard to eavesdrop, as the band is always tied to the user's wrist, making eavesdropping without the knowledge of the user difficult. They also claim that such features prevent some active attacks, such as man-in-the-middle attacks.

Nymi uses robust cryptographic techniques which guarantee that only the devices that are paired with Nymi can detect its presence. This prevents tracking by adversaries and also preserves privacy to a certain extent.

4 Security Challenges in Using ECG for Authentication or Identification

4.1 Accuracy

It is said in [1] that heartbeat based authentication is slightly less accurate than using fingerprints. This is because heartbeats vary continuously and are not consistent over time. As mentioned in Section 3, heartbeat based authentication is successful in most cases, even under extreme stress levels. However, these results are not very accurate, and usually the measurements are taken when the subjects are in an idle state.

The heart-rate can vary in extreme cases, which might make it difficult for the system to identify the individual. For example, for a person who has survived a heart attack, the heart rate would vary considerably from what it was when it was originally recorded. The same might be the case for a person with a breathing problem.

Factors including physiology, the geometry of the heart, body build, gender and age make ECG based authentication less accurate when compared to other biometric products. There are also other factors that change at a slow rate, such as age and body habitus.

Hence, accuracy is one of the major challenges in such systems. There are some scientific works that discuss a probabilistic authentication model based on the activities of the user [18]. Although the claim is that it scales for the entire world, the probabilistic model may not be a really scalable solution, and in most cases the sample size considered is quite small.
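The enrolment and matching flow described above can be made concrete with a small sketch. This is an illustration only, not the method of any cited work: the fixed-length feature vectors, the Pearson-correlation similarity and the 0.9 threshold are assumptions, and the threshold is exactly the knob that trades false rejections (a user whose ECG has drifted, as discussed in this section) against false acceptances.

```python
import math

def correlation(a, b):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def authenticate(template, probe, threshold=0.9):
    """One-to-one verification: accept the probe if it matches the enrolled template."""
    return correlation(template, probe) >= threshold

# Hypothetical PQRST feature vectors (durations and amplitudes, normalized).
enrolled = [0.1, 0.8, -0.4, 1.0, 0.3]
genuine  = [0.12, 0.78, -0.38, 0.97, 0.31]  # same user, slight variation
impostor = [0.5, -0.2, 0.9, -0.1, 0.6]

print(authenticate(enrolled, genuine))    # True
print(authenticate(enrolled, impostor))   # False
```

Identification (one-to-many) would run the same comparison against every stored template and pick the best match above the threshold.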


4.2 Attacks

The ECGs captured by the ECG biometric system are transferred to an ECG repository, where they are stored. The ECG is transferred over the public Internet, and it is very important that ECGs are encrypted before transmission. If not, there are possibilities for spoofing attacks, where the attacker spoofs the heart rhythm and replays it to the machine, which can allow the attacker to access the captured ECG.

A sufficiently strong method of ECG encryption is the use of permutation ciphers. Permutation ciphers are now being used by researchers to encrypt ECGs before sending them over the Internet. Here, the original ECG is encrypted by applying mathematical transformations, which result in random ASCII letters. The permutation key is known only to the receiver, who can decrypt the encrypted ECG. This technique is secure when compared to AES and DES according to [19]. However, a permutation cipher is prone to attacks and is easy to break if the attacker can reconstruct the initial shift key. If this method were combined with existing encryption schemes such as AES and DES, the system would be more secure.

Other possible attacks are denial of service: if the attacker manages to break either of the authentication factors, he can cause a DoS attack. Let us take the Nymi band as an example. It is possible to cause a denial of service if the attacker manages to hack the Nymi band, the smartphone or the authenticating system. If the band is broken or stolen, the user cannot authenticate himself. It is also possible that the smartphone is stolen and the user is no longer able to access the server.

Another possibility is that the attacker attempts to confuse the system, resulting in gaining access for himself or causing a DoS for the user. For example, if the attacker has the user's personal data, he can feed his own ECG and try to gain access. This will of course result in a failure, because the ECGs do not match. However, if the attacker tries multiple times, the system may lock the account and the user can no longer access it.

Social engineering attacks are among the most likely attacks on ECG based authentication systems. The attacker can possibly trick the user into granting him authentication. For example, the attacker can trick the user into revealing his permutation key. This enables the attacker to gain access to the original ECG and learn the user's data and health information.

Attacks are also possible from family members. For identical twins, the DNA is similar, and hence they have a higher chance of having similar ECG data.

Most of the current works related to ECG based authentication that we have seen lack a proper adversarial model. It is necessary to have an adversarial model to study the system in detail and to identify and correct the flaws in the architecture.

4.2.1 Threat Model

In this subsection we design a threat model and analyse the security issues, the attacks and the motivation of the attackers. Some of the important assets in ECG based authentication systems are:

1. Personally identifiable information (PII)
2. Heartbeat data
3. User credentials
4. Phone or other device
5. Measurement system
6. Health information
7. ECG mapping system
8. Money, if this authentication is used in banking systems

Let us now evaluate the potential attackers and attacks.

1. Friends and family: people trying to use another person's authentication. This is possible when users share their credentials with others, who can then take advantage of it. Among family members, for example, identical twins can have similar ECG patterns.

2. Criminals and hackers: aiming to steal information and make money. Criminals and hackers can steal ECG information and try to feed fake ECG signals into the system, causing an intrusion attack. Attackers can also hack the ECG mapping system and steal all the data, which can reveal personal and health information of the users. They can also steal a band or other devices and cause a denial of service attack. It is also possible to copy the information and replay it on another machine to gain access, do illegal things, or gain access to prohibited computers or areas.

3. Insider attackers: attackers who are within the system, for example hospital authorities or people taking ECG measurements. They can record fake heart-rate readings instead of the original ones.

4. Companies who build products that measure heartbeats: they can collect huge amounts of data without the user's permission and use it for analysis.

Even if an attack on gaining access to the ECG system fails, it is possible for the attacker to learn personally identifiable information and track the user. Attackers will also get to know the health status of the user. It is also possible that commercial products and companies deny service at some point and demand more money from the user. The user's health record can be used for other purposes without the user's permission, which hinders user privacy.

Having a good threat model in place helps to get a clear vision of the problems and to focus on the parts that need improvement. The model helps to understand the motivation of the attackers, prioritise the risks and address them. It also helps in identifying more vulnerabilities, patching them and making the system more secure.


4.3 Privacy

In ECG based authentication methods, we do not really use private information such as photos or videos of the users. Hence, we can say that privacy is preserved to a certain extent. However, there are still concerns regarding privacy in using heart rate information for identity.

The main concern regarding privacy is who owns the data: the user or the service provider. The user's ECG and personal details should be stored in a highly secure environment and must be sent through encrypted channels only. The service or product has to ensure that users can control their details and that no one else is authorized to view or modify them. A proper agreement has to be made between the users and the service providers before the latter can use the information for other purposes.

ECG data can reveal a lot about a person's health status. These can be private details that users may not want to share with others.

Another concern with respect to privacy is how the ECG is measured. Users may prefer placing leads on the fingers rather than on the chest. The contact-less authentication explained in [13] is a good method, but users might not prefer it because it takes video of the user.

4.4 Usability

Usability is one of the major issues in ECG based authentication systems. If we use a traditional ECG system to measure the ECG, users may find it difficult. It requires more time, as users have to lie down with 12 leads placed on their body, and sometimes doctors use gel to place these leads, which can be inconvenient for the users. However, the latest research discusses measuring the ECG from the fingers [16] and also measuring the ECG without contacting the user, such as from head motions, as described in [13].

Commercial products such as Nymi come with high usability, and not much action is required from the user's side. However, always wearing a band can be inconvenient. It is possible that users sometimes forget to wear the band, and for this reason they might want to shift back to passwords.

There are many scientific works that discuss accuracy and argue that ECG data is trustworthy. This makes the approach more usable, because users will find it trustworthy and the system does not annoy them while providing its services. However, a matter of concern is how the ECG changes over time, over several years. This problem needs to be investigated further, because if the ECG changes over time, it has a direct impact on usability.

Another concern is that users should experience this method as secure. It can happen that users do not really understand what is happening and may feel that the system is less secure than a traditional password based system. In other words, users do not find a system secure until they type in a password.

There are advantages as well as challenges with ECG based systems. Hence, it is difficult to predict whether this will become a widely accepted technology for authentication.

5 Conclusion

Passwords and traditional biometrics come with a lot of disadvantages. The former are hard to remember and the latter are not suitable for all people. Also, both passwords and traditional biometrics such as fingerprints are vulnerable to attacks. In this paper, we studied heartbeat based authentication methods, which provide a highly secure platform for authenticating users to services. We looked at various aspects of ECG based authentication and also studied a commercial product. We found that ECG based authentication systems are suitable for all people and are secure. However, there are many challenges associated with them. We looked into accuracy, attacks, privacy and usability issues. We also designed a threat model that can help in evaluating and improving the security of such systems. ECG based authentication is still an emerging field, and we hope that it can provide the means to securely authenticate people.

References

[1] First look: Startup readies heartbeat-based authentication. WWW news article on Nymi: http://tinyurl.com/pofyq8b.

[2] Function of heart. WWW: Human heart anatomy, http://www.livescience.com/34655-human-heart.html.

[3] Human heart. WWW: http://www.core77.com/posts/26369/forget-passwords-soon-your-heartbeat-may-open-your-e-mail-26369.

[4] Human heart. WWW: http://pixabay.com/en/cardiac-pulse-systole-heartbeat-156059/.

[5] Human heart. WWW: http://www.pcpro.co.uk/smartphones/6932/the-18-best-smartphones-of-2015-whats-the-best-phone.

[6] Nymi. WWW page of Nymi: https://www.nymi.com/.

[7] Nymi band white paper. WWW: https://www.nymi.com/wp-content/uploads/2013/11/NymiWhitePaper-1.pdf.

[8] Polar heart rate monitor. WWW page of Polar: http://www.polar.com/en/products/.

[9] PQRST. WWW: PQRST wave, http://upload.wikimedia.org/wikipedia/commons/thumb/0/09/ECG-PQRST%2Bpopis.svg/800px-ECG-PQRST%2Bpopis.svg.png.

[10] PQRST. WWW: PQRST wave, http://www.emergsource.com/?page_id=90.

[11] PQRST. WWW: PQRST wave, http://www.ncbi.nlm.nih.gov/books/NBK2214/.


[12] SA node. WWW: SA node, https://my.clevelandclinic.org/services/heart/heart-blood-vessels/how-does-heart-beat.

[13] G. Balakrishnan, F. Durand, and J. Guttag. Detecting pulse from head motions in video. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 3430–3437. IEEE, 2013.

[14] L. Biel, O. Pettersson, L. Philipson, and P. Wide. ECG analysis: a new approach in human identification. Instrumentation and Measurement, IEEE Transactions on, 50(3):808–812, 2001.

[15] S. A. Israel, J. M. Irvine, A. Cheng, M. D. Wiederhold, and B. K. Wiederhold. ECG to identify individuals. Pattern Recognition, 38(1):133–142, 2005.

[16] A. Lourenço, H. Silva, and A. Fred. Unveiling the biometric potential of finger-based ECG signals. Intell. Neuroscience, 2011:5:1–5:8, Jan. 2011.

[17] T.-W. Shen, W. Tompkins, and Y. Hu. One-lead ECG for identity verification. In Engineering in Medicine and Biology, 2002. 24th Annual Conference and the Annual Fall Meeting of the Biomedical Engineering Society EMBS/BMES Conference, 2002. Proceedings of the Second Joint, volume 1, pages 62–63. IEEE, 2002.

[18] J. C. Sriram, M. Shin, T. Choudhury, and D. Kotz. Activity-aware ECG-based patient authentication for remote health monitoring. In Proceedings of the 2009 International Conference on Multimodal Interfaces, pages 297–304. ACM, 2009.

[19] F. Sufi, I. Khalil, and J. Hu. ECG-based authentication. In Handbook of Information and Communication Security, pages 309–331. Springer, 2010.

A Survey on Performance of Scalable Video Coding Compared to Non-Scalable Video Coding

Martijn Roo Student number: 466686 [email protected]

Abstract

This paper presents a comparison of scalable video coding and non-scalable video coding based on current literature. The comparison focuses on properties of both coding techniques relevant to different use case scenarios. It concludes that SVC provides better live-streaming video quality, better storage efficiency, and better caching performance compared to AVC, whereas AVC decoders use fewer computational resources and video streaming services using AVC encoded videos have a lower operating cost when no caching servers are used. The operating cost of a video-on-demand service using caching with or without SVC is an area for further research.

KEYWORDS: video, encoding, mobile, multimedia, streaming

1 Introduction

Users connect to the Internet through various types of connections, each having varying levels of available bandwidth. As a result, video streaming services have to take into account the heterogeneous connections with which their users connect to their services. They often accomplish this by using HTTP Adaptive Streaming (HAS). This works by splitting a video into small time-based segments, whereby each segment is made available in multiple qualities and therefore in multiple bit rates. This enables a client streaming a video to dynamically choose, during playback, in which quality it wants to download future segments based on the available bandwidth.

Figure 1: HTTP Adaptive Streaming architecture

Figure 1 gives an overview of a HAS application. There is a regular HTTP server serving requests from a client device. The client device first requests the media's content description for the media it wants to stream. The client device then continuously decides which quality segment to request next based on the available qualities, the available bandwidth, and the current buffer size.

Often HAS is used with different files encoded at different bit rates to support different video qualities. This is non-scalable video coding, and the server in Figure 1 shows this technique. Another option for supporting adaptive streaming is to encode a video file using scalable video coding, in which case the file is encoded into layers and the number of requested layers determines the bit rate and quality of the video.

Scalable and non-scalable video coding have different properties with regard to coding complexity, storage efficiency, operating cost, and caching performance. This paper compares HTTP Adaptive Streaming techniques using scalable and non-scalable video coding on these properties.

The remainder of this paper is structured as follows: Section 2 describes the background of HAS and SVC; Section 3 compares scalable video coding and non-scalable video coding on coding complexity, quality of experience, storage efficiency, and caching performance; Section 4 analyses the cost of hosting a video streaming service with SVC compared with AVC; finally, Section 5 and Section 6 conclude this paper and give recommendations for future work, respectively.

2 Background

Applications of HTTP Adaptive Streaming allow the client to request a different video bit rate per fixed-time segment (usually around 2 seconds in length). Most HAS applications use non-scalable coding, which means that every video segment has to be encoded separately for every bit rate [5]. The usual methods for obtaining different bit rates are temporal, spatial, or quality scalability [14]. Temporal scalability allows creating a lower bit rate video by dropping complete pictures from the original bitstream; the resulting stream has a lower frame rate than the original. For spatial scalability, a video is coded at multiple picture sizes. Quality scalability is also known as SNR (Signal-to-Noise Ratio) or fidelity scalability; for this, a video is encoded at different qualities.
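The per-segment decision described in the introduction — choose a quality from the advertised set based on the measured bandwidth and the current buffer — can be sketched as follows. The rate ladder, the half-bandwidth safety margin and the 5-second low-buffer threshold are illustrative assumptions, not taken from any particular HAS client.

```python
def choose_bitrate(available_kbps, bandwidth_kbps, buffer_s, low_buffer_s=5.0):
    """Pick the highest advertised bit rate that fits the bandwidth budget.

    When the playback buffer runs low, a conservative budget (half the
    measured bandwidth) is used to reduce the risk of a stall.
    """
    budget = bandwidth_kbps if buffer_s >= low_buffer_s else bandwidth_kbps / 2
    feasible = [r for r in available_kbps if r <= budget]
    return max(feasible) if feasible else min(available_kbps)

ladder = [350, 700, 1500, 3000, 6000]               # bit rates from the content description
print(choose_bitrate(ladder, 3200, buffer_s=12.0))  # 3000
print(choose_bitrate(ladder, 3200, buffer_s=2.0))   # 1500
print(choose_bitrate(ladder, 200, buffer_s=12.0))   # 350 (fallback to the lowest rate)
```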


A HAS application can also use Scalable Video Coding (SVC) instead of non-scalable coding. SVC encodes a video segment into a base layer and one or more enhancement layers using temporal, spatial, or quality scalability methods. The base layer represents a low-quality version of the video segment, and every additional enhancement layer improves the quality of the segment. With a HAS application using SVC, the client can choose the bit rate of a video segment by the number of enhancement layers it requests, whereas with non-scalable video coding, the client chooses the bit rate by requesting a different file. Figures 2(a) and 2(b) show a video segment encoded in multiple layers and in multiple versions, respectively. Decoding the lower four SVC layers would give a video quality comparable to decoding the smaller AVC video file, whereas decoding all six SVC layers would give a quality comparable to decoding the larger AVC video file.

Figure 2: Layers and versions for one time segment: (a) layers (SVC); (b) versions (AVC).

A HAS client has to be aware of the available layers or versions on the server to determine what requests it can make. A media content description (MCD), or media presentation description, is a file describing video segment information, such as timing, URL, video resolution, and bit rate. A HAS client can retrieve the MCD and use its information to make decisions on which layers or versions to request. A widely supported standard that specifies the allowed content and structure of an MCD, but not which codec to use, is MPEG-DASH.

Deciding what bit rate to request for a certain video segment is one of the major challenges in HTTP Adaptive Streaming. When using non-scalable video coding, the version to retrieve is usually decided once per segment, since replacing an already retrieved version with a higher-quality version renders the former obsolete and the bandwidth used for it wasted. With scalable video coding, a client generally has two options: it can request additional layers for segments that will be played in the near future, or it can request layers for segments further ahead of time [2]. The first option tries to maximize the current playback quality, at the cost of playback disruptions when large fluctuations in available bandwidth occur. The second option prioritizes uninterrupted playback, although it may result in a lower-than-necessary playback quality overall. Finding an optimal compromise between these extremes is essential for a HAS application using SVC.

The authors of [1] show that two major commercial players (Smooth Streaming and Netflix) and one open-source player (Adobe's OSMF) make suboptimal decisions for the video bit rate they choose to retrieve under fluctuating bandwidth conditions. This exemplifies the difficulty in making optimal decisions for which version of a video segment should be retrieved in a HAS environment.

Scalable video coding has existed for more than a decade, and it has been supported by the popular H.264/AVC standard since 2007 [10, 14]. Furthermore, there are implementations of HAS that use SVC, such as iDASH (improved dynamic adaptive streaming over HTTP), and they show that SVC allows for more optimal retrieval decisions than non-scalable HAS solutions [13]. Despite these benefits of SVC, video streaming providers such as Netflix currently use non-scalable video coding exclusively [8].

3 Efficiency and quality metrics

This section compares scalable and non-scalable video coding on efficiency and quality metrics. Section 3.1 compares AVC and SVC on their coding complexity; Section 3.2 analyses the quality of experience obtained with scalable and non-scalable coding; Section 3.3 discusses the storage efficiency of both coding techniques; Section 3.4 discusses the caching performance of videos encoded as layers compared to videos encoded as versions; and Section 3.5 concludes this part by giving an overview of the comparison between SVC and AVC on the analysed metrics.

3.1 Coding complexity & efficiency

SVC decoders generally require 10 to 50% more computational resources than single-layer H.264/AVC decoders to obtain the same target resolution and bit rate, depending on the resolution ratios between the layers, the number of layers that are employed for inter-layer prediction, and the bit rate [15]. This can be especially relevant for mobile devices, for which energy efficiency is a key requirement.

Furthermore, since AVC is more widely used and better supported, it is more likely that hardware acceleration is available for it. Hardware acceleration allows the same or better video quality while using fewer computational resources and less energy. This is again especially relevant for mobile devices.

3.2 Quality of experience

SVC can be incorporated in a live streaming environment [16]. Such a system provides a higher and more stable playback quality compared to AVC-based systems in high-delay networks with small buffer sizes [3, 12]. In a high-delay network with an application that has a small buffer size, video segments will be played back shortly after being retrieved. Therefore, there is generally insufficient time to retrieve a different version of a previously retrieved segment with non-scalable video coding. This results in the application choosing a quality lower than the available bit rate to avoid gaps in video playback when the available bandwidth decreases. With scalable video coding, the client application can refrain from requesting a higher layer when the available bandwidth abruptly decreases. Since this decision can be made without affecting previously retrieved layers, scalable video coding allows for a higher base quality in small-buffer/high-delay environments.
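The two retrieval options available to an SVC client in Section 2, upgrading segments close to playback versus prefetching base layers further ahead, can be illustrated with a toy scheduler. The function name, the fixed layer budget, and the buffer representation below are hypothetical and are not taken from [2].

```python
MAX_LAYERS = 4  # base layer + 3 enhancement layers (an assumed budget)

def next_request(segments, prefer_quality):
    """segments: list of layer counts already fetched, in playback order
    (0 = nothing fetched yet).  Returns (segment index, layer) to request."""
    if prefer_quality:
        # Option 1: upgrade the earliest segment that is not at full quality.
        for i, layers in enumerate(segments):
            if 0 < layers < MAX_LAYERS:
                return i, layers + 1
    # Option 2 (or nothing left to upgrade): extend the buffer with base layers.
    for i, layers in enumerate(segments):
        if layers == 0:
            return i, 1
    return None

buffer = [4, 3, 1, 0, 0]
print(next_request(buffer, prefer_quality=True))   # (1, 4): upgrade segment 1
print(next_request(buffer, prefer_quality=False))  # (3, 1): prefetch a base layer
```

A real client would blend the two policies based on buffer occupancy and bandwidth estimates; the sketch only makes the trade-off between the two extremes concrete.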


3.3 Storage efficiency

The advantage of a higher storage efficiency is that a higher number of different videos fit in a cache and that less bandwidth is required to transmit a video. With SVC, every set of consecutive layers that starts with the base layer can be decoded into a valid video, whereas with non-scalable video coding, only a complete segment can be decoded into a valid video. The partial independence between layers that scalable video coding provides causes SVC to have a storage overhead compared to non-scalable video coding [16]. When achieving two-layer spatial or quality scalable encoding in SVC, SVC has an overhead below 10% compared to H.264/AVC-encoded files that receive similar quality of experience ratings [9, 14].

Thus, SVC has a small overhead per video compared to AVC if different-quality versions have to be available. However, with SVC the different versions are represented by different subsets of the layers, whereas non-scalable video coding requires a different video file for each version. Therefore, storing different-quality versions requires less storage capacity when using SVC than when using AVC.

3.4 Caching performance

In a caching environment, the origin server is the server that hosts all the original content, whereas the caching server usually hosts only part of the content. The caching server is located close to the end user, and the bandwidth between the end user and the caching server is usually abundant. When a client requests a video from a caching server, the latency is low and the bandwidth is high compared to requesting the video from the origin server.

The closer location of caching servers decreases the chance of congestion, since the packets traverse fewer links in general and therefore also fewer links that could be congested. Furthermore, the origin server is less likely to be the bottleneck, since some requests can be answered by the caching servers.

Caches have limited space and are therefore usually unable to store all multimedia data available at the origin server. Choosing which content to store, and in what format, is key to obtaining a high cache hit ratio, since requests for files not present in the caching server are forwarded to the origin server. Smaller version-based files allow the storage of more files compared to larger layered files. However, having a base layer that is queried often improves the caching performance of layers over versions [6].

Video streaming providers often provide videos in different qualities for better network performance, which can be achieved with different versions or multiple layers. The question with regard to caching performance is whether the highest cache hit ratio is obtained by caching versions, layers, or both in different situations.

In [13], the authors show the increase in cache hit ratio when SVC is used instead of AVC in a video-on-demand service that uses adaptive streaming. The authors of [6] researched whether caching layers or versions gives a better cache hit ratio, both when the request distribution for videos is known a priori and when it is not. They examined the throughput when using only versions, only layers, and when using two different mixed strategies using both layers and versions. The first mixed strategy caches a version the first time a video is requested. When a different quality of that video is requested, layers are used to answer the request. Those layers are cached, and the previously cached version is removed from the cache as soon as it is no longer in use. The second mixed strategy streams a version when a video is requested the first time, but this version is not cached. The second time this video is requested, the corresponding layers are cached.

The authors observed that the second mixed strategy performed better than the first in general, because with the first strategy, versions sometimes remain in the cache because they still receive requests, preventing the more efficient layers from entering the cache. They observed that better cache hit ratios are obtained when caching layers instead of versions, except when a video is requested only once. Caching videos that will be requested only once decreases the cache hit ratio. Therefore, videos requested for the first time, with an unknown popularity, should be streamed directly from the origin server to the client. Streaming them as a version requires less bandwidth compared to streaming layer(s) that provide an equal quality. Only when a video is requested a second time should the layer(s) corresponding to that request be stored on the caching server. Such a mixed strategy using both versions and layers provides the best result in terms of caching efficiency.

Figure 3: Adaptive caching for varying probability of low-quality requests (figure from [6])

Figure 3 shows one of the results of [6]. It shows the normalized throughput for the strategies using only versions, only layers, and the two mixed strategies, in relation to the probability of a request being for the low-quality option of a video. The figure confirms that the second mixed strategy in general delivers the highest throughput, except when all requests are either for a low-quality or for a high-quality video, or when the layered encoding overhead compared to versions is too large.

3.5 Overview of efficiency and quality metrics

Table 1 shows the comparison of H.264/AVC with SVC on the properties discussed in the previous sections.
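The second mixed strategy of [6], described in Section 3.4, can be sketched as a toy cache model. The class name, its counters, and the returned action strings below are hypothetical simplifications introduced for illustration.

```python
class MixedStrategyCache:
    """Toy model of the second mixed strategy: versions are streamed
    (but not cached) on a video's first request; from the second request
    on, the corresponding layers are fetched and kept in the cache."""

    def __init__(self):
        self.request_counts = {}
        self.cached_layers = {}  # video id -> number of layers cached

    def handle_request(self, video, quality_layers):
        seen = self.request_counts.get(video, 0)
        self.request_counts[video] = seen + 1
        if seen == 0:
            # Unknown popularity: do not pollute the cache yet.
            return "stream version from origin"
        if self.cached_layers.get(video, 0) >= quality_layers:
            return "serve layers from cache"
        # Fetch the missing layers once and keep them for later requests.
        self.cached_layers[video] = quality_layers
        return "fetch and cache layers"

cache = MixedStrategyCache()
print(cache.handle_request("v1", 2))  # stream version from origin
print(cache.handle_request("v1", 2))  # fetch and cache layers
print(cache.handle_request("v1", 1))  # serve layers from cache
```

The sketch ignores eviction and cache capacity; it only captures why one-off videos never enter the cache under this strategy.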


Coding complexity: AVC is 10-50% more efficient than SVC with regard to the computational resources used by the decoder.
Hardware support: Many devices support AVC hardware acceleration; SVC hardware acceleration is often not supported.
Quality of experience: SVC provides a better quality of experience in live-streaming environments with small buffers or a high latency.
Storage efficiency: AVC is at most 10% more efficient than SVC when comparing videos in a single quality; SVC is more storage efficient when multiple qualities of the same video are stored.
Caching performance: Caching only layers gives a better cache hit ratio than caching versions; versions could be used for direct streaming from the origin server.

Table 1: Comparison of SVC and H.264/AVC

4 Operating cost

The operating cost of a video streaming service is important for the viability of the service and for the competitive advantage over competitors. These costs depend partially on the coding technique used for the provided videos, especially with regard to the required storage capacity and bandwidth usage.

The previous section explained that storing different qualities of a video requires less storage space when using SVC than when using AVC. However, since bandwidth is a significant cost factor for video streaming services, the small overhead that SVC has per bit stream over AVC outweighs the benefits of decreased storage requirements for popular videos. For often-requested videos, SVC turns out to be more costly than AVC, whereas SVC is less costly for less-requested videos [7]. Therefore, using SVC for less popular videos and AVC for more popular ones is an opportunity for cost reduction.

A side note here is that [7] does not take caching costs and caching performance into account. When using caching servers, the total amount of required storage space increases per caching server, whereas the bandwidth cost increases per streamed video. Therefore, the storage efficiency benefit of SVC increases with the number of caching servers used, whereas the bandwidth cost is independent thereof.

5 Conclusion

This paper outlined the differences between scalable and non-scalable video coding, with a focus on the comparison of both with regard to multimedia streaming applications.

This paper shows that H.264/SVC decoders generally require 10 to 50% more computational resources than H.264/AVC decoders and that the AVC decoder is more likely to be hardware accelerated. Therefore, AVC provides a better coding efficiency in general. SVC provides a more stable playback quality in live-streaming environments where the playback buffer is small. SVC also provides a better storage efficiency if multiple video qualities are required. For a single video quality, SVC has an overhead of at most 10% compared to AVC.

This paper explained that the operating cost of a popular video streaming service is usually higher when using SVC than when using AVC to encode the videos, because bandwidth costs are a more significant factor than storage costs. However, caching layers results in a better caching performance than caching versions, which influences the operating cost and reduces the chance of congestion at the origin server.

Based on the properties outlined in this paper, AVC provides a better fit if coding complexity needs to be minimized or when hardware support at the decoder is important. Moreover, sending a single video quality requires more bandwidth using SVC compared to AVC, and since bandwidth is costly, the operating cost of a video-on-demand (VoD) service using AVC may be lower. However, when using caching servers, SVC provides a higher cache hit ratio, which leads to less congestion and a higher throughput for the service without the need for a better-performing origin server. A VoD service using caches with layers provides a better quality of experience to its users than a VoD service using AVC with caching, or a VoD service using no caching at all. Further research is needed to examine whether the higher cache performance with SVC outweighs the additional bandwidth usage needed for SVC when considering the total cost of the service. If it does not, the service provider can weigh the better quality of experience of its users against the additional cost.

6 Future Work

Despite the advantages SVC has with regard to playback quality, caching efficiency, and storage efficiency as presented in this paper, AVC is still the most used video encoding technique. Decoding all existing AVC videos and recoding them with SVC is a resource-intensive process. Transcoding from AVC to SVC and vice versa could prove a less resource-intensive solution [4].

A downside to encoding a video into layers with SVC is that a missing layer causes all higher layers to be unusable. Multiple Description Coding (MDC) is a coding technique using forward error correction to mitigate this problem. MDC has a significant overhead compared to SVC, caused by the extra bytes needed for the forward error correction. Adaptive Layer Distribution is another coding technique which tries to minimize this downside by providing a midway between efficiency and error resilience [11].

HTTP adaptive streaming adapts the video quality and requested bit rate based on the playback buffer it maintains. Prolonged network congestion causes the playback buffer length to decrease, which often causes the client to request lower bit rate segments from the server. Simultaneously, congestion affects TCP's congestion avoidance algorithm, which may decrease the bandwidth the application can use. The interaction between TCP's congestion avoidance algorithm and the adaptation algorithms used in HAS services is an area for future research.
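The popularity-dependent cost trade-off reported in [7] can be made concrete with a back-of-the-envelope model. All prices, file sizes, and view counts below are invented for illustration, and the 10% SVC bandwidth overhead is an assumption in line with the overhead figures cited in Section 3.3; none of these numbers come from [7].

```python
def operating_cost(views, storage_price, bandwidth_price,
                   version_sizes_gb, svc_overhead=0.10):
    """Compare the cost of serving one video as AVC versions versus one
    SVC layered file.  Prices are per GB; all numbers are assumptions."""
    avg_stream_gb = sum(version_sizes_gb) / len(version_sizes_gb)
    # AVC: store every version, stream without layering overhead.
    avc = (storage_price * sum(version_sizes_gb)
           + bandwidth_price * avg_stream_gb * views)
    # SVC: store one layered file, pay the overhead on every stream.
    svc_file_gb = max(version_sizes_gb) * (1 + svc_overhead)
    svc = (storage_price * svc_file_gb
           + bandwidth_price * avg_stream_gb * (1 + svc_overhead) * views)
    return avc, svc

# A rarely watched video: storage dominates, so SVC comes out cheaper.
print(operating_cost(views=1, storage_price=0.02,
                     bandwidth_price=0.05, version_sizes_gb=[1, 2, 4]))
# A popular video: bandwidth dominates, so the SVC overhead makes it pricier.
print(operating_cost(views=100000, storage_price=0.02,
                     bandwidth_price=0.05, version_sizes_gb=[1, 2, 4]))
```

The crossover point depends on the ratio of storage to bandwidth prices, which is exactly why [7] finds SVC cheaper only for less-requested videos.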


References

[1] S. Akhshabi, S. Narayanaswamy, A. C. Begen, and C. Dovrolis. An experimental evaluation of rate-adaptive video players over HTTP. Signal Processing: Image Communication, 27(4):271–287, 2012.

[2] T. Andelin, V. Chetty, D. Harbaugh, S. Warnick, and D. Zappala. Quality selection for Dynamic Adaptive Streaming over HTTP with Scalable Video Coding. In Proceedings of the 3rd Multimedia Systems Conference (MMSys '12), page 149, 2012.

[3] N. Bouten, S. Latré, J. Famaey, and F. De Turck. Minimizing the impact of delay on live SVC-based HTTP Adaptive Streaming services. In IFIP/IEEE IM 2013, pages 1399–1404, 2013.

[4] J. De Cock, S. Notebaert, P. Lambert, and R. Van de Walle. Architectures for fast transcoding of H.264/AVC to quality-scalable SVC streams. 11(7):1209–1224, 2009.

[5] J. Famaey, S. Latré, N. Bouten, W. de Meerssche, B. De Vleeschauwer, W. Van Leekwijck, and F. De Turck. On the merits of SVC-based HTTP Adaptive Streaming. In Integrated Network Management (IM 2013), 2013 IFIP/IEEE International Symposium on, pages 419–426. IEEE, 2013.

[6] F. Hartanto, J. Kangasharju, M. Reisslein, and K. Ross. Caching video objects: Layers vs versions? Multimedia Tools and Applications, 31:221–245, 2006.

[7] H. Kalva, V. Adzic, and B. Furht. Comparing MPEG AVC and SVC for adaptive HTTP streaming. In Digest of Technical Papers, IEEE International Conference on Consumer Electronics, pages 158–159, 2012.

[8] J. Martin, Y. Fu, N. Wourms, and T. Shaw. Characterizing Netflix bandwidth consumption. In 2013 IEEE 10th Consumer Communications and Networking Conference (CCNC 2013), pages 230–235, 2013.

[9] T. Oelbaum, H. Schwarz, M. Wien, and T. Wiegand. Subjective performance evaluation of the SVC extension of H.264/AVC. In IEEE International Conference on Image Processing, pages 2772–2775, 2008.

[10] J.-R. Ohm. Advances in scalable video coding. Proceedings of the IEEE, 93(1):42–56, 2005.

[11] J. J. Quinlan and A. H. Zahran. ALD: Adaptive Layer Distribution for scalable video. In MMSys 2013, pages 202–213, 2013.

[12] Y. Sanchez, T. Schierl, C. Hellge, T. Wiegand, D. Hong, D. De Vleeschauwer, W. Van Leekwijck, and Y. Le Louédec. Efficient HTTP-based streaming using Scalable Video Coding. Signal Processing: Image Communication, 27(4):329–342, 2012.

[13] Y. Sánchez, T. Schierl, C. Hellge, T. Wiegand, and D. De Vleeschauwer. iDASH: Improved Dynamic Adaptive Streaming over HTTP using Scalable Video Coding. In MMSys 2011, pages 257–264, 2011.

[14] H. Schwarz, D. Marpe, and T. Wiegand. Overview of the Scalable Video Coding extension of the H.264/AVC standard. IEEE Transactions on Circuits and Systems for Video Technology, 17(9):1103–1120, 2007.

[15] H. Schwarz and M. Wien. The Scalable Video Coding extension of the H.264/AVC standard. IEEE Signal Processing Magazine, 25(2):135, 2008.

[16] M. Wien, H. Schwarz, and T. Oelbaum. Performance analysis of SVC. IEEE Transactions on Circuits and Systems for Video Technology, 17(9):1194–1203, 2007.

Biometric Identification Methods

Juho Saarela
Student number: 84425K
[email protected]

Abstract

Due to the problems inherent in password-based identification, biometrics, such as fingerprints or iris recognition, are becoming an increasingly common method of identifying and authenticating users in everyday contexts. This paper discusses some of the most common or promising methods of biometric identification available and assesses their usability and reliability. Some common biometrics not discussed in this paper include hand geometry, voice recognition, DNA testing and skin spectroscopy.

KEYWORDS: biometric, identification, authentication

1 Introduction

In the modern world, a person is expected to remember many strong passwords, each unique to the specific service in question. However, in practice people tend to use merely a few different passwords with only limited strength, as numerous studies show [19, 10, 18].

Biometric identification is able to counter this vulnerability, since it is impossible for a person to "forget" his biometric data, because it is an intrinsic property of the person in question. Ideally, biometric identification would not only free users of having to remember passwords, but also of the risk of having their passwords stolen or cracked.

However, although it is possible to bypass a simple fingerprint check by having a copy of the print or by simply cutting off the actual finger of the user, the more advanced biometric identification methods typically prevent this kind of unauthorised usage. When authentication requires multiple biometric identifiers, forgery becomes extremely difficult.

However, losing biometric data to an outside entity is significantly more critical than losing a password. There are a limited number of biometric traits in humans (10 fingers, 2 eyes, etc.), and if those are all compromised, secure biometric authentication is no longer possible without a liveness detection setup [17]. Similarly, a user losing his biometric trait, for instance in an accident, could potentially be locked out of a system.

Since biometric identification provides a means to potentially make authentication easier for the user as well as increase security, it is definitely a worthy path of research. However, it is not a technology without risk. Should a technology such as automated face recognition become sufficiently advanced, it would make maintaining privacy considerably harder in a heavily surveilled environment, such as most modern urban areas.

2 Biometric Identification in General

The basic idea in biometric identification is to enroll a user into the system by providing a template, an initial sample against which future samples can be compared [33]. For instance, in the case of the fingerprint registry of the police, the template is created when a person applies for a passport or commits a crime for the first time, and subsequent samples are then matched against that template.

While in the registry of the police a person's fingerprint is associated with his other private information, such as a social security number, this is not necessary in biometric identification. Using biometrics, anonymous identification is possible if no data associated with the biometric template reveals the identity of the user [32]. This allows the creation of user registries that are strictly personal but do not reveal the identities of the users. The difference between such a biometric system and traditional password-based identification is that passwords can be shared while biometrics cannot.

Because of the numerous varying factors involved in taking samples, the various samples taken from a person are never exactly the same as the template [33]. The possible reasons for the differences are numerous. Overall, the combination of human factors, varying sampling environments and potentially different equipment all have an effect on the sampling process and can thus lead to greatly varying sample quality [11]. The acceptance threshold is the maximum disparity between a template and a sample such that the system still accepts a match [32].

Two of the most important metrics involved in biometric identification are the false rejection rate (FRR) and the false acceptance rate (FAR). A false rejection (also known as a type I error) occurs when a user has a template in the system, but the new sample does not match the template. A false acceptance (also known as a type II error) occurs when a sample is incorrectly accepted by the system as a match to a template. The false rejection and false acceptance rates determine how often these events occur [33].

The equal error rate (EER) is found by taking the acceptance threshold at which the FAR and FRR are equal, giving a relatively good indication of the overall robustness of the system [1]. However, since false acceptance is more detrimental to security than false rejection, the acceptance threshold should be stricter than it is at the EER.

It should be noted that the false acceptance rate only accounts for someone else using his own biometrics to try to access the system. In other words, even if a system has a 0 % false

acceptance rate, it might be trivial to spoof, making the system insecure [2].

While increasing the acceptance threshold decreases the false rejection rate, it also increases the false acceptance rate. Thus, it is important to find the optimal value to ensure that the system does not unnecessarily reject authorised users and at the same time still effectively prevents unauthorised access [33].

It is necessary to take great care with storing the biometric data. If the templates, for example fingerprints or iris images, are stored in a raw format, hacking into the database could potentially compromise all of the accounts linked to the stored fingerprints. To solve this issue, P. Li et al. [17] propose a method which would encode the fingerprint data using cryptography, similar to how passwords are typically hashed.

3 Identification methods

This section introduces the various common biometric identification methods available and covers their basic features and operation. Fingerprints are the most commonly used biometric trait, but the others are also in use in contexts that require a higher level of security or where fingerprints are not applicable.

3.1 Fingerprints

Probably the most widely used biometric identification method is fingerprint identification. Since fingerprints are almost unique to an individual and it is relatively straightforward to obtain a sample, it has maintained its popularity despite other emerging identification methods. Some modern mobile phones have integrated fingerprint sensors for authentication [9]. Lee [16] estimates that the chance of two people having the same fingerprint is less than one in a billion.

A serious problem with fingerprint authentication, as Jan Krissler [14] has demonstrated, is that fingerprints can be duplicated and successfully used from smudges on a mobile phone screen, with instructions being available online for everyone to follow [13]. A public system using fingerprint authentication would be especially vulnerable, as a malicious user would be able to copy the fingerprints of previous users. In addition, it is also possible to produce fake fingerprints from high-resolution photographs.

There are various methods to counter fake fingerprints being employed to bypass authentication. The electrical impedance of a living finger can accurately distinguish it from synthetic fingers, and it is possible to integrate such a device into the actual fingerprint sensor without loss of imaging quality or the need for an external device [29]. Since fake fingers tend to be more rigid than actual fingers, pressure compared to the area of the fingerprint can also reveal whether the fingerprint was made using a fake finger. Other properties that can help determine whether a fingerprint is genuine or not include temperature, electrical conductivity, pulse, perspiration and odour [21].

Fake fingers can also be detected on the software level. It is possible to recognise 90 % of fingerprints made using fake fingers of various materials by simply analysing the image sample [7].

Another issue with fingerprints is that there is a multitude of conditions that can make sampling difficult. Fingers can be wet or dry and they may be damaged, resulting in samples of low quality. [22]

In public use, although people do not mind using fingerprints because of their ease of use and familiarity, the lack of hygiene is of some concern, as the user needs to physically touch the sensor. [24]

Overall, despite the many drawbacks, fingerprints are a very practical biometric trait. With careful use, fingerprints can be employed in many contexts, including ATMs, and can reach an EER of 0.91 %, as Fig. 1 shows [1].

Figure 1: A graph showing the FAR and FRR of an ATM fingerprint system. The similarity score is the acceptance threshold used. [1]

3.2 Vein pattern matching

The veins in human hands and fingers are unique and thus can be used as biometric identifiers [22]. While fingerprints and iris images are relatively trivial to spoof without sophisticated liveness detection methods, the vein pattern of a hand or a finger of a person is very difficult to spoof.

The hand veins are always distinct, and even twins do not have matching vein patterns. Likewise, the vein patterns of the left and right hand are different. In addition, a sample can only be taken from a live body, and the patterns do not change over time. Since the veins are not visible to human eyes, and considering how difficult it would be to replicate a synthetic copy of a human hand, it is a biometric trait that is almost impossible to steal. Since vein imaging does not require physical contact with any device, it allows hygienic public usage. [6] Unlike fingerprints, which are susceptible to degradation and damage, vein patterns are a more durable biometric identifier.

The benefit of vein pattern matching is that it can be easily combined with other biometrics, such as finger dorsal texture matching. Fig. 2 shows how both the finger vein pattern and the dorsal texture can be captured at the same time. The IR light passes through the finger, allowing imaging of the veins,


Figure 3: A finger vein image. [34]

Figure 2: A figure of the setup where both the finger vein and dorsal texture patterns are captured. [34]

while the white LED is used to capture the finger texture. Fig. 3 shows the finger vein image and Fig. 4 the dorsal texture image.

Vein pattern matching is highly accurate and difficult to spoof, thus offering a trustworthy option for biometric authentication. When the finger vein pattern is combined with finger dorsal texture matching, an EER as low as 0.435 % can be reached. [34] Even with only finger vein matching, an EER of 0.5 % is possible. [6] However, when imaging finger veins, the rotation of the finger must be constant to get accurate results, making it cumbersome and error-prone to use. [34]
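Combining finger vein and dorsal texture matching as described above is commonly done at the score level: each matcher produces a similarity score, and a weighted sum is compared to a threshold. The weighted-sum rule, weight, and threshold below are generic illustrative assumptions, not the specific fusion method of [34].

```python
def fused_decision(vein_score, texture_score, weight=0.7, threshold=0.6):
    """Weighted-sum score fusion of two matchers (scores in [0, 1]).
    The weight favours the stronger modality; both values are assumptions."""
    fused = weight * vein_score + (1 - weight) * texture_score
    return fused >= threshold

# A strong vein match compensates for a mediocre texture match ...
print(fused_decision(0.9, 0.3))   # 0.7*0.9 + 0.3*0.3 = 0.72 -> accepted
# ... but two weak matches are rejected.
print(fused_decision(0.5, 0.4))   # 0.47 -> rejected
```

Fusing independent traits in this way is what allows the combined system to reach a lower EER than either matcher alone.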

Figure 4: A finger dorsal texture image. [34]

3.3 Iris

The iris pattern is unique from eye to eye, making it a good biometric trait similar to fingerprints. The iris contains many details and thus provides much data for biometric analysis. Irises are also immutable, do not degrade with age and have a natural protection provided by the cornea. [23]

Typically, iris images are captured using digital cameras and near-infrared lights to illuminate the eye. The near-infrared lights allow the camera to better capture dark, heavily pigmented irises while being unintrusive to the human eye. [4] Lu et al. [20] have demonstrated that mobile phones can also be used for capturing and processing the images.

Imaging irises poses many issues, because various obstructing factors, such as reflections, eyeglasses, eyelashes, image noise and look direction, may degrade the sample quality. Wang et al. [31] have managed to create a system that reaches an EER of 1.83 % using a challenging iris image database.

Detecting spoofing can be performed at both the hardware and the software level. On the hardware level, the liveness of the iris can be verified, for instance, by observing the pupil reacting to light changes and by having multiple lights emit various frequencies while simultaneously analysing the spectrum and 3D structure of the iris pattern. [9] Software-based methods analyse the image and try to detect whether the sample was captured from a printed picture or an actual eye. This is achieved by signal processing, local binary patterns, or various other algorithmic methods. In a competition for the best liveness detection algorithm held in 2014, the winning group achieved an FAR of 0 % and an FRR of 0.5 % with the test set used in the competition. [28] Generally, hardware-based liveness detection performs better than software-based approaches, but it also costs more and is less flexible. [9]

When using specialised equipment that captures the iris patterns of both eyes and has an integrated facial recognition system, iris pattern matching can reach an EER of 0.131 %. The same system achieves an EER of 3.29 % when using only one iris as the biometric and 0.36 % when using both irises. [12] This shows that multimodal biometric authentication offers significant improvements over using just a single biometric trait.

127 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

Table 1: The equal error rates of various biometric authentication systems. No research papers on touch-based identification provided EER values.

Biometric trait                               EER
Fingerprint                                   0.91 %   [1]
Finger vein pattern                           0.5 %    [6]
Finger vein pattern + finger dorsal texture   0.435 %  [34]
Single iris                                   1.83 %   [31]
Both irises                                   0.36 %   [12]
Both irises + face                            0.131 %  [12]
Facial recognition                            3 %      [3]
Retina                                        0 %      [15]
Touch-based                                   N/A

3.4 Retina

The retina is a light-sensitive layer of tissue located at the back of the eye, and its use as a biometric is based on the uniqueness of its vein pattern. Aside from DNA, the vein pattern of the retina is the most reliable and stable biometric trait and varies even between identical twins. [25]

Although retinal scans are used in areas requiring high security, such as military or government complexes, the need for expensive equipment combined with the cumbersome sampling process prevents widespread use. [25] The vein pattern of a retina can achieve an EER of 0 % [15].
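The multimodal rows of Table 1 reflect score-level fusion of several matchers. As a rough illustration, the sketch below implements a generic min-max normalization plus weighted-sum fusion baseline; this is not the specific fusion method of [12], and the matcher score ranges and weights are invented for the example:

```python
def min_max_normalize(score, lo, hi):
    """Map a raw matcher score into [0, 1] given that matcher's score range."""
    return (score - lo) / (hi - lo)

def fuse_scores(scores, ranges, weights):
    """Weighted-sum fusion of per-matcher scores after min-max normalization.
    The three lists are aligned per matcher; the weights should sum to 1."""
    return sum(w * min_max_normalize(s, lo, hi)
               for s, (lo, hi), w in zip(scores, ranges, weights))

# Illustrative only: an iris matcher scoring in [0, 100] and a face
# matcher scoring in [0, 1], with the iris matcher weighted more heavily.
ranges = [(0, 100), (0, 1)]
weights = [0.7, 0.3]

fused = fuse_scores([80, 0.5], ranges, weights)
print(fused)  # 0.7 * 0.8 + 0.3 * 0.5 ≈ 0.71
```

The fused score is then compared against a single acceptance threshold, which is what lets a combined system reach a lower EER than either matcher alone.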

3.5 Facial features

Using facial features as a biometric trait presents many challenges not inherent in most other biometric traits. The background, illumination and facial expression can differ from sample to sample, on top of which aging, varying hairstyles and items such as glasses can lead to discrepancies. Although the human brain is able to recognise thousands of faces, machine-based recognition still faces significant challenges. [26]

Machine-based face recognition can be based on many different algorithms. A simple approach is to match either the entire face or the individual features of the face. However, this requires a lot of memory and computational performance, and it is difficult to extract specific facial features. [26] More advanced methods use concepts such as eigenfaces [26] or multiscale local phase quantization [5].

Though it is possible to spoof a facial recognition system with images or displays, there are methods to algorithmically spot fakes among live samples efficiently and with high accuracy. Some of these methods are based on pure image analysis, while others take the overall movements of the face and facial features into account [8]. Some face detection systems use 3D data, increasing accuracy and making spoofing more difficult [3].

Since no recent research seems to provide a common EER metric for face recognition systems, it is difficult to compare their overall performance to other biometrics. However, using different face image databases, each containing images of varying quality, Wagner et al. [30] have managed to achieve a recognition rate of at least 95.1 %. When using a database of faces with sunglasses obstructing the view, the rate drops to 40.9 %. In a survey conducted in 2006, some facial recognition systems achieved an EER of 3 %. [3]

3.6 Touch-based identification

It is possible to identify users by the way they use a mobile phone's touch screen. The pressure, duration and touch width of the presses differ for everyone and can be analysed to determine the identity of the user. Faking another user's patterns is very difficult, because the attacker would need to have similar neurology, finger softness and usage habits. [27]

In a system proposed by Seo et al. [27], the device initially monitors the user's behaviour, collects data on input patterns and sends it to a server. A server is used because intense computation is required to analyse the patterns and learn the typical characteristics of the user's way of entering input. If the user's patterns differ too much from the template on the server, the user needs to authenticate using an additional method, such as a password or a fingerprint.

A huge benefit of touch-based identification is that it requires no additional effort from the user. While a fingerprint or iris scan needs an additional action from the user, touch-based authentication can be active at all times without obstructing usability in any way. Furthermore, because there is no need for additional hardware, it is also cheap to employ. The system can identify users with an accuracy of nearly 100 %. [27]

4 Data Analysis

Comparing various biometric authentication systems is difficult, because the EER metric seen in Table 1 does not give a comprehensive picture of the overall quality of a system. Hypothetically, there could be two systems with an equal EER of 2.5 %. One of them could be configured to use an acceptance threshold that gives the system an FAR of 0 % and an FRR of 3.0 %, while the other might not be able to achieve an FAR of 0 % without the FRR increasing to, say, 80 %. It is clear that in a case like this, the latter system is significantly worse. Due to the inconsistent way many papers present their results, however, it was difficult to obtain anything except the EER for comparison. A more descriptive metric would be to list the FRR at various fixed FAR values, for example at 1, 0.1, 0.01 and 0.001 %.

In addition, the use contexts of various biometric authentication systems can be vastly different. For example, capturing iris images with a mobile phone is bound to produce different results from using specialised equipment for the task.

The comparison does not reveal the practicality of the systems, either. Even though retina-based biometric authentication is superior to the other biometric authentication methods listed here, it requires highly specialised equipment and is therefore unsuitable for widespread use, unlike fingerprints, for example.


Using multimodal authentication provides better results. A system should check more than just one biometric trait whenever it is feasible. For example, a mobile phone could be used to check both the fingerprint and the iris of the user, but implementing a hand or finger vein matching system in a mobile context would be impractical.

5 Conclusion

New technologies concerning biometric identification, combined with the widespread use of mobile phones and other devices able to capture samples, open up new avenues for how people will authenticate themselves in the future. Already, more and more people use fingerprints instead of passwords to unlock their devices, and it is only a matter of time until passwords become the exception instead of the rule.

It is important to note, however, that the error rates of the various biometric identification methods are based on the sometimes quite limited amount of test data used in the research, and thus they should be viewed with caution. Even so, the results are very promising and already provide a great number of biometric traits and combinations of traits that can be used to reliably identify a person.

References

[1] I. Babatunde and A. Charles. A fingerprint-based authentication framework for ATM machines. J Comput Eng Inf Technol, 2:3, 2013. doi: http://dx.doi.org/10.4172/2324-9307.

[2] B. Biggio, Z. Akhtar, G. Fumera, G. L. Marcialis, and F. Roli. Security evaluation of biometric authentication systems under real spoofing attacks. IET Biometrics, 1(1):11–24, 2012.

[3] K. W. Bowyer, K. Chang, and P. Flynn. A survey of approaches and challenges in 3D and multi-modal 3D+2D face recognition. Computer Vision and Image Understanding, 101(1):1–15, 2006.

[4] K. W. Bowyer, K. Hollingsworth, and P. J. Flynn. Image understanding for iris biometrics: A survey. Computer Vision and Image Understanding, 110(2):281–307, 2008.

[5] C. H. Chan, M. A. Tahir, J. Kittler, and M. Pietikainen. Multiscale local phase quantization for robust component-based face recognition using kernel fusion of multiple descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(5):1164–1177, 2013.

[6] A. Chandra et al. Finger vein based user identification using differential box counting. IJRCCT, 3(1):138–142, 2014.

[7] J. Galbally, F. Alonso-Fernandez, J. Fierrez, and J. Ortega-Garcia. A high performance fingerprint liveness detection method based on quality related features. Future Generation Computer Systems, 28(1):311–321, 2012.

[8] J. Galbally, S. Marcel, and J. Fierrez. Image quality assessment for fake biometric detection: Application to iris, fingerprint, and face recognition. IEEE Transactions on Image Processing, 23(2):710–724, 2014.

[9] D. Gragnaniello, C. Sansone, and L. Verdoliva. Iris liveness detection for mobile devices based on local descriptors. Pattern Recognition Letters, 2014.

[10] S. Haque, M. Wright, and S. Scielzo. A study of user password strategy for multiple accounts. In Proceedings of the Third ACM Conference on Data and Application Security and Privacy, pages 173–176. ACM, 2013.

[11] A. K. Jain, A. Ross, and S. Prabhakar. An introduction to biometric recognition. IEEE Transactions on Circuits and Systems for Video Technology, 14(1):4–20, 2004.

[12] Y. G. Kim, K. Y. Shin, E. C. Lee, and K. R. Park. Multimodal biometric system based on the recognition of face and both irises. International Journal of Advanced Robotic Systems, 9(65), 2012.

[13] J. Krissler and Chaos Computer Club. Chaos Computer Club breaks Apple TouchID. http://www.ccc.de/en/updates/2013/ccc-breaks-apple-touchid, 2013. Accessed: 2015-04-16.

[14] J. Krissler and Chaos Computer Club. Ich sehe, also bin ich ... du. http://media.ccc.de/browse/congress/2014/31c3_-_6450_-_de_-_saal_1_-_201412272030_-_ich_sehe_also_bin_ich_du_-_starbug.html, 2014. Accessed: 2015-04-16.

[15] S. M. Lajevardi, A. Arakala, S. A. Davis, and K. J. Horadam. Retina verification system based on biometric graph matching. IEEE Transactions on Image Processing, 22(9):3625–3635, 2013.

[16] S. W. Lee and B. H. Nam. Fingerprint recognition using wavelet transform and probabilistic neural network. In International Joint Conference on Neural Networks (IJCNN'99), volume 5, pages 3276–3279. IEEE, 1999.

[17] P. Li, X. Yang, H. Qiao, K. Cao, E. Liu, and J. Tian. An effective biometric cryptosystem combining fingerprints with error correction codes. Expert Systems with Applications, 39(7):6562–6574, 2012.

[18] Z. Li, W. Han, and W. Xu. A large-scale empirical analysis of Chinese web passwords. In Proceedings of USENIX Security, pages 1–16, 2014.

[19] Z. Liu, Y. Hong, and D. Pi. A large-scale study of web password habits of Chinese network users. Journal of Software, 9(2):293–297, 2014.

[20] H. Lu, C. R. Chatwin, and R. C. Young. Iris recognition on low computational power mobile devices. INTECH Open Access Publisher, 2011.


[21] E. Marasco and A. Ross. A survey on anti-spoofing schemes for fingerprints. ACM Computing Surveys, 47:1–36, 2014.

[22] C. Nandini, C. Ashwini, M. Aparna, N. Ramani, P. Kini, and K. Sheeba. Biometric authentication by dorsal hand vein pattern. International Journal of Engineering and Technology, 2(15):837–840, 2012.

[23] M. Negin, T. A. Chmielewski Jr, M. Salganicoff, U. von Seelen, P. Venetainer, and G. G. Zhang. An iris biometric system for public and personal use. Computer, 33(2):70–75, 2000.

[24] G. Peevers, R. Williams, G. Douglas, and M. A. Jack. Usability study of fingerprint and palm vein biometric technologies at the ATM. International Journal of Technology and Human Interaction (IJTHI), 9(1):78–95, 2013.

[25] S. Qamber, Z. Waheed, and M. U. Akram. Personal identification system based on vascular pattern of human retina. In Biomedical Engineering Conference (CIBEC), 2012 Cairo International, pages 64–67. IEEE, 2012.

[26] S. Ravi and S. Nayeem. A study on face recognition technique based on eigenface. International Journal of Applied Information Systems, 5(4), 2013.

[27] H. Seo, E. Kim, and H. K. Kim. A novel biometric identification based on a user's input pattern analysis for intelligent mobile devices. International Journal of Advanced Robotic Systems, 9, 2012.

[28] A. F. Sequeira, H. P. Oliveira, J. C. Monteiro, J. P. Monteiro, and J. S. Cardoso. MobILive 2014 – mobile iris liveness detection competition. In Biometrics (IJCB), 2014 IEEE International Joint Conference on, pages 1–6. IEEE, 2014.

[29] T. Shimamura, H. Morimura, N. Shimoyama, T. Sakata, S. Shigematsu, K. Machida, and M. Nakanishi. Impedance-sensing circuit techniques for integration of a fraud detection function into a capacitive fingerprint sensor. IEEE Sensors Journal, 12(5):1393–1401, 2012.

[30] A. Wagner, J. Wright, A. Ganesh, Z. Zhou, H. Mobahi, and Y. Ma. Toward a practical face recognition system: Robust alignment and illumination by sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(2):372–386, 2012.

[31] N. Wang, Q. Li, A. A. A. El-Latif, T. Zhang, and X. Niu. Toward accurate localization and high recognition performance for noisy iris images. Multimedia Tools and Applications, 71(3):1411–1430, 2014.

[32] J. Wayman, A. Jain, D. Maltoni, and D. Maio. An introduction to biometric authentication systems. In Biometric Systems, pages 1–20. Springer, 2005.

[33] A. C. Weaver. Biometric authentication. Computer, 39(2):96–97, 2006.

[34] W. Yang, X. Huang, F. Zhou, and Q. Liao. Comparative competitive coding for personal identification by using finger vein and finger dorsal texture fusion. Information Sciences, 268:20–32, 2014.

Survey of ARM TrustZone applications

Pawel Sarbinowski
Student number: 466819
[email protected]

Abstract

ARM TrustZone security extensions were introduced almost a decade ago in the ARMv6 architecture, and since then they have received a lot of attention as a security primitive in low-cost hardware. The main concept introduced in TrustZone is the separation of the CPU operating mode into a "secure" and a "non-secure" mode. The two modes are isolated via hardware-based access control features, without the need for a separate co-processor for the secure environment. Commonly, TrustZone technology is leveraged to run minimal pieces of software to which sensitive operations are delegated, while the regular operating system and applications run in the non-secure mode.

Demands for trusted computing in commodity computing platforms, including handsets, tablets, wearable devices and even embedded systems, have stimulated industry research (e.g. [1], [2]). Innovative applications are also being explored increasingly by academia. Some of those include payment protection technology, digital rights management, BYOD (Bring Your Own Device) security enforcement, and a plethora of other security solutions. This paper surveys current applications of ARM TrustZone technology in both industry and academia. It also explores the ongoing standardization efforts in the domain and identifies potential open research problems in the area.

KEYWORDS: ARM TrustZone, Trusted Execution, Integrity Monitoring, Linux, Virtualization, Mobile Security

1 Introduction and Motivation

In the past two decades the usage of mobile and embedded devices across all categories of computer ecosystems has increased rapidly. These devices, used by consumers or as infrastructure elements, are usually interconnected and create or contain vast amounts of information. At the same time, various stakeholders are striving to provide better security in the mobile ecosystem, motivated by several requirements. [3] [4] Some of those include:

• regulatory requirements, e.g. that the radio frequency parameters defined during manufacture are stored securely and remain unaltered. Recently, a bill was passed in the state of California that forces all mobile phones being sold to have a "kill switch" mechanism able to shut down and destroy the phone remotely in the event of theft or loss [5].

• standardization requirements, such as the mandate that the International Mobile Equipment Identifier (IMEI) remains unchanged. This is also important for uniquely identifying devices and dealing with theft.

• business requirements, such as protection of digital content using Digital Rights Management (DRM) backed by hardware security mechanisms, or enforcement of phone operator subsidy locks to ensure that subsidized phones given to subscribers as part of a contract cannot be used with other mobile operators. Remote attestation of the integrity of a client or employee computer might also be necessary in cases of remote collaboration; this would ensure that the computing base used on the remote devices can be trusted and that the boot system and operating system have not been tampered with.

• user security requirements, which call for secure storage of various credentials and other private information (e.g. contacts) on mobile devices, not only for login credentials but also for applications such as secure payments. Another scenario of interest is performing integrity checks for anti-virus and anti-theft software, to subsequently notify the user if something is altered.

• developer interest in securing certain parts of their code that hold sensitive information (e.g. keys for remote APIs).

These raised security concerns have led to various system-level solutions that employ hardware to run sensitive code or store private information in Trusted Execution Environments (TEEs). TEEs utilize specialized hardware to provide a secure, integrity-protected environment for processing and storing information.

Currently, the TEE's functionality is not typically available to third parties other than the device manufacturers. OEMs take advantage of the TEE to protect resources (e.g. licensing metadata in the case of DRM) and functionality, such as crypto operations, from an adversary that has administrative control of the normal operating system. However, there are many use cases where third-party applications, and not just the OEM, would also benefit from security features provided by a TEE. Examples of such applications are one-time password generation for two-factor authentication, user authentication mechanisms, secure mobile transactions for e-payments and even enforcement of access control for applications containing private data, such as e-health apps. Alternative hardware architectures for implementing these kinds of scenarios typically utilize a trusted

co-processor (Trusted Platform Module or TPM) or a hardware module (e.g. a SIM card), but a separate co-processor also leads to higher system costs and in some cases to a loss of, or difficulty in, programmability.

ARM TrustZone security extensions have been available for some time already, since the ARM11 core released in 2002, and they are available in the large majority of new mobile and embedded devices with ARM processors. TrustZone's main advantage, compared to other solutions that use a separate on-chip or off-chip co-processor, is that it is inexpensive (i.e. no extra hardware chips are needed) and it takes advantage of the full processing power of the CPU rather than just a small subset on a slower secure chip. Both proprietary and open-source TEE operating systems and TEE applications are being built on top of TrustZone, and there are ongoing efforts (noted later in Section 6) to ease the process of developing and debugging on top of TrustZone. This paper explores some of the implementations utilizing TrustZone from both academia and industry.

The structure of this survey is as follows: Section 2 provides an overview of TrustZone technology, presents the key concepts of the architecture and discusses some of the challenges in the field. Sections 3 and 4 present some existing relevant applications utilizing TrustZone, while Sections 5 and 6 discuss related work and efforts to standardize the TrustZone API available to third-party developers. Finally, Section 7 provides some concluding remarks.

2 Background

Existing hardware security solutions can be separated into two groups: those that utilize a separate trusted co-processor in a different chip, either outside or inside the CPU die, and single-chip solutions, where the hardware security is incorporated into the CPU architecture itself and there is no second co-processor. Secure co-processor architectures can be split further, based on the hardware they use, into: smartcards, used in systems such as ATMs, smart TVs and communication equipment (e.g. SIM cards); Trusted Platform Modules (TPMs), which have a separate secure co-processor, usually soldered onto a PC board, and are the primary means of providing standardized trusted computing features to PCs; and Hardware Security Modules (HSMs), which may contain more than one cryptoprocessor and offer multiple levels of security, not only against software attacks but also against physical attacks.

Some of the double-chip CPU architectures include Intel TXT, which uses a TPM and measures the integrity of software and platform components (i.e. code, data structures, configuration files and more) in order to allow management applications to make trust decisions correspondingly, and AMD PSP (Platform Security Processor), which uses a dedicated co-processor featuring ARM TrustZone technology to guarantee features like secure boot, trusted execution and more.

Examples of single-chip platforms are: AEGIS [6], which utilizes memory integrity verification, encryption/decryption of off-chip memory and a secure context manager; XOM [7] (eXecute Only Memory), where the processor is equipped with a private key known only to the processor manufacturer while the corresponding public key is published, so that anyone can encrypt software using the public key but only the processor holding the corresponding private key can decrypt and execute it, and all software is stored encrypted in memory while it is executed; and Intel SGX (Software Guard Extensions) [8], a set of new CPU instructions that applications can use to specify protected regions of code and data referred to as enclaves. An enclave is a protected area in the application's address space which provides confidentiality and assures integrity even in the presence of privileged malware; accesses to the enclave memory area from any software not resident in the enclave are prevented. Finally, there is ARM's TrustZone platform.

Figure 1: TrustZone architecture [9]

The ARM TrustZone architecture (Fig. 1) serves as a much more flexible single-chip alternative, with a fully functioning CPU whose secure component is freely programmable, compared to the double-chip solutions that have a fixed programming interface and, usually, a fixed feature set for secure applications to use. Programming a TEE on top of a double-chip TPM or HSM is also feasible, but there are drawbacks in terms of performance, since the secure co-processor is a more limited chip due to its cost. TrustZone introduces the concept of two logical processing modes for the ARM CPU, referred to as "secure world" and "normal world" mode. Each mode is isolated from the other via hardware-based access control features. Typically, TrustZone technology is leveraged to run small, security-specialized portions of code in secure world mode, whereas the conventional operating system and applications run in normal world mode. The distinction between the worlds is implemented in hardware; thus the prevailing software protection rings between user-level and kernel-level code are not affected, and the operating system does not have to be aware of the two CPU modes for applications to utilize them. The hardware separation is also propagated over the system bus to peripheral devices, memory controllers and CPU registers. Thus, when secure mode is active, the software running on the CPU has a view of the whole system that is isolated from the software running in non-secure mode.

The main benefit of this approach is the minimal loss in processing power of the secure CPU component (which is

now the same CPU) and the extensible functionality a programmer can implement on the CPU, instead of the hard-wired actions previously found on TPMs.

To transition between the two worlds, a special Secure Configuration Register (SCR) is used. Specifically, the Non-Secure bit in the register (SCR.NS) is set to 1 to transition from the secure to the non-secure world. Code running in normal mode cannot directly change the NS bit; therefore, to enter secure mode again, a set of calls is defined that includes secure monitor calls, interrupts and external memory system aborts. Secure monitor calls are handled in the secure world and act as a sort of cross-world system call.

For completeness we will also define some additional terms. The Rich Execution Environment (REE) is the full-feature operating system, such as Linux, Windows, OS X, Android or iOS. The TEE OS is the operating system running in the secure world, usually comprising a microkernel, a memory management unit and code implementing an API for client/trusted apps. The TEE is a superset of that, and essentially includes all the functionality (in hardware and software) that is necessary for the controlled isolation of secure apps from the REE. A Trusted Application (TA) is an application running inside a TEE that provides certain functionality to client applications. Finally, a Client Application (CA) is an application running in the normal world (REE) that accesses a TA's functionality or offloads a security-critical part of its own to a running TA.

The main features of the TrustZone platform can be split into boot integrity, secure storage, device identification, isolated execution, device authentication and an API to access the TrustZone features. A quick overview of these components follows:

Boot Integrity - Acts as a building block for the security of the entire system. It verifies the integrity of the software at each stage of the boot (BIOS firmware, bootloader, operating system). The boot code's signature hash is checked against a securely stored hash that was signed by the device manufacturer; the device manufacturer's public key is used to decrypt the signature and to compare the two hashes. The comparison of the hashes continues for each subsequent boot stage, and if they differ at some point, the device either stops the boot (in the case of secure boot) or stores the differing measurements for later use by the OS or applications (in the case of authenticated boot).

Secure Storage - Uses cryptography to preserve the confidentiality and integrity of application data. Usually the encrypted data is stored on otherwise insecure peripherals (e.g. a hard disk), but it is encrypted with a key that is stored inside the secure non-volatile memory of the CPU. All the cryptographic methods are also executed in the context of the secure world.

Device Identification - Uses the secure storage feature to verify and preserve multiple unique identifiers for each device. A base identity is defined and certified by the device manufacturer, which stores it in the secure non-volatile memory. This base identity can then be used to verify and sign newly assigned identities that can be used for, e.g., e-ticketing applications.

Isolated Execution - Runs trusted applications inside the secure world, separated from each other and from the normal world. It also guarantees that any code and data running in the secure world is protected at run-time from access, even by privileged code running in the normal operating system. A minimal TEE operating system is used to manage trusted applications running in the secure world, as well as communication with them. Trusted applications that intend to run in the secure environment have to be signed by the device manufacturer.

Device Authentication - Verifies the integrity of the system to third parties that need attestation. In this case, the system's state (OS running, boot software signature and trusted applications) is signed with a device-specific key known to the device manufacturer beforehand. The signed device authentication can then be verified by a third party using the public key of the manufacturer, who signs a pre-defined set of known system configurations (OS, boot code, apps etc.) that are acceptable for the device, thus proving the integrity of the system.

API - The TEE operating system, running in the secure world, provides a programming interface for communication between Client Applications (CAs) in the "insecure" OS and Trusted Applications in the TEE, and even between Trusted Applications within the TEE. In the latter case, the isolation is provided by privileged-mode code that is part of the TEE OS, not by distinct hardware features.

2.1 TEE implementations

The ARM TrustZone extensions [10] mainly define the hardware architecture necessary for a secure computing component; to develop an application on top of those extensions, a TEE is necessary. Although ARM provides some basic examples [11] of how to utilize the TrustZone security monitor to develop simple applications on top of TrustZone, in the long term a more complex, dedicated TEE OS running in the secure world will be a more robust solution, capable of managing several Trusted Applications at the same time. Several proprietary TEE operating systems have already been built on top of the TrustZone architecture to provide the necessary functionality and management of Trusted Applications.

Mobicore OS was initially developed by a company called Giesecke & Devrient (G&D), which together with the ARM Secure Services Division and Trusted Logic Mobility (TLM) established Trustonic [12]. Now, Trustonic provides both a (GlobalPlatform-compliant) TEE operating system and a TEE Directory. TEE Directory is a framework for remote deployment and management of Trusted Applications built on top of Trustonic's TEE, and it adheres to the GlobalPlatform remote management protocol standard. Trustonic provides an SDK for its TEE to developers (after registration) and acts as a Trusted Service Manager for the Trusted Apps that are deployed.

Qualcomm has also implemented a proprietary TEE OS of its own, called Qualcomm Secure Execution Environment (QSEE) [13], on top of TrustZone as provided in its S4 Snapdragon 8XXX processor series. Besides the TEE, Qualcomm's environment also provides several proprietary Trusted Applications, such as: StudioAccess, for DRM in media streamed by partners including Amazon, Hulu, Netflix and more; Enterprise and BYOD, for securing corporate data on employee devices with device-unique crypto

keys; SafeSwitch, which acts as a remote kill switch in case of theft; and authentication using biometrics based on the Fast IDentity Online (FIDO) standard. Finally, QSEE provides a TA that acts as a hardware back end for Android's Keystore API.

Finally, Solacia has implemented SecuriTEE OS [14] as a proprietary TEE that adheres to the GlobalPlatform standards and is mostly used for secure payments and content protection.

All the aforementioned implementations are closed source and require licensing fees to use or develop on, and thus restrict the development of TEE applications by a wider pool of programmers.

3 Applications utilizing TrustZone

Existing applications based purely on TrustZone-provided features (i.e. trusted applications built on top of the TEE OS) are described in the following papers: [9], [15], [16], [17], [18].

Motivated by the increase in both mobile device usage and the ease of digital content distribution in the past decades, Hussin et al. [9] examine the usage of TrustZone for secure storage and the enforcement of DRM policies. Their work utilizes an existing DRM distribution model found in Symbian OS. In this model, however, DRM metadata files and vital application data (such as media files, game levels, and data controlling the application logic) are stored in the TrustZone-based secure storage. Furthermore, the DRM License Manager, which is in charge of regulating application execution through licenses and protecting application data, is run inside the secure world to ensure integrity of execution. The authors also propose an E-Pass application based on this DRM model, where the ticket issuer and the user share a signed electronic pass. The pass is stored transparently in the user's secure hardware and resists tampering.

A platform, backed by TrustZone, for anonymous electronic payments is presented by Pirker and Slamanig [16], who try to unify the convenience of electronic payments with the anonymity of cash. Traditionally in electronic payments, a user first purchases electronic credit via some sort of e-banking and then uses the credit on the provider's services (e.g. in the context of a transport payment system). To avoid linking users' banking accounts to their transport information, the authors introduce signed electronic tokens that a user can buy anonymously with cash. The tokens are in the form of printed QR codes that a user can scan with his or her smartphone. The signed token is then stored in the TrustZone secure storage to preserve privacy even in the event of phone loss.

In further work, a trusted-path model for one-time password (OTP) applications is introduced and studied. The model assumes there is a trusted path to the display, that all user input passes through a TrustZone secure peripheral bus, and that the Memory Management Unit supports protected memory separation between the secure and the normal world. The authors show that a trusted path can exist between user input and the display of OTP credentials, albeit with two disadvantages: first, the chip manufacturer is the external entity that must be trusted for the system to work, and, second, TrustZone does not come with a dedicated software implementation by itself, complicating its usage by developers.

4 REE OS Hardening

Research applications that effectively build another abstraction layer on top of the TEE OS are presented below. They aim mostly at creating a middle layer that either secures certain procedures transparently or provides a much easier-to-use API for applications that wish to do so.

TrustZone-based Real-time Kernel Protection (TZ-RKP) [1] is an approach aiming to provide enhanced security for the operating system kernel by transferring its security monitor to the TrustZone secure world. Consequently, the security monitor, which handles all access control, is protected from attacks originating in the normal world where the rest of the kernel is running. TZ-RKP transfers control over privileged system functions to the secure world, where each access request is inspected before being granted. Examples of such malicious actions can be requests for system-sensitive data, hiding malicious processes or escalating the privileges of a malicious app. Existing hypervisor solutions achieve the same goal, but are themselves also prone to attacks from processes running in the normal world. The target kernel runs in the normal world, while TZ-RKP, containing the security monitor, runs in the secure world. By running the security-critical part of the kernel in the hardware-backed secure world, TZ-RKP achieves a compromise between security and usability, and since it is already deployed in Samsung Galaxy devices, it is a tested and feasible approach. To guide the flow of code execution through the access monitor and memory management unit residing in the secure world, TZ-RKP utilizes hooks in the kernel. This provides full control over the memory management of the normal-world kernel, as well as live interception and monitoring of critical events that can be denied if they negatively impact the security of the system. To further ensure that the kernel hooks used are not altered or bypassed by a malicious process, TZ-RKP also maps the memory containing the kernel code as
The transport provider will read-only; defines the memory area containing dynamically also have to keep a corresponding list of the tokens issued allocated kernel data (such as virtual-to-physical memory for usage but there is no need to keep any type of personal translation tables) as read/write only by the kernel code information for the users, thus preserving their private infor- and as non-executable; allows only trusted applications mation. The sensitive computations and storage would run as that are signed, verified and inspected to run in the TEE. a Trusted Application (Trustlet) in the context of TrustZone. Extensive testing confirmed TZ-RKP’s effectiveness and Rijswijk-Deij. et. al([18]) - investigate the possibility of benchmarks performed showed that the overhead of this using a device having TrustZone capabilities to implement system is minimal and ranges between 0.2% and 7%. One Time Passwords (OTP) for two-factor authentication A security enhancing framework is introduced by with a comparable level of security. To allow the compar- Zu. et. al.([15]) designed for embedded systems running ison, a conceptual model of user interaction with OTP appli- Linux. This development is motivated by the increase in

usage of embedded systems in pervasive computing environments (e.g. networking stacks, Internet of Things devices, instruments for automation, and devices used in aeronautic research). The proposed framework includes a Linux Security Module (LSM) that hooks into the Linux kernel and enforces additional access policies based on security models such as Bell-LaPadula and the Domain and Type Enforcement model. To ensure that the security policies cannot be tampered with without authorization, the framework is mainly implemented as a Trusted Application running in the secure world of TrustZone and provides an API for the LSM hooks of the kernel. This acts as an extra layer of Mandatory Access Control, similar to the way modules like SELinux, AppArmor and Smack work, but since it is backed by the TZ hardware, it is significantly harder for attackers to bypass without physical access.

A virtualization platform designed to secure embedded and mobile systems is proposed by Johannes Winter [19]. Its core idea is to split insecure software into isolated user-space VMs that communicate with secure-world trusted engines. These are managed by a Linux-based kernel running in the secure world that enforces MAC (mandatory access control) and isolated execution with SELinux policies. Each of these trusted engines has a properly defined interface for communicating with other trusted engines. The isolation of the trusted engines is based on the features TrustZone offers, i.e. secure peripherals cannot be accessed by non-secure software, and non-secure software cannot access secure memory without authorization from the secure world. The proposed architecture handles all the low-level details, such as dispatching secure monitor calls, and enforces restrictions on the resource usage of the non-secure-mode software. The platform uses a secure boot loader that authenticates the secure-world Linux kernel image and measures it. This measurement is then compared to a Reference Integrity Metric (RIM) certificate attached to the kernel image. Only if the RIM certificate matches the integrity measurements taken will the boot continue and control be handed to the operating system running in the non-secure world. To communicate with secure code, any secure monitor call invoked by non-secure code is handled by the user-space VM supervisor. This VM supervisor acts as a middle layer between non-secure apps and the secure-world Linux kernel, thus minimizing the amount of processing the secure kernel has to do. Eventually the authors plan to integrate the TrustZone-based virtualization framework with the KVM virtualization framework found in Linux kernels.

A Trusted Language Runtime (TLR) is proposed, implemented and evaluated by Santos et al. [2]. The TLR aims to act as a mechanism for isolating security-sensitive application logic from the rest of the application, the operating system, and other applications. It is based on the lighter .NET Micro Framework runtime (NETMF) designed for embedded devices, thus managing to keep the trusted computing base reasonably small. The framework handles low-level details, some of which are secure memory and CPU resource sharing, interrupt handling across domains, and coordination of the execution flow of the apps between the secure and non-secure worlds. By hiding these details, the framework allows developers to easily split their security-sensitive code into classes that transparently run in a trusted environment, isolated from the rest of the app, the operating system, and other Trusted Applications (by using TrustZone features), and to do so in a development framework (.NET) with which they are already familiar. Figure 2 provides a better overview of the framework.

Figure 2: TLR high-level architecture [19]

To maintain a small TCB, certain restrictions were introduced to the runtime. Code running on the trusted instance of the runtime (in the secure world) can only perform computations or encrypt/decrypt data associated with trusted classes. It cannot access peripherals; otherwise the TLR would have to include drivers to access them, which would increase the codebase significantly. Sample use cases of apps utilizing the TLR secure features include one-time password generators, user authentication and secure mobile transactions (both based on a challenge-response protocol running on trustlets), and enforcement of access control to secure data used or collected by, e.g., e-health applications.

5 Related Work

Because very little work had been done on open-source software for TrustZone systems, Johannes Winter [17] explores system-level development on cost-efficient ARM devices, particularly in a classroom environment. The goal is to utilize this knowledge to teach TEE programming in classrooms. Obtaining such knowledge had been troublesome, since most manufacturers of TrustZone-enabled boards either do not provide any technical support or do so only after the signing of a non-disclosure agreement, which complicates matters significantly for academics. In the paper the authors: analyze the TrustZone platform of a specific ARM board; discuss the problems that arose while developing a micro-kernel prototype that would run in the secure world; present methods to test the basic TrustZone functionality of a board, as well as a simple architecture for initial development of apps running in normal mode; and, finally, demonstrate the development of secure boot code while also analyzing the board's internal boot ROM structure.

Motivated by the lack of tools for debugging and developing TEE applications, McGillion et al. [20] introduce Open-TEE, a virtual TEE implemented entirely in software that emulates TEE behavior as defined in the GP [21] standards. By utilizing a software-based TEE emulator, developers can avoid the cost of proprietary TEE software development kits or expensive hardware debugging tools, while at the same time taking advantage of well-known, reliable debugging tools (such as GDB or LLDB) and

development environments. Thus, after fully debugging and testing a Trusted Application, they will be able to compile it for any GP-compliant hardware TEE. To make it as simple as possible to deploy and work with, the framework was designed as a set of processes, each implementing specific TEE operations (e.g. Trusted Application, TA manager, TA launcher), so the time and resource overhead of developing with Open-TEE is minimal. A small-scale but extensive user study was also performed and showed positive results on the ease of use of Open-TEE and its benefit for the TEE development life-cycle. Open-TEE is freely available as open-source software under the Apache-V2 license.

Dan Rosenberg, in his recent work, covered a newly found vulnerability in the QSEE TEE software [22] which affects "all known Android devices that support TrustZone and utilize a Qualcomm Snapdragon SoC, with the exception of the Samsung Galaxy S5 and HTC One M8, which have been patched". The vulnerability is based on a bug in the bounds-checking the Secure Monitor performs on calls made by client apps. It allows a kernel-level application to write data to arbitrary secure-world memory, thus completely compromising any trusted application running in the TEE. Similar vulnerability research was also done by Azimuth Security in 2013 to exploit QSEE and bypass the secure bootloader on Motorola devices.

6 Standardization Efforts

The restrictions of the proprietary TEE implementations described in Section 2.1, and the difficulty of programming and debugging with them, led to an increased need both for standardization of the TEE API and for open-source implementations of TEE operating systems and developer tools. Steps already taken in that direction include the following.

"GlobalPlatform is a cross industry (Visa, MasterCard, NTT, ARM and others), non-profit association which identifies, develops and publishes specifications that promote the secure and inter-operable deployment and management of multiple applications on secure chip technology" [21]. Around 2011, GP published specifications for the TEE Client API (used by client applications running in the normal TZ world), the TEE System Architecture, and the TEE Internal API (used for internal communication between the corresponding TEE parts running in the secure and non-secure worlds). "GlobalPlatform has launched a TEE compliance program. This offers assurances to application and software developers and hardware manufacturers that a TEE product will perform in line with the GlobalPlatform standards and as intended. It also promotes market stability by providing a long-term, inter-operable and industry agreed framework that will evolve with technical requirements over time". Several TEEs, including Mobicore OS, QSEE, SecuriTEE, OP-TEE and SierraTEE, are already partially if not fully GP-compliant, and therefore applications built on those specifications should be able to inter-operate across all platforms.

Besides the GP standardization effort, there have also been endeavors to create open-source TEE operating systems as an alternative to the proprietary solutions mentioned in Section 2.1.

Linaro [23] is an open organization focused on improving the security of Linux on ARM. Linaro, along with STMicroelectronics, developed OP-TEE, an open-source TEE. It is based on a proprietary TEE solution that was GlobalPlatform-certified on some products in the past, so its compliance with the GP standards is quite extensive. Currently the secure TEE code runs on 32-bit only, but there are plans to add support for 64-bit and to upstream the OP-TEE kernel driver to the Linux kernel.

SierraTEE and SierraVisor [24] are an open-source TEE and hypervisor, respectively, that offer a comprehensive solution from secure boot to application management and are maintained by Sierraware. They support applications in C, C++ and Java and are easy to integrate with Android and other mobile platforms. They also offer the option to use dedicated cores for the normal and secure worlds, or to share worlds between cores. SierraTEE is a minimal TEE that offers protection against side-channel attacks, supports several interrupt models, a fast performance architecture and fast context switching, and supports asynchronous IPC. It currently supports ARM11, Cortex-A9 and Cortex-A15 processors and is also available for 64 bits.

7 Conclusion

In this report we examined the research and product development done by academia and industry for the ARM TrustZone architecture. Overall, although the initial motivation for the research (i.e. securing the interests of all the stakeholders) remains, we can observe a shift in how this research is performed: from mostly proprietary and not openly documented products to more and more solutions that are based on the GP standard and are even open source. The research area around TrustZone is positively active. The reason is that even though TrustZone was partially marketed as a "silver bullet" for mobile security and as a secure DRM platform, it still encounters some difficulties. There is a shortage of open documentation about both hardware- and software-related products, which matters most for academia, since proprietary products cannot easily be referenced in scientific work. Trusted application developers have faced several issues, both in terms of compatibility among the different hardware TEEs and in terms of the cost of proprietary software development kits and debugging tools. This state is slowly changing for the better with open-source efforts including Open-TEE, OP-TEE, SierraTEE and more. The spread of open-source solutions will benefit smaller developers that wish to utilize the platform, as well as smaller businesses and academic groups that do not have the resources to afford proprietary solutions. Finally, it should be noted that since TEEs and their APIs are still software solutions, they can also be vulnerable to attacks (albeit on a different level) by malicious parties residing in the normal-world OS, as was seen in Section 5. Therefore, TrustZone should not be considered a cure-all solution but rather another extra layer of security. Extensive peer review of the TEE code and well-tested open-source solutions will be vital in the long term.
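To make the TZ-RKP mechanism of Section 4 concrete, the sketch below models a secure-world monitor that vets privileged requests from the normal-world kernel. It is purely illustrative: TZ-RKP is kernel and monitor code, not Python, and the class, method names and the policy set here are invented for the example.

```python
# Toy model of a TZ-RKP-style secure-world monitor: the normal-world
# kernel routes privileged memory operations through it, and requests
# that would alter protected state are denied. Names and the policy
# are illustrative only.

# Regions that TZ-RKP-style policies lock down in the normal world.
WRITE_PROTECTED = {"kernel_text", "translation_tables"}

class SecureMonitor:
    """Runs (conceptually) in the secure world and vets each request."""

    def __init__(self):
        self.audit_log = []

    def request(self, op, target):
        # Writes to kernel code or page tables would let an attacker
        # hide processes or escalate privileges, so they are refused.
        allowed = not (op == "write" and target in WRITE_PROTECTED)
        self.audit_log.append((op, target, allowed))
        return allowed

monitor = SecureMonitor()
assert monitor.request("read", "kernel_text")            # reading is fine
assert not monitor.request("write", "kernel_text")       # denied
assert not monitor.request("write", "translation_tables")
assert monitor.request("write", "process_heap")          # ordinary data is allowed
```

The audit log mirrors TZ-RKP's live interception of critical events; in the real system the denial happens before the normal-world operation takes effect.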

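The secure-boot step in Winter's platform (Section 4) reduces to a hash comparison: measure the secure-world kernel image and continue booting only if the measurement matches the RIM certificate. A minimal sketch, with SHA-256 standing in for whatever integrity metric the real boot loader uses:

```python
import hashlib

def measure(image):
    """Integrity measurement of a kernel image (here simply SHA-256)."""
    return hashlib.sha256(image).hexdigest()

def secure_boot(image, rim_reference):
    # Boot continues only if the measured image matches the value
    # carried in the RIM certificate attached to the image.
    return measure(image) == rim_reference

kernel = b"secure-world kernel image"
rim = measure(kernel)                       # reference fixed at signing time
assert secure_boot(kernel, rim)             # untouched image boots
assert not secure_boot(kernel + b"!", rim)  # any modification halts the boot
```

In the real design the RIM value is carried in a signed certificate, so an attacker cannot simply replace the reference along with the image.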

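The QSEE flaw covered in Section 5 is, at its core, a missing bounds check on a secure monitor call. The toy model below (the memory layout, function name and sizes are all invented for illustration) shows how omitting that check turns a benign "write into the shared buffer" service into an arbitrary write over secure-world memory:

```python
# One flat array models physical memory: the first half is secure-world
# memory, the second half is a buffer that client apps may legitimately use.
memory = bytearray(32)
SECURE_END = 16  # offsets below this belong to the secure world

def smc_write(offset, value, check_bounds):
    """Toy secure monitor call: store one byte on behalf of a client."""
    if check_bounds and not (SECURE_END <= offset < len(memory)):
        raise PermissionError("offset outside the shared buffer")
    memory[offset] = value

smc_write(20, 0xAB, check_bounds=True)    # legitimate client write
assert memory[20] == 0xAB

try:                                      # a patched monitor rejects the attack
    smc_write(3, 0xFF, check_bounds=True)
    reached_secure_memory = True
except PermissionError:
    reached_secure_memory = False
assert not reached_secure_memory

smc_write(3, 0xFF, check_bounds=False)    # buggy monitor: secure memory clobbered
assert memory[3] == 0xFF
```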
References

[1] Ahmed M Azab, Peng Ning, Jitesh Shah, Quan Chen, Rohan Bhutkar, Guruprasad Ganesh, Jia Ma, and Wenbo Shen. Hypervision across worlds: Real-time kernel protection from the ARM TrustZone secure world. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pages 90-102. ACM, 2014.

[2] Nuno Santos, Himanshu Raj, Stefan Saroiu, and Alec Wolman. Using ARM TrustZone to build a trusted language runtime for mobile applications. In Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 67-80. ACM, 2014.

[3] J.-E. Ekberg, K. Kostiainen, and N. Asokan. The Untapped Potential of Trusted Execution Environments on Mobile Devices. Security & Privacy, IEEE, 12(4):29-37, July 2014.

[4] Atul Verma. Get Into the Zone: Building Secure Systems with ARM TrustZone Technology. Unpublished, Texas Instruments, February 2013. http://www.ti.com/lit/wp/spry228/spry228.pdf.

[5] Senate Bill No. 962 (kill switch bill). http://leginfo.legislature.ca.gov/faces/billNavClient.xhtml?bill_id=201320140SB962. Accessed: 2015-01-30.

[6] Edward Suh, Dwaine Clarke, Blaise Gassend, Marten Van Dijk, and Srinivas Devadas. The AEGIS processor architecture for tamper-evident and tamper-resistant processing. Massachusetts Institute of Technology, 2003.

[7] David Lie, Chandramohan Thekkath, Mark Mitchell, Patrick Lincoln, Dan Boneh, John Mitchell, and Mark Horowitz. Architectural support for copy and tamper resistant software. ACM SIGPLAN Notices, 35(11):168-177, 2000.

[8] Frank McKeen, Ilya Alexandrovich, Alex Berenzon, Carlos V Rozas, Hisham Shafi, Vedvyas Shanbhogue, and Uday R Savagaonkar. Innovative instructions and software model for isolated execution. In Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy, pages 1-1. ACM, 2013.

[9] WH Hussin, Reuben Edwards, and Paul Coulton. E-pass using DRM in Symbian v8 OS and TrustZone: Securing vital data on mobile devices. In Mobile Business, 2006. ICMB'06. International Conference on, pages 14-14. IEEE, 2006.

[10] ARM TrustZone whitepaper. http://infocenter.arm.com/help/topic/com.arm.doc.prd29-genc-009492c/PRD29-GENC-009492C_trustzone_security_whitepaper.pdf. Accessed: 2015-01-30.

[11] ARM TrustZone example. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15417.html. Accessed: 2015-01-30.

[12] Trustonic - TrustZone and TEE. https://www.trustonic.com/about-us/who-we-are. Accessed: 2015-04-10.

[13] Qualcomm - QSEE. https://www.qualcomm.com/products/snapdragon/security. Accessed: 2015-04-10.

[14] Solacia - Securi-TEE. http://www.sola-cia.com/en/securiTee/product.asp. Accessed: 2015-04-10.

[15] Zu Yan-ling, Pan Wei, and Zhang Xin-guo. Design and implementation of secure embedded systems based on TrustZone. In Embedded Software and Systems, 2008. ICESS'08. International Conference on, pages 136-141. IEEE, 2008.

[16] Martin Pirker and Daniel Slamanig. A framework for privacy-preserving mobile payment on security enhanced ARM TrustZone platforms. In Trust, Security and Privacy in Computing and Communications (TrustCom), 2012 IEEE 11th International Conference on, pages 1155-1160. IEEE, 2012.

[17] Johannes Winter. Experimenting with ARM TrustZone. Or: How I Met Friendly Piece of Trusted Hardware. In Trust, Security and Privacy in Computing and Communications (TrustCom), 2012 IEEE 11th International Conference on, pages 1161-1166. IEEE, 2012.

[18] Roland Rijswijk-Deij and Erik Poll. Using Trusted Execution Environments in Two-factor Authentication: comparing approaches. 2013.

[19] Johannes Winter. Trusted computing building blocks for embedded linux-based ARM TrustZone platforms. In Proceedings of the 3rd ACM Workshop on Scalable Trusted Computing, pages 21-30. ACM, 2008.

[20] Open-TEE. https://github.com/Open-TEE. Accessed: 2015-01-30.

[21] GlobalPlatform TEE Specifications. http://www.globalplatform.org/specificationsdevice.asp. Accessed: 2015-01-30.

[22] Dan Rosenberg. QSEE TrustZone Kernel Integer Overflow Vulnerability. https://www.blackhat.com/docs/us-14/materials/us-14-Rosenberg-Reflections-On-Trusting-TrustZone-WP.pdf, July 2014.

[23] STMicroelectronics and Linaro's OP-TEE. https://wiki.linaro.org/WorkingGroups/Security/OP-TEE. Accessed: 2015-01-30.

[24] Open Virtualization's SierraVisor and SierraTEE. http://www.openvirtualization.org/. Accessed: 2015-01-30.

Secure Public Instant Messaging: A survey

Dawin Schmidt
Student number: 466864
[email protected]

Abstract

Most of the public Instant Messaging (IM) services claim to provide end-to-end encrypted private chat. However, many IM services rely on the TLS protocol, which does not protect against eavesdropping on the server side. Also, IM services usually do not safeguard against impersonation, allowing an attacker to fake IM identities.

Following the Snowden revelations, there has been a rapid growth of secure IM services that try to achieve private IM conversations. However, until now there has been no survey of secure IM services. This seminar paper surveys different IM protocols and applications and provides an analysis based on their security and privacy features.

KEYWORDS: Secure Instant Messaging, Privacy, Perfect Forward Secrecy, Deniability

1 Introduction

Instant Messaging (IM) is one of the most widely used online and mobile communication services. However, many current IM applications do not protect users' conversations. A decade ago, security did not play a major role in instant messaging, and private communications could easily be sniffed during network transit. The most popular public-domain IM services in those days, such as AOL Instant Messenger (AIM) or MSN Messenger, did not provide any security features. Nowadays, most IM solutions support Transport Layer Security (TLS) in order to provide strong client-to-server encryption and to protect against passive network attacks. However, TLS is not sufficient to protect instant messaging against eavesdropping, because a malicious IM provider can still intercept the connection while relaying messages from one user to another.

Even more worryingly, intelligence services and governments are spying on anyone's encrypted communications. The Snowden revelations have confirmed that intelligence services such as the NSA and GCHQ are wiretapping IM [18] even though IM traffic is encrypted via TLS. In fact, the NSA is able to successfully bypass cryptography in order to spy on IM communication. For instance, in 2008 the NSA collected up to 60,000 online sessions of Yahoo's Web Messenger per day and massively gathered its buddy lists, which sometimes contain offline messages waiting to be delivered [18]. In addition, agencies also force big internet companies to allow access to their internal networks in order to intercept IM network traffic [48]. Because of its weak security architecture, Skype further allows for possible governmental eavesdropping [50], as well as TLS man-in-the-middle (MITM) attacks performed by specialized network appliances [2]. Moreover, other technology companies such as Google [29] or BlackBerry [51] have been asked by governmental agencies to provide data about citizens' private communications or to add backdoors to their products [42].

According to Snowden's documents, the NSA runs a programme called "Bullrun", in which the agency aims to store huge amounts of encrypted internet traffic for later decryption [44]. The agency's goal is to decrypt such data in the future, either when new technologies such as quantum computing might be able to break encryption much faster or when the NSA manages to obtain the private encryption key. Insecure TLS cipher suites may also be vulnerable to such actions if the key is compromised.

Intelligence agencies are not the only eavesdroppers on IM communications: internet companies also eavesdrop, either by providing surveillance data to governments or by analysing their customers' communications for marketing purposes. The technology company WhatsApp, which is now owned by Facebook, is one example. Even though the firm recently announced they would support end-to-end encryption [25] in their famous IM application, the privacy of a WhatsApp conversation still cannot be fully trusted. Although the content of the messages is encrypted, WhatsApp can learn the metadata of a conversation. In addition to the implicit knowledge of who is talking to whom, from where and for how long, WhatsApp is also able to gather information about users' actions and the actual length and language of a message by simply observing the sizes of the encrypted packets [11]. This kind of traffic analysis shows that encryption alone cannot protect privacy. Therefore, we need IM services which not only rely on encryption but also make use of additional cryptographic primitives to protect users' privacy.

Following the Snowden revelations, there has been a rapid growth of secure IM applications. Additionally, several new secure IM protocols have been published. However, until now there has been no survey of secure IM protocols and applications. This seminar paper examines current secure IM protocols and services, and makes a survey based on the privacy and security features that these techniques offer.

This work does not discuss enterprise IM technologies, because they have other requirements [52] and face different security threats [46]. In addition, the security analysis of group chat and file transfer mechanisms is beyond the scope of this seminar paper. The main focus remains on public

domain IM protocols and applications for two-party communications.

This survey paper is organized as follows. Section 2 gives a technical overview of (secure) instant messaging and its related aspects; it also discusses major problems and challenges associated with basic IM technologies. Section 3 discusses secure instant messaging protocols and evaluates their different security properties; five different solutions are presented: the Off-the-Record messaging protocol, the TextSecure v2 protocol, SCIMP, Cryptocat and Bleep. Section 4 reviews related work, and Section 5 discusses the main results. Finally, Section 6 provides some concluding remarks.

2 Background

This section provides some preliminary material for the reader in order to understand the technical content that follows in Section 3.

2.1 (Insecure) Instant Messaging

Instant messaging is a type of online chat for exchanging text messages over a network and can be used either in a corporate or an individual setting. In contrast to e-mail, IM is designed for real-time communication and typically transmits text as it is being typed¹. In addition, IM can be used for offline chat, placing phone/video calls, and transferring files. In the case of offline chat, either the IM application stores the offline messages and sends them once it is back online, or the IM server stores the messages and sends them as soon as the IM application is connected to the network.

There are two communication architectures that IM solutions are modeled on: client-to-server and client-to-client (peer-to-peer). In a client-to-server communication model, an IM server manages the availability of the clients and relays messages between them. For authentication, each client shares a secret (e.g. a password) with the IM service provider. In addition, the server provides necessary information, such as the end-points' IP addresses, in case two clients communicate with each other in a peer-to-peer fashion (e.g. for audio/video chat or file transfer). There are also fully decentralized peer-to-peer IM networks in which a central server is not necessary (e.g. Bleep).

Most of the IM protocols which exist today are proprietary and incompatible with each other; e.g. a WhatsApp user cannot talk to Skype users. Therefore, open-source IM protocols evolved to bridge this gap by acting as gateways to the proprietary protocols. The main open instant messaging protocols are SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE)² and the Extensible Messaging and Presence Protocol (XMPP)³, which are standardized by the Internet Engineering Task Force (IETF). XMPP is a widely used protocol and supports transports⁴, file transfers behind NATs⁵, multi-user chat, serverless messaging⁶, media sessions⁷ and many other features.

Current instant messaging applications have basic built-in security and privacy features [32]. The following are some of the most important techniques:

Anti-spam protection. This mechanism prevents the user from receiving spam messages. For example, most applications require explicit user consent before adding a new contact to the buddy list and provide an option to blacklist malicious accounts. Furthermore, IM service providers typically block automated account creation, which could otherwise be exploited to send spam messages.

Protection against malware spreading. Malware is typically transferred as files via instant messaging. In order to protect against malware spreading, IM applications require explicit user consent for accepting files. IM applications also notify users when new software updates are available; the updates usually contain critical security bug fixes which could otherwise be exploited by malware. Similarly, anti-virus software can also detect and block malicious IM file transfers.

Privacy settings. Most IM applications allow users to change privacy-related configuration settings. For example, IM applications offer an option to delete the chat history, either automatically after a certain amount of time or manually. Moreover, IM applications require explicit user consent to publish private information such as online status or location information. The IM application may also require the user's login password before changing privacy-related settings, in order to prevent unauthorized changes.

Authentication. Typically, IM applications use a password-based mechanism to authenticate users. These user credentials are commonly encrypted via TLS during transit. Additionally, some IM applications support the OAuth 2.0 protocol or the Simple Authentication and Security Layer (SASL⁸) framework for authentication.

Encryption. Most of the IM services today support TLS encryption, which can be enforced by the IM application or by the IM server. One example is the XMPP protocol, which uses TLS with a STARTTLS extension to protect the data stream.

¹ http://www.realtimetext.org/rtt_in_detail/standards
² https://tools.ietf.org/html/rfc6914
³ http://xmpp.org/about-xmpp/technology-overview/
⁴ An XMPP transport is a gateway to other IM protocols in order to allow chat with other IM networks.
⁵ http://xmpp.org/extensions/xep-0096.html
⁶ http://www.xmpp.org/extensions/xep-0174.html
⁷ http://xmpp.org/extensions/xep-0166.html
⁸ https://tools.ietf.org/html/rfc2222

2.2 Security and Privacy Threats to IM

Even though instant messaging faces many security and privacy threats [33] [32] [28], the default security

140 Aalto University T-110.5191 Seminar on Internetworking Spring 2015 nisms of popular IM applications are not sufficient to protect be a friend. Once the victim has added the malicious per- against them. The main challenges are listed below: son, he/she could send a malicious hyperlink leading to a phishing website or transfer an infected file (which the vic- Insecure connections. A major threat to instant messaging tim would accept). The malware could then use the victim’s are insecure connections, which allow an adversary to trace buddy list as a propagation vector by spreading to all other private conversations. This is often caused by IM protocols contacts. Impersonation could be also misused to send spam 11 which do not support encryption by default, e.g. the SIM- messaging (SPIM ). PLE protocol does not consider any security mechanisms. Nevertheless, most IM protocols implement encryption by Malicious IM servers. Malicious IM servers are another leveraging TLS. However, TLS may protect IM chat only to major threat to instant messaging because they can store a a certain extent. One drawback of TLS is that it does not copy of the session key and collect chat messages. There are guarantee end-to-end encryption when it is used in a client- two types of malicious IM servers: A compromised legiti- to-server IM architecture. This is due to the fact that a TLS mate IM server and a rogue IM server which was set up by an connection is only established between the client and server. attacker. By performing DNS spoofing attacks, an attacker In an IM conversation between two clients, one of the clients could force clients to connect to his/her rogue IM server. If might actually have an unencrypted connection to the relay- the IM application does not validate the server’s certificate ing server, enabling an attacker to sniff clear-text messages. (e.g. 
by not checking the certificate against a DNSSEC se- If one communicating client is connected to another IM ser- cured DNS server via DANE or by certificate and public key vice provider, the server-to-server communication link might pinning12), the attacker is able to eavesdrop on all communi- be also unencrypted, leading to the same passive attack vec- cations. tors. Moreover, TLS does not protect against active attacks. Another security threat are proprietary IM servers because For example, an attacker could perform man-in-the-middle they lack public code audits and may contain serious secu- 9 attacks using a compromised IM server certificate . rity flaws or backdoors. Additionally, one needs to trust the company that maintains the IM server software. For exam- Insecure endpoints. Although client-to-server IM com- ple, companies typically control the key exchange infrastruc- munications are protected by TLS, the vulnerability might ture which makes it easy for governments to install MITM be at the endpoint. For example, one of the user’s devices decryption keys (e.g. via a national security letter). Simi- may be infected by malware, allowing an attacker to read larly, public XMPP server providers may also be a security chat messages. and privacy threat as many of them store metadata. Many Even though computationally expensive, some IM appli- XMPP servers store the XMPP ID, the user’s contact list, cations rely on asymmetric cryptography to encrypt mes- offline messages, IP addresses, the last user login, the total sages. IM applications, which allow an attacker to relate a user online time, the amount of send/received packets and user’s IP address to his/her public key, expose an additional much more sensitive metadata. Older versions of a popular attack vector. If an attacker manages to guess such relation, XMPP server application even stored the users’ passwords he/she could try to break into a user’s remote computer to in clear-text13. 
Good advice for choosing a secure XMPP steal the private key. Some IM application do not hide such server provider is therefore to read through its privacy policy sensitive information, e.g. the user’s presence status can leak and to check the security features of the server14. his/her public key and automatic file downloads can reveal the corresponding IP address. Proprietary IM applications. Proprietary IM applica- IM applications can also suffer from insecure default set- tions may use non-standard cryptographic algorithms and tings, e.g. by not encrypting messages by default. As a mat- may contain bugs. These non-standard techniques and bugs ter of fact, many user-convenient IM features are counter- may be exploitable to leak user’s privacy. Additionally, gov- productive to security. For example, most IM applications ernments or intelligence agencies can force companies to add log old messages to create a message archive or store offline backdoors to their IM products [42]. messages to be send once the destination is connected to the Some of the proprietary IM applications are as follows: network. Another example are higher-level IM transport pro- tocols which may leak user’s metadata such as nicknames. WhatsApp implements the open-source TextSecure • protocol, yet the IM application is closed-source. Impersonation. By pretending to be another person, crim- inals or malicious users can be a threat to a legitimate user. Skype’s TLS-based client-to-server architecture does • Because most of the IM server providers do not verify user not provide end-to-end encryption, which enables identities, anyone can create accounts using any name10. By eavesdropping by design [48]. performing social engineering, a malicious person could eas- iMessage has several weaknesses, thus allowing Apple ily create a personalized authorization request to ask the vic- • tim to add him/her to the victim’s contact list, pretending to to read encrypted IM chat [21] [10].

9https://en.wikipedia.org/wiki/ 11https://en.wikipedia.org/wiki/Messaging_spam Certificate_authority#CA_compromise 12https://www.owasp.org/index.php/ 10On the Internet, nobody knows you’re a dog: https:// Certificate_and_Public_Key_Pinning en.wikipedia.org/wiki/On_the_Internet, 13https://www.ejabberd.im/plaintext-passwords-db _nobody_knows_you’re_a_dog 14https://xmpp.net/

141 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

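The pinning defence mentioned above can be made concrete with a short sketch. The following Python fragment (the host, port and pinned digest are hypothetical placeholders, not values from any real IM service) compares the SHA-256 fingerprint of the certificate presented by the server against a locally pinned value before any chat traffic is sent, so a rogue server with a different certificate is rejected even if a compromised CA signed it:

```python
import hashlib
import socket
import ssl

def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()

def connect_with_pinning(host: str, port: int, pinned_sha256: str) -> None:
    """Open a TLS connection and abort unless the server's certificate
    matches the locally pinned fingerprint (defence against rogue IM servers)."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            der = tls.getpeercert(binary_form=True)
            if cert_fingerprint(der) != pinned_sha256:
                raise ssl.SSLCertVerificationError(
                    "pinned fingerprint mismatch: possible rogue IM server")
            # ... proceed with the IM protocol over `tls` ...
```

Note that pinning trades flexibility for security: when the server legitimately rotates its certificate, the pinned value must be updated out of band.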
• SnapChat had a security flaw which allowed bypassing its self-destruct feature [5].

Denial of Service. Denial of Service (DoS) is a serious threat to instant messaging, as it blocks clients from communicating with each other. Some IM protocols have security flaws making them vulnerable to DoS attacks. Furthermore, an IM server could be a major target for distributed DoS attacks, affecting an entire IM network. Attackers could also perform DoS attacks against single clients by sending a large number of messages from compromised accounts.

Unsuitability of PGP in IM. Most IM applications only support data encryption from the client to the server via TLS. In order to guarantee end-to-end encryption, PGP encryption can be applied. Although it provides strong confidentiality, PGP does not meet the requirements for secure IM communication and is therefore not recommended [31][24][40]. For instance, PGP does not support perfect forward secrecy, even though there have been attempts to add this feature to the official Internet standard [9]. The lack of forward secrecy allows an adversary to decrypt all previous messages once he or she obtains the private decryption key. Another issue with PGP is that it conflates non-repudiation and authentication because of its use of digital signatures; as a result, Bob can prove to a third party that Alice told him a certain fact. Finally, new secure IM protocols are replacing PGP as an IM encryption technique. For example, the popular PGP IM application Psi has not been updated since 2012. Additionally, the official OpenPGP extension (XEP-0027, http://www.xmpp.org/extensions/xep-0027.html) for the XMPP protocol is obsolete, which can be seen as another sign that PGP in IM will no longer be used in the future. However, there are some use cases for PGP in instant messaging, for example in secure messaging systems such as Pond (https://pond.imperialviolet.org): by applying PGP encryption themselves, users avoid relying solely on Pond's safety [31]. As a side note, this seminar paper does not consider S/MIME for secure instant messaging because it relies on the broken Certificate Authority (CA) model.

2.3 Requirements for Secure IM

As discussed in the previous section, the basic security mechanisms of most IM solutions do not adequately preserve the user's security and privacy. In fact, a private and secure IM chat must offer a security level similar to that of a face-to-face conversation between persons:

Confidentiality. Confidentiality guarantees that nobody is able to listen in on Alice's and Bob's private conversation. Encrypting the communication channel ensures confidentiality. Additionally, IM protocols must provide end-to-end encryption: the IM server must not be able to learn the encryption key which the endpooints use to secure the channel.

Perfect Forward Secrecy. Perfect forward secrecy (PFS) [27] ensures that it is impossible for someone else to find out what Alice and Bob talked about after their conversation has happened. Typically, this is achieved by short-term encryption/decryption keys which are generated only when they are needed and thrown away after usage. Another requirement is that it must be impossible to derive those keys from any long-lived key material.

Authentication. Another important property of private communications is authentication. Alice must be sure that she is really talking to Bob and that Bob is not pretending to be someone else. Typically, digital signatures and message authentication codes are used for message authentication. For entity authentication there are two common mechanisms in IM: either running the Socialist Millionaires' protocol [30], or providing a user interface to verify public key fingerprints.

Deniability. Deniability allows a sender to send an authenticated message to a receiver such that, after message arrival, the receiver cannot prove to a third party that the message was sent by the sender [12]. For example, Alice might assume that her friend Bob is an FBI informant (https://en.wikipedia.org/wiki/Hector_Xavier_Monsegur) who secretly keeps track of all her messages [23]. Therefore, Bob should not be able to prove to a third party (e.g. to a court of law) that Alice sent any particular message. If Alice and Bob use symmetric cryptography, both of them share the same private key for message authenticity. Thus, Bob could also create fake messages in Alice's name, making it impossible for him to provide valid proof. This concept is known as "weak" deniability or repudiability [7]. "Strong" (plausible) deniability or forgeability [7] is a related property, which assumes that not only Bob but everyone who listens to the communication channel could create fake messages. As a result, Alice can deny having sent a certain message because anyone else could have created messages in her name.

In contrast, non-repudiation is the opposite of deniability. For example, Pidgin-Encryption (http://pidgin-encrypt.sourceforge.net/), a third-party plugin for the popular IM application Pidgin, uses digital signatures which are clearly checkable by a third party.

Anonymity. Just as with a personal face-to-face conversation, anonymity ensures that nobody else can even notice that Alice and Bob are talking. Anonymous instant messaging is about hiding metadata, which could otherwise easily be correlated with other records to identify individuals [43]. In many cases metadata is sufficient to convict [4] or even kill individuals (e.g. by U.S. drone strikes [47]). Anonymity can be achieved by decentralized architectures such as peer-to-peer networks or by onion routing technologies such as Tor.

3 Secure Instant Messaging

We choose five different IM technologies and discuss them in more detail regarding their privacy and security features: the Off-The-Record messaging protocol (OTR), the TextSecure v2 protocol, the Silent Circle Instant Messaging Protocol (SCIMP), Cryptocat, and Bleep.

3.1 Off-The-Record Messaging Protocol

The Off-The-Record (OTR) Messaging Protocol was introduced in [7] by Borisov, Goldberg and Brewer and has found wide adoption among a large number of IM applications. There are IM applications for standard PCs as well as for mobile operating systems: Jitsi (https://jitsi.org/; Windows, Linux, Mac OS X), Pidgin (https://www.pidgin.im/; Windows, Linux), Adium (https://adium.im/; Mac OS X), and ChatSecure (https://chatsecure.org/; Android, iOS).

OTR, whose latest version is 3.4 (https://otr.cypherpunks.ca/Protocol-v3-4.0.0.html), runs on top of IM transport protocols such as XMPP, ICQ, MSN, Yahoo, IRC (there is an OTR plugin for IRSSI: https://github.com/cryptodotis/irssi-otr), PSYC and many more. The protocol provides secure IM chat and encrypted file transfer, but does not yet facilitate multi-user group chat. OTR features most of the requirements of a private face-to-face conversation:

Confidentiality. Messages are encrypted with a session key using AES in counter mode (CTR). The communication is end-to-end encrypted because the key exchange occurs between the two communicating clients only.

Perfect Forward Secrecy. The short-term session keys are negotiated using a Diffie-Hellman key exchange and are discarded after use. OTR does not utilize any long-term secret keys during session key generation.

Authentication. OTR uses a hybrid authentication approach combining digital signatures and message authentication codes. Digital signatures are used to authenticate the initial Diffie-Hellman key exchange. OTR supports two methods for authenticating each user's public signature key: either manually verifying the public key's fingerprint, or using the Socialist Millionaires' Protocol (SMP, https://otr.cypherpunks.ca/help/authenticate.php), which verifies that both users share the same secret without revealing it. Finally, message authentication codes are used to authenticate chat messages and all subsequent Diffie-Hellman key exchanges.

Deniability. OTR supports "weak" and "strong" (plausible) deniability. Message authentication codes ensure the "weak" deniability property because both parties share the same secret key; thus Alice could create fake messages in Bob's name. OTR also uses a malleable encryption scheme (AES in counter mode) that allows anyone listening to the communication channel to forge messages. As a result, OTR additionally supports the "strong" deniable authentication property.

Anonymity. Anonymity is not a built-in security property of OTR. Higher-level IM transport protocols such as XMPP can leak metadata (e.g. current time and timezone, geolocation via XEP-0080, http://xmpp.org/extensions/xep-0080.html, and much more). In addition, OTR does not protect user identities against active attacks, because long-term public keys, which clearly identify a person, can be revealed by a man-in-the-middle attack [23]. Moreover, OTR does not safeguard against timing and statistical attacks, which may allow an attacker to find out who is talking to whom and to guess the message content. However, using OTR on top of anonymity networks such as Tor can provide anonymity. For example, ChatSecure can access the Tor network by employing the proxy application Orbot (https://guardianproject.info/apps/orbot/).

Weaknesses. Even though OTR has strong security and privacy properties, several weaknesses exist. One disadvantage of the protocol is a reliability issue that stems from a design flaw: OTR requires both parties to be online in order to perform the Diffie-Hellman key exchange. As a result, messages might be dropped or delivered unencrypted if one user goes offline. Another concern is that active attacks are possible if the authenticity of the client's public key is not correctly verified. For example, there is a plugin for the popular IM server software "ejabberd" which allows man-in-the-middle attacks if the public key's fingerprint is not accurately checked (https://www.ejabberd.im/mod_otr). Apart from these weaknesses, the research community has discovered additional security flaws in the OTR protocol. Di Raimondo et al. found several issues in OTR's authentication mechanism as well as replay vulnerabilities [13]. In addition, they revealed an unknown key-share (UKS) attack against OTR's key exchange procedure. Moreover, Bonneau et al. discovered various attacks against OTR version 2 [6]. Firstly, they were able to perform a downgrade attack in order to roll back to a much weaker version of the protocol. Secondly, they showed attacks against the protocol's "strong" deniability property and integrity component. Finally, [26] presents weaknesses of OTR's "weak" deniability primitive.
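OTR's "strong" deniability rests on the malleability of counter-mode encryption: a CTR ciphertext is just the plaintext XORed with a keystream, so anyone who knows (or guesses) the plaintext can turn an intercepted ciphertext into a valid encryption of any other message of the same length, without ever learning the key. A toy Python sketch, with a made-up keystream standing in for AES-CTR output, illustrates the principle:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Stand-in for an AES-CTR keystream; in OTR this would be the output
# of AES keyed with the short-lived session key.
keystream = bytes(range(16))

plaintext = b"meet me at noon!"
ciphertext = xor_bytes(plaintext, keystream)   # what an eavesdropper sees

# An eavesdropper who knows the plaintext can forge a different message
# of the same length without knowing the keystream:
forged_pt = b"deny everything!"
forged_ct = xor_bytes(ciphertext, xor_bytes(plaintext, forged_pt))

# The forged ciphertext decrypts to the attacker's chosen message,
# so a ciphertext alone proves nothing about who authored it.
assert xor_bytes(forged_ct, keystream) == forged_pt
```

This is exactly why CTR mode, normally a liability, is a deliberate design choice in OTR: since anyone on the channel could have produced a given ciphertext, Alice can plausibly deny having sent it.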
3.2 TextSecure v2 Protocol

The open-source TextSecure v2 protocol (https://github.com/WhisperSystems/TextSecure/wiki/ProtocolV2) is based on OTR and developed by Open Whisper Systems (https://www.whispersystems.org). The company also maintains a mobile Android/iOS application (an unfinished browser version is available at https://github.com/whispersystems/TextSecure-Browser) that has been praised by Edward Snowden for its ease of use [16]. The TextSecure protocol has seen much popularity recently because it has been integrated into the famous IM application WhatsApp [49]. It has also been adopted by CyanogenMod, which uses the protocol in its OS-level SMS provider in order to encrypt text messages [38]. The group-chat-capable [39] protocol supports the following security properties:

Confidentiality. The end-to-end encryption mechanism is based on the Curve25519 and AES-256 cryptographic primitives. TextSecure improves OTR's Diffie-Hellman key exchange [35] by employing the Axolotl protocol (https://github.com/trevp/axolotl/wiki).

Perfect Forward Secrecy. TextSecure supports perfect forward secrecy for asynchronous messaging [36] by generating a new session key for every new message.

Authentication. Long-term and ephemeral secret keys authenticate the triple Diffie-Hellman (DH) key exchange, whereas short-term MAC (HMAC-SHA256) keys protect the actual chat messages.

Deniability. The protocol uses an improved version of OTR's deniability property [37].

Weaknesses. The TextSecure protocol faces some weaknesses: it does not protect anonymity, thus allowing a third party to collect metadata. Additionally, Frosch et al. showed an Unknown Key-Share (UKS) attack against TextSecure [3]. Nevertheless, the researchers conclude that TextSecure's push messaging achieves the goals of authentication and confidentiality.

3.3 Silent Circle Instant Messaging Protocol

The Silent Circle Instant Messaging Protocol (SCIMP) is a secure open-source IM protocol developed by Silent Circle (specification available at https://silentcircle.com/sites/default/themes/silentcircle/assets/downloads/SCIMP_paper.pdf). The protocol is derived from ZRTP and enables private conversations over IM transports such as XMPP. Silent Circle has developed a mobile application called Silent Text 2.0, which is available for the iOS and Android operating systems and has been audited independently [41]. SCIMP provides the following security features:

Confidentiality. SCIMP uses AES in counter with CBC-MAC (CCM) mode to provide strong encryption. The elliptic curve Diffie-Hellman key exchange is performed between the two endpoints only; thus end-to-end encryption is provided.

Perfect Forward Secrecy. Each message is encrypted with its own shared secret key, which is discarded after use.

Authentication. Before two users communicate, Silent Text 2.0 presents an "authentication string" which the users can compare. If the initial communication was successful, all future sessions are authenticated via a cached "secret" [22]. Encrypting messages using AES in CCM mode ensures the secrecy as well as the authenticity of chat messages.

Weaknesses. By design, SCIMP does not provide deniability (hence Bob can prove to a third party that Alice sent a particular message). Another weakness is that SCIMP does not guarantee anonymous communication, because metadata (leaked by IM transports) could be extracted in order to perform timeline analysis. Finally, a recent version of Silent Text contained a vulnerability which allowed an attacker to decrypt messages [15].

3.4 Cryptocat

Cryptocat is an experimental open-source browser-based IM application which is available as a plugin for Chrome, Firefox, Safari and Opera. The JavaScript client is executed locally and communicates with an XMPP-BOSH (https://xmpp.org/about-xmpp/technology-overview/bosh/) server via HTTPS. Cryptocat uses the OTR protocol for one-on-one encrypted communication and the mpOTR protocol (a multi-party variant of OTR proposed by Goldberg et al. [20]) for multi-party encrypted group chat. Moreover, an iOS mobile application and a standalone Mac OS X application exist (an Android version is currently under development). Cryptocat benefits from OTR's strong security and privacy properties:

• Confidentiality. It provides message confidentiality by using the OTR protocol for message encryption.

• Perfect Forward Secrecy. Session keys are discarded immediately after usage and are impossible to derive from any long-term key material.

• Authentication. In order to authenticate public signature keys, Cryptocat presents users with a fingerprint which can be manually verified [23].

• Deniability. Plausible deniability is guaranteed by the OTR protocol.

Weaknesses. On the downside, Cryptocat does not guarantee anonymity, because it does not hide the metadata leaked by the XMPP protocol (https://blog.crypto.cat/2013/06/cryptocat-who-has-your-metadata/). The fact that Cryptocat runs in a browser is simultaneously its major strength and weakness. The advantage is that everyone has a browser, so there is no hurdle of installing a separate application. However, a web browser is a popular attack vector and often contains software vulnerabilities; e.g. the famous pwn2own event regularly reveals exploitable browser bugs [8]. In addition, client-side JavaScript is not considered to be a secure programming language for implementing cryptography [1]. Furthermore, commercial code audits have revealed several critical security flaws in Cryptocat's browser and iOS versions [54][14]; other vulnerabilities have been detected in older versions [53].
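The per-message forward secrecy claimed by TextSecure, SCIMP and Cryptocat above all follows the same underlying idea: each message key is derived from a chain key with a one-way function, and the chain is then advanced so that old keys cannot be recomputed. The following Python sketch is a minimal hash ratchet in that spirit; it is an illustration of the principle, not the actual Axolotl or SCIMP key schedule:

```python
import hashlib
import hmac

def ratchet(chain_key: bytes) -> tuple[bytes, bytes]:
    """Derive a one-time message key and the next chain key.
    Because HMAC-SHA256 is one-way, an attacker who later steals the
    advanced chain key cannot recompute earlier message keys."""
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    next_chain = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    return message_key, next_chain

# Both endpoints start from a shared secret (here a toy value).
chain = hashlib.sha256(b"initial shared secret").digest()
keys = []
for _ in range(3):            # three messages, three distinct keys
    mk, chain = ratchet(mk_chain := chain)  # advance the ratchet
    keys.append(mk)
    # encrypt exactly one message with mk, then delete mk immediately

assert len(set(keys)) == 3    # every message gets a fresh key
```

Discarding each message key right after use is what turns this construction into forward secrecy in practice: compromising the device at time t reveals only the current chain state, not past conversations.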

3.5 BitTorrent Bleep

Bleep (http://labs.bittorrent.com/bleep/) is a closed-source peer-to-peer application for internet telephony and asynchronous IM chat. The IM application is developed by BitTorrent Inc. and is available as an alpha version for Mac OS X, Windows and Android. Bleep is based on the BitTorrent protocol, which uses a distributed hash table (DHT) to route messages. The IM application provides client (SIP UAC) and server (SIP server) functionalities. The SIP server builds the foundation of the decentralized peer-to-peer platform and serves as Bleep's back-end or engine. The SIP User Agent Client (UAC) provides the user interface and communicates with the SIP server. Bleep features the following security properties:

Confidentiality. Text messages are transmitted in an encrypted SIP tunnel over UDP between two peers. Bleep employs cryptographic primitives such as Curve25519, Ed25519, Salsa20, Poly1305, and others.

Perfect Forward Secrecy. A new temporary session key is generated for each message and deleted immediately after usage [45].

Authentication. Bleep identifies users by their public keys. The keys are stored on a directory server in order to allow lookups when a user wants to add a new contact. Bleep uses a third-party authority which verifies the authenticity of users' public keys [45].

Deniability. Bleep is still at a development stage and lacks detailed documentation; thus, we are uncertain whether Bleep offers deniability.

Anonymity. Bleep provides anonymity in the sense that there is no central repository for metadata storage and no central lookup service. In addition, Bleep makes use of the same DHT network as uTorrent, which already has millions of users. Another anonymity feature of Bleep is that it is difficult for a third party to find out which public key corresponds to which user IP address [17].

Weaknesses. Even though Bleep uses the open-source cryptography library libsodium (https://github.com/jedisct1/libsodium), the main source code remains confidential. Additionally, Bleep does not hide the endpoints' IP addresses, allowing an adversary to guess who is talking to whom. Moreover, Bleep routes clients through a relay server if a direct client-to-client connection is not possible; hence, a compromised relay server could perform traffic analysis on the encrypted network packets. Finally, BitTorrent Inc. has recently been accused of adding "riskware" to its products: the company added a Bitcoin mining application to uTorrent without explicit user consent (http://forum.utorrent.com/topic/95041-warning-epicscale-riskware-installed-with-latest-utorrent/).

4 Related Work

Many other (secure) IM applications and protocols exist. For instance, there are proprietary IM applications which rely on a company-controlled client-server architecture: Viber, Telegram, Threema, Wickr and Surespot. Although Threema makes use of the open-source library NaCl, its protocol is kept confidential. Wickr features message destruction and an RSA-based key exchange, thus providing perfect forward secrecy [22]; nevertheless, its security design is weakly documented and there have not been any independent code reviews. Surespot (https://www.surespot.me/) employs its own cryptographic protocol and is available as an Android and iOS application; however, there have not been any protocol audits either.

Anonymous peer-to-peer IM applications, which rely on a trustworthy distributed chat system, do not need any central servers. Several open-source solutions exist. For example, there is PyBitmessage (https://github.com/Bitmessage/PyBitmessage), which uses the Bitmessage protocol (https://bitmessage.org/bitmessage.pdf) in order to provide unobservability and untraceability. RetroShare, a software bundle for private communications, also supports private IM in a friend-to-friend network scenario. Then there is Ricochet (https://github.com/ricochet-im/ricochet), a peer-to-peer IM system based on Tor hidden services. Other decentralized IM applications include Tox (https://tox.im/), Pond (https://pond.imperialviolet.org/), and psyced (http://www.psyced.org/).

Finally, there are secure IM protocols which have not been widely adopted: SILC (http://tools.ietf.org/id/draft-riikonen-silc-spec-09.txt), IMKE [34] and FiSH (http://ultrx.net/doc/fish/). Nevertheless, there are SILC plugins for Pidgin and Irssi, and FiSH enables encrypted IRC chat via Blowfish with pre-shared keys.

5 Discussion

This section compares the secure IM technologies that have been presented in this survey paper. For the comparison, two different metrics are applied: one for secure IM protocols and one for secure IM applications. The metrics of the protocol comparison include [19]:

• End-to-end encryption. Whether the key exchange occurs between the two endpoints only.

• PFS. Whether the protocol supports Perfect Forward Secrecy.

• Authentication. Whether participants can be certain about the identity of their correspondents.

• Deniability. Whether Bob is unable to prove to a third party that Alice sent a certain message.

• Recent review. Whether there has been a recent independent protocol review.

• Documentation. Whether the cryptography design is well documented.
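Several of the systems surveyed above (OTR, Cryptocat, and Bleep's directory lookups) ultimately fall back on users comparing public-key fingerprints out of band. A small Python sketch shows how such a human-comparable fingerprint can be derived; the truncation and grouping format here is our own illustrative choice, not the format of any particular application:

```python
import hashlib

def key_fingerprint(public_key: bytes, groups: int = 8) -> str:
    """Render a truncated SHA-256 hash of a public key as grouped
    uppercase hex, suitable for two users to read aloud and compare."""
    digest = hashlib.sha256(public_key).hexdigest()
    chunk = 4
    parts = [digest[i * chunk:(i + 1) * chunk] for i in range(groups)]
    return " ".join(parts).upper()

# Both parties compute the fingerprint of the peer's key independently;
# a man-in-the-middle substituting his own key yields a visibly
# different string, which is why manual comparison defeats the attack.
print(key_fingerprint(b"hypothetical public key bytes"))
```

The usability cost of this step is exactly the trade-off discussed in the conclusion: the comparison only helps if users actually perform it carefully.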

Protocol     End-to-end encryption   PFS    Authentication    Deniability     Recent review   Documentation
TLS          no                      DH     X.509             MACs            no              +
PGP          3DES/RSA                no     WOT               no              no              +
OTR          AES-CTR                 DH     Fingerprint/SMP   MACs/AES-CTR    yes             ++
TextSecure   AES-CTR                 ECDH   Fingerprint       MACs/AES-CTR    yes             ++
SCIMP        AES-CCM                 ECDH   SAS               no              yes             ++

Table 1: Evaluation matrix of IM protocols. (DH: Diffie-Hellman key exchange; WOT: web of trust; SAS: short authentication string.)

Software     Protocol(s)          Metadata protection   Open-Source              Recent code audit   Documentation   Stable software version
Pidgin       OTR                  −−                    client & server (XMPP)   yes                 ++              yes
TextSecure   TextSecure           −−                    client & server          yes                 ++              yes
Silent Text  SCIMP                −−                    client only              yes                 +               yes
Cryptocat    OTR                  −−                    client & server (XMPP)   yes                 ++              no (experimental)
Bleep        BitTorrent/SIP/RTP   +                     no                       no                  −−              no (alpha)

Table 2: Evaluation matrix of secure IM applications. (Open-source XMPP servers include e.g. https://www.ejabberd.im/; the TextSecure server is at https://github.com/WhisperSystems/TextSecure-Server; Cryptocat server deployment instructions are at https://github.com/cryptocat/cryptocat/wiki/Server-Deployment-Instructions.)

Table 1 summarizes the comparison of secure IM protocols. The major findings are that TLS and PGP should not be used for secure instant messaging because they lack several security features. The OTR protocol guarantees strong "face-to-face privacy", but does not provide asynchronous messaging or anonymity. Nevertheless, most OTR-capable applications support location anonymity by routing IM traffic through the Tor network. The TextSecure protocol includes OTR's features, but does not protect the user's metadata. SCIMP has similarly strong security properties, but lacks deniability and anonymity features.

In order to compare secure IM applications, slightly different metrics apply:

• Protocol(s). Which protocol(s) the application relies on.

• Metadata protection. Whether the application protects the user's metadata.

• Open-Source. Whether the client and server applications are open for public review.

• Recent code audit. Whether there has been a recent independent security audit.

• Documentation. Whether the application's technical components are well documented.

• Stable software version. Whether the current software version is stable.

Table 2 illustrates a comparison of the IM applications that have been discussed. One of the major results of the analysis is that TextSecure and Silent Text enable secure instant messaging. Nevertheless, neither application hides the user's metadata, and neither can be used via the Tor network. Cryptocat is a secure IM application that is easy to use and has been heavily audited by third parties; however, it is an experimental application and is executed by insecure web browsers. Bleep is the most anonymous IM solution presented in this survey; however, it is closed-source and currently an alpha version.

In summary, we recommend the use of TextSecure for secure instant messaging, as it features all aspects of a private face-to-face conversation. Additionally, TextSecure works well on mobile devices, thus providing asynchronous IM.

6 Conclusion

This seminar paper explored a few popular secure IM protocols and applications; mentioning all available secure instant messaging services is beyond the scope of this work. This abundance is actually a problem for the normal user, who does not know which application to choose. For this reason, most people use the mainstream, less secure IM services, not least because most of their contacts already use the same IM platform. Another major challenge for secure IM applications is the trade-off between usability and security. Most of the insecure IM applications are easy to use because they do not rely on security best practices. On the other hand, secure instant messaging applications can be hard to use for the everyday user; e.g. verifying OTR fingerprints can be difficult. Another issue is habituation, i.e. when users accept security warnings without reading them carefully.

Finally, no security system is foolproof. One cannot prevent targeted attacks such as backdoors in a user's hardware or software; for example, there is a huge market for Android/iOS exploits, with market prices above $500,000. To quote Matthew Green [22]:

The real issue is that they [secure IM applications] each run on a vulnerable, networked platform.

References

[1] JavaScript cryptography considered harmful. http://matasano.com/articles/javascript-cryptography/, April 2013. Accessed: 2015-03-21.

[2] R. Andrews. Exploring encrypted Skype conversations, in clear-text. https://www.bluecoat.com/security-blog/2014-01-02/exploring-encrypted-skype-conversations-clear-text, January 2014. Accessed: 2015-02-24.

[3] T. Frosch, C. Mainka, C. Bader, F. Bergsma, J. Schwenk, and T. Holz. How secure is TextSecure?

[4] M. Barakat. Jeffrey Sterling, ex-CIA officer, convicted of leaking secrets to reporter. http://www.washingtontimes.com/news/2015/jan/26/deliberation-to-reach-third-day-in-cia-leak-case/, January 2015. Accessed: 2015-03-11.

[5] D. Bean. How to save Snapchat pictures without the sender knowing (shhhh). https://www.yahoo.com/tech/how-to-save-snapchat-pictures-without-the-sender-80875346215.html, March 2014. Accessed: 2015-03-20.

[6] J. Bonneau and A. Morrison. Finite-state security analysis of OTR version 2.

[12] M. Di Raimondo and R. Gennaro. New approaches for deniable authentication. In Proceedings of the 12th ACM Conference on Computer and Communications Security, pages 112–121. ACM, 2005.

[13] M. Di Raimondo, R. Gennaro, and H. Krawczyk. Secure off-the-record messaging. In Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society, pages 81–89. ACM, 2005.

[14] A. Diquet, D. Thiel, and S. Stender. Open Technology Fund Cryptocat iOS. Technical report, iSEC Partners, February 2014. http://isecpartners.github.io/publications/iSEC_Cryptocat_iOS.pdf.

[15] M. Dowd. Blackpwn: Blackphone SilentText type confusion vulnerability. http://blog.azimuthsecurity.com/2015/01/blackpwn-blackphone-silenttext-type.html, January 2015. Accessed: 2015-03-13.

[16] M. Eddy. Snowden to SXSW: Here's how to keep the NSA out of your stuff. http://securitywatch.pcmag.com/security/321511-snowden-to-sxsw-here-s-how-to-keep-the-nsa-out-of-your-stuff, March 2014. Accessed: 2015-03-19.

[17] F. Fadaie. How does Bleep work? http://engineering.bittorrent.com/2014/09/17/how-does-bleep-work/, September 2014. Accessed: 2015-03-21.

[18] Electronic Frontier Foundation. 20131014-wapo-content acquisition optimization2. https://www.eff.org/document/2013-10-14-wapo-content-acquisition-optimization2, 2013. Accessed: 2015-02-03.

[7] N. Borisov, I. Goldberg, and E. Brewer. Off-the-record [19] E. F. Foundation. Secure messaging score- communication, or, why not to use pgp. In Proceedings card. https://www.eff.org/secure- of the 2004 ACM workshop on Privacy in the electronic messaging-scorecard, March 2015. Accessed: society, pages 77–84. ACM, 2004. 2015-03-30. [20] I. Goldberg, B. Ustaoglu,˘ M. D. Van Gundy, and [8] C. Brook. All major browsers fall at pwn2own H. Chen. Multi-party off-the-record messaging. In Pro- day 2. https://threatpost.com/all- major-browsers-fall-at-pwn2own-day- ceedings of the 16th ACM conference on Computer and communications security, pages 358–368. ACM, 2009. 2/111731, March 2015. Accessed: 2015-03-21. [21] M. Green. Dear apple: Please set imessage free. [9] I. Brown, A. Back, and B. Laurie. Forward secrecy http://blog.cryptographyengineering. extensions for openpgp. Internet-draft, The Internet com/2012/08/dear-apple-please-set- http:// Engineering Task Force, October 2001. imessage-free.html, August 2012. Accessed: tools.ietf.org/html/draft-brown-pgp- 2015-03-20. pfs-03. [22] M. Green. Here come the encryption apps! http:// [10] C. Cattiaux. imessage privacy. http://blog. blog.cryptographyengineering.com/ quarkslab.com/imessage-privacy.html, 2013/03/here-come-encryption-apps. October 2013. Accessed: 2015-03-20. html, March 2013. Accessed: 2015-03-20. [11] S. E. Coull and K. P. Dyer. Traffic analysis of en- [23] M. Green. Noodling about im protocols. http:// crypted messaging services: Apple imessage and be- blog.cryptographyengineering.com/ yond. ACM SIGCOMM Computer Communication Re- 2014/07/noodling-about-im-protocols. view, 44(5):5–11, 2014. html, July 2014. Accessed: 2015-03-10.

147 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

[24] M. Green. What’s the matter with pgp? http:// [38] M. Marlinspike. Textsecure, now with 10 million more blog.cryptographyengineering.com/ users. https://whispersystems.org/blog/ 2014/08/whats-matter-with-pgp.html, cyanogenintegration/, December 2013. Ac- August 2014. Accessed: 2015-03-13. cessed: 2015-03-19.

[25] A. Greenberg. Whatsapp just switched on end- [39] M. Marlinspike. Private group messaging. https:// to-end encryption for hundreds of millions of whispersystems.org/blog/private- users. http://www.wired.com/2014/11/ groups/, May 2014. Accessed: 2015-03-19. -encrypted-messaging/, 2014. [40] M. Marlinspike. Gpg and me. http://www. Accessed: 2015-02-03. thoughtcrime.org/blog/gpg-and-me/, [26] greg. Secure function evaluation vs. deniability in February 2015. Accessed: 2015-03-13. otr and similar protocols. http://phrack.org/ [41] J. Matonis. Pressure increases on silent circle to release issues/68/14.html#article, 2012. application source code. http://www.forbes. com/sites/jonmatonis/2013/02/06/ [27] C. G. Günther. An identity-based key-exchange pro- pressure-increases-on-silent-circle- tocol. In Advances in CryptologyâA˘TEurocrypt⡠A˘Z89´ , to-release-application-source-code/ pages 29–37. Springer, 1990. , June 2013. Accessed: 2015-03-20. [28] N. Hindocha and E. Chien. Malicious threats and vul- [42] T. McCarthy. Nsa director defends plan to main- nerabilities in instant messaging. In Virus Bulletin Con- tain ’backdoors’ into technology companies. ference, vb2003 , 2003. http://www.theguardian.com/us-news/ [29] G. Inc. Transparency report. https:// 2015/feb/23/nsa-director-defends- www.google.com/transparencyreport/ backdoors-into-technology-companies, userdatarequests/, 2014. Accessed: 2015-03- February 2015. Accessed: 2015-02-24. 03. [43] G. News and M. Limited. A guardian guide to your metadata. http://www.theguardian.com/ [30] M. Jakobsson and M. Yung. Proving without know- technology/interactive/2013/jun/12/ ing: On oblivious, agnostic and blindfolded provers. what-is-metadata-nsa-surveillance, In Advances in CryptologyâA˘TCRYPTO⡠A˘Z96´ , pages June 2013. Accessed: 2015-03-11. 186–200. Springer, 1996. [44] G. News and M. Limited. Project bullrun – clas- [31] lynX. 15 reasons not to start using pgp. http:// sification guide to the nsa’s decryption program. 
www.secushare.org/PGP , July 2014. Accessed: http://www.theguardian.com/world/ 2015-03-13. interactive/2013/sep/05/nsa-project- [32] M. Mannan and P. C. van Oorschot. Secure public in- bullrun-classification-guide, 2013. stant messaging: A survey. Proceedings of Privacy, Accessed: 2015-02-03. Security and Trust, 2004. [45] A. Norberg. Authentication and forward secrecy in bleep. http://engineering.bittorrent. [33] M. Mannan and P. C. van Oorschot. On instant messag- com/2014/12/11/authentication-and- ing worms, analysis and countermeasures. In Proceed- forward-secrecy-in-bleep/, December ings of the 2005 ACM workshop on Rapid malcode, 2014. Accessed: 2015-03-21. pages 2–11. ACM, 2005. [46] J. S. Park and T. Sierra. Security analyses for enterprise [34] M. Mannan and P. C. Van Oorschot. A protocol for instant messaging (eim) systems. Information Systems Financial Cryptog- secure public instant messaging. In Security, 14(1):26–39, 2005. raphy and Data Security, pages 20–35. Springer, 2006. [47] J. Scahill and G. Greenwald. The nsaâA˘ Zs´ secret [35] M. Marlinspike. Advanced cryptographic ratcheting. role in the u.s. assassination program. https:// https://www.whispersystems.org/blog/ firstlook.org/theintercept/2014/02/ advanced-ratcheting/, November 2013. 10/the-nsas-secret-role/, February 2014. Accessed: 2015-03-19. Accessed: 2015-03-11. [36] M. Marlinspike. Forward secrecy for asynchronous [48] B. Schneier. New details on skype eavesdrop- messages. https://whispersystems.org/ ping. https://www.schneier.com/blog/ blog/asynchronous-security/, August archives/2013/06/new_details_on.html, 2013. Accessed: 2015-03-19. 2013. Accessed: 2015-02-03. [37] M. Marlinspike. Simplifying otr deniability. [49] B. Schneier. Whatsapp is now end-to-end en- https://whispersystems.org/blog/ crypted. https://www.schneier.com/blog/ simplifying-otr-deniability/, July 2013. archives/2014/11/whatsapp_is_now. Accessed: 2015-03-19. html, November 2014. Accessed: 2015-03-19.

148 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

[50] S. Schoen. Why doesn’t skype include stronger protec- tions against eavesdropping? https://www.eff. org/deeplinks/2013/07/why-doesnt- skype-include-stronger-protections- against-eavesdropping, July 2013. Accessed: 2015-02-24.

[51] K. Shubber. Blackberry gives indian government ability to intercept messages. http://www. wired.co.uk/news/archive/2013-07/11/ blackberry-india, 2013. Accessed: 2015-03-03.

[52] Systems and N. A. Center. Secure instant messaging. Technical report, National Security Agency, 2007.

[53] S. Thomas. Decryptocat. http://tobtu.com/ decryptocat.php. Accessed: 2015-03-21. [54] N. Wilcox, Z. Wilcox, D. Hopwood, and D. Bacon. Report of security audit of crypto- cat. Technical report, Least Authority, April 2014. https://leastauthority.com/ static/publications/LeastAuthority- Cryptocat-audit-report.pdf.

A survey on performance of SfM and RGB-D based 3D indoor mapping

Junyang Shi
Student number: 466958
[email protected]

Abstract

Extracting an indoor map from 3D models is a new trend in building maps for indoor environments. Modern computer vision techniques make it possible to reconstruct a 3D model from images. This paper surveys two main computer vision techniques for 3D indoor mapping, called SfM and RGB-D. It analyses the design and performance of SfM and RGB-D mapping systems based on the current literature. Then, a comparison between SfM and RGB-D is made that focuses on the applicable environment, system input, accuracy, and costs. It concludes that RGB-D mapping systems have relatively high accuracy and high costs in indoor environments compared to SfM-based mapping systems. Current work limitations and future challenges are presented.

KEYWORDS: SfM, RGB-D, 3D indoor mapping, image collections

1 Introduction

Google Maps and GNSS have revolutionized the way we live and travel. Conventionally, these maps are built by surveying the world around us, but indoor environments such as offices, museums, and airports can be very time consuming to survey due to occlusions. Moreover, due to time constraints and obstruction of satellite signals, maps for indoor environments are difficult to build.

The development of three-dimensional modeling of building interiors makes it possible for indoor maps to be extracted from 3D models. 3D models of objects, structures, and buildings are used in a variety of applications, such as virtual museums, counteracting terrorism, urban planning, simulation for disaster management, navigation systems, the mobile service industry [8], and many other applications.

Traditionally, 3D reconstruction is conducted by photogrammetry [17], which results in huge time and financial costs. Computer vision technologies, such as Structure from Motion (SfM), RGB-D mapping [7], and Simultaneous Localization and Mapping [1], have been applied to 3D modeling systems in recent years. All of the computer vision techniques mentioned above reconstruct 3D models without demanding photogrammetry information, thus achieving comparative efficiency in computation.

Recovering 3D structures from a set of 2D images has become very popular, and several research groups have studied the field over the past few years [11, 14, 13]. Photo Tourism [11] shows how 3D structures can be recovered through photo matching and analysis. The research work [14] explains the possibility of reconstructing 3D models from Internet photos. Moreover, the research work [13] shows promising results for image-based 3D reconstruction of city-scale outdoor scenes.

Researchers usually call a 3D model built by computer vision techniques a point cloud. A point cloud is a set of 3D points which represent the shape of a 3D object. Each 3D point has three (x, y, z) coordinates and the corresponding color information of the point, as well as a list of measurements used to estimate the position of the point. Both sparse and dense point clouds exist.

This paper surveys the performance of two main computer vision techniques for 3D mapping from images, named SfM and RGB-D. We first analyse some interesting 3D mapping systems and their experimental performance based on SfM and RGB-D. Then we make a comparison between SfM and RGB-D techniques with a focus on the applicable environment, system input, accuracy, and costs. Finally, we illustrate future challenges in the image-based 3D mapping field.

The remainder of this paper is structured as follows. Section 2 presents an overview of the Structure-from-Motion and RGB-D mapping techniques. Section 3 surveys current research works. Section 4 makes a comparison between SfM and RGB-D techniques based on the research works. Current work limitations and future challenges are illustrated in Section 5. Finally, Section 6 concludes the paper.

2 Background

2.1 Structure from Motion

Structure from Motion (SfM) is a major approach to building 3D models from images. SfM was introduced to search for the same feature points in photo collections, which recovers the 3D positions that correspond to objects seen in photos, as well as estimates the location and orientation of the camera [13]. After determining the camera orientation parameters, the 3D coordinates of the cameras and the image-based point cloud can be determined. A number of tools for generating a point cloud from 2D images have been created, both by commercial vendors and research groups.

Typically, SfM methods comprise several steps: feature extraction, feature correspondence, skeletal set selection, and bundle adjustment [16].
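As a toy illustration of the correspondence machinery behind these steps, the following Python sketch matches features between two images by nearest-neighbour search with a ratio test. It is not code from any of the surveyed systems: the two-dimensional "descriptors", their values, and the 0.8 threshold are invented for the example (real SIFT descriptors are 128-dimensional, and production systems search them with tree structures rather than a linear scan).

```python
import math

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with a ratio test: a feature of image A
    is matched to its closest descriptor in image B only when that closest
    candidate is clearly better than the second closest."""
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches

# Invented toy "descriptors" for two images.
img_a = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
img_b = [(0.1, 0.0), (5.1, 5.0), (4.9, 5.2)]
print(match_descriptors(img_a, img_b))  # [(0, 0), (1, 1)]
```

The third feature of image A stays unmatched because its two closest candidates in image B are nearly equidistant; rejecting such ambiguous matches is exactly what the ratio test is for.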

Feature extraction solves the problem of detecting distinctive, repeatable features for matching in an image. Feature correspondence verifies (via feature matching) across input images in order to estimate a single 3D point from all the features. Bundle adjustment finds the correct positions of the 3D points and recovers the camera parameters. Skeletal set selection is a supplementary procedure which identifies a subset of images to be reconstructed in order to improve reconstruction performance.

2.1.1 Feature Extraction

One of the most successful feature extraction algorithms is the Scale Invariant Feature Transform (SIFT) [9]. SIFT features are invariant and work well with changes in scale and rotation across input images. However, the performance of feature matching based on the SIFT algorithm can be enhanced, since SIFT features do not contain the location of other features in the image. With respect to improving speed and accuracy, several enhancements, including PCA-SIFT, CSIFT, ASIFT, SURF, and other feature extraction algorithms, have been proposed [12].

2.1.2 Feature Correspondence

Feature correspondence starts with matching features across image pairs. Typically, the features of one image are stored in the form of a tree abstract data structure which provides nearest neighbor search [11]. The features from the other image are used to query the stored features and obtain the nearest neighbor, which is then declared a match. After obtaining the pair matching information, the matched features are grouped to build a feature database [12]. Subsequent features are matched by querying the feature database.

2.1.3 Bundle Adjustment

Recovering the correct positions of the 3D points as well as the camera parameters requires solving a non-linear optimization problem. Generally, bundle adjustment is a process which minimizes the total squared reprojection error across multiple overlapping images. Since directly solving the optimization problem, which is based on a collinearity formula, is hard, most SfM systems implement an incremental approach on unordered photo collections, i.e. starting with a small reconstruction, then growing a few images at a time. The process is repeated until no more images remain [13]. However, the incremental approach consumes a large amount of time and computation resources. The following section shows a good solution for large-scale photo collections.

2.1.4 Skeletal Set Selection

Photo collections, especially those found on Internet photo-sharing sites, are by their nature unordered and highly redundant. A lot of such photos are taken from nearby viewpoints, and processing all of them does not necessarily add to the reconstruction [13]. Moreover, SfM scaling techniques tend not to apply well to irregular photo collections. Therefore, it is necessary to select a smaller set of images which capture the essential geometry information of the scene. Researchers call the selected subset of views a skeletal set. Once the skeletal set is identified, the reconstruction process follows in two steps. First, the reconstruction is conducted on the skeletal images. Then, the remaining photos are added to the reconstruction by estimating each camera's pose with respect to known 3D points matched to that image [13], based on scalable algorithms. As shown in [15], this process results in an order-of-magnitude increase in computation efficiency with little or no loss of accuracy.

2.1.5 Indoor Map Extraction

An indoor map needs to be extracted after obtaining the 3D point cloud built by SfM. The main challenge of recovering an indoor map is to extract the shape of indoor obstacles out of the 3D point cloud.

One straightforward way, proposed in [2], contains two steps. Firstly, the points which represent obstacles in the 3D point cloud are identified. In detail, a flattening process is conducted in which a threshold is set to filter out floor points, and the other points, which represent obstacles, are assigned a zero depth coordinate so that they can be placed on a ground plane. Subsequently, the solution proposed by Duckham [4], which forms a shape from points, is used to generate non-convex polygons from the unordered 2D points in the ground plane. Finally, an indoor map is built by extruding the shapes of the obstacles along the depth coordinate and placing them on the ground plane [2].

Another approach is to update the incomplete 3D model produced by a Multi-View Stereo (MVS) reconstruction algorithm. MVS is one of the most successful dense approaches; it recovers a dense and accurate 3D model by processing the sparse point cloud built through bundle adjustment. The key idea is to compute a full model through some form of interpolation after MVS reconstruction. Manhattan-world Stereo [5] is a popular algorithm to compute a full model. It exploits the Manhattan-world assumption: surfaces are piece-wise planar and aligned with three dominant directions [5]. This assumption addresses the challenges of indoor environments, including low lighting and textureless surfaces. At the end, a complete depth map for every input image is generated. The depth maps then serve as input to be merged into a 3D mesh, i.e. an indoor map.

2.2 RGB-D Mapping

RGB-D 3D mapping uses color frames produced by RGB-D cameras to generate dense 3D models of building interiors. RGB-D cameras provide RGB color information along with per-pixel depth information of images. Henry et al. [6] showed that such cameras are suitable for dense 3D modeling. Specifically, RGB-D cameras are able to capture mid-resolution depth and appearance information at high data rates, typically 30 frames per second [7].

In the RGB-D 3D mapping process, the main steps are similar to SfM. First of all, the feature points contained in the RGB images are extracted. Then, features are matched through a random sample consensus (RANSAC) procedure [7]. Subsequently, a step named loop closure detection [16] iteratively matches the current color frames to a subset of registered frames resulting from feature matches.
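The RANSAC matching just mentioned can be sketched in miniature. Real RGB-D pipelines estimate a full six-degree-of-freedom rigid transform from sampled three-point correspondences [3, 7]; the toy Python fragment below (all coordinates invented) shrinks the model to a pure translation estimated from a single sampled correspondence, but keeps the hypothesise-and-count-inliers structure.

```python
import random

def ransac_translation(src, dst, iters=200, tol=0.1, seed=0):
    """Toy RANSAC: hypothesise a translation from one randomly chosen
    correspondence, count how many correspondences agree within tol,
    and keep the hypothesis with the most inliers."""
    rng = random.Random(seed)
    best_t, best_inliers = None, -1
    for _ in range(iters):
        i = rng.randrange(len(src))
        t = tuple(d - s for s, d in zip(src[i], dst[i]))
        inliers = sum(
            1 for s, d in zip(src, dst)
            if all(abs(sc + tc - dc) <= tol for sc, tc, dc in zip(s, t, d))
        )
        if inliers > best_inliers:
            best_t, best_inliers = t, inliers
    return best_t, best_inliers

# Frame-to-frame correspondences: three agree on a (1, 0, 0) shift,
# one is a mismatched outlier.
src = [(0, 0, 0), (1, 2, 0), (3, 1, 1), (2, 2, 2)]
dst = [(1, 0, 0), (2, 2, 0), (4, 1, 1), (9, 9, 9)]
print(ransac_translation(src, dst))  # ((1, 0, 0), 3)
```

A hypothesis drawn from the outlier explains only itself, so any hypothesis drawn from a correct correspondence wins the inlier vote.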

Loop closure detection is used to determine the best alignment between frames and to check whether the current frame is a revisit to a known scene [3]. Finally, RGB-D mapping builds a global model using small planar colored surface patches [7].

The main technical difference of RGB-D 3D modeling compared to SfM is that RGB-D cameras are able to capture rich visual information of a particular scene, while 3D point clouds built from SfM fail to utilize valuable information contained in the images. Nevertheless, the robustness of depth cameras still has great potential to be enhanced. RGB-D cameras are developed by a number of commercial vendors such as Microsoft and PrimeSense.

3 Survey of Related Works

Reconstructing building interiors based on computer vision techniques is a huge challenge. The indoor environment is always interconnected, resulting in only a subset of the entire objects being visible in each image, thus impacting the feature matching procedure in SfM. Moreover, building interiors are full of painted walls and other textureless objects which lack distinctive key features. Because the MVS algorithm performs relatively poorly on texture-poor objects, more efficient approaches are needed with respect to recovering building interiors from images.

3.1 SfM

3.1.1 Annotated Map Approach

Ricardo et al. [10] aim to map large indoor spaces using Internet photos. Since image collections downloaded from the Internet fail to cover the global view of a particular scene, the 3D models generated through SfM are disjointed and incomplete. Ricardo et al. [10] present an approach to recover the global layout of the 3D models reconstructed from Internet photos with the help of a provided map. The key problem in their approach is to extract and integrate the position, orientation, and scale cues from the 3D models and an annotated map of the site. Generally, their process comprises two steps as follows.

Firstly, 3D models are assigned to the 2D regions of sites in a global frame, i.e. locating the position of the 3D models. The 2D regions are recovered from a provided annotated map. In detail, the annotated map of a site is first found online; then a semi-automatic method is implemented which extracts the spatial layout of the objects on the map. As a result, the corresponding 2D region collections of a given site, such as rooms and hallways, are recovered from the floorplans. Subsequently, the set of discrete 3D models reconstructed by SfM is placed into the 2D regions using cues from Google Image Search and the shape of the rooms. As a result, an indoor map with the global layout of the 3D models is completed.

Secondly, the orientation of the 3D models needs to be determined after they are placed into the 2D regions. The global layout of the 3D models fails to consider the orientation of the 3D models, thus yielding an incomplete indoor map. Ricardo et al. [10] introduce a novel crowd flow cue that measures how people move across the site to recover the 3D geometry orientation. Based on the proposed crowd flow cue, the travel paths of people are measured and then indicate concrete information on the orientation of the 3D models. In addition, Ricardo et al. [10] create interactive navigation with respect to using the map. In the navigation mode of the map, when a user clicks on a room that a 3D model has been placed in, the viewpoint of the user is directed into the aligned 3D model of the room. Fig. 1 shows the promising results of their experiments [10] on major tourist sites.

3.1.2 Manhattan-world Stereo Approach

Researchers in [5] proposed an automated computer vision system to reconstruct an entire floor of a house. The presented final 3D models [5] are impressive (see Fig. 2). The proposed system is fully automated and utilizes a number of existing techniques, including SfM, an MVS algorithm, and the Manhattan-world stereo algorithm. Basically, the system consists of four steps. Firstly, the SfM technique and the MVS dense approach are used to compute the location, orientation, and parameters of the cameras for the input image collections. Secondly, the system implements the Manhattan-world stereo algorithm to generate depth maps from the oriented 3D points after MVS reconstruction. Subsequently, based on a depth map integration approach, the input depth maps produce an axis-aligned 3D mesh model. Finally, users are allowed to explore the reconstructed 3D environment with an image-based, interactive 3D viewer.

SfM, the MVS algorithm, and the Manhattan-world stereo algorithm have already been discussed in Section 2. The depth map integration approach aims to produce a single 3D mesh model out of the depth maps generated by the Manhattan-world stereo algorithm. Due to the scalability challenge of building interior objects, it is not practical to reconstruct detailed and complex 3D models. Therefore, the depth map integration approach focuses on extracting a simplified, axis-aligned 3D mesh model. Specifically, the integration algorithm merges the depth maps into a surface based on a set of complex formulations [5].

The 3D viewer aims to visualize the reconstructed indoor environment. It takes the image collections and the axis-aligned 3D mesh model as input. It includes two modes and is able to keep track of the distances to all the cameras. Users can navigate either the input images or the rebuilt 3D space. Specifically, the viewpoint of the user is always set to the closest camera, combined with data such as the current viewing positions and directions.

3.2 RGB-D mapping

3.2.1 Interactive System Approach

Hao Du et al. [3] proposed a prototype mobile system which builds dense and complete 3D models out of images of indoor environments. The system relies on depth cameras, from which color and depth frames are produced.

Generally, the system is designed to run on a laptop and take in real-time images from depth cameras held by the user. It is an interactive approach that utilizes human input to cope with textureless and low-lighting indoor environments.
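Per-pixel depth is what lets such systems avoid the triangulation that SfM needs: each depth pixel back-projects directly to a 3D point in the camera frame through the pinhole model. A minimal sketch follows; the Kinect-like intrinsic parameters are invented for the example and are not taken from [3].

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Convert one depth pixel (u, v) with depth Z (metres) into a 3-D
    point in the camera frame, using the pinhole camera model."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Invented intrinsics, loosely Kinect-like (640x480 image).
fx = fy = 525.0
cx, cy = 319.5, 239.5
print(backproject(319.5, 239.5, 2.0, fx, fy, cx, cy))  # (0.0, 0.0, 2.0)
```

A pixel at the principal point maps straight down the optical axis; a pixel 105 columns to its right at the same 2 m depth maps to x = 105 * 2.0 / 525 = 0.4 m.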


Figure 1: The 3D jigsaw puzzle: interactive visualization [picture from [10]]

Figure 2: The final 3D models of house and gallery [picture from [5]]

It offers real-time visualization of the reconstructed indoor environment to the user, i.e. the user can view how the 3D map grows in real time on the laptop. Moreover, the system provides feedback for users to change the viewpoint of the depth cameras in order to achieve robustness and completeness of the 3D map. Additionally, due to the utilization of the depth sensor in depth cameras, the reconstructed 3D models are competitively dense compared to other existing 3D modeling systems. The prototype mobile system is easy to use for both experts and non-experts.

The most advanced feature of the system is that it offers the user real-time interaction. In detail, the system automatically provides suggestions on where the map may be incomplete. Therefore, with real-time guidance, the user can adjust the speed of motion and the viewpoints of the cameras in order to cover incomplete areas. Moreover, the system guarantees that frames which do not match recent frames will not be registered into the map, thus achieving the registration of consecutive frames. In addition, the system is able to detect whether the user is revisiting a known scene by matching "keyframes", and the user is allowed to control the corresponding detection algorithm. This function is named "loop closure", which further guarantees that the system yields globally consistent maps. In a word, the robustness and completeness of the prototype system are achieved to a high extent.

The promising results of the prototype mobile system [3] are shown in Fig. 3. Such a detailed indoor 3D map can be used for localization, navigation, measuring, and other possible directions.

4 Comparison

This section conducts a comparison between the SfM and RGB-D mapping techniques from different perspectives, namely the applicable environment, 3D mapping system input, accuracy, and costs, based on the above research works. Table 1 presents the general comparison result.

4.1 Applicable environment

SfM is widely used for both indoor and outdoor spaces. However, SfM works relatively poorly in indoor environments due to textureless objects such as walls, which result in a limited number of distinctive features. The main principle of SfM is to establish the relationship between the different images [17]. Fortunately, SfM is easily combined with other computer vision techniques, such as the MVS or Manhattan-world stereo algorithms, so as to improve the 3D modeling accuracy.

RGB-D is mainly targeted at building interiors. It is not practical for RGB-D to focus on outdoor spaces since it requires globally consecutive frames from a particular scene. Usually,


Figure 3: A variety of indoor spaces captured [picture from [3]]

RGB-D mapping systems are interactive and take in real-time photographs from a depth camera. Furthermore, RGB-D mapping systems, along with SLAM mapping systems, can be applied to robotics, which estimates the robot pose and the scene geometry.

4.2 3D mapping system input and algorithm

SfM takes normal photo collections as the mapping system input. Those images can be obtained from a moving camera or from Internet photo-sharing sites. Apart from the basic feature extraction, feature matching, and bundle adjustment algorithms, other algorithms, including MVS algorithms, stereo algorithms, and depth-map integration algorithms, are used to generate a dense 3D indoor map.

RGB-D usually uses consecutive frames taken directly from depth cameras in a particular building interior. An RGB-D mapping system usually has more complicated algorithms than SfM, since RGB-D deals with both color and depth information. The corresponding alignment algorithms and combinations, such as the 3-point matching algorithm implemented in [3], use both depth and color information [6]. Additionally, automatic matching algorithms need to be implemented to provide frame registration filtering and an interactive loop closure function.

4.3 Accuracy

A 3D mapping system based on SfM alone tends to produce a sparse 3D point cloud in an indoor environment. And when using images from Internet photo-sharing sites, the lack of consecutive images results in a low-completeness indoor map. The input images also have an impact on accuracy, since noisy image search results or a lack of image cues lead to misplaced 3D models [10]. Nevertheless, combined with the Manhattan-world stereo algorithm presented in [5], an SfM mapping system shows promising performance in handling large-scale indoor scenes. However, the Manhattan-world stereo algorithm consumes a large runtime; therefore, challenges still remain to improve the speed of SfM mapping systems. RGB-D mapping systems achieve relatively high accuracy. The indoor maps presented in [3] show rich details for both large and small indoor spaces. The advantage of RGB-D mapping systems is that the depth information provided by consumer depth cameras reduces reprojection errors, thus achieving promising results.

4.4 Costs

Generally, the larger the scale of the targeted indoor scene, the more computation resources are required and the higher the costs. For the same scale of building interior, an RGB-D mapping system has relatively higher costs. Even if a low-cost RGB-D camera such as the Microsoft Kinect is used, an RGB-D mapping system requires consecutive frames taken from the depth camera, while SfM only demands unordered photo collections, originating e.g. from Internet photo-sharing sites. In addition, the costs also depend on the complexity of the 3D mapping system. Some 3D mapping systems implement a number of algorithms in order to achieve high accuracy. In such cases, as in [5], a large amount of runtime is consumed and high costs are produced.

Properties/Techniques     SfM                          RGB-D
Applicable Environment    indoor and outdoor           mainly indoor
System Input              unordered photo collections  consecutive depth frames
Accuracy                  relatively low               relatively high
Costs                     relatively low               relatively high

Table 1: Comparison of SfM and RGB-D

5 Challenges

Despite the promising results presented by some research groups regarding image-based 3D mapping, several challenges remain to be solved, as follows.

• When using SfM with Internet photos as the system input, missing focal length information of the cameras may result in biased results in the mapping process.

• Model updating is needed, since indoor scenes such as the layout of a coffee room or the interior decoration of a shopping center may change from time to time.

• Advanced computing approaches that reduce time consumption need to be developed for larger-scale scenes, such as whole building interiors that comprise multiple floors.

• Accurate and detailed methods are needed to handle irregular indoor structures such as non-axis-aligned surfaces.

• Exploring applications in depth is necessary to effectively use the detailed indoor maps.

6 Conclusion

This paper surveys two main computer vision techniques, SfM and RGB-D, which reconstruct 3D indoor maps from images. In this paper, we have analysed the design and performance of 3D mapping systems using SfM and RGB-D based on current research works. Our comparison results have shown that RGB-D mapping systems have relatively strict

References

[3] H. Du, P. Henry, X. Ren, M. Cheng, D. B. Goldman, S. M. Seitz, and D. Fox. Interactive 3D modeling of indoor environments with a consumer depth camera. In Proceedings of the 13th International Conference on Ubiquitous Computing, pages 75–84. ACM, 2011.

[4] M. Duckham, L. Kulik, M. Worboys, and A. Galton. Efficient generation of simple polygons for characterizing the shape of a set of points in the plane. Pattern Recognition, 41(10):3224–3236, 2008.

[5] Y. Furukawa, B. Curless, S. M. Seitz, and R. Szeliski. Reconstructing building interiors from images. In 2009 IEEE 12th International Conference on Computer Vision, pages 80–87. IEEE, 2009.

[6] P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox. RGB-D mapping: Using depth cameras for dense 3D modeling of indoor environments. In The 12th International Symposium on Experimental Robotics (ISER), 2010.

[7] P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox. RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments. The International Journal of Robotics Research, 31(5):647–663, 2012.

[8] S. Hong, J. Jung, S. Kim, H. Cho, J. Lee, and J. Heo. Semi-automated approach to indoor mapping for 3D as-built building information modeling. Computers, Environment and Urban Systems, 51:34–46, 2015.
input requirement while SfM only demands unordered image collections. SfM are more widely used in both indoor and [9] D. G. Lowe. Distinctive image features from scale- outdoor spaces while RGB-D is mainly used in indoor envi- invariant keypoints. International journal of computer ronment. Generally, RGB-D mapping system has relatively vision, 60(2):91–110, 2004. high accuracy as well as high cost. SfM performances rela- [10] R. Martin-Brualla, Y. He, B. C. Russell, and S. M. tively poor in textureless indoor environment. Nevertheless, Seitz. The 3d jigsaw puzzle: Mapping large indoor mapping systems use SfM are able to improve accuracy via spaces. In Computer Vision–ECCV 2014, pages 1–16. implementing further algorithms. And challenges includes Springer, 2014. reducing runtime and costs remain for both SfM and RGB-D 3D mapping systems. [11] N. Snavely and S. M. Seitz and R. Szeliski. Photo Tourism: Exploring Photo Collections in 3D. In ACM Transactions on Graphics, volume 25(3), pages 835– Reference 846, July 2006. [12] M. Noreikis et al. Image based indoor navigation. [1] A. J. Davison, I. D. Reid, N. D. Molton, and O. Stasse. 2014. Monoslam: Real-time single camera slam. Pattern Analysis and Machine Intelligence, IEEE Transactions [13] S. Agarwal and Y. Furukawa and N. Snavely, et al. on, 29(6):1052–1067, 2007. Building Rome in a Day. In Commun. ACM, Oct 2011. [2] J. Dong, Y. Xiao, and A. Ylä-Jääski. Exploring the [14] N. Snavely, S. M. Seitz, and R. Szeliski. Modeling potential of indoor mapping and localization using in- the world from internet photo collections. International ternet photos. Journal of Computer Vision, 80(2):189–210, 2008.


[15] N. Snavely, S. M. Seitz, and R. Szeliski. Skeletal graphs for efficient structure from motion. In CVPR, volume 1, page 2, 2008.

[16] B. Triggs, P. F. McLauchlan, R. I. Hartley, and A. W. Fitzgibbon. Bundle adjustment: A modern synthesis. In Vision Algorithms: Theory and Practice, pages 298–372. Springer, 2000.

[17] M.-D. Yang, C.-F. Chao, K.-S. Huang, L.-Y. Lu, and Y.-P. Chen. Image-based 3D scene reconstruction and exploration in . Automation in Construction, 33:48–60, 2013.

A survey on communication protocols and standards for the IoT

Gayathri Srinivaasan Student number: 467122 [email protected]

Abstract

The Internet of Things (IoT) is a novel paradigm which defines communication between a variety of products and devices in the real world, via objects such as Radio-Frequency IDentification (RFID) tags, sensors, actuators, mobile phones, etc., by way of unique addressing schemes [1]. The IoT will have a tremendous impact on the everyday lives of people in a variety of fields such as automation, industrial manufacturing, logistics and e-health. However, many challenges must be overcome before the IoT is fully realized, the major issue being the interoperability between the devices that form the IoT. Despite many standardization requirements having been established, there is still a lack of generic and standardized interfaces for definitive communication between IoT devices. This paper surveys some of the communication protocols and assesses them against the requirements for a generic, application-level communication interface.

KEYWORDS: Internet of Things, Communication standards and protocols, RFID, Interoperability, Standardization requirements, Open Group, O-MI, O-DF, oBIX, CoAP, XMPP, MQTT.

1 Introduction

The World Wide Web connects millions of people all over the world at any point in time. The modern internet era targets a world where physical objects and beings may efficiently interact with each other as and when needed. With the Internet of Things, communications are not limited to people-to-people or people-to-computers, but rather take place between people and objects, and also between objects [3]. The IoT is not a single technology, but rather an advanced paradigm in which things, objects, devices and users are connected to each other. For instance, street lights being networked, and entities such as embedded sensors, augmented reality and near field communication being integrated into decision support, are examples of the IoT in everyday life [8]. This in turn creates many business opportunities and adds to the complexity of the IT domain. With the growing number of Business to Business (B2B) infrastructures, there is a need for seamless communication between complex business procedures at minimized ICT costs, while providing better services to users. The key challenge is the way the communication and control techniques have evolved differently across heterogeneous business sectors. To accommodate this diversity, there is a need to address the interoperability requirements between IoT applications. Before implementing the IoT, any organization has to make sure that its ICT infrastructure is flexible enough to establish information flows between various kinds of devices, products and information systems. The need for standardized and flexible communication technologies will create potential for several application areas in the future. Various standardization activities are currently in progress in the scientific community. This paper surveys some of the prominent standardization activities and assesses them against the requirements for a generic, application-level communication interface.

This paper is divided into 4 sections. Section 1 discusses the significance of the IoT and the need for interoperability between IoT devices. Section 2 lists the functional requirements established by the Open Group that need to be satisfied by any IoT messaging standard, and also describes some of the existing messaging standards. Section 3 compares the messaging standards and their specifications described in Section 2 against the functional requirements established by the Open Group. Section 4 concludes the paper.

2 Functional requirements of the IoT

The basic requirement for the IoT infrastructure to emerge successful is interoperability between heterogeneous communication interfaces. The Open Data Format (O-DF) and the Open Messaging Interface (O-MI) standards (formerly known as the Quantum LifeCycle Management standard) emerged out of the PROMISE EU FP6 project, where real-life applications required the collection and management of product information for many domains involving heavy and personal vehicles, household equipment, phone switches, etc. [3]. Based on the needs of those real-life applications, the requirements listed in Fig 1 were identified by the Open Group. These requirements aim to provide generic and standardized application-level interfaces to enable any product or device to exchange information as flexibly as possible for a robust solution [3].

This section introduces the O-MI/O-DF messaging standards and their properties. It also details four other messaging protocols, namely oBIX, CoAP, XMPP and MQTT, and makes a comparison of these protocols with the O-MI/O-DF messaging standards by assessing them against the functional requirements of the Open Group as listed in Fig 1 [3].

2.1 IoT Standardization Landscape

The Internet of Things (IoT) paradigm has a wide scope and the standards landscape is in fact large and complex. For successful interoperability between various domains, the


Figure 1: Functional Requirements of IoT messaging

standards are highly critical to ensure co-operation between the domains and enable the realization of a proper "Internet of Things". The Internet of Things European Research Cluster (IERC) is working to create a reference for pre-standardization activities for the European Commission IoT research projects [8]. There are various standardization activities currently in progress and being proposed. This section gives an overview of five of the messaging standards/protocols proposed by international standards organizations, including the Open Group, CEN/ISO, IETF, OASIS and IEEE.

The basis of the IoT is the web, and some of the established standards of the IoT are HTTP(S), FTP and SMTP. However, they are not suitable for low-power, low-memory, processing-constrained devices [3]. Various organizations have proposed standards, namely XMPP, MQTT and CoAP, which are aimed at resource-limited devices and run directly on TCP and/or UDP.

The following sections describe the various standardization activities established or in progress and assess them against the IoT requirements by the Open Group.

2.1.1 O-MI and O-DF standards

The O-MI/O-DF standards by the Open Group aim to allow communication between intelligent entities to enable the exchange of IoT information in ad hoc, loosely coupled ways. These standards combine the main features of asynchronous enterprise messaging protocols with those of instant messaging protocols to enable peer-to-peer communication [3]. While the web uses the HTTP protocol to transmit information in HTML format, O-MI transports messages represented using O-DF. In order to describe complex data structures as flexibly as possible, both the O-MI and O-DF specifications are written using XML Schema.

Figure 2: O-MI architecture [3]

The O-MI standard satisfies most of the functional requirements established by the Open Group as given in Fig 1.

• Applicability: The O-MI messaging standards can be implemented in any kind of information system, including embedded systems and mobile devices.

• Protocol Binding: The O-MI standards do not depend on any specific communication protocol. O-MI is protocol-agnostic, which means that O-MI messages can be exchanged using HTTP, SMTP, MQTT, CoAP, etc. [3].

• Synchronicity and Client-Server model: The O-MI standard allows synchronous (real-time) communication between devices. It supports immediate read-write operations, including client-poll subscriptions.

• Subscription: The O-MI messaging standard follows an observer subscription model, where a device/node can add itself as an observer of events that occur at another O-MI node. This is significantly different from the publish-subscribe model, which assumes the use of a "high-availability server". Hence the observer model is more suitable for IoT applications where products might communicate with each other directly [5].

• History query: O-MI/O-DF support historical trends of data and allow querying of history data between any two points in time.

• Publication and Discovery: O-MI/O-DF enable publication and discovery of instances and instance-related services with RESTful URL-based queries. O-MI/O-DF also support mobility and intermittent connectivity by specifying a time-to-live.

2.1.2 oBIX

oBIX (Open Building Information Xchange) is an industry-wide initiative by the OASIS Group to define an XML and web services-based standard for communication between building and electrical systems, and enterprise applications [7].
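The O-MI/O-DF pairing described in Section 2.1.1 can be made concrete with a small sketch: O-MI is the transport envelope and O-DF is the XML payload it carries. The fragment below is illustrative only; the element names loosely follow the published Open Group schemas, but namespaces and most attributes are omitted, and the object id "SmartFridge_001" is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified O-MI "read" envelope carrying an O-DF object
# tree. Real messages use the Open Group XML Schemas with namespaces.
OMI_READ = """
<omiEnvelope version="1.0" ttl="0">
  <read msgformat="odf">
    <msg>
      <Objects>
        <Object>
          <id>SmartFridge_001</id>
          <InfoItem name="Temperature"/>
        </Object>
      </Objects>
    </msg>
  </read>
</omiEnvelope>
"""

def requested_infoitems(envelope_xml: str):
    """Return (object id, InfoItem name) pairs requested by an envelope."""
    root = ET.fromstring(envelope_xml)
    pairs = []
    for obj in root.iter("Object"):       # every O-DF Object node
        oid = obj.findtext("id")
        for item in obj.iter("InfoItem"): # leaf data items under it
            pairs.append((oid, item.get("name")))
    return pairs

print(requested_infoitems(OMI_READ))  # [('SmartFridge_001', 'Temperature')]
```

Since O-MI is protocol-agnostic, the same envelope could equally be carried over HTTP, SMTP, MQTT or CoAP, as noted in the Protocol Binding bullet above.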


oBIX is a standard web services protocol that provides mechanical and electrical control systems easy access to the status of their operations, performance, and problems or faults which require analysis or attention. The most important feature of oBIX is extensibility through a concept called 'contracts'. A contract is a list of all the patterns a complex piece of data conforms to [4]. In simple terms, contracts in oBIX are synonymous with classes in object-oriented systems. The oBIX specification is largely defined through contracts. The significance of contracts is that a new contract can be added without changing the existing oBIX schema [7].

oBIX satisfies some of the functional requirements established by the Open Group as given in Fig 1.

• Applicability: oBIX is designed to provide access, via simple XML, to the embedded software systems which sense and control the world around us. It is specifically used for building and mechanical systems for diverse M2M communication.

• Protocol Binding: oBIX binds with the SOAP protocol to interoperate with web services, and also with HTTP to become a RESTful standard.

• Synchronicity: oBIX, much like the WWW, is a big web of XML object documents hyperlinked together using URIs [7]. oBIX supports three request types:
  – Read: returns the current state of an object
  – Write: updates the state of an object
  – Invoke: triggers an operation on an object

• Subscription: In addition to asynchronous 'Read' and 'Write' operations, oBIX also supports the 'Alarm' feature, which indicates a condition that requires notification of either a user or another application [5], for example safety alarms in buildings on the occurrence of an event.

• Client-Server model: The oBIX architecture represented in Fig 3 is based on a client/server network model [4].
  – Server: the software with oBIX-enabled data and services that responds to requests from clients over a network.
  – Client: the software which makes requests to servers over a network to access oBIX-enabled data and services.

• History query: oBIX supports historical trends of data by defining a list of timestamped values, and allows querying and roll-up of history data [7].
  – History Object: a normalized representation of the history
  – History Record: a record of a point sampling at a specific timestamp
  – History Query: a way to query history data
  – History Rollup: a mechanism to summarize an interval of time in the history data

Figure 3: oBIX architecture [7]

To summarize, the oBIX protocol is a suitable communication protocol designed to provide access to a wide range of devices and standardize communication between them.

2.1.3 CoAP

CoAP is the Constrained Application Protocol from the CoRE (Constrained RESTful Environments) IETF group. It is specifically designed for use with low-power, resource-constrained nodes and lossy networks. Unlike HTTP-based protocols, CoAP operates over UDP to save bandwidth on resource-constrained devices. CoAP is based on the REST architecture. In order to overcome the problem of resource constraints, CoAP optimizes the length of the datagram and provides reliable communication. The protocol is particularly intended for small low-power sensors, switches, valves and similar components that need to be controlled or supervised remotely via the Internet [10]. Fig 4 portrays the main features of the CoAP protocol.

CoAP satisfies some of the functional requirements established by the Open Group as given in Fig 1.

• Applicability: Information appliances, control equipment and communication equipment in smart home networks have the characteristics of low cost and light weight. Thus, CoAP could be viewed as the best protocol choice for home communication networks or any domain with the need for communication with resource-constrained devices [2].

• Subscription: The state of a resource on a CoAP server can change over time. Since continuous polling increases the complexity of resource-constrained devices, CoAP supports the 'observe' feature. The client sends an 'observe' request to a resource on the CoAP server. From then on,


the CoAP server notifies the client of any change to the current state of the resource and returns the resource representation [2].

• Synchronicity & Client-Server model: Like HTTP, CoAP supports the client/server model for communication. The CoAP implementation allows peer-to-peer communication with dual roles of client and server. A CoAP client sends a request for an action on a resource (identified by a URI) on the server. The server then sends a response with a response code; this response may include a resource representation. Unlike HTTP, CoAP handles these message exchanges asynchronously using UDP [10].

• Protocol Binding: CoAP is designed to interoperate with HTTP and the web through simple proxies. Because CoAP is datagram-based, it may be used on top of SMS or other packet-based communication protocols.

• Resource discovery: CoAP provides inbuilt support for content negotiation and resource discovery, allowing devices to query each other to find ways of exchanging data [2].

Figure 4: CoAP features [2]

Thus, CoAP is best suited for lightweight communication with resource-constrained devices and overcomes the complexity involved with HTTP.

2.1.4 MQTT

MQTT (formerly Message Queue Telemetry Transport) is a lightweight, broker-based publish/subscribe messaging protocol designed to be simple and easy to implement. It has recently been established as a standard by the OASIS technical committee. MQTT is aimed at constrained environments such as high-latency, low-bandwidth, unreliable and high-cost networks where the client applications have limited processing capability [6].

MQTT fulfils the following functional requirements put forward by the Open Group in Fig 1.

• Applicability: MQTT is applicable to resource-constrained devices. Unlike HTTP, MQTT is agnostic of the data content and deploys a publish-subscribe messaging protocol, which is highly efficient compared to a model of multiple HTTP GET and POST requests. This is particularly useful for lightweight M2M communications [6].

• Subscription & Client-Server model: MQTT operates over TCP, and the publish/subscribe message pattern of MQTT provides one-to-many message distribution and decoupling of applications. Clients may subscribe to multiple topics, and messages are published under a specific topic name (PUBLISH). Clients open TCP connections to the MQTT broker, which handles the routing of messages. Every client subscribed to a topic receives every message published to that topic [4]. Fig 5 shows the architecture of the MQTT protocol.

Figure 5: MQTT Publish-Subscribe model

• Synchronicity: With the publish-subscribe model, MQTT supports asynchronous communication between the server and the clients by providing one-to-many message distribution and decoupling of applications. When publishing messages, clients may request persistence of the message by the broker [6]. One significant example of MQTT asynchronous communication is the Facebook Mobile Messenger.

• Protocol Binding: The MQTT protocol runs over TCP/IP, or over other network protocols that provide ordered, lossless, bidirectional connections [6].

Hence, MQTT is best applicable to networks which are expensive and unreliable and possess limited processing capabilities.

2.1.5 XMPP

Extensible Messaging and Presence Protocol (XMPP) is a communications protocol for real-time communication using XML. It provides a way to send small pieces of XML


from one entity to another in real time. The protocol was originally developed by the Jabber open-source community and formalized later by the XMPP working group formed by the Internet Engineering Task Force (IETF). XMPP is implemented using a distributed client-server architecture and enables asynchronous, end-to-end exchange of structured data by means of direct, persistent XML streams [5]. XMPP was originally developed for instant messaging, to connect people via text messages.

XMPP satisfies some of the functional requirements established by the Open Group as given in Fig 1.

• Applicability: XMPP addresses a device using the addressing scheme "[email protected]". In the context of the IoT, this addressing scheme is advantageous for communicating with a remote device via the web. Some of the significant features of XMPP are channel encryption, authentication, presence, one-to-one messaging, notifications, multiparty messaging, service discovery, etc. XMPP provides a great way, for instance, to connect your home refrigerator to a web server and access it via a mobile phone.

• Client-Server model: XMPP follows a client-server architecture, as displayed in Fig 6, where a client utilizing XMPP accesses a server over a TCP connection, and servers also communicate with each other over TCP connections.

• Subscription: XMPP includes the ability for an IoT-enabled device or entity to advertise its network availability, known as "presence", to other entities. This is performed in the form of a publish-subscribe model [9].

• Synchronicity: In HTTP, the client sends a request to a server and then waits for a reply before it makes another request. By contrast, in XMPP the client can 'pipeline' requests to the server or to other entities and then receive replies as they appear. The occurrence of certain events also triggers information that is pushed to the client.

• Resource discovery: XMPP supports the 'Service discovery' feature, which enables one entity to discover the features supported by another entity, for instance to find information about chat rooms in a chat service.

Thus, XMPP is a real-time communication protocol best suited for a distributed client-server architecture.

Figure 6: XMPP Architecture [9]

3 Comparison of the messaging protocols

In addition to the O-MI/O-DF messaging standards, Section 2 assessed the properties of four IoT messaging standards against the functional requirements established by the Open Group. It is clear that the O-MI/O-DF standards satisfy most of the functional requirements established by the Open Group when compared to the other protocols. It is observed that protocols such as CoAP and MQTT are suited for communication with low-power, low-resource devices. However, it is also important to note that the O-MI/O-DF standards are not limited in usage to low-power, resource-constrained devices and can be used with a myriad of devices. The comparison shows that the O-MI standards satisfy the majority of the functional requirements which are necessary for interoperability between business infrastructures in order to provide a robust IoT solution. Table 1 summarizes the results of the comparison in Section 2. The 'x' marks denote that the particular requirement is satisfied by the protocol.

Functional Requirement   QLM   oBIX   CoAP   MQTT   XMPP
Applicability             x     x      x      x      x
Protocol Binding          x     x      -      x      -
Synchronicity             x     -      x      -      -
Subscription              x     x      x      x      x
History Query             x     x      -      -      -
Client-Server model       x     x      x      x      x
Resource discovery        x     -      x      -      x

Table 1: Comparison of protocol specifications

4 Conclusion & Future Work

This paper surveyed four different messaging protocols, namely oBIX, CoAP, MQTT and XMPP, in addition to the O-MI messaging protocol, and assessed their features against the functional requirements put forward by the Open Group. The paper discussed the key features of these protocols that are required by any business solution to exchange information with one another. It is clear that some of the recent messaging protocols are targeted at low-power and resource-constrained devices and employ techniques to bypass the HTTP standard to enable device communication. It is also obvious from the comparison that the O-MI standards are more flexible and satisfy the majority of the functional requirements required for device interoperability. O-MI and the other protocols discussed are an essential step to enhance product lifecycle management and enable the exchange of product information in order to provide the right service at the right time, anywhere and to anyone. Further research could be carried out to identify more standards to enable complete interoperability in the IoT and achieve successful collaboration of diverse business infrastructures.
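As a compact recap, the comparison in Table 1 can also be expressed as data and queried programmatically. The sketch below simply transcribes the table's 'x' marks (QLM being the former name of the O-MI/O-DF standards) and counts how many of the seven requirements each protocol satisfies.

```python
# Transcription of Table 1: the set of protocols marked 'x' for each
# functional requirement. Columns: QLM (O-MI/O-DF), oBIX, CoAP, MQTT, XMPP.
TABLE_1 = {
    "Applicability":       {"QLM", "oBIX", "CoAP", "MQTT", "XMPP"},
    "Protocol Binding":    {"QLM", "oBIX", "MQTT"},
    "Synchronicity":       {"QLM", "CoAP"},
    "Subscription":        {"QLM", "oBIX", "CoAP", "MQTT", "XMPP"},
    "History Query":       {"QLM", "oBIX"},
    "Client-Server model": {"QLM", "oBIX", "CoAP", "MQTT", "XMPP"},
    "Resource discovery":  {"QLM", "CoAP", "XMPP"},
}

def satisfied_count(protocol: str) -> int:
    """Number of functional requirements the protocol satisfies in Table 1."""
    return sum(protocol in supported for supported in TABLE_1.values())

scores = {p: satisfied_count(p) for p in ("QLM", "oBIX", "CoAP", "MQTT", "XMPP")}
print(scores)
```

Running it reproduces the paper's observation: QLM/O-MI satisfies all seven requirements, oBIX and CoAP five each, and MQTT and XMPP four each.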


References

[1] L. Atzori, A. Iera, and G. Morabito. The internet of things: A survey. Computer Networks, 54(15):2787–2805, 2010.

[2] X. Chen. Constrained Application Protocol for Internet of Things. http://www.cse.wustl.edu/~jain/cse574-14/ftp/coap/index.html. [Accessed Apr 18, 2015].

[3] K. Framling, S. Kubler, and A. Buda. Universal messaging standards for the IoT from a lifecycle management perspective. Internet of Things Journal, IEEE, 1(4):319–327, Aug 2014.

[4] A. Hansen. Open Building Information Exchange: oBIX Unbound. http://www.automatedbuildings.com/news/may06/articles/obix/060425112552obix.htm, 2006. [Accessed Apr 18, 2015].

[5] S. Kubler, M. Madhikermi, A. Buda, and K. Framling. QLM messaging standards: Introduction and comparison with existing messaging protocols. In T. Borangiu, D. Trentesaux, and A. Thomas, editors, Service Orientation in Holonic and Multi-Agent Manufacturing and Robotics, volume 544 of Studies in Computational Intelligence, pages 237–256. Springer International Publishing, 2014.

[6] MQTT. MQ Telemetry Transport. http://mqtt.org/documentation. [Accessed Apr 18, 2015].

[7] OASIS. Open Building Information Exchange Technical Specification. https://www.oasis-open.org, 2014. [Accessed Apr 18, 2015].

[8] O. Vermesan and P. Friess. Internet of Things: From Research and Innovation to Market Deployment. River Publishers, 2014.

[9] P. Saint-Andre. Extensible Messaging and Presence Protocol (XMPP): Core. http://xmpp.org/rfcs/rfc6120.html, 2011. [Accessed Apr 18, 2015].

[10] Z. Shelby, K. Hartke, and C. Bormann. Constrained Application Protocol (CoAP). http://tools.ietf.org/html/rfc7252, 2014. [Accessed Apr 18, 2015].

Website reputation and classification systems

Sridhar Sundarraman Student number: 467216 [email protected]

Abstract

The purpose of a website reputation rating system is to present to the user how safe a website is. The safety rating is determined by taking various factors into consideration, including but not limited to, the likelihood of phishing or scamming, the presence of viruses or malware, the hosting of offensive or illegal content, and the misuse of private user data. Existing systems use manual expert evaluation, crowd-sourcing to obtain input ratings from users, and also machine learning techniques to determine the rating. This paper presents a survey of the recent trends in assessing website reputation ratings from a research perspective and the techniques used by commercially available systems. In addition, it presents a commentary on the significance and comparison of the different approaches employed in contemporary website safety rating determination techniques.

KEYWORDS: Website Safety, Reputation System, Website classification, Safety Features

1 Introduction

With the number of malware and rootkit threats still growing at a significant rate [12], it has now become essential to employ defensive mechanisms to protect users from recent threats. Web reputation is one such method to protect against malicious content on the web. A website is considered malicious if its intention is to disrupt normal user operation, to gather sensitive data, or to gain access to systems without a user's consent. These malicious websites are classified under the umbrella term of "malware" and may include any type of website that introduces viruses, worms or trojans to the user's systems, including spyware and adware. Web reputation systems involve a comprehensive security assessment of a website based on its potential to be a threat, and the provision of a representation of the safety of the website in the form of a rating, thus determining the risk factor of the website. A common way among web reputation systems to provide such a rating is in the form of a score from 0 to 100, where the lower the score, the riskier the website. Table 1 represents a typical way to broadly classify websites based on their safety rating [17]:

• High Risk (1-20)
• Suspicious (21-40)
• Moderate Risk (41-60)
• Low Risk (61-80)
• Trustworthy (81-100)

Anti-virus products have different response types to new zero-hour and zero-day threats, which are attacks that take advantage of the fact that there has been no time for a developer to provide a fix or patch for the vulnerability. For such cases, web reputation systems act as additional defence mechanisms by enhancing the security coverage. In some particular cases where the threat has not yet been detected, these systems may be the only level of protection available to users. Some of the features of these systems include their ability to predict the security of a website even before an actual threat is detected. This is achieved by keeping previous records of infected sites and using this data to determine the reputation score. In web filtering, web reputation can also be combined with content categorization to identify the type of a website, e.g. "News and Media", "E-Commerce", and "Social". The categories may also include information about the security risks involved, such as "Phishing" or "Malware".

Web reputation scores may be used in different scenarios. Known compromised sites which are categorized in the "High Risk" band may be blocked. For websites of moderate risk, a warning may be presented to the user informing her of possible risks, thus providing proactive protection.

Section 2 lists some factors that are typically analysed in websites to compute the safety ratings. Sections 3 to 5 present the various processes involved in a typical website reputation system, with emphasis on recent research trends. These include the process in which data is gathered about the known safety of websites, the way in which safety ratings are presented to users, and the properties that are analyzed to determine the safety. In addition, section 7 presents various commercially available website reputation systems and the approach they use in their prediction of website safety.

2 Safety variables

Typical web reputation systems use a large number of factors and a machine learning technique to predict the safety rating of a website. Instead of using a manual process, a predictive model is built using, as the training dataset, a known dataset of millions of examples of websites whose safety rating is known by other means. This model is then able to assess the rating of any arbitrary website whose rating is unknown.

The sample input provided to the classifiers can be obtained by analysing features of the websites such as the location of the website, the ISP/web host, the IP neighbourhood [17], structural features from the HTML and JavaScript embedded in the site [9], and content-based feature sets from malicious outbound links and

165 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

textual content [7]. Static features that could be extracted from the website include [9], for example, the presence of iframe tags in the HTML, the occurrence of hidden UI elements, and the presence of meta refresh tags and embed tags. When the legitimacy of the website is taken into consideration, the factors may include the domain age and the number of IP addresses for the site. Unless it is clear that a website is legitimate, new websites should be treated as suspicious by default. Many web reputation systems also maintain a known threat history for websites, based on the assumption that a website which has been infected in the past is more likely to be risky than a website with a clean past threat record.

Figure 1: Crowdsourced rating collection by SmartNotes [10]

Figure 2: Groupsourced rating collection by FAR [13]

3 Data Gathering

3.1 Crowdsourcing

Crowdsourcing, as already noted above, is widely used for gathering safety ratings from users. These ratings, in the simplest case, are aggregated and presented to users in a meaningful way. Crowdsourcing is analogous to the user-review systems commonly implemented in e-commerce platforms, where users are asked to share their experiences of particular products that they have purchased.

Web of Trust [6] (discussed in detail in section 7.3) is a commercial system that uses this approach to present ratings to users in the form of a browser plugin. Fink et al. [10] propose a crowdsourcing architecture where users are asked to share their experiences of web threats. Their platform "SmartNotes" allows a user to rate websites, post comments, and ask and answer questions related to the safety of websites. In addition to integrating their system with existing crowdsourced question-answering systems, they also employ machine learning and natural language processing to analyze the user feedback.

3.2 Groupsourcing

Groupsourcing is an approach that involves gathering data from a user's social circles instead of the public in general. This approach has the advantage that individuals within a social group can be trusted more, in a social sense, than a random stranger. Liu [13] uses groupsourcing to identify unsafe content in online social networks. They build a predictive model based on the groupsourced dataset and apply a Support Vector Machine (SVM) as a classifier with which they predict the rating level for a given website.

3.3 Advantages and Disadvantages

In many systems that use machine learning techniques, the dataset collected using crowdsourcing is used as the ground truth for building a predictive model. The ratings provided by crowdsourcing systems can usually be considered relatively accurate, as intentional manipulation of ratings by some malicious users is usually offset by the ratings provided by a multitude of users. Crowdsourcing is relatively cheaper than conducting user studies and provides a wide diversity of input. The main disadvantage of crowdsourcing is time lag: the delay between the launch of a website and the time when a rating is available for it. This is because the task of outsourcing a job to a large group of people is inherently time consuming. Groupsourcing partially tackles this problem, as a rating available from just a single user within the social circle is enough to conclude the rating, whereas crowdsourcing needs a sufficient number of users to perform aggregation of results. Another disadvantage is that crowdsourcing does not scale to the entire Internet: the coverage of ratings that can be obtained is only minimal. Bhattacharya et al. [7] provide a representation of the number of websites rated by Web of Trust [6] among the most popular one million websites as collected by Alexa (http://www.alexa.com).

4 Representation of safety

4.1 Blacklists and warnings

Blacklisting is the most common approach to fighting scams and malware distribution on the Internet. A database of malicious websites is maintained and provided to others by several online services including Google, Symantec and Sucuri. It is common for browsers to display a warning if a user tries to visit a website found in any of the blacklists. Malicious websites that are added to the blacklists are


usually known from abuse reports from users and other authorities such as Google, Bing, Norton Safe Web and McAfee SiteAdvisor. These websites may have been reported to have distributed malware, to have shown suspicious activity as revealed by inconsistencies from search engines, or to pose phishing attack threats. Fig 4 [2] shows the warning displayed by the web browser Google Chrome when a user visits a blacklisted website. Other popular browsers such as Mozilla Firefox and Apple Safari also show similar warnings before letting the user visit.

Figure 4: Warning shown by Google Chrome [2]

Reputation System: Safety Representation
WOT: Trustworthiness score: 0-100; Child safety score: 0-100
McAfee SiteAdvisor: Color-coded glyphs (Green: website is safe; Yellow: some issues with the site; Red: serious issues with the site; Gray: no rating available; Blue: internal site or private IP range; Black: phishing site)
Norton Safeweb: Color-coded glyphs
F-Secure Search: Categories: Safe, Allowed, Suspicious, Harmful, Unknown and Denied

Table 1: Safety representations

Figure 3: Availability of ratings from WOT for the most popular 1 million websites as collected by Alexa [6, 7]

Although blacklists may be useful to protect users from infamous malicious sites, they have several limitations. They may not contain very recent scam sites, or sites that have moved to a new domain. Moreover, intentionally biased or inaccurate reports may result in legitimate websites being added to blacklists. Blacklists also have to be updated periodically lest they become obsolete.

Apart from warnings, as noted in section 1, the safety can also be represented as a score [6], or as categories of varying levels of riskiness [4, 3]. Table 1 shows the ways different reputation systems present the safety of a website.

5 Properties used in safety determination

5.1 Suspicious URLs

In order to tackle the limitations of blacklists, Ma et al. [14] propose an approach for classifying websites using machine learning classifiers based on lexical features and host-based features extracted from the website. They consider the task of URL reputation prediction as a binary classification problem, with malicious websites as positive samples and benign URLs as negative samples. The lexical features that they consider include properties such as the length of the host name, the length of the URL in its entirety, and the number of dots in the URL. In addition, they consider binary features for each token in the URL (parts delimited by a '.'). The host-based features considered include IP address properties, WHOIS properties, domain name properties and geographic properties. They claim that the host-based features are sufficient to describe the host of the malicious site, the owner of the site, and the way in which the site is managed. These features are used to build models which are then tested with classification algorithms such as Naive Bayes, SVM and logistic regression. In their evaluation, they report a 14.8% false positive rate (FPR) and an 8.9% false negative rate (FNR).

Ma et al. do not consider any features other than the above-mentioned lexical and host-based features. In particular, they argue that extracting features from the website's content for classification is a slow process and may be resource intensive. Although it is possible that malicious websites serve different content to clients based on IP addresses, the content is still a significant factor in determining the safety of a website. While classification using features based only on the URLs may be applicable in any context, malicious websites may cloak URLs using shortener services or client-side redirection directives.

5.2 DNS and Web server relationships

Many attackers employ a centralized exploit server from which malicious content is served to many websites that the attacker has control of. Involving many websites in such a manner renders investigation of the attack difficult. Seifert et al. [16] present a method to identify malicious websites by identifying attributes that characterize the servers referenced by the web page. If the servers involved are malicious in nature, then this affects the safety rating of the website


in question. Their motivation for proposing such a system is to tackle the limitations of the client honeypots that are commonly used to detect servers that launch drive-by-download attacks. These honeypots are resource intensive and have a high tendency to produce false negatives. Moreover, since typical honeypots make use of a vulnerable client to track unauthorized changes during interaction with a malicious server, if there is no interaction, the site may go undetected.

The approach proposed by Seifert et al. involves analyzing the relationships between servers. The two main types of servers involved are DNS servers and web servers. In a typical attack, the common structure of inter-related servers includes a centralized exploit server. Here, the exploit server serves malicious content via several web servers which it controls. The actual serving of the exploit pages can be done via iframes and redirects, and exploitation kits exist that support this structure. Fig 5 shows a typical structure of the servers involved in an attack. In order to collect the training dataset, they employed a client honeypot to monitor and capture network traffic between vulnerable clients and malicious servers, and in this process identified which DNS servers are involved. This data is then fed into a J4.8 decision tree learning algorithm. The resulting predictive model can assess whether a web page is malicious or not. This approach has reported a high FN rate (25.5%) but a low FP rate (2.6%). Also, this approach has to be periodically updated to counter attackers who change their techniques.

Figure 5: Server relationship structure in a typical attack [16]

5.3 Passive Domain Analysis

Employing botnets is a common attack scenario in which a multitude of end-user systems are compromised and taken over by attackers. These clients are turned into bots, which are then used to launch Distributed Denial of Service (DDoS) attacks, to steal sensitive data, and for spamming. In order to set up such an infrastructure, the clients must report back to the exploit server, which requires that the IP address of the server be hardcoded on the client side. This results in a single point of failure for the attackers: if the IP address is investigated and found, it may bring down the entire botnet. To avoid such a case, attackers make efficient use of the Domain Name System (DNS). DNS gives them the flexibility to change IP addresses dynamically and to hide behind proxy servers, making it difficult to trace them back.

Bilge et al. [8] demonstrate that, by analysing DNS traffic and by studying the behaviour of known malicious and benign domains, we can identify features that help determine the maliciousness of a domain. The features they have identified are classified as time-based features, DNS answer-based features, TTL value-based features and domain name-based features. In their system, called EXPOSURE, they passively monitored DNS traffic from the Security Information Exchange (SIE). EXPOSURE is trained using a dataset of over 100 billion DNS queries and 4.8 million domain names, and it predicted over 3000 malicious domain names when tested within a commercial Internet Service Provider (ISP). Their classifier uses the J48 decision tree algorithm. When tested with a recorded data set, EXPOSURE reported that 5.9% of 300,000 domains were malicious. When cross-checked with data from McAfee and Norton, a false positive rate of 7.9% was estimated.

As with any website reputation system that extracts features as inputs for classifiers, EXPOSURE suffers from the limitation that attackers may manipulate the features that are deemed to report maliciousness. Again, as with any other system that uses machine learning techniques, the efficiency of the system depends on the training set.

6 Comparison and Discussion

Many of the approaches, as noted above, take the static/structural features of a website as the major safety variables. These features are chosen based on the assumption that they may have a correlation with the safety of a website. For example, the approach of [14] takes only the URL and features derived from the URL as safety variables. Considering dynamic features and content-based features (using topic modelling techniques for extraction) may also provide better accuracy. These, along with the empirical cumulative distribution function (ECDF) of the ratings of the links embedded in a website, are being considered in active research carried out by the Secure Systems research group at Aalto University [7]. Analyzing DNS and web server relationships to determine safety [16] may be applicable only to a specific subset of attacks, particularly the attack scenario mentioned in section 5.2, and is not generalizable to cover a wider range of threats.

The results of any machine learning algorithm are expected to have as low FP and FN rates as possible, and these rates have a direct correlation with its accuracy. A high FN rate implies that the system has a high probability of predicting a malicious website as benign, and a high FP rate implies that it has a high probability of predicting a benign website as malicious. Of the approaches described in the previous section, [14] and [8] have similar FN and FP rates, but [16] has a very high FN rate.

Many of the reputation systems consider the prediction of safety to be a classification problem, and so the type of classifier used also has an impact on the accuracy of the classi-


fication. For example, [14] have experimented with Naive Bayes, SVM and logistic regression; [16, 8] have both used the J48 decision tree algorithm, an open-source Java implementation of the C4.5 algorithm [15].

7 Commercial Website Rating Systems

Many anti-virus companies maintain a website reputation system and make it available to the end-user via a browser add-on/plug-in which provides visual color-coded glyphs beside search results from search engines like Google and Bing. The approaches taken by some of the commercially available web reputation systems are discussed below.

7.1 McAfee TrustedSource and SiteAdvisor

McAfee's TrustedSource [5] is an Internet reputation system that aids their SiteAdvisor software [4]. SiteAdvisor is installed as a browser plugin and shows site rating icons beside search results. They also provide an optional search box which filters search results automatically. These products alert users to potential risks. The rating icons (as shown in Fig 6) comprise four categories: "Safe", "Caution", "Warning" and "Unknown", which correspond to low-risk, medium-risk, high-risk and unverified websites respectively.

Figure 6: McAfee SiteAdvisor and their rating categories [4]

Factors that SiteAdvisor analyzes in websites include:

• Presence of downloadable files such as toolbars or screensavers, where there is a possibility of them being bundled with viruses, spyware and adware. Websites earn a red rating if any such files are present.

• Possibility of spamware via e-mail subscriptions.

• Browser exploits or drive-by downloads which may install trojans and keyloggers on the user's system.

• The reputation of the website, as reported by their TrustedSource system.

• Vulnerabilities in e-commerce websites, where there are high possibilities of financial data theft.

• Annoyances like pop-ups and cookies.

• Outbound links from websites: many websites, while not malicious themselves, are set up just to trick users into visiting other malicious websites.

7.2 Norton Safe Web

Norton's SafeWeb [3] is a reputation system that also provides users with a browser toolbar that alerts them to the safety ratings of websites. The ratings are, as with SiteAdvisor, shown in four categories of varying risk. The major difference between McAfee's SiteAdvisor and SafeWeb is that the latter requires users to sign in with an account and also asks for the users' experiences of websites, as a crowdsourcing measure. SafeWeb, in addition to the factors analyzed by SiteAdvisor, also analyzes unsolicited browser changes, suspicious browser changes, trackware, cybersquatting and pay-per-click sites.

7.3 Web of Trust

As already noted in section 3.1, Web of Trust (WOT) [6] is a reputation service that employs a unique crowdsourcing approach, where a global community of users rates and reviews websites based on their personal experiences. WOT shows indications in the form of traffic-light glyphs beside search results and on social networking sites such as Facebook and Twitter. Clicking on the traffic lights shows a pop-up which provides detailed information about the reputation of the specific website. WOT provides ratings in two dimensions, trustworthiness and child safety, each as an integer in the range 0 to 100. Fig 7 shows a page where users can rate their experience of a website.

Figure 7: WOT's crowdsourcing approach: collecting ratings from users [6]

8 Usability of website reputation systems

Social acceptance, or usability, is also a critical factor to consider when designing a website reputation system; it may dictate how useful a system actually is to a user. Sucuri [1] show that the blacklists and warnings shown by browsers are very effective. They mention that a website which has been blacklisted by a search engine loses nearly 95% of its organic traffic. This effectively implies that users take blacklists very seriously.
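To make the lexical URL features discussed in section 5.1 concrete, the sketch below extracts a few of them (hostname length, total URL length, dot count, and the '.'-delimited host tokens used as binary features). This is a simplified, hypothetical illustration under the assumptions described in [14], not the authors' actual feature pipeline.

```python
import re

def lexical_url_features(url: str) -> dict:
    """Extract a handful of the lexical URL features described in
    section 5.1. Hypothetical sketch; the real feature set in [14]
    is considerably larger."""
    # Strip the scheme, if any, then take everything before the first '/'.
    without_scheme = re.sub(r"^[a-z]+://", "", url.lower())
    host = without_scheme.split("/")[0]
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": url.count("."),
        # Binary token features: one indicator per '.'-delimited host token.
        "tokens": set(host.split(".")),
    }

# A long host with many dots and a brand name buried in a subdomain is
# exactly the kind of URL such features are meant to flag.
f = lexical_url_features("http://paypal.com.evil-login.example.net/verify")
print(sorted(f["tokens"]))
```

Feature dictionaries like this one would then be fed to a classifier such as Naive Bayes, an SVM or logistic regression, as in the evaluation described in section 5.1.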


User studies conducted by Karvonen et al. [11] show that the visually prominent parts of a reputation system usually get prominent attention from users. This is employed effectively by many of the commercial systems, as noted in section 7. Visually prominent reputation information includes images, symbols and statistical visualizations. The most common representation of website safety is by color-coded glyphs and icons which correspond to the level of risk that the website poses. This visual information is also usually combined with detailed textual information and pointers to how the reputation is actually determined.

The form in which a reputation system is available to the end-user is another factor that affects usability. Many systems provide browser add-ons, which highlight the safety rating beside search results. Some reputation system clients are only available as part of a larger enterprise security product suite, and not as stand-alone products. Norton SafeWeb provides its own search engine and also a web interface in which users can query the reputation by giving a URL as the input.

9 Conclusion

This paper briefly introduced the definition and purpose of a website reputation rating system. It also presented a survey of trends in approaches to estimating the safety of a website and a brief look at some specific research studies. In order to overcome the limitations of traditional defence mechanisms such as blacklisting, reputation systems typically use machine learning techniques in which they build a predictive model based on a training dataset. Various features of websites are considered by different approaches as factors that may indicate their maliciousness. This paper also discussed the effectiveness and limitations of such features. Finally, the paper gave an overview of contemporary commercial reputation systems, the features that they analyze and the form in which they present their ratings.

References

[1] Google Blacklisted Website: Understanding and Removing A Google Blacklist. https://sucuri.net/website-security/google-blacklisted-my-website. Accessed: 2015-04-18.

[2] Google Blacklist Warning: Something is Not Right Here! http://blog.sucuri.net/2012/07/google-blacklist-warning-somethings-not-right-here.html. Accessed: 2015-03-21.

[3] Is this Website Safe | Web Security | Norton Safe Web. https://safeweb.norton.com/. Accessed: 2015-03-21.

[4] McAfee SiteAdvisor Software - Website Safety Ratings and Secure Search. http://www.siteadvisor.com/howitworks/index.html. Accessed: 2015-03-21.

[5] TrustedSource - Internet Reputation System. http://trustedsource.org/en/home. Accessed: 2015-03-21.

[6] Web of Trust. https://www.mywot.com/. Accessed: 2015-03-19.

[7] S. Bhattacharya, O. Huhta, and N. Asokan. Lookahead: Augmenting crowdsourced website reputation systems with predictive modeling.

[8] L. Bilge, E. Kirda, C. Kruegel, and M. Balduzzi. Exposure: Finding malicious domains using passive DNS analysis. In NDSS, 2011.

[9] D. Canali, M. Cova, G. Vigna, and C. Kruegel. Prophiler: A fast filter for the large-scale detection of malicious web pages. In Proceedings of the 20th International Conference on World Wide Web, WWW '11, pages 197-206, New York, NY, USA, 2011. ACM.

[10] E. Fink, M. Sharifi, and J. G. Carbonell. Application of machine learning and crowdsourcing to detection of cybersecurity threats, 2011.

[11] K. Karvonen, S. Shibasaki, S. Nunes, P. Kaur, and O. Immonen. Visual nudges for enhancing the use and produce of reputation information. In Proceedings of the ACM RecSys 2010 Workshop on User-Centric Evaluation of Recommender Systems and Their Interfaces, 2010.

[12] McAfee Labs. McAfee Labs Threats Report June 2014. Whitepaper, McAfee. http://www.mcafee.com/mx/resources/reports/rp-quarterly-threat-q1-2014.pdf.

[13] J. Liu. How to Steer Users Away from Unsafe Content. Master's thesis, University of Helsinki, Helsinki, 2014.

[14] J. Ma, L. K. Saul, S. Savage, and G. M. Voelker. Beyond blacklists: Learning to detect malicious web sites from suspicious URLs. In Proceedings of the SIGKDD Conference, Paris, France, 2009.

[15] S. Salzberg. C4.5: Programs for Machine Learning by J. Ross Quinlan. Morgan Kaufmann Publishers, Inc., 1993. Machine Learning, 16(3):235-240, 1994.

[16] C. Seifert, I. Welch, P. Komisarczuk, C. Aval, and B. Endicott-Popovsky. Identification of malicious web pages through analysis of underlying DNS and web server relationships. In Local Computer Networks, 2008. LCN 2008. 33rd IEEE Conference on, pages 935-941, Oct 2008.

[17] GFI. How Web Reputation Increases Your Online Protection. Whitepaper, GFI. www.gfi.com/whitepapers/web-reputation-wp.pdf.

Delay-Sensitive Cloud Computing and Edge Computing for Road-Safety Systems

Jan van de Kerkhof Aalto University School of Science [email protected]

Abstract

With the increase in vehicular communication technologies and the development of cloud and cloud-edge computing technologies, the capabilities of Intelligent Transportation Systems (ITS) increase, and with them the possibilities for developing road-safety systems. This paper provides a comparison of two deployments of such systems: the first a vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) implementation that makes use of road-side units (RSUs), and the second an implementation that uses the LTE network with cell caching at the base station, a technique that caches data that is in high demand. Both implementations make use of a (distant) cloud server. The comparison shows that the RSU implementation is more suitable with regard to capacity, and that the LTE implementation provides more potential for an easy deployment of the application. However, both implementations fail to provide an optimal implementation in both feasibility and cost with respect to the currently available technologies.

1 Introduction

During the last two decades, Intelligent Transportation Systems (ITS) have been developed to improve the quality and safety of commuting. Even though these systems have been contributing to a better flow of traffic and commuting experience, much room for improvement remains. There are still many traffic accidents around the globe, and three fourths of these can be attributed to human error [12]. Moreover, studies have shown that 50-60% of traffic congestion in American cities is a result of traffic accidents [18]. Furthermore, there have been considerable developments lately in the area of self-driving vehicles. Google has unveiled the first build of their self-driving car prototype, for which the state of California has approved testing on public roads [2]. Also, a world-premiere test of a self-driving bus without a steering wheel has been scheduled for December 2015 in Wageningen, the Netherlands [4]. All these examples underline the need for intelligent road-safety systems and applications. Examples of these types of applications include an emergency warning system, a lane-change assistant and an intersection coordinator. Furthermore, these systems are able to detect traffic sign violations and give warnings about current road conditions. Recent developments in communication and computing technologies, such as the deployment of LTE and the paradigm shift towards cloud computing, enable the development of these advanced applications. An example of an advanced ITS system is the ITS corridor being deployed between Rotterdam and Vienna, a first-of-its-kind highway that is being equipped to fully autonomously shepherd cars without human intervention, coordinating traffic flow and warning about road constructions [1]. In order to implement an intelligent road-safety system, the infrastructure has to meet certain requirements. Firstly, since the system has to function in real time in critical road-safety situations, safety messages need to be delivered and delay needs to be kept to a minimum, as the latency deadline is strict. Secondly, there is the issue of capacity, as the system needs to be able to handle subscriptions at the scale of entire cities.

The outline of this paper is as follows. Firstly, section 2 gives an overview of the requirements of road-safety systems. Then, sections 3 and 3.1 provide an overview of an implementation that uses RSUs, after which sections 3.2 and 3.2.1 provide an overview of the cloud and cloud-edge computing technologies that can be used for this implementation. Then, in section 4, an overview is given of the alternative, LTE-based, implementation, where section 4.1 provides an overview of the Nokia RACS technology that can be used for this implementation. After this, section 5 draws conclusions about the best implementation with the current technologies and makes speculations regarding possible future implementations. Finally, section 6 gives some suggestions for future work.

2 Requirements of road-safety systems

The main function of a road-safety system is to keep drivers safe and avoid collisions. Keeping a driver safe can be done with different types of information and safety messages, some of which are more time-critical than others. Therefore, in road-safety systems there is a distinction between safety-critical messages and messages warning about the environment or road conditions, where a safety-critical message signals immediate danger. As stated by T. K. Mak et al. [11], S. Kato et al. [8] and Q. Mu et al. [17], the maximal delay for a safety-critical message in road-safety applications is 100 ms. This is based on the assumption that each participant in the system broadcasts its positioning data at a frequency of 10 Hz. This means that if a message is delayed by more than 100 ms, the sender has already produced a new positioning message, rendering the previous one obso-

171 Aalto University CSE-E4430 Methods and Tools for Network Systems Autumn 2014

lete and inaccurate. In both the implementations discussed in this paper, safety-critical communications have to flow through a centralized unit, the data center. The reasons for this are explained in detail in sections 3 and 4. Thus, for a road-safety application to function desirably, the round-trip time (RTT) between the user equipment (UE) and the data center, plus the processing time of the application, has to be less than 100 ms.
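The latency budget above is simple arithmetic: the deadline equals one beacon period at 10 Hz, and the RTT plus processing time must fit inside it. The sketch below illustrates this check with invented numbers; the RTT and processing values are assumptions for illustration, not measurements from any of the cited systems.

```python
BEACON_FREQUENCY_HZ = 10
# A message older than one beacon interval has been superseded by a
# fresher position update, so the end-to-end deadline equals the
# beacon period: 1000 ms / 10 Hz = 100 ms.
DEADLINE_MS = 1000 / BEACON_FREQUENCY_HZ

def meets_deadline(rtt_ms: float, processing_ms: float) -> bool:
    """True if the UE <-> data-center RTT plus the application's
    processing time fits within the safety-critical deadline."""
    return rtt_ms + processing_ms < DEADLINE_MS

# Illustrative values only (assumed, not measured):
print(meets_deadline(rtt_ms=60, processing_ms=20))   # True: 80 ms < 100 ms
print(meets_deadline(rtt_ms=90, processing_ms=15))   # False: 105 ms >= 100 ms
```

The same check frames the comparison in the rest of the paper: the RSU and LTE deployments differ mainly in how much of the 100 ms budget is consumed by the network path to the data center.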

3 Deployment with road-side units

One way of constructing the infrastructure required for a road-safety application is through Vehicular Ad Hoc Networks (VANETs), communication networks with fast-moving nodes (vehicles). They consist of on-board units (OBUs) built into vehicles and road-side units (RSUs) deployed on roads and highways that facilitate vehicle-to-vehicle (V2V) as well as vehicle-to-infrastructure (V2I) communication. V2V communication can be used by vehicles to broadcast traffic-related information (through periodic beaconing), such as vehicle location and speed, to all nearby vehicles. This raises awareness of the surrounding traffic conditions. V2I communication can be used for traffic management purposes (e.g. lane-changing assistance) and to make participants aware of road conditions (e.g. a sharp turn ahead or a road blocked by an accident) [6]. The dedicated short range communication (DSRC) standard has been developed for the purpose of V2V and V2I communication [17] in VANETs. Furthermore, the WAVE (wireless access in vehicular environments) or IEEE 802.11p standard has been developed as an addition to the Wi-Fi standard to enable wireless communication amongst high-speed vehicles and between vehicles and road-side equipment. Even though vehicles can communicate with each other through DSRC, applications that solely rely on V2V communication face two major problems. The first is that it is hard to find a communications coordinator in a highly dynamic VANET, making the coordination of complex traffic situations in a fraction of a second highly challenging. The second is that the WAVE channel is memory-less, making it vulnerable to obstacles. This paper therefore assumes that any road-safety system requires both V2V and V2I communication (with an underlying data server as coordinator) to function properly. As a side note, numerous open issues remain regarding VANETs that are currently being researched. These issues mainly concern the safety of the network against intruders and effective routing in such rapidly changing topologies, but they are beyond the scope of this paper. Figure 1 gives an example of a functioning VANET.

Figure 1: Example of a VANET [10]

One road-safety application that has been developed based on VANETs is SAFESPOT [5], an application that enhances a driver's awareness of the traffic situation by displaying appropriate warning messages. The system collects data from various sources, including other drivers and road-side infrastructure, and uses this to build a dynamic map of the current traffic situation. P. Jaworski et al. [7] propose a concept for an urban traffic management system based on an infrastructure that uses RSUs, the main goal of which is to maximize traffic flow and improve the overall safety of traffic by managing traffic at every intersection in dense urban areas. The concept bases some of its functionality, such as data acquisition and the building of a dynamic map, on the SAFESPOT project. The authors design their system as a cloud/grid-based computing system by dividing their application into six distinct interacting service layers, among which are an intersection control service and a routing service. This design shows that it is possible to develop a highly scalable, cloud-like road-safety application based on the RSU infrastructure. The application in [7] is designed only for traffic coordination and not for safety-critical situations like collision prevention, but it could be adapted to meet these requirements, as building such an application does not require any additional infrastructure.

3.1 ITS corridor

A real-world example of a system that relies on V2V and V2I communication is the cooperative ITS corridor [14] [1] being developed by the Netherlands, Germany and Austria. The work on it will start in 2016. It is a highway that is being equipped with ITS equipment, and its purpose is to coordinate traffic by providing cars with speed recommendations, slowing down cars further down the highway and thus preventing the formation of traffic jams generated by the 'shock wave' effect of suddenly braking cars. The system relies on the 802.11p (WAVE) Wi-Fi standard for V2V and V2I communication. It is possible that an LTE connection will be used to acquire probe data (location, speed) about every car, which will be used for traffic management. The ITS corridor does not provide real-time traffic-safety functionality. The testing of the system will be performed on a stretch of highway equipped with camera poles and Wi-Fi access points every 100 meters and Wi-Fi antennas every 500 meters. Since the WAVE standard only provides reliable communication over a range of around 300 meters, the access points have to be placed close to each other. The information gathered by the system is processed through central ITS stations. These stations are all responsible for a large

172 Aalto University CSE-E4430 Methods and Tools for Network Systems Autumn 2014 section of the road-network. Regarding the cloud-computing aspect of ITS systems, this central ITS station can be consid- ered as the data center that will host the application. Thus, the ITS corridor, although it does not provide road-safety functionality, is almost a one-to-one fit with the theoretical cloud-computing models provided earlier, as it uses almost all of the same technologies and infrastructures.
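The periodic beaconing described above can be made concrete with a small sketch. The following Python snippet packs a vehicle's position and speed into a compact binary beacon; the field layout (vehicle id, micro-degree coordinates, speed in cm/s, heading in centi-degrees, millisecond timestamp) is purely illustrative and is not the standardized SAE J2735 Basic Safety Message used by real DSRC stacks.

```python
import struct
import time

# Illustrative V2V beacon layout (NOT the standardized SAE J2735 BSM):
# vehicle id, latitude/longitude in micro-degrees, speed in cm/s,
# heading in centi-degrees, and a millisecond timestamp.
BEACON_FMT = "!IiiHHQ"  # network byte order, 24 bytes total

def encode_beacon(vehicle_id, lat_deg, lon_deg, speed_ms, heading_deg, ts_ms=None):
    """Pack one periodic beacon into a compact binary payload."""
    if ts_ms is None:
        ts_ms = int(time.time() * 1000)
    return struct.pack(
        BEACON_FMT,
        vehicle_id,
        int(lat_deg * 1_000_000),       # degrees -> micro-degrees
        int(lon_deg * 1_000_000),
        int(speed_ms * 100),            # m/s -> cm/s
        int(heading_deg * 100) % 36000, # degrees -> centi-degrees
        ts_ms,
    )

def decode_beacon(payload):
    """Unpack a beacon payload back into engineering units."""
    vid, lat, lon, speed, heading, ts = struct.unpack(BEACON_FMT, payload)
    return {
        "vehicle_id": vid,
        "lat": lat / 1_000_000,
        "lon": lon / 1_000_000,
        "speed_ms": speed / 100,
        "heading_deg": heading / 100,
        "ts_ms": ts,
    }
```

At the 10 Hz beaconing rate commonly assumed for DSRC, each vehicle would broadcast one such 24-byte payload every 100 ms.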

3.2 Cloud computing

In order to provide the storage and computing capacity required by a widely deployed road-safety system, a scalable, cloud-based computing platform is needed, and several technologies and cloud providers offer solutions. First, there is conventional cloud computing, where applications run in a data center using highly flexible and scalable storage and processing resources. Here it is also possible to make use of frameworks such as Google MapReduce, Apache Hadoop and Apache Spark to process large amounts of data in parallel. However, these frameworks are intended more for running complicated queries than for communicating real-time data, so their analysis is beyond the scope of this paper. Secondly, there is the upcoming paradigm of cloud-edge computing [15], where virtual machines are offloaded close to the client in order to reduce the latency between the UE and the data center. Within this paradigm also falls the technology of placing a RACS server [3] on the ENodeB itself, enabling cloud computing over LTE with greatly reduced latency; this will be discussed in more detail in section 4.1. As established in the previous section, the maximum latency for critical messages in a road-safety application is at most 100 ms, and this is therefore a good metric for analysing the feasibility of these cloud solutions and cloud providers.

3.2.1 Conventional Cloud Computing

When hosting an application in the cloud, we can choose between a number of cloud providers, each with different qualities, such as better computational or storage performance. In order to distinguish between these providers, A. Li et al. have developed Cloudcmp [9], a systematic comparator of the performance and cost of cloud providers. They compare the data centers of the four biggest cloud providers, namely Amazon AWS, Microsoft Azure, Google AppEngine and Rackspace Cloud Servers. One of the tests they run is a ping and iperf test to measure the optimal RTT from 260 global vantage points in the PlanetLab network.

Figure 2: Results from Cloudcmp [9], where the RTT latency is measured from 260 different global vantage points

Their results, illustrated in figure 2, show that for at least one of these providers, the RTT always remains below 100 ms, regardless of the vantage point position. The more interesting results, though, are for the worst-performing cloud provider. As this provider only has data centers in North America, there is a clear difference between the RTT of vantage points in North America and those on other continents. The turning point occurs around an RTT of 50 ms, so we can assume that for any data center on the same continent, the RTT should not exceed 50 ms. This is promising in the road-safety context, as the latency can be kept under the threshold of 100 ms.

However, part of this depends on the way the broadcasting of safety-critical information is implemented. If the broadcasting can be done without (persistently) storing anything, working out of cached data, then this performance is sufficient for the road-safety application, as the processing time in the data center will be negligible. However, if the broadcasting of locations or any other safety-critical function of the application requires access to a persistent data store, then some cloud providers might not provide the required level of latency performance. In Cloudcmp, tests are also run on the average response time of the database services offered by these providers, and they show that for a get operation on a table with around 100K entries, the response time is over 100 ms for the 95th percentile of all get operations. Thus, for any solution that requires persistent storage of a large number of records that need to be accessed for real-time safety messages, none of the cloud services would provide the required response time. One of the benefits of cloud computing is the scalability of the application. Both applications discussed in this paper rely on a cloud application for computation and therefore do not have to worry about computational capacity requirements, as the capacity of any cloud application is easily scaled.

3.2.2 Cloud-Edge Computing

Lately there has been a paradigm shift towards cloud-edge computing in order to enable low-latency access to the cloud, as WAN latencies are not likely to improve in the near future. Cloud-edge computing is the process of exporting a VM-based image of the cloud to a local cloudlet. A cloudlet is a resource-rich machine, or multiple machines, close to the client, which only caches a copy of the cloud-based application's code. It is basically a "datacenter in a box", which allows for low-latency, one-hop access between the user and the cloudlet over (wireless) LAN access [15]. The virtual machine needs to be downloaded and installed on the cloudlet, which takes between 60 and 90 seconds on average, after which the client is able to connect to it. This technology could be used to reduce the latency between the RSUs and the data center by constructing nearby cloudlets in any area that implements the system, where every cloudlet runs a VM-based representation of the overall system. This way, the latency between the cloud and the RSUs can be reduced to almost nothing, as the cloudlet is located in the direct vicinity of the RSUs. These cloudlets do, however, add another component to the system, which increases its overall deployment cost.

4 LTE as an enabler of road-safety applications

Apart from the previously discussed implementation that utilizes road-side units, LTE also has much potential as an enabler of road-safety systems. Instead of relying on V2V and V2I communication, all of the communication can be done through LTE, cell towers and a centralized processing unit, the data center. Deploying a road-safety system through LTE is therefore relatively easy, since the infrastructure already largely exists and is deployed worldwide, so there is no need to deploy any (costly) road-side infrastructure. Additionally, LTE could be used not only autonomously but also in combination with V2I solutions, for instance to provide V2I communication in areas where RSUs are not densely deployed. However, LTE comes with two major challenges. The first of these is latency. Since LTE does not support ad-hoc communication between UEs, all communication in the system has to go through the cloud server, which is connected to the user through the ENodeB and the Evolved Packet Core (EPC). The latency of a message is thus the sum of the round-trip time (RTT) between the UE and the ENodeB and the RTT between the ENodeB and the server. The second limitation is capacity. Cellular networks are typically not optimized for a large number of users sending frequent, small, latency-sensitive messages.

In [8], S. Kato et al. explore these two main limitations in the context of road-safety applications. In order to classify the system in terms of Quality of Service (QoS), which in essence means latency, they introduce the concept of data freshness, an upper bound on the message delay that is allowed for the system to function as desired. As mentioned before, they set this data freshness to 100 ms. The authors point out that in a road-safety scenario, the network utilization grows exponentially as the number of UEs increases. This is due to the fact that for an unconnected device in an LTE network, it takes a minimum of 100 ms for the ENodeB to allocate resources to the device. Therefore, in order to achieve a data freshness of 100 ms, all the devices in the application need to be constantly connected to the network, which greatly increases the network utilization.

The authors evaluate the performance of LTE in terms of the maximal number of UEs that can be supported and the share of the total bandwidth that the road-safety application utilizes. They make a distinction between MBMS-enabled and non-MBMS LTE. In MBMS (Multimedia Broadcast/Multicast Service) mode, it is possible to broadcast messages to several UEs. This means that location information on nearby vehicles, which normally has to be transmitted to every UE individually, can be broadcast as a single message. This reduces the number of messages transmitted across the network by one or two orders of magnitude. Unfortunately, this feature is not enabled in many base stations and would therefore not be a feasible option in many areas without adjusting the settings at the base station. Finally, the authors look at the effect that the location of the data center has on these parameters. Since placing the data center close to the ENodeB greatly reduces latency, the network utilization drops significantly. In the optimal case, where the data center is located at the ENodeB, the theoretical maximum number of UEs that can be supported in non-MBMS mode is 73 at a data freshness of 100 ms, versus 162 at a data freshness of 200 ms. The utilization of the network is very undesirable in both cases, with 77.5% and 86.4% downlink utilization, respectively. In MBMS mode, the results are much more promising: the maximal number of UEs is 3076 at a data freshness of 100 ms and 6783 at a data freshness of 200 ms, with utilizations of 14.4% and 4.6%, respectively.

Now that the theoretical limit on the number of supported UEs is known, it makes sense to look at the average number of vehicles that would have to be supported per ENodeB, as this directly influences the feasibility of the deployment, since the system needs to be able to support all the vehicles in the area. In [13], T. Mangel et al. research the cell density for the city of Munich with respect to the amount of traffic in the city; a cell consists of the ENodeB together with the corresponding antennas. They conclude that "the cell size in Munich is roughly 0.41 km² [...] the average expected amount of vehicles per cell is ≈ 50. However, maximum values turn out five to six times higher. Rush hour values will be even higher [...]". Thus, it makes sense to assume a vehicle density of around 300 for busy areas in a city, and a much higher value during rush hours; according to the authors this can be around 600 vehicles. The maximum supported number of vehicles in non-MBMS mode, 73, is thus exceeded greatly, and the system could only be supported in MBMS mode, where the theoretical maximum is 3076. However, this requires all of the base stations to enable MBMS mode. A final thing that has to be taken into account is that the network utilization of this solution will be significant and will have to be paid for, which will result in a high cost to the users.

The limitations of LTE in terms of communications capacity and network cost do not exist for the implementation with RSUs, where the communications capacity is sufficient for any traffic situation and the network usage does not have to be paid for by the end users.

4.1 Nokia RACS for LTE

Related to the previously mentioned advancements in cloud-edge computing, there have also been developments in the area of cell caching [16], a technology that enables the base stations of each mobile cell to cache popular content. This is motivated by the change in network usage, where many users download articles and videos over the wireless link; this vast increase in traffic raises the load on the infrastructure behind the base stations. One of the solutions to this problem that is currently being developed is the Radio Access Cloud Server (RACS), developed by Nokia and Intel. The RACS server allows for the development of so-called "Liquid Applications", VM-based applications that run at the base station and cache popular content. While this development has been mainly motivated
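The capacity and latency figures above lend themselves to a simple back-of-the-envelope check. The sketch below combines the per-cell UE limits reported by S. Kato et al. [8] (data center at the ENodeB, 100 ms data freshness) with the vehicle densities from T. Mangel et al. [13]; only the numeric limits come from the text, while the helper functions themselves are our own illustration.

```python
# Per-cell UE limits at 100 ms data freshness with the data center
# located at the ENodeB, as reported by Kato et al. [8].
KATO_LIMITS = {
    "non-MBMS": 73,
    "MBMS": 3076,
}

def feasible_modes(vehicles_per_cell):
    """Return the LTE modes whose per-cell limit covers this density."""
    return [mode for mode, limit in KATO_LIMITS.items()
            if vehicles_per_cell <= limit]

def message_latency_ms(rtt_ue_enodeb_ms, rtt_enodeb_server_ms, budget_ms=100):
    """Message latency as the sum of the two RTT legs, checked
    against the 100 ms data-freshness budget."""
    total = rtt_ue_enodeb_ms + rtt_enodeb_server_ms
    return total, total <= budget_ms

# A busy urban cell (~300 vehicles, up to ~600 at rush hour) exceeds
# the non-MBMS limit of 73 but stays under the MBMS limit of 3076:
print(feasible_modes(300))   # -> ['MBMS']
print(feasible_modes(50))    # -> ['non-MBMS', 'MBMS']
```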

by the massive growth of video traffic [3], this technology could also be used for the development of a road-safety application where, instead of popular video content, the locations of all the vehicles in the area and the current road conditions are cached. By caching this data at the base station, the only latency bottleneck that the application will experience is the radio interface itself. By using the RACS server, or any other form of cell caching that provides the same functionality, we could achieve the theoretical optimal limit described by S. Kato et al. [8] in terms of capacity, as the data center, or at least the data-center functionality, would be located right at the ENodeB.

5 Conclusions

5.1 Best fit for the current economic climate

This paper has looked at two different implementations of road-safety systems, considering the currently available technologies. Both implementations have their upsides and their downsides. The first implementation, employing V2V communication, V2I communication and RSUs, will meet the requirements in terms of latency and capacity when deployed in the cloud through a data center that is located on the same continent. If a very strict maximal latency is required, the system could also make use of cloud-edge computing, with nearby cloudlets that reduce the latency to a minimum. This implementation is costly with regard to the deployment of the RSUs and possibly the cloudlets, but the deployment of the ITS corridor shows that the system is feasible. The alternative, a deployment over the LTE network combined with cell caching, is feasible in terms of latency, but suffers from capacity problems, as the number of supported UEs is insufficient in non-MBMS mode. In MBMS mode the capacity should be sufficient, although MBMS is not enabled in many base stations. This solution will also be costly in terms of data usage across the LTE network. Thus, both options are theoretically feasible, but far from optimal in the current economic climate.

5.2 Future possibilities

Looking to the future of cellular networks, one of the most limiting factors right now is the capacity of the LTE network and the time it takes to allocate resources to an idle UE. If the capacity of the cellular network increases and resource allocation becomes faster, for instance with 5G, a solution through the cellular network becomes much more feasible, as many more UEs would be able to connect to the network. The load on the network would also decrease, and in turn the cost of using the network. Therefore, it can be assumed that as the capacity of cellular networks increases, the possibilities for implementing a road-safety system at low cost become much larger. Looking at the future of RSU deployment is not sensible, as this is already a realistic possibility.

Furthermore, this paper makes the assumption that intelligent vehicles in the future would require an extensively deployed road-safety system to function, whereas the development of the self-driving car gives thought to the possibility of autonomously functioning vehicles that would use cameras and sensors to drive, and that would not depend on a road-safety system to function.

6 Future work

In this paper the assumption has been made that a road-safety system needs a centralized coordinator and that such a system cannot function on V2V communications alone. Future work could be to develop a road-safety system that functions solely on DSRC, using the WAVE standard. One could also research the long-term cost difference between a deployment with RSUs and a deployment over the LTE network.

References

[1] Cooperative ITS Corridor Deployment. http://www.bmvi.de/shareddocs/en/anlagen/verkehrundmobilitaet/strasse/cooperative-its-corridor.pdf?__blob=publicationfile. Accessed: 2015-02-02.

[2] Google unveils 'first real build' of its self-driving car prototype. http://mashable.com/2014/12/23/google-first-real-build/. Accessed: 2015-02-02.

[3] Increasing Mobile Operators' Value Proposition With Edge Computing. https://networkbuilders.intel.com/docs/nokia_solutions_and_networks.pdf. Accessed: 2015-03-19.

[4] Self-driving car on the road in December (in Dutch). http://nos.nl/artikel/2015674-zelfrijdende-auto-in-december-de-weg-op.html. Accessed: 2015-02-02.

[5] F. Bonnefoi, F. Bellotti, T. Scendzielorz, and F. Visintainer. Safespot applications for infrastructure-based cooperative road safety. In 14th World Congress and Exhibition on Intelligent Transport Systems and Services, pages 1–8, 2007.

[6] H. T. Cheng, H. Shan, and W. Zhuang. Infotainment and road safety service support in vehicular networking: From a communication perspective. Mechanical Systems and Signal Processing, 25(6):2020–2038, 2011.

[7] P. Jaworski, T. Edwards, J. Moore, and K. Burnham. Cloud computing concept for intelligent transportation systems. In Intelligent Transportation Systems (ITSC), 2011 14th International IEEE Conference on, pages 391–936. IEEE, 2011.

[8] S. Kato, M. Hiltunen, K. Joshi, and R. Schlichting. Enabling vehicular safety applications over LTE networks. In Connected Vehicles and Expo (ICCVE), 2013 International Conference on, pages 747–752. IEEE, 2013.


[9] A. Li, X. Yang, S. Kandula, and M. Zhang. Cloudcmp: comparing public cloud providers. In Proceedings of the 10th ACM SIGCOMM conference on Internet measurement, pages 1–14. ACM, 2010.

[10] X. Lin, R. Lu, C. Zhang, H. Zhu, P.-H. Ho, and X. Shen. Security in vehicular ad hoc networks. Communications Magazine, IEEE, 46(4):88–95, 2008.

[11] T. K. Mak, K. P. Laberteaux, and R. Sengupta. A multi-channel VANET providing concurrent safety and commercial services. In Proceedings of the 2nd ACM international workshop on Vehicular ad hoc networks, pages 1–9. ACM, 2005.

[12] L. Malta, C. Miyajima, and K. Takeda. A study of driver behavior under potential threats in vehicle traffic. Intelligent Transportation Systems, IEEE Transactions on, 10(2):201–210, 2009.

[13] T. Mangel and H. Hartenstein. An analysis of data traffic in cellular networks caused by inter-vehicle communication at intersections. In Intelligent Vehicles Symposium (IV), 2011 IEEE, pages 473–478. IEEE, 2011.

[14] P. Ross. Thus spoke the autobahn. Spectrum, IEEE, 52(1):52–55, 2015.

[15] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies. The case for VM-based cloudlets in mobile computing. Pervasive Computing, IEEE, 8(4):14–23, 2009.

[16] X. Wang, X. Li, V. C. Leung, and P. Nasiopoulos. A framework of cooperative cell caching for the future mobile networks.

[17] Q. Xu, T. Mak, J. Ko, and R. Sengupta. Vehicle-to-vehicle safety messaging in DSRC. In Proceedings of the 1st ACM international workshop on Vehicular ad hoc networks, pages 19–28. ACM, 2004.

[18] J. Zhang, F.-Y. Wang, K. Wang, W.-H. Lin, X. Xu, and C. Chen. Data-driven intelligent transportation systems: A survey. Intelligent Transportation Systems, IEEE Transactions on, 12(4):1624–1639, 2011.

More ICT to Make Households More Green

Hylke Visser
Student number: 467588
[email protected]

Abstract

The world has become increasingly concerned with the fast-growing energy demand in households. A large portion of household energy consumption is wasted on devices that are unnecessarily switched on. Green energy sources are usually dependent on the weather, and require more flexible energy consumption. ICT can help solve both of these issues.

This paper gives an overview of ICT-based methods that address these issues. It describes the concept of home automation and shows how this can help save energy while also making energy consumption more flexible. The paper also describes how ICT can help change energy consumption behavior in households. Finally, the author proposes a direction for future research.

KEYWORDS: ICT, household energy consumption, energy saving, home automation, behavioral change

1 Introduction

Global energy demand is growing extremely fast. Increased urbanization, industrialization and the growing economies in developing countries are expected to increase energy demand even more. In recent years, the world has become increasingly concerned about the consequences of our energy consumption. Ensuring that this fast-growing energy demand can be satisfied by the current energy production is a challenging task.

Climate change has raised the demand for more sustainable energy sources such as wind power, hydro power and solar power. Unfortunately, the energy production capacity of these sources is highly dependent on the weather and other environmental influences. This requires more flexible energy consumption to avoid the need for back-up systems. These back-up generators supply extra energy during peak demand and are often fossil-fuel based. ICT will play a big role in helping energy consumers be more flexible: it will help mitigate these peaks and reduce the need for the back-up systems.

A surprisingly large percentage of energy consumption is attributable to households. In the United States, this percentage is 22% [1] and in Europe 26% [2]. In China, the industry sector is currently the largest consumer of energy, and households account for only 10% [14]. This is, however, expected to change very fast in the coming years as household demand rises. The U.S. Department of Energy [3] predicts that in the near future, households will be responsible for almost 40% of the annual energy consumption. Households are therefore a good target for reducing energy consumption and increasing the flexibility of this consumption.

A large part of energy is lost between the energy source and home appliances. This means that reducing unnecessary consumption of energy can have an even bigger impact than one would think. Clever ICT solutions can give consumers insight into their energy consumption and motivate them to waste less energy.

This paper is structured as follows. Section 2 describes the layers and components of a typical home automation environment in a bottom-up approach. It describes the entire process from collecting measurements to making decisions, and gives for each step in this process an overview of the different solutions that are currently used or have been proposed. Section 3 gives an overview of different strategies for changing human energy consumption behavior. It also describes the ways in which ICT can help give household owners insight into their energy consumption and change their energy consumption habits. Section 4 lists a number of practical solutions that have been deployed commercially and discusses the relation between these solutions and the previously discussed scientific research. Section 5 argues about the difference between home automation solutions and behavioral change applications, and shows why these solutions could work or why they fail. Section 6 gives the conclusions of this paper and proposes a future direction for research.

2 Home Automation

This section gives an overview of the methods and technologies that are used to facilitate energy saving in a household using home automation.

A typical home automation system is very similar to the architecture presented in 2003 by Cook et al. [7]. This architecture consists of several layers, each having its own functions and responsibilities. The physical layer contains hardware that measures energy consumption. The communication layer facilitates data exchange between nodes in the sensor network and devices that collect and process this data. The information layer gathers, stores and aggregates sensor data into knowledge bases that can be used by smart decision-making systems. The decision layer determines the actions that have to be executed to transition the environment into the desired state.

177 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

2.1 The Physical Layer

Measuring energy consumption requires the installation of sophisticated sensors. Mattern et al. [10] describe two distinct approaches for the installation of sensors, each with its own advantages and disadvantages.

A single-sensor approach is a low-cost solution that usually requires low initial effort. The simplest solution requires nothing more than installing a single sensor on the existing energy meter. The downside of this approach is that it does not provide detailed information, as its accuracy is limited to the energy consumption of the total household. It is therefore not easily possible to give consumers concrete recommendations on which devices they can switch off to save energy. By using sophisticated pattern-recognition systems, as described in Section 2.3, it is possible to make assumptions regarding the cause of fluctuations in the total energy consumption of a household. This could be used to provide more detail about the energy consumption of specific devices.

A multiple-sensor approach can be used to monitor the energy consumption of individual appliances. This makes it easier for the consumer to see which devices consume energy and, in turn, to make decisions based on this information. Unfortunately, this approach is expensive, as it requires many energy sensors. The approach could also decrease the accuracy at the household level, as some appliances might not be attached to an energy meter. One solution would be to install sensors at a higher level, for example in every room of a house. Together with contextual sensors that measure temperature, light and the presence of people, this approach would provide valuable data for home automation.

2.2 The Communication Layer

Using an approach with multiple sensors requires combining and aggregating the data gathered by those sensors. In many studies, a smart gateway is responsible for this task. The first step is to communicate energy measurements from the sensors to these gateways.

Kovatsch et al. [9] give a good overview of the different standards for communication in home automation environments. The X10 system¹ has existed since the 1970s and uses AC power lines for communication. KNX² is a newer system and uses twisted-pair cables for communication. digitalSTROM³ also uses the existing 230V mains.

Recent studies seem to be more confident about wireless standards such as ZigBee and IPv6. ZigBee is an IEEE 802.15.4-based standard that provides inexpensive, low-power wireless communication. ZigBee is one of the most widely used systems for wireless sensor networks, and also very suitable for home automation. Weiss et al. [12] used smart power outlets, called Ploggs⁴, to build a smart home environment. These Ploggs can communicate over Bluetooth and ZigBee, which allow for low-power, short-range communication of energy consumption data.

Kovatsch et al. show that IPv6 is a suitable alternative to the previously discussed standards. The Internet Protocol Suite has proven to be a mature networking concept, and the rapid growth of the Internet is proof that IPv6 is extremely scalable, which is exactly what is needed for home automation and the Internet of Things. The 6LoWPAN⁵ standard allows resource-constrained devices to run the IPv6 stack. Installation of IPv6-capable devices is easy and can benefit from the autoconfiguration mechanism of IPv6. IPv6 also enables smart appliances to integrate with the Web, which allows for natural user interaction. Furthermore, the Internet already provides many well-established security mechanisms that can be used to secure these home automation systems, although Kovatsch et al. also mention that more research has to be conducted in the area of security.

¹ x10.com
² knx.org
³ digitalstrom.com
⁴ Website offline. The old website can be found by searching for plogginternational.com/ploggproducts.html on web.archive.org
⁵ ietf.org/rfc/rfc4944

2.3 The Information Layer

The data that is collected by the network of sensors can be collected and aggregated by gateways. These devices will play a central role in home automation. Gateways will analyze the measurements made by the sensor network that has been deployed in the house. In addition to the previously discussed sensors for measuring energy consumption, this network also contains other sensors. Temperature sensors are used to make decisions on whether to activate or deactivate the heating or cooling system of a room. Light sensors can help determine how much brightness the lights in a room should add to the natural daylight. Motion sensors can help avoid unnecessary energy consumption by turning off devices in empty rooms. Many other metrics can and will be added in the future, and the gateway will have to interface with all of these.

By analyzing the measurements of the sensor network over a longer time span, the gateway can build an accurate model of the household. Building knowledge about individual appliances can be done with sophisticated pattern recognition or deep learning algorithms. The eMeter system by Mattern et al. [10] makes this process as simple as pressing a "record" button and switching the appliance on and off. The system will detect the change in current and store it in its database.

Probably the most important knowledge for home automation is the presence of people. If the home automation system knows about the habits and schedules of the inhabitants of a house, it can make energy-saving decisions. Barbato et al. [5] proposed a system that creates different profiles based on sensor data. For the user presence profile, the system aggregates 24 hours of data for each room in the house to build a daily profile over a given monitoring period (a week). Similar daily profiles are clustered together.

At any time the user can manually regulate the light and temperature in a room. This information is then recorded by the system to create a profile of the user's preferences.

2.4 The Decision Layer

The decision layer is responsible for activating or deactivating devices in the smart home. By combining the user's preference profiles with the profiles derived from sensor data, the decision layer can predict and execute the actions needed to satisfy the user's preferences. This could

178 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

also allow the system to save energy by opening curtains in the morning, which would result in the user not having to switch on the lights, or by activating the heating system before the user comes home, so that the user does not have to set the heating system to maximum on arrival (which would also require relatively more energy).

These last examples demonstrate well the possibilities for balancing the load on the power grid. This load balancing will be essential when electricity is generated by green sources, as the supply of most of these will be highly dependent on the weather. Smart homes can adapt to this dynamic energy supply and decide when to switch on washing machines, lower the temperature of freezers or when to charge electric cars. This system also benefits consumers who have dynamic energy pricing, which will be used in the future smart grid.

3 Behavioral Change

Home automation is a step in the right direction, but even more energy can be saved by addressing the human factors in energy consumption. People can have many reasons for saving energy. Saving the planet, a feeling of moral obligation or the desire to save money may lead to greater attention to someone's energy consumption behavior. Unfortunately, many households hardly pay attention to their energy consumption. However, there are various methods aimed at changing the energy consumption behavior of households.

Abrahamse et al. [4] give a good overview of different strategies for informing households and changing their energy consumption behavior. They divided the different methods into two categories: antecedent interventions and consequence interventions.

The simplest form of an antecedent intervention is information. Mass media campaigns aimed at changing consumers' behavior seem to have almost no influence. Workshops did change the level of concern of the attendees, but did not actually change their behavior. The best information-based intervention method is giving consumers tailored information through home audits. More successful antecedent interventions included some form of goal setting: household owners or external parties can set a goal for saving a certain percentage of energy. This goal can even be converted into an oral or written pledge or promise to save energy, which is then called a commitment.

Consequence interventions are based on the idea that positive consequences will encourage desirable energy consumption behavior. Direct and continuous feedback about household energy consumption can influence behavior. Households can immediately relate their actions with the outcomes. This continuous feedback can be given using a display that shows the current energy consumption or the costs of the current energy consumption. Comparative feedback can be even more effective. Seeing your performance relative to the performance of similar households can lead to a feeling of competition or social pressure. Energy saving then becomes a sort of game, motivating people to save even more energy. An extra motivator for people to save energy is rewards. Monetary rewards can either be based on the amount of energy saved or can be a fixed amount when a certain target has been reached.

3.1 Energy Consumption Feedback

The most important factor in changing the behavior of energy consumers is the feedback on their energy consumption. Many different methods exist for providing this feedback. Karjalainen [8] has analyzed different ways of presenting feedback on energy consumption.

The simplest form of feedback mentioned is simply presenting the energy consumption of the household. This information can be presented as the number of kWh consumed in a period of time. More relevant information can be the cost per hour or the environmental impact. Disaggregating energy consumption by times of the day, by room or by appliance significantly improves the relevance of the information and helps users identify appliances and habits that consume a lot of energy.

Comparisons can give even better insights. When someone uses goal-setting to reduce their energy consumption, it is important for them to compare their current energy consumption to the goal. By storing the energy consumption data on a hard disk, it is possible to build historical comparisons. However, Karjalainen stresses the need for normalization of these comparisons, as the weather can have a large influence on energy consumption.

A normative approach compares the energy consumption of a household to that of other households. This can be performed on a national or regional level, or even within the same neighborhood. The relevance of the comparison highly depends on the similarity between the households that are compared. This method can motivate people to save more energy; however, Karjalainen notes that people might save less if they see that they "have saved enough" or will justify their higher consumption with excuses instead of trying to change it.

In a study by Bonino et al. [6], many of the previously mentioned methods are combined. Their experiment showed a display with the current power consumption, the total power consumption for that day and the goal the user had set previously. Furthermore, they showed a map of the house, with each room colored green, orange or red, depending on the energy consumption in that room. A large number of people participated in a survey that confirmed that such feedback is indeed understandable and motivates behavioral change.

Energy consumption feedback is also used to show how much energy is consumed by different appliances in the household. This information can then be used to determine whether replacing appliances can save energy (and money). The PowerPedia system, created by Weiss et al. [13], helps users better understand the power consumption of their appliances. After measuring the consumption of a device, the results are added to a global database. The user can compare his results to similar devices that have been measured by other people using the PowerPedia system.
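The weather normalization that Karjalainen calls for can be made concrete with heating degree days, a standard normalization technique (this sketch is not taken from the cited paper; the 17 °C base temperature and all kWh figures below are illustrative assumptions):

```python
def heating_degree_days(daily_mean_temps, base=17.0):
    """Sum of how far each day's mean outdoor temperature falls below the base."""
    return sum(max(0.0, base - t) for t in daily_mean_temps)

def normalized_consumption(kwh_total, daily_mean_temps, base=17.0):
    """kWh per heating degree day, so periods with different weather compare fairly."""
    hdd = heating_degree_days(daily_mean_temps, base)
    return kwh_total / hdd if hdd > 0 else float("nan")

# A cold week and a mild week: the raw totals differ a lot (210 vs 84 kWh),
# but per heating degree day the household behaved the same.
cold_week = normalized_consumption(210.0, [-5, -3, 0, -2, -4, -1, -6])
mild_week = normalized_consumption(84.0, [8, 10, 9, 7, 11, 10, 8])
```

Dividing by degree days makes a cold week comparable to a mild one; without the normalization, the household above would appear to have more than doubled its consumption purely because of the weather.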

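The normative comparison discussed above reduces to ranking a household among comparable ones. A minimal sketch of how such feedback could be computed; the function names and the monthly kWh figures are invented for illustration:

```python
def share_using_more(own_kwh, similar_kwh):
    """Fraction of comparable households that consumed more than this one."""
    higher = sum(1 for k in similar_kwh if k > own_kwh)
    return higher / len(similar_kwh)

def normative_feedback(own_kwh, similar_kwh):
    """Phrase the comparison the way an in-home display might."""
    share = share_using_more(own_kwh, similar_kwh)
    if share >= 0.5:
        return f"You used less energy than {share:.0%} of similar households."
    return f"You used more energy than {1 - share:.0%} of similar households."

# Illustrative monthly totals (kWh) from six comparable households.
neighbours = [300, 280, 350, 310, 290, 400]
message = normative_feedback(295, neighbours)
```

The relevance of the message depends entirely on how the set of "similar households" is chosen, which is exactly the similarity caveat noted above.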

Figure 1: GreenPocket's social metering application. Source: greenpocket.de/en/products/residential-customers/social-metering

4 Practical Solutions

GreenPocket6 is a German company that enables utilities to leverage the power of smart meters and smart homes. Their Energy Expert Engine combines energy consumption data with external data such as the weather and then builds personalized data based on customer data and information about individual households. A visualization of this data is accessible to the household owners on a web interface and in mobile applications.

GreenPocket also has a mobile application (see Figure 1) that adds a social component to energy consumption. With this application energy consumers can participate in a competition that motivates them to reduce their energy consumption.

The GreenPocket Smart Home solution enables users to control their smart home by using pre-defined schedules and rules, but also allows them to use their smartphones to manually control their home. Together with its other solutions, GreenPocket provides a powerful system that encourages energy consumers to change their behavior and even helps them to do that.

Another company, Opower7, conducted a multi-year study into the relationship between energy consumers and utilities. From this research they defined the five universal truths8 about utilities and energy consumers:

• "Utilities are not meeting customer expectations"
• "Everyone wants lower bills"
• "People look to utilities for energy information"
• "Customers value personalized energy insights"
• "Everyone wants to know how they measure up"

Based on these "five universal truths", Opower provides a service that helps utilities better satisfy their customers. They give consumers personalized and detailed energy insights, together with tips about how they can lower their energy bill. Opower also lets consumers compare their energy consumption to that of similar households.

On the website social.opower.com Opower provided an application where users could compare their energy consumption to the consumption of their Facebook friends. After evaluating three years of usage data, Opower decided that this application was not the most successful way to help customers save energy and they decided to take down the application.

6 greenpocket.de/en/
7 opower.com
8 opower.com/fivetruths/

Figure 2: ELIQ's Energy Display. Source: eliq.se/en/products/eliq-energy-display

ELIQ is a company that does not work with utilities, but provides a solution that allows household owners to install a smart metering system by themselves. The ELIQ system consists of an energy sensor that can be attached to the energy meter in the household (which is currently only used for billing). This sensor transmits the data to an in-house display or to an ELIQ Online device that enables the user to view all his data online.

Figure 2 shows a visualization of a household's current energy consumption together with historical energy consumption and predictions for the future. The system can also calculate the amount of energy that is consumed for heating the household to give better insights into environmental influences.

Selvefors et al. [11] evaluated ELIQ's system in 23 households located in and near the city of Gothenburg in Sweden. In a period of two years energy consumption data was gathered both from households that participated in the experiment and from a large sample of comparable households. The study shows that households that actively participated in the study and used ELIQ's system regularly achieved a significant reduction in energy consumption, while households that did not regularly use ELIQ's system did not see any reduction in energy consumption. There are many possible reasons and explanations for the fact that many households did not use the application, even when it costs little effort and is easily accessible from the users' phones. Lack of time or interest are the main reasons for the apparent lack of success with similar applications, while the applications do have a large impact when they are actually used.

ELIQ also contains a social component that adds gamification to energy saving. By creating a network of "energy friends" it is possible to compare your energy consumption

with others. Games and challenges lead to a competitive way of energy saving.

5 Discussion

It is clear that there is a need for the systems and techniques discussed in this paper. Reducing unnecessary energy consumption and becoming more flexible with energy consumption will be extremely important in the near future. There is a large number of studies that propose solutions for every step of the process. Energy sensors and smart sockets can detect unnecessary consumption. Many protocols and architectures have been proposed for communicating and aggregating sensor data. Smart algorithms can make decisions when this data is combined with knowledge about the users, and facilitate sophisticated home automation services. By visualizing energy consumption in a household and comparing that to a pre-defined goal or to consumption in other households, it is possible to change user behavior.

Home automation systems are popular and many people are interested in those systems. In the future, load balancing and integration with the smart grid will be two very important features of a home automation system. Currently, however, the price and installation effort are high, which slows down deployment of these systems.

Applications that are aimed at changing energy consumption behavior have a lower barrier, which is why these are more widely deployed. However, it looks like these systems do not meet their expectations. Most consumers are not genuinely interested in changing anything about their energy consumption. And people who actually start using energy saving applications quickly lose interest after a few months. Behavioral change is extremely difficult, especially when it concerns something that the majority of the target group is not really interested in. Forcing people to use an application that makes them do things they normally would not do will give most of them an uncomfortable feeling.

Maybe the best way to save energy is not changing the user by showing him what he does wrong and what he should do instead. Maybe we should first focus on making the home itself smart, so that it can make the user feel more powerful, while saving energy without the user noticing. Home automation and energy behavioral change seem to be two different things, but in fact they complement each other. Home automation can be the first step towards changing the users' behavior. A smart home can detect possibly unnecessary energy consumption and ask the user what it should do. This gives the user control over his home, but simultaneously saves energy.

6 Conclusion

In this paper we presented a general overview of two methods for helping households save energy. Home automation introduces an infrastructure into the household that measures, collects and aggregates energy consumption data. Together with information from the environment, user presence profiles and user preferences, this system can save energy by anticipating user actions and avoiding unnecessary energy consumption.

By showing detailed energy consumption information to a user, it is possible to give him more insight into his energy consumption. This can lead to a change in his behavior. By setting goals or creating challenges in the user's social group, it is possible to stimulate competitive energy saving behavior.

These methods work well, but adoption by consumers is still low. This is due to the high cost and effort of home automation systems and relatively low interest in applications that stimulate behavioral change. Future research could therefore aim at integrating devices and appliances with smart home systems. This will lower the barrier to creating smart home environments and create a powerful medium to reach users. With this medium it is possible to apply behavioral change methods in a way that feels less awkward than with current applications.

Nevertheless, saving energy in households and becoming more flexible in energy consumption is extremely important. Research into this topic will grow as energy consumption increases and with the expectations of the smart grid. And ICT will definitely play an important role in making households more flexible and more green.

References

[1] U.S. Energy Information Administration. Annual Energy Review 2011.

[2] European Environment Agency. Final energy consumption by sector.

[3] U.S. Department of Energy. 2008 buildings energy data book.

[4] W. Abrahamse, L. Steg, C. Vlek, and T. Rothengatter. A review of intervention studies aimed at household energy conservation. Journal of Environmental Psychology, 25(3):273–291, 2005.

[5] A. Barbato, L. Borsani, A. Capone, and S. Melzi. Home energy saving through a user profiling system based on wireless sensors. In Proceedings of the First ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Buildings, pages 49–54. ACM, 2009.

[6] D. Bonino, F. Corno, and L. De Russis. Home energy consumption feedback: A user survey. Energy and Buildings, 47:383–393, 2012.

[7] D. J. Cook, M. Huber, K. Gopalratnam, and M. Youngblood. Learning to control a smart home environment. In Innovative Applications of Artificial Intelligence, 2003.

[8] S. Karjalainen. Consumer preferences for feedback on household electricity consumption. Energy and Buildings, 43(2):458–467, 2011.

[9] M. Kovatsch, M. Weiss, and D. Guinard. Embedding internet technology for home automation. In Emerging Technologies and Factory Automation (ETFA), 2010 IEEE Conference on, pages 1–8. IEEE, 2010.

[10] F. Mattern, T. Staake, and M. Weiss. ICT for green: How computers can help us to conserve energy. In Proceedings of the 1st International Conference on Energy-Efficient Computing and Networking, pages 1–10. ACM, 2010.

[11] A. Selvefors, I. Karlsson, and U. Rahe. Use and adoption of interactive energy feedback systems. Proceedings of IASDR, pages 1771–1782, 2013.

[12] M. Weiss and D. Guinard. Increasing energy awareness through web-enabled power outlets. In Proceedings of the 9th International Conference on Mobile and Ubiquitous Multimedia, page 20. ACM, 2010.

[13] M. Weiss, T. Staake, F. Mattern, and E. Fleisch. PowerPedia: Changing energy usage with the help of a community-based smartphone application. Personal and Ubiquitous Computing, 16(6):655–664, 2012.

[14] X. Zhao, N. Li, and C. Ma. Residential energy consumption in urban China. Technical report, 2011.

Software market of network functions virtualization

Aarno Vuori
Student number: 60104J
[email protected]

Abstract

Mobile operators are going to need new ways to reduce their costs in the telecommunications business. New technologies such as network functions virtualization (NFV) combined with cloud computing technology are going to be one solution for this in the future. The European Telecommunications Standards Institute (ETSI) and a group of major telecom operators have formed the Industry Specification Group (ISG) for NFV. Today, this large community of experts is developing the required standards for NFV. Mobile technology companies such as NSN and Ericsson are both involved in this group. The vision of this group is an open ecosystem for NFV which enables rapid service innovation for network operators and service providers. All services are enabled by software-based deployment using virtualized network functions (VNF). The ISG maintains core NFV documentation, including an architectural framework and associated technical requirements. The NFV documentation provides the technology platform for the NFV ecosystem. This architecture enables open innovation in the design of VNFs and end-to-end networks. It will also reduce power usage for the network operator. This standardization is an ongoing project and it is going to drive the NFV work forward across the industry. Nokia Solutions and Networks (NSN) and Ericsson are both involved in the NFV evolution. Both companies have developed products which use the NFV framework. Ericsson's product is the Ericsson Cloud System and NSN's product is VoLTE based on the NFV architecture. The main purpose of this paper is to highlight the above-mentioned new technologies and give examples of related new implementations.

KEYWORDS: core network, virtual evolved packet core (vEPC), commercial off-the-shelf equipment (COTS), infrastructure as a service (IaaS), hardware as a service (HaaS), platform as a service (PaaS), software as a service (SaaS), network as a service (NaaS), virtual network functions (VNFs), evolved packet core (EPC), Quality of Service (QoS), Quality of Experience (QoE), Long-Term Evolution (LTE) network

1 Introduction

Meeting the increased demand for cheaper and faster network technology is a challenging task. Telecommunication companies struggle with the increasing cost of dedicated network hardware and declining revenues. Therefore, there is a clear need for new technologies to provide increased network performance and value with lower costs. [18] One proposed candidate for this is Network Functions Virtualization (NFV). NFV and cloud computing together enable cost effective mobile network functions. Future technologies such as 5G networks have a major effect on faster connectivity speeds. All this will need more financial investments. NFV is going to move the telecommunications core infrastructure towards a more cost-efficient core. [8] A group of major telecom operators has formed an industry specification group for NFV under the European Telecommunications Standards Institute (ETSI). Recently, more telecom equipment providers and IT specialists have joined the group. [8]

NFV is a model and a technology which transfers network functions from dedicated hardware to software-based applications. These applications are consolidated and executed on standard IT platforms. The platforms consist of high-volume servers, switches and storage. Based on NFV, network functions can be placed in various locations, for instance datacenters, network nodes, and end-user premises, depending on what the network requires. [16] NFV can provide many advantages to the telecommunication industry: openness of platforms, scalability and flexibility, operating performance improvement, shorter development cycles, and also reduced capital expenditure (CAPEX) and operational expenditure (OPEX) investments. [16]

Cloud computing is a quite new business model for operators and it gives them a new way to offer service delivery. They can combine telecommunications and internet applications from the cloud for services offered to consumers and enterprises. In this cloud computing model, telco companies must consider privacy and security issues. These issues require careful business and technology planning to meet all the needed requirements. [10]

2 New way of business thinking in TELCO business

2.1 Background

Mobile operators not only have huge needs but also major challenges to raise profitability. Because of heavy traffic growth and the demand for new network capacity, the telco business needs new ways of managing this. For example, releasing a new operating system for handheld devices can double the data demand on operators.

Communications service providers (CSPs) focus on reducing operating costs and growing revenues through the adoption of new technologies. Cellular operators, incumbent telcos, cable operators and new entrants all need to update


Figure 1: Hype Cycle for Communications Service Provider Infrastructure, 2014

Figure 2: Network function virtualization concept

their technology adoption and retirement plans. The Hype Cycle addresses the technologies needed to develop a network capable of meeting tomorrow's needs. Figure 1 shows this Hype Cycle. [17]

In order to provide a good customer experience it is very important that ISPs can be flexible when adjusting their network and operations to satisfy their customers by responding with rapid time-to-market performance. [11]

ISP companies are going to introduce quality of service (QoS) differentiation in the network. It will allow dynamic and real-time prioritization of subscribers and applications depending on network load. There will be a need to differentiate specified user groups or applications. It allows companies to build customized packages for customers and develop new revenue streams. [14]

Utilizing quality of service (QoS) functionality from the operator's point of view allows operators to enhance their current business by prioritizing customers, increasing their loyalty and increasing network utilization, as well as providing better visibility into the user's quality of experience (QoE). [1]

2.2 The telco cloud

Internet service providers use the cloud computing model to produce telecommunication services for their customers. According to the National Institute of Standards and Technology (NIST), the telco cloud offers on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

The telco cloud enables service providers to achieve network agility through software-defined networking and virtualized network functions. The telco cloud has the potential to help operators meet all their profitability and agility needs. The benefits of the telco cloud are similar to those in the IT environment, namely lower capital and operational costs and increased business agility. [11] Today, operators are in a position where they have to offer services that can exceed the boundaries of the traditional data center without compromising on quality.

New innovations become possible when adopting new capabilities by implementing a network-enabled cloud, which extends the virtual infrastructure beyond the traditional computing and storage resources to include network resources. Virtualization of network functionality allows portability of virtualized network functions (VNF) to different hardware platforms. This will reduce the number of platforms in the operator network. [4]

Service providers also need the real-time control capabilities of software defined networking (SDN), which enables operators to more easily adapt their networks to the real-time requirements of newer services. Figure 2 shows the network function virtualization concept and how NFV differs from and complements SDN. [8]

Cloud-computing services are offered in the following ways: infrastructure as a service (IaaS), which is also referred to as hardware as a service (HaaS); platform as a service (PaaS); software as a service (SaaS); and network as a service (NaaS). There is no agreement on a standard definition of NaaS, but it is often considered to be provided under IaaS.

2.3 Taking NFV and new key technologies into use in the future

Virtualization of mobile networks focuses on the mobile-network base stations and the mobile core network. Service providers are interested in virtualizing mobile base stations because it makes it possible to consolidate as many network functions as possible on standard hardware, as needed to serve different mobile network technologies via a single virtualized mobile base station. LTE and 5G are future technologies which will use mobile network virtualization. There are many challenges in virtualizing the mobile base station. One reason is that it hosts signal processing functions in its physical layer. That is why virtualization is first considered for implementation in the higher network stack layers.
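The core NFV idea above, network functions as relocatable software rather than dedicated boxes, can be illustrated with a toy service chain. This is purely didactic Python, not any vendor's or ETSI's API; every name below is invented:

```python
from dataclasses import dataclass

@dataclass
class Packet:
    src: str
    dst: str
    payload: bytes

def make_firewall(blocked_sources):
    """A 'firewall' VNF: drops packets from blocked sources (None = dropped)."""
    return lambda pkt: None if pkt.src in blocked_sources else pkt

def make_nat(public_ip):
    """A 'NAT' VNF: rewrites the source address, as a gateway would."""
    return lambda pkt: Packet(public_ip, pkt.dst, pkt.payload)

def run_chain(pkt, chain):
    """Pass a packet through an ordered chain of virtual network functions."""
    for vnf in chain:
        if pkt is None:  # dropped earlier in the chain
            return None
        pkt = vnf(pkt)
    return pkt

# The same chain, being plain software, could be instantiated in a datacenter,
# at a network node, or on end-user premises.
chain = [make_firewall({"10.0.0.99"}), make_nat("203.0.113.1")]
```

Because each function is just software on commodity hardware, it can be relocated, replicated or upgraded independently, which is the placement and scaling flexibility described earlier.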
Considering the eNodeB, which is the fourth-generation network (Long Term Evolution, LTE) base station, virtualization will be implemented in layer 3 and then in layer 2. [14]

Virtualizing layers 2 and 3 of the base station makes it possible to offer a centralized computing infrastructure for multiple base stations. This will lead to lower-cost base stations, because only baseband signal processing has to be implemented on-site. Service providers will benefit from sharing their remote base station infrastructure to achieve better area coverage with minimum CAPEX and OPEX investment. [13] There are also efforts to centralize the L1 functionalities of several base stations. They will be able to upgrade VNFs to support multiple telecommunications technologies and adapt them for new releases. [8]

In modern LTE networks we can use the Evolved Packet Core (EPC) technology as a virtualization of 4G networks. The Evolved Packet Core is a flat architecture that provides a converged voice and data networking framework to connect users on a Long-Term Evolution (LTE) network. Virtualizing the functionalities within the core is the main target for NFV. It will become essential to have a flexible and easily manageable network, a network that can be scaled on demand in real time. Virtualizing the EPC offers all these benefits to service providers. [8]

In mobile networks, the radio access network as a service (RANaaS) concept has its background in the Cloud-RAN (C-RAN), NFV and mobile cloud computing concepts. The first needed component of RANaaS is C-RAN, which has centralised processing, collaborative radio, real-time cloud computing and a clean RAN system. C-RAN makes the RAN more flexible and efficient. Another needed component of RANaaS is NFV, which allows network functions to run over a virtualized computing environment. The European Telecommunications Standards Institute (ETSI) NFV Group allows the IT and telecom industries to share their complementary expertise to make NFV real. It has published in detail how to virtualize a base station and also a framework for operating virtualized networks. RANaaS can be seen as an example of NFV, focussing on a virtualised RAN architecture that provides answers to the use of existing cloud computing concepts for the virtualisation of LTE systems. The third needed component of RANaaS is the cloud computing concept, in other words a transition from a product-based approach to a service-based approach. All cloud applications should be supplied as a service. [19]

When a mobile operator has its own LTE core network, it still requires RAN access for a certain geographical area in order to serve its customers. The Mobile Cloud Network Service Provider (MCNSP) manages the mobile operator's subscription and is also a provider of integrated services. The mobile operator receives a service catalogue from the MCNSP and selects a RANaaS jointly provided by RAN Providers (RANPs) A and B. This allows the mobile operator to connect its customers via the RANs of operators A or B to its own core network. [19]

3 Business examples

3.1 Nokia Solutions and Networks (NSN)

NSN is building its telco cloud infrastructure by using its product called Liquid Core. According to NSN's white papers, it has launched a virtualized VoLTE core which enables a better multimedia experience for voice and data in the same network. An example mentioned here is cloud-enabled Voice over LTE (VoLTE). Because subscribers demand more and better services, it is necessary for operators to protect their profitability by reducing delivery costs. In NSN's view, there is a solution which eliminates the need for separate voice and data networks. Voice over LTE (VoLTE) helps operators to build a next generation service architecture, which will help to ensure high quality voice, video and multimedia services for subscribers, while protecting operators' profitability. [1]

Liquid Core is a part of the Nokia Telco Cloud, and it makes the core network flexible enough to meet subscribers' changing needs in order to allow operators to deliver the best customer experience at the lowest cost. Liquid Core is built on Core Virtualization and Telco Cloud Management. Core Virtualization enables core network functions to run on standard IT hardware in a virtualized manner. Telco Cloud Management includes cloud-ready NetAct for managing network domains, as well as the Cloud Application Manager for managing applications in the telco cloud. The Cloud Application Manager automates many phases in a telco application's lifecycle, from development to deployment and later to maintenance. It also provides automated elasticity management of cloud resources to ensure that the necessary network capacity is available to run applications. Nokia Networks has demonstrated how the complete lifecycle of virtualized network elements can be managed remotely, including the deployment of virtualized VoLTE, Packet Core and Home Subscriber Server applications, while monitoring their virtual network element resources as well as their elasticity management. [7]

3.1.1 VoLTE

The Nokia VoLTE solution offers a clear evolutionary path that allows operators to provide VoLTE according to industry standards, without a need for drastic changes in the voice network. There are many business benefits: high-quality voice and video service over LTE, fast time-to-market, full mobile voice service continuity, as well as core elements running on the same platform or in the telco cloud, reducing the total cost of ownership. It also includes design, planning and integration services. [15]

NSN is the first vendor to supply a commercial telco cloud solution which is compliant with the ETSI NFV architecture for end-to-end VoLTE services. The company is also releasing Cloud Network Director, an orchestration tool that will automatically deploy, configure, optimize and repair a set of virtualized network functions to simplify the deployment of services such as VoLTE. Nokia Cloud Network Director will meet the needs of the recently published ETSI NFV Management and Orchestration specification and is the key component of fully automated network lifecycle management. Complying with the ETSI NFV Orchestrator, Cloud Network Director will have open interfaces for easy integration of Virtualized Network Functions (VNF). It is integrated with existing operations and business support systems. This multi-vendor capability ensures that telco clouds can be implemented flexibly to blend in with mobile broadband operators' needs and integrate with existing systems. [9]

NSN has launched a virtualized VoLTE core to enable a better multimedia experience. The virtualized VoLTE core solution consists of IMS, Open TAS, and HSS. [1] Open TAS is a Telecommunication Application Server (TAS) in Voice over LTE (VoLTE) networks built according to the IMS core architecture. Open TAS is part of Nokia Networks virtualization and provides strong Liquid Core synergies with other Nokia Networks core elements. With Open TAS operators can re-use existing back-end systems such as billing. [13]

NSN has announced a telco cloud partner certification program in order to strengthen its wide-ranging telco cloud partner ecosystem. The program enables third-party software to

Figure 3: Ericsson Cloud System

3.2.1 Virtual Packet Core

Ericsson is introducing its virtual Evolved Packet Core as a part of its commitment to NFV. This development of the Evolved Packet Core opens the way for new operator opportunities in many areas like M2M, enterprise, and distributed cloud for rural mobile broadband. These services can be used via the Ericsson Cloud System. The virtual Evolved Packet Core is made to support operators' transition to a network-enabled cloud. Virtualization is done for all Evolved Packet Core components with full compatibility. All features across native and virtualized network nodes can be used with these functions. [5]

The virtual Evolved Packet Core includes the virtual Evolved Packet Gateway, virtual SGSN-MME, virtual Service-Aware Policy Controller and virtual Service Aware Support Node. The Evolved Packet Core can be run on the Ericsson Cloud System using third-party hardware. It also allows virtual network functions to be aggregated into full service solutions for fast deployment. [5]

3.3 Comparison of product strategies

Both companies, Ericsson and NSN, have products related to cloud computing and NFV. Ericsson has launched its virtual Evolved Packet Core, and Nokia its Liquid Core as a part of the Nokia Telco Cloud. Nokia's product is built on Core Virtualization and Telco Cloud Management.

tion consists of IMS, Open TAS, and HSS.[1] Open TAS is a Telecommunication Application Server (TAS) for Voice over LTE (VoLTE) networks built according to the IMS core architecture. Open TAS is part of the Nokia Networks virtualization offering and provides strong Liquid Core synergies with other Nokia Networks core elements. With Open TAS, operators can re-use existing back-end systems such as billing.[13]

NSN is announcing a telco cloud partner certification program in order to strengthen its wide-ranging telco cloud partner ecosystem. The program enables third-party software to be certified with Nokia Networks' telco cloud solutions, delivering extra value by working with the company's virtual network functions. This approach ensures that operators receive comprehensive solutions, in line with ETSI NFV, that meet the highest quality and security standards.[9]

3.2 Ericsson

Ericsson has launched a product called the Ericsson Cloud System. It is an open, distributed, telecom-grade platform based on the OpenStack architecture. This cloud technology will enable NFV, making it possible to place network functions where they are best suited. The Ericsson Cloud System is an open environment where new applications can run side by side with network functions.[3]

The Ericsson Cloud System is a full-stack solution that handles all workloads across multiple industries. Each layer in the stack can be offered independently. The layers can also be combined into converged offerings, such as secure storage, providing additional value. NFV has moved into its commercial phase and is no longer just an idea. According to NTT Docomo's press release, the company, "a personalized mobile solutions provider for smarter living, announced today that in collaboration with three world-leading ICT vendors—Ericsson, Fujitsu and NEC—it has started developing plans for a commercial deployment of network functions virtualization (NFV) technology on DOCOMO's mobile network in Japan from March 2016, and ultimately for its entire virtualized network." The main offerings are presented in figure 3 (Ericsson Cloud System).[2]

[Figure 3: Ericsson Cloud System]

3.2.1 Virtual Packet Core

Ericsson is introducing its virtual Evolved Packet Core as part of its commitment to NFV. This development of the Evolved Packet Core opens the way for new operator opportunities in areas such as M2M, enterprise services and distributed cloud for rural mobile broadband. These services can be run on the Ericsson Cloud System. The virtual Evolved Packet Core is made to support operators' transition to a network-enabled cloud. All Evolved Packet Core components are virtualized with full compatibility, so all features can be used across native and virtualized network nodes.[5]

The virtual Evolved Packet Core includes a virtual Evolved Packet Gateway, virtual SGSN-MME, virtual Service-Aware Policy Controller and virtual Service-Aware Support Node. The Evolved Packet Core can be run on the Ericsson Cloud System using third-party hardware. It also allows virtual network functions to be aggregated into full service solutions for fast deployment.[5]

3.3 Comparison of product strategies

Both Ericsson and NSN have products related to cloud computing and NFV. Ericsson has launched its virtual Evolved Packet Core, and Nokia has launched Liquid Core as part of the Nokia Telco Cloud. Nokia's product is built on core virtualization and telco cloud management: core network functions can run on standard IT hardware in a virtualized manner, which is referred to in ETSI standardization as Network Functions Virtualization.[6] Both companies have similar product strategies. Nokia's product supports a number of cloud stacks to meet different operators' requirements[12], while the Ericsson Cloud System is based on the OpenStack architecture [3].

Ericsson can offer data center and cloud operation services. It operates and maintains data centers and cloud environments on behalf of its customers. Ericsson offers both customized and standardized services, which can be located in Ericsson's or the customer's data center.

Internet service providers can deploy a complete telco cloud infrastructure built by NSN. They can also integrate NSN telco cloud solutions into their operator clouds, based on the HP cloud.

Both companies have the same vision of the future. Markets are going to change because of the revolution and evolution this technology brings. Because it is not possible to virtualize everything today, both companies are taking stepwise moves towards network virtualization.

4 Discussion

Cloud computing is a new business model for ISPs and gives them a new way to offer service delivery. Operators can find better business opportunities using cloud computing and NFV. They can combine telco and internet applications from the cloud into services offered to consumers and enterprises. In this cloud computing model, telco companies must think about privacy and security issues. These issues require care-

ful business and technology planning to meet all the requirements involved.[1]

The architecture's success lies in its modularity, well-defined functional elements and exact separation between operational and control functions. This enables operators to upgrade their network as well as to quickly deploy and adapt resources to demand. It also makes it possible for new players to enter the market easily and permits a virtual network operator to provide connectivity to its users.[19] New services can be deployed faster than with existing methods. However, matching the performance of dedicated hardware with software-based processing in NFV is difficult, which may lead to service level agreement (SLA) violations and customer dissatisfaction.

5 Conclusion

The telco business needs new ways to reduce costs, and the cloud model offers one way to do so. However, effective solutions may not be easy to find, and companies must think carefully about, for example, security issues. NFV requires operators to find new ways of looking at basic network attributes such as performance, reliability and security. Virtualization will change many ways of configuring and managing network resources. Open source software enables developers to offer their applications and services regardless of the underlying infrastructure. Service delivery times will become shorter. Programmability, orchestration and automation are necessary components to make this happen. Various types of dedicated network appliance boxes become virtual appliances, which are software entities running on high-performance servers, and higher-layer network functions become software-based virtual appliances. This migration will not be easy because of many management challenges. Using NFV in data centers will be energy efficient because fewer hardware boxes are needed, and in the future energy efficiency will be more and more important.

The market in the telecommunication business is going to change a lot in the future. NSN, Ericsson and other telecommunication companies must change their product portfolios. The migration towards NFV will take time, and a lot of work needs to be done to reach the full benefits of network virtualization. This migration is making constant progress.

References

[1] Change the game in content delivery. http://networks.nokia.com/portfolio/liquid-net/liquid-broadband (accessed 2015-03-18).

[2] Ericsson Cloud System. http://www.ericsson.com/spotlight/cloud (accessed 2015-03-19).

[3] Ericsson Cloud System enables network functions virtualization. http://www.ericsson.com/news/131122-ericsson-cloud-system/functions/virtualization_244129226_c (accessed 2015-02-05).

[4] Ericsson white paper (Uen 284 23-3219 Rev B, February 2014). The real-time cloud. http://www.ericsson.com/res/docs/whitepapers/wp-sdn-and-cloud.pdf (accessed 2015-03-18).

[5] Launch: Evolved Packet Core provided in a virtualized mode industrializes NFV. http://www.ericsson.com/news/1761217 (accessed 2015-03-19).

[6] Liquid Core. http://networks.nokia.com/portfolio/liquidnet/liquidcore (accessed 2015-02-05).

[7] Liquid Net. http://networks.nokia.com/portfolio/liquidnet (accessed 2015-03-02).

[8] NFV: State of the art, challenges, and implementation in next generation mobile networks (vEPC). http://ieeexplore.ieee.org.libproxy.aalto.fi/stamp/stamp.jsp?tp=&arnumber=6963800&tag=1 (accessed 2015-03-18).

[9] Nokia Networks shipping industry's first commercial NFV solution #networksperform. http://company.nokia.com/fi/news/press-releases/2014/09/04/nokia-networks-shipping-industrys/first-commercial-nfv/solution-networksperform (accessed 2015-03-18).

[10] Nokia Siemens Networks: Cloud computing - business boost for communications industry. http://networks.nokia.com/nsn_telco_cloud_white_paper_1_.pdf (accessed 2015-02-01).

[11] Nokia Siemens Networks: Cloud computing - business boost for communications industry, executive summary. http://networks.nokia.com/nsn_telco_cloud_executive_summary_2014.pdf (accessed 2015-02-01).

[12] NSN takes telco cloud to commercial reality with new automated application management #MWC14. http://networks.nokia.com/news-events/press-room/press-releases/nsn-takes-telco-cloud-to-commercial/reality-with-newautomated/application-management-mwc1 (accessed 2015-02-05).

[13] Open TAS, Nokia Networks open telecom application server. http://networks.nokia.com/portfolio/products/ip-multimedia-subsystem-ims-core/open-tas (accessed 2015-03-18).

[14] Quality of service differentiation. http://networks.nokia.com/portfolio/solutions/quality-of-service-differentiation (accessed 2015-03-18).


[15] Voice over LTE (VoLTE). http://www.ericsson.com/news/150130-er-wifi-calling_244069647_c (accessed 2015-02-01).

[16] ETSI. Network Functions Virtualisation: An introduction, benefits, enablers, challenges, and call for action. http://portal.etsi.org/NFV/NFV_White_Paper.pdf, 2012 (accessed 2015-03-18).

[17] Hype cycle for communications service provider infrastructure. http://www.gartner.com/document/2806117, 2014 (accessed 2015-04-01).

[18] M. Liyanage, S. Luukkainen and A. Tolonen. Software Defined Mobile Networks (SDMN): Beyond LTE Network Architecture, Chapter 9. Wiley, first edition, 2014.

[19] L. Studer Ferreira, D. Pichon, A. H., et al. An architecture to offer cloud-based radio access network as a service. http://ieeexplore.ieee.org.libproxy.aalto.fi/stamp/stamp.jsp?tp=&arnumber=6882627 (accessed 2015-03-19).

Something You Need To Know About Bluetooth Smart

Rui Yang
Student number: 467656
[email protected]

Abstract

Bluetooth 4.x is the latest version in the Bluetooth family. It embraces the features of Bluetooth Low Energy (BLE), which is one of the reasons Bluetooth is considered an ideal platform for the Internet of Things (IoT). Compared to previous versions of Bluetooth, BLE offers improved power saving, lower deployment cost, enhanced radio range, etc.[5]. This article focuses on the evolution of BLE from Bluetooth 4.0 to Bluetooth 4.2, the reasons behind these changes, an analysis of a particular use case, and security concerns in BLE.

KEYWORDS: Bluetooth Low Energy, BLE, Bluetooth Smart, Smart Hotel, Security

1 Introduction

Bluetooth is a wireless communication technology for short-range communications, especially dedicated to personal area networks (PANs). Launched about 21 years ago, it has been widely implemented and provides great convenience to our daily life. For instance, more than 2.5 billion Bluetooth products were shipped in 2013[3]. Benefiting from the low energy consumption of BLE, a button-sized cell can keep a single-module Bluetooth instance running for more than 3 years, depending on the use case[5], which points the way to a huge number of possible applications now and in the future.

Since Bluetooth is one of the technologies that form PANs, which means it operates close to our bodies, and given the significance and privacy of the transmitted data, the protection of users' privacy and its security issues naturally become a main focus for its users. This article therefore introduces the potential challenges regarding privacy and security, and possible solutions.

2 Background

Bluetooth Low Energy is currently hosted by the Bluetooth Special Interest Group (SIG), and its design goals are the lowest cost and ease of deployment. Classic Bluetooth is connection-oriented and was designed for data streaming, which means Bluetooth instances have to keep the connection alive even when there is no data to be transmitted. Lacking a sleep mode, classic Bluetooth suffers from considerable energy consumption. This is not good for devices which only have a button-cell battery. In order to overcome this shortcoming, the SIG adopted Nokia's Wibree technology as part of the Bluetooth standard, and this standard later evolved into BLE. It mainly focuses on keeping the energy consumption as low as possible. To achieve this, classic Bluetooth has been redesigned (not merely optimized) in BLE, covering the radio, the protocol stack, the profile architecture and the qualification regime[9].

Bluetooth version   Speed
v1.1                1 Mbps
v2.0                3 Mbps
v3.0                54 Mbps
v4.0                0.3 Mbps

Table 1: Transmission speed of different Bluetooth versions

We may benefit a lot from BLE. However, in some use cases the role of classic Bluetooth cannot be replaced by BLE. For instance, as Table 1 shows, the limited transmission speed of BLE (mainly the Basic Rate, with the Enhanced Data Rate optional) makes it unsuitable for applications which require huge amounts of data transmission, such as streaming music from a mobile phone to a headset. In order to be compatible with classic Bluetooth, a BLE instance can run in either single-mode or dual-mode. An instance running in single-mode cannot communicate with classic Bluetooth, while an instance running in dual-mode can.

Moreover, BLE has a client/server structure in its attribute protocol. This allows BLE devices to connect to the wide area network using only a gateway, such as a PC or a mobile device. With the expansion of the IoT concept, in addition to the ongoing integration of IPv6 into Bluetooth 4.2, it has been estimated that more than 2 billion BLE units will be deployed around the globe[10]. The increasing popularity and huge deployment volumes[3] require a thorough and careful analysis of the potential issues associated with BLE, which is one of the reasons this article was written.

One of the major competitors of BLE is IEEE 802.15.4, known as ZigBee, which uses the same radio frequency as BLE. However, the shipments of ZigBee instances are not comparable with those of BLE[1], mainly because ZigBee is not embedded in commonly used PCs and mobile phones, whereas BLE is. With the rapid development of the IoT, the shipments of BLE instances will grow even bigger. From a technical perspective, although ZigBee is low-power and its stack is quite light, BLE has even lower power consumption and a lighter stack[9].

This article is organized as follows. Section 3 introduces

the evolution of the essential features of BLE. Section 4 describes one use case scenario. Section 5 presents the security features of BLE. Section 6 concludes the article.
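As a back-of-envelope check on the multi-year coin-cell lifetimes mentioned above, the average current of a duty-cycled BLE device can be estimated as the duty-cycle-weighted mean of its active and sleep currents. Every figure in the sketch below (cell capacity, currents, duty cycle) is an illustrative assumption, not a value taken from the Bluetooth specification:

```python
# Back-of-envelope estimate of how a low duty cycle yields multi-year
# coin-cell lifetimes. All numeric figures are illustrative assumptions.

def battery_life_years(capacity_mah, active_ma, sleep_ma, duty_cycle):
    # Average current is the duty-cycle-weighted mean of the active
    # and sleep currents; lifetime is capacity divided by that mean.
    avg_ma = duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma
    return capacity_mah / avg_ma / (24 * 365)

# Assumed: a 230 mAh coin cell, 15 mA while the radio is on,
# 1 uA in sleep, and the radio active 0.05% of the time.
years = battery_life_years(230, 15.0, 0.001, 0.0005)
print(f"estimated lifetime: {years:.1f} years")  # about 3.1 years
```

With these assumed numbers the estimate lands in the "more than 3 years" range quoted in the introduction; a higher duty cycle or a hungrier radio shortens it proportionally.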

3 Evolutions of BLE Essential Features

3.1 Introduction to BLE

BLE (also known as Bluetooth Smart) was introduced in the Bluetooth 4.0 specification in 2010. It enables low-power devices and sensors to connect to the latest mobile devices. Compared to previous versions of Bluetooth, BLE is newly designed and has the distinct feature of low energy consumption. What's more, to make Bluetooth devices easier to use, the discovery procedure and the setup of services between devices have been simplified in the Bluetooth protocol. The discovery procedure has been simplified by decreasing the number of steps in it, and the setup of services has been simplified by advertising all of the services available in one Bluetooth device to another, so that the configuration of the incoming connection can be more automated.

Normally, the design goals determine a product's functionality and performance. The goal of classic Bluetooth is to stand by for several days or to stream for several hours, while BLE is designed to stand by for several years while collecting or broadcasting data such as temperature and location information. The early design of BLE set several key goals, including low cost, worldwide operation, low power consumption and robustness. All of these design goals determine how each sub-system in BLE should be implemented. In order to achieve the lowest cost, the system must be kept as small and efficient as possible, and new methodologies should be adopted to boost performance. For instance, to provide support for new network topologies, BLE has been optimized to lower cost using a research-based methodology[8]. In addition, BLE uses the 2.45 GHz ISM band to transfer signals in order to support worldwide operation. However, this radio band is unlicensed, and any organization can use it for commercial purposes. As a result, it is crowded with many transmission signals such as Bluetooth and Wi-Fi. In order to co-exist in such a radio band, a mechanism called Adaptive Frequency Hopping has been introduced to help Bluetooth avoid signal conflicts. Last but not least, the low energy consumption requirement has had a great impact on the design of the BLE protocol stack. For example, the link layer has been considered the most complicated layer in the Bluetooth protocol stack; in BLE, the link layer has been lightened and provides ultra-low energy consumption. The main reasons behind the low energy consumption are the low duty cycle and the low sleep power. As mentioned above, in classic Bluetooth, once a connection has been established, it stays active all the time even without data transmission. The duty cycle in BLE, by contrast, is extremely low, which means the connection has a short active time and the device can stay in sleep mode the rest of the time. The low sleep power consumption contributes further to the substantial power saving in BLE.

For a technology like Bluetooth, even with critical updates, all the traditional features of classic Bluetooth should still be available. In order to inherit all the features from previous versions of Bluetooth, BLE is designed to run in two modes: single-mode and dual-mode. In a single-mode implementation, only the low energy protocol stack is implemented. In a dual-mode implementation, the functionality of BLE is integrated into a classic Bluetooth controller[11] (see Fig. 1). Furthermore, one more feature requiring mention is the change of transmission speed. Because low energy consumption is a top concern, BLE commonly adopts the Basic Rate (BR), with a transmission rate of about 0.3 Mbps, with the Enhanced Data Rate (EDR) as an option (see Table 1).

[Figure 1: Dual Mode Chipsets]

In short, low energy, as the critical feature of BLE, has influenced BLE from its design to its implementation, and the number of new applications will keep rising in the coming years.

3.2 Evolutions introduced in Bluetooth 4.1

Bluetooth 4.1, a critical update although without hardware component changes, renewed the specification and gave developers more flexibility to integrate more functionality. Moreover, better co-existence with LTE radios, a higher transmission rate and more consistent connections were also included in Bluetooth 4.1. More specific details are described as follows.

3.2.1 Improving Usability

To improve consumer usability, as mentioned above, the SIG describes the outcome of these specific improvements as a "just works" experience. This "just works" experience comes from the following three aspects.

First of all, Bluetooth 4.1 provides better co-existence with LTE (Long-Term Evolution) radios. LTE has been widely adopted as the 4G standard for cellular networks, and the global shipment of mobile phones supporting LTE is also growing swiftly. In order to reduce the interference between these two technologies, the update in Bluetooth 4.1 allows the Bluetooth device to communicate with LTE radios so that they minimize interference by cooperating with each other. This is done automatically by the Bluetooth device, without any action needed from the user.

In addition, Bluetooth 4.1 also supports seamless and silent reconnection between two Bluetooth devices which have previously been connected with each other. This helps improve usability, since the whole process is done automatically without any user participation.

The third aspect improving usability is support for bulk data transfer. The scenario is that, when large amounts

of data need to be transmitted, for instance from a sensor to a roaming device, techniques featuring data compression, data blocking and buffering have been implemented to optimize the transfer rate in Bluetooth 4.1.
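The bulk-transfer techniques just mentioned (compression, blocking, buffering) can be illustrated with a small sketch: compress the payload, then split it into fixed-size blocks that a link layer could buffer and send. The 20-byte block size loosely mimics a small BLE payload and, like the function names, is an assumption for illustration rather than a value from the specification:

```python
import zlib

# Sketch of bulk transfer: compress, then split into fixed-size
# blocks. Block size and function names are illustrative assumptions.

def prepare_bulk_transfer(payload: bytes, block_size: int = 20):
    compressed = zlib.compress(payload)
    return [compressed[i:i + block_size]
            for i in range(0, len(compressed), block_size)]

def reassemble(blocks) -> bytes:
    return zlib.decompress(b"".join(blocks))

data = b"temperature=21.5C humidity=40% " * 50
blocks = prepare_bulk_transfer(data)
assert reassemble(blocks) == data
print(f"{len(data)} raw bytes sent as {len(blocks)} blocks")
```

For the repetitive sensor-style payload above, compression shrinks the data substantially before it is blocked, which is exactly why fewer radio transmissions (and thus less energy) are needed.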

3.2.2 Enabling Developer Innovation

Bluetooth 4.1 enables developer innovation by allowing developers to set each Bluetooth device to act as a Bluetooth Smart Ready Hub and a Bluetooth Smart Peripheral at the same time. As a Bluetooth Smart Ready Hub, a Bluetooth device can collect data from other Bluetooth devices, while as a Bluetooth Smart Peripheral it can transmit data to another Bluetooth device. By setting both roles at the same time in Bluetooth devices, more use cases and applications can be built upon this feature.

3.2.3 Enabling IoT

Bluetooth is a promising technology for providing wireless connectivity in the emerging world of IoT. With the introduction of IPv6 support in Bluetooth 4.1, a device can be considered an IoT device after it connects to the public network via Bluetooth.

3.3 Evolutions introduced in Bluetooth 4.2

Published on December 3rd, 2014, Bluetooth 4.2 is a hardware update: to obtain its full functionality, one has to acquire new hardware. However, some of the features, including the privacy protection, can be acquired via a firmware update to Bluetooth 4.0 and 4.1 devices. The main updates of Bluetooth 4.2 are enhanced privacy and security, boosted transmission speed and full internet connectivity [2].

The goals of the Bluetooth Security Manager are to provide secure connections and to secure communications. Bluetooth 4.0 and 4.1 are vulnerable to eavesdropping and Man-in-the-Middle (MITM) attacks, so Bluetooth 4.2 has introduced a new security model named LE Secure Connections [7]. This new security model adopts the Elliptic Curve Diffie-Hellman (ECDH) algorithm for public/private key generation, and a new key pairing procedure has also been adopted in Bluetooth 4.2. The SIG claims that using LE Secure Connections can protect the communication from passive eavesdropping and MITM attacks regardless of the pairing method [7]. More security features of Bluetooth are discussed in section 5.

Another exciting feature of Bluetooth 4.2 is its enhanced transmission rate, boosting transmission speed by more than 2.5 times with even fewer transmission errors [2]. This enhancement is partially achieved by increasing the payload of the transmission packet. With the increase in transmission speed, the number of packets that need to be transferred decreases, so fewer transmission errors can occur and less energy is consumed.

In addition to the features mentioned above, from the aspect of enabling internet connectivity for the IoT: if we consider Bluetooth 4.1 a step into the IoT, then Bluetooth 4.2, which no longer needs an intermediary such as a smartphone to function as a gateway, is another step forward with more new features released. Once the Internet Protocol Support Profile (IPSP) allows Bluetooth to connect directly to the network using IPv6/6LoWPAN, Bluetooth devices can utilize the current IP infrastructure to communicate with each other or to control other devices.

[Figure 2: Components in Hotel Use Cases]

4 Use Case Demonstration

With the appearance of Bluetooth, a large number of applications based on it have been introduced to the world. This section describes one particular use case of Bluetooth in a Smart Hotel. It presents how Bluetooth cooperates with other technologies and how the application is fully compatible with both old and new versions of Bluetooth. Moreover, possible attacks and solutions are discussed.

4.1 Use Case Introduction

Figure 2 shows all the components in the hotel use case.
1. Mobile Devices: equipped with either classic Bluetooth or BLE;
2. Central Gateway Module: hardware which can connect to the Internet and communicate with classic Bluetooth or BLE devices;
3. Lock Modules: physical locks embedded with BLE which control the opening of the door and communicate with the central gateway module;
4. Cloud: backend server for authentication.

The tenant holds a mobile device and owns the credential. When they want to open their door, the mobile device connects to the central gateway module through classic Bluetooth or BLE and presents the user's credential. If the mobile device is using BLE, the central gateway module also uses BLE to communicate with it; if the mobile device is not compatible with BLE, a classic Bluetooth connection is created between them. Once the central gateway module has received the credential from the mobile device, it forwards the credential to the backend server to verify the identity of the mobile device. Once the credential has been verified, the specific door is opened for the tenant. This is how the system works when a tenant tries to open the door. For such a system to work in a hotel scenario, many technologies are involved, cooperating with each other to function as a whole. What will be specifically focused on here is the role that Bluetooth plays in this system.

In the implementation of this system, Bluetooth plays the role of providing wireless connectivity between several mod-

ules and delivering content between them. At the beginning, the mobile phone initializes the connection with the Bluetooth device in the Central Gateway Module. If the credential provided by the mobile phone passes the verification, another Bluetooth connection is made between the Central Gateway Module and the Lock Module of one specific door, and the Central Gateway Module informs the Lock Module to open the door for the tenant. So, three Bluetooth devices are involved in this action. The Bluetooth device in the mobile phone acts as a Bluetooth Smart Peripheral within the connection with the Central Gateway Module: it transmits the data, namely the credential, and receives the response. The Bluetooth device in the Lock Module acts as a Bluetooth Smart Ready Hub because it constantly listens for possible connection requests from the Central Gateway Module. Moreover, the Bluetooth device in the Central Gateway Module acts as both a Bluetooth Smart Ready Hub and a Bluetooth Smart Peripheral.

There are still numerous other applications associated with Bluetooth. This specific use case demonstrates part of what Bluetooth is capable of.

4.2 Security Concerns

Keeping the user's credential secret in the Smart Hotel scenario is as important as keeping the metal key safe in an ordinary hotel. A disclosure of the credential would cause great damage not only to the user and the reputation of the hotel but also to people's trust in Bluetooth technology. This section mainly discusses potential risks of credential disclosure and possible strategies to improve the security of the Smart Hotel's usage of Bluetooth. Other potential risks caused by other parts of the Smart Hotel, such as security issues within the Cloud or the Central Gateway, are not discussed in this article.

There are two Bluetooth connections in the Smart Hotel use case: the BLE connection between the Central Gateway and the Lock Modules, and the BLE or classic Bluetooth connection between the Central Gateway and the Mobile Devices.

Firstly, in the connection between the Central Gateway and the Lock Modules, inappropriate settings of the Bluetooth devices can lead to the disclosure of the user's credential even with the latest Bluetooth version. In Bluetooth 4.0, there is no eavesdropping protection at all, and the 'Just Works' pairing method provides no MITM protection. Eavesdroppers can capture secret keys during the pairing procedure, and a MITM attack can reveal the credential transmitted between the Central Gateway and the Lock Modules. To minimize, though not eliminate, the risks of eavesdropping and MITM attacks, the 'Just Works' pairing method should never be used. Another risk that could disclose the credential is improper storage of link keys: link keys can be read and modified by attackers if they are not securely stored and protected through access controls. In addition, a weak pseudo-random number generator can also lead to leaking the credential. As may be familiar to the reader, during the pairing procedure the user is asked to input six digits which are randomly generated by the other pairing Bluetooth device. Since this sequence of six digits is generated by a Random Number Generator (RNG) using a pseudo-random algorithm, if the RNG produces static or periodic numbers, attackers can easily learn these six digits. Therefore, Bluetooth should use an RNG with a strong capability of generating random numbers.

Concerning the Bluetooth connection between the Central Gateway and the Mobile Device: if it is a BLE connection, the same considerations as above apply, while if it is a classic Bluetooth connection, there are several further risks worth mentioning here. First of all, repeated authentication attempts carry the risk of revealing the credential. The Bluetooth specification does require an exponentially increasing waiting interval between successive authentication attempts, but it does not require such a waiting interval for authentication challenge requests. As a result, an attacker can acquire the secret link key by collecting a large number of challenge responses, which are encrypted with the link key. To minimize this risk, the same waiting interval should be applied to authentication challenge requests as to successive authentication attempts. What's more, a classic Bluetooth connection is also vulnerable to MITM attacks; for instance, device authentication is a simple shared-key challenge/response, so the countermeasure is to implement mutual authentication. In addition, the stream cipher used in Bluetooth BR/EDR encryption is relatively weak. Since there are more risks in classic Bluetooth, it is highly recommended that Mobile Devices create a BLE connection instead of a classic Bluetooth connection to the Central Gateway Module whenever they are capable of it.

In short, the settings of the Bluetooth devices should be as strict as possible to protect the user's credential, possibly at the cost of sacrificing a small amount of ease of use. It is the developers' responsibility to determine the settings of the Bluetooth connection based on the use cases of their applications.

5 Security Features

When one Bluetooth device (the Source) is communicating with another one (the Target), other Bluetooth devices within their transmission range can also receive the signals transmitted between them. Creating and maintaining a secure Bluetooth connection means making sure both that the messages exchanged cannot be intercepted, modified or interpreted by other, anonymous devices, and that a received message comes from a Source that the Target trusts. For this purpose, BLE has introduced several services. For instance, the Advanced Encryption Standard (AES) is used to encrypt and decrypt the messages exchanged between BLE devices, and a signature-based message signing method is used to provide and verify the identity of a message. The following subsections describe the implementation of the security services in BLE in more detail.

5.1 Encryption and Authentication services

For two "strange" BLE devices to establish a secure link with each other by using pairing, one device (the Initiator) initiates a pairing procedure with the other (the Responder) to exchange their identities, set up trust and ready the encryption keys for the data exchange [4]. During the pairing procedure, they decide which pairing method should be used and what to share between them. After the pairing procedure,

192 Aalto University T-110.5191 Seminar on Internetworking Spring 2015

Figure 3: BLE Pairing

Figure 4: Pairing methods, part one

Figure 5: Pairing methods, part two

Figure 6: Bluetooth Low Energy Protocol Stack

all the encryption keys should have been exchanged, thus creating and maintaining a secure connection. The most critical step is what happens during the pairing procedure, where the security services mainly come into play. The pairing procedure consists of three phases (see Fig. 3). In the first phase, the two devices announce their input and output capabilities and choose a suitable method for phase two [6] (see Figures 4 and 5).

The second phase makes sure that the key exchange in phase three is performed in a secure way, by generating a Short-Term Key (STK) to encrypt the transmission channel. The two devices agree on a Temporary Key (TK) using the method chosen in the first phase. One common method the reader might be familiar with is the Passkey Entry method, in which the user is asked to input six random digits as the TK. Another is assisted by other technologies (such as NFC) for the TK agreement and is called the Out of Band method. If neither of these methods is available, the final one, called the Just Works method, is adopted. This method has no authentication at all and thus lacks the ability to prevent a MITM attack. After obtaining the TK, each of the pairing devices can compute the STK using the following algorithm: STK = AES-128(TK, Srand || Mrand).

After the STK is obtained, up to three 128-bit keys can be encrypted using the STK and distributed to the other device: the Long-Term Key (LTK), the Connection Signature Resolving Key (CSRK), and the Identity Resolving Key (IRK). The LTK is used to create a session key for each Link Layer connection (see the BLE protocol structure in Fig. 6). The CSRK is used to sign data at the ATT layer, and the IRK is used to generate the private address from the known public address, thus preventing tracking.

In short, these are the three phases that enable BLE 4.0 and 4.1 devices to build trust with each other. One possible threat to this mechanism is introduced in the following section, together with the corresponding solution adopted in BLE 4.2.

5.2 Possible Security Issues and Solutions

This section describes some potential threats to BLE devices. As mentioned in the previous section, BLE 4.0 and 4.1 are vulnerable to MITM attacks. The most common is active eavesdropping, in which the attacker acts as an intermediary between two BLE victim devices while the victims believe that they are directly connected to each other. Thus the attacker can acquire all the data transmitted between the two. This attack is also known from HTTPS web applications. To prevent active eavesdropping, two methods, namely the Passkey Entry and Out of Band methods, can be used.
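To make the Passkey Entry numbers concrete, the sketch below (an illustrative assumption in Python, not code from the article) shows how the six-digit passkey should be drawn from a cryptographically strong RNG and how it maps onto the 128-bit TK; the subsequent STK = AES-128(TK, Srand || Mrand) step would require an AES implementation and is not shown.

```python
import secrets

def generate_passkey() -> int:
    """Generate the six-digit Passkey Entry value (000000-999999).

    Using a CSPRNG here avoids the weak-RNG risk discussed in
    Section 4.2: a static or predictable generator would let an
    attacker guess the passkey, and with it the TK.
    """
    return secrets.randbelow(1_000_000)

def passkey_to_tk(passkey: int) -> bytes:
    """In BLE legacy pairing the TK is the passkey value zero-padded
    to 128 bits, so its real entropy is only about 20 bits
    (10^6 possibilities); byte order here is illustrative."""
    assert 0 <= passkey < 1_000_000
    return passkey.to_bytes(16, "big")

tk = passkey_to_tk(generate_passkey())
# tk then feeds the STK derivation: STK = AES-128(TK, Srand || Mrand)
```

The small key space is why Passkey Entry defeats a passive MITM only when each passkey is random and used once.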


Another critical MITM attack is passive eavesdropping, in which the attacker silently listens to the connection without the victims being aware of his presence. This can happen when the keys (in phase three, mentioned above) are exchanged through an insecure channel and the attacker intercepts these keys to decrypt the messages it receives from the victims. In order to overcome this issue, Elliptic Curve Diffie-Hellman (ECDH) key exchange has been introduced in LE Secure Connections.

6 Conclusion

This article introduces the essential features of Bluetooth 4.0 to 4.2 along with their security features. It presents the essential features from 4.0 to 4.2 at the beginning. Then one specific use case, the Smart Hotel, is discussed: the roles that Bluetooth plays in such a use case are described, as well as the potential risks of credential disclosure and advice on minimizing these risks. After that, the security features of BLE are introduced, along with possible threats and their solutions.

References

[1] ZigBee Alliance. Market Leadership. Technical report, ZigBee Alliance. Accessed February 2014. http://old.zigbee.org/About/AboutTechnology/MarketLeadership.aspx.

[2] Bluetooth.org. New Bluetooth Specifications Enable IP Connectivity and Deliver Industry-leading Privacy and Increased Speed. Technical report, Bluetooth Special Interest Group, 2014. http://www.bluetooth.com/Pages/Press-Releases-Detail.aspx?ItemID=220.

[3] Bluetooth.org. History of the Bluetooth Special Interest Group. Technical report, Bluetooth Special Interest Group. Accessed February 2015. http://www.bluetooth.com/Pages/History-of-Bluetooth.aspx.

[4] Bluetooth.org. Security, Bluetooth Smart (Low Energy). Technical report, Bluetooth Special Interest Group, 2015. https://developer.bluetooth.org/TechnologyOverview/Pages/LE-Security.aspx.

[5] Bluetooth.org. The Low Energy Technology Behind Bluetooth Smart. Technical report, Bluetooth Special Interest Group. Accessed February 2015. http://www.bluetooth.com/Pages/low-energy-tech-info.aspx.

[6] C. Gomez, J. Oller, and J. Paradells. Overview and Evaluation of Bluetooth Low Energy: An Emerging Low-Power Wireless Technology. Technical report, Universitat Politècnica de Catalunya / Fundació i2CAT, August 2012. http://www.mdpi.com/1424-8220/12/9/11734/htm.

[7] V. Gao. Everything you always wanted to know about Bluetooth security in Bluetooth 4.1. Technical report, Bluetooth Special Interest Group, 2014. http://blog.bluetooth.com/everything-you-always-wanted-to-know-about-bluetooth-security-in-bluetooth-4-2/.

[8] R. Heydon. Bluetooth Low Energy: The Developer's Handbook. 2013.

[9] J. Decuir. Bluetooth 4.0: Low Energy. Technical report, CSR plc. Accessed February 2015. http://chapters.comsoc.org/vancouver/BTLER3.pdf.

[10] A. West. Smartphone, the key for Bluetooth low energy technology. Technical report, IMS Research. Accessed February 2015. http://www.bluetooth.com/Pages/Smartphones.aspx.

[11] Wikipedia. Bluetooth. Accessed February 16th, 2015. http://en.wikipedia.org/wiki/Bluetooth.

A Survey of Password Managers

Can Zhu
Student number: 467711
[email protected]

Abstract

Password management is a high-profile topic these days because users employ different passwords every day. At the same time, a number of password managers (PSMs) have appeared to help users manage their passwords. Some password managers are browser-based, and some are device-based. This survey analyzes three popular browser-based PSMs (Google Chrome, Firefox, Internet Explorer) and one device-based PSM (KeePass) and evaluates their security. It also compares these four password managers after analyzing their security problems. We find that KeePass is the most secure, followed by Firefox, then Internet Explorer and Google Chrome.

KEYWORDS: password manager(s), browser-based, device-based, security

Object                   | Master password
Google Chrome (41.0)     | No
Internet Explorer (9.0)  | No
Firefox (15.0)           | Yes
KeePass (2.28)           | Yes

Table 1: The surveyed password managers and their master password feature

1 Introduction

The most common technique for user authentication is the use of passwords. However, how to manage passwords is among the most vexing issues in systems administration and computer security. As password management becomes more and more important, a large number of research contributions have been made toward increasing the security and usability of password-based authentication [3]. This paper classifies four password managers into two categories, browser-based and device-based, analyzes their working processes, and evaluates their security. It also compares those managers.

In this survey, Sec. 2 gives the background. Browser-based and device-based password managers are presented in detail in the next two sections, Sec. 3 and Sec. 4. Finally, Sec. 5 concludes the paper and Sec. 6 points out future work.

2 Background

Most websites, applications, and mobile apps use passwords to authenticate their users, but passwords have many usability problems and weaknesses. Password security depends on choosing passwords that are unique and hard to guess, yet it is difficult to remember and retype these passwords correctly. "The passwords that are easiest to choose and memorize tend to be vulnerable to dictionary attacks, in which an attacker tries to guess the password by constructing likely possibilities from lists of words and common passwords" [16]. Users have to memorize the corresponding passwords when logging into websites, email accounts, Android applications, etc. Moreover, users need a specific password for each account: if they use the same password everywhere and someone finds it, they have a serious problem, as the thief would be able to access the user's email account, bank account, and so on. However, it is nearly impossible to remember such a number of passwords, so many users let password managers do the task for them. Changing passwords frequently helps to resist attacks, but it makes memorizing passwords even harder.

Under these circumstances, a variety of password managers (PSMs) have appeared. The services they provide include password strengthening through iterated hashing [6, 4, 13], phishing protection [13], and converting other types of authentication into passwords [1]. Another main service provided by PSMs is storage and retrieval. Some PSMs are browser-based and some are device-based. This paper focuses on three browser-based PSMs, namely Google Chrome, Internet Explorer and Firefox, and one device-based PSM called KeePass; it describes their working processes and analyses their security. Table 1 lists the PSMs studied in this paper and their master password features. In a password manager, when a user name is given, the system can fill in the matching password automatically. A simple management system merely stores passwords, whereas a more sophisticated system locks the passwords under a master password. For example, KeePass helps a user manage his passwords by putting them in a database. The database is locked by the master password, so the user only needs to remember one master password to unlock it. Moreover, the complete database is encrypted with secure, well-known encryption algorithms, not only the password field.

3 Browser-based password managers

The popular Web browsers Google Chrome, Internet Explorer and Firefox all provide users with password managers. Firefox's PSM has a master password, while Google Chrome's and Internet Explorer's PSMs do not. R. Zhao and C. Yue [17] give an interesting analogy between the password manager of a browser and its master password. They describe a scenario where valuables at home are stored in a safe: because nobody can guarantee that a thief will never enter the home, the safe should not be easy to open. A computer is similar to a home, a PSM is similar to a safe, and a master password is similar to the safe's combination. Though nobody can prevent attackers from breaking into the computer, the probability of their decrypting the passwords protected by the master password should be minimized. This section discusses how the three PSMs manage passwords and what security problems they face.

Figure 1: Password Management in Google Chrome and Internet Explorer

Figure 2: Password Management in Firefox

3.1 Working Process

The automatic autofill functions of Google Chrome, Internet Explorer and Firefox are able to recognise username and password fields as soon as the login page is loaded, without requiring any user interaction [14].

Google Chrome and Internet Explorer perform encryption and decryption under Windows 7 with the help of two Windows API functions, namely CryptProtectData and CryptUnprotectData. Fig. 1 shows how passwords are managed in Google Chrome and Internet Explorer. The benefit of using these two functions is that the encrypted text can only be decrypted under the same Windows user account. Google Chrome does not provide additional entropy to these two API functions. It simply stores each plaintext username, encrypted password, and plaintext login webpage URL in the logins table of an SQLite database file named Login Data [17, 14]. Because the PSM allows a password to be autofilled on any page within the same domain as the page from which the password was originally saved, the next time the user visits the URL, the login information is autofilled by Google Chrome. "Internet Explorer encrypts each username and password pair and saves the ciphertext as a value data under the Windows registry entry. Each saved value data can be indexed by a value name, which is calculated by hashing the login webpage URL address" [17]. Different from Google Chrome, Internet Explorer provides the login webpage URL address as additional entropy to the two API functions.

Though Firefox allows users to set a master password to further protect their information, it is not enabled by default. If users set their own master password, their personal information is better protected. Google Chrome encrypts user names and passwords directly from plaintext. In contrast, Firefox uses the master password to encrypt the original user name and password before encrypting them further. Thus, even if attackers obtain the stored user name and password, they still have to decrypt the encrypted text. Fig. 2 shows how passwords are managed in Firefox.

3.2 Security challenges

The biggest problem of Google Chrome is that the passwords can be obtained easily once an attacker has access to the user's computer: if the attacker opens the browser's settings and clicks the show button in the preferences tab, he can obtain any saved password (Fig. 3). Internet Explorer is more secure, since it does not allow someone to view the saved passwords, and it also does not synchronize users' data across computers. Chrome and Internet Explorer tie the encrypted data to the user's computer login credentials; thus it is easy for these passwords to be revealed with tools such as WebBrowserPassView [12]. However, WebBrowserPassView cannot retrieve passwords that are encrypted with a master password, which makes Firefox the most secure among these three browsers.

Even though decrypting a user's login information becomes harder, brute-force attacks and phishing attacks on the master password are still quite possible [17]. Additionally, the master password is not enabled by default: if the Firefox master password is not set, Firefox is as vulnerable as the Google Chrome and Internet Explorer PSMs.

Another threat model is that hackers can temporarily install malware such as Trojan horses and bots on a user's computer using attacks such as drive-by downloads [8, 15]. By using drive-by downloads, an attacker can deliver specific decryption tools to users' computers and trigger their execution. Suddenly, all the information (user names and passwords) can be completely decrypted and sent to the attacker's computer.

Google Chrome, Internet Explorer and Firefox have an automatic autofill function, which exposes them to further vulnerabilities. Many websites serve the login page over HTTP but submit the user's password over HTTPS. For example, let us imagine that Bob uses a password manager to save his passwords for such sites. At some point later, Bob connects to a rogue WiFi router at a shopping center. His browser is directed to a landing page and asked to agree


Figure 3: Gaining passwords in Chrome’s settings

Figure 4: The master password UI in KeePass

to the terms of service, as a condition of connecting to the WiFi hotspot. However, Bob is unaware that the landing page contains multiple invisible iframes pointing to the login pages of the websites where Bob saves his sensitive information. When the browser loads these iframes, the rogue router injects JavaScript into each page and extracts Bob's passwords as autofilled by the password manager. In HTML, the action attribute of a form specifies where the form's contents will be sent after submission.
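As a toy illustration of that action-attribute rewriting, the key step of the attack is shown below in Python rather than the injected JavaScript an attacker would actually use; the URLs are made up.

```python
import re

# A login form as served by the legitimate site (hypothetical URL).
login_form = '<form id="login" action="https://bank.example/session" method="post">'

# The injected script's key step: point the form's action at the
# attacker's collection endpoint, so credentials autofilled by the
# password manager are submitted there when the form is (silently)
# submitted by the injected script.
hijacked = re.sub(r'action="[^"]*"',
                  'action="https://attacker.example/collect"',
                  login_form)
```

Because the autofill happens before any user interaction, the user never sees the form, the rewritten action, or the submission.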

Figure 5: KeePass generates a complicated password for the user

One way for an attacker to steal a user's passwords is thus to redirect the user to another website and obtain the passwords via the autofill function. This function of browser-based password managers therefore increases their vulnerability.
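For reference, the browser-side storage that such attacks ultimately target, as described in Section 3.1, can be sketched as follows. This is a simplified, hypothetical rendering of Chrome's Login Data SQLite file: the real schema has more columns, and the password blob would come from the Windows DPAPI function CryptProtectData, which is stubbed out here.

```python
import sqlite3

# Simplified sketch of the "logins" table in Chrome's Login Data file.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE logins (
    origin_url TEXT,      -- plaintext login page URL
    username_value TEXT,  -- plaintext username
    password_value BLOB   -- DPAPI-encrypted password blob
)""")

def fake_dpapi_protect(secret: bytes) -> bytes:
    # Stand-in for CryptProtectData: the real DPAPI ties the
    # ciphertext to the Windows user account, so only that
    # account can decrypt it.
    return secret[::-1]

con.execute("INSERT INTO logins VALUES (?, ?, ?)",
            ("https://example.com/login", "bob",
             fake_dpapi_protect(b"hunter2")))

# Autofill lookup: match the stored entry by its origin URL.
row = con.execute("SELECT username_value, password_value FROM logins "
                  "WHERE origin_url = ?",
                  ("https://example.com/login",)).fetchone()
```

Note that the username and URL sit in the file in plaintext; only the password field is protected, and only by the account-bound DPAPI key.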

4 Device-based password managers

KeePass can be carried on a USB stick and runs on Windows systems without being installed; it can also be used on Android, Linux, Mac OS X, and so on. Thus we classify it as a device-based password manager. Portable KeePass packages are available, too. The developers of KeePass claim that "KeePass doesn't store anything on your system. The program doesn't create any new registry keys and it doesn't create any initialization files in your Windows directory. Deleting the KeePass directory or using the uninstaller leaves no trace of KeePass on your system" [10]. Due to those features, it has become a popular password manager. This section introduces in detail how KeePass protects users' sensitive information.

4.1 Working Process

How to use KeePass on Windows. As shown in Table 1, what improves KeePass' security is that it uses a master password to further encrypt sensitive information. Before users create a file to keep all their passwords, they have to enter a master password (Fig. 4). There are two ways to manage a password. One is that users choose the passwords themselves and ask KeePass to keep them. The other is to let KeePass generate a rather complicated password (Fig. 5) and save it automatically for the user. When the user needs to fill out the username and password fields, he just clicks a button to obtain them from the KeePass database.

Encryption technology in KeePass. KeePass puts all the sensitive information, e.g. master passwords, user names, and user passwords, into a database, which is encrypted completely, not only the password field. To encrypt the database, either the Advanced Encryption Standard (AES) or the Twofish block cipher is applied. Both algorithms are well known and offer a high level of security; for example, AES is a U.S. Federal government standard and is approved by the National Security Agency (NSA) for top-secret information [5]. For both algorithms, a 128-bit initialization vector (IV) is generated randomly each time the user saves the database. "This allows multiple databases to be encrypted using the same key without observable patterns being revealed" [11].

In KeePass, the user can use both a password and a key file for authentication. The Secure Hash Algorithm SHA-256 is used to generate the 256-bit key for the block ciphers: it compresses the user key (consisting of the password and/or the key file) to a fixed-size key of 256 bits. If only a password is used, it and a 128-bit random salt are hashed using SHA-256 to form the final key. If both a password and a key file are used, the final key is derived as follows: SHA-256(SHA-256(password), key file contents) or SHA-256(SHA-256(password), SHA-256(key file contents)) [11]. It is currently infeasible to invert this process, which means one cannot recover the input from the hash or find a second message that compresses to the same hash.

Besides using strong encryption, e.g. AES-256, to protect the password database, KeePass keeps only the hash of the master key and entry passwords in process memory at runtime: "specifically, within 21 microseconds after the entry passwords have been read and decrypted from the password database" [7]. With this technique, attackers should not be able to find any sensitive data even if they can dump the KeePass process memory to disk.

4.2 Security challenges

KeePass requires some user interaction before autofilling, for instance clicking or typing into the username field, pressing a keyboard shortcut, or pressing a button in the browser. This requirement makes it more secure than automatic-autofill PSMs to some degree. However, KeePass still faces some security problems, described below.

Using the master password mechanism can better protect the passwords, but it should be carefully designed to maximize security. One main security concern is a brute-force attack on the master password. If the computation time for verifying a master password were very short in KeePass, it would still be possible to perform effective brute-force attacks on a user's master password [17].

To increase security, KeePass keeps only the hash of the master key and entry passwords in process memory at runtime. However, a framework named CipherXRay can recover the entry passwords and the master key from KeePass by determining exactly when and where they are briefly unencrypted in memory [7]. In a test, X. Li, X. Wang, and W. Chang used KeePassX to create a 1468-byte encrypted password database of four entries. After tainting the encrypted password database, they used CipherXRay to monitor the execution of KeePassX. CipherXRay successfully identified a 128-bit block cipher decryption in CBC mode of operation on the data read from the database file, and further identified the 256-bit secret key derived from the master password. Finally, CipherXRay successfully recovered the four entry passwords in plaintext form. After that, the researchers used the OpenSSL command-line tool to collect information and obtained all four plaintext password entries.

5 Conclusion

In Windows 7, Google Chrome and Internet Explorer use the Windows API functions CryptProtectData and CryptUnprotectData to encrypt and decrypt sensitive information, respectively. Without a master password, they save each plaintext username, encrypted password, and plaintext login webpage URL address into the logins table of SQLite database files. Google Chrome does not provide any additional entropy to the two API functions, while Internet Explorer does.

Firefox saves each encrypted username, encrypted password, and plaintext login webpage URL address into the logins table of an SQLite database file named signons.sqlite [17]. Under a master password, Firefox hashes it with a global 160-bit salt using the SHA-1 algorithm to generate a master key. This key is used to encrypt three Triple-DES keys and the string "password-check", and the resulting ciphertext is saved to the key3.db file; later, Firefox decrypts this ciphertext to authenticate the user.

Besides the master password mechanism, KeePass also has protection against dictionary attacks. To generate the final 256-bit key used for the block cipher, KeePass first hashes the user's password using SHA-256, encrypts the result N times using the AES algorithm, and then hashes it again using SHA-256. For the AES step, a random 256-bit key is used, which is stored in the database file. As the AES transformations are not precomputable, an attacker has to perform all the encryptions too; otherwise he cannot check whether a candidate key is correct. The attacker thus needs much more time to try each key; if he can only try a few keys per second, a dictionary attack is no longer practical. Though it is impossible to prevent dictionary attacks entirely, KeePass makes this type of attack much harder through this feature.

Thus, the security ranking among these four PSMs is: KeePass > Firefox > Internet Explorer > Google Chrome.

Services provided by managers to strengthen passwords include a master password, dictionary-attack protection, etc. Major browsers, e.g. Firefox and Chrome, offer a built-in password wallet, but these still suffer from security problems. Maybe these are the reasons why some experts refuse to use such password managers, preferring instead to manage their passwords by writing them down somewhere reasonably safe or relying on memory.

6 Future work

As mentioned above, a sophisticated password manager with a master password has a higher security level than a naive one, but this protection still allows offline attacks on the master password [2]. Thus, we try to find a password manager that is even more secure. For example, McCarney, Barrera, Clark, Chiasson, and van Oorschot [9] present a password manager called Tapas, which combines usability and security and is also easy to implement. In this manager, passwords are protected against offline attacks with a strong encryption key. Using this type of manager requires control of two independent devices, and no master password. Tapas is a smartphone-assisted password manager for a computer: it encrypts and stores the passwords on a smartphone and keeps the decryption key inside the browser on the paired computer. If one of the two devices is stolen, the thief cannot recover the passwords in practice. This is called dual-possession authentication, which involves two applications, a Manager and a Wallet. They run on different devices and offer three protocols to manage the passwords: Pair (Protocol 1), Store (Protocol 2), and Retrieve (Protocol 3). "These protocols are designed to achieve a relatively simple goal: by stealing the data of either the Manager or the Wallet, an adversary cannot determine the stored password for any given account with any greater success than attacking the account directly" [9]. In the future, we intend to continue exploring password managers and to find more effective PSMs.
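The KeePass-style key derivation described in Section 4.1 can be sketched as follows (Python standard library only). Note that the real dictionary-attack countermeasure performs N AES encryptions with a random stored key; an iterated SHA-256 stands in here purely to illustrate the same cost-multiplication idea.

```python
import hashlib
from typing import Optional

def compose_user_key(password: bytes, keyfile: Optional[bytes] = None) -> bytes:
    """Composite-key scheme from Section 4.1:
    SHA-256(password) alone, or
    SHA-256(SHA-256(password) || key-file-contents) when a key
    file is also used."""
    pw_hash = hashlib.sha256(password).digest()
    if keyfile is None:
        return pw_hash
    return hashlib.sha256(pw_hash + keyfile).digest()

def transform_key(composite: bytes, salt: bytes, rounds: int) -> bytes:
    """Key-stretching step against dictionary attacks. KeePass
    really encrypts the key N times with AES under a random key
    stored in the database; iterated SHA-256 is used here only to
    show the principle: every guess forces the attacker to pay
    for all `rounds` iterations."""
    key = composite
    for _ in range(rounds):
        key = hashlib.sha256(salt + key).digest()
    return hashlib.sha256(key).digest()
```

Raising `rounds` until one derivation takes a noticeable fraction of a second is exactly what makes "a few keys per second" the attacker's ceiling.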


References

[1] R. Biddle, S. Chiasson, and P. C. van Oorschot. Graphical passwords: Learning from the first twelve years. ACM Computing Surveys (CSUR), 44(4):19, 2012.

[2] J. Bonneau. The science of guessing: analyzing an anonymized corpus of 70 million passwords. In Security and Privacy (SP), 2012 IEEE Symposium on, pages 538-552. IEEE, 2012.

[3] J. Bonneau, C. Herley, P. C. van Oorschot, and F. Stajano. The quest to replace passwords: A framework for comparative evaluation of web authentication schemes. In Security and Privacy (SP), 2012 IEEE Symposium on, pages 553-567. IEEE, 2012.

[4] X. Boyen. Halting password puzzles. In Proc. USENIX Security, 2007.

[5] E. Barker, L. Bassham, W. Burr, M. Dworkin, J. Foti, and E. Roback. AES comments by NIST. Accessed March 7th, 2015. http://csrc.nist.gov/archive/aes/round2/r2report.pdf.

[6] J. A. Halderman, B. Waters, and E. W. Felten. A convenient method for securely managing passwords. In Proceedings of the 14th International Conference on World Wide Web, pages 471-479. ACM, 2005.

[7] X. Li, X. Wang, and W. Chang. CipherXRay: Exposing cryptographic operations and transient secrets from monitored binary execution. Dependable and Secure Computing, IEEE Transactions on, 11(2):101-114, 2014.

[8] N. Provos, P. Mavrommatis, M. Abu Rajab, and F. Monrose. All your iFRAMEs point to us. In USENIX Security Symposium, pages 1-16, 2008.

[9] D. McCarney, D. Barrera, J. Clark, S. Chiasson, and P. C. van Oorschot. Tapas: Design, implementation, and usability evaluation of a password manager. In Proceedings of the 28th Annual Computer Security Applications Conference, ACSAC '12, pages 89-98, New York, NY, USA, 2012. ACM.

[10] KeePass project. KeePass features. Accessed March 2nd, 2015. http://keepass.info/features.html.

[11] KeePass project. KeePass help center: Security. Accessed March 3rd, 2015. http://keepass.info/help/base/security.html#secref.

[12] M. Pinola. Lifehacker: Which password manager is the most secure? Accessed March 9th, 2015. http://lifehacker.com/5944969/which-password-manager-is-the-most-secure.

[13] B. Ross, C. Jackson, N. Miyake, D. Boneh, and J. C. Mitchell. Stronger password authentication using browser extensions. In USENIX Security, pages 17-32. Baltimore, MD, USA, 2005.

[14] D. Silver, S. Jana, E. Chen, C. Jackson, and D. Boneh. Password managers: Attacks and defenses. In Proceedings of the 23rd USENIX Security Symposium, 2014.

[15] Y.-M. Wang, D. Beck, X. Jiang, R. Roussev, C. Verbowski, S. Chen, and S. King. Automated web patrol with Strider HoneyMonkeys. In Proceedings of the 2006 Network and Distributed System Security Symposium, pages 35-49, 2006.

[16] K.-P. Yee and K. Sitaker. Passpet: convenient password management and phishing protection. In Proceedings of the Second Symposium on Usable Privacy and Security, pages 32-43. ACM, 2006.

[17] R. Zhao and C. Yue. All your browser-saved passwords could belong to us: a security analysis and a cloud-based new design. In Proceedings of the Third ACM Conference on Data and Application Security and Privacy, pages 333-340. ACM, 2013.
