Technical Report CS-2000-07 [101] Robison WJ (2010) Free at What Cost? Cloud Computing Privacy Under the Stored Communications Act

cHiPSet - Research Work Results Grant Period 1 vi Preface This document is the report of the research work of the cHiPSet members. It surveys the recent develop- ments on infrastructures and middleware for Big Data modelling and simulation. The detailed taxonomy of Big Data systems for life, physical, and socio-economical applications. is also presented. vii Contents 1 Introduction .................................................................. 1 Joanna Ko lodziej, Micha l Marks and Ewa Niewiadomska-Szynkiewicz References . 2 2 Infrastructure for Big Data processing ........................................... 3 George Mastorakis, Rajendra Akerkar, Ciprian Dobre, Silvia Franchini, Irene Kilanioti, Joanna Ko lodziej, Micha l Marks, Constandinos Mavromoustakis, Ewa Niewiadomska-Szynkiewicz, George Angelos Papadopoulos, Magdalena Szmajduch and Salvatore Vitabile 2.1 Introduction to Big Data systems architecture . 3 2.2 Clusters, grids, clouds . 6 2.3 CPU/GPU and FPGA computing . 17 2.4 Mobile computing . 21 2.5 Wireless sensor networks . 37 2.6 Internet of things . 42 References . 52 3 Big Data software platforms .................................................... 61 Milorad Tosic, Valentina Nejkovic and Svetozar Rancic 3.1 Cloud computing software platforms . 61 3.2 Semantic dimensions in Big Data management . 61 References . 69 4 Data analytic and middleware .................................................. 71 Mauro Iacono, Luca Agnello, Albert Comelli, Daniel Grzonka, Joanna Ko lodziej, Salvatore Vitabile and Andrzej Wilczynski´ 4.1 Multi-agent systems . 71 4.2 Middleware for e-Health . 75 4.3 Evaluation of Big Data architectures . 80 References . 84 5 Big Data representation and management ....................................... 93 Viktor Medvedev, Olga Kurasova and Ciprian Dobre 5.1 Solutions for Data Mining and Scientific Visualization . 94 ix x Contents 5.2 Data fusion, aggregation and management . 109 References . 118 6 Resource management and scheduling ........................................... 123 Florin Pop, Magdalena Szmajduch, George Mastorakis, Constandinos Mavromoustakis 6.1 Resource management in data centres . 123 6.2 Scheduling for Big Data systems . 123 7 Energy aware infrastructure for Big Data ........................................ 125 Ioan Salomie, Joanna Kolodziej, Micha l Karpowicz, Piotr Arabas, Magdalena Szmajduch, Marcel Antal, Ewa Niewiadomska-Szynkiewicz, Andrzej Sikora, George Mastorakis, Constandinos Mavromoustakis 7.1 Introduction . 125 7.2 Taxonomy of energy management in modern distributed computing systems . 126 7.3 Energy aware CPU control . 127 7.4 Energy aware schedulers . 130 7.5 Energy aware interconnections. 141 7.6 Offloading techniques for mobile devices . 148 References . 153 8 Big Data security ............................................................. 163 Agnieszka Jakóbik 8.1 Main concepts and definitions . 163 8.2 Big Data security . 164 8.3 Methods and solutions for security assurance BD systems . 166 8.4 Summary . 174 References . 174 Index ............................................................................. 177 List of Contributors Ewa Niewiadomska-Szynkiewicz Institute of Control and Computation Engineering, Warsaw University of Technology, Poland, e-mail: [email protected] Rajendra Akerkar Western Norway Research Institute, Norway, e-mail: [email protected] Luca Agnello Department of Biopathology and Medical Biotechnologies, University of Palermo, Italy, e-mail: [email protected] Marcel Antal Technical University of Cluj-Napoca, Romania, e-mail: [email protected] Piotr Arabas Institute of Control and Computation Engineering, Warsaw University of Technology, Poland, e-mail: [email protected] Albert Comelli Department of Biopathology and Medical Biotechnologies, University of Palermo, Italy, e-mail: [email protected] Ciprian Dobre University Politehnica of Bucharest, Romania, e-mail: [email protected] Silvia Franchini University of Palermo, Italy, e-mail: [email protected] Daniel Grzonka xi xii List of Contributors Institute of Computer Science, Cracow University of Technology, Poland, e-mail: [email protected] Mauro Iacono Seconda Università degli Studi di Napoli, Italy, e-mail: [email protected] Agnieszka Jakóbik Institute of Computer Science, Cracow University of Technology, Poland, e-mail: [email protected] Micha l Karpowicz Research and Academic Computer Network, Warsaw, Poland, e-mail: [email protected] Irene Kilanioti Department of Computer Science, University of Cyprus, Nicosia, Cyprus, e-mail: [email protected] Joanna Ko lodziej Institute of Computer Science, Cracow University of Technology, Poland, e-mail: [email protected] Olga Kurasova Vilnius University, Institute of Mathematics and Informatics, Lithuania, e-mail: [email protected] Micha l Marks Research and Academic Computer Network, Warsaw, Poland, e-mail: [email protected] George Mastorakis Technological Educational Institute of Crete, Greece, e-mail: [email protected] Constandinos Mavromoustakis Department of Computer Science, University of Nicosia, Cyprus, e-mail: [email protected] Viktor Medvedev Vilnius University, Institute of Mathematics and Informatics, Lithuania, e-mail: [email protected] Valentina Nejkovic University of Nis, Faculty of Electronic Engineering, Department of Computer Science, Laboratory of Intelligent Information Systems, Serbia, e-mail: [email protected] George Angelos Papadopoulos List of Contributors xiii Department of Computer Science, University of Cyprus, Nicosia, Cyprus, e-mail: [email protected] Svetozar Rancic University of Nis, Faculty of Science and Mathematics, Department for Computer Science, Serbia, e-mail: [email protected] Ioan Salomie Technical University of Cluj-Napoca, Romania, e-mail: [email protected] Andrzej Sikora Research and Academic Computer Network, Warsaw, Poland, e-mail: [email protected] Magdalena Szmajduch Institute of Computer Science, Cracow University of Technology, Poland, e-mail: [email protected] Milorad Tosic University of Nis, Faculty of Electronic Engineering, Department of Computer Science, Laboratory of Intelligent Information Systems, Serbia, e-mail: [email protected] Salvatore Vitabile University of Palermo, Italy, e-mail: [email protected] Andrzej Wilczynski´ Institute of Computer Science, Cracow University of Technology, Poland, e-mail: [email protected] Chapter 1 Introduction Joanna Ko lodziej, Micha l Marks and Ewa Niewiadomska-Szynkiewicz Distributed computing paradigm endeavors to tie together the power of large number of resources which are distributed across a network. Each user has its own requirements and needs which is being shared among the network through proper communication channel [2]. There are three major reasons for using distributed computing paradigm. Such networks are necessary for generating the data that are required for the execution of tasks on remote resources. Second, most of the parallel applications have multiple processes that run concurrently on many nodes communicating over a high-speed interconnect. The use of high performance distributed systems for parallel applications is beneficial as compared to a single Central Processing Unit (CPU) machine for practical reasons. The ability of services distributed in a wide network is low-cost and makes the whole system scalable and adapted to achieve the desired level of the performance efficiency. Third, the reliability of the distributed system is higher than a monolithic single processor machine. A single failure of one network node in a distributed environment does not stop the whole process as compared to a single CPU resource. Some techniques for achieving reliability on a distributed environment are check pointing and replica- tion. Scalability, reliability, information sharing, and information exchange from remote sources are the main motivations for the users of distributed systems [1]. Distributed High Performance Computing (HPC) systems will drive progress in high energy physics, chemistry and material science, shape aeronautics, automotive and energy industries, they will also sup- port world-wide efforts to solve emerging problems of climate change, economy regulation and medicines development. Indeed, HPC systems are viewed as strategic resources for Europe’s future. Nonetheless, implementing such systems faces different new technological challenges, the major ones including reduc- tion of energy and power consumption. Modelling and simulation offer suitable abstractions to manage the complexity of analysing Big Data in various scientific and engineering domains. This document describes survey of the state-of-the-art Joanna Kolodziej Institute of Computer Science, Cracow University of Technology, Poland, e-mail: [email protected] Micha lMarks Research and Academic Computer Network, Warsaw, Poland, e-mail: [email protected] Ewa Niewiadomska-Szynkiewicz Institute of Control and Computation Engineering, Warsaw University of Technology, Polande-mail: [email protected]. edu.pl 1 2 Joanna Ko lodziej, Micha l Marks and Ewa Niewiadomska-Szynkiewicz of infrastructures and middleware for Big Data modelling and simulation. Furthermore, it provides the comprehensive full taxonomy of Big Data systems for life, physical, and socio-economical applications. The attention is focused on the following topics: • Big Data infrastructure (clouds & grids, GPU, FPGA, IoT, WSN, mobile computing), • Big

Technical Report CS-2000-07 [101] Robison WJ (2010) Free at What Cost? Cloud Computing Privacy Under the Stored Communications Act

Lo Hemos Olvidado Todo

Operationalizing Design Fiction with Anticipatory Ethnography

Beyond Assessment: Recognizing Achievement in a Networked World Downes, Stephen

Regulating Adwords: Consumer Protection in a Market Where the Commodity Is Speech

Asterisk Revisited: Debating a Right of Reply on Search Results

Access, Fairness, and Accountability in the Law of Search Oren Bracha

Alice E. Marwick [email protected] |

Surveillance and Privacy in the Implementation of Google Apps For

Messing with the Magic PART TWO - the Google Story

Administrator Guide, Dragon Medical Practice Edition Version 2.0, L-3659

A Study of Emerging Opportunities for Digital Print Production of User-Generated Content Javier Rodriguez-Borlado Martinez

When Chaos Rules, Only the Fittest Survive: the Impact of Disruptive Technologies on Organizational Survival