A Middleware for Mobile Edge-Clouds

João Filipe Rodrigues

Doctoral Programme in Computer Science
Department of Computer Science
2019

Supervisor: Luís Miguel Barros Lopes, Associate Professor, Faculdade de Ciências

Co-supervisor: Fernando Manuel Augusto Silva, Full Professor, Faculdade de Ciências

Dedicated to my mother, sister and little niece

Acknowledgements

I am deeply indebted to my advisors, Profs. Fernando Silva and Luís Lopes, and also to Prof. Eduardo Marques, for their fundamental role during my doctoral work. They provided me with every bit of guidance, assistance, and expertise that I needed. I would like to thank my advisors, again, for offering me a research grant, with special thanks for their effort to provide all the material required for the success of this project. I would also like to thank Prof. Rolando Martins for his assistance and guidance at the beginning of my work. I simply cannot imagine better advisors and colleagues to work with. It is an honour to work with such good people.

I am thankful to the Fundação para a Ciência e Tecnologia (FCT) for funding my research grant in the context of project HYRAX (CMUP-ERI/FIA/0048/2013), which supported this work for 48 months, and to the Centre for Research and Advanced Computing Systems (CRACS) for the travel funding through projects financed by ERDF (COMPETE 2020, POCI-01-0145-FEDER-006961), FCT (CMUP-ERI/FIA/0048/2013 and UID/EEA/50014/2013), and FEDER (NORTE 2020, NORTE-01-0145-FEDER-000020).

To all my fellow colleagues from DCC-FCUP, especially Fábio Domingues, Sylwia Bugla and Joaquim Silva, I wish you much success in your professional and personal future.

Finally, I would like to thank my family and friends for all the support during the last years, especially my mother and sister, who believed in, trusted and supported me in every way they could. I am also especially grateful to my grandparents, Sildina and Manuel, whose way of life is really inspiring.

Resumo

Over the last decade, mobile devices have become ubiquitous. Advances in the manufacturing process have brought the prices of such devices down significantly while increasing their processing and storage capacity. Traditionally, these devices were seen as thin clients. However, as noted above, today's smartphones and tablets contain hardware resources that allow more sophisticated software to be installed, enabling their use as thick clients or even small servers. At the same time, new standards and protocols, such as WiFi-Direct, were developed so that mobile devices can communicate directly with each other, rather than through the Internet or WiFi access points. Moreover, this technology enables communication between devices with high bandwidth and, at the same time, low latency. Finally, devices now incorporate several sensors that allow them to "sense" their surroundings and feed more sophisticated applications and services.

All these developments underlie a renewed interest in the area of edge-networks. Such networks are composed of mobile devices that form edge-clouds among themselves, bringing part of the functionality of the cloud to the periphery of the Internet. Edge-clouds are therefore formed by sets of mobile devices in close proximity that provide services based on the aggregation of their computational and storage resources. The development of such applications and services is, however, hindered by the complexity of forming and maintaining the networks, by the intrinsic instability of wireless links and by the heterogeneity of the hardware and operating systems of the devices.

In this thesis, we present the design and implementation of a generic middleware for edge-clouds whose purpose is to give programmers the basic procedures for implementing mobile applications for these platforms. To that end, the middleware offers the programmer an API that deals with low-level complexity, such as network formation and management and the aggregation of resources, so as to handle situations such as intermittent communication, mobility, churn and infrastructure-less execution. All of this is provided without the need to root the device, which would otherwise make the middleware unusable in the real world.

Besides the architecture and implementation details, we evaluate the performance and scalability of the middleware and briefly discuss the numerous applications and services that have been implemented using it. In addition, we developed and studied the behaviour of real applications for the distribution and sharing of video in sports venues, with or without network infrastructure.

Abstract

In the last decade, mobile devices have become ubiquitous. Advances in manufacturing processes have significantly lowered the price of such devices whilst augmenting their storage and computational capabilities. Traditionally, these devices were viewed as simple clients. However, smartphones and tablets today have hardware resources that allow more sophisticated software to be installed, allowing for their use as thick clients or even thin servers. Simultaneously, new standards and protocols, such as WiFi-Direct, have been developed that allow mobile devices to communicate directly with each other, as opposed to over the Internet or across WiFi access points. This technology enables low latency, high bandwidth device-to-device (D2D) communication. Finally, the devices now embed multiple sensors that allow them to sense the environment and feed more sophisticated applications and services.

These developments fostered research on edge-networks, composed of such devices, and on mobile edge-clouds, where some traditional cloud computing functionality is provided at the edge. Mobile edge-clouds are thus formed by sets of mobile devices in close proximity providing services that crowd-source their computational and storage resources. The development of such crowd-sourcing applications and services is, however, hampered by the complexity of network formation and maintenance, the intrinsic instability of wireless links and the heterogeneity of the hardware and operating systems in the devices.

In this thesis, we present the design and implementation of a general purpose middleware for edge-clouds that provides programmers with the basic building blocks for implementing mobile crowd-sourcing applications. Towards this goal, the middleware provides the programmer with an API that handles the low-level complexity of network formation and management and crowd-sourcing of resources, whilst handling problematic issues such as intermittent communication, device churn and untethered execution. All this is provided without the need for “rooting” the device, a requirement that would rule out the approach for real-world use.

Besides the architecture and implementation details, we provide an assessment of the performance and scalability of the middleware and discuss the, by now, numerous applications and services that have been implemented on top of it, including real-world apps for content sharing in sports venues or in infrastructure-deprived environments and general services for computation and storage.

Acronyms

3G/4G Third/Fourth generation of wireless mobile telecommunications technology

AP Access Point

AODV Ad-hoc On-demand Distance Vector

API Application Programming Interface

Bluetooth LE Bluetooth Low Energy

CC Cloud Computing

D2D Device to Device

DSDV Destination-Sequenced Distance Vector

DSR Dynamic Source Routing

GB Gigabyte

Gbps Gigabit per second

GHz Gigahertz

GPS Global Positioning System

IaaS Infrastructure as a Service

iOS iPhone Operating System

IoT Internet of Things

IP Internet Protocol

ISM Industrial, Scientific, and Medical radio band

MCC Mobile Cloud Computing

MAC Media Access Control

MANET Mobile Ad-hoc Network

Mbit Megabit

Mbps Megabit per second

MEC Mobile Edge Cloud

NFC Near Field Communication

MHz Megahertz

OLSR Optimised Link State Routing Protocol

OS Operating System

OSI Open Systems Interconnection

P2P Peer to Peer

PaaS Platform as a Service

RAM Random Access Memory

RTC Real Time Communications

SaaS Software as a Service

SDK Software Development Kit

SSD Solid State Disk

TDLS Tunneled Direct Link Setup

WPA WiFi Protected Access

ZHLS Zone-based Hierarchical Link State Routing Protocol

ZRP Zone Routing Protocol

Contents

Resumo
Abstract
Acronyms
List of Tables
List of Figures
1 Introduction
  1.1 Motivation
  1.2 Problem statement
  1.3 Contributions
  1.4 Thesis layout
2 State of the art
  2.1 Networking
    2.1.1 Technologies
    2.1.2 Network formation
    2.1.3 Routing
  2.2 Middleware
    2.2.1 Generic Middlewares
    2.2.2 Special purpose
  2.3 Crowd-sourcing applications
  2.4 Discussion
3 A Middleware for Edge-Clouds
  3.1 Overview
  3.2 Link Layer
    3.2.1 Architecture
    3.2.2 Example
    3.2.3 API
    3.2.4 Implementation
  3.3 Network layer
    3.3.1 Architecture
    3.3.2 Examples
    3.3.3 API
    3.3.4 Implementation
  3.4 Discussion
4 Middleware evaluation
  4.1 Link layer
    4.1.1 Evaluation setup
    4.1.2 Latency of link actions
    4.1.3 Bandwidth measurements
    4.1.4 Resource consumption
  4.2 Network Layer
    4.2.1 Evaluation setup
    4.2.2 Packet routing
    4.2.3 Network Formation
  4.3 Discussion
5 Other Contributions
  5.1 Wireless Technology Assessment
  5.2 User Generated Replays: Part I
  5.3 User Generated Replays: Part II
  5.4 Edge-Cloud Services and Apps
    5.4.1 Distributed Computing
    5.4.2 Distributed Storage
    5.4.3 Publish/Subscribe
6 Conclusions
References

List of Tables

2.1 Frameworks/SDKs available.
3.1 Supported Logic Actions per Wireless Technology.
4.1 Link layer - bandwidth.

List of Figures

1.1 Proposed Scenarios.
2.1 WiFi-TDLS.
2.2 WiFi-Direct.
3.1 Hyrax Middleware.
3.2 The link layer internal structure.
3.3 Link Layer API.
3.4 Link Layer Flow.
3.5 A Logical Network.
3.6 The network layer internal structure.
3.7 Network Layer API.
3.8 Network Layer Flow.
3.9 Star Network.
3.10 Mesh Network.
4.1 Link layer - action latency.
4.2 Link layer - CPU usage.
4.3 Link layer - memory usage.
4.4 Link layer - battery usage.
4.5 Bluetooth star.
4.6 Bluetooth star traffic.
4.7 WiFi direct star.
4.8 WiFi direct star traffic.
4.9 WiFi star.
4.10 WiFi traffic.
4.11 WiFi/WiFi-Direct with up to 3 devices per group.
4.12 WiFi/WiFi-Direct with up to 6 devices per group.
4.13 Bluetooth star formation.
4.14 WiFi-Direct star formation.
4.15 WiFi/WiFi-Direct star formation.
5.1 Benchmark Scenarios (from [85]).
5.2 Cloud to Edge Architecture (from [93]).
5.3 The Edge Cloud Architecture (from [84]).
5.4 Deployment at Nave Desportiva de Espinho (from [84]).

Chapter 1

Introduction

In the last two decades, the domestic and industrial computing paradigm has drastically changed with a new, cost effective, programming paradigm called Cloud Computing (CC). The providers of CC solutions offer the illusion of “infinite” processing power and an “infinite” storage pool, supported by a vast infrastructure of powerful, server calibre, machines. Traditionally, such computing power was available only to large companies, institutions and governments. Smaller companies and businesses had to resort to their own information systems for providing online services. The introduction of CC allowed the big companies to monetise their often huge computational resources whilst providing a cost effective, highly flexible, way for smaller companies to deploy and maintain their business services on the Internet.

More recently, the exponential increase in the number of mobile devices has changed the way information is consumed. Whereas before most content was provided by mostly static, physically connected machines on the Internet, most Internet traffic is now produced by mobile devices and “things”, forming highly dynamic wireless networks. The computational and storage resources of these devices, however, initially did not allow complex tasks to be performed “in loco”. Mobile devices were, at most, thin clients. It was in this setting that CC was given a new twist called Mobile Cloud Computing (MCC). The new paradigm allowed mobile devices to offload complex tasks and data onto a traditional CC infrastructure. The results were then sent back to the devices, provided they had an Internet connection.

Mobile devices, however, have since evolved considerably in terms of processing power, memory and storage space, some rivalling desktop systems only a decade old. These resources can be exploited to perform complex tasks locally, something unthinkable only a few years ago. Mobile devices today have the resources to provide thin services. Moreover, many application scenarios never really benefited from the MCC paradigm, as they rely heavily on local data, generated and stored in the devices, and require high bandwidth and low communication latency. This realisation, the availability of stock hardware supporting high performance wireless connections, and the ubiquity of the devices set the stage for Mobile Edge Computing.

1.1 Motivation

The number of mobile devices has been steadily increasing. It is predicted that by 2020 there will be more than 11.6 billion connected devices, exceeding the world’s projected population at that time (7.8 billion) [21]. The same study suggests that global mobile traffic will reach 35 exabytes per month in 2020 (contrasting with 11 exabytes in 2017), 75% of which will be video content. Handling such high traffic volumes will be really challenging for the communication infrastructures (e.g. 3G/4G and WiFi), which must evolve in order to support them (e.g., 5G products are hitting the market now).

Nowadays, the computing, memory and storage resources of a typical mobile device are equivalent to those of laptop computers only a decade old, and ever more powerful models are announced at a very fast pace. Such resources did not go unnoticed by researchers, who envisioned using the devices for more challenging tasks, from traditional thin clients up to (thin) service providers. Moreover, new standards and protocols, such as WiFi-Direct, allow device-to-device (D2D) communication with low latency and high bandwidth. Taken together, these advances allowed researchers to propose a computational model similar to Cloud Computing except that the resources are provided by devices at the edge of the Internet, connected via wireless links in highly dynamic network configurations. These “edge clouds” provide the backbone upon which multiple crowd-sourcing applications supporting untethered execution can be implemented to exploit data locality and fast D2D communication [26, 29, 32, 44].

Despite the growing interest in edge-clouds, relatively few crowd-sourcing applications have been proposed, either as research prototypes or commercial products. One of the reasons for this meagre output is, we believe, the lack of adequate middleware to support the development of such applications, allowing programmers to abstract away from the intricacies of D2D communication using multiple protocols, from building and maintaining a mesh network of devices, from moving and storing data between devices, from scheduling computations over the network, and other complex operations. A middleware capable of providing an API and core services with such functionality would go a long way to make application development more agile.

1.2 Problem statement

This work was developed in the context of the Hyrax project¹, in which three application scenarios were envisioned to motivate and to provide initial experience before and during the design and implementation of the middleware:

• Crowded Venues: in crowded scenarios, problems arise when large numbers of people want to access content at the same time, stressing the communication infrastructure, the radio spectrum and the central servers. Some of that traffic is repeated, reflecting a shared context (same interests), and can therefore be handled more efficiently. For these scenarios, new types of mobile applications may emerge that take advantage of content locality and of new D2D communication technologies, like WiFi-Direct, Bluetooth LE and WiFi-TDLS, to offload some of the traffic through alternative paths. Traditionally, these systems follow a pure client-server model and must be adapted to work in a P2P fashion and to deal with intermittent connectivity and mobility.

• Amber alert: a crowd-sourcing application for distributed face recognition to deal with emergencies like Amber alerts, used to disseminate information about missing people. We consider crowded scenarios such as stadiums or shopping malls, where it is common for instance to have children getting lost from their parents. By using computer vision on edge clouds, it is possible to minimise, and potentially eliminate, the dependency on infrastructural clouds, by doing the search of the missing person’s photo on the local storage of each mobile device without having to disclose them to a central cloud provider. The objective is to gather positive identifications of a missing person, along with important time and spatial information.

• Disasters: several crowd-sourcing applications have emerged to deal with emergency and disaster scenarios making use of mobile devices. Most of these applications, though, rely on standard communications and a web/cloud-based infrastructure, and are focused on mobile data collection and dissemination. In disaster scenarios, infrastructural communications (WiFi, GSM/3G/4G) may be severely impaired due to adverse environmental conditions, and people (e.g., a rescue team worker or a person in distress) find it hard or impossible to communicate. The Hyrax application would let users find others in the same physical vicinity, and establish peer-to-peer communication for the exchange of text messages or other media like audio/video, geo-tags, etc.

¹ http://hyrax.dcc.fc.up.pt/

Figure 1.1: Proposed Scenarios. (a) User Generated Replay. (b) Distributed Photo Search. (c) Environmental Disasters.
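The Amber alert scenario described above can be illustrated with a minimal sketch. It is not the Hyrax or Panoptic implementation: the `Photo`, `Sighting`, `search_device` and `amber_alert` names are hypothetical, and a plain string comparison stands in for a real face-recognition model. The point it shows is that each device searches its own local storage and only positive identifications, with their time and location, leave the device.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Photo:
    face_id: str                    # stand-in for the output of a face-recognition model
    timestamp: float                # when the photo was taken
    location: Tuple[float, float]   # (latitude, longitude)


@dataclass
class Sighting:
    device: str
    timestamp: float
    location: Tuple[float, float]


def search_device(device: str, photos: List[Photo], target_face: str) -> List[Sighting]:
    """Runs locally on one device; the photos themselves are never returned."""
    return [Sighting(device, p.timestamp, p.location)
            for p in photos if p.face_id == target_face]


def amber_alert(edge_cloud: Dict[str, List[Photo]], target_face: str) -> List[Sighting]:
    """Gathers the positive identifications from every device, oldest first."""
    sightings: List[Sighting] = []
    for device, photos in edge_cloud.items():
        sightings.extend(search_device(device, photos, target_face))
    return sorted(sightings, key=lambda s: s.timestamp)
```

In a real deployment the loop over devices would be replaced by messages exchanged over the edge cloud; the sequential loop here only keeps the sketch self-contained.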

Based on these motivational examples, for which simple prototypes were implemented from scratch using Android’s API, we hoped to gain an understanding of the technical requisites and issues associated with the main goal of this work: to design and implement a middleware that seamlessly, through a rich API, allows programmers to form and manage mobile edge-clouds and provides basic computational and storage services for crowd-sourcing applications. We had some generic requisites when we started. We envisioned that the middleware should handle the intricacies of D2D communication using a choice of technologies (e.g., WiFi-Direct, Bluetooth). This includes device discovery, network formation and maintenance, and message routing. On top of this communication backbone, several low-level services should be provided to support distributed computation of tasks, shared storage, streaming and publish/subscribe operations on data. A major requisite was that the middleware should be developed without requiring root-level access to the device (a “rooted” device), as such a requirement would render it useless for general use.

1.3 Contributions

The work started with an evaluation of the available wireless technologies and an assessment of their performance with a small prototype of a video dissemination app for crowded venues. Further development of this app and related services allowed us to run the first real world experiments, with devices controlled by volunteers, during a Champions League game whose stream we fed into the system. The experience gained allowed us to design the bottom layers of the middleware, accordingly called “Link” and “Network”. We then moved to refactor the video dissemination application using the middleware as the backbone, dropping the Google API. A full evaluation of the middleware then followed, as did a major test of the app in a real world scenario, “in situ”, during a professional league volleyball game.

The main contributions of this work were published in several conferences and workshops as listed here:

• assessment of the latency and bandwidth provided by available wireless technologies using an app for the context of the crowded venues scenario, towards their use in the middleware (published at [85]);

• design and development of a crowd-sourcing app for video dissemination in crowded venues; assessment of the app’s performance based on a real-world experiment at the faculty and comparison with results obtained in the previous work (published at [93]);

• design and implementation of the link and network layers of the Hyrax middleware (published at [83]);

• integration of the middleware in the crowded venues application; assessment of the app’s performance based on a real world experiment during a professional volleyball league game at Nave Desportiva de Espinho (published at [84]).

Besides this core work, we also provided continuous support for the development of other services on top of the middleware, namely:

• P3Mobile, a distributed computing service (published at [89]);

• Ephesus, a distributed storage system based on DHTs (published at [91]);

• Thyme, a publish/subscribe system (published at [17]);

• Panoptic, a distributed face recognition system (published at [36]).

1.4 Thesis layout

This thesis is organised as follows. Chapter 2 describes the wireless technologies available on the Android operating system (OS) and the network formation and routing algorithms proposed in the literature for mobile ad-hoc networks (MANETs). It also presents a few generic and special purpose middleware for edge-clouds, and some existing applications that already take advantage of device-to-device (D2D) technologies. Chapter 3 presents a general view of the Hyrax architecture as well as a detailed description of each architecture layer. Chapter 4 evaluates the middleware and discusses the results gathered for each of its layers. Chapter 5 covers the work, carried out since the beginning, that led to the Hyrax middleware, and briefly describes the applications that were, or are being, built on top of it. Finally, Chapter 6 presents the conclusions of this work and the next steps.

Chapter 2

State of the art

The growth and proliferation of mobile devices opens the possibility of creating new applications that take advantage of their aggregate capabilities and of data locality. To that end, edge-clouds have been introduced so that nearby devices may accomplish something bigger by cooperating with each other. This notion of proximity arises from the requirement to reduce communication latency by performing some of the work, like storage and processing, at the edge of the network, avoiding the path to the central servers. Such a configuration alleviates the stress on the servers as well as on the infrastructure, which in turn decreases maintenance costs such as hardware and administration teams. Furthermore, mobile edge-clouds can play a major role in scenarios where there is no infrastructure available, using device-to-device (D2D) communication exclusively.

Not only is the number of mobile devices growing, the technology is also improving. Mobile devices are becoming smaller and computationally more powerful, with plenty of storage capacity. Besides the computational resources, these new devices support a diversity of communication technologies such as Bluetooth, Bluetooth Low Energy, WiFi ad-hoc, WiFi-Direct and NFC. These new communication features are essential to create pure ad-hoc networks that bypass traditional infrastructure.

In this chapter we go through the most relevant developments on edge-clouds as well as the technologies and algorithms that support them.

2.1 Networking

WiFi and cellular networks are becoming faster, but at the same time bandwidth demand is increasing even faster because of the rapid growth in users and content. These constraints inevitably lead to the degradation of link quality in dense scenarios. To face such traffic demand, new technologies are required that support reasonable bandwidth over short distances. One potential solution is the use of D2D technologies that use the wireless spectrum more efficiently, enabling parallel communications. In this section, we describe a few D2D technologies and some of the most influential network formation and routing algorithms for mobile ad-hoc networks (MANETs).

2.1.1 Technologies

Currently, mobile devices come with very rich wireless hardware resources. Some of these technologies are very common, like Bluetooth and WiFi, and others are relatively new, such as Bluetooth Low Energy, NFC (Near Field Communication), WiFi-TDLS (Tunnelled Direct Link Setup) and WiFi-Direct. These new technologies are promising since they allow ad-hoc networks to be created dynamically and spectrum utilisation to be optimised. We now briefly discuss these technologies.

Figure 2.1: WiFi-TDLS.

Figure 2.2: WiFi-Direct.

Bluetooth is a well studied and stable wireless technology standard for short range communication (10-30m). It operates on the 2.4-2.485 GHz ISM (industrial, scientific and medical) radio band and continuously jumps between channels in order to listen and communicate (frequency hopping spread spectrum). Older versions use 79 distinct channels, each 1 MHz wide, whereas newer versions use just 40 channels, each 2 MHz wide [11], achieving speeds of 1–3 Mbit/s. The current version is 4.2, and Bluetooth 5 has been announced, which will further increase the speed, range and broadcast capacity [12].
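The two channel plans mentioned above can be made concrete with a short sketch. The channel counts and widths come from the text; the first centre frequency of 2402 MHz follows the Bluetooth specification. The `hop_sequence` helper is purely illustrative: it draws channels from Python's pseudo-random generator and is not the hop selection procedure that Bluetooth radios actually use.

```python
import random


def channel_centres_mhz(count, width_mhz, first_mhz=2402):
    """Centre frequencies (MHz) of `count` channels spaced `width_mhz` apart."""
    return [first_mhz + k * width_mhz for k in range(count)]


# Older versions: 79 channels, 1 MHz wide.
classic = channel_centres_mhz(79, 1)
# Newer versions: 40 channels, 2 MHz wide.
low_energy = channel_centres_mhz(40, 2)


def hop_sequence(channels, hops, seed):
    """Illustrative frequency hopping: a reproducible pseudo-random channel walk.

    Two radios sharing the same seed visit the same channels in the same order,
    which is the property that lets a master and its slaves stay in step.
    """
    rng = random.Random(seed)
    return [rng.choice(channels) for _ in range(hops)]
```

Both plans cover the same span of the ISM band, so fewer but wider channels occupy the same spectrum.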

Bluetooth Low Energy is a power-aware version of Bluetooth that was designed for the Internet of Things (IoT), with a maximum speed of 1 Mbit/s. Essentially, this version consists of beaconing information that is collected by a set of subscribers. Its range and speed are lower than those of the original version but, in exchange, it is far more energy efficient. A coin sized device could last months or even years beaconing information without a recharge [13].

NFC is a communication protocol that enables two devices to communicate when they are really close (≈10cm) [69]. This technology cannot be used to form edge clouds (short range and low data rates); however, it can be useful to complement the edge by adding some location aware features. For instance, it can be used to exchange WiFi credentials or to signal the position of a certain place in a venue.

WiFi is a stable long range (50-100m) wireless technology that allows high speed communication. It operates on the 2.4 and 5 GHz ISM bands and the newer versions of WiFi allow gigabit speeds (802.11ac). The 2.4 GHz ISM band has just 3 non-overlapping channels (20 MHz width), while the 5 GHz ISM band has more channels available, 19 in Europe (20 MHz width). Traditional routers are only able to use one channel at a time (one antenna), but newer ones can use MIMO (multiple-input and multiple-output), thanks to multiple antennas capable of sending and receiving data using multiple channels at the same time [102].

WiFi-Direct is a standard built on top of the WiFi stack that has the ability to create a soft-AP on demand, providing a WiFi signal to other devices in the area (Figure 2.2). Devices in the neighbourhood are thus able to connect to the WiFi network in the usual way. The standard also offers security (WPA2) and essentially allows small WiFi networks to be created dynamically anywhere [103]. Note that this is similar to a traditional mobile access point, but with an extra feature: devices can belong to an ordinary WiFi network while being part of a WiFi-Direct network at the same time.

WiFi-TDLS (802.11z) is a WiFi feature that allows two devices under the same AP to communicate directly with each other, avoiding the path through the AP (Figure 2.1). To use this feature, the two devices negotiate a different channel to which to shift the communication and, if they succeed, all the data exchanged between them is moved to the new channel (the direct link). Otherwise, if the negotiation fails, they continue to communicate in the normal way, through the AP. This feature reduces the traffic handled by the AP and uses the spectrum more efficiently [95]. Note that the devices may negotiate a channel from a different band. For example, a router can operate on the 2.4 GHz ISM band while its clients create direct links using a channel from the 5 GHz ISM band.
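The negotiate-then-fall-back behaviour described for WiFi-TDLS can be sketched as follows. This is a toy model, not the 802.11z protocol: the `Station` type, the `negotiate_direct_channel` and `route` functions, and the rule of preferring a band other than the AP's are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Station:
    name: str
    supports_tdls: bool
    bands_ghz: FrozenSet[float]   # bands the radio can use, e.g. {2.4, 5.0}


def negotiate_direct_channel(a: Station, b: Station, ap_band_ghz: float) -> Optional[float]:
    """Return the band chosen for the direct link, or None if negotiation fails."""
    if not (a.supports_tdls and b.supports_tdls):
        return None
    common = a.bands_ghz & b.bands_ghz
    if not common:
        return None
    # Assumed policy: prefer a band the AP is not using, to spread the load.
    off_ap = common - {ap_band_ghz}
    return max(off_ap) if off_ap else ap_band_ghz


def route(a: Station, b: Station, ap_band_ghz: float) -> List[str]:
    """Path taken by data from a to b: direct link if negotiated, else via the AP."""
    if negotiate_direct_channel(a, b, ap_band_ghz) is None:
        return [a.name, "AP", b.name]
    return [a.name, b.name]
```

With an AP on the 2.4 GHz band, two dual-band stations would end up on a 5 GHz direct link in this model, which mirrors the example given in the text.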

It is expected that, in the near future, new wireless standards (e.g. 5G) will arise that use the spectrum more efficiently and offer more bandwidth, lower energy consumption and, in some cases, higher communication range. At the same time, cellular networks are evolving to provide more bandwidth, less latency and more coverage. To fill the gaps where these technologies do not operate efficiently, and also to create alternatives, D2D technologies will play a fundamental role in forming edge-clouds, where communication is driven by a certain context in order to operate either in dense environments (scalability) or in infrastructureless ones. Some of these technologies are new and must be tested in real environments and improved. Traditional WiFi offers high speeds; however, in dense environments it may perform poorly due to channel management and coverage limitations. The new WiFi extensions, or even Bluetooth, can help reuse the spare channels in wireless networks in order to improve communication performance.

2.1.2 Network formation

The most widely used forms of wireless communication are WiFi (less coverage, higher speeds) and 3G/4G networks (more coverage, lower speeds). These technologies are very similar: specialised static hardware (router, antenna) emits a signal and a group of clients connects to the network when in range, forming a star network. Problems arise in dense scenarios, where a large number of people want to communicate (e.g. a stadium), raising scalability issues, and in scenarios where there is no communication infrastructure. To tackle these niches, new wireless technologies have emerged, as described in Section 2.1.1, and the next challenge is how to take advantage of them to form edge clouds. Given this, new and more dynamic formation algorithms must be employed to meet the new set of requirements. Consequently, in order to form edge clouds, devices that are near each other must be organised somehow using D2D technologies like Bluetooth, WiFi-Direct and WiFi ad-hoc. Note that WiFi-TDLS is a feature and cannot exist without a previously formed WiFi network.

Bluetooth allows devices in the same area to communicate with each other over short range and provides low bandwidth when compared with WiFi. Network formation over Bluetooth is not a new topic and there is, in fact, plenty of research in this field. Bluetooth uses specific topologies, piconets and scatternets, to organise the network. A piconet is a star network in which one node acts as master and the other nodes act as slaves (connected to the master), all sharing the same communication channel. The problem with piconets is that only eight devices can be part of the network. To overcome this limitation, the notion of scatternet was introduced, which interconnects piconets by allowing some nodes (bridge nodes) to belong to more than one piconet at the same time. Piconets can be connected in three ways:

• Link master-to-master: In terms of number of hops this solution is the most efficient, since only one hop is needed to reach another piconet. In contrast, the master nodes that belong to several piconets drain their batteries faster and become the bottleneck of the network, since they need to jump constantly between several channels. This solution is also less tolerant to churn (tree based).

• Link slave-to-slave: This solution is less efficient in terms of number of hops, but allows larger networks to be built and distributes the effort over more nodes (not only the masters). In this case there are fewer nodes jumping between channels, which is good for network performance, and the network is more tolerant to churn (mesh based).

• Link master-to-slave: This solution sits in the middle of the previous two. Its problems are similar to those of the first one; however, it allows larger networks.
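The piconet and scatternet notions above can be illustrated with a minimal sketch. It assumes the eight-device limit stated in the text and treats two piconets as linked whenever they share a bridge node; the names `Piconet`, `bridge_nodes` and `piconet_distance` are hypothetical and are not part of any Bluetooth API.

```python
from collections import deque

MAX_PICONET_SIZE = 8  # limit stated in the text: master plus slaves


class Piconet:
    """A star network: one master and the slaves connected to it."""

    def __init__(self, master):
        self.master = master
        self.slaves = set()

    @property
    def members(self):
        return {self.master} | self.slaves

    def add_slave(self, node):
        if node in self.members:
            return
        if len(self.members) >= MAX_PICONET_SIZE:
            raise ValueError(f"piconet of {self.master} is full")
        self.slaves.add(node)


def bridge_nodes(piconets):
    """Return the nodes that belong to more than one piconet."""
    seen, bridges = set(), set()
    for piconet in piconets:
        bridges |= seen & piconet.members
        seen |= piconet.members
    return bridges


def piconet_distance(piconets, start, goal):
    """Breadth-first search over piconets (by list index).

    Two piconets are adjacent when they share at least one node.
    Returns the number of piconet-to-piconet hops, or None if unreachable.
    """
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        index, distance = queue.popleft()
        if index == goal:
            return distance
        for other, candidate in enumerate(piconets):
            if other not in visited and piconets[index].members & candidate.members:
                visited.add(other)
                queue.append((other, distance + 1))
    return None
```

In this sketch, a scatternet is simply a list of `Piconet` objects; which of the three linking styles is in use depends on whether the shared node is a master or a slave in each piconet it belongs to.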

Regarding the formation of piconets and scatternets, the authors in [64] present some guidelines to be considered and discuss the performance aspects of these types of networks. In the literature there are plenty of formation algorithms, which can be divided into two major groups: tree based and mesh based. Tree-based algorithms produce structured networks, which is good for routing, but they are less tolerant to churn and have some crucial points of failure. Mesh-based algorithms produce less structured networks that adapt easily to changes, but routing becomes more complex [10]. The following list presents the most influential Bluetooth formation algorithms.

• Bluetrees: a tree-based formation algorithm in which each node belongs to at most two piconets at the same time. This restriction reduces the channel switching overhead. Essentially, the algorithm starts with just one node (the blueroot) and the remaining nodes are added to the network sequentially, forming a tree [106];

• Bluenet: a distributed version of Bluetrees in which the formation phase is divided into three main sub-phases: 1) building a visibility graph, i.e. knowledge about the whole network, 2) forming independent piconets and 3) interconnecting the piconets to form a scatternet. The advantage is the creation of balanced networks; however, the algorithm relies on assumptions that are not feasible in real-world scenarios [100];

• Bluestars: similar to the previous one; however, the visibility phase is replaced by alternating between the discovering and discoverable states, which allows networks to be formed with less information [76]. Formation with this algorithm is costly, which compromises its use in dynamic environments;

• Others: the core of the algorithms presented in [57] and [86] is very similar to the ones described above. The major difference is that these newer algorithms include procedures to join and split piconets and to handle node movement and migration. In networks with high churn these procedures can be problematic, because the cost of such changes in energy and time is high. In stable networks, however, they help to minimise the network overhead with respect to scatternet properties. These algorithms follow the guidelines presented in [64].

WiFi-Direct [104] is a D2D technology that allows direct communication between nodes in range. It offers bandwidth similar to that of legacy WiFi networks and costs less in terms of energy consumption. Compared with Bluetooth, the bandwidth is far higher, but it consumes more energy. The technology must therefore be chosen according to the scenario. For example, for a chat application Bluetooth is more than enough, since it does not require much bandwidth, whereas a video sharing application requires WiFi speeds because it is bandwidth demanding. Network formation is simpler than with Bluetooth, since the Android implementation only allows isolated stars to be created. To form a WiFi-Direct network, one device must be elected Group Owner (GO) and the other devices connect to it using either the legacy WiFi interface or the WiFi P2P interface.

The Android implementation of WiFi-Direct does not allow multi-group communication, although in the specification [104] this ability is an optional feature. In the literature there is some research that attempts to implement multi-group communication using the two possible modes of WiFi-Direct: legacy and non-legacy. Note that a client device in non-legacy mode is able to keep a connection to an AP and be part of a WiFi-Direct group at the same time, in contrast to legacy mode. In both cases the GO is always capable of keeping a WiFi connection and participating in the WiFi-Direct group at the same time. Another restriction is that distinct WiFi-Direct groups always use the same network address range (192.168.49.0/24), which is a challenge for communication among groups, e.g. to establish a socket. The following work has addressed this issue:

• The authors in [16, 30] were the first to propose multi-group communication using WiFi-Direct on non-rooted devices. They propose a solution that uses both legacy and non-legacy interfaces to achieve multi-group communication through a relay node. As the devices belong to the same IP network (192.168.49.0/24), they resort to the network broadcast capabilities to provide communication outside the group. Note that in Java developers may select the network interface used to broadcast messages. Since the GO is connected to two networks with the same IP address range, Android devices by default choose the legacy WiFi interface to route the traffic, which makes it impossible, for example, to establish a socket. The intrinsic limitation of this solution is the bandwidth available for communication outside the group, since the maximum broadcast speed allowed is 6Mbps, in contrast to unicast speeds of 54Mbps. Another problem is that a TCP socket cannot be established outside the group;

• The authors in [80] present an early solution in which communication is performed just inside the group, using WiFi AP mode. They could have used WiFi-Direct as the communication layer and the system would have worked without any changes. Next, the authors in [96] propose a new architecture that allows WiFi-Direct groups to communicate with each other using bidirectional links, with no need for broadcast. They do this by using two relay nodes, instead of just one, to connect two WiFi-Direct groups. For example, Relay1 is connected to GO1 using the WiFi P2P interface (non-legacy) and to GO2 using the normal WiFi interface (legacy), while Relay2 is connected to GO1 using the normal interface and to GO2 using the P2P interface;

• The last work, in [37], evaluates the different ways of achieving multi-group communication using WiFi-Direct. The authors propose two alternatives: time sharing and simultaneous connections. In time sharing, one of the devices switches among groups in a time-slicing fashion. The switch is done by disconnecting from one group and connecting to another. The authors evaluated the time required for the switching process and the energy consumption in different scenarios. In the second solution the authors keep the group formed and achieve connectivity with other groups either by using multicast sockets, very similar to the work in [16], or by changing the WiFi-Direct implementation (which requires root) to assign distinct IP address ranges to different groups and to instantiate a specific network interface. They measure the time needed to transfer a certain quantity of data and the energy consumed by the transfer.
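The address-range restriction discussed above can be made concrete with a small sketch based on Python's standard `ipaddress` module. It only shows why two groups using 192.168.49.0/24 collide and how distinct ranges could be handed out; the helper names and the 192.168.0.0/16 pool are assumptions for illustration, not the mechanism used in [37].

```python
import ipaddress
from itertools import islice

# Address range that every WiFi-Direct group uses, as stated in the text.
WIFI_DIRECT_SUBNET = ipaddress.ip_network("192.168.49.0/24")


def groups_conflict(subnet_a, subnet_b):
    """True when two group subnets overlap, so a relay attached to both
    groups cannot tell from an address which group a peer belongs to."""
    return ipaddress.ip_network(subnet_a).overlaps(ipaddress.ip_network(subnet_b))


def assign_group_subnets(n_groups, pool="192.168.0.0/16", prefix=24):
    """Hypothetical helper: give each group its own /24 from a private pool."""
    subnets = ipaddress.ip_network(pool).subnets(new_prefix=prefix)
    return list(islice(subnets, n_groups))
```

With distinct ranges per group, a device attached to two groups sees two different networks and ordinary unicast sockets become possible, which is the motivation for the rooted variant described above.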

There are other D2D technologies available, such as Bluetooth LE (Low Energy) and WiFi ad-hoc. For Bluetooth LE, network formation is much the same as in legacy Bluetooth systems. In fact, this lower power version of Bluetooth has extra features that could optimise network formation, since it allows devices in range (the neighbourhood) to learn something about a device before connecting to it (e.g. its services), which enables decision-making. Moreover, this version of Bluetooth consumes significantly less power than its predecessor, at the cost of bandwidth. Regarding WiFi ad-hoc, there are no network formation issues at this level, since they are strongly coupled with the routing algorithms for MANETs.

2.1.3 Routing

After network formation, messages must be delivered somehow. The next step is therefore to study the mechanisms available to route messages in the network. In the literature there are plenty of routing algorithms for MANETs, and all the existing algorithms assume unrestricted access to low level messages. On mobile devices this is possible only if the user has root permissions. However, as we are developing middleware that should run on non-rooted devices, we cannot rely on such assumptions. This section therefore describes the best known routing algorithms and their categories, in order to build an application level routing algorithm based on them.

Routing algorithms are divided into two major groups: proactive and reactive. In proactive routing, nodes keep a table of routes in which the route to a specific destination can be found. These algorithms exchange periodic messages in order to keep all the routing tables up to date. One advantage of proactive routing algorithms is that the route to a destination node is known in advance, which avoids discovering the route on demand and reduces the number of messages exchanged at that point. On the other hand, keeping the routing tables up to date can be a challenge in large or highly mobile networks, because the number of messages and the latency grow rapidly. The following are the best known proactive routing algorithms:

• Optimised Link State Routing Protocol (OLSR): The OLSR [22] algorithm exchanges Hello messages among direct neighbours, which allows each node to discover its neighbours and information about its two-hop neighbours. It then selects the Multipoint Relays that are best suited to communicate with the two-hop neighbours.

• Destination-Sequenced Distance Vector (DSDV): DSDV [75] is based on the Bellman-Ford algorithm and yields a single best route to each destination. It uses two types of messages: full dump and incremental packets. A full dump packet carries all the available routing information. An incremental packet carries just the updates since the last full dump packet. Incremental packets are exchanged more frequently than full dump packets.
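The table-driven behaviour of DSDV can be sketched as a single update step. The sketch assumes the usual DSDV rule (prefer a newer destination sequence number; on a tie, prefer the smaller hop count) and a hop-count metric; the function name `dsdv_update` and the tuple layout are illustrative choices, not taken from [75].

```python
def dsdv_update(table, neighbour, advertised):
    """Merge a neighbour's advertisement into a routing table.

    table:      dest -> (next_hop, metric, seq) for the local node
    advertised: dest -> (metric, seq) as announced by `neighbour`
    Returns the set of destinations whose entry changed; these are the
    entries that would go into the next incremental packet.
    """
    changed = set()
    for dest, (metric, seq) in advertised.items():
        new_metric = metric + 1  # one extra hop to reach the neighbour
        current = table.get(dest)
        newer = current is not None and seq > current[2]
        shorter = (current is not None and seq == current[2]
                   and new_metric < current[1])
        if current is None or newer or shorter:
            table[dest] = (neighbour, new_metric, seq)
            changed.add(dest)
    return changed
```

A full dump corresponds to advertising the whole table, whereas an incremental packet would carry only the entries returned by the previous update steps.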

The behaviour of reactive routing protocols is quite the opposite, because they search for a route on demand; in other words, they try to find a route only when it is needed. They resort to flooding to find the route to the destination. The main advantage of this kind of algorithm is that it does not need to exchange periodic control messages, because there is no routing table to maintain. One obvious disadvantage is that, to find a route, the network must first be flooded, which, if not done carefully, leads to network clogging and high latencies. The best known reactive algorithms are the following:

• Dynamic Source Routing (DSR): DSR [53] is a reactive routing protocol that requires the addresses of all devices between the source and the destination to be cached in order to route a message. The routing information is thus part of the message header. Whenever a node needs to send a message to a destination, it first sends a RouteRequest packet to find the destination. The destination then answers with a RouteReply packet that carries the addresses of all the devices traversed by the RouteRequest packet. The advantage is that no routing table needs to be kept; the disadvantages are the overhead of finding a new route and the route cache, which can be big in large networks. Additionally, the information can easily become outdated (mobility), which may lead to inconsistencies.

• Ad-hoc On-demand Distance Vector (AODV): AODV [74] is very similar to DSR, the big difference being that AODV does not need the whole path in the RouteRequest or RouteReply, just the source or the destination node. The major advantage is that it adapts easily to highly dynamic networks; on the other hand, it introduces more delay during route discovery.
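The on-demand discovery described for DSR can be sketched as a flood in which every request accumulates the addresses it traverses. This is a simplified, centralised simulation over an adjacency map, with duplicate requests suppressed per node; the function name and data layout are assumptions made for illustration.

```python
from collections import deque


def dsr_route_discovery(links, source, destination):
    """Simulate a RouteRequest flood.

    links: node -> list of neighbours currently in range.
    Each queued item is the path accumulated so far by one RouteRequest.
    Returns the path that a RouteReply would carry back, or None.
    """
    queue = deque([[source]])
    seen = {source}  # nodes that have already forwarded this request
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbour in links.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

The returned list is what a DSR source would place in the message header; an AODV-style variant would instead leave only next-hop state at each intermediate node.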

The aforementioned routing protocols are the most popular, and the reader may find more detailed information about them, as well as performance comparisons, in [1, 51, 62]. Besides proactive and reactive routing protocols, there are protocols that combine the best of both worlds, called hybrid routing protocols. Generally, hybrid routing protocols behave proactively in the close neighbourhood and reactively for more distant nodes. The two best known hybrid routing protocols are:

• Zone Routing Protocol (ZRP): ZRP is divided into two main parts: the Intra-zone Routing Protocol (IARP) and the Inter-zone Routing Protocol (IERP). IARP defines a radius within which the algorithm has a proactive (table-driven) behaviour. When the destination node is outside the zone, ZRP uses IERP, which has a reactive (on-demand) behaviour, to find routes in other routing zones. A more detailed description and the specification can be found in [43].

• Zone-based Hierarchical Link State Routing Protocol (ZHLS): ZHLS is very similar to ZRP; the difference is that a node can belong to only one zone. To establish the notion of zone, ZHLS uses the Global Positioning System (GPS) to define non-overlapping zones. More detailed information about ZHLS can be found in [46, 79].
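The hybrid behaviour of ZRP reduces to one decision: is the destination within the zone radius or not. The following sketch computes a node's zone by hop count and picks the mode accordingly; the default radius of 2 and the helper names are assumptions for illustration only.

```python
from collections import deque


def zone_members(links, node, radius):
    """Nodes reachable from `node` in at most `radius` hops (node -> hops)."""
    hops = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        if hops[current] == radius:
            continue  # do not expand beyond the zone border
        for neighbour in links.get(current, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[current] + 1
                queue.append(neighbour)
    return hops


def choose_mode(links, source, destination, radius=2):
    """Table-driven inside the zone (IARP), on-demand outside it (IERP)."""
    if destination in zone_members(links, source, radius):
        return "IARP"
    return "IERP"
```

A larger radius shifts the protocol towards purely proactive behaviour, while a radius of zero makes every lookup reactive.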

We have seen the major categories of routing protocols: proactive, reactive and hybrid. In the literature there are dozens more routing protocols; however, most of them derive from the ones described above.

Apart from routing protocols, there are other protocols that are useful to exchange messages in a network, namely flooding and gossip protocols. Flooding protocols are frequently used in ad-hoc wireless networks and simply flood the network with packets so that they reach the destination(s). They are really simple to implement and naturally use the shortest path to reach the destination. In contrast, they add duplicated packets to the network, which in turn introduces more latency because of the increased bandwidth requirements. Additionally, packets may live forever if some precautions are not taken [2, 3]. Gossip protocols are another set of communication protocols, which disseminate information across the network in a more controlled fashion. The data is propagated through the nodes like a virus and eventually, with high probability, reaches every node in the network. The advantage is the smaller number of messages exchanged; on the other hand, there is always the risk (with low probability) that the messages do not reach all the desired nodes [4, 58, 98].
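The trade-off between the two dissemination styles can be sketched by counting transmissions. The flooding variant below uses a time-to-live and duplicate suppression as the precautions mentioned above; the gossip variant forwards to a fixed number of randomly chosen neighbours. Both functions, the `fanout` parameter and the injected random generator are illustrative assumptions rather than any specific protocol from [2, 3, 4, 58, 98].

```python
import random
from collections import deque


def flood(links, source, ttl):
    """Flood with a hop limit; each node forwards a message only once.

    Returns (set of nodes reached, number of transmissions).
    """
    reached = {source}
    queue = deque([(source, ttl)])
    sent = 0
    while queue:
        node, hops_left = queue.popleft()
        if hops_left == 0:
            continue  # TTL expired: the packet is dropped here
        for neighbour in links[node]:
            sent += 1
            if neighbour not in reached:
                reached.add(neighbour)
                queue.append((neighbour, hops_left - 1))
    return reached, sent


def gossip(links, source, fanout, rng):
    """Each newly informed node forwards to `fanout` random neighbours.

    `links` must map each node to a list so that rng.sample can be used.
    Returns (set of nodes reached, number of transmissions).
    """
    reached = {source}
    queue = deque([source])
    sent = 0
    while queue:
        node = queue.popleft()
        peers = rng.sample(links[node], min(fanout, len(links[node])))
        for neighbour in peers:
            sent += 1
            if neighbour not in reached:
                reached.add(neighbour)
                queue.append(neighbour)
    return reached, sent
```

For instance, calling `gossip(links, 0, 2, random.Random(1))` on a fully connected network sends at most two messages per informed node, but gives no guarantee that every node is reached.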

Aside from the aforementioned routing protocols, a new set of geographically-aware routing protocols is emerging. To provide some insight, the authors in [90] suggest a geographical routing algorithm for MANETs that uses ad-hoc networks combined with infrastructure access for opportunistic routing. Thus, nodes with infrastructure access (WiFi or 3G/4G) and nodes without any may coexist in order to create a larger network and to take advantage of D2D communication protocols. The nodes are organised into cells, where routing is based on geographical location instead of ordinary network addresses. The routing process attempts to select the best path to a destination based on heuristics such as energy consumption, number of hops or latency. This determines whether the message is routed through the ad-hoc network or over the infrastructure, if available.

2.2 Middleware

Using edge networks to perform computation or to store information, as an alternative to the cloud infrastructure, is a hot topic of research. The work in [68] provides good motivation for this kind of scenario, using people's nearby devices to achieve large-scale distributed computing. In the experiment, the authors took real-world traces of human encounters on a university campus and at a conference. They then analyse the utility of the computation and the task farming efficiency in those scenarios. Additionally, they study the average utility of computation in different periods of the day (1 hour intervals), which leads them to conclude that the amount of achievable computing (parallelism) is far greater during student working hours, when more people gather. Further work in this area can be found in [28], where the authors study the trade-offs between offloading computation to a cloud and to a mobile edge-cloud using the Hyrax framework [61]. It is important to note that clouds work on top of Gigabit networks, while the mobile edge-cloud uses WiFi 802.11n, a Megabit network, as communication relay, which has a big impact on system performance. To compare the trade-offs they use two different scenarios, Panorama Construction and Person Finder. In these scenarios they evaluated the total execution time, the energy consumed and the amount of traffic produced. The authors were able to show that there are applications that could benefit from using mobile edge-clouds, for example Person Finder, provided that the application does not require too much communication among tasks. This previous work suggests that mobile devices have a lot of potential when cooperating.

However, implementing cloud-like services at the edge involves considerable effort, and the implementation is generally not reliable unless a software backbone is provided to the developer: a middleware that handles the low level details of networking, components and consistency. We will now describe some work that has been done towards this goal.

2.2.1 Generic Middlewares

Currently, there are a few middleware implementations available. Most require root privileges to function, while a few work completely at the application level. A generic middleware is a kind of software architecture that can be adapted to new situations and is not tailored to solve a specific problem.

Hyrax

This work [61] proposes a platform derived from Hadoop MapReduce that supports cloud computing capabilities on Android mobile devices. The platform is able to run jobs in heterogeneous networks that join mobile devices and static server nodes, so that applications can be extended to execute on the edge. Hadoop is a software library that allows large data sets to be distributed among a cluster of devices [45]. This software package tolerates hardware failures; however, it was not built for high churn environments, e.g. mobility scenarios. Porting Hadoop to Android required rooting all the mobile devices and some difficult configuration. The port allowed distributed computation to be performed on the mobile devices using WiFi as the communication mechanism. The author showed that the system tolerates node departure and keeps working with reasonable performance.

Fram

The authors in [49] present a content distribution middleware architecture for Android-based mobile devices. The middleware is able to work without infrastructure support, and content dissemination is achieved using WiFi ad-hoc mode (with statically configured IPv4), which requires "root" privileges on the mobile devices. The nodes communicate in a Peer-to-Peer (P2P) fashion using the publish/subscribe message pattern in order to abstract the network formation issues. One interesting feature is the ability to work in intermittent scenarios, such as nodes leaving and joining the network. For example, a download is divided into fixed-size chunks, which allows incomplete downloads to be resumed later. As a proof of concept, the authors implemented several applications to illustrate how the Fram middleware works: Opportunistic Media Blog [50]; Personal profile sharing [56]; Collaborative music sharing [19]; and Participating light show [60].
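The fixed-size chunking that lets Fram resume incomplete downloads can be sketched as follows. The class and function names are hypothetical and the sketch ignores networking entirely; it only shows how tracking missing chunk indices makes a transfer resumable after a peer leaves.

```python
def split_into_chunks(data, chunk_size):
    """Split a byte string into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class ResumableDownload:
    """Tracks which chunks of a download have arrived so far."""

    def __init__(self, total_chunks):
        self.total_chunks = total_chunks
        self.chunks = {}  # chunk index -> payload

    def receive(self, index, payload):
        self.chunks[index] = payload

    def missing(self):
        """Indices still to be requested once a peer is in range again."""
        return [i for i in range(self.total_chunks) if i not in self.chunks]

    def complete(self):
        return not self.missing()

    def assemble(self):
        if not self.complete():
            raise ValueError("download is incomplete")
        return b"".join(self.chunks[i] for i in range(self.total_chunks))
```

When a node rejoins the network, only the indices returned by `missing()` need to be requested again, possibly from a different peer.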

C3PO

The C3PO project proposes a middleware framework [14, 94] that exploits the different wireless interfaces (WiFi, WiFi-Direct, Bluetooth) of non-rooted mobile devices to form a different type of network. It provides multi-hop networks by structuring the devices into a collection of micronets, grouped together in independent macronets. A micronet is essentially a star network (like a Bluetooth piconet) in which one device is responsible for the group and the remaining devices may communicate with each other directly. A macronet is a group of micronets interconnected through gateway devices (devices that belong to more than one micronet), similar to Bluetooth scatternets. As message forwarding mechanisms it uses flooding and gossip-based routing protocols, which are configurable and extensible (custom routing protocols are allowed). Finally, it supports two communication paradigms: publish/subscribe and P2P. In the publish/subscribe model, users may publish multimedia content on a specific topic and subscribe in order to receive content related to a given topic. In the P2P communication model, a channel concept allows users to communicate directly with a specific address.

7DS

The 7DS [73] system was originally intended to extend web browsing and e-mailing on mobile nodes beyond WiFi network coverage. A more recent work [65] extends the original architecture to provide a generic mobile platform for developing disruption-tolerant applications. The 7DS platform is intended to work on intermittent infrastructure WiFi networks (not ad-hoc networks) and provides two main modules: discovery and data sharing. The discovery module is built on top of mDNS [20], which allows the announcement, discovery and resolution of services in close proximity. The data sharing module permits users, when in physical range, to synchronise their shared contents (group of interest) transparently, e.g. a group of users editing one document. Only the content differences are synchronised, using the rsync delta encoding algorithm [97], which is a cost-efficient mechanism to save bandwidth. Applications like Gnutella [54] and BitTorrent [23] provide file sharing in always-connected environments. Similar systems, iClouds [48] and Clique [82], implement file sharing mechanisms, but the authors in [65] claim they are less efficient. There are other systems, like Hayes [47], Proem [55] and Peer2Me [99], that take advantage of Bluetooth to form ad-hoc networks for file sharing; however, they do not provide automatic service discovery and they are also limited by bandwidth.

Multipeer

The Multipeer Connectivity framework [9] is developed by Apple and enables services to be advertised and discovered between nearby iOS devices using different wireless protocols. The wireless protocols that can be used are WiFi, WiFi P2P and Bluetooth. The connected peers are able to transmit messages and files securely, without any web infrastructure. The framework can be divided into three main parts: "advertising & discovering", "session" and "sending & receiving information". In advertising & discovering, the peers need to be aware of the services available, so the latter need to be advertised. A discovery process must then be performed in order to find which services are available. Secondly, in order to communicate among devices, a session needs to be created. Sessions allow identity verification, through X.509 certificates, and encrypted communication, which significantly reduces the transfer rates. Once a session is created, the devices are ready to exchange messages and data. The Multipeer Connectivity framework supports three different forms of data: messages (e.g. short text, serialised objects), streams (e.g. audio, video, real-time data) and resources (e.g. images, movies, documents).

Open Peer

Open Peer [71] is a P2P signalling protocol whose goal is to enable any kind of device to communicate in a P2P fashion. It is implemented in C++ and supports language bindings, such as Objective C and Java, which allows it to run on iOS and Android devices. Open Peer is an open source thin layer that works on top of the WebRTC [101] standard and is able to work in a fully P2P manner, offering security (privacy, identity validation and cryptography), network resilience, scalability and federation. In terms of security, it permits users to exchange information freely, without the firewall restrictions of centralised clouds or data centres. WebRTC is an open project, supported by Google, Mozilla and Opera, that aims to provide a way for browsers and/or mobile applications to communicate directly with each other. For this, the devices must support real-time communication capabilities [101].

P2pKit

P2pkit [72] is a proximity-aware SDK that enables the nearby search of devices and users. Basically, p2pkit seeks to sense which devices are in the neighbourhood and to exchange information with them, even in places without any type of traditional communication infrastructure. It provides features such as fast nearby discovery, range estimation (how close someone is), dynamic content exchange (broadcasting information even in places without connectivity), multi-platform support, background compatibility (it works even when users move their apps out of the foreground) and energy efficiency. P2pkit is a commercial product developed by a Swiss company, Uepaa. It uses Bluetooth low energy to announce services and discover who is nearby, and resorts to WiFi-Direct or ad-hoc to transfer content. It supports iOS and Android.

Nearby

Nearby [42] is an API developed by Google that provides publish and subscribe methods based on proximity. The API is cross-platform and can be used by both Android and iOS devices. It allows real-time connections to be created between nearby devices in order to share messages. Nearby is divided into two distinct APIs: the "nearby messages" API and the "nearby connections" API. The nearby messages API is a publish-subscribe API that allows a set of interconnected devices to exchange small amounts of data. The devices do not need to be in the same network, but they must be connected to the Internet. This API uses a combination of Bluetooth, Bluetooth Low Energy and WiFi to establish communication between devices. The nearby connections API allows apps to discover other devices in the neighbourhood and form a local network, which keeps permanent connections in order to exchange messages in real time. With this API, one device becomes a host and the others must connect to it to form the local network.

AllJoyn

AllJoyn [5] is an open source, network-agnostic framework that allows communication among several devices. It has been developed by a consortium including, e.g., Microsoft1, Qualcomm2, Sony3 and Sharp4, and it is natively implemented in C++, with bindings to other languages such as C, C#, Objective C, Java and JavaScript. Besides that, it runs on different operating systems, such as Windows, Linux, OS X, iOS and Android (>= 2.1). The AllJoyn framework abstracts the way the devices communicate, hiding the complexity of distinct network protocols and hardware and leaving only application details to the users. Communication is performed using two main objects: BusAttachment and BusObject. The BusAttachment is an abstract message bus, while the BusObject represents the messages/data exchanged. The AllJoyn framework is designed to support distinct network technologies and to run on different hardware devices (e.g. smartphones, laptops, desktops). To the best of our knowledge, only WiFi infrastructure is actually implemented, and the peers are organised as a mesh of stars, where the leaf nodes are connected to router nodes and these act as bridges to the other router nodes [6].

1https://www.microsoft.com/pt-pt/

Serval Project

Serval [87] is a project whose objective is to reduce the load on cellular networks by making it possible to perform phone calls using only WiFi infrastructure (without resorting to cellular infrastructure). The project developed a piece of hardware [88], called the Serval Mesh Extender, that aims to enable communication even in areas without any WiFi or cellular infrastructure. The Serval Mesh Extender is a peripheral that transforms a smartphone into an access point, enabling other devices to connect to it. Additionally, this hardware is able to cover a wider range than a common mobile WiFi chip and consumes substantially less energy. Another motivation for building such hardware is that Android does not allow WiFi ad-hoc mode connections without a custom ROM, i.e. without rooting the device. Essentially, the system makes phone calls resorting only to WiFi, forming a mesh network in which the several ad-hoc access points are able to communicate with each other. More information may be found in [39–41, 88].

Open Garden - MeshKit

MeshKit [63] is a proprietary software module that enables P2P communication among nearby mobile nodes. It uses different types of wireless communication, such as Bluetooth, Bluetooth LE and WiFi Direct, to form a mesh network in order to deliver messages or share an Internet connection with peers that are part of the mesh network. Every node in such a mesh network thus serves as a message relay, which allows dynamic networks to be formed that take node mobility into account. The mesh connectivity makes it possible to function in intermittent connectivity scenarios or even in scenarios without any communication infrastructure.

2https://www.qualcomm.com/ 3http://www.sony.com/ 4http://www.sharp-world.com/

2.2.2 Special purpose

Besides the general purpose middleware architectures, there are some more specialised projects that solve very specific problems, like distributed task management or low level services management.

mCrowd

mCrowd [105] is a mobile crowdsourcing platform that uses sensor-rich iOS mobile devices to execute large-scale tasks. The general idea is that users post sensing tasks, which are then picked up by mobile workers who can decide to answer them using their mobile phones. Additionally, mCrowd exploits existing crowdsourcing services like Amazon Mechanical Turk (MTurk) [67] or ChaCha [18]. MTurk and ChaCha are services that harness human intelligence to perform tasks that computers usually cannot do. For example, answering the question "which is the best restaurant in a certain area?" is not an easy task for a computer. The solution is thus to ask that question around through a mobile app, for example using D2D technologies to take advantage of the crowd. The authors claim that mCrowd supports four types of tasks: image tagging; image data collection; textual queries from users; and GPS location-based queries.

Femto Clouds

Habak et al. [44] present the design and architecture of a distributed computing system. Their system uses a central controller (cloudlet) that provides a WiFi signal; mobile devices in range may connect to it and offer their available resources to perform parallel tasks. The core of the FemtoClouds system is the central controller, since it implements a sophisticated scheduling mechanism to assign tasks to the mobile devices and also controls when the devices communicate, in order to avoid the WiFi congestion problem. The scheduling algorithm assigns tasks based on metrics that are dynamically inferred, e.g., heartbeat and WiFi signal strength, and on others provided by user profiles, which dictate rules for joining a FemtoCloud or not, e.g., minimum battery level and percentage of CPU usage. Additionally, the FemtoCloud system is designed with churn handling in mind, dynamically adapting the task assignment as users leave or join the system.

MMPI - Mobile Message Passing Interface

MPI [66] is a programming model used by the High Performance Computing community to run distributed/parallel jobs on a cluster of static machines. With the proliferation of Bluetooth-capable mobile devices, the authors of [25] sought to take advantage of them by providing an MMPI library that can execute MPI jobs on Bluetooth-capable devices such as laptops and smartphones. The authors ran an experiment, a distributed matrix multiplication, on Nokia devices and observed execution speedups. Two years later, the authors [24, 27] showed promising results in computing the Mandelbrot set fractal with MMPI (embarrassingly parallel workloads only), using Bluetooth piconets and Bluetooth scatternets, respectively.

Honeybee

Honeybee [31] is a distributed computing framework that employs “work stealing” to dynamically balance the load of tasks across mobile devices. Those tasks can be either automatic CPU-intensive tasks, e.g., image processing, or tasks that require human aid, e.g., image tagging and surveys. The authors of [32] developed three Android applications on top of the Honeybee framework, using different real mobile devices: distributed face detection; distributed Mandelbrot set; and collaborative photography. The computation is carried out entirely by mobile devices interconnected through Bluetooth. The paper shows that collaborative work may be beneficial, since the tasks do not introduce much communication overhead (dependencies). In [33] the authors conducted a similar experiment using WiFi-Direct for communication among the mobile devices. The results suggest that simply switching the communication technology to WiFi-Direct increases the speedup significantly, since WiFi-based technologies have far more bandwidth than Bluetooth-based ones.

Bonjour

Bonjour Services [8] are a zero-configuration networking technology developed by Apple. A zero-configuration network is one that is formed automatically, without any manual intervention [81]. The implementation is open source and supports OS X, iOS, Linux and Windows. The services work at the IP level and are divided into three main components: “Addressing”, “Naming” and “Service Discovery”. The Addressing component allocates different IP addresses to distinct devices, while Naming creates an alias for each device. Service Discovery works similarly to a publish-subscribe service, in which peers advertise their services so that other peers on the network may listen and connect to them. For the devices in a network to be discovered, they announce themselves to the network as soon as the Bonjour service is turned on. At the same time, other nodes in the network that are already running a Bonjour service periodically ask the network which devices are available. As an optimisation, these requests are sent using exponential back-off and a service announcement optimisation, thus decreasing bandwidth use. Other optimisations are DNS caching and the suppression of duplicate messages.
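The exponential back-off mentioned above can be illustrated with a minimal sketch. It is not taken from the Bonjour implementation: the class name QueryBackoff, the doubling factor and the cap on the interval are assumptions chosen only to show how the gap between successive queries grows, so that fewer requests reach the network over time.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: query intervals that double until they reach a cap. */
final class QueryBackoff {

    /**
     * Returns the waiting time before each of the given number of queries.
     * The doubling factor and the cap are illustrative assumptions.
     */
    static List<Long> intervalsMillis(long initialMillis, long maxMillis, int queries) {
        List<Long> intervals = new ArrayList<>();
        long current = initialMillis;
        for (int i = 0; i < queries; i++) {
            intervals.add(current);
            // Double the interval, but never exceed the configured maximum.
            current = Math.min(current * 2, maxMillis);
        }
        return intervals;
    }
}
```

With an initial interval of one second and a cap of eight seconds, five queries would be spaced 1, 2, 4, 8 and 8 seconds apart.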

2.3 Crowd-sourcing applications

In the literature there are already a few real-world applications that take advantage of device proximity by enabling P2P communication. In this section a few such applications are described.

FireChat

Firechat [35] is a proprietary mobile app, based on Firebase [34] and developed by Open Garden5. The Firechat app offers secure multi-user, multi-room chat with flexible features such as authentication, moderator capabilities, user presence and search, private messaging and chat invitations. The app uses an underlying network abstraction framework called Open Garden (the same name as the company), which allows mesh networks to be formed and devices to communicate without an Internet connection. To achieve this, it uses distinct wireless technologies, e.g., Bluetooth, WiFi and Apple’s Multipeer Connectivity Framework (only supported on Apple devices).

5 https://www.opengarden.com/

ZombieChat

Like Firechat, ZombieChat [107] seeks to build a mesh network that lets users exchange messages without the need for infrastructure. It relies on Bluetooth, WiFi P2P and WiFi to form a mesh network that allows communication within the vicinity and beyond. This app is available for iOS only.

2.4 Discussion

So far, several middleware systems (generic and specific) have been described. This section discusses why we did not choose any of the available options. For the concept of edge clouds to work properly, the formation of ad-hoc networks with dynamic configurations is generally required, e.g., using D2D communication technologies. For that reason we set out to build a middleware that is technology agnostic, i.e., one that allows communication between nearby devices using any of the available technologies. The middleware must allow the creation and interconnection of multiple ad-hoc networks (for larger coverage) and route messages transparently. We identified the following set of requirements that must be addressed:

• Non-rooted: The middleware must work on non-rooted devices, since the target audience is the general public. It is not feasible to oblige users to root their devices, since most of them do not have the knowledge to do so and they may also lose the device’s warranty. Another problem with rooting devices to perform communication is the lack of control over energy consumption, which is a precious resource in mobile environments.

• Distinct wireless support: Every wireless technology has its own limitations, so it is useful to support multiple technologies. Another reason is that WiFi networks are not available everywhere, so other means of communication must be employed. As such, the middleware must support different wireless technologies such as Bluetooth, Bluetooth LE, WiFi, WiFi-Direct and 3G/4G.

• Ad-hoc networking: Pure ad-hoc network connections, i.e., the ability to communicate without the assistance of pre-installed infrastructure. The infrastructure may not be available at a certain time or may be overloaded, in which case alternatives are required for communication.

• App-level Routing: As multiple technologies are being used, every message must be able to traverse distinct network interfaces transparently. To do so, an app-level routing mechanism is needed. Note that the devices are non-rooted, which is another motivation for routing at the app level. For some technologies, like WiFi, the routing can be a mix of network-level and app-level routing, since sockets already allow messages to pass through several nodes.

• Abstract communication: The middleware must be technology agnostic, which means that it does not matter which technology is used to route a specific message. From the developer’s perspective the only interface available is a send/receive of bytes. Another motivation for this abstraction is to hide the complexity of traversing distinct wireless technologies.

• Basic Services: As a general purpose middleware it must provide a few basic services such as computing, storage, publish/subscribe (P/S) and streaming. The services are important since they make it possible to discover peers that are interested in the same subject, e.g., a video. Besides being technology agnostic, the services are a simpler way to listen for changes (P/S) or to distribute work, such as computing and/or storage.

• Open Source: The framework/SDK must be open. It’s easier to expand/change an open source project in order to fix bugs, to adapt to new situations and also to add new features.

Open Garden Serval P. AllJoyn MultiPeer Open Peer P2pKit Nearby Open Source  XX  X  X Non-rooted X  XXXXX Multiple Technologies X  X  XX Ad-hoc Network XX  X  X  Multi Hop Network XXXXXX  App Level Routing X  XX  Abstract Comm  X  Multi platform X  X  XXX Basic Services 

Table 2.1: Frameworks/SDKs available.

Table 2.1 summarises how the available frameworks compare against our requirements. First of all, we eliminate the frameworks that are not open source: Open Garden, MultiPeer and P2pKit. For evaluation purposes it is really important to know what is happening under the hood and to be able to modify it. The Serval Project cannot be used since it does not meet the non-rooted requirement. Other middlewares like Hyrax, Fram, C3PO and 7Ds cannot be used for the same reason. Following the next requirement, multiple technologies, the frameworks AllJoyn and Open Peer are eliminated since they were built to work on top of WiFi networks only. The Nearby framework cannot be used since it needs to keep a WiFi connection (Internet) in order to function.

In order to provide a good middleware to support edge clouds, it is imperative that the aforementioned requirements are met. Otherwise developers lose focus on the application. We believe this is the way to provide a flexible piece of software in which all the complex network and routing details are hidden. Additionally, we want the middleware to fully use the power of D2D technologies, which in turn allows it to take advantage of distributed systems over P2P networks. Edge clouds do not seek to replace cloud computing. However, as the mobile device market grows and these devices become more powerful, they can be a strong ally to perform computation locally and store content in proximity, which consequently alleviates the stress that infrastructures and central servers are under today.

Chapter 3

A Middleware for Edge-Clouds

This chapter presents the design and implementation of the Hyrax middleware for edge-clouds. The middleware is divided into three main layers: the link layer, the network layer and the services layer. Here we focus on the first two where most of our contribution has been made, and leave the services layer for a later chapter.

3.1 Overview

The design of the middleware was constrained by the pre-requisites identified in the previous chapter. Namely, the middleware should:

• not require rooting the devices;

• seamlessly support multiple wireless technologies;

• support the formation of ad-hoc networks using distinct technologies;

• allow the composition of multiple ad-hoc networks;

• seamlessly route messages across complex ad-hoc networks;

• provide simple send/receive semantics, both synchronous and asynchronous;

• provide built-in basic edge-cloud services;

• be open source.


The first item is a stringent constraint and it forced us to design and implement an application-level middleware that sits on top of Android on the devices. Moreover, supporting multiple technologies that access the wireless medium (in the 2.4 and 5 GHz bands), and forming and managing ad-hoc networks of devices using these technologies, requires a relatively sophisticated link layer that shields the programmer from the intricacies of the hardware and offers a rich API for device-to-device connectivity with full control of technology type, channel properties and message semantics. This layer is anchored in Android’s native APIs for each of the wireless technologies, which is the main reason why only versions upwards of Android SDK API 19 (Android 4.4/KitKat) are supported, the APIs of earlier versions being too volatile.

Typically, however, current technologies support only rather small ad-hoc networks. To effectively crowd-source the resources of enough devices - which not only provides more computational/storage muscle but also renders the infrastructure more resilient to device malfunctions, intermittent communication and churn - we need to compose several of these ad-hoc networks into single logical networks. Devices in these large networks are then able to seamlessly communicate point-to-point, the messages being routed automatically both within and across individual networks. Such functionality is naturally built on top of the link layer and provides what we call the network layer.

After we have the ability to form large networks of devices and to perform point-to-point communication between devices therein, we are left with the task of building basic edge-cloud services over these networks. Typical cloud services include computing and storage systems. The rationale for these is obviously not the same as for the corresponding services in traditional clouds. Edge-clouds take advantage of the locality of information to store and disseminate it and to compute over it. Other services, however, are possible in this scenario that are difficult to envision in traditional infrastructures. These include publish/subscribe and streaming services, which are particularly interesting here due to the local availability of information and the relatively high bandwidth and low latencies that can be attained in edge-clouds. Thus we envisioned an extra services layer built on top of the former two.

Finally, the open source pre-requisite had implications in terms of the middleware implementation, forcing us to use only auxiliary tools that are themselves open sourced or, alternatively, implementing our own tools.

Thus, overall, the middleware provides a rich API to the programmer that facilitates the building of edge-clouds, with specific basic services readily available for the development of more sophisticated apps. The programmer retains, however, a lot of control over even the lower levels of the network, e.g., choosing the wireless technologies and their runtime parameters, and choosing a routing algorithm or providing their own. In general, however, the programmer is able to focus on the logic of the application rather than the gritty details below.

Figure 3.1: Hyrax Middleware.

Figure 3.1 shows a high-level view of the final middleware architecture, with the three layers depicted together with some of the functionality they provide. The link layer provides a normalised API to control distinct wireless technologies, e.g., WiFi-Direct and Bluetooth. On top of the link layer, the network layer provides an API to control the network composition and routing. From here, developers may choose their network configurations, after which they are immediately ready to send/receive network messages using a unified communication interface. Finally, the services layer provides building-block services for edge-cloud applications.

3.2 Link Layer

Current Android devices support a panoply of communication technologies. A different library is provided for each technology, which complicates their use since there is no common way to interact with these libraries. For this reason we designed the lowest layer of the middleware, the link layer, to offer a single API to control multiple communication technologies, such as Bluetooth, Bluetooth Low Energy, WiFi, WiFi-Direct and 3G/4G. TDLS (Tunneled Direct Link Setup) for WiFi and WiFi-Direct is also supported.

3.2.1 Architecture

Figure 3.2 shows the internal architecture of the link layer, as built on top of the Android operating system. It is composed of three sub-layers: the Application Programmer Interface, the logic controllers and the translators.

[Figure 3.2 (diagram): the link layer sits on top of the Android OS (WiFi libraries, Bluetooth libraries, others) and the hardware. Its sub-layers, from top to bottom, are the API (Sync API, Async API, Promise API), the Logic Controllers (synchronous and asynchronous controllers; discovery, connection, visibility, acceptance and services modules) and the Translators (WiFi, WiFi-Direct, Bluetooth, Bluetooth LE, 3G/4G).]

Figure 3.2: The link layer internal structure.

The API provides an interface for developers to communicate with the link layer. At this level developers are able to interact with distinct wireless hardware using a common controller that completely hides the differences among the Android libraries. The native Android libraries provide only asynchronous messages. In the link layer, these are multiplexed into three execution models: synchronous, asynchronous and promises (chainable asynchronous calls). In the synchronous model each API call blocks until a result is available, be it success or error. In the asynchronous model API calls return immediately and the result is delivered later to a callback function supplied by the developer. Finally, in the promise API the calls are asynchronous as well, but several asynchronous calls can be chained together, which makes the code more readable and maintainable and avoids the “callback hell” problem [15, 77]. The interaction with the wireless hardware can be unreliable: occasionally an action may take too long to return or may not return at all. Thus, to prevent undesirable situations, every API call has a maximum execution time, and if it does not complete within that interval the call returns a timeout error.
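The timeout rule described above can be sketched with the standard java.util.concurrent classes. This is only an illustration under stated assumptions, not the middleware code: the class TimedCall and the string results are hypothetical, and a CompletableFuture stands in for a pending link layer operation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration of a call that is bounded by a maximum execution time. */
final class TimedCall {

    /** Waits for a pending asynchronous result, but never longer than timeoutMillis. */
    static String await(CompletableFuture<String> pending, long timeoutMillis) {
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The hardware did not answer in time: report a timeout error.
            return "TIMEOUT";
        } catch (ExecutionException e) {
            // The operation itself failed.
            return "ERROR";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "INTERRUPTED";
        }
    }
}
```

A future that is never completed models hardware that does not answer; in that case the caller gets the timeout result after the given interval instead of waiting forever.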

The Logic Controllers are where the abstraction is achieved, by offering logic control over the wireless technologies. Every action performed on the link layer has a logic counterpart, which is later translated into a concrete technology-specific action by the translators. The logic actions are subdivided into five modules: discovery, connection, visibility, acceptance and services. The discovery module is responsible for discovering devices in the neighbourhood, e.g., a WiFi scan to discover reachable access points (AP). After the discovery process a connection phase may be needed. The connection module takes care of establishing a connection between two devices, e.g., a connection to an AP. For devices to announce themselves to other nearby devices, visibility control is required. The visibility module allows the user to control the visibility of a device over time. For a connection to be established, a device must be willing to receive connections from clients. The acceptance module deals with the reception of client connections by implementing a few control rules (if necessary), e.g., a server device can limit the number of clients connected at the same time. Generally, the discovery and connection modules are used by client devices, while the visibility and acceptance modules are used by server devices. Depending on the wireless technology, a device can be both client and server at the same time, e.g., this is a requirement for mesh networks. Finally, the services module, which is embedded in the visibility and discovery modules (as a feature), allows finer-grained control over what is published or discovered, e.g., a server device may publish extra information (a service) so that clients may search for a specific service. Besides the logic control over the actions, the synchronous and asynchronous models are also logic operations on the link layer. For example, synchronous calls are implemented at this level by blocking the current thread until a notification is received.
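Blocking the current thread until a notification arrives can be sketched with a CountDownLatch. The names BlockingBridge and AsyncOperation are hypothetical and the code only illustrates the idea; the actual controller may use a different synchronisation primitive.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical bridge that turns a callback-style operation into a blocking call. */
final class BlockingBridge {

    /** Stand-in for an asynchronous library call that reports its result to a callback. */
    interface AsyncOperation<T> {
        void start(Consumer<T> onResult);
    }

    static <T> T callSync(AsyncOperation<T> operation, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<T> result = new AtomicReference<>();
        operation.start(value -> {
            result.set(value);   // store the notification payload
            done.countDown();    // wake up the blocked caller
        });
        try {
            // Block the calling thread until notified or until the timeout expires.
            if (!done.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("timeout");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return result.get();
    }
}
```

The asynchronous model needs no such bridge, since the callback is simply handed to the developer.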

The Translators are composed of technology-specific classes that translate logic commands into concrete Android library calls. For each wireless technology there is a class that translates all logic actions into technology-specific actions. For example, the WiFi translator receives a logic discovery action, which is redirected to a concrete discovery call in the Android WiFi library. Note that all interactions from here on are asynchronous. The wireless technologies supported are WiFi, WiFi-Direct, Bluetooth, Bluetooth Low Energy and 3G/4G. Note that not all technologies support all the predefined logic actions. In such cases, the technology-specific translators send back appropriate exceptions as results. For instance, the 3G/4G technology does not support discovery, so the 3G/4G translator returns an error when this logic action is called, giving the developer a human-friendly message about why the action failed. Table 3.1 lists the logic actions available on the link layer and shows whether each action is supported (X) or not (-) by each of the wireless technologies.

Action              Bluetooth   Bluetooth LE   WiFi   WiFi-Direct   3G/4G
Hardware On/Off     X           X              X      X             X
Discovery On/Off    X           X              X      X             -
Connect On/Off      X           X              X      X             -
Visibility On/Off   X           X              -      X             -
Accept On/Off       X           X              -      X             -
Service On/Off      -           X              X*     X             -

Table 3.1: Supported Logic Actions per Wireless Technology.

* The mark on the “Service” row and “WiFi” column means that WiFi supports services, but only if the devices are connected to an AP. In WiFi the services are UDP packets broadcast through the network.
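The way a translator rejects an unsupported logic action can be sketched as follows. This is a simplified illustration rather than the Hyrax code: the names TranslatorSketch, Action and Translator are hypothetical, and the unsupported case is modelled with the standard UnsupportedOperationException.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of technology-specific translators. */
final class TranslatorSketch {

    enum Action { HARDWARE, DISCOVERY, CONNECT, VISIBILITY, ACCEPT, SERVICE }

    interface Translator {
        String name();

        Set<Action> supported();

        /** Translates a logic action, or fails with a human-friendly message. */
        default String execute(Action action) {
            if (!supported().contains(action)) {
                throw new UnsupportedOperationException(
                        action + " is not supported by " + name());
            }
            // A real translator would call the Android library here.
            return name() + ":" + action;
        }
    }

    /** 3G/4G can only be switched on and off. */
    static final Translator MOBILE = new Translator() {
        public String name() { return "3G/4G"; }
        public Set<Action> supported() { return EnumSet.of(Action.HARDWARE); }
    };

    /** WiFi-Direct supports every logic action. */
    static final Translator WIFI_DIRECT = new Translator() {
        public String name() { return "WiFi-Direct"; }
        public Set<Action> supported() { return EnumSet.allOf(Action.class); }
    };
}
```

Keeping the check in one place means that every technology reports unsupported actions in the same way, with a message that names both the action and the technology.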

3.2.2 Example

We now present a small example that illustrates the use of the link layer. Although it uses only WiFi-Direct, supporting other wireless technologies involves changing just one line of code for basic use. For more advanced usage the main structure remains the same, but the properties objects associated with technology specific features will change. Listing 3.1 shows an example of a device that becomes a WiFi-Direct group owner. Afterwards, it publishes a service that provides information to the clients.

 1  String SERVICE_NAME = "WD_GROUP";
 2  int PORT = 8080;
 3
 4  LinkPromise wdLink = LinkService.getLinkPromise(Technology.WIFI_DIRECT);
 5  ServerProperties serverProp = wdLink.getFeatures()
 6      .newServerBuilder()
 7      .setSettings(WifiServerSettings.newBuilder()
 8          .enableProxy(true)
 9          .build())
10      .build();
11
12  VisibilityProperties visProp = wdLink.getFeatures()
13      .newVisibilityBuilder()
14      .setAdvertiseId("adv_id")
15      .setAdvertiserSettings(WifiAdvertiseSettings
16          .newBuilder(SERVICE_NAME)
17          .setPort(PORT)
18          .build())
19      .setAdvertiserData(new WifiAdvertiseData() {{
20          addData("KEY_1", "VALUE_1");
21          addData("KEY_2", "VALUE_2");
22      }})
23      .setTimeout(5000)
24      .build();
25
26  wdLink.enable()
27      .done(enabled -> wdLink.accept(serverProp))
28      .done(accept -> wdLink.setVisible(visProp))
29      .done(visible -> Log.d("GroupOwner",
30          "Group creation success."))
31      .fail(error -> Log.e("GroupOwner",
32          "Group creation failure."));

Listing 3.1: WiFi-Direct - Server Side.

In line 1 the service name that will be published to the clients is defined. The service name is just a service identifier used to distinguish different services. Another vital piece of information is the port number on which the server socket listens, in line 2, used to receive client socket connections. In this example the socket port is statically defined; however, it can be assigned dynamically if the user instantiates a server socket with the port argument set to 0 and then reads the port automatically chosen by the operating system. In a production system this is important because another application may already be using the (statically defined) port.
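Dynamic port assignment relies only on the standard java.net.ServerSocket class: binding to port 0 asks the operating system for a free port, which getLocalPort() then reports. A minimal sketch follows; the class DynamicPort and its helper methods are hypothetical names used for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper that lets the operating system choose the listen port. */
final class DynamicPort {

    /** Opens a server socket on a free port chosen by the operating system. */
    static ServerSocket open() {
        try {
            return new ServerSocket(0); // port 0 means "any free port"
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the socket, ignoring errors raised while closing. */
    static void closeQuietly(ServerSocket socket) {
        try {
            socket.close();
        } catch (IOException ignored) {
            // Nothing useful can be done at this point.
        }
    }
}
```

The value returned by getLocalPort() on the open socket is the one that would be passed to setPort in place of the static PORT constant, with the socket kept open so that the port stays reserved while it is being advertised.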

In order to access a specific link layer instance, the server must obtain a handle using the link service. Thus, in line 4, the WiFi-Direct link is obtained by passing an enum value to a factory method, which returns the required link type. The link layer provides three types of links - LinkSync, LinkAsync and LinkPromise. LinkSync executes operations in synchronous mode, while LinkAsync performs the operations asynchronously and the results must be caught by a callback. LinkPromise provides the best of both worlds and allows chaining several asynchronous operations. Next, in lines 5-24, the properties objects required for the WiFi-Direct group owner are created and configured. The server properties, lines 5-10, are used to fine-tune the server behaviour. In this case, the properties tell the server to enable the HTTP proxy so that other devices in its group can have Internet access. Note that the settings in lines 7-9 are optional. Next come the visibility properties, which announce the group owner to other devices and also, in this example, publish a service that provides information. The user may define the service settings. This is where the service name and port come in, lines 15-18, as well as the additional service data, lines 19-22. More properties can be defined: for example, the visibility identifier is set in line 14 and the maximum visibility time in line 23. These properties have default values and their definition is not strictly required. As with the server properties, lines 14-23 could be removed for basic usage, with no service and no extra data.

After the properties are defined, everything is set to call the link operations. This is done next, in lines 26-32, with a chain of asynchronous operations. The methods are executed one by one, in a pipeline, and if any call fails the execution jumps directly to the error function at the end, in lines 31-32. In line 26, the hardware enable operation is triggered. Note that the user may first check whether the hardware is enabled before any other operation; even if the hardware is already enabled, the method enable works as expected. After the hardware is enabled, the function defined in line 27 is called. Inside that function the command to accept connections is triggered. When it is ready, another function, in line 28, is called and triggers the next command, which is the visible operation. If the visible action succeeds then another function is called, in line 29, and from here on the WiFi-Direct group owner is publishing its state to nearby devices.
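The pipeline behaviour, in which a failing step skips the remaining steps and lands in the error function, can be reproduced with the standard CompletableFuture class. The sketch below is an analogy only: ChainSketch and its step method are hypothetical, and CompletableFuture is used in place of the LinkPromise type, whose implementation is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical analogy of the enable/accept/visible chain. */
final class ChainSketch {

    /** Simulates one asynchronous link operation that either succeeds or fails. */
    static CompletableFuture<String> step(String name, boolean succeeds, List<String> log) {
        CompletableFuture<String> future = new CompletableFuture<>();
        if (succeeds) {
            log.add(name);
            future.complete(name);
        } else {
            future.completeExceptionally(new IllegalStateException(name + " failed"));
        }
        return future;
    }

    /** Runs the chain and returns the trace of what was executed. */
    static List<String> run(boolean acceptSucceeds) {
        List<String> log = new ArrayList<>();
        step("enable", true, log)
                .thenCompose(enabled -> step("accept", acceptSucceeds, log))
                .thenCompose(accepting -> step("visible", true, log))
                .thenAccept(visible -> log.add("success"))
                // Reached as soon as any earlier step fails; later steps are skipped.
                .exceptionally(error -> { log.add("failure"); return null; })
                .join();
        return log;
    }
}
```

When the accept step fails, the visible step is never started and the trace contains only the enable step followed by the failure entry.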

Listing 3.2 shows the client side code for this example. In order to join a group, the clients need to search for available group owners publishing a specific service. When that service is found, a client attempts to establish a connection with the corresponding group owner.

 1  String SERVICE_NAME = "WD_GROUP";
 2
 3  LinkPromise wdLink = LinkService.getLinkPromise(Technology.WIFI_DIRECT);
 4  DiscoveryProperties disProp = wdLink.getFeatures()
 5      .newDiscoveryBuilder()
 6      .setScanId("scan_id")
 7      .setScannerFilter(WifiScanFilter
 8          .newBuilder()
 9          .setServiceName(SERVICE_NAME)
10          .setFilter(device -> device
11              .getName()
12              .startsWith("prefix"))
13          .build())
14      .setStopRule(devices -> devices.size() > 1)
15      .setTimeout(5000)
16      .build();
17
18  wdLink.enable()
19      .done(enabled -> wdLink.discover(disProp))
20      .done(devices -> {
21          Device dev = pickOne(devices);
22          Collection sData = dev.getScanData();
23          return wdLink.connect(dev,
24              wdLink.getFeatures()
25                  .newConnectionBuilder(dev)
26                  .setTimeout(10000)
27                  .build());
28      })
29      .done(devConnected -> Log.d("GroupClient", "Device connected."))
30      .fail(error -> Log.e("GroupClient", "Group not connected."));

Listing 3.2: WiFi-Direct - Client Side.

In line 1, the service name is defined in order to filter group owners with this specific service name. Note that this service name is exactly the same as the one defined on the server side. In line 3, the WiFi-Direct instance is obtained. The link layer handles are instantiated just once; however, they can be accessed multiple times from different locations - singleton class - using the link service. Next, in lines 4-16, the discovery properties are set. They define the rules for searching devices and services. By default the discovery process returns every device that it finds. However, filters may be applied in order to reduce the number of discovered devices. For that purpose the scan filter, lines 7-13, is defined. First of all, the search is filtered by the service name, line 9, and the devices that match that criterion are then filtered again by a user-defined function that filters the devices by a name prefix, lines 10-12. In this example this filter does not do much, but it illustrates the expressiveness and flexibility that can be added to the discovery process. Another interesting function is provided in line 14. It tells the discovery process to stop once a certain criterion has been met, in this case when more than one device has been found. This provides an elegant way to stop the discovery earlier than the defined timeout. Next, in line 15, the maximum time that the discovery process will run is defined. Note that this timeout has a default value as well. For basic usage, lines 6-15 are not strictly required, in which case the discovery returns every device that it finds.
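The interplay between the filter and the stop rule can be sketched with plain predicates. The sketch assumes that devices are reported one at a time and are represented only by their names; DiscoverySketch and its discover method are hypothetical and are not part of the link layer API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of a discovery loop with a filter and a stop rule. */
final class DiscoverySketch {

    static List<String> discover(Iterable<String> found,
                                 Predicate<String> filter,
                                 Predicate<List<String>> stopRule) {
        List<String> accepted = new ArrayList<>();
        for (String name : found) {
            // Keep only the devices that pass the user-defined filter.
            if (filter.test(name)) {
                accepted.add(name);
            }
            // Stop early once the stop rule is satisfied.
            if (stopRule.test(accepted)) {
                break;
            }
        }
        return accepted;
    }
}
```

With the filter and stop rule of Listing 3.2, a scan that reports "prefix-a", "other", "prefix-b" and "prefix-c" would stop after "prefix-b", because two matching devices have been found by then.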

Lines 18-30 show the main skeleton of the client side code. As on the server side, the operations are executed one by one as a chain of asynchronous calls. In line 18, the WiFi hardware is enabled, and then the function in line 19 is called and the discovery process starts. After the discovery finishes, the function in line 20 is called with the discovery result - a collection of devices. Then one device must be selected in order to establish the WiFi-Direct link to the group owner, in lines 21-27. Every discovered device may have multiple associated services. In this case, the services can be retrieved as shown in line 22. Recall that the server side defined the server socket port and additional information, which can now be accessed using the method in line 22. Now, to connect to the group owner, the operation “connect” is triggered with the remote device object and some connection properties. An example of such a property is a password, or the timeout, line 26, for the connect operation. After that time an exception is thrown. If the connection succeeds then the function in line 29 is called and from here on the device is connected to a group owner. The client device now has everything it needs to establish a socket with the server in order to communicate. For example, the client may use the port that the server published with its service. Finally, if any operation in the chain fails then the error function is called, in line 30.

The code in Listings 3.1 and 3.2 shows how to form a WiFi-Direct group with nearby devices. It can easily be adapted for Bluetooth by changing only the link instantiation, using Technology.BLUETOOTH instead of Technology.WIFI_DIRECT. Even across distinct technologies the code remains similar, which makes it more readable and easier to maintain. For more advanced usage the only things that change are the properties objects, while the main skeleton remains the same.

3.2.3 API

Figure 3.3 depicts the API accessible to developers as well as the main properties objects used to customise method calls and to package call results. The gateway to the link layer is the interface Link, which allows the programmer to control and query the available wireless technologies in a normalised way. Furthermore, there are three specialised interfaces - LinkPromise, LinkAsync and LinkSync - that extend interface Link. These are essential to define the execution models (synchronous vs. asynchronous); the methods they provide have the same names but distinct signatures. As expected, the requirements to execute synchronous or asynchronous actions are different and the API reflects this. For example, a method of the interface LinkSync returns an object, while the same method in interface LinkAsync returns void and the result is caught in a callback passed as an argument. LinkPromise follows the asynchronous paradigm but allows multiple asynchronous calls to be chained together, leading to a more comprehensible execution flow and error management.

Each Link interface has a few properties objects that allow the customisation of some actions on the link layer. These objects follow the builder design pattern, which creates read-only objects for the sake of stability, preventing unexpected behaviours. They are also very flexible, enabling features to be added or removed without changing the signatures of the main API methods. All the properties objects - ConnectionProperties, DiscoveryProperties, VisibilityProperties and ServerProperties - have generic attributes (e.g., settings, data, filter) so that every technology may be fine-tuned to take advantage of the best features that the hardware provides. For example, a few settings rules can be defined so that Bluetooth LE uses less energy in the discovery and/or visibility processes. The link layer provides two other interfaces - IFilter and IStopRule - that allow higher-level control of the discovery filter mechanism and of how the discovery must stop, respectively. For instance, IFilter can restrict discovery to a nearby device with a specific address, and IStopRule can stop the discovery process as soon as such a device has been found.
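The combination of a read-only properties object with a filter and a stop rule can be sketched as follows. This is an illustration under stated assumptions: the class DiscoveryPropertiesSketch, its builder methods, the default values and the scan helper are hypothetical, and IFilter and IStopRule are reduced to the single method each one shows in Figure 3.3.

```java
import java.util.ArrayList;
import java.util.List;

// Single-method versions of the two interfaces named in the text.
interface IFilter<D> { boolean apply(D device); }
interface IStopRule<D> { boolean stop(D device); }

// Read-only properties object; it can only be created through its builder.
final class DiscoveryPropertiesSketch<D> {
    private final String identifier;
    private final int timeoutMillis;
    private final IFilter<D> filter;
    private final IStopRule<D> stopRule;

    private DiscoveryPropertiesSketch(Builder<D> b) {
        this.identifier = b.identifier;
        this.timeoutMillis = b.timeoutMillis;
        this.filter = b.filter;
        this.stopRule = b.stopRule;
    }

    static <T> Builder<T> newBuilder(String identifier) {
        return new Builder<>(identifier);
    }

    String getIdentifier() { return identifier; }
    int getTimeoutMillis() { return timeoutMillis; }

    // Applies the filter and the stop rule to a sequence of discovered devices.
    List<D> scan(Iterable<D> discovered) {
        List<D> kept = new ArrayList<>();
        for (D device : discovered) {
            if (!filter.apply(device)) continue;   // filtered out
            kept.add(device);
            if (stopRule.stop(device)) break;      // discovery ends early
        }
        return kept;
    }

    static final class Builder<T> {
        private final String identifier;
        private int timeoutMillis = 10_000;        // assumed default
        private IFilter<T> filter = d -> true;     // accept everything by default
        private IStopRule<T> stopRule = d -> false; // never stop early by default

        private Builder(String identifier) { this.identifier = identifier; }

        Builder<T> setTimeout(int millis) { this.timeoutMillis = millis; return this; }
        Builder<T> setFilter(IFilter<T> f) { this.filter = f; return this; }
        Builder<T> setStopRule(IStopRule<T> r) { this.stopRule = r; return this; }

        DiscoveryPropertiesSketch<T> build() { return new DiscoveryPropertiesSketch<>(this); }
    }
}
```

Because new attributes only add setters to the builder, a method such as discover(DiscoveryProperties) would keep its signature when features are added, which is the flexibility argued for above.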

The link layer also provides well-defined events. Thus there are three enumerations - Technology, LinkEvent and EventCategory. The latter two are used to characterise individual events and groups of events, respectively. For example, the events LINK_VISIBILITY_{ON, DONE, OFF} of the enumeration LinkEvent belong to the group LINK_VISIBILITY of the enumeration EventCategory. Enumeration Technology determines the wireless technologies supported by the link layer. Finally, there are two more interfaces - Device and Outcome. Interface Device represents each remote device in an abstract manner, independently of the technology used. Note that objects of this type are not interchangeable between

[Figure 3.3: UML class diagram of the link layer API. The types and members recovered from the diagram are listed below; generic type arguments lost in extraction are omitted.]
«interface» Link: + isEnable(): Boolean; + isDiscovering(String): Boolean; + getScanners(): Collection; + isVisible(String): Boolean; + getAdvertisers(): Collection; + isConnected(Device): Boolean; + getConnected(): Collection; + isAccepting(String): Boolean; + getAccepted(): Collection; + getTechnology(): Technology; + isSupported(): Boolean; + getFeatures(): LinkFeatures; + listen(LinkListener): Void; + listen(EventCategory, LinkListener): Void; + listen(LinkEvent, LinkListener): Void; + listen(LinkEvent, Object, LinkListener): Void; + listenOnce(LinkEvent, Object, LinkListener): Void; + removeListener(LinkListener): Void
«interface» LinkPromise (extends Link): + enable(): Promise; + disable(): Promise; + discover(DiscoveryProperties): Promise; + cancelDiscover(String): Promise; + setVisible(VisibilityProperties): Promise; + cancelVisible(String): Promise; + connect(Device, ConnectionProperties): Promise; + disconnect(Device): Promise; + accept(ServerProperties): Promise; + deny(String): Promise
DiscoveryProperties<S, F>: - identifier: String; - timeout: Integer; - stopAfterTimeoutExpiration: Boolean; - scannerSettings: S; - scannerFilter: F; - stopRule: IStopRule
VisibilityProperties<S, D>: - identifier: String; - timeout: Integer; - stopAfterTimeoutExpiration: Boolean; - advertiserSettings: S; - advertiserData: D
ConnectionProperties<S>: - identifier: String; - password: String; - timeout: Integer; - removeWhenDisconnect: Boolean; - settings: S
ServerProperties<S>: - identifier: String; - name: String; - settings: S
«interface» IStopRule<D>: + stop(D arg): Boolean
«interface» IFilter<D>: + apply(D arg): Boolean
«abstract» ScanData<D>: - originalObject: D; + getOriginalObject(): D
«interface» Device: + getName(): String; + getUniqueAddress(): String; + getRssi(): Integer; + getOriginalObject(): Object; + getScanData(): Collection; + getTechnology(): Technology
«interface» Outcome<D>: + isSuccessfull(): Boolean; + getOutcome(): D; + getError(): LinkException; + ifError(Executable): Void; + ifSuccess(Executable): Void
«interface» Executable<D>: + execute(D arg): Void
«enumeration» Technology: BLUETOOTH, BLUETOOTH_LE, WIFI, WIFI_DIRECT, MOBILE
«enumeration» EventCategory: LINK_DETACH, LINK_HARDWARE, LINK_DISCOVERY, LINK_VISIBILITY, LINK_CONNECTION, LINK_CONNECTION_SERVER
«enumeration» LinkEvent: LINK_DETACH, LINK_HARDWARE_ON, LINK_HARDWARE_OFF, LINK_DISCOVERY_ON, LINK_DISCOVERY_FOUND, LINK_DISCOVERY_DONE, LINK_DISCOVERY_OFF, LINK_VISIBILITY_ON, LINK_VISIBILITY_DONE, LINK_VISIBILITY_OFF, LINK_CONNECTION_NEW, LINK_CONNECTION_LOST, LINK_CONNECTION_SERVER_ON, LINK_CONNECTION_SERVER_INFO, LINK_CONNECTION_SERVER_NEW_CLIENT, LINK_CONNECTION_SERVER_OFF

Figure 3.3: Link Layer API.

technologies, but in some cases they can be converted. For example, a device discovered by the WiFi-Direct technology can be converted to be used as a connection argument on the WiFi technology. The ScanData object represents the services available on a specific remote device. The Outcome interface, in turn, is used to define the result of an action (null objects are not allowed), the motivation being to force developers to always verify the result of an action. It also makes it possible to know exactly what caused an error, and provides some automatic logic, as in methods ifError and ifSuccess, to execute a function based on some generic knowledge of a result. Note that this kind of feature makes most sense when using Java 8 lambdas with functional interfaces, such as the Executable interface.
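The idea behind Outcome can be illustrated with the small sketch below. It is not the middleware's implementation: the class OutcomeSketch and its factory methods are hypothetical, java.util.function.Consumer stands in for the Executable interface, and a plain Exception stands in for LinkException.

```java
import java.util.Objects;
import java.util.function.Consumer;

// A result wrapper that is never null itself and always carries either a value or an error.
final class OutcomeSketch<D> {
    private final D value;
    private final Exception error;

    private OutcomeSketch(D value, Exception error) {
        this.value = value;
        this.error = error;
    }

    static <D> OutcomeSketch<D> success(D value) {
        return new OutcomeSketch<>(Objects.requireNonNull(value, "value"), null);
    }

    static <D> OutcomeSketch<D> failure(Exception error) {
        return new OutcomeSketch<>(null, Objects.requireNonNull(error, "error"));
    }

    boolean isSuccessful() { return error == null; }
    D getOutcome() { return value; }
    Exception getError() { return error; }

    // Runs the action only when the operation succeeded.
    void ifSuccess(Consumer<D> action) {
        if (isSuccessful()) action.accept(value);
    }

    // Runs the action only when the operation failed.
    void ifError(Consumer<Exception> action) {
        if (!isSuccessful()) action.accept(error);
    }
}
```

A caller can then write `outcome.ifSuccess(device -> connect(device))` followed by `outcome.ifError(e -> log(e))`, where connect and log are placeholders for application code, instead of testing for null results.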

[Figure 3.4: flow diagram. API (actions: Hardware, Discover, Visible, Connect, Accept, Service; Listener Registry) -> Logic Commands (perform logic operations, sync or async) and Aggregator (maps and dispatches listeners by category and type) -> Translators (one per technology, e.g., WiFi and Bluetooth; translate to Android library calls) and Receiver worker thread (events converter and notifier; raw events to link layer events) -> Android Libraries.]

Figure 3.4: Link Layer Flow.

3.2.4 Implementation

Figure 3.4 shows a detailed view of the flow within the link layer as calls to the API are made or events are received from Android, below. The top-most part of the figure depicts the actions available to developers, which were already described. Moreover, the API also provides a registry module that allows developers to listen for events triggered by the link layer. We now present a step-by-step description of API calls and event processing by the link layer. Figure 3.4 depicts two distinct action flows: red, for normal API calls, and green, for registering event listeners.

Normal API call. When an action is executed, the API layer selects the logic command that matches that action; every action has a corresponding logic command. At the same time, the action receives as argument a callback function that is registered, by the API layer, in the aggregator module so that the result can be caught later on. Every callback function is released once the result is triggered, which keeps the memory clean and avoids later inconsistencies and unpredictable problems. Next, the logic command instantiates a matching translator. Like the actions of the API layer, every logic command has a corresponding translator for each technology, e.g., the translators for WiFi and Bluetooth are different. Besides that, the logic commands provide extra features that are not available in Android’s libraries, for example, applying user-defined filter and stop rules during the discovery process, in all wireless technologies. The translators interpret the logic commands and map each of them to a technology-specific command, which is executed using Android’s libraries. Most of the interactions with Android’s libraries are asynchronous, thus the result is caught in another phase. Eventually, the result of an action is dispatched and caught by the link layer. The link layer receiver listens for raw Android events and maps them into link layer events. The link layer events provide some extra properties that are essential for the aggregator’s decision process. When the aggregator is notified of a new event it needs to decide which listeners to trigger, for example, dispatching only the listeners registered for visibility-related events. After that, the aggregator notifies the listeners, answering the API action call.

Event Listeners. The API allows developers to program listeners for external events, in this case originating in the Android libraries. The listener registry notifies the aggregator to register a new listener internally. When an event is caught, the aggregator dispatches all listeners that match the event.
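The dispatch step performed by the aggregator can be sketched as below. The enumerations are cut-down, hypothetical versions of EventCategory and LinkEvent, and AggregatorSketch is an assumption about how listeners could be mapped to categories; the text does not describe the actual data structures.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Reduced, illustrative versions of the two enumerations.
enum SketchCategory { HARDWARE, DISCOVERY, VISIBILITY }

enum SketchEvent {
    HARDWARE_ON(SketchCategory.HARDWARE),
    DISCOVERY_FOUND(SketchCategory.DISCOVERY),
    VISIBILITY_ON(SketchCategory.VISIBILITY),
    VISIBILITY_OFF(SketchCategory.VISIBILITY);

    final SketchCategory category;

    SketchEvent(SketchCategory category) { this.category = category; }
}

// Keeps listeners per category and triggers only those matching an incoming event.
final class AggregatorSketch {
    private final Map<SketchCategory, List<Consumer<SketchEvent>>> listeners =
            new EnumMap<>(SketchCategory.class);

    void listen(SketchCategory category, Consumer<SketchEvent> listener) {
        listeners.computeIfAbsent(category, k -> new ArrayList<>()).add(listener);
    }

    // Called once a raw platform event has been converted into a link layer event.
    // Returns the number of listeners that were notified.
    int dispatch(SketchEvent event) {
        List<Consumer<SketchEvent>> matching =
                listeners.getOrDefault(event.category, Collections.emptyList());
        // Iterate over a copy so that a listener may register others while being notified.
        for (Consumer<SketchEvent> listener : new ArrayList<>(matching)) {
            listener.accept(event);
        }
        return matching.size();
    }
}
```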

By default the link layer’s actions are asynchronous, however those actions can also be executed synchronously. This is handled in the logic commands module, which blocks the current thread until a result is available. Besides these API actions, developers may query the link layer about some state information, e.g., the clients connected to the WiFi-Direct networks.

The link layer was implemented using Java for Android and is compatible with a few features of Java 8. We make use of Java 8 lambda expressions, method references and default methods. These features are also available to developers using the API. The link layer supports Android SDK API 19 (Android 4.4/KitKat) up to the current version, API 27 (Android 8.1/Oreo). This represents 96.8% of the Android market share [7]. Besides this, two external libraries were used:

• JDeferred [52] (version 2.0). It adds Promise capability to the link layer by allowing asynchronous calls to be chained;

• Little proxy [59] (version 1.1.0). This is an HTTP proxy that allows WiFi-Direct group owners to share their Internet connections with devices in the group. By default, WiFi-Direct does not provide any Internet connectivity.

This layer was specifically developed for Android, however with some effort it could be migrated to any machine running Java. The implementation of some modules is Android specific, e.g., the translators and the broadcast receiver. Thus, moving this layer to another machine would require a re-implementation of the technology-specific translators, e.g., for WiFi and Bluetooth, and the use of some pure Java-based classes. The logic operations and the API would stay unchanged since they follow standard Java conventions.
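The separation that makes such a migration feasible can be sketched as follows. All names here (RadioSketch, DiscoverTranslatorSketch, DiscoverCommandSketch) are hypothetical, and the translators merely return a description string instead of calling a platform library; the point is only that the command stays untouched while translators are swapped per technology or per platform.

```java
import java.util.EnumMap;
import java.util.Map;

enum RadioSketch { WIFI_DIRECT, BLUETOOTH }

// Platform-specific part: one implementation per technology and per platform.
interface DiscoverTranslatorSketch {
    String startDiscovery();   // returns a description of the platform call, for illustration only
}

// Platform-independent part: would stay unchanged if the layer were ported.
final class DiscoverCommandSketch {
    private final Map<RadioSketch, DiscoverTranslatorSketch> translators =
            new EnumMap<>(RadioSketch.class);

    void register(RadioSketch radio, DiscoverTranslatorSketch translator) {
        translators.put(radio, translator);
    }

    String execute(RadioSketch radio) {
        DiscoverTranslatorSketch translator = translators.get(radio);
        if (translator == null) {
            throw new UnsupportedOperationException("no translator for " + radio);
        }
        return translator.startDiscovery();
    }
}
```

Porting to another platform would then amount to registering a different set of translators, e.g., `command.register(RadioSketch.BLUETOOTH, () -> "desktop bluetooth inquiry")`.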

3.3 Network layer

The link layer allows developers to seamlessly create local ad-hoc networks of devices using distinct wireless technologies. Devices in such groups can then talk to each other, point-to-point by using synchronous or asynchronous send/receive primitives. As remarked above, these technologies support only rather small ad-hoc networks. To effectively crowd-source the resources of enough devices we need to compose more of these ad-hoc networks into single logical networks or overlays. Moreover, devices in these large networks must then be able to seamlessly communicate, point-to-point, the messages being routed automatically both within and across individual networks. This is the rationale for the next software layer of the middleware - the network layer - built on top of the link layer.

Figure 3.5 shows an example of the kind of logical network we envision, composed of devices belonging to distinct ad-hoc networks, each using its own choice of wireless technology. The network layer allows developers to compose existing local networks and build a logical network or overlay that is then provided as a single entity to apps or services. It provides a set of pre-defined network formation and message routing algorithms that can be chosen by the developer; alternatively, the latter can plug their own algorithms into the middleware, allowing customisation for specific service and/or app scenarios. The communication between devices in the logical network is packet-based. Packets move back and forth between devices in a completely transparent way, all the intricacies of low-level messaging and technology hand-shaking being handled by the network or link layers.

Figure 3.5: A Logical Network.

3.3.1 Architecture

Figure 3.6 shows the internal structure of the network layer and its main modules. Starting from bottom to top, the formation module is responsible for establishing the links to the nearby mobile devices, i.e., for making the physical connections. In other words, this module just advertises/discovers devices and establishes connections between them. After the establishment of connections, communication channels are created in order to exchange messages on the network. These channels are managed by the channel registry module, which creates and destroys channels and manages their lifecycle. This module also controls the server channels that are responsible for accepting the client channel connections. These channels are implemented on top of regular Java TCP socket objects, and this abstraction hides the details of message read/write by offering a more flexible way to exchange messages, decoupling the sender and the receiver. The channels were implemented on top of synchronous Java socket objects for simplicity, but it is entirely possible to migrate to the asynchronous Java socket version (java.nio channels), which features considerably less memory usage and less thread context-switching overhead for medium/large networks. Note that UDP sockets are allowed and are particularly useful for service announcement/discovery on WiFi networks. On the communication side, however, it is worth paying the TCP overhead since TCP already solves multiple network problems that are hard to manage and control, e.g., message delivery, congestion control and bandwidth adaptability.
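One common way for a channel to hide message boundaries on top of a byte stream such as a TCP socket is length-prefixed framing. The sketch below shows that technique only as an illustration; we do not claim it is the framing used by the middleware, and the class FramedChannelSketch and its methods are hypothetical. It works on any InputStream/OutputStream pair, including those returned by a java.net.Socket.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Writes and reads whole messages over a plain byte stream.
final class FramedChannelSketch {
    private FramedChannelSketch() { }

    // Frame layout: 4-byte big-endian length followed by the payload bytes.
    static void writeMessage(OutputStream out, byte[] payload) {
        try {
            DataOutputStream data = new DataOutputStream(out);
            data.writeInt(payload.length);
            data.write(payload);
            data.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Blocks until one complete message is available, then returns its payload.
    static byte[] readMessage(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            int length = data.readInt();
            byte[] payload = new byte[length];
            data.readFully(payload);
            return payload;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since the reader always receives complete messages, the code above the channel never has to deal with partial reads, which is one of the details the channel abstraction is meant to hide.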

The next step, after the channel creation, is to resolve the logical address of the end device. At this point, the two devices exchange their logical addresses, making the

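[Figure 3.6: layered block diagram. From top to bottom, the network layer comprises the API (Streams, Packets), Routing, the Logical Address Translator, the Channel Registry and Formation. It sits on top of the link layer (Wifi, Wifi Direct, Bluetooth, Bluetooth LE, 3G/4G), which in turn runs on the Android OS.]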

Figure 3.6: The network layer internal structure.

channel ready to be delivered to the routing algorithm. The logical address is UUID based and uniquely identifies a mobile device independently of the technology being used. Additionally, the network layer supports multiple channels to the same device, each using a distinct technology. When a device has several channels connecting it to another device, the routing algorithm chooses the channel that best fits the app or service requirements to deliver the message, which can even be sent through all channels. For example, for performance the routing will choose a channel on top of WiFi technology - if available. At this level each device only knows the devices it is directly connected to. To learn about devices further away, developers may create their own discovery mechanism on top of the network layer. The routing algorithm is also able to do that, if required. Note that there is no direct connection between logical and physical addresses, because the logical addresses are mapped to channel objects which in turn hide the physical addresses.
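The mapping from one logical address to several channels, and the choice of a preferred channel, can be sketched as follows. The classes are hypothetical, and the preference order encoded in TechSketch (WiFi first) is an assumption made for the example; the text only states that WiFi is preferred for performance when available.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Declared in an assumed order of preference, most preferred first.
enum TechSketch { WIFI, WIFI_DIRECT, BLUETOOTH, BLUETOOTH_LE }

// A channel knows its technology and hides the physical address from upper layers.
final class ChannelSketch {
    final TechSketch technology;
    private final String physicalAddress;

    ChannelSketch(TechSketch technology, String physicalAddress) {
        this.technology = technology;
        this.physicalAddress = physicalAddress;
    }
}

// Maps each logical (UUID) address to every channel that reaches that device.
final class ChannelTableSketch {
    private final Map<UUID, List<ChannelSketch>> channels = new HashMap<>();

    void add(UUID logicalAddress, ChannelSketch channel) {
        channels.computeIfAbsent(logicalAddress, k -> new ArrayList<>()).add(channel);
    }

    // Picks the channel whose technology comes first in the preference order.
    Optional<ChannelSketch> best(UUID logicalAddress) {
        return channels.getOrDefault(logicalAddress, Collections.emptyList()).stream()
                .min(Comparator.comparing((ChannelSketch c) -> c.technology));
    }
}
```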

After the address resolution, the routing module is able to route packets through distinct devices. The routing algorithm is notified by the upper layers that the user wants to send a packet, and then selects a set of channels to route it through. In the current implementation the packets are sent asynchronously, however this is not strictly required - it depends on the routing implementation. In the opposite direction, the routing algorithm receives a packet from the network and then decides whether to deliver it to the upper layer or to route it through other channels. The routing scheme is purely logical and, at this level, there is no knowledge of any physical network address such as an IP or MAC address, only of the logical address. The link establishment is done proactively by the channel registry, when notified by the formation algorithm. Alternatively, a lazier approach would establish a link only when communication is required. For some technologies this would lead to higher latencies, since connections may take a few seconds to establish, e.g., in Bluetooth. The top-most layer, the network layer Application Programmer Interface, provides a simple yet rich interface through which the routing and formation algorithms can be chosen from a set of builtins or defined and plugged into the middleware. Apart from this, the API offers a straightforward send/receive abstraction for messaging. Additionally, the API offers the possibility to listen for network events, e.g., devices joining or leaving the network.

3.3.2 Examples

In this section we present a small code example (Listing 3.3) that uses the API of the network layer. The code shows how to form a logical network from a Bluetooth and a WiFi-Direct local network and then uses the logical network to allow devices therein to send and receive messages.

1 Network network = NetworkService.boot(context, Routing.Type.FLOOD_CONTROL);
2 // listens network events
3 network.addNetworkListener(new NetworkPeerListener() {
4     public void onPeerJoin(@NonNull Address address) {
5         // peer join notification
6     }
7     public void onPeerLeave(@NonNull Address address) {
8         // peer leave notification
9     }
10 });
11 // listen network messages
12 network.addPacketListener((bytes, len, srcAdr, tag) -> {
13     // network messages
14 });
15 network.enableFormation(Formation.Type.WIFI_DIRECT_STAR);
16 network.enableFormation(Formation.Type.BLUETOOTH_MESH_RANDOM,
17     FormationProperties.newBuilder().build());
18 ...
19 // sends message to everyone
20 network.sendToAll("Hello All!".getBytes(), TAG_ALL);
21 // sends message to a specific device
22 network.send("Hello!".getBytes(), destAdr, TAG_ONE);
23 network.send("Hello!".getBytes(), destAdr,
24     RoutingProperties
25         .newBuilder(TAG_ONE)
26         .setMaxNumberOfHops(5).build());
27 ...
28 network.disableFormation(Formation.Type.WIFI_DIRECT_STAR);
29 network.disableFormation(Formation.Type.BLUETOOTH_MESH_RANDOM);
30
31 NetworkService.shutdown();

Listing 3.3: Network Layer API - Example.

Line 1 shows how to instantiate the network class. The instantiation is done using a network service and the arguments are the Android context and the type (enumeration) of the routing algorithm that must be used. Note that only one routing algorithm can be used at a time. Developers can also implement their own routing algorithms and pass the corresponding objects of type Routing as arguments, instead of choosing a builtin routing algorithm. Next, a set of listeners is registered to listen for network changes, lines 3-10, and/or network packets, lines 12-14. The first listener detects the devices that join and leave the network, but only the directly connected ones. The argument of the callback functions is an Address, which is a unique logical representation of a remote device. The second listener listens for the packets that are addressed to the current device, either unicast or broadcast. The arguments of the callback function are: ’bytes’ - the packet’s payload, ’len’ - the size of the packet payload, ’srcAdr’ - the source device’s logical address and ’tag’ - the type of packet. Every packet has a tag; moreover, it is possible to register a packet listener with a specific tag as argument in order to listen only for packets with that tag, ignoring all others.

After the listeners are registered, the code chooses which network formation algorithms to use. Lines 15-17 show an example of how to start the formation process, in which every device will belong to a WiFi-Direct and a Bluetooth network at the same time; the formation processes are, however, independent. For example, a device can have three connections using WiFi-Direct and seven using Bluetooth, with completely different neighbours. Every formation algorithm has default formation properties, but these can be fine-tuned at runtime, line 17. If fine-tuning is not enough, the developer can build their own formation algorithm of type Formation and pass it as an argument instead of the formation type. After the network initialisation everything is set for packets to flow. Lines 20-26 depict a few examples of how to send a packet. There are two send operations: sendToAll, for packets to be sent to everyone, and send, for packets addressed to a specific device. In line 20 the device sends a packet to everyone else in the network, while in the following lines, 22-26, it sends a packet to a specific destination device. The developer may limit the number of hops that a packet can travel, line 26. The final lines, 28-31, show how to stop the network and free all the resources used, for example the network layer’s internal data structures.

3.3.3 API

In the previous sections we presented the architecture of the network layer, its software modules and their interactions when the API is invoked. We now describe in more detail the main abstractions provided by the network API as shown in Figure 3.7.

[Figure 3.7: UML class diagram of the network layer API. The types and members recovered from the diagram are listed below; generic type arguments lost in extraction are omitted.]
Network: - netRegistry: NetworkRegistry; - routing: Routing; - channelRegistry: ChannelRegistry; - schExecService: ScheduledExecutorService; # start(): Void; # stop(): Void; + enableFormation(Formation, FormationProperties): Void; + disableFormation(Formation): Void; + isOnline(): Boolean; + send(Address, Byte[], RoutingProperties): Void; + sendToAll(Byte[], RoutingProperties): Void; + addNetworkListener(NetworkPeerListener): Void; + addPacketListener(Integer, NetworkPacketListener): Void
«interface» Formation: + getName(): String; + getType(): Type; + setFormationProperties(FormationProperties): Void; + setChannelRegistry(ChannelRegistry): Void; + setExecutorService(ScheduledExecutorService): Void; + start(): Void; + pause(): Void; + resume(): Void; + stop(): Void; + isRunning(): Boolean
«enumeration» Formation.Type: BLUETOOTH_STAR, BLUETOOTH_MESH_RANDOM, WIFI_DIRECT_STAR, WIFI_DIRECT_LEGACY_STAR, WIFI_MESH_RANDOM
FormationProperties: - timeDiscovery: Limits; - timeVisible: Limits; - connection: Limits; - acceptProbability: Double; - idleTime: Integer; - enableHardwareOnStart: Boolean
Limits<T>: + min: T; + max: T
«interface» Routing: + getName(): String; + getType(): Type; + setNetworkRegistry(NetworkRegistry): Void; + onPeerJoin(Address, Channel): Void; + onPeerLeave(Address, Channel): Void; + onPeerMessage(Address, Channel, Message.Packet): Void; + send(Address, Byte[], RoutingProperties): Void; + sendToAll(Byte[], RoutingProperties): Void; + destroy(): Void
«enumeration» Routing.Type: FLOOD, FLOOD_CONTROL
RoutingProperties: - maxHops: Integer; - tag: Byte
«interface» Channel: + start(): Void; + stop(): Void; + isRunning(): Boolean; + getAddress(): String; + getTechnology(): Technology; + writeMessage(Message.Packet): Boolean; + setStatusListener(StatusListener): Void; + setMessageListener(MessageListener): Void
«interface» NetworkPacketListener: + onNewPacket(Address, Byte, Byte[]): Void
«interface» NetworkPeerListener: + onPeerJoin(Address): Void; + onPeerLeave(Address): Void

Figure 3.7: Network Layer API.

The main object of the network layer is the singleton class Network, which only permits one instantiation of the network layer. Each edge application or service must instantiate its own, unique, network and link layer objects. If this is not done, they will eventually end up managing the same physical network concurrently, e.g., two apps trying to form a WiFi-Direct group at the same time. With the physical network established, each edge application or service can have its own logical network, since logical links do not interfere with physical ones. Through the class Network, developers may define the algorithms required for their applications as well as send messages to and receive messages from devices. Every algorithm and message instance has a set of configurable properties in order to meet the developers’ requirements. For message exchange the address used is the logical address generated by the network layer. With it, the messages from a device can be sent to another one (multi-hop), regardless of the technology, e.g., a message can traverse a WiFi, a Bluetooth and a WiFi-Direct network transparently. Developers may also listen for messages by registering an implementation of the interface NetworkPacketListener, or listen for devices joining/leaving the network through the interface NetworkPeerListener.

To create a new routing algorithm the interface Routing must be implemented. This interface is very simple: it essentially listens to network channels and receives the calls to send messages. In this sense, each routing algorithm manages the directly connected channels and decides over which of them a message will be sent. In the opposite direction, it receives a message from a channel and decides whether to deliver the message (triggering a message event) and/or to forward it to other channels. To send a message, the RoutingProperties must be configured in order to define a tag, i.e., a message classification, and the maximum number of hops, so that the message will not live forever in the network. The network layer already has a few simple implementations of routing algorithms, which can be instantiated through the enumeration Routing.Type. Not every technology can use the Java Socket class to transfer data, e.g., Bluetooth has its own socket implementation. To overcome these distinctions the interface Channel was created so that data can be exchanged independently of the technology. Thus, the interface Channel provides a technology-agnostic and simpler way to send/receive messages over a specific channel. Note that message reception is asynchronous, so a MessageListener interface must be implemented in order to listen for messages.

Not only routing algorithms can be implemented; formation algorithms can be as well, using the interface Formation. Implementing a new formation algorithm is not as straightforward, because developers must be familiar with the link layer API and there are a few considerations to take care of in order for the new algorithm to be correctly integrated in the network layer. For example, when a formation algorithm is running and a device joins or leaves the network, the correct data structures must be notified. All formation algorithms share a common ChannelRegistry and an ExecutorService. The ChannelRegistry manages channel creation/destruction, while the ExecutorService is used to execute the asynchronous formation actions. Before starting a formation algorithm, the developer may tune its actions by setting the FormationProperties, which define some basic behaviours such as the time interval to perform discovery or the maximum time that a new connection may take. Some of the properties are defined as Limits, with a minimum and a maximum, the aim being to select a random number within that interval. As with the routing algorithms, the network layer comes with a set of builtin formation algorithms that can be instantiated through the enumeration Formation.Type.
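The role of Limits can be illustrated with the sketch below, which draws a random value from a closed interval. It is a simplification under stated assumptions: the class LimitsSketch is hypothetical, it is fixed to integers rather than generic as in Figure 3.7, and the inclusive bounds and uniform distribution are our choices for the example.

```java
import java.util.Random;

// A closed integer interval from which random values are drawn.
final class LimitsSketch {
    final int min;
    final int max;

    LimitsSketch(int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        this.min = min;
        this.max = max;
    }

    // Returns a value chosen uniformly from [min, max], both ends included.
    int pick(Random random) {
        return min + random.nextInt(max - min + 1);
    }
}
```

Drawing, for instance, the discovery time of each device from such an interval would keep nearby devices from all scanning in lockstep; this motivation is our reading and is not stated in the text.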

3.3.4 Implementation

[Figure 3.8: flow diagram. API (Enable/Disable, Start/Stop Formation, Send/Receive, Listener Registry) -> Routing (sender queue, receiver queue), Net Registry (peer listeners, packet listeners) and Channel Manager (channels 1..n, address translator, socket) -> Formation worker thread (formation algorithms 1..n) -> Link Layer.]

Figure 3.8: Network Layer Flow.

Figure 3.8 shows a more detailed view of the network layer’s modules and their interactions. The top-most part of the figure shows the actions available to developers: enable/disable the network layer, start/stop a formation algorithm, send/receive messages and register listeners for network events.

The routing algorithm is specified during the network handle instantiation, implying that only one routing algorithm will be running for each network object. In contrast, the formation module allows multiple instances of network topologies to run concurrently. For instance, the network layer can have Bluetooth and WiFi-Direct logical networks running at the same time on the same physical devices. In the current implementation each formation algorithm is completely isolated from the others. Once more, at the routing level it does not matter which technology is being used, as the channels are completely decoupled from the technology.

The network layer provides two ways of sending messages: send to a specific device or send to all devices (broadcast). Both are asynchronous. At the time of writing, the only communication method available is packet-based; in the near future a stream-based approach is to be added, which will be very useful, e.g., for content transfer. This layer allows a listener to receive all messages, or to receive only messages filtered by metadata.

The interactions within the network layer are divided into four main flows: initialisation (red), formation (black), routing (blue) and listeners (green).

Initialisation. The network layer’s initialisation allocates all the internal data structures required for it to function. The main argument is the routing algorithm, which stays the same during the active period of the network layer. To change the routing algorithm the network layer must be stopped and started with a new one. The network layer has a few predefined routing algorithms, however customised algorithms are allowed. The link layer is also initialised at this point.

Formation. At this point the developer chooses which type(s) of algorithm(s) to use for the network formation process. Note that multiple algorithms, running concurrently, are allowed and can be stopped and started dynamically at runtime. The formation module is the entity that interacts with the link layer in order to build a specific network topology. When a new device is found and a connection to it is available, the channel manager is notified in order to construct a socket object through which send/receive will be performed. Also at this point, the channel manager must resolve the logical address of the new device by exchanging special packets between the two end devices. After the address is translated, a new channel is added to the current set. The routing algorithm and the network registry are also notified of the new device. Last, the network registry notifies the developer of the new peer as well. Note that developers may implement their own formation algorithms and use them with the middleware.

Routing. When a packet is sent, the API layer notifies the send queue of the routing algorithm of the new packet. The packet is added to the queue and is sent in its turn. Sending is asynchronous: once the packet enters the queue, the program may continue with other work. When a packet is to be sent, the routing algorithm picks a set of channels to write that packet to. Conversely, when a packet arrives, the channel objects notify the receiver queue of the newly received packet. At this point, the routing algorithm must choose between two paths: deliver the packet to the program through the network registry, or re-route the packet (specifically when it is not addressed to the device). There is one exception, the “send to all” packets: these are both delivered to the program and re-routed.
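The receive-side decision described above (deliver, re-route, or both) can be sketched as follows. This is an illustration and not the builtin FLOOD or FLOOD_CONTROL code, whose details are not given in this section. The classes are hypothetical, and three choices are assumptions of the sketch: a null destination stands for “send to all”, duplicates are recognised by source and message identifier, and a packet is forwarded only while it has more than one hop left.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// Minimal packet header: who sent it, who it is for, its identifier and hop budget.
final class PacketSketch {
    final UUID source;
    final UUID destination;   // null stands for "send to all" in this sketch
    final int id;
    final int hopsLeft;

    PacketSketch(UUID source, UUID destination, int id, int hopsLeft) {
        this.source = source;
        this.destination = destination;
        this.id = id;
        this.hopsLeft = hopsLeft;
    }

    PacketSketch oneHopLess() {
        return new PacketSketch(source, destination, id, hopsLeft - 1);
    }
}

// Decides, for each incoming packet, whether to deliver it, forward it, or both.
final class FloodRouterSketch {
    private final UUID self;
    private final Set<String> seen = new HashSet<>();
    final List<PacketSketch> delivered = new ArrayList<>();  // handed to the program
    final List<PacketSketch> forwarded = new ArrayList<>();  // written to other channels

    FloodRouterSketch(UUID self) {
        this.self = self;
    }

    void onPacket(PacketSketch packet) {
        // Drop packets that were already processed (assumed duplicate suppression).
        if (!seen.add(packet.source + ":" + packet.id)) return;

        boolean toAll = packet.destination == null;
        boolean toMe = self.equals(packet.destination);

        if (toAll || toMe) delivered.add(packet);
        // Packets for other devices and "send to all" packets keep travelling
        // while their hop budget allows it.
        if (!toMe && packet.hopsLeft > 1) forwarded.add(packet.oneHopLess());
    }
}
```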

Event listening. This is similar to the listeners in the link layer. It allows developers to register listeners that passively listen for events from the Android libraries (not caught by the link layer) or from the link layer, e.g., new/lost connections to devices.

These operations can be invoked and fine tuned by making calls to the network layer API. Their complexity is largely contained given the fact that they are anchored in the link layer which abstracts away many low level gritty details.

The network layer was implemented in Java for Android and makes use of Java 8 features. The API is very straightforward to use (cf. Section 3.3.2 for an example) and adds no further restrictions to the Android versions supported relative to the link layer (Android 4.4/KitKat to Android 8.1/Oreo). Besides the Java 8 features, one extra external library was used:

• Protocol Buffers [78], developed by Google, a language-neutral, bandwidth-conservative mechanism for efficient (de)serialisation of structured data.

Thus, as with the link layer, this layer could easily be migrated to any machine running Java, since it uses standard Java libraries and rarely uses Android-specific ones, such as the Android application context.

Message structure

The network layer API provides a simple send/receive abstraction for sending messages as sequences of packets. The internals of this process, hidden from the developer, are however a bit more complex. Each packet requires a header in order to be routed through other devices. We used Google's Protocol Buffers to define the packet headers. This meta-language allowed us to define a generic message structure and provided very efficient (de)serialisation algorithms for the messages.

Listing 3.4: Protocol Buffer Message

syntax = "proto3";

option java_package = "org.hyrax.network.routing.packet";
option java_outer_classname = "Packet";

// message header
message Header {
  // mandatory field tag - tags 0 to 128 are reserved.
  uint32 tag = 1;
  // mandatory field for routing - message identifier
  uint32 id = 2;
  // body length - follows a byte array of size 'body_length'
  uint32 body_length = 3;
  // mandatory field for routing - source uuid address
  Address source = 4;
  // mandatory field for routing - destination uuid address
  Address destination = 5;
  // maximum hops allowed
  uint32 max_hops = 6;
}

// logic address representation UUID.
message Address {
  // the most significant bits
  int64 mostSigBits = 1;
  // the least significant bits
  int64 leastSigBits = 2;
}

// neighbors table exchange
message Table {
  repeated Address neighbors = 1;
}

Listing 3.4 shows the Protocol Buffer specification for the message structure used in the network layer. Note that the specification only defines a packet header; the payload is formed by a sequence of bytes that follows the header and is not part of the structure. The structure is composed of three main objects that support routing operations. The first object is Header, which represents the packet header itself. The header includes the following properties:

• tag - the packet's classification. The tag field is defined by the developers; however, a few numbers (0-128) are reserved for routing purposes, for example to exchange routing tables or for heartbeat packets.

• id - the packet's identifier. This identifier is generated and controlled by the routing algorithm and is used to detect duplicated packets.

• body_length - the packet's payload size. With this, the socket that is reading the packet knows exactly how much data to read. Thus, different types of content can be exchanged on the same stream at the same time, e.g., a chunk of video and a control message.

• source - the logical address of the sending device. The source address is recorded in the packet header in case the destination wishes to reply. It is also used to detect duplicated packets: a duplicate is identified by the combination of the packet's id and its source address.

• destination - the logical address of the receiving device. The destination address is required in order to reach a specific device. Note that when a developer wishes to send a packet to everyone, the destination address is set to a known static flood address.

• max_hops - the maximum number of hops a packet can perform before being discarded. This field is used to control the reachable network radius, for example to send a packet to every device that is at most two hops away.
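To illustrate why the body_length field lets different kinds of content share one stream, the following sketch frames two payloads back to back and reads them apart again. It is a simplification under stated assumptions: the header here is just two fixed 32-bit integers (tag and body length) written with java.nio.ByteBuffer, not the Protocol Buffers encoding of Listing 3.4, and FramingSketch is a hypothetical name.

```java
import java.nio.ByteBuffer;

/** Hypothetical length-prefixed framing; the real header is a Protocol Buffers message. */
final class FramingSketch {
    static final class Frame {
        final int tag;
        final byte[] body;

        Frame(int tag, byte[] body) {
            this.tag = tag;
            this.body = body;
        }
    }

    /** Writes a simplified header (tag, body length) followed by the raw body bytes. */
    static void writeFrame(ByteBuffer out, int tag, byte[] body) {
        out.putInt(tag);
        out.putInt(body.length);
        out.put(body);
    }

    /** Reads the header first, then exactly as many body bytes as the header announces. */
    static Frame readFrame(ByteBuffer in) {
        int tag = in.getInt();
        byte[] body = new byte[in.getInt()];
        in.get(body);
        return new Frame(tag, body);
    }
}
```

Because the reader consumes exactly the announced number of bytes, the next header always starts at a known position, whatever the previous payload contained.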

The second object is Address and represents a logical address. This is just a Universally Unique Identifier (UUID), a 128-bit number that can be (de)serialised using only two long numbers, representing the most and least significant bits.
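In Java this mapping is directly supported by java.util.UUID, as the short sketch below shows. The class AddressSketch and its method names are illustrative only; the two values correspond to the mostSigBits and leastSigBits fields of Listing 3.4.

```java
import java.util.UUID;

/** Illustrative conversion between a UUID and the two 64-bit halves of an Address. */
final class AddressSketch {
    /** Splits a UUID into {mostSigBits, leastSigBits}. */
    static long[] toLongs(UUID id) {
        return new long[] { id.getMostSignificantBits(), id.getLeastSignificantBits() };
    }

    /** Rebuilds the UUID from its two halves. */
    static UUID fromLongs(long mostSigBits, long leastSigBits) {
        return new UUID(mostSigBits, leastSigBits);
    }
}
```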

Finally, object Table is used to exchange routing tables, a procedure that is useful for routing optimisations that save bandwidth and energy. For more complex routing algorithms the message structure may require more information. These changes will not break the network layer's API and, thus, do not impact applications or services built upon it.

Network Formation Algorithms

The formation module (cf. Figure 3.6) is the base of the network layer, with the important role of setting up networks. In general a network formation algorithm finds devices in the vicinity and establishes connections with them according to a set of underlying rules. The middleware ships with two built-in network formation algorithms that produce star and mesh networks. Which network to use depends, of course, on the application or service scenario being implemented.

Star networks are hierarchical networks in which one device assumes the role of a master and the others connect to it as clients (Figure 3.9). The main advantage lies in their simplicity. Once a master is selected, routing between devices is trivial since it consists mostly of two-hop paths with the master serving as a relay. The main disadvantage comes from the single point of failure - the master. For example, if the master leaves the network all its clients will have to reorganise and a new master must be selected. This can severely disrupt an application in more dynamic environments. Also, most of the effort to keep the group together falls on the master device.

Mesh networks, on the other hand, have no imposed connection structure (Figure 3.10). They are simple to form and churn is usually less disruptive since there is no single point of failure, all devices assuming equal importance. Routing of messages, however, is typically considerably more difficult than in star networks, given the more variable network topology.

Algorithm 1 shows the procedure for star network formation. It comprises both the leader election phase (master) and the connection phase (clients). The advantage of using a normalised link layer interface is that one network algorithm implementation can be used with multiple technologies, differing only in a few technical details. In general the arguments for network formation are the lower/upper discovery limits, the lower/upper visibility limits, the connection timeout and the probability of becoming group owner (line 1). At the beginning the role of each device is UNDEFINED (line 2), and the dis_attempt variable is initialised to zero (line 3), in order to oblige all devices that enter the network for the first time to search for master devices. Lines 4-26 show the main skeleton of the star network formation. Thus, the first time the algorithm is executed, and every time the role is UNDEFINED, the code in lines 6-7 is performed. That code calls the function that decides the next role for the device, using a few rules. Note that the formation process is completely blind: the devices know nothing about the other devices on the network. However, the detection of a device during the discovery process means that it is a master device and that it is willing to accept more clients.

Algorithm 1 Star Formation Algorithm
 1: procedure star_formation(dis, vis, conn_time, m_prob)
 2:   role ← UNDEFINED
 3:   dis_attempt ← 0
 4:   while Running() do
 5:     switch role do
 6:       case UNDEFINED
 7:         role ← next_state(m_prob, dis_attempt)
 8:       case MASTER
 9:         time ← random_int(vis.min, vis.max)
10:         status ← accept_and_visible(time)
11:         dis_attempt ← 1
12:         when status is SUCCESS do
13:           wait_for_event()
14:         otherwise do
15:           role ← UNDEFINED
16:           sleep_and_recover()
17:       case SLAVE
18:         time ← random_int(dis.min, dis.max)
19:         status ← discover_and_connect(time, conn_time)
20:         when status is SUCCESS do
21:           wait_for_event()
22:         otherwise do
23:           dis_attempt ← dis_attempt + 1
24:           role ← UNDEFINED
25:           sleep_and_recover()
26:   end while
27: end procedure
28: procedure next_state(m_prob, dis_attempt)
29:   rand ← random()
30:   th ← get_threshold(m_prob, dis_attempt)
31:   when rand > th do return SLAVE
32:   otherwise do return MASTER
33: end procedure

Figure 3.9: Star Network. Figure 3.10: Mesh Network.

Lines 8-16 show the code executed by the master devices. If a device is already a MASTER then it does nothing and jumps directly to line 13; otherwise it executes the instructions. The first instruction, in line 9, generates the time during which the device will be visible, and the next instruction tells the link layer to accept connections and to be visible to other devices in the neighbourhood. If the instruction is performed successfully then a sleep state follows, line 13; otherwise the role is reset and the algorithm restarted, in lines 15-16. The function WAIT_FOR_EVENT, on the master side, tells the device to wait for the "visibility off" and "connection lost" events. If the master receives one of those events, it becomes visible again if it does not have enough clients; otherwise it remains in the sleep state.

Lines 17-25 show the code executed by the client devices. If the device is already a SLAVE, then it does nothing and jumps directly to line 21; otherwise it executes the instructions. The instruction in line 18 generates the time during which the device will be in discovery mode. The next instruction performs a discovery and then connects to a master device. If it succeeds, it jumps to line 21; otherwise the role is reset and the algorithm restarted, lines 23-25. The function WAIT_FOR_EVENT, on the client side, tells the device to wait for the "connection lost" event. If the client receives that event, the device restarts the algorithm. In case of failure the dis_attempt variable, line 23, is incremented. This value is used to dynamically increase or decrease the probability of becoming a master. The function SLEEP_AND_RECOVER, on both sides, master and client, sleeps for a specific time and then restarts the algorithm.

Lines 28-33 show the code for the role decision function, the one that decides whether the device becomes a master or a client. Given a probability of becoming a master and a number of discovery attempts, it returns the next role, master or client. With these rules, if multiple devices join the network at the same time they will, most probably, have distinct roles, making network formation feasible without any prior knowledge. Every time the discovery and connection process fails, the probability of the device becoming a master increases, line 23. Note that during a given period of time a device has just one role, client (discovering) or master (visible). Although most of the technologies support both states, discovering and visible, at the same time, the idea behind keeping them separate is to avoid connection failures, which cost a lot of time, and also to form networks that are as connected as possible: fewer star networks, each with the maximum number of clients allowed by the technology.
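The role decision of lines 28-33 can be sketched in Java as follows. The text does not define get_threshold, so the formula used here is purely an assumption chosen to match the described behaviour: the threshold is zero on the very first attempt (forcing a search for masters) and grows with each failed discovery, capped at one. The names RoleDecisionSketch, threshold and nextState are hypothetical.

```java
import java.util.Random;

/** Hypothetical sketch of next_state; the threshold formula is an assumption. */
final class RoleDecisionSketch {
    enum Role { UNDEFINED, MASTER, SLAVE }

    /** Assumed threshold: 0 when disAttempt is 0, then growing linearly, capped at 1. */
    static double threshold(double masterProb, int disAttempt) {
        return Math.min(1.0, masterProb * disAttempt);
    }

    /** Mirrors lines 29-32: a draw above the threshold yields SLAVE, otherwise MASTER. */
    static Role nextState(double masterProb, int disAttempt, double rand) {
        return rand > threshold(masterProb, disAttempt) ? Role.SLAVE : Role.MASTER;
    }

    /** Convenience overload that draws the random number itself. */
    static Role nextState(double masterProb, int disAttempt, Random rng) {
        return nextState(masterProb, disAttempt, rng.nextDouble());
    }
}
```

With a master probability of 0.25, a device that has failed two discoveries becomes a master for any draw up to 0.5, and after four or more failures it always does.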

At the moment, the network layer has a star network implementation for every technology available. In the mesh algorithm, the devices alternate between discovery and visibility states during formation in order to build a mesh network. There is no notion of master or client; each device only knows which devices it is directly connected to. Only Bluetooth and WiFi have implementations of the mesh topology. Besides these topologies, we implemented a third one, a tree topology, that uses WiFi and WiFi-Direct. At first, all the devices are connected to the WiFi AP and then negotiate among themselves which devices will become group owners and which ones will be clients of the group owners. If a device decides to be part of a group, it disconnects from the WiFi AP and connects to a specific group owner, creating a two-tier network. In short, at the end of the algorithm, the devices connected to WiFi are group owners and the others are clients of the groups. For the negotiation phase the devices exchange an integer (in a broadcast packet over WiFi) that represents the capacity of a device to become group owner. For example, the integer can carry the battery level plus the WiFi signal strength. The devices with higher values are better placed to be group owners.
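The capacity-based negotiation can be illustrated with the sketch below. It assumes that every device has already received the capacities broadcast by the others, that the number of group owners is fixed in advance, and that ties are broken by device name; none of these details is given in the text, and GroupOwnerNegotiationSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the group owner negotiation (selection rule is assumed). */
final class GroupOwnerNegotiationSketch {
    /** Example capacity from the text: battery level plus WiFi signal strength. */
    static int capacity(int batteryPercent, int signalPercent) {
        return batteryPercent + signalPercent;
    }

    /** Returns the devices with the highest capacities; ties are broken by name. */
    static List<String> electOwners(Map<String, Integer> capacities, int owners) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(capacities.entrySet());
        entries.sort((a, b) -> {
            int byCapacity = Integer.compare(b.getValue(), a.getValue()); // higher first
            return byCapacity != 0 ? byCapacity : a.getKey().compareTo(b.getKey());
        });
        List<String> elected = new ArrayList<>();
        for (int i = 0; i < Math.min(owners, entries.size()); i++) {
            elected.add(entries.get(i).getKey());
        }
        return elected;
    }
}
```

Since every device applies the same deterministic rule to the same broadcast values, all devices reach the same set of group owners without further coordination.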

Routing Algorithms

After network formation, the devices need to exchange packets. To reach devices that are not in direct range, a packet routing mechanism is required. The routing mechanism must be able to bridge different types of technologies in order to achieve communication over a large network. For example, a device may participate simultaneously in a WiFi and a Bluetooth network, and a packet must be able to move transparently between these two different networks. There are several types of routing protocols, usually divided into two groups: proactive and reactive. Proactive protocols need to maintain local copies of routing tables on the devices. These are hard to keep consistent and not very suitable for highly dynamic mobile networks. In contrast, reactive protocols do not maintain any routing information. The routes are found on demand, every time. An example of such an approach is flooding, where the network is literally flooded with the same packet, the destination device eventually receiving it. Flooding is very simple to implement and some controlled forms can be very effective in several scenarios, like peer-to-peer messaging networks, sensor networks and service-oriented networks. The main drawback is typically the huge traffic it generates in the network, most of it redundant. For this reason, several forms of controlled flooding are used in practice.

As we write, the network layer provides two built-in routing algorithms: standard flooding and scoped flooding.

Standard Flooding: A packet floods the network until its number of hops reaches zero. Each packet is marked with a maximum number of hops allowed. That number is decreased every time the packet is routed by a device. When it reaches zero the packet is discarded. Note that, if a sender or router device has a direct logical channel to the destination device, then the packet travels only on that channel, ignoring the other channels.

Scoped Flooding: This is a variant of the standard version. Here, each routing device keeps track of each packet received (by storing its ID and source address in a local cache). If the packet has already been seen by the device it is immediately discarded. As memory is a limited resource, every routing device keeps only a small window of tracked packets, which can be dynamically adjusted. The cache of each device has n entries, n being a parameter of the algorithm. When the cache is full, the algorithm removes all entries that have been in it for more than k seconds, where k is a parameter that defines the "time to live" of the entries.
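A duplicate-detection cache along these lines could look like the sketch below. The parameters n and k map to capacity and ttlMillis. The text does not say what happens when the cache is still full after expired entries have been removed; evicting the oldest entry in that case is an assumption of this sketch, as are the class and method names.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.UUID;

/** Hypothetical sketch of the scoped flooding cache (eviction fallback is assumed). */
final class SeenPacketCacheSketch {
    private final int capacity;   // n: maximum number of tracked packets
    private final long ttlMillis; // k: time to live of an entry, in milliseconds
    // Insertion-ordered map from "source#id" to the time the packet was first seen.
    private final LinkedHashMap<String, Long> seen = new LinkedHashMap<>();

    SeenPacketCacheSketch(int capacity, long ttlMillis) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
        this.ttlMillis = ttlMillis;
    }

    /** Returns true if the packet is new (and records it), false if it is a duplicate. */
    boolean firstSighting(UUID source, int id, long nowMillis) {
        String key = source + "#" + id;
        if (seen.containsKey(key)) {
            return false;
        }
        if (seen.size() >= capacity) {
            // Cache full: drop every entry older than the time to live.
            seen.values().removeIf(firstSeen -> nowMillis - firstSeen > ttlMillis);
        }
        if (seen.size() >= capacity) {
            // Assumption: if nothing expired, evict the oldest entry to make room.
            Iterator<String> oldest = seen.keySet().iterator();
            oldest.next();
            oldest.remove();
        }
        seen.put(key, nowMillis);
        return true;
    }

    int size() {
        return seen.size();
    }
}
```

The current time is passed in explicitly so that the behaviour can be exercised deterministically; a deployment would use the system clock.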

Algorithms 2, 3 and 4 describe the flood routing version.

Algorithm 2 Flood Sender Algorithm
 1: procedure send(to, tag, body, hops)          ▷ Sends message to a peer
 2:   src ← get_local_address()
 3:   id ← get_new_id()                          ▷ Gets current id and increments
 4:   pkt ← pack_msg(id, src, to, tag, body, hops)
 5:   route_packet(to, pkt)
 6: end procedure
 7: procedure send_to_all(tag, body, hops)       ▷ Sends message to all peers
 8:   src ← get_local_address()
 9:   flood ← get_flood_address()
10:   id ← get_new_id()                          ▷ Gets current id and increments
11:   pkt ← pack_msg(id, src, flood, tag, body, hops)
12:   route_packet(flood, pkt)
13: end procedure

First (Algorithm 2), there are two ways to send packets: send to a specific destination, lines 1-6, or send to everyone, lines 7-13. The arguments of the first function are: to - the logical address of the destination device, tag - the packet classification, body - the packet bytes and hops - the maximum number of hops. The arguments of the second function are the same, except that no destination address is required. Every time a packet is sent, the send function creates a header from some of the function arguments plus two extra attributes. The first is the source address, line 2, which is obtained internally, and the other is the current packet identifier, line 3, which is managed by the routing algorithm. The send_to_all function is very similar to the send function, the only difference being that the destination address is a statically defined logical address, also obtained internally, line 9.

The next function (Algorithm 3) is responsible for selecting the exit channels through which a given packet is routed. It receives as arguments the destination address (to) and the packet object (pkt). The first thing the function does is check whether there are any channels connected to the destination, line 2. If there are, it picks one of them and writes the packet to it, lines 3-4. It is important to note that one device can have multiple channels to another device; the selection criterion is the channel with the highest bandwidth, e.g., a WiFi channel has more bandwidth than a Bluetooth channel. At the moment, the criterion is statically assigned but in the future it can be made dynamic in order to adapt to network fluctuations. On the other hand, if the destination is not directly connected then the packet is sent to all connected devices, lines 5-10, except the device that sent/routed the packet.

Algorithm 3 Flood Routing Algorithm
 1: procedure route_packet(to, pkt)              ▷ Routes a message
 2:   ch ← get_channels(to)
 3:   if ch ≠ {} then
 4:     pick_and_write(ch, pkt)                  ▷ Picks one channel and writes to it
 5:   else
 6:     peers ← get_connect_peers() \ {src_ch}
 7:     for all p ∈ peers do
 8:       ch ← get_channels(p)
 9:       pick_and_write(ch, pkt)
10:     end for
11:   end if
12: end procedure
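The channel selection of Algorithm 3 can be sketched in Java as shown below. Peers are identified by plain strings, bandwidths are static per-channel numbers, and the method returns the chosen channels instead of writing to them; these simplifications, and all names in the sketch, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of route_packet: direct channel if available, else flood. */
final class FloodRouterSketch {
    static final class Channel {
        final String peer;
        final String technology;
        final int bandwidthKBps; // static selection criterion

        Channel(String peer, String technology, int bandwidthKBps) {
            this.peer = peer;
            this.technology = technology;
            this.bandwidthKBps = bandwidthKBps;
        }
    }

    private final Map<String, List<Channel>> channelsByPeer = new LinkedHashMap<>();

    void addChannel(Channel channel) {
        channelsByPeer.computeIfAbsent(channel.peer, p -> new ArrayList<>()).add(channel);
    }

    /** Picks the channel with the highest bandwidth from a non-empty list. */
    static Channel pick(List<Channel> channels) {
        Channel best = channels.get(0);
        for (Channel c : channels) {
            if (c.bandwidthKBps > best.bandwidthKBps) best = c;
        }
        return best;
    }

    /** Returns the channels a packet for 'to' would be written to. */
    List<Channel> route(String to, String receivedFrom) {
        List<Channel> out = new ArrayList<>();
        List<Channel> direct = channelsByPeer.get(to);
        if (direct != null && !direct.isEmpty()) {
            out.add(pick(direct)); // destination is a neighbour: one channel only
        } else {
            for (Map.Entry<String, List<Channel>> e : channelsByPeer.entrySet()) {
                // Flood to every neighbour except the one the packet came from.
                if (!e.getKey().equals(receivedFrom)) out.add(pick(e.getValue()));
            }
        }
        return out;
    }
}
```

Passing null as receivedFrom models a packet that originates on the local device, in which case no neighbour is excluded.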

Algorithm 4 Flood Receiver Algorithm
 1: procedure receive(rcv_ch, pkt)               ▷ Receives message from a channel
 2:   (from, to, id, tag, body, hops) ← unpack_msg(pkt)
 3:   if is_valid(from, id, hops) then
 4:     if is_to_deliver(to) then
 5:       notify_packet(from, tag, body)
 6:     end if
 7:     if is_to_route(to) then
 8:       new_pkt ← decrement_hops_and_pack(pkt)
 9:       route_packet(to, new_pkt)
10:     end if
11:   end if                                     ▷ If the packet is not valid then drops it
12: end procedure

The last function (Algorithm 4) deals with received packets and decides whether to deliver them to the upper layer or to route them. This function receives as arguments the channel address rcv_ch, from which the packet was received, and the packet object pkt. When a packet is received, it must be unpacked, line 2, in order to extract the information required for the decision process. Note that the packet body stays unchanged, the unpacking being applied only to the packet header. In this way a socket can, for example, receive just the packet header, analyse it and transfer the packet body directly to another socket without loading everything into memory, working as a stream. After unpacking the header, the receiver verifies that the packet's number of hops is greater than zero and that the packet is not a duplicate (flood control), line 3. The aforementioned cache cleaning procedure is performed at this step and is called from within IS_VALID. If the packet is not valid it is simply dropped; otherwise it must be delivered or routed. If the packet is addressed to the current device then it is delivered, lines 4-6; otherwise the packet is routed through the other connected devices, lines 7-10. When a packet is routed the number of hops is decremented by one, line 8. There is one exception: when the received packet is a send-to-all packet, it is delivered and routed at the same time. The receiver recognises such a packet by the special unique flood address and acts accordingly.
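The decision logic of Algorithm 4 reduces to a small function, sketched below. The flood address value (an all-zero UUID), the unbounded set used for duplicate detection and the names used are assumptions of this sketch; in scoped flooding the set would be replaced by the bounded cache described earlier.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical sketch of the receiver decision: drop, deliver, route, or both. */
final class FloodReceiverSketch {
    enum Action { DROP, DELIVER, ROUTE, DELIVER_AND_ROUTE }

    /** Assumed value for the static flood address; the real one is not given. */
    static final UUID FLOOD = new UUID(0L, 0L);

    private final UUID self;
    private final Set<String> seen = new HashSet<>(); // unbounded here for brevity

    FloodReceiverSketch(UUID self) {
        this.self = self;
    }

    /** Mirrors lines 3-11: validity check first, then deliver and/or route. */
    Action receive(UUID from, UUID to, int id, int hops) {
        boolean valid = hops > 0 && seen.add(from + "#" + id); // add() is false for duplicates
        if (!valid) return Action.DROP;
        if (to.equals(FLOOD)) return Action.DELIVER_AND_ROUTE;
        return to.equals(self) ? Action.DELIVER : Action.ROUTE;
    }

    /** A routed packet leaves with one hop less (line 8). */
    static int hopsAfterRouting(int hops) {
        return hops - 1;
    }
}
```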

3.4 Discussion

In this chapter we presented the Hyrax middleware architecture that can be used on Android devices. The architecture is modular and flexible, allowing new behaviour to be programmed and plugged in. The middleware is divided into three main layers which provide hardware specific communication for local networks, network composition and routing and, finally, basic services and applications. In the next chapter we perform a sequence of tests to make a preliminary assessment of the performance of our implementation and the impact of using the middleware in both execution time and device resources such as CPU, memory and energy.

Chapter 4

Middleware evaluation

This chapter presents an evaluation of the link and network layers of the middleware. The experimental setup used is first presented, followed by the results and discussion. The evaluation focuses on resource usage metrics (e.g., CPU, memory, energy), performance metrics (e.g., bandwidth, latency) and topological metrics (e.g., number of groups formed, group size). For the link layer we focus on resource usage, latency and bandwidth, to estimate the footprint of the software at establishing simple, device-to-device, connections using distinct technologies. For the network layer we focus mainly on latency, bandwidth (formation and routing) and network topology (formation). The data gathered in these experiments provides a first characterisation of the performance of the middleware relative to the baseline Android libraries.

4.1 Link layer

In order to evaluate the impact of the link layer, we gathered synthetic data to quantify the cost of the technology operations and to obtain reference numbers, e.g., data for simulation. Thus, in this section, we assess the cost of every link layer action and the bandwidth limitations, and monitor the usage of resources such as CPU, memory and battery.

4.1.1 Evaluation setup

To gather our data only two devices were used: one acting as a server, another acting as a client. The server first becomes visible, then accepts a client connection and henceforth replies to messages sent by the client. Turning on the visibility allows the client to "see" the server during the discovery process. The client role is, naturally, complementary. It discovers the server device, establishes a connection, sends multiple messages to the server and waits for the replies.

The experiments used two fully charged Nexus 9 devices with the following characteristics: 2.3 GHz dual-core CPU, 2 GB of RAM and Android OS (version 6.0.1). For the WiFi experiment the router used was an ASUS RT-AC56U. The experiments were performed manually, without connecting the devices to USB ports to automate the procedure using ADB (Android Debug Bridge). The reason is that devices connected to USB ports are constantly charging, which would give misleading results for the battery levels we wanted to measure. Moreover, in some experiments the WiFi interface was disabled, which also made it impossible to use ADB (through WiFi).

Each experiment consisted of several phases. The first step was to start the app on both devices. At startup each device knew which role it would play in the experiment. The next step was to enable the hardware (Bluetooth or WiFi) on both devices. After this, the client and server executed their roles. Every experiment was repeated 16 times. At the end of each experiment, the hardware was disabled and the Android app was stopped. We logged timestamped information on CPU, memory and battery usage in local files on each device. At the end of the experiments these were transferred to a desktop computer for processing and analysis. The values for each metric were averaged over the 16 experiments and the corresponding 95% confidence intervals were computed to be plotted.
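For reference, a mean and a 95% confidence interval over 16 repetitions can be computed as in the sketch below. The text does not state which method was used; this sketch assumes a two-sided Student's t interval with 15 degrees of freedom (critical value approximately 2.131) and the sample standard deviation. The class name is hypothetical.

```java
/** Hypothetical sketch: mean and 95% confidence half-width for 16 samples. */
final class ConfidenceIntervalSketch {
    /** Two-sided 95% Student's t critical value for 15 degrees of freedom. */
    static final double T_975_DF15 = 2.131;

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    /** Sample standard deviation (divides by n - 1). */
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) squares += (x - m) * (x - m);
        return Math.sqrt(squares / (xs.length - 1));
    }

    /** Half-width of the interval: mean ± t * s / sqrt(n), for n = 16 only. */
    static double halfWidth95(double[] xs) {
        if (xs.length != 16) {
            throw new IllegalArgumentException("critical value is tabulated for 16 samples only");
        }
        return T_975_DF15 * sampleStdDev(xs) / Math.sqrt(xs.length);
    }
}
```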

4.1.2 Latency of link actions

Figure 4.1 depicts the amount of time taken by each link layer action in seconds. The actions at stake and the corresponding measurements are as follows.

Initialisation: the time taken to initialise the hardware. This is almost the same for all technologies, at approximately 0.5 seconds;

Discovery: the time taken by the client to discover the server on the network. Bluetooth-based technologies are faster than the WiFi-based ones. In particular the discovery time for Bluetooth LE is less than 1 second, while WiFi-based technologies all take approximately 2.5 seconds.

Figure 4.1: Link layer - action latency. [Bar chart; y-axis: Time (s), 0 to 5; x-axis: Hardware, Discovery, Visibility, Connection, Accept, Ping; series: Bluetooth, Bluetooth LE, WiFi, WiFi Direct, WiFi Direct Legacy.]

Visibility: the time taken by the server to become visible to other devices. All the technologies perform similarly except for WiFi that takes almost 1 second to provide a visible service, since it implies an overhead of requiring all devices to join the network first before actual discovery through broadcast packets.

Connect: the time taken by the client to connect to the server. The results are heterogeneous, with Bluetooth taking more than 4 seconds, WiFi less than 1 second, and other technologies taking between 1 and 2 seconds.

Accept: the time taken by the server to accept incoming connections. For all technologies this takes less than 1 second.

Ping: the round-trip time measured at the client by sending 20 messages of 9 bytes each and receiving them back (echoed) from the server over a (Bluetooth or TCP) socket connection. For Bluetooth, Bluetooth LE and WiFi this takes less than 100 milliseconds. For WiFi-Direct and WiFi-Direct Legacy the latency is around 200 milliseconds.

4.1.3 Bandwidth measurements

Technology          Time (s)  Speed (KB/s)  Length (MB)
Bluetooth           87        115           10
Bluetooth LE        50        6             0.3
WiFi                2.2       4500          10
WiFi-Direct         3.1       3200          10
WiFi-Direct Legacy  5         2000          10

Table 4.1: Link layer - bandwidth.

In each experiment, after the ping operation, we performed a bulk data transfer from the client to the server with the purpose of measuring the transfer speed per technology. For Bluetooth LE, given the scarce bandwidth available, 0.3 MB of random data were transmitted. For all other technologies the size of the data transfer was 10 MB.

Table 4.1 shows the measured transfer speeds. As expected, Bluetooth-based links are slower; in particular, the transfer speed of Bluetooth LE is limited to a few KB/s, while the speeds of WiFi-based technologies exceed 2 MB/s and reach up to 4.5 MB/s. The Bluetooth-based technologies are not suitable for transferring large streams of data; however, the ping results presented earlier suggest that they can perform well when transferring small chunks of data. WiFi technologies are clearly the choice for applications with higher bandwidth requirements (e.g., image, audio or video dissemination). Both WiFi-Direct and WiFi-Direct Legacy use the devices' on-board WiFi card, which explains the difference to the WiFi configuration that uses a dedicated router. WiFi-Direct is also clearly faster than WiFi-Direct Legacy.
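The speed column of Table 4.1 follows from the other two columns, as the small sketch below illustrates. It assumes 1 MB = 1000 KB, which is the convention that reproduces the tabulated values; the class and method names are hypothetical. Note that the table rounds some results (for WiFi, 10 MB in 2.2 s gives roughly 4545 KB/s, reported as 4500).

```java
/** Hypothetical helper relating the columns of Table 4.1 (assumes 1 MB = 1000 KB). */
final class ThroughputSketch {
    /** Average transfer speed in KB/s for a transfer of sizeMB that took 'seconds'. */
    static double speedKBps(double sizeMB, double seconds) {
        return sizeMB * 1000.0 / seconds;
    }
}
```

For example, speedKBps(10, 87) is about 115 KB/s for Bluetooth and speedKBps(0.3, 50) is 6 KB/s for Bluetooth LE.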

4.1.4 Resource consumption

We now report on resource usage - CPU, memory and battery - during the experiments.

CPU

Figure 4.2 depicts the average CPU usage, measured at intervals of 200 milliseconds. The y-axis shows the average values, in percentage, while the x-axis represents the role of the device - server and client - as well as the wireless technology used, represented in different colours. Furthermore, the x-axis has two extra columns which detail the average CPU consumption on the client side during network formation (Client Formation) and the average CPU usage during the bulk data transfer (Client Transfer). Each column is divided into two sections, the lighter section depicting the CPU used by the test application, the darker section showing the CPU used by the other processes of the system. The whole bar represents the total CPU usage.

Figure 4.2: Link layer - CPU usage. [Stacked bar chart; y-axis: Percentage, 0 to 60; x-axis: Server, Client, Client Formation, Client Transfer; series: Bluetooth, Bluetooth LE, WiFi, WiFi Direct, WiFi Direct Legacy.]

The results show that the technology that uses the least CPU is Bluetooth LE, at approximately 22% (server) and 30% (client). The other technologies consume between 32 and 44% on the client side. On the server side all technologies consume less than 30%, except for WiFi which goes up to 51%. The 95% confidence intervals shown (for the total CPU time) indicate that the system exhibits relatively small variations of CPU usage across the experiments.

The percentage of CPU spent by the process, in both the client and the server, is fairly low when compared to the CPU spent by the whole system. However, as we already mentioned, it is important to note that the link layer interacts with other processes of the system, namely through the Android native libraries and drivers, thus impacting the overall CPU consumption as well. To explore the effect of this dependency, we ran additional experiments in which the app was initialised and then remained idle for one minute, measuring the CPU consumption during that time. The CPU consumption was between 6-10% on average, which suggests that the interaction of the link layer with the wireless technologies through Android has a noticeable impact on CPU, as expected given the radio usage. Note that interaction with wireless technologies includes the data transfer, which is the biggest CPU consumer.

Considering the evaluation setup, in general, the data transfers have a higher impact on CPU usage than network formation. The exceptions are Bluetooth LE, which is expected since it is optimised to save energy, and WiFi-Direct Legacy, because its network formation process is a bit more complex than that of the other WiFi technologies. Also of note is the fact that for technologies with higher bandwidth the percentage of CPU usage during the transfers is also higher, which confirms that radio usage has a considerable impact on CPU (Table 4.1).

Memory

Figure 4.3: Link layer - memory usage. [Stacked bar chart; y-axis: Length (MB), 0 to 125M; x-axis: Server, Client; series: Bluetooth, Bluetooth LE, WiFi, WiFi Direct, WiFi Direct Legacy.]

Figure 4.3 shows the average memory consumption of the test application. The samples were taken every 200 milliseconds. Results are shown for the client and server configurations and for the distinct technologies. The stacked columns are divided into sections. The lighter section represents the memory consumed by the app alone while the darker section represents the total memory consumption by all the processes in the devices. For each configuration, memory usage fluctuates only slightly, as illustrated by the small amplitude of the 95% confidence intervals.

Again, the memory shown is not consumed by the app and middleware alone, given their connections to Android services and drivers. To understand the impact of the link layer on memory usage we again ran some experiments in idle mode. We left the app running idle for a minute (with the middleware disabled) and recorded the memory consumption during that time. The memory consumption was essentially the same, between 75-80 MB, which suggests that the link layer has a small memory footprint.

It is noticeable that Bluetooth LE consumes a bit more memory than the other technologies, which is most probably related to its internal implementation, since at the link layer level the experiment is exactly the same.

Battery

[Figure 4.4 placeholder: bar chart; y-axis: Power (Watt), from 0 to 7; x-axis: Server, Client, Client Formation, Client Transfer; series: Bluetooth, Bluetooth LE, WiFi, WiFi Direct, WiFi Direct Legacy.]

Figure 4.4: Link layer - battery usage.

So far, memory and CPU consumption have been analysed. Next we analyse battery consumption. Figure 4.4 shows the amount of energy per second (power) dissipated on average during the experiments. The y-axis depicts the power, in Watt (Joules per second), while the x-axis shows the types of devices, server and client. As in the CPU analysis, two additional columns were added to the x-axis to distinguish the energy consumed during the network formation and data transfer phases of the experiments. The different colours represent distinct technologies.

On the server side, the technologies that consume the least energy are the Bluetooth-based ones (at 1.7 W), while the WiFi-based ones consume considerably more (>2.7 W). WiFi (red) is by far the technology that consumes the most energy (at 5.2 W). On the client side, the results are different. Here, the technology that consumes the most energy is Bluetooth (blue) and the one that consumes the least is Bluetooth LE (green), with the discovery and connection phases having a huge impact on battery consumption. Among the WiFi-based technologies, the one that consumes the least energy is WiFi-Direct (yellow). During the data transfers, the technologies that consume the most energy are Bluetooth (at 4.2 W) and WiFi (at 4.1 W). Bluetooth LE consumes the least (only 2.3 W), as expected. Note, however, that while a large data transfer performed via WiFi-based technologies consumes a bit more energy, the amount of data transferred per unit of time is far greater. For example, transferring 10 MB took just ≈3 seconds over WiFi, while it took 87 seconds over Bluetooth. Bluetooth-based technologies really stand out for small data transfers, with regard to energy consumption and latency. To understand the relative impact of the app on the battery level, we ran the app for 1 minute in idle mode and measured an average dissipated power of 1.7 W.
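The trade-off between power and transfer time can be made concrete by converting the figures above into energy per transfer. The following is a minimal sketch, assuming that the client transfer-phase power (4.1 W for WiFi, 4.2 W for Bluetooth) is drawn for the whole duration of the 10 MB transfer; the function name is ours and is not part of the middleware.

```python
def transfer_energy_joules(power_watts, duration_seconds):
    """Energy (J) = average power (W) x duration (s)."""
    return power_watts * duration_seconds


# Assumption: the transfer-phase power holds for the whole 10 MB transfer.
wifi_j = transfer_energy_joules(4.1, 3)         # WiFi: 4.1 W for about 3 s
bluetooth_j = transfer_energy_joules(4.2, 87)   # Bluetooth: 4.2 W for 87 s

print(f"WiFi:      {wifi_j:.1f} J")
print(f"Bluetooth: {bluetooth_j:.1f} J")
print(f"Ratio:     {bluetooth_j / wifi_j:.1f}x")
```

Under these assumptions the Bluetooth transfer costs roughly 30 times more energy than the WiFi one, even though both technologies dissipate similar power while transferring.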

4.2 Network Layer

We now turn to the evaluation of the network layer. More specifically, we focus on the performance of the fundamental functionality provided by the layer: network formation and packet routing. For packet routing we performed several experiments in which we measured the latency associated with routing packets to random destinations in the network and the amount of traffic generated in each experiment. For network formation, the experiments focused on analysing the behaviour of the formation algorithms under distinct conditions and the overheads of network formation.

4.2.1 Evaluation setup

In these experiments, the hardware used was exactly the same as in the link layer evaluation. All the devices were connected through USB hubs to a desktop machine so that they could be automatically controlled by a (Python) script that we developed. The script consists of a set of commands, over ADB¹ and App Intents², controlling the device and the application, respectively, and another set of commands to extract the experiment logs. The overall experiment lifecycle is as follows: application start, device synchronisation, experiment run, and log gathering and processing.
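To illustrate the kind of control script described above, the sketch below builds the ADB command lines for one device. It is not the script used in the thesis: the package, activity, intent action and log paths are hypothetical placeholders, and only standard `adb` subcommands (`shell am start`, `shell am broadcast`, `shell am force-stop`, `pull`) are used.

```python
import subprocess


def experiment_commands(serial, package, activity, action, remote_log, local_log):
    """Build the adb command lines for one device (nothing is executed here)."""
    adb = ["adb", "-s", serial]
    return [
        adb + ["shell", "am", "start", "-n", f"{package}/{activity}"],  # application start
        adb + ["shell", "am", "broadcast", "-a", action],               # experiment run (intent)
        adb + ["shell", "am", "force-stop", package],                   # application shutdown
        adb + ["pull", remote_log, local_log],                          # log gathering
    ]


def run_all(commands):
    """Execute the commands in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# All identifiers below are hypothetical and only serve as an example.
commands = experiment_commands(
    serial="DEVICE01",
    package="org.example.testapp",
    activity=".MainActivity",
    action="org.example.testapp.START_EXPERIMENT",
    remote_log="/sdcard/experiment.log",
    local_log="logs/DEVICE01.log",
)
for command in commands:
    print(" ".join(command))
```

Calling `run_all(commands)` would execute the sequence on a connected device; keeping command construction separate from execution makes the script easy to check without hardware.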

4.2.2 Packet routing

In order to understand how the packet routing algorithms implemented in the middleware behave over different wireless technologies we set up a series of experiments that measure the message latency of unicast packets and also the amount of traffic generated by the routing of the packets.

It is important to note that our middleware runs on top of Android, at the application layer, over the TCP or Bluetooth stack. All packets move through reliable channels (Java TCP/Bluetooth sockets). This, however, does not necessarily mean that a packet will be delivered, since we face the same routing issues as the network layer of the TCP/IP stack: more specifically, the routing queues may overflow, in which case message delivery guarantees break down.

The experiments tested three wireless technologies: Bluetooth, WiFi-Direct and WiFi. The topology employed for Bluetooth was a single star network in which we varied the star size from 2 to 7 (the maximum allowed). We used the same topology for WiFi-Direct and WiFi, varying the star size from 2 to 6 and from 2 to 20, respectively. Finally, we also tested a more complex topology, a star of stars, with individual stars using distinct technologies. In this case, WiFi-Direct was combined with WiFi in order to create a hierarchical star network.

Distinct choices of parameters were evaluated: number of devices, number of hops, packet rate and packet length. This allowed us to assess the performance of the routing algorithm under different network operation scenarios. For example, by varying the number of devices we can assess the scalability of the routing algorithms, while varying the number of hops will show the latency penalty for each extra hop involved in routing. The packet rate and variations on the packet length test the robustness of the routing algorithms from light to heavy loads, exposing the bandwidth limits of the technologies.

¹ https://developer.android.com/studio/command-line/adb
² https://developer.android.com/reference/android/content/Intent

A packet payload of size n is composed of a (4 byte) integer identifier followed by a randomly generated sequence of n − 4 bytes. This identifier is different from the routing header identifier because, on top of the network layer, it is not possible to access the identifier generated by the routing algorithm. The payload identifier is thus a higher level message identifier. In the results we ignore the message header, which is 60 bytes long. To compute the packet latency, every sent message is timestamped. A receiver gets the message and answers with an acknowledgement message (ACK) with a payload size of 4 bytes, carrying just the original message's identifier. When the sender receives the ACK packet, it calculates the difference between the current timestamp and the one recorded when the message was sent.
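The payload layout and the latency measurement described above can be sketched as follows. This is an illustrative Python model rather than the middleware's (Java) code: the big-endian encoding of the identifier, the class name and the injectable clock are assumptions made for the example.

```python
import os
import struct
import time

ID_SIZE = 4  # bytes taken by the integer identifier


def make_payload(msg_id, n):
    """Payload of n bytes: a 4-byte identifier followed by n - 4 random bytes."""
    if n < ID_SIZE:
        raise ValueError("payload must hold at least the 4-byte identifier")
    return struct.pack(">I", msg_id) + os.urandom(n - ID_SIZE)


def make_ack(payload):
    """ACK payload: just the 4-byte identifier of the original message."""
    return payload[:ID_SIZE]


class LatencyTracker:
    """Timestamps sent messages and computes the latency when the ACK arrives."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._sent = {}

    def on_send(self, payload):
        (msg_id,) = struct.unpack(">I", payload[:ID_SIZE])
        self._sent[msg_id] = self._clock()

    def on_ack(self, ack):
        (msg_id,) = struct.unpack(">I", ack)
        return self._clock() - self._sent.pop(msg_id)


payload = make_payload(msg_id=7, n=1024)
ack = make_ack(payload)
tracker = LatencyTracker()
tracker.on_send(payload)
latency = tracker.on_ack(ack)
print(len(payload), len(ack), latency >= 0)
```

Note that, measured this way, the latency covers the round trip of the message and its ACK.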

For packet routing, the experiment lifecycle is the following:

• app activity and middleware boot;

• network formation (barrier until all the devices belong to a network);

• experiment start (with specific parameters, e.g., packet length, packet rate);

• experiment end (barrier until all the devices finish);

• middleware and app shutdown;

• log gathering.

For every test, the network topology was set to the best configuration, i.e., the topology was set up beforehand to minimise the number of hops necessary for each device to communicate with any of the other devices. The experiments were automatically controlled, and consisted of sending packets to randomly chosen destinations in the network. After the network is formed, the devices perform a discovery in order to find all the other reachable devices. The discovery is a user-defined process in which each device broadcasts a packet with a special tag and every device that receives that packet simply answers with an ACK. After this phase is complete, the devices know the dimension of the network and all the devices belonging to it. Note that, at this level, the devices are uniquely identified by a logical address. Additionally, the discovery process can be controlled to find devices only at a distance less than a given number of hops.
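The outcome of a hop-limited discovery can be modelled as a breadth-first search over the formed network. The sketch below is only a model of the result, not of the broadcast/ACK exchange itself; the adjacency map, the logical addresses and the choice of treating the hop limit as inclusive are assumptions made for the example.

```python
from collections import deque


def discover(adjacency, source, max_hops=None):
    """Return {address: hops} for every device reachable from source.

    When max_hops is given, devices further away than max_hops are ignored.
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if max_hops is not None and hops[node] >= max_hops:
            continue  # do not propagate beyond the hop limit
        for neighbour in adjacency[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    del hops[source]
    return hops


# Hypothetical star with one master and three clients.
star = {
    "master": ["c1", "c2", "c3"],
    "c1": ["master"],
    "c2": ["master"],
    "c3": ["master"],
}
print(discover(star, "c1"))              # {'master': 1, 'c2': 2, 'c3': 2}
print(discover(star, "c1", max_hops=1))  # {'master': 1}
```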

We now show the results obtained and discuss their implications.

Bluetooth Star

Bluetooth is a communication technology that allows devices to communicate directly over short ranges. Devices are arranged into star networks in which all of them communicate over a single data channel. In a star network, one device acts as the master and the others connect to it (thus forming a star). The master handles all the traffic of the network, working as a relay and routing messages between the devices in the star. Bluetooth also allows several stars to interconnect; in these experiments, however, a single star was used. The number of hops therefore varies from 1 to 2, since a device can reach any other within 2 hops. For the 1 hop experiment it is ensured that all packets do exactly one hop. In contrast, for the 2 hop experiment most of the packets perform 2 hops while a few perform 1 hop. This is so because the destination devices are randomly selected, all the clients are 1 hop away from the master, and the clients are 2 hops away from each other.
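The share of packets that take 1 or 2 hops in the 2 hop experiment follows directly from the star topology. A minimal sketch, assuming that the star size counts the master plus its clients, that every device sends the same number of packets, and that destinations are chosen uniformly at random (the function name is ours):

```python
from fractions import Fraction


def star_hop_fractions(star_size):
    """Return (one_hop, two_hop) fractions over all ordered (source, destination) pairs."""
    if star_size < 2:
        raise ValueError("a star needs a master and at least one client")
    total_pairs = star_size * (star_size - 1)
    one_hop_pairs = 2 * (star_size - 1)  # master -> client and client -> master
    one_hop = Fraction(one_hop_pairs, total_pairs)
    return one_hop, 1 - one_hop


for size in (2, 4, 7):
    one_hop, two_hop = star_hop_fractions(size)
    print(f"size {size}: 1 hop = {one_hop}, 2 hops = {two_hop}")
```

Under these assumptions, for a star of size 7 only 2/7 of the packets take a single hop, which is consistent with most packets performing 2 hops.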

Figure 4.5 shows the results of the Bluetooth experiments in different situations for a single star network. The charts differ in the packet length: 1 KB, 2 KB, 4 KB and 8 KB, respectively. They depict the average packet delivery latency (y-axis) over an increasing number of devices (x-axis). Note that all the devices belong to the same star network and that the y-axis has a logarithmic scale. The charts present two main groups of series, blue and green, representing the 1 hop and 2 hops experiments, respectively. The green bars have a grey section on top, used to distinguish the average of the two experiments (1 and 2 hops). The whole bar (green + grey) shows the average latency for the 2 hops experiment, while the green part alone represents the average of all experiments. Both series show distinct colour intensities, which depict the message rate, ranging from 1 to 16 packets per second. The lighter colours show lower rates while the darker colours show higher rates. In each experiment every device sent 100 packets at the adopted packet rate.

Figure 4.5a shows the experiment for a packet length of 1 KB. The latencies for 1 and 2 hops are in general satisfactory, since the technology can deal with the generated traffic. This experiment shows a few signs of saturation for a star network of size 7 and rate 16, taking around one second for one hop. This gives a total bandwidth of 1 KB × 7 × 16 = 112 KB/s. In Figure 4.5d, for a packet length of 8 KB, the saturation point is very clear even for a star network of size 2 when comparing the rates 8 and 16. At rate 8 the traffic generated is 8 KB × 2 × 8 = 128 KB/s, while for rate 16 the traffic generated is doubled to 256 KB/s, translating into a steep growth of latency. For the intermediate cases, 2 KB and 4 KB (Figures 4.5b and 4.5c), it is noticeable that Bluetooth struggles with an increasing traffic demand. As expected, in general, for the 2 hops experiments the latency doubles when compared with 1 hop.

[Figure 4.5 placeholder: four bar charts, (a) 1 K, (b) 2 K, (c) 4 K and (d) 8 K; y-axis: Latency (s), logarithmic scale; x-axis: star size, from 2 to 7; series labelled by packet rate (R1 to R16) and number of hops (H1, H2).]

Figure 4.5: Bluetooth star
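The aggregate traffic figures quoted for the saturation points are the product of packet length, star size and packet rate. A minimal sketch of that arithmetic, assuming that every device in the star sends at the nominal rate and ignoring headers and ACKs (the function name is ours):

```python
def offered_load_kb_per_s(packet_kb, devices, rate):
    """Aggregate payload offered to the network, in KB/s."""
    return packet_kb * devices * rate


print(offered_load_kb_per_s(1, 7, 16))   # 1 KB packets, star of size 7, rate 16 -> 112
print(offered_load_kb_per_s(8, 2, 8))    # 8 KB packets, star of size 2, rate 8  -> 128
print(offered_load_kb_per_s(8, 2, 16))   # 8 KB packets, star of size 2, rate 16 -> 256
```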

[Figure 4.6 placeholder: bar chart; y-axis: Traffic (MB), logarithmic scale, from 0.1 to 20; x-axis: star size, from 2 to 7; series: rates R1, R2, R4, R8 and R16 for each packet length 1K, 2K, 4K and 8K.]

Figure 4.6: Bluetooth star traffic.

Besides the latency, the traffic data was analysed as well. Figure 4.6 shows the traffic generated by the whole network in each experiment. The y-axis depicts the traffic in MB while the x-axis shows the star size. Note that the y-axis has a logarithmic scale and the bars show absolute values. The chart is divided into four main series, blue, green, red and yellow, for packet lengths of 1 KB, 2 KB, 4 KB and 8 KB, respectively. Every main series has distinct colour intensities denoting the different rates, ranging from 1 to 16 (lighter to darker colours). The bars have a grey area on top, showing the traffic overhead incurred when performing 2 hops. The whole bar (grey + coloured area) depicts the whole traffic generated by the 2 hop experiment (with routing overhead), while the coloured area alone shows the amount of traffic generated by the devices (data produced). One expects the traffic to almost double for a 2 hops experiment, since the routing device receives the whole packet and then forwards it through another channel.

The traffic grows linearly as the number of devices and the packet length increase. The same does not apply to packet latency, however, since Bluetooth does not support much bandwidth (≈ 120 KB/s). Thus the latency rises rapidly as the rate and the network size increase.

WiFi-Direct Star

WiFi-Direct is a WiFi-based technology that allows any device to become an Access Point (AP) on demand. Once a device becomes an AP, the other devices can connect to it using the legacy WiFi interface or the internal WiFi-Direct API, which allows a device to be part of two networks at the same time, WiFi-Direct and traditional WiFi. This technology supports the formation of star networks; however, one limitation when compared with Bluetooth is that devices in different networks cannot communicate with each other. In contrast, the advantage is the same high bandwidth and low latency as WiFi. For this experiment a single star network was used and, thus, the number of hops, as in Bluetooth, varied from 1 to 2. For the 1 hop experiments we ensured that all packets do exactly one hop. In contrast, for the 2 hop experiment most of the packets perform 2 hops while a few perform 1 hop, since the destination device is randomly selected, all the clients are 1 hop away from the master, and the clients are 2 hops away from each other. Note that both master and clients participate actively in the experiment.

Figure 4.7 shows the results of the WiFi-Direct experiments in different configurations for the star network. Again, the charts differ in the packet length: 2 KB, 4 KB, 8 KB and 16 KB, respectively. We ran experiments for 1 KB and 32 KB as well, but these charts are enough to show the efficiency and the saturation point. The charts depict the average packet delivery latency, on the y-axis, over an increasing number of devices, on the x-axis. Note that all the devices belong to the same star network and that the y-axis has a logarithmic scale. The charts present two main groups of series, blue and green, representing the 1 hop and 2 hops experiments, respectively. The green bars have a grey section on top, used to distinguish the average of the two experiments (1 and 2 hops). The whole bar (green + grey) shows the average latency for the 2 hops experiment, while the green part alone represents the average of all experiments. Both series show distinct colour intensities, which depict the message rate, ranging from 1 to 64 packets per second. The lighter colours show lower rates while the darker colours show higher rates. The number of packets sent in each experiment is given by the following function: p = 30s × rate, where rate ∈ {1, 2, 4, 8, 16, 32, 64}. The goal was that each experiment would require approximately the same time to run.

[Figure 4.7 placeholder: four bar charts, (a) 2 K, (b) 4 K, (c) 8 K and (d) 16 K; y-axis: Latency (s), logarithmic scale; x-axis: star size, from 2 to 6; series labelled by packet rate (R1 to R64) and number of hops (H1, H2).]

Figure 4.7: WiFi direct star
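The relation p = 30s × rate fixes the number of packets for each rate so that every run lasts about the same time. The sketch below tabulates it and derives the resulting volume of data produced; it assumes that p is the number of packets sent per device, which the text does not state explicitly, and it ignores headers and ACKs. The function names are ours.

```python
DURATION_S = 30
RATES = (1, 2, 4, 8, 16, 32, 64)


def packets_per_device(rate, duration_s=DURATION_S):
    """Number of packets p = duration x rate."""
    return duration_s * rate


def data_produced_kb(devices, rate, packet_kb, duration_s=DURATION_S):
    """Payload produced by all devices in one run, in KB (assumes p is per device)."""
    return devices * packets_per_device(rate, duration_s) * packet_kb


schedule = {rate: packets_per_device(rate) for rate in RATES}
print(schedule)
print(data_produced_kb(devices=5, rate=64, packet_kb=16), "KB")
```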

As expected, and compared with Bluetooth, the WiFi-Direct results show a considerable improvement in both latency and available bandwidth. Moreover, for packet lengths of 2 KB, 4 KB and 8 KB there are no significant differences in latency (c.f. Figures 4.7a, 4.7b, 4.7c, respectively). An interesting pattern in these results is that as the packet rate increases the latency decreases. We attribute this behaviour to operating system optimisations that pack as much data as a TCP segment can handle, and also to optimisations at the TCP level, e.g., flow and congestion control. The amount of data sent per unit of time is adjusted dynamically to the network state. If the conditions are good the TCP window is larger (which benefits higher rates); otherwise it is shorter (which benefits lower rates). It is only for the highest packet size and rates (Figure 4.7d) that we find the first signs of saturation, for a star network of size 5, a packet rate of 64 and a packet size of 16 KB, corresponding to a total traffic of 16 KB × 5 × 64 = 5.1 MB/s. The latency also rises, being on average ≈4 seconds. Of course, for higher rates and packet lengths these numbers get even worse. Despite this, these values are quite good, allowing, for example, High Definition (1080p) video streaming. As expected, for the 2 hops experiments the latency doubles when compared with the results for 1 hop.

[Figure 4.8 placeholder: bar chart; y-axis: Traffic (MB), logarithmic scale, from 0.01 to 1k; x-axis: star size, from 2 to 6; series: rates R1 to R64 for each packet length 2K, 4K, 8K and 16K.]

Figure 4.8: WiFi direct star traffic.

Figure 4.8 represents the traffic generated by the whole network, for each experiment. The y-axis depicts the traffic in MB and the x-axis shows the star size. Note that the y-axis has a logarithmic scale and the bars show absolute values. The chart is divided into four main series, green, red, yellow and purple, for packet lengths of 2 KB, 4 KB, 8 KB and 16 KB, respectively. Every main series has distinct colour intensities denoting the different rates, ranging from 1 to 64 (lighter to darker colours). The bars have a grey area on top, showing the traffic overhead incurred when performing 2 hops. The whole bar (grey + coloured area) depicts the whole traffic generated by the 2 hop experiment (with routing overhead), while the coloured area alone shows the amount of traffic generated by the devices (data produced). We expect the traffic to almost double for the 2 hops experiments, since the routing device receives the whole packet and then forwards it through another channel.

The traffic grows linearly with the number of devices, the packet rate and the packet length. The packet latency is stable until a certain threshold (≈ 3 MB/s). After this point it starts to grow rapidly, as in Bluetooth, but the available bandwidth is much higher.

WiFi Star

For this experiment we used the traditional WiFi technology with an ordinary AP. By default, the network topology is a star network as well, and in this scenario the AP is a dummy node, since it does not hold any intelligence when compared with the central device in WiFi-Direct or Bluetooth. The AP is, however, optimised for routing "duty", so the performance is expected to improve. It also allows, and effectively handles, a much higher number of client devices than WiFi-Direct. For this experiment a single star was used and the number of hops performed is 2, since all devices are 2 hops away from each other. Thus, in this experiment all packets are guaranteed to make 2 hops and, as in the previous experiments, the destination devices are all randomly selected.

Figure 4.9 shows the results of the WiFi experiments in different configurations for a single star network. The charts differ in the packet length: 2 KB, 4 KB, 8 KB and 16 KB, respectively. We ran experiments for 1 KB and 32 KB as well, but these are enough to show the efficiency and the saturation point. The charts depict the average packet delivery latency, on the y-axis, over an increasing number of devices, on the x-axis. Note that all the devices belong to the same star network and that the y-axis has a logarithmic scale. The charts present only one group of series, blue, representing the 2 hops experiment. Note that all devices are 2 hops away from each other and the AP is a dummy node. The series show distinct colour intensities, which depict the message rate, ranging from 8 to 64 packets per second. The lighter colours show lower rates while the darker colours show higher rates. The number of packets sent in each experiment is given by the following function: p = 30s × rate, where rate ∈ {1, 2, 4, 8, 16, 32, 64}.

[Figure 4.9 placeholder: four bar charts, (a) 2 K, (b) 4 K, (c) 8 K and (d) 16 K; y-axis: Latency (s), logarithmic scale; x-axis: star size, from 2 to 20; series labelled by packet rate (R8 to R64), all for 2 hops (H2).]

Figure 4.9: WiFi star

When compared to WiFi-Direct, which supports only up to 6 devices, the results are better, since we are running a specialised AP against an ordinary WiFi card on a mobile device. In general, for packet lengths of 2 KB, 4 KB and 8 KB there are no significant differences in latency (c.f. Figures 4.9a, 4.9b, 4.9c, respectively). In Figure 4.9c, we observe the first signs of saturation, for a configuration of 20 devices, a packet rate of 64 and a packet size of 8 KB, translating into a total traffic of 8 KB × 20 × 64 = 10.2 MB/s. The impact of the increased packet rate is very clear: packet latency rises. Finally, Figure 4.9d shows the extreme case where 20 devices send 16 KB packets with a frequency of 64 per second. The AP is not able to handle this much traffic and latency suffers. For less extreme scenarios WiFi performs rather well and is stable.

[Figure 4.10 placeholder: bar chart; y-axis: Traffic (MB), logarithmic scale, from 0.1 to 1k; x-axis: star size, from 2 to 20; series: rates R8, R16, R32 and R64 for each packet length 2K, 4K, 8K and 16K.]

Figure 4.10: WiFi traffic.

Figure 4.10 represents the traffic generated by the whole network in each experiment. The y-axis depicts the traffic in MB and the x-axis shows the star size. Note that the y-axis has a logarithmic scale and the bars show absolute values. The chart is divided into four main series, green, red, yellow and purple, for packet lengths of 2 KB, 4 KB, 8 KB and 16 KB, respectively. Every main series has distinct colour intensities denoting the different rates, ranging from 8 to 64 (lighter to darker colours). Compared with WiFi-Direct, the traffic generated is lower, since all the devices are directly connected to the AP (packets still do 2 hops) and the traffic that the AP handles is not accounted for. Note that the traffic that matters to us is the data that is generated and handled by the middleware.

The traffic grows linearly as the number of devices, the packet rate and the packet length increase. The packet latency is stable until a certain threshold (≈ 10 MB/s, far above the threshold for WiFi-Direct). After that, it starts to grow rapidly. So far, WiFi is the best performer, since it allows more devices to belong to the same star network and also provides higher bandwidth. The difference with respect to WiFi-Direct is the dedicated hardware that handles the traffic, in contrast with an ordinary WiFi card on a mobile device, which does not have such power.

WiFi/WiFi-Direct Star

In the final set of experiments we used a network built by composing several star networks using different technologies. The central star used WiFi with an AP. Connected to the AP were several WiFi-Direct star networks via their Group Owners (GO). The motivation behind this topology is that WiFi-Direct does not allow different groups to communicate directly with each other, which is very restrictive since groups can only hold a few devices, depending on the WiFi network card and the Android API implementation. Another problem is that each WiFi-Direct group uses the same IP network (192.168.49.0/24), so when a device wishes to establish a socket outside the group it gets "confused", since there are potentially multiple networks with the same IP range. For this experiment, we made a small optimisation to the routing algorithm so that the devices exchange their routing tables, in order to avoid the exponential growth of traffic that would overflow the routing queues. Nevertheless, another line of our work [84, 93] strongly suggests that we can take advantage of this type of network for caching and disseminating contents while relieving the traditional WiFi infrastructure.
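The addressing problem can be illustrated with Python's standard `ipaddress` module. The sketch below only shows why an IP address alone cannot identify the target group when every group uses 192.168.49.0/24; the specific host address and the helper name are assumptions made for the example.

```python
import ipaddress

# Every WiFi-Direct group uses the same subnet.
group_a = ipaddress.ip_network("192.168.49.0/24")
group_b = ipaddress.ip_network("192.168.49.0/24")

# Hypothetical address of a device in one of the groups.
target = ipaddress.ip_address("192.168.49.1")


def is_ambiguous(address, networks):
    """True when more than one network claims the address."""
    return sum(address in network for network in networks) > 1


print(group_a.overlaps(group_b))                 # True: the ranges are identical
print(is_ambiguous(target, [group_a, group_b]))  # True: both groups claim the address
```

This is why routing across groups has to rely on the middleware's logical addresses rather than on IP addresses.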

Groups with up to 3 devices. For this experiment all the devices were under the influence of a single AP. A few of the devices were directly connected to the AP (the GOs) and the others were clients of WiFi-Direct groups. Each group had at most 3 devices, including the group owner. In these experiments, all devices are, at most, 4 hops away from each other, as opposed to 2 hops away in the previous experiments. We ran a single trial for 4 hops and then isolated the packets that performed 1, 2, 3 and 4 hops and plotted the charts accordingly. Note that in the previous experiments we ran isolated trials for both 1 and 2 hops, making the experiments more time consuming.

Figure 4.11 shows the results obtained. Again, the charts differ in the packet length: 1 KB, 2 KB and 4 KB, respectively. We ran experiments for 8 KB and 16 KB as well, but the presented packet lengths are enough to show the saturation point, since more hops are being performed. The charts depict the average packet delivery latency versus the number of devices. The y-axis has a logarithmic scale. The charts present four groups of series, blue, green, red and yellow, representing packets that performed 1, 2, 3 and 4 hops, respectively. Note that all devices are, at most, 4 hops away from each other. The yellow bars have a grey section on top, used to distinguish the average of the whole experiment, ignoring the number of hops performed. The whole bar (yellow + grey) shows the average latency for packets that made 4 hops, while the yellow part alone represents the average for all packets. The series show distinct colour intensities, which depict the message rate, ranging from 16 to 64 packets per second. The lighter colours show lower rates while the darker colours show higher rates. The number of packets sent in each experiment is given by the following function: p = 30s × rate, where rate ∈ {1, 2, 4, 8, 16, 32, 64}.

We expected worse results because, on average, packets do more hops before reaching their destination. Figures 4.11a and 4.11b show reasonable results even for packets that do 4 hops. The maximum bandwidth being generated is 2 KB × 20 × 64 = 2.6 MB/s. The first signs of saturation are visible in Figure 4.11c, for an experiment with 15 devices and a packet rate of 64 (darker yellow bar), for a total bandwidth of 4 KB × 15 × 64 = 3.8 MB/s. Every extra hop adds to the latency linearly until this threshold bandwidth is reached, after which the latency grows rapidly.

Figure 4.11d represents the traffic generated by the whole network, for each experiment. The y-axis depicts the traffic in MB and the x-axis shows the network size. Note that the y-axis has a logarithmic scale and the bars show absolute values. The chart is divided into three main series, blue, green and red, for packet lengths of 1 KB, 2 KB and 4 KB, respectively. Every main series has distinct colour intensities denoting the different rates, ranging from 16 to 64 (lighter to darker colours). The bars have a grey area on top, representing the traffic overhead, which is the extra traffic generated by the routing devices. The whole bar (grey + coloured area) shows the whole traffic generated and the grey area alone depicts the overhead. Compared with the WiFi-only star, and for the same packet lengths, it is clear that the traffic generated almost doubled.

[Figure 4.11 placeholder: four bar charts, (a) 1 K, (b) 2 K and (c) 4 K with y-axis Latency (s), and (d) Traffic with y-axis Traffic (MB); logarithmic y-axes; x-axis: number of devices, from 6 to 20; latency series labelled by packet rate (R16, R32, R64) and number of hops (H1 to H4); traffic series labelled by packet rate and packet length (1K, 2K, 4K).]

Figure 4.11: WiFi/WiFi-Direct with up to 3 devices per group.

The traffic grows linearly as the number of devices, the packet rate and the packet length increase. Despite this, the packet routing latency is stable until the ≈3.8 MB/s threshold. As in any communication technology, the latency starts to grow rapidly once the threshold has been exceeded. Compared with the WiFi technology, the available bandwidth dropped to half because of the traffic generated by the routing devices, which is due to the extra number of hops performed. To take full advantage of this type of architecture the devices need to cache content and communicate as much as possible inside the group. For lower cache-hit ratios, it is preferable to go to a central server than to a device in another group. The results are nevertheless reasonable - available bandwidth and low latency for interesting configurations - and this scenario can be a good fit when there is no central server available or even when the central server is busy. Considerable improvements may be possible by using less greedy algorithms for packet routing.

Groups with up to 6 devices. Here we use the same configuration as in the previous experiment, with the exception that the WiFi-Direct groups may now have up to 6 devices, including the GO. Figure 4.12 shows the results we obtained. The charts differ in the packet length: 1 KB, 2 KB and 4 KB, respectively. The charts depict the average packet delivery latency, on the y-axis, over an increasing number of devices, on the x-axis. The y-axis has a logarithmic scale. The charts present four groups of series, blue, green, red and yellow, representing packets that performed 1, 2, 3 and 4 hops, respectively. Note that all devices are, at most, 4 hops away from each other. The yellow bars have a grey section on top, used to distinguish the average of the whole experiment, ignoring the number of hops performed. The whole bar (yellow + grey) shows the average latency for packets that made 4 hops, while the yellow part alone represents the average for all packets. The series show distinct colour intensities, which depict the message rate, ranging from 16 to 64 packets per second. The lighter colours show lower rates while the darker colours show higher rates. The number of packets sent in each experiment is given by the following function: p = 30s × rate, where rate ∈ {1, 2, 4, 8, 16, 32, 64}.

Figures 4.12a and 4.12b show that the performance is very similar to that of the previous experiment, with smaller groups. The results start diverging when more traffic is added to the network, as in Figure 4.12c, which is noticeable for all bars of rate 64 (darker colours). The reason behind this is that, in the previous experiment, there were more devices 3 hops away and fewer 4 hops away (more groups), while in this experiment we have the opposite (fewer groups, each with more devices). Thus, this experiment performs worse than the previous one because, for the same amount of generated traffic, more packets travel 4 hops than 3 hops. In this experiment there are also more devices 2 hops away, but these are not enough to compensate for the performance penalty of those that are 4 hops away.

Figure 4.12: WiFi/WiFi-Direct with up to 6 devices per group. Panels: (a) 1 K, (b) 2 K and (c) 4 K, with Latency (s) on the y-axis; (d) Traffic, with Traffic (MB) on the y-axis. Bar labels combine the rate (R16, R32, R64) with the hop count (H1 to H4) or the packet length (1K, 2K, 4K).

Figure 4.12d represents the traffic generated by the whole network in each experiment. The y-axis depicts the traffic in MB and the x-axis shows the network size. Note that the y-axis has a logarithmic scale and the bars show absolute values. The chart is divided into three main series, blue, green and red, for packet lengths of 1 KB, 2 KB and 4 KB, respectively. Each main series has distinct colour intensities denoting the different rates, ranging from 16 to 64 (lighter to darker colours). The bars show a grey area on top representing the traffic overhead, which is the extra traffic generated by the routing devices. The whole bar (grey + coloured area) shows the total traffic generated, of which the grey area is the overhead. The amount of traffic generated is similar to that of the previous experiment.

As previously mentioned, hierarchical WiFi networks are not a good fit for common use. To take advantage of this type of network, the devices that are part of groups need to cache content in order to provide it later to group neighbours, thus reducing the load on the WiFi infrastructure and on the central server. Furthermore, this kind of architecture can take advantage of WiFi-TDLS, which allows devices on the same WiFi network to communicate directly on a spare channel. Another aspect that has a huge impact on network performance is the quality of the group owners (better WiFi card and CPU), since the devices are heterogeneous [84, 93].

4.2.3 Network Formation

In this section we evaluate network formation for distinct technologies. The goal of these experiments is to understand how the formation process behaves under different scenarios and to measure its impact on performance. The network topology employed for Bluetooth and WiFi-Direct was the star topology, in which distinct groups are not reachable from each other. In contrast, for the experiment with composed WiFi and WiFi-Direct stars, distinct groups are able to communicate with each other, forming a hierarchical star. We experimented with different network sizes, namely 5, 10, 15 and 20 devices. The formation algorithms used were previously presented in Section 3.3.4.

In our experiments we measure several metrics, namely: the network formation time, the time taken by each formation action and, when applicable, the number of groups formed. The network formation time provides an insight into how much time each device takes to join a network. Forming a network involves several actions, and the time for each is evaluated separately. Finally, the number of groups is studied using as a reference the best group setting, i.e., the minimum number of groups. The experiments are divided into 4 categories, and for each category the experiments were performed eight times. The categories reflect the time intervals that devices must wait before trying to join the network. We experimented with intervals of 0, 1, 2 and 4 seconds. With an interval of 0, all devices try to join at the same time. These experiments were also controlled using a Python script with a similar flow:

• app activity and middleware boot;

• network formation starts;

• network formation ends;

• middleware and app shutdown;

• log gathering.
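The flow listed above could be organised roughly as follows. This is a minimal sketch and not the script actually used: the step names, the `actions` callables and the helper functions are hypothetical, the staggering rule (device i waits i × delay) is only one plausible reading of the join intervals, and the campaign size assumes that every (network size, delay) pair was repeated eight times.

```python
from typing import Callable, Dict, List, Tuple

# Step names mirror the bullet list above; the identifiers are hypothetical.
STEPS = (
    "boot",             # app activity and middleware boot
    "formation_start",  # network formation starts
    "formation_end",    # network formation ends
    "shutdown",         # middleware and app shutdown
    "gather_logs",      # log gathering
)


def join_offsets(n_devices: int, delay_s: int) -> List[int]:
    """Start offset per device; a delay of 0 makes all devices join together."""
    return [i * delay_s for i in range(n_devices)]


def plan_runs(sizes=(5, 10, 15, 20), delays=(0, 1, 2, 4),
              repetitions=8) -> List[Tuple[int, int, int]]:
    """Enumerate (network size, delay, repetition) for one technology."""
    return [(n, d, r) for n in sizes for d in delays for r in range(repetitions)]


def run_once(actions: Dict[str, Callable[[], None]]) -> List[str]:
    """Execute the steps in order and return the names of those completed."""
    done = []
    for step in STEPS:
        actions[step]()
        done.append(step)
    return done
```

Under these assumptions, one technology requires 4 × 4 × 8 = 128 runs.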

The following sections describe the results for the different network scenarios.

Bluetooth Star

Figure 4.13a depicts the amount of time that each device takes, on average, to become part of the network, on the y-axis (seconds), for increasing numbers of devices, on the x-axis. The chart shows four categories, green, yellow, red and blue, denoting the delay before each device enters the network: 0, 1, 2 and 4 seconds, respectively. The results show that the algorithm takes more time to adjust when all devices enter at the same time than when there is some delay between them. As more devices are added, the median time for a device to join the network remains practically the same and the dispersion of times (quartiles 1, 2 and 3) is very similar, suggesting no big impact as more devices join the network. The negative side of having more devices in the network seems to be more outliers (or jitter), i.e., devices that take more than 1.5 IQR (Inter Quartile Range) above the third quartile to become part of the network.
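The quartiles and outliers discussed above can be computed as in the following sketch, which assumes the usual box-plot convention (values beyond 1.5 IQR from the first or third quartile are outliers). The function name and the sample data are hypothetical; the quartile interpolation method used for the charts is not stated, so Python's default is used here.

```python
import statistics
from typing import List, Tuple


def tukey_outliers(times_s: List[float]) -> Tuple[float, float, float, List[float]]:
    """Return (Q1, median, Q3, outliers) for a list of join times in seconds."""
    q1, q2, q3 = statistics.quantiles(times_s, n=4)  # three quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [t for t in times_s if t < low or t > high]
    return q1, q2, q3, outliers


if __name__ == "__main__":
    # Made-up join times: most devices join quickly, one takes much longer.
    print(tukey_outliers([10, 11, 12, 12, 13, 13, 14, 40]))
```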

Figure 4.13: Bluetooth star formation. Panels: (a) Formation time, with Time (s) over # Devices (5, 10, 15, 20) and series D0, D1, D2, D4; (b) Time per action, with Time (s) and series F0, F1, F2, F4, C0, C1, C2, C4, S0, S1, S2, S4; (c) Groups, with # Number over # Devices and series G0, G1, G2, G4 and Best Conf.

Figure 4.13b shows the average time that each device spends on each action, on the y-axis, for increasing numbers of devices, on the x-axis; for example, the time spent on discovery and the time spent on establishing connections. The actions are divided into three main categories: connection failure (red series), connection success (green series) and discovery (blue series). It is visible that when the devices enter the network at the same time they spend more time on the discovery process, while in the other experiments, with delay, the times for connection success and discovery are similar. Every discovery action takes, at most, five seconds to run. Thus, higher values imply that the discovery actions were performed multiple times. It can be observed that connection failures decrease when using longer delays, positively impacting the time it takes for a device to join the network.
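The reasoning about repeated discovery can be made explicit with a small calculation: since one discovery action lasts at most five seconds, a measured discovery time gives a lower bound on the number of discovery actions performed. The helper below is a hypothetical illustration and assumes the measured time contains discovery actions only.

```python
import math

MAX_DISCOVERY_S = 5.0  # upper bound for one discovery action, from the text


def min_discovery_rounds(total_discovery_s: float,
                         max_round_s: float = MAX_DISCOVERY_S) -> int:
    """Lower bound on the number of discovery actions behind a measured time."""
    return math.ceil(total_discovery_s / max_round_s)
```

For instance, 12.5 seconds spent on discovery implies at least three discovery actions.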

Figure 4.13c represents the number of groups generated in each experiment. The y-axis shows the number of groups and the x-axis the number of devices. As in the previous chart, each column represents the time delay for each device to enter the network. There is an extra column (in red) that represents the optimal number of groups that should be formed with the available devices. The results show that the number of groups is lower, and nearer to the optimal configuration, when the delay increases. Note that these multiple groups do not communicate with each other; that would require another network formation algorithm. To conclude, if the devices enter the network with a certain delay (one second is enough), the proposed algorithm behaves well: it takes less time to form a network and also generates fewer groups. Figures 4.13a, 4.13b and 4.13c clearly show the positive impact of such a delay.

WiFi-Direct Star

This experiment is very similar to the previous one, but uses WiFi-Direct instead of Bluetooth. Figure 4.14a depicts the amount of time that each device takes, on average, to become part of the network, on the y-axis (seconds), over an increasing number of devices, on the x-axis. The chart shows four categories, green, yellow, red and blue, denoting the delay before each device enters the network: 0, 1, 2 and 4 seconds, respectively. As in the Bluetooth experiment, the results show that the formation algorithm takes more time to adjust when all devices enter together than when there is a certain delay. As more devices are added, the median time for a device to join the network remains practically the same and the dispersion of times (quartiles 1, 2 and 3) is very similar, suggesting no big impact as more devices join the network. The negative side of having more devices in the network seems to be more outliers, i.e., devices that take more than 1.5 IQR (Inter Quartile Range) above the third quartile to become part of the network.

Figure 4.14: WiFi-Direct star formation. Panels: (a) Formation time, with Time (s) over # Devices (5, 10, 15, 20) and series D0, D1, D2, D4; (b) Time per action, with Time (s) and series F0, F1, F2, F4, C0, C1, C2, C4, S0, S1, S2, S4; (c) Groups, with # Number over # Devices and series G0, G1, G2, G4 and Best Conf.

Figure 4.14b shows the average time that each device spent on each action, on the y-axis, for increasing numbers of devices, on the x-axis; for example, the time spent on discovery and the time spent on connections. The actions are grouped into three main categories: connection failure (red series), connection success (green series) and discovery (blue series). It is visible that when the devices enter the network at the same time they spend more time on the discovery process, while in the other experiments, with delay, the times for connection success and discovery are similar. Every discovery action takes, at most, five seconds to run. Thus, higher values mean that the discovery actions were performed multiple times. Compared with Bluetooth, it is noticeable that the actions, in general, take more time to execute. It can be observed that connection failures decrease as the delays increase, and the total amount of time is easily mapped to the median time in the chart of Figure 4.14a.

Figure 4.14c represents the number of groups generated in each experiment. The y-axis shows the number of groups and the x-axis the number of devices. As in the previous chart, each column represents the time delay for each device to enter the network. There is an extra column (in red) that represents the minimum number of groups that should be formed with the available devices. The results show that the number of groups is lower, and nearer to the optimal configuration, when the delay increases. To conclude, the number of groups created is similar to that of Bluetooth; however, WiFi-Direct network formation takes more time than with Bluetooth, which is explained by the extra time required in the discovery and connection processes. The proposed algorithm is the same, however, and thus the adopted technologies must be the major factor in the results. As above, the effect of delays in network formation is readily visible in Figures 4.14a, 4.14b and 4.14c.

WiFi/WiFi-Direct Star

In this experiment, the algorithm used for network formation is different, since the network is a composition of sub-networks running on different technologies. Otherwise, the experiments were done in a similar fashion. Figure 4.15a depicts the amount of time that each device takes, on average, to become part of the network, on the y-axis (seconds), over an increasing number of devices, on the x-axis. The chart shows four categories, green, yellow, red and blue, denoting the delay before each device enters the network: 0, 1, 2 and 4 seconds, respectively. Once again, the results suggest that introducing a small delay is advantageous, since it allows each device to become part of the network quickly. For network sizes of 15 and 20 devices, the time taken with delay 0 is twice that of the other experiments, with delays 1, 2 and 4. For the experiments with delays 1, 2 and 4 the results are very similar, with the same time dispersion (quartiles 1, 2 and 3). Also, as in the previous scenarios, as more devices are added to the network the number of outliers increases.

Figure 4.15: WiFi/WiFi-Direct star formation. Panels: (a) Formation time, with Time (s) over # Devices (5, 10, 15, 20) and series D0, D1, D2, D4; (b) Time per action, with Time (s) and series F0, F1, F2, F4, C0, C1, C2, C4; (c) Groups, with # Number over # Devices and series G0, G1, G2, G4 and Best Conf.

Figure 4.15b shows the time it took to perform each formation action, on the y-axis. The x-axis shows the network size. In this experiment there is no discovery phase, because all the devices passively listen for broadcast messages and act accordingly. Despite this, it is possible to see the relation between the connection times and the median times in the chart of Figure 4.15a, with connection times for experiments with delay 0 being higher than with the other delays. At the same time, the time spent on connection failures is much higher, mainly for network sizes of 15 and 20 devices. As before, connection failures decrease as the time delays per device increase, resulting in a positive impact on the time to join the network.

Figure 4.15c represents the number of groups generated in each experiment. The y-axis shows the number of groups and the x-axis the number of devices. As in the previous chart, each column represents the time delay for each device to enter the network. There is an extra column (in red) that represents the optimal number of groups that should be formed with the available devices. Due to the distributed consensus used for forming WiFi-Direct networks, the number of groups created is close to optimal. For example, when a group is full another device is elected to be group owner; this new group is then filled, after which yet another device is elected group owner, and so on. When a group is full, the group owner rejects new connections.
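The fill-then-elect behaviour described above, and the reference value for the minimum number of groups, can be mimicked with the idealised sketch below. It is not the consensus protocol of Section 3.3.4: devices join strictly one after another, the election is reduced to "the next device becomes group owner", and the group capacity is a free parameter (the value 6, owner included, is borrowed from the routing experiment purely as an example).

```python
import math
from typing import List


def best_conf(n_devices: int, group_capacity: int) -> int:
    """Minimum number of groups able to hold all devices."""
    return math.ceil(n_devices / group_capacity)


def fill_groups(n_devices: int, group_capacity: int) -> List[List[int]]:
    """Fill one group at a time; a full group rejects further devices."""
    groups: List[List[int]] = []
    for device in range(n_devices):
        if not groups or len(groups[-1]) == group_capacity:
            groups.append([device])    # this device is elected group owner
        else:
            groups[-1].append(device)  # joins the group that is still open
    return groups
```

With sequential joins the sketch always reaches the minimum, e.g. 4 groups for 20 devices with capacity 6; the measured results are close to, rather than exactly at, this bound.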

4.3 Discussion

In this chapter we evaluated the link and network layers.

We measured the resource usage of Bluetooth- and WiFi-based technologies, used as the building blocks for the link layer. We confirmed that Bluetooth and Bluetooth LE, though very limited in terms of bandwidth, are very energy efficient. They are suitable for apps that exchange data at low rates, e.g., chat or sensing apps. In contrast, the WiFi-based technologies consume more energy but provide more bandwidth and lower latency, compensating for the extra energy expenditure if the application scenario demands it, e.g., HD video streaming or photo sharing.

From the point of view of the network layer, we evaluated packet routing and observed that every technology has a bandwidth threshold above which the latency grows rapidly. For example, Bluetooth handles traffic well up to 120 KB/s, while WiFi can handle traffic up to 10 MB/s. We have also shown that communication between multiple WiFi-Direct groups is possible, using an intermediate WiFi AP, with a reasonable latency and a bandwidth threshold of 3.8 MB/s. For larger networks the routing algorithm - scoped flooding - would require a few optimisations in order to prevent too much redundant traffic from flowing.
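To give a feeling for these thresholds, the following sketch compares a payload-only offered load against them. It is a back-of-the-envelope aid, not a model of the experiments: it assumes 1 KB = 1024 bytes, ignores the routing overhead measured in Chapter 4, and the dictionary keys and function names are hypothetical.

```python
KB, MB = 1024, 1024 * 1024  # unit convention assumed for this sketch

# Thresholds quoted in the text, converted to bytes per second.
THRESHOLDS = {
    "bluetooth": 120 * KB,
    "wifi": 10 * MB,
    "wifi+wifi-direct": 3.8 * MB,
}


def offered_load(devices: int, rate_pps: int, packet_len_bytes: int) -> int:
    """Payload generated per second by all devices (no routing overhead)."""
    return devices * rate_pps * packet_len_bytes


def below_threshold(technology: str, devices: int, rate_pps: int,
                    packet_len_bytes: int) -> bool:
    """True if the payload-only load stays within the quoted threshold."""
    return offered_load(devices, rate_pps, packet_len_bytes) <= THRESHOLDS[technology]
```

For example, 20 devices sending 64 packets of 4 KB per second generate 5 MB/s of payload, which is within the WiFi threshold but above the 3.8 MB/s observed for the WiFi/WiFi-Direct composition.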

We also analysed the overhead of network formation, measuring the time that a device takes to join a network and the number of groups created. We conclude that introducing a small delay between consecutive join requests improves the average formation time. Furthermore, it is clear that device discovery is the most time-consuming phase during formation. This seems to be because multiple discovery attempts must be performed, owing to connection failures or to empty discovery results. In terms of the number of groups formed, Bluetooth and WiFi-Direct produced roughly the same, sub-optimal, number of groups. The combination of WiFi and WiFi-Direct, on the other hand, always created a near-optimal number of groups.

From these observations, and given the fact that the middleware has not been systematically optimised, we believe that some improvements can be introduced such as: using less greedy routing algorithms to save bandwidth; taking advantage of the best qualities of distinct wireless technologies for network formation (e.g., Bluetooth + WiFi); and, for more specific scenarios, using WiFi-TDLS to offload the traffic into spare channels.

Chapter 5

Other Contributions

Besides the design, implementation and evaluation of the Hyrax middleware, as described in the previous 2 chapters, we invested a considerable amount of work in assessing the performance of available wireless technologies in mobile networks, namely WiFi-Direct and Bluetooth, and designed and implemented an edge-cloud application for video dissemination. An initial version of this application used the Android API for device-to-device communication. Later, the application was ported to the Hyrax middleware and used in two real-world experiments, one of them in a sports venue during a professional volleyball game. This work is described in Sections 5.1, 5.2 and 5.3. Finally, the development of the middleware allowed several services and applications to be implemented by other members of the project team using the provided API. In Section 5.4 we describe these projects, as they provide the required closure to the middleware, namely by providing basic services for layer 3. Besides the middleware software, we also provided occasional support to team members and implemented additional features deemed important.

5.1 Wireless Technology Assessment

Mobile devices are now equipped with multi-core processors, multi-GB memory and multiple communication interfaces. Simultaneously, new standards and protocols, such as WiFi-Direct and WiFi-TDLS (Tunneled Direct Link Setup), have been established that allow mobile devices to talk directly with each other, as opposed to over the Internet or across WiFi access points. This can, potentially, lead to ubiquitous, low-latency, device-to-device (D2D) communication.


Part of our preliminary work on the Hyrax project consisted of evaluating the performance of different wireless technologies and protocols for a specific application on non-rooted devices. The technologies in question were WiFi, WiFi-TDLS and WiFi-Direct. The application chosen was peer-to-peer video dissemination, which was of interest as it had direct consequences for the “User Generated Replays” scenario envisioned in the project's proposal. We defined a set of content dissemination scenarios for downloading video replays from a soccer game in a stadium (Figure 5.1). In all scenarios we assume WiFi is used for communication. In the case of pure WiFi, all scenarios include a traditional, dedicated access point that the devices connect to and use for communication with other devices, the access point working as a relay. When using WiFi with TDLS, the access point is used only for the initial contact between the nodes. In the case of WiFi-Direct, this role is performed by the group owner.

Figure 5.1: Benchmark Scenarios (from [85]). Panels: (a) WiFi Server; (b) WiFi M. Server; (c) WiFi-TDLS; (d) WiFi-Direct.

In the first scenario (Figure 5.1a) we have a main server acting as the source of the video replays. The server is connected to an access point and all the clients are mobile. In the second scenario (Figure 5.1b) the central server is removed, keeping just the access point, with the servers and the clients all being mobile. In the third scenario (Figure 5.1c) the access point is used for the first contact between the devices, after which TDLS allows the devices to communicate directly and transfer data. Again, the servers and the clients are all mobile, under the same access point. Finally, the fourth scenario (Figure 5.1d) is similar to the second one, but the access point is replaced by one of the devices acting as a hot-spot, the group owner. All communication between mobile servers and clients must go through this device.

The experiments were set up so as to guarantee that all the content requested by the clients during the runs exists in all servers. In each experiment, clients were required to download 20 video files, each 3 MB in size, from the servers. Before starting the transfer, each client computes a random permutation of the 20 file names using a uniform distribution, to even out the requests for each individual file during an experiment. Accesses were also performed randomly in time, as each client waited a random time interval, within given bounds, before requesting the next file of the sequence. For each scenario, we ran a set of experiments with a varying number of mobile servers/clients. Each experiment was repeated 8 times to smooth out statistical flukes. We ran the experiments in two phases. In the first there was one server and an increasing number of clients (1 to 16). In the second phase the number of clients was fixed (12 clients) as the number of servers grew (1 to 8).

For the results, we measured the average download time per file, the traffic handled by the AP and the average energy consumed per experiment. We found that energy consumption is strongly related to the use of the wireless radio, due to packet loss and collisions, which lead to retransmissions of the same information. We also found that the experiment with one central server outperforms the other scenarios with one mobile server plus other wireless technologies. In the multiple-server scenarios, however, WiFi-TDLS outperforms the other technologies by far, because it manages to reuse the spare channels in the spectrum (2.4 GHz and 5 GHz) to perform parallel downloads under the same AP. We also observed that the traffic handled by the AP was reduced by 65%, which leads us to conclude that D2D communication in mobile networks can significantly remove load from the AP [85].
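The client request pattern described above (a uniformly random permutation of the 20 file names, with a random pause before each request) can be sketched as follows. The file names, the function name and the choice of a uniform distribution for the waiting time are assumptions made for the example; the text only states that the wait lies within given bounds.

```python
import random
from typing import List, Tuple

N_FILES = 20  # number of video files per client, from the text


def request_schedule(rng: random.Random, min_wait_s: float, max_wait_s: float,
                     n_files: int = N_FILES) -> List[Tuple[str, float]]:
    """Return (file name, wait before request) pairs in random order."""
    names = [f"video_{i:02d}.mp4" for i in range(n_files)]  # hypothetical names
    rng.shuffle(names)  # uniformly random permutation
    return [(name, rng.uniform(min_wait_s, max_wait_s)) for name in names]


if __name__ == "__main__":
    for name, wait in request_schedule(random.Random(42), 0.5, 2.0):
        print(f"wait {wait:.2f} s, then request {name}")
```

Passing an explicit `random.Random` instance keeps each client's sequence reproducible across the repeated runs.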

5.2 User Generated Replays: Part I

After the assessment of the wireless technologies, we designed and implemented an Android application that allowed users to download and visualise replays from sports games on their mobile devices. We then performed a real-world experiment to try to validate some of the results mentioned in the previous section. For the experiment, we asked students at the faculty's student lounge to watch a Champions League game between S. L. Benfica and Besiktas J. K. To encourage the students to attend, we offered pizzas and beverages during half-time. To participate in the experiment, the students needed to install our Android application on their smartphones or use one of the pre-installed tablets available at the lounge.

We wanted to find out whether there is an advantage in caching the content locally, in order to disseminate it faster to nearby devices. Some of the devices organised themselves dynamically in ad-hoc networks (WiFi-Direct or WiFi-TDLS) while others were just connected to the APs. Our aim was to provide a different network configuration in which devices could communicate on channels other than that of the AP, consequently increasing the throughput. Under the same AP, groups of devices were formed in which group owners acted as access points.

Figure 5.2: Cloud to Edge Architecture (from [93]).

To feed the application with videos, part of our team had access to the TV video stream of the game and built a script to extract interesting pieces from it, the replays. After a replay was generated, it was uploaded to a central server that was responsible for creating the metadata for the video, e.g., name, legend, length, thumbnail. This metadata was then actively disseminated to the devices so that they were aware of the availability of the video. This process was repeated several times during the experiment. The users used the app to check for video publications by the server and to select a subset for download and visualisation. Video content dissemination followed a very simple pattern: a device first asked local devices for a copy and then tried the central server if no local copy was available.
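The dissemination pattern just described (local devices first, central server as fallback) amounts to the lookup order sketched below. Peers and the server are modelled as plain dictionaries from video identifier to content; this representation and the function name are hypothetical and stand in for the actual network requests made by the app.

```python
from typing import Dict, Iterable, Optional, Tuple

Cache = Dict[str, bytes]


def fetch_video(video_id: str, local_peers: Iterable[Cache],
                server: Cache) -> Tuple[Optional[bytes], str]:
    """Return (content, source), trying nearby devices before the server."""
    for peer_cache in local_peers:
        if video_id in peer_cache:
            return peer_cache[video_id], "edge"
    if video_id in server:
        return server[video_id], "server"
    return None, "missing"
```

A download counted "at the edge" in the results below corresponds to the first branch.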

In these experiments we observed that the edge-cloud was able to serve up to 80% of connected users and to provide 56% of all downloads requested from within. An average of 9.7 groups were active during the game, operating at an average capacity of 30%. Moreover, downloads at the edge were 3 times faster than those made through the AP, indicating that the edge-cloud enhances quality of service [93].

5.3 User Generated Replays: Part II

The next step in this work involved an evolution of the aforementioned app so that users were capable not only of downloading video replays but also of capturing their own videos and publishing them in the network by sending the corresponding metadata. Also, the mobile networks were now capable of functioning completely untethered, without infrastructure support. This can happen locally, e.g., within WiFi-Direct groups, or in larger networks, in this case with the help of a second network tier composed of modest cloudlet servers organised in a dynamic peer-to-peer mesh that synchronise their contents on-the-fly. Besides caching the videos published by devices, these servers also work as access points for devices and WiFi-Direct groups within a given spatial region.

Figure 5.3: The Edge Cloud Architecture (from [84]). Diagram labels: Mesh Network; Wifi Network; Wifi Direct Network.

Figure 5.3 shows the 2-tier architecture we used for this evolution. First, in tier 1, we have a mesh of cloudlet servers (in this case Raspberry Pi devices) that actively cache video contents published by tier-2 devices. They feature 2 wireless network interfaces: one to support the cloudlet mesh, the other to provide access points to tier 2 devices. The cloudlet servers actively synchronise their local video caches so that local videos can be accessed by devices under remote cloudlet servers. This is done on a best-effort basis, as no attempt is made to provide any strong form of consistency between the contents of the caches at each cloudlet server since eventual consistency will be attained. Mobile devices in tier 2 can form WiFi-Direct groups and use them to disseminate local contents, or they can connect directly to a cloudlet server. Each device also features a cache for holding the videos it generates and publishes plus the videos it downloads from other devices or cloudlet servers.
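The best-effort synchronisation between cloudlet caches can be pictured as a pairwise exchange of missing entries, repeated until nothing changes. The sketch below is a simplification under stated assumptions: videos are immutable and identified by a unique key, caches are plain dictionaries, mesh links are given as name pairs, and all names are hypothetical. It is not the synchronisation protocol used in [84].

```python
from typing import Dict, List, Tuple

Cache = Dict[str, bytes]


def sync_pair(a: Cache, b: Cache) -> int:
    """Copy entries missing on either side; return the number transferred."""
    to_b = {k: v for k, v in a.items() if k not in b}
    to_a = {k: v for k, v in b.items() if k not in a}
    b.update(to_b)
    a.update(to_a)
    return len(to_a) + len(to_b)


def sync_round(caches: Dict[str, Cache], links: List[Tuple[str, str]]) -> int:
    """One best-effort pass over all mesh links; 0 means caches have converged."""
    return sum(sync_pair(caches[x], caches[y]) for x, y in links)
```

Because entries are only ever added, repeating `sync_round` over a connected mesh eventually leaves every cloudlet with the same set of videos, which is the eventual consistency mentioned above.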

The software was validated experimentally in the real-world setting of a Portuguese league volleyball game. The edge-cloud was composed of mobile devices, possibly organised in WiFi-Direct groups, that produced and consumed videos, and a mesh of three Raspberry Pi cloudlets that cached and disseminated the videos produced by the devices. The experiment (Figure 5.4) showed that the edge-cloud was sufficiently robust to provide videos to tens of users with low latencies. Moreover, the multiple caching levels at devices, WiFi-Direct groups and cloudlets made it resilient to device or cloudlet failures. In particular, we illustrated that an unexpected long-term fault in one of the cloudlets could be compensated through the combined effects of caching and churn for opportunistic content sharing. The role of the WiFi-Direct groups was especially relevant in this overall picture. They significantly offloaded traffic from the mesh infrastructure, by involving 49% of devices on average and serving 43% of the downloads issued by devices in such groups. Moreover, we used the churn induced by the natural movement of devices in the venue to disseminate contents opportunistically, allowing devices to publish their videos to neighbouring edge-clouds as they enter them [84].

Figure 5.4: Deployment at Nave Desportiva de Espinho (from [84]).

5.4 Edge-Cloud Services and Apps

Several services and apps have been implemented using the Hyrax middleware or a subset of its API. The services are part of layer 3 of the middleware and provide distributed computing, storage and publish/subscribe APIs for user applications. Some applications, e.g., Ramble, have been developed that plug into these APIs for more agile development.

5.4.1 Distributed Computing

P3-Mobile [89] is a service for opportunistic, best-effort parallel computing, based on a prior version of the P3 system (Parallel Peer-to-Peer) [70] for parallel computation, developed for traditional desktop/server-based cabled networks. The system builds a peer-to-peer hierarchical overlay on top of WiFi or WiFi-Direct networks. Computational tasks are then assigned for parallel execution as new devices join the overlay, and the scheme is also fault-tolerant when devices leave without notice. A snapshot mechanism keeps track of the completed portion of the workload for a given task. Thus, the workload of the computation is reconfigured on-the-fly in a churn-tolerant manner. P3-Mobile also implements a simple distributed key-value store that can be used as a communication channel between devices, e.g., for reporting the results of subtasks. An early proof-of-concept application was implemented on top of P3-Mobile to assess best-case performance. The example chosen was the “embarrassingly parallel” Mandelbrot set computation. Substantial speedups were observed for this application, tested with up to 16 Android devices. Scaling up performance with network size is, however, far more difficult in the general case, where parallel computations contain multiple synchronisation points. Thus, such computing services are obviously not to be seen as alternatives to powerful servers but rather as opportunistic platforms that can, occasionally, improve application response time by crowd-sourcing processor and memory resources and taking advantage of data locality.
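To illustrate why the Mandelbrot computation is embarrassingly parallel, and how a record of completed work lets a computation survive churn, consider the single-process sketch below. It does not use P3-Mobile's API: rows are independent tasks, the `snapshot` dictionary merely imitates the idea of tracking the completed portion of the workload, and the image size and iteration limit are arbitrary choices.

```python
from typing import Dict, Iterable, List

WIDTH, HEIGHT, MAX_ITER = 32, 16, 50  # arbitrary sizes for the example


def mandelbrot_row(y: int) -> List[int]:
    """Escape-time counts for one image row; rows do not depend on each other."""
    row = []
    for x in range(WIDTH):
        c = complex(-2.0 + 3.0 * x / WIDTH, -1.0 + 2.0 * y / HEIGHT)
        z, n = 0j, 0
        while abs(z) <= 2.0 and n < MAX_ITER:
            z = z * z + c
            n += 1
        row.append(n)
    return row


def compute(rows: Iterable[int], snapshot: Dict[int, List[int]]) -> None:
    """Compute only the rows not yet recorded in the snapshot."""
    for y in rows:
        if y not in snapshot:
            snapshot[y] = mandelbrot_row(y)


if __name__ == "__main__":
    snapshot: Dict[int, List[int]] = {}
    compute(range(0, 8), snapshot)       # a device leaves after half the rows
    compute(range(HEIGHT), snapshot)     # the remaining rows are picked up later
    print(len(snapshot), "rows completed")
```

Since no row depends on another, the only coordination needed is agreeing on which rows are done, which is what makes this a best-case scenario.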

Panoptic [36] is an app that seamlessly combines three different domains: edge computing, computer vision and security. The idea is to demonstrate the possibility of running heavyweight algorithms on devices without depleting the device's battery, while at the same time enforcing a privacy layer that minimises information leakage. The application aims to help the authorities search for missing people, following the principles of the Amber Alert system, where the authorities emit an emergency broadcast. It was designed with the new GDPR (General Data Protection Regulation) in mind, in which the authors pursue data minimisation, user empowerment and accountability/traceability. The devices are connected through WiFi or WiFi-Direct groups and each of them has image galleries. The objective is for the computer vision algorithms to run on every device and identify a missing person. The approach can be invasive; thus, to minimise potential privacy breaches, this service uses encryption techniques and secure protocols and minimises the amount of sensitive data that is transferred. Apart from the mobile devices, the system uses a trusted cloudlet that validates every search request in order to prevent potential misuses of the system, e.g., by stalkers.

5.4.2 Distributed Storage

Ephesus [91] is a distributed file storage service for mobile edge-clouds. It allows users to share files without requiring infrastructural communications, through a key-value store API implemented on top of a distributed hash table (DHT). To cope with churn, the service implements a best-effort approach to data consistency and persistence. The built-in mechanisms adapt to the popularity of data items, e.g., more frequently requested items are prioritised for replication among peers. Ephesus is also energy-aware, making a device disengage from the DHT when its battery level goes below a certain threshold. The system has been demonstrated with a shared photo gallery application (e.g., for use at parties) in which users take photos and share them with others.
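The two adaptive behaviours described above, popularity-driven replication and energy-aware disengagement, can be sketched as follows. This is an illustration under stated assumptions, not the Ephesus design: `Peer`, `EdgeStore`, the 0.2 battery threshold and the `replicate_popular` policy are hypothetical, and a simple hash-modulo placement over the engaged peers stands in for real DHT routing.

```python
import zlib
from collections import Counter


class Peer:
    """Hypothetical device holding part of the key-value store."""

    def __init__(self, name, battery, threshold=0.2):
        self.name = name
        self.battery = battery      # fraction in [0, 1]
        self.threshold = threshold  # assumed value, not from the source
        self.store = {}

    def engaged(self):
        # A peer below the battery threshold no longer takes part.
        return self.battery >= self.threshold


class EdgeStore:
    """Best-effort key-value store over the currently engaged peers."""

    def __init__(self, peers):
        self.peers = peers
        self.requests = Counter()   # key -> number of get() requests

    def active(self):
        return [p for p in self.peers if p.engaged()]

    def put(self, key, value):
        peers = self.active()
        if not peers:
            raise RuntimeError("no engaged peers")
        # Stand-in for DHT placement: stable hash modulo peer count.
        home = peers[zlib.crc32(key.encode()) % len(peers)]
        home.store[key] = value

    def get(self, key):
        self.requests[key] += 1
        for p in self.active():
            if key in p.store:
                return p.store[key]
        return None                 # best effort: the item may be gone

    def replicate_popular(self, top=1, copies=2):
        """Copy the `top` most requested items to up to `copies` peers."""
        for key, _ in self.requests.most_common(top):
            holders = [p for p in self.active() if key in p.store]
            if not holders:
                continue
            value = holders[0].store[key]
            for p in self.active():
                if len(holders) >= copies:
                    break
                if key not in p.store:
                    p.store[key] = value
                    holders.append(p)
```

In this sketch a frequently requested photo survives the departure of one holder because it was replicated, whereas an item that was never requested keeps a single copy and may become unavailable, which is consistent with a best-effort approach to persistence.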

5.4.3 Publish/Subscribe

Thyme [92] is a time-aware publish/subscribe service for mobile edge-clouds. It follows the typical operation of a publish/subscribe messaging system, but is time-aware in the sense that subscriptions have an associated time interval, i.e., they are active only between specified start and end times. Geographical routing algorithms are employed, with devices organised in spatial clusters, allowing messages to be routed without knowledge of the network topology or a priori route discovery. The point is to avoid the network communication overheads of standard routing protocols, and instead rely on the fact that device location has a strong connection to network topology in mobile edge-clouds.
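The notion of a time-aware subscription can be made concrete with a small sketch. It only models the matching rule (a subscription is active between its start and end times); the names `Subscription` and `Broker` are hypothetical, a single in-memory broker replaces the distributed service, and geographical routing over spatial clusters is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    """A subscription that is active only within [start, end]."""
    subscriber: str
    tag: str
    start: float
    end: float

    def matches(self, tag, published_at):
        return tag == self.tag and self.start <= published_at <= self.end


class Broker:
    """Hypothetical centralised stand-in for the publish/subscribe service."""

    def __init__(self):
        self.subscriptions = []

    def subscribe(self, subscription):
        self.subscriptions.append(subscription)

    def publish(self, tag, timestamp):
        """Return the subscribers notified of an item published at `timestamp`."""
        return [s.subscriber for s in self.subscriptions
                if s.matches(tag, timestamp)]
```

For example, with one subscription to a tag over the interval 10 to 20 and another over 15 to 30, an item published at time 18 reaches both subscribers, while one published at time 25 reaches only the second.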

Ramble [38] is an application for opportunistic generation, dissemination and visualisation of geo-tagged user-generated content. Ramble makes use of mobile edge-clouds enabled by WiFi-Direct, lightweight proximity cloudlets that act as repositories and disseminators of user-collected data in small areas, plus, optionally, traditional cloud servers accessible via the Internet. The peers are connected through WiFi-Direct (for D2D communication), WiFi (device-to-cloudlet), mesh networking (cloudlet-to-cloudlet), and WiFi or 3G/4G Internet if available (for cloud server communication). The main goal of the system is for user-generated content to be disseminated during fortuitous connections among devices or between devices and cloudlets or cloud servers, as we envision that connections will be volatile as users roam in a reasonably large space. The system is symmetric in the sense that each peer may act as a content generator, consumer, cache, and disseminator, with no pre-established hierarchy in terms of functionality. The main functionalities are local storage (SQLite), location-awareness (GPS), dynamic network formation (WiFi-Direct + B.A.T.M.A.N.), service discovery and content synchronisation. The motivational scenario was the gathering of local intelligence in the aftermath of catastrophic events (e.g., tsunamis, earthquakes, landslides, hurricanes), but far more (candid) applications can be of interest (e.g., exchange of information between roaming tourists in a city).

Chapter 6

Conclusions

Overview. In this dissertation we study the problem of crowd-sourcing the computational and storage resources of networks of mobile devices at the edge of the Internet into local computational infrastructures called edge-clouds. These take advantage of data locality and of low-latency, high-bandwidth wireless technologies to provide services such as opportunistic distributed computing, distributed storage, publish/subscribe and data streaming. In particular, we identify specific application scenarios for which such an edge infrastructure can be highly beneficial or simply the only reasonable option for implementation.

Based on these motivational examples, for which simple prototypes were implemented from scratch using Android’s API, we gained insight into the technical requisites and issues associated with the design and implementation of a middleware for edge-clouds. Through a rich API, this software allows programmers to seamlessly form and manage mobile edge-clouds and provides basic computational and storage services to support crowd-sourcing applications.

The middleware’s design and implementation were an iterative process, influenced both by the available literature and, mostly, by the experience acquired with the preliminary experiments and the implementation of simple prototypes for different scenarios. Of particular relevance was the initial evaluation of wireless technologies in the setting of video dissemination in a crowded venue, with interaction with on-site wireless infrastructure. This was quickly followed by the implementation, from scratch, of a crowd-sourcing app for video dissemination, which helped to identify the main middleware abstractions and to define the API.

With the specification at hand we proceeded to implement the middleware and to port existing prototypes, still anchored in Android’s communication API, to our middleware. This allowed several core services to be implemented and existing prototypes to be upgraded to the point where real-world experiments, such as the one at Nave Desportiva de Espinho, were possible. A thorough evaluation of the middleware’s performance and latency was carried out to characterize the strong points and the weaknesses to be worked on in future versions.

Main Conclusions. As a whole, we can summarize the main conclusions of this work in the following topics:

• multiple wireless technologies can be combined to provide a larger network, overcoming the limitations of individual technologies;

• selection of technologies according to contents, e.g., Bluetooth for metadata and mesh control messages and WiFi/WiFi-Direct for video contents, is beneficial for performance and QoS in edge-clouds;

• the middleware provides a set of core services that readily provide the building blocks for new crowd-sourcing applications, fostering their agile development, as shown by the multiple apps already built on top of it;

• although it is an application-layer middleware, it does not introduce significant overhead in the execution and response time of crowd-sourcing applications;

• data dissemination in untethered scenarios is possible and can be done efficiently, at least for tens of simultaneous users;

• when used in conjunction with traditional wireless infrastructure, such crowd-sourcing applications can significantly remove load from access points by providing local computational and storage (e.g., cache) services to neighborhoods of mobile devices;

• churn, often seen as a major problem in highly dynamic mobile networks, can be of major importance to improve data dissemination in ways similar to epidemic algorithms.

Future Work. While several services and apps have been developed on top of the middleware, a unified service layer API is still lacking. This layer, composed of several modules, would provide distributed computing, distributed storage, publish/subscribe and streaming APIs for developers. We see this as the obvious next step in the evolution of the middleware.

Another interesting development is related to the recent interest in crowd-sensing apps. In this context, mobile devices could be an important source of information, given their numerous and often sophisticated built-in sensors. Moreover, the growth of the Internet of Things (IoT) paradigm raises the possibility that mobile devices, and networks thereof, may work as aggregators and/or streaming relays for data originating in devices such as home or office appliances. We believe that this line of work can trigger new game-changing applications that will improve daily life.

References

[1] M. Abolhasan, T. Wysocki, and E. Dutkiewicz. A review of routing protocols for mobile ad hoc networks. Ad Hoc Networks, 2(1):1–22, 2004.

[2] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. A survey on sensor networks. IEEE Communications Magazine, 40(8):102–114, 2002.

[3] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. Wireless sensor networks: a survey. Computer Networks, 38(4):393–422, 2002.

[4] A. Allavena, A. Demers, and J.E. Hopcroft. Correctness of a gossip based membership protocol. In Proceedings of the Twenty-fourth Annual ACM Symposium on Principles of Distributed Computing, PODC ’05, pages 292–301. ACM, 2005.

[5] Alljoyn Framework. https://openconnectivity.org/developer/reference-implementation/alljoyn. last visited in 27/02/2018.

[6] Alljoyn Framework Documentation. https://allseenalliance.org/framework/documentation. last visited in 22/03/2016.

[7] Android Market Share. https://developer.android.com/about/dashboards. last visited in 13/11/2018.

[8] Apple’s Implementation of Bonjour. https://www.apple.com/support/bonjour/. last visited in 22/03/2016.

[9] Apple’s Multipeer Framework. https://goo.gl/332lwR. last visited in 22/03/2016.

[10] S. Basagni, R. Bruno, G. Mambrini, and C. Petrioli. Comparative performance evaluation of scatternet formation protocols for networks of bluetooth devices. Wirel. Netw., 10(2):197–213, 2004.

[11] Bluetooth. https://www.bluetooth.com/specifications/bluetooth-core-specification. last visited in 26/03/2018.

[12] Bluetooth 5. https://blog.bluetooth.com/bluetooth-5-is-here. last visited in 26/03/2018.

[13] Bluetooth low energy. https://blog.bluetooth.com/bluetooth-low-energy-it-starts-with-advertising. last visited in 26/03/2018.

[14] A. Boutet, S. Frenot, F. Laforest, P. Launay, N. Le Sommer, Y. Maheo, and D. Reimert. C3po: A network and application framework for spontaneous and ephemeral social networks. In J. Wang, W. Cellary, D. Wang, H. Wang, S. Chen, T. Li, and Y. Zhang, editors, Web Information Systems Engineering – WISE 2015, pages 348–358. Springer International Publishing, 2015.

[15] Callback Hell. http://callbackhell.com/. last visited in 13/11/2018.

[16] C. Casetti, C.F. Chiasserini, L.C. Pelle, C.D. Valle, Y. Duan, and P. Giaccone. Content-centric routing in wi-fi direct multi-group networks. In 2015 IEEE 16th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM), pages 1–9, 2015.

[17] F. Cerqueira, J.A. Silva, J.M. Lourenço, and H. Paulino. Towards a persistent publish/subscribe system for networks of mobile devices. In Proceedings of the 2nd Workshop on Middleware for Edge Clouds & Cloudlets, MECC ’17, pages 2:1–2:6. ACM, 2017.

[18] Chacha. https://www.crunchbase.com/organization/chacha#section-overview. last visited in 20/02/2018.

[19] Z. Chen, E. A. Yavuz, and G. Karlsson. What a juke! a collaborative music sharing system. In 2012 IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), pages 1–6, 2012.

[20] Stuart Cheshire. Multicast dns. http://www.multicastdns.org/. last visited in 01/03/2018.

[21] Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2016–2021 White Paper. https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/mobile-white-paper-c11-520862.html. last visited in 07/03/2018.

[22] T. Clausen, C. Dearlove, P. Jacquet, and U. Herberg. Olsr rfc. https://tools.ietf.org/html/rfc3626. last visited in 09/11/2018.

[23] Bram Cohen. Bittorrent. http://bittorrent.org/beps/bep_0052.html. last visited in 01/03/2018.

[24] B.J. Donegan, D.C. Doolan, and S. Tabirca. Mobile message passing using a scatternet framework. International Journal of Computers Communications & Control, 3(1):51–59, 2008.

[25] D.C. Doolan, S. Tabirca, and L.T. Yang. Mobile parallel computing. In 2006 Fifth International Symposium on Parallel and Distributed Computing, pages 161–167, 2006.

[26] D.C. Doolan, S. Tabirca, and L.T. Yang. MMPI a Message Passing Interface for the Mobile Environment. In Proceedings of the 6th International Conference on Advances in Mobile Computing and Multimedia, MoMM ’08, pages 317–321. ACM, 2008.

[27] D.C. Doolan, S. Tabirca, and L.T. Yang. Mmpi a message passing interface for the mobile environment. In Proceedings of the 6th International Conference on Advances in Mobile Computing and Multimedia, MoMM ’08, pages 317–321. ACM, 2008.

[28] U. Drolia, R. Martins, J. Tan, A. Chheda, M. Sanghavi, R. Gandhi, and P. Narasimhan. The case for mobile edge-clouds. In Ubiquitous Intelligence and Computing, 2013 IEEE 10th International Conference on and 10th International Conference on Autonomic and Trusted Computing (UIC/ATC), pages 209–215, 2013.

[29] U. Drolia, N. Mickulicz, R. Gandhi, and P. Narasimhan. Krowd: A key-value store for crowded venues. In Proceedings of the 10th International Workshop on Mobility in the Evolving Internet Architecture, MobiArch’15, pages 20–25. ACM, 2015.

[30] Y. Duan, C. Borgiattino, C. Casetti, C.F. Chiasserini, P. Giaccone, M. Ricca, F. Malabocchia, and M. Turolla. Wi-fi direct multi-group data dissemination for public safety. In WTC 2014; World Telecommunications Congress 2014, pages 1–6, 2014.

[31] N. Fernando, S.W. Loke, and W. Rahayu. Mobile crowd computing with work stealing. In 2012 15th International Conference on Network-Based Information Systems, pages 660–665, 2012.

[32] N. Fernando, S.W. Loke, and W. Rahayu. Honeybee: A programming framework for mobile crowd computing. In K. Zheng, M. Li, and H. Jiang, editors, Mobile and Ubiquitous Systems: Computing, Networking, and Services, pages 224–236. Springer Berlin Heidelberg, 2013.

[33] N. Fernando, S.W. Loke, and W. Rahayu. Computing with nearby mobile devices: a work sharing algorithm for mobile edge-clouds. IEEE Transactions on Cloud Computing, 2016.

[34] Firebase. https://firechat.firebaseapp.com/. last visited in 17/03/2016.

[35] OpenGarden’s FireChat App. http://opengarden.com/firechat/. last visited in 19/02/2016.

[36] T. Freitas, J. Rodrigues, D. Bogas, M. Coimbra, and R. Martins. Panoptic, Privacy over Edge-Clouds. In IEEE 6th International Conference on Future Internet of Things and Cloud (FiCloud’18), pages 325–332, 2018.

[37] C. Funai, C. Tapparello, and W.B. Heinzelman. Supporting multi-hop device-to-device networks through wifi direct multi-group networking. CoRR, abs/1601.00028, 2016.

[38] M.A.F. Garcia. Ramble: Opportunistic Content Dissemination for Infrastructure-Deprived Environments. Master’s thesis, Department of Computer Science, Faculty of Sciences, University of Porto, 2018.

[39] P. Gardner-Stephen, R. Challans, J. Lakeman, A. Bettison, D. Gardner-Stephen, and M. Lloyd. The serval mesh: A platform for resilient communications in disaster & crisis. In Global Humanitarian Technology Conference (GHTC), 2013 IEEE, pages 162–166, 2013.

[40] P. Gardner-Stephen and S. Palaniswamy. Serval mesh software-wifi multi model management. In Proceedings of the 1st International Conference on Wireless Technologies for Humanitarian Relief, ACWR ’11, pages 71–77. ACM, 2011.

[41] Paul Gardner-Stephen. The serval project: Practical wireless ad-hoc mobile telecommunications. Flinders University, Adelaide, South Australia, Tech. Rep, 2011.

[42] Google Nearby. https://developers.google.com/nearby/. last visited in 29/03/2016.

[43] Zygmunt J. Haas, Marc R. Pearlman, and Prince Samar. The Zone Routing Protocol (ZRP) for Ad Hoc Networks. http://people.ece.cornell.edu/haas/wnl/Publications/draft-ietf-manet-zone-zrp-04.txt, 2002. last visited in 05/04/2016.

[44] K. Habak, M. Ammar, K. A. Harras, and E. Zegura. Femto clouds: Leveraging mobile devices to provide cloud service at the edge. In 2015 IEEE 8th International Conference on Cloud Computing, pages 9–16, 2015.

[45] Apache hadoop. http://hadoop.apache.org/. last visited in 20/02/2018.

[46] T. Hamma, T. Katoh, B. B. Bista, and T. Takata. An efficient zhls routing protocol for mobile ad hoc networks. In Database and Expert Systems Applications, 2006. DEXA ’06. 17th International Workshop on, pages 66–70, 2006.

[47] A. Hayes and D. Wilson. Peer-to-peer information sharing in a mobile ad hoc environment. In Sixth IEEE Workshop on Mobile Computing Systems and Applications, pages 154–162, 2004.

[48] A. Heinemann, J. Kangasharju, F. Lyardet, and M. Mühlhäuser. iclouds – peer-to-peer information sharing in mobile environments. In H. Kosch, L. Böszörményi, and H. Hellwagner, editors, Euro-Par 2003 Parallel Processing, pages 1038–1045. Springer Berlin Heidelberg, 2003.

[49] O. Helgason, S.T. Kouyoumdjieva, L. Pajević, E.A. Yavuz, and G. Karlsson. A middleware for opportunistic content distribution. Computer Networks, 107:178–193, 2016. Mobile Wireless Networks.

[50] O.R. Helgason, E.A. Yavuz, S.T. Kouyoumdjieva, L. Pajevic, and G. Karlsson. A mobile peer-to-peer system for opportunistic content-centric networking. In Proceedings of the Second ACM SIGCOMM Workshop on Networking, Systems, and Applications on Mobile Handhelds, MobiHeld ’10, pages 21–26. ACM, 2010.

[51] X. Hong, K. Xu, and M. Gerla. Scalable routing protocols for mobile ad hoc networks. IEEE Network, 16(4):11–21, 2002.

[52] jdeferred. https://github.com/jdeferred/jdeferred. last visited in 13/11/2018.

[53] D. Johnson, Y. Hu, and D. Maltz. Dsr rfc. https://tools.ietf.org/html/rfc4728. last visited in 09/11/2018.

[54] Patrick Kirk. Gnutella rfc. http://rfc-gnutella.sourceforge.net/. last visited in 10/10/2018.

[55] G. Kortuem, J. Schneider, D. Preuitt, T. G. C. Thompson, S. Fickas, and Z. Segall. When peer-to-peer comes face-to-face: collaborative peer-to-peer computing in mobile ad-hoc networks. In Proceedings First International Conference on Peer-to-Peer Computing, pages 75–91, 2001.

[56] S. Kouyoumdjieva, E.A. Yavuz, O. Helgason, L. Pajevic, and G. Karlsson. Opportunistic content-centric networking: The conference case demo. Demonstration at IEEE Infocom, 2011.

[57] C. Law, A.K. Mehta, and K.Y. Siu. A new bluetooth scatternet formation protocol. Mob. Netw. Appl., 8(5):485–498, 2003.

[58] M. Lin, K. Marzullo, and S. Masini. Gossip versus Deterministically Constrained Flooding on Small Networks, pages 253–267. Springer Berlin Heidelberg, 2000.

[59] Little Proxy. https://github.com/adamfisk/LittleProxy. last visited in 13/11/2018.

[60] I. Marfisi-Schottman, G. Karlsson, and J. Celander Guss. Opphos - a participative light and sound show using mobile phones in crowds. In ExtremCom International Conference, pages 47–48, Thórsmörk, Iceland, August 2013.

[61] E.E. Marinelli. Hyrax: Cloud computing on mobile devices using mapreduce. Master’s thesis, Carnegie Mellon University, 2009.

[62] C. Mbarushimana and A. Shahrabi. Comparative study of reactive and proactive routing protocols performance in mobile ad hoc networks. In Advanced Information Networking and Applications Workshops, 2007, AINAW ’07. 21st International Conference on, volume 2, pages 679–684, 2007.

[63] OpenGarden’s MeshKit SDK. https://www.opengarden.com/meshkit.html. last visited in 19/07/2018.

[64] Gy. Miklós, A. Rácz, Z. Turányi, A. Valkó, and P. Johansson. Performance aspects of bluetooth scatternet formation. In Proceedings of the 1st ACM International Symposium on Mobile Ad Hoc Networking & Computing, MobiHoc ’00, pages 147–148. IEEE Press, 2000.

[65] A. Moghadam, S. Srinivasan, and H. Schulzrinne. 7ds - a modular platform to develop mobile disruption-tolerant applications. In 2008 The Second International Conference on Next Generation Mobile Applications, Services, and Technologies, 2008.

[66] MPI. https://www.open-mpi.org/. last visited in 21/02/2018.

[67] Amazon mechanical turk. https://www.mturk.com/. last visited in 20/02/2018.

[68] D.G. Murray, E. Yoneki, J. Crowcroft, and S. Hand. The case for crowd computing. In Proceedings of the Second ACM SIGCOMM Workshop on Networking, Systems, and Applications on Mobile Handhelds, MobiHeld ’10, pages 39–44. ACM, 2010.

[69] Nfc. http://nearfieldcommunication.org/about-nfc.html. last visited in 26/03/2018.

[70] L. Oliveira, L. Lopes, and F.M.A. Silva. P3: Parallel peer to peer. In Revised Papers from the NETWORKING 2002 Workshops on Web Engineering and Peer-to-Peer Computing, pages 274–288. Springer-Verlag, 2002.

[71] Open Peer Sdk. http://openpeer.org/. last visited in 22/03/2016.

[72] P2p kit. http://p2pkit.io/. last visited in 25/03/2016.

[73] M. Papadopouli and H. Schulzrinne. Effects of power conservation, wireless coverage and cooperation on data dissemination among mobile devices. In Proceedings of the 2nd ACM International Symposium on Mobile Ad Hoc Networking & Computing, MobiHoc ’01, pages 117–127. ACM, 2001.

[74] C. Perkins, E. Belding-Royer, and S. Das. Aodv rfc. https://www.ietf.org/rfc/rfc3561.txt. last visited in 09/11/2018.

[75] C.E. Perkins and P. Bhagwat. Highly dynamic destination-sequenced distance-vector routing (dsdv) for mobile computers. SIGCOMM Comput. Commun. Rev., 24(4):234–244, 1994.

[76] C. Petrioli, S. Basagni, and I. Chlamtac. Configuring bluestars: multihop scatternet formation for bluetooth networks. IEEE Transactions on Computers, 52(6):779–790, 2003.

[77] Promises. http://jdeferred.org/. last visited in 13/11/2018.

[78] Protocol Buffers. https://developers.google.com/protocol-buffers/. last visited in 13/11/2018.

[79] M.K. Rafsanjani, S. Asadinia, and F. Pakzad. Communication and Networking: International Conference, FGCN 2010, Held as Part of the Future Generation Information Technology Conference, FGIT 2010, Jeju Island, Korea, December 13-15, 2010. Proceedings, Part II, chapter A Hybrid Routing Algorithm Based on Ant Colony and ZHLS Routing Protocol for MANET, pages 112–122. Springer Berlin Heidelberg, 2010.

[80] D. Remédios, A. Teófilo, H. Paulino, and J. Lourenço. Mobile device-to-device distributed computing using data sets. In Proceedings of the 12th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, MOBIQUITOUS, pages 297–298. ICST, 2015.

[81] Rfc3927. https://tools.ietf.org/html/rfc3927. last visited in 28/04/2016.

[82] B. Richard, D. Mac Nioclais, and D. Chalon. Clique: A transparent, peer-to-peer replicated file system. In M. Chen, P.K. Chrysanthis, M. Sloman, and A. Zaslavsky, editors, Mobile Data Management, pages 351–355. Springer Berlin Heidelberg, 2003.

[83] J. Rodrigues, E.R.B. Marques, L.M.B. Lopes, and F. Silva. Towards a middleware for mobile edge-cloud applications. In Proceedings of the 2nd Workshop on Middleware for Edge Clouds & Cloudlets, MECC ’17, pages 1:1–1:6. ACM, 2017.

[84] J. Rodrigues, E.R.B. Marques, J. Silva, L. Lopes, and F. Silva. Video dissemination in untethered edge-clouds: a case study. In Proc. 18th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS), DAIS’18, pages 137–152. Springer, 2018.

[85] J. Rodrigues, J. Silva, R. Martins, L. Lopes, U. Drolia, P. Narasimhan, and F. Silva. Benchmarking wireless protocols for feasibility in supporting crowdsourced mobile computing. In Proceedings of Distributed Applications and Interoperable Systems (DAIS’16), pages 96–108. Springer, 2016.

[86] R. Roy, M. Kumar, N.K. Sharma, and S. Sural. A self-organising protocol for bluetooth scatternet formation. European Transactions on Telecommunications, 16(5):483–493, 2005.

[87] Serval Project. http://www.servalproject.org/. last visited in 17/03/2016.

[88] Serval Project Wifi. http://developer.servalproject.org/dokuwiki/doku.php. last visited in 17/03/2016.

[89] J. Silva, D. Silva, E.R.B. Marques, L. Lopes, and F. Silva. P3-mobile: Parallel computing for mobile edge-clouds. In Proceedings of the 4th Workshop on CrossCloud Infrastructures & Platforms, Crosscloud’17, pages 5:1–5:7. ACM, 2017.

[90] J.A. Silva, J. Leitão, N. Preguiça, J.M. Lourenço, and H. Paulino. Towards the opportunistic combination of mobile ad-hoc networks with infrastructure access. In Proceedings of the 1st Workshop on Middleware for Edge Clouds & Cloudlets, MECC ’16, pages 3:1–3:6. ACM, 2016.

[91] J.A. Silva, R. Monteiro, H. Paulino, and J.M. Lourenço. Ephemeral data storage for networks of hand-held devices. In 2016 IEEE Trustcom/BigDataSE/ISPA, pages 1106–1113, 2016.

[92] J.A. Silva, H. Paulino, J.M. Lourenço, J. Leitão, and N.M. Preguiça. Time-aware publish/subscribe for networks of mobile devices. CoRR, abs/1801.00297, 2018.

[93] P.M.P. Silva, J. Rodrigues, J. Silva, R. Martins, L. Lopes, and F. Silva. Using edge-clouds to reduce load on traditional wifi infrastructures and improve quality of experience. In 2017 IEEE 1st International Conference on Fog and Edge Computing (ICFEC), pages 61–67, 2017.

[94] N. Le Sommer, P. Launay, and Y. Mahéo. A framework for opportunistic networking in spontaneous and ephemeral social networks. In Proceedings of the 10th ACM MobiCom Workshop on Challenged Networks, CHANTS ’15, pages 1–4. ACM, 2015.

[95] Wifi-tdls. http://www.wi-fi.org/news-events/newsroom/wi-fi-alliance-now-certifying-tunneled-direct-link-setup. last visited in 26/03/2018.

[96] A. Teófilo, D. Remédios, H. Paulino, and J. Lourenço. Group-to-group bidirectional wi-fi direct communication with two relay nodes. In Proceedings of the 12th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, MOBIQUITOUS, pages 275–276. ICST, 2015.

[97] Andrew Tridgell. The rsync algorithm. https://rsync.samba.org/tech_report/. last visited in 01/03/2018.

[98] R. van Renesse, Y. Minsky, and M. Hayden. A Gossip-Style Failure Detection Service, pages 55–70. Springer London, 1998.

[99] A.I. Wang, T. Bjornsgard, and K. Saxlund. Peer2me - rapid application framework for mobile peer-to-peer applications. In 2007 International Symposium on Collaborative Technologies and Systems, pages 379–388, 2007.

[100] Z. Wang, R. J. Thomas, and Z. Haas. Bluenet - a new scatternet formation scheme. In Proceedings of the 35th Annual Hawaii International Conference on System Sciences, pages 9 pp.–, 2002.

[101] webrtc. https://webrtc.org/. last visited in 28/04/2016.

[102] Wifi. http://www.wi-fi.org/discover-wi-fi/wi-fi-certified-ac. last visited in 26/03/2018.

[103] Wifi direct. http://www.wi-fi.org/discover-wi-fi/wi-fi-direct. last visited in 26/03/2018.

[104] WiFi P2P reference. http://cse.iitkgp.ac.in/~bivasm/sp_notes/wifi_direct_2.pdf. last visited in 28/03/2018.

[105] T. Yan, M. Marzilli, R. Holmes, D. Ganesan, and M. Corner. mcrowd: A platform for mobile crowdsourcing. In 7th ACM Conference on Embedded Networked Sensor Systems (SenSys’09), pages 347–348. ACM, 2009.

[106] G. V. Zaruba, S. Basagni, and I. Chlamtac. Bluetrees-scatternet formation to enable bluetooth-based ad hoc networks. In Communications, 2001. ICC 2001. IEEE International Conference on, volume 1, pages 273–277 vol.1, 2001.

[107] ZombieChat. http://getzombiechat.com/. last visited in 17/03/2016.