Performance Analysis of Opportunistic Content Distribution via Data-Driven Mobility Modeling

LJUBICA PAJEVI∆

Doctoral Thesis Stockholm, Sweden, 2018 School of Electrical Engineering TRITA-EECS-AVL-2018:76 and Computer Science ISBN 978-91-7729-980-6 KTH, Stockholm, Sweden

Akademisk avhandling som med tillstånd av Kungl Tekniska högskolan framlägges till oentlig granskning för avläggande av doktorexamen onsdagen den 21 november 2018 i Hörsal F3, Kungl Tekniska högskolan, Lindstedtsvägen 26, Stockholm.

© Ljubica PajeviÊ, November 2018

Tryck: Universitetsservice US AB Abstract For the most part of the developed world, ubiquitous connectivity has become a norm in human daily lives. Globally, the access to information on the Internet and telecommunication services has been recognized as one of the accelerators of human progress, promoting the right to education and facilitating access to globally shared knowledge. The increasing capacity and user experience demands on the one side, and the need for increased net- work coverage and the availability of aordable telecommunication services on the other side, pose significant challenges for the current network archi- tectures. While the next generation wireless technologies and the pervasive deployment of the currently available communication systems are poised to solve the above challenges, the main inhibitors of immediate action are still socio-economical factors. Due to the pervasiveness of end-user mobile devices, device-to-device communication (D2D) can be leveraged as an intermediate— and a complementary—solution. D2D communication enables networking in an opportunistic manner. An opportunistic network is formed by co-located mobile users in order to ex- change data via direct wireless links when their devices are within transmis- sion range, without relying on the use of fixed network infrastructure. In this thesis we investigate the capabilities of these opportunistic networks and cover two main areas: data-driven modeling of user mobility and analytic performance evaluation of location-aware opportunistic content distribution. The first part of the thesis focuses on mobility modeling. We collect and analyze a dataset of user associations in a university wireless network. The dataset was recorded at the KTH Royal Institute of Technology, and documents the prevalence of smartphone network users, as compared to laptop users. We characterize the mobility of users in this dataset both from the network and from the user perspective. From the network perspective, we model the aggregate mobility and access patterns to dierent parts of the network. To characterize individual mobility, we explore how mobile the users are, and how accurately movements can be predicted in the near future. We combine the findings on the access patterns with the analysis of several other mobility traces to propose a mobility model for populations with churn, that is specifically tailored for the evaluation of opportunistic content distribution. In the second part of the thesis, we evaluate the performance of op- portunistic content distribution in urban environments. Our focus is on ephemeral, location-aware networks where content is stored only on the user devices within the locale of interest. We develop a framework for modeling the performance of content distribution; the framework allows modeling of the spread of information as a stochastic process and accurate capturing of the stochastic fluctuations in the number of distributed content items. We utilize the system model to study the feasibility of opportunistic content distribution and by means of stochastic stability analysis we assess how the mutable sys- tem parameters can be engineered to ensure content persistence. Our results demonstrate that the content persistence strongly depends on the density of users, and that the requirements for user resources are relatively low already for moderate densities. This finding can help incentivize users to participate in opportunistic networking applications. Sammanfattning För de större delarna av den industrialiserade världen har oupphörlig nätverkstillgänglighet blivit regel i människors liv medan, globalt sett, så har internet och telekommunikationstjänster erkänts som en accelerator av samhällsutveckling, främjande rätten till utbildning och möjliggörande till- gången av världsomspännande information. De växande kraven på kapacitet och användarupplevelse å den ena sidan och behoven av nätverkstäckning och tillgängligheten av prisvärda telekommunikationstjänster å andra sidan ger upphov till signifikanta utmaningar för dagens nätverksarkitekturer. Dessa utmaningar angrips i dagsläget av nästa generationens trådlösa teknik och omfattande utbyggnad av fasta kommunikationssystem. Trots detta häm- mas de omedelbara genomslagen av i huvudsak socioekonomiska faktorer. Då tillgången till mobiltelefoni är omfattande kan enhet-till-enhets kommunika- tion Device-to-Device (D2D) användas som en alternativ och komplementär lösning. D2D-kommunikation möjliggör opportunistiska nätverk som bildas mel- lan samlokaliserade mobilanvändare. Via direkta trådlösa anslutningar kan data delas utan att förlita sig på fast nätverksinfrastruktur. Denna avhan- dling undersöker potentialen för opportunistiska nätverk inom två områden: datadriven modellering av användares mobilitet respektive analysmetoder för att utvärdera platsmedveten opportunistisk informationsfördelning. Den första delen av avhandlingen fokuserar på mobilitetsmodellering. Vi samlar in och analyserar en datamängd av associationer mellan användare i ett universitetets trådlösa nätverk. Datamängden samlades in på Kungliga Tekniska Högskolan (KTH) och dokumenterar prevalensen av smartphonean- vändare i nätverket jämfört med bärbara datorer. Vi karaktäriserar mobilitet av användara i denna datamängd både från ett nätverks- och användarper- spektiv. Från perspektivet av nätverket så modellerar vi sammaslagen mo- bilitet och åtkomstmönster till olika delar av nätverket. För att karaktärisera individuell mobilitet undersöker vi hur mobila mobilanvändarna är och hur förutsägbara rörelser är på kort sikt. Vi kombinerar insikterna om åtkomst- mönster med analysen av flera rörelsespår för att föreslå en mobilitetsmodell för populationer med bortfall, som är skräddarsydd för att utvärdera oppor- tunistisk informationsfördelning. Del två av avhandlingen utvärderar hur opportunistisk informationsfördel- ning presterar i stadsmiljöer. Vårt fokus är kortlivade platsmedvetna nät där information endast lagras på användarenheter inom ett studerat om- råde. Vi utvecklar ett ramverk för att modellera prestationen av informations- fördelning. Ramverket tillåter att modellera informationsspridningen som en stokastisk process och fångar noggrannt de stokastiska fluktuationerna i an- talet fördelade informationssposter. Vi använder denna systemmodell för att utforska potentialen för opportunistisk informationsfördelning och med hjälp av stokastisk stabilitetsanalys härleder vi hur reglerbar systemparametrar kan väljas för att garantera kvarlevnad av information. Våra resultat visar att in- formationens kvarlevned kraftigt beror på koncentrationen av användare samt att kraven på användarresurser är relativt låga redan för måttliga koncentra- tioner. Detta resultat kan vara till hjälp för att uppmuntra användare att delta i opportunistiska nätverkstillämpningar. Contents

Contents iii

Acknowledgments v

1 Introduction 1 1.1 Thesis Contributions ...... 4 1.2 Thesis Outline ...... 5

2 Understanding and Modeling Human Mobility 7 2.1 Foundations of Mobility Modeling ...... 8 2.2 Human Mobility Through the Space-Time Prism ...... 9 2.3 Historical Overview of Approaches to Mobility Modeling ...... 14 2.4 Scenario-Based Modeling ...... 15

3 Collecting and Analyzing Mobility Data 19 3.1 Trace Acquisition and Source Origins ...... 19 3.2 WLAN Mobility Traces ...... 23 3.3 Mobility and Network Simulators ...... 25

4 Opportunistic Networking: Applications, Platforms and Perfor- mance Evaluation 27 4.1 Mobile Ad Hoc Networks and Delay Tolerant Networking ...... 27 4.2 Data Delivery in Opportunistic Networks ...... 29 4.3 Location-Based Opportunistic Networks ...... 31 4.4 Opportunistic Communication Systems: Prototypes and Applications 33 4.5 Mathematical Tools for Modeling Content Dissemination and of Per- formance Evaluation ...... 35

5 Summary of Original Work 39

iii iv

6 Conclusion and Future Work 45

Bibliography 49

Paper A: Revisiting the Modeling of User Association Patterns in a University Wireless Network 63

Paper B: Predicting the Users’ Next Location from WLAN Mobil- ity Data 81

Paper C: Epidemic Content Distribution: Empirical and Analytic Performance 99

Paper D: Modeling Opportunistic Communication with Churn 115

Paper E: Ensuring Persistent Content in Opportunistic Networks via Stochastic Stability Analysis 143 Acknowledgments

My gratitude extends beyond the limits of my capacity to express it.

Iain M. Banks, The Player of Games

Although shorter than the next introductory, technical chapters of the thesis, this chapter has taken proportionally more time to write and has involved deeper thought with a considerable amount of reflection. Incidentally, a productive mo- ment of reflection occurred while I was listening to Evgeny Kissin’s live performance several months ago. This is when I started phrasing the following sentences: they flow straight from the heart like Debussy’s notes thorough the pianist’s fingers. First and foremost, I would like to thank my main supervisor, Prof. Viktoria Fodor for her unwavering support and exceptional availability during the last two years of my studies. Collaboration with a person of such perspicacity, determina- tion, and a cheerful personality has been an invaluable, prolific and fun experience. I wish every student researcher had such a devoted, supportive and helpful super- visor, and I firmly believe that professors like her make the academic community more productive and a better place. I am grateful to my second supervisor, Prof. Gunnar Karlsson, for giving me the opportunity to join his research group at the former Laboratory of Communi- cation Networks (LCN), for introducing me to the world of academic research and for numerous fruitful discussions which resulted in formulating research problems addressed in this thesis. My sincere gratitude goes to Prof. Håkan Hjalmarsson who provided the necessary guidance and support when my research path started deviating from its destination. I am greatly indebted to Prof. Jörg Ott for welcom- ing me at the Chair of Connected Mobility of the Technical University of Munich, where a substantial part of this thesis was written. I would like to thank Prof. Thrasyvoulos Spyropoulos for accepting to act as an opponent to this dissertation, and to Prof. Björn Landfeldt, Prof. Karin Anna Hummel, and Dr Aline Carneiro Viana for taking time to act as committee members.

v vi

I wish to express my appreciation to the KTH School of Engineering for their financial support of my studies through the Program of Excellence. At LCN, as well as at the Chair of Connected Mobility, I had a pleasure to learn from and work with a number of excellent researchers. To all the members of the two research groups—professors and students, current and past: thank you for creating a stimulating work environment and also for being a part of my PhD journey! While living in Stockholm, I had a fortune to meet some wonderful people and make new friends; I could not possibly omit mentioning a few of them who have been exceptionally supportive in one way or the other: Vladimir, Ognjen, Slaana, Ivana and Cenny, thank you! I am endlessly and forever grateful to my mother Zorica and my father Radiöa for their unconditional love and support, and for always trusting in me and encouraging me to pursue my goals. I dedicate this thesis to them. I wish to thank the rest of my family, especially my brother Aleksandar for providing motivation with a healthy dose of competitiveness, and to my grandfather Radenko, for his enthusiasm and curiosity about academic life in a far-away country. A demanding research work necessitates occasional breaks, and I am thankful to my brother’s family: Jelena, Jovana, and Simona, for careless and joyful family moments, to my parents-in-law, Hannele and Seppo, for providing the most relaxing atmosphere during my holiday trips and weekend breaks, and to my close Serbian friends, from Belgrade and from abroad, for staying in touch and for their remote support. Finally, it is not an exaggeration to say that completing this thesis would not have been possible without the inexhaustible encouragement, understanding, and support of my better-half, Teemu. He has been my (system-oriented) academic companion on this journey, an advice-giver, and the constant inspiration and reason for continuous personal and professional development. I consider myself incredibly fortunate to have him by my side.

Ljubica PajeviÊ Munich, September 2018 Chapter 1

Introduction

Where should I start? Start from the statement of the problem.

György Pólya, How to Solve it

Ubiquitous connectivity has now become a norm in our daily lives while the access to information on the Internet has been recognized as one of the basic hu- man rights [1], promoting the right to education and globally accelerating human progress. The digital divide, the gap between those with access to the Internet and those still without or with limited access, started to shrink thanks to the enormous eorts to bridge this gap by deploying the necessary telecommunication infrastruc- ture, especially in rural regions, and by extending communication services. Yet, the progress of these eorts is slow and greatly limited by socio-economic factors. At one end of the digital divide are people without aordable network access and with- out this access to globally shared knowledge, communication services and health care; at the other end users with growing demands for faster networks and larger data plans. The current network architectures will struggle to keep pace with the increasing demand for capacity and user experience demands on one end, and the need for increased network coverage and the availability of aordable telecommunication services on the other end. Whereas the fifth generation (5G) networks hold the promise of meeting the capacity requirements through network densification, i.e., through the deployment of more base stations, deploying such solutions poses great financial and operational burdens on mobile network operators, while reaching only a small portion of users in the initial phase. However, the lack of availability of the network infrastructure is still one of the key barriers to high-speed Internet access in developing regions.

1 2 Chapter 1. Introduction

Due to the pervasiveness of end-user mobile devices, device-to-device commu- nication (D2D) can be leveraged for a more immediate solution to the above chal- lenges. Mobile devices such as smartphones, digital tablets and readers are today equipped with ample computing resources and often support several communication technologies (Wi-Fi/Wi-Fi Direct, Bluetooth), which makes these devices suitable for ooading processing and networking tasks from the edge network to end de- vices. D2D communication can be exploited in cellular networks to increase usage of radio resources by utilizing both cellular and D2D links [2], to extend network coverage [3], and to achieve communication with extremely high bit rates and low latencies, thanks to the proximity of devices. Aside from the infrastructural mode of communication, e.g. to alleviate the trac load by ooading a portion of it to the end users, a pure D2D ad hoc communication mode has seen even more promises for potential applications. This mode is particularly well-suited when the infras- tructure is absent e.g., in remote rural areas, disrupted due to natural disasters or otherwise unavailable being subjected to censorship by oppressive authorities. D2D communication enables a particularly interesting type of mobile networks termed opportunistic networks, formed by co-located mobile users in order to ex- change data via direct wireless links when their devices are within transmission range. The concept of opportunistic networking is not entirely new—some of the fundamental principles are borrowed from the research on mobile ad hoc networks (MANETs) and delay tolerant networks (DTNs) as conceptual predecessors. In a MANET, network topology is created from mobile devices with a primary goal to mimic infrastructure networks and ensure end-to-end connectivity between a source and a destination device, in a purely wireless domain. This principle of autonomy and independence on the infrastructure has been adapted to opportunistic commu- nication. Acknowledging that MANETs are prone to frequent link breakages due to the node mobility, DTNs are envisioned to cope with frequent network reconfigura- tions and partitioning. Mobile DTNs—opportunistic networks being one instance of such—consider an alternative approach to exploit intermittent connectivity of devices; the nodes’ mobility is seen as a feature rather than a limitation, and it is utilized to transfer messages in a store-carry-forward manner. Opportunistic networks and DTNs may not rely on the existence of end-to-end paths and there- fore do not provide guarantees of the message delivery. Furthermore, there may be one or more potential recipients rather than a dedicated single recipient. For these reasons, opportunistic networking is well suited for content dissemination on abest-eort basis and for applications that do not have strict latency requirements.

Use-cases Let us review a few use-cases that may significantly benefit from using opportunistic communications. In opportunistic social networks users interested in exchanging delay tolerant data, such as media files and personal profiles, or in messaging and decentralized mobile networking, can form user groups based on their mutual interests and social 3 ties. Examples are university students in the same class, communities of friends, conference attendees, and people attending massive public events or business meet- ings. This type of social networking is also seen as advantageous with respect to privacy concerns when compared to social networks that require infrastructure and support of some kind of centralized entity [4]. Distributed vehicular data sensing is another representative example of oppor- tunistic use-cases. Crowdsourcing is becoming a valuable way of gathering infor- mation, both for users and service providers. One of the currently most exploited sources of data is floating car data (FCD), which includes positioning data such as GPS localization, information about cellular handos of mobile devices, and cov- erage of Wi-Fi networks. FCD together with user-provided updates can be used to improve real-time trac estimates. Currently, there exists many infrastructure- based solutions, however, they require centralized data processing, accompanied with significant computational complexity, and non-negligible latencies. In this context, opportunistic vehicle-to-vehicle communication can be beneficial for faster gathering, processing and disseminating trac updates in a distributed fashion. Similarly to the previous case, data sensing in smart cities is a scenario in which mobile devices can be used to collect sensing data beyond positioning and network connectivity information. Smartphones are already equipped with a rich set of powerful (and inexpensive) embedded sensors, such as camera, microphone, GPS, accelerometer, gyroscope, while new features including pressure, temperature and humidity sensors are being added. These features can be exploited for generating fine-grained maps of city areas with measured levels of noise, reporting infrastruc- ture problems and public safety issues, measuring flows of pedestrian and crowds for the purpose of urban planning or positioning inside buildings, shopping-malls and airport terminals. Location-based opportunistic services lie at the intersection of the previous cases. Irrespective of the type of content being distributed (sensing data, trac updates or social media updates), the mobility modality (pedestrian or vehicular mobility), and the underlying social structure (connected communities or random travelers), the common feature of these services is that the content is only of local and perhaps of time relevance, while considered irrelevant outside of the given time and spatial windows. In contrast to urban scenarios where opportunistic networking can be used to augment already existent infrastructure, opportunistic services in developing and rural regions can be employed to enable communication and provide asynchronous Internet access in regions that are characterized by very sparse or non-existent net- working infrastructure. A few initiatives have been launched to provide “connectiv- ity for everyone”, namely Google’s project Loon [5] or Facebook’s Free services [6]. Their development is still ongoing and it is yet unclear how much interest the big initiator companies will have to keep the projects running, or even how (socially) beneficial these projects are for their users [7]. 4 Chapter 1. Introduction

1.1 Thesis Contributions

In this thesis we investigate the capabilities of opportunistic networks by means of analytic modeling and performance evaluation. Towards this end, we cover two main areas. The first area is mobility modeling, which is essential for performance studies since mobility patterns determine how opportunistic networks perform. The second area concerns modeling and analysis of opportunistic networks with a fo- cus on one specific case—location-aware opportunistic content distribution. The contributions in these areas are the following.

Modeling Mobility of Populations with Churn Despite a plethora of research in the field of mobility modeling for opportunistic communication, a specific property of populations of mobile users has been over- looked: the population turnover—also known as churn—where users can join the system and depart from it, thus the number of nodes in the system is changing. In practice, such dynamics realistically depicts scenarios in small urban areas, such as a grid of streets, city square, or subway station where users arrive to the area, move inside for a while and then exit (or leave the system). In this research area, we combine analysis and synthesis to devise a mobility model.

• First, in Paper A we take a top-down approach and analyze aggregate proper- ties of the user mobility in a campus environment. We do so by collecting and mining a trace of user associations in a university wireless network. The trace analysis reveals the prevalence of mobile users, as well as—from the modeling perspective—an advantageous property that the arrivals to network areas can be treated as Poisson arrivals, which holds valid for certain time windows. • From this point, we move onto an analysis of individual mobility of users from the same trace, presented in Paper B. We first characterize how mobile the users are, and then we investigate how accurately their movements can be predicted in the near future. This is inspired by the concern whether the predictability of user mobility should be considered as a feature to be taken into account when devising future mobility models. • We extend our search for mobility properties with the aim to devise a mobil- ity model for a specific scenario—localized epidemic content dissemination. Thus, in Paper C, we consider several real-world traces and we evaluate the performance of epidemic content spreading with the underlying mobility pat- terns. These mobility patterns are representative of conference and oce building settings. We are particularly interested in understanding what are the implications of the assumption of the users’ homogeneous mobility for accurately capturing the content spreading process. • The homogeneous mobility of users entails homogeneous contact patterns among them. We infer that these simplifying assumptions may be inade- quate for the examined mobility traces, due to the existence of social ties 1.2 Thesis Outline 5

among users. Therefore, in Paper D we investigate a new set of mobility scenarios capturing a variety of contexts: movements on city streets and open squares, in a subway station and in buildings. This second set of scenarios is characterized by more random contact patterns as compared to the traces in Paper C. We show that even though the nodes exhibit heterogeneous mobility patterns, a moderate heterogeneity can be approximated by a homogenous model. Combining these results, we propose a model that incorporates churn and that is able to capture the observed heterogeneity.

Performance Modeling and Feasibility Analysis of Opportunistic Content Distribution In the second sub-area of this thesis, we investigate the feasibility and evaluate the performance of content dissemination in urban environments. Our focus is on ephemeral, location-aware networks where content is stored only on the user devices within the locale of interest—a scenario that can be seen as an infrastructure-less mobile cloud storage. We develop a framework for modeling the performance of content dissemination; the framework is configurable to capture dierent communi- cation and user participation scenarios. Specifically, we employ the model of users with limited contribution to the content spreading process and assess the minimum requirements for the viable system. • In Paper D we introduce the model for opportunistic content spreading, by overlaying the epidemic content dissemination scheme on the proposed open- population mobility model. We model the spread of information as a stochas- tic process, showing that the model is able to reflect stochastic fluctuations in the number of disseminated content items. Furthermore, we confirm that the requirement of capturing heterogeneity can be relaxed, and the eect of heterogeneous node interactions can be approximated by an averaged eect.

• In Paper E we use the stochastic model of the system to study the feasibil- ity of opportunistic content dissemination and by means of stochastic sta- bility analysis we assess how the mutable system parameters which can be engineered—in this setup the forwarding time—can be tuned to ensure that the content can be sustained in the area exclusively relying on the presence of nodes carrying and forwarding the content item, and without relying on any infrastructural support.

1.2 Thesis Outline

The remainder of this thesis is structured as follows. Chapter 2 gives an overview of the mobility modeling endeavors, from the earliest eorts to the current state- of-the-art models. The motivation for our contribution in the area is also stated. In Chapter 3 we argue about the necessity to revisit the established results, in the light of newly available, massive amounts of data capturing human mobility. 6 Chapter 1. Introduction

Chapter 4 describes our vision of an opportunistic communication scheme, based on the publish/subscribe paradigm and principles of epidemic content spreading; we also survey similar concepts and other proposed, mature opportunistic networking solutions. Chapter 5 summarizes the contributions of papers included in this thesis. With Chapter 6, we conclude this thesis by discussing its main results, and by presenting challenges and possible directions for future studies. Chapter 2

Understanding and Modeling Human Mobility

To understand is to perceive patterns.

Isaiah Berlin

Modeling the mobility of users in wireless networks enables the evaluation of network protocols, services and applications, and the calibration of the protocol parameters in the development phase. In opportunistic and mobile networks, mo- bility models are necessary to assess the system feasibility and to evaluate system performance. The choice of a mobility model, however, can have a significant eect on the performance investigation and even put the credibility of the study in ques- tion [8, 9, 10, 11]. Therefore, a careful choice of the model is essential for a sound analysis and evaluation. Historically, mobility models have been divided into two main categories: ana- lytic models, that can be explained by a set of mathematical formulas, and trace- based models comprising of the movement traces collected from experiments, which are directly fed into simulations. The latter group represents real mobility, however it is restricted to specific scenarios where the traces were obtained; moreover, the scenarios often represent very specific contexts such as a university campus, an en- terprise building, a part of a city or a theme park, or taxis in a particular city. As opposed to this, analytic models—also referred to as synthetic models—represent only a simplified abstraction of mobility but they are often necessary to formally analyze the impact of human movements on the design of network protocols. While the first used synthetic models bore little resemblance to any realistic movement, many researchers had to resort to using them due to the lack of deeper understand- ing of mobility and the unavailability of real traces. The emergence of location and communication technologies such as Global Positioning System (GPS), Blue- tooth and Wi-Fi-based tracking, streamlined the acquisition of large mobility traces,

7 8 Chapter 2. Understanding and Modeling Human Mobility which further opened new opportunities for redesigning synthetic models and for incorporating statistical properties observed from real traces. Consequently, the veracity gap between the two categories of models started to shrink. In this chapter, we are concerned with analytical approaches to mobility mod- eling; we consider data-driven approaches in the next chapter. Thus, herein we provide an overview of the current classifications, explore the space of the state-of- the-art models and position our work in this space.

2.1 Foundations of Mobility Modeling

One of the earliest models, the random walk model [10] was first introduced into science by mathematician Karl Pearson in a letter to Nature [12] in 1905: AmanstartsfromapointO and walks l yards in a straight line; he then turns through any angle whatever and walks another l yards in a straight line. He repeats this process n times. I require the probability that after these n stretches he is at a distance between r and r + ”r from his starting point O. In the same volume of the journal, Lord Rayleigh provided the solution, noticing the analogy between the given problem and the problem he solved 25 years earlier, while considering superposition of sound waves of equal frequency and amplitude. When applied to the random walk problem, the sought probability distribution is a function of the number of steps n and the radius r from the origin O, which can 2r r2/n be expressed as P (r) e≠ for a very large n. The probability is therefore ¥ n highest in the area around the origin1. Yet some decades later, Albert Einstein mathematically described this problem, today also known as Brownian motion. To this day, random walk models remain relevant in various scientific and engineering disciplines. In the early studies of mobile networks, mobility models considering random movements were the most commonly used models, due to their simplicity. A va- riety of random models, termed as generic and loosely based on observations of real mobility and its properties (for example node speed or group mobility), have been proposed [10]. Perhaps the most well known random mobility model is the random waypoint (RWP) model, an extension of the random walk model with pauses between changes in direction or speed. Mobile ad hoc network protocols and algorithms were routinely evaluated based on their performance under simu- lated generic mobility models. The underlying mobility model, however, is known to greatly aect the evaluation results [8]. These findings, as well as the strive to devise more realistic models, have motivated many research eorts in the mobile networking area, over nearly two decades. In other scientific disciplines, however, human mobility has already been attracting considerable attention for a longer period of time. 1Anecdotally, Prof. Pearson noticed that “[...] the lesson of Lord Rayleigh’s solution is that in open country the most probable place to find a drunken man who is at all capable of keeping on his feet is somewhere near his starting point!” 2.2 Human Mobility Through the Space-Time Prism 9

!"#$%&' ,- . 01 / 02 1

6478 34#$5/ 34#$5. ()*+$

Figure 2.1: Space-time paths for two nodes meeting at their common location of work.

2.2 Human Mobility Through the Space-Time Prism

Understanding human mobility has been an objective of multiple research disci- plines: from public health (epidemiology) to urban planning and transportation research to, most recently, mobile networking. As it is usually the case with in- terdisciplinary fields, many concepts get rediscovered or introduced in parallel, but also there is a disarray in terminology. The obtained knowledge is inevitably dis- persed, and thus there is a need to relate interdisciplinary models. In this section we attempt to bridge the gap by introducing the terminology used in transportation research and relating it to the terminology used in networking research. In transportation research, the dominant approach to examining mobility is the activity-based approach. This approach is based on the concept of space-time geog- raphy, proposed by Hägerstrand in 1970’s [13] to demonstrate how human spatial activity is often governed by limitations, and not by independent decisions of spa- tially or temporally autonomous individuals. In this conceptual framework, human mobility is constrained by the individual’s capability,bythecoupling of individuals, and by the restrictions imposed by authorities over them. Capability constraints refer to the limitations on human movement due to physical or biological factors, such as the walking speed, access to cars or public transportation and the need to return to a given location after a journey. Coupling constraints refer to the need to be in one particular place for a given length of time, often in interaction with other people. Authority constraints relate to the limitations and control of access exerted on nodes to restrict their possible mobility, such as trac or safety rules (cars are not allowed to drive through pedestrian zones, people are not normally permitted to enter a military base). Examples of space-time paths are given in Fig. 2.1, which shows paths of two in- dividuals, a pedestrian and a driver, during a work day. These two individuals have a common work location. Authority over them imposes a restriction on work hours, locating them in vertical segments of their paths, while individuals’ capabilities— access to cars, walking or driving speed—determine the positions of other segments of their paths in the space-time graph. While an individual path represents the 10 Chapter 2. Understanding and Modeling Human Mobility

Table 2.1: Examples of constraints on mobility at dierent levels of abstraction.

Level Capability Coupling Authority Strategic Available time Work locations; Regulated work hours; areas for shopping shopping hours Tactical Means of transport Transport routes Schedules of public transportation Operational Speed of walking Queuing and crowding Speed limits; or driving trac rules actual path taken by the pedestrian or by the driver, a set of all points that can be reached by an individual, given a maximum possible speed from a starting point to an ending point in the space-time graph determines the individual’s space-time prism. Any kind of mobility, either trace-based or synthetic, can be formalized in this manner. Consider synthetic mobility in the form of RWP: It is characterized by mobility of a fixed number of nodes in a closed area with a convex boundary and without internal obstacles. The capability is given by the distribution of the speed of the flights the nodes make and the distribution of their pause times. There are not any coupling constraints since nodes do not have a physical size or any other form of interaction, and there are no authority-imposed restrictions that limit the movements other than the boundary of the simulation area. In addition to constraints on mobility, there are travel behavior levels. Human mobility can be seen as consisting of three levels [14]: At the strategic level humans decide the activities they would like to perform such as going to work, shopping, or taking a walk in a park as well as the schedule of those activities—these decisions define their daily movements. The tactical level considers the implementation of a strategic decision, such as choosing a mode and a way of travel, taking into consideration which is the shortest or fastest path as given by environmental factors like obstacles and congestion. At the operational level, the physical process of human movement is considered, including the walking or driving speed, the physical size of nodes and the interaction with other trac due to queuing for avoiding collisions. Table 2.1 exemplifies the constraints and capabilities at the three levels of abstraction. The above terminology is more relevant and common in, for example, trans- portation research. In the networking research, models are usually classified with respect to the scale at which mobility is observed. Thus, there are microscopic, mesoscopic or macroscopic mobility models. Microscopic mobility describes in de- tail individual behavior and interaction with other nodes and the environment. Phenomena observed in such detail pertain to tactical and operational levels. At the macroscopic scale, a model concerns system behavior in its entirety and with- out any reference to its underlying microscopic nature. Flows of vehicles on a highway, or streams of pedestrians on a street are examples of macroscopic models. An individual is represented only as a contribution to the mean density, and can not be followed from its origin to its destination. This corresponds to mobility at the strategic level. At the intermediate position between the macroscopic and 2.2 Human Mobility Through the Space-Time Prism 11 microscopic representations is the mesoscopic approach, which aggregates several individuals, with the same properties and moving in the same environment, into groups [15]. This means that mesoscopic models do not attempt capture mobility of every single individual, but rather group mobility, where every group has its own rules of behavior. Similarly to the microscopic models, this level of abstraction concerns tactical and operational levels; however, since individual behavior is deter- mined by the group properties, the operational level is represented with less detail than in micro-mobility. The mesoscopic models in this regard represent a compro- mise between macro- and microscopic models, aiming to achieve the computational eciency and analytical tractability of macroscopic models, but at the same time being able to represent some of the diversity and individuality of real users. Micro- scopic mobility is the most precise abstraction of how individuals move, but the use of such models is limited to analysis of small areas (or small number of nodes) and to agent-based simulations. Besides, for the majority of practical applications to wireless networking, namely in cellular and traditional wireless local area networks, modeling an individual’s behavior in a high level of detail has limited usefulness, since the details of interaction between individuals are not considered essential for the system performance [16]. Mesoscopic models, therefore, should be preferred to microscopic models, for finer granularity representations, however they are insu- cient when describing an entire system consisting of heterogeneous mobility groups, moving in larger spaces and observed during long temporal scales. In such cases, the use of macroscopic models is necessary, since this category represents mani- festations of strategic and tactical decisions such as the route choice and activity scheduling, given certain constraints. To summarize, the structure of mobility can be considered in dierent aspects: in the sense of constraints and capabilities, levels of abstraction and the scale. Next we consider measurable properties of mobility.

Metrics of Mobility

In the wide spectrum of existing mobility models, many of these models were tai- lored to a specific networking application and resulted from dierent modeling objectives. Due to a possibly significant overlap across the models, eectively com- municating the models and modeling approaches is beneficial to systematize the knowledge in the field. Despite numerous eorts to introduce a standardized ter- minology to this end, there has not been an unanimously accepted classification. Nevertheless, there is some consistency in the metric space that can be used as a reference. These metrics can be classified in three categories based on the number of nodes considered for the metric: individual, pairwise and community metrics. The provided listing of the metrics is only illustrative; we use the introduced met- rics to exemplify dierent modeling approaches and to define metrics that will be used in the contributed papers. 12 Chapter 2. Understanding and Modeling Human Mobility

Individual Metrics Individual mobility metrics are used to describe behavior of a single node, that is, its trajectory in space and time. These metrics consider the nodes as independent, but aim at capturing the diversity of the individual patterns. To this end, metrics in this class are usually represented via probability distributions over possible sets of parameter values that the overall node population may exhibit.

• Visiting time, also pause or sojourn time, is the time that a user spends in a given location, before moving to the next location. • Flight length or jump size is a measure of the length of each segment in a user’s path. • Number of visited locations [17] describes the spatial dimension of mobility by measuring how much of the simulation space individuals cover. • Location visiting preference and periodic re-appearance at a single location capture the distribution of the time that a user spends at a given location and the frequency of returning to a given location. These two metrics have a significant role in distinguishing realistic movement from random motion. • Entropy and entropy rate represent the measures of uncertainty of the future locations that the user will visit. The entropy considers only the distribution of probabilities of visited locations, ignoring the order in which the user visited these locations. Given the sequence of visited locations L, the number of distinct locations in the sequence N, and the probability pi of being at a location i, the entropy is measured as = N p log p . The entropy H ≠ i=1 i 2 i rate observes the temporal correlation between visits to dierent locations and is defined as a conditional entropy of the last visitedq location given the past, r=limn (Xn Xn 1,Xn 2,...,X1) where the user’s location at time i is H æŒ H | ≠ ≠ assumed to be generated by a random variable Xi,i=1,...,n.

Pairwise Metrics This group is also referred to as encounter, as well as connectivity metrics. We can broadly define contact as the basic unit of interaction in mobile ad hoc networks; a contact captures information about a pair of nodes interacting and the time and duration of interaction. • Contact duration is the time that two nodes spend in each other’s transmission range. In opportunistic communication, this metric relates to throughput: the shorter the duration, the fewer messages can be transmitted. • Inter-contact time is the time between two consecutive contacts for a pair of nodes. In some use-cases, the inter-any contact time is relevant to measure how often a node interacts with any other node. 2.2 Human Mobility Through the Space-Time Prism 13

• Contact rate is defined as the total number of contacts that a node establishes during its lifetime normalized by the node’s lifetime, i.e., it is the number of contacts per unit time. Since our studies are focused on systems with churn (open populations), we consider contact rate to be a more suitable metric for such settings than the number of contacts, which is usually measured in closed, fixed population systems.

Community Metrics At the macroscopic level of mobility, pairwise patterns get aggregated and we can observe the emergence of the group or community patterns. Community met- rics come from the network theory, which is currently mostly developed for static networks, thus somewhat disconnected from our main focus. These metrics are especially interesting for designing social opportunistic networks.

• Node degree represents a cumulative metric of the number of contacts a user encounters. Studying the distribution of node degrees in the network helps identify the type and the structure of the network, i.e. whether it is a small- world, scale-free or a random network. The distribution of node degree is an important property, since it helps answering, based on this simple classifica- tion, questions about processes occurring on a network, for instance how fast an epidemic would spread on it.

• Centrality measures the importance of a node in the network and there are dierent types of centrality: degree centrality, directly related to the previous degree metric, which assesses how many direct neighbors a node can aect. Betweenness centrality is a measure of the extent to which a node is connected to other nodes that are not connected to each other—in other words, this metric measures how critical the node is to keep the entire network connected. One application is in finding the shortest paths between two nodes in the network. Some advanced opportunistic routing protocols leverage the node’s betweenness centrality in order to find bridge connections to otherwise poorly connected network partitions.

Developing a realistic mobility model involves determining which metrics are essential for the target scenario and fitting the model accordingly. However, by adjusting the model to exhibit observed statistical properties of metrics in one group, for instance individual metrics, the properties of the model in the same or in the two other groups, pairwise or group metrics, may be aected. Consequently, with such alteration of metrics, the model may yield either undesirable properties (e.g. too short or missed contacts) or infeasible parameters (unrealistically high speeds of movement). The former case requires refinement of the metric space, while the latter case mandates re-evaluation and prioritization of the most important metrics that need to be captured. 14 Chapter 2. Understanding and Modeling Human Mobility

2.3 Historical Overview of Approaches to Mobility Modeling Over the last two decades, mobility modeling has been a prolific area. Herein we do not survey or classify particular models, instead we provide an overview and a timeline of relevant surveys; each survey, at the time of its compilation, documented the state of the art models and progress in the area. A first classification of random walk models has been proposed by Bettstet- ter [18] in 2001. Camp at el. [10] survey and discuss several synthetic models, which are mostly based on random node movement and classified as either entity or group models. This work has been formative both for its historical value as being one of the earliest surveys of models used for simulation of ad hoc networks and for raising concerns about the reproducibility of performance results when dierent mobility models, or even dierent parameters for the same models, are used. Bai and Helmy [8] further investigated the reproducibility of results for routing proto- cols performance, documenting a stark variation in the ordering of protocols based on the underlying mobility models. The first survey that addresses social ties in mobility models is provided in [19]. The authors give an overview of three model categories: random movement-based models, which they denote as purely synthetic, the category of models that incorpo- rate some properties observed from real traces, such as the social network structure, and the third category of models based on the real world traces. The survey also presents a few pointers to network and connectivity simulation tools, and raises important—yet still not entirely answered—questions regarding the necessity for standardized benchmarks for protocol and system evaluation and for standardized formats of mobility datasets. As trace collection eorts caught up the momentum, by 2011 many publicly ac- cessible libraries of mobility traces have become available. Aschenbruck et al. [20] survey these libraries and traces, and categorize them by the primary purpose for collection, i.e., for analysis or for derivation of statistical properties. In this article, the authors also classify models in terms of model dependencies and restrictions of node movement. With a particular interest in opportunistic networking scenar- ios, Karamshuk et al [21] observe manifestations of human movements and project the extracted properties along the temporal, spatial and social dimensions. In the same networking domain, Thakur and Helmy [9] propose a framework that includes a multi-dimensional mobility metric space to investigate the adequacy of existing mobility models, but with the dierence that metrics are projected on the individ- ual, pairwise and collective (communal) axes. These all are protocol-independent metrics used for evaluation and comparison of protocols performance in oppor- tunistic setting. The fundamental idea behind the framework [9]istoexplicitly synchronize events leading to nodes visiting the same location at a given time, hence forcing the emergence of contact opportunities and explicitly targeting the application scenario. Remark that this principle of event synchronization imple- ments coupling constraints from Section 2.2. 2.4 Scenario-Based Modeling 15

Considering mobility modeling from a broader perspective, that is, for mobile networking research in general and not only limited to opportunistic networking, two recent surveys aim to establish systematic approaches to developing and com- municating mobility models. The survey by Treurniet [22] focuses on microscopic models and introduces classification of models with respect to implemented model- ing decisions: target and path selection (strategic and tactical mobility decisions), motion determination and obstacle avoidance (operational mobility decisions), and group dynamics. Hess et al. [23] devise a framework for developing and validating new models and provide guidelines for choosing among already available models. Finally, with the proliferation of machine learning techniques for mobility pat- tern mining, numerous new methods have been devised to extract information from large-scale data and multiple sources. These methods are surveyed in [24]. The ref- erence also proposes a taxonomy for the methods based on the application of the analysis: whether it is for user, place or trajectory modeling.

2.4 Scenario-Based Modeling

This section proposes a concise roadmap with the main steps for devising or choos- ing a mobility model to fit the target application. Herein we also highlight the need for a specific model to be used for the performance evaluation of opportunistic content distribution. A scenario gives a description of node behaviors in a given setting. These behaviors must be reflected in the corresponding mobility model. When developing a mobility model, or choosing among existing ones, a modeler first needs to consider the specifics of the physical area where the communication takes place. The scale of the area determines whether the population should be considered open or closed: in smaller areas, the mobility of nodes is unlikely to be confined within the area boundary, and the node arrival process must be included in the mobility model. In addition, the function of the space can be instrumental for characterizing the arrival process and the dynamics of an open population. These properties are important to be captured since they are likely to aect the performance of the communication scheme, e.g., bursty arrivals may aect the performance dierently from arrivals with smaller variations. If the area of interest is large, for example if the goal is to model movements of an entire city population, a single model may not be able to capture mobility in sucient detail. Multiple models can then be applied hierarchically. On a larger (macro) scale, the primary mobility model governs the selection of the city region [25, 26] where the nodes travel to in order to perform certain activities. The secondary, lower lever mobility model is next used to move nodes inside the chosen region, until the primary model recalls the node by selecting a new region. Depending on the intended level of detail, there may even be need for the third-level mobility model responsible for the operational mobility. As a result, the composite model would comprise one model for each of the structural levels of mobility (see Table 2.1). This is especially true for generating opportunistic 16 Chapter 2. Understanding and Modeling Human Mobility networking scenarios, where each of the levels can impact the network performance dierently. Namely, the strategic and tactical mobility determine the length of inter-contact times, while the operational mobility may aect the occurrence of contacts and their durations. Upon deciding on the structural levels of mobility that should be captured in the targeted scenario, the next step concerns the exploration of the parameter space and the choice of mobility parameters. Helgason et al. [16] conclude that the inter- action between nodes caused by mobility and the structure of the space in a given scenario have larger eect on the system performance and are hence more important than the detailed estimation of the mobility parameters, namely, the node arrival process, speed distributions and walking behavior of the nodes. However, for the performance of applications built on top of mobility models, the particularities of a scenario such as the internal structure of the space, may not be crucial, as we show in Paper D. That is, accurately capturing connectivity patterns is more important than recreating the movement caused by the physical constraints of the space where communication takes place. In dierent networking scenarios, dierent properties of mobility must be captured. Below we explore a few motivating examples.

Opportunistic Networks The main objective of generating opportunistic networking scenarios is to capture contact events among the nodes. The structure of an opportunistic network can be represented via temporal graph with nodes as vertices and intermittent contacts between two nodes represented by graph edges. Depending on the use-case where opportunistic communication takes place, several sub-cases can be distinguished. An important aspect of opportunistic social networks [27, 28, 29]istheunderlying social graph, which should emerge from the recreated mobility patterns. As opposed to this, ephemeral opportunistic networks serving to disseminate contents in urban areas [30] or one-time events [31] do not rely on the social connections, in fact, social structure may not even exist and the mobility model only needs to reflect the dynamics of the network graph changes.

Modeling Dynamic Mobile User Populations One of the use-cases for opportunistic communication that we described previously is location-aware content distribution in urban areas. In this context, the population of mobile users is subjected to frequent changes in the number of users currently residing in the locale. It is therefore important to take churn into account, that is, the mobility model must incorporate the process of users joining the population, i.e., the user arrival process and the process that explains how users leave the system and are no longer considered relevant for communication, the user departure process. Motivated by the need to evaluate the performance of this particular type of opportunistic applications, we have identified the lack of simple, analytically tractable models for populations with churn (also termed open populations). The 2.4 Scenario-Based Modeling 17 contribution in this respect is provided in Papers D and E where we propose, parametrize and validate a model based on queuing theory.

Future Networking Applications Planning deployments of future generation mobile networks requires rethinking how user mobility will aect connectivity and handover. In 5G technologies, for instance, the promise of high data rates can be eciently realized by reducing the coverage areas, that is the radius of network cells. Moreover, the key technologies in 5G, millimeter wave wireless transmission technologies, further reduce the transmission distance between the cell tower and the users. The impact of the user mobility therefore increases as the cells shrink, and to support seamless connectivity in spite of frequent handovers, mobility should be considered at a finer scale then in the current cellular networks. The main parameters to be modeled accurately are the patterns of user arrivals to the cells, the cell visiting times, also termed pause times, and the transition probabilities between neighboring cells.

Chapter 3

Collecting and Analyzing Mobility Data

Data is a precious thing and will last longer than the systems themselves.

Tim Berners-Lee

The purpose of collecting location data is manifold: to observe mobility and its driving forces, to parametrize mobility models, to replicate mobility scenarios by importing the traces to mobility simulators, and most recently, to perform ad- vanced data mining, for example for activity recognition or mobility prediction of users. In this chapter, we scan the space of methods for collecting mobility data, survey several well-known traces and trace libraries, and summarize the statistical properties extracted from trace analysis.

3.1 Trace Acquisition and Source Origins

Understanding and modeling human mobility, as we have seen in the previous chapter, has a long history. For demystifying the laws and forces that drive human movement, empirical data—gathered both at the population and at the individual level of mobility—has been an indispensable resource of information. The first attempts to systematically describe human movement can be traced to even before Pearson’s mathematical model for a random walker, to the 1885 publication The Laws of Migration [32]. This study used census data, one of the first and still relevant data sources for modeling human mobility at population levels [33, 34]. Complementary to this, survey data about individuals’ activities has been analyzed in [35, 36]. Census and survey data from these examples share the commonality of the data gathering procedure—which is that they do not rely on

19 20 Chapter 3. Collecting and Analyzing Mobility Data any technological means to record human mobility. Another such source of data is transportation and travel surveys. Representative examples are the Travel Tracker Survey1 for the metropolitan area of Chicago, documenting detailed trips (including trip purpose, transport mode, departure and arrival times) for each of the members of more than ten thousand households, and a similar survey conducted for the city of Los Angeles2 [37]. In the last decades, the availability of communication and localization tech- nologies as well as the pervasiveness of human-carried devices such as smartphones have been the main enablers for collecting large datasets of human movements. Carried by their owners throughout the day, smartphones are generally seen as proxies for the personal identity and location of their owners. A game-changing methodology for uncovering mobility patterns was introduced with the analysis of Call Data Records (CDRs) [38]. CDRs contain attributes about telecommunica- tion transactions, such as a voice call or message exchange, including, among others, identifiers and locations of the two participating end devices. In addition to po- sitioning information provided by cell towers, mobile phones utilize measurements of Global Positioning System (GPS) and the locations of nearby Wi-Fi networks. Being collected by devices, these measurements are considered as the ground truth for tracking the users’ locations. Services and application interfaces, such as Google Location Services [39] further facilitate accurate localization by leveraging multiple sources of user’s digital footprints. Next we survey common methods for collecting traces, from smaller, controlled experiments to large collection campaigns. Two principal methods to collect traces are [20]: 1) by recording user locations directly from their devices by means of dedicated or built-in tools, and 2) by collecting records that users leave in commu- nication systems—digital artefacts. We discuss these methods in more detail.

Recording Device Location Recording GPS coordinates of users devices is the most widely used way to obtain location data. However, GPS depends on the quality of satellite signals and factors including weather conditions, physical obstacles such as buildings, which all limit its outdoor accuracy to several meters at best, while performing poorly indoors. For accurate indoor positioning, alternative localization techniques are available, for instance, techniques that utilize measurements from Wi-Fi signals or infer proximity from Bluetooth Low Energy (BLE) enabled devices. While monitoring location directly from devices—when accurate—is beneficial, since it gives the ground truth, running large scale experiments is challenging either because it is cumbersome to carry out, e.g., if it requires recruiting participants and deploying dedicated devices, or raises privacy concerns, when participants are us- ing their personal devices. Taking necessary measures to insure the privacy of user

1Available: http://www.cmap.illinois.gov/travel-tracker-survey. 2Available: http://www.scag.ca.gov/travelsurvey. 3.1 Trace Acquisition and Source Origins 21 identities and data, several research groups have conducted experiments employing hundreds of participants and made the collected traces publicly available: Nokia Mobile Data Challenge [40, 41], Geolife [42], and SensibleDTU [43]. By collecting measurements from multiple sources such as device sensors (for proximity detec- tion) and phone usage records, these studies sought to understand life patterns of individual participants, for example, what places they visited, as well as the social interactions among the participants, the number of which ranged from one hundred to nearly one thousand; the duration of experiments spanned from several months to two years. Participants in these campaigns were mostly student volunteers and researchers who were running the experiment, thus the studies gave a biased view on human mobility in general. Still limited to a specific scenario, but in a dierent context, mobility traces were obtained from theme parks and country fairs [44, 45]. For the Locaccino study [46], Cranshaw et al. conducted an experiment in which they collected locations of 489 users and their social network ties, both in physical world and in location-based online social networks. This study aims to identify the implicit social link between physical interaction and online connections among users; the dataset, however, is not publicly available. There also exist studies on movement patterns following an emergency and disaster situations, such as after an earthquake or natural disasters (storms, flooding) [47, 48]. The specifics of the networking scenario which we aim to recreate determine the sampling resolution for obtaining the location, as well as the format of the trace and what additional information is necessary for the analysis and mobility model development. For instance, observing an individual’s location independently of other users suces for use-cases when services are provided to that individual only, but is probably less interesting for studying social interactions. As opposed to this, understanding contact patterns is critical for designing and evaluating op- portunistic communication schemes. While the contact information, for a pair or group of individuals, can be inferred for example, from geographical coordinates, this requires high spatial and temporal granularity of the examined trace, and assumes that two users are able to communicate if located within a certain dis- tance, which may not be the case due to physical obstacles or interference. Thus, some measurement studies combine recording both users’ locations and contacts be- users. Contact measurements provide information about people co-located in time and space and as such, this type of measurements is especially interesting for opportunistic communication. Beyond applications in networking, capturing and characterizing contact patterns has an important application in other disciplines, e.g., epidemiology, for monitoring and controlling the spread of infectious diseases. The exact definition of a contact varies for dierent applications. In opportunis- tic networking, we are concerned with user proximity within their devices’ wireless transmission range, which depends on the used technology, whereas in epidemiology only contacts within certain (short) distances matter. Dierent objectives of the experimenters have resulted in capturing human proximity at various scales: from several meters—a typical range of Wi-Fi and Bluetooth communication [49, 50, 51, 52, 53, 54], to few meters long distances [55] and close face-to-face interactions [56]. 22 Chapter 3. Collecting and Analyzing Mobility Data

Mining Cellular and Wireless Networks Data The second method exploits data from existing communication systems and net- works, including massive communication records from cellular networks (previously mentioned CDRs), but also smaller data sources from university and enterprise wireless networks. Since the data can be pulled from a single, or a few points in the network, this approach is more practical for acquiring large-scale data, in terms of the number of users and the trace duration. However, the location accuracy of such traces depends on the density of network elements, i.e., base stations in the cellu- lar network or access points (APs) in the wireless network; moreover, it is limited by the assumption that users are connected to the nearest AP or a base station which may not always be true. Another large source of mobility information is user-contributed content, for instance, geo-tagged photos and tweets, and check-ins to location-based social networks [57]. As the measurement from a wireless network was the primary method to collect traces analyzed in this thesis, we will continue and extend the discussion in Section 3.2. CDRs contain timestamped geographical coordinates for a user for every call made and text message sent or received. These records represent a vast amount of data about every mobile network user, automatically generated by telecommu- nication systems and archived for billing purposes and network troubleshooting. The great advantage of using CDRs is the amount of data, both in terms of the number of users and in terms of number of communication events. One of the first CDR-based studies were [58, 59]; Onnela et al. [58] used the dataset that covers seven million users to reconstruct social structure and infer the strength of commu- nication ties between the users in the set. Gonzalez et. al. [59] investigated a set of hundreds of thousands of users to infer individual’s trajectories and calling activi- ties. The dataset analyzed in [60] is probably the largest CDR trace—and perhaps mobility trace in general—comprising records of more than 60 million people over three months in 2013. However, CDRs on their own do not provide suciently high granularity for strong conclusions about users movement since they exhibit vast uncertainty of users’ whereabouts in-between calls. Calabrese et al. survey existing filtering and processing techniques to extract insights from cellular network data [61], while Chen et al. [62] propose adaptive techniques to reduce temporal sparsity in the CDR traces.

Databases and Trace Libraries While most of the collected traces were meant to serve for specific studies and, due to the nature of the traces and privacy concerns, have not been publicly disclosed, there have been initiatives to gather datasets and make them available to the en- tire research community. One such initiative resulted in the CRAWDAD project [63], a community archive for wireless network data, housing around 120 datasets collected for purposes such as network trac characterization, monitoring of signal strength, bit-error rates and energy eciency in wireless local and sensor networks, positioning of pedestrians and vehicles. 3.2 WLAN Mobility Traces 23

Aside from CRAWDAD, several stand-alone repositories also provide open data: a sizable and significant contribution is the multi-source datasets provided by Tele- com Italia [64] comprising two months of telecommunications, weather, news, social networks and electricity data from the population of the city of Milano. Other cam- paigns, for instance Orange Data for Development (D4D) [65, 66] were open only for limited time and upon request, but their significance is in the volume of the data and the fact that they comprise large populations of users—up to 9 million mobile network subscribers. Cellular network datasets are dicult to keep publicly available since the data contains not just locations of the users but also call and messaging details, and records of user data trac.

Concerns About Data Privacy and Anonymity Conducting research on users’ private data, such as location, has to deal with the contrasting goals of reproducibility and disclosure of sensitive data–which is why there are very few publicly shared datasets. There are two ways to attack this challenging issue, depending on the the trace collection method. First, if the data has been collected experimentally, with consenting participants, it may be possible to release at least a part of it, depending on the agreements and the level of anonymity in the released subset. Furthermore, participants must be able to understand the implications of sharing their data, but also have the opt-out possibility, if they wish to stop their data to be used in some future studies, for which they have not consented to at the time of trace collection. The second case concerns datasets for which obtaining consent from all involved individuals would be impossible, examples are CDRs and WLAN traces. Thus the first step towards obfuscating users private data is to anonymize the users’ identifiers, but this is insucient, since the data providers should guarantee that users cannot be re-identified with even anonymized data. The problem of re-identification is also known as the “power of four”, that is, only four data points is all that is needed to uniquely identify, with a high probability any individual [67, 68]. Fortunately, security research has been very active in this field and several solutions have been proposed including: k-anonymity for user trajectories [69], a method which “hides” a user in a cluster of users with similar movements, location obfuscation [70], i.e., irreversibly altering users locations so that they do not represent real locations and a solution achieving dierential privacy [71] by generating a synthetic model that closely reflects a real CDR trace.

3.2 WLAN Mobility Traces

As a special case of large datasets from communication networks, we consider wireless local area network (WLAN) measurements and mobility inferred from these data. The context where the data acquisition takes place is usually a single institution—such as a university or a company—providing access to its residents. 24 Chapter 3. Collecting and Analyzing Mobility Data

The merit of WLAN measurements is that they can be relatively easy to acquire, for instance if the access data is regularly logged for statistical purposes, such as in public networks providing Eduroam roaming services [72]. A few main advantages of collecting and processing these traces compared to records from cellular networks should be noted. First, the cell size in mobile networks can be at least an order of magnitude larger than the coverage area of access point in WLANs, thus using mobile network records may result in coarser location granularity as compared to WLAN traces [62]. The second is the possibility to directly collect the traces from the network, and to protect the privacy of users by anonymizing their identities, lo- cation and trac data during the acquisition procedure. The same procedure may be much more dicult to carry out in the case of cellular network data: due to the business-sensitive nature of this type of data, mobile operators are often reluctant to share the records with researchers. Below we survey to main groups of WLAN mobility traces, aquired in campus WLANs and in public hot-spots.

Campus Traces

The procedure for acquiring traces of user access and trac patterns from WLANs can take several forms: from the simplest one consisting of pulling access and ac- counting records from the authorization server, to the more complicated procedures which assume setting up syslog event triggers, polling APs through Simple Network Management Protocol (SNMP), and the implementation of tcpdump sniers [73]. Even though the latter procedures are cumbersome due to the vast amounts of generated data, the goal of obtaining measurements (possibly with the assistance of network administrators) is attainable. Thus, majority of the analyzed WLAN traces come from research institutions and university networks. One of the earli- est studies, from 2000, was published by Tang and Baker [74] who analyzed the behavior of users of a wireless network in an oce building at Stanford Univer- sity, in particular patterns of how often and at what locations users access the network. Reference [75] studied the user behavior and network performance in a campus environment, however in the context of users accessing a public WLAN network since the trace captures activity of a more diverse set of users—conference attendees. Chincilla et al. [76] analyzed wireless information locality and access patterns at the University of North Carolina. Kotz [77] and Henderson [78] ana- lyzed, to this day, one of the most popular wireless network traces—the University of Dartmouth trace. Tuduce and Gross proposed a model of users moving inside a campus [17]. The above references incorporate trace analysis of both handheld devices and laptops; the latter type of devices however is clearly not best suited for inferring the users’ mobility with high temporal granularity. Focusing on the wireless network traces generated by users’ personal digital assistants (PDAs) [79], the authors pursued to understand and model mobility traces and usage patterns of users with hand-held devices, contrasting those and comparing with—at the time— dominant laptop users. All the above pioneering works established the methodology 3.3 Mobility and Network Simulators 25 for recording and analyzing users movement and network usage patterns. As a con- tribution in this thesis, the traces collected from our university campus network at KTH are analyzed in Papers A and B. Many researchers continued to examine traces from campus networks [44, 50, 80, 81, 82]. The major limitation of university wireless network traces is that the scenario is limited to users on the campus, e.g., students and employees, who may not even be on the campus all the time during the day and if they are on the campus, there is a high degree of repeatability of their behaviors.

City-Wide Traces Public WLANs capture a wide range of user profiles, thus traces from these networks can provide a more general view of the user mobility and network usage than the university traces. The analyses of such traces have implications for network trac modeling and provisioning, and investigating users movement patterns, in particular for considering hot-spot occupancies. Afanasyev et al. [83, 84] analyzed trac and usage of the Google’s metropoli- tan Wi-Fi network deployed in Mountain View. Ile Sans Fils [85] is a dataset of wireless sessions collected for six years from Wi-Fi hotspots deployed in the city of Montréal, Québec. The trace contains session records for around 115 thousand users connected at 352 hot-spots over a period of six years. The trace is explored in dierent contexts: Isaacman et al [86] use the traces to explore the design space for networks operating with constrained connectivity, Bao and Liang [87]tomodel user transitions between hot-spots by means of queuing network analysis, while [88] examined the trace to predict locations that users will visit next. Ghosh et al [89] analyze and model trac and association patterns of users at hot-spots in two large US cities, and highlight the dierences for the modeled quantities across dierent venues (e.g., coee shops versus enterprises).

3.3 Mobility and Network Simulators

The two main applications of mobility traces are in analytic and in simulation- based studies. As previously explained, the aim of analytic studies is to infer properties of human mobility—either universal (e.g., flight lengths, speed distribu- tions) or scenario-specific (visiting time at a given location). In simulation-based studies, mobility traces are used to replicate real-world mobility in a simulation environment, which is a crucial first step in the testing and development of network protocols and in the evaluation of mobile systems performance. This section ad- dresses the use of mobility and integrated (mobility and network) simulators in the domain of opportunistic communication. We limit the focus on the two simulators that we use in this thesis: the first one is a pure mobility simulator, and the second one can be used to simulate both mobility and networking operations. 26 Chapter 3. Collecting and Analyzing Mobility Data

Pedestrian Mobility Simulator Recognizing that opportunistic communication strongly depends on microscopic mobility and aiming to fill the gap in the space of high resolution mobility traces, VukadinoviÊ et al. propose the use of a pedestrian mobility simulator for generating mobility patterns of users in urban scenarios, such as people walking on a street. The traces were generated by Legion Studio [90], an agent-based pedestrian mo- bility simulator used for commercial applications and design of large public spaces such as subway and railway stations, stadiums and airports. The simulator is in- tended to recreate very realistic operational mobility of individual nodes, taking into account dierent behavioral models and walking speeds, thus implementing the capability constraints, as well as the coupling constraints manifested through the interaction with other nodes (due to queuing or congestion) or with physical obstacles in the environment. The same approach towards modeling microscopic mobility was applied in [16] to study the impact of pedestrian operational mo- bility on connectivity of users in two scenarios: pedestrians in a grid of streets and passengers in a subway station. We use the same traces in Papers D and E to parametrize our open population model and to evaluate the performance of content spreading schemes in these urban scenarios. We note that pedestrian mo- bility simulators are a valuable source of realistic mobility traces since they allow for generating customized, scenario-specific mobility of arbitrary large user pop- ulations, while faithfully recreating both social and physical forces driving users’ mobility. On the down side, sophisticated commercial solutions may not be avail- able to most researchers, for example Legion Studio has been superseded by Legion SpaceWorks, a software no longer open for academic purposes. Nevertheless, elab- orate open-source frameworks such as Vadere [91] are also emerging in the space of pedestrian micro-mobility simulators.

Opportunistic Network Simulator The above discussed mobility simulators are essential for generating realistic and easily configurable mobility scenarios. However, for the purpose of testing the op- eration on an entire opportunistic mobile system, we need yet another type of a software tool—a network simulator. Several such simulators are available and we use an integrated mobility and network simulator, the Opportunistic Networking Environment (ONE) [92], to obtain contact patterns from the aforementioned Le- gion mobility traces. The ONE simulator is a simulation environment capable of generating movements of nodes and handling delay-tolerant networking operations including event generation, message exchange, DTN routing and application pro- tocols, visualization and analysis. The generated events include connection events (link up/down) and messaging events (message received/transferred/deleted). The simulator is built as an extensible framework with interfaces for importing and ex- porting mobility traces, connectivity events and messages. From the perspective of mobility simulation, node movements can be generated from predefined (synthetic) models or from real traces. Chapter 4

Opportunistic Networking: Applications, Platforms and Performance Evaluation

The Internet has introduced an enormously accessible and egalitarian platform for creating, sharing and obtaining information on a global scale. As a result, we have new ways to allow people to exercise their human and civil rights.

Vinton G. Cerf

The main application area considered in this thesis falls under opportunistic networking. In this chapter we introduce the context for research problems that pertain to the design and evaluation of these specific types of networks, and we motivate a subset of research questions addressed in our publications. First, we provide a historical overview of ad hoc communication networks, positioning op- portunistic networks in this landscape. Next, we describe the system setup and elaborate on a particular type of opportunistic networking application in the con- text of a location-aware content distribution scenario. This scenario will be the focal point of the feasibility study and the performance evaluation in Papers D and E. In this chapter, we also survey mobile opportunistic system prototypes and (commercially or experimentally) deployed systems in this area. Finally, we intro- duce mathematical tools used for modeling and analyzing the performance of the investigated networking application.

4.1 Mobile Ad Hoc Networks and Delay Tolerant Networking

The concept of wireless multi-hop, self-organizing communication networks is as old as the first packet-switching radio networks. An example of the latter is the DARPA’s Packet Radio Network Project (PRNet) [93] (1973). PRNet featured a

27 Chapter 4. Opportunistic Networking: Applications, Platforms and Performance 28 Evaluation distributed architecture of broadcast radios and relied on the combination of Aloha and carrier-sense multiple access (CSMA) protocols for the dynamic sharing of the broadcast radio channel. The system however revealed scalability, security and en- ergy management issues. To address these issues, in 1983 DARPA launched the Survivable Radio Networks (SURAN) program [94], aiming to develop techniques that would provide survivable communication for thousands of nodes in a dynami- cally changing network, under severe stress conditions. After these initial attempts, in 1997 the U.S. Army started a project called Tactical Internet [95] with the goal to develop the architecture and routing protocols for seamless data communication across heterogeneous networks. The project provided an important feedback on the use of modified commercial Internet networking protocols in the tactical environ- ment, based on extensive modeling, simulations as well as the evaluation of a large scale deployment including several thousands of radio devices and controllers. The emergence of portable communication devices and the introduction of wire- less technologies such as Wi-Fi and Bluetooth in the early 2000’s, initiated nearly two decades of research eorts on mobile ad hoc networks (MANETs) and the shift from military deployments to the commercial sector and the standardization community. MANETs were envisioned as general-purpose in the sense that their design should support any legacy TCP/IP application, rather than to adhere to any specific application. Possible applications spanned from tactical operations to emergency and disaster recovery services, to home and enterprise networking. The core of MANET research focused on enabling legacy Internet services in infras- tructureless, autonomous networks—in eect, replicating the functionality of wired networks by replacing wired links with wireless ones. MANETs inherited common characteristics found in wireless networks: varying signal-to-noise ratio, dierent transmission ranges of dierent nodes, interference; and, in addition, exhibited characteristics specific to ad hoc networking: an autonomous and infrastructureless topology, the mobility of the nodes, multi-hop communication where each node acts also as a router, and the uncoordinated sharing of the wireless medium by all nodes in the transmission range. Routing protocols developed for wired networks were no longer suitable for networks with frequent topology changes; new protocols had to be invented, but the problem was being solved by using wrong tools, that is, by trying to find end-to-end paths under the assumption that such paths always exist. In practice, however, the existence of end-to-end connectivity turned out to be feasible, at best, only in dense deployments and at added costs of decentralized network management. Concurrently with the main thrust of research eorts in the MANET area, the concept of delay-tolerant networking (DTN) started to develop. The DTN concept was built on the foundations of the Interplanetary Internet [96], aiming to achieve interoperability between diverse network architectures, especially those that suf- fer from frequent network partitioning and large communication delays (e.g. deep space communication or underwater acoustic networks), but also the types of net- works that are targeted for extreme environments (sensor-based and military ad hoc networks). In 2003, Fall [97] proposed the DTN architecture, which was based 4.2 Data Delivery in Opportunistic Networks 29 on the abstraction of message switching and used the store-and-forward operation. The DTN architecture was built as an “overlay” network on top of the transport and below application layers. This overlay, termed a bundle layer, maps the appli- cation data units into protocol units—message aggregates called bundles.Devices implementing the bundle layer, called DTN nodes, are responsible for handling bundle messages by enabling persistent in-network storage (to combat network partitioning) and by forwarding bundles in a hop-by-hop fashion to achieve reliable delivery. DTN concepts however did not pertain exclusively to disconnected mobile networks, but in general to any type of networks that experience challenges such as intermittent connectivity, large delays or include network-level heterogeneities. Expanding on the DTN paradigm and motivated by the idea that physical paths and mobility can be treated as a replacement of network links between the nodes, researchers have embarked on developing the concept of store-carry-forward com- munication. Borrowing conceptual ideas of MANETs that no network infrastructure should be required to enable communication between mobile nodes, opportunistic networks have emerged as a particular type of DTN networks. As the term implies, opportunistic communication exploits intermittent connectivity of devices which initiate message forwarding when, due to their mobility, they come into direct transmission range of other devices. Whereas in [98] the authors showed that mo- bility in MANETs can be utilized to improve capacity, in general, mobility in this type of networks is seen as a communication inhibitor. In opportunistic networks, the situation is the opposite: mobilityenables communication between otherwise disconnected network partitions.

4.2 Data Delivery in Opportunistic Networks

Data delivery in opportunistic networks takes two main forms: routing and content dissemination. The former communication mode can be seen as source-destination, or user-centric communication, and the latter one as broadcasting/multicasting or content-centric communication.

Routing in Opportunistic Networks The aim of routing protocols in opportunistic networks is essentially the same as those of MANETs—to find a path from the source node to the destination over a dynamic network topology. Approaching the routing problem equipped with DTN tools, the problem is reformulated such that the assumption of connectivity is re- laxed, where paths are formed by intermittently connected intermediary nodes, exchanging messages in a pairwise manner. One of the first proposed solutions was the epidemic routing protocol [99], which modeled message spreading as an infection targeting all nodes in the network, and therefore ensuring the minimum delay and maximum probability that the message finally reaches the destination. An obvious limitation of this approach was excessive usage of network resources, which moti- vated other researchers to seek resource-ecient solutions, while trading o delay Chapter 4. Opportunistic Networking: Applications, Platforms and Performance 30 Evaluation and delivery probability against resource consumption. The acceptable trade-o is achieved by considering a utility function to find the best possible next hop, i.e., the forwarding node. Various utility functions have been defined that concerned, most frequently, contact information (last seen contact, frequency or duration), node characteristics and resources (battery, computing power), or social relations. Notable among the first utility-based protocols are PRoPHET [100], a history-based protocol that replicates content items over high likelihood paths towards destina- tion, and Spray-and-Wait [101], which limits the number of copies generated by aected nodes. Socially-aware protocols optimize the selection of forwarding nodes by utilizing the information on social relations; this information is derived from the analysis of the users’ social network interactions. Two notable examples are SimBet [102] and BUBBLE Rap [103]. While SimBet uses the similarity among nodes as a metric, BUBBLE Rap bases the decision on the community membership and node’s betweeness centrality. Devising ecient routing algorithms attracted much attention in the opportunistic networking community, and resulted in a large body of work, as surveyed in [104]. Nevertheless, these protocols support the same, unicast communication mode, in the sense that they are source-destination oriented.

Content Dissemination and the Publish/Subscribe Paradigm Rethinking the application domain and the general trends of networking paradigms shifting from user-centric to content-centric, Karlsson et al. [105, 106] envisioned a broadcast communication mode (as opposed to unicast) for opportunistic networks, introducing podcasting as an application for mobile delay-tolerant networks. The first system tailored for opportunistic content dissemination was developed within the Podnet project [107]; the later enhancements of the system were addressed in [108, 109]. In this view, mobile nodes use a publish/subscribe scheme to exchange content items based on their subscriptions and in a peer-to-peer manner. The con- tent structure is divided into feeds—hierarchically organized logical containers used for grouping similar contents—and the dissemination protocol is receiver-driven, allowing the dissemination only of the content items that the user has requested. Thus, whenever two nodes with a subscription to the same feed meet, they will syn- chronize their views of the feed, and exchange the content items either one of the peers is missing. This system design [108]provedtobeecient not just for content dissemination, but also for mobile social networking [27], collaborative music shar- ing [110], and participative crowd entertainment [111]. Moreover, it conceptually inspired convergence of similar solutions [29, 112] towards the publish/subscribe paradigm. This is also the system model that we consider.

Performance Metrics The objective of opportunistic communication schemes is the delivery of messages either to a specific destination node or to all nodes interested in that piece of 4.3 Location-Based Opportunistic Networks 31 information. To assess the eciency of a communication scheme, several metrics have commonly been used.

• Message delivery delay is measured as the time required for a message to reach a single node or all nodes.

• Delivery probability estimates how likely is that a unicast delivery will be successful within a given deadline; alternatively, it can be used to quantify the fraction of nodes that received the message.

• Communication overhead is defined in terms of the number of hops—interme- diate nodes that relay the message, or the number of transmissions (message copies) per relay node.

• Probability of the content persistence represents the likelihood that a content item, stored only by the nodes participating in the opportunistic exchange and without relying on the external infrastructural storage, will remain available for a given (possibly very long) time period.

The goals of an ecient routing or content spreading algorithm—to minimize the overhead while maximizing the delivery probability are clearly at odds. Most of the proposed algorithms seek for some trade-o by using additional information and heuristics to decide when, that is, to which next-hop node and what messages to forward, nevertheless still relying on epidemic spreading algorithms as a primitive. We will not further discuss routing algorithms as they are outside of the topic of this thesis—a thorough examination can be found in [113, 114]—but we acknowledge that some commonalities with content spreading schemes permeate both perfor- mance modeling directions. In addition to the aforementioned epidemic forwarding principle, the assumptions of the mixing among nodes need to be postulated. We will shortly return to these modeling assumptions in Section 4.5.

4.3 Location-Based Opportunistic Networks

In diverse networking scenarios, ranging from occasional massive public gatherings (e.g., sport events) to static scenarios such as urban participatory sensing, large amounts of data can be generated. This data is often relevant only locally, and per- haps even temporally, therefore storing them in some remote server in the network core is not well justified. Location-aware opportunistic communication emerges as a potential solution, bringing an array of additional benefits. First, the tempo- ral relevance of the content allows the users to control the content availability by storing it on their devices, in an entirely decentralized manner. Once the content is no longer deemed relevant, it will be removed from the user devices and lost irreversibly. The lack of a centralized storage also makes it more dicult to store the data in a server where it could be easily susceptible to censorship. Second, the use of location-based services entails that the users disclose their location, at Chapter 4. Opportunistic Networking: Applications, Platforms and Performance 32 Evaluation least with some level of accuracy, leading to privacy breach in the users’ physical locations. With opportunistic communication, the users’ location privacy can be preserved through obfuscation [115]. Moreover, for applications such as participa- tory sensing, ensuring the validity of the collected and disseminated information, including the verification of the physical origin, is crucial. Possible solutions for the verification of the user-generated content have been proposed [116]. Having recog- nized the above benefits, researcher have investigated location-based opportunistic networking in the context of vehicular [117, 118, 119, 120, 121] and pedestrian scenarios [122], in a purely wireless, infrastructure-less domain [123, 124, 125, 126] and in mixed domains [105, 127]. In this thesis we focus on location-aware opportunistic content spreading where the content relevance is geographically limited to a dedicated zone. This type of applications use geofencing to create a virtual boundary—a geofence—around the zone of interest and to trigger the location awareness of of the users when they enter or leave the geofenced location. In particular, we consider urban areas, densely populated with mobile nodes and an opportunistic service for spreading local information, such as news, tourist information, transportation schedule or trac alerts. The service can also target opportunistic social networks in transient communities [31]. In this scenario, users generate content and assign to it an area of relevance. The dedicated area can be specified either explicitly, by defining the coordinates of a geofenced polygon, or implicitly e.g., “the city square” and relying on third- party APIs to determine the region of interest. Alternatively, content items can be generated by non-mobile entities, and injected into the wireless domain either via cellular networks or edge nodes to a set of visiting users. An example can be a city transportation agency that wishes to disseminate real-time trac updates and changes to public transportation routes due to an unexpected event. In accordance with the generic architecture of opportunistic podcasting [105], content items are published on channels with descriptive meta-data, where the location also represents an implicit filter. As shown on Fig. 4.1, the mobility of users is such that they enter the area, traverse through it and then leave. While inside the area, a user occasionally gets in contact with other users and if any of the two users participating in the contact event does not have the content item they are both interested in, an exchange occurs. All users are willing to contribute to content spreading for a certain time period; upon expiry of this period or after leaving the locale, users stop spreading the information. This is not an entirely novel use-case since several earlier studies have tar- geted similar scenarios. Terms such as hovering information [122] and floating content [123, 124, 118] have been used to described the property of content residing in a locale without a dedicated storage. Floating content is envisioned to support infrastructure-less, distributed content sharing over a certain area called anchor zone. The necessary conditions for providing such service are established in [125] and [128] for the case of stationary, respective mobile nodes. Focusing on vehicu- 4.4 Opportunistic Communication Systems: Prototypes and Applications 33

Figure 4.1: Location-aware content dissemination. lar communication, [120] presents a simulation-based study under what conditions vehicular mesh networks (highway and city-wide cases) can sustain information around the region where the information was generated. In [119] the authors pro- posed an overlay for vehicular delay-tolerant networks called Locus. The common objective of [119, 122] is to assess the system performance and dierent caching algorithms for the content survival. Ali et. al [126] perform an empirical study by deploying a floating-content application to investigate the feasibility and per- formance of the service in a campus/oce environment. While the above systems consider an area of interest marked by an anchor zone with a fixed radius, our service domain is somewhat dierent: we consider a geofenced area. Although such dierence is not essential to the system operation, it allows for dierent—and perhaps easier to human understanding—semantics.

4.4 Opportunistic Communication Systems: Prototypes and Applications

Perhaps the most rewarding recognition for the potential of device-to-device com- munication came from the 3GPP consortium which proposed support for proximity services (ProSe) in LTE networks (as of Release 12). Meanwhile, for more than a decade, researchers have been developing opportunistic system prototypes, mid- dlewares and proof-of-concept applications, some of these have been transformed into commercial products. Our main focus is on opportunistic systems for urban deployments, therefore next we survey most relevant systems in this category and present their main, distinct features. The Haggle [129] project defined the architecture for opportunistic, data-centric networking, in which application logic is separated from the underlying network. The idea behind the design of the system architecture is to enable seamless network connectivity and application functionality by taking advantage of both infrastruc- Chapter 4. Opportunistic Networking: Applications, Platforms and Performance 34 Evaluation ture and ad hoc networking possibilities. Thus, in the absence of infrastructure connectivity, mobile devices using the system can dynamically organize ad hoc topologies by forming connections via available wireless technologies, e.g., Blue- tooth or Wi-Fi. Similarly to the Podnet system [108], Haggle uses publish/sub- scribe model to facilitate search and retrieval of data from other users; however, the dierence between the designs of these systems is that content in Haggle is unstructured and matching is performed by using keywords. The SCAMPI architecture (Service Platform for Social-Aware Mobile and Per- vasive Computing) [29, 130] defines an opportunistic router called SCAMPI router, based on the protocol and architecture specifications of the DTN Research Group [131], as well as some platform-specific extensions, e.g., for creating application layer units—messages that map to lower layer bundle units. The router uses vari- ous mechanisms to discover peers and nearby services, such as IP multicast/broad- cast beaconing and static IP discovery for direct peer sensing, and transitive peer discovery supported by learning remote peers from their direct neighbors. Content discovery is based on message metadata, whereas the content exchange is supported by the publish/subscribe interface. Liberouter [132] represents an entire system for do-it-yourself local networking, built around the SCAMPI stack and ported to small, inexpensive computing platforms, such as Raspberry Pis1. CAMEO [28] is a context-aware middleware for opportunistic mobile social net- working, where the contextual information is derived from the local user, their device and interaction with other users and physical environment. This makes applications built on top of CAMEO also social-aware, and helps optimizing for- warding decisions. The above platforms have been accompanied with mobile services and applica- tions, to demonstrate the platforms’ functionalities. Here & Now [31] and Social Pal [133] are two applications built on top of the SCAMPI platform. The former application enables opportunistic social networking and experience sharing, while the latter one provides techniques to estimate the social path length between two users of a social network. SmartCitizen [134] uses the CAMEO dissemination pro- tocol to stimulate citizens’ active participation in collecting environmental data and to facilitate sharing useful contents related to the quality of life in their city. In addition, there are several commercial applications available from the applica- tion market, relying on or enhancing opportunistic networking mode. Among these, Open Garden’s FireChat [135] is probably the best known mobile social application for o-the-grid communication between smartphone users, having gained popular- ity during political protests, natural disasters and large festivals. Twimight [136] is an Android-based client for disseminating messages and sensor data in opportunistic manner, if no other connectivity is viable, for example, in disaster recovery situations. Finally, let us mention several notable systems engineered for rural deploy- ments. Experimental DTN deployments in rural areas have been used to enable

1http://www.raspberrypi.org/ 4.5 Mathematical Tools for Modeling Content Dissemination and of Performance Evaluation 35 communication in remote villages with nomadic population [137] or in developing regions [138], as well as for wildlife tracking and environmental monitoring [139, 140, 141].

4.5 Mathematical Tools for Modeling Content Dissemination and of Performance Evaluation

A rigorous and credible performance evaluation of opportunistic networking appli- cations often requires both analytic and simulation-based studies. By means of simulations, one can achieve higher levels of realism for the investigated scenarios, especially when based on real mobility data. Simulation-based studies, however, entail certain computational and time expenses—due to repetitions through which statistical confidence is achieved—and prohibit qualitative, more general assessment of the solution at hand. Analytical performance evaluation is therefore required to complement the simulation studies. In this section we survey analytical approaches to modeling the spread of in- formation on networks from two perspectives. A general approach, inspired by mathematical modeling of epidemic spreading as well as the specific case of model- ing opportunistic networks are presented. Then, we elaborate on the mathematical tools that we used to model opportunistic content dissemination in a dynamic net- work of nodes and to evaluate the feasibility and performance of the investigated dissemination schemes. These tools are applied in Papers C–E. In the following, we also provide rationale for the chosen modeling approach. First, the system model needs to consider contact processes and the underlying pairwise (or otherwise group) mobility. This is achieved by making assumptions about individuals’ mobility, where most often individuals are considered to be sta- tistically equivalent, their movements mutually independent and governed by an ergodic and spatially uniform mobility process. As a result, nodes exhibit homo- geneous mixing,where each node is equally likely to meet any other node. For the sake of mathematical tractability, homogeneous mixing is modeled as a Poisson pro- cess representing sequence of contact events, with the same intensity across dierent node pairs. While, on the one hand, the exponential property of inter-contact times is justified by empirical analysis of real-life traces, the assumption of homogeneity, on the other hand, runs the risk of oversimplifying the contact process and, as we showed as an example in Paper C, may lead to inaccurate assessment of the sys- tem performance. As a remedy to this oversimplification, several works introduced frameworks with dierent levels of heterogeneity: from moderately heterogeneous populations with a small number of mobility classes [142] to extreme cases treating contact processes for each pair of nodes separately [143, 144], as well as account- ing for contact dynamics which depart from exponential inter-contact times [145], specifically for Pareto and hyper-exponential contact patterns. Although being able to capture diversity of interactions among nodes, these schemes have some limita- tions. The focus of these models is on the heterogeneity of contact patterns. Other Chapter 4. Opportunistic Networking: Applications, Platforms and Performance 36 Evaluation research problems may consider the properties of the messages communicated, e.g. the age of information [146] or the version number [147]. Thus, re-purposing the heterogeneous mobility-based models to capture dierent protocol metrics and the heterogeneity (diversity) of the content states adds another level of complexity and renders them impracticable. In addition, such models may incur unnecessary com- plexity even for the problems they are aiming to address. As suggested in [148], modeling the spreading of information in a heterogeneous network with a Markov chain where every state represents not only the number of informed nodes, but also exactly which nodes are informed, can be simplified under certain assumptions. Specifically, if the pairwise contact rates are derived from the same probability dis- tribution, [148] shows that all the states with the same number of informed nodes are statistically equivalent. The main questions about epidemic message forwarding concern the number of infected (informed) nodes, the potential of content to persist in the network, the age of content and so on. Modeling epidemic processes can be approached from various angles, and below we discuss the two most common directions: the classical approach from mathematical epidemiology and the network-theory based approach. The first approach is inspired by the classical theory from the field of mathemat- ical epidemiology [149, 150], and is based on the assumption that the population can be divided into dierent groups or compartments depending on their infectious status (the stage of disease). Within a group, individuals are considered indistin- guishable. The most basic labeling includes susceptible nodes that have not yet contracted disease, but in contact with infected ones become infected and eventu- ally, after the incubation period expires, turn to recovered. Additional states can be included to denote, for instance exposed individuals—those that have been infected by the disease but cannot yet spread it to others, or individuals immune to the disease. By defining the basic, individual level processes that govern the transition of individuals from one compartment to another, we are able to model the evolu- tion of the number of individuals in each compartment as a function of the time. Depending on the chain of the states that an individual passes through, dierent models can be defined, most common being susceptible-infected-susceptible (SIS) or susceptible-infected-recovered (SIR). The extensions of the simple compartmental framework, including age-specific or spatially distinct groups, have been devised and studied, for example in [151]. This approach has been also adopted in other fields dealing with mean-field dynamic processes, such as physics and chemistry, where it was rephrased as a stochastic reaction-diusion process [152, 153]. Given the (nearly a century) long history of epidemic modeling, there has been a wealth of results for dierent configurations of the models, as well as the proofs of the models’ robustness and their predictive ability [149]. However, there has also been an evidence that the generalization and treatment of all individuals in the same compartment under the assumption of the identical mixing patterns, may be in- adequate in some real-world scenarios. This realization triggered eorts towards modeling with higher individual-level resolution. 4.5 Mathematical Tools for Modeling Content Dissemination and of Performance Evaluation 37 The second theoretical approach rests on the network theory [154] and aims to explicitly capture the diverse patterns of nodes’ interaction. In network theory, networks are mathematically described as graphs—structures comprising of points called nodes or vertices, and connections between the nodes called links or edges. Some important results are generally known, such that the degree distribution of the nodes (vertices) can determine the evolution of processes on the network (e.g. whether information, or a disease manages to spread) [154], or the eects of net- work topology on the rate and patterns of disease spread [155]. The contact network model explicitly represents interactions between any two individuals and these in- teractions mediate the spread of the disease in the network. Since building the exact network graph requires knowledge of every individual and every connection, characterizing the network model requires significant eort and may even become infeasible when the network scale increases. Hence, researchers typically have to work with approximate network models obtained through mean-field approxima- tions. Several mean-field approximation methods have been established to this end. The individual-based mean-field (IBMF) method [156] characterizes the spread of epidemic through the spectral radius of the adjacency matrix of the network graph. The adjacency matrix specifies all connections in the graph, with its elements taking value one for an edge connecting two neighboring nodes, and zero otherwise. IBMF models give solutions in the form of expectations of each network node being infected, where the state of a node is assumed to be independent of the state of its neighbors. For static networks, their solutions show reasonable fits to numerical results but the approximations tends to deteriorate when the independence assump- tion fails. The second, degree-based mean-field (DBMF) approach [157]describes the system in terms of probability that a node of degree k is infected at time t. It does so by regarding all nodes of the same degree as statistically equivalent, eectively reducing the complexity of the system. Compared to the individual- based approach, the DBMF approach does not assume independence of the states of neighboring nodes. This approach has proven to be, to some degree, suitable even for networks with time-varying connectivity patterns. The third approach maps the epidemic outbreak—the case when a large portion of the network gets infected—into a bond percolation problem in networks. This problem can be con- veniently tackled with generating functions [158]2 and it can be shown that, in the limit of the large graph size, there exists an exact solution of the average behav- ior of graphs under bond percolation. Percolation models are useful to determine the probability that an infected node belongs to an infinite cluster and have been applied in [125]. In addition to the mean-field approximations they rely on, perco- lation models however have other limitations as well. Such models can only answer whether the percolation happens, but do not allow for tracking the process over time.

2The title of this reference is Generatingfunctionology—possibly the best mathematics book title ever. Chapter 4. Opportunistic Networking: Applications, Platforms and Performance 38 Evaluation An additional complication in modeling epidemics is posed when the network topology changes at the rate comparable to (or higher than) the disease spreading rate. Topological changes could be due to mobility, causing frequent rewiring of graph edges, or when the network is constantly evolving, i.e., experiencing churn. A study of Volz [159] suggests that, in such scenarios, a static network graph may only serve as an approximate model to study the spread of diseases. This calls for the treatment of networks with time-varying graphs—a novel modeling approach based on the dynamic network theory [160] which considers not only the spatial, but also the temporal component of the network graph as its integral part. The main objective of our studies in Papers D and E is to devise a framework for modeling the content spreading process for location-based applications and, subsequently, to analyze what parameters determine the system viability. Further- more, we wish to assess what are the minimum requirements imposed on the nodes participating in content forwarding. To avoid working with time-varying network graphs, we reconcile compartmen- tal SIR models. However, we depart from the classical SIR models, which are commonly described by systems of equations in the mean-field regime and extend the system of equations by adding stochastic components to capture the evolution of the number of nodes in each compartment. The analytic system model features several essential parameters: the node arrival rate (to the entire system or a spe- cific compartment), node departure rate representing the rate at which nodes leave the area, recovery rate, which determines the time that nodes are actively spread- ing content and the rate of infection, derived via node contact rate. Introducing the stochastic component in our models allows for capturing fine-grained fluctua- tions in the number of nodes per each compartment. This approach can be further exploited, as we show, to establish conditions for content survival via stochastic stability analysis. In particular, since we are dealing with SIR systems described by a set of continuous-time dierential equations, we utilize Lyapunov theory to study the system stability. Chapter 5

Summary of Original Work

In science, if you know what you are doing—you should not be doing it. In engineering, if you do not know what you are doing—you should not be doing it. Of course, you seldom, if ever, see either pure state.

Richard Hamming, You and Your Research

This chapter summarizes the original work presented in this thesis. All papers included in the thesis appear in the same way they were published; minor formatting changes may apply. At the end of the chapter, we list other publications that the author of the thesis has published during the study period and that are not included in this thesis.

Paper A: Revisiting the Modeling of User Association Patterns in a University Wireless Network

Ljubica PajeviÊ, Viktoria Fodor and Gunnar Karlsson Published in Proc. IEEE Wireless Communications and Networking Conference (WCNC), 2018.

Summary: This paper presents an analysis of a large trace of user associations in a university wireless network, which includes around one thousand access points over five campuses. The trace is obtained from RADIUS authentication logs and its merit is in its recency, scale and duration. We propose a methodology for extract- ing association statistics from these logs, and look at visiting time distributions and processes of user arrivals to access points. We find that a large fraction of the network—around half of all access points—experiences time-varying Poisson arrival process, and association distributions can be modeled by two-stage hyper- exponential distributions at most of the access point. While network associations

39 40 Chapter 5. Summary of Original Work in campus wireless networks have been extensively studied in the literature, our study reveals changing patterns in user arrival processes and association durations, which seem to be characteristic for networks of predominantly mobile users, and allows the use of tractable network occupancy models.

The author of this thesis performed the work under the supervision of the second and third authors. The article was written in collaboration with the second author.

Paper B: Predicting the Users’ Next Location from WLAN Mobility Data

Ljubica PajeviÊ, Viktoria Fodor and Gunnar Karlsson Published in Proc. IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN), 2018.

Summary Accurate prediction of user mobility allows the ecient use of resources in our ubiquitously connected environment. In this work we study the predictability of the next location, considering a campus scenario with highly mobile users. We utilize Markov predictors, and estimate the theoretical predictability limits. Based on the mobility traces of nearly 7400 wireless network users, we estimate that the maximum predictability of the users is on average 82%, and we find that the best Markov predictor is accurate in 67% of the time. In addition, we show that mod- erate performance gains can be achieved by leveraging multi-location prediction.

The first two authors formulated the problem addressed in this publication. The au- thor of this thesis implemented the prediction framework and performed the analysis, under the supervision of the second and third authors. The article was written in collaboration with the second author.

Paper C: Epidemic Content Distribution

Ljubica PajeviÊ, Gunnar Karlsson and Ólafur Helgason Published in Proc. ACM International Conference on Modeling, Analysis and Sim- ulation of Wireless and Mobile Systems (MSWiM), 2013.

Summary: Epidemic content dissemination has been proposed as an approach to mitigate frequent link disruptions and support content-centric information dis- semination in opportunistic networks. Stochastic modeling is a common method to evaluate performance of epidemic dissemination schemes. The models introduce assumptions which, on one hand make them analytically tractable, while on the other, ignore attested characteristics of human mobility. In this paper, we inves- tigate the suitability and limitations of an analytical stochastic model for content 41 dissemination by comparison with experimental results obtained from real mobility traces. We conclude that, in the extreme cases of heterogeneous contact patterns, a homogeneous model is unable to capture the performance of content dissemination with respect to content delivery delays, while showing a reasonable fit in the cases of moderate contact heterogeneity.

The author of this thesis formulated and investigated the problem addressed in this publication based on the discussion with the third author. The writing was done by the first author under the supervision of the second and third authors.

Paper D: Modeling Opportunistic Communication with Churn

Ljubica PajeviÊ and Gunnar Karlsson Published in Elsevier Computer Communications, vol. 96, pp. 123–135, 2016.

Summary: In opportunistic networking, characterizing contact patterns between mobile users is essential for assessing feasibility and performance of opportunistic applications. There has been significant eorts in deriving this characterization, based on observations and trace analyses; however, most of the previously estab- lished results were obtained by studying contact opportunities at large spatial and temporal scales. Moreover, the user population is considered to be constant: no user can join or leave the system. Yet, there are many examples of scenarios which do not fully adhere to the previous assumptions and cannot be accurately described at large scales. Urban environments, such as smaller city districts, are character- ized by highly dynamic user populations. We believe that scenarios with varying population require further investigation. In this paper, we present a novel model- ing approach to study operation of opportunistic applications in scenarios where the population size is subjected to frequent changes, that is, it exhibits churn. We examine two location-based content sharing schemes: a purely opportunistic case and an infrastructure-supported content sharing scheme, for which we provide stochastic models based on stochastic dierential equations (SDEs). We validate our models in five scenarios: a city area, subway station, conference, campus, and a scenario with a synthetic mobility model and we show that the models provide good representations of the investigated scenarios.

The author of this thesis formulated the problem in collaboration with the second author. The model was proposed and validated by the first author. The article was written by the first author under the supervision of the second author. 42 Chapter 5. Summary of Original Work

Paper E: Ensuring Persistent Content in Opportunistic Networks via Stochastic Stability Analysis

Ljubica PajeviÊ, Viktoria Fodor and Gunnar Karlsson Published in ACM Transactions on Modeling and Performance Evaluation of Com- puting Systems (ToMPECS). vol. 3, issue 4, article no. 16, 2018.

Summary: The emerging device-to-device communication solutions and the abun- dance of mobile applications and services make opportunistic networking not only a feasible solution, but also an important component of future wireless networks. Specifically, the distribution of locally relevant content could be based on the com- munity of mobile users visiting an area, if long term content survival can be ensured this way. In this paper we establish the conditions of content survival in such op- portunistic networks, considering the user mobility patterns, as well as the time users keep forwarding the content, as the controllable system parameter. We model the content spreading with an epidemic process, and derive a stochas- tic dierential equations based approximation. By means of stability analysis we determine the necessary user contribution to ensure content survival. We show that the required contribution from the users depends significantly on the size of the population, that users need to redistribute content only in a short period within their stay, and that they can decrease their contribution significantly in crowded areas. Hence, with the appropriate control of the system parameters, opportunistic content sharing can be both reliable and sustainable.

The author of this thesis formulated the problem in collaboration with the second and third authors. The theorems for content extinction and content persistence were derived by the first author. The article was written in collaboration with the second author. 43

Publications not included in this thesis

• Viet Trinh Doan, Ljubica PajeviÊ, Vaibhav Bajpai and Jörg Ott, Tracing the path to YouTube – A quantification of path lengths and latencies towards content caches, Accepted for publication in IEEE Communications Magazine. • Jörg Ott, Ljubica PajeviÊ, Ermias Walelgne, Ari Keränen, Esa Hyytiä and Jussi Kangasharju, On the sensitivity of geo-based content sharing to location errors, in Proc. IEEE Conference on Wireless On-demand Network Systems and Services (WONS), 2017. • Sacha Trifunovic, Sylvia Kouyoumdjieva, Bernhard Distl, Ljubica PajeviÊ, Gunnar Karlsson and Bernhard Plattner, A decade of research in opportunis- tic networks: challenges, relevance, and future directions , in IEEE Commu- nications Magazine, 55(1), 2017. • Ljubica PajeviÊ and Gunnar Karlsson, Characterizing opportunistic commu- nication with churn for crowd-counting, in Proc. IEEE World of Wireless Mobile and Multimedia Networks (WoWMoM), Autonomic and Opportunis- tic Communications (AOC) workshop, 2015. • Ljubica PajeviÊ and Gunnar Karlsson, A zero dimensional mobility model for opportunistic networking, in Proc. IEEE World of Wireless Mobile and Multimedia Networks (WoWMoM), Autonomic and Opportunistic Commu- nications (AOC) workshop, 2011. • Ljubica PajeviÊ, Emre A. Yavuz, Ólafur Helgason, Gunnar Karlsson and Sylvia Kouyoumdjieva, Demo: Opportunistic mobile social networking,in Proc. 9th International Conference on Mobile systems, Applications, and Services (MobiSys), 2011. • Ólafur Helgason, Emre Yavuz, Sylvia Kouyoumdjieva, Ljubica PajeviÊ and Gunnar Karlsson, A mobile peer-to-peer system for opportunistic content cen- tric networking, in Proc. 2nd ACM SIGCOMM Workshop on Networking, Systems, and Applications on Mobile Handhelds (MobiHeld), 2010. • Ólafur Helgason, Emre Yavuz, Sylvia Kouyoumdjieva, Ljubica PajeviÊ and Gunnar Karlsson, Demonstrating a mobile peer-to-peer system for oppor- tunistic content-centric networking, demonstration at 2nd ACM SIGCOMM MobiHeld Workshop, 2010.

Chapter 6

Conclusion and Future Work

The scientific man does not aim at an immediate result. His work is like that of the planter—for the future. His duty is to lay the foundation for those who are to come, and point the way.

Nikola Tesla

This thesis investigates the feasibility and performance of content distribution in location-aware opportunistic networks. Carried out by means of either math- ematical analysis or simulations, credible performance evaluation of systems in- corporating mobility as their integral part is impossible without comprehensive understanding and realistic representation of human mobility. Theoretical and em- pirical results presented in this thesis help to fill some research gaps in the spaces of mobility models and of analytic tools for modeling spreading processes. In this chapter, we summarize the main conclusions of this thesis and present some open questions and guidelines for future work.

Mining, Analyzing and Modeling Human Mobility Patterns We collected and analyzed WLAN mobility traces of users in a wireless university network. By considering user mobility from the network perspective, we represented their aggregate mobility with queuing models. Specifically, we considered aggregate mobility as arrivals and visits to sections of the network, in particular to single access points; this method maps network access patterns to physical mobility and associates user movements to areas covered by access points. We found that for a significant number of cases, relatively simple, but time-varying Markovian models can be used to describe access patterns and we also identified the reasons, for example bursty arrivals of users, when more advanced models would be a better fit.

45 46 Chapter 6. Conclusion and Future Work

Having observed the changing access and mobility patterns of the users in this scenario as compared to previously established results extracted from other net- works and older traces—a phenomenon that can be attributed to the proliferation of mobile users—we posed a question how this trend would change the understand- ing of individual mobility. We then sought to predict movement of users and found that the predictability of the users’ next location has decreased, as compared to the results of the related, older studies. For this prediction, we used simple, yet ecient fixed-order Markov predictors which are proven to achieve reasonable accuracy in comparison with the baseline theoretical predictability results. We note that our findings have implications for a variety of use-cases, especially the emerging future mobile networking solutions, where mobility of users, in concert with the mobility of autonomous devices, will be one of the integral parts of the networked systems. To study opportunistic applications with a restricted location scope, we have devised an analytic model for individual mobility and for connectivity patterns of users in a population with churn (open population). We parametrized the model with the data from real-world datasets as well as from synthetic mobility traces, capturing several scenarios in dierent contexts of microscopic mobility, and val- idated, by means od simulations, that the underlying (synthetic) mobility model is able to reproduce the main performance metrics in a close accordance to the trace-based simulations.

Localized Content Distribution: Feasibility and Performance Evaluation We proposed a framework for modeling and evaluating the performance and the feasibility of localized epidemic content dissemination schemes. At the core of the framework is the open population queueing model, for the user arrivals and departures from the system, and the assumption of Poissonain contact processes between the nodes in the system. These assumptions allowed us to model the spread of epidemics (content items) with continuous-time SIR Markov processes, which we further approximated and treated with stochastic dierential equations (SDEs). By defining SDE models of two scenarios: an entirely distributed scheme of content sharing, and the scenario of opportunistic networks with infrastructural nodes, we showed, via simulations, that the framework is flexible enough to capture epidemic processes in these structurally dierent models. The two models however have a common denominator relating to the users’ contribution to the system operation: in both it is assumed that users are willing to contribute only a limited amount of their resources for the benefit of the entire system. Eectively, this is achieved by limiting the time period during which they forward the contents, after they have obtained it themselves. Based on the SDE model, we sought to engineer the content spreading scheme that would ensure that the content persists in ephemeral location-aware applications, relaying entirely on the presence and mobility of users in the area. Given the system model and the specifics of the user population in the considered geofenced areas, e.g., the density of users, their sojourn times and contact 47 patterns, the only system parameter that can be engineered is the forwarding time. Hence, we approach the system design problem with the objective to estimate the minimum forwarding time. We addressed the posed problem by means of stochastic stability analysis based on Lyapunov theory and we found that the requirements for forwarding time were, in the worst case of very sparse node densities, comparable to user sojourn times. In dense environments, on the other hand, users can be required to contribute to the system for only a fraction of the time they spend in the area.

Future Work

It is clear that the information about people’s locations, as well as their macro- scopic movement patterns, has a great potential and impact for transportation, urban planing and more recently, for activity tracking applications and for novel transportation sharing-economy platforms. In that sense, vast amounts of data primarily being collected for commercial use can also be exploited for research pur- poses. An example of this is the Uber Movement [161] initiative, which promises to allow the usage of traces collected from the cities where the company operates. As we have already emphasized, in this thesis we focused mainly on pedestrian mobility, not vehicular, but at larger scales the two cannot be entirely decoupled. Vehicular traces therefore can be exploited for origin-destination modeling, detecting popular locations, modeling occupancy and people flows and so on. Even more interesting and fundamental question one may ask is how are the new means of transportation sharing, let alone new emerging technologies such as autonomous vehicles, going to shape the future of human mobility—the question that must have preoccupied researchers at various times, from the invention of the telephone, to the availability of aordable air transportation to the emergence of social networks. Considering opportunistic device-to-device communication, it yet remains to see how the big stakeholders—mobile network operators and smartphone manufacturers— are going to respond to the ocial release of ProSe services in 3GPP Release 12. While waiting for this conundrum to unravel, and in the light of current aairs regarding exposures of users’ private data including location, we see an umbrella of opportunistic applications as particularly interesting. Namely, the location-aware, privacy-preserving data sharing applications and social networks. Even when the data privacy and protection is not of primary concern, some system deployments can benefit from opportunistic communication. For instance, for the deployment of Internet of Things (IoT) solutions [162], current information-centric based ap- proaches assume wireless coverage through out the environment where the devices will operate. On the one hand, to support extreme densification of IoT devices, centralized solutions may become too costly or dicult to plan and maintain, while on the other hand, such solutions may be also wasteful of networking (bandwidth) resources since most information that is generated by IoT devices is of local interest and likely to stay where it has originated. 48 Chapter 6. Conclusion and Future Work

The modeling framework we proposed can be used as a starting point towards performance evaluation of such deployments, however several questions have to be addressed first. Since we have studied the feasibility of the content distribution driven by mobile users, further investigations on the mobility of users immersed into “smart" environments may reveal new mobility and contact patterns, which need to be incorporated in the model. Subsequent modeling eorts may require more careful analysis of contact patterns and revisiting the assumption of homogeneous mixing, especially since the assumption of the spatial homogeneity may be violated; consequently, this may require adjusting the model to take into account dierent levels of mixing. The follow-up question is how the estimation of scenario-specific system parameters and the computation of the engineered parameters could be implemented in real applications. For this we see two possible solutions. The first solution would entail central entities that would monitor the node density, estimate and compute the desired parameters and would inform other nodes about the requirements. The second solution, in a fully distributed scheme, would have to be implemented by the nodes themselves. To this end, nodes would use local estimates of the system parameters (arrival and contact rate, sojourn time) and, based on the theoretical results that we derived in Paper E, compute the forwarding time. UrbanCount [163] could be one solution to estimate the size of the system and the contact rate in a distributed manner. In addition, practical applications will need to be able to work with the frequently updated estimation of the fluctuating node density and the dynamic adaptation to the changing environment. Thus, future work related to this thesis could be about designing the system with the above functionalities and implementing those in the content sharing applications. Bibliography

[1] United Nations Human Rights Council, The promotion, protection and en- joyment of human rights on the internet. 27 June 2016. [Online] Available: http://digitallibrary.un.org/record/845728/files/A_HRC_32_L-20 -EN.pdf. [Accessed: July 2018]. [2] G. Fodor, D. Erik, G. Mildh, S. Parkvall, N. Reider, G. Miklós, and Z. Turányi, “Design aspects of network assisted device-to-device communica- tions”, IEEE Communications Magazine, vol. 50, no. 3, pp. 170–177, 2012. [3] K. Doppler, M. Rinne, C. Wijting, C. B. Ribeiro, and K. Hugl, “Device- to-device communication as an underlay to LTE-advanced networks”, IEEE Communications Magazine, vol. 47, no. 12, pp. 42–49, 2009. [4] S. Buchegger and A. Datta, “A case for P2P infrastructure for social net- works – opportunities & challenges”, in Proc. IEEE/IFIP Wireless On- Demand Network Systems and Services (WONS), 2009, pp. 161–168. [5] Project Loon, [Online] Available: http://loon.co. [Accessed: July 2018]. [6] Free Basics Platform, [Online] Available: https://developers.facebook. com/docs/internet-org. [Accessed: July 2018]. [7] R. Sen, S. Ahmad, A. Phokeer, Z. A. Farooq, I. A. Qazi, D. Chones, and K. P. Gummadi, “Inside the walled garden: Deconstructing Facebook’s Free Basics program”, SIGCOMM Comput. Commun. Rev., vol. 47, no. 5, pp. 12– 24, 2017. [8] F. Bai, N. Sadagopan, and A. Helmy, “IMPORTANT: A framework to sys- tematically analyze the impact of mobility on performance of routing pro- tocols for adhoc networks”, in IEEE INFOCOM, vol. 2, 2003, pp. 825–835. [9] G. S. Thakur and A. Helmy, “COBRA: A framework for the analysis of realistic mobility models”, in Proc. IEEE INFOCOM, 2013, pp. 3351–3356.

49 50 Bibliography

[10] T. Camp, J. Boleng, and V. Davies, “A survey of mobility models for ad hoc network research”, Wireless Communications and Mobile Computing (WCMC): Special Issue on Mobile Ad Hoc Networking: Research, trends and applications, vol. 2, pp. 483–502, 2002. [11] S. Kurkowski, T. Camp, and M. Colagrosso, “MANET Scalesimulation stud- ies: The Identifyingncredibles”, SIGMOBILE Mobile Computing and Com- munications Review, vol. 9, no. 4, pp. 50–61, 2005. [12] K. Pearson, “The problem of the random walk”, Nature, vol. 72, p. 294, 27, 1905. [13] T. Hägerstrand, “What about people in regional science?”, Papers in Re- gional Science, vol. 24, no. 1, pp. 6–21, 1, 1970. [14] S. Hoogendoorn and P. Bovy, “Pedestrian route-choice and activity schedul- ing theory and models”, Transportation Research Part B: Methodological, vol. 38, no. 2, pp. 169–190, 2004. [15] A. Hanisch, J. Tolujew, K. Richter, and T. Schulze, “Online simulation of pedestrian flow in public buildings”, in Proc. of the Winter Simulation Con- ference., vol. 2, 2003, pp. 1635–1641. [16] O. Helgason, S. Kouyoumdjieva, and G. Karlsson, “Does mobility matter?”, in Proc. IEEE/IFIP Conference on Wireless On-demand Network Systems and Services (WONS), 2010. [17] C. Tuduce and T. Gross, “A mobility model based on WLAN traces and its validation”, in Proc. INFOCOM, vol. 1, 2005, pp. 664–674. [18] C. Bettstetter, “Mobility modeling in wireless networks: Categorization, smooth movement, and border eects”, SIGMOBILE Mobile Computing and Communications Review, vol. 5, no. 3, pp. 55–66, 2001. [19] M. Musolesi and C. Mascolo, “Mobility models for systems evaluation: A survey”, in Middleware for Network Eccentric and Mobile Applications,B. Garbinato, H. Miranda, and L. Rodrigues, Eds. Springer Berlin Heidelberg, 2009, pp. 43–62. [20] N. Aschenbruck, A. Munjal, and T. Camp, “Trace-based mobility modeling for multi-hop wireless networks”, Elsevier Computer Communications, vol. 34, no. 6, pp. 704–714, 2011. [21] D. Karamshuk, C. Boldrini, M. Conti, and A. Passarella, “Human mobility models for opportunistic networks”, IEEE Communications Magazine, vol. 49, no. 12, pp. 157–165, 2011. [22] J. Treurniet, “A taxonomy and survey of microscopic mobility models from the mobile networking domain”, ACM Computing Surveys, vol. 47, no. 1, 14:1–14:32, 2014. Bibliography 51

[23] A. Hess, K. A. Hummel, W. N. Gansterer, and G. Haring, “Data-driven human mobility modeling: A survey and engineering guidance for mobile networking”, ACM Computing Surveys, vol. 48, no. 3, 38:1–38:39, 2015. [24] E. Toch, B. Lerner, E. Ben-Zion, and I. Ben-Gal, “Analyzing large-scale hu- man mobility data: A survey of machine learning methods and applications”, Knowledge and Information Systems, pp. 1–23, [25] F. Ekman, A. Keränen, J. Karvo, and J. Ott, “Working day movement model”, in Proc. ACM SIGMOBILE Workshop on Mobility Models, 2008, pp. 33–40. [26] C. Boldrini and A. Passarella, “HCMM: Modelling spatial and temporal properties of human mobility driven by users’ social relationships”, Elsevier Computer Communications, vol. 33, no. 9, pp. 1056–1074, 2010. [27] S. Kouyoumdjieva, E. A. Yavuz, Ó. Helgason, L. Pajevic, and G. Karls- son, “Opportunistic content-centric networking: The conference case demo”, Demonstration at IEEE INFOCOM, 2011. [28] V. Arnaboldi, M. Conti, and F. Delmastro, “CAMEO: A novel context-aware middleware for opportunistic mobile social networks”, Elsevier Pervasive and Mobile Computing, vol. 11, pp. 148–167, 2014. [29] M. Pitkänen, T. Kärkkäinen, J. Ott, M. Conti, A. Passarella, S. Giordano, D. Puccinelli, F. Legendre, S. Trifunovic, K. Hummel, M. May, N. Hegde, and T. Spyropoulos, “SCAMPI: Service Platform for Social Aware Mobile and Pervasive Computing”, in Proceedings of the SIGCOMM Workshop on Mobile Cloud Computing (MCC), Helsinki, Finland, 2012, pp. 7–12. [30] L. Pajevic and G. Karlsson, “Modeling opportunistic communication with churn”, Elsevier Computer Communications, vol. 96, pp. 123–135, 2016. [31] T. Kärkkäinen, P. Houghton, L. Valerio, A. Passarella, and J. Ott, “Here & Now: Data-centric local social interactions through opportunistic networks: Demo”, in Proc. of ACM Workshop on Challenged Networks (CHANTS) at MobiCom 2016, 2016. [32] E. G. Ravenstein, “The laws of migration”, Journal of the Royal Statistical Society, vol. 52, no. 2, pp. 241–305, 1885. [33] S. Isaacman, R. Becker, R. Cáceres, M. Martonosi, J. Rowland, A. Var- shavsky, and W. Willinger, “Human mobility modeling at metropolitan scales”, in Proc. of the ACM International Conference on Mobile Systems, Applications, and Services (MobiSys), New York, NY, USA: ACM, 2012, pp. 239–252. [34] F. Simini, M. C. González, A. Maritan, and A.-L. Barabási, “A universal model for mobility and migration patterns”, Nature, vol. 484, pp. 96–100, 26, 2012. 52 Bibliography

[35] Q. Zheng, X. Hong, and J. Liu, “An agenda based mobility model”, in 39th Annual Simulation Symposium (ANSS’06), 2006. [36] J. Kim, V. Sridhara, and S. Bohacek, “Realistic mobility simulation of urban mesh networks”, Elsevier Ad Hoc Networks, vol. 7, no. 2, pp. 411–430, 2009. [37] X. Liang, J. Zhao, L. Dong, and K. Xu, “Unraveling the origin of exponential law in intra-urban human mobility”, Scientific Reports, vol. 3, 2013. [38] V. D. Blondel, A. Decuyper, and G. Krings, “A survey of results on mobile phone datasets analysis”, EPJ Data Science, vol. 4, no. 1, p. 10, 5, 2015. [39] Google location services, [Online] Available: https://developer.android. com/training/location/index.html. [Accessed: July 2018]. [40] N. Kiukkonen, B. J., O. Dousse, D. Gatica-Perez, and J. K. Laurila, “To- wards rich mobile phone datasets: Lausanne data collection campaign”, in Proc. ACM Int. Conf. on Pervasive Services, 2010. [41] J. Laurila, D. Gatica-Perez, I. Aad, J. Blom, O. Bornet, T.-M.-T. Do, O. Dousse, J. Eberle, and M. Miettinen, “The mobile data challenge: Big data for mobile computing research”, in Proceedings of the Workshop on the Nokia Mobile Data Challenge, 2012, pp. 1–8. [42] Y. Zheng, X. Xie, and W.-Y. Ma, “Geolife: A collaborative social networking service among user, location and trajectory”, IEEE Data(base) Engineering Bulletin, 2010. [43] A. Stopczynski, V. Sekara, P. Sapiezynski, A. Cuttone, M. M. Madsen, J. E. Larsen, and S. Lehmann, “Measuring large-scale social networks with high resolution”, PLOS ONE, vol. 9, no. 4, pp. 1–24, 2014. [44] I. Rhee, M. Shin, S. Hong, K. Lee, S. J. Kim, and S. Chong, “On the levy- walk nature of human mobility”, IEEE/ACM Trans. Netw., vol. 19, no. 3, pp. 630–643, 2011. [45] V. Vukadinovic and S. Mangold, “Opportunistic wireless communication in theme parks: A study of visitors mobility”, in Proceedings of the 6th ACM Workshop on Challenged Networks (CHANTS), 2011, pp. 3–8. [46] J. Cranshaw, E. Toch, J. Hong, A. Kittur, and N. Sadeh, “Bridging the gap between physical location and online social networks”, in Proceedings of the 12th ACM International Conference on Ubiquitous Computing, 2010, pp. 119–128. [47] J. P. Bagrow, D. Wang, and A.-L. Barabási, “Collective response of human populations to large-scale emergencies”, PLOS ONE, vol. 6, no. 3, pp. 1–8, 2011. [48] X. Lu, E. Wetter, N. Bharti, A. J. Tatem, and L. Bengtsson, “Approaching the limit of predictability in human mobility”, Scientific Reports, vol. 3, 2013. Bibliography 53

[49] J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau, CRAW- DAD trace cambridge/haggle/imote/infocom2006 (v. 2009-05-29), 2009. [50] N. Eagle and A. Pentland, “Reality mining: Sensing complex social systems”, Personal and Ubiquitous Computing, vol. 10, no. 4, pp. 255–268, 1, 2006. [51] L. Sun, K. W. Axhausen, D.-H. Lee, and X. Huang, “Understanding metropoli- tan patterns of daily encounters”, Proceedings of the National Academy of Sciences, vol. 110, no. 34, pp. 13 774–13 779, 2013. [52] E. Yoneki, P. Hui, and J. Crowcroft, “Bio-inspired computing and communi- cation”, in, Berlin, Heidelberg: Springer-Verlag, 2008, ch. Wireless Epidemic Spread in Dynamic Human Networks, pp. 116–132. [53] A.-K. Pietiläinen, E. Oliver, J. LeBrun, G. Varghese, and C. Diot, “Mo- biClique: Middleware for Mobile Social Networking”, in Proceedings of the 2Nd ACM Workshop on Online Social Networks (WOSN), 2009, pp. 49–54. [54] A.-K. Pietiläinen and C. Diot, “Dissemination in opportunistic social net- works: The role of temporal communities”, in Proceedings of the Thirteenth ACM International Symposium on Mobile Ad Hoc Networking and Comput- ing (MobiHoc), 2012, pp. 165–174. [55] C. Petro, CRAWDAD trace hope/amd (v. 2008-08-07), Downloaded from http://crawdad.org/hope/amd/20080807, 2008. [56] A. Panisson, L. Gauvin, A. Barrat, and C. Cattuto, “Fingerprinting tem- poral networks of close-range human proximity”, in Proc. IEEE Interna- tional Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), 2013, pp. 261–266. [57] L. Rossi and M. Musolesi, “It’s the way you check-in: Identifying users in location-based social networks”, in Proceedings of the Second ACM Confer- ence on Online Social Networks (COSN), 2014, pp. 215–226. [58] J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási, “Structure and tie strengths in mobile commu- nication networks”, Proceedings of the National Academy of Sciences, vol. 104, no. 18, pp. 7332–7336, 2007. [59] M. González, C. Hidalgo, and A.-L. Barabási, “Understanding individual human mobility patterns”, 7196, vol. 453, 2008, pp. 779–782. [60] I. Barnett, T. Khanna, and J.-P. Onnela, “Social and spatial clustering of people at humanity’s largest gathering”, PLOS ONE, vol. 11, no. 6, pp. 1– 12, 2016. [61] F. Calabrese, L. Ferrari, and V. D. Blondel, “Urban sensing using mobile phone network data: A survey of research”, ACM Computing Surveys, vol. 47, no. 2, 25:1–25:20, 2014. 54 Bibliography

[62] G. Chen, S. Hoteit, A. C. Viana, M. Fiore, and C. Sarraute, “Enriching sparse mobility information in call detail records”, Computer Communica- tions, vol. 122, pp. 44–58, 2018. [63] CRAWDAD: A community resource for archiving wireless data at dart- mouth. [Online] Available: http://crawdad.cs.dartmouth.edu.[Accessed: July 2018]. 2013. [64] G. Barlacchi, M. De Nadai, R. Larcher, A. Casella, C. Chitic, G. Torrisi, F. Antonelli, A. Vespignani, A. Pentland, and B. Lepri, “A multi-source dataset of urban life in the city of Milan and the Province of Trentino”, Scientific Data, vol. 2, 2015. [65] Y. de Montjoye, Z. Smoreda, R. Trinquart, C. Ziemlicki, and V. D. Blondel, “D4D-Senegal: The second mobile phone data for development challenge”, CoRR, 2014. arXiv: 1407.4885. [66] V. D. Blondel, M. Esch, C. Chan, F. Clérot, P. Deville, E. Huens, F. Morlot, Z. Smoreda, and C. Ziemlicki, “Data for development: The D4D challenge on mobile phone data”, CoRR, 2012. arXiv: 1210.0137. [67] Y.-A. de Montjoye, C. A. Hidalgo, M. Verleysen, and V. D. Blondel, “Unique in the crowd: The privacy bounds of human mobility”, Scientific Reports, vol. 3, 2013. [68] Y.-A. de Montjoye, L. Radaelli, V. K. Singh, and A. ı. Pentland, “Unique in the shopping mall: On the reidentifiability of credit card metadata”, Science, vol. 347, no. 6221, pp. 536–539, 2015. [69] B. Gedik and L. Liu, “Protecting location privacy with personalized k- anonymity: Architecture and algorithms”, IEEE Transactions on Mobile Computing, vol. 7, no. 1, pp. 1–18, 2008. [70] P. Wightman, W. Coronell, D. Jabba, M. Jimeno, and M. Labrador, “Eval- uation of location obfuscation techniques for privacy in location based infor- mation systems”, in 2011 IEEE Third Latin-American Conference on Com- munications, 2011, pp. 1–6. [71] D. Mir, S. Isaacman, R. Caceres, M. Martonosi, and R. Wright, “DP-WHERE: Dierentially private modeling of human mobility”, in 2013 IEEE Interna- tional Conference on Big Data, 2013, pp. 580–588. [72] Eduroam, [Online] Available: https://www.eduroam.org. [73] TCPDUMP, [Online] Available: http://www.tcpdump.org. [74] D. Tang and M. Baker, “Analysis of a local-area wireless network”, in Proc. of the 6th ACM Annual International Conference on Mobile Computing and Networking (MobiCom), 2000. Bibliography 55

[75] A. Balachandran, G. M. Voelker, P. Bahl, and P. V. Rangan, “Characterizing user behavior and network performance in a public wireless LAN”, in Proc. of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2002, pp. 195–205. [76] F. Chinchilla, M. Lindsey, and M. Papadopouli, “Analysis of wireless in- formation locality and association patterns in a campus”, in Proc. IEEE INFOCOM, vol. 2, 2004. [77] D. Kotz and K. Essien, “Analysis of a campus-wide wireless network”, Wire- less Networks, vol. 11, no. 1-2, 2005. [78] T. Henderson, D. Kotz, and I. Abyzov, “The changing usage of a mature campus-wide wireless network”, Elsevier Computer Networks, vol. 52, no. 14, 2008. [79] M. McNett and G. M. Voelker, “Access and mobility of wireless pda users”, SIGMOBILE Mobile Computing and Communications Review, vol. 9, no. 2, pp. 40–55, 2005. [80] X. Chen, R. Jin, K. Suh, B. Wang, and W. Wei, “Network performance of smart mobile handhelds in a university campus WiFi network”, in Proceed- ings of the ACM Internet Measurement Conference (IMC), 2012, pp. 315– 328. [81] A. Allahdadi, R. Morla, A. Aguiar, and J. S. Cardoso, “Predicting short 802.11 sessions from RADIUS usage data”, in IEEE Conference on Local Computer Networks – Workshops, 2013, pp. 1–8. [82] M. Zhou, K. Sui, M. Ma, Y. Zhao, D. Pei, and T. Moscibroda, “Mobicamp: A campus-wide testbed for studying mobile physical activities”, in Proceedings of the 3rd ACM International on Workshop on Physical Analytics (WPA), 2016, pp. 1–6. [83] M. Afanasyev, T. Chen, G. M. Voelker, and A. C. Snoeren, “Analysis of a mixed-use urban WiFi network: When metropolitan becomes neapolitan”, in Proc. 8th ACM SIGCOMM Conference on Internet Measurements (IMC), 2008. [84] ——, “Usage patterns in an urban WiFi network”, IEEE/ACM Transactions on Networking, vol. 18, no. 5, 2010. [85] M. Lenczner and A. G. Hoen, CRAWDAD dataset ilesansfil/wifidog (v. 2015- 11-06), Downloaded from https://crawdad.org/ilesansfil/wifidog/20 151106, 2015. [86] S. Isaacman and M. Martonosi, “Low-infrastructure methods to improve internet access for mobile users in emerging regions”, in Proceedings of the 20th International Conference Companion on World Wide Web (WWW), 2011, pp. 473–482. 56 Bibliography

[87] W. Bao and B. Liang, “Insensitivity of user distribution in multicell net- works under general mobility and session patterns”, IEEE Transactions on Wireless Communications, vol. 12, no. 12, pp. 6244–6254, 2013. [88] S. Scellato, M. Musolesi, C. Mascolo, V. Latora, and A. T. Campbell, “Nextplace: A spatio-temporal prediction framework for pervasive systems”, in Pervasive Computing, Springer Berlin Heidelberg, 2011, pp. 152–169. [89] A. Ghosh, R. Jana, V. Ramaswami, J. Rowland, and N. K. Shankara- narayanan, “Modeling and characterization of large-scale Wi-Fi tracin public hot-spots”, in Proc. IEEE INFOCOM, 2011, pp. 2921–2929. [90] LEGION Studio, [Online] Available: http://www.legion.com/legion- studio. [Accessed: June 2018]. [91] VADERE simulator, [Online] Available: http : / / www . vadere . org.[Ac- cessed: July 2018]. 2017. [92] A. Keränen, J. Ott, and T. Kärkkäinen, “The ONE Simulator for DTN protocol evaluation”, in SIMUTools ’09: Proceedings of the 2nd International Conference on Simulation Tools and Techniques, 2009. [93] R. E. Kahn, S. A. Gronemeyer, J. Burchfiel, and R. C. Kunzelman, “Ad- vances in packet radio technology”, Proceedings of the IEEE Special Issue on Packet Communication Networks, vol. 66, no. 11, pp. 1468–1496, 1978. [94] J. Westcott and G. Lauer, “Hierarchical routing for very large networks”, in Proc. IEEE Military Communications Conference (MILCOM), vol. 2, 1984, pp. 214–218. [95] D. A. Hall, “Tactical Internet system architecture for Task Force XXI”, in Proceedings of the 1996 Tactical Communications Conference. Ensuring Joint Force Superiority in the Information Age, 1996, pp. 219–230. [96] V. Cerf, S. Burleigh, A. Hooke, L. Torgerson, R. Durst, K. Scott, E. Travis, and H. Weiss, Interplanetary internet (IPN): Architectural definition,[On- line] Available: www.tools.ietf.org/html/draft-irtf-ipnrg-arch-00, 2000. [97] K. Fall, “A delay-tolerant network architecture for challenged internets”, in Proceedings of the ACM Conference on Applications, technologies, archi- tectures, and protocols for computer communications (SIGCOMM), 2003, pp. 27–34. [98] M. Grossglauser and D. N. C. Tse, “Mobility increases the capacity of ad hoc wireless networks”, IEEE/ACM Transactions on Networking, vol. 10, no. 4, pp. 477–486, 2002. [99] A. Vahdat and D. Becker, “Epidemic routing for partially-connected ad hoc networks”, Department of Computer Science, Duke University, Tech. Rep., 2000. Bibliography 57

[100] A. Lindgren, A. Doria, and O. Schelén, “Probabilistic routing in inter- mittently connected networks”, ACM SIGMOBILE Mobile Computing and Communications Review, vol. 7, no. 3, pp. 19–20, 2003. [101] T. Spyropoulos, K. Psounis, and C. S. Raghavendra, “Spray and wait: An ecient routing scheme for intermittently connected mobile networks”, in Proceedings of the ACM SIGCOMM Workshop on Delay-tolerant Networking (WDTN), 2005, pp. 252–259. [102] E. M. Daly and M. Haahr, “Social network analysis for routing in discon- nected delay-tolerant manets”, in Proceedings of the 8th ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc), 2007, pp. 32–40. [103] P. Hui, J. Crowcroft, and E. Yoneki, “Bubble rap: Social-based forwarding in delay-tolerant networks”, IEEE Transactions on Mobile Computing, vol. 10, no. 11, pp. 1576–1589, 2011. [104] Y. Cao and Z. Sun, “Routing in delay/disruption tolerant networks: A tax- onomy, survey and challenges”, IEEE Communications Surveys Tutorials, vol. 15, no. 2, pp. 654–677, 2013. [105] G. Karlsson, V. Lenders, and M. May, “Delay-tolerant broadcasting”, in Proc. of the ACM SIGCOMM Workshop on Challenged Networks (CHANTS), 2006, pp. 197–204. [106] V. Lenders, G. Karlsson, and M. May, “Wireless ad hoc podcasting”, in Proc. IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks, 2007, pp. 273–283. [107] PodNet Project: Mobile Distribution of User-generated Content,[Online] Available: http://www.podnet.ethz.ch. [108] Ó. Helgason, E. Yavuz, S. Kouyoumdjieva, L. Pajevic, and G. Karlsson, “A mobile peer-to-peer system for opportunistic content-centric networking”, in Proc. ACM SIGCOMM MobiHeld workshop, New Delhi, India, 2010. [109] Ó. Helgason, S. Kouyoumdjieva, L. Pajevic, E. A. Yavuz, and G. Karlsson, “A middleware for opportunistic content distribution”, Elsevier Computer Networks, vol. 107, pp. 178–193, 2016. [110] Z. Chen, E. A. Yavuz, and G. Karlsson, “What a juke! a collaborative mu- sic sharing system”, in 2012 IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2012, pp. 1–6. [111] I. Marfisi-Schottman, G. Karlsson, and J. Celander Guss, “Opphos - a partic- ipative light and sound show using mobile phones in crowds”, in ExtremCom International Conference, Thórsmörk, Iceland, 2013, pp. 47–48. [112] C. Boldrini, M. Conti, and A. Passarella, “ContentPlace: social-aware data dissemination in opportunistic networks”, in Proceedings of the ACM Inter- national Symposium on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM), 2008, pp. 203–210. 58 Bibliography

[113] T. Spyropoulos, K. Psounis, and C. S. Raghavendra, “Ecient routing in in- termittently connected mobile networks: The single-copy case”, IEEE/ACM Transactions on Networking, vol. 16, no. 1, pp. 63–76, 2008. [114] ——, “Ecient routing in intermittently connected mobile networks: The multiple-copy case”, IEEE/ACM Transactions on Networking, vol. 16, no. 1, pp. 77–90, 2008. [115] S. Zakhary, M. Radenkovic, and A. Benslimane, “Ecient location privacy- aware forwarding in opportunistic mobile networks”, IEEE Transactions on Vehicular Technology, vol. 63, no. 2, pp. 893–906, Feb. 2014. [116] S. Gisdakis, T. Giannetsos, and P. Papadimitratos, “SHIELD: A Data Ver- ification Framework for Participatory Sensing Systems”, in Proceedings of the 8th ACM Conference on Security & Privacy in Wireless and Mobile Networks (WiSec), 2015, 16:1–16:12. [117] A. Wegener, H. Hellbruck, S. Fischer, C. Schmidt, and S. Fekete, “Autocast: An adaptive data dissemination protocol for trac information systems”, in 2007 IEEE 66th Vehicular Technology Conference, 2007, pp. 1947–1951. [118] J. Ott, L. Kärkkäinen, E. A. Walelgne, A. Keränen, E. Hyytiä, and J. Kan- gasharju, “On the sensitivity of geo-based content sharing to location errors”, in Proc. IEEE/IFIP Conference on Wireless On-demand Network Systems and Services (WONS), 2017, pp. 25–32. [119] N. Thompson, R. Crepaldi, and R. Kravets, “Locus: A location-based data overlay for disruption-tolerant networks”, in Proceedings of the ACM Work- shop on Challenged Networks (CHANTS), 2010, pp. 47–54. [120] B. Liu, B. Khorashadi, D. Ghosal, C.-N. Chuah, and M. H. Zhang, “Assess- ing the VANET’s local information storage capability under dierent trac mobility”, in Proc. IEEE INFOCOM, 2010, pp. 1–5. [121] I. Leontiadis and C. Mascolo, “Opportunistic spatio-temporal dissemina- tion system for vehicular networks”, in Proceedings of the 1st International MobiSys Workshop on Mobile Opportunistic Networking (MobiOpp), 2007, pp. 39–46. [122] A. A. Villalba Castro, G. D. M. Serugendo, and D. Konstantas, “Hover- ing information – self-organizing information that finds its own storage”, in Autonomic Communication, Springer, 2009, pp. 111–145. [123] J. Ott, E. Hyytiä, P. Lassila, J. Kangasharju, and S. Santra, “Floating con- tent for probabilistic information sharing”, Pervasive and Mobile Computing, vol. 7, no. 6, pp. 671–689, 2011. [124] E. Hyytiä, J. Virtamo, P. Lassila, J. Kangasharju, and J. Ott, “When does content float? characterizing availability of anchored information in oppor- tunistic content sharing”, in Proc. IEEE INFOCOM, 2011, pp. 3137–3145. Bibliography 59

[125] E. Hyytiä, P. Lassila, J. Ott, and J. Kangasharju, “Floating information with stationary nodes”, in Eighth Workshop on Spatial Stochastic Models for Wireless Networks (SpaSWin), Paderborn, Germany, 2012. [126] S. Ali, G. Rizzo, V. Mancuso, and M. A. Marsan, “Persistence and availabil- ity of floating content in a campus environment”, in Proc. IEEE INFOCOM, 2015, pp. 2326–2334. [127] L. Pajevic and G. Karlsson, “Characterizing opportunistic communication with churn for crowd-counting”, in Proc. IEEE WoWMoM workshop on Au- tonomic and Opportunistic Computing (AOC), 2015, pp. 1–6. [128] E. Hyytiä and J. Ott, “Criticality of large delay tolerant networks via di- rected continuum percolation in space-time”, in Proc. IEEE INFOCOM, 2013, pp. 320–324. [129] J. Su, J. Scott, P. Hui, J. Crowcroft, E. De Lara, C. Diot, A. Goel, M. H. Lim, and E. Upton, “Haggle: Seamless networking for mobile applications”, UbiComp ’07, pp. 391–408, 2007. [130] T. Kärkkäinen, M. Pitkänen, P. Houghton, and J. Ott, “SCAMPI application platform”, in Proc. ACM International Workshop on Challenged Networks (CHANTS), New York, NY, USA, 2012, pp. 83–86. [131] Delay Tolerant Network Research Group (DTNRG), http://irtf.org/ concluded/dtnrgg. [132] T. Kärkkäinen and J. Ott, “Liberouter: Towards autonomous neighborhood networking”, in IEEE/IFIP Conference on Wireless On-demand Network Systems and Services (WONS), 2014, pp. 162–169. [133] M. Nagy, T. Bui, S. Udar, N. Asokan, and J. Ott, “Spotshare and near- bypeople: Applications of the social pal framework”, in Proceedings of the 8th ACM Conference on Security & Privacy in Wireless and Mobile Net- works (WiSec), 2015, 31:1–31:2. [134] F. Delmastro, V. Arnaboldi, and M. Conti, “People-centric computing and communications in smart cities”, IEEE Communications Magazine, vol. 54, no. 7, pp. 122–128, 2016. [135] O. Garden, Firechat, [Online] Available: https://www.opengarden.com/ firechat. [136] T. Hossmann, F. Legendre, P. Carta, P. Gunningberg, and C. Rohner, “Twit- ter in disaster mode: Opportunistic communication and distribution of sen- sor data in emergencies”, in Proceedings of the 3rd Extreme Conference on Communication (ExtremeCom): The Amazon Expedition, 2011, pp. 1–6. [137] S. Grasic and A. Lindgren, “Revisiting a remote village scenario and its dtn routing objective”, Elsevier Computer Communications, vol. 48, pp. 133– 140, 2014. 60 Bibliography

[138] A. Pentland, R. Fletcher, and A. Hasson, “Daknet: Rethinking connectivity in developing nations”, IEEE Computer, vol. 37, no. 1, pp. 78–83, 2004. [139] Z. J. Haas and T. Small, “A new networking model for biological applications of ad hoc sensor networks”, IEEE/ACM Transactions on Networking, vol. 14, no. 1, pp. 27–40, 2006. [140] P. Juang, H. Oki, Y. Wang, M. Martonosi, L. S. Peh, and D. Rubenstein, “Energy-ecient computing for wildlife tracking: Design tradeos and early experiences with zebranet”, SIGARCH Comput. Archit. News, vol. 30, no. 5, pp. 96–107, 2002. [141] G. Werner-Allen, K. Lorincz, M. Ruiz, O. Marcillo, J. Johnson, J. Lees, and M. Welsh, “Deploying a wireless sensor network on an active volcano”, IEEE Internet Computing, vol. 10, no. 2, pp. 18–25, 2006. [142] T. Spyropoulos, T. Turletti, and K. Obraczka, “Routing in delay-tolerant networks comprising heterogeneous node populations”, IEEE Trans. on Mo- bile Computing, vol. 8, no. 8, pp. 1132–1147, 2009. [143] A. Picu, T. Spyropoulos, and T. Hossmann, “An analysis of the information spreading delay in heterogeneous mobility DTNs”, in Proc. IEEE WoW- MoM, 2012, pp. 1–10. [144] Y. Kim, K. Lee, N. B. Shro, and I. Rhee, “Providing probabilistic guaran- tees on the time of information spread in opportunistic networks”, in Proc. IEEE INFOCOM, 2013, pp. 2067–2075. [145] C. Boldrini, M. Conti, and A. Passarella, “Performance modelling of oppor- tunistic forwarding under heterogenous mobility”, Elsevier Computer Com- munications, vol. 48, pp. 56–70, 2014. [146] A. Chaintreau, J.-Y. Le Boudec, and N. Ristanovic, “The age of gossip: Spa- tial mean field regime”, in Proceedings of the Eleventh International Joint Conference on Measurement and Modeling of Computer Systems (SIGMET- RICS), 2009, pp. 109–120. [147] T. Kärkkäinen and J. Ott, “Shared content editing in opportunistic net- works”, in Proceedings of the 9th ACM MobiCom Workshop on Challenged Networks (CHANTS), 2014, pp. 61–64. [148] V. Sciancalepore, D. Giustiniano, A. Banchs, and A. Hossmann-Picu, “Of- floading cellular trac through opportunistic communications: Analysis and optimization”, IEEE Journal on Selected Areas in Communications, vol. 34, no. 1, pp. 122–137, 2016. [149] R. M. Anderson and R. M. May, Infectious Diseases of Humans: Dynamics and Control, ser. Dynamics and Control. Oxford University Press, Oxford, 1992. [150] M. J. Keeling and P. Rohani, Modeling infectious diseases in humans and animals. Princeton University Press, 2011. Bibliography 61

[151] F. Ball, D. Mollison, and G. Scalia-Tomba, “Epidemics with two levels of mixing”, The Annals of Applied Probability, vol. 7, no. 1, pp. 46–89, 1997. [152] N. G. van Kampen, Stochastic Processes in Physics and Chemistry (Third Edition), First Edition, ser. North-Holland Personal Library. Amsterdam, 1981. [153] D. T. Gillespie, “The chemical Langevin equation”, The Journal of Chemical Physics, vol. 113, no. 1, pp. 297–306, 2000. [154] M. Newman, Networks: An Introduction. New York, NY, USA.: Oxford Uni- versity Press Inc., 2010. [155] R. Pastor-Satorras and A. Vespignani, “Epidemic dynamics and endemic states in complex networks”, Physical Review E, vol. 63, no. 6, 2001. [156] P. Van Mieghem, J. Omic, and R. Kooij, “Virus spread in networks”, IEEE/ ACM Transactions on Networking, vol. 17, no. 1, pp. 1–14, 2009. [157] M. Boguñá and R. Pastor-Satorras, “Epidemic spreading in correlated com- plex networks”, Physical Review E, vol. 66, no. 4, 2002. [158] H. S. Wilf, Generatingfunctionology. Natick, MA, USA: A. K. Peters, Ltd., 2006. [159] E. Volz and L. A. Meyers, “Epidemic thresholds in dynamic contact net- works”, Journal of The Royal Society Interface, vol. 6, no. 32, pp. 233–241, 2009. [160] O. Michail and P. G. Spirakis, “Elements of the theory of dynamic networks”, Communications of the ACM, vol. 61, no. 2, pp. 72–81, 2018. [161] Uber movement, [Online] Available: https://movement.uber.com. [162] E. Borgia, R. Bruno, and A. Passarella, “Making opportunistic networks in IoT environments CCN-ready: A performance evaluation of the MobCCN protocol ”, Elsevier Computer Communications, vol. 123, pp. 81–96, 2018. [163] P. Danielis, S. Kouyoumdjieva, and G. Karlsson, “UrbanCount: Mobile crowd counting in urban environments”, in Proceedings IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), 2017, pp. 640–648.