Towards Context Aware Opportunistic Forwarding in Social Pervasive Systems

By Soumaia Ahmed Al Ayyat

Department of Computer Science

The American University in Cairo

Thesis Dissertation In Partial Fulfillment of the Requirements of Doctor of Philosophy in Applied Sciences

Thesis Advisors: Dr. Sherif G. Aly and Dr. Khaled A. Harras

September 2016

Acknowledgement

Praise be to Allah the Almighty all the time and everywhere. I would like to acknowledge a lot of those who gave me support and encouragement to complete the mission successfully. First of all, I have to pay deep gratitude to Mr. Youssef Jameel who offered me his generous PhD fellowship sponsorship without which I would not have been able to accomplish this significant milestone in my academic path.

Next, I would like to bow down to my parents who endorsed me and believed in my intellectual capabilities since my early years. They always pushed me to continue, to gain more knowledge, and to acquire a higher level of education. I owe them all the success I have reached so far, and always will.

May God bless them and reward them both in life and in heaven.

Furthermore, I definitely express my sincere gratitude to my advisors for their close and continuous support and precious advice in every step of my research. They have taught me morals in addition to academic research skills. No words can reward them for their great deeds.

I would also like to extend my warm appreciation to my husband and sons for bearing with me all this duration, and for understanding my ups and downs in mood, and for encouraging me to continue till the end. Not to mention my deep sense of obligation towards my dear sister for her continuous support and time in discussing my work and reviewing the conference papers I submitted. I wish her all the success and happiness in her career and studies.

I am also quite thankful to my dearest professor, director and friend Dr. Mona Kaddah for her continuous support and push to accomplish the mission at its best level. May God reward her for all her good deeds.

On another aspect, I would like to acknowledge Dr. Hasan Zaki, Dr. Ramadan and Eng. Amr AbdelLatif from the Social Research Center for their time in providing me with advice on the statistical methods that suit my research work.

Also, many thanks to Dr. Ali Hadi and Dr. Noha Youssef from the Mathematics Department for their time in providing me with advice on the statistical methods that suit my research work. Special thanks to Dr. Hatem El Ayat who dedicated several consultation sessions to discuss the case of my statistical analysis and generously provided me with much advice.

Last but not least, I would like to pay gratitude to Dr. Fikry Botros from the Writing Center for his time and support in revising the style and linguistics of my journal paper. I also highly acknowledge Dr. Iman Hamam’s long dedicated time to support me in revising the linguistics of this dissertation. She has been very supportive in enriching me with positive energy and self confidence. She is a blessing.

Finally, I need to mention an endless list of people who supported me emotionally and professionally, and encouraged me to continue, giving me self confidence and believing in my capabilities. I am grateful to every one of them and whatever I say will not pay them their deserved reward. I need to specifically mention Dr. Hassanein Amer and Dr. Awad Khalil who have always been encouraging me to start my PhD studies. I owe them both a lot. May God reward them for their deeds.

Abstract

Towards Context Aware Opportunistic Forwarding in Social Pervasive Systems

By

Soumaia Ahmed Al Ayyat

Thesis Advisors: Dr. Sherif G. Aly and Dr. Khaled A. Harras

Recent advances in mobile device sensor technology, coupled with a wealth of structured and accessible data from social networks, have together formed a data-rich ecosystem. Such an ecosystem is rich in bi-directional context that can flow between the mobile and social worlds in order to promote the creation of an elite breed of pervasive services and applications. We label the breed resulting from the merger as Social Pervasive Systems (SPS). We review the literature of the domains of social networks and mobile pervasive systems to study prior research attempts to merge both domains, as detailed in Chapter 2. We begin by presenting our observations in a timeline that illustrates the progress of the merger attempts. From this study, we are able to identify a collection of services and application families that can arise as a byproduct of the merger. We also identify a set of challenges that deter the formation of systems of this kind and propose solutions for them.

Although internet access is pervasive and ubiquitous in the developed countries, it is scarce in the developing and the undeveloped economies. With the current setup in the developing countries, where users own smart devices and demand access to the internet but suffer from poor network infrastructure, there arises a need for alternative network connectivity such as delay tolerant networks (DTNs) and opportunistic networks. Alternative technologies have been used to compensate for the scarceness of the network infrastructure and the network disconnection. In this research, we focus on a subset of the SPS applications; namely, the social-based opportunistic forwarding algorithms that are highly recommended in areas where challenged network infrastructure coincides with pervasive mobile usage and high demand for internet access and connectivity. We focus on the challenges facing such algorithms and the drawbacks in performance as they relate to efficiency, effectiveness, power awareness, and utilization fairness. From there, we propose and experiment with solutions to improve the performance of opportunistic forwarding algorithms that are much needed in environments which lack network infrastructure or those that are vulnerable to frequent disruptions. These solutions employ bi-directional context from the mobile and social worlds pertaining to user mobility, social interest, power awareness, and contact durations.

Four major contributions are proposed in this research. The first and second contributions demonstrate an improvement over existing popular opportunistic forwarding algorithms, such as the PeopleRank algorithm, the Socialcast algorithm, and the Sensor Context-Aware Routing protocol (SCAR), by integrating interest awareness and power awareness into these algorithms. We propose the PI-SOFA framework for integrating interest and power awareness into social-aware opportunistic forwarding algorithms as detailed in Chapter 3; the implemented versions of the PI-SOFA integration are described in detail in Chapter 5. We question the accuracy of Space Syntax metrics in defining the attraction points in a given urban area and argue that this negatively affects the performance and the accuracy of forwarding decisions. This is the third proposed contribution, which is presented in Section 3.2; its proposed implemented versions are described in detail in Chapter 6. The fourth proposed contribution is dynamic adaptive ranking, which dynamically changes the weight of the factors controlling the node’s rank based on the current context. Details of the dynamic adaptive ranking are illustrated in Section 3.3, and its implemented versions are described in Chapter 7.

All our contributions are empirically evaluated via our proposed simulator SAROS, our fifth contribution, which is presented in detail in Chapter 4. Throughout our research, the simulations conducted with SAROS utilize imported datasets that include both realistic and synthesized mobility traces, social profiles, social relationships, power consumption models, as well as data that are generated by the simulator itself. Detailed description of the used or generated datasets is presented in Section 4.2. The evaluation metrics that are used in the conducted experiments, along with the utilized scientific methodology are also provided. Finally, statistical analysis is conducted to produce the recommended regression model of the six main performance metrics of the dynamic adaptive ranking approach which is detailed in Section 7.7.

Contents

Acknowledgement ii

Abstract iii

1 INTRODUCTION 3

1.1 Vision ...... 3

1.2 Scope and Domain ...... 5

1.2.1 Social-based Opportunistic Forwarding Approaches ...... 6

1.3 Challenges ...... 7

1.4 Contributions ...... 11

1.4.1 Integrating Interest Awareness with Social-aware Opportunistic Forwarding Algorithms ...... 11

1.4.2 Leveraging Power Awareness in Social-aware Opportunistic Forwarding Algorithms 13

1.4.3 Space Syntax-based Forwarding Approaches ...... 14

1.4.4 Dynamic Adaptive Ranking ...... 15

1.5 Roadmap ...... 16

1.6 Our Published Work ...... 17

2 Background 19

2.1 Context-Aware Systems and Social Networks ...... 20

2.1.1 Context-Aware Systems ...... 21

2.1.2 Social Networks ...... 25

2.2 Social Pervasive Systems ...... 29

2.2.1 Enabling Technology ...... 29

2.2.2 Evolution of SPS ...... 30

2.2.3 Definition and Features of Social Pervasive Systems ...... 32

2.2.4 Applications ...... 33

2.2.5 Challenges ...... 36

2.3 Opportunistic Networks ...... 38

2.4 Social-Aware Opportunistic Forwarding Algorithms ...... 39

2.4.1 Power-oblivious, Social-Aware Opportunistic Forwarding Algorithms ...... 39

2.4.2 Social-oblivious, Power-Aware, and Energy-efficient Routing Algorithms ...... 40

2.4.3 Social-oblivious, Power and Context-Aware Opportunistic Forwarding Algorithms 41

2.5 Space-Syntax-based Forwarding Algorithms ...... 42

2.5.1 What is Space Syntax? ...... 42

2.5.2 Related Work ...... 42

2.6 Proposed Solutions ...... 45

2.7 Conclusion ...... 46

3 Proposed Frameworks 47

3.1 The PI-SOFA Framework ...... 48

3.1.1 Interest Awareness Integration ...... 49

3.1.2 Power Awareness Integration ...... 51

3.1.3 Threshold-based Opportunistic Selection Integration ...... 54

3.2 Space Syntax Framework ...... 55

3.2.1 Space Syntax Metrics ...... 55

3.2.2 Problem Definition ...... 55

3.3 Dynamic Adaptive Ranking ...... 58

3.3.1 Dynamic Adaptive Ranking Cognitive Map ...... 62

4 The SAROS Simulator 67

4.1 Related Work ...... 69

4.2 Modular Architecture ...... 69

4.2.1 Simulation of Interest Distributions ...... 70

4.2.2 Realistic Power Consumption Modeling ...... 71

4.2.3 Mobility Simulation Module ...... 75

4.2.4 Implemented Algorithms ...... 82

4.2.5 Evaluation Module ...... 87

4.3 The Simulator Interface and Usability ...... 90

4.3.1 The SAROS Interface ...... 90

4.3.2 Output Graphs and Exported Data ...... 91

4.4 The Simulator Verification and Validation ...... 93

4.5 Conclusion and Future Work ...... 94

5 PI-SOFA Proposed Implementations 98

5.1 PI-SOFA Integration with PeopleRank ...... 99

5.1.1 IPeR: Interest-Aware PeopleRank ...... 100

5.1.2 PIPeR: Power-and-Interest-Aware PeopleRank ...... 101

5.1.3 PIPeROp: Threshold Opportunistic PIPeR ...... 104

5.1.4 PIPeRDep: Depletion-Rate-Aware PIPeR ...... 105

5.1.5 Integrating Contact Duration Awareness with PIPeR ...... 107

5.2 PI-SOFA Integration with SocialCast ...... 107

5.2.1 ISCast: Interest-Aware SocialCast ...... 108

5.2.2 PISCast: Power-and-Interest-Aware SocialCast ...... 109

5.2.3 PISCastOp: Threshold-Opportunistic PISCast ...... 109

5.2.4 PISCastDep: Depletion-Rate-Aware PISCast ...... 110

5.2.5 PISCastOpDep: Depletion-Rate-Aware PISCastOp ...... 111

5.3 PI-SOFA Integration with SCAR ...... 111

5.3.1 ISCAR: Interest-Aware SCAR ...... 112

5.3.2 PISCAR: Power-and-Interest-Aware SCAR ...... 113

5.3.3 PISCAROp: Threshold-Opportunistic PISCAR ...... 113

5.3.4 PISCARDep: Depletion-Rate-Aware PISCAR ...... 113

5.3.5 PISCAROpDep: Depletion-Rate-Aware PISCAROp ...... 114

5.4 Evaluation ...... 114

5.4.1 Simulation Assumptions ...... 114

5.4.2 Methodology ...... 115

5.4.3 Simulation Environment Parameters ...... 116

5.4.4 Simulation Metrics ...... 118

5.5 Results ...... 119

5.5.1 Interest Awareness ...... 120

5.5.2 Power Awareness ...... 122

5.5.3 Contact Duration Expectation ...... 126

5.5.4 Normalized Performance Indices ...... 127

5.6 Analysis ...... 130

5.6.1 Simulation-based Analysis ...... 130

5.6.2 Statistical Analysis ...... 131

5.6.3 Conclusion ...... 137

6 Space Syntax Forwarding Proposed Implementations 141

6.1 Space-Syntax-based Forwarding versus Real Traces based Forwarding ...... 141

6.2 Proposed Solutions ...... 142

6.2.1 MostPopAP ...... 143

6.2.2 MostFreqAP ...... 144

6.2.3 MostdurationAP ...... 144

6.2.4 CloseAPs ...... 145

6.2.5 WeightedCloseAPs ...... 145

6.2.6 WeightedCloseAPsPerX ...... 146

6.3 Evaluation Metrics ...... 146

6.4 Methodology ...... 147

6.4.1 Extract Streets and Hot Spots Coordinates ...... 147

6.4.2 Construct the Axial Maps and the Space Syntax Connectivity Graph ...... 147

6.4.3 Calculate Space Syntax Metrics ...... 147

6.5 Results and Analysis ...... 148

6.5.1 Interest Awareness ...... 148

6.5.2 Power Awareness ...... 150

6.5.3 Normalized Performance Indices ...... 152

6.5.4 Analysis ...... 153

7 Dynamic Adaptation Proposed Implementations 156

7.1 DynAdp ...... 156

7.2 DynAdpOp ...... 159

7.3 DynAdpSpSyn ...... 160

7.4 DynAdpOpSpSyn ...... 161

7.5 Evaluation Metrics ...... 161

7.6 Results ...... 161

7.6.1 Interest Awareness ...... 161

7.6.2 Power Awareness ...... 163

7.6.3 Normalized Performance Indices ...... 164

7.7 Statistical Analysis ...... 165

7.7.1 Statistical Significance of the Dynamic Adaptive Algorithm ...... 165

7.7.2 Regression Analysis of the Performance Metrics of the Dynamic Adaptive Ranking Algorithm ...... 170

8 Discussion and Conclusion 181

8.1 Effectiveness ...... 184

8.2 Efficiency ...... 185

8.3 Power Consumption Awareness ...... 188

8.4 Fairness ...... 189

8.5 Normalized Performance Indices ...... 189

8.5.1 Effectiveness across various distributions ...... 190

8.5.2 Efficiency across various distributions ...... 190

8.5.3 Power Awareness across various distributions ...... 191

8.6 Conclusion ...... 191

9 Future Work 196

9.1 Towards Fair Social-Aware Media Delivery with Quality of Experience Threshold in Opportunistic Networks ...... 197

9.1.1 Motivation ...... 198

9.1.2 Problem Definition ...... 199

9.1.3 Proposed Solution ...... 200

9.1.4 Methodology ...... 201

9.1.5 Related Work ...... 202

9.1.6 Evaluation ...... 204

9.2 Social Recommender Systems for Academic Social Networks ...... 204

9.2.1 Motivation ...... 206

9.2.2 Problem Definition ...... 207

9.2.3 Proposed Solution ...... 207

9.2.4 Methodology ...... 208

9.2.5 Related Work ...... 208

9.2.6 Evaluation ...... 210

9.3 Conclusion ...... 213

Appendices 214

A The Full Set of Simulation Results of PIPeR 215

A.1 SLAW Mobility Set ...... 215

A.2 Mobility Traces of SIGCOMM09 Dataset ...... 221

B The Full Set of Simulation Results of Dynamic Adaption 227

B.1 Interest Environments ...... 227

B.1.1 Normal Interest Distribution ...... 227

B.1.2 Two Distinct Interest Groups ...... 228

B.2 Battery Environments ...... 233

B.2.1 Normal Battery Distribution ...... 233

B.2.2 Heatmap Battery Distribution ...... 235

C The SPSS and R commands for Data Manipulation and Analysis 239

C.1 SPSS Commands to compute Factor Variables for the Parameters of the 50 Nodes ...... 239

C.2 The R commands to construct the Scatter Matrix with the Correlation Coefficients ...... 242

C.3 SPSS Commands for Multiple Linear Regression ...... 243

D IRB Approval Letter 244

E Consent for Participation in Research Form 246

F AUC Traces File Description 249

F.1 Record Description of the AUC Traces File Content ...... 249

G Certificates of Participation and Awards 250

H The SAROS Simulator Main Functions 254

H.1 SLAWSim ...... 254

H.2 EditINFOCOM ...... 256

H.3 EditSIGCOMM ...... 257

H.4 MallTracesSim ...... 258

H.5 StAndrewsTraces ...... 258

H.6 SpaceSyntax ...... 258

List of Figures

1.1 Global Mobile Data Traffic from 2009 till 2020 (in Exabytes per month) ...... 4

1.2 Challenges and Proposed Solutions for a SPS Application ...... 8

2.1 The Evolution of both Social Networks and Context-Aware Systems from the 1980s to 2006 21

2.2 The Merger between Social Networks and Context-Awareness since 2007 ...... 24

2.3 Trend of Research in Social Networks, Pervasive Computing and Joint Research since 1975 31

2.4 Application Families in Social Pervasive Systems ...... 33

2.5 Challenges and Proposed Solutions for Social Pervasive Systems ...... 36

3.1 Flow Diagram of the Interest Awareness Integration ...... 50

3.2 Flow Diagram of the Interest and Power Awareness Integration ...... 52

3.3 Flow Diagram of the Space Syntax Framework ...... 59

3.4 Factors determining the Activeness Component ...... 63

3.5 Factors determining the Opportunistic Selection Component ...... 63

3.6 Causal Relationship of the Location-based Popularity Component ...... 64

3.7 Preliminary Factors determining the Interest Awareness Factor ...... 64

3.8 Preliminary Factors determining the Power Awareness Factor ...... 65

3.9 Causal Relationship of the Social Popularity Component ...... 65

3.10 Cognitive Map of the Adaptive Ranking Function Parameters ...... 66

4.1 SAROS Simulator Modular Architecture ...... 70

4.2 Average Remaining Battery Capacity per hour [1] ...... 72

4.3 The Stochastic Modified KiBaM Model [2] ...... 73

4.4 User Density Heatmap extracted from the AUC Access Traces ...... 80

4.5 Space Syntax Representation of the AUC map ...... 81

4.6 4-category Battery Consumption over Time ...... 88

4.7 Main Form Menu ...... 91

4.8 StartSim Menu ...... 92

4.9 Cost versus Delivery Ratio ...... 94

5.1 Flow Diagram of the Interest-Aware PeopleRank Algorithm ...... 102

5.2 Flow Diagram of the Interest-and-Power-Aware PeopleRank Algorithm ...... 103

5.3 Cost and Delivery Time of SCAR versions ...... 118

5.4 Effectiveness of SCAR versions ...... 119

5.5 Power Consumption of SCAR versions ...... 119

5.6 Fairness of SCAR versions ...... 120

5.7 Effectiveness ...... 120

5.8 Efficiency ...... 121

5.9 Power Consumption and Fairness ...... 122

5.10 Number of Produced Control Messages ...... 123

5.11 Percent of Consumed Power for the forwarded Control Messages ...... 124

5.12 Cost of Depletion-Rate-Aware PIPeR ...... 125

5.13 Delivery Ratio and Effectiveness of Depletion-Rate-Aware PIPeR ...... 125

5.14 Power Consumption of Depletion-Rate-Aware PIPeR ...... 126

5.15 Fairness of Depletion-Rate-Aware PIPeR ...... 127

5.16 Cost and Delivery Ratio of Depletion-Rate-Aware PIPeR - Heatmap Distribution ...... 128

5.17 Effectiveness of Depletion-Rate-Aware PIPeR - Heatmap Distribution ...... 129

5.18 Power Consumption of Depletion-Rate-Aware PIPeR - Heatmap Distribution ...... 129

5.19 Fairness of Depletion-Rate-Aware PIPeR - Heatmap Distribution ...... 130

5.20 Efficiency - Depletion Rate Awareness ...... 131

5.21 Effectiveness - Depletion Rate Awareness ...... 131

5.22 Power Consumption and Fairness - Depletion Rate Awareness ...... 132

5.23 Effectiveness Performance Index ...... 133

5.24 Efficiency Performance Index ...... 135

5.25 Power Awareness Performance Index ...... 137

5.26 The 8-Metric Analysis of PeopleRank Versions ...... 139

5.27 The 8-Metric Analysis of SCAR Versions ...... 139

5.28 The 8-Metric Analysis of SocialCast Versions ...... 140

6.1 Effectiveness of Space Syntax-based Forwarding Algorithms ...... 149

6.2 Efficiency of Space Syntax-based Forwarding Algorithms ...... 150

6.3 Efficiency of Space Syntax-based Forwarding Algorithms - Cont. ...... 151

6.4 Power Consumption of Space Syntax Forwarding ...... 152

6.5 Utilization Fairness of Space Syntax Forwarding ...... 153

6.6 Effectiveness Performance Index of Space Syntax Forwarding ...... 153

6.7 Efficiency Performance Index of Space Syntax Forwarding ...... 154

6.8 Power Awareness Performance Index of Space Syntax Forwarding ...... 154

7.1 Effectiveness of the Dynamic Adaptive Algorithms - AUC Dataset ...... 161

7.2 Efficiency of the Dynamic Adaptive Algorithms - AUC Dataset ...... 162

7.3 Power Consumption of the Dynamic Adaptive Algorithms - AUC Dataset ...... 163

7.4 Utilization Fairness of the Dynamic Adaptive Algorithms - AUC Dataset ...... 163

7.5 Effectiveness Performance Index of the Dynamic Adaptive Algorithms - AUC Dataset ...... 164

7.6 Efficiency Performance Index of the Dynamic Adaptive Algorithms - AUC Dataset ...... 164

7.7 Power Awareness Performance Index of the Dynamic Adaptive Algorithms - AUC Dataset ...... 165

7.8 Number of Produced Control Messages of the Dynamic Adaptive Algorithms - AUC Dataset ...... 165

7.9 Percent of Consumed Power for the forwarded Control Messages of the Dynamic Adaptive Algorithms - AUC Dataset ...... 166

7.10 Scatter Matrix with Multi Correlation Coefficients for DeliveryRatio ...... 172

7.11 Scatter Matrix with Multi Correlation Coefficients for IntFWDRatio ...... 173

7.12 Scatter Matrix with Multi Correlation Coefficients for UnIntFWDRatio ...... 174

7.13 Scatter Matrix with Multi Correlation Coefficients for Cost ...... 176

7.14 Scatter Matrix with Multi Correlation Coefficients for ConsumedPower ...... 177

7.15 Scatter Matrix with Multi Correlation Coefficients for Fmeasure ...... 178

7.16 Scatter Matrix with Multi Correlation Coefficients for Fairness ...... 179

7.17 Scatter Matrix with Multi Correlation Coefficients among Dependent Variables ...... 180

8.1 Performance Comparison among Representative algorithms of all the Studied Categories of Opportunistic Forwarding Algorithms ...... 183

8.2 Interest-based Effectiveness of All Proposed Algorithms - AUC Dataset ...... 184

8.3 F-measure of All Proposed Algorithms - AUC Dataset ...... 185

8.4 Cost of All Proposed Algorithms - AUC Dataset ...... 185

8.5 Cost per Unit Delivery Ratio of All Proposed Algorithms - AUC Dataset ...... 186

8.6 Cost per Unit Delivery Ratio and Interested Forwarders Ratio of All Proposed Algorithms - AUC Dataset ...... 187

8.7 Delay of All Proposed Algorithms - AUC Dataset ...... 187

8.8 Power Consumption of All Proposed Algorithms - AUC Dataset ...... 188

8.9 Power Consumption per unit Delivery Ratio of All Proposed Algorithms - AUC Dataset ...... 189

8.10 Power Consumption per unit Delivery Ratio and Interested Forwarder Ratio of All Proposed Algorithms - AUC Dataset ...... 189

8.11 Utilization Fairness of All Proposed Algorithms - AUC Dataset ...... 190

8.12 Effectiveness Performance Index - AUC Dataset ...... 193

8.13 Efficiency Performance Index - AUC Dataset ...... 194

8.14 Power Awareness Performance Index - AUC Dataset ...... 195

9.1 Collective Set of Challenges and Proposed Solutions for Two SPSs ...... 197

A.1 Power versus Delivery Ratio ...... 215

A.2 Cost versus Delivery Ratio ...... 216

A.3 Cost over time ...... 216

A.4 Delivery ratio over Time ...... 217

A.5 Categories of Consumed Battery ...... 217

A.6 Final Mean Battery and Standard Deviation ...... 217

A.7 Variance over Time ...... 218

A.8 Power Consumption over Time in Opportunistic fixed threshold ...... 218

A.9 Power Consumption over Time in w/o Opportunistic fixed threshold ...... 219

A.10 Power Consumption over Time in Opportunistic Adaptive threshold ...... 219

A.11 Power Consumption over Time in w/o Opportunistic Adaptive threshold ...... 219

A.12 Effectiveness: Interest-based node classification ...... 220

A.13 Effectiveness: Recall/Precision/Accuracy ...... 220

A.14 Power Consumption over time ...... 220

A.15 7-Metrics Space in Full Battery Distribution ...... 221

A.16 Power versus Delivery Ratio ...... 221

A.17 Cost versus delivery ratio ...... 222

A.18 Cost over time ...... 222

A.19 Delivery ratio over Time ...... 222

A.20 Categories of Consumed Battery ...... 223

A.21 Final Mean Battery and SD ...... 223

A.22 Variance over Time ...... 224

A.23 Power Consumption over Time in Opportunistic fixed threshold ...... 224

A.24 Power Consumption over Time without Opportunistic fixed threshold ...... 224

A.25 Power Consumption over Time in Opportunistic Adaptive threshold ...... 225

A.26 Power Consumption over Time without Opportunistic Adaptive threshold ...... 225

A.27 Effectiveness: Interest-based node classification ...... 225

A.28 Effectiveness: Recall/Precision/Accuracy ...... 226

A.29 Power Consumption over time ...... 226

B.1 Effectiveness of Normal Interest Environment - AUC Traces ...... 228

B.2 Efficiency of Normal Interest Environment - AUC Traces ...... 229

B.3 Power Consumption of Normal Interest Environment - AUC Traces ...... 230

B.4 Effectiveness of 2-Interest Groups Environment - AUC Traces ...... 230

B.5 Efficiency of 2-Interest Groups Environment - AUC Traces ...... 231

B.6 Power Consumption of 2-Interest Groups Environment - AUC Traces ...... 232

B.7 Effectiveness of Normal Battery Environment - AUC Traces ...... 233

B.8 Efficiency of Normal Battery Environment - AUC Traces ...... 234

B.9 Power Consumption of Normal Battery Environment - AUC Traces ...... 235

B.10 Effectiveness of Heatmap Battery Environment - AUC Traces ...... 236

B.11 Efficiency of Heatmap Battery Environment - AUC Traces ...... 237

B.12 Power Consumption of Heatmap Battery Environment - AUC Traces ...... 238

List of Tables

1 List of abbreviations mentioned in this dissertation ...... 2

3.1 Comparison Matrix of Space Syntax-based Forwarding Related Work ...... 56

3.2 Mapping of the Basic Space Syntax Metrics and the Graph Theory terms [3] ...... 57

3.3 Pearson and Spearman Correlations between the Popularity Rank based on the Basic Space Syntax metrics and the Real Traces Frequency of association ...... 58

4.1 Parameters of the KiBaM model ...... 73

4.2 Possible Transitions in the idle slot ...... 74

4.3 Usage Profile Time Consumption and Battery Life [4] ...... 74

4.4 Power Consumption of Various Activities of the Usage Profiles [4] ...... 75

4.5 Four Mobile Brands Power Consumption Values [5] ...... 76

4.6 Parameters of the SLAW Trace Generator ...... 77

4.7 Parameters of INFOCOM Dataset ...... 78

4.8 Mall Environment Dataset Parameters ...... 79

4.9 Grouping of the Participants by Major in the AUC Mobility Traces ...... 80

4.10 Parameters of the Main Form Menu ...... 96

4.11 Parameters of the StartSim Menu ...... 97

5.1 The Proposed Implemented Versions ...... 98

5.2 Common Simulator Environment Parameters ...... 116

5.3 Common Simulator Environment Parameters ...... 117

5.4 Parameters of the examined SCAR versions ...... 118

5.5 Contact-Duration-Awareness Wasted Power and Time ...... 127

5.6 Ranks of PeopleRank and PIPeROp ...... 134

5.7 Statistical Significance of PeopleRank and PIPeROp ...... 134

5.8 Ranks of SCAR and PISCAROp ...... 136

5.9 Statistical Significance of SCAR and PISCAROp ...... 136

5.10 Ranks of SocialCast and PISCastOp ...... 138

5.11 Statistical Significance of SocialCast and PISCastOp ...... 138

6.1 Correlations between Space-Syntax Metrics and Frequency of Association ...... 142

6.2 List of Algorithms simulated in the Space Syntax Experiments ...... 148

6.3 Performance Comparison among Representative algorithms of all the Studied Categories of Space Syntax based Opportunistic Forwarding Algorithms ...... 155

7.1 Ranks of DynAdp and DynAdpSpSyn ...... 167

7.2 Statistical Significance of DynAdp and DynAdpSpSyn ...... 167

7.3 Ranks of DynAdpOp and DynAdpOpSpSyn ...... 168

7.4 Statistical Significance of DynAdpOp and DynAdpOpSpSyn ...... 168

7.5 Ranks of PIPeROp and DynAdpOpSpSyn ...... 169

7.6 Statistical Significance of PIPeROp and DynAdpOpSpSyn ...... 169

7.7 Regression Analysis ...... 175

7.8 Coefficients of the Regression Analysis of the Delivery Ratio ...... 175

7.9 Coefficients of the Regression Analysis of the Interest Forwarders Ratio ...... 175

7.10 Coefficients of the Regression Analysis of Cost ...... 175

7.11 Coefficients of the Regression Analysis of Consumed Power ...... 175

7.12 Coefficients of the Regression Analysis of Fairness ...... 175

7.13 Coefficients of the Regression Analysis of Fmeasure ...... 176

7.14 ANOVA for Delivery Ratio ...... 177

7.15 ANOVA for the Interested Forwarder Ratio ...... 177

7.16 ANOVA for Cost ...... 178

7.17 ANOVA for Consumed Power ...... 178

7.18 ANOVA for Fairness ...... 178

7.19 ANOVA for Fmeasure ...... 179

8.1 Examples of Useful applications ...... 192

F.1 Record Description of the AUC Traces Trap File Content ...... 249

Table of Abbreviations

CSI:D in mobile networks based on the Stability of the user’s behavioral profile to discover the receivers Implicitly: Dissemination mode. It implements the ProfileCast paradigm

DTN Delay Tolerant Networks

DynAdp The Dynamic Adaptive Ranking algorithm

DynAdpOp The Dynamic Adaptive Ranking algorithm with the Threshold-based Opportunistic forwarding option

DynAdpSpSyn The Dynamic Adaptive Ranking algorithm integrated with Space Syntax metrics

DynAdpOpSpSyn The Dynamic Adaptive Ranking algorithm integrated with Space Syntax metrics with the Threshold-based Opportunistic forwarding option

Ep The Epidemic algorithm

IntOnly The Interest Only Algorithm

IntVal The Integration Value (One of the basic Space Syntax metrics)

IPeR The Interest aware PeopleRank algorithm

ISCAR The Interest-aware SCAR Algorithm

ISCast The Interest-aware Socialcast Algorithm

PAd The PeopleRank Adaptive Battery Threshold version

PAdOp The PeopleRank Adaptive Battery Threshold version with the Threshold-based Opportunistic forwarding option

PeR The PeopleRank algorithm

PIPeR The Power and Interest-aware PeopleRank algorithm

PIPeRDep The Power and Interest-aware PeopleRank algorithm with depletion-rate awareness

PIPeROp The Power and Interest-aware PeopleRank algorithm with the Threshold-based Opportunistic forwarding option

PIPeROpDep The Power and Interest-aware PeopleRank algorithm with depletion-rate awareness and the Threshold-based Opportunistic forwarding option

PISCAR The Power and Interest-aware SCAR Algorithm

PISCARDep The Power and Interest-aware SCAR Algorithm with depletion-rate awareness

PISCAROp The Power and Interest-aware SCAR Algorithm with the Threshold-based Opportunistic forwarding option

PISCAROpDep The Power and Interest-aware SCAR Algorithm with depletion-rate awareness and the Threshold-based Opportunistic forwarding option

PISCast The Power and Interest-aware Socialcast Algorithm

PISCastDep The Power and Interest-aware Socialcast Algorithm with depletion-rate awareness

PISCastOp The Power and Interest-aware Socialcast Algorithm with the Threshold-based Opportunistic forwarding option

PISCastOpDep The Power and Interest-aware Socialcast Algorithm with depletion-rate awareness and the Threshold-based Opportunistic forwarding option

PI-SOFA The Power and Interest-aware Social-based Opportunistic Forwarding Algorithms

SAROS The Social Aware Opportunistic Forwarding Simulator

SCAR The Sensor Context-Aware Routing Protocol

SCast The Socialcast algorithm

SPS Social Pervasive Systems

OSN Online Social Networks

mostPopClsAP The most Popular current Close Access Point

mostPopColAP The most Popular Colocated Access Point along the node’s path

wgtClsAPs weighted average popularity of the current Close Access Points

wgtClsAPsProx average popularity of the current close Access Points weighted by distance from the node within a predefined proximity range

wgtevery10 weighted average popularity of the current close spots every 10 meters within the proximity range

PopIndex The Popularity Index Algorithm of the close spots

PopIndexStr The Popularity Index Algorithm of the close streets

every10 Popularity of the current close spot boosted within 10 meters ranges of the proximity range

divDist The sum of the popularity of the current close spots attenuated by distance within the proximity range

mostdurAP The popularity of the spot co-located for the longest duration with the node

mostfreqAP The popularity of the spot most frequently co-located with the node

Table 1: List of abbreviations mentioned in this dissertation

Chapter 1

INTRODUCTION

1.1 Vision

Three evolving trends have opened the door to novel research challenges: first, advances in sensor technology that led to the reincarnation of pervasive systems; second, the proliferation of social networks; and third, the great advancement in network infrastructure. In the first trend, sensor technology embedded in smart mobile devices has made such devices candidates for building innovative context-aware pervasive applications. According to the Mobile Economy, almost half the population of the earth now uses mobile communications, which has contributed to the 1 billion mobile subscribers who have joined the wireless internet in the last 4 years alone [6]. Moreover, this figure is forecast to reach 4 billion by 2017. This mobile proliferation has been a main catalyst in the recent surge in mobile and pervasive systems research. In the second trend, the notable evolution in the shape and form of social networks and their seamless accessibility from mobile devices has created a gold mine of contextual information. For instance, Facebook encompasses 1.18 billion monthly active users as of November 2016 [7], among which 1.09 billion are mobile users, while there are 1.3 billion YouTube users. This is in addition to the massive media content uploaded on YouTube: 300 hours of video are uploaded every minute, and over 5 billion hours of video are viewed daily [8]. In the third trend, there is rapidly growing advancement in network technologies and infrastructure that supports the seamless weaving of social and pervasive systems into everyday life. These recent network technologies come with great advancements such as fiber networks, Gigabit speeds, WiFi connections and LTE with promising bitrates. Despite all these advancements, a set of challenges impedes users’ appreciation of the offered services. Among these obstacles is mobile data demand that exceeds the available infrastructure support [6], along with frequent contention due to the overwhelming number of media uploads and network accesses from such pervasive mobile devices. With such exponentially increasing mobile data traffic (see Figure 1.1), the network infrastructure becomes overloaded and users experience occasional network service unavailability. This, along with the rising service delivery cost [9], discourages users from engaging in many of the services available. More importantly, not all people have predefined routes connecting them, and not all places are covered by an available network infrastructure.

All these challenges signal the need for ad hoc connections [10], delay tolerant connections [11] and opportunistic networks [12] [13] which would facilitate communication in such challenged environments.

Utilizing an ecosystem that combines smart mobile devices with big-data-like environments in the form of social networks, along with full utilization of the available opportunistic networks, would allow for the creation of an intelligent set of services and applications that merge the two domains.

Figure 1.1: Global Mobile Data Traffic from 2009 till 2020 (in Exabytes per month)

Given the expected continuous growth in the three trends above, we envision a world in which the environment and the user will seamlessly mingle in such a way that consumable information, rather than data, will find its way to applications. Within this envisioned new harmony, an unbounded set of social and pervasive systems will seamlessly weave themselves into the fabric of our daily life. More specifically, we expect to witness a rise in what we define as Social Pervasive Systems (SPS), where an SPS is a system that intensively utilizes both primitive and fused contexts from both the mobile and social worlds.

In this research, we describe SPSs as systems that cross-pollinate a mutually influential mobile and social world with opportunities for new breeds of applications. Relying on the co-influence of its ancestors, an SPS will infer significant bidirectional user context and identify preferences between mobile and social systems, thus leading to a novel breed of rich services and applications. Applications enabled as a result of SPS will include social behavior monitoring systems, social persuasive applications, alerting systems, socially-influenced context-aware systems and social recommender systems.

1.2 Scope and Domain

Recent studies report a noticeable increase in the number of smart phone users worldwide, from 1.59 billion users in 2014 to 2.08 billion users in 2016, with a projected 2.48 billion users in 2018 [14]. Although internet access is pervasive and ubiquitous in developed countries, it is scarce in developing and undeveloped economies. This is depicted by the fact that there are 207.2 million smart phone users in the USA in 2016 [15] (63.85% of the USA population [16]), while there are 123.7 million users in the Middle East and Africa region [17], and their number in a developing country such as India is 204.1 million users [18] (15.5% of India’s population [19]). The Global Information Technology Report 2015 of the World Economic Forum states that “Citizens in developed economies take access to stable, high speed communications networks for granted. In many of these countries today, broadband Internet connectivity is now seen as a basic utility on a par with energy or water. In developing countries, however, neither stability nor speed can be relied upon. In India and other developing economies, the mobile revolution - in which the rapid development of a mobile phone network did not wait for a landline rollout - is already having an impact on many social issues and endeavors” [20].

With the current setup in developing countries, where users own smart devices and demand internet access but suffer from poor network infrastructure, there arises a need for alternative network connectivity such as delay tolerant networks (DTNs) and opportunistic networks. Alternative technologies have been used to compensate for scarce network infrastructure and network disconnection. For instance, a set of applications relies on physical transportation systems equipped with mobile access points to connect Internet hubs and Internet kiosks in villages of India, such as the DakNet application [21]. Another example is the TrainNet application, in which a train carries hard disks and storage devices to transfer non-real-time data (such as videos) between railway stations: senders leave their data on these storage devices, which are carried across railway stations until the data reaches the recipient at another station [21]. Delay tolerant networks establish a route of communication between the sender of information and the destination node(s) through a set of communications among intermediate nodes. DTNs are advantageous in that they tolerate delay in content delivery. However, DTNs require a guaranteed, fully connected route from the source to the destination before the message transfer starts. Opportunistic networking, on the other hand, exploits the opportunistic encounters of mobile devices to share each device’s resources and services and to exchange context information [13]. Opportunistic networking does not require an established route of communication between the sender of information and the destination nodes. Instead, opportunistic encounters between pairs of intermediate nodes suffice to transfer data until it reaches the destination nodes.

In this research, we will focus on a subset of the SPS applications, namely the social-based opportunistic forwarding applications (as reflected in the Scope and Domain section of Figure 1.2), which are highly recommended in areas where challenged network infrastructure coincides with pervasive mobile usage and high demand for internet access and connectivity. In such environments, reliance on mobile opportunistic networks and the advanced technology of mobile devices enhances the efficiency and effectiveness of opportunistic forwarding approaches. Furthermore, leveraging both the social profiles of the mobile users and the social connectivity among these users motivates the reliance on social-based opportunistic forwarding approaches.

1.2.1 Social-based Opportunistic Forwarding Approaches

As a subset of the socially influenced context-aware systems, we will focus on forwarding algorithms that utilize social data of users, along with the current mobile context typically used in data forwarding decisions. Some existing popular approaches rank the mobile candidates used in the forwarding process according to their social popularity. Social popularity is determined using the ’centrality’ social metric to select the neighboring candidates with the highest rank and designate them as the next message carriers. The PeopleRank algorithm [22] is one popular example of this category of algorithms. Other algorithms utilize the ’betweenness’ social metric to pick the neighboring candidates with the highest rank. An example of such algorithms is SimBet [23]. The SimBet approach, however, targets certain destination nodes by their usernames instead of targeting nodes based on common profiles. Thus, the SimBet approach cannot provide a solution for forwarded content that is profile-based. SocialCast [24] is another popular social forwarding algorithm; it relies on a utility function that ranks candidate nodes based on their history of co-location with the destination nodes. It also uses active connectivity with other neighbors, namely the rate of change of their connectivity with other nodes. The algorithm relies on a publish-subscribe approach in which destination nodes that are interested in receiving information subscribe to the service. SocialCast, however, serves destination nodes by their ID instead of the common social profile they belong to. This restriction hinders SocialCast from forwarding content to a group of nodes that share a common social profile, which is the main target of social-based forwarding.

Some applications target destination nodes by their social profile instead of predefined IDs. Such applications cannot use the social forwarding algorithms that target destination nodes by their IDs. These applications require another category of social forwarding algorithms such as the popular ProfileCast algorithm [25] which targets destination nodes that have a certain common mobility profile. It assumes that common profiles indicate a high probability that these nodes would be interested in receiving similar messages. Despite this, ProfileCast and similar algorithms do not consider other social metrics that can eventually improve the delivery ratio with more efficient forwarding-selection choices.

1.3 Challenges

In order for us to realize the vision shared above, we identify a set of challenges that need to be addressed by the research community. These challenges are addressed in the work presented in this dissertation, and are clustered into seven main categories, namely: real-time constraints, scalability, conflict resolution, extreme heterogeneity, power conservation, security and privacy, and finally intelligence and invisibility. These challenges are discussed in detail in section 2.2.5 and are accompanied by a broad vision of the set of solutions we propose to overcome them, as tracked in the related work chapter.

Of specific interest and importance, we will focus on the following subset of challenges that need to be addressed by any SPS applied within the defined domain of developing countries with reliance on opportunistic networks: 1) Social-based forwarding decisions lack interest awareness, and interest in content is not utilized as an incentive for participation in the forwarding process. 2) The integration of power awareness in interest-based forwarding approaches is often overlooked; these algorithms are oblivious to the power limitations of the available mobile devices in the place. 3) These algorithms overlook the unbalanced processing and communication load among the mobile nodes relative to the capability of their resources. 4) These algorithms have to maintain user information privacy along with awareness of social, interest and power capabilities in order to motivate users’ participation in the delivery process. 5) Not all of the nodes’ information and properties are always accessible, so the algorithms have to adapt to the available information and the current context by relying on a dynamic adaptive ranking process in order to achieve intelligent decisions and services.

It is indeed challenging to restrict content dissemination to only those interested in the forwarded information, especially in ad hoc connection environments that lack a network infrastructure. In applications of sensation propagation within environments lacking infrastructure or in wireless sensor networks, massive amounts of context information are frequently disseminated. When such applications are oblivious to the importance of identifying interested recipients, massive efforts are wasted in propagating information that is of no value or interest to many recipients. This leads to inefficient and ineffective content dissemination in addition to a significant waste of the precious limited resources of the participating nodes. In order to overcome this problem, some content-dissemination applications implement event-driven approaches such as publish-subscribe techniques [24], by which interested nodes subscribe to the service, thus registering interest in specific context information. In other event-driven approaches, the interested nodes establish listeners to a specific type of context.

On another dimension, without efficient and effective dissemination of the massive context information among the interested nodes, there will continue to be extensive waste of the nodes’ resources in all these types of applications. This waste may also deplete the nodes’ power and cause disconnections in the ad hoc network, resulting in a shorter network lifetime and failure to complete the context-dissemination process. Power-oblivious social-aware forwarding algorithms in opportunistic networks may fail due to the unplanned waste of precious mobile resources and rapid opportunistic network failure. This prompts the need to integrate power-awareness into these social-aware forwarding approaches. As an incentive, these algorithms should be fair in utilizing the participating nodes’ resources in order to avoid depleting some while only lightly utilizing others. Fairness in power consumption is a decisive metric in the efficiency of these forwarding algorithms.

On a third dimension, if privacy is not preserved, many users will refrain from participating in these services leading to delay or even failure in these services. This raises the call for preserving privacy through distributing the data processing among the nodes whereby each node computes its own rank using its own private information to produce a hashed rank. This hashed rank is then exchanged among peers and used in decision-making without enabling reverse hashing that reveals the node’s private properties.

Figure 1.2: Challenges and Proposed Solutions for an SPS Application

This research focuses on a subset of the socially influenced context-aware systems, namely Social-based Context-Aware Forwarding approaches. This class of forwarding approaches leverages social information about the users in addition to the current context in order to make forwarding decisions. We will elaborate on some of the research problems associated with social forwarding algorithms in opportunistic networks. Our study of the challenges that confront social forwarding algorithms in mobile opportunistic networks summarizes them in 6 main challenges, as illustrated in Figure 1.2. These challenges are:

• Challenge 1: Lack of Forwarding Incentives: Current social-based opportunistic forwarding algorithms do not take into consideration the overhead imposed on uninterested nodes as a result of their participation in forwarding schemes. The lack of interest in content here means that the nodes involved in the process have no significant incentive to be part of this system. The importance of integrating interest in the process of forwarder selection is often overlooked, especially in environments with mobility patterns that are not repetitive or where no social relationships connect the available nodes in the place. In such environments, interest could be a significant common social relationship that clusters nodes. Eliciting such interest-awareness knowledge facilitates context dissemination to interested nodes, thus reducing the cost of massive uninteresting information dissemination that overwhelms nodes without benefit. It is necessary to bring in an incentive to motivate nodes towards participation in the forwarding process, especially in environments lacking communication infrastructure. In such environments, reliance on the available mobile nodes is substantial. However, the owners of these mobile devices cherish the limited resources of their devices. Consequently, there is an essential need to introduce incentives in forwarding approaches in order to gain forwarder nodes’ willingness to participate in the forwarding process. Unfortunately, these algorithms are currently oblivious to the value of considering interest as an incentive. The user’s interest in the content can be an effective incentive to persuade nodes to participate in the forwarding process, as they mutually benefit by receiving content that is of at least partial interest to them.

• Challenge 2: Limited Power: This challenge pertains to the lack of power-awareness in the forwarder selection process of social forwarding approaches. That is, although interest-aware forwarding algorithms are aware of forwarders’ partial interest in the disseminated information, they may not be aware of whether a node has the battery power required to sustain the forwarding process until completion; with some probability, the selected forwarders are unable to complete the content delivery process. This challenge mainly occurs because the forwarding algorithm does not select forwarder nodes with sufficient power to sustain the forwarding process and give them enough time to deliver the forwarded content to the destination nodes.

Thus, it is important to take into consideration that the social opportunistic forwarding process consumes forwarder nodes’ buffer and power resources. When the forwarding process takes place in ad hoc networks instead of relying on established network infrastructure, the power and computation overhead is borne entirely by the participating nodes’ resources. Accordingly, mobile nodes whose resources are limited will inevitably be reluctant to participate in this process. This reaction hinders, or at least delays, the forwarding process.

• Challenge 3: Unfairness: This challenge is derived from the unbalanced resource utilization of the participating forwarder nodes in the system. That is, some forwarder nodes are over-utilized while others are lightly utilized. Power-fairness-oblivious forwarding algorithms face this challenge, where the forwarder nodes that participate early in the process consume more power, buffer and computational resources than the ones that join the process later on. This leads to unfair utilization of the participating nodes’ resources. Mobile node users are thus discouraged from joining the forwarding process in its early stages to avoid over-utilizing their limited resources.

These problems motivate the creation of social opportunistic forwarding algorithms that are power-aware and also fair in the resource utilization of participants. Power awareness here includes awareness of the depletion rate of the node’s power. Without awareness of the power depletion rate, algorithms may select nodes for their high remaining power without paying attention to the node’s future needs. This in turn leads to power exhaustion and, subsequently, may result in the failure of the forwarding process via this node.

• Challenge 4: Limited Contact Duration: This challenge pertains to the lack of awareness of whether the contact duration between the carrier node and the candidate node is long enough for a complete message transfer to occur. During an incomplete message transfer, the forwarding algorithm wastes non-trivial resources of the participating nodes. This mainly takes place because the expected contact duration between the encountered nodes is overlooked. In some instances, the nodes in contact might fail to transfer the complete message content between them due to sudden disconnection.

• Challenge 5: Metrics Choice: This challenge pertains to the improper selection of the ranking metrics, or the unavailability or discontinuity of the required information for decision making. This challenge may occur due to the fact that not all of the nodes’ information is available all the time; some of them might not keep a history of their behavior or mobility patterns, while some might not agree to share their private data with others.

• Challenge 6: Assessment Metrics: The standard effectiveness and efficiency performance evaluation metrics cannot assess the performance of the proposed approaches from the interest and power awareness perspective. Thus, there is a need to coin new interest- and power-related assessment metrics that serve this purpose.

In addition, social opportunistic forwarding approaches face a trade-off between power conservation and fairness on one side, and delayed delivery that may lead to the user’s gradual loss of interest on the other side. That is, while selecting forwarders carefully in order to conserve power and to avoid bothering uninterested users, the social forwarding algorithm may suffer delay in delivering the content. Due to such delays, destination nodes may lose interest in receiving the content, or may even receive content that is no longer valuable. Such algorithms need to be aware of this trade-off and to find a balance between the two conflicting goals.

1.4 Contributions

From the above mentioned challenges, we observe some common features that can be leveraged to improve the performance of the challenged systems. The main proposed solutions to these challenges as illustrated in Figure 1.2 are as follows:

• To resolve challenge 1, we propose integrating interest awareness in the social-aware forwarding process.

• To resolve challenges 2, 3 and 4, we propose leveraging power-awareness in the social-aware forwarding process.

• To resolve challenge 5, we propose a dynamic adaptive ranking function that adapts the ranking process based on the context information currently available among the nodes. In addition, relying on minimum assumptions or the least required information would facilitate the participation of a large number of people, leading to better chances of effective and efficient service delivery and information dissemination. Finally, leveraging Space Syntax metrics and the node’s mobility pattern among the ranking metrics, in addition to the above-mentioned metrics, and dynamically adapting the ranking process according to the current context boost the effectiveness and efficiency of the ranking function towards making intelligent forwarding decisions.

• To resolve challenge 6, we introduce new interest and power-related evaluation metrics such as the interest-based effectiveness evaluation metric, the effectiveness index, the efficiency index, and the power-awareness index.
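The dynamic adaptive ranking idea proposed for challenge 5 can be illustrated with a small sketch. This section does not specify the ranking equation, so the weighted average below, the renormalization over whichever metrics are currently available, and the names `adaptive_rank`, `metrics` and `weights` are all illustrative assumptions rather than the method used in this dissertation.

```python
def adaptive_rank(metrics, weights):
    """Rank a candidate using only the metrics that are currently available.

    metrics: dict mapping a metric name to a value in [0, 1], or None when the
             node cannot or will not share that piece of information.
    weights: dict mapping a metric name to a non-negative weight (assumed values).
    """
    # Keep only metrics that are known and carry a positive weight.
    available = {
        name: value
        for name, value in metrics.items()
        if value is not None and weights.get(name, 0.0) > 0.0
    }
    total_weight = sum(weights[name] for name in available)
    if total_weight == 0.0:
        return 0.0  # nothing usable is known about this candidate
    # Renormalize so missing metrics do not drag the rank down artificially.
    return sum(weights[name] * value for name, value in available.items()) / total_weight
```

With this design choice, a node that withholds a metric is ranked on what it does share instead of being treated as if the missing value were zero.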

We elaborate on our proposed solutions in the rest of this section. These proposals will be discussed and examined in detail in the remaining part of this document.

1.4.1 Integrating Interest Awareness with Social-aware Opportunistic Forwarding Algorithms

We hypothesize that by using a blend of contextual information from the social and mobile worlds, such as the user’s interest in the forwarded content, and integrating it into the ranking process of the candidate nodes, we gain several advantages at once.

The first advantage is reducing the ratio of contacted uninterested nodes in the forwarding process, so they are neither bothered by uninteresting information nor engaged in unwanted power consumption. This makes the forwarding process more effective. The second is establishing an incentive technique that persuades engaged nodes to participate in the forwarding process, as it provides them with opportunities to gain interesting information while they forward this information to the destination nodes. The third advantage is that focusing on interested nodes, rather than being oblivious to interests in node selection, reduces cost and overall power consumption, since fewer forwarder nodes are utilized in the forwarding process; these algorithms thus perform more efficiently. Finally, by emphasizing interest-awareness, we introduce another social metric besides the social friendship relationship. Integrating interest-awareness into the social-aware forwarding process overcomes the deficiency of interest-oblivious social-aware forwarding approaches when applied in communities that do not by nature have fully connected friendship graphs. Mall environments, conferences, and cinema halls are just some examples of such environments, where the individuals present need not all be friends or know each other. Nevertheless, they may have common interests that led to their co-existence in this small community. Leveraging this knowledge improves the performance of the applied social forwarding approaches.

To achieve interest awareness in social forwarding algorithms, we propose integrating this piece of information into the candidate nodes’ social rank, upon which the selection process takes place. This can be achieved by maintaining an interest vector for each node that indicates its particular set of interests. The forwarded message should also be accompanied by an interest vector that indicates the areas of interest which the message belongs to or covers. Whenever the message carrier comes into proximity with a candidate node, they exchange the message interest vector in order to compute the similarity interest vector (SInt) between the candidate node’s interest vector and the message interest vector. A candidate node whose SInt is above a certain preset threshold is classified as an interested node. The algorithm boosts the social rank of an interested node by rewarding it, while the rank of any uninterested node is penalized to avoid selecting it as a forwarder.
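The interest-aware ranking step described above can be sketched as follows. This section does not state how SInt is computed, so the sketch treats it as a single score obtained by cosine similarity; that measure, the threshold of 0.5, and the reward and penalty factors are assumptions chosen only for illustration, as are the function names.

```python
import math


def interest_similarity(node_interests, message_interests):
    """Cosine similarity between two equal-length interest vectors (assumed measure)."""
    if len(node_interests) != len(message_interests):
        raise ValueError("interest vectors must have the same length")
    dot = sum(a * b for a, b in zip(node_interests, message_interests))
    norm_node = math.sqrt(sum(a * a for a in node_interests))
    norm_msg = math.sqrt(sum(b * b for b in message_interests))
    if norm_node == 0.0 or norm_msg == 0.0:
        return 0.0  # a node or message with no declared interests matches nothing
    return dot / (norm_node * norm_msg)


def interest_adjusted_rank(social_rank, s_int, threshold=0.5, reward=1.5, penalty=0.5):
    """Reward the rank of an interested candidate and penalize an uninterested one."""
    factor = reward if s_int > threshold else penalty
    return social_rank * factor
```

For example, a candidate whose interests match the message exactly would have its social rank multiplied by the reward factor, while a candidate with no overlapping interest would have it halved under these assumed factors.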

To sum up, we conjecture that integrating this type of contextual information, namely the interest of the forwarder nodes, in the selection process will lead to efficient and effective social-aware forwarding algorithms. We hypothesize that interest-aware social-based forwarding algorithms will improve the delivery ratio, reduce cost and, more importantly, avoid burdening uninterested nodes in the forwarding process.

1.4.2 Leveraging Power Awareness in Social-aware Opportunistic Forwarding Algorithms

We speculate that power awareness of the participating nodes is valuable information that can boost the performance of the algorithm and can be a practical incentive for participant nodes. The limited resources of the mobile nodes in mobile opportunistic networks are a main cause of their reluctance to participate in forwarding information, even if it is of high interest to them. Consequently, leveraging power consumption information in ranking candidate nodes improves the forwarder selection process by selecting nodes based on their remaining power resources. We propose a preset power threshold: candidate nodes whose remaining power is above the threshold, and thus sufficient to conduct the delivery process, are rewarded, while nodes whose power resources are below the threshold are penalized. Such a reward/penalty technique emphasizes the selection of power-capable nodes and avoids power-depleted nodes. We suggest that power awareness has a positive effect on the cost paid when power-aware social forwarding is applied. Power-awareness integration not only reduces the cost paid, but also conserves overall power consumption and maintains power fairness. We believe that these benefits come with a comparable delivery ratio and proportionate delay.
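A minimal sketch of the power-threshold reward/penalty technique follows. The threshold of 30% remaining battery, the multiplicative reward and penalty factors, and the names `power_adjusted_rank` and `select_forwarder` are hypothetical values and helpers introduced only to make the idea concrete.

```python
def power_adjusted_rank(social_rank, remaining_power, power_threshold=0.3,
                        reward=1.2, penalty=0.5):
    """Adjust a candidate's rank by its remaining power.

    remaining_power and power_threshold are fractions of a full battery in [0, 1].
    """
    if not 0.0 <= remaining_power <= 1.0:
        raise ValueError("remaining_power must be a fraction in [0, 1]")
    # Candidates at or above the threshold are rewarded, the rest are penalized.
    factor = reward if remaining_power >= power_threshold else penalty
    return social_rank * factor


def select_forwarder(candidates, power_threshold=0.3):
    """Pick the candidate with the highest power-adjusted rank.

    candidates: list of (node_id, social_rank, remaining_power) tuples.
    Returns the chosen node_id, or None when there are no candidates.
    """
    if not candidates:
        return None
    best = max(
        candidates,
        key=lambda c: power_adjusted_rank(c[1], c[2], power_threshold),
    )
    return best[0]
```

Under these assumed factors, a socially popular node with a nearly empty battery can lose the selection to a slightly less popular node that has enough power to complete the delivery.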

Furthermore, paying attention to the depletion rate of the mobile nodes enables the forwarding process to select the nodes that can last until the delivery process completes. With such a selective process, the approach would increase the probability of successful delivery. Moreover, selecting the candidate nodes with lower depletion rates or light usage patterns consumes less of these nodes’ resources, which ultimately decreases the power-level gap among the nodes in the community. This, in return, improves the fairness of the social forwarding algorithm, since it will not over-utilize certain nodes until their resources are exhausted. With such power-awareness, the variance in remaining power resources among the nodes of the community is reduced, which maintains power utilization fairness in this community. Maintaining fairness in the utilization of nodes’ resources is an encouraging incentive for nodes to participate in the forwarding process. From this perspective, we propose integrating device-specific contextual information, such as the depletion rate of the candidate node, in the reward/penalty process of the node’s social rank. In the experimental section we demonstrate several ways of integrating the depletion rate in the ranking equation.
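One possible way to reason about depletion-rate awareness and fairness is sketched below. It is not one of the integration options evaluated in the experimental section; the time-to-depletion estimate, the units (battery fraction and fraction per hour), and the use of population variance as a fairness indicator are assumptions made for illustration.

```python
import statistics


def time_to_depletion(remaining_power, depletion_rate):
    """Estimated hours until the battery is empty.

    remaining_power: fraction of a full battery in [0, 1].
    depletion_rate: fraction of the battery consumed per hour.
    """
    if depletion_rate <= 0.0:
        return float("inf")  # an idle or charging node is treated as never depleting
    return remaining_power / depletion_rate


def can_sustain_delivery(remaining_power, depletion_rate, expected_delivery_hours):
    """True when the node is expected to stay alive until delivery completes."""
    return time_to_depletion(remaining_power, depletion_rate) >= expected_delivery_hours


def power_fairness_variance(remaining_powers):
    """Variance of remaining power across the community; lower suggests fairer use."""
    return statistics.pvariance(remaining_powers)
```

A node with 90% battery but heavy usage may be a worse forwarder than a node with 60% battery and light usage, which is exactly the case that a remaining-power threshold alone would miss.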

On another front, we hypothesize that the social forwarding approach needs to be aware of the expected contact duration among neighboring nodes, so that it can select candidates expected to remain in contact long enough for the message transfer in such ad hoc connection environments. Social forwarding algorithms that are oblivious to this piece of information waste time and resources on incomplete message transfers, where the neighboring nodes get disconnected before the message transfer is complete, after both have spent time and resources transferring an incomplete portion of the message. We propose using the Kalman filter prediction technique [26] [27] to predict the expected contact duration between two neighboring nodes and then using this expected value in rewarding or penalizing the candidate node’s social rank. We presume that contact-duration awareness in social forwarding approaches significantly reduces the number of incomplete transfers, which reduces wasted power and consequently the overall power consumption.
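The contact-duration prediction can be illustrated with a scalar Kalman filter. This is a minimal sketch assuming a constant-state (random walk) model for the contact duration of one pair of nodes; the noise parameters, the class name `ContactDurationKalman` and the helper `transfer_fits` are hypothetical and are not taken from the cited works.

```python
class ContactDurationKalman:
    """Scalar Kalman filter tracking the expected contact duration in seconds."""

    def __init__(self, initial_estimate, initial_variance=1.0,
                 process_noise=0.01, measurement_noise=1.0):
        self.estimate = initial_estimate
        self.variance = initial_variance
        self.process_noise = process_noise          # assumed drift between contacts
        self.measurement_noise = measurement_noise  # assumed observation noise

    def predict(self):
        """Predict the next contact duration; uncertainty grows, the state is kept."""
        self.variance += self.process_noise
        return self.estimate

    def update(self, observed_duration):
        """Correct the estimate with the duration of a contact that just ended."""
        gain = self.variance / (self.variance + self.measurement_noise)
        self.estimate += gain * (observed_duration - self.estimate)
        self.variance *= 1.0 - gain
        return self.estimate


def transfer_fits(expected_contact_s, message_bytes, link_bytes_per_s):
    """True when the message can be sent within the expected contact duration."""
    if link_bytes_per_s <= 0:
        raise ValueError("link_bytes_per_s must be positive")
    return expected_contact_s >= message_bytes / link_bytes_per_s
```

A carrier could call `predict()` when it meets a candidate, reward or penalize the candidate's rank according to `transfer_fits`, and call `update()` once the contact ends so that later predictions reflect the observed duration.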

1.4.3 Space Syntax-based Forwarding Approaches

Space Syntax was initially proposed in the field of architecture by Hillier and Hanson in 1984 to model natural mobility patterns by analyzing spatial configurations. It accurately predicts natural movement patterns in an area based on how its segments are connected [28] [29] [30]. Space Syntax opportunistic forwarding exploits the infrastructure of a particular place and the popularity of specific attraction points in that place in order to predict user mobility patterns, flows and areas of possible congestion. Space Syntax has been integrated in city-wide content delivery applications. These applications rely on the minimum information required, which constitutes the static popularity values of the attraction spots in a specified location. These popularity values are pre-calculated based on the initial map of the place (i.e. the initial design of the place). Accordingly, these applications claim that they rely on minimum input and that they are able to preserve privacy, since they do not request private information of users in that place. We find that the Space Syntax approach can be utilized in micro-level maps and not only on city-wide level maps. For example, it can be applied on a university campus map, a conference map, a mall or an airport map.

However, we argue that the Space Syntax metrics do not reflect the real distribution of popularity values. Thus, we pose the question: how should attraction points on a map be defined to achieve a more accurate computation of the popularity of the place? We believe that the current Space Syntax approach does not properly define the real attraction points on a map. Instead, it imposes its predictions on the map, which might not reflect reality when the place is in use. A proper definition of attraction points enables the correct computation of Space Syntax metrics, which can then be integrated in computing mobile nodes’ ranks for proper forwarding decisions.

Another important aspect to consider when ranking nodes is the popularity that these nodes gain by visiting popular places. When a node frequently visits a popular place, it gains popularity since this increases the probability that the node will encounter targeted nodes. This research investigates several options for properly identifying popular places and for defining a node’s popularity based on the place popularity.

In addition, this research examines the question of how attraction points can be defined. Traditional Space Syntax metrics do not accurately reflect the ranking of popular spots or attraction points, and so we propose a new technique for doing so based on the frequency and duration of mobile nodes’ association with these spots. In turn, we recommend how to assess the popularity of the mobile nodes based on these more accurate metrics. We conducted a study in which we compared the forwarding algorithm that relies on traditional Space Syntax metrics with those that rely on our recommended new Space Syntax-based ranking metrics.
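The following Python sketch illustrates how spot popularity could be derived from the frequency and duration of node associations, and how a node could then inherit popularity from the spots it visits. The equal weighting of frequency and duration (`alpha = 0.5`), the normalization, and the function names are assumptions made for this example; they are not the metrics defined later in this work.

```python
from collections import defaultdict


def spot_popularity(associations, alpha=0.5):
    """Score each spot from visit frequency and total dwell time.

    associations: list of (node_id, spot_id, duration_s) records.
    alpha: hypothetical weight of frequency versus duration.
    """
    visits = defaultdict(int)
    dwell = defaultdict(float)
    for _node, spot, duration in associations:
        visits[spot] += 1
        dwell[spot] += duration
    total_visits = sum(visits.values())
    total_dwell = sum(dwell.values())
    if total_visits == 0:
        return {}
    scores = {}
    for spot in visits:
        freq_share = visits[spot] / total_visits
        dwell_share = dwell[spot] / total_dwell if total_dwell else 0.0
        scores[spot] = alpha * freq_share + (1.0 - alpha) * dwell_share
    return scores


def node_popularity(associations, popularity):
    """A node accumulates the popularity of every spot it associates with."""
    score = defaultdict(float)
    for node, spot, _duration in associations:
        score[node] += popularity.get(spot, 0.0)
    return dict(score)


# Usage with made-up association records.
records = [("a", "cafe", 600.0), ("b", "cafe", 600.0), ("a", "lab", 1200.0)]
spots = spot_popularity(records)
nodes = node_popularity(records, spots)
```

Because the scores come from observed associations rather than from the static map, they change as the place is actually used, which is the gap in the traditional metrics that the text points out.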

1.4.4 Dynamic Adaptive Ranking

In many cases, the set of information needed for ranking nodes changes dynamically, which affects how well the algorithms can select the best candidates for forwarding. Thus, there is a need for dynamic attribute weights that compensate for this change in context by leveraging available and strong attributes while minimizing the reliance on weak or non-existent context information. Accordingly, this research proposes a dynamically adaptive ranking algorithm that integrates several forms of context awareness; namely, interest awareness, power awareness, and place awareness, together with social awareness and user activeness. The algorithm takes into consideration various factors that affect the rank of the co-existing nodes. These factors include the following: 1) the relative interest of the user in the forwarded content; 2) the power capabilities of the mobile node the user holds; 3) the degree of social popularity of the user; 4) the degree of activeness of the user in terms of physically or logically contacting nodes of other users; and 5) the popularity of the places the user frequents.

This algorithm dynamically changes the weights of the above-mentioned factors to suit the surrounding environment. The factors that control the node’s rank are thus re-weighted to adapt to the changing environments the node moves across. With such a dynamically changing rank, the nodes are continuously re-evaluated in order to optimize the following metrics: 1) increasing the delivery ratio; 2) increasing the contact with the interested users; 3) minimizing the contact with the uninterested users; 4) minimizing the cost paid while the targeted delivery ratio is achieved; 5) minimizing the amount of consumed power while the targeted delivery ratio is reached; and 6) maintaining fairness in utilizing the resources of the available mobile nodes in the place.

Accordingly, each node will execute this ranking function utilizing its own values of the above-mentioned factors. Its rank dynamically changes as the weights of these factors change; thus, with each new encounter the node recomputes its own rank and exchanges its value with the encountered node for decision making on content forwarding.
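The adaptive weighting described above can be sketched as a weighted sum whose weights are rescaled by how available or reliable each factor is in the current context. This is a minimal illustration only: the factor names, the `availability` scores in [0, 1], and the renormalization rule are assumptions made for this example rather than the ranking function developed in Chapters 3.3 and 7.

```python
FACTORS = ("interest", "power", "social", "activeness", "place")


def adaptive_weights(base_weights, availability):
    """Shift weight away from weak or missing context attributes.

    availability[f] in [0, 1] is a hypothetical score of how reliable
    factor f is in the current context (0 means unavailable).
    """
    raw = {f: base_weights[f] * availability.get(f, 0.0) for f in FACTORS}
    total = sum(raw.values())
    if total == 0:
        return {f: 0.0 for f in FACTORS}
    return {f: w / total for f, w in raw.items()}


def node_rank(values, weights):
    """Weighted sum of the node's factor values, each assumed in [0, 1]."""
    return sum(weights[f] * values.get(f, 0.0) for f in FACTORS)


# Usage: place information is missing, so its weight is redistributed.
base = {f: 0.2 for f in FACTORS}
context = {"interest": 1.0, "power": 1.0, "social": 1.0,
           "activeness": 1.0, "place": 0.0}
weights = adaptive_weights(base, context)
my_rank = node_rank({"interest": 1.0}, weights)
```

On each encounter a node would recompute `weights` from its current context, recompute its rank, and exchange the result with the encountered node, mirroring the per-encounter re-evaluation described in the text.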

1.5 Roadmap

Having defined the challenges an SPS would face and outlined our proposed solutions, we now provide a brief overview of the remaining chapters. In the Related Work Chapter (Chapter 2), we follow the footsteps of similar research efforts that have attempted to combine two domains, pervasive systems and social networks, and present them in a timeline that illustrates the progress of both domains independently and the gradual merger between the two over time. We present the evolution of this merger in order to gain a better understanding of what has been achieved thus far, get a better sense of the bigger picture, and consider what the future holds. In Chapter 2, we also describe what we label as Social Pervasive Systems, which cross-pollinate a mutually influential mobile and social world with opportunities for new breeds of applications. We then present a set of new services and potential applications that emerge from this unique blend. This section is followed by a description of some of the challenges we expect such systems to face and our proposed solutions.

In our experimental work, we focus on a sub-domain of SPS; namely, that of socially influenced context-aware systems. Accordingly, we present our four contributions: the first three improve the performance of social-based context-aware forwarding algorithms in opportunistic networks, while the fourth is the simulator we use to carry out our experiments.

More specifically, in Chapters 3 and 5 we present a framework for integrating interest-awareness and power-awareness into any social-based opportunistic forwarding algorithm in order to achieve better performance in terms of efficiency, effectiveness and power-awareness. This is made possible by integrating interest-awareness and power-awareness components in node ranking and in the decision-making of the forwarding process. We also demonstrate how integrating power-awareness within the proposed framework improves performance in terms of power utilization fairness and overall power consumption. We show this empirically by integrating our framework into three well-known social-aware opportunistic forwarding algorithms; namely, PeopleRank, SocialCast, and SCAR. We then compare the performance of the proposed algorithms with that of the original ones in terms of predefined metrics.

Our second contribution is the introduction of Space Syntax based forwarding in our simulation environment. We propose better attraction point definition methods that allow for more accurate forwarding decisions and that contribute towards better space planning. First, we question the accuracy of Space Syntax metrics in defining the attraction points in a specific space and argue that this negatively affects the performance and the accuracy of forwarding decisions. Next, we propose new Space Syntax metrics in Chapter 3.2. Finally, a set of proposed implemented versions are detailed and evaluated in Chapter 6.

Our third contribution, which is detailed in Chapters 3.3 and 7, is a proposed dynamic adaptive ranking approach that dynamically changes the weights of the attributes of its ranking function based on the current context.

All these contributions are discussed in detail and are demonstrated with simulation experiments.

Our fourth contribution is the SAROS simulator, which we built in order to carry out these simulation experiments in the absence of a simulation environment that suited our needs. The simulator is discussed in detail in Chapter 4. This simulation environment requires datasets that supply data such as mobility traces, social profiles, social relationships, and power consumption models, among others, according to each experiment’s requirements.

A comprehensive discussion and conclusion is detailed in Chapter 8. Finally, the research has generated several new ideas and has suggested that further areas of research and study are required. Chapter 9, the Future Work chapter, outlines two new contributions that we plan to expand on.

1.6 Our Published Work

We would like to list the work we have published so far based on this research work:

1. Soumaia Al Ayyat, Sherif Aly, and Khaled Harras, “Social Pervasive Systems - The Integration of Social Networks and Pervasive Systems,” in PECCS, Feb. 2013, pp. 118-124.

2. Soumaia Al Ayyat, “Social Pervasive Systems: The harmonization between Social Networking and Pervasive Systems,” in IEEE PERCOM PhD Forum, March 2014, Best PhD Forum Award.

3. Soumaia Al Ayyat, Khaled Harras, and Sherif Aly, “Interest Aware PeopleRank: Towards Effective Social-Based Opportunistic Advertising,” in IEEE WCNC, April 2013.

4. Soumaia Al Ayyat, Sherif Aly, and Khaled Harras, “PIPeR: Impact of Power-Awareness on Social-Based Opportunistic Advertising,” in IEEE WCNC, April 2014.

5. Soumaia Al Ayyat, Khaled Harras, and Sherif Aly, “On the Integration of Interest and Power Awareness in Social-Aware Opportunistic Forwarding Algorithms,” Computer Communications, pp. 97-110, Nov. 2015.

6. Soumaia Al Ayyat, Sherif Aly, and Khaled Harras, “SAROS: A Social-Aware Opportunistic Forwarding Simulator,” in IEEE WCNC, April 2016.

Here is a list of other research work that cited the above publications:

1. Girolami, Michele. “Device Interoperability and Service Discovery in Smart Environments.” PhD Thesis, DOI: 10.13140/RG.2.1.2344.7926, Italian National Research Council, (2015).

2. Girolami, Michele, Stefano Chessa, and Antonio Caruso. “On service discovery in mobile social networks: Survey and perspectives.” Computer Networks 88 (2015): 51-71.

3. Lin, Chia-Yu, et al. “An effective algorithm for interest aware opportunistic advertising by mining social and consuming information.” Vehicular Technology Conference (VTC Spring), 2014 IEEE 79th. IEEE, 2014.

4. Jiwei Li, Zhe Peng, Shang Gao, Bin Xiao, Henry Chan, “-assisted energy efficient data communication for wearable devices”, Computer Communications, Available online 30 August 2016.

In addition, I presented this research work as poster presentations in the following international and national events:

1. The Youssef Jameel PhD Fellowship Summer School, 22-25 June 2014, Cardiff, UK.

2. The Microsoft Research PhD Summer School, 30 June-4 July 2014, Cambridge, UK.

3. 3MT Contest, 4 March 2015, AUC, Cairo, Egypt.

4. The AUC Annual Research Day, 30 March 2016, AUC, Cairo, Egypt.

For the above work, I have received the following awards:

1. Best PhD Forum Award in the IEEE PERCOM PhD Forum 2014.

2. The Runner Prize in 3MT Contest 2015.

Chapter 2

Background

This chapter provides an extended account of the challenges related to the pervasive systems and social networks, stemming from their early inception, up until the present day. We find that over time, the two realms have become closer to one another. We identify the limitations of each of these realms, the challenges encountered when attempts have been made to merge them, and propose solutions for overcoming these challenges. We would like to emphasize that while attempts have been made to tackle some of the problems faced by Social Pervasive Systems (SPSs), there is still a great deal of work that needs to be done to ensure the pervasiveness, context-awareness and social-awareness of these systems without undermining security, privacy, scalability and real-time performance.

Recent advances in mobile technologies and the enabling network infrastructure have paved the way for the growth of two prime domains, namely, mobile-based pervasive systems and online social networks. Mobile technologies have been presented as an essential ingredient in the emergence and implementation of intelligent context-aware applications [31] [32] [33]. In parallel, the world-wide popularity of online social networks (OSNs) has brought about a wave of great social influence [34]. With such massive user populations (Facebook alone had 1.65 billion monthly active users as of March 2016 [7], of which 1.51 billion were mobile users, while YouTube has 1.3 billion users), OSNs possess huge amounts of social information that present a fertile ground for research. Such immense information is also a substantial source of contextual information, which has the potential to improve the intelligibility of pervasive systems significantly [35]. Similarly, by receiving context from mobile sensors, social networks can become aware of user context, enabling them to provide more intelligent social services [36].

While Pervasive Systems and Online Social Networks have been developing alongside one another, we found that the lack of enabling technologies hindered their progress at first. Both domains evolved at a slow pace until the end of the 1990s [37] and were mainly isolated from one another, with the exception of a few select commercial applications such as Lovegety and Humming Bird [38] that attempted to combine features from both systems for better service. Since 2000, both domains have witnessed rapid improvement coinciding with the advances in sensor technology, mobile technology and network infrastructure. While both the pervasive and social networking domains were aware of the power of joining forces, early attempts to conduct collaborative research that accommodated both areas were only able to make some incomplete (though important) contributions, as observed by Baldauf et al. [39] and Boyd et al. [40]. A few attempts initiated in 2007 contributed to the merger of both domains and examined their influence on each other in an attempt to maximize benefits [41] [36].

Based on our analysis of such mergers over time, we believe that further developments in the merger between the mobile and social worlds are inevitable. We demonstrate a solid merger in the form of new systems that we call Social Pervasive Systems (SPSs). We define an SPS as a system that extensively utilizes both primitive and fused context from both the mobile and social worlds. Relying on the co-influence of its ancestors, an SPS will infer significant bidirectional user context and preferences between mobile and social systems, thus leading to a new breed of rich services and applications.

The contributions we provide in this chapter are broken down into three main points. First, we define Social Pervasive Systems as a new research thrust and provide an illustrative analysis of the most prominent efforts that have been made towards its emergence. Second, we shed some light on the prominent applications in which SPSs would be of most striking importance and provide a discussion of the key challenges that need to be addressed by the research community. Finally, we provide suggestions for how to address these challenges. These contributions were presented and published in the International Conference on Pervasive and Embedded Computing and Communications Systems (2013) [42].

2.1 Context-Aware Systems and Social Networks

Context-Aware Systems constitute one of the main research areas that have contributed significantly to the construction of SPSs. In the 1990s, commercial, non-standard context-aware applications emerged, bringing with them the challenge of building applications that used extremely heterogeneous sensors. Thus, the progress and scalability of context-aware systems was significantly hindered. In the past decade, however, the coinciding improvement of mobile sensor technology and high bandwidth network infrastructures has revived both research into pervasive systems and their implementation [39].

In parallel to the progress of pervasive systems, online social networks (OSNs) were transformed from offline applications or mere contact lists into online social network sites (SNSs) that could be accessed over the web and through Application Programming Interfaces (APIs) [40]. We identified two phases in the development of pervasive systems and social networks and in the evolution of their merger, and visualized them as a timeline in Figures 2.1 and 2.2. These figures emphasize the milestones in the evolution of context-aware systems and online social networks and their merger.

2.1.1 Context-Aware Systems

Definition of Context-Aware Systems

Context-awareness is a fundamental component of pervasive systems. The term “pervasive” was originally coined by Weiser in 1991, who described pervasive systems as those systems that “weave themselves into the fabric of everyday life until they are indistinguishable from it” [43]. The first definition of “context” was introduced in 1994 by Schilit and Theimer, who said that context can be location, identities of nearby people, objects and changes to those objects [39]. In 2000, Dey and Abowd reached the most accurate definition of the term context, which they identified as “any information that can be used to characterize the situation of entities - an entity can be a person, a place or object - that are considered relevant to the interaction between a user and an application. This context includes the user and the application themselves.” [39]

Evolution of Context-Awareness Since the 1990s

Context-awareness remains an exciting area of research that began with the Xerox PARC experiments conducted in 1992. For a long time, location was considered the dominant kind of context information on which context-aware systems relied. By the mid-2000s, more sources of context information had become available alongside advancements in infrastructure, making more mature context-aware applications possible. Next, we break down the lifetime of context-awareness into two main phases of development.

Figure 2.1: The Evolution of both Social Networks and Context-Aware Systems from the 1980s to 2006

Phase 1 (1991-2006): Following Weiser, in 1992, the first experiments in ubiquitous computing research were conducted at Xerox PARC [43] to develop the ParcTab system [44]. In the same year, the Olivetti Research Lab launched its Active Badge Location System as the first context-aware application [45]. In the early 1990s, commercial, non-standard context-aware applications emerged, along with extremely heterogeneous sensors that together presented a huge challenge for the building of applications. As a result, the progress of context-aware systems was significantly hindered. Following the American E911 mandate of 1996, mobile-network operators made great efforts to locate emergency callers by using advanced positioning methods with a prescribed location accuracy. To gain from these investments in emergency services, they endeavored to introduce a set of commercial location-based services (LBS) for locating a wider variety of nearby services. At the time, however, users were not interested in such services, and so they were neglected and faded away [31]. As such, the progress of this kind of technology was hindered not only by a lack of infrastructure, but also by a lack of acceptance on the part of its intended users. A study conducted by Eisbach et al. (1999) showed that the success of pervasive systems depends on their impact on human social behavior and on users’ acceptance of the new technology. Their survey found that users tended to shy away from the technology and were not confident in the systems’ ability to preserve privacy. The authors proposed a model of the ways in which pervasive systems can influence human behavior, social attributions, and interaction outcomes [46]. They demonstrated that pervasive systems can affect our social life and communication. They also pointed out that it was essential to design these systems properly as artificial social actors and to gain users’ trust and appeal.

The most prominent context utilized in context-aware research and applications since the late 1990s has been location. This was evident in the early attempt to provide LBSs in 1996 and in the Cyberguide application, which was developed by Abowd et al. in 1997 to enhance guidebooks by adding location awareness and introducing a simple form of orientation information [44]. It was not until 2001 that location-aware opportunistic browsing was researched, when Bruijn et al. introduced a framework for opportunistic browsing that based the search results on the location of the device [47]. Researchers had not yet taken the user’s preferences into consideration when filtering the search results, since all the research conducted at the time concentrated solely on location as the context to be aware of during search filtration.

A useful contribution was made by Judd et al. in 2003, who provided a contextual information service (CIS) for proactive pervasive applications that enabled them to adapt to the user’s current environment. This service provided the applications with a virtual database of the entities and resources available in the current environment. It also enabled on-demand computation of contextual information and relieved the applications and the mobiles from the burden of implementing the required database or the query processing, since all of these tasks were handled by the CIS. The service also provided a query synthesizer and a number of contextual information providers for these applications [48].

By 2004, the service-oriented middleware approach had been introduced into the development of context-aware systems, as marked by Gu et al., who developed the middleware-based context-aware system SOCAM (Service-Oriented Context-Aware Middleware) [49]. SOCAM introduced an architecture for constructing and rapidly prototyping context-aware mobile services. However, it was based on a centralized implementation and did not provide any mechanism for storing and managing context history, nor did it accommodate any requirements for privacy protection [44] [50]. It was Fahy and Clarke who handled the data storage issue by presenting CASS (Context-Awareness Sub-Structure), an extensible centralized middleware approach that was designed specifically for context-aware mobile applications [39] [49].

Offering a wider range of support for context-aware mobile applications and services, the AWARENESS Project (2004 - 2008) [51] provided an infrastructure for developing context-aware and proactive applications targeting healthcare applications through mobile networks. These healthcare applications featured tele-treatment and tele-monitoring of patients. In AWARENESS, proactive context-aware applications are executed on top of a service infrastructure which provides generic components and manages context, security, and identity. The service infrastructure operates on a network infrastructure which supports context-aware mobility [52].

In 2005, the LBSs were revived with the introduction of Web 2.0, GPS capable mobile devices, and the 3G broadband [31]. Such enabling technologies supported the improvement of LBS features transforming them from being reactive to proactive, from tracking a single target’s position to multi-target service, from content-oriented to application-oriented, from self to cross-referencing, and from operator-centric to user-centric. Bellavista et al. in their research paper published in 2008 projected the future of LBSs with the emergence of middleware and open platforms such as Google’s Android and the like that could offer LBS to everyone.

Up until that point, the sensing capabilities of mobile phones had not been widely researched. The aforementioned milestones in the history of context-awareness from the early 1990s until 2006 are illustrated in Figure 2.1.

Phase 2 (2007 - present): In the past decade, the coinciding improvement of mobile sensor technology as well as high bandwidth network infrastructures has revived the research into and implementation of pervasive systems [39]. By the mid 2000s several context-aware systems realized the importance of mobile phone sensors in detecting context. CenceMe offered a good example of a context-aware system that used mobile sensors to extract and infer context. CenceMe was introduced by Miluzzo et al. in 2007 as an application running on iPhone devices [41]. It used mobile sensors to collect context and classify the raw sensed data and then uploaded the user status to back-end servers. These servers executed context inference and classification algorithms to deduce the user’s actions [53].

Starting from 2008, the ubiquity of mobile devices and their rapid advancement supported more concrete research into context-awareness, but more significant work has been put into fusing context with social information. The main trend in context awareness research has been to focus on situation awareness and context prediction as illustrated in Figure 2.2.

Figure 2.2: The Merger between Social Networks and Context-Awareness since 2007

Many studies have used semantic web-based rule engines such as ontology reasoning and machine learning algorithms such as decision trees [54] [55] [53] [56] and Bayesian networks [57] [58] [59] [49] to infer high-level context and deduce the user’s status. On another front, research studies have applied user-defined rule-based reasoning [60] or combined ontology-based reasoning and user-defined rule-based reasoning to infer context [61].

Current Research Trends in Context-Awareness

The service-oriented approach in context-aware systems has informed most of the research conducted over the past few years, mainly because it provides standardized protocols and standardized access, which strengthens the exchange of context information through web services [39] [52] [32] [62] [63] [64] [65] [66]. The service-oriented approach also facilitates the easy replacement of, or changes in, the backend code without affecting the tools calling such web services. Thus, it supports the addition or removal of sensor models, APIs, mobile handling and much more without changing the main programs, keeping them flexible and capable of interacting with evolving standards.

Pervasive systems are now capable of recognizing not only location and modes of locomotion but also complex activities [67], thus enabling situation inference [58]. This capability is sustained by advances in sensor technology, which operates with increasing accuracy, in addition to ongoing improvements in the study of context fusion and inference rules [68].

The timeline in Figure 2.2 summarizes the milestones in the development and evolution of context-awareness, including the introduction of proactive context-aware services, situation awareness, context prediction and the adoption of the service-oriented middleware approach in the domain of context-awareness.

2.1.2 Social Networks

Online Social Networks (OSNs) are an essential component in the construction of SPSs. Parallel to the progress of pervasive systems, OSNs have been transformed from offline applications or mere contact lists into online social network sites that can be accessed over the web and through APIs [40]. This section focuses on social networks in terms of their definition and their evolution from relative obscurity in the 1980s up until the present day, when they enjoy massive popularity.

Definition of Social Networks

A general definition of a social network is “a system with a set of social actors and a collection of social relations that specify how these actors are relationally tied together” [69].

According to Boyd et al., social network sites are “web-based services that allow individuals to (1) construct a public or semi-public profile within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system. The nature and nomenclature of these connections may vary from site to site.” [40] This definition was valid at the time of writing in 2008.

However, by 2011 social network sites (SNS) had come to encompass more activities and features including: [69]

• A multi-functional platform for personal online content creation such as photo and video sharing, text messaging, commenting on other users’ content and blogging.

• Photo-based visualization of the user’s social network of connections.

• Personal information such as interests, contact information and status as part of the user’s personal profile.

• Privacy settings to control who can access which part of the user’s personal profile.

• Provisions to allow friends to post photos, videos and messages on the user’s profile.

Evolution of Social Networks Since the 1980s

Social networks started as autonomous systems that were not connected to other systems, but progressed to connect people worldwide and thus gained popularity. In the course of their development, SNs standardized their APIs, facilitating information exchange with other systems and applications.

Phase 1 (1980s - 2006): The earliest attempt at a social network was in the early 1980s, when the bulletin board system (BBS) [70] became widespread. A BBS enabled users who were geographically distant from each other to communicate over telephone lines, thus providing a facility for forming virtual groups or communities. In 1984 FidoNet, a non-commercial network that used modem-based dialup between BBSs, added a feature that allowed private messages to be transferred among BBS users and files to be attached to these messages, thus enabling the exchange of information in the form of file attachments [71].

At that time, social networks did not operate as a real-time application, but rather as a message relay tool. In 1996, ICQ was released to become the first internet-wide instant messaging service. It gained popularity as it offered instant messaging, free SMS, and multi-user chats.

The first dedicated social network site, named SixDegrees.com, was launched in 1997 and allowed users to create profiles and list their friends; in 1998 it added a feature that enabled them to browse their list of friends [40]. In 1999, Microsoft launched its MSN messenger [72] as a free internet messaging service that allowed users to communicate with other MSN users via the internet and also to connect to several Microsoft communication tools such as Hotmail. Ryze.com was launched in 2001 to become the first SNS that encouraged users to improve their business network. By 2004, social network sites began to support photo-sharing, as exemplified by Flickr. A year later video sharing SNSs such as YouTube appeared and gained popularity. In the same year Facebook gradually expanded its membership from being restricted to Harvard members only to finally being open to everyone. Facebook introduced an extra feature by enabling outside developers to build “applications” that gave users the ability to personalize their profiles and perform other tasks [40].

Phase 2 (2007 - present): By the mid-2000s, efforts were under way to introduce social networks as a component of e-learning systems. In 2007, several publications elaborated on that contribution [73] [74] [75]. For example, O’Rourke analyzed SNs in e-learning systems and concluded that social interaction between students and instructors, and between students and their peers, improved the students’ capabilities for online participation and learning. Through discussions and collaboration tools, students were able to develop a social network in the e-learning environment. Such social networks were found to improve the students’ self-support in learning and to increase their sense of responsibility for their own learning and for supporting others [75]. This enhanced the formation of what became known as Learning Networks - defined as online social networks that rely on the social relations among the learners to foster non-formal learning and engage them in groups [76]. A study conducted by Berlanga et al. (2008) on how ad hoc transient communities foster knowledge sharing in learning networks concluded that one of the main benefits of such communities was that they enabled learners, particularly those at risk of being isolated from the rest of the group, to overcome difficulties they had with online learning tools. On another front, Graf et al. (2008) proposed an infrastructure for developing pervasive learning environments that included a social network component for connecting the learning environment with the students’ online social network site accounts [77]. The authors tackled the security issues encountered in these environments and proposed inserting security components in order to achieve trust. In 2009, another useful contribution was made by Awodele et al., who sought to improve the university student experience by augmenting the Learning Management System (LMS) with a social network of academic and non-academic experts from the university who could offer students help and consultation [78].

More recently, the growing trend of sharing photos and videos on social network sites such as Flickr, YouTube and Facebook has turned these sites into an immense multimedia database, prompting the emergence of the social multimedia approach [79] [35] [80] [40]. Social tagging has also become a popular activity in social networks and a powerful means of supporting collaborative activities [34] [80]. Mizzaro et al. attributed the popularity of social tagging to the release of Web 2.0 services that enabled non-professional users to informally organize and share information content [81].

By sharing these tags, new users could be guided by the social agreement on how to categorize the viewed information, such as web pages, photos, and videos. Social tagging also highlights the influence of culture on the understanding of specific terms. As such, it constitutes a rich source of research in the field of social studies. Social navigation is another powerful feature of social networks that relies on social tagging. Its concept was first introduced by Dourish and Chalmers in 1994 as the “navigation towards a cluster of people or navigation because other people have looked at something” [34]. Social navigation has been applied by many users to guide others as they seek information or navigate through online or physical space, which serves many online and mobile applications. Its uniqueness results from the social imprint that users leave as they post tags or impressions about specific online content. Such social presence encourages others to follow the same path and to feel the presence of other visitors to the site.

Online social networks have also been an effective means of expressing political views and have been used as blogs [40] [82] [33] [69]. For instance, among the powerful tools that led to the success of Barack Obama in the 2008 US presidential election was the use of his video blog on YouTube as part of his campaign [80]. Facebook also played a vital role in the 2011 uprisings in the Arab world, which were prompted by Facebook groups initiated by the youth in those countries. In turn, governmental organizations created their own pages on Facebook in an attempt to open a channel of communication with people and achieve a level of mutual understanding.

Currently, many online social network sites detect context from the personal information that users provide or from their lists of friends, and recommend certain actions accordingly. For instance, based on the user’s birth date and the current date, these sites remind the user’s friends of her approaching birthday. This is a simple, low-level form of context detection and situation inference. Another simple inference applied in all social networks is proposing new friends based on the fact that two users share common friends. Also, some sites display advertisements customized to the user’s interests and preferences. More intelligent sites detect the recent status of the user and display matching advertisements.
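The two simple inferences described above can be sketched in a few lines of Python. This is only an illustration under assumed data structures (a birth date per user, and a dictionary mapping each user to a set of friends); the function names and the seven-day reminder window are hypothetical and do not describe how any particular site implements these features.

```python
from datetime import date


def _birthday_in_year(birth_date, year):
    """Return the birthday in the given year (29 Feb falls back to 28 Feb)."""
    try:
        return birth_date.replace(year=year)
    except ValueError:  # born on 29 February, target year is not a leap year
        return date(year, 2, 28)


def birthday_is_near(birth_date, today, window_days=7):
    """Low-level context detection: is the next birthday within the window?"""
    next_birthday = _birthday_in_year(birth_date, today.year)
    if next_birthday < today:
        next_birthday = _birthday_in_year(birth_date, today.year + 1)
    return (next_birthday - today).days <= window_days


def suggest_friends(friends, user, min_common=1):
    """Propose non-friends of `user`, ranked by the number of common friends.

    `friends` maps each user name to the set of that user's friends.
    Returns a list of (candidate, common_friend_count) pairs.
    """
    own = friends.get(user, set())
    counts = {}
    for friend in own:
        for candidate in friends.get(friend, set()):
            if candidate != user and candidate not in own:
                counts[candidate] = counts.get(candidate, 0) + 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(name, n) for name, n in ranked if n >= min_common]


if __name__ == "__main__":
    graph = {
        "ann": {"bob", "cat"},
        "bob": {"ann", "dan"},
        "cat": {"ann", "dan", "eve"},
        "dan": {"bob", "cat"},
        "eve": {"cat"},
    }
    print(suggest_friends(graph, "ann"))
    print(birthday_is_near(date(1990, 3, 10), date(2016, 3, 5)))
```

Both functions use only information the site already holds, which is why the text classifies them as low-level inference: no sensor data or learned model is involved.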

Inasmuch as it uncovers indirect relationships and trends, social network analysis (SNA) has opened a world of material for researchers. SNA began in the 1930s as a field of study in sociology that applied network characteristics to social phenomena [75]. According to O’Rourke, who wrote a Master’s thesis analyzing social networks in e-learning systems, SNA is a powerful means of studying the social behavior of students and the social interactions among them, in order to extract the strong and weak points of a learning system and then recommend enhancements based on that analysis. In 2011, Rosen et al. provided a study of the co-evolution of SNA science and the online environments of computer-based communication [69].

Figures 2.1 and 2.2 illustrate the evolution of social networks from the 1980s until they gained massive popularity.

The Current Research Trend in Social Networks

Many research studies discuss the benefits and risks of using social networks and SNSs. They mainly describe the usage patterns based on culture, race and level of education. Among these papers, Boyd et al. state the pros and cons of SNSs, and claim that on one hand they maintain offline social ties and solidify them; as such SNSs have become an important part of the users’ lives. On the other hand, several studies have shown that the encroachment of SNs in the lives of youths in particular has raised a number of concerns about privacy and safety [40].

In addition to maintaining social relationships, social networks are rich in detailed contextual information that describes an individual’s personal interests and preferences. Thus, social networks offer relationship-awareness and preference-awareness [41], which makes them a rich source of social information for research. We have compiled the milestones in the history of social networks from the early 1980s until 2007 and visualized them in Figure 2.1, which illustrates the coinciding progress of social networks and context-awareness. However, studies have shown that although these sites are designed to acquaint people with those they do not know, users tend to select their acquaintances as friends and use these sites to facilitate communication with their real-world friends [40].

The current architecture of online social networks is client-server based and relies on cloud technology for storage and processing. Online SNSs are stored and managed on a server - whether the server’s actual deployment is a distributed system or a single machine - and accessed by clients through a web interface. However, some localized applications build their own social networks into the users’ mobile devices; these range from the list of social contacts in the mobile phone to a small custom-built social profile that can be accessed and managed by a small set of users’ phones and/or by the application for which this social list is built. It is worth mentioning that the supporting infrastructure for SNSs is the prevailing cloud technology, which supports the backend distribution of storage and processing of social networks and thus enables scalability and invisibility. As an alternative, many mobile social applications and local social networks rely on a peer-to-peer network architecture [83].

As the popularity of SNSs increases, the critical issue of privacy protection has garnered more attention. Many researchers have proposed partial solutions to this issue, such as architectures for user privacy on SNSs like FaceCloak, proposed by Luo et al. (2009). Their solution aims to hide the user’s personal information from unauthorized users by storing it in encrypted form on a separate server, while maintaining the services offered by the site [84].

2.2 Social Pervasive Systems

While many research efforts have concentrated on either mobile context-aware systems or social systems, few have merged both to maximize the benefits attained from each. To support this argument, we traced publication trends in this field. We used Google Scholar to survey the number of publications from 1975 to 2016 that either addressed one of the indicated areas alone or combined both areas. The trends illustrated in Figure 2.3 indicate modest activity in research that combines mobile context-aware systems and social networks. We believe that fusing the context extracted from sensors readily available in mobile devices with the wealth of information available in OSNs can produce a powerful new generation of applications serving a wide range of domains.

In this section, we briefly trace the progress of the technology that enables the infrastructure and has motivated research into this cross-pollination. We then navigate in time to explore early attempts at merging the two domains, followed by a review of more recent activities that are more systematic but nonetheless remain relatively unstructured. Finally, we define SPSs and illustrate their fundamental features.

2.2.1 Enabling Technology

Advancements in affordable mobile technology, including higher processing power, larger memories, and better displays, have paved the way for the merger between social and mobile context-aware systems. Notable advancements have also emerged in mobile networking, from the introduction of Bluetooth in 1998 up until the launch of 4G technology in 2009. Both lines of advancement have enabled the creation of handheld devices that, to a great extent, continue to compete to replace larger non-mobile computational devices. Overall, mobile networking has become ubiquitous amongst a large population of mobile users, thus fostering easier access to OSNs. Simultaneously, since its inception in 2000, mobile sensor technology has facilitated greater innovation in smartphones. Devices continue to be equipped with smaller and more affordable sensors, which allow them to be more aware of their surroundings and further prompt the creation of more intelligent applications that promote better usability and sensitivity to user needs. Ongoing improvements in sensor technology, devices, and networking infrastructure have together contributed to a greater degree of symbiosis between social networks and mobile devices, have strengthened the popularity of mobile social network applications, and have facilitated smoother access to online social network sites.

2.2.2 Evolution of SPS

From our observations, there were several modest attempts at building early SPSs during the late 1990s. The year 2007 marked a new era of closer fusion between both domains, as demonstrated by the emergence of a set of better social and context-aware applications.

Phase 1 (1990-2006):

Early attempts to implement primitive social pervasive applications were undertaken in the field of commercial products. In 1998, the first commercial product for introduction systems (Lovegety in Japan) was launched [38] where users with matching profiles were introduced as they came into proximity with each other. In 1999, Hummingbird was launched as the first interpersonal awareness device for collaboration through the augmentation of instant messaging and email [38]. However, these applications required special devices and manual feed in order to generate a primitive social profile.

Early attempts to fuse mobile sensor technology with social networks were made in 2001, when Campus Aware [34] superimposed visitors’ impressions of specific locations on top of a university campus map, thus enabling visitors to share their experiences of the place.

In 2002, mobile devices became capable of user collocation pattern detection, whereby common friends were sought to introduce unacquainted persons who shared common social patterns to each other [85].

For example, the Experience Project placed proactive displays at the UbiComp conference next to “tag readers” in order to display presenter talks that matched the profiles of users who approached the display while wearing their own tagged conference badges [38]. In this case, user profiles were stored in a local database.

In 2004, Bluetooth technology was introduced, further advancing research into the merger between SNSs and pervasive systems. The first mobile phone application that could create an urban community through repeated Bluetooth scans, logging nearby phones and facilitating communication between strangers, was called Jabberwocky [38]. Another such application, Social Serendipity, combined Bluetooth with staged localized social profile systems and proximity information in order to achieve a semblance of context-awareness [38] and social profile matching.

In 2006, Eagle et al. introduced the concept of Reality Mining by utilizing mobile phones as behavioral sensors, thus proving their suitability for sensing human behavior. By analyzing the collected data, they built a system that could infer relationships and sense complex social systems [86]. The majority of the aforementioned research neither included the various forms of context nor had the capacity to communicate with external social networks.

Phase 2 (2007-present):

The recent integration of context-aware techniques, sensor technology, social networks, and mobile technology has led to fruitful research in context fusion. For instance, in 2007, the SOCIALNETS project studied physical and electronic social networks in order to construct an opportunistic virtual and adaptive social network that could provide knowledge and content management to pervasive applications [87].

Some studies proposed the use of local social networks as a means of ensuring privacy preservation and trust management policies in location-based systems given that they enable localization within wireless networks via opportunistic networks [88].

Figure 2.3: Trend of Research in Social Networks, Pervasive Computing and Joint Research since 1975

More recent work incorporates the social information retrieved from OSNs into context-aware systems in order to produce a new form of contextual information that can be used to provide better service [41]. Social networks and context-aware systems exchange information in order to provide more intelligent services. On one front, social networks seek awareness of the user’s current context in order to provide the best matching service; some research applications detect the user’s current context using mobile device sensors, feed that context to the user’s social network and, accordingly, infer the best actions [89] [54]. It is also possible to integrate user experience in virtual worlds with the real world through context-aware bridging systems. Through Avara, for example, current user actions - as extracted from the mobile sensors - are fed into the user’s account in a 3D virtual world such as Second Life [90].

In 2009, a number of significant contributions were made, including: social context-aware browsing to improve search techniques [91], the application of the social approach to resolving group conflicts in context-aware systems [92] [41], social context-aware services that could feed into social networks [54], adaptation and support for manual control of context-aware systems [41] to gain users’ trust and confidence, social pervasive e-tourism that could infer user context and mobility profiles to provide recommendation services [93] [35], and the integration of online user profiles with face-to-face presence [94].

In 2010, an even closer fusion between context-aware systems and online social networks brought about the following contributions: context fusion to improve new forms of sensors such as the calendar [95], logical sensor generation for individual/group action recommendation [41], a broker-based social matching service that supports opportunistic social networking in DTNs [96], the recommendation and monitoring of social relations as proposed by FriendSensing and SensingHappiness [36], and finally the integration of user experience in virtual worlds, such as Second Life, with the real world by feeding current user actions - extracted from the mobile sensors - into a virtual world account [90]. A number of social-aware opportunistic message forwarding approaches constitute another emerging branch of joint research on social systems and context-aware systems. These approaches rely on the inferred social graph, combined with the users’ current context and context history, in order to optimize forwarding performance. For instance, Profile-Cast [25] selects forwarders based on the users’ mobility profiles and seeks the destination nodes that belong to a common preset profile. Another well-known example is the BubbleRap forwarding algorithm [97], which relies on the “community” feature when forwarding messages among mobile nodes.
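To make the role of the “community” feature more concrete, the following Python fragment gives a simplified sketch of a community-based forwarding decision in the spirit of BubbleRap as it is commonly described: a message climbs a global popularity ranking until it meets a node in the destination’s community, and then climbs the local ranking inside that community. The `Node` class, its fields and the rank values are assumptions made for illustration; [97] remains the authoritative description of the algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical view of a mobile node for a community-based forwarder."""
    name: str
    communities: set = field(default_factory=set)
    global_rank: float = 0.0                        # popularity in the whole network
    local_rank: dict = field(default_factory=dict)  # community label -> rank


def should_forward(carrier, encountered, dest_community):
    """Decide whether `carrier` hands the message to `encountered`.

    `dest_community` is the community label of the destination node.
    """
    carrier_inside = dest_community in carrier.communities
    encountered_inside = dest_community in encountered.communities

    if carrier_inside:
        # Already inside the destination community: never leave it,
        # and only move to a node that ranks higher locally.
        if not encountered_inside:
            return False
        return (encountered.local_rank.get(dest_community, 0.0)
                > carrier.local_rank.get(dest_community, 0.0))

    if encountered_inside:
        # First contact with the destination community: always hand over.
        return True

    # Both nodes are outside: bubble up the global ranking.
    return encountered.global_rank > carrier.global_rank


if __name__ == "__main__":
    a = Node("a", {"x"}, 0.2, {"x": 0.5})
    b = Node("b", {"y"}, 0.9, {"y": 0.1})
    print(should_forward(a, b, "y"))
```

The sketch omits message copies, buffer management and community detection itself, all of which a real forwarding scheme would have to address.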

Having examined each of these contributions, we now take a broader view and consider the general trends observed in Figures 2.1 and 2.2. Extrapolating from these trends, we believe that a major new research thrust is emerging, which we define as Social Pervasive Systems (SPS).

2.2.3 Definition and Features of Social Pervasive Systems

Social Pervasive Systems (SPSs) are created by the cross-pollination of mobile systems and social systems that draw extensively on mutual context. They inherit the main features of both domains, yet are characterized by a new breed of features.

From pervasive systems, any SPS inherits context-awareness, invisibility, the ability to handle data received from diverse sensors, the ability to deal with heterogeneity, context management, proactiveness, adaptation, security preservation and scalability [31]. From social systems, any SPS inherits social profile generation, social communication, interest group generation, media sharing, peer commenting, tagging, application sharing, friendship network generation to enhance social networking, and the provision of several levels of privacy settings [40].

In addition to the above, SPSs have their own set of unique and prominent features - as we envision them - due to the fusion of significant attributes of both ancestors and the co-influence of both domains.

In such systems, social networks will provide both low-level and fused social context to mobile devices using logical social sensors. In turn, SPSs can influence social networks by exploiting the mobile devices’ ability to sense the surrounding environment in order to improve the social networks’ awareness and intelligence.

To be successful, these systems must manage potentially massive amounts of diverse data in real time to avoid latency, outdated context inference, and inappropriate SPS actions and recommendations.

For further SPS performance enhancement, we recommend a set of optional features such as the inclusion of certainty levels with situation identification, optional user customization of the system at all stages to improve system flexibility, and optional utilization of cloud technology in data storage distribution and processing. Finally, SPSs may rely on opportunistic networks to widen the range of applications and provide dynamic adaptation based on communication.

2.2.4 Applications

With this cross-pollination, a breed of new application families emerges to provide services with a high level of social impact. In this section, we describe five different families of applications that may emerge from such a merger. Figure 2.4 illustrates these application families along with the context awareness features and social awareness features that support them.

Figure 2.4: Application Families in Social Pervasive Systems.

1. Monitoring Social Behavior:

These systems monitor the cumulative behavioral profiles of the social groups within a certain population in order to categorize them into sub-populations that share common features, economic conditions and health conditions. These systems operate in two phases: first, gathering the collective behavioral profiles from both context-aware mobile systems and social networks, and second, analyzing and grouping these profiles to deduce the common sub-populations. The data gathering phase mainly relies on collaborative, large-scale sensing among people’s mobile phones combined with social information collected from social systems. Significant projects support this phase, such as Reality Mining [86] and “Big Data” [98], which provide massive amounts of social/behavioral data gathered from mobile phones. The second phase requires significant research contributions in the field of high-level context inference to deduce and analyze the common social profiles among social sub-populations.

Such social behavior monitoring systems are valuable to a wide spectrum of applications, such as healthcare applications that target specific populations; economic or even political applications that target certain sub-populations; urban sensing; traffic congestion monitoring; and the analysis of social interactions. Furthermore, fusing social tagging with context-awareness and multimedia technology enriches Social Multimedia [80], which constitutes a rich resource for social context-aware search engines and social studies [35].

2. Social Persuasive Applications:

Effective persuasive applications, such as social pervasive advertising, would employ both mobile context and social context, and can be implemented in diverse fields ranging from advertising to education. One such use could be to distribute customizable and appealing advertisements.

Such applications can persuade clients to purchase certain kinds of products or services by having prior knowledge of how specific products or services relate to the customers’ mobility or particular social interests. Further information about social contacts, their preferences, and the degree of their influence on the subject in question can even help facilitate the use of effective persuasion tactics. Furthermore, persuading users to move to certain regions in a commercial zone can help to decongest physical areas by redirecting customers to areas of lower commercial value, rather than merely encouraging users to purchase a product or service.

Persuasive systems can also support educational applications. Since they rely on users’ social profiles, current context, and the context of friends and peers, they would be able to fuse this data to produce information about the academic progress of users and compare it to that of their peers as an incentive to perform better academically. Peers who interact more with a user may even be more influential than those who interact less, so ranking social contacts by interaction could be very useful for applications of this kind.

3. Socially Influenced Context-Aware Systems:

By utilizing knowledge of mobility patterns, social interactions, and social and behavioral profiles, such systems can deduce future actions and provide suitable services accordingly. Among the services provided by this family of applications is the identification of mobile nodes capable of forwarding information in a way that is sensitive to their behavioral profiles and their willingness to perform such actions. Social-aware ad hoc message forwarding approaches constitute a vibrant research area that can benefit from this application family; we propose incorporating into this application family the following specific features: interest, awareness of the node’s remaining power, willingness to forward messages, activeness and social ranking, in addition to other context and social-aware parameters. Another field where SPSs are of value is social/context-aware search engines, which can be guided by users’ current context, social preferences and browsing history.
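One way to picture how the features listed above could jointly guide forwarder selection is a simple weighted score. The Python sketch below is purely illustrative: the `Candidate` fields are assumed to be normalized to the range 0 to 1, the equal weights and the minimum-power threshold are hypothetical, and the fragment does not represent the forwarding scheme developed later in this thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-node features, each assumed normalized to [0, 1]."""
    name: str
    interest: float          # interest in the message content
    remaining_power: float   # fraction of battery left
    willingness: float       # willingness to forward for others
    activeness: float        # how active the node is in the network
    social_rank: float       # rank of the node in the social graph


# Illustrative equal weights; a real system would tune or learn these.
WEIGHTS = {
    "interest": 0.2,
    "remaining_power": 0.2,
    "willingness": 0.2,
    "activeness": 0.2,
    "social_rank": 0.2,
}


def forwarding_score(candidate, weights=WEIGHTS):
    """Weighted sum of the candidate's features."""
    return sum(w * getattr(candidate, name) for name, w in weights.items())


def select_forwarders(candidates, k=1, min_power=0.1):
    """Return the k best-scoring candidates that can and will forward."""
    eligible = [c for c in candidates
                if c.remaining_power >= min_power and c.willingness > 0.0]
    return sorted(eligible, key=forwarding_score, reverse=True)[:k]


if __name__ == "__main__":
    nodes = [
        Candidate("a", 0.9, 0.05, 1.0, 1.0, 1.0),  # nearly out of power
        Candidate("b", 0.5, 0.8, 0.6, 0.4, 0.7),
        Candidate("c", 0.2, 0.9, 0.9, 0.2, 0.3),
    ]
    print([c.name for c in select_forwarders(nodes, k=2)])
```

Filtering on remaining power and willingness before scoring reflects the point made in the text that forwarding should be sensitive to what a node is able and prepared to do, not only to its social position.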

4. Alerting Systems:

Alerting systems monitor user behavioral patterns and gather current user context from various logical and physical sensors to infer any changes in behavior, and to deduce whether such changes demand that users be notified about a particular phenomenon. If an emergency or other critical status is detected, these systems alert users or their social contacts. Alerting systems can infer the appropriate social contacts to notify based on the ranking of social interactions in the given social graph.

Alerting systems can also be utilized in applications that notify users of their peers’ current activities or progress in order to motivate them towards better performance. We foresee the benefit of alerting systems in broader contexts, such as improving a nation’s economic status by analyzing the outcomes of the social behavior-monitoring systems in order to pinpoint any sub-populations on the verge of economic danger. Furthermore, these applications may alert governments to specific sub-populations in need of direct support, and thus help avoid approaching economic crises.
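The selection of contacts to notify, based on the ranking of social interactions, can be sketched as follows. The fragment assumes that the social graph is available as a log of (sender, receiver) interaction pairs and ranks contacts by simple interaction counts; the function name and the log format are hypothetical, and a deployed system would likely also weigh recency and the type of interaction.

```python
from collections import Counter


def contacts_to_alert(interactions, user, top_n=2):
    """Rank the user's contacts by number of interactions and return the top ones.

    `interactions` is an iterable of (sender, receiver) pairs.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter()
    for sender, receiver in interactions:
        if sender == user and receiver != user:
            counts[receiver] += 1
        elif receiver == user and sender != user:
            counts[sender] += 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


if __name__ == "__main__":
    log = [
        ("sam", "ann"), ("ann", "sam"), ("sam", "bob"),
        ("cat", "sam"), ("cat", "sam"), ("cat", "sam"),
        ("bob", "dan"),
    ]
    print(contacts_to_alert(log, "sam"))
```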

5. Social Recommender Systems:

These constitute one of the most valuable potential applications of SPSs. Here, recommendations are made based on situation/context inferred from the fusion of social information, user and peers’ context and behavioral profiles. Situation/high-level context inference is a vibrant research area that heavily relies on context information that is extracted from OSNs, mobile devices, logical and physical sensors which together are combined with machine learning approaches.

We foresee that these systems will be of great value in work environments, where they may recommend to managers the optimal group of employees for certain tasks based on an analysis of social interactions and personalities obtained from their social profiles. These applications can also provide common services within the workplace to all employees based on context. Such services can be adapted to suit specific mobile device capabilities or the current requirements of the task at hand.

Other types of recommender systems include social context-aware learning management systems that can, for example, customize study material based on a student’s level of comprehension, recommend helpful interest groups, and even persuade students to progress in their studies through alerts about their colleagues’ progress. Such systems can also generate study groups based on students’ common study habits. On another front, the application may alert instructors to students facing difficulty in comprehending certain lessons, recommend suitable methods for helping those students, and refer instructors to particular references that would help improve their own teaching skills, all based on information obtained jointly from the pervasive and social worlds.

Figure 2.5: Challenges and Proposed Solutions for Social Pervasive Systems

2.2.5 Challenges

To realize our proposed SPS application families, the research community needs to address a number of challenges. We believe the following are the most prominent; they are illustrated in Figure 2.5 and detailed below:

1. Intelligence and Invisibility

Given the large spectrum of contextual data an SPS can obtain from both the mobile and social worlds, further challenges of intelligence and invisibility arise. Irrespective of the number of sources of contextual information present in such systems, the need for user intervention in processing should always be minimized. Issues such as the use of different social networks, authentication credentials, and diverse mobile platforms should all be invisible to users.

2. Extreme Heterogeneity

Above and beyond typical heterogeneity issues in pervasive systems, SPSs face magnified heterogeneity challenges given that they combine social networks and mobile systems.

Challenges involving access to various social networks such as Facebook, Twitter or LMSs include, among many others, variations in structure (based on the types of context that can be retrieved) and variations in the APIs offered for access. In addition, numerous communication infrastructures, differences among interacting devices, and multiple login credentials for the interacting social/mobile systems still exist.

3. Power Conservation

Although power and energy are common challenges in pervasive systems, the massive information exchange that may be involved in systems combining the mobile and social worlds imposes an even greater challenge to the conservation of power. Higher communication requirements also impose a significant challenge due to the battery depletion they incur on mobile devices.

Moreover, mobile opportunistic networks and mobile ad hoc networks rely mainly on the mobile nodes’ resources for processing and decision-making. Consequently, the involved mobile nodes risk exhausting their limited power, storage and computational resources.

4. Real-time Constraints

The trade-off between real-time management of input data and processing time consumption requires careful consideration. The rigid time constraints expected in many SPS applications are also coupled with the need for huge storage space and large processing power in order to process large amounts of collected contextual data.

5. Security and Privacy

Privacy problems persist when exchanging and utilizing social and mobile information, especially in systems that use hosts whose reputation and trust are not well defined [33]. Unsuitable privacy levels - especially when social information is utilized - create the risk of discouraging users from participating in such systems [99].

6. Scalability

Scalability is an inevitable challenge that needs further research efforts. The majority of SPSs offer services in localized areas. Once the range of services increases, performance typically deteriorates.

Furthermore, processing massive and noisy real-time data, the risk of latency, and the inference of context from data with varying degrees of uncertainty all constitute the challenges arising from scalability [100].

7. Conflict Resolution

On another front, as different sources of contextual information from mobile and social networks are used, conflicting contextual information becomes a notable risk. Furthermore, since these applications tend to serve groups of users, they might face the challenge of satisfying the conflicting needs of these users. Several approaches are being researched for the sake of group satisfaction and for prioritizing users’ needs and interests [101] [92].

2.3 Opportunistic Networks

Exponential advancements in mobile technologies - in terms of advanced sensors and various wireless network capabilities - have enriched mobile devices with intelligent features, making them ideal candidates for pervasive systems. Alongside these advancements, social networks which have been granted seamless accessibility from mobile devices have given rise to a gold mine of contextual information [7][8].

The resulting ecosystem that merges the social world with the mobile world, all supported by associated technologies, enables the creation of a set of smart services and applications. However, given the current 0.9 Exabytes mobile data volume that is expected to reach 11.2 Exabytes in 2017 [6], network infrastructure is bound to become overloaded, and users can experience occasional network service contention or unavailability. Despite the ubiquitous network advantages that have reached LTE and 4G, some connectivity problems have emerged. For example, users suffer from rising cost of service delivery [9]; not all devices are connected with predefined routes; and not all places are reachable. These obstacles have motivated ad-hoc communications [10] [102], delay tolerant networks [11], and opportunistic networks [12] to act as a complementary infrastructure that enables communication in environments with disrupted connections. Therefore, the reliance on ad-hoc connections among mobile nodes to forward content in a local area offers partial relief from network infrastructure overload.

Opportunistic networks are classified into pocket switched networks (to which mobile opportunistic networks belong) [103] [104], vehicular networks, amorphous opportunistic networks, and sensor networks [105]. As an evolution of MANETs, opportunistic networks open wide opportunities for research owing to their opportunistic and wireless connectivity [13] and their human-centric nature. However, they face a set of challenges, namely: intermittent connectivity; the absence of end-to-end connectivity; heterogeneity of devices, which leads to a need to harvest information from diverse resources; scarce resources; the absence of infrastructure; delay tolerance; routing and forwarding challenges; high node mobility and thus a dynamic topology; selfishness of some participant nodes; trust; and privacy and security, among other challenges [105].

2.4 Social-Aware Opportunistic Forwarding Algorithms

Given the evolving communications ecosystem, merging mobile technologies and social networking, social-aware opportunistic forwarding algorithms [106][25][24] constitute one of the most promising approaches for ad-hoc communications. These algorithms take advantage of social relationships among mobile holders in any given place and forward messages accordingly. The most advanced forwarding algorithms currently available can be classified into three main categories: 1) the power-oblivious social-aware opportunistic forwarding algorithms which rely on social awareness and interest, but do not pay attention to power awareness when making forwarding decisions [22][106][25][24]; 2) the social-oblivious power-aware and energy-efficient routing algorithms [107][108][109] which seek efficient energy routes, but do not capitalize on other contextual information such as contact frequency, mobility patterns, and the usage profile of devices; 3) the social-oblivious power and context-aware opportunistic forwarding algorithms which consider power and context information, but do not exploit social information [27][110]. Forwarding algorithms target either a single destination or a group of nodes. The group-oriented forwarding algorithms identify destination nodes based on either their node IDs or a common profile. Our research domain targets the profile-based forwarding algorithms developed for mobile opportunistic message delivery. In order to forward content among mobile nodes, these algorithms currently compute a rank per node in the process of selecting the optimum forwarder nodes [111]. Social forwarding algorithms include the node’s social rank in making the forwarding decision. However, most of these algorithms encounter a set of challenges in maintaining effectiveness and efficiency in performance.
In previous work, we reviewed the challenges facing these algorithms and focused on four main challenges [112]: the incentive-oblivious forwarder selection process; neglect of the power capabilities of the nodes in place; the limited contact durations among nodes; and the forwarding algorithms’ unfair utilization of the nodes’ resources.

2.4.1 Power-oblivious, Social-Aware Opportunistic Forwarding Algorithms

Many social-aware forwarding algorithms do not take power awareness into account in forwarding. These algorithms rank nodes according to any one or a combination of the following social metrics: the nodes’ social popularity [22]; common interests [113][114][115]; common mobility behavior [25]; community-based metrics [116][117]; and activeness in connectivity with other nodes [24]. In all these ranking metrics, the nodes that rank the highest are the ones that are most likely to successfully deliver messages based on their higher probability of encountering destination nodes. This category of algorithms favors the higher ranked nodes. However, some of these algorithms, such as PeopleRank [22] and IPeR [113], balance between social awareness and the opportunistic selection of forwarder nodes; these algorithms sustain delivery in socially-unrelated communities. Furthermore, very few algorithms rely on nodes that have a similar interest to that of the forwarded content, a concept that was introduced by IPeR [113] in our previous work, and by the “Interest-Aware content distribution in DTN” algorithm [115].

From another dimension, the social-aware forwarding algorithms are classified based on the way they identify the destination nodes: pre-identified target algorithms or profile-based target algorithms. The first group targets specific destination nodes according to their explicit identification [22][106][97][24], while the second group targets groups of destination nodes by their social/mobility-behavioral profiles [25][113].

Only a few social-aware forwarding protocols - among them SocialCast [24] - mention awareness of the node’s remaining power as a form of context awareness. However, they have yet to provide proofs or experiments for the proposed integration of power and context awareness. Our analysis of SocialCast found it to be very demanding, both computationally and storage-wise, to keep records of the node’s co-location with any subscriber on a real-time basis while also maintaining its mobility pattern history. Furthermore, SocialCast burdens the network with the exchange of control messages among all the nodes in a specific area.

2.4.2 Social-oblivious, Power-Aware, and Energy-efficient Routing Algorithms

Due to the energy constraints placed on nodes in ad hoc networks, it is crucial to design power-aware routing protocols with a view to maximizing the lifetime of both the nodes and the network itself. Some of these protocols target routes that cost the least amount of power in order to minimize power consumption, but these may deplete the battery of some forwarder nodes, thus reducing network lifetime [104][118]. Other approaches, such as PILOT [107], may use a particular route that will consume more power in order to avoid using nodes whose batteries are depleting. Such approaches mainly maintain energy efficiency by combining the awareness of the node’s power with another cost function as part of the forwarder selection process [109]. However, all these algorithms overlook the social relationships dimension when making forwarding decisions.

To maximize network lifetime, power awareness and lifetime prediction routing protocols seek routes that minimize the variance among the nodes’ remaining power. While such protocols improve the network lifetime, they tend to create additional control traffic [119]. Seeking fairness by minimizing the energy consumed per node, some protocols such as CMMBCR [108] select the route that consumes the least total transmission power as long as the remaining power of all the nodes on that route falls above a certain threshold value; otherwise, route selection is based on another cost function. However, the performance of such algorithms varies according to the selected threshold value [119][120].
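The threshold-based route selection described for CMMBCR can be illustrated with a minimal sketch. The data layout, the names `Hop` and `select_route`, and the inclusive threshold comparison are assumptions made for illustration. The fallback cost function, which the text above leaves unspecified, is assumed here to be the max-min battery rule (prefer the route whose weakest node has the most remaining power).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hop:
    """One node on a candidate route (hypothetical structure)."""
    node_id: str
    remaining_battery: float  # fraction of capacity, 0.0 to 1.0
    tx_power: float           # power needed to transmit to the next hop


def total_tx_power(route: List[Hop]) -> float:
    """Total transmission power consumed along the route."""
    return sum(hop.tx_power for hop in route)


def weakest_battery(route: List[Hop]) -> float:
    """Remaining battery of the most depleted node on the route."""
    return min(hop.remaining_battery for hop in route)


def select_route(routes: List[List[Hop]], threshold: float) -> Optional[List[Hop]]:
    """Pick a route following the threshold rule sketched in the text."""
    if not routes:
        return None
    # Routes on which every node still has battery at or above the threshold.
    healthy = [r for r in routes if weakest_battery(r) >= threshold]
    if healthy:
        # Among healthy routes, minimize total transmission power.
        return min(healthy, key=total_tx_power)
    # Assumed fallback: protect the weakest node (max-min battery).
    return max(routes, key=weakest_battery)
```

The sketch also makes the sensitivity to the threshold visible: raising or lowering `threshold` changes which routes count as healthy, and therefore which rule decides the outcome.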

In mobile opportunistic networks, connections between mobile nodes are transient, leading to unstable route connections. Alternative solutions seek one-to-one hop connections among these mobile nodes. So far, only a few energy-aware contributions have been made in this area, such as those made by the Energy-aware BUBBLE Rap forwarding algorithm [121] and the Energy-Aware Social-Based Multicast algorithm [122]. Both algorithms combine socially-aware routing with energy consumption optimization. However, these algorithms do not incorporate awareness of the forwarder node’s interest in the forwarded content.

2.4.3 Social-oblivious, Power and Context-Aware Opportunistic Forwarding Algorithms

The majority of the proposed context and power aware routing protocols do not consider social information in decision-making and mainly operate on static sensor networks or wireless ad hoc networks [110]. For instance, the context aware opportunistic routing protocol, SCAR [27], allows efficient routing of mobile sensor data to the destination nodes via the best path selection. In path selection, SCAR relies on the candidate nodes’ collocation history with destination nodes, on the degree of change in their connectivity, and on their current power, but it does not consider social awareness. Like SocialCast, SCAR overburdens the network with the regular exchange of control messages among all the nodes present. Furthermore, as a result of the limited power resources of the sensor nodes in wireless sensor networks (WSN), many power-aware and energy-efficient routing protocols propose solutions for WSN and yet pay little attention to opportunistic networks [110].

Most of the above-mentioned algorithms encounter challenges in terms of maintaining effectiveness and efficiency in performance. In previous research [112], we reviewed the challenges encountered by these algorithms and focused on four main categories of challenges pertaining to: 1) the incentive-oblivious forwarder selection process; 2) neglect of the power capabilities of the nodes in place; 3) the limited contact durations among nodes; and 4) the forwarding algorithms’ unfair utilization of the nodes’ resources.

To the best of our knowledge, the work presented in this thesis is the first attempt to integrate interest awareness and power awareness in social-based opportunistic forwarding algorithms. It is also the first to propose an integrated solution for the challenges mentioned above.

2.5 Space-Syntax-based Forwarding Algorithms

2.5.1 What is Space Syntax?

Space Syntax was initially proposed in the field of architecture by Hillier and Hanson in 1984 to model natural mobility patterns by analyzing spatial configurations. It accurately predicts natural movement patterns in an area based on how its segments are connected [28] [29] [30]. Space Syntax opportunistic forwarding exploits the infrastructure of a particular place and the popularity of specific attraction points in that place in order to predict user mobility patterns, flows and areas of possible congestion. As such, Space Syntax comprises a set of techniques for analyzing spatial configuration, and is derived from a set of theories linking space and society. Using Space Syntax, one can relate spatial configuration to where people are, how they move, and how they adapt to space [30].

Space Syntax offers a set of measurable metrics that quantify the effect of road maps and architectural configuration on natural movement; it accurately predicts natural movement patterns in an area based on how its segments are connected to one another. Space Syntax metrics are pre-calculated once using static information about the environment and do not need to be re-calculated unless the environment changes [28].
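To make the idea of pre-computed, connectivity-based metrics concrete, the following minimal sketch computes two standard Space Syntax quantities on a small segment graph: mean depth and relative asymmetry, RA = 2(MD - 1)/(k - 2), where lower RA indicates a more integrated segment. These are the textbook definitions and are not taken from this thesis. The graph, the function names, and the assumption of a connected graph with more than two segments are illustrative only.

```python
from collections import deque
from typing import Dict, List

Graph = Dict[str, List[str]]  # segment -> directly connected segments


def mean_depth(graph: Graph, start: str) -> float:
    """Average number of steps from `start` to every other segment."""
    depth = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search gives shortest step counts
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in depth:
                depth[neighbour] = depth[current] + 1
                queue.append(neighbour)
    others = len(depth) - 1
    if others == 0:
        raise ValueError("graph must contain at least two connected segments")
    return sum(depth.values()) / others


def relative_asymmetry(graph: Graph, start: str) -> float:
    """RA = 2 * (MD - 1) / (k - 2); assumes a connected graph with k > 2."""
    k = len(graph)
    if k <= 2:
        raise ValueError("relative asymmetry needs more than two segments")
    return 2.0 * (mean_depth(graph, start) - 1.0) / (k - 2)


# Hypothetical layout: corridor B connects to rooms A, C and D.
layout: Graph = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
```

Because the graph encodes only static layout, values such as these can be computed once and cached, which matches the observation above that the metrics need recalculation only when the environment changes.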

2.5.2 Related Work

We located a number of studies that looked at Space Syntax specifically in terms of how it affects or controls the social behavior of humans in a particular place and which inferred human mobility and social interaction from these behavior patterns [3] [123] [124] [125].

The studies that have built on Space Syntax in order to improve the opportunistic forwarding process include:

1. Space Syntax Forwarding by Popularity Index [28]: Building on Space Syntax, the authors introduce a new Space Syntax metric, namely the popularity index. This new metric is computed as a function of the segment’s integration value, its location index, and its attraction value. Based on the computed segments’ popularity indexes, the popularity index of any attraction point on the map is then calculated. The authors demonstrate the effect of the newly defined metric by proposing the following two space-syntax-based forwarding algorithms:

• The SS-Cutoff algorithm prohibits message forwarding to nodes that are geographically located in areas whose popularity falls below a certain configurable threshold, while it forwards messages, with a given probability, to nodes in areas whose popularity is above or equal to the threshold.

• The SS-Sliding algorithm forwards messages to nodes according to a variable probability that is inversely proportional to the popularity of the area in which these nodes reside. Thus forwarding becomes less aggressive in areas that are more popular so as to decrease cost.

However, the limitation of the proposed algorithms is that they rely on first finding and establishing a route to the destination node before initiating the forwarding process. Satisfying this condition could be difficult in disrupted networks, and thus limits the area of application of their proposal.

2. Select & Spray [29]: This is a Space Syntax based opportunistic forwarding algorithm that is applicable on a city-wide level. The authors use Space Syntax information to guide opportunistic forwarding based on the likelihood of a location to attract mobile nodes. The approach also depends on a server/client setup rather than a purely opportunistic network approach. The server consists of a ’data collector’, a ’recommendation engine’ and a ’select engine’ that collect and memorize registered clients’ mobility data in addition to a priori knowledge of the city map and the position of attraction points. The algorithm then computes a maximum probability utility function which ranks all nodes in the network based on their probability of reaching a destination, and then selects the best subset of connected nodes that will be used to disseminate a message with the least potential cost and delay. Being based on a city-wide range and a server/client configuration, this application is rather costly in terms of computations, communications and resources.

3. In a study conducted by Mtibaa et al. [126] that exploits Space Syntax for Mobile Opportunistic Networks, the authors survey the field of opportunistic forwarding in small-scale urban environments and categorize the existing algorithms according to the assumptions on which they base the forwarding decisions. The authors illustrate that these algorithms - as they rely on more complex assumptions - achieve more efficient performance, “rendering solutions built on them unusable outside their intended environment categories” [126]. The proposed algorithms assume that the forwarded messages may not be fragmented, and are designed to ensure there is enough time for nodes to exchange messages. The forwarding decisions are based on static information such as the city map and attraction points or, alternatively, on local information regarding the node itself.

The authors propose the following three sample algorithms, each of which represents one of their defined categories:

• Assisted Space Syntax Opportunistic Forwarding (ASOF): This forwarding approach is based on the nodes’ awareness of a map layout, the nodes’ position, the position of the destination internet access point locations, and the nodes’ mobility patterns.

• Location-Aware Space Syntax Opportunistic Forwarding (LASOF): This forwarding approach is based on the nodes’ awareness of a map layout, the nodes’ position, and the position of the destination internet access point locations.

• Space Syntax Opportunistic Forwarding (SOF): This forwarding approach is based on the nodes’ awareness of a map layout.

4. Space Syntax and Pervasive Systems: In a study of how innovative the application of Space Syntax in the domain of Pervasive Systems can be, the author applied Space Syntax metrics in different phases of the development of pervasive systems [3]. The author proposes three case studies that demonstrate the role of Space Syntax as a simple application module, as an explanatory tool, and as a modelling tool.

5. Weighted PageRank approach [123]: This study demonstrates that the underlying space-space topology of an urban environment is far from random. Rather, it exhibits small world and scale free structures. The study also demonstrates that the weighted PageRank approach can accurately predict human movement in an urban environment.
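The two popularity-driven rules described in item 1 above, SS-Cutoff and SS-Sliding, can be sketched as forwarding probabilities. This is an illustrative reading of the description, not the implementation from [28]: the function names, the `scale` constant of proportionality, and the clamping of the probability to 1 are all assumptions.

```python
import random


def ss_cutoff_probability(popularity: float, threshold: float,
                          forward_prob: float) -> float:
    """SS-Cutoff: never forward below the threshold, otherwise use a fixed probability."""
    return forward_prob if popularity >= threshold else 0.0


def ss_sliding_probability(popularity: float, scale: float = 1.0) -> float:
    """SS-Sliding: probability inversely proportional to area popularity.

    `scale` is a hypothetical proportionality constant; the result is
    clamped to 1.0 so that it remains a valid probability.
    """
    if popularity <= 0:
        return 1.0  # assumed behaviour for areas with no recorded popularity
    return min(1.0, scale / popularity)


def should_forward(probability: float, rng=random) -> bool:
    """Draw a forwarding decision with the given probability."""
    return rng.random() < probability
```

For example, with `scale=1.0`, an area of popularity 2.0 yields a forwarding probability of 0.5, and more popular areas yield lower values, which reflects the less aggressive forwarding in popular areas noted above.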

All the above-mentioned research contributions exhibit the following limitations:

1. The few contributions that combine Space Syntax metrics with other opportunistic forwarding techniques mainly consider expansive areas at a city-wide level. They do not consider small-scale applications such as a campus-wide or convention center application.

2. Some of these contributions consider social graph metrics (such as closeness centrality and betweenness centrality) in addition to the contact graph, but no proposed algorithm so far considers the forwarder nodes’ interest in the forwarded content.

3. The Popularity Index approach considers mobile destination nodes [28], while the ASOF approach and its peer algorithms consider destination locations such as streets or static access points [126] [29]. Thus, the latter algorithms cannot be applied to mobile destination nodes.

4. None of the Space Syntax-based contributions proposed to date takes power consumption into consideration, and some complex algorithms consume a great deal of energy. The preservation of energy is crucial in limited-resource mobile opportunistic networks.

5. The algorithms proposed by Mtibaa et al. [126] [29] rely on the mobile node’s continuous re-computation of the weights of its own contact graph to re-compute its utility function. Practically speaking, these computations consume the node’s resources, especially when this approach is applied on a city-wide scope. It is worth noting that the node needs to compute this graph for each forwarded message, since each message has its own set of destination locations/access points. Given that most of the messages will be forwarded through the betweenness-central mobile nodes, this computational consumption leads to unfairness by exhausting these nodes’ resources.

A more detailed comparison among these algorithms is illustrated in Table 3.1.

2.6 Proposed Solutions

In this section, we propose solutions that address some of the aforementioned challenges. These proposed solutions are illustrated in Figure 2.5. In order to achieve intelligibility, better situation inference algorithms are needed to better accommodate user needs and expectations, and to gain user confidence in the intelligibility of applications. Possible approaches for situation inference would require triggering combined sets of rules based on multitudes of observed activities such as recent user social interactions, response to recent communications and change in routine behavioral profiles.

Furthermore, applications must gain intelligence and become adaptive in selecting which social/context information to rely on in making decisions, based on the context information available in the current environment. For instance, several social-based forwarding algorithms, such as the PeopleRank social forwarding algorithm [22], assume the existence of a fully connected social graph among the members of the community. This assumption, however, does not hold in all environments; in a mall environment or a conference venue, not all the existing members hold social relationships with the other members of that transient community. Therefore, an intelligent application would need to introduce another social relationship that encompasses the majority of the members in this environment, and that either replaces or at least integrates with the initial social metric, so that it can improve the performance of the approach and boost its intelligence. For example, the users that visit a mall environment share common interests in the products or services offered in this mall. The availability of such an interest-based social metric is crucial in decision making and as such would provide a more discerning service without explicit user intervention. Another significant type of contextual information that some applications do not take into consideration is the expected duration that two nodes stay in contact with each other. Including this piece of information in the decision making process saves applications from wasting a significant amount of power on processing and on incomplete data transfers.

To overcome the challenge of heterogeneity, it is vital to standardize the context-data exchange among various systems/sensors. Such standardization is achievable through a standard context markup language or a standard format/protocol for exchanged context information. Standardized login credentials are also required. Initial contributions propose an “identity aggregator” that stores these credentials and transparently maps the services against the proper credentials [41]. However, the “identity aggregator” idea faces its own set of challenges such as security threats, ad hoc communication, and centralized versus distributed processing. With respect to the data collection phase, further research is needed to determine the best distribution method in order to achieve optimal processing times and storage and I/O costs.

Regarding power conservation, some research contributions propose solutions such as balancing processing and communication among the resource-rich and resource-weak devices [127]. The frequency of context information updates could also be set based on urgency and the frequency of change [41]. Ongoing trends to use renewable power sources such as solar power and wind for charging the SPSs [100] could also be applied. Moreover, intelligent applications must be aware of the challenge of power conservation and intelligently select the resource-capable mobile nodes that will be able to successfully accomplish the processing task. This would prevent service failure and avoid overwhelming mobile nodes with extra processing - something that would ultimately encourage them to participate in the service.

More innovative energy harvesting techniques can be adopted such as transforming the electromagnetic waves from the surrounding objects [128] [129] [130]. The transformation of negative human energy (NHE) into an electric form suitable for charging wearable/mobile devices is an area that is yet to be developed, but which carries a great deal of potential [131] [132] [133]. A related suggestion, that would break new ground, is the inclusion of NHE-detection sensors in mobile devices, coupled with the user status extracted from the OSN, and which would guide both systems to adapt their themes/applications according to the currently detected mood of the user.

Distributed processing and Cloud computing are among the promising fields that support SPS scalability and compensate for resource deficiency in resource-weak devices. Cloud computing also shows potential in resolving the stress of massive real-time processing. Besides, we believe that predicting context offline can save time and reduce the huge burden of real-time processing.

Finally, for privacy preservation, users may set context-aware privacy settings that change according to location, time, or user mood. In addition, secure zones, through which SPSs can safely migrate their context data, could be researched.

2.7 Conclusion

In this chapter, we described what we label as Social Pervasive Systems that cross-pollinate a mutually influential mobile and social world with opportunities for new breeds of applications. We presented fruitful research attempts that sought to establish a merger between both worlds and demonstrated the lack of sufficient research in this area. Beyond that, we presented a set of new services and potential applications that emerge from this new blend, and also described some of the expected challenges such systems would face. We proposed some solutions and areas that require further research, such as defining a standard markup language for context information exchange, relying on cloud technology for processing and storage, developing new metrics for user preference prioritization to resolve preference conflicts, and establishing security zones for safe context data migration.

Chapter 3

Proposed Frameworks

Opportunistic social-aware forwarding algorithms are much needed particularly in environments which lack network infrastructure or in those that are susceptible to frequent network disruptions. Most current algorithms are oblivious to both the user’s interest in the forwarded content and the power resources of the available mobile nodes. This chapter proposes three frameworks as follows:

PI-SOFA (Power and Interest aware integration with Social aware Opportunistic Forwarding Algorithm) is a framework for integrating awareness of both the interest and power capability of a candidate node within the forwarding decision process. Furthermore, the framework adapts its forwarding decisions according to the expected contact duration between message carriers and candidate nodes. This framework is described in detail in Section 3.1.

The Space Syntax Forwarding Framework is the second framework. It proposes a set of new Space Syntax metrics that we believe would offer a more accurate representation of the popularity of a place, which is inherited by the nodes that visit this place. Accurate representation of place popularity is a significant factor in computing accurate Space Syntax metrics that highly improve the candidate ranking process, resulting in an effective and efficient social-aware forwarding approach. The proposed Space Syntax framework is defined in detail in Section 3.2.

Dynamic Adaptive Ranking is the third and most comprehensive framework. It dynamically integrates a set of parameters to compose a dynamic adaptive rank for each node. Forwarding decisions rely mainly on this dynamic adaptive rank in order to adapt to the current context. The main proposed parameters of the rank are dynamically weighted attributes which include opportunistic selection, the node’s interest and power capability, the measure of how active socially popular users are, and the popularity of the place these users frequent. This framework is defined in terms of a cognitive map in Section 3.3.

3.1 The PI-SOFA Framework

This research proposes PI-SOFA, as a framework for integrating interest-awareness and power awareness in profile-based social opportunistic forwarding algorithms. This framework integrates three main factors in making forwarding decisions: user interest in the forwarded content, user’s social rank, and the power capability of the candidate node. The expected contact duration between the message carrier and the candidate forwarder may also be included when making forwarding decisions.

Our framework is comprised of three contributions:

The first contribution elicits interest awareness when making forwarding decisions in order to better facilitate content dissemination to groups of interested nodes. Being aware of the user’s relative interest in the forwarded content, the forwarding algorithms are able to mainly approach interested users and avoid unnecessarily overwhelming uninterested ones, thus significantly reducing the wasted cost that would result from the dissemination of massive information to uninterested recipients. This contribution mainly improves the effectiveness of the forwarding algorithm. Furthermore, an incentive is needed to motivate nodes to participate in the forwarding process. For this reason, our framework capitalizes on the user’s relative interest in the forwarded content, which it takes to be an effective incentive to participate in the forwarding process: in this way, the users themselves benefit by receiving this same content, which is of partial interest to them. The proposed paradigm is an interest-aware version of any social forwarding algorithm that rewards or penalizes the node’s social rank based on its relative interest in the forwarded content. The rewarded/penalized rank is then utilized in deciding whether or not to forward content. The node’s interest information is stored in each node’s memory for use in computing its rank, which is then exchanged with the encountered node for forwarding decision making.

The second contribution builds on the first by integrating awareness of the candidate node’s power capabilities when making forwarding decisions. This integration directs the forwarding algorithm to rely on power-capable nodes as content forwarders by rewarding or penalizing the node’s social rank based on both its power capability and its relative interest. This contribution reduces overall power consumption and improves the utilization fairness of the forwarding algorithm. This approach incorporates interest and power awareness in both the ranking and the forwarding decision making of the social-aware forwarding algorithms in order to overcome four main challenges that these algorithms face: 1) The forwarder selection process is incentive-oblivious. 2) Power-oblivion of the social forwarding approaches in the forwarder selection process may lead to the nodes’ failure to sustain the forwarding process. 3) Power-fairness-oblivious forwarding algorithms unfairly over-utilize some forwarder nodes, while only lightly utilizing others. 4) Neglecting whether the contact duration between the encountered nodes is sufficient for complete message transfer causes a waste of non-trivial resources.
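The fourth challenge, contact duration sufficiency, can be illustrated with a minimal check that compares the estimated transfer time of a message against the expected contact duration. This is a sketch under stated assumptions rather than the framework’s actual procedure: the function names, the constant link rate, and the `safety_margin` factor are hypothetical.

```python
def transfer_time_seconds(message_bytes: int, link_rate_bps: float) -> float:
    """Estimated time to send the whole message over a link of constant rate."""
    return message_bytes * 8 / link_rate_bps


def contact_is_sufficient(message_bytes: int, link_rate_bps: float,
                          expected_contact_s: float,
                          safety_margin: float = 1.0) -> bool:
    """True if the message is expected to transfer fully within the contact.

    `safety_margin` (hypothetical) inflates the transfer estimate to allow
    for protocol overhead or link variation.
    """
    needed = transfer_time_seconds(message_bytes, link_rate_bps) * safety_margin
    return needed <= expected_contact_s
```

A carrier that skips transfers failing this check avoids spending power on a transmission that would be cut off before completion.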

Finally, the third contribution adds threshold-based opportunistic forwarding to the two integrations above. This contribution improves the effectiveness and the efficiency of the forwarding algorithm. This extra option opens the door for guided opportunistic forwarder selection; relatively interested users who own power capable nodes are favored to become the next content forwarders, even if they do not satisfy the other selection criteria (such as social popularity, activeness, etc.). Being interested in the content and being power capable, these forwarders have a higher probability of encountering destination nodes and a higher chance of sustaining forwarding within the content time-to-live (TTL).

The framework that we envision enables integrating interest and power-awareness both in ranking the nodes and in the decision making process within social-aware opportunistic forwarding algorithms. First, interest integration in this process is introduced. Then, power integration in the ranking and decision making process is demonstrated. Finally, integration of threshold-based opportunistic forwarder selection is proposed. While each of these integrations is beneficial in its own right, the optimum result is achieved when all three are combined. This framework proposal was published in a journal paper [134].

3.1.1 Interest Awareness Integration

Integration of interest awareness in social-aware forwarding algorithms is primarily based on rewarding or penalizing the social rank of the nodes according to their potential interest in the forwarded message. Throughout this chapter, any reference to a “node’s interest” should be understood as the interest of the user in possession of the node. Likewise, “interested nodes” or “semi-interested nodes” refers to nodes whose interest profiles have partial commonality with the interest profile of the forwarded content. The algorithm accepts any format of interest vector or interest set that indicates the topics or fields the user is interested in, and it expects a similar format for the interest topics to which the forwarded message belongs. After eliciting the interest of the nodes and of the message, and representing each as an interest vector or interest set, the [S]imilarity [I]nterest (SInt) is measured using the Jaccard set similarity index [135]. SInt ranges from 0, which indicates no common interest at all, to 1, which indicates full interest in the forwarded content. If the node’s similarity index reaches or exceeds a threshold Thr_Int, the node is considered interested or semi-interested in the forwarded content, and the social rank of this candidate node is rewarded. Conversely, if the index falls below the threshold, the node is considered not interested in the content, and its rank is penalized. The [P]enalized [S]imilarity [I]nterest (PSInt) is computed as follows:

\[
\mathit{PSInt}(j, msg) =
\begin{cases}
\mathit{SInt}(j, msg) + \mathit{reward} & \text{if } \mathit{SInt}(j, msg) \geq \mathit{Thr}_{Int}, \\
\mathit{SInt}(j, msg) - \mathit{reward} & \text{otherwise}
\end{cases}
\tag{3.1}
\]

where SInt(j, msg) is the Similarity Interest, computed as the Jaccard set similarity between the user’s interest vector IntFV(node) and the interest vector of the message content IntFV(msg). The Similarity Interest of the node is rewarded by adding a predefined value reward and penalized by deducting it.

Figure 3.1: Flow Diagram of the Interest Awareness Integration
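The computation in Equation (3.1) can be sketched in a few lines of Python. This is a minimal illustration only: the function names `jaccard` and `ps_int`, the representation of interests as Python sets of topic strings, and the convention that two empty sets have similarity 0 are assumptions made for the example, not part of the original algorithm description.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard set similarity |A & B| / |A | B|.

    Assumption: two empty interest sets are treated as having similarity 0.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def ps_int(node_interests: set, msg_interests: set,
           thr_int: float, reward: float) -> float:
    """Equation (3.1): add the reward when SInt >= Thr_Int, deduct it otherwise."""
    s_int = jaccard(node_interests, msg_interests)
    if s_int >= thr_int:
        return s_int + reward
    return s_int - reward


if __name__ == "__main__":
    # Hypothetical interest sets and parameter values, chosen for illustration.
    msg = {"music", "news", "tech"}
    print(ps_int({"sports", "music", "news"}, msg, thr_int=0.3, reward=0.1))
    print(ps_int({"cooking"}, msg, thr_int=0.3, reward=0.1))
```

Note that, under this formula, a node with no common interest receives a negative PSInt whenever the reward is positive, which in turn lowers any rank that is later multiplied by it.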

The algorithm can take two factors into consideration. It primarily considers the node’s own interest when rewarding the social rank, but it can also include the rewarded interest of each of the node’s friends. This is made possible by computing the cumulative interest of the node and its friends, which combines the PSInt of the node with the sum of the PSInt of its friends. Thus, highly favored forwarder candidates are those with high social ranks as well as those who, along with their friends, have interest in the same message.

\[
\mathit{FrInter}(i, msg) = \frac{\mathit{PSInt}(i, msg) + \sum_{j \in F(i)} \mathit{PSInt}(j, msg)}{|F(i)| + 1}
\tag{3.2}
\]

Assume Rank(i) is the rank of the node that is computed by any social-aware forwarding algorithm. This rank is rewarded/penalized by FrInter(i, msg) according to the following formula:

\[
\mathit{Iutil}(i, msg) = \mathit{FrInter}(i, msg) \cdot \mathit{Rank}(i)
\tag{3.3}
\]
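Equations (3.2) and (3.3) can be illustrated with the following sketch. It assumes that F(i) denotes the set of friends of node i and that the PSInt values of the node and of its friends have already been computed; the function names and the sample numbers are hypothetical.

```python
from typing import Sequence


def fr_inter(ps_int_node: float, ps_int_friends: Sequence[float]) -> float:
    """Equation (3.2): average PSInt over the node and its friends F(i)."""
    return (ps_int_node + sum(ps_int_friends)) / (len(ps_int_friends) + 1)


def i_util(fr_inter_value: float, rank: float) -> float:
    """Equation (3.3): interest-aware rank = FrInter * Rank."""
    return fr_inter_value * rank


if __name__ == "__main__":
    # Hypothetical values: the node's own PSInt and those of two friends.
    cumulative = fr_inter(0.6, [0.4, 0.2])
    print(cumulative, i_util(cumulative, rank=5.0))
```

A node without friends falls back to its own PSInt, since the denominator becomes 1.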

Figure 3.1 illustrates the logic of interest awareness integration in the forwarding decision when a message carrier c encounters node i. Each node computes its own social rank and then its own SInt with respect to the forwarded message, which is rewarded or penalized based on a comparison against a predefined threshold IntFWDThr. Accordingly, each node computes its new interest-aware rank Iutil. At that point, node i sends its Iutil rank to node c, which then compares both ranks, namely Iutil(i, msg) and Iutil(c, msg), in order to decide whether or not to forward the content to node i.

Generally speaking, integrating interest-awareness in social-based forwarding approaches maintains a balance between utilizing interest and social context information, which ultimately improves the performance of these forwarding algorithms in case there is any discrepancy in interest/social information availability.

3.1.2 Power Awareness Integration

To achieve fairness in utilizing the power of the nodes in a specific place, the social-aware forwarding algorithm has to maintain awareness of the nodes’ current power resources and the expected amount of power that will be consumed upon their participation in the message forwarding process. Accordingly, the algorithm favors nodes with higher power capabilities that will be able to sustain the delivery until completion, and avoids exhausting those with poor power capabilities. There are several measures to achieve this power awareness in forwarder selection.

Awareness of the Remaining Power-level

The power-aware forwarding algorithm checks the remaining power level of the candidate nodes in order to pick the ‘wealthy’ nodes and avoid the poor ones. This is achieved by rewarding or penalizing the node’s rank according to its remaining power level. The algorithm defines a battery threshold Thr_bat whose value is determined by experimentation, by analysis, or based on the environment in which the algorithm is applied. If the node’s remaining power level reaches or exceeds Thr_bat, its rank is rewarded and it becomes a candidate for participation in the forwarding process. If its power level is less than Thr_bat, its rank is penalized and the algorithm avoids utilizing it in the forwarding process. In other words, the application fixes a battery threshold above which the ‘wealthy’ members of the community become suitable candidates to forward the message content. Thus, the ranking function of the nodes is computed as follows:

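\[
\mathit{PIutil}(i, msg) = \mathit{RewardedPower}(i) \cdot \mathit{Iutil}(i, msg)
\tag{3.4}
\]

where

\[
\mathit{RewardedPower}(i) =
\begin{cases}
\mathit{Bat}(i) + \mathit{reward} & \text{if } \mathit{Bat}(i) \geq \mathit{Thr}_{bat}, \\
\mathit{Bat}(i) - \mathit{reward} & \text{otherwise}
\end{cases}
\tag{3.5}
\]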

Figure 3.2: Flow Diagram of the Interest and Power Awareness Integration

Figure 3.2 illustrates the flow of processing that takes place when a message carrier c encounters node i, in order to decide whether to forward message msg from node c to node i. Each node computes its interest-aware rank Iutil by rewarding or penalizing its own social rank Rank based on a comparison of its similarity interest SInt, taken with respect to the forwarded content’s interest vector, against the predefined threshold IntFWDThr. Next, each node computes its power- and interest-aware rank PIutil by rewarding or penalizing its Iutil based on a comparison of the node’s battery level against a predefined battery threshold BatThr. Node i then sends its PIutil value to node c, which compares its own PIutil to that of node i in order to decide whether or not to forward a copy of the content to node i. The forward action takes place if PIutil(i, msg) is not less than PIutil(c, msg).
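The ranking in Equations (3.4) and (3.5) and the resulting forwarding decision can be sketched as follows. The sketch assumes battery levels normalized to the range 0 to 1 and takes the interest-aware rank Iutil as a precomputed input; the function names and sample values are hypothetical.

```python
def rewarded_power(bat: float, thr_bat: float, reward: float) -> float:
    """Equation (3.5): reward the battery level at or above Thr_bat, penalize it below."""
    return bat + reward if bat >= thr_bat else bat - reward


def pi_util(bat: float, thr_bat: float, reward: float, i_util: float) -> float:
    """Equation (3.4): power- and interest-aware rank."""
    return rewarded_power(bat, thr_bat, reward) * i_util


def should_forward(pi_util_candidate: float, pi_util_carrier: float) -> bool:
    """Carrier c forwards to node i if PIutil(i, msg) is not less than PIutil(c, msg)."""
    return pi_util_candidate >= pi_util_carrier


if __name__ == "__main__":
    # Hypothetical encounter: carrier c has a low battery, candidate i a high one.
    carrier = pi_util(bat=0.3, thr_bat=0.5, reward=0.1, i_util=2.0)
    candidate = pi_util(bat=0.8, thr_bat=0.5, reward=0.1, i_util=2.0)
    print(carrier, candidate, should_forward(candidate, carrier))
```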

We present two variations of the power and interest-aware forwarding algorithm that meet various metrics. The variations are:

Adaptive Battery Threshold Version (PAd): The logic behind this approach is to utilize the nodes whose ’wealth’ is above the current average ’wealth’ of the battery community. This version continuously adapts the battery threshold used in selecting candidates based on the obsAvgBat values noted from the battery levels of the encountered nodes. Accordingly, the candidates who maintain a battery level above the current observed average battery level are selected to be the next message carriers.

The obsAvgBat is computed as follows:

\[
\mathit{obsAvgBat}(i) = \frac{\mathit{Bat}(i) + \sum_{j \in contact(i)} \mathit{obsAvgBat}(j)}{1 + |contact(i)|}
\tag{3.6}
\]
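As an illustration of Equation (3.6), the sketch below averages a node’s own battery level with the observed averages reported by the nodes it has encountered, and then uses the result as the adaptive threshold of the PAd version. It assumes that the denominator counts the encountered nodes in contact(i); the function names and values are hypothetical.

```python
from typing import Sequence


def obs_avg_bat(own_bat: float, contact_obs_avgs: Sequence[float]) -> float:
    """Equation (3.6): mean of the node's battery and its contacts' observed averages."""
    return (own_bat + sum(contact_obs_avgs)) / (1 + len(contact_obs_avgs))


def passes_adaptive_threshold(candidate_bat: float, observed_avg: float) -> bool:
    """PAd selection rule: the candidate's battery is above the observed average."""
    return candidate_bat > observed_avg


if __name__ == "__main__":
    # Hypothetical normalized battery levels.
    threshold = obs_avg_bat(0.9, [0.5, 0.4])
    print(threshold, passes_adaptive_threshold(0.7, threshold))
```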

Fixed Battery Threshold Version (Px): This version compares the candidate’s battery level to a fixed battery threshold Thr_bat instead of an adaptive battery threshold. The application fixes a battery threshold above which the ‘wealthy’ members of the community become suitable candidates to forward the message.

Awareness of the Power Depletion Rate

To integrate awareness of the depletion rate of the battery of the node, the ranking function computes the predicted power level of the node by the time the TTL of the message TTL(msg) is reached. Given that the node has a record of its own battery’s depletion rate DepRate(i), the expected node’s power level at TTL(msg) expiry can be computed. Accordingly, the node’s power level Bat(i) is rewarded if the expected power level after the TTL indicates that this node can sustain the delivery process without exhaustion, or is penalized if it cannot. This concept is illustrated in the following equation:

\[
\mathit{PredPower}(i) =
\begin{cases}
\mathit{Bat}(i) + \mathit{reward} & \text{if } \dfrac{\mathit{Bat}(i)}{\mathit{DepRate}(i)} - \mathit{TTL}(msg) \geq \mathit{Thr}_{dep}, \\
\mathit{Bat}(i) - \mathit{reward} & \text{otherwise}
\end{cases}
\tag{3.7}
\]

In such a case, the power and interest-aware ranking function of the node PIutil(i) is computed as follows:

\[
\mathit{PIutil}(i, msg) = \mathit{PredPower}(i) \cdot \mathit{Iutil}(i, msg)
\tag{3.8}
\]

In addition to the earlier mentioned forwarder selection conditions, the message carrier node seeks candidate nodes whose power levels satisfy the following condition:

\[
\frac{\mathit{Bat}(i)}{\mathit{DepRate}(i)} - \mathit{TTL}(msg) \geq \mathit{Thr}_{dep}
\tag{3.9}
\]
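The depletion-rate test in Equations (3.7) and (3.9) can be read as a slack computation: Bat(i)/DepRate(i) is the time until the battery is exhausted, and subtracting TTL(msg) gives the time left over after the message expires. The sketch below illustrates this reading. The units (battery in percent, depletion rate in percent per hour, TTL and Thr_dep in hours) and the treatment of a non-positive depletion rate as unlimited slack are assumptions made for the example.

```python
import math


def slack_after_ttl(bat: float, dep_rate: float, ttl: float) -> float:
    """Left-hand side of Equation (3.9): time to depletion minus the message TTL."""
    if dep_rate <= 0:
        # Assumption: a battery that is not draining never runs out.
        return math.inf
    return bat / dep_rate - ttl


def pred_power(bat: float, dep_rate: float, ttl: float,
               thr_dep: float, reward: float) -> float:
    """Equation (3.7): reward Bat(i) if the slack reaches Thr_dep, penalize it otherwise."""
    if slack_after_ttl(bat, dep_rate, ttl) >= thr_dep:
        return bat + reward
    return bat - reward


if __name__ == "__main__":
    # Hypothetical nodes: 60% and 20% battery, both draining at 10% per hour,
    # with a 4-hour TTL and a required slack of 1 hour.
    print(pred_power(60.0, 10.0, ttl=4.0, thr_dep=1.0, reward=5.0))
    print(pred_power(20.0, 10.0, ttl=4.0, thr_dep=1.0, reward=5.0))
```

In the first case the node would last 6 hours, leaving 2 hours of slack, so its power level is rewarded; in the second it would last only 2 hours, so it is penalized.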

Fairness in Utilizing Resources

To attain fair utilization of the nodes’ resources across the community, the algorithm should not over-utilize the resources of some nodes in the forwarding process while under-utilizing others. Fair utilization is maintained by monitoring the power level of the participating nodes. The target of fair utilization is a small variation among the nodes’ power capabilities by the end of the delivery process. This outcome is achieved through two actions. On one side, the candidate selection process favors the power-capable nodes as per the earlier mentioned selection criteria. On the other side, the algorithm always maintains a minimum threshold for the participating nodes’ battery level; if the power of a message carrier node approaches the predefined threshold Thr_bat, it ceases participation in the message forwarding process.

Expectation of the Contact Duration

During the forwarding process, a significant portion of the consumed power is wasted in incomplete message transfers. A forwarding algorithm that ignores the expected contact duration between the carrier and the selected forwarder node does not account for the time and power resources wasted in this way. For a message transfer between two nodes to succeed, the nodes need to remain in contact for long enough. Let us assume the least contact duration required for a successful message transfer is transferTime. A power-effective forwarding algorithm should consider only the candidates who are expected to stay in contact with the message carrier for at least transferTime, and avoid selecting the others. This is attained by satisfying the following condition:

\[
\mathit{PredictedContactTime}(i, j) + \mathit{ElapsedContactTime}(i, j) \geq \mathit{transferTime}(msg)
\tag{3.10}
\]

where PredictedContactTime(i,j) is the predicted remaining duration of contact between the message carrier i and the candidate forwarder j as per any time prediction mechanism. ElapsedContactTime(i,j) is the time elapsed since the two nodes got in contact, and transferTime(msg) is the time required to transfer the message from node i to node j.
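Condition (3.10) can be sketched as a simple predicate. The helper `transfer_time`, which estimates the transfer duration as message size divided by link throughput, is an assumption introduced for the example; the text only requires that transferTime(msg) be known. The predicted remaining contact time would come from whichever prediction mechanism is adopted and is passed in here as a plain number.

```python
def transfer_time(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Hypothetical estimate of transferTime(msg): size divided by throughput."""
    if rate_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_bytes / rate_bytes_per_s


def can_complete_transfer(predicted_remaining_s: float, elapsed_s: float,
                          transfer_time_s: float) -> bool:
    """Condition (3.10): predicted plus elapsed contact time covers the transfer time."""
    return predicted_remaining_s + elapsed_s >= transfer_time_s


if __name__ == "__main__":
    # Hypothetical 1 MB message over a 250 kB/s link.
    needed = transfer_time(1_000_000, 250_000)
    print(needed, can_complete_transfer(3.0, 2.0, needed))
```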

Hence, the algorithm favors the candidate nodes that are expected to remain in contact with the carrier for the entire duration of the message transfer. Several time prediction mechanisms have been proposed by other researchers [26] [136] [137]. The more accurate the prediction mechanism, the less likely incomplete transfers become, and thus the more power is preserved. On the other hand, some of the available prediction techniques are time and resource consuming. Thus, the selected prediction mechanism has to trade off accuracy against the resources consumed due to its complexity.

3.1.3 Threshold-based Opportunistic Selection Integration

This version adds an extra opportunistic portion to the candidate selection process. This is achieved by forwarding the message to any interested forwarder whose battery level is above the fixed threshold Thr_bat. These favored forwarders need not be socially popular users, but rather are power-capable interested forwarders. In this version, the forwarding decision logic is an OR function between two grouped computations. In one group, the candidate node’s power- and interest-aware rank is compared against the message holder’s rank. In the other group, the candidate’s SInt and its power capability are compared against their respective preset thresholds. In other words, a candidate is selected if EITHER its rank is not lower than the current message holder’s rank OR both its SInt reaches the interested forwarder threshold Thr_Int and its power capability reaches the battery threshold Thr_bat, as presented in the following condition:

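\[
\mathit{PIutil}(j, msg) \geq \mathit{PIutil}(i, msg) \;\; \text{OR} \;\; \bigl(\mathit{SInt}(j, msg) \geq \mathit{Thr}_{Int} \;\; \text{AND} \;\; \mathit{Bat}(j) \geq \mathit{Thr}_{bat}\bigr)
\tag{3.11}
\]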
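Condition (3.11) translates directly into a boolean expression. In the sketch below, i is the current message holder and j the candidate, as in the condition; the function name and the sample values are hypothetical.

```python
def select_candidate(pi_util_j: float, pi_util_i: float,
                     s_int_j: float, bat_j: float,
                     thr_int: float, thr_bat: float) -> bool:
    """Condition (3.11): rank-based selection OR threshold-based opportunistic selection."""
    rank_ok = pi_util_j >= pi_util_i
    opportunistic_ok = s_int_j >= thr_int and bat_j >= thr_bat
    return rank_ok or opportunistic_ok


if __name__ == "__main__":
    # Hypothetical candidate with a lower rank than the holder, but interested
    # in the content and holding enough battery: selected opportunistically.
    print(select_candidate(pi_util_j=0.5, pi_util_i=1.2,
                           s_int_j=0.6, bat_j=0.8,
                           thr_int=0.3, thr_bat=0.5))
```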

3.2 Space Syntax Framework

A detailed background on Space Syntax Forwarding is provided in Section 2.5. In this section, we propose new approaches in Space Syntax-based forwarding. A comparison matrix of the related work and our proposed solution is presented in Table 3.1.

3.2.1 Space Syntax Metrics

A set of basic Space Syntax metrics is commonly used to determine the popularity of any street or spot on a given map. These metrics are summarized in Table 3.2. We are concerned with five basic metrics, detailed as follows:

Integration Value: The segment’s integration value is the average distance between the segment and all other segments in the map. This value represents the segment’s closeness to other segments.

Connectivity Value: The segment’s connectivity value is the number of segments intersecting with this segment. It represents the segment’s degree of connectivity.

Location Index: The Location Index is the deviation of the segment’s integration value from the maximum integration value divided by the range of integration values.

Attraction Value: It is an index that indicates the level of attraction of the spot with respect to the rest of the spots in the area.

Popularity Index: The popularity index of a segment is calculated as the normalized attraction value of the segment added to the sum, over all segments that intersect with it, of each intersecting segment’s normalized length multiplied by its normalized attraction value. The normalized length of an intersecting segment is its length divided by the length of the segment whose popularity index is being calculated.
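Two of the definitions above, the Location Index and the Popularity Index, can be made concrete with a short sketch. It follows one reading of the verbal definitions: the Location Index is taken as (maximum integration minus the segment’s integration) divided by (maximum minus minimum integration), and the Popularity Index as the segment’s normalized attraction plus the sum of normalized length times normalized attraction over its intersecting segments. How attraction values are normalized is not specified in the text, so they are passed in already normalized. All names and values are hypothetical.

```python
from typing import Dict, Sequence


def location_index(integration: float, all_integrations: Sequence[float]) -> float:
    """Deviation from the maximum integration value divided by the range of values."""
    hi, lo = max(all_integrations), min(all_integrations)
    if hi == lo:
        # Assumption: with no spread in integration values, the index is 0.
        return 0.0
    return (hi - integration) / (hi - lo)


def popularity_index(segment: str, neighbors: Sequence[str],
                     norm_attraction: Dict[str, float],
                     length: Dict[str, float]) -> float:
    """Normalized attraction of the segment plus, for each intersecting segment,
    its length relative to this segment times its normalized attraction."""
    total = norm_attraction[segment]
    for n in neighbors:
        total += (length[n] / length[segment]) * norm_attraction[n]
    return total


if __name__ == "__main__":
    # Hypothetical three-segment map: segment "a" intersects "b" and "c".
    attraction = {"a": 0.5, "b": 0.25, "c": 1.0}
    lengths = {"a": 100.0, "b": 50.0, "c": 200.0}
    print(location_index(2.0, [1.0, 2.0, 5.0]))
    print(popularity_index("a", ["b", "c"], attraction, lengths))
```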

3.2.2 Problem Definition

Space Syntax has been integrated in city-wide content delivery applications. These applications rely on the minimum information required which constitutes the static values of popularity of the attraction spots in a specified location. These popularity values are pre-calculated based on the initial map of the place (i.e. the initial design of the place). Accordingly, these applications claim that they rely on minimum input and that they are able to preserve privacy since they do not request private information of users in that place.

Table 3.1: Comparison Matrix of Space Syntax-based Forwarding Related Work

| Space syntax | Graph theory | Description/Definition |
|---|---|---|
| Integration | Closeness | The mean distance between an axis and all other axes in the system. |
| Connectivity | Degree | The connectivity of axis i is the number of axes that it intersects with. |
| Choice | Betweenness | The number of times axis i is used when calculating the shortest paths between all pairs of axes in a system. |
| Control | – | The degree to which axis i controls “access” from and to the axis it intersects. |
| Global metric | – | A metric for an individual axis, calculated using the whole system. |
| Local metric | – | A metric for an individual axis, calculated using the axis’ neighborhood (e.g. axes up to three intersections away). |
| Intelligibility | – | The correlation between axes’ connectivity and global integration. |
| Synergy | – | The correlation between axes’ local and global integration. |
| PageRank | PageRank | The popularity of an axis i as determined by the number of popular axes intersecting with axis i (recursive). |

Table 3.2: Mapping of the Basic Space Syntax Metrics and the Graph Theory terms [3]

We find that the Space Syntax approach can be utilized on micro-level maps and not only on city-wide maps. For example, it can be applied to a university campus map, a conference map, a mall or an airport map. However, we argue that the current Space Syntax metrics do not reflect the real distribution of popularity values. To support our claim, we cross-validate against an external measure, namely the frequency of association of the mobile nodes with the popular spots in a specific place. For this cross-validation, we utilized the AUC wireless traces dataset. We first computed the popularity values of 588 selected spots on the university map with several Space Syntax metrics; namely, the Integration Value, the Location Index, the Attraction Value, and the Popularity Index, which is based on AlJarhi’s approach [28]. Next, we compared the calculated popularity values of the spots to the popularity values of these same spots based on the frequency of the mobile nodes’ association with these spots, as derived from real wireless traces. The resulting Spearman correlation between the Popularity Index values and the popularity by frequency of association is -0.04 with significance 0.332, which indicates a very weak, non-significant correlation. In addition, the Spearman correlation between the popularity ranking based on frequency of association and the popularity ranking based on each of the remaining metrics is -0.077 for the Integration Value, 0.077 for the Location Index, and 0.077 for the Attraction Value. These correlations share the same significance value of 0.061, which indicates non-significant correlations. These values are summarized in Table 3.3. The results indicate that the basic Space Syntax metrics do not accurately reflect the popularity rank of the spots in the real traces, which is represented by the frequency of association of mobile nodes with these spots. It is worth noting that, although the reliance on these university wireless traces might seem to be an incomplete measure since not all users have mobiles that are connected to the wireless network, we believe that it represents a stratified sample of the community, which is a valid indicator of the user density and mobility in this specific space. Accordingly, the results of the correlation are still valid and are considered an indication of the success or failure of the popularity index value in defining the correct rank of popularity of each spot among all the spots in the place.

| Popularity By Traces | Popularity Index | Integration Value | Location Index | Attraction Value |
|---|---|---|---|---|
| Spearman Correlation Coefficient | -0.04 | -0.077 | 0.077 | 0.077 |
| Sig. (2-tailed) | 0.332 | 0.061 | 0.061 | 0.061 |
| Pearson Correlation | -0.104* | 0.057 | -0.057 | -0.057 |
| Sig. (2-tailed) | 0.012 | 0.164 | 0.164 | 0.164 |

* Correlation is significant at the 0.05 level (2-tailed).

Table 3.3: Pearson and Spearman Correlations between the Popularity Rank based on the Basic Space Syntax metrics and the Real Traces Frequency of association
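For readers who wish to reproduce this kind of cross-validation on their own traces, the following sketch computes a Spearman rank correlation in plain Python: values are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is returned. It is a generic illustration and not the statistical tooling used in this research; the sample data are invented and do not come from the AUC dataset, and no significance value is computed.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Invented example: a metric-based popularity score per spot versus the
    # number of device associations observed at the same spots.
    metric_score = [0.9, 0.4, 0.7, 0.1, 0.5]
    associations = [120, 300, 80, 150, 40]
    print(spearman(metric_score, associations))
```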

Thus, we pose the question: how do you define attraction points on a map? We believe that the current Space Syntax approach does not properly define the real attraction points on a map. Instead, it imposes its predictions on the map which might not reflect reality when the place is in use. A proper definition of attraction points enables the correct computation of Space Syntax metrics which can then be integrated in computing mobile nodes’ ranks for proper forwarding decisions.

On this basis, this research proposes a set of new Space Syntax metrics that rely on a history of recorded data, such as the frequency or duration of association with access points and the association with popular spots, among other proposed metrics. The Space Syntax framework then integrates any of these metrics with any of the PI-SOFA algorithms to improve effectiveness and efficiency.

The flow diagram of the Space Syntax framework in Figure 3.3 follows the logic of integrating the Space Syntax metrics in the forwarding decision making process. The process starts when a message carrier c encounters a node i, and each of them computes its rank as per the logic of the applied PI-SOFA framework. Each of them then computes its own Space Syntax rank as per a predefined Space Syntax metric, selected from any of the Space Syntax metrics proposed in this research. Next, each node rewards or penalizes its Space Syntax rank by comparing it against a predefined Space Syntax-based threshold PlaceThr. The rewarded/penalized rank is then multiplied by the node’s PI-SOFA rank to produce a new PISOFA-SpaceSyntax rank, and the new ranks of both nodes are compared in order to decide whether or not to forward the message to node i.
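The flow described above can be sketched as follows. The text does not give the exact form of the reward for the Space Syntax rank, so the sketch assumes the same additive reward/penalty pattern used in Equations (3.1) and (3.5), and it assumes that forwarding takes place when the candidate’s combined rank is not less than the carrier’s, as in Section 3.1.2. All names and values are hypothetical.

```python
def rewarded_place_rank(place_rank: float, place_thr: float, reward: float) -> float:
    """Assumed additive reward/penalty of the Space Syntax rank against PlaceThr."""
    return place_rank + reward if place_rank >= place_thr else place_rank - reward


def pisofa_space_syntax_rank(pisofa_rank: float, place_rank: float,
                             place_thr: float, reward: float) -> float:
    """Combined rank: rewarded/penalized Space Syntax rank times the PI-SOFA rank."""
    return rewarded_place_rank(place_rank, place_thr, reward) * pisofa_rank


def forward_to_candidate(rank_candidate: float, rank_carrier: float) -> bool:
    """Assumed decision rule: forward if the candidate's rank is not lower."""
    return rank_candidate >= rank_carrier


if __name__ == "__main__":
    # Hypothetical encounter: both nodes have the same PI-SOFA rank, but the
    # candidate frequents a more popular place than the carrier.
    carrier = pisofa_space_syntax_rank(1.5, place_rank=0.2, place_thr=0.5, reward=0.1)
    candidate = pisofa_space_syntax_rank(1.5, place_rank=0.7, place_thr=0.5, reward=0.1)
    print(carrier, candidate, forward_to_candidate(candidate, carrier))
```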

3.3 Dynamic Adaptive Ranking

Dynamic Adaptive Ranking is the third, comprehensive framework, which dynamically integrates a set of parameters to compose a dynamic adaptive rank for each node. The forwarding decision making relies mainly on this dynamic adaptive rank in order to adapt to the current context. The main proposed parameters of the rank are dynamically weighted attributes, which include: opportunistic selection; the node’s interest and power capability; the measure of how active socially popular users are; and the popularity of the place these users frequent. As nodes move from one place to another, or as time passes, their attributes change in value and the relative importance of each attribute changes. A fixed weight for each attribute within the node’s rank might therefore not reflect the current context and thus lead to incorrect node ranking. Dynamically changing the values of the attributes’ weights helps reflect the correct current rank of the node. Accordingly, this dynamic adaptive ranking framework adapts to the current context by dynamically changing the relative weights of the attributes that decide the node’s rank. The dynamic change of weights relies on the changes in the values of the attributes and on whether these current values fall into critical ranges that render these attributes ineffective in comparison to the remaining attributes in the current context.

Figure 3.3: Flow Diagram of the Space Syntax Framework

Other dynamically changing contexts include the case in which the value of any of these attributes is unavailable at a certain time for any reason, or because the user or their mobile node refuses to reveal it.

Based on this logic, we propose a dynamically adaptive ranking function that changes the weights of its attributes based on the current context. This node ranking function ranks candidates based on dynamically weighted attributes, which are enclosed in the following components: opportunistic selection OppComponent, the user’s interest IntComponent, the node’s power capability PowerComponent, the measure of how active socially popular users are SocActiveComponent, and the popularity of the place these users frequent PlaceAwareComponent. This dynamically adaptive ranking function can be formalized as follows:

\[
\begin{aligned}
\mathit{DynAdpRank}(i) ={} & \mathit{OppComponent} + \mathit{IntComponent} + \mathit{SocActiveComponent} \\
& + \mathit{PowerComponent} + \mathit{PlaceAwareComponent}
\end{aligned}
\tag{3.12}
\]

Let us have a closer look at each of these components.

1. The opportunistic selection component OppComponent leaves room for a purely opportunistic selection of candidates that is not restricted by any particular criterion. This component contributes the portion wOpp of the ranking function. It can be formulated as follows:

\[
\mathit{OppComponent} = \mathit{wOpp} \cdot \mathit{Opportunistic}
\tag{3.13}
\]

2. The user’s interest component IntComponent considers the relative interest of the user in the forwarded content as an important factor in ranking nodes. This component consists of the similarity in interest between the user’s interest vector and the forwarded content domain of interest. It is more comprehensive to include the user’s friends’ similarity interest as well. Another interesting component that might be considered in the user’s interest is the power capabilities of the mobile node he is carrying since the power limitation of some devices might hinder the user’s interest in receiving any content even if it is within his range of interest. This component can be formulated as follows:

\[
\mathit{IntComponent} = \mathit{wIntPower} \cdot \mathit{CumInterest}(i) \cdot \mathit{Power}(i)
\tag{3.14}
\]

3. The social active component SocActiveComponent takes into consideration the social popularity of this user among his friends. It also considers how active the user is in interacting with and contacting other nodes. This component is aware of the active and sociable members, since such members are likely to come into contact with many nodes, which gives them a higher chance of meeting the destination nodes and of forwarding the content to any of their interested friends. This component consists of the SocialRank(i) of the node and the measure of activeness of the node. There are several ways to measure these two parameters. Accordingly, this component can have various implementations. The ranking function relies on this component with the relative weight wActiveSocial. Generally speaking, this component is formulated as follows:

\[
\mathit{SocActiveComponent} = \mathit{wActiveSocial} \cdot \mathit{SocialRank}(i) \cdot \mathit{Activeness}(i)
\tag{3.15}
\]

4. The node’s power capability component PowerComponent considers the node’s power capability as a factor that affects the selection of the next forwarder. A user whose mobile node’s power capability is not high may refrain from participating in the forwarding process, and a node with low power capability may not survive until the delivery process completes. It is therefore more effective to rank power-capable nodes highly among all the available nodes in the place. However, to maintain fairness in utilizing the nodes in the place, a threshold needs to be set below which nodes are forced to drop the forwarded message copy and abandon participation in the forwarding process. Generally speaking, this component is formulated as follows:

\[
\mathit{PowerComponent} = \mathit{wPowerAware} \cdot \mathit{Power}(i)
\tag{3.16}
\]

5. The popularity of the place is an important component to consider in ranking nodes. This component, namely PlaceAwareComponent, relies on the importance of the Space Syntax metrics in decision making. Various Space Syntax metrics can be utilized in computing the place popularity attribute PlacePopularity(i). These metrics have been explained in detail in Section 3.2 and their proposed implementations are illustrated in Chapter 6. The ranking function relies on this component with the ratio given by the place popularity weight wPlacePopularity. This component is formulated as follows:

\[
\mathit{PlaceAwareComponent} = \mathit{wPlacePopularity} \cdot \mathit{PlacePopularity}(i)
\tag{3.17}
\]

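In more detail, the dynamic adaptive ranking function is computed as

\[
\begin{aligned}
\mathit{DynAdpRank}(i) ={} & \mathit{wOpp} \cdot \mathit{Opportunistic} + \mathit{wIntPower} \cdot \mathit{CumInterest}(i) \cdot \mathit{Power}(i) \\
& + \mathit{wActiveSocial} \cdot \mathit{SocialRank}(i) \cdot \mathit{Activeness}(i) \\
& + \mathit{wPowerAware} \cdot \mathit{Power}(i) + \mathit{wPlacePopularity} \cdot \mathit{PlacePopularity}(i)
\end{aligned}
\tag{3.18}
\]

where

\[
\begin{aligned}
\mathit{wOpp} &= 1 - (\mathit{wIntPower} + \mathit{wActiveSocial} + \mathit{wPlacePopularity}) \\
\mathit{wIntPower} &= f(\mathit{SInt}(i, msg), \mathit{bat}(i)) \\
\mathit{wActiveSocial} &= f(\mathit{SInt}(i, msg)) \\
\mathit{PlacePopularity}(i) &= f(\mathit{Pop}(\mathit{Location}(i, now)), \mathit{Pop}(\mathit{FreqLocation}(i)))
\end{aligned}
\]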

There are five weights that dynamically change in value to adapt to the current context. These weights are:

The first weight is wOpp which is the weight of the opportunistic selection component. Its value is actually the remaining portion after computing all the weights of the other components.

The second weight is wIntPower, which is a monotonically decreasing function of the similarity interest (SInt) of the node and its power capabilities Power(i). The Power(i) parameter is the remaining power of node i. This power component can be represented as the rewarded power of the node, which is computed as RewardedPower(i) in equation (3.5). Also, the battery depletion rate awareness can be introduced in the ranking function, where Power(i) is computed as per PredPower(i) in equation (3.7). It is worth noting that in some applications the similarity interest of the user with the domain of the message content may not be constant, as it may change over time. Accordingly, SInt(node, msg) becomes a function of time as well. This is depicted in the following formula:

\[
\mathit{SInt}(node, msg) = f(time, \mathit{IntFV}(node), \mathit{IntFV}(msg))
\tag{3.19}
\]

The third weight is wActiveSocial, which is a function of the SInt of the node. This is attributed to the fact that the ranking function pays more attention to nodes that have interest in the forwarded content. That is why the node’s interest in the forwarded content emphasizes its social and activeness components.

The fourth weight is wPowerAware, which is the weight that sets the relative effect of the node’s power capability in the node ranking process. This weight decreases as the node’s power decreases to indicate the decreasing value of this node when selecting it to participate in the forwarding process.

The fifth weight is wPlacePopularity, which is the weight of the place popularity of node i, namely PlacePopularity(i). This wPlacePopularity weight is a function of both the popularity of the current location of node i, Pop(Location(i, now)), and the popularity of the most frequent location in which node i stays along its path, Pop(FreqLocation(i)).
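Equation (3.18) can be sketched as a weighted sum in which the opportunistic weight wOpp is whatever remains after the other weights are assigned. The functions f that derive the weights from the current context are not specified at this point in the text, so the sketch takes the weights as plain inputs. The class and field names, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeContext:
    """Attribute values of node i at the time of the encounter (hypothetical container)."""
    opportunistic: float
    cum_interest: float
    power: float
    social_rank: float
    activeness: float
    place_popularity: float


def dyn_adp_rank(ctx: NodeContext, w_int_power: float, w_active_social: float,
                 w_power_aware: float, w_place_popularity: float) -> float:
    """Equation (3.18), with wOpp = 1 - (wIntPower + wActiveSocial + wPlacePopularity)."""
    w_opp = 1.0 - (w_int_power + w_active_social + w_place_popularity)
    return (w_opp * ctx.opportunistic
            + w_int_power * ctx.cum_interest * ctx.power
            + w_active_social * ctx.social_rank * ctx.activeness
            + w_power_aware * ctx.power
            + w_place_popularity * ctx.place_popularity)


if __name__ == "__main__":
    ctx = NodeContext(opportunistic=1.0, cum_interest=0.5, power=0.8,
                      social_rank=2.0, activeness=0.5, place_popularity=0.4)
    print(dyn_adp_rank(ctx, w_int_power=0.3, w_active_social=0.2,
                       w_power_aware=0.1, w_place_popularity=0.25))
    # One possible way to handle an unavailable attribute (an assumption):
    # set its weight to 0, which shifts that share to the opportunistic component.
    print(dyn_adp_rank(ctx, w_int_power=0.3, w_active_social=0.2,
                       w_power_aware=0.1, w_place_popularity=0.0))
```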

3.3.1 Dynamic adaptive Ranking Cognitive Map

The causal relationship between the various parameters is illustrated in Figure 3.10. Let us discuss it in more detail by deducing the following flow from this cognitive map:

The main components of the node’s rank are its social popularity, its activeness, its location-based popularity, its interest, its power capabilities, and the opportunistic forwarding to this node.

1. The node’s social popularity can be measured using several approaches (such as centrality and betweenness among other measures) in order to mainly assess its rank according to social friendship and social contacts.

2. The node activeness is determined by three factors: the node’s co-location with any one of the targeted destination nodes, its change in degree of connectivity with other nodes, and the node’s co-location with its friends. Thus, Activeness(i) = f(Col(i, Dest), CDC(i), Col(i, F(i))). This is illustrated in Figure 3.4. It is worth noting that the plus sign in this figure indicates a positive correlation between the two variables. For instance, as the attribute co-location with friends increases, the activeness attribute increases.

Figure 3.4: Factors determining the Activeness Component

3. The opportunistic selection component allows any forwarding algorithm to select a node based on a predefined probability that it will become the next forwarder upon its encounter with one of the message carriers. It is worth mentioning that some factors guide the probability of selecting this node for forwarding. These factors include: the popularity of the place where the node is currently located, Popularity(location(i, now)); the node’s similarity in interest with the forwarded content, SInt(node, msg); and finally the remaining power of the mobile node and its capacity to sustain the forwarding process during the time-to-live of the forwarded content, which is the predicted remaining power PredPower(i). This relationship is illustrated in Figure 3.5.

Figure 3.5: Factors determining the Opportunistic Selection Component

4. The Location-based Popularity component is the popularity of the node based on its being currently located in a popular place, and also based on its frequent presence or long stays in popular places. Thus, it can be represented by the following formula:

\[
\text{Location-basedPopularity}(i) = f(\mathit{Popularity}(\mathit{location}(i, now)), \mathit{Max}(\mathit{Freq}(\mathit{Col}(i, \mathit{PopularLocations}))))
\]

where Popularity(location(i, now)) is the Space Syntax popularity of the current location of node i, Col(i, PopularLocations) is the co-location of node i in the popular places, and Freq(Col(i, location)) is the frequency of the co-location of node i in a certain location, for which Max(Freq) is the maximum frequency of co-location. This causal relationship is illustrated in Figure 3.6.

The popularity of a location can be computed based on Space Syntax metrics or based on the recorded user mobility. This point is discussed in detail in Section 6.2.

There are two factors that control the weight of the above mentioned main factors:

Figure 3.6: Causal Relationship of the Location-based Popularity Component

1. Interest Awareness is achieved by emphasizing the reliance on the potentially interested nodes and avoiding forwarding content to the uninterested ones. This is measured by identifying the similarity between the user’s interest and the forwarded message, or the cumulative interest of the user and their friends. A user with high interest has a greater chance of being selected opportunistically. Furthermore, the higher the user’s interest and cumulative interest, the more their social popularity component gets magnified. In addition, the higher the user’s interest, the stronger the weight of the activeness component of their rank. This relationship is illustrated in Figure 3.7.

Figure 3.7: Preliminary Factors determining the Interest Awareness Factor

2. Power awareness is maintained by emphasizing reliance on nodes with higher levels of remaining

power. This can be achieved by factoring the depletion rate of the node and its expected contact

duration with the message carriers. The faster the depletion rate of the mobile battery, the more

rapid the decrease in its remaining power. Moreover, the longer the contact duration between

the message carrier and a candidate node, the higher the chance of a complete transfer of the

message. Diminishing the occurrence of incomplete transfers allows less resource waste and more

power preservation. The influence of these preliminary factors is illustrated in Figure 3.8. It is

worth noting that the minus sign indicates a negative correlation. For instance, as the wasted

power increases, the remaining power attribute value decreases and vice versa. Also, the predicted

power level of the node is formulated as

$$\mathrm{PredPower}(i) \;\propto\; \mathrm{depRate}(i),\ \frac{\mathrm{AvgContactTime}(i, OtherNodes)}{\mathrm{MsgTransferTime}}$$

Figure 3.8: Preliminary Factors determining the Power Awareness Factor

The predicted power level of the node then magnifies or diminishes three main components of the node’s rank, namely, the node’s social popularity component (see Figure 3.9), its activeness component (see Figure 3.4), and its opportunistic selection component (see Figure 3.5).

Figure 3.9: Causal Relationship of the Social Popularity Component

To sum up, three of the main components of the node’s rank are magnified via its interest level and its power level from different aspects: First, the level of interest of the user in the forwarded message positively affects the node’s rank. Second, the higher the power level of the node the higher the node’s rank. Finally, both the interest and power factors are rewarded or penalized by employing a set of context-specific information such as the user and friends’ interest, the battery depletion rate, and the co-location duration.

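From all the above, the adaptive rank of a node, with its context-dependent weights, is computed using the following formula:

$$
\begin{aligned}
\mathrm{AdpRank}(i) ={}& w_{Opp} \cdot \mathrm{Opportunistic} + f(\mathrm{SInt}(i, msg), \mathrm{bat}(i)) \cdot \mathrm{CumInterest}(i) \cdot \mathrm{PredPower}(i) \\
&+ f(\mathrm{SInt}(i, msg)) \cdot \mathrm{SocialRank}(i) \cdot f(\mathrm{Col}(i, Dest), \mathrm{CDC}(i), \mathrm{Col}(i, F(i))) \\
&+ w_{LocationPopularity} \cdot f(\mathrm{Pop}(\mathrm{location}(i, now)), \mathrm{Freq}(\mathrm{Col}(i, PopularLocations)))
\end{aligned}
\tag{3.20}
$$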
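To make the structure of Equation 3.20 concrete, the following is a minimal sketch of how its four terms could be combined. The text leaves every f(...) unspecified, so the sketch substitutes a plain average for all of them; that choice, the default weights, the class name `NodeContext`, and the assumption that all inputs are normalized to [0, 1] are illustrative only and are not taken from the thesis.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class NodeContext:
    """Inputs of Eq. (3.20) for one node; all assumed normalized to [0, 1]."""
    opportunistic: float   # opportunistic selection component
    s_int: float           # SInt(i, msg)
    battery: float         # bat(i)
    cum_interest: float    # CumInterest(i)
    pred_power: float      # PredPower(i)
    social_rank: float     # SocialRank(i)
    col_dest: float        # Col(i, Dest)
    cdc: float             # CDC(i)
    col_friends: float     # Col(i, F(i))
    pop_here: float        # Pop(location(i, now))
    freq_popular: float    # Freq(Col(i, PopularLocations))


def adp_rank(ctx, w_opp=0.25, w_loc=0.25, f=mean):
    """Sum the four terms of Eq. (3.20).

    `f` is a hypothetical stand-in for every unspecified f(...) in the
    formula; the default (arithmetic mean) is an assumption.
    """
    term_opp = w_opp * ctx.opportunistic
    term_interest_power = (
        f([ctx.s_int, ctx.battery]) * ctx.cum_interest * ctx.pred_power
    )
    term_social = (
        f([ctx.s_int])
        * ctx.social_rank
        * f([ctx.col_dest, ctx.cdc, ctx.col_friends])
    )
    term_location = w_loc * f([ctx.pop_here, ctx.freq_popular])
    return term_opp + term_interest_power + term_social + term_location
```

With these placeholder choices, a node whose inputs all equal 1 obtains a rank of 0.25 + 1 + 1 + 0.25 = 2.5, which shows that the interest-power and social terms dominate unless the two explicit weights are raised.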

In summary, this section details the proposed components of a dynamically adaptive ranking function that evaluates the node based on the user’s interest in the content, the node’s power capabilities, the user’s activeness in interacting with other nodes, the user’s social popularity among his contacts, and the popularity of the places the user frequents. Since these components vary in their degree of influence on the node’s rank from one situation to another, the ranking function sets dynamically varying weights for these components. The weights vary based on the current context at the time of evaluating the node’s rank.

Figure 3.10: Cognitive Map of the Adaptive Ranking Function Parameters

In this chapter, we proposed three frameworks for improving the effectiveness and efficiency of social-aware opportunistic forwarding algorithms. The first framework is the PI-SOFA framework which integrates interest-awareness and power awareness into any social-aware forwarding algorithm. The second framework introduces space syntax awareness in the PI-SOFA algorithms in order to leverage the influence of the surrounding space in improving the forwarding process towards more effective and more efficient performance. Finally, the third proposed framework is the dynamic adaptive ranking which is aware of the dynamic change in the importance of the components of ranking and accordingly changes their weights based on the current detected context.

Chapter 4

SAROS: A Social-Aware Opportunistic Forwarding Simulator

40% of the world population has an Internet connection with traffic reaching 686 billion GB to date [138]. These numbers continue to rise exponentially, spurred by the growing advancement in network technology and the popularity of . Moreover, the pervasiveness of mobile devices and their progressive smart features have motivated users to view and share a broad range of content on social media, further contributing to the production of a massive volume of traffic data. As of November 2016, there are 1.18 billion monthly active Facebook users, among which 1.09 billion are active mobile users [7], and YouTube has over a billion users who watch hundreds of millions of hours of video daily and generate billions of views [8]. Social networking sites have produced a goldmine of information about users, their interests and the nature of their social interactions.

Despite all these advancements, there is a set of challenges that impede users’ appreciation of the offered services. Among these obstacles is the ongoing demand for mobile data that exceeds infrastructure support. This bandwidth demand-supply gap causes frequent contention due to the overwhelming number of uploads of media content and network access required by pervasive mobile devices. With such increasing mobile data traffic (refer to Figure 1.1), the network infrastructure has become overloaded and users experience occasional network service unavailability. In addition, the cost of service delivery continues to rise [9], discouraging users from engaging in many of the offered services. In fact, not all people have predefined routes connecting them, and not all places are covered by the available network infrastructure [126], especially in the many locations around the world where urbanization occurs at much higher rates than network infrastructure support and deployment. All these challenges point to the need for ad hoc connections [10], delay tolerant connections [11] [139] [140] and opportunistic networks [12] [141] that would enable communication in such network-challenged environments. The emergence of opportunistic networks in particular has become a significant area of research that will likely continue to expand due to several factors: data expansion, the high cost of 4G (and the emerging 5G), and the need for connections in developing urban places due to the lack of a stable network infrastructure.

Reliance on content delivery via mobile opportunistic networks calls for state-of-the-art opportunistic forwarding algorithms [24] [142] [22] [143] [126] that complement delivery via an established network infrastructure. However, practically assessing and evaluating these algorithms in such domains is extremely challenging since establishing realistic testbeds is both time consuming and costly. Reproducing real traces is also difficult due to external environment challenges [144] [145]. Meanwhile, evaluation through synthetic mobility models allows for fine-tuning but can only cover limited mobility characteristics [145]. We believe that building realistic simulators that reflect current usage trends and state-of-the-art solutions is the compromise; simulators facilitate the examination of various setups in controlled and reproducible environments.

There are several well-known simulators in the area of opportunistic networks such as ONE [145] and OMNeT++ [144]. These simulators provide measures of the standard network-based performance metrics, and offer a graphical interface for parameter input and animated output [146]. However, these simulators fall short in incorporating a set of features such as generating or importing social graphs; synthesizing or importing users’ interest vectors; simulating power consumption; simulating realistic usage profiles; fully implementing power consumption patterns of mobile nodes (see footnote 1); and implementing several state-of-the-art social-aware opportunistic forwarding algorithms, against which researchers can compare their solutions.

This chapter proposes SAROS, a Social AwaRe Opportunistic Forwarding Simulator, developed to simulate mobile opportunistic networking environments. Our goal is to implement an opportunistic network simulator that can evaluate the performance of novel social-aware forwarding algorithms. This simulator is enriched with realistic mobility model implementations [148] and the manipulation of real mobility traces such as the SIGCOMM09 [149] and INFOCOM06 [150] mobility traces. For realistic interest and social-aware simulations, SAROS encompasses synthesized social and interest graphs as well as real data from realistic settings. For realistic power simulations, SAROS also imports real power consumption models of popular mobile brands [5]. It also imports real usage profiles [4] and incorporates the modified kinetic battery model [2].

SAROS implements state-of-the-art social-based opportunistic forwarding algorithms such as ProfileCast [142], BubbleRap [97], SocialCast [24] and PeopleRank [22]. It also modifies them so that they become social and power aware. Additionally, SAROS provides a sophisticated interface and produces a rich set of output data and graphs of the measured metrics. We have conducted a set of experiments to validate and verify the implementation of this simulator.

Footnote 1: Note that ONE has an add-on for implementing general energy support, while the battery module of the ZigBee OPNET Simulation Model computes the consumed and remaining energy levels as per the MICAz/TelosB mote specifications only [147].

4.1 Related Work

Opportunistic network simulators are categorized according to several criteria [146]; namely, whether

they are stochastic or deterministic; steady state or dynamic; terminating or non-terminating; discrete,

continuous or hybrid; and local or distributed. Our area of interest concentrates on discrete-event

simulators. There are several well-known discrete-event opportunistic networks simulators such as ONE

[145], NS2 [146], OMNeT++ [144], CCPAC [151], and OPNET [152] that offer significant features such

as: 1) a graphical user interface for the initial configuration of topology setup and parameter settings;

2) support for modeling different network-specific hardware; and 3) an interface to model, graph, and animate resulting output [146] [153].

However, these simulators fall short in the following respects: 1) social-aware features such as generating or importing social graphs; 2) interest-aware features such as synthesizing or importing interest vectors, or considering the significance of the intermediate nodes’ interest as an incentive to forward messages; 3) power-aware features such as simulating power consumption, simulating real usage profiles, and fully implementing the power consumption patterns of mobile nodes [146]; 4) the implementation of many state-of-the-art social-aware opportunistic forwarding algorithms such as SocialCast [24] and ProfileCast [142]; 5) scalability, since OPNET and NS2 suffer from object-oriented scalability problems [146]; 6) device realism, since the packet formats, energy models, MAC protocols, and sensing hardware modelled in NS2 all differ from those found in most wireless devices [146]; and 7) the capacity to import real mobility traces and mobility models, except for the opportunistic networking extension of OMNeT++ [144]. It is worth mentioning, however, that the ONE simulator allows for the generation of node movement from an external program or from a real-world GPS trace, but users have to convert trace files into a format suitable for the External Movement module.

In this chapter, we present SAROS, an event-driven opportunistic network simulator, which enables

us to overcome many of these drawbacks. This work has been previously presented in a conference paper

[154].

4.2 Modular Architecture

The SAROS simulator, as shown in Figure 4.1, is constructed from a set of modules simulating content forwarding in a mobile opportunistic network within a certain area, namely:

1) An interest distribution simulation module that simulates the users’ interest profile that is stored on their mobile devices;

2) A power simulation module that simulates the mobile devices’ power distribution and consumption;

3) A mobility simulation module that simulates user mobility and users’ encounters with other users along their path, and that also enables importing a set of real mobility traces accompanied by their interest and social graphs;

4) A forwarding algorithms simulation module that implements a variety of opportunistic forwarding algorithms to simulate message delivery;

5) An evaluation module that implements a set of performance metrics;

6) An output module that produces the simulation results in the form of graphs and text files.

The remainder of the section will discuss these modules in greater detail.

4.2.1 Simulation of Interest Distributions

To simulate the variance in users’ interest in a certain message, SAROS sets thresholds for the Similarity Interest, which is the similarity between the interest vector of the user and that of the message. The Similarity Interest of the sender acts as a knob that controls what is considered to be an acceptable number of contacted uninterested users. As such, it acts as an initial cutoff point in the forwarder selection process. Thus, it is a parameter in the simulator.
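The cutoff role of the Similarity Interest can be sketched as follows. The text does not state which similarity measure SAROS uses, so cosine similarity is an assumption made here purely for illustration, and the function names and the threshold argument are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length interest vectors.

    Assumption: the thesis does not name the similarity measure;
    cosine similarity is used here only as a common stand-in.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty interest profile matches nothing
    return dot / (norm_u * norm_v)


def passes_interest_cutoff(user_vec, msg_vec, threshold):
    """Initial cutoff: keep only candidates at or above the threshold."""
    return cosine_similarity(user_vec, msg_vec) >= threshold
```

Raising the threshold shrinks the set of candidate forwarders and so reduces the number of uninterested users contacted, which is the knob behaviour described above.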

The interest distribution simulation module imports interest distributions from real mobility traces.

Alternatively, it can synthesize interest distributions as follows:

Normal Interest Distribution:

The simulator classifies the users based on their interest in the forwarded content and according to a discrete normal interest distribution where the destination set covers 2% of the community, the interested forwarders set covers 48%, while the remaining 50% are uninterested nodes. Here, the interest distribution of the population is a discrete normal distribution.

Figure 4.1: SAROS Simulator Modular Architecture

Discrete Uniform Distribution:

The simulator equally distributes the users into 11 categories with varying interest levels in the range [0, 1]. Accordingly, the destination set constitutes 18% of the mobile users’ population while the interested forwarders set covers 36% of the population. The remaining 46% of the population are uninterested. Here, the interest distribution of the population is a discrete uniform distribution.

Two Disjoint Interest subgraphs:

The simulator divides the users into two separate interest graphs: 20% of the community are destination nodes while the remaining 80% of the community are uninterested nodes. This kind of distribution does not feature interested forwarders, and as such poses a challenge for the algorithm in the forwarding process given the absence of intermediaries.

Location-based Interest:

To simulate location-based interest distribution, the simulator is given the coordinates of a certain area within the simulation environment. All users who are found in this place at time zero are identified as interested users. In other words, all users located within the selected area at the beginning of the simulation are marked as interested users.
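The synthesized interest distributions described above can be sketched as a simple partitioning of the user population. The shares for the normal and disjoint cases are those quoted in the text; the function names, the shuffling, the mapping of the 11 uniform categories to the levels 0.0, 0.1, ..., 1.0, and the rectangular shape of the selected area are assumptions for illustration.

```python
import random

# Shares quoted in the text for two of the synthesized distributions.
NORMAL_SHARES = {"destination": 0.02, "interested_forwarder": 0.48,
                 "uninterested": 0.50}
DISJOINT_SHARES = {"destination": 0.20, "uninterested": 0.80}


def partition_by_share(user_ids, shares, seed=0):
    """Shuffle the users and split them into labelled groups by share."""
    rng = random.Random(seed)
    users = list(user_ids)
    rng.shuffle(users)
    groups = list(shares.items())
    labels, start = {}, 0
    for k, (label, share) in enumerate(groups):
        is_last = k == len(groups) - 1
        # The last group takes the remainder so rounding never drops a user.
        end = len(users) if is_last else start + round(share * len(users))
        for user in users[start:end]:
            labels[user] = label
        start = end
    return labels


def uniform_interest_levels(user_ids, categories=11):
    """Spread users evenly over `categories` interest levels in [0, 1].

    Assumption: category k maps to the level k / (categories - 1).
    """
    return {u: (idx % categories) / (categories - 1)
            for idx, u in enumerate(user_ids)}


def location_based_interest(positions_at_t0, area):
    """Mark users inside `area` = (x_min, y_min, x_max, y_max) at time zero."""
    x_min, y_min, x_max, y_max = area
    return {u: (x_min <= x <= x_max and y_min <= y <= y_max)
            for u, (x, y) in positions_at_t0.items()}
```

For a population of 100 users, `partition_by_share(range(100), NORMAL_SHARES)` yields 2 destinations, 48 interested forwarders and 50 uninterested nodes.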

4.2.2 Realistic Power Consumption Modeling

The power simulation module generates realistic power consumption values and various battery level distributions upon which the simulation runs are based. This module generates the initial distribution of battery levels before the simulations begin. It also implements the battery discharge and charge recovery model [155], imports the usage profiles of mobile users [4], and randomly distributes these profiles among the mobile nodes. Finally, the module simulates various battery depletion rates.

Initial Battery Distributions:

Different battery distributions are simulated for various purposes.

Full Battery Distribution: All the mobile nodes start with full battery levels in order to extract the pure effect of each algorithm on the nodes’ power in terms of consumption.

Normal Battery Distribution: The mobile nodes’ battery levels follow a discrete normal distribution in order to resemble the variations in battery levels that exist in real-life communities. The result of this distribution is stored in an external file so that it can be used when running in batch mode instead of interactive mode.

Heatmap Battery Distribution: The simulations may use a distribution that is based on a real dataset of the remaining battery capacity as recorded by [1] for 10 mobile nodes over 24 hours on a per-hour basis. The recordings of the 10 nodes, represented in a heatmap chart, are reproduced in Figure 4.2. This dataset is imported when a real dataset of battery distributions is required for the experiment. These recorded values are randomly distributed among the simulated nodes at the beginning of the simulation run. This battery consumption data is also utilized when simulating real-based depletion rate models. In such cases, each node is assigned a depletion rate based on its initial assigned battery capacity and its remaining battery capacity after an hour as per the heatmap chart.

Figure 4.2: Average Remaining Battery Capacity per hour [1]

Battery Discharge Simulation:

The simulator models the discharge behavior of the batteries of the simulated mobile devices. This simulation uses either a simple discharge model or the Kinetic Battery model.

Simple Battery Model: The Simple Battery Model reduces the battery’s charge capacity following any wireless communication between mobile nodes, such as scanning other nodes, or when forwarding or receiving messages. In the idle state, the battery does not recover any charges since this model does not consider the recovery effect of the Li-ion battery.

KiBaM Model: Alternatively, the simulator applies the Kinetic Battery Model (KiBaM) [2] which is a well-known analytical model based on the Ni-MH battery charges’ chemical kinetic process. It models the battery as 2 wells of charge, namely, the available-charge well which supplies electrons directly to the load, and the bound-charge well which supplies electrons only to the available-charge well. The simulator implements the stochastic modified KiBaM model [2] that models the battery charges in a 2-charge-well model as shown in Figure 4.3. This model considers the recovery and rate capacity effect for batteries with the following parameters: the total charge of the battery, the capacity ratio between the 2 charge wells, the 2 wells’ heights and the battery conductance. The parameters required for establishing the stochastic modified KiBaM model are provided in Table 4.1.

Table 4.1: Parameters of the KiBaM model

| Parameter | Description | Range |
|---|---|---|
| N | Total charge of the battery | per battery brand |
| c | Capacity ratio. c is the fraction of total battery charge in the available-charge well | (0 - 1) |
| I | Charges drawn from the available-charge well at one timeslot | |
| J | Charges transferred by the bound-charge well to the available-charge well at one timeslot | |
| i, j | Amount of charges in the 2 charge wells | N |
| h1, h2 | Heights of the 2 charge wells | h1 * c = i, h2 * (1-c) = j |
| k' | Conductance | per battery brand |
| t | Length of the current idle slot, i.e. time since some current was drawn from the battery previous to the current moment | 0 - simulation time |
| Q | Quanta of charge recovered | 0 - all charges |
| qx | The probability that in one timeslot, x charge units are demanded | 0 - 1 |

According to equations 4.1 and 4.2, the charges i, j of the 2 wells are

$$h_1 \cdot c = i \tag{4.1}$$

$$h_2 \cdot (1 - c) = j \tag{4.2}$$

The modified KiBaM models the battery using a three-dimensional Markov chain with the three state parameters (i, j, t). At t = 0, that is, when the current is being drawn continuously, the state is computed as per equation 4.3

$$(i, j, 0) \rightarrow (i - I + J,\ j - J,\ 0) \quad \text{where } J = k' \cdot h_2 \cdot (h_2 - h_1) \tag{4.3}$$
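The busy-slot transition of equation 4.3 can be sketched in a few lines, using the well heights from equations 4.1 and 4.2. The function name is hypothetical, and the clamping of J to the interval [0, j] is an assumption added so that the sketch never transfers more charge than the bound-charge well holds; the source does not state how this case is handled.

```python
def kibam_draw_step(i, j, I, c, k_prime):
    """One busy-slot transition (i, j, 0) -> (i - I + J, j - J, 0).

    i, j    : charges in the available-charge and bound-charge wells
    I       : charge drawn from the available-charge well in this slot
    c       : capacity ratio, 0 < c < 1
    k_prime : conductance k' between the two wells
    """
    h1 = i / c             # Eq. (4.1): h1 * c = i
    h2 = j / (1.0 - c)     # Eq. (4.2): h2 * (1 - c) = j
    J = k_prime * h2 * (h2 - h1)   # Eq. (4.3)
    # Assumption: the flow is non-negative and bounded by the bound charge.
    J = max(0.0, min(J, j))
    return i - I + J, j - J, 0
```

For example, with i = 40, j = 60, c = 0.5, k' = 0.001 and I = 5, the heights are h1 = 80 and h2 = 120, so J = 0.001 * 120 * 40 = 4.8; the wells move to roughly (39.8, 55.2) and the total charge drops by exactly the I = 5 units drawn.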

The transitions that are possible during an idle slot and their corresponding probabilities are listed in

Table 4.2.

Table 4.2: Possible Transitions in the idle slot

| Transition | Probability |
|---|---|
| (i, j, t) → (i + Q, j - Q, t + 1) | q0 * p(t) |
| (i, j, t) → (i, j, t + 1) | q0 * (1 - p(t)) |
| (i, j, t) → (i - I + J, j - J, 0) | qI |

Figure 4.3: The Stochastic Modified KiBaM Model [2]

It is worth mentioning that, for Li-ion cells, a shortcoming of the kinetic battery model is that solid-state diffusion is not taken into account. In the solid phase, the application of an external driving force makes the diffusing particles experience a drift motion in addition to random diffusion. This effect of diffusion drift of charge carriers is discussed in detail in the specialized electrochemical literature on all-solid batteries and is known to hamper the performance of the units. This is considered a limitation of the kinetic battery model, and our simulator does not simulate this performance degradation.

Mobile Users’ Usage Profiles:

Several usage patterns are simulated and randomly distributed among users. The simulator applies the usage profiles that were previously defined and recorded in another research work [4], in which the authors study and categorize the energy usage and battery lifetime into five usage profiles. The 5 defined usage profiles are:

• Suspend: the baseline profile of a device which is on standby, without placing or receiving calls or messages.

• Casual: a user profile of a device which conducts a small number of daily voice calls and text messages.

• Regular: the profile of a user who spends an extended time listening to music or podcasts, combined with more lengthy or frequent phone calls, messaging and occasional emailing.

• Business: a user profile of a device with extended phone calls, email use and some web browsing.

• Portable Media Device (PMD): a user profile of a device with extensive media playback.

According to their study, the time and power consumption per usage profile is illustrated in Table 4.3:

| Profile | Consumed Time (in minutes) | Battery Lifetime (in hours) |
|---|---|---|
| Suspend | 0 | 49 |
| Casual | 15 SMS, 15 Calls | 40 |
| Regular | 30 SMS, 60 Audio, 30 Calls, 15 Web Browsing, 15 Email | 27 |
| Business | 30 SMS, 60 Calls, 30 Web Browsing, 60 Email | 21 |
| PMD | 60 Video, 180 Audio | 29 |

Table 4.3: Usage Profile Time Consumption and Battery Life [4]

The power consumption model applied for activities conducted within these usage profiles is shown in Table 4.4. Accordingly, for a 1-hour experiment, the energy consumed by an activity is calculated by dividing the daily activity time by 24 hours and multiplying the resulting per-hour activity time by the power consumption of this activity. The result is the energy consumed due to this activity per hour. The consumed energy amount is distributed evenly for the usage profile within the hour duration. For instance, the audio playback activity consumes 320 mW. According to the PMD profile, the node plays audio for 180 minutes per day, i.e. 12.5% of the day (1 day = 1440 minutes). Thus, it plays 7.5 minutes per hour. During the 1-hour experiment, for the PMD profile, we deduct 144000 millijoules, i.e. 144 joules (320 mW * 7.5 min * 60 sec), at a random time within the hour to indicate audio playback action. Similarly, we deduct 68025 millijoules (about 68 joules) for video playback action at another random time within the simulation hour.

| Activity | Consumed Power (in mW) |
|---|---|
| Audio playback | 320 |
| Video playback | 453.5 |
| SMS | 302.2 |
| Call | 1054.3 |
| Email | 432.4 |
| Web Browsing | 352.8 |

Table 4.4: Power Consumption of Various Activities of the Usage Profiles [4]
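The per-hour energy calculation described above can be reproduced directly from Tables 4.3 and 4.4. The sketch below is illustrative only; the dictionary and function names are hypothetical, and the result is expressed in millijoules because power is given in mW and time in seconds.

```python
# Power per activity in mW (Table 4.4).
ACTIVITY_POWER_MW = {
    "Audio": 320.0, "Video": 453.5, "SMS": 302.2,
    "Calls": 1054.3, "Email": 432.4, "Web Browsing": 352.8,
}

# Daily activity time in minutes per usage profile (Table 4.3).
PROFILE_MINUTES_PER_DAY = {
    "Suspend": {},
    "Casual": {"SMS": 15, "Calls": 15},
    "Regular": {"SMS": 30, "Audio": 60, "Calls": 30,
                "Web Browsing": 15, "Email": 15},
    "Business": {"SMS": 30, "Calls": 60, "Web Browsing": 30, "Email": 60},
    "PMD": {"Video": 60, "Audio": 180},
}


def hourly_energy_mj(profile):
    """Energy per activity (in mJ) consumed during one simulated hour."""
    energy = {}
    for activity, minutes_per_day in PROFILE_MINUTES_PER_DAY[profile].items():
        # Daily minutes spread over 24 hours, converted to seconds.
        seconds_per_hour = minutes_per_day / 24 * 60
        energy[activity] = ACTIVITY_POWER_MW[activity] * seconds_per_hour
    return energy
```

For the PMD profile this gives 320 mW * 450 s = 144000 mJ for audio and 453.5 mW * 150 s = 68025 mJ for video, matching the worked figures in the text.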

Mobile Brands’ Power Consumption:

The simulator imports the power consumption values of four popular phone brands studied by [5]. These values are used when making the power consumption calculations for the modes of WiFi/Bluetooth scanning, forward/receive, and idle. These power consumption values are listed in Table 4.5.

Depletion Rate Distributions:

The simulator implements the following depletion rate distributions:

Real Dataset Depletion Rate: In the real battery distribution dataset environment, the depletion rate is computed based on the difference between the given battery distribution at the beginning and at the end of the hour.

Usage Profile-based Depletion Rate: The depletion rate is calculated based on the power consumption as per the usage profiles imported from [4].

Random Depletion Rate: The simulator randomly and uniformly distributes the depletion rate among the nodes within the range of values [0% - 100%].
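The three depletion rate options can be sketched as follows. The function names are hypothetical, rates are expressed in percent of battery capacity per hour, and the clamping of a negative difference to zero (for a node whose recorded level rose during the hour) is an assumption not stated in the text.

```python
import random


def dataset_depletion_rate(level_start, level_after_hour):
    """Real-dataset option: drop in battery level (%) over one hour."""
    # Assumption: a node that was recharged is treated as not depleting.
    return max(0.0, level_start - level_after_hour)


def profile_depletion_rate(hourly_energy_mj, battery_capacity_mj):
    """Usage-profile option: share of capacity (%) consumed per hour."""
    return 100.0 * hourly_energy_mj / battery_capacity_mj


def random_depletion_rates(node_ids, seed=0):
    """Random option: rates drawn uniformly from the range [0%, 100%]."""
    rng = random.Random(seed)
    return {node: rng.uniform(0.0, 100.0) for node in node_ids}
```

Passing a fixed seed keeps the random assignment reproducible across simulation runs, which is convenient when comparing forwarding algorithms under identical conditions.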

4.2.3 Mobility Simulation Module

The mobility simulation module imports one mobility model, and also imports and manipulates various real mobility traces. These real traces are recorded from a selection of environments including a mall environment, conference environments, and university campus environments.

Table 4.5: Four Mobile Brands Power Consumption Values [5]

| Mobile Brand | Mode | Power Consumption (in mW) |
|---|---|---|
| Samsung i900 Omnia phone (Battery 1440 mAh) | Idle | idle=2 |
| | WiFi (TCP mode) | Scan=664, forward=1600, receive=1496, send throughput=1232, receive throughput=1336 |
| | Bluetooth (RFCOMM mode) | Scan=173, discovered node=29, forward=520, receive=456, send throughput=137, receive throughput=128 |
| HTC Diamond 2 T5353 (Battery 1100 mAh) | Idle | idle=60 |
| | WiFi (TCP mode) | Scan=1284, forward=1548, receive=1512, send throughput=1034, receive throughput=1294 |
| | Bluetooth (RFCOMM mode) | Scan=676, discovered node=60, forward=748, receive=708, send throughput=115, receive throughput=135 |
| Galaxy i7500 (Battery 1500 mAh) | Idle | idle=72 |
| | WiFi (TCP mode) | Scan=72, forward=992, receive=704, send throughput=830.4, receive throughput=752.4 |
| | Bluetooth (RFCOMM mode) | Scan=160, discovered node=88, forward=No value, receive=No value, send throughput=No value, receive throughput=No value |
| Spica (Battery 1500 mAh) | Idle | idle=32 |
| | WiFi (TCP mode) | Scan=32, forward=992, receive=936, send throughput=898.4, receive throughput=742.4 |
| | Bluetooth (RFCOMM mode) | Scan=152, discovered node=136, forward=584, receive=376, send throughput=115.7, receive throughput=112.8 |

Note: The authors did not provide some values for the Galaxy i7500 as they claim that, due to a limitation of the operating system, the Galaxy phone does not support Bluetooth-based file transfer.

SLAW Mobility Model:

The Self-similar Least Action Walk (SLAW) mobility model [137] is a mobility model for persons on foot that can produce synthetic pedestrian mobility traces. SLAW Trace Generator [148] is a MATLAB-based program that produces mobility traces that are effective in representing social contexts present among people sharing common interests or those in a single community such as university campus, companies and theme parks. The social contexts are typically common gathering places that most people visit during their daily lives such as student unions, dormitories, street malls and restaurants. SLAW expresses the mobility patterns involving these contexts by fractal waypoints and heavy-tail flights on top of the waypoints [137]. The authors have shown that SLAW effectively expresses mobility patterns arising from people with some common interests or within a single community like students in the same university campus or people in theme parks or other such places where people gather around a common purpose.

Table 4.6: Parameters of the SLAW Trace Generator

| Parameter Name | Description | Values |
|---|---|---|
| dist-alpha | distance alpha (1 < dist-alpha < 6) | 3 |
| num-user | number of mobile users | 100 |
| size-max | a side of a right square simulation area | 1000 |
| n-wp | number of waypoints | 1000 |
| v-Hurst | Hurst parameter for self-similarity of waypoints (0.5 < v-Hurst < 1) | 0.75 or 0.9 |
| Thours | Total hours of trace generation (hours) | 1 or 2 |
| B-range | clustering range (meter). If the waypoints are in B-range, they are considered as belonging to the same cluster | 30 |
| beta | Levy exponent for pause time, 0 < beta <= 2 | 1 |
| MIN-PAUSE | minimum pause time (second) | 30 |
| MAX-PAUSE | maximum pause time (second) | 3600 |

We have made a small modification in the internal parameters of the generator in order to make it produce traces per second instead of per-minute traces. The generator requires a set of parameters in order to produce the human mobility traces. Table 4.6 lists the parameters we submitted to the generator in order to produce the traces we used in our simulation runs.

Import of Real Mobility Traces:

The simulator imports several real datasets, but it has to manipulate their data format so that it can

fit the format accepted by the simulator. Here are the details and manipulation process of the imported datasets.

SIGCOMM09 Dataset: To further validate the algorithms’ performance using real social-based mobility traces, the SAROS simulator imports the mobility traces, interests and friendship graphs gathered during the SIGCOMM 2009 conference [149]. In this conference, 76 participants (out of the 4700 detected users) were handed smart phones and were asked to use the installed MobiClique application for mobile social networking during the conference. Their social information, namely the list of friends and interests, was collected from their Facebook social profile. The dataset also includes the real social information of the 76 participants, namely friend lists added during the conference, interests added during the conference, and answers to questionnaires. Furthermore, this dataset provides the encounter traces for the 4700 detected users, which totaled 12037 user encounters during the 4 days of the conference.

Although there was no social information about the remaining conference attendees whose encounters were recorded, SAROS can extrapolate, with acceptable precision, the social and interest profiles of all the detected users based on the program schedule of the conference, the map of the session rooms, and the recorded mobility traces. Finally, the simulator randomly selects message senders out of the 4700 users.

INFOCOM06 Dataset: In addition, the simulator imports real mobility traces gathered during

the INFOCOM 2006 conference [150]. In that experiment, 20 static iMote nodes were installed to detect

mobile devices within a 100m range (17 static nodes in the conference rooms and 3 nodes placed in

the elevators), and 78 conference attendees were given iMote devices with a 30m range. These users

filled in a questionnaire in which they were asked to explicitly state their interests. The total number

of detected nodes was 4704 including the nodes of the users who filled the questionnaire. The technical

program included 9 tracks (including one for the opening session) located in a subset of the 20 monitored

locations.

SAROS imports these traces and synthetically generates user profiles, friend lists, and interest feature vectors for the detected users. In manipulating the conference data, the simulator considers the 20 locations to be equivalent to 20 different areas of interest. The simulator collects the frequency of residence for each of the 4704 users in each of the 20 locations. This information is then used to ascertain the user’s interest in each track, where his/her interest in a specific location n equals the ratio of the frequency of residence in that location to the frequency of residence in all 20 locations. Alternatively, the simulator collects the total duration of residence of each user per location. Then, the user’s interest in a location is the ratio of his/her duration of residence in this location to his/her duration of residence in all 20 locations. After extra refinement made by the simulator, it was found that 75.78% of the users were undetected by any of the 20 locations; thus, it was assumed that these users were not interested in any of the 20 monitored interest areas. Further manipulation through the extrapolation of interest based on users’ mobility, the program schedule, and the sessions’ locations enables the simulator to deduce the remaining users’ interest. To construct a friendship graph, the simulator considers any 2 users to be friends if they share interest in at least m of the 20 locations. Finally, the message sources’ locations are picked at random from the static nodes’ locations. The simulation parameters used in the simulation runs with the INFOCOM dataset are listed in Table 4.7.
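The interest extraction and friendship rule described above can be sketched as follows. This is a minimal illustration, not the SAROS code: the function names, the toy data, and the reading of "share interest" as "both users have a non-zero interest ratio in the location" are assumptions made for the example.

```python
from itertools import combinations


def interest_vector(residence):
    """Turn per-location residence counts (or durations) into interest ratios."""
    total = sum(residence)
    if total == 0:
        # Never detected at a monitored location: no inferred interest.
        return [0.0] * len(residence)
    return [value / total for value in residence]


def friendship_edges(interests, m, min_interest=0.0):
    """Pairs of users who share interest in at least m locations.

    Assumption: two users 'share interest' in a location when both of
    their ratios for that location exceed min_interest.
    """
    edges = set()
    for (u, iu), (v, iv) in combinations(sorted(interests.items()), 2):
        shared = sum(1 for a, b in zip(iu, iv)
                     if a > min_interest and b > min_interest)
        if shared >= m:
            edges.add((u, v))
    return edges


# Toy data with 4 locations instead of 20 (hypothetical values).
counts = {
    "u1": [3, 1, 0, 0],
    "u2": [2, 2, 0, 0],
    "u3": [0, 0, 0, 5],
    "u4": [0, 0, 0, 0],
}
interests = {user: interest_vector(c) for user, c in counts.items()}
friends = friendship_edges(interests, m=2)
```

The same `interest_vector` function covers both variants in the text, since frequency of residence and duration of residence are normalized in the same way.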

| Parameter | Value |
|---|---|
| No. of users | 4724 users in 4 days |
| No. of days | 4 days |
| No. of locations | 20 locations (17 static iMote devices and 3 in the elevators) |
| Proximity range | 100m for the 20 locations, 30m for the other users |
| Intervals between readings | Not mentioned |
| Conference area | Not mentioned; the conference plan has no meter |
| No. of interests | As mentioned in the filled questionnaire |

Table 4.7: Parameters of INFOCOM Dataset

Mall Environment Mobility Traces: The simulator imports and manipulates real encounter traces in a shopping mall environment that were collected and published in another study [136]. The authors conducted an experiment which involved placing static mobile devices in 13 shops. Also, 7 employees working in some of these shops were handed mobile devices. The experiment lasted for 6 days and collected the timestamp and location of encounters with any mobile device that came within a 30m Bluetooth range. The collected dataset includes encounters with the other employees participating in the experiment from 9am till 5pm. The detected unique devices amounted to 745 in the 6 days, with an average of 168 detected devices per day. The Bluetooth readings were recorded every 134 seconds. It is worth mentioning that the mall area is small (around 125m x 85m). Table 4.8 lists the most prominent information about this dataset.

The simulator imports the traces and converts them into 1-day trace files. These trace files are then used by the simulator to extract the required number of hours of mobility traces for use within the simulation runs.

| Parameter | Value |
|---|---|
| No. of users | 745 users in 6 days, with an average of 168 users per day (ranging from 137 to 203 users per day) |
| No. of days | 6 days |
| Time duration per day | 9 am - 9 pm |
| No. of shops | 13 |
| Bluetooth range | 30m |
| Intervals between encounter readings | 134 sec. |
| Mall area | 124.128m x 84.476m |

Table 4.8: Mall Environment Dataset Parameters

St. Andrews University Dataset:

This is a traceset of a privacy study, including encounters, sharing preferences, and accelerometer readings, that was conducted in two universities [156]; two separate runs were performed at the University of St. Andrews, while another two runs were performed at University College London, with 20 students participating in each run. The mobile phones handed to the participants automatically collected the users’ locations and uploaded them to a server. Participants could choose the information to be disclosed on Facebook, and to whom it could be disclosed. Participants also received ESM questions on the mobile phone (concerning activity, sharing choices, and privacy concerns), which were also answered through the same device. The experiments took place from the 22nd of April 2010 till the 20th of Nov. 2010.

Thus, the simulator imports this dataset as a representation of a university environment with students’ mobility traces and their social networks. It includes user encounters, friend lists/social groups, events that users respond to, and users’ answers to interview questions.

AUC Wireless Traces Dataset:

We requested permission from the AUC Institutional Review Board (IRB) to collect the on-campus mobility traces of 100 AUC students and/or staff members for a whole semester. Following their approval, a consent form (please refer to appendix E to view the consent form) was used to collect the consent of 100 AUC students. The demographics of the participants in this experiment are as follows: 44 males and 56 females. The students were made up of 12 graduates (G) and 88 undergraduates (UG) distributed among 19 majors as detailed in Table 4.9. We also categorized them into 23 social groups based on where they were first encountered by the researcher in terms of their gathering spot with friends, their presence at the booth of a particular club, and other grouping features such as their presence in a specific classroom or lab.

| No. | Major | Class |
|---|---|---|
| 10 | Computer Science | G |
| 24 | Computer Science | UG |
| 1 | Chemistry | G |
| 1 | Communications | UG |
| 1 | Gender Studies | G |
| 17 | Economics | UG |
| 10 | Political Science | UG |
| 2 | MICT | UG |
| 6 | Undeclared | UG |
| 1 | Petroleum Eng. | UG |
| 1 | Accounting | UG |
| 2 | Biology | UG |
| 2 | Electronic Eng. | UG |
| 1 | Graphics Design | UG |
| 2 | Journalism | UG |
| 5 | Architectural Eng. | UG |
| 1 | Physics | UG |
| 7 | Business | UG |
| 5 | Mechanical Eng. | UG |

Table 4.9: Grouping of the Participants by Major in the AUC Mobility Traces

Wireless mobility traces were extracted over the course of several months, with the help of a team of engineers from the University Technology Infrastructure unit (UTI), namely Engineer Ahmed Abdel Aziz and Engineer Islam Abdel Azim, and a team of engineers from the Computer Science and Engineering Department, namely Engineer Faten Nakhla and Dr. Sinout. With their close support, the AUC wireless network connection traces from the Orion Trap server were downloaded during the Fall 2012 semester.

A Visual C# program was developed to process the SNMP records and to encrypt the usernames extracted from the traces left by AUC members who had logged in through the wireless network, except for those who granted permission to decrypt their usernames. That is, based on the 100 consents collected, it was possible to decrypt the usernames of the 100 members from the whole set of wireless traces.

Using the currently available AUC mobility data, the traces were summarized to the level of records and were grouped per user. The daily trace files were arranged according to their size, and the user density in these traces was studied to produce the user density heatmap shown in Figure 4.4. From this heatmap, it is clear that the active duration of wireless access ranges from 8 am till 5 pm. Further study could be applied to extract the groups’ mobility patterns.

Figure 4.4: User Density Heatmap extracted from the AUC Wireless Access Traces

Currently, it is possible to infer general social groups as derived from the process of distributing and collecting the consent forms, such as small friendship groups, classmates (based on where they were first encountered by the researcher), or same study major (as stated on the form), as per the 23 groups. Random social profiles and interest vectors can also be generated regardless of these inferred social group profiles.

Figure 4.5: Space Syntax Representation of the AUC map: (a) AUC Map, (b) AUC Buildings Map, (c) AUC Axial Map, (d) Space Syntax Graph

The internal maps of AUC buildings, including the locations of access points, were obtained (as shown in Figure 4.5b). Also, the whole AUC map was extracted from Google (as shown in Figure 4.5a) in order to detect the pathways and open areas ("streets") connecting the buildings. From these maps, street coordinates and building coordinates were extracted. Also, it was possible to identify and record the streets closest to each access point. This information is essential in the process of extracting the axial map shown in Figure 4.5c, for generating the Space Syntax graph shown in Figure 4.5d, and when calculating the integration value, attraction value, Location Index and Popularity Index of the streets and the access points.

4.2.4 Implemented Algorithms

The forwarding algorithms module implements a wide variety of opportunistic forwarding algorithms. These algorithms vary from social-oblivious forwarding algorithms to interest and power aware opportunistic forwarding algorithms. All the case study versions detailed in section 5 are implemented in the SAROS simulator. In addition, the following benchmark algorithms are implemented in the simulator:

The Wait Destination Algorithm

This is a social and power-oblivious opportunistic algorithm that relies on the source node in delivering the message to the destination node. That is, the source node retains the message in its buffer and does not give copies to any forwarder node, but rather seeks the opportunity to encounter the destination node itself in order to deliver the message. This algorithm incurs the least cost as all the message forwards are actually message delivery actions. However, this algorithm does not guarantee a high delivery ratio or an adequate delay.

The Epidemic Algorithm

The Epidemic algorithm [157] is a social and power-oblivious opportunistic forwarding algorithm that broadcasts the message to neighboring nodes. The epidemic spread of the message among the encountered nodes reduces the delay and maximizes the delivery ratio, but at the maximum expense. Epidemic is considered a benchmark in opportunistic forwarding.
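The contrast between these two benchmarks can be sketched with a toy encounter list. This is a minimal illustration under stated assumptions: the `simulate` function, the policy names, and the encounter format (ordered pairs of node names) are hypothetical and are not taken from SAROS.

```python
def simulate(encounters, source, destinations, policy):
    """Replay encounters and return (delivered destinations, replica count)."""
    holders = {source}      # nodes currently holding a copy
    delivered = set()
    cost = 0                # number of message forwards (replicas created)
    for a, b in encounters:
        for holder, neighbor in ((a, b), (b, a)):
            if holder not in holders or neighbor in holders:
                continue
            if policy == "wait_destination":
                # Only the source forwards, and only to a destination node.
                forward = holder == source and neighbor in destinations
            elif policy == "epidemic":
                # Every holder copies the message to every encountered node.
                forward = True
            else:
                raise ValueError(f"unknown policy: {policy}")
            if forward:
                holders.add(neighbor)
                cost += 1
                if neighbor in destinations:
                    delivered.add(neighbor)
    return delivered, cost


encounters = [("s", "x"), ("x", "d"), ("s", "d")]
wait_result = simulate(encounters, "s", {"d"}, "wait_destination")
epidemic_result = simulate(encounters, "s", {"d"}, "epidemic")
```

In this toy run both policies reach the destination, but Epidemic does so at the second encounter with two replicas, while Wait Destination needs the third encounter and a single forward.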

PeopleRank and Its Proposed Versions

The PeopleRank algorithm is a power-oblivious, social-aware message forwarding algorithm that forwards messages through the socially popular nodes [22].

The simulator implements the contact-aware PeopleRank version (CA-PeR), which is discussed in detail in section 5.1. The simulator enables integrating interest and power awareness into the PeopleRank algorithm to simulate the IPeR algorithm [113] and the PIPeR algorithm versions [158] that are proposed in section 5.1.

ProfileCast

The ProfileCast paradigm [142] provides a service that delivers the message to all recipients who match the specified profile. It is a power-oblivious, social-aware paradigm. It considers the history of mobility as the behavior profile (BP) in ranking carriers, and the similarity among these profiles directs the message forwarding process towards the targets who are defined by their profiles.

This paradigm relies on a form of broadcasting that is controlled by the profile of carriers and destination nodes (targets). The authors do not differentiate between the profile match of the carriers and that of the target users. Each user’s mobile device stores his mobility profile as an association matrix holding the percentage of time the user has spent in a set of locations over the course of a week. The columns indicate the places and the rows indicate the distribution of time (during the day) that the user spends in each location. This matrix is summarized in Eigen-behavior vectors and Eigen-behavior values associated with weights for these values. Upon encounter, the two users exchange these vectors, values and weights and run a behavior similarity metric. If the metric exceeds a certain threshold, the mobile nodes exchange messages. This is based on a high similarity in their behavioral profiles, which indicates high similarity in interest. The considered profile can be in terms of interest, social affiliation or behavioral patterns.

The system applies a singular value decomposition (SVD) to summarize each user’s association matrix of his mobility pattern in Eigen-behavior vectors with their corresponding weights. When two users meet, they exchange their summarized mobility profiles and decide on the spot whether or not they are similar. The similarity index between users U and V, Sim(U, V), is calculated as the weighted sum of inner products of the Eigen-behavior vectors.
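A weighted sum of inner products can be sketched as below. The exact form is an assumption made for illustration: each profile is taken to be a list of (weight, unit-length Eigen-behavior vector) pairs, the sum runs over all vector pairs of the two users, and the absolute value of each inner product is used. The SVD step that would produce the vectors and weights is not shown.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def similarity(profile_u, profile_v):
    """Assumed form: Sim(U, V) = sum_i sum_j w_ui * w_vj * |u_i . v_j|.

    Each profile is a list of (weight, vector) pairs, with weights summing
    to 1 and vectors of unit length (both are assumptions of this sketch).
    """
    return sum(wu * wv * abs(dot(u, v))
               for wu, u in profile_u
               for wv, v in profile_v)


# Hypothetical two-location profiles.
profile_a = [(1.0, [1.0, 0.0])]
profile_b = [(1.0, [0.0, 1.0])]
profile_c = [(0.8, [1.0, 0.0]), (0.2, [0.0, 1.0])]

sim_aa = similarity(profile_a, profile_a)  # identical behavior
sim_ab = similarity(profile_a, profile_b)  # orthogonal behavior
sim_ac = similarity(profile_a, profile_c)  # partially overlapping behavior
```

Under these assumptions the index lies between 0 (orthogonal profiles) and 1 (identical profiles), which is what allows it to be compared against fixed thresholds.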

This paradigm sets the following assumptions: 1) The groups are implicitly defined by the intrinsic properties of the users, and revealed by the way the users utilize the network. 2) Without relying on any established infrastructure or registry, the paradigm can reach the targeted groups, which are defined according to their underlying properties. This philosophy leverages these "random" encounter events as short-cuts so that it can navigate within the behavioral space more efficiently while hopping across that space in order to reach dissimilar nodes with relatively few message transmissions.

The simulator implements the CSI:D protocol² of the ProfileCast paradigm. CSI:D selects forwarders who share a common interest (that is, based on a similar mobility pattern) with the sender, while the target node’s interest is orthogonal to that. Its philosophy is based on the assumption that nodes with high similarity in their behavioral profiles (BP) are almost guaranteed to encounter each other; thus, there is no need for each of them to keep their own copy of the message during the process of disseminating it.

Thus, the message holders select the next proper message holder that satisfies these conditions:

1) The candidate should not have a behavioral profile (BP) that is similar to that of the current message holder within the neighboring threshold th_nbr. Thus, the selected nodes are those satisfying the inequality 4.4:

$$\text{Similarity}(BP(\text{holder}), BP(\text{candidate})) > th_{nbr} \tag{4.4}$$

2) The BP of the candidate should also be dissimilar to the BP of all known holders, where the degree of similarity has to be less than the forwarding threshold th_fwd. This is achieved by selecting the nodes that satisfy the inequality 4.5:

$$\text{Similarity}(BP(\text{each holder}), BP(\text{candidate})) < th_{fwd} \tag{4.5}$$

² CSI:D stands for the Communication protocol in mobile networks based on the Stability of the user’s behavioral profile to discover the receivers Implicitly: Dissemination mode. It implements the ProfileCast paradigm.

SAROS requires both threshold values, and uses the similarity interest between the user and the forwarded message as the BP.

The parameter values that the authors reported as producing the best performance of CSI:D are th_fwd = 0.3 and th_nbr = 0.7, as these values enabled the algorithm to achieve almost the same delivery ratio as Epidemic with an additional 32% delivery delay and an additional 16% transmission overhead [142].

CSI:D mode provides a privacy-preservation feature whereby every user computes his behavioral profile locally and only exchanges summarized SVD Eigen vectors and values. Furthermore, the message holder sends the target profile (which is orthogonal to the behavioral profile upon which forwarders are selected) only to those he encounters; in response, they compare it to their profiles and decide whether they wish to receive the message or not. The forwarders are decided based on the similarity of their BP with that of the sender/current message forwarder.

SocialCast and Its Proposed Versions

The SocialCast algorithm [24] is a power-oblivious interest-based routing protocol in DTNs which supports the publish-subscribe mechanism. It uses a prediction-based mechanism for guidance in the process of message holder selection, which is based on co-location and mobility patterns. The details of the utility function computation are provided in section 5.2. The simulator implements this algorithm, and the proposed interest and power aware versions are detailed in section 5.2.

SCAR and Its Proposed Versions

The SCAR algorithm is a sensor context-aware routing protocol for opportunistic routing [27]. SCAR is a power-aware version of SocialCast. SAROS implements an adaptive SCAR version as detailed in section 5.3. This version mainly sets an adaptive range for the battery level of the device which monotonically decreases as the battery level enters a critical range. The adaptive utility function is detailed in equation 5.20. The SAROS simulator accepts any values for the weights and ranges of this utility function.

From the SCAR experiments conducted by [27], the optimum values for the weights and range functions were determined by the authors to be: Range_col = 1, Range_cdc = 1, Range_bat = remaining battery portion, w_col = 0.75, w_cdc = 0.25. Since no experiments that included the battery attribute were conducted, there is no mention of any optimum value for w_bat. The predicted attributes are forecasted by applying Kalman Filter prediction techniques.

SAROS also implements the proposed interest and power aware SCAR versions (ISCAR, PISCAR, and PISCAROp) [134]. The implemented SocialCast and SCAR versions do not let the message senders send control messages since they consider them to be publishers, not mere forwarders. The utility value of the sender acts as a knob that filters the selected carriers. Based on several experiments, SAROS currently ensures that this value is dynamically recalculated. The implemented SocialCast and SCAR versions allow the publisher to have a buffer which enables the storage of many copies of the message that can then be forwarded to carriers. However, carriers do not have buffers of their own and can only forward the message once.

BubbleRap and E-BubbleRap Algorithms

The E-BubbleRap algorithm [121] is the energy-aware version of the community-aware BubbleRap [97] routing algorithm. The energy-aware version combines socially-aware routing with energy consumption optimization [121]. However, this algorithm does not incorporate awareness of the forwarder node’s interest in the forwarded content. E-BubbleRap introduces an energy-aware utility function into the original BubbleRap algorithm’s local and global ranking functions which rank the nodes within their local communities and within the global community respectively.

The Interest-Power Threshold Opportunistic Algorithm

This algorithm is the interest-and-power-aware version of the Epidemic algorithm. Its main logic states that any message holder will forward messages to any neighboring node(s) whose similarity interest with the forwarded message, SInt(node, msg), is at least a predefined interest threshold Thr_int, and whose battery level Bat(node) is at least a predefined power threshold Thr_bat. The forwarder selection condition is:

$$\text{SInt}(node, msg) \ge Thr_{int} \;\text{AND}\; \text{Bat}(node) \ge Thr_{bat} \tag{4.6}$$
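Condition (4.6) translates directly into a predicate over the encountered neighbors. The sketch below is illustrative only: the function names, the neighbor record layout, and the threshold values are hypothetical, and the similarity interest is taken as a precomputed number in [0, 1].

```python
def should_forward(sint, battery, thr_int, thr_bat):
    """Condition (4.6): both the interest and the battery threshold must hold."""
    return sint >= thr_int and battery >= thr_bat


def select_forwarders(neighbors, thr_int, thr_bat):
    """Return the names of neighbors that satisfy condition (4.6).

    neighbors maps a node name to (SInt(node, msg), Bat(node)).
    """
    return sorted(name for name, (sint, battery) in neighbors.items()
                  if should_forward(sint, battery, thr_int, thr_bat))


# Hypothetical neighbors: (similarity interest, battery level as a fraction).
neighbors = {
    "n1": (0.9, 0.80),   # interested, well charged
    "n2": (0.9, 0.10),   # interested, battery too low
    "n3": (0.2, 0.95),   # not interested
    "n4": (0.5, 0.30),   # exactly on both thresholds
}
selected = select_forwarders(neighbors, thr_int=0.5, thr_bat=0.3)
```

Because both comparisons are inclusive, a node sitting exactly on both thresholds (n4 here) is still selected.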

The Dynamic Adaptive Ranking Algorithm

This algorithm dynamically adapts the ranking function according to the current context. The value of the ranking function relies on several factors that include an interest and power awareness factor, a social and activeness factor, a space-syntax factor, and an opportunistic forwarder selection factor.

The implemented versions of this dynamic adaptive algorithm are:

1. The DynAdp algorithm, which relies on the interest and power awareness factor, the social and activeness factor, and the opportunistic forwarder selection factor. This is summarized as:

$$\text{DynAdpRank}(i) = f(\text{IntPower}(i), \text{SocActive}(i), \text{Opp}(i)) \tag{4.7}$$

2. The DynAdpOp algorithm, which implements the DynAdp algorithm and inserts an extra explicit interest and power aware opportunistic selection criterion. Thus, a selected forwarder node must satisfy the following criteria:

$$\text{DynAdpRank}(j) \ge \text{DynAdpRank}(i) \;\text{AND}\; \text{SInt}(j) \ge Thr_{int} \;\text{AND}\; \text{bat}(j) \ge Thr_{bat} \tag{4.8}$$

3. The DynAdpSpSyn algorithm, which relies on the interest and power awareness factor, the social and activeness factor, the space-syntax factor, and the opportunistic forwarder selection factor. It is the same as the DynAdp algorithm but with the space-syntax factor added to its ranking function. This is summarized as:

$$\text{DynAdpSpSynRank}(i) = f(\text{IntPower}(i), \text{SocActive}(i), \text{SpaceSyntax}(i), \text{Opp}(i)) \tag{4.9}$$

4. The DynAdpOpSpSyn algorithm, which relies on the interest and power awareness factor, the social and activeness factor, the space-syntax factor, and the opportunistic forwarder selection factor, then inserts an extra explicit interest-and-power-aware opportunistic selection criterion. It is the same as the DynAdpOp algorithm but with the space-syntax factor added to its ranking function. Thus, a selected forwarder node must satisfy the following criteria:

$$\text{DynAdpSpSynRank}(j) \ge \text{DynAdpSpSynRank}(i) \;\text{AND}\; \text{SInt}(j) \ge Thr_{int} \;\text{AND}\; \text{bat}(j) \ge Thr_{bat} \tag{4.10}$$
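The difference between a rank-only selection and the criteria in (4.8) and (4.10) can be sketched as follows. The ranking function f is not specified in this section, so the rank is treated here as a precomputed number; the `Node` record, the function names, the threshold values, and the assumption that the plain DynAdp variants select on the rank comparison alone are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rank: float   # precomputed DynAdpRank or DynAdpSpSynRank (f is not shown)
    sint: float   # similarity interest with the message
    bat: float    # remaining battery as a fraction


def rank_only_selects(holder, candidate):
    """Assumed rule for the plain variants: the candidate's rank is not lower."""
    return candidate.rank >= holder.rank


def opportunistic_selects(holder, candidate, thr_int, thr_bat):
    """Criteria (4.8)/(4.10): rank comparison plus explicit thresholds."""
    return (candidate.rank >= holder.rank
            and candidate.sint >= thr_int
            and candidate.bat >= thr_bat)


holder = Node("i", rank=0.40, sint=0.7, bat=0.9)
strong = Node("j1", rank=0.55, sint=0.8, bat=0.6)
drained = Node("j2", rank=0.90, sint=0.8, bat=0.05)

picked_rank_only = [n.name for n in (strong, drained)
                    if rank_only_selects(holder, n)]
picked_opportunistic = [n.name for n in (strong, drained)
                        if opportunistic_selects(holder, n, 0.5, 0.2)]
```

The drained node passes the rank comparison but is filtered out once the explicit battery threshold is applied, which is the effect the extra criterion is meant to have.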

Space-Syntax-based Forwarding Algorithms

Forwarding content based on the popularity of the place and awareness of the syntax of the space around us is referred to as Space Syntax based forwarding. There are several proposed Space Syntax metrics upon which nodes are ranked for forwarder selection. The simulator implements Space Syntax metrics such as ranking the nodes based on the integration value of the node; the location index of the node; the attraction value of the spot the node is currently located in; the popularity index of the spot the node is currently located in; the popularity measure of the most popular spot they encounter; the popularity measure of the spot visited most frequently; the popularity measure of the spot at which they spend the longest time duration; or a combination of the popularity measures of the current close spots. A more detailed description of these metrics is given in Section 3.2.1.

4.2.5 Evaluation Module

The evaluation module evaluates the performance of the implemented algorithms in forwarding the message within a defined simulation period against a set of metrics.

Effectiveness

The algorithm effectiveness is measured by classifying the contacted nodes as per their interest and by measuring the algorithm’s recall, precision, f-measure and accuracy.

Interest-based Effectiveness: An algorithm is effective if it contacts a high portion of the interested users while simultaneously avoiding the uninterested ones. Our simulator measures this effectiveness in terms of the ratio of contacted users classified by their interest. Users are classified as either interested forwarders, destination nodes, or uninterested forwarders.

Recall, Precision, F-measure and Accuracy: Effectiveness is also measured through recall, precision, f-measure and accuracy [159]. It should be noted here that the targeted true set consists of the interested forwarder nodes in addition to the destination nodes, while the false set contains the uninterested nodes. Based on the definitions of the recall, precision and accuracy measures detailed in another research work [159], we compute these metrics according to the following equations:

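$$\text{Recall} = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalseNegative}} \tag{4.11}$$

$$\text{Precision} = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalsePositive}} \tag{4.12}$$

$$\text{F-measure} = \frac{2 \cdot \text{Recall} \cdot \text{Precision}}{\text{Recall} + \text{Precision}} \tag{4.13}$$

$$\text{Accuracy} = \frac{\text{TruePositive} + \text{TrueNegative}}{\text{TruePositive} + \text{TrueNegative} + \text{FalsePositive} + \text{FalseNegative}} \tag{4.14}$$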

where

True positive = all contacted destination nodes and all contacted interested forwarders.

False positive = all contacted uninterested nodes.

True negative = all non-contacted uninterested nodes.

False negative = all non-contacted destination nodes and all non-contacted interested forwarders.
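Equations (4.11) to (4.14) can be computed from node sets as sketched below. The sketch uses the standard confusion-matrix convention, in which contacted uninterested nodes count as false positives and non-contacted targeted nodes count as false negatives; the function name and the toy node sets are hypothetical.

```python
def effectiveness(contacted, targeted, all_nodes):
    """Recall, precision, F-measure and accuracy per equations (4.11)-(4.14).

    targeted = destination nodes plus interested forwarders (the true set).
    """
    tp = len(contacted & targeted)              # contacted and targeted
    fp = len(contacted - targeted)              # contacted but uninterested
    fn = len(targeted - contacted)              # targeted but never contacted
    tn = len(all_nodes - contacted - targeted)  # uninterested, left alone

    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = (2 * recall * precision / (recall + precision)
                 if recall + precision else 0.0)
    accuracy = (tp + tn) / (tp + tn + fp + fn) if all_nodes else 0.0
    return recall, precision, f_measure, accuracy


# Toy population of 10 nodes: 4 targeted, 4 contacted, 3 of them correctly.
all_nodes = set(range(10))
targeted = {0, 1, 2, 3}
contacted = {0, 1, 2, 4}
recall, precision, f_measure, accuracy = effectiveness(
    contacted, targeted, all_nodes)
```

The guards against empty denominators are a choice of this sketch; they return 0 when nothing was targeted or nothing was contacted.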

Efficiency

The efficiency of an algorithm is measured in terms of the cost it incurs, the delivery ratio it achieves, and the delay in time that occurs during the delivery process.

Cost: Cost is measured by the count of forwarded message replicas that were generated to accomplish this process. We measure the total number of message replicas that have been generated at any given time, and also measure the cost per unit delivery ratio.

Delivery Ratio: Delivery Ratio is measured by the portion of successfully reached destination nodes over time to reflect efficiency.

Delay: Delay is measured as the amount of time it takes to deliver each message to one of the destination nodes, which also reflects the degree of user satisfaction. User satisfaction may be measured by the average delay until a message is delivered to any destination node. The shorter the delay, the higher the degree of user satisfaction.

Power Awareness

To measure the level of power awareness of an algorithm, the total consumed power is computed and the degree of fairness in utilizing the nodes’ resources is measured.

Power Consumption: Algorithm power-efficiency is reflected by its ability to conserve the overall power consumption. This metric is measured by computing the total amount of power consumed from all of the nodes’ batteries over time as well as the total amount of power consumed per unit delivery ratio.

Utilization Fairness: A fair algorithm would not exhaust some nodes’ batteries in message forwarding while preserving other nodes’ power. That is, it seeks to reduce variance among the nodes’ battery levels.

Figure 4.6: 4-category Battery Consumption over Time: (a) IPeR, (b) Opp. Fixed Threshold PIPeR

In addition, the simulator measures the utilization fairness via 3 measures:

1. The final mean and standard deviation of the power levels of the node community, as they present the effect of each algorithm in shaping the final battery distribution.

2. The variance among the nodes’ battery levels over time. The ability of an algorithm to reduce or increase the variations among the nodes’ battery levels along the forwarding process is used as a measure of fairness; fairness here indicates community closeness, which is inversely proportional to variance.

3. The Fairness Index of the algorithm. Borrowing from the fairness index defined in another research work [160], the simulator computes it as shown in equation (4.15). This index ranges from 0 to 1, where the value 1 indicates the highest level of fairness, reached when the SD of the final battery distribution is 0.

$$\text{FairnessIndex} = 1 - \frac{SD}{mean} \tag{4.15}$$
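Equation (4.15) can be evaluated directly over the final battery levels. The sketch below assumes the population standard deviation and returns 0 for an all-empty community; both choices, as well as the function name and sample values, are assumptions of this example rather than details stated in the text.

```python
from statistics import mean, pstdev


def fairness_index(battery_levels):
    """FairnessIndex = 1 - SD / mean, per equation (4.15).

    Assumptions: population standard deviation, and an index of 0 when
    every battery is empty (the mean is 0).
    """
    avg = mean(battery_levels)
    if avg == 0:
        return 0.0
    return 1 - pstdev(battery_levels) / avg


even = fairness_index([50, 50, 50])      # identical levels: SD is 0
skewed = fairness_index([100, 0])        # one node drained, one untouched
mild = fairness_index([60, 50, 40])      # small spread around the mean
```

An algorithm that drains a few nodes while sparing the rest pushes the SD up relative to the mean, so its index falls towards 0.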

The simulator additionally provides a means of monitoring the battery distribution over time while clustering batteries into four categories: category 1 for battery levels less than 25%; category 2 for battery levels between 25% and 49.99%; category 3 for battery levels between 50% and 74.99%; category 4 for battery levels 75% and above. Figures 4.6a and 4.6b are samples of battery clusters over time as per IPeR and P50Op respectively. This clustering is an auxiliary means to the above two measures for the purpose of measuring an algorithm’s fairness.
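The four-category clustering can be sketched as a simple binning step. The function names and the sample battery levels are hypothetical; the category boundaries follow the text above.

```python
from collections import Counter


def battery_category(level_percent):
    """Map a battery level in percent to one of the four categories."""
    if level_percent < 25:
        return 1
    if level_percent < 50:
        return 2
    if level_percent < 75:
        return 3
    return 4


def category_distribution(levels):
    """Count how many nodes fall into each category at one time slot."""
    counts = Counter(battery_category(level) for level in levels)
    return {category: counts.get(category, 0) for category in (1, 2, 3, 4)}


# Hypothetical battery levels of seven nodes at a single time slot.
snapshot = category_distribution([10, 25, 49.99, 50, 74.99, 75, 100])
```

Repeating this count per time slot gives the series needed for a battery-distribution-over-time plot of the kind shown in Figure 4.6.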

Wasted Resources

The performance of the power-aware algorithm is measured in terms of the time and amount of power wasted due to incomplete transfers of the message between two nodes which are disconnected before the complete transfer takes place. The aim of this metric is to compare the contact-duration-aware version of PIPeROp with the corresponding contact-duration-oblivious version.

Normalized Performance Indices

For a collective performance analysis of the algorithms, the metrics were compiled in a normalized 8-metric space. In addition, three normalized performance indices are created: Effectiveness Index, Efficiency Index and Power-Awareness Index. Each of these indices is computed as the harmonic mean of a group of the above-mentioned normalized metrics, detailed as follows:

Effectiveness Index: This performance index measures the algorithm’s effectiveness through the harmonic mean of the f-measure, the ratio of the contacted interested forwarders and the ratio of the contacted uninterested nodes. Its formula is

$$\text{EffectivenessIndex} = \text{HarmonicMean}(\text{Fmeasure}, \text{IntFWDRatio}, \text{UnIntFWDRatio}) \tag{4.16}$$

Efficiency Index: This performance index measures the algorithm’s efficiency in delivery through the harmonic mean of the delivery ratio, the normalized value of the paid cost, and the normalized value of the delay in delivering the message. Its formula is

$$\text{EfficiencyIndex} = \text{HarmonicMean}(\text{Delivery}, \text{Cost}, \text{Delay}) \tag{4.17}$$

Power-Awareness Index: This performance index measures the algorithm’s power awareness through the harmonic mean of the normalized value of the total ratio of consumed power and the Fairness Index. Its formula is

$$\text{PowerAwarenessIndex} = \text{HarmonicMean}(\text{PowerConsumption}, \text{FairnessIndex}) \tag{4.18}$$
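The three indices share one computation, so they can be sketched with the standard-library harmonic mean. The inputs are taken as already-normalized values in [0, 1]; how cost, delay, power consumption and the uninterested ratio are oriented during normalization is not specified in this section and is left to the caller. The function names and sample values are hypothetical.

```python
from statistics import harmonic_mean


def effectiveness_index(f_measure, int_fwd_ratio, unint_fwd_ratio):
    """Equation (4.16), over already-normalized inputs."""
    return harmonic_mean([f_measure, int_fwd_ratio, unint_fwd_ratio])


def efficiency_index(delivery, cost, delay):
    """Equation (4.17), over already-normalized inputs."""
    return harmonic_mean([delivery, cost, delay])


def power_awareness_index(power_consumption, fairness_index):
    """Equation (4.18), over already-normalized inputs."""
    return harmonic_mean([power_consumption, fairness_index])


balanced = efficiency_index(0.5, 0.5, 0.5)
lopsided = efficiency_index(0.9, 0.9, 0.1)
power = power_awareness_index(1.0, 0.5)
```

The harmonic mean is dominated by the weakest component: the lopsided case above scores well below its arithmetic mean of about 0.63, so an algorithm cannot compensate for one poor metric with two strong ones.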

4.3 The Simulator Interface and Usability

This section summarizes the interaction with the SAROS simulator in terms of a sophisticated interface, output graphs and exported data.

4.3.1 The SAROS Interface

The main programs of the simulator and their main functionalities are:

The Main Form activates all the following forms. It is shown in Figure 4.7, and a summary of its parameters is listed in Table 4.10.

StartSim runs SLAW mobility-based experiments. It can also run the SIGCOMM/INFOCOM06 experiments. Figure 4.8 shows its menu, while its parameters are listed in Table 4.11.

EditINFOCOM imports the INFOCOM06 dataset and adapts it to the SAROS data structures by dividing the dataset into four sets of files (one for every conference day).

INFOCOMdays divides a whole-day encounter trace into 24 separate 1-hour encounter trace files.

INFOCOMSim runs INFOCOM-based experiments.

EditSIGCOMM imports and manipulates the SIGCOMM09 dataset to match the SAROS data structures.

MallTracesSim runs experiments with the imported Mall-Environment encounter traces.

RandomSim runs experiments using the Random Walk Mobility model.

StAndrewsSim runs experiments using the imported St. Andrews University dataset mobility traces.
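The hourly split performed by a program such as INFOCOMdays can be sketched as a bucketing step. This is an illustration only: the record layout (seconds since midnight, then the two encountering nodes) and the function name are hypothetical, and writing the buckets to files is omitted.

```python
from collections import defaultdict


def split_by_hour(encounters):
    """Group one day's encounter records into 24 one-hour buckets.

    Each record is assumed to be (seconds_since_midnight, node_a, node_b).
    Records outside the 24-hour day are ignored.
    """
    buckets = defaultdict(list)
    for timestamp, node_a, node_b in encounters:
        hour = int(timestamp // 3600)
        if 0 <= hour < 24:
            buckets[hour].append((timestamp, node_a, node_b))
    # Always return 24 lists so that empty hours remain visible.
    return [buckets.get(hour, []) for hour in range(24)]


day_trace = [
    (10, "a", "b"),
    (3600, "a", "c"),
    (86399, "b", "c"),
]
hourly = split_by_hour(day_trace)
```

Keeping empty hours in the result makes it straightforward to extract a required number of consecutive hours from a day for a simulation run.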

Figure 4.7: Main Form Menu

4.3.2 Output Graphs and Exported Data

The output module of the simulator generates results in the following three forms:

1. An animated graphical simulation of the mobility of the nodes with a color code that indicates which nodes are contacted and which are not. The color code also identifies the interest-based classification of the nodes. Figure 4.8 shows the panel of the animation with a legend of the color code used in this animation.

2. A set of graphs.

3. A set of metric files and data files.

The graphs are generated by a copyrighted but freely distributed portable command-line driven graphing utility called GNUPlot [161]. The graph generation commands are produced by the SAROS simulator in order to generate the graphs via the GNUPlot program. Samples of the generated graphs have been previously published in [113] and [158]. It is worth noting that all graphs included in this dissertation were generated using the SAROS simulator unless otherwise indicated.
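Producing a commands file for gnuplot amounts to writing a short text script. The sketch below shows one plausible shape for such a generator; the function name, file names, column layout and series titles are hypothetical and do not reproduce the commands that SAROS actually emits.

```python
def gnuplot_script(data_file, output_png, title, xlabel, ylabel, series):
    """Build a gnuplot command script that plots columns of one data file.

    series is a list of (column_number, legend_title) pairs; column 1 of
    the data file is assumed to hold the x values (e.g. the time slot).
    """
    lines = [
        "set terminal png",
        f"set output '{output_png}'",
        f"set title '{title}'",
        f"set xlabel '{xlabel}'",
        f"set ylabel '{ylabel}'",
    ]
    plots = [f"'{data_file}' using 1:{column} with lines title '{name}'"
             for column, name in series]
    lines.append("plot " + ", ".join(plots))
    return "\n".join(lines) + "\n"


script = gnuplot_script(
    data_file="delivery.dat",
    output_png="delivery.png",
    title="Delivery Ratio over time",
    xlabel="Time slot",
    ylabel="Delivery ratio",
    series=[(2, "Epidemic"), (3, "PIPeR")],
)
```

The resulting text can be saved to a file and passed to the gnuplot executable, which keeps the plotting step separate from the simulation itself.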

The generated files of measured data or computed metrics for all algorithms are:

1) The achieved delivery ratio, cost and consumed power per time slot;

2) The recall, precision and accuracy;

3) The final battery level per user per iteration;

4) The delay exerted at the end of the simulation;

5) The count of WiFi scans, forwards, and received messages per time slot;

6) The Fairness index;

7) The final delivery ratio, the percent of contacted interested forwarders, and the percent of contacted uninterested users at the end of the simulation.

8) Count of the incomplete message transfers, the total wasted time and the total wasted power due to incomplete message transfers. There is also a listing of the incomplete transfer occurrences;

9) The generated interest distribution per run;

10) The generated user profiles per time slot;

11) The mean, the standard deviation and the variance of the battery levels per time slot;

12) The final battery distribution per algorithm categorized into 4 categories ranging from [0%-24%] to [75%-100%];

13) A gnuplot commands file that instructs Gnuplot to generate the required graphs.
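As an illustration of item 13, the following Python sketch builds the text of a small gnuplot commands file for one of the graphs. It is not the SAROS code (which is written in Visual C#); the function name, the data file name and the two-column data layout are assumptions made for this example only.

```python
def build_gnuplot_script(data_file: str, output_png: str, title: str,
                         x_label: str, y_label: str) -> str:
    """Return gnuplot commands that plot column 2 against column 1 of data_file.

    Hypothetical helper: the two-column layout of the data file is an assumption.
    """
    lines = [
        "set terminal png",                      # render to a PNG image
        f"set output '{output_png}'",            # target image file
        f"set title '{title}'",
        f"set xlabel '{x_label}'",
        f"set ylabel '{y_label}'",
        f"plot '{data_file}' using 1:2 with lines title '{title}'",
    ]
    return "\n".join(lines) + "\n"


script = build_gnuplot_script(
    "delivery_ratio.dat", "delivery_ratio.png",
    "Delivery Ratio over time", "Time slot", "Delivery ratio",
)
print(script)
```

Writing the returned text to a file and passing that file to the gnuplot executable would then produce the image.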

Figure 4.8: StartSim Menu

The generated graphs are:

1) The final battery distribution categorized into 4 categories;

2) The battery distribution over time categorized into 4 categories;

3) The total consumed power over time;

4) The total consumed power versus the delivery ratio;

5) The count of produced control messages over time;

6) Cost versus Delivery Ratio;

7) Cost over time;

8) Delivery Ratio over time;

9) Fairness Index;

10) Recall, Precision, F-measure and Accuracy;

11) The interest based effectiveness graph presents the final delivery ratio, the percent of contacted interested forwarders, and the percent of contacted uninterested users at the end of the simulation in stacked form. Figure 5.7a is provided as a sample;

12) The final mean and standard deviation of the battery levels;

13) Variance among the battery levels over time;

14) The power consumed due to the sent control messages;

15) An 8-metric Spider graph which encompasses delivery ratio, cost, power consumption, fairness, Interested Forwarders Ratio, Uninterested Forwarders Ratio, F-measure, and delay as illustrated in the sample shown in Figure 5.26;

16) The Performance Effectiveness Index: the harmonic mean of F-measure, Ratio of Interested Forwarders and Ratio of Uninterested Forwarders;

17) The Efficiency Performance Index: the harmonic mean of Cost, Delivery Ratio and Delay;

18) The Power-Awareness Performance Index: the harmonic mean of Fairness Index and Ratio of Power Consumption.
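The three indices in items 16 to 18 are harmonic means. A minimal Python sketch of such an index is given below. It assumes that every metric has already been normalized to the range (0, 1] so that a higher value is better; how SAROS normalizes metrics such as cost or delay is not stated here, so that step is left to the caller. The function name is hypothetical.

```python
from statistics import harmonic_mean


def performance_index(normalized_metrics):
    """Harmonic mean of metrics that are assumed to be pre-normalized to (0, 1]."""
    return harmonic_mean(normalized_metrics)


# One weak metric pulls the harmonic mean down more than it would the arithmetic mean.
balanced = performance_index([0.5, 0.5, 0.5])
skewed = performance_index([0.9, 0.9, 0.1])
print(balanced, skewed)
```

The harmonic mean is a natural choice for a combined index because a single poor metric cannot be masked by two good ones.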

4.4 The Simulator Verification and Validation

Agile testing has been applied in the validation and verification of the simulator software. As the requirements kept changing with each further research proposal that we experimented with, new requirements were handled and new code was developed together with a new set of testing methods. Here are some examples of the simulator code validation:

1) We validated the random seed used per simulation run by examining three approaches to seeding. In one set of experiments, the current time was converted to seconds and used as the seed for randomization. In another set of experiments, the seed was set to 19 multiplied by the iteration number. A third set of experiments relied on the iteration number itself as the seed.

2) To validate the implemented functions against the requirements of the simulations, we compared the output of these functions with the expected output based on a set of controlled simulations. We also checked the output data files against the expected results.
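The three seeding approaches examined in item 1 can be sketched in Python as follows. The strategy names and helper functions are hypothetical and serve only to show why the iteration-based seeds give reproducible runs while the time-based seed does not.

```python
import random
import time


def make_seed(strategy: str, iteration: int) -> int:
    """Return a seed for one simulation run (strategy names are illustrative)."""
    if strategy == "time":
        return int(time.time())        # current time converted to seconds
    if strategy == "scaled":
        return 19 * iteration          # 19 multiplied by the iteration number
    if strategy == "iteration":
        return iteration               # the iteration number itself
    raise ValueError(f"unknown strategy: {strategy}")


def sample_run(seed: int, n: int = 5):
    """Draw n pseudo-random numbers from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# The same iteration number always reproduces the same random stream.
first = sample_run(make_seed("scaled", 3))
second = sample_run(make_seed("scaled", 3))
print(first == second)
```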

Several experiments to verify the simulator code have been conducted including the following:

1) To verify the implementation of the algorithms, the parameters of the algorithms are changed in order to make them behave in a similar fashion to the Epidemic forwarding algorithm. We also run the Epidemic algorithm in the same environment so that the algorithms' performance can be compared to that of the Epidemic algorithm. The same verification method is applied by implementing the Wait Destination algorithm and adjusting the other algorithms' parameters so that they behave in a similar way to this algorithm. Then, the results are compared to verify the implemented code as shown in Figure 4.9;

2) On implementing the KiBaM model, the battery parameters are changed to simulate a simple battery, with no charge recovery phase, and the results are then compared with another set of experiments that do not implement the KiBaM model;

3) To verify the method of calculating variance, the battery levels of all the users as time passes are plotted; this graph is then compared to the produced variance graph;

4) To verify the simulation of the implemented state-of-the-art algorithms, their performance is compared to that reported in the algorithms' original publications. The conducted evaluation confirmed that the two implementations perform similarly.

Figure 4.9: Cost versus Delivery Ratio
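As a small illustration of the variance check in item 3, the Python sketch below recomputes the mean, standard deviation and variance of the battery levels per time slot so that they can be compared against a produced variance file. The use of the population (rather than the sample) variance and the nested-list layout of the battery levels are assumptions made for this example.

```python
from statistics import mean, pstdev, pvariance


def battery_stats_per_slot(levels_by_slot):
    """Return (mean, standard deviation, variance) for each time slot.

    levels_by_slot[t] holds the battery levels (in percent) of all users
    at time slot t. The population variance is assumed.
    """
    return [(mean(levels), pstdev(levels), pvariance(levels))
            for levels in levels_by_slot]


stats = battery_stats_per_slot([[50, 50], [40, 60]])
print(stats)
```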

4.5 Conclusion and Future Work

SAROS is a simulator environment implemented in Visual C# to simulate opportunistic forwarding in environments such as malls, conference centers, or university campuses. The users’ mobility traces are either synthetically generated or imported from real mobility traces. The simulator also generates users’ social graph, and imports social graphs from real datasets. To represent users’ interest in the forwarded content, the simulator either generates random interest vectors for all users, or imports external interest vectors from other datasets. The SAROS simulator provides stable simulated opportunistic networking environments, and encompasses implementations of various opportunistic forwarding algorithms. It also provides a variety of performance evaluation metrics that are not offered in comparable simulators.

SAROS has been used in previous research contributions in order to validate the proposed interest and power aware opportunistic forwarding algorithms [113] [158].

Although the current version of the simulator encompasses many social-aware opportunistic parameters, it lacks the more generic features that are provided by other Opportunistic Network simulators. In the future, the functionalities of the simulator can be expanded to include:

1) Implementing the packet-level forwarding parameters;

2) Implementing the amount of power consumption and buffer consumption incurred by the internal computations of the forwarding algorithms;

3) Implementing a more recent model of the Li-ion batteries that considers the hamper performance [162], where another research study has recommended an extension to the KiBaM model for Li-ion batteries by introducing the hamper performance (the effect of diffusion drift of charge carriers in the solid-state Li-ion batteries);

4) Making the simulator more accessible by exposing the accepted format of any imported data to allow for other researchers to easily integrate their datasets and smoothly implement their proposed algorithms within the simulator.

Table 4.10: Parameters of the Main Form Menu

| Parameter | Description |
|---|---|
| No. of users | The number of users simulated in the experiment. |
| No. of iterations | The number of simulation runs within the experiment. |
| No. of message senders | The number of message senders within the experiment. This number cannot exceed the no. of users. |
| No. of interests | The length of the interest feature vector per user. |
| Interest damping factor | The damping factor of PeopleRank and its interest and power aware versions. |
| Dest. Percent | The percent of destination nodes among the users. |
| Similar Interest Factor | Used in RC algorithm. It is not used anymore. |
| With total contacts | Sets the maximum number of total contacts. Used in RC algorithm. It is not used anymore. |
| Proximity Range | The range of detecting nodes in proximity in WiFi or Bluetooth modes. It is measured in meters. |
| Source SInterest | The similarity interest of the sender with the sent message. It ranges from 0 to 1. |
| Generate/ save/ read rnd | Sets the mode to either generate, save or read the randomly generated data within the experiment. |
| Battery threshold | The battery threshold above which the node is considered for participation in the forward process. |
| dataset | Selects the SLAW or the real traces dataset. |
| wCOL | The weight of the colocation component in the SocialCast and SCAR algorithms. |
| wCDC | The weight of the change degree of connectivity component in the SocialCast and SCAR algorithms. |
| Result folder | The folder in which the traces files and graphs resulted from the experiment are stored. |
| SLAW Simulation | Calls StartSim to run SLAW mobility-based experiments. |
| Random Walk | Calls RandomSim to run random walk mobility based experiments. |
| StA Simulation | Calls StAndrewsSim to run experiments based on the imported mobility traces of St. Andrews University. |
| Mall Traces | Calls MallTracesSim to run experiments based on imported Mall-Environment encounter traces. |
| INFOCOM | Calls INFOCOMSim to run INFOCOM-based experiments. |
| INFOCOM DAY | Calls INFOCOMdays that divides a whole-day encounter traces data into 24 separate 1-hour encounter traces. |
| SIGCOMM | Calls StartSim with a parameter indicating the use of SIGCOMM dataset instead of the SLAW dataset. |
| Batch mode | Runs in batch mode SLAW-based experiments and/or SIGCOMM-based experiments. |
| SCAR | Runs the SCAR algorithm in batch mode. |
| Gen INFOCOM | Calls EditINFOCOM to import the INFOCOM06 dataset and change its format to match the simulator data structures. |
| Gen SIGCOMM | Calls EditSIGCOMM to import the SIGCOMM09 dataset and change its format to match the simulator data structures. |

Table 4.11: Parameters of the StartSim Menu

| Parameter | Description |
|---|---|
| Start Sim | Runs the main code of the program that calls the selected algorithms. |
| Results of one iteration | Produces the results of each iteration in a separate file. |
| Full batteries | Sets the battery distribution to be full batteries. |
| Bat-PeR | Runs the PIPeR version. |
| Compare Users' Batteries | Adds the restriction of comparing the carrier's battery level to the candidate's battery level. |
| Normal Dist. batteries | Sets the battery distribution to the discrete normal distribution. |
| AdpBat Threshold | Enables the adaptive battery threshold mode. |
| SCAR | Runs the SCAR algorithm instead of SocialCast. |
| SocCast PowerAware | Runs the power aware version of SocialCast/SCAR. |
| SocCastOpp | Runs the opportunistic version of SocialCast/SCAR. |
| Opportunistic | Runs the opportunistic version of PIPeR. |
| ContactAware | Runs the contact-duration aware version of PIPeR. |
| PIPeRDepRate | Runs the depletion-rate aware version of PIPeR. |
| Kalman Filter | Uses Kalman filter for contact-duration prediction. |
| SocCastDepRate | Runs the depletion-rate aware version of SocialCast/SCAR. |
| No Dep. Rate in the code | Disables the implementation of the depletion rate in the experiment. |
| Int SocCast | Runs the interest-aware version of SocialCast/SCAR. |
| Epidemic | Runs the epidemic algorithm. |
| PeopleRank | Runs the PeopleRank algorithm. |
| IPeR | Runs the IPeR algorithm. |
| ProfileCast | Enables running the ProfileCast algorithm. |
| BubbleRap | Runs the BubbleRap and E-BubbleRap algorithms. |
| interest choice | Selects the interest distribution. |
| destination percent | Sets the percent of the destination nodes in the community. |
| interested fwd percent | Sets the percent of the interested forwarder nodes in the community. |

Chapter 5

PI-SOFA Proposed Implementations

This chapter presents a detailed description of the incremental implementation of the interest-aware, power-aware and opportunistic forwarding algorithms proposed in the PI-SOFA framework defined in Section 3.1, along with their simulation using SAROS, and provides an analysis of the results of these implemented versions. We briefly present three prominent social-aware opportunistic forwarding approaches that were previously developed for mobile opportunistic message delivery: PeopleRank [22], SocialCast [24] and SCAR [27]. For each of these 'power-and-interest-insensitive' algorithms, we first implement the integration of interest awareness by proposing three corresponding interest-aware-power-oblivious versions: IPeR [113], ISCast [134] and ISCAR [134]. We then supplement the interest-aware version with power awareness to produce three power-and-interest aware versions: PIPeR [158], PISCast [134] and PISCAR [134]. The next stage integrates the threshold opportunistic forwarding to produce three comprehensive versions: PIPeROp [158], PISCastOp [134] and PISCAROp [134]. Finally, we introduce depletion-rate-aware versions of both the power-and-interest aware versions (PIPeRDep, PISCastDep and PISCARDep) and the opportunistic forwarding versions (PIPeROpDep, PISCastOpDep and PISCAROpDep). All of the proposed versions are listed in Table 5.1.

Table 5.1: The Proposed Implemented Versions

| Social-Aware Forwarding Alg. vs PI-SOFA Integration | PeopleRank | SocialCast | SCAR |
|---|---|---|---|
| Interest-Awareness | IPeR | ISCast | ISCAR |
| Power-Awareness | PIPeR | PISCast | PISCAR |
| Threshold-Opportunistic | PIPeROp | PISCastOp | PISCAROp |
| Depletion-Rate Aware | PIPeRDep, PIPeROpDep | PISCastDep, PISCastOpDep | PISCARDep, PISCAROpDep |

In Section 5.5, a simulation-based performance evaluation demonstrates that when compared to the performance of the original versions, the proposed algorithms show an improvement in effectiveness and efficiency, a reduction in power consumption, and fairer utilization. The results show an improvement in f-measure that is mainly achieved by disregarding uninterested nodes and focusing on the potentially interested ones. Moreover, integrating power-awareness by primarily focusing on power-capable interested nodes preserves some power, reduces cost, and thus attains more fairness in power utilization. Finally, this research analyzes the proposed algorithms' performance across various environments. These findings are applicable in several domains where the delivery of different forms of content in opportunistic mobile networks is required.

The following sections start by describing the various algorithm contributions that result from integrating interest awareness and power awareness into each of the PeopleRank algorithm, the SocialCast algorithm and the SCAR algorithm. The experimental results follow in Section 5.5 to illustrate the resulting performance of each contribution.

5.1 PI-SOFA Integration with PeopleRank

The PeopleRank algorithm is a power-oblivious, social-aware message forwarding algorithm that forwards messages through the socially popular nodes in a particular place [22]. PeopleRank is based on the hypothesis that socially popular nodes constitute better candidates for delivering messages to destinations since there is a higher probability that such nodes will encounter destinations more quickly. Nodes are socially ranked as per their social relationships, whether through a declared friendship or through shared common interests.

To achieve a balance between social based forwarding and opportunistic forwarding, PeopleRank introduces a damping factor that decides on the degree of reliance on the social ranking of a node versus the reliance on the opportunistic encounter of carriers. A damping factor (d) ranges from value 0 for total reliance on opportunistic forwarding, to 1 for total reliance on social based forwarding. Values between 0 and 1 indicate the weight given to each of the two approaches respectively. Empirical runs demonstrate that the optimal value for the damping factor d is 0.87 [22].

The researchers proposed a contact-aware version of PeopleRank (CA-PeR) that ranks nodes using their social rank and social activeness. Node activeness is measured by how frequently a node encounters its social contacts. In effect, a node's social rank is either rewarded or penalized according to a count of the node's encounters with its social contacts. The CA-PeR rank is computed as follows:

$$CA\text{-}PeR(i) = (1 - d) + d \cdot \sum_{j \in F(i)} \frac{CA\text{-}PeR(j) \cdot w_{i,j}}{|F(j)|} \tag{5.1}$$

d is the damping factor, CA-PeR(j) is friend j's PeopleRank value, F(i) is the set of friends of node i who are available in vicinity, and $w_{i,j}$ is the contact component, computed by the following equation.

$$w_{i,j} = \frac{|encounters_{i,j}|}{\sum_{k \in F(i)} |encounters_{i,k}|} \tag{5.2}$$

The CA-PeR value consists of a pure opportunistic component (1 − d), a contact-aware component w and a social-aware ranking component. The social-aware component is based on the concept that the node has a high social rank if, on average, its friends have high social ranks. This component is computed by summing, over all the friends j of the node i, the CA-PeR value of friend j divided by the count of that friend's own friends |F(j)|, as in equation (5.1).
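Equations (5.1) and (5.2) can be traced with a short Python sketch. The dictionary-based data layout and the function names are assumptions made for this illustration; they are not taken from the SAROS implementation.

```python
def contact_weights(encounters_i):
    """Eq. (5.2): encounters_i maps friend j to the number of encounters with node i."""
    total = sum(encounters_i.values())
    if total == 0:
        return {j: 0.0 for j in encounters_i}
    return {j: count / total for j, count in encounters_i.items()}


def ca_per(d, friend_ranks, friend_counts, encounters_i):
    """Eq. (5.1): contact-aware PeopleRank value of node i.

    friend_ranks[j]  : current CA-PeR value of friend j
    friend_counts[j] : |F(j)|, the number of friends of j
    encounters_i[j]  : number of encounters between i and j
    """
    w = contact_weights(encounters_i)
    social = sum(friend_ranks[j] * w[j] / friend_counts[j]
                 for j in friend_ranks if friend_counts[j] > 0)
    return (1 - d) + d * social


# Two friends in the vicinity, damping factor d = 0.87 as reported in [22].
rank = ca_per(0.87,
              friend_ranks={"a": 1.0, "b": 0.5},
              friend_counts={"a": 2, "b": 1},
              encounters_i={"a": 3, "b": 1})
print(rank)
```

With no friends in the vicinity the sum is empty and the value falls back to the pure opportunistic component (1 − d).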

5.1.1 IPeR: Interest-Aware PeopleRank

IPeR [158] adds a new dimension to the PeopleRank (CA-PeR) algorithm by integrating social interest into the forwarding process of the social-based ranking. In effect, IPeR introduces another parameter for ranking the nodes besides the typical social ranking and activeness used in the CA-PeR version. To consider a node for forwarding a message, we compute the similarity in interest between the candidate forwarding node and the forwarded message, and use this information for further decision making. IPeR integrates an SInt (Similarity Interest) parameter into the CA-PeR ranking function to accommodate not only social ranking, but also an "interest-aware" social ranking component. While the damping factor (d) used in CA-PeR will continue to determine the amount of reliance on opportunistic forwarding, it will also be applied to the new interest-aware social ranking component. Interest-aware social ranking consists of the usual social ranking component of CA-PeR, which is now also rewarded if there is interest similarity in place, and penalized otherwise.

The higher the rank of a node and of its contacts, the more likely it will be that a node will become a candidate for message forwarding. A candidate node is highly ranked if it is linked to popular friends and also if the node and its friends are partially interested in the forwarded content. The following equation formalizes this concept:

$$IPeR(i) = (1 - d) + d \cdot FrInter(i, msg) \cdot \sum_{j \in F(i)} \frac{CA\text{-}PeR(j) \cdot w_{i,j}}{|F(j)|} \tag{5.3}$$

where FrInter(i, msg) is the Penalized Similar Interest, which is computed as per equation (3.2), and SInt(j, msg) is the Similarity Interest, computed by the Jaccard set similarity between the user's interest vector and the message specialization interest vector. Thus, IPeR magnifies the node's social rank if the node and its friends are interested in the message above a certain interest threshold $Thr_{Int}$, and penalizes it otherwise.

Algorithm 1 Distributed IPeR Algorithm
Require: |F(i)| ≥ 0, SInt(source, msg) = 0
 1: IPeR(i) ← 1 − d
 2: ∀ time t every n seconds
 3: while i is in contact with j do
 4:   if j ∈ F(i) then
 5:     send(IPeR(i), |F(i)|, IntFV(i))
 6:     receive(IPeR(j), |F(j)|, IntFV(j))
 7:     update(IPeR(i)) (Eq. 3 and 4)
 8:   end if
 9:   while ∃ Ad ∈ buffer(i) do
10:     send(IntFV(msg))
11:     receive(SInt(j, msg))
12:     if SInt(j, msg) ≥ Destination-Interest-Threshold or (SInt(j, msg) > SInt(i, msg) and IPeR(j) ≥ IPeR(i)) then
13:       Forward(Ad, j)
14:     end if
15:   end while
16: end while

According to the logic of the IPeR algorithm (Algorithm 1), first, the nodes that initiate the message rank the users in proximity using the IPeR function based on their candidacy to forward the message. The sender then sends the message to the 'interested forwarders' whose Similarity Interest SInt(j, msg) is above a certain threshold. As these forwarders encounter other nodes, they check their interest, their social rank, whether they have already received this message or not, and their willingness to forward it to others along their way. In case of any matches, the senders forward the message to the new candidates. This process is repeated until the specified target time duration t expires or the target number of recipients is achieved.

Initially, all the nodes' IPeR values favor opportunistic forwarder selection, i.e. IPeR = (1 − d). Whenever two nodes come in contact, if they are friends, they exchange the following information: their IPeR values, the count of each one's friends |F(i)|, where F(i) is the list of friends of the user that can be accessed from several sources such as the user's contact list or the user's social network application, and their interest feature vectors IntFV(i), on the basis of which they update their IPeR ranks as per the equations 3 and 4 (lines 3-8). Whenever a message holder i comes in contact with another node j, they exchange their current IPeR ranks, and node i sends the message interest vector IntFV(msg) to node j in order to receive the computed SInt(j, msg) (lines 9-11). If node j belongs to the destination set of this message (line 12), node i forwards to node j a copy of the message. Also, if IPeR(j) ≥ IPeR(i) and SInt(j, msg) > SInt(i, msg), node i forwards to node j a copy of the message (lines 12-13). This logic is clarified in the flow diagram illustrated in Figure 5.1.
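The forwarding test of Algorithm 1 (line 12) and the Jaccard-based Similarity Interest can be sketched in Python as follows. Representing an interest vector as a set of interest labels is an assumption made for this example, as are the function names; the computation of FrInter (equation 3.2) is not reproduced here.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard set similarity: |a ∩ b| / |a ∪ b| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def iper_should_forward(sint_j, sint_i, iper_j, iper_i, dest_threshold):
    """Line 12 of Algorithm 1: forward if j is a destination, or if j is both
    more interested than the carrier i and ranked at least as high."""
    return sint_j >= dest_threshold or (sint_j > sint_i and iper_j >= iper_i)


msg_interests = {"sports", "music"}
sint_j = jaccard({"music", "travel"}, msg_interests)   # candidate j
sint_i = jaccard({"cooking"}, msg_interests)           # current carrier i
print(sint_j, iper_should_forward(sint_j, sint_i, 0.4, 0.3, dest_threshold=0.8))
```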

5.1.2 PIPeR: Power-and-Interest-Aware PeopleRank

PIPeR [158] adds a new dimension to the IPeR algorithm by integrating power-awareness with the interest-based social forwarding process. Accordingly, PIPeR introduces another parameter for ranking the nodes besides the interest-aware social ranking and activeness used in IPeR. More specifically, in order to consider a mobile node for forwarding a message, the algorithm elicits the candidate node's available battery level, and uses this information as a means of indicating the node's willingness to forward.

Figure 5.1: Flow Diagram of the Interest-Aware PeopleRank Algorithm

PIPeR emphasizes power awareness by rewarding the node whose battery level is above a certain battery threshold, and penalizes it if otherwise. Accordingly a candidate node is highly ranked when the following conditions are met: its battery is above a certain threshold, its owner is linked to popular friends, and when the node owner and their friends are interested in the message being sent. Thus, the higher the rank of a mobile node and its contacts, the more likely a node will become a candidate for forwarding the message. The node's power-aware rank PIPeR is formalized by this equation:

$$PIPeR(i) = \begin{cases} (Bat(i) + reward) \cdot IPeR(i) & \text{if } Bat(i) \geq Thr_{bat}, \\ (Bat(i) - reward) \cdot IPeR(i) & \text{otherwise} \end{cases} \tag{5.4}$$

where IPeR is the interest-aware version of PeopleRank and is computed as per equation (5.3).
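A direct transcription of equation (5.4) into Python is given below as a minimal sketch. Expressing the battery level, the threshold and the reward as fractions between 0 and 1 is an assumption of this example; the equation itself does not fix the units.

```python
def piper(battery, iper_rank, thr_bat, reward):
    """Eq. (5.4): reward the rank above the battery threshold, penalize it below.

    battery, thr_bat and reward are assumed to be fractions in [0, 1].
    """
    if battery >= thr_bat:
        return (battery + reward) * iper_rank
    return (battery - reward) * iper_rank


# Same IPeR rank, different battery levels around a 50% threshold.
high = piper(battery=0.8, iper_rank=0.5, thr_bat=0.5, reward=0.1)
low = piper(battery=0.3, iper_rank=0.5, thr_bat=0.5, reward=0.1)
print(high, low)
```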

According to the logic of the PIPeR algorithm (Algorithm 2), first, the advertiser node ranks the users in proximity using the PIPeR function based on their candidacy to forward the message. The message sender then sends the message to the "power-capable interested forwarders" whose Similarity Interest SInt(j, msg) is beyond a certain threshold and whose PIPeR value is rewarded by its current battery level for exceeding the battery threshold $Thr_{bat}$. As these new message holders encounter other nodes, they check their interest, social rank, battery level, and whether or not they have already received the forwarded message; the message senders base the decision to forward the message to new candidates accordingly. This process is repeated until the target time duration t expires or the target number of recipients is achieved. Any message carrier node whose battery level goes below the preset battery threshold ceases to scan or forward messages in order to avoid depleting the remaining battery level.

Algorithm 2 Distributed PIPeR Algorithm
Require: |F(i)| ≥ 0, SInt(source, msg) = 0.3
{i: node i, F(i): Friend list, IntFV(i): Interest Feature Vector, bat(i): current battery level, Thr_bat: battery level threshold, PIPeR(i): PIPeR value, SInt(i, msg): Similarity Interest between IntFV(i) and IntFV(msg), buffer(i): buffer of the to-be-forwarded ads}
 1: PIPeR(i) ← 1 − d
 2: ∀ time t every n seconds
 3: while i is in contact with j do
 4:   if j ∈ F(i) {if j is a friend of i} then
 5:     send(PIPeR(i), |F(i)|, IntFV(i), bat(i), Thr_bat)
 6:     receive(PIPeR(j), |F(j)|, IntFV(j), bat(j), Thr_bat)
 7:     update(PIPeR(i)) (Eq. 2)
 8:   end if {for all encountered nodes whether they are friends or not}
 9:   while ∃ Ad ∈ buffer(i) and Scanning-Condition = true do
10:     1-hop-broadcast(IntFV(msg), Thr_bat)
11:     receive(SInt(j, msg), Thr_bat)
12:     if Opportunistic-Interest-Condition or SInt(j, msg) ≥ Destination-Interest-Threshold or (SInt(j, msg) ≥ SInt(i, msg) and PIPeR(j) ≥ PIPeR(i)) then
13:       Forward(Ad, j)
14:     end if
15:   end while
16: end while

Figure 5.2: Flow Diagram of the Interest-and-Power-Aware PeopleRank Algorithm

To explain in detail as shown in Algorithm 2, initially, all the nodes' PIPeR values favor opportunistic forwarder selection, i.e. PIPeR = (1 − d). Whenever two nodes come in contact and if their owners are friends, the nodes exchange four pieces of information: their PIPeR values, the count of each one's friends |F(i)|, their interest feature vectors IntFV(i), and the nodes' current battery levels. The exchanged information is used to update their PIPeR ranks as per equation 2 (as shown in lines 3-8). Whenever a message holder i comes in contact with another node j, they exchange their current PIPeR ranks, and node i sends the message interest vector IntFV(msg) to node j so that it can receive the computed SInt(j, msg) (lines 9-11). If node j belongs to the destination set of this particular message (line 12), node i delivers a copy of the message to node j. If node j is not a destination node but its PIPeR rank and similarity interest exceed those of node i, then node i will forward a copy of the message to node j (lines 12-13). Note that the PIPeR algorithm sets Scanning-Condition to bat(j) ≥ $Thr_{bat}$ in code line 9. This logic is illustrated in Figure 5.2.

Here, several variations of the PIPeR algorithm that meet various metrics are presented. The variations are:

Adaptive Battery Threshold version (PAd): The logic behind this approach is to utilize the nodes whose 'wealth' in battery power is above the current average 'wealth' of the battery community. This PIPeR version continuously adapts the battery threshold that is used in selecting candidates according to the obsAvgBat values noted from the battery levels of the encountered nodes. In this way, the candidates who maintain a battery level above the current observed average battery level are selected to become the next message carriers. The obsAvgBat is computed as follows:

$$obsAvgBat(i) = \frac{Bat(i) + \sum_{j \in contact(i)} obsAvgBat(j)}{1 + |j|} \tag{5.5}$$

This version sets the Scanning-Condition in code line 9 to bat(j) ≥ obsAvgBat(i) to select the next forwarders. Also, the nodes exchange their observations of obsAvgBat instead of relying on a fixed battery threshold $Thr_{bat}$ in code lines 5, 6, 10 and 11.

Fixed Battery Threshold version (Px): This PIPeR version compares the candidate's battery level to a fixed battery threshold $Thr_{bat}$ instead of relying on an adaptive battery threshold. The application fixes a battery threshold above which the power 'wealthy' members of the community become suitable candidates for forwarding the message.
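The difference between the adaptive (PAd) and fixed (Px) thresholds can be illustrated with a short Python sketch of equation (5.5). Reading |j| as the number of encountered contacts is an interpretation made for this example, and the function names are hypothetical.

```python
def observed_avg_battery(own_battery, contact_observations):
    """Eq. (5.5): average of the node's own battery level and the obsAvgBat
    values reported by its contacts (|j| is read as the number of contacts)."""
    return (own_battery + sum(contact_observations)) / (1 + len(contact_observations))


def scanning_condition(candidate_battery, threshold):
    """Code line 9: the candidate qualifies if its battery reaches the threshold."""
    return candidate_battery >= threshold


adaptive_thr = observed_avg_battery(0.6, [0.4, 0.8])   # PAd: threshold follows observations
fixed_thr = 0.5                                        # Px: preset threshold

print(scanning_condition(0.55, adaptive_thr), scanning_condition(0.55, fixed_thr))
```

In this example a candidate at 55% battery passes the fixed 50% threshold but fails the adaptive one, because the observed community average is higher.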

5.1.3 PIPeROp: Threshold Opportunistic PIPeR

PIPeROp adds an extra opportunistic portion to the candidate selection process. This is achieved by forwarding the message to any interested forwarder whose battery level is above the fixed threshold $Thr_{bat}$. These favored forwarders need not be socially popular users, but should be power-capable interested forwarders. Thus, this approach sets the Opportunistic-Interest-Condition to the condition bat(j) ≥ $Thr_{bat}$ and SInt(j, msg) ≥ $Thr_{int}$ in code line 12. Accordingly, this algorithm selects the next message holder to be any candidate whose similarity interest with the message exceeds the interested forwarder threshold and whose node power is above the given battery threshold. In addition, it selects any candidate whose power-aware rank is not less than that of the current message holder. These variations can be combined to achieve collective benefits.
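The selection rule of PIPeROp can be sketched in Python by combining the Opportunistic-Interest-Condition with the remaining clauses of line 12 of Algorithm 2. The function and parameter names are hypothetical, and all quantities are assumed to be fractions between 0 and 1.

```python
def opportunistic_interest_condition(bat_j, sint_j, thr_bat, thr_int):
    """bat(j) >= Thr_bat and SInt(j, msg) >= Thr_int."""
    return bat_j >= thr_bat and sint_j >= thr_int


def piperop_should_forward(bat_j, sint_j, sint_i, piper_j, piper_i,
                           thr_bat, thr_int, dest_threshold):
    """Line 12 of Algorithm 2 with the opportunistic clause enabled."""
    return (opportunistic_interest_condition(bat_j, sint_j, thr_bat, thr_int)
            or sint_j >= dest_threshold
            or (sint_j >= sint_i and piper_j >= piper_i))


# A power-capable, interested but socially unpopular candidate is still selected.
selected = piperop_should_forward(bat_j=0.6, sint_j=0.4, sint_i=0.9,
                                  piper_j=0.1, piper_i=0.5,
                                  thr_bat=0.5, thr_int=0.3, dest_threshold=0.8)
print(selected)
```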

Four combinations of these variations are: PIPeR for a fixed threshold of 50%, PIPeROp for the Opportunistic fixed threshold of 50%, PAd for the adaptive threshold, and PAdOp for the adaptive Opportunistic combination. Each of these is presented in the evaluation section 5.4 and their achieved benefits are exhibited.

5.1.4 PIPeRDep: Depletion-Rate-Aware PIPeR

This subsection shows how the awareness of the depletion rate of the power source of each node would impact the performance of the PIPeROp forwarding algorithm, and whether this awareness guides or misguides the algorithm in the forwarder selection process. The Depletion Rate aware PIPeROp experiments compare the performance of the PIPeROp algorithm with and without depletion rate awareness in terms of the set of metrics that are detailed in section 4.2.5.

PIPeROp: This is the opportunistic PIPeR version which is based on a 50% fixed threshold. It does not rely on the depletion rate of the battery or on the message’s TTL in making candidate selection decisions.

WP-and-D: This version computes the PIPeR rank as a combination of a weighted PIPeR and a weighted depletion rate as shown in equation 5.6. The resulting PIPeR rank is used to compare the nodes when making candidate selection decisions.

$$WP\text{-}and\text{-}D(i) = (PIPeR(i) \cdot w_{PIPeR}) + (DepletionRate(i) \cdot Predicted\text{-}Battery(i) \cdot w_{DepRate}) \tag{5.6}$$

P/D: This version computes the new rank by dividing the original PIPeR rank by the depletion rate of the battery. The result of this division becomes the new rank that is then used in candidate selection as per equation 5.7.

$$P/D(i) = PIPeR(i) / DepletionRate(i) \tag{5.7}$$

P/DcompD: This version divides the original PIPeR rank by the depletion rate as per equation 5.7, and also compares the carrier's battery-depletion rate to the candidate node's battery-depletion rate in order to maintain a non-decreasing depletion-rate node selection criterion. That is, the selected candidate node's depletion rate must be ≥ the carrier node's depletion rate, in addition to the comparison of their PIPeR/DepletionRate ranks.

wPDcompD: This version provides an additional weight for the original PIPeR rank and then adds this weighted rank to the weighted depletion rate of the node in order to compute the new PIPeR rank as per equation 5.6. The nodes are compared in terms of the new PIPeR rank and their depletion rates. Again, the selected candidate node's depletion rate must be ≥ the carrier node's depletion rate.

noP-wD: This version does not rely at all on the PIPeR rank in forwarder selection, but rather relies on the weighted value of the candidate node's depletion rate. In this computation, the depletion rate is weighed according to the predicted battery portion and the adaptive range weight. The adaptive range weight is a weight value that is not fixed, but rather decreases or increases as the battery value enters or leaves a preset critical range as shown in equation 5.8. A similar concept has been proposed by SCAR to adaptively weigh the node's utility in terms of battery level [27] as discussed in detail in section 4.2.4.

$$noP\text{-}wD(i) = DepletionRate(i) \cdot AdaptiveRange(i) \cdot w_{DepRate} \tag{5.8}$$
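The rank computations of equations (5.6), (5.7) and (5.8) are transcribed below as a Python sketch. The parameter names mirror the symbols in the equations; the units of the depletion rate and the guard against a non-positive depletion rate in the P/D rank are assumptions added for this example.

```python
def wp_and_d(piper_rank, depletion_rate, predicted_battery, w_piper, w_dep_rate):
    """Eq. (5.6): weighted PIPeR rank plus weighted depletion-rate term."""
    return piper_rank * w_piper + depletion_rate * predicted_battery * w_dep_rate


def p_over_d(piper_rank, depletion_rate):
    """Eq. (5.7): PIPeR rank divided by the depletion rate."""
    if depletion_rate <= 0:
        # Guard added for this sketch; the handling of a zero rate is not specified.
        raise ValueError("depletion rate must be positive")
    return piper_rank / depletion_rate


def nop_wd(depletion_rate, adaptive_range, w_dep_rate):
    """Eq. (5.8): rank based only on the weighted depletion rate."""
    return depletion_rate * adaptive_range * w_dep_rate


print(wp_and_d(0.4, 0.1, 0.5, 0.5, 0.5), p_over_d(0.4, 0.1), nop_wd(0.1, 0.5, 0.5))
```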

P-noD-TTL: This version uses the original PIPeR rank without including the depletion rate in the computation. However, the comparison between the candidate and the current carrier node is based on: their PIPeR ranks, their depletion rates, the current battery level and the remaining time in the message TTL. Thus, the carrier node checks whether the current depletion rate will allow the potential node(s) to deliver the message before the message's time-to-live (TTL) and whether the node's power will remain above the preset battery threshold. That is, the candidate node must satisfy the inequality 5.9.

$$PIPeR(i) \geq PIPeR(carrier) \text{ and } battery(i) - ((DepRate(i) \cdot TTL) + forwardpower + WiFiconnectionpower) \geq batteryThrd \tag{5.9}$$

Note that setting the inequality to ≥ 0 means the node’s battery may be exhausted by the time the message TTL arrives. Also, setting the right hand side of the inequality to batteryThrd is too restrictive as a node is allowed to carry the message only if by the time TTL occurs its battery remains above the preset battery threshold. A compromise, therefore, would be to set the right hand side of the inequality to (batteryThrd / 2) which would neither deplete the battery completely, nor be too restrictive.

noP-wD-TTL: This version does not consider the PIPeR rank value in candidate selection; it relies instead on the weighted depletion rate value and on whether the candidate node’s battery can survive - in other words, whether it remains above half the battery threshold - until the message TTL. This is attained by satisfying inequality (5.10).

$$\mathrm{battery}(i) - \big(\mathrm{DepRate}(i) \cdot w_{DepRate} \cdot \mathrm{AdaptiveRange}(i) \cdot \mathrm{TTL} + forwardpower + WiFiconnectionpower\big) \ge \frac{batteryThrd}{2} \tag{5.10}$$
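The TTL-survival test in (5.10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, all battery quantities are assumed to share one unit (for example, percent of capacity), and the depletion rate is assumed to be expressed per unit of the remaining TTL.

```python
def survives_until_ttl(battery, dep_rate, ttl_remaining,
                       forward_power, wifi_power, battery_thrd,
                       w_dep_rate=1.0, adaptive_range=1.0):
    """Check inequality (5.10) for one candidate node.

    Assumption: with w_dep_rate = adaptive_range = 1 the left-hand side
    equals the unweighted left-hand side used in (5.9).
    """
    # Predicted battery drain until the message TTL expires.
    predicted_drain = dep_rate * w_dep_rate * adaptive_range * ttl_remaining
    remaining = battery - (predicted_drain + forward_power + wifi_power)
    # The compromise threshold discussed in the text: half of batteryThrd.
    return remaining >= battery_thrd / 2
```

For example, a node at 80% battery that loses 0.5% per minute with 60 minutes of TTL left ends at 47% after paying 3% for the transfer and the WiFi connection, which passes a threshold of 50% / 2 = 25%.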

5.1.5 Integrating Contact Duration Awareness with PIPeR

We propose integrating contact-duration awareness into the PIPeR algorithm by factoring the expected contact duration between the carrier node and the candidate node into the node’s social rank. The criteria for candidate selection are amended with a check of the expected contact duration and of whether it will be sufficient for the message transfer between the two nodes to complete. The algorithm thus seeks contact durations that are long enough to complete the message transfer from one node to another. Two sets of experiments are conducted: the first does not include the prediction of the expected period of contact (contact duration) for the next timeslots in the implementation of the PIPeR algorithm, while the second does.
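As a minimal sketch of the sufficiency check described above, the following Python function compares an expected contact duration with the time a transfer would need. The function name, the parameters, the constant link rate and the optional association delay are illustrative assumptions; they are not taken from the PIPeR implementation.

```python
def contact_is_sufficient(expected_contact_s, message_bytes,
                          link_bytes_per_s, association_s=0.0):
    """Return True if the expected contact can fit the whole transfer.

    All parameters are hypothetical: a constant link rate is assumed,
    and association_s models an optional connection set-up delay.
    """
    transfer_s = association_s + message_bytes / link_bytes_per_s
    return expected_contact_s >= transfer_s
```

With a 1 MB message, a 100 kB/s link and a 10 s association delay, a predicted contact of 30 s is long enough, whereas a predicted contact of 15 s is not.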

In the experiments, three configurations of PIPeR are compared: one that relies on the Kalman filter prediction technique [26] [27] to estimate the remaining time for which the nodes are expected to stay in contact (coined wExp-withAssoc and wExp-woAssoc); one that has access to an accurate expectation of the contact durations (referred to as AccExp-withAssoc and AccExp-woAssoc); and one that does not rely on any expectation of contact duration (referred to as woExp-withAssoc and woExp-woAssoc).

We propose another variation of the algorithm which transfers portions of the message between 2 nodes. In such a case, the carrier drops the transferred portion and turns its attention to transferring the remaining portions of the same message. This solution can consider retaining n copies of each transferred portion to safeguard against failure to complete delivery of the already transferred part to the final destination nodes.

5.2 PI-SOFA Integration with SocialCast

SocialCast [24] is an interest-based routing protocol for DTNs that supports the publish-subscribe mechanism. It uses a prediction-based mechanism to guide the selection of message holders, combined with a store-and-forward option to cope with intermittently connected networks. SocialCast observes previous co-location and mobility patterns in order to predict the users’ upcoming mobility patterns using Kalman filter forecasting techniques [26], and thereby selects the carriers most able to forward messages from the publisher to the interested subscribers. SocialCast complements the information about the recipients’ interests, which is necessary to route information, with any data about the social ties among people and their consequent predicted movements. SocialCast assumes that users with common interests are likely to meet each other more often than other users. Thus, users with common interests often have similar mobility patterns and have high values of co-location with other nodes. That is, when selecting the best forwarder nodes, SocialCast relies on the likelihood of the forwarder candidate’s co-location with the destination nodes as well as on the candidate’s change in degree of connectivity.

Having provided the above overview, we now consider how SocialCast builds a utility rank for each node, which is computed as shown in equation (5.11).

$$\mathrm{util}(i) = w_{col} \cdot P_{col}(i) + w_{cdc} \cdot P_{cdc}(i) \tag{5.11}$$

where Pcol(i) and Pcdc(i) are, respectively, the predicted co-location of node i with any of the destination nodes and the predicted Change Degree Of Connectivity of node i. Their corresponding weights wcol and wcdc represent the relative importance of the co-location attribute and the change of degree of connectivity attribute.

It is worth noting that SocialCast is restrictive in forwarder selection for two reasons: 1) it allows each node to forward only one copy of the message; 2) it allows only the best candidate among the group of currently contacted nodes to receive that copy of the message. This is based on the assumption that users with common interests tend to meet with each other more often than they do with other users. Relying on the likelihood of co-location with destination nodes, SocialCast includes the following two attributes in the utility function calculation: 1) The co-location attribute is included in the calculation of the utility value of each node, which is used for selecting the next message holders. 2) The change in degree of connectivity of each candidate node is taken into consideration in order to choose the one with a frequently changing set of connected neighbors. The change in degree of connectivity can be attributed to a moving node or to a static node that is surrounded by dynamic nodes. Consequently, the underlying connectivity graph is time-variant depending on the mobility model. SocialCast uses the Girvan-Newman clustering approach to detect communities in the synthetic social network.

Generally speaking, SocialCast selects the carriers whose predicted mobility pattern will most probably bring them in contact with any one of the destination nodes. Thus, the co-location and change in degree of connectivity attributes are included in the calculation of the utility value of each node, which is used in the process of selecting the next message holders.
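The utility computation of equation (5.11) and the single-best-candidate rule can be sketched as follows. This is an illustrative Python sketch, not the SocialCast implementation: the tuple layout and function names are hypothetical, the default weights are the SocialCast values listed in Table 5.2, and the requirement that the chosen neighbour must beat the carrier’s own utility is an assumption borrowed from the forwarding conditions stated later for the ISCast variant.

```python
def socialcast_util(p_col, p_cdc, w_col=0.75, w_cdc=0.25):
    """Equation (5.11): weighted sum of the two predicted attributes."""
    return w_col * p_col + w_cdc * p_cdc


def best_candidate(carrier, neighbours):
    """Return the name of the single neighbour that receives the copy.

    `carrier` and every neighbour are (name, p_col, p_cdc) tuples.
    Returns None when no neighbour beats the carrier (assumed rule).
    """
    if not neighbours:
        return None
    top = max(neighbours, key=lambda n: socialcast_util(n[1], n[2]))
    if socialcast_util(top[1], top[2]) > socialcast_util(carrier[1], carrier[2]):
        return top[0]
    return None
```

For a carrier with utility 0.2 and two neighbours with utilities 0.7 and 0.3, only the first neighbour is selected, which mirrors the one-copy, one-recipient restriction described above.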

5.2.1 ISCast: Interest-Aware SocialCast

ISCast presents a forwarding process that is aware of the node’s interest in the forwarded message without reliance on the interest of the node’s friends. This type of forwarding is applicable in areas where co-location of the node and its friends is rare or is at least not the norm. The ISCast algorithm introduces interest awareness in the forwarder node selection process by rewarding or penalizing the utility rank of the node based on the node’s penalized similarity interest PSInt(i,msg), which is computed as per equation (3.1). Thus, ISCast magnifies the node’s social rank if the node’s interest in the message is above a certain interest threshold ThrInt and penalizes it otherwise. The utility value of the node util(i) is then weighted by PSInt(i,msg) to produce the interest-aware utility value of the node, as shown in equation (5.12).

$$\mathrm{Iutil}(i, msg) = \mathrm{PSInt}(i, msg) \cdot \mathrm{util}(i) \tag{5.12}$$

When the message holder i encounters node j, it forwards the ad if node j satisfies either of the following conditions: 1) SInt(j, msg) is above ThrInt. 2) Iutil(j,msg) exceeds Iutil(i,msg) and also exceeds the Iutil of all the nodes k that node i is in contact with at that moment.

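$$\mathrm{SInt}(j, msg) \ge Thr_{Int} \;\lor\; \big(\mathrm{Iutil}(j) > \mathrm{Iutil}(i) \;\land\; \mathrm{Iutil}(j) > \mathrm{Iutil}(k) \;\; \forall k \in \mathrm{InContact}(i)\big) \tag{5.13}$$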
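Condition (5.13) can be read as a simple boolean test. The sketch below is a hedged Python illustration: the dictionary keys and the function name are hypothetical, and PSInt (equation 3.1) and util (equation 5.11) are treated as precomputed inputs rather than derived here.

```python
def iscast_should_forward(j, i, in_contact, thr_int):
    """Evaluate condition (5.13) for carrier i meeting node j.

    Each node is a dict with the assumed keys 'sint' (SInt),
    'psint' (PSInt) and 'util'; `in_contact` lists the nodes k
    currently in contact with i (j itself is skipped if present).
    """
    def iutil(node):
        # Equation (5.12): interest-aware utility.
        return node["psint"] * node["util"]

    # Clause 1: j is sufficiently interested in the message.
    if j["sint"] >= thr_int:
        return True
    # Clause 2: j outranks the carrier and every other contacted node.
    return iutil(j) > iutil(i) and all(
        iutil(j) > iutil(k) for k in in_contact if k is not j)
```

The first clause lets any sufficiently interested node receive the message regardless of rank; the second clause keeps the original SocialCast behaviour of handing the copy only to the best-ranked neighbour.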

5.2.2 PISCast: Power-and-Interest-Aware SocialCast

The PISCast version improves the interest-awareness and power-awareness of the SocialCast algorithm. According to PISCast, the node’s utility value is rewarded or penalized based on the node’s similarity interest PSInt and the remaining power level Bat.

$$\mathrm{PIutil}(i) = \begin{cases} -\big(\mathrm{PSInt}(i, msg) \cdot \mathrm{Bat}(i)\big) \cdot \mathrm{util}(i) & \text{if } \mathrm{PSInt} < 0 \;\land\; \mathrm{Bat} < 0, \\ \mathrm{PSInt}(i, msg) \cdot \mathrm{Bat}(i) \cdot \mathrm{util}(i) & \text{otherwise} \end{cases} \tag{5.14}$$

where util(i) is computed as per equation (5.11), and PSInt(i,msg) is computed as per equation (3.1).

Furthermore, the selected forwarder node must satisfy either of the following conditions: 1) PIutil(j,msg) exceeds PIutil(i,msg) and also exceeds the PIutil of all the nodes k that node i is in contact with at that moment. 2) SInt(j, msg) is above ThrInt. This is expressed in the following condition.

$$\mathrm{SInt}(j, msg) \ge Thr_{Int} \;\lor\; \big(\mathrm{PIutil}(j) > \mathrm{PIutil}(i) \;\land\; \mathrm{PIutil}(j) > \mathrm{PIutil}(k) \;\; \forall k \in \mathrm{InContact}(i)\big) \tag{5.15}$$
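The sign handling in the PISCast utility of equation (5.14) is easy to get wrong, so a small Python sketch may help. The function name is hypothetical, and PSInt, Bat and util are assumed to be supplied as already-computed reward/penalty factors that may be negative.

```python
def pi_util(psint, bat, util):
    """Equation (5.14): power-and-interest-aware utility of a node.

    When both the interest factor and the battery factor are negative,
    their product would be positive, so the sign is flipped to keep
    the result a penalty.
    """
    product = psint * bat * util
    if psint < 0 and bat < 0:
        return -product
    return product
```

Without the first branch, a node that is both uninterested and low on power would be rewarded, because the two negative factors would cancel each other.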

5.2.3 PISCastOp: Threshold-Opportunistic PISCast

The threshold-opportunistic version of PISCast, namely PISCastOp, seeks candidates that satisfy either the PISCast conditions or the Opportunistic-Interest-Power condition. The conditions of the threshold-opportunistic version are met by satisfying the following condition.

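$$\big(\mathrm{SInt}(j, msg) \ge Thr_{Int} \;\land\; \mathrm{bat}(j) \ge Thr_{bat}\big) \;\lor\; \big(\mathrm{PIutil}(j) > \mathrm{PIutil}(i) \;\land\; \mathrm{PIutil}(j) > \mathrm{PIutil}(k) \;\; \forall k \in \mathrm{InContact}(i)\big) \tag{5.16}$$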
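A compact way to read condition (5.16) is as the disjunction of an opportunistic clause and a best-rank clause. The following Python sketch assumes that the PIutil values have already been computed; the function and parameter names are illustrative only.

```python
def piscastop_should_forward(sint_j, bat_j, piutil_j, piutil_i,
                             piutil_others, thr_int, thr_bat):
    """Condition (5.16) for carrier i meeting candidate j.

    `piutil_others` holds the PIutil values of the other nodes k
    currently in contact with i (an assumed representation).
    """
    # Opportunistic-Interest-Power clause: interested AND power-capable.
    opportunistic = sint_j >= thr_int and bat_j >= thr_bat
    # Rank clause: j outranks the carrier and all other contacts.
    best_rank = piutil_j > piutil_i and all(
        piutil_j > p for p in piutil_others)
    return opportunistic or best_rank
```

Compared with (5.15), the interest clause is tightened by the battery threshold, so an interested node with a nearly empty battery is no longer selected on interest alone.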

5.2.4 PISCastDep: Depletion-Rate-Aware PISCast

This version is aware of the battery’s depletion rate: the node is allowed to keep holding a copy of the message as long as its battery level can be sustained until the TTL of the carried message. This is achieved by computing a TTLcheck value that predicts the battery level of this node by the time TTL(msg) is reached. This TTLcheck value is compared against a predefined battery threshold value; if the battery level is expected to be above this threshold, the node is allowed to keep a copy of the message and continues to seek the best candidates for message forwarding; otherwise, the node has to drop the message to conserve its remaining battery level. The TTLcheck value is computed as per equation (5.17).

$$\mathrm{TTLcheck}(i, msg) = \mathrm{bat}(i) - \Big(\mathrm{deprate}(i) \cdot \big(\mathrm{TTL}(msg) - \mathrm{time}(now)\big) + \mathrm{forwardpower}(i) \cdot \mathrm{transfertime}(msg) + \mathrm{WiFipower}(i)\Big) \tag{5.17}$$

where bat(i) is the current battery level of node i, deprate(i) is the detected rate of depletion of this node’s battery, TTL(msg) is the time-to-live of the forwarded message, time(now) is the current time slot, forwardpower(i) is the amount of power consumed due to forwarding messages by the mobile brand of node i, transfertime(msg) is the time duration needed for a complete transfer of the message between two nodes to occur, and finally W iF ipower(i) is the amount of power consumed due to a WiFi scan initiated by node i.

After the computation of TTLcheck(i), the message carrier - the node whose IsAholder(i, msg) = true - checks whether this predicted battery level is at or above a predefined threshold in order to decide whether to keep a copy of the message - by setting inBuffer(i, msg) to true - and to continue seeking the best candidates, as illustrated in the following pseudo-code:

$$\mathrm{inBuffer}(i, msg) = \begin{cases} \text{true} & \text{if } \mathrm{IsAholder}(i, msg) \;\land\; \mathrm{inBuffer}(i, msg) \;\land\; \mathrm{TTLcheck}(i) \ge batteryThrd, \\ \text{false} & \text{otherwise} \end{cases} \tag{5.18}$$

Each encountered node j computes its TTLcheck(j) and exchanges this piece of information with the message carrier. If TTLcheck(j) is above the threshold and its interest-and-power-aware utility value PIutil(j, msg) exceeds that of node i (PIutil(i, msg)) and those of the other nodes k which are in contact with node i at the moment, then node j is selected as the best candidate; it receives a copy of the message and its IsAholder(j, msg) becomes true. This logic is summarized as follows:

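$$\begin{aligned} &\mathrm{TTLcheck}(j) = \mathrm{bat}(j) - \Big(\mathrm{deprate}(j) \cdot \big(\mathrm{TTL}(msg) - \mathrm{time}(now)\big) + \mathrm{forwardpower}(j) \cdot \mathrm{transfertime}(msg) + \mathrm{WiFipower}(j)\Big) \\ &\textbf{if } \mathrm{IsAholder}(i, msg) \;\land\; \mathrm{inBuffer}(i, msg) \;\land\; \mathrm{TTLcheck}(j) \ge batteryThrd \\ &\quad \land\; \big(\mathrm{PIutil}(j) > \mathrm{PIutil}(i) \;\land\; \mathrm{PIutil}(j) > \mathrm{PIutil}(k) \;\; \forall k \in \mathrm{InContact}(i)\big) \\ &\qquad bestcandidate = j \\ &\qquad \mathrm{IsAholder}(j, msg) = \text{true} \end{aligned} \tag{5.19}$$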
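The depletion-rate check of equations (5.17) and (5.18) can be illustrated with a short Python sketch. The function names and argument order are hypothetical, and all power quantities are assumed to be expressed in the same unit as the battery level (for example, percent of capacity).

```python
def ttl_check(bat, dep_rate, ttl, now, forward_power, transfer_time, wifi_power):
    """Equation (5.17): predicted battery level when the TTL expires."""
    return bat - (dep_rate * (ttl - now)
                  + forward_power * transfer_time
                  + wifi_power)


def keeps_copy(is_holder, in_buffer, check_value, battery_thrd):
    """Equation (5.18): does the carrier keep the message buffered?"""
    return is_holder and in_buffer and check_value >= battery_thrd
```

As a worked example, a node at 90% battery that loses 0.5% per time slot, with 60 slots left until the TTL, a transfer costing 0.2% per slot over 5 slots and a 1% WiFi scan cost, is predicted to end at 90 - (30 + 1 + 1) = 58%, so it keeps its copy under a 50% threshold.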

5.2.5 PISCastOpDep: Depletion-Rate-Aware PISCastOp

This version is aware of the battery’s depletion rate and also takes into consideration the interest-and-power-aware opportunistic selection. This is achieved by applying the conditions of PISCastDep and the conditions of PISCastOp.

5.3 PI-SOFA Integration with SCAR

SCAR [27] is a sensor context-aware routing protocol for opportunistic routing. SCAR is an adaptive power-aware version of SocialCast. SCAR relies on predicted values of the utility attributes, instead of on current values, to improve performance. The SCAR protocol is applicable in opportunistic routing since it relies on existing routes between mobile sensor nodes. The forecast mechanism is based on observing the candidate node’s history of co-location with the destination nodes, its change of degree of connectivity with other sensor nodes, and its battery level. SCAR includes a monotonically decreasing function that adapts to the predefined ranges of values of a certain attribute. The algorithm mainly sets an adaptive range for the battery level. This range, Rangebat, monotonically decreases as the battery level enters a critical range. The utility function weighs all the attributes so as to maximize the benefit gained from them and to balance the trade-off between them. The route selection process is based on selecting the sensor nodes with the higher utility value from among their neighbor sensors. The node’s utility rank as defined in [27] is shown in equation (5.20).

$$\mathrm{Util}(i) = Range_{col} \cdot w_{col} \cdot P_{col}(i) + Range_{cdc} \cdot w_{cdc} \cdot P_{cdc}(i) + Range_{bat} \cdot w_{bat} \cdot P_{bat}(i) \tag{5.20}$$

where Pcol(i), Pcdc(i) and Pbat(i) respectively are the predicted co-location of node i with any of the sink nodes, the predicted change degree of connectivity of node i, and the predicted remaining battery level of node i. The three predicted attributes are forecasted by applying Kalman filter prediction techniques. The weights wcol, wcdc and wbat respectively represent the relative importance of the co-location attribute, the change of degree of connectivity attribute and the battery level attribute. The authors of the SCAR contribution mention that the values of these weights depend on the application scenario. Also, Rangecol, Rangecdc and Rangebat are the adaptive monotonically decreasing range functions for the respective attributes.

The SCAR authors declare that this approach is suitable for environments where mobility patterns can be predicted to a certain extent. The authors have shown how the protocol forwards sensor data through mobile sensor nodes to fixed or mobile destination nodes. This protocol is aware of the power consumption and selects the routes that consume less power. The SCAR protocol selects the best candidates for forwarding sensor data based on forecasting the greater likelihood that they will meet destination nodes. The protocol is completely distributed since each node computes its attributes locally.

From their simulation experiments, the SCAR authors find that the optimum values for the weights and range functions are as follows: Rangecol = 1, Rangecdc = 1, Rangebat = remaining battery portion, wcol = 0.75, wcdc = 0.25 [27]. Because they have not conducted experiments that include the battery attribute, they do not mention any optimum set of weights that include wbat.
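The SCAR utility of equation (5.20) can be evaluated numerically as in the following Python sketch. The function signature is hypothetical; the range factors default to 1 as reported above for Rangecol and Rangecdc, while Rangebat must be supplied by the caller (for instance, as the remaining battery portion). The weights used in the worked example below are an illustrative choice, not values prescribed by the SCAR authors.

```python
def scar_util(p_col, p_cdc, p_bat, w_col, w_cdc, w_bat,
              range_bat, range_col=1.0, range_cdc=1.0):
    """Equation (5.20): adaptive, weighted SCAR utility of one node."""
    return (range_col * w_col * p_col
            + range_cdc * w_cdc * p_cdc
            + range_bat * w_bat * p_bat)
```

For example, with Pcol = 0.8, Pcdc = 0.4, Pbat = 0.5, weights (0.5625, 0.1875, 0.25) and Rangebat = 0.5, the utility is 0.45 + 0.075 + 0.0625 = 0.5875. Setting wbat = 0 with wcol = 0.75 and wcdc = 0.25 reduces the expression to the two-attribute form the SCAR authors evaluated.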

5.3.1 ISCAR: Interest-Aware SCAR

This interest-aware version of SCAR rewards or penalizes the adaptive SCAR utility value of a node based on the collective interest of the node and that of its friends in the forwarded message. This version minimizes contact with uninterested nodes and favors potentially interested nodes that are socially connected to potentially interested friends. The new utility function is computed as follows:

$$\mathrm{IUtil}(i) = \mathrm{FrInter}(i, msg) \cdot \mathrm{Util}(i) \tag{5.21}$$

where Util(i) is computed as per equation (5.20) and FrInter(i,msg) is computed as per equation (3.2). Furthermore, the criteria for forwarder selection are the same as those applied by ISCast, as shown in condition (5.13).

5.3.2 PISCAR: Power-and-Interest-Aware SCAR

The PISCAR version integrates power-awareness into the interest-aware ISCAR version. Through this version, the partially interested nodes that are power capable become favored forwarder candidates. This version considers whether the power-aware approach preserves more power than the interest-oblivious power-aware SCAR. This comparison is made by assessing their performance in terms of power consumption and fair utilization of power resources.

5.3.3 PISCAROp: Threshold-Opportunistic PISCAR

The opportunistic version of PISCAR, namely PISCAROp, seeks candidates that satisfy either the PISCAR conditions or the Opportunistic-Interest-Power condition. The Opportunistic-Interest-Power condition is met by satisfying the condition given in (5.16).

5.3.4 PISCARDep: Depletion-Rate-Aware PISCAR

This version of PISCAR considers the depletion rate of the node’s battery and whether the node can sustain itself until it delivers the message before the message’s TTL expires. This version also takes into consideration the amount of power consumed in the WiFi scanning and forwarding actions. If the node’s TTLCheck value - which is calculated according to equation (5.22) - reaches or exceeds a predefined power threshold Thrbat, the weight coefficient of the node’s battery level attribute is set to a high predefined value (in our simulation we set it to 0.25); otherwise, the weight coefficient is set to the remaining ratio of the battery level, which would be a small ratio for exhausted nodes, as shown in (5.23).

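$$\mathrm{TTLCheck}(i, msg) = \mathrm{bat}(i) - \Big(\mathrm{DepRate}(i) \cdot \big(\mathrm{TTL}(msg) - \mathrm{Time}(now)\big) + \mathrm{forwardpower}(i) \cdot \mathrm{TransferTime}(msg) + \mathrm{WiFiconnectionpower}(i)\Big) \tag{5.22}$$

$$w_{bat} = \begin{cases} 0.25 & \text{if } \mathrm{TTLCheck} \ge Thr_{bat}, \\ \mathrm{bat}(i) & \text{otherwise} \end{cases} \tag{5.23}$$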

This way of controlling the weight coefficient of the battery level attribute enables the algorithm to control the participation of the current message carrier node: the algorithm forces this node to drop the message if its power level drops below a predefined threshold, at which point its PIutil decreases dramatically in value. This control process is similar to the one applied by the PISCastDep algorithm and detailed in the pseudo-code in Section 5.2.4.
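The weight-switching rule of PISCARDep can be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch assumes that the battery level, the TTLCheck value and the threshold are all expressed as ratios in [0, 1], so that the remaining battery ratio can be used directly as a weight.

```python
def battery_weight(ttl_check_value, bat_ratio, thr_bat, high_weight=0.25):
    """Select w_bat for a node under the PISCARDep rule.

    high_weight defaults to 0.25, the value reported for the simulation;
    otherwise the (small) remaining battery ratio is used as the weight.
    """
    if ttl_check_value >= thr_bat:
        return high_weight
    return bat_ratio
```

A node predicted to finish above the threshold keeps the full battery weight of 0.25, whereas a nearly exhausted node with 8% battery left contributes only 0.08, which lowers its utility and pushes it out of the forwarding process.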

5.3.5 PISCAROpDep: Depletion-Rate-Aware PISCAROp

Finally, a performance comparison between this version and the proposed adaptive ranking algorithm is conducted. Details are provided in the following section.

5.4 Evaluation

For each of the aforementioned algorithmic contributions, we present the simulation setup and the experimental results in detail. First, the simulation assumptions are listed, followed by the methodology applied in the simulation runs. Next, the set of simulation environment parameters is presented and discussed. Finally, the metrics used in these simulations are described.

5.4.1 Simulation Assumptions

For the proposed algorithms to operate effectively, a set of attainable assumptions are made based on the capabilities of the technology that is currently available. These assumptions are:

1) That an ontology of interest is present among nodes.

2) That the node of each user has an installed client that carries a local copy of the user’s social profile that is cached from his online social network.

3) That direct interest can be extracted from the social profile of the candidates.

4) That all the messages are of the same size (for simplicity of cost calculations).

5) That the messages are short-lived, meaning that the target users will be present in a given place for only a short period of time, e.g. a couple of hours.

6) That a fully connected social graph among the users in place is not required, since these algorithms are based on interest and friendship and are intended for environments such as a mall, with a mobility duration of a couple of hours (in contrast to the CA-PeR algorithm). Note that our algorithms have been simulated once with a fully connected social graph and once without this precondition in order to demonstrate their applicability in either environment.

7) That the message sender is the source and has no interest in receiving its own message. Thus, for the IPeR and PIPeR versions, the sender’s rank starts as (1-d) and never improves due to the zero-interest component in the equation. However, all the forwarders update their ranks as they encounter their friends so that the algorithms are able to be more selective in light of the candidate’s popularity and interest.

8) That each node is able to provide its current power level when requested in order to achieve power-awareness. In addition, if the depletion rate of the node’s battery is included in the calculations of the node’s rank, each node is assumed to provide this information as needed.

5.4.2 Methodology

This study empirically evaluates the improvement in performance of PeopleRank [22], SocialCast [24] and SCAR [27] algorithms after integrating the PI-SOFA framework as follows:

1) Integrating interest-awareness to produce their new interest-aware versions (IPeR, ISCast and ISCAR).

2) Integrating power-awareness to empirically evaluate the power and interest-aware versions of PeopleRank, SocialCast and SCAR (PIPeR, PISCast and PISCAR).

3) Integrating threshold-opportunistic forwarding to produce PIPeROp, PISCastOp and PISCAROp.

4) Implementing the interest-oblivious and power-oblivious Epidemic [157] algorithm to set its performance as a benchmark for all the algorithms.

5) Comparing the new versions’ performance to another popular social aware forwarding algorithm, namely, ProfileCast [25], which relies on the assumption that users of similar interest have similar mobility patterns. ProfileCast relies on users’ behavioral profiles in the selection of a forwarder node. For behavioral profile computations in ProfileCast, the SInt (similarity interest) of each node to the forwarded message is used.

6) Depicting the importance of integrating contact-duration-awareness in forwarder selection by introducing contact-duration-awareness in the forwarding decisions of the PIPeROp algorithm. This takes place through the implementation of a contact-duration aware version of PIPeROp that relies on the Kalman filter prediction mechanism [26] in candidate node selection. The contact-duration aware PIPeROp is implemented twice: once with the Kalman filter prediction, and once with an accurate contact-duration prediction mechanism. The contact-duration oblivious PIPeROp version is set as the benchmark in both cases.

7) The PIPeR algorithm integrates power and interest awareness into the PeopleRank algorithm by rewarding or penalizing the node’s rank based on remaining power, expected contact duration and depletion rate. The four proposed modes of PIPeR mainly vary in their selection of either a fixed predefined battery level threshold or an adaptive threshold that is dynamically updated over the course of the forwarding process. These PIPeR modes also vary based on whether or not they include an extra ranking component that explicitly favors power-capable nodes that are held by potentially interested users.

8) Studying the impact of introducing the depletion rate parameter in decision making. We simulated this depletion-aware performance by utilizing the real battery depletion rate dataset. This set of experiments compared the performance of the depletion-rate aware versions to that of the depletion-rate-oblivious versions.

9) Finally, the proposed adaptive ranking algorithm was empirically evaluated and its performance was compared to the adaptive SCAR algorithm. We specifically compare the performance of the new adaptive ranking algorithm to that of the opportunistic interest and power-aware (PISCAROp) version.

Table 5.2: Common Simulator Environment Parameters

| Parameter | Nominal Value | Range |
|---|---|---|
| No. of users | 100 | 10 - 300 |
| No. of senders | 5 | 1 - 20 |
| Set of Interests | 10 | 2 - 20 |
| Similarity interest distribution | discrete uniform | normal, discrete uniform distribution |
| Destination set | 18% | 10% - 50% |
| Damping factor | 0.87 | d = 1 - div [22] |
| SInt(source,msg) | 0.3 | 0 - 1 |
| Interested Forwarder Set | 27% | 0 - 54% |
| ProfileCast (CSI:D) mode | Thrnbr = 0.7, Thrfwd = 0.3 | |
| SocialCast | wcdc = 0.25, wcol = 0.75 | |

5.4.3 Simulation Environment Parameters

The SAROS simulator detailed in Chapter 4 has been used to evaluate the implemented versions of the PI-SOFA framework. The set of parameter values applied in this evaluation process mainly comprises interest-aware parameters and power-aware parameters:

Interest-Aware Parameters: In reality, not all users are interested in the same messages. To simulate this consideration, the simulator receives a set of interest-aware parameters which are detailed in Chapter 4. The values of the parameters that are applied to produce the results detailed in Section 5.5 are listed in Table 5.2.

Power-Aware Parameters: This set of parameters covers the initial battery level distribution, the simulated model of power consumption and the simulated distribution of the battery depletion rate. Table 5.3 lists the values of the power-aware parameters that are applied to produce the results detailed in Section 5.5.

To implement SCAR with an adaptive version that follows equation (5.20), various combinations of weights and ranges were tested, as shown in Table 5.4. The version names in the table are derived from the weights of the three attributes, namely the change in degree of connectivity, the co-location and the remaining battery. For instance, the version 15-10-75 is the SCAR version with the following weights: wcdc = 0.15, wcol = 0.1 and wbat = 0.75. The results of these experiments show that the best performance of SCAR is obtained with the following weight values: wcdc = 0.1875, wcol = 0.5625 and wbat = 0.25. Also, Rangebat is set according to the remaining portion of the battery. With these weights SCAR achieves the lowest power consumption with a moderate delivery ratio compared to its performance with all other weight variations. This version also significantly reduces the ratio of contacted uninterested forwarders but with very little reliance on interested forwarders.

Table 5.3: Common Simulator Environment Parameters

| Parameter | Nominal Value | Range |
|---|---|---|
| Mall Area | 1kmX1km | 1kmX1km, 2kmX2km, 4kmX4km |
| No. of users | 50 | 10 - 300 |
| No. of senders | 100 | 1 - 100 |
| Exp. Duration | 1 hour | 1hr, 2hr |
| Set of Interests | 10 | 2 - 20 |
| Similarity Interest Dist. | Discrete Uniform | Normal, Discrete Uniform Distribution |
| Destination set | 18% | 2% - 50% |
| Damping factor | 0.87 | d = 1 - div [22] |
| SInt(source,msg) | 0.3 | 0 - 1 |
| Interested Forwarder Set | 36% | 0 - 54% |
| reward | 0.5 | 0 - 1 |
| Initial Battery Distribution | Full Battery Distribution | Discrete Normal, Full Battery, Real Dataset [1] |
| Fixed Battery Threshold | 50% | 20% - 80% |
| Mobile Brands for Power Consumption | Samsung i900 Omnia, HTC Diamond 2 T5353, Samsung Galaxy i7500 and Samsung Spica [5] | |
| Depletion Rate | Usage Profiles [4] | Random Distribution in range [0%-50%], Real Dataset [1] |
| SCAR Parameters | wcdc = 0.1875, wcol = 0.5625, wbat = 0.25, Rangecdc = 1, Rangecol = 1, Rangebat = RemainingBatRatio | |
| ProfileCast | Thrnbr = 0.7, Thrfwd = 0.3 | |
| SocialCast | wcdc = 0.25, wcol = 0.75 | |
| Connection Association Time | 10 sec. | 0 - 20 sec. |

Table 5.4: Parameters of the examined SCAR versions

| Version Name | wcdc | wcol | wbat |
|---|---|---|---|
| 33-33-33 | 0.33 | 0.33 | 0.33 |
| 25-25-50 | 0.25 | 0.25 | 0.5 |
| 5-5-90 | 0.05 | 0.05 | 0.9 |
| 15-10-75 | 0.15 | 0.1 | 0.75 |
| 18-56-25 | 0.18 | 0.56 | 0.25 |
| 16-50-33 | 0.16 | 0.5 | 0.33 |
| 56-18-25 | 0.56 | 0.18 | 0.25 |
| Dynamic | 0.25 * (1 - wbat) | 0.75 * (1 - wbat) | remaining battery level |

It is worth noting from the results shown in Figures 5.3, 5.4, 5.5 and 5.6 that the dynamic version of SCAR achieves the highest delivery ratio with the least power consumption per delivery ratio. However, it consumes the highest percentage of power and forwards the largest number of messages. Also, it relies on a high ratio of contacted uninterested forwarders as well as a high ratio of contacted interested forwarders.

From these results, the SCAR version with the weights wcdc = 0.1875, wcol = 0.5625 and wbat = 0.25 was chosen to be the most comparable to PIPeR in terms of power awareness and uninterested forwarders avoidance.

Figure 5.3: Cost and Delivery Time of SCAR versions. Panels: (a) Cost over Time; (b) Cost vs. Delivery Ratio; (c) Delivery Ratio over Time.

Figure 5.4: Effectiveness of SCAR versions. Panels: (a) Interest-Based Classification; (b) Precision/Recall/Accuracy.

Figure 5.5: Power Consumption of SCAR versions. Panels: (a) Power Consumption over Time; (b) Power Consumption vs. Delivery Ratio.

5.4.4 Simulation Metrics

The goals of this research are: 1) to seek opportunistic contact with the interested users in the least possible time while minimizing the overall cost and consumed power, especially for users not interested in the message; 2) to maintain fair power consumption among the nodes over time. To measure the performance of the compared algorithms, our simulator uses metrics that fall under the following 4 main categories:

1) Effectiveness: ratio of interested forwarders, ratio of uninterested forwarders, recall, precision, accuracy, F-measure and the Effectiveness Performance Index (which is the harmonic mean of F-measure, Ratio of Interested Forwarders and Ratio of Uninterested Forwarders);

2) Efficiency: delivery ratio, cost (as represented in the number of forwarded copies of the message), delay in delivering the message, and the Efficiency Performance Index (which is the harmonic mean of Cost, Delivery Ratio and Delay);

3) Power-Awareness: the ratio of the consumed power, the fairness index as defined in [160], the final battery distribution, the mean, standard deviation and variance of the final battery levels, and the Power-Awareness Performance Index (which is the harmonic mean of Fairness Index and Ratio of Power Consumption).

4) Contact Duration Expectation: the amount of power and time wasted due to incomplete transfers.

All these metrics are discussed in more detail in Section 4.2.5.
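To make the composite indices more concrete, the following Python sketch shows a standard F-measure and a harmonic-mean index. It rests on two assumptions that are not confirmed in this section: the F-measure is taken to be the usual balanced F1 score, and the components passed to the index are assumed to be already normalized to positive values in which larger means better (how cost-like components such as delay or the ratio of uninterested forwarders are oriented is defined in Section 4.2.5, not here). The function names are illustrative.

```python
from statistics import harmonic_mean


def f_measure(precision, recall):
    """Balanced F1 score; assumed to match the F-measure used here."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def performance_index(components):
    """Harmonic mean of already-normalized, positive components."""
    return harmonic_mean(components)
```

Because the harmonic mean is dominated by its smallest component, an algorithm cannot obtain a high index by excelling in one metric while performing poorly in another; for instance, components of 1.0 and 0.5 yield an index of about 0.667 rather than the arithmetic mean of 0.75.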

5.5 Results

This section analyzes the results of the simulation experiments that were conducted. First, the performance of the compared algorithms is presented in terms of the interest-aware metrics. Second, the performance of the algorithms in terms of the power-aware metrics is given. Finally, a brief performance analysis of the PIPeR algorithm after introducing contact-duration-awareness is provided.

Figure 5.6: Fairness of SCAR versions. Panels: (a) Cost over Time; (b) Cost vs. Delivery Ratio; (c) Delivery Ratio over Time.

5.5.1 Interest Awareness

Effectiveness

Integration of interest awareness improves the three algorithms’ effectiveness in terms of an increase in the achieved delivery ratio, an increase in the ratio of contacted interested forwarders, and a decrease in the ratio of contacted uninterested forwarders. As shown in Figure 5.7a, the interest-aware algorithms maintain higher effectiveness than the interest-oblivious versions. This is clear from both the significant increase in the ratio of contacted interested forwarder nodes (the yellow portion of the bar), and the significant decrease in the ratio of uninterested nodes (the blue portion of the bar). This effectiveness is also depicted in Figure 5.7b, where the interest-aware forwarding algorithms achieve a higher F-measure than the interest-oblivious versions. For instance, both figures illustrate that the power and interest-aware PeopleRank version (PIPeROp) achieves almost the same delivery ratio as the PeopleRank algorithm and the same ratio of contacted interested forwarders, while reducing the ratio of contacted uninterested nodes to 4.5% instead of the 33.27% contacted by the PeopleRank algorithm. This improvement is mainly attributed to the interest and power-aware reward/penalty of the node’s rank. Similarly, when compared to the original SCAR and SocialCast algorithms, the interest and power-aware versions of SCAR and SocialCast achieve a higher delivery ratio and a much higher ratio of contacted interested forwarders, and they totally avoid contacting any uninterested nodes. This performance boosted their F-measure to the range of 0.3 to 0.43 instead of the 0.07 F-measure of the original SCAR and SocialCast algorithms. This is also attributed to the introduced interest and power awareness that rewards/penalizes the node’s rank.

Figure 5.7: Effectiveness. Panels: (a) Interest-based Effectiveness; (b) F measure.

Efficiency

The efficiency of the compared algorithms is measured in terms of cost (the number of message replicas forwarded until the destination nodes received a copy of the message), the achieved delivery ratio, and the delay in delivering content. In comparison to PeopleRank, the IPeR and PIPeR versions reduce cost by half while reaching 80% of the destination nodes, as depicted in Figure 5.8a. When compared to SCAR and SocialCast, their interest-aware and power-aware versions increase the cost sixfold for the sake of making contact with 19% extra destination nodes and 28% additional interested nodes, while also achieving total avoidance of contact with the uninterested nodes. These results are mainly attributed to the selective nature of the interest and power-aware reward/penalty factor of the node’s rank, which restricts the message replicas to only those who are interested and those who carry power-capable nodes.

(a) Cost vs. Del. Ratio (b) Delay

Figure 5.8: Efficiency

In terms of delay in delivery time, Figure 5.8b shows how all the interest-aware and power-aware versions maintain a comparable delay - if not a reduced one - to the interest and power-oblivious ones.

For instance, all the interest-aware and power-aware SCAR versions reduce delay to 29% of that incurred by SCAR. The figure shows that SocialCast achieves a very low delay of only 5 minutes, but this is because it made contact with just one destination node; if it had achieved a better delivery ratio, the delay would have been longer.

(a) Power Consumption vs. Delivery Ratio (b) Fairness

(c) 4-category Power Consumption (d) Mean and STD

Figure 5.9: Power Consumption and Fairness

5.5.2 Power Awareness

Power Consumption Awareness

Figure 5.9a illustrates the portion of power consumed versus the achieved delivery ratio. Compared to

PeopleRank, the power-aware versions of PeopleRank successfully conserve 10% of the power PeopleRank consumes to reach a comparable delivery ratio. However, the power-aware versions of the SCAR and

SocialCast algorithms consume 3.6% extra power to achieve a higher delivery ratio and to attain the highest level of interested node selection. The difference in performance between the power-aware SCAR and SocialCast versions on one side and the power-aware PeopleRank versions on the other side is attributed to the core logic of SCAR and SocialCast which selects a single candidate out of the currently contacted nodes after consuming several power-consuming comparison computations. Their power-aware versions add power-capability and interest-level as extra criteria for node selection, which inevitably

consumes more power but achieves more precise node selection.

Fairness

In terms of fairness in utilizing the nodes in the place, the fairness index defined in Section 4.2.5 is measured for all the compared algorithms. From the fairness index figure (Figure 5.9b), which depicts the fairness in utilizing the power resources of all the nodes in the simulation, it is apparent that the PIPeR versions are fairer than the power-oblivious PeopleRank and IPeR algorithms. On the other hand, the power-aware SCAR and power-aware SocialCast versions are less fair than SCAR and SocialCast, which hardly make any contact with nodes. Paradoxically, SCAR is considered the fairest algorithm even though it barely contacts nodes, leading to the least power consumption and the lowest delivery ratio.

However, all the power-aware SCAR and SocialCast versions maintain higher level of fairness than all the

PeopleRank versions and the benchmark algorithms. The ISCast and PISCastOp algorithms consume more power as they are concerned with the opportunistic selection of the interested forwarders with reliable power capabilities. It is worth noting that the adaptive PIPeR versions constitute the fairest algorithms since they barely consume the power of the prevailing majority of the community. Figures

5.9c and 5.9d second these observations since the PeopleRank algorithm produces a very low mean and a wide standard deviation, while the SCAR and SocialCast algorithms and their interest and power aware versions achieve the least standard deviation from the mean value.

Overhead of Exchanged Control Messages

Figure 5.10: Number of Produced Control Messages

In order to measure the overhead each algorithm poses in terms of extra control messages that are exchanged among nodes, this section illustrates the number of control messages exchanged within each algorithm during the forwarding process. This section also presents the percent of power consumed due to forwarding these control messages among nodes.

Figure 5.11: Percent of Consumed Power for the forwarded Control Messages

Figure 5.10 illustrates the number of control messages exchanged among nodes by applying each of the proposed algorithms within the simulation. From this figure, it is clear that the Epidemic algorithm generates the highest number of control messages, as it floods all the neighboring nodes with control information all the time. It is worth mentioning that the proposed interest and power-aware versions of PeopleRank incur more control messages than the original PeopleRank algorithm, as they exchange the interest vector of the forwarded content and the candidate node’s rank before making message forwarding decisions. On the other hand, the proposed interest and power-aware SCAR and SocialCast versions incur the same number of control messages, although the content of the messages differs slightly.

As shown in Figure 5.11, the proposed versions of the algorithms consume more power in control message exchange than the original algorithms. It is worth mentioning that despite this higher power consumption in control message exchange, the proposed versions consume less power overall in the whole forwarding process, as illustrated in Figure 5.9a. This indicates that the proposed algorithms spend more power on control messages in order to conserve power in the overall forwarding process. This in itself is an improvement in performance.

Depletion Rate Awareness

Performance of the Depletion-Rate-Aware PIPeR versions

According to the depletion-rate-aware PIPeR, the nodes with the lowest depletion rate are selected as forwarders. Thus, it is expected that the overall consumed power will be less than that consumed by the original PIPeR. Also, fewer forwarders are utilized by the depletion-rate-aware PIPeR than by PIPeR. Moreover, a greater portion of the nodes should lie in the two highest categories of the 4-category battery clustering when the depletion-rate-aware PIPeR is applied, in comparison to the clustering produced by applying the PIPeR algorithm.

(a) Cost over Time (b) Cost vs. Delivery Ratio

Figure 5.12: Cost of Depletion-Rate-Aware PIPeR

(a) Delivery Ratio over Time

(b) Effectiveness

Figure 5.13: Delivery Ratio and Effectiveness of Depletion-Rate-Aware PIPeR

We have conducted the experiments with the full battery distribution once with a random depletion rate (its results are illustrated in Figures 5.12, 5.13, 5.14 and 5.15) and another time with a uniform distribution of depletion rate. Another set of experiments was conducted with the heatmap battery distribution, and the depletion rate was computed accordingly. The results of the heatmap distribution experiments are shown in Figures 5.16, 5.17, 5.18 and 5.19. As these figures show, there is no significant difference in the delivery ratio attained by all the depletion-rate-aware PIPeR versions. However, the noP-wD version reaches the highest delivery ratio with the least power consumption per unit delivery ratio, and thus constitutes the fairest version.

(a) Power Consumption over Time (b) Power Consumption vs. Delivery Ratio

Figure 5.14: Power Consumption of Depletion-Rate-Aware PIPeR

The TTL-aware versions attain the lowest delivery ratio and the highest power consumption per unit delivery ratio, and as such are the least fair versions. This is attributed to the fact that these versions keep checking several candidates and only with difficulty find candidates that satisfy the TTL-aware criteria. These criteria may need to be relaxed slightly in order to enable the selection of more forwarders.

It is worth noting that P-noD-TTL reaches the highest portion of interested forwarders, while the noP-wD-TTL version avoids all uninterested forwarders. Also, it is noticeable that the versions that compare the depletion rate of the carrier to that of the candidate reach the lowest delivery ratio but avoid uninterested forwarders.

Performance of the Depletion-Rate-Aware SocialCast Versions

From Figures 5.20, 5.21 and 5.22, it is noticeable that the depletion-rate-aware versions, namely PISCastDep and its opportunistic version PISCastOpDep, successfully reduce cost, conserve power, attain a higher level of utilization fairness and are more precise. However, these algorithms lead to more delay and a lower delivery ratio as a result of being more selective in their choice of the next forwarder.

It is worth noting that the depletion-rate-aware PIPeR versions maintain similar performance to that of their depletion-oblivious peer versions, with only a slight difference in power preservation and a slight decrease in delivery ratio.

5.5.3 Contact Duration Expectation

Table 5.5 depicts the results of forwarding decisions based on the expected contact duration with the selected forwarder using each of the following contact expectation methods: accurate expectation, Kalman filter prediction, and no expectations at all. The results show that the accurate expectation of contact duration significantly reduces the amount of power and time wasted while preserving the f-measure. Also, applying the Kalman filter reduces the wasted resources at the cost of a reduced f-measure. From the conducted experiments, we noticed that introducing a Kalman filter prediction mechanism in the forwarder selection process may not be effective unless the proper initialization parameters are applied to the filter.

(a) Mean and STD (b) Variance over Time

(c) 4-category Battery Distribution

Figure 5.15: Fairness of Depletion-Rate-Aware PIPeR

Also, from the experiments we deduce that the Kalman filter requires a calibration period in order to reach a stage of accurate prediction. Thus, for the short-duration forward processes, the Kalman filter prediction mechanism may not be efficient and there is a need for more accurate and faster calibration mechanisms.
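The calibration remark can be illustrated with a minimal sketch of a scalar Kalman filter that predicts the next contact duration from past observations. This is not the implementation used in SAROS: the constant-duration state model, the class name ContactDurationKalman, the sample durations, and all parameter values (initial estimate, initial variance, process noise, measurement noise) are assumptions chosen only for illustration.

```python
class ContactDurationKalman:
    """Scalar Kalman filter with a constant-state model (illustrative only)."""

    def __init__(self, initial_estimate, initial_variance,
                 process_noise, measurement_noise):
        self.estimate = initial_estimate            # predicted contact duration (sec)
        self.variance = initial_variance            # uncertainty of the estimate
        self.process_noise = process_noise          # assumed drift between contacts
        self.measurement_noise = measurement_noise  # assumed observation noise

    def predict(self):
        # Constant model: the prediction equals the current estimate,
        # while the uncertainty grows by the process noise.
        self.variance += self.process_noise
        return self.estimate

    def update(self, observed_duration):
        # Blend prediction and observation according to the Kalman gain.
        gain = self.variance / (self.variance + self.measurement_noise)
        self.estimate += gain * (observed_duration - self.estimate)
        self.variance *= (1.0 - gain)
        return self.estimate


def run_filter(observations, **params):
    """Return the prediction made before each observation is seen."""
    kf = ContactDurationKalman(**params)
    predictions = []
    for duration in observations:
        predictions.append(kf.predict())
        kf.update(duration)
    return predictions


# Hypothetical contact durations (seconds) that stay near 60 s.
observed = [58.0, 62.0, 61.0, 59.0, 60.0, 60.5]
# A deliberately poor initial estimate (0 s) exposes the calibration period.
preds = run_filter(observed, initial_estimate=0.0, initial_variance=100.0,
                   process_noise=1.0, measurement_noise=25.0)
errors = [abs(p - o) for p, o in zip(preds, observed)]
print([round(e, 2) for e in errors])
```

With these assumed parameters the first prediction is far off and the error shrinks only after several contacts have been observed, which is the behaviour that makes the filter less suitable for short forwarding processes.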

Table 5.5: Contact-Duration-Awareness Wasted Power and Time

Contact-Aware Version            Accurate Expectation   Kalman Filter   No Expectation
Wasted Time (sec.)               2107.05                2069.15         2288.55
Wasted Power (joules)            6485499.9              6368843.7       7044156.9
Total Consumed Power (percent)   10.836                 10.795          10.805
F-measure                        0.7389                 0.7229          0.7265
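As a worked check of the savings implied by Table 5.5, the short sketch below computes the fraction of wasted time and wasted power avoided relative to the no-expectation baseline. The values are copied from the table; the helper name relative_saving and the dictionary keys are illustrative assumptions.

```python
# Values copied from Table 5.5.
wasted_time = {"accurate": 2107.05, "kalman": 2069.15, "none": 2288.55}         # seconds
wasted_power = {"accurate": 6485499.9, "kalman": 6368843.7, "none": 7044156.9}  # joules


def relative_saving(baseline, value):
    """Fraction of the baseline waste that is avoided."""
    return (baseline - value) / baseline


time_saving = {k: relative_saving(wasted_time["none"], v)
               for k, v in wasted_time.items() if k != "none"}
power_saving = {k: relative_saving(wasted_power["none"], v)
                for k, v in wasted_power.items() if k != "none"}

for method in ("accurate", "kalman"):
    print(f"{method}: time saved {time_saving[method]:.1%}, "
          f"power saved {power_saving[method]:.1%}")
```

For the accurate expectation both savings come out at roughly 8% of the baseline waste.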

5.5.4 Normalized Performance Indices

An analysis of Figures 5.23, 5.24 and 5.25 facilitates performance prediction of the algorithms across various environments. The 3 figures illustrate the power-awareness, the effectiveness, and the efficiency of the proposed algorithms in environments encompassing a range of distributions. These distributions include various interest distributions, power distributions, user densities, and message sizes. From the analysis, the following is deduced:

1) Across the various environment changes, the interest and power-aware PeopleRank versions maintain a higher level of power awareness than that achieved by PeopleRank, except for PIPeROp in the normal battery distribution environment.

2) The interest and power-aware SCAR and SocialCast versions achieve a slightly lower level of power awareness compared to that achieved by SCAR and SocialCast as they succeed in contacting a higher percent of destination nodes and interested forwarders, except for PISCAROp in the normal battery distribution environment.

3) SCAR and SocialCast always fail in the effectiveness metric, while the level of effectiveness of PeopleRank is always stable.

4) The efficiency metric of all algorithms is predictable in all environments except for the normal interest distribution, and this is due to the challenge imposed by the normal interest distribution that results in generating a very small set of destination nodes.

5) The efficiency of SCAR and SocialCast decreases as the user density increases; thus, they are not recommended in crowded areas.

6) PIPeROp maintains the highest level of effectiveness in the majority of the environments while retaining a very high level of efficiency and a fairly high level of power-awareness. Furthermore, the performance of PIPeROp improves as the user density increases.

(a) Cost over Time (b) Cost vs. Delivery Ratio

(c) Delivery Ratio over Time

Figure 5.16: Cost and Delivery Ratio of Depletion-Rate-Aware PIPeR - Heatmap Distribution

The 8-metrics Analysis: Figure 5.26 depicts, via an 8-metric space, a performance comparison of the PeopleRank versions. In comparison to PeopleRank, the interest and power-aware versions reduce cost and power consumption significantly. These versions focus mainly on interested forwarders and avoid the uninterested ones, thus attaining the highest level of f-measure with comparable delay and some reduction in delivery ratio. More precisely, the PIPeR versions attain the highest levels of fairness while consuming a little less power than PeopleRank. It is worth noting that the PIPeR versions successfully attain significant f-measure values, which costs them more than the cost incurred by the IPeR algorithm and leads to a slight reduction in their utilization fairness.

(a) Interest-based Classification (b) Precision/Recall/Accuracy

Figure 5.17: Effectiveness of Depletion-Rate-Aware PIPeR - Heatmap Distribution

(a) Power Consumption over Time (b) Power Consumption vs. Delivery Ratio

Figure 5.18: Power Consumption of Depletion-Rate-Aware PIPeR - Heatmap Distribution

Furthermore, integrating interest and power-awareness into SCAR improves its effectiveness, delivery ratio and delay. As depicted in Figure 5.27, the proposed interest and power-aware SCAR versions achieve a higher delivery ratio and f-measure, and maintain a significant increase in the ratio of contacted interested forwarders while also avoiding contact with uninterested nodes. This increase in delivery ratio and in interested forwarder contact incurs a reasonable increase in cost and power consumption with comparable delay. These versions overcome SCAR’s weakness of achieving the lowest delivery ratio.

The simulation results illustrated in Figure 5.28 demonstrate that the interest and power-aware SocialCast versions, in comparison to SocialCast, achieve a higher f-measure and delivery ratio. This achievement comes at the cost of comparable power consumption, fairness, cost and delay.

(a) Mean and STD (b) Variance over Time

(c) 4-category Battery Distribution

Figure 5.19: Fairness of Depletion-Rate-Aware PIPeR - Heatmap Distribution

Figures 5.26, 5.27 and 5.28 illustrate the extra f-measure level achieved by the threshold-opportunistic versions in comparison to their non-opportunistic peers. The threshold-opportunistic versions also maintain less delay and slightly more cost from contacting more destinations and more interested nodes.

5.6 Analysis

5.6.1 Simulation-based Analysis

The PI-SOFA framework is evaluated via real-data based simulations using our simulator (SAROS).

These simulations utilize datasets that include both realistic [149] [150] and synthesized mobility traces

[137], social profiles [149], social relationships [163], power consumption models [1] [5], and data that has been generated by SAROS itself. Moreover, evaluation metrics are devised for a performance comparison in order to measure the algorithms’ effectiveness, efficiency, power consumption, and utilization fairness.

The results of the simulation-based evaluation are as follows:

1) Integrating interest-awareness into the social aware forwarding process accomplishes up to 560% extra f-measure by excluding uninterested nodes and by focusing on the potentially interested ones.

2) Introducing power-awareness consumes 8% less power, incurs 41% less cost and attains higher fairness in power utilization. This is achievable via focusing on power-capable interested nodes.

3) Making accurate contact-duration expectations reduces the power and time wasted in incomplete message transfers between nodes by 8%.

Subsequently, a set of normalized performance indices is proposed to aid in evaluating the performance of the presented algorithms across various environment setups. The proposed performance indices can be used to guide content senders (such as advertisers) in choosing the optimum algorithm depending on the environment in which the message (such as an advertisement) will be forwarded. Overall, the proposed versions promote a trade-off between delivery ratio and delay on one side, and power preservation and fair utilization on the other side.

(a) Cost vs. Del. Ratio (b) Delay

Figure 5.20: Efficiency - Depletion Rate Awareness

(a) Interest-based Effectiveness (b) F measure

Figure 5.21: Effectiveness - Depletion Rate Awareness

5.6.2 Statistical Analysis

To examine the statistical significance of the performance of the proposed versions in comparison to the original algorithms, we apply a non-parametric test for two independent samples. Note that we did not apply the independent two-sample t-test since the set of samples is not large enough for t-tests and its distribution is not known beforehand. Alternatively, the non-parametric Mann-Whitney test for independent samples suits a small number of samples with unknown distributions.

(a) Power Consumption vs. Del. Ratio (b) Fairness

(c) 4-category Power Consumption (d) Mean and STD

Figure 5.22: Power Consumption and Fairness - Depletion Rate Awareness

PeopleRank Versions Statistical Significance Analysis

The statistical analysis presented in this subsection demonstrates the statistical significance of the PIPeROp algorithm in comparison to the original PeopleRank algorithm for the following reasons: 1) We propose the PIPeROp algorithm as the interest-and-power-aware opportunistic version of PeopleRank; and 2) The simulation-based analysis proves that it is the best version in terms of performance in comparison with both the original PeopleRank algorithm and the other proposed interest and power-aware PeopleRank versions.

(a) Effectiveness in Various Interest Distributions (b) Effectiveness in Various Battery Distributions

(c) Effectiveness in Various Message Sizes (d) Effectiveness in Various User Densities

Figure 5.23: Effectiveness Performance Index

The non-parametric two-independent-samples Mann-Whitney test produces the results displayed in Table 5.6 and Table 5.7. The results show a statistically significant difference (p-value < 0.05) in performance between the PeopleRank algorithm and the PIPeROp algorithm across all metrics except the utilization fairness metric, whose p-value exceeds 0.7. Accordingly, the mean ranks table (Table 5.6) shows that the PIPeROp algorithm achieves a higher mean rank in F-measure, contacts a significantly lower ratio of uninterested nodes, and significantly reduces the cost paid in forwarding content and the amount of power consumed in this process. However, the test indicates that the mean ranks of the PIPeROp algorithm for both the delivery ratio and the interested forwarders ratio are lower than those of PeopleRank.
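To show how the rank sums in Table 5.6 relate to the test statistics in Table 5.7, the following sketch derives the Mann-Whitney U value and its normal-approximation Z score from the group sizes and rank sums. The function name is an illustrative assumption, and the Z score is computed without a tie correction, so it may differ slightly from SPSS output when ties are present.

```python
import math


def mann_whitney_from_rank_sums(n1, rank_sum1, n2, rank_sum2):
    """Return (U, Z) from group sizes and rank sums.

    Uses the normal approximation without tie correction.
    """
    u1 = rank_sum1 - n1 * (n1 + 1) / 2
    u2 = rank_sum2 - n2 * (n2 + 1) / 2
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    std_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / std_u
    return u, z


# F-measure row of Table 5.6: PeopleRank (N=28, rank sum 519.00)
# and PIPeROp (N=20, rank sum 657.00).
u, z = mann_whitney_from_rank_sums(28, 519.0, 20, 657.0)
print(u, round(z, 3))
```

For the F-measure row this gives U = 113 and Z of about -3.492, in agreement with the first column of Table 5.7.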

SCAR Versions Statistical Significance Analysis

The statistical analysis presented in this subsection demonstrates the statistical significance of the PISCAROp algorithm in comparison to the original SCAR algorithm for the following reasons: 1) We propose the PISCAROp algorithm as the interest-and-power-aware opportunistic version of SCAR; and 2) The simulation-based analysis proves that it is the best version in terms of performance in comparison with both the original SCAR algorithm and the other proposed interest and power-aware SCAR versions.

The non-parametric two-independent-samples Mann-Whitney test produces the results displayed in Table 5.8 and Table 5.9. The results show statistical significance (p-value < 0.05) in performance between the SCAR algorithm and the PISCAROp algorithm across all the metrics.

According to the mean ranks presented in Table 5.8, the PISCAROp algorithm achieves a higher mean rank than SCAR in the following metrics: F-measure, ratio of contacted interested forwarders, delivery ratio, and utilization fairness. Moreover, the PISCAROp algorithm contacts a lower ratio of uninterested nodes than SCAR, but at the expense of a higher cost and more consumed power.

Metric                 Alg.      N    Mean Rank   Sum of Ranks
Fmeasure               PeR       28   18.54       519.00
                       PIPeROp   20   32.85       657.00
                       Total     48
IntFWD Ratio           PeR       28   29.61       829.00
                       PIPeROp   20   17.35       347.00
                       Total     48
UnIntFWD Ratio         PeR       28   34.50       966.00
                       PIPeROp   20   10.50       210.00
                       Total     48
Delivery Ratio         PeR       28   31.00       868.00
                       PIPeROp   20   15.40       308.00
                       Total     48
Cost                   PeR       28   33.39       935.00
                       PIPeROp   20   12.05       241.00
                       Total     48
Power-awareness        PeR       28   30.54       855.00
                       PIPeROp   20   16.05       321.00
                       Total     48
Utilization Fairness   PeR       28   25.04       701.00
                       PIPeROp   20   23.75       475.00
                       Total     48

Table 5.6: Ranks of PeopleRank and PIPeROp

                         Fmeasure  IntFWD   UnIntFWD  Delivery  Cost     Power      Utilization
                                   Ratio    Ratio     Ratio              awareness  Fairness
Mann-Whitney U           113.000   137.000  0.000     98.000    31.000   111.000    265.000
Wilcoxon W               519.000   347.000  210.000   308.000   241.000  321.000    475.000
Z                        -3.492    -2.991   -5.856    -3.910    -5.207   -3.534     -0.314
Asymp. Sig. (2-tailed)   0.000     0.003    0.000     0.000     0.000    0.000      0.754
Exact Sig. (2-tailed)    0.000     0.002    0.000     0.000     0.000    0.000      0.764
Exact Sig. (1-tailed)    0.000     0.001    0.000     0.000     0.000    0.000      0.382
Point Probability        0.000     0.000    0.000     0.000     0.000    0.000      0.008

Table 5.7: Statistical Significance of PeopleRank and PIPeROp

(a) Efficiency in Various Interest Distributions (b) Efficiency in Various Battery Distributions

(c) Efficiency in Various Message Sizes (d) Efficiency in Various User Densities

Figure 5.24: Efficiency Performance Index

SocialCast Versions Statistical Significance Analysis

The statistical analysis presented in this subsection demonstrates the statistical significance of the PISCastOp algorithm in comparison to the original SocialCast algorithm for the following reasons: 1) We propose the PISCastOp algorithm as the interest-and-power-aware opportunistic version of SocialCast; and 2) The simulation-based analysis proves that it is the best version in terms of performance in comparison with both the original SocialCast algorithm and the other proposed interest and power-aware SocialCast versions.

The non-parametric two-independent-samples Mann-Whitney test produces the results displayed in Table 5.10 and Table 5.11. The results show statistical significance (p-value < 0.05) in performance between the SocialCast algorithm and the PISCastOp algorithm across all the metrics except for the power awareness and the utilization fairness metrics, where the significance values exceed 0.05. According to Table 5.10, the PISCastOp algorithm achieves higher performance than the SocialCast algorithm in the following metrics: F-measure, the ratio of contacted interested forwarders, and the delivery ratio. Moreover, the PISCastOp algorithm significantly reduces the ratio of contacted uninterested nodes compared to the SocialCast algorithm, but at the expense of a higher cost.

Metric                 Alg.       N    Mean Rank   Sum of Ranks
Fmeasure               SCAR       29   15.00       435.00
                       PISCAROp   20   39.50       790.00
                       Total      49
IntFWD Ratio           SCAR       29   16.55       480.00
                       PISCAROp   20   37.25       745.00
                       Total      49
UnIntFWD Ratio         SCAR       29   32.69       948.00
                       PISCAROp   20   13.85       277.00
                       Total      49
Delivery Ratio         SCAR       29   19.52       566.00
                       PISCAROp   20   32.95       659.00
                       Total      49
Cost                   SCAR       29   15.66       454.00
                       PISCAROp   20   38.55       771.00
                       Total      49
Power-awareness        SCAR       29   20.90       606.00
                       PISCAROp   20   30.95       619.00
                       Total      49
Utilization Fairness   SCAR       29   28.62       830.00
                       PISCAROp   20   19.75       395.00
                       Total      49

Table 5.8: Ranks of SCAR and PISCAROp

                         Fmeasure  IntFWD   UnIntFWD  Delivery  Cost     Power      Utilization
                                   Ratio    Ratio     Ratio              awareness  Fairness
Mann-Whitney U           0.000     45.000   67.000    131.000   19.000   171.000    185.000
Wilcoxon W               435.000   480.000  277.000   566.000   454.000  606.000    395.000
Z                        -5.899    -4.990   -4.794    -3.234    -5.513   -2.421     -2.136
Asymp. Sig. (2-tailed)   0.000     0.003    0.000     0.001     0.000    0.015      0.033
Exact Sig. (2-tailed)    0.000     0.002    0.000     0.001     0.000    0.015      0.033
Exact Sig. (1-tailed)    0.000     0.001    0.000     0.000     0.000    0.007      0.016
Point Probability        0.000     0.000    0.000     0.000     0.000    0.000      0.001

Table 5.9: Statistical Significance of SCAR and PISCAROp

(a) Power Awareness in Various Interest Distributions (b) Power Awareness in Various Battery Distributions

(c) Power Awareness in Various Message Sizes (d) Power Awareness in Various User Densities

Figure 5.25: Power Awareness Performance Index

5.6.3 Conclusion

Generally speaking, integrating interest-awareness in social-based forwarding approaches maintains a balance between utilizing interest and social context information, which improves the performance of these forwarding algorithms in case of any discrepancy in the interest/social information available.

In contrast to the interest-oblivious algorithms, the interest-aware versions significantly concentrate their power consumption on contacting interested nodes while avoiding contact with uninterested ones.

Consequently, the interest-aware algorithms maintain a high level of precision. The power-aware algorithms also maintain a trade-off between reducing power consumption and securing a comparable delivery ratio. These power-aware versions are able to acquire this high level of precision, even though they may fall in moderate utilization fairness levels due to the burden the message carriers bear while seeking interested nodes with high power capabilities. The handling of extra computations by the message carriers causes the carrier nodes’ power exhaustion and increases the variance among the power levels of the community of nodes. For this reason, these power-aware versions set a power threshold to prevent the message carriers from approaching the power exhaustion border.

Metric                 Alg.        N    Mean Rank   Sum of Ranks
Fmeasure               SCast       12   7.25        87.00
                       PISCastOp   12   17.75       213.00
                       Total       24
IntFWD Ratio           SCast       12   8.67        104.00
                       PISCastOp   12   16.33       196.00
                       Total       24
UnIntFWD Ratio         SCast       12   18.50       222.00
                       PISCastOp   12   6.50        78.00
                       Total       24
Delivery Ratio         SCast       12   6.50        78.00
                       PISCastOp   12   18.50       222.00
                       Total       24
Cost                   SCast       12   6.67        80.00
                       PISCastOp   12   18.33       220.00
                       Total       24
Power-awareness        SCast       12   12.25       147.00
                       PISCastOp   12   12.75       153.00
                       Total       24
Utilization Fairness   SCast       12   12.67       152.00
                       PISCastOp   12   12.33       148.00
                       Total       24

Table 5.10: Ranks of SocialCast and PISCastOp

                         Fmeasure  IntFWD   UnIntFWD  Delivery  Cost     Power      Utilization
                                   Ratio    Ratio     Ratio              awareness  Fairness
Mann-Whitney U           9.000     26.000   0.000     0.000     2.000    69.000     70.000
Wilcoxon W               87.000    104.000  78.000    78.000    80.000   147.000    148.000
Z                        -3.637    -2.657   -4.373    -4.157    -4.042   -0.173     -0.115
Asymp. Sig. (2-tailed)   0.000     0.008    0.000     0.001     0.000    0.862      0.908
Exact Sig. (2-tailed)    0.000     0.007    0.000     0.000     0.000    0.887      0.932
Exact Sig. (1-tailed)    0.000     0.003    0.000     0.000     0.000    0.444      0.466
Point Probability        0.000     0.000    0.000     0.000     0.000    0.022      0.022

Table 5.11: Statistical Significance of SocialCast and PISCastOp

Figure 5.26: The 8-Metric Analysis of PeopleRank Versions

Figure 5.27: The 8-Metric Analysis of SCAR Versions

From the perspective of integrating contact-duration-awareness in forwarder selection, the more accurate the algorithm is in contact duration expectation, the less power and time are wasted in incomplete message transfers. This saves power consumption.

It can be concluded that integrating interest and power-awareness into social-aware forwarding algorithms improves their effectiveness and efficiency. The proposed variations of the interest and power-aware versions promote a trade-off between seeking a higher delivery ratio and f-measure on one side, and reducing power consumption and improving utilization fairness on the other side. In addition, the above analysis recommends that certain algorithms be utilized in specific environments.

Figure 5.28: The 8-Metric Analysis of SocialCast Versions

We speculate that the proposed versions have a number of valuable applications, particularly in domains where there is a need to send content (text or multimedia) to interested individuals and parties through opportunistic networks where wireless or network infrastructures may be limited. Most apparently, this would be of value to advertisers wishing to send messages to people in a mall environment, or for delivering portions of lecture notes or video content among students, or for sending notifications about venue or session times in a conference, for example. Content is commonly relevant to a specific group of people or communities, and is often valuable at a particular time and place. In the context of social media, individuals are often categorized in terms of their interests as stated in their profiles. For example, the abstracts of specific papers are relevant to some conference attendees (interested parties), but not to others (uninterested parties). Many interest groups could benefit from the proposed applications, such as patients in a hospital or people suffering from the same condition, students in a school or university, and shoppers in a mall, to name but a few. In urban settings in particular, people may be classified by location or form more temporary so-called “opportunistic encounters”.

Chapter 6

Space Syntax Forwarding Proposed Implementations

In this chapter we introduce the Space Syntax based forwarding framework described in Section 3.2 into our simulation environment. This is achieved by proposing better attraction point definition approaches for more accurate forwarding decisions that contribute towards better space planning. First, we question the accuracy of Space Syntax metrics in defining the attraction points in a space and argue that this negatively affects the performance and the accuracy of forwarding decisions, as shown in Section 6.1. Next, we propose new Space Syntax metrics, which are described in detail in Section 6.2. We then evaluate these proposals through our SAROS simulator, where we define the evaluation metrics in Section 6.3, while the applied methodology is described in Section 6.4. Finally, the results of the simulations of the proposed implemented versions are analyzed in Section 6.5.

6.1 Space-Syntax-based Forwarding versus Real Traces based Forwarding

First, we examine the correlation between the popularity of selected spots on the AUC map and their popularity based on the frequent association of mobile nodes with these spots. We consider the association frequency to be our benchmark as it reflects reality. We begin by computing the spots’ popularity using the Space Syntax metrics detailed in Section 3.2.1; namely, the Integration value, the Location index, and the Attraction value. After computing the popularity of each spot according to each of the three mentioned basic Space Syntax metrics, we run IBM SPSS (Statistical Package for the Social Sciences) to compute the Pearson and Spearman correlations between the popularity value of each one of the metrics and the popularity as per the association frequency. The significance value is also computed. The results of these correlation analyses are illustrated in Table 6.1. Note that a correlation is considered significant at the 0.05 level (2-tailed).

Metric              Pearson Correlation   Sig. Value   Spearman Correlation   Sig. Value
Attraction Value    -0.057                0.164        0.077                  0.061
Integration Value   0.057                 0.164        -0.077                 0.061
Location Index      -0.057                0.164        0.077                  0.061
Popularity Index    -0.104                0.012*       -0.040                 0.332
* Correlation is significant at the 0.05 level (2-tailed).

Table 6.1: Correlations between Space-Syntax Metrics and Frequency of Association

We notice outliers in the dataset, and so adopt the steps taken by [164] to determine whether the outliers influence the results. They state that if there is not much difference between Pearson’s correlation coefficient for the dataset with and without the outliers, then we can be fairly confident that the outliers are not influencing the results. The same process can be applied using Spearman’s rank coefficient.

When we conducted Pearson Correlation on the dataset with and without the outliers, we found the difference in Pearson Correlation values to be negligible, which indicates that these outlier values did not influence the results. We received the same results when we applied the comparison of values using the Spearman’s rank coefficient.
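The outlier check described above can be sketched in plain Python. This is only an illustration: the analysis in this section was run in IBM SPSS, and the helper names (pearson, ranks, spearman, outlier_shift) as well as the caller-supplied is_outlier rule are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def outlier_shift(x, y, is_outlier):
    """Absolute change in (Pearson, Spearman) after dropping flagged pairs.

    is_outlier(a, b) is a hypothetical, caller-supplied rule.
    """
    kept = [(a, b) for a, b in zip(x, y) if not is_outlier(a, b)]
    xs = [a for a, _ in kept]
    ys = [b for _, b in kept]
    return (abs(pearson(x, y) - pearson(xs, ys)),
            abs(spearman(x, y) - spearman(xs, ys)))
```

A small shift in both coefficients corresponds to the situation reported above, where the outliers do not influence the results.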

From the results of the comparison using Spearman’s rank correlation, we deduce that computing the popularity rank of the spots through any of the basic Space Syntax metrics does not reproduce the popularity rank obtained from the observed frequency of association. This indicates the inaccuracy of these metrics in defining the attraction points of a given place, and in setting these spots’ relative popularity rank among their peers in the place.

6.2 Proposed Solutions

Based on the conclusion reached in the previous section, we propose another way of defining and ranking attraction points: the attraction points are ranked according to their degree of popularity among the remaining spots in the space. The popularity of these attraction points (also known as spots) is computed from real-time user density measurements. To measure the user density in a space in real time, we select some spots on the map and then measure the frequency of association of the users with these spots (alternatively, the duration of association with these spots can be used instead of the frequency). The association of the users with a spot can be detected by monitoring and counting the number of visitors to this spot. Another way of monitoring the association of these users with the spot is to monitor their mobile nodes’ association with the network available at this spot.

The collected measures are then used in computing the popularity of these spots (either based on the frequency of association with these spots or based on the duration of association with these spots). The mobile nodes’ popularity values are then computed based on the popularity of the attraction point (the spot) these nodes are currently located in. That is, the node’s popularity becomes a function of the spot(s) popularity.
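As a rough sketch of the measurement step described above, the following Python snippet derives spot popularity values from a log of observed associations. The normalization by the busiest spot is an assumption made here so that values fall in (0, 1]; the text does not specify the normalization, and the function names are hypothetical.

```python
from collections import Counter


def spot_popularity(association_log):
    """Map each spot to a popularity value in (0, 1].

    association_log: iterable of spot identifiers, one entry per detected
    association of a mobile node with that spot.
    Assumption: counts are normalized by the count of the busiest spot.
    """
    counts = Counter(association_log)
    if not counts:
        return {}
    busiest = max(counts.values())
    return {spot: count / busiest for spot, count in counts.items()}


def rank_spots(popularity):
    """Spots ordered from most to least popular."""
    return sorted(popularity, key=popularity.get, reverse=True)
```

The same sketch applies to duration-based popularity if each log entry is replaced by an accumulated association duration per spot.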

In our experiments, we ran content forwarding simulations based on the original computation of Space Syntax metrics and another set of simulations based on the newly proposed way of Space Syntax metric computation. We then compared the performance of content forwarding in opportunistic networks based on both approaches.

We experimented with several formulas that represent the mobile node’s popularity, which is then integrated into its rank for forwarding decision making. The opportunistic forwarding performance of these alternative formulas is compared, taking the Epidemic algorithm as the benchmark opportunistic forwarding approach in these experiments.

We examined six alternative formulas for computing the mobile node popularity value, which are defined in detail as follows:

6.2.1 MostPopAP

As the mobile node traverses across predefined spots, it gains its popularity value from the current spot it is located in as long as this new popularity value is higher than the mobile node’s already attained value. Accordingly, as it traverses across more popular spots it gains a higher popularity value.

If the mobile node is currently located close to n spots, it gains its popularity value from the spot with the highest popularity value among these close spots. Thus, the popularity value of the mobile node is computed as follows:

\[
Pop_{new}(mobile) = \max\big(Pop_{old}(mobile),\ Pop(1),\ Pop(2),\ \ldots,\ Pop(n)\big) \quad \forall\ SetOfCloseSpots = \{1, 2, \ldots, n\} \tag{6.1}
\]

where $Pop_{new}$ is the new popularity value of the mobile node, $Pop_{old}$ is the old popularity value of the mobile node, $Pop(n)$ is the popularity value of spot $n$, and $SetOfCloseSpots$ is the set of spots that are currently within proximity to the mobile node.

That is, if the mobile node is currently located in proximity to n spots, it gains the popularity value of spot x, the most popular of these n spots, provided that this value is higher than the popularity value the mobile node has already acquired. For instance, suppose the popularity value of the mobile node is Pop(mobile) = 0.5, and three spots with popularity values 0.3, 0.6, and 0.8 are currently in proximity to it. Since the most popular spot, whose popularity value is 0.8, has a higher popularity value than the mobile node’s 0.5, the new popularity value of the mobile node becomes 0.8.
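The MostPopAP update of Equation 6.1 can be sketched in Python as follows. The function name and argument layout are hypothetical and are not taken from the simulator used in the experiments.

```python
def most_pop_ap(pop_old, close_spot_pops):
    """Equation 6.1: keep the highest popularity seen so far.

    pop_old: popularity value the mobile node has already attained.
    close_spot_pops: popularity values of the spots currently in proximity
    (may be empty, in which case the old value is kept).
    """
    return max([pop_old, *close_spot_pops])


# Worked example from the text: node at 0.5, close spots at 0.3, 0.6 and 0.8.
new_value = most_pop_ap(0.5, [0.3, 0.6, 0.8])  # 0.8
```

Because the old value is always among the candidates, the node's popularity never decreases under this version.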

6.2.2 MostFreqAP

The mobile node gains its popularity value from the popularity value of the most frequented spot associated with this mobile node across its mobility trace.

\[
\begin{aligned}
Pop_{new}(mobile) &= Pop(MostFreqSpot_{new}) \\
\text{where } MostFreqSpot_{new} &= Spot\big(\max(Freq(MostFreqSpot_{old}, mobile),\ Freq(1, mobile),\ Freq(2, mobile),\ \ldots,\ Freq(n, mobile))\big) \\
&\quad \forall\ SetOfCloseSpots = \{1, 2, \ldots, n\}
\end{aligned} \tag{6.2}
\]

where $Pop_{new}$ is the new popularity value of the mobile node, $MostFreqSpot_{new}$ is the newly detected spot whose frequency of co-location with the mobile node is the highest recorded frequency of co-location so far, $MostFreqSpot_{old}$ is the old detected spot whose frequency of co-location with the mobile node was the highest recorded frequency of co-location before detecting the new spot, $Freq(n, mobile)$ is the frequency of co-location between spot $n$ and the mobile node, $Spot(Freq(x, mobile))$ is the spot $x$ whose frequency of co-location with the mobile node is $Freq(x, mobile)$, and $SetOfCloseSpots$ is the set of spots that are currently within proximity to the mobile node.
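A minimal Python sketch of the MostFreqAP bookkeeping in Equation 6.2 is given below. The class name, the per-observation counting, and the tie-breaking rule (the previously selected spot is kept on ties) are assumptions made for illustration.

```python
from collections import Counter


class MostFreqTracker:
    """Tracks the spot most frequently co-located with one mobile node."""

    def __init__(self):
        self.freq = Counter()        # Freq(spot, mobile)
        self.most_freq_spot = None   # MostFreqSpot

    def observe(self, close_spots):
        """Record one sighting of each spot currently in proximity."""
        for spot in close_spots:
            self.freq[spot] += 1
            best = self.most_freq_spot
            # Assumption: on ties, the previously selected spot is kept.
            if best is None or self.freq[spot] > self.freq[best]:
                self.most_freq_spot = spot

    def popularity(self, spot_pop, default=0.0):
        """Pop_new(mobile) = Pop(MostFreqSpot_new)."""
        if self.most_freq_spot is None:
            return default
        return spot_pop[self.most_freq_spot]
```

Only the spots currently in proximity are compared against the previously selected spot, which mirrors the arguments of the Max function in Equation 6.2.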

6.2.3 MostdurationAP

The mobile node gains its popularity value from the popularity value of the spot at which the mobile node stays for the longest duration.

\[
\begin{aligned}
Pop_{new}(mobile) &= Pop(MostdurSpot_{new}) \\
\text{where } MostdurSpot_{new} &= Spot\big(\max(ContactDur(MostdurSpot_{old}, mobile),\ ContactDur(1, mobile),\ ContactDur(2, mobile),\ \ldots,\ ContactDur(n, mobile))\big) \\
&\quad \forall\ SetOfCloseSpots = \{1, 2, \ldots, n\}
\end{aligned} \tag{6.3}
\]

where $Pop_{new}$ is the new popularity value of the mobile node, $MostdurSpot_{new}$ is the newly detected spot with which the mobile node is co-located for the longest duration of time, $MostdurSpot_{old}$ is the old detected spot with which the mobile node is co-located for the longest duration of time so far, $ContactDur(n, mobile)$ is the longest recorded contact duration of co-location between spot $n$ and the mobile node, $Spot(ContactDur(x, mobile))$ is the spot $x$ whose contact duration of co-location with the mobile node is $ContactDur(x, mobile)$, and $SetOfCloseSpots$ is the set of spots that are currently within proximity to the mobile node.
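The selection step of Equation 6.3 can be sketched in Python as below. The function name, the dictionary of longest contact durations, and the rule that the previous spot wins ties are hypothetical choices for illustration.

```python
def most_dur_spot(longest_contact, close_spots, most_dur_spot_old=None):
    """Return MostdurSpot_new according to Equation 6.3.

    longest_contact: dict mapping spot -> longest recorded contact duration
    with the mobile node (for example, in seconds).
    close_spots: spots currently within proximity of the mobile node.
    most_dur_spot_old: previously selected spot, or None if there is none.
    """
    candidates = list(close_spots)
    if most_dur_spot_old is not None:
        # Listed first so that the old spot is kept when durations tie.
        candidates.insert(0, most_dur_spot_old)
    if not candidates:
        return None
    return max(candidates, key=lambda spot: longest_contact.get(spot, 0.0))
```

The node's popularity is then the popularity value of the returned spot, as stated in the first line of Equation 6.3.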

6.2.4 CloseAPs

The mobile node achieves its popularity value based on a computed function of the popularity values of the spots that are currently close to this mobile node. This node’s popularity value, Pop(mobile), is calculated as the average of the popularity values of the spots currently close to the mobile node as follows:

\[
Pop(mobile) = Average\big(Pop(1),\ Pop(2),\ \ldots,\ Pop(n)\big) \quad \forall\ SetOfCloseSpots = \{1, 2, \ldots, n\} \tag{6.4}
\]

Other functions such as the median can be applied when computing the mobile node’s popularity value.

If the mobile node gains its popularity value from the most popular spot among those spots in proximity, not from the average of the popularity values of the spots currently in proximity, then this version is called mostPopClsAP.

6.2.5 WeightedCloseAPs

The mobile node’s popularity value is computed based on a weighted average of the popularity values of all the current close spots. The weight of each spot is inversely proportional to its current distance from the mobile node, distance(spot, mobile), where the closest spot(s) have the highest weight among the rest. Thus, the mobile node’s popularity value is computed as follows:

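\[
\begin{aligned}
Pop(mobile) &= Average\big(A * Pop(1),\ B * Pop(2),\ \ldots,\ N * Pop(n)\big) \quad \forall\ SetOfCloseSpots = \{1, 2, \ldots, n\} \\
\text{where } A &= \frac{1}{distance(1, mobile)},\quad B = \frac{1}{distance(2, mobile)},\quad \ldots,\quad N = \frac{1}{distance(n, mobile)}
\end{aligned} \tag{6.5}
\]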

For instance, assume the mobile node is currently located 2 meters from spot 1, whose popularity value is 0.4, and 3 meters from spot 2, whose popularity value is 0.8. Thus, the popularity value of the mobile node will be the average of (1/2) ∗ 0.4 and (1/3) ∗ 0.8 (that is, of 0.2 and 0.267), which is 0.233.

If the algorithm only considers the spots close to the mobile node within a certain proximity range Prox, then this algorithm is called WeightedCloseAPsProx.

Another variation of this algorithm is the version that replaces the average function with a sum function, but still attenuates the popularity value of each spot by the distance between this spot and the mobile node. This version of the algorithm is referred to as divDist.
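The three variants described in this subsection (WeightedCloseAPs, WeightedCloseAPsProx, and divDist) can be sketched in Python as follows. The function names and the (popularity, distance) pair layout are hypothetical, and distances are assumed to be strictly positive so that the reciprocal weight is defined.

```python
def weighted_close_aps(spots, prox=None):
    """Equation 6.5: average of Pop(k) / distance(k, mobile).

    spots: list of (popularity, distance_in_meters) pairs, distance > 0.
    prox: optional proximity range; when given, spots farther away than
    prox are ignored (the WeightedCloseAPsProx variant).
    """
    if prox is not None:
        spots = [(pop, dist) for pop, dist in spots if dist <= prox]
    if not spots:
        return 0.0  # assumption: no close spot means zero popularity
    terms = [pop / dist for pop, dist in spots]
    return sum(terms) / len(terms)


def div_dist(spots):
    """divDist variant: sum instead of average of Pop(k) / distance(k, mobile)."""
    return sum(pop / dist for pop, dist in spots)


# Worked example from the text: 2 m from a 0.4 spot and 3 m from a 0.8 spot.
example = weighted_close_aps([(0.4, 2.0), (0.8, 3.0)])  # about 0.233
```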

6.2.6 WeightedCloseAPsPerX

From the simulation runs of these versions, which are illustrated in the results section of this chapter, we find that setting the weight attribute in the popularity function to the reciprocal of the distance between the mobile node and the spot is a quite aggressive attenuation of popularity, which leads to a low delivery ratio and a low f-measure. This is attributed to the fact that the popularity of the node is strongly affected as it moves away from the popular spots. For instance, the node’s popularity drops to half as the node moves from a distance of 1 meter to a distance of 2 meters. Thus, we propose that it is more practical to set this attenuation in phases, where the weight of the popularity attribute decreases for every X meters. This can be achieved by setting the weights A, B, ..., N in Equation 6.5 to be

\[
\begin{aligned}
A &= ProximityRange - Trunc\!\left(\frac{distance(1, mobile)}{X}\right), \\
B &= ProximityRange - Trunc\!\left(\frac{distance(2, mobile)}{X}\right), \\
&\ \,\vdots \\
N &= ProximityRange - Trunc\!\left(\frac{distance(n, mobile)}{X}\right)
\end{aligned} \tag{6.6}
\]

For instance, assume we set the ProximityRange to 30 meters and X to 5 meters. If the mobile node is located 3 meters from spot 1, whose popularity value is 0.4, and 7 meters from spot 2, whose popularity value is 0.6, then its popularity value is the average of (30 − Trunc(3/5)) ∗ 0.4 and (30 − Trunc(7/5)) ∗ 0.6 (that is, of 12 and 17.4), which is equal to 14.7.
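The stepped weighting of Equation 6.6 can be checked with a short Python sketch. The function names are hypothetical; the arithmetic follows the equation and the worked example above.

```python
import math


def weight_per_x(distance, proximity_range, x):
    """Equation 6.6: the weight drops by one for every full X meters of distance."""
    return proximity_range - math.trunc(distance / x)


def weighted_close_aps_per_x(spots, proximity_range, x):
    """Average of weight * popularity over the close spots.

    spots: list of (popularity, distance_in_meters) pairs.
    """
    if not spots:
        return 0.0  # assumption: no close spot means zero popularity
    terms = [weight_per_x(dist, proximity_range, x) * pop for pop, dist in spots]
    return sum(terms) / len(terms)


# Worked example from the text: ProximityRange = 30 m, X = 5 m.
example = weighted_close_aps_per_x([(0.4, 3.0), (0.6, 7.0)], 30, 5)  # about 14.7
```

Note that the resulting value is not confined to the range [0, 1], since the weights are of the order of the proximity range.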

6.3 Evaluation Metrics

We evaluate the performance of the above mentioned algorithms using the following categories of evaluation metrics:

Effectiveness: This metric is measured in terms of interest-based effectiveness, and f-measure.

Efficiency: This metric is measured in terms of delivery ratio, cost, and delay. Also, the cost versus delivery ratio is studied.

Power-Awareness: This metric is measured in terms of the percent of power consumed versus delivery ratio, and the utilization fairness, which is measured through the Fairness index.

These metrics are detailed in Section 4.2.5.

6.4 Methodology

To compute the Space Syntax metrics, the following steps are applied:

6.4.1 Extract Streets and Hot Spots Coordinates

The coordinates of the streets and the coordinates of the access points or the selected spots are extracted from the area map. The extracted coordinates are stored in a file to be used later when computing various Space Syntax metrics.

6.4.2 Construct the Axial Maps and the Space Syntax Connectivity Graph

The axial map is constructed by extracting the streets available on the map in a separate graph then giving each street a number. The axial map illustrates the layout of the streets on the map and shows the intersections among these streets. To generate a connectivity graph, the streets are represented as nodes while the intersection between each two streets is represented by an edge. The information of each street such as length, coordinates, and the streets intersecting with this street is stored in a file for later use in the computation of various Space Syntax metrics.
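The construction of the connectivity graph described above can be sketched in Python. This is only an illustration under simplifying assumptions: each street is represented by a single straight segment with hypothetical coordinates, and the function names are not taken from the tools used in this work.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def _within_box(p, q, r):
    """True if r lies inside the bounding box of segment pq."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(a, b):
    """True if segments a = (p1, q1) and b = (p2, q2) share at least one point."""
    (p1, q1), (p2, q2) = a, b
    o1, o2 = _orient(p1, q1, p2), _orient(p1, q1, q2)
    o3, o4 = _orient(p2, q2, p1), _orient(p2, q2, q1)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other segment.
    return ((o1 == 0 and _within_box(p1, q1, p2))
            or (o2 == 0 and _within_box(p1, q1, q2))
            or (o3 == 0 and _within_box(p2, q2, p1))
            or (o4 == 0 and _within_box(p2, q2, q1)))


def connectivity_graph(streets):
    """Streets become nodes; an edge links every pair of intersecting streets.

    streets: dict mapping street number -> ((x1, y1), (x2, y2)).
    Returns a dict mapping street number -> set of intersecting streets.
    """
    graph = {street_id: set() for street_id in streets}
    for (a, seg_a), (b, seg_b) in combinations(streets.items(), 2):
        if segments_intersect(seg_a, seg_b):
            graph[a].add(b)
            graph[b].add(a)
    return graph
```

In such a graph, the number of neighbors of a street node gives the number of streets that intersect it directly.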

6.4.3 Calculate Space Syntax Metrics

The basic Space Syntax metrics are computed for each street, hot spot and mobile device. The basic computed Space Syntax metrics include the Integration Value, the Attraction Value, the Location Index, the Connectivity Value, the Popularity Value, and the Popularity Index.

Space Syntax Metric Calculation for Streets and Spots

First, the above mentioned basic Space Syntax metrics are calculated for each and every street in the map. The Popularity Index for each predefined spot is computed based on the Popularity Index of the streets close to this spot.

Space Syntax Metric Calculation for Mobile Devices

For the mobile nodes, the Popularity Index and the newly proposed Space Syntax metrics are computed as per their definitions detailed in Section 6.2.

Abbreviation  | Algorithm | Defined in Section
Ep            | The Epidemic algorithm | 4.2.4
PIPeROp       | The Opportunistic version of PIPeR integrated with a Space Syntax metric | 5.1.3
mostPopClsAP  | The most popular current close spot | 6.2.4
mostPopColAP  | The most popular colocated spot along the node’s path | 6.2.1
wgtClsAPs     | Weighted average popularity of the current close spots | 6.2.5
wgtClsAPsProx | Average popularity of the current close spots weighted by distance from the node within a predefined proximity range | 6.2.5
wgtevery10    | Weighted average popularity of the current close spots every 10 meters within the proximity range | 6.2.6
PopIndex      | The Popularity Index Algorithm of the close spots | 3.2.1
PopIndexStr   | The Popularity Index Algorithm of the close streets | 3.2.1
IntVal        | Integration Value based Popularity | 3.2.1
IntOnly       | Interest-Only Selection Algorithm | 4.2.4
every10       | Popularity of the current close spot boosted within 10 meters ranges of the proximity range | 6.2.6
divDist       | The sum of the popularity of the current close spots attenuated by distance within the proximity range | 6.2.5
mostdurAP     | The popularity of the spot colocated for the longest duration with the node | 6.2.3
mostfreqAP    | The popularity of the spot most frequently colocated with the node | 6.2.2

Table 6.2: List of Algorithms simulated in the Space Syntax Experiments

6.5 Results and Analysis

This section displays the results of the Space Syntax based forwarding experiments. The proposed algorithms are compared in terms of performance against the opportunistic Epidemic algorithm (Ep), the Interest-Only algorithm (IntOnly), and the interest-and-power-aware social-opportunistic PIPeR algorithm integrated with a Space Syntax metric (PIPeROp).

Before analyzing the results, we explain the abbreviations of the examined versions that appear in the following figures. Table 6.2 lists the abbreviated algorithm names and provides their full names and the section where each algorithm is defined in detail.

6.5.1 Interest Awareness

Effectiveness

Effectiveness of the proposed solutions is studied in terms of the F-measure and the interest-based Effectiveness classification, which pinpoints the contacted ratio of interested forwarders and the contacted ratio of uninterested nodes.

The results shown in Figure 6.1 illustrate how forwarding based on the Integration Value metric achieves an F-measure of 0.7, while forwarding based on the Attraction Value metric achieves 0.64, and forwarding based on the Popularity Index achieves 0.48. The majority of the newly proposed versions achieve an F-measure of around 0.6. Note that the ClosestAP penalized every 10m versions achieve the highest F-measure (0.7 and 0.69). Furthermore, the integrated version of PI-SOFA and Space Syntax achieves an F-measure of 0.93, a high value indicating that their integration produces a boosted F-measure value.

It is worth noting that the highest contacted ratio of interested forwarders, 0.99, is achieved by the basic Space Syntax forwarding approach based on the Integration Value (IntVal), while the proposed weighted ClosestAP penalized every 10m (wgtevery10) version contacts 0.988 of the interested forwarders.

The lowest ratio of contacted uninterested nodes is 0.38, which is achieved by the WeightedAPs approach. It is worth mentioning that the integration of PI-SOFA approaches with the Space Syntax approach avoids contact with the majority of uninterested nodes, thus significantly reducing the ratio to 0.028, which is a great achievement.

(a) Interest-based Effectiveness (b) F-measure

Figure 6.1: Effectiveness of Space Syntax-based Forwarding Algorithms

Efficiency

The efficiency of the proposed solution is evaluated in terms of the delivery ratio, the paid cost, and the delay. The results are shown in Figure 6.2. It is worth noting that the Integration Value based forwarding algorithm and the WeightedAPs algorithm achieve the highest delivery ratios, 0.99 and 0.984 respectively.

From another aspect, the integrated version of PI-SOFA and Space Syntax successfully reduces the paid cost to 0.49 while achieving a 0.88 delivery ratio. From sub-figure 6.2c, it is noticeable that the PIPeROp and the Interest-Only algorithms are more cost effective.

From the delay sub-figure (see Figure 6.3a), it is apparent that the mostdurAP algorithm incurs the least delay in comparison to the other Space Syntax versions, followed by the Integration Value Algorithm.

However, when considering how much delay the algorithms incur for the sake of the achieved delivery ratio, as shown in Figure 6.3b, the Integration Value algorithm achieves the best combination (i.e. the steepest slope) of delivery ratio and delay among the Space Syntax versions. From this perspective, the Integration Value algorithm performs better than the mostdurAP algorithm in terms of Delivery Ratio versus Delay. Compared to the other Space Syntax versions, both the wgtClsAPsProx and the PopIndex algorithms incur the longest delay to achieve a medium delivery ratio.

(a) Delivery Ratio over Time (b) Cost over Time (c) Cost vs. Delivery Ratio

Figure 6.2: Efficiency of Space Syntax-based Forwarding Algorithms

6.5.2 Power Awareness

The power awareness of the proposed algorithms is evaluated in terms of both the percent of the consumed power and the utilization fairness. The results of power awareness are shown in Figure 6.4.

Power Consumption Awareness

It is noticeable from both the power consumption figure (Figure 6.4) and the performance comparison table (Table 6.3) that the integration of PI-SOFA and Space Syntax reduces the power consumption to 26.17% of all the nodes’ power, while the Space Syntax forwarding algorithm alone consumes 28.09% of all the nodes’ power and the PI-SOFA algorithm consumes 26.46% of all the nodes’ power. It is worth noting that the mostDurAP algorithm achieves the least power consumption among our proposed Space Syntax forwarding algorithms, as it only consumes 26.81% of all the nodes’ power, yet it achieves a very low delivery ratio when compared to the other proposed Space Syntax forwarding algorithms.

(a) Delay (b) Delay vs Delivery Ratio

Figure 6.3: Efficiency of Space Syntax-based Forwarding Algorithms - Cont.

Another striking observation is that the IntegrationValue algorithm consumes the highest amount of power, both through the exchange of control messages and through the other types of communication, such as WiFi scanning and forwarding and receiving content messages.

Fairness

When comparing the fairness index of all the proposed Space Syntax algorithms as shown in Figure 6.5, it is worth noting that the weightedAPs (every10m) algorithm maintains the highest level of utilization fairness, with a fairness index equal to 0.855. The wgtClsAPsProx algorithm maintains the highest mean battery level at the end of the simulation, while the wgtClsAPs algorithm attains the least variance (and, accordingly, the least standard deviation) among the battery levels, keeping the battery levels of the community close to each other and reducing the variation among them, thus achieving more fairness in utilizing these batteries.

(a) Power Consumption vs. Del. Ratio (b) 4-category Power Consumption (c) Control Power Consumption

Figure 6.4: Power Consumption of Space Syntax Forwarding

6.5.3 Normalized Performance Indices

From the normalized performance indices illustrated in Figures 6.6, 6.7 and 6.8, it is clear that the IntOnly and PIPeROp algorithms achieve the highest effectiveness and efficiency indices, with a high (but not the highest) power awareness index in comparison to the other illustrated algorithms.

Among the proposed Space Syntax based forwarding algorithms, the weighted Close Spots algorithm (wgtClsAPsProx) achieves the highest effectiveness and power awareness indices and a very high power awareness index which is comparable with the mostdurAP algorithm (the algorithm which relies on the popularity of the spot that is associated with the node for the longest duration), which achieves the highest efficiency index.

(a) Fairness (b) Mean and STD

Figure 6.5: Utilization Fairness of Space Syntax Forwarding

Figure 6.6: Effectiveness Performance Index of Space Syntax Forwarding

It is worth noting that the IntegrationValue algorithm achieves the lowest values in the three indices. This is attributed to the fact that it contacts almost all the uninterested nodes, consumes the highest amount of power, and incurs almost the highest cost. These indices indicate that the IntegrationValue algorithm, which is the basic Space Syntax based forwarding algorithm, is characterized by inefficiency, ineffectiveness and lack of power awareness. This is a strong reason that calls for more effective and efficient Space Syntax metrics on which the forwarding algorithms should rely instead of the Integration Value metric.

6.5.4 Analysis

Table 6.3 summarizes the performance of all the simulated Space Syntax algorithms and the simulated benchmark algorithms. Our proposed variations of Space Syntax-based forwarding algorithms show some improvement in f-measure above the f-measure of the Space Syntax forwarding algorithms that are based on the Attraction Value, Integration Value or the Popularity Index. However, the Integration Value based algorithm achieves a higher f-measure than our proposed algorithms. From another perspective, the PI-SOFA algorithm achieves a higher level of f-measure in comparison to all the Space Syntax-based algorithms, both the basic ones (namely, the Integration Value algorithm, the Attraction Value algorithm, and the Popularity Index algorithm) and the Space Syntax-based algorithms that we proposed in Section 6.2, while integrating it with Space Syntax shows an improvement in f-measure, ratio of contacted uninterested nodes, cost, power consumption, and utilization fairness. This improvement comes at the cost of a slight decrease in delivery ratio and in the ratio of contacted interested forwarders.

Figure 6.7: Efficiency Performance Index of Space Syntax Forwarding

Figure 6.8: Power Awareness Performance Index of Space Syntax Forwarding

A striking observation is that the Interest-Only algorithm shows better performance in all the metrics except that it consumes slightly more power, which is attributed to the energy exerted to contact more destination nodes and more interested forwarders.

Table 6.3: Performance Comparison among Representative algorithms of all the Studied Categories of Space Syntax based Opportunistic Forwarding Algorithms

Chapter 7

Dynamic Adaptation Proposed Implementations

The PI-SOFA framework implementations and the Space Syntax framework implementations seek interest awareness, power awareness, and space-popularity awareness, but they lack the ability to dynamically adapt the weights of the components that compose their ranking functions. Through the dynamic adaptive ranking approach, we present a ranking function that comprises all the features offered by both the PI-SOFA framework and the Space Syntax framework, and also provides a set of dynamically adaptive weights for all the components in order to be able to dynamically adapt to the current context.

As an application of the Dynamic Adaptive Ranking framework that is defined in Section 3.3, we have implemented a forwarding algorithm whose ranking function dynamically adapts to the current context. This function is aware of and takes into consideration the relative interest of the node in the forwarded content, the power capabilities of the node, the social popularity of the node, the activeness of the nodes, and, finally, the popularity of the place.

We have implemented two primary versions of this ranking function where one of them (DynAdp) only considers the dynamic adaptive ranking function when ranking candidates for forwarding decisions, while the other one (DynAdpOp) additionally considers the explicit interest-and-power opportunistic selection of forwarders. Two additional versions that integrate Space Syntax (DynAdpSpSyn and DynAdpSpSynOp) are then proposed.

7.1 DynAdp

The DynAdp algorithm is the basic version of the Dynamic Adaptive ranking function. It does not include the explicit opportunistic selection option and it also does not include the Space-Syntax awareness in ranking nodes. This version basically considers the following parameters in ranking nodes: the pure opportunistic component, the interest-and-power aware component, the social activeness component, and the node’s power component. This ranking function is formulated as follows:

\[
\begin{aligned}
DynAdpRank(i) ={}& w_{Opp} * Opportunistic + w_{IntPower} * CumInterest(i) * Power(i) \\
&+ w_{SocActive} * SocialRank(i) * Activeness(i) + w_{PowerAware} * Power(i)
\end{aligned} \tag{7.1}
\]

where

\[
\begin{aligned}
w_{Opp} &= 1 - (w_{IntPower} + w_{SocActive}) \\
w_{IntPower} &= SInt(i, msg) * bat(i) \\
w_{SocActive} &= expectedusage(i)
\end{aligned}
\]

Note that the opportunistic component is computed as follows:

\[
opp = SInt(i, msg) * remainingPower(i) \tag{7.2}
\]

where $remainingPower(i)$ is the remaining battery level of the node $i$ whose function is illustrated in Equation 7.4, and $SInt(i, msg)$ is the similarity in interest between the node $i$ and the forwarded message $msg$.

The social active component of a node is computed as follows:

\[
SocActive(i) = remainingPower(i) * normalCumInt(i, msg) * Activeness(i) * SocialRank(i) \tag{7.3}
\]

where $remainingPower(i)$ is the remaining battery level of the node $i$ whose function is illustrated in Equation 7.4, $normalCumInt(i, msg)$ is the normalized cumulative interest of node $i$ in the content $msg$, $Activeness(i)$ is the measure of activeness of the node $i$ and its formula is detailed in Equation 7.6, and $SocialRank(i)$ is the social rank of node $i$ which is detailed in Equation 7.10.

\[
remainingPower(i) = f\big(battery(i),\ expectedusage(i)\big) \tag{7.4}
\]

where $expectedusage(i)$ is the expected amount of power that will be consumed due to the usage profile of the node $i$, which indicates the average time consumed per activity. In addition, the power consumed due to network communications such as WiFi scanning, forwarding or receiving control messages is factored into the calculation of the expected usage. While we propose the above method, we acknowledge that there may be several methods of computing this component. For the simulation whose results are illustrated in this chapter, we have implemented the following version of the expected usage function:

\[
ExpectedUsage(i, t, msg) = KalmanFilter\big(bat(i),\ Avg(Count(Contacts(i))),\ TTL(msg),\ Consumption(WiFi(i),\ FWD(i),\ Rcp(i),\ idle(i))\big) \tag{7.5}
\]

This function predicts the expected amount of battery usage for a specified user, using the Kalman filter predictor, based on its current battery level bat(i), the expected power consumption due to future contacts as an extrapolation from the average count of contacts recorded so far Avg(Count(Contacts(i))), and also based on the expected power consumption due to WiFi connection WiFi(i) and forward or receive actions (FWD(i) and Rcp(i)) from now till the expiration time of the TTL of the specified message TTL(msg).

\[
Activeness(i) = \tfrac{1}{3} * normalizedContact(i, F(i)) + \tfrac{1}{3} * CDC(i) + \tfrac{1}{3} * Col(i, Dest) \tag{7.6}
\]

where $normalizedContact$ is the normalized count of contacts between node $i$ and its set of friends $F(i)$, $CDC(i)$ is the change of degree of connectivity of node $i$, and $Col(i, Dest)$ is the normalized count of co-location of node $i$ with the destination node.

\[
normalCumInt(i) = \frac{PInt(i) + \sum_{\forall v \in F(i)} PInt(v)}{|F(i)| + 1} \tag{7.7}
\]

where $normalCumInt(i)$ is the cumulative interest of the node, which is based on $PInt(i)$. $PInt(i)$ is the rewarded or penalized $SInt(i)$ when its value is compared to the predefined interest threshold.
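Equation 7.7 can be illustrated with a short Python sketch. The PInt values are taken as given inputs, since the reward and penalty rule is not restated here; the function name is hypothetical.

```python
def normal_cum_int(p_int_node, p_int_friends):
    """Equation 7.7: average PInt over the node itself and its friends F(i).

    p_int_node: PInt(i) of the node.
    p_int_friends: list of PInt(v) for every friend v in F(i).
    """
    return (p_int_node + sum(p_int_friends)) / (len(p_int_friends) + 1)


# Hypothetical values: the node scores 0.8 and its two friends 0.4 and 0.6.
example = normal_cum_int(0.8, [0.4, 0.6])  # about 0.6
```

Adding one to the denominator accounts for the node itself, so a node without friends simply keeps its own PInt value.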

In this context, the change in degree of connectivity is computed as the intersection of the count of nodes co-located with node i at time t and those co-located with node i at time t − 1, divided by the union of the count of those co-located with node i at time t and those co-located with node i at time t − 1. This is formulated as follows:

\[
CDC(i) = \frac{Count(col(m, i, t)) \cap Count(col(m, i, t - 1))}{Count(col(m, i, t)) \cup Count(col(m, i, t - 1))} \tag{7.8}
\]

\[
Col(i, Dest) = Normalize\big(\forall Dest \in DestinationSet(msg)\ \forall t\ Count(col, i, Dest)\big) \tag{7.9}
\]
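Equations 7.6 and 7.8 can be sketched in Python by treating the nodes co-located with node i at two consecutive time steps as sets, in which case the ratio in Equation 7.8 is the Jaccard similarity of the two sets. The function names and the handling of the empty case are assumptions.

```python
def cdc(neighbors_now, neighbors_prev):
    """Equation 7.8: shared co-located nodes divided by all co-located nodes.

    neighbors_now, neighbors_prev: sets of node identifiers co-located with
    node i at time t and at time t - 1.
    """
    union = neighbors_now | neighbors_prev
    if not union:
        return 0.0  # assumption: no neighbors at either time step
    return len(neighbors_now & neighbors_prev) / len(union)


def activeness(normalized_contact, cdc_value, col_dest):
    """Equation 7.6: equally weighted sum of the three activeness terms."""
    return (normalized_contact + cdc_value + col_dest) / 3.0


# Hypothetical neighborhoods sharing two out of four distinct nodes.
example_cdc = cdc({"a", "b", "c"}, {"b", "c", "d"})  # 0.5
```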

\[
SocialRank(i) = \sum_{\forall v \in F(i)} \frac{SocialRank(v)}{|F(v)|} \tag{7.10}
\]

Here we combine the interest-aware component and the power-aware component to obtain the interest-and-power-aware component, which is computed as follows:

\[
IntPowerComponent(i) = RewardPower(i) * normalCumInt(i) \tag{7.11}
\]

where

\[
RewardPower(i) = bat(i) \pm reward \tag{7.12}
\]

The full pseudocode of the algorithm is detailed in Algorithm 3.

Algorithm 3 Dynamic Adaptive Algorithm
Require: SInt(source, msg) = 0
 1: ∀ time t every n seconds
 2: while i is in contact with j do
 3:     SocialRank(i) = Avg(∀v ∈ F(i) SocialRank(v))
 4:     if j ∈ F(i) then
 5:         Update(SocialRank(i), SocialRank(j))
 6:     end if
 7:     normalizedContact(i, F(i)) = (Σ ∀v ∈ F(i) ContactCount(i, v)) / |F(i)|
 8:     CDC(i) = (|Neighbors(i, now)| ∩ |Neighbors(i, now−1)|) / (|Neighbors(i, now)| ∪ |Neighbors(i, now−1)|)
 9:     Col(i, Dest) = |ContactCount(i, d)| ∀d ∈ Dest
10:     Activeness(i) = ((1/3) ∗ normalizedContact(i, F(i))) + ((1/3) ∗ CDC(i)) + ((1/3) ∗ Col(i, Dest))
11:     remainingPower(i) = f(bat(i), expectedusage(i))
12:     normalCumInt(i, msg) = (PInt(i, msg) + Σ ∀v ∈ F(i) PInt(v, msg)) / (|F(i)| + 1)
13:     SocActive(i) = remainingPower(i) ∗ normalCumInt(i, msg) ∗ Activeness(i) ∗ SocialRank(i)
14:     wSocActive = expectedusage(i)
15:     RewardPower(i) = bat(i) ± reward
16:     IntPowerComponent(i) = RewardPower(i) ∗ normalCumInt(i)
17:     wIntPower = SInt(i, msg) ∗ bat(i)
18:     wOpp = 1 − (wIntPower + wSocActive)
19:     Opportunistic(i) = SInt(i, msg) ∗ remainingPower(i)
20:     DynAdpRank(i) = (wIntPower ∗ IntPowerComponent(i)) + (wSocActive ∗ SocActive(i)) + (wOpp ∗ Opportunistic(i))
21:     if DynAdpRank(j) ≥ DynAdpRank(i) AND TTL(msg) > 0 then
22:         Forward(msg, j)
23:     end if
24: end while
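The ranking and forwarding steps at the end of Algorithm 3 can be sketched in Python as follows. The NodeState container, its field names, and the assumption that every input has already been computed and normalized to the range [0, 1] are hypothetical; the sketch follows the rank expression of Algorithm 3 rather than the additional power-aware term that appears in Equation 7.1.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Precomputed inputs for one node (all assumed to lie in [0, 1])."""
    s_int: float            # SInt(i, msg)
    bat: float              # bat(i)
    remaining_power: float  # remainingPower(i)
    expected_usage: float   # expectedusage(i)
    normal_cum_int: float   # normalCumInt(i, msg)
    activeness: float       # Activeness(i)
    social_rank: float      # SocialRank(i)
    reward_power: float     # RewardPower(i) = bat(i) +/- reward


def dyn_adp_rank(n):
    """DynAdpRank(i) as the weighted sum of the three components."""
    w_int_power = n.s_int * n.bat
    w_soc_active = n.expected_usage
    w_opp = 1.0 - (w_int_power + w_soc_active)

    int_power = n.reward_power * n.normal_cum_int
    soc_active = (n.remaining_power * n.normal_cum_int
                  * n.activeness * n.social_rank)
    opportunistic = n.s_int * n.remaining_power

    return (w_int_power * int_power
            + w_soc_active * soc_active
            + w_opp * opportunistic)


def should_forward(carrier, candidate, ttl):
    """Forward to the candidate j if it ranks at least as high and TTL > 0."""
    return ttl > 0 and dyn_adp_rank(candidate) >= dyn_adp_rank(carrier)
```

Since the opportunistic weight is the remainder of the other two weights, the three weights sum to one by construction.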

7.2 DynAdpOp

This version includes the option of selecting the next forwarder node if both its SInt(i, msg) and its remaining power bat(i) exceed their respective predefined thresholds. This selection criterion is added to the selection criteria defined in the DynAdp algorithm detailed in Section 7.1.

Accordingly a node is selected as the next forwarder if it satisfies the following conditions:

\[
DynAdpRank(j) \geq DynAdpRank(i) \ \text{ AND } \ SInt(j) \geq Thr_{int} \ \text{ AND } \ bat(j) \geq Thr_{bat} \tag{7.13}
\]
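The selection condition of Equation 7.13 amounts to a conjunction of three tests, as in the following Python sketch. The function name is hypothetical, and the threshold values in the usage line are illustrative only since the thresholds used in the simulations are not restated here.

```python
def dyn_adp_op_select(rank_i, rank_j, s_int_j, bat_j, thr_int, thr_bat):
    """Equation 7.13: j is selected only if all three conditions hold.

    rank_i, rank_j: DynAdpRank of the current carrier i and candidate j.
    s_int_j, bat_j: interest similarity and battery level of candidate j.
    thr_int, thr_bat: predefined interest and battery thresholds.
    """
    return rank_j >= rank_i and s_int_j >= thr_int and bat_j >= thr_bat


# Illustrative values only: a higher-ranked candidate with a low battery is rejected.
selected = dyn_adp_op_select(0.30, 0.45, 0.7, 0.1, thr_int=0.5, thr_bat=0.2)  # False
```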

7.3 DynAdpSpSyn

This version introduces space popularity to the DynAdp algorithm as a factor in ranking nodes for the

purpose of selection. The ranking function is formulated as follows:

DynAdpSpSyn(i) = wOpp ∗ Opportunistic + wIntP ower ∗ CumInterest(i) ∗ P ower(i)+

wSocActive ∗ SocialRank(i) ∗ Activeness(i)+

wP owerAware ∗ P ower(i) + wP laceP opularity ∗ P laceP opularity(i) (7.14)

where wOpp = 1 − (wIntPower + wSocActive + wPlacePopularity) and wPlacePopularity = Pop(Location(i, now)).

The place popularity of a node is computed as the popularity of the location in which the node is most frequently detected, and is expressed in the following equation:

PlacePopularity(i) = Pop(FreqLocation(i))    (7.15)
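As an illustration of Equation 7.15, the following minimal sketch derives the most frequently visited location from a node's location samples and looks up its popularity. The popularity table pop is assumed to be precomputed elsewhere (how Pop is obtained is not described in this section), and the location names and popularity values are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def freq_location(visits: Sequence[Hashable]) -> Hashable:
    """Return the location in which a node was detected most often.

    On ties, Counter.most_common keeps the location that was seen first.
    """
    if not visits:
        raise ValueError("no location samples for this node")
    return Counter(visits).most_common(1)[0][0]


def place_popularity(visits: Sequence[Hashable], pop: Dict[Hashable, float]) -> float:
    """PlacePopularity(i) = Pop(FreqLocation(i))."""
    return pop[freq_location(visits)]


# Hypothetical samples and popularity values.
visits = ["library", "cafe", "library", "lab", "library"]
pop = {"library": 0.7, "cafe": 0.4, "lab": 0.2}
popularity = place_popularity(visits, pop)  # popularity of "library"
```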

The full pseudocode of this version of the algorithm is detailed in Algorithm 4.

Algorithm 4 Dynamic Adaptive Algorithm
Require: SInt(source, msg) = 0
 1: ∀ time t every n seconds
 2: while i is in contact with j do
 3:   SocialRank(i) = Avg(∀v ∈ F(i) SocialRank(v))
 4:   if j ∈ F(i) then
 5:     Update(SocialRank(i), SocialRank(j))
 6:   end if
 7:   normalizedContact(i, F(i)) = (Σ ∀v ∈ F(i) ContactCount(i, v)) / |F(i)|
 8:   CDC(i) = (|Neighbors(i, now)| ∪ |Neighbors(i, now−1)|) / (|Neighbors(i, now)| ∩ |Neighbors(i, now−1)|)
 9:   Col(i, Dest) = |ContactCount(i, d)| ∀d ∈ Dest
10:   Activeness(i) = ((1/3) ∗ normalizedContact(i, F(i))) + ((1/3) ∗ CDC(i)) + ((1/3) ∗ Col(i, Dest))
11:   remainingPower(i) = f(bat(i), expectedusage(i))
12:   normalCumInt(u, msg) = (PInt(u, msg) + ∀v ∈ F(i) PInt(v, msg)) / (|F(i)| + 1)
13:   SocActive(i) = remainingPower(i) ∗ normalCumInt(u, msg) ∗ Activeness(i) ∗ SocialRank(i)
14:   wSocActive = expectedusage(i)
15:   RewardPower(i) = bat(i) ± reward
16:   IntPowercomponent(i) = RewardPower(i) ∗ normalCumInt(i)
17:   wIntPower = SInt(i, msg) ∗ bat(i)
18:   wOpp = 1 − (wIntPower + wSocActive + wPlacePopularity)
19:   Opportunistic(i) = SInt(i, msg) ∗ remainingPower(i)
20:   PlacePopularity(i) = f(Pop(FreqLocation(i)))
21:   wPlacePopularity = Pop(Location(i, now))
22:   DynAdpSpSyn(i) = (wIntPower ∗ IntPowercomponent(i)) + (wSocActive ∗ SocActive(i)) + (wPlacePopularity ∗ PlacePopularity(i)) + (wOpp ∗ Opportunistic(i))
23:   if DynAdpSpSyn(j) ≥ DynAdpSpSyn(i) AND TTL(msg) > 0 then
24:     Forward(msg, j)
25:   end if
26: end while

7.4 DynAdpOpSpSyn

This version selects a node as the next forwarder only if its SInt(i, msg) exceeds a predefined interest threshold and its remaining power bat(i) exceeds a predefined battery threshold. This selection criterion is added to the selection criteria defined in the DynAdpSpSyn algorithm detailed in Section 7.3.

Accordingly, a node is selected as the next forwarder if it satisfies the following conditions:

DynAdpSpSyn(j) ≥ DynAdpSpSyn(i) AND SInt(j) ≥ Thr_int AND bat(j) ≥ Thr_bat    (7.16)

7.5 Evaluation Metrics

To evaluate the performance of the proposed dynamic adaptive algorithms, we apply three categories of evaluation metrics: Effectiveness, Efficiency and Power-Awareness. The metrics applied in each of these categories are detailed in Section 4.2.5.

After that, a multiple linear regression analysis is conducted on each of the main evaluation metrics to model the relationship between these metrics and a set of affecting parameters. With such a model, any message source can expect a certain performance based on the properties of the available mobile nodes in the place.

7.6 Results

7.6.1 Interest Awareness

(a) Interest-based Effectiveness (b) F-measure

Figure 7.1: Effectiveness of the Dynamic Adaptive Algorithms - AUC Dataset

Effectiveness

From the Interest-based effectiveness sub-figure 7.1a, it is noticeable that introducing Space Syntax within the ranking function substantially reduces contact with uninterested nodes. It is also noticeable that integrating the interest-threshold opportunistic selection into the ranking function enables contact with a higher ratio of interested forwarder nodes.

From the F-measure sub-figure 7.1b, it is clear that integrating Space Syntax accompanied by the interest-threshold opportunistic selection into the ranking function increases the F-measure by 0.1. However, the F-measure of the non-opportunistic version is much lower than that of the opportunistic version because it contacts a lower ratio of interested forwarders. The opportunistic version overcomes this defect and explicitly selects the interested forwarders.

(a) Cost and Delivery Ratio (b) Delay

Figure 7.2: Efficiency of the Dynamic Adaptive Algorithms - AUC Dataset

Efficiency

From the Cost sub-figure 7.2a, it is clear that integrating Space Syntax in the ranking function reduces cost by 0.13 when compared to the corresponding Space Syntax-oblivious versions. It is also worth noting that this integration is cost effective since it requires less cost per unit delivery ratio, and also requires less cost per unit delivery and contacted interested forwarder ratio. Also, note that the non-opportunistic Space Syntax version incurs less cost as it contacts fewer forwarders.

From the Delay sub-figure 7.2b, it is noticeable that the opportunistic threshold versions successfully shorten the delay by one- to two-fold when compared to their corresponding non-opportunistic versions. It is also worth noting that the opportunistic Space Syntax based Dynamic Adaptive ranking algorithm (DynAdpOpSpSyn) incurs a reasonable delay in order to achieve the highest delivery ratio and the highest ratio of contacted interested forwarders out of the four compared versions. Overall, the non-Space Syntax Opportunistic version (DynAdpOp) incurs the least delay among the four compared versions.

Figure 7.3: Power Consumption of the Dynamic Adaptive Algorithms - AUC Dataset

7.6.2 Power Awareness

Power Consumption Awareness

From the Power consumption figure (Figure 7.3), it is clear that integrating Space Syntax in the ranking function reduces the percentage of consumed power by 0.46% when compared to the corresponding Space Syntax-oblivious versions. It is worth noting that it also reduces the power consumption per unit delivery ratio and contacted interested forwarder ratio by 0.06% relative to the Space Syntax-oblivious versions. However, the non-opportunistic Space Syntax version incurs more power per unit delivery ratio and per unit interested forwarders contacted ratio than all the other versions.

Figure 7.4: Utilization Fairness of the Dynamic Adaptive Algorithms - AUC Dataset

Fairness

From the Fairness figure (Figure 7.4), it is clear that integrating Space Syntax in the ranking function when accompanied by interest-based opportunistic selection slightly improves the level of utilization fairness. However, the non-opportunistic Space Syntax version achieves a slightly lower level of fairness.

Overhead of Exchanged Control Messages

In order to measure the overhead each proposed Dynamic Adaptive Ranking algorithm poses in terms of extra control messages that are exchanged among nodes, this subsection illustrates the number of control messages exchanged within each algorithm during the forwarding process. This section also presents the percent of power consumed due to forwarding these control messages among nodes.

Figure 7.5: Effectiveness Performance Index of the Dynamic Adaptive Algorithms - AUC Dataset

Figure 7.6: Efficiency Performance Index of the Dynamic Adaptive Algorithms - AUC Dataset

Figure 7.8 illustrates the number of control messages exchanged among nodes by applying each of the proposed Dynamic Adaptive Ranking algorithms within the simulation, while Figure 7.9 illustrates the percent of power consumed due to the control messages forwarded by these algorithms. From both figures, it is clear that the DynAdpOpSpSyn algorithm exchanges the highest number of control messages, while its non-opportunistic version exchanges the least number of control messages and consumes the least power in forwarding them. This observation is attributed to the same relative performance observed among the four versions of Dynamic Adaptive ranking in terms of effectiveness, efficiency, cost, power consumption and utilization fairness.

7.6.3 Normalized Performance Indices

The Effectiveness Performance Index illustrated in Figure 7.5 indicates that the Space Syntax based Opportunistic version of the dynamic adaptive algorithm (DynAdpOpSpSyn) successfully achieves the highest effectiveness among the four proposed versions. Also, as confirmed by the previous effectiveness metrics, the opportunistic versions (DynAdpOpSpSyn and DynAdpOp) achieve a higher degree of effectiveness than the non-opportunistic ones.

On the level of efficiency, the Efficiency Performance Index in Figure 7.6 surprisingly shows that the non-opportunistic Space Syntax based dynamic adaptive algorithm (DynAdpSpSyn) maintains the highest degree of efficiency. This is attributed to its cost effectiveness, as it pays the least cost among the four versions to achieve a quite comparable delivery ratio.

Figure 7.7: PowerAwareness Performance Index of the Dynamic Adaptive Algorithms - AUC Dataset

Figure 7.8: Number of Produced Control Messages of the Dynamic Adaptive Algorithms - AUC Dataset

Finally, the results of comparing the four algorithms’ performance in terms of power awareness using the Power Awareness Performance Index are illustrated in Figure 7.7. From this figure, it is deduced that integrating Space Syntax in the ranking function improves the power awareness of the algorithms.

It is also observable that the DynAdpSpSyn algorithm achieves the highest power awareness index value among the four versions as it consumes the least amount of power among these algorithms due to the least paid cost in terms of number of message copies that are forwarded.

7.7 Statistical Analysis

7.7.1 Statistical Significance of the Dynamic Adaptive Algorithm

To examine the statistical significance of the performance of the proposed versions in comparison to the original algorithms, we apply a non-parametric test for two independent samples. Note that we did not apply the independent two-sample t-test since the set of samples is not large enough for t-tests and its distribution is not known beforehand. Alternatively, the non-parametric Mann-Whitney test for independent samples suits a small number of samples with unknown distributions.

Figure 7.9: Percent of Consumed Power for the forwarded Control Messages of the Dynamic Adaptive Algorithms - AUC Dataset

DynAdp Versions Statistical Significance Analysis

The statistical analysis presented in this subsection demonstrates the statistical significance of the DynAdpSpSyn algorithm in comparison to the original DynAdp algorithm to show the improvement in performance of the Dynamic Adaptive Ranking algorithm when it is integrated with Space Syntax awareness.

The non-parametric test for two independent samples, based on the Mann-Whitney test, produces the results displayed in Table 7.1 and in Table 7.2. The results show a statistically significant difference (p < 0.05) in performance between the DynAdp algorithm and the DynAdpSpSyn algorithm across all metrics. Accordingly, the mean ranks table, Table 7.1, shows that the DynAdp algorithm achieves a higher mean rank than that of DynAdpSpSyn in the following metrics: F-measure, ratio of contacted interested forwarders, the delivery ratio, and also in maintaining a higher level of utilization fairness. However, DynAdpSpSyn excels in contacting a significantly lower ratio of uninterested nodes, and also in significantly reducing the cost paid to forward content and the amount of power consumed in this process.
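The link between the rank sums in Table 7.1 and the test statistics in Table 7.2 can be checked with a short worked example. The sketch below uses the standard relation U = R − n(n + 1)/2 and the normal approximation of U without a tie correction; the function names are illustrative, and the omission of the tie correction is an assumption that explains small differences in the third decimal of Z.

```python
import math


def mann_whitney_u(rank_sum: float, n: int) -> float:
    """U statistic of one group from its rank sum R and sample size n."""
    return rank_sum - n * (n + 1) / 2.0


def z_approx(u: float, n1: int, n2: int) -> float:
    """Normal approximation of U, without tie correction."""
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean_u) / sd_u


# Fmeasure row of Table 7.1: DynAdpSpSyn has N = 14 and a rank sum of 115.
u_fmeasure = mann_whitney_u(115.0, 14)   # 10.0, matching Table 7.2
# IntFWD Ratio row: DynAdpSpSyn has N = 14 and a rank sum of 105.
u_intfwd = mann_whitney_u(105.0, 14)     # 0.0, matching Table 7.2

z_fmeasure = z_approx(u_fmeasure, 14, 10)
z_intfwd = z_approx(u_intfwd, 14, 10)
```

The resulting Z values are approximately −3.51 and −4.10, in line with the −3.514 and −4.099 reported in Table 7.2. A U of 0 means that every DynAdpSpSyn sample ranks below every DynAdp sample for that metric.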

DynAdpOp Versions Statistical Significance Analysis

The statistical analysis presented in this subsection demonstrates the statistical significance of the DynAdpOpSpSyn algorithm in comparison to the original DynAdpOp algorithm to show the improvement in performance of the opportunistic version of the Dynamic Adaptive Ranking algorithm when it is integrated with Space Syntax awareness.

Metric                Alg.              N  Mean Rank  Sum of Ranks
Fmeasure              DynAdpSpSyn      14       8.21        115.00
                      DynAdp           10      18.50        185.00
                      Total            24
IntFWD Ratio          DynAdpSpSyn      14       7.50        105.00
                      DynAdp           10      19.50        195.00
                      Total            24
UnIntFWD Ratio        DynAdpSpSyn      14       7.50        105.00
                      DynAdp           10      19.50        195.00
                      Total            24
Delivery Ratio        DynAdpSpSyn      14       8.57        120.00
                      DynAdp           10      18.00        180.00
                      Total            24
Cost                  DynAdpSpSyn      14       7.50        105.00
                      DynAdp           10      19.50        195.00
                      Total            24
Power-awareness       DynAdpSpSyn      14       8.43        118.00
                      DynAdp           10      18.20        182.00
                      Total            24
Utilization Fairness  DynAdpSpSyn      14       9.86        138.00
                      DynAdp           10      16.20        162.00
                      Total            24

Table 7.1: Ranks of DynAdp and DynAdpSpSyn

                                 Fmeasure    IntFWD  UnIntFWD  Delivery      Cost      Power  Utilization
                                              Ratio     Ratio     Ratio             awareness     Fairness
Mann-Whitney U                     10.000     0.000     0.000    15.000     0.000     13.000       33.000
Wilcoxon W                        115.000   105.000   105.000   120.000   105.000    118.000      138.000
Z                                  -3.514    -4.099    -4.101    -3.222    -4.100     -3.338       -2.167
Asymp. Sig. (2-tailed)              0.000     0.000     0.000     0.001     0.000      0.001        0.030
Exact Sig. [2*(1-tailed Sig.)]      0.000b    0.000b    0.000b    0.001b    0.000b     0.000b       0.031b
Exact Sig. (2-tailed)               0.000     0.000     0.000     0.001     0.000      0.000        0.029
Exact Sig. (1-tailed)               0.000     0.000     0.000     0.000     0.000      0.000        0.015
Point Probability                   0.000     0.000     0.000     0.000     0.000      0.000        0.001
b Not corrected for ties.

Table 7.2: Statistical Significance of DynAdp and DynAdpSpSyn

Metric                Alg.              N  Mean Rank  Sum of Ranks
Fmeasure              DynAdpOp         10       7.30         73.00
                      DynAdpOpSpSyn    22      20.68        455.00
                      Total            32
IntFWD Ratio          DynAdpOp         10      18.25        182.50
                      DynAdpOpSpSyn    22      15.70        345.50
                      Total            32
UnIntFWD Ratio        DynAdpOp         10      27.50        275.00
                      DynAdpOpSpSyn    22      11.50        253.00
                      Total            32
Delivery Ratio        DynAdpOp         10      18.80        188.00
                      DynAdpOpSpSyn    22      15.45        340.00
                      Total            32
Cost                  DynAdpOp         10      27.50        275.00
                      DynAdpOpSpSyn    22      11.50        253.00
                      Total            32
Power-awareness       DynAdpOp         10      25.30        253.00
                      DynAdpOpSpSyn    22      12.50        275.00
                      Total            32
Utilization Fairness  DynAdpOp         10      15.90        159.00
                      DynAdpOpSpSyn    22      16.77        369.00
                      Total            32

Table 7.3: Ranks of DynAdpOp and DynAdpOpSpSyn

                                 Fmeasure    IntFWD  UnIntFWD  Delivery      Cost      Power  Utilization
                                              Ratio     Ratio     Ratio             awareness     Fairness
Mann-Whitney U                     18.000    92.500     0.000    87.000     0.000     22.000      104.000
Wilcoxon W                         73.000   345.500   253.000   340.000   253.000    275.000      159.000
Z                                  -3.740    -0.712    -4.473    -0.939    -4.472     -3.578       -0.244
Asymp. Sig. (2-tailed)              0.000     0.476     0.000     0.348     0.000      0.000        0.807
Exact Sig. [2*(1-tailed Sig.)]      0.000b    0.483b    0.000b    0.366b    0.000b     0.000b       0.826b
Exact Sig. (2-tailed)               0.000     0.489     0.000     0.359     0.000      0.000        0.826
Exact Sig. (1-tailed)               0.000     0.245     0.000     0.180     0.000      0.000        0.413
Point Probability                   0.000     0.006     0.000     0.005     0.000      0.000        0.016
b Not corrected for ties.

Table 7.4: Statistical Significance of DynAdpOp and DynAdpOpSpSyn

The non-parametric test for two independent samples, based on the Mann-Whitney test, produces the results displayed in Table 7.3 and in Table 7.4. The results show a statistically significant difference (p < 0.001) in performance between the DynAdpOp algorithm and the DynAdpOpSpSyn algorithm in the following metrics: F-measure, ratio of contacted uninterested nodes, cost and power consumption. Accordingly, the mean ranks table, Table 7.3, shows that the DynAdpOpSpSyn algorithm achieves a higher mean rank than that of DynAdpOp in F-measure. In addition, DynAdpOpSpSyn excels in contacting a significantly lower ratio of uninterested nodes, and in significantly reducing the cost paid to forward content and the amount of power consumed in this process.

Metric                Alg.              N  Mean Rank  Sum of Ranks
Fmeasure              PIPeROp          28      16.00        448.00
                      DynAdpOpSpSyn    22      37.59        827.00
                      Total            50
IntFWD Ratio          PIPeROp          28      19.02        532.50
                      DynAdpOpSpSyn    22      33.75        742.50
                      Total            50
UnIntFWD Ratio        PIPeROp          28      35.00        980.00
                      DynAdpOpSpSyn    22      13.41        295.00
                      Total            50
Delivery Ratio        PIPeROp          28      18.46        517.00
                      DynAdpOpSpSyn    22      34.45        758.00
                      Total            50
Cost                  PIPeROp          28      29.05        813.50
                      DynAdpOpSpSyn    22      20.98        461.50
                      Total            50
Power-awareness       PIPeROp          28      27.82        779.00
                      DynAdpOpSpSyn    22      22.55        496.00
                      Total            50
Utilization Fairness  PIPeROp          28      22.36        626.00
                      DynAdpOpSpSyn    22      29.50        649.00
                      Total            50

Table 7.5: Ranks of PIPeROp and DynAdpOpSpSyn

                                 Fmeasure    IntFWD  UnIntFWD  Delivery      Cost      Power  Utilization
                                              Ratio     Ratio     Ratio             awareness     Fairness
Mann-Whitney U                     42.000   126.500    42.000   111.000   208.500    243.000      220.000
Wilcoxon W                        448.000   532.500   295.000   517.000   461.500    496.000      626.000
Z                                  -5.199    -3.548    -5.199    -3.851    -1.945     -1.270       -1.720
Asymp. Sig. (2-tailed)              0.000     0.000     0.000     0.000     0.052      0.204        0.085
Exact Sig. (2-tailed)               0.000     0.000     0.000     0.000     0.052      0.209        0.087
Exact Sig. (1-tailed)               0.000     0.000     0.000     0.000     0.026      0.105        0.044
Point Probability                   0.000     0.000     0.000     0.000     0.001      0.004        0.002

Table 7.6: Statistical Significance of PIPeROp and DynAdpOpSpSyn

DynAdpOpSpSyn versus PIPeROp Statistical Significance Analysis

The statistical analysis presented in this subsection demonstrates the statistical significance of the DynAdpOpSpSyn algorithm in comparison to the PIPeROp algorithm, illustrating the improvement in performance of the DynAdpOpSpSyn algorithm over that of the PIPeROp algorithm.

The non-parametric test for two independent samples, based on the Mann-Whitney test, produces the results displayed in Table 7.5 and in Table 7.6. The results show a statistically significant difference (p < 0.001) in performance between the PIPeROp algorithm and the DynAdpOpSpSyn algorithm in the following metrics: F-measure, ratio of contacted interested nodes, ratio of contacted uninterested nodes, and delivery ratio. Accordingly, the mean ranks table, Table 7.5, shows that the DynAdpOpSpSyn algorithm achieves a higher mean rank than that of PIPeROp in F-measure, the delivery ratio and the contacted ratio of interested forwarders. In addition, DynAdpOpSpSyn excels in contacting a significantly lower ratio of uninterested nodes.

7.7.2 Regression Analysis of the Performance Metrics of the Dynamic Adaptive Ranking Algorithm

This section analyzes the relationship between each of the performance metrics of the Dynamic Adaptive Ranking Algorithm and the main factors that compose its ranking function. From this analysis, the section presents the fitted regression model of each one of these metrics.

Structure of the Imported Data

The results of the simulation runs are imported into SPSS and R, two statistical packages, to study the regression analysis of each of the following dependent variables: delivery ratio (DeliveryRatio), the interested forwarder ratio (IntFWDRatio), the uninterested forwarder ratio (UnIntFWDRatio), the cost (Cost), the power consumption (ConsumedPower), the f-measure (Fmeasure), and the utilization fairness (Fairness). These dependent variables are analyzed against the following independent variables: the Similarity interest of the node to the message (SimilarityInterest), the remaining power of the node (Battery), the average contact duration of the node with other nodes since the beginning of the simulation (ContactsDurSoFar), the average contact duration of the currently contacted nodes with this node (AvgContDurNow), the popularity of the current location of the node (PlacePopHere), the popularity of the most frequently visited spot by the node (PlacePopFreq), the change of degree of connectivity of the node (ucdc), the measure of co-location of the node with destination nodes (ucol), the average dynamic rank of the node's friends (AvgFriendsRank), and the normalized contact duration with the node's friends (normalContactFr).

The structure of the imported data files consists of the ten parameters of the 50 nodes (i.e. 10 x 50 = 500 parameters are listed per row) that have been simulated for 3600 seconds (each second represents a row). The simulation run simulates 100 messages being forwarded; thus for each message there is a separate row per second. This makes 3600 x 100 rows in the data file where each row contains 500 parameters in addition to the dependent variables.

Pre-Analysis Data Manipulation

Before analyzing the data, the 500 parameters need to be summarized in ten main parameters that represent the distribution of the nodes according to each parameter. For instance, the 50 battery levels of the nodes each second need to be summarized in one variable named (Battery). This representative variable can be computed as the mean (or the median) of the 50 values. It can also be computed as the Factor value (i.e. a code) that summarizes all the 50 values. Either of the two techniques can be applied using SPSS or R. Accordingly, the data file will consist of 3600 x 100 rows where each row consists of 10 parameters (i.e. 10 independent variables) and the corresponding dependent variables.
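The mean or median summarization described above can be sketched as follows. This is a minimal illustration in Python rather than SPSS or R; the row layout (a mapping from parameter name to the per-node values for one second and one message) and the function name are assumptions made for the example.

```python
from statistics import mean, median
from typing import Dict, List


def summarize_parameters(row: Dict[str, List[float]], how: str = "mean") -> Dict[str, float]:
    """Collapse the per-node values of each parameter into one representative value."""
    if how == "mean":
        reducer = mean
    elif how == "median":
        reducer = median
    else:
        raise ValueError("how must be 'mean' or 'median'")
    return {name: reducer(values) for name, values in row.items()}


# Hypothetical row with three nodes instead of 50, to keep the example short.
row = {"Battery": [10, 20, 90], "ucol": [4, 4, 7]}
by_mean = summarize_parameters(row, "mean")
by_median = summarize_parameters(row, "median")
```

With skewed values such as the battery levels above, the mean (40) and the median (20) differ noticeably, so the choice between them affects the summarized variable.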

Alternatively, the ten parameters of each user are listed in a separate row appended with the dependent variables for each second and each message. In this case, the data file will consist of 3600 x 100 x 50 rows (i.e. 18,000,000 cases) where each row consists of 10 parameters (i.e. 10 independent variables) and the corresponding dependent variables.

The huge data files generated by the simulator (18 million cases per file) overwhelm the statistical packages because of the limited memory of the machine in use (16 GB RAM); extracting samples from these data files resolves the problem. Accordingly, a sample of 500 cases is extracted from each data file using SPSS.

Moreover, since we have conducted simulations on traces from several hours on various days, we apply stratified sampling by extracting a sample from each hour of each day and then gathering all these samples in one file that serves as the input data for the following statistical analysis.
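The stratified sampling step can be sketched as below, assuming each case carries the day and hour of the trace it came from. The field names, the fixed seed and the per-stratum size used in the usage lines are hypothetical; the thesis extracts its samples with SPSS.

```python
import random
from typing import Callable, Dict, Hashable, List


def stratified_sample(rows: List[dict], key: Callable[[dict], Hashable],
                      per_stratum: int, seed: int = 0) -> List[dict]:
    """Draw up to per_stratum rows at random from each stratum and pool them."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata: Dict[Hashable, List[dict]] = {}
    for row in rows:
        strata.setdefault(key(row), []).append(row)
    sample: List[dict] = []
    for stratum_key in sorted(strata):
        members = strata[stratum_key]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical cases: 2 days x 2 hours x 5 cases each.
rows = [{"day": d, "hour": h, "case": c}
        for d in (1, 2) for h in (9, 10) for c in range(5)]
sample = stratified_sample(rows, key=lambda r: (r["day"], r["hour"]), per_stratum=2)
```

Sampling within each (day, hour) stratum keeps every trace period represented in the pooled file, which a single random draw over all cases would not guarantee.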

Collinearity Check

Before applying the regression analysis, a collinearity diagnosis among all the independent variables and also in relation with the dependent variables is conducted. The R package summarizes this diagnosis in a scatter matrix whose upper half contains scatter plots between each dependent variable and one of the independent variables or a scatter plot between two independent variables, while its lower half contains the corresponding correlation coefficient accompanied by its significance value (p-value). If there is a significant correlation between the two variables, the p-value will be less than 0.01; otherwise, the correlation is not significant.
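For reference, the correlation coefficient shown in the lower half of such a scatter matrix can be computed as in the sketch below, which assumes the Pearson coefficient (the section does not name the coefficient used). It also computes the t statistic on which the significance test is based; converting that statistic to a p-value requires the t distribution with n − 2 degrees of freedom, which is left to a statistical package.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical values, for illustration only.
r = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])
t = t_statistic(r, 4)
```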

The resulting scatter matrices indicate the following:

1) Figure 7.10 illustrates the correlation of DeliveryRatio and the ten independent variables and indicates that there is a significant correlation between DeliveryRatio and each of Battery, ContactDurSoFar and ucol. This figure also illustrates the collinearity between some of the independent variables as follows: Battery versus AvgContDurSoFar, AvgContDurNow versus AvgFriendsRank, AvgContDurNow versus normalContactFr, AvgFriendsRank versus normalContactFr, ContactDurSoFar versus PlacePopHere, and finally PlacePopHere versus ucol.

2) Figure 7.11 illustrates the correlation of IntFWDRatio versus the ten independent variables. The figure illustrates a correlation between IntFWDRatio and each of Battery, ContactDurSoFar and ucol.

3) Figure 7.12 illustrates the correlation of UnIntFWDRatio versus the ten independent variables. Strangely enough, this figure indicates that there is no relationship between the UnIntFWDRatio and any of the independent variables.

4) Figure 7.13 illustrates the correlation of Cost versus the ten independent variables. This figure displays a relationship between Cost and PlacePopFreq.

5) Figure 7.14 illustrates the correlation of ConsumedPower versus the ten independent variables. This figure demonstrates a relationship between ConsumedPower and each of the Battery, ContactDurSoFar, PlacePopFreq, and ucol.

Figure 7.10: Scatter Matrix with Multi Correlation Coefficients for DeliveryRatio

6) Figure 7.15 illustrates the correlation of Fmeasure versus the ten independent variables. This figure shows that there is a relationship between Fmeasure and each of Battery and ucol.

7) Figure 7.16 illustrates the correlation of Fairness versus the ten independent variables. This figure depicts a relationship between Fairness and each of the Battery, ContactDurSoFar, PlacePopFreq, and ucol.

8) Figure 7.17 depicts the correlation between any two of the dependent variables. The following correlations are inferred from the figure: DeliveryRatio and IntFWDRatio, DeliveryRatio and UnIntFWDRatio, DeliveryRatio and ConsumedPower, DeliveryRatio and Fmeasure, IntFWDRatio and Cost, IntFWDRatio and ConsumedPower, IntFWDRatio and Fairness, IntFWDRatio and Fmeasure, UnIntFWDRatio and ConsumedPower, UnIntFWDRatio and Fairness, UnIntFWDRatio and Fmeasure, Cost and Fmeasure, ConsumedPower and Fairness, ConsumedPower and Fmeasure, Fairness and Fmeasure.

Figure 7.11: Scatter Matrix with Multi Correlation Coefficients for IntFWDRatio

Multiple Linear Regression

The next step is to conduct the multiple linear regression analysis for each of the dependent variables. Forward stepwise regression is preferred here, and it is applied to each of these dependent variables against the proposed independent variables.

The results of the forward stepwise regression analysis conducted using R recommend the following regression functions for the dependent variables, each with its corresponding minimum AIC value:

DeliveryRatio = f(ucol + Battery + ucol:Battery) with AIC = -4249.54

IntFWDRatio = f(ucol + Battery + ucol:Battery) with AIC = -4341.59

UnIntFWDRatio is not a function of any of the abovementioned independent variables, with AIC = -6106.71

Cost = f(PlacePopFreq) with AIC = 5161.51

ConsumedPower = f(ucol + Battery + PlacePopHere + ucdc + PlacePopFreq + ucol:Battery + Battery:PlacePopHere) with AIC = 1557.07

Fairness = f(ucol + Battery + PlacePopHere + ucdc + PlacePopFreq + ucol:Battery + Battery:PlacePopHere) with AIC = -7160.14

Fmeasure = f(ucol + Battery + ucol:Battery) with AIC = -4427.48

Figure 7.12: Scatter Matrix with Multi Correlation Coefficients for UnIntFWDRatio
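As a sanity check on the reported criterion values, the sketch below recomputes them from the residual sums of squares listed in the ANOVA tables (Tables 7.14 and 7.16). It uses the form n ∗ ln(RSS/n) + penalty ∗ p, which is how R's stepwise selection scores a linear model. The reported values appear to be reproduced when the penalty per parameter is ln(n), i.e. a BIC-style penalty such as the one obtained by passing k = log(n) to R's step function; this is an inference from the numbers, not something stated in the text. The sample size n = 1000 is likewise inferred from the residual degrees of freedom.

```python
import math


def step_criterion(rss: float, n: int, n_params: int, penalty: float) -> float:
    """Selection criterion n * ln(RSS / n) + penalty * n_params for a linear model."""
    return n * math.log(rss / n) + penalty * n_params


n = 1000                     # inferred: 996 residual df + 4 coefficients (Table 7.14)
bic_penalty = math.log(n)    # assumption: penalty of ln(n) per estimated coefficient

# DeliveryRatio: RSS = 13.8818 with 4 coefficients (intercept, ucol, Battery, ucol:Battery).
delivery = step_criterion(13.8818, n, 4, bic_penalty)
# Cost: RSS = 172034 with 2 coefficients (intercept, PlacePopFreq).
cost = step_criterion(172034.0, n, 2, bic_penalty)
```

Under these assumptions the two values come out at roughly −4249.5 and 5161.5, matching the figures listed above to within rounding of the tabulated sums of squares.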

Table 7.7 shows the significance of the above-mentioned functions and the degree to which these independent variables successfully describe the dependent variables, as deduced from the R² value (the coefficient of determination). Note that R² is the proportion of variance in the dependent variable that can be explained by the independent variables. From this table, it is clear that the selected independent variables can describe 6% of the delivery ratio and 5% of the ratio of contacted interested forwarders, but they fail to describe the ratio of contacted uninterested forwarders. Furthermore, these independent variables describe only 1.3% of the cost and only 3.6% of the Fmeasure, but successfully describe 56% of the consumed power and 55% of the utilization fairness.
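The R² figures can be related to the sums of squares in the ANOVA tables with a short worked example. The sketch below uses the standard definitions of R², adjusted R² and the standard error of the estimate, applied to the Delivery Ratio sums of squares from Table 7.14; the function names are illustrative, and n = 1000 cases is inferred from the residual degrees of freedom.

```python
import math
from typing import Sequence


def r_squared(ss_terms: Sequence[float], ss_resid: float) -> float:
    """R^2 = explained sum of squares / total sum of squares."""
    ss_model = sum(ss_terms)
    return ss_model / (ss_model + ss_resid)


def adjusted_r_squared(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n cases and p predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def residual_std_error(ss_resid: float, df_resid: int) -> float:
    """Standard error of the estimate: sqrt(RSS / residual df)."""
    return math.sqrt(ss_resid / df_resid)


# Delivery Ratio (Table 7.14): ucol, Battery, ucol:Battery, then residuals.
r2 = r_squared([0.4405, 0.1561, 0.3215], 13.8818)
adj = adjusted_r_squared(r2, n=1000, p=3)
se = residual_std_error(13.8818, 996)
```

These evaluate to approximately 0.0620, 0.0592 and 0.1181, which agrees with the Delivery Ratio row of Table 7.7 and shows that the "6%" quoted above is the adjusted value rounded.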

The coefficients of these variables are detailed in Tables 7.8, 7.9, 7.10, 7.11, 7.12 and 7.13. Based on the t-value and the p-value, each independent variable's coefficient is tested to see whether it is statistically significantly different from 0. All the variables with p < 0.05 are statistically significantly different from 0 and are therefore significant parameters in the regression model. Based on the above analysis, specific values for the coefficients can now be stated; from the six tables mentioned above, the deduced regression relationships are:

Dependent Variable           Multiple R Square  Adjusted R Square  Std. Error of the Estimate
Delivery Ratio                         0.06204            0.05921                      0.1181
Interested Forwarders Ratio            0.05436            0.05151                      0.1127
Cost                                   0.01425            0.01326                     13.13
Consumed Power                         0.5651             0.562                        2.127
Fairness                               0.5603             0.5572                       0.02722
Fmeasure                               0.03946            0.03656                      0.108

Table 7.7: Regression Analysis

                          Estimate  Std. Error   t value    Pr(>|t|)
(Intercept)              1.403e+00   7.933e-02    17.683     < 2e-16
ucol                    -4.936e-05   1.139e-05    -4.333    1.62e-05
Battery                 -4.986e-03   8.538e-04    -5.840    7.05e-09
ucol:Battery             5.979e-07   1.245e-07     4.803    1.80e-06

Table 7.8: Coefficients of the Regression Analysis of the Delivery Ratio

                          Estimate  Std. Error   t value    Pr(>|t|)
(Intercept)              1.368e+00   7.577e-02    18.051     < 2e-16
ucol                    -4.535e-05   1.088e-05    -4.168    3.34e-05
Battery                 -4.508e-03   8.154e-04    -5.529    4.11e-08
ucol:Battery             5.469e-07   1.189e-07     4.600    4.77e-06

Table 7.9: Coefficients of the Regression Analysis of the Interest Forwarders Ratio

                          Estimate  Std. Error   t value    Pr(>|t|)
(Intercept)                13.8913      0.4271    32.525     < 2e-16
PlacePopFreq             -542.4784    142.8288    -3.798    0.000155

Table 7.10: Coefficients of the Regression Analysis of Cost

                          Estimate  Std. Error   t value    Pr(>|t|)
(Intercept)              7.858e+01   1.514e+00    51.883     < 2e-16
ucol                     8.006e-04   2.071e-04     3.866    0.000118
Battery                  1.674e-01   1.631e-02    10.263     < 2e-16
PlacePopHere             1.518e+02   6.651e+01     2.283    0.022662
ucdc                    -4.404e+00   1.489e+00    -2.957    0.003182
PlacePopFreq            -6.109e+01   2.326e+01    -2.627    0.008758
ucol:Battery            -1.555e-05   2.262e-06    -6.877    1.08e-11
Battery:PlacePopHere    -2.366e+00   7.745e-01    -3.054    0.002315

Table 7.11: Coefficients of the Regression Analysis of Consumed Power

                          Estimate  Std. Error   t value    Pr(>|t|)
(Intercept)              7.807e-01   1.938e-02    40.283     < 2e-16
ucol                     7.002e-06   2.650e-06     2.642     0.00837
Battery                  1.900e-03   2.087e-04     9.100     < 2e-16
PlacePopHere             1.689e+00   8.511e-01     1.984     0.04752
ucdc                    -5.659e-02   1.906e-02    -2.969     0.00306
PlacePopFreq            -8.207e-01   2.976e-01    -2.758     0.00593
ucol:Battery            -1.633e-07   2.894e-08    -5.642    2.19e-08
Battery:PlacePopHere    -2.733e-02   9.911e-03    -2.758     0.00593

Table 7.12: Coefficients of the Regression Analysis of Fairness

Figure 7.13: Scatter Matrix with Multi Correlation Coefficients for Cost

                          Estimate  Std. Error   t value    Pr(>|t|)
(Intercept)              1.306e+00   7.258e-02    17.993     < 2e-16
ucol                    -3.901e-05   1.042e-05    -3.743    0.000192
Battery                 -3.830e-03   7.811e-04    -4.903    1.10e-06
ucol:Battery             4.649e-07   1.139e-07     4.082    4.83e-05

Table 7.13: Coefficients of the Regression Analysis of Fmeasure

DeliveryRatio = -4.936e-05 * ucol - 4.986e-03 * Battery + 5.979e-07 * (ucol * Battery) + 1.403

IntFWDRatio = -4.535e-05 * ucol - 4.508e-03 * Battery + 5.469e-07 * (ucol * Battery) + 1.368

Cost = -542.4784 * PlacePopFreq + 13.8913

ConsumedPower = 8.006e-04 * ucol + 1.674e-01 * Battery + 1.518e+02 * PlacePopHere - 4.404 * ucdc - 6.109e+01 * PlacePopFreq - 1.555e-05 * (ucol * Battery) - 2.366 * (PlacePopHere * Battery) + 7.858e+01

Fairness = 7.002e-06 * ucol + 1.900e-03 * Battery + 1.689 * PlacePopHere - 5.659e-02 * ucdc - 8.207e-01 * PlacePopFreq - 1.633e-07 * (ucol * Battery) - 2.733e-02 * (PlacePopHere * Battery) + 7.807e-01

Fmeasure = -3.901e-05 * ucol - 3.830e-03 * Battery + 4.649e-07 * (ucol * Battery) + 1.306
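To show how such fitted relationships would be used by a message source, the sketch below evaluates two of them with the coefficients from Tables 7.8 and 7.10. The function names are illustrative, and the units and ranges of ucol, Battery and PlacePopFreq are not stated in this section, so the input in the usage lines is hypothetical.

```python
def predict_delivery_ratio(ucol: float, battery: float) -> float:
    """Fitted DeliveryRatio model: main effects plus the ucol:Battery interaction."""
    return (1.403
            - 4.936e-05 * ucol
            - 4.986e-03 * battery
            + 5.979e-07 * ucol * battery)


def predict_cost(place_pop_freq: float) -> float:
    """Fitted Cost model with PlacePopFreq as the only predictor."""
    return 13.8913 - 542.4784 * place_pop_freq


# Hypothetical input, for illustration only.
estimated_cost = predict_cost(0.01)
```

Because the corresponding models explain only about 6% and 1.3% of the variance respectively (Table 7.7), such predictions should be read as rough tendencies rather than precise estimates.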

Figure 7.14: Scatter Matrix with Multi Correlation Coefficients for ConsumedPower

                        df  Sum of Squares  Mean Square          F       Sig.
ucol                     1          0.4405      0.44050     31.605  2.452e-08
Battery                  1          0.1561      0.15612     11.202  0.0008478
ucol:Battery             1          0.3215      0.32151     23.068  1.804e-06
Residuals              996         13.8818      0.01394

Table 7.14: ANOVA for Delivery Ratio

ANOVA

The F-ratio in the ANOVA tables tests whether the overall regression model is a good fit for the data. The tables show that the independent variables statistically significantly predict the dependent variable. Thus, each multiple regression model is a good fit of the data, as shown in Tables 7.14, 7.15, 7.16, 7.17, 7.18, and 7.19.
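The F values in these tables follow directly from the sums of squares. The minimal sketch below recomputes the F-ratio of the ucol term for the Delivery Ratio model and, for comparison, an overall F-ratio that pools the three model terms; the pooled figure is not listed in the tables and is shown only to illustrate the calculation. The function name is illustrative.

```python
def f_ratio(ss_effect: float, df_effect: int, ss_resid: float, df_resid: int) -> float:
    """F = (SS_effect / df_effect) / (SS_residual / df_residual)."""
    return (ss_effect / df_effect) / (ss_resid / df_resid)


# Delivery Ratio (Table 7.14): the ucol term against the residuals.
f_ucol = f_ratio(0.4405, 1, 13.8818, 996)

# Overall model: ucol + Battery + ucol:Battery pooled on 3 degrees of freedom.
f_overall = f_ratio(0.4405 + 0.1561 + 0.3215, 3, 13.8818, 996)
```

The per-term value comes out at about 31.6, as listed in Table 7.14, and the pooled value at about 22.0; obtaining the significance levels from these ratios requires the F distribution, which is left to the statistical package.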

                        df  Sum of Squares  Mean Square          F       Sig.
ucol                     1          0.3382      0.33818    26.6035  3.013e-07
Battery                  1          0.1206      0.12061     9.4877   0.002125
ucol:Battery             1          0.2690      0.26897    21.1590  4.771e-06
Residuals              996         12.6611      0.01271

Table 7.15: ANOVA for the Interested Forwarder Ratio

Figure 7.15: Scatter Matrix with Multi Correlation Coefficients for Fmeasure

                        df  Sum of Squares  Mean Square          F       Sig.
PlacePopFreq             1            2487      2486.66     14.426  0.0001546
Residuals              998          172034       172.38

Table 7.16: ANOVA for Cost

                        df  Sum of Squares  Mean Square          F       Sig.
ucol                     1          4989.4       4989.4  1102.3860  < 2.2e-16
Battery                  1           349.6        349.6    77.2332  < 2.2e-16
PlacePopHere             1           181.3        181.3    40.0655  3.714e-10
ucdc                     1            42.8         42.8     9.4555   0.002163
PlacePopFreq             1            31.2         31.2     6.9010   0.008747
ucol:Battery             1           196.4        196.4    43.3986  7.234e-11
Battery:PlacePopHere     1            42.2         42.2     9.3293   0.002315
Residuals              992          4489.8          4.5

Table 7.17: ANOVA for Consumed Power

                        df  Sum of Squares  Mean Square          F       Sig.
ucol                     1         0.81305      0.81305  1097.0825  < 2.2e-16
Battery                  1         0.05496      0.05496    74.1620  < 2.2e-16
PlacePopHere             1         0.02921      0.02921    39.4127  5.121e-10
ucdc                     1         0.00692      0.00692     9.3433   0.002298
PlacePopFreq             1         0.00564      0.00564     7.6109   0.005909
ucol:Battery             1         0.02143      0.02143    28.9185  9.415e-08
Battery:PlacePopHere     1         0.00564      0.00564     7.6054   0.005926
Residuals              992         0.73517      0.00074

Table 7.18: ANOVA for Fairness

Figure 7.16: Scatter Matrix with Multi Correlation Coefficients for Fairness

                        df  Sum of Squares  Mean Square          F       Sig.
ucol                     1          0.1962     0.196194    16.8181  4.449e-05
Battery                  1          0.0867     0.086742     7.4357   0.006507
ucol:Battery             1          0.1944     0.194358    16.6607  4.828e-05
Residuals              996         11.6190     0.011666

Table 7.19: ANOVA for Fmeasure

Figure 7.17: Scatter Matrix with Multi Correlation Coefficients among Dependent Variables

Chapter 8

Discussion and Conclusion

This research first provided a survey of the progress that has been made in two fields in particular, Pervasive Computing and Social Networking, in order to highlight the evolution of what we have coined Social Pervasive Systems (SPSs). We then concentrated on a sub-domain of SPSs: socially-influenced context-aware systems. More specifically, we argued that the limitations of social-based context-aware forwarding algorithms that do not leverage knowledge of the users’ interests or the nodes’ power capabilities can be overcome by integrating interest-awareness and power-awareness into the forwarding decision-making process, thereby achieving greater effectiveness, efficiency and power utilization fairness. In light of this, three main frameworks were proposed. The first framework is PI-SOFA, a framework for integrating interest-awareness and power-awareness into any social-based opportunistic forwarding algorithm in order to achieve the aforementioned improvements in performance. The following set of implemented versions concentrates on the benefits of introducing interest and power awareness.

To demonstrate the effectiveness and efficiency of our proposed solutions, we show the improvement in performance of three state-of-the-art social-based context-aware forwarding systems that is achieved by integrating interest-awareness into the forwarding decision-making process. This is implemented by introducing an interest-based reward/penalty function when computing the social-based rank of each user node. The proposed interest-aware contributions are the Interest-aware PeopleRank (IPeR), the Interest-aware SocialCast (ISCast) and the Interest-aware SCAR (ISCAR). The performance of these three proposed algorithms is evaluated through simulations. As the simulation results illustrate, these improvements achieve a marked cost reduction, a higher delivery ratio, and greater interest-based effectiveness in terms of increased contact with interested forwarders and reduced contact with uninterested nodes.

The improved power awareness and utilization fairness are exemplified in the proposed Power-and-Interest-Aware versions, namely, the power-and-interest-aware PeopleRank algorithm versions (PIPeR and PIPeROp), the power-and-interest-aware SocialCast algorithm versions (PISCast and PISCastOp), and the power-and-interest-aware SCAR algorithm versions (PISCAR and PISCAROp). The simulation results for these proposed versions show a marked reduction in power consumption and higher utilization fairness. Our experiments also demonstrate the impact of introducing depletion-rate awareness and contact-duration awareness into these power-aware social-based forwarding algorithms.

A second framework integrated Space Syntax metrics into the ranking function of the PI-SOFA framework in order to leverage the popularity of the spots frequented by a node, the popularity of a node’s current location, or even the popularity of the spots close to it. This proposal is based on Space Syntax theory, which states that the spatial layout of a particular place guides the movement of the people who visit it, and that the popularity of a place indicates a high probability of encounter, which in turn increases the opportunity to deliver messages to destination nodes. Through the Space Syntax-based framework, we proposed variations of Space Syntax-based opportunistic forwarding algorithms. We further proposed integrating Space Syntax into the PI-SOFA algorithms. In the simulations conducted, the traditional Space Syntax metrics were found to be ineffective and inefficient, whereas the metrics based on the popularity of the places frequented by nodes exhibited higher efficiency and effectiveness.

A third framework introduced dynamically adaptive ranking to opportunistic forwarding algorithms. This framework is illustrated in the four versions that were developed, namely, the Dynamic adaptive ranking algorithm (DynAdp), the opportunistic threshold Dynamic adaptive version (DynAdpOp), the Space Syntax-Aware Dynamic adaptive algorithm (DynAdpSpSyn), and its opportunistic threshold version (DynAdpOpSpSyn). When these four were compared to all the other proposed algorithms and to the benchmark algorithms, the simulation results demonstrated a considerable improvement in performance in terms of effectiveness, efficiency, power awareness, and utilization fairness. Furthermore, we devised performance indices in order to evaluate the performance of all the proposed algorithms across various environments. On the basis of this evaluation, a set of recommendations was made for any parties that intend to implement any of the proposed algorithms.

All the above simulations and evaluations were conducted using our own Social-AwaRe Opportunistic forwarding Simulator (SAROS). This simulator provides a set of interest distributions, a set of battery distributions, models of user profiles, a kinetic battery model, simulated depletion rates, imported and manipulated real mobility traces, a synthesized mobility model, an implementation of a set of state-of-the-art social-based opportunistic forwarding algorithms, and a variety of devised evaluation metrics. One of the strong features of this simulator is that it is able to simulate users’ interest in forwarded content as well as the power capabilities of their devices, two features that, to the best of our knowledge, are not provided by any of the state-of-the-art simulators currently available. The simulator also imports a variety of real traces that cover a mall environment, two university campuses, and two conferences. These real traces include the mobility traces of the AUC community, gathered through the AUC wireless connectivity network. All of the simulation results presented in this research were produced by the SAROS simulator.

Finally, a statistical analysis was conducted, on the basis of which a set of recommendations and predictions was made regarding the expected behavior of the evaluation metrics, namely, the delivery ratio, the ratio of contacted interested forwarders, the ratio of contacted uninterested forwarders, the paid cost, the consumed power, the utilization fairness, and the f-measure.

To summarize the achieved results and the deduced recommendations, the following set of figures is discussed in detail:

Figure 8.1 provides an illustrated comparison of the performance of the various studied categories of opportunistic forwarding algorithms ranging from the pure opportunistic Epidemic algorithm to a sample of the newly proposed Space Syntax forwarding algorithms. From this figure, we can see that the integration of PI-SOFA with Space Syntax metrics achieves better performance than that of each one of them separately, specifically in terms of f-measure, the ratio of contacted uninterested nodes, reduction in paid cost, and reduction in power consumption.

Figure 8.1: Performance Comparison among Representative algorithms of all the Studied Categories of Opportunistic Forwarding Algorithms

8.1 Effectiveness

From the Interest-based effectiveness figure (Figure 8.2), it is evident that the Dynamic Adaptive versions approach the Epidemic benchmark in delivery ratio. Moreover, Space Syntax integration within the ranking function maintains the same delivery ratio and contacted interested forwarder ratio as those achieved without Space Syntax but, crucially, avoids contact with 86% of the uninterested nodes.

Furthermore, integrating the interest-threshold opportunistic selection into the ranking function facilitates contact with a higher ratio of interested forwarder nodes. It is also apparent that introducing the PI-SOFA framework to SocialCast and SCAR boosts the ratio of contacted interested forwarders by 2925% in the case of PISCastOp and 13125% in the case of PISCAROp. All of this is achieved while preserving a low ratio of contacted uninterested nodes.

Figure 8.2: Interest-based Effectiveness of All Proposed Algorithms - AUC Dataset

From the F-measure figure (Figure 8.3), it is clear that integrating the PI-SOFA framework with any of the social-aware forwarding algorithms boosts its f-measure; for example, the f-measure of PISCAROp exceeds that of SCAR by 475.5%. It is also worth noting that integrating both Space Syntax and interest-threshold opportunistic selection into forwarding decisions increases the f-measure by 0.122. However, the f-measure of the non-opportunistic versions is much lower than that of the opportunistic versions because a lower ratio of interested forwarders is contacted. The opportunistic versions overcome this defect by explicitly selecting any encountered interested forwarders.

Figure 8.3: F-measure of All Proposed Algorithms - AUC Dataset

8.2 Efficiency

Efficiency is assessed in terms of the paid cost, the paid cost per unit delivery ratio, and the paid cost per unit delivery ratio and per unit ratio of contacted interested forwarder nodes.

Figure 8.4: Cost of All Proposed Algorithms - AUC Dataset

From the cost figure (Figure 8.4), it is apparent that SocialCast and SCAR are the least costly of all the compared algorithms, followed by DynAdpSpSyn. However, comparing the algorithms in terms of cost alone, without paying attention to the benefit gained, leads to a false conclusion. It is therefore more accurate to study the cost per unit delivery ratio, shown in Figure 8.5, alongside the cost figure (Figure 8.4). From this comparative study, we reach the following conclusion: although SCAR and SocialCast seem to be the most cost-efficient algorithms, their effectiveness ranks lowest among all the compared algorithms. This is indicated by the high cost per unit delivery ratio they incur in order to reach a low delivery ratio and a low ratio of contacted interested forwarders. Analyzing the cost per unit delivery ratio makes it clear that the DynAdpSpSyn algorithm is the best in terms of the cost required to reach a high delivery ratio. Thus, introducing dynamic adaptation and integrating it with Space Syntax metrics boosts the cost effectiveness of the forwarding algorithm.
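The difference between ranking by raw cost and ranking by cost per unit delivery ratio can be illustrated with a minimal sketch. The two algorithm names and their (cost, delivery ratio) values below are hypothetical and chosen only to show the effect; they are not taken from the simulation results, and the handling of a zero delivery ratio is an assumption.

```python
def cost_per_unit_delivery(cost, delivery_ratio):
    """Paid cost divided by delivery ratio; infinite when nothing is delivered."""
    if delivery_ratio <= 0:
        return float("inf")
    return cost / delivery_ratio


# Hypothetical (cost, delivery ratio) pairs, NOT values from the thesis.
algos = {
    "LowCostLowDelivery": (100.0, 0.05),
    "HigherCostHighDelivery": (400.0, 0.90),
}

# Ranking by raw cost favors the algorithm that barely delivers anything.
cheapest_raw = min(algos, key=lambda name: algos[name][0])

# Ranking by cost per unit delivery ratio accounts for the benefit gained.
best_normalized = min(
    algos, key=lambda name: cost_per_unit_delivery(*algos[name])
)

print("Lowest raw cost:", cheapest_raw)
print("Lowest cost per unit delivery ratio:", best_normalized)
```

With these illustrative numbers the two rankings disagree, which mirrors the observation that a low raw cost can hide a very low delivery ratio.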

It is also worth noting that by introducing the threshold opportunistic component to the dynamic adaptation at an additional cost, the new algorithm (DynAdpOpSpSyn) successfully approached the full delivery ratio. The same observation can be reported on the Interest-Only algorithm which achieves a comparable level of cost effectiveness to that of the DynAdpSpSyn algorithm.

Figure 8.5: Cost per Unit Delivery Ratio of All Proposed Algorithms - AUC Dataset

It is remarkable that the paid cost per unit of delivery ratio and per unit of contacted interested forwarders ratio, as shown in Figure 8.6, dissects the algorithms’ performance more rigorously and thus provides a more practical calculation of the efficiency of the compared algorithms. If one seeks to achieve the best delivery ratio and the best ratio of contacted interested forwarders with the least paid cost, they would opt for the DynAdpOpSpSyn and the Interest-Only algorithms.

Figure 8.7 shows the delay metric across all compared algorithms. The Dynamic adaptive versions achieve a short delay, while the other PI-SOFA versions incur a long delay in delivery time. SCAR and SocialCast also show a short delay, but this can be attributed to the fact that they deliver a very small number of messages. Thus, the Dynamic adaptive versions accomplish the difficult task of achieving a very high delivery ratio with a relatively short delay in delivery time, competing with SCAR and SocialCast in terms of short delay on one side and with Epidemic in terms of high delivery ratio on the other. In addition, the Dynamic adaptive versions contact fewer uninterested nodes, something that Epidemic fails to achieve.

Figure 8.6: Cost per Unit Delivery Ratio and Interested Forwarders Ratio of All Proposed Algorithms - AUC Dataset

Figure 8.7: Delay of All Proposed Algorithms - AUC Dataset

8.3 Power Consumption Awareness

By analyzing the power consumption of the compared algorithms illustrated in Figure 8.8, it is evident that, overall, the power-and-interest-aware PeopleRank versions conserve more power than the original PeopleRank algorithm. Also, generally speaking, the Space Syntax-based dynamic adaptive versions conserve power when compared to their corresponding Space Syntax-oblivious dynamic adaptive versions. The SocialCast and SCAR algorithms consume the lowest percentage of power, while the power-and-interest-aware versions of these two algorithms consume more power. This is attributed to the boosted delivery ratio and the boosted ratio of contacted interested forwarders. It is clear from the figure that the Epidemic, E-BubbleRap and PeopleRank algorithms consume the highest percentage of power.

Figure 8.8: Power Consumption of All Proposed Algorithms - AUC Dataset

It is, however, more comprehensive and justifiable to analyze the algorithms’ performance in terms of power consumption per unit delivery ratio and per unit contacted interested forwarders ratio, as illustrated in Figure 8.9 and Figure 8.10. From both figures, it is clear that integrating Space Syntax into the ranking function reduces the percentage of consumed power per unit delivery ratio and contacted interested forwarder ratio by 0.27%, as it facilitates more effective selection of forwarder nodes that frequent popular places and thus speeds up delivery. It is also worth noting that the power-aware versions of the PeopleRank algorithm succeed in reducing the consumed power with a comparable delivery ratio and a comparable contacted interested forwarder ratio. On the other hand, in order to gain an extra 0.4 in delivery ratio and an extra 0.2 in the ratio of contacted interested forwarders, the power-aware versions of SCAR and SocialCast consume an extra 1.5% power when compared to their power-oblivious peer algorithms.

Figure 8.9: Power Consumption per unit Delivery Ratio of All Proposed Algorithms - AUC Dataset

Figure 8.10: Power Consumption per unit Delivery Ratio and Interested Forwarder Ratio of All Proposed Algorithms - AUC Dataset

8.4 Fairness

Power utilization fairness is analyzed in Figure 8.11, where it is clear that integrating both Space Syntax and interest-based opportunistic selection into the ranking function slightly improves the level of utilization fairness. Nonetheless, the power-aware versions of the SocialCast and SCAR algorithms fail to maintain a higher level of utilization fairness in comparison to their peer power-oblivious versions. It is also worth mentioning that the Space Syntax-based forwarding approach achieves the highest level of utilization fairness among all the compared algorithms.

8.5 Normalized Performance Indices

The performance of the proposed algorithms is evaluated across various environments with various interest distributions and various battery distributions. The normalized performance indices defined in Section 4.2.5 are computed for each algorithm in each environment.

Figure 8.11: Utilization Fairness of All Proposed Algorithms - AUC Dataset

8.5.1 Effectiveness across various distributions

From Figure 8.12a, which represents the effectiveness of the compared algorithms across various interest distribution environments, and Figure 8.12b, which represents their effectiveness across various battery distribution environments, it is noticeable that the Interest-Only (IntOnly) and the Opportunistic Space Syntax-based Dynamic adaptive (DynAdpOpSpSyn) algorithms maintain the highest effectiveness performance index across the various distributions. It is worth noting that the computation of the effectiveness performance index for the two-distinct interest groups environment excludes the normalized ratio of contacted interested forwarders, since the interest distribution in this environment has no interested forwarders and including this non-existent component would significantly and negatively affect the index. It is also notable that the depletion-rate versions of SCAR and SocialCast fail to contact any nodes (whatever the nodes’ degree of interest), which leads to their zero effectiveness performance index values. In addition, the interest-aware versions and the opportunistic interest-and-power-aware versions of SCAR and SocialCast achieve higher effectiveness performance index values than their corresponding non-opportunistic interest-and-power-aware versions across distributions.

8.5.2 Efficiency across various distributions

From Figure 8.13a, which illustrates the efficiency performance index across various interest distributions, and Figure 8.13b, which illustrates it across various battery distributions, it is evident that the Space Syntax-based Dynamic adaptive algorithms (DynAdpSpSyn and DynAdpOpSpSyn) achieve the highest efficiency. It is also observable that the PISCast and PISCAR algorithms maintain higher efficiency than the other interest-and-power-aware versions of SCAR and SocialCast. This difference in efficiency could be attributed to the lower cost paid by these two versions. In addition, all the proposed versions attain a higher level of efficiency in comparison to the original SCAR and SocialCast algorithms.

8.5.3 Power Awareness across various distributions

From Figure 8.14a, which illustrates the Power Awareness performance index across various interest distributions, and Figure 8.14b, which illustrates it across various battery distributions, it is clear that the algorithms that achieve a low delivery ratio are able to preserve power and maintain a high level of utilization fairness. Thus, SCAR, SocialCast and their depletion-rate versions achieve high power awareness performance index values. Surprisingly, the DynAdpSpSyn algorithm achieves the highest index value in the two-distinct interest groups environment. This could be attributed to the fact that the DynAdpSpSyn algorithm consumed the least amount of power of all the compared algorithms in this specific environment, while it also incurred the shortest delay in delivering the messages. However, this does not characterize its performance across the remaining environments.

8.6 Conclusion

From the statistical analysis conducted to define the relationship between each of the performance metrics and the suggested independent variables that control these metrics, we can infer that the most significant factors are the node’s battery level, the node’s colocation with destination nodes, and the popularity of the place frequented by the node. Furthermore, the scatter matrix plots of the multiple regressions constitute a useful tool for diagnosing the collinearity between the independent variables, in addition to the significant relationships between each of the dependent variables and every independent variable. From these plots, one can notice the collinearity between the contact duration variables (ContactDursoFar) and (AvgContDurNow). Also, there is a close relationship between ConsumedPower and Fairness, which is deduced from the scatter plots and also from the equation of one of these two variables. This is logical since both of them rely on the consumed portion of the battery: the Fairness index is computed from the mean and standard deviation of the remaining battery levels of the whole community.

As a final note, given that these algorithms have numerous real life applications, it is important to point out that the most suitable algorithm can be determined based on the needs and requirements of the message senders, the nature of the community of users, the frequency and duration of contact, the users’ willingness to participate, the capability of the devices, or any other limitations in terms of cost, duration, etc.

Application                     Factors of Algorithm Selection   Most Suitable Algorithm(s)

Wireless sensor networks        Sensors power limitation         Power-aware depletion-rate aware
                                                                 versions of the proposed algorithms

Forwarding content to           Discourage contact with          Interest-aware versions of
interested parties without      uninterested nodes               the proposed algorithms.
entirely contacting                                              More specifically,
uninterested ones                                                the Interest-Only algorithm.

Forwarding content in           Dynamically changing             Dynamic adaptive
environments that are           context information              forwarding algorithms
characterized by dynamically
changing context

Table 8.1: Examples of Useful applications

One set of useful applications is wireless sensor networks, where the sensors are power-limited. Accordingly, the advisable forwarding algorithms would be the power-aware depletion-rate-aware versions of the proposed algorithms. For applications that aim to forward content to interested parties while avoiding contact with uninterested ones, the preferable forwarding algorithms would be the interest-aware versions of the proposed algorithms and, more specifically, the Interest-Only algorithm. Finally, for applications that forward content in environments characterized by dynamically changing context, the dynamic adaptive forwarding algorithms are the best fit, since these algorithms dynamically adapt the node ranking function according to the current context. Table 8.1 summarizes this set of examples.

(a) Effectiveness Performance Index in Various Interest Distributions

(b) Effectiveness Performance Index in Various Battery Distributions

Figure 8.12: Effectiveness Performance Index - AUC Dataset

(a) Efficiency Performance Index in Various Interest Distributions

(b) Efficiency Performance Index in Various Battery Distributions

Figure 8.13: Efficiency Performance Index - AUC Dataset

(a) Power Awareness Performance Index in Various Interest Distributions

(b) Power Awareness Performance Index in Various Battery Distributions

Figure 8.14: Power Awareness Performance Index - AUC Dataset

Chapter 9

Future Work

This research has focused on improving the performance of the social-based opportunistic forwarding algorithms by effectively merging pervasive and social systems. This is done by integrating interest-awareness and power-awareness into the forwarding decision-making process and by including a dynamic adaptive ranking function. As an extension to the challenges displayed in Figure 1.2, Figure 9.1 pinpoints eight challenges facing the emerging SPSs. As an emerging field, there is a great deal of potential for further research and experimentation. In future work, it will be useful to consider two specific challenges, on the basis of which a new set of experiments can be conducted in order to address them. These challenges are:

• Challenge 7: Data Staleness: This challenge pertains to a lack of balance between the trade-off goals, namely, preserving power and fairness versus minimizing the delay in delivery in order to ensure that destination nodes do not lose interest in the delivered content.

• Challenge 8: Recommendation Relevance: This challenge pertains to the need to combine more relevant social and context-aware metrics so that it is possible to improve the quality of the recommendations made. Initial attempts in other research studies suggest that an improvement in the efficiency of the recommendation process is mainly dependent on the selected social metrics [165].

To resolve Challenge 7, we propose maintaining fair social-aware content delivery that satisfies a Quality of Experience (QoE) threshold. As we show in the next section (Section 9.1), we intend to introduce the additional measure of user experience in terms of content delivery under a certain level of QoE. This will further enhance the performance of these systems in terms of user experience, adding to what has already been achieved in terms of power consumption, utilization fairness, effectiveness, and efficiency.

Figure 9.1: Collective Set of Challenges and Proposed Solutions for Two SPSs

To resolve Challenge 8, we propose the proper selection of social and context-aware metrics when making recommendation decisions, and the use of context-and-social-aware factors in assessment metrics, in order to generate proper social recommendations, specifically in the context of academic social networks. We thus aim to demonstrate the merit of properly integrating social-and-context-aware metrics in the decision-making stages of social recommender systems. This contribution, which is presented in Section 9.2, demonstrates a simulation of social recommender systems in academic social networks and studies the improvement in the system’s performance when social and context-aware metrics are fused to generate recommendations.

9.1 Towards Fair Social-Aware Media Delivery with Quality of Experience Threshold in Opportunistic Networks

The purpose here is to maintain the balance between two conflicting goals: reducing overall power consumption with fairness versus minimizing delay in delivery and avoiding the destination nodes’ loss of interest in the delivered content. The proposed solution is to track the probability that, over time, the destination nodes will lose interest in the forwarded content and to insert this factor as a parameter into the ranking of candidates.

Building on a metric proposed in another research work [166], we use the QoE of the users, measured once they receive the delivered content, to rank candidates more effectively. For this purpose, we consider the users’ interest in receiving content as something that is not constant; that is, their interest wanes over time. Accordingly, the user’s interest becomes a decaying function. When the destination nodes’ interest reaches a certain predefined threshold, the process of content delivery ceases, since the cost paid in terms of power and computation after this moment would be wasted. The performance of this new modification of the fair social-aware forwarding algorithm is measured by the users’ QoE as they receive the content. The algorithm’s target will be to maintain a balance between power consumption and users’ QoE.
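To make the idea of a decaying interest with a cut-off threshold concrete, the following is a minimal sketch under stated assumptions. The exponential form of the decay, the decay rate, the threshold value, and all function names are illustrative assumptions only; the actual decay function and threshold are left to the future work described here.

```python
import math


def interest_at(t, initial_interest=1.0, decay_rate=0.01):
    """Assumed exponential decay of a destination node's interest over time t."""
    return initial_interest * math.exp(-decay_rate * t)


def should_keep_forwarding(t, threshold=0.2, initial_interest=1.0, decay_rate=0.01):
    """Continue delivery only while the decayed interest is above the threshold."""
    return interest_at(t, initial_interest, decay_rate) >= threshold


def cutoff_time(threshold=0.2, initial_interest=1.0, decay_rate=0.01):
    """Time at which the assumed decay curve reaches the threshold."""
    return math.log(initial_interest / threshold) / decay_rate


for t in (0, 100, 200):
    print(t, round(interest_at(t), 3), should_keep_forwarding(t))
print("cut-off time:", round(cutoff_time(), 1))
```

Under this assumed model, any forwarding effort spent after the cut-off time would be wasted, which is the situation the QoE threshold is meant to avoid.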

9.1.1 Motivation

According to the Mobile Economy 2013 review [6], “The key driver of data consumption growth today is video content. The growth in [the video content] share of all traffic from 54% in 2013 to 66% in 2017”. This expected growth in video content traffic is mainly attributed to the ability of 4G to stream high definition video and to ever-increasing connection speeds, which are forecast to increase sevenfold by 2017 [6]. This enables many emerging mobile phone applications such as social networking, video streaming and video calls. The review expects that “the forecast average speed around the world in 2016 would be enough to stream HD video through the mobile network to users’ phones” [6]. From another dimension, CISCO forecasts that IP traffic will grow to 194 Exabytes per month by 2020, almost triple that of 2015 (72.5 Exabytes per month) [167]. CISCO also forecasts a growth in mobile data from 3.68 Exabytes per month in 2015 to 30.56 Exabytes per month in 2020. More specifically, they project that streaming and downloads, which dominate the bandwidth at 28.76 Exabytes per month, will grow to more than 80 percent of all consumer Internet traffic by 2020 (109.9 Exabytes per month). With such high expectations for video-based mobile interactions, many fields of application, especially in developing countries, have tended to rely on mobile networks rather than overwhelming the available infrastructure. In such areas, the existing network infrastructure is not designed to serve HD video streaming. Furthermore, in some other places the network infrastructure could be missing entirely. Alternative solutions are required to offer such a video digest service without overwhelming the network infrastructure with frequent upload and download actions. The ubiquity of mobile phones worldwide makes these devices good candidates for several applications. One of these interesting applications is video content delivery within a university campus or a conference venue. In areas such as these, the percentage of mobile users among community members is significant, and therefore could facilitate the deployment of mobile opportunistic network applications and services. Relying on the mobile nodes of conference attendees or students, the required content can be delivered to interested users through opportunistic encounters among these mobile nodes; this could be done by establishing ad hoc connections, without adding a burden on the existing network infrastructure.

9.1.2 Problem Definition

Users in mobile environments frequently face interruptions while attending assembly events for various reasons, which leads to interrupted attendance and hinders their ability to follow the information discussed in the event. Most of these attendees are looking for a mechanism that gives them access to a recording of the parts they missed in order to compensate for the important missed content. There is a call for a service that detects their location through localization techniques in order to determine when they missed part of the assembly discussions. Accordingly, this service compiles the recorded portions each user has missed and delivers them in near real-time to these users for later viewing at their own pace.

If there is a recording of these events and a localization service that can detect when an attendee has temporarily left the place, both services feed information into a third service that stitches together the missed media content and hands it to a fourth service that is responsible for delivering it to the targeted member in real-time so that they can listen to the compiled digest at their own pace. These services are feasible and there are many contributions that fulfill these stages. However, we find the delivery stage challenging.

It frequently occurs that attendees of a certain event such as a lecture, a meeting or a conference session need to leave for a few minutes and then return. However, these frequent interruptions hinder the attendees’ ability to follow the information discussed in the event. It would be highly valuable if these attendees had access to a recording of the parts they missed to compensate for the important missed content.

Although this service is attractive to the audience, it could be an overhead on the serving entity, whether that is the conference technical system or the university’s network coupled with the application server. Assume that a conference is attended by 2000 participants. It is normal for a participant to want to attend several sessions held in parallel; such a participant would typically hop between parallel sessions to grasp some knowledge from each one. However, this participant then misses a lot of information from each of these sessions. If the application server detects this participant’s attendance in all of these parallel sessions (assume there are 3 parallel sessions), then the application compiles all the content he missed in these sessions and uploads this digest either to an internal server or to the web. The server then sends a notification to this participant to listen to the digest at his own pace. If we assume that at least half of the attendees missed on average 10 minutes from each session they attended, then the server has to compile and upload (1000 attendees x 10 min. x 2 sessions x 3 times a day x 5.3 MB ≈ 310 GB ≈ 2.5 Tb (terabits)). If the average upload bandwidth in a conference venue is about 10 Mbps, then the system has to push roughly 2.5 Tb through this link, bearing in mind that it should not occupy the whole network so that the other conference network activities remain uninterrupted.
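The estimate above can be reproduced with a few lines of arithmetic. The sketch below assumes, since the text does not state it explicitly, that 5.3 MB is the size of one minute of recorded video; all variable names are illustrative. It also indicates that the figure of 2.5 is obtained when the total is expressed in terabits, and it derives how long a fully dedicated 10 Mbps uplink would need to carry that volume.

```python
# Worked version of the daily upload estimate.
# Assumption: 5.3 MB is the size of ONE MINUTE of recorded video.
ATTENDEES_WITH_GAPS = 1000      # half of the 2000 participants
MISSED_MINUTES = 10             # average minutes missed per session
SESSIONS = 2                    # sessions per time slot, as in the text
SLOTS_PER_DAY = 3               # "3 times a day"
MB_PER_MINUTE = 5.3             # assumed video size per minute
UPLINK_MBPS = 10                # venue upload bandwidth in megabits/s

total_mb = (ATTENDEES_WITH_GAPS * MISSED_MINUTES * SESSIONS
            * SLOTS_PER_DAY * MB_PER_MINUTE)

total_gib = total_mb / 1024             # binary gigabytes, about 310
total_terabits = total_mb * 8 / 1e6     # decimal terabits, about 2.5

# Time needed if the whole uplink carried nothing but digests.
upload_seconds = total_mb * 8 / UPLINK_MBPS
upload_hours = upload_seconds / 3600

print(f"{total_mb:.0f} MB, {total_gib:.1f} GiB, {total_terabits:.2f} Tb")
print(f"upload time at {UPLINK_MBPS} Mbps: {upload_hours:.1f} hours")
```

Under these assumptions, one day of digests would need close to three days of a dedicated 10 Mbps uplink, which illustrates why an infrastructure-based upload is hard to sustain.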

On another front, the users will download the content to view it in their free time, which is normally limited to the conference break time slots. Thus, the users will overload the conference download bandwidth in short periods, which will slow down the network service for all the conference attendees and cause an unacceptable delay in downloading these digests. In other words, this common group action concentrates the peak download/access time within certain time slots, causing a higher probability of network failure or at least poor service during these peak times.

As a result, the users will not appreciate the offered service despite the overhead incurred in compiling, uploading and downloading the content. There is a need for an alternative solution that offers the same digest service without overloading the network infrastructure with bursts of uploads and downloads.

Another challenge that such applications would encounter is whether or not the conference attendees are willing to convey video content to other conference members in the place. Mobile device owners would be reluctant to exhaust their battery power and buffer space forwarding video content to other attendees unless there is an incentive that motivates them to do so. Several incentive techniques can encourage participation in this service, such as a credit system that counts the number of forwards each member has conducted so that this member is rewarded in compensation, or allowing this member to receive content that interests her as a reward.

9.1.3 Proposed Solution

Without overloading the university network infrastructure, such applications may utilize the prevailing mobile devices held by the majority of the students and staff members within the campus/conference. Content such as video clips can be delivered within a university or a conference over the course of 12-24 hours using the students’/attendees’ mobile devices instead of relying on the university/conference network infrastructure.

Typical Scenario

A typical scenario would involve delivering portions of a video in an opportunistic network. This would be most applicable in developing countries where universities have no network infrastructure or where the existing network infrastructure is not designed for this kind of extra load. The same scenario applies to video content delivery of conference sessions over the course of a conference.

• There is a laptop for recording the lectures/sessions in the class/conference room.

• Client applications compute the “n” minutes of the lecture that each attendee missed before entering the lecture room.

• The client application delivers the “n” missed minutes to attendee “x”, who has entered the room late.

• Member “y” has to leave the room early to catch another class/event. This member needs the remaining part of the lecture. The client application of member “y” sends a request to the client applications of y’s classmates/session attendees to deliver the remaining part.

• The client applications of the other in-place members pick chunks of the missed video portion. Each client application picks the chunk(s) it can relay based on its remaining buffer, its remaining power and the probability that this member will meet “y” later in the day. This colleague might meet student “y” in another class, in the assembly area or in another event based on their schedules/activity histogram. If they are conference members, they may come into contact during another session or in the lunch breaks and assembly meetings.

Objectives

The objectives to be met are:

• Delivering content in an opportunistic network without reliance on the network infrastructure, as it might not be available, or to relieve it from the overhead of delivering this content.

• Maximizing the delivery ratio.

• Limiting the replication overhead, as it indicates a waste of resources.

• Preserving privacy by relying on friends in content delivery. Thus, the algorithm should target friends/members of the same social group.

• Measuring content delivery under a certain level of QoE, since the delay in delivery affects the destination node’s interest in receiving the content. Based on other work [166], QoE is represented by a user’s interest in receiving the content, which decreases progressively.

• Ensuring fairness in utilizing nodes within the forwarding process. This can be measured via the fairness metric defined by [160].

9.1.4 Methodology

The suggested approach would rely on the contextual information and contact duration information when selecting the forwarder nodes. In addition, to satisfy the privacy preservation objective, the selected forwarders must be members of the same social group that the destination node belongs to or friends of her friends (FoF).
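The privacy constraint on forwarders can be illustrated with a small eligibility filter. This is only a sketch under stated assumptions: the `friends` and `groups` dictionaries and the function name are hypothetical, and direct friends of the destination are treated as eligible alongside friends of friends, which the text implies but does not state.

```python
def eligible_forwarders(candidates, destination, friends, groups):
    """Return candidates allowed to carry content for `destination`.

    friends: dict mapping a node to the set of its friends (hypothetical input)
    groups:  dict mapping a node to the set of social groups it belongs to
    A candidate qualifies if it shares a social group with the destination,
    or is a friend or a friend of a friend (FoF) of the destination.
    """
    direct = friends.get(destination, set())
    # Two-hop neighbourhood: friends plus friends of those friends.
    two_hop = set(direct)
    for friend in direct:
        two_hop |= friends.get(friend, set())
    two_hop.discard(destination)

    dest_groups = groups.get(destination, set())
    selected = []
    for node in candidates:
        shares_group = bool(groups.get(node, set()) & dest_groups)
        if shares_group or node in two_hop:
            selected.append(node)
    return selected


friends = {"d": {"a"}, "a": {"d", "b"}, "b": {"a", "c"}, "c": {"b"}, "e": set()}
groups = {"d": {"g1"}, "e": {"g1"}, "c": {"g2"}}
print(eligible_forwarders(["a", "b", "c", "e"], "d", friends, groups))
```

In this toy network, "a" qualifies as a friend, "b" as a friend of a friend, and "e" through a shared group, while "c" is three hops away and is filtered out.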

We propose three alternative solutions that may be examined:

1. The candidates that are favored in the forwarder selection process are those that have higher social popularity, have higher power and buffer capabilities, stay in contact with the carrier for a longer duration than others, and have a higher frequency of contact with the destination node(s). In this solution, the algorithm seeks candidates with contact durations that are long enough to transfer the messages from one node to another.

2. Build a weighted cognitive map [168] that includes all the targeted metrics and sets weights for their causal relationships. According to temporal changes in the affecting parameters, the weights are revisited and the final reward/penalty factors are computed and inserted into the rewarded adaptive candidate ranking function. This should allow the ranking function to adapt its weights dynamically based on the current context.

3. The algorithm transfers portions of the message between two nodes. In such a case, after transferring each portion t times, the carrier drops the transferred portion and focuses on transferring the remaining portions of the same message. In addition, the priority of each chunk is lowered each time it is transferred in order to increase the chances of transferring all the required portions of the requested content. A header that includes the order number of the chunk is added to each chunk. A priority function is then computed for each chunk to ascertain whether it should be forwarded or not. This priority function is inversely proportional to the number of times this chunk has been forwarded by this carrier. Accordingly, a chunk that has not been forwarded yet has the highest priority, while one that has been forwarded several times has a lower priority.
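The chunk-priority idea in the third alternative can be sketched as follows. This is a minimal illustration, not a finalized design: the `Chunk` class, the `1 / (1 + forwards)` form of the priority (the added 1 avoids division by zero for a chunk that has never been forwarded), and the tie-break on the lower order number are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    order: int          # order number carried in the chunk header
    forwards: int = 0   # times this carrier has forwarded the chunk


def priority(chunk):
    """Priority inversely proportional to the carrier's forward count."""
    return 1.0 / (1 + chunk.forwards)


def forward_next(buffer, max_forwards):
    """Forward the highest-priority chunk; drop it after `max_forwards` (t)."""
    if not buffer:
        return None
    # Ties are broken in favour of the lower order number (assumption).
    chunk = max(buffer, key=lambda c: (priority(c), -c.order))
    chunk.forwards += 1
    if chunk.forwards >= max_forwards:
        buffer.remove(chunk)   # carrier drops the chunk after t transfers
    return chunk.order


buffer = [Chunk(0), Chunk(1), Chunk(2)]
sent = [forward_next(buffer, max_forwards=2) for _ in range(6)]
print(sent, buffer)
```

With t = 2, every chunk is sent once before any chunk is sent a second time, which reflects the stated aim of increasing the chance that all portions of the requested content are transferred.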

9.1.5 Related Work

By surveying the related research work, we mainly find contributions in content delivery for DTNs that are based on routing protocols and require route establishment. For instance, Mashhadi et al. [160] offer a fair content distribution in participatory DTNs. In this approach, each node locally estimates the load of other nodes in the network, and uses this information to select the paths that messages should follow, thus spreading them evenly in the network. Moreover, each node locally monitors how much traffic it has been forwarding so that, should a critical limit be reached, it can back off, and rejoin later when less loaded. The authors implemented this approach on top of the Habit source-based DTN protocol and compared the new version to the original Habit, finding that the new approach distributes load fairly without compromising delivery ratio.

However, this work targets DTNs, where delay in delivery is tolerable, whereas in the domain of our applications a delay of up to a day is acceptable but not more than that. Also, they build their approach on DTN routing protocols but do not benefit from opportunistic networks as an asset. We intend to provide a fair content delivery approach in opportunistic networks.

Ma et al. present a content-aware transmission scheme with a preset delay threshold in two-tier heterogeneous networks [166]. The authors claim that their transmission scheme is power-efficient in that it reduces power cost by 18% while preserving the users’ interest by 90%. However, their application is based on the assumption that each mobile node is covered by macrocells while some nodes are covered by both macrocells and femtocells (femtocells are small, low-power cellular base stations designed for use in a home or small business), where most of the transmission will be completed once the users come into the coverage of femtocells. We are interested in the metrics they apply to overcome the tradeoff between users’ experience and power consumption; they consider power efficiency under a certain level of QoE to reach a balance. The authors modify the Nobel-winning Black-Scholes option pricing model so that it decreases slowly, and then use it to model the slowly decaying interest function. We propose applying this same metric to measure the users’ experience and interest in receiving the delivered content.

On another front, McNamara et al. proposed media sharing based on co-location prediction in urban transport [169]. In this research contribution, they use the history of the user’s mobility pattern in the transportation area as guidance for predicting how long this user will stay in contact with the content carrier, in order to increase the probability of successful media content transfer between the two nodes. However, their contribution is based on the probability of long colocation while commuting on public transport such as trains or buses. This application requires that the users explicitly keep records of their interests and also keep records of previous colocation with other riders to facilitate the prediction of future colocation. This is an overhead since one meets many strangers who ride the same public transport, and thus maintaining this history of colocation is storage consuming. Moreover, exchanging media with strangers makes the nodes vulnerable to security threats and privacy breaches. Thus, we propose reliance on friends with whom the user has pre-established trust and privacy settings.

To enable adaptive ranking functions, we consider the contribution of El Mougy et al. [168], who applied weighted cognitive maps in order to dynamically infer the reward/penalty weight for each of the metrics that the system should fulfill. The resulting values are then plugged into the ranking function. This approach enables dynamic adaptation of the system as per the current situation. Their system is applicable in WSNs, where they mainly care about the general relationship among the targeted metrics. That is, the causal relationship between any two metrics takes one of three values, namely 1, 0 and -1, to indicate a positively proportional relationship, no effect, and a negatively proportional relationship respectively. In this work, we will consider a more adaptive reward/penalty weighted cognitive map that sets a causal relationship with values x ∈ [-1,1] instead of the discrete set {1, 0, -1}. These weights show the relative causal relationship between each two targeted metrics.
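To make the idea of continuous causal weights concrete, the sketch below computes a reward/penalty factor per metric from a weight table with entries in [-1, 1]. It is a generic cognitive-map-style update written for illustration only: the metric names, the example weights, and the use of `tanh` to squash the accumulated influence are assumptions, and it is not claimed to be the procedure of [168].

```python
import math


def reward_penalty_factors(weights, changes):
    """Derive a reward (>0) or penalty (<0) factor for every metric.

    weights[src][dst] is the causal weight of metric `src` on metric `dst`,
    restricted to [-1, 1]; changes[src] is the observed change of `src`.
    """
    for src, row in weights.items():
        for dst, w in row.items():
            if not -1.0 <= w <= 1.0:
                raise ValueError(f"weight {src}->{dst} outside [-1, 1]: {w}")

    factors = {}
    for dst in changes:
        # Sum the influence of every metric's change on `dst`.
        influence = sum(weights.get(src, {}).get(dst, 0.0) * delta
                        for src, delta in changes.items())
        factors[dst] = math.tanh(influence)   # keeps the factor in (-1, 1)
    return factors


# Hypothetical map: more replication helps delivery but costs energy.
weights = {"replication": {"delivery_ratio": 0.6, "energy_left": -0.8}}
changes = {"replication": 0.5, "delivery_ratio": 0.0, "energy_left": 0.0}
factors = reward_penalty_factors(weights, changes)
print(factors)
```

With discrete weights in {1, 0, -1} the two relationships in this example would be equally strong; fractional weights let the energy penalty outweigh the delivery reward.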

9.1.6 Evaluation

In this section we discuss the simulation settings needed for evaluating our proposed approach. We briefly mention the datasets we intend to import into our simulation environment and the evaluation metrics we intend to use in order to compare the performance of the proposed solution in terms of the previously declared objectives.

Used Datasets

• For the mobility traces and social network data, we intend to use the Reality Mining dataset [170] with the Bluetooth recordings for location and contact duration detection. From the phone calls and SMS, a social network can be constructed.

• The SIGCOMM 2009 dataset [149] for a conference environment (which includes users’ encounters, interests, friend lists, friend lists added during the conference, interests added during the conference, and questionnaire answers). We may extrapolate social and interest profiles based on the existing data distribution to cover all the 12037 encountered users in the SIGCOMM mobility traces.

Evaluation Metrics

• The delivery ratio metric. Note that if the approach divides the video into chunks, the delivery ratio metric must take into account the ordered portions of video chunks that are successfully delivered.

• QoE, which is the quality of user experience and relies on a progressively decreasing interest function based on delay. It is a decreasing user’s interest function where delay is a deterministic factor in addition to some other random factors [166].

• Fairness in utilizing nodes during the content delivery process. According to Mashhadi et al. [160], fairness in content delivery = 1 - (SD/mean), where mean is the mean number of messages delivered by any node and SD is the standard deviation of the number of messages delivered per node.
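Two of the metrics above lend themselves to a short sketch. The fairness function follows the quoted formula 1 - (SD/mean); the use of the population standard deviation is an assumption, as the text does not say which estimator is meant. The in-order delivery ratio is one possible reading of "ordered portions": it counts only the chunks that form an unbroken prefix of the video, and both function names are illustrative.

```python
from statistics import mean, pstdev


def fairness(delivered_per_node):
    """Fairness = 1 - (SD / mean) over per-node delivery counts."""
    mu = mean(delivered_per_node)
    if mu == 0:
        raise ValueError("fairness is undefined when nothing was delivered")
    # Population standard deviation (assumption, see lead-in).
    return 1.0 - pstdev(delivered_per_node) / mu


def in_order_delivery_ratio(received_orders, total_chunks):
    """Fraction of chunks delivered as a gap-free prefix 0, 1, 2, ..."""
    received = set(received_orders)
    playable = 0
    while playable < total_chunks and playable in received:
        playable += 1
    return playable / total_chunks


print(fairness([5, 5, 5, 5]))                   # perfectly even load
print(fairness([4, 6]))                         # slightly uneven load
print(in_order_delivery_ratio([0, 1, 3], 4))    # chunk 2 is missing
```

A value of 1 indicates that all nodes delivered the same number of messages, and the value drops as the load concentrates on fewer nodes.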

9.2 Social Recommender Systems for Academic Social Networks

To resolve the challenge of overlooking the importance of integrating social and context-aware metrics in helper recommendation systems (challenge 8), we propose solutions that integrate social and context-aware metrics in recommendation systems in order to improve their recommendation quality in terms of precision and recall.

In academic environments that involve communities of researchers and students, a need arises to mix and match mobile users according to their specific interests. For example, matching possible research partners who match or complement each other in terms of interest, and connecting students with student academic helpers, are two services that can enrich an academic community. We take them as potential scenarios below: the academic researchers case and the academic student helpers case. A social recommendation system that produces useful recommendation lists would provide a valuable service to members of an academic community.

In the academic researchers’ case, we have reviewed the literature and found recent related work that proposes relying on affiliation and geographical location as social metrics for candidate selection [171]. Another significant research contribution introduces a supportive collaboration index for each researcher upon which researchers are ranked, and thus the highest ranked candidates are recommended for collaborative research [172]. We hypothesize that these metrics are not sufficient to achieve accurate results and that there remains a need to propose more relevant combinations of metrics from both the social and the context-aware aspects.

Furthermore, in the social helpers’ case, students face problems in their academic study when working on their assignments/projects or while studying for exams. They seek help and guidance from other colleagues who are more experienced in the same field/course. However, these students may not be aware of who these skillful helpers are. They also may not be aware of which of these skillful helpers have already taken this course in previous semesters. On one side, the students do not know when their colleagues who currently attend the same course, and are likely good helpers, are free or available. On the other side, they also do not know which senior colleagues could be of great help, what their free time slots are, and which places they frequent.

It would be beneficial to follow the suggestions made by the authors of the book [173], who point out that Social Filtering can be a valuable source for recommendations in certain settings but may pose problems for others. For instance, domains in which social network data is too sparse are not well suited for the method. Their recommendation for future research work is to shed more light on these issues and to continue to investigate the limits and chances of Social Filtering.

The aim of our proposed solution is to provide a supporting means to improve the research productivity of researchers towards establishing a cooperative research environment via the provision of a more accurate academic researcher ranking and expert recommendation system. Figure 9.1 visualizes the challenges that Academic Social Recommender Systems face (numbered 6 through 8) and lists the proposed solutions for each challenge. The link between each challenge and its proposed solution is labeled with the challenge number.

In more detail, we propose integrating interest-awareness with the other social metrics for recommendation in order to address the underestimated importance of integrating interest awareness into social recommendation decisions and to improve the choice of proper social metrics for the recommendation process (challenges 6 and 7). To demonstrate these solutions, we model the case of academic researchers. In this case, we will introduce more effective social metrics to improve the performance of social recommender systems in academic social networks. These metrics include (although they may change during experimentation): common interest in terms of the field of research, common co-authors that both researchers have worked with previously, common keywords in both researchers’ published papers, the researchers’ ranks as per the number of citations of their own work, the rank of the researcher’s published papers based on the rank of the publishing conferences/journals, and a modified version of the so-called collaboration supportiveness index [172] of each researcher.

To resolve the challenge of overlooking the importance of integrating social and context-aware metrics in helper recommendation systems (challenge 8), we propose solutions that integrate social and context-aware metrics in recommendation systems. We can demonstrate our solutions in the academic student helpers’ model. In this scenario, there is a need for an academic social recommender that searches a database of university students and teaching assistants, and filters it based on social and context-aware criteria/filters in order to come up with a relevant set of recommended helpers. However, this recommender has to be provided with the students’ mobility profiles and their schedule of classes. We propose building a contextual histogram as a helpful tool for summarizing the user’s routine activities. This tool can then be a building block in developing social recommender systems by providing the main activities of the users as well as the fine-grained activities. Such information supports the social recommender systems in the process of inferring high-level context as a step towards inferring situations and user behavior. In addition, we intend to set one of the metrics for helper selection so that it favors candidates who are socially popular and active in communicating with others. This quality can be measured through the social rank of the candidates and their activeness in communications in terms of a change in degree of connectivity and the frequency of calls/contacts. If the schedules of the previous semesters of a particular student are accessible, then we can identify the student as a social helper based on whether that student completed the same course in previous semesters or is currently registered in it. Thus, the proposed social and context-aware metrics for this system are the users’ free time slots, the courses attended by each user, the change in degree of connectivity, the common friends, and the academic major of both the recommended helper and the student who requests the recommendation.

9.2.1 Motivation

In academic research communities, researchers seek collaborations with other researchers either from the same academic institute or from external research institutes in order to foster effective research collaboration. Similarly, in academic institutes, students may face difficulties in their study; accordingly, they seek help from senior or more experienced colleagues.

9.2.2 Problem Definition

An improper combination of social and context-aware metrics for ranking experts/researchers leads to an ineffective recommendation process.

9.2.3 Proposed Solution

We propose ranking the candidates by computing a helper rank index that is a combination of the following metrics:

For the academic researchers’ social network case: common interest in terms of the field of research, common co-authors that both researchers have worked with previously, the rank of the co-authors, common keywords in both researchers’ published papers, the researcher’s rank as per the number of citations of his own work, the rank of the researcher’s published papers based on the rank of the publishing conferences/journals, and a modified version of the so-called collaboration supportiveness index of each researcher, as per another research contribution [172].

For the student helpers’ case: the free time slots, the attended courses, the change in degree of connectivity, common friends, and common academic major. We propose building a contextual histogram as a helpful tool for summarizing the user’s routine activities. Applications built on this histogram would give users the option of zooming in to view their activities in detail with fine granularity, and of zooming out to grasp the bigger picture of their main activities. This tool then becomes a building block in social recommender systems by providing them with the main activities of the users as well as the fine-grained activities. Such information supports the social recommender systems in inferring high-level context as a step towards inferring situations and user behavior.
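One way to picture the zoom-in and zoom-out views of a contextual histogram is as two levels of aggregation over the same activity log. The sketch below is an assumption-laden illustration: the record layout `(time_slot, main_activity, fine_activity)`, the function names and the sample data are hypothetical and are not part of the proposed design.

```python
from collections import Counter


def fine_histogram(records):
    """Zoomed-in view: count every (slot, main activity, detail) triple."""
    return Counter((slot, main, detail) for slot, main, detail in records)


def zoom_out(fine):
    """Zoomed-out view: merge details into (slot, main activity) counts."""
    coarse = Counter()
    for (slot, main, _detail), count in fine.items():
        coarse[(slot, main)] += count
    return coarse


# Hypothetical observations collected over several weeks.
records = [
    ("Mon 10:00", "class", "CS101 lecture"),
    ("Mon 10:00", "class", "CS101 lecture"),
    ("Mon 10:00", "class", "CS101 lab"),
    ("Mon 12:00", "break", "cafeteria"),
]
fine = fine_histogram(records)
coarse = zoom_out(fine)
print(coarse.most_common(1))
```

A recommender could consult the coarse view to find slots in which a candidate helper is routinely free, and the fine view to check which courses that candidate attends.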

Furthermore, we propose setting one of the metrics for helper selection to be whether or not the candidates are socially popular and active in communicating with others. This quality can be measured through the social rank of the users and their activeness in communications in terms of the change in degree of connectivity and the frequency of calls/contacts. If the class schedules of previous semesters are available for these students, another criterion must be fulfilled: a social helper must have attended the same course in previous semesters or be currently registered in the same course.

The above social metrics will be examined to improve the performance of social recommender systems in academic social networks. However, these metrics may be modified or changed during experimentation.

9.2.4 Methodology

We have planned a set of steps to set up the simulation runs and to compute the candidates’ ranking metrics. These steps are as follows:

1. We will collect a database of researchers from IEEE Xplore or a similar research library.

2. We then compute the social metrics and context-aware metrics.

3. From these metrics, we compute the candidates’ ranks.

4. We set the simulator to randomly select 100 researchers to represent the ones who request collaboration recommendations from the academic recommender system.

5. To each one of the randomly selected requesters, the algorithm proposes a list of at most ten researchers. The ten recommended researchers are the ones with the highest ranks according to the metrics they have in common with the requester.

6. The simulator then computes the evaluation metrics to evaluate the recommender’s performance.
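Steps 3 to 5 above can be sketched as a small simulation loop. The code is illustrative only: the researcher record layout, the equal-weight score built from a shared field, shared co-authors and shared keywords, and the fixed random seed are assumptions, and the real rank index would combine the full set of metrics listed in Section 9.2.3.

```python
import random


def common_score(a, b):
    """Equal-weight count of what two researchers share (assumed scoring)."""
    return (int(a["field"] == b["field"])
            + len(a["coauthors"] & b["coauthors"])
            + len(a["keywords"] & b["keywords"]))


def recommend(requester_id, researchers, top_n=10):
    """Return at most `top_n` researcher ids, highest common score first."""
    me = researchers[requester_id]
    scored = [(common_score(me, other), rid)
              for rid, other in researchers.items() if rid != requester_id]
    scored = [(s, rid) for s, rid in scored if s > 0]   # nothing in common
    scored.sort(key=lambda pair: (-pair[0], pair[1]))   # stable tie-break
    return [rid for _, rid in scored[:top_n]]


def run_simulation(researchers, n_requesters=100, seed=0):
    """Randomly pick requesters and build a recommendation list for each."""
    rng = random.Random(seed)
    ids = sorted(researchers)
    requesters = rng.sample(ids, min(n_requesters, len(ids)))
    return {rid: recommend(rid, researchers) for rid in requesters}


researchers = {
    "r1": {"field": "dtn", "coauthors": {"x"}, "keywords": {"routing", "social"}},
    "r2": {"field": "dtn", "coauthors": {"x"}, "keywords": {"routing"}},
    "r3": {"field": "ml", "coauthors": set(), "keywords": {"social"}},
    "r4": {"field": "hci", "coauthors": set(), "keywords": set()},
}
print(recommend("r1", researchers))
results = run_simulation(researchers)
```

The resulting lists are what step 6 would then score with the evaluation metrics.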

9.2.5 Related Work

The research field of social recommender systems is rich with contributions; however, not many contributions address academic social recommender systems. Moreover, the proposed attempts for academic social networks do not properly select the effective social and context-aware metrics for ranking helpers. One of the most notable contributions published in recommendations for academic social networks is presented by Brando et al. [171]. They rely on two social principles, homophily and proximity, which are measured through two social metrics, namely affiliation and geographical localization. They then introduce three social-aware evaluation metrics, which are novelty, diversity and coverage. They apply their approach in the process of recommending research collaborations among researchers selected from academic research databases. We intend to compare our proposed contribution against their published work, setting theirs as a benchmark.

Some useful contributions have been made in measuring the ability of a researcher in contributing to the academic research collaboration. Liu et al. [172] introduced a collaboration supportiveness index to measure the net support of each researcher in the collaboration network. This index is calculated as the difference between the support provided by the researcher to others and the support this same researcher receives from others. The researchers are ranked based on this index.
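The net-support idea can be written down directly from the description above. In this sketch the way "support" is quantified is left open: the `support` dictionary of (giver, receiver) amounts is a hypothetical input, and the code is not claimed to reproduce how [172] derives those amounts.

```python
from collections import defaultdict


def supportiveness_index(support):
    """Net support per researcher: support given minus support received.

    support: dict mapping (giver, receiver) to a non-negative amount
    (hypothetical input; how support is measured is not specified here).
    """
    given = defaultdict(float)
    received = defaultdict(float)
    for (giver, receiver), amount in support.items():
        given[giver] += amount
        received[receiver] += amount
    everyone = set(given) | set(received)
    return {r: given[r] - received[r] for r in everyone}


def rank_by_supportiveness(support):
    """Researchers ordered from most to least net-supportive."""
    index = supportiveness_index(support)
    return sorted(index, key=lambda r: (-index[r], r))


support = {("a", "b"): 3.0, ("b", "a"): 1.0, ("a", "c"): 2.0}
print(supportiveness_index(support))
print(rank_by_supportiveness(support))
```

Researcher "a" gives 5 units and receives 1, so it ranks first with a net index of 4, while "b" and "c" both end up with a negative index.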

Another vital criterion for ranking researchers is to give weight to the number of papers that cited their own research contributions. This approach follows the PageRank concept, which is applied in Google and some other search engines for ranking documents. For instance, there are several measures that are based on the PageRank concept and that are used to evaluate an individual scholar’s productivity and the impact of the published work [172]. Moreover, some proposed research approaches apply the PageRank concept in ranking the social helpers based on both their skills and their history of helping others. For instance, Ding et al. [174] propose a method to introduce one or several persons to help a learner when she meets difficulty in study by applying the PageRank algorithm to rank the helpers in a social graph. Their selection criteria for helpers are based on community detection and link analysis. The selected helpers should be learners who are interested in the current user’s problems, as they share common interests with this student. The authors also mention that all the candidates must be sorted by their ranks, whose values indicate their skill level. The authors then propose applying community recovery by adding internal edges, characterized by having the highest betweenness, in order to link communities according to their high common interests. However, the authors did not conduct any experiments or simulations to demonstrate their proposed idea. They also do not assess the users’ experience or feedback on the provided recommendation lists.
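For readers unfamiliar with the PageRank concept referred to above, a plain power-iteration version is sketched below. It is the textbook formulation with a damping factor of 0.85 and uniform redistribution of the rank of nodes without outgoing links; the edge meaning (an edge u → v read as "u cites or was helped by v") and the sample graph are assumptions, and the cited works may apply variants of this scheme.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    graph: dict mapping a node to a list of distinct nodes it points to.
    Returns a dict of ranks that sum to 1.
    """
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n
                    for v in nodes}
        for u, targets in graph.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank


# Hypothetical helper graph: "a" and "b" were helped by "c"; "c" by "a".
graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Node "c" receives links from two others and ends up with the highest rank, while "b", which nobody points to, ends up lowest.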

Some research works study the impact of incorporating social contextual information on improving the performance of recommender systems. Ma et al. [175] propose a factor analysis approach based on probabilistic matrix factorization to alleviate the data sparsity and poor prediction accuracy problems by incorporating social contextual information, such as social networks and social tags. Through complexity analysis, they show that their approach can be applied to very large datasets since it scales linearly with the number of observations. Moreover, the experimental results show that their method improves the recommender system’s performance in the cases where users have made few ratings. However, this work relies on social tagging which requires users’ involvement in the ranking/recommendation process.

There is a need for recommender systems that can deduce ranks from existing social context information without reliance on users’ feedback in case of its sparsity or in case of subjective feedback from users.

Another significant work, presented by Xu et al. [176], demonstrates the importance of combining social network and semantic concept analysis for personalized academic researcher recommendation. In this work, the authors propose a two-layer network-based approach for researcher recommendation which combines social network analysis and semantic concept analysis in a unified framework.

Evaluating the proposed recommendation metrics is a crucial component of the research contribution. By surveying similar work, we find some intuitive recommendation and evaluation metrics such as those presented by Cremonesi et al. [177]. In their work, they provide useful social metrics and evaluation metrics in the study of the performance of recommender algorithms on top-N recommendation tasks. Among these social metrics for ranking, they present Top-N item recommendation, Correlation Neighborhood, Non-normalized Cosine Neighborhood, Asymmetric-SVD, and PureSVD. They evaluate these metrics using the recall@N and precision(recall@N) evaluation metrics, the latter setting precision as a function of recall. Another useful contribution is presented by Carmel et al. [178], who show the evaluation of personalized social search based on the user’s social network. This is accomplished through the following evaluation metrics: normalized discounted cumulative gain (NDCG) and precision@N.
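The evaluation metrics named above have common set-based definitions, sketched below for a single recommendation list. These are the standard binary-relevance forms, written here for illustration; the exact protocols used in [177] and [178] may differ, and the function names and sample data are assumptions.

```python
import math


def precision_at_n(recommended, relevant, n):
    """Share of the top-n recommendations that are relevant."""
    hits = sum(1 for item in recommended[:n] if item in relevant)
    return hits / n


def recall_at_n(recommended, relevant, n):
    """Share of all relevant items that appear in the top n."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:n] if item in relevant)
    return hits / len(relevant)


def ndcg_at_n(recommended, relevant, n):
    """Binary-relevance NDCG: DCG of the list over DCG of an ideal list."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(recommended[:n]) if item in relevant)
    ideal = sum(1.0 / math.log2(pos + 2)
                for pos in range(min(n, len(relevant))))
    return dcg / ideal if ideal else 0.0


recommended = ["a", "b", "c", "d"]   # ranked output of a recommender
relevant = {"a", "c", "e"}           # ground-truth collaborators/helpers
print(precision_at_n(recommended, relevant, 4))
print(recall_at_n(recommended, relevant, 4))
print(round(ndcg_at_n(recommended, relevant, 4), 4))
```

Precision and recall ignore the position of a hit within the top N, whereas NDCG rewards lists that place relevant candidates earlier.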

9.2.6 Evaluation

In the case of the academic researchers, we need to implement the social recommender of [171] and compare its performance to our proposed social-and-context-aware metrics.

For evaluating the student helpers’ case, we implement a recommender system with and without the use of a histogram of the user’s activities, then compare the recommender system’s performance in both cases.

Used Datasets

We herein briefly mention which datasets we intend to import in the simulator and why we chose these datasets. Details about some of these selected datasets which are already imported in the SAROS simulator are available in Section 4.2.3.

• Foursquare Dataset: We will import the Foursquare Dataset [179] to simulate the social graph of followers. The Foursquare Dataset has data on users’ check-in times and locations in addition to their friend lists. The original dataset, which is not published, contains the users’ comments and which friends followed these comments by checking in at the same locations. The author recommends assuming that any friend v who checks in at a location where user u has recently checked in has followed user u’s comment/tip. This dataset contains the check-in history of 18107 users ranging from March 2010 to January 2011. For each user, we have his social network (115574 links), previous check-in locations and the corresponding check-in times (2073740 check-ins). This dataset also includes another set of traces that contains the check-in history of 11326 users ranging from January 2011 to July 2011. For each user, besides his social network (47164 links) and check-in history (1385223 check-ins), there is also his hometown location.

• St. Andrews University Dataset: This dataset represents a university environment [156] and includes users’ encounters, friend lists/social groups, events that users respond to, and users’ answers to interview questions.

• SIGCOMM Dataset: This dataset represents a conference environment [149] and includes users’ encounters, interests, friend lists, added friend lists, added interests, questionnaire answers, presenters’ talks and the places of the presentations. We will then compile a research database of the presenters of the conference.

• A set of gathered data that is fed from several sensors such as mobile sensors, schedule of classes, and the social profile. The social profile could be extracted from the public social profile of Facebook, or a questionnaire filled by the participants.

• Reality Mining dataset [170]: The Reality Mining project was conducted from 2004-2005 at

the MIT Media Laboratory [170]. This project followed 94 persons using mobile phones that,

with the aid of other software, recorded and sent the participant’s data about call logs, Bluetooth

devices in proximity of approximately five meters, cell tower IDs, application usage, and phone

status. The data collection process lasted for nine months and included students and faculty from

two programs within a major research institution. In addition, the project collected self-reported

relational data, as the participants were asked about their proximity to, and friendship with,

others. For the mobility traces and social network data, we will use the Reality Mining dataset [170]

with the Bluetooth recordings for location and contact duration detection. From the phone calls

and SMS, a social network can be constructed. This dataset will be imported to enable simulations

of constructing a contextual histogram. In such a case, the dataset is divided into two subsets; one

subset will be used as a training set while the other subset will be used as the test set.

• The personal traces dataset: This dataset contains 142 days of mobile phone records (aka Call

Data Records) and ground-truth movement description of Czech Ph.D. student Michal Ficek, stored

by his own mobile terminal in 2010-2011. This dataset is available on CRAWDAD [180]. The

dataset covers more than 99.99% of 142 days of mobile phone usage in mobile networks of 8 different

providers in 5 countries: Czech Republic, Slovak Republic, Germany, Austria and the USA. The

source of the data is user’s own mobile phone Nokia E52. The publicly available LogExport

application was used to record time and type of communication events (voice, SMS, data). For

cell-transition recording, the free CellTrack91 application was utilized. The coordinates of positions

within the cells were obtained by translating the Cell-IDs to their geographical coordinates by

querying the Google Location API. Thus, the dataset records a student’s activities (such as mobile

calls, SMS, location detection) on a mobile device for months during his travel to few countries.

• The imported AUC mobility traces data will be used to summarize the traces into the level

of records which will then be grouped per user. Then we feed in information in the form of time

spent inside classrooms, in sports area, in food court, in library, the summation of the time spent

by each person while he was moving. We then use this data in the simulator and merge them with

Twitter social profiles [163] [181]. Alternatively, to make the collected AUC traces useful for

simulations, there is a need to collect the 100 users’ public Facebook profiles and their schedule of

classes during the traces duration. Thus, we will request an IRB approval for collecting the 100

users’ schedule of classes for the period Oct 2012 till Jan 2013. Also, there will be a request to

create a Facebook account and invite these 100 users to become friends of this account and thus

we have access to their public profiles.

• Import the Facebook dataset [182] that includes the friendship, few privacy settings and social

networks of 957,000 users by crawling Facebook in 2009. This dataset was collected during April-

May 2009. It contains the following two representative samples of Facebook users with a few

annotated properties:

– MHRW: A sample of 957,000 unique users obtained Facebook-wide by Metropolis-Hastings

random walks, which is shown to closely approximate the ground truth.

– UNI: A uniform sample of 984,000 unique users that represents the ground truth. User IDs

were selected uniformly at random from the platform’s 32-bit ID space, essentially a rejection

sampling procedure.

For each dataset, the authors release two files. The first file contains for each sampled user ID,

the number of times the user was sampled and the user IDs of his/her friends. The second file

contains additional node properties for each sampled user. For each sampled user ID, the dataset

file has the number of times sampled, the total number of friends, privacy settings and network

membership. User IDs and network IDs are anonymized.
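The follow-inference rule described for the Four Square dataset above (a friend v who checks in at a location where user u has recently checked in is assumed to have followed u's tip) could be sketched roughly as follows. This is only an illustrative sketch: the record layout, the function name `infer_follows`, and the seven-day window standing in for "recently" are our assumptions and are not taken from the dataset or from the SAROS simulator.

```python
from collections import defaultdict

# Hypothetical window standing in for "recently"; the source does not fix a value.
RECENT_WINDOW_SECONDS = 7 * 24 * 3600


def infer_follows(checkins, friends, window=RECENT_WINDOW_SECONDS):
    """Return a set of (v, u, location) triples, meaning that friend v is
    assumed to have followed user u's tip at that location.

    checkins: iterable of (user, location, timestamp_in_seconds) tuples.
    friends:  dict mapping each user to the set of that user's friends.
    """
    by_location = defaultdict(list)
    for user, location, timestamp in checkins:
        by_location[location].append((timestamp, user))

    follows = set()
    for location, visits in by_location.items():
        visits.sort()  # chronological order
        for i, (ts_v, v) in enumerate(visits):
            # Look back at the visits sorted before v's check-in at this location.
            for ts_u, u in visits[:i]:
                is_friend = u in friends.get(v, set())
                if u != v and is_friend and 0 < ts_v - ts_u <= window:
                    follows.add((v, u, location))
    return follows
```

With such a function, the inferred triples could then be used as directed "follower" edges when building the social graph; a shorter or longer window would make the inference stricter or more permissive.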

Evaluation Metrics

The following list includes the evaluation metrics that will be applied to evaluate the performance of the proposed social recommendation systems:

• Recall@N and Precision@N as defined in [177]: the recall and precision computed over the top N items.

• NDCG (normalized discounted cumulative gain) as defined in [178].

• Coverage as defined in [171]: In recommender systems, coverage is represented by a metric that

measures how unequally the different items are recommended to users. Following [171], we are

going to use Shannon Entropy and Gini Index together to measure coverage.

• Diversity as defined in [171]: In some cases, suggesting a set of similar items may not be useful

for users since, for instance, researchers from the same research area probably already know each

other. Following the approach recommended in [171], diversity in a recommendation list is most

commonly measured using the intra-

list similarity metric.

• Novelty as defined in [171]: New recommendations are indications of items that users do not know

and would not know in absence of a recommender algorithm. The novelty metric aims to quantify

the "novel" characteristic in a recommendation list. We will follow the method adopted by [171]

in computing novelty.

• Also, we have to come up with a quality of user experience (QoE) metric to evaluate to what

extent users find the produced recommendation lists useful.
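To make the ranking and coverage metrics listed above more concrete, the following is a minimal sketch of Recall@N, Precision@N, NDCG, Shannon entropy and the Gini index. It uses common textbook formulations (binary relevance and a log2 discount for NDCG), which are our assumptions and may differ in detail from the definitions in [177], [178] and [171]; all function names are hypothetical.

```python
import math


def precision_recall_at_n(ranked, relevant, n):
    """Precision and recall over the top-n items of a ranked list."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    precision = hits / n if n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def ndcg_at_n(ranked, relevant, n):
    """Binary-relevance NDCG with a log2 position discount (assumed form)."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked[:n]) if item in relevant)
    ideal_hits = min(len(relevant), n)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


def shannon_entropy(counts):
    """Entropy (in bits) of how often each item was recommended."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def gini_index(counts):
    """Gini index of recommendation counts: 0 means every item is recommended
    equally often; values near 1 mean recommendations concentrate on few items."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n
```

Here `counts` holds, for each item in the catalogue, the number of times it appeared in any recommendation list; a high entropy together with a low Gini index would indicate that recommendations are spread more evenly over the items.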

9.3 Conclusion

We are enthusiastic about pursuing the ideas proposed above in future work. We hope to come up with novel approaches that help improve the accuracy and novelty of social recommender systems. We also hope that our proposals in the domain of content delivery improve the performance of such systems by taking into consideration the quality of user experience.

Appendices

Appendix A

The Full Set of Simulation Results of

PIPeR

The following set of results shows the simulation runs that were conducted to compare the performance of PIPeR to IPeR, SCAR and Epidemic. These simulation runs were conducted three times: once assuming that all the nodes start with a full battery (Figure A.1, titled Full Battery), once assuming a normal distribution of battery levels among the nodes (Figure A.2, titled Normal Battery), and a third time assuming that the nodes’ power is distributed according to the heatmap discussed in Section 4.2.2 (Figure A.3, titled Heatmap Battery). Each Figure contains 10 sub-figures that mainly represent the following metrics of comparison: cost, delivery ratio, delay, effectiveness, power awareness, and fairness.

The whole set of simulations is conducted once using the SLAW mobility model, and another time using the mobility traces of the SIGCOMM09 dataset.

A.1 SLAW Mobility Set

The following set of results is based on the SLAW mobility model with SInt(source,msg) = 0.3.

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.1: Power versus Delivery Ratio

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.2: Cost versus Delivery Ratio

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.3: Cost over time

In the full battery and normal battery experiments (Figures A.1a and A.1b), the Opportunistic version of the fixed threshold PIPeR (P50Opp) achieves the highest delivery ratio approaching that achieved by Epidemic. The IPeR algorithm is the next algorithm in achieving the highest delivery ratio.

In the heatmap experiment (Figure A.1c), the P50Opp algorithm fails to achieve a high delivery ratio, while the IPeR algorithm preserves its performance across all battery environments. On the other side, the adaptive versions of PIPeR always achieve low delivery ratio with low power consumption.

Figures A.2a, A.2b and A.2c illustrate that P50Opp is not cost effective since the slope of the cost versus delivery ratio is steeper than the other PIPeR versions. Despite their low achieved delivery ratio, the adaptive versions and the SCAR algorithm are cost effective since their slopes are the gentlest among all the other versions.

All our proposed versions succeed in reducing the paid cost to at most half of that paid by the Epidemic algorithm. It is noticeable that the P50Opp algorithm pays more cost than the other algorithms, but only to achieve a higher delivery ratio, as shown in Figures A.3a, A.3b and A.3c. In the heatmap experiment (Figure A.3c), the P50Opp algorithm pays less cost, equivalent to that of the PAdOpp algorithm.

Figures A.4a, A.4b and A.4c illustrate that, across the various battery environments, all the algorithms follow the same pattern of delivery ratio as that produced by the Epidemic algorithm. However, the IPeR algorithm and the SCAR algorithm show steeper curves in their progressive delivery ratio.

From the Categories of Consumed Batteries figures (Figures A.5a, A.5b and A.5c) and the final mean and SD figures (Figures A.6a, A.6b and A.6c), the SCAR algorithm and the adaptive versions of the PIPeR algorithm consume the least amount of power as they preserve a high mean of battery distributions across various distributions. However, SCAR loses this feature in the heatmap experiment whereas all the PIPeR versions preserve the highest mean of battery distribution.

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.4: Delivery ratio over Time

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.5: Categories of Consumed Battery

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.6: Final Mean Battery and Standard Deviation

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.7: Variance over Time

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.8: Power Consumption over Time in Opportunistic fixed threshold

From the variance figures (Figures A.7a, A.7b and A.7c), the P50Opp algorithm achieves the least variances among the battery distributions except in the heatmap environment where the IPeR algorithm achieves the least variance towards the end of the simulation run.

From the power consumption in the opportunistic fixed threshold figures (Figures A.8a, A.8b and A.8c), it is clear that the P50Opp algorithm consumes a large share of the batteries in the normal distribution, ending up with 80% of the community at a remaining battery level below 25%. The P50Opp algorithm, though, stabilizes in its consumption of the batteries by the middle of the simulation run.

From the power consumption in the non-opportunistic fixed threshold figures (Figures A.9a, A.9b and A.9c), it is clear that the P50 algorithm consumes less of the batteries in the normal distribution (less steep curves), ending up with 60% of the community at a remaining battery level below 25%. Also, the P50 algorithm stabilizes in its consumption of the batteries by the middle of the simulation run.
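Statements such as "60% of the community ends with a remaining battery level below 25%" amount to binning the nodes' remaining battery levels into categories. A minimal sketch of that binning is given below; it assumes four equal bands of 25% each and battery levels expressed as fractions in [0, 1], both of which are our assumptions for illustration, and the function name is hypothetical.

```python
def battery_category_shares(levels):
    """Share of nodes in each remaining-battery band.

    levels: remaining battery of each node as a fraction in [0, 1].
    The four 25%-wide bands are an assumed categorisation.
    """
    labels = ["0-25%", "25-50%", "50-75%", "75-100%"]
    counts = dict.fromkeys(labels, 0)
    for level in levels:
        index = min(int(level * 4), 3)  # a full battery (1.0) falls in the top band
        counts[labels[index]] += 1
    total = len(levels)
    return {label: (count / total if total else 0.0)
            for label, count in counts.items()}
```

Evaluating this at successive time steps of a simulation run would give the kind of per-category curves over time that the power consumption figures plot.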

From the power consumption in the opportunistic adaptive threshold figures (Figures A.10a, A.10b and A.10c), it is clear that the PAdOp algorithm consumes much less of the batteries in the full battery distribution than the P50Opp algorithm does in the previous figures. In both the normal battery distribution and the heatmap battery distribution (Figures A.10b and A.10c), the PAdOp algorithm succeeds in increasing the number of nodes that belong to the two intermediate categories, which increases the utilization fairness.

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.9: Power Consumption over Time in w/o Opportunistic fixed threshold

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.10: Power Consumption over Time in Opportunistic Adaptive threshold

From the power consumption in the non-opportunistic adaptive threshold figures (Figures A.11a, A.11b and A.11c), it is clear that the PAd algorithm consumes the least amount of battery power in all the battery distributions.

From the interest-based effectiveness figures (Figures A.12a, A.12b and A.12c), the opportunistic fixed threshold PIPeR version succeeds in achieving the highest delivery ratio and the highest interested forwarders ratio, except in the heatmap simulation run, where the IPeR algorithm achieves the highest delivery ratio and the highest interested forwarders ratio. It is also obvious that the non-opportunistic PIPeR versions fail to contact a high ratio of interested forwarders.

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.11: Power Consumption over Time in w/o Opportunistic Adaptive threshold

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.12: Effectiveness: Interest-based node classification

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.13: Effectiveness: Recall/Precision/Accuracy

From the recall/precision figures (Figures A.13a, A.13b and A.13c), the opportunistic fixed threshold PIPeR version achieves the highest precision except in the heatmap simulation run when the IPeR algorithm achieves the highest precision. As for recall, the adaptive PIPeR versions achieve the highest recall value.

From the power consumption figures (Figures A.14a, A.14b and A.14c), the IPeR algorithm consumes less power than the opportunistic fixed threshold PIPeR version but its power consumption steeply grows towards the end of the simulation to become equivalent to the opportunistic fixed threshold PIPeR version in terms of power consumption. However, in the heatmap simulation, the IPeR algorithm consumes the highest percent of power when compared to the other PIPeR versions.

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.14: Power Consumption over time

Figure A.15: 7-Metrics Space in Full Battery Distribution

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.16: Power versus Delivery Ratio

From the 7-metric figure (Figure A.15), the PAd algorithm consumes the least amount of power, pays the least cost, does not contact any uninterested nodes in order to achieve medium delivery ratio with medium fairness, but it does not contact any interested forwarders. On the other extreme, among the compared PIPeR versions, the opportunistic fixed threshold PIPeR version achieves the highest delivery ratio, contacts the highest ratio of interested forwarders, achieves the highest level of fairness and incurs the least delay, but it consumes the highest percent of power and pays the largest cost.

A.2 Mobility Traces of SIGCOMM09 Dataset

The following set of results is based on the SIGCOMM09 dataset with SInt(source,msg) = 0.3.

From the power versus delivery ratio figures (Figures A.16a, A.16b and A.16c), it is clear that the IPeR algorithm consumes the least amount of power to achieve the highest delivery ratio. It is also noticeable that P50Opp reaches the same level of performance towards the end of the simulation. However, the slope of P50Opp in the heatmap experiment shows more power consumption to achieve less delivery ratio. It is observable that the slope of the P50 algorithm is similar to that of IPeR, but in the heatmap experiment it is the second best algorithm after IPeR.

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.17: Cost versus delivery ratio

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.18: Cost over time

According to the cost versus delivery ratio figures (Figures A.17a, A.17b and A.17c), the adaptive versions and the SCAR algorithm pay the least cost to achieve a moderate delivery ratio. Among all the algorithms, the IPeR algorithm achieves a mid-way cost versus delivery ratio across all battery environments. Although the P50Opp algorithm achieves the highest delivery ratio, it pays the highest cost.

By focusing on paid cost over time (Figures A.18a, A.18b and A.18c), PAd and SCAR pay the least cost across environments, but the P50Opp algorithm pays the highest cost that reaches four thirds of that paid by IPeR.

From the delivery ratio over time figures (Figures A.19a, A.19b and A.19c), it is noticeable that the SCAR algorithm, followed by the adaptive versions of PIPeR, achieves a very low delivery ratio. The non-adaptive interest-and-power-aware versions of PIPeR achieve the highest delivery ratio. It is worth mentioning that the IPeR algorithm achieves the highest delivery ratio in comparison to all the other versions across the various battery environments.

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.19: Delivery ratio over Time

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.20: Categories of Consumed Battery

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.21: Final Mean Battery and SD

From the categories of consumed power figures (Figures A.20a, A.20b and A.20c), it is noticeable that all the algorithms follow the same relative performance across the various battery environments.

The final mean and standard deviation of the batteries figures (Figures A.21a, A.21b and A.21c) indicate that the Epidemic algorithm is the least in fairness since it consumes the batteries to reach a low mean value and a large SD. The PAd algorithm is the highest in fairness among all the algorithms as it achieves the highest mean and the smallest SD. All the PIPeR versions in general are fair in utilizing the batteries as noticed from the high mean values and the small SD values these versions lead to.
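The fairness reading used above (a higher final mean together with a smaller SD indicates fairer battery utilization) relies on two simple summary statistics. The following sketch shows one way they could be computed from the nodes' final battery levels; the function name is hypothetical, and the use of the population (rather than sample) standard deviation is our assumption.

```python
import statistics


def battery_fairness_summary(levels):
    """Return (mean, standard deviation, variance) of remaining battery levels.

    levels: remaining battery of each node, e.g. as fractions in [0, 1].
    Population statistics are used here; this choice is an assumption.
    """
    mean = statistics.fmean(levels)
    sd = statistics.pstdev(levels)
    variance = statistics.pvariance(levels)
    return mean, sd, variance
```

For example, two algorithms that leave the same mean battery level can still differ in fairness: the one whose levels are more spread out (larger SD) has drained some nodes far more than others.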

From the variance over time figures (Figures A.22a, A.22b and A.22c), it is clear that the IPeR algorithm is the most unfair algorithm among all the PIPeR versions as it reaches a very high variance among the batteries. These figures also show that the SCAR algorithm is the fairest algorithm except in the heatmap experiment where all the PIPeR versions - except IPeR - achieve less variance than that the SCAR algorithm achieves.

From the opportunistic fixed threshold power consumption figures (Figures A.23a, A.23b and A.23c), the P50Opp algorithm maintains almost a stable set of categories of the batteries till the end of the simulation across various battery environments.

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.22: Variance over Time

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.23: Power Consumption over Time in Opportunistic fixed threshold

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.24: Power Consumption over Time without Opportunistic fixed threshold

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.25: Power Consumption over Time in Opportunistic Adaptive threshold

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.26: Power Consumption over Time without Opportunistic Adaptive threshold

From the non-opportunistic fixed threshold power consumption figures (Figures A.24a, A.24b and A.24c), the P50 algorithm also maintains almost a stable set of categories of the batteries till the end of the simulation across various battery environments.

From the opportunistic adaptive threshold power consumption figures (Figures A.25a, A.25b and A.25c), the PAdOp algorithm also maintains almost a stable set of categories of the batteries till the end of the simulation across various battery environments.

From the non-opportunistic adaptive threshold power consumption figures (Figures A.26a, A.26b and A.26c), the PAd algorithm also maintains almost a stable set of categories of the batteries till the end of the simulation across various battery environments.

From the interest-based node classification figures (Figures A.27a, A.27b and A.27c), across various environments, the IPeR algorithm achieves a very high delivery ratio, but it is the PIPeR version that contacts more uninterested nodes when compared to the other interest-and-power-aware PIPeR versions.

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.27: Effectiveness: Interest-based node classification

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.28: Effectiveness: Recall/Precision/Accuracy

(a) Full Battery (b) Normal Battery (c) Heatmap Battery

Figure A.29: Power Consumption over time

Also, the PAd algorithm successfully avoids contacting any uninterested nodes.

From the recall/precision figures (Figures A.28a, A.28b and A.28c), the adaptive PIPeR versions achieve the highest recall and the least precision among all the interest-and-power-aware PIPeR versions across various environments. Also, across various environments, the IPeR algorithm achieves the highest precision among all the other PIPeR versions.

From the power consumption over time figures (Figures A.29a, A.29b and A.29c), the SCAR algorithm consumes the least amount of power across environments among all the algorithms, except in the heatmap experiment where the SCAR algorithm is the second highest power consuming algorithm after the IPeR algorithm.

Appendix B

The Full Set of Simulation Results of

Dynamic Adaption

The dynamic adaptive ranking algorithm has been evaluated against the other proposed algorithms

across various interest and battery distribution environments. The full list of results of each environment

simulation is illustrated in this appendix as follows:

B.1 Interest Environments

In addition to the results presented in the previous chapters, which are the results of the simulation in

the uniform interest distribution, here we list the results of another set of interest environments, namely,

the normal interest distribution and the two distinct interest groups distribution. The results of the

latter two environments are listed as follows:

B.1.1 Normal Interest Distribution

Interest Awareness

Interest Awareness is represented in terms of Effectiveness and efficiency as follows:

Effectiveness of Normal Interest Environment - AUC Traces

From both the Interest-based effectiveness figure (Figure B.1a) and the F-measure figure (Figure B.1b), it is observable that both the Interest-Only and DynAdpOp algorithms achieve the highest level of effectiveness as they avoid contacting uninterested nodes and highly focus on contacting interested forwarders. On the other hand, the depletion-aware versions of SCAR and Socialcast fail to achieve effectiveness as they fail to contact any nodes at all. Epidemic, E-BubbleRap and PeopleRank also achieve low effectiveness, despite their success in achieving a high delivery ratio and a high interested forwarders ratio, since they contact a large number of uninterested nodes.
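The two failure modes described above (contacting no nodes at all, and contacting many uninterested nodes) both lower an F-measure. The sketch below illustrates this with the standard F-measure, the weighted harmonic mean of precision and recall. Mapping contacted interested nodes to true positives, contacted uninterested nodes to false positives, and interested nodes that were never contacted to false negatives is our assumption for illustration and may differ from the exact classification used in the main chapters; the function and parameter names are hypothetical.

```python
def f_measure(contacted_interested, contacted_uninterested, missed_interested,
              beta=1.0):
    """Standard F-beta score under an assumed mapping of nodes to outcomes.

    contacted_interested:   true positives (assumed mapping)
    contacted_uninterested: false positives (assumed mapping)
    missed_interested:      false negatives (assumed mapping)
    """
    tp, fp, fn = contacted_interested, contacted_uninterested, missed_interested
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0  # e.g. an algorithm that contacts no interested node at all
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

Under this mapping, an algorithm that contacts no nodes scores zero, while a flooding-style algorithm that reaches every interested node but also many uninterested ones keeps a high recall yet is pulled down by its low precision.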

(a) Interest-based Effectiveness (b) F-measure

Figure B.1: Effectiveness of Normal Interest Environment - AUC Traces

Efficiency of Normal Interest Environment - AUC Traces

From the Cost figure and the Cost versus Delivery Ratio and Interested Forwarders Ratio figure

(Figure B.2c), it is deduced that the Interest-Only algorithm followed by the DynAdpOpSpSyn algorithm

achieve the highest delivery ratio with the least cost. From the delay figure (Figure B.2b), one can notice

that the Interest-Only algorithm incurs the lowest delay.

Power Awareness

Power Awareness is represented in terms of power consumption, fairness and power consumption per

unit delivery ratio and unit ratio of contacted interested forwarders as follows:

From both the Power consumption figure (Figure B.3a) and the figure of Power consumption versus

delivery ratio and interested forwarder ratio (Figure B.3c), it is deduced that the DynAdpOpSpSyn

algorithm followed by the Interest-Only algorithm consumed the least amount of power to achieve the

highest delivery ratio and to contact the highest ratio of interested forwarders. From the fairness figure

(Figure B.3b), it is clear that all depletion-aware versions of SCAR and SocialCast, followed by the

SCAR and the SocialCast algorithms themselves, achieve higher level of utilization fairness.

B.1.2 Two Distinct Interest Groups

Interest Awareness

Interest Awareness is represented in terms of Effectiveness and efficiency as follows:

Effectiveness of 2-Interest Groups Environment - AUC Traces

Since this environment lacks any interested forwarders, it is natural that the interest-based effectiveness figure (Figure B.4a) shows that none of the algorithms contacts any interested forwarders. From Figure B.4a, it is visible that the Interest-Only algorithm achieves the highest delivery ratio without contacting any uninterested nodes at all. The Dynamic adaptive versions are the next most effective algorithms since they achieve the highest delivery ratio with the least ratio of contacted uninterested nodes. On the other hand, the depletion-rate aware versions of SCAR and SocialCast are the least effective algorithms since they totally fail to deliver the messages. The F-measure figure (Figure B.4b) seconds these same observations about the Interest-Only algorithm, the dynamic adaptive versions and the depletion-rate aware versions of SCAR and SocialCast.

(a) Cost (b) Delay

(c) Cost versus Delivery Ratio and Interested Forwarder Ratio

Figure B.2: Efficiency of Normal Interest Environment - AUC Traces

(a) Power Consumption (b) Fairness

(c) Power Consumption versus Delivery Ratio and Interested Forwarders Ratio

Figure B.3: Power Consumption of Normal Interest Environment - AUC Traces

(a) Interest-based Effectiveness (b) F-measure

Figure B.4: Effectiveness of 2-Interest Groups Environment - AUC Traces

(a) Cost (b) Delay

(c) Cost versus Delivery Ratio

Figure B.5: Efficiency of 2-Interest Groups Environment - AUC Traces

Efficiency of 2-Interest Groups Environment - AUC Traces

The Cost figure (Figure B.5a) illustrates that all the SCAR and the SocialCast versions incur the least cost. However, the Cost per unit delivery Ratio and unit interested Forwarders Ratio figure (Figure B.5c) clarifies that the SCAR and the SocialCast versions incur the least cost only to achieve a very low delivery ratio, while the DynAdpOpSpSyn algorithm achieves the highest delivery ratio with a small additional cost. The delay figure (Figure B.5b) shows that the depletion-rate aware versions of SCAR and SocialCast incur the highest delay, since their zero delivery ratio amounts to an infinite delay. On the other hand, SocialCast, followed by SCAR, incurs the least delay.
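The treatment of a zero delivery ratio as an infinite delay, and the idea of normalising cost by what was achieved, can be illustrated with the short sketch below. Both helper functions are hypothetical, and dividing the cost by the delivery ratio alone is a simplifying assumption; the combined normalisation by delivery ratio and interested forwarders ratio used in the figures is not reproduced here.

```python
import math


def mean_delivery_delay(delays):
    """Mean delay over delivered messages; infinite if nothing was delivered.

    delays: list with one delay value per delivered message.
    """
    if not delays:
        return math.inf
    return sum(delays) / len(delays)


def cost_per_unit_delivery(cost, delivery_ratio):
    """Cost divided by delivery ratio (assumed normalisation).

    A zero delivery ratio yields an infinite value, so an algorithm that is
    cheap only because it delivers nothing is not ranked as efficient.
    """
    if delivery_ratio == 0:
        return math.inf
    return cost / delivery_ratio
```

This makes explicit why the lowest raw cost does not imply efficiency: an algorithm that forwards almost nothing has a small cost but an unbounded cost per delivered unit.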

Power Awareness

Power Awareness is represented in terms of power consumption, fairness and power consumption per unit delivery ratio and unit ratio of contacted interested forwarders as follows:

Both the power consumption figure (Figure B.6a) and the power consumption per unit delivery ratio and unit interested forwarders ratio figure (Figure B.6c) show that the DynAdpSpSyn algorithm consumes the least amount of power to achieve the highest delivery ratio. From the fairness figure (Figure B.6b), it is clear that the SCAR and SocialCast versions achieve the highest level of utilization fairness.

(a) Power Consumption (b) Fairness

(c) Power Consumption versus Delivery Ratio

Figure B.6: Power Consumption of 2-Interest Groups Environment - AUC Traces

(a) Interest-based Effectiveness (b) F-measure

Figure B.7: Effectiveness of Normal Battery Environment - AUC Traces

B.2 Battery Environments

In terms of initial battery distribution, there were three different sets of experiments, namely, the full battery distribution, the normal battery distribution and the heatmap battery distribution. The full battery distribution environment is the one that has been represented in the results illustrated in the previous chapters. The other two battery distribution environments are listed in this appendix as follows:

B.2.1 Normal Battery Distribution

Interest Awareness

Interest Awareness is represented in terms of Effectiveness and efficiency as follows:

Effectiveness of Normal Battery Environment - AUC Traces

From the Interest-based effectiveness figure (Figure B.7a), both the Interest-Only and DynAdpOp algorithms achieve the highest level of effectiveness as they avoid contacting uninterested nodes and highly focus on contacting interested forwarders to achieve the highest delivery ratio. The F-measure figure (Figure B.7b), on the other hand, shows that the Interest-Only algorithm and the IPeR algorithm achieve the highest f-measure since they both achieve the highest delivery ratio and the highest ratio of contacted interested forwarders. Note that IPeR contacts a small ratio of uninterested nodes.

Efficiency of Normal Battery Environment - AUC Traces

Although the depletion-aware versions of SCAR and Socialcast incur the least cost, the Cost versus Delivery Ratio and Interested Forwarders Ratio figure (Figure B.8c) clarifies that these versions fail to achieve any delivery ratio, and thus are not efficient at all. The Interest-Only algorithm and the IPeR algorithm, on the other hand, achieve the highest delivery ratio but with almost half the cost paid by the E-BubbleRap and almost the same cost paid by the Epidemic algorithm as shown in Figures B.8a and B.8c. From the delay figure (Figure B.8b), one can notice that the depletion-rate aware versions incur an infinite delay since they do not achieve any delivery ratio, while the Interest-Only algorithm incurs a quite small delay to achieve the highest delivery ratio.

(a) Cost (b) Delay

(c) Cost versus Delivery Ratio and Interested Forwarder Ratio

Figure B.8: Efficiency of Normal Battery Environment - AUC Traces

(a) Power Consumption (b) Fairness

(c) Power Consumption versus Delivery Ratio and Interested Forwarders Ratio

Figure B.9: Power Consumption of Normal Battery Environment - AUC Traces

Power Awareness

Power Awareness is represented in terms of power consumption, fairness and power consumption per unit delivery ratio and unit ratio of contacted interested forwarders as follows:

The Power consumption figure (Figure B.9a) shows that the depletion-rate SCAR and SocialCast versions consume the least amount of power, while the power consumption per unit delivery ratio and unit interested forwarders ratio figure (Figure B.9c) clarifies that the Interest-Only algorithm consumes the least amount of power to achieve the highest delivery ratio. As for the utilization fairness (Figure B.9b), the SCAR and SocialCast versions achieve the highest level of fairness, while the Epidemic algorithm, the E-BubbleRap algorithm and the Space Syntax-based forwarding algorithm achieve the least fairness.

B.2.2 Heatmap Battery Distribution

Interest Awareness

Effectiveness of Heatmap Battery Environment - AUC Traces

Both the interest-based effectiveness figure (Figure B.10a) and the F-measure figure (Figure B.10b) illustrate that the Interest-Only algorithm and the DynAdpOpSpSyn algorithm are the most effective algorithms, while the depletion-rate-aware versions are not effective at all since they fail to contact any node.

(a) Interest-based Effectiveness (b) F-measure

Figure B.10: Effectiveness of Heatmap Battery Environment - AUC Traces

Efficiency of Heatmap Battery Environment - AUC Traces

The algorithms that pay the least cost are the depletion-rate-aware versions; however, they achieve no delivery ratio at all, as shown in the cost figure (Figure B.11a) and the cost versus delivery ratio and interested forwarders ratio figure (Figure B.11c). On the other hand, the PeopleRank algorithm achieves the highest delivery ratio with the highest cost. From the delay figure (Figure B.11b), the SocialCast and SCAR algorithms incur the least delay only to achieve a very small delivery ratio.

Power Awareness

The power consumption figure (Figure B.12) supports the above observations, as the depletion-rate-aware versions consume the least amount of power.

Figure B.11: Efficiency of Heatmap Battery Environment - AUC Traces. (a) Cost (b) Delay (c) Cost versus Delivery Ratio and Interested Forwarder Ratio

Figure B.12: Power Consumption of Heatmap Battery Environment - AUC Traces. (a) Power Consumption (b) Fairness (c) Power Consumption versus Delivery Ratio and Interested Forwarders Ratio

Appendix C

The SPSS and R Commands for Data Manipulation and Analysis

C.1 SPSS Commands to Compute Factor Variables for the Parameters of the 50 Nodes

VARSTOCASES

/ID=id

/MAKE AvgContDurNow FROM AgContDurNow10 AgContDurNow11 AgContDurNow12 AgContDurNow13 AgContDurNow14 AgContDurNow15 AgContDurNow16 AgContDurNow17 AgContDurNow18 AgContDurNow19
AgContDurNow2 AgContDurNow20 AgContDurNow21 AgContDurNow22 AgContDurNow23 AgContDurNow24 AgContDurNow25 AgContDurNow26 AgContDurNow27 AgContDurNow28 AgContDurNow29
AgContDurNow3 AgContDurNow30 AgContDurNow31 AgContDurNow32 AgContDurNow33 AgContDurNow34 AgContDurNow35 AgContDurNow36 AgContDurNow37 AgContDurNow38 AgContDurNow39
AgContDurNow4 AgContDurNow40 AgContDurNow41 AgContDurNow42 AgContDurNow43 AgContDurNow44 AgContDurNow45 AgContDurNow46 AgContDurNow47 AgContDurNow48 AgContDurNow49
AgContDurNow5 AgContDurNow50 AgContDurNow51 AgContDurNow6 AgContDurNow7 AgContDurNow8 AgContDurNow9

/MAKE AvgFriendsRank FROM AvgFriendsRank10 AvgFriendsRank11 AvgFriendsRank12 AvgFriendsRank13 AvgFriendsRank14 AvgFriendsRank15 AvgFriendsRank16 AvgFriendsRank17 AvgFriendsRank18 AvgFriendsRank19
AvgFriendsRank2 AvgFriendsRank20 AvgFriendsRank21 AvgFriendsRank22 AvgFriendsRank23 AvgFriendsRank24 AvgFriendsRank25 AvgFriendsRank26 AvgFriendsRank27 AvgFriendsRank28 AvgFriendsRank29
AvgFriendsRank3 AvgFriendsRank30 AvgFriendsRank31 AvgFriendsRank32 AvgFriendsRank33 AvgFriendsRank34 AvgFriendsRank35 AvgFriendsRank36 AvgFriendsRank37 AvgFriendsRank38 AvgFriendsRank39
AvgFriendsRank4 AvgFriendsRank40 AvgFriendsRank41 AvgFriendsRank42 AvgFriendsRank43 AvgFriendsRank44 AvgFriendsRank45 AvgFriendsRank46 AvgFriendsRank47 AvgFriendsRank48 AvgFriendsRank49
AvgFriendsRank5 AvgFriendsRank50 AvgFriendsRank51 AvgFriendsRank6 AvgFriendsRank7 AvgFriendsRank8 AvgFriendsRank9

/MAKE Battery FROM Battery10 Battery11 Battery12 Battery13 Battery14 Battery15 Battery16

Battery17 Battery18 Battery19 Battery2 Battery20 Battery21 Battery22 Battery23 Battery24 Battery25

Battery26 Battery27 Battery28 Battery29 Battery3 Battery30 Battery31 Battery32 Battery33 Battery34

Battery35 Battery36 Battery37 Battery38 Battery39 Battery4 Battery40 Battery41 Battery42 Battery43

Battery44 Battery45 Battery46 Battery47 Battery48 Battery49 Battery5 Battery50 Battery51 Battery6

Battery7 Battery8 Battery9

/MAKE ContactDurSoFar FROM ContactsDurSoFar10 ContactsDurSoFar11 ContactsDurSoFar12 ContactsDurSoFar13 ContactsDurSoFar14 ContactsDurSoFar15 ContactsDurSoFar16 ContactsDurSoFar17 ContactsDurSoFar18 ContactsDurSoFar19
ContactsDurSoFar2 ContactsDurSoFar20 ContactsDurSoFar21 ContactsDurSoFar22 ContactsDurSoFar23 ContactsDurSoFar24 ContactsDurSoFar25 ContactsDurSoFar26 ContactsDurSoFar27 ContactsDurSoFar28 ContactsDurSoFar29
ContactsDurSoFar3 ContactsDurSoFar30 ContactsDurSoFar31 ContactsDurSoFar32 ContactsDurSoFar33 ContactsDurSoFar34 ContactsDurSoFar35 ContactsDurSoFar36 ContactsDurSoFar37 ContactsDurSoFar38 ContactsDurSoFar39
ContactsDurSoFar4 ContactsDurSoFar40 ContactsDurSoFar41 ContactsDurSoFar42 ContactsDurSoFar43 ContactsDurSoFar44 ContactsDurSoFar45 ContactsDurSoFar46 ContactsDurSoFar47 ContactsDurSoFar48 ContactsDurSoFar49
ContactsDurSoFar5 ContactsDurSoFar50 ContactsDurSoFar51 ContactsDurSoFar6 ContactsDurSoFar7 ContactsDurSoFar8 ContactsDurSoFar9

/MAKE normalContactFr FROM normalContactFr10 normalContactFr11 normalContactFr12 normalContactFr13 normalContactFr14 normalContactFr15 normalContactFr16 normalContactFr17 normalContactFr18 normalContactFr19
normalContactFr2 normalContactFr20 normalContactFr21 normalContactFr22 normalContactFr23 normalContactFr24 normalContactFr25 normalContactFr26 normalContactFr27 normalContactFr28 normalContactFr29
normalContactFr3 normalContactFr30 normalContactFr31 normalContactFr32 normalContactFr33 normalContactFr34 normalContactFr35 normalContactFr36 normalContactFr37 normalContactFr38 normalContactFr39
normalContactFr4 normalContactFr40 normalContactFr41 normalContactFr42 normalContactFr43 normalContactFr44 normalContactFr45 normalContactFr46 normalContactFr47 normalContactFr48 normalContactFr49
normalContactFr5 normalContactFr50 normalContactFr51 normalContactFr6 normalContactFr7 normalContactFr8 normalContactFr9

/MAKE PlacePopFreq FROM PlacePopFreq10 PlacePopFreq11 PlacePopFreq12 PlacePopFreq13 PlacePopFreq14 PlacePopFreq15 PlacePopFreq16 PlacePopFreq17 PlacePopFreq18 PlacePopFreq19
PlacePopFreq2 PlacePopFreq20 PlacePopFreq21 PlacePopFreq22 PlacePopFreq23 PlacePopFreq24 PlacePopFreq25 PlacePopFreq26 PlacePopFreq27 PlacePopFreq28 PlacePopFreq29
PlacePopFreq3 PlacePopFreq30 PlacePopFreq31 PlacePopFreq32 PlacePopFreq33 PlacePopFreq34 PlacePopFreq35 PlacePopFreq36 PlacePopFreq37 PlacePopFreq38 PlacePopFreq39
PlacePopFreq4 PlacePopFreq40 PlacePopFreq41 PlacePopFreq42 PlacePopFreq43 PlacePopFreq44 PlacePopFreq45 PlacePopFreq46 PlacePopFreq47 PlacePopFreq48 PlacePopFreq49
PlacePopFreq5 PlacePopFreq50 PlacePopFreq51 PlacePopFreq6 PlacePopFreq7 PlacePopFreq8 PlacePopFreq9

/MAKE PlacePopHere FROM PlacePopHere10 PlacePopHere11 PlacePopHere12 PlacePopHere13 PlacePopHere14 PlacePopHere15 PlacePopHere16 PlacePopHere17 PlacePopHere18 PlacePopHere19
PlacePopHere2 PlacePopHere20 PlacePopHere21 PlacePopHere22 PlacePopHere23 PlacePopHere24 PlacePopHere25 PlacePopHere26 PlacePopHere27 PlacePopHere28 PlacePopHere29
PlacePopHere3 PlacePopHere30 PlacePopHere31 PlacePopHere32 PlacePopHere33 PlacePopHere34 PlacePopHere35 PlacePopHere36 PlacePopHere37 PlacePopHere38 PlacePopHere39
PlacePopHere4 PlacePopHere40 PlacePopHere41 PlacePopHere42 PlacePopHere43 PlacePopHere44 PlacePopHere45 PlacePopHere46 PlacePopHere47 PlacePopHere48 PlacePopHere49
PlacePopHere5 PlacePopHere50 PlacePopHere51 PlacePopHere6 PlacePopHere7 PlacePopHere8 PlacePopHere9

/MAKE SimilarityInterest FROM SimilarityInterest10 SimilarityInterest11 SimilarityInterest12 SimilarityInterest13 SimilarityInterest14 SimilarityInterest15 SimilarityInterest16 SimilarityInterest17 SimilarityInterest18 SimilarityInterest19
SimilarityInterest2 SimilarityInterest20 SimilarityInterest21 SimilarityInterest22 SimilarityInterest23 SimilarityInterest24 SimilarityInterest25 SimilarityInterest26 SimilarityInterest27 SimilarityInterest28 SimilarityInterest29
SimilarityInterest3 SimilarityInterest30 SimilarityInterest31 SimilarityInterest32 SimilarityInterest33 SimilarityInterest34 SimilarityInterest35 SimilarityInterest36 SimilarityInterest37 SimilarityInterest38 SimilarityInterest39
SimilarityInterest4 SimilarityInterest40 SimilarityInterest41 SimilarityInterest42 SimilarityInterest43 SimilarityInterest44 SimilarityInterest45 SimilarityInterest46 SimilarityInterest47 SimilarityInterest48 SimilarityInterest49
SimilarityInterest5 SimilarityInterest50 SimilarityInterest51 SimilarityInterest6 SimilarityInterest7 SimilarityInterest8 SimilarityInterest9

/MAKE ucdc FROM ucdc10 ucdc11 ucdc12 ucdc13 ucdc14 ucdc15 ucdc16 ucdc17 ucdc18 ucdc19 ucdc2 ucdc20 ucdc21 ucdc22 ucdc23 ucdc24 ucdc25 ucdc26 ucdc27 ucdc28 ucdc29 ucdc3 ucdc30 ucdc31 ucdc32 ucdc33 ucdc34 ucdc35 ucdc36 ucdc37 ucdc38 ucdc39 ucdc4 ucdc40 ucdc41 ucdc42 ucdc43 ucdc44 ucdc45 ucdc46 ucdc47 ucdc48 ucdc49 ucdc5 ucdc50 ucdc51 ucdc6 ucdc7 ucdc8 ucdc9

/MAKE ucol FROM ucol10 ucol11 ucol12 ucol13 ucol14 ucol15 ucol16 ucol17 ucol18 ucol19 ucol2 ucol20 ucol21 ucol22 ucol23 ucol24 ucol25 ucol26 ucol27 ucol28 ucol29 ucol3 ucol30 ucol31 ucol32 ucol33 ucol34 ucol35 ucol36 ucol37 ucol38 ucol39 ucol4 ucol40 ucol41 ucol42 ucol43 ucol44 ucol45 ucol46 ucol47 ucol48 ucol49 ucol5 ucol50 ucol51 ucol6 ucol7 ucol8 ucol9

/KEEP=ConsumedPower Cost day DeliveryRatio hour IntFWDRatio msg time UnIntFWDRatio Fairness

/NULL=DROP.
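To make the effect of the restructuring command easier to follow, the wide-to-long step can be sketched in Python as below. This is only an illustration under stated assumptions: the function name vars_to_cases, the dictionary-based row representation and the sample values are hypothetical, and the sketch pairs source variables by their numeric suffix, whereas SPSS pairs them by their position in each /MAKE list (the two coincide here because every list above uses the same ordering).

```python
def vars_to_cases(rows, stems, keep, suffixes):
    """Turn each wide row into one long row per suffix.

    rows:     list of dicts, one per original case (wide format)
    stems:    maps each new variable name to the stem of its source variables
    keep:     variables copied unchanged into every new row (like /KEEP)
    suffixes: node numbers appended to the stems, e.g. range(2, 52)
    """
    long_rows = []
    for case_number, row in enumerate(rows, start=1):
        for suffix in suffixes:
            made = {target: row.get(f"{source}{suffix}")
                    for target, source in stems.items()}
            # Mirror /NULL=DROP: skip a new case when all made values are missing.
            if all(value is None for value in made.values()):
                continue
            new_row = {"id": case_number}  # like /ID=id: the originating case
            new_row.update({name: row[name] for name in keep})
            new_row.update(made)
            long_rows.append(new_row)
    return long_rows


# Hypothetical two-node example row.
wide = [{"DeliveryRatio": 0.4, "Battery2": 80, "Battery3": 60,
         "AgContDurNow2": 5, "AgContDurNow3": None}]
long_rows = vars_to_cases(
    wide,
    stems={"Battery": "Battery", "AvgContDurNow": "AgContDurNow"},
    keep=["DeliveryRatio"],
    suffixes=[2, 3, 4],
)
```

In this example the row for suffix 4 is dropped because neither source variable exists for it, while the row for suffix 3 is kept even though one of its values is missing.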

C.2 The R Commands to Construct the Scatter Matrix with the Correlation Coefficients

To read the data file

S <- read.csv("c:/datafile.csv")

attach(S)

To select certain variables for the scatter matrix

x <- S[c("DeliveryRatio", "Battery", "AvgContDurNow", "AvgFriendsRank", "ContactsDurSoFar",
         "normalContactFr", "PlacePopFreq", "PlacePopHere", "ucol", "ucdc", "SimilarityInterest")]

This is a function to calculate the correlation and to produce the scatter matrix:

panel.cor <- function(x, y, digits = 2, cex.cor, ...)
{
  # Save the panel's user coordinates and restore them on exit
  usr <- par("usr"); on.exit(par(usr = usr))
  par(usr = c(0, 1, 0, 1))
  # Correlation coefficient, printed as "r= ..."
  r <- cor(x, y)
  txt <- format(c(r, 0.123456789), digits = digits)[1]
  txt <- paste("r= ", txt, sep = "")
  text(0.5, 0.6, txt)
  # p-value of the correlation test, printed as "p= ..."
  p <- cor.test(x, y)$p.value
  txt2 <- format(c(p, 0.123456789), digits = digits)[1]
  txt2 <- paste("p= ", txt2, sep = "")
  if (p < 0.01) txt2 <- paste("p= ", "<0.01", sep = "")
  text(0.5, 0.4, txt2)
}
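For readers who do not use R, the quantity that each lower panel reports can be sketched in plain Python as follows. This is a minimal illustration of the Pearson correlation coefficient and of the p-value labelling rule used above; it does not compute the p-value itself, and the function names pearson_r and p_label are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def p_label(p, digits=2):
    """Label a p-value, collapsing anything below 0.01 to '<0.01'."""
    if p < 0.01:
        return "p= <0.01"
    return f"p= {p:.{digits}g}"
```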

In order to produce a .png file out of the scatter matrix command, type the following commands:

png("D:/ScatterMatrixAndCorrelation.png")

pairs(x, lower.panel=panel.cor)

Then set the output back to the screen

dev.off()

In order to conduct a stepwise forward regression with BIC values, type the following commands:

null <- lm(DeliveryRatio ~ 1, data = x)
full <- lm(DeliveryRatio ~ . + .^3, data = x)
n <- nrow(S)
fwd <- step(null, scope = formula(full), direction = "forward", k = log(n))
extractAIC(fwd, k = log(n))
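The role of k = log(n) in the commands above can be illustrated with a small Python sketch of the penalized criterion that R reports for a linear model, namely n log(RSS/n) + k edf, where RSS is the residual sum of squares and edf is the number of estimated coefficients. With k = log(n) the penalty matches that of the BIC. The function names and the numeric values are illustrative assumptions and are not taken from the thesis data.

```python
import math


def penalized_criterion(rss: float, n: int, edf: int, k: float) -> float:
    """n * log(RSS / n) + k * edf for a linear model fitted to n cases."""
    if rss <= 0 or n <= 0:
        raise ValueError("rss and n must be positive")
    return n * math.log(rss / n) + k * edf


def bic_like(rss: float, n: int, edf: int) -> float:
    """The same criterion with the BIC penalty k = log(n)."""
    return penalized_criterion(rss, n, edf, math.log(n))


# Hypothetical forward step: the extra term lowers RSS only slightly.
current = bic_like(rss=50.0, n=100, edf=2)
candidate = bic_like(rss=49.5, n=100, edf=3)
add_term = candidate < current
```

In this hypothetical step the small reduction in RSS does not offset the log(n) penalty for the additional coefficient, so the candidate term would not be added.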

C.3 SPSS Commands for Multiple Linear Regression

REGRESSION

/MISSING LISTWISE

/STATISTICS COEFF OUTS CI(95) R ANOVA COLLIN TOL

/CRITERIA=PIN(.05) POUT(.10)

/NOORIGIN

/DEPENDENT DeliveryRatio

/METHOD=ENTER AvgContDurNow AvgFriendsRank Battery ContactDurSoFar normalContactFr

PlacePopFreq PlacePopHere SimilarityInterest ucdc ucol

/PARTIALPLOT ALL

/RESIDUALS HISTOGRAM(ZRESID).

Appendix D

IRB Approval Letter

Appendix E

Consent for Participation in Research Form

The American University in Cairo Institutional Review Board

Consent for Participation in Research

Title: Mobility Trace of AUC community for use in Social Pervasive Systems

Introduction

The purpose of this form is to provide you information that may affect your decision as to whether or not to participate in this research study. The person performing the research will answer any of your questions. Read the information below and ask any questions you might have before deciding whether or not to take part. If you decide to be involved in this study, this form will be used to record your consent.

Purpose of the Study

You have been asked to participate in a research study about the Social Pervasive Systems that utilize the mobility trace of the users through detecting their mobile phones connectivity with the access points within AUC. The purpose of this study is to correlate the mobility of a set of users with their social network to be fed into a social pervasive application that is aware of the user context and the user’s social network to provide the appropriate service in the opportune time.

What will you be asked to do?

If you agree to participate in this study, you will be asked to provide your student ID and email address to help the research group trace your mobility through the access points within the university. This study will take place during the academic year 2012-2013 and will include approximately 50 study participants.

What are the risks involved in this study?

There are no foreseeable risks to participating in this study.

What are the possible benefits of this study?

You will receive no direct benefit from participating in this study; however, your participation will contribute to the research area, as new real data will be obtained and can be fed into a simulator to achieve results that are closer to reality.

Do you have to participate?

No, your participation is voluntary. You may decide not to participate at all or, if you start the study, you may withdraw at any time. Withdrawing or refusing to participate will not affect your relationship with The American University in Cairo in any way.

If you would like to participate, please sign this form and write your full name, student ID and email address. You will receive a copy of this form.

Will there be any compensation?

You will not receive any type of payment for participating in this study.

What are my confidentiality or privacy protections when participating in this research study?

This study is confidential, and the users’ identities will be accessible only to the research group of this study. Any issued publications will anonymize the user identities.

Whom to contact with questions about the study?

Prior to, during or after your participation, you can contact the researcher Soumaia Ahmed Al Ayyat at 0101020096 or send an email to [email protected]. This study has been reviewed and approved by The University Institutional Review Board and the study number is [STUDY NUMBER].

Whom to contact with questions concerning your rights as a research participant?

For questions about your rights or any dissatisfaction with any part of this study, you can contact, anonymously if you wish, the Institutional Review Board through Dr. Graham Harman, Associate Provost for Research Administration, at ghar- [email protected].

Signature

You have been informed about this study’s purpose, procedures, possible benefits and risks, and you have received a copy of this form. You have been given the opportunity to ask questions before you sign, and you have been told that you can ask other questions at any time. You voluntarily agree to participate in this study. By signing this form, you are not waiving any of your legal rights.

—————————- —————————- Printed Name Student ID

—————————- Email address

—————————- —————————- Signature Date

As a representative of this study, we have explained the purpose, procedures, benefits, and the risks involved in this research study.

——————————————————– Print Name of Person obtaining consent

—————————- —————————- Signature of Person obtaining consent Date

Note: this form is adapted from the IRB of the University of Texas at Austin, http://www.utexas.edu/research/rsc/humansubjects/forms.html

Appendix F

AUC Traces File Description

F.1 Record Description of the AUC Traces File Content

Record | Description
NTWS-TRAP-MIB:ntwsClientAssociationSuccessTrap | Successful association of a client to the wireless connection
NTWS-TRAP-MIB:ntwsClientAuthenticationSuccessTrap | Successful client authentication
NTWS-TRAP-MIB:ntwsClientAuthorizationSuccessTrap4 | Successful client authorization
NTWS-TRAP-MIB:ntwsClientRoamingTrap | Client is roaming from one access point to another
NTWS-TRAP-MIB:ntwsClientDeAuthenticationTrap | Clearing the client authentication
NTWS-TRAP-MIB:ntwsClientDeAssociationTrap | De-association of the client from the wireless connection
NTWS-TRAP-MIB:ntwsClientClearedTrap2 | In case of disconnection of the session even without authorization, authentication, laptop closed, wireless session disconnected

Table F.1: Record Description of the AUC Traces Trap File Content
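As an illustration of how the record types in Table F.1 could be recognised when scanning a trace file, the following Python sketch extracts the trap name from a line of text and looks up its description. The layout of a trace line is not specified in this appendix, so the sample line in the usage note, the helper names trap_name and describe, and the shortened description strings are all hypothetical.

```python
TRAP_PREFIX = "NTWS-TRAP-MIB:"

# Trap names as listed in Table F.1, with shortened descriptions.
TRAP_DESCRIPTIONS = {
    "ntwsClientAssociationSuccessTrap": "client associated",
    "ntwsClientAuthenticationSuccessTrap": "client authenticated",
    "ntwsClientAuthorizationSuccessTrap4": "client authorized",
    "ntwsClientRoamingTrap": "client roaming between access points",
    "ntwsClientDeAuthenticationTrap": "client authentication cleared",
    "ntwsClientDeAssociationTrap": "client de-associated",
    "ntwsClientClearedTrap2": "session disconnected",
}


def trap_name(record: str):
    """Return the trap name that follows the MIB prefix, or None if absent."""
    start = record.find(TRAP_PREFIX)
    if start < 0:
        return None
    tail = record[start + len(TRAP_PREFIX):]
    name = ""
    for ch in tail:
        if not ch.isalnum():
            break
        name += ch
    return name or None


def describe(record: str):
    """Map a trace line to its short description, or None if unknown."""
    return TRAP_DESCRIPTIONS.get(trap_name(record))
```

For a hypothetical line such as "2012-10-01 09:00:00 NTWS-TRAP-MIB:ntwsClientRoamingTrap user=42", describe returns the roaming description, while lines without the prefix yield None.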

Appendix G

Certificates of Participation and Awards

I participated in the 3MT contest at AUC in 2015 and won the runner-up award.

I also presented my poster at the Youssef Jameel PhD Summer School in 2014, at the Microsoft Research PhD Summer School, and at the AUC Research Day in 2016.

I presented a poster paper at PECCS 2013, and I won the Best PhD Forum Paper Award at the PerCom 2014 Conference.

Appendix H

The SAROS Simulator Main Functions

H.1 SLAWSim

The SLAWSim code file includes the following main functions:

KalmanPredictor(double currentValue, int counter, int userID)

This function computes the Kalman Filter prediction of the input currentValue for the user with userID, where counter is the current iteration number. The implemented Kalman Filter is identical to that implemented by the SocialCast algorithm. It is used by several algorithms such as SocialCast, SCAR, ISCast, ISCAR, PISCast, PISCAR, PISCastOp, PISCAROp, PISCastDep, PISCARDep, PISCastOpDep, PISCAROpDep, DynAdp, DynAdpOp, DynAdpspSyn, and DynAdpOpSpSyn.
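To convey the idea behind such a predictor, the following Python sketch shows a generic one-dimensional Kalman filter that tracks a slowly varying value. It is not the filter configuration used by SocialCast or by the simulator; the random-walk state model, the class name ScalarKalman and all variance values are assumptions chosen only for illustration.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model."""

    def __init__(self, initial_estimate=0.0, initial_variance=1.0,
                 process_variance=1e-3, measurement_variance=0.1):
        self.estimate = initial_estimate
        self.variance = initial_variance
        self.process_variance = process_variance
        self.measurement_variance = measurement_variance

    def update(self, measurement: float) -> float:
        # Predict: the state is assumed unchanged, its uncertainty grows.
        prior_variance = self.variance + self.process_variance
        # Correct: blend the prediction with the new measurement.
        gain = prior_variance / (prior_variance + self.measurement_variance)
        self.estimate += gain * (measurement - self.estimate)
        self.variance = (1.0 - gain) * prior_variance
        return self.estimate


# Feeding a constant value of 5.0 pulls the estimate from 0.0 towards 5.0.
kalman = ScalarKalman()
first = kalman.update(5.0)
for _ in range(49):
    last = kalman.update(5.0)
```

Each call plays the role of one iteration: the estimate moves towards the new observation by an amount that depends on how uncertain the filter currently is.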

PeopleRank(int messageID, int PeRversion, int iteration)

This function implements the following versions of the PeopleRank algorithm: CA-PeR, IPeR, PIPeR, PIPeROp, PIPeRDep, and PIPeROpDep. It detects which version to execute based on the input parameter PeRversion.

Socialcast(int messageID, double wcdc, double wcol, int iteration)

This function implements the Socialcast and SCAR algorithms and all their interest- and power-aware versions; namely, SocialCast, SCAR, ISCast, ISCAR, PISCast, PISCAR, PISCastOp, PISCAROp, PISCastDep, PISCARDep, PISCastOpDep, and PISCAROpDep.

CSID(int messageID, int iteration)

This function implements the CSID protocol of the ProfileCast paradigm.

DynAdpRank(int messageID, int iteration, int DynVersion)

This function implements all the Dynamic Adaptive ranking function versions; namely, DynAdp, DynAdpOp, DynAdpspSyn, and DynAdpOpSpSyn.

Epidemic(int messageID, int iteration)

This function implements the Epidemic function and its interest- and/or power-aware versions; namely, Epidemic, Interest-Only, and Interest-And-Power-Only. It also executes any of the proposed Space Syntax-based forwarding algorithms.

WaitDestination(int messageID, int iteration)

This function implements the Wait Destination algorithm.

NormalDistribution()

This function generates values extracted from a normal distribution.

SimilarityInterest(int user1ID, int user2ID, int messageID, int mode)

This function computes the Similarity Interest between the interest profiles of two user nodes or between the interest profiles of a user node and the message’s interest profile. The input parameter mode indicates which of the mentioned two modes to compute.
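One common way to compare two interest profiles is the cosine similarity of their interest vectors, sketched below in Python. The actual similarity measure used by the simulator is defined in the main text; the choice of cosine similarity, the vector representation of a profile and the function name are assumptions made only for illustration.

```python
import math


def cosine_similarity(profile_a, profile_b) -> float:
    """Cosine of the angle between two equally long interest vectors."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    dot = sum(a * b for a, b in zip(profile_a, profile_b))
    norm_a = math.sqrt(sum(a * a for a in profile_a))
    norm_b = math.sqrt(sum(b * b for b in profile_b))
    if norm_a == 0 or norm_b == 0:
        # An empty profile shares no interest with anything.
        return 0.0
    return dot / (norm_a * norm_b)
```

The same function covers both modes described above, since a message's interest profile can be passed in place of the second user's profile.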

SLAWSimulation()

This function uses the SLAW mobility model’s dataset as the mobility trace of the users in the simulation run.

SaveFile(), SaveRecall(), SaveValues(), SaveInterest(), Savecurrentbatteries()

This set of functions saves the results of the simulation in separate files.

RandomMove()

This function generates a random movement mobility model to be utilized in the simulation.

panel1_Paint()

This function redraws the animated mobility of the users in the simulation per second. This function uses different colors to represent the interest classification of the nodes and whether they have received a copy of the message or not.

KiBaM(int userID, int timeslot, double PowerConsumed, double conductivity, int messageID, bool IsItCtrlMsg)

This function implements the Kinetic Battery model for the nodes in the simulation.
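The two-well idea behind the Kinetic Battery Model can be sketched with a single explicit Euler step in Python, as below. The sketch assumes the usual formulation with an available-charge well and a bound-charge well whose heights are h1 = y1/c and h2 = y2/(1-c), and charge flowing between them at rate k(h2 - h1). The parameter values, the step size and the function name are illustrative assumptions and do not reproduce the simulator's implementation.

```python
def kibam_step(available, bound, load, dt, c=0.625, k=0.1):
    """Advance the two charge wells by one Euler step of length dt.

    available: charge in the available well (directly usable)
    bound:     charge in the bound well (released gradually)
    load:      current drawn from the available well
    c:         fraction of total capacity held by the available well
    k:         rate constant of the flow between the wells
    """
    height_available = available / c
    height_bound = bound / (1.0 - c)
    flow = k * (height_bound - height_available)  # bound -> available
    next_available = available + (flow - load) * dt
    next_bound = bound - flow * dt
    # Charges cannot become negative; an empty available well means cut-off.
    return max(next_available, 0.0), max(next_bound, 0.0)


# With no load, charge migrates from the bound well to the available well.
recovered = kibam_step(available=50.0, bound=37.5, load=0.0, dt=1.0)
```

The recovery effect is visible when the load is zero: the available charge rises while the total charge stays unchanged.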

IsFriend(int user1Id, int user2ID)

This function checks whether the two users are friends according to each one’s friend list.

BubbleRap()

This function implements the E-BubbleRap algorithm.

InitBatteries()

This function initialises the batteries of all the nodes according to the selected Battery distribution.

InDestinationSet(int userID, int messageID)

This function checks whether the specified user is within the set of destination nodes for the specified message.

GenerateUserInterest(int messageID)

This function generates the set of users’ interest profiles for a specified message.

GetPenalty(int userID, int messageID)

This function computes the reward/penalty of the user’s rank with respect to a specified message.

GenActivity()

This function generates the usage profiles of all the nodes in the simulation as being randomly selected from the imported usage profiles dataset.

GenPlotCmdFile()

This function generates the GNUplot command file and then executes it to produce the graphs that represent the results of the simulation runs.

FindContactDurationNow(int timeslot, int userID)

This function computes the contact duration between the specified user and all the other users at a specified time slot.

FindContactCount(int timeslot, int userID)

This function computes the count of contacts between the specified user and all the other users at a specified time slot.

FindContactDurationSoFar(int timeslot, int userID)

This function computes the contact duration between the specified user and all the other users from the beginning of the simulation run until the specified time slot.

ExpectedUsage(int userID, double batterylevel, double avgcontacts, double timeslot, int messageID)

This function predicts the expected amount of battery usage for a specified user based on its current battery level, the expected power consumption due to future contacts as an extrapolation from the average count of contacts recorded so far, and the expected power consumption due to WiFi connection and forward/receive actions from now until the TTL of the specified message expires.

calcFairness(int messageID, int timeslot)

This function computes the utilization fairness index for the whole battery community with respect to a certain message at a specified time slot.
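As an example of what a fairness index over a set of nodes can look like, the following Python sketch computes Jain's fairness index, which equals 1 when all nodes are utilized equally and 1/n when a single node carries the whole load. The thesis defines its own utilization fairness index in the main text; Jain's index is used here only as a widely known stand-in, and the function name is hypothetical.

```python
def jain_fairness(utilizations) -> float:
    """Jain's index: (sum x)^2 / (n * sum x^2), in the range (0, 1]."""
    values = list(utilizations)
    if not values:
        raise ValueError("at least one value is required")
    if any(v < 0 for v in values):
        raise ValueError("utilizations must be non-negative")
    total = sum(values)
    squares = sum(v * v for v in values)
    if squares == 0:
        # Nobody was utilized at all, which is trivially equal.
        return 1.0
    return total * total / (len(values) * squares)
```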

H.2 EditINFOCOM

The EditINFOCOM code file includes the following functions:

ReadFile(day)

This function reads the INFOCOM conference dataset and selects the traces of a certain day to be imported in the simulation run.

GenFriends()

This function generates a list of friends for each user. This list is randomly selected from within the current users in the simulation.

Readencounters()

This function extracts the encounter information from the traces.

ReadInterestFile()

This function reads the interest profiles recorded in the INFOCOM dataset.

SaveHours(day)

This function extracts the mobility encounters per hour in separate files for use in separate simulation runs.

SaveInterestfromEncounters()

This function deduces the interest profiles of all the 4704 users based on their frequent encounter with the 20 static iMote nodes that were installed to detect mobile devices during the INFOCOM conference event.

H.3 EditSIGCOMM

The EditSIGCOMM code file includes the following main functions:

ReadInterestFiles()

This function reads the interest files originally created by the SIGCOMM dataset.

WriteInterestFiles()

This function writes the interest files in the format that suits the SAROS simulator functionality.

ReadFile(day)

This function reads the mobility traces file of the SIGCOMM dataset.

SaveHours(day)

This function extracts the mobility encounters per hour in separate files for use in separate simulation runs.

ReadFriendsFiles()

This function reads the friend-list files that are originally created by the SIGCOMM dataset.

WriteFriendsFiles()

This function writes the friend-list files in the format that suits the SAROS simulator functionality.

H.4 MallTracesSim

The MallTracesSim code file includes the following main functions:

ReadFile(day)

This function reads the mobility traces file of the Mall environment dataset.

MallTracesSim()

This function runs the simulation with the imported Mall dataset.

H.5 StAndrewsTraces

The StAndrewsTraces code file includes the following main functions:

ReadFile(day)

This function reads the mobility traces file of the St. Andrews University dataset.

StAndrewsTraces()

This function runs the simulation with the imported St. Andrews University dataset.

H.6 SpaceSyntax

The SpaceSyn code file mainly handles all the Space Syntax metrics implementation and also handles importing data from the AUC traces dataset. This file includes the following main functions:

ReadStreets(day, List StreetList, List Edges)

This function reads all the street information and the edges information based on a selected day in the AUC traces dataset.

ReadAPcolocation(day, SortedList colocation)

This function reads the colocation of the users with the imported access points for a selected day.

double findUserPop(List UsersPopularity, double proximity, double xu, double yu, int choice, int userID, List CloseAPs, SortedList colocation)

This function computes the node popularity for the user whose ID is userID, and whose location coordinates are (xu, yu). The popularity is computed based on the choice parameter that identifies which Space Syntax metric will be applied in the computation.

bool IsCloseAP(double xp, double yp, List APs, double xu, double yu, double proximity, ref double distToAP)

This function checks whether or not the user located at the coordinates (xu, yu) is within proximity of the access point whose coordinates are (xp, yp), and returns the distance between them in the variable distToAP.

List ReadAPs(ref int MaxFreq, day, ref List APs, List StreetList , int proximity)

This function reads the access points information from the AUC traces files based on the selected day, and then computes the frequency of association with each access point to come up with the most frequent access point and the frequency value (MaxFreq) of this most frequent access point. This function also writes the access points information in a separate file with a format that suits the simulator.
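The frequency computation described above can be sketched in Python with a counter over the access point identifiers seen in the association records. The input representation (a plain list of access point IDs, one per association) and the function name are assumptions; reading the trace files and writing the output file are omitted.

```python
from collections import Counter


def most_frequent_ap(association_ap_ids):
    """Return (ap_id, max_freq, counts) for a sequence of associated AP IDs."""
    counts = Counter(association_ap_ids)
    if not counts:
        return None, 0, counts
    ap_id, max_freq = counts.most_common(1)[0]
    return ap_id, max_freq, counts


# Hypothetical sequence of associations for one day.
ap_id, max_freq, counts = most_frequent_ap([3, 7, 3, 1, 3, 7])
```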

SetPopOrder(List APs)

This function ranks the access points as per their computed popularity.

ComputeIntegrationValueAsLocationIndexforAP(List AP)

This function computes the IntegrationValue for each access point, then computes the Location Index for each access point.

ComputeIntegrationValueAsLocationIndex(List StreetList)

This function computes the Integration Value and the Location Index for each street on the AUC map.

double DistanceStreetToAStreet(int i, int j, List StList)

This function returns the distance between the two streets i and j based on their location coordinates stored within the streets information available in StList.

double EuclideanDist(double ix, double iy, double jx, double jy)

This function computes the Euclidean Distance between two points i and j whose coordinates are (ix, iy) and (jx, jy) respectively.
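The distance computation, together with the proximity check performed by IsCloseAP, can be sketched in Python as follows. Python has no ref parameters, so the sketch returns the distance alongside the boolean result; treating a distance exactly equal to the proximity threshold as close is an assumption.

```python
import math


def euclidean_dist(ix, iy, jx, jy) -> float:
    """Straight-line distance between points (ix, iy) and (jx, jy)."""
    return math.hypot(jx - ix, jy - iy)


def is_close_ap(xp, yp, xu, yu, proximity):
    """Return (is_close, distance) for a user at (xu, yu) and an AP at (xp, yp)."""
    distance = euclidean_dist(xp, yp, xu, yu)
    return distance <= proximity, distance
```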

ComputePopularityIndex(day, List StList, List Edges)

This function computes the popularity index of all the streets and the access points of a selected day as per AlJarhi’s paper [28].

CountAPcolocation(int userID, int APId, SortedList colocation)

This function counts the number of times the user with userID was associated with access point APId and stores this information in the list colocation.

ReadTraces(int iteration, day, ref SortedList traces, ref List Users, SortedList APs, SortedList coloc)

This function reads the user mobility traces for a selected day and stores their corresponding information in the following lists: traces, Users, APs and coloc.

WriteTraces(string filepath, SortedList traces)

This function writes the AUC traces information in a format that suits the simulator.

bool IsSpaceSyntax(ref SortedList colocation, ref List StreetList, int choice, double XThisPos, double YMThisPos, double XOtherPos, double YOtherPos, short numOfmsgs, int proximity, int dataset, ref List CloseAPs, out double PopThisUser, out double PopOtherUser, List UserPop, int thisuserID, int otheruserID, int msgID, DateTime timeslot, List APs, List Users)

This function computes the SpaceSyntax metric of both users whose IDs are thisuserID and otheruserID and whose location coordinates are (XThisPos, YThisPos) and (XOtherPos, YOtherPos), in order to output their popularity values; namely, PopThisUser and PopOtherUser. This function also returns a boolean value to indicate whether the popularity value of thisuser is greater than that of otheruser or not. The computed popularity value is based on the chosen Space Syntax metric, which is selected through the parameter choice.

FindPosition(int userID, ref double xpos, ref double ypos, int currenttime, SortedList traces, List APs, List Users)

This function returns the location coordinates (xpos, ypos) of the user with ID userID at time currenttime.

int FindAP(int userID, DateTime currenttime, SortedList traces, List APs, List Users)

This function returns the number of the access point with which the user of ID userID is associated at time currenttime according to the information extracted from the traces.

SortedList Addoccurrence(DateTime dateTime, int p, DateTime enddate)

This function inserts a new occurrence of association with the access point p for the duration from the timeslot dateTime till the timeslot enddate.

int CountAPfreq(string filepath, ref List APs)

This function counts the frequency of association with each one of the access points as recorded in the data file that is located in filepath.

double DistanceAPToAStreet(int AP, int streetindex, List StreetList, List APs)

This function computes the distance between the access point AP and the street number streetindex.

double ComputeUserPopIndex(double xu, double yu, ref List StreetList, double proximity)

This function returns the computed popularity Index of the node whose coordinates are (xu, yu).

double ComputePopIndexForPoint(int AP, List APs, List StreetList, int proximity)

This function returns the computed popularity Index of the access point AP.

Bibliography

[1] N. Vallina-Rodriguez et al., “Exhausting battery statistics understanding the energy demands on

mobile handsets,” in ACM SIGCOMM MobiHeld workshop Proceedings, 2010, pp. 9–14.

[2] V. Rao et al., “Battery model for embedded systems,” in VLSI Design, 2005. 18th International

Conference on, 2005, pp. 105–110.

[3] V. Kostakos, “Space Syntax and Pervasive Systems,” in Geospatial Analysis and Modeling of Urban

Structure and Dynamics. Springer Science, 2009.

[4] A. Carroll and G. Heiser, “An Analysis of Power Consumption in a Smartphone,” in Proceedings

of USENIX, 2010.

[5] R. Friedman and A. Kogan, “On Power and Throughput Tradeoffs of WiFi and Bluetooth in

Smartphones,” in IEEE INFOCOM Proceedings, 2011, pp. 900 – 908.

[6] A. Kearney, “The mobile economy 2013,” http://www.gsmamobileeconomy.com/GSMA Mobile Economy 2013.pdf.

[7] “Facebook statistics,” https://newsroom.fb.com/Key-Facts.

[8] “Youtube-statistic brain,” http://www.statisticbrain.com/youtube-statistics/.

[9] R. Kokku et al., “Opportunistic Alignment of Ad Delivery with Cellular Basestation Overloads,”

in MobiSys Proceedings, 2011, p. 267.

[10] A. Mashhadi et al., “Ad Hoc Networks Fair content dissemination in participatory DTNs,” Ad Hoc

Networks, vol. 10, no. 8, pp. 1633–1645, 2012.

[11] Y. Zhu et al., “A Survey of Social-based Routing in Delay Tolerant Networks: Positive and Negative

Social Effects,” Communications Surveys Tutorials, IEEE, vol. 15, no. 1, pp. 387–401, 2013.

[12] C.-M. Huang, K.-c. Lan, and C.-Z. Tsai, “A Survey of Opportunistic Networks,” in 22nd IEEE

AINA Workshops, 2008, pp. 1672–1677.

[13] M. Conti and I. National, “Opportunities in Opportunistic Computing,” IEEE Computer Society, vol. 43, no. 1, pp. 42–50, 2010.

[14] “Number of smartphone users worldwide from 2014 to 2019 (in millions),”

http://www.statista.com/statistics/330695/number-of-smartphone-users-worldwide/.

[15] “Number of smartphone users in the united states from 2010 to 2019 (in millions),” 2016. [Online].

Available: http://www.statista.com/statistics/201182/forecast-of-smartphone-users-in-the-us/

[16] “Usa population live - worldmeters,” 2016. [Online]. Available: http://www.worldometers.info/world-population/us-population/

[17] “Number of smartphone users in Middle East and Africa from 2014 to 2019 (in millions),” 2016. [Online]. Available: http://www.statista.com/statistics/494580/smartphone-users-in-middle-east-and-africa/

[18] “Number of smartphone users in Middle East and Africa from 2014 to 2019 (in millions),” 2016. [Online]. Available: http://www.statista.com/statistics/467163/forecast-of-smartphone-users-in-india/

[19] “India Population Live - Worldometers,” 2016. [Online]. Available: http://www.worldometers.info/

world-population/india-population/

[20] “The global information technology report 2015,” 2015. [Online]. Available: http://www3.

weforum.org/docs/WEF Global IT Report 2015.pdf

[21] L. Gao, S. Yu, T. Luan, and W. Zhou, “Delay Tolerant Networks Based Applications,” 2015.

[22] A. Mtibaa, M. May, and M. Ammar, “Social Forwarding in Mobile Opportunistic Networks: A Case

of PeopleRank,” in Communication and Social Networks. Springer, 2012, vol. 58, pp. 387–425.

[23] E. M. Daly and M. Haahr, “Social Network Analysis for Routing in Disconnected Delay-Tolerant

MANETs,” in 8th ACM MobiHoc Proceedings, 2007, pp. 32–40.

[24] P. Costa et al., “Socially-aware routing for publish-subscribe in delay-tolerant mobile ad hoc net-

works,” IEEE Journal on Selected Areas in Communications, vol. 26, no. 5, pp. 748–760, Jun.

2008.

[25] W.-j. Hsu, D. Dutta, and A. Helmy, “Profile-cast: Behavior-aware mobile networking,” in IEEE

WCNC, 2008, pp. 3033–3038.

[26] C. Chatfield, The Analysis of Time Series: An Introduction, 6th ed. Chapman and Hall/CRC,

2004.

[27] B. Pasztor et al., “Opportunistic mobile sensor data collection with SCAR,” in IEEE MASS, 2007,

pp. 1–12.

[28] A. Al Jarhi, H. A. Arafa, K. A. Harras, and S. G. Aly, “Rethinking opportunistic routing using

space syntax,” in Proceedings of the 6th ACM Workshop on Challenged Networks, ser. CHANTS

’11, 2011, pp. 21–26.

[29] A. Mtibaa and K. A. Harras, “Select & spray: Towards deployable opportunistic communication

in large scale networks,” in Proceedings of the 11th ACM International Symposium on Mobility

Management and Wireless Access, ser. MobiWac ’13. ACM, 2013, pp. 1–8.

[30] B. Hillier and J. Hanson, The Social Logic of Space. Cambridge University Press, 1984, Cambridge

Books Online. [Online]. Available: http://dx.doi.org/10.1017/CBO9780511597237

[31] P. Bellavista and S. Helal, “Location-Based Services: Back to the Future,” Pervasive Computing,

IEEE, vol. 7, no. 2, pp. 85–89, 2008.

[32] P. Lehsten, R. Zender, U. Lucke, and D. Tavangarian, “A Service-oriented Approach towards

Context-aware Mobile Learning Management Systems,” in PerCom ’10, 2010, pp. 268–273.

[33] N. Sambasivan, N. Rangaswamy, E. Cutrell, and B. Nardi, “UbiComp4D: Infrastructure and Inter-

action for International Development: the Case of Urban Indian Slums,” in UbiComp. Orlando:

ACM, 2009.

[34] G. Gay, Context-Aware Mobile Computing: Affordances of Space, Social Awareness, and Social

Influence. Morgan & Claypool, 2009.

[35] Y. Kompatsiaris et al., “Information Extraction from Social Sites,” in SSMS, 2010.

[36] D. Quercia, J. Ellis, and L. Capra, “Using Mobile Phones to Nurture Social Networks,” Pervasive

Computing, vol. 9, no. 3, pp. 12–20, 2010.

[37] M. Satyanarayanan, “Pervasive computing: vision and challenges,” Personal Communications,

IEEE, vol. 8, no. 4, pp. 10–17, Aug. 2001.

[38] N. Eagle and A. Pentland, “Social Serendipity: Mobilizing Social Software,” IEEE Pervasive Com-

puting, vol. 4, pp. 28–34, 2005.

[39] M. Baldauf, S. Dustdar, and F. Rosenberg, “A survey on context-aware systems,”

Int. J. Ad Hoc and Ubiquitous Computing, vol. 2, no. 4, pp. 263–277, 2007.

[40] D. M. Boyd and N. B. Ellison, “Social Network Sites: Definition, History, and Scholarship,” Journal

of Computer-Mediated Communication, vol. 13, no. 1, pp. 210–230, Oct. 2008.

[41] A. Beach, M. Gartrell, X. Xing, R. Han, Q. Lv, S. Mishra, and K. Seada, “Fusing mobile, sensor,

and social data to fully enable context-aware computing,” in HotMobile Workshop Proceedings,

2010, pp. 60–65.

[42] S. Al Ayyat, S. G. Aly, and K. A. Harras, “Social Pervasive Systems - The Integration of Social

Networks and Pervasive Systems,” in PECCS, 2013, pp. 118–124.

[43] D. Saha and A. Mukherjee, “Pervasive Computing: A Paradigm for the 21st Century,” Computer,

vol. 36, no. 3, pp. 25–31, 2003.

[44] I. Roussaki, M. Strimpakou, C. Pils, N. Kalatzis, and N. Liampotis, “Optimising

context data dissemination and storage in distributed pervasive computing systems,”

Pervasive and Mobile Computing, vol. 6, no. 2, pp. 218–238, 2010. [Online]. Available:

http://www.sciencedirect.com/science/article/pii/S1574119209000601

[45] A. Schmidt et al., “There is more to context than location,” Computers & Graphics, vol. 23, no. 6,

pp. 893–901, Dec. 1999.

[46] D. C. Dryer, C. Eisbach, and W. S. Ark, “At what cost pervasive? A social computing view of

mobile computing systems,” IBM Systems Journal, vol. 38, no. 4, pp. 652–676, 1999.

[47] O. D. Bruijn and R. Spence, “Serendipity within a Ubiquitous Computing Environment: A Case

for Opportunistic Browsing,” in UbiComp, G. D. Abowd, B. Brumitt, and S. A. N. Shafer, Eds.,

2001, pp. 362–369.

[48] G. Judd and P. Steenkiste, “Providing contextual information to pervasive computing

applications,” in PerCom ’03. IEEE Comput. Soc, 2003, pp. 133–142. [Online]. Available:

http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1192735

[49] B. Moltchanov, C. Mannweiler, and J. Simoes, “Context-Awareness Enabling New Business

Models in Smart Spaces,” in Smart Spaces and Next Generation Wired/Wireless Networking,

ser. Lecture Notes in Computer Science, S. Balandin, R. Dunaytsev, and Y. Koucheryavy,

Eds. Springer Berlin / Heidelberg, 2010, vol. 6294, pp. 13–25. [Online]. Available:

http://dx.doi.org/10.1007/978-3-642-14891-0_2

[50] K. Frank, N. Kalatzis, I. Roussaki, and N. Liampotis, “Challenges for context management

systems imposed by context inference,” in Proceedings of the 6th international workshop on

Managing ubiquitous communications and services, ser. MUCS ’09. New York, NY, USA: ACM,

2009, pp. 27–34. [Online]. Available: http://doi.acm.org/10.1145/1555321.1555329

[51] T. Awareness, F. Awareness, and I. H. Demonstrator, “Context AWARE mobile NEtworks and

ServiceS,” pp. 1–2, 2004. [Online]. Available: http://www.freeband.nl/project.cfm?language=en&id=494

[52] H.-l. Truong and S. Dustdar, “A Survey on Context-aware Web Service Systems,” International

Journal of Web Information Systems, vol. 5, no. 1, pp. 5–31, 2009.

[53] A. C. Santos, J. M. Cardoso, D. R. Ferreira, P. C. Diniz, and P. Chaínho,

“Providing user context for mobile and social networking applications,” Pervasive and

Mobile Computing, vol. 6, no. 3, pp. 324–341, Jun. 2010. [Online]. Available: http:

//linkinghub.elsevier.com/retrieve/pii/S1574119210000052

[54] A. Santos, L. Tarrataca, J. Cardoso, D. Ferreira, P. Diniz, and P. Chainho, “Context

Inference for Mobile Applications in the UPCASE Project,” in MOBILe Wireless MiddleWARE -

MOBILWARE, 2009, pp. 352–365. [Online]. Available: http://academic.research.microsoft.com/

Author/3318207.aspx

[55] B. Y. Lim and A. K. Dey, “Assessing demand for intelligibility in context-aware applications,”

in Proceedings of the 11th international conference on Ubiquitous computing - Ubicomp

’09. New York, New York, USA: ACM Press, 2009, p. 195. [Online]. Available:

http://portal.acm.org/citation.cfm?doid=1620545.1620576

[56] B. Adams, D. Phung, and S. Venkatesh, “Extraction of social context and application

to personal multimedia exploration,” in Proceedings of the 14th ACM MULTIMEDIA

’06. New York, New York, USA: ACM Press, 2006, p. 987. [Online]. Available:

http://portal.acm.org/citation.cfm?doid=1180639.1180857

[57] M. De Choudhury, W. A. Mason, J. M. Hofman, and D. J. Watts, “Inferring relevant social

networks from interpersonal communication,” in International conference on World wide web

Proceedings. New York, New York, USA: ACM Press, 2010, pp. 301–310. [Online]. Available:

http://portal.acm.org/citation.cfm?doid=1772690.1772722

[58] R. Lange, N. Cipriani, L. Geiger, M. Grossmann, H. Weinschrott, A. Brodt, M. Wieland,

S. Rizou, and K. Rothermel, “Making the World Wide Space happen: New challenges for

the Nexus context platform,” in PerCom ’09. IEEE, Mar. 2009, pp. 1–4. [Online]. Available:

http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4912782

[59] G. Adomavicius and A. Tuzhilin, “Context-Aware Recommender Systems,” in Recommender

Systems Handbook, F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, Eds. Springer US, 2011,

pp. 217–253. [Online]. Available: http://dx.doi.org/10.1007/978-0-387-85820-3_24

[60] J.-y. Hong, E.-h. Suh, and S.-J. Kim, “Context-aware systems: A literature review and

classification,” Expert Systems with Applications, vol. 36, no. 4, pp. 8509–8522, May 2009.

[Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S0957417408007574

[61] J. Hong, E.-H. Suh, J. Kim, and S. Kim, “Context-aware system for proactive personalized service

based on context history,” Expert Systems with Applications, vol. 36, no. 4, pp. 7448–7457, 2009.

[62] N. Ibrahim and F. Le Mouël, “A Survey on Service Composition Middleware in Pervasive

Environments,” International Journal of Computer Science, vol. 1, pp. 1–12, 2009.

[63] H. Schmidt, F. Flerlage, and F. J. Hauck, “A generic context service for ubiquitous

environments,” in PerCom ’09. IEEE, Mar. 2009, pp. 1–6. [Online]. Available: http:

//ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4912856

[64] S. B. Mokhtar, L. McNamara, and L. Capra, “A middleware service for pervasive social

networking,” in Proceedings of the International Workshop on Middleware for Pervasive Mobile

and Embedded Computing - M-PAC ’09. New York, New York, USA: ACM Press, 2009, pp. 1–6.

[Online]. Available: http://portal.acm.org/citation.cfm?doid=1657127.1657130

[65] J. Hyuk, P. A. Jianhua, M. A. Laurence, and A. K. Dey, “Special issue on Intelligent systems

and services for ubiquitous computing,” Personal and Ubiquitous Computing, vol. 13, pp. 445–447,

2009.

[66] R. Zender, U. Lucke, and D. Tavangarian, “SOA Interoperability for Large-scale Pervasive Envi-

ronments,” in 24th WAINA Workshops, 2010, pp. 545–550.

[67] P. Lukowicz, S. Pentland, and A. Ferscha, “From context awareness to socially aware computing,”

Pervasive Computing, IEEE, vol. 11, no. 1, pp. 32–41, 2012.

[68] N. Roy, T. Gu, and S. K. Das, “Supporting pervasive computing applications with active context

fusion and semantic context delivery,” Pervasive and Mobile Computing, vol. 6, no. 1, pp. 21–42,

Feb. 2010. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S1574119209000972

[69] D. Rosen, G. Barnett, and J. Kim, “Social networks and online environments: when science and

practice co-evolve,” Social Network Analysis and Mining, vol. 1, no. 1, pp. 27–42, 2011. [Online].

Available: http://dx.doi.org/10.1007/s13278-010-0011-7

[70] “Bulletin Board Systems.” [Online]. Available: http://www.livinginternet.com/u/ui_fidonet.htm

[71] “FidoNet History.” [Online]. Available: www.livinginternet.com/u/ui.htm

[72] “The Launch of MSN Messenger.” [Online]. Available: http://www.microsoft.com/presspass/

press/1999/jul99/messagingpr.mspx

[73] M. Kaddah, M. Mansour, R. El Kaliouby, and S. Al Ayyat, “From generic to a customized

framework: Paving the way for webct,” in Syllabus: Technology for Higher Education, 2002.

[74] S. Al-Ayyat, M. Bali, A. Ellozy, M. Kosheiry, M. Mansour, and W. Pappas, “Two years

into webct: Perceptions of auc students,” in 2nd international e-learning conference, The

American University in Cairo, January 2004. [Online]. Available: http://www.acs.aucegypt.edu/

Presentations/studsurvey.pdf

[75] O’Rourke, “Analysing Social Networks in e-Learning Systems,” Ph.D. dissertation, University of

Sydney, 2007. [Online]. Available: http://www.ee.usyd.edu.au/~rafa/UGthesis/07-ORourke.pdf

[76] A. J. Berlanga, P. B. Sloep, P. V. Rosmalen, and R. Koper, “Ad hoc transient communities:

towards fostering knowledge sharing in learning networks,” Int. J. Learning Technology, vol. 3,

no. 4, pp. 443–458, 2008.

[77] S. Graf, K. Maccallum, T.-c. Liu, M. Chang, D. Wen, Q. Tan, J. Dron, F. Lin, N.-s. Chen, and

R. Mcgreal, “An Infrastructure for Developing Pervasive Learning Environments,” in PerCom ’08,

2008, pp. 389–394.

[78] O. Awodele, S. Idowu, O. Anjorin, A. Adedire, and V. Akpore, “University Enhancement System

using a Social Networking Approach: Extending E-learning,” Informing Science and Information

Technology, vol. 6, pp. 269–283, 2009.

[79] R. Kumar, J. Novak, and A. Tomkins, “Structure and Evolution of Online Social Networks,” in Link

Mining: Models, Algorithms, and Applications. New York, NY: Springer New York, 2010, ch. 13,

pp. 611–617. [Online]. Available: http://www.springerlink.com/index/10.1007/978-1-4419-6515-8

[80] Y. Tian, J. Srivastava, T. Huang, and N. Contractor, “Social Multimedia Computing,” IEEE

Computer Society, vol. 43, no. 8, pp. 27–36, 2010.

[81] S. Mizzaro and L. Vassena, “A social approach to context-aware retrieval,” World Wide Web, pp.

1–29, 2011. [Online]. Available: http://dx.doi.org/10.1007/s11280-011-0116-6

[82] M. R. Morris, J. Teevan, and K. Panovich, “What Do People Ask Their Social Networks, and

Why? A Survey Study of Status Message Q&A Behavior,” in CHI 2010: Using Your Social

Network, Atlanta, 2010, pp. 1739–1748.

[83] M. Mani, A.-m. Nguyen, and N. Crespi, “SCOPE: A prototype for spontaneous P2P social net-

working,” in PERCOM Workshops, Mannheim, 2010, pp. 220–225.

[84] W. Luo, Q. Xie, and U. Hengartner, “FaceCloak: An Architecture for User Privacy on Social

Networking Sites,” in International Conference on Computational Science and Engineering, vol. 3,

2009, pp. 26–33.

[85] M. Terry, E. D. Mynatt, K. Ryall, and D. Leigh, “Social net: using patterns of physical

proximity over time to infer shared interests,” in Conference Extended Abstracts on Human Factors in

Computer Systems. ACM Press, 2002, pp. 816–817.

[86] N. Eagle and A. Pentland, “Reality mining: sensing complex social systems,” Personal and

Ubiquitous Computing, vol. 10, no. 4, pp. 255–268, 2006.

[87] J. Mendes, “SOCIALNETS: Social networking for pervasive adaptation,” The European Commis-

sion, Tech. Rep. 217141, 2008.

[88] A. Sapuppo and L. T. Sørensen, “Local Social Networks,” in Proceedings of TTA ’11, vol. 5, 2011,

pp. 15–22.

[89] E. Miluzzo, N. D. Lane, K. Fodor, R. Peterson, H. Lu, M. Musolesi, S. B. Eisenman, X. Zheng,

and A. T. Campbell, “Sensing Meets Mobile Social Networks: The Design, Implementation and

Evaluation of the CenceMe Application,” in SenSys, 2008, pp. 337–350.

[90] J. Mahmud, Y.-W. Huang, J. Ponzo, and R. Pollak, “Avara: a system to improve user experience

in web and virtual world,” in Proceedings of the 15th international conference on Intelligent user

interfaces, ser. IUI ’10. New York, NY, USA: ACM, 2010, pp. 349–352. [Online]. Available:

http://doi.acm.org/10.1145/1719970.1720028

[91] L. Vassena, “Context-aware retrieval going social,” in Symposium A Quarterly Journal In Modern

Foreign Literatures, no. Section 4, Padova, 2009.

[92] O. Kwon, “A social network approach to resolving group-level conflict in context-aware services,”

Expert Systems with Applications, vol. 36, no. 5, pp. 8967–8974, 2009.

[93] A. Garcia-Crespo et al., “SPETA: Social pervasive e-Tourism advisor,” Telematics and Informatics,

vol. 26, no. 3, pp. 306–315, 2009.

[94] W. den Broeck, C. Cattuto, A. Barrat, M. Szomszor, G. Correndo, and H. Alani, “The Live

Social Semantics application: a platform for integrating face-to-face presence with on-line social

networking,” in PERCOM Workshops, 2010, pp. 226–231.

[95] T. Lovett, E. O’Neill, J. Irwin, and D. Pollington, “The Calendar as a Sensor: Analysis and Im-

provement Using Data Fusion with Social Networks and Location,” in UbiComp 2010, Copenhagen,

2010, pp. 3–12.

[96] S. B. Mokhtar, A. J. Mashhadi, L. Capra, and L. McNamara, “A self-organising directory and

matching service for opportunistic social networking,” in Proceedings of the 3rd Workshop on

Social Network Systems, ser. SNS ’10. ACM, 2010, pp. 5:1–5:6.

[97] P. Hui et al., “Bubble Rap: Social-based forwarding in Delay Tolerant Networks,” in 9th ACM

MobiHoc Proceedings, 2008, pp. 241–250.

[98] N. Eagle, “Big data, global development, and complex social systems,” in Proceedings

of the eighteenth ACM SIGSOFT international symposium on Foundations of software

engineering, ser. FSE ’10. New York, NY, USA: ACM, 2010, pp. 3–4. [Online]. Available:

http://doi.acm.org/10.1145/1882291.1882293

[99] N. Bosilj, G. Bubaš, and M. Jadrić, “The influence of users’ attitudes regarding trust, privacy and

control on the adoption of mobile advertising,” in MIPRO, Opatija, Croatia, 2011, pp. 1420–1425.

[100] D. Cook and S. Das, “Pervasive computing at scale: Transforming the state of the

art,” Pervasive and Mobile Computing, vol. 8, no. 1, pp. 22–35, 2012. [Online]. Available:

http://www.sciencedirect.com/science/article/pii/S1574119211001416

[101] N. N. Eagle, “Machine Perception and Learning of Complex Social Systems,” Ph.D. dissertation,

Massachusetts Institute of Technology, 2005.

[102] M. G. Rubinstein, I. M. Moraes, M. E. M. Campista, L. H. M. K. Costa, and O. C. M. B. Duarte, “A

survey of wireless ad hoc networks,” in Mobile and Wireless Communication Networks, vol. 211.

IFIP The International Federation for Information Processing, 2006.

[103] F. Bai and A. Helmy, “Chapter 1: A Survey of Mobility Models,” in Wireless Adhoc

Networks, 2006, pp. 1–30.

[104] L. Junhai et al., “A survey of multicast routing protocols for mobile ad-hoc networks,” IEEE

Communications Surveys Tutorials, vol. 11, no. 1, pp. 78–91, 2009.

[105] G. K. Abdul Wahid and K. Ahmad, “Opportunistic networks: Opportunity versus challenges-

survey,” in Conference on Information Security Challenges, 2014.

[106] C. Boldrini et al., “Exploiting users social relations to forward data in opportunistic networks: The

HiBOp solution,” Pervasive and Mobile Computing, vol. 4, pp. 633–657, 2008.

[107] A. Gopalan and T. Znati, “PeerNet: a peer-to-peer framework for service and application de-

ployment in MANETs,” in 1st International Symposium on Wireless Pervasive Computing, Jan

2006.

[108] C. Toh, “Maximum battery life routing to support ubiquitous mobile computing in wireless ad hoc

networks,” IEEE Communications Magazine, pp. 138–147, 2001.

[109] H. Y. Bohari and S. O. Khanna, “Energy Efficient Power Aware Routing Algorithm (EEPARA)

For Mobile Ad Hoc Network (MANET),” IJERT, vol. 2, no. 7, pp. 1562–1572, 2013.

[110] G. Anastasi et al., “Energy conservation in wireless sensor networks: A survey,” Ad Hoc Networks,

vol. 7, pp. 537–568, 2009.

[111] A. Mtibaa and K. A. Harras, “Fairness-related challenges in mobile opportunistic networking,”

Computer Networks, vol. 57, no. 1, pp. 228–242, 2013. [Online]. Available: http:

//www.sciencedirect.com/science/article/pii/S1389128612003350

[112] S. A. Al Ayyat, “Social Pervasive Systems: The harmonization between social networking and

pervasive systems,” in IEEE PERCOM PhD Forum, March 2014, pp. 178–180, Best PhD Forum

Award.

[113] S. Al Ayyat, K. A. Harras, and S. G. Aly, “Interest Aware PeopleRank: Towards Effective Social-

Based Opportunistic Advertising,” in IEEE WCNC, 2013, pp. 4428–4433.

[114] R.-I. Ciobanu, R.-C. Marin, C. Dobre, V. Cristea, and C. Mavromoustakis, “ONSIDE: Socially-

aware and Interest-based dissemination in opportunistic networks,” in Network Operations and

Management Symposium (NOMS), 2014 IEEE, May 2014, pp. 1–6.

[115] A. Moghadam and H. Schulzrinne, “Interest-aware content distribution protocol for mobile

disruption-tolerant networks,” in World of Wireless, Mobile and Multimedia Networks Workshops,

2009. WoWMoM 2009. IEEE International Symposium on a, June 2009, pp. 1–7.

[116] C. Boldrini et al., “Modelling social-aware forwarding in opportunistic networks,” in Performance

Evaluation of Computer and Communication Systems, 2011, vol. 6821, pp. 141–152.

[117] C. Boldrini, M. Conti, and A. Passarella, “ContentPlace: Social-aware Data Dissemination in

opportunistic networks,” in ACM MSWiM, 2008, pp. 203–210.

[118] R. Soua and P. Minet, “A survey on energy efficient techniques in wireless sensor networks,” in

Wireless and Mobile Networking Conference (WMNC), 2011 4th Joint IFIP, Oct 2011, pp. 1–9.

[119] M. Maleki et al., “Power-aware source routing protocol for mobile ad hoc networks,” in ISLPED

Proceedings, 2002, pp. 1–29.

[120] P. Spachos, P. Chatzimisios, and D. Hatzinakos, “Energy aware opportunistic routing in wireless

sensor networks,” in Globecom Workshops (GC Wkshps), 2012 IEEE, Dec 2012, pp. 405–409.

[121] C. Chilipirea, A.-C. Petre, and C. Dobre, “Energy-aware social-based routing in opportunistic

networks,” Int. J. of Grid and Utility Computing, vol. 1, no. 3/4, p. 1, 2013.

[122] A. Roy, T. Acharya, and S. DasBit, “Energy-Aware Social-Based Multicast in Delay-Tolerant

Networks,” in The 81st IEEE Vehicular Technology Conference (VTC Spring), May 2015, pp. 1–5.

[123] B. Jiang, “Ranking spaces for predicting human movement in an urban environment,” International

Journal of Geographical Information Science, vol. 23, no. 7, pp. 823–837, Jul. 2009.

[124] A. Unlu, O. O. Ozener, T. Ozden, and E. Edgu, “An Evaluation of Social Interactive Spaces in a

University Building,” in Proceedings of the 3rd International Space Syntax Symposium, 2001.

[125] P. C. Dawson, “Analysing the effects of spatial configuration on human movement and social

interaction in Canadian Arctic communities,” in Proceedings of the 4th International Space Syntax

Symposium, 2003.

[126] A. Mtibaa and K. Harras, “Exploiting Space Syntax for Deployable Mobile Opportunistic Net-

working,” in IEEE MASS, 2013, pp. 533–541.

[127] S. Schuhmann, K. Herrmann, and K. Rothermel, “Efficient Resource-Aware Hybrid

Configuration of Distributed Pervasive Applications,” in Pervasive Computing: 8th International

Conference, Pervasive 2010, Helsinki, Finland, 2010, pp. 373–390. [Online]. Available:

http://www.springerlink.com/content/x6607vl0w9753683/

[128] S. R. Anton and H. A. Sodano, “A review of power harvesting using piezoelectric materials

(2003–2006),” Smart Materials and Structures, vol. 16, no. 3, p. R1, 2007. [Online]. Available:

http://stacks.iop.org/0964-1726/16/i=3/a=R01

[129]

[130] X. Lu, D. Niyato, P. Wang, and D. I. Kim, “Wireless charger networking for mobile devices:

fundamentals, standards, and applications,” IEEE Wireless Communications, vol. 22, no. 2, pp.

126–135, April 2015.

[131] O. C. Ozcanli, “Turning Body Heat Into Electricity,” Forbes, August 2010. [Online]. Available:

http://www.forbes.com/2010/06/07/nanotech-body-heat-technology-breakthroughs-devices.html

[132] T. Starner, “Human-powered wearable computing,” IBM Systems Journal, vol. 35, no. 3.4, pp.

618–629, 1996.

[133] S. Staff, “Harvesting Energy From Humans,” Popular Science, January 2009. [Online]. Available:

http://www.popsci.com/environment/article/2009-01/harvesting-energy-humans

[134] S. Al Ayyat, K. A. Harras, and S. G. Aly, “On the Integration of Interest and Power Awareness

in Social-Aware Opportunistic Forwarding Algorithms,” Computer Communications, pp. 97–110,

Nov. 2015.

[135] R. Vernica, M. J. Carey, and C. Li, “Efficient parallel set-similarity joins using mapreduce,” in

ACM SIGMOD Proceedings, 2010, pp. 495–506.

[136] A. Galati and C. Greenhalgh, “Human Mobility in Shopping Mall Environments,” in ACM MobiOpp

Workshop Proceedings, 2010.

[137] K. Lee et al., “SLAW: A Mobility Model for Human Walks,” in INFOCOM, 2009, pp. 855–863.

[138] “Internet Live Stats,” http://www.internetlivestats.com/internet-users/.

[139] K. A. Harras, K. C. Almeroth, and E. M. Belding, “Delay Tolerant Mobile Networks (DTMNs):

Controlled Flooding Schemes in Sparse Mobile Networks,” in International Federation for Information

Processing (IFIP) Networking, 2005.

[140] K. A. Harras and K. C. Almeroth, “Inter-Regional Messenger Scheduling in Delay Tolerant Mobile

Networks,” in IEEE WoWMoM, 2006.

[141] A. Mtibaa and K. Harras, “CAF: Community Aware Framework for Large Scale Mobile Oppor-

tunistic Networks,” Computer Communications, pp. 180–190, Jan. 2013.

[142] W.-j. Hsu et al., “CSI: A paradigm for behavior-oriented profile-cast services in mobile networks,”

Ad Hoc Networks, vol. 10, no. 8, pp. 1586–1602, 2012.

[143] A. Mtibaa and K. Harras, “FOG: Fairness in Mobile Opportunistic Networking,” in IEEE SECON,

2011.

[144] O. R. Helgason and K. V. Jónsson, “Opportunistic Networking in OMNeT++,” in Proceedings of

SIMUTools, 2008, pp. 82:1–82:8.

[145] A. Keränen et al., “The ONE Simulator for DTN Protocol Evaluation,” in Proceedings of

SIMUTools, 2009, pp. 55:1–55:10.

[146] S. Mehta et al., Network and System Simulation Tools for Next Generation Networks: A Case

Study. InTech, August 2010.

[147] “OPNET-Riverbed IEEE 802.15.4/ZigBee OPNET Simulation Model.” [Online]. Available:

http://www.open-zb.net/wpan_simulator.php

[148] I. Rhee et al., “CRAWDAD dataset ncsu/mobilitymodels (2009-07-23).”

[149] A.-K. Pietilainen and C. Diot, “CRAWDAD dataset thlab/sigcomm2009 (2012-07-15).”

[150] J. Scott et al., “CRAWDAD trace cambridge/haggle/imote/infocom2006 (2009-05-29).”

[151] C. Chilipirea, A.-C. Petre, C. Dobre, F. Pop, and G. Suciu, “A simulator for opportunistic

networks,” Concurrency and Computation: Practice and Experience, 2016.

[152] “OPNET-Riverbed.” [Online]. Available: http://www.riverbed.com/products/

performance-management-control/opnet.html

[153] M. S. Siraj et al., “Network simulation tools survey,” IJARCCE, vol. 1, no. 4, pp. 201–210, 2012.

[154] S. Al Ayyat, K. A. Harras, and S. G. Aly, “SAROS: A Social-Aware Opportunistic Forwarding

Simulator,” in IEEE WCNC, 2016.

[155] J. Manwell and J. McGowan, “Extension of the Kinetic Battery Model for Wind/Hybrid Power

Systems,” in Proceedings of EWEC, 1994, pp. 284–289.

[156] F. B. Abdesslem et al., “CRAWDAD dataset st_andrews/locshare (v. 2011-10-12),” Oct. 2011.

[157] A. Vahdat and D. Becker, “Epidemic routing for partially connected ad hoc networks,” Duke

University, Tech. Rep. CS-200006, 2000.

[158] S. Al Ayyat, S. G. Aly, and K. A. Harras, “PIPeR: Impact of Power-Awareness on Social-Based

Opportunistic Advertising,” in IEEE WCNC, April 2014.

[159] D. M. W. Powers, “Evaluation: From Precision, Recall and F-Measure to ROC, Informedness,

Markedness & Correlation,” Journal of Machine Learning Technologies, vol. 2, no. 1, pp. 37–63,

2011.

[160] A. J. Mashhadi et al., “Fair content dissemination in participatory DTNs,” Ad Hoc Networks,

vol. 10, no. 8, pp. 1633–1645, 2012.

[161] “Gnuplot homepage.” [Online]. Available: http://gnuplot.info/

[162] D. Danilov et al., “Modeling All-Solid-State Li-Ion Batteries,” Journal of The Electrochemical

Society, vol. 158, no. 3, pp. A215–A222, 2011.

[163] H. Kwak et al., “What is Twitter, a social network or a news media?” in WWW ’10 Proceedings,

2010, pp. 591–600.

[164] “Spearmans Rank Correlation,” http://www.real-statistics.com/correlation/spearmans-rank-

correlation/.

[165] H. Ma, T. C. Zhou, M. R. Lyu, and I. King, “Improving recommender systems by incorporating

social contextual information,” ACM Trans. Inf. Syst., vol. 29, no. 2, pp. 9:1–9:23, Apr. 2011.

[Online]. Available: http://doi.acm.org/10.1145/1961209.1961212

[166] M. T. Yanbo Ma, Chen Wang and Z. Han, “Content-Aware Transmission with Delay Threshold in

Heterogeneous Networks,” in IEEE WCNC, 2013, pp. 4481–4486.

[167] “The Zettabyte Era: Trends and Analysis,”

http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html.

[168] A. El Mougy and M. Ibnkahla, “A Cognitive WSN Framework for Highway Safety Based on

Weighted Cognitive Maps and Q-Learning,” in DIVANet, 2012, pp. 55–61.

[169] L. McNamara, C. Mascolo, and L. Capra, “Media sharing based on colocation prediction in urban

transport,” in Proceedings of the 14th ACM International Conference on Mobile Computing and

Networking, ser. MobiCom ’08. New York, NY, USA: ACM, 2008, pp. 58–69. [Online]. Available:

http://doi.acm.org/10.1145/1409944.1409953

[170] N. Eagle, A. Pentland, and D. Lazer, “Inferring Social Network Structure using Mobile Phone

Data,” Proceedings of the National Academy of Sciences (PNAS), vol. 106, no. 36,

pp. 15274–15278, 2009.

[171] M. A. Brandão, M. M. Moro, and G. R. Lopes, “Using Link Semantics to Recommend

Collaborations in Academic Social Networks,” in The International WWW Conference, 2013, pp. 833–840.

[172] X. Liu, Z. Guo, Z. Lin, and J. Ma, “A local social network approach for research

management,” Decision Support Systems, vol. 56, pp. 427–438, 2013. [Online]. Available:

http://www.sciencedirect.com/science/article/pii/S0167923612003302

[173] G. Groh, S. Birnkammerer, and V. Köllhofer, “Social recommender systems,” in Recommender

Systems for the Social Web, ser. Intelligent Systems Reference Library. Springer Berlin Heidelberg,

2012, vol. 32, pp. 3–42.

[174] L. Ding, Q. Tao, and P. Shi, “Link Analysis Application in Personalized E-learning Environment,”

International Conference on Computational Intelligence and Natural Computing, vol. 1, pp. 125–

128, 2009.

[175] H. Ma, T. C. Zhou, M. R. Lyu, and I. King, “Improving Recommender Systems by

Incorporating Social Contextual Information,” ACM Transactions on Information Systems, vol. 29, no. 2,

pp. 9:1–9:23, 2011.

[176] Y. Xu, X. Guo, J. Hao, J. Ma, R. Y. Lau, and W. Xu, “Combining social network

and semantic concept analysis for personalized academic researcher recommendation,”

Decision Support Systems, vol. 54, no. 1, pp. 564–573, 2012. [Online]. Available:

http://www.sciencedirect.com/science/article/pii/S0167923612001996

[177] P. Cremonesi, Y. Koren, and R. Turrin, “Performance of recommender algorithms on top-n rec-

ommendation tasks,” in Proceedings of the fourth ACM conference on Recommender systems, ser.

RecSys ’10. ACM, 2010, pp. 39–46.

[178] D. Carmel, N. Zwerdling, I. Guy, S. Ofek-Koifman, N. Har’el, I. Ronen, E. Uziel, S. Yogev, and

S. Chernov, “Personalized social search based on the user’s social network,” in Proceedings of the

18th ACM conference on Information and knowledge management, ser. CIKM ’09. ACM, 2009,

pp. 1227–1236.

[179] H. Gao, J. Tang, and H. Liu, “gSCorr: Modeling Geo-Social Correlations for New Check-ins on

Location-Based Social Networks,” in the 21st ACM CIKM, 2012.

[180] M. Ficek, “CRAWDAD dataset ctu/personal (v. 2012-03-15),” Downloaded from

http://crawdad.cs.dartmouth.edu/ctu/personal, Mar. 2012.

[181] K. Lee, B. D. Eoff, and J. Caverlee, “Seven Months with the Devils: A Long-Term Study of Content

Polluters on Twitter,” in Proceedings of the 5th International AAAI Conference on Weblogs and

Social Media (ICWSM), July 2011.

[182] M. Gjoka et al., “Walking in Facebook: A Case Study of Unbiased Sampling of OSNs,” in Pro-

ceedings of IEEE INFOCOM ’10, San Diego, CA, March 2010.
