Social Network Analysis and Time Varying Graphs
by Amir Afrasiabi Rad

Thesis submitted to the Faculty of Graduate and Postdoctoral Studies
In partial fulfillment of the requirements
For the Ph.D. degree in Computer Science

School of Electrical Engineering and Computer Science
Faculty of Engineering
University of Ottawa

© Amir Afrasiabi Rad, Ottawa, Canada, 2016

Abstract

The thesis focuses on the social web and on the analysis of social networks with particular emphasis on their temporal aspects. Social networks are represented here by Time Varying Graphs (TVG), a general model for dynamic graphs borrowed from distributed computing.

In the first part of the thesis we focus on the temporal aspects of social networks. We develop various temporal centrality measures for TVGs, including betweenness, closeness, and eigenvector centralities, which are well known in the context of static graphs. Unfortunately, the computational complexity of these temporal centrality metrics is not comparable with that of their static counterparts. For example, the computation of betweenness becomes intractable in the dynamic setting. For this reason, approximation techniques will also be considered. We apply these temporal measures to two very different datasets: one in the context of knowledge mobilization in a small community of university researchers, the other in the context of Facebook commenting activities among a large number of web users. In both settings, we perform a temporal analysis so as to understand the importance of the temporal factors in the dynamics of those networks and to detect nodes that act as "accelerators".

In the second part of the thesis, we focus on a more standard static graph representation. We conduct a propagation study on YouTube datasets to understand and compare the propagation dynamics of two different types of users: subscribers and friends.
Finally, we conclude the thesis with the proposal of a general framework to present, in a comprehensive model, the influence of the social web on e-commerce decision making.

Acknowledgements

First and foremost, my most sincere thanks go to Prof. Flocchini, who gave me the opportunity to join her team and provided me with remarkable scientific and moral support that most students would envy. Her ideas, guidance, and recommendations always guided me in the right direction, and her knowledge lit the dark corners of the route to the PhD. Words fall short in expressing my gratitude; there is no doubt that without her precious support it would not have been possible to conduct this research.

Moreover, I would like to thank Ms. Joanne Gaudet, who shared her data, comments, and knowledge with me during the course of this research. I also thank my fellow lab-mate and dear friend, Dr. Amir Rahnamai Barghi, who has always been a great support in tough times. He patiently provided me with directions to solve hard mathematical issues that I encountered during the course of this research. I also thank him for the stimulating discussions, for the long days we spent working together, and for all the fun we have had in the last two years. I also thank my friends in the Distributed Computing Lab, especially William Doan, for sharing his ideas with me. I also thank Prof. Morad Benyoucef for his support in the course of this study.

One of my sincerest thanks goes to my family: my parents and my sisters, for supporting me spiritually throughout the writing of this thesis and throughout my life in general. Last but not least, I would like to thank all my friends in Iran and Canada, whose company energized me every step of the way.

In conclusion, I recognize that this research would not have been possible without the financial assistance of MITACS and IBM Canada Inc., and I express my gratitude to those agencies.

Dedication

I dedicate my dissertation work to my family.
A special feeling of gratitude to my loving parents, Abbas and Marziyeh, whose words of encouragement and push for tenacity have been the fuel for my enthusiasm for research. And to my sisters, Parvaneh and Fatemeh, who I feel never left my side despite being thousands of kilometres away.

Table of Contents

List of Tables
List of Figures

1 Introduction
  1.1 Motivations and Goals
  1.2 Contributions
  1.3 Organization of the Thesis

Nomenclature

2 Background
  2.1 On-line Social Networks
    2.1.1 Graph's Terminology
    2.1.2 Structure of Social Networks
    2.1.3 Social Network and Communities
    2.1.4 Characteristics of Social Networks
  2.2 Social Influence
    2.2.1 Sociometric Techniques for Ranking
  2.3 Conclusion

3 Time-Varying Graphs and Temporal Metrics
  3.1 Time Varying Graphs
  3.2 Temporal Concepts
    3.2.1 The Underlying Graph
    3.2.2 Points of View
    3.2.3 Journeys
    3.2.4 Connectivity
  3.3 Temporal Metrics
    3.3.1 Degree
    3.3.2 Eccentricity and Diameter
    3.3.3 Closeness
    3.3.4 Temporal Katz Score
    3.3.5 Temporal Betweenness
    3.3.6 Clustering Coefficient
    3.3.7 Modularity
  3.4 Conclusion

4 Computation of Temporal Measures
  4.1 Temporal betweenness
  4.2 Temporal Shortest Betweenness
  4.3 Temporal Foremost Betweenness
    4.3.1 General Algorithm
    4.3.2 Algorithm for Zero latency and Instant edges
  4.4 Temporal Eigenvector Centrality
    4.4.1 Adjacent Degree Induced Eigenvector Centrality (adi)
    4.4.2 Self Degree Induced Eigenvector Centrality (sdi)
    4.4.3 Examples
  4.5 Conclusion

5 Temporal Analysis of a Knowledge Mobilization Network
  5.1 Introduction
  5.2 Knowledge-Net Data description
  5.3 Design of The Study
  5.4 Analysis of consecutive snapshots
  5.5 Temporal Growing Betweenness Centrality
  5.6 Foremost Betweenness of Knowledge-Net
    5.6.1 Foremost Betweenness during the lifetime of the system
    5.6.2 A Finer look at foremost betweenness
  5.7 Invisible Rapids and Brooks
  5.8 Conclusion

6 Temporal Analysis of a Facebook Network
  6.1 Introduction
  6.2 Data Description
  6.3 Design of the study
    6.3.1 Static Betweenness Centrality of Bridges
    6.3.2 Temporal Betweenness Centrality of Bridges
  6.4 Static Analysis of the Facebook Dataset
    6.4.1 Facebook Static Analysis: snapshot approach
    6.4.2 Facebook Static Analysis: aggregated approach
  6.5 Foremost Betweenness of Bridges
    6.5.1 Foremost Betweenness during the lifetime of the system
  6.6 Foremost Betweenness of Bridges in time intervals
  6.7 Rapids and Brooks in Facebook Dataset
  6.8 Temporal Eigenvector Centrality of Facebook Graph
    6.8.1 Temporal Eigenvector Centrality in The System Lifetime
    6.8.2 Shockers and Breakers
  6.9 Conclusions

7 Propagation Study in YouTube
  7.1 YouTube Social Network
  7.2 Data Collection
    7.2.1 YouTube Statistics
    7.2.2 Limitations in Data Collection
  7.3 Propagation in YouTube
  7.4 Propagation and Popularity in YouTube
    7.4.1 Propagation and popularity in friendship network
    7.4.2 Propagation and popularity in subscription network
  7.5 Discussion on YouTube Propagation
  7.6 Interest Similarity and Ties in YouTube
    7.6.1 Similarity Measures and Functions
    7.6.2 Data Description
    7.6.3 Analysis of Similarities
  7.7 Discussion
  7.8 Conclusion

8 Social Commerce: a Platform Founded on SNA
  8.1 Understanding Social Commerce
    8.1.1 Need Recognition
    8.1.2 Product Brokerage
    8.1.3 Merchant Brokerage
    8.1.4 Purchase Decision
    8.1.5 Purchase
    8.1.6 Evaluation
  8.2 Conclusion

9 Conclusions