From Online Social Network Analysis to a User-Centric Private Sharing System
Total Page:16
File Type:pdf, Size:1020Kb
From online social network analysis to a user-centric private sharing system by © Arastoo Bozorgi A thesis submitted to the School of Graduate Studies in partial fulfilment of the requirements for the degree of Doctor of Philosophy Department of Computer Science Memorial University of Newfoundland July 2020 St. John’s Newfoundland Abstract Online social networks (OSNs) have become a massive repository of data con- structed from individuals’ inputs: posts, photos, feedbacks, locations, etc. By analyzing such data, meaningful knowledge is generated that can affect individuals’ beliefs, desires, happiness and choices — a data circulation started from individuals and ended in individuals! The OSN owners, as the one authority having full control over the stored data, make the data available for research, advertisement and other purposes. However, the individuals are missed in this circle while they generate the data and shape the OSN structure. In this thesis, we started by introducing approximation algorithms for finding the most influential individuals in a social graph and modeling the spread of information. To do so, we considered the communities of individuals that are shaped in a social graph. The social graph is extracted from the data stored and controlled centrally, which can cause privacy breaches and lead to individuals’ concerns. Therefore, we introduced UPSS: the user-centric private sharing system, in which the individuals are considered as the real data owners and provides secure and private data sharing on untrusted servers. The UPSS’s public API allows the application developers to implement appli- cations as diverse as OSNs, document redaction systems with integrity properties, censorship-resistant systems, health care auditing systems, distributed version con- trol systems with flexible access controls and a filesystem in userspace. Accessing users’ data is possible only with explicit user consent. We implemented the two later cases to show the applicability of UPSS. Supporting different storage models by UPSS enables us to have a local, remote and global filesystem in userspace with one unique core filesystem implementation and having it mounted with different block stores. By designing and implementing UPSS, we show that security and privacy can be addressed at the same time in the systems that need selective, secure and ii collaborative information sharing without requiring complete trust. iii Acknowledgements I have spent almost five years of my life sitting in my office and working on my thesis until I got to this point writing this page, that is the last words of my thesis. With the support of my supervisors, family and friends, all the days of this long journey were sweet and memorable for me. I would like to thank my supervisors, Dr. Jonathan Anderson, who helped me to find my way when I was almost lost, Dr. Saeed Samet, who taught me patience and supportiveness, and Dr. Antonina Kolokolova for her wise suggestions. You taught me how to overcome the challenges in both my personal and academic life. Thanks to my mother, who never gives up with her kind support from thousands of miles away and keeping hope alive in me. Thanks to my father, who was always encouraging me to continue my studies when he was in this world. He devoted himself to his family and kept the smile on her lips even in his hard days. I also thank my brother and sisters for being a mentor for their younger brother for his entire life. I thank Dr. Todd Wareham that his office door is always open to everyone and he welcomes everyone with his bowl of candy. I never forget his help in proving the NP − hardness of one of my problems while he was busy writing a book. And finally, I would like to thank my friends for being with me whenever I need them. iv Contents Abstract ii Acknowledgements iv List of Tables x List of Figures xi 1 Introduction and Overview 1 1.1 Definitions . .2 1.2 From social influence to influence maximization . .3 1.3 Private online social network . .6 1.4 Private filesystem . .7 1.5 Contributions . .9 2 INCIM: A community-based algorithm for influence maximization prob- 2.1lem under Related the Work linear . threshold . model . 22 19 2.2 Preliminaries . 24 2.3 INCIM algorithm . 25 2.3.1 Algorithm overview . 26 2.3.2 Algorithm description . 27 2.3.2.1 Preprocessing step . 28 2.3.2.2 Computing spread of nodes . 28 v 2.3.2.3 Seed nodes selection . 34 2.4 Evaluation . 38 2.4.1 Datasets . 38 2.4.2 Algorithms to compare . 38 2.4.3 Experimental results . 40 2.5 Conclusion and future work . 45 3 Community-based influence maximization in social networks under a 3.1competitive Background linear and threshold Related model Work . 53 51 3.2 Propagation model and algorithm . 55 3.2.1 DCM propagation model . 55 3.2.2 NP-hardness of competitive influence improvement under DCM................................. 58 3.2.3 Community-based algorithm . 62 3.3 Evaluations . 65 3.3.1 Experiments setup . 66 3.3.2 Experimental Results . 69 3.4 Conclusion . 75 44.1 Privacy-preserving Preliminaries . Online . .Social . Networks: . Challenges. .and . Solutions . 85 82 4.1.1 Online Social Networks (OSNs) . 85 4.1.2 Peer-to-Peer (P2P) overlays . 86 4.1.3 Group Key Management . 87 4.2 Privacy Solutions . 88 4.2.1 Centralized OSNs . 89 4.2.2 Decentralized OSNs (Peer-to-Peer) . 95 4.2.3 Hybrid OSNs . 102 4.2.4 Healthcare online social networks . 104 4.3 Discussion . 106 vi 4.4 Recommended Privacy Solution . 108 4.4.1 Encryption mechanism . 110 4.4.2 Topology . 113 4.4.3 Data availability . 115 4.4.4 Searching . 120 4.5 Requirements of a privacy-enabled approach for healthcare online social networks . 121 4.5.1 Privacy Policies . 122 4.5.2 Friend and group recommendation . 125 4.6 Conclusion . 126 55.1 Challenges Motivation: in designing use cases a distributed . cryptographic. file . system. 138 137 5.1.1 Online social network . 139 5.1.2 Censorship-resistant social networking . 141 5.1.3 Redaction with integrity . 142 5.1.4 Health care data sharing . 143 5.2 UPSS: the user-centric private sharing system . 144 5.2.1 Immutable DAGs . 146 5.2.2 Mutable Filesystem API . 147 5.2.3 Share Control Module . 148 5.2.4 VFS layer(s) . 151 5.3 Related Work . 151 5.3.1 Online Social Network Systems and Tool . 152 5.3.2 File systems . 154 5.4 Conclusion . 156 66.1 UPSS: Background the user-centric . private . sharing . .system . 165 163 6.1.1 Preliminaries . 165 6.2 UPSS: the user-centric private sharing system . 166 vii 6.2.1 BlockStore layer . 167 6.2.2 Immutable DAGs . 169 6.2.3 Mutable Filesystem API . 170 6.3 Case studies . 172 6.3.1 Benchmark description . 173 6.3.2 UPSS as a local filesystem . 175 6.3.2.1 Snapshots and consistency . 177 6.3.2.2 Performance comparisons . 178 6.3.2.3 Deduplication . 180 6.3.2.4 Macro-benchmark . 185 6.3.3 UPSS as a network filesystem . 185 6.3.3.1 Remote block store . 187 6.3.3.2 Performance comparison . 188 6.3.4 UPSS as a global filesystem . 189 6.3.4.1 Performance comparison . 189 6.3.5 UVC: UPSS Version Control System . 194 6.3.5.1 Version Control Server . 199 6.3.5.2 Version Control Client . 200 6.3.5.3 Version Control System Evaluations . 201 6.4 Related work . 207 6.5 Future work . 209 6.5.1 FFI . 209 6.5.2 Expansion of Version Control System . 209 6.5.3 Parallelism . 210 6.5.4 Structured files . 210 6.5.5 Hiding access patterns . 211 6.6 Conclusion . 211 viii 7 Summary 220 7.1 Future work . 223 ix List of Tables 2.1 Specifications of real standard datasets . 39 2.2 Specification of communities of the datasets . 39 2.3 Speed up in different datasets . 44 3.1 Specifications of real standard datasets. 66 3.2 Specification of communities in the real standard datasets. 66 3.3 MC and CI2 comparison . 71 4.1 Centralized privacy approaches . 94 4.2 Decentralized privacy approaches . 102 4.3 The online time patterns for three nodes . 118 5.1 Use cases’ requirements . 139 x List of Figures 2.1 A sample network with its communities . 33 2.2 Memory usage comparisons . ..