Preserving the Privacy in Online Social Networks Using Enhanced Clustering Algorithms

Rumman Ahmed Outline

▰ Introduction ▰ Related Work ▰ Problem Statement ▰ Experimental Results ▰ Conclusion

2 Introduction

▰ The role of social networks for the modern population ▰ Public and private data sharing by users ▰ Collection of data from social networks ▰ Protecting the anonymity of a social network

3 Introduction - Public Data Sharing

4 Introduction

▰ The role of social networks for the modern population ▰ Public and private data sharing by users ▰ Collection of data from social networks ▰ Protecting the anonymity of a social network

5 Related Work

▰ Sequentially clustering to anonymize a network ▰ The k-member clustering problem ▰ Data and structural k-anonymity in social networks

6 Problem Statement

▰ A social network can be represented as a graph ▰ Clustering a social network..the issue at hand

7 Figure 1. A social network as a graph Problem Statement

▰ Preventing single user clusters ▰ Keeping information loss at a minimum ▰ Increasing degree of anonymity

8 Figure 1. A social network as a graph Problem Statement - The Algorithm in Use

9 Experimental Results

▰ Traditional clustering algorithms compared to our enhanced algorithms ▰ Metrics ▻ Information Loss ▻ Degree of Anonymization ▻ Running Time

10 Experimental Results

▰ Experiment Details ▻ Dataset: 5000 Yelp users ▻ Enhanced versions of K-Means, Mean-Shift, and Affinity Propagation ▻ Implemented in Python

11 Experimental Results - Tables

12 Experimental Results - Graphs

13 Result Analysis

▰ Information loss kept at a minimum ▰ Increase of the degree of anonymization ▰ Running time comparisons, enhanced algorithms impractical for large data

14 Conclusion

▰ Importance of privacy and anonymity in modern time ▰ Current solution of clustering has a drawback ▰ New proposed solution shows promising results and a stepping stone for the anonymity problem

15 References

1. T. A. Pradesh, Sequential Clustering for Anonymizing Social Networks. International Journal of Information & Computation Technology, 2014. 2. A. Campan og T. M. Truta, Data and Structural k-Anonymity in Social Networks. 2008. 3. J. Byun, A. Kamra, E. Bertino, and N. Li. (2007) “Efficient k-Anonymization Using Clustering Techniques,” In: Kotagiri ., Krishna P.R., Mohania M., Nantajeewarawat E. (eds) Advances in Databases: Concepts, Systems and Applications. DASFAA 2007. Lecture Notes in Computer Science, vol 4443. Springer, Berlin, Heidelberg 4. L. Sweeney, Uniqueness of Simple Demographics in the U.S. Population. Laboratory for Int’l Data Privacy (LIDAP-WP4), 2000. 5. "U.S. population social media penetration 2017 | Statistic", Statista, 2017. [Online]. Available: https://www.statista.com/statistics/273476/percentage-of-us-population-with-a-social-network-p rofile/. 6. "Yelp Dataset", Yelp.com, 2018. [Online]. Available: https://www.yelp.com/dataset/documentation/json. 7. "ELKI Framework", Elki-project.github.io. [Online]. Available: https://elki-project.github.io/. 8. A. Nawroth, "Social Networks in the Database: Using a Graph Database", Neo4j Graph Database Platform, 2009. [Online]. Available: https://neo4j.com/blog/social-networks-in-the-database-using-a-graph-database/. 16 Questions?

17