Ii Detecting Location Spoofing in Social Media
Total Page:16
File Type:pdf, Size:1020Kb
Detecting Location Spoofing in Social Media: Initial Investigations of an Emerging Issue in Geospatial Big Data Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University By Bo Zhao Graduate Program in Geography The Ohio State University 2015 Dissertation Committee: Daniel Z. Sui, Advisor Harvey Miller Ningchuan Xiao ii Copyrighted by Bo Zhao 2015 iii Abstract Location spoofing refers to a variety of emerging online geographic practices that allow users to hide their true geographic locations. The proliferation of location spoofing in recent years has stirred debate about the reliability and convenience of crowd-sourced geographic information and the use of location spoofing as an effective countermeasure to protect individual geo-privacy and national security. However, these polarized views will not contribute to a solid understanding of the complexities of this trend. Even today, we lack a robust method for detecting location spoofing and a holistic understanding about its multifaceted implications. Framing the issue from a critical realist perspective using a hybrid methodology, this dissertation aims to develop a Bayesian time geographic approach for detecting this trend in social media and to contribute to our understanding of this complex phenomenon from multiple perspectives. The empirical results indicate that the proposed approach can successfully detect certain types of location spoofing from millions of geo-tagged tweets. Drawing from the empirical results, I further qualitatively examine motivation for spoofing as well as other generative mechanisms of location spoofing, and discuss its potential social implications. Rather than conveniently dismissing the phenomenon of location spoofing, this dissertation calls on the GIScience community ii to tackle this controversial issue head-on, especially when legal decisions or political policies are reached using data from location-based social media. Only then can we promote more effective and trustworthy geographic practices in the age of big data. iii Dedication Dedicated to the students at The Ohio State University iv Acknowledgments Over the past three years, I have received support and encouragement from a great number of individuals. I wish to offer my most heartfelt thanks to the following people. Dr. Daniel Sui has been a mentor, colleague, and friend. We first met during one of the most difficult times of my life, and he was there to listen and to offer guidance. His patience, flexibility, genuine caring, and faith in me during the dissertation process enabled me to earn my Ph.D. He is motivating, encouraging, and enlightening. He has never judged nor pushed even when he knew I was in the midst of juggling priorities. Thank you for the advice and support to pursue research topics about which I am truly passionate. I see the same drive and passion in your own research efforts, and I thank you for letting me do the same. I am very grateful to the remaining members of my dissertation committee, Dr. Harvey Miller and Dr. Ningchuan Xiao. Their academic support, input and encouragement are greatly appreciated. Moreover, Dr. Ola Ahlqvist served on my exam committee and provided valuable advice on the literature review. I must thank Dr. Morton O’Kelly who allocated a spacious working environment for me in CURA and other logistics. v To Jake Carr, Yapin Liu, Yuxi Zhao, Xiang Chen, and Shaun Fontanella, you have generously shared your time and ideas. I have learned much through our conversations. To Julia Shaw, Young Rae Choi, Samuel Kay, Lili Wang, I am grateful for the countless hours spent to proof read working drafts. To Xining Yang, I have always been able to turn to you when I have needed a sympathetic ear, an honest opinion, or someone to just grab lunch with. To Ying Song, you have been a critical and insightful reviewer of all my draft. You were always there with green tea latte or cookies whenever I got exhausted. To Ruoran Cheng, your selfless help made me concentrate on completing the dissertation. Without the love and support of my family, this dissertation would never have seen the light of day. I am particularly grateful to my parents and in-laws, who have always inspired me to follow my heart and honor my deepest passions. Thank you. I extend the greatest gratitude to my wife, Yuanyuan Duan, for encouraging me to follow my dreams, even when that has meant making some unimaginable sacrifices. You have been a constant source of strength, joy, and love. vi Vita 1998 …………..…………..…… NPU Middle School, Xi’an, China 2002…………..…………..…… B.S. and M.S. Cartography and Geographic Information System, Nanjing University, China 2008…………..…………..…… Graduate Assistant, Department of City and Regional Planning, University of Florida 2009…………..…………..…… Research Assistant, Center for Geographic Analysis, Harvard University 2012 to present…………..…….. Graduate Associate, Department of Geography, The Ohio State University Publications S.J. Kay, B. Zhao and D.Z. Sui (2014) Can Social Media Clear the Air? A Case Study of the Air Pollution Problem. Professional Geographer, 351-363. D.Z. Sui, Z.W. Xu and B. Zhao (2013) Networks in Geography. In R. Alhajj and J. Rokne, Encyclopedia of Social Network Analysis and Mining (ESNAM) Berlin, Springer. D.Z. Sui and B. Zhao (2013) GIS as Media: Geographies of Media and Mediated Geographies and through the GeoWeb. In S.P. Mains, J. Cupples and C. Lukinbeal, Mediated Geographies and Geographies of Media. Netherlands, Springer. vii Fields of Study Major Field: Geography viii Table of Contents Abstract ............................................................................................................................... ii Dedication .......................................................................................................................... iv Acknowledgments............................................................................................................... v Vita .................................................................................................................................... vii Table of Contents ............................................................................................................... ix List of Tables ................................................................................................................... xiii List of Figures .................................................................................................................. xiv Chapter 1: Introduction ....................................................................................................... 1 1.1 Motivations and contexts ....................................................................................... 1 1.2 Objectives .............................................................................................................. 4 1.3 Synopsis of the dissertation ................................................................................... 5 Chapter 2: Related Work .................................................................................................... 8 2.1 Definition and relevant vocabulary ....................................................................... 8 2.1.1 Case studies: Spoofing as a geographic phenomenon ..................................... 9 ix 2.1.2 Connotations and denotations ....................................................................... 14 2.1.3 Spoofing process and related elements ......................................................... 24 2.1.4 Locational inconsistency ............................................................................... 28 2.1.5 Spoofing motivation ...................................................................................... 30 2.2 Understanding location spoofing based on geographic knowledge .................... 33 2.2.1 Linguistic fallacy ........................................................................................... 33 2.2.2 Ontology for understanding location spoofing .............................................. 37 2.2.3 Necessity of geographic knowledge for detecting location spoofing............ 41 2.3 Time geography ................................................................................................... 44 2.3.1 Binary models: Space-time prism ................................................................. 45 2.3.2 Frequentist models: Prison/prism break ........................................................ 47 2.3.3 Bayesian models: A potential direction ........................................................ 49 2.4 McLuhan’s laws of media ................................................................................... 52 2.5 Summary and research framework ...................................................................... 54 Chapter 3: Methodology ................................................................................................... 58 3.1 Crawling and harvesting social media feeds ....................................................... 59 3.2 Detecting fake locational information ................................................................. 64 3.2.1 Prior beliefs: Individual activity .................................................................... 69 3.2.2 Posteriori probability: Human appearance .................................................... 71 x 3.3 Profiling spoofing motivation .............................................................................