The Pennsylvania State University
The Graduate School
College of Earth and Mineral Sciences

MAPPING SEMANTIC AND SPATIAL MEDIASCAPES IN THE CATALONIAN INDEPENDENCE MOVEMENT: GEOPOLITICS, SPORTS, AND BLACK BOXES

A Dissertation in Geography
by
Samuel K. Stehle

© 2017 Samuel K. Stehle

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

December 2017

The dissertation of Samuel K. Stehle was reviewed and approved* by the following:

Donna J. Peuquet
Professor Emeritus of Geography
Dissertation Advisor
Chair of Committee

Clio Andris
Assistant Professor of Geography

Deryck Holdsworth
Professor Emeritus of Geography

Burt L. Monroe
Professor of Political Science, Social Data Analytics, and Information Sciences

Cynthia Brewer
Professor of Geography
Head of the Department of Geography

*Signatures are on file in the Graduate School

Abstract

This dissertation explores the geographic and semantic spaces of local and international news media reporting on the Catalonian independence movement of fall 2015. It shows that the media intersects several geographic themes through its connections between Catalonia, sports, and other independence-seeking movements, most notably in Scotland. These connections are largely dictated by two factors: the scale and location of the media source, and the themes which these sources discuss in reference to the Catalonian movement. This project uses data-driven thematic analysis through Latent Dirichlet Allocation (LDA) to determine the primary topics present in the media. LDA defines probabilistic topics based on co-occurring terms within clusters of documents, and relies on several key parameters to generate a usable result. The highly variable semantic spaces which result from parameter combinations are scrutinized via this dissertation’s introduction and implementation of ‘interestingness’ measures.
Interestingness measures define related but separate methods for evaluating the usability of data-driven results when multiple valid results are generated. Thus, this dissertation offers new methods to GIScience for evaluating data-driven methods. This dissertation then maps the semantic spaces discovered through the LDA process onto geographic space via the placenames present in the media reports. Modern, digital, and global media emphasizes unique places connected through the Catalonian independence context and the media which reports on it. Networks of soccer (via competing local and national teams, the nationalities of athletes, and international sponsorship deals) emerge alongside those of media conglomerates throughout Europe and the world. The linking of semantic and geographic spaces in this analysis generates new ways of understanding the impacts of globalized media.

Table of Contents

LIST OF TABLES
LIST OF FIGURES
LIST OF EQUATIONS
ACKNOWLEDGEMENTS

1. Chapter 1:
  1.1 Introduction
  1.2 Problem statement
    1.2.1 Research Objective
    1.2.2 Catalan Independence
    1.2.3 Topic Modeling
    1.2.4 Contributions
  1.3 Dissertation outline

2. Chapter 2:
  2.1 Introduction
  2.2 Big Data
    2.2.1 Big data – paradigm-shifting for social science
    2.2.2 Theory and Data-Driven Research
    2.2.3 Big Data and Media
    2.2.4 Big Data and Politics: Event Data
    2.2.5 Evaluation
      2.2.5.1 Evaluation issues
      2.2.5.2 Interestingness
  2.3 Geopolitics in Media and Sport
    2.3.1 Geography in the Media
    2.3.2 Geography through the Media
    2.3.3 Popular Geopolitics
    2.3.4 Sports and Popular Geopolitics
    2.3.5 Sports and National Identity
    2.3.6 Political Relationships Influence the Framing of Sports Rivalry
  2.4 Geography and Text: Computational Methods
    2.4.1 Placename Disambiguation
    2.4.2 Mining Big Text Data: Geographic Information Retrieval
    2.4.3 Thematic text analysis
  2.5 Summary

3. Chapter 3:
  3.1 Introduction
  3.2 Methods
    3.2.1 Big Data
    3.2.2 Latent Dirichlet Allocation
      3.2.2.1 Algorithm Procedure
    3.2.3 Topic Disambiguation
  3.3 Evaluation
    3.3.1 Expectation Maximization
    3.3.2 Interestingness for Evaluation
      3.3.2.1 LDA Outputs Facilitate Interestingness Evaluation
      3.3.2.2 Conciseness
      3.3.2.3 Generality/Coverage
      3.3.2.4 Peculiarity
      3.3.2.5 Diversity
      3.3.2.6 Reliability
      3.3.2.7 Novelty
      3.3.2.8 Unexpectedness/surprisingness
      3.3.2.9 Utility, Actionability
  3.4 Catalonian Independence
    3.4.1 The movement
  3.5 Data Processing
    3.5.1 Data Collection
    3.5.2 Text Preprocessing
    3.5.3 Data Analysis
      3.5.3.1 Parameterization