Analyzing Refugee Migration Patterns Using Geo-Tagged Tweets
International Journal of Geo-Information

Article

Franziska Hübl 1, Sreten Cvetojevic 2, Hartwig Hochmair 2 and Gernot Paulus 1,*

1 Geoinformation and Environmental Technology, Carinthia University of Applied Sciences, Villach 9524, Austria; [email protected]
2 Geomatics Program, Fort Lauderdale Research and Education Center, University of Florida, Davie, FL 33314, USA; scvetojevic@ufl.edu (S.C.); hhhochmair@ufl.edu (H.H.)
* Correspondence: [email protected]

Received: 5 June 2017; Accepted: 24 June 2017; Published: 2 October 2017

Abstract: Over the past few years, analysts have begun to materialize the “Citizen as Sensors” principle by analyzing human movements, trends and opinions, as well as the occurrence of events from tweets. This study aims to use geo-tagged tweets to identify and visualize refugee migration patterns from the Middle East and Northern Africa to Europe during the initial surge of refugees aiming for Europe in 2015, which was caused by war and political and economic instability in those regions. The focus of this study is on exploratory data analysis, which includes refugee trajectory extraction and aggregation as well as the detection of topical clusters along migration routes using the V-Analytics toolkit. Results suggest that only a few refugees use Twitter, limiting the number of extracted travel trajectories to Europe. Iterative exploration of filter parameters, dynamic result mapping, and content analysis were essential for the refinement of trajectory extraction and cluster detection. Whereas trajectory extraction suffers from data scarcity, hashtag-based topical clustering draws a clearer picture of general refugee routes and is able to find geographic areas of high tweet activity on refugee-related topics.
Identified spatio-temporal clusters can complement migration flow data published by international authorities, which typically come at the aggregated (e.g., national) level. The paper concludes with suggestions to address the scarcity of geo-tagged tweets in order to obtain more detailed results on refugee migration patterns.

Keywords: tweets; trajectory; migration; hashtag; refugee; V-Analytics

ISPRS Int. J. Geo-Inf. 2017, 6, 302; doi:10.3390/ijgi6100302 www.mdpi.com/journal/ijgi

1. Introduction

Since the beginning of 2015, Europe has been experiencing the largest migration movement in the 21st Century, which is caused by political crises and wars in the Middle East and Africa. Figure 1 provides a snapshot of refugee movements to Europe in July 2015 based on country level data that are published by the UN Refugee Agency (UNHCR) and visualized by Saarinen and Ojala [1]. The flow intensity on the map, where each moving dot represents 25 people, shows that Syria, Afghanistan and Iraq are the countries from which the majority of refugees originate. White vertical bars indicate the numbers of asylum seekers per destination country between 2012 and July 2015. Germany can be identified as the most prominent refugee destination country in the world, with a total of 441,900 asylum seekers registered during 2015 [2]. Although migration routes are shown in their entirety, they are computed as shortest paths connecting country centroids and therefore do not reflect the actual migration corridors taken. It is also important to note that UNHCR numbers rely on registered refugees, which will deviate from actual refugee arrival numbers. Reasons can be delays in the governmental registration process, some nations not releasing all of their refugee statistics, or discrepancies between countries in how they count appeals against asylum denials. In addition, the European Union (EU) Dublin regulation states that, in whichever country a refugee is first registered and has his or her fingerprints taken, that country is to handle the asylum application, that is, to accept or reject the claim. Many migrants know about this rule, and some of them deliberately try to avoid registration in their country of first entry to the EU (e.g., Greece), since such prior registration would reduce their chance of being allowed to remain in richer countries of the EU (e.g., Germany or Sweden) if they already moved there after entering EU territory [3]. Also, falsely stated information about a refugee’s home country to increase the chance for political asylum in Europe may lead to additional data bias in official datasets. For example, Germany estimates that 30% of incoming migrants who claim to be Syrian citizens are from other countries [4].

Though based on a reliable data source, the map content of Figure 1 could be misused by political parties or individuals to back their agenda, e.g., as a proof of the alleged threat of refugees to Europe, if taken out of context. Numerous examples of historic maps exist that were purposely used for political propaganda through manipulation of map design (e.g., choice of map projection, colors, line style, and size of map symbols) [5–7]. In Figure 1, for example, the size and density of arrows could be used to influence the impression that Europe is, or is not, experiencing a heavy inflow of refugees.

Figure 1. Refugee flows from Africa and the Middle East to Europe in July 2015 with a histogram (on top) showing the daily numbers of asylum seekers (adapted from [1]).

Understanding migration patterns is important for administrative logistics such as accommodation, transportation, education, and distribution of refugees as well as for the mitigation of refugee causes. Social media platforms and phone records are potential data sources for studying refugee movement patterns besides more traditional data sources, such as governmental datasets or statistical data from national or international agencies or voluntary aid programs. However, especially for crowd-sourced and social media data, such as tweets, it must be noted that these data undergo a selection bias and do not necessarily represent the entire population [8,9]. Twitter is a microblogging service that allows its users to send posts of up to 140 characters called tweets. It is utilized every month by over 300 million active users, and is therefore frequently used to study communication patterns and information flows among people [10,11]. Drawing from geo-positioning capabilities of mobile devices through GPS, Wi-Fi, or cell phone towers, tweeted posts can be annotated with location information about the user’s current position if the user opts to do so in the application.
However, only about 1% of all tweets are geo-tagged [12], which means that results of any analysis that involves Twitter geo-locations are not necessarily representative of all Twitter users. Geo-tagged positional information within tweets can be relayed as a point with exact geographic coordinates or as bounding