International Journal of Geo-Information

Article From Motion Activity to Geo-Embeddings: Generating and Exploring Vector Representations of Locations, Traces and Visitors through Large-Scale Mobility Data

Alessandro Crivellari * and Euro Beinat

Department of Geoinformatics—Z_GIS, University of Salzburg, 5020 Salzburg, Austria; [email protected]
* Correspondence: [email protected]

 Received: 29 January 2019; Accepted: 4 March 2019; Published: 8 March 2019 

Abstract: The rapid growth of positioning technology allows tracking motion between places, making trajectory recordings an important source of information about place connectivity, as they map the routes that people commonly follow. In this paper, we utilize users’ motion traces to construct a behavioral representation of places based on how people move between them, ignoring geographical coordinates and spatial proximity. Inspired by natural language processing techniques, we generate and explore vector representations of locations, traces and visitors, obtained through an unsupervised approach, which we generically named motion-to-vector (Mot2vec), trained on large-scale mobility data. The algorithm consists of two steps, the trajectory pre-processing and the Word2vec-based model building. First, mobility traces are converted into sequences of locations that unfold in fixed time steps; then, a Skip-gram Word2vec model is used to construct the location embeddings. Trace and visitor embeddings are finally created by combining the location vectors belonging to each trace or visitor. Mot2vec provides a meaningful representation of locations, based on the motion behavior of users, defining a direct way of comparing locations’ connectivity and providing analogous similarity distributions for places of the same type. In addition, it defines a metric of similarity for traces and visitors beyond their spatial proximity and identifies common motion behaviors between different categories of people.

Keywords: embeddings; Word2vec; unsupervised learning; trajectories; motion behavior

1. Introduction

The only source of information about relationships between different locations or traces (sequences of locations) provided by the traditional geographical representation is their spatial proximity. However, some specific applications may take advantage of further relationships ignored by the simple geographic coordinates. Such missing information may be relevant in cases where places (and connectivity between them) strongly influence people’s behavior over the territory, or when, vice versa, people’s movements between places can provide a meaningful indication of the functionality of such places. Nowadays, the rapid growth of positioning technology makes it easier to track motion, allowing many devices to acquire current locations. These trajectory recordings are important sources of information about place connectivity, showing the routes that people commonly follow [1–4]. In this paper, we explore the concept of “behavioral proximity” between places, based on people’s trajectories, not on locations’ geography. We construct location embeddings as a vector representation strictly based on large-scale human mobility, defining a metric of similarity beyond the simple geographical proximity.

ISPRS Int. J. Geo-Inf. 2019, 8, 134; doi:10.3390/ijgi8030134 www.mdpi.com/journal/ijgi

The meaning of “behavioral proximity” is related to the concept of trajectory. Two locations are behaviorally similar if they often belong to the same trajectories; they often share the same neighbor locations along the trace. Of course, closer locations are likely to be more connected to each other, but this is not always the case. For example, if two places are located next to each other but no road connects them, those places are not behaviorally close, even if geographically they are; or, again, if two locations are spatially close to each other but rarely visited together, rarely following the same route, their behavioral distance is higher than the one between two locations geographically more distant but often sharing common trajectories. This can also reflect how people tend to visit places when traveling.

Figure 1a shows three locations, where the spatial distance LOC1–LOC2 is comparable with the distance LOC2–LOC3. However, people’s behavior between those locations is very different. The motion traces between LOC1 and LOC2 are much sparser: from a behavioral perspective, LOC2 is “closer” to LOC3. Figure 1b depicts people’s trajectories passing through LOC1 and LOC2; Figure 1c shows the ones passing through LOC2 and LOC3. This behavioral difference is captured by the embeddings.


Figure 1. Locations having comparable distances LOC1–LOC2 and LOC2–LOC3 (a), with trajectories passing through LOC1 and LOC2 (b), and LOC2 and LOC3 (c), within a maximum time period of six hours.

Our study is not limited to a location level but can also be extended to a trace level, allowing comparisons that go beyond the simple spatial distance between centers of mass (COMs), considering also the “behavioral relationships” between traces. Figure 2 shows three traces where the distance between COM1 and COM2 is comparable with the distance between COM1 and COM3. However, TRACE1 and TRACE2 are located in proximity of a common area (different sides of the same lake), while TRACE3 is in a different area. Trace embeddings are able to capture the influence of particular areas of interest, defining similar vector representations for traces located in their proximity, hence considered behaviorally related.

Besides locations and traces, we explore a third level of representation, the visitor level, conceptually similar to the trace one. We propose comparisons intra-visitor, e.g., different behaviors of the same user in different hours of the day, and inter-visitor, between different customers or different groups of customers, e.g., to study the motion behavior of tourists grouped for instance by nationality.


Figure 2. Traces having comparable distances COM1–COM2 and COM1–COM3 but behaviorally different meaning: TRACE1 and TRACE2 are located in the proximity of the same area of interest represented by the same lake.

BesidesOur purpose locations is the and design traces, of a machine-readablewe explore a thir representation,d level of representation, whereby behaviorally the visitor similar level, conceptuallylocations (traces, similar visitors) to the share trace similarone. We representations propose comparisons in mathematical intra-visitor, terms. e.g., We different hereby behaviors generate ofand the explore same user a dense in different vector representation hours of the day, obtained and byinter-visitor, means of between an embedding different method customers that weor differentgenerically groups called of motion-to-vectorcustomers, e.g., to (Mot2vec), study the mo applyingtion behavior the tools of oftourists Word2vec, grouped primarily for instance used by in nationality.natural language processing (NLP), to pre-processed trajectories. As Word2vec learns high-dimensional embeddingsOur purpose of words is the where design vectors of a machine-readable of similar words representation, end up near in whereby the vector behaviorally space, Mot2vec similar is locationsan NLP-inspired (traces, visitors) technique share that similar treats singlerepresenta locationstions in as mathematical “words” and terms. trajectories We hereby as “sentences”. generate andApplying explore the a Word2vec dense vector algorithm representation on a corpus obtain of pre-collecteded by means traces, of an high-dimensional embedding method embeddings that we genericallyof locations called are obtained, motion-to-vector and the vectors (Mot2vec), of behaviorally applying the related tools locations of Word2vec, occupy primarily the same used part in of naturalthe vector language space. Mot2vec proces issing initially (NLP), trained to onpre-processed trajectory data trajectories. 
to obtain feature As vectorsWord2vec of locations, learns high-dimensionalwhich in turn can beembeddings used to create of words trace andwhere visitor vector vectors.s of similar words end up near in the vector space,We Mot2vec evaluated is an the NLP-inspired method on a large-scale technique bigthat dataset treats madesingle of locations trajectories as “words” of foreign and visitors trajectories in Italy. asWe transformed“sentences”. theApplying trajectories the into Word2vec discrete (in algorithm space and time)on a location corpus sequences of pre-collected and fed them traces, to a high-dimensionalSkip-gram Word2vec-based embeddings model, of whichlocations defined are obta the embeddingined, and the vector vectors for each of behaviorally location according related to locationsits frequent occupy previous the same and next part locations of the vector along space. the trajectories Mot2vec is in initially the dataset. trained We on finally trajectory used locationdata to obtainembeddings feature to vectors construct of locations, trace embeddings which in turn and can visitor be used embeddings. to create trace Behavioral and visitor comparisons vectors. and clusterWe visualization evaluated the were method performed on a large-scale on the basis big of datase humant made motion of activity, trajectories in particular of foreign highlighting visitors in Italy.the differences We transformed between the spatial trajectories and behavioral into discre proximities.te (in space and time) location sequences and fed them to a Skip-gram Word2vec-based model, which defined the embedding vector for each location according2. Related to Work its frequent previous and next locations along the trajectories in the dataset. We finally used Vectorlocation space embeddings models of meaning to construct have beentrace used em forbeddings several decadesand visitor [5], butembeddings. 
the recent employment Behavioral comparisonsof machine learning and cluster to train visualization such models were greatly perfor improvedmed on their the precision,basis of human reaching motion the state-of-art activity, in particularcomputational highlighting linguistics the domain. differences Their betw abilityeen spatial of efficiently and behavioral calculating proximities. semantic similarity between linguistic entities is extremely useful in several fields, such as web search, opinion mining, and 2.document Related Work collections management [6–8]. VectorThe use space of distributional models of models meaning of meaning have been was introducedused for several by Charles decades Osgood [5], and but subsequently the recent employmentfurther developed of machine [9]. Meanings learning to of train words such were mode representedls greatly improved as vectors their containing precision, the reaching frequencies the state-of-artof co-occurrences in computational with every wordlinguistics in the domain. training Their corpus. ability Words of efficiently were both calculating points and axessemantic in a similaritymulti-dimensional between space,linguistic in which entities the is proximity extremel ofy similaruseful wordsin several was guaranteedfields, such byas theirweb frequent search, opinionco-occurrences. mining, However,and document in case collections of large corpora,management this representation[6–8]. may end up with millions of dimensions,The use causing of distributional very sparse vectors.models of meaning was introduced by Charles Osgood and subsequentlyThe solution further to this developed curse of [9]. dimensionality Meanings of was words the were birth ofrepresented “word embeddings”, as vectors containing dense vectors the frequenciesretaining meaningful of co-occurrences relations with between every entities. word in The the main training approaches corpus. 
to Words construct were word both embeddings points and axescomprise in a multi-dimensional count-based models space, [10 in], which predictive the proximity models usingof similar artificial words neural was guaranteed networks by [11 their,12], frequentand Global co-occurrences. vectors for WordHowever, Representation in case of larg [13].e corpora, Nowadays, this representation the last two approaches may end up are with the millions of dimensions, causing very sparse vectors.

most popular, boosting almost all areas of NLP. Their novelty consists of actively employing machine learning. The real improvement started in 2013 with the use of artificial neural networks for learning meaningful word vector representations. Mikolov [14] introduced Word2vec, a fast and efficient approach to learn word embeddings by means of a neural network model explicitly trained for that purpose. It featured two different algorithms: Continuous Bag-of-words and Skip-gram. The main idea consisted of continuously defining a prediction problem while scanning the training corpus through a sliding window. The objective is to predict the current word with the help of its contexts, and the outcome of the prediction determines whether to adjust the current word vector and in what direction.

A great deal of follow-up research was performed after Mikolov’s paper. Pennington [13] released GloVe, a different approach to learn word embeddings combining the global count models and the local context window prediction models. Levy and Goldberg showed that Skip-gram implicitly factorizes a word-context matrix of point-wise mutual information coefficients [15], and studied its performance for different choices of hyperparameters, emphasizing its great robustness and computational efficiency [16]. Le and Mikolov [17] proposed Paragraph Vector to learn distributed representations also for paragraphs and documents. Bojanowski [18] released fastText, creating embeddings for character sequences. The main frameworks and toolkits available are: the Word2vec original C code [19], the Gensim framework for Python including Word2vec and fastText implementations [20], the Word2vec implementation in TensorFlow [21], and the GloVe reference implementation [22]. Currently, word embeddings are employed in almost every NLP task related to meaning and widely used instead of discrete word tokens as an input to more complex neural network models.
In addition, embeddings began to be utilized in completely different fields, such as to represent chemical molecular substructures [23] or, in general, to model categorical attributes [24,25].

In geography-related fields, the use of embeddings started to be explored very recently. Attempts were made at modeling vector representations based on spatial proximity between points of interest in cities for place type similarity analysis [26,27] and functional region identification [28]. Regarding embeddings generated from human motion, their study is primarily focused on urban road systems and city-level mobility [29–31]. Recent research on social media data also includes multi-context embedding models for personalized recommendation systems, representing different features such as user personal behaviors, place categories and points of interest, in the same high-dimensional space [32–34]. Moreover, a few works deal with embeddings for inferring users’ information from mobility traces, such as demographic information and mobile behavior similarity [35,36]. However, the majority of the literature focuses on local movements and regular user motion behaviors, and does not provide clear insights into the embedding representation space. Our work, part of this very new wave of embedding investigation in geoinformatics, aims to further explore the meaning and future potential of vector representations based on the collective motion activity derived from large-scale mobility data.

3. Methodology

Mot2vec is an unsupervised method trained on unlabeled data to obtain high-dimensional feature vectors (embeddings) of locations, which in turn can be used to create trace and visitor vectors. The algorithm consists of two steps, the trajectory pre-processing and the Word2vec-based model building. This section describes how to pre-process trajectories in order to feed them to the embedding model, and how to apply and train the model on such trajectories for constructing embeddings of locations, traces and visitors.

3.1. Trajectory Pre-Processing

A trajectory is initially composed of a series of track points expressed as T = {p_i | i = 1, 2, 3, ..., N}, where N is the number of track points. Each track point contains spatial information such as longitude, latitude, and time stamp [37], expressed as p_i = (lon_i, lat_i, t_i). Different ways of collecting mobility data

(e.g., from device’s applications or from the network’s cell towers) may lead to different resolutions in time and space. In our analysis, we do not consider the actual geographical position of each track point, but represent each location with an ID. In case of serious sparsity of the trajectory data, leading to many unique locations with a very low number of occurrences, which may cause high computational cost and a poor embedding representation of places, we suggest grouping adjacent locations together. Existing methods utilize cell-based partitioning, clustering, and stop point detection to transform trajectories into discrete cells, clusters and stay points [38]. A valuable option may be to fix some meaningful reference points on the territory and project the other locations to the nearest fixed point. The minimum spatial resolution can be chosen freely and may vary according to the precision required by different applications. The final result consists of a certain number of fixed points identified by a unique ID, representing a particular location or area.

Mobility traces are then converted into sequences of IDs that unfold in fixed time steps, i.e., if timestep = 1 h, the next ID in the sequence refers to one hour later than the previous ID. In this way, time information is encoded implicitly in the position along the sequence. If more than one event falls within a single time step, the one with the most occurrences is chosen to represent the location of the user. The length of the time step depends on both the data source and the demands of the final application, in particular to balance accuracy with completeness of the sequences: a short time unit would increase trace fragmentation, while a long unit would reduce the accuracy of representation of the actual mobility trace.
In general, other sampling rules can be employed without impacting the functionality of the algorithm, e.g., sequences with irregular time differences between points, leading however to a different meaning and behavioral representation in the vector space. In conclusion, the input to the embedding model consists of discrete location sequences (ID_1, ID_2, ..., ID_N), where, given a time step unit t, locations correspond to times (t, 2t, ..., Nt). In the next subsection, we present the Word2vec algorithm and how it is trained for learning location embedding representations.
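As an illustration of the pre-processing described above, the following sketch converts raw track points into a discrete ID sequence with an hourly time step; the grid-snapping function `snap_to_grid` and the 0.01-degree cell size are our own illustrative choices, not the paper’s settings. When several events fall within one time step, the most frequent location is kept:

```python
from collections import Counter

def snap_to_grid(lon, lat, cell_deg=0.01):
    """Map a coordinate onto a fixed grid cell ID (hypothetical 0.01-degree cells)."""
    return f"{round(lon / cell_deg)}_{round(lat / cell_deg)}"

def to_id_sequence(track_points, step_s=3600):
    """Convert track points (lon, lat, unix_time) into a sequence of location IDs
    unfolding in fixed time steps; the most frequent cell within a step wins."""
    buckets = {}
    for lon, lat, t in track_points:
        buckets.setdefault(t // step_s, []).append(snap_to_grid(lon, lat))
    sequence = []
    for step in sorted(buckets):
        # majority vote when several events fall within a single time step
        sequence.append(Counter(buckets[step]).most_common(1)[0][0])
    return sequence

points = [(13.35, 45.44, 0), (13.36, 45.44, 1200), (13.35, 45.44, 2400),
          (13.73, 45.65, 3600)]
print(to_id_sequence(points))  # → ['1335_4544', '1373_4565']
```

A cell-based partition is only one of the options mentioned in the text; projecting points to fixed reference locations would simply replace `snap_to_grid` with a nearest-neighbor lookup.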

3.2. Word2vec Model for Location Representation

3.2.1. Model Description

Embeddings are dense vectors of meaning based on the distribution of element co-occurrences in large training corpora. Elements occurring in similar contexts have similar vectors, but particular vector components (features) are not directly related to any particular properties. Word2vec is one of the most efficient techniques to define these vectors. We now briefly present how it works.

Each element in the training corpus is associated with a random initial vector of a pre-defined size, therefore obtaining a weight matrix of dimensionality num_elements × vector_size. During training, we move through the corpus with a sliding window containing the current focus element and its neighbor elements (its context). Although Word2vec is an unsupervised method, it still internally defines an auxiliary prediction task, where each instance is a prediction problem: the aim is to predict the current element with the help of its contexts (or vice versa). The outcome of the prediction determines whether we adjust the element vectors and in what direction. Prediction here is not an aim in itself; it is just a proxy to learn vector representations.

Word2vec can be applied in two different forms: Continuous Bag-of-words (CBOW) and Skip-gram, both shown to outperform traditional count-based models in various tasks [39]. At training time, CBOW learns to predict the current element based on its context, while Skip-gram learns to predict the context based on the current element. This inversion statistically has some effects. CBOW treats an entire context as one observation, smoothing over a lot of the distributional information. On the other hand, Skip-gram treats each context-target pair as a new observation, and this tends to work better for larger datasets. A graphic representation of the two algorithms is reported in Figure 3. The network is made of a single linear projection layer between the input and the output layers.
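The sliding-window prediction task can be made concrete with a small sketch (our own illustration, not the paper’s code): under Skip-gram, every (target, context) pair within the window becomes one training instance, whereas CBOW would treat the whole window as a single observation.

```python
def skipgram_pairs(sequence, window=2):
    """Enumerate Skip-gram training instances from one sequence: each
    (target, context) pair within the window is a separate prediction problem."""
    pairs = []
    for i, target in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # the target does not predict itself
                pairs.append((target, sequence[j]))
    return pairs

# With window=1, each element predicts only its immediate neighbors:
print(skipgram_pairs(["A", "B", "C", "D"], window=1))
# → [('A', 'B'), ('B', 'A'), ('B', 'C'), ('C', 'B'), ('C', 'D'), ('D', 'C')]
```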


Figure 3. Graphic representation of CBOW and Skip-gram model with a context window of two elements in the past and two in the future.

In our method, we adopted the Skip-gram approach: at each training instance, the input for the prediction is the current element vector. The training objective is to maximize the probability of observing the correct context elements cE_1 ... cE_j given the target element E_t, with regard to its current embedding θ_t. The cost function C is the negative log probability of the correct answer, as reported in Equation (1):

C = −∑_{i=1}^{j} log(p(cE_i | E_t)). (1)

This function is defined over the entire dataset, but it is typically optimized with stochastic gradient descent (or any other type of stochastic gradient optimizer) using mini-batch training, usually with an adaptive learning rate. The gradient of the loss is derived with respect to the embedding parameters θ, i.e., ∂C/∂θ, and the embeddings are updated accordingly by taking a small step in the direction of the gradient. This process is repeated over the entire training corpus, tweaking the embedding vectors for each element until they converge to optimal values.

3.2.2. Model Training

After pre-processing, trajectories are represented as sequences of fixed and discrete locations, each of them identified by a unique ID. We connect each location ID to a vector of pre-defined size: the whole list of places refers to a lookup table where each ID corresponds to a particular unique row of the embedding matrix of size num_locations × vector_size.

During training, we move the sliding window through each single trace, feeding the Skip-gram Word2vec model with the current focus location and its neighbor context locations in the trajectory. The window size can be chosen freely, depending on the final purpose and the time resolution of the trajectories. A smaller time resolution may lead to a larger window in terms of locations, e.g., one hour of context window contains four locations if time_resolution = 15 min, while just one if time_resolution = 1 h. Aside from time resolution, larger windows increase the influence of distant (in time) locations, whereas smaller windows consider only very close locations along the trajectory, hence visited in a shorter time period, emphasizing the local proximity. The process is represented in Figure 4 with a context window of three locations in the past and three in the future. The model updates the embedding matrix according to locations’ contexts along the trajectories, based on the internal auxiliary prediction task, defining similar vectors for behaviorally related locations.


Figure 4. The process of sliding window (with a length of three locations in the past and three in the future) and the model input–output pairs.

Once the embedding representation of single locations is obtained, we can use it to create dense vectors of traces, visitors, and even groups of visitors. Following a simple but efficient approach consisting of averaging the word embeddings in a text for creating sentence and document vectors, which has proven to be a strong baseline across a multitude of NLP tasks (e.g., [40,41]), we combine continuous location vectors into continuous trace and visitor vectors. Since trace (visitor) meaning is composed of individual location meanings, we define a composition function consisting of an average vector V over the vectors of all elements E_1 ... E_n in the composition, as reported in Equation (2):

V = (1/n) × ∑_{k=1}^{n} E_k. (2)

If the element vectors are generated from a good embedding model, this bottom-up approach can be very efficient. Despite the main disadvantage of not taking the location order into account, there are several advantages in building composition embeddings in this way:

• It works fast and reuses already trained models.
• Behaviorally connected locations collectively increase or decrease the expression of the corresponding components.
• Meaningful locations automatically become more important than noise locations.

Figure 5 shows a summarizing graph of the embedding generation process.

ISPRS Int. J. Geo-Inf. 2019, 8, 134

Figure 5. Overview of the embedding generation steps. Step 1: vector representations of locations are generated by means of an unsupervised training process. Step 2: location vectors are retrieved and averaged to obtain trace and visitor vectors.

4. Experiment

This section first describes the dataset used for training Mot2vec and the evaluation metrics, and then shows the results at different levels: locations, traces, visitors, and groups of visitors. The Skip-gram embedding model was implemented and executed on TensorFlow using an AWS EC2 p3.2xlarge GPU instance.

4.1. Experimental Preparation

4.1.1. Dataset

A real-world trajectory dataset is used to evaluate the model. Data were provided by a major telecom operator and consist of an anonymized sample of seven months of roamers' call detail records (CDRs) in Italy. The dataset spans the period between the beginning of May and the end of November 2013. Each CDR is related to a mobile phone activity (e.g., phone calls, SMS communication, data connection),
enriching the event with a time stamp and the current position of the device, represented as the coverage area of the principal antenna. The size of this coverage area can vary from a few tens of meters in a city to a few kilometers in remote areas. CDRs have already been utilized in studies of human mobility to characterize people's behavior and predict human motion [42–46].
The sequences of antenna connection events are often discontinuous and sparse, depicting the usual erratic profile of mobile activity patterns. To reduce trace fragmentation, we chose the time step unit to be one hour. If more than one event occurred in the same hour, we selected the location associated with the majority of those events in order to represent the current position of the user. In addition, due to the time step unit chosen and the geographical sparsity of the locations, we defined a minimum space resolution of 2 km. Hence, we selected as reference points the antennas with the highest number of connections within a 2 km distance, and projected the other antennas to the closest

reference point. We also eliminated locations with just a few tens of occurrences, being almost never visited and therefore considered a bias of the overall behavior of foreign visitors in Italy. In general, however, parameters such as the time and space resolution can be chosen differently and should be set according to the characteristics of the dataset. We finally obtained 1-hour encoded sequences of almost six thousand unique locations over the whole Italian territory. Table 1 summarizes the characteristics of the pre-processed dataset. It contains 5.1 million trajectories, with an average trajectory length of 11.2 hours. The average traveled distance per hour is 13.4 km.
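The hourly encoding step can be sketched as follows; the event format `(hour, location_id)` and the function name are hypothetical, and the projection of antennas onto 2 km reference points is omitted:

```python
from collections import Counter

def to_hourly_sequence(events):
    """Convert time-stamped CDR events [(hour, location_id), ...] into a
    sequence of one location per hour: if several events fall in the same
    hour, keep the location observed most often in that hour."""
    by_hour = {}
    for hour, loc in events:
        by_hour.setdefault(hour, []).append(loc)
    sequence = []
    for hour in sorted(by_hour):
        majority_loc, _ = Counter(by_hour[hour]).most_common(1)[0]
        sequence.append(majority_loc)
    return sequence

# Three events at hour 9 (majority location "A"), one at 10, one at 12.
events = [(9, "A"), (9, "B"), (9, "A"), (10, "B"), (12, "C")]
print(to_hourly_sequence(events))  # ['A', 'B', 'C']
```

Note that gaps (here hour 11) are simply skipped in this sketch; real pre-processing would also need a rule for splitting fragmented traces.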

Table 1. Summary characteristics of the pre-processed dataset.

Num. Users    Num. Traces    Avg Trace Length (in Time)    Avg Displacement per Hour    Num. Locations
1.6 million   5.1 million    11.2 hours                    13.4 km                      5903

4.1.2. Experimental Settings

We implemented the model with a window size of three hours (locations) in the past and three in the future, and a vector size of 100 dimensions. The Word2vec algorithm was trained using a mini-batch approach with noise-contrastive estimation loss and the Adam optimizer [47,48]. To measure the "closeness" between embeddings, we use the cosine similarity, translating the behavioral similarity of locations, traces and visitors into the cosine of the angle between vectors: similarity decreases as the angle grows, and increases as the angle shrinks. As shown in Equation (3), cosine similarity consists of the dot product of unit-normalized vectors:

$$\cos(\mathrm{LOC1}, \mathrm{LOC2}) = \frac{\vec{V}(\mathrm{LOC1}) \cdot \vec{V}(\mathrm{LOC2})}{\|\vec{V}(\mathrm{LOC1})\| \, \|\vec{V}(\mathrm{LOC2})\|}. \qquad (3)$$
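A minimal implementation of Equation (3), assuming the embeddings are plain NumPy vectors:

```python
import numpy as np

def cosine_similarity(v1, v2):
    """Equation (3): dot product of the two unit-normalized vectors."""
    v1, v2 = np.asarray(v1, dtype=float), np.asarray(v2, dtype=float)
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))

print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))  # 0.7071... (45-degree angle)
```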

To visually display relationships between high dimensional vectors, we apply t-distributed Stochastic Neighbor Embedding (t-SNE). The t-SNE method reduces dimensionality while trying to keep similar instances close and dissimilar instances apart. It is widely used for visualizing clusters of instances in high-dimensional space [49].

4.2. Evaluation

The evaluation is performed on two levels: location embeddings and compositional embeddings. Location evaluation focuses on the behavioral similarity between single places, the direct output of the Mot2vec model. We compare geographical and behavioral proximity between close and distant places, showing the different behavioral patterns between locations belonging to big cities or small towns, or representing centers of high motion connectivity such as airports and train stations. Compositional evaluation deals with trace and visitor embeddings, which are obtained by averaging vectors of single locations. We explore the meaning of trace similarity, showing how particular areas of interest influence representations of traces located in their proximity, and study user compositions, a particular case of trace embedding analysis where the number of visited locations may be very high. We also perform intra-visitor analysis, in order to show how movements of the same user can be grouped and compared on the basis of a selected attribute, and display a vector representation of groups of visitors, in particular grouping tourists of the same nationality, in order to reveal meaningful clusters in their motion behavior.

4.2.1. Evaluation on Location Embeddings

Geographical distance and "behavioral" distance are not proportional: the same distances in space do not imply the same behavioral similarities. Although locations' distances may be geographically comparable, people's motion between them may be very different. Figure 1 in the Introduction reported the example of the similar spatial distances LOC1–LOC2 and LOC3–LOC2. In Figure 6, we represent the embeddings of those three locations together with their geographic coordinates and the cosine similarity between them: cos(LOC1, LOC2) is equal to 0.48, while cos(LOC3, LOC2) is equal to 0.70. This shows that, although LOC1 and LOC3 are almost equally far from LOC2 in terms of spatial distance (in particular, LOC1–LOC2 is slightly shorter), their behavioral representations are very different (higher similarity between LOC3 and LOC2), in accordance with the trajectories depicted in Figure 1.
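The mismatch between spatial and behavioral proximity can be illustrated with a small sketch (the coordinates and embeddings below are invented, not the actual LOC1–LOC3 data): two points roughly equidistant from a reference can have very different cosine similarities:

```python
import math
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def cos_sim(u, v):
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy LOC1 and LOC3 sit at comparable spatial distances from LOC2 ...
d12 = haversine_km(45.0, 9.0, 45.0, 9.5)
d32 = haversine_km(45.0, 10.0, 45.0, 9.5)
# ... yet their (toy) embeddings make LOC3 far more similar to LOC2.
s12 = cos_sim([0.9, 0.1, 0.0], [0.2, 0.9, 0.1])
s32 = cos_sim([0.3, 0.8, 0.2], [0.2, 0.9, 0.1])
```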

Figure 6. Geographic representation and embedding representation of LOC1, LOC2 and LOC3 of the example of Figure 1. Although spatial distance LOC1–LOC2 is slightly shorter, LOC3 and LOC2 are substantially more similar.

To have a general idea about relationships between location embeddings, vectors can be dimensionally reduced through t-SNE and plotted. In Figure 7, each location is labeled as the province and colored as the region it belongs to. We can clearly notice a tendency of grouping locations of the same region/province (reasonably, places located in the same area are more likely to belong to the same trajectories), but it is not always the case. Moreover, locations having many connections spread across the whole territory are placed at the center of the plot. In order to make the plot understandable, only the 100 most visited locations are represented.


Figure 7. t-SNE reduction of the 100 most visited locations labeled as the province and colored as the

region they belong to.

Analyzing similarity distributions, we observed recurrent patterns based on location types, in particular related to places belonging to cities, small towns and rural areas, airports and stations. Locations in towns or rural areas tend to have very high similarity values with places in the surrounding area and low similarities with more distant locations. In other words, places belonging to rural areas are maximally connected to nearby locations. We present an example in Figure 8.

cos(X,S1) = 0.813
cos(X,S2) = 0.777
cos(X,S3) = 0.772
cos(X,S4) = 0.740
cos(X,S5) = 0.694

Figure 8. Example about similarity of towns/rural areas. The top five similarities with location X are reported: high similarities tend to distribute over the local area surrounding location X.

On the other hand, locations in cities tend not to reach comparable levels of similarity. Nearby locations tend to be again the most similar ones, but the cosine similarity values do not reach the levels achieved for rural areas: closer and distant places have a narrower similarity range. This is mainly due to the presence in cities of centers of long-distance movements, such as train stations or airports, but also to a general tendency of tourists to travel between cities, while moving within the same local areas in the case of small towns or touristic villages. Therefore, small towns present higher values of the top cosine similarities, but towards a very limited number of locations, whereas cities show lower top similarity values spread over a larger number of locations, due to the higher number of connections they are linked to. The particular case of train stations in a city reveals how top similarity values are also obtained for distant locations. Figure 9 displays the example of the city of Milan, reporting the top five similarities for a generic location in the city (a) and for the main train station (b): similarity values are clearly lower than in the previous example about towns/rural areas. It is worth noticing that the third highest similarity with the main train station in Milan is the main train station in Turin, and again the fourth and the fifth ones are in proximity of a transit station crossed by many trains connecting Milan to various locations in northeastern Italy.
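Retrieving top similarities such as those in Figures 8 and 9 is a nearest-neighbor query under cosine similarity; a sketch over a toy embedding matrix (names and data are illustrative):

```python
import numpy as np

def top_k_similar(emb, query_idx, k=5):
    """Return (index, cosine similarity) of the k locations most similar
    to the query location, given an (n_locations, dim) embedding matrix."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = unit @ unit[query_idx]          # cosine against every location
    order = np.argsort(-sims)              # descending similarity
    order = order[order != query_idx][:k]  # drop the query itself
    return [(int(i), float(sims[i])) for i in order]

rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 8))             # 50 toy locations, 8-dim vectors
neighbors = top_k_similar(emb, query_idx=3, k=5)
```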


cos(X,S1) = 0.680
cos(X,S2) = 0.631
cos(X,S3) = 0.625
cos(X,S4) = 0.623

cos(X,S5) = 0.604

(a)

cos(X,S1) = 0.725
cos(X,S2) = 0.651

cos(X,S3) = 0.626

cos(X,S4) = 0.608

cos(X,S5) = 0.597

(b)

Figure 9. Two examples about similarity of cities. The top five similarities with location X are reported, in case X represents a generic location in the city (a), and in case X represents the main train station of the city (b). Top similarity values are lower than in Figure 8. Moreover, the train station has three out of its top five similarities related to other train stations outside the city.


The case of airports is an "extreme" version of big cities and train stations: top similarity values are even lower and may also comprise a few very distant locations. In particular, similarity is evident between airports connected by frequent flight routes. We present an example in Figure 10, where the main airport in Rome shows a high similarity with the airport in Naples.

cos(X,S1) = 0.748
cos(X,S2) = 0.683

cos(X,S3) = 0.591

cos(X,S4) = 0.561

cos(X,S5) = 0.553

Figure 10. Example about similarity of airports. The top five similarities with location X are reported: top similarity values are even lower than in Figure 9. It is worth observing that the fourth highest similarity is related to another airport in another city.

Similarities between location types can describe how people tend to move across the territory, revealing the main transportation connections. Comparing similarities between airports and stations of two different cities can reveal whether people tend to travel by plane or by train when moving from one city to the other. In Figure 11, we present an example about the cities of Milan, Bologna and Bari. Comparing Milan and Bologna discloses a higher similarity between stations rather than airports, implying a larger number of people traveling by train. On the other hand, comparing Milan and Bari exhibits a higher similarity between airports, implying a tendency of traveling by plane, in accordance with the long distance.
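This airport-versus-station comparison reduces to two cosine evaluations per city pair; a sketch with invented toy vectors (the real comparison would use the trained location embeddings, and the function name is hypothetical):

```python
import numpy as np

def cos_sim(u, v):
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def dominant_link(airport_a, airport_b, station_a, station_b):
    """Guess the prevalent travel mode between two cities by comparing
    the similarity of their airports with that of their train stations."""
    by_plane = cos_sim(airport_a, airport_b)
    by_train = cos_sim(station_a, station_b)
    return ("plane" if by_plane > by_train else "train"), by_plane, by_train

# Toy embeddings in which the two stations are nearly aligned.
mode, _, _ = dominant_link([1, 0, 1], [0, 1, 1], [1, 1, 0], [1, 0.9, 0.1])
print(mode)  # train
```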



Figure 11. Similarity relationships between airports and stations of three different cities. The example suggests a tendency of traveling by train between Milan and Bologna, and by plane between Milan and Bari.

TheTheThe overall overall overall different different different simila similarity similarityrity behavior behavior of of ofcities, cities, towns towns and and and airports airports airports can can can be be be clearly clearly observed observed in in in thethethe normalized normalized normalized histograms histograms histograms of of ofFigure Figure Figure 12, 12 12, depicting, depictingdepicting th theireirtheir cosine cosinecosine similarity similarity distributions distributions towards towards towards all all allof of theofthethe locations locations locations in in inthe the the dataset. dataset. The The histograms histograms are are zoomed zoomedzoomed in in inbetween betweenbetween values valuesvalues 0.4 0.40.4 and and and 0.9 0.9 0.9 of of ofcosine cosine similarity.similarity.similarity. Cities Cities havehave have aa tendency tendencya tendency for for for larger larger larger percentages perc percentagesentages of of mid-low ofmid-low mid-low similarity similarity similarity locations locations locations (double (double (double in thein in theslotthe slot 0.4–0.5) slot 0.4–0.5) 0.4–0.5) and and a and lower a lowera percentagelower percentage percentage of high of ofhigh similarities high similarities similarities (half in(half (half the in slot inthe the 0.7–0.8) slot slot 0.7–0.8) 0.7–0.8) with respect with with respect torespect towns. to to towns.Airportstowns. Airports presentAirports present a present lower a percentage lowera lower percentage percentage of both of high ofboth both and high high low and and similarities, low low similarities similarities consequently, consequently, consequently having havinga having larger a a largernumberlarger number number of very oflow ofvery very similarity low low similarity similarity locations locations locations (cos < 0.4).(cos (cos < 0.4).< 0.4).



Cosine     Towns     Cities    Airports
0.4–0.5    0.0741    0.1414    0.0643
0.5–0.6    0.0212    0.0365    0.0137
0.6–0.7    0.0059    0.0054    0.0015
0.7–0.8    0.0016    0.0007    0.0002

Figure 12. Similarity distribution of towns, cities and airports in the slot cos = 0.4–0.9. Towns present higher percentages for high similarity slots, cities for mid-low similarity slots, and airports for very low slots.

We finally measured the average distance of city, town and airport locations with their top 20 highest similarity locations. Results are reported in Table 2. The behavior is even more distinctive if we consider how towns have a lower mean distance despite the higher sparsity of antennas in rural areas.

Table 2. Average distance with top 20 highest similarity locations.

Towns top 20 mean distance       14.59 km
Cities top 20 mean distance      23.99 km
Airports top 20 mean distance    72.45 km

4.2.2. Evaluation on Compositional Embeddings

The embedding representation allows for comparing traces on the basis of their behavioral meaning, not simply as the geographical distance between their COMs. Figure 13 shows five traces located in different areas of Italy. Comparing the reported cosine similarities between trace T1 and the other four, the behavioral distance between traces seems somehow to correspond to their geographical distance. As a general tendency, this is true when comparing different traces located in different separate areas over a large territory. However, this is not determined directly by their geographical proximity, but indirectly, because of the higher number of paths between places in adjacent areas: single locations are generally more connected to places within a certain maximum radius. In addition, it is interesting to notice that both T4 and T5 comprise an airport location: removing it from their composition function leads to a substantial drop in similarity, showing once again the influence of airports as centers of long-distance connectivity.
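The airport-exclusion check amounts to recomputing the average of Equation (2) without that location; a toy sketch (all vectors are invented, with the "airport" vector deliberately unlike the rest of its own trace but sharing a component with T1's area):

```python
import numpy as np

def cos_sim(u, v):
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def trace_vector(locs):
    """Equation (2): mean of the location vectors in the trace."""
    return np.mean(np.asarray(locs, dtype=float), axis=0)

t1 = trace_vector([[0.2, 0.8, 0.3], [0.1, 0.9, 0.4]])
airport = [0.1, 0.2, 1.0]                    # long-distance hub profile
t4_locs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], airport]
with_airport = cos_sim(t1, trace_vector(t4_locs))
without_airport = cos_sim(t1, trace_vector(t4_locs[:-1]))
# In this toy setup, excluding the airport lowers the similarity,
# mirroring the drop reported for T4 and T5 in Figure 13.
```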


cos(T1, T2) = 0.896
cos(T1, T3) = 0.702
cos(T1, T4) = 0.392 (excl. airport = 0.334)
cos(T1, T5) = 0.344 (excl. airport = 0.284)

Figure 13. Similarity comparison between five traces distributed over a wide territory. Each of T4 and T5 comprises an airport location in the composition: similarities in brackets are reported excluding those airport locations from the traces.

Exploring trace behavior at a smaller scale, considering activities within small areas, reveals how relationships between trace vectors can go beyond spatial proximity, detecting more complex information than the COM position and trace direction in space alone. We noticed that higher similarities were obtained for traces located near the same common specific area, such as a lake, a coastline, a big city neighborhood or a confined rural area. Figure 14 shows three examples of this type, depicting traces with comparable COM distances but different behavioral representations. Traces near different sides of the same lake (a), traces reaching the same shore (b), and traces crossing the same city (c) are all cases where a common area of interest determines higher similarities for traces located in its proximity. We name an area as "of interest" if it is subjected to frequent paths of users largely moving over it. It is therefore defined by the motion behavior of people, not by geographical borders: a lake may be an area of interest not as such, but because of the similar embeddings of the places surrounding it, implying a considerable motion activity on that particular delimited territory. In other words, an area subjected to significant local connectivity "attracts" traces nearby, causing similar vector representations, due to the presence of highly connected locations (according to the general motion activity) in the different trace compositions.

A visitor embedding vector is a particular case of trace composition vector, in which the number of track points (and therefore locations) may be very high, hence covering a larger area, and not being part of a single continuous trace but belonging to all the traces of each specific user. As the way of constructing visitor embeddings is the same as for trace embeddings, what we inferred about traces is generally valid also for visitors.

ISPRS Int. J. Geo-Inf. 2018, 7, x FOR PEER REVIEW 17 of 24

ISPRS Int. J. Geo-Inf. 2019, 8, 134 17 of 23 comparison. As expected, the highest similarity is between V1 and V2, covering geographically closer areas, but V1 is more similar than V2 towards V3 and V4. This is because V1 covers a larger territory near bigger cities, whereas V2 is located in a coastal area made of small towns and more distant from big cities: as we reported previously, cities are centers of longer-distance connections. ISPRS Int. J. Geo-Inf. 2018, 7, x FOR PEER REVIEW 17 of 24
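The compositional step described above — averaging the vectors of the time-stamped locations belonging to a trace or visitor, then comparing the results with cosine similarity — can be sketched as follows. This is a minimal illustration: the embedding table is a random toy stand-in for the trained vectors, and `compose_embedding` is a hypothetical helper name, not code from the paper.

```python
import numpy as np

def compose_embedding(location_ids, location_vectors):
    # Average the vectors of the time-stamped locations of a trace or visitor.
    return np.stack([location_vectors[loc] for loc in location_ids]).mean(axis=0)

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embedding table: location ID -> vector (in practice, learned by Skip-gram).
rng = np.random.default_rng(0)
location_vectors = {loc: rng.standard_normal(8) for loc in "ABCD"}

t1 = compose_embedding(["A", "B", "A"], location_vectors)  # a trace visiting A, B, A
t2 = compose_embedding(["A", "B", "C"], location_vectors)
print(cosine_similarity(t1, t2))
```

A visitor vector is obtained the same way, simply feeding in the time-stamped locations of all the traces of that user instead of a single continuous trace.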

(a) cos(T1, T2) = 0.693, cos(T1, T3) = 0.641; (b) cos(T1, T2) = 0.735, cos(T1, T3) = 0.699; (c) cos(T1, T2) = 0.826, cos(T1, T3) = 0.806

Figure 14. Three examples of traces with similar COM distances but different behavioral representation: (a) shows two traces (T1 and T2) near different sides of the same lake, and a third one (T3) near a different lake; (b) shows two traces (T1 and T2) reaching the same tract of coastline, and a third one (T3) traveling in the inland parallel to the coast; (c) shows two traces (T1 and T2) crossing the metropolitan area of Milan, and a third one (T3) traveling along the highway at the border of the city.



cos(V1, V2) = 0.625; cos(V1, V3) = 0.423; cos(V1, V4) = 0.373; cos(V2, V3) = 0.371; cos(V2, V4) = 0.320; cos(V3, V4) = 0.235

Figure 15. Similarity comparison between four visitors distributed over a wide territory. The red spots represent the area covered by the movements of each visitor. The highest similarity is between V1 and V2, located in geographically closer areas. However, V1 is more similar than V2 towards V3 and V4: V1 covers a larger territory along the Arno river near bigger cities such as Florence, Leghorn and Pisa, whereas V2 is located in a very touristic coastal area made of small towns and more distant from cities, centers of longer-distance motion behavior.



In addition to the inter-visitor analysis, an intra-visitor comparison can be performed to study particular motion patterns in the traveling behavior of a single user. The visited locations can be gathered in different groups, which in turn can be compared with each other. An example may be the study of a visitor’s behavior in different hours of the day. Figure 16 shows how the motion behavior of a user over several days is distributed during the morning, afternoon, evening and night, also reporting the cosine similarities between the different parts of the day. In particular, this example reports morning and evening having the highest similarity, while afternoon and night share the lowest similarity: the motion activity during morning and evening is mainly performed in a confined area, being centered on the same few locations, whereas it is spread over a wider territory in the afternoon and comprises one single location in the night.
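The intra-visitor, time-of-day comparison can be sketched by partitioning a user’s time-stamped locations into the four daily intervals and averaging per group. The helper names (`part_of_day`, `daypart_embeddings`) and the toy vectors are illustrative assumptions, not the paper’s implementation.

```python
import numpy as np

def part_of_day(hour):
    # Map an hour of the day to the four intervals used in Figure 16.
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"

def daypart_embeddings(events, location_vectors):
    # events: list of (hour, location_id) pairs for one user over several days.
    # Returns one averaged embedding per part of the day.
    groups = {}
    for hour, loc in events:
        groups.setdefault(part_of_day(hour), []).append(location_vectors[loc])
    return {part: np.mean(vecs, axis=0) for part, vecs in groups.items()}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Example: compare a user's morning vs. afternoon behavior with toy vectors.
vectors = {"home": np.array([1.0, 0.2]), "office": np.array([0.1, 1.0])}
parts = daypart_embeddings([(8, "home"), (9, "home"), (14, "office")], vectors)
print(cosine(parts["morning"], parts["afternoon"]))
```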

cos(morning, afternoon) = 0.957 cos(morning, night) = 0.963 cos(afternoon, night) = 0.854 cos(morning, evening) = 0.978 cos(afternoon, evening) = 0.973 cos(evening, night) = 0.903

Figure 16. Similarity comparison between motion behaviors of a user (over several days) during different parts of the day: morning (6:00 a.m.–12:00 p.m.), afternoon (12:00 p.m.–6:00 p.m.), evening (6:00 p.m.–12:00 a.m.), and night (12:00 a.m.–6:00 a.m.). For each spot, we reported the percentage of time spent in every location for each of the four time intervals. Morning and evening are shown to have the highest similarity, while afternoon and night share the lowest similarity. The motion activity during morning and evening is mainly performed in the same few locations, whereas it comprises a wider territory in the afternoon and one single location in the night.

We finally performed an analysis on groups of visitors, in particular grouping users by nationality. The aim was to observe if there were common visiting patterns in the motion of visitors coming from various countries. Therefore, we averaged the embeddings of each time-stamped location for every user of the same nationality, and compared the newly obtained vectors representing each country. We reduced the vectors using t-SNE and plotted them to visually find clusters. As shown in Figure 17, countries with similar culture and geographical proximity often appear close to each other. Groups of nationalities belonging to the same areas are clearly visible: the majority of Eastern-European countries on the upper right (Slovakia, Czech Republic, Poland, Hungary, Slovenia, Croatia, Bosnia and Herzegovina, Serbia, Lithuania, Estonia, Romania, Bulgaria, Montenegro, and Macedonia); many countries of Central and Northern Europe on the lower right (Germany, Switzerland, Belgium, Luxembourg, Denmark, Norway, Finland, Ireland, United Kingdom, and Iceland); a group of English-speaking countries and former British colonies on the mid-lower part of the plot (United States, Canada, Australia, New Zealand, South Africa, and Hong Kong); a group of Asian countries on the mid-lower right (China, Indonesia, South Korea, and Japan); and again small groups of countries such as: Russia, Ukraine and Kazakhstan; France, Spain and Portugal; United Arab Emirates, Kuwait, Jordan and Saudi Arabia. Moreover, the central part of the plot mainly consists of various African and South American countries. This analysis suggests similar paths for users of neighboring countries and can be seen as an interesting starting point to study if and how belonging to specific countries or continents may determine characteristic visiting motion behaviors.
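The per-nationality aggregation can be sketched as below: all time-stamped location embeddings of every user of a nationality are pooled and averaged into one vector per country, which can then be projected with t-SNE. The function name `country_embeddings` and the input layout are illustrative assumptions.

```python
import numpy as np

def country_embeddings(user_locations, user_nationality, location_vectors):
    # Average the embeddings of each time-stamped location over all users
    # of the same nationality, yielding one behavioral vector per country.
    buckets = {}
    for user, locs in user_locations.items():
        country = user_nationality[user]
        buckets.setdefault(country, []).extend(location_vectors[l] for l in locs)
    return {c: np.mean(vs, axis=0) for c, vs in buckets.items()}

# 2-D projection for visual clustering (as in Figure 17), e.g. with scikit-learn:
#   from sklearn.manifold import TSNE
#   countries, vectors = zip(*country_embeddings(...).items())
#   coords = TSNE(n_components=2).fit_transform(np.stack(vectors))
```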

Figure 17. t-SNE reduction of the embedding vectors representing the motion behavior of visitors’ main nationalities in Italy.

4.3. Discussion

Mot2vec is a method for creating dense vectors of locations, traces and visitors, based on the motion behavior of people. The geography of places is ignored; only the trajectories passing from one location to another are used to construct the embedding vectors. Even though very close locations have higher chances of being behaviorally similar than ones in distant regions, we showed that spatial proximity and behavioral proximity are not proportional: there are cases where places more distant in space are also more similar, especially if traversed by popular routes, always depending on how people move over the territory. We reported how places in small towns and rural areas tend to have higher similarities with nearby locations (local movement), whereas, in big cities, and even more in correspondence with stations and airports, top similarities are lower in absolute value and more distributed over a larger number of places, often comprising more distant locations (longer-distance movement). Therefore, the similarity distribution varies according to the type of location considered, potentially helping to identify the kind of place, such as belonging to a rural area, a city, or an airport.
Similarity analysis may also be used to reveal particular functions of places within a certain area: higher similarities between two locations in two different cities can identify centers of connection, such as the train stations connecting those cities; or, if we already know the locations of train stations, bus stations and airports within two different cities, the highest similarity can reveal which means of transportation people generally use to move between those two cities.

Embeddings of traces can be constructed from the vectors of single locations. In this way, trace comparison assumes a different meaning than simply measuring the distance between COMs, acquiring a behavioral flavor, being influenced by areas of interest (limited groups of locations having very similar representations, often belonging to specific areas such as borders of lakes, tracts of coastline, city metropolitan areas, or confined rural territories). As for places, traces in the same regions tend to have a more similar representation than traces located far away. However, on a smaller scale, different behaviors emerge: traces with comparable COM distances may have a different cosine similarity due to the presence of areas of interest influencing the representations of the traces located nearby. We showed indeed cases of traces with comparable distances and the same direction in space (or the same angles between directions), where the ones in proximity of the same area of interest have a higher similarity.

Moreover, embeddings of visitors can be created in order to compare behaviors not only between different users, but also between groups of traces belonging to the same user. We showed how the motion of a user can be studied according to the time of the day, displaying for example that a particular user has a lower similarity during specific hours of the day as compared to the other hours. This gives an idea about the motion patterns of a user’s activities, revealing under which conditions the behavior is substantially different from usual. We finally studied similarities between the mobility of users grouped by nationality, pointing it out as a relevant factor in understanding the motion of people, whose movements appear to be related to cultural and geographic affinity between users’ nationalities. This implicitly highlights how nationality can be taken into consideration as a potential feature for helping predict movements of foreign visitors. Depending on research purposes, any other type of grouping can be performed, using embedding vectors to inspect similarities between different categories.

In conclusion, embeddings have the advantage of meaningfully representing locations on the basis of users’ motion, even over a wide territory, and are suitable for different applications. The main scenarios are related to similarity searching, clustering approaches, and prediction algorithms. Comparisons of similar places can be performed in order to quickly identify which locations a place is more connected to, which routes are often traveled by users, where the flux of people mainly flows between locations, and which places belong to highly connected delimited areas of the territory. Clustering can be used on locations, users and groups of users, uncovering interesting associations, whereas deeper studies on the classification of location types can take advantage of their different similarity distributions. We can finally utilize location embeddings as a pre-processing step for prediction models, expecting a performance improvement over traditional representations.
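As a concrete illustration of the similarity-searching scenario, the top-k locations most behaviorally connected to a query place can be retrieved directly from the embedding matrix. The matrix below is a random toy stand-in for the trained vectors, and `top_k_similar` is a hypothetical helper, not code from the paper.

```python
import numpy as np

def top_k_similar(query_id, ids, matrix, k=3):
    # Return the k location IDs whose embeddings have the highest cosine
    # similarity with the query location's embedding.
    normed = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    sims = normed @ normed[ids.index(query_id)]
    order = np.argsort(-sims)  # indices sorted by descending similarity
    return [(ids[i], float(sims[i])) for i in order if ids[i] != query_id][:k]

# Toy stand-in for the trained location embeddings.
ids = ["loc_%d" % i for i in range(10)]
matrix = np.random.default_rng(1).standard_normal((10, 16))
print(top_k_similar("loc_0", ids, matrix))
```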

5. Conclusions

In this paper, we explore dense vector representations of locations, traces and visitors, constructed from large-scale mobility data. The Mot2vec model for generating embeddings consists of two steps, the trajectory pre-processing and the Skip-gram Word2vec-based model building. Aiming at constructing the input data to Skip-gram, we transformed the original trajectories into location sequences by fixing a time step, in order to encode time information implicitly in the position along the sequence. Mobility traces are converted to sequences of locations unfolding in fixed time steps, where, if more than one location event falls within a single time step, the one with the most occurrences is chosen to represent the location of the user. Thereafter, the Skip-gram Word2vec model is used to construct the location embeddings, being one of the most efficient techniques to define dense vectors of meaning. Embeddings of traces and visitors are finally created by averaging the vectors of the time-stamped locations belonging to each specific trace or visitor.

In general, embeddings obtained from the motion behavior of people were revealed to be a meaningful representation of locations, allowing a direct way of comparing locations’ connections and providing analogous similarity distributions for places of the same type. They also allow identifying common areas of interest over the territory and common motion behaviors of users, finding similarities intra-user, inter-user and inter-group.

There are several potential extensions of this paper. In particular, embedding representations can be tested in various applications, either fed into machine learning models or used as a basis for similarity searching and clustering approaches: comparison and clustering of similar places and people’s behaviors, pre-processing for predictive models, studies of place connectivity and human motion over the territory, and information retrieval on the functionality of places. As we studied behavioral similarities between nationalities, different types of user categories can be explored to find useful features that can be utilized in further applications. Moreover, we used a dataset composed of foreign visitors’ trajectories spread over a whole country, but different datasets can be employed, comprising motion traces belonging to a different type of users, or dealing with different territory dimensions, for example focusing on a particular region. Depending on the type of source data, different resolutions in time and space can be explored; in particular, GPS data would allow finer resolutions than telecom data. To conclude, as word embeddings are now employed in practically every NLP task related to meaning, geo-embeddings are meaningful representations that can potentially be used for various objectives related to human motion behavior and applied to a wide range of applications dealing with mobility traces.
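The two-step pipeline summarized above — fixed-time-step location sequences fed to Skip-gram — can be sketched as follows. The binning helper `to_location_sequence` and the hyperparameter values are illustrative assumptions; the Skip-gram step is shown with gensim’s standard API.

```python
from collections import Counter

def to_location_sequence(events, step_seconds=3600):
    # Convert a raw trace of (timestamp, location_id) events into a sequence
    # of locations unfolding in fixed time steps; within each step, the
    # location with the most occurrences represents the user's position.
    if not events:
        return []
    start = min(t for t, _ in events)
    bins = {}
    for t, loc in events:
        bins.setdefault(int((t - start) // step_seconds), []).append(loc)
    return [Counter(bins[i]).most_common(1)[0][0] for i in sorted(bins)]

# Each sequence plays the role of a "sentence" and locations the "words".
# Skip-gram training, e.g. with gensim (hyperparameters are illustrative):
#   from gensim.models import Word2Vec
#   model = Word2Vec(sequences, vector_size=128, window=5, sg=1, min_count=1)
#   vec = model.wv[some_location_id]
```

Note that this sketch simply skips time steps with no events; handling such gaps (e.g. carrying the last known location forward) is a design choice left open here.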

Author Contributions: A.C. conceived and designed the experiments, analyzed the data and wrote the paper. E.B. supervised the work, helped with designing the conceptual framework, and edited the manuscript.

Funding: This research was funded by the Austrian Science Fund (FWF) through the Doctoral College GIScience at the University of Salzburg (DK W 1237-N23).

Acknowledgments: The authors would like to thank Vodafone Italia for providing the dataset for the case study.

Conflicts of Interest: The authors declare no conflict of interest.


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).