Learning Mobility Flows from Urban Features with Spatial Interaction Models and Neural Networks*
Total Page:16
File Type:pdf, Size:1020Kb
Learning Mobility Flows from Urban Features with Spatial Interaction Models and Neural Networks* Gevorg Yeghikyan Felix L. Opolka Mirco Nanni Bruno Lepri Pietro Lio` Scuola Normale Superiore University of Cambridge ISTI-CNR FBK University of Cambridge Pisa, Italy Cambridge, UK Pisa, Italy Trento, Italy Cambridge, UK gevorg.yeghikyan@sns.it flo23@cam.ac.uk mirco.nanni@isti.cnr.it lepri@fbk.eu pl219@cam.ac.uk Abstract—A fundamental problem of interest to policy mak- More specifically, the motivation for this task comes from ers, urban planners, and other stakeholders involved in urban a scenario in which it is necessary to assess the impact development projects is assessing the impact of planning and of an urban development project on the OD flows in and construction activities on mobility flows. This is a challenging task due to the different spatial, temporal, social, and economic out of the project’s location. Examples of these motivating factors influencing urban mobility flows. These flows, along with scenarios include retail location choice and consumer spatial the influencing factors, can be modelled as attributed graphs behaviour prediction, which have been approached with the with both node and edge features characterising locations in Huff model and its modifications [7]. These models, however, a city and the various types of relationships between them. suffer from a series of drawbacks related mostly to overly In this paper, we address the problem of assessing origin- destination (OD) car flows between a location of interest and restrictive assumptions. In this paper, we take a different every other location in a city, given their features and the approach and focus on the problem of evaluating OD flows structural characteristics of the graph. We propose three neural in and out of a location of interest. By modelling urban flows network architectures, including graph neural networks (GNN), as attributed graphs in which the nodes represent locations and conduct a systematic comparison between the proposed in a city (i.e. each node is described by a vector of features methods and state-of-the-art spatial interaction models, their modifications, and machine learning approaches. The objective such as population density, Airbnb prices, available parking of the paper is to address the practical problem of estimating areas, etc.), and the edges represent the car flows between potential flow between an urban development project location them (each one described by a vector of features such as road and other locations in the city, where the features of the project distance, average time required to travel, average speed, etc.), location are known in advance. We evaluate the performance this project aims to offer an instrument for assessing flows of the models on a regression task using a custom data set of attributed car OD flows in London. We also visualise the model between a specific location and all other locations in the city. performance by showing the spatial distribution of flow residuals Since a rigorous experimental setting would have required across London. difficult-to-obtain longitudinal data of OD flows before and Index Terms—urban mobility flows, spatial interaction models, after the completion of an urban development project, we set graph neural networks, urban computing up a quasi-experimental setting. We randomly select locations in a city and the flows associated with them as a test set, I. INTRODUCTION and attempt to find a function that takes the urban features Planning and managing city and transportation infrastruc- describing city locations and the remaining flows as input, tures requires understanding the relationship between urban and predicts the flows in the test set as output. mobility flows and spatial, structural, and socio-economic In sum, our paper makes the following contributions: features associated with them. There exists extensive literature • We propose three neural network architectures for pre- addressing this problem ranging from the classical gravity arXiv:2004.11924v1 [cs.SI] 24 Apr 2020 dicting car flows between a location of interest and every model and its modifications [1], [2] to the more recent spatial other location in a city. Two of the models use graph con- econometric interaction models [3] and the non-parametric ra- volutional layers that pool information from geographical diation models [4] that attempt to characterise cross-sectional or topological neighbourhoods around relevant nodes to origin-destination (OD) flow matrices. Furthermore, various incorporate more information (Section V). neural network-based models have been proposed for predict- • We evaluate and compare our models on a custom dataset ing temporal OD flow matrices [5], [6]. However, modelling of aggregate OD car flows in London, containing node OD flow matrices in their entirety, the mentioned works do and edge features (Section VI). not address the problem of assessing flows between a specific • We show that the proposed neural network models outper- location and every other location in the city, given all other form well-known spatial interaction and machine learning flows, other location characteristics, as well as information on models. A comparison among neural network models the dyadic relations between those locations. reveals that graph convolutions do not substantially im- *To appear in the Proceedings of 2020 IEEE International Conference on prove prediction performance on the formulated task Smart Computing (SMARTCOMP 2020) (Sections IV, VI). • We describe our custom dataset and make it publicly disadvantage of considering either spatial agglomeration or available along with the code for this study (Section III). competition effects, ignoring the fact that they can coexist in the same location. Even though a number of extensions to the II. RELATED WORK Huff model and the gravity framework in general have been The problem of estimating human flows between locations proposed to overcome spatial non-stationarity and to include in a geographical space has been first addressed by [1] through a larger array of features affecting the flows [19], [20], this a family of spatial interaction models and subsequently ex- family of models, along with the non-parametric radiation and tended by [2]. Spatial interaction models, extensively used population-weighted opportunities model, have demonstrated to estimate human mobility flows and trip demand between to fall short of high predictive capacity particularly at the city locations as a function of the location features, have be- scale [21], [22], [23]. come an acknowledged method for modelling geographical More recently, machine learning, particularly a Random mobility in transportation planning [8], [9], commuting [10], Forest approach, has shown promising results in reconstructing and spatial economics [11]. The spatial interaction models inter-city OD flow matrices [24]. However, its performance on are usually calibrated via an Ordinary Least Squares (OLS) intra-urban flow data remains to be tested. regression, which assumes normally distributed data. However, Moreover, as already mentioned, the discussed models ad- OD flows are usually not distributed normally, are count data, dress the problem of modelling the OD flow matrix as a whole and contain a large number of zero flows. This makes the and have to be adapted to our specific task of estimating flows setting incompatible with OLS estimation and requires either between a specific location and all other locations, given the a Poisson model or, in the presence of over-dispersion, a other flows in the city, the location features, and the features Negative Binomial Regression (NB) model [12]. describing the dyadic relations between them, respectively. Another major concern in this modelling scenario are the The problem of estimating OD flows has also been ad- complex interactions often caused by spatial dependencies dressed with neural network methods [25]. As flows are most and non-stationarity. The former arises from spill-over effects naturally modelled by graphs, most work has focused on the from a location to its neighbourhoods, while the latter is use of graph neural networks for flow estimation. An early caused by the influence of independent variables varying neural network model for graph structured data has been across space. These issues have been addressed in literature by suggested in [26]. Later work has specifically focused on spatial autocorrelation and geographically weighted modelling generalising Convolutional Neural Networks from the domain techniques [13], [3], [14], [12]. of regular grids to the domain of irregular graphs [27], [28]. Another approach within the spatial interaction modelling One of the most commonly used graph neural network models paradigm is the Huff model and its extensions [7]. Originally is the Graph Convolutional Neural Network (GCN) proposed developed mainly for retail location choice and turnover pre- in [29]. diction, they represent a probabilistic formulation of the grav- Graph neural networks have previously been applied to ity model. The Huff model considers OD flows as proportional urban planning tasks. In [5], they have been used to predict the to the relative attractiveness and accessibility of the destination flow of bikes within a bike sharing system. Unlike our model, flows are modelled as node-level features, which requires a compared to other competing destinations. The probability Pij of a consumer at location i of choosing to shop at a retail different neural network model and does not allow to predict location j is framed as: flows between specific pairs of nodes. Although [30] uses graph neural networks to predict flows between parts of a α −β Aj Dij city, their model operates on spatio-temporal data and focuses Pij = ; (1) Pn AαD−β on the temporal aspect of the data. Beyond flow prediction, j=1 j ij in [31], a graph neural network model has been proposed where Aj is a measure of attractiveness of retail location j, for building site selection. A broader overview of machine such as area or a linear combination of different features, Dij learning methods applied to the task of urban flow prediction is the distance between locations i and j, α and β, estimated is given in [32].