Article Towards a Better Understanding of Public Transportation Traffic: A Case Study of the Washington, DC Metro Robert Truong, Olga Gkountouna *, Dieter Pfoser and Andreas Züfle Department of Geography and GeoInformation Science, George Mason University, Fairfax, VA 22030, USA;
[email protected] (R.T.);
[email protected] (D.P.); azufl
[email protected] (A.Z.) * Correspondence:
[email protected] (O.G.); Tel.: +1-703-993-1210 Received: 26 June 2018; Accepted: 5 August 2018; Published: 7 August 2018 Abstract: The problem of traffic prediction is paramount in a plethora of applications, ranging from individual trip planning to urban planning. Existing work mainly focuses on traffic prediction on road networks. Yet, public transportation contributes a significant portion to overall human mobility and passenger volume. For example, the Washington, DC metro has on average 600,000 passengers on a weekday. In this work, we address the problem of modeling, classifying and predicting such passenger volume in public transportation systems. We study the case of the Washington, DC metro exploring fare card data, and specifically passenger in- and outflow at stations. To reduce dimensionality of the data, we apply principal component analysis to extract latent features for different stations and for different calendar days. Our unsupervised clustering results demonstrate that these latent features are highly discriminative. They allow us to derive different station types (residential, commercial, and mixed) and to effectively classify and identify the passenger flow of “unknown” stations. Finally, we also show that this classification can be applied to predict the passenger volume at stations. By learning latent features of stations for some time, we are able to predict the flow for the following hours.