Explainable Federated Learning for Taxi Travel Time Prediction
Jelena Fiosina (https://orcid.org/0000-0002-4438-7580)
Institute of Informatics, Clausthal Technical University, Julius-Albert Str. 4, D-38678, Clausthal-Zellerfeld, Germany

Keywords: FCD Trajectories, Traffic, Travel Time Prediction, Federated Learning, Explainability.

Abstract: Transportation data are geographically scattered across different places, detectors, companies, or organisations and cannot be easily integrated under data privacy and related regulations. The federated learning approach helps process these data in a distributed manner while respecting privacy concerns. The federated learning architecture is based mainly on deep learning, which is often more accurate than other machine learning models. However, deep-learning-based models are opaque black-box models whose predictions should be explained to both users and developers. Although model explanation methods have been studied extensively, few solutions exist for explaining federated models. We propose an explainable horizontal federated learning approach, which enables processing of the distributed data while preserving their privacy, and investigate how state-of-the-art model explanation methods can explain it. We demonstrate this approach for predicting travel time on real-world floating car data from Brunswick, Germany. The proposed approach is general and can be applied to processing data in a federated manner for other prediction and classification tasks.

Fiosina, J. Explainable Federated Learning for Taxi Travel Time Prediction. DOI: 10.5220/0010485606700677. In Proceedings of the 7th International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS 2021), pages 670-677. ISBN: 978-989-758-513-5. Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.

1 INTRODUCTION

Most real-world data are geographically scattered across different places, companies, or organisations, and, unfortunately, cannot be easily integrated under data privacy and related regulations. This is especially topical for the transportation domain, in which ubiquitous traffic sensors and the Internet of Things create a world-wide network of interconnected, uniquely addressable, cooperating objects, which enable exchange and sharing of information. With the increase in the amount of traffic, a large amount of decentralised data is available.

Fuelled by the large amount of data collected in various domains and the high available computing power, analytical procedures and statistical models for data interpretation and processing rely on methods of artificial intelligence (AI). In recent years, AI has made considerable progress. These data-driven methods replace complex analytical procedures by multiple calculations. They are easily applicable and, in most cases, more accurate considering their machine learning (ML) ancestry. Accuracy and interpretability are the two dominant features of successful predictive models. However, more accurate black-box models are not sufficiently explainable and transparent. This feature of AI-driven systems complicates user acceptance and can be troublesome even for model developers.

Therefore, contemporary AI technologies should be capable of processing these data in a decentralised manner, in accordance with data privacy regulations. Moreover, the algorithms should be maximally transparent, which makes the decision-making process user-centric.

Federated learning is a distributed ML approach, which enables model training on a large corpus of decentralised data (Konečný et al., 2016). It has three major advantages: 1) it does not need to transmit the original data to the cloud, 2) the computational load is distributed among the participants, and 3) it assumes synchronisation of the distributed models with a server for more accurate models. The main assumption is that the federated model should be parametric (e.g., deep learning) because the algorithm synchronises the models by synchronising their parameters. A known limitation of deep learning is that the neural networks inside it are unexplainable black-box models. Numerous model-agnostic and model-specific (e.g., Integrated Gradients, DeepLIFT) methods for explanation of black-box models are available (Kraus et al., 2020). Distributed versions of these methods exist, which allow them to be executed on various processes on graphics processing units (GPUs; see https://captum.ai/). However, to the best of our knowledge, studies on the explanation of geographically distributed federated deep learning models are lacking.

The mentioned challenges are topical in the transportation domain, in which the generation and processing of big data are necessary. We propose a privacy-preserving explainable federated model, which achieves an accuracy comparable to that of the centralised approach on the considered real-world dataset. We predict the Brunswick taxi travel time based on floating car data trajectories obtained from different taxi service providers, which should remain private. The proposed model makes predictions for the stated problem and allows a joint learning process over different users, processing the data stored at each of them without exchanging their raw data, but only parameters, as well as providing joint explanations about variable importance.

We address several research questions. 1) Which is the most accurate ML prediction method for the given data in a centralised manner? We identify the best hyper-parameters for each method. 2) Under which conditions is federated learning effective? We distribute the dataset among various providers and analyse the point after which the distributed, non-synchronised models lose their accuracy and federated learning becomes beneficial. We define an optimal synchronisation plan for parameter exchange, identifying the hyper-parameters and the frequency of parameter exchange that are acceptable and beneficial. 3) Do existing black-box explanation methods successfully explain federated learning models? We investigate how state-of-the-art explainability methods can explain federated models.

The rest of the paper is organised as follows. Section 2 describes the state of the art. Section 3 describes the proposed explainable federated deep learning concept and parameter synchronisation mechanisms. Section 4 introduces the available data, presents the experimental setup, and provides insights into the data preprocessing step. Section 5 presents the model validation and experimental results.

2 STATE OF THE ART

2.1 Distributed Data Analysis

The large amount of data currently generated in the transportation domain requires application of state-of-the-art methods of distributed/decentralised data analysis as well as investigation of novel approaches capable of processing distributed data, often with data privacy requirements.

When data centralisation is possible, accurate prediction models can be developed, which address the big data challenge through smart 'artificial' partitioning and parallelisation of data and computation within a cloud-based architecture or on powerful supercomputers (Fiosina and Fiosins, 2017).

Often, data should remain physically and logically distributed, without transmission of large information volumes and without the need to store, manage, and process massive datasets in one location. This approach enables data analysis with smaller datasets. However, scaling it up requires novel methods to efficiently support the coordinated creation and maintenance of decentralised data models. Specific decentralised architectures (e.g., multi-agent systems (MAS)) should be implemented to support the decentralised data analysis, which requires a coordinated suite of decentralised data models, including parameter/data exchange protocols and synchronisation mechanisms among the decentralised data models (Fiosina et al., 2013a).

The MAS-based representation of transportation networks helps overcome the limitations of centralised data analyses, which will enable autonomous vehicles to make better and safer routing decisions (Dotoli et al., 2017). Various cloud-based architectures for intelligent transportation systems have been proposed (Khair et al., 2021). A cloud-based architecture that focuses on decentralised big transportation data analyses was presented by Fiosina et al. (2013a).

Travel time is an important parameter of transportation networks, whose accurate prediction helps to reduce delays and transport delivery costs, improves reliability through better selection of routes, and increases the service quality of commercial delivery by bringing goods within the required time window (Ciskowski et al., 2018). A centralised deep-learning-based approach to the estimation of travel time for the ride-sharing problem was discussed by Al-Abbasi et al. (2019).

Often, a proper travel-time forecasting model needs pre-processing such as data filtering and aggregation. Travel-time aggregation models (non-parametric, semi-parametric) for decentralised data clustering and the corresponding coordination and parameter exchange algorithms were researched by Fiosina et al. (2013b). Travel-time estimation and forecasting using decentralised multivariate linear and kernel-density regression with corresponding parameter/data exchange