Smart Train Operation Algorithms Based on Expert Knowledge and Ensemble Cart for the Electric Locomotive,” Knowledge-Based Systems, Vol

PREPRINT VERSION. PUBLISHED IN IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS - https://ieeexplore.ieee.org/abstract/document/9144488. Smart Train Operation Algorithms based on Expert Knowledge and Reinforcement Learning Kaichen Zhou, Student Member, IEEE, Shiji Song, Member, IEEE, Anke Xue, Member, IEEE Keyou You, Member, IEEE and Hui Wu, Student Member, IEEE Abstract—During recent decades, the automatic train operation on the train’s trajectory. And improving the performance of the (ATO) system has been gradually adopted in many subway systems ATO system has become a focus in the field of the transportation for its low-cost and intelligence. This paper proposes two smart system. In most cases, research on the urban subway system train operation algorithms by integrating the expert knowledge with reinforcement learning algorithms. Compared with previous works, can be divided into two parts: the energy-efficient train operation the proposed algorithms can realize the control of continuous action committed to designing an off-line optimized train trajectory, and for the subway system and optimize multiple critical objectives the automatic train tracking method to track the real-time train without using an offline speed profile. Firstly, through learning speed-distance profile. historical data of experienced subway drivers, we extract the expert knowledge rules and build inference methods to guarantee the riding During recent years, lot of studies are devoted to designing comfort, the punctuality, and the safety of the subway system. Then an off-line optimized train trajectory to improve the energy we develop two algorithms for optimizing the energy efficiency of efficiency. For example, Albrecht et al. used a comprehensive train operation. One is the smart train operation (STO) algorithm perturbation analysis to show that a key local energy function based on deep deterministic policy gradient named (STOD) and the is strictly convex with a unique minimum, and thus proved that other is the smart train operation algorithm based on normalized advantage function (STON). Finally, we verify the performance of the optimal switching points are uniquely defined for each steep proposed algorithms via some numerical simulations with the real section [3]. Besides, the train operation problem also involves field data from the Yizhuang Line of the Beijing Subway and many other aspects, such as riding comfort and punctuality. In illustrate that the developed smart train operation algorithm are order to minimize the total travel time of passengers and the better than expert manual driving and existing ATO algorithms in energy consumption of the train operation, Wang et al. proposed terms of energy efficiency. Moreover, STOD and STON can adapt to different trip times and different resistance conditions. a new iterative convex programming (ICP) approach to obtain the optimal departure times, running times and dwell times, to Index Terms —Smart train operation, subway, expert knowledge, solve the train scheduling problem [4]. Focusing on the energy- reinforcement learning. saving and the service quality, Yang et al. formulated a two- objective integer programming model with headway time and I. INTRODUCTION dwell time control to find the optimal solution by designing With the deterioration of modern urban traffic problems and a genetic algorithm with binary encoding [5]. Considering the energy problems, the urban subway is getting more and more energy consumption and trip time as the main objectives of attention due to its advantages on safety, punctuality, and energy optimization, Wei et al. developed a multi-objective optimization efficiency [1]. Since January 9th, 1863, the first subway has model for the speed trajectory by using optimal speed trajec- started operation from Paddington to Farringdon, and until 2015, tory searching strategies under different track characteristics [6]. more than 150 cities were hosting approximately 160 subway Moreover, the parameters of railways are not always constant. systems around the world. With the consideration of variable gradients and arbitrary speed arXiv:2003.03327v3 [cs.CE] 1 Jan 2021 et al. Meanwhile, with the acceleration of the modernization process limits, Khmelnitsky constructed a numerical algorithm to obtain the optimal velocity profile [7]. With the development of of the subway, the automatic train operation system has been used in many places to replace manual driving for its low- artificial intelligence, many intelligent algorithms have recently been used in the train operation. Açıkba¸s et al. implemented cost and intelligence. The ATO system first generates the target artificial neural networks with a genetic algorithm to optimize speed curve based on various requirements under both the train condition and the railway condition, and then sends a speed coasting points of speed-distance trajectory in order to obtain minimum energy consumption for a given travel time [8]. Yang et al. control command to control the train to track the generated target speed curve [2]. Thus, the ATO system has a direct influence integrated simulation-based methodologies and genetic algorithm to reduce the calculation difficulties and seek the approximate This work is funded by the key project of the National Natural Science optimal coasting control strategies on the railway network [9]. Foundation of China under Grant 61936009, and is funded by the major research Yin et al. used the reinforcement learning-Q learning method, plan of the National Major Research Program under Grant 2018AAA0101604. to construct an intelligent train operation system that can meet This work was done when the first author was with Department of Automation, Tsinghua University, under the supervision of Prof.Shiji Song and Dr.Keyou You. multiple objectives [10]. K. Zhou is with Department of Computer Science, University of Oxford, Oxford Moreover, tracking the real-time train speed profile is also a OX1 2JD, UK (e-mail: [email protected]). S. Song, K. You and H. Wu are with Department of Automation and BNRist, critical research topic. For example, Liu et al. proposed a high- Tsinghua University, Beijing 100084, China (e-mail: [email protected]; speed railway control system based on the fuzzy control method [email protected]; [email protected]). and designed a control system in the Matlab software according to A. Xue is with the Key Laboratory for IOT and Information Fusion Technology of Zhejiang, Institute of Information and Control, Hangzhou Dianzi University, the expert experience and knowledge [11]. Wu et al. used variable Hangzhou 310018, China (e-mail: [email protected]). structure technique and a time-delayed compensator, to design 1 PREPRINT VERSION. PUBLISHED IN IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS - https://ieeexplore.ieee.org/abstract/document/9144488. a state observer-based adaptive fuzzy controller to approximate 1 the unknown system parameters, and thus trajectory tracking problem of a series of two-wheeled self-balancing vehicles can ) be addressed [12]. Gu et al. proposed a new energy-efficient train )F )G operation model based on real-time traffic information from the )U geometric and topographic points of view through a nonlinear programming method [13]. Recently, Li et al. designed a robust * sampled-data cruise control scheduling with the form of linear matrix inequality(LMI) and proposed numerical examples that )J verified the effectiveness of the proposed algorithms to track the SD trajectory precisely [14]. Despite great achievements in previous studies, there are still some essential problems unsolved, which blocks the development Fig. 1: The force diagram of the train. of the ATO system. Firstly, for multiple objectives of train operation, most researchers mentioned above had just taken one or two objectives into account, and there is no comprehensive problems and formulate multiple objectives of train operation analysis about designing optimal train operation to meet multiple into numerical evaluation indices to systematically evaluate the objectives. Secondly, the modern subway is capable of outputting train operation problem. Plus, we state the objectives of this continuous traction and braking force [15], however, there are rare paper. In Section III, the structure of STO is presented. Then researches devoted to design the control model for continuous we put forward expert knowledge rules and summarize inference action while considering complicated train operation conditions, methods. Besides, the principles and the algorithms of STOD such as variables speed limits. Thirdly, in the ATO system, the and STON are explained. In Section IV, the simulation platform optimized speed profile was designed before the operation of the is built. Three numerical simulations are made based on the real train, and the train is controlled to track the designed optimized field data of YLBS. Conclusions are summarized in Section V. speed profile during the trip time which largely decreases the flexibility and the robustness of the ATO system. Plus, it is hard II. PROBLEM FORMULATION to implement complicated mathematic optimal methods to treat In general, the problem of train control can be formulated as an the nonlinear train operation problem, thus it is necessary to optimal control problem with focus on finding an optimal control design a model that can realize train control without considering strategy for traction and braking force during the trip time. Firstly, offline optimized speed profile. Finally, the real subway operation we define ∆t as the minimum time interval, and the trip time of is faced with many unexpected situations, such as, the changed train can be described as follow: trip time of one subway which influences the timetable of the t = t + ∆t, (1) whole line, and the railway aging which changes the railway i+1 i resistance condition. Being faced with these problems, modern for 0 ≤ i ≤ n − 1. The total trip time T is defined as subway always transfers from the ATO system to manual driving T = t − t , (2) which largely decreases the intelligence and efficiency of train n 0 operation. From the analysis above, the contribution of the paper where the initial running time t0 = 0(s).

Smart Train Operation Algorithms Based on Expert Knowledge and Ensemble Cart for the Electric Locomotive,” Knowledge-Based Systems, Vol

Ambitious Roadmap for City's South

Beijing Subway Map

EUROPEAN COMMISSION DG RESEARCH STADIUM D2.1 State

Can Light Rail Benefit Job-Housing Relationships and Land Use-Transp Ortation Integration in New Town? ——Case Study of Yizhuang in Beijing

Job-Worker Spatial Dynamics in Beijing: Insights from Smart Card Data

China Clean Energy Study Tour for Urban Infrastructure Development

Exploration in the Curriculum and Teaching Based Cultivation of Innovation Capabilities for Graduate Students

Multi-Train Energy Saving for Maximum Usage of Regenerative Energy by Dwell Time Optimization in Urban Rail Transit Using Genetic Algorithm

Dynamic Passenger Demand Oriented Metro Train Scheduling with Energy-Eﬃciency and Waiting Time Minimization: Mixed-Integer Linear Programming Approaches

Flood Risk Assessment of Subway Systems in Metropolitan Areas Under Land Subsidence Scenario: a Case Study of Beijing

Assessing Survivability of the Beijing Subway System

Coordinated Train Control in a Fully Automatic Operation System for Reducing Energy Consumption