ORE Open Research Exeter

TITLE: Continuous Control with a Combination of Supervised and Reinforcement Learning
AUTHORS: Kangin, D; Pugeault, N
JOURNAL: Proceedings of the International Joint Conference on Neural Networks
DEPOSITED IN ORE: 23 April 2018
This version available at http://hdl.handle.net/10871/32566

COPYRIGHT AND REUSE: Open Research Exeter makes this work available in accordance with publisher policies.
A NOTE ON VERSIONS: The version presented here may differ from the published version. If citing, you are advised to consult the published version for pagination, volume/issue and date of publication.

Continuous Control with a Combination of Supervised and Reinforcement Learning
Dmitry Kangin and Nicolas Pugeault
Computer Science Department, University of Exeter, Exeter EX4 4QF, UK
fd.kangin,
[email protected]

Abstract—Reinforcement learning methods have recently achieved impressive results on a wide range of control problems. However, especially with complex inputs, they still require an extensive amount of training data in order to converge to a meaningful solution. This limits their applicability to complex input spaces such as video signals, and makes them impractical for use in complex real-world problems, including many of those for video-based control. Supervised learning, on the contrary, is capable of learning on a relatively limited number of samples,

sures: for example, by the average speed, time for completing a lap in a race, or other appropriate criteria. This situation is the same for other control problems connected with robotics, including walking [9] and balancing [10] robots, as well as in many others [11]. In these problems, there also usually exist some criteria for assessment (for example, the time spent to pass the challenge), which would help to assess how desirable these control actions are.