Learning Robust Helpful Behaviors in Two-Player Cooperative Atari Environments

Paul Tylkin (Harvard University), Goran Radanovic (MPI-SWS∗), David C. Parkes (Harvard University)

Abstract

We initiate the study of helpful behavior in the setting of two-player Atari games, suitably modified to provide cooperative incentives. Our main interest is to understand whether reinforcement learning can be used to achieve robust, helpful behavior, in which one agent is trained to help a second, partner agent. Robustness requires the helpful AI to be able to cooperate effectively with a diverse set of partners. We study this question with both artificial partner agents and human participants (introducing a new, web-based framework for the study of human-with-AI behavior). We achieve positive results in both Space Invaders and Fall Down, as well as successful transfer to human partners, including people who are asked to deliberately follow unexpected behaviors.

1 Introduction

As we anticipate a future of systems of AIs that interact in ever-increasing ways, both with each other and with people, it is important to develop methods to promote cooperation. This need for cooperation is relevant, for example, in settings with automated vehicles [1] and home robotics [14], as well as in military domains, where UAVs assist teams of soldiers [31]. In this paper, we seek to advance the study of cooperative behavior through suitably modified, two-player Atari games. Although these are closed and relatively simple environments, there is a rich, recent tradition of using Atari to drive advances in AI [23, 24].