Programmation Et Apprentissage Bayésien Pour Les Jeux Vidéo Multi-Joueurs, Application À L’Intelligence Artificielle De Jeux De Stratégies Temps-Réel Gabriel Synnaeve
Total Page:16
File Type:pdf, Size:1020Kb
Programmation et apprentissage bayésien pour les jeux vidéo multi-joueurs, application à l’intelligence artificielle de jeux de stratégies temps-réel Gabriel Synnaeve To cite this version: Gabriel Synnaeve. Programmation et apprentissage bayésien pour les jeux vidéo multi-joueurs, ap- plication à l’intelligence artificielle de jeux de stratégies temps-réel. Informatique et théorie desjeux [cs.GT]. Université de Grenoble, 2012. Français. NNT : 2012GRENM075. tel-00780635 HAL Id: tel-00780635 https://tel.archives-ouvertes.fr/tel-00780635 Submitted on 24 Jan 2013 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THÈSE Pour obtenir le grade de DOCTEUR DE L’UNIVERSITÉ DE GRENOBLE Spécialité : Informatique Arrêté ministérial : 7 août 2006 Présentée par Gabriel SYNNAEVE Thèse dirigée par Pierre BESSIÈRE préparée au sein Laboratoire d’Informatique de Grenoble et de École Doctorale de Mathématiques, Sciences et Technologies de l’Information, Informatique Bayesian Programming and Learn- ing for Multi-Player Video Games Application to RTS AI Jury composé de : M. Augustin LUX Professeur à Grenoble Institut National Polytechnique, Président M. Stuart RUSSELL Professeur à l’Université de Californie à Berkeley, Rapporteur M. Philippe LERAY Professeur à l’Ecole Polytechnique de l’Université de Nantes, Rapporteur M. Marc SCHOENAUER Directeur de Recherche à INRIA à Saclay, Examinateur M. Pierre BESSIÈRE Directeur de Recherche au CNRS au Collège de France, Directeur de thèse Foreword Scope This thesis was prepared partially (2009-2010) at INRIA Grenoble, in the E-Motion team, part of the Laboratoire d’Informatique de Grenoble, and (2010-2012) at Collège de France (Paris) in the Active Perception and Exploration of Objects team, part of the Laboratoire de la Physiologie de la Perception et de l’Action. I tried to compile my research with introductory material on game AI and on Bayesian modeling so that the scope of this document can be broader than my thesis committee. The first version of this document was submitted for revision on July 20th 2012, this version contains a few changes and corrections done during fall 2012 according to the reviews of the committee. Remerciements Je tiens tout d’abord à remercier Elsa, qui m’accompagne dans la vie, et qui a relu maintes fois ce manuscrit. Je voudrais aussi remercier mes parents, pour la liberté et l’aide qu’il m’ont apporté tout au long de ma jeunesse, ainsi que le reste de ma famille pour leur support et leurs encouragements continus. Mon directeur de thèse, Pierre Bessière, a été fabuleux, avec les bons cotés du professeur Smith : sa barbe, sa tenacité, ses histoires du siècle dernier, son champ de distorsion temporelle à proximité d’un tableau, et toujours une idée derrière la tête. Mes collègues, autant à l’INRIA qu’au Collège de France, ont été sources de discussions fort intéressantes : je les remercie tous, et plus particulièrement ceux avec qui j’ai partagé un bureau dans la (très) bonne humeur Mathias, Amaury, Jacques, Guillaume et Alex. Je tiens enfin à remercier chaleureusement tous mes amis, et en particulier ceux qui ont relus des parties de cette thèse, Étienne et Ben. Si c’était à refaire, je le referais surtout avec les mêmes personnes. 3 Abstract This thesis explores the use of Bayesian models in multi-player video game AI, particularly real-time strategy (RTS) game AI. Video games are in-between real world robotics and total simulations, as other players are not simulated, nor do we have control over the simulation. RTS games require having strategic (technological, economical), tactical (spatial, temporal) and reactive (units control) actions and decisions on the go. We used Bayesian modeling as an alternative to logic, able to cope with incompleteness of information and uncertainty. Indeed, incomplete specification of the possible behaviors in scripting, or incomplete specification of the possible states in planning/search raise the need to deal with uncertainty. Machine learning helps reducing the complexity of fully specifying such models. Through the realization of a fully robotic StarCraft player, we show that Bayesian programming can integrate all kinds of sources of uncertainty (hidden state, intention, stochasticity). Probability distributions are a mean to convey the full extent of the information we have and can represent by turns: constraints, partial knowledge, state estimation, and incompleteness in the model itself. In the first part of this thesis, we review the current solutions to problems raised by multi- player game AI, by outlining the types of computational and cognitive complexities in the main gameplay types. From here, we sum up the cross-cutting categories of problems, explaining how Bayesian modeling can deal with all of them. We then explain how to build a Bayesian program from domain knowledge and observations through a toy role-playing game example. In the second part of the thesis, we detail our application of this approach to RTS AI, and the models that we built up. For reactive behavior (micro-management), we present a real-time multi-agent decentralized controller inspired from sensorimotor fusion. We then show how to perform strategic and tactical adaptation to a dynamic opponent through opponent modeling and machine learning (both supervised and unsupervised) from highly skilled players’ traces. These probabilistic player-based models can be applied both to the opponent for prediction, or to ourselves for decision-making, through different inputs. Finally, we explain our StarCraft robotic player architecture and precise some technical implementation details. Beyond models and their implementations, our contributions fall in two categories: integrat- ing hierarchical and sequential modeling and using machine learning to produce or make use of abstractions. We dealt with the inherent complexity of real-time multi-player games by using a hierarchy of constraining abstractions and temporally constrained models. We produced some of these abstractions through clustering; while others are produced from heuristics, whose outputs are integrated in the Bayesian model through supervised learning. 5 Contents Contents 6 1Introduction 11 1.1 Context ......................................... 11 1.1.1 Motivations................................... 11 1.1.2 Approach .................................... 12 1.2 Contributions.................................... .. 12 1.3 Readingmap ...................................... 13 2GameAI 15 2.1 GoalsofgameAI.................................... 15 2.1.1 NPC....................................... 15 2.1.2 Win ....................................... 16 2.1.3 Fun ....................................... 17 2.1.4 Programming.................................. 17 2.2 Single-playergames................................. .. 19 2.2.1 Actiongames .................................. 19 2.2.2 Puzzles ..................................... 19 2.3 Abstractstrategygames .............................. .. 19 2.3.1 Tic-tac-toe,minimax.............................. 19 2.3.2 Checkers,alpha-beta.............................. 21 2.3.3 Chess,heuristics ................................ 22 2.3.4 Go, Monte-Carlo tree search . 23 2.4 Games with uncertainty . 25 2.4.1 Monopoly.................................... 26 2.4.2 Battleship.................................... 26 2.4.3 Poker ...................................... 28 2.5 FPS ........................................... 28 2.5.1 GameplayandAI................................ 28 2.5.2 Stateoftheart................................. 30 2.5.3 Challenges ................................... 30 2.6 (MMO)RPG ...................................... 31 2.6.1 GameplayandAI................................ 31 2.6.2 Stateoftheart................................. 32 2.6.3 Challenges ................................... 32 6 2.7 RTSGames....................................... 32 2.7.1 GameplayandAI................................ 33 2.7.2 Stateoftheart&challenges . 33 2.8 Gamescharacteristics ................................ 33 2.8.1 Combinatorics ................................. 34 2.8.2 Randomness .................................. 37 2.8.3 Partialobservations . .. 37 2.8.4 Timeconstant(s) ................................ 38 2.8.5 Recapitulation ................................. 38 2.9 Playercharacteristics.............................. .... 38 3 Bayesian modeling of multi-player games 43 3.1 ProblemsingameAI................................. 43 3.1.1 Sensingandpartialinformation . ..... 43 3.1.2 Uncertainty(fromrules,complexity) . .... 44 3.1.3 Verticalcontinuity ............................... 44 3.1.4 Horizontalcontinuity. .. 44 3.1.5 Autonomy and robust programming . 44 3.2 TheBayesianprogrammingmethodology. .... 45 3.2.1 The hitchhiker’s guide to Bayesian probability. ........ 45 3.2.2 AformalismforBayesianmodels . 47 3.3 ModelingofaBayesianMMORPGplayer . 49 3.3.1 Task....................................... 49 3.3.2 Perceptionsandactions ............................ 50 3.3.3 Bayesianmodel................................. 51