
České vysoké učení technické Fakulta elektrotechnická

DIPLOMA THESIS

Hraní obecných her s neúplnou informací
General Game Playing in Imperfect Information Games

Declaration: I declare that I have written this diploma thesis independently and that I have used only the sources (literature, projects, software, etc.) listed in the attached bibliography.

Prague, 7 May 2011 ……………………………………. signature

Acknowledgements

Here I would like to thank my advisor Mgr. Viliam Lisý, MSc. for his time and valuable advice which was a great help in completing this work. I would also like to thank my family for their unflinching support during my studies.

Tomáš Motal

Abstrakt

Název: Hraní obecných her s neúplnou informací

Autor: Tomáš Motal email: [email protected]

Oddělení: Katedra kybernetiky Fakulta elektrotechnická, České vysoké učení technické v Praze Technická 2 166 27 Praha 6 Česká Republika

Vedoucí práce: Mgr. Viliam Lisý, MSc. email: [email protected]

Oponent: RNDr. Jan Hric email: [email protected]

Abstrakt Cílem hraní obecných her je vytvořit inteligentní agenty, kteří budou schopni hrát konceptuálně odlišné hry pouze na základě zadaných pravidel. V této práci jsme se zaměřili na hraní obecných her s neúplnou informací. Neúplná informace s sebou přináší nové výzvy, které je nutno řešit. Například hráč musí být schopen se vypořádat s prvkem náhody ve hře či s tím, že neví přesně, ve kterém stavu světa se právě nachází. Hlavním přínosem této práce je TIIGR, jeden z prvních obecných hráčů her s neúplnou informací který plně podporuje jazyk pro psaní her s neúplnou informací GDL-II. Pro usuzování o hře tento hráč využívá metodu založenou na simulacích. Přesněji, využívá metodu Monte Carlo se statistickým vzorkováním. Dále zde popíšeme jazyk GDL-II a na námi navržené hře piškvorek s neúplnou informací ukážeme, jak se v tomto jazyce dají tvořit hry. Schopnost našeho hráče hrát konceptuálně odlišné hry i jeho výkonnost je experimentálně ověřena při hraní několika různých her (karty, piškvorky s neúplnou informací, Macháček).

Klíčová slova

Hraní obecných her, Hry s neúplnou informací, Monte Carlo

Abstract

Title: General Game Playing in Imperfect Information Games

Author: Tomáš Motal email: [email protected]

Department: Department of Cybernetics Faculty of Electrical Engineering, Czech Technical University in Prague Technická 2 166 27 Prague 6 Czech Republic

Advisor: Mgr. Viliam Lisý, MSc. email: [email protected]

Opponent: RNDr. Jan Hric email: [email protected]

Abstract The goal of General Game Playing (GGP) is to create intelligent agents that are able to play any game based only on the description of the game's rules. In this thesis we focus on GGP for games with imperfect information. Compared with perfect information games, imperfect information games bring numerous new challenges that must be tackled: for example, a player must find a way to deal with uncertainty about the current state of the world, or cope with randomness in the game. The main outcome of this thesis is TIIGR, one of the first GGP agents for games with imperfect information. Our agent uses a simulation-based approach to action selection. Specifically, it uses perfect information sampling Monte Carlo with Upper Confidence Bound applied to Trees as its main reasoning system. Further, we describe the Game Description Language II (GDL-II) and show how games can be written in this language, using a game we created. The ability of our player to play conceptually different games and its performance were verified on several games, including Latent Tic-Tac-Toe, Liar's Dice and a card game.

Keywords

General Game Playing, Imperfect Information Games, Monte Carlo

Contents

1 INTRODUCTION
1.1. THESIS OUTLINE
2 EXTENSIVE FORM GAMES
2.1. GAME
2.2. EXTENSIVE FORM GAME
2.3. SOLUTION CONCEPTS
3 GENERAL GAME PLAYING ALGORITHMS
3.1. MINIMAX
3.2. REGRET MINIMIZATION
3.3. MONTE CARLO METHODS
3.4. PERFECT INFORMATION SAMPLING
3.5. COUNTERFACTUAL REGRET
3.6. INFORMATION SET SEARCH
4 GENERAL GAME PLAYING
4.1. GENERAL GAME PLAYER
4.2. GENERAL GAME PLAYING COMPETITION
4.3. GAME DESCRIPTION LANGUAGE
4.3.1. Syntax
4.3.2. Restrictions
4.4. GDL-II
4.5. DRESDEN GGP SERVER
4.6. CREATING GDL-II GAMES
5 DISCUSSION
5.1. CFR
5.2. PIMC
5.3. ISS
5.4. CONCLUSIONS
6 IMPLEMENTATION
6.1. PALAMEDES
6.2. TIIGR
6.2.1. Palamedes - Reasoners
6.2.2. Perfect Information Game Player
6.2.3. Imperfect Information Generalization
7 EXPERIMENTS
7.1. VARIABLE TIME
7.2. VARIABLE "STATES TO BE SEARCHED" SIZE
7.3. VARIABLE
7.4. SIMPLE CARD GAME
7.5. LIAR'S DICE
8 CONCLUSIONS
8.1. EVALUATION
8.2. FUTURE WORK
9 BIBLIOGRAPHY
10 APPENDIX A
11 APPENDIX B

List of acronyms

CFR – counterfactual regret

EFG – extensive form game

GDL – game description language

GGP – general game playing

GS – game server

IIG – imperfect information game

PIG – perfect information game

PIMC – perfect information sampling Monte Carlo

MC – Monte Carlo

MCTS – Monte Carlo tree search

TIIGR – name of our imperfect information game player

UCT – Upper Confidence Bound applied to Trees

1 Introduction

Games have accompanied mankind since the beginning of time. We have all played some games during our lives, be it card, board or computer games. Many researchers focus on problems connected with game playing, one of them being the creation of an intelligent agent that would be able to play against humans and best them at their own games.

At first, researchers focused on games with perfect information. Probably the most discussed games at that time were Chess and Go. After several decades of research a computer called Deep Blue emerged; it was the first computer to defeat a reigning human world chess champion, Garry Kasparov. With some exaggeration we can say that overcoming this challenge allowed people to start focusing on other areas. After Deep Blue came Chinook, another program that was able to beat humans, this time in the game of Checkers. And so on. It is only logical that our attention has shifted towards imperfect information games.

Games with imperfect information can offer more than their predecessors could. They can conveniently model real-life strategic interactions among multiple agents (including uncertainty and imperfect information), be it in economics, the military, business process management, etc. An excellent example of a game that was used in reality to model military behavior and train military officers is the game Kriegspiel (originally named Instructions for the Representation of Tactical Maneuvers under the Guise of a Wargame), invented by the Prussian officer Georg von Reisswitz in the first years of the 19th century. It was used to train officers in tactical maneuvers in the Prussian army, and was later adopted by many other countries. Kriegspiel was applied during the Russo-Japanese war by the Japanese navy, which contributed to Japan's unexpected victory. Other examples of games with imperfect information are modern computer games used for combat simulation.

As we can see, there are numerous types of games used today for simulating reality. Unfortunately, there are not many solvers that are able to cope with the extremely large state spaces typical for these games. And there are even fewer domain-independent solvers that are able to play these games without domain-specific knowledge hard coded in advance. In this work we aim to create a program that is able to solve conceptually different games given only the most essential knowledge – the rules of the game.

Our contributions include:

- Creating an overview of approaches used for solving imperfect information games
- Analyzing counterfactual regret, perfect information sampling Monte Carlo and Information set search
- Creating a Latent Tic-Tac-Toe game in GDL-II


- Implementation of one of the first general game players fully compatible with GDL-II, which is able to play any alternating move game defined in GDL-II.

1.1. Thesis outline In this Section we present the outline of the thesis, which should give the reader a basic idea of what to expect.

In Chapter 2 we start by introducing basic concepts of game theory, such as what a game is, what the extensive form of a game looks like, etc., and some basic solution concepts for games.

In the following Chapter 3 we go through several algorithms that can be used for creating a game player such as Minimax, Monte Carlo methods, Counterfactual regret, etc.

Then in Chapter 4 we discuss general game players and the general game playing competition that was started to promote research in this area. After that we discuss the syntax of the game description language (GDL) that is now the standard for describing perfect information games (PIG), and GDL-II, which is used for describing imperfect information games (IIG). In the last section of this chapter we show how to create a game in GDL-II, using the game of Latent Tic-Tac-Toe that we created.

In Chapter 5 we set our requirements for our general game player and discuss the different approaches and their advantages and disadvantages. Based on our discussion we selected perfect information sampling Monte Carlo for our general game player TIIGR.

Following this discussion of existing methods and their advantages and disadvantages, we describe the implementation details of our TIIGR player in Chapter 6.

Chapter 7 presents several experiments that we performed with our imperfect information player. With these experiments we prove that our player is capable of playing conceptually different games. From these experiments we deduce which parameters influence our player’s performance.

Last but not least, in Chapter 8 we evaluate our work, revisit our goals and we offer several ways in which our player can be improved thus setting new goals for future work.


2 Extensive form games

This Chapter introduces the concept of Extensive form game. We begin in Section 2.1 with defining what a game means. In Section 2.2 we explain all key concepts in games (such as extensive form game, information sets, etc.) and we define the terminology that we use further in the text. Later we discuss several important types of games – zero-sum game, game of perfect recall, etc. In Section 2.3 we introduce one of the best known solution concepts for extensive form games - Nash equilibrium and Ɛ-Nash equilibrium.

2.1. Game Every one of us has some idea of what a game is. But let’s specify what every game consists of:

1. Player – A person or an agent who participates in the game and determines the actions in the game. In imperfect information games chance (dice rolling, etc.) is considered one of the players.
2. Action – A move that a player can make during the game.
3. Utility (payoff) – A reward that a player obtains after the game has ended. Utility depends on how all the players played during the game (on all players' actions).

It is important to state that in this thesis all games we discuss are implicitly considered sequential games (if not stated otherwise). A sequential game is a game where a player takes an action only in a specified moment that is defined by the order of players (e.g.: player 2 takes an action only after player 1 has taken hers). We call this moment a turn. This is opposed to simultaneous move games where all players choose their actions without first seeing what the other players have played.

Games can be divided into 2 classes based on the information players have available: games with perfect information and games with imperfect information. In perfect information games every player knows the whole state of the game world (e.g.: the position of all pieces, the cards dealt to other players, etc.). In imperfect information games a player has only limited knowledge of the world (e.g.: in poker a player only knows the cards dealt to her but not those dealt to the other players, etc.). Now we need to somehow represent the game. There are several different ways in which this can be done. The game representation we use throughout this thesis is the extensive form, which we will now describe.

2.2. Extensive form Game Extensive form is one of the ways to represent a game. There are other forms that can be used, such as the normal form (also called the strategic form), etc., but we will not discuss them in this thesis since we only use the extensive form. An extensive form game is represented in a tree form (there are no cycles in a tree, thus there are no cycles in an extensive form game). A game tree consists of nodes and edges. Nodes represent game states and edges represent actions available at the current state. Connected with every edge is a label with the action's name. The formal definition of an extensive form game for a game of imperfect information is as follows (Osborne & Rubinstein, 1994):

Definition 1 (Extensive Form): A finite extensive form game with imperfect information has the following components:

- A finite set N of players.
- A finite set H of sequences, the possible histories of actions, such that the empty sequence is in H and every prefix of a sequence in H is also in H. Z ⊆ H are the terminal histories; no sequence in Z is a strict prefix of any sequence in H. A(h) = {a : (h, a) ∈ H} are the actions available after a non-terminal history h.
- A player function P that assigns to each non-terminal history a member of N ∪ {c}, where c represents chance. P(h) is the player who takes an action after the history h. If P(h) = c, then chance determines the action taken after history h. Let H_i be the set of histories where player i chooses the next action.
- A function f_c that associates with every history h for which P(h) = c a probability measure f_c(· | h) on A(h); f_c(a | h) is the probability that a occurs after h, where each such probability measure is independent of every other such measure.
- For each player i ∈ N, a partition I_i of H_i with the property that A(h) = A(h') whenever h and h' are in the same member of the partition. I_i is the information partition of player i; a set I ∈ I_i is an information set of player i.
- For each player i ∈ N, a utility function u_i that assigns each terminal history a real value; u_i(z) is rewarded to player i for reaching terminal history z.
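The components above can also be read as a concrete data structure. The following minimal Python sketch shows one possible in-memory representation of them; the class layout and field names are illustrative assumptions of this text, not part of any GGP framework.

from dataclasses import dataclass
from typing import Dict, List, Tuple

History = Tuple[str, ...]                         # a history is a sequence of action labels

@dataclass
class ExtensiveFormGame:
    players: List[str]                            # the finite set N of players; chance is "chance"
    histories: List[History]                      # H: all histories, prefix-closed
    terminals: List[History]                      # Z: the terminal histories
    to_move: Dict[History, str]                   # P(h): who acts after non-terminal history h
    chance: Dict[History, Dict[str, float]]       # f_c(a | h) for histories with P(h) = "chance"
    info_sets: Dict[str, List[List[History]]]     # I_i: information partition of each player
    utility: Dict[History, Dict[str, float]]      # u_i(z) for every terminal history z

    def actions(self, h: History) -> List[str]:
        """A(h): the actions available after a non-terminal history h."""
        return [seq[len(h)] for seq in self.histories
                if len(seq) == len(h) + 1 and seq[:len(h)] == h]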

Let us explain the extensive form on a slightly modified Tic-Tac-Toe game. We consider a Tic-Tac-Toe game with imperfect information (Latent Tic-Tac-Toe). Compared to the classic Tic-Tac-Toe game there are several changes. First, each player sees only her own marks on the board. Second, when a player takes an action (tries to mark an empty space) there are 2 possible outcomes:

1. The mark was made (the same as in a perfect information game).
2. The mark could not be made (in the case there already is a different mark on the same tile).

In our Latent Tic-Tac-Toe example the game begins with an empty board. The empty board is the root (initial state) of the game tree. Because players take turns each ply of the game tree consists of nodes where only 1 player can select an action.

Now we need to define several concepts (most of them have been defined in the definition of the extensive form game, but we will try to use a less formal description for some of the concepts, and introduce some observations which can be made from these definitions). For that we will use Figure 1, which shows the first 3 plies of an imperfect information Tic-Tac-Toe game in extensive game form.

Figure 1: 3 plies of Tic-Tac-Toe game with several different concepts. Each ply has its own player (P1, P2). Each of these players has several information sets (all the nodes encircled by one dotted line represent 1 information set). The red line throughout the nodes is a history hx.

The idea of history was already formally defined in Definition 1, but because it is a key term which we use throughout the thesis, let's try to make the definition a little less formal. A history is a sequence of all players' actions. It is used for representing a node in the game tree as well. Because a history is a sequence of actions of all players, it provides a path starting from the root through the game tree which ends in one specific node. In Figure 1 the history hx (red line) represents both the sequence of actions taken from the root node of the game to the last node and the last node itself, with one cross and one circle.

Definition 2 (Information set): "Player i's information set at any particular point of the game is a set of different nodes in the game tree that she knows might be the actual node, but between which she cannot distinguish by direct observation." (Rasmusen, 2006)

Simply said, an information set is a set of states between which a player cannot distinguish. To show an example, let's consider the 2nd ply where it is player 2's turn. During the previous turn player 1 placed her cross somewhere on the game board, but player 2 does not know where. Therefore all the possible boards with only 1 cross placed are in a single information set of player 2, because she cannot distinguish in which state she is. There are several observations we can make from the definition:

a) All the nodes in an information set I_i are nodes where player i makes a decision.
b) All nodes in 1 information set must have the same actions available. If they had different actions available, then the player would be able to distinguish between them. Examples of sets that would be incorrectly considered an information set are shown in Figure 2.

Figure 2: Example of 2 incorrectly defined information sets. An information set is represented by the dotted line. The number next to the dotted line is the number of the player whose information set it is. Names of actions are written above the transition arrows. Player 2 cannot observe the action that player 1 took, thus she cannot distinguish between the 2 states. (Left) In this case player 2 can distinguish between the states because a different number of actions leads from each state. (Right) In this situation player 2 can distinguish between the 2 states because the actions leading from the states are different.

It is easy to perceive that if all information sets are singletons (each information set contains only one node) then we have a game of perfect information.

A zero-sum game is a game where the sum of utilities for all players in every terminal node equals zero.

It is easy to show that any game can be transformed into a zero-sum game. Let's take a general game with n players where we have a terminal node with utilities u_1, . . . , u_n whose sum is non-zero. Then by the simple trick of adding an imaginary (n+1)-th player with the utility u_{n+1} = -\sum_{i=1}^{n} u_i we have created a zero-sum game.

In this thesis we are going to focus on games of perfect recall. The following definition is taken from (Shoham & Leyton-Brown, 2010).

Definition 3 (Perfect recall): Player i has perfect recall in an imperfect-information game G if for any two nodes h, h' that are in the same information set for player i, for any path h_0, a_0, h_1, a_1, h_2, . . . , h_n, a_n, h from the root of the game to h (where the h_k are decision nodes and the a_k are actions) and for any path h_0, a'_0, h'_1, a'_1, h'_2, . . . , h'_m, a'_m, h' from the root to h' it must be the case that:

1. n = m;

2. for all 0 ≤ j ≤ n, if P(h_j) = i (i.e., h_j is a decision node of player i), then h_j and h'_j are in the same equivalence class for i; and

3. for all 0 ≤ j ≤ n, if P(h_j) = i (i.e., h_j is a decision node of player i), then a_j = a'_j.

G is a game of perfect recall if every player has perfect recall in it.

From the above definition, a game of perfect recall is a game where all players remember the whole history that has happened (that means all the actions taken by a player and by her opponent before ending in the current state). This means that even though the current state might be identical, in perfect recall games the path through which a player reached the state also defines the current state. Therefore a state that looks exactly the same might not be considered the same state in games of perfect recall. In Figure 3 there is an example of such a case. The final state is the same in both cases, but the path leading to the state is different, thus it is not the same state.

A game is of imperfect recall if the above definition does not hold (if it is not of perfect recall).

Figure 3: Example of a different state in perfect recall games.

A strategy of player i in a perfect information game can be viewed as an instruction sheet that tells the player which action to take in each state of the game. In an imperfect information game it tells player i what to do in each information set. Thus a strategy is a function of an information set and we denote it s_i(I), where I is the current information set the player is in (in a PIG an information set is a singleton and thus corresponds to exactly 1 game state). To keep the equations simple, whenever we write s_i further in the text we mean s_i(I). We have 2 types of strategies:

1. Pure strategy – a deterministic strategy that specifies for each state exactly 1 action that a player should take.
2. Mixed strategy – a probability distribution over all pure strategies.

Below are the formal definitions of strategy, strategy set and strategy profile as defined in (Rasmusen, 2006).

Definition 4 (Strategy): Player i's strategy s_i(·) is a rule that tells her which action to choose at each instant of the game, given her information set.

Definition 5 (Strategy set): Player i's strategy set or strategy space S_i = {s_i} is the set of strategies available to her.


Definition 6 (Strategy profile): A strategy profile s = (s_1, . . . , s_n) is an ordered set consisting of one strategy for each of the n players in the game.

Further in the text s_{-i} = (s_1, . . . , s_{i-1}, s_{i+1}, . . . , s_n) refers to a strategy profile without player i's strategy s_i.

2.3. Solution concepts In the previous section we have defined all important concepts and ideas centered on extensive form game. Let us now focus on how we can solve games. Algorithms and specific approaches are covered in Chapter 3, here we define and discuss the general idea.

Let's have a look at probably the best known and one of the fundamental solution concepts in game theory: Nash Equilibrium. But first, we need to define the idea of best response, which will later be used in the definition of Nash equilibrium.

Definition 7 (Best response): Player i's best response to the strategy profile s_{-i} is a mixed strategy s_i^* ∈ S_i such that u_i(s_i^*, s_{-i}) ≥ u_i(s_i, s_{-i}) for all strategies s_i ∈ S_i. (Shoham & Leyton-Brown, 2010)

Now we can finally define the Nash Equilibrium.

Definition 8 (Nash Equilibrium): A strategy profile s = (s_1, . . . , s_n) is a Nash equilibrium if, for all agents i, s_i is a best response to s_{-i}. (Shoham & Leyton-Brown, 2010)

From Definition 7 and Definition 8 we can see that a Nash equilibrium is a strategy profile from which none of the players has a reason to deviate. At this point we can provide one more definition of Nash equilibrium with the use of regret. Regret is a concept that tells us how much a player loses when she plays a specific move in response to the opponent's move; in other words, how much she regrets playing that move instead of playing the best response to the opponent's move. A more detailed description of regret and an algorithm based on regret can be found in Section 3.2. We give this alternative definition because later we discuss regret and counterfactual regret, and it provides a better understanding of the connection between Nash Equilibrium and regret.

Definition 9 (Nash Equilibrium): A strategy profile s = (s_1, . . . , s_n) is a Nash equilibrium if for all players i the value of regret is zero.

Definition 8 and Definition 9 state the exact same thing – that players have no reason to deviate from a Nash equilibrium. If a player does not want to deviate from some plan of actions, then the player does not regret taking those actions. Thus each player's regret must equal 0.


No player deviates because no player can increase her utility by abandoning s_i while holding the strategies of the other players fixed. Let's illustrate this on the example shown in Figure 4. This is the Prisoner's Dilemma game (Russell & Norvig, 2003). Each player has 2 possible actions: Testify and Defect. The rewards of the game are defined in Figure 4. We can see that the Nash equilibrium in this game is (Testify, Testify). Why is it so? Well, if player 1 decides to Defect from this strategy while we fix player 2's choice, she would move to a history with a lower utility. Thus she would not gain anything; on the contrary, she would lose her reward of 2. The same goes for player 2. Thus no player has the tendency to deviate from her decision. If we look at all the other histories, none of them has this property.

Figure 4: (Left) Extensive form and (Right) normal form of the Prisoner's Dilemma. We have not introduced the normal game form, but for our purposes it is enough to say that inside the matrix are the utilities of the players; each row is an action available to player 1 and each column is an action available to player 2.

A player using a Nash Equilibrium strategy plays the best response against her opponent. There are situations when players might not want to change their strategies, namely if the utility they would gain from switching to the Nash Equilibrium is smaller than some value Ɛ. This solution concept is called Ɛ-Nash Equilibrium and is defined below:

Definition 10 (Ɛ-Nash Equilibrium): Fix Ɛ > 0. A strategy profile s = (s_1, . . . , s_n) is an Ɛ-Nash equilibrium if, for all agents i and for all strategies s_i' ≠ s_i, u_i(s_i, s_{-i}) ≥ u_i(s_i', s_{-i}) − Ɛ (Shoham & Leyton-Brown, 2010).

This Chapter covered the concept of a game and one of its formal models – the extensive game form. Later in this thesis we consider our games to always be in extensive form because some algorithms can be easily explained on this form. Then we defined several concepts that are closely bound with games – information sets, zero-sum games, perfect recall, strategy, history, etc. Later in this work we use these concepts (especially information sets, strategy and history) to define algorithms for solving imperfect information games. In the end we introduced one of the most widely known solution concepts – Nash Equilibrium and Ɛ-Nash Equilibrium. Nash Equilibrium is especially important because if we are able to prove that our algorithm converges to a Nash Equilibrium, it means that our player plays optimally against a rational opponent. In the following Chapter we explain several general game playing algorithms and concepts that can be used to implement a general game player.


3 General game playing algorithms

This Chapter lists some of the existing algorithms that can be used for implementing general game players – agents that can play conceptually different games without having any game-specific knowledge hard coded in advance. We will make a brief stop at each algorithm, explain it and point out its advantages and disadvantages. In Section 3.1 we discuss the Minimax approach and in the following Section 3.2 we cover the idea of regret. Section 3.3 provides an introduction to Monte Carlo methods. In the following Section 3.4 we extend the Monte Carlo methods and explain how Monte Carlo tree search can be applied to games with imperfect information. In Section 3.5 we look into a new idea called counterfactual regret minimization. Last but not least, we explain the Information Set Search technique in Section 3.6.

Throughout this Chapter all games we discuss are alternating move games – each player plays in a defined order (as opposed to simultaneous games where players play without the immediate knowledge of their opponents' moves – this simulates the players making their moves at the same time).

3.1. Minimax The Minimax concept is based on the assumption that your opponent is going to try to minimize your gain as much as possible. A logical idea is to try to maximize your gain in the worst-case scenario, and that is what the Minimax algorithm does. The algorithm uses values to evaluate each state in the game tree. These values are called Minimax values and are defined as (Russell & Norvig, 2003):

MinimaxValue(n) = \begin{cases} Utility(n) & \text{if } n \text{ is a terminal node} \\ \max_{s \in Successors(n)} MinimaxValue(s) & \text{if } n \text{ is a MAX node} \\ \min_{s \in Successors(n)} MinimaxValue(s) & \text{if } n \text{ is a MIN node} \end{cases}    (1)

The algorithm traverses the game tree depth-first and searches for leaves. From the leaves we obtain the utility for all players (in this case it is equal to the Minimax value) and from it we calculate the Minimax values of their parent nodes. Minimax, as described here, is applicable to a 2-player, zero-sum game. In its paranoid version (all other players try to minimize player i's utility) Minimax can be applied even to n-player games. It is customary to call one of the players Max (she tries to maximize her utility) and the other player Min (she tries to minimize Max's utility). Let's show the Minimax algorithm on an example. In Figure 5 we present the Minimax algorithm on 3 plies of a 2-player game. States where player Max makes a choice ("MAX nodes") are represented by upward pointing triangles, and Min player's nodes ("MIN nodes") by downward pointing triangles.


With the Minimax value definition and the example in Figure 5 it should be pretty straightforward to see how the Minimax algorithm works. A nice description of the Minimax algorithm can be found in (Russell & Norvig, 2003) and a more formal description in (Shoham & Leyton-Brown, 2010).

Figure 5: Minimax game tree. Upward pointing triangles represent states where Max makes her decision, downward pointing triangle states where Min makes her decision.

Just from the basic description above it is clear that such an approach is not feasible for large games, because of the enormous time required to traverse the whole tree and calculate the Minimax values for the whole game tree. There are ways to decrease the time requirements. One of them is alpha-beta pruning, which is described in (Russell & Norvig, 2003).

Another approach is to limit the depth to which we traverse the tree in search of leaves. We prune all the levels of the tree below the depth. Now we have a smaller tree that we can traverse completely. However, the leaf nodes of this new tree might not be terminal states and thus there is no utility that we can use to compute the Minimax values for the whole tree. To cope with this problem we need to apply a heuristic evaluation function that tells us how good/bad the states are.
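As an illustration of the depth-limited variant just described, here is a minimal Python sketch for a 2-player, zero-sum game. The game interface used (is_terminal, utility, heuristic, moves, apply) is an assumption of this sketch, not a fixed API.

def minimax_value(state, depth, maximizing, game):
    """Depth-limited Minimax value of `state` from Max's point of view."""
    if game.is_terminal(state):
        return game.utility(state)          # exact utility in a terminal leaf
    if depth == 0:
        return game.heuristic(state)        # heuristic evaluation at the depth cut-off
    values = [minimax_value(game.apply(state, m), depth - 1, not maximizing, game)
              for m in game.moves(state)]
    return max(values) if maximizing else min(values)

def minimax_move(state, depth, game):
    """Pick Max's move with the highest Minimax value."""
    return max(game.moves(state),
               key=lambda m: minimax_value(game.apply(state, m), depth - 1, False, game))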

However, even with the use of the above mentioned approaches the use of Minimax is fairly limited for large games.

Minimax can be easily extended to games with more than 2 players. The extension was first made by (Luckhardt & Irani, 1986) and is called max^n. In max^n all players try to maximize their own utility. Compared to the Minimax utility that was represented by just one number, in an n-player game the utility is represented by an n-tuple (u_1, . . . , u_n). Therefore, instead of propagating just one number from the leaves we now propagate an n-tuple, as shown in Figure 6.


Figure 6: max^n game tree for a 3-player game.

3.2. Regret minimization In Section 3.1 we introduced the Minimax concept. But there are situations when we are not playing against an opponent who always wants to minimize our gain, or the opponent is not able to play so as to minimize our gain (due to a lack of skill or knowledge). In those situations Minimax does not always give us optimal results. In this Section, all the definitions are taken from (Shoham & Leyton-Brown, 2010). Let us introduce the idea of regret (in the definitions below an action profile is the same thing as a strategy profile).

Definition 11 (Regret): Player i's regret for playing an action a_i if the other players adopt action profile a_{-i} is defined as

\left[ \max_{a_i' \in A_i} u_i(a_i', a_{-i}) \right] - u_i(a_i, a_{-i})    (2)

where u_i is the utility of player i, a_i' is an action player i could have taken and A_i is the set of all actions available to player i. a_{-i} is the action profile containing all players' actions except player i's action.

The idea of regret is how much we regret taking action a_i instead of the best available action a_i'.

Definition 12 (Minimax regret): Minimax regret actions for player i are defined as

\arg\min_{a_i \in A_i} \max_{a_{-i} \in A_{-i}} \left( \left[ \max_{a_i' \in A_i} u_i(a_i', a_{-i}) \right] - u_i(a_i, a_{-i}) \right)    (3)

We can compare the results of Minimax and Minimax regret on an example. Let us consider a game with the following game tree shown on Figure 7.


Figure 7: (Left) Extensive and (Right) normal form of the game for comparing Minimax and Minimax regret. The remaining utilities are random numbers and Ɛ is a small positive number. We have not introduced the normal game form, but for our purposes it is enough to say that inside the matrix are the utilities of the players; each row is an action available to player 1 and each column is an action available to player 2.

Player 1 can select between two actions, corresponding to the two rows of the normal form in Figure 7. If she plays by the Minimax strategy, then she will select the second-row action. This is easily deduced if we look at the rows of the normal form representation in Figure 7 and at the utility for player 1. If player 1 selects the first-row action and player 2 selects the first-column action, then player 1 receives a utility of 100. But if player 2 selects the second-column action, then player 1 receives utility 1 − Ɛ. We do the same for the second row and obtain the worst-case utility 1. If player 1 plays the Minimax strategy, then in the worst-case scenario she receives 1, which is greater than 1 − Ɛ.

Figure 8: Regret for player 1.

But what if player 2 plays the first-column action? Then by playing the Minimax strategy player 1 receives utility 2 instead of 100. Let's apply the Minimax regret concept we defined earlier. In Figure 8 we calculated the regret for player 1. We subtracted the achieved value from the best value in each column (e.g.: in the first column the best value is 100, therefore the regret of playing the second-row action against it is 100 - 2 = 98; we regret not playing the first-row action by 98).
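The regret table in Figure 8 can be computed mechanically from player 1's payoff matrix. The short Python sketch below uses the payoff values reconstructed above (100, 1 − Ɛ, 2 and 1, with Ɛ = 0.01 chosen arbitrarily as an assumption) and selects both the Minimax action and the Minimax regret action.

EPS = 0.01
# payoff[row][col] = utility of player 1; rows are player 1's actions, columns player 2's
payoff = [[100, 1 - EPS],   # first-row action
          [2,   1]]         # second-row action

# Minimax (maximin): maximize the worst-case payoff of each row
minimax_row = max(range(len(payoff)), key=lambda r: min(payoff[r]))

# Regret: best value in each column minus the achieved value
col_best = [max(payoff[r][c] for r in range(len(payoff))) for c in range(len(payoff[0]))]
regret = [[col_best[c] - payoff[r][c] for c in range(len(payoff[0]))] for r in range(len(payoff))]

# Minimax regret: minimize the worst-case (maximum) regret of each row
minimax_regret_row = min(range(len(regret)), key=lambda r: max(regret[r]))

print(minimax_row)          # 1 -> the second-row action (worst case 1 > 1 - EPS)
print(minimax_regret_row)   # 0 -> the first-row action (worst-case regret EPS instead of 98)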

We can see that in the case where we are not playing against an adversarial opponent, or against an opponent that is unable to play so as to minimize our gain, the concept of regret minimization gives us better results than the Minimax concept.

3.3. Monte Carlo Methods Monte Carlo methods were first introduced by von Neumann and Ulam during World War II. Generally, the Monte Carlo method is not one specific method but rather a technique. This technique depends on a large number of simulations and on statistical analysis from which it aims to infer the correct answer.

We focus on Monte Carlo tree search (MCTS) method (that can be nicely applied to extensive form games). It is a best-first search method that can be divided into 4 parts as shown on Figure 9. These parts are:

I. Selection
II. Expansion
III. Simulation / Playout
IV. Backpropagation

Figure 9 (Chaslot, Bakkes, Szita, & Spronck, 2008): Monte Carlo Tree search control loop

During MCTS we build a tree that we are going to call the simulation tree, to distinguish it from the extensive form game tree. Let's look at each part of MCTS and discuss it in more depth.

Selection – At the beginning we have to select nodes in our simulation tree starting from the root node. We do the selection section of MCTS until we reach a leaf node. The selection of nodes is done according to how much we want to explore the game tree or how much we want to exploit the information we have obtained so far. On the one hand, if we always decide to select nodes with the best results so far (exploit them), we will probably get stuck in a local maximum. On the other hand, it makes sense to explore the unexplored parts of the tree, or occasionally explore a direction that led to a bad result to either verify that direction as a bad one, or discover that there are also good results to be obtained. One possible approach to selection is the Upper Confidence Bound applied to Trees (UCT) (Kocsis & Szepesvári, 2006) where we select the move that maximizes

15 | P a g e

v_i + C \sqrt{ \frac{\ln n_p}{n_i} }    (4)

where v_i is the value of node i (usually the averaged value of the previous games that have visited node i), n_i is the number of times node i was visited and n_p is the number of times the parent node of node i was visited. In MCTS a node usually stores the values v_i, n_i and others, depending on the implementation. The last parameter, C, has to be tuned experimentally. Parameter C defines how much we want to prefer exploration over exploitation: the larger C is, the more emphasis we put on exploration.

Expansion – In this step we expand our simulation tree by adding one or more nodes that have not previously been part of the simulation tree. There are two main ways this can be done. We can either add one node per game simulation or add a node only when it has passed some predefined condition (e.g.: the expanded node was visited a certain number of times, etc.).

Simulation / Playout – In this section we simulate the rest of the game from the simulation tree’s leaf node till the end (or to some preset depth). The moves here can be chosen randomly but better results can be achieved with pseudo-random moves (for this we require a domain dependent heuristic). Opponent modeling is also an option to better estimate her moves.

Backpropagation – After finishing one simulation of the game we propagate the result (win/lose/draw) of that particular game back through the simulation tree, updating each node in the simulation tree that was part of the path that led to the terminal state.

How do we create an algorithm from the above 4 steps? We simply put all 4 parts in a control loop as shown in Figure 9, where N is the number of times we want the control loop to run. N can be set beforehand or it can be adjusted online depending, for example, on the time we have before we need to decide on our action. In general game playing we are limited by time, thus N changes with every game. After we are finished running the control loop we select the best action, which is the action that was selected the most times at the root node. We can see that this approach results in building an asymmetric tree (the promising branches are expanded more than the others).
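The following condensed Python sketch puts the four MCTS steps and the UCT rule of equation (4) together. The game interface (moves, apply, is_terminal, result), the constant C = 1.4 and the one-node-per-simulation expansion rule are assumptions of this sketch; a full player would also flip the reward sign between the two players' plies.

import math, random

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.value = [], 0, 0.0

def uct(node, c=1.4):
    """Equation (4): average value plus the exploration term."""
    return (node.value / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def mcts(root_state, game, iterations=1000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # I. Selection: descend through fully expanded nodes using UCT
        while node.children and len(node.children) == len(game.moves(node.state)):
            node = max(node.children, key=uct)
        # II. Expansion: add one child that has not been tried yet
        if not game.is_terminal(node.state):
            tried = {child.move for child in node.children}
            move = random.choice([m for m in game.moves(node.state) if m not in tried])
            child = Node(game.apply(node.state, move), parent=node, move=move)
            node.children.append(child)
            node = child
        # III. Simulation / playout: finish the game with random moves
        state = node.state
        while not game.is_terminal(state):
            state = game.apply(state, random.choice(game.moves(state)))
        reward = game.result(state)
        # IV. Backpropagation: update every node on the path
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # the chosen action is the most visited move at the root
    return max(root.children, key=lambda child: child.visits).move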

In the next Section we describe how to apply MCTS on games with imperfect information.

3.4. Perfect Information Sampling In this Section we describe how we can apply a full information game playing algorithm to games with imperfect information. We use Monte Carlo as an example of the full information game playing algorithm, but keep in mind that any other algorithm can be used (e.g.: Minimax, etc.). First, let's discuss the difference between PIG and IIG that influences full information game playing algorithms. In PIG the player always knows in which state she currently is. In IIG this is often not the case. Instead, the player knows all the possible states in which she can be. This is caused by the fact that every time the opponent performs an action that the player does not see, she has to consider all of the opponent's possible actions and the states that they lead to. These states are the states the player can be in.

However, MCTS can be applied only to a single state. We have 2 options:

- We apply MCTS to all of the states, but that can be time consuming since the number of states can grow rapidly, or
- We generate samples from all of the states and apply MCTS only to these selected states. By generating samples we mean choosing a subset of all the states based on some criterion (states with the highest utility for the player, random states, etc.).

The second approach we described is perfect information sampling Monte Carlo (PIMC). From each sampled state we receive an action that MC considers to be best. From these actions we have to select one that we want to play; we discuss this in detail in Section 6.2.3, and a minimal sketch of the whole approach is given at the end of this Section. This way of applying MCTS to IIG suffers from 2 main errors:

1) Strategy fusion – This type of error is caused by the fact that perfect information sampling MC (PIMC) incorrectly assumes that in every node it can make the right decision based on full information about the game. However, in IIG it cannot make the right decision inside an information set because it does not know the real current state. This is shown in Figure 10. Here we have a chance node that randomly chooses whether the game goes to the left (World 1) or to the right (World 2). In reality, player 1 cannot distinguish between her states because they are in one information set, but PIMC assumes that it knows where it is and that in the lower states it can make the right choice to receive the maximum reward. As we can see, this presumption is incorrect. In the example it will happen that instead of taking the action at the beginning that gains the guaranteed reward of 1 in both worlds, the player traverses the tree to one of the lower states, where there is no guarantee that she will obtain the reward of 1.


Figure 10 (Long, Sturtevant, Buro, & Furtak, 2010): Example of strategy fusion. The dotted line connects states in one information set and the topmost node is a chance node.

2) Non-locality – This type of error is caused by the fact that in imperfect information games the value of a game node does not depend only on its subtree (as is the case in perfect information games) but can also depend on other parts of the tree not contained in its subtree. This is because the opponent has different information than the player and thus will try to direct the game into areas more favorable for her. Let's explain this on the example shown in Figure 11. In perfect information games the value of the node in question would only depend on its 2 children nodes with rewards -1 and 1. But with imperfect information the value of this game node also depends on node A. This is because the first player to move after the chance node knows which action chance took. Thus if the chance node took the left action, this player, being able to distinguish between her states, would not choose to go to the terminal state and gain the utility of -1; instead she would select the left action and let the other player play. The other player cannot distinguish between her states, but because of the reasoning we did above she can infer exactly in which state she is and is thus able to select the correct move that leads her to the -1 reward (-1 because she is a minimizing player). But PIMC will perform a random move instead of the best one. This error is, as we can see, caused by the opponent's influence on the game and her knowledge, which differs from the other player's.

Figure 11 (Long, Sturtevant, Buro, & Furtak, 2010): Example of non-locality. The dotted line connects states in one information set and the topmost node is a chance node.


These 2 types of errors might cause PIMC to perform poorly on some imperfect information games. However, some games suffer less from these errors than others. An interesting way to detect how much a given game suffers from these errors is described in (Long, Sturtevant, Buro, & Furtak, 2010).
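Despite these errors, the PIMC loop itself is short. The Python sketch below is a minimal illustration of the sampling approach described at the beginning of this Section; the uniform sampling, the vote-based aggregation and the searcher argument (for which the mcts sketch from Section 3.3 could be passed) are all assumptions of this sketch, while our own aggregation rule is described in Section 6.2.3.

import random
from collections import Counter

def sample_states(information_set, k):
    """Choose a subset of the states the player might be in (here: uniformly at random)."""
    states = list(information_set)
    return random.sample(states, min(k, len(states)))

def pimc_move(information_set, game, searcher, k=20, iterations=500):
    """Run a perfect information searcher on every sampled state and aggregate the answers."""
    votes = Counter()
    for state in sample_states(information_set, k):
        # pretend the sampled state is fully observable and search it
        votes[searcher(state, game, iterations)] += 1
    # here the per-state recommendations are combined by a simple majority vote
    return votes.most_common(1)[0][0]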

3.5. Counterfactual regret Counterfactual regret (CFR) is a new and interesting extension of the regret minimization concept and it has lately been used for solving imperfect information games. The theory behind CFR guarantees that it works on 2-player, zero-sum games. However, the Department of Computing Science at the University of Alberta (creators of CFR) achieved good results using CFR even in poker variants that are not 2-player games. A more complex and thorough explanation of CFR can be found in (Johanson, 2007) and (Zinkevich, Johanson, Piccione, & Bowling, 2008).

The main idea behind counterfactual regret minimization is that instead of minimizing one regret value, as done in standard regret minimization, we split the regret value into additive terms where each term is dependent on one information set and we minimize those terms. The advantage is that counterfactual regret can then be minimized independently at each information set. Counterfactual regret is defined in (Zinkevich, Johanson, Piccione, & Bowling, 2008) as:

Definition 13 (Immediate counterfactual regret): Immediate counterfactual regret is player i's average regret for her actions at information set I, if she had tried to reach it:

R^T_{i,imm}(I) = \frac{1}{T} \max_{a \in A(I)} \sum_{t=1}^{T} \pi_{-i}^{\sigma^t}(I) \left( u_i(\sigma^t|_{I \to a}, I) - u_i(\sigma^t, I) \right)    (5)

where

- A(I) is the set of all actions applicable in information set I.
- \pi^{\sigma}(I) is the probability of information set I occurring if players choose actions according to \sigma. Thus \pi_{-i}^{\sigma}(I) is the product of all players' contributions (including chance) except player i's.
- For all a ∈ A(I), \sigma|_{I \to a} is a strategy profile identical to \sigma except that player i will take action a whenever she is in I.
- u_i(\sigma, I) is the counterfactual utility – the expected utility given that information set I is reached and all players play using strategy \sigma except that player i plays to reach I.
- T is the number of repetitions of the game.

(Zinkevich, Johanson, Piccione, & Bowling, 2008) proved that the average overall regret:

R^T_i = \frac{1}{T} \max_{\sigma_i^* \in \Sigma_i} \sum_{t=1}^{T} \left( u_i(\sigma_i^*, \sigma_{-i}^t) - u_i(\sigma^t) \right)    (6)

is bounded by the sum of the positive portions of the immediate counterfactual regret:

R^T_i \le \sum_{I \in \mathcal{I}_i} R^{T,+}_{i,imm}(I)    (7)

where R^{T,+}_{i,imm}(I) = \max(R^T_{i,imm}(I), 0) is the positive part of the immediate counterfactual regret. This is important because there is a connection between Nash Equilibrium and average overall regret: in a 2-player zero-sum game, if both players' average overall regret is less than Ɛ, then their average strategy profile is a 2Ɛ-Nash equilibrium. Based on this, it is easy to see that we need an algorithm that will update the strategy \sigma_i(I) in a way that decreases the counterfactual regret. We update the values of \sigma_i(I) in the following way:

\sigma_i^{T+1}(I)(a) = \begin{cases} \dfrac{R_i^{T,+}(I,a)}{\sum_{a' \in A(I)} R_i^{T,+}(I,a')} & \text{if } \sum_{a' \in A(I)} R_i^{T,+}(I,a') > 0 \\[1ex] \dfrac{1}{|A(I)|} & \text{otherwise} \end{cases}    (8)

This way of updating the strategy leads to Nash equilibrium as proved in (Zinkevich, Johanson, Piccione, & Bowling, 2008).

With the counterfactual regret defined we can use it to compute a strategy for any 2-player, zero-sum game. To do so we need to have the 2 players play the game repeatedly. In a new match T+1 the player uses strategy \sigma_i^{T+1}. During the match the player traverses the information set tree and updates the values R_i^T(I, a). After each match we update the strategy using equation (8). It is clear that to be able to do so we need to store R_i^T(I, a) for every information set I and for every action a. Unfortunately, to obtain a good strategy we need to run a large number of games.
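The strategy update of equation (8) is plain regret matching applied per information set. The following Python sketch shows only this update step; accumulating the counterfactual regret values during tree traversal is omitted, and the dictionary-based layout and the example action names are assumptions of this sketch rather than the representation used by any particular CFR implementation.

def regret_matching(cumulative_regret):
    """Equation (8): turn the cumulative counterfactual regrets of one information set
    into a strategy (a probability for every action)."""
    positive = {a: max(r, 0.0) for a, r in cumulative_regret.items()}
    total = sum(positive.values())
    if total > 0:
        return {a: r / total for a, r in positive.items()}
    # if no action has positive regret, play uniformly at random
    n = len(cumulative_regret)
    return {a: 1.0 / n for a in cumulative_regret}

# example: regrets accumulated for the actions of one (hypothetical) information set
strategy = regret_matching({"fold": -2.0, "call": 3.0, "raise": 1.0})
# -> {"fold": 0.0, "call": 0.75, "raise": 0.25}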

This was a short introduction to the concept of counterfactual regret. As stated above, it is a useful approach to solving imperfect information games, but one of its disadvantages is that it works well mostly with domain-specific knowledge. For example, the reason why CFR works well in the poker domain is that CFR uses card abstraction (so-called buckets), which is specific to poker alone. Also, they use the fact that the information set tree for the poker domain is specific and with each move the number of information sets decreases rapidly. Again, this is a domain-specific property. Another setback is that reaching a Nash equilibrium is theoretically guaranteed only for 2-player, zero-sum games.


3.6. Information Set Search This approach was introduced by (Parker, Nau, & Subrahmanian, Paranoia versus Overconfidence in Imperfect Information Games, 2010). It is a game-tree search technique that uses opponent modeling to achieve optimal results. Information set search is defined for a 2-player, zero-sum game. To explain this approach, we need to define several new concepts. All the definitions in this Section are taken from (Parker, Nau, & Subrahmanian, Paranoia versus Overconfidence in Imperfect Information Games, 2010).

We have already defined some of the needed concepts in Chapter 2.2. In this section we say that a strategy \sigma_i(I, m) is a function that returns the probability of player i making move m in information set I. We can calculate the conditional probability of reaching a history h, given that the players play according to strategies \sigma_1, \sigma_2 and that we are in information set I, as:

P(h \mid I, \sigma_1, \sigma_2) = \frac{ \prod_{(h', m) \text{ along } h} \sigma_{P(h')}(I(h'), m) }{ \sum_{h'' \in I} \prod_{(h', m) \text{ along } h''} \sigma_{P(h')}(I(h'), m) }    (9)

where h is a history, the products run over the moves m made along the respective histories, P(h') is the player who moves after h' and I(h') is her information set at h'.

Before we can define the expected utility for an information set we first need to define the expected utility of any non-terminal history based on the players' strategies. For a node where it is player i's turn to move we define it as

EU(h) = \sum_{m \in M(h)} \sigma_i(I(h), m) \, EU(h \circ m)    (10)

where m is a move from the set M(h) of all possible moves in history h, \circ denotes concatenation and the expected utility of a terminal history h is the reward of player 1 for that history, u_1(h). Since the game is a zero-sum game, the reward of player 2 is -u_1(h). Here we face the same problem as in Minimax: traversing the whole tree can be too time consuming for large games. We can limit the depth to which we traverse the tree in search of leaves and then prune all the levels of the tree below the specified depth. The smaller tree that we created can be traversed completely. However, the leaf nodes of this new tree might not be terminal states and thus there is no utility that we need for further computations. To cope with this problem we need to apply a heuristic evaluation function that tells us how good/bad the states are.

Now we can define the expected utility of an information set as a weighted sum of the expected utilities of its histories:

EU(I) = \sum_{h \in I} P(h \mid I, \sigma_1, \sigma_2) \, EU(h)    (11)

The last thing we need to define is the set of moves in an information set that maximize player 1's expected utility:

M^*(I) = \arg\max_{m \in M(I)} EU(I \circ m)    (12)

We take those actions that maximize the expected utility in the current information set I.

Now we can find an optimal strategy by starting in the terminal histories and traversing the tree upwards, applying equations (10) and (11). The optimal strategy estimates the probability of moves in the following way: in each information set, if a move maximizes player 1's expected utility then it will have the probability 1/|M^*(I)|, otherwise the move's probability will be 0. Formally:

\sigma_1(I, m) = \begin{cases} \dfrac{1}{|M^*(I)|} & \text{if } m \in M^*(I) \\ 0 & \text{otherwise} \end{cases}    (13)

In (Parker, Nau, & Subrahmanian, Paranoia versus Overconfidence in Imperfect Information Games, 2010) the following theorem claims that this way of computing the strategy results in a strategy that is optimal against the given opponent model (for the proof of this theorem see the above mentioned paper): let \sigma_1 be a strategy for player 1 defined as in equation (13); then \sigma_1 is \sigma_2-optimal.

In most cases the computation of the exact expected utilities is intractable, which can be solved by approximating the utility by a search to some limited depth. Then we use these approximated values as if they were the actual expected utility values. At this point the information set search technique uses opponent modeling. There are 2 main opponent models:

1. Paranoid – this model expects that the opponent will always make the best possible move for herself (thus the worst possible move for player 1) and that she will minimize player 1's utility. In IIG this approach might not yield as good results as in PIG because it builds on the assumption that the opponent knows the exact pure strategy of player 1 and also has the knowledge that player 1 has about the game.
2. Overconfident – this model assumes that the opponent does not consider the information available to her and thus makes her moves randomly (the opponent considers all her actions to have the same utility and thus the probability of every action is equal). A small sketch of how both models enter the expected utility computation follows below.
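As mentioned above, here is a minimal Python sketch of how the two opponent models enter the expected utility computation of equations (10) and (11). The recursion, the assumed game interface (is_terminal, reward1, moves, apply, to_move) and the omission of the depth limit and heuristic evaluation are simplifications of this sketch, not part of the original algorithm description.

def expected_utility(history, game, opponent_model="overconfident"):
    """Expected utility of a history for player 1, in the spirit of equation (10)."""
    if game.is_terminal(history):
        return game.reward1(history)                   # zero-sum: player 2 gets the negation
    moves = game.moves(history)
    children = [expected_utility(game.apply(history, m), game, opponent_model) for m in moves]
    if game.to_move(history) == 1:
        return max(children)                           # player 1 maximizes her expected utility
    if opponent_model == "paranoid":
        return min(children)                           # opponent plays the worst move for player 1
    return sum(children) / len(children)               # overconfident: opponent moves uniformly

def information_set_value(histories, weights, game, opponent_model="overconfident"):
    """Weighted sum over the histories of an information set, in the spirit of equation (11)."""
    total = sum(weights)
    return sum(w / total * expected_utility(h, game, opponent_model)
               for h, w in zip(histories, weights))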

There is one more problem that needs to be resolved. Information sets of games such as Kriegspiel can be extremely large. To be able to handle such large information sets we can use statistical sampling: we select a subset of the original information set and evaluate the expected utility of the information set based on the expected utility of the selected subset.


This is the basic theory behind information set search technique.

In this Chapter we have introduced several algorithms useful for implementing general game players, including counterfactual regret minimization, information set search and Monte Carlo methods, and we described their basic ideas. Special attention was given to Monte Carlo methods because that is the method we have implemented in our general game player. Details of our implementation are in Chapter 6.


4 General Game Playing

In this Chapter we introduce the concept of general game playing (GGP). First, in Section 4.1 we describe what a general game player is and how it differs from a specialized game player. Section 4.2 provides a short introduction to the GGP background: the motivation why the GGP competition was started and how it evolved. Section 4.3 introduces the Game Description Language (GDL), which is the main language used for describing games in the GGP competition, and we explain its syntax on a 2-player Bomberman game example. Then in Section 4.4 we show how easily GDL can be upgraded to GDL-II, which can describe even games with imperfect information. In Section 4.5 we discuss in depth the GGP server, its role in GGP and the communication between it and the players during a match. And in the last Section, 4.6, we explain how to create your own imperfect information game with GDL-II, illustrating everything on a game of Latent Tic-Tac-Toe that we created.

4.1. General Game Player First, let us focus on a specialized game player (SGP). A specialized game player is an agent specifically designed to play one game (e.g.: the chess computer Deep Blue, the checkers player Chinook from the University of Alberta, etc.). Thus she can use the intricacies of the specific game to her advantage. But who is actually doing the thinking for such an agent? It is the agent's programmer, who has to analyze the game and design the agent beforehand. Such agents are useful; however, their value is limited. On the other hand, the value of a player that is able to play several conceptually different games and thus is able to adapt to new scenarios is huge. This brings us to the concept of the general game player.

The general game player is a concept that opposes the previously mentioned specialized game player. A general game player is an agent that is able to play a wide variety of conceptually different games without human intervention. Thus it cannot rely on algorithms and approaches specific to one type of game (which are coded into an SGP in advance). Simply said, the general game player must be able to figure out how best to play a game given only the game description (we discuss the possible approaches in Section 3.3).

4.2. General Game Playing Competition The General Game Playing competition is an annual competition that was introduced in 2005 (Genesereth, Love, & Pell, 2005) and is a project of the Stanford Logic Group of Stanford University in California. This competition (sponsored by the Association for the Advancement of Artificial Intelligence – AAAI) was started to promote work in the area of general game playing, which means moving more of the intellectual work to computers. At the beginning, the competition focused only on general game players for perfect information games (PIG). From this year (2011), the competition should also start supporting general game players for games with imperfect information. To mention some of the successful general game players for PIG developed during the past years:

 Flux Player – winner of AAAI GGP competition 2006. This player uses a Prolog-based implementation of Fluent Calculus for reasoning and non-uniform depth-first search with iterative deepening and general pruning techniques for searching the game tree (Schiffel & Thielscher, 2007).  Cadia Player – winner of AAAI GGP competition 2007, 2008. This player uses UCT/Monte Carlo approach (Finnsson, 2007).  Ary Player - winner of AAAI GGP competition 2009, 2010. This player uses Prolog for reasoning and MC-UCT and was created by Jean Méhat (Méhat & Cazenave, 2010).

4.3. Game Description Language The Game Description Language (GDL) is a language used to describe discrete games with perfect information and their rules in GGP. GDL can describe a wide variety of games: zero-sum games, non-zero-sum games, single and multiplayer games, cooperative or adversarial, etc. There are a few restrictions on the games that can be described by GDL, which we already mentioned, but we want to stress them. First, the games have to be of perfect information. Second, the games have to be deterministic (there is no 'chance' player in the game).

4.3.1. Syntax We will present the syntax of GDL on an example. GDL is a language built on a relational logic whose syntax is close to LISP programming language. There is a universal format for writing relational logic rules called Knowledge Interchange Format (KIF) that GDL uses.

Surely, most of us have heard about the game Bomberman. It is a simple multiplayer game where players move in a maze-like map and their goal is to burn their opponent(s) by using bombs (and, of course, avoid being burnt themselves). The bombs can be placed only on a place which does not contain any other bomb. When a bomb is placed its timer will start and after a specified time the bomb explodes. The explosion has a limited range and does not destroy walls. However, it does burn any player who is in the bomb’s range thus eliminating her from the game.

We will now explain the GDL syntax on several lines of GDL description of this game simplified for only 2 players. The whole GDL description of a 2-player Bomberman game can be found in Appendix A. Syntax of all keywords used in the following explanation can be found in Table 1.

At the beginning of most game descriptions you will find a declaration of players. This is done by the relation role followed by a name. In our case we have 2 players/roles: bomberman and bomberwoman.

(role bomberman) (role bomberwoman)


Now that we have our players specified, we need to declare the game board and the initial state. This is done by the init relation. The init predicate is used only at the beginning of a KIF file and it defines the initial state of the game. E.g. (init (location bomberman 1 1)) implies that bomberman starts at location (x=1, y=1). (cell 1 8) informs us that there exists a cell with the coordinates (x=1, y=8). If we look at all the cell statements below, we can see that the size of our board is 8x8. The blockednorth and blockedeast statements define where a wall is located: a blockednorth/blockedeast statement informs us that a player cannot move north/east from the position defined in the statement, because there is an obstacle there (e.g. a wall). The reconstructed game board from the game's GDL description (Dresden GGP server, 2010) is shown on Figure 12.

(cell 1 1) (cell 2 1)

(cell 7 8) (cell 8 8) (init (location bomberman 1 1)) (init (location bomberwoman 8 8)) (init (blockednorth 2 1)) (init (blockedeast 1 2))

Figure 12: Reconstructed Bomberman gameboard from GDL description with initial starting position of bomberman (black figure) and bomberwoman (purple figure). Figures can move on the white tiles, red tiles are inaccessible (e.g.: wall).


Next we need to define legal actions for the players and thus restrict the set of actions. This can be done using the legal relation. In the following code snippet we show one such action. The meaning of the snippet is: if the role of the current player is ?char, then it is legal for her to drop a bomb.

(<= (legal ?char dropbomb) (role ?char))

In the above legal relation we encountered the ? sign for the first time. In GDL, a term starting with ? represents a variable; thus ?char is a variable.

In the following code snippet there are 2 important relations. First, we explain the relation does. In general, the does relation indicates the moves done by the players in some specific state; in our case, does indicates that player ?char performs the action dropbomb. This relation is mostly used for updating states. The next relation defines what holds in the next state of the game. In this case: IF it is player ?char's turn, the player is at location (?x, ?y) and she performs the action dropbomb, THEN in the next state (in the Bomberman game the next state is the state in the next time-step) player ?char will be at the same location (?x, ?y).

(<= (next (location ?char ?x ?y)) (role ?char) (true (location ?char ?x ?y)) (does ?char dropbomb))

For a game to ever end, we need to define its terminal states. In our game there are 3 distinct ways the game can end: bomberman is burned, bomberwoman is burned, or time runs out. This is handled by the terminal relation, which defines the conditions that must hold for a state to be terminal.

(<= terminal bombermanburned) (<= terminal bomberwomanburned) (<= terminal timeout)

We can also define our own relations. This is necessary for more complex games to keep the code readable and easy to maintain. One such relation is defined below: bombermanburned tells us that bomberman is burned when there is fire at location (?x, ?y) and bomberman stands on that same tile (?x, ?y). Bomberman is burned only if both conditions hold.

(<= bombermanburned (true (location bomberman ?x ?y)) (true (location fire ?x ?y)))

Finally, the last thing needed to define a game are the utilities for the players in terminal states. This is done by the goal relation. Utility values range from 0 to 100. For the Bomberman game we need to define 4 ending scenarios. The first block in the following code snippet says that if a terminal state is reached and bomberman is burned while bomberwoman is not, then bomberman receives utility 0. If they are both burned, then each of them receives utility 50, and of course if bomberman manages to burn bomberwoman without burning himself, he gains utility 100. The last ending scenario is that time runs out and nobody is burned.

(<= (goal bomberman 0) bombermanburned (not bomberwomanburned))

(<= (goal bomberman 50) bombermanburned bomberwomanburned)

(<= (goal bomberman 100) (not bombermanburned) bomberwomanburned )

(<= (goal bomberwoman 50) timeout (not bombermanburned) (not bomberwomanburned))

The summary of the main GDL keywords can be found in Table 1.

Keyword – Meaning
role(R) – R is a player
init(F) – F holds in the initial state
legal(R,M) – R can do move M in the current state
true(F) – F holds in the current state
next(F) – F holds in the next state
does(R,M) – player R does move M
terminal – the current state is terminal
goal(R,N) – R gets N points in the current state
Table 1 (Thielscher, 2010): Main GDL keywords and their meaning.

4.3.2. Restrictions
GDL, as described above, cannot be used as is to create games; several restrictions must be met. In GDL all recursions are restricted to keep the set of states of a game described in GDL finite. Besides restricting recursion we also require the game to be well-formed, which places further restrictions on the game definition. According to (Love, Hinrichs, Haley, Schkufza, & Genesereth, 2008), the definition of a well-formed game (and thus of the restrictions as well) is:

Definition 14 (Termination): A game described by GDL terminates if all sequences of legal moves from the initial state of the game reach a terminal state after a finite number of steps.


Definition 15 (Playability): A game described by GDL is playable if and only if every role has at least one legal move in every non-terminal state reachable from the initial state.

Definition 16 (Monotonicity): A game described by GDL is monotonic if and only if every role has exactly one goal value in every state reachable from the initial state, and goal values never decrease.

Definition 17 (Winnability): A game described by GDL is strongly winnable if and only if, for some role, there is a sequence of individual moves of that role that leads to a terminal state of the game where that role's goal value is maximal. A game described by GDL is weakly winnable if and only if, for every role, there is a sequence of joint moves of all roles that leads to a terminal state where that role's goal value is maximal.

Definition 18 (Well-formed Games): A game description in GDL is well-formed if it terminates, is monotonic, and is both playable and weakly winnable.

If we adhere to all of the above-mentioned restrictions, the resulting game can be played by any general game playing system that supports GDL. Let us now take a look at the extension of GDL, GDL-II.

4.4. GDL-II
Unfortunately, GDL cannot describe all games. Consider the game of Kriegspiel – a variant of chess with imperfect information. In this game, when we lose a pawn we do not know what type of piece captured it, only that it was captured on a specific square. There is no way of conveying this information in GDL without extending the language. Another example: in the game of poker we need to simulate the dealer (a chance player). Again, there is no way to do so in GDL without extensions. Therefore GDL was extended into the new GDL-II language, which accommodates everything required for defining imperfect information games. This is done by adding 2 new keywords:

• random – Defines a special player (a chance player) that makes moves randomly. With this player we can model effects such as dice rolling, dealing cards from a deck, etc.
• sees(R,P) – Defines that player R perceives P in the next state.

These 2 new keywords are enough to define everything needed for describing games with imperfect information. We will now explain how to create a GDL-II game and what the possible pitfalls in the process are. Another source of information concerning GDL-II with some examples is (Thielscher, 2010).
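For illustration, the following fragment (an illustrative sketch for exposition only, not a complete game description and not taken from any existing game) lets the chance player roll a six-sided die and reveals the outcome only to player1:

(role random)
(role player1)

(dienumber 1) (dienumber 2) (dienumber 3)
(dienumber 4) (dienumber 5) (dienumber 6)

(<= (legal random (roll ?n))
    (dienumber ?n))

(<= (sees player1 (rolled ?n))
    (does random (roll ?n)))

Here the random role may roll any of the six numbers, and the sees rule ensures that only player1 perceives the result in the next state.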

4.5. Dresden GGP Server
In this Section we explain how different players are able to play against each other in the GGP competition. In perfect information games players could simply connect to one another and play. But then each player would also have to be able to deal with incorrect moves sent by the opponent and with other technical issues, and it would be hard to compare the performance of multiple players. And what about imperfect information game players? They should have no possibility of communicating directly with their opponent; they have to communicate only with an umpire that informs them about what they see during the game.

To deal with perfect information games, a game server was first created at Dresden University. The GGP server works as an umpire for every game: it starts a match, sends information to the players, checks whether the moves sent by the players are correct, holds the state of the game and stops the match when it finishes. It was used in the previous GGP competitions for perfect information games.

In imperfect information games the server works in the same way as in PIG, with one important difference: instead of sending the joint moves (the moves done by both players) after every turn, the server sends sees terms to each player. There are 3 types of messages the server can send to players; they are listed in Table 2.

Message type – Parameters
START – MATCHID ROLE GAMEDESCRIPTION STARTCLOCK PLAYCLOCK
PLAY – MATCHID SEESTERMS
STOP – MATCHID SEESTERMS
Table 2: Messages that can be sent from the server to players in a GDL-II game. Parameters are explained in Table 3.

Parameter – Explanation
MATCHID – Match identifier
ROLE – Role a player will play in this game (e.g. in Tic-Tac-Toe whether the player plays x or o)
GAMEDESCRIPTION – Description of the game that the server decided to play
STARTCLOCK – How much time a player has before the match starts (in seconds)
PLAYCLOCK – How much time a player has for each move (in seconds)
SEESTERMS – What a player sees after each turn
Table 3: Message parameters and their meaning

The communication between player and server is done via HTTP. From this point of view each player is an HTTP server that waits for messages from the game server and, after each message, returns the action she wants to take. Let's explain the communication between the game server and a player on an example (see Figure 13).

At the beginning of a match the game server starts the match by sending a START message to all players and expects them to reply READY within the given STARTCLOCK time. As soon as all players are ready, the game server sends the first PLAY message. This PLAY message is different from all other play messages during the match because it contains a NIL string instead of actual sees terms (which is only logical, since no move has been played yet and thus no sees terms can be computed). Each player is thus informed that the match has started and now has the PLAYCLOCK time to come up with a move and send it to the game server in a PLAY message. After all players send their moves, the game server first checks whether the submitted moves are legal. If any of the moves is illegal, the server plays a random move for the player that sent it and only informs her about this move; the same happens if a player does not send her move within the specified time limit. If the moves are legal, the game server evaluates the new game state and sends the computed sees terms for each player to that player alone. If a player has no sees terms, she receives an empty PLAY message. Thus all players are informed that a turn has been made and they should send their new moves. The exchange of PLAY messages from the players (containing the move of the given player) and PLAY messages from the game server (containing the sees terms for the given player) is repeated until the match reaches a terminal state. When a terminal state is reached, the server sends the STOP message and ends the match. The STOP message also contains sees terms for each player.
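The player side of this protocol requires very little infrastructure. The following is a minimal Java sketch of the message dispatch loop, assuming a hypothetical chooseMove helper standing in for the actual reasoning; it is not the Palamedes implementation, only an illustration of the START/PLAY/STOP exchange described above.

import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Minimal sketch of a GGP player acting as an HTTP server.
public class MiniPlayer {
    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(4001), 0);
        server.createContext("/", exchange -> {
            String msg = new String(exchange.getRequestBody().readAllBytes(),
                                    StandardCharsets.UTF_8);
            String reply;
            if (msg.startsWith("(START")) {
                reply = "READY";             // acknowledge within STARTCLOCK
            } else if (msg.startsWith("(PLAY")) {
                reply = chooseMove(msg);     // must answer within PLAYCLOCK
            } else {                         // (STOP ...)
                reply = "DONE";
            }
            byte[] body = reply.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
    }

    // Placeholder: here the sees terms would be parsed and the actual search would run.
    static String chooseMove(String playMessage) {
        return "NOOP";
    }
}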

Figure 13: Messages during game

In our implementation (which we discuss later in Chapter 6) we use a standalone game server (a clone of the online GGP server) written in pure Java that is suitable for local game playing and development. It can be downloaded as a jar file, or the source code can be obtained directly from http://sourceforge.net/projects/ggpserver/. We use the sources of gamecontroller-cli-r466.jar.

4.6. Creating GDL-II games
One of the biggest problems of GDL-II is the fact that there are almost no games specified in GDL-II. On the Dresden GGP server there are only 2 such games (one of them being a simple one-turn random guessing game) and their state space is fairly limited for testing how well our player performs. Thus we had to create our own games. We created a Latent Tic-Tac-Toe game that is based on a Tic-Tac-Toe game available on the Dresden GGP server. For the complete GDL-II description of Latent Tic-Tac-Toe see Appendix B. The description of Latent Tic-Tac-Toe and how its rules differ from standard Tic-Tac-Toe was given in Section 2.2. We have also written a simple card game taken from (Thielscher, 2010). Before we continue, we would like to discuss creating GDL-II games and how easy or difficult it can be.

When creating a GDL-II game, there are parts that need not be altered from GDL games. Roles, terminal states and goals can stay the same, therefore we will not discuss them here. We focus on the parts that are different and try to explain the complications they bring. In the following text we use xplayer to refer to the Tic-Tac-Toe player whose marks are x, and we refer to the other player as oplayer.

Probably the biggest difference between GDL and GDL-II is in specifying legal moves. In PIG a legal move is any move that meets the rules of the game. In Tic-Tac-Toe a legal move is any move that places a mark on a blank field (an illegal move is any move that places a mark on top of an existing mark). This works because we know where our opponent has placed her marks in the previous turns. But what happens if we use the same definition of legal moves in Latent Tic-Tac-Toe? It might seem that it will work fine, and it actually does, unless we make a mark on a place where our opponent has already placed her mark. For now, we focus only on defining legal moves and disregard the alternation of turns, which would cause other problems as well.

Figure 14: Problems arising from defining the legal moves as in standard Tic-Tac-Toe.

On Figure 14 the reader can see 3 game boards. The leftmost boards show all legal moves available to the player in a specific state, which is represented by the middle column. The rightmost board represents the server's knowledge about the real state of the game. In the first player's state we can see that all moves except the move on position (2,2) are legal. Let's say that the player decides to play on position (1,1) – a legal move based on the information that the player has available to her. What happens on the server? Because we have defined legal moves as in standard Tic-Tac-Toe, the move is evaluated as illegal by the server: the server has perfect information about the state of the game and knows that on position (1,1) there is an O mark. Therefore the server penalizes the player by playing a random move for her and lets the other player play. It is clear that this approach is wrong. It does not make sense to penalize a player for a move that she had no chance of knowing was illegal. Thus we need to find a different way of defining legal moves.

The next step is to consider all moves legal. This solves the problem of penalizing players for their good moves. However, it creates a new problem: every time a player generates legal moves for any state, the legal moves will always be the same, regardless of the information the player has received from the server. Let's look at Figure 15. The player sends the move MARK 1 1 to the server. The server correctly considers the move legal, because all moves are legal. But what if the player had played the move MARK 2 2? Or what if, in the next turn, the player sends the same move as before to the server? These moves should be considered illegal, since the player already knew that there was a mark on location (1,1) or (2,2). Of course, we could solve this problem by adding additional mechanisms into our player to check whether a move has been applied before, but then the player would no longer be a general game player but a game-specific player. Thus this must be solved on the game definition level.

Figure 15: Problems arising when all moves are considered legal.

The above problems can be solved by the idea of a valid move. We say that a move is valid if, based on the information the player has, she can expect the move to be legal. Thus in our example the move made by xplayer on Figure 15 is a valid move, because xplayer had no way of knowing that there is an opponent's mark at location (1,1). To create a working description of Latent Tic-Tac-Toe we need to make 2 more changes. First, we need to change the definition of the game state. If the game state contains information about where we have played before, and we change how we define legal moves, then we can easily generate correct legal moves based on this information and solve the problems on both the player and the server side. In our Latent Tic-Tac-Toe game the state is represented not by one game board, but by three. One contains the information about the placement of x and o marks (this is the standard board from Tic-Tac-Toe). One holds the information about where xplayer has played (let's call it deniedx) and the other holds the information about where oplayer has played (deniedo). Second, we need to slightly change the definition of legal moves. We consider a move legal when it was not played before (which means that, depending on who is playing, the corresponding deniedx or deniedo cell is blank). See Figure 16 for how this can be done.

(<= (legal xplayer (mark ?x ?y)) (true (deniedx ?x ?y blank)) (true (control xplayer)))

(<= (legal oplayer (mark ?x ?y)) (true (deniedo ?x ?y blank)) (true (control oplayer)))

(<= (legal xplayer noop) (true (control oplayer)))

(<= (legal oplayer noop) (true (control xplayer))) Figure 16: Definition of legal moves in Latent Tic-Tac-Toe.

By doing the three above-mentioned steps (adding valid moves, expanding the game state and defining legal moves based on the expanded game state) we can generate legal moves while taking into account the moves we have played in the past and the information gained from the game server. Therefore we will never play the same move twice, which would be a bad move: as a penalty, the game server would play randomly for the player and only send her the information about where it played. Figure 17 shows an example of a game state in the middle of a match, both from the view of the players and of the game server. The main difference between standard Tic-Tac-Toe and Latent Tic-Tac-Toe is that we no longer generate legal moves from the known state but from the deniedx part of our game state for xplayer and from the deniedo part for oplayer. Why does oplayer have 2 different deniedx boards? The answer is simple. Since the last move made in the presented match was MARK 3 1 by xplayer, the server replies to xplayer with the message BADMOVE and to oplayer with an empty sees term. Thus oplayer knows that xplayer tried to play on one of her marks; however, she does not know on which of the two possible locations. In our implementation the player will not have 2 deniedx boards; instead she will have 2 game states, one with the upper deniedx board and one with the bottom deniedx board.


Figure 17: Game state of Latent Tic-Tac-Toe. The game state represents situation after the last move was done by xplayer (mark 3 1). Thus oplayer has 2 possible states, one with the upper DENIEDX board and one with the lower DENIEDX board.

Changing the game state and the legal moves requires us to redefine how the players alternate turns. That is quite simple: the turn passes to the other player if the move she made was both legal and valid. If the move was legal but not valid, then she plays again. We do not use a "valid" keyword anywhere in the rules; the validity is checked by the bold lines on Figure 18.

(<= (next (control oplayer)) (does xplayer (mark ?x ?y)) (legal xplayer (mark ?x ?y)) (true (cell ?x ?y b)) (true (control xplayer)))

(<= (next (control xplayer)) (does xplayer (mark ?x ?y)) (legal xplayer (mark ?x ?y)) (true (cell ?x ?y o)) (true (control xplayer))) Figure 18: Definition of alternating turns for xplayer. Bold lines check for the validity of a move.

As we can see, even in the case of a relatively simple game such as Latent Tic-Tac-Toe there are lots of things that need to be taken into consideration when writing the description of the game. You can imagine that creating a game description in GDL-II for more complex games can prove to be a challenge.


In Sections 4.1 and 4.2 we introduced the idea of a general game player and the motivation behind the General Game Playing competition. Then, in Section 4.3, we described the Game Description Language for games with perfect information and explained its syntax on a 2-player Bomberman game. After that, in Section 4.4, we extended GDL to GDL-II, a language that is able to describe imperfect information games. In Section 4.5 we discussed in depth the GGP server, which is essential for playing imperfect information games, and described the communication between the game server and the players. In the last Section 4.6 of this Chapter we created our own GDL-II game of Latent Tic-Tac-Toe; on it we described how this was done and what must be taken into consideration when creating such a game.


5 Discussion

In this Chapter we discuss the possible approaches we can use for implementing our player. Because we are considering entering the general game playing competition, we have the following requirements for the approach:

1) The approach must work well without using any domain-specific knowledge (except for the rules of the game that are provided).
2) The offline learning time must be short enough for us to be able to participate in the GGP 2011 competition (the longest offline time for learning a game that can be found on the Dresden GGP server is 600 seconds).
3) Convergence to a Nash equilibrium (or some other assurance that the approach will provide a good strategy).

In the following Sections we discuss 3 possible approaches and how they meet our requirements.

5.1. CFR
First, let's see how well CFR meets the 3 requirements we set.

1) In the poker domain CFR uses 2 domain-specific features: card abstraction, and the fact that the information set tree of the poker domain has a specific shape in which the number of information sets decreases with each move. This makes CFR unsuitable for GGP. CFR can work without the domain-specific knowledge, but the time needed for offline learning then increases.
2) Although CFR provides good results in the poker domain, it is rather time consuming. The poker player created by the University of Alberta had to learn offline for about 2 weeks to play at a proficient level:

The 10-bucket abstraction presented in this figure was one of the poker agents the CPRG submitted to the 2007 AAAI Computer Poker Competition. To produce this agent, the algorithm was run for 2 billion iterations over 14 days, using the parallel implementation and 4 CPUs. (Johanson, 2007)

3) Convergence to a Nash equilibrium is guaranteed only for 2-player, zero-sum games.

Because of the major time requirements, we do not think that CFR in its vanilla form is suitable for general game playing. The memory requirements are small: we only need to store 2 information set trees, which are generally much smaller than the extensive form game tree. To illustrate this, let's consider Texas Hold'em with a standard deck of 52 cards.


First, we are dealt 2 cards; there are $\binom{52}{2} = 1326$ possible hands. Now our opponent is dealt 2 cards (she has $\binom{50}{2} = 1225$ possible hands). If we were building the extensive form tree, we would have a root node (no cards dealt); the root would have 1326 children (all possible hands dealt to us), and each of these children would have 1225 children (all possible hands dealt to the opponent). That is a total of 1 624 350 nodes at the 2nd ply of the game (after the cards are dealt to both players). If we build only the information set tree (IST), then we only need to remember 1326 'nodes' at the 2nd ply instead of 1 624 350: since we cannot distinguish the opponent's hands, from every node in the 1st ply we can join all its 1225 children in the 2nd ply into 1 information set.
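Written out, the counts used above are

$\binom{52}{2} = \frac{52 \cdot 51}{2} = 1326$, \qquad $\binom{50}{2} = \frac{50 \cdot 49}{2} = 1225$, \qquad $1326 \cdot 1225 = 1\,624\,350$.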

5.2. PIMC
Perfect Information sampling Monte Carlo meets our requirements in the following way:

1) It does not require any prior or domain-specific knowledge.
2) It needs no offline learning time; this time can instead be used for running game simulations in advance. During gameplay this approach is an anytime algorithm: the more time we have, the more simulations we can run and the better results we obtain.
3) In 2-player, alternating-move, perfect information games the MCTS strategy converges to a Nash equilibrium – in two-player games the move/value returned by UCT converges to the same result as Minimax (Sturtevant, 2008). Unfortunately, PIMC has no such guarantees due to the non-locality and strategy fusion problems.

In general game playing we do not know in advance how much the games we receive suffer from the non-locality or strategy fusion errors that we discussed in Section 3.3. However, we can avoid applying PIMC to such games. After we receive the game description, we can check how much the game suffers from these errors. If it suffers only slightly, we can apply our PIMC solution. If the errors would render the PIMC approach worse than a random player, we can choose another approach instead of PIMC (this option is not implemented in our work).

5.3. ISS
Information Set Search is a method that has already been tested on the game of Kriegspiel (Parker, Nau, & Subrahmanian, Paranoia versus Overconfidence in Imperfect Information Games, 2010) and it proved effective: it outperformed HS, a Kriegspiel algorithm introduced in (Parker, Nau, & Subrahmanian, Game-Tree Search with Combinatorially Large Belief States, 2005).

1) It does not require any prior or domain-specific knowledge.
2) No offline learning time is required, but as in PIMC this time can be used for running game simulations in advance.


3) In (Parker, Nau, & Subrahmanian, Paranoia versus Overconfidence in Imperfect Information Games, 2010) the authors claim that with an accurate opponent model this method produces an optimal strategy against that opponent model. However, since we never have a perfectly accurate opponent model, we have no guarantee of convergence to a Nash equilibrium.

We show analytically that with an accurate opponent model, information-set search produces optimal results. (Parker, Nau, & Subrahmanian, Paranoia versus Overconfidence in Imperfect Information Games, 2010)

5.4. Conclusions
Only CFR provides an assurance that the resulting strategy converges to a Nash equilibrium (for a two-player, zero-sum game). ISS provides an optimal strategy only with an accurate opponent model, and PIMC provides no such guarantee. Compared to CFR, which needs offline learning time, ISS and PIMC need none.

The memory requirements of PIMC are relatively low (in the worst-case scenario we would have to remember the whole EFG tree, which in practice does not happen, since we usually run out of time before we are able to complete it). The memory requirements of PIMC can also be easily restricted by choosing the maximal depth of the simulation tree that we want to remember during the expansion phase. In ISS the memory requirements are similar to PIMC: in the worst case ISS builds the information set tree and samples the histories in each information set. In CFR we need to hold in memory the accumulated regret and strategy values for every information set and every action.

We have decided to implement both the PIMC and the ISS. Unfortunately, due to lack of time we were able to implement only the PIMC. In the next Chapter we discuss the specifics of the implementation of our general game player TIIGR that is based on PIMC.


6 Implementation

In this Chapter we discuss how we implemented TIIGR, our general game player, what existing systems we used, and what pitfalls we encountered. In Section 6.1 we introduce the Palamedes project, which serves as the basis for our work. Then in Section 6.2 we explain in detail how we implemented our general game player, what her components are, and the overall architecture of the player.

6.1. Palamedes
Palamedes is an IDE developed by Ingo Keller. It provides the user with 3 Eclipse plug-ins:

1. UI Plug-In
2. GDL Core Plug-In
3. KIF Core Plug-In

Of the 3 plug-ins we only use the GDL Core plug-in, and thus we only discuss that one. For further information regarding the other plug-ins see (Keller, Palamedes IDE, 2008). The GDL Core plug-in provides the general game components for translating the game description into Palamedes GDL components (for the interfaces see Figure 19). For each of these interfaces there exists a specific implementation based on one of the 3 supported reasoners: Jocular-0.2, JavaProver and PrologProver.

Figure 19 (Keller, Palamedes: A General Game Playing IDE, 2009): Game Model Interfaces of Palamedes

Palamedes also provides users with a communication layer that can receive and parse messages from the game server and send replies to it. This layer is based on a simple embeddable Java HTTP server called NanoHTTPD.

There is a basic general game player in Palamedes that connects the GDL Core Plugin functionality and a communication layer to exchange messages with the game server.


Otherwise, it has no other functionality and serves only as a template upon which users can build their own players. We used this basic general game player as a base for our project.

To sum up, Palamedes parses the game description into its own components so that we can use them for further reasoning. It also provides connectors to the 3 reasoners mentioned above and has a built-in communication layer. An explanation of how the game description is converted into Palamedes GDL components, how Palamedes was implemented, and other details are not part of this thesis. If the reader is interested in this project, we recommend the in-depth studies (Keller, Palamedes: A General Game Playing IDE, 2009), (Keller, Automatic Generation of Game Component Visualization, 2010), and the project's web page http://palamedes-ide.sourceforge.net/, which contains all information concerning the Palamedes project, including source code, architecture design, class specifications, etc.

6.2. TIIGR
Up till now we have introduced the main system that we use; now we can proceed to explaining how our general game player TIIGR was implemented.

TIIGR consists of the following parts:

1. Palamedes - Reasoners
2. Perfect Information Game Player
3. Imperfect Information Generalization

The architecture of our player is shown below on Figure 20.

Figure 20: Architecture of our imperfect information player.

It works in the following manner: the game server communicates with our player (and vice versa) through Palamedes' NanoHTTPD. After a message is received it is sorted depending on its type (START, PLAY, STOP), parsed and forwarded to our player (the Imperfect Information Generalization part). There, the information is used to select which states our player can currently be in. Then our Perfect Information Game Player is launched on a subset of the possible states our player can be in, and one move per state is returned. We select one of the moves and send it to the server. In both the Perfect Information Game Player and the Imperfect Information Generalization we use reasoners to query for legal moves, for the next state resulting from applying a move in a specific state, etc. This is the general overview of how our player works.

Let’s explain the building blocks of our player one by one.

6.2.1. Palamedes - Reasoners
Reasoners provide a way to work with the GDL language. At this point Palamedes already provides 3 reasoners, and it also offers the option to use other reasoners, as soon as adapters for them are created. It is important to say that all of the following reasoners were created for reasoning about GDL. The existing reasoners included in Palamedes are:

1. Javaprover – a Java-based resolver which is easy to use since it runs in the same JVM. Unfortunately, it is the slowest of the 3 resolvers and does not support reasoning about GDL-II.
2. Jocular – also a Java-based resolver, faster than Javaprover but slower than Prologprover. Its disadvantage is the inability to parse some special characters like +, even though they are allowed in the KIF format. It also does not support reasoning about GDL-II.
3. Prologprover – the fastest reasoner of the three. It is a Prolog, not Java, reasoner and thus requires an installation of TkEclipse to work; in Java it then creates an embedded TkEclipse connector. It is harder to set up, but the main reason why we do not use this reasoner is that, like Jocular and JavaProver, it does not support reasoning about GDL-II.

On top of each of these reasoners there is a reasoner adapter. These provide several functions that are needed for general game playing. These functions are:

• getLegalMoves(state, role) – returns a list of legal moves for the player with the specified role.
• getNextState(state, move) – based on a state and a move, returns the new state obtained by applying the supplied move in the given state.
• getGoalValues(state, role) – returns the goal value of the supplied state for the player with the given role.
• getRoleNames – returns the roles of the players in this match.
• getInitialState – returns the initial state of this match.
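To make the shape of such an adapter concrete, the following is a minimal Java sketch of an interface with these functions. The names and types are illustrative stand-ins, not the actual Palamedes API, which differs in detail.

import java.util.List;

// Placeholder marker types standing in for the game-model classes of the GDL Core plug-in.
interface State {}
interface Move {}
interface JointMove {}
interface Role {}

interface ReasonerAdapter {
    List<Move> getLegalMoves(State state, Role role);      // legal moves of a role in a state
    State getNextState(State state, JointMove jointMove);  // state reached by applying a joint move
    int getGoalValues(State state, Role role);             // goal value of a state for a role
    List<Role> getRoleNames();                             // roles of the players in this match
    State getInitialState();                               // initial state of the match
}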

None of the reasoners listed above supports GDL-II; however, we were only discussing the versions that are available in Palamedes. To be able to work with GDL-II game descriptions we had to extend the Palamedes reasoner adapter interface to accommodate a newer version of Javaprover that we took from the Dresden GGP Server. This version is able to parse and work with GDL-II game descriptions, and so we are all set to start creating our own player.

6.2.2. Perfect Information Game Player
This part of our player is able to play perfect information games and is further used in imperfect information games. Basically, it is an implementation of Monte Carlo Tree Search with the upper confidence bound applied to trees, as described in Section 3.3. It accepts a single perfect information state and returns a single move. During its search for the best move to play it extensively uses the reasoner functions defined in the previous section. We now discuss our specific implementation of each part of MC-UCT:

Selection – From the root we traverse the simulation tree until we hit a leaf. In our implementation we consider a leaf to be any state that does not have all its children expanded (there is at least one action in this state that has not been tried yet). The selection is made according to the UCT formula $\bar{X}_j + C \sqrt{\frac{\ln n}{n_j}}$, where $\bar{X}_j$ is the average reward of child $j$, $n_j$ is the number of times child $j$ has been visited, and $n$ is the number of visits of its parent. The parameter $C$ and how to tune it are discussed in Section 7.3.
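As a minimal Java sketch (with illustrative field names; not the exact TIIGR code), the selection step can be written as follows. In this sketch selection is only applied to nodes whose children are all expanded, in line with the leaf definition above, so every child has at least one visit.

import java.util.List;

// Illustrative UCT selection: pick the child maximizing
// averageReward + C * sqrt(ln(parentVisits) / childVisits).
class Node {
    double totalReward;   // sum of rewards back-propagated through this node
    int visits;           // how many times this node has been visited
    List<Node> children;  // expanded children

    Node selectChild(double c) {
        Node best = null;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (Node child : children) {
            double exploit = child.totalReward / child.visits;
            double explore = c * Math.sqrt(Math.log(this.visits) / child.visits);
            if (exploit + explore > bestValue) {
                bestValue = exploit + explore;
                best = child;
            }
        }
        return best;
    }
}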

Expansion – We expand only 1 node during each run of the MCTS loop.

Simulation – In this step the moves of TIIGR and our opponent are random. We do not use any opponent modeling.

Back propagation – is done exactly as explained in Section 3.3: we propagate the reward of the particular game through the simulation tree up to the root and update the average reward and visit count at each node.

Since the MC-UCT was already discussed in detail in Section 3.3 let’s move on to the generalization of our Perfect Information Player for IIGs.

6.2.3. Imperfect Information Generalization
In IIGs we need to reason about which states we can currently be in. At the beginning of every game there is 1 initial state. During the course of the game the number of states we can be in increases, and we need to be able to select the right states based on the information we receive from the game server. This is done in the following manner: every time we receive a PLAY message from the game server we apply the algorithm shown on Figure 21; a graphical example of how this algorithm works is shown on Figure 22.


1  seesTermsFromGS = getSeesTermsFromGS();
2  FOR state : currentStates DO
3      legalMoves = getLegalMoves();
4      FOR legalMove : legalMoves DO
5          IF legalMove.myMove == moveSentToGSpreviousTurn
6          THEN
7          {
8              localSeesTerms = getSeesTerms(state, legalMove)
9              IF localSeesTerms == seesTermsFromGS
10             THEN
11             {
12                 newState = createNewState(state, legalMove)
13                 IF !newState.isTerminal
14                     newStates.add(newState)
15                 END
16             }
17         }
18     END
19 END
20 currentStates = newStates;
Figure 21: Algorithm for choosing the states a player can currently be in. GS stands for game server.

Figure 22: Filtering of states depending on the sees terms received from the server. Messages sent from a state go to the game server; those sent to our states are the sees terms sent by the game server. The crossed-out states are those that do not match the sees terms received from the game server.

We now explain the above algorithm. At the beginning, in the PLAY message we receive sees terms from the game server, which we save for future use (Line 1). Then we take every state we have in our currentStates set, and for each state we obtain all legal joint moves (our moves combined with the opponent's moves) (Lines 2-3). From these moves we select only those which contain the move that we previously sent to the game server (Line 5). The moves that pass this condition are applied to the state and we obtain sees terms (Line 8). These sees terms are then checked against the sees terms we received from the game server (Line 9). If they match, we create a new state which can be our new current state (Line 12). The last check is whether the newState is terminal (it does not make sense to add a terminal state to the possible states we can be in, since the match would already have been ended by the game server) (Line 13). If the state is not terminal, we add it to the set newStates (Line 14). After we are finished with all states, we replace the existing currentStates set with our newStates set (Line 20). A graphical representation of this algorithm is shown on Figure 22.

Because the PLAYCLOCK time is limited, our player cannot run MC-UCT on every possible state. The first option is to fix the number of states to be searched and divide the available time among them. The other option is to set a required time for 1 state; then, depending on the PLAYCLOCK time, the player computes how many states she will be able to search in the remaining time. The advantage of this approach is that the maximum number of states will be searched and each of these states will get the specified minimum time.

At this moment our implementation uses the first option: we check the time after we have computed all possible states and divide the remaining PLAYCLOCK time so that each of our fixed number of states receives the maximal possible time.
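As a minimal sketch (with illustrative names and a safety margin that is our own assumption, not the exact TIIGR code), the per-state budget can be computed like this:

// Split the remaining PLAYCLOCK time evenly among the states to be searched.
final class TimeBudget {
    static long perStateMillis(long playclockMillis, long elapsedMillis, int statesToSearch) {
        long safetyMargin = 500;  // reserve time for parsing and sending the reply (assumption)
        long remaining = playclockMillis - elapsedMillis - safetyMargin;
        return Math.max(0, remaining / Math.max(1, statesToSearch));
    }
}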

When we have decided on the number of states, we need to determine how to select them. There are numerous approaches, but we will mention the border cases. The first approach is the overconfident approach: we select states randomly, and by doing so we simulate the opponent's lack of knowledge (or lack of ability) to deduce which moves are good or bad for her. This approach works particularly well against a random player, because such a player perfectly matches an opponent with no ability to deduce good moves.

The second approach is the paranoid approach. In this case we assume that the opponent knows our strategy and is thus able to make the move that leads her to the best possible state for her (and therefore the worst possible state for us). We therefore select the states that are favorable for our opponent. The decision about which states are favorable for us or for the opponent can be made based on information gained from previous MC-UCT runs; for example, we can use the confidence values of moves to determine which state is favorable for us and which for our opponent. This approach will perform poorly against a random player but should perform better against a more intelligent one.

In our player we have implemented the overconfident approach.

At this moment we have applied our perfect information game player to each of the chosen states and from each of them we have received 1 best action. How should we decide which action is the best? The first idea is to count the number of times each action was returned: if 4 states return move A and 2 states move B, we would select move A. This is the most basic method and therefore we do not use it. Another method is to attach a confidence value to every returned move. We chose the number of times the action was taken in the specific MC simulation to represent the action's confidence. Then there are 2 possibilities for selecting the best action: we either select the action with the highest confidence, or we average the confidence values of identical moves and then pick the move with the highest averaged confidence value. The second option is the one we use in our implementation.
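The following is a minimal sketch of this aggregation (illustrative names, not the exact TIIGR code): the confidence values of identical moves are averaged and the move with the highest average is chosen.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One candidate move returned by MC-UCT for one sampled state,
// together with its confidence value (number of visits).
class Candidate {
    final String move;
    final double confidence;
    Candidate(String move, double confidence) { this.move = move; this.confidence = confidence; }
}

final class MoveAggregator {
    // Average the confidence values of identical moves and return the move with the best average.
    static String bestMove(List<Candidate> candidates) {
        Map<String, double[]> stats = new HashMap<>();  // move -> {sum of confidences, count}
        for (Candidate c : candidates) {
            double[] s = stats.computeIfAbsent(c.move, k -> new double[2]);
            s[0] += c.confidence;
            s[1] += 1;
        }
        String best = null;
        double bestAverage = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> e : stats.entrySet()) {
            double average = e.getValue()[0] / e.getValue()[1];
            if (average > bestAverage) {
                bestAverage = average;
                best = e.getKey();
            }
        }
        return best;
    }
}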

In this Chapter we have introduced TIIGR, our general game player for games with imperfect information. First, in Section 6.1, we introduced the Palamedes project, which is the base of TIIGR. In Section 6.2 we proceeded to explain TIIGR's architecture and the vital components she consists of, and then discussed each of the main parts in more detail in separate sections. In Section 6.2.1 we listed the 3 main reasoners that can be used and explained why we chose JavaProver for TIIGR. After that, in Section 6.2.2, we described the implementation of the Perfect Information Player that is an integral part of our player. Finally, in Section 6.2.3, we explained how TIIGR reasons about states in games with imperfect information.


7 Experiments

We performed several experiments with our player. All of the experiments were run on a computer with a dual-core P6100 processor at 2 GHz and 3 GB of RAM. Most of our experiments were done on our Latent Tic-Tac-Toe game; other experiments were performed on the other existing GDL-II games.

The main parameters that should influence the performance of our algorithm and that we focus on are:

• Time to search 1 state
• The size of the "States to be searched" set (we explained this term in Section 6.2.3)
• The parameter C of MC-UCT

Our goal is to discover how substantial an effect the 3 above-mentioned parameters have on TIIGR's performance and how to set them to obtain the best results from TIIGR. We played around 8000 games with different settings; for most settings it was 100 games as XPLAYER and 100 games as OPLAYER. Each match was scored 100 for the winner, 0 for the loser and 50 for both players in case of a draw. In the following experiments, unless stated otherwise, we use a random player as our opponent.

7.1. Variable time
With these experiments we are trying to verify that increasing the time we give to our player influences her performance. We fix the size of the "States to be searched" set to 5 states (if there are fewer possible states than the set size, they are all searched). The time per state ranged from 100 ms to 5000 ms. We collected data about the average score, the win/loss/draw ratio and the number of turns during a match. First, we present the average score of our player on Figure 23.

[Figure 23 plots the average score (y-axis) against the time per state in ms (x-axis), with five series: XPLAYER score, OPLAYER score, Overall score, Random player as X, and Random player as O.]
Figure 23: Graph with the average score with a fixed size of the "States to be searched" set = 5.


As we can see, there are 5 curves on Figure 23. Let's start with the Random player curves: they represent the average score of a random player playing against another random player, and we use them as a reference against which we compare our player's performance. The curve labeled "XPLAYER score" shows how the performance of our player changes with the given time if the player plays only as XPLAYER (this player always moves first at the beginning of each match). The curve labeled "OPLAYER score" is the equivalent for OPLAYER (this player always plays second). The last curve represents the overall average score of our player. As we can see on Figure 23, the results were to be expected: our player always performs better than the random player, and if we increase the amount of time given to our player, her performance increases. An interesting observation is that in the game of Latent Tic-Tac-Toe the starting player seems to have a great advantage over the second player. Even when our player was playing almost randomly (with a time of 100 ms we were approaching this behavior), she still had an average score of 82.5, compared to only about 50 when playing as OPLAYER. The behavior of all curves for short times (the fluctuations of the average score) is caused by the fact that during MC-UCT there are moves that may seem good when their subtree is explored only to some extent; if we have more time and explore the subtree more thoroughly, we discover those moves to be bad. This can cause fluctuations such as those we see on Figure 23. After a certain amount of time (in Latent Tic-Tac-Toe it is 1000 ms) the average overall score increases monotonically and these fluctuations stop.

On Figure 24 we can see that the percentage of wins steadily increases with increasing time, at the expense of the percentage of losses, which decreases; the percentage of draws stays roughly the same. With more time our player manages to finish the game with a win or a draw more often, and she does not lose nearly as much as for the shortest times.

[Figure 24 plots the win, draw and loss ratios (y-axis) against the time per state in ms (x-axis).]
Figure 24: The percentage of wins, draws and losses depending on time, with a fixed size of the "States to be searched" set = 5.


To support the previous results we measured the average number of turns it takes to finish a game. We expect that the average number of turns for both roles (XPLAYER and OPLAYER) will decrease with increasing time. Another behavior we expect is that when playing as OPLAYER our player will need a higher average number of turns per match than as XPLAYER. There are 2 reasons for this. One is the fact that OPLAYER moves second and thus plays one more turn. The second reason is that the player must sometimes make a move that does not help her win but is necessary to block the opponent from winning; if the player cannot win, she will drag out the game as long as possible in her attempt to achieve a draw.

The results of this experiment are shown on Figure 25. We can see that our expectations were correct and the average number of turns decreases with increasing computation time for both players. This confirms our hypothesis that with increasing time our player (either as XPLAYER or OPLAYER) is able to finish the game faster by winning as shown in Figure 24.

[Figure 25 plots the average number of turns per match (y-axis) against the time per state in ms (x-axis), with one curve for XPLAYER and one for OPLAYER.]
Figure 25: Average number of turns it takes to finish a match depending on the role of the player.

Table 4 below lists the values measured during these experiments. The most important row is the average score per match, and we can see that it increases with increasing time.

Time [ms] (5 states - fixed): 100  200  400  600  800  1000  2000  3000  4000  5000
wins [%]: 0.62  0.75  0.76  0.79  0.81  0.77  0.79  0.80  0.80  0.87
losses [%]: 0.29  0.17  0.14  0.15  0.13  0.13  0.11  0.10  0.08  0.06
draws [%]: 0.10  0.08  0.10  0.06  0.07  0.10  0.11  0.10  0.12  0.08
Average score per match: 66.50  79.17  81.17  82.00  83.75  81.83  84  85  86  90.50
Average score for xplayer: 82.50  90.67  91.67  94  95.50  95.00  93.50  96.00  95.50  97.50
Average score for oplayer: 50.50  67.67  70.67  70  72  68.67  74.50  74.00  76.50  83.50
Avg number of turns xplayer: 10.59  9.14  8.84  8.29  8.17  8.04  8.38  8.12  7.87  7.83
Avg number of turns oplayer: 10.16  8.92  8.77  8.49  8.66  9.12  8.68  8.83  8.92  8.59
Table 4: Measured values with variable time and a fixed size of the "States to be searched" set

From all the above figures and the table we can see that the time given to our player influences her performance: the more time we provide, the better TIIGR performs. From the experiments we can also see that a moderate time per state already gives a good balance between computation time and results.

In the previous experiments we used a random player as our opponent. To see how TIIGR performs against a better opponent, we pitted her against herself. We expect the average score of TIIGR to be severely decreased compared to the average score TIIGR obtained against a random player. With increasing time the playing strength of our player still increases, but in this experiment the average score should not increase, because both players are improving; at some point the average score should stop changing and become constant. The results of this experiment are shown on Figure 26. We show only the change in the average score of XPLAYER, since the development of the average score of OPLAYER is similar. As we expected, playing against herself has influenced the average score: it is clearly lower than when playing against the random player, with a maximum difference of around 28 points.

[Figure 26 plots the XPLAYER average score (y-axis) against the time per state in ms (x-axis), comparing play against the random player and against TIIGR herself.]
Figure 26: Comparison of TIIGR's performance against a random player and against herself.

Now that we have presented our findings, we move to experiments with a variable size of the "States to be searched" set.

7.2. Variable "States to be searched" size
In the following set of experiments we try to show that with an increasing size of the "States to be searched" set the performance of our player also increases. We ran several tests with different sizes of the "States to be searched" set (1, 3 and 5 states) and several different times (200, 800, 3000 and 5000 ms). In these experiments we collected the average score and the win/draw/loss ratio.

On Figure 27 there is a total of 6 curves showing the average score of XPLAYER and OPLAYER depending on the size of the "States to be searched" set. The results support our hypothesis: the more states we include in the "States to be searched" set, the better our player is able to estimate which state she is in, and thus her performance increases. We can notice that when playing as XPLAYER the player's performance is not influenced much by the number of states if the number is larger than or equal to 2. On the other hand, OPLAYER receives a slight increase in performance when we increase the number of states in the "States to be searched" set.

[Figure 27 plots the average score (y-axis) against the time per state in ms (x-axis), with six curves: XPLAYER and OPLAYER for set sizes of 1, 3 and 5 states.]
Figure 27: Average XPLAYER and OPLAYER score for 3 different sizes of the "States to be searched" set.

Figure 28 shows the win/loss ratio of our player for the different sizes of the "States to be searched" set. The results for the win ratio are as expected: the more states we add to the "States to be searched" set, the higher the winning ratio we achieve. If we decrease the size of the "States to be searched" set, we can expect the loss ratio to increase.

[Figure 28 plots the win and loss ratios (y-axis) against the time per state in ms (x-axis), for set sizes of 1, 3 and 5 states.]
Figure 28: Win percentage of our player for 3 different sizes of the "States to be searched" set.

Below on Table 5 are the values measured during our experiments.


Time [ms]: 200 | 800 | 3000 | 5000
# of states: 1  3  5 | 1  3  5 | 1  3  5 | 1  3  5
wins [%]: 0.58  0.68  0.75 | 0.72  0.77  0.81 | 0.75  0.8  0.8 | 0.77  0.84  0.87
losses [%]: 0.31  0.23  0.17 | 0.15  0.16  0.13 | 0.17  0.11  0.1 | 0.10  0.07  0.06
draws [%]: 0.12  0.095  0.08 | 0.135  0.08  0.07 | 0.09  0.10  0.1 | 0.14  0.10  0.08
Average score per match: 63.5  72.75  79.17 | 78.25  80.75  83.75 | 79  84.3  85 | 83.5  88.3  90.5
Average score for xplayer: 78  88  90.67 | 90.5  93.5  95.5 | 85.5  94.5  96 | 93  98  97.5
Average score for oplayer: 49  57.5  67.67 | 66  68  72 | 72.5  74  74 | 74  78.5  83.5
Table 5: Measured values with variable time and variable size of the "States to be searched" set

We can see that in the game of Latent Tic-Tac-Toe TIIGR's performance increases with increasing time and with the size of the "States to be searched" set. From the above experiments it is also clear that 3 states in the "States to be searched" set are a good balance between performance (increasing the size of the set to 5 did not bring a significant improvement) and time.

7.3. Variable C
In the following experiments we want to show what effect the parameter C of MC-UCT has on the performance of our player. What can we expect when we change C? Let us look at the equation used in the selection part of MC-UCT:

$\bar{X}_j + C \sqrt{\frac{\ln n}{n_j}}$ (14)

If we choose C small compared to the average reward $\bar{X}_j$, then the second term of the equation has no influence on the selection and MC-UCT becomes a greedy algorithm (we prefer exploitation over exploration). On the other hand, if we set C too high, our player starts to behave as a random player: we always explore the game tree, and when MC-UCT has to select the best move, all moves have roughly the same confidence value and thus a random move is chosen. Consequently, we expect that for too small and too large values of C the average score will be lower than for values of C between those 2 extremes. C is a game-specific parameter, so a different game will have a different optimal value of C. On Figure 29 we present the results of our experiments with the parameter C.


[Figure 29 plots the overall average score (y-axis) against the time per state in ms (x-axis), with three curves for C = 150, C = 50 and C = 10.]
Figure 29: Average score of our player depending on the parameter C.

As we can see on Figure 29, for short times it is hard to tell which value of C is better. This is caused by the fact that for a small C the PIMC behaves like a depth-first search algorithm, focusing the search on the first good result it finds, which may pay off for short times. However, when the time grows we can see that the smallest value, C = 10, starts to get worse compared to the bigger values, while for C = 150 the average score increases monotonically with time. Therefore, of the 3 tested values of C, the highest one performs the best.

Now let’s continue with experiments done on different games.

7.4. Simple card game
Different tests were performed on a simple card game. In this game there are 3 cards (Jack, Queen and King). Each of the players receives 1 card from the dealer (who is represented by the random player in GDL-II) and the 3rd card is discarded without anyone seeing it. Then each player can either bet or fold (neither player sees what the opponent did – it is a simultaneous decision). If both players bet, the one with the higher card wins (receives a reward of 100 and the opponent 0). If one of them folds and the other bets, the one who bets receives a reward of 75 and the one who folded receives 25. If both fold, they both receive 50. Let's have a look at a log from one of the games we played. The log below is self-explanatory and needs no further discussion (the bold lines are our comments added to the log generated by TIIGR).

BEGINNING OF LOG // In this game we play as the second player // This command starts the match Command: (PLAY scg1 NIL)

// In this moment the dealer is dealing cards to players, therefore our only legal move is NOOP Response to game server:NOOP // We received the card we were dealt Command: (PLAY scg1 (QUEEN))


// Now we check in which states we can be

// If the dealer (RANDOM) dealt JACK to the first player and QUEEN to us, then we would receive a sees term that we received a QUEEN card. Get sees terms => Moves: (DOES RANDOM (DEAL JACK QUEEN ) ) (DOES P1 NOOP ) (DOES P2 NOOP ) , Size: 3

// Therefore this state is accepted as one that we can be in State was ACCEPTED!

// If the dealer dealt JACK to the first player and KING to us, then the generated sees term would say that we received a KING card, which does not match the QUEEN we actually received. INFO [Thread-6] (ReasonerAdapter.java:42) - Get sees terms => Moves: (DOES RANDOM (DEAL JACK KING ) ) (DOES P1 NOOP ) (DOES P2 NOOP ) , Size: 3

// Thus this state is rejected State was REJECTED!

// The above is done for all combinations of dealt cards – QUEEN, JACK INFO [Thread-6] (ReasonerAdapter.java:42) - Get sees terms => Moves: (DOES RANDOM (DEAL QUEEN JACK ) ) (DOES P1 NOOP ) (DOES P2 NOOP ) , Size: 3

State was REJECTED!

// All card combinations – QUEEN, KING; …; KING, QUEEN

// The number of states we can be in is 2 – either our opponent got a jack or a king. INFO [Thread-6] (TestMyPlayer.java:209) – States we can be in = 2

// Now it is player 1’s turn to decide if she goes all in or folds, so we have only 1 action - NOOP Response to game server: NOOP

// This command shows that opponent has decided what to do and it is now our turn. Command: (PLAY scg1 (YOURTURN))

// Now we will check again for all possible states our player can be in INFO [Thread-9] (ReasonerAdapter.java:42) - Get sees terms => Moves: (DOES RANDOM NOOP ) (DOES P1 ALLIN ) (DOES P2 NOOP ) , Size: 3

State was ACCEPTED!

INFO [Thread-9] (ReasonerAdapter.java:42) - Get sees terms => Moves: (DOES RANDOM NOOP ) (DOES P1 FOLD ) (DOES P2 NOOP ) , Size: 3

State was ACCEPTED!

INFO [Thread-9] (ReasonerAdapter.java:42) - Get sees terms => Moves: (DOES RANDOM NOOP ) (DOES P1 ALLIN ) (DOES P2 NOOP ) , Size: 3

State was ACCEPTED!

INFO [Thread-9] (ReasonerAdapter.java:42) - Get sees terms => Moves: (DOES RANDOM NOOP ) (DOES P1 FOLD ) (DOES P2 NOOP ) , Size: 3

State was ACCEPTED!

// The number of states we can be in is 4 – for each of the possible cards dealt opponent could have either played all in or folded. INFO [Thread-9] (TestMyPlayer.java:209) – States we can be in = 4

// Now we run MC-UCT on each of the possible states (since our limit is 5 and we got only 4)

MC started

// In this state opponent has a KING. If we (p2) were to go all in the Average reward we would obtain is 0 (because we always lose). Action: (random noop) (p1 noop) (p2 allin) # of times visited(confidence value):420 Average reward = 0.0

// If we were to fold our Average reward would be 25. As we can see the confidence value is much higher than in the previous all in action. Action: (random noop) (p1 noop) (p2 fold) # of times visited(confidence value):293152 Average reward = 25.0

// Therefore if we knew we are in this specific state (opponent has king and has gone all in) our action would be fold. MC started Action: (random noop) (p1 noop) (p2 allin) # of times visited(confidence value):302540 Average reward = 75.0 Action: (random noop) (p1 noop) (p2 fold) # of times visited(confidence value):422 Average reward = 50.0

MC started Action: (random noop) (p1 noop) (p2 allin) # of times visited(confidence value):299835 Average reward = 100.0 Action: (random noop) (p1 noop) (p2 fold) # of times visited(confidence value):50 Average reward = 25.0

MC started Action: (random noop) (p1 noop) (p2 allin) # of times visited(confidence value):238616 Average reward = 75.0 Action: (random noop) (p1 noop) (p2 fold) # of times visited(confidence value):410 Average reward = 50.0

// Because we do not know in which state we are, choosing an action is harder. At this moment we choose the best action as the action with the highest confidence value, thus we select the ALLIN action. Even if we were choosing by the best average value we would go all in. Response to game server: ALLIN

// Command that informs us that the match is finished Command: (STOP scg1 ()) END OF LOG
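The accept/reject decisions visible in the log follow a simple pattern: for every candidate state the reasoner computes the sees terms our role would have received, and the state is kept only if they match the percepts actually sent by the game server. The sketch below illustrates this filtering; the interface and class names are hypothetical and do not correspond to TIIGR's actual classes.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the state filtering shown in the log above. A candidate state (a possible
// combination of dealt cards and joint moves) is kept only if the sees terms it would
// have produced for our role equal the percepts received from the game server.
class StateFilter {
    interface Reasoner {
        // Sees terms our role would receive after the given joint move in the given state.
        Set<String> getSeesTerms(Object state, List<String> jointMove);
    }

    static List<Object> filter(List<Object> candidateStates,
                               List<List<String>> candidateMoves,
                               Set<String> serverPercepts,
                               Reasoner reasoner) {
        List<Object> accepted = new ArrayList<>();
        for (int i = 0; i < candidateStates.size(); i++) {
            Set<String> predicted =
                new HashSet<>(reasoner.getSeesTerms(candidateStates.get(i), candidateMoves.get(i)));
            if (predicted.equals(serverPercepts)) {
                accepted.add(candidateStates.get(i)); // "State was ACCEPTED!"
            }
            // otherwise: "State was REJECTED!"
        }
        return accepted;
    }
}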

We played 4175 matches and the results are presented in Table 6. TIIGR was set to search up to 5 states each turn. As we can see from the log of this game, the player can never have more than 4 states in her “States to be searched” set. The specified number of “States to be searched” is the maximum number of states that are searched; if there are fewer states, then the player searches through all available states. As we can see from the results in Table 6, TIIGR performs better than the random player she was facing.
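The final decision step visible in the log (the ALLIN response) aggregates the per-state MC-UCT statistics. The following is a minimal sketch of this overconfident aggregation, assuming the visit counts ("confidence values") are simply summed per action across the sampled states; the class and method names are illustrative and do not correspond to TIIGR's actual code.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sums the MC-UCT visit counts per action over all sampled states and plays the
// action with the highest total, as in the log above.
class ActionAggregator {
    static String selectAction(List<Map<String, Integer>> visitCountsPerState) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map<String, Integer> stateCounts : visitCountsPerState) {
            for (Map.Entry<String, Integer> entry : stateCounts.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        // In the log: allin = 420 + 302540 + 299835 + 238616, fold = 293152 + 422 + 50 + 410,
        // so ALLIN is selected.
        return totals.entrySet().stream()
                     .max(Map.Entry.comparingByValue())
                     .map(Map.Entry::getKey)
                     .orElseThrow(IllegalStateException::new);
    }
}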


Results            TIIGR’s action    Random player’s action    Count
TIIGR won          All in            All in                    825
Opponent folded    All in            Fold                      1067
Both folded        Fold              Fold                      1038
TIIGR folded       Fold              All in                    1055
Opponent won       All in            All in                    190
Average score: 57.677
Table 6: Results of our simple card game
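As a quick consistency check, the average score reported in Table 6 follows directly from the match counts and the payoffs of the game:

$$\frac{825 \cdot 100 + 1067 \cdot 75 + 1038 \cdot 50 + 1055 \cdot 25 + 190 \cdot 0}{4175} = \frac{240800}{4175} \approx 57.677.$$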

Let’s analyze the results. We have a probability of 1/3 that we receive J, Q or K. It is easy to see what action we take if we receive J or K. For J it is to fold and for K it is to go all in in all cases. Not so clear is the situation with Q. If we have Q our opponent can have either J or K and he can go all in or fold. Thus we have 4 possible situations:

1. Opponent has K and goes all in – in this situation our best response is to fold because we have no chance of winning 2. Opponent has K and folds – here it would be best to go all in 3. Opponent has J and goes all in – in this case it is also best to go all in 4. Opponent has J and folds – again in this case it is best for us to go all in

Since our player considers that she can be in all states with the same probability (we apply the overconfident approach in selecting the possible states that TIIGR can be in – as discussed in Section 6.2.3) we obtain that the best action is to go all in.
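These four situations correspond exactly to the four MC-UCT runs in the log above, and the expected rewards under the uniform (overconfident) belief can be computed directly:

$$E[\text{all in} \mid Q] = \tfrac{1}{4}(0 + 75 + 100 + 75) = 62.5, \qquad E[\text{fold} \mid Q] = \tfrac{1}{4}(25 + 50 + 25 + 50) = 37.5.$$

So going all in is the better response when holding Q. Under the paranoid approach, which evaluates the worst case over the possible states, all in guarantees only 0 while fold guarantees 25, so a paranoid player holding Q would fold; this is consistent with the behaviour discussed in the next paragraph.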

We see in Table 6 that 2/3 of the time we played all in (in case we have Q or K) and 1/3 of the time we folded. If we changed the selection of possible states to the paranoid approach, we would see that 2/3 of the time our player folds and 1/3 of the time she goes all in.

7.5. Liar’s dice

The last set of experiments was run on a game called Liar’s dice, which is available on the Dresden GGP Server (Dresden GGP server, 2010) under the name Meier. This is also a simple game. The game starts with one player rolling 2 dice. She sees the roll and claims some value (she can either tell the truth or bluff). Her opponent does not see the roll and has 2 options: she can either claim that the first player is bluffing, which ends the game, or she can roll her own dice and claim that she rolled a greater value than the first player. This goes on until one of the players calls the other’s roll a bluff. If player 1 calls a bluff and player 2 has the values she claimed, then player 1 is scored 0 and player 2’s reward is 100. If, on the other hand, player 2 does not have the values she claimed, then she loses and receives a reward of 0 and her opponent receives 100. Again, as with the previous simple card game, let’s have a look at a log from one of the games we played. The log below is commented and shows what occurs during the gameplay (the bold lines are our comments added to the log generated by TIIGR).


BEGINNING OF LOG // Command to start the match. First player 1 is rolling dice (that is us) Command: (PLAY meierid1 NIL)

// The dice rolling is done by the random player thus our only legal move is NOOP Response to game server: NOOP

// We see that we have rolled 1 and 3 Command: (PLAY meierid1 ((MY_DICE 1 3) (DOES P1 NOOP) (DOES P2 NOOP) ))

// Now we check which states we can be in. INFO [Thread-6] (TestMyPlayer.java:402) - Getting sees terms for node Depth[0] := Moves No Move Info => ((previous_claimed_values 0 0) (rolling_for p1)) and moves [random (roll p1 1 1), p1 noop, p2 noop]

State was REJECTED!

INFO [Thread-6] (TestMyPlayer.java:402) - Getting sees terms for node Depth[0] := Moves No Move Info => ((previous_claimed_values 0 0) (rolling_for p1)) and moves [random (roll p1 1 2), p1 noop, p2 noop]

State was REJECTED!

// Only the state where we have rolled 1 3 is accepted because the sees terms from server match our generated sees terms. INFO [Thread-6] (TestMyPlayer.java:402) - Getting sees terms for node Depth[0] := Moves No Move Info => ((previous_claimed_values 0 0) (rolling_for p1)) and moves [random (roll p1 1 3), p1 noop, p2 noop]

State was ACCEPTED!

INFO [Thread-6] (TestMyPlayer.java:402) - Getting sees terms for node Depth[0] := Moves No Move Info => ((previous_claimed_values 0 0) (rolling_for p1)) and moves [random (roll p1 6 3), p1 noop, p2 noop]

State was REJECTED!

INFO [Thread-6] (TestMyPlayer.java:402) - Getting sees terms for node Depth[0] := Moves No Move Info => ((previous_claimed_values 0 0) (rolling_for p1)) and moves [random (roll p1 6 4), p1 noop, p2 noop]

State was REJECTED!

INFO [Thread-6] (TestMyPlayer.java:402) - Getting sees terms for node Depth[0] := Moves No Move Info => ((previous_claimed_values 0 0) (rolling_for p1)) and moves [random (roll p1 6 5), p1 noop, p2 noop]

State was REJECTED!

INFO [Thread-6] (TestMyPlayer.java:402) - Getting sees terms for node Depth[0] := Moves No Move Info => ((previous_claimed_values 0 0) (rolling_for p1)) and moves [random (roll p1 6 6), p1 noop, p2 noop]

State was REJECTED!

// We can be only in 1 state and that is the state where we have rolled 1 3 States we can be in = 1

// And it is our turn so run a MC on the state we can be in and select the best action to play MC started Action random noop p1 (claim 3 1) p2 noop , # of times visited(confidence value): 186 Average reward = 88.70967741935483

Action random noop p1 (claim 3 2) p2 noop , # of times visited(confidence value): 16 Average reward = 25.0 Action random noop p1 (claim 4 1) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 4 2) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 4 3) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 5 1) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 5 2) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 5 3) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 5 4) p2 noop , # of times visited(confidence value): 15 Average reward = 20.0 Action random noop p1 (claim 6 1) p2 noop , # of times visited(confidence value): 12 Average reward = 8.333333333333334 Action random noop p1 (claim 6 2) p2 noop , # of times visited(confidence value): 15 Average reward = 20.0 Action random noop p1 (claim 6 3) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 6 4) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 6 5) p2 noop , # of times visited(confidence value): 15 Average reward = 20.0 Action random noop p1 (claim 1 1) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 2 2) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 3 3) p2 noop , # of times visited(confidence value): 12 Average reward = 8.333333333333334 Action random noop p1 (claim 4 4) p2 noop , # of times visited(confidence value): 15 Average reward = 20.0 Action random noop p1 (claim 5 5) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 6 6) p2 noop , # of times visited(confidence value): 14 Average reward = 14.285714285714286 Action random noop p1 (claim 2 1) p2 noop , # of times visited(confidence value): 11 Average reward = 0.0

// From MC we get that the best action is to tell the truth and claim that we rolled 3 1 Response to game server: (CLAIM 3 1)

// Now opponent saw what are the values that we claim and we filter our states. Command: (PLAY meierid1 ((DOES P1 (CLAIM 3 1)) (DOES P2 NOOP) )) INFO [Thread-8] (TIIGRPlayer.java:183) – Didn’t play this move (claim 3 2), my move was (CLAIM 3 1). State REJECTED!

INFO [Thread-8] (TIIGRPlayer.java:183) - Didn’t play this move (claim 6 6), my move was (CLAIM 3 1). State REJECTED!

INFO [Thread-8] (TIIGRPlayer.java:181) – IT IS THE MOVE I PLAYED (claim 3 1) myMove = (CLAIM 3 1). CHECKING SEES TERMS!

INFO [Thread-8] (TIIGRPlayer.java:419) - Getting sees terms for node Depth[1] := Moves[random (roll p1 2 1), p1 noop, p2 noop] => ((previous_claimed_values 0 0) (claiming p1) (has_dice p1 2 1)) and moves [random noop, p1 (claim 2 1), p2 noop]

INFO [Thread-8] (TIIGRPlayer.java:211) - was ACCEPTED!


INFO [Thread-8] (TIIGRPlayer.java:226) - States we can be in = 1

// We made our claim and now it is the opponent’s turn to decide whether she will roll or say that we bluff. Thus we have only 1 legal action, NOOP. Response to game server: NOOP

// The game has ended. From the sees terms in the STOP message we see that the opponent did not choose to roll her dice but chose to call our claimed value a bluff. Because we did not bluff we have won. Command: (STOP meierid1 ((DOES P1 NOOP) (DOES P2 YOU_BLUFF)))

END OF LOG

The results for this game are presented in Table 7.

Number of times won     194
Number of times lost    6
Average score           97
Table 7: Results of the Meier game
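This average score is consistent with the counts, since a won game is rewarded with 100 and a lost game with 0:

$$\frac{194 \cdot 100 + 6 \cdot 0}{194 + 6} = 97.$$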

As we can see, our player performs extremely well in this game. These games were all played against a random player and thus the results can be easily explained. If we are playing against a random player and the opponent is rolling the dice first, then our player always calls a bluff. Since the probability that the opponent is bluffing is high, we win most of the time. If we are rolling the dice first, we always claim the rolled value. The random player chooses to call a bluff 50% of the time, which results in our victory. If she rolls her dice and claims a value, we call her value a bluff, and the probability that she is bluffing is again high. Therefore it is no surprise that our player performs so well.

In this Chapter we have tested TIIGR on all existing GDL-II games. Our experiments have shown that the performance of our player increases with increasing time and with the increasing size of the “States to be searched” set. In the next Chapter we conclude our work and present options on how to improve TIIGR.


8 Conclusions

In this thesis we have given a review of the existing state-of-the-art techniques for playing imperfect information games. Based on this analysis we have chosen MC-UCT as the most promising approach for creating an imperfect information general game playing agent.

With an approach chosen, we have created TIIGR, one of the first general game playing agents compatible with GDL-II. Due to the lack of games written in GDL-II, we have created our own game of Latent Tic-Tac-Toe and rewritten a simple card game. On these games and the game of Meier (Liar’s dice), which is available on the Dresden GGP server, we have shown that our agent is able to play all of the games, be it a card, board or guessing game. The performance of our player is tested and thoroughly studied on several thousand games played against a random opponent and against herself.

8.1. Evaluation

We will list the requirements of our thesis and describe how each of them was accomplished.

1. Our goal was to study the state-of-the-art techniques for playing specific imperfect information games, and create a review of existing approaches that can be used for playing imperfect information games. Then we had to analyze their requirements and discuss their usability in a GGP agent.

We have created an overview of existing approaches in Chapter 3 - General game playing algorithms, and later, in Chapter 5 - Discussion, we have analyzed and discussed the usability of 3 approaches: Counterfactual regret minimization, Information set search and Perfect information sampling Monte Carlo.

2. Another objective was to get familiar with the rules, requirements and main principles of the General Game Playing competition organized at the AAAI conference. In particular, we had to study the new variant of Game Description Language (GDL-II), which allows specifying imperfect information games.

We have introduced the General Game Playing competition, its rules and requirements in Chapter 4 - General Game Playing. We have introduced both GDL and GDL-II, created our own game in this language (Latent Tic-Tac-Toe, which we provided to the community and which can be downloaded from the Dresden GGP Server), and coded the simple card game described in (Thielscher, 2010).


3. The last goal was to design, implement and experimentally evaluate a GGP agent. The agent does not have to fully support GDL-II, but it should be able to play multiple formally specified imperfect information games.

This goal was met by creating TIIGR, one of the first general game playing agents that fully supports GDL-II. We have performed various experiments on all existing GDL-II games. The experiments confirmed our hypothesis that the performance of TIIGR improves with increasing time and with the increasing size of the “States to be searched” set. For more details and discussion of our experiments see Chapter 7.

8.2. Future Work

During the experiments we discovered several ways in which TIIGR can be improved.

1. The JavaProver reasoner that we use is 4 times slower than the PrologProver reasoner. If we added GDL-II support to the PrologProver and used it instead of the JavaProver reasoner, the performance of our player would be dramatically boosted, since she would need less time to achieve the same results.

2. Fine-tune the parameter C of MC-UCT for general games. The parameter depends on the specific game. If we compare more games and the influence of the parameter C on the results, we expect to find an interval in which MC-UCT performs well for every game.

3. Implement the information set search approach and run MC-UCT on the information set tree instead of on the perfect information state tree, then compare the results of these 2 approaches. If the information set search provides good results, make it part of our player.

4. A useful add-on to our program would be a mechanism to check whether a game we are going to play suffers from the non-locality or strategy fusion errors. Based on this information we could decide whether to use MC-UCT or another approach to play the specified game.


9 Bibliography

Dresden GGP server. (2010). Retrieved December 26, 2010, from Dresden GGP server: http://euklid.inf.tu-dresden.de:8180/ggpserver/

Finnsson, H. (2007). CADIA-Player: A General Game Playing Agent. Master Thesis. Reykjavík University.

Genesereth, M., Love, N., & Pell, B. (2005). General Game Playing: Overview of the AAAI Competition. AI magazine, 26(2), pp. 62-72.

Chaslot, G., Bakkes, S., Szita, I., & Spronck, P. (2008). Monte-Carlo tree search: A new framework for game AI. BNAIC 2008 (pp. 389-390). University of Twente.

Johanson, M. B. (2007). Robust Strategies and Counter-Strategies: Building a Champion Level Computer Poker Player. Master Thesis. Edmonton: University of Alberta.

Keller, I. (2008). Retrieved April 11, 2011, from Palamedes IDE: http://palamedes-ide.sourceforge.net/index.html

Keller, I. (2009). Palamedes: A General Game Playing IDE. Student Thesis. Dresden University of Technology.

Keller, I. (2010). Automatic Generation of Game Component Visualization. Master Thesis. Dresden University of Technology.

Kocsis, L., & Szepesvári, C. (2006). Bandit Based Monte Carlo Planning. Machine Learning: ECML 2006, Lecture Notes in Artificial Intelligence 4212, pp. 282-293.

Long, J. R., Sturtevant, N. R., Buro, M., & Furtak, T. (2010). Understanding the Success of Perfect Information Monte Carlo Sampling in Game Tree Search. AAAI-10, (pp. 134-140). Atlanta.

Love, N., Hinrichs, T., Haley, D., Schkufza, E., & Genesereth, M. (2008). General game playing: Game description language specification. Technical Report LG-2006-01. Stanford University.

Luckhardt, C. A., & Irani, K. B. (1986). An Algorithmic Solution of n-person Games. AAAI-86, (pp. 158-162).

Méhat, J., & Cazenave, T. (2010). Ary, a general game playing program. Paris.

Osborne, M. J., & Rubinstein, A. (1994). A Course in Game Theory. The MIT Press.


Parker, A., Nau, D., & Subrahmanian, V. (2006). Overconfidence or Paranoia? Search in Imperfect-Information Games. The Twenty-first National Conference on Artificial Intelligence, (pp. 1045-1050). Boston.

Parker, A., Nau, D., & Subrahmanian, V. S. (2005). Game-Tree Search with Combinatorially Large Belief States. IJCAI, (pp. 254-259).

Parker, A., Nau, D., & Subrahmanian, V. S. (2010). Paranoia versus Overconfidence in Imperfect Information Games. In Heuristics, Probability and Causality: A Tribute to Judea Pearl (pp. 63-87). College Publications.

Polak, B. (2007). Game Theory. Retrieved January 27, 2010, from Open Yale Courses: http://oyc.yale.edu

Rasmusen, E. (2006). Games and Information: An Introduction to Game Theory (4th ed.). Blackwell Publishers.

Russell, S., & Norvig, P. (2003). Artificial Intelligence: A Modern Approach (2nd ed.). Prentice Hall.

Shoham, Y., & Leyton-Brown, K. (2010). Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press.

Schiffel, S., & Thielscher, M. (2007). Fluxplayer: A Successful General Game Player. AAAI-07 (pp. 1191-1196). AAAI Press.

Sturtevant, N. R. (2008). An Analysis of UCT in Multi-player Games. ICGA Journal, 31(4), 195-208.

Thielscher, M. (2010). A General Game Description Language for Incomplete Information Games. AAAI-10 (pp. 994-999). AAAI Press.

Zinkevich, M., Johanson, M., Piccione, C., & Bowling, M. (2008). Regret Minimization in Games with Incomplete Information. NIPS (pp. 1729-1736). MIT Press.


10 Appendix A

GDL Example of 2 player Bomberman game downloaded from (Dresden GGP server, 2010):

(role bomberman) (role bomberwoman)
(init (location bomberman 1 1)) (init (location bomberwoman 8 8))
(init (blockednorth 2 1)) (init (blockednorth 4 1)) (init (blockednorth 5 1)) (init (blockednorth 7 1)) (init (blockednorth 2 2)) (init (blockednorth 4 2)) (init (blockednorth 5 2)) (init (blockednorth 7 2)) (init (blockednorth 2 3)) (init (blockednorth 4 3)) (init (blockednorth 5 3)) (init (blockednorth 7 3)) (init (blockednorth 2 5)) (init (blockednorth 4 5)) (init (blockednorth 5 5)) (init (blockednorth 7 5)) (init (blockednorth 2 6)) (init (blockednorth 4 6)) (init (blockednorth 5 6)) (init (blockednorth 7 6)) (init (blockednorth 2 7)) (init (blockednorth 4 7)) (init (blockednorth 5 7)) (init (blockednorth 7 7)) (init (blockednorth 1 8)) (init (blockednorth 2 8)) (init (blockednorth 3 8)) (init (blockednorth 4 8)) (init (blockednorth 5 8)) (init (blockednorth 6 8)) (init (blockednorth 7 8)) (init (blockednorth 8 8))
(init (blockedeast 1 2)) (init (blockedeast 1 4)) (init (blockedeast 1 5)) (init (blockedeast 1 7)) (init (blockedeast 2 2)) (init (blockedeast 2 4)) (init (blockedeast 2 5)) (init (blockedeast 2 7)) (init (blockedeast 3 2)) (init (blockedeast 3 4))

(init (blockedeast 3 5)) (init (blockedeast 3 7)) (init (blockedeast 5 2)) (init (blockedeast 5 4)) (init (blockedeast 5 5)) (init (blockedeast 5 7)) (init (blockedeast 6 2)) (init (blockedeast 6 4)) (init (blockedeast 6 5)) (init (blockedeast 6 7)) (init (blockedeast 7 2)) (init (blockedeast 7 4)) (init (blockedeast 7 5)) (init (blockedeast 7 7)) (init (blockedeast 8 1)) (init (blockedeast 8 2)) (init (blockedeast 8 3)) (init (blockedeast 8 4)) (init (blockedeast 8 5)) (init (blockedeast 8 6)) (init (blockedeast 8 7)) (init (blockedeast 8 8)) (init (step 1)) (<= (legal ?char (move ?dir)) (role ?char) (true (location ?char ?x ?y)) (legalstep ?dir ?x ?y)) (<= (legal ?char dropbomb) (role ?char)) (<= (next (blockednorth ?x ?y)) (true (blockednorth ?x ?y))) (<= (next (blockedeast ?x ?y)) (true (blockedeast ?x ?y))) (<= (next (location ?char ?x2 ?y2)) (role ?char) (true (location ?char ?x1 ?y1)) (does ?char (move ?dir)) (nextcell ?dir ?x1 ?y1 ?x2 ?y2)) (<= (next (location ?char ?x ?y)) (role ?char) (true (location ?char ?x ?y)) (does ?char dropbomb)) (<= (next (location bomb3 ?x ?y)) (role ?char) (true (location ?char ?x ?y)) (does ?char dropbomb)) (<= (next (location bomb2 ?x ?y)) (true (location bomb3 ?x ?y))) (<= (next (location bomb1 ?x ?y)) (true (location bomb2 ?x ?y))) (<= (next (location bomb0 ?x ?y))


(true (location bomb1 ?x ?y))) (<= (next (location fire ?xf ?y)) (true (location bomb0 ?xb ?y)) (not (true (blockedeast ?xb ?y))) (index ?xf)) (<= (next (location fire ?x ?yf)) (true (location bomb0 ?x ?yb)) (not (true (blockednorth ?x ?yb))) (index ?yf)) (<= (next (step ?n++)) (true (step ?n)) (succ ?n ?n++)) (<= terminal bombermanburned) (<= terminal bomberwomanburned) (<= terminal timeout) (<= (goal bomberman 0) (not timeout) (not bombermanburned) (not bomberwomanburned)) (<= (goal bomberman 0) bombermanburned (not bomberwomanburned)) (<= (goal bomberman 50) bombermanburned bomberwomanburned) (<= (goal bomberman 50) timeout (not bombermanburned) (not bomberwomanburned)) (<= (goal bomberman 100) (not bombermanburned) bomberwomanburned) (<= (goal bomberwoman 0) (not timeout) (not bombermanburned) (not bomberwomanburned)) (<= (goal bomberwoman 0) (not bombermanburned) bomberwomanburned) (<= (goal bomberwoman 50) bombermanburned bomberwomanburned) (<= (goal bomberwoman 50) timeout (not bombermanburned) (not bomberwomanburned)) (<= (goal bomberwoman 100) bombermanburned


(not bomberwomanburned)) (<= (legalstep north ?x ?y) (++ ?y ?ynew) (cell ?x ?ynew) (not (blocked ?x ?y ?x ?ynew))) (<= (legalstep south ?x ?y) (-- ?y ?ynew) (cell ?x ?ynew) (not (blocked ?x ?y ?x ?ynew))) (<= (legalstep east ?x ?y) (++ ?x ?xnew) (cell ?xnew ?y) (not (blocked ?x ?y ?xnew ?y))) (<= (legalstep west ?x ?y) (-- ?x ?xnew) (cell ?xnew ?y) (not (blocked ?x ?y ?xnew ?y))) (<= (legalstep nowhere ?x ?y) (cell ?x ?y)) (<= (nextcell north ?x ?y ?x ?ynew) (index ?x) (++ ?y ?ynew)) (<= (nextcell south ?x ?y ?x ?ynew) (index ?x) (-- ?y ?ynew)) (<= (nextcell east ?x ?y ?xnew ?y) (index ?y) (++ ?x ?xnew)) (<= (nextcell west ?x ?y ?xnew ?y) (index ?y) (-- ?x ?xnew)) (<= (nextcell nowhere ?x ?y ?x ?y) (cell ?x ?y)) (<= (blocked ?x ?y1 ?x ?y2) (true (blockednorth ?x ?y1)) (++ ?y1 ?y2)) (<= (blocked ?x ?y2 ?x ?y1) (true (blockednorth ?x ?y1)) (++ ?y1 ?y2)) (<= (blocked ?x1 ?y ?x2 ?y) (true (blockedeast ?x1 ?y)) (++ ?x1 ?x2)) (<= (blocked ?x2 ?y ?x1 ?y) (true (blockedeast ?x1 ?y)) (++ ?x1 ?x2)) (<= (distinctcell ?x1 ?y1 ?x2 ?y2) (cell ?x1 ?y1) (cell ?x2 ?y2) (distinct ?x1 ?x2)) (<= (distinctcell ?x1 ?y1 ?x2 ?y2) (cell ?x1 ?y1)


(cell ?x2 ?y2) (distinct ?y1 ?y2)) (<= bombermanburned (true (location bomberman ?x ?y)) (true (location fire ?x ?y))) (<= bomberwomanburned (true (location bomberwoman ?x ?y)) (true (location fire ?x ?y))) (<= timeout (true (step 50))) (index 1) (index 2) (index 3) (index 4) (index 5) (index 6) (index 7) (index 8) (cell 1 8) (cell 2 8) (cell 3 8) (cell 4 8) (cell 5 8) (cell 6 8) (cell 7 8) (cell 8 8) (cell 1 7) (cell 2 7) (cell 3 7) (cell 4 7) (cell 5 7) (cell 6 7) (cell 7 7) (cell 8 7) (cell 1 6) (cell 2 6) (cell 3 6) (cell 4 6) (cell 5 6) (cell 6 6) (cell 7 6) (cell 8 6) (cell 1 5) (cell 2 5) (cell 3 5) (cell 4 5) (cell 5 5) (cell 6 5) (cell 7 5) (cell 8 5) (cell 1 4)


(cell 2 4) (cell 3 4) (cell 4 4) (cell 5 4) (cell 6 4) (cell 7 4) (cell 8 4) (cell 1 3) (cell 2 3) (cell 3 3) (cell 4 3) (cell 5 3) (cell 6 3) (cell 7 3) (cell 8 3) (cell 1 2) (cell 2 2) (cell 3 2) (cell 4 2) (cell 5 2) (cell 6 2) (cell 7 2) (cell 8 2) (cell 1 1) (cell 2 1) (cell 3 1) (cell 4 1) (cell 5 1) (cell 6 1) (cell 7 1) (cell 8 1) (++ 1 2) (++ 2 3) (++ 3 4) (++ 4 5) (++ 5 6) (++ 6 7) (++ 7 8) (-- 8 7) (-- 7 6) (-- 6 5) (-- 5 4) (-- 4 3) (-- 3 2) (-- 2 1) (succ 1 2) (succ 2 3) (succ 3 4) (succ 4 5) (succ 5 6) (succ 6 7)


(succ 7 8) (succ 8 9) (succ 9 10) (succ 10 11) (succ 11 12) (succ 12 13) (succ 13 14) (succ 14 15) (succ 15 16) (succ 16 17) (succ 17 18) (succ 18 19) (succ 19 20) (succ 20 21) (succ 21 22) (succ 22 23) (succ 23 24) (succ 24 25) (succ 25 26) (succ 26 27) (succ 27 28) (succ 28 29) (succ 29 30) (succ 30 31) (succ 31 32) (succ 32 33) (succ 33 34) (succ 34 35) (succ 35 36) (succ 36 37) (succ 37 38) (succ 38 39) (succ 39 40) (succ 40 41) (succ 41 42) (succ 42 43) (succ 43 44) (succ 44 45) (succ 45 46) (succ 46 47) (succ 47 48) (succ 48 49) (succ 49 50)


11 Appendix B

Latent Tic-Tac-Toe defined in GDL-II. Based on a Tic-Tac-Toe game from (Dresden GGP server, 2010).

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Roles ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(role xplayer) (role oplayer)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Initial State ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(init (cell 1 1 b)) (init (cell 1 2 b)) (init (cell 1 3 b)) (init (cell 2 1 b)) (init (cell 2 2 b)) (init (cell 2 3 b)) (init (cell 3 1 b)) (init (cell 3 2 b)) (init (cell 3 3 b)) (init (deniedx 1 1 b)) (init (deniedx 1 2 b)) (init (deniedx 1 3 b)) (init (deniedx 2 1 b)) (init (deniedx 2 2 b)) (init (deniedx 2 3 b)) (init (deniedx 3 1 b)) (init (deniedx 3 2 b)) (init (deniedx 3 3 b)) (init (deniedo 1 1 b)) (init (deniedo 1 2 b)) (init (deniedo 1 3 b)) (init (deniedo 2 1 b)) (init (deniedo 2 2 b)) (init (deniedo 2 3 b)) (init (deniedo 3 1 b)) (init (deniedo 3 2 b)) (init (deniedo 3 3 b)) ;; Who has control at start (init (control xplayer))


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Dynamic Components ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; For deniedx and deniedo the X character means TRUE, ;; b means false

(<= (next (cell ?m ?n x)) (does xplayer (mark ?m ?n)) (true (cell ?m ?n b)))

(<= (next (deniedx ?m ?n x)) (does xplayer (mark ?m ?n)) (true (deniedx ?m ?n b)))

(<= (next (cell ?m ?n o)) (does oplayer (mark ?m ?n)) (true (cell ?m ?n b)))

(<= (next (deniedo ?m ?n x)) (does oplayer (mark ?m ?n)) (true (deniedo ?m ?n b)))

(<= (next (cell ?m ?n ?w)) (true (cell ?m ?n ?w)) (distinct ?w b))

(<= (next (cell ?m ?n b)) (does ?w (mark ?j ?k)) (true (cell ?m ?n b)) (or (distinct ?m ?j) (distinct ?n ?k)))

(<= (next (deniedx ?m ?n x)) (true (deniedx ?m ?n x)))

(<= (next (deniedo ?m ?n x)) (true (deniedo ?m ?n x)))

(<= (next (deniedx ?m ?n b)) (does xplayer (mark ?j ?k)) (true (deniedx ?m ?n b)) (or (distinct ?m ?j) (distinct ?n ?k)))

(<= (next (deniedo ?m ?n b)) (does oplayer (mark ?j ?k)) (true (deniedo ?m ?n b)) (or (distinct ?m ?j) (distinct ?n ?k)))

(<= (next (deniedo ?m ?n b)) (does xplayer (mark ?m ?n)) (true (deniedo ?m ?n b)))


(<= (next (deniedx ?m ?n b)) (does oplayer (mark ?m ?n)) (true (deniedx ?m ?n b)))

(<= (next (deniedo ?m ?n b)) (does xplayer (mark ?j ?k)) (true (deniedo ?m ?n b)) (or (distinct ?m ?j) (distinct ?n ?k)))

(<= (next (deniedx ?m ?n b)) (does oplayer (mark ?j ?k)) (true (deniedx ?m ?n b)) (or (distinct ?m ?j) (distinct ?n ?k)))

(<= (row ?m ?x) (true (cell ?m 1 ?x)) (true (cell ?m 2 ?x)) (true (cell ?m 3 ?x)))

(<= (column ?n ?x) (true (cell 1 ?n ?x)) (true (cell 2 ?n ?x)) (true (cell 3 ?n ?x)))

(<= (diagonal ?x) (true (cell 1 1 ?x)) (true (cell 2 2 ?x)) (true (cell 3 3 ?x)))

(<= (diagonal ?x) (true (cell 1 3 ?x)) (true (cell 2 2 ?x)) (true (cell 3 1 ?x)))

(<= (line ?x) (row ?m ?x)) (<= (line ?x) (column ?m ?x)) (<= (line ?x) (diagonal ?x))

(<= open (true (cell ?m ?n b)))

(<= (next (control oplayer)) (does xplayer (mark ?x ?y)) (legal xplayer (mark ?x ?y)) (true (cell ?x ?y b)) (true (control xplayer)))

(<= (next (control xplayer)) (does xplayer (mark ?x ?y)) (legal xplayer (mark ?x ?y))


(true (cell ?x ?y o)) (true (control xplayer)))

(<= (next (control xplayer)) (does oplayer (mark ?x ?y)) (legal oplayer (mark ?x ?y)) (true (cell ?x ?y b)) (true (control oplayer)))

(<= (next (control oplayer)) (does oplayer (mark ?x ?y)) (legal oplayer (mark ?x ?y)) (true (cell ?x ?y x)) (true (control oplayer)))

;; DEFINING SEES TERMS

;; If oplayer made a move which was legal but NOT valid ;; then he should see that his move was not accepted (<= (sees oplayer controlo) (does oplayer (mark ?x ?y)) (legal oplayer (mark ?x ?y)) (true (cell ?x ?y x)) (true (control oplayer)))

;; If xplayer made a move which was legal but NOT valid ;; then he should see that his move was not accepted (<= (sees xplayer controlx) (does xplayer (mark ?x ?y)) (legal xplayer (mark ?x ?y)) (true (cell ?x ?y o)) (true (control xplayer)))

;; When oplayer makes a valid and legal move xplayer gains control (<= (sees xplayer controlx) (does oplayer (mark ?x ?y)) (legal oplayer (mark ?x ?y)) (true (cell ?x ?y b)) (true (control oplayer)))

;; When xplayer makes a valid and legal move oplayer gains control (<= (sees oplayer controlo) (does xplayer (mark ?x ?y)) (legal xplayer (mark ?x ?y)) (true (cell ?x ?y b)) (true (control xplayer)))

;;If players move is legal, he should see it (<= (sees ?w (mark ?x ?y))


(does ?w (mark ?x ?y)) (legal ?w (mark ?x ?y)))

;; DEFINING LEGAL MOVES

(<= (legal xplayer (mark ?x ?y)) (true (deniedx ?x ?y b)) (true (control xplayer)))

(<= (legal oplayer (mark ?x ?y)) (true (deniedo ?x ?y b)) (true (control oplayer)))

(<= (legal xplayer noop) (true (control oplayer)))

(<= (legal oplayer noop) (true (control xplayer)))

;; DEFINING GOALS

(<= (goal xplayer 100) (line x))

(<= (goal xplayer 50) (not (line x)) (not (line o)) (not open))

(<= (goal xplayer 0) (line o))

(<= (goal oplayer 100) (line o))

(<= (goal oplayer 50) (not (line x)) (not (line o)) (not open))

(<= (goal oplayer 0) (line x))

;; DEFINING TERMINAL STATES

(<= terminal (line x))

(<= terminal (line o))


(<= terminal (not open))
