Monte Carlo Tree Search Algorithms Applied to the Card Game Scopone
POLITECNICO DI MILANO
School of Industrial and Information Engineering (Scuola di Ingegneria Industriale e dell'Informazione)
Master of Science in Computer Engineering (Corso di Laurea Magistrale in Ingegneria Informatica)

Monte Carlo Tree Search algorithms applied to the card game Scopone

Advisor: Prof. Pier Luca Lanzi
Thesis by: Stefano Di Palma, student ID 797319
Academic Year 2013-2014

Contents

List of Figures
List of Tables
List of Algorithms
List of Source Codes
Acknowledgements
Abstract
Sommario (abstract in Italian)
Estratto (extended summary in Italian)

1 Introduction
  1.1 Original Contributions
  1.2 Thesis Outline
2 Artificial Intelligence in Board and Card Games
  2.1 Artificial Intelligence in Board Games
    2.1.1 Checkers
    2.1.2 Chess
    2.1.3 Go
    2.1.4 Other Games
  2.2 Minimax
  2.3 Artificial Intelligence in Card Games
    2.3.1 Bridge
    2.3.2 Poker
    2.3.3 Mahjong
  2.4 Monte Carlo Tree Search
    2.4.1 State of the Art
    2.4.2 The Algorithm
    2.4.3 Upper Confidence Bounds for Trees
    2.4.4 Benefits
    2.4.5 Drawbacks
  2.5 Monte Carlo Tree Search in Games with Imperfect Information
    2.5.1 Determinization Technique
    2.5.2 Information Set Monte Carlo Tree Search
  2.6 Summary
3 Scopone
  3.1 History
  3.2 Cards
  3.3 Game Rules
    3.3.1 Dealer Selection
    3.3.2 Cards Distribution
    3.3.3 Gameplay
    3.3.4 Scoring
  3.4 Variants
    3.4.1 Scopone a 10
    3.4.2 Napola
    3.4.3 Scopa d'Assi or Asso piglia tutto
    3.4.4 Sbarazzino
    3.4.5 Rebello
    3.4.6 Scopa a 15
    3.4.7 Scopa a perdere
  3.5 Summary
4 Rule-Based Artificial Intelligence for Scopone
  4.1 Greedy Strategy
  4.2 Chitarrella-Saracino Strategy
    4.2.1 The Spariglio
    4.2.2 The Mulinello
    4.2.3 Double and Triple Cards
    4.2.4 The Play of Sevens
    4.2.5 The Algorithm
  4.3 Cicuti-Guardamagna Strategy
    4.3.1 The Rules
    4.3.2 The Algorithm
  4.4 Summary
5 Monte Carlo Tree Search Algorithms for Scopone
  5.1 Monte Carlo Tree Search Basic Algorithm
  5.2 Information Set Monte Carlo Tree Search Basic Algorithm
  5.3 Reward Methods
  5.4 Simulation Strategies
  5.5 Reducing the Number of Player Moves
  5.6 Determinization with the Cards Guessing System
  5.7 Summary
6 Experiments
  6.1 Experimental Setup
  6.2 Random Playing
  6.3 Rule-Based Artificial Intelligence
  6.4 Monte Carlo Tree Search
    6.4.1 Reward Methods
    6.4.2 Upper Confidence Bounds for Trees Constant
    6.4.3 Simulation Strategies
    6.4.4 Reducing the Number of Player Moves
  6.5 Information Set Monte Carlo Tree Search
    6.5.1 Reward Methods
    6.5.2 Upper Confidence Bounds for Trees Constant
    6.5.3 Simulation Strategies
    6.5.4 Reducing the Number of Player Moves
    6.5.5 Determinizators
  6.6 Final Tournament
  6.7 Summary
7 Conclusions
  7.1 Future Research
A The Applications
  A.1 The Experiments Framework
  A.2 The User Application
Bibliography

List of Figures

2.1 Steps of the Monte Carlo Tree Search algorithm.
3.1 Official deck of the Federazione Italiana Gioco Scopone.
3.2 Starting position of a round from the point of view of one player. Each player holds nine cards and there are four cards on the table.
3.3 The player's pile after scoring two scopa.
6.1 Reward methods comparison on the Monte Carlo Tree Search winning rate as a function of the number of iterations when it plays against the Greedy strategy as the hand team.
6.2 Reward methods comparison on the Monte Carlo Tree Search winning rate as a function of the number of iterations when it plays against the Greedy strategy as the deck team.
6.3 Scores Difference and Win or Loss reward methods comparison on the Monte Carlo Tree Search winning rate as a function of the Upper Confidence Bounds for Trees constant when it plays against the Greedy strategy as the hand team.
6.4 Scores Difference and Win or Loss reward methods comparison on the Monte Carlo Tree Search winning rate as a function of the Upper Confidence Bounds for Trees constant when it plays against the Greedy strategy as the deck team.
6.5 Winning rate of Monte Carlo Tree Search as a function of the value used by the Epsilon-Greedy simulation strategy when it plays against the Greedy strategy, both as the hand team and as the deck team.
6.6 Simulation strategies comparison on the Monte Carlo Tree Search winning rate when it plays against the Greedy strategy, both as the hand team and as the deck team.
6.7 Moves handlers comparison on the Monte Carlo Tree Search winning rate when it plays against the Greedy strategy, both as the hand team and as the deck team.
6.8 Moves handlers comparison on the Monte Carlo Tree Search winning rate when it plays against the Chitarrella-Saracino strategy, both as the hand team and as the deck team.
6.9 Reward methods comparison on the Information Set Monte Carlo Tree Search winning rate as a function of the number of iterations when it plays against the Greedy strategy as the hand team.
6.10 Reward methods comparison on the Information Set Monte Carlo Tree Search winning rate as a function of the number of iterations when it plays against the Greedy strategy as the deck team.
6.11 Scores Difference and Win or Loss reward methods comparison on the Information Set Monte Carlo Tree Search winning rate as a function of the Upper Confidence Bounds for Trees constant when it plays against the Greedy strategy as the hand team.
6.12 Scores Difference and Win or Loss reward methods comparison on the Information Set Monte Carlo Tree Search winning rate as a function of the Upper Confidence Bounds for Trees constant when it plays against the Greedy strategy as the deck team.
6.13 Winning rate of Information Set Monte Carlo Tree Search as a function of the value used by the Epsilon-Greedy simulation strategy when it plays against the Greedy strategy, both as the hand team and as the deck team.
6.14 Simulation strategies comparison on the Information Set Monte Carlo Tree Search winning rate when it plays against the Greedy strategy, both as the hand team and as the deck team.
6.15 Moves handlers comparison on the Information Set Monte Carlo Tree Search winning rate when it plays against the Greedy strategy, both as the hand team and as the deck team.
6.16 Moves handlers comparison on the Information Set Monte Carlo Tree Search winning rate when it plays against the Chitarrella-Saracino strategy, both as the hand team and as the deck team.
6.17 Determinizators comparison on the Information Set Monte Carlo Tree Search winning rate when it plays against the Chitarrella-Saracino strategy, both as the hand team and as the deck team.
A.1 Menu of the user application.
A.2 Round of the user application.
A.3 Scores at the end of a round of the user application.

List of Tables

3.1 Card values for the calculation of the primiera. Each column shows the correspondence between a card's rank and the value used.
3.2 Example of score calculation. The first part shows the cards in the pile of each team at the end of a round. The second part shows the points achieved by each team and the final scores of the round.
4.1 Card values used by the Greedy strategy to determine the importance of a card. Each column of the first part shows the correspondence between a card in the suit of coins and the value used. Each column of the second part shows the correspondence between a card's rank in one of the other suits and the value used.
5.1 Complexity values of the Monte Carlo Tree Search algorithm applied to Scopone.
5.2 Complexity values of the Information Set Monte Carlo Tree Search algorithm applied to Scopone.
6.1 Winning rates of the hand team and the deck team, and the percentage of ties, when both teams play randomly.
6.2 Tournament between the different versions of the Chitarrella-Saracino strategy. In each section, the artificial intelligences used by the hand team are listed on the left, while those used by the deck team are listed at the bottom. The first section shows the percentage of wins of the hand team, the second shows the percentage of losses of the hand team, and the third shows the percentage of ties.
6.3 Scoreboard of the Chitarrella-Saracino strategies tournament. It shows the percentage of wins, losses, and ties each artificial intelligence achieved during the tournament.
6.4 Tournament between the rule-based artificial intelligences. In each section, the artificial intelligences used by the hand team are listed on the left, while those used by the deck team are listed at the bottom.