A Game-centric Approach to Teaching Artificial Intelligence

Marc Pouly, Thomas Koller and Ruedi Arnold
School of Information Technology, Lucerne University of Applied Sciences and Arts, Switzerland

Keywords: Teaching Artificial Intelligence, Card Game Playing Platform, Open Source Software, Student Competition, Contextualized Education.

Abstract: Man vs. machine competitions have always attracted much public attention, and the famous defeats of human champions in chess, Jeopardy!, Go or poker undoubtedly mark important milestones in the history of artificial intelligence. In this article we reflect on our experiences with a game-centric approach to teaching artificial intelligence that follows the historical development of algorithms by popping the hood of these champion bots. Moreover, we made available a server infrastructure for playing card games in perfect information and imperfect information playing mode, where students can evaluate their implementations of increasingly sophisticated game-playing algorithms in weekly online competitions, i.e. from rule-based systems to exhaustive and heuristic search in game trees to deep learning enhanced Monte Carlo methods and reinforcement learning completely freed of human domain knowledge. The evaluation of this particular course setting revealed enthusiastic feedback not only from students but also from the university authority. What started as an experiment became part of the standard computer science curriculum after just one implementation.

1 INTRODUCTION

The online news platform New Atlas called 2017 "The year AI beat us at all our own games", referring to the memorable victory of DeepMind's AlphaGo Master against Ke Jie, the world No. 1 ranked Go player, in May 2017 and the seminal announcement of AlphaGo Zero in October 2017, stronger than any previous human-champion-defeating version of AlphaGo. But the same year also saw DeepStack and Libratus independently conquering several human poker champions in No Limit Texas Hold'em, as well as the participation of OpenAI in a giant eSports tournament in August 2017, beating world top players in the online battle game Dota 2 (Haridy, 2017).

Such public exhibitions of human against artificial intelligence in various strategic board, card and video games have always attracted much interest: from one of the earliest public clashes of man and machine at the industrial fair in Berlin in 1951, where an archaic computer called Nimrod beat the former secretary of commerce of West Germany in the game of Nim (Redheffer, 1948; Baker, 2010), over the legendary victories of IBM Deep Blue against the chess grandmaster Garry Kasparov in 1997 (Campbell et al., 2002) and IBM Watson against two human Jeopardy! champions in a series of public TV shows in 2011 (Kelly and Hamm, 2013), to the most recent breakthroughs in Go, poker and various eSports disciplines. Irrespective of the certainly welcome news headlines for artificial intelligence research, set off each time the human race had to surrender in another popular game, the development of intelligent game-playing agents has always played a much more significant role. After all, chess was famously referred to as the drosophila of artificial intelligence (McCarthy, 1990), and after 1997 this role was attached to the Chinese board game Go, as experts had thought it could take another hundred years before an AI system beats a human Go champion (Johnson, 1997). Throughout the course of history, AI has mastered increasing levels of game complexity, including the vast search spaces of Go, imperfect information and randomness in card games, and even socio-behavioral aspects (e.g. bluffing in poker) that are typically attributed to humans.
These big headlines therefore witness the algorithmic progress of artificial intelligence research, i.e. from rule-based systems to exhaustive and heuristic search in game trees to deep learning enhanced Monte Carlo methods and reinforcement learning freed of any human domain expertise. At the same time, they were made possible only by important technological advances: the Nimrod was basically a public demonstration of early computer technology; Deep Blue, a massively parallel architecture of dedicated chess chips (Campbell et al., 2002); whereas with Watson, IBM ultimately proved their expertise in building supercomputers, culminating in Google's demonstration of GPU technology for deep learning with AlphaGo and AlphaGo Zero.

Marc Pouly: https://orcid.org/0000-0002-9520-4799
Thomas Koller: https://orcid.org/0000-0003-2309-5359
Ruedi Arnold: https://orcid.org/0000-0001-7185-305X

Pouly, M., Koller, T. and Arnold, R. A Game-centric Approach to Teaching Artificial Intelligence. DOI: 10.5220/0007745203980404. In Proceedings of the 11th International Conference on Computer Supported Education (CSEDU 2019), pages 398-404. ISBN: 978-989-758-367-4. Copyright © 2019 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
This historical evolution of ideas, methods and technology, complemented by the intriguing aspects and stories around man-machine competitions and their broad coverage in research literature and popular media, made us raise the idea of experimentally designing an introductory class to artificial intelligence at Bachelor's level along these developments, leaving the trodden path of just following the well-established syllabus of one of the major textbooks on the topic. Moreover, by making available a self-developed software framework together with a server application for in-class tournaments, we gave our students the opportunity not only to design and implement their own game-playing agents with increasingly complex and stronger methods, but also to frequently run tournaments and obtain direct feedback on their approaches. AI courses with in-class tournaments or even participation in open challenges, e.g. on Kaggle (https://www.kaggle.com/competitions), have become fairly common in recent times. However, we aimed to push this idea to an extreme by running weekly student competitions for each game-playing method presented in class. The subsequent chapters survey our first attempts and experiences in setting up this program, initially limited to a single-semester course but with a clear ambition to spread over several courses with different emphases. We will further justify our selection of a card game for these competitions and sketch the software framework and server infrastructure; we make our code available as open-source projects to the community in the hope of spreading and challenging our guiding principles and the proposed didactical method.

2 A GAME-CENTRIC SYLLABUS

Our course started off with an introduction to sequential games with perfect information, laying out basic concepts such as game trees, backward induction, sub-game perfect Nash equilibria and the notion of strongly and weakly solved games. This already provides sufficient background to pop the hood of the Nimrod, maybe the first artificial intelligence to beat a human in a public exhibition in 1951, and to implement perfect play for a selection of simple games including Nim and tic-tac-toe. From there we intensified the discussion of state space complexity and game tree modelling, and introduced the notion of zero-sum games, the famous minimax algorithm with its extension to multiple players, and alpha-beta pruning. A brute-force analysis of the connect four game, yielding a perfect player based on simple database lookup, became possible in 1995; note, however, that the game had already been weakly solved in 1988 (Allen, 2010). We next introduced the idea of combining minimax with domain-specific heuristics and elaborated on formal requirements as well as manual and semi-automated design approaches to heuristics, following (Russell and Norvig, 2010). This proved sufficient for a closer analysis of Deep Blue, a massively parallelized minimax implementation with alpha-beta pruning and a heuristic based on 8000 chess features manually tailored by a team of expert chess players, which finally beat a chess grandmaster in 1997 (Campbell et al., 2002). This marked an unprecedented success in AI history but at the same time stands for the extreme edge of what can be achieved by injecting algorithms with hand-crafted human domain knowledge. Also, these approaches do not carry over to games with imperfect information.
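To make the minimax discussion concrete, the following is a minimal sketch of our own (not code from the course or its framework) showing negamax-style minimax with alpha-beta pruning applied to normal-play Nim, one of the simple games mentioned above; the function name and conventions are hypothetical.

```python
def alphabeta(heaps, alpha=-1, beta=1):
    """Game value of a normal-play Nim position from the viewpoint of the
    player to move: +1 if they can force a win, -1 otherwise.
    Negamax formulation of minimax with alpha-beta pruning."""
    if all(h == 0 for h in heaps):
        return -1  # no move left: the previous player took the last object and won
    best = -1
    for i, h in enumerate(heaps):
        for take in range(1, h + 1):
            child = list(heaps)
            child[i] -= take
            # our value is the negation of the opponent's value in the child position
            value = -alphabeta(tuple(sorted(child)), -beta, -alpha)
            best = max(best, value)
            alpha = max(alpha, best)
            if alpha >= beta:  # cutoff: the opponent will never allow this line
                return best
    return best
```

The classical theory of Nim predicts that the player to move loses exactly when the bitwise XOR of the heap sizes is zero, which gives an easy correctness check: `alphabeta((1, 2, 3))` evaluates to -1 since 1 ^ 2 ^ 3 == 0.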
Our course advanced by introducing Monte Carlo tree search (MCTS) with random play-outs and the upper confidence bound for balancing exploration and exploitation. First proposed in 2006 (Coulom, 2006), MCTS has since completely dominated the computer Go scene (Couëtoux et al., 2013). The picture was rounded out by its variants for imperfect information games, called determinization and information set MCTS (Cowling et al., 2012). We paused the historical survey at this point for an introduction to the basic principles of supervised machine learning, including data preparation techniques and a small selection of classifiers: logistic regression, decision trees and random forests. Hereby, it is important to say that we put the main emphasis not on the algorithmic aspects of machine learning but rather on how to correctly evaluate a machine learning classifier with hyperparameter optimization.
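The upper confidence bound for balancing exploration and exploitation mentioned above is commonly instantiated as the UCB1 rule during MCTS node selection. The snippet below is our own minimal sketch with hypothetical names, not part of the course framework.

```python
import math

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child node: average reward (exploitation) plus a
    confidence bonus that shrinks as the node accumulates visits (exploration)."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, parent_visits):
    """Index of the child maximising UCB1; children are (total_reward, visits) pairs."""
    return max(range(len(children)),
               key=lambda i: ucb1(*children[i], parent_visits))
```

In a full MCTS loop this selection step would be followed by expansion, a random play-out and back-propagation of the result along the visited path.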
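The evaluation emphasis described above can be illustrated with k-fold cross-validation used to pick a hyperparameter on held-out folds only. This is a self-contained toy sketch of ours (a one-dimensional nearest-neighbour classifier on synthetic data); all names and data are made up and do not reflect the course material.

```python
import random

def knn_predict(train, x, k):
    """Classify x by majority vote among its k nearest training points (1-D toy data)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = [label for _, label in neighbours]
    return max(set(votes), key=votes.count)

def cross_val_accuracy(data, k, folds=5):
    """Estimate classifier accuracy with k-fold cross-validation."""
    fold_size = len(data) // folds
    scores = []
    for f in range(folds):
        test = data[f * fold_size:(f + 1) * fold_size]
        train = data[:f * fold_size] + data[(f + 1) * fold_size:]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / folds

# Synthetic two-class data: two Gaussians, then a grid search over the
# hyperparameter k evaluated on held-out folds only.
random.seed(0)
data = [(random.gauss(0, 1), 0) for _ in range(50)] + \
       [(random.gauss(3, 1), 1) for _ in range(50)]
random.shuffle(data)
best_k = max([1, 3, 5, 7], key=lambda k: cross_val_accuracy(data, k))
```

The same workflow carries over unchanged to the classifiers actually covered in class (logistic regression, decision trees, random forests) when using a library such as scikit-learn.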
Next, an introduction to deep learning and the open-source neural network library Keras (https://keras.io/) was given, which allowed us to resume our survey of approaches to game-playing agents by combining MCTS with value and policy functions obtained from supervised deep learning, analogously to the original version of AlphaGo (Silver et al., 2016). The final part of this course comprised an introduction to deep reinforcement learning aligned with the approaches implemented in AlphaGo Zero (Silver et al., 2017; Silver et al., 2018).

in-class tournaments because of its increased complexity: Jass requires one player to initially declare
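Returning to the combination of MCTS with policy and value functions discussed above: AlphaGo-style search replaces plain UCB1 by a PUCT-type selection rule in which the exploration bonus is weighted by the policy network's prior probability for each move. The following is our own schematic sketch with hypothetical names, not the published AlphaGo implementation.

```python
import math

def puct_score(q_value, prior, visits, parent_visits, c_puct=1.0):
    """PUCT-style selection score: the value estimate Q (e.g. backed up from a
    value network) plus an exploration bonus scaled by the policy prior."""
    return q_value + c_puct * prior * math.sqrt(parent_visits) / (1 + visits)

def select_move(children, parent_visits):
    """children maps a move to (q_value, prior, visits); return the argmax move."""
    return max(children, key=lambda m: puct_score(*children[m], parent_visits))
```

Note how a high prior lets the search try a barely visited move early, while with growing visit counts the decision is increasingly dominated by the backed-up value estimates.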