Temporal Difference Learning Algorithms Have Been Used Successfully to Train Neural Networks to Play Backgammon at Human Expert Level
USING TEMPORAL DIFFERENCE LEARNING TO TRAIN PLAYERS OF NONDETERMINISTIC BOARD GAMES by GLENN F. MATTHEWS (Under the direction of Khaled Rasheed) ABSTRACT Temporal difference learning algorithms have been used successfully to train neural networks to play backgammon at human expert level. This approach has subsequently been applied to deter- ministic games such as chess and Go with little success, but few have attempted to apply it to other nondeterministic games. We use temporal difference learning to train neural networks for four such games: backgammon, hypergammon, pachisi, and Parcheesi. We investigate the influ- ence of two training variables on these networks: the source of training data (learner-versus-self or learner-versus-other game play) and its structure (a simple encoding of the board layout, a set of derived board features, or a combination of both of these). We show that this approach is viable for all four games, that self-play can provide effective training data, and that the combination of raw and derived features allows for the development of stronger players. INDEX WORDS: Temporal difference learning, Board games, Neural networks, Machine learning, Backgammon, Parcheesi, Pachisi, Hypergammon, Hyper-backgammon, Learning environments, Board representations, Truncated unary encoding, Derived features, Smart features USING TEMPORAL DIFFERENCE LEARNING TO TRAIN PLAYERS OF NONDETERMINISTIC BOARD GAMES by GLENN F. MATTHEWS B.S., Georgia Institute of Technology, 2004 A Thesis Submitted to the Graduate Faculty of The University of Georgia in Partial Fulfillment of the Requirements for the Degree MASTER OF SCIENCE ATHENS,GEORGIA 2006 © 2006 Glenn F. Matthews All Rights Reserved USING TEMPORAL DIFFERENCE LEARNING TO TRAIN PLAYERS OF NONDETERMINISTIC BOARD GAMES by GLENN F.
[Show full text]