A GAME THEORETIC FRAMEWORK FOR COMMUNICATION DECISIONS

IN MULTIAGENT SYSTEMS

by

TUMMALAPALLI SUDHAMSH REDDY

Presented to the Faculty of the Graduate School of

The University of Texas at Arlington in Partial Fulfillment

of the Requirements

for the Degree of

DOCTOR OF PHILOSOPHY

THE UNIVERSITY OF TEXAS AT ARLINGTON

December 2012

Copyright © by TUMMALAPALLI SUDHAMSH REDDY 2012

All Rights Reserved

To my family

ACKNOWLEDGEMENTS

I would like to thank my supervising professor Dr. Manfred Huber and my co-supervising professor Dr. Gergely Zaruba for their invaluable support and advice. This dissertation would not have been possible without their significant effort. I would also like to thank Mr. David Levine for the help and support he has given me over the past few years, first as the thesis advisor for my Master of Science degree and later as a committee member for my dissertation. I would also like to thank Dr. Farhad Kamanger for his helpful comments and for serving on my PhD committee.

I would like to thank my family for their constant support and encouragement.

This work would not have been possible without their support.

November 16, 2012

ABSTRACT

A GAME THEORETIC FRAMEWORK FOR COMMUNICATION DECISIONS

IN MULTIAGENT SYSTEMS

TUMMALAPALLI SUDHAMSH REDDY, Ph.D.

The University of Texas at Arlington, 2012

Supervising Professor: Manfred Huber

Communication is the process of transferring information between multiple entities. We study the communication process between different entities that are modeled as multiple agents. Agents are assumed to be rational and take actions to increase their utility. To study the interactions between these entities we use game theory, which is a mathematical tool used to model the interactions and decision processes of multiple players/agents. In the presence of multiple players, their interactions are generally modelled as a game.

In most cases it is clear that communication can help in coordinating the actions of multiple agents so that they can achieve higher utility; what is not clear is how the agents can make decisions about when to communicate and, more importantly, what to communicate. In this dissertation, we focus on the question of what information the agents should communicate and how they make decisions about selecting the information to communicate.

Here we assume that the communication medium, protocols and language are already present in this multiagent system. We address the question of information selection in the communication process.

In this thesis, we develop a formal framework for communication between different agents using game theory. Our major contributions are:

A classification of multiagent systems and an analysis of what information to communicate in these various cases.

Algorithms for inverse reinforcement learning in multiagent systems, which allow an agent to gain a better understanding of the other agents.

A mathematical framework with which the agents can make two important decisions: when to communicate and, more importantly, what to communicate in different classes of multiagent systems.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS ...... iv

ABSTRACT ...... v

LIST OF ILLUSTRATIONS ...... x

Chapter Page

1. INTRODUCTION ...... 1

1.1 Objective ...... 2

1.2 Methodology ...... 4

1.3 Contribution ...... 5

1.4 Outline ...... 6

2. MULTIAGENT SYSTEMS ...... 7

2.1 Introduction ...... 7

2.2 Single Agent System ...... 8

2.2.1 Markov Decision Process ...... 9

2.2.2 Partially Observable Markov Decision Process ...... 10

2.3 Game Theory ...... 12

2.3.1 Normal Form Game ...... 12

2.3.2 Extensive Form Games ...... 15

2.3.3 Stochastic Games ...... 17

2.4 Communication in MAS ...... 21

3. RELATED WORK ...... 23

3.1 Introduction ...... 23

3.2 Reinforcement Learning in Multiagent Systems ...... 23

3.2.1 NASH Q-Learning ...... 25

3.3 Inverse Reinforcement Learning ...... 26

3.3.1 IRL for MDP\R from Policies ...... 27

3.3.2 IRL for MDP\R from Observations ...... 28

3.4 Communication in MAS ...... 29

4. COMMUNICATION IN MULTIAGENT SYSTEMS ...... 32

4.1 Introduction ...... 32

4.2 Classification of Multiagent Systems ...... 34

4.2.1 Observability ...... 34

4.2.2 Information About Other Agents ...... 37

4.2.3 Agent Type ...... 38

4.3 Modeling of Multiagent Systems and What to Communicate . . . . . 39

5. EQUILIBRIUM SELECTION IN NASHQ ...... 43

5.1 Introduction ...... 43

5.2 Payoff Dominance and Risk Dominance ...... 43

5.3 Risk Dominance Method for Equilibrium Selection ...... 45

5.4 Results ...... 48

6. INVERSE REINFORCEMENT LEARNING FOR DECENTRALIZED NON-

COOPERATIVE MULTIAGENT SYSTEMS ...... 51

6.1 Introduction ...... 51

6.2 IRL in GSSG\R from Policies ...... 54

6.3 IRL in GSSG\R from Trajectories ...... 56

6.4 Experimental Results ...... 58

6.5 Summary ...... 60

7. GAME THEORETIC FRAMEWORK FOR COMMUNICATION IN FULLY

OBSERVABLE MULTIAGENT SYSTEMS ...... 62

7.1 Introduction ...... 62

7.2 Communication in Fully Observable Multiagent System ...... 62

7.3 When to Communicate ...... 63

7.4 What to communicate ...... 64

7.5 Experimental Results ...... 68

7.6 Summary ...... 70

8. LEARNING WITH COMMUNICATION IN FULLY OBSERVABLE MUL-

TIAGENT SYSTEMS ...... 72

8.1 Nash-SARSA ...... 72

8.2 Online Algorithm for Communication ...... 73

8.3 Experimental Results ...... 75

8.4 Summary ...... 77

9. LEARNING WITH COMMUNICATION IN PARTIALLY OBSERVABLE

MULTIAGENT SYSTEMS ...... 78

9.1 Planning in POSGs ...... 78

9.2 Linear Nash-Q ...... 79

9.3 Belief State Update ...... 81

9.4 Communication in POSG ...... 85

9.5 Communication Game ...... 86

10. CONCLUSION ...... 88

10.1 Future Work ...... 88

REFERENCES ...... 90

BIOGRAPHICAL STATEMENT ...... 97

LIST OF ILLUSTRATIONS

Figure Page

2.1 Agent interacts with environment ...... 8

2.2 A simple bimatrix game ...... 13

2.3 A simple bimatrix game with 2 players and 2 actions ...... 14

2.4 A simple bimatrix game with 2 players and 2 actions ...... 15

3.1 IRL in Single agent systems ...... 27

4.1 Gridworld game ...... 33

4.2 Classification of the multiagent systems ...... 35

4.3 Classification of the multiagent systems based on observability of the

world ...... 36

4.4 Partially observable gridworld ...... 36

4.5 Classification of the multiagent systems based on information about

other agents ...... 38

4.6 Classification of the multiagent systems based on similarities between

agents ...... 38

4.7 Models of the Different Classes of Non-Cooperative MAS ...... 40

5.1 Bimatrix game format ...... 44

5.2 A simple bimatrix game ...... 44

5.3 Left:The Gridworld game; Right: Agent policies ...... 49

6.1 Linear Program for IRL ...... 56

6.2 IRL from trajectories ...... 58

6.3 Grid World with rewards for Agent 1 and Agent 2 ...... 59

6.4 Estimated reward for Agent 1 ...... 60

6.5 Estimated reward for Agent 2 ...... 61

7.1 A simple communication game modeled as an Extensive Form Game . 67

7.2 Gridworld game: policy without communication (left) and policy with

communication (right) ...... 68

7.3 A part of the Communication as an Extensive Form Game ...... 71

8.1 A simple communication game modeled as an Extensive Form Game . 75

8.2 Gridworld game example: (a) describes a simple gridworld example. (b)

shows the policies learned by the agents. (c) shows a different example

of the gridworld. (d) shows the policies learned by the agents . . . . . 76

8.3 Gridworld game example: (a) Describes a simple gridworld example.

(b) Shows the policies learned by the agents when communication is

allowed ...... 77

9.1 Belief tracking without communication ...... 85

9.2 Belief tracking with observation communication ...... 87

CHAPTER 1

INTRODUCTION

Communication is a fundamental aspect of all sentient organisms that are capable of reasoning, be they humans or animals communicating verbally, or unicellular organisms such as bacteria communicating using signalling [1]. Communication can be defined as the process of conveying information between sentient beings. While the process of communication differs between organisms, the question of what information to communicate is central to all of them.

There are many forms of communication, which differ based on the abilities of the organism or entity that is communicating. In human beings, communication is generally done consciously, either verbally using a specific language that all people in the group can reasonably understand, or visually using a specific sign language such as the American Sign Language system, or subconsciously using certain body language or facial expressions. The more formally defined the language and communication process is, the better it is for communicating information and for understanding the communicated information.

In most cases there is a reason to communicate any information. We believe that the true value of communication is based on that reason and the impact it has on the long term benefit of the communicating entity¹.

¹In this document we use the terms agent, player, and entity to mean a rational agent that acts in the environment to maximize its long term benefit.

To study the interactions between these entities we model them as rational agents [2] in a multiagent system (MAS) [3] and then use game theory [3] to model the interactions between these agents.

A rational agent is defined as an agent that always takes an action which maximizes its performance measure.

Game theory makes certain assumptions about the agents in the multiagent system. Mainly, it assumes that all agents are strictly rational and that the agents act in the environment in a way such that they maximize their own rewards. This assumption makes all MAS fundamentally non-cooperative or selfish in nature. But in certain cases where the agents have to cooperate to accomplish some task, this behaviour of cooperation is enforced by the reward function of each individual agent.

This behaviour is also known as .

1.1 Objective

Agents often need to communicate to accomplish their tasks. This is true in both cooperative and non-cooperative systems. While it is clear in most cases that communication is needed, it is not clear what piece of information needs to be communicated. This problem is compounded by the fact that communication is not always free. Most of the research on communication in MAS has focused on developing communication languages [4], robust protocols [5], and cross platform communication standards [6]. Recently there has been some interest in what information to communicate [7]. While one approach is to use an information theoretic framework, we believe a game/utility theoretic framework is more appropriate for deciding which information to communicate.

If one were to take a top-down approach to communication, one can break the communication process into multiple components:

1. How to communicate: this addresses the discovery of communication services, communication protocols, etc.

2. When to communicate: at which time step of the execution should the agents communicate? Should they communicate their entire policy for all the states in the world at the beginning of the game? Or should they decide to send information only when it is useful?

3. What to communicate: this refers to the information selection process. Since there is a cost to sending messages, the agents must try to minimize the cost of communication to increase their utility over time. The agents must be able to compute which pieces of information are the most useful to communicate.

4. Understanding communication: this is concerned with common language development. Agents must be able to understand the information the other agents are sending.

There have been various other works on the different aspects of communication in cooperative systems. Here we attempt to answer the question of what information to communicate, but in a much broader multiagent framework. We make no assumptions about cooperation between agents; if the agents are cooperative, this is reflected in their reward functions. The framework we propose also addresses the issue of incomplete information about both the world and the other agents in the world. We also present inverse reinforcement learning algorithms that can be used to estimate the reward functions of other agents in cases where information about the other agents' reward functions is not available.

The framework presented here is different from most previous work on using game theory for communication [8] [9], which is mainly focused on using game theory to model the communication process such that the agents (the hearer) understand the information that is communicated by other agents (the speaker). In our research we assume that when an agent conveys some information to another agent, the receiving agent understands the information perfectly.

1.2 Methodology

We study the communication between intelligent agents. We model the MAS using the stochastic game framework [10] from game theory. In stochastic games each agent has its own individual reward function that is used to describe the objective of the agent. The agents select their actions to maximize their utility based on their reward functions. The world is divided into states based on some characteristic of the world. For example, in the case of a simple grid world the states are created based on the (x, y) coordinates of the agents' locations.

The agents probabilistically transition from one state to another based on the joint actions of all the agents. After each transition the agents get an individual reward for the state. The objective of each agent is to maximize these rewards. We use techniques from reinforcement learning to learn the policy for each agent, where the policy describes the actions that the agent takes at each state of the game.

We classify the multiagent systems based on the observability of the state. If the agents have a perfect observation of the current state of the system then the multiagent system is known as a fully observable multiagent system. If the agents only have a partial observability of the current state, then the multiagent system is known as a partially observable multiagent system.

In the case of both fully observable decentralized MAS and partially observable decentralized MAS, each agent must compute its own policy. It is often the case that the agents learn different policies, which leads to miscoordination between agents when they execute their learned policies. We use communication as a mechanism to avoid this problem of miscoordination. Communication is also useful in the case of

We model the communication process using another framework from game theory known as extensive form games, where each communication action has a reward associated with it. By modelling the communication as a game the agents can select their communication actions based on their numerical values such that they maximize their long term benefit by reducing the short term cost of communication.

1.3 Contribution

The main contributions of this thesis, in order of their presentation, are:

• We classify multiagent systems based on certain key characteristics of the agents and the environment, such as the abilities of the individual agents, the observability of the world, and the information that is available about the other agents. We then analyse what information to communicate in these various cases.

• We present a minor enhancement to the popular NashQ learning algorithm, which addresses the question of equilibrium selection during learning.

• When information about the reward function of the other agents is not available, how do we estimate this information based on other information about the agents? Algorithms for inverse reinforcement learning in MAS are presented to resolve this question.

• We present a mathematical framework based on which the agents can make two important decisions: when to communicate and, more importantly, what to communicate in different classes of MAS.

• We model the communication process between agents as a game using the extensive form game framework, which allows the agents to numerically evaluate the value of communicating any piece of information during the execution process.

• We propose a learning algorithm that uses the communication game during the learning process to improve on the previous solution, which uses the communication game only during execution.

1.4 Outline

The rest of the thesis is organized as follows:

Chapter 2 introduces some background material about the field of MAS, communication in MAS, and some formal models from game theory that are used in the rest of the thesis. In Chapter 3, we review some literature that addresses the question of planning and communication in MAS. We also present some work on inverse reinforcement learning in single agent systems.

In Chapter 4, we look at communication in MAS in detail. We present different classifications of the MAS and the information that should be communicated in these different classes of MAS. In Chapter 5, we present our enhancement to the Nash-Q learning algorithm. In Chapter 6, we present our inverse reinforcement learning algorithm in MAS [11].

We describe our communication game in Chapter 7 and present an algorithm to use this game in the execution phase of the MAS [12]. In Chapter 8, we use the communication game during the learning phase instead of the execution phase.

In Chapter 9 we extend our communication game to the case of partially observable MAS. We present our conclusions and future work in Chapter 10.

CHAPTER 2

MULTIAGENT SYSTEMS

2.1 Introduction

A multiagent system (MAS) is any system where multiple actors/agents are acting in a way such that an agent's actions impact the other agents. Each agent in a multiagent system takes actions to maximize its utility. The field of MAS is very diverse, with interests and contributions from fields such as artificial intelligence, distributed systems, economics, operations research and many others. The main advantage of MAS is that they allow us to model more general systems with a higher degree of precision than single agent systems. A lot of research has been done in the field of MAS, addressing issues such as generating optimal policies [13] for the agents, dynamic team formation [14], communication [15] [6], negotiation [16], distributed problem solving [17], agent-oriented software design [18] [19], multiagent learning [13] [12], multi-robot coordination [20] [21], etc. A MAS can be either cooperative or non-cooperative.

The main objective of the agents in MAS is to learn how to act in the environment. Learning in MAS is hard and is made more complex by the presence of multiple decision makers, which raises issues such as conflicts of interest between agents, a lack of coordination between agents, and the fact that the agents need to consider how their current interactions can impact future interactions with other agents. Both repeated games and stochastic games are useful to model the impact of current actions on future interactions. In this thesis we model our MAS using a game theoretic framework known as stochastic games, which is explained in this chapter.

In the stochastic game [10] framework, all aspects of the agents are modeled directly or indirectly. For example, the actions available to the agent are modeled directly, whereas the ability of the agent to execute the actions perfectly is modeled as the probability of the agents transitioning from a state to the next state. The major focus of stochastic games is on the interactions between the agents.

In this chapter, we first look at single agent systems and two methods to model single agent systems. We then look at various methods to model multiagent systems using game theory and briefly discuss communication in multiagent systems, which will be talked about in more detail in future chapters.

2.2 Single Agent System

According to Russell and Norvig (2010), "An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators [2]." This idea is illustrated in Figure 2.1 [2].

Figure 2.1. Agent interacts with environment.

At each point in time, the environment can be in one of a finite number of states. Each state is represented by certain characteristics of the environment. Based on the above definition, the agent perceives these characteristics through its sensors and then takes an action through its actuators. These actions may cause the world to transition from one state to another. Each state also has a reward associated with it. The higher the numerical value of the reward the better it is for the agent. The goal of the agent is to take actions such that it maximizes its long term utility.

Most single agent systems are modeled as either a Markov Decision Process

(MDP) or as a Partially Observable Markov Decision Process (POMDP).

2.2.1 Markov Decision Process

An MDP can be used to model a single agent system. While an MDP in itself is not of great importance to this work, some of the solutions that are used with regard to MDPs can be easily extended to the case of MAS. An MDP is defined as a tuple ⟨S, A, P, γ, R⟩, where

• S is a finite set of M states

• A = {a1, ..., ak} is a set of k actions

• P : S × A × S → R is the state transition probability function. P(s′|s, a) denotes the probability of transitioning from state s to state s′ by taking action a.

• R :S → R is the reward function. R(s) denotes the reward obtained by entering state s.

• γ ∈ [0,1) is the discount factor.

A deterministic policy in a MDP is defined as a mapping π : S → A, where

π(s) = a denotes that in state s, under policy π, we always execute action a. The

value function for a policy π is defined as the expected reward that the agent obtains by executing the policy starting from state s, discounted over time.

$$V^\pi(s) = E\left[R(s_1) + \gamma R(s_2) + \gamma^2 R(s_3) + \cdots \mid \pi\right]$$

where the expectation is over the sequence of states (s₁, s₂, ...) that are encountered while starting from state s₁ = s and executing the policy π. The above equation is commonly represented using the Bellman equations, which are written as:

$$V^\pi(s) = R(s) + \gamma \sum_{s'} P(s'|s, \pi(s)) V^\pi(s') \quad (2.1)$$

$$Q^\pi(s, a) = R(s) + \gamma \sum_{s'} P(s'|s, a) V^\pi(s') \quad (2.2)$$

Given an MDP, the agent tries to find an optimal policy (as is generally the case in traditional reinforcement learning). An MDP can have multiple optimal policies; as such we use $\pi^*$ to denote an optimal policy. The value function for the optimal policy is denoted as $V^{\pi^*}(s)$ and the optimal Q-function is denoted as $Q^{\pi^*}(s, a)$. The optimal policy should maximize the value for each state in the MDP. This is expressed in the

Bellman optimality condition as:

$$\pi^*(s) \in \arg\max_{a} Q^{\pi^*}(s, a) \quad (2.3)$$
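To make Equations (2.1)–(2.3) concrete, here is a minimal value-iteration sketch in Python. The array layout, the reward formulation R(s), and the tiny two-state example are illustrative assumptions, not taken from this thesis.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Approximate V* and a greedy policy for a small MDP.

    P: transitions, P[a, s, s'] = P(s'|s, a); R: rewards, R[s] = R(s).
    """
    V = np.zeros(P.shape[1])
    while True:
        # Q[a, s] = R(s) + gamma * sum_s' P(s'|s, a) V(s')   (cf. Eq. 2.2)
        Q = R[np.newaxis, :] + gamma * P.dot(V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    policy = Q.argmax(axis=0)          # pi*(s) in argmax_a Q*(s, a)   (cf. Eq. 2.3)
    return V, policy

# Hypothetical 2-state, 2-action MDP with an absorbing rewarding state.
P = np.array([[[0.8, 0.2], [0.0, 1.0]],
              [[0.2, 0.8], [0.0, 1.0]]])
R = np.array([0.0, 1.0])
print(value_iteration(P, R))
```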

2.2.2 Partially Observable Markov Decision Process

A POMDP is a generalization of the MDP explained above. In an MDP the state of the world is fully observable, whereas in a POMDP the agent only has partial observability regarding the state. Formally, a POMDP is defined as a tuple ⟨S, A, O, P, Z, γ, R⟩, where

• S is a finite set of M states

• A = {a1, ..., ak} is a set of k actions

• O is a finite set of observations.

• P : S × A × S → R is the state transition probability function. P(s′|s, a) denotes the probability of transitioning from state s to state s′ by taking action a.

• Z : S × A × O → R is the observation probability function. Z(o|s, a) denotes the probability of observing observation o while in state s and taking action a.

• R :S → R is the reward function. R(s) denotes the reward obtained by entering state s.

• γ ∈ [0,1) is the discount factor.

Since the actual state of the agent is not always available in a POMDP, the notion of a belief state is used. A belief state b is a probability distribution over the set of actual states in the system. b(s) denotes the probability assigned to state s by the belief state b. The probability of the next state s′ can be calculated from the current belief by using the following equation

$$b'(s') = \alpha Z(o|s', a) \sum_{s} P(s'|s, a) b(s) \quad (2.4)$$

where α is a normalizing constant that is calculated by the following equation,

$$\alpha = \frac{1}{\sum_{s' \in S} Z(o|s', a) \sum_{s \in S} P(s'|s, a) b(s)} \quad (2.5)$$

The optimal policy π*(b) is a mapping from belief states to actions. A POMDP is computationally more complex than an MDP.
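The belief update of Equations (2.4) and (2.5) can be sketched as follows; the array layout (P[a, s, s′] and Z[a, s′, o]) is an assumed convention for illustration, not prescribed by the POMDP definition above.

```python
import numpy as np

def belief_update(b, a, o, P, Z):
    """One step of the POMDP belief update, Eqs. (2.4)-(2.5).

    b: current belief over states, shape (|S|,)
    P: transitions, P[a, s, s'] = P(s'|s, a)
    Z: observations, Z[a, s', o] = Z(o|s', a)
    """
    predicted = b.dot(P[a])              # sum_s P(s'|s, a) b(s)
    unnormalized = Z[a, :, o] * predicted
    alpha = 1.0 / unnormalized.sum()     # Eq. (2.5)
    return alpha * unnormalized          # Eq. (2.4)
```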

The main difference between an MDP and a POMDP is that in a POMDP the optimal action depends on the current belief state of the agent, as compared to the actual state of the agent in an MDP.

2.3 Game Theory

Game theory is a mathematical framework for studying the interactions between independent and rational agents. Rationality in game theory means that the agents act in the environment by balancing costs against rewards such that they maximize their personal rewards or utilities. This game theoretic framework lends itself to the field of MAS as it can be used to model the world and the agents' behaviour very easily. We first introduce normal form games, which are the simplest form of games that can be used to model multiple agents. Normal form games are used to represent single shot games where the agents only take actions for one time step and then the game ends. We then introduce the concept of Nash equilibrium in normal form games, before we look into more complex games such as extensive form games (EFG) and stochastic games, which can be used to model agents interacting over multiple time steps.

2.3.1 Normal Form Game

A finite, n-player Normal Form Game can be defined as tuple G = (N, A, u), where,

• N is the finite number of players in the game

• A = A1 × A2 × ... × An is the finite set of actions available to the agents, where Ai is the finite set of actions available to player i. A vector (a1, ..., an) ∈ A is called an action profile and represents a joint action defining the individual action taken by each of the agents.

• u = (u1, ..., un) is the joint utility function of the agents, where ui : A → R is a function that maps an action profile to a real-valued utility for player i.

These normal form games can be easily represented in the form of matrices, where the rows represent the actions of one agent and the columns represent the actions of the other agent. When there are only two agents or players, the games are known as bimatrix games.

                      Player 2
                 Action 1   Action 2
Player 1  Action 1    4,4        0,3
          Action 2    3,0        2,2

Figure 2.2. A simple bimatrix game.

In the above game the rewards or utilities for both agents following a joint action are given by the values in each cell; for example, agent 1 gets reward 4 and agent 2 gets reward 4 when agent 1 takes action 1 and agent 2 takes action 1. We look at some basic concepts such as strategies, mixed strategies, expected utilities of a mixed strategy, and best responses before we describe the concept of Nash equilibrium.

In the game shown in Figure 2.2, we talk about the actions that are available

to both agents, but not their set of strategies.

In game theory, a pure strategy [3] is generally defined as the action that an agent always takes. If all the agents follow a pure strategy, then we call this vector of pure strategies a pure strategy profile.

In the case of a mixed strategy [3], a strategy sti is known as a mixed strategy for agent i if the agent picks its actions based on a probability distribution over its set of available actions. By sti(ai) we denote the probability that action ai will be played under the mixed strategy sti. We use STi to denote the set of mixed strategies for agent i.

The support [3] of a mixed strategy sti for an agent i is the set of pure strategies ai , such that, sti(ai) > 0. It can also be said that the mixed strategy is a probabilistic distribution over a set of pure strategies for the agent.

                      Player 2
                 Action 1   Action 2
Player 1  Action 1    1,0       -1,-1
          Action 2    2,3        5,6

Figure 2.3. A simple bimatrix game with 2 players and 2 actions.

A pure strategy can be considered a special case of a mixed strategy where the

support is a single action. A majority of the interesting games have mixed strategies.

The expected utility ui for player i of a mixed strategy profile st = (st1, ..., stn) is defined as

$$u_i(st) = \sum_{a \in A} u_i(a) \prod_{j=1}^{n} st_j(a_j)$$
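The expected utility above can be computed directly by enumerating joint actions. In the hedged sketch below, the payoff matrix is player 1's payoffs from Figure 2.2, while the uniform mixed strategies are an illustrative assumption.

```python
import itertools
import numpy as np

def expected_utility(payoff, strategies):
    """Expected utility of one player under a mixed-strategy profile.

    payoff: array indexed by the joint action, giving u_i(a).
    strategies: list of 1-D arrays, strategies[j][a_j] = st_j(a_j).
    """
    total = 0.0
    for joint in itertools.product(*(range(len(s)) for s in strategies)):
        prob = np.prod([strategies[j][a_j] for j, a_j in enumerate(joint)])
        total += payoff[joint] * prob
    return total

u1 = np.array([[4.0, 0.0], [3.0, 2.0]])       # player 1's payoffs in Figure 2.2
print(expected_utility(u1, [np.array([0.5, 0.5]), np.array([0.5, 0.5])]))
```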

We now look at the concept of a best response. We define $st_{-i} = (st_1, ..., st_{i-1}, st_{i+1}, ..., st_n)$, where $st_{-i}$ is a strategy profile containing the strategies of all agents other than agent i. Player i's best response to a strategy profile $st_{-i}$ is a mixed strategy $st_i^*$ such that $u_i(st_i^*, st_{-i}) \geq u_i(st_i, st_{-i})$ for all strategies $st_i \in ST_i$. Based on the above definition of a best response we can now define a Nash equilibrium as a strategy profile $st = (st_1, ..., st_n)$ where, for all agents i, $st_i$ is a best response to $st_{-i}$. The Nash equilibrium is considered to be strict or weak depending on whether the strategy profile st contains a unique best response for all the agents or not. In a vast majority of domains, a strict Nash equilibrium is seen very rarely.

A weak Nash equilibrium is the more commonly seen solution in MAS.

In the example shown in Figure 2.3, we can see that the action profile, where both agents take action 2, represented as (2,2), is a Nash equilibrium and both the agents play that Nash equilibrium. Now let us look at a slightly different example.

In the example shown in Figure 2.4, there are two Nash equilibria, the action profile (1,1) and the action profile (2,2).

                      Player 2
                 Action 1   Action 2
Player 1  Action 1    4,4       -1,-1
          Action 2    2,3        4,4

Figure 2.4. A simple bimatrix game with 2 players and 2 actions.

In this case how do the agents select a consistent Nash equilibrium? This problem is compounded by the fact that both agents must play at the same time. If one agent plays after the other, then the agent going second will simply tailor its response to match the response of the first agent to ensure they both get a high reward. But in the case where both agents must take actions at the same time, we cannot guarantee that the resulting action profile will be a Nash equilibrium. While there are other equilibria that can be used to resolve this issue, such as the more recently proposed compromise equilibrium, these require slightly different assumptions. We will look at how to resolve this issue by using communication.
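The multiple-equilibria problem in Figure 2.4 can be made concrete with a small helper that enumerates the pure-strategy Nash equilibria of a bimatrix game. The payoff matrices below are transcribed from Figure 2.4; the helper itself is only an illustrative sketch.

```python
import numpy as np

def pure_nash_equilibria(u1, u2):
    """Return all pure-strategy Nash equilibria (i, j) of a bimatrix game."""
    equilibria = []
    for i in range(u1.shape[0]):
        for j in range(u1.shape[1]):
            best_for_1 = u1[i, j] >= u1[:, j].max()   # player 1 cannot gain by deviating
            best_for_2 = u2[i, j] >= u2[i, :].max()   # player 2 cannot gain by deviating
            if best_for_1 and best_for_2:
                equilibria.append((i, j))
    return equilibria

u1 = np.array([[4, -1], [2, 4]])
u2 = np.array([[4, -1], [3, 4]])
print(pure_nash_equilibria(u1, u2))   # -> [(0, 0), (1, 1)], i.e. (1,1) and (2,2)
```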

2.3.2 Extensive Form Games

Extensive form [3] is an alternate representation of games that makes an explicit assumption about the sequence in which agents take actions. There are two types of extensive form games: the special case of a perfect information extensive form game and the more general case of imperfect information extensive form games. Here we only look at the perfect information extensive form game since, in a fully observable world, it is generally the case that we have perfect information about the world.

Extensive form games model games using a tree-like structure where each node represents the choice of an agent, the edges represent the possible actions taken, and the leaves represent the payoffs defined by the utility function of each agent.

The finite perfect information extensive form game can be defined as a tuple G = ⟨N, A, H, Z, χ, ρ, σ, u⟩ [3]

• N is a finite set of n players.

• A is a set of actions.

• H is a set of nonterminal choice nodes.

• Z is a set of terminal nodes from H.

• χ : H → 2^A is the action function which assigns to each nonterminal choice

node a set of actions.

• ρ : H → N is the player function which assigns to each nonterminal node a

player n ∈ N, who chooses an action at that node.

• σ : H × A → H ∪ Z is the transition function which maps a choice node and

an action to a new choice or terminal node such that for all h1, h2 ∈ H and

a1, a2 ∈ A, if σ(h1, a1) = σ(h2, a2) then h1 = h2 and a1 = a2.

• u = (u1, ..., un), where ui : Z → R is a real-valued utility function for player i in the terminal nodes Z.

In the EFG, G, described above, the pure strategies of player i are defined as

the Cartesian product $\prod_{h \in H,\, \rho(h) = i} \chi(h)$.

Theorem 2.1: [3] Every (finite) perfect-information game in extensive form has a

pure-strategy Nash equilibrium.

Given an EFG G, the subgame of G rooted at node h is the restriction of G to h and the descendants of h. The set of subgames of G consists of all of the subgames of G rooted at some node in G [3].

The subgame-perfect equilibria (SPE) [3] of a game G are all strategy profiles st such that, for any subgame G′ of G, the restriction of st to G′ is a Nash equilibrium of G′.
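For perfect-information extensive form games, a subgame-perfect outcome can be found by backward induction. The sketch below is a minimal illustration on an assumed two-stage game represented as nested dictionaries; it is not an implementation used in this thesis.

```python
def backward_induction(node):
    """Payoff vector reached under subgame-perfect play.

    leaf:     {"payoffs": (u1, u2, ...)}
    internal: {"player": i, "children": {action: subtree, ...}}
    """
    if "payoffs" in node:
        return node["payoffs"]
    mover = node["player"]
    # The mover picks the child whose induced payoff is best for them.
    return max((backward_induction(child) for child in node["children"].values()),
               key=lambda payoffs: payoffs[mover])

# Hypothetical game: player 0 moves first, player 1 responds.
game = {"player": 0, "children": {
    "L": {"player": 1, "children": {"l": {"payoffs": (3, 1)}, "r": {"payoffs": (0, 0)}}},
    "R": {"player": 1, "children": {"l": {"payoffs": (2, 2)}, "r": {"payoffs": (1, 3)}}},
}}
print(backward_induction(game))   # -> (3, 1)
```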

2.3.3 Stochastic Games

A stochastic game is a dynamic game with one or more players and with probabilistic transitions between states. It was introduced by Shapley in 1953 [Shapley, 1953]. We now look at the General Sum Stochastic Game (GSSG), which is used to model a fully observable world, and the Partially Observable Stochastic

Game (POSG) that is used to model a world with incomplete information about the state of the world.

2.3.3.1 General Sum Stochastic Games

A General-sum discounted stochastic game is defined as a tuple ⟨S, N, A, P, R, γ⟩

• N is a finite set of n players.

• S is a finite set of m states

• Ai is a finite set of agent i's actions, and (a1, ..., an) ∈ ×i Ai is a vector of the joint actions of the agents.

• P : S × A × S → R is the state transition probability function. P(s′|s, ~a) denotes the probability of transitioning from state s to state s′ by taking joint action ~a.

• Ri:S → R is the reward function for agent i. Ri(s) denotes the reward of agent i for entering state s.

• γ ∈ [0,1) is the discount factor.

In this definition we assume that S and Ai are finite. A policy is defined as

π : S × A → R, where A is the joint action set of all agents.

It should be noted that General Sum Stochastic Games (GSSG) generalize both the Markov decision process and repeated games. In the above definition, if we reduce the number of agents in the problem to only one agent, then we get a standard future discounted MDP. The future discounted reward is the more commonly used method to aggregate the utility of an agent. Alternatively, we can use the average reward case.

In our research we use the future discounted reward model, as it is generally more intuitive. In a GSSG, each state is a game, and can be modeled as a normal form game (NFG). Solving a NFG means finding its Nash equilibrium. Unlike a NFG mentioned in the previous sections, in the case of GSSGs the agent must solve for a

Nash equilibrium based on the utilities and not the immediate rewards for the state.

The utilities are the sum of discounted future rewards. It has been proven that in a

GSSG, a Nash equilibrium always exists.

Planning in GSSGs involves computing the policy for each agent, where the policy is defined as the probability over the actions that an agent can take in a given state. In GSSGs the agents can make decisions based on either the current state or the history of state transitions. Two concepts of interest here are those of Markov strategy [3] and Markov perfect equilibrium [3]. A Markov strategy is one, where the agent selects its action only based on the current state and the reward that the agent can get in the next state; in other words, the history of state transitions has no influence on how the agent selects its current action. This limits the amount of memory that is needed for planning. A Markov perfect equilibrium is defined as a strategy profile that contains only Markov strategies and is a Nash equilibrium in every state irrespective of the starting state. A Markov Perfect equilibrium is a refinement of a Nash equilibrium.

In GSSGs we will refer to a strategy profile as a policy, π, which is defined as

π : S → P (A~), where A~ is the joint action set of all agents. We use the notation 18 πi and π−i to denote the strategy of agent i and the strategy of all agents excluding agent i, respectively. The utility function and the corresponding Q function for an agent i in the stochastic game can be expressed as,

$$V_i^\pi(s) = \sum_{\vec{a}} \left( \prod_{i} P_{\pi_i}(a_i|s) \right) \left( R_i(s) + \gamma \sum_{s'} P(s'|s, \vec{a}) V_i^\pi(s') \right) \quad \forall \vec{a} \in A,\ a_i \in A_i,\ s \in S$$

$$Q_i^\pi(s, \vec{a}) = R_i(s) + \gamma \sum_{s'} P(s'|s, \vec{a}) V_i^\pi(s') \quad (2.6)$$

$$Q_i^\pi(s, \pi^*) = \sum_{\vec{a}} Q_i^\pi(s, \vec{a}) \prod_{i} P_{\pi_i^*}(a_i|s) \quad (2.7)$$

In Equation 2.6 we describe the Q-function of agent i for a state and action pair (s, ~a), while in Equation 2.7 we describe the Q-function of agent i for a state s and policy π. The policy π(s) represents the probabilities of all the joint actions in state s. In the case of pure strategies, P(a|s) = 1 since the support only has one action a in a state s.

We now mention two important results from Game Theory that explain how agents select their actions in a GSSG.

Theorem 2.2: Every n-player, general-sum, discounted-reward stochastic game has a

Markov perfect equilibrium [3].

This is an important result as this shows that in the case of stochastic games in Multiagent systems, there always exists a policy which is also a Nash equilibrium in all states.

Theorem 2.3: Given a General Sum Stochastic Game, M = ⟨N, S, A, P, R, γ⟩, and a policy π : S × A → R, π* is an optimal policy for agent i if and only if, for all s ∈ S,

$$Q_i(s, \pi^*) \geq Q_i(s, a_i \pi_{-i}^*) \quad \forall a_i \in A_i,\ \forall i \in N \quad (2.8)$$

There are many different methods to solve the problem of finding a Nash equi- librium in NFG, one of the most commonly used is the Lemke-Howson method. In a

GSSG, the agents must be able to learn the optimal policy for all the agents. There are many methods to solve this problem, one of which is the Nash-Q learning algorithm.

We will explain the Nash-Q learning process in the next chapter.

2.3.3.2 Partially Observable Stochastic Games (POSG)

POSGs are a very powerful way to model multiagent systems. They can handle different types of uncertainties that GSSGs cannot, such as partial observations, noisy observations, random events, and uncertain outcomes [22]. The ability of the POSG framework to handle these different scenarios makes it a very powerful way to model

MAS. While POSG is a good method to model these different kinds of situations, it also has a very big drawback; mainly that solving any POSG becomes intractable very fast. There have been some methods which give approximate solutions for POSGs.

There are multiple definitions of POSGs that differ slightly from each other.

We use the following definition: A POSG is defined as a tuple ⟨S, N, A, Z, P, O, R, γ⟩

• N is a finite set of n players.

• S is a finite set of m states.

T • Ai is a set of actions of agent i, and (a1, ..., an) ∈ ×i Ai is a vector of the joint action of the agents.

20 T • Zi is the finite set of observations for agent i, and (z1, ..., zn) ∈ ×i zi is a vector of the joint observations of the agents.

• P : S ×A~×S → R is the state transition probability function. P (s0|s,~a) denotes the probability of transitioning from state s to state s0 taking joint action ~a.

• O: S ×A~ × Z~ → R is the observation probability function. O(~z|s,~a) denotes the probability of observing ~z by taking joint action ~a in state s resulting in

state s0.

• Ri:S → R is the reward function for agent i. Ri(s) denotes the reward of agent i for entering state s.

• γ ∈ [0,1) is the discount factor.

A POSG can have a finite or infinite sequence of stages. The number of stages is known as the horizon of the game. One way to solve finite horizon POSGs is to convert them into an EFG and then compute the Nash equilibrium [22]. In an EFG the

POSG is converted into an acyclic directed graph where the node also contains the history of the game. We consider infinite-horizon POSGs only. A single agent POSG is a POMDP.

2.4 Communication in MAS

Communication is a very critical part of a MAS, as it allows agents to coordinate their actions to accomplish their goals. The issue of communication languages or communication systems has been well researched, and implementations of different standards of communication protocols are easily available [23]. For the purpose of this research we assume that the communication protocols and methods are always available to the agent, although the agent will incur some cost for using the communication systems.

21 The main issue we address in this research is what information to communicate.

To be able to make this decision the agent must be able to decide what information, if communicated to the other agents, might increase its own utility. As such, it is important for the agent to understand the world in which the agents are acting. We assume that all the agents in the world have a good understanding of the world, such that if we were to provide all the information about one agent to the other agent and if the other agent has the same actions available to it, then the other agent can successfully perform the task of the first agent.

Once an agent has enough understanding about the world and the other agents in the world, the next question the agents must consider is if there is any benefit of communicating with the other agents. We will look into this in more detail in later chapters. Once both these questions have been answered, the agent must decide what information to communicate before the agent can communicate with other agents.

CHAPTER 3

RELATED WORK

3.1 Introduction

In the previous chapter we looked at some frameworks to model the single agent and multiagent systems (MAS). In this chapter we first look at some methods to solving the problem of generating the policies for the agents in these frameworks.

We also look at some of the work done on communication in MAS and describe Inverse

Reinforcement Learning (IRL) for single agent systems.

Since we only consider the case of stochastic games, we do not discuss learning in repeated games. Repeated games are similar to stochastic games in the sense that the game is played over time. But they differ from stochastic games in the sense that it is generally the same single stage game that is played over and over again. Learning in repeated games is often much easier than in stochastic games, and algorithms such as

fictitious play [3] and rational learning [3] are used to compute the strategy for the agents.

3.2 Reinforcement Learning in Multiagent Systems

Multiagent learning has been heavily influenced by the field of reinforcement learning, mainly because reinforcement learning does not require complete information about the model of the environment. The agents can learn by taking actions and observing the reward obtained at each step. One of the most famous reinforcement learning methods that has been widely used in single agent systems is the Q-learning method [24]. Traditionally, Q-learning is applied to learn optimal policies in a Markov decision process. In a multiagent setting Q-learning can be extended to allow the agents to learn policies which are optimal in the sense of an equilibrium.

There have been multiple multiagent Q-learning algorithms that have been proposed, the most popular of which is the Nash Q-learning method. In the Nash

Q-learning method [25], the authors, propose to find a Nash equilibrium in each time step and update the Q-values of the agents based on the next state resulting from taking an action that is part of the Nash equilibrium. The Correlated Q-learning

[26] method is similar to the Nash Q-learning method, but, instead of the Nash equilibrium, the authors use a correlated equilibrium to update the Q-values of the agents.

Some other Q-learning methods that have been proposed over the years are friend-or-foe Q-learning [27] and minimax-Q learning [28]. Minimax Q-learning is a fusion of Q-learning and a linear programming solution to the zero sum game. In friend-or-foe Q-learning each agent maintains a single Q-table for itself and the agents must know beforehand whether the other agents are friends or foes. If they are friends then each agent looks for a coordinated equilibrium, and if the other agents are foes, then each agent looks for an adversarial equilibrium. Neither of these two solutions solves the more general case where agents are neither completely cooperative nor completely adversarial. Of all the Q-learning based methods that have been proposed so far, only the NashQ algorithm works for the case of games that have more than two agents.

For a more extensive review of reinforcement learning in MAS, please look at

[29].

24 3.2.1 NASH Q-Learning

The Nash-Q algorithm is an extension of Q-learning to a non-cooperative multiagent context. The multiagent system is modeled as a general sum stochastic game.

Each agent maintains Q-functions over joint actions and performs updates based on assuming Nash equilibrium behaviour over the current Q-values.
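A minimal sketch of the kind of update Nash-Q performs is shown below. The nash_value helper, which is assumed to return each agent's Nash equilibrium value of the stage game defined by the current Q-values at the next state, and the dictionary-based Q-table layout are assumptions for illustration; this is not the implementation of [25].

```python
def nash_q_update(Q, s, joint_a, rewards, s_next, nash_value, alpha=0.1, gamma=0.9):
    """Sketch of a Nash-Q style update applied to every agent i:

        Q_i(s, a) <- (1 - alpha) Q_i(s, a) + alpha * (r_i + gamma * NashValue_i(s'))

    Q: dict agent -> dict {(state, joint_action): value}
    nash_value: assumed helper returning per-agent equilibrium values at s_next.
    """
    next_values = nash_value(Q, s_next)
    for i in Q:
        old = Q[i].get((s, joint_a), 0.0)
        Q[i][(s, joint_a)] = (1 - alpha) * old + alpha * (rewards[i] + gamma * next_values[i])
    return Q
```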

In [25] the authors use the Lemke-Howson algorithm that returns either a single

Nash equilibrium or multiple equilibria in a fixed order. The Lemke-Howson algorithm used is only applicable for two player games and cannot be extended to multiagent systems with more then two agents. There are other methods to calculate Nash equilibria in general sum games.

A major drawback of the Nash Q-learning method is that if multiple Nash equilibria exist, the agents must in a decentralized way pick a Nash equilibrium to update their Q-values. This is only possible if there is either a fixed ordering of Nash equilibria, which will break the decentralized property, or if each game has a global optimal point or saddle point which is also a Nash equilibrium and this point is used to update the Q-values of the agents.

As mentioned above the authors used the Lemke-Howson algorithm to get the

Nash Equilibria in a fixed order. Based on this order they proposed 3 ways for equilibrium selection. The proposed first Nash and second Nash are simply selecting the first and second Nash equilibrium that the Lemke-Howson algorithm returns. The third method they proposed is the best Nash, or simply selecting the Nash equilibrium which gives an individual agents its highest payoff value (irrespective of the value it returns for the other agents).

The first two methods are not practical for any multiagent system with more then two agents as Lemke-Howson does not work for more than two agents and also because enforcing a fixed order or a fixed choice of algorithm for all agents is breaking 25 the decentralized property of the multiagent system. Using the third method is also

often not feasible as in the absence of a global optimal point or saddle point for all

the agents, the agents will play different equilibria that will result in non-equilibrium

actions.

3.3 Inverse Reinforcement Learning

Inverse Reinforcement Learning (IRL) [30] can be defined as a method to determine an agent's reward function, given the agent's policy and a model of the environment. Formally, if the environment and the agent's behaviour are modeled as a Markov Decision Process (MDP), then the objective of IRL is to find the reward function of the MDP and thus to identify the goal or objective of the agent. IRL is of particular interest in applications such as human and animal learning, where the reward function is either not known or cannot be easily expressed.

IRL has been successfully used in applications such as apprenticeship learning, where an agent observes an expert's behavior over time and then estimates a reward function which can be used to generate a policy for the observer agent to perform the task of the expert with either the same efficiency (or in some cases with improved efficiency) as the expert [31] [32].

We now look at IRL in single agent Markovian environments. Most of the algorithms proposed to date address this case. We briefly summarize the work done by Ng and Russell [2000]. In this problem we are given an MDP\R, which is an MDP model without the reward function. A comparison of traditional reinforcement learning and inverse reinforcement learning is shown in Figure 3.1.

Figure 3.1. IRL in Single agent systems.

3.3.1 IRL for MDP\R from Policies

The IRL problem in a single agent, completely observable Markovian environment can be stated as: Given an MDP\R ⟨S, A, T, γ⟩ and a policy π, find a reward function that makes π an optimal policy for the MDP. A necessary and sufficient condition for the reward function R of the MDP to make π optimal is,

$$Q^\pi(s, \pi(s)) \geq Q^\pi(s, a), \quad \forall s, \forall a \quad (3.1)$$

which states that deviating from the optimal policy does not result in higher value.

The condition can be expressed in terms of reward as [30],

$$(P_\pi - P_a)(I - \gamma P_\pi)^{-1} R_\pi \succeq 0 \quad (3.2)$$

where $P_\pi$ and $P_a$ denote the $|S| \times |S|$ matrices whose element $(s, s')$ is the transition probability $P(s'|s, \pi(s))$ and $P(s'|s, a)$ respectively, and $R_\pi$ denotes the $|S|$-vector whose s-th element is the reward $R(s, \pi(s))$. The $\succeq$ operator is a row-wise greater-than-or-equal operator. Given the optimal policy π, the reward function can be found by solving the

following linear program [30];

maximize

$$\sum_{i=1}^{M} \min_{a \in \{a_2, ..., a_k\}} (P_\pi(i) - P_a(i))(I - \gamma P_\pi)^{-1} R \;-\; \lambda \|R\|_1$$

subject to:

$$(P_\pi - P_a)(I - \gamma P_\pi)^{-1} R \succeq 0$$

$$|R_i| \leq R_{max}, \quad i = 1, ..., M$$

where λ is an adjustable penalty coefficient for having too many non-zero values in the reward function.
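As an illustration of condition (3.2), the following hedged sketch computes the optimality margins of a candidate reward for a given policy, assuming state-only rewards R(s) and an assumed array layout. It only checks the necessary and sufficient condition; it does not solve the full linear program above.

```python
import numpy as np

def policy_optimality_margins(P, policy, R, gamma=0.9):
    """Margins (P_pi - P_a)(I - gamma P_pi)^{-1} R for every action a.

    P: transitions, P[a, s, s'] = P(s'|s, a)
    policy: array of length |S| giving pi(s); R: candidate state rewards.
    The policy is consistent with R only if every margin vector is >= 0.
    """
    num_actions, num_states, _ = P.shape
    P_pi = P[policy, np.arange(num_states), :]     # row s is P(.|s, pi(s))
    V = np.linalg.solve(np.eye(num_states) - gamma * P_pi, R)
    return {a: (P_pi - P[a]).dot(V) for a in range(num_actions)}
```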

3.3.2 IRL for MDP\R from Observations

When the optimal policy $\pi^*$ is not given, but some trajectories of the optimal policy are given, the IRL problem can be solved in an iterative form starting with a random policy $\pi_1$. The true reward function R should satisfy the condition $V^{\pi^*}(s) \geq V^{\pi_1}(s)$, since $\pi^*$ is the optimal policy with respect to R. To make this problem more tractable, Ng and Russell [30] assume that instead of expressing the reward function as having a single value per state, the reward function can be represented in a more compact format by assuming a linear expression

$$R(s) = \alpha_1 \phi_1(s) + \alpha_2 \phi_2(s) + \cdots + \alpha_d \phi_d(s) \quad (3.3)$$

where $\phi_1, \phi_2, ..., \phi_d$ are fixed, known, bounded basis functions mapping from S to R, and the $\alpha_i$ are the unknown parameters that need to be estimated. The goal here is to reduce the number of parameters in the reward function from |S| to the number of basis functions. For a candidate reward function $\hat{R} = \hat{\alpha}^T \phi$, the

value of the optimal policy is estimated by the average empirical return from the K trajectories:

$$\hat{V}^{\pi^*}(s_0) = \frac{1}{K} \sum_{k=1}^{K} \sum_{t=0}^{T-1} \gamma^t \hat{R}(s_t^k)$$

where $(s_0^k, s_1^k, ..., s_{T-1}^k)$ is the T-step state sequence in the k-th trajectory of the optimal policy. The value of other policies with a different candidate reward function $\hat{R}$ can be found either using the sample trajectories or using the Bellman equations.
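The empirical estimate of $\hat{V}^{\pi^*}(s_0)$ above can be sketched as follows; the trajectory and reward-function representations are assumptions for illustration.

```python
import numpy as np

def empirical_value(trajectories, reward_fn, gamma=0.9):
    """Average discounted return of s0 over K sampled T-step state trajectories.

    trajectories: list of K state sequences [s_0, s_1, ..., s_{T-1}]
    reward_fn: candidate reward function R_hat(s)
    """
    returns = []
    for states in trajectories:
        returns.append(sum(gamma ** t * reward_fn(s) for t, s in enumerate(states)))
    return np.mean(returns)
```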

∗ This algorithm computes a new policy πi based on the reward value obtained from the reward function found in the current iteration i.

The algorithm iteratively tries to find a better reward function by solving the following optimization for l iterations:

maximize

$$\sum_{i=1}^{l} p\left(\hat{V}^{\pi^*}(s_0) - \hat{V}^{\pi_i^*}(s_0)\right)$$

subject to:

$$|\alpha_i| \leq 1, \quad i = 1, ..., d$$

where p(x) = x if x ≥ 0, and p(x) = 2x if x < 0. This is to ensure that any violations of the constraints are penalized. The algorithm stops when it encounters an $\hat{R}$ which reduces the difference between the value of the optimal policy and a policy generated by the estimated $\hat{R}$ to below an ε-threshold.

3.4 Communication in MAS

This research is focused on making communication decisions in multiagent sys- tems, mainly on deciding when to communicate and what information to commu- nicate. It has been proven that presence of a communication mechanism does not reduce the computational complexity of cooperative multiagent systems [33] unless it is free and unlimited. It has been established that in the presence of free and 29 unlimited communication the cooperative multiagent system modeled as DEC-MDPs or DEC-POMDPs can be solved similar to single agent MDP or POMDP models

[7]. In the case of non-cooperative systems modelled as GSSGs or POSGs even in the presence of free and unlimited communication the problem complexity does not reduce.

In practice, however, communication is not always free and the agents incur a cost every time they decide to communicate [33] [34] [35]. The issue of when to communication has been studied in the context of cooperative systems in [36] [37] [33]

[34] [35]. In [38] [39] [40], the agents must decide on when they want to communicate and then all agents must communicate their information so as to unify their world view.

Another aspect of communication that needs to be considered is whether a com- munication action replaces a domain-level action in the timestep in which it takes place (OR-communication) [41], or can communication and domain-level actions take place in the same timestep(AND-communication) [41]. While most of the approaches mentioned earlier assume synchronous communication, in [37], the authors have con- sidered the case where there is a delay in communication.

In their paper [7], Roth et al. have looked at the problem of what to com- municate in a partially observable cooperative multiagent system. In their system all agents share a common team reward and individual agents are not allowed to be selfish. In their work they use a heuristic based approach to calculate the value of information that they believe will help the team get higher reward. Beynier and

Mouaddib [42] propose an inference graph based method to model the communication between agents in a Dec-MDP (a cooperative MAS). There have been various other works on the different aspects of communication in cooperative systems.

30 Most of the work mentioned above considers the case of cooperative MAS.

Burkov and Chaib-draa [43] presented an algorithm that allows for decentralized planning in stochastic games with communication. They modelled communication at each time step, as a normal form game, where the Nash Equilibrium on the communi- cation gives a unique Nash equilibrium of the stochastic game. Similarly, Murray and

Gordon [44] have presented an approach to decentralized planning where the agents communicate with each other to find a subgame perfect equilibrium in a stochastic game. In both these approaches, the authors do not consider any cost for com- munication between the agents. They model the MAS as a stochastic game with . In practice, however, there is often a cost to communicate information, and the agents must balance these costs with their individual rewards to effectively plan their communication process.

In [45], an algorithm has been proposed to include communication during exe- cution of the POSG. In this approach the author has proposed to cluster their obser- vation histories and then communicate the index of the cluster. Using this approach the agents minimize the length of their messages.

The work presented by Burkov and Chaib-draa [43] and Roth et al [7] are the closest to the work presented here in this dissertation.

CHAPTER 4

COMMUNICATION IN MULTIAGENT SYSTEMS

In multiagent systems, agents operate in some environment where they may or may not be able to accomplish their goals by themselves. The agents might have to depend on other agents to accomplish their goals (in both cooperative and non-cooperative environments). In these cases agents might need to communicate with each other to accomplish their task. Communication is also an important component of the negotiation process between agents, in dynamic team forming, and in most other tasks. Some of the messages exchanged during negotiation between agents are: one of the agents proposes a course of action, retracts the proposed action, the other agent accepts the proposed action, rejects the proposed action, or proposes a different course of action.

The focus of our current research is to answer the question of what information to send and to some degree, when to send the information. These two issues are dependent on each other and hence cannot be easily decoupled.

4.1 Introduction

In this chapter we look at different types of Multiagent systems and define three ways in which we can differentiate between them. For ease of understanding we introduce a simple two player gridworld-based example and see how the different classifications impact the communication between the two agents.

The game shown in Figure 4.1 has 2 players in the locations shown by A1 and

A2. The agents' goal locations are shown by G1 and G2, and the obstacles or pits in the world are shown by O. The areas marked as bridges can be crossed by the agents with some positive reward.

Figure 4.1. Gridworld game.

The reward structure of the agents is as follows:

• A1 reaches G1: 100

• A2 reaches G2: 200

• A1 and A2 are in the same grid location: -20

• A1 or A2 enters the pit or obstacle location: -100

• A1 or A2 enters the bridge location alone: -50

• A1 and A2 enter the bridge location together: 50

The reward values were chosen arbitrarily and can be changed. The gridworld is divided into different states, where the state is defined as ⟨(x1, y1), (x2, y2)⟩. (x1, y1) denotes the (x,y) coordinate of A1 and (x2, y2) denotes the (x,y) coordinate of A2 in the grid. The agents, taking any one of the four movement actions going north, south, east and west, transition probabilistically from one state to another based on a transition function.

This is a simple game and can be easily expanded to include more agents, obstacles and goal locations. The objective of the agents is to maximize their future discounted reward-based utility while minimizing the costs. The game ends when either of the agents enters their goal location.
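To make this model concrete, the sketch below encodes the state, the reward structure and a probabilistic transition for such a gridworld. The grid size, the exact cell coordinates of the goals, pits and bridge, and the slip probability are illustrative assumptions; only the reward values come from the list above.

import random

# Illustrative layout; the actual cell coordinates in Figure 4.1 may differ.
GOALS = {"A1": (0, 0), "A2": (0, 4)}        # assumed positions of G1 and G2
PITS = {(1, 2), (2, 1), (2, 3)}             # cells marked O (assumed)
BRIDGE = {(3, 2)}                           # bridge cell (assumed)
MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}

def reward(state):
    """Reward structure of Section 4.1; state = ((x1, y1), (x2, y2))."""
    p1, p2 = state
    r = {"A1": 0, "A2": 0}
    if p1 == GOALS["A1"]:
        r["A1"] += 100
    if p2 == GOALS["A2"]:
        r["A2"] += 200
    if p1 == p2:                            # same grid location
        r["A1"] -= 20
        r["A2"] -= 20
    for name, p in (("A1", p1), ("A2", p2)):
        if p in PITS:
            r[name] -= 100
        if p in BRIDGE:                     # +50 together, -50 alone
            r[name] += 50 if p1 == p2 else -50
    return r

def step(state, a1, a2, slip=0.1, size=5):
    """Each agent's chosen move succeeds with probability 1 - slip (assumed model)."""
    def move(pos, action):
        if random.random() < slip:
            action = random.choice(list(MOVES))
        dx, dy = MOVES[action]
        x, y = pos[0] + dx, pos[1] + dy
        return (min(max(x, 0), size - 1), min(max(y, 0), size - 1))
    return (move(state[0], a1), move(state[1], a2))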

We now look into the different classes of multiagent systems.

4.2 Classification of Multiagent Systems

We classify the multiagent systems based on the characteristics of the world, the abilities of the agents, and the knowledge the agents have about each other. This is shown in Figure 4.2. The field of MAS is primarily divided into the Fully Observable MAS and Partially Observable MAS as indicated in the figure. The next classification is based on the knowledge the agent has regarding the reward function of the other agents. The final classification is based on the action set of the agents.

4.2.1 Observability

As shown in Figure 4.3, a simple and accurate way to classify multiagent systems based on the world is to classify them based on the accuracy of the observation regarding the world. In the case where the agents can view the state of the world accurately, the world is a fully observable world. A fully observable multiagent system can be modeled easily using the General Sum Stochastic Game (GSSG) discussed in Chapter 2. We cannot use the normal form game framework as most of these multiagent systems have multiple time steps, with each time step being modeled as a different Normal Form Game (NFG). While we can use the extensive form game framework to model these multiagent systems, we run into the problem of these games being of infinite horizon. Also, since we need to model the extensive form games as a game tree before the game is played, the amount of memory required to store the entire game can be very high. Figure 4.1 shows a grid world example of a fully observable multiagent system. In this example each agent has a perfect understanding of the current state of the world, which means each agent is aware of the location of all agents in the world. The agent also has perfect knowledge of the other agents' actions and of the transition probability of the world.

Figure 4.2. Classification of the multiagent systems (fully observable vs. partially observable; complete vs. partial knowledge about the other agents; homogeneous vs. heterogeneous agents).

In the case where the agents cannot view the state of the world accurately, the world is a partially observable world where agents can see only a small part of the world state and must make a probabilistic assumption about the current state. This partially observable multiagent system can be modeled using the Partially Observable Stochastic Game discussed in Chapter 2. While small partially observable multiagent systems can be modeled using the extensive form game framework, we run into the problem of high memory requirements for large games. The GSSG and NFG frameworks cannot model imperfect information about the world.

Figure 4.3. Classification of the multiagent systems based on observability of the world (fully observable MAS vs. partially observable MAS).

Figure 4.4. Partially observable gridworld (A1 and A2: agents; G1 and G2: goals of the agents; O: pit/obstacle cells; bridge cells marked).

Let's look at the grid world example shown in Figure 4.4. In this example, each agent can only observe its immediate surroundings, i.e. the agents can view the status of the blocks surrounding their current position. Based on the example shown in Figure 4.4, agent A1 can only view the area marked in green and agent A2 can only view the area marked in blue. The agents don't know the location of other agents or the actions other agents take, unless they can observe the other agent.

4.2.2 Information About Other Agents

In traditional modelling of multiagent systems it is always assumed that we have complete information about the other agents in the world. We know what actions are available to the other agents and what the other agents' reward functions are; and if we do not know the other agents' reward functions, then we can observe the agents' actions and their rewards during the execution of the learning process.

In a fully observable multiagent system with incomplete information about the other agents, planning becomes very difficult and the situation degrades into a partially observable stochastic game which, as mentioned before, becomes intractable very fast.

We present an inverse reinforcement learning (IRL) algorithm to estimate the reward function of the other agents based either on their policy information or on observations of their trajectories. The IRL algorithm is presented in Chapter 6. Based on our estimate of the reward function we can still use the General Sum Stochastic Game framework to solve the fully observable world case presented earlier, which is generally far more tractable than the Partially Observable Stochastic Game (POSG) framework. In the case of the partially observable world we assume that this information is known. This is due to the limitation of not having an IRL algorithm for POSGs that can be used by the agents in a decentralized manner. This classification is shown in Figure 4.5.

Figure 4.5. Classification of the multiagent systems based on information about other agents (complete vs. incomplete information about the other agents).

Figure 4.6. Classification of the multiagent systems based on similarities between agents (homogeneous vs. heterogeneous agents).

4.2.3 Agent Type

When the agents have similar capabilities in terms of their action sets and similar reward functions, the agents are said to be homogeneous agents. When the agents have different capabilities or different reward functions, the agents are said to be heterogeneous agents. This classification is of interest in symmetric games, where either agent can execute not only its own part of the equilibrium but also the other agents' strategies.

This classification is shown in Figure 4.6. In symmetric games the agents will need to communicate which agent executes which part of the equilibrium.

4.3 Modeling of Multiagent Systems and What to Communicate

In a cooperative multiagent system, where the agents share a common reward, we have the following cases:

In a fully observable world modeled as an MMDP or a Dec-MDP, the agents can communicate joint actions. The issue of incomplete information about the other agents does not occur as all agents share the same reward. The agents can have similar capabilities, which gives rise to symmetric games, but that is handled by communicating the joint actions with a specification of which agent executes which part of the joint action. In a fully observable world the agents have perfect information about the state of the world and the other agents; therefore, communicating any information other than the joint actions to take is not meaningful.

In a partially observable world modeled as a Dec-POMDP, the agents can communicate observations, belief states and joint actions. The issue of incomplete information about the other agents does not occur as all agents share the same reward. The agents can have similar capabilities, which gives rise to symmetric games, but that is handled by communicating the joint actions with a specification of which agent executes which part of the joint action. In a partially observable world, the agents do not have perfect information about the current state of the world; therefore, the agents can communicate the observations that they get from the environment and also their belief about the current state of the world. Once the agents have a common understanding of the world, the agents can then communicate their joint actions as in the case of the fully observable world.

Figure 4.7. Models of the different classes of non-cooperative MAS (fully observable MAS modeled as a General Sum Stochastic Game: GSSG with full knowledge of the other agents, GSSG\R with partial knowledge; partially observable MAS modeled as a Partially Observable Stochastic Game: POSG with full knowledge, POSG\R with partial knowledge).

In a Non-Cooperative multiagent system, where the agents are selfish and try to maximize their personal reward, we have the following cases, shown in Figure 4.7.

A fully observable world with complete information about other agents is modeled as a GSSG. This is the case of the example shown in Figure 4.1. The agents have complete information about the current state of the world and the other agents' reward functions. In this case the agents only need to communicate which Nash equilibrium (NE) to play. In each state the agents need to make a communication decision. They must first decide whether to communicate with the other agents based on the difference in utility between the equilibrium they would play if they did not communicate and the highest payoff equilibrium they could play if they were to communicate.

A fully observable world with incomplete information about other agents is modeled as a GSSG\R (a GSSG with a missing reward function). In this case, all the agents can communicate their reward function information to the other agents so that they can then plan efficiently. Once this information is available, this becomes a GSSG and the agents can communicate which NE to play. When the reward function is missing, the agents can also choose to use an inverse reinforcement learning algorithm, based on either the policies or the trajectories of the policies of the other agents, to compute the reward function and convert the GSSG\R into a GSSG. The agents can then communicate which NE they would play, similar to the case of a fully observable world with complete information.

A partially observable world with complete information about other agents is modeled as a POSG. This is the case shown in Figure 4.4. In this case, similar to the previous case of a fully observable MAS, the agents can communicate which NE to play. Since the agents do not have a perfect understanding of the current state of the world, the agents can communicate their observations or belief state. This becomes a bit tricky, as the agents must decide if there is any value in communicating their observations. The agents might also choose to communicate their belief state rather than their observations. The agents use the belief state to compute the equilibrium and can base their decision of whether to communicate the observations or belief state on the difference in the value of the equilibrium they could achieve before communicating their observations and the equilibrium they can achieve with communication.

A partially observable world with incomplete information about other agents is modeled as a POSG\R. The agents communicate the same information as in the case of a partially observable MAS with complete information. But since the agents don't know the reward functions of the other agents, they might want to communicate the reward function information as well. A decentralized IRL algorithm for POSGs does not exist yet.

In a fully observable MAS or a partially observable MAS, irrespective of the completeness of the information, if the agents are homogeneous then in the case of symmetric games the agents can communicate the Nash equilibrium including information about which agent plays a particular part of the Nash equilibrium. If the agents are heterogeneous then the agents should still communicate the Nash equilibrium, but if the action sets are completely different then they do not need to specify which part of the equilibrium is played by a particular agent.

Multiple approaches to the problem of what to communicate in the case of cooperative systems have been previously proposed. We will mainly focus on non-cooperative systems, but our solutions can be used for cooperative systems as well.

CHAPTER 5

EQUILIBRIUM SELECTION IN NASHQ

We have introduced the NashQ learning algorithm [25] in Chapter 3. The

NashQ algorithm has been shown to converge under a very limited set of conditions.

In this chapter we discuss some advances we have made to the algorithm to increase the likelihood of convergence when these conditions are not fulfilled.

5.1 Introduction

In Chapter 3 we mention some shortcomings of the NashQ algorithm, mainly that it suffers from the problem of not picking a Nash equilibrium consistently over multiple iterations. Also, the three approaches that have been proposed by Hu and Wellman to select amongst Nash equilibria are not widely applicable. We present a more stable approach to selecting an equilibrium in our version of NashQ based on risk dominance. We also extend our approach to the case of Nash-SARSA [46].

5.2 Payoff and Risk Dominance

In their work on the theory of equilibrium selection, Harsanyi and Selten (1988) [47] introduced the concepts of payoff dominance and risk dominance as two different criteria for selecting a Nash equilibrium when multiple Nash equilibria are present. They then argued that payoff dominance should be the first criterion to be applied. Harsanyi (1995) [48] later retracted this argument and proposed to take risk dominance as the relevant selection criterion.

            left      right
left        A, a      C, b
right       B, c      D, d

Figure 5.1. Bimatrix game format.

            left      right
left        4, 4      0, 3
right       3, 0      2, 2

Figure 5.2. A simple bimatrix game.

Both payoff dominance and risk dominance are refinements of the Nash equilibrium, used mainly in equilibrium selection. A Nash equilibrium is considered payoff dominant if it is Pareto superior with respect to all other Nash equilibria. When multiple equilibria are present, all players would agree on the payoff dominant equilibrium since it offers each player at least as much payoff as the other Nash equilibria. The problem with the payoff dominant equilibrium solution is that it might not always exist.

A Nash equilibrium is considered risk dominant compared to another if its product of deviation losses is larger than for the other Nash equilibrium. Deviation loss is simply defined as an agent's loss in value should it choose to deviate from the action selected by the Nash equilibrium. The risk factor is simply the product of the deviation losses for all agents involved in the Nash equilibrium. In the bimatrix game structure shown in Figure 5.1, assume there are two pure strategy equilibria, (right,right) and (left,left).

The deviation loss of the equilibria can be expressed as,

Drr = (D − C)(d − c) (5.1)

Dll = (A − B)(a − b) (5.2)

In Figure 5.2, we present a simple coordination game between 2 players. In this game there are two pure Nash equilibria, namely (left,left) and (right,right). We can see that (left,left) is a pure strategy profile that payoff dominates all other strategies. The strategy (right,right) risk dominates the strategy (left,left), as the product of the deviation losses of the row agent and the column agent from the strategy (right,right) is (2 − 0) × (2 − 0) = 4, whereas the product of the deviation losses for the strategy (left,left) is (4 − 3) × (4 − 3) = 1.
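A minimal check of these numbers, following the cell labelling of Figure 5.1 and Equations 5.1 and 5.2 (the function name is illustrative):

def deviation_loss_products(A, a, B, b, C, c, D, d):
    """Products of deviation losses for the two pure equilibria of the
    bimatrix format in Figure 5.1 (Equations 5.1 and 5.2)."""
    D_rr = (D - C) * (d - c)   # deviation-loss product at (right, right)
    D_ll = (A - B) * (a - b)   # deviation-loss product at (left, left)
    return D_ll, D_rr

# Payoffs of the game in Figure 5.2.
print(deviation_loss_products(A=4, a=4, B=3, b=3, C=0, c=0, D=2, d=2))
# -> (1, 4): (right, right) risk dominates, while (left, left) payoff dominates.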

The simple intuition behind the concept of preferring the risk dominant equilibrium is that, since the cost of deviating from the risk dominant equilibrium is higher, no agent would choose to deviate from it, and hence it is the most stable equilibrium available to the agents. In cases where there is uncertainty about the other agents' payoffs, selecting the risk dominant Nash equilibrium is a better choice as it provides the highest margin of error in the payoffs.

In the absence of some form of coordination between agents, the risk dominant strategy will be selected for play. Recent results [49] [50] have shown that in large games the agents often fail to play the payoff dominant equilibrium and instead play the risk dominant equilibrium, even when it is payoff dominated by other equilibria.

5.3 Risk Dominance Method for Equilibrium Selection

As mentioned in the previous section, in the absence of coordination between agents, the risk dominant equilibrium is the one that the agents will play with the belief that the other agents will also select the same equilibrium. Here we define the deviation loss in a general N-player m-action game. From Equation 2.7 in Chapter 2, we can define the quality of a state s under the optimal policy, $\pi^*$, for an agent i as:

$$Q_i^{\pi}(s, \pi^*) = \sum_{\vec{a}} Q_i^{\pi}(s, \vec{a}) \prod_{j} P_{\pi_j^*}(a_j \mid s)$$

where $\pi^*(s)$ is a probability distribution over the support of the Nash equilibrium for state s. In the case of more than two actions we calculate the deviation loss by selecting the action that is not in the support of the Nash equilibrium for agent i and gives the highest value in the state.

$$Q_i^{\pi}(s, \pi_{-i}^*, a_i) = \sum_{a_{-i}} P_{\pi_{-i}^*}(a_{-i} \mid s)\, Q_i^{\pi}(s, a_i a_{-i}), \qquad \forall a_i \in A_i,\ \forall a_{-i} \in \pi_{-i}^*$$

$$\mathrm{Devloss}_i = \min_{a_i \in A_i \setminus SA_i} \big( Q_i^{\pi}(s, \pi^*) - Q_i^{\pi}(s, \pi_{-i}^* a_i) \big) \qquad (5.3)$$

Here $SA_i$ are the actions in the support of the Nash equilibrium in state s, and $A_i \setminus SA_i$ is the set of actions for agent i not in the support. The risk factor, $R_f$, of an equilibrium is the product of the deviation losses of all agents in the game.

$$R_f(\pi^*) = \prod_{i \in N} \mathrm{Devloss}_i \qquad (5.4)$$

The algorithm for the process is described below.

Algorithm 5.1: Accept a list of NEs, payoff matrix, number of players N

1: numNEs = len(NEs)
2: for k = 0 to numNEs do
3:     for i = 0 to N do
4:         Compute Devloss_i(k) using Equation 5.3
5:         Compute R_f(k) using Equation 5.4
6:         Compute safety value Q(s, π_{k_i} a_{-i}) for all a_{-i}
7:     end for
8: end for
9: j = {argmax_k(R_f(π_{k_i}))}
10: if len(j) > 1 then
11:     SafetyNE = {argmax_{k∈j} Q(s, π_{k_i} a_{-i})} for agent i
12:     if len(SafetyNE) > 1 then
13:         for l = 0 to len(SafetyNE) do
14:             Compute Q_i^π(s, π*) of SafetyNE[l]
15:         end for
16:         MaxUtilityNE = {argmax_{k∈SafetyNE} Q_i^{π_{k_i}}(s, π*)}
17:         if len(MaxUtilityNE) > 1 then
18:             for l = 0 to len(MaxUtilityNE) do
19:                 Compute Q_{-i}^π(s, π*) of MaxUtilityNE[l]
20:             end for
21:             MaxUtilOtherAgentsNE = {argmax_{k∈MaxUtilityNE} Q_{-i}^π(s, π*)}
22:             SelectNE = MaxUtilOtherAgentsNE
23:         else
24:             SelectNE = MaxUtilityNE
25:         end if
26:     else
27:         SelectNE = SafetyNE
28:     end if
29: else
30:     SelectNE = j
31: end if
32: return SelectNE

The risk dominant equilibrium is the one with the highest $R_f$. These equations only hold for Markov strategy Nash equilibria, which are behavioural strategies where the history does not impact the action selection and the Nash equilibrium can be executed in a decentralized way. The equilibrium selection process used here is described in Algorithm 5.1.

We use multiple constraints to select the equilibrium. The first step is to reduce the set of equilibria based on the risk dominance described above, after which the safety value of the equilibria for the agent is calculated and only the ones with the highest safety value are allowed. The safety value, Qi(s, πia−i), is the minimum value that agent i will get if it plays the risk dominant equilibrium and the other players diverge from it. Once these two constraints are applied, the agent then selects from these equilibria by reducing the set further by selecting the equilibria that have the highest utility for itself. The resulting set is further reduced by selecting the equilibria which give the highest value to the other agents as well. The agents can reduce the set until only one equilibrium is left or until the set cannot be reduced based on the above constraints.

The agents must execute the above algorithm in each state during the NashQ learning process. The smaller the number of equilibria in the set after the first criterion, the better the likelihood of convergence.
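A compact sketch of this selection cascade; the score functions are assumed to be supplied by the learner (e.g. computed from its Q-tables), and exact ties are assumed, where in practice a small tolerance may be needed:

def select_equilibrium(nes, risk_factor, safety_value, own_value, others_value):
    """Filter candidate Nash equilibria by risk dominance, then safety value,
    then own utility, then the other agents' utility (cf. Algorithm 5.1).
    Each score argument is a callable mapping an equilibrium to a float."""
    def keep_best(candidates, score):
        best = max(score(ne) for ne in candidates)
        return [ne for ne in candidates if score(ne) == best]

    remaining = keep_best(nes, risk_factor)            # highest risk factor R_f
    if len(remaining) > 1:
        remaining = keep_best(remaining, safety_value)
    if len(remaining) > 1:
        remaining = keep_best(remaining, own_value)
    if len(remaining) > 1:
        remaining = keep_best(remaining, others_value)
    return remaining[0]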

5.4 Results

We applied our algorithm to a Gridworld-based game similar to the one mentioned in Chapter 4. We briefly describe our experiment and the results here.

We tested our algorithms on a simple grid world problem similar to the one used by Hu in the NashQ implementations and present our preliminary results here.

The grid world problem and its NashQ solution are shown in Figure 5.3.

Figure 5.3. Left: The Gridworld game; Right: Agent policies.

The grid world contains two agents, A1 and A2. The agents must reach

their goal states marked G1 and G2. We add a bonus state at cell location (3,4).

When both agents reach this state at the same time they get a reward. After

that there are no rewards to be gained at the bonus cell. The state is modeled

as ((X,Y)_{A1}, (X,Y)_{A2}, BSR), where the first two (x,y) coordinates are the locations of Agent 1 and Agent 2, respectively. BSR is the state of the bonus cell; it is initially set to 0 to indicate that the state has never been visited. BSR is set to 1 if both agents arrive there for the first time, and BSR is set to 2 if both of the agents visit the state more than once at the same time. We also added a strictly negative reward cell

(4,1), such that the agents will get a negative reward if they enter that cell location.

The reward structure of the game is as follows:

If an agent tries to enter the bonus state alone it gets a reward of -200.

The agents gain a reward of 100 when they reach their goal cell locations.

The agents get a reward of 50 if they enter the bonus cell together for the first time.

The agents get a reward of -10 if they enter any cell together.

The agents get a reward of -200 if they enter the strictly negative reward cell.

The agents get a reward of -10000 if they try to exit the world.

The result of the NashQ learning algorithm using the risk dominance method for selecting between equilibria is shown on the right in Figure 5.3, where the brown arrows indicate the policy of Agent 1 and blue arrows indicate the policy of Agent 2. In the grid game the bonus state is ignored by both agents because the risk factor of entering the state along with the other agent is too low in the initial states: the loss for any agent that deviates from this strategy is very small compared to the equilibrium where the agents don't attempt to enter the state. In different games we have seen that the agents converge to the same policy in most cases, but we have seen some cases where the agents come up with different policies.

CHAPTER 6

INVERSE REINFORCEMENT LEARNING FOR DECENTRALIZED

NON-COOPERATIVE MULTIAGENT SYSTEMS

6.1 Introduction

The objective of inverse reinforcement learning is to learn an agent’s reward function based on either the agent’s policies or the observations of the policy. In this chapter we address the issue of using inverse reinforcement learning to learn the reward function in a multiagent setting where the agents can either cooperate or be strictly non-cooperative.

The case of cooperative agents is a subcase of the non-cooperative setting where the agents collectively try to maximize a common reward function instead of maxi- mizing their individual reward functions. Here we present two IRL algorithms where the first algorithm deals with the case where the policies of the agents are known.

The second algorithm deals with the more realistic case where the agent policies are not known, but some trajectories of the policy are known.

We use the framework that was described by Ng and Russell (2000) and extend it for a multiagent setting. We show that in the case of known policies we can reduce the complexity of IRL for an agent in a multiagent system such that each agent's reward function can be estimated independently of the reward functions of the other agents. We assume that the agents are rational and follow an optimal policy. These assumptions are very common in multiagent systems.

In a single agent system, IRL can be defined as a method to determine an agent’s reward function if its behavioural rules and a model of the environment are

51 given. Formally, if the environment and the agent’s behaviour are modeled as an

MDP, then the objective of IRL is to find the reward function of the MDP. IRL is

of particular interest in applications such as human and animal learning, where the

reward function is either not known or cannot be easily expressed. IRL has been

successfully used in applications such as apprenticeship learning, where an agent observes an expert's behaviour over time and then estimates a reward function which can be used to generate a policy for the observer agent to perform the task of the expert with either the same efficiency or, in some cases, with improved efficiency compared to the expert [31] [32].

While an MDP is a good starting point for modeling the environment, there are

other models such as Multiagent Markov Decision Process (MMDP), General Sum

Stochastic Games (GSSG), Partially Observable Markov Decision Process (POMDP)

and Partially Observable Stochastic Games (POSG), which are more realistic. The

MDP is a very simple case where the agent can observe the world perfectly and where there is only one agent in the world. There are limits to the applicability of MDPs as they assume a perfect understanding of the world. In most real life situations there is

always a certain amount of uncertainty about the world state. Also, there are many

cases where more than one agent acts in the world and an MDP does not take into

account the effect one of the agents has on the actions of the other agents. The case

of partial observability can be handled by POMDPs for single agent systems and

POSGs for multiple agent systems. MMDPs, GSSGs and POSGs are used in the case

of more than one agent. IRL algorithms for POMDPs have been presented in [?].

Most of the work on IRL has been focused on the single agent MDP and recently on the single agent POMDP models. There has been barely any work on IRL in multiagent systems.

One attempt at adapting IRL for multiagent systems has been made in [51].

In their work the authors make some very important assumptions, which we believe are not completely realistic. Some of their assumptions are that while the agents are non-cooperative, there exists a centralized mediator which controls the policies of individual agents so as to maximize their joint reward function. Thus the system can be translated into an equivalent cooperative system. Their work also makes the more subtle assumption that there exists a way for agents to communicate. While this type of approach is acceptable in some cases, there exist a vast number of scenarios in which the above assumptions are simply not reasonable.

In this chapter, we consider scenarios where the agents are non-cooperative in nature and the decision process is decentralized such that each agent can execute their policy to maximize their reward while acting in a multiagent environment. This kind of scenario is quite common in various fields such as Robocup soccer, serious games, economics and other fields, where agents are competing with each other to maximize their internal reward. Specifically, we assume that the environment is modeled as a

GSSG and that the agents follow an optimal Markov perfect policy. The algorithms presented here are motivated by the classical IRL algorithms suggested by Ng and Russell (2000). We provide a general framework for dealing with multiagent systems. We expect that many of the advances in single agent IRL can be extended to multiagent systems as well.

We model our multiagent systems using the General Sum Stochastic Game framework presented in Chapter 2. Formally, the problem can be defined as: given a GSSG\R ⟨S, N, A, P, γ⟩ and an optimal Markov perfect policy π*, find the reward vector $\vec{R}$. As in the case of the single agent system (mentioned in Chapter 2), either the optimal policies of the agents can be given explicitly or only sampled trajectories of the optimal policies can be used.

6.2 IRL in GSSG\R from Policies

As mentioned in Chapter 2, the overwhelming focus in stochastic games is

on Markov perfect equilibria (MPE). An MPE π ensures that the policy is a Nash

equilibrium in each and every state. Therefore, the strategy πi(s) is a best response

to strategy profile π−i(s) in every state s. The policy π generated from the reward function for an agent i must satisfy the following condition:

Qi(s, πiπ−i(s)) ≥ Qi(s, aiπ−i(s)), ∀ai ∈ Ai (6.1)

Theorem 6.1 (Markov perfect equilibrium condition): Let a finite state space S, a set of agents N, with each agent i having an action set $A_i = (a_{i_1}, \ldots, a_{i_m})$, transition probability matrices $P_{a_i, a_{-i}}$ and a discount factor γ ∈ [0, 1) be given. Then the policy π* given by $\pi^*(s) \equiv \pi_i^* \pi_{-i}^*(s)$ is a Markov perfect equilibrium if and only if, $\forall a_i \in A_i$, the reward $R_i$, $\forall i \in N$, satisfies

$$(P_{\pi_i^* \pi_{-i}^*} - P_{a_i, \pi_{-i}^*})(I - \gamma P_{\pi_i^* \pi_{-i}^*})^{-1} R_i \succeq 0 \qquad (6.2)$$

Proof. From Section 2.3.3.1 we know that

$$
\begin{aligned}
V_i^{\pi^*}(s) &= \sum_a P_{\pi^*}(a \mid s)\Big( R_i(s) + \gamma \sum_{s'} P(s' \mid s, a) V_i^{\pi^*}(s') \Big) \\
&= \sum_a P_{\pi^*}(a \mid s) R_i(s) + \gamma \sum_a P_{\pi^*}(a \mid s) \sum_{s'} P(s' \mid s, a) V_i^{\pi^*}(s') \\
&= R_i(s) + \gamma \sum_{s'} \sum_a P_{\pi^*}(a \mid s) P(s' \mid s, a) V_i^{\pi^*}(s') \\
&= R_i(s) + \gamma \sum_{s'} V_i^{\pi^*}(s') \sum_a P_{\pi^*}(a \mid s) P(s' \mid s, a) \\
&= R_i(s) + \gamma \sum_{s'} V_i^{\pi^*}(s') P(s' \mid s, \pi^*(s))
\end{aligned}
$$

The above equation can be written in vector form as

$$V_i^{\pi^*} = R_i + \gamma P_{\pi^*} V_i^{\pi^*}$$
$$V_i^{\pi^*} = (I - \gamma P_{\pi^*})^{-1} R_i \qquad (6.3)$$

The equilibrium Q-value, $Q_i(s, \pi')$, can be expressed as

$$Q_i(s, \pi') = R_i(s) + \gamma P_{\pi'}(s) V_i^{\pi'}$$
$$Q_i^{\pi'} = R_i + \gamma P_{\pi'} V_i^{\pi'} \qquad (6.4)$$

Substituting Equation 6.4 and Equation 6.3 into Equation 6.1, we can see that π* is optimal if and only if

$$
\begin{aligned}
Q_i(s, \pi^*) &\ge Q_i(s, a_i \pi_{-i}^*) \\
R_i + \gamma P_{\pi^*} (I - \gamma P_{\pi^*})^{-1} R_i &\succeq R_i + \gamma P_{a_i \pi_{-i}^*} (I - \gamma P_{\pi^*})^{-1} R_i \\
\gamma P_{\pi^*} (I - \gamma P_{\pi^*})^{-1} R_i &\succeq \gamma P_{a_i \pi_{-i}^*} (I - \gamma P_{\pi^*})^{-1} R_i \\
(P_{\pi_i^* \pi_{-i}^*} - P_{a_i, \pi_{-i}^*})(I - \gamma P_{\pi_i^* \pi_{-i}^*})^{-1} R_i &\succeq 0 \qquad (6.5)
\end{aligned}
$$

Figure 6.1. Linear program for IRL.

The above result shows that no single action $a_i$ achieves a higher value against $\pi_{-i}^*$ than $\pi_i^*$; thus $\pi_i^*$ is a best response to $\pi_{-i}^*$ for agent i. When the above condition holds for all agents i, the joint policy is a Nash equilibrium. This completes the proof.

For a GSSG, this result characterizes the set of all reward functions that are

solutions to the IRL problem. As can be seen from Equation 6.5, the reward function

Ri of an agent i does not depend on any other agent’s reward. This helps us decouple the IRL problem of one agent from the other agents. We can get the reward function

Ri of an agent i by solving the linear program shown in Figure 6.1. We use the λ term as a penalty coefficient for having too many non-zero values

in the reward function. λ regulates the trade-off between the stability and the sparsity

of the rewards. This aspect of the linear program is similar to the case of IRL in single

agent systems mentioned in Chapter 3.
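Since the linear program of Figure 6.1 itself is not reproduced in the text, the sketch below assumes an objective analogous to the single-agent formulation of Ng and Russell: maximize the summed margin of the equilibrium policy over the deviating actions of Equation 6.5, minus the λ-weighted L1 norm of the reward, with a bound on |R_i(s)|. All names and the use of scipy.optimize.linprog are illustrative.

import numpy as np
from scipy.optimize import linprog

def irl_from_policy(P_eq, P_dev, gamma=0.9, lam=1.0, r_max=1.0):
    """Estimate agent i's reward subject to Equation 6.5 (sketch, assumed objective).

    P_eq  : (n, n) transition matrix under the equilibrium joint policy.
    P_dev : list of (n, n) matrices, one per deviating action a_i of agent i
            (the other agents keep playing pi*_{-i})."""
    n = P_eq.shape[0]
    M = np.linalg.inv(np.eye(n) - gamma * P_eq)      # (I - gamma P_pi*)^{-1}

    # Variables x = [R (n), t (n), u (n)]; minimize -sum(t) + lam*sum(u).
    c = np.concatenate([np.zeros(n), -np.ones(n), lam * np.ones(n)])

    A_ub, b_ub = [], []
    for P_a in P_dev:
        G = (P_eq - P_a) @ M                         # per-state margins
        for s in range(n):
            # Equation 6.5:  G[s] @ R >= 0   ->   -G[s] @ R <= 0
            A_ub.append(np.concatenate([-G[s], np.zeros(n), np.zeros(n)]))
            b_ub.append(0.0)
            # t_s <= G[s] @ R  (t_s lower-bounds every deviation margin)
            row = np.concatenate([-G[s], np.zeros(n), np.zeros(n)])
            row[n + s] = 1.0
            A_ub.append(row)
            b_ub.append(0.0)
    for s in range(n):
        # |R_s| <= u_s  encodes the L1 penalty via auxiliary variables u
        for sign in (+1.0, -1.0):
            row = np.zeros(3 * n)
            row[s] = sign
            row[2 * n + s] = -1.0
            A_ub.append(row)
            b_ub.append(0.0)

    bounds = [(-r_max, r_max)] * n + [(None, None)] * n + [(0, None)] * n
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
    return res.x[:n]                                 # estimated reward per state (check res.success)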

6.3 IRL in GSSG\R from Trajectories

In the case where the optimal policies $\pi_i^* \pi_{-i}^*$ of the agents are not explicitly given, but some trajectories of the optimal policy are available, the IRL problem can be solved in an iterative form. Here we start with a random policy π. The true reward function $R_i$ should satisfy the condition $V_i^{\pi^*}(s) \ge V_i^{\pi}(s)$, since $\pi^*$ is the optimal policy with respect to $R_i$ for all agents i. To make this problem more tractable, we assume that the reward function can be linearly expressed as

$$R_i(s) = \alpha_{i_1} \phi_1(s) + \alpha_{i_2} \phi_2(s) + \cdots + \alpha_{i_d} \phi_d(s) \qquad (6.6)$$

where $\phi_1, \phi_2, \ldots, \phi_d$ are fixed, known, bounded basis functions mapping from S to $\mathbb{R}$, and the $\alpha_{i_j}$ are the unknown parameters that need to be estimated. The goal here is to reduce the number of parameters in the reward function from |S| to the number of basis functions. For a candidate reward function $\hat{R}_i = \hat{\alpha}_i^T \phi$, the value of the optimal policy is estimated by the average empirical return from the K trajectories:

$$\hat{V}_i^{\pi^*}(s_0) = \frac{1}{K} \sum_{k=1}^{K} \sum_{t=0}^{T-1} \gamma^t \hat{R}_i(s_t^k)$$

where $(s_0^k, s_1^k, \ldots, s_{T-1}^k)$ are the T-step state sequences in the k-th trajectory of the optimal policy. The value of other policies with a different candidate reward function $\hat{R}_i$ can be found either using the sample trajectories or using the Bellman equations. The complete algorithm is described in Figure 6.2.

In the algorithm presented in Figure 6.2, we start by selecting αij randomly for all agents i and all basis functions j. We then generate a Markov perfect Nash

equilibrium policy based on these reward functions and add it to a set of policies.

Then, until we reach a maximum number of iterations, and for each agent i, we

calculate the new policy such that if we have the trajectory of the optimal policy in

a state s then we use that information, and when we don’t have the optimal policy

information, we use information from the previously generated policy. We then solve

the linear program as shown in the algorithm and compute a new policy based on

the reward values that we obtain. If the difference between the value functions of

the optimal policy and the newly generated policy is less than an ε threshold for all states, then we are satisfied with the reward function we have obtained.

Figure 6.2. IRL from trajectories.
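A small sketch of the reward parameterization of Equation 6.6 and the empirical return estimate used above (names are illustrative):

def reward_from_basis(alpha, phis):
    """R_hat(s) = sum_j alpha_j * phi_j(s)   (Equation 6.6)."""
    return lambda s: sum(a_j * phi(s) for a_j, phi in zip(alpha, phis))

def empirical_value(trajectories, reward_hat, gamma=0.9):
    """V_hat_i(s0): average empirical return of the observed optimal policy over
    K T-step state trajectories, (1/K) * sum_k sum_t gamma^t * R_hat(s_t^k)."""
    return sum(
        sum(gamma ** t * reward_hat(s) for t, s in enumerate(states))
        for states in trajectories
    ) / len(trajectories)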

6.4 Experimental Results

We have experimented on some small gridworld problems, such as the sample gridworld shown in Figure 6.3. In this section we present our results. We first generate a Markov perfect equilibrium policy and then use the IRL algorithm to recover the reward information from this policy.

Figure 6.3. Grid World with rewards for Agent 1 and Agent 2.

In the gridworld shown in Figure 6.3, the first number shown in each cell is the reward that agent 1 gets when it enters the grid location, and the second number denotes the reward that agent 2 gets when it enters the grid location. The state variables in this world are the grid locations of both agents, such as (1,1),(1,3). The red square in the middle denotes the state where both agents get a reward of -100 when they enter the location. The reward functions for both agents are described below.

Agent 1 reaches its goal position of (3,3) = 100.

Agent 2 reaches its goal position of (3,3) = 100.

Agent 1 or Agent 2 reaches the position (2,2) alone = -100.

Agent 1 and Agent 2 both reach the same grid position together = -100.

The agents start at any random grid location and try to get to the goal. The game ends when either one of the agents reaches their individual goals.

We generate the policies for both agents using the Nash-Q learning algorithm

[25]. In Nash-Q learning, we need to solve for a Nash equilibrium in each state. We use the Gambit game theory software [52] to solve the Nash Equilibrium at each state. The Nash-Q learning algorithm has been shown to converge for a high number

of iterations. We run the Nash-Q learning algorithm for over 5000 iterations to get to an optimal policy. We then provide the optimal policy from the Nash-Q algorithm to the IRL algorithm that we developed. The estimated reward function for Agent 1 is shown in Figure 6.4, and the estimated reward function for Agent 2 is shown in Figure 6.5. We can see that our IRL captures most of the relevant information about the agents' reward functions. It should be noted that there can be some states for which we will not be able to accurately capture the reward because the policy provided does not transition into these states or the reward is not necessary to enforce the optimal policy. Since all the constraints are met, the original policy is a Markov perfect Nash equilibrium policy for this reward function. But, since Nash-Q can pick between multiple Nash equilibria, it does not always generate the exact same policy.

Figure 6.4. Estimated reward for Agent 1.

Figure 6.5. Estimated reward for Agent 2.

6.5 Summary

The objective of IRL is to estimate the reward function of an agent, given its policy information and the model of the environment. IRL has been successfully used in various domains such as transfer learning and apprenticeship learning. Most of the work done on IRL so far has been in the single agent domain. In this chapter we have presented algorithms for IRL in decentralized multiagent systems.

We derived the necessary conditions for the estimated reward to guarantee the

Markov perfect equilibrium of the agents' policies, and presented an optimization problem with which we can solve the IRL for each agent in the GSSG\R when the agents' optimal policies are explicitly given. In the case where the agents' optimal policies are not explicitly given, but trajectories of the optimal policy are provided, we have presented an iterative algorithm for estimating the agents' reward functions using IRL.

We have shown some experimental results validating our work.

Based on these algorithms the agents can estimate the reward functions of the other agents and either verify whether the reward functions that the other agents communicate to them are accurate or use these estimated rewards for planning.

CHAPTER 7

GAME THEORETIC FRAMEWORK FOR COMMUNICATION IN FULLY

OBSERVABLE MULTIAGENT SYSTEMS

7.1 Introduction

As mentioned in earlier chapters, communication is a very important part of multiagent systems (MAS). Agents need to communicate to accomplish their tasks.

Here we look at communication in non-cooperative MAS modeled as General Sum

Stochastic Games (GSSG) mentioned in Chapter 2. Both cooperative and non-cooperative systems can be modelled as GSSGs. The only assumption we make in this chapter is that the MAS is fully observable.

In this chapter we present a methodology to model communication as a game.

Using the proposed approach, the agents can learn to communicate effectively such that they minimize the cost of communication, while maximizing their rewards. They do this by estimating when and what to communicate during execution time. The agents learn their policies without communicating and only communicate when they are executing the policies that they have learned.

7.2 Communication in Fully Observable Multiagent System

Our assumption in this chapter is that the world is completely observable. In a fully observable world all agents know the state of the world perfectly. Therefore, the agents do not need to communicate any information about the state of the world.

The only information that the agents can benefit from communicating is what they are going to do next in the world.

The risk dominant equilibrium presented in Chapter 5 is useful in the case when there is no coordination between agents. We now present a case for coordination based on communication between agents. We do not make any assumption about cooperation between agents. The agents can coordinate their actions based on communication only in the case where it is in their self interest.

In the case of fully cooperative systems, the agents can communicate their

future actions. In a more general case that is not always possible because in the case

of mixed strategy Nash equilibria, if an agent informs another agent about its action,

then the equilibrium might allow the other agent to exploit the information revealed

by the first agent to play an action which results in higher gain for itself. An example

of such a case would be in the of .

7.3 When to Communicate

Before the agents can communicate any information, they must decide when

they need to communicate. In the communication approach proposed here the agents

individually compute their Q-tables based on the NashQ learning algorithm using the

risk dominance based equilibrium selection criteria described in Chapter 5.

The agents first consider a safety value, $Q_i(s, \pi_i a_{-i})$, which is the value that agent i will get if it plays the risk dominant equilibrium and the other players diverge from it. This is the value that the agent is assured to get. When the agent faces a situation where it has to select from multiple equilibria which have the same risk factor, it should choose the one with the higher safety value. The agents then compute the value of the risk dominant equilibrium, $Q_i(s, \pi_r)$ (using Equation 2.7), and can

reasonably expect that the other agents will play this equilibrium, where πr is the risk dominant equilibrium that is selected by the agent.

The agent then computes the value of its highest value equilibrium, $Q_i(s, \pi_p)$, where $\pi_p$ is the highest value Nash equilibrium for the agent (if a payoff dominant equilibrium exists then the agents will select it).

The value of communication, $V_c$, is then computed as the difference between the value of the risk dominant equilibrium and the expected value of the equilibrium agreed upon after communication, taking C, the cost of communication, into account (in other words, this is the measure of the benefit that an agent can get by communicating).

$$V_c = V_{pe_1} P(pe_1) - C + (1 - P(pe_1))\Big(V_{pe_2} P(pe_2) - 2C + (1 - P(pe_2))\big(V_{pe_3} P(pe_3) - 3C + \ldots\big)\Big) - V_{\pi_r}$$

which can be rewritten as

$$V_c = \sum_{pr=1}^{\infty} \left[ \prod_{l=1}^{pr-1} \big(1 - P(pe_l)\big) \right] \Big( P(pe_{pr}) V(pe_{pr}) - pr \times C \Big) - V_{\pi_r} \qquad (7.1)$$

where $pe_t$ is the $t$-th Nash equilibrium proposed and $pr$ is the number of proposals. Since we need at least two messages to agree upon $\pi_p$ (one per agent), $\max(V_c)$ is defined as

$$\max(V_c) = Q_i(s, \pi_p) - Q_i(s, \pi_r) - C \qquad (7.2)$$

If the max Vc is positive then the agents may communicate.
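A minimal sketch of this decision rule (Equation 7.2); the Q-value arguments are assumed to come from the learned tables:

def should_communicate(q_payoff_dominant, q_risk_dominant, comm_cost):
    """Decision rule of Equation 7.2:
    max(Vc) = Q_i(s, pi_p) - Q_i(s, pi_r) - C; communicate only if positive."""
    max_vc = q_payoff_dominant - q_risk_dominant - comm_cost
    return max_vc > 0, max_vc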

7.4 What to communicate

Once the agents decide to communicate they are free to communicate any possible equilibrium, and the other agents can accept it or counter-propose a different equilibrium. This process of proposing and counter-proposing can be modeled as an extensive form game where the starting node is any agent that communicates first (we do not focus on which agent communicates first or on how the agents synchronize communication; a vast number of algorithms that deal with these issues are already available in the literature).

Based on our definition of an extensive form game, the extensive form communication game (EFCG) is defined as follows:

• N is the set of agents.

• A is the set of actions; in our case each agent can either propose an equilibrium, accept a proposed equilibrium, or choose not to communicate.

• H is a set of nonterminal choice nodes, where the agents can choose either to execute a communication action, to accept a proposed equilibrium, or to terminate communication.

• Z is the set of terminal nodes, which occur when the agents choose either the accept communication action or decide not to communicate in the parent node.

• χ : H → 2^A is the action function, which assigns to each choice node a set of actions such as accept the proposal, counter-propose, or decide not to communicate.

• ρ : H → N is the player function, which assigns to each nonterminal node a player n ∈ N who chooses an action at that node.

• σ : H × A → H ∪ Z is the transition function, which maps a choice node and an action to a new choice node or terminal node.

• u = (u_1, ..., u_n), where u_i : Z → R maps the terminal nodes to Q-values. The Q-values at the terminal nodes are assigned based on the value of the equilibrium the agents agree upon, discounted by the cost of communication actions.

A Nash equilibrium solution to this extensive form game will give each agent the policy of how to maximize their individual gain. If at any time the agents see that the value of communicating is lower than that of not communicating, the agents can enter a terminal node using the do not communicate action.
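A minimal data-structure sketch of such a communication game tree, solved here by backward induction (which yields a subgame perfect, hence Nash, equilibrium of the EFCG); all class and field names are illustrative:

from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

@dataclass
class Node:
    """A choice node (player set) or terminal node (payoffs set) of the EFCG."""
    player: Optional[int] = None                       # who moves at this node
    payoffs: Optional[Tuple[float, ...]] = None        # u(z) at terminal nodes
    children: Dict[str, "Node"] = field(default_factory=dict)  # action -> child

def backward_induction(node):
    """Return (payoff vector, {node id: chosen action}) for the subtree at `node`."""
    if node.payoffs is not None:                       # terminal node z in Z
        return node.payoffs, {}
    best_action, best_value, strategy = None, None, {}
    for action, child in node.children.items():
        value, sub = backward_induction(child)
        strategy.update(sub)
        if best_value is None or value[node.player] > best_value[node.player]:
            best_action, best_value = action, value    # acting player maximizes its own u_i
    strategy[id(node)] = best_action
    return best_value, strategy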

The branching factor, Bf of the game is,

Bf = NumberofNashEquilibria + 2

A simple heuristic estimate of the depth of the game tree, Depth, can be computed using Equation 7.3 as follows,

Depth = max(Vc) / (C/2)    (7.3)

Here we only consider the possible actions until the depth of the tree reaches Depth; continuing beyond this point in the tree would allow values lower than the risk dominant equilibrium. We do not want any solutions where the value of communication is negative.

The entire process is shown in Algorithm 1. The game can also be modeled as a GSSG, but we believe that modeling it as an extensive form game is more intuitive as it takes the sequence of actions into account, and also remembers the history of the execution of the game. A simple example of this communication game is shown in

Figure 7.1. In this example there are two agents denoted by "A" and "B". The agents have two equilibria available to them: the first equilibrium, denoted by NE1, has a utility of 100 for agent A and 90 for agent B; the second equilibrium, denoted by NE2, has a utility of 50 for agent A and 60 for agent B. The agents have 4 actions available to them in this communication game: first, accept a proposed equilibrium, denoted by Ac (or A in the figure); secondly, a no-communication action, denoted by NC; and finally, propose or counter-propose an equilibrium, denoted by NE1 and NE2.

The figure shown is a game played by agent A communicating first.

Algorithm 1 Accept a list of NEs, payoff matrix

numNEs = len(NEs)
for k = 0 to numNEs do
    Compute Q_i(s, π_k) for agent i using Equation 2.7
    Compute R_f(π_k) using Equation 5.4
    Compute safety value Q(s, π_{k_i} a_{-i}) for agent i
end for
Compute SelectedNE using Algorithm 5.1 for agent i
Compute max(Vc) using Equation 7.2
if max(Vc) > 0 then
    Compute the Nash equilibrium for the EFCG
    Play the EFCG
    if Accepted a proposed NE then
        Return the accepted NE, Q-value of the accepted NE
    else if Executed a do-not-communicate action then
        Return the SelectedNE, Q-value of SelectedNE
    end if
end if

Figure 7.1. A simple communication game modeled as an Extensive Form Game.

Figure 7.2. Gridworld game: policy without communication (left) and policy with communication (right).

7.5 Experimental Results

We tested our algorithms on a simple grid world problem similar to the one

used by Hu in the NashQ implementations and present our preliminary results here.

The grid world problem and its NashQ solution are shown in Figure 7.2.

The grid world contains two agents, A1 and A2. The agents must reach

their goal states marked G1 and G2. We add a bonus state at cell location (3,4).

When both agents reach this state at the same time they get a reward. After

that there are no rewards to be gained at the bonus cell. The state is modeled

as ((X,Y)_{A1}, (X,Y)_{A2}, BSR), where the first two (x,y) coordinates are the locations of Agent 1 and Agent 2, respectively. BSR is the state of the bonus cell; it is initially set to 0 to indicate that the state has never been visited. BSR is set to 1 if both agents arrive there for the first time, and BSR is set to 2 if both of the agents visit the state more than once at the same time. We also added a strictly negative reward cell (4,1), such that the agents will get a negative reward if they enter that cell location. The reward structure of the game is as follows: If an agent tries to enter the bonus state alone it gets a reward of -200. The agents gain a reward of 100 when they reach their goal cell locations. The agents get a reward of 50 if they enter the bonus cell together for the first time. The agents get a reward of -10 if they enter any cell together. The

agents get a reward of -200 if they enter the strictly negative reward cell. The agents get a reward of -10000 if they try to exit the world.

The result of the NashQ learning algorithm using the risk dominance method for selecting between equilibria without communication is shown on the left in Figure

7.2, where the brown arrows indicate the policy of Agent 1 and blue arrows indicate the policy of Agent 2. In the grid game the bonus state is ignored by both agents because the risk factor of entering the state along with the other agent is too low in the initial states since the loss for any agent that deviates from this strategy is very small compared to the equilibrium where the agents don’t attempt to enter the state. To demonstrate this, we present the analysis of the selection process at state

((2,1),(1,2),0):

There are three equilibria; two pure ones, (right, right) denoted as NE1 and

(down, up) denoted as NE2, and a mixed behavioral Nash equilibrium, (0.46 : right, 0.54 : down), (0.17 : right, 0.83 : up) denoted as NE3. The utilities for both agents are:

Qi(s, NE1) = (171, 171)

Qi(s, NE2) = (155.2, 157.4)

Qi(s, NE3) = (158, 162)

Based on these values we can see that by coordinating through communication, the agents can reach a higher valued equilibrium. Therefore, the agents create an EFCG and then play it. In this game the cost of communication was selected to be 5 (cost is negative payoff). A small part of the resulting game is shown in Figure 7.3, where the three equilibria are marked as NE1 - NE3, and Agent 1 starts communicating first.

The NC and Acc represent the not-communicate and accept-equilibrium actions. At each timestep the agents can propose any equilibrium, accept a proposed equilibrium, counter-propose an equilibrium, or choose not to communicate. In this game the agents agree to take the higher valued equilibrium NE1 instead of NE2, ending up with a higher utility. This new equilibrium is shown on the right in Figure 7.2.
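As a worked check of the criterion from Section 7.3 in this state, assuming that the risk-dominant selection without communication is NE2 (as the policy on the left of Figure 7.2 suggests) and using the stated cost C = 5:

$$\max(V_c) = Q_1(s, NE1) - Q_1(s, NE2) - C = 171 - 155.2 - 5 = 10.8 > 0,$$

and similarly $171 - 157.4 - 5 = 8.6 > 0$ for Agent 2, so both agents expect communication to pay off in this state.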

7.6 Summary

In this chapter we have presented a methodology to model communication between agents in a stochastic game as an extensive form game. We have also presented an algorithm that can be used to calculate when communication is beneficial to the agents and what information agents should communicate in a fully observable multiagent system. In our method we make no explicit assumption of cooperation between agents and our method is useful for the domain of non-cooperative systems where there is a cost to communication.

Figure 7.3. A part of the Communication as an Extensive Form Game.

CHAPTER 8

LEARNING WITH COMMUNICATION IN FULLY OBSERVABLE

MULTIAGENT SYSTEMS

In the previous chapter we presented a method to model communication as an extensive form game and use this game to achieve higher payoff than what the agents would have gotten if they did not communicate. One of the assumptions that was made in the previous chapter is that all agents will agree on an equilibrium. But, in a decentralized system this is not always true. In this chapter we present an online learning algorithm based on Nash-SARSA [46], that can be used by the agents to learn a policy where each agent can maximize its payoffs.

8.1 Nash-SARSA

Nash-SARSA is a multiagent extension of the single agent SARSA [24] algorithm that is widely used in policy generation for MDPs. In Nash-SARSA, each agent maintains a Q-table for itself and for all other agents. In our approach the agent selects its action based on a Nash equilibrium that it calculates using the risk dominance selection criterion we discussed in Chapter 5. Each agent then plays its part of the Nash equilibrium strategy that it has picked and observes the other agents' actions.

It then updates its Q-values for the previous state and joint action pairs based on the actions that have been played in the current state. The idea behind this process is that, if all agents select the same Nash equilibrium, then the strategy they each execute will result in a Nash equilibrium.

Unfortunately, if there exists more than one Nash equilibrium, then the agents may select different equilibria to play and the resulting strategy profile might not be a Nash equilibrium. To avoid this miscoordination between the agents, we propose to use communication and online learning using Nash-SARSA to allow agents to agree upon the best equilibrium and to optimize and synchronize their estimates of their utilities. The communication between the agents is modelled using the communication game that was presented in the previous chapter.

One issue that comes up during the practical implementation of this approach is that if the agents' Q-tables prior to this online learning do not converge to the same values, then it might be possible that the Nash equilibrium that one agent selects might not be recognized by the other agents as a Nash equilibrium strategy. In these situations the agents might want to consider whether the proposed Nash equilibrium is within an ε distance of a known Nash equilibrium. This type of equilibrium is known as an ε-Nash equilibrium [53]. A Nash equilibrium has an ε value of 0.

8.2 Online Algorithm for Communication

The algorithm works on the basis of allowing the agents to first learn offline about the world using the Nash-Q learning algorithm. This happens effectively "in the head" of the agents and it reflects an agent's estimate of the other agents' utilities prior to any interaction. Once the agents have explored the world using the Nash-Q learning algorithm, they start interacting with the other agents by playing the game using the Nash-SARSA algorithm. In each state the agents play the communication game and select their actions based on the outcome of the communication game.

We make one modification to our communication game: we allow the other agents to propose an unknown equilibrium. We then allow each agent to either accept the proposed unknown equilibrium or counter-propose some other equilibrium from the agent's set of equilibria. When an unknown equilibrium is proposed, the agents can estimate if the utility of the best response to the proposed equilibrium is within ε of a known equilibrium. Here the distance between the proposed equilibrium and a known equilibrium is calculated by allowing the other agents to play their part of the equilibrium while the agent plays its part of known equilibria. If the proposed equilibrium is within ε distance of any known equilibrium, then the agent shifts its strategy from the current node of the extensive form game to the node which has the known equilibrium within ε distance. If no known strategy is within ε distance, then the proposed equilibrium is rejected and the execution follows the original strategy.

An example of this modified communication game is shown in Figure 8.1.
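A small sketch of one plausible reading of this ε test: for each known equilibrium, the agent evaluates playing its own part of that equilibrium against the others' part of the proposal, and accepts the mapping if the value is within ε of the known equilibrium's value. The accessors q_mixed and q_known are assumptions about how the Q-tables are queried.

def match_proposed_equilibrium(proposal, known_nes, q_mixed, q_known, epsilon=1.0):
    """Map an unknown proposed equilibrium to a known one within epsilon, if any.

    q_mixed(ne, proposal): value when the agent plays its part of `ne` while the
    others play their part of `proposal`; q_known(ne): value of the known `ne`."""
    for ne in known_nes:
        if abs(q_mixed(ne, proposal) - q_known(ne)) <= epsilon:
            return ne            # shift the EFCG strategy to this known equilibrium
    return None                  # reject the proposal; keep the original strategy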

The agents update their Q-tables based on the Q-values of the equilibrium reached by the communication game:

$$Q_i(s^{(t)}, \vec{a}^{(t)}) = (1 - \alpha)\, Q_i(s^{(t)}, \vec{a}^{(t)}) + \alpha \big( r_i^{(t)} + \gamma\, Q_i^c(s^{(t+1)}, \vec{a}^{(t+1)}) \big), \quad \forall i \in N \qquad (8.1)$$

$Q_i^c(s^{(t+1)}, \vec{a}^{(t+1)})$ is the Q-value obtained at a terminal node after playing the communication game.
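A one-function sketch of this update (the table layout and variable names are illustrative):

def nash_sarsa_update(q_table, state, joint_action, reward, q_comm_next,
                      alpha=0.1, gamma=0.9):
    """Equation 8.1: blend the old estimate with the reward plus the discounted
    Q-value returned by the communication game played in the next state."""
    key = (state, joint_action)
    q_table[key] = (1 - alpha) * q_table[key] + alpha * (reward + gamma * q_comm_next)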

The agents can learn to coordinate their actions based on Algorithm 2. This ensures that the agents can play a Nash equilibrium strategy at every time step. The branching factor of the tree increases by 1 compared to the game in the previous chapter.

B_f = Number of Nash Equilibria + 3    (8.2)

Algorithm 2 Nash-SARSA with Communication: Accept the GSSG and Q-tables

s = s_0
for k = 0 to numOfEpisodes do
    Create a communication game for the current state denoted by s
    SelectedNE, QvalueOfComm = Play the NE strategy for the communication game
    Play the SelectedNE returned by the communication game
    The current state of the game advances to the next state denoted by s'
    Update the Q-table entries of the previous state using Eq. 8.1 and QvalueOfComm
    s = s'
end for

Figure 8.1. A simple communication game modeled as an Extensive Form Game (including the unknown-equilibrium proposal NEX).

8.3 Experimental Results

We have implemented this algorithm on the simple 3 by 3 grid world that was described in the previous chapters. We show the results of the NashQ and Nash-

SARSA algorithm in Figure 8.2.

In Figure 8.2 the grid game on the top left shows the initial game, and the results before communication are shown in the grid game on the bottom left. Similarly, the game on the right shows a different grid game where there is an obstacle in one of the grid locations. In the first game we see that, based on the risk dominance method of equilibrium selection, both agents select the same equilibria in their path. In the second game the agents select almost the same equilibria. In the second time step, while the first agent selects a pure strategy of going (Up, Right), the second agent takes a mixed strategy of going ((Right-0.17, Up-0.83), (Right-0.91, Up-0.09)). This is mostly because of a lack of exploration in the Nash-SARSA part of our approach.

Figure 8.2. Gridworld game example: (a) describes a simple gridworld example. (b) shows the policies learned by the agents. (c) shows a different example of the gridworld. (d) shows the policies learned by the agents.

Both agents have both equilibria (with a small difference in probability of the mixed strategy) in their set of equilibria.

In Figure 8.2, we have shown two examples of the Nash-SARSA algorithm that does not allow for communication. When we allow for communication in Nash-SARSA using the method described in this chapter, we see that the agents get higher rewards in the case of game (a) shown in Figure 8.2. This is shown in Figure 8.3.


Figure 8.3. Gridworld game example: (a) describes a simple gridworld example. (b) shows the policies learned by the agents when communication is allowed.

8.4 Summary

We have presented an online learning algorithm based on Nash-SARSA and the communication game described in the previous chapter. This concludes our discussion on communication in fully observable domains. We now look at the partially observable domain in the next chapter.

CHAPTER 9

LEARNING WITH COMMUNICATION IN PARTIALLY OBSERVABLE MULTIAGENT SYSTEMS

Planning in partially observable scenarios is much harder than in a fully observable world. Most of the current approaches to solving partially observable multiagent systems are not exact. In most cases the proposed algorithms either compute an approximate solution that works only in certain cases, such as cooperative domains, i.e. domains with common payoffs, or they address small problems which do not scale. The partially observable MAS is generally modelled as a Partially Observable Stochastic Game (POSG).

Before we look at what to communicate in a POSG, we first need to look at how POSGs can be solved and, more importantly, what information the agents base their decisions on.

9.1 Planning in POSGs

We first look at previous algorithms that have been proposed to solve the POSG.

The first approach is the one proposed by Hansen et al. [54]. In their approach they consider a POSG where all agents receive the observations of all other agents. This ensures that at each time step all agents know what observations the other agents are getting; thus they can accurately update the belief states of all agents and they share the same belief state. This allows them to plan based on the belief state.

In the more general definition of a POSG, the agents do not all receive the observations of all agents. This forces each agent to estimate the belief state of the other agents based on its own observations. Since the agents are not able to estimate the belief state of the other agents accurately, they must take a more conservative strategy when calculating their Nash equilibrium.

9.2 Linear Nash-Q

As mentioned in Chapter 2, a POSG is defined as a tuple $\langle S, N, A, Z, T, O, R, \gamma \rangle$. In our definition, $Z_i$ is the set of observations for agent $i$, and $O(\vec{z}\,|\,s,\vec{a})$ denotes the probability of observing $\vec{z}$ after taking joint action $\vec{a}$ in state $s$ resulting in state $s'$. Note that we do not assume that all agents get the entire observation vector $\vec{z}$.

We solve the problem of learning in POSGs in multiple steps. To obtain a first approximation, we solve the underlying GSSG using NashQ, which gives us the Q-values of state-action pairs following the optimal policy for the GSSG, denoted by $Q_i(s,\vec{a})$. We then use a variation of Q-MDP [55] for multiagent systems to get the approximate Q-values of all the belief states by using the following equation.

$Q_i(b,\vec{a}) = \sum_{s} b_i(s)\, Q_i(s,\vec{a})$  (9.1)

As in the case of Q-MDP for POMDPs, this initial approximation assumes that after the next action any uncertainty about the agent's current belief state will be gone. This is not a very realistic assumption, but nonetheless it allows us to obtain a first estimate for the POSG.
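A minimal sketch of this Q-MDP style approximation (Eq. 9.1) is shown below; the dictionary layout of the belief and of the NashQ Q-table is an assumption made purely for illustration.

def belief_q_value(belief, Q_state, joint_action):
    """Eq. 9.1: Q_i(b, a) as the belief-weighted average of the GSSG values Q_i(s, a).

    belief       -- dict: state -> probability b_i(s)
    Q_state      -- dict: (state, joint_action) -> Q_i(s, a) obtained from NashQ
    joint_action -- the joint action vector a
    """
    return sum(p * Q_state[(s, joint_action)] for s, p in belief.items())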

The Bellman equations for general sum stochastic games can be modified for the case of POSGs similarly to the case of POMDPs. After the calculation of the Q-values of the belief states, we use a variation of linear Q-learning [55] for multiagent systems. The updates in linear Nash-Q learning are similar to the updates in NashQ. The update rule used in linear Nash-Q learning is,

$Q^{(t+1)}(b,\vec{a}) = (1-\alpha)\, Q^{(t)}(b,\vec{a}) + \alpha\left(r(b) + \gamma\, \mathrm{Nash}_{\vec{a}}\, Q^{(t)}(b',\vec{a}\,')\right)$  (9.2)

or it can be rewritten as,

$\Delta Q(b,\vec{a}) = \alpha\left(r(b) + \gamma\, \mathrm{Nash}_{\vec{a}}\, Q^{(t+1)}(b',\vec{a}\,') - Q^{(t)}(b,\vec{a})\right)$  (9.3)

The linear function approximation is,

$Q_i(b,\vec{a}) = \sum_{s} b_i(s)\, q_i(s,\vec{a})$  (9.4)

and the parameters $q_i(s,\vec{a})$ are updated according to

$\Delta q(s,\vec{a}) = b_i(s)\, \Delta Q(b,\vec{a})$  (9.5)
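The resulting parameter update can be sketched as follows; nash_value is an assumed helper returning the Nash equilibrium value of the successor belief for this agent, and the data layout is illustrative rather than the implementation used here.

def linear_nash_q_update(q, belief, joint_action, reward, next_belief,
                         nash_value, alpha, gamma):
    """One linear Nash-Q update for a single agent (Eqs. 9.3-9.5).

    q           -- dict: (state, joint_action) -> parameter q_i(s, a)
    belief      -- dict: state -> b_i(s), the belief in which the action was taken
    next_belief -- dict: state -> b_i(s') after the transition
    nash_value  -- callable(belief) -> Nash equilibrium value over the belief-state
                   Q-values for this agent (an assumed helper)
    """
    # Eq. 9.4: Q_i(b, a) under the linear function approximation.
    q_b = sum(p * q.get((s, joint_action), 0.0) for s, p in belief.items())
    # Eq. 9.3: temporal-difference error towards reward plus discounted Nash value.
    delta_q_b = alpha * (reward + gamma * nash_value(next_belief) - q_b)
    # Eq. 9.5: spread the update over the per-state parameters, weighted by belief.
    for s, p in belief.items():
        q[(s, joint_action)] = q.get((s, joint_action), 0.0) + p * delta_q_b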

The belief state of each agent can be updated using just the observation $z_i$ that each agent gets, using the following equations.

$b_i(s') = \alpha\, O(z_i|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, b_i(s)$  (9.6)

where $\alpha$ is a normalizing constant that is calculated by the following equation,

$\alpha = \dfrac{1}{\sum_{s' \in S} O(z_i|s',\vec{a}) \sum_{s \in S} T(s'|s,\vec{a})\, b_i(s)}$  (9.7)
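The update in Eqs. 9.6 and 9.7 is a standard Bayes filter step; a minimal sketch is given below, with the transition and observation models passed in as plain dictionaries (an assumed layout used only for illustration).

def belief_update(belief, joint_action, observation, T, O, states):
    """Bayes filter step of Eqs. 9.6-9.7 for agent i's own belief.

    belief -- dict: state -> b_i(s)
    T      -- dict: (s_next, s, joint_action) -> T(s'|s, a)
    O      -- dict: (observation, s_next, joint_action) -> O(z_i|s', a)
    states -- iterable of all states
    """
    unnormalized = {}
    for s_next in states:
        predicted = sum(T[(s_next, s, joint_action)] * p for s, p in belief.items())
        unnormalized[s_next] = O[(observation, s_next, joint_action)] * predicted
    norm = sum(unnormalized.values())   # 1/alpha of Eq. 9.7; assumes a consistent observation
    return {s: v / norm for s, v in unnormalized.items()}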

This algorithm was selected because we need the Q-values of the belief states, which can be used to measure the impact of the information that is communicated. Each agent computes the Nash equilibrium based on its own belief. One approach is to estimate the Q-values by taking a weighted average of the Q-values of the individual states based on their probabilities and then calculate the Nash equilibrium over these values. The second approach is to compute the Nash equilibria of each state individually and then choose the Nash equilibrium which has the highest cumulative probability based on the probabilities assigned to the states.

9.3 Belief State Update

For belief state updates we use a Bayes filter based approach. Each agent can maintain multiple beliefs depending on the information that it uses to update the belief states. As mentioned earlier, each agent only gets its part of the observation vector instead of the observations of all agents. Different agents can get different observations, which causes the agents to have different belief states. An environment where agents have different belief states makes planning difficult, as a Nash equilibrium computed by one agent can be completely different from the Nash equilibrium computed by the other agents.

To make planning more accurate across all agents, each individual agent can compute its belief based not only on the observation that it gets from the environment but also on a probabilistic estimate of the other agents' observations. The agents can also compute their belief based on a probabilistic estimate of the other agents' observations and their estimate of the other agents' estimate about their own observations. The agents should try to use as much of the common information as possible, as this ensures that the belief states across agents are more consistent. The different belief states considered and their update process are listed below (a small sketch of how such a set of candidate beliefs can be maintained follows the list).

• Each agent $i$ can maintain a belief about its own state, $b_i^-$, based on its own observations $z_i$ and the joint actions $\vec{a}$.

$b_i^{-\prime}(s') = P(s' \,|\, z_i, \vec{a}, b_i^-)$  (9.8)

This can be expressed as,

$b_i^{-\prime}(s') = \alpha\, O(z_i|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, b_i^-(s)$  (9.9)

• Each agent can maintain a belief about itself and the other agent, denoted as $b_i^+$, based on its own observations $z_i$, a probabilistic estimate of the other agents' possible observations $\hat{z}_j$, and the joint action $\vec{a}$. $\{b_i^{+k}\}$ is a set of belief states, where $k$ is used to denote one possible observation in the set of possible observations $\hat{z}_j^k$. $k$ is dependent on $\hat{z}_j$ and $\bar{k}$, where $\bar{k}$ is the predecessor belief state out of the set of belief states in the previous timestep and $\hat{z}_j$ is the observation in the current timestep.

$b_i^{+k}(s') = P(s' \,|\, z_i, \hat{z}_{-i}^{k}, \vec{a}, b_i^{+\bar{k}})$  (9.10)

This can be expressed as,

$b_i^{+k}(s') = \beta \left( \prod_{j \neq i} O(\hat{z}_j^k|s',\vec{a}) \right) O(z_i|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, b_i^{+\bar{k}}(s)$  (9.11)

$\beta = \dfrac{1}{\sum_{s' \in S} \left( \prod_{j \neq i} O(\hat{z}_j^k|s',\vec{a}) \right) O(z_i|s',\vec{a}) \sum_{s \in S} T(s'|s,\vec{a})\, b_i^{+\bar{k}}(s)}$  (9.12)

where $k$ is used to denote an index in the set of possible observations $\hat{z}_j^k$.

$P(b_i^{+k}) = \left[ \sum_{s'} \left( \prod_{j \neq i} O(\hat{z}_j^k|s',\vec{a}) \right) \alpha\, O(z_i|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, b_i^{+\bar{k}}(s) \right] P(b_i^{+\bar{k}})$  (9.13)

$\alpha = \dfrac{1}{\sum_{s' \in S} O(z_i|s',\vec{a}) \sum_{s \in S} T(s'|s,\vec{a})\, b_i^{+\bar{k}}(s)}$  (9.14)

• Each agent $i$ also maintains an estimate of the other agents' belief, denoted by $\hat{b}_{i,j}^-$, based on the other agents' possible observations $\hat{z}_j$ given the initial agent's current observation $z_i$. This is the estimate of the other agent's $b_j^-$. $\{\hat{b}_{i,j}^{-k}\}$ is a set of belief states, where $k$ is used to denote one possible observation in the set of possible observations $\hat{z}_j^k$.

$\hat{b}_{i,j}^{-k}(s') = P(s' \,|\, \hat{z}_j^k, z_i, \hat{b}_{i,j}^{-\bar{k}}, \vec{a})$  (9.15)

This can be expressed as,

$\hat{b}_{i,j}^{-k}(s') = \beta\, O(\hat{z}_j^k|s',\vec{a})\, O(z_i|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{-\bar{k}}(s)$  (9.16)

$\beta = \dfrac{1}{\sum_{s' \in S} O(\hat{z}_j^k|s',\vec{a})\, O(z_i|s',\vec{a}) \sum_{s \in S} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{-\bar{k}}(s)}$  (9.17)

where $k$ is used to denote an index in the set of possible observations $\hat{z}_j$.

$P(\hat{b}_{i,j}^{-k}) = \left[ \sum_{s'} O(\hat{z}_j^k|s',\vec{a})\, \alpha\, O(z_i|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{-\bar{k}}(s) \right] P(\hat{b}_{i,j}^{-\bar{k}})$  (9.18)

$\alpha = \dfrac{1}{\sum_{s' \in S} O(z_i|s',\vec{a}) \sum_{s \in S} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{-\bar{k}}(s)}$  (9.19)

• Each agent $i$ also maintains an estimate of the other agents' belief, denoted by $\hat{b}_{i,j}^+$, based on the other agents' possible observations $\hat{z}_j$, given the initial agent's current observations and an estimate of the initial agent's observations $\hat{z}_{-j}$ based upon the estimation of the other agents' possible observations $\hat{z}_j$. This is the estimate of the other agent's $b_j^+$. This is also a set of belief states, denoted by $\{\hat{b}_{i,j}^{+k}\}$, based on a possible observation from the set of possible observations.

$\hat{b}_{i,j}^{+k}(s') = P(s' \,|\, \hat{z}_j^k, \hat{z}_{-j}^{k}, z_i, \hat{b}_{i,j}^{+\bar{k}}, \vec{a})$  (9.20)

This can be expressed as,

$\hat{b}_{i,j}^{+k}(s') = \beta \prod_{l} O(\hat{z}_l|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{+\bar{k}}(s)$  (9.21)

$\beta = \dfrac{1}{\sum_{s' \in S} \prod_{l} O(\hat{z}_l^k|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{+\bar{k}}(s)}$  (9.22)

where $k$ is used to denote an index in the set of possible observations $\hat{z}_j$ and $l$ denotes an agent.

$P(\hat{b}_{i,j}^{+k}) = P(\hat{z}_j^k \,|\, z_i, \hat{b}_{i,j}^{+\bar{k}})\, P(\hat{z}_{-j}^{\bar{k}} \,|\, \hat{z}_j^k, \hat{b}_{i,j}^{+\bar{k}})\, P(\hat{b}_{i,j}^{+\bar{k}})$

$P(\hat{b}_{i,j}^{+k}) = \left[ \sum_{s'} O(\hat{z}_j^k|s',\vec{a})\, \alpha\, O(z_i|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{+\bar{k}}(s) \right] \left[ \sum_{s'} \left( \prod_{l \neq j} O(\hat{z}_l^k|s',\vec{a}) \right) \gamma\, O(\hat{z}_j^{\bar{k}}|s',\vec{a}) \sum_{s} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{+\bar{k}}(s) \right] P(\hat{b}_{i,j}^{+\bar{k}})$  (9.23)

$\alpha = \dfrac{1}{\sum_{s' \in S} O(z_i|s',\vec{a}) \sum_{s \in S} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{+\bar{k}}(s)}$  (9.24)

$\gamma = \dfrac{1}{\sum_{s' \in S} O(\hat{z}_j^k|s',\vec{a}) \sum_{s \in S} T(s'|s,\vec{a})\, \hat{b}_{i,j}^{+\bar{k}}(s)}$  (9.25)
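As referenced above, the following is a minimal sketch of how an agent might maintain such a set of candidate belief states for another agent, in the spirit of Eqs. 9.15-9.19. The dictionary-based transition and observation models and the single-likelihood branch weighting are simplifying assumptions made for illustration; the two-stage normalization of Eq. 9.18 is folded here into one weight.

def expand_candidate_beliefs(candidates, joint_action, own_obs,
                             possible_other_obs, T, O_own, O_other, states):
    """Track a set of candidate beliefs, one branch per possible observation that
    the other agent may have received, together with the branch probabilities.

    candidates         -- list of (belief_dict, probability) pairs from the last step
    own_obs            -- this agent's actual observation z_i
    possible_other_obs -- iterable of observations the other agent may have received
    T                  -- dict: (s_next, s, joint_action) -> transition probability
    O_own, O_other     -- dicts: (obs, s_next, joint_action) -> observation probability
    """
    new_candidates = []
    for belief, prob in candidates:
        for z_hat in possible_other_obs:
            unnorm = {}
            for s_next in states:
                predicted = sum(T[(s_next, s, joint_action)] * p
                                for s, p in belief.items())
                unnorm[s_next] = (O_other[(z_hat, s_next, joint_action)]
                                  * O_own[(own_obs, s_next, joint_action)]
                                  * predicted)
            weight = sum(unnorm.values())            # likelihood of this branch
            if weight > 0.0:
                new_belief = {s: v / weight for s, v in unnorm.items()}
                new_candidates.append((new_belief, prob * weight))
    total = sum(p for _, p in new_candidates)        # re-normalize branch probabilities
    return [(b, p / total) for b, p in new_candidates]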

Here we assume that each agent only gets its own observation $z_i$ and not the entire set of observations $\vec{z}$. We also assume that each agent can perceive the entire joint action vector $\vec{a}$. If each agent were to receive the entire observation vector $\vec{z}$, then all agents would have a perfect understanding of the beliefs of the other agents (note that this assumption does not make it a fully observable MAS). When all agents have a good understanding of the beliefs of the other agents, they can plan effectively using a Nash equilibrium policy over the belief states of the agents.

The question of planning becomes: should an agent plan based on $b_i^-$, $b_i^+$, $\hat{b}_{i,j}^-$, or $\hat{b}_{i,j}^+$? In principle $\hat{b}_{i,j}^+$ is the belief that is closest to being common across all agents (as it does not use any information an agent gets directly, but estimates common information based on the information the agents receive from the environment); therefore the agents here will plan based on this belief. In principle, if $\hat{b}_{i,j}^+$ comes closer to $b_i^-$, then we can plan most efficiently, as all agents use the most information obtained from the environment.

Figure 9.1. Belief tracking without communication.

In Figure 9.1, the belief state tracking process is shown: the agents start from some belief state, and after each time step they update their beliefs based on the observation each agent gets and its probabilistic estimates of the other agents' observations, as discussed above.

9.4 Communication in POSG

As mentioned in the previous section, the closer the agents' beliefs about the other agents are to what the other agents believe their own beliefs to be, the more accurately we can plan.

Based on this we can now define our vocabulary (the set of information that is beneficial to communicate). From the above we can say that the agents might benefit from communicating their observations at any time step, their current belief, and the action they have taken. Once the agents have synchronized their beliefs, the agents can still communicate the Nash equilibrium they would like to play (this is similar to the case of the fully observable MAS described in the previous chapters).

The agents must make the decision of when to communicate based on the Q-values over the belief states.

9.5 Communication Game

We consider three stages for the communication game. In the first stage the agent considers the observations that it has received from the environment. It can choose to communicate its observations based on how the observation can impact its own utility. The second stage of communication would be to transmit its actual beliefs $b_i^-$; this allows the other agents to set their beliefs about the agent, expressed as $\hat{b}_{i,j}^+$, to $b_i^-$. And, finally, in the third stage the agents can choose to communicate the equilibrium they would like to play (as in the case of a fully observable MAS).

The agents must make a decision to communicate or not to communicate in each time step, and then use one of the three stages to communicate their information. If the agents choose to communicate their information, they can make better-informed decisions in the system. The first two steps of the process are modeled as single shot games and the third step is modeled as an extensive form game, as in the case of a fully observable multiagent system.

In Figure 9.2 we have shown the process of tracking the belief state by allowing the agents to communicate their observations $z_i$ at each time step. Given the communication of an observation $z_i$, all branches in the belief tracking tree that are not consistent with this observation are removed and the probabilities of the remaining branches, and thus of the remaining potential belief states, are re-normalized. Computing the difference between the value of the Nash equilibrium of the aggregated belief state of the resulting set and the value of the equilibrium of the aggregated belief state before communication yields the value of communicating the observation, allowing the agents to decide when and what to communicate. Note that even by communicating their individual observations at each time step the system does not reduce to a fully observable multiagent system.

Figure 9.2. Belief tracking with observation communication.
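The value-of-communication computation described above can be sketched as follows; aggregate_belief and nash_value are assumed helper functions standing in for the aggregation of the candidate belief set and for the Nash equilibrium value over the belief-state Q-values, and the triple-based branch layout is illustrative.

def value_of_communicating(branches, communicated_obs, aggregate_belief, nash_value):
    """Value of agent i communicating its observation z_i.

    branches          -- list of (hypothesized_obs, belief_dict, probability) entries
                         of the belief-tracking tree before communication
    communicated_obs  -- the observation z_i that would be communicated
    aggregate_belief  -- callable(list of (belief, prob)) -> single aggregated belief
    nash_value        -- callable(belief) -> value of the Nash equilibrium for this
                         agent over the belief-state Q-values (an assumed helper)
    """
    # Equilibrium value of the aggregated belief before communication.
    value_before = nash_value(aggregate_belief([(b, p) for _, b, p in branches]))
    # Prune branches inconsistent with the communicated observation and re-normalize.
    kept = [(b, p) for z, b, p in branches if z == communicated_obs]
    total = sum(p for _, p in kept)
    kept = [(b, p / total) for b, p in kept]
    # Equilibrium value of the aggregated belief after communication.
    value_after = nash_value(aggregate_belief(kept))
    return value_after - value_before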

We have made a few assumptions in our approach, such as the fact that the agents always communicate truthful information and that the agents always trust the information that is proposed by the other agents. In principle, the issue of trust between agents is an important area of research that would impact what information they communicate to the other agents. We leave this issue as future work.

CHAPTER 10

CONCLUSION

In this dissertation, we have presented a framework based on which agents can make decisions about communication actions in a multiagent system. We have also provided algorithms for inverse reinforcement learning in decentralized multiagent systems, and we have provided enhancements to the popular NashQ and Nash-SARSA algorithms based on risk dominance in equilibrium selection.

The communication of strategies between agents was modeled as an extensive form game and it was successfully used to improve the utility of the agents. We have provided experimental results for our algorithms in the case of a fully observable multiagent system. For partially observable domains we have also introduced a new algorithm to compute the Q-values over belief states, a new representation to establish different, more coordinated belief state estimates, and an algorithm to determine when to communicate observations and which observation to communicate.

Mathematical foundations of this framework were introduced for the case of partially observable stochastic games, though experimental results have not been included.

10.1 Future Work

The framework presented here is meant to be used as a base for more extensive work on the study of communication decision making in multiagent systems. The learning algorithm for fully observable multiagent systems can be further enhanced by considering the quantal response equilibrium (note that this is not an enhancement of the Nash equilibrium).

The algorithms for inverse reinforcement learning can be improved much further by using gradient techniques, and can be extended to the case of centralized POSGs. It remains to be seen, however, whether IRL methods can be applied to decentralized POSGs.

Finally, we have introduced the framework for communication decision making in partially observable multiagent systems, but we have made some unrealistic assumptions about trust between agents. Since the issue of trust between agents has a direct impact on the information that the agents can use to communicate, we hope to extend our framework by relaxing this constraint in the future.

REFERENCES

[1] I. J. McEwan, “Edited by Gerhard Krauss. Biochemistry of signal transduction and regulation. Wiley-VCH, 2001, 2nd edn, 506 pp. Price 45.00. ISBN 3-527-30378-2,” Applied Organometallic Chemistry, vol. 16, no. 11, pp. 675–675, 2002.

[2] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 3rd ed. Prentice Hall, 2009.

[3] Y. Shoham and K. Leyton-Brown, Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. New York, NY, USA: Cambridge University Press, 2008.

[4] B. Chaib-draa and F. Dignum, “Trends in agent communication language,” Computational Intelligence, vol. 18, no. 2, 2002.

[5] J. Pitt and A. Mamdani, “Communication protocols in multi-agent systems: a development method and reference architecture,” Issues in Agent Communication, pp. 160–177, 2000.

[6] J. Dale and E. Mamdani, “Open standards for interoperating agent-based systems,” Software Focus, 2001.

[7] M. Roth, R. Simmons, and M. Veloso, “What to communicate? Execution-time decision in multi-agent POMDPs,” in The 8th International Symposium on Distributed Autonomous Robotic Systems (DARS), 2006.

[8] P. Parikh, Language and Equilibrium. The MIT Press, 2010.

[9] N. Allott, “Game theory and communication.”

[10] L. S. Shapley, “Stochastic games,” Proceedings of the National Academy of Sciences, vol. 39, no. 10, pp. 1095–1100, 1953.

[11] T. S. Reddy, V. Gopikrishna, G. V. Zaruba, and M. Huber, “Inverse reinforcement learning for decentralized non-cooperative multiagent systems,” in Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on, Oct. 2012.

[12] T. S. Reddy, G. V. Zaruba, and M. Huber, “Game theoretic framework for communication in fully observable multiagent systems,” in Machine Learning and Applications, 2012. ICMLA ’12. Eleventh International Conference on, 2012.

[13] J. Hu and M. P. Wellman, “Nash Q-learning for general-sum stochastic games,” J. Mach. Learn. Res., vol. 4, pp. 1039–1069, Dec. 2003. [Online]. Available: http://dl.acm.org/citation.cfm?id=945365.964288

[14] T. Rahwan, Algorithms for Coalition Formation in Multi-Agent Systems, ser. British Comp Society Series. British Informatics Society Limited, 2009. [Online]. Available: http://books.google.com.pe/books?id=zBxTPgAACAAJ

[15] Y. Labrou and T. Finin, “Semantics and conversations for an agent communication language,” in Readings in Agents, M. N. Huhns and M. P. Singh, Eds. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1997, pp. 235–242. [Online]. Available: http://portal.acm.org/citation.cfm?id=284907

[16] S. Kraus, Strategic Negotiation in Multiagent Environments, ser. Intelligent Robots and Autonomous Agents. MIT Press, 2001. [Online]. Available: http://books.google.com/books?id=of3W6dlL zoC

[17] D. Shuai and X. Feng, “Distributed problem solving in multiagent systems: A spring net approach,” IEEE Intelligent Systems, vol. 20, no. 4, pp. 66–74, July 2005. [Online]. Available: http://dx.doi.org/10.1109/MIS.2005.68

[18] N. R. Jennings and M. Wooldridge, “Agent-oriented software engineering,” Artificial Intelligence, vol. 117, pp. 277–296, 2000.

[19] M. Wooldridge and P. Ciancarini, “Agent-oriented software engineering: the state of the art,” Lecture Notes in Computer Science, pp. 1–28, 2001.

[20] W. Sheng, Q. Yang, J. Tan, and N. Xi, “Distributed multi-robot coordination in area exploration,” Robotics and Autonomous Systems, vol. 54, no. 12, pp. 945–955, 2006.

[21] R. Fierro, A. Das, J. Spletzer, J. Esposito, V. Kumar, J. Ostrowski, G. Pappas, C. Taylor, Y. Hur, R. Alur, et al., “A framework and architecture for multi-robot coordination,” The International Journal of Robotics Research, vol. 21, no. 10-11, pp. 977–995, 2002.

[22] H. McMahan, “Robust planning in domains with stochastic outcomes, adversaries, and partial observability,” Ph.D. dissertation, Stanford University, 2006.

[23] F. Bellifemine, A. Poggi, and G. Rimassa, “Developing multi-agent systems with JADE,” Intelligent Agents VII Agent Theories Architectures and Languages, pp. 42–47, 2001.

[24] R. Sutton and A. Barto, Reinforcement Learning: An Introduction. Cambridge Univ Press, 1998, vol. 1, no. 1.

[25] J. Hu and M. P. Wellman, “Nash Q-learning for general-sum stochastic games,” J. Mach. Learn. Res., vol. 4, pp. 1039–1069, Dec. 2003. [Online]. Available: http://dl.acm.org/citation.cfm?id=945365.964288

[26] A. R. Greenwald and K. Hall, “Correlated Q-learning,” in International Conference on Machine Learning, 2003, pp. 242–249.

[27] M. L. Littman, “Friend-or-foe Q-learning in general-sum games,” in Proceedings of the Eighteenth International Conference on Machine Learning, ser. ICML ’01. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2001, pp. 322–328. [Online]. Available: http://dl.acm.org/citation.cfm?id=645530.655661

[28] M. Littman, “Markov games as a framework for multi-agent reinforcement learning,” in Proceedings of the Eleventh International Conference on Machine Learning, vol. 157, 1994, p. 163.

[29] L. Busoniu, R. Babuska, and B. De Schutter, “Multi-agent reinforcement learning: A survey,” in Control, Automation, Robotics and Vision, 2006. ICARCV ’06. 9th International Conference on, Dec. 2006, pp. 1–6.

[30] A. Y. Ng and S. J. Russell, “Algorithms for inverse reinforcement learning,” in Proceedings of the Seventeenth International Conference on Machine Learning, ser. ICML ’00. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2000, pp. 663–670. [Online]. Available: http://dl.acm.org/citation.cfm?id=645529.657801

[31] P. Abbeel and A. Y. Ng, “Apprenticeship learning via inverse reinforcement learning,” in Proceedings of the Twenty-first International Conference on Machine Learning. ACM Press, 2004.

[32] G. Neu and C. Szepesvari, “Apprenticeship learning using inverse reinforcement learning and gradient methods,” in Proceedings of the Twenty-Third Annual Conference on Uncertainty in Artificial Intelligence (UAI-07). Corvallis, Oregon: AUAI Press, 2007, pp. 295–302.

[33] D. V. Pynadath and M. Tambe, “The communicative multiagent team decision problem: analyzing teamwork theories and models,” Journal of Artificial Intelligence Research, vol. 16, no. 1, pp. 389–423, June 2002.

[34] P. Xuan, V. Lesser, and S. Zilberstein, “Formal modeling of communication decisions in cooperative multi-agent systems,” in Proceedings of the Second Workshop on Game Theoretic and Decision Theoretic Agents (GTDT 2000), January 2000.

[35] C. V. Goldman and S. Zilberstein, “Optimizing information exchange in cooperative multi-agent systems,” in Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, ser. AAMAS ’03. New York, NY, USA: ACM, 2003, pp. 137–144.

[36] C. V. Goldman and S. Zilberstein, “Decentralized control of cooperative systems: categorization and complexity analysis,” Journal of Artificial Intelligence Research, vol. 22, no. 1, pp. 143–174, Nov. 2004.

[37] M. T. J. Spaan, F. A. Oliehoek, and N. A. Vlassis, “Multiagent planning under uncertainty with stochastic communication delays,” in ICAPS’08, 2008, pp. 338–345.

[38] M. Roth, R. Simmons, and M. Veloso, “Reasoning about joint beliefs for execution-time communication decisions,” in The Fourth International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS), 2005.

[39] R. Nair, M. Roth, and M. Yokoo, “Communication for improving policy computation in distributed POMDPs,” in Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 3, ser. AAMAS ’04. Washington, DC, USA: IEEE Computer Society, 2004, pp. 1098–1105.

[40] F. Wu, S. Zilberstein, and X. Chen, “Multi-agent online planning with communication,” in Proceedings of ICAPS-2009, 2009.

[41] R. Emery-Montemerlo, “Game-theoretic control for robot teams,” Ph.D. dissertation, Carnegie Mellon University, 2005.

[42] A. Beynier and A.-I. Mouaddib, “A rich communication model in opportunistic decentralized decision making,” in Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 02, ser. WI-IAT ’10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 133–140. [Online]. Available: http://dx.doi.org/10.1109/WI-IAT.2010.41

[43] A. Burkov and B. Chaib-draa, “Distributed planning in stochastic games with communication,” in Machine Learning and Applications, 2008. ICMLA ’08. Seventh International Conference on, Dec. 2008, pp. 396–401.

[44] C. Murray and G. J. Gordon, “Multi-robot negotiation: Approximating the set of subgame perfect equilibria in general-sum stochastic games,” in NIPS’06, 2006, pp. 1001–1008.

[45] R. Emery-Montemerlo, G. Gordon, and S. Thrun, “Run-time communication policies for partially observable stochastic games.”

[46] B. Banerjee, S. Sen, and J. Peng, “On-policy concurrent reinforcement learning,” Journal of Experimental & Theoretical Artificial Intelligence, vol. 16, no. 4, pp. 245–260, 2004.

[47] J. C. Harsanyi and R. Selten, A General Theory of Equilibrium Selection in Games, ser. MIT Press Books. The MIT Press, 1988, vol. 1, no. 0262582384.

[48] J. C. Harsanyi, “A new theory of equilibrium selection for games with complete information,” Games and Economic Behavior, vol. 8, no. 1, pp. 91–122, 1995. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0899825605800181

[49] M. Kandori, G. J. Mailath, and R. Rob, “Learning, mutation, and long run equilibria in games,” Econometrica, vol. 61, no. 1, pp. 29–56, 1993.

[50] M. Kandori, G. J. Mailath, and R. Rob, “Learning, mutation, and long run equilibria in games,” Econometrica, vol. 61, no. 1, pp. 29–56, 1993.

[51] S. Natarajan, G. Kunapuli, K. Judah, P. Tadepalli, K. Kersting, and J. Shavlik, “Multi-agent inverse reinforcement learning,” in Proceedings of the 2010 Ninth International Conference on Machine Learning and Applications, ser. ICMLA ’10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 395–400. [Online]. Available: http://dx.doi.org/10.1109/ICMLA.2010.65

[52] R. D. McKelvey, A. M. McLennan, and T. L. Turocy, “Gambit: Software Tools for Game Theory, Version 0.2010.09.01,” Tech. Rep., 2010. [Online]. Available: http://www.gambit-project.org.

[53] H. Dixon, “Approximate Bertrand equilibria in a replicated industry,” The Review of Economic Studies, vol. 54, no. 1, pp. 47–62, 1987.

[54] E. Hansen, “Dynamic programming for partially observable stochastic games,” in Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004.

[55] M. Littman, “Learning policies for partially observable environments: Scaling up,” in Proc. of International Conference on Machine Learning, 1995.

BIOGRAPHICAL STATEMENT

Tummalapalli Sudhamsh Reddy received his B.E. in Computer Science and Engineering in 2003 from Vellore Engineering College (previously affiliated with Madras University, now known as Vellore Institute of Technology), Vellore, India. He received his M.S. in Computer Science in 2006 from the University of Texas at Arlington (UTA). Sudhamsh is currently pursuing his Ph.D. in Computer Science under the supervision of Dr. Manfred Huber.

Sudhamsh has previously worked at Fermi National Accelerator Laboratory for a year as a Graduate Research Assistant employed by UTA, and at Brookhaven National Laboratory for one and a half years while employed as a Research Associate Scientist 2 by UTA. He has worked on Grid computing software development for the DZero and ATLAS physics experiments.
