Generation Based Analysis Model for Non-Cooperative Games
Soham Banerjee
Department of Computer Engineering
International Institute of Information Technology
Pune, India
Email: [email protected]

Abstract—In the analysis of most real-world games, participating agents are assumed to be perfectly rational and to choose the optimal set of actions, but that is not always the case in complex games. This paper proposes a model for analysing the possible sets of moves and their outcomes where not all participating agents are perfectly rational. The proposed model is applied to three games: Prisoners' Dilemma, Platonia Dilemma and Guess 2/3 of the Average. The three games differ fundamentally in their action spaces and outcome payoffs, making them good examples for analysis with the proposed model.

Index Terms—game theory, analysis model, deterministic games, superrationality, prisoners' dilemma, platonia dilemma

I. INTRODUCTION

While selecting appropriate actions to perform in a game, a common assumption is that all other participating agents (co-operative or not) will make the optimal move. An agent that does so is called a Perfectly Rational Agent. When a perfectly rational agent is aware that all other agents are also perfectly rational, the agent is said to be Superrational. This concept of decision mapping was first introduced by Douglas Hofstadter in his book Metamagical Themas, while proposing a solution to the Platonia Dilemma.

Analysing the outcomes of games where all agents are superrational is useful and usually leads to the formation of a good mapping from observations to actions; but in some cases the agents might not be superrational, and such an analysis will then not yield proper results. The knowledge and understanding of a real-world agent change based on multiple factors including, but not limited to, previous iterations, changing goals, actions performed by other agents, or simply finding a better set of actions that maps to a better output. As the actions taken by any agent change, the optimal action choices for other agents may change. To analyse all permutations of actions taken by each of the agents, this paper proposes a model that divides the decision process into Generations of Knowledge, which define the actions of each participating agent.

In this paper the proposed model is applied to three games: Prisoners' Dilemma, Platonia Dilemma and Guess 2/3 of the Average. These games have different Nash Equilibria and different strategies for achieving the optimal outcome. Each of the selected games has a different approach to its optimal solution, which forms a good base for comparison of the model's performance.

II. PROPOSED MODEL

The proposed model works within the framework of the following predetermined assumptions, but may be altered or expanded to other games as well.

• All agents have infinite memory: an agent remembers all previous iterations of the game, along with the observations and the results.
• All the games have a well-defined pure strategy Nash Equilibrium.
• The number of participating agents is always known.
• All the games are non-cooperative: the agents will only form an alliance if it is self-enforcing.

Each participating agent is said to possess a Generation of Knowledge G_i, defined as follows:

G_i = f(G_{i-1})

where f(x) is the generation increment function and G_x is the Generation of Knowledge with index x.

A Generation of Knowledge is the state of an agent which defines its decision mapping. For example, an agent who knows its destination is to the right will move towards the right. This is the Generation of Knowledge possessed by the agent. If the agent learns about a shortcut to its destination, it will prefer the shorter path and hence eventually use the shortcut. This is the incremented Generation of Knowledge possessed by the agent.

The model has the following properties:

• The initial Generation of Knowledge is G_0 and, for most games, it is a random choice made over the entire action space.
• Each generation is incremented over the previous Generation of Knowledge based on the output provided by the previous generations.
• For each game, the Generations of Knowledge start repeating after a certain number of generations, as the agent reaches the optimal situation.
• The Generations of Knowledge are ordinal in nature and have no numeric weight assigned to them.

Using Generations of Knowledge to dictate the agents' actions makes it possible to find the optimal action even when dealing with agents who do not select the optimal action, i.e. are not superrational in nature. In most real-world games such as the stock market, warfare, data sharing etc., it is a challenging task to find agents who can function rationally and make optimal decisions. If we can identify the Generation of Knowledge possessed by an agent, we can select the optimal action from the total action space.

The Generation of Knowledge also changes when different agents have different objectives. An agent wanting to earn the maximum amount of money possible will not care about the money earned by other agents. This means that in games where different agents are trying to achieve different goals, they will be at different Generations of Knowledge, making the proposed model useful in these situations as well.

This paper uses the proposed model on "Prisoners' Dilemma", "Platonia Dilemma" and "Guess 2/3 of the Average" to analyse the payoffs.

III. PRISONERS' DILEMMA

Prisoners' Dilemma is a common game played between two or more agents. The game is defined as follows: Two criminals are arrested and taken in for interrogation. There exists no means of communication between the two individuals. The prosecutors have sufficient evidence to convict the pair on a lesser charge, but lack sufficient evidence to convict them on the primary charge. In order to convict them on the primary charge, the prosecutors offer each prisoner a bargain. Each criminal is given the opportunity either to cooperate with the other criminal by remaining silent, or to betray the other criminal by testifying that the other committed the crime. The offer provides different payouts for different scenarios. These are:

• If both criminals betray each other, each of them serves two years in prison.
• If one (A) of them betrays the other (B) while the other remains silent, the criminal who betrays faces no charges and is free to go, but the one who stays quiet will have to serve three years in prison.
• If both criminals remain silent, they will each serve only one year in prison.

Fig. 1. Payoff Matrix for Prisoners' Dilemma

If we apply our model to this game, we get the Generations of Knowledge as:

• G_0 = 50% chance of staying silent and 50% chance of betraying.
• G_1 = 100% chance of betraying (Pure Strategy Nash Equilibrium).
• G_i = the i-th iteration, where you perform the action performed by the other agent in the previous generation. This strategy is also called Tit-For-Tat.
• G_n = It becomes apparent that both players will always pick the same action, making staying silent the superrational action to be taken.

Fig. 2. Payoff Matrix for each generation in Prisoners' Dilemma

In the inter-generation payoff table, there exist cells that have multiple values; these have been reduced to one value based on their probabilities of occurrence. For example, if a cell has a 50% probability of having value 3 and a 50% probability of having value 5, the reduced value will be 4.

IV. PLATONIA DILEMMA

Douglas Hofstadter, in his book Metamagical Themas, introduces a game played among 20 people who have no means of communication with each other. The game is defined as follows: An eccentric trillionaire gathers 20 people together, and tells them that if one and only one of them sends him a telegram (reverse charges) by noon the next day, that person will receive a billion dollars. If he receives more than one telegram, or none at all, no one will get any money.

It may seem impossible to win this game, but there is a set of actions shared by all participating agents of the game which can lead to one person acquiring the billion dollars. The game has the Generations of Knowledge defined as follows:

• G_0 = 50% probability of sending a telegram and 50% probability of not sending a telegram.
• G_1 = Send a telegram anyway, as it increases your probability of attaining the billion dollars.
• G_2 = Roll a 20-sided die and only send a telegram if the outcome of the roll is 1 (as the expected number of telegrams sent is then exactly one).
• G_3 = Send a telegram anyway, as it increases your probability of attaining the billion dollars.
• G_{2i} = Roll a 20-sided die and only send a telegram if you roll a 1. (Pure Strategy Nash Equilibrium)
• G_{2i+1} = Send a telegram anyway, as it increases your probability of attaining the billion dollars.

After a couple of increments, the Generations of Knowledge start to toggle between rolling a die and sending the telegram anyway. This makes the model especially useful, as knowing the Generation of Knowledge of the other contestants can greatly increase the agent's chances.

The following table indicates the probability (in percentage) of winning the billion dollars based on the Generation of Knowledge that the agent and the opposing agents are on.

the Average". After dividing various actions by Generations of Knowledge, it can be noticed that selecting the action that leads to the Pure Strategy Nash Equilibrium may not be the solution if the other participating agents are at different generations of
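
For the Platonia Dilemma, the winning probabilities for different combinations of Generations of Knowledge can be worked out directly. The following is a minimal sketch, not the paper's own computation: it assumes that every agent decides independently, that all 19 opposing agents share the same generation, and that each generation maps to the send probability given in the generation list for this game (1/2 for the initial generation, 1 for odd generations, 1/20 for even generations). The names SEND_PROBABILITY, win_probability and exactly_one_probability are hypothetical and chosen for illustration only.

```python
from fractions import Fraction

N_PLAYERS = 20  # number of participants in the Platonia Dilemma as described

# Assumed mapping from generation to the probability of sending a telegram,
# read off the generation list for this game (labels are illustrative).
SEND_PROBABILITY = {
    "G0": Fraction(1, 2),   # random choice over the action space
    "G1": Fraction(1),      # odd generations: always send
    "G2": Fraction(1, 20),  # even generations: send only on a 1 from a 20-sided die
}


def win_probability(agent_gen, others_gen, n_players=N_PLAYERS):
    """Chance that the agent sends a telegram and nobody else does.

    Assumes independent decisions and that all other players share others_gen.
    """
    p_agent = SEND_PROBABILITY[agent_gen]
    p_other = SEND_PROBABILITY[others_gen]
    return p_agent * (1 - p_other) ** (n_players - 1)


def exactly_one_probability(gen, n_players=N_PLAYERS):
    """Chance that exactly one telegram is sent when every player is at gen."""
    p = SEND_PROBABILITY[gen]
    return n_players * p * (1 - p) ** (n_players - 1)


if __name__ == "__main__":
    generations = list(SEND_PROBABILITY)
    print("agent \\ others", generations)
    for agent_gen in generations:
        row = [
            f"{float(win_probability(agent_gen, others_gen)) * 100:.4f}%"
            for others_gen in generations
        ]
        print(agent_gen, row)
```

Under these assumptions, an agent that always sends a telegram while all opponents roll the die wins with probability (19/20)^19, roughly 37.7%, whereas an agent that also rolls the die wins with probability (1/20)(19/20)^19, roughly 1.9%. Exact fractions are used so that very small values, such as (1/2)^20 when everyone is at the initial generation, are not lost to rounding.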