Search in Games with Incomplete Information
Total Page:16
File Type:pdf, Size:1020Kb
Load more
										Recommended publications
									
								- 
												  Minimax TD-Learning with Neural Nets in a Markov GameMinimax TD-Learning with Neural Nets in a Markov Game Fredrik A. Dahl and Ole Martin Halck Norwegian Defence Research Establishment (FFI) P.O. Box 25, NO-2027 Kjeller, Norway {Fredrik-A.Dahl, Ole-Martin.Halck}@ffi.no Abstract. A minimax version of temporal difference learning (minimax TD- learning) is given, similar to minimax Q-learning. The algorithm is used to train a neural net to play Campaign, a two-player zero-sum game with imperfect information of the Markov game class. Two different evaluation criteria for evaluating game-playing agents are used, and their relation to game theory is shown. Also practical aspects of linear programming and fictitious play used for solving matrix games are discussed. 1 Introduction An important challenge to artificial intelligence (AI) in general, and machine learning in particular, is the development of agents that handle uncertainty in a rational way. This is particularly true when the uncertainty is connected with the behavior of other agents. Game theory is the branch of mathematics that deals with these problems, and indeed games have always been an important arena for testing and developing AI. However, almost all of this effort has gone into deterministic games like chess, go, othello and checkers. Although these are complex problem domains, uncertainty is not their major challenge. With the successful application of temporal difference learning, as defined by Sutton [1], to the dice game of backgammon by Tesauro [2], random games were included as a standard testing ground. But even backgammon features perfect information, which implies that both players always have the same information about the state of the game.
- 
												  De Fina, A. 2007. Code Switching and Ethnicity in a Community of PracticeGeorgetown University Institutional Repository http://www.library.georgetown.edu/digitalgeorgetown De Fina, A. 2007. Code switching and ethnicity in a community of practice. Language in Society 36 (3):371-392. http://dx.doi.org/10.1017/S0047404507070182 Collection Permanent Link: http://hdl.handle.net/10822/559318 © 2007 Cambridge University Press This material is made available online with the permission of the author, and in accordance with publisher policies. No further reproduction or distribution of this copy is permitted by electronic transmission or any other means. Language in Society 36, 371–392. Printed in the United States of America DOI: 10.10170S0047404507070182 Code-switching and the construction of ethnic identity in a community of practice ANNA DE FINA Georgetown University Italian Department ICC Building 307 J 37 and O Streets NW Washington D.C. 20057 [email protected] ABSTRACT In the past twenty years the existence of a sense of ethnic belonging among immigrant groups of European ancestry in the United States has become the focus of frequent debates and polemics. This article argues that ethnicity can- not be understood if it is abstracted from concrete social practices, and that analyses of this construct need to be based on ethnographic observation and on the study of actual talk in interaction. This interactionally oriented per- spective is taken to present an analysis of how Italian ethnicity is constructed as a central element in the collective identity of an all-male card playing club. Linguistic strategies, particularly code-switching, are central in this con- struction, but their role becomes apparent only when language use is ana- lyzed within significant practices in the life of the club.
- 
												  The Penguin Book of Card GamesPENGUIN BOOKS The Penguin Book of Card Games A former language-teacher and technical journalist, David Parlett began freelancing in 1975 as a games inventor and author of books on games, a field in which he has built up an impressive international reputation. He is an accredited consultant on gaming terminology to the Oxford English Dictionary and regularly advises on the staging of card games in films and television productions. His many books include The Oxford History of Board Games, The Oxford History of Card Games, The Penguin Book of Word Games, The Penguin Book of Card Games and the The Penguin Book of Patience. His board game Hare and Tortoise has been in print since 1974, was the first ever winner of the prestigious German Game of the Year Award in 1979, and has recently appeared in a new edition. His website at http://www.davpar.com is a rich source of information about games and other interests. David Parlett is a native of south London, where he still resides with his wife Barbara. The Penguin Book of Card Games David Parlett PENGUIN BOOKS PENGUIN BOOKS Published by the Penguin Group Penguin Books Ltd, 80 Strand, London WC2R 0RL, England Penguin Group (USA) Inc., 375 Hudson Street, New York, New York 10014, USA Penguin Group (Canada), 90 Eglinton Avenue East, Suite 700, Toronto, Ontario, Canada M4P 2Y3 (a division of Pearson Penguin Canada Inc.) Penguin Ireland, 25 St Stephen’s Green, Dublin 2, Ireland (a division of Penguin Books Ltd) Penguin Group (Australia) Ltd, 250 Camberwell Road, Camberwell, Victoria 3124, Australia
- 
												  Bayesian Games Professors Greenwald 2018-01-31Bayesian Games Professors Greenwald 2018-01-31 We describe incomplete-information, or Bayesian, normal-form games (formally; no examples), and corresponding equilibrium concepts. 1 A Bayesian Model of Interaction A Bayesian, or incomplete information, game is a generalization of a complete-information game. Recall that in a complete-information game, the game is assumed to be common knowledge. In a Bayesian game, in addition to what is common knowledge, players may have private information. This private information is captured by the notion of an epistemic type, which describes a player’s knowledge. The Bayesian-game formalism makes two simplifying assumptions: • Any information that is privy to any of the players pertains only to utilities. In all realizations of a Bayesian game, the number of players and their actions are identical. • Players maintain beliefs about the game (i.e., about utilities) in the form of a probability distribution over types. Prior to receiving any private information, this probability distribution is common knowledge: i.e., it is a common prior. After receiving private information, players conditional on this information to update their beliefs. As a consequence of the com- mon prior assumption, any differences in beliefs can be attributed entirely to differences in information. Rational players are again assumed to maximize their utility. But further, they are assumed to update their beliefs when they obtain new information via Bayes’ rule. Thus, in a Bayesian game, in addition to players, actions, and utilities, there is a type space T = ∏i2[n] Ti, where Ti is the type space of player i.There is also a common prior F, which is a probability distribution over type profiles.
- 
												  Outline for Static Games of Complete Information IOutline for Static Games of Complete Information I. Definition of a game II. Examples III. Definition of Nash equilibrium IV. Examples, continued V. Iterated elimination of dominated strategies VI. Mixed strategies VII. Existence theorem on Nash equilibria VIII. The Hotelling model and extensions Copyright © 2004 by Lawrence M. Ausubel Definition: An n-player, static game of complete information consists of an n-tuple of strategy sets and an n-tuple of payoff functions, denoted by G = {S1, … , Sn; u1, … , un} Si, the strategy set of player i, is the set of all permissible moves for player i. We write si ∈ Si for one of player i’s strategies. ui, the payoff function of player i, is the utility, profit, etc. for player i, and depends on the strategies chosen by all the players: ui(s1, … , sn). Example: Prisoners’ Dilemma Prisoner II Remain Confess Silent Remain –1 , –1 –5 , 0 Prisoner Silent I Confess 0, –5 –4 ,–4 Example: Battle of the Sexes F Boxing Ballet Boxing 2, 1 0, 0 M Ballet 0, 0 1, 2 Definition: A Nash equilibrium of G (in pure strategies) consists of a strategy for every player with the property that no player can improve her payoff by unilaterally deviating: (s1*, … , sn*) with the property that, for every player i: ui (s1*, … , si–1*, si*, si+1*, … , sn*) ≥ ui (s1*, … , si–1*, si, si+1*, … , sn*) for all si ∈ Si. Equivalently, a Nash equilibrium is a mutual best response. That is, for every player i, si* is a solution to: ***** susssssiiiiin∈arg max{ (111 , ... ,−+ , , , ... , )} sSii∈ Example: Prisoners’ Dilemma Prisoner II
- 
												  Efficiency and Welfare in Economies with Incomplete Information∗Efficiency and Welfare in Economies with Incomplete Information∗ George-Marios Angeletos Alessandro Pavan MIT, NBER, and FRB of Minneapolis Northwestern University September 2005. Abstract This paper examines a class of economies with externalities, strategic complementarity or substitutability, and incomplete information. We first characterize efficient allocations and com- pare them to equilibrium. We show how the optimal degree of coordination and the efficient use of information depend on the primitives of the environment, and how this relates to the welfare losses due to volatility and heterogeneity. We next examine the social value of informa- tion in equilibrium. When the equilibrium is efficient, welfare increases with the transparency of information if and only if agents’ actions are strategic complements. When the equilibrium is inefficient, additional effects are introduced by the interaction of externalities and informa- tion. We conclude with a few applications, including investment complementarities, inefficient fluctuations, and market competition. Keywords: Social value of information, coordination, higher-order beliefs, externalities, transparency. ∗An earlier version of this paper was entitled “Social Value of Information and Coordination.” For useful com- ments we thank Daron Acemoglu, Gadi Barlevi, Olivier Blanchard, Marco Bassetto, Christian Hellwig, Kiminori Matsuyama, Stephen Morris, Thomas Sargent, and especially Ivàn Werning. Email Addresses: [email protected], [email protected]. 1Introduction Is coordination socially desirable? Do markets use available information efficiently? Is the private collection of information good from a social perspective? What are the welfare effects of the public information disseminated by prices, market experts, or the media? Should central banks and policy makers disclose the information they collect and the forecasts they make about the economy in a transparent and timely fashion, or is there room for “constructive ambiguity”? These questions are non-trivial.
- 
												  Lecture 3 1 Introduction to Game Theory 2 Games of CompleteCSCI699: Topics in Learning and Game Theory Lecture 3 Lecturer: Shaddin Dughmi Scribes: Brendan Avent, Cheng Cheng 1 Introduction to Game Theory Game theory is the mathematical study of interaction among rational decision makers. The goal of game theory is to predict how agents behave in a game. For instance, poker, chess, and rock-paper-scissors are all forms of widely studied games. To formally define the concepts in game theory, we use Bayesian Decision Theory. Explicitly: • Ω is the set of future states. For example, in rock-paper-scissors, the future states can be 0 for a tie, 1 for a win, and -1 for a loss. • A is the set of possible actions. For example, the hand form of rock, paper, scissors in the game of rock-paper-scissors. • For each a ∈ A, there is a distribution x(a) ∈ Ω for which an agent believes he will receive ω ∼ x(a) if he takes action a. • A rational agent will choose an action according to Expected Utility theory; that is, each agent has their own utility function u :Ω → R, and chooses an action a∗ ∈ A that maximizes the expected utility . ∗ – Formally, a ∈ arg max E [u(ω)]. a∈A ω∼x(a) – If there are multiple actions that yield the same maximized expected utility, the agent may randomly choose among them. 2 Games of Complete Information 2.1 Normal Form Games In games of complete information, players act simultaneously and each player’s utility is determined by his actions as well as other players’ actions. The payoff structure of 1 2 GAMES OF COMPLETE INFORMATION 2 the game (i.e., the map from action profiles to utility vectors) is common knowledge to all players in the game.
- 
												  Quantum Bidding in Bridge. Quantum bidding in Bridge Sadiq Muhammad,1 Armin Tavakoli,1 Maciej Kurant,2 Marcin Paw lowski,3, 4 Marek Zukowski,˙ 3 and Mohamed Bourennane1 1Department of Physics, Stockholm University, S-10691, Stockholm, Sweden 2Department of Information Technology and Electrical Engineering, ETH Zurich, Switzerland 3Instytut Fizyki Teoretycznej i Astrofizyki, Uniwersytet Gda´nski, PL-80-952 Gda´nsk, Poland 4Department of Mathematics, University of Bristol, Bristol BS8 1TW, United Kingdom (Dated: March 19, 2014) Quantum methods allow to reduce communication complexity of some computational tasks, with several separated partners, beyond classical constraints. Nevertheless, experimental demonstrations of this fact are thus far limited to some abstract problems, far away from real-life tasks. We show here, and demonstrate experimentally, that the power of reduction of communication complexity can be harnessed to gain advantage in famous, immensely popular, card game - Bridge. The essence of a winning strategy in Bridge is efficient communication between the partners. The rules of the game allow only specific form of communication, of a very low complexity (effectively one has a strong limitations on number of exchanged bits). Surprisingly, our quantum technique is not violating the existing rules of the game (as there is no increase in information flow). We show that our quantum Bridge auction corresponds to a biased nonlocal Clauser-Horne-Shimony-Holt (CHSH) game, which is equivalent to a 2 → 1 quantum random access code. Thus our experiment is also a realization of such protocols. However, this correspondence is not full which enables the Bridge players to have efficient strategies regardless of the quality of their detectors.
- 
												  Bayesian Games with IntentionsBayesian Games with Intentions Adam Bjorndahl Joseph Y. Halpern Rafael Pass Carnegie Mellon University Cornell University Cornell University Philosophy Computer Science Computer Science Pittsburgh, PA 15213 Ithaca, NY 14853 Ithaca, NY 14853 [email protected] [email protected] [email protected] We show that standard Bayesian games cannot represent the full spectrum of belief-dependentprefer- ences. However, by introducing a fundamental distinction between intended and actual strategies, we remove this limitation. We define Bayesian games with intentions, generalizing both Bayesian games and psychological games [5], and prove that Nash equilibria in psychological games correspond to a special class of equilibria as defined in our setting. 1 Introduction Type spaces were introduced by John Harsanyi [6] as a formal mechanism for modeling games of in- complete information where there is uncertainty about players’ payoff functions. Broadly speaking, types are taken to encode payoff-relevant information, a typical example being how each participant val- ues the items in an auction. An important feature of this formalism is that types also encode beliefs about types. Thus, a type encodes not only a player’s beliefs about other players’ payoff functions, but a whole belief hierarchy: a player’s beliefs about other players’ beliefs, their beliefs about other players’ beliefs about other players’ beliefs, and so on. This latter point has been enthusiastically embraced by the epistemic game theory community, where type spaces have been co-opted for the analysis of games of complete information. In this context, types encode beliefs about the strategies used by other players in the game as well as their types.
- 
												  The Cadet Barracks, the Walls of the Lower Floors Is Almost a Country Has About Hit Bottom Brown Is the Best Man in the Class 10THE PUBLISHECADED WEEKLY BY T THE CORPS OF CADETS VIRGINIA MILITARY INSTITUTE Vol. XXIV LEXINGTON, VIRGINIA, MONDAY, FEB. 2, 1931 No. 15 CORPS MADE INTO REGIMENT TWO CAVALRY AND ONE INFANTRY COMPANIES FORM Corps Receives Gratifying Report Highway Conference FIRST BATTALION ARTILLERY THE SECOND on Inspection By Col. Leavitt to Meet Here Dayhuff Is Made Regimental Commander; Brower and Browning Discussions Designed To Aid Command Battalions. ALL PHASES OF CADET MILITARY ACTIVITIES ARE Counties In Highway VIEWED WITH SATISFACTION Problems. REGIMENTAL COMMANDER TO WEAR SIX BARS ON The Sixth Annual Highway Con- CHEVRONS; BATTALION COMMANDER FIVE Appearance of Corps Especially commended In Report By Corps ference will be held at the Virginia Area Inspector. First Class Privates Put On O. Military Institute, February 5 and 6,; (^IfJSS Attends G. Roster. MARKED IMPROVEMENT 1931. The conference is to be spon- sored by V. M. I. in connection with That eagerly awaited and much OVER LAST YEAR Dave Miller to Coach Fancy Dress Ball discussed event, makeovers, occurred the board of supervisors of Rock- Friday night. Following the prece- On Dec. 4 and 5 Colonel Leavitt bridge county and the United States FIRST CLASS • HOP SATUR dent established last January, the the R 0. T. C. officer of the Third Rat Athletics corps was organized into a regiment Corps'area, visited V. M. I. for the Bureau of public roads. DAY; BIG SUCCESS TO BE HEAlTOF FOOTBALL, of two battai'ons of three companies purpose of making his annual corps The purpose of this conference is On Friday night a great number Gach.
- 
												  Part II Dynamic Games of Complete InformationPart II Dynamic Games of Complete Information 83 84 This is page 8 Printer: Opaq 10 Preliminaries As we have seen, the normal form representation is a very general way of putting a formal structure on strategic situations, thus allowing us to analyze the game and reach some conclusions about what will result from the particular situation at hand. However, one obvious drawback of the normal form is its difficulty in capturing time. That is, there is a sense in which players’ strategy sets correspond to what they can do, and how the combination of their actions affect each others payoffs, but how is the order of moves captured? More important, if there is a well defined order of moves, will this have an effect on what we would label as a reasonable prediction of our model? Example: Sequencing the Cournot Game: Stackelberg Competition Consider our familiar Cournot game with demand P = 100 − q, q = q1 + q2,and ci(qi)=0for i ∈{1, 2}. We have already solved for the best response choices of each firm, by solving their maximum profit function when each takes the other’s quantity as fixed, that is, taking qj as fixed, each i solves: max (100 − qi − qj)qi , qi 86 10. Preliminaries and the best response is then, 100 − q q = j . i 2 Now lets change a small detail of the game, and assume that first, firm 1 will choose q1, and before firm 2 makes its choice of q2 it will observe the choice made by firm 1. From the fact that firm 2 maximizes its profit when q1 is already known, it should be clear that firm 2 will follow its best response function, since it not only has a belief, but this belief must be correct due to the observation of q1.
- 
												  Environmental Health Criteria 242INTERNATIONAL PROGRAMME ON CHEMICAL SAFETY Environmental Health Criteria 242 DERMAL EXPOSURE IOMC INTER-ORGANIZATION PROGRAMME FOR THE SOUND MANAGEMENT OF CHEMICALS A cooperative agreement among FAO, ILO, UNDP, UNEP, UNIDO, UNITAR, WHO, World Bank and OECD This report contains the collective views of an international group of experts and does not necessarily represent the decisions or the stated policy of the World Health Organization The International Programme on Chemical Safety (IPCS) was established in 1980. The overall objec- tives of the IPCS are to establish the scientific basis for assessment of the risk to human health and the environment from exposure to chemicals, through international peer review processes, as a prerequi- site for the promotion of chemical safety, and to provide technical assistance in strengthening national capacities for the sound management of chemicals. This publication was developed in the IOMC context. The contents do not necessarily reflect the views or stated policies of individual IOMC Participating Organizations. The Inter-Organization Programme for the Sound Management of Chemicals (IOMC) was established in 1995 following recommendations made by the 1992 UN Conference on Environment and Development to strengthen cooperation and increase international coordination in the field of chemical safety. The Participating Organizations are: FAO, ILO, UNDP, UNEP, UNIDO, UNITAR, WHO, World Bank and OECD. The purpose of the IOMC is to promote coordination of the policies and activities pursued by the Participating Organizations, jointly or separately, to achieve the sound management of chemicals in relation to human health and the environment. WHO Library Cataloguing-in-Publication Data Dermal exposure. (Environmental health criteria ; 242) 1.Hazardous Substances - poisoning.