Mechanism Design and Analysis Using Simulation-Based Game Models


Mechanism Design and Analysis Using Simulation-Based Game Models

by Yevgeniy Vorobeychik

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Science and Engineering) in The University of Michigan, 2008.

Doctoral Committee:
Professor Michael P. Wellman, Chair
Professor Edmund H. Durfee
Associate Professor Satinder Singh Baveja
Associate Professor Emre Ozdenoren

© Yevgeniy Vorobeychik 2008. All Rights Reserved.

DEDICATION

I dedicate this work to my wife, Polina, and my daughter, Avital.

ACKNOWLEDGMENTS

There is little doubt that my dissertation would never have been completed had it not been for the support of a great number of people, including family, friends, and faculty mentors. I am very grateful to my thesis advisor, Michael Wellman, for having guided me along these past six years with timely advice, encouragement, and invaluable ideas that have had a direct and profound influence on my work. I also want to thank my committee members, Satinder Singh, Ed Durfee, and Emre Ozdenoren. My numerous adventures into each of their offices have invariably left me rejuvenated and brimming with new ideas.

All of my published work to date is co-authored, and I am grateful to all of the co-contributors. Prominent among these are Chris Kiekintveld, my perennial officemate (before and after the move to our wonderful new building), who has been both a great colleague and a great friend; Daniel Reeves, who has been not only a colleague and a friend, but also a mentor, particularly in my days as a fledgling researcher; and Patrick Jordan, whose intellectual insight is often sharp and always invaluable. I am grateful to my colleagues at Yahoo! Research, Jenn Wortman, John Langford, Lihong Li, Yiling Chen, and David Pennock, for the fantastic time I spent working with them during my internship, and to Isaac Porche for his tutelage during my summer tenure at RAND Corporation.

Finally, I am grateful to my family for having supported me in my lengthy doctoral endeavor, particularly to my wife, Polina, who has been with me through the highs and the lows, and who gave me a most precious gift: a beautiful daughter, Avital.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES
ABSTRACT

CHAPTER
1 Introduction
  1.1 A High-Level Model
  1.2 Contributions
  1.3 Overview of Thesis
2 Game-Theoretic Preliminaries
  2.1 Games in Normal Form
    2.1.1 Notation
    2.1.2 Basic Normal-Form Solution Concepts
  2.2 Games of Incomplete Information and Auctions
  2.3 Games in Extensive Form
    2.3.1 A Formal Description
    2.3.2 Infinitely Repeated Games with Complete and Perfect Information

Part I: Will the Optimal Mechanism Please Stand Up?

3 Mechanism Design: Classical and Computational Approaches
  3.1 Social Choice Theory
  3.2 Mechanism Design Under Incomplete Information
    3.2.1 Setup
    3.2.2 Implementation in Dominant Strategies
    3.2.3 Implementation in Bayes-Nash Equilibrium
    3.2.4 Participation Constraints
    3.2.5 Bilateral Trade
    3.2.6 Weak vs. Strong Implementation
  3.3 Implementation vs. Optimality
  3.4 Auction Design
    3.4.1 Single Item Auctions
    3.4.2 Multi-Unit Auctions
    3.4.3 Combinatorial Auctions
  3.5 Computational Approaches to Mechanism Design
    3.5.1 Automated Mechanism Design (AMD)
    3.5.2 Partial Revelation of Preferences
    3.5.3 Incremental Mechanism Design
    3.5.4 Evolutionary Approaches
    3.5.5 The Metalearning Approach
4 Simulation-Based Mechanism Design
  4.1 Trading Agent Competition and the Supply-Chain Game
    4.1.1 The Story of Day-0 Procurement in TAC/SCM
    4.1.2 The TAC/SCM Design Problem
  4.2 What is Simulation-Based Mechanism Design?
  4.3 A General Framework for Simulation-Based Mechanism Design
  4.4 Formalization of the TAC/SCM Design Problem
  4.5 Simulation-Based Design Analysis
    4.5.1 Estimating Nash Equilibria
    4.5.2 Data Generation
    4.5.3 Results
  4.6 "Threshold" Design Problems
  4.7 General Objective Functions
    4.7.1 Learning the Objective Function
    4.7.2 Stochastic Search
    4.7.3 Constraints
  4.8 Solving Games Induced by Mechanism Choices
  4.9 Sensitivity Analysis
  4.10 Probabilistic Analysis in TAC/SCM
  4.11 Comparative Statics
  4.12 Conclusion
5 A General Framework for Computational Mechanism Design on Constrained Design Spaces
  5.1 Automated Mechanism Design for Bayesian Games
    5.1.1 Designer's Optimization Problem
    5.1.2 Computing Nash Equilibria
    5.1.3 Dealing with Constraints
    5.1.4 Evaluating the Objective
  5.2 Extended Example: Shared-Good Auction (SGA)
    5.2.1 Setup
    5.2.2 Automated Design Problems
  5.3 Applications
    5.3.1 Myerson Auctions
    5.3.2 Vicious Auctions
  5.4 Conclusion

Part II: One Nash, Two Nash, Red Nash, Blue Nash

6 Related Work on Computational Game Theory
  6.1 Complexity of Computing Nash Equilibria
  6.2 Solving Finite Two-Person Games
    6.2.1 Zero-Sum Games
    6.2.2 General Two-Person Games with Complete Information
  6.3 Solving General Games
    6.3.1 Simplicial Subdivision
    6.3.2 The Liapunov Minimization Method
    6.3.3 The Govindan-Wilson Algorithm
    6.3.4 Logit Equilibrium Path Following Method
    6.3.5 Search in the Pure Strategy Support Space
  6.4 "Learning" Algorithms
    6.4.1 Iterative Best Response
    6.4.2 Fictitious Play
    6.4.3 Replicator Dynamics for Symmetric Games
  6.5 Solving Games of Incomplete Information
    6.5.1 Restricted Strategy Spaces and Convergence
    6.5.2 Empirical Game Theory
  6.6 Compact Representations of Games
    6.6.1 Graphical Games
    6.6.2 Multi-Agent Influence Diagrams
    6.6.3 Action-Graph Games
7 Simulation-Based Game Modeling: Formal Notation and Finite-Game Methods
  7.1 Simulation-Based and Empirical Games
  7.2 A General Procedure for Solving Simulation-Based Games
  7.3 Simulation Control
  7.4 Approximating Nash Equilibria in Empirical Games
    7.4.1 Nash Equilibrium Approximation Metrics
    7.4.2 Estimating Nash Equilibria in (Small) Finite Games
    7.4.3 Estimating Nash Equilibria in Large and Infinite Games
  7.5 Applications to TAC/SCM Analysis
    7.5.1 Analysis of Day-0 Procurement in TAC/SCM 2003
    7.5.2 Analysis of Day-0 Procurement in TAC/SCM 2004
  7.6 Consistency Results About Nash Equilibria in Empirical Games
  7.7 Probabilistic Bounds on Approximation Quality
    7.7.1 Distribution-Free Bounds
    7.7.2 Confidence Bounds for Finite Games with Normal Noise
    7.7.3 Confidence Bounds for Infinite Games That Use Finite Game Approximations
    7.7.4 Experimental Evaluation of Bounds
  7.8 Maximum Likelihood Equilibrium Estimation
  7.9 Comparative Statics
  7.10 Conclusion
8 Learning Payoff Functions in Infinite Games
  8.1 Payoff Function Approximation
    8.1.1 Problem Definition
    8.1.2 Polynomial Regression
    8.1.3 Local Regression
    8.1.4 Support Vector Machine Regression
    8.1.5 Finding Mixed-Strategy Equilibria
    8.1.6 Strategy Aggregation
  8.2 First-Price Sealed-Bid Auction
  8.3 Market-Based Scheduling Game
  8.4 Using in Model Selection
  8.5 Using as the Approximation Target
  8.6 Future Work
    8.6.1 Active Learning
    8.6.2 Learning Bayes-Nash Equilibria
  8.7 Conclusion
9 Stochastic Search Methods for Approximating Equilibria in Infinite Games
  9.1 Best Response Approximation
    9.1.1 Continuous Stochastic Search for Black-Box Optimization
    9.1.2 Globally Convergent Best Response Approximation
  9.2 Nash Equilibrium Approximation
    9.2.1 Equilibrium Approximation via Iterated Best Response
    9.2.2 A Globally Convergent Algorithm for Equilibrium Approximation
  9.3 Examples
    9.3.1 Mixed Strategy Equilibria in Finite Games
    9.3.2 Symmetric Games
  9.4 Infinite Games of Incomplete Information
    9.4.1 Best Response Approximation
    9.4.2 Strategy Set Restrictions
  9.5 Experimental Evaluation of Best Response Quality
    9.5.1 Experimental Setup
    9.5.2 Two-Player One-Item Auctions
    9.5.3 Five-Player One-Item Auctions
    9.5.4 Sampling and Iteration Efficiency
  9.6 Experimental Evaluation of Equilibrium Quality
    9.6.1 Experimental Setup
    9.6.2 Two-Player One-Item Auctions
    9.6.3 Five-Player One-Item Auctions
    9.6.4 Two-Player Two-Item First-Price Combinatorial Auction
  9.7 Experimental Comparison of Equilibrium Approximation Methods
  9.8 Conclusion
Recommended publications
  • Implementation Theory (California Institute of Technology)
    Implementation Theory. Thomas R. Palfrey, Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California 91125. Social Science Working Paper 912, September 1995. Abstract: This paper surveys the branch of implementation theory initiated by Maskin (1977). Results for both complete and incomplete information environments are covered. JEL classification numbers: 025, 026. Key words: Implementation Theory, Mechanism Design, Game Theory, Social Choice. Introduction: Implementation theory is an area of research in economic theory that rigorously investigates the correspondence between normative goals and institutions designed to achieve (implement) those goals. More precisely, given a normative goal or welfare criterion for a particular class of allocation problems (or domain of environments), it formally characterizes organizational mechanisms that will guarantee outcomes consistent with that goal, assuming the outcomes of any such mechanism arise from some specification of equilibrium behavior. The approaches to this problem to date lie in the general domain of game theory because, as a matter of definition in the implementation theory literature, an institution is modelled as a mechanism, which is essentially a non-cooperative game. Moreover, the specific models of equilibrium behavior
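    The notion of implementation sketched in this excerpt can be stated compactly. The following is a generic formulation in illustrative notation, not taken from Palfrey's survey:

```latex
% A mechanism \Gamma = (M, g), with message profiles M and outcome function
% g : M \to A, weakly implements a social choice correspondence F under a
% solution concept E (e.g., Nash equilibrium) if, for every environment
% \theta \in \Theta,
\[
  \emptyset \;\neq\; g\bigl(E(\Gamma,\theta)\bigr) \;\subseteq\; F(\theta),
\]
% i.e., equilibria exist and every equilibrium outcome is prescribed by the
% welfare criterion F; full (strong) implementation strengthens the
% inclusion to equality.
```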
  • Game Theory 2021
    More Mechanism Design. Game Theory 2021, Ulle Endriss, Institute for Logic, Language and Computation, University of Amsterdam. Plan for Today: In this second lecture on mechanism design we are going to generalise beyond the basic scenario of auctions, as much as we can manage: the Revelation Principle (we can focus on direct-revelation mechanisms); a formal model of direct-revelation mechanisms with money; incentive compatibility of the Vickrey-Clarke-Groves mechanism; other properties of VCG for the special case of combinatorial auctions; and the impossibility of achieving incentive compatibility more generally. Much of this is also (somewhat differently) covered by Nisan (2007). [N. Nisan. Introduction to Mechanism Design (for Computer Scientists). In N. Nisan et al. (eds.), Algorithmic Game Theory. Cambridge University Press, 2007.] Reminder: Last time we saw four auction mechanisms for selling a single item: English, Dutch, first-price sealed-bid, Vickrey. The Vickrey auction was particularly interesting: each bidder submits a bid in a sealed envelope; the bidder with the highest bid wins, but pays the price of the second highest bid (unless it's below the reservation price). It is a direct-revelation mechanism (unlike English and Dutch auctions) and it is incentive-compatible, i.e., truth-telling is a dominant strategy (unlike for Dutch and FPSB auctions). The Revelation Principle: Any outcome that is implementable in dominant strategies via some mechanism can also be implemented by means of a direct-revelation mechanism making truth-telling a dominant strategy.
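    Since the excerpt turns on how the Vickrey auction picks the winner and the price, here is a minimal sketch of that rule (illustrative code, not part of the lecture notes; the data format and tie-breaking are assumptions):

```python
def vickrey_auction(bids, reserve=0.0):
    """Single-item second-price (Vickrey) sealed-bid auction.

    bids: dict mapping bidder id -> bid amount.
    Returns (winner, price), or (None, None) if no bid meets the reserve.
    """
    eligible = {bidder: bid for bidder, bid in bids.items() if bid >= reserve}
    if not eligible:
        return None, None
    winner = max(eligible, key=eligible.get)
    other_bids = [bid for bidder, bid in eligible.items() if bidder != winner]
    # The winner pays the second-highest bid, but never less than the reserve.
    price = max(other_bids) if other_bids else reserve
    return winner, price

# Example: bids of 10, 7, and 3 with no reserve -> bidder "a" wins and pays 7.
print(vickrey_auction({"a": 10.0, "b": 7.0, "c": 3.0}))
```

    Truth-telling is dominant here because a bidder's own bid only determines whether she wins, never the price she pays.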
  • Adversarial Search (Game)
    Adversarial Search. CE417: Introduction to Artificial Intelligence, Sharif University of Technology, Spring 2019, Soleymani. "Artificial Intelligence: A Modern Approach", 3rd Edition, Chapter 5. Most slides have been adapted from Klein and Abbeel, CS188, UC Berkeley. Outline: game as a search problem; the minimax algorithm; α-β pruning (ignoring a portion of the search tree); the time limit problem; cut-off and evaluation functions. Games as search problems: games are adversarial search problems (goals are in conflict) in competitive multi-agent environments; games in AI are a specialized kind of games (in the game-theory sense). Types of games: many different kinds of games, along several axes: deterministic or stochastic? one, two, or more players? zero-sum? perfect information (can you see the state)? We want algorithms for calculating a strategy (policy) which recommends a move from each state. Zero-sum games: agents have opposite utilities (values on outcomes), which lets us think of a single value that one player maximizes and the other minimizes; adversarial, pure competition. General games: agents have independent utilities (values on outcomes); cooperation, indifference, competition, and more are all possible (more later on non-zero-sum games). Primary assumptions: we start with games that are two-player, turn-taking (agents act alternately), zero-sum (agents' goals are in conflict: the sum of utility values at the end of the game is zero or constant), and perfect information (fully observable). Examples: Tic-tac-toe,
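    The minimax algorithm named in this outline can be sketched in a few lines. This is an illustrative implementation against an assumed game interface (successors/utility), not the course's code:

```python
def minimax(state, is_max_turn, successors, utility):
    """Exact minimax value of `state` in a two-player, zero-sum,
    perfect-information game.

    successors(state) -> list of successor states (empty at terminal states).
    utility(state)    -> payoff to the maximizing player at a terminal state.
    """
    children = successors(state)
    if not children:                      # terminal position
        return utility(state)
    values = [minimax(child, not is_max_turn, successors, utility)
              for child in children]
    # The player to move picks the value best for her.
    return max(values) if is_max_turn else min(values)
```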
  • Best-First Minimax Search (Richard E. Korf and David Maxwell Chickering)
    Artificial Intelligence 84 (1996) 299-337. Best-first minimax search. Richard E. Korf and David Maxwell Chickering, Computer Science Department, University of California, Los Angeles, CA 90024, USA. Received September 1994; revised May 1995. Abstract: We describe a very simple selective search algorithm for two-player games, called best-first minimax. It always expands next the node at the end of the expected line of play, which determines the minimax value of the root. It uses the same information as alpha-beta minimax, and takes roughly the same time per node generation. We present an implementation of the algorithm that reduces its space complexity from exponential to linear in the search depth, but at significant time cost. Our actual implementation saves the subtree generated for one move that is still relevant after the player and opponent move, pruning subtrees below moves not chosen by either player. We also show how to efficiently generate a class of incremental random game trees. On uniform random game trees, best-first minimax outperforms alpha-beta when both algorithms are given the same amount of computation. On random trees with random branching factors, best-first outperforms alpha-beta for shallow depths, but eventually loses at greater depths. We obtain similar results in the game of Othello. Finally, we present a hybrid best-first extension algorithm that combines alpha-beta with best-first minimax, and performs significantly better than alpha-beta in both domains, even at greater depths. In Othello, it beats alpha-beta in two out of three games. 1. Introduction and overview: The best chess machines, such as Deep-Blue [10], are competitive with the best humans, but generate billions of positions per move.
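    The expansion rule the abstract describes (always grow the leaf at the end of the expected line of play) can be sketched as follows. This is an illustrative reconstruction of the idea, not the authors' implementation; `expand` and `evaluate` are assumed interfaces:

```python
class Node:
    """Game-tree node; `value` is a static evaluation until children exist,
    after which it holds the backed-up minimax value."""
    def __init__(self, state, is_max, value):
        self.state = state
        self.is_max = is_max          # True if the maximizing player moves here
        self.value = value
        self.children = []

def best_first_minimax(root, expand, evaluate, budget):
    """Repeatedly expand the leaf at the end of the current principal
    variation, i.e., the line of play that determines the root's value.

    expand(state)   -> successor states (empty if terminal).
    evaluate(state) -> static value from the max player's viewpoint.
    """
    for _ in range(budget):
        # Walk down the expected line of play, remembering the path.
        path = [root]
        while path[-1].children:
            node = path[-1]
            pick = max if node.is_max else min
            path.append(pick(node.children, key=lambda c: c.value))
        leaf = path[-1]
        successors = expand(leaf.state)
        if not successors:            # terminal position: its value is exact
            break
        leaf.children = [Node(s, not leaf.is_max, evaluate(s)) for s in successors]
        # Back the new values up along the path toward the root.
        for node in reversed(path):
            if node.children:
                agg = max if node.is_max else min
                node.value = agg(c.value for c in node.children)
    # After the budget is spent, play toward the root's best child.
    pick = max if root.is_max else min
    return pick(root.children, key=lambda c: c.value) if root.children else None
```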
  • Designing Mechanisms to Make Welfare-Improving Strategies Focal∗
    Designing Mechanisms to Make Welfare-Improving Strategies Focal∗. Daniel E. Fragiadakis (Texas A&M University) and Peter Troyan (University of Virginia). Abstract: Many institutions use matching algorithms to make assignments. Examples include the allocation of doctors, students and military cadets to hospitals, schools and branches, respectively. Most of the market design literature either imposes strong incentive constraints (such as strategyproofness) or builds mechanisms that, while more efficient than strongly incentive compatible alternatives, require agents to play potentially complex equilibria for the efficiency gains to be realized in practice. Understanding that the effectiveness of welfare-improving strategies relies on the ease with which real-world participants can find them, we carefully design an algorithm that we hypothesize will make such strategies focal. Using a lab experiment, we show that agents do indeed play the intended focal strategies, and doing so yields higher overall welfare than a common alternative mechanism that gives agents dominant strategies of stating their preferences truthfully. Furthermore, we test a mechanism from the field that is similar to ours and find that while both yield comparable levels of average welfare, our algorithm performs significantly better on an individual level. Thus, we show that, if done carefully, this type of behavioral market design is a worthy endeavor that can most promisingly be pursued by a collaboration between institutions, matching theorists, and behavioral and experimental economists. JEL Classification: C78, C91, C92, D61, D63. Keywords: dictatorship, indifference, matching, welfare, efficiency, experiments. ∗We are deeply indebted to our graduate school advisors Fuhito Kojima, Muriel Niederle and Al Roth for countless conversations and encouragement while advising us on this project.
  • Exploding Offers with Experimental Consumer Goods∗
    Exploding Offers with Experimental Consumer Goods∗. Alexander L. Brown, Ajalavat Viriyavipart, and Xiaoyuan Wang (Department of Economics, Texas A&M University). August 29, 2014. Abstract: Recent theoretical research indicates that search deterrence strategies are generally optimal for sellers in consumer goods markets. Yet search deterrence is not always employed in such markets. To understand this incongruity, we develop an experimental market where profit-maximizing strategy dictates sellers should exercise one form of search deterrence, exploding offers. We find that buyers over-reject exploding offers relative to optimal. Sellers underutilize exploding offers relative to optimal play, even conditional on buyer over-rejection. This tendency dissipates when sellers make offers to computerized buyers, suggesting their persistent behavior with human buyers may be due to a preference rather than a miscalculation. Keywords: exploding offer, search deterrence, experimental economics, quantal response equilibrium. JEL: C91, D21, L10, M31. ∗We thank the Texas A&M Humanities and Social Science Enhancement of Research Capacity Program for providing generous financial support for our research. We have benefited from comments by Gary Charness, Catherine Eckel, Silvana Krasteva and seminar participants of the 2013 North American Economic Science Association and Southern Economic Association Meetings. Daniel Stephenson and J. Forrest Williams provided invaluable help in conducting the experimental sessions. 1. Introduction: One of the first applications of microeconomic theory is to the consumer goods market: basic supply and demand models can often explain transactions in centralized markets quite well. When markets are decentralized, often additional modeling is required to explain the interactions of consumers and producers.
  • Strategyproof Exchange with Multiple Private Endowments
    Strategyproof Exchange with Multiple Private Endowments. Taiki Todo, Haixin Sun, Makoto Yokoo. Dept. of Informatics, Kyushu University, Motooka 744, Fukuoka, Japan. Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence.
    Abstract: We study a mechanism design problem for exchange economies where each agent is initially endowed with a set of indivisible goods and side payments are not allowed. We assume each agent can withhold some endowments, as well as misreport her preference. Under this assumption, strategyproofness requires that for each agent, reporting her true preference with revealing all her endowments is a dominant strategy, and thus implies individual rationality. Our objective in this paper is to analyze the effect of such private ownership in exchange economies with multiple endowments. As fundamental results, we first show that the revelation principle holds under a natural assumption and that strategyproofness and Pareto efficiency are incompatible even under the lexicographic preference domain.
    From the introduction (the excerpt begins mid-sentence): ... efficient and individually rational. Moreover, it is the only rule that satisfies those three properties (Ma 1994). Due to these advantages, TTC has been attracting much attention from both economists and computer scientists. In this paper we consider exchange economies where each agent is endowed with a set of multiple goods, instead of a single good. Under that model, Pareto efficiency, individual rationality, and strategyproofness cannot be simultaneously satisfied (Sönmez 1999). Therefore, one main research direction on exchanges with multiple endowments is to achieve strategyproof rules that guarantee a limited notion of efficiency. In particular, Pápai (2003) defined a class of exchange rules and gave a characterization by strategyproofness and individual rationality with some other weak efficiency
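    For reference, the classic single-endowment Top Trading Cycles (TTC) rule this discussion builds on (the paper itself studies the multi-endowment generalisation) can be sketched as follows; the data format is an assumption for illustration, not the authors' code:

```python
def top_trading_cycles(preferences):
    """Classic Shapley-Scarf TTC with one indivisible good per agent.

    preferences[i] is agent i's ranking of goods (most preferred first),
    where good j is initially owned by agent j. Returns a dict mapping
    each agent to the good she receives.
    """
    remaining = set(range(len(preferences)))
    assignment = {}
    while remaining:
        # Each remaining agent points at the owner of her favourite remaining good.
        points_to = {i: next(g for g in preferences[i] if g in remaining)
                     for i in remaining}
        # Walk the pointer graph from any agent until a cycle is found.
        path, current = [], next(iter(remaining))
        while current not in path:
            path.append(current)
            current = points_to[current]
        cycle = path[path.index(current):]
        # Agents in the cycle trade along it and leave with their goods.
        for i in cycle:
            assignment[i] = points_to[i]
            remaining.remove(i)
    return assignment
```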
  • Proportionality and Strategyproofness in Multiwinner Elections
    Proportionality and Strategyproofness in Multiwinner Elections. Dominik Peters, University of Oxford, Oxford, UK. Session 43: Social Choice Theory 3, AAMAS 2018, July 10-15, 2018, Stockholm, Sweden.
    Abstract: Multiwinner voting rules can be used to select a fixed-size committee from a larger set of candidates. We consider approval-based committee rules, which allow voters to approve or disapprove candidates. In this setting, several voting rules such as Proportional Approval Voting (PAV) and Phragmén's rules have been shown to produce committees that are proportional, in the sense that they proportionally represent voters' preferences; all of these rules are strategically manipulable by voters. On the other hand, a generalisation of Approval Voting gives a non-proportional but strategyproof voting rule. We show that there is a fundamental tradeoff between these two properties: we prove that no multiwinner voting rule can simultaneously satisfy a weak form of proportionality (a weakening of justified representation) and a weak form of strategyproofness.
    From the introduction (the excerpt begins mid-sentence): ... focussed on cases where there are no parties, and preferences are expressed directly over the candidates [23]. The latter setting allows for applications in areas outside the political sphere, such as in group recommendation systems. To formalise the requirement of proportionality in this party-free setting, it is convenient to consider the case where input preferences are given as approval ballots: each voter reports a set of candidates that they find acceptable. Even for this simple setting, there is a rich variety of rules that exhibit different behaviour [31], and this setting gives rise to a rich variety of axioms. One natural way of selecting a committee of k candidates when given approval ballots is to extend Approval Voting (AV): for each of the m candidates, count how many voters approve them (their
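    The extension of Approval Voting mentioned at the end of the excerpt (score candidates by approval counts and take the top k) is easy to state in code. This is an illustrative sketch of that baseline rule, with arbitrary tie-breaking:

```python
from collections import Counter

def approval_voting_committee(ballots, k):
    """Multiwinner Approval Voting.

    ballots: iterable of sets, each containing the candidates one voter approves.
    Returns the k candidates with the highest approval counts
    (ties broken arbitrarily by the counter's ordering).
    """
    scores = Counter(candidate for ballot in ballots for candidate in ballot)
    return [candidate for candidate, _ in scores.most_common(k)]

# Example: three voters, committee of size 2 -> candidates "a" and "b" are selected.
print(approval_voting_committee([{"a", "b"}, {"a"}, {"b", "c"}], k=2))
```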
  • Adversarial Search (Game Playing)
    Artificial Intelligence: Adversarial Search (Game Playing), Chapter 5. Adapted from materials by Tim Finin, Marie desJardins, and Charles R. Dyer. Outline: game playing (state of the art and resources, framework); game trees (minimax, alpha-beta pruning, adding randomness). State of the art: how good are computer game players? Chess: Deep Blue beat Garry Kasparov in 1997; Garry Kasparov vs. Deep Junior (Feb 2003): tie! Kasparov vs. X3D Fritz (November 2003): tie! Checkers: Chinook (an AI program with a very large endgame database) is the world champion; checkers has been solved exactly, it's a draw! Go: computer players are decent, at best. Bridge: "expert" computer players exist (but no world champions yet!). Good place to learn more: http://www.cs.ualberta.ca/~games/. Chinook: Chinook is the World Man-Machine Checkers Champion, developed by researchers at the University of Alberta. It earned this title by competing in human tournaments, winning the right to play for the (human) world championship, and eventually defeating the best players in the world. Visit http://www.cs.ualberta.ca/~chinook/ to play a version of Chinook over the Internet. The developers have fully analyzed the game of checkers and have the complete game tree for it; perfect play on both sides results in a tie. "One Jump Ahead: Challenging Human Supremacy in Checkers", Jonathan Schaeffer, University of Alberta (496 pages, Springer, $34.95, 1998). [Slide: ratings of human and computer chess champions.] Typical case: 2-person game; players alternate moves; zero-sum: one player's loss is the other's gain; perfect information: both players have access to complete information about the state of the game.
  • Learning Near-Optimal Search in a Minimal Explore/Exploit Task
    Explore/exploit tradeoff strategies in a resource accumulation search task [revised/resubmitted to Cognitive Science]. Ke Sang (1), Peter M. Todd (1), Robert L. Goldstone (1), and Thomas T. Hills (2). (1) Cognitive Science Program and Department of Psychological and Brain Sciences, Indiana University Bloomington; (2) Department of Psychology, University of Warwick. Revision 7/5/2018. Contact information: Peter M. Todd, Indiana University, Cognitive Science Program, 1101 E. 10th Street, Bloomington, IN 47405 USA; phone: +1 812 855-3914; email: [email protected]. Abstract: How, and how well, do people switch between exploration and exploitation to search for and accumulate resources? We study the decision processes underlying such exploration/exploitation tradeoffs by using a novel card selection task. With experience, participants learn to switch appropriately between exploration and exploitation and approach optimal performance. We model participants' behavior on this task with random, threshold, and sampling strategies, and find that a linear decreasing threshold rule best fits participants' results. Further evidence that participants use decreasing threshold-based strategies comes from reaction time differences between exploration and exploitation; however, participants themselves report non-decreasing thresholds. Decreasing threshold strategies that "front-load" exploration and switch quickly to exploitation are particularly effective
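    A linear decreasing threshold rule of the kind the abstract reports can be sketched as follows; the parameter values and the explore/exploit interface are placeholders for illustration, not the paper's fitted estimates:

```python
def decreasing_threshold_policy(best_value_seen, t, horizon,
                                start_threshold=100.0, end_threshold=0.0):
    """Explore while the best resource value seen so far is below a threshold
    that falls linearly from start_threshold to end_threshold over the search
    horizon; otherwise switch to exploiting the best option found."""
    threshold = start_threshold - (start_threshold - end_threshold) * (t / horizon)
    return "explore" if best_value_seen < threshold else "exploit"

# Early in the search the bar is high (keep exploring); later it drops,
# so the same observed value triggers exploitation instead.
print(decreasing_threshold_policy(60.0, t=5, horizon=100))   # explore
print(decreasing_threshold_policy(60.0, t=80, horizon=100))  # exploit
```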
  • CS 561: Artificial Intelligence
    CS 561: Artificial Intelligence. Instructor: Sofus A. Macskassy, [email protected]. TAs: Nadeesha Ranashinghe ([email protected]), William Yeoh ([email protected]), Harris Chiu ([email protected]). Lectures: MW 5:00-6:20pm, OHE 122 / DEN. Office hours: by appointment. Class page: http://www-rcf.usc.edu/~macskass/CS561-Spring2010/. This class will use http://www.uscden.net/ and the class webpage for up-to-date information, lecture notes, relevant dates, links, etc. Course material: [AIMA] Artificial Intelligence: A Modern Approach, by Stuart Russell and Peter Norvig (2nd ed.). This time: Outline (Adversarial Search, AIMA Ch. 6): game playing; perfect play (the minimax algorithm, alpha-beta pruning); resource limitations; elements of chance; imperfect information. What kind of games? Abstraction: to describe a game we must capture every relevant aspect of the game, such as chess, tic-tac-toe, ... Accessible environments: such games are characterized by perfect information. Search: game-playing then consists of a search through possible game positions. Unpredictable opponent: introduces uncertainty, thus game-playing must deal with contingency problems. Searching for the next move. Complexity: many games have a huge search space. Chess: b = 35, m = 100 gives 35^100 nodes; if each node takes about 1 ns to explore then each move will take about 10^50 millennia to calculate. Resource (e.g., time, memory) limits: an optimal solution is not feasible/possible, thus we must approximate. 1. Pruning: makes the search more efficient by discarding portions of the search tree that cannot improve the quality of the result. 2. Evaluation functions: heuristics to evaluate the utility of a state without exhaustive search.
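    The two remedies listed at the end of the excerpt (pruning and evaluation functions) combine naturally into depth-limited alpha-beta search. The sketch below is illustrative and assumes a generic game interface, not the course's reference code:

```python
import math

def alphabeta(state, depth, alpha, beta, is_max_turn, successors, evaluate):
    """Depth-limited minimax with alpha-beta pruning.

    successors(state) -> successor states (empty if terminal).
    evaluate(state)   -> heuristic value from the maximizer's viewpoint,
                         used when the depth cut-off is reached.
    """
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)            # cut off: estimate instead of searching on
    if is_max_turn:
        value = -math.inf
        for child in children:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, successors, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:             # opponent would never allow this branch
                break
        return value
    value = math.inf
    for child in children:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, successors, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value
```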
  • Optimal Symmetric Rendezvous Search on Three Locations
    Optimal Symmetric Rendezvous Search on Three Locations. Richard Weber, Centre for Mathematical Sciences, Wilberforce Road, Cambridge CB2 0WB; email: [email protected]; www.statslab.cam.ac.uk/~rrw1. In the symmetric rendezvous search game played on n locations two players are initially placed at two distinct locations. The game is played in discrete steps, at each of which each player can either stay where he is or move to a different location. The players share no common labelling of the locations. We wish to find a strategy such that, if both players follow it independently, then the expected number of steps until they are in the same location is minimized. Informal versions of the rendezvous problem have received considerable attention in the popular press. The problem was proposed by Steve Alpern in 1976 and it has proved notoriously difficult to analyse. In this paper we prove a 20 year old conjecture that the following strategy is optimal for the game on three locations: in each block of two steps, stay where you are, or search the other two locations in random order, doing these with probabilities 1/3 and 2/3 respectively. This is now the first nontrivial symmetric rendezvous search game to be fully solved. Key words: rendezvous search; search games; semidefinite programming. MSC2000 Subject Classification: Primary: 90B40; Secondary: 49N75, 90C22, 91A12, 93A14. OR/MS subject classification: Primary: Games/group decisions, cooperative; Secondary: Analysis of algorithms; Programming, quadratic. 1. Symmetric rendezvous search on three locations: In the symmetric rendezvous search game played on n locations two players are initially placed at two distinct locations.
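    The strategy stated in the abstract is easy to simulate. The following Monte Carlo sketch estimates the expected meeting time under that strategy; it is an illustration only, not the paper's semidefinite-programming analysis:

```python
import random

def block_plan(location):
    """Positions occupied during one two-step block of the strategy:
    stay put with probability 1/3, otherwise visit the other two
    locations in a uniformly random order."""
    if random.random() < 1 / 3:
        return [location, location]
    others = [loc for loc in range(3) if loc != location]
    random.shuffle(others)
    return others

def expected_meeting_time(trials=200_000):
    """Monte Carlo estimate of the expected number of steps until the two
    players occupy the same location, both following the strategy above
    from distinct starting locations."""
    total_steps = 0
    for _ in range(trials):
        a, b = random.sample(range(3), 2)   # distinct starting locations
        steps = 0
        while True:
            plan_a, plan_b = block_plan(a), block_plan(b)
            met = False
            for pos_a, pos_b in zip(plan_a, plan_b):
                steps += 1
                if pos_a == pos_b:
                    met = True
                    break
            if met:
                break
            a, b = plan_a[-1], plan_b[-1]   # positions after the block
        total_steps += steps
    return total_steps / trials

print(expected_meeting_time())
```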