Intrinsic Robustness of the Price of Anarchy

Tim Roughgarden
Department of Computer Science, Stanford University
353 Serra Mall, Stanford, CA 94305

The original version of this article was published in the Proceedings of the 41st Annual ACM Symposium on Theory of Computing, May 2009. Copyright 2012 ACM.

ABSTRACT

The price of anarchy, defined as the ratio of the worst-case objective function value of a Nash equilibrium of a game and that of an optimal outcome, quantifies the inefficiency of selfish behavior. Remarkably good bounds on this measure are known for a wide range of application domains. However, such bounds are meaningful only if a game's participants successfully reach a Nash equilibrium. This drawback motivates inefficiency bounds that apply more generally to weaker notions of equilibria, such as mixed Nash equilibria, correlated equilibria, or to sequences of outcomes generated by natural experimentation strategies, such as simultaneous regret-minimization.

We prove a general and fundamental connection between the price of anarchy and its seemingly more general relatives. First, we identify a "canonical sufficient condition" for an upper bound on the price of anarchy of pure Nash equilibria, which we call a smoothness argument. Second, we prove an "extension theorem": every bound on the price of anarchy that is derived via a smoothness argument extends automatically, with no quantitative degradation in the bound, to mixed Nash equilibria, correlated equilibria, and the average objective function value of every no-regret sequence of joint repeated play. Third, we prove that in routing games, smoothness arguments are "complete" in a proof-theoretic sense: despite their automatic generality, they are guaranteed to produce an optimal worst-case upper bound on the price of anarchy.

1. INTRODUCTION

Every student of game theory learns early and often that equilibria are inefficient: self-interested behavior by autonomous decision-makers generally leads to an outcome inferior to the one that a hypothetical benevolent dictator would choose. Such inefficiency is ubiquitous in real-world situations and arises for many different reasons: congestion externalities, network effects, mis-coordination, and so on. It can also be costly or infeasible to eliminate in many situations, with large networks being one obvious example. The past ten years have provided an encouraging counterpoint to this widespread equilibrium inefficiency: in a number of interesting application domains, decentralized optimization by competing individuals provably approximates the optimal outcome.

A rigorous guarantee of this type requires a formal behavioral model, in order to define "the outcome of self-interested behavior". The majority of previous research studies pure-strategy Nash equilibria, defined as follows. Each player $i$ selects a strategy $s_i$ from a set $S_i$, like a path in a network. The cost $C_i(s)$ incurred by a player $i$ in a game is a function of the entire vector $s$ of players' chosen strategies, which is called a strategy profile or an outcome. By definition, a strategy profile $s$ of a game is a pure Nash equilibrium if no player can decrease its cost via a unilateral deviation:

$$C_i(s) \le C_i(s'_i, s_{-i}) \tag{1}$$

for every $i$ and $s'_i \in S_i$, where $s_{-i}$ denotes the strategies chosen by the players other than $i$ in $s$. These concepts can be defined equally well via payoff-maximization rather than cost-minimization; see also Example 2.5.

The price of anarchy (POA) measures the suboptimality caused by self-interested behavior. Given a game, a notion of an "equilibrium" (such as pure Nash equilibria), and an objective function (such as the sum of players' costs), the POA of the game is defined as the ratio between the largest cost of an equilibrium and the cost of an optimal outcome. An upper bound on the POA has an attractive worst-case flavor: it applies to every possible equilibrium and obviates the need to predict a single outcome of selfish behavior. Many researchers have proved remarkably good bounds on the POA in a wide range of models; see [17, Chapters 17–21] and the references therein.

1.1 The Need For More Robust Bounds

A good bound on the price of anarchy of a game is not enough to conclude that self-interested behavior is relatively benign. Such a bound is meaningful only if a game's participants successfully reach an equilibrium. For pure Nash equilibria, however, there are a number of reasons why this might not occur: perhaps the players fail to coordinate on one of multiple equilibria; or they are playing a game in which computing a pure Nash equilibrium is a computationally intractable problem [9]; or, even more fundamentally, a game in which pure Nash equilibria do not exist. These critiques motivate worst-case performance bounds that apply to as wide a range of outcomes as possible, and under minimal assumptions about how players play and coordinate in a game.

This article presents a general theory of "robust" bounds on the price of anarchy. We focus on the hierarchy of fundamental equilibrium concepts shown in Figure 1; the full version [22] discusses additional generalizations of pure Nash equilibria, including approximate equilibria and outcome sequences generated by best-response dynamics. We formally define the equilibrium concepts of Figure 1 (mixed Nash equilibria, correlated equilibria, and coarse correlated equilibria) in Section 3.1, but mention next some of their important properties.

[Figure 1: nested sets of equilibrium concepts, from smallest to largest: PNE (need not exist, hard to compute); MNE (always exists, hard to compute); CorEq and No Regret (CCE) (easy to compute/learn).]

Figure 1: Generalizations of pure Nash equilibria. "PNE" stands for pure Nash equilibria; "MNE" for mixed Nash equilibria; "CorEq" for correlated equilibria; and "No Regret (CCE)" for coarse correlated equilibria, which are the empirical distributions corresponding to repeated joint play in which every player has no (external) regret.

Enlarging the set of equilibria weakens the behavioral and technical assumptions necessary to justify equilibrium analysis. First, while there are games with no pure Nash equilibria ("Matching Pennies" being a simple example), every (finite) game has at least one mixed Nash equilibrium [16]. As a result, the "non-existence critique" for pure Nash equilibria does not apply to any of the more general concepts in Figure 1. Second, while computing a mixed Nash equilibrium is in general a computationally intractable problem [5, 8], computing a correlated equilibrium is not (see, e.g., [17, Chapter 2]). Thus, the "intractability critique" for pure and mixed Nash equilibria does not apply to the two largest sets of Figure 1. More importantly, these two sets are "easily learnable": when a game is played repeatedly over time, there are natural classes of learning dynamics (processes by which a player chooses its strategy for the next time step, as a function only of its own payoffs and the history of play) that are guaranteed to converge quickly to these sets of equilibria.

Many of the POA upper bounds in the literature can be recast as instantiations of this canonical method.

(B) We prove an "extension theorem": every bound on the price of anarchy that is derived via a smoothness argument extends automatically, with no quantitative degradation in the bound, to all of the more general equilibrium concepts pictured in Figure 1.

(C) We prove that routing games, with cost functions restricted to some arbitrary set, are "tight" in the following sense: smoothness arguments, despite their automatic generality, are guaranteed to produce optimal worst-case upper bounds on the POA, even for the set of pure Nash equilibria. Thus, in these classes of games, the worst-case POA is the same for each of the equilibrium concepts of Figure 1.

2. SMOOTH GAMES

2.1 Definitions

By a cost-minimization game, we mean a game (players, strategies, and cost functions) together with the joint cost objective function $C(s) = \sum_{i=1}^{k} C_i(s)$. Essentially, a "smooth game" is a cost-minimization game that admits a POA bound of a canonical type (a "smoothness argument"). We give the formal definition and then explain how to interpret it.

Definition 2.1 (Smooth Games) A cost-minimization game is $(\lambda, \mu)$-smooth if for every two outcomes $s$ and $s^*$,

$$\sum_{i=1}^{k} C_i(s^*_i, s_{-i}) \le \lambda \cdot C(s^*) + \mu \cdot C(s). \tag{2}$$

Roughly, smoothness controls the cost of a set of "one-dimensional perturbations" of an outcome, as a function of both the initial outcome $s$ and the perturbations $s^*$.

We claim that if a game is $(\lambda, \mu)$-smooth, with $\lambda > 0$ and $\mu < 1$, then each of its pure Nash equilibria $s$ has cost at most $\lambda/(1 - \mu)$ times that of an optimal solution $s^*$. In proof, we derive

\begin{align}
C(s) &= \sum_{i=1}^{k} C_i(s) \tag{3} \\
&\le \sum_{i=1}^{k} C_i(s^*_i, s_{-i}) \tag{4} \\
&\le \lambda \cdot C(s^*) + \mu \cdot C(s), \tag{5}
\end{align}

where (3) follows from the definition of the objective function; inequality (4) follows from the Nash equilibrium condition (1), applied once to each player $i$ with the hypothetical deviation $s^*_i$; and inequality (5) follows from the defining condition (2) of a smooth game. Since $\mu < 1$, subtracting $\mu \cdot C(s)$ from both sides and dividing by $1 - \mu$ gives $C(s) \le \frac{\lambda}{1 - \mu} \cdot C(s^*)$, which is the claimed bound.
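The smoothness condition (2) and the resulting bound of $\lambda/(1 - \mu)$ on the cost of pure Nash equilibria can be checked by brute force on a small game. The following is a minimal sketch under stated assumptions: the two-player game, its resource names, and all function names are hypothetical and do not come from the article. The parameters $(5/3, 1/3)$ are also not taken from this section; they are the values usually quoted for congestion games with affine costs, and the code only confirms them for this one toy instance.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy game (an assumption, not from the article): two players
# pick sets of resources, and each resource charges every user its load.
STRATEGIES = [
    [frozenset("a"), frozenset("bc")],  # strategies of player 0
    [frozenset("b"), frozenset("ad")],  # strategies of player 1
]
OUTCOMES = list(product(*STRATEGIES))   # all strategy profiles s


def player_cost(i, s):
    """C_i(s): total load on the resources that player i uses in s."""
    return sum(sum(1 for t in s if e in t) for e in s[i])


def total_cost(s):
    """C(s): the sum of all players' costs."""
    return sum(player_cost(i, s) for i in range(len(s)))


def deviate(s, i, new_strategy):
    """Profile (new_strategy, s_{-i}): player i switches, others stay."""
    return s[:i] + (new_strategy,) + s[i + 1:]


def is_pure_nash(s):
    """Condition (1): no player lowers its cost by deviating alone."""
    return all(
        player_cost(i, s) <= player_cost(i, deviate(s, i, alt))
        for i in range(len(s))
        for alt in STRATEGIES[i]
    )


def is_smooth(lam, mu):
    """Condition (2), checked over every pair of outcomes (s, s*)."""
    for s, s_star in product(OUTCOMES, repeat=2):
        lhs = sum(
            player_cost(i, deviate(s, i, s_star[i])) for i in range(len(s))
        )
        if lhs > lam * total_cost(s_star) + mu * total_cost(s):
            return False
    return True


equilibria = [s for s in OUTCOMES if is_pure_nash(s)]
optimum = min(total_cost(s) for s in OUTCOMES)
poa = Fraction(max(total_cost(s) for s in equilibria), optimum)

lam, mu = Fraction(5, 3), Fraction(1, 3)
print("POA of pure Nash equilibria:", poa)
print("(5/3, 1/3)-smooth:", is_smooth(lam, mu))
print("smoothness bound lambda/(1-mu):", lam / (1 - mu))
```

Exact rational arithmetic via `Fraction` avoids floating-point error in cases where condition (2) holds with equality. The same helpers accept any finite game encoded as lists of resource sets, although the exhaustive check grows quadratically in the number of outcomes.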
