<<

1

Game Couplings: Learning Dynamics and Applications

Maria-Florina Balcan, Florin Constantin, Georgios Piliouras, and Jeff S. Shamma

Abstract— Modern engineering systems (such as the Internet) We consider games (systems) with J ≥ 2 disjoint groups consist of multiple coupled subsystems. Such subsystems are N 1...N J of players (agents) such that agents in group N j designed with local (possibly conflicting) goals, with little or no take part in a game with a desirable property P for any fixed knowledge of the implementation details of other subsystems. Despite the ubiquitous nature of such systems very little is behavior of all agents from other groups. We define such formally known about their properties and global dynamics. settings as couplings of P-games. We examine the effect of We investigate such distributed systems by introducing a different localized classes of properties P on the performance novel game-theoretic construct, that we call game-coupling. of each group. We also look for necessary or sufficient Game coupling intuitively allows us to stitch together the payoff conditions for couplings to preserve (to some degree) P. structures of two or more games into a new game. In order to study efficiency issues, we extend the We first study the property of (λ, µ)-smoothness of cost framework to this setting, where we now care about local changes to individual updates. (λ, µ)-smoothness and global performance. Such concerns give rise to a new provably implies tight bounds on the inefficiency of standard notion of equilibrium, as well as a new learning paradigm. equilibrium concepts in several game classes [15], [17], [18]. We prove matching welfare guarantees for both, both for We show tight guarantees on group cost when each individual subsystems as well as for the global system, using a j j j generalization of the (λ, µ)-smoothness framework [17]. N exhibits local (λ , µ )-smoothness. These performance In the second part of the paper, we establish conditions guarantees carry over to a wide set of equilibrium notions as leading to advantageous couplings that preserve or enhance well as for no-regret learning algorithms. Using LP-duality desirable properties of the original games, such as convergence arguments, we identify a new equilibrium notion for which of dynamics and low price of anarchy. all these bounds extend automatically for all groups. Furthermore, we introduce a novel learning framework I.INTRODUCTION modeling interactions of competing groups/institutions (e.g. Game-theoretic approaches are successful in designing and Internet providers). In our framework each group has a center analyzing distributed systems and have recently been used that provides public advertisement [2], [3] of (possibly differ- for distributed control [9], [10], [14], [20]. A natural stable ent) strategies to each player in the group. We analyze centers state of a system of agents corresponds to the game-theoretic whose advertised behaviors exhibits vanishing average regret concept of equilibrium, at which each agent minimizes its with hindsight. This is a rather natural benchmark, since own cost. Basic properties of equilibria, such as existence, several simple learning algorithms offer such guarantees [5]. reachability via learning dynamics, and total cost are quite On the side of the individual agents, we make a similarly well-understood if the entire population of agents is homoge- weak assumption. We assume that the average performance neous [6], [8], [10], [13], [16]. In many systems, especially of each agent will eventually be roughly as high as that of large ones, such homogeneity is only encountered locally at her advertised strategy. We prove tight welfare guarantees for a group level while overall the system is heterogeneous. For each group in this framework as well as global guarantees example in routing of Internet traffic the same time delay which match those of our proposed equilibrium. results in much higher costs for low-latency applications We give conditions for couplings to preserve the existence such as Voice-over-IP than for other applications such as of types of potential functions. Specific learning dynamics email. Similarly, in load balancing (assignment of jobs to are known to converge [9], [13] given a potential, i.e. a strong machines) the delay of a job mostly depends on jobs that use measure of cost alignment among players. Our necessary heavily the same component of the machine (e.g. processors and sufficient condition for preserving an exact potential or input/output). What are the prerequisites or the tradeoffs leverages a condition in [13]. We also provide conditions for local guarantees to carry over to the global level when for game couplings to preserve the existence of a weak po- coupling different groups? In this work, we formally study tential (equivalently, weak acyclicity), a more general notion such questions (largely overlooked thus far) by introducing guaranteeing weaker convergence than an exact potential. game-couplings. We apply our techniques to solve in the affirmative an open question regarding the convergence of Nash dynamics This work was supported by NSF grants CCF-0953192 and CCF-1101215, of a heterogeneous population in aggregation games [11]. by ONR grant N00014-09-1-0751, and by AFOSR grant FA9550-09-1-0538. Maria-Florina Balcan and Florin Constantin ({ninamf,florin}@cc.gatech.edu) are with the College of Computing, Georgia Institute of Technology. RELATED WORK. Structural properties of games often follow Georgios Piliouras ([email protected]) is with the School of Elec- from analysis of their sub-games. Sandholm [19] considers trical and Computer Engineering, Georgia Institute of Technology and the sub-games defined by all subsets of players (as opposed to Department of , Johns Hopkins University. Jeff S. Shamma ([email protected]) is with the School of Electrical a specific partition as we do here) and shows that a game and Computer Engineering, Georgia Institute of Technology. has an exact potential if and only if all active players in a 2

sub-game have identical utility functions. Fabrikant et al. [6] cost function ce which depends only on e’s load, i.e. number show that the uniqueness of a in any sub- of jobs on it. PNE always exist in such games [13]. Jobs are game is sufficient, but not necessary, for a game to be weakly rarely this homogeneous; instead, there are often groups of acyclic. Monderer [12] defines the classes of J-potential jobs, e.g. computation-intensive or memory-intensive. In this j games and J-congestion games for J ∈ N and shows they case, each machine has a cost function ce for each type j are isomorphic. In a J-congestion game, players’ mappings of jobs. When fixing the strategies of other jobs, the game of costs to delay can belong in J classes. The case J = 1 experienced by jobs of any type j is a standard LBG and is treated in Monderer and Shapley’s seminal paper [13]. An hence admits a PNE. Thus, a LBG with heterogeneous jobs instance captured by our framework is J = 2 and N 1,N 2 is a coupling of games that admit PNE. The question is then being the groups of players with delays from the same class. when does the global (heterogeneous) LBG also admit PNE. Our view of a population divided into groups is often We start by analyzing costs of PNE and other equilibria adopted in distributed adaptive control, where a center can within each group and consequences for coupling efficiency. only control a local group of agents, e.g. in the collective intelligence [20] and probability collectives [21] frameworks. III.PRICEOF ANARCHYWITHIN GROUPS Roughgarden [17] identified (λ, µ)-smoothness, a canoni- II.PRELIMINARIES cal property of games that yields tight PoA bounds. We model distributed systems as games G with simulta- Definition 3:[17] Game G is (λ, µ)-smooth if for all σ, σ0∈Σ neous moves. For a game G we denote by {1, . . . , n} the Pn 0 0 i=1 Ci(σi, σ−i) ≤ λ · C(σ ) + µ · C(σ) players (agents), by Σi player i’s set of (pure) strategies, n and by ∆(Σ) all distributions over outcomes Σ = ×i=1Σi. If G is (λ, µ)-smooth (with λ≥0 and µ∈(0, 1)), then [17] 1 Any player i aims to minimize her cost Ci(σ1, . . . , σn) with each of G’s PNE has cost at most λ/(1 − µ) times that of a σh ∈ Σh chosen by player h = 1..n. Individual costs are socially optimal , i.e. P oA(G) ≤ λ/(1 − µ). P aggregated by the social costC(σ)= iCi(σ). We denote by PoA bounds based on (λ, µ)-smoothness extend [17] to v−i = (v1, . . . , vi−1, vi+1, . . . , vr) a vector v = (v1, . . . , vr) three other standard equilibrium concepts that we review without its ith entry, for 1≤i≤r. now. A mixed Nash equilibrium (MNE) is a product proba- A stable state (equilibrium) is an outcome in which each bility distribution in ∆(Σ) in which each player minimizes player minimizes its own cost in some sense. A pure Nash its (expected) cost given others’ strategies. For any is a basic stable outcome in which for any player, equilibrium (CE) π ∈ ∆(Σ), if a mediator draws σ from its strategy minimizes its cost given others’ strategies. π and reveals to each player i only its strategy σi then i Definition 1: A strategy vector (σ1, . . . , σn) ∈ Σ is a pure minimizes cost by playing σi, assuming others also follow Nash equilibrium (PNE) if any player i minimizes its cost by σ−i. A coarse correlated equilibrium (CCE or equivalently 0 0 playing σi, i.e. Ci(σi, σ−i) ≤ Ci(σi, σ−i), ∀i, ∀σi ∈ Σi Hannan-consistent strategy [5]) is more general than a CE in 0 While PNE do not exist in some games, all other equilibria that the deviation σi cannot depend on the draw σi. Average coarse correlated equilibria with respect to a socially optimal we study in Section III exist in any game. PNE may not be ∗ ∗ efficient from a social perspective. A standard measure of σ ∈Σ (ACCE [15]) comprise the class of distributions for which the social cost is lower than the sum of costs when distributed inefficiency is the price of anarchy (PoA) [8], ∗ ∗ defined as the ratio of the social cost of the worst PNE to each agent i unilaterally deviated to σi . ACCE is the class maxσ∈PNE C(σ) for which the best P oA bound via (λ, µ)-smoothness is tight. the optimum: P oA(G)= ∗ minσ∗∈Σ C(σ ) Definition 4:A correlated equilibrium (CE) π ∈∆(Σ)is a dis- In hierarchical systems, group-level stability is a prerequi- tribution such that for all i P C (σ , σ )π(σ , σ ) σ−i∈Σ−i i i −i i −i site for global stability. We assume that groups are P-games ≤P C (σ0, σ )π(σ , σ ), ∀σ , σ0 ∈Σ . σ−i∈Σ−i i i −i i −i i i i i.e. they satisfy a property P e.g. the existence of PNE or At a coarse correlated equilibrium (CCE) π ∈∆(Σ), ∀i, constant PoA. Our goal is to understand whether this local P C (σ , σ )π(σ , σ )≤ P C (σ0, σ )π (σ ) σ∈Σ i i −i i −i σ−i∈Σ−i i i −i i −i P is preserved (to some extent) globally. To this goal we 0 P for all σ ∈ Σi where πi(σ−i) = π(τi, σ−i) is the introduce and study the novel concept of game coupling. i τi∈Σi marginal probability that vector σ−i ∈ Σ−i will be played. Definition 2: G is a (N 1,...,N J )-coupling of P-games if At an ACCE∗ π for some socially optimal σ∗ (i.e. 1 J j ∗ P • N ,...,N are groups partitioning {1 . . . n}, i.e. N ∩ C(σ ) ≤ C(σ), ∀σ) we have σ∈ΣCi(σi, σ−i)π(σi, σ−i) ≤ j0 0 1 J P P ∗ N =∅, ∀ 1≤j

4 P j 0 −j P oAMNE(G) ≤ P oACE(G) ≤ P oACCE(G) ≤ Finally, the normalizing condition σ∈Σ sσC (σ , σ ) = λ P oA ∗ (G) = inf{ : G is (λ, µ)-smooth} 1 can be replaced by another normalizing condition (e.g. ACCE 1−µ P Here, we present a set of localized (λ, µ)-smoothness σ∈Σ sσ = 1), since the rest of the system is scaling arguments, which are applicable to game couplings. Anal- invariant (for any positive multiplicative factor). ogously to the social cost, we define group j’s total cost at P j P j 0 −j j P max sσC (σ) / sσC (σ , σ ) σ ∈ Σ by C (σ)= i∈N j Ci(σ). σ∈Σ σ∈Σ P s = 1 s ≥ 0 ∀σ ∈ Σ Definition 5 ((~λ, ~µ)-smoothness): A coupling is (~λ, ~µ)- s.t. σ∈Σ σ and σ ~ J J P P 0 j  smooth (with λ ∈ R , ~µ ∈ [0, 1) ) if each sub-game j has σ∈Σ sσ i∈N j Ci(σi, σ−i) − C (σ) ≥ 0 the local (λj, µj)-smoothness property. That is, if for all sub- j average coarse correlated games j, for every two outcomes (σj, σ−j) and (σ0, σ−j): Hence, we can define group ’s equilibria with respect to localized optima (ACCELj) as P 0 j −j j j 0 −j j j j −j j j j i∈N j Ci(σi, σ−i, σ ) ≤ λ ·C (σ , σ )+µ ·C (σ , σ ) distributions for which local P oA bounds via (λ , µ )- T j 0 j −j smoothness are always tight; ACCEL= ACCEL . where vectors σ , σ ∈×k∈N j Σk, σ ∈×k∈ /N j Σk. j∈J j j j j −j We define the local PoA for global equilibria C, comparing Definition 7: ACCEL = {s : ∃ r ∈ minσj C (σ , s ) s.t. P j P P group j’s costs at σ ∈ C to its minimal group cost given σ−j. σ∈Σ C (σ)s(σ)≤ σ∈Σ i∈N j Ci(ri, σ−i)s(σ)}. j λj ~ Definition 6: The local price of anarchy of group j in G for Theorem 2: P oAACCELj (G)=min{ 1−µj : G (λ, ~µ)-smooth} j j C (σ) j ACCEL are the distributions for which the average regret C ⊆ ∆(Σ) is P oAC(G) = max where OPT σ∈C Cj(OPTj, σ−j) of players within each group for not following the prerogative j j −j j j −j minimizes group j’s cost (i.e. C (OPT , σ )≤C (σ , σ ), of a group optimal strategy is non-positive. If G is an j j −j ∀σ ∈Σ ) given the behavior σ of the outside agents. organization with hierarchical structure, ACCEL correspond j P oAC(G) measures the inefficiency of equilibria C (e.g. to policies that are plausible given competent management PNE) of the coupling G instead of equilibria within group j that guides a competent-on-average population. −j 1 only for σ . For a single group (J =1), P oAC =P oAC. Local smoothness leads to a bound on the local PoA for B. ACCEL and Public Advertising the most general equilibrium concept we introduced thus far We introduce a novel learning procedure that incorporates that holds even with adversarial behavior of the other groups. public advertising and which offers welfare guarantees anal- j λj ~ Theorem 1: P oAACCE∗ (G)≤ 1−µj for all (λ, ~µ)-smooth G. ogous to those of ACCEL. Intuitively, the setting has as j A. Dual Equilibrium Notions follows. Within each group , there exists a broadcasting center that can communicate with all agents within the group. We use LP duality to characterize the distributions for On each day t = 1 ...T , the center of group j computes a j which P oA bounds derived via local smoothness arguments strategy vector ADVj(t) for the group and advertises to each are tight. For σ0 ∈Σj we finding the best such P oAj bound, j agent i her respective strategy ADVi (t). There exist two high formulated below as a linear fractional problem (LP): level issues in any such model: first, how does the center min λj / (1 − µj) decide on which vector to advertise and second, how do the P 0 j j 0 −j j j individual agents respond to the recommendations. s.t. i∈N j Ci(σi, σ−i) ≤ λ C (σ , σ ) + µ C (σ), ∀σ ∈Σ j j In terms of center actions, prior public advertising mod- 0 < µ < 1, λ > 0 els [2], [3] assumed that there existed a single center with full j λj j 1 information over the whole game that was able to broadcast Introducing p = 1−µj >0 and z = 1−µj >0 yields the LP to all agents. In such settings the center can easily broadcast j min p a global optimum solution or the best Nash equilibrium. j j 0 −j j j P 0 j s.t. p C (σ, σ ) + z (C (σ)− i∈N j Ci(σi, σ−i)) ≥C (σ) Here, we are moving towards a more restricted and realistic model where each center only controls a local neighborhood the corresponding dual to which has as follows: of agents. Many real life settings share this structure (e.g. P j max σ∈Σ sσC (σ) competing Internet providers, or more generally competing P j 0 −j institutions/organizations). In such settings, the managing s.t. σ∈Σ sσC (σ , σ ) ≤ 1 and sσ ≥ 0 ∀σ ∈ Σ centers have a high incentive in employing sophisticated P s P C (σ0, σ ) − Cj(σ) ≥ 0 σ∈Σ σ i∈N j i i −i online algorithms in order to effectively calibrate their pre- Since the social costs are positive, we can replace the first dictions. Here, we will analyze centers whose advertised inequality with an equality. Furthermore, since this quantity behaviors exhibits vanishing average regret with hindsight. is a constant (and furthermore equal to 1), we can divide the This is a rather natural benchmark, since several simple objective by it without having any effects on the system: learning algorithms can offer such guarantees5. max P s Cj(σ) / P s Cj(σ0, σ−j) σ∈Σ σ σ∈Σ σ 5A policy (sequence of strategies) satisfies no-regret if its cumulated P j 0 −j s.t. σ∈Σ sσC (σ , σ ) = 1 and sσ ≥ 0 ∀σ ∈ Σ payoffs are almost as good as ones of the best fixed (time-invariant) strategy P P 0 j  given the history of play. CCE are limit points of time-averages of no-regret σ∈Σ sσ i∈N j Ci(σi, σ−i) − C (σ) ≥ 0 policies. Generally, no-regret algorithms offer guarantees in expectation over their randomized strategies. For ease of notation, we consider pure strategy 4 Blum et al. [4] call P oACCE(G) the price of total anarchy in G. outcomes. The analysis trivially extends to the case of randomized strategies. 4

On the side of the individual agents, we make a similarly IV. COUPLINGSANDPOTENTIALS weak assumption. We assume that the average performance In this section we identify structural properties of games of each agent i (of group j) will eventually be (almost) as j within each group that are preserved in the global game high as that of his advertised strategy ADVi (t). Any dummy when augmented with conditions on the interplay among agent can meet this benchmark merely by following the groups (for clarity, we only present results for two groups). recommended strategy. A more realistic agent could still We consider two properties of the natural Nash dynamics, achieve such guarantees by interpolating between his innate namely existence of an exact potential and weak acyclicity. learning strategy and the provided advice. We will show that advertising-guided learning offers guarantees analogous A. Potential functions review to those of ACCEL. We start by bounding the possible A potential function Φ:Σ → R simultaneously encodes negative effects of agents’ experimentation. improvement opportunities and is closely linked to PNE. Lemma 1: If in group j the time-average6 cost of each i is j Definition 8: [13] Game G has an exact potential Φ(·) if (almost) as low as that of i’s advertised strategies ADVi , 0 0 Ci(σi, σ−i) − Ci(σi, σ−i) = Φ(σi, σ−i) − Φ(σi, σ−i) ∀i, 0 1 P  1 P j −i  ∀σi, σi ∈ Σi, σ−i ∈ Σ−i. Game G has an ordinal potential T t Ci σ(t) ≤ T t Ci ADVi (t), σ (t) + o(1) 0 0 Φ(·) if Ci(σi, σ−i)

j equilibrium there exists a player i that can simultaneously 1 P Cj(σ(t)) ≤ λ 1 P CjADVj(t), σ−j(t) + o(1) 0 T t 1−µj T t lower both her cost and Φ by switching to some strategy σi: C (σ , σ ) > C (σ0, σ ) and Φ(σ , σ ) > Φ(σ0, σ ) Given a history of play σ(1), . . . , σ(T ), we denote the best i i −i i i −i i −i i −i If G has an exact potential (respectively ordinal potential, group response of group j with hindsight as OPTj(T ): or weak potential) then G is called an exact potential j 1 P j j −j  (respectively ordinal potential, or weakly acyclic) game. OPT (T ) = argminsj ∈Σj T t C s , σ (t) Def. 8 reviews potentials in increasing order of generality. Given a game coupling (N 1,N 2,...,N J ), we define its super-game as follows: it is a game with J agents, exact potentials ⊆ ordinal potentials ⊆ weak potentials the available strategies to each super-agent j correspond to j An ordinal potential Φ is also a weak one, since it suffices strategy tuples for all agents in group j, i.e. σ ∈ × j Σ . i∈N i that one player increases Φ upon improving her utility. The Finally, the cost of the super-agent j is the group cost for j P existence of an ordinal potential is equivalent [13] to the all agents in group j, i.e. C (σ) = j C (σ). We let i∈N i convergence of Nash dynamics, i.e. asynchronous updates λsup ∈ , µsup ∈ [0, 1) such that the super-game is R+ by each player to a better strategy given others’ current (λsup, µsup)-smooth. Finally, we define a socially optimal strategies. Any weakly acyclic game has at least one PNE, strategy vector as global OPT ∈ argmin C(σ). σ for example the global optimum of the weak potential. We now prove cost bounds for advertising-guided learning. Theorem 3: If each agent’s time-average cost is (almost) as B. Exact potential games low as that of her advertised strategy, and the advertised The compelling property of existence of an exact potential strategy for each group j has vanishing time-average regret: function implies, among others, convergence of distributed Nash dynamics. In distributed control of multi-agent systems, 1P CjADVj(t), σ−j(t) ≤ 1P CjOPTj(T ), σ−j(t)+o(1) T t T t exact potentials arise for example in the “wonderful life then for advertising-guided learning, the group cost satisfies utility” scheme [20], by which a planner can ensure that individual agents will act in accordance to common welfare. 1/T P Cjσ(t) λj t ≤ + o(1) We give a necessary and sufficient condition for a coupling P j j −j  j 1/T t C OPT (T ), σ (t) 1 − µ of exact potential games to also have an exact potential. Our

j condition leverages a well-known characterization [13]. and for min 1−µ > µsup, the social cost satisfies j λj Lemma 2: [13] Game G has an exact potential if and only if 0 0 P  sup for any i, k, any strategies σi, σi of i and σk, σk of k and for 1/T t C σ(t) λ ≤ + o(1) any σ of the other players we have d 0 0 (σ ) = 0 0 1−µj −ik σiσiσkσ −ik minσ0 C(σ ) min − µsup k j λj where d 0 0 (σ ) := ∆ (σ ) − ∆ 0 (σ ) − σiσiσkσk −ik σiσk −ik σiσk −ik j ∆ 0 (σ ) + ∆ 0 0 (σ ) and ∆ (σ ) := sup 1−µ sup σiσk −ik σiσk −ik sisk −ik An identical PoA bound of (λ )/(minj j − µ ) λ Ci(si, sk, σ−ik)−Ck(si, sk, σ−ik), ∀si, sk for social cost can be derived for all ACCEL distributions. We will now shift focus from costs and PoA to the ex- That is, the differences in i’s and k’s utilities sum up istence of PNE and their reachability via learning dynamics to 0 on any 4-cycle of strategy updates. Our condition on – we have seen learning via advertising in this section. We couplings abstracts away the individuals in a group and will relate these topics to bounds on PoA in Section V. considers an auxiliary game using group potentials. Theorem 4: Let G be a (N 1,N 2)-coupling of exact potential 6 P PT j j In this section, whenever we write t, we mean t=1. games: ∀σ ∈Σ , the sub-game G|N j ←σj induced by group 5

j j N playing strategy vector σ has exact potential Φσj (·), because resource delays are increasing. Such players are nat- for j = 1, 2. Define a game Γ with players {1, 2} in urally vulnerable to malicious players that seek congestion. which player j’s strategy space is Σj and the utilities from Leveraging Lemma 3 with N 2 as the malicious congestion- 1 2 1 2 playing (σ , σ ) are (Φσ2 (σ ), Φσ1 (σ )). Then G is an exact seeking players, we can establish that a model of malice if and only if Γ is an exact potential game. preserves some structure in several congestion game classes, An exact potential is preserved when some players’ strate- unlike other models with a similar scope [1]. gies are fixed. Theorem 4’s condition that each induced sub- Corollary 1:A (N 1,N 2)-coupling G with |N 2| ≥ 2 pre- game has an exact potential is tight. Also, if Γ is not an exact serves weak acyclicity assuming that players in N 2 benefit potential game, then G may even lack a PNE: any two-player from using resources r used by others (in N 1 or N 2), namely game G satisfies Theorem 4’s condition since G is trivially 2 (A) for any i2 ∈ N if Supp(σ)\σi26=∅ then i2 can BR a coupling of potential games. by also using some r ∈Supp(σ)\σi2 , i.e. σi2 := σi2 ∪{r}. C. Weakly acyclic games where Supp(σ) denotes resources used (by players) in σ ∈Σ. We present a second instantiation of coupling with use- Such malicious players (in N 2) are “lone wolves” since ful structural properties. It requires and aims for signifi- they act independently; they may also increase the congestion cantly less cohesion among groups of players. Specifically, of other malicious players. We can show that Corollary 1 Lemma 3 below provides a sufficient condition for a game- applies to several classes of congestion games (in particular coupling to preserve weak acyclicity. We then use it to iden- assumption (A) holds): market-sharing games [17] (a natural tify a model of malice in congestion games that guarantees, model of uniform competition on resources), facility location unlike other similar models, the existence of a PNE. games [17] (a game-theoretic distributed allocation problem) Weak acyclicity, a natural property of engineered systems and load-balancing games, for which Corollary 1 is tight (e.g. in the consensus problem [10]), is defined as the exis- in several ways. Weak acyclicity is preserved even if each tence of a better-response path from any strategy vector to malicious player also minimizes delay (like a regular player) a PNE. In better-response dynamics, players asynchronously on some resources as long as assumption (A) holds. update to a better strategy given others’ current strategies. Our existence guarantees for PNE in game couplings with Even though defined as a property of global convergence malicious players imply that the (local) PoA is well-defined. 5 1 of coordinated Nash dynamics, simple distributed dynamics Load-balancing games with affine delays are ( 3 , 3 )-smooth 5 converge to PNE [10] in any weakly acyclic game. 5 3 [17], yielding a 2 = 1− 1 upper bound on their local P oA. Recall Def. 8: weak acyclicity amounts to existence of a 3 weak potential. Our sufficient condition is quite unrestrictive V. AGGREGATIONVIAHETEROGENEITY – it only concerns the change of the sub-game weak potential Convergence guarantees for dynamics are made more rele- at a sub-game equilibrium. Intuitively, this requires that the vant by quantitative statements about the quality of equilibria 7 progress (towards convergence) of one group is not impeded (or equilibria that such dynamics can reach) . by the progress of the other group. Interestingly, we only Mol et al. [11] show that heterogeneous systems in aggre- require this condition upon (local) convergence. gation games (AG, defined below), have significantly lower PoA than homogeneous AGs. They only prove convergence Lemma 3: If a coupling G of weakly acyclic games satisfies j j of Nash dynamics and hence existence of PNE for homoge- • G|N j ←σj has weak potential Φσj for ∀σ ∈Σ , j =1, 2. 2 1 neous systems and one instance of a heterogeneous system • if σ is PNE in G|N 1←σ1 then any better-response σ¯i 1 2 (we review their results below after introducing the model). to σ−i (and σ ) by any i in sub-game G|N 2←σ2 does We prove convergence of Nash dynamics for any heteroge- 2 2 2 2 not reduce the weak potential: Φ 1 (σ ) ≤ Φ 1 1 (σ ) σ σ¯i ,σ−i neous AG, thus solving a salient open question in their work. then G is a weakly acyclic game with weak potential For this we exhibit a new global potential that explicitly 1 2 2 1 Φ(σ , σ ) = C · Φσ1 (σ ) + Φσ2 (σ ) for C > 0 with couples the potentials of each homogeneous sub-system. We 2 2 2 1 C · min(Φ 1 (σ ) − Φ 1 (¯σ , σ )) > max |Φ 2 2 (σ ) − thus show that heterogeneity is useful even as a utility design σ σ i −i (σi ,σ−i) 1 Φ 2 2 (σ )| where the min and max are both over i ∈ criterion (to reduce PoA) while it preserves global stability. (¯σi ,σ−i) 2 2 2 2 2 1 1 N , σi , σ¯i ∈ Σi, σ−i ∈ Σ−i, σ ∈ Σ and, for the min only, A. Model the Φ2 difference in the argument must be strictly positive. Aggregation games model systems aiming for high inter- One can show that the weak acyclicity of each in- nal connectivity. Specifically, consider an undirected graph 1 2 duced sub-game for any fixed vectors σ , σ is necessary. Gr = ({1,...,N},E) without self-loops. There are n ≤ N Lemma 3’s second assumption relates sub-games. In general, players, that must each choose a different vertex in 1..N; weak acyclicity, unlike the existence of an exact or ordinal denote by H the set of all n players’ vertices: |H| = n. potential, is not preserved when a player’s strategy is fixed. Each player i has a parameter βi ∈ [0, 1], inducing its utility Congestion-seeking malice: We apply Lemma 3 to the 8 function uβi it aims to maximize, where if vi is i’s vertex well-studied [13], [16] congestion games. These games arise u (v ,H \{v }) = E + β E (1) in many settings with joint usage of resources and are iso- βi i i vi,H i vi,{1,...,N}\H morphic to exact potential games. In many congestion games, 7Profiles during learning dynamics, even non-convergent ones, may especially ones modeling routing applications through fixed however be much better than any PNE [7]. networks, players have higher costs for higher congestion, 8Utilities are more natural than costs for evaluating connectivity. 6

j j j j j j and EI1,I2 = |{e ∈ E : e = (i1, i2), i1 ∈ I1, i2 ∈ I2}| where Φβ (H ,H \ H )=(1 + β )EH + β EH ,{1,...,N}\H denotes the number of edges between vertex sets I1 and I2. is an exact potential of the aggregation sub-system over the j j We write G(Gr, β1, . . . , βn) for the corresponding AG G. (homogeneous) H given fixed vertices of others (in H \H ). Players with β = 0 are called followers as they maximize In a homogeneous system (J =1, i.e. same β for all), this u0 = Ev,H i.e. the number of edges to H (the other players’ weighted potential reduces to the exact one in Theorem 6. vertices). In contrast, players with β = 1 are called leaders The only AGs not covered by Theorem 7 are ones con- because they maximize u1 = Ev,{1,...,N} i.e. the degree of taining leaders (β = 1). Dealing with all leaders separately, v in the hope that other players, in particular followers, will one can easily identify an ordinal potential. Hence be drawn to the adjacent vertices. A player with a general β Corollary 2: A AG has an ordinal potential and thus a PNE. is called a β-leader; note uβ(·) = βu1(·) + (1 − β)u0(·). We call an AG G(Gr, β1, . . . , βn) homogeneous if βi = β, ∀i. VI.CONCLUDINGREMARKS The social welfare (the counterpart of social cost) is the number EH of internal edges, for any β1, . . . , βn. As EH We introduced game couplings, a concept allowing a maxH∗ EH∗ principled game-theoretic study of globally heterogeneous should be maximized (distributively),P oA(G)= min E . H PNE H systems exhibiting local homogeneity. We gave several ap- B. Price of anarchy and convergence plications of this framework to learning in games, quality For n=Θ(N) players there exist [11] graphs Gr for which of equilibria (PoA) and structural properties. Studying other any uniform β leads to high P oA = Θ(N). A balanced mix sources of heterogeneity, particularly by their effect on PoA, of β-leaders and followers has constant PoA for constant β, is an exciting research direction suggested by our work. but existence of PNE had only been established for β =1. REFERENCES Theorem 5: [11] There exist connected Gr such that for any β and homogeneous AG G = G(Gr, β, . . . , β), P oA(G)≥n. [1] M. Babaioff, R. Kleinberg, and C. Papadimitriou. Congestion games 1 with malicious players. In Proc. of ACM EC, 2007. For λn β-leaders (β ≥ n ) and (1−λ)n followers, we have [2] M.-F. Balcan, A. Blum, and Y. Mansour. Improved equilibria via 1 1 P oA(G) = O(min( 1−λ n, βλ(1−λ) )) for any AG G. Hence, public service advertising. In Proc. of SODA, pages 728–737, 2009. PoA is constant for constant λ (i.e. a balanced mix) and β. [3] M.-F. Balcan, A. Blum, and Y. Mansour. Circumventing the price of anarchy: Leading dynamics to good behavior. In Proc. of Symposium Given their high PoA, AGs cannot be (λ, µ)-smooth for on Innovations in Computer Science, pages 200–213, 2010. small (λ, µ). We instead prove their stability via coupling. [4] A. Blum, M. Hajiaghayi, K. Ligett, and A. Roth. Regret minimization and the price of total anarchy. In STOC, pages 373–382, 2008. Any homogeneous AG has an exact potential (implicitly [5] N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. shown in [11]). The form of this potential implies that any Cambridge University Press, 2006. Nash dynamics converges to a PNE in polynomial time. [6] A. Fabrikant, A. Jaggard, and M. Schapira. On the structure of weakly acyclic games. In 3rd Symposium on Algorithmic , 2010. Theorem 6: A homogeneous AG G, i.e. βi =β ∈[0, 1] ∀i has [7] R. Kleinberg, K. Ligett, G. Piliouras, and E. Tardos. Beyond the Nash exact potential Φ (H) = (1 + β)E + βE equilibrium barrier. In Proc. of Innovation in Computer Science, 2011. β H H,{1,...,N}\H [8] E. Koutsoupias and C. H. Papadimitriou. Worst-case equilibria. In In contrast, the only known structural result for a heteroge- STACS, pages 404–413, 1999. [9] J. Marden, G. Arslan, and J. S. Shamma. Autonomous vehicle- neous system is that when all βi are either 0 or 1, i.e. a mix target assignment: A game theoretical formulation. ASME Journal of of leaders and followers, the game has an ordinal potential. Dynamic Systems, Measurement, and Control, pages 584–596, 2007. Structural results are however critical to Theorem 5 since it [10] J. Marden, G. Arslan, and J. S. Shamma. Cooperative control and potential games. IEEE Transactions on Systems, Man, and bounds the quality of PNE without proving that they exist. Cybernetics, Part B: Cybernetics, pages 1393–1407, 2009. We significantly generalize these results, using a weighted [11] P. Mol, A. Vattani, and P. Voulgaris. The effects of diversity in potential function. This is always an ordinal potential and it aggregation games. In Innovation in Computer Science, 2011. [12] D. Monderer. Multipotential games. In IJCAI, pages1422–1427, 2007. is an exact potential if and only if all weights are 1. [13] D. Monderer and L. Shapley. Potential games. Games and Econ. Definition 9: A game has a weighted potential function [13] Behavior, 14:124–143, 1996. n [14] B. Moore and K. Passino. Distributed task assignment for mobile Φ: ×i=1Σi → R with (positive) weights w1, . . . , wn if agents. IEEE Transactions on automatic control, 52(4):749–753, 2007. 0 0 ui(σi, σ−i) − ui(σi, σ−i) = wi · (Φ(σi, σ−i) − Φ(σi, σ−i)) [15] U. Nadav and T. Roughgarden. The limits of smoothness: Primal-dual for any player i and any strategies σ , σ0 ∈ Σ , σ ∈ Σ . framework for price of anarchy bounds. InWINE, pages319–326, 2010. i i i −i −i [16] R. Rosenthal. A class of games possessing pure-strategy Nash We are ready now for this section’s main result: any set of equilibria. International Journal of Game Theory, 2:65–67, 1973. players, with arbitrary β < 1 parameters, leads to a weighted [17] T. Roughgarden. Intrinsic robustness of the price of anarchy. In Proc. i of STOC, pages 513–522, 2009. potential function. The weighted potential is an explicit [18] T. Roughgarden and F. Schoppmann. Local smoothness and the price mapping of potentials in each sub-game. An analogous result of anarchy in atomic splittable congestion games. In SODA, 2011. when some β’s equal 1 follows easily. Thus Nash dynamics [19] W. Sandholm. Decompositions and potentials for normal form games. Games and Econ. Behavior, 70(2):446–456, 2010. converge (and PNE exist) in any aggregation system. [20] K. Tumer and D. Wolpert. A survey of collectives. In Collectives and Theorem : Fix an AG G with H = H1 ∪...∪HJ where Hj the Design of Complex Systems, pages 1–42. Springer, 2004. 7 [21] D. Wolpert and S. Bieniawski. Distributed control by Lagrangian j are vertices occupied by all β -leaders. Then G has weighted steepest descent. In Proc. 43rd IEEE Conf. on Decision and Control, potential (with weights 1 − βj > 0 for each player i ∈ Hj) pages 1562–1567, 2004.

j j XJ Φβj (H ,H \ H ) − EH Φ(H) = EH + j=1 1 − βj