Consensus over evolutionary graphs
Michalis Smyrnakis, Nikolaos M. Freris and Hamidou Tembine
arXiv:1803.02564v1 [math.OC] 7 Mar 2018

Abstract— We establish average consensus on graphs with dynamic topologies prescribed by evolutionary games among the strategic agents. Each agent possesses a private reward function and dynamically decides whether to create new links and/or whether to delete existing ones in a selfish and decentralized fashion, as indicated by a certain randomized mechanism. This model incurs a time-varying and state-dependent graph topology for which traditional consensus analysis is not applicable. We prove asymptotic average consensus almost surely and in mean square for any initial condition and graph topology. In addition, we establish exponential convergence in expectation. Our results are validated via simulation studies on random networks.

Index Terms— Consensus, Evolutionary Games, Evolutionary Graphs, Distributed Algorithms, Randomized Algorithms.

The authors are with the Division of Engineering at New York University Abu Dhabi, Saadiyat Island, P.O. Box 129188, Abu Dhabi, UAE. The last two authors are also with NYU Tandon School of Engineering. E-mails: {ms10775, nf47, tembine}@nyu.edu. This work was supported by the National Science Foundation (NSF) under grant CCF-1717207, and the U.S. Air Force Office of Scientific Research (AFOSR) under grant FA9550-17-1-0259.

I. INTRODUCTION

Evolutionary game theory has been established as a modeling tool for interactions between populations of strategic entities. Specifically, evolutionary games describe the population dynamics resulting from pairwise interactions. This framework has found numerous applications in various areas of multi-agent systems, such as wireless networks [1], [2], swarm robotics [3], and dynamic routing protocols [4].

Evolutionary graphs arise as an application of evolutionary game theory in modeling dynamic graph topologies. In this context, a population is organized as a network (graph), with the nodes (vertices) representing atoms (agents) and links (edges) representing interactions among them. This is captured by a weighted graph with a time-varying edge set determined by an evolutionary game. We will present a specific randomized decentralized mechanism for determining the graph topology based on the individual fitness functions of the agents. In our setting, each node maintains a local variable and computes its fitness function using its own value along with the values from its neighbors (both active and inactive, cf. Sec. III). Subsequently, it dynamically readjusts its neighbor set by randomly adding or deleting links with probabilities dictated by the resulting change of its fitness function.

Consensus is a canonical example of in-network coordination in multi-agent systems. Each agent maintains a local value, and the goal is for the entire network to reach agreement on a common value in a distributed fashion, i.e., via local exchanges of messages between neighboring (adjacent) agents. An archetypal problem is average consensus, where the goal is for each agent to asymptotically compute the average of all nodal values; cf. [5], [6], [7], [8] for a largely non-exhaustive list of references. This theme has proven a prevalent design tool for distributed multi-agent optimization [9], [10], signal processing [11], numerical analysis [12], and estimation [13], [14].

In this paper, we seek to bridge the gap between evolutionary game theory and distributed optimization by studying average consensus over evolutionary graphs. In our setting, each agent has a local variable that represents its 'strategy': the strategies evolve following a consensus protocol over a time-varying graph capturing inter-agent cooperations. At each time instant, agents dynamically select the agents with which they cooperate (from a candidate set) based on their utility (i.e., fitness), which depends on their own strategy and the strategies of the neighboring agents they intend to cooperate with. Specifically, agents create or drop links (i.e., cooperations) when they deem it beneficial for them, and they do so via a randomized decentralized mechanism, in a selfish manner. Examples include social networks [15] and the coordination of robot swarms [3].

We consider the problem of average consensus over networks of time-varying topology captured by a Markov Decision Process (MDP). Unlike prior work on the subject [8], [10], the topology depends on the agents' values, which renders previous analysis techniques inapplicable in our case. We proceed to establish average consensus a.s. (almost surely) and in the m.s. (mean square) sense, using stochastic Lyapunov techniques. Additionally, we prove that the convergence is exponential in expectation, and provide a lower bound on the expected convergence rate. Finally, our method is empirically assessed via numerical simulations.

The remainder of the paper is organized as follows: Sec. II exposes preliminaries on graph theory, consensus and evolutionary graph theory. In Sec. III, we present the problem formulation. The convergence analysis is presented in Sec. IV. Sec. V illustrates simulation results, while Sec. VI concludes the paper and discusses future research directions.

II. PRELIMINARIES

In this section, we recap essential background on graph theory, consensus protocols and evolutionary games. In the remainder of the paper, we use boldface for vectors and capital letters for matrices. Vectors are meant as column vectors, and we use 0, 1 to denote the vectors with all entries equal to zero, one, respectively, and I to denote the identity matrix (with the dimension made clear from the context in all cases). Last, we use the terms 'agent', 'vertex' and 'node' interchangeably, and the same holds true for 'edge' and 'link,' in what follows.
A. Graph theory

Consider the case of n interacting agents which aim to achieve consensus over a quantity of interest, for instance to compute the average of their values. Such a problem is an instance of computing on graphs, where each agent is modeled as a vertex of the graph and edges are drawn only between interacting nodes: we assume that two nodes can interact with each other (for instance, exchange private information) if and only if they are connected, i.e., there is an edge between them in the graph.

Formally, let a graph be denoted by G = (V, E), where V is the non-empty set of vertices (nodes) and E is the set of edges. In this paper, we restrict attention to undirected graphs, that is to say the edge set E consists of unordered pairs: (i, j) ∈ E implies that agents i, j can interact in a symmetric fashion, i.e., cooperate¹. We further assume that the graph does not contain self-loops (i.e., (i, i) ∉ E for all i ∈ V); this is without any loss in generality, since edges capture inter-agent interactions in our framework. We set n := |V|, m := |E| to denote the number of nodes and edges, respectively. We say that the graph G is connected if there is a path between any two nodes i, j ∈ V; otherwise, we say that the graph is disconnected.

¹The extension of our methods to directed graphs that model asymmetric interactions (e.g., asymmetric reward functions) will be the focal point of future work.

The adjacency matrix A ∈ R^{n×n} of the graph captures connections between the nodes: for any two nodes i, j ∈ V, a_{ij} is defined as a_{ij} = 1 if (i, j) ∈ E, and a_{ij} = 0 otherwise. The definition can be extended to weighted graphs, in which case a_{ij} can take an arbitrary value if (i, j) ∈ E. For an undirected graph, a_{ij} = a_{ji} for all i, j ∈ V, i.e., A is symmetric. Besides, A has a zero diagonal (a_{ii} = 0 for all i), since G is assumed to have no self-loops. The neighborhood of a node i contains all nodes that the node has a connection with, and is denoted by N_i := {j : a_{ij} ≠ 0}. The degree of node i is defined as the number of its neighbors, i.e., d_i := |N_i| = Σ_{j∈V} a_{ij}. Let D be the diagonal degree matrix (i.e., its (i, i)-th entry equals d_i, and off-diagonal entries are zero). We define L := D − A, the Laplacian of the graph. Clearly, L ∈ R^{n×n} is symmetric. Additionally, it can be shown that L is positive semidefinite [16]. It is well-known [16] that rank L = n − k, where k is the number of connected components of the graph. In particular, a graph is connected if and only if rank L = n − 1. By its very definition, the Laplacian has a zero eigenvalue with corresponding eigenvector the all-one vector 1. In fact, the multiplicity of the zero eigenvalue equals the number of connected components of the graph. The second smallest eigenvalue of the Laplacian is called the Fiedler value or algebraic connectivity of the graph, and is denoted λ₂(G) := λ₂(L). It is positive if and only if the graph is connected.

B. Consensus

Each agent i ∈ V maintains a local scalar value x_i. We call the state of the network the vector x obtained by stacking all nodal values {x_i}_{i∈V}. We use x_{i,t} to denote the value of node i at time t, and, correspondingly, x_t for the network state at time t.

A widely studied method which describes the evolution of x_i is linear consensus [5], [8]. The dynamics for x_i can be written (up to a multiplicative constant) as

ẋ_{i,t} = Σ_{j∈N_i} a_{ij} (x_{j,t} − x_{i,t}),

which can further be written compactly in matrix form:

ẋ_t = −L x_t.

For an undirected and connected graph, the network reaches average consensus, i.e.,

x_t → Ave(x_0) 1 as t → ∞,

where Ave(x_0) := (1/n) 1ᵀ x_0 is the average of the values at the initial time 0 [8]. Besides, the convergence rate is exponential, with rate lower-bounded by the Fiedler value [8].

In this paper, we will study consensus over time-varying graphs as abstracted by a time-varying Laplacian L_t, dictated by agents' randomized decisions; that is to say, L_t = L(x_t, E_{t−}) is a random matrix that depends on both the state at time t and the topology 'right before' time t; consequently, the analysis in existing literature [5], [6], [7], [8], [10] does not directly carry through.

C. Evolutionary graphs

An evolutionary graph is a graph whose topology is specified by an evolutionary game [17], [18] of a single population with a finite number of players, n, placed on a graph G. The interactions of players are captured by the edge set of G. Each player i has a set of actions s_i ∈ S_i and receives a certain pay-off according to its utility (or fitness) function, which is a mapping from the joint action space S := S_1 × S_2 × ... × S_n to the real numbers, f_i : S → R. The evolution of the graph topology follows a stochastic process administered through pairwise interactions. In particular, two players (that are allowed to interact) are randomly selected, and a "copy" of the player with the higher fitness takes the place of the player with the lower one. As a result, the strategy of the player with the smaller fitness is replaced by the strategy of the player with the higher one.

III. PROBLEM FORMULATION

In a cyberphysical system, such as a wireless sensor network [19], [20], it is common that agents may opt to dynamically create new links with other agents or drop existing ones during the coordination process; this behavior results in time-varying graphs. In order to properly describe this process, it is necessary to formulate a dynamic graph whose structure depends on time, topology and network state. In what follows, we define a dynamic graph as a graph with a fixed predetermined vertex set V of agents, in which the edge set E varies over time, in a state-dependent randomized fashion.
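The Laplacian facts of Sec. II-A and the linear consensus dynamics of Sec. II-B are easy to check numerically. The following is a minimal sketch (not part of the paper's own simulations) using a hypothetical 5-node path graph and a forward-Euler discretization of ẋ = −Lx:

```python
import numpy as np

# Hypothetical 5-node undirected path graph (illustration only).
n = 5
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

D = np.diag(A.sum(axis=1))   # diagonal degree matrix
L = D - A                    # graph Laplacian, L = D - A

# Sec. II-A facts: rank L = n - k (k connected components),
# and the Fiedler value lambda_2 > 0 iff the graph is connected.
eigvals = np.sort(np.linalg.eigvalsh(L))
assert np.linalg.matrix_rank(L) == n - 1  # path graph is connected
assert eigvals[1] > 0                     # Fiedler value

# Linear consensus x_dot = -L x, forward-Euler discretization.
x = np.array([0.9, 0.1, 0.5, 0.3, 0.7])
avg = x.mean()
dt = 0.01
for _ in range(20000):
    x = x - dt * (L @ x)

# For a connected undirected graph, x_t -> Ave(x_0) * 1.
print(np.allclose(x, avg, atol=1e-6))
```

The step size is chosen conservatively; forward Euler is stable here whenever dt < 2/λ_max(L), and the iteration preserves the average exactly since 1ᵀL = 0.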
In particular, the edge set E = E(x_t, t, ω) ⊆ V × V is a state-/time-dependent random set in a probability space (Ω, F, P) (where Ω is the sample space, F is the σ-algebra on Ω, and P is the probability measure); correspondingly, we define the adjacency matrix A = A(x_t, t, ω) and the Laplacian matrix L = L(x_t, t, ω). In this paper, we focus on time-varying graphs with topology at time t depending on both the state at time t and the topology 'right before' time t, i.e., L_t = L(x_t, E_{t−}, ω). We use the shorthand notation E_t, A_t, L_t, and G_t = (V, E_t) to emphasize the time-varying aspect.

For each node i ∈ V, we denote by N_i the set of all feasible neighbors, i.e., the set of all nodes that node i can potentially create a connection with. Since we focus on undirected graphs, we assume that j ∈ N_i implies that i ∈ N_j. At each time t, for any given node i ∈ V, we denote the set of active neighbors, i.e., the set of nodes with which i is connected, using N^{(1)}_{i,t} ⊆ N_i. Furthermore, we let N^{(2)}_{i,t} := N_i \ N^{(1)}_{i,t} denote the set of inactive neighbors, i.e., the set of nodes that i is not connected with, but may decide to connect with based on the evolutionary game. The degree of a node i at time t, |N^{(1)}_{i,t}|, is the total number of active neighbors of node i at time t. For the subsequent treatment, we make the assumption that the graph obtained by taking the union of all feasible neighbor sets is connected:

Assumption 1: The graph G₀ = (V, E₀), with E₀ := ∪_{i∈V} ∪_{j∈N_i} {(i, j)}, is connected.

This condition is necessary for consensus to be achieved: otherwise, the graph will be disconnected at each time regardless of the agents' decisions, with no possible exchange of information across its connected components, which implies that consensus is infeasible. We will establish the sufficiency of the condition for the dynamic topology instructed by the evolutionary game we propose.
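Assumption 1 is straightforward to verify in practice with a graph traversal over the feasible edge set E₀. A minimal sketch (the feasible-neighbor sets below are hypothetical, and symmetry j ∈ N_i ⇔ i ∈ N_j is assumed as in the text):

```python
from collections import deque

def assumption1_holds(feasible):
    """Check Assumption 1: the graph G0 = (V, E0), with E0 the union of
    all feasible-neighbor pairs, is connected. `feasible` maps each node
    to its feasible-neighbor set N_i (assumed symmetric)."""
    nodes = list(feasible)
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:                      # BFS over the edges of E0
        i = queue.popleft()
        for j in feasible[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(nodes)

# Hypothetical feasible-neighbor sets for 4 agents.
N = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(assumption1_holds(N))                                  # connected union graph
print(assumption1_holds({0: {1}, 1: {0}, 2: {3}, 3: {2}}))   # two components
```

The first call returns True (a path over all four agents), the second False (two isolated pairs), matching the necessity discussion above.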
Let x_t = (x_{1,t}, ..., x_{n,t}) denote the coordination levels of the agents, where we will restrict attention (without any loss in generality) to the case that 0 ≤ x_{i,t} ≤ 1, ∀i ∈ V, t ≥ 0. For instance, the evolution of the coordination levels may be considered as a resource allocation process, in which agents decide to share a percentage of a resource they own with their neighbors. Similarly, in an opinion dynamics setup, the coordination levels may reflect the beliefs of the agents, e.g., the information state of vehicles in a robot team.

The agents adjust their coordination levels based on interactions with their neighbors, according also to their tendency to create a link with other agents or drop an existing one. The evolution of an agent's coordination level may be described by the following dynamic consensus protocol:

ẋ_{i,t} = Σ_{j∈N^{(1)}_{i,t−}} χ^m_{ij,t} (x_{j,t} − x_{i,t}) + Σ_{k∈N^{(2)}_{i,t−}} χ^c_{ik,t} (x_{k,t} − x_{i,t}),   (1)

where χ^m_{ij,t}, χ^c_{ik,t} are 0−1 variables that respectively indicate whether to: a) maintain an existing link (i.e., a link that is active 'right before' time t, equivalently (i, j) with j ∈ N^{(1)}_{i,t−}), if χ^m_{ij,t} = 1 (χ^m_{ij,t} = 0 means that the link is dropped); and b) create a new link (i, k) with k ∈ N^{(2)}_{i,t−}, if χ^c_{ik,t} = 1. Clearly, we set χ^m_{ji,t} ≡ χ^m_{ij,t} and χ^c_{ki,t} ≡ χ^c_{ik,t}. The decisions are Bernoulli random variables with respective 'success' probabilities (the probability of the value 1) given by 0 ≤ w^m_{ij,t} ≤ 1 and 0 ≤ w^c_{ik,t} ≤ 1.

In this paper, the decision rules are state-dependent and time-invariant, i.e., χ^m_{ij,t}, χ^c_{ik,t} are independent Bernoulli random variables with success probabilities that depend on the coordination levels of the two neighbors: w^m_{ij,t} = w^m_{ij,t}(x_{i,t}, x_{j,t}), w^c_{ik,t} = w^c_{ik,t}(x_{i,t}, x_{k,t}); cf. (6), (7) for their exact definition.

The following remark underlines the inapplicability of previous analysis [8] in our setting.

Remark 1: The graph corresponding to the Laplacian matrix L_t may be disconnected. Indeed, since decisions are probabilistic, there is a positive probability that the resulting graph is disconnected (even the event of an empty edge set has positive probability) at each given time instant.

We may stack the decisions {χ^m_{ij,t}, χ^c_{ij,t}} into a corresponding Laplacian matrix L_t = L(x_t, E_{t−}, ω) with entries {l_{ij}}_{i,j∈V} (dropping time dependency for notational simplicity) defined by:

l_{ij} = l_{ji} := −( 1_{{j∈N^{(1)}_i}} χ^m_{ij} + 1_{{j∈N^{(2)}_i}} χ^c_{ij} ),
l_{ii} := −Σ_{j≠i} l_{ij},

where 1_{·} is the 0−1 indicator function (1 if the event holds and 0 else). We proceed to write the update rule in matrix form as follows:

ẋ_t = −L_t x_t.   (2)

We call this the state evolution equation; note that, by its very definition, it constitutes a continuous Markov Decision Process (MDP).

The following proposition shows that all coordination levels are guaranteed to remain in the interval [0, 1] if they are initialized in [0, 1], i.e., it establishes that the set [0, 1]^n is forward invariant.

Proposition 1: Suppose x_0 ∈ [0, 1]^n. Then, under the state evolution (2), x_t ∈ [0, 1]^n for all t > 0, i.e., [0, 1]^n is forward-invariant.

Proof: The proof considers two cases: the first case considers a nodal value reaching the upper bound (1), and the second the lower bound (0).

Case 1: Assume that for some t ≥ 0, there exists i ∈ V with x_{i,t} = 1 and that x_s ∈ [0, 1]^n for all s ≤ t. Then, given that x_{j,t} ∈ [0, 1] for all j ≠ i, it follows that

ẋ_i = Σ_{j∈V\{i}} −l_{ij} (x_{j,t} − 1) ≤ 0,

because x_{j,t} ≤ 1 and −l_{ij} ≥ 0. Therefore x_i can never exceed the value 1.

Case 2: Assume that for some t ≥ 0, there exists i ∈ V with x_{i,t} = 0 and that x_s ∈ [0, 1]^n for all s ≤ t. Since x_{j,t} ∈ [0, 1] for all j ≠ i, it follows that

ẋ_i = Σ_{j∈V\{i}} −l_{ij} x_{j,t} ≥ 0,

therefore x_i can never go below 0.

Remark 2: The forthcoming analysis applies irrespective of the assumption that the values x_t ∈ [0, 1]^n, i.e., it holds for arbitrary initial conditions x_0. This assumption is adopted solely for the sake of interpretability in the context of evolutionary games.

A. Evolutionary game

In this section, we provide a rule for selecting the weights (i.e., probabilities) w^m_{ij,t}, w^c_{ij,t} based on a particular evolutionary game. We use the Continuous Actions Iterative Prisoner's Dilemma (CAIPD) [21] to define the fitness function of a given node and illustrate how the weights are selected.

In CAIPD, there are n agents that choose their coordination levels given their neighbors' decisions. Each agent i has to pay a fee that is related to its coordination level and gains a reward related to the coordination levels of its neighbors: the higher the coordination level of agent i and the coordination levels of its neighbors are, the higher the cost and reward are, respectively.

Formally, the reward of agent i when it sets its coordination level to x_i is defined using the following fitness function (where we drop dependency on time t henceforth, since the definition of fitness in CAIPD is time-independent):

f_i(x) = b Σ_{j∈N^{(1)}_i} x_j − c |N^{(1)}_i| x_i,   (3)

In the evolutionary game, a link may be created/dropped based on whether both agents desire to coordinate or not, as measured by the corresponding increment (or decrement) of their individual fitness functions. Essentially, if both agents benefit from maintaining/creating a link, the corresponding probability must be higher than in the case where only one node benefits, or when the fitness of both agents is decreased. In [22], a sigmoid function was used to determine the weights w^m_{ij,t} and w^c_{ij,t}. In our formulation, the weights correspond to the probabilities that agent i maintains (one minus the probability that it drops) or creates link (i, j), respectively. Following a similar approach, the weights are selected as (where we once again drop time dependency, since the weight rule is state-dependent but time-invariant):

w^m_{ij} = 1/2 − (1/2) tanh( f̃^d_{ij}(x) + f̃^d_{ji}(x) ) = 1/2 − (1/2) tanh( (c − b)(x_i + x_j) ),   (6)

w^c_{ij} = 1/2 + (1/2) tanh( f̃^c_{ij}(x) + f̃^c_{ji}(x) ) = 1/2 + (1/2) tanh( (b − c)(x_i + x_j) ).   (7)

Note that, by definition, the two values are equal and lower-bounded by 1/2 (in light of the fact that x_t ∈ [0, 1]^n for all t ≥ 0; cf. Proposition 1). We define the weighted Laplacian matrix W with entries (again dropping time dependency for notational simplicity) given by:

w_{ij} = w_{ji} := −( 1_{{j∈N^{(1)}_i}} w^m_{ij} + 1_{{j∈N^{(2)}_i}} w^c_{ij} ),
w_{ii} := −Σ_{j≠i} w_{ij},   (8)

IV. CONVERGENCE ANALYSIS

Formally, for t > 0, L_t is an F_{t−}-measurable random matrix, where the σ-algebra F_{t−} is defined by F_{t−} := σ(∪_s
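As an illustration of the overall mechanism, the randomized link decisions with probabilities (6), (7) driving the state evolution (2) can be simulated directly. The following is a minimal sketch only; it is not the authors' experimental setup, and the parameter values b, c, the all-pairs feasible-neighbor set, and the forward-Euler discretization are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical CAIPD parameters: benefit b and cost c (b > c assumed).
b, c = 1.0, 0.4
n = 6
dt = 0.05

def w_maintain(xi, xj):
    # Eq. (6): probability of keeping an active link.
    return 0.5 - 0.5 * np.tanh((c - b) * (xi + xj))

def w_create(xi, xj):
    # Eq. (7): probability of creating an inactive feasible link.
    return 0.5 + 0.5 * np.tanh((b - c) * (xi + xj))

x = rng.uniform(0.0, 1.0, n)
avg0 = x.mean()
active = np.zeros((n, n), dtype=bool)   # start from an empty edge set

for _ in range(4000):
    # Randomized, symmetric link decisions (all pairs feasible here).
    chi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            p = w_maintain(x[i], x[j]) if active[i, j] else w_create(x[i], x[j])
            chi[i, j] = chi[j, i] = float(rng.random() < p)
    active = chi.astype(bool)
    L = np.diag(chi.sum(axis=1)) - chi  # Laplacian of the realized graph
    x = x - dt * (L @ x)                # Euler step of x_dot = -L_t x_t, eq. (2)

# The average is preserved and the coordination levels cluster around it.
print(abs(x.mean() - avg0) < 1e-9, x.std() < 1e-3)
```

Since each realized L_t satisfies 1ᵀL_t = 0, every Euler step preserves the average, and with b > c the creation probability (7) is at least 1/2, so the realized graphs are connected often enough for the disagreement to shrink, in line with the a.s. consensus result stated in the introduction.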
REFERENCES

[1] H. Tembine, E. Altman, R. El-Azouzi, and Y. Hayel, "Evolutionary games in wireless networks," IEEE Transactions on Systems, Man,