Consensus of Interacting Particle Systems on Erdős–Rényi Graphs
Grant Schoenebeck∗ Fang-Yi Yu†
Abstract

Interacting Particle Systems—exemplified by the voter model, iterative majority, and iterative k-majority processes—have found use in many disciplines including distributed systems, statistical physics, social networks, and Markov chain theory. In these processes, nodes update their "opinion" according to the frequency of opinions amongst their neighbors. We propose a family of models parameterized by an update function that we call Node Dynamics: every node initially has a binary opinion. At each round a node is chosen uniformly at random and updates its opinion randomly according to the probability distribution specified by the value of the update function applied to the frequencies of its neighbors' opinions.

In this work, we prove that Node Dynamics converge to consensus in time Θ(n log n) in complete graphs and dense Erdős–Rényi random graphs when the update function is from a large family of "majority-like" functions. Our technical contribution is a general framework that upper bounds the consensus time. In contrast to previous work that relies on handcrafted potential functions, our framework systematically constructs a potential function based on the structure of the state space.

1 Introduction

We propose the following stochastic process—which we call Node Dynamics—on a given network of n agents, parameterized by an update function f : [0, 1] → [0, 1]. In the beginning, each agent holds a binary "opinion", either red or blue. Then, in each round, an agent is chosen uniformly at random and updates its opinion to be red with probability f(p) and blue with probability 1 − f(p), where p is the fraction of its neighbors with the red opinion.

Node Dynamics generalizes processes of interest in many different disciplines including distributed systems, statistical physics, social networks, and even biology.

Voter Model: In the voter model, at each round, a random node chooses a random neighbor and updates to its opinion. This corresponds to the Node Dynamics with f(x) = x. This model has been extensively studied in mathematics [15, 22, 27, 28], physics [6, 9], and even in social networks [8, 34, 35, 36, 14]. A key question studied is how long it takes the dynamics to reach consensus on different network topologies.

Iterative majority: In the iterative majority dynamics, in each round, a randomly chosen node updates to the opinion of the majority of its neighbors. This corresponds to the Node Dynamics where

  f(x) = 1 if x > 1/2;  f(x) = 1/2 if x = 1/2;  f(x) = 0 if x < 1/2.

Typically, works about majority dynamics study 1) when the dynamics converge, 2) how long it takes the dynamics to converge, and 3) whether they converge to the original majority opinion—that is, does majority dynamics successfully aggregate the original opinion [25, 7, 23, 31, 37].

Iterative k-majority: In this dynamics, in each round, a randomly chosen node collects the opinions of k randomly chosen (with replacement) neighbors and updates to the opinion of the majority of those k opinions. This corresponds to the Node Dynamics where

  f(x) = Σ_{ℓ=⌈k/2⌉}^{k} (k choose ℓ) x^ℓ (1 − x)^{k−ℓ}.

A synchronized variant of this dynamics has been proposed as a protocol for stabilizing consensus: a collection of n agents initially hold a private opinion and interact with the goal of agreeing on one of the choices, in the presence of O(√n)-dynamic adversaries which can adaptively change the opinions of up to O(√n) nodes at every round. For this synchronized variant, Doerr et al. [17] prove that 3-majority reaches "stabilizing almost" consensus on the complete graph in the presence of O(√n)-dynamic adversaries. Many works extend this result beyond binary opinions [16, 13, 5, 1].

∗University of Michigan, [email protected]. He is gratefully supported by the National Science Foundation under Career Award 1452915 and AitF Award 1535912.
†University of Michigan, [email protected]. He is gratefully supported by the National Science Foundation under AitF Award 1535912.
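The update functions surveyed above are easy to state concretely. As an illustration (our own sketch, not code from the paper), here are the voter, iterative-majority, and iterative k-majority update functions in Python; the function names are ours:

```python
from math import comb, ceil

def f_voter(x):
    # Voter model: adopt the opinion of a uniformly random neighbor,
    # so the probability of turning red equals the red fraction x.
    return x

def f_majority(x):
    # Iterative majority: deterministically adopt the neighborhood majority,
    # breaking exact ties uniformly at random.
    return 1.0 if x > 0.5 else (0.5 if x == 0.5 else 0.0)

def f_k_majority(x, k):
    # Iterative k-majority: probability that at least ceil(k/2) of k
    # neighbors sampled with replacement are red.
    return sum(comb(k, l) * x**l * (1 - x)**(k - l)
               for l in range(ceil(k / 2), k + 1))
```

Note that f_k_majority(x, 1) recovers the voter model, and for odd k ≥ 3 the k-majority function amplifies the majority (e.g. f_k_majority(0.7, 3) = 0.784 > 0.7), which is the "majority-like" behavior studied in this paper.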
Copyright © 2018 by SIAM. Unauthorized reproduction of this article is prohibited.

Iterative ρ-noisy majority model [20, 21]: In this dynamics, in each round, a randomly chosen node updates to the majority opinion of its neighbors with probability 1 − ρ and to a uniformly random opinion with probability ρ:

  f(x) = 1 − ρ/2 if x > 1/2;  f(x) = 1/2 if x = 1/2;  f(x) = ρ/2 if x < 1/2.

Genetic Evolution Model: In biological systems, the chance of survival of an animal can depend on the frequencies of its kin and foes in the network [3, 29]. Moreover, this frequency-dependent dynamics is also known to model the dynamics for maintaining the genetic diversity of a population [24, 32].

Our Contribution. We focus on a large set of update functions f that are symmetric, smooth, and satisfy a property we call "majority-like", intuitively meaning that agents update to the majority opinion strictly more often than the fraction of neighbors holding the majority opinion. We obtain tight bounds for the consensus time—the time that it takes the system to reach a state where each node has an identical opinion—on Erdős–Rényi random graphs.

Our main technical tool is a novel framework for upper bounding the hitting time of a general discrete-time homogeneous Markov chain (X, P), including non-reversible and even reducible Markov chains. This framework decomposes the problem so that we only need to upper bound two sets of parameters for all x ∈ X: the reciprocal 1/p+(x) of the probability of decreasing the distance to the target, and the ratio p−(x)/p+(x) of the probability of increasing the distance to the target to the probability of decreasing it. Our technique can give much stronger bounds than simply lower bounding p+(x) and upper bounding p−(x).

Once we apply this decomposition to our consensus time problem, the problem becomes very manageable. We show the versatility of our approach by extending the results to a variant of the stabilizing consensus problem, where we show that all majority-like dynamics converge quickly to "stabilizing almost" consensus on the complete graph in the presence of adversaries.

A large volume of literature is devoted to bounding the hitting time of different Markov processes and achieving fast convergence. The techniques typically employed are (1) showing the Markov chain has fast mixing time [30], (2) reducing the dimension of the process to a small set of parameters (e.g. the frequency of each opinion) and using a mean field approximation and a concentration property to control the behavior of the process [5], or (3) using handcrafted potential functions [31].

Our results fill in a large gap that these techniques do not adequately cover. Mixing time is not well-defined for non-reversible or reducible Markov chains, and so does not apply to Markov chains with multiple absorbing states, as in the consensus time question we study. Reducing the dimension and using a mean field approximation fails for two reasons. First, summarizing with a small set of parameters is not possible when the process of interest has small imperfections (as in a fixed Erdős–Rényi graph). Second, the mean field of our dynamics has unstable fixed points; in such cases the mean field does not serve as a useful proxy for the Markov process. Handcrafting potential functions also runs into several problems. The first is that, because we consider dynamics on random graphs, the dynamics is not a priori well specified, so there is no specific dynamics to handcraft a potential function for. Secondly, we wish to solve the problem for a large class of update functions f, and so cannot individually handcraft a potential function for each one; typically, the potential function is closely tailored to the details of the process.

Additional Related Work. Our model is similar to that of Schweitzer and Behera [33], who study a variety of update functions in the homogeneous setting (complete graph) using simulations and heuristic arguments. However, they leave a rigorous study to future work.

2 Preliminaries

2.1 Node Dynamics. Given an undirected graph G = (V, E), let Γ(v) be the neighbors of node v and deg(v) = |Γ(v)|.

We define a configuration x^(G) : V → {0, 1} to assign the "color" of each node v ∈ G to be x^(G)(v), so that x^(G) ∈ {0, 1}^n. We will usually suppress the superscript when it is clear. We will use uppercase (e.g., X^(G)) when the configuration is a random variable. Moreover, we say v is red if x(v) = 1 and blue if x(v) = 0. We then write the set of red vertices as x^{−1}(1). We say that a configuration x is in consensus if x(·) is a constant function (so all nodes are red or all nodes are blue). Given a node v in configuration x, we define r_x(v) = |Γ(v) ∩ x^{−1}(1)|/deg(v) to be its fraction of red neighbors.

Definition 2.1. An update function is a mapping f : [0, 1] → [0, 1] with the following properties:

Monotone: ∀x, y ∈ [0, 1], if x < y, then f(x) ≤ f(y).
Symmetric: ∀t ∈ [0, 1/2], f(1/2 + t) = 1 − f(1/2 − t).
Absorbing: f(0) = 0 and f(1) = 1.

We define node dynamics as follows:

Definition 2.2. A node dynamics ND(G, f, X0) with an undirected graph G = (V, E), update function f, and initial configuration X0 is a stochastic process over configurations {X_t}_{t≥0}, where X0 is the initial configuration. The dynamics proceeds in rounds. At round t, a node v is picked uniformly at random, and we update

  X_t(v) = 1 with probability f(r_{X_{t−1}}(v)), and X_t(v) = 0 otherwise.

This formulation is general enough to contain many well-known dynamics such as the aforementioned voter model, iterative majority model, and 3-majority dynamics.

Note that in some of the original definitions the nodes synchronously update; whereas, to make our presentation more cohesive, we only consider asynchronous updates.

In this paper, we will focus on the interaction between the update function f and the geometric structure of G. More specifically, we are interested in the consensus time, defined as follows.

Definition 2.3. The consensus time of node dynamics ND(G, f, X0) is a random variable T(G, f, X0) denoting the first time step at which ND is in a consensus configuration. The (maximum) expected consensus time ME(G, f) is the maximum expected consensus time over any initial configuration, ME(G, f) = max_{X0} E[T(G, f, X0)].

Now we define some properties of functions.

Definition 2.4. Given positive M1, M2, a function f : R → R is called M1-Lipschitz in I ⊆ R if for all x, y ∈ I,

  |f(x) − f(y)| ≤ M1|x − y|.

Moreover, f is M2-smooth in I ⊆ R if for all x, y ∈ I,

  |f′(x) − f′(y)| ≤ M2|x − y|.

2.2 Potential Theory for Markov Chains. Let M = (X_t, P) be a discrete time-homogeneous Markov chain with finite state space Ω and transition matrix P. For x, a ∈ Ω, we define τ_a(x) to be the hitting time of a with initial state x:

(2.1)  τ_a(x) ≜ min{t ≥ 0 : X_t = a, X_0 = x},

and τ_A(x) to be the hitting time of a set of states A ⊆ Ω:

  τ_A(x) ≜ min{t ≥ 0 : X_t ∈ A, X_0 = x}.

By the Markov property, the expected hitting time can be written as a system of linear equations:

  E_M[τ_A(x)] = 1 + Σ_{y∈Ω} P_{x,y} E_M[τ_A(y)] if x ∉ A, and E_M[τ_A(x)] = 0 if x ∈ A.

Due to the memoryless property of a Markov chain, it is sometimes useful to analyze its first step. Consider a general measurable function w : Ω → R. If the Markov chain starts at state X = x and the next state is the random variable X′, then the average change of w(X) in one transition step is given by

  (Lw)(x) ≜ E_M[w(X′) − w(X) | X = x] = Σ_{y∈Ω} P_{x,y} w(y) − w(x).

To reduce notation, we will use E_M[w(X′) | X] to denote the expectation of the measurable function w(X′) given the previous state X.

Definition 2.5. Given a Markov chain M with state space Ω, D ⊊ Ω, and two real-valued functions ψ, φ with domain Ω, we define the Poisson equation as the problem of solving for the function w : Ω → R such that

  Lw(x) = −φ(x) where x ∈ D,
  w(x) = ψ(x) where x ∈ ∂D,

where ∂D ≜ (∪_{x∈D} supp P(x, ·)) \ D is the exterior boundary of D w.r.t. the Markov chain.

Note that solving for the expected hitting time of a set A is a special case of the above problem, obtained by taking D = Ω \ A, φ(x) = 1, and ψ(x) = 0. The next fundamental theorem shows that super solutions to an associated boundary value problem provide upper bounds for the Poisson equation in Definition 2.5.

Theorem 2.1. (Maximum principle [19]) Given a Markov chain M with state space Ω, D ⊊ Ω, and two real-valued functions ψ, φ with domain Ω, suppose s : Ω → R is a non-negative function satisfying

  Ls(x) ≤ −φ(x) where x ∈ D,
  s(x) ≥ ψ(x) where x ∈ ∂D.

Then s(x) ≥ w(x) for all x ∈ D.

Corollary 2.1. (Super solution for hitting time) Given a Markov chain M with state space Ω and a set of states A ⊊ Ω, suppose s_A : Ω → R is a non-negative function satisfying

  Ls_A(x) ≤ −1 where x ∉ A,
  s_A(x) ≥ 0 where x ∈ A.

Then s_A(x) ≥ E_M[τ_A(x)] for all x ∉ A. Moreover, we call s_A a potential function for short.

2.3 Concentration Inequalities.

Theorem 2.2. ([4]) Let X = (x1, …, xN) be a finite set of N real numbers, let X1, …, Xn denote a random sample without replacement from X, and let Y1, …, Yn denote a random sample with replacement from X. If f : R → R is continuous and convex, then

  E f(Σ_{i=1}^n Xi) ≤ E f(Σ_{i=1}^n Yi).

Theorem 2.3. (A Chernoff bound [18]) Let X ≜ Σ_{i=1}^n Xi, where the Xi for i ∈ [n] are independently distributed in [0, 1]. Then for 0 < ε < 1,

  P[X > (1 + ε)EX] ≤ exp(−ε²EX/3),
  P[X < (1 − ε)EX] ≤ exp(−ε²EX/2).

If a bounded function g on a probability space (X, P) is Lipschitz for most of the measure on X, then the following theorem proves a concentration property of g by using a union bound and Azuma's inequality.

Theorem 2.4. (Bad events [18]) Consider a random object X = (X1, …, Xn) with probability measure P. Let E be an event in the space of X. Fix a real-valued function g with domain X which is bounded, m ≤ g(X) ≤ M. Let c_i be the maximum effect of changing the i-th input coordinate of g, conditioned on both inputs being in E:

  sup_{x, x′ ∈ E : x_j = x′_j ∀j ≠ i} |g(x) − g(x′)| ≤ c_i.

Then P[g(X) > E[g(X)] + t + (M − m)P[¬E]] is bounded above by exp(−2t²/Σ_i c_i²).

We say a sequence of events {A_n}_{n≥1} happens with high probability if lim_{n→∞} P[A_n] = 1, that is, P[A_n] = 1 − o(1).

2.4 Erdős–Rényi Random Graphs. Here we present the definition of Erdős–Rényi random graphs and several well-known properties of them that we need.

Definition 2.6. (Erdős–Rényi Random Graph) G_{n,p} is a random undirected graph on node set V = [n]¹ where each pair of nodes is independently connected with a fixed probability p. We further use G_{n,p} to denote this random object.

Let A_G be the adjacency matrix of G, so (A_G)_{i,j} = 1 if v_i ∼ v_j and 0 otherwise, and let Ā = E_{G_{n,p}}[A_G], so Ā_{i,j} = p if i ≠ j and 0 otherwise. Let deg(v) be the degree of node v.

Definition 2.7. The weighted adjacency matrix of an undirected graph G is defined by

  M_G(i, j) = 1/√(deg(v_i)deg(v_j)) if (A_G)_{i,j} = 1, and 0 otherwise.

Definition 2.8. (Expansiveness [12]) For λ ∈ [0, 1], we say that an undirected graph G is a λ-expander if |λ_k(M_G)| ≤ λ for all k > 1, where λ_k(M_G) is the k-th largest eigenvalue of M_G.

Theorem 2.5. (Spectral profile of G_{n,p} [11]) For G_{n,p}, denote by I the identity matrix and by J the all-ones matrix. If G_{n,p} has p = ω(log n / n), then with probability at least 1 − 1/n, for all k,

  |λ_k(M_G) − λ_k(M̄)| = O(√(log n/(np))),

where M̄_{i,j} = 1/(n − 1) if i ≠ j and M̄_{i,i} = 0.

Because the spectrum of M̄ is {1, −1/(n − 1)}, where −1/(n − 1) has multiplicity n − 1, we have the following corollary.

Corollary 2.2. If p = ω(log n / n), then G ∼ G_{n,p} is an O(√(log n/(np)))-expander with probability 1 − O(1/n).

Let e(S, T) denote the number of edges between S and T (double counting edges from S ∩ T to itself), and let vol(S) count the number of edges adjacent to S. The following lemma relates the number of edges between two sets of nodes in an expander to their expected number in a random graph.

Lemma 2.1. (Irregular mixing lemma [10]) If G is a λ-expander, then for any two subsets S, T ⊆ V:

  |e(S, T) − vol(S)vol(T)/vol(G)| ≤ λ√(vol(S)vol(T)).

Finally, let E(δ_d; v) denote the event that the degree of some fixed node v is between (1 − δ_d)np and (1 + δ_d)np, and let E(δ_d) = ∩_{v∈V} E(δ_d; v) be the nearly uniform degree event. By applying Theorem 2.3 we obtain the following lemma.

Lemma 2.2. (Uniform degree) For any v ∈ V, if G ∼ G_{n,p},

(2.2)  P[¬E(δ_d; v)] ≤ 2 exp(−δ_d² np/3).

Furthermore, by a union bound,
(2.3)  P[¬E(δ_d)] ≤ 2n exp(−δ_d² np/3).

¹[n] = {1, 2, …, n}

3 Warm-up: Majority-like Update Function on the Complete Graph

In this section we consider majority-like node dynamics on the complete graph K_n with n nodes, in which every pair of nodes has an edge (no self-loops). We use this as a toy example to give intuition for dense Erdős–Rényi graphs, even though we will obtain better bounds later.

Theorem 3.1. Let M = ND(K_n, f, X0) be a node dynamic over the complete graph K_n with n nodes. If the update function f satisfies x ≤ f(x) for all x with 1/2 < x < 1, then the maximum expected consensus time of a node dynamic over K_n is

  ME(K_n, f) = O(n²).

A standard method of proving fast convergence is to guess a potential function on the states and prove that its expectation decreases by 1 after every step—this is just an application of Corollary 2.1. As a warm-up, we will prove Theorem 3.1 by guessing such a potential function and applying Corollary 2.1.

Proof. [Theorem 3.1] Given a configuration x, define Pos(x) ≜ |x^{−1}(1)|. Then all red nodes v (those with x(v) = 1) have r_x(v) = (Pos(x) − 1)/(n − 1); otherwise r_x(v) = Pos(x)/(n − 1).

Because the node dynamics M is on the complete graph, M is lumpable with respect to the partition {Σ_l}_{0≤l≤n} where Σ_l = {x ∈ Ω : Pos(x) = l}: for any subsets Σ_i and Σ_j in the partition, and for any states x, y in subset Σ_i,

  Σ_{z∈Σ_j} P(x, z) = Σ_{z∈Σ_j} P(y, z).

Furthermore, inspired by an analysis of the voter model [2], we consider ψ : [n] → R defined as

  ψ(k) = (n − 1)[k(H(n − 1) − H(k − 1)) + (n − k)(H(n − 1) − H(n − k − 1))],

where H(k) ≜ Σ_{ℓ=1}^{k} 1/ℓ, and define the potential function as

(3.4)  φ(x) = ψ(Pos(x)).

The proof of the following claim is deferred to the full version; here we just give some intuition as to why this potential function for the voter model works. The sequence (Pos(X_t))_{t≥0} can be seen as a random walk on {0, 1, …, n} with drift². Moreover, the drift depends on f(pos(x)) − pos(x), where pos(x) ≜ Pos(x)/n. For the voter model f(x) = x, there is no drift. For a majority-like function, there is a positive drift toward n when Pos(x) > n/2 and a negative drift toward 0 when Pos(x) < n/2. Informally, the drift always helps, and thus the potential function for the voter model works.

Claim 3.1. Our definition of φ satisfies the inequalities of Corollary 2.1: given the Markov chain M = ND(K_n, f, X0) in Theorem 3.1, φ defined in (3.4) is non-negative and satisfies

  Lφ(x) ≤ −1 where x ≠ 0^n, 1^n,
  φ(x) ≥ 0 where x = 0^n, 1^n.

Combining Claim 3.1 and Corollary 2.1, we have E_M[T(K_n, f, x)] ≤ φ(x). By direct computation, if 0 ≤ k < n, then ψ(k + 1) − ψ(k) = (n − 1)(H(n − k − 1) − H(k)). Therefore the maximum of ψ(k) occurs at k = ⌊n/2⌋, so

  ME(K_n, f) ≤ ψ(⌊n/2⌋) ≤ (ln 2)n²,

which completes the proof.

²The formal definition of drift is in Equation (4.8).

4 Smooth Majority-like Update Function on Dense G_{n,p}

In this section, we consider smooth majority-like update functions, defined as follows:

Definition 4.1. We call an update function f a smooth majority-like update function if it satisfies x < f(x) for all x with 1/2 < x < 1, and the following technical conditions hold:

Lipschitz: There exists M1 such that f is M1-Lipschitz in [0, 1].

Condition at 1/2: There exist an open interval I_{1/2} containing 1/2 and constants M̌1 > 1, M̌2 > 0 such that f is M̌2-smooth in I_{1/2} and 1 < M̌1 ≤ f′(1/2).

Condition at 0 and 1: There exist intervals I0 ∋ 0, I1 ∋ 1 and a constant M̂1 < 1 such that ∀x ∈ I0, f(x) ≤ M̂1 x and ∀x ∈ I1, 1 − f(x) ≤ M̂1(1 − x).

Intuitively, a majority-like update function should be "smooth" and not tangent to the line y = x. Figure 1 shows an example of a smooth majority-like update function. Now we are ready to state our main theorem.

Theorem 4.1. Let M = ND(G, f, X0) be a node dynamic over G ∼ G_{n,p} with p = Ω(1), and let f be a smooth majority-like function. Then the expected consensus time of a node dynamic over G is

  ME(G, f) = O(n log n)

with high probability.
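This statement can be checked empirically. Below is a toy Monte Carlo sketch (our own illustration, not code from the paper): sample_gnp and consensus_time are hypothetical helpers, and f(x) = 3x² − 2x³ is one concrete smooth majority-like function (it is monotone, symmetric, absorbing, and f′(1/2) = 3/2 > 1):

```python
import random

def sample_gnp(n, p, rng):
    # Adjacency lists of a sample G ~ G_{n,p}.
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def consensus_time(adj, f, x, rng, max_rounds=10**7):
    # Run ND(G, f, x) in place until consensus; return the number of rounds.
    n = len(adj)
    reds = sum(x)
    t = 0
    while 0 < reds < n and t < max_rounds:
        v = rng.randrange(n)          # node chosen uniformly at random
        if adj[v]:                    # isolated nodes never change opinion
            r = sum(x[u] for u in adj[v]) / len(adj[v])   # red fraction r_x(v)
            new = 1 if rng.random() < f(r) else 0
            reds += new - x[v]
            x[v] = new
        t += 1
    return t
```

On dense samples the measured consensus time grows roughly like n log n as n increases, consistent with Theorem 4.1, though any single run is of course only anecdotal.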
Figure 1: An example of a smooth majority-like update function.

This theorem shows the fast convergence of this process. Note that there is some chance of sampling a disconnected graph G ∼ G_{n,p}, which results in a reducible Markov chain M that cannot reach consensus from some initial configurations; therefore, we can only ask for a fast convergence result with high probability. We note that the technical conditions exclude iterative majority updates, which we leave for future work.

4.1 Proof Overview. Here we first outline the structure of the proof. In section 4.2 we propose a paradigm for proving an upper bound on the hitting time when the state space has special structure. In section 4.3, we use the result of section 4.2 to prove Theorem 4.1.

A large literature has been devoted to such processes. Most works achieve fast convergence results by using a handcrafted potential function or by showing that the Markov chain has fast mixing time. However, it is not easy to find a clever potential function for an arbitrary process, and mixing time is not well defined for a reducible Markov chain. Recall that the expected consensus time is τ(x) ≜ E_M[T(G, f, x)], which is exactly the hitting time of the states 0^n and 1^n. However, in contrast to section 3, finding a clever potential function is much harder here. We prove Theorem 4.1 using the fact that the expected hitting time can be formulated as a system of linear equations (section 2.2) and by explicitly estimating an upper bound on this system of linear equations. Moreover, following the intuition in section 3, the Markov chain M can be nearly characterized by the one parameter Pos(x) when the node dynamics is on a graph that is close to the complete graph. We exploit this structure of our Markov chain and construct a potential function for the hitting-time equations.

4.2 A Framework for Upper Bounding the Hitting Time. We want to upper bound the hitting time τ(x) from an arbitrary state x to {0^n, 1^n} for a given time-homogeneous Markov chain M = (Ω, P) with finite state space Ω = {0, 1}^n, where P(x, y) > 0 only if the states x, y differ in one digit, |x − y| ≤ 1. We let Pos(x) be the position of state x ∈ Ω:

(4.5)  Pos(x) ≜ |x^{−1}(1)|, and pos(x) ≜ Pos(x)/n,

and define the bias of x as

(4.6)  Bias(x) ≜ |n/2 − Pos(x)|, and bias(x) ≜ Bias(x)/n.

Note that Bias(x) = n/2 if and only if x = 0^n or 1^n.

Suppose that M can be "almost" characterized by the one parameter Bias(x). Informally, we want the transitions at states x and y to be similar if Bias(x) = Bias(y). Therefore, in the spirit of first step analysis, we define {(p⁺_G(x), p⁻_G(x))}_{x∈Ω} where

(4.7)  p⁺_G(x) = P_M[Bias(X′) = Bias(X) + 1 | X = x],
       p⁻_G(x) = P_M[Bias(X′) = Bias(X) − 1 | X = x].

Moreover, we call p⁺_G(x) the exertion, and define the drift of state x as follows:

(4.8)  D(x) ≜ E_M[Bias(X′) − Bias(X) | X = x].

It is easy to see that D(x) = p⁺(x) − p⁻(x).

Since M can be almost characterized by the one parameter Bias(x), M is almost lumpable with respect to the partition induced by Bias(·). The following lemma gives us a scheme for constructing an upper bound on the hitting time:

Lemma 4.1. (Pseudo-lumpability lemma) Let M = (Ω, P) have finite state space Ω = {0, 1}^n with n even³ and P(x, y) > 0 only if the states x and y differ in at most one coordinate, and define

(4.9)  d_0 = max_{x : Bias(x)=0} 1/p⁺(x),
       d_l = max_{x : Bias(x)=l} 1/p⁺(x) + (max_{x : Bias(x)=l} p⁻(x)/p⁺(x)) · d_{l−1}

for 0 < l < n/2, where {(p⁺(x), p⁻(x))}_{x∈Ω} are as defined in (4.7). Then the maximum expected hitting time from any state x to {0^n, 1^n} can be bounded as follows:

  max_{x∈Ω} E_M[τ(x)] ≤ Σ_{0≤ℓ<n/2} d_ℓ.

³To avoid cumbersome notation for parity, we only consider n to be even here.

To apply Lemma 4.1 we need:
1. An upper bound for 1/p⁺(x).
2. An upper bound for p⁻(x)/p⁺(x).

In Theorem 4.2 we give a framework that uses upper bounds for 1/p⁺(x) and p⁻(x)/p⁺(x) to obtain upper bounds for the expected hitting time. In Theorem 4.2 we partition the states into Σ^s, Σ^m, Σ^l according to the bias as follows:

(4.10)  Σ^s = {x ∈ Ω : bias(x) < ε̌},
        Σ^m = {x ∈ Ω : ε̌ ≤ bias(x) < 1/2 − ε̂},
        Σ^l = {x ∈ Ω : 1/2 − ε̂ ≤ bias(x) ≤ 1/2}.

To gain some intuition about the statement of the theorem, observe that if the drift D(x) is bounded below by some positive constant, both 1/p⁺(x) and p⁻(x)/p⁺(x) have nice upper bounds. However, this bound fails when the drift is near zero or even negative. Taking our node dynamics on dense G_{n,p} as an example, when the states have either very small or very large bias(x), the drift D(x) can be very close to zero or even negative. The drift near 1/2 is close to 0 because the effects of red and blue largely cancel each other. The drift near the extreme points is small because there are very few nodes outside the majority.

The proof of Theorem 4.2 is rather straightforward: it uses Lemma 4.1 and carefully constructs the potential function from the recursive equation (4.9).

4.3 Proof of Theorem 4.1. In this section, we use Theorem 4.2 to prove an O(n log n) time bound by exploiting properties of our process. Specifically, for our node dynamic M = ND(G, f, X0) over G sampled from G_{n,p}, it is sufficient to prove an upper bound for 1/p⁺_G(x) and for p⁻_G(x)/p⁺_G(x) (Lemma 4.2 below supplies the bound for 1/p⁺_G(x)). Note that we use subscripts to emphasize the dependency on the graph G.

To apply Theorem 4.2, we partition the states into the three groups Σ^s, Σ^m, and Σ^l defined in (4.10). The constants ε̌ and ε̂ depend on the update function f and the edge probability p, and will be specified later. Figure 2 illustrates the partition of the states.
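The recursion (4.9) is simple to evaluate once the two sets of bounds are in hand. A small numeric sketch (assumed inputs, not from the paper): feeding in a uniform bound on max 1/p⁺ and on max p⁻/p⁺ per bias level reproduces the behavior behind the framework — a drift bounded away from zero gives a geometric sum and hence a total linear in the number of levels, while the driftless case p⁻/p⁺ = 1 degrades to a quadratic bound, as in section 3:

```python
def hitting_time_bound(inv_p_plus, ratio):
    """Evaluate Lemma 4.1's recursion (4.9).

    inv_p_plus[l] -- an upper bound on max 1/p+(x) over Bias(x) = l
    ratio[l]      -- an upper bound on max p-(x)/p+(x) over Bias(x) = l
    Returns the hitting-time bound sum_l d_l.
    """
    d = inv_p_plus[0]                       # d_0
    total = d
    for l in range(1, len(inv_p_plus)):
        d = inv_p_plus[l] + ratio[l] * d    # d_l from d_{l-1}
        total += d
    return total
```

For example, with 1/p⁺ ≤ 1 and p⁻/p⁺ ≤ 1/2 at every level, each d_l stays below 2 and the total is linear in the number of bias levels; with p⁻/p⁺ = 1 at every level (no drift, as in the voter model), d_l = l + 1 and the total is quadratic.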
As a result, we partition the states into subsets and take additional care on the sets of states with small drift. The following lemma lower bounds p⁺_G(x):

Lemma 4.2. (Lower bound for p⁺_G(x)) Given the node process M on G, if G is a λ-expander with nearly uniform degree E(δ_d), δ_d < 1, and λ < ((1 − δ_d)/(1 + δ_d))² · min{ε̂²/18, (1/2 − ε̂)²/2}, then for p⁺ ≜ ε̂f(ε̂)/2,

(4.16)  p⁺ < p⁺_G(x) ≤ 1 if x ∈ Σ^s ∪ Σ^m,
(4.17)  1/4 < p⁺_G(x)/(1/2 − bias(x)) ≤ 1 if x ∈ Σ^l.

This lemma is proved by applying the mixing lemma (Lemma 2.1) to show that the probability of increasing the bias is (1) larger than some constant for x ∈ Σ^s ∪ Σ^m (Lemma B.1) and (2) proportional to the size of the minority in Σ^l (Lemma B.2). The proof details are in Appendix B.

The second part follows from the following lemma:

Lemma 4.3. (Upper bound for p⁻_G(x)/p⁺_G(x)) Given the node process M on G, if G ∼ G_{n,p}, then there exist positive constants A1, A2, A3, B1, with 0 < A2, A3 < 1, such that, with high probability,

(4.18)  p⁻_G(x)/p⁺_G(x) ≤ 1 + A1(B1/√n − bias(x)) if x ∈ Σ^s,
(4.19)  p⁻_G(x)/p⁺_G(x) ≤ 1 − A2 if x ∈ Σ^m,
(4.20)  p⁻_G(x)/p⁺_G(x) ≤ 1 − A3 if x ∈ Σ^l.

Instead of bounding p⁻_G(x)/p⁺_G(x) directly, the drift D_G(x) ≜ p⁺_G(x) − p⁻_G(x) is more natural to work with. Taking the complete graph as an example, D_G(x) = f(pos(x)) − pos(x). Therefore, instead of proving an upper bound on p⁻_G(x)/p⁺_G(x) directly, we prove a lower bound for the drift in Appendix B (Lemmas B.3, B.4, B.5, and B.8). Combined with Lemma 4.2, these give us the desired upper bound on p⁻_G(x)/p⁺_G(x).

Proof. [Theorem 4.1] By Corollary 2.2, G ∼ G_{n,p} is an O(√(log n/(np)))-expander with high probability. Thus, we can apply Lemma 4.2 and Lemma 4.3 to Theorem 4.2, which finishes the proof.

5 The Stabilizing Consensus Problem

The consensus problem in the presence of an adversary (known as Byzantine agreement) is a fundamental primitive in the design of distributed algorithms. For the stabilizing-consensus problem—a variant of the consensus problem—Doerr et al. [17] prove that synchronized 3-majority converges fast to an almost stable consensus on a complete graph in the presence of O(√n)-dynamic adversaries which, at every round, can adaptively change the opinions of up to O(√n) nodes. Here we consider an asynchronous protocol for this problem:

Figure 2: An illustration of the partition in section 4.3.

Definition 5.1. Given a complete network of n anonymous nodes with update function f, and F ∈ N, in the beginning configuration each node holds a binary opinion specified by x0(·). In each round:

1. An adaptive dynamic adversary can arbitrarily corrupt up to F agents and change the reports of their opinions in this run (the true opinion of these nodes is restored and will be reported once the adversary stops corrupting them).

2. A randomly chosen node updates its opinion according to the node dynamics. (If the chosen node is corrupted by the adversary in that run, the adversary can arbitrarily update the opinion of the chosen node.)

Definition 5.2. (n^γ-almost consensus) We say a complete network of n anonymous nodes reaches an n^γ-almost consensus if all but O(n^γ) of the nodes support the same opinion.

Our analysis in section 4 naturally extends to the stabilizing consensus problem and proves that all majority-like update functions (Definition 4.1) are stabilizing almost-consensus protocols with the same convergence rate.

Theorem 5.1. Given n nodes, fixed γ > 1/2, F = O(√n), and initial configuration X0 ∈ {0, 1}^n, the node dynamic ND(K_n, f, X0) on the complete graph with update function f reaches an n^γ-almost consensus in the presence of any F-corrupt adversary within O(n log n) rounds with high probability.

Remark 5.1. The goal of this section is not to promote majority-like node dynamics as a state-of-the-art protocol for the stabilizing consensus problem, but to show the versatile power of our framework for proving convergence time in section 4.2.
Additionally, we modify the formulation of the problem here to make our presentation more cohesive.

Let the random process in the presence of some fixed F-dynamic adversary A_F defined in Theorem 5.1 be denoted X(A_F) = (X_t)_{t≥0}. Observe that our framework in section 4.2 only works for Markov chains, but in the presence of an adaptive adversary the process is no longer a Markov chain. As a result, we "couple" this process with a nicer Markov chain Y(F) = (Y_t)_{t≥0}, and use that Markov chain as a proxy to understand the original process.

The proof has two parts: we first define the proxy Markov chain Y(F) and prove an upper bound on its almost-consensus time by using the tools in section 4.2. Secondly, we construct a monotone coupling between Y(F) and X(A_F) to prove that X(A_F) also converges quickly to almost consensus.

5.1 Upper Bounding the Expected Almost Consensus Time for Y(F). With the notation defined in section 4, we define Y(F). Informally, we construct Y(F) as a pessimistic version of ND(K_n, f, X0) in the presence of an adversary: at every round the adversary tries to push the state toward the unbiased configuration, and it always corrupts F nodes with the minority opinion. Initially, Y_0 = X_0. At time t, setting y = Y_t, the next state Y_{t+1} is uniformly sampled from

(5.21)  {y′ ∈ Ω : ∃i ∈ [n], ∀j ≠ i, y′_j = y_j} ∩ {y′ ∈ Ω : Bias(y′) = Bias(y) + 1}

with probability max{f(1/2 + bias(y))·(1/2 − bias(y)) − (M1 + 1)F/n, 0}, or uniformly sampled from

(5.22)  {y′ ∈ Ω : ∃i ∈ [n], ∀j ≠ i, y′_j = y_j} ∩ {y′ ∈ Ω : Bias(y′) = Bias(y) − 1}

otherwise.

5.2 Monotone Coupling Between Y(F) and X(A_F). To transfer the upper bound for Y(F) to X(A_F), we need to build a "nice" coupling between them, characterized as follows:

Definition 5.3. (Monotone coupling) Let X, Y be two random variables on some partially ordered set (Σ, ≥). Then a monotone coupling between X and Y is a measure (X̃, Ỹ) on Σ × Σ such that
• the marginal X̃ has the same distribution as X;
• the marginal Ỹ has the same distribution as Y;
• P_(X̃,Ỹ)[X̃ ≥ Ỹ] = 1.

Note that the function bias(·) induces a natural total order ≤_bias on our state space Ω = {0, 1}^n such that for x, y ∈ Ω, x ≤_bias y if and only if bias(x) ≤ bias(y). We can also define a partial order over sequences of states: given two sequences (X_t)_{t≥0}, (Y_t)_{t≥0}, we write (X_t)_{t≥0} ≤_bias (Y_t)_{t≥0} if X_t ≤_bias Y_t for all t ≥ 0. We use calligraphic font to represent a whole random sequence, e.g. Z = (Z_t)_{t≥0}.

Lemma 5.2. There exists a monotone coupling (X̃, Ỹ) between X(A_F) and Y(F) under the partial order ≤_bias.

The proof of this lemma is straightforward, and we defer it to the full version.

5.3 Proof of Theorem 5.1.

Proof. [Theorem 5.1] We call an event A increasing if x ∈ A implies that any y ≥ x is also in A. Observe that A_γ := {y ∈ Ω : bias(y) > 1/2 − n^{−(1−γ)}} is increasing with respect to ≤_bias. Therefore, given a random sequence Z = (Z_t)_{t≥0},

  P_Z[T_γ(z) > τ] = P_Z[max_{t≤τ} bias(Z_t) ≤ 1/2 − n^{−(1−γ)}]
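As a closing illustration of Definition 5.1 and Theorem 5.1 (our own sketch, not the paper's protocol), the toy simulation below runs the dynamics on K_n against one particular F-corrupt strategy — always misreporting F opinions toward the minority side, in the spirit of the pessimistic chain Y(F) — whereas the theorem covers every F-corrupt adversary; almost_consensus_time is a hypothetical helper and f(x) = 3x² − 2x³ is one concrete smooth majority-like function:

```python
import random

def almost_consensus_time(n, f, F, gamma, rng, max_rounds=10**6):
    # ND(K_n, f, x0) where the adversary misreports F opinions toward
    # the minority side each round (one pessimistic choice of A_F).
    x = [rng.randrange(2) for _ in range(n)]
    reds = sum(x)
    for t in range(max_rounds):
        if min(reds, n - reds) <= n**gamma:   # n^gamma-almost consensus
            return t
        v = rng.randrange(n)
        # Red count among the other n-1 nodes, as reported after the
        # adversary shifts F reported opinions toward the minority side.
        others = reds - x[v]
        if reds >= n - reds:
            others = max(others - F, 0)        # red majority: hide F reds
        else:
            others = min(others + F, n - 1)    # blue majority: fake F reds
        new = 1 if rng.random() < f(others / (n - 1)) else 0
        reds += new - x[v]
        x[v] = new
    return max_rounds
```

With F on the order of √n, runs typically escape the near-unbiased region and settle to a small minority within a number of rounds consistent with the O(n log n) bound of Theorem 5.1, despite the adversary's constant pull toward balance.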