(2∆ − 1)-Edge-Coloring is Much Easier than Maximal Matching in the Distributed Setting

Michael Elkin∗, Ben-Gurion University of the Negev

Seth Pettie†, University of Michigan

Hsin-Hao Su†, University of Michigan

∗This research has been supported by the Israeli Academy of Science, grant 593/11, and by the Binational Science Foundation, grant 2008390. In addition, this research has been supported by the Lynn and William Frankel Center for Computer Science.

†Supported by NSF grants CCF-1217338 and CNS-1318294. This research was performed partly at the Center for Massive Data Algorithmics (MADALGO) at Aarhus University, which is supported by Danish National Research Foundation grant DNRF84.

Copyright © 2015. by the Society for Industrial and Applied Mathematics.
Downloaded 07/22/15 to 68.40.198.68. Redistribution subject SIAM license or copyright; see http://www.siam.org/journals/ojsa.php

Abstract

Graph coloring is a central problem in distributed computing. Both vertex- and edge-coloring problems have been extensively studied in this context. In this paper we show that a $(2\Delta-1)$-edge-coloring can be computed in time smaller than $\log^{\epsilon} n$ for any $\epsilon > 0$, specifically, in $e^{O(\sqrt{\log\log n})}$ rounds. This establishes a separation between the $(2\Delta-1)$-edge-coloring and Maximal Matching problems, as the latter is known to require $\Omega(\sqrt{\log n})$ time [15]. No such separation is currently known between the $(\Delta+1)$-vertex-coloring and Maximal Independent Set problems.

We devise a $(1+\epsilon)\Delta$-edge-coloring algorithm for an arbitrarily small constant $\epsilon > 0$. This result applies whenever $\Delta \ge \Delta_{\epsilon}$, for some constant $\Delta_{\epsilon}$ which depends on $\epsilon$. The running time of this algorithm is $O(\log^* \Delta + \frac{\log n}{\Delta^{1-o(1)}})$. A much earlier logarithmic-time algorithm by Dubhashi, Grable and Panconesi [11] assumed $\Delta \ge (\log n)^{1+\Omega(1)}$. For $\Delta = (\log n)^{1+\Omega(1)}$ the running time of our algorithm is only $O(\log^* n)$. This constitutes a drastic improvement of the previous logarithmic bound [11, 9].

Our results for $(2\Delta-1)$-edge-coloring also follow from our more general results concerning $(1-\epsilon)$-locally sparse graphs. Specifically, we devise a $(\Delta+1)$-vertex coloring algorithm for $(1-\epsilon)$-locally sparse graphs that runs in $O(\log^* \Delta + \log(1/\epsilon))$ rounds for any $\epsilon > 0$, provided that $\epsilon\Delta = (\log n)^{1+\Omega(1)}$. We conclude that the $(\Delta+1)$-vertex coloring problem for $(1-\epsilon)$-locally sparse graphs can be solved in $O(\log(1/\epsilon)) + e^{O(\sqrt{\log\log n})}$ time. This implies our result about $(2\Delta-1)$-edge-coloring, because $(2\Delta-1)$-edge-coloring reduces to $(\Delta+1)$-vertex-coloring of the line graph of the original graph, and because line graphs are $(1/2+o(1))$-locally sparse.

1 Introduction

1.1 Edge-Coloring. Consider an unweighted undirected $n$-vertex graph $G = (V,E)$ with maximum degree $\Delta$ whose vertices host processors. The vertices communicate with one another over the edges of $G$ in synchronous rounds, where local computation is unbounded. We aim at devising algorithms for this setting that run for as few rounds as possible. The running time of an algorithm in this context is the number of rounds.

In this paper we focus on the $(2\Delta-1)$- and $(1+\epsilon)\Delta$-edge-coloring problems, as well as on the $(\Delta+1)$-vertex-coloring problem, in this setting. In an $\alpha$-edge-coloring (respectively, $\alpha$-vertex-coloring) problem, the objective is to color all edges (resp., vertices) of $G$ with $\alpha$ colors so that no two incident edges (resp., adjacent vertices) are colored by the same color. Coloring problems are among the most fundamental and well-studied problems in the area of distributed algorithms. See, e.g., [5] and the references therein.

The study of these problems can be traced back to the seminal works of Luby [17] and Alon, Babai and Itai [1], who devised $O(\log n)$-time algorithms for the Maximal Independent Set (MIS) problem.∗ Then, Luby [17] showed a reduction from the $(\Delta+1)$-coloring problem to the MIS problem, so that the $(\Delta+1)$-coloring problem can be solved in $O(\log n)$ rounds. Since the $(2\Delta-1)$-edge-coloring problem on a graph $G$ reduces to the $(\Delta+1)$-vertex-coloring problem on the line graph of $G$, the results of [17, 1] give rise to $O(\log n)$-time algorithms for the $(2\Delta-1)$-edge-coloring problem as well.

Remarkably, even though these problems have been intensively investigated for the last three decades (see Section 1.3 for a short overview of some of the most related results), the logarithmic bound [17, 1] remains the state-of-the-art to this date. Indeed, the currently best-known algorithm for these problems (due to Barenboim et al. [7]) requires $O(\log \Delta) + \exp(O(\sqrt{\log\log n}))$ time. However, for $\Delta = n^{\Omega(1)}$ this bound is no better than the logarithmic bound of [17, 1].

On the lower bound front, Linial [16] showed that these problems require $\Omega(\log^* n)$ time. Kuhn, Moscibroda, and Wattenhofer [15] showed that the Maximal Matching (henceforth, MM)† and the MIS problems require $\Omega(\sqrt{\log n})$ time. Observe that by eliminating one color class at a time one can obtain, in $O(\Delta)$ time, an MM from a $(2\Delta-1)$-edge-coloring, or an MIS from a $(\Delta+1)$-vertex-coloring. Nevertheless the lower bounds of [15] are not known to apply to the coloring problems. On the other hand, no results are known that separate the complexities of MM and MIS from their edge-coloring and vertex-coloring counterparts.

In this paper we devise the first sublogarithmic time algorithm for the $(2\Delta-1)$-edge-coloring problem. Specifically, our algorithm requires $\exp(O(\sqrt{\log\log n}))$ time, i.e., less than $\log^{\epsilon} n$ time for any $\epsilon > 0$. (In particular, it is far below the $\Omega(\sqrt{\log n})$ barrier of [15].) Therefore, our result establishes a clear separation between the complexities of the $(2\Delta-1)$-edge-coloring and MM problems.

We also devise a drastically improved algorithm for $(1+\epsilon)\Delta$-edge-coloring. Using the Rödl nibble method, Dubhashi, Grable, and Panconesi [11] devised a $(1+\epsilon)\Delta$-edge-coloring algorithm for graphs with $\Delta = (\log n)^{1+\Omega(1)}$ which requires $O(\log n)$ time. In PODC 2014 Chung, Pettie and Su [9] extended the result of [11] to graphs with $\Delta \ge \Delta_{\epsilon}$, for $\Delta_{\epsilon}$ being some constant which depends on $\epsilon$. In this paper we devise a $(1+\epsilon)\Delta$-edge-coloring algorithm for graphs with $\Delta \ge \Delta_{\epsilon}$ ($\Delta_{\epsilon}$ is as above) with running time $O(\log^* \Delta \cdot \max\{1, \frac{\log n}{\Delta^{1-o(1)}}\})$. In particular, for $\Delta = (\log n)^{1+\Omega(1)}$ the running time of our algorithm is only $O(\log^* n)$, as opposed to the previous state-of-the-art of $O(\log n)$ [9, 11].

1.2 Vertex Coloring. Our results for the $(2\Delta-1)$-edge-coloring problem follow, in fact, from more general results concerning $(\Delta+1)$-vertex-coloring of $(1-\epsilon)$-locally sparse graphs. A graph $G = (V,E)$ is said to be $(1-\epsilon)$-locally sparse if for every vertex $v \in V$, its neighborhood $\Gamma(v) = \{u \mid (v,u) \in E\}$ induces at most $(1-\epsilon)\binom{\Delta}{2}$ edges. We devise a $(\Delta+1)$-vertex-coloring algorithm for $(1-\epsilon)$-locally sparse graphs that runs in $O(\log^* \Delta + \log 1/\epsilon)$ rounds for any $\epsilon > 0$, provided that $\epsilon\Delta = (\log n)^{1+\Omega(1)}$. Without this restriction on the range of $\Delta$ our algorithm has running time $O(\log 1/\epsilon) + \exp(O(\sqrt{\log\log n}))$.

It is easy to see that in a line graph of degree $\Delta = 2(\Delta'-1)$ ($\Delta'$ is the degree of its underlying graph) every neighborhood induces at most $(\Delta'-1)^2 = (\Delta/2)^2 = \left(\frac{1}{2} + \frac{1}{2(\Delta-1)}\right)\binom{\Delta}{2}$ edges. Hence, our $(\Delta+1)$-vertex-coloring algorithm requires only $\exp(O(\sqrt{\log\log n}))$ time for $\Delta' \ge 2$. (For $\Delta' = O(1)$ a graph can be $(2\Delta'-1)$-edge-colored in $O(\Delta' + \log^* n) = O(\log^* n)$ time, using a classical $(2\Delta'-1)$-edge-coloring algorithm of Panconesi and Rizzi [20].)

Our result that $(1-\epsilon)$-locally sparse graphs can be $(\Delta+1)$-vertex-colored in $O(\log 1/\epsilon) + \exp(O(\sqrt{\log\log n}))$ time shows that the only "hurdle" that stands on our way towards a sublogarithmic-time $(\Delta+1)$-vertex-coloring algorithm is the case of dense graphs. In particular, these graphs must have arboricity‡ $\lambda(G) > (1-\epsilon)\Delta/2$, for any constant $\epsilon > 0$. (Note that $\lambda(G) \le \Delta/2$.) Remarkably, graphs with arboricity close to the maximum degree are already known to be the only hurdle that stands on the way towards devising a deterministic polylogarithmic-time $(\Delta+1)$-vertex-coloring algorithm. Specifically, Barenboim and Elkin [4] devised a deterministic polylogarithmic-time algorithm that $(\Delta+1)$-vertex-colors all graphs with $\lambda(G) \le \Delta^{1-\epsilon}$, for some constant $\epsilon > 0$.

1.3 Related Work. All our algorithms in this paper are randomized. This is also the case for most of the previous works that we mentioned above. (A notable exception though is the deterministic algorithm of [20].) The study of distributed randomized edge-coloring was initiated by Panconesi and Srinivasan [21]. The result of [21] was later improved in the aforementioned paper of [11].

Significant research attention was also devoted to deterministic edge-coloring algorithms, but those typically use much more than $2\Delta-1$ colors. (An exception is the aforementioned algorithm of Panconesi and Rizzi [20].) Specifically, Czygrinow et al. [10] devised a deterministic $O(\Delta \cdot \log n)$-edge-coloring algorithm with running time $O(\log^4 n)$. More recently Barenboim and Elkin [5] devised a deterministic $O(\Delta^{1+\epsilon})$-edge-coloring algorithm with running time $O(\log \Delta + \log^* n)$, and an $O(\Delta)$-edge-coloring algorithm with time $O(\Delta^{\epsilon} + \log^* n)$, for an arbitrarily small $\epsilon > 0$.

∗A subset $U \subseteq V$ of vertices is called an MIS if there is no edge in $G$ connecting two vertices of $U$, and for any vertex $v \in V \setminus U$ there exists a neighbor $u \in U$.

†A subset $M \subseteq E$ of edges is called an MM if no two edges in $M$ are incident to one another and for every edge $e' \in E \setminus M$ there exists an incident edge $e \in M$.

‡The arboricity $\lambda(G)$ of a graph $G$ is the minimum number of edge-disjoint forests required to cover the edge set of $G$.
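The claim in Section 1.2 that every neighborhood in a line graph induces at most $(\Delta'-1)^2$ edges can be checked on small instances. The following is a minimal sketch, not part of the paper: the helper names `line_graph`, `max_neighborhood_edges` and `is_locally_sparse` are hypothetical, the graph is given as a list of vertex pairs, and the computation is centralized rather than distributed.

```python
from itertools import combinations


def line_graph(edges):
    """Adjacency sets of the line graph: one node per edge of G,
    two nodes adjacent when the underlying edges share an endpoint."""
    edges = [tuple(sorted(e)) for e in edges]
    adj = {e: set() for e in edges}
    for e, f in combinations(edges, 2):
        if set(e) & set(f):
            adj[e].add(f)
            adj[f].add(e)
    return adj


def max_neighborhood_edges(adj):
    """Largest number of edges induced by the neighborhood of any node."""
    best = 0
    for nbrs in adj.values():
        induced = sum(1 for x, y in combinations(sorted(nbrs), 2) if y in adj[x])
        best = max(best, induced)
    return best


def is_locally_sparse(adj, eps):
    """Check the (1 - eps)-local sparsity condition: every neighborhood
    induces at most (1 - eps) * C(Delta, 2) edges."""
    delta = max(len(nbrs) for nbrs in adj.values())
    return max_neighborhood_edges(adj) <= (1 - eps) * delta * (delta - 1) / 2


# Example (an assumption chosen for illustration): the complete graph K_4,
# whose underlying degree is 3 and whose line graph has degree 4.
k4_edges = list(combinations(range(4), 2))
k4_line = line_graph(k4_edges)
```

For $K_4$ the underlying degree is $\Delta' = 3$, so the bound $(\Delta'-1)^2 = 4$ is met with equality, while $\binom{\Delta}{2} = 6$ for $\Delta = 4$.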

The notion of $(1-\epsilon)$-locally sparse graphs was introduced by Alon, Krivelevich and Sudakov [2] and was studied also by Vu [25]. Distributed vertex-coloring of sparse graphs was studied in numerous papers. See, e.g., [7, 3, 24, 6, 23, 8], and the references therein.

1.4 Technical Overview. We begin by discussing the $(1+\epsilon)\Delta$-edge coloring problem. Our algorithm consists of multiple rounds that color the edges of the graph gradually. Let $P(u)$ denote the palette of $u$, which consists of colors not assigned to the edges incident to $u$. Therefore, an edge $uv$ can choose a color from $P(uv) \stackrel{\mathrm{def}}{=} P(u) \cap P(v)$. Our goal is to show that $P(uv)$ will always be non-empty as the algorithm proceeds, and we hope to color the graph as fast as possible. If $P(u)$ and $P(v)$ behave like independent random subsets out of the $(1+\epsilon)\Delta$ colors, then the expected size of $P(uv)$ is at least $(\epsilon/(1+\epsilon))^2 \cdot (1+\epsilon)\Delta$, since the size of $P(u)$ and $P(v)$ is at least an $\epsilon/(1+\epsilon)$ fraction of the original palette. This means if the size of $P(uv)$ concentrates around its expectation, then it will be non-empty.

We use the following process to color the graph while keeping the palettes behaving randomly. In each round, every edge selects a set of colors in its palette. If an edge selected a color that is not selected by adjacent edges, then it will become colored with one such color. The colored edges will be removed from the graph.

In contrast with the framework of [11, 13], where each edge selects at most one color in each round, selecting multiple colors allows us to break symmetry faster. The idea of selecting multiple colors independently has been used in [14, 23, 25] to reduce the dependency introduced in the analysis for triangle-free graphs and locally-sparse graphs. Our analysis is based on the semi-random method or the so-called Rödl Nibble method, where we show by induction that after each round a certain property $H_i$ holds w.h.p., assuming $H_{i-1}$ holds. In particular, $H_i$ is the property that the palette size of each edge is lower bounded by $p_i$, and the $c$-degree of a vertex, that is, the number of uncolored adjacent edges having the color $c$ in its palette, is upper bounded by $t_i$. Intuitively, the symmetry is easier to break when the size of the palette is larger and when the $c$-degree is smaller. Therefore, we hope that the probability an edge becomes colored increases with $p_i/t_i$. By selecting multiple colors for each edge in each round, we will capture this intuition and be able to color the graph faster than by selecting just one single color.

The main technical challenge is to prove the concentration bounds. To this end, we use existing techniques and develop new techniques to minimize the dependencies introduced. First, we use the wasteful coloring procedure [18]: instead of removing colors from the palette that are colored by the neighbors, we remove the colors that are selected by the neighbors in each round. In this way, we can zoom in the analysis into the 2-neighborhood of a vertex instead of the 3-neighborhood. Also, we use the expose-by-ID-ordering technique introduced in [22]. In the edge coloring problem, assume that each edge has a unique ID. In each round, we let an edge become colored if it selected a color that is not selected by its neighbors with smaller ID. Therefore, the choices of the neighbors with larger ID will not affect the outcome of the edge. That makes bounding the difference or the variance of the martingales much simpler when we expose the choices of the edges according to the order of their ID. Finally, we derive a modification of the Chernoff Bound (Lemma A.2) that is capable of handling the sum of non-independent random variables conditioned on some likely events. In particular, although the expectation of the $i$-th random variable may be heavily affected by the configuration of the first $i-1$ random variables, our inequality applies if we can bound the expectation when conditioning on some very likely events that depend on the first $i-1$ random variables. When combined with the expose-by-ID-ordering technique, it becomes a useful tool for the analysis of concentration. (See the proofs of Lemma 2.4 and Lemma 4.2.)

For the $(\Delta+1)$-vertex coloring problem in $(1-\epsilon)$-locally sparse graphs, we give a twofold approach. We will first analyze just one round of the standard trial algorithm, where each vertex randomly selects exactly one color from its palette. We show that because the neighborhood is sparse, at least $\Omega(\epsilon\Delta)$ pairs of neighbors will be assigned the same color, and so the palette size will concentrate at a value $\Omega(\epsilon\Delta)$ larger than its degree. Then by using the idea of selecting multiple colors, we develop an algorithm that colors the graph rapidly. In this algorithm, instead of selecting the colors with a uniform probability as in the edge coloring algorithm, vertices may select different probabilities that are inversely proportional to their palette sizes. Note that Schneider and Wattenhofer [24] showed that the $(1+\epsilon)\Delta$-vertex coloring problem can be solved in $O(\log(1/\epsilon) + \log^* n)$ rounds if $\Delta \gg \log n$. However, it is not clear whether their proof extends directly to the case where palettes can be non-uniform as in our case.
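The round structure described in this overview (select several colors, compare only against smaller-ID neighbors, then wastefully delete every color a neighbor selected) can be simulated centrally. The following is a minimal sketch under stated assumptions, not the paper's Algorithm 2.1: the function name `simulate_overview_process`, the fixed sampling probability `pi`, the use of list positions as edge IDs, and the rule of taking the smallest free color are all choices made here for illustration.

```python
import random


def simulate_overview_process(edges, num_colors, pi=0.25, max_rounds=500, seed=0):
    """Centralized simulation of the multi-color trial process.

    edges: list of vertex pairs; an edge's ID is its position in the list
    (an assumption of this sketch). Returns a dict edge -> color for the
    edges that were colored; some edges may stay uncolored on small inputs.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    edge_id = {e: k for k, e in enumerate(edges)}
    nbrs = {e: [f for f in edges if f != e and set(e) & set(f)] for e in edges}
    palette = {e: set(range(num_colors)) for e in edges}
    color = {}
    for _ in range(max_rounds):
        live = [e for e in edges if e not in color]
        if not live:
            break
        # Every uncolored edge selects each palette color independently.
        selected = {e: {c for c in sorted(palette[e]) if rng.random() < pi}
                    for e in live}
        newly = {}
        for e in live:
            # Expose-by-ID ordering: only smaller-ID neighbors can block e.
            blocked = set()
            for f in nbrs[e]:
                if f in selected and edge_id[f] < edge_id[e]:
                    blocked |= selected[f]
            free = selected[e] - blocked
            if free:
                newly[e] = min(free)
        # Wasteful removal: drop every color selected by any neighbor,
        # whether or not that neighbor ended up colored.
        for e in live:
            for f in nbrs[e]:
                if f in selected:
                    palette[e] -= selected[f]
        color.update(newly)
    return color
```

Because a colored edge always selected its color, the wasteful removal step deletes that color from every adjacent palette, so the partial coloring returned by this sketch is proper even when some edges run out of colors.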

2 Distributed Edge Coloring

Given a graph $G = (V,E)$, we assume each edge $e$ has a unique identifier, $\mathrm{ID}(e)$. For each edge, we maintain a palette of available colors. Our algorithm proceeds by rounds. In each round, we color some portion of the graph and then delete the colored edges. Let $G_i$ be the graph after round $i$ and $P_i(e)$ be the palette of $e$ after round $i$. Initially, $P_0(e)$ consists of all the colors $\{1, 2, \ldots, (1+\epsilon)\Delta\}$. We define the sets $N_i(\cdot): V \cup E \to 2^E$, $N_{i,c}(\cdot): V \cup E \to 2^E$, and $N^*_{i,c}(e): E \to 2^E$ as follows. $N_i(\cdot)$ is the set of neighboring edges of a vertex or an edge in $G_i$. $N_{i,c}(\cdot)$ is the set of neighboring edges of a vertex or an edge in $G_i$ having $c$ in its palette. $N^*_{i,c}(e)$ is the set of neighboring edges having smaller ID than $e$ and having $c$ in its palette in $G_i$.

For clarity we use the following shorthands: $\deg_i(\cdot) = |N_i(\cdot)|$, $\deg_{i,c}(\cdot) = |N_{i,c}(\cdot)|$, and $\deg^*_{i,c}(e) = |N^*_{i,c}(e)|$, where $\deg_{i,c}(\cdot)$ is often referred to as the $c$-degree. Also, if $F(\cdot)$ is a set function and $S$ is a set, we define $F(S) = \bigcup_{s \in S} F(s)$.

Theorem 2.1. Let $\epsilon, \gamma > 0$ be constants. There exists a constant $\Delta_{\epsilon,\gamma} \ge 0$ and a distributed algorithm such that for all graphs with $\Delta \ge \Delta_{\epsilon,\gamma}$, the algorithm colors all the edges with $(1+\epsilon)\Delta$ colors and runs in $O(\log^* \Delta \cdot \max(1, \log n/\Delta^{1-\gamma}))$ rounds.

Corollary 2.1. For any $\Delta$, the $(2\Delta-1)$-edge-coloring problem can be solved in $\exp(O(\sqrt{\log\log n}))$ rounds.

Proof. Let $\epsilon = 1$ and $\gamma = 1/2$. By Theorem 2.1, there exists a constant $\Delta_{1,1/2}$ such that for $\Delta \ge \max((\log n)^2, \Delta_{1,1/2})$, the problem can be solved in $O(\log^* \Delta)$ rounds. Otherwise $\Delta = O(\log^2 n)$ and we can apply the $(\Delta+1)$-vertex coloring algorithm in [7] to the line graph of $G$, which takes $O(\log \Delta + \exp(O(\sqrt{\log\log n}))) = \exp(O(\sqrt{\log\log n}))$ rounds.

We describe the algorithm of Theorem 2.1 in Algorithm 2.1 for $\Delta > (\log n)^{1/(1-\gamma)}$. At the end of the section, we show how to generalize it to smaller $\Delta$ by using a distributed algorithm for the constructive Lovász Local Lemma [9]. The algorithm proceeds in rounds. We will define $\{\pi_i\}$ and $\{\beta_i\}$ later. For now, let us think of $\pi_i$ as inversely proportional to the $c$-degrees and of $\beta_i$ as a constant.

Algorithm 2.1. Edge-Coloring-Algorithm $(G, \{\pi_i\}, \{\beta_i\})$
 1: $G_0 \leftarrow G$
 2: $i \leftarrow 0$
 3: repeat
 4:   $i \leftarrow i + 1$
 5:   for each $e \in G_{i-1}$ do
 6:     $(S_i(e), K_i(e)) \leftarrow$ Select$(e, \pi_i, \beta_i)$
 7:     Set $P_i(e) \leftarrow K_i(e) \setminus S_i(N^*_{i-1}(e))$
 8:     if $S_i(e) \cap P_i(e) \ne \emptyset$ then color $e$ with any color in $S_i(e) \cap P_i(e)$ end if
 9:   end for
10:   $G_i \leftarrow G_{i-1} \setminus \{\text{colored edges}\}$
11: until

Algorithm 2.2. Select$(e, \pi_i, \beta_i)$
 1: Include each $c \in P_{i-1}(e)$ in $S_i(e)$ independently with probability $\pi_i$.
 2: For each $c$, calculate $r_c = \beta_i^2 / (1-\pi_i)^{\deg^*_{i-1,c}(e)}$.
 3: Include $c \in P_{i-1}(e)$ in $K_i(e)$ independently with probability $r_c$.
 4: return $(S_i(e), K_i(e))$.

In each round $i$, each edge $e$ selects two sets of colors $S_i(e)$ and $K_i(e)$ by using Algorithm 2.2. $S_i(e)$ is selected by including each color in $P_{i-1}(e)$ with probability $\pi_i$ independently. The colors selected by the neighbors with smaller ID than $e$, $S_i(N^*_{i-1}(e))$, will be removed from $e$'s palette. To make the analysis simpler, we would like to ensure that each color is removed from the palette with an identical probability. Thus, $K_i(e)$ is used for this purpose. A color $c$ remains in $P_i(e)$ only if it is in $K_i(e)$ and no neighboring edge with smaller ID selected $c$. The probability that this happens is exactly $(1-\pi_i)^{\deg^*_{i-1,c}(e)} \cdot r_c = \beta_i^2$. Note that $r_c$ is always at most 1 if $\deg^*_{i-1,c}(u) \le t'_{i-1}$ (defined below), which we later show holds by induction. An edge will become colored if it has selected a color remaining in $P_i(e)$. Obviously, no two adjacent edges will be colored the same in the process.

We will assume $\Delta$ is sufficiently large whenever we need certain inequalities to hold. The asymptotic notations are functions of $\Delta$. Let $p_0 = (1+\epsilon)\Delta$ and $t_0 = \Delta$ be the initial lower bound on the palette size and the initial upper bound on the $c$-degree of a vertex. Let

\[
\begin{aligned}
\pi_i &= 1/(K t'_{i-1}) & \delta &= 1/\log \Delta \\
\alpha_i &= (1-\pi_i)^{p'_i} & \beta_i &= (1-\pi_i)^{t'_{i-1}-1} \\
p_i &= \beta_i^2 \, p_{i-1} & t_i &= \max(\alpha_i \beta_i t_{i-1},\, T) \\
p'_i &= (1-\delta)^i \, p_i & t'_i &= (1+\delta)^{2i} \, t_i \\
K &= 4 + 4/\epsilon & T &= \Delta^{1-0.9\gamma}/2
\end{aligned}
\]

$p_i$ and $t_i$ are the ideal (that is, expected) lower and upper bounds of the palette size and the vertex $c$-degrees after round $i$. $p'_i$ and $t'_i$ are the relaxed versions of $p_i$ and $t_i$ with error $(1-\delta)^i$ and $(1+\delta)^{2i}$, where $\delta$ is chosen to be small enough such that $(1-\delta)^i = 1 - o(1)$ and $(1+\delta)^{2i} = 1 + o(1)$ for all $i$ we consider, i.e. for $i = O(\log^* \Delta)$.

$\pi_i$ is the sampling probability in our algorithm. We will show that $\alpha_i$ is an upper bound on the probability an edge remains uncolored in round $i$ and $\beta_i^2$ is the probability a color remains in the palette of an edge, depending on $\epsilon$. Since

\[
\begin{aligned}
\beta_i &= \left(1 - \frac{1}{K t'_{i-1}}\right)^{t'_{i-1}-1}
 = \left(1 - \frac{1}{(K t'_{i-1} - 1) + 1}\right)^{(K t'_{i-1}-1) \cdot \frac{t'_{i-1}-1}{K t'_{i-1}-1}} \\
&\ge \left(1 - \frac{1}{(K t'_{i-1} - 1) + 1}\right)^{(K t'_{i-1}-1) \cdot \frac{1}{K}} \\
&\ge e^{-1/K} \qquad \text{(since } (1 - \tfrac{1}{x+1})^x \ge e^{-1}\text{)}
\end{aligned}
\]

$\beta_i$ is bounded below by $e^{-1/K}$, which is a constant. While $p_i$ shrinks by $\beta_i^2$, we will show $t_i$ shrinks by roughly $\alpha_i\beta_i$. Note that $p_0/t_0 \ge (1+\epsilon)$ initially. The constant $K$ is chosen so that $e^{-2/K}(1+\epsilon) - 1 = \Omega(\epsilon)$ and so $\alpha_i$ is smaller than $\beta_i$ initially, since we would like to have $t_i$ shrink faster than $p_i$. Then, $\alpha_i$ becomes smaller as the ratio between $t_i$ and $p_i$ becomes smaller. Finally, we cap $t_i$ by $T$, since our analysis in the first phase does not have strong enough concentration when $t_i$ decreases below this threshold. Thus, we will switch to the second phase, where we trade the amount $t_i$ decreases (which is supposed to be decreased to its expectation as in the first phase) for a smaller error probability.

We will show that the first phase ends in $O(\log^* \Delta)$ rounds and the second phase ends in a constant number of rounds. We will discuss the number of rounds in the second phase later in this section.

Lemma 2.1. $t_r = T$ after at most $r = O(\log^* \Delta)$ rounds.

Proof. We divide the process into two stages. The first is when $t_{i-1}/p_{i-1} \ge 1/(1.1 e^{3/K} K)$. In this stage,

\[
\begin{aligned}
\frac{t_i}{p_i} &= \frac{\alpha_i}{\beta_i} \cdot \frac{t_{i-1}}{p_{i-1}} \\
&= (1-\pi_i)^{p'_i - t'_{i-1} + 1} \cdot \frac{t_{i-1}}{p_{i-1}} && \text{(defn. } \alpha_i, \beta_i\text{)} \\
&\le \exp\left(-\pi_i (p'_i - t'_{i-1} + 1)\right) \cdot \frac{t_{i-1}}{p_{i-1}} && (1 - x \le e^{-x}) \\
&\le \exp\left(-(1-o(1)) \cdot \frac{1}{K}\left(\frac{p_i}{t_{i-1}} - 1\right)\right) \cdot \frac{t_{i-1}}{p_{i-1}} && \text{(defn. } \pi_i,\ \tfrac{p'_i}{t'_{i-1}} = (1-o(1))\tfrac{p_i}{t_{i-1}}\text{)} \\
&\le \exp\left(-(1-o(1)) \cdot \frac{1}{K}\left(\frac{\beta_i^2 p_{i-1}}{t_{i-1}} - 1\right)\right) \cdot \frac{t_{i-1}}{p_{i-1}} && \text{(defn. } p_i\text{)} \\
&\le \exp\left(-(1-o(1)) \cdot \frac{1}{K}\left(e^{-2/K}(1+\epsilon) - 1\right)\right) \cdot \frac{t_{i-1}}{p_{i-1}} && (p_{i-1}/t_{i-1} \ge (1+\epsilon)) \\
&\le \exp\left(-(1-o(1)) \cdot \frac{(1-2/K)(1+\epsilon) - 1}{K}\right) \cdot \frac{t_{i-1}}{p_{i-1}} && (e^{-x} \ge 1 - x) \\
&= \exp\left(-(1-o(1)) \cdot \frac{\epsilon^2}{8(1+\epsilon)}\right) \cdot \frac{t_{i-1}}{p_{i-1}} && (K = 4(1+\epsilon)/\epsilon)
\end{aligned}
\]

Therefore, after at most $(1+o(1)) \frac{8(1+\epsilon)}{\epsilon^2} \ln\left(1.1 K e^{3/K}\right)$ rounds, this stage will end. Let $j$ be the first round when the second stage starts. For $i > j$, we have

\[
\begin{aligned}
\alpha_i &= (1-\pi_i)^{p'_i} \\
&\le \exp\left(-(1-o(1)) \frac{1}{K} \cdot \frac{p_i}{t_{i-1}}\right) && (1 - x \le e^{-x}) \\
&\le \exp\left(-(1-o(1)) \frac{1}{K} \cdot \frac{\beta_i^2 p_{i-1}}{t_{i-1}}\right) && \text{(defn. } p_i\text{)} \\
&\le \exp\left(-(1-o(1)) \frac{1}{K} \cdot \frac{\beta_{i-1}\beta_i^2}{\alpha_{i-1}} \cdot \frac{p_{i-2}}{t_{i-2}}\right) && \left(\tfrac{p_{i-1}}{t_{i-1}} = \tfrac{\beta_{i-1}}{\alpha_{i-1}} \cdot \tfrac{p_{i-2}}{t_{i-2}}\right) \\
&\le \exp\left(-(1-o(1)) \frac{1}{K} \cdot \frac{e^{-3/K}}{\alpha_{i-1}} \cdot \frac{p_{i-2}}{t_{i-2}}\right) && (\beta_i \ge e^{-1/K}) \\
&\le \exp\left(-1/\alpha_{i-1}\right) && \left(\tfrac{t_{i-2}}{p_{i-2}} < \tfrac{1}{1.1 K e^{3/K}}\right)
\end{aligned}
\]

Therefore, $\frac{1}{\alpha_{j+\log^* \Delta+1}} \ge \underbrace{e^{e^{\cdot^{\cdot^{e}}}}}_{\log^* \Delta} \ge \Delta$, and so $t_{j+\log^* \Delta+1} \le \max(\alpha_{j+\log^* \Delta+1} \cdot \Delta, T) = T$.

Then, we show the bound on the palette size remains large throughout the algorithm.

Lemma 2.2. $p'_i = \Delta^{1-o(1)}$ for $i = O(\log^* \Delta)$.

Proof. $p'_i = (1-\delta)^i p_i \ge (1-\delta)^i \prod_{j=1}^{i} \beta_j^2 \, \Delta \ge (1-\delta)^i e^{-2i/K} \Delta = (1-o(1)) \Delta^{-\frac{2i}{K \log \Delta}} \cdot \Delta = \Delta^{1-o(1)}$.

Let $H_i(e)$ denote the event that $|P_i(e)| \ge p'_i$ and $H_{i,c}(u)$ denote the event $\deg_{i,c}(u) \le t'_i$. Let $H_i$ be the event such that for all $u, e \in G$ and all $c \in P_i(u)$, $H_{i,c}(u)$ and $H_i(e)$ hold. Supposing that $H_{i-1}$ is true, we will estimate the probability that $H_i(e)$ and $H_{i,c}(u)$ are true.

Lemma 2.3. Suppose that $H_{i-1}$ is true, then $\Pr(|P_i(e)| < (1-\delta)\beta_i^2 |P_{i-1}(e)|) < e^{-\Omega(\delta^2 p'_i)}$.

Proof. Consider a color $c \in P_{i-1}(e)$. The probability $c$ remains in $P_i(e)$ is exactly $\Pr(c \notin S_i(N^*_{i-1}(e)))$ […]

Lemma 2.4. Suppose that $H_{i-1}$ is true, then $\Pr(\deg_{i,c}(u) > t'_i) < 2e^{-\Omega(\delta^2 T)} + \Delta e^{-\Omega(\delta^2 p'_i)}$.

Proof. Define the auxiliary set

\[
\widehat{N}_{i,c}(u) \stackrel{\mathrm{def}}{=} \{ e \in N_{i-1,c}(u) \mid (c \in K_i(e)) \text{ and } (c \notin S(N^*_{i-1}(e) \setminus N_{i-1,c}(u))) \}
\]

and $\widehat{\deg}_{i,c}(u) = |\widehat{N}_{i,c}(u)|$ (see Figure 1a). $\widehat{N}_{i,c}(u)$ is the set of edges $uv \in N_{i-1,c}(u)$ that keep the color $c$ in $K_i(uv)$ and no edges adjacent to $v$ (except possibly $uv$) choose $c$. We will first show that $\Pr(\widehat{\deg}_{i,c}(u) > (1+\delta)\beta_i t'_{i-1}) \le e^{-\Omega(\delta^2 T)}$.

Figure 1.
(a) Edges at $u$ in $N_{i-1,c}(u)$ (top) and the edges in $N^*_{i-1,c}(N_{i-1,c}(u)) \setminus N_{i-1,c}(u)$ (bottom). The bold lines denote the edges in $\widehat{N}_{i,c}(u)$. In this example, we assume all the edges in the bottom have smaller ID than the edges on the top. The solid square beside an edge $e$ in the top part denotes that $c \in K_i(e)$. The character 'c' beside an edge $e$ in the bottom part denotes that $c \in S_i(e)$. The set $\widehat{N}_{i,c}(u)$ is determined by the squares and the c's.
(b) An illustration showing the probability that $e_2$ selects a color $c' \in P_i(e_2)$ is unaffected when conditioning on $\mathcal{E}_1$, $\mathcal{E}_2$, and whether $e_1$ is colored or not. Note that $e_1, e_2 \in \widehat{N}_{i,c}(u)$ and $\mathrm{ID}(e_1) < \mathrm{ID}(e_2)$. $\mathcal{E}_1$ is a function of $K_i(e_1)$ and the colors chosen by the edges in $A$ and $B$. $\mathcal{E}_2$ is a function of $K_i(e_2)$ and the colors chosen by the edges in $C$ and $D$. Thus, conditioning on them does not affect the probability $e_2$ selects $c'$. Furthermore, whether $e_1$ is colored does not depend on whether $e_2$ selects the colors in $P_i(e_2)$, but only possibly depends on whether the colors in the grey area ($P_{i-1}(e_2) \setminus P_i(e_2)$) are selected.

Consider $e = uv \in N_{i-1,c}(u)$. The probability that $c \in K_i(e)$ and $c \notin S(N^*_{i-1}(e) \setminus N_{i-1,c}(u))$ both happen is

\[
\frac{(1-\pi_i)^{2t'_{i-1}-2}}{(1-\pi_i)^{\deg^*_{i-1,c}(v) + \deg^*_{i-1,c}(u) - 2}} \cdot (1-\pi_i)^{\deg^*_{i-1,c}(v)-1}
\le \frac{(1-\pi_i)^{t'_{i-1}-1}}{(1-\pi_i)^{\deg^*_{i-1,c}(v)-1}} \cdot (1-\pi_i)^{\deg^*_{i-1,c}(v)-1} = \beta_i .
\]

Let $e_1, \ldots, e_k$ be the edges in $N_{i-1,c}(u)$ and let $e'_1, \ldots, e'_{k'}$ be the edges in $N^*_{i-1,c}(N_{i-1,c}(u)) \setminus N_{i-1,c}(u)$. Clearly, $\widehat{\deg}_{i,c}(u)$ is determined solely by $K_i(e_1), \ldots, K_i(e_k)$ and $S_i(e'_1), \ldots, S_i(e'_{k'})$. Define the following sequence:

\[
Y_j =
\begin{cases}
\emptyset & j = 0 \\
(K_i(e_1), \ldots, K_i(e_j)) & 1 \le j \le k \\
(Y_k, S_i(e'_1), \ldots, S_i(e'_{j-k})) & k < j \le k + k'
\end{cases}
\]

Let $V_j$ be

\[
\mathrm{Var}\left( \mathbb{E}[\widehat{\deg}_{i,c}(u) \mid Y_{j-1}] - \mathbb{E}[\widehat{\deg}_{i,c}(u) \mid Y_j] \;\middle|\; Y_{j-1} \right).
\]

We will upper bound $V_j$ and apply the concentration inequalities of Lemma A.5. For $1 \le j \le k$, the exposure of $K_i(e_j)$ affects $\widehat{\deg}_{i,c}(u)$ by at most 1, so $V_j \le 1$ and $\sum_{1 \le j \le k} V_j \le t'_{i-1}$. For $k < j \le k + k'$, the exposure of $S_i(e'_{j-k})$ affects $\widehat{\deg}_{i,c}(u)$ by at most 2, since edge $e'_{j-k}$ is adjacent to at most 2 edges in $N_{i-1,c}(u)$. Since the probability $e'_{j-k}$ selects $c$ is $\pi_i$, $V_j \le 4\pi_i$. (We make a query about whether $c$ is contained in $S_i(e'_{j-k})$. For a yes/no query, the variance is bounded by $p_{\mathrm{yes}} \cdot C^2$, if the function is $C$-Lipschitz and $p_{\mathrm{yes}}$ is the probability that the answer to the query is yes [11, 12].) Therefore, $\sum_{k < j \le k+k'} V_j$ […]

\[
\begin{aligned}
\Pr(\widehat{\deg}_{i,c}(u) > (1+\delta)\beta_i t'_{i-1})
&\le \Pr(\widehat{\deg}_{i,c}(u) > \beta_i \deg_{i-1,c}(u) + t) && (\deg_{i-1,c}(u) \le t'_{i-1},\ t = \delta\beta_i t'_{i-1}) \\
&\le \exp\left( - \frac{t^2}{2\left(\sum_{j=1}^{k+k'} \sigma_j^2 + 2t/3\right)} \right) \\
&= \exp\left( - \frac{t^2}{2(5t'_{i-1} + 2t/3)} \right) \\
&= \exp\left( - \frac{\delta^2 \beta_i^2 t'^2_{i-1}}{2(5t'_{i-1} + 2(\delta\beta_i t'_{i-1})/3)} \right) \\
&= \exp\left( -\Omega(\delta^2 t'_{i-1}) \right)
\end{aligned}
\]

Next, we show $\Pr(\deg_{i,c}(u) > (1+\delta)\alpha_i \widehat{\deg}_{i,c}(u)) \le \Delta e^{-\Omega(\delta^2 p'_i)} + e^{-\Omega(\delta^2 T)}$. Let $e_1, \ldots, e_k \in \widehat{N}_{i,c}(u)$ be listed by their ID in increasing order. Let $\mathcal{E}_j$ denote the likely event that $|P_i(e_j)| \ge p'_i$. Notice that $\Pr(c \in P_i(e_j) \mid e_j \in \widehat{N}_{i,c}(u)) \ge \Pr(c \in P_i(e_j)) \ge \beta_i^2$ and $\Pr(c' \in P_i(e_j) \mid e_j \in \widehat{N}_{i,c}(u)) = \Pr(c' \in P_i(e_j)) \ge \beta_i^2$ for all other $c' \ne c$ and $c' \in P_{i-1}(e_j)$. Therefore, $\mathbb{E}[|P_i(e_j)| \mid e_j \in \widehat{N}_{i,c}(u)] \ge \beta_i^2 |P_{i-1}(e_j)| \ge \beta_i^2 p'_{i-1}$. By Lemma 2.3, $\Pr(\overline{\mathcal{E}_j}) \le e^{-\Omega(\delta^2 p'_i)}$. Let $X_j$ be the event that $e_j$ is not colored after this round and let $\overline{X}_j$ be the shorthand for $(X_1, \ldots, X_j)$. We will show that

\[
\max_{\overline{X}_{j-1}} \Pr(X_j \mid \overline{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j) \le \alpha_i
\]

and so we can apply Lemma A.2, a Chernoff-type tail bound when conditioning on a sequence of very likely events. First, we argue that for any $\overline{X}_{j-1}$ and $c' \in P_i(e_j)$, $\Pr(c' \in S_i(e_j) \mid \overline{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j) = \pi_i$ (see Figure 1b). Since $c' \in P_i(e_j)$, $c'$ is not chosen by any of the edges $e_1, e_2, \ldots, e_{j-1}$. Therefore, whether these edges become colored does not depend on whether they choose $c'$ or not. Furthermore, conditioning on $\mathcal{E}_1, \ldots, \mathcal{E}_j$ has no effect on the probability $e_j$ selects $c'$, because the palette sizes of $e_1, \ldots, e_j$ do not depend on the colors chosen by $e_j$, but only on the choices of the edges with smaller ID. Therefore, we have:

\[
\begin{aligned}
\Pr(X_j \mid \overline{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j)
&= \prod_{c' \in P_i(e_j)} \Pr(c' \notin S_i(e_j) \mid \overline{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j) \\
&= (1-\pi_i)^{|P_i(e_j)|} \\
&\le (1-\pi_i)^{p'_i} && (\mathcal{E}_j \text{ is true}) \\
&= \alpha_i
\end{aligned}
\]

Notice that by Lemma 2.3, $\sum_j \Pr(\overline{\mathcal{E}_j}) \le \Delta e^{-\Omega(\delta^2 p'_i)}$. By Lemma A.2 and Corollary A.1, we have:

\[
\Pr\left(\deg_{i,c}(u) > \alpha_i \widehat{\deg}_{i,c}(u) + \delta \max(\alpha_i \widehat{\deg}_{i,c}(u), T)\right) \le e^{-\Omega(\delta^2 T)} + \Delta e^{-\Omega(\delta^2 p'_i)} .
\]

By the union bound, the probability that both $\deg_{i,c}(u) \le \alpha_i \widehat{\deg}_{i,c}(u) + \delta \cdot \max(\alpha_i \widehat{\deg}_{i,c}(u), T)$ and $\widehat{\deg}_{i,c}(u) \le (1+\delta)\beta_i t'_{i-1}$ hold is at least $1 - 2e^{-\Omega(\delta^2 T)} - \Delta e^{-\Omega(\delta^2 p'_i)}$. When both of them are true:

\[
\begin{aligned}
\deg_{i,c}(u) &\le (1+\delta)\alpha_i\beta_i t'_{i-1} + \delta \max((1+\delta)\alpha_i\beta_i t'_{i-1}, T) \\
&\le (1+\delta)\alpha_i\beta_i t'_{i-1} + \delta \max((1+\delta)\alpha_i\beta_i t'_{i-1}, t_i) && (T \le t_i) \\
&\le (1+\delta)\alpha_i\beta_i t'_{i-1} + \delta (1+\delta)^{2i-1} t_i \\
&\le t'_i && \text{(defn. } t_i \text{ and } t'_i\text{)}
\end{aligned}
\]

Second Phase. Suppose that $H_r$ holds at the end of iteration $r$, where $r$ is the first round where $t_r = T$ and so $\deg_{r,c}(u) \le t'_r \le 2T$ for all $u$ and $c$. Now we will show the algorithm terminates in a constant number of rounds. For $i > r$, let $t'_i = t'_{i-1} \cdot \frac{T}{p'_i}$. Recall that $H_i(e)$ denotes the event that $|P_i(e)| \ge p'_i$ and $H_{i,c}(u)$ denotes the event that $\deg_{i,c}(u) \le t'_i$. (Notice that $t'_i$ has a different definition when $i > r$ than when $0 \le i \le r$.) Also recall $H_i$ denotes the event that $H_i(e)$ and $H_{i,c}(u)$ are true for all $u, e \in G_i$ and all $c \in P_i(u)$. If $\Delta$ is large enough, then we can assume that $p'_i \ge \Delta^{1-0.8\gamma}$ by Lemma 2.2. Then from the definition of $t'_i$, it shrinks to less than one in $\lceil \frac{1}{0.1\gamma} \rceil$ rounds, since $T/p'_i \le \Delta^{-0.1\gamma}$ and $t'_{r+\lceil 1/(0.1\gamma) \rceil} < (\Delta^{-0.1\gamma})^{\lceil 1/(0.1\gamma) \rceil} \cdot t'_r < 1$.

Suppose that $H_{i-1}$ is true; we will estimate the probability that $H_i(e)$ and $H_{i,c}(u)$ are true. Consider a color $c \in P_{i-1}(e)$. It is retained in the palette with probability exactly $\beta_i^2$, so $\mathbb{E}[|P_i(e)|] \ge \beta_i^2 |P_{i-1}(e)| \ge \beta_i^2 p'_{i-1}$. Since each color is retained in the palette independently, by a Chernoff Bound, $\Pr(|P_i(e)| < (1-\delta)\beta_i^2 \cdot p'_{i-1}) < e^{-\Omega(\delta^2 p'_i)}$.

Lemma 2.5. Suppose that $H_{i-1}$ is true where $i > r$, then $\Pr(\deg_{i,c}(u) > t'_i) < e^{-\Omega(T)} + \Delta e^{-\Omega(\delta^2 p'_i)}$.

Proof. We will now bound the probability that $\deg_{i,c}(u) > t'_i$. Let $e_1, \ldots, e_k \in N_{i-1,c}(u)$, listed by their ID in increasing order. Let $\mathcal{E}_j$ denote the likely event that $|P_i(e_j)| \ge p'_i$. Notice that $\Pr(\overline{\mathcal{E}_j}) \le e^{-\Omega(\delta^2 p'_i)}$ by Lemma 2.3. For each $e_j \in N_{i,c}(u)$, let $X_j$ denote the event that $e_j$ is not colored. As we have shown previously, $\Pr(X_j \mid \overline{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j) \le \alpha_i$, therefore,

\[
\Pr(\deg_{i,c}(u) > t'_i) = \Pr\left( \deg_{i,c}(u) > \frac{t'_i}{\alpha_i t'_{i-1}} \cdot \alpha_i t'_{i-1} \right).
\]

Applying Lemma A.2 and Corollary A.1 with $1+\delta = t'_i/(\alpha_i t'_{i-1})$, and noticing that $\alpha_i \deg_{i-1,c}(u) \le \alpha_i t'_{i-1}$, the probability above is bounded by

\[
\begin{aligned}
&\exp\left(-\alpha_i t'_{i-1}\left(\frac{t'_i}{\alpha_i t'_{i-1}}\ln\frac{t'_i}{\alpha_i t'_{i-1}} - \left(\frac{t'_i}{\alpha_i t'_{i-1}} - 1\right)\right)\right) + \Delta e^{-\Omega(\delta^2 p'_i)} \\
&\le \exp\left(-t'_i\left(\ln\frac{t'_i}{\alpha_i t'_{i-1}} - 1\right)\right) + \Delta e^{-\Omega(\delta^2 p'_i)} \\
&= \exp\left(-t'_i\left(\ln\frac{1}{\alpha_i} - \ln\frac{e\,t'_{i-1}}{t'_i}\right)\right) + \Delta e^{-\Omega(\delta^2 p'_i)} \\
&\le \exp\left(-t'_i\left((1-o(1))\frac{p'_i}{K t'_{i-1}} - \ln\frac{e\,t'_{i-1}}{t'_i}\right)\right) + \Delta e^{-\Omega(\delta^2 p'_i)} && \left(\ln\frac{1}{\alpha_i} = (1-o(1))\frac{p'_i}{K t'_{i-1}}\right) \\
&\le \exp\left(-\left((1-o(1))\frac{T}{K} - t'_i\ln(e\Delta)\right)\right) + \Delta e^{-\Omega(\delta^2 p'_i)} && \left(\text{defn. } t'_i \text{ and } t'_{i-1}/t'_i < \Delta\right) \\
&\le \exp\left(-T\left(\frac{1-o(1)}{K} - \frac{t'_{i-1}}{p'_i}\ln(e\Delta)\right)\right) + \Delta e^{-\Omega(\delta^2 p'_i)} \\
&\le \exp\left(-T\left((1-o(1))\frac{1}{K} - \frac{2\ln(e\Delta)}{\Delta^{0.1\gamma}}\right)\right) + \Delta e^{-\Omega(\delta^2 p'_i)} && \left(\frac{t'_{i-1}}{p'_i} \le \frac{2T}{p'_i} \le \frac{2}{\Delta^{0.1\gamma}}\right) \\
&\le \exp(-\Omega(T)) + \Delta e^{-\Omega(\delta^2 p'_i)}.
\end{aligned}
\]

2.1 Union bound or constructive Lovász Local Lemma. We want to ensure that $H_i$ holds for every round $i$. If $H_{i-1}$ is true, then $\Pr(\overline{H_i(e)}) \le \exp(-\Delta^{1-0.95\gamma})$ and $\Pr(\overline{H_{i,c}(u)}) \le \exp(-\Delta^{1-0.95\gamma})$. If $\Delta^{1-\gamma} \ge \log n$, then each of the bad events occurs with probability at most $1/\mathrm{poly}(n)$. Since there are at most $O(n^3)$ events, by the union bound, $H_i$ holds w.h.p. On the other hand, if $\Delta^{1-\gamma} \le \log n$, then one can use the constructive Lovász Local Lemma (LLL) to make $H_i$ hold w.h.p. Suppose that the probability that each event happens is at most $p$ and each event is dependent with at most $d$ other events. If $ep(d+1) < 1$, the LLL guarantees that the probability that none of the events happens is positive. The celebrated results of Moser and Tardos [19] gave both sequential and parallel algorithms for constructing the underlying assignments of random variables. In [9], Chung et al. showed that if a stronger condition of LLL, $epd^2 < 1$, is satisfied, then the assignment can be constructed more efficiently, in $O(\log_{1/epd^2} n)$ rounds w.h.p.

Now, each of the bad events $\overline{H_{i,c}(u)}$ or $\overline{H_i(e)}$ is dependent with other events only if their distance is at most 3. (The distance between two edges is the distance in the line graph; the distance between a vertex and an edge is the distance between the vertex and the further endpoint of the edge.) Since there are $O(\Delta)$ events on each vertex and $O(1)$ events on each edge, each event depends on at most $d = O(\Delta^3 \cdot \Delta) = O(\Delta^4)$ events. Let $p = \exp(-\Delta^{1-0.95\gamma})$ be an upper bound on the probability of each bad event. Now we have $epd^2 \le \exp(-\Delta^{1-\gamma})$. Therefore, we can make $H_i$ hold in $O(\log_{1/epd^2} n) \le O(\log n/\Delta^{1-\gamma})$ rounds w.h.p. This completes the proof of Theorem 2.1.

Note that our proof for Theorem 2.1 does not rely on all the palettes being identical. Therefore, our algorithm works as long as each palette has at least $(1+\epsilon)\Delta$ colors, which is known as the list edge coloring problem.

3 Coloring $(1-\epsilon)$-Locally Sparse Graphs with $\Delta + 1$ colors

In this section and the following section we switch contexts from edge coloring to vertex coloring. Now the palette after round $i$, $P_i(u)$, is defined on the vertices rather than on the edges. $G_i$ is the graph obtained by deleting those already colored vertices. Also, we assume each vertex has a unique ID, $\mathrm{ID}(u)$. Redefine the set functions $N_i(u): V \to 2^V$, $N_{i,c}(u): V \to 2^V$, $N^*_{i,c}(u): V \to 2^V$ to be the neighboring vertices of $u$, the neighboring vertices of $u$ having $c$ in their palettes, and the neighboring vertices of $u$ having smaller ID than $u$ and having $c$ in their palette.

$G$ is said to be $(1-\epsilon)$-locally sparse if for any $u \in G$, the number of edges spanning the neighborhood of $u$ is at most $(1-\epsilon)\binom{\Delta}{2}$ (i.e. $|\{xy \in G \mid x \in N(u) \text{ and } y \in N(u)\}| \le (1-\epsilon)\binom{\Delta}{2}$).

Theorem 3.1. Let $\epsilon, \gamma > 0$ and $G$ be a $(1-\epsilon)$-locally sparse graph. There exists a distributed algorithm that colors $G$ with $\Delta + 1$ colors in $O(\log^* \Delta + \log(1/\epsilon) + 1/\gamma)$ rounds if $(\epsilon\Delta)^{1-\gamma} = \Omega(\log n)$.

Corollary 3.1. Let $\epsilon > 0$ and $G$ be a $(1-\epsilon)$-locally sparse graph. $G$ can be properly colored with $(\Delta + 1)$ colors in $O(\log(1/\epsilon) + e^{O(\sqrt{\log\log n})})$ rounds.

Proof. Let $\gamma = 1/2$. If $\epsilon\Delta = \Omega(\log^2 n)$, Theorem 3.1 gives an algorithm that runs in $O(\log^* \Delta + \log(1/\epsilon))$ rounds. Otherwise if $\epsilon\Delta = O(\log^2 n)$, the $(\Delta+1)$-coloring algorithm given in [7] runs in $O(\log \Delta + e^{O(\sqrt{\log\log n})}) = O(\log(\log^2 n/\epsilon) + e^{O(\sqrt{\log\log n})}) = O(\log(1/\epsilon) + e^{O(\sqrt{\log\log n})})$ rounds.

First we assume that each vertex $u \in G$ has $\Delta$ neighbors. If a vertex $u$ has fewer than $\Delta$ neighbors, we will attach $\Delta - \deg(u)$ imaginary neighbors to it. We will analyze the following process for just a single round. Initially every vertex has palette $P_0(u) = \{1, \ldots, \Delta + 1\}$. Each vertex picks a tentative color uniformly at random. For each vertex, if no neighbors of smaller ID picked the same color, then it will color itself with the chosen color. Now each vertex removes the colors that are colored by its neighbors. Let $\deg_1(u)$ and $P_1(u)$ denote the degree of $u$ and the palette of $u$ after the first round. The idea is to show that $|P_1(u)| \ge \deg_1(u) + \Omega(\epsilon\Delta)$; then we can apply the algorithm in the next section. Intuitively this will be true, because of those neighbors of $u$ who become colored, some fraction of them are going to be colored the same, since the neighborhood of $u$ is not entirely spanned.

Let $N(u)$ denote $u$'s neighbors. For $x, y \in N(u)$, we call $xy$ a non-edge if $xy \notin E$. For $x, y \in N(u)$ where $\mathrm{ID}(x) < \mathrm{ID}(y)$, we call $xy$ a successful non-edge w.r.t. $u$ if the following two conditions hold: First, $xy$ is not an edge and $x$ and $y$ are colored with the same color. Second, aside from $x, y$, no other vertices in $N(u)$ with smaller ID than $y$ picked the same color as $x, y$. We will show that w.h.p. there will be at least $\epsilon\Delta/(8e^3)$ successful non-edges. Then $|P_1(u)| \ge \Delta + 1 - (\Delta - \deg_1(u)) + \epsilon\Delta/(8e^3) \ge \deg_1(u) + \epsilon\Delta/(8e^3)$.

Lemma 3.1. Fix a vertex $u \in G$. Let $Z$ denote the number of successful non-edges w.r.t. $u$. Then

\[ \Pr\left(Z < \epsilon\Delta/(8e^3)\right) \le e^{-\Omega(\epsilon\Delta)}. \]

Proof. We will assume without loss of generality that the neighborhood of $u$ has exactly $(1-\epsilon)\binom{\Delta}{2}$ edges. This can be assumed because we can arbitrarily add edges to its neighborhood until there are $(1-\epsilon)\binom{\Delta}{2}$ edges. If $Z'$ is the number of successful non-edges in the modified scenario, then $Z$ statistically dominates $Z'$, i.e. $\Pr(Z \ge z) \ge \Pr(Z' \ge z)$. Given the same outcomes of the random variables, if a pair $xy$ is a successful non-edge in the modified scenario, then it must also be a successful non-edge in the original scenario.

We will first show that the expected number of successful non-edges is at least $\epsilon\Delta/(4e^3)$. Then we will define a martingale sequence on the 2-neighborhood of $u$. After showing the variance $\sum_i V_i$ has the same order as its expectation, $O(\epsilon\Delta)$, we will apply the method of bounded variance (Lemma A.5) to get the stated bound.

Given a non-edge $xy$ in the neighborhood of $u$, the probability it is successful is at least $(1 - 1/(\Delta+1))^{3\Delta-2} \cdot (1/(\Delta+1)) = (1 - 1/(\Delta+1))^{3\Delta-1} \cdot (1/\Delta) \ge e^{-3}/\Delta$. The expectation (assuming $\Delta > 1$) satisfies

\[ E[Z] = \sum_{\substack{x,y \in N(u) \\ xy \notin E}} \Pr(xy \text{ is successful}) \ge \frac{\epsilon\Delta(\Delta-1)}{2} \cdot \frac{e^{-3}}{\Delta} = \frac{\epsilon(\Delta-1)}{2e^{3}} \ge \frac{\epsilon\Delta}{4e^{3}}. \]

We will define the martingale sequence on the 2-neighborhood of $u$ and then show the variance $\sum_i V_i$ has the same order as its expectation, $O(\epsilon\Delta)$. Let $\{u_0 = u, u_1, \ldots, u_k\}$ be the vertices in the 2-neighborhood of $u$, where vertices with distance 2 are listed first and then distance 1. The distance 1 vertices are listed by their ID in increasing order. Let $X_i$ denote the color picked by $u_i$, and let $\mathbf{X}_i$ be shorthand for $(X_1, \ldots, X_i)$. Given $\mathbf{X}_{i-1}$, let $D_{i,s_i}$ be $|E[Z \mid \mathbf{X}_{i-1}, X_i = s_i] - E[Z \mid \mathbf{X}_{i-1}]|$ and $V_i$ be $\mathrm{Var}(E[Z \mid \mathbf{X}_i] - E[Z \mid \mathbf{X}_{i-1}] \mid \mathbf{X}_{i-1})$. Note that (see [12])

\[ \sqrt{V_i} \le \max_{s_i} D_{i,s_i} \le \max_{s_i, s'_i} \left| E[Z \mid \mathbf{X}_{i-1}, X_i = s_i] - E[Z \mid \mathbf{X}_{i-1}, X_i = s'_i] \right|. \]

Also, $E[Z \mid \mathbf{X}_i] = \sum_{x,y \in N(u),\, xy \notin E} E[xy \text{ is successful} \mid \mathbf{X}_i]$. We discuss the cases of whether $u_i$ is a neighbor of $u$ separately. If $u_i \notin N(u)$, whether $u_i$ chose $s_i$ or $s'_i$ only affects those non-edges $xy$ such that at least one of $x$ or $y$ is adjacent to $u_i$. Let $E_i$ denote such a set of non-edges. If $xy \in E_i$, then

\[ \left| E[xy \text{ is successful} \mid \mathbf{X}_{i-1}, X_i = s_i] - E[xy \text{ is successful} \mid \mathbf{X}_{i-1}, X_i = s'_i] \right| \le 2/(\Delta+1)^2 \]

because they only differ when both $x$ and $y$ picked $s_i$ or $s'_i$. Thus, $\max_{s_i} D_{i,s_i} \le 2|E_i|/(\Delta+1)^2$. Notice that $|E_i| \le \epsilon\Delta^2$ and $\sum_i |E_i| \le \epsilon\Delta^2 \cdot (2\Delta) \le 2\epsilon\Delta^3$, since each of the two endpoints of a non-edge can be incident to $\Delta$ edges in those $E_i$. This implies $\sum_i |E_i|^2 \le 2\epsilon^2\Delta^5$, since the sum is maximized when each $|E_i|$ is either $0$ or $\epsilon\Delta^2$. Therefore, $\sum_{i:\, u_i \in N(N(u)) \setminus N(u)} V_i \le 4\sum_i |E_i|^2/(\Delta+1)^4 \le 8\epsilon^2\Delta$.

On the other hand, if $u_i \in N(u)$, we will first bound $D_{i,s_i} = |E[Z \mid \mathbf{X}_i] - E[Z \mid \mathbf{X}_{i-1}]|$ for a fixed $s_i$. Then we will bound $V_i = \sum_{s_i} \Pr(X_i = s_i) \cdot D_{i,s_i}^2$. Again, we break $Z$ into a sum of random variables $\sum_{u_a u_b \notin E,\; u_a, u_b \in N(u)} X_{u_a u_b}$, where $X_{u_a u_b}$ is the event that the non-edge $u_a u_b$ is successful. The indices $a, b$ are consistent with our martingale sequences. Without loss of generality, we assume $a < b$ and so $\mathrm{ID}(u_a) < \mathrm{ID}(u_b)$. Let $D_{i,s_i,ab} = |E[X_{u_a u_b} \mid \mathbf{X}_{i-1}, X_i = s_i] - E[X_{u_a u_b} \mid \mathbf{X}_{i-1}]|$. In order to derive an upper bound for $\left(\sum D_{i,s_i,ab}\right)^2$, we divide the non-edges $u_a u_b$ into five cases.

1. a < b < i: In this case, the color chosen by ui Now we are ready to bound the variance Vi. For

does not affect E[Xuaub ], because ui has a higher readability we let ∆1 = ∆ + 1.

ID. Thus, Di,si,ab = 0. X 2 Vi = Pr(Xi = si) D 2. i < a < b: In this case, · i si

Di,si,ab E[Xuaub Xi−1,Xi = si] X 1 X X ≤ | | 0 Di,si,ab + Di,si,ab+ E[Xuaub Xi−1,Xi = si] ≤ ∆1 · − | | si a

5. a < i = b: In this case, E[Xuaub Xi−1,Xi = si] it can only create at most one sucessful edge when x | 0 is either 1 or 0. Note that E[Xuaub Xi−1] unselects s and destroy one when x selects s . When | i i is at most 1/(∆ + 1). Therefore, if si is the x N(u), we consider the effect when x unselects the ∈ color picked by ua and ua is the only vertex color si. It can create or destroy at most 1 successful

that picked si among u1 . . . , ui−1, then Di,si,ab non-edge. It creates a successful non-edge yz only is at most 1. Otherwise, it is at most 1/(∆ + 1). when x, y, z picked si and no other vertices in N(u)

Let µsi be the indicator variables whether there with smaller ID than y, z picked si. It destroys a non- exists such a ua that colored si. We have edge when xy was a successful non-edge that both P a

Similarly, it can create or destroy at most 1 successful non-edge when $x$ picks $s'_i$. It can be shown that this 2-Lipschitz condition implies $D_{i,s_i} \le 2$ [12, Corollary 5.2].

Applying Lemma A.5 with $t = \epsilon\Delta/(8e^3)$ and $M = 2$, we get that

\[
\begin{aligned}
\Pr\left(Z < \epsilon\Delta/(8e^3)\right) &= \Pr\left(Z < \epsilon\Delta/(4e^3) - t\right) \\
&\le \exp\left(-\frac{t^2}{2(105\epsilon\Delta + 8\epsilon^2\Delta + 2t/3)}\right) \\
&= \exp(-\Omega(\epsilon\Delta)).
\end{aligned}
\]

Therefore, by Lemma 3.1, for any $u \in G$,

\[ \Pr\left(|P_1(u)| < \deg_1(u) + \frac{\epsilon}{8e^3} \cdot \Delta\right) \le e^{-\Omega(\epsilon\Delta)}. \]

If $\epsilon\Delta = \Omega(\log n)$, then $\Pr(|P_1(u)| < \deg_1(u) + \frac{\epsilon}{8e^3} \cdot \Delta) \le e^{-\Omega(\epsilon\Delta)} \le 1/\mathrm{poly}(n)$. By the union bound, $|P_1(u)| \ge \deg_1(u) + \frac{\epsilon}{8e^3} \cdot \Delta$ holds for all $u \in G$ with high probability. If $(\epsilon\Delta)^{1-\gamma} = \Omega(\log n)$, we show in the next section that the rest of the graph can be colored in $O(\log^* \Delta + \log(1/\epsilon) + 1/\gamma)$ rounds.

4 Vertex Coloring with $\deg(u) + \epsilon\Delta$ Colors

In this section we consider the vertex coloring problem where each vertex has $\epsilon\Delta$ more colors in its palette than its degree. The goal is to color each vertex by using a color from its palette. Note that the palettes of the vertices need not be identical and can have different sizes.

Theorem 4.1. Given $\epsilon, \gamma > 0$, and $G$, where each vertex $u \in G$ has a palette containing at least $\deg(u) + \epsilon\Delta$ colors and $(\epsilon\Delta)^{1-\gamma} = \Omega(\log n)$. There exists a distributed algorithm that colors $G$ properly in $O(\log^* \Delta + 1/\gamma + \log(1/\epsilon))$ rounds.

Corollary 4.1. Suppose that each vertex $u \in G$ has a palette containing at least $\deg(u) + \epsilon\Delta$ colors, then $G$ can be properly colored in $O(\log(1/\epsilon) + e^{O(\sqrt{\log\log n})})$ rounds.

Proof. Let $\gamma = 1/2$. If $\epsilon\Delta = \Omega(\log^2 n)$, Theorem 4.1 gives an algorithm that runs in $O(\log^* \Delta + \log(1/\epsilon))$ rounds. Otherwise if $\epsilon\Delta = O(\log^2 n)$, the $(\Delta+1)$-coloring algorithm given in [7] runs in $O(\log \Delta + e^{O(\sqrt{\log\log n})}) = O(\log(\log^2 n/\epsilon) + e^{O(\sqrt{\log\log n})}) = O(\log(1/\epsilon) + e^{O(\sqrt{\log\log n})})$ rounds.

We will define $d_i$ in Algorithm 4.1 later. Algorithm 4.1 is modified from Algorithm 2.1. The first modification is that instead of running it on the edges, we run it on vertices. Second, instead of removing all colors picked by the neighbors from the palette, we only remove colors that are actually colored by the neighbors. Third, instead of selecting colors with identical probability for each vertex, the vertices may select with different probabilities.

Algorithm 4.1. Vertex-Coloring-Algorithm($G$, $\{d_i\}$)
    $G_0 \leftarrow G$
    $i \leftarrow 0$
    repeat
        $i \leftarrow i + 1$
        for each $u \in G_{i-1}$ do
            Include each $c \in P_{i-1}(u)$ in $S_i(u)$ independently with probability
                $\pi_i(u) = \frac{1}{|P_{i-1}(u)|} \cdot \frac{d_{i-1} + \epsilon\Delta}{d_{i-1} + 1}$.
            If $S_i(u) \setminus S_i(N^*_{i-1}(u)) \ne \emptyset$, $u$ colors itself with any color in $S_i(u) \setminus S_i(N^*_{i-1}(u))$.
            Set $P_i(u) \leftarrow P_{i-1}(u) \setminus \{c \mid \text{a neighbor of } u \text{ is colored } c\}$.
        end for
        $G_i \leftarrow G_{i-1} \setminus \{\text{colored vertices}\}$
    until

Due to the second modification, at any round of the algorithm, a vertex always has $\epsilon\Delta$ more colors in its palette than its degree. The intuition behind the third modification is that if every vertex selects with an identical probability, then a neighbor of $u$ having a very large palette might prevent $u$ from becoming colored. To avoid this, the neighbor of $u$ should choose each color with a lower probability.

Define the parameters as follows:

\[
d_0 = \Delta, \qquad T = (\epsilon\Delta)^{1-\gamma}, \qquad \alpha_i = e^{-\frac{d_{i-1} + \epsilon\Delta}{8(d_{i-1}+1)}}, \qquad
d_i = \begin{cases} \max(1.01\,\alpha_i d_{i-1},\, T) & \text{if } d_{i-1} > T \\[2pt] \frac{T}{\epsilon\Delta} \cdot d_{i-1} & \text{otherwise.} \end{cases}
\]

Let $H_i(u)$ denote the event that $\deg_i(u) \le d_i$ after round $i$. Let $H_i$ denote the event that $H_i(u)$ holds for all $u \in G_{i-1}$, where $G_{i-1}$ is the graph induced by the uncolored vertices after round $i-1$. Note that when $H_{i-1}$ is true,

\[ \pi_i(u) = \frac{1}{|P_{i-1}(u)|} \cdot \frac{d_{i-1} + \epsilon\Delta}{d_{i-1} + 1} \le \frac{1}{|P_{i-1}(u)|} \cdot \frac{\deg_{i-1}(u) + \epsilon\Delta}{\deg_{i-1}(u) + 1} \le \frac{1}{\deg_{i-1}(u) + 1}. \]

Notice that $u$ remains uncolored iff it did not select any color in $P_{i-1}(u) \setminus S_i(N^*_{i-1}(u))$. We will show that the size of $P_{i-1}(u) \setminus S_i(N^*_{i-1}(u))$ is at least $|P_{i-1}(u)|/8$ and so the probability that $u$ did not become colored is at most $(1 - \pi_i(u))^{|P_{i-1}(u)|/8} \le \alpha_i$. Then, the expected value of $\deg_i(u)$ will be at most $\alpha_i d_{i-1}$.
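The parameter schedule and one round of Algorithm 4.1 can be sketched in Python as follows. This is a minimal centralized simulation under stated assumptions, not the authors' implementation: the names `degree_schedule` and `run_coloring` are hypothetical, vertex labels double as IDs, the second branch of $d_i$ is read as $T/(\epsilon\Delta) \cdot d_{i-1}$, the probability $\pi_i(u)$ is clamped to 1, the tie-breaking rule `min(free)` stands in for "any color", and the loop keeps using the last $d_i$ once the schedule drops below 1.

```python
import math
import random

def degree_schedule(delta, eps, gamma, max_rounds=10_000):
    """Return [d_0, d_1, ...] from the parameter definitions, stopping once d_i < 1."""
    T = (eps * delta) ** (1.0 - gamma)
    d = [float(delta)]
    while d[-1] >= 1.0 and len(d) <= max_rounds:
        prev = d[-1]
        if prev > T:
            alpha = math.exp(-(prev + eps * delta) / (8.0 * (prev + 1.0)))
            d.append(max(1.01 * alpha * prev, T))
        else:
            # Assumed reading of the second branch: d_i = T / (eps * delta) * d_{i-1}.
            d.append(T / (eps * delta) * prev)
    return d

def run_coloring(adj, palettes, delta, eps, gamma, seed=0):
    """adj: vertex -> set of neighbours (labels are comparable IDs).
    palettes: vertex -> set of colours, each of size at least deg(u) + 1.
    Returns a dict vertex -> colour covering every vertex."""
    rng = random.Random(seed)
    sched = degree_schedule(delta, eps, gamma)
    pal = {u: set(p) for u, p in palettes.items()}
    colour = {}
    i = 0
    while len(colour) < len(adj):
        d_prev = sched[min(i, len(sched) - 1)]
        i += 1
        uncoloured = [u for u in adj if u not in colour]
        # Every uncoloured vertex samples S_i(u) with probability pi_i(u) per colour.
        S = {}
        for u in uncoloured:
            pi = min(1.0, (d_prev + eps * delta) / ((d_prev + 1.0) * len(pal[u])))
            S[u] = {c for c in sorted(pal[u]) if rng.random() < pi}
        # u may only use colours not sampled by uncoloured neighbours of smaller ID.
        newly = {}
        for u in uncoloured:
            blocked = set()
            for v in adj[u]:
                if v in S and v < u:
                    blocked |= S[v]
            free = S[u] - blocked
            if free:
                newly[u] = min(free)
        colour.update(newly)
        # Remove only colours that neighbours actually took.
        for u, c in newly.items():
            for v in adj[u]:
                pal[v].discard(c)
    return colour
```

Two adjacent vertices cannot take the same color in the same round, because the higher-ID endpoint discards every color sampled by the lower-ID one; colors taken in earlier rounds are already gone from the palette. The schedule itself shrinks geometrically while $d_{i-1} > T$ and by the factor $T/(\epsilon\Delta)$ afterwards.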

Depending on whether $d_{i-1} > T$, we separate the definition of $d_i$ into two cases, because we would like the tail probability that $d_i$ deviates from its expectation to be bounded by $e^{-\Omega(T)}$.

Lemma 4.1. $d_i < 1$ for some $i = O(\log^* \Delta + 1/\gamma + \log(1/\epsilon))$.

Proof. We analyze how $d_i$ decreases in three stages. The first stage is when $d_{i-1} > \epsilon\Delta/33$. During this stage,

\[
\begin{aligned}
d_i = 1.01\,\alpha_i d_{i-1} &\le 1.01 \exp\left(-\frac{d_{i-1} + \epsilon\Delta}{8(d_{i-1}+1)}\right) \cdot d_{i-1} \\
&\le 1.01 \exp(-1/16) \cdot d_{i-1} && (d_{i-1} \ge 1) \\
&\le 0.99 \cdot d_{i-1}.
\end{aligned}
\]

Therefore, this stage ends in $O(\log(1/\epsilon))$ rounds. The second stage starts at the first $r_1$ such that $T < d_{r_1 - 1} \le \epsilon\Delta/33$. When $i > r_1$:

\[
\begin{aligned}
\alpha_i &\le 1.01 \cdot \exp\left(-\frac{d_{i-1} + \epsilon\Delta}{16\,d_{i-1}}\right) \\
&\le 1.01 \cdot \exp\left(-\frac{\epsilon\Delta}{16\,d_{i-1}}\right) \\
&\le \exp\left(\frac{1}{32}\right) \cdot \exp\left(-\frac{\epsilon\Delta}{16\,d_{i-1}}\right) \\
&\le \exp\left(-\frac{\epsilon\Delta}{32\,d_{i-1}}\right) && (d_{i-1} \le \epsilon\Delta/33 \le \epsilon\Delta) \\
&\le \exp\left(-\frac{\epsilon\Delta}{33\,\alpha_{i-1} d_{i-2}}\right) \\
&\le \exp\left(-\frac{1}{\alpha_{i-1}}\right) && (d_{i-2} \le \epsilon\Delta/33).
\end{aligned}
\]

Therefore, $1/\alpha_{r_1 + \log^*(1.01\Delta) + 1} \ge \underbrace{e^{e^{\cdot^{\cdot^{e}}}}}_{\log^*(1.01\Delta)} \ge 1.01\Delta$, and so $d_{r_1 + \log^*(1.01\Delta) + 1} \le \max(1.01\,\alpha_{r_1 + \log^*(1.01\Delta) + 1}\Delta,\, T) \le T$.

The third stage begins at the first round $r_2$ such that $d_{r_2 - 1} = T$. If $i \ge r_2$, then $d_i = \frac{T}{\epsilon\Delta} \cdot d_{i-1} \le (\epsilon\Delta)^{-\gamma} d_{i-1}$. Therefore, $d_{r_2 + 1/\gamma + 1} < (\epsilon\Delta)^{-1} \cdot T < 1$. The total number of rounds is $O(\log(1/\epsilon) + \log^* \Delta + 1/\gamma)$.

Lemma 4.2. Suppose that $H_{i-1}$ holds, then $\Pr(\deg_i(u) > d_i) \le e^{-\Omega(T)} + \Delta e^{-\Omega(\epsilon\Delta)}$.

Proof. Let $\widehat{P}_i(x) \stackrel{\mathrm{def}}{=} P_{i-1}(x) \setminus S_i(N^*_{i-1}(x))$ denote the current palette of $x$ excluding the colors chosen by its neighbors of smaller ID. We will first show that $E[|\widehat{P}_i(x)|] \ge |P_{i-1}(x)|/4$. Define $w(c) = \sum_{y \in N^*_{i-1,c}(x)} \pi_i(y)$. We define $w(c)$ to simplify the calculation, because we will argue that when $\sum_{c \in P_{i-1}(x)} w(c)$ is fixed, a certain expression is minimized when each summand equals $\sum_{c \in P_{i-1}(x)} w(c)/|P_{i-1}(x)|$. The probability that $c$ is not chosen by any of $x$'s neighbors with smaller ID is

\[ \prod_{y \in N^*_{i-1,c}(x)} (1 - \pi_i(y)) \ge \min_{\pi'_i:\; \sum_{y \in N^*_{i-1,c}(x)} \pi'_i(y) = w(c)} \; \prod_{y \in N^*_{i-1,c}(x)} (1 - \pi'_i(y)) \]

which is minimized when $\pi'_i(y) = w(c)/\deg^*_{i-1,c}(x)$, so the quantity above is

\[
\begin{aligned}
&\ge \left(1 - \frac{w(c)}{\deg^*_{i-1,c}(x)}\right)^{\deg^*_{i-1,c}(x)} = \left(1 - \frac{w(c)}{\deg^*_{i-1,c}(x)}\right)^{\frac{\deg^*_{i-1,c}(x)}{w(c)} \cdot w(c)} \\
&\ge \left(\frac{1}{4}\right)^{w(c)} && \left(\frac{w(c)}{\deg^*_{i-1,c}(x)} \le \frac{1}{2}\right).
\end{aligned}
\]

Note that the reason that $\frac{w(c)}{\deg^*_{i-1,c}(x)} \le \frac{1}{2}$ is $\pi_i(y) \le \frac{1}{\deg_{i-1}(y) + 1} \le \frac{1}{2}$ for $y \in N^*_{i-1,c}(x)$. Therefore,

\[
\begin{aligned}
E[|\widehat{P}_i(x)|] &= \sum_{c \in P_{i-1}(x)} \Pr(c \notin S_i(N^*_{i-1}(x))) \\
&\ge \sum_{c \in P_{i-1}(x)} \left(\frac{1}{4}\right)^{w(c)} \\
&\ge \min_{w':\; \sum_c w(c) = \sum_c w'(c)} \; \sum_{c \in P_{i-1}(x)} \left(\frac{1}{4}\right)^{w'(c)}
\end{aligned}
\]

which is minimized when the $w'(c)$ are all equal, that is, $w'(c) = \sum_{c' \in P_{i-1}(x)} w(c')/|P_{i-1}(x)|$, hence

\[ E[|\widehat{P}_i(x)|] \ge |P_{i-1}(x)| \cdot \left(\frac{1}{4}\right)^{\sum_{c \in P_{i-1}(x)} w(c)/|P_{i-1}(x)|}. \]

We show the exponent is at most 1, so that $E[|\widehat{P}_i(x)|] \ge |P_{i-1}(x)|/4$. The exponent satisfies

\[
\begin{aligned}
\sum_{c \in P_{i-1}(x)} \frac{w(c)}{|P_{i-1}(x)|} &= \sum_{y \in N^*_{i-1}(x)} \frac{|P_{i-1}(x) \cap P_{i-1}(y)|}{|P_{i-1}(y)|} \cdot \frac{d_{i-1} + \epsilon\Delta}{d_{i-1} + 1} \cdot \frac{1}{|P_{i-1}(x)|} \\
&\le \sum_{y \in N_{i-1}(x)} \frac{d_{i-1} + \epsilon\Delta}{d_{i-1} + 1} \cdot \frac{1}{|P_{i-1}(x)|} \\
&\le \frac{d_{i-1} + \epsilon\Delta}{d_{i-1} + 1} \cdot \frac{\deg_{i-1}(x)}{|P_{i-1}(x)|} \\
&\le 1 && \left(\frac{\deg_{i-1}(x)}{|P_{i-1}(x)|} \le \frac{d_{i-1}}{d_{i-1} + \epsilon\Delta}\right).
\end{aligned}
\]

Notice that the event whether the color $c \in S_i(N^*_{i-1}(x))$ is independent of other colors, so by a Chernoff bound:

\[ \Pr\left(|\widehat{P}_i(x)| < |P_{i-1}(x)|/8\right) \le e^{-\Omega(|P_{i-1}(x)|)} = e^{-\Omega(\epsilon\Delta)}. \]

Let $x_1, \ldots, x_k \in N_{i-1}(u)$ be the neighbors of $u$, listed by their ID in increasing order. Let $\mathcal{E}_j$ be the event that $|\widehat{P}_i(x_j)| \ge |P_{i-1}(x_j)|/8$. We have shown that $\Pr(\overline{\mathcal{E}_j}) \le e^{-\Omega(\epsilon\Delta)}$. Let $X_j$ denote the event that $x_j$ is not colored after this round, and let $\mathbf{X}_{j-1}$ be shorthand for $(X_1, \ldots, X_{j-1})$. We will show that:

\[ \max_{\mathbf{X}_{j-1}} \Pr(X_j \mid \mathbf{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j) \le \alpha_i. \]

Let $c' \in \widehat{P}_i(x_j)$. First we argue that $\Pr(c' \in S_i(x_j) \mid \mathbf{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j) = \pi_i(x_j)$. Since $c' \in \widehat{P}_i(x_j)$, $c'$ is not chosen by any of $x_1, \ldots, x_{j-1}$. Whether $X_1, \ldots, X_{j-1}$ hold does not depend on whether $c' \in S_i(x_j)$. Furthermore, the events $\mathcal{E}_1, \ldots, \mathcal{E}_{j-1}$ do not depend on the colors chosen by $x_j$, since $x_j$ has higher ID than $x_1, \ldots, x_{j-1}$. Also, $\mathcal{E}_j$ does not depend on the colors chosen by $x_j$ either. Therefore, $\Pr(c' \in S_i(x_j) \mid \mathbf{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j) = \pi_i(x_j)$ and we have:

\[
\begin{aligned}
\Pr(X_j \mid \mathbf{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j) &= \prod_{c' \in \widehat{P}_i(x_j)} \Pr(c' \notin S_i(x_j) \mid \mathbf{X}_{j-1}, \mathcal{E}_1, \ldots, \mathcal{E}_j) \\
&\le (1 - \pi_i(x_j))^{|P_{i-1}(x_j)|/8} && (\mathcal{E}_j \text{ is true}) \\
&\le \exp\left(-\frac{d_{i-1} + \epsilon\Delta}{(d_{i-1} + 1)\,|P_{i-1}(x_j)|} \cdot \frac{|P_{i-1}(x_j)|}{8}\right) && (1 - x \le e^{-x}) \\
&= \alpha_i.
\end{aligned}
\]

If $d_{i-1} > T$, by Lemma A.2 and Corollary A.1,

\[
\begin{aligned}
\Pr(\deg_i(u) > \max(1.01\,\alpha_i d_{i-1}, T)) &\le \Pr(\deg_i(u) > \max(1.01\,\alpha_i \deg_{i-1}(u), T)) \\
&\le e^{-\Omega(T)} + \Delta e^{-\Omega(\epsilon\Delta)}.
\end{aligned}
\]

Otherwise we have $d_i = \frac{T}{\epsilon\Delta} \cdot d_{i-1} \le (\epsilon\Delta)^{-\gamma} \cdot T$. By Lemma A.2 and Corollary A.1 with $1 + \delta = T/(\alpha_i \epsilon\Delta)$,

\[
\begin{aligned}
\Pr(\deg_i(u) > d_i) &\le \Pr\left(\deg_i(u) > \frac{T}{\alpha_i \epsilon\Delta} \cdot \alpha_i d_{i-1}\right) \\
&\le \exp\left(-\alpha_i d_{i-1}\left(\frac{T}{\alpha_i \epsilon\Delta} \ln\frac{T}{\alpha_i \epsilon\Delta} - \left(\frac{T}{\alpha_i \epsilon\Delta} - 1\right)\right)\right) + \Delta e^{-\Omega(\epsilon\Delta)} \\
&\le \exp\left(-d_i \ln\frac{T}{e\,\alpha_i \epsilon\Delta}\right) + \Delta e^{-\Omega(\epsilon\Delta)} \\
&= \exp\left(-d_i\left(\ln\frac{1}{\alpha_i} - \ln\frac{e\,\epsilon\Delta}{T}\right)\right) + \Delta e^{-\Omega(\epsilon\Delta)} \\
&\le \exp\left(-d_i\left(\frac{\epsilon\Delta}{16\,d_{i-1}} - \ln(e(\epsilon\Delta)^{\gamma})\right)\right) + \Delta e^{-\Omega(\epsilon\Delta)} && (\text{defn. } \alpha_i) \\
&\le \exp\left(-T\left(\frac{1}{16} - \frac{d_i}{T} \ln(e(\epsilon\Delta)^{\gamma})\right)\right) + \Delta e^{-\Omega(\epsilon\Delta)} && (\text{defn. } d_i) \\
&\le \exp\left(-T\left(\frac{1}{16} - \frac{1}{(\epsilon\Delta)^{\gamma}} \cdot \ln(e(\epsilon\Delta)^{\gamma})\right)\right) + \Delta e^{-\Omega(\epsilon\Delta)} \\
&\le \exp(-\Omega(T)) + \Delta e^{-\Omega(\epsilon\Delta)}.
\end{aligned}
\]

In both cases, we have $\Pr(\deg_i(u) > d_i) \le \exp(-\Omega(T)) + \Delta\exp(-\Omega(\epsilon\Delta))$.

Since $(\epsilon\Delta)^{1-\gamma} = \Omega(\log n)$, $\Pr(\overline{H_i(u)}) \le \exp(-\Omega(T)) + \Delta\exp(-\Omega(\epsilon\Delta)) \le 1/\mathrm{poly}(n)$. By the union bound, $H_i$ holds with high probability. After $O(\log^* \Delta + \log(1/\epsilon) + 1/\gamma)$ rounds, $\deg_i(u) = 0$ for all $u$ w.h.p., and so the isolated vertices can color themselves with any colors in their palette.

References

[1] N. Alon, L. Babai, and A. Itai. A fast and simple randomized parallel algorithm for the maximal independent set problem. Journal of Algorithms, 7(4):567–583, 1986.
[2] N. Alon, M. Krivelevich, and B. Sudakov. Coloring graphs with sparse neighborhoods. Journal of Combinatorial Theory, Series B, 77(1):73–82, 1999.
[3] L. Barenboim and M. Elkin. Sublogarithmic distributed MIS algorithm for sparse graphs using Nash-Williams decomposition. Distributed Computing, 22:363–379, 2010.
[4] L. Barenboim and M. Elkin. Deterministic distributed vertex coloring in polylogarithmic time. J. ACM, 58(5):23, 2011.
[5] L. Barenboim and M. Elkin. Distributed Graph Coloring: Fundamentals and Recent Developments. Synthesis Lectures on Distributed Computing Theory. Morgan & Claypool Publishers, 2013.
[6] L. Barenboim, M. Elkin, and F. Kuhn. Distributed $(\Delta + 1)$-coloring in linear (in $\Delta$) time. SIAM J. Comput., 43(1):72–95, 2014.
[7] L. Barenboim, M. Elkin, S. Pettie, and J. Schneider. The locality of distributed symmetry breaking. In Proc. IEEE 53rd Symposium on Foundations of Computer Science (FOCS), pages 321–330, Oct. 2012.
[8] F. Chierichetti and A. Vattani. The local nature of list colorings for graphs of high girth. SIAM J. Comput., 39(6):2232–2250, 2010.
[9] K.-M. Chung, S. Pettie, and H.-H. Su. Distributed algorithms for Lovász local lemma and graph coloring. In Proc. 33rd ACM Symposium on Principles of Distributed Computing (PODC), pages 134–143, 2014.
[10] A. Czygrinow, M. Hanckowiak, and M. Karonski. Distributed $O(\Delta \log n)$-edge-coloring algorithm. In ESA, pages 345–355, 2001.
[11] D. Dubhashi, D. A. Grable, and A. Panconesi. Near-optimal, distributed edge colouring via the nibble method. Theor. Comput. Sci., 203(2):225–251, August 1998.
[12] D. P. Dubhashi and A. Panconesi. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, 2009.
[13] D. A. Grable and A. Panconesi. Nearly optimal distributed edge coloring in $O(\log\log n)$ rounds. Random Structures & Algorithms, 10(3):385–405, 1997.
[14] M. S. Jamall. Coloring Triangle-Free Graphs and Network Games. Dissertation, University of California, San Diego, 2011.
[15] F. Kuhn, T. Moscibroda, and R. Wattenhofer. Local computation: Lower and upper bounds. CoRR, abs/1011.5470, 2010.
[16] N. Linial. Locality in distributed graph algorithms. SIAM J. Comput., 21(1):193–201, February 1992.
[17] M. Luby. A simple parallel algorithm for the maximal independent set problem. SIAM Journal on Computing, 15(4):1036–1053, 1986.
[18] M. Molloy and B. Reed. Graph Colouring and the Probabilistic Method. Algorithms and Combinatorics. Springer, 2001.
[19] R. A. Moser and G. Tardos. A constructive proof of the general Lovász local lemma. J. ACM, 57(2):11, 2010.
[20] A. Panconesi and R. Rizzi. Some simple distributed algorithms for sparse networks. Distributed Computing, 14(2):97–100, 2001.
[21] A. Panconesi and A. Srinivasan. Randomized distributed edge coloring via an extension of the Chernoff-Hoeffding bounds. SIAM J. Comput., 26(2):350–368, 1997.
[22] S. Pemmaraju and A. Srinivasan. The randomized coloring procedure with symmetry-breaking. In Proc. 35th Int'l Colloq. on Automata, Languages, and Programming (ICALP), volume 5125 of LNCS, pages 306–319, 2008.
[23] S. Pettie and H.-H. Su. Distributed coloring algorithms for triangle-free graphs. Information and Computation. To appear.
[24] J. Schneider and R. Wattenhofer. A new technique for distributed symmetry breaking. In Proc. 29th ACM Symposium on Principles of Distributed Computing (PODC), pages 257–266, 2010.
[25] V. H. Vu. A general upper bound on the list chromatic number of locally sparse graphs. Comb. Probab. Comput., 11(1):103–111, January 2002.

Appendix

A Tools

Lemma A.1. (Chernoff Bound) Let $X_1, \ldots, X_n$ be independent trials and $X = \sum_{i=1}^n X_i$. Then, for $\delta > 0$:

\[
\Pr(X > (1+\delta)E[X]) \le \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{E[X]}, \qquad
\Pr(X < (1-\delta)E[X]) \le \left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^{E[X]}.
\]

The two bounds above imply that for $0 < \delta < 1$, we have:

\[
\Pr(X > (1+\delta)E[X]) \le e^{-\delta^2 E[X]/3}, \qquad
\Pr(X < (1-\delta)E[X]) \le e^{-\delta^2 E[X]/2}.
\]

Lemma A.2. Let $\mathcal{E}_1, \ldots, \mathcal{E}_n$ be (likely) events and $X_1, \ldots, X_n$ be trials such that for each $1 \le i \le n$ and $X = \sum_{i=1}^n X_i$,

\[ \max_{\mathbf{X}_{i-1}} \Pr(X_i \mid \mathbf{X}_{i-1}, \mathcal{E}_1, \ldots, \mathcal{E}_i) \le p \]

where $\mathbf{X}_i$ denotes the shorthand for $(X_1, \ldots, X_i)$.§ Then for $\delta > 0$:

\[ \Pr\left((X > (1+\delta)np) \wedge \left(\bigwedge_i \mathcal{E}_i\right)\right) \le \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{np} \]

and thus by the union bound,

\[ \Pr(X > (1+\delta)np) \le \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{np} + \sum_i \Pr(\overline{\mathcal{E}_i}). \]

§ We slightly abuse the notation: when conditioning on the random variable $X_i$, we mean that $X_i$ may take arbitrary values, whereas when conditioning on the event $\mathcal{E}_i$, we mean that $\mathcal{E}_i$ happens.

Proof. For now let us abuse $\mathcal{E}_i$ as 0/1 random variables and let $\mathcal{E} = \prod_i \mathcal{E}_i$. For any $t > 0$,

\[
\begin{aligned}
\Pr\left((X > (1+\delta)np) \wedge \left(\bigwedge_i \mathcal{E}_i\right)\right) &= \Pr\left(\prod_{i=1}^n \mathcal{E}_i \cdot \exp(tX) > \exp(t(1+\delta)np)\right) \\
&\le \frac{E\left[\prod_{i=1}^n \mathcal{E}_i \cdot \exp(tX)\right]}{\exp(t(1+\delta)np)} && \text{(A.1)} \\
&= \frac{E\left[\prod_{i=1}^n \mathcal{E}_i \cdot \exp(tX_i)\right]}{\exp(t(1+\delta)np)}. && \text{(A.2)}
\end{aligned}
\]

We will show by induction that

\[ E\left[\prod_{i=1}^k \mathcal{E}_i \exp(tX_i)\right] \le (1 + p(e^t - 1))^k. \]

When $k = 0$, it is trivial that $E[\mathcal{E}] \le 1$. For the inductive step,

\[
\begin{aligned}
E\left[\prod_{i=1}^k \mathcal{E}_i \exp(tX_i)\right]
&\le E\left[\left(\prod_{i=1}^{k-1} \mathcal{E}_i \exp(tX_i)\right) \cdot E[\mathcal{E}_k \exp(tX_k) \mid \mathbf{X}_{k-1}, \mathcal{E}_1, \ldots, \mathcal{E}_{k-1}]\right] \\
&= E\left[\left(\prod_{i=1}^{k-1} \mathcal{E}_i \exp(tX_i)\right) \cdot \Pr(\mathcal{E}_k) \cdot E[\exp(tX_k) \mid \mathbf{X}_{k-1}, \mathcal{E}_1, \ldots, \mathcal{E}_k]\right] \\
&\le E\left[\left(\prod_{i=1}^{k-1} \mathcal{E}_i \exp(tX_i)\right) \cdot E[\exp(tX_k) \mid \mathbf{X}_{k-1}, \mathcal{E}_1, \ldots, \mathcal{E}_k]\right] \\
&= E\left[\left(\prod_{i=1}^{k-1} \mathcal{E}_i \exp(tX_i)\right) \cdot \left(1 + \Pr(X_k \mid \mathbf{X}_{k-1}, \mathcal{E}_1, \ldots, \mathcal{E}_k)(e^t - 1)\right)\right] \\
&\le E\left[\left(\prod_{i=1}^{k-1} \mathcal{E}_i \exp(tX_i)\right) \cdot (1 + p(e^t - 1))\right] \\
&= E\left[\prod_{i=1}^{k-1} \mathcal{E}_i \exp(tX_i)\right] \cdot (1 + p(e^t - 1)) \\
&\le (1 + p(e^t - 1))^k.
\end{aligned}
\]

Therefore, by (A.1),

\[
\begin{aligned}
\Pr\left((X > (1+\delta)np) \wedge \left(\bigwedge_i \mathcal{E}_i\right)\right)
&\le \frac{E\left[\prod_{i=1}^n \mathcal{E}_i \cdot \exp(tX_i)\right]}{\exp(t(1+\delta)np)} \\
&\le \frac{(1 + p(e^t - 1))^n}{\exp(t(1+\delta)np)} \\
&\le \frac{\exp(np(e^t - 1))}{\exp(t(1+\delta)np)} \\
&= \left(\frac{\exp(\delta)}{(1+\delta)^{1+\delta}}\right)^{np}.
\end{aligned}
\]

The last equality follows from the standard derivation of the Chernoff bound by choosing $t = \ln(1+\delta)$.

Corollary A.1. Suppose that for any $\delta > 0$,

\[ \Pr\left((X > (1+\delta)np) \wedge \left(\bigwedge_i \mathcal{E}_i\right)\right) \le \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{np}; \]

then for any $M \ge np$ and $0 < \delta < 1$,

\[ \Pr\left((X > np + \delta M) \wedge \left(\bigwedge_i \mathcal{E}_i\right)\right) \le \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{M} \le e^{-\delta^2 M/3}. \]

Proof. Without loss of generality, assume $M = tnp$ for some $t \ge 1$. We have

\[
\begin{aligned}
\Pr\left((X > np + \delta M) \wedge \left(\bigwedge_i \mathcal{E}_i\right)\right)
&\le \left(\frac{e^{t\delta}}{(1+t\delta)^{(1+t\delta)}}\right)^{np} \\
&= \left(\frac{e^{\delta}}{(1+t\delta)^{(1+t\delta)/t}}\right)^{M} \\
&\le \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{M} && (*) \\
&\le e^{-\delta^2 M/3} && \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \le e^{-\delta^2/3} \text{ for } 0 < \delta < 1\right).
\end{aligned}
\]

Inequality (*) follows if $(1+t\delta)^{(1+t\delta)/t} \ge (1+\delta)^{(1+\delta)}$, or equivalently, $((1+t\delta)/t)\ln(1+t\delta) \ge (1+\delta)\ln(1+\delta)$. Letting $f(t) = ((1+t\delta)/t)\ln(1+t\delta) - (1+\delta)\ln(1+\delta)$, we have $f'(t) = \frac{1}{t^2}(\delta t - \ln(1+\delta t)) \ge 0$ for $t > 0$. Since $f(1) = 0$ and $f'(t) \ge 0$ for $t > 0$, we must have $f(t) \ge 0$ for $t \ge 1$.

Lemma A.3. ([12], Azuma's inequality) Let $f$ be a function of $n$ random variables $X_1, \ldots, X_n$ such that for each $i$, any $\mathbf{X}_{i-1}$, any $a_i$ and $a'_i$,

\[ \left| E[f \mid \mathbf{X}_{i-1}, X_i = a_i] - E[f \mid \mathbf{X}_{i-1}, X_i = a'_i] \right| \le c_i; \]

then

\[ \Pr(|f - E[f]| > t) \le 2e^{-t^2/(2\sum_i c_i^2)}. \]

Lemma A.4. ([12], Corollary 5.2) Suppose that $f(x_1, \ldots, x_n)$ satisfies the Lipschitz property where $|f(\mathbf{a}) - f(\mathbf{a}')| \le c_i$ whenever $\mathbf{a}$ and $\mathbf{a}'$ differ in just the $i$-th coordinate. If $X_1, \ldots, X_n$ are independent random variables, then

\[ \left| E[f \mid \mathbf{X}_{i-1}, X_i = a_i] - E[f \mid \mathbf{X}_{i-1}, X_i = a'_i] \right| \le c_i. \]

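The two analytic facts used in the proof of Corollary A.1, namely $f(t) \ge 0$ for $t \ge 1$ and $e^{\delta}/(1+\delta)^{1+\delta} \le e^{-\delta^2/3}$ for $0 < \delta < 1$, can be spot-checked numerically. The sketch below only samples a grid and is not a proof; the function names and the grid parameters are illustrative choices, not part of the paper.

```python
import math

def f(t, delta):
    """f(t) from the proof of Corollary A.1; the proof shows f(t) >= 0 for t >= 1."""
    return (1 + t * delta) / t * math.log1p(t * delta) - (1 + delta) * math.log1p(delta)

def chernoff_base(delta):
    """e^delta / (1 + delta)^(1 + delta), computed through its logarithm."""
    return math.exp(delta - (1 + delta) * math.log1p(delta))

def check(grid=200, t_max=50.0):
    """Return the smallest sampled f(t, delta) and the smallest sampled gap
    exp(-delta^2 / 3) - chernoff_base(delta), over 0 < delta < 1 and 1 <= t <= t_max."""
    worst_f, worst_gap = float("inf"), float("inf")
    for k in range(1, grid):
        delta = k / grid
        worst_gap = min(worst_gap, math.exp(-delta * delta / 3) - chernoff_base(delta))
        for j in range(grid + 1):
            t = 1.0 + (t_max - 1.0) * j / grid
            worst_f = min(worst_f, f(t, delta))
    return worst_f, worst_gap
```

Both returned values should be nonnegative up to floating-point rounding, which matches the monotonicity argument via $f'(t) \ge 0$ and the standard simplification of the Chernoff bound.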
Lemma A.5. ([12], Equation (8.5)) Let $X_1, \ldots, X_n$ be an arbitrary set of random variables and let $f = f(X_1, \ldots, X_n)$ be such that $E[f]$ is finite. For $1 \le i \le n$, suppose there exists $\sigma_i^2$ such that for any $\mathbf{X}_{i-1}$,

\[ \mathrm{Var}\left(E[f \mid \mathbf{X}_i] - E[f \mid \mathbf{X}_{i-1}] \mid \mathbf{X}_{i-1}\right) \le \sigma_i^2. \]

Also suppose that there exists $M$ such that for $1 \le i \le n$, $|E[f \mid \mathbf{X}_i] - E[f \mid \mathbf{X}_{i-1}]| \le M$. Then,

\[ \Pr(f > E[f] + t) \le \exp\left(-\frac{t^2}{2\left(\sum_{i=1}^n \sigma_i^2 + Mt/3\right)}\right). \]
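To get a feel for the bound in Lemma A.5, it can be evaluated directly. The sketch below is illustrative only: `bounded_variance_tail` is a hypothetical helper, and `lemma_3_1_style_bound` plugs in $t = \epsilon\Delta/(8e^3)$, $M = 2$ and a variance budget of $105\epsilon\Delta + 8\epsilon^2\Delta$, which is an assumed reading of the constants in the proof of Lemma 3.1 (the extracted display there is garbled).

```python
import math

def bounded_variance_tail(t, var_sum, M):
    """Right-hand side of Lemma A.5: exp(-t^2 / (2 (sum_i sigma_i^2 + M t / 3)))."""
    if t <= 0 or var_sum < 0 or M < 0:
        raise ValueError("need t > 0, var_sum >= 0, M >= 0")
    return math.exp(-t * t / (2.0 * (var_sum + M * t / 3.0)))

def lemma_3_1_style_bound(delta, eps, var_const=105.0):
    """Instantiate the bound with t = eps*delta/(8 e^3) and M = 2.
    var_const = 105 and the eps factor on it are assumptions, not confirmed values."""
    t = eps * delta / (8.0 * math.e ** 3)
    var_sum = var_const * eps * delta + 8.0 * eps ** 2 * delta
    return bounded_variance_tail(t, var_sum, M=2.0)
```

Under these assumptions every term in the denominator is proportional to $\epsilon\Delta$ while $t^2$ is proportional to $(\epsilon\Delta)^2$, so the exponent grows linearly in $\epsilon\Delta$; doubling $\Delta$ doubles the exponent. The hidden constant is very small, so the bound only becomes meaningful for large $\epsilon\Delta$.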
