Electronic Colloquium on Computational Complexity, Revision 2 of Report No. 132 (2015)

On Percolation and NP-Hardness

Huck Bennett ∗ Daniel Reichman † Igor Shinkar ∗

July 13, 2016

Abstract. The edge-percolation and vertex-percolation models start with an arbitrary graph G, and randomly delete edges or vertices of G with some fixed probability. We study the computational hardness of problems whose inputs are obtained by applying percolation to worst-case instances. Specifically, we show that a number of classical NP-hard graph problems remain essentially as hard on percolated instances as they are in the worst case (assuming NP ⊄ BPP). We also prove hardness results for other NP-hard problems such as Constraint Satisfaction Problems and Subset-Sum, with suitable definitions of random deletions. We focus on proving the hardness of the Maximum Independent Set problem and the Graph Coloring problem on percolated instances. To show this we establish the robustness of the corresponding parameters α(·) and χ(·) to percolation, which may be of independent interest. Given a graph G, let G′ be the graph obtained by randomly deleting edges of G. We show that if α(G) is small, then α(G′) remains small with probability at least 0.99. Similarly, we show that if χ(G) is large, then χ(G′) remains large with probability at least 0.99.

1 Introduction

The theory of NP-hardness suggests that we are unlikely to find optimal solutions to NP-hard problems in polynomial time. This theory applies to the worst-case setting, where one considers the worst-case running time over all inputs of a given size. It is less clear whether these hardness results apply to “real-life” instances. One way to address this question is to examine to what extent known NP-hardness results are stable under random perturbations, as it seems reasonable to assume that a given instance of a problem may be subjected to noise. Recent work has studied the effect of random perturbations of the input on the runtime of algorithms. In their seminal paper, Spielman and Teng [33] introduced the idea of smoothed analysis to explain the superior performance of algorithms in practice compared with formal worst-case bounds. Roughly speaking, smoothed analysis studies the running time of an algorithm on a perturbed worst-case instance. In particular, they showed that subjecting the weights of an arbitrary linear program to Gaussian noise yields instances on which the simplex algorithm runs in expected polynomial time, despite the fact that there are pathological linear programs for which the simplex algorithm requires exponential time. Since then smoothed analysis has been applied to a number of other problems [13, 34].

∗Department of Computer Science, Courant Institute of Mathematical Sciences, New York University. Email: [email protected], [email protected] †Electrical Engineering and Computer Science, University of California Berkeley, Berkeley, CA, USA. Email: [email protected].


ISSN 1433-8092

In contrast to smoothed analysis, we study when worst-case instances of problems remain hard under random perturbations. Specifically, we study to what extent NP-hardness results are robust when instances are subjected to random deletions. Previous work is mainly concerned with Gaussian perturbations of weighted instances; less work has examined the robustness of hardness results for unweighted instances with respect to discrete noise.

We focus on two forms of percolation on graphs. Given a graph G = (V, E) and a parameter p ∈ (0, 1), we define G_{p,e} = (V, E′) as the probability space of graphs on the same set of vertices, where each edge e ∈ E is contained in E′ independently with probability p. We say that G_{p,e} is obtained from G by edge percolation. We define G_{p,v} = (V′, E′) as the probability space of graphs in which every vertex v ∈ V is contained in V′ independently with probability p, and G_{p,v} is the subgraph of G induced by the vertices V′. We say that G_{p,v} is obtained from G by vertex percolation. We also study appropriately defined random deletions applied to instances of other NP-hard problems, such as 3-SAT and Subset-Sum. Throughout we refer to instances that are subjected to random deletions as percolated instances. Our main question is whether such percolated instances remain hard to solve by polynomial-time algorithms, assuming NP ⊄ BPP.

The study of random discrete structures has resulted in a wide range of mathematical tools which have proven instrumental in proving rigorous results regarding such structures [5, 17, 19, 26]. One reason for studying percolated instances is that it may offer the opportunity to apply these methods to a broader range of distributions of instances of NP-hard problems beyond random graphs and random formulas.
Furthermore, percolated instances are studied in a host of different disciplines and are frequently used to explain properties of the WWW [1, 8], hence it is of interest to understand the computational complexity of instances arising from percolated networks.
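As a concrete illustration, the two percolation models can be sketched in a few lines of Python (the function names are ours, not from the paper; graphs are plain vertex lists and edge lists):

```python
import random

def edge_percolation(vertices, edges, p, rng=random.Random(0)):
    # G_{p,e}: keep each edge independently with probability p;
    # the vertex set is unchanged.
    return vertices, [e for e in edges if rng.random() < p]

def vertex_percolation(vertices, edges, p, rng=random.Random(0)):
    # G_{p,v}: keep each vertex independently with probability p,
    # then take the induced subgraph on the surviving vertices.
    kept = {v for v in vertices if rng.random() < p}
    return kept, [(u, v) for (u, v) in edges if u in kept and v in kept]
```

The degenerate cases p = 1 (nothing deleted) and p = 0 (everything deleted) behave as expected.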

1.1 A first example – 3-Coloring

Consider the 3-Coloring problem, where given a graph G = (V, E) we need to decide whether G is 3-colorable. Suppose that given a graph G we sample a random subgraph G′ of G by deleting each edge of G independently with probability p = 1/2, and ask whether the resulting graph is 3-colorable. Is there a polynomial time algorithm that can decide with high probability whether G′ is 3-colorable?

We demonstrate that a polynomial-time algorithm for deciding whether G′ is 3-colorable is impossible assuming NP ⊄ BPP. We show this by considering the following polynomial time reduction from the 3-Coloring problem to itself. We will need the following definition.

Definition 1.1. Given a graph G, the R-blow-up of G is a graph G′ = (V′, E′), where every vertex v is replaced by an independent set ṽ of size R, which we call the cloud corresponding to v. We then connect the clouds ũ and ṽ by a complete R × R bipartite graph whenever (u, v) ∈ E.

Given an n-vertex graph H the reduction outputs a graph G that is an R-blow-up of H for R = C√log(n), where C > 0 is large enough. It is easy to see that H is 3-colorable if and only if G is 3-colorable. In fact, the foregoing reduction satisfies a stronger robustness property for random subgraphs G′ of G obtained by deleting edges of G with probability 1/2. Namely, if H is 3-colorable, then G is 3-colorable, and hence G′ is also 3-colorable with probability 1. On the other hand, if H is not 3-colorable, then G is not 3-colorable, and with high probability G′ is not 3-colorable either.

Indeed, for any edge (v₁, v₂) in H let U₁, U₂ be the two clouds in G corresponding to v₁ and v₂. Fixing two arbitrary sets U₁′ ⊆ U₁ and U₂′ ⊆ U₂, each of size R/3, the probability that there is no edge in G′ connecting a vertex from U₁′ to a vertex in U₂′ is 2^{−R²/9}. By union bounding over the |E| · \binom{R}{R/3}² ≪ 2^{R²/9} choices of U₁′, U₂′, we get that there is at least one edge between U₁′ and U₂′ with high probability. When this holds we can decode any 3-coloring of G′ to a 3-coloring of H by coloring each vertex v of H with the color that appears the largest number of times in the coloring of the corresponding cloud in G′, breaking ties arbitrarily. Therefore, a polynomial time algorithm for deciding the 3-colorability of G′ implies a polynomial time algorithm for determining the 3-colorability of H with high probability, and hence unless NP ⊆ coRP there is no polynomial time algorithm that given a 3-colorable graph G finds a 3-coloring of a random subgraph of G.¹
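The blow-up construction and the plurality-vote decoding step above can be sketched as follows (a toy illustration with hypothetical helper names, not the paper's implementation):

```python
from collections import Counter

def blowup(n, edges, R):
    # R-blow-up: vertex v becomes the cloud {(v, 0), ..., (v, R-1)};
    # clouds of adjacent vertices are joined by a complete R x R bipartite graph.
    V = [(v, i) for v in range(n) for i in range(R)]
    E = [((u, i), (v, j)) for (u, v) in edges for i in range(R) for j in range(R)]
    return V, E

def decode_coloring(n, R, coloring):
    # Decode a coloring of the (percolated) blow-up to a coloring of the base
    # graph: each base vertex gets the plurality color of its cloud
    # (ties broken arbitrarily).
    return [Counter(coloring[(v, i)] for i in range(R)).most_common(1)[0][0]
            for v in range(n)]
```

On a triangle with R = 2, coloring each cloud with its base vertex's index decodes back to a proper 3-coloring of the triangle.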

Toward a stronger notion of robustness The example above raises the question of whether the blow-up described above is really necessary. Naïvely, one could hope for stronger hardness of the 3-Coloring problem, namely, that for any graph H, if H is not 3-colorable, then with high probability a random subgraph H′ of H is not 3-colorable either. However, this is not true in general, as H can be a 3-critical graph, i.e., a graph with chromatic number 3 such that deletion of any edge of H decreases its chromatic number (consider for example the case of an odd cycle). Nonetheless, if random deletions do not decrease the chromatic number of a graph by much, then one could use hardness of approximation results on the chromatic number to deduce hardness results for coloring percolated graphs. This naturally leads to the following question.

Question. Let G be an arbitrary graph, and let G′ be a random subgraph of G obtained from G by deleting each edge of G with probability 1/2. Is it true that if χ(G) is large, then χ(G′) is also large with probability at least 0.99?²

In this paper we give a positive answer to this question and show that in some sense the chromatic number of any graph is robust against random deletions. We also consider the question of robustness for other graph parameters. For independent sets we demonstrate that if the independence number of G is small, then with high probability the independence number of a random subgraph of G is small as well. Hardness results are derived for other graph-theoretic problems such as Minimum Vertex Cover and Hamiltonian Cycle. Similarly, we show that for sufficiently dense CSP formulas, randomly deleting clauses does not change the maximum possible fraction of clauses that can be satisfied simultaneously. In particular, this implies that these problems remain essentially as hard on percolated instances as they are on worst-case instances.

Remark. It is worth noting that there are graph parameters for which hardness on percolated instances differs significantly from hardness on the original instance. For example, standard results in random graph theory imply that for every n-vertex graph G, with high probability the size of the largest clique in the graph G′ obtained by edge percolation with p = 1/2 is O(log n). In particular, a

¹Note that in the foregoing example, if we start with a bounded degree graph H, we can reduce it to a bounded degree graph G by using an R × R bipartite expander instead of the complete bipartite graph.
²We note that if instead of choosing the subgraph G′ at random, we choose an arbitrary subgraph of G with |E|/2 edges, then it is possible that χ(G′) is much smaller than χ(G). For example, consider the n-vertex graph G = (V, E) that consists of a clique of size n/3 and a complete bipartite graph with n/3 vertices on each side. Then χ(G) = n/3, whereas if we remove all the edges of the n/3-clique, the graph becomes 2-colorable, while the number of removed edges is \binom{n/3}{2} < n²/18 < |E|/2.

maximum clique in G′ can be found in time n^{O(log n)}, which is significantly faster than the fastest known algorithm for finding a maximum clique in the worst case.

1.2 Robustness of NP-hard problems under percolation

In proving hardness results for percolated instances we use the concept of robust reductions, which we explain below. It will be convenient to consider promise problems³. We start by introducing the following definition.

Definition 1.2. Let A = (AYES,ANO) and B = (BYES,BNO) be two promise problems. For each y ∈ {0, 1}∗ (an instance of the problem B) let noise(y) be a distribution on {0, 1}∗, that is samplable in time that is polynomial in |y|.

• A polynomial time reduction R from A to B is said to be noise-robust if

1. For all x ∈ AYES it holds that R(x) ∈ BYES, and Pr[noise(R(x)) ∈ BYES] > 0.99.

2. For all x ∈ ANO it holds that R(x) ∈ BNO, and Pr[noise(R(x)) ∈ BNO] > 0.99.

• If in the first item we have Pr[noise(R(x)) ∈ BYES] = 1, then we say that R is a noise-robust coRP-reduction. Similarly, if in the second item we have Pr[noise(R(x)) ∈ BNO] = 1, then we say that R is a noise-robust RP-reduction.

• The problem B = (BYES,BNO) is said to be NP-hard under a noise-robust reduction if there exists a noise-robust reduction from an NP-hard problem to B.

• We say that the problem A is strongly-noise-robust to B if

1. For all x ∈ AYES it holds that x ∈ BYES, and Pr[noise(x) ∈ BYES] > 0.99.

2. For all x ∈ ANO it holds that x ∈ BNO, and Pr[noise(x) ∈ BNO] > 0.99.

We use the term noise-robust to avoid confusion with other notions of robust reductions that have appeared in the literature. In order to ease readability, we will often write robust reductions instead, always referring to noise-robust reductions as defined above.

Note that in the last item of Definition 1.2 there is no reduction involved. Instead, we think of the problem A as a relaxation of B with AYES ⊆ BYES and ANO ⊆ BNO, and hence any algorithm that solves B in particular solves A. However, the relaxation is also robust: after applying noise to a YES-instance (resp. NO-instance) of A, it stays a YES-instance (resp. NO-instance) of B with high probability.

Proposition 1.3. Let L = (LYES, LNO) be a promise problem, and for each instance y of L, let noise(y) be a distribution on instances of L that is samplable in time that is polynomial in |y|. If L is NP-hard under a noise-robust reduction, then there is no polynomial time algorithm that when given an input y decides with high probability whether noise(y) ∈ LYES or noise(y) ∈ LNO, unless NP ⊆ BPP.

³Recall that a promise problem is a generalization of a decision problem, where for the problem L there are two disjoint subsets LYES and LNO, such that an algorithm that solves L must accept all the inputs in LYES and reject all inputs in LNO. If the input does not belong to LYES ∪ LNO, there is no requirement on the output of the algorithm.

Indeed, the example given in Section 1.1 gives a noise-robust reduction from the 3-Coloring problem to itself, where noise refers to random deletions of the edges in a given graph (edge percolation). Therefore, the 3-Coloring problem is NP-hard under a noise-robust reduction.

1.3 Our results

In this paper we show that a number of NP-hard problems remain hard to solve even after random deletions, i.e., they are NP-hard under noise-robust reductions. Furthermore, we show that some gap NP-hard problems are, in fact, strongly-noise-robust to the same problems with a smaller gap. Specifically, we focus on showing these results for the gap versions of the maximum independent set and chromatic number problems. As technical tools, we prove a number of combinatorial results about the independence number and the chromatic number of percolated graphs that might be of independent interest.

Maximum Independent Set and Percolation

Theorem 1.4. Let G = (V, E) be an n-vertex graph. Then, with high probability α(G_{p,e}) ≤ O((α(G)/p) · log(np)).

We observe that in general, the upper bound above cannot be improved, as it is well known that the independence number of G(n, p) is Ω(log(np)/p) with high probability (see, e.g., [5]).

In the Coloring-vs-MIS(q, a) problem, given a graph G the goal is to distinguish between the YES-case where χ(G) ≤ q and the NO-case where α(G) ≤ a. By using Theorem 1.4 together with inapproximability results of Feige and Kilian [14] we obtain the following hardness result.

Theorem 1.5. For any q, a the Coloring-vs-MIS(q, a) problem is strongly-noise-robust to Coloring-vs-MIS(q, O((a/p) · log(np))), where n denotes the number of vertices in the given graph, and noise is edge-percolation with probability p.

In particular, for any constant ε > 0, there is no polynomial time algorithm that given an n-vertex graph G approximates either α(G_{p,e}) or χ(G_{p,e}) within a pn^{1−2ε} factor for any p > 1/n^{1−2ε}, unless NP ⊆ BPP. We also prove analogous theorems for vertex percolation.

Graph Coloring and Percolation Theorem 1.5 says that it is hard to approximate the chromatic number of a percolated graph within an n^{1−ε} factor, but says nothing about hardness of coloring percolated graphs with small (constant) chromatic number. We address this question below by proving lower bounds⁴ on the chromatic number of percolated graphs. To do this we use results from additive combinatorics and discrete Fourier analysis.

Theorem 1.6. Let G = (V, E) be an n-vertex graph, and let k = χ(G). Then, for every λ > 0 it holds that Pr[χ(G_{1/2,v}) ≥ k/2 − λ√k] > 1 − e^{−λ²/2}.

Theorem 1.7. Let G = (V, E) be a graph with m edges, and let k = χ(G). Then, for every α ∈ (0, 1) it holds that Pr[χ(G_{1/2,e}) ≥ max{Ω_α(k^{1/3}), Ω_α(k/m^{1/4})}] > 1 − α.

⁴The notation O_α(f(n)) (resp. Ω_α(f(n))) means that O(f(n)) (resp. Ω(f(n))) holds for fixed α.

In Theorem 1.7, the Ω_α(χ(G)/m^{1/4}) lower bound is better when χ(G) = ω(m^{3/8}), and the Ω_α(χ(G)^{1/3}) lower bound is better when χ(G) = o(m^{3/8}).

We also obtain lower bounds on the chromatic numbers of G_{p,v} and G_{p,e} for p < 1/2 by composing the bounds in Theorems 1.6 and 1.7 ⌈log₂(1/p)⌉ times.

In the Gap-Coloring(q, Q) problem we are given an n-vertex graph G and the goal is to distinguish between the YES-case where G is q-colorable, and the NO-case where the chromatic number of G is at least Q. There is a large body of work proving hardness results for this problem [18, 23, 21], including stronger results assuming variants of the Unique Games Conjecture [10, 12]. The strongest NP-hardness result of this form, due to Huang [21], shows that Gap-Coloring(q, exp(Ω(q^{1/3}))) is NP-hard. Combining this with Theorem 1.7 we obtain an analogous hardness result under noise-robust reductions for this problem.

Theorem 1.8. For all q < Q the Gap-Coloring(q, Q) problem is strongly-noise-robust to the Gap-Coloring(q, Ω(Q^{1/3})) problem, where noise is 1/2-edge-percolation applied to the graph.

In particular, for any sufficiently large constant q, given a q-colorable graph G it is NP-hard to find an exp(Ω(q^{1/3}))-coloring of G_{1/2,e} with high probability.

Satisfiability and other CSP’s We prove hardness of approximating the value of percolated k-CSP instances. An instance Φ of k-CSP over some alphabet Σ (e.g., Σ = {0, 1}) is a formula consisting of a collection of clauses C₁, ..., C_m over n variables x₁, ..., x_n taking values in Σ, where each clause is associated with some k-ary predicate f : Σ^k → {0, 1} over variables x_{i₁}, ..., x_{i_k}. An instance Φ is said to be simple if all clauses in Φ are distinct.

Given an assignment σ : {x₁, ..., x_n} → Σ, we say that the constraint C on the variables x_{i₁}, ..., x_{i_k} is satisfied by σ if f_C(σ(x_{i₁}), ..., σ(x_{i_k})) = 1, where f_C is the predicate corresponding to C. Given a formula Φ and an assignment σ to its variables, the value of Φ with respect to σ, denoted by val_σ(Φ), is the fraction of constraints of Φ satisfied by σ. The value of Φ is defined as val(Φ) = max_σ val_σ(Φ). If val(Φ) = 1 we say that Φ is satisfiable.

We are typically interested in CSPs where constraints belong to some fixed family of predicates F. For example, in the k-SAT problem, the constraints are all of the form f(z₁, ..., z_k) = ⋁_{i=1}^{k} (z_i = b_i), for b₁, ..., b_k ∈ {0, 1}. We assume that k, the arity of the constraints, is some fixed constant that does not depend on the number of variables n.

These definitions give rise to the following optimization problem: given a CSP instance Φ, find an assignment that maximizes the value of Φ. We refer to this maximization problem as Max-CSP-F, where F denotes the family of predicates constraints are taken from. For 0 < s < c ≤ 1, Gap-CSP-F(c, s) is the promise problem where YES-instances are formulas Φ such that val(Φ) ≥ c, and NO-instances are formulas Φ such that val(Φ) ≤ s. Here we assume the constraints of CSP instances are restricted to be in the family F.

We study two models of percolation on instances of CSP: clause percolation and variable percolation. Given an instance Φ of CSP, its clause percolation is a random formula Φ_p^c over the same set of variables, obtained from Φ by keeping each clause of Φ independently with probability p.
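For small instances, the quantity val(Φ) = max_σ val_σ(Φ) can be computed by brute force over all assignments; a minimal sketch (our own illustrative helper, exponential in n, where each clause is a pair of a predicate and a tuple of variable indices):

```python
from itertools import product

def val(clauses, n, domain=(0, 1)):
    # val(Phi): maximum, over all assignments sigma in domain^n, of the
    # fraction of clauses (f, idx) with f(sigma[idx_1], ..., sigma[idx_k]) = 1.
    best = 0.0
    for sigma in product(domain, repeat=n):
        sat = sum(f(*(sigma[i] for i in idx)) for f, idx in clauses)
        best = max(best, sat / len(clauses))
    return best
```

A satisfiable 2-clause instance has value 1, while the contradictory pair {x, ¬x} has value 1/2.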

Theorem 1.9. Let ε, δ ∈ (0, 1) be fixed constants. There is a polynomial time reduction that, given a simple unweighted instance Φ of Max-CSP-F, outputs a simple unweighted instance Ψ of Max-CSP-F on n variables such that val(Ψ) = val(Φ), and for any p > 1/n^{k−1−δ} the following holds.

1. If val(Φ) = 1, then val(Ψ_p^c) = 1 with probability 1.

2. If val(Φ) < 1, then with high probability |val(Ψ_p^c) − val(Φ)| < ε.

This immediately implies the following corollary.

Corollary 1.10. Let F be a collection of Boolean constraints of arity k, and suppose that for some 0 < s < c ≤ 1 the problem Gap-CSP-F(c, s) is NP-hard. Then Gap-CSP-F(c − ε, s + ε) is NP-hard under a noise-robust reduction, where noise is p-clause percolation with p > 1/n^{k−1−δ}, with n denoting the number of variables in a given formula, and ε, δ > 0 arbitrary constants.

One ingredient of the proof of Theorem 1.9 that may be of independent interest is establishing that k-CSP does not admit a non-trivial approximation on formulas with n^{k−η} clauses, where η > 0 is an arbitrarily small positive constant.

We also consider variable percolation. Given an instance Φ of CSP we consider a random formula Φ_p^v whose set of variables is a subset S of the variables of Φ, where each variable of Φ is in S independently with probability p ∈ (0, 1), and the clauses of Φ_p^v are all clauses of Φ induced by S. In other words, a clause C of Φ survives if and only if all variables of C are in S. Using ideas similar to those used for clause percolation, we show that p-variable percolated instances are essentially as hard as in the worst case for p > 1/n^{1−δ} for any δ ∈ (0, 1).

Other Problems For the Minimum Vertex Cover problem, we prove that for any α and δ > 0, an algorithm that gives an α-approximation for percolated instances also implies an (α − δ)-approximation for worst-case instances. Our results hold for both edge and vertex percolation, where the edges or the vertices of a given graph remain with probability p > 1/n^{1−ε} for some ε ∈ (0, 1). In particular, assuming the Unique Games Conjecture, the result of [24] implies that (2 − δ)-approximation for the Minimum Vertex Cover problem is NP-hard under a noise-robust reduction for any constant δ > 0.

For the Hamiltonicity problem, we prove hardness results for percolated instances of both directed and undirected graphs with respect to edge percolation. We show that the problem where one needs to determine whether a graph contains a Hamiltonian cycle is also hard for percolated graphs, where each edge of a given graph is kept in the graph with probability p > 1/n^{1−ε} for any ε ∈ (0, 1).

We additionally study percolation on instances of the Subset-Sum problem, where each item of the set is deleted with probability 1 − p. We show that the problem remains hard as long as p = Ω(1/n^{1/2−ε}), where ε ∈ (0, 1/2) is an arbitrary constant, and n is the number of items in the given instance.

1.4 Related Work

There is a wide body of work on random discrete structures that has produced a rich set of mathematical tools [5, 17, 19, 26]. Randomly subsampling subgraphs by including each edge independently in the sample with probability p has been studied extensively in works concerned with cuts and flows (e.g., [22]). More recently, sampling subgraphs has been used to find independent sets [15]. The effect of subsampling variables in mathematical relaxations of constraint satisfaction problems on the value of these relaxations was studied in [4].

Edge-percolated graphs have also been used to construct hard instances for specific algorithms. For example, Kučera [25] proved that the well-known greedy coloring algorithm performs poorly on the complete r-partite graph in which every edge is removed independently with probability 1/2

and r = n^ε for ε > 0. Namely, for this graph G, even if vertices are considered in a random order by the greedy algorithm, with high probability Ω(n/log n) colors are used to color the percolated graph, whereas χ(G) ≤ n^ε.

Misra [27] studied edge percolated instances of the Max-Cut problem. He proves that, assuming NP ≠ BPP, there is no polynomial time algorithm that computes the size of the maximum cut in G_{p,e} for any p > (1+ε)/(d−1) in graphs of maximal degree d. This result is tight in the sense that it is easy to show that once p < (1−ε)/(d−1), with high probability G_{p,e} breaks into connected components of logarithmic size, and hence can be solved optimally in polynomial time. The techniques used in [27] differ from ours and rely on a recent hardness result for counting independent sets in sparse graphs [32].

Bukh [7] has considered coloring edge-percolated graphs, and states the question of whether E[χ(G_{1/2,e})] = Ω(χ(G)/log(χ(G))) as an “interesting problem.” Bukh observed that the chromatic number of G_{1/2,e} has the same distribution as the chromatic number of its complement in G, and therefore E[χ(G_{1/2,e})] ≥ √χ(G). However, it is not clear how to leverage the lower bound on the expectation to obtain a lower bound on χ(G_{1/2,e}) with high probability, which is required for our noise-robust reductions. For instance, standard martingale methods do not seem to imply that χ(G_{1/2,e}) ≥ Ω(√χ(G)) holds with high probability.

1.5 Preliminaries

An independent set in a graph G = (V, E) is a set of vertices that spans no edges. The independence number α(G) denotes the maximum size of an independent set in G. A legal coloring of a graph G is an assignment of colors to vertices such that no two adjacent vertices share the same color. The chromatic number χ(G) denotes the minimum number of colors needed for a legal coloring of G. Note that in a legal coloring of G each color class forms an independent set, and hence χ(G) · α(G) ≥ n.

A vertex cover in a graph G = (V, E) is a set of vertices S ⊆ V such that every edge e ∈ E is incident to at least one vertex in S. Note that a subset of the vertices S ⊆ V is an independent set in G if and only if V \ S is a vertex cover. In particular, G contains a vertex cover of size k if and only if it contains an independent set of size n − k.

We will always measure the running time of algorithms in terms of the size of the percolated instance. Since G and G_{p,e} have the same number of vertices, this generally does not affect the size of the instance by more than a polynomial factor. On the other hand, G_{p,v} may be much smaller than G for very small values of p. However, in this work we will only be dealing with the case where p = 1/n^{1−Ω(1)}, hence with high probability the sizes of the vertex percolated and original graphs are polynomially related as well.

We will use the following version of the Chernoff bound.

Lemma 1.11 (Chernoff bound, Theorem 7.3.2 in [20]). Let x₁, ..., x_n be independent Bernoulli trials with Pr[x_i = 1] = p, and let µ = E[Σ_{i=1}^n x_i] = pn. Let r ≥ e². Then

Pr[Σ_{i=1}^n x_i > (1 + r)µ] < exp(−(µr/2) · ln r).
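As a quick numerical sanity check of Lemma 1.11 (our own Monte Carlo sketch, not part of the paper), the empirical tail of a sum of Bernoulli trials can be compared against the stated bound:

```python
import math
import random

def chernoff_bound(mu, r):
    # The upper bound of Lemma 1.11, valid for r >= e^2.
    return math.exp(-(mu * r / 2) * math.log(r))

def empirical_tail(n, p, r, trials=200, rng=random.Random(1)):
    # Fraction of trials in which sum(x_i) exceeds (1 + r) * mu.
    mu = p * n
    hits = sum(sum(rng.random() < p for _ in range(n)) > (1 + r) * mu
               for _ in range(trials))
    return hits / trials
```

For n = 1000, p = 0.01 (so µ = 10) and r = e², the bound is astronomically small, and the empirical tail over a few hundred trials is 0.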

2 Maximum Independent Set and Percolation

In this section we demonstrate the hardness of approximating α(G) and χ(G) in both edge percolated and vertex percolated graphs. We base our results on a theorem of Feige and Kilian, saying that for every fixed ε > 0 the problem Coloring-vs-MIS(n^ε, n^ε) is NP-hard.

Theorem 2.1 ([14]). For every ε > 0 it is NP-hard to decide whether a given n-vertex graph G satisfies χ(G) ≤ n^ε or α(G) ≤ n^ε.

Edge percolation Below we prove Theorem 1.4. We will use the following lemma, due to Turán (see, e.g., [2]).

Lemma 2.2. Every graph H with l vertices and e edges contains an independent set of size at least l²/(2e + l).

As a corollary we observe that if a graph contains no large independent sets, then it also cannot contain large subsets of the vertices that span a small number of edges.

Corollary 2.3. Let G = (V, E) be an n-vertex graph satisfying α(G) ≤ k. Then every set of vertices of size l ≥ k spans at least l(l − k)/2k edges.

Proof. Let H be a subgraph of G induced by l vertices, and suppose that H spans e edges. Then, by Lemma 2.2 we have α(H) ≥ l²/(2e + l). On the other hand, α(H) ≤ α(G) ≤ k, and hence l²/(2e + l) ≤ k, which rearranges to e ≥ l(l − k)/2k, as required.
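Turán's bound in Lemma 2.2 is achieved constructively by the classical minimum-degree greedy procedure; a small sketch of it (our own illustration, with the graph given as an adjacency dictionary):

```python
def greedy_independent_set(adj):
    # Repeatedly pick a vertex of minimum degree in the surviving graph and
    # delete it together with its neighborhood; the resulting independent set
    # has size at least l^2 / (2e + l), matching Turan's bound.
    live = set(adj)
    indep = set()
    while live:
        v = min(live, key=lambda u: len(adj[u] & live))
        indep.add(v)
        live -= adj[v] | {v}
    return indep
```

On the 5-cycle (l = 5, e = 5) the bound guarantees an independent set of size at least 25/15, i.e., 2, which the greedy procedure attains.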

We are now ready to prove Theorem 1.4, saying that for any n-vertex graph G = (V, E) it holds that with high probability α(G_{p,e}) ≤ O((α(G)/p) · log(np)).

Proof of Theorem 1.4. For a given graph G, let k = α(G) + 1 and let C > 0 be a large enough constant. By Corollary 2.3, every subset of size l = C · (α(G)/p) · log(np) spans at least l(l − k)/2k edges in G. Hence, by taking a union bound over all subsets of size l, the probability that there exists a set of size l in G_{p,e} that spans no edge is at most

\binom{n}{l} · (1 − p)^{l(l−k)/(2k)} < (en/l)^l · exp(−p · l(l−k)/(2k)) < (np)^{−Ω(l)},

where the last inequality uses the choices of l and k, implying that (en/l)^l < (np)^l and exp(−p · l(l−k)/(2k)) < exp(−Ω(l · log(np))) = (np)^{−Ω(l)}. Therefore, with high probability α(G_{p,e}) ≤ C · (α(G)/p) · log(np).

Theorem 1.5 follows immediately from Theorem 1.4.

Proof of Theorem 1.5. Let G be an instance of the Coloring-vs-MIS(q, a) problem. For the YES-case, if χ(G) ≤ q, then clearly χ(G_{p,e}) ≤ q. For the NO-case, by Theorem 1.4, if α(G) ≤ a, then with high probability α(G_{p,e}) ≤ O((a/p) · log(np)), which implies the strongly-noise-robust hardness. The “in particular” part follows immediately from Theorem 2.1.

Remark. Note that for constant p > 0 (e.g., p = 1/2) this theorem establishes inapproximability for the independence number of G_{p,e} that matches the inapproximability for the worst case.

Remark. Note also that for p > 1/n^{1−ε} (in fact, for p > log(n)/n) such random percolated graphs have maximal degree at most O(pn) with high probability. Therefore, such graphs G_{p,e} can be colored efficiently using O(pn) colors. In particular, with high probability G_{p,e} contains an independent set of size Ω(1/p), and hence the independence number can be approximated within a factor of pn on p-percolated instances.

Vertex percolation Next we handle vertex percolation. We show that approximating α(G) and χ(G) on vertex percolated instances with p > 1/n^{1−δ} is essentially as hard as on worst-case instances. Here n is the number of vertices in the graph and δ ∈ (0, 1) does not depend on n. We show this again by relying on the hardness of the gap problem Coloring-vs-MIS for percolated instances. Note that in the case of vertex percolation, the (in)approximability guarantee should depend on the number of vertices in the percolated graph G_{p,v}, and not on the number in the original graph.

Theorem 2.4. The Coloring-vs-MIS(q, a) problem is strongly-noise-robust to itself, where noise is vertex percolation with p > 0.

In particular, for any δ, ε > 0, unless NP ⊆ BPP there is no polynomial time algorithm that approximates either α(G_{p,v}) or χ(G_{p,v}) within an (n′)^{1−ε} factor, where n′ denotes the number of vertices in G_{p,v}, for any p > 1/n^{1−δ}.

Proof. The strong robustness of Coloring-vs-MIS(q, a) is clear, since for any graph G, if G′ is a vertex induced subgraph of G, then χ(G′) ≤ χ(G) and α(G′) ≤ α(G), which is, in particular, true for G′ ∼ G_{p,v}.

For the “in particular” part, let p > 1/n^{1−δ} and let c = log(pn)/log(n) ∈ (δ, 1), so that p = 1/n^{1−c}. Let η = ε · c. Let G be an n-vertex graph, let G_{p,v} be its vertex-percolated subgraph, and let n′ be the number of vertices in G_{p,v}. By the Chernoff bound in Lemma 1.11, with high probability we have |n′ − pn| < 0.1pn, and so we assume from now on that n^η < 2(n′)^ε.

By Theorem 2.1 it is NP-hard to decide whether a given n-vertex graph G satisfies χ(G) ≤ n^η or α(G) ≤ n^η. By the choice of parameters, if χ(G) ≤ n^η then χ(G_{p,v}) ≤ n^η < 2(n′)^ε, and similarly, if α(G) ≤ n^η then α(G_{p,v}) ≤ n^η < 2(n′)^ε. This completes the proof of the theorem.

3 Graph Coloring and Percolation

We present our results in terms of the maximum coverage problem (see, for example, [35]), which is a variant of the set cover problem, and show later how graph coloring is related to maximum coverage.

3.1 Maximum Coverage

In the maximum coverage problem we are given a family of sets F = {S₁, ..., S_m} with S_i ⊆ [n] and a number c. The goal is to find c sets in F such that the cardinality of the union of these c sets is as large as possible. We will make use of the representation of a set S in terms of its incidence vector x(S) ∈ {0, 1}^n. In this way, we can reformulate the maximum coverage problem as follows. Given A ⊆ F₂^n, find elements y₁, ..., y_c ∈ A that maximize ‖⋁_{i=1}^c y_i‖₁, the Hamming weight of the bitwise-OR of the vectors.
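The bitwise-OR reformulation lends itself to a direct brute-force sketch using integer bitmasks (our own toy code, exponential in the number of sets and for illustration only):

```python
from itertools import combinations

def or_all(masks):
    # Bitwise-OR of a collection of incidence masks.
    out = 0
    for m in masks:
        out |= m
    return out

def max_coverage(sets, c):
    # Brute-force maximum coverage: encode each S_i as an incidence bitmask
    # and pick the c sets whose bitwise-OR has the largest Hamming weight.
    masks = [sum(1 << i for i in s) for s in sets]
    return max(bin(or_all(ms)).count("1") for ms in combinations(masks, c))
```

For example, with F = {{0, 1}, {1, 2}, {3}} and c = 2, any pair covers 3 elements.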

We will prove two existential results saying that if A is of constant density α > 0, then there exists a good cover using only 2 or 3 vectors.

Lemma 3.1. Let A ⊆ F_2^n with |A| = α·2^n. Then there exist y_1, y_2, y_3 ∈ A such that ‖y_1 ∨ y_2 ∨ y_3‖_1 ≥ n − 4/α^3.

Lemma 3.2. Let A ⊆ F_2^n with |A| = α·2^n. Then there exist y_1, y_2 ∈ A such that ‖y_1 ∨ y_2‖_1 ≥ n − (1 + r)·√n, where r = max{e^2, 2 ln(1/α)}.

3.2 Proof of Lemma 3.1 using additive combinatorics Lemma 3.1 follows almost immediately from a result about sumsets. Recall that the Minkowski sum of two sets A, B is defined as A + B = {x + y : x ∈ A, y ∈ B}.

Lemma 3.3 (Corollary 3.5 in [31]). Let A ⊆ F_2^n with |A| = α·2^n. Then A + A + A contains an affine subspace of dimension at least n − 4/α^3.

Because an affine subspace of dimension at least n − 4/α^3 must contain an element of Hamming weight at least n − 4/α^3, Lemma 3.1 follows from Lemma 3.3 and the observation that ‖∑_{i=1}^c y_i‖_1 ≤ ‖∨_{i=1}^c y_i‖_1, where the sum is over F_2. We note that using the recent result of Sanders on the quasi-polynomial Freiman-Ruzsa Theorem [29] we can also get a better dependence on α by adding an additional copy of A.

Lemma 3.4 (Theorem A.1 in [29]). Let A ⊆ F_2^n with |A| = α·2^n. Then A + A + A + A contains an affine subspace of dimension at least n − O(log^4(1/α)).

3.3 Proof of Lemma 3.2 using Fourier analysis We use an inequality from Fourier analysis to give a proof of Lemma 3.2 via the probabilistic method.

Definition 3.5. Given x ∈ F_2^n, define y ∼ N_ρ(x) by letting each y_i be equal to x_i with probability (1+ρ)/2, and be equal to 1 − x_i with probability (1−ρ)/2, independently for each i. Let Uni(S) denote the uniform distribution on a set S, and let U_n denote Uni(F_2^n). The following lemma is a corollary of the reverse Bonami-Beckner inequality.
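The noise distribution N_ρ is easy to sample directly from the definition. A minimal sketch (the helper name is ours, not from the paper), with bit vectors as Python lists:

```python
import random

def noise_operator(x, rho):
    """Sample y ~ N_rho(x): each bit of x is kept with probability
    (1 + rho)/2 and flipped with probability (1 - rho)/2, independently."""
    return [xi if random.random() < (1 + rho) / 2 else 1 - xi for xi in x]

# Sanity checks: rho = 1 keeps x unchanged; rho = -1 flips every bit.
x = [0, 1, 1, 0]
assert noise_operator(x, 1.0) == x
assert noise_operator(x, -1.0) == [1, 0, 0, 1]
```

Note that ρ = 0 gives y that is uniform and independent of x, which is the other extreme of the interpolation.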

Lemma 3.6 (Corollary 3.5 in [28]). Let A, B ⊆ F_2^n with |A| = |B| = α·2^n. Then

Pr_{x←Uni(A), y←N_ρ(x)}[y ∈ B] ≥ α^{(1+ρ)/(1−ρ)}.

Proof of Lemma 3.2. Let A ⊆ F_2^n with |A| = α·2^n, and let B = A + 1⃗ = {x + 1⃗ : x ∈ A}, where 1⃗ is the n-dimensional all-1s vector. Note that to prove Lemma 3.2 it suffices to show that there exist x ∈ A, y ∈ B such that ‖x + y‖_1 ≤ (1 + r)·√n, since then y + 1⃗ ∈ A and ‖x ∨ (y + 1⃗)‖_1 ≥ ‖x + (y + 1⃗)‖_1 = n − ‖x + y‖_1 ≥ n − (1 + r)·√n. Let ε = 1/√n and let ρ = 1 − 2ε. By Lemma 3.6,

Pr_{x←U_n, y←N_ρ(x)}[x ∈ A, y ∈ B] = Pr_{x←U_n}[x ∈ A] · Pr_{x←Uni(A), y←N_ρ(x)}[y ∈ B] ≥ α·α^{(1+ρ)/(1−ρ)} = α^{2/(1−ρ)} = α^{√n}.  (1)

Set r = max{e^2, 2 ln(1/α)}. Note that by the definition of y ∼ N_ρ(x) we have Pr[x_i ≠ y_i] = (1 − ρ)/2 = 1/√n for each i independently. Therefore, by the Chernoff bound in Lemma 1.11,

Pr_{x←U_n, y←N_ρ(x)}[‖x + y‖_1 ≤ (1 + r)·√n] ≥ 1 − e^{−(r·ln(r)/2)·√n} ≥ 1 − α^{2√n}.  (2)

Since the sum of the probabilities in Equations (1) and (2) is strictly greater than 1, the corresponding events cannot be disjoint. Hence there exist x ∈ A, y ∈ B such that ‖x + y‖_1 ≤ (1 + r)·√n.

3.4 Coloring Using Subgraphs We now show how to apply the results in the previous subsection to the graph coloring problem. Throughout this section we let G = (V,E) with n = |V | , m = |E|. We will identify the elements of [n] with vertices V in the vertex percolation case and the elements of [m] with edges E in the edge percolation case. Let G|U denote the subgraph of G induced by U ⊆ V .

Lemma 3.7. Let G = (V, E) and let V_1, V_2 ⊆ V with V_1 ∪ V_2 = V. If χ(G|_{V_1}) ≤ k_1 and χ(G|_{V_2}) ≤ k_2, then χ(G) ≤ k_1 + k_2.

Proof. Assume that V_1 ∩ V_2 = ∅ (if not, replace V_1 with V_1 \ V_2 in the following argument). Color G|_{V_1} with k_1 colors and color G|_{V_2} with k_2 fresh colors. Because G|_{V_1} and G|_{V_2} are colored with disjoint sets of colors, any edge between V_1 and V_2 has endpoints with distinct colors.

Lemma 3.8. Let G = (V, E), let E_1, E_2 ⊆ E with E_1 ∪ E_2 = E, and let G_1 = (V, E_1), G_2 = (V, E_2). If χ(G_1) ≤ k_1 and χ(G_2) ≤ k_2, then χ(G) ≤ k_1·k_2.

Proof. Let c1 be a coloring of G1 with k1 colors, and let c2 be a coloring of G2 with k2 colors. We claim that the coloring c(v) = (c1(v), c2(v)) is a legal coloring of G with k1k2 colors. Consider an edge e = (u, v) ∈ E. If e ∈ E1 then c(u) differs from c(v) in the first coordinate. Otherwise e ∈ E2 in which case c(u) differs from c(v) in the second coordinate.
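The product coloring in this proof is a one-liner. A minimal sketch (our helper names; the 4-cycle split into two matchings is a toy example):

```python
def product_coloring(c1, c2):
    """Combine colorings of G1 = (V, E1) and G2 = (V, E2) into a coloring
    of G = (V, E1 ∪ E2) using pairs of colors, as in Lemma 3.8."""
    return {v: (c1[v], c2[v]) for v in c1}

def is_proper(coloring, edges):
    return all(coloring[u] != coloring[v] for u, v in edges)

# Toy example: a 4-cycle 0-1-2-3-0 split into two matchings E1 and E2.
e1 = [(0, 1), (2, 3)]
e2 = [(1, 2), (3, 0)]
c1 = {0: 0, 1: 1, 2: 0, 3: 1}   # proper 2-coloring of (V, E1)
c2 = {0: 0, 1: 0, 2: 1, 3: 1}   # proper 2-coloring of (V, E2)
c = product_coloring(c1, c2)
assert is_proper(c1, e1) and is_proper(c2, e2)
assert is_proper(c, e1 + e2)     # proper for the union, with <= k1*k2 = 4 colors
```

Every edge lies in E_1 or E_2, so the combined pair-color differs in the first or second coordinate, exactly as argued above.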

3.5 Lower Bounding the Chromatic Number We now prove lower bounds on the chromatic number of percolated graphs. We will consider both vertex and edge percolation with p = 1/2. This choice of p is important because G_{1/2,v} and G_{1/2,e} are the distributions of graphs induced by uniformly random subsets of V and E, respectively. However, it is easy to obtain bounds for p < 1/2 by composing the bounds for p = 1/2. When stating bounds based on Lemma 3.2 we set r = max{e^2, 2 ln(1/α)}.

The idea will be to argue that if many subgraphs of a graph G are k-colorable, then G is colorable with f(k) colors for a relatively small f(k). To see how this idea works, consider the following easy case. Suppose that Pr[χ(G_{1/2,v}) ≤ k] > 1/2. Then there exists V′ ⊆ V such that G|_{V′} and G|_{V\V′} are both k-colorable. It follows that G is 2k-colorable by Lemma 3.7. We now consider the case where the density α of k-colorable subgraphs is less than 1/2.

Vertex percolation We now prove Theorem 1.6, saying that if G is an n-vertex graph with k = χ(G), then for every λ > 0 it holds that Pr[χ(G_{1/2,v}) ≥ k/2 − λ·√k] > 1 − e^{−λ²/2}.

Proof of Theorem 1.6. Note first that if k = χ(G), then E[χ(G_{1/2,v})] ≥ k/2. Indeed, let Ḡ_{1/2,v} be the subgraph of G induced by the vertices that are not in G_{1/2,v}. Observe that Ḡ_{1/2,v} has the same distribution as G_{1/2,v}, and hence, by Lemma 3.7, we have

E[χ(G_{1/2,v})] = (1/2)·E[χ(G_{1/2,v}) + χ(Ḡ_{1/2,v})] ≥ k/2.  (3)

Next we define a martingale 0 = X_0, X_1, ..., X_k as follows. Fix some k-coloring of G, and for each i = 1, ..., k let X_i be the chromatic number of the percolated subgraph induced by the first i color classes. It is clear that |X_{i−1} − X_i| ≤ 1 for all i = 1, ..., k, and X_k = χ(G_{1/2,v}). Therefore, by Azuma's inequality (see, e.g., Theorem 7.2.1 in [2]) it follows that Pr[X_k ≤ E[X_k] − λ·√k] < e^{−λ²/2}. This concludes the proof of the theorem.

Edge percolation For a random G_{1/2,e}, let Ḡ_{1/2,e} be the graph obtained from G by removing all edges of G_{1/2,e}. By observing that G_{1/2,e} and Ḡ_{1/2,e} have the same distribution, and using Lemma 3.8, we get that

E[χ(G_{1/2,e})] = (1/2)·E[χ(G_{1/2,e}) + χ(Ḡ_{1/2,e})] ≥ E[√(χ(G_{1/2,e})·χ(Ḡ_{1/2,e}))] ≥ √(χ(G)) = √k,

analogous to the bound in (3) for vertex percolation. However, using the martingale as above, Azuma's inequality implies only that Pr[χ(G_{1/2,e}) ≤ (1 − λ)·√k] < e^{−λ²/2}, which is not enough to prove that χ(G_{1/2,e}) is large with high probability.

Below we use alternative techniques to prove Theorem 1.7, asserting that a weaker bound on χ(G_{1/2,e}) holds with probability 1 − α for any α > 0. To the best of our knowledge these techniques are new to this area, and may be of independent interest.

Lemma 3.9. If Pr[χ(G_{1/2,e}) ≤ k] ≥ α, then χ(G) ≤ k^3 + 8/α^3.

Proof. Identify subsets of the edges E with vectors in F_2^m. Because Pr[χ(G_{1/2,e}) ≤ k] ≥ α, by Lemma 3.1 there exist E_1, E_2, E_3 ⊆ E such that each G_i = (V, E_i) is k-colorable and |E_1 ∪ E_2 ∪ E_3| ≥ m − 4/α^3. Using Lemma 3.8 twice, we can then color the graph (V, E_1 ∪ E_2 ∪ E_3) with k^3 colors. We then color the endpoints of the at most 4/α^3 remaining edges in E \ (E_1 ∪ E_2 ∪ E_3) using 8/α^3 new colors to achieve a (k^3 + 8/α^3)-coloring of G.

The next lemma gives an unconditional upper bound on the chromatic number of a graph.

Lemma 3.10. Let G = (V, E) be a graph with |E| = m. Then χ(G) ≤ 3√m + 1.

Proof. Partition V into the sets V_0 = {v ∈ V : deg(v) < √m} and V_1 = {v ∈ V : deg(v) ≥ √m}. By Brooks' Theorem [6], χ(G|_{V_0}) ≤ max_{v∈V_0} deg(v) + 1 ≤ √m + 1. Furthermore, because ∑_{v∈V_1} deg(v) ≤ 2m, it follows that |V_1| ≤ 2√m, and in particular χ(G|_{V_1}) ≤ 2√m. The result follows by Lemma 3.7.
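The bound of Lemma 3.10 is easy to check mechanically on examples. The sketch below (our own helper names, not the lemma's partition argument) upper-bounds χ(G) by greedy coloring; if greedy uses k colors then the graph has at least k(k−1)/2 edges, so the check passes on any input:

```python
import math

def greedy_coloring(adj):
    """Greedy coloring in a fixed vertex order; returns a color per vertex."""
    colors = {}
    for v in adj:
        used = {colors[u] for u in adj[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors

def check_bound(edges):
    """Check chi(G) <= 3*sqrt(m) + 1 on one graph, via the greedy
    upper bound on chi (a one-sided check: greedy >= chi)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    m = len(edges)
    used = len(set(greedy_coloring(adj).values()))
    return used <= 3 * math.sqrt(m) + 1

# K_6 has m = 15 edges and chi = 6 <= 3*sqrt(15) + 1.
k6 = [(i, j) for i in range(6) for j in range(i + 1, 6)]
assert check_bound(k6)
```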

We use a variant of the same partitioning trick in the following lemma.

Lemma 3.11. If Pr[χ(G_{1/2,e}) ≤ k] ≥ α, then χ(G) ≤ (4 + 2r)·k·m^{1/4}.

Proof. Note first that if k ≥ m^{1/4}, then the claimed bound holds by Lemma 3.10. So we assume henceforth that k < m^{1/4}.

Identify subsets of the edges E with vectors in F_2^m. Because Pr[χ(G_{1/2,e}) ≤ k] ≥ α, by Lemma 3.2 there exist E_1, E_2 ⊆ E such that G_1 = (V, E_1), G_2 = (V, E_2) are k-colorable and |E_1 ∪ E_2| ≥ m − (1 + r)·√m. Let E_3 = E \ (E_1 ∪ E_2), and define the graph G_3 = (V, E_3). Define a partition U, Ū of V, where U = {v ∈ V : deg_{G_3}(v) < m^{1/4}/k} and Ū = {v ∈ V : deg_{G_3}(v) ≥ m^{1/4}/k}. We claim (1) that χ(G|_U) ≤ 2k·m^{1/4} and (2) that χ(G|_Ū) ≤ 2(1 + r)·k·m^{1/4}. By Lemma 3.7 we then get the upper bound χ(G) ≤ χ(G|_U) + χ(G|_Ū) ≤ (4 + 2r)·k·m^{1/4}.

To prove (1), note that by Brooks' Theorem [6] we have χ((G_3)|_U) ≤ 2m^{1/4}/k, and thus by Lemma 3.8, χ(G|_U) ≤ χ(G_1)·χ(G_2)·χ((G_3)|_U) ≤ 2k·m^{1/4}. For (2), note that ∑_{v∈Ū} deg_{G_3}(v) ≤ 2(1 + r)·√m, and hence χ(G|_Ū) ≤ |Ū| ≤ 2(1 + r)·k·m^{1/4}, as required.

Taking the contrapositive of Lemmas 3.9 and 3.11 implies Theorem 1.7.

Proof of Theorem 1.8

Finally, we use Theorem 1.7 to prove the strong robustness result for Gap-Coloring. Let G be an instance of the Gap-Coloring(q, Q) problem. We claim the following:

YES-case: If χ(G) ≤ q, then χ(G_{1/2,e}) ≤ q.

NO-case: If χ(G) ≥ Q, then χ(G_{1/2,e}) ≥ Ω(Q^{1/3}) with probability at least 0.99.

The YES-case is clear, since removing edges can only decrease the chromatic number. The NO-case follows from Theorem 1.7. Therefore, the Gap-Coloring(q, Q) problem is strongly-noise-robust to the Gap-Coloring(q, Ω(Q^{1/3})) problem. The "in particular" part of the theorem follows from the result of Huang [21] showing that Gap-Coloring(q, exp(Ω(q^{1/3}))) is NP-hard; note that Ω((exp(Ω(q^{1/3})))^{1/3}) is again of the form exp(Ω(q^{1/3})).

4 Constraint Satisfaction Problems and Percolation

In this section we deal with percolation in Constraint Satisfaction Problems (CSP).

Clause percolation We show that for k-CSP the problem of approximating the optimal value on p-percolated instances is essentially as hard as approximating it on a worst-case instance, as long as p > 1/n^{k−1−δ} for any constant δ > 0. To prove Theorem 1.9 we start with the following lemma.

Lemma 4.1. Let Φ be a simple unweighted k-CSP instance over an alphabet Σ with n variables and m clauses, and let p > C·n/(ε²·m) for some ε ∈ (0, 1) and some constant C > 0 that depends only on |Σ|. Then,

1. If val(Φ) = 1, then val(Φ_p^c) = 1.

2. If val(Φ) < 1, then with high probability |val(Φ_p^c) − val(Φ)| < ε.

Proof. The first item is clear, as any assignment that satisfies Φ will also satisfy Φ_p^c. For the second item, let m′ be the number of clauses in Φ_p^c. By the concentration bounds in Lemma 1.11 we have

Pr[|m′ − pm| > εpm] < e^{−Ω(ε²pm)} < e^{−Ω(C·n)},

where Ω(·) hides some absolute constant. Fix an assignment σ to the variables of Φ, and let s = val_σ(Φ). Then the number of clauses in Φ satisfied by σ is sm. Let S_σ denote the number of clauses in Φ_p^c satisfied by σ. Since we pick each clause with probability p independently, and recalling that p > C·n/(ε²·m), we have

Pr[|S_σ − spm| > εpm] < e^{−Ω(ε²pm)} < e^{−Ω(C·n)}.

Denoting by E the event that |m′ − pm| > εpm, we get

Pr[|val_σ(Φ_p^c) − s| > ε] = Pr[|S_σ − sm′| > εm′] ≤ Pr[E] + Pr[|S_σ − sm′| > εm′ | ¬E] ≤ 2e^{−Ω(C·n)}.

Suppose now that val(Φ) = s. If σ is an optimal assignment to Φ, i.e., val_σ(Φ) = s, then by the argument above we immediately have that val_σ(Φ_p^c) > s − ε with high probability. On the other hand, for any assignment σ′ it holds that Pr[val_{σ′}(Φ_p^c) > s + ε] < e^{−Ω(C·n)} for some sufficiently large C > 0, and by taking a union bound over all assignments σ′ we get

Pr[val(Φ_p^c) > s + ε] ≤ Pr[∃σ′ such that val_{σ′}(Φ_p^c) > s + ε] < 2e^{−Ω(C·n)}·|Σ|^n < c^n,

for some absolute constant c < 1.

We note that we assume in the proof above that s is a constant independent of n. This assumption is justified as s ≥ 1/|Σ|^k, and we assume that |Σ| and k are constants independent of n.

Next, we show a polynomial time reduction which, given a Max-CSP-F instance Φ, outputs a Max-CSP-F instance Ψ with n variables and at least n^{k−δ} clauses such that val(Ψ) = val(Φ). We use ideas similar to those of [9], where it is proved that unweighted instances of CSP problems are as hard to approximate as weighted ones.

Lemma 4.2. For any δ ∈ (0, 1) there is a polynomial time reduction which, given a simple unweighted Max-CSP-F instance Φ, outputs a simple unweighted Max-CSP-F instance Ψ with n variables and at least n^{k−δ} clauses such that val(Ψ) = val(Φ).

Proof. The reduction works as follows. Let R be a parameter to be chosen later. Given an instance Φ of k-CSP with M clauses over the variables x_1, ..., x_N, the reduction creates the following instance Ψ. For each variable x_i of Φ, the instance Ψ has a set of R corresponding variables X_i = {x_{i,j} : j ∈ [R]}, where we think of each variable in X_i as a copy of x_i. For each clause C of Φ we add to Ψ the R^k clauses obtained by taking the same constraint over each combination of the k variables from the corresponding X_i's. We call this set of R^k clauses the cloud corresponding to C. So Ψ has n = NR variables and m = M·R^k clauses. Therefore, if R > N^{k/δ}, then m > n^{k−δ}.

Next we claim that val(Φ) = val(Ψ). Clearly, val(Φ) ≤ val(Ψ), as any assignment σ : {x_1, ..., x_N} → Σ to Φ can be extended to an assignment τ to Ψ by letting τ(x_{i,j}) = σ(x_i) for all i ∈ [N], j ∈ [R].

In the other direction, let τ be an assignment to the variables of Ψ.5 For each i ∈ [N] and a ∈ Σ, let p_i^a = |{j ∈ [R] : τ(x_{i,j}) = a}|/R be the fraction of the x_{i,j}'s that are assigned the value a. Construct an assignment σ to the variables of Φ randomly, by setting σ(x_i) = a with probability p_i^a, independently for each x_i. Equivalently, we can choose one of the R copies of x_i in Ψ uniformly at random and assign to x_i the value assigned by τ to the chosen variable. Then, for each clause C of Φ, the probability that σ satisfies C is equal to the fraction of the clauses in the cloud corresponding to C in Ψ that are satisfied by τ. Denote by SAT_τ(C_i) the number of clauses that are satisfied by τ in the cloud corresponding to C_i. Since each clause of Φ corresponds to the same number of clauses in Ψ, it follows that the expected value of Φ under the assignment σ is

E[val_σ(Φ)] = (1/M)·∑_{i=1}^M Pr[σ satisfies C_i] = (1/M)·∑_{i=1}^M SAT_τ(C_i)/R^k = val_τ(Ψ).

Hence, there exists an assignment σ to the variables of Φ such that valσ(Φ) ≥ valτ (Ψ), and thus val(Φ) ≥ val(Ψ), as required.
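The cloud construction in the proof of Lemma 4.2 can be sketched in a few lines. The encoding below (clauses as variable-tuple/predicate pairs) is our own toy representation, not from the paper:

```python
from itertools import product

def blow_up(clauses, R):
    """Replace each variable v by R copies (v, j), and each clause by its
    'cloud': the same predicate over every combination of copies.
    A clause is a pair (variable_tuple, predicate)."""
    cloud_clauses = []
    for vs, pred in clauses:
        for copies in product(range(R), repeat=len(vs)):
            cloud_clauses.append((tuple(zip(vs, copies)), pred))
    return cloud_clauses

def value(clauses, assignment):
    """Fraction of satisfied clauses."""
    sat = sum(pred(*(assignment[v] for v in vs)) for vs, pred in clauses)
    return sat / len(clauses)

# Toy 2-CSP: a single inequality constraint x0 != x1 over {0, 1}.
phi = [((0, 1), lambda a, b: a != b)]
psi = blow_up(phi, R=3)
assert len(psi) == 3 ** 2        # one cloud of R^k clauses, here k = 2

# Extending an assignment copy-wise preserves the value, so val(Psi) >= val(Phi).
sigma = {0: 0, 1: 1}
tau = {(v, j): sigma[v] for v in sigma for j in range(3)}
assert value(psi, tau) == value(phi, sigma) == 1.0
```

The converse direction (decoding τ back into a random σ) corresponds to sampling one copy per variable uniformly, as in the proof above.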

Theorem 1.9 follows immediately from Lemmas 4.1 and 4.2. We observe that Lemma 4.2 is unlikely to hold for Max-CSP-F instances with arity k and Ω(n^k) constraints, as, for example, the value of a 3-SAT formula with Ω(n^3) clauses can be (1 − δ)-approximated for every δ ∈ (0, 1) in polynomial time [3].

Variable percolation Next we show that Max-CSP-F is also hard under variable percolation. We prove below that for p that is not too small, with high probability Max-CSP-F is hard to approximate on percolated instances within the same factor as in the worst-case setting.

Theorem 4.3. Let ε, δ > 0 be fixed constants. There is a polynomial time reduction which, given a simple unweighted instance Φ, outputs a simple unweighted instance Ψ on n variables with the same constraints, such that val(Ψ) = val(Φ), and furthermore for any p > 1/n^{1−δ} the following holds.

1. If val(Φ) = 1, then val(Ψ_p^v) = 1 with probability 1.

2. If val(Φ) < 1, then with high probability |val(Ψ_p^v) − val(Φ)| < ε.

The following corollary is the analogue of Corollary 1.10 for variable percolation.

Corollary 4.4. Let F be a collection of Boolean constraints of arity k, and suppose that for some 0 < s < c ≤ 1 the problem Gap-CSP-F(c, s) is NP-hard. Then Gap-CSP-F(c − ε, s + ε) is NP-hard under a robust reduction with respect to variable percolation with any parameter p > 1/n^{1−δ}, where n denotes the number of variables in a given formula, and ε, δ > 0 are arbitrary constants.

5 Note that if for each i ∈ [N] the assignment τ gave the same value to all variables in Xi, this would naturally induce a corresponding assignment to Φ. However, this need not be the case in general.

Proof of Theorem 4.3. The reduction is the same as in the proof of Theorem 1.9. Namely, given a simple unweighted instance Φ with N variables and M clauses, the reduction replaces each variable x_i of Φ with a set of R corresponding variables X_i = {x_{i,j} : j ∈ [R]}, and replaces each clause of Φ with a cloud of R^k corresponding clauses, by taking all possible combinations of the variables from the corresponding X_i's. That is, the output Ψ of the reduction has n = NR variables and m = M·R^k clauses. We choose R = N^{1/c}, where c = log(pn)/log(n) ∈ (δ, 1), so that √(log(N)/(pR)) < 1/N^{c/2}.

For each i ∈ [N], let X_i′ be the set of variables from X_i that remain in Ψ_p^v after variable percolation. By the Chernoff bound in Lemma 1.11, it follows that for p > 1/N^{1−δ}, with high probability ||X_i′| − pR| < O(√(pR·log n)) for all i ∈ [N]. We assume from now on that this is indeed the case. For a constraint

C_i of Φ, let x_{i_1}, ..., x_{i_k} be the variables appearing in C_i. Then the number of clauses in the cloud corresponding to C_i in Ψ_p^v is equal to ∏_{j=1}^k |X_{i_j}′|, and the total number of clauses in Ψ_p^v is ∑_{i=1}^M ∏_{j=1}^k |X_{i_j}′|.

By Lemma 4.2 we have val(Ψ) = val(Φ). In particular, if Φ is satisfiable, then so is Ψ; since any assignment that satisfies Ψ also satisfies any subformula of Ψ, this implies that Ψ_p^v is satisfiable with probability 1.

Suppose now that val(Φ) < 1. We claim that with high probability |val(Ψ_p^v) − val(Φ)| < ε. To prove that val(Ψ_p^v) ≥ val(Φ) − ε, let σ be an optimal assignment to Φ. Extend σ to an assignment τ to Ψ_p^v by letting τ(x_{i,j}) = σ(x_i) for all i ∈ [N], j ∈ [R]. Note that for each constraint C_i of Φ, if C_i is satisfied by σ, then in Ψ_p^v all clauses in the corresponding cloud are satisfied, and otherwise no clause in the corresponding cloud is satisfied. Denoting by SAT_τ(C_i) the number of clauses that are satisfied by τ in the cloud corresponding to C_i, we have

val_τ(Ψ_p^v) = ∑_{i=1}^M SAT_τ(C_i) / ∑_{i=1}^M |X_{i_1}′|···|X_{i_k}′| ≥ val(Φ)·M·(pR − √(pR·log N))^k / (M·(pR + √(pR·log N))^k) ≥ val(Φ) − O(√(log(N)/(pR))).

By the choice of R we get, for large enough N,

val_τ(Ψ_p^v) ≥ val(Φ) − O(1/N^{c/2}) ≥ val(Φ) − ε.

Next, we prove that val(Φ) ≥ val(Ψ_p^v) − ε. Given an assignment τ to the variables of Ψ_p^v, we decode it into an assignment to Φ using the same decoding as in the proof of Lemma 4.2. Namely, we choose a random assignment σ to the variables of Φ by setting σ(x_i) = a with probability p_i^a, independently between the i's, where p_i^a = |{x_{i,j} ∈ X_i′ : τ(x_{i,j}) = a}| / |X_i′|. Let C_i′ be the set of clauses in the cloud corresponding to C_i that belong to Ψ_p^v, and let SAT_τ(C_i′) be the number of clauses in C_i′ that are satisfied by τ. It follows that the expected value of Φ under the assignment σ is

E[val_σ(Φ)] = (1/M)·∑_{i=1}^M Pr[σ satisfies C_i] = (1/M)·∑_{i=1}^M SAT_τ(C_i′) / (|X_{i_1}′|···|X_{i_k}′|).  (4)

On the other hand, we have

val_τ(Ψ_p^v) = ∑_{i=1}^M SAT_τ(C_i′) / ∑_{i=1}^M |X_{i_1}′|···|X_{i_k}′|.  (5)

Now, using the assumption that ||X_i′| − pR| < √(pR·log n) for all i ∈ [N], we get that both (4) and (5) are between ∑_{i=1}^M SAT_τ(C_i′) / (M·(pR + √(pR·log N))^k) and ∑_{i=1}^M SAT_τ(C_i′) / (M·(pR − √(pR·log N))^k). A simple computation reveals

that the difference between the two quantities is at most O(√(log(N)/(pR))), and hence

E[val_σ(Φ)] ≥ val_τ(Ψ_p^v) − O(√(log(N)/(pR))) ≥ val_τ(Ψ_p^v) − O(1/N^{c/2}) ≥ val_τ(Ψ_p^v) − ε.

This completes the proof of Theorem 4.3.

5 More NP-hard Problems and Percolation

In this section we study more NP-hard problems.

5.1 Vertex Cover and Percolation In the Minimum Vertex Cover problem we are given a graph G, and our goal is to find a vertex cover of G of minimum size. There is a simple 2-approximation algorithm for the Minimum Vertex Cover problem [35]. On the hardness side, the problem is NP-hard to approximate within a factor of 1.3606 [11], and assuming the Unique Games Conjecture it is hard to approximate within a (2 − ε) factor for any constant ε > 0 [24]. We prove that the same hardness results hold on percolated instances.

Edge percolation

We have the following simple lemma regarding independent sets in an edge-percolated subgraph of K_{R,R}.

Lemma 5.1. Consider the complete bipartite graph G = K_{R,R} with bipartition A, B. Then the probability that there is an independent set I in G_{p,e} such that |I ∩ A| = |I ∩ B| = C·log(R)/p is at most R^{−3}, where C is a large enough constant independent of n or p.

Proof. For fixed sets S_A ⊆ A and S_B ⊆ B, each of size C·log(R)/p, the probability that S_A and S_B span no edge is (1 − p)^{(C·log(R)/p)²}. Therefore, by a union bound over all S_A and S_B, the probability that there is an independent set I in G_{p,e} with |I ∩ A| = |I ∩ B| = C·log(R)/p is at most

(R choose C·log(R)/p)² · (1 − p)^{(C·log(R)/p)²} ≤ R^{2C·log(R)/p} · e^{−p·(C·log(R)/p)²},

which is at most R^{−3} for large enough C.

Consider the following Gap-Vertex-Cover(c, s) problem, where the YES-instances are graphs that have a vertex cover of size at most cn, and the NO-instances are all graphs whose minimum vertex cover is larger than sn, where n is the number of vertices in G. Note that, equivalently, the YES-instances are graphs that contain an independent set of size α(G) ≥ (1 − c)n, and the NO-instances are graphs whose maximum independent set is of size α(G) ≤ (1 − s)n. We remark that the result of Khot and Regev [24] proves that, assuming the Unique Games Conjecture, the problem Gap-Vertex-Cover(1/2 + ε, 1 − ε) is NP-hard for all constant ε > 0. We use this to show hardness of approximation for this problem on edge-percolated instances.

Theorem 5.2. Let ε, δ ∈ (0, 1) be fixed constants. Assuming the Unique Games Conjecture, Gap-Vertex-Cover(1/2 + ε, 1 − ε) is NP-hard under a noise-robust reduction, where noise is edge percolation with parameter p, for any p > 1/n^{1−δ}, where n denotes the number of vertices in the given graph. In particular, assuming the Unique Games Conjecture, (2 − ε)-approximation of the Vertex Cover problem is hard on edge-percolated instances.

Proof. By [24], assuming the Unique Games Conjecture, for any ε > 0 the problem Gap-Vertex-Cover(1/2 + ε, 1 − ε) is NP-hard. Equivalently, given an N-vertex graph G, it is NP-hard to distinguish between the case that α(G) > (1/2 − ε)N and the case that α(G) < εN. We show a reduction from this problem to itself (with a slightly larger parameter ε) that is robust for edge percolation. Consider the reduction that, given a graph G, outputs the R-blowup of G, which we denote by H, with R to be chosen later. That is, H is a graph on n = NR vertices, and it is clear that α(H) = α(G)·R. Therefore, this is indeed a reduction from Gap-Vertex-Cover(1/2 + ε, 1 − ε) to itself. We show below that the reduction is in fact robust for edge percolation. In order to do so, we prove that with high probability

α(G)·R ≤ α(H_e) ≤ α(G)·R + (C·log(R)/p)·N,  (6)

where H_e ∼ H_{p,e} denotes the edge percolation of H with parameter p. Indeed, the left inequality is clear, because α(H_e) ≥ α(H) = α(G)·R, since H_e is a subgraph of H. For the right inequality, by Lemma 5.1, with probability at least 1 − N²/R³ the following holds: for every edge (u, v) of G, the corresponding clouds u_e and v_e in H are such that there is no independent set I in H_e with |I ∩ u_e| ≥ C·log(R)/p and |I ∩ v_e| ≥ C·log(R)/p. Therefore, if I is an independent set in H_e, then the vertices of G whose clouds intersect I in more than C·log(R)/p vertices must form an independent set in G. Thus, with probability at least 1 − N²/R³ we have α(H_e) ≤ α(G)·R + (C·log(R)/p)·N.

Next we choose the parameter R such that the reduction above is indeed a robust reduction for edge percolation with parameter p. For the parameter p, let c = log(pn)/log(n), so that p = 1/n^{1−c}, and let R = N^{2/c} (where N is the number of vertices in the original graph). Now, if α(G) > (1/2 − ε)N, then by (6) we have α(H_e) ≥ α(G)·R > (1/2 − ε)NR = (1/2 − ε)n, and hence H_e contains a vertex cover of size at most (1/2 + ε)n. On the other hand, we claim that if α(G) < εN, then with high probability α(H_e) < 2εn. Indeed, by the choice of R we have p = 1/n^{1−c} = 1/(NR)^{1−c} > C·log(R)·N/(α(G)·R). Therefore, by the right inequality of (6) we have α(H_e) ≤ α(G)·R + (C·log(R)/p)·N ≤ 2α(G)·R < 2εn, and hence H_e does not have a vertex cover of size at most (1 − 2ε)n. This completes the proof of Theorem 5.2.
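The R-blowup used in this reduction is straightforward to implement. A minimal sketch with our own toy encoding of graphs as edge lists:

```python
def blowup(edges, R):
    """R-blowup H of a graph G: vertex u becomes the cloud
    {(u, 0), ..., (u, R-1)} (an independent set); each edge of G becomes a
    complete bipartite graph between the two clouds, so alpha(H) = alpha(G) * R."""
    return [((u, i), (v, j)) for u, v in edges for i in range(R) for j in range(R)]

def is_independent(vertices, edge_list):
    s = set(vertices)
    return not any(u in s and v in s for u, v in edge_list)

# Toy example: G is a single edge {0, 1}, so alpha(G) = 1, and R = 2.
h_edges = blowup([(0, 1)], R=2)
assert len(h_edges) == 4                              # complete bipartite K_{2,2}
assert is_independent([(0, 0), (0, 1)], h_edges)      # a whole cloud is independent
assert not is_independent([(0, 0), (1, 0)], h_edges)  # but not across an edge of G
```

After edge percolation some cross-cloud edges disappear, which is exactly why Lemma 5.1 is needed to control how much larger the independent sets can get.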

Vertex percolation We now proceed with vertex percolation. Note that when considering vertex percolation, the percolation parameter p depends on the number of vertices in the given (worst-case instance) graph, while the performance of the algorithm is measured with respect to the number of vertices in the percolated graph, which is close to pn with high probability. We will need the following concentration bound, which is an immediate corollary of the Chernoff bound in Lemma 1.11.

Corollary 5.3. Let X_1^{(1)}, ..., X_n^{(1)}, ..., X_1^{(m)}, ..., X_n^{(m)} be independent 0-1 random variables with Pr[X_i^{(j)} = 1] = p. Then, for some absolute constant C > 0 it holds that

Pr[∃j ∈ [m] : |∑_{i=1}^n X_i^{(j)} − pn| ≥ √(C·pn·log(m))] ≤ m^{−3}.

Proof. By the multiplicative Chernoff bound above, for each j ∈ [m] it holds that Pr[|∑_{i=1}^n X_i^{(j)} − pn| ≥ √(C·pn·log(m))] ≤ e^{−Ω(C·log(m))} < m^{−4} for a large enough absolute constant C > 0. By taking a union bound over all j ∈ [m] we get Pr[∃j ∈ [m] : |∑_{i=1}^n X_i^{(j)} − pn| ≥ √(C·pn·log(m))] ≤ m·m^{−4} = m^{−3}, as required.

We can now deal with vertex percolation and vertex-cover.

Theorem 5.4. Let ε, δ ∈ (0, 1) be fixed constants. Assuming the Unique Games Conjecture, Gap-Vertex-Cover(1/2 + ε, 1 − ε) is NP-hard under a robust reduction with respect to vertex percolation with parameter p, for any p > 1/n^{1−δ}, where n is the number of vertices in the starting graph. In particular, assuming the Unique Games Conjecture, (2 − ε)-approximation of the Vertex Cover problem is hard on vertex-percolated instances.

Proof. The reduction is the same as in the proof of Theorem 5.2. For the parameters p and ε, let c = log(pn)/log(n), so that p = 1/n^{1−c}, and let R = (N/ε²)^{1/c}. Given a graph G, the reduction produces the R-blowup of G, which we denote by H. Then H is a graph on n = NR vertices. Let H_v ∼ H_{p,v} denote the vertex percolation of H with parameter p. By Corollary 5.3, with high probability the number of vertices in H_v, which we denote by m, is between pNR − C·√(pNR·log(NR)) and pNR + C·√(pNR·log(NR)), and the number of vertices in every cloud of H_v is between pR − C·√(pR·log N) and pR + C·√(pR·log N), for some absolute constant C > 0 independent of N and p. Clearly, any independent set I in H_v gives rise to an independent set in G by taking all vertices v of G such that I intersects the cloud corresponding to v. This implies that with high probability it holds (for N large enough) that

α(G)·(pR − C·√(pR·log N)) ≤ α(H_v) ≤ α(G)·(pR + C·√(pR·log N)).

By the choice of R we have R > C²·log(N)/(ε²·p), and hence |α(H_v) − α(G)·pR| ≤ ε·α(G)·pR. Therefore, denoting by m the number of vertices in H_v, if α(G) > (1/2 − ε)N, then α(H_v) ≥ (1/2 − 3ε)m, and hence H_v contains a vertex cover of size at most (1/2 + 3ε)m. On the other hand, if α(G) < εN, then with high probability α(H_v) < 3εm, and hence H_v does not have a vertex cover of size at most (1 − 3ε)m.

5.2 Hamiltonicity and Percolation Recall that a Hamiltonian cycle in a graph is a cycle that visits every vertex exactly once. Deciding whether a graph (directed or undirected) contains a Hamiltonian cycle is a classical NP-hard problem, which we denote by HamCycle. A Hamiltonian path is a simple path that traverses all vertices of the graph.

In this section we prove that unless NP = coRP, there is no polynomial time algorithm that, given an n-vertex graph G, decides with high probability whether G_{p,e} contains a Hamiltonian cycle, for any p > 1/n^{1−ε} where ε ∈ (0, 1). A natural approach to proving that deciding Hamiltonicity of percolated instances is hard is to "blow up" edges, namely to replace each edge (u, v) by a clique of size k and connect both endpoints of the edge to all vertices of the clique. The idea is that when k is large enough, with high probability there is a Hamiltonian path between every pair of distinct vertices of the clique even after percolation. Hence, with high probability, we can connect u and v after percolating the edges by a path that traverses all the vertices of the percolated clique. The problem with this idea is that the resulting graph after this blowup operation may not be Hamiltonian (even if the starting graph is), as there is a new set of vertices for every edge of the original graph that needs to be traversed by a Hamiltonian cycle. For directed graphs, we overcome this problem by adding to each vertex v a large clique C, adding a directed edge (v, c) for every c ∈ C, and adding a directed edge (c, u) for every c ∈ C and u ∈ N(v) (where N(v) is the set of all vertices to which v has a directed edge). Combining this reduction with the standard NP-hardness reduction from directed to undirected Hamiltonian cycle (see, e.g., [30]) yields a similar result for undirected graphs. We omit the details.

Theorem 5.5. Let ε ∈ (0, 1) be a fixed constant. HamCycle is NP-hard under a noise-robust reduction, where noise is edge percolation with probability p > 1/n^{1−ε}.

We will need the following claim.

Claim 5.6. Let H = (V, E) be the directed graph with V = {s, t} ∪ U, where U = {u_1, ..., u_R}, the vertex s is a source, and t is a sink. The edges of H are

E = {(s → u_i) : i ∈ [R]} ∪ {(u_i → t) : i ∈ [R]} ∪ {(u_i → u_j) : i ≠ j ∈ [R]}.

Let H′ = (V, E′) be an edge percolation of H, where we keep each directed edge with probability p = 3·log^5(R)/R. Then, with probability at least 1 − 1/R³ there is a Hamiltonian path in H′ from s to t.

Proof. Let p_0 ∈ (0, 1), and consider the random graph H_{p_0,e}. Note that with probability at least 1 − 2(1 − p_0)^R − R·(p_0·(1 − p_0)^{R−1})² there are two distinct vertices v_1, v_R ∈ U such that (s → v_1) and (v_R → t) are both edges of H_{p_0,e}. Conditioning on these specific v_1, v_R ∈ U, we show that with high probability there is a Hamiltonian path from v_1 to v_R in the subgraph of H_{p_0,e} induced by U. By a result of [16, Theorem 1.3], if D is a p_0-edge percolation of the complete directed graph on R vertices with p_0 = log^4(R)/R, then with high probability every edge of D is contained in some Hamiltonian cycle in D. Note that the probability that H_{p_0,e} contains a Hamiltonian path from v_1 to v_R is equal to the probability that H_{p_0,e} contains a Hamiltonian cycle that goes through the edge (v_R → v_1), conditioned on the event that (v_R → v_1) is an edge in H_{p_0,e}. Therefore, since the subgraph of H_{p_0,e} induced by U is distributed like D, it follows that with high probability the subgraph of H_{p_0,e} induced by U contains a Hamiltonian path from v_1 to v_R, and hence H_{p_0,e} contains a Hamiltonian path from s to t with probability at least 1/2.

Next, let ℓ = 3·log(R), so that p = ℓ·p_0. We claim that the graph H_{p,e} contains a Hamiltonian path from s to t with probability at least 1 − 1/R³. Observe that if H_1, ..., H_ℓ are independent copies of H_{p_0,e}, then the probability that none of the H_i contains a Hamiltonian path from s to t is at most (1/2)^ℓ < 1/R³. Therefore, since 1 − (1 − p_0)^ℓ ≤ p, it follows that H_{p,e} dominates ∪_{i=1}^ℓ H_i, and hence H_{p,e} contains a Hamiltonian path from s to t with probability at least 1 − 1/R³, as required.
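The gadget H of Claim 5.6 is easy to construct explicitly. The sketch below (our own helper names) builds H and verifies a Hamiltonian path in the unpercolated graph; the claim is that such a path survives percolation with probability 1 − 1/R³:

```python
def gadget_edges(R):
    """Directed gadget H from Claim 5.6: a source s, a sink t, and a
    bidirected clique U = {0, ..., R-1}, with edges s -> U and U -> t."""
    E = {("s", u) for u in range(R)}
    E |= {(u, "t") for u in range(R)}
    E |= {(u, v) for u in range(R) for v in range(R) if u != v}
    return E

def is_hamiltonian_path(path, edges, n_vertices):
    """Check that `path` visits all n_vertices exactly once along directed edges."""
    return (len(path) == n_vertices and len(set(path)) == n_vertices
            and all((a, b) in edges for a, b in zip(path, path[1:])))

R = 4
E = gadget_edges(R)
# In the unpercolated H, s -> 0 -> 1 -> 2 -> 3 -> t is a Hamiltonian path.
assert is_hamiltonian_path(["s", 0, 1, 2, 3, "t"], E, R + 2)
```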

Proof of Theorem 5.5. In order to prove the theorem, we show a reduction that, given a directed graph $G = (V, E)$, produces a directed graph $G' = (V', E')$ such that:

• If $G$ contains a Hamiltonian cycle, then $G'$ contains a Hamiltonian cycle, and with high probability $G'_{p,e}$ contains a Hamiltonian cycle.

• If $G$ does not contain a Hamiltonian cycle, then neither $G'$ nor $G'_{p,e}$ contains a Hamiltonian cycle.

The reduction works as follows. Let $V = [N]$ be the vertices of $G$, and let $R$ be a parameter to be chosen later. The vertices of $G'$ are $V' = V \cup \big(\bigcup_{i=1}^{N} U_i\big)$, where $U_i = \{u^i_1, \ldots, u^i_R\}$. For each $i \in [N]$ the graph $G'$ contains all edges in both directions inside $U_i$. For each directed edge $(i \to j) \in E$ we add to $G'$ the directed edges
$\{(i \to u^i_\ell) : \ell \in [R]\} \cup \{(u^i_\ell \to j) : \ell \in [R]\}.$
That is, we turn the graph $G$ into $G'$ by adding a clique $U_i$ for each vertex $v_i \in V$, and letting all edges outgoing from $v_i$ go through this clique. This completes the description of the reduction.

Let us first show that $G$ contains a Hamiltonian cycle if and only if $G'$ contains a Hamiltonian cycle. Indeed, suppose that $C = (\sigma_1, \ldots, \sigma_N)$ is a Hamiltonian cycle in $G$. Then $C' = (\sigma_1, u^{\sigma_1}_1, \ldots, u^{\sigma_1}_R, \ldots, \sigma_N, u^{\sigma_N}_1, \ldots, u^{\sigma_N}_R)$ is a Hamiltonian cycle in $G'$. In the other direction, suppose that $G'$ contains a Hamiltonian cycle $C'$. It is easy to see that any $i \in V$ appearing in $C'$ must be followed immediately by a permutation of all $R$ vertices in $U_i$. Therefore, by restricting $C'$ to the vertices in $V$ we get a Hamiltonian cycle in $G$.

Next we show that the reduction above is robust to edge percolation. Let $\tilde{G}' = G'_{p,e}$ be the edge percolation of $G'$. Clearly, if $G'$ does not contain a Hamiltonian cycle, then neither does $\tilde{G}'$. It therefore remains to show that if $G'$ contains a Hamiltonian cycle $C$, then with high probability $\tilde{G}'$ also contains a Hamiltonian cycle. As explained above, a Hamiltonian cycle in $G'$ is given by a permutation $\sigma = (\sigma_1, \ldots, \sigma_N) \in S_N$ and some ordering of the vertices in each $U_i$, i.e., $C = (\sigma_1, u^{\sigma_1}_1, \ldots, u^{\sigma_1}_R, \ldots, \sigma_N, u^{\sigma_N}_1, \ldots, u^{\sigma_N}_R)$. Note that for each $i \in [N]$ the vertices $\{\sigma_i, u^{\sigma_i}_1, \ldots, u^{\sigma_i}_R, \sigma_{i+1}\}$ induce a subgraph isomorphic to the graph $H$ from Claim 5.6. Therefore, by Claim 5.6, if $p > \frac{\log^4(R)}{R}$, then for each $i \in [N]$, with probability $1 - \frac{1}{R^3}$ there is a path from $\sigma_i$ to $\sigma_{i+1}$ that visits all vertices in $U_{\sigma_i}$. Taking a union bound over all $i \in [N]$, we get that with probability $1 - \frac{N}{R^3}$ such paths exist for all $i \in [N]$, and by concatenating them we conclude that $\tilde{G}'$ contains a Hamiltonian cycle with high probability.

Finally, we specify the choice of the parameter $R$. The obtained graph $G'$ has $n = N(R+1)$ vertices, and the constraints we have are $p > \frac{\log^4 R}{R}$ and $R^3 \gg N$. Therefore, in order to prove the theorem for $p > \frac{1}{n^{1-\varepsilon}}$ with $\varepsilon \in (0,1)$, it is enough to take $R = N^{1/c}$, where $c = \frac{\log(pn)}{\log(n)} > \varepsilon$ is such that $p = \frac{1}{n^{1-c}}$.
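The reduction above is mechanical enough to state in code. The following sketch is our own illustration (the names `reduce_hamcycle` and `has_ham_cycle` are not from the paper): it builds $G'$ from $G$ by attaching a clique $U_i$ to each vertex and rerouting outgoing edges through it, together with a brute-force Hamiltonicity check usable only on tiny instances.

```python
from itertools import permutations

def reduce_hamcycle(n, edges, R):
    """Build G' from a directed graph G = ([n], edges): for each vertex i,
    add a clique U_i of R fresh vertices (encoded as pairs (i, l)), and
    route every edge (i -> j) of G through U_i, as in the reduction."""
    new_edges = set()
    for i in range(n):
        # All edges in both directions inside the clique U_i.
        for l1 in range(R):
            for l2 in range(R):
                if l1 != l2:
                    new_edges.add(((i, l1), (i, l2)))
    for (i, j) in edges:
        for l in range(R):
            new_edges.add((i, (i, l)))   # i -> u_l^i
            new_edges.add(((i, l), j))   # u_l^i -> j
    vertices = list(range(n)) + [(i, l) for i in range(n) for l in range(R)]
    return vertices, new_edges

def has_ham_cycle(vertices, edges):
    """Brute-force Hamiltonicity check, for tiny sanity tests only."""
    for perm in permutations(vertices):
        if all((perm[k], perm[(k + 1) % len(perm)]) in edges
               for k in range(len(perm))):
            return True
    return False
```

For example, with $G$ the directed 3-cycle and $R = 1$, the reduced graph on 6 vertices is Hamiltonian, while dropping one edge of $G$ destroys Hamiltonicity of $G'$ as well, matching the "if and only if" direction of the proof.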

5.3 Subset-Sum and Percolation

In this section we consider the Subset-Sum problem. In the Subset-Sum problem we are given a set of items $\{a_i\}_{i=1}^{n}$, which are positive integers, and a target integer $S$. The goal is to decide whether there is a subset of the $a_i$'s whose sum is $S$.

Given an instance $I = (\{a_i\}_{i=1}^{n}; S)$ of the Subset-Sum problem, we define $p$-percolation on $I$ to be a random instance $I_p$ in which each item $a_i$ is included independently with probability $p$, and the target of $I_p$ is the same as the target of $I$.
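The percolation model on Subset-Sum instances is straightforward to simulate. The sketch below is our own illustration (`percolate_instance` and `subset_sum` are names we chose; the solver is the standard dynamic program over reachable sums, not part of the reduction):

```python
import random

def percolate_instance(items, target, p, rng=random):
    """p-percolation of a Subset-Sum instance (items; target): each item
    survives independently with probability p; the target is unchanged."""
    return [a for a in items if rng.random() < p], target

def subset_sum(items, target):
    """Decide Subset-Sum by dynamic programming over the set of sums
    reachable using subsets of the items seen so far."""
    reachable = {0}
    for a in items:
        reachable |= {s + a for s in reachable}
    return target in reachable
```

Note that percolation can only remove witnesses: if the original instance is a NO instance, every percolated instance is as well, which is exactly the easy direction used in the robustness arguments below.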

It is known that the Subset-Sum problem is NP-hard. Below we prove hardness of the Subset-Sum problem with respect to the above percolation.

Theorem 5.7. The Subset-Sum problem is NP-hard under a noise-robust reduction, where the noise is $p$-percolation with $p > \frac{1}{n^{1/2-\varepsilon}}$, where $n$ is the number of items in a given instance, and $\varepsilon > 0$ is any fixed constant.

Proof. In order to prove the theorem, we show a reduction that, given an instance $I = (\{a_i\}_{i=1}^{N}; S)$ of the Subset-Sum problem with all $a_i > 0$, produces an instance $I'$ on $n$ items such that the following two properties are satisfied.

• If $I \in$ Subset-Sum, then $I' \in$ Subset-Sum, and with high probability $I'_p \in$ Subset-Sum.

• If $I \notin$ Subset-Sum, then $I' \notin$ Subset-Sum, and hence $I'_p \notin$ Subset-Sum with probability 1.

Let us assume that the number of items $N$ in $I$ is even. (If $N$ is odd, add an item to $I$ that is equal to zero.) Let $R$ be a parameter to be chosen later, let $M = 2RN$, let $L_0 = 6RM \cdot \big(\sum_{i=1}^{N} a_i + M\big)$, and for $i = 1, \ldots, N$ let $L_i = (6R)^i \cdot L_0$. For each $i \in [N]$ define the following sets:

$J_i = \{L_i + a_i \cdot M + k : k \in \{-R, \ldots, R\}\}$ and $J'_i = \{L_i + k : k \in \{-R, \ldots, R\}\}.$

Consider now the instance $I' = \big(\bigcup_{i \in [N]} (J_i \cup J'_i);\, S'\big)$, where $S' = S \cdot M + \sum_{i=1}^{N} L_i$. Clearly, if $R$ is not too large, this is a polynomial-time reduction that outputs a Subset-Sum instance with $n = 2N(2R + 1)$ items.

We show first that $I \in$ Subset-Sum if and only if $I' \in$ Subset-Sum. Indeed, suppose that for some subset $T \subseteq [N]$ it holds that $\sum_{i \in T} a_i = S$. Consider the following subset of the items of $I'$: for each $i \in T$ take the item from $J_i$ that corresponds to $k = 0$, and for each $i \in [N] \setminus T$ take the item from $J'_i$ that corresponds to $k = 0$. Summing these items we get

$\sum_{i \in T} (L_i + a_i \cdot M + 0) + \sum_{i \in [N] \setminus T} (L_i + 0) = S'.$
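The construction of $I'$ can be sketched directly from the definitions of $M$, $L_0$, $L_i$, $J_i$, and $J'_i$. The code below is our own illustration (`build_instance` is a name we chose, not from the paper); it also makes the forward direction easy to check: picking the $k = 0$ item of $J_i$ for $i \in T$ and of $J'_i$ otherwise sums exactly to $S'$.

```python
def build_instance(a, S, R):
    """Construct I' = (items; S') from I = (a_1..a_N; S), following the
    reduction: M = 2RN, L_0 = 6RM*(sum(a)+M), L_i = (6R)^i * L_0,
    J_i = {L_i + a_i*M + k} and J'_i = {L_i + k} for k in {-R,...,R}."""
    N = len(a)
    M = 2 * R * N
    L0 = 6 * R * M * (sum(a) + M)
    L = [L0 * (6 * R) ** i for i in range(1, N + 1)]    # L[i-1] = L_i
    J = [[L[i] + a[i] * M + k for k in range(-R, R + 1)] for i in range(N)]
    Jp = [[L[i] + k for k in range(-R, R + 1)] for i in range(N)]
    S_prime = S * M + sum(L)
    return J, Jp, S_prime
```

Here the $k = 0$ item of $J_i$ sits at index $R$ of the list (since $k$ runs over $-R, \ldots, R$), and the chosen items sum to $\sum_i L_i + M \sum_{i \in T} a_i = S'$ whenever $\sum_{i \in T} a_i = S$.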

In the other direction, suppose that $I' \in$ Subset-Sum. Then there is some subset $T' \subseteq [N] \times \{0,1\} \times \{-R, \ldots, R\}$ such that

$\sum_{(i,t,k) \in T'} (L_i + a_i \cdot M \cdot t + k) = S' = \sum_{i \in [N]} L_i + S \cdot M. \qquad (7)$

Claim 5.8. For each $i \in [N]$ there is a unique $t_i \in \{0,1\}$ and a unique $k_i \in \{-R, \ldots, R\}$ such that $(i, t_i, k_i) \in T'$. Furthermore, $\sum_{i=1}^{N} k_i = 0$ and $\sum_{i=1}^{N} a_i \cdot t_i = S$.

Proof. Note first that for each $i \in [N]$ there are at most $2 \cdot (2R + 1) < 6R$ terms $(i, t, k)$ in $T'$. Therefore, $\sum_{(i,t,k) \in T'} (a_i \cdot M \cdot t + k) < 6R \cdot \sum_{i=1}^{N} (a_i M + R) < L_0$. Considering the sum on the left-hand side of (7) modulo $L_0$ (and recalling that $L_i = L_0 \cdot (6R)^i$), we conclude that

$\sum_{(i,t,k) \in T'} L_i = \sum_{i=1}^{N} L_i \qquad (8)$

and
$\sum_{(i,t,k) \in T'} (a_i \cdot M \cdot t + k) = S \cdot M. \qquad (9)$

By the choice of the $L_i$, and using again the fact that for each $i \in [N]$ there are at most $2 \cdot (2R+1) < 6R$ terms $(i, t, k)$ in $T'$, it is now easy to see that each term $L_i$ on the left-hand side of (8) must appear exactly once, i.e., for each $i \in [N]$ there is a unique $t_i \in \{0,1\}$ and a unique $k_i \in \{-R, \ldots, R\}$ such that $(i, t_i, k_i) \in T'$.

To see that $\sum_{i=1}^{N} k_i = 0$, note that $\big|\sum_{i=1}^{N} k_i\big| \leq RN < M$. Now, since in (7) we have $L_i \equiv 0 \pmod{M}$ for every $i$ and $S' \equiv 0 \pmod{M}$, it follows that $\sum_{i=1}^{N} k_i = 0$. This concludes the proof.

Therefore, by defining $T = \{i \in [N] : t_i = 1\}$ we get that $\sum_{i \in T} a_i = S$, and so $I \in$ Subset-Sum.

Next, we claim that the reduction above is in fact robust to noise. Indeed, consider the percolated instance $I'_p$ for some $p \in (0,1)$. Note that if $I \notin$ Subset-Sum, then $I' \notin$ Subset-Sum, and hence $I'_p \notin$ Subset-Sum with probability 1. Therefore, it remains to show that if $I \in$ Subset-Sum, then with high probability $I'_p \in$ Subset-Sum. The proof relies on the following claim.

Claim 5.9. Let $N \in \mathbb{N}$ be even, and let $R \in \mathbb{N}$. Let $A_1, \ldots, A_N \subseteq \{-R, \ldots, R\}$ be random sets chosen by letting each $k \in \{-R, \ldots, R\}$ belong to $A_i$ independently with probability $p$. Then, with probability at least $1 - (N/2) \cdot (1 - p^2)^{2R+1}$, for each $i \in [N]$ there is $k_i \in A_i$ such that $\sum_{i=1}^{N} k_i = 0$.

Proof. Note that for each odd $i \in [N]$ and each fixed element $k \in \{-R, \ldots, R\}$, the probability that both $k \in A_i$ and $-k \in A_{i+1}$ hold is $p^2$. Therefore,
$\Pr[\nexists\, k \in \{-R, \ldots, R\} : k \in A_i \text{ and } -k \in A_{i+1}] = (1 - p^2)^{2R+1}.$
Hence, taking a union bound over all pairs $(i, i+1)$ with odd $i$, we get that with probability at least $1 - (N/2) \cdot (1 - p^2)^{2R+1}$, for every odd $i$ there is $k_i \in A_i$ such that $-k_i \in A_{i+1}$. Setting $k_{i+1} = -k_i$ for each odd $i$ gives $\sum_{i=1}^{N} k_i = 0$.

Suppose now that $I \in$ Subset-Sum, i.e., for some subset $T \subseteq [N]$ it holds that $\sum_{i \in T} a_i = S$. Note that the percolated instance $I'_p$ is obtained from $I'$ by taking random subsets of the $J_i$'s and $J'_i$'s independently of each other. For $i \in [N]$ define $A_i \subseteq \{-R, \ldots, R\}$ to be the set of offsets $k$ of the items surviving the $p$-percolation of $J_i$ if $i \in T$, and of $J'_i$ if $i \notin T$. Note that if $R > \frac{C \log(N)}{p^2}$, then the conclusion of Claim 5.9 holds with probability at least $1 - 1/N$. Hence, in the percolated instance $I'_p$, taking the items that correspond to the offsets $k_i \in A_i$ from Claim 5.9 we get
$\sum_{i \in T} (L_i + a_i \cdot M + k_i) + \sum_{i \in [N] \setminus T} (L_i + k_i) = \sum_{i \in [N]} L_i + \sum_{i \in T} a_i \cdot M + \sum_{i \in [N]} k_i = \sum_{i \in [N]} L_i + S \cdot M + 0 = S'.$

Therefore, with high probability $I'_p \in$ Subset-Sum, as required.

Finally, note that the reduction works as long as $R > \frac{C \log(N)}{p^2}$, or equivalently $p > \Omega\big(\sqrt{\frac{\log N}{R}}\big)$. It is easy to verify that if we set $R = N^{1/c}$, with $c = \frac{\log(pn)}{\log(n)} - \frac{1}{2}$ such that $p = \frac{1}{n^{1/2-c}}$, then the foregoing reduction is indeed a robust reduction with respect to percolation with parameter $p > \frac{1}{n^{1/2-\varepsilon}}$ for any constant $\varepsilon > 0$, where $n$ is the number of items in $I'$.
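The pairing strategy behind Claim 5.9 can be simulated directly. The sketch below is our own illustration (`paired_offsets` is a hypothetical name): it samples the sets $A_1, \ldots, A_N$ and, for each consecutive pair as in the proof, looks for an offset $k$ with $k \in A_i$ and $-k \in A_{i+1}$, so the selected offsets sum to zero whenever every pair succeeds.

```python
import random

def paired_offsets(N, R, p, rng):
    """Sample A_1..A_N as in Claim 5.9 (N even) and try to pick k_i in A_i
    summing to 0 via the pairing strategy: for each pair (i, i+1) with i
    odd, find k with k in A_i and -k in A_{i+1}; set k_i = k, k_{i+1} = -k."""
    A = [{k for k in range(-R, R + 1) if rng.random() < p} for _ in range(N)]
    ks = []
    for i in range(0, N, 2):          # 0-indexed pairs (A[0],A[1]), (A[2],A[3]), ...
        match = next((k for k in A[i] if -k in A[i + 1]), None)
        if match is None:
            return None               # this pair failed; prob. (1-p^2)^(2R+1)
        ks += [match, -match]
    return ks                         # sum(ks) == 0 by construction
```

With, say, $N = 4$, $R = 200$, and $p = 0.3$, each pair fails with probability $(1 - p^2)^{2R+1} \approx e^{-36}$, so the procedure essentially always succeeds, matching the union bound in the claim.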

6 Conclusion

We have examined the complexity of percolated instances of several well-studied NP-hard problems and established the hardness of solving these problems on such instances. It might be of interest to study the hardness of percolated instances of other NP-hard problems, and of other classes of problems such as counting problems, W[1]-hard problems, and parallel computation. It might also prove worthwhile to determine whether percolated instances of 3-SAT remain hard to solve for p = O(1/n²) over n-variable formulas. It would be interesting to determine whether the HamCycle problem is NP-hard under a robust reduction with respect to vertex percolation. We only show that this holds with respect to edge percolation, and it could be the case that the problem is not NP-hard under a vertex-percolation-robust reduction. Proving such a result (if true) would be very interesting.

Acknowledgements

We thank Itai Benjamini, Uri Feige, and Sam Hopkins for useful discussions. We are grateful to the anonymous referees for helpful comments and turning our attention to [7].

References

[1] R. Albert, H. Jeong, and A.-L. Barabási. Error and attack tolerance of complex networks. Nature, 406(6794):378–382, 2000.

[2] N. Alon and J. H. Spencer. The Probabilistic Method. Wiley-Interscience series in discrete mathematics and optimization. J. Wiley & Sons, New York, 2000.

[3] S. Arora, D. R. Karger, and M. Karpinski. Polynomial time approximation schemes for dense instances of NP-hard problems. J. Comput. Syst. Sci., 58(1):193–210, 1999.

[4] B. Barak, M. Hardt, T. Holenstein, and D. Steurer. Subsampling mathematical relaxations and average-case complexity. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, San Francisco, California, USA, pages 512–531, 2011.

[5] B. Bollobás. Random Graphs. Springer, 1998.

[6] R. L. Brooks. On colouring the nodes of a network. Mathematical Proceedings of the Cambridge Philosophical Society, 37:194–197, 4 1941.

[7] B. Bukh. Interesting problems that I cannot solve. Problem 2. http://www.borisbukh.org/problems.html.

[8] R. Cohen, K. Erez, D. ben-Avraham, and S. Havlin. Resilience of the Internet to random breakdowns. Physical Review Letters, 85(21):4626, 2000.

[9] P. Crescenzi, R. Silvestri, and L. Trevisan. On weighted vs unweighted versions of combinatorial optimization problems. Inf. Comput., 167(1):10–26, 2001.

[10] I. Dinur, E. Mossel, and O. Regev. Conditional hardness for approximate coloring. SIAM J. Comput., 39(3):843–873, 2009.

[11] I. Dinur and S. Safra. On the hardness of approximating minimum vertex cover. Annals of Mathematics, 162(1):439–485, 2005.

[12] I. Dinur and I. Shinkar. On the conditional hardness of coloring a 4-colorable graph with super-constant number of colors. In Approximation, Randomization, and Combinatorial Opti- mization. Algorithms and Techniques, 13th International Workshop, APPROX 2010, and 14th International Workshop, RANDOM 2010, Barcelona, Spain, September 1-3, 2010. Proceedings, pages 138–151, 2010.

[13] M. Etscheid and H. Röglin. Smoothed analysis of local search for the maximum-cut problem. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, Portland, Oregon, USA, pages 882–889, 2014.

[14] U. Feige and J. Kilian. Zero knowledge and the chromatic number. Journal of Computer and System Sciences, 57(2):187 – 199, 1998.

[15] U. Feige and D. Reichman. Recoverable values for independent sets. Random Struct. Algorithms, 46(1):142–159, 2015.

[16] A. Ferber, G. Kronenberg, and E. Long. Packing, counting and covering Hamilton cycles in random directed graphs. arXiv:1506.00618, 2015.

[17] A. M. Frieze and C. McDiarmid. Algorithmic theory of random graphs. Random Struct. Algorithms, 10(1-2):5–42, 1997.

[18] M. R. Garey and D. S. Johnson. The complexity of near-optimal graph coloring. J. ACM, 23(1):43–49, January 1976.

[19] G. Grimmett. Percolation. Springer, 1999.

[20] S. Har-Peled. Concentration of Random Variables – Chernoff's Inequality. Available at http://sarielhp.org/teach/13/b_574_rand_alg/lec/07_chernoff.pdf.

[21] S. Huang. Improved hardness of approximating chromatic number. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques - 16th International Workshop, APPROX 2013, and 17th International Workshop, RANDOM 2013, Berkeley, CA, USA, August 21-23, 2013. Proceedings, pages 233–243, 2013.

[22] D. R. Karger. Random sampling in cut, flow, and network design problems. In Proceedings of the Twenty-Sixth Annual ACM Symposium on Theory of Computing, Montréal, Québec, Canada, pages 648–657, 1994.

[23] S. Khot. Improved inapproximability results for maxclique, chromatic number and approximate graph coloring. In 42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, 14-17 October 2001, Las Vegas, Nevada, USA, pages 600–609, 2001.

[24] S. Khot and O. Regev. Vertex cover might be hard to approximate to within 2-epsilon. J. Comput. Syst. Sci., 74(3):335–349, 2008.


[25] L. Kucera. The greedy coloring is a bad probabilistic algorithm. J. Algorithms, 12(4):674–684, 1991.

[26] M. Mezard and A. Montanari. Information, physics, and computation. Oxford University Press, 2009.

[27] S. Misra. Decay of Correlation and Inference in Graphical Models. PhD thesis, MIT, 2014.

[28] E. Mossel, R. O’Donnell, O. Regev, J. E. Steif, and B. Sudakov. Non-interactive correlation distillation, inhomogeneous Markov chains, and the reverse Bonami-Beckner inequality. Israel Journal of Mathematics, 154:299–336, 2006.

[29] T. Sanders. On the Bogolyubov–Ruzsa lemma. Analysis & PDE, 5(3):627–655, 2012.

[30] M. Sipser. Introduction to the Theory of Computation. International Thomson Publishing, 1st edition, 1996.

[31] O. Sisask. Discrete Fourier analysis. Available at https://people.kth.se/~sisask/Shillong/DFA_2.pdf.

[32] A. Sly. Computational transition at the uniqueness threshold. In 51st Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, Las Vegas, Nevada, USA, pages 287–296, 2010.

[33] D. A. Spielman and S-H. Teng. Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. J. ACM, 51(3):385–463, 2004.

[34] D. A. Spielman and S-H. Teng. Smoothed analysis: an attempt to explain the behavior of algorithms in practice. Commun. ACM, 52(10):76–84, 2009.

[35] V. V. Vazirani. Approximation Algorithms. Springer-Verlag New York, Inc., New York, NY, USA, 2001.
