58th Annual IEEE Symposium on Foundations of Computer Science

Random Θ(log n)-CNFs are Hard for Cutting Planes

Noah Fleming Denis Pankratov Toniann Pitassi Robert Robere University of Toronto University of Toronto University of Toronto University of Toronto noahfl[email protected] [email protected] [email protected] [email protected]

Abstract—The random k-SAT model is the most impor- lower bounds for random k-SAT formulas in a particular tant and well-studied distribution over k-SAT instances. It is proof system show that any and efficient algorithm closely connected to statistical physics and is a benchmark for based on the proof system will perform badly on random satisfiability algorithms. We show that when k =Θ(logn), any Cutting Planes refutation for random k-SAT requires k-SAT instances. Furthermore, since the proof complexity exponential size in the interesting regime where the number of lower bounds hold in the unsatisfiable regime, they are clauses guarantees that the formula is unsatisfiable with high directly connected to Feige’s hypothesis. probability. Remarkably, determining whether or not a random SAT Keywords-Proof complexity; random k-SAT; Cutting Planes; instance from the distribution F(m, n, k) is satisfiable is controlled quite precisely by the ratio Δ=m/n, which is called the clause density. A simple counting argument shows I. INTRODUCTION that F(m, n, k) is unsatisfiable with high probability for The Satisfiability (SAT) problem is perhaps the most Δ > 2k ln 2. The famous satisfiability threshold conjecture famous problem in theoretical computer science, and sig- asserts that there is a constant ck such that random k-SAT nificant effort has been devoted to understanding randomly formulas of clause density Δ are almost certainly satisfiable generated SAT instances. The most well-studied random for Δ ck, k SAT distribution is the random k-SAT model, F(m, n, k), where ck is roughly 2 ln 2. In a major recent breakthrough, where a random k-CNF over n variables is chosen by the conjecture was resolved for large values of k [4]. uniformly and independently selecting m clauses from the From the perspective of proof complexity, the density set of all possible clauses on k distinct variables. The parameter Δ also plays an important role in the difficulty random k-SAT model is widely studied for several reasons. of refuting unsatisfiable CNF formulas. For instance, in First, it is an intrinsically natural model analogous to the Resolution, which is arguably the simplest proof system, random graph model, and closely related to phase transitions the complexity of refuting random k-SAT formulas is now and structural phenomena occurring in statistical physics. very well understood in terms of Δ. In a seminal paper, Second, the random k-SAT model gives us a testbench of Chvatal and Szemeredi [5] showed that for any fixed Δ empirically hard examples which are useful for comparing above the threshold there is a constant κΔ such that random and analyzing SAT algorithms (for example, each of the SAT k-SAT requires size exp(κΔn) Resolution refutations with competitions has featured a track for random SAT instances high probability. In their proof, the drop-off in κΔ is doubly [1]). exponential in Δ, making the lower bound trivial when Third, and most relevant to the current work, the difficulty the number of clauses is larger than n log1/4 n (and thus of solving random k-SAT instances above the threshold (in does not hold when k is large.) Improved lower bounds [6], the regime where the formula is almost certainly unsatis- [7] proved that the drop-off in κΔ is at most polynomial fiable) has been connected to worst-case inapproximability in Δ. More precisely, they prove that a random k-SAT by Feige [2]. Feige’s hypothesis states that there is no effi- formula with at most n(k+2)/4 clauses requires exponential cient algorithm to certify unsatisfiability of random 3-SAT size Resolution refutations. Thus for all values of k,even instances for certain parameter regimes of (m, n, k), and when the number of clauses is way above the threshold, he shows that this hard-on-average assumption for 3-SAT Resolution refutations are exponentially long. They also give implies worst-case inapproximability results for many NP- asymptotically matching upper bounds, showing that there hard optimization problems. The hypothesis was generalized are DLL refutations of size exp(n/Δ1/(k−2)). to k-SAT as well as to any CSP, thus exposing more links to Superpolynomial lower bounds for random k-SAT formu- central questions in approximation algorithms and the power las are also known for other weak proof systems such as the of natural SDP algorithms [3]. The importance of under- polynomial calculus and Res(k) [8], [9], and random k-SAT standing the difficulty of solving random k-SAT instances in is also conjectured to be hard for stronger semi-algebraic turn makes random k-SAT an important family of formulas proof systems. In particular, it is a relatively long-standing for propositional proof complexity, since superpolynomial open problem to prove superpolynomial size lower bounds

0272-5428/17 $31.00 © 2017 IEEE 109 DOI 10.1109/FOCS.2017.19 for Cutting Planes refutations of random k-SAT. As alluded monotone circuit lower bound techniques. to earlier, this potential hardness (and even more so for Independently, Pavel Hrubesˇ and Pavel Pudlak´ have the semi-algebraic SOS proof system) has been linked to shown our main result by essentially the same techniques hardness of approximation. [10]. From any unsatisfiable CNF F they show how to In this paper, we focus on the Chvatal-Gomory´ Cutting obtain a partial monotone boolean function which they call Planes proof system and some of its generalizations. A an unsatisfiability certificate for F, and then show that the proof in this system begins with a set of unsatisfiable linear complexity of computing an unsatisfiability certificate by integral inequalities (in the form aT x ≥ b), and new integral a real monotone circuit implies lower bounds for seman- inequalities are derived by (i) taking nonnegative linear tic cutting planes by relating these certificates to feasible combinations of previous lines, or (ii) dividing a previous interpolation. The unsatisfiability certificates turn out to be inequality through by d (as long as all coefficients on the exactly the same as our monotone CSP problems, and their left-hand side are divisible by d) and then rounding up the lower bounds for random k-SAT are also obtained by using constant term on the right-hand side. The goal is to derive the symmetric method of approximations. Further, they use the “false” inequality 0 ≥ 1 with as few derivation steps as this technique to give lower bounds for other problems: a possible. This system can be generalized in several natural generalization of the Pigeonhole Principle called the Weak ways. In Semantic Cutting Planes, there are no explicit rules Bit Pigeonhole Principle, and a function related to Feige’s – a new linear inequality can be derived from two previous hypothesis. lines as long as it follows soundly. A further generalization of both CP and Semantic CP is the CC-proof system, where A. Related Work now every line is only required to have low (deterministic or Exponential lower bounds on lengths of refutations are real) communication complexity; like Semantic CP, a new known for CP, Semantic CP, and low-weight CC-proofs) line can be derived from two previous ones as long as the [11], [12], [13]. These lower bounds were obtained using derivation is sound. the method of interpolation [14]. A lower bound proof The main result of this paper is a new proof method via interpolation begins with a special type of formula – for obtaining Cutting Planes lower bounds. We apply it an interpolant. Given two disjoint NP sets U and V an to prove the first nontrivial lower bounds for the size interpolant formula has the form A(x, y) ∧ B(x, z) where of Cutting Planes refutations of random k-SAT instances. the A-part asserts that x ∈ U, as verified by the NP-witness Specifically, we prove that for k =Θ(logn) and m in the y, and the B-part asserts that x ∈ V , as verified by the unsatisfiable regime random k-SAT requires exponential-size NP-witness z. The prominent example in the literature is Cutting Planes refutations with high probability. The main the clique/coclique formula where U is the set of all graphs result holds for the generalizations Semantic CP and CC- with the clique number at least k, and V is the set of all proofs that were mentioned above; in fact, it is obtained (k − 1)-colorable graphs. Feasible interpolation for a proof by establishing an equivalence between the lengths of CC system amounts to showing that if an interpolant formula proofs and certain monotone circuit lower bounds. More has a short proof then we can extract from the proof a precisely, we generalize the interpolation method so that it small monotone circuit for separating U from V . Thus lower applies to any unsatisfiable family of formulas; we show bounds follow from the celebrated monotone circuit lower that proving superpolynomial size lower bounds for any bounds for clique [15], [16]. formula in the CC proof system amounts to proving a Despite the success of interpolation, it has been quite monotone circuit lower bound for certain yes/no instances limited since it only applies to “split” formulas. In particular, of the monotone CSP problem. Applying this equivalence the only family of formulas which are known to be hard to random k-SAT instances, we reduce the problem to that for (unrestricted) Cutting Planes are the clique-coclique of proving a monotone circuit lower bound for a specific formulas. In contrast, for Resolution the width measure is family of yes/no instances of the monotone CSP problem. a nice combinatorial property that characterizes Resolution We then apply the symmetric method of approximations in proof size [7], [17]; we would similarly like to understand order to prove exponential monotone circuit lower bounds the strength of Cutting Planes with respect to arbitrary for our monotone CSP problem. formulas and most notably for random k-SAT formulas and It is natural to wonder whether or not the new lower bound Tseitin formulas. technique could be pushed to obtain lower bounds for k-SAT Our main equivalence is an adaptation of the earlier work instances when k is constant. By being a bit more careful, combined with a key between search problems and one can obtain superpolynomial lower bounds when k  monotone functions established in [18]. With this reduction log log n, but when k =Θ(1)the symmetric method of in hand, our main proof is very similar to both [13] and approximations fails to give superpolynomial lower bounds [19]. [13] proved this equivalence for the special case of on the CSP problem. Thus, it appears that we will not be the clique-coclique formulas. Namely they showed that low- able to push the lower bounds any further without improving weight CC-proofs for this particular formula are equivalent

110 to monotone circuits for the corresponding sets U, V . Our • there exists jmthere exists j, k < i such that Lj,Lk x is a minterm if f(x)=1but f(x )=0for any x obtained Li. by flipping a single bit of x from 1 to 0. More generally, if The length of the refutation is . f(x)=1we call x an accepting instance or a yes instance, We will be particularly interested in semantic refutations while if f(x)=0then we call x a rejecting instance or a where the boolean functions can be computed by short no instance.Ifx is any yes instance of f and y is any no communication protocols. instance of f then there exists an index i ∈ [n] such that xi = ≤ 1,yi =0, as otherwise we would have x y, contradicting Definition II.4. Let F = C1 ∧ ... ∧ Cm be an unsat- { }n →{ } the fact that f is monotone. If f,g,h : 0, 1 0, 1 are isfiable CNF on n = n1 + n2 variables, and let X =  { } { } boolean functions on the same domain then f,g h if for x1,x2,...,xn1 , Y = y1,...,yn2 be a partition of the ∈{ }n ∧ ⇒ all x 0, 1 we have f(x) g(x)= h(x). variables. A CCd-refutation of F with respect to the partition A monotone circuit is a circuit in which the only gates are (X, Y ) is a semantic refutation L1,...,L of F such that ∧ ∨ or gates. A real monotone circuit is a circuit in which each function Li in the proof can be computed by a d-bit each internal gate has two inputs and computes any function communication protocol with respect to the partition (X, Y ). φ(x, y):R2 → R which is monotone nondecreasing in its T T ≥ arguments. Since any linear integral inequality a x + b y c with polynomially bounded weights can be evaluated by a trivial Definition II.1. A linear integral inequality in variables x = O(log n)-bit communication protocol (just by having Alice ∈ Zn T (x1,...,xn) with coefficients a =(a1,...,an) and evaluating a x and sending the result to Bob), it follows ∈ Z T ≥ CC constant term c is an expression a x c. that low-weight cutting planes proofs are also O(log n)- Definition II.2. Given a system of linear integral inequali- proofs. We can similarly define a proof system which can ties Ax ≥ b, where A ∈ Zm×n and b ∈ Zm,acutting planes simulate any cutting planes proof by strengthening the type proof of an inequality aT x ≥ c is a sequence of inequalities of communication protocol. T ≥ T ≥ T ≥ a1 x c1,a2 x c2,...,a x c, such that a = a, Definition II.5. A d-round real communication protocol is ∈ c = c and every inequality i [] satisfies either a communication protocol between two players, Alice and T • ai x ≥ ci appears in Ax ≥ b, Bob, where Alice receives x ∈X and Bob receives y ∈Y. T • ai x ≥ ci is a Boolean axiom, i.e., xj ≥ 0 or −xj ≥ In each round, Alice and Bob each send real numbers α, β −1 for some j, to a “referee”, who responds with a single bit b which is 1 if T • there exists j, k < i such that ai x ≥ ci is the sum of α ≤ β and 0 otherwise. After d rounds of communication, T T the linear inequalities aj x ≥ cj and ak x ≥ ck, the players output a bit b. The protocol computes a function

111 F : X×Y→{0, 1} if for all (x, y) ∈X×Y the protocol and alphabet Σ is defined as follows. The vertices in L are outputs F (x, y). thought of as the set of constraints, and the vertices in R are ∈ F ∧ ∧ thought of as a set of variables; thus for each vertex i L Definition II.6. Let = C1 ... Cm be an unsatisfiable vars { } we let (i) denote the neighbourhood of i. For each vertex CNF on n = n1 + n2 variables X = x1,...,xn1 and i ∈ L the CSP has an associated boolean function TTi : Y = {y ,...y }.AnRCC -refutation of F is a semantic vars 1 n2 d Σ (i) →{0, 1} called the truth table of i that encodes the refutation L ,L ,...,L in which each function L can be 1 2 i set of “satisfying” assignments to the constraint associated computed by a d-round real communication protocol with with i. An assignment α ∈ Σn, thought of as a Σ-valued respect to the variable partition (X, Y ). assignment to the variables R, satisfies the CSP H if for T T It is clear that any linear integral inequality a x+b y ≥ c each i ∈ L we have TTi(α  vars(i)) = 1, otherwise the can be evaluated by a 1-round real communication protocol, assignment falsifies the CSP. and so it follows that a cutting planes refutation of F is For each i ∈ [m] and α ∈ Σvars(i) we abuse notation and also an RCC1-refutation of F. We record each of these let TTi(α) represent the boolean variable corresponding to observations in the next proposition. this entry of the truth table for the constraint i. Proposition II.1. Let F be an unsatisfiable CNF on vari- Definition II.10. Let H =(L ∪ R, E) be a bipartite graph ∈ ables z1,z2,...,zn, and let X, Y be any partition of the such that each vertex i L has degree at most k, and variables into two sets. Any length- low-weight cutting let m = |L| and n = |R|. We think of H as encoding F CC planes refutation of is a length- O(log n)-refutation the topology of a constraint satisfaction problem, where of F. Similarly, any length- cutting planes refutation of F each vertex i ∈ L represents a constraint of the CSP and ∈ is a length- RCC1-refutation of F. each i R represents a variable of the CSP. Let Σ be a m | |vars(i) ≤ | |k finite alphabet, and let N = i=1 Σ m Σ . The A. Total Search Problems and Monotone CSP-SAT N monotone function CSP-SATH,Σ : {0, 1} →{0, 1} is In this section we review the equivalence between the defined as follows. An input x ∈{0, 1}N encodes a CSP search problem associated with an unsatisfiable CNF for- H(x) by specifying for each vertex i ∈ L its truth table x vars(i) →{ } ∈{ }N mula, and the Karchmer-Wigderson (KW) search problem TTi :Σ 0, 1 . Given an assignment x 0, 1 for a related (partial) monotone function. the function CSP-SATH,Σ(x)=1if and only if the CSP H(x) is satisfiable. This function is clearly monotone Definition II.7. Let n ,n ,m be positive integers, and let 1 2 since for any x, y ∈{0, 1}N with x ≤ y, any satisfying X , Y be finite sets. A total search problem is a relation assignment for the CSP H(x) is also a satisfying assignment R⊆Xn1 ×Yn2 × [m] where for each (x, y) ∈Xn1 ×Yn2 , for the CSP H(y). there is an i ∈ [m] such that R(x, y, i)=1. We refer to x ∈Xn1 as Alice’s input and y ∈Yn2 as Bob’s input. The Next we show how to relate k-local total search problems search problem is k-local if for each i ∈ [m] we have that and the CSP-SAT problem. Let R⊆Xn1 ×Yn2 × [m] R(∗, ∗,i) depends on a fixed set of at most k coordinates be a k-local total search problem. Associated with R is a of x (it may depend on any number of y coordinates). bipartite constraint graph HR encoding for each i ∈ [m] the coordinates in X n1 on which R(∗, ∗,i) depends. Formally, A standard example of a k-local search problem is the the constraint graph is the bipartite graph HR =(L ∪ R, E) search problem associated with unsatisfiable k-CNFs. with L =[m], |R| =[n1], and for each pair (i, j) ∈ L × R Definition II.8. Let F be an unsatisfiable k-CNF formula we add the edge if R(∗, ∗,i) depends on the variable xj. ∈ with m clauses and n variables z1,...,zn, which are par- Note that each vertex u L has degree at most k, since the titioned into two sets x1,x2,...,xn1 and y1,y2,...,yn2 . original search problem is k-local. The search problem Search(F) with respect to this partition Given R and its corresponding constraint graph we can takes as input an assignment x ∈{0, 1}n1 and y ∈{0, 1}n2 give a natural way to construct accepting and rejecting X n1 Yn2 and outputs the index i ∈ [m] of a violated clause under this instances of CSP-SATHR,X from and . To reduce assignment. clutter, given a k-local total search problem R we abuse notation and write CSP-SATR := CSP-SAT X . This problem is clearly k-local since each clause can HR, Accepting Instances U. For any x ∈Xn1 we construct an contain at most k variables from x1,x2,...,xn1 . Associated accepting input U(x) of CSP-SATR as follows. For with this search problem is the following monotone variant each vertex i ∈ L we define the corresponding truth of the constraint satisfaction problem. table TTi by setting TTi(α)=1if x  vars(i)=α Definition II.9. Let H =(L ∪ R, E) be a bipartite graph and TTi(α)=0otherwise. such that each vertex v ∈ L has degree at most k, and Rejecting Instances V. For any y ∈Yn2 we construct a let m = |L| and n = |R|. Let Σ be a finite alphabet. A rejecting input V(y) of CSP-SATR as follows. For constraint satisfaction problem (CSP) H with topology H each vertex i ∈ L and each α ∈ Σvars(i) we set

112 TTi(α)=0iff R(α, y, i) holds. to first running the communication protocol for L1 and then Given x ∈Xn1 it is easy to see that U(x) is a satisfying running the communication protocol for L2. This will give assignment for CSP-SATR since x is a satisfying assign- us a height 2d (full) binary tree, T , where the top part is the ment for the corresponding CSP. The rejecting instances communication protocol tree for L1, with protocol trees for require a bit more thought. Let y ∈Yn2 and consider the L2 hanging off of each of the leaves. We label each of the {CL1 CL2 } rejecting instance V(y) as defined above. Suppose by way leaves of this stacked tree with a circuit from h , h as of contradiction that the corresponding CSP HR(V(y)) is follows. Consider a path labelled h1h2 in T , where h1 is the satisfiable, and let x ∈Xn1 be the satisfying assignment for history from running L1 and h2 is the history from running ∩ the CSP. It follows by definition of the rejecting instances L2. By soundness, either the rectangle RL(h) RL1 (h1) is ∩ that R(x, y, u) does not hold for any u, implying that R is 0-monochromatic, or the rectangle RL(h) RL2 (h2) is 0- not total. monochromatic. In the first case, we will label this leaf with CL1 CL2 h1 and otherwise we will label this leaf with h2 .Nowwe III. RELATING PROOFS AND CIRCUITS will label the internal vertices of the stacked tree with a gate: In this section we relate CCd-proofs and monotone cir- if a node corresponds to Alice speaking, then we label the cuits, as well as RCC1-proofs and real monotone circuits. node with an ∨ gate, and otherwise if the node corresponds to Bob speaking, then we label the node with an ∧ gate. The Theorem III.1. Let F be an unsatisfiable CNF formula on resulting circuit for this history h has size 22d plus the sizes n variables and let X = {x ,...,x }, Y = {y ,...,y } 1 n1 1 n2 of the subcircuits, and thus performing the construction for be any partition of the variables. Let d be a positive each of the 2d histories increases circuit size by factor of of integer. If there is a CC refutation of F with respect to d 23d. With this, the theorem is immediately implied by the the partition (X, Y ) of length , then there is a mono- following claim. tone circuit separating the accepting and rejecting in- Claim. The circuit resulting from the above construction U { }n1 V { }n2 stances ( 0, 1 ), ( 0, 1 ) of CSP-SATSearch(F) of 3d satisfies: for each line L in P , and for each good history h size O(2 ). CL ∈ for L, h will be correct for all (x, y) RL(h). Proof: Let F = C1 ∧ ... ∧ Cm over variables Proof of Claim. If L is an axiom, then L is a clause, Ci. The CC F x1,...,xn1 ,y1,...,yn2 . Let P be a d-proof for with  communication protocol for Ci is a two-bit protocol where lines. Order the lines in P as L1,L2,...,L, where each line Alice and Bob each send 0 iff their part of Ci evalutes to 0. is either a clause, or follows semantically from two earlier There is only one good (0-monochromatic) history, h =00. lines. If (x, y) ∈ RL(h) then Ci(x, y)=0by definition. Let α =  vars We build the circuit for CSP-SATSearch(F) that separates x (Ci). In our construction the circuit corresponding U V CL , by induction on . For each line L in the proof, to h is labelled by the variable TTi(α), and it is easy to d there are 2 possible histories h, each with an associated check that U(x) sets TTi(α) to true, and V(y) sets TTi(α) monochromatic rectangle RL(h). A rectangle h is good for to false. L if it is 0-monochromatic. For every line L and each good If L is not an axiom, then we will prove the lemma by CL history h for L, we will build a circuit h that correctly proving the following stronger statement by induction: for “separates” x and y for each (x, y) ∈ RL(h). By this, each line L (derived from previous lines L1 and L2), and CL U we mean that the circuit h outputs 1 on (x) (the 1- for each node v in the stacked protocol tree for L, with V CL input associated with x) and outputs 0 on (y) (the 0-input corresponding (sub)history h = h1h2, the subcircuit h associated with y). associated with vertex v is correct on all (x, y) ∈ RL(h) ∩ ∩ For each leaf in the proof, the associated line L is a clause RL1 (h1) RL2 (h2). Ci of F. The communication protocol for Ci is a two-bit Fix a line L that is not an axiom. For the base case, protocol where Alice/Bob each send 0 iff their inputs are suppose that v is a leaf of the stacked protocol tree for α, β such that Ci(α, β)=0. Thus there is only one good L with history h = h1h2. Then by soundness either (i) ∩ ∩ (0-monochromatic) rectangle with history h =00. This pair RL(h) RL1 (h1) is 0-monochromatic or (ii) RL(h) α, β corresponds to the variable TTi(α), and we define the RL2 (h2) is 0-monochromatic. In case (i) we labelled v by CL CL1 ∩ circuit h corresponding to line L = Ci and good history h1 . Since RL(h) RL1 (h1) is 0-monochromatic, RL1 (h1) CL1 h =00to be the variable TTi(α). is 0-monochromatic and therefore h1 is defined and is ∈ Now suppose that L is derived from L1 and L2, and correct on all (x, y) RL1 (h1), so it is correct on all CL1 CL2 ∈ ∩ ∩ inductively we have circuits h , h for each history h (x, y) RL(h) RL1 (h1) RL2 (h2). A similar argument good for L1 and h good for L2. Given a good history h for holds in case (ii). CL L, we will show how to build the circuit h . It will use all of For the inductive step, let v be a nonleaf node in the {CL1 CL2 } the circuits that were built for L1 and L2 ( h , h for all protocol tree with history h and assume that Alice owns d CL ∩ ∩ × good h and h ) and an additional 2 gates. To build h we v. The rectangle RL(h) RL1 (h1) RL2 (h2)=A B is will construct a stacked protocol tree for L, corresponding partitioned into A0 × B and A1 × B, where

113 ∪ C V 1) A = A0 A1, g2 ( (y)) = 0, then they output 0, and otherwise they 2) A0 × B is the rectangle with history h 0, output 1. This is a 2-bit protocol, with Alice sending one bit 3) A1 × B is the rectangle with history h 1. to report whether or not condition (i) is satisfied, and Bob ∈ ∩ ∩ CL sending one bit to report if (ii) is satisfied. Given (x, y) RL(h) RL1 (h1) RL2 (h2), since h0 is ∈ × CL correct on all (x, y) A0 B and h1 is correct on all Now, we want to show that for all (x, y) such that L L L C U C V (x, y) ∈ A1 × B, it follows that C = C ∨C is correct g( (x)) = 1 and g( (y)) = 0 we have that Rg(x, y)=0. h h 0 h 1 ∨ C U on all (x, y) ∈ A × B. To see this, observe that if x ∈ A0, This is easy — since g = g1 g2 we have that g( (x)) = 1 CL U C V C U then h0( (x)) = 1 and therefore and g( (y)=0implies that either g1 ( (x)) = 1 or C (U(x)) = 1 and C (V(y)) = 0 and C (V(y)) = 0, CL U CL U ∨CL U g2 g1 g2 h ( (x)) = h0( (x)) h1( (x)) = 1. implying that the protocol will output 0 on (x, y) by defini- ∈ CL U tion. Similarly, if x A1, then h1( (x)) = 1 and therefore Similarly, if g is an AND gate, then again Alice pri- L L L C U C U ∨C U C U C U h ( (x)) = h 0( (x)) h 1( (x)) = 1. vately simulates g1 ( (x)) and g2 ( (x)) and Bob privately C V C V C U ∈ CL V CL V simulates g2 ( (y)) and g2 ( (y)). If (i) g1 ( (x)) = 1 Finally if y B then both h0( (y)) = h1( (y)) = 0 C U C V and g2 ( (x)) = 1 and (ii) either g2 ( (y)) = 0 or and therefore C V g2 ( (y)) = 0, then they ouput 0, and otherwise they output CL V CL V ∨CL V h ( (y)) = h0( (y)) h1( (y)) = 0. 1. By an analogous argument to the OR case, it’s easy to see that the protocol will output 0 whenever Cg(U(x)) = 1 A similar argument holds if v is an internal node in the and Cg(V(y)) = 0. protocol tree that Bob owns (and is therefore labelled by an The next theorem relates RCC1 proofs and real monotone AND gate. circuits. It follows from a recent simulation given by [20]. The converse direction is much easier. (The proof is in the Appendix.) Theorem III.2. If there is a monotone circuit separating Theorem III.3. Let F be an unsatisfiable CNF formula on the inputs of CSP-SATSearch F of size , then there is a ( ) n variables and let X = {x ,...,x }, Y = {y ,...,y } CC -refutation of F of length  with respect to this variable 1 n1 1 n2 2 be any partition of the variables. If there is a RCC partition. 1 refutation of F with respect to the partition (X, Y ) of Proof: We show that from a small monotone circuit length , then there is a monotone real circuit separating the C U { }n1 U { }n1 V { }n2 for CSP-SATSearch(C) that separates ( 0, 1 ) and accepting and rejecting instances ( 0, 1 ), ( 0, 1 ) n2 V({0, 1} ), we can construct a small CC2-proof for F, of CSP-SATSearch(F) of size . Conversely, a real monotone n1 n2 where Alice gets x ∈{0, 1} and Bob gets y ∈{0, 1} . circuit separating the inputs of CSP-SATSearch(F) implies a The lines/vertices of the refutation will be in 1-1 corre- RCC1 refutation of F of the same size. spondence with the gates of C. The protocol is constructed In particular, the above theorem implies that for any inductively from the leaves of C to the root. For a gate g family of formulas F and for any partition of the un- of C, let U be those inputs u ∈U({0, 1}n1 ) such that g derlying variables into X, Y , a Cutting Planes refutation g(u)=1, and let V be those inputs v ∈V({0, 1}n2 ) g of F of size S implies a similar size monotone real such that g(v)=0. At each gate g we will prove that for circuit for separating the accepting and rejecting instances every pair (u, v) ∈ U × V and for every (x, y) such that g g U { }n1 V { }n2 ( 0, 1 ), ( 0, 1 ) of CSP-SATSearch(F). u = U(x),v = V(y), the protocol Rg on input (x, y) will output 0. Since the output gate of C is correct for all pairs, this will achieve our desired protocol. IV. LOWER BOUNDS FOR RANDOM CNFS At a leaf  labeled by some variable TTj(α), the pairs associated with this leaf must have TTj(α)=1in u and 0 In this section we use Theorem III.3 to prove lower RCC in v, and thus we can define R(x, y) to be 0 if and only bounds for 1-refutations (and therefore Cutting Planes if x is consistent with α and the clause Cj evaluates to refutations) of uniformly random k-CNFs with sufficient false on (x, y). This is a 2-bit protocol, and by definition of clause density. the accepting and rejecting instances we have for all (x, y) Definition IV.1. Let F(m, n, k) denote the distribution of satisfying u = U(x),v = V(y) that x  vars(j)=α and random k-CNFs on n variables obtained by sampling m R holds. n (α, y, j) clauses (out of the 2k possible clauses) uniformly at C k Now suppose that g is an OR gate of , with inputs g1,g2. random. The protocol Rg on (x, y) is as follows. Alice privately C U C U simulates g1 ( (x)) and g2 ( (x)), and Bob simulates The proof is delayed to Section IV-B; to get a feeling for C V C V C U g1 ( (y)) and g2 ( (y)). If (i) either g1 ( (x)) = 1 the argument, we first prove an easier lower bound for a C U C V or g2 ( (x)) = 1 and (ii) both g1 ( (y)) = 0 and simpler distribution of balanced random CNFs.

114 A. Balanced Random CNFs with high probability every RCC1-refutation (and therefore, Cutting Planes refutation) of F has at least 2Ω(˜ n) lines. Definition IV.2. Let X = {x1,...,xn} and Y = { } y1,...,yn be two disjoint sets of variables, and let Proof: Immediate consequence of Theorems III.3 and F ⊗2 (m, n, k) denote the following distribution over 2k- IV.3. F 1 1∧ 1∧· · ·∧ 1 F CNFs: first sample = C1 C2 Cm from (m, n, k) The proof of Theorem IV.3 comes down to the essential F 2 2 ∧ 2 ∧···∧ 2 on the X variables, and then = C1 C2 Cm from property that random k-CNFs are good expanders. The F (m, n, k) on the Y variables independently. Then output next lemma records the expansion properties we require of F 1 ∨ 2 ∧ 1 ∨ 2 ∧···∧ 1 ∨ 2 =(C1 C1 ) (C2 C2 ) (Cm Cm). random CNFs; the proof is adapted from the notes of Salil This distribution shares the well-known property with Vadhan [25]. The lemma is stated in general terms for re-use F(m, n, k) that dense enough formulas are unsatisfiable in the next section. with high probability. Lemma IV.5. Let n be any sufficiently large positive integer. F∼F Lemma IV.1. Let c>2/ log e and let n be any positive Let k, m be positive integers and sample (m, n, k). ⊗ ≤ 2 ⊆F integer. If k ∈ [n] and m ≥ cn22k then F∼F(m, n, k) 2 Let s n/ek be a positive integer. For any subset S of vars is unsatisfiable with high probability. clauses let (S) denote the subset  of variables appearing ≤ · k k in clauses S.Iflog m δ 2 log 2 for some 0 <δ<1, Proof: Fix any assignment (x, y) to the variables of then every set S ⊆Fof size s satisfies |vars(S)|≥ks/2 F − − . The probability that the ith clause is satisfied by the with probability at least 1 − 2 (1 δ)(ks/2) log(k/2). joint assignment is 1 − 1/22k, and so the probability ⊆F that all clauses are satisfied by the joint assignment is Proof: Fix any set S of size s, and for each clause − 2k ∈ (1 − 1/22k)m ≤ e m/2 , since the clauses are sampled C S sample the variables in C one at a time without independently. By the union bound, the probability that replacement. Let v1,v2,...,vks denote the concatenation of ∈ some joint assignment satisfies the formula is at most all sequences of sampled variables over all C S. We say 2k 2k 22ne−m/2 =22n−(log e)m/2 ≤ 22n−(log e)cn ≤ 2−Ω(n). that variable vi is a repeat if it has already occurred among |vars | Thus, the probability that the formula is unsatisfiable is at v1,...,vi−1. In order for (S) 2/ log e is some constant. Let F∼F(m, n, k) 2 with i U(x)=U(x) for some x = x only if there exists a clause variable partition (X, Y ), and let U = U({0, 1}X ),V = C which does not contain some x variable. However, it is V({0, 1}Y ). Then with high probability any real monotone ˜ easy to see that every x variable participates in some clause, circuit separating U and V has at least 2Ω(n) gates. and thus U is 1-1. Corollary IV.4. Let n be a sufficiently large positive integer, This implies that U is one-to-one and thus |U| =2n with and let k =4logn, m = n6.IfF∼F(m, n, k)⊗2 then high probability.

115 | | n Recall that the 0-inputs of CSP-SATSearch(F) correspond V =2 . It remains to bound the terms A1(1,U),A1(r, U ), to substituting Y -assignment into F and writing out truth and A0(s, V ). tables of all the clauses. The truth tables corresponding Bounding A1(1,U). Fixing a single bit of a 1-input in U to the clauses that were satisfied by the Y -assignment are to CSP-SATSearch(F) to 1 is the same as selecting a vertex identically 1, and the truth tables corresponding to the C in the bipartite constraint graph of Search(F) and an clauses that were not satisfied by the given Y -assignment assignment α to the variables which participate in C, and contain exactly one 0-entry. Given a Y -assignment we call then setting TTC (α)=1. By the definition of U, for any the set of clauses that were not satisfied by the Y assignment input x ∈{0, 1}n, fixing this bit to 1 determines exactly k the profile of Y . The next lemma implies that the profiles out of the n variables of x. Thus the number of x ∈{0, 1}n of all Y -assignments are distinct with high probability. that are consistent with this partial assignment is 2n−k, and since U is one-to-one, we have A (1,U)=2n−k. Lemma IV.6. Let n, m, k be positive integers. Let F∼ 1 Bounding A (r, U ). Similar to the previous bound, but now F(m, n, k), let S⊆{0, 1}n be a collection of boolean 1 |S| we fix r of the truth table bits to 1. By definition of U, assignments, and define the following 2 × m matrix M, these bits must be chosen from r distinct truth tables in with the rows labelled by assignments α ∈Sand the the 1-input in order to be consistent with any x ∈{0, 1}n. columns labelled by clauses of F. Namely, for any pair (α, i) With respect to the underlying CNF F, this corresponds to set fixing an assignment to the set of variables appearing in an 1 if the ith clause is not satisfied by α, arbitrary set S of r clauses in F. By Lemma IV.5, with high M[α, i]= 0 otherwise. probability we have |vars(S)|≥rk/2. Thus fixing these r bits in the definition of A1(r, U ) corresponds to setting If log |S| 5/4. By a union bound over ordered pairs of by a large collection of assignments. First we introduce our assignments in S, the probability that there exists a pair of − k notion of “imbalanced” clauses. rows that agree on all columns is at most |S|22 5km/4n2 ≤ k k 22log|S|−5km/4n2 ≤ 2−km/n2 . Definition IV.3. Fix >0. Given a partition of n variables In our current setting we have S = {0, 1}n and into x-variables and y-variables, clause C is called X-heavy km/n2k ≥ n log n, thus applying the previous lemma if it contains more than (1 − )kx-variables. Clause C is yields that all rows of M are distinct with high probability. called Y -heavy if it contains more than (1−)ky-variables. Since each profile is distinct with high probability, this Clause C is called balanced if it is neither X-heavy nor Y - implies that V is 1-1 with high probability, and therefore heavy.

116 We recall some basic facts from probability theory which and   will be used in our main lemma. εk H(ε)k −3k/4 k −k ≥ −k 2 ≥ 2 √ E Pr(Zi =1)= 2 2 Lemma IV.7 (Lovasz´ Local Lemma). Let = j 8kε(1 − ε) k { } j=0 E1,...,En be a finite set of events in the probability ∈E space Ω.ForE let Γ(E) denote the set of events Ei on since 1/4 3m /2) ≤ Pr(Z>3E[Z]/2)   εn ≤ exp(−E[Z]/12) 2H(ε)n n √ ≤ ≤ 2H(ε)n, ≤ − −3k/4 − j exp( m2 /12 k). 8nε(1 ε) j=0 Since m = n22k and k = 160 log n this occurs with high − − − − where H(ε)= ε log ε (1 ε)log(1 ε) is the binary probability. An identical calculation applies to the Y -heavy entropy function. clauses. It follows by a union bound that there exists a Fact IV.9 (Multiplicative Chernoff Bound). Suppose partition satisfying both of the above properties. (3) Fix the partition (X, Y ) satisfying the properties (1) and Z1,...,Zn are independent random variables taking values in {0, 1}. Let Z denote their sum and let μ = E(Z) denote (2), and we show that the third property is also satisfied. F∼F the sum’s expected value. Then for any 0 <δ≤ 1 we have Sample (m, n, k). We first bound the number of times a variable appears in a heavy clause with the goal of − 2 − 2 Pr(Z ≥ (1+δ)μ) ≤ e δ μ/3, Pr(Z ≤ (1−δ)μ) ≤ e δ μ/3 applying the Lovasz´ Local Lemma. Arbitrarily fix z to be any of the n variables occurring We now prove the main lemma of this section, which as possible inputs to F. By Lemma IV.10, the number of shows that for F∼F(m, n, k) a good partition of the X-heavy and Y -heavy clauses are both bounded by 3m/2. variables exists with high probability. Let Zi be the indicator random variable which is 1iff the Lemma IV.10. Let ε =1/20, and let n be a positive integer. variable z occurs in the ith heavy clause and let Z = i Zi. F∼F Let k = 160 log n, let m = n22k, and let m = m2−k/2. Since (m, n, k) we have Pr(Zi =1)=k/n and E Let F be any k-CNF with m clauses on n variables. There so [Z]=3km /2n. Applying the Chernoff bound we get E − exists a partition of the variables of F into two sets (X, Y ) Pr(Z>3km /n)=Pr(Z>2 [Z]) < exp( 3km /12n). Taking a union bound over the n variables, we conclude such that the following holds: ± that each variable occurs in at most 3km /n X-heavy and 1) The number of variables in X is n/2 o(n). Y -heavy clauses with high probability. 2) The number of X-heavy clauses and Y -heavy clauses Now, consider selecting a random assignment to the X are each upper bounded by 3m /2. variables. Let E be the event that the ith X-heavy clause 3) If F∼F(m, n, k), then with high probability there i |X|−(log(e)n/60k) is falsified by the random assignment, and observe that exists a set A of 2 truth assignments −(1−ε)k Pr(Ei) ≤ 2 since the clause is X-heavy. The number to the X variables that satisfy all X-heavy clauses, of events E is at most 3m/2, and for any event E the and a set B of 2|Y |−(log(e)n/60k) truth assignments to i i number of events that share any X variable with Ei is at the Y -variables satisfying all of the Y -heavy clauses. 2 most 3m k /n. Set q = n/90m k. Then for each Ei we Proof: We prove the existence of such a partition by have 2 the probabilistic method. For each variable, flip a fair coin | i | − n − − − q(1−q) Γ(E ) ≥ qe 6qm k /n ≥ e k/15 ≥ 2 (1 )k, and place it in X if the coin is heads and in Y otherwise. 90km (1) We have E[|X|]=n/2 and since each variable is placed which holds for ε =1/20 and k = 160 log n. Applying in X independently with probability 1/2 we have Pr[|X − Lovasz´ Local Lemma (see Lemma IV.7) we get that the n/2| >n2/3] ≤ 2exp(−n1/3/6) by a Chernoff bound. probability that an assignment satisfies all X-heavy clauses F (2) For each clause Ci in let Zi be the random variable is at least indicating whether this clause is X-heavy. Using both in- − 3m /2 ≥ − 3m /2 ≥ −n/(60k) equalities in Fact IV.8 we have that (1 q) (1 n/(90km )) e .   εk Thus the number of assignments to the X-variables satisfy- k − − − | | Pr(Z =1)= 2 k ≤ 2 k2H(ε)k < 2 k/2 ing all heavy clauses is at least 2 X /en/60k, and an identical i j j=0 calculation applies to the Y variables.

117 With this lemma in place, we can proceed in more or and containing a single 0-entry in ρ). The original circuit less the same way that we proceeded in the last section. outputs 0 on z therefore, by monotonicity, it also outputs 0 Now we perform the whole argument with respect to U = on z ◦ ρ. This means that the derived circuit outputs 0 on U(A) and V = V(B) chosen from the previous lemma. z. This allows us to restrict our attention only to the balanced The rest of the proof proceeds identically to the proof of clauses, and the calculations from the previous section work Theorem IV.3 using U and V and counting with respect to mutatis mutandis since many clauses are balanced. the balanced clauses. It is easy to see that with high probabil- ity the m/2 balanced clauses contain all variables occurring Theorem IV.11. There exists a constant c>0 such that the in the formula, and this implies by the lemma that U is 1-1 following holds. Let n ≥ c be any positive integer. Let F∼ when restricted to A. Similarly, letting S = V = V(B),we F(m, n, k) for m = n22k and k = 160 log n. There exists a can apply Lemma IV.6 with respect to the m/2 balanced partition (X, Y ) of the variables of F and a δ>0 such that clauses. Since km/8n2k =(n/4) log n ≥ n/2 ± o(n)= the search problem Search(F) defined with respect to this log |S| for sufficiently large n this lemma implies that V is partition satisfies the following with high probability: any 1-1 on this set of inputs, and so V is also 1-1 when restricted real monotone circuit computing CSP-SATSearch F requires ( ) to B. Ω(˜ n) at least 2 gates. Finally we consider the expansion by applying Lemma Proof: Apply Lemma IV.10 to get a partition of the IV.5 with respect to the balanced clauses. By Lemma IV.10, variables (X, Y ), and let A, B denote the set of assignments each balanced clause contains at least k0 = k/20 variables to the X and Y variables, respectively. If z is an input from both X and Y . There are at least m/2 balanced clauses, and so to CSP-SATSearch(F), let z be z restricted to truth tables F − corresponding to balanced clauses of with respect to the log(m/2) = log n22k 1 = k +2logn − 1 = 162 log n − 1 partition (X, Y ); it follows from the lemma that there are   log n m − 3m ≥ m/2 balanced clauses for n sufficiently large. ≤ log(n)log Let U = {z | z ∈U(A)} and V = {z | z ∈V(B)}. Letting 2 F ⊆F be the formula containing only balanced clauses of ≤ · k0 k0 γ log F , then we can think of z as input to CSP-SATSearch(F ). 2 2 As in the previous section, we shall apply Theorem IV.2 to for sufficiently large n and some universal constant γ> 2 U and V . 0. We set s = n/2ek0; by Lemma IV.5 this implies Given a real monotone circuit separating U(A) and V(B), that each collection S of s balanced clauses satisfies we apply to it restriction ρ setting inputs (i.e. truth tables) |varsX (S)|, |varsY (S)|≥k0s/2 with high probability. Note corresponding to unbalanced clauses as follows: that we can apply the argument from Lemma IV.5 because ≥ • Truth table entries corresponding to an X-heavy clause conditioned on containing some fixed number k k/20 are all set to 1 except for the entry corresponding to of X-variables, the X-part of a clause is distributed exactly F | | the assignment falsifying the clause. according to (1, X ,k ). ≤ ≤ • Truth table entries corresponding to a Y -heavy clause Our choice of s implies that 2 log 2s 2logn k0/4 are all set to 1. since k0 = k/20 = 8 log n. Now we just follow the calculation at the end of Theorem IV.3 using our new We first claim that the circuit obtained from applying this estimates. This yields the following lower bound on the real restriction separates U and V . monotone circuit size of CSP-SATSearch F : Given z ∈U(A) there is a corresponding z ∈ U. ( ) Let z ◦ ρ denote the extension of z by ρ to an input to | | − |X|−log(e)n/60k−1 U (1 2sA1(1,U))) ≥ 2 | |− CSP-SATSearch(F). Thus, the derived circuit evaluated on z s+1 s+1 X sk0/2 (2s) A1(s, U) (2s) 2 is the same as the original circuit evaluated on z ◦ ρ. Since 0 − − − ≥ 2s(k /2 2log2s) log(e)n/60k 1 assignments in A satisfy all X-heavy clauses, it is easy to − − ˜ see that z ◦ρ ≥ z, i.e., z ◦ρ is z with some entries set to 1. ≥ 2sk0/4 log(e)n/1200k0 1 ≥ 2Ω(n). The original circuit output 1 on z, thus, by monotonicity, it also outputs 1 on z ◦ρ. This, in turn, means that the derived circuit outputs 1 on z. Corollary IV.12. Let F be distributed as above. There exists Now let z ∈V(B) and consider z ◦ ρ. Since assignments ε>0 such that with high probability any RCC1-refutation in B satisfy all Y -heavy clauses it is easy to see that z ◦ρ ≤ requires 2Ω(˜ n) lines. z, i.e., z ◦ ρ is z with some entries set to 0 (all truth tables corresponding to Y -heavy clauses are identically 1 both in V. C ONCLUSION z and z ◦ ρ; truth tables corresponding to X-heavy clauses The obvious problem left open by this paper is to prove are either the same in z as in ρ or are identically 1 in z lower bounds on other conjectured hard problems for Cutting

118 Planes: perhaps most important is improving the lower [10] P. Hrubesˇ and P. Pudak,´ “Random formulas, monotone cir- bounds for random k-SAT when k =Θ(1). It seems likely cuits, and interpolation,” ECCC TR17-042, 2017. that such lower bounds should hold for some (possibly large) CC [11] P. Pudlak,´ “Lower bounds for resolution and cutting constant k even for -proofs, however, as we discussed plane proofs and monotone computations,” J. Symb. Log., in the introduction it seems that the symmetric method vol. 62, no. 3, pp. 981–998, 1997. [Online]. Available: of approximations is incapable of obtaining strong lower http://dx.doi.org/10.2307/2275583 bounds for constant k. Another standard formula which [12] Y. Filmus, P. Hrubes,ˇ and M. Lauria, “Semantic is believed to be hard for Cutting Planes are the Tseitin versus syntactic cutting planes,” in Proc. of the tautologies (conjectured, for instance, in [24]). However, 33rd STACS, 2016, pp. 35:1–35:13. [Online]. Available: CC2 proofs admits linear-size refutations of the Tseitin graph http://dx.doi.org/10.4230/LIPIcs.STACS.2016.35 principles on any underlying graph — simply consider the lines as mod 2 linear equations and add the constraints, using [13] M. L. Bonet, T. Pitassi, and R. Raz, “Lower bounds for cutting planes proofs with small coefficients,” J. Symb. Log., the fact that each variable occurs in exactly two clauses. vol. 62, no. 3, pp. 708–728, 1997. [Online]. Available: Therefore our techniques cannot be directly applied to obtain http://dx.doi.org/10.2307/2275569 lower bounds for the Tseitin graph principles. [14] J. Kraj´ıcek, “Interpolation theorems, lower bounds for proof REFERENCES systems, and independence results for bounded arithmetic,” J. Symb. Log., vol. 62, no. 2, pp. 457–486, 1997. [Online]. [1] “Sat competition 2017,” http://baldur.iti.kit.edu/sat- Available: http://dx.doi.org/10.2307/2275541 competition-2017/index.php, accessed: 2017-08-2. [15] A. Razborov, “Lower bounds for the monotone complexity [2] U. Feige, “Relations between average case complexity of some boolean functions,” Sov. Math. Dokl., vol. 31, pp. and approximation complexity,” in Proc. of the 354–357, 1985. 34th STOC, 2002, pp. 534–543. [Online]. Available: [16] N. Alon and R. B. Boppana, “The monotone http://doi.acm.org/10.1145/509907.509985 circuit complexity of boolean functions,” Combinatorica, vol. 7, no. 1, pp. 1–22, 1987. [Online]. Available: [3] B. Barak, G. Kindler, and D. Steurer, “On the http://dx.doi.org/10.1007/BF02579196 optimality of semidefinite relaxations for average- case and generalized constraint satisfaction,” in Proc. [17] A. Atserias and V. Dalmau, “A combinatorial characterization of ITCS, 2013, pp. 197–214. [Online]. Available: of resolution width,” J. Comput. Syst. Sci., vol. 74, http://doi.acm.org/10.1145/2422436.2422460 no. 3, pp. 323–334, 2008. [Online]. Available: http://dx.doi.org/10.1016/j.jcss.2007.06.025 [4] J. Ding, A. Sly, and N. Sun, “Proof of the satisfiability conjecture for large k,” in Proc. of the [18] M. Go¨os¨ and T. Pitassi, “Communication lower 47th STOC, 2015, pp. 59–68. [Online]. Available: bounds via critical block sensitivity,” in Proc. of the http://doi.acm.org/10.1145/2746539.2746619 46th STOC, 2014, pp. 847–856. [Online]. Available: http://doi.acm.org/10.1145/2591796.2591838 [5] V. Chvatal´ and E. Szemeredi,´ “Many hard examples for resolution,” J. ACM, vol. 35, no. 4, pp. 759–768, 1988. [19] A. Razborov, “Unprovability of lower bounds on circuit [Online]. Available: http://doi.acm.org/10.1145/48014.48016 size in certain fragments of bounded arithmetic,” Izvestiya Mathematics, vol. 59, no. 1, pp. 205–227, 1995. [6] P. Beame, R. M. Karp, T. Pitassi, and M. E. Saks, “On the complexity of unsatisfiability proofs [20] P. Hrubesˇ and P. Pudlak,´ “A note on monotone real circuits,” for random k-CNF formulas,” in Proc. of the ECCC TR17-048, 2017. 13th STOC, 1998, pp. 561–571. [Online]. Available: http://doi.acm.org/10.1145/276698.276870 [21] D. Sokolov, “Dag-like communication and its applications,” ECCC TR16-202, 2017.

[7] E. Ben-Sasson and A. Wigderson, “Short proofs [22] C. Berg and S. Ulfberg, “Symmetric approximation are narrow - resolution made simple,” J. ACM, arguments for monotone lower bounds without sunflowers,” vol. 48, no. 2, pp. 149–169, 2001. [Online]. Available: Computational Complexity, vol. 8, no. 1, pp. 1–20, 1999. http://doi.acm.org/10.1145/375827.375835 [Online]. Available: http://dx.doi.org/10.1007/s000370050017

[8] E. Ben-Sasson and R. Impagliazzo, “Random CNFs are hard [23] A. Haken and S. A. Cook, “An exponential lower bound for the polynomial calculus,” Computational Complexity, for the size of monotone real circuits,” J. Comput. Syst. vol. 19, no. 4, pp. 501–519, 2010. [Online]. Available: Sci., vol. 58, no. 2, pp. 326–335, 1999. [Online]. Available: http://dx.doi.org/10.1007/s00037-010-0293-1 http://dx.doi.org/10.1006/jcss.1998.1617

[9] M. Alekhnovich, “Lower bounds for k-DNF res- [24] S. Jukna, Boolean Function Complexity - Advances and olution on random 3-CNFs,” in Proc. of the Frontiers, ser. Algorithms and combinatorics. Springer, 2012, 37th STOC, 2005, pp. 251–256. [Online]. Available: vol. 27. [Online]. Available: http://dx.doi.org/10.1007/978-3- http://doi.acm.org/10.1145/1060590.1060628 642-24508-4

119 [25] S. P. Vadhan, “Pseudorandomness,” Foundations and Let r be the root node of the dag. Since we started with a Trends in Theoretical Computer Science, vol. 7, valid RCC1 refutation of F we have Ar(x) >Br(y) for all no. 1-3, pp. 1–336, 2012. [Online]. Available: x and y. Therefore, f (U(x)) >f(V(y)) for all x and y. http://dx.doi.org/10.1561/0400000010 r r Modifying fr by composing it with an appropriately chosen VI. APPENDIX threshold function gives us the separating circuit. It is easy to see that f can be computed by a real In this appendix, we show how to prove Theorem III.3. v monotone gate with inputs f and f . First of all, the We could prove this theorem using the equivalence between u1 u2 value of f is determined by values of f and f , and a real analogue of Karchmer-Wigderson (KW) games and v u1 u2 secondly, increasing values of f and/or f increases the monotone real circuits, proven recently in [20]. This would u1 u2 feasible region of xs over which the maximum is taken in entail proving equivalence between RCC refutations and a 1 the definition of f . real analogue of KW games, which is a relatively simple v Thus, it is left to show that f (z) satisfies (2). We shall exercise. Instead, for the purpose of readability and self- v prove this by induction. The base case is given by (1). containment, we give a direct argument, which is essentially Inductive assumption (IA): suppose that we proved (2) for the reduction given by [20]. Theorem III.3 follows from the n1 children u1,u2 of v. Consider an arbitrary x ∈{0, 1} .By following two lemmas. U ≥ U ≥ IA, we have fu1 ( (x)) Au1 (x) and fu2 ( (x)) Au2 (x). Lemma VI.1. Let F be an unsatisfiable CNF formula on Thus, the region over which the max is taken in the definition { } { } U n variables and let X = x1,...,xn1 , Y = y1,...,yn2 of fv( (x)) is nonempty and contains x. It follows that n2 be any partition of the variables. If there is a RCC1 fv(U(x)) ≥ Av(x). Now, consider an arbitrary y ∈{0, 1} . refutation of F with respect to the partition (X, Y ) of Assume for contradiction that fv(V(y)) >Bv(y). Since length , then there is a monotone real circuit separating the Bv(y) ≥ 0,wehavefv(V(y)) = Av(x) for some x ∈ n1 n2 n1 accepting and rejecting instances U({0, 1} ), V({0, 1} ) {0, 1} . Thus we have Av(x) >Bv(y), and by soundness of CSP-SATSearch(F) with  gates. of the refutation it follows that either Au1 (x) >Bu1 (y) or A (x) >B (y). Assume without loss of generality Proof: Fix an RCC -refutation of F. With each node u2 u2 1 that A (x) >B (y). By definition of f (V(y)) we have v of the underlying (dag) associate u1 u1 v f (V(y)) ≥ A (x) >B (y). This contradicts the IA. two functions A : {0, 1}n1 → R and B : {0, 1}n2 → R u1 u1 u1 v v The above lemma proves the first part of Theorem III.3. that Alice and Bob use to communicate with the referee. We The following lemma proves the second part of the theorem. assume without loss of generality that the referee outputs 0 if and only if Av(x) >Bv(y), and furthermore, that Bv ≥ 0. Lemma VI.2. With the setting as in the previous Recall that each leaf in this dag is associated with a clause lemma, a real monotone circuit separating the inputs of RCC F Ci and let αi be the assignment to the X-variables that does CSP-SATSearch(F) implies a 1 refutation of of the not satisfy the X-part of Ci. Note: we may assume that if same size. v is a leaf then Proof: The RCC1 refutation that we shall construct will U(x) V(y) have the exact same topology as the given real monotone Av(x)=TTi (αi) and Bv(y)=TTi (αi). (1) circuit. Turn each input variable TTi(α) of the circuit into Next, we convert the given dag to the real circuit separat- the corresponding clause Ci in the refutation. Turn each gate U { }n1 V { }n2 ing ( 0, 1 ) from ( 0, 1 ) as follows. The topology v in the circuit into the line in the refutation computed by of the derived circuit is exactly the same as that of the the following RCC1 protocol. On input x, Alice privately dag. Thus, to finish specifying the circuit we need to label runs the circuit on U(x) and sends the value Av computed inputs to the circuit and label the internal nodes by real by the circuit at gate v to the referee. On input y, Bob acts monotone gates. Each leaf labeled by clause Ci in the dag analogously — he simulates the circuit privately on input turns into an input variable to the circuit labeled by TT (α ). i i V(y) and sends the value Bv computed by the circuit at With each internal node v of the dag with children u1 gate v to the referee. The referee outputs 0 if and only if and u we associate the function f defined recursively 2 v Av >Bv. Since at the top gate the circuit is identically 1 n { | ≥ as follows: fv(z)=maxx∈{0,1} 1 Av(x) fu1 (z) on U(x) and 0 on V(y), the referee always outputs 0 at the ∧ ≥ } Au1 (x) fu2 (z) Au2 (x) . We define fv(z) to be 0 if last line in the refutation. Thus, the only thing left to see is the set on the right-hand side is empty. We claim that these that the refutation is sound. Let u1 and u2 be the children functions can be computed by real monotone gates and for of v, then Av = f(Au1 ,Au2 ) and Bv = f(Bu1 ,Bu2 ) for every x ∈{0, 1}n1 and every y ∈{0, 1}n2 we have some monotone function f. Thus, if Av >Bv then either

Au1 >Bu1 or Au2 >Bu2 . fv(U(x)) ≥ Av(x) and fv(V(y)) ≤ Bv(y). (2)

First, let’s see how the above properties of fv imply that the constructed circuit separates U({0, 1}n1 ) from V({0, 1}n2 ).

120