<<

2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS)

On Exponential-Time Hypotheses, Derandomization, and Circuit Lower Bounds [Extended Abstract]

Lijie Chen∗, Ron D. Rothblum†, Roei Tell‡, Eylon Yogev§ ∗Massachusetts Institute of Technology. Email: [email protected] †Technion. Email: [email protected] ‡Massachusetts Institute of Technology. Email: [email protected] §Boston University and Tel Aviv University. Email: [email protected]

Abstract—The Exponential-Time Hypothesis (ETH) is a tures that 3-SAT with n variables and m = O(n) clauses strengthening of the P= NP conjecture, stating that 3-SAT ·n ·n cannot be deterministically solved in time less than 2 , for on n variables cannot be solved in (uniform) time 2 ,for > ETH > some constant = m/n 0. The may be viewed some 0. In recent years, analogous hypotheses that P NP are “exponentially-strong” forms of other classical complexity as an “exponentially-strong” version of = , since it conjectures (such as NP BPP or coNP NP) have also conjectures that a specific NP-complete problem requires been introduced, and have become widely influential. essentially exponential time to solve. In this work, we focus on the interaction of exponential-time Since the introduction of ETH many related variants, hypotheses with the fundamental and closely-related questions which are also “exponentially-strong” versions of classical of derandomization and circuit lower bounds. We show that even relatively-mild variants of exponential-time hypotheses have complexity-theoretic conjectures, have also been introduced. far-reaching implications to derandomization, circuit lower For example, the Randomized Exponential-Time Hypoth- bounds, and the connections between the two. Specifically, we esis (rETH), introduced in [4], conjectures that the same prove that: lower bound holds also for probabilistic algorithms (i.e., it is 1) The Randomized Exponential-Time Hypothesis a strong version of NP ⊆BPP). The Non-Deterministic (rETH) BPP implies that can be simulated on “average- Exponential-Time Hypothesis (NETH), introduced (im- case” in deterministic (nearly-)polynomial-time (i.e., in O˜ n n O(1) plicitly) in [5], conjectures that co-3SAT (with n variables time 2 (log( )) = nloglog( ) ). The derandomization and O(n) clauses) cannot be solved by non-deterministic relies on a conditional construction of a pseudorandom ·n generator with near-exponential stretch (i.e., with seed machines running in time 2 for some constant >0 length O˜(log(n))); this significantly improves the state- (i.e., it is a strong version of coNP ⊆NP). The variations of-the-art in uniform “hardness-to-randomness” results, MAETH and AMETH are defined analogously (see [6]1), which previously only yielded pseudorandom generators and other variations conjecture similar lower bounds for with sub-exponential stretch from such hypotheses. SAT 2) The Non-Deterministic Exponential-Time Hypothe- seemingly-harder problems (e.g., for #3 ; see [4]). sis (NETH) implies that derandomization of BPP is These Exponential-Time Hypotheses have been widely completely equivalent to circuit lower bounds against influential across different areas of complexity theory. E, and in particular that pseudorandom generators are Among the numerous fields to which they were applied necessary for derandomization. In fact, we show that the so far are structural complexity (i.e., showing classes of foregoing equivalence follows from a very weak version of NETH, and we also show that this very weak version problems that, conditioned on exponential-time hypotheses, is necessary to prove a slightly stronger conclusion that are “exponentially-hard”), parameterized complexity, com- we deduce from it. munication complexity, and fine-grained complexity; see, Lastly, we show that disproving certain exponential-time hy- e.g., the surveys [7]–[10]. potheses requires proving breakthrough circuit lower bounds. Exponential-time hypotheses focus on conjectured lower CircuitSAT n In particular, if for circuits over bits of size bounds for uniform algorithms. Two other fundamental poly(n) can be solved by probabilistic algorithms in time n/ n 2 polylog( ), then BPE does not have circuits of quasilinear questions in theoretical computer science are those of de- size. randomization, which refers to the power of probabilistic algorithms; and of circuit lower bounds, which refers to the power of non-uniform circuits. Despite the central place Keywords-computational complexity of all three questions, the interactions of exponential-time I. INTRODUCTION hypotheses with derandomization and circuit lower bounds have yet to be systematically studied. The Exponential-Time Hypothesis (ETH), introduced by Impagliazzo and Paturi [2] (and refined in [3]), conjec- 1In [6], the introduction of these variants is credited to a private communication from Carmosino, Gao, Impagliazzo, Mihajlin, Paturi, and A full version of this paper is available online at ECCC [1]. Schneider [5].

2575-8454/20/$31.00 ©2020 IEEE 13 DOI 10.1109/FOCS46700.2020.00010 A. Our results: Bird’s eye against E. See Section I-C for more details. In this work we focus on the interactions between Lastly, in Section I-D we show that disproving a conjec- exponential-time hypotheses, derandomization, and circuit ture similar to rETH requires proving breakthrough circuit lower bounds. In a nutshell, our main contribution is show- lower bounds. Specifically, we show that if there exists a ing that even relatively-mild variants of exponential-time probabilistic algorithm that solves CircuitSAT for circuits n/ n hypotheses have far-reaching consequences on derandom- with n input bits and of size poly(n) in time 2 polylog( ), ization and circuit lower bounds. then non-uniform circuits of quasilinear size cannot decide def O n Let us now give a brief overview of our specific results, BPE == BPT IME[2 ( )] (see Theorem I.6, and see the before describing them in more detail in Sections I-B, I-C, discussion in Section I-D for a comparison with the state- and I-D. Our two main results are the following: of-the-art). 1) We show that rETH implies a nearly-polynomial- Relation to Strong Exponential Time Hypotheses: time average-case derandomization of BPP. Specifi- The exponential-time hypotheses that we consider also have (1−)·n cally, assuming rETH,2 we construct a pseudorandom “strong” variants that conjecture a lower bound of 2 , > generator for uniform circuits with near-exponential where 0 is arbitrarily small, for solving a correspond- SAT coSAT SAT stretch (i.e., with seed length O˜(log(n))) and with ing problem (e.g., for solving , ,or# ; see, O(1) running time 2O˜(log(n)) = nloglog(n) , and de- e.g., [10]). We emphasize that in this paper we focus only duce that BPP can be decided, in average-case and on the “non-strong” variants that conjecture lower bounds of ·n > infinitely-often, by deterministic algorithms that run 2 for some 0; these are indeed significantly weaker O(1) in time nloglog(n) (see Theorem I.1). This signif- than their “strong” counterparts; in fact, some “strong” icantly improves the state-of-the-art in the long line variants of standard exponential-time hypotheses are simply of uniform “hardness-to-randomness” results, which known to be false (see [6]). previously only yielded pseudorandom generators with We mention that a recent work of Carmosino, Impagli- at most a sub-exponential stretch from worst-case azzo, and Sabin [11] studied the implications of hypotheses lower bounds for uniform probabilistic algorithms in fine-grained complexity on derandomization. These fine- BPT IME grained hypotheses are implied by the “strong” version of (i.e., for ; see Section I-B for details). We rSETH also extend this result to deduce an “almost-always” rETH (i.e., by ), but are not known to follow from derandomization of BPP from an “almost-always” the “non-strong” versions that we consider in this paper. We lower bound (see Theorem I.2), which again improves will refer again to their results in Section I-B. on the state-of-the-art. See Section I-B for details. B. rETH and pseudorandom generators for uniform circuits E 2) Circuit lower bounds against are well-known to The first hypothesis that we study is rETH, which (slightly yield pseudorandom generators for non-uniform cir- changing notation from above) asserts that probabilistic prBPP cuits that can be used to derandomize in the algorithms cannot decide if a given 3-SAT formula with v worst-case. An important open question is whether variables and O(v) clauses is satisfiable in time less than such lower bounds and pseudorandom generators are 2·v, for some constant >0. Note that such a formula can actually necessary for worst-case derandomization of be represented with n = O(v ·log(v)) bits, and therefore the prBPP NETH . We show that a very weak version of conjectured lower bound as a function of the input length is yields a positive answer to the foregoing question; 2·(n/ log(n)). specifically, to obtain a positive answer it suffices to Intuitively, using “hardness-to-randomness” results, we E assume that cannot be computed by small circuits expect that such a strong lower bound would imply a strong that are uniformly generated by a non-deterministic derandomization result. For context, recall that in non- 3 machine. In fact, loosely speaking, we show that this uniform hardness-to-randomness results (following [12]), NETH weak version of is both sufficient and necessary lower bounds for non-uniform circuits yield pseudorandom to show an equivalence between non-deterministic generators (PRGs) that “fool” non-uniform distinguishers. prBPP derandomization of and circuit lower bounds Moreover, these results “scale smoothly” such that lower bounds for larger circuits yield PRGs with longer stretch 2We will in fact consider a hypothesis that is weaker (qualitatively and quantitatively), and conjectures that the specific PSPACE-complete prob- (see [13] for an essentially optimal trade-off); at the extreme, lem Totally Quantified Boolean Formula (TQBF) cannot be solved in prob- if E is hard almost-always for exponential-sized circuits, then n/polylog(n) TQBF SAT abilistic time 2 . (Recall that is the set of 3- formulas we obtain PRGs with exponential stretch and deduce that ϕ over variables w1, ..., wv such that ∀w1∃w2∀w3..., ϕ(w1, ..., wv)=1, and that 3SAT reduces to TQBF in linear time; see [1, Definition 4.6].) prBPP = prP (see [14]). 3That is, we assume that the following statement does not hold: For any The key problem, however, is that the long line-of-works n ∈Ethere is a uniform machine ML that runs in time  2 and uses concerning uniform “hardness-to-randomness” did not yield its non-determinism to generate a single small circuit CL that decides L on all n-bit inputs; for example, ML runs in sub-exponential time and CL such smooth trade-offs so far (see [11], [15]–[23]). Ideally, is of polynomial size. (See Section I-C.) given an exponential lower bound for uniform probabilistic

14 algorithms (such as E ⊆i.o.BPT IME[2·n]) we would Theorem I.1 (rETH ⇒ PRG with almost-exponential stretch like to deduce that there exists a PRG with exponential for uniform circuits; informal). Suppose that there exists stretch for uniform circuits, and consequently that BPP = T (n)=2n/polylog(n) such that TQBF ∈BPTIME/ [T ]. P in “average-case”.4 However, prior to the current work, Then, for every t(n)=npolyloglog(n), there exists a PRG that the state-of-the-art (by Trevisan and Vadhan [20]) could has seed length O(log(n)), runs in time npolyloglog(n), and at best yield PRGs with sub-exponential stretch (i.e., with is infinitely-often (1/t)-pseudorandom for every distribution seed length polylog(n)), even if the hypothesis refers to over circuits that can be sampled in time t with log(t) bits an exponential lower bound. Moreover, the best currently- of non-uniform advice. known PRG only works infinitely-often, even when we The proof of Theorem I.1 is based on careful refinements assume that the “hard” function cannot be computed by of the proof framework of [15], using new technical tools probabilistic algorithms on almost all input lengths. that we construct. The latter tools significantly refine and Previous works bypassed these two obstacles in vari- strengthen the technical tools that were used by [20] to ous indirect ways. Goldreich [23] relied on the (much) obtain the previously-best uniform hardness-to-randomness stronger hypothesis prBPP = prP to construct an “almost- tradeoff. For high-level overviews of the proof of Theo- always” PRG with exponential stretch for uniform circuits. rem I.1 (and of the new constructions), see Section II-A. Similarly, Carmosino, Impagliazzo, and Sabin [11] relied Overcoming the “infinitely-often” barrier: The hypoth- on hypotheses from fine-grained complexity (recall that esis in Theorem I.1 is that any probabilistic algorithm that these are qualitatively strong, and implied by the “strong” n/ n runs in time 2 polylog( ) fails to compute TQBF infinitely- version of rETH, i.e. by rSETH) to bypass both obstacles often, and the corresponding conclusion is that the PRG and derandomize BPP “almost-always” on average-case in “fools” uniform circuits only infinitely-often. This is identi- polynomial time; however, their derandomization does not cal to all previous uniform “hardness-to-randomness” results rely on a PRG construction, and satisfies a weaker notion of that used the [15] proof framework.6 average-case derandomization than the notion that we use.5 Gutfreund and Vadhan [22, Sec 6] showed one way to Gutfreund and Vadhan [22] bypassed the “almost-always” overcome this “infinitely-often” barrier, by deducing almost- barrier by deducing (subexponential-time) derandomization always average-case derandomization of RP (rather than of of RP rather than of BPP (see details below). Lastly, a line- BPP) under an almost-always lower bound hypothesis; as of-works dealing with uniform “hardness-to-randomness” in previous results, their derandomization is relatively slow for AM (rather than for BPP) was able to bypass both (i.e., it works in sub-exponential time). Combining their obstacles in this context (see, e.g., [18], [19], [21]). ideas with the techniques underlying Theorem I.1, we prove In this work we tackle both obstacles directly. First, we that under the hypothesis that rETH holds almost-always, establish for the first time that hardness assumptions for RP BPT IME can be derandomized almost-always in average-case and yield a pseudorandom generator for uniform in (nearly-)polynomial time (see [1, Theorem 4.14]). circuits with near-exponential stretch (i.e., with seed length O˜ n In addition, their techniques can be adapted to yield an (log( ))), which can be used for average-case derandom- almost-always PRG (from an almost-always lower bound ization of BPP in nearly-polynomial-time (i.e., in time O(1) hypothesis) that uses O(log(n)) bits of non-uniform advice. O˜(log(n)) nloglog(n) 2 = ). Specifically, we start from the We are able to significantly improve this: Assuming that hypothesis that the Totally Quantified Boolean Formula every probabilistic algorithm that runs in time 2n/polylog(n) TQBF ( ) problem cannot be solved by probabilistic algorithms fails to decide TQBF on almost all input lengths, we n/polylog(n) that run in time 2 ; this hypothesis is weaker than prove that BPP can be derandomized in average-case and rETH SAT TQBF (since 3- reduces to with a linear overhead). almost-always, using only a triply-logarithmic number (i.e., Under this hypothesis, we show that there exists a PRG O n  (logloglog( ))) of advice bits. for uniform circuits with seed length O(log(n)) that is O˜ n n O(1) rETH ⇒ computable in time 2 (log( )) = nloglog( ) . Theorem I.2 (aa- almost-always derandomiza- tion in time npolyloglog(n); informal). Assume that for n/ n 4Throughout the paper, when we say that a PRG is -pseudorandom for some T (n)=2polylog( ) it holds that TQBF ∈/ uniform circuits, we mean that for every efficiently-samplable distribution i.o.BPT IME[T ], and let t(n)=npolyloglog(n). Then, for over circuits, the probability over choice of circuit that the circuit distin-  every L ∈BPTIME[t] and every distribution ensemble guishes the output of the PRG from uniform with advantage more than n is at most  (see [1, Definitions 3.6 and 3.7]). The existence of such PRGs X = {Xn ⊂{0, 1} } such that x ∼Xn can be sampled in BPP implies an “average-case” derandomization of in the following sense: t n D DX L ∈BPP D time ( ), there exists a deterministic algorithm = For every there exists an efficient deterministic algorithm npolyloglog(n) O n such that every probabilistic algorithm that gets input 1n and tries to find that runs in time and uses (logloglog( )) x ∈{0, 1}n such that D(x) = L(x) has a small probability of success bits of non-uniform advice such that for almost all input (see, e.g., [23, Prop. 4.4]). 5Specifically, they deduce an average-case derandomization of BPP 6Other proof strategies (which use different hypotheses) were able to with respect to the uniform distribution, rather than with respect to every support an “almost-always” conclusion, albeit not necessarily a PRG, from polynomial-time-samplable distribution. an “almost-always” hypothesis (see [11], [19]).

15 n ∈ N D x L x < lengths it holds that Prx∼Xn [ ( ) = ( )] answer to the foregoing question (and this is not difficult to 1/t(n). show; see Section II-B). Our main contribution is in showing that, loosely speak- Remark: Non-deterministic extensions: We note that ing, even a very weak form of NETH suffices to answer the “scaled-up” versions of Theorems I.1 and I.2 for non- question of equivalence in the affirmative, and that this weak deterministic settings follow easily from known results; that form of NETH is in some sense inherent (see details below). is, assuming lower bounds for non-deterministic uniform ∗ Specifically, we say that L ⊆{0, 1} has NTIME[T ]- algorithms, we can deduce strong derandomization of corre- uniform circuits if there exists a non-deterministic machine sponding non-deterministic classes. First, from the hypoth- n M that gets input 1 , runs in time T (n), and satisfies the esis MAETH7 we can deduce strong circuit lower bounds, following: For some non-deterministic choices M outputs and hence also worst-case derandomization of prBPP and n a single circuit C : {0, 1} →{0, 1} that decides L on all of prMA (this uses relatively standard Karp-Lipton-style n inputs x ∈{0, 1} , and whenever M does not output such a arguments, following [24]; see [1, Appendix A] for details circuit, it outputs ⊥. We also quantify the size of the output and for a related result). Similarly, as shown by Gutfreund, circuit, when this size is smaller than T (n). Shaltiel, and Ta-Shma [19], a suitable variant of AMETH The hypotheses that will suffice to show an equivalence implies an average-case derandomization of AM. between derandomization and circuit lower bounds are of E NTIME T C. NETH and an equivalence of derandomization and cir- the form “ does not have [ ]-uniform circuits S n T n T S cuit lower bounds of size ( ) ( )”, for values of and that will be specified below. In words, this hypothesis rules out a world Let us now consider the Non-Deterministic in which every L ∈Ecan be computed by small circuits that NETH Exponential-Time Hypothesis ( ), which asserts can be efficiently produced by a uniform (non-deterministic) co SAT n O n that -3 (with variables and ( ) clauses) cannot machine. Indeed, this hypothesis is weaker than the NETH- be solved by non-deterministic machines running in time style hypothesis E⊆NTIME [T ], and even than the ·n > 2 for some 0. This hypothesis is an exponential-time hypothesis E ⊆(NTIME[T ] ∩SIZE[T ]). We stress that coNP ⊆NP version of , and is therefore incomparable to our hypothesis refers to lower bounds for uniform models rETH MAETH and weaker than . of computation, for which strong lower bounds (compared The motivating observation for our results in this section to those for non-uniform circuits) are already known. (For NETH k is that has an unexpected consequence to the long- example, NP is hard for NP-uniform circuits of size n for standing question of whether worst-case derandomization every fixed k ∈ N (see [31]), whereas we do not even know prBPP NP of is equivalent to circuit lower bounds against if E is hard for non-uniform circuits of arbitrarily large E . Specifically, recall that two-way implications between linear size.) The fact that such a weak hypothesis suffices derandomization and circuit lower bounds have been grad- to deduce that derandomization and circuit lower bounds ually developing since the early ‘90s (for surveys see, are equivalent can be seen as appealing evidence that the e.g., [25], [26]), and that it is a long-standing question equivalence indeed holds. whether the foregoing implications can be strengthened to Our first result is that if E cannot be decided by δ show a complete equivalence between the two. One well- NTIME[2n ]-uniform circuits of polynomial size (for known implication of such an equivalence would be that some δ>0), then derandomization of prBPP in prBPP any (worst-case) derandomization of necessitates sub-exponential time is equivalent to lower bounds for 8 the construction of PRGs that “fool” non-uniform circuits. polynomial-sized circuits against EXP. Then, being more concrete, the motivating observation for our results in this section is that NETH implies an affirmative Theorem I.3 (NETH ⇒ circuit lower bounds are equiva- lent to derandomization; “low-end” setting). Assume that 7Note that indeed a non-deterministic analogue of rETH is MAETH there exists δ>0 such that E cannot be decided by δ (or, arguably, AMETH), rather than NETH, due to the use of randomness. NTIME[2n ]-uniform circuits of arbitrary polynomial Also recall that, while the “strong” version of MAETH is false (see [6]), there is currently no evidence against the “non-strong” version MAETH. size, even infinitely-often. Then, 8 The question of equivalence is mostly “folklore”, but was mentioned prBPP ⊆ i.o.prSUBEXP ⇐⇒ EXP ⊂P/ several times in writing. It was asked in [27, Remark 33], who proved poly . an analogous equivalence between non-deterministic derandomization with short advice and circuit lower bounds against non-deterministic classes (i.e., Theorem I.3 also scales-up to “high-end” parameter set- against NTIME; see also [28]). It was also mentioned as a hypothetical tings, albeit not smoothly, and using different proof tech- possibility in [20] (referred to there as a “super-Karp-Lipton theorem”). Following the results of [29], the question was recently raised again as a niques (see [1, Section 5] for details). Nevertheless, an conjecture in [30]. We note that in the context of uniform “hardness-to- analogous result holds for the extreme “high-end” setting: randomness”, equivalences between average-case derandomization, lower Under the stronger hypothesis that E cannot be decided by bounds for uniform classes, and PRGs for uniform circuits have long been NTIME Ω(n) prBPP known (see [15], [23]), but these equivalences do not involve circuit lower [2 ]-uniform circuits, we show that = bounds or standard PRGs. prP is equivalent to lower bounds for exponential-sized

16 circuits against E; that is: algorithms (following Williams [34]) crucially rely on the hypothesis that the circuit-analysis algorithm is determinis- Theorem I.4 (NETH ⇒ circuit lower bounds are equiva- tic, and it is a well-known challenge to obtain analogous lent to derandomization; “high-end” setting). Assume that results for randomized algorithms, as we do in Theorem I.6. there exists δ>0 such that E cannot be decided by δ·n In order to prove Theorem I.6 we crucially leverage the NTIME[2 ]-uniform circuits, even infinitely-often. Then: technical tools that we develop in the proof of Theorem I.1; n ·n prBPP = prP⇐⇒∃>0:DT IME[2 ] ⊂i.o.SIZE[2 ] see. Section II-C for further details and for comparison with Remarkably, as mentioned above, hypotheses such as the known results. ones in Theorems I.3 and I.4 actually yield a stronger Finally, we combine Theorem I.6 and Theorem I.1 to conclusion, and are also necessary for that stronger con- deduce the following unconditional Karp-Lipton style result: BPE clusion. Specifically, the stronger conclusion is that even If can be decided by circuits of quasilinear size, then BPP non-deterministic derandomization of prBPP (such as can be derandomized, in average-case and infinitely- O˜(log(n)) npolyloglog(n) prBPP ⊆ prNSUBEXP) yields circuit lower bounds often, in time 2 = . (See [1, Corollary against E, which in turn yield PRGs for non-uniform circuits. 6.6] for details and for a precise statement.) NTIME E Theorem I.5 ( -uniform circuits for , non-deter- II. TECHNICAL OVERVIEW ministic derandomization, and circuit lower bounds). As- sume that there exists δ>0 such that E cannot be decided In this section we describe the proofs of our main results, δ by NTIME[2n ]-uniform circuits of arbitrary polynomial in high level. In Section II-A we describe the proofs of size. Then, Theorems I.1 and I.2; in Section II-B we describe the proofs of Theorems I.3, I.4 and I.5; and in Section II-C we describe prBPP ⊆ prNSUBEXP ⇒EXP⊂P/ = poly . (I.1) the proof of Theorem I.6, which relies on the proofs from In the other direction, if Eq. (I.1) holds, then E cannot be Section II-A. decided by NP-uniform circuits. A. Near-optimal uniform hardness-to-randomness results Note that in Theorem I.5 there is a gap between for TQBF the hypothesis that implies Eq. (I.1) and the conclu- sion from Eq. (I.1). Specifically, the hypothesis refers to Recall that in typical “hardness-to-randomness” results, a δ NTIME[2n ]-uniform circuits of polynomial size, whereas PRG is based on a hard function, and the proof amounts to the conclusion refers to NP-uniform circuits. By optimiz- showing that an efficient distinguisher for the PRG can be ing the parameters, this gap between sub-exponential and transformed to an efficient algorithm or circuit that computes polynomial can be considerably narrowed (see [1, Theorem the hard function. 5.11]). In high-level, our proof strategy follows this paradigm, and relies on the classic approach of Impagliazzo and rETH D. Disproving a version of requires circuit lower Wigderson [15] for transforming a distinguisher into an bounds algorithm for the hard function. Loosely speaking, the latter Lastly, we show that disproving a weak version of rETH approach works only when the hard function f ws : {0, 1}∗ → requires breakthrough circuit lower bounds. Specifically, we {0, 1}∗ is well-structured; the precise meaning of the term show that a randomized algorithm that solves CircuitSAT “well-structured” differs across different follow-up works, in time 2n/polylog(n) would yield lower bounds for circuits and in the current work it will also take on a new meaning, of quasilinear size against BPE = BPT IME[2O(n)].For but for now let us intuitively think of f ws as downward context, the best known lower bounds for such circuits are self-reducible and as having properties akin to random self- against Σ2 (see [32]) or against MA/1 (i.e., Merlin-Arthur reducibility. Instantiating the Nisan-Wigderson PRG with a protocols that use one bit of non-uniform advice; see [33]). suitable encoding ECC(f ws) of f ws as the underlying function Specifically, we prove the following: (again, the precise requirements from ECC differ across works), our goal is to show that if the PRG with stretch t(n) Theorem I.6 (circuit lower bounds from non-trivial ran- does not “fool” uniform distinguishers even infinitely-often, domized CircuitSAT algorithms). For any constant c ∈ N ws   then f is computable in probabilistic time t (n) >t(n). there exists a constant c ∈ N such that the following holds. If CircuitSAT for circuits over n variables and of size The key challenge underlying this approach is the sig-  c 2 c n/(log n) nificant overheads in the proof, which increase the time n · (log n) can be solved in probabilistic time 2 ,  ws c complexity t of computing f . In the original proof of [15] then BPE ⊂SIZE[n · (log n) ].  this time was roughly t (n) ≈ t(t(n)), and the state-of-the- Theorem I.6 constitutes progress on a well-known tech- art prior to the current work, by Trevisan and Vadhan [20] nical challenge. Specifically, the known arguments that de- (following [16]), yielded t(n)=poly(t(poly(n))). Since duce circuit lower bounds from “non-trivial” circuit-analysis the relevant functions f ws in all works are computable in

17 E, proofs with such an overhead can yield at most a sub- part in the proof of Theorem I.1. We now outline (very Ω(1) exponential stretch t(n)=2n . briefly) the key points underlying the construction; for a As mentioned in Section I-B, previous works bypassed detailed overview see [1, Section 4.1]. After the following this difficulty by either using stronger hypotheses, or de- brief outline, we will explain how we use f ws to prove ducing weaker conclusions, or working in different contexts Theorem I.1. (e.g., considering derandomization of AM rather than of Our first main idea is to use an IP = PSPACE protocol BPP). In contrast, we tackle this difficulty directly, and with polylog(n) rounds instead of poly(n) rounds, so that manage to reduce all of the polynomial overheads in the in- the first overhead (i.e., the additive overhead in the input put length to polylogarithmic overheads in the input length. length caused by the number of operators) will be only That is, we will show that for carefully-constructed f ws and polylog(n) instead of poly(n). Indeed, in such a protocol suitably-chosen ECC (and with some variations in the proof the verification time in each round is high, and therefore our approach), if the PRG instantiated with ECC(f ws) for stretch downward self-reducibility algorithm is relatively slow and t does not “fool” uniform distinguishers infinitely-often, then makes many queries; but we will be able to afford this in ws   O f can be computed in time t (n)=t(O(n)) (1). our proof (since eventually we only need to solve TQBF in 1) The well-structured function f ws: Following Trevisan time 2n/polylog(n)). While implementing this idea, we define and Vadhan [20], our f ws is an artificial PSPACE-complete a different intermediate problem that is both amenable to problem that we carefully construct. Their goal was to arithmetization and reducible from TQBF in quasilinear time construct f ws that will be simultaneously downward self- (see [1, Claim 4.7.1]); we move to an arithmetic setting reducible and randomly self-reducible. They achieved this that will support the strong random self-reducibility that we by constructing a function based on the proof of IP = want (see details below), and arithmetize the intermediate PSPACE [35], [36]: Loosely speaking, at input length problem in this setting (see [1, Claim 4.7.2]); we show how N =poly(n) the function gets as input a 3-SAT formula to execute arithmetic operators in a “batch” in this arithmetic (ϕ,N) ϕ over n variables, and outputs P (ϕ)=Q1 ◦ Q2 ◦ setting (see [1, Claim 4.7.3]); and we combine the resulting ... ◦ Q P (ϕ) P (ϕ) ϕ poly(n) , where is an arithmetization of , collection of polynomials into a single Boolean function. We the Qi’s are arithmetic operators from the IP = PSPACE stress that we are “paying” for all the optimizations above, proof, and P (ϕ,N)(ϕ)=TQBF(ϕ); and for i ∈ [poly(n)], by the fact that the associated algorithms (for downward self- at input length N − i, the function gets input (ϕ, w) and reducibility and for our notion of random self-reducibility outputs P (ϕ,N−i)(ϕ, w), where P (ϕ,N−i) is the polynomial that will be described next) now run in time 2n/polylog(n), that applies one less operator to P (ϕ) than P (ϕ,N−i+1) and rather than polynomial time; but again, we are able to afford fixes some input variables for P (ϕ) according to w. Since this in our proof. f ws consists of low-degree polynomials, it is randomly self- We obtain a function f ws with the following properties: reducible; and since each P (ϕ,N−i) is obtained by applying First, f ws is computable in linear space; secondly, TQBF a simple operator to P (ϕ,N−(i−1)), the function f ws is also is reducible to f ws in quasilinear time; thirdly, f ws is downward self-reducible. downward self-reducible in time 2n/polylog(n); and lastly, Going through their proof (with needed adaptations for f ws is sample-aided worst-case to δ-average-case re- our “high-end” parameter setting), we encounter four dif- ducible, for δ(n)=2−n/polylog(n). The last property, ferent polynomial overheads in the input length. The first which is implicit in many works and was recently made and obvious one is that inputs of length n are mapped explicit by Goldreich and G. Rothblum [37], asserts the to inputs of length N =poly(n), corresponding to the following: There exists a uniform algorithm T that gets number of rounds in the IP = PSPACE protocol. The as input a circuit C : {0, 1}n →{0, 1}∗ that agrees with ws other polynomial overheads in the input length come from fn on at least δ(n) of the inputs, and labeled examples their reduction of TQBF to an intermediate problem that takes (x, f ws(x)) where x ∈{0, 1}n is uniformly-chosen, runs both ϕ and w as part of the input and is still amenable in time 2n/poly log(n) and with high probability outputs a 9  n ∗ ws to arithmetization, from the field size that is required for circuit C : {0, 1} →{0, 1} that computes fn on all the strong random self-reducibility that is needed in our inputs (see [1, Definition 4.2]). Our construction of f ws also parameter setting (see below), and from the way the poly(n) satisfies an additional property, which will only be used polynomials are combined into a single Boolean function. in the proof of Theorem I.2 (i.e., of the “almost-always” The main challenge is to eliminate all of the foregoing version of the result); we will describe this property in the overheads simultaneously. We will achieve this by presenting proof outline for Theorem I.2 below. a construction of a suitable f ws, which is a refinement 2) Instantiating the [15] proof framework with the func- of their construction, and constitutes the main technical tion f ws: Given this construction of f ws, we now use a vari- ant of the [15] proof framework, as follows. (For simplicity, 9Recall that the standard arithmetization of 3-SAT is a polynomial that depends on the input formula, whereas we want a single polynomial that we show how to “fool” polynomial-time distinguishers that gets both a formula and the assignment as input. do not use advice.) Let ECC be the Goldreich-Levin [38]

18 ws ws (i.e., Hadamard) encoding ECC(f )(x, r)=⊕if (x)i · ri. this property. The argument of [15] (following [12]) shows that if for Now, observe that the transformation of a probabilistic input length n there exists a uniform poly(n)-time distin- distinguisher A for the PRG to a probabilistic algorithm F guisher A for the Nisan-Wigderson PRG (instantiated with that computes f ws actually gives a “point-wise” guarantee: ws ECC(f )) that succeeds with advantage 1/n, then for input For every input length n ∈ N,ifA distinguishes the PRG on  length  = O(log(n)) (corresponding to the set-size in the a corresponding set of input lengths Sn, then F computes ws  underlying combinatorial design) there is a weak learner for f correctly at input length  = (n)=O(log(n)); ws ECC(f ): That is, there exists an algorithm that gets oracle specifically, we want to use the downward self-reducibility ECC f ws n ≈ /polylog() ws access to ( ), runs in time poly( ) 2 , argument for f at input lengths ,  − 1, ..., 0, and Sn is ws and outputs a small circuit that agrees with ECC(f ) on the set of input lengths at which we need a distinguisher 2 ws approximately 1/2+1/n ≈ 1/2+δ0() of the -bit inputs, for G in order to obtain a weak learner for ECC(f ) at δ  −/polylog() ws where 0( )=2 . input lengths , −1, ...0. Moreover, since f is downward Assuming that there exists a distinguisher for the PRG self-reducible in polylog steps, we will only need weak n ∈ N as above for every , we deduce that a weak learner learners at inputs , ..., 0 =  − polylog(); hence, we can exists for every  ∈ N. Following [15], for each input show that Sn is a set of polylog() = polyloglog(n) input i/ i length i =1, ...,  we construct a circuit of size 2 polylog( ) lengths in the interval [n, n2] (see [1, Lemma 4.9] for the ws ws for fi . Specifically, in iteration i we run the learner for precise calculation). Taking the contrapositive, if f cannot ws ECC(f ) on input length 2i, and answer its oracle queries be computed by F on almost all ’s, then for every n ∈ N ws using the downward self-reducibility of f , the circuit there exists an input length m ∈ Sn ⊂ [n, n2] such that G f ws ECC f ws 10 that we have for i−1, and the fact that ( )2i is fools A at input length m. f ws n easily computable given access to i . The learner outputs Our derandomization algorithm gets input 1 and also 2i/polylog(2i) ECC f ws a circuit of size 2 that agrees with ( ) gets the “good” input length m ∈ Sn as non-uniform / δ i i m on approximately 1 2+ 0(2 ) of the 2 -bit inputs, and advice; it then simulates G(1 ) (i.e., the PRG at input the argument of [38] allows to efficiently transform this length m) and truncates the output to n bits. (We can indeed ws circuit to a circuit of similar size that computes f on show that truncating the output of our PRG preserves its δ i δ i i a approximately ( )=poly(0(2 )) of the -bit inputs. pseudorandomness in a uniform setting; see [1, Proposition Our goal now is to transform this circuit to a circuit of 4.12] for details.) The crucial point is that since |Sn| = ws similar size that computes f on all i-bit inputs. Recall that polyloglog(n), the advice length is O(logloglog(n)). Note, in general, performing such transformations by a uniform however, that for every potential distinguisher A there exists f ws algorithm is challenging (intuitively, if is a codeword a different input length m ∈ Sn such that G is pseudorandom in an error-correcting code, this corresponds to uniform list- for A on m. Hence, our derandomization algorithm (or, ws decoding of a “very corrupt” version of f ). However, in more accurately, its advice) depends on the distinguisher our specific setting we can produce random labeled samples that it wants to “fool”. Thus, for every L ∈BPPand every ws for f , using its downward self-reducibility and the circuit efficiently-samplable distribution X of inputs, there exists a f ws that we have for i−1. Relying on the sample-aided worst- corresponding “almost-always” derandomization algorithm f ws case to average-case reducibility of , we can transform DX (see [1, Proposition 4.12]). ws our circuit to a circuit of similar size that computes fi on all inputs. B. NTIME-uniform circuits for E and an equivalence Finally, since TQBF is reducible with quasilinear overhead between derandomization and circuit lower bounds to f ws, if we can compute f ws in time 2n/polylog(n) then ws we can compute TQBF in such time. Moreover, since f is The proofs that we describe in the current section are computable in space O()=O˜(log(n)) (and thus in time significantly simpler technically than the proofs described n npolyloglog( )), the pseudorandom generator is computable in Sections II-A and II-C. As mentioned in Section I-C, the n in time npolyloglog( ). motivating observation is that NETH implies an equivalence 3) The “almost-always” version: Proof of Theorem I.2: between derandomization and circuit lower bounds; let us We now explain how to adapt the proof above in order to start by proving this statement: get an “almost-always” PRG with near-exponential stretch. For starters, we will use a stronger property of f ws, namely ws that it is downward self-reducible in a polylogarithmic 10Actually, since f is downward self-reducible in polylog steps, it can  be computed relatively-efficiently on infinitely-many input lengths, and thus number of steps; this means that for every input length cannot be “hard” for almost all ’s. However, since TQBF can be reduced ws there exists an input length 0 ≥  − polylog() such that to f with quasilinear overhead, if TQBF is “hard” almost-always then  ws f ws  f ws for every (n) there exists  ≤ O((n)) such that f is “hard” on  , is efficiently-computable at input length 0 (i.e., 0 Sn ⊂ 0/polylog(0) which allows our argument to follow through, with a similar set is computable in time 2 without a “downward” [n, npolyloglog(n)] (see [1, Proposition 4.11] for details). For simplicity, oracle); see [1, Section 4.1.1] for intuition and details about we ignore this issue in the overview.

19 Proposition II.1 (“warm-up”: a weaker version of The- E⊂SIZE [2Ω(n)] it suffices to assume that CAPP on v-bit orem I.3). Assume that EXP ⊂i.o.NSUBEXP. Then, circuits of size n =2Ω(v) can be solved in time 2·v, for a prBPP ⊆ prSUBEXP ⇐⇒ EXP ⊂i.o.P/poly. sufficiently small >0.12 For more details, see [1, Section Proof: The “⇐=” direction follows (without any as- 5.2]. Proof of Theorem I.5: The first part of Theorem I.5 sumption) from [24]. For the “=⇒” direction, assume that δ E NTIME n prBPP ⊆ prSUBEXP, and assume towards a contradic- asserts that if does not have [2 ]-uniform tion that EXP ⊂ i.o.P/poly. The latter hypothesis implies circuits of polynomial size, then the conditional statement prBPP ⊆ prNSUBEXP ⇒EXP⊂P/ (using the Karp-Lipton style result of [24]) that EXP ⊂ “ = poly” holds. i.o.MA. Combining this with the former hypothesis, we The proof of this statement again follows the logical struc- deduce that EXP ⊂ i.o.NSUBEXP, a contradiction. ture from the proof of Proposition II.1, and relies on a further Our proofs of Theorems I.3 and I.4 will follow the same strengthening of our “low-end” Karp-Lipton style result such prBPP ⊆ logical structure as the proof of Proposition II.1, and our goal that the result only uses the hypothesis that prNSUBEXP prBPP ⊆ prSUBEXP 13 will be to relax the hypothesis EXP ⊂i.o.NSUBEXP. rather than . We will do so by strengthening the Karp-Lipton style result The second part of Theorem I.5 asserts that if the con- that uses [24] and asserts that a joint “collapse” hypothesis ditional statement “prBPP ⊆ prNSUBEXP =⇒EXP⊂ and derandomization hypothesis implies that EXP can be P/poly” holds, then E does not have NP-uniform cir- decided in small non-deterministic time. We will show cuits. We will in fact prove the stronger conclusion that two different strengthenings, each referring to a different E ⊆(NP ∩ P/poly). (Recall that the class of prob- parameter setting: The first strengthening refers to a “low- lems decidable by NP-uniform circuits is a subclass of end” setting, and asserts that if EXP ⊂ P/poly and ONP ⊆ NP ∩ P/poly.) The proof itself is very simple: prBPP ⊆ prSUBEXP then EXP has NSUBEXP- Assume towards a contradiction that E⊆(NP ∩P/poly); uniform circuits of polynomial size (see [1, Item (1) of since BPP ⊆ EXP, it follows that prBPP ⊆ prNP Proposition 5.6]); and the second strengthening refers to a (see [1, Proof of Theorem 5.10]); and by the hypothesized “high-end” setting, and asserts that if E⊂i.o.SIZE[2·n] conditional statement, we deduce that EXP ⊂P/poly,a and prBPP = prP then E has NTIME[2O()·n]-uniform contradiction. Indeed, the parameter choices in the foregoing circuits (see [1, Proposition 5.7]). The proofs of these two proof are far from tight, and (as mentioned after the state- different strengthenings rely on different ideas; for high- ment of Theorem I.5) the quantitative gap between the two level descriptions of the proofs see [1, Section 5.1.2] and [1, parts of Theorem I.5 can be considerably narrowed (see [1, Section 5.1.3], respectively. Theorem 5.11]). For context, recall that (as noted by Fortnow, Santhanam, CircuitSAT and Williams [39]), the proof of [24] already supports the C. Circuit lower bounds from randomized al- stronger result that EXP ⊂ P/poly ⇐⇒ E X P = gorithms OMA;11 and by adding a derandomization hypothesis (e.g., Recall that Theorem I.6 asserts that if CircuitSAT for prBPP = prP) we can deduce that EXP = ONP. n-bit circuits of size O˜(n2) can be solved in probabilistic c  Nevertheless, our results above are stronger, because NP- time 2n/(log n) , then BPE ⊂SIZE[n · (log n)c ], where uniform circuits are an even weaker model than ONP: This c depends on c. The relevant context for this result is the is since in the latter model the proof is verified on an input- known line of works that deduce circuit lower bounds from by-input basis, whereas in the former model we only verify “non-trivial” circuit-analysis algorithms, following the cele- once that the proof is convincing for all inputs. We also brated result of Williams [34]. The main technical innovation stress that some lower bounds for this weaker model (i.e., for in Theorem I.6 is that our hypothesis is only that there NTIME-uniform circuits of small size) are already known: exists a probabilistic circuit-analysis algorithm, whereas the Santhanam and Williams [31] proved that for every k ∈ N aforementioned known results crucially rely on the fact that there exists a function in NP that cannot be computed by the circuit-analysis algorithm is deterministic. On the other NP-uniform circuits of size nk. hand, the aforementioned known results yield new circuit We also note that our proofs actually show that (con- lower bounds even if the running time of the algorithm is ditioned on lower bounds for NTIME-uniform circuits against E)evenarelaxed derandomization hypothesis is 12Note that the problem of solving CAPP for v-bit circuits of size n Ω(v) O(v) n already equivalent to the corresponding circuit lower bounds. =2 can be trivially solved in time 2 =poly( ), and thus unconditionally lies in prP∩prBPT IME[O˜(n)]. The derandomization For example, in the “high-end” setting, to deduce that problem described above simply calls for a faster deterministic algorithm for this problem. 11 The notation OMA stands for “oblivious” MA. It denotes the class 13Intuitively, in the “low-end” Karp-Lipton result we only need to of problems that can be decided by an MA verifier such that for every derandomize probabilistic decisions made by the non-deterministic machine input length there is a single “good” proof that convinces the verifier on all that constructs the circuit, whereas the circuit itself is deterministic; thus, inputs in the set (rather than a separate proof for each input); see, e.g., [39], a non-deterministic derandomization hypothesis suffices for this result. [40]. See [1, Section 5.1.2] for details.

20 2n/nω(1),14 whereas Theorem I.6 only yields new circuit 2) The function f ws has circuits of size n · (log n)k. lower bounds if the running time is 2n/polylog(n). Hence, f ws is also learnable (as we concluded above), As far as we are aware, Theorem I.6 is the first result that and so the argument of [15] can be used to show deduces circuit lower bounds from a near-exponential-time that f ws is computable by an efficient probabilis- probabilistic algorithm for a natural circuit-analysis task. tic algorithm.17 Now, by a diagonalization argument, diag 2k The closest result that we are aware of is by Oliveira and there exists L ∈ Σ4[n · (log n) ] that cannot be Santhanam [41, Theorem 14], who deduced lower bounds computed by circuits of size n · (log n)k. We show for circuits of size nO(1) against BPE from non-trivial prob- that Ldiag ∈BPEby first reducing Ldiag to f ws in  ws abilistic algorithms for learning with membership queries time O(n), and then computing f (using the efficient (rather than for a circuit-analysis task such as CircuitSAT); probabilistic algorithm). as explained next, we build on their techniques in our Thus, in both cases we showed a function in BPE \ 15 proof. SIZE[n · (log n)k]. The crucial point is that in the second Our proof strategy is indeed very different from the proof case, our new and efficient implementation of the [15] strategies underlying known results that deduce circuit lower argument (which was described in Section II-A) yields a bounds from deterministic circuit-analysis algorithms (e.g., probabilistic algorithm for f ws with very little overhead, from the “easy-witness” proof strategy [27]–[29], [34], [42], which allows us to indeed show that Ldiag ∈BPE. Specifi- MA [43], or from proofs that rely on lower bounds [27, cally, our implementation of the argument (with the specific Rmk. 26], [30], [33]). In high-level, to prove our result well-structured function f ws) shows that f ws can be learned we exploit the connection between randomized learning in time T (n)=2n/polylog(n), then f ws can be computed in algorithms and circuit lower bounds, which was recently similar time T (n)=2n/polylog(n) (see [1, Corollary 4.10]). discovered by Oliveira and Santhanam [41, Sec. 5] (follow- We thus only need to explain how a CircuitSAT algo- ing [44]–[46]). Loosely speaking, their connection relies on rithm yields a learning algorithm with comparable running the classical results of [15], and we are able to significantly time. The idea here is quite simple: Given oracle access to a refine this connection, using our refined version of the [15] function f ws, we generate a random sample of r =poly(n) argument that was detailed in Section II-A. ws ws ws labeled examples (x ,f (x )), ..., (xr,f (xr)) for f , CircuitSAT 1 1 Our starting point is the observation that and we use the CircuitSAT algorithm to construct, bit-by- k ∈ algorithms yield learning algorithms. Specifically, fix bit, a circuit of size n · (log n)k that agrees with f ws on the N CircuitSAT , and assume (for simplicity) that for sample. Note that the input for the CircuitSAT algorithm n polynomial-sized -bit circuits can be solved in probabilistic is a circuit of size poly(n) over only n ≈ n · (log n)k+1 n/polylog(n) time 2 for an arbitrarily large polylogarithmic bits (corresponding to the size of the circuit that we wish function. We show that in this case, any function that is to construct). Hence, the CircuitSAT algorithm runs in n· n k   computable by circuits of size (log ) can be learned (ap- time 2n /polylog(n ) =2n/polylog(n). And if the sample size n/polylog(n) proximately) using membership queries in time 2 r =poly(n) is large enough, then with high probability 16 f ws (we explain below how to prove this). Now, let be the any circuit of size n · (log n)k that agrees with f ws on the well-structured function from Section II-A, and recall that sample also agrees with f ws on almost all inputs (i.e., by a f ws is computable in linear space, and hard for linear space union-bound over all circuits of such size). under quasilinear-time reductions. Then, exactly one of two cases holds: ACKNOWLEDGMENT 1) The function f ws does not have circuits of size n · (log n)k. In this case a Boolean version of f ws We are grateful to Igor Oliveira for pointing us to the also does not have circuits of such size, and since this results in [41, Sec. 5], which serve as a basis for the proof of Boolean version is in SPACE[O(n)] ⊆BPE,weare Theorem I.6. We thank Oded Goldreich, who provided feed- done. back throughout the research process and detailed comments on the manuscript, both of which helped improve the work. 14For example, from such an algorithm they deduce the lower bound We also thank Ryan Williams for a helpful discussion, for NEXP ⊆P/poly; and from an algorithm that runs in time 2n/polylog(n) as in Theorem I.6, their results yield the lower bound NP ⊂SIZE[nk] asking us whether a result as in Theorem I.6 can be proved, for every fixed k ∈ N. and for feedback on the manuscript. Finally, we thank an 15Another known result, which was communicated to us by Igor Oliveira, anonymous reviewer for pointing out a bug in the initial asserts that if CircuitSAT for circuits over n variables and of size no(1) proof of Theorem I.5, which we fixed. poly(n) can be solved in probabilistic sub-exponential time 2 , then BPT IME[2O(n)] ⊂P/poly. This result can be seen as a “high-end” form of our result (i.e., of Theorem I.6), where the latter will use a weaker 17Actually, our implementation of the [15] argument shows that if the hypothesis but deduce a weaker conclusion. function ECC(f ws) (where ECC is defined as in Section II-A) can be learned, 16That is, there exists a probabilistic algorithm that gets input 1n and then the function f ws can be efficiently computed. For simplicity, we oracle access to f, and with high probability outputs an n-bit circuit of ignore the difference between f ws and ECC(f ws) in the current high-level size n · (log n)k that agrees with f on almost all inputs. decription.

21 The work was initiated in the 2018 Complexity Workshop [9] V. V. Williams, “Hardness of easy problems: basing hard- in Oberwolfach; the authors are grateful to the Mathematis- ness on popular conjectures such as the Strong Exponential ches Forschungsinstitut Oberwolfach and to the organizers Time Hypothesis,” in Proc. 10th International Symposium on Parameterized and Exact Computation, 2015, vol. 43, pp. 17– of the workshop for the productive and lovely work en- 29. vironment. Lijie Chen is supported by NSF CCF-1741615 and a Google Faculty Research Award. Ron Rothblum is [10] ——, “On some fine-grained questions in algorithms and supported in part by a Milgrom family grant, by the Israeli complexity,” 2018, accessed at https://people.csail.mit.edu/ Science Foundation (Grant No. 1262/18), and the Technion virgi/eccentri.pdf, October 17, 2019. Hiroshi Fujiwara cyber security research center and Israel [11] M. L. Carmosino, R. Impagliazzo, and M. Sabin, “Fine- cyber directorate. Roei Tell is supported by funding from grained derandomization: from problem-centric to resource- the European Research Council (ERC) under the European centric complexity,” in Proc. 45th International Colloquium Union’s Horizon 2020 research and innovation programme on Automata, Languages and Programming (ICALP), 2018, (grant agreement No. 819702). Eylon Yogev is funded by pp. Art. No. 27, 16. the ISF grants 484/18, 1789/19, Len Blavatnik and the [12] N. Nisan and A. Wigderson, “Hardness vs. randomness,” Blavatnik Foundation, and The Blavatnik Interdisciplinary Journal of Computer and System Sciences, vol. 49, no. 2, Cyber Research Center at Tel Aviv University. Part of this pp. 149–167, 1994. work was done while the fourth author was visiting the Simons Institute for the Theory of Computing. [13] C. Umans, “Pseudo-random generators for all hardnesses,” Journal of Computer and System Sciences, vol. 67, no. 2, pp. 419–440, 2003. REFERENCES [14] R. Impagliazzo and A. Wigderson, “P=BPPif E requires [1] L. Chen, R. Rothblum, R. Tell, and E. Yogev, “On exponential circuits: derandomizing the XOR lemma,” in exponential-time hypotheses, derandomization, and circuit Proc. 29th Annual ACM Symposium on Theory of Computing lower bounds,” Electronic Colloquium on Computational (STOC), 1999, pp. 220–229. Complexity: ECCC, vol. 26, p. 169, 2019. [15] ——, “Randomness vs. time: De-randomization under a uni- [2] R. Impagliazzo and R. Paturi, “On the complexity of k-SAT,” form assumption,” in Proc. 39th Annual IEEE Symposium on Journal of Computer and System Sciences, vol. 62, no. 2, pp. Foundations of Computer Science (FOCS), 1998, pp. 734–. 367–375, 2001. [16] J.-Y. Cai, A. Nerurkar, and D. Sivakumar, “Hardness and hi- [3] R. Impagliazzo, R. Paturi, and F. Zane, “Which problems have erarchy theorems for probabilistic quasi-polynomial time,” in strongly exponential complexity?” Journal of Computer and Proc. 31st Annual ACM Symposium on Theory of Computing System Sciences, vol. 63, no. 4, pp. 512–530, 2001. (STOC) ), 1999, pp. 726–735.

[4] H. Dell, T. Husfeldt, D. Marx, N. Taslaman, and M. Wahlen,´ [17] V. Kabanets, “Easiness assumptions and hardness tests: trad- “Exponential time complexity of the permanent and the Tutte ing time for zero error,” 2001, vol. 63, no. 2, pp. 236–252. polynomial,” ACM Transactions on Algorithms, vol. 10, no. 4, pp. Art. 21, 32, 2014. [18] C.-J. Lu, “Derandomizing Arthur-Merlin games under uni- form assumptions,” Computational Complexity, vol. 10, no. 3, pp. 247–259, 2001. [5] M. L. Carmosino, J. Gao, R. Impagliazzo, I. Mihajlin, R. Pa- turi, and S. Schneider, “Nondeterministic extensions of the [19] D. Gutfreund, R. Shaltiel, and A. Ta-Shma, “Uniform hard- strong exponential time hypothesis and consequences for ness versus randomness tradeoffs for Arthur-Merlin games,” non-reducibility,” in Proc. 7th Conference on Innovations in Computational Complexity, vol. 12, no. 3-4, pp. 85–130, Theoretical Computer Science (ITCS), 2016, pp. 261–270. 2003.

[6] R. R. Williams, “Strong ETH breaks with Merlin and Arthur: [20] L. Trevisan and S. P. Vadhan, “Pseudorandomness and short non-interactive proofs of batch evaluation,” in Proc. average-case complexity via uniform reductions,” Computa- 31st Annual IEEE Conference on Computational Complexity tional Complexity, vol. 16, no. 4, pp. 331–364, 2007. (CCC), 2016, vol. 50, pp. Art. No. 2, 17. [21] R. Shaltiel and C. Umans, “Low-end uniform hardness vs. [7] G. J. Woeginger, “Exact algorithms for NP-hard problems: a randomness tradeoffs for AM,” in Proc. 39th Annual ACM survey,” in Combinatorial optimization—Eureka, you shrink!, Symposium on Theory of Computing (STOC), 2007, pp. 430– ser. Lecture Notes in Computer Science. Springer, Berlin, 439. 2003, vol. 2570, pp. 185–207. [22] D. Gutfreund and S. Vadhan, “Limitations of hardness vs. [8] D. Lokshtanov, D. Marx, and S. Saurabh, “Lower bounds randomness under uniform reductions,” in Proc. 12th In- based on the exponential time hypothesis,” Bulletin of the ternational Workshop on Randomization and Approximation European Association for Theoretical Computer Science Techniques in Computer Science (RANDOM), 2008, pp. 469– (EATCS), no. 105, pp. 41–71, 2011. 482.

22 [23] O. Goldreich, “In a world of P=BPP,” in Studies in Complex- [35] C. Lund, L. Fortnow, H. Karloff, and N. Nisan, “Algebraic ity and Cryptography. Miscellanea on the Interplay Random- methods for interactive proof systems,” Journal of the Associ- ness and Computation, 2011, pp. 191–232. ation for Computing Machinery, vol. 39, no. 4, pp. 859–868, 1992. [24] L. Babai, L. Fortnow, N. Nisan, and A. Wigderson, “BPP has subexponential time simulations unless EXPTIME has [36] A. Shamir, “IP = PSPACE,” Journal of the ACM, vol. 39, publishable proofs,” Computational Complexity, vol. 3, no. 4, no. 4, pp. 869–877, 1992. pp. 307–318, 1993. [37] O. Goldreich and G. N. Rothblum, “Worst-case to average- [25] I. C. Oliveira, “Algorithms versus circuit lower bounds,” case reductions for subclasses of P,” Electronic Colloquium Electronic Colloquium on Computational Complexity: ECCC, on Computational Complexity: ECCC, vol. 26, p. 130, 2017. vol. 20, p. 117, 2013. [38] O. Goldreich and L. A. Levin, “A hard-core predicate for all [26] R. Williams, “Algorithms for circuits and circuits for algo- one-way functions,” in Proc. 21st Annual ACM Symposium rithms: Connecting the tractable and intractable,” in Proc. on Theory of Computing (STOC), 1989, pp. 25–32. International Congress of Mathematicians (ICM), 2014, pp. 659–682. [39] L. Fortnow, R. Santhanam, and R. Williams, “Fixed- polynomial size circuit bounds,” in Proc. 24th Annual IEEE [27] R. Impagliazzo, V. Kabanets, and A. Wigderson, “In search of Conference on Computational Complexity (CCC), 2009, pp. an easy witness: exponential time vs. probabilistic polynomial 19–26. time,” Journal of Computer and System Sciences, vol. 65, no. 4, pp. 672–694, 2002. [40] O. Goldreich and O. Meir, “Input-oblivious proof systems and a uniform complexity perspective on P/poly,” ACM Transac- [28] L. Chen and H. Ren, “Strong average-case circuit lower tions on Computation Theory, vol. 7, no. 4, pp. Art. 16, 13, bounds from non-trivial derandomization,” in Proc. 52th 2015. Annual ACM Symposium on Theory of Computing (STOC), 2020. [41] I. C. Oliveira and R. Santhanam, “Conspiracies between learning algorithms, circuit lower bounds, and pseudorandom- [29] C. Murray and R. Williams, “Circuit lower bounds for non- ness,” in Proc. 32nd Annual IEEE Conference on Computa- deterministic quasi-polytime: An easy witness lemma for tional Complexity (CCC), 2017, vol. 79, pp. Art. No. 18, 49. and nqp,” in Proc. 50th Annual ACM Symposium on Theory of Computing (STOC), 2018. [42] L. Chen and R. R. Williams, “Stronger Connections Between Circuit Analysis and Circuit Lower Bounds, via PCPs of [30] R. Tell, “Proving that prBPP = prP is as hard as proving Proximity,” in Proc. 34th Annual IEEE Conference on Com- that “almost NP” is not contained in P/poly,” Information putational Complexity (CCC), 2019, pp. 19:1–19:43. Processing Letters, vol. 152, p. 105841, 2019. [43] L. Chen, “Non-deterministic quasi-polynomial time is [31] R. Santhanam and R. Williams, “On medium-uniformity and average-case hard for ACC circuits,” in Proc. 60th An- circuit lower bounds,” in Proc. 28th Annual IEEE Conference nual IEEE Symposium on Foundations of Computer Science on Computational Complexity (CCC), 2013, pp. 15–23. (FOCS), 2019.

[32] R. Kannan, “Circuit-size lower bounds and non-reducibility [44] L. Fortnow and A. R. Klivans, “Efficient learning algorithms to sparse sets,” Information and Control, vol. 55, no. 1-3, pp. yield circuit lower bounds,” Journal of Computer and System 40–56, 1982. Sciences, vol. 75, no. 1, pp. 27–36, 2009.

[33] R. Santhanam, “Circuit lower bounds for Merlin-Arthur [45] R. C. Harkins and J. M. Hitchcock, “Exact learning al- classes,” SIAM Journal of Computing, vol. 39, no. 3, pp. gorithms, betting games, and circuit lower bounds,” ACM 1038–1061, 2009. Transactions on Computation Theory, vol. 5, no. 4, pp. Art. 18, 11, 2013. [34] R. Williams, “Improving exhaustive search implies superpoly- nomial lower bounds,” SIAM Journal of Computing, vol. 42, [46] A. Klivans, P. Kothari, and I. Oliveira, “Constructing hard no. 3, pp. 1218–1244, 2013. functions using learning algorithms,” in Proc. 28th Annual IEEE Conference on Computational Complexity (CCC), 2013, pp. 86–97.

23