The Computational Complexity of Randomness by Thomas Weir Watson
The Computational Complexity of Randomness

by

Thomas Weir Watson

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley

Committee in charge:
Professor Luca Trevisan, Co-Chair
Professor Umesh Vazirani, Co-Chair
Professor Satish Rao
Professor David Aldous

Spring 2013

Copyright 2013 by Thomas Weir Watson

Abstract

This dissertation explores the multifaceted interplay between efficient computation and probability distributions. We organize the aspects of this interplay according to whether the randomness occurs primarily at the level of the problem or the level of the algorithm, and orthogonally according to whether the output is random or the input is random.

Part I concerns settings where the problem’s output is random. A sampling problem associates to each input x a probability distribution D(x), and the goal is to output a sample from D(x) (or at least get statistically close) when given x. Although sampling algorithms are fundamental tools in statistical physics, combinatorial optimization, and cryptography, and algorithms for a wide variety of sampling problems have been discovered, there has been comparatively little research viewing sampling through the lens of computational complexity. We contribute to the understanding of the power and limitations of efficient sampling by proving a time hierarchy theorem which shows, roughly, that “a little more time gives a lot more power to sampling algorithms.”

Part II concerns settings where the algorithm’s output is random. Even when the specification of a computational problem involves no randomness, one can still consider randomized algorithms that produce a correct output with high probability. A basic question is whether one can derandomize algorithms, i.e., reduce or eliminate their dependence on randomness without incurring much loss in efficiency. Although derandomizing arbitrary time-efficient algorithms can only be done assuming unproven complexity-theoretic conjectures, it is possible to unconditionally construct derandomization tools called pseudorandom generators for restricted classes of algorithms. We give an improved pseudorandom generator for a new class, which we call combinatorial checkerboards.

The notion of pseudorandomness shows up in many contexts besides derandomization. The so-called Dense Model Theorem, which has its roots in the famous theorem of Green and Tao that the primes contain arbitrarily long arithmetic progressions, is a result about pseudorandomness that has turned out to be a useful tool in computational complexity and cryptography. At the heart of this theorem is a certain type of reduction, and in this dissertation we prove a strong lower bound on the advice complexity of such reductions, which is analogous to a list size lower bound for list-decoding of error-correcting codes.

Part III concerns settings where the problem’s input is random. We focus on the topic of randomness extraction, which is the problem of transforming a sample from a high-entropy but non-uniform probability distribution (that represents an imperfect physical source of randomness) into a uniformly distributed sample (which would then be suitable for use by a randomized algorithm). It is natural to model the input distribution as being sampled by an efficient algorithm (since randomness is generated by efficient processes in nature), and we give a construction of an extractor for the case where the input distribution’s sampler is extremely efficient in parallel time.

A related problem is to estimate the min-entropy (“amount of randomness”) of a given parallel-time-efficient sampler, since this dictates how many uniformly random output bits one could hope to extract from it. We characterize the complexity of this problem, showing that it is “slightly harder than NP-complete.”

Part IV concerns settings where the algorithm’s input is random. Average-case complexity is the study of the power of algorithms that are allowed to fail with small probability over a randomly chosen input. This topic is motivated by cryptography and by modeling heuristics. A fundamental open question is whether the average-case hardness of NP is implied by the worst-case hardness of NP. We exhibit a new barrier to showing this implication, by proving that a certain general technique (namely, “relativizing proofs by reduction”) cannot be used. We also prove results on hardness amplification, clarifying the quantitative relationship between the running time of an algorithm and the probability of failure over a random input (both of which are desirable to minimize).

Contents

1 Introduction
  1.1 Computational Complexity
  1.2 Randomness
  1.3 Our Contributions
    1.3.1 Part I
    1.3.2 Part II
    1.3.3 Part III
    1.3.4 Part IV

I The Problem’s Output is Random

2 Time Hierarchies for Sampling Distributions
  2.1 Introduction
    2.1.1 Results
    2.1.2 Discussion
  2.2 Intuition for Theorem 2.1
  2.3 Proof of Theorem 2.1
  2.4 List-Decoding from Ubiquitous Informative Errors
    2.4.1 Discussion of the Model of Error-Correction
    2.4.2 Proof of Theorem 2.3
      2.4.2.1 A Combinatorial Lemma
      2.4.2.2 Code Construction

II The Algorithm’s Output is Random

3 Pseudorandom Generators for Combinatorial Checkerboards
  3.1 Introduction
    3.1.1 Combinatorial Checkerboards
    3.1.2 Our Result
    3.1.3 Overview of the Proof
    3.1.4 Preliminaries
  3.2 Deriving Theorem 3.3 from Lemma 3.7 and Lemma 3.8
  3.3 The High-Weight Case
    3.3.1 Proof of Lemma 3.7
    3.3.2 Proof of Lemma 3.14
    3.3.3 Proof of the Tree Labeling Lemma
      3.3.3.1 Intuition
      3.3.3.2 Formal Argument
  3.4 The Low-Weight Case
    3.4.1 Proof of Lemma 3.8
    3.4.2 Dimension Reduction
    3.4.3 Proof of Lemma 3.36

4 Advice Lower Bounds for the Dense Model Theorem
  4.1 Introduction
    4.1.1 Related Work
  4.2 Intuition
  4.3 Formal Proof
    4.3.1 Binomial Distribution Tail
    4.3.2 Combinatorial Designs
    4.3.3 Notational Preliminaries
    4.3.4 Proof of Theorem 4.6
      4.3.4.1 Distributions Without Dense Models
      4.3.4.2 Distributions That Cannot Be Covered
      4.3.4.3 Setting the Parameters
      4.3.4.4 The Majority of Majorities
      4.3.4.5 Putting It All Together

III The Problem’s Input is Random

5 Extractors and Lower Bounds for Locally Samplable Sources
  5.1 Introduction
    5.1.1 Results
    5.1.2 Techniques
    5.1.3 Concurrent Work
    5.1.4 Previous Work on the Power of Locally Computable Functions
  5.2 Preliminaries
  5.3 1-Local Sources
  5.4 d-Local Sources
    5.4.1 Superindependent Matchings
    5.4.2 Proof of Theorem 5.13
  5.5 Increasing the Output Length
    5.5.1 The General Method
    5.5.2 Applying Theorem 5.19
  5.6 Improved Lower Bounds for Sampling Input-Output Pairs
  5.7 Improved Extractors for Low-Weight Affine Sources
    5.7.1 A Linear Strong Seeded Extractor with Seed Length log n + O(log k)
    5.7.2 Proof of Theorem 5.10

6 The Complexity of Estimating Min-Entropy
  6.1 Introduction
    6.1.1 Definitions
    6.1.2 Results
    6.1.3 Related Work
  6.2 Proof of Theorem 6.3 and Theorem 6.4
  6.3 Proof of Theorem 6.5
  6.4 SBP is Closed Under Nondeterminism

IV The Algorithm’s Input is Random

7 Relativized Worlds Without Worst-Case to Average-Case Reductions for NP
  7.1 Introduction
    7.1.1 Notions of Reductions and Relationship to Previous Work
    7.1.2 Results
    7.1.3 Independent Work
    7.1.4 Organization
  7.2 Preliminaries
    7.2.1 Complexity Classes
    7.2.2 Reductions
    7.2.3 Relativization
    7.2.4 Uniform Complexity