Bounded Arithmetic and Formalizing Probabilistic Proofs by Dai Tri Man

Bounded Arithmetic and Formalizing Probabilistic Proofs by Dai Tri Man Lê A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department of Computer Science University of Toronto c Copyright 2014 by Dai Tri Man Lê Abstract Bounded Arithmetic and Formalizing Probabilistic Proofs Dai Tri Man Lê Doctor of Philosophy Graduate Department of Computer Science University of Toronto 2014 The first theme of this thesis investigates the complexity class CC and its associated bounded-arithmetic theory. Subramanian defined CC as the class of problems log-space reducible to the comparator circuit value problem (Ccv). Using the Cook-Nguyen method we define the two-sorted theory VCC whose provably-total functions are exactly the CC functions. To apply this method, we show CC is the same as the class of problems computed by uniform AC0 circuits with unbounded Ccv oracle gates. We prove that VNL ⊆ VCC ⊆ VP, where VNL and VP are theories for the classes NL and P respectively. We strengthen Subramanian’s work by showing that the problems in his paper are indeed complete for CC under many-one AC0 reductions. We then prove the correctness of these reductions in VCC. The second theme of this thesis is formalizing probabilistic proofs in bounded arithmetic. In a series of papers, Jeˇrábekargued that the universal polynomial-time theory VPV augmented with the surjective weak pigeonhole principle sWPHP(LFP) for all VPV functions is the ‘right’ theory for randomized polynomial-time reasoning in bounded arithmetic. Motivated from the fact that no one had used Jeˇrábek’sframework for feasible reasoning about specific interesting randomized algorithms in classes such as RP and RNC2, we formalize in VPV the correctness of two fundamental RNC2 algorithms for testing if a bipartite graph has a perfect matching and for finding a bipartite perfect matching. Using Moser’s recent constructive proof technique for the Lovász Local Lemma, we show that VPV + sWPHP(LFP) proves the existence of a satisfying assignment for every instance of k-SAT in which every clause shares a variable with up to 2k−3 other clauses. This result implies the existence of a randomized polynomial-time algorithm for find satisfying assignments such k-SAT instances. The remainder of this thesis was motivated by the lack of fundamental probability concepts like random variables, expectation and variance in Jeˇrábek’swork, which means basic yet useful theorems like Markov’s inequality, Chebyshev’s inequality, linearity of expectation, etc were not available in his work. By choosing suitable definitions of random variables, approximate probability and approximate expectation, we are able prove these theorems and utilize them to prove the Goldreich-Levin theorem A within the conservative extension HARD of VPV + sWPHP(LFP). ii Dedication To my parents and sisters for their unconditional love, guidance, and support. iii Acknowledgements I want to thank my advisor Steve Cook. It has been a real privilege to have Steve as my supervisor. My profound gratitude to him for his wisdom, constructive criticism, and mentorship. During my transition from the software engineering group to the theory group in 2008, I remember asking Steve about what it takes to work in theory and he said: “To do theory one only needs a good taste when choosing problems, and the persistency to follow them through.” A good taste like Steve’s is exceptional, but I have learned in a hard way that when things get difficult, persistency is crucially needed. Many thanks to my coauthors Steve Cook, Yuval Filmus, Yuli Ye and Wesley George for our joint papers that form the bulk of this thesis. I would also like to thank my other committee members, Charles Rackoff, Alasdair Urquhart and Toni Pitassi, and my external examiner Jan Krajíˇcek.They provided valuable feedback on my work and helped shape this thesis into its current form. Charles Rackoff told me his interesting perspective on the Lovász Local Lemma, which inspired the work in Chapter 5. I would like to thank Phuong The Nguyen, whose work with Steve laid the foundation for all of my work on bounded arithmetic. He also gave me very useful feedback on my research direction. I am also very fortunate to have him as a good friend. I am indebted to Jan Krajíˇcekand Emil Jeˇrábekin Prague. Jan generously invited me to the Fall Schools of Logic and Complexity 2009 and 2010, and the Special Semester in Logic and Complexity 2011 which he organized. Emil’s seminal work on approximate counting in bounded arithmetic is the main inspiration for the main results of this thesis. The meetings with Emil during my visit at Prague really helped me gain deeper understanding of his work. Gregory Moore played an important role in my decision of going to graduate school. His wonderful undergraduate set theory courses at McMaster University inspired me to do research relating to set theory and mathematical logic. He is always willing to give me valuable advice whenever I have academic or personal difficulty. I am extremely grateful for his friendship for the last 10 years. I am grateful to Yuli Ye for being a wonderful friend during the last five years. I will always miss the walks we had around the university campus regardless of the weather and our memorable conversations about research, philosophy and life. His emotional support and advice during the most difficult time have contributed greatly to the finishing of this thesis. Many thanks to Yuval Filmus for listening to my half-baked ideas and for his valuable feedback. His construction of universal comparator circuits opens possibilities for making progress in understanding the complexity class CC. While his scientific ability and work ethics are second to none, he is also my great friend. I would like to thank Sergey Gorbunov for organizing the trips to classical music concerts, and for taking part in our “food critic” adventures. He is also a great friend and a very promising scientist, whom I learnt a lot from. I thank my other colleagues and friends at the University of Toronto, especially Jaiganesh Balasun- daram, Steve Chaplick, Jackie CK Cheung, Lila Fontes, Wesley George, Kaveh Ghasemloo, Daniel Lister, David Liu, Lalla Mouatadid, Arron Norwell, Joel Oren, Periklis Papakonstantinou, Robert Robere, Jocelyn Simmonds, Dhinakaran Vinayagamurthy, Justin Ward and Dustin Wehr, for not only their iv intellectual insights but also their humour, friendship and moral support. My work would not have been possible without the excellent service provided by the departmental administrative staff members, especially Lisa DeCaro, Marina Haloulos, Vinita Krishnan and Wah-Ming Wong. I thank the Department of Computer Science, the Ontario Graduate Scholarships (2008, 2009 and 2010), and the Alfred B. Lehman Graduate Scholarship (2012 and 2013) for providing financial support for my study at the University of Toronto. I also thank the [European Community’s] Seventh Framework Programme [FP7/2007-2013] under grant agreement no 238381 for their generous funding during the Special Semester in Logic and Complexity 2011 in Prague. v Contents 1 Introduction 2 1.1 The complexity class CC and its two-sorted theory . .2 1.2 Formalizing probabilistic reasoning . .4 1.2.1 Formalizing randomized matching algorithms . .5 1.2.2 Formalizing Moser’s constructive proof of the Lovász Local Lemma . .6 1.2.3 Formalizing the proof of the Goldreich-Levin theorem . .7 1.3 Organization of the thesis . .8 2 Preliminaries 9 2.1 Background on bounded arithmetic . .9 2.1.1 Two-sorted vocabularies . .9 2.1.2 Two-sorted complexity classes and reductions . 10 2.1.3 Two-sorted theories . 11 2.1.4 Theories for polynomial-time reasoning . 13 2.1.5 The surjective weak pigeonhole principle and Wilkie’s Witnessing Theorem . 14 2.2 Notation . 15 2.3 Jeˇrábek’sapproximate counting framework . 16 3 The complexity class CC and its theory 21 3.1 Ccv and the complexity class CC ................................. 21 3.2 The stable marriage problem . 22 3.3 The new theory VCC ........................................ 23 3.4 Universal comparator circuits . 25 3.5 The theory VCC contains VNL .................................. 28 3.6 Lexicographical first maximal matching problem is CC-complete . 32 AC0 3.6.1 Ccv ≤m 3vLfmm ..................................... 33 AC0 3.6.2 vLfmm ≤m Ccv ..................................... 35 AC0 3.6.3 Ccv ≤m 3Lfmm ..................................... 35 AC0 3.6.4 Ccv: ≤m Ccv ...................................... 36 AC0 3.6.5 Lfmm ≤m Ccv: ..................................... 37 3.7 The Sm problem is CC-complete . 38 3.7.1 3Lfmm is AC0 many-one reducible to Sm,MoSm and WoSm ............ 38 3.7.2 MoSm and WoSm are AC0 many-one reducible to Ccv ................ 39 vi 4 Formalizing randomized matching algorithms 51 4.1 Edmonds’ Theorem . 52 4.2 The Schwartz-Zippel Lemma . 53 4.2.1 Edmonds’ polynomials for complete bipartite graphs . 54 4.2.2 Edmonds’ polynomials for general bipartite graphs . 55 4.2.3 The RNC2 algorithm for the bipartite perfect matching decision problem . 56 4.3 Formalizing the Hungarian algorithm . 56 4.4 FRNC2 algorithm for finding a bipartite perfect matching . 58 4.4.1 Isolating a perfect matching . 59 4.4.2 Extracting the unique minimum-weight perfect matching . 61 4.4.3 Related bipartite matching problems . 65 5 Moser’s constructive proof of the Lovász Local Lemma 66 5.1 Constructive proof for k-SAT with neighborhood size up to 2k−3 .............. 67 5.2 Constructive proofs for more general versions of the LLL . 69 6 Formalizing the Goldreich-Levin Theorem 71 6.1 Formalizing basic probability definitions in the conservative extension HARDA of VPV + sWPHP(LFP) ............................................ 71 6.1.1 Approximate probability . 74 6.1.2 Averaging principles . 75 6.1.3 Approximate expectation . 77 6.1.4 Approximate variance of a feasible random variable .

Bounded Arithmetic and Formalizing Probabilistic Proofs by Dai Tri Man

Formalizing Randomized Matching Algorithms

Coded Sparse Matrix Multiplication

Algebraic Algorithms in Combinatorial Optimization

Coded Computation for Speeding up Distributed Machine Learning

Matchings in Graphs 1 Definitions 2 Existence of a Perfect Matching

A Random Linear Network Coding Approach to Multicast Tracey Ho, Member, IEEE, Muriel Médard, Senior Member, IEEE, Ralf Koetter, Senior Member, IEEE, David R

Advanced Algorithms, a Graduate-Level Course Taught by Anupam Gupta at Carnegie Mellon University in Fall 2020

Network Flow Algorithms for Wireless Networks and Design and Analysis of Rate Compatible LDPC Codes Cuizhu Shi Iowa State University

Coded Sparse Matrix Multiplication

Exact Covers Via Determinants Andreas Björklund

Admin Review String Matching

Lecture 11 — January 28, 2014 1 Overview 2 Ideas from Linear Algebra