Formalizing Randomized Matching Algorithms

Formalizing Randomized Matching Algorithms Dai Tri Man Leˆ and Stephen A. Cook Department of Computer Science, University of Toronto fledt, [email protected] Abstract—Using Jerˇabek’s´ framework for probabilistic rea- set to another. Using this method, Jerˇabek´ developed tools for soning, we formalize the correctness of two fundamental RNC2 describing algorithms in ZPP and RP. He also showed in [13], algorithms for bipartite perfect matching within the theory VPV [14] that the theory VPV+sWPHP(L ) is powerful enough to for polytime reasoning. The first algorithm is for testing if a FP bipartite graph has a perfect matching, and is based on the formalize proofs of very sophisticated derandomization results, Schwartz-Zippel Lemma for polynomial identity testing applied e.g. the Nisan-Wigderson theorem [23] and the Impagliazzo- to the Edmonds polynomial of the graph. The second algorithm, Wigderson theorem [12]. (Note that Jerˇabek´ actually used the due to Mulmuley, Vazirani and Vazirani, is for finding a perfect single-sorted theory PV1+sWPHP(PV), but these two theories matching, where the key ingredient of this algorithm is the are isomorphic.) Isolating Lemma. In [15], Jerˇabek´ developed an even more systematic approach by showing that for any bounded P=poly definable set, I. INTRODUCTION there exists a suitable pair of surjective “counting functions” 1 There is a substantial literature on theories such as PV, S2 , which can approximate the cardinality of the set up to a poly- VPV, V 1 which capture polynomial time reasoning [8], [4], nomially small error. From this and other results he argued [17], [7]. These theories prove the existence of polynomial convincingly that VPV + sWPHP(LFP) is the “right” theory time functions, and in many cases they can prove properties for reasoning about probabilistic polynomial time algorithms. of these functions that assert the correctness of the algorithms However so far no one has used his framework for feasible computing these functions. But in general these theories cannot reasoning about specific interesting randomized algorithms in prove the existence of probabilistic polynomial time relations classes such as RP and RNC2. such at those in ZPP; RP; BPP because defining the relevant In the present paper we analyze (in VPV) two such algo- probabilities involves defining cardinalities of exponentially rithms using Jerˇabek’s´ framework. The first one is the RNC2 large sets. Of course stronger theories, those which can define algorithm for determining whether a bipartite graph has a #P or PSPACE functions can treat these probabilities, but perfect matching, based on the Schwartz-Zippel Lemma [26], such theories are too powerful to capture the spirit of feasible [33] for polynomial identity testing applied to the Edmonds reasoning. polynomial [9] associated with the graph. The second algo- Note that we cannot hope to find theories that exactly cap- rithm, due to Mulmuley, Vazirani and Vazirani [21], is in the ture probabilistic complexity classes such as ZPP; RP; BPP function class associated with RNC2, and uses the Isolating because these are ‘semantic’ classes which we suppose are not Lemma to find such a perfect matching when it exists. Prov- recursively enumerable (cf. [29]). Nevertheless there has been ing correctness of these algorithms involves proving that the significant progress toward developing tools in weak theories probability of error is bounded above by 1=2. We formulate that might be used to describe some of the algorithms in these this assertion in a way suggested by Jerˇabek’s´ framework (see classes. Definition 1). This involves defining polynomial time functions Paris, Wilkie and Woods [24] and Pudlak´ [25] observed that from f0; 1gn onto f0; 1g × Φ(n), where Φ(n) is the set of we can simulate approximate counting in bounded arithmetic random bit strings of length n which cause an error in the by applying variants of the weak pigeonhole principle. It seems computation. We then show that VPV proves that the function unlikely that any of these variants can be proven in the the- is a surjection. ories for polynomial time, but they can be proven in Buss’s Our proofs are carried out in the theory VPV for polyno- theory S2 for the polynomial hierarchy. The first connection mial time reasoning, without the weak surjective pigeonhole between the weak pigeonhole principle and randomized algo- principle sWPHP(LFP). Jerˇabek´ used the sWPHP(LFP) prin- rithms was noticed by Wilkie (cf. [17]), who showed that ran- ciple to prove theorems justifying the above definition of error b 1 domized polytime functions witness Σ1-consequences of S2 + probability, but we do not need it to apply his definition. B 1 sWPHP(PV) (i.e., Σ1 -consequences of V + sWPHP(LFP) Many proofs concerning determinants are based on the La- in our two-sorted framework), where sWPHP(PV) denotes the grange expansion. Since our proofs in VPV can only use poly- surjective weak pigeonhole principle for all PV functions (i.e. nomial time concepts, we cannot formalize such proofs, and polytime functions). must use other techniques. In the same vein, the standard proof Building on these early results, Jerˇabek´ [13] showed that of the Schwartz-Zippel Lemma assumes that a multivariate we can “compare” the sizes of two bounded P=poly definable polynomial given by an arithmetic circuit can be expanded to sets within VPV by constructing a surjective mapping from one a sum of monomials. But this sum in general has exponentially B many terms so again we cannot directly formalize this proof function is Σ0 -definable from L if it is p-bounded and its B in VPV. graph is represented by a Σ0 (L)-formula. The theory V 0 for AC0 is the basis to develop theories II. PRELIMINARIES for small complexity classes within P in [7]. The theory V 0 2 A. Basic bounded arithmetic consists of the vocabulary LA and axiomatized by the sets of 2-BASIC axioms, which express basic properties of symbols The theory VPV for polynomial time reasoning used here 2 in LA, together with the comprehension axiom schema is a two-sorted theory described by Cook and Nguyen [7]. B 2 The two-sorted language has variable x; y; z; : : : ranging over Σ0 (LA)-COMP: 9X ≤ y 8z < y X(z) $ '(z) ; N and variables X; Y; Z; : : : ranging over finite subsets of N, where ' 2 ΣB(L2 ) and X does not occur free in '. 2 0 A interpreted as bit strings. Two sorted vocabulary LA includes In [7, Chapter 5], it was shown that V 0 is finitely axioma- the usual symbols 0; 1; +; ·; =; ≤ for arithmetic over N, the tizable and a p-bounded function is in FAC0 iff it is provably length function jXj for strings, the set membership relation total in V 0. A universally-axiomatized conservative extension 2, and string equality =2 (subscript 2 is usually omitted). We V 0 of V 0 was also obtained by introducing function symbols will use the notation X(t) for t 2 X, and think of X(t) as and their defining axioms for all FAC0 functions. th the t bit in the string X. In [7, Chapter 9], Cook and Nguyen showed how to asso- 2 The number terms in the base language LA are built from ciate a theory VC to each complexity class C ⊆ P, where VC the constants 0; 1, variables x; y; z; : : : and length terms jXj extends V 0 with an additional axiom asserting the existence using + and ·. The only string terms are string variables, but of a solution to a complete problem for C. General techniques 2 when we extend LA by adding string-valued functions, other are also presented for defining a universally-axiomatized con- string terms will be built as usual. The atomic formulas are servative extension VC of VC which has function symbols t = u, X = Y , t ≤ u, t 2 X for any number terms x; y and defining axioms for every function in FC , and VC ad- and string variables X; Y . Formulas are built from atomic mits induction on open formulas in this enriched vocabulary. formulas using ^; _; : and 9x, 9X, 8x, 8X. Bounded number It follows from Herbrand’s Theorem that the provably-total quantifiers are defined as usual, and bounded string quantifier functions in VC (and hence in VC ) are precisely the func- 9X ≤ t; ' stands for 9X(jXj ≤ t ^ ') and 8X ≤ t; ' stands tions in FC . Using this framework, Cook and Nguyen defined for 8X(jXj ≤ t ! '), where X does not appear in term t. explicitly theories for various complexity classes within P. B 2 Σ0 is the class of all LA-formulas with no string quantifiers Since we need some basic linear algebra in this paper, we B and only bounded number quantifiers. Σ1 -formulas are those are interested in the two-sorted theory V #L and its univer- ~ ~ B of the form 9X < t ', where ' 2 Σ0 and the prefix of sal conservative extension V #L from [6]. Recall that #L the bounded quantifiers might be empty. These classes are is usually defined as the class of functions f such that for B B extended to Σi (and Πi ) for all i ≥ 0, in the usual way. some nondeterministic logspace Turing machine M, f(x) is Two-sorted complexity classes contain relations R(~x; X~ ), the number of accepting computations of M on input x. Since where ~x are number arguments and X~ are string arguments. counting the number of accepting paths of nondeterministic In defining complexity classes using machines or circuits, the logspace is AC0-equivalent to matrix powering, V #L was number arguments are represented in unary notation and the defined to be the extension of the base theory V 0 with an string arguments are represented in binary.

Formalizing Randomized Matching Algorithms

Coded Sparse Matrix Multiplication

Algebraic Algorithms in Combinatorial Optimization

Coded Computation for Speeding up Distributed Machine Learning

Matchings in Graphs 1 Definitions 2 Existence of a Perfect Matching

A Random Linear Network Coding Approach to Multicast Tracey Ho, Member, IEEE, Muriel Médard, Senior Member, IEEE, Ralf Koetter, Senior Member, IEEE, David R

Advanced Algorithms, a Graduate-Level Course Taught by Anupam Gupta at Carnegie Mellon University in Fall 2020

Network Flow Algorithms for Wireless Networks and Design and Analysis of Rate Compatible LDPC Codes Cuizhu Shi Iowa State University

Coded Sparse Matrix Multiplication

Exact Covers Via Determinants Andreas Björklund

Admin Review String Matching

Lecture 11 — January 28, 2014 1 Overview 2 Ideas from Linear Algebra

Algebraic Algorithms for Matching