Research Statement Prahladh Harsha
1 Introduction

My research interests are in the area of theoretical computer science, with special emphasis on computational complexity. The primary focus of my research has been probabilistically checkable proofs and related areas such as property testing and information theory.

The fundamental question in computational complexity is "what are the limits of feasible computation?" One articulation of this question is the famous "P vs. NP" question, where P refers to the class of problems that can be solved in polynomial time and NP to the class of problems whose solutions can be verified in polynomial time. To understand the limits of efficient computation, we first need to understand what we mean by "efficient computation," and this natural connection between feasibility and hardness has often led to surprising consequences in complexity theory. A prime example is probabilistically checkable proofs. The original emphasis in their study was on program checking; surprisingly, it was soon realized that the existence of efficient probabilistic program checkers implies that the approximation versions of several NP-complete optimization problems are as intractable as the original optimization problems.

Probabilistically checkable proofs provide an extremely efficient means of proof verification. The classical complexity class NP is the class of languages whose membership can be verified with the aid of a polynomial-sized proof. Probabilistically checkable proofs (PCPs) are a means of encoding these proofs (and, more generally, any mathematical proof) into a format in which the encoded proof can be checked very efficiently, albeit probabilistically, by probing it in only a constant number of locations (in fact, 3 bits suffice!). The main question addressed by my research in this area is the following: how much does this encoding blow up the original proof while retaining a constant number of queries into the proof, and how efficiently (with respect to running time) can the checking be performed?

An important contribution of my work is the notion of a proof of proximity (also called a PCP of proximity). A PCP of proximity strengthens a PCP in the following sense: it helps decide whether a statement is true, with the aid of an additional proof in the form of a PCP, while probing the statement itself at only a few locations. In other words, a PCP of proximity makes a constant number of probes not only into the proof but also into the statement whose truth it is checking. With such a stringent requirement, a PCP of proximity cannot distinguish true statements from false ones; it can, however, distinguish true statements from ones that are far from being true (in the sense that the statement is far in Hamming distance from every true statement). Thus, a PCP of proximity checks whether a given statement is close to being true, without even reading the statement in its entirety! Hence the name, proof of proximity. PCPs of proximity play a vital role in the construction of short PCPs, both in my work and in subsequent developments in the area. They are also used in coding theory: all known constructions of locally testable codes are via PCPs of proximity. Finally, PCPs of proximity have come in very handy in simplifying the original proof of the PCP Theorem, which is one of the most involved proofs in complexity theory. In fact, the recent fully combinatorial proof of the PCP Theorem (due to Dinur [Din07]) crucially relies on PCPs of proximity.
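For concreteness, these guarantees can be stated in standard notation, in simplified form (the constants below, such as the soundness error 1/2 and the proximity parameter δ, are the conventional illustrative choices, not parameters specific to my constructions). The PCP Theorem, discussed in Section 2, asserts that

    NP = PCP_{1,1/2}[O(log n), O(1)],

i.e., every NP statement of length n admits a proof format in which a verifier tosses O(log n) random coins, queries O(1) bits of the proof, accepts a valid proof of a true statement with probability 1, and accepts any purported proof of a false statement with probability at most 1/2. A PCP of proximity for a language L modifies this as follows: the verifier is given oracle access to both the statement x and a proof π, and makes O(1) queries in total;

    if x ∈ L, then some π is accepted with probability 1, and
    if x is δ-far in Hamming distance from every member of L, then every π is rejected with probability at least 1/2.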
As mentioned above, the main focus of my research has been probabilistically checkable proofs. I have also worked in other areas such as property testing, information theory, proof complexity, and network routing. Below I elaborate on my work in three of these areas: probabilistically checkable proofs, property testing, and information theory.

2 Probabilistically checkable proofs

The PCP Theorem: The PCP Theorem [AS98, ALM+98] is one of the crowning achievements of complexity theory in the last decade. Probabilistically checkable proofs [BFLS91, FGL+96, AS98], as mentioned earlier, are proofs that allow efficient probabilistic verification based on probing just a few bits of the proof. Informally speaking, the PCP Theorem states that any mathematical proof can be rewritten into a polynomially longer probabilistically checkable proof (PCP) whose veracity can be checked very efficiently, although in a probabilistic manner, by looking at the rewritten proof in only a constant number of locations (in fact, 3 bits suffice); furthermore, proofs of false assertions are rejected with probability at least 1/2.

The PCP Theorem has, since its discovery, attracted a lot of attention, motivated by its connection to the inapproximability of optimization problems [FGL+96, AS98]. This connection led to a long line of fruitful research yielding inapproximability results (many of them optimal) for several optimization problems (e.g., Set-Cover [Fei98], Max-Clique [Hås99], MAX-3SAT [Hås01]). However, the significance of PCPs extends far beyond their applicability to deriving inapproximability results. The mere fact that proofs can be transformed into a format that supports super-fast probabilistic verification is remarkable. One would naturally have expected PCPs, as the name suggests, to lead to vast improvements in automated proof-checkers, theorem-provers, etc. Unfortunately, this has not been the case. The chief reason why PCPs are not used today in practice for automated proof-checking is that the blowup in proof size involved in all present constructions of PCPs makes it infeasible to do so. To put things in perspective, the original proof of the PCP Theorem [ALM+98] constructed PCPs of nearly cubic length with a query complexity roughly of the order of a million (in order to reject proofs of false assertions with probability at least 1/2). On the other hand, the optimal 3-query PCPs of [Hås01, GLST98] have length nearly n^{10^6}, which is still a polynomial!

Even with respect to inapproximability results, though the PCP Theorem has been extremely successful in proving tight hardness results, the quantitative nature of these results has been rather unsatisfactory, once again due to the blowup involved in PCP constructions. To understand this, it is instructive to compare the inapproximability results obtained from the PCP Theorem with the hardness results obtained from the usual textbook NP-completeness reductions. For example, the NP-completeness reduction from Satisfiability (SAT) to Clique transforms a Boolean formula on n variables into a graph with at most 10n vertices. On the other hand, the PCP reductions that show optimal inapproximability of Max-Clique transform a Boolean formula of size n into a graph of size at least n^{10^6}. In quantitative terms, if one assumes that solving satisfiability on formulae with 1,000 variables is intractable, then the NP-completeness reduction implies that solving Clique is intractable on graphs with 10,000 vertices, while the PCP reductions imply that the optimal inapproximability for Max-Clique sets in only on graphs of size at least 1000^{10^6}.
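To spell out the arithmetic behind this comparison (a back-of-the-envelope calculation using the representative figures above): with n = 1000,

    NP-completeness reduction: 10n = 10 · 1000 = 10^4 vertices;
    PCP reduction: n^{10^6} = (10^3)^{10^6} = 10^{3,000,000} vertices.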
2.1 My research

Short PCPs: Most of my work in the area of PCPs has focused on constructing short PCPs. In work done as part of my master's thesis [HS00], I examine the size and query complexity of PCPs jointly and obtain a construction with reasonable performance in both parameters (more precisely, n^3-sized proofs with a query complexity of 16 bits). In more recent work with Ben-Sasson, Goldreich, Sudan and Vadhan [BGH+06], I take a closer look at the PCP Theorem, simplify several parameters, and obtain shorter PCPs. In quantitative terms, we obtain PCPs of length at most n · exp(log^ε n) for any ε > 0, where n is the size of the original proof (note that for ε < 1, exp(log^ε n) = n^{o(1)}, so this length is barely super-linear).

PCPs of proximity and composition: Besides constructing short PCPs, the chief contribution of our work [BGH+06] is the simplification of PCP constructions. Previous constructions of PCPs were extremely involved and elaborate. One of the main reasons for this is that "proof composition," a key ingredient in all known PCP constructions, is a very involved process. We introduce "PCPs of proximity" (the variant of PCPs mentioned earlier in Section 1), which facilitate very smooth composition; in fact, composition becomes almost definitional and syntactic given this variant (see the schematic sketch at the end of this section). This new variant of PCPs and the corresponding composition have played a critical role in subsequent improvements in short PCP constructions (due to Ben-Sasson and Sudan [BS05] and Dinur [Din07]). Furthermore, these simplifications of the original proof of the PCP Theorem, in the guise of PCPs of proximity and the new composition, led to an alternate, purely combinatorial proof of the PCP Theorem, due to Dinur [Din07]. This work [BGH+06] was invited to the special issue of SIAM Journal on Computing on Randomness and Computation as well as the special issue of SIAM Journal on Computing for STOC 2004.

Efficient PCPs: In the context of efficient proof verifiers, the running time of the verification process is as important a parameter as the length of the PCP. In fact, the emphasis of the initial work of Babai et al. [BFLS91] in the area of PCPs was on the time taken by the verifier and the length of the proof in the new format. In contrast, most subsequent works on PCPs have focused on the query complexity of the verifier and derived many strong inapproximability results for a wide variety of optimization problems; however, no later work seems to have returned to the question of the extreme efficiency of the verifier.
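Schematically, composition via PCPs of proximity works roughly as follows (a simplified sketch; the exact parameters and the robustness condition on the outer verifier are suppressed). An outer verifier reads a large window w of its proof and checks a predicate φ(w). Rather than reading w in full, the composed verifier runs an inner PCP of proximity for the set of accepting windows, with oracle access to w and to an auxiliary proof π_w, making O(1) queries in total:

    Completeness: if φ(w) = 1, then some π_w is accepted with probability 1.
    Soundness: if w is δ-far from every w' with φ(w') = 1, then every π_w is rejected with probability at least 1/2.

The proximity guarantee is exactly what makes the composition sound: the outer verifier is first made "robust," so that on false statements its window is not merely rejecting but far from every accepting window, which is precisely the situation that the inner verifier's distance-based soundness handles.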