C Copyright 2007 Atri Rudra

Total Page:16

File Type:pdf, Size:1020Kb

C Copyright 2007 Atri Rudra c Copyright 2007 Atri Rudra List Decoding and Property Testing of Error Correcting Codes Atri Rudra A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Washington 2007 Program Authorized to Offer Degree: Computer Science and Engineering University of Washington Graduate School This is to certify that I have examined this copy of a doctoral dissertation by Atri Rudra and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made. Chair of the Supervisory Committee: Venkatesan Guruswami Reading Committee: Paul Beame Venkatesan Guruswami Dan Suciu Date: In presenting this dissertation in partial fulfillment of the requirements for the doctoral degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this dissertation is allowable only for scholarly purposes, consistent with “fair use” as prescribed in the U.S. Copyright Law. Requests for copying or reproduction of this dissertation may be referred to Proquest Information and Learning, 300 North Zeeb Road, Ann Arbor, MI 48106-1346, 1-800-521-0600, to whom the author has granted “the right to reproduce and sell (a) copies of the manuscript in microform and/or (b) printed copies of the manuscript made from microform.” Signature Date University of Washington Abstract List Decoding and Property Testing of Error Correcting Codes Atri Rudra Chair of the Supervisory Committee: Associate Professor Venkatesan Guruswami Department of Computer Science and Engineering Error correcting codes systematically introduce redundancy into data so that the original in- formation can be recovered when parts of the redundant data are corrupted. Error correcting codes are used ubiquitously in communication and data storage. The process of recovering the original information from corrupted data is called decod- ing. Given the limitations imposed by the amount of redundancy used by the error correct- ing code, an ideal decoder should efficiently recover from as many errors as information- theoretically possible. In this thesis, we consider two relaxations of the usual decoding procedure: list decoding and property testing. A list decoding algorithm is allowed to output a small list of possibilities for the original information that could result in the given corrupted data. This relaxation allows for effi- cient correction of significantly more errors than what is possible through usual decoding procedure which is always constrained to output the transmitted information. We present the first explicit error correcting codes along with efficient list-decoding • algorithms that can correct a number of errors that approaches the information-theoretic limit. This meets one of the central challenges in the theory of error correcting codes. We also present explicit codes defined over smaller symbols that can correct signifi- • cantly more errors using efficient list-decoding algorithms than existing codes, while using the same amount of redundancy. We prove that an existing algorithm for a specific code family called Reed-Solomon • codes is optimal for “list recovery,” a generalization of list decoding. Property testing of error correcting codes entails “spot checking” corrupted data to quickly determine if the data is very corrupted or has few errors. Such spot checkers are closely related to the beautiful theory of Probabilistically Checkable Proofs. We present spot checkers that only access a nearly optimal number of data symbols • for an important family of codes called Reed-Muller codes. Our results are the first for certain classes of such codes. We define a generalization of the “usual” testers for error correcting codes by en- • dowing them with the very natural property of “tolerance,” which allows slightly corrupted data to pass the test. TABLE OF CONTENTS Page ListofFigures...................................... vi ListofTables ...................................... vii Chapter1: Introduction.............................. 1 1.1 BasicsofErrorCorrectingCodes. ... 2 1.1.1 Historical Background and Modeling the Channel Noise ...... 3 1.2 ListDecoding................................. 5 1.2.1 GoingBeyondHalftheDistanceBound . 8 1.2.2 WhyisListDecodingAnyGood?. 10 1.2.3 The Challenge of List Decoding (and What Was Already Known) . 12 1.3 PropertyTestingofErrorCorrectingCodes . ...... 13 1.3.1 ABriefHistoryofPropertyTestingofCodes . ... 14 1.4 ContributionsofThisThesis . .. 15 1.4.1 ListDecoding............................. 15 1.4.2 PropertyTesting ........................... 17 1.4.3 OrganizationoftheThesis . 19 Chapter2: Preliminaries ............................. 20 2.1 TheBasics................................... 20 2.1.1 BasicDefinitionsforCodes . 21 2.1.2 CodeFamilies ............................ 22 2.1.3 LinearCodes ............................. 22 2.2 Preliminaries and Definitions Related to List Decoding . .......... 24 2.2.1 Ratevs.Listdecodability . 26 2.2.2 Results Related to the q-aryEntropyFunction . 30 2.3 DefinitionsRelatedtoPropertyTestingofCodes . ....... 34 i 2.4 CommonFamiliesofCodes . 35 2.4.1 Reed-SolomonCodes . 35 2.4.2 Reed-MullerCodes. 36 2.5 BasicFiniteFieldAlgebra . 36 Chapter3: ListDecodingofFoldedReed-SolomonCodes . ....... 38 3.1 Introduction.................................. 38 3.2 FoldedReed-SolomonCodes. 41 3.2.1 DescriptionofFoldedReed-SolomonCodes . .. 41 3.2.2 WhyMightFoldingHelp? . 43 3.2.3 RelationtoParvareshVardyCodes. 44 3.3 Problem Statement and Informal Description of the Algorithms. 45 3.4 TrivariateInterpolationBasedDecoding . ....... 48 3.4.1 FactsaboutTrivariateInterpolation . .... 49 3.4.2 Using Trivariate Interpolation for Folded RS Codes . ....... 51 3.4.3 Root-findingStep. .. .. .. .. .. .. .. 53 3.5 CodesApproachingListDecodingCapacity . ..... 56 3.6 ExtensiontoListRecovery . 63 3.7 BibliographicNotesandOpenQuestions. ..... 66 Chapter4: ResultsviaCodeConcatenation . ..... 69 4.1 Introduction.................................. 69 4.1.1 CodeConcatenationandListRecovery . 70 4.2 Capacity-Achieving Codes over Smaller Alphabets . ........ 72 4.3 BinaryCodesListDecodableuptotheZyablovBound . ...... 76 4.4 UniqueDecodingofaRandomEnsembleofBinary Codes . ..... 77 4.5 ListDecodinguptotheBlokhZyablovBound. .... 80 4.5.1 MultilevelConcatenatedCodes . 81 4.5.2 LinearCodeswithGoodNestedListDecodability . .... 84 4.5.3 ListDecodingMultilevelConcatenatedCodes . .... 88 4.5.4 PuttingitTogether . .. .. .. .. .. .. .. 92 4.6 BibliographicNotesandOpenQuestions. ..... 94 ii Chapter 5: List Decodability of Random Linear Concatenated Codes. 96 5.1 Introduction.................................. 96 5.2 Preliminaries ................................. 97 5.3 OverviewoftheProofTechniques . 101 5.4 ListDecodabilityofRandomConcatenatedCodes . .......103 5.5 UsingFoldedReed-SolomonCodeasOuterCode . .111 5.5.1 Preliminaries .............................111 5.5.2 TheMainResult ...........................113 5.6 BibliographicNotesandOpenQuestions. .117 Chapter6: LimitstoListDecodingReed-SolomonCodes . .......119 6.1 Introduction..................................119 6.2 OverviewoftheResults. 120 6.2.1 LimitationstoListRecovery . 120 6.2.2 Explicit“Bad”ListDecodingConfigurations . .122 6.2.3 ProofApproach. .. .. .. .. .. .. .. .123 6.3 BCH CodesandListRecoveringReed-SolomonCodes . .123 6.3.1 MainResult..............................123 6.3.2 ImplicationsforReed-SolomonListDecoding . .129 6.3.3 Implications for List Recovering Folded Reed-SolomonCodes . 130 6.3.4 A Precise Description of Polynomials with Values in BaseField . 131 6.3.5 SomeFurtherFactsonBCHCodes . 133 6.4 Explicit Hamming Balls with Several Reed-Solomon Codewords . 135 6.4.1 ExistenceofBadListDecodingConfigurations . .135 6.4.2 LowRateReed-SolomonCodes . 136 6.4.3 HighRateReed-SolomonCodes . 140 6.5 BibliographicNotesandOpenQuestions. .143 Chapter7: LocalTestingofReed-MullerCodes . ......147 7.1 Introduction..................................147 7.1.1 ConnectiontoCodingTheory . 148 7.1.2 OverviewofOurResults . .148 7.1.3 OverviewoftheAnalysis. .150 iii 7.2 Preliminaries .................................151 7.2.1 FactsfromFiniteFields . .154 7.3 Characterization of Low Degree Polynomials over Fp ............156 Fn 7.4 A Tester for Low Degree Polynomials over p ................162 7.4.1 Testerin Fp ..............................163 7.4.2 Analysis of Algorithm Test- ....................164 Pt 7.4.3 ProofofLemma7.11. .167 7.4.4 ProofofLemma7.12. .173 7.4.5 ProofofLemma7.13. .178 7.5 ALowerBoundandImprovedSelf-correction . .179 7.5.1 ALowerBound............................179 7.5.2 ImprovedSelf-correction. 180 7.6 BibliographicsNotes . .. .. .. .. .. .. .. .. 182 Chapter8: TolerantLocallyTestableCodes . ......185 8.1 Introduction..................................185 8.2 Preliminaries .................................186 8.3 GeneralObservations . .. .. .. .. .. .. .. .. 188 8.4 TolerantTestersforBinaryCodes . 191 8.4.1 PCPofProximity. .. .. .. .. .. .. .. .191 8.4.2 TheCode ...............................195 8.5 ProductofCodes ...............................200 8.5.1 TolerantTestersforTensorProductsofCodes . .200 8.5.2 RobustTestabilityofProductofCodes . 202 8.6 TolerantTestingofReed-MullerCodes. .205 8.6.1 BivariatePolynomialCodes . 205 8.6.2 GeneralReed-MullerCodes . 207 8.7 BibliographicNotesandOpenQuestions. .209 Chapter9: ConcludingRemarks . 211 9.1 SummaryofContributions . 211 9.2
Recommended publications
  • Ashwin Nayak Project Suggestions
    C&O 739-2 Information Theory and Applications University of Waterloo, Winter 2020 Instructor: Ashwin Nayak Project Suggestions The following is a partial list of research articles related to the theme of the course. For related papers and other ideas, please talk to me. Please confirm the topic and the scope of the project with me. Applications of entropic quantities. • Sorting a partially ordered sequence: Jeff Kahn and Jeong Han Kim. Entropy and sorting. Journal of Computer and System Sciences, 51(3):390{399, December 1995. • Monotone functions: Jeff Kahn. Entropy, independent sets and antichains: a new approach to Dedekind's problem. Proceedings of the American Mathematical Society, 130(2):371{378, 2001. • Predecessor-search: Pranab Sen and Srinivasan Venkatesh. Lower bounds for predecessor searching in the cell probe model. Journal of Computer and System Sciences, 74(3):364{385, 2008. • More data-structures: Mihai Patrascu and Erik Demaine. Logarithmic Lower Bounds in the Cell- Probe Model. SIAM Journal on Computing, 35(4):932{963, 2006. • Set Disjointness and extended formulations: Mark Braverman and Ankur Moitra. An information complexity approach to extended formulations. In Proceedings of the forty-fifth Annual ACM Sym- posium on Theory of Computing, pp. 161{170, 2013. • Explicit polytopes with high extension complexity: Mika G¨o¨os,Rahul Jain, and Thomas Watson. Extension Complexity of Independent Set Polytopes. SIAM Journal on Computing, 47(1):241{269, 2018. Compression of interactive protocols. • Compression, direct sum: Boaz Barak, Mark Braverman, Xi Chen, Anup Rao. How to Compress Interactive Communication. SIAM Journal on Computing, 42(3):1327{1363, 2013.
    [Show full text]
  • The Presburger Award for Young Scientists 2020
    The Presburger Award for Young Scientists 2020 Call for Nominations Deadline: 15 February 2020 Starting in 2010, the European Association for Theoretical Computer Science (EATCS) established the Presburger Award. The Award is conferred annually at the International Colloquium on Automata, Languages and Programming (ICALP) to a young scientist (in exceptional cases to several young scientists) for outstand- ing contributions in theoretical computer science, documented by a published pa- per or a series of published papers. The Award is named after Moj˙zeszPresburger who accomplished his path- breaking work on decidability of the theory of addition (which today is called Presburger arithmetic) as a student in 1929. Nominations for the Presburger Award can be submitted by any member or group of members of the theoretical computer science community except the nom- inee and his/her advisors for the master thesis and the doctoral dissertation. Nom- inated scientists have to be at most 35 years at the time of the deadline of nomi- nation (i.e., for the Presburger Award of 2020 the date of birth should be in 1984 or later). The Presburger Award Committee of 2020 consists of Thore Husfeldt (Lund University and IT University of Copenhagen), Meena Mahajan (Chennai Mathematical Institute) and Anca Muscholl (LaBRI, Bordeaux, chair). Nomina- tions, consisting of a two page justification and (links to) the respective papers, as well as additional supporting letters, should be sent by e-mail to: [email protected] The subject line of every nomination should start with Presburger Award 2020, and the message must be received before February 15th, 2020.
    [Show full text]
  • Curriculum Vitae Venkatesan Guruswami
    Curriculum Vitae Venkatesan Guruswami May 2021 Webpage: http://www.cs.cmu.edu/~venkatg Email: [email protected] Contents 1 Education 2 2 Employment 2 3 Research Interests 2 4 Awards and Honors 3 5 Professional activities 3 6 Graduate student and postdoc supervision5 7 Selected invited talks/lecture series7 8 Publications 10 8.1 Books................................................. 10 8.2 Journal Publications......................................... 10 8.3 Refereed conference publications.................................. 15 8.4 Invited papers and surveys..................................... 25 1 Education Massachusetts Institute of Technology, Cambridge, MA. Ph.D., Computer Science August 2001 Dissertation: List Decoding of Error-Correcting Codes Winner, ACM Doctoral Dissertation Award, 2002. Advisor: Professor Madhu Sudan Master of Science, Computer Science May 1999 Thesis: Query-efficient Checking of Proofs and Improved PCP Characterizations of NP. Indian Institute of Technology, Madras (Chennai, India) Bachelor of Technology (B.Tech), Computer Science and Engineering June 1997 Postdoctoral fellowship Miller Research Fellow Sept 2001 - Aug 2002 Miller Institute for Basic Research in Science University of California, Berkeley, CA. 2 Employment Computer Science Department Carnegie Mellon University, Pittsburgh, PA. • Director of Ph.D. program June 2019 - present • Professor July 2014 - present • Associate Professor (tenured) July 2009 - June 2014 • Visiting Associate Professor Sept 2008 - June 2009 Visiting Researcher March-May 2018 Center for Mathematical Sciences and Applications, Harvard University. Visiitng Professor July 2017-Feb 2018 School of Physical & Mathematical Sciences, Nanyang Technological University, Singapore. Visiting Researcher January-June 2014 Microsoft Research New England. Member, School of Mathematics Sept 2007 - May 2008 Institute for Advanced Study, Princeton, NJ. Department of Computer Science and Engineering University of Washington, Seattle, WA.
    [Show full text]
  • Mathematics People
    NEWS Mathematics People Union. She is also a member of the board of the College Toro Awarded Assistant Migrant Program (CAMP) at the University of Washington. This program is federally funded through Blackwell–Tapia Prize the US Department of Education’s Office of Migrant Edu- Tatiana Toro of the University cation. It is designed to outreach to and support students of Washington, Seattle, has been from migrant and seasonal farmworker families during awarded the 2020 Blackwell–Tapia their first year in college. Inspired by the CAMP students, Prize. The prize honors excellence Toro spearheaded an effort to launch the first Latinx in in research among people who have the Mathematical Sciences Conference (LATMATH). This promoted diversity within the math- conference took place at IPAM in April 2015. Participants ematical and statistical sciences. included high school students, undergraduate students, The prize citation reads in part: graduate students, postdoctoral scholars and faculty, and “Toro is an analyst whose work lies researchers in industry and government. In 2018 she co- Tatiana Toro at the interface of geometric measure organized the second Latinx in the Mathematical Sciences theory, harmonic analysis, and par- Conference funded through the Mathematical Sciences tial differential equations. Her work focuses on understand- Institutes Diversity initiative. This conference attracted over ing mathematical questions that arise in an environment 250 participants.” where the known data are very rough. The main premise Toro was born in Bogotá, Colombia, and received her of her work is that under the right lens, objects, which at PhD from Stanford University in 1992 under the direction first glance might appear to be very irregular, do exhibit of Leon Simon.
    [Show full text]
  • Curriculum Vitae, Mahdi Cheraghchi
    September 2021 Page 1 of 12 MAHDI CHERAGHCHI Curriculum Vitae Mailing address: 2260 Hayward St., Room 3603 Phone: +1-734-763-9165 Department of EECS Fax: +1-734-763-1260 University of Michigan Email: [email protected] Ann Arbor, MI 48109 Web: http://mahdi.ch USA ORCID: 0000-0001-8957-0306 ResearcherID: N-1367-2015 Main research Interests • Interconnections between theoretical computer science and information and coding theory, • Sparse recovery (e.g., compressed sensing, sparse Fourier transforms, combinatorial group testing), high dimensional geometry and their applications to algorithms for massive data, • Information-theoretic privacy and cryptography, • Explicit construction of combinatorial objects and derandomization theory. Education • Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland. (November 2005 – July 2010) Ph.D. in Computer Science. Dissertation Title: Applications of Derandomization Theory in Coding. Supervisor: Amin Shokrollahi, Professor. • Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland. (October 2004 – July 2005) M.Sc. in Computer Science. Dissertation Title: Locally Testable Codes. (available online in ECCC thesis archive.) Supervisor: Amin Shokrollahi, Professor. • Sharif University of Technology, Tehran, Iran. (September 2000 – July 2004) B.Sc. in Software Engineering and B.Sc. in Computer Hardware Engineering. Work Experience / Affiliations • (September 2019–present) University of Michigan, Ann Arbor: Assistant Professor of Computer Science and Engineering (CSE), Electrical Engineering
    [Show full text]
  • Lipics-CCC-2021-0.Pdf (0.5
    36th Computational Complexity Conference CCC 2021, July 20–23, 2021, Toronto, Ontario, Canada (Virtual Conference) Edited by Valentine Kabanets L I P I c s – Vo l . 200 – CCC 2021 w w w . d a g s t u h l . d e / l i p i c s Editors Valentine Kabanets School of Computing Science Simon Fraser University Burnaby, BC Canada [email protected] ACM Classification 2012 Theory of computation ISBN 978-3-95977-193-1 Published online and open access by Schloss Dagstuhl – Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing, Saarbrücken/Wadern, Germany. Online available at https://www.dagstuhl.de/dagpub/978-3-95977-193-1. Publication date July, 2021 Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at https://portal.dnb.de. License This work is licensed under a Creative Commons Attribution 4.0 International license (CC-BY 4.0): https://creativecommons.org/licenses/by/4.0/legalcode. In brief, this license authorizes each and everybody to share (to copy, distribute and transmit) the work under the following conditions, without impairing or restricting the authors’ moral rights: Attribution: The work must be attributed to its authors. The copyright is retained by the corresponding authors. Digital Object Identifier: 10.4230/LIPIcs.CCC.2021.0 ISBN 978-3-95977-193-1 ISSN 1868-8969 https://www.dagstuhl.de/lipics 0:iii LIPIcs – Leibniz International Proceedings in Informatics LIPIcs is a series of high-quality conference proceedings across all fields in informatics.
    [Show full text]
  • The Presburger Award for Young Scientists 2021
    The Presburger Award for Young Scientists 2021 Call for Nominations Deadline: 15 February 2021 The Presburger Award recognises outstanding contributions by a young scientist in theoretical computer science, documented by a published paper or a series of published papers. It is named after Mojzesz Presburger who accomplished his ground-breaking work on decidability of the theory of addition (known today as Presburger arithmetic) as a student in 1929. The award is conferred annually by the European Association for Theoretical Computer Science (EATCS) at the International Colloquium on Automata, Languages, and Programming (ICALP). Nominated scientists can be at most 35 years old on January 1st of the year of the award. Thus, for the 2021 award, the nominee should be born in 1985 or later. Nominations for the Presburger Award can be submitted by any member or group of members of the theoretical computer science community, but not by the nom- inee themselves nor the advisors for their master’s thesis or doctoral dissertation. The Presburger Award Committee of 2021 consists of Thore Husfeldt (Lund University and IT University of Copenhagen, chair), Meena Mahajan (The Insti- tute of Mathematical Sciences, Chennai) and Mikołaj Bojanczyk,´ (University of Warsaw). Nominations, consisting of a two page justification and (links to) the re- spective papers, as well as additional supporting letters, should be sent by e-mail to: [email protected] The subject line of every nomination should start with Presburger Award 2021, and the message must be received before February 15th, 2021. The award includes an amount of 1000 Euro and an invitation to ICALP 2021 for a lecture.
    [Show full text]
  • Accepted Papers Paper Titles and Author Information Appears As Submitted to Easy Chair
    SODA17 – List of Accepted Papers Paper titles and author information appears as submitted to Easy Chair. Paper title and author changes will not be made to this document. The online program will reflect the most up-to-date presentation details, and is scheduled for posting in November (http://www.siam.org/meetings/da17/program.php). Minimum Fill-In: Inapproximability and Almost Tight Lower Bounds Yixin Cao and R. B. Sandeep An O(nm) time algorithm for finding the min length directed cycle in a weighted graph James Orlin Random cluster dynamics for the Ising model is rapidly mixing Heng Guo and Mark Jerrum A constant-time algorithm for middle levels Gray codes Jerri Nummenpalo and Torsten Mütze Maximum Scatter TSP in Doubling Metrics László Kozma and Tobias Mömke Improved bounds and parallel algorithms for the Lovasz Local Lemma David Harris and Bernhard Haeupler Deterministic parallel algorithms for fooling polylogarithmic juntas and the Lov\'{a}sz Local Lemma David Harris A Logarithmic Additive Integrality Gap for Bin Packing Rebecca Hoberg and Thomas Rothvoss Firefighting on Trees Beyond Integrality Gaps David Adjiashvili, Andrea Baggio and Rico Zenklusen Three Colors Suffice: Conflict-Free Coloring of Planar Graphs Zachary Abel, Victor Alvarez, Erik D. Demaine, Sándor Fekete, Aman Gour, Adam Hesterberg, Phillip Keldenich and Christian Scheffer Counting matchings in irregular bipartite graphs and random lifts Marc Lelarge Geodesic Spanners for Points on a Polyhedral Terrain Mohammad Ali Abam, Mark de Berg and Mohammad Javad Rezaei seraji
    [Show full text]
  • The Presburger Award 2021 Laudatio for Shayan Oveis Gharan
    The Presburger Award 2021 Laudatio for Shayan Oveis Gharan The 2021 Presburger Committee has unanimously selected Shayan Oveis Gharan as the recipient of the 2021 EATCS Presburger Award for Young Scientists for his creative, profound, and ambitious contributions to the Traveling Salesman problem. Traveling Salesman is a drosophilia of theoretical computer science: immedi- ately accessible, intellectually appealing, computationally difficult, and extremely well-studied. It serves as a milestone and showcase for progress in our scientific understanding of the design and analysis of algorithms. Oveis Gharan’s work began during his PhD thesis, in a paper that presents an asymptotically sublogarithmic approximation guarantee for the Traveling Sales- man problem for the general (or, “asymmetric”) case. The primary tool is the neg- ative dependence between the presence of edges in a certain distribution on ran- dom spanning trees of a graph, a theme that has matured and expanded throughout much of the recipient’s work. In a remarkable tour de force, Oveis Gharan and his coauthors followed this up with a (3=2 − ")-approximation for the symmetric graphic case, and later also for the metric case. These results rely on a deeper understanding of negative dependence through the lens of strongly Rayleigh mea- sures and real-stable polynomials, the polyhedral structure of the Held–Karp poly- tope, and the combinatorics of near-minimum cuts in a graph. Besides his work on TSP, Oveis Gharan exhibits an amazingly consistent abil- ity to make progress on many other problems that had remained open for decades. For example, a Cheeger inequality proved by Oveis Gharan and co-authors was the main technical tool in Miclo’s proof of a 40-year old conjecture of Simon and Høegh-Krohn in the field of mathematical physics.
    [Show full text]
  • Correlation Clustering with a Fixed Number of Clusters∗
    THEORY OF COMPUTING, Volume 2 (2006), pp. 249–266 http://theoryofcomputing.org Correlation Clustering with a Fixed Number of Clusters∗ Ioannis Giotis Venkatesan Guruswami† Received: October 18, 2005; published: October 22, 2006. Abstract: We continue the investigation of problems concerning correlation clustering or clustering with qualitative information, which is a clustering formulation that has been studied recently (Bansal, Blum, Chawla (2004), Charikar, Guruswami, Wirth (FOCS’03), Charikar, Wirth (FOCS’04), Alon et al. (STOC’05)). In this problem, we are given a complete graph on n nodes (which correspond to nodes to be clustered) whose edges are labeled + (for similar pairs of items) and − (for dissimilar pairs of items). Thus our input consists of only qualitative information on similarity and no quantitative distance measure between items. The quality of a clustering is measured in terms of its number of agreements, which is simply the number of edges it correctly classifies, that is the sum of number of − edges whose endpoints it places in different clusters plus the number of + edges both of whose endpoints it places within the same cluster. In this paper, we study the problem of finding clusterings that maximize the number of agreements, and the complementary minimization version where we seek clusterings that min- imize the number of disagreements. We focus on the situation when the number of clusters is stipulated to be a small constant k. Our main result is that for every k, there is a polynomial time approximation scheme for both maximizing agreements and minimizing disagreements. ∗A preliminary version of this paper appeared in the Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms, January 2006.
    [Show full text]
  • Curriculum Vitae, Mahdi Cheraghchi
    July 2015 Page 1 of 7 MAHDI CHERAGHCHI Curriculum Vitae Mailing address: Department of Computing Citizenship: Iran Imperial College London USA Permanent Resident Huxley Building, Room 450 180 Queen’s Gate London SW7 2RH, UK Phone: +44(0)20 7594 8216 Fax: +44(0)20 7594 8282 Email: [email protected] Web: http://cheraghchi.info Research Interests In a broad sense: Theoretical Computer Science, Coding and Information Theory, Signal Pro- cessing. In particular: • Interconnections between electrical engineering and theoretical computer science, • Sparse recovery (e.g., compressive sensing and combinatorial group testing) and high- dimensional geometry, • Information-theoretic privacy and security, • Derandomization, pseudorandomness and explicit construction of combinatorial objects, • Probabilistically Checkable Proofs, hardness of approximation and their connections with Boolean analysis. Education • Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland. (November 2005 – July 2010) Ph.D. in Computer Science. Dissertation Title: Applications of Derandomization Theory in Coding. Supervisor: Amin Shokrollahi, Professor. • Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland. (October 2004 – July 2005) M.Sc. in Computer Science. Dissertation Title: Locally Testable Codes. (available online in ECCC thesis archive.) Supervisor: Amin Shokrollahi, Professor. GPA: 5.94 / 6:00 . • Sharif University of Technology, Tehran, Iran. (September 2000 – July 2004) B.Sc. in Software Engineering and B.Sc. in Computer Hardware Engineering. B.Sc. Dissertation: Human Face Localization in Still Color Images. Dissertation Advisor: Mansour Jamzad, Associate Professor. GPA: 19.06 / 20.00 — Ranked First by the Education Bureau. MAHDI CHERAGHCHI Page 2 of 7 Work Experience / Affiliations • (July 2015–present) Imperial College London: Lecturer (equivalent US term: Assistant Professor), Department of Com- puting.
    [Show full text]