A Discriminative Framework for Clustering Via Similarity Functions

Total Pages: 16

File Type: pdf, Size: 1020 KB

A Discriminative Framework for Clustering via Similarity Functions

Maria-Florina Balcan∗, Dept. of Computer Science, Carnegie Mellon University, Pittsburgh, PA. [email protected]
Avrim Blum∗, Dept. of Computer Science, Carnegie Mellon University, Pittsburgh, PA. [email protected]
Santosh Vempala†, College of Computing, Georgia Inst. of Technology, Atlanta, GA. [email protected]

ABSTRACT

Problems of clustering data from pairwise similarity information are ubiquitous in Computer Science. Theoretical treatments typically view the similarity information as ground-truth and then design algorithms to (approximately) optimize various graph-based objective functions. However, in most applications, this similarity information is merely based on some heuristic; the ground truth is really the unknown correct clustering of the data points and the real goal is to achieve low error on the data. In this work, we develop a theoretical approach to clustering from this perspective. In particular, motivated by recent work in learning theory that asks "what natural properties of a similarity (or kernel) function are sufficient to be able to learn well?" we ask "what natural properties of a similarity function are sufficient to be able to cluster well?"

To study this question we develop a theoretical framework that can be viewed as an analog of the PAC learning model for clustering, where the object of study, rather than being a concept class, is a class of (concept, similarity function) pairs, or equivalently, a property the similarity function should satisfy with respect to the ground truth clustering. We then analyze both algorithmic and information theoretic issues in our model. While quite strong properties are needed if the goal is to produce a single approximately-correct clustering, we find that a number of reasonable properties are sufficient under two natural relaxations: (a) list clustering: analogous to the notion of list-decoding, the algorithm can produce a small list of clusterings (which a user can select from) and (b) hierarchical clustering: the algorithm's goal is to produce a hierarchy such that the desired clustering is some pruning of this tree (which a user could navigate). We develop a notion of the clustering complexity of a given property (analogous to notions of capacity in learning theory), which characterizes its information-theoretic usefulness for clustering. We analyze this quantity for several natural game-theoretic and learning-theoretic properties, as well as design new efficient algorithms that are able to take advantage of them. Our algorithms for hierarchical clustering combine recent learning-theoretic approaches with linkage-style methods. We also show how our algorithms can be extended to the inductive case, i.e., using just a constant-sized sample, as in property testing. The analysis here uses regularity-type results of [20] and [3].

Categories and Subject Descriptors: F.2.0 [Analysis of Algorithms and Problem Complexity]: General
General Terms: Algorithms, Theory
Keywords: Clustering, Similarity Functions, Learning.

∗Supported in part by the National Science Foundation under grant CCF-0514922, and by a Google Research Grant.
†Supported in part by the National Science Foundation under grant CCF-0721503, and by a Raytheon fellowship.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. STOC'08, May 17–20, 2008, Victoria, British Columbia, Canada. Copyright 2008 ACM 978-1-60558-047-0/08/05 ...$5.00.

1. INTRODUCTION

Clustering is an important problem in the analysis and exploration of data. It has a wide range of applications in data mining, computer vision and graphics, and gene analysis. It has many variants and formulations and it has been extensively studied in many different communities.

In the Algorithms literature, clustering is typically studied by posing some objective function, such as k-median, min-sum or k-means, and then developing algorithms for approximately optimizing this objective given a data set represented as a weighted graph [12, 24, 22]. That is, the graph is viewed as "ground truth" and then the goal is to design algorithms to optimize various objectives over this graph. However, for most clustering problems such as clustering documents by topic or clustering web-search results by category, ground truth is really the unknown true topic or true category of each object. The construction of the weighted graph is just done using some heuristic: e.g., cosine-similarity for clustering documents or a Smith-Waterman score in computational biology. In all these settings, the goal is really to produce a clustering that gets the data correct.

Alternatively, methods developed both in the algorithms and in the machine learning literature for learning mixtures of distributions [1, 5, 23, 36, 17, 15] explicitly have a notion of ground-truth clusters which they aim to recover. However, such methods are based on very strong assumptions: they require an embedding of the objects into R^n such that the clusters can be viewed as distributions with very specific properties (e.g., Gaussian or log-concave). In many real-world situations (e.g., clustering web-search results by topic, where different users might have different notions of what a "topic" is) we can only expect a domain expert to provide a notion of similarity between objects that is related in some reasonable ways to the desired clustering goal, and not necessarily an embedding with such strong properties.

In this work, we develop a theoretical study of the clustering problem from this perspective. In particular, motivated by work on learning with kernel functions that asks "what natural properties of a given kernel (or similarity) function K are sufficient to allow one to learn well?" [6, 7, 31, 29, 21] we ask the question "what natural properties of a pairwise similarity function are sufficient to allow one to cluster well?" To study this question we develop a theoretical framework which can be thought of as a discriminative (PAC style) model for clustering, though the basic object of study, rather than a concept class, is a property of the similarity function K in relation to the target concept, or equivalently a set of (concept, similarity function) pairs.

The main difficulty that appears when phrasing the problem in this general way is that if one defines success as outputting a single clustering that closely approximates the correct clustering, then one needs to assume very strong conditions on the similarity function. For example, if the function K provided by our expert is extremely good, say K(x,y) > 1/2 for all pairs x and y that should be in the same cluster, and K(x,y) < 1/2 for all pairs x and y that should be in different clusters, then we could just use it to recover the clusters in a trivial way.¹ However, if we just slightly weaken this condition to simply require that all points x are more similar to all points y from their own cluster than to any points y′ from any other clusters, then this is no longer sufficient to uniquely identify even a good approximation to the correct answer. For instance, in the example in Figure 1, there are multiple clusterings consistent with this property. Even if one is told the correct clustering has 3 clusters, there is no way for an algorithm to tell which of the two (very different) possible solutions is correct. In fact, results of Kleinberg [25] can be viewed as effectively ruling out a broad class of scale-invariant properties such as this one as being sufficient for producing the correct answer.

Figure 1: Data lies in four regions A, B, C, D (e.g., think of these as documents on baseball, football, TCS, and AI). Suppose that K(x,y) = 1 if x and y belong to the same region, K(x,y) = 1/2 if x ∈ A and y ∈ B or if x ∈ C and y ∈ D, and K(x,y) = 0 otherwise. Even assuming that all points are more similar to other points in their own cluster than to any point in any other cluster, there are still multiple consistent clusterings, including two consistent 3-clusterings ((A ∪ B, C, D) or (A, B, C ∪ D)). However, there is a single hierarchical decomposition such that any consistent clustering is a pruning of this tree.

In our work we overcome this problem by considering two relaxations of the clustering objective that are natural for many clustering applications. The first, as in list-decoding, is to allow the algorithm to produce a small list of clusterings such that at least one of them has low error.

1.1 Perspective

The standard approach in theoretical computer science to clustering is to choose some objective function (e.g., k-median) and then to develop algorithms that approximately optimize that objective [12, 24, 22, 18]. If the true goal is to achieve low error with respect to an underlying correct clustering (e.g., a user's desired clustering of search results by topic), however, then one can view this as implicitly making the strong assumption that not only does the correct clustering have a good objective value, but also that all clusterings that approximately optimize the objective must be close to the correct clustering as well. In this work, we instead explicitly
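For reference, the three objectives named in the introduction and in Section 1.1 above have standard textbook forms; the notation below (metric d, point set S, centers c_i, clusters C_j) is our summary, not the paper's:

```latex
% Standard textbook formulations of the three objectives (notation ours):
\text{k-median:}\quad \min_{c_1,\dots,c_k}\ \sum_{x\in S}\ \min_i\, d(x,c_i)
\qquad
\text{k-means:}\quad \min_{c_1,\dots,c_k}\ \sum_{x\in S}\ \min_i\, d(x,c_i)^2
\qquad
\text{min-sum:}\quad \min_{C_1,\dots,C_k}\ \sum_{j=1}^{k}\ \sum_{x,y\in C_j} d(x,y)
```

The ambiguity the two relaxations address can also be seen by direct computation. The sketch below (Python with numpy; the region sizes are our own arbitrary toy choice, while the similarity values are those of the Figure 1 caption) builds the Figure 1 similarity, recovers the clusters trivially under the strong K > 1/2 condition via connected components, and then traces a single-linkage merge sequence in which both consistent 3-clusterings appear as prunings of one tree:

```python
import numpy as np
from itertools import combinations

# Toy instance of the Figure 1 setup (region sizes are an arbitrary choice;
# the similarity values are the ones given in the caption).
regions = ['A'] * 3 + ['B'] * 3 + ['C'] * 3 + ['D'] * 3
n = len(regions)

def K(r, s):
    # K = 1 within a region, 1/2 for the A-B and C-D pairs, 0 otherwise.
    if r == s:
        return 1.0
    if {r, s} in ({'A', 'B'}, {'C', 'D'}):
        return 0.5
    return 0.0

S = np.array([[K(regions[i], regions[j]) for j in range(n)] for i in range(n)])

# Strong condition (K > 1/2 inside clusters, K < 1/2 across): the clusters
# are recovered trivially as connected components of {(i, j) : K(i, j) > 1/2}.
def components(adj):
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(v for v in range(n) if adj[u][v])
        seen |= comp
        comps.append(sorted(comp))
    return comps

print(components(S > 0.5))  # four components: the regions A, B, C, D

# Weaker condition (each point more similar to its own cluster than to any
# other cluster): single linkage builds one hierarchy; both consistent
# 3-clusterings (A u B, C, D) and (A, B, C u D) appear as prunings of it.
clusters = [{i} for i in range(n)]
while len(clusters) > 1:
    a, b = max(combinations(range(len(clusters)), 2),
               key=lambda p: max(S[i, j] for i in clusters[p[0]]
                                         for j in clusters[p[1]]))
    clusters[a] |= clusters.pop(b)           # b > a, so pop(b) is safe
    print([''.join(sorted({regions[i] for i in c})) for c in clusters])
```

Note that if the target clustering were (A ∪ B, C, D), the strong condition would fail (the A–B pairs sit exactly at 1/2) while the weak condition still holds, so component recovery returns the wrong, finer partition; the printed hierarchy, by contrast, contains every consistent answer as a pruning.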
Recommended publications
  • Avrim Blum, John Hopcroft, Ravindran Kannan
Foundations of Data Science

This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modeling and nonnegative matrix factorization, wavelets, and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.

Avrim Blum is Chief Academic Officer at the Toyota Technological Institute at Chicago and formerly Professor at Carnegie Mellon University. He has over 25,000 citations for his work in algorithms and machine learning. He has received the AI Journal Classic Paper Award, ICML/COLT 10-Year Best Paper Award, Sloan Fellowship, NSF NYI award, and Herb Simon Teaching Award, and is a fellow of the Association for Computing Machinery.

John Hopcroft is the IBM Professor of Engineering and Applied Mathematics at Cornell University. He is a member of the National Academy of Sciences and the National Academy of Engineering, and a foreign member of the Chinese Academy of Sciences.
  • A Learning Theoretic Framework for Clustering with Similarity Functions
A Learning Theoretic Framework for Clustering with Similarity Functions
Maria-Florina Balcan∗, Avrim Blum∗, Santosh Vempala†
November 23, 2007

Abstract: Problems of clustering data from pairwise similarity information are ubiquitous in Computer Science. Theoretical treatments typically view the similarity information as ground-truth and then design algorithms to (approximately) optimize various graph-based objective functions. However, in most applications, this similarity information is merely based on some heuristic; the ground truth is really the (unknown) correct clustering of the data points and the real goal is to achieve low error on the data. In this work, we develop a theoretical approach to clustering from this perspective. In particular, motivated by recent work in learning theory that asks "what natural properties of a similarity (or kernel) function are sufficient to be able to learn well?" we ask "what natural properties of a similarity function are sufficient to be able to cluster well?" To study this question we develop a theoretical framework that can be viewed as an analog of the PAC learning model for clustering, where the object of study, rather than being a concept class, is a class of (concept, similarity function) pairs, or equivalently, a property the similarity function should satisfy with respect to the ground truth clustering. We then analyze both algorithmic and information theoretic issues in our model. While quite strong properties are needed if the goal is to produce a single approximately-correct clustering, we find that a number of reasonable properties are sufficient under two natural relaxations: (a) list clustering: analogous to the notion of list-decoding, the algorithm can produce a small list of clusterings (which a user can select from) and (b) hierarchical clustering: the algorithm's goal is to produce a hierarchy such that the desired clustering is some pruning of this tree (which a user could navigate).
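To make the kind of property this abstract studies concrete, here is a minimal sketch (Python with numpy) of a test for the weaker condition discussed in the main text above: every point is more similar to all points of its own cluster than to any point of another cluster. The function name `strictly_separated` and its interface are our own illustrative choices, not an API from the paper.

```python
import numpy as np

def strictly_separated(S, labels):
    """Test whether, under similarity matrix S (n x n) and clustering
    `labels` (length n), every point is strictly more similar to every
    point of its own cluster than to any point of any other cluster.
    (Illustrative helper; name and interface are ours, not the paper's.)"""
    labels = np.asarray(labels)
    for i in range(len(labels)):
        own = labels == labels[i]
        own[i] = False                    # exclude self-similarity
        other = labels != labels[i]
        if not own.any() or not other.any():
            continue                      # nothing to compare for point i
        if S[i, own].min() <= S[i, other].max():
            return False
    return True

# On the Figure 1 similarity S from the sketch in the main text above,
# both the 4-way ground truth and a merged 3-way clustering pass:
#   strictly_separated(S, ['A']*3 + ['B']*3 + ['C']*3 + ['D']*3)  -> True
#   strictly_separated(S, ['AB']*6 + ['C']*3 + ['D']*3)           -> True
```

That both clusterings pass is exactly the ambiguity that the list and tree relaxations in the abstract are designed to handle.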
  • IARCS, the Indian Association for Research in Computing Science
IARCS, the Indian Association for Research in Computing Science, announces the 38th Foundations of Software Technology and Theoretical Computer Science (FSTTCS) conference at Ahmedabad University, Ahmedabad, Gujarat. The FSTTCS conference is a forum for presenting original results in foundational aspects of Computer Science and Software Technology. Representative areas include, but are not limited to, the following.
● Algorithmic Algebra
● Algorithms and Data Structures
● Algorithmic Graph Theory and Combinatorics
● Approximation Algorithms
● Automata, Games and Formal Languages
● Combinatorial Optimization
● Communication Complexity
● Computational Biology
● Computational Complexity
● Computational Geometry
● Computational Learning Theory
● Cryptography and Security
● Data Streaming and Sublinear Algorithms
● Game Theory and Mechanism Design
● Logic in Computer Science
● Machine Learning
● Model Theory, Modal and Temporal Logics
● Models of Concurrent and Distributed Systems
● Models of Timed, Reactive, Hybrid and Stochastic Systems
● Parallel, Distributed and Online Algorithms
● Parameterized Complexity
● Principles and Semantics of Programming Languages
● Program Analysis and Transformation
● Proof Complexity
● Quantum Computing
● Randomness in Computing
● Specification, Verification, and Synthesis
● Theorem Proving, Decision Procedures, Model Checking and Reactive Synthesis
● Theoretical Aspects of Mobile and High-Performance Computing
  • Curriculum Vitae
Curriculum Vitae: David P. Woodruff

Biographical: Computer Science Department, Gates-Hillman Complex, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213. Citizenship: United States. Email: [email protected]. Home Page: http://www.cs.cmu.edu/~dwoodruf/

Research Interests: Compressed Sensing, Data Stream Algorithms and Lower Bounds, Dimensionality Reduction, Distributed Computation, Machine Learning, Numerical Linear Algebra, Optimization

Education: All degrees received from Massachusetts Institute of Technology, Cambridge, MA.
Ph.D. in Computer Science, September 2007. Research Advisor: Piotr Indyk. Thesis Title: Efficient and Private Distance Approximation in the Communication and Streaming Models.
Master of Engineering in Computer Science and Electrical Engineering, May 2002. Research Advisor: Ron Rivest. Thesis Title: Cryptography in an Unbounded Computational Model.
Bachelor of Science in Computer Science and Electrical Engineering, May 2002. Bachelor of Pure Mathematics, May 2002.

Professional Experience:
August 2018 - present, Carnegie Mellon University, Computer Science Department, Associate Professor (with tenure).
August 2017 - present, Carnegie Mellon University, Computer Science Department, Associate Professor.
June 2018 - December 2018, Google, Mountain View, Research Division, Visiting Faculty Program.
August 2007 - August 2017, IBM Almaden Research Center, Principles and Methodologies Group, Research Scientist.
Aug 2005 - Aug 2006, Tsinghua University, Institute for Theoretical Computer Science, Visiting Scholar. Host: Andrew Yao.
Jun - Jul
  • Visions in Theoretical Computer Science
VISIONS IN THEORETICAL COMPUTER SCIENCE
A Report on the TCS Visioning Workshop 2020

The material is based upon work supported by the National Science Foundation under Grants No. 1136993 and No. 1734706. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Edited by: Shuchi Chawla (University of Wisconsin-Madison), Jelani Nelson (University of California, Berkeley), Chris Umans (California Institute of Technology), David Woodruff (Carnegie Mellon University)

Table of Contents: Foreword; How this Document came About; Models of Computation; Computational Complexity; Sublinear Algorithms
  • Caltech/Mit Voting Technology Project
CALTECH/MIT VOTING TECHNOLOGY PROJECT
A multi-disciplinary, collaborative project of the California Institute of Technology (Pasadena, California 91125) and the Massachusetts Institute of Technology (Cambridge, Massachusetts 02139)

ADVANCES IN CRYPTOGRAPHIC VOTING SYSTEMS
Ben Adida, MIT
Key words: voting systems, cryptographic, election administration, secret-ballot elections
VTP WORKING PAPER #51, September 2006

Advances in Cryptographic Voting Systems, by Ben Adida. Submitted to the Department of Electrical Engineering and Computer Science on August 31st, 2006, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science at the Massachusetts Institute of Technology. © Massachusetts Institute of Technology 2006. All rights reserved.
Thesis Supervisor: Ronald L. Rivest, Viterbi Professor of Electrical Engineering and Computer Science. Accepted by: Arthur C. Smith, Chairman, Department Committee on Graduate Students.

Abstract: Democracy depends on the proper administration of popular
  • Random Projection in the Brain and Computation with Assemblies Of
Random Projection in the Brain and Computation with Assemblies of Neurons
Christos H. Papadimitriou (Columbia University, USA, [email protected])
Santosh S. Vempala (Georgia Tech, USA, [email protected])

Abstract: It has been recently shown via simulations [7] that random projection followed by a cap operation (setting to one the k largest elements of a vector and everything else to zero), a map believed to be an important part of the insect olfactory system, has strong locality sensitivity properties. We calculate the asymptotic law whereby the overlap in the input vectors is conserved, verifying mathematically this empirical finding. We then focus on the far more complex homologous operation in the mammalian brain, the creation through successive projections and caps of an assembly (roughly, a set of excitatory neurons representing a memory or concept) in the presence of recurrent synapses and plasticity. After providing a careful definition of assemblies, we prove that the operation of assembly projection converges with high probability, over the randomness of synaptic connectivity, even if plasticity is relatively small (previous proofs relied on high plasticity). We also show that assembly projection has itself some locality preservation properties. Finally, we propose a large repertoire of assembly operations, including associate, merge, reciprocal project, and append, each of them both biologically plausible and consistent with what we know from experiments, and show that this computational system is capable of simulating, again with high probability, arbitrary computation in a quite natural way.
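The projection-and-cap map this abstract describes is easy to simulate. The following sketch (Python with numpy) shows caps of overlapping inputs overlapping far above chance; the Bernoulli connectivity, dimensions, densities, and overlap level are all our own arbitrary choices, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def project_and_cap(x, W, k):
    """Random projection followed by a k-cap: project x through W, set the
    k largest coordinates of the result to 1, everything else to 0.
    (Minimal sketch of the map described in the abstract above.)"""
    y = W @ x
    out = np.zeros_like(y)
    out[np.argsort(y)[-k:]] = 1.0
    return out

n, m, k = 1000, 1000, 50                          # toy sizes (our choice)
W = (rng.random((m, n)) < 0.05).astype(float)     # sparse random connectivity

# Two sparse binary inputs with 80% overlap; their caps should share far
# more than the ~k^2/m = 2.5 units expected for independent random caps.
base = rng.choice(n, size=100, replace=False)
x1 = np.zeros(n); x1[base] = 1.0
x2 = np.zeros(n); x2[base[:80]] = 1.0
x2[rng.choice(np.setdiff1d(np.arange(n), base), size=20, replace=False)] = 1.0

c1, c2 = project_and_cap(x1, W, k), project_and_cap(x2, W, k)
print("cap overlap:", int((c1 * c2).sum()), "of", k)
```

This is only the feedforward map; the paper's assembly operations additionally involve recurrent synapses and plasticity, which this toy sketch does not model.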
  • Fair and Diverse Data Representation in Machine Learning
FAIR AND DIVERSE DATA REPRESENTATION IN MACHINE LEARNING
A Dissertation Presented to The Academic Faculty by Uthaipon Tantipongpipat, in partial fulfillment of the requirements for the degree Doctor of Philosophy in Algorithms, Combinatorics, and Optimization. Georgia Institute of Technology, August 2020. Copyright © Uthaipon Tantipongpipat 2020.

Approved by: Dr. Mohit Singh, Advisor (School of Industrial and Systems Engineering, Georgia Institute of Technology); Dr. Santosh Vempala (School of Computer Science, Georgia Institute of Technology); Dr. Sebastian Pokutta (Institute of Mathematics, Technical University of Berlin); Dr. Rachel Cummings (School of Industrial and Systems Engineering, Georgia Institute of Technology); Dr. Aleksandar Nikolov (Department of Computer Science, University of Toronto). Date Approved: May 8, 2020.

ACKNOWLEDGEMENTS
This thesis would not be complete without many great collaborators and much support I received. My advisor, Mohit Singh, has been extremely supportive of my exploring my own research interests, while always happy to discuss even the smallest technical detail when I need help. His advising aim has always been what is best for me: my career growth as a researcher and a fulfilling life as a person. He is unquestionably an integral part of my success today. I want to thank Rachel Cummings, Aleksandar (Sasho) Nikolov, Sebastian Pokutta, and Santosh Vempala for the many research discussions we had together and their willingness to serve on my dissertation committee amidst the COVID-19 disruption. I want to thank my other coauthors along my academic path: Digvijay Boob, Sara Krehbiel, Kevin Lai, Vivek Madan, Jamie Morgenstern, Samira Samadi, Amaresh (Ankit) Siva, Chris Waites, and Weijun Xie.
  • Conference Program
Last update: June 11
STOC 2004 Schedule
Chicago Hilton and Towers Hotel, 720 South Michigan Avenue, Chicago

Saturday, June 12, 2004
5:00 - 9:00 PM: Registration desk open
7:00 - 9:00 PM: Welcome Reception, Boulevard Room and Foyer, 2nd floor

Sunday, June 13, 2004
8:00 AM - 5:00 PM: Registration desk open
8:00 - 8:35 AM: Continental breakfast, Continental Ballroom Lobby, 1st floor

8:35 - 10:10 AM: Session 1A, Continental Ballroom A-B. Chair: Oded Regev
8:35 - 8:55 AM: Robust PCPs of Proximity, Shorter PCPs and Applications to Coding. Eli Ben-Sasson (Radcliffe/Harvard), Oded Goldreich (Weizmann), Prahladh Harsha (MIT), Madhu Sudan (MIT), Salil Vadhan (Harvard)
9:00 - 9:20 AM: A New PCP Outer Verifier with Applications to Homogeneous Linear Equations and Max-Bisection. Jonas Holmerin (KTH, Stockholm), Subhash Khot (Georgia Tech)
9:25 - 9:45 AM: Asymmetric k-Center is log∗ n-Hard to Approximate. Julia Chuzhoy (Technion), Sudipto Guha (UPenn), Eran Halperin (Berkeley), Sanjeev Khanna (UPenn), Guy Kortsarz (Rutgers), Joseph (Seffi) Naor (Technion)
9:50 - 10:10 AM: New Hardness Results for Congestion Minimization and Machine Scheduling. Julia Chuzhoy (Technion), Joseph (Seffi) Naor (Technion)

8:35 - 10:10 AM: Session 1B, Continental Ballroom C. Chair: Sandy Irani
8:35 - 8:55 AM: On the Performance of Greedy Algorithms in Packet Buffering. Susanne Albers (Freiburg), Markus Schmidt (Freiburg)
9:00 - 9:20 AM: Adaptive Routing with End-to-End Feedback: Distributed Learning and Geometric Approaches. Baruch Awerbuch (Johns Hopkins), Robert D. Kleinberg (MIT)
9:25 - 9:45 AM: Know Thy Neighbor's Neighbor: The Power of Lookahead in Randomized P2P Networks. Gurmeet Singh Manku (Stanford), Moni Naor (Weizmann), Udi Wieder (Weizmann)
9:50 - 10:10 AM: The Zero-One Principle for Switching Networks. Yossi Azar (Tel Aviv), Yossi Richter (Tel Aviv)

10:10 - 10:35 AM: Coffee break.
  • Manuel Blum's CV
MANUEL BLUM
Bruce Nelson Professor of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213
Telephone: Office (412) 268-3742, Fax (412) 268-5576, Home (412) 687-8730, Mobile (412) 596-4063
Email: [email protected]

Personal
Born: 26 April 1938 in Caracas, Venezuela. Citizenship: Venezuela and USA. Naturalized Citizen of USA, January 2000.
Wife: Lenore Blum, Distinguished Career Professor of Computer Science, CMU.
Son: Avrim Blum, Professor of Computer Science, CMU.
Interests: Computational Complexity; Automata Theory; Algorithms; Inductive Inference; Cryptography; Program Result-Checking; Human Interactive Proofs.

Employment History
Research Assistant and Research Associate for Dr. Warren S. McCulloch, Research Laboratory of Electronics, MIT, 1960-1965.
Assistant Professor, Department of Mathematics, MIT, 1966-68.
Visiting Assistant Professor, Associate Professor, Professor, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, 1968-2001.
Associate Chair for Computer Science, U.C. Berkeley, 1977-1980.
Arthur J. Chick Professor of Computer Science, U.C. Berkeley, 1995-2001.
Group in Logic and Methodology of Science, U.C. Berkeley, 1974-present.
Visiting Professor of Computer Science, City University of Hong Kong, 1997-1999.
Bruce Nelson Professor of Computer Science, Carnegie Mellon University, 2001-present.

Education
B.S., Electrical Engineering, MIT, 1959. M.S., Electrical Engineering, MIT, 1961. Ph.D., Mathematics, MIT, 1964; Professor Marvin Minsky, supervisor.

Honors
MIT Class VIB (Honor Sequence in EE), 1958-61.
Sloan Foundation Fellowship, 1972-73.
U.C. Berkeley Distinguished Teaching Award, 1977.
Fellow of the IEEE "for fundamental contributions to the abstract theory of computational complexity," 1982.
  • Foundations of Data Science∗
Foundations of Data Science∗
Avrim Blum, John Hopcroft and Ravindran Kannan
Thursday 9th June, 2016
∗Copyright 2015. All rights reserved.

Contents
1 Introduction
2 High-Dimensional Space
  2.1 Introduction
  2.2 The Law of Large Numbers
  2.3 The Geometry of High Dimensions
  2.4 Properties of the Unit Ball
    2.4.1 Volume of the Unit Ball
    2.4.2 Most of the Volume is Near the Equator
  2.5 Generating Points Uniformly at Random from a Ball
  2.6 Gaussians in High Dimension
  2.7 Random Projection and Johnson-Lindenstrauss Lemma
  2.8 Separating Gaussians
  2.9 Fitting a Single Spherical Gaussian to Data
  2.10 Bibliographic Notes
  2.11 Exercises
3 Best-Fit Subspaces and Singular Value Decomposition (SVD)
  3.1 Introduction and Overview
  3.2 Preliminaries
  3.3 Singular Vectors
  3.4 Singular Value Decomposition (SVD)
  3.5 Best Rank-k Approximations
  3.6 Left Singular Vectors
  3.7 Power Method for Computing the Singular Value Decomposition
    3.7.1 A Faster Method
  3.8 Singular Vectors and Eigenvectors
  3.9 Applications of Singular Value Decomposition
    3.9.1 Centering Data
    3.9.2 Principal Component Analysis
    3.9.3 Clustering a Mixture of Spherical Gaussians
    3.9.4 Ranking Documents and Web Pages
    3.9.5 An Application of SVD to a Discrete Optimization Problem
  3.10 Bibliographic Notes
  3.11 Exercises
  • Lipics-ITCS-2018-0.Pdf (0.4
9th Innovations in Theoretical Computer Science (ITCS 2018), January 11–14, 2018, Cambridge, MA, USA
Edited by Anna R. Karlin
LIPIcs – Vol. 94 – ITCS 2018
www.dagstuhl.de/lipics

Editor: Anna R. Karlin, Allen School of Computer Science and Engineering, University of Washington, [email protected]

ACM Classification 1998: F. Theory of Computation, G. Mathematics of Computing
ISBN 978-3-95977-060-6

Published online and open access by Schloss Dagstuhl – Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing, Saarbrücken/Wadern, Germany. Online available at http://www.dagstuhl.de/dagpub/978-3-95977-060-6. Publication date: January, 2018.

Bibliographic information published by the Deutsche Nationalbibliothek: the Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

License: This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC-BY 3.0): http://creativecommons.org/licenses/by/3.0/legalcode. In brief, this license authorizes each and everybody to share (to copy, distribute and transmit) the work under the following conditions, without impairing or restricting the authors' moral rights: Attribution: the work must be attributed to its authors. The copyright is retained by the corresponding authors.

Digital Object Identifier: 10.4230/LIPIcs.ITCS.2018.0
ISSN 1868-8969, http://www.dagstuhl.de/lipics

LIPIcs – Leibniz International Proceedings in Informatics: LIPIcs is a series of high-quality conference proceedings across all fields in informatics. LIPIcs volumes are published according to the principle of Open Access, i.e., they are available online and free of charge.