Algorithms and Hardness for Linear Algebra on Geometric Graphs


Algorithms and Hardness for Linear Algebra on Geometric Graphs∗

Josh Alman†, Timothy Chu‡, Aaron Schild§, Zhao Song¶

arXiv:2011.02466v1 [cs.DS] 4 Nov 2020

Abstract

For a function $K : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}_{\geq 0}$ and a set $P = \{x_1, \ldots, x_n\} \subset \mathbb{R}^d$ of $n$ points, the $K$-graph $G_P$ of $P$ is the complete graph on $n$ nodes where the weight between nodes $i$ and $j$ is given by $K(x_i, x_j)$. In this paper, we initiate the study of when efficient spectral graph theory is possible on these graphs. We investigate whether or not it is possible to solve the following problems in $n^{1+o(1)}$ time for a $K$-graph $G_P$ when $d < n^{o(1)}$:

• Multiply a given vector by the adjacency matrix or Laplacian matrix of $G_P$
• Find a spectral sparsifier of $G_P$
• Solve a Laplacian system in $G_P$'s Laplacian matrix

For each of these problems, we consider all functions of the form $K(u,v) = f(\|u-v\|_2^2)$ for a function $f : \mathbb{R} \to \mathbb{R}$. We provide algorithms and comparable hardness results for many such $K$, including the Gaussian kernel, Neural tangent kernels, and more. For example, in dimension $d = \Omega(\log n)$, we show that there is a parameter associated with the function $f$ for which low parameter values imply $n^{1+o(1)}$-time algorithms for all three of these problems and high parameter values imply the nonexistence of subquadratic-time algorithms assuming the Strong Exponential Time Hypothesis (SETH), given natural assumptions on $f$.

As part of our results, we also show that the exponential dependence on the dimension $d$ in the celebrated fast multipole method of Greengard and Rokhlin cannot be improved, assuming SETH, for a broad class of functions $f$. To the best of our knowledge, this is the first formal limitation proven about fast multipole methods.

∗A preliminary version of this paper appeared in the Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS 2020).
†[email protected], Harvard University. Part of the work done while the author was a graduate student at MIT. Supported in part by a Michael O. Rabin Postdoctoral Fellowship.
‡[email protected], Carnegie Mellon University.
§[email protected], University of Washington. Part of the work done while the author was a graduate student at UC Berkeley. Supported in part by ONR #N00014-17-1-2429, a Packard fellowship, and the University of Washington.
¶[email protected], Columbia University, Princeton University and Institute for Advanced Study. Part of the work done while the author was a graduate student at UT-Austin. Supported in part by Ma Huateng Foundation, Schmidt Foundation, Simons Foundation, NSF, DARPA/SRC, Google and Amazon.

Contents

1 Introduction 1
  1.1 High-dimensional results 3
    1.1.1 Multiplication 3
    1.1.2 Sparsification 4
    1.1.3 Laplacian solving 5
  1.2 Our Techniques 6
    1.2.1 Multiplication 6
    1.2.2 Sparsification 8
  1.3 Brief summary of our results in terms of $p_f$ 9
  1.4 Summary of our Results on Examples 11
  1.5 Other Related Work 11
2 Summary of Low Dimensional Results 12
  2.1 Multiplication 12
  2.2 Sparsification 13
  2.3 Laplacian solving 13
3 Preliminaries 14
  3.1 Notation 14
  3.2 Graph and Laplacian Notation 15
  3.3 Spectral Sparsification via Random Sampling 16
  3.4 Woodbury Identity 17
  3.5 Tail Bounds 17
  3.6 Fine-Grained Hypotheses 17
  3.7 Dimensionality Reduction 18
  3.8 Nearest Neighbor Search 18
  3.9 Geometric Laplacian System 19
4 Equivalence of Matrix-Vector Multiplication and Solving Linear Systems 20
  4.1 Solving Linear Systems Implies Matrix-Vector Multiplication 22
  4.2 Matrix-Vector Multiplication Implies Solving Linear Systems 24
  4.3 Lower bound for high-dimensional linear system solving 24
5 Matrix-Vector Multiplication 25
  5.1 Equivalence between Adjacency and Laplacian Evaluation 26
  5.2 Approximate Degree 27
  5.3 'Kernel Method' Algorithms 28
  5.4 Lower Bound in High Dimensions 29
  5.5 Lower Bounds in Low Dimensions 37
  5.6 Hardness of the $n$-Body Problem 39
  5.7 Hardness of Kernel PCA 39
6 Sparsifying Multiplicatively Lipschitz Functions in Almost Linear Time 40
  6.1 High Dimensional Sparsification 43
  6.2 Low Dimensional Sparsification 44
7 Sparsifiers for $|\langle x, y \rangle|$ 44
  7.1 Existence of large expanders in inner product graphs 44
  7.2 Efficient algorithm for finding sets with low effective resistance diameter 47
  7.3 Using low-effective-resistance clusters to sparsify the unweighted IP graph 51
  7.4 Sampling data structure 53
  7.5 Weighted IP graph sparsification 56
    7.5.1 $(\zeta, \kappa)$-cover for unweighted IP graphs 58
    7.5.2 $(\zeta, \kappa)$-cover for weighted IP graphs on bounded-norm vectors 60
    7.5.3 $(\zeta, \kappa)$-cover for weighted IP graphs on vectors with norms in the set $[1,2] \cup [z,2z]$ for any $z > 1$ 60
    7.5.4 $(\zeta, \kappa)$-cover for weighted IP graphs with polylogarithmic dependence on norm 63
    7.5.5 Desired $(\zeta, \kappa, \delta)$-cover 63
    7.5.6 Proof of Lemma 7.1 67
8 Hardness of Sparsifying and Solving Non-Multiplicatively-Lipschitz Laplacians 68
9 Fast Multipole Method 77
  9.1 Overview 77
  9.2 $K(x,y) = \exp(-\|x-y\|_2^2)$, Fast Gaussian transform 78
    9.2.1 Estimation 80
    9.2.2 Algorithm 84
    9.2.3 Result 86
  9.3 Generalization 87
  9.4 $K(x,y) = 1/\|x-y\|_2^2$ 88
10 Neural Tangent Kernel 88

References 89

1 Introduction

Linear algebra has a myriad of applications throughout computer science and physics. Consider the following seemingly unrelated tasks:

1. $n$-body simulation (one step): Given $n$ bodies $X$ located at points in $\mathbb{R}^d$, compute the gravitational force on each body induced by the other bodies.

2. Spectral clustering: Given $n$ points $X$ in $\mathbb{R}^d$, partition $X$ by building a graph $G$ on the points in $X$, computing the top $k$ eigenvectors of the Laplacian matrix $L_G$ of $G$ for some $k \geq 1$ to embed $X$ into $\mathbb{R}^k$, and running $k$-means on the resulting points.

3. Semi-supervised learning: Given $n$ points $X$ in $\mathbb{R}^d$ and a function $g : X \to \mathbb{R}$ whose values on some of $X$ are known, extend $g$ to the rest of $X$.

Each of these tasks has seen much work throughout numerical analysis, theoretical computer science, and machine learning. The first task is a celebrated application of the fast multipole method of Greengard and Rokhlin [GR87, GR88, GR89], voted one of the top ten algorithms of the twentieth century by the editors of Computing in Science and Engineering [DS00]. The second task is spectral clustering [NJW02, LWDH13], a popular algorithm for clustering data. The third task is to label a full set of data given only a small set of partial labels [Zhu05b, CSZ09, ZL05], which has seen increasing use in machine learning. One notable method for performing semi-supervised learning is the graph-based Laplacian regularizer method [LSZ+19b, ZL05, BNS06, Zhu05a].

Popular techniques for each of these problems benefit from primitives in spectral graph theory on a special class of dense graphs called geometric graphs. For a function $K : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ and a set of points $X \subseteq \mathbb{R}^d$, the $K$-graph on $X$ is a graph with vertex set $X$ and edges with weight $K(u,v)$ for each pair $u, v \in X$.
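To make the object of study concrete, the following numpy sketch (ours, not from the paper; the helper name `k_graph_adjacency` is illustrative) builds the dense adjacency matrix of a $K$-graph for the Gaussian kernel $f(z) = e^{-z}$ and multiplies it by a vector. The naive approach costs $\Theta(n^2 d)$ time to build the matrix and $\Theta(n^2)$ per multiplication, which is exactly the baseline that the $n^{1+o(1)}$-time algorithms discussed in this paper aim to beat.

```python
import numpy as np

def k_graph_adjacency(X, f):
    """Dense adjacency matrix of the K-graph on the rows of X,
    where K(u, v) = f(||u - v||_2^2); self-loop weights are zeroed."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    A = f(sq_dists)
    np.fill_diagonal(A, 0.0)
    return A

rng = np.random.default_rng(0)
n, d = 1000, 10
X = rng.standard_normal((n, d))

A = k_graph_adjacency(X, lambda z: np.exp(-z))  # Gaussian kernel graph
y = rng.standard_normal(n)
Ay = A @ y  # naive matvec: Theta(n^2) time after Theta(n^2 d) construction
```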
Adjacency matrix-vector multiplication, spectral sparsification, and Laplacian system solving in geometric graphs are directly relevant to each of the above problems, respectively:

1. $n$-body simulation (one step): For each $i \in \{1, 2, \ldots, d\}$, make a weighted graph $G_i$ on the points in $X$, in which the weight of the edge between the points $u, v \in X$ in $G_i$ is $K_i(u,v) := \left(\frac{G_{\mathrm{grav}} \cdot m_u \cdot m_v}{\|u-v\|_2^2}\right)\left(\frac{v_i - u_i}{\|u-v\|_2}\right)$, where $G_{\mathrm{grav}}$ is the gravitational constant and $m_x$ is the mass of the point $x \in X$. Let $A_i$ denote the weighted adjacency matrix of $G_i$. Then $A_i \mathbf{1}$ is the vector of $i$th coordinates of the force vectors. In particular, the gravitational forces can be computed by doing $O(d)$ adjacency matrix-vector multiplications, where each adjacency matrix is that of the $K_i$-graph on $X$ for some $i$.

2. Spectral clustering: Make a $K$-graph $G$ on $X$. In applications, $K(u,v) = f(\|u-v\|_2^2)$, where $f$ is often chosen to be $f(z) = e^{-z}$ [vL07, NJW02]. Instead of directly running a spectral clustering algorithm on $L_G$, one popular method is to construct a sparse matrix $M$ approximating $L_G$ and run spectral clustering on $M$ instead [CFH16, CSB+11, KMT12]. Standard sparsification methods in the literature are heuristic, and include the widely used Nyström method, which uniformly samples rows and columns from the original matrix [CJK+13]. If $H$ is a spectral sparsifier of $G$, it has been suggested that spectral clustering with the top $k$ eigenvectors of $L_H$ performs just as well in practice as spectral clustering with the top $k$ eigenvectors of $L_G$ [CFH16]. One justification is that since $H$ is a spectral sparsifier of $G$, the eigenvalues of $L_H$ are at most a constant factor larger than those of $L_G$, so cuts with similar conductance guarantees are produced. Moreover, spectral clustering using sparse matrices like $L_H$ is known to be faster than spectral clustering on dense matrices like $L_G$ [CFH16, CJK+13, KMT12].

3. Semi-supervised learning: An important subroutine in semi-supervised learning is completion based on $\ell_2$-minimization [Zhu05a, Zhu05b, LSZ+19b]. Specifically, given values $g_v$ for $v \in Y$, where $Y$ is a subset of $X$, find the vector $g \in \mathbb{R}^n$ (variable over $X \setminus Y$) that minimizes $\frac{1}{2} \sum_{u,v \in X} K(u,v)(g_u - g_v)^2$.
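A minimal sketch of this $\ell_2$-minimization subroutine (again ours; `harmonic_extension` is an illustrative name, and the dense direct solve is only a baseline): fixing $g$ on $Y$ and setting the gradient of the objective over $U = X \setminus Y$ to zero yields the linear system $L_{U,U}\, g_U = -L_{U,Y}\, g_Y$ in the Laplacian $L$ of the $K$-graph. Solving this dense system directly takes $\Theta(n^3)$ time; the Laplacian-solving results in this paper concern when such geometric Laplacian systems can instead be solved in $n^{1+o(1)}$ time.

```python
import numpy as np

def harmonic_extension(A, labeled_idx, g_labeled):
    """Minimize (1/2) * sum_{u,v} A[u, v] * (g_u - g_v)^2 over the unlabeled
    coordinates, with g fixed to g_labeled on labeled_idx. Stationarity
    gives L[U, U] g_U = -L[U, Y] g_Y for the graph Laplacian L = D - A."""
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A               # Laplacian of the K-graph
    U = np.setdiff1d(np.arange(n), labeled_idx)  # unlabeled indices X \ Y
    g = np.empty(n)
    g[labeled_idx] = g_labeled
    # L[U, U] is positive definite here: the K-graph is complete with
    # positive weights, so the grounded Laplacian is invertible.
    g[U] = np.linalg.solve(L[np.ix_(U, U)],
                           -L[np.ix_(U, labeled_idx)] @ g_labeled)
    return g

# Toy usage: Gaussian-kernel K-graph on random points, labels on 100 of them.
rng = np.random.default_rng(0)
n, d = 500, 10
X = rng.standard_normal((n, d))
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
A = np.exp(-sq)
np.fill_diagonal(A, 0.0)
labeled = rng.choice(n, size=100, replace=False)
g = harmonic_extension(A, labeled, np.sign(X[labeled, 0]))
```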