Nil Ib N O Ir Ali Mi S Na El Oo B Ilp Itl
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
arXiv:1510.00844v3 [cs.DC] 16 Nov 2016 Splitting (As Opposed to Replicating) Input Submatrices Across Processor Layers
EXPLOITING MULTIPLE LEVELS OF PARALLELISM IN SPARSE MATRIX-MATRIX MULTIPLICATION. ARIFUL AZAD∗, GREY BALLARD†, AYDIN BULUÇ‡, JAMES DEMMEL§, LAURA GRIGORI¶, ODED SCHWARTZ‖, SIVAN TOLEDO∗∗, AND SAMUEL WILLIAMS††. Abstract. Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdős-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first implementation of the 3D SpGEMM formulation that exploits multiple (intra-node and inter-node) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrencies. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research. Key words. Parallel computing, numerical linear algebra, sparse matrix-matrix multiplication, 2.5D algorithms, 3D algorithms, multithreading, SpGEMM, 2D decomposition, graph algorithms. AMS subject classifications. 05C50, 05C85, 65F50, 68W10. 1. Introduction. Multiplication of two sparse matrices (SpGEMM) is a key operation for high-performance graph computations in the language of linear algebra [31, 40]. Examples include graph contraction [25], betweenness centrality [13], Markov clustering [47], peer pressure clustering [43], triangle counting [4], and cycle detection [49]. SpGEMM is also used in scientific computing. For instance, it is often a performance bottleneck for Algebraic Multigrid (AMG), where it is used in the set-up phase for restricting and interpolating matrices [7]. -
LNAI 4264, Pp
Lecture Notes in Artificial Intelligence 4264, edited by J. G. Carbonell and J. Siekmann. Subseries of Lecture Notes in Computer Science. José L. Balcázar, Philip M. Long, Frank Stephan (Eds.): Algorithmic Learning Theory, 17th International Conference, ALT 2006, Barcelona, Spain, October 7-10, 2006, Proceedings. Series Editors: Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA; Jörg Siekmann, University of Saarland, Saarbrücken, Germany. Volume Editors: José L. Balcázar, Universitat Politecnica de Catalunya, Dept. Llenguatges i Sistemes Informatics, c/ Jordi Girona, 1-3, 08034 Barcelona, Spain; Philip M. Long, Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043, USA; Frank Stephan, National University of Singapore, Depts. of Mathematics and Computer Science, 2 Science Drive 2, Singapore 117543, Singapore. Library of Congress Control Number: 2006933733. CR Subject Classification (1998): I.2.6, I.2.3, F.1, F.2, F.4, I.7. LNCS Sublibrary: SL 7 – Artificial Intelligence. ISSN 0302-9743. ISBN-10 3-540-46649-5 Springer Berlin Heidelberg New York. ISBN-13 978-3-540-46649-9 Springer Berlin Heidelberg New York. This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. -
Small Tensor Operations on Advanced Architectures for High-Order Applications
Small Tensor Operations on Advanced Architectures for High-order Applications. A. Abdelfattah¹, M. Baboulin⁵, V. Dobrev², J. Dongarra¹,³, A. Haidar¹, I. Karlin², Tz. Kolev², I. Masliah⁴, and S. Tomov¹. ¹ Innovative Computing Laboratory, University of Tennessee, Knoxville, TN, USA; ² Lawrence Livermore National Laboratory, Livermore, CA, USA; ³ University of Manchester, Manchester, UK; ⁴ Inria Bordeaux, France; ⁵ University of Paris-Sud, France. Abstract. This technical report describes our findings regarding performance optimizations of the tensor contraction kernels used in BLAST – a high-order FE hydrodynamics research code developed at LLNL – on various modern architectures. Our approach considers and shows ways to organize the contractions, their vectorization, data storage formats, read/write patterns, and parametrization as related to batched execution and parallelism in general. An autotuning framework is designed and used to find the empirically best-performing tensor kernels by exploring a large search space that results from the techniques described. We analyze these kernels to show the trade-offs between the various implementations, how different tensor kernel implementations perform on different architectures, and what tuning parameters can have a significant impact on performance. In all cases, we organize the tensor contractions to minimize their communications by considering index reordering that enables their execution as highly efficient batched matrix-matrix multiplications (GEMMs). We derive a performance model and bound for the maximum performance that can be achieved under the maximum data-reuse scenario, and show that our implementations achieve 90+% of these theoretically derived peaks on advanced multicore x86 CPU, ARM, GPU, and Xeon Phi architectures. -
Tensor Networks for Dimensionality Reduction and Large-Scale Optimization Part 1 Low-Rank Tensor Decompositions
Foundations and Trends® in Machine Learning, Vol. 9, No. 4-5 (2016) 249–429. © 2017 A. Cichocki et al. DOI: 10.1561/2200000059
Tensor Networks for Dimensionality Reduction and Large-Scale Optimization, Part 1: Low-Rank Tensor Decompositions
Andrzej Cichocki, RIKEN Brain Science Institute (BSI), Japan and Skolkovo Institute of Science and Technology (SKOLTECH)
Namgil Lee, RIKEN BSI
Ivan Oseledets, Skolkovo Institute of Science and Technology (SKOLTECH) and Institute of Numerical Mathematics of Russian Academy of Sciences
Anh-Huy Phan, RIKEN BSI
Qibin Zhao, RIKEN BSI
Danilo P. Mandic, Department of Electrical and Electronic Engineering, Imperial College London
Contents
1 Introduction and Motivation
1.1 Challenges in Big Data Processing
1.2 Tensor Notations and Graphical Representations
1.3 Curse of Dimensionality and Generalized Separation of Variables for Multivariate Functions
1.4 Advantages of Multiway Analysis via Tensor Networks
1.5 Scope and Objectives
2 Tensor Operations and Tensor Network Diagrams
2.1 Basic Multilinear Operations
2.2 Graphical Representation of Fundamental Tensor Networks
2.3 Generalized Tensor Network Formats
3 Constrained Tensor Decompositions: From Two-way to Multiway Component Analysis
3.1 Constrained Low-Rank Matrix Factorizations
3.2 The CP Format
3.3 The Tucker Tensor Format
3.4 Higher Order SVD (HOSVD) for Large-Scale Problems
3.5 Tensor Sketching Using Tucker Model -
Plenary Speakers
FoCM95 Park City: Plenary speakers:
WEEK 1
Marie-Francoise Roy, Universite de Rennes
Shmuel Winograd, IBM
Dima Y. Grigoriev, Pennsylvania State University
Richard S. Varga, Kent State University
Steve Smale, University of California, Berkeley
John Canny, University of California, Berkeley
Felipe Cucker, Universitat Pompeu Fabra, Spain
Victor Pan, Herbert H. Lehman College, CUNY
Michael Shub, IBM
Roger Brockett, Harvard University
WEEK 2
Henryk Wozniakowski, University of Warsaw
David Donoho, University of California, Berkeley and Columbia University
Yosef Yomdin, Weizmann Institute of Science, Israel
Margaret H. Wright, AT&T Bell Laboratories
N. Karmarkar, AT&T Bell Laboratories
Manuel Blum, University of California, Berkeley
Roger Temam, Indiana University
Arkadi Nemirovski, Israel Institute of Technology
Hubertus Th. Jongen, Rheinisch-Westf. Tech. Hochschule
James M. Renegar, Cornell University
WEEK 3
Herb Keller, California Institute of Technology
Gene H. Golub, Stanford University
Alexandre J. Chorin, University of California, Berkeley
T. Y. Li, Michigan State University
James Yorke, University of Maryland
Lenore Blum, MSRI
Eugene L. Allgower, Colorado State University
Arieh Iserles, University of Cambridge, UK
James W. Demmel, University of California, Berkeley
W. Dahmen, Rheinisch-Westf. Tech. Hochschule
WEEK 4
Ronald A. DeVore, University of South Carolina, Columbia
Ulrich Kulisch, University of Karlsruhe
Victor A. V. Vassiliev, Institute for System Studies, Moscow
Jacques Louis Lions, College de France
Henryk Wozniakowski, University of -
MMLS 2017 Booklet
MMLS 2017 Booklet
Monday, June 19
9:30 - 11:15 TTIC Continental breakfast. (TTIC Colloquium: 10:00-11:00.)
11:15 - 11:30 GPAH Opening remarks: Po-Ling Loh.
11:30 - 12:20 GPAH Plenary speaker: Devavrat Shah. (Chair: Po-Ling Loh.) Latent Variable Model Estimation via Collaborative Filtering.
12:20 - 2:50 GPAH Lunch, posters.
2:50 - 3:30 GPAH Invited talks (chair: Mesrob Ohannessian).
2:50: Rina Foygel, Projected Gradient Descent with Nonconvex Constraints.
3:10: Maxim Raginsky, Non-Convex Learning via Stochastic Gradient Langevin Dynamics.
3:30 - 4:00 GPAH Coffee break.
4:00 - 4:50 GPAH Plenary speaker: Rayid Ghani. (Chair: Nati Srebro.) Machine Learning for Public Policy: Opportunities and Challenges.
4:50 - 5:30 GPAH Invited talks (chair: Laura Balzano).
4:50: Dimitris Papailiopoulos, Gradient Diversity Empowers Distributed Learning.
5:10: Alan Ritter, Large-Scale Learning for Information Extraction.
5:45 - 7:00 TTIC Reception, with remarks by Sadaoki Furui (TTIC President).
Tuesday, June 20
8:30 - 9:50 GPAH Continental breakfast.
9:00 - 9:50 GPAH Bonus speaker: Larry Wasserman. (Chair: Mladen Kolar.) Locally Optimal Testing. [Cancelled.]
9:50 - 10:50 GPAH Invited talks (chair: Misha Belkin).
9:50: Srinadh Bhojanapalli, Effectiveness of Local Search for Low Rank Recovery.
10:10: Niao He, Learning From Conditional Distributions.
10:30: Clayton Scott, Nonparametric Preference Completion.
10:50 - 11:20 GPAH Coffee break.
11:20 - 12:20 GPAH Invited talks (chair: Jason Lee).
11:20: Lev Reyzin, On the Complexity of Learning from Label Proportions.
11:40: Ambuj Tewari, Random Perturbations in Online Learning.
12:00: Risi Kondor, Multiresolution Matrix Factorization. -
Grey Ballard, PO Box 7311 • Computer Science Department • Wake Forest University • Winston Salem, NC 27106
Grey Ballard, www.wfu.edu/~ballard
PO Box 7311 • Computer Science Department • Wake Forest University • Winston Salem, NC 27106
Professional
Assistant Professor, 2016 – present. Wake Forest University, Department of Computer Science, Winston Salem NC
Harry S. Truman Postdoctoral Fellow, 2013 – 2016. Sandia National Laboratories, Livermore CA
Education
Ph.D. in Computer Science, Fall 2008 – Spring 2013. University of California Berkeley, with a Designated Emphasis in Computational Science and Engineering. Advisor: James Demmel. Thesis: Avoiding Communication in Dense Linear Algebra
M.A. in Mathematics, Fall 2006 – Spring 2008. Wake Forest University. Advisor: John Baxley
B.S. in Mathematics and Computer Science, Fall 2002 – Spring 2006. Wake Forest University, summa cum laude with honors in mathematics and honors in computer science
Honors and Awards
WFU Award for Excellence in Research, 2021. Awarded to early-career faculty member for significant research, creative activity, or scholarly activity
NSF CAREER Award, 2020-2024. National Science Foundation Faculty Early Career Development Program
Dunn-Riley Faculty Fellowship, 2020-2022. Wake Forest Faculty Fellowship Program
ACM Senior Member, 2021. Recognizes those ACM members with at least 10 years of professional experience who have demonstrated performance through technical leadership, and technical or professional contributions
ICDM Best Paper Award, 2015. Awarded by the program committee, with coauthors Tamara Kolda, Ali Pinar, and C. Seshadri
ACM Doctoral Dissertation Award - Honorable Mention -
David Samuel Bindel Research Interests Education Professional
David Samuel Bindel, Associate Professor of Computer Science
http://www.cs.cornell.edu/~bindel/cv/cv.pdf
Department of Computer Science, Cornell University, Ithaca, NY 14853
www.cs.cornell.edu/~bindel
Office: 607-255-5395
Research interests
• Applied numerical linear algebra
• Scientific computing
• High-performance computing
• Spectral network analysis methods
• Optimization via surrogate models
• Finite element analysis
• Computational tools for electrical power grids
• Simulation tools for micro-electro-mechanical systems (MEMS)
Education
May 1999. B.S. in Mathematics and in Computer Science, University of Maryland, College Park
December 2006. Ph.D. in Computer Science, University of California, Berkeley. Advisors: James Demmel (Computer Science Division and Department of Mathematics), Sanjay Govindjee (Department of Civil Engineering). Dissertation title: Structured and Parameter-Dependent Eigensolvers for Simulation-Based Design of Resonant MEMS
Professional Experience
Summer 2017-present. Associate Professor, Department of Computer Science, Cornell University
Spring 2019. Visiting Scholar, Department of Statistics, University of Chicago
Fall 2018. Faculty Research Participant, Argonne National Laboratory
Summer 2009-Summer 2017. Assistant Professor, Department of Computer Science, Cornell University
Fall 2006-Summer 2009. Courant Instructor of Mathematics, New York University
Fall 1999-Summer 2006. Graduate Student Researcher, CS Division, UC Berkeley
Fall 2005 and Spring 2001. Graduate Student Instructor, CS Division, UC Berkeley
Awards
2020 James and Mary Tien Excellence in Teaching Award. Highest award for teaching in Cornell's College of Engineering.
2019 KDD Best Research Paper Award
2018 Cornell COE Research Excellence Award. Awarded annually to two Cornell engineering professors at each level.
2018 ASPLOS Most Influential Paper Award 2018. Recognizes a historical ASPLOS paper that has had major influence on the field. -
Effective Criteria for Specific Identifiability of Tensors and Forms
FLORE, Institutional Repository of the Università degli Studi di Firenze. Effective criteria for specific identifiability of tensors and forms. This is the final refereed version (Post print/Accepted manuscript) of the following publication: Original Citation: Effective criteria for specific identifiability of tensors and forms / Chiantini, Luca; Ottaviani, Giorgio Maria; Vannieuwenhoven, Nick Jos. - In: SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS. - ISSN 0895-4798. - Print. - 38(2017), pp. 656-681. [10.1137/16M1090132] Availability: This version is available at: 2158/1103187 since: 2021-03-18T12:40:06Z Published version: DOI: 10.1137/16M1090132 Terms of use: Open Access. The publication is made available under the rules and terms of the deposit license, as established by the Open Access Policy of the Università degli Studi di Firenze (https://www.sba.unifi.it/upload/policy-oa-2016-1.pdf) Publisher copyright claim: 26 September 2021 EFFECTIVE CRITERIA FOR SPECIFIC IDENTIFIABILITY OF TENSORS AND FORMS. LUCA CHIANTINI, GIORGIO OTTAVIANI, AND NICK VANNIEUWENHOVEN. Abstract. In several applications where the tensor rank decomposition arises, one often relies on its identifiability properties for meaningfully interpreting the individual rank-1 terms appearing in the decomposition. Several criteria for identifiability have been proposed in the literature, however few results exist on how frequently they are satisfied. We propose to call such a criterion effective if it is satisfied on a dense, open subset of the smallest semi-algebraic set enclosing the set of rank-r tensors. No criteria that are effective for all ranks up to the smallest typical rank of the tensor space are known. -
Theory and Algorithms for Modern Problems in Machine Learning and an Analysis of Markets
Theory and Algorithms for Modern Problems in Machine Learning and an Analysis of Markets, by Ashish Rastogi. A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, May 2008. Richard Cole, Advisor. Mehryar Mohri, Advisor. © Ashish Rastogi, All Rights Reserved, 2008. To the most wonderful parents in the whole world, Mrs. Asha Rastogi and Mr. Shyam Lal Rastogi. Acknowledgements. First and foremost, I would like to thank my advisors, Professor Richard Cole and Professor Mehryar Mohri, for their unwavering support, guidance and constant encouragement. They have been inspiring mentors and much of what lies in the following pages can be credited to them. Working under their supervision has been one of the most enriching experiences of my life. I would also like to thank Professor Joel Spencer, Professor Arun Sundararajan, Professor Subhash Khot and Dr. Corinna Cortes for agreeing to serve as members on my thesis committee. Professor Spencer’s class on Random Graphs remains one of the most stimulating courses I undertook as a graduate student. Internships at Google through the summers of 2005, 2006 and 2007 were some of the most enjoyable periods of my graduate school life. Many thanks are due to Dr. Corinna Cortes for providing me with the opportunity to work on several challenging problems at Google. Research initiated during these internships culminated in the development of ideas that form the bulk of this thesis. I would also like to thank my peers from the graduate school. -
On-Device Machine Learning: an Algorithms and Learning Theory Perspective
On-Device Machine Learning: An Algorithms and Learning Theory Perspective. SAUPTIK DHAR, America Research Center, LG Electronics; JUNYAO GUO, America Research Center, LG Electronics; JIAYI (JASON) LIU, America Research Center, LG Electronics; SAMARTH TRIPATHI, America Research Center, LG Electronics; UNMESH KURUP, America Research Center, LG Electronics; MOHAK SHAH, America Research Center, LG Electronics. The predominant paradigm for using machine learning models on a device is to train a model in the cloud and perform inference using the trained model on the device. However, with an increasing number of smart devices and improved hardware, there is interest in performing model training on the device. Given this surge in interest, a comprehensive survey of the field from a device-agnostic perspective sets the stage for both understanding the state-of-the-art and for identifying open challenges and future avenues of research. However, on-device learning is an expansive field with connections to a large number of related topics in AI and machine learning (including online learning, model adaptation, one/few-shot learning, etc.). Hence, covering such a large number of topics in a single survey is impractical. This survey finds a middle ground by reformulating the problem of on-device learning as resource constrained learning where the resources are compute and memory. This reformulation allows tools, techniques, and algorithms from a wide variety of research areas to be compared equitably. In addition to summarizing the state-of-the-art, the survey also identifies a number of challenges and next steps for both the algorithmic and theoretical aspects of on-device learning. ACM Reference Format: Sauptik Dhar, Junyao Guo, Jiayi (Jason) Liu, Samarth Tripathi, Unmesh Kurup, and Mohak Shah. -
Bayesian Dynamic Tensor Regression
Bayesian Dynamic Tensor Regression∗. Monica Billio†¹, Roberto Casarin‡¹, Matteo Iacopini§¹,², Sylvia Kaufmann¶³. ¹Ca’ Foscari University of Venice; ²Scuola Normale Superiore of Pisa; ³Study Center Gerzensee, Foundation of the Swiss National Bank. arXiv:1709.09606v3 [stat.ME] 3 Jul 2019. Abstract. Tensor-valued data are becoming increasingly available in economics and this calls for suitable econometric tools. We propose a new dynamic linear model for tensor-valued response variables and covariates that encompasses some well-known econometric models as special cases. Our contribution is manifold. First, we define a tensor autoregressive process (ART), study its properties and derive the associated impulse response function. Second, we exploit the PARAFAC low-rank decomposition to provide a parsimonious parametrization and to incorporate sparsity effects. We also contribute to inference methods for tensors by developing a Bayesian framework which allows for including extra-sample information and for introducing shrinking ∗We are grateful to Federico Bassetti, Sylvia Frühwirth-Schnatter, Christian Gouriéroux, Søren Johansen, Siem Jan Koopman, Gary Koop, André Lucas, Alain Monfort, Peter Phillips, Christian Robert, Mike West, for their comments and suggestions. Also, we thank the seminar participants at: CREST, University of Southampton, Vrije University of Amsterdam, London School of Economics, Maastricht University, Polytechnic University of Milan.
Moreover, we thank the conference and workshop participants at: “ICEEE 2019” in Lecce, 2019, “CFENetwork 2018” in Pisa, 2018, “29th EC2 conference” in Rome, 2018, “12th RCEA Annual meeting” in Rimini, 2018, “8th MAF” in Madrid, 2018, “CFENetwork 2017” in London, 2017, “ICEEE 2017” in Messina, 2017, “3rd Vienna Workshop on High-dimensional Time Series in Macroeconomics and Finance” in Wien, 2017, “BISP10” in Milan, 2017, “ESOBE” in Venice, 2016, “CFENetwork” in Seville, 2016, and the “SIS Intermediate Meeting” of the Italian Statistical Society in Florence, 2016.