Multi-Dimensional Tensor Sketch: Dimensionality Reduction That Retains Efficient Tensor Operations
Total pages: 16
File type: PDF, size: 1020 KB
Recommended publications
-
Short and Deep: Sketching and Neural Networks
Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar (Google Brain)
ABSTRACT
Data-independent methods for dimensionality reduction such as random projections, sketches, and feature hashing have become increasingly popular in recent years. These methods often seek to reduce dimensionality while preserving the hypothesis class, resulting in inherent lower bounds on the size of projected data. For example, preserving linear separability requires Ω(1/γ²) dimensions, where γ is the margin, and in the case of polynomial functions, the number of required dimensions has an exponential dependence on the polynomial degree. Despite these limitations, we show that the dimensionality can be reduced further while maintaining performance guarantees, using improper learning with a slightly larger hypothesis class. In particular, we show that any sparse polynomial function of a sparse binary vector can be computed from a compact sketch by a single-layer neural network, where the sketch size has a logarithmic dependence on the polynomial degree. A practical consequence is that networks trained on sketched data are compact, and therefore suitable for settings with memory and power constraints. We empirically show that our approach leads to networks with fewer parameters than related methods such as feature hashing, at equal or better performance.
1 INTRODUCTION
In many supervised learning problems, input data are high-dimensional and sparse. The high dimensionality may be inherent in the domain, such as a large vocabulary in a language model, or the result of creating hybrid conjunction features. This setting poses known statistical and computational challenges for standard supervised learning techniques, as high-dimensional inputs lead to models with a very large number of parameters.
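As a concrete picture of the kind of data-independent sketch the abstract refers to, here is a minimal count sketch of a sparse binary vector in NumPy; the hash construction, sizes, and test vector are illustrative choices of mine, not the paper's.

```python
import numpy as np

def count_sketch(x, sketch_dim, seed=0):
    """Project a (sparse, high-dimensional) vector into sketch_dim buckets
    using a random bucket hash h and a random sign hash s."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    h = rng.integers(0, sketch_dim, size=d)      # bucket for each coordinate
    s = rng.choice([-1.0, 1.0], size=d)          # random sign for each coordinate
    sketch = np.zeros(sketch_dim)
    nz = np.nonzero(x)[0]                        # only touch non-zero coordinates
    np.add.at(sketch, h[nz], s[nz] * x[nz])
    return sketch

# A sparse binary vector in a 1,000,000-dimensional space.
d = 1_000_000
x = np.zeros(d)
x[[3, 17, 123_456, 999_999]] = 1.0

z = count_sketch(x, sketch_dim=256)
print(z.shape)       # (256,) -- this compact vector is what a small downstream
                     # network would be trained on instead of x
print(np.dot(z, z))  # ~ ||x||^2 = 4 in expectation (inner products are preserved)
```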
-
Low-Rank Tucker Approximation of a Tensor from Streaming Data
Yiming Sun, Yang Guo, Charlene Luo, Joel Tropp, and Madeleine Udell
SIAM J. MATH. DATA SCI., © 2020 Society for Industrial and Applied Mathematics, Vol. 2, No. 4, pp. 1123-1150
Abstract. This paper describes a new algorithm for computing a low-Tucker-rank approximation of a tensor. The method applies a randomized linear map to the tensor to obtain a sketch that captures the important directions within each mode, as well as the interactions among the modes. The sketch can be extracted from streaming or distributed data or with a single pass over the tensor, and it uses storage proportional to the degrees of freedom in the output Tucker approximation. The algorithm does not require a second pass over the tensor, although it can exploit another view to compute a superior approximation. The paper provides a rigorous theoretical guarantee on the approximation error. Extensive numerical experiments show that the algorithm produces useful results that improve on the state-of-the-art for streaming Tucker decomposition.
Key words. Tucker decomposition, tensor compression, dimension reduction, sketching method, randomized algorithm, streaming algorithm
AMS subject classifications. 68Q25, 68R10, 68U05
DOI. 10.1137/19M1257718
1. Introduction. Large-scale datasets with natural tensor (multidimensional array) structure arise in a wide variety of applications, including computer vision [39], neuroscience [9], scientific simulation [3], sensor networks [31], and data mining [21]. In many cases, these tensors are too large to manipulate, to transmit, or even to store in a single machine. Luckily, tensors often exhibit a low-rank structure and can be approximated by a low-rank tensor factorization, such as CANDECOMP/PARAFAC (CP), tensor train, or Tucker factorization [20].
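To make the sketching idea tangible, here is a small randomized Tucker approximation in NumPy. It is a two-pass illustration (sketch each mode unfolding with a Gaussian map, orthonormalize, then compress), whereas the paper's algorithm works from a single pass over streaming data; the ranks and sizes below are arbitrary.

```python
import numpy as np

def mode_unfold(X, mode):
    """Matricize tensor X along `mode` (mode-n unfolding)."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def randomized_tucker(X, ranks, seed=0):
    """Illustrative randomized Tucker approximation (two passes over X)."""
    rng = np.random.default_rng(seed)
    factors = []
    for mode, r in enumerate(ranks):
        Xn = mode_unfold(X, mode)
        Omega = rng.standard_normal((Xn.shape[1], r))
        Q, _ = np.linalg.qr(Xn @ Omega)      # orthonormal basis for a range sketch
        factors.append(Q)
    G = X                                    # core: compress X by Q^T along each mode
    for mode, Q in enumerate(factors):
        G = np.moveaxis(np.tensordot(Q.T, G, axes=(1, mode)), 0, mode)
    return G, factors

def tucker_reconstruct(G, factors):
    X = G
    for mode, Q in enumerate(factors):
        X = np.moveaxis(np.tensordot(Q, X, axes=(1, mode)), 0, mode)
    return X

# Synthetic low-Tucker-rank tensor plus a little noise.
rng = np.random.default_rng(1)
core = rng.standard_normal((5, 5, 5))
U = [rng.standard_normal((40, 5)) for _ in range(3)]
X = tucker_reconstruct(core, U) + 0.01 * rng.standard_normal((40, 40, 40))

G, factors = randomized_tucker(X, ranks=(5, 5, 5))
Xhat = tucker_reconstruct(G, factors)
print(np.linalg.norm(X - Xhat) / np.linalg.norm(X))   # small relative error
```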
-
On Manifolds of Tensors of Fixed TT-Rank
SEBASTIAN HOLTZ, THORSTEN ROHWEDDER, AND REINHOLD SCHNEIDER
Abstract. Recently, the format of TT tensors [19, 38, 34, 39] has turned out to be a promising new format for the approximation of solutions of high dimensional problems. In this paper, we prove some new results for the TT representation of a tensor U ∈ R^(n_1×···×n_d) and for the manifold of tensors of TT-rank r. As a first result, we prove that the TT (or compression) ranks r_i of a tensor U are unique and equal to the respective separation ranks of U if the components of the TT decomposition are required to fulfil a certain maximal rank condition. We then show that the set T of TT tensors of fixed rank r forms an embedded manifold in R^(n^d), therefore preserving the essential theoretical properties of the Tucker format, but often showing an improved scaling behaviour. Extending a similar approach for matrices [7], we introduce certain gauge conditions to obtain a unique representation of the tangent space T_U T of T and deduce a local parametrization of the TT manifold. The parametrisation of T_U T is often crucial for an algorithmic treatment of high-dimensional time-dependent PDEs and minimisation problems [33]. We conclude with remarks on those applications and present some numerical examples.
1. Introduction. The treatment of high-dimensional problems, typically of problems involving quantities from R^d for larger dimensions d, is still a challenging task for numerical approximation. This is owed to the principal problem that classical approaches for their treatment normally scale exponentially in the dimension d in both needed storage and computational time and thus quickly become computationally infeasible for sensible discretizations of problems of interest.
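For orientation, the NumPy sketch below builds a TT representation of a small tensor by sequential truncated SVDs (the standard TT-SVD construction) and reconstructs it from the cores; it illustrates the format only, not the manifold and gauge analysis of the paper, and the sizes are arbitrary.

```python
import numpy as np

def tt_svd(X, max_rank):
    """Decompose a d-way tensor into TT cores G_1, ..., G_d by sequential
    truncated SVDs. Core k has shape (r_{k-1}, n_k, r_k) with r_0 = r_d = 1."""
    dims, d = X.shape, X.ndim
    cores, r_prev = [], 1
    C = X.reshape(dims[0], -1)
    for k in range(d - 1):
        C = C.reshape(r_prev * dims[k], -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = int(min(max_rank, np.sum(s > 1e-12)))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = s[:r, None] * Vt[:r]
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    X = cores[0]                                  # (1, n_1, r_1)
    for G in cores[1:]:
        X = np.tensordot(X, G, axes=(-1, 0))      # contract the shared TT rank index
    return X.squeeze(axis=(0, -1))

# A 4-way tensor with exact low TT-rank, built from random cores.
rng = np.random.default_rng(0)
true_cores = [rng.standard_normal(s) for s in [(1, 6, 3), (3, 6, 3), (3, 6, 3), (3, 6, 1)]]
X = tt_reconstruct(true_cores)

cores = tt_svd(X, max_rank=3)
print([G.shape for G in cores])                       # recovered TT ranks
print(np.linalg.norm(X - tt_reconstruct(cores)))      # ~ 0 (exact up to round-off)
```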
-
Fast and Scalable Polynomial Kernels via Explicit Feature Maps
Ninh Pham and Rasmus Pagh, IT University of Copenhagen, Copenhagen, Denmark
ABSTRACT
Approximation of non-linear kernels using random feature mapping has been successfully employed in large-scale data analysis applications, accelerating the training of kernel machines. While previous random feature mappings run in O(ndD) time for n training samples in d-dimensional space and D random feature maps, we propose a novel randomized tensor product technique, called Tensor Sketching, for approximating any polynomial kernel in O(n(d + D log D)) time. Also, we introduce both absolute and relative error bounds for our approximation to guarantee the reliability of our estimation algorithm. Empirically, Tensor Sketching achieves higher accuracy and often runs orders of magnitude faster than the state-of-the-art approach for large-scale [...]
[...] data space into high-dimensional feature space, where each coordinate corresponds to one feature of the data points. In that space, one can perform well-known data analysis algorithms without ever interacting with the coordinates of the data, but rather by simply computing their pairwise inner products. This operation can not only avoid the cost of explicit computation of the coordinates in feature space but also handle general types of data (such as numeric data, symbolic data). While kernel methods have been used successfully in a variety of data analysis tasks, their scalability is a bottleneck. Kernel-based learning algorithms usually scale poorly with the number of the training samples (a cubic running time and quadratic storage for direct methods).
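A compact NumPy illustration of the Tensor Sketching idea for the homogeneous polynomial kernel k(x, y) = (x·y)^p: count-sketch the input with p independent hash/sign pairs and combine the sketches via FFT, so that inner products of sketches estimate the kernel. The sizes and test vectors below are my own choices; see the paper for the precise estimator and its error bounds.

```python
import numpy as np

def tensor_sketch(x, hashes, signs, D):
    """TensorSketch of x for a degree-p polynomial kernel: count-sketch x with
    p independent (hash, sign) pairs, then circularly convolve the p sketches
    (elementwise product in the frequency domain)."""
    prod = np.ones(D, dtype=complex)
    for h, s in zip(hashes, signs):
        cs = np.zeros(D)
        np.add.at(cs, h, s * x)          # count sketch of x
        prod *= np.fft.fft(cs)
    return np.real(np.fft.ifft(prod))

rng = np.random.default_rng(0)
d, D, p = 500, 4096, 3                   # input dim, sketch dim, kernel degree
hashes = [rng.integers(0, D, size=d) for _ in range(p)]
signs = [rng.choice([-1.0, 1.0], size=d) for _ in range(p)]

# Two correlated unit vectors, so (x.y)^p is of reasonable magnitude.
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
z = rng.standard_normal(d)
z /= np.linalg.norm(z)
y = 0.8 * x + 0.6 * z
y /= np.linalg.norm(y)

exact = np.dot(x, y) ** p                        # polynomial kernel (x.y)^p
approx = np.dot(tensor_sketch(x, hashes, signs, D),
                tensor_sketch(y, hashes, signs, D))
print(exact, approx)                             # close, up to sketching error
```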
-
Matrices and Tensors
APPENDIX: MATRICES AND TENSORS
A.1. INTRODUCTION AND RATIONALE
The purpose of this appendix is to present the notation and most of the mathematical techniques that are used in the body of the text. The audience is assumed to have been through several years of college-level mathematics, which included the differential and integral calculus, differential equations, functions of several variables, partial derivatives, and an introduction to linear algebra. Matrices are reviewed briefly, and determinants, vectors, and tensors of order two are described. The application of this linear algebra to material that appears in undergraduate engineering courses on mechanics is illustrated by discussions of concepts like the area and mass moments of inertia, Mohr's circles, and the vector cross and triple scalar products. The notation, as far as possible, will be a matrix notation that is easily entered into existing symbolic computational programs like Maple, Mathematica, Matlab, and Mathcad. The desire to represent the components of three-dimensional fourth-order tensors that appear in anisotropic elasticity as the components of six-dimensional second-order tensors, and thus represent these components in matrices of tensor components in six dimensions, leads to the nontraditional part of this appendix. This is also one of the nontraditional aspects in the text of the book, but a minor one. This is described in §A.11, along with the rationale for this approach.
A.2. DEFINITION OF SQUARE, COLUMN, AND ROW MATRICES
An r-by-c matrix, M, is a rectangular array of numbers consisting of r rows and c columns:
M = [ M_11 M_12 ... M_1c
      M_21 M_22 ... M_2c
      ...
      M_r1 M_r2 ... M_rc ]
-
A. Some Basic Rules of Tensor Calculus
The tensor calculus is a powerful tool for the description of the fundamentals in continuum mechanics and the derivation of the governing equations for applied problems. In general, there are two possibilities for the representation of the tensors and the tensorial equations:
– the direct (symbolic) notation and
– the index (component) notation
The direct notation operates with scalars, vectors and tensors as physical objects defined in the three dimensional space. A vector (first rank tensor) a is considered as a directed line segment rather than a triple of numbers (coordinates). A second rank tensor A is any finite sum of ordered vector pairs A = a ⊗ b + ... + c ⊗ d. The scalars, vectors and tensors are handled as invariant (independent from the choice of the coordinate system) objects. This is the reason for the use of the direct notation in the modern literature of mechanics and rheology, e.g. [29, 32, 49, 123, 131, 199, 246, 313, 334] among others. The index notation deals with components or coordinates of vectors and tensors. For a selected basis, e.g. g_i, i = 1, 2, 3, one can write
a = a^i g_i,   A = (a^i b^j + ... + c^i d^j) g_i ⊗ g_j
Here Einstein's summation convention is used: in one expression the twice repeated indices are summed up from 1 to 3, e.g.
a^k g_k ≡ ∑_{k=1}^{3} a^k g_k,   A^{ik} b_k ≡ ∑_{k=1}^{3} A^{ik} b_k
In the above examples k is a so-called dummy index. Within the index notation the basic operations with tensors are defined with respect to their coordinates, e.g. [...]
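As an aside, Einstein's summation convention maps directly onto numpy.einsum; the toy example below (my own illustration, not from the appendix) evaluates a = a^k g_k, a dyadic product, and a contraction over a dummy index.

```python
import numpy as np

# A (generally non-orthonormal) basis g_1, g_2, g_3, stored as rows of g.
g = np.array([[1.0, 0.0, 0.0],
              [0.2, 1.0, 0.0],
              [0.1, 0.3, 1.0]])
a_coeff = np.array([2.0, -1.0, 0.5])        # components a^k

# a = a^k g_k : summation over the repeated index k.
a = np.einsum('k,ki->i', a_coeff, g)

# A dyadic product A = b (x) c, i.e. A_ij = b_i c_j.
b, c = np.array([1.0, 2.0, 3.0]), np.array([0.0, 1.0, -1.0])
A = np.einsum('i,j->ij', b, c)

# Contraction over the dummy index k: (A . d)_i = A_ik d_k.
d = np.array([1.0, 1.0, 1.0])
print(a)
print(np.einsum('ik,k->i', A, d))           # same as A @ d
```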
-
Learning-Based Frequency Estimation Algorithms
Published as a conference paper at ICLR 2019
Chen-Yu Hsu, Piotr Indyk, Dina Katabi & Ali Vakilian
Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
ABSTRACT
Estimating the frequencies of elements in a data stream is a fundamental task in data analysis and machine learning. The problem is typically addressed using streaming algorithms which can process very large data using limited storage. Today's streaming algorithms, however, cannot exploit patterns in their input to improve performance. We propose a new class of algorithms that automatically learn relevant patterns in the input data and use them to improve its frequency estimates. The proposed algorithms combine the benefits of machine learning with the formal guarantees available through algorithm theory. We prove that our learning-based algorithms have lower estimation errors than their non-learning counterparts. We also evaluate our algorithms on two real-world datasets and demonstrate empirically their performance gains.
1 INTRODUCTION
Classical algorithms provide formal guarantees over their performance, but often fail to leverage useful patterns in their input data to improve their output. On the other hand, deep learning models are highly successful at capturing and utilizing complex data patterns, but often lack formal error bounds. The last few years have witnessed a growing effort to bridge this gap and introduce algorithms that can adapt to data properties while delivering worst case guarantees. Deep learning modules have been integrated into the design of Bloom filters (Kraska et al., 2018; Mitzenmacher, 2018), caching algorithms (Lykouris & Vassilvitskii, 2018), graph optimization (Dai et al., 2017), similarity search (Salakhutdinov & Hinton, 2009; Weiss et al., 2009) and compressive sensing (Bora et al., 2017).
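To ground the idea, here is a minimal count-min sketch in which items flagged as heavy by a predictor get exact counters, in the spirit of the learned algorithms described above. The "predictor" below is an oracle built from the true counts, and the widths and threshold are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from collections import Counter

class LearnedCountMin:
    """Count-min sketch where items flagged as heavy by a (learned) predictor
    are counted exactly; everything else shares the sketch table."""
    def __init__(self, width, depth, is_heavy, seed=0):
        rng = np.random.default_rng(seed)
        self.table = np.zeros((depth, width), dtype=np.int64)
        self.salts = rng.integers(1, 2**31, size=depth)
        self.width = width
        self.is_heavy = is_heavy          # callable item -> bool (the "model")
        self.exact = Counter()

    def _buckets(self, item):
        return [hash((int(salt), item)) % self.width for salt in self.salts]

    def update(self, item, count=1):
        if self.is_heavy(item):
            self.exact[item] += count
        else:
            for row, b in enumerate(self._buckets(item)):
                self.table[row, b] += count

    def estimate(self, item):
        if self.is_heavy(item):
            return self.exact[item]
        return min(self.table[row, b] for row, b in enumerate(self._buckets(item)))

# Zipfian stream: a handful of items dominate the counts.
rng = np.random.default_rng(1)
stream = rng.zipf(1.3, size=200_000)
true = Counter(stream)
heavy = {x for x, c in true.most_common(50)}          # stand-in for a learned model

learned = LearnedCountMin(width=2000, depth=4, is_heavy=lambda x: x in heavy)
plain = LearnedCountMin(width=2000, depth=4, is_heavy=lambda x: False)
for x in stream:
    learned.update(int(x))
    plain.update(int(x))

def avg_err(sk):
    return np.mean([abs(sk.estimate(int(x)) - true[x]) for x in list(true)[:2000]])

print(avg_err(learned), avg_err(plain))   # the learned variant has the lower error
```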
1.3 Cartesian Tensors a Second-Order Cartesian Tensor Is Defined As A
A second-order Cartesian tensor is defined as a linear combination of dyadic products as
T = T_ij e_i e_j .   (1.3.1)
The coefficients T_ij are the components of T. A tensor exists independent of any coordinate system. The tensor will have different components in different coordinate systems. The tensor T has components T_ij with respect to the basis {e_i} and components T'_ij with respect to the basis {e'_i}, i.e.,
T = T'_pq e'_p e'_q = T_ij e_i e_j .   (1.3.2)
From (1.3.2) and (1.2.4.6),
T'_pq e'_p e'_q = T'_pq Q_ip Q_jq e_i e_j = T_ij e_i e_j ,   (1.3.3)
T_ij = Q_ip Q_jq T'_pq .   (1.3.4)
Similarly, from (1.3.2) and (1.2.4.6),
T_ij e_i e_j = T_ij Q_ip Q_jq e'_p e'_q = T'_pq e'_p e'_q ,   (1.3.5)
T'_pq = Q_ip Q_jq T_ij .   (1.3.6)
Equations (1.3.4) and (1.3.6) are the transformation rules for changing second-order tensor components under a change of basis. In general, Cartesian tensors of higher order can be expressed as
T = T_ij...n e_i e_j ... e_n ,   (1.3.7)
and the components transform according to
T_ijk... = Q_ip Q_jq Q_kr ... T'_pqr... ,   T'_pqr... = Q_ip Q_jq Q_kr ... T_ijk... .   (1.3.8)
The tensor product S ⊗ T of a CT(m) S and a CT(n) T is a CT(m+n) such that
S ⊗ T = S_i1i2...im T_j1j2...jn e_i1 e_i2 ... e_im e_j1 e_j2 ... e_jn .
1.3.1 Contraction
Consider the components T_i1i2...ip...iq...in of a CT(n).
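A quick numerical check of the transformation rules (1.3.4) and (1.3.6) with a random orthogonal Q (my own illustration, not part of the source text):

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthogonal change of basis
T = rng.standard_normal((3, 3))                    # components T_ij in basis {e_i}

# T'_pq = Q_ip Q_jq T_ij   (components in the primed basis)
T_prime = np.einsum('ip,jq,ij->pq', Q, Q, T)

# Transforming back must recover the original components: T_ij = Q_ip Q_jq T'_pq
T_back = np.einsum('ip,jq,pq->ij', Q, Q, T_prime)
print(np.allclose(T_back, T))                      # True

# Invariants such as the trace are unchanged by the change of basis.
print(np.trace(T), np.trace(T_prime))
```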
Polynomial Tensor Sketch for Element-Wise Function of Low-Rank Matrix
Insu Han, Haim Avron, Jinwoo Shin
Abstract. This paper studies how to sketch element-wise functions of low-rank matrices. Formally, given low-rank matrix A = [A_ij] and scalar non-linear function f, we aim for finding an approximated low-rank representation of the (possibly high-rank) matrix [f(A_ij)]. To this end, we propose an efficient sketching-based algorithm whose complexity is significantly lower than the number of entries of A, i.e., it runs without accessing all entries of [f(A_ij)] explicitly. The main idea underlying our method is to combine a polynomial approximation of f with the existing tensor sketch [...] of A.
[...] side, we obtain a o(n²)-time approximation scheme of f(A)x ≈ T_U T_V^⊤ x for an arbitrary vector x ∈ R^n due to the associative property of matrix multiplication, where the exact computation of f(A)x requires the complexity of Θ(n²). The matrix-vector multiplication f(A)x or the low-rank decomposition f(A) ≈ T_U T_V^⊤ is useful in many machine learning algorithms. For example, the Gram matrices of certain kernel functions, e.g., polynomial and radial basis function (RBF), are element-wise matrix functions where the rank is the dimension of the underlying data. Such matrices are the cornerstone of so-called kernel methods, and the ability to multiply the Gram matrix to a vector suffices for most kernel learning. The Sinkhorn-Knopp [...]
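The core trick, stripped of the sketching, fits in a few lines: if A = U V^T, every element-wise (Hadamard) power of A stays factored through row-wise Kronecker powers of U and V, so a polynomial approximation of f applied entrywise can be multiplied against a vector without ever forming the n × n matrix. The NumPy toy below uses exp with Taylor coefficients and no tensor sketch, so the intermediate width grows as k^j, which is exactly the blow-up the paper's tensor sketch compresses; all sizes and scalings are arbitrary choices of mine.

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(0)
n, k, degree = 300, 3, 8
# Low-rank factors, scaled so the entries of A = U V^T are small and a short
# Taylor series of exp() is accurate.
U = rng.standard_normal((n, k)) / (2.0 * k ** 0.25)
V = rng.standard_normal((n, k)) / (2.0 * k ** 0.25)
x = rng.standard_normal(n)

coeffs = [1.0 / factorial(j) for j in range(degree + 1)]   # exp(a) ~ sum_j a^j / j!

# Element-wise powers of a low-rank matrix stay factored:
#   (A^{o j})_il = (u_i . v_l)^j = <kron^j(u_i), kron^j(v_l)>,
# so (A^{o j}) x = U_j (V_j^T x) with U_j, V_j the row-wise Kronecker powers.
y = coeffs[0] * np.sum(x) * np.ones(n)      # j = 0 term: the all-ones matrix
Uj, Vj = U.copy(), V.copy()
for j in range(1, degree + 1):
    if j > 1:
        Uj = (Uj[:, :, None] * U[:, None, :]).reshape(n, -1)
        Vj = (Vj[:, :, None] * V[:, None, :]).reshape(n, -1)
    y += coeffs[j] * (Uj @ (Vj.T @ x))      # O(n k^j), never forms the n x n matrix

exact = np.exp(U @ V.T) @ x                 # direct O(n^2) reference
print(np.linalg.norm(y - exact) / np.linalg.norm(exact))   # tiny relative error
```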
-
Tensor-Spinor Theory of Gravitation in General Even Space-Time Dimensions
Physics Letters B 817 (2021) 136288
Hitoshi Nishino (Department of Physics, College of Natural Sciences and Mathematics, California State University, 2345 E. San Ramon Avenue, M/S ST90, Fresno, CA 93740, United States of America) and Subhash Rajpoot (Department of Physics & Astronomy, California State University, 1250 Bellflower Boulevard, Long Beach, CA 90840, United States of America)
Article history: Received 18 March 2021; Accepted 9 April 2021; Available online 21 April 2021. Editor: N. Lambert. Keywords: Bars-MacDowell theory; Vector-spinor; Tensors-spinors; Metric-less formulation.
Abstract. We present a purely tensor-spinor theory of gravity in arbitrary even D = 2n space-time dimensions. This is a generalization of the purely vector-spinor theory of gravitation by Bars and MacDowell (BM) in 4D to general even dimensions with the signature (2n − 1, 1). In the original BM-theory in D = (3, 1), the conventional Einstein equation emerges from a theory based on the vector-spinor field ψ_μ from a lagrangian free of both the fundamental metric g_μν and the vierbein e_μ^m. We first improve the original BM-formulation by introducing a compensator χ, so that the resulting theory has manifest invariance under the nilpotent local fermionic symmetry δψ_μ = D_μ ε and δχ = −ε. We next generalize it to D = (2n − 1, 1), following the same principle based on a lagrangian free of fundamental metric or vielbein, now with the field content (ψ_{μ1···μ(n−1)}, ω_μ^{rs}, χ_{μ1···μ(n−2)}), where ψ_{μ1···μ(n−1)} (or χ_{μ1···μ(n−2)}) is a rank-(n − 1) (or rank-(n − 2)) tensor-spinor.
-
Low-Level Image Processing with the Structure Multivector
Michael Felsberg
Report No. 0202, Institut für Informatik und Praktische Mathematik der Christian-Albrechts-Universität zu Kiel, Olshausenstr. 40, D-24098 Kiel, 12 March 2002. This report contains the author's dissertation. Reviewers: Prof. G. Sommer (Kiel), Prof. U. Heute (Kiel), Prof. J. J. Koenderink (Utrecht). Date of the oral examination: 12.2.2002. To Regina.
ABSTRACT
The present thesis deals with two-dimensional signal processing for computer vision. The main topic is the development of a sophisticated generalization of the one-dimensional analytic signal to two dimensions. Motivated by the fundamental property of the latter, the invariance-equivariance constraint, and by its relation to complex analysis and potential theory, a two-dimensional approach is derived. This method is called the monogenic signal and it is based on the Riesz transform instead of the Hilbert transform. By means of this linear approach it is possible to estimate the local orientation and the local phase of signals which are projections of one-dimensional functions to two dimensions. For general two-dimensional signals, however, the monogenic signal has to be further extended, yielding the structure multivector. The latter approach combines the ideas of the structure tensor and the quaternionic analytic signal. A rich feature set can be extracted from the structure multivector, which contains measures for local amplitudes, the local anisotropy, the local orientation, and two local phases. Both the monogenic signal and the structure multivector are combined with an appropriate scale-space approach, resulting in generalized quadrature filters.
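A small NumPy rendition of the monogenic signal via the Riesz transform in the frequency domain; the sign conventions and the synthetic test pattern are my own choices, and the thesis develops this much further (structure multivector, scale space).

```python
import numpy as np

def monogenic_signal(f):
    """Riesz-transform-based monogenic signal of a 2-D image f.
    Returns local amplitude, local phase and local orientation."""
    rows, cols = f.shape
    u = np.fft.fftfreq(cols)[None, :]          # horizontal frequencies
    v = np.fft.fftfreq(rows)[:, None]          # vertical frequencies
    freq = np.sqrt(u**2 + v**2)
    freq[0, 0] = 1.0                           # avoid division by zero at DC
    F = np.fft.fft2(f)
    r1 = np.real(np.fft.ifft2(-1j * u / freq * F))   # first Riesz component
    r2 = np.real(np.fft.ifft2(-1j * v / freq * F))   # second Riesz component
    amplitude = np.sqrt(f**2 + r1**2 + r2**2)
    phase = np.arctan2(np.sqrt(r1**2 + r2**2), f)
    orientation = np.arctan2(r2, r1)
    return amplitude, phase, orientation

# An oriented sinusoid that is exactly periodic on the 128 x 128 grid
# (an "intrinsically one-dimensional" signal).
y, x = np.mgrid[0:128, 0:128]
kx, ky = 8, 5                                       # integer wave numbers
img = np.cos(2 * np.pi * (kx * x + ky * y) / 128.0)

amp, phase, ori = monogenic_signal(img)
print(amp.mean())                                   # ~1 (unit-amplitude wave)
print(np.rad2deg(np.arctan2(ky, kx)))               # true orientation, ~32 degrees
print(np.median(np.rad2deg(np.mod(ori, np.pi))))    # estimated orientation, ~32
```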
-
The Riemann Curvature Tensor
Jennifer Cox, May 6, 2019. Project Advisor: Dr. Jonathan Walters
Abstract. A tensor is a mathematical object that has applications in areas including physics, psychology, and artificial intelligence. The Riemann curvature tensor is a tool used to describe the curvature of n-dimensional spaces such as Riemannian manifolds in the field of differential geometry. The Riemann tensor plays an important role in the theories of general relativity and gravity as well as the curvature of spacetime. This paper will provide an overview of tensors and tensor operations. In particular, properties of the Riemann tensor will be examined. Calculations of the Riemann tensor for several two and three dimensional surfaces such as that of the sphere and torus will be demonstrated. The relationship between the Riemann tensor for the 2-sphere and 3-sphere will be studied, and it will be shown that these tensors satisfy the general equation of the Riemann tensor for an n-dimensional sphere. The connection between the Gaussian curvature and the Riemann curvature tensor will also be shown using Gauss's Theorema Egregium.
Keywords: tensor, tensors, Riemann tensor, Riemann curvature tensor, curvature
1 Introduction. Coordinate systems are the basis of analytic geometry and are necessary to solve geometric problems using algebraic methods. The introduction of coordinate systems allowed for the blending of algebraic and geometric methods that eventually led to the development of calculus. Reliance on coordinate systems, however, can result in a loss of geometric insight and an unnecessary increase in the complexity of relevant expressions. Tensor calculus is an effective framework that will avoid the cons of relying on coordinate systems.
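A worked check of the kind of computation the paper describes, using SymPy to build the Riemann tensor of the 2-sphere of radius r from its metric and confirm the Gaussian curvature K = 1/r²; the index conventions are the usual ones, and this is my sketch rather than the paper's own calculation.

```python
import sympy as sp

# Coordinates and metric of the 2-sphere of radius r: ds^2 = r^2 dtheta^2 + r^2 sin^2(theta) dphi^2
theta, phi, r = sp.symbols('theta phi r', positive=True)
coords = [theta, phi]
g = sp.diag(r**2, r**2 * sp.sin(theta)**2)
g_inv = g.inv()
n = 2

# Christoffel symbols: Gamma^a_{bc} = 1/2 g^{ad} (d_b g_{dc} + d_c g_{db} - d_d g_{bc})
Gamma = [[[sp.simplify(sum(g_inv[a, d] * (sp.diff(g[d, c], coords[b])
                                          + sp.diff(g[d, b], coords[c])
                                          - sp.diff(g[b, c], coords[d]))
                           for d in range(n)) / 2)
           for c in range(n)] for b in range(n)] for a in range(n)]

# Riemann tensor: R^a_{bcd} = d_c Gamma^a_{db} - d_d Gamma^a_{cb}
#                             + Gamma^a_{ce} Gamma^e_{db} - Gamma^a_{de} Gamma^e_{cb}
def riemann(a, b, c, d):
    term = sp.diff(Gamma[a][d][b], coords[c]) - sp.diff(Gamma[a][c][b], coords[d])
    term += sum(Gamma[a][c][e] * Gamma[e][d][b] - Gamma[a][d][e] * Gamma[e][c][b]
                for e in range(n))
    return sp.simplify(term)

# Fully covariant component R_{1212} (theta-phi-theta-phi) and K = R_{1212} / det(g)
R_1212 = sp.simplify(sum(g[0, a] * riemann(a, 1, 0, 1) for a in range(n)))
K = sp.simplify(R_1212 / g.det())
print(R_1212)   # r**2*sin(theta)**2
print(K)        # 1/r**2 -- the sphere's Gaussian curvature, as expected
```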