TENSOR RANK DECOMPOSITIONS VIA THE PSEUDO-MOMENT METHOD

A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Jonathan Shi

December 2019

© 2019 Jonathan Shi. ALL RIGHTS RESERVED.

TENSOR RANK DECOMPOSITIONS VIA THE PSEUDO-MOMENT METHOD
Jonathan Shi, Ph.D.
Cornell University 2019

Over a series of four articles and an introduction, this dissertation develops a "method of pseudo-moments" that gives polynomial-time tensor rank decompositions for a variety of tensor component models. The algorithms fall into two general classes: those that rely on convex optimization, which develop the theory of polynomial-time algorithms, and those constructed through spectral and matrix polynomial methods, which show how those polynomial-time ideas can be realized in runtimes practical for real-life inputs. The tensor component models covered include random, worst-case, generic, and smoothed inputs, in both overcomplete and undercomplete rank regimes. All models tolerate substantial noise in the input tensor, measured in spectral norm.

BIOGRAPHICAL SKETCH

Jonathan Shi was raised in Seattle, Washington, and received a Bachelor of Science from the University of Washington in 2013 with major concentrations in Computer Science, Mathematics, and Physics. He has been working at Cornell University under the guidance of Professor David Steurer and will soon join Bocconi University in Milan as a Research Fellow.

This dissertation is dedicated to my mother and father, Richard C-J Shi and Tracey H. Luo, who made this work possible with their perseverance, and to the memory of Mary Kaitlynne Richardson, whose kindness would have driven her to greatness.

ACKNOWLEDGEMENTS

I am grateful for the support and mentorship of my advisor David Steurer, who oversaw my development of skill and knowledge in this thesis topic from the starting scraps I had, and of Professor Robert Kleinberg in his roles as Director of Graduate Studies, committee member, and all-around pretty cool person. I would also like to acknowledge those in the student community who worked (in part through student organizations) to create a welcoming, supportive, and inclusive community, as well as those who labor to keep the Computer Science Field at Cornell running.

I would not have been able to complete this work without the aid of Eve Abrams, LCSW-R, Robert Mendola, MD, Clint Wattenberg, MS RD, and Edward Koppel, MD in improving my health and directing me to needed resources. I acknowledge support from a Cornell University Fellowship, as well as from the NSF via my advisor's NSF CAREER Grant CCF-1350196, over the duration of my graduate program.

CONTENTS

Biographical Sketch
Dedication
Acknowledgements
Contents
List of Tables
List of Figures

1 Introduction
  1.0.1 Results stated
  1.0.2 Overview of methods
  1.0.3 Frontier of what is possible
  1.0.4 Organization

2 Spiked tensor model
  2.1 Introduction
    2.1.1 Results
    2.1.2 Techniques
    2.1.3 Related Work
  2.2 Preliminaries
    2.2.1 Notation
    2.2.2 Polynomials and Matrices
    2.2.3 The Sum of Squares (SoS) Algorithm
  2.3 Certifying Bounds on Random Polynomials
  2.4 Polynomial-Time Recovery via Sum of Squares
    2.4.1 Semi-Random Tensor PCA
  2.5 Linear Time Recovery via Further Relaxation
    2.5.1 The Spectral SoS Relaxation
    2.5.2 Recovery via the ∑ᵢ Tᵢ ⊗ Tᵢ Spectral SoS Solution
    2.5.3 Nearly-Linear-Time Recovery via Tensor Unfolding and Spectral SoS
    2.5.4 Fast Recovery in the Semi-Random Model
    2.5.5 Fast Recovery with Symmetric Noise
    2.5.6 Numerical Simulations
  2.6 Lower Bounds
    2.6.1 Polynomials, Vectors, Matrices, and Symmetries, Redux
    2.6.2 Formal Statement of the Lower Bound
    2.6.3 In-depth Preliminaries for Pseudo-Expectation Symmetries
    2.6.4 Construction of Initial Pseudo-Distributions
    2.6.5 Getting to the Unit Sphere
    2.6.6 Repairing Almost-Pseudo-Distributions
    2.6.7 Putting Everything Together
  2.7 Higher-Order Tensors

3 Spectral methods for the random overcomplete model
  3.1 Introduction
    3.1.1 Planted Sparse Vector in Random Linear Subspace
    3.1.2 Overcomplete Tensor Decomposition
    3.1.3 Tensor Principal Component Analysis
    3.1.4 Related Work
  3.2 Techniques
    3.2.1 Planted Sparse Vector in Random Linear Subspace
    3.2.2 Overcomplete Tensor Decomposition
    3.2.3 Tensor Principal Component Analysis
  3.3 Preliminaries
  3.4 Planted Sparse Vector in Random Linear Subspace
    3.4.1 Algorithm Succeeds on Good Basis
  3.5 Overcomplete Tensor Decomposition
    3.5.1 Proof of Theorem 3.5.3
    3.5.2 Discussion of Full Algorithm
  3.6 Tensor principal component analysis
    3.6.1 Spiked tensor model
    3.6.2 Linear-time algorithm

4 Polynomial lifts
  4.1 Introduction
    4.1.1 Results for tensor decomposition
    4.1.2 Applications of tensor decomposition
    4.1.3 Polynomial optimization with few global optima
  4.2 Techniques
    4.2.1 Rounding pseudo-distributions by matrix diagonalization
    4.2.2 Overcomplete fourth-order tensor
    4.2.3 Random overcomplete third-order tensor
  4.3 Preliminaries
    4.3.1 Pseudo-distributions
    4.3.2 Sum of squares proofs
    4.3.3 Matrix constraints and sum-of-squares proofs
  4.4 Rounding pseudo-distributions
    4.4.1 Rounding by matrix diagonalization
    4.4.2 Improving accuracy of a found solution
  4.5 Decomposition with sum-of-squares
    4.5.1 General algorithm for tensor decomposition
    4.5.2 Tensors with orthogonal components
    4.5.3 Tensors with separated components
  4.6 Spectral norms and tensor operations
    4.6.1 Spectral norms and pseudo-distributions
    4.6.2 Spectral norm of random contraction
  4.7 Decomposition of random overcomplete 3-tensors
  4.8 Robust decomposition of overcomplete 4-tensors
    4.8.1 Noiseless case
    4.8.2 Noisy case
    4.8.3 Condition number under smooth analysis
  4.9 Tensor decomposition with general components
    4.9.1 Improved rounding of pseudo-distributions
    4.9.2 Finding all components
  4.10 Fast orthogonal tensor decomposition without sum-of-squares

5 Overcomplete generic decomposition spectrally
  5.1 Introduction
    5.1.1 Our Results
    5.1.2 Related works
  5.2 Overview of algorithm
  5.3 Preliminaries
  5.4 Tools for analysis and implementation
    5.4.1 Robustness and spectral perturbation
    5.4.2 Efficient implementation and runtime analysis
  5.5 Lifting
    5.5.1 Algebraic identifiability argument
    5.5.2 Robustness arguments
  5.6 Rounding
    5.6.1 Recovering candidate whitened and squared components
    5.6.2 Extracting components from the whitened squares
    5.6.3 Testing candidate components
    5.6.4 Putting things together
    5.6.5 Cleaning
  5.7 Combining lift and round for final algorithm
  5.8 Condition number of random tensors
    5.8.1 Notation
    5.8.2 Fourth Moment Identities
    5.8.3 Matrix Product Identities
    5.8.4 Naive Spectral Norm Estimate
    5.8.5 Off-Diagonal Second Moment Estimates
    5.8.6 Matrix Decoupling
    5.8.7 Putting It Together
    5.8.8 Omitted Proofs

Bibliography

A Spiked tensor model
  A.1 Pseudo-Distribution Facts
  A.2 Concentration bounds
    A.2.1 Elementary Random Matrix Review
    A.2.2 Concentration for ∑ᵢ Tᵢ ⊗ Tᵢ
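As a concrete companion to the abstract, the sketch below illustrates the simplest model treated in this dissertation, the spiked tensor model of Chapter 2, and recovers the planted component via the tensor-unfolding approach named in Section 2.5.3. This is a minimal illustration under stated assumptions, not code from the dissertation itself; the dimension n, signal strength tau, and all function names are choices made for the demo.

```python
# Minimal sketch (not from the dissertation) of the spiked tensor model,
# T = tau * v^(⊗3) + A, with recovery of v from the top singular vector of
# the n x n^2 unfolding of T. The values of n and tau are illustrative;
# unfolding is known to succeed once tau is roughly n^(3/4) or larger.
import numpy as np

def spiked_tensor(n, tau, rng):
    """Sample T = tau * (v ⊗ v ⊗ v) + A for a random unit vector v
    and i.i.d. standard Gaussian noise A."""
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    A = rng.standard_normal((n, n, n))
    return tau * np.einsum("i,j,k->ijk", v, v, v) + A, v

def recover_by_unfolding(T):
    """Flatten T into an n x n^2 matrix and return its top left singular
    vector, which correlates with the spike once the signal dominates."""
    n = T.shape[0]
    M = T.reshape(n, n * n)
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, 0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 100
    tau = 20 * n ** 0.75          # comfortably above the n^(3/4) scale
    T, v = spiked_tensor(n, tau, rng)
    u = recover_by_unfolding(T)
    print(f"|<u, v>| = {abs(u @ v):.3f}")  # near 1: spike recovered up to sign
```

The full SVD here stands in for the cheaper power iteration on the unfolded matrix that makes the nearly-linear-time variant fast; both extract the same top singular direction.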