Learning Low-Dimensional Models for Heterogeneous Data by David Hong

Total Pages: 16

File Type: pdf, Size: 1020 KB

Learning Low-Dimensional Models for Heterogeneous Data

by David Hong

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Electrical Engineering: Systems) in The University of Michigan, 2019.

Doctoral Committee: Professor Laura Balzano, Co-Chair; Professor Jeffrey A. Fessler, Co-Chair; Professor Rina Foygel Barber, The University of Chicago; Professor Anna Gilbert; Professor Raj Rao Nadakuditi.

David Hong, [email protected], ORCID iD: 0000-0003-4698-4175. © David Hong 2019.

To mom and dad.

ACKNOWLEDGEMENTS

As always, no list of dissertation acknowledgements can hope to be complete, and I have many people to thank for their wonderful influence over the past six years of graduate study. To start, I thank my advisors, Professor Laura Balzano and Professor Jeffrey Fessler, who introduced me to the exciting challenges of learning low-dimensional image models from heterogeneous data and taught me the fundamental ideas and techniques needed to tackle it; this dissertation is the direct result. Laura and Jeff, I am always amazed by your deep and broad technical insights, your sincere care for students and your great patience (with me!). Working with you has changed the way I think, write and teach, and I cannot say how thankful I am to have had the opportunity to learn from you. I hope to one day be as wonderful an advisor to students of my own.

I am also thankful for my excellent committee. Professor Rina Barber has many times provided deep statistical insight into my work, and her wonderful suggestions can be found throughout this dissertation. Professor Anna Gilbert has always challenged me to strive for work that is both deep in theory and significant in practice, and she continues to be a role model for me. Professor Raj Nadakuditi guided me when I first arrived in Ann Arbor, encouraged me during my early difficulties, and taught me many of the random matrix theory tools I used in Chapters III and IV. I am indebted to all three and am honored to have had them form my committee.

I also had the distinct pleasure of spending the summer of 2017 interning at Sandia National Laboratories in Livermore, California, where I worked on the Generalized CP tensor decomposition in Chapter V. I am so thankful for the wonderful mentorship of Dr. Tammy Kolda and Dr. Cliff Anderson-Bergman, who proposed this work and taught me nearly everything I now know about tensor decomposition and its numerous applications. I continue to learn from their many insights into the practice and theory of data analysis. At Sandia, I also had the pleasure of working alongside Dr. Jed Duersch and of sharing a cubicle with John Kallaugher, who both made my time there truly delightful.

I have also been fortunate to interact with many great postdoctoral scholars (all in fact hired by members of my committee!), who made Ann Arbor intellectually fertile grounds: Dr. Sai Ravishankar, Dr. Il Yong Chun, Dr. Greg Ongie and Dr. Lalit Jain. I am thankful for our many fun conversations that have shaped how I think about choosing and tackling research problems. Chapters III and IV of the dissertation also benefitted from discussions with Professor Edgar Dobriban and Professor Romain Couillet, and I thank them for sharing their insights about spiked covariance models and random matrix theory tools for analyzing their asymptotic properties.

My many wonderful peers have also created a great learning environment.
I thank Wooseok Ha (who in fact introduced me to Rina), Audra McMillan, and Yan Shuo Tan for our many stimulating discussions over the years; I am so glad we got to meet at the 2016 Park City Mathematics Institute. Likewise, I am thankful for Mohsen Heidari, Matthew Kvalheim, Aniket Deshmukh, John Lipor and Dejiao Zhang, who began the Ph.D. program with me and have been wonderful co-voyagers. A special thank you goes to John: J.J., you have often been a true encouragement and I have learned so much from you about life and research.

I am also so thankful for all the students of the SPADA Lab and Lab of Jeff (LoJ) who make coming to campus a joy; I will miss our many fun chats. I thank Donghwan Kim, Madison McGaffin, Mai Le and Gopal Nataraj, who welcomed me and my many questions about optimization and imaging, as well as Curtis Jin, Raj Tejas Suryaprakash, Brian Moore, Himanshu Nayar and Arvind Prasadan, who welcomed my random matrix theory questions. I also had the pleasure of working with Robert Malinas while he was an undergraduate on the Sequences of Unions of Subspaces (SUoS) in Chapter VII; I am so excited that you are continuing in research and look forward to all that you will accomplish!

My graduate studies were supported by a Rackham Merit Fellowship, a National Science Foundation (NSF) Graduate Research Fellowship (DGE #1256260) and NSF Grant ECCS-1508943. I am thankful to these sources that afforded me great intellectual freedom and made this dissertation possible.

Hyunkoo and Yujin Chung, Chad Liu and Anna Cunningham, Hal Zhang, Ruiji Jiang and Max Li, thank you for your friendship over the years. Thanks also to the Campredons: you are dear friends (and have fed me many times). Special thanks to David Allor, David Nesbitt and Chris Dinh, who were not only roommates but also true brothers in Christ, and to Pastor Shannon Nielsen and my church, who so faithfully point me to Christ and my ultimate hope in Him.

Finally and most importantly, I thank my family. My older sisters always took excellent care of their little brother, and I have followed in their footsteps all my life (since, you know, I want to be like you both). Pauline and Esther, thank you for being such loving sisters and incredible examples to me. Josh and Martin, it is wonderful to finally have older brothers(-in-law) too! Mirabelle, my super cute niece, thanks for all your awesome smiles! Mom and Dad, you introduced me to an exciting world full of wonder and adventure, taught me to have fun working hard, and lovingly picked me up all those times I fell down. Thank you. This dissertation is dedicated to you.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ALGORITHMS
ABSTRACT

CHAPTER
I. Introduction
    1.1 Low-dimensional models for high-dimensional data
    1.2 Data with heterogeneous quality
    1.3 Data with heterogeneous statistical assumptions
    1.4 Data with heterogeneous linear structure
    1.5 Organization of dissertation
II. Background
    2.1 Notations and Conventions
    2.2 Principal Component Analysis (PCA)
    2.3 Canonical Polyadic Tensor Decomposition
    2.4 Unions of subspaces
    2.5 Low-dimensional models and medical imaging
III. Asymptotic performance of PCA for high-dimensional heteroscedastic data
    3.1 Introduction
    3.2 Main results
    3.3 Impact of parameters
    3.4 Numerical simulation
    3.5 Proof of Theorem 3.4
    3.6 Proof of Theorem 3.9
    3.7 Discussion
    3.8 Supplementary material
IV. Optimally weighted PCA for high-dimensional heteroscedastic data
    4.1 Introduction
    4.2 Model for heteroscedastic data
    4.3 Asymptotic performance of weighted PCA
    4.4 Proof sketch for Theorem 4.3
    4.5 Optimally weighted PCA
    4.6 Suboptimal weighting
    4.7 Impact of model parameters
    4.8 Optimal sampling under budget constraints
    4.9 Numerical simulation
    4.10 Discussion
    4.11 Supplementary material
V. Generalized canonical polyadic tensor decomposition for non-Gaussian data
    5.1 Introduction
    5.2 Background and notation
    5.3 Choice of loss function
    5.4 GCP decomposition
    5.5 Experimental results
    5.6 Discussion
    5.7 Supplementary material
VI. Ensemble K-subspaces for data from unions of subspaces
    6.1 Introduction
    6.2 Problem Formulation & Related Work
    6.3 Ensemble K-subspaces
    6.4 Recovery Guarantees
    6.5 Experimental Results
    6.6 Discussion
    6.7 Proofs of Theoretical Results
    6.8 Implementation Details
VII. Sequences of unions of subspaces for data with heterogeneous complexity
    7.1 Introduction
    7.2 Related Work
    7.3 Learning a Sequence of Unions of Subspaces
Recommended publications
  • Five Lectures on Sparse and Redundant Representations Modelling of Images
    Michael Elad, IAS/Park City Mathematics Series, Volume 19, 2010.

    Preface. The field of sparse and redundant representations has evolved tremendously over the last decade or so. In this short series of lectures, we intend to cover this progress through its various aspects: theoretical claims, numerical problems (and solutions), and lastly, applications, focusing on tasks in image processing. Our story begins with a search for a proper model, and of ways to use it. We are surrounded by huge masses of data, coming from various sources, such as voice signals, still images, radar images, CT images, traffic data and heart signals, just to name a few. Despite their diversity, these sources of data all have something in common: each and every one of these signals has some kind of internal structure, which we wish to exploit for processing it better.

    Consider the simplest problem of removing noise from data. Given a set of noisy samples (e.g., value as a function of time), we wish to recover the true (noise-free) samples with as high accuracy as possible. Suppose that we know the exact statistical characteristics of the noise. Could we perform denoising of the given data? The answer is negative: without knowing anything about the properties of the signal in mind, this is an impossible task. However, if we could assume, for example, that the signal is piecewise-linear, the noise removal task becomes feasible. The same can be said for almost any data processing problem, such as deblurring, super-resolution, inpainting, compression, anomaly detection, detection, recognition, and more: relying on a good model is at the base of any useful solution.
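    A minimal sketch of that point, not from the lectures themselves: assume the piecewise-linear model holds and the breakpoints are known in advance (both illustrative assumptions). Then knowing the noise statistics alone gives no denoiser, but under the signal model, denoising reduces to a least-squares projection onto a small piecewise-linear basis (Python/NumPy):

    import numpy as np

    def denoise_piecewise_linear(y, t, breakpoints):
        # Hinge basis for continuous piecewise-linear signals:
        # 1, t, and (t - b)_+ for each assumed interior breakpoint b.
        B = np.column_stack([np.ones_like(t), t] +
                            [np.maximum(t - b, 0.0) for b in breakpoints])
        coef, *_ = np.linalg.lstsq(B, y, rcond=None)  # least-squares fit
        return B @ coef                               # projection = denoised signal

    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 200)
    clean = np.minimum(2.0 * t, 1.0)                  # piecewise-linear truth
    noisy = clean + 0.1 * rng.standard_normal(t.size)
    denoised = denoise_piecewise_linear(noisy, t, breakpoints=[0.5])
    print(np.linalg.norm(noisy - clean), np.linalg.norm(denoised - clean))

    The projection error comes out far smaller than the raw sampling error, which is exactly the preface's claim: the signal model, not knowledge of the noise, is what makes the task feasible.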
  • Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries Michael Elad and Michal Aharon
    IEEE Transactions on Image Processing, Vol. 15, No. 12, December 2006. Michael Elad and Michal Aharon.

    Abstract—We address the image denoising problem, where zero-mean white and homogeneous Gaussian additive noise is to be removed from a given image. The approach taken is based on sparse and redundant representations over trained dictionaries. Using the K-SVD algorithm, we obtain a dictionary that describes the image content effectively. Two training options are considered: using the corrupted image itself, or training on a corpus of high-quality image database. Since the K-SVD is limited in handling small image patches, we extend its deployment to arbitrary image sizes by defining a global image prior that forces sparsity over patches in every location in the image. We show how such Bayesian treatment leads to a simple and effective denoising algorithm. This leads to a state-of-the-art denoising performance, equivalent and sometimes surpassing recently published leading alternative denoising methods.

    From the introduction: we intend to concentrate on one specific approach towards the image denoising problem that we find to be highly effective and promising: the use of sparse and redundant representations over trained dictionaries. Using redundant representations and sparsity as driving forces for denoising of signals has drawn a lot of research attention in the past decade or so. At first, sparsity of the unitary wavelet coefficients was considered, leading to the celebrated shrinkage algorithm [1]–[9]. One reason to turn to redundant representations was the desire to have the shift invariance property [10]. Also, with the growing realization that regular separable 1-D wavelets are inappropriate for handling images, ...
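    A rough sketch of the patch-based scheme the abstract describes, under heavy assumptions: the dictionary D is taken as given (the paper trains it with K-SVD), the patch size and sparsity level are illustrative stand-ins, and the sparse coding uses plain greedy orthogonal matching pursuit rather than the paper's exact procedure. Every overlapping patch is sparse-coded over D and the overlapping reconstructions are averaged:

    import numpy as np

    def omp(D, y, k):
        # Greedy orthogonal matching pursuit: build a k-sparse code of y over D.
        residual, support = y.copy(), []
        for _ in range(k):
            support.append(int(np.argmax(np.abs(D.T @ residual))))
            coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
            residual = y - D[:, support] @ coef
        x = np.zeros(D.shape[1])
        x[support] = coef
        return x

    def denoise_image(img, D, patch=8, k=4):
        # Sparse-code every overlapping patch, then average the overlaps.
        out = np.zeros_like(img)
        weight = np.zeros_like(img)
        for i in range(img.shape[0] - patch + 1):
            for j in range(img.shape[1] - patch + 1):
                y = img[i:i + patch, j:j + patch].reshape(-1)
                rec = (D @ omp(D, y, k)).reshape(patch, patch)
                out[i:i + patch, j:j + patch] += rec
                weight[i:i + patch, j:j + patch] += 1.0
        return out / weight

    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 128))      # hypothetical 8x8-patch dictionary
    D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
    noisy = rng.standard_normal((32, 32))   # stand-in for a noisy image
    print(denoise_image(noisy, D).shape)

    The averaging step is where the paper's global image prior enters: each pixel is covered by many patches, and forcing every patch to be sparse over D while averaging their reconstructions yields the simple denoising rule the abstract refers to.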
  • From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images
    SIAM Review, Vol. 51, No. 1, pp. 34–81, © 2009 Society for Industrial and Applied Mathematics. Alfred M. Bruckstein, David L. Donoho, Michael Elad.

    Abstract. A full-rank matrix A ∈ R^{n×m} with n < m generates an underdetermined system of linear equations Ax = b having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries. Can it ever be unique? If so, when? As optimization of sparsity is combinatorial in nature, are there efficient methods for finding the sparsest solution? These questions have been answered positively and constructively in recent years, exposing a wide variety of surprising phenomena, in particular the existence of easily verifiable conditions under which optimally sparse solutions can be found by concrete, effective computational methods. Such theoretical results inspire a bold perspective on some important practical problems in signal and image processing. Several well-known signal and image processing problems can be cast as demanding solutions of underdetermined systems of equations. Such problems have previously seemed, to many, intractable, but there is considerable evidence that these problems often have sparse solutions. Hence, advances in finding sparse solutions to underdetermined systems have energized research on such signal and image processing problems, to striking effect. In this paper we review the theoretical results on sparse solutions of linear systems, empirical results on sparse modeling of signals and images, and recent applications in inverse problems and compression in image processing. This work lies at the intersection of signal processing and applied mathematics, and arose initially from the wavelets and harmonic analysis research communities.
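    One of the "concrete, effective computational methods" the abstract alludes to is the l1 relaxation known as basis pursuit: minimizing the number of nonzeros is combinatorial, but minimizing the l1 norm subject to Ax = b is a linear program. A hedged sketch on synthetic data (SciPy's linprog; the splitting x = u - v with u, v >= 0 is the standard LP reformulation, and the problem sizes are illustrative):

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(1)
    n, m, s = 20, 50, 3                     # n < m: underdetermined system
    A = rng.standard_normal((n, m))
    x_true = np.zeros(m)
    x_true[rng.choice(m, size=s, replace=False)] = rng.standard_normal(s)
    b = A @ x_true

    # min 1^T (u + v)  s.t.  A u - A v = b,  u >= 0, v >= 0;  then x = u - v.
    res = linprog(c=np.ones(2 * m),
                  A_eq=np.hstack([A, -A]), b_eq=b,
                  bounds=[(0, None)] * (2 * m))
    x_hat = res.x[:m] - res.x[m:]
    print(np.max(np.abs(x_hat - x_true)))   # near zero when x_true is sparse enough

    With 20 random measurements of a 3-sparse vector in R^50, the l1 minimizer typically coincides with the sparsest solution, illustrating the phenomenon the paper reviews: under verifiable conditions, a tractable convex program recovers the combinatorially defined sparsest solution exactly.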