Data Representation from Finite to Infinite-Dimensional Settings: From Covariance Matrices to Covariance Operators


Hà Quang Minh, Pattern Analysis and Computer Vision (PAVIS), Istituto Italiano di Tecnologia, Italy. November 30, 2017.

Acknowledgement. Collaborators: Marco San Biagio, Vittorio Murino.

Part I - Outline: Finite-dimensional setting. Covariance Matrices and Applications
1. Data Representation by Covariance Matrices
2. Geometry of SPD Matrices
3. Machine Learning Methods on Covariance Matrices and Applications in Computer Vision

Part II - Outline: Infinite-dimensional setting. Covariance Operators and Applications
1. Data Representation by Covariance Operators
2. Geometry of Covariance Operators
3. Machine Learning Methods on Covariance Operators and Applications in Computer Vision

Covariance operator representation - Motivation
Covariance matrices encode only the linear correlations of the input features. Nonlinearization proceeds as follows:
1. Map the original input features into a high (generally infinite) dimensional feature space (via kernels)
2. Covariance operators: covariance matrices of the infinite-dimensional features
3. These encode nonlinear correlations of the input features
4. They provide a richer, more expressive representation of the data

Covariance operator representation - References
S.K. Zhou and R. Chellappa. From sample similarity to ensemble similarity: Probabilistic distance measures in reproducing kernel Hilbert space. PAMI, 2006.
M. Harandi, M. Salzmann, and F. Porikli. Bregman divergences for infinite-dimensional covariance matrices. CVPR, 2014.
H.Q. Minh, M. San Biagio, and V. Murino. Log-Hilbert-Schmidt metric between positive definite operators on Hilbert spaces. NIPS, 2014.
H.Q. Minh, M. San Biagio, L. Bazzani, and V. Murino. Approximate Log-Hilbert-Schmidt distances between covariance operators for image classification. CVPR, 2016.

Positive Definite Kernels
Let $X$ be any nonempty set. A map $K : X \times X \to \mathbb{R}$ is a (real-valued) positive definite kernel if it is symmetric and
\[ \sum_{i,j=1}^N a_i a_j K(x_i, x_j) \geq 0 \]
for any finite set of points $\{x_i\}_{i=1}^N \subset X$ and real numbers $\{a_i\}_{i=1}^N \subset \mathbb{R}$. Equivalently, every Gram matrix $[K(x_i, x_j)]_{i,j=1}^N$ is symmetric positive semi-definite.

Reproducing Kernel Hilbert Spaces
Let $K$ be a positive definite kernel on $X \times X$. For each $x \in X$, there is a function $K_x : X \to \mathbb{R}$, with $K_x(t) = K(x, t)$. The RKHS associated with $K$ (unique) is the completion
\[ \mathcal{H}_K = \overline{\Big\{ \sum_{i=1}^N a_i K_{x_i} : N \in \mathbb{N} \Big\}} \]
with inner product
\[ \Big\langle \sum_i a_i K_{x_i}, \sum_j b_j K_{y_j} \Big\rangle_{\mathcal{H}_K} = \sum_{i,j} a_i b_j K(x_i, y_j). \]
Reproducing property: for each $f \in \mathcal{H}_K$ and every $x \in X$,
\[ f(x) = \langle f, K_x \rangle_{\mathcal{H}_K}. \]
The abstract theory is due to Aronszajn (1950), with numerous applications in machine learning (kernel methods).

Examples: RKHS
The Gaussian kernel $K(x, y) = \exp\!\big(-\frac{|x-y|^2}{\sigma^2}\big)$ on $\mathbb{R}^n$ induces the space
\[ \mathcal{H}_K = \Big\{ f : \|f\|^2_{\mathcal{H}_K} = \frac{1}{(2\pi)^n (\sigma\sqrt{\pi})^n} \int_{\mathbb{R}^n} e^{\frac{\sigma^2 |\xi|^2}{4}} |\hat{f}(\xi)|^2 \, d\xi < \infty \Big\}. \]
The Laplacian kernel $K(x, y) = \exp(-a|x-y|)$, $a > 0$, on $\mathbb{R}^n$ induces the space
\[ \mathcal{H}_K = \Big\{ f : \|f\|^2_{\mathcal{H}_K} = \frac{1}{(2\pi)^n \, a \, C(n)} \int_{\mathbb{R}^n} (a^2 + |\xi|^2)^{\frac{n+1}{2}} |\hat{f}(\xi)|^2 \, d\xi < \infty \Big\}, \]
with $C(n) = 2^n \pi^{\frac{n-1}{2}} \Gamma\!\big(\frac{n+1}{2}\big)$.

Feature map and feature space
Geometric viewpoint from machine learning: a positive definite kernel $K$ on $X \times X$ induces the feature map $\Phi : X \to \mathcal{H}_K$,
\[ \Phi(x) = K_x \in \mathcal{H}_K, \qquad \mathcal{H}_K = \text{feature space}, \]
\[ \langle \Phi(x), \Phi(y) \rangle_{\mathcal{H}_K} = \langle K_x, K_y \rangle_{\mathcal{H}_K} = K(x, y). \]
Kernelization: transform linear algorithms depending on $\langle x, y \rangle_{\mathbb{R}^n}$ into nonlinear algorithms depending on $K(x, y)$.
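As a numerical illustration of the positive-definiteness condition above, the following minimal sketch builds the Gram matrix of the Gaussian kernel $K(x,y) = \exp(-|x-y|^2/\sigma^2)$ and checks that it is symmetric positive semi-definite. The helper name gaussian_gram and all parameter choices are illustrative, not from the slides.

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    """Gram matrix G[i, j] = exp(-||x_i - x_j||^2 / sigma^2),
    where the points x_i are the columns of X."""
    sq = np.sum(X**2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    return np.exp(-np.maximum(d2, 0.0) / sigma**2)  # clamp tiny negative d2

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 50))                # 50 points in R^3, one per column
G = gaussian_gram(X, sigma=2.0)
print(np.allclose(G, G.T))                      # True: symmetric
print(np.linalg.eigvalsh(G).min() >= -1e-10)    # True: positive semi-definite
```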
Covariance matrices
Let $\rho$ be a Borel probability distribution on $X \subset \mathbb{R}^n$, with $\int_X \|x\|^2 \, d\rho(x) < \infty$.
Mean vector:
\[ \mu = \int_X x \, d\rho(x). \]
Covariance matrix:
\[ C = \int_X (x - \mu)(x - \mu)^T \, d\rho(x). \]

RKHS covariance operators
Let $\rho$ be a Borel probability distribution on $X$, with
\[ \int_X \|\Phi(x)\|^2_{\mathcal{H}_K} \, d\rho(x) = \int_X K(x, x) \, d\rho(x) < \infty. \]
RKHS mean vector:
\[ \mu_\Phi = \mathbb{E}_\rho[\Phi(x)] = \int_X \Phi(x) \, d\rho(x) \in \mathcal{H}_K. \]
For any $f \in \mathcal{H}_K$,
\[ \langle \mu_\Phi, f \rangle_{\mathcal{H}_K} = \int_X \langle f, \Phi(x) \rangle_{\mathcal{H}_K} \, d\rho(x) = \int_X f(x) \, d\rho(x) = \mathbb{E}_\rho[f]. \]
RKHS covariance operator $C_\Phi : \mathcal{H}_K \to \mathcal{H}_K$:
\[ C_\Phi = \mathbb{E}_\rho[(\Phi(x) - \mu_\Phi) \otimes (\Phi(x) - \mu_\Phi)] = \int_X \Phi(x) \otimes \Phi(x) \, d\rho(x) - \mu_\Phi \otimes \mu_\Phi. \]
For all $f, g \in \mathcal{H}_K$,
\[ \langle f, C_\Phi g \rangle_{\mathcal{H}_K} = \int_X \langle f, \Phi(x) \rangle_{\mathcal{H}_K} \langle g, \Phi(x) \rangle_{\mathcal{H}_K} \, d\rho(x) - \langle \mu_\Phi, f \rangle_{\mathcal{H}_K} \langle \mu_\Phi, g \rangle_{\mathcal{H}_K} = \mathbb{E}_\rho(fg) - \mathbb{E}_\rho(f)\mathbb{E}_\rho(g). \]

Empirical mean and covariance
Let $\mathbf{X} = [x_1, \ldots, x_m]$ be a data matrix of $m$ observations randomly sampled from $X$ according to $\rho$. Informally, $\Phi$ gives an infinite feature matrix in the feature space $\mathcal{H}_K$, of size $\dim(\mathcal{H}_K) \times m$:
\[ \Phi(\mathbf{X}) = [\Phi(x_1), \ldots, \Phi(x_m)]. \]
Formally, $\Phi(\mathbf{X}) : \mathbb{R}^m \to \mathcal{H}_K$ is the bounded linear operator
\[ \Phi(\mathbf{X}) w = \sum_{i=1}^m w_i \Phi(x_i), \quad w \in \mathbb{R}^m. \]
The empirical RKHS mean, estimating the theoretical mean $\mu_\Phi$, is
\[ \mu_{\Phi(\mathbf{X})} = \frac{1}{m} \sum_{i=1}^m \Phi(x_i) = \frac{1}{m} \Phi(\mathbf{X}) \mathbf{1}_m \in \mathcal{H}_K. \]
The empirical covariance operator $C_{\Phi(\mathbf{X})} : \mathcal{H}_K \to \mathcal{H}_K$, estimating $C_\Phi$, is
\[ C_{\Phi(\mathbf{X})} = \frac{1}{m} \sum_{i=1}^m \Phi(x_i) \otimes \Phi(x_i) - \mu_{\Phi(\mathbf{X})} \otimes \mu_{\Phi(\mathbf{X})} = \frac{1}{m} \Phi(\mathbf{X}) J_m \Phi(\mathbf{X})^*, \]
where $J_m = I_m - \frac{1}{m} \mathbf{1}_m \mathbf{1}_m^T$ is the centering matrix.

RKHS covariance operators: spectral properties
$C_\Phi$ is a self-adjoint, positive, trace class operator, with a countable set of eigenvalues $\{\lambda_k\}_{k=1}^\infty$, $\lambda_k \geq 0$, satisfying
\[ \sum_{k=1}^\infty \lambda_k < \infty. \]
$C_{\Phi(\mathbf{X})}$ is a self-adjoint, positive, finite rank operator.
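Although $C_{\Phi(\mathbf{X})}$ acts on the generally infinite-dimensional space $\mathcal{H}_K$, its nonzero eigenvalues can be computed from the $m \times m$ Gram matrix alone: since $J_m$ is symmetric and idempotent, $C_{\Phi(\mathbf{X})} = \frac{1}{m}(\Phi(\mathbf{X})J_m)(\Phi(\mathbf{X})J_m)^*$ shares its nonzero spectrum with $\frac{1}{m} J_m K J_m$, where $K_{ij} = K(x_i, x_j)$. A minimal sketch under these definitions; empirical_cov_spectrum is an illustrative name, and gram stands for any function returning the Gram matrix (e.g. the hypothetical gaussian_gram helper above).

```python
import numpy as np

def empirical_cov_spectrum(X, gram):
    """Nonzero eigenvalues of C_{Phi(X)} = (1/m) Phi(X) J_m Phi(X)*,
    obtained as the eigenvalues of (1/m) J_m K J_m with K the Gram matrix."""
    m = X.shape[1]
    K = gram(X)                               # m x m Gram matrix K(x_i, x_j)
    J = np.eye(m) - np.ones((m, m)) / m       # centering matrix J_m
    lam = np.linalg.eigvalsh(J @ K @ J / m)   # ascending order
    return lam[::-1]                          # descending; rank is at most m - 1

# usage with the Gaussian-kernel Gram matrix from the previous sketch:
# lam = empirical_cov_spectrum(X, lambda X: gaussian_gram(X, sigma=2.0))
# lam.sum() then equals the trace of C_{Phi(X)}, finite as required above
```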
Covariance operator representation of images
Given an image $F$ (or a patch in $F$), at each pixel extract a feature vector (e.g. intensity, colors, filter responses, etc.). Each image then corresponds to a data matrix
\[ \mathbf{X} = [x_1, \ldots, x_m] \in \mathbb{R}^{n \times m}, \]
where $m$ is the number of pixels and $n$ is the number of features at each pixel. Define a kernel $K$, with corresponding feature map $\Phi$ and feature matrix $\Phi(\mathbf{X}) = [\Phi(x_1), \ldots, \Phi(x_m)]$. Each image is represented by the covariance operator
\[ C_{\Phi(\mathbf{X})} = \frac{1}{m} \Phi(\mathbf{X}) J_m \Phi(\mathbf{X})^*. \]
This representation is implicit, since $\Phi$ is generally implicit: computations are carried out via Gram matrices. It can also be approximated via explicit, low-dimensional approximations of $\Phi$.

Geometry of Covariance Operators
1. Hilbert-Schmidt metric: generalizes the Euclidean metric
2. Manifold of positive definite operators: the infinite-dimensional generalization of $\mathrm{Sym}^{++}(n)$, carrying the infinite-dimensional affine-invariant Riemannian metric (generalizing the finite-dimensional affine-invariant Riemannian metric) and the Log-Hilbert-Schmidt metric (generalizing the Log-Euclidean metric)
3. Convex cone of positive definite operators: Log-Determinant divergences, generalizing the finite-dimensional Log-Determinant divergences

Hilbert-Schmidt metric
This generalizes the Frobenius inner product and distance on $\mathrm{Sym}^{++}(n)$:
\[ \langle A, B \rangle_F = \mathrm{tr}(A^T B), \qquad d_E(A, B) = \|A - B\|_F. \]
Let $\mathcal{H}$ be an infinite-dimensional separable real Hilbert space, with a countable orthonormal basis $\{e_k\}_{k=1}^\infty$, and let $A : \mathcal{H} \to \mathcal{H}$ be a bounded linear operator:
\[ \|A\| = \sup_{x \in \mathcal{H}, x \neq 0} \frac{\|Ax\|}{\|x\|} < \infty. \]
The adjoint operator $A^* : \mathcal{H} \to \mathcal{H}$ (the transpose if $\mathcal{H} = \mathbb{R}^n$) is defined by
\[ \langle Ax, y \rangle = \langle x, A^* y \rangle \quad \forall x, y \in \mathcal{H}. \]
$A$ is self-adjoint if $A^* = A$.
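Since all computations on these image representations go through Gram matrices, the squared Hilbert-Schmidt distance between the covariance operators of two images, the operator analogue of the Frobenius distance above, can be expanded as $\|C_X - C_Y\|^2_{\mathrm{HS}} = \mathrm{tr}(C_X^2) - 2\,\mathrm{tr}(C_X C_Y) + \mathrm{tr}(C_Y^2)$ with $\mathrm{tr}(C_X C_Y) = \frac{1}{m^2} \mathrm{tr}(J_m K[\mathbf{X},\mathbf{Y}] J_m K[\mathbf{Y},\mathbf{X}])$, where $K[\mathbf{X},\mathbf{Y}]_{ij} = K(x_i, y_j)$. The sketch below illustrates this, under the simplifying assumption that both images yield the same number of pixels $m$; the function name is hypothetical.

```python
import numpy as np

def hs_distance_sq(Kxx, Kyy, Kxy):
    """Squared Hilbert-Schmidt distance ||C_X - C_Y||_HS^2 between the
    empirical covariance operators of two images, from Gram matrices:
      Kxx[i,j] = K(x_i, x_j), Kyy[i,j] = K(y_i, y_j), Kxy[i,j] = K(x_i, y_j).
    Assumes both images have the same number of pixels m."""
    m = Kxx.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m              # centering matrix J_m
    t = lambda A, B: np.trace(J @ A @ J @ B) / m**2  # tr(C_A C_B) via Grams
    return t(Kxx, Kxx) - 2.0 * t(Kxy, Kxy.T) + t(Kyy, Kyy)
```

Such a Gram-matrix distance can then drive, for instance, a nearest-neighbour classifier over images, which is the role the Hilbert-Schmidt metric (and its Log-Hilbert-Schmidt refinement) plays in the methods cited above.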