Superresolution via Student-t Mixture Models

Master Thesis – improved version –

TU Kaiserslautern Department of Mathematics

Johannes Hertrich

supervised by Prof. Dr. Gabriele Steidl

Kaiserslautern, submitted on February 4, 2020

Contents

1. Introduction
2. Preliminaries
   2.1. Definitions and Notations
      2.1.1. Random Variables and Estimators
      2.1.2. Conditional Probabilities, Distribution and Expectation
      2.1.3. Maximum a Posteriori Estimator
      2.1.4. The Kullback-Leibler Divergence
   2.2. The EM Algorithm
   2.3. The PALM and iPALM Algorithms
      2.3.1. Proximal Alternating Linearized Minimization (PALM)
      2.3.2. Inertial Proximal Alternating Linearized Minimization (iPALM)
3. Alternatives of the EM Algorithm for Estimating the Parameters of the Student-t Distribution
   3.1. Likelihood of the Multivariate Student-t Distribution
   3.2. Existence of Critical Points
   3.3. Zeros of F
   3.4. Algorithms
   3.5. Numerical Results
      3.5.1. Comparison of Algorithms
      3.5.2. Unsupervised Estimation of Noise Parameters
4. Superresolution via Student-t Mixture Models
   4.1. Estimating the Parameters
      4.1.1. Initialization
      4.1.2. Simulation Study
   4.2. Superresolution
      4.2.1. Expected Patch Log-Likelihood for Student-t Mixture Models
      4.2.2. Joint Student-t Mixture Models
   4.3. Numerical Results
      4.3.1. Comparison to Gaussian Mixture Models
      4.3.2. FIB-SEM Images
5. Conclusion and Future Work
A. Examples for the EM Algorithm
   A.1. EM Algorithm for Student-t Distributions
   A.2. EM Algorithm for Mixture Models
   A.3. EM Algorithm for Student-t Mixture Models
B. Auxiliary Lemmas
C. Derivatives of the Negative Log-Likelihood Function for Student-t Mixture Models

1. Introduction

Superresolution is the process of reconstructing a high-resolution image from a low-resolution image. There exist several approaches that use Gaussian mixture models for superresolution (see e.g. [35, 45]). In this thesis, we extend this to Student-t mixture models and focus on the estimation of the parameters of Student-t distributions and Student-t mixture models. For this purpose, we first consider numerical algorithms to compute the maximum likelihood estimator of the parameters of a multivariate Student-t distribution and propose three alternatives to the classical Expectation Maximization (EM) algorithm. Then, we extend our considerations to Student-t mixture models and finally, we apply our algorithms to some numerical examples.

The thesis is organized as follows: in Section 2 we review preliminary results. Further, we introduce the EM algorithm, the Proximal Alternating Linearized Minimization (PALM) as well as the inertial Proximal Alternating Linearized Minimization (iPALM) in their general forms and cite the corresponding convergence results.

Then, in Section 3, we consider maximum likelihood estimation of the parameters of a multivariate Student-t distribution. This section (including Appendix B) is already contained in the arXiv preprint [16] and submitted for a journal publication. In Section 3.1, we introduce the Student-t distribution, the negative log-likelihood function L and their derivatives. In Section 3.2, we provide some results concerning the existence of minimizers of L. Section 3.3 deals with the solution of the equation arising when setting the gradient of L with respect to ν to zero. The results of this section will be important for the convergence considerations of our algorithms in Section 3.4, where we propose three alternatives to the classical EM algorithm. For fixed degree of freedom ν the first alternative is known from the literature as the accelerated EM algorithm. It was considered e.g. in [19, 28, 40]. In our case, since we do not fix ν, it cannot be interpreted as an EM algorithm. The other two alternatives differ from this one in the ν step of the iteration. We show that the objective function L decreases in each iteration step and provide a simulation study to compare the performance of these algorithms. Finally, we provide two kinds of numerical results in Section 3.5. First, we compare the different algorithms by numerical examples which indicate that the new ν iterations are very efficient for estimating ν of different magnitudes. Second, we come back to the original motivation of this part and estimate the degree of freedom parameter ν from images corrupted by one-dimensional Student-t noise.

In Section 4, we consider superresolution via Student-t mixture models. Section 4.1 deals with the parameter estimation of Student-t mixture models. We propose three alternatives to the EM algorithm. The first alternative differs from the EM algorithm in the update of the Σ and the ν step. The second and third algorithms are the PALM and iPALM algorithms as proposed in [6] and [34], applied to the negative log-likelihood function L of the Student-t mixture model; they were so far not used in connection with mixture models. We describe some heuristics to initialize the algorithms and to set the parameters in the PALM and iPALM algorithms. Further, we compare the algorithms by a simulation study. In Section 4.2, we adapt two methods for superresolution with Student-t mixture models, which were originally proposed in [35] and [45] for Gaussian mixture models. Finally, in Section 4.3, we compare our methods with Gaussian mixture models and apply them to images generated by Focused Ion Beam and Scanning Electron Microscopy (FIB-SEM).

Acknowledgement

We would like to thank Professor Thomas Pock from TU Graz for the fruitful discussions on the usage of PALM and iPALM in Section 4.1. Further, we thank the group of Dominique Bernard from the ICMCB material science lab at the University of Bordeaux for generating the FIB-SEM images within the ANR-DFG project "SUPREMATIM", which we used in Section 4.3.

2. Preliminaries

2.1. Definitions and Notations

2.1.1. Random Variables and Estimators

Let $(\Omega, \mathcal{A}, P)$ be a probability space and let $(\Omega', \mathcal{A}')$ be a measurable space. We call a measurable mapping $X\colon \Omega \to \Omega'$ a random element. If $\Omega' = \mathbb{R}^d$ and $\mathcal{A}' = \mathcal{B}$, where $\mathcal{B}$ denotes the Borel $\sigma$-algebra, we call $X$ a random vector. We say that $X$ is a random variable if $d = 1$. For a random element $X\colon \Omega \to \Omega'$ we call the probability measure $P_X\colon \mathcal{A}' \to [0, 1]$ defined by
$$P_X(A) = P(X^{-1}(A)), \quad A \in \mathcal{A}',$$
the image measure or distribution of $X$.

Definition 2.1 (Mean, Variance, Standard deviation). Let $X\colon \Omega \to \mathbb{R}$ be a random variable. We define the mean of $X$ by
$$E(X) = E_P(X) = \int_\Omega X(\omega)\, \mathrm{d}P(\omega) = \int_{\mathbb{R}} x\, \mathrm{d}P_X(x).$$

For $1 \le p \le \infty$ we denote by $L^p(\Omega, \mathcal{A}, P)$ the Banach space of (equivalence classes of) random variables with $E(|X|^p) < \infty$. Note that $L^q \subset L^p$ for $1 \le p < q \le \infty$, since $P$ is a finite measure. Further, for $X \in L^2(\Omega, \mathcal{A}, P)$ we denote the variance of $X$ by
$$\operatorname{Var}(X) = E_P\big((X - E_P(X))^2\big)$$
and call $\sqrt{\operatorname{Var}(X)}$ the standard deviation of $X$. For $X, Y \in L^2(\Omega, \mathcal{A}, P)$ we call
$$\operatorname{Cov}(X, Y) = E\big((X - E(X))(Y - E(Y))\big)$$
the covariance of $X$ and $Y$. For a random vector $X = (X_1, \dots, X_d)^{\mathrm{T}}\colon \Omega \to \mathbb{R}^d$ with $X_i \in L^1(\Omega, \mathcal{A}, P)$ for all $i = 1, \dots, d$ we use the notation
$$E(X) = (E(X_1), \dots, E(X_d))^{\mathrm{T}}$$
for the mean. Further, if $X_i \in L^2(\Omega, \mathcal{A}, P)$ for all $i = 1, \dots, d$, we call
$$\operatorname{Cov}(X) = (\operatorname{Cov}(X_i, X_j))_{i,j=1}^d$$
the covariance matrix of $X$.

Definition 2.2 (Probability densities). Let $X\colon \Omega \to \mathbb{R}^d$ be a random vector. If there exists some function $f_X\colon \mathbb{R}^d \to \mathbb{R}_{\ge 0}$ with
$$P(\{\omega \in \Omega : X(\omega) \in A\}) = P_X(A) = \int_A f_X(x)\, \mathrm{d}x, \quad A \in \mathcal{B},$$
then we call $f_X$ the probability density function of $X$.

Now, let $(\Omega, \mathcal{A})$ be a measurable space and let $\Theta \subseteq \mathbb{R}^d$. We call a family of probability measures $(P_\vartheta)_{\vartheta \in \Theta}$ a parametric distribution family. Given some independent identically distributed samples $x_1, \dots, x_n$ of a random vector $X\colon \Omega \to \mathbb{R}^{d_1}$ defined on the probability space $(\Omega, \mathcal{A}, P_\vartheta)$, we want to recover the parameter $\vartheta$ of the underlying measure.

Definition 2.3 (Estimators). A measurable mapping $T\colon \mathbb{R}^{d_1 \times n} \to \Theta$ is called an estimator of $\vartheta$.

A common choice for an estimator is the maximum likelihood (ML) estimator. Assume that $X$ is a random vector with a probability density function or that $X$ is a discrete random vector. Then we define the likelihood function $\mathcal{L}\colon \Theta \to \mathbb{R} \cup \{\infty\}$ by
$$\mathcal{L}(\vartheta \mid x_1, \dots, x_n) = \prod_{i=1}^n p(x_i), \quad \text{where} \quad p(x) = \begin{cases} f_X(x), & \text{if } X \text{ has a density}, \\ P_X(x), & \text{if } X \text{ is a discrete random vector}. \end{cases}$$
Now we define the maximum likelihood estimator by
$$\hat{\vartheta} \in \operatorname*{argmax}_{\vartheta \in \Theta}\, \mathcal{L}(\vartheta \mid x_1, \dots, x_n).$$
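As a concrete illustration of the ML estimator (our own sketch, not part of the thesis), consider a univariate normal family, where the maximizer of $\mathcal{L}$ is known in closed form. The sketch works with the log-likelihood, since maximizing $\mathcal{L}$ and $\log \mathcal{L}$ is equivalent; all names are illustrative.

```python
import numpy as np

# Illustrative sketch: ML estimation for a univariate normal N(mu, sigma^2).
# The likelihood of i.i.d. samples is the product of the densities; we use
# its logarithm for numerical stability.

def log_likelihood(x, mu, sigma):
    """log L(theta | x_1, ..., x_n) for theta = (mu, sigma)."""
    return np.sum(-0.5 * np.log(2.0 * np.pi * sigma ** 2)
                  - (x - mu) ** 2 / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)

# For the normal family the ML estimator is known in closed form:
mu_hat, sigma_hat = x.mean(), x.std()

# Any other parameter choice attains a smaller (log-)likelihood:
assert log_likelihood(x, mu_hat, sigma_hat) >= log_likelihood(x, 1.0, 1.0)
```

For most models considered in this thesis no such closed form exists, which is why iterative schemes such as the EM algorithm of Section 2.2 are needed.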

2.1.2. Conditional Probabilities, Distribution and Expectation

We give a short introduction to conditional expectations, probabilities and distributions based on [20, Chapter 8] and [5, Chapter IV]. Let (Ω, A,P ) be a probability space.

Definition 2.4 (Conditional expectation). Let $X \in L^1(\Omega, \mathcal{A}, P)$ and let $\mathcal{G} \subseteq \mathcal{A}$ be a $\sigma$-algebra. We call a $\mathcal{G}$-measurable random variable $Z\colon \Omega \to \mathbb{R}$ with the property that
$$\int_A X\, \mathrm{d}P = \int_A Z\, \mathrm{d}P \quad \text{for all } A \in \mathcal{G}$$
(a version of) the conditional expectation of $X$ given $\mathcal{G}$, and we denote $Z = E(X \mid \mathcal{G})$. If $X = 1_A$ for some $A \in \mathcal{A}$, then we denote $E(1_A \mid \mathcal{G}) = P(A \mid \mathcal{G})$ and call $P(A \mid \mathcal{G})$ (a version of) the conditional probability of $A$ given $\mathcal{G}$.

Theorem 2.5 (Existence and uniqueness of the conditional expectation). Let $X \in L^1(\Omega, \mathcal{A}, P)$ and let $\mathcal{G} \subseteq \mathcal{A}$ be a $\sigma$-algebra. Then the conditional expectation $E(X \mid \mathcal{G})$ exists and is unique $P|_{\mathcal{G}}$-almost surely.

Proof. See [5, Theorem 15.1].

Theorem 2.6 (Properties of the conditional expectation). Let $X, Y \in L^1(\Omega, \mathcal{A}, P)$ and let $\mathcal{F} \subseteq \mathcal{G} \subseteq \mathcal{A}$ be $\sigma$-algebras. Then the following holds true:

(i) (Linearity) $E(\lambda X + Y \mid \mathcal{G}) = \lambda E(X \mid \mathcal{G}) + E(Y \mid \mathcal{G})$.

(ii) (Monotonicity) If $X \ge Y$ almost surely, then $E(X \mid \mathcal{G}) \ge E(Y \mid \mathcal{G})$ almost surely.

(iii) If $E(|XY|) < \infty$ and $Y$ is $\mathcal{G}$-measurable, then it holds
$$E(XY \mid \mathcal{G}) = Y\, E(X \mid \mathcal{G}) \quad \text{and} \quad E(Y \mid \mathcal{G}) = Y.$$

(iv) (Tower property) $E(E(X \mid \mathcal{G}) \mid \mathcal{F}) = E(E(X \mid \mathcal{F}) \mid \mathcal{G}) = E(X \mid \mathcal{F})$.

(v) (Independence) If $\sigma(X)$ and $\mathcal{G}$ are independent, then $E(X \mid \mathcal{G}) = E(X)$.

(vi) (Dominated convergence) Assume that $Y \in L^1(\Omega, \mathcal{A}, P)$, $Y \ge 0$ almost surely, and that $(X_n)_n$ is a sequence of random variables with $|X_n| \le Y$ for $n \in \mathbb{N}$ such that $X_n \to X$ almost surely as $n \to \infty$. Then it holds
$$\lim_{n \to \infty} E(X_n \mid \mathcal{G}) = E(X \mid \mathcal{G}) \quad \text{almost surely and in } L^1(\Omega, \mathcal{G}, P).$$

Proof. See [20, Theorem 8.14].

Now let $(\Omega', \mathcal{A}')$ be a measurable space and let $Y\colon \Omega \to \Omega'$ be a random element. Then we denote by
$$E(X \mid Y) = E(X \mid \sigma(Y))$$
the conditional expectation of $X$ given $Y$.

Theorem 2.7 (Factorization Lemma). Let $\Omega_1$ be a set and $(\Omega_2, \mathcal{A}_2)$ a measurable space. Further, let $X\colon \Omega_1 \to \Omega_2$ be a mapping. Then for every $Y\colon \Omega_1 \to \mathbb{R}$ the following are equivalent:

(i) $Y$ is $\sigma(X)$-$\mathcal{B}$-measurable, where $\mathcal{B}$ is the Borel $\sigma$-algebra.

(ii) There exists an $\mathcal{A}_2$-$\mathcal{B}$-measurable mapping $g\colon \Omega_2 \to \mathbb{R}$ such that $Y = g \circ X$.

Proof. See [20, Corollary 1.93].

Thus there exists an $\mathcal{A}'$-measurable mapping $g\colon \Omega' \to \mathbb{R}$ such that
$$E(X \mid Y) = g \circ Y.$$
This mapping is unique $P_Y$-almost surely. We define the conditional expectation of $X$ given $Y = y$ by
$$E(X \mid Y = y) = g(y)$$
with $g$ from above. For $A \in \mathcal{A}$ and $X = 1_A$ we define the conditional probability of $A$ given $Y = y$ by
$$P(A \mid Y = y) = E(1_A \mid Y = y).$$

Remark 2.8. Note that we now obtain the following well-known formulas:

(i) Let $A, B \in \mathcal{A}$ with $P(B) > 0$. Then it holds
$$\int_B E(1_A \mid 1_B)\, \mathrm{d}P = \int_B 1_A\, \mathrm{d}P = P(A \cap B).$$
Since $E(1_A \mid 1_B)$ is constant on $B$, we have that for $\omega \in B$ the conditional probability of $A$ given $B$ reads as
$$P(A \mid B) := P(A \mid 1_B = 1) = E(1_A \mid 1_B)(\omega) = \frac{P(A \cap B)}{P(B)}.$$

(ii) If $X\colon \Omega \to \mathbb{R}^{d_1}$ and $Y\colon \Omega \to \mathbb{R}^{d_2}$ are discrete random vectors and $y \in \mathbb{R}^{d_2}$ with $P(Y = y) > 0$, we get directly from (i) that for all $x \in \mathbb{R}^{d_1}$ the conditional distribution of $X$ given $Y = y$ reads as
$$P_{(X \mid Y = y)}(x) := P(\{x\} \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)}.$$

(iii) Let $X\colon \Omega \to \mathbb{R}^{d_1}$ and $Y\colon \Omega \to \mathbb{R}^{d_2}$ be random vectors such that the density functions $f_X$, $f_Y$ and $f_{X,Y}$ exist. Then it holds for all Borel measurable sets $A \subseteq \mathbb{R}^{d_1}$ and $B \subseteq \mathbb{R}^{d_2}$ that
$$\int_B \int_A f_{X,Y}(x, y)\, \mathrm{d}x\, \mathrm{d}y = P(\{X \in A\} \cap \{Y \in B\}) = \int_B P(X \in A \mid Y = y)\, \mathrm{d}P_Y(y) = \int_B P(X \in A \mid Y = y) f_Y(y)\, \mathrm{d}y.$$
Thus it holds
$$\int_B \Big( P(X \in A \mid Y = y) f_Y(y) - \int_A f_{X,Y}(x, y)\, \mathrm{d}x \Big)\, \mathrm{d}y = 0.$$
Since this holds for all Borel measurable $B$, we get almost surely
$$P(X \in A \mid Y = y) f_Y(y) = \int_A f_{X,Y}(x, y)\, \mathrm{d}x.$$
Thus we get for $y \in \mathbb{R}^{d_2}$ with $f_Y(y) > 0$ that for all Borel measurable sets $A$ it holds
$$P(X \in A \mid Y = y) = \int_A \frac{f_{X,Y}(x, y)}{f_Y(y)}\, \mathrm{d}x.$$
Therefore the conditional distribution $P(X \in \cdot \mid Y = y)$ of $X$ given $Y = y$ is a probability measure on $\mathbb{R}^{d_1}$ with density
$$f_{(X \mid Y = y)}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.$$

(iv) Let $X\colon \Omega \to \mathbb{R}^{d_1}$ and $Y\colon \Omega \to \mathbb{R}^{d_2}$ either both be random vectors with densities or both be discrete random vectors. Then we get from (ii) and (iii) directly Bayes' formula for $y \in \mathbb{R}^{d_2}$ with $p_Y(y) > 0$, i.e.
$$p_{(X \mid Y = y)}(x) = \frac{p_{(Y \mid X = x)}(y)\, p_X(x)}{p_Y(y)},$$
where $p_X$ (and $p_Y$ analogously) is defined as
$$p_X(x) = \begin{cases} f_X(x), & \text{if } X \text{ and } Y \text{ have densities}, \\ P(X = x), & \text{if } X \text{ and } Y \text{ are discrete}, \end{cases}$$
and where $p_{(X \mid Y = y)}$ (and $p_{(Y \mid X = x)}$ analogously) is defined as
$$p_{(X \mid Y = y)}(x) = \begin{cases} f_{(X \mid Y = y)}(x), & \text{if } X \text{ and } Y \text{ have densities}, \\ P_{(X \mid Y = y)}(x), & \text{if } X \text{ and } Y \text{ are discrete}. \end{cases}$$

(v) Let $X\colon \Omega \to \mathbb{R}^{d_1}$ and $Y\colon \Omega \to \mathbb{R}^{d_2}$ be random vectors with densities $f_X$ and $f_Y$, and let $h\colon \mathbb{R}^{d_1} \to \mathbb{R}$ be measurable such that $h \circ X \in L^1(\Omega, \mathcal{A}, P)$. Then it holds for every Borel measurable set $A \subseteq \mathbb{R}^{d_2}$ that
$$\int_A E(h(X) \mid Y = y) f_Y(y)\, \mathrm{d}y = \int_A E(h(X) \mid Y = y)\, \mathrm{d}P_Y(y) = \int_{\{Y \in A\}} h(X)\, \mathrm{d}P = \int_A \int_{\mathbb{R}^{d_1}} h(x) f_{X,Y}(x, y)\, \mathrm{d}x\, \mathrm{d}y.$$
Now it follows that
$$\int_A \Big( E(h(X) \mid Y = y) f_Y(y) - \int_{\mathbb{R}^{d_1}} h(x) f_{X,Y}(x, y)\, \mathrm{d}x \Big)\, \mathrm{d}y = 0.$$
Therefore it holds for $P_Y$-almost every $y \in \mathbb{R}^{d_2}$ with $f_Y(y) > 0$ that
$$E(h(X) \mid Y = y) = \int_{\mathbb{R}^{d_1}} h(x) \frac{f_{X,Y}(x, y)}{f_Y(y)}\, \mathrm{d}x = \int_{\mathbb{R}^{d_1}} h(x) f_{(X \mid Y = y)}(x)\, \mathrm{d}x.$$

Theorem 2.9 (Conditional expectation as projection). Let $X \in L^2(\Omega, \mathcal{A}, P)$ and let $\mathcal{G} \subseteq \mathcal{A}$ be a $\sigma$-algebra. Then $E(X \mid \mathcal{G})$ is the orthogonal projection of $X$ onto $L^2(\Omega, \mathcal{G}, P)$. That is, for any $\mathcal{G}$-measurable random variable $Y \in L^2(\Omega, \mathcal{G}, P)$ it holds
$$\int_\Omega (X - Y)^2\, \mathrm{d}P \ge \int_\Omega (X - E(X \mid \mathcal{G}))^2\, \mathrm{d}P.$$

Proof. See [20, Corollary 8.16].

Theorem 2.10 (Optimal prediction). Let $X \in L^2(\Omega, \mathcal{A}, P)$ and let $Y\colon \Omega \to \Omega'$ be a random element. Then it holds for every $\mathcal{A}'$-measurable mapping $\varphi\colon \Omega' \to \mathbb{R}$ that
$$\int_\Omega (X - E(X \mid Y))^2\, \mathrm{d}P \le \int_\Omega (X - \varphi \circ Y)^2\, \mathrm{d}P,$$
with equality if and only if $\varphi = E(X \mid Y = \cdot\,)$ $P_Y$-almost surely.

Proof. Combine Theorem 2.7 and Theorem 2.9.

2.1.3. Maximum a Posteriori Estimator

An alternative to the maximum likelihood estimator is the maximum a posteriori (MAP) estimator. Let $(P_\vartheta)_{\vartheta \in \Theta}$ be a parametric distribution family, where $P_\vartheta$ is given by the density function $p_\vartheta$. For the MAP estimator we assume that we are given a prior distribution $P$ with density function $p$ on $\Theta$. Instead of maximizing the likelihood of the observations $x = (x_1, \dots, x_n)$, we maximize the posterior density on $\Theta$. Using Bayes' formula this reads as
$$p(\vartheta \mid x) \propto p_\vartheta(x)\, p(\vartheta).$$
Now the MAP estimator is defined by
$$\vartheta_{\mathrm{MAP}} \in \operatorname*{argmax}_{\vartheta \in \Theta}\, p(\vartheta \mid x).$$
Using the above considerations we get that this is equivalent to
$$\vartheta_{\mathrm{MAP}} \in \operatorname*{argmin}_{\vartheta \in \Theta}\, \{-\log(p_\vartheta(x)) - \log(p(\vartheta))\}.$$
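As a small illustration of the difference between the ML and the MAP estimator (our own toy example, not from the thesis), consider a normal likelihood with a normal prior on the location parameter: the negative log-posterior is then quadratic and its minimizer has the closed form used below. All names and parameter choices are illustrative.

```python
import numpy as np

# Illustrative sketch: MAP estimation of the mean theta of N(theta, sigma^2)
# under the prior theta ~ N(m, s^2). Minimizing
#     -log(p_theta(x)) - log(p(theta))
# is a quadratic problem with the closed-form minimizer below.

sigma, m, s = 1.0, 0.0, 0.5        # noise std, prior mean, prior std
rng = np.random.default_rng(1)
x = rng.normal(loc=1.0, scale=sigma, size=50)
n = x.size

# Closed-form minimizer of the negative log-posterior:
theta_map = (x.sum() / sigma ** 2 + m / s ** 2) / (n / sigma ** 2 + 1 / s ** 2)

def neg_log_posterior(theta):
    return (np.sum((x - theta) ** 2) / (2.0 * sigma ** 2)
            + (theta - m) ** 2 / (2.0 * s ** 2))

# The MAP estimate beats the ML estimate (the sample mean) on this objective;
# the prior pulls it from the sample mean towards the prior mean m = 0:
assert neg_log_posterior(theta_map) <= neg_log_posterior(x.mean())
assert 0.0 < theta_map < x.mean()
```

For a flat (constant) prior $p$, the penalty $-\log(p(\vartheta))$ is constant and the MAP estimator reduces to the ML estimator.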

12 2.1.4. The Kullback-Leibler Divergence

Let $f\colon \mathbb{R}^d \to \mathbb{R}$ and $g\colon \mathbb{R}^d \to \mathbb{R}$ be two probability density functions where for all $x \in \mathbb{R}^d$ with $g(x) = 0$ it holds that $f(x) = 0$. Then we define the Kullback-Leibler divergence by
$$\operatorname{KL}(f \mid g) = \int_{\mathbb{R}^d} f(x) \log\Big(\frac{f(x)}{g(x)}\Big)\, \mathrm{d}x.$$

Lemma 2.11. The Kullback-Leibler divergence fulfills
$$\operatorname{KL}(f \mid g) \ge 0$$
with equality if and only if $f = g$ almost everywhere.

Proof. Since it holds for $x > 0$ that $\log(x) \le x - 1$, we obtain
$$\int_{\mathbb{R}^d} f(x) \log\Big(\frac{g(x)}{f(x)}\Big)\, \mathrm{d}x \le \int_{\mathbb{R}^d} f(x) \Big(\frac{g(x)}{f(x)} - 1\Big)\, \mathrm{d}x = \int_{\mathbb{R}^d} g(x)\, \mathrm{d}x - \int_{\mathbb{R}^d} f(x)\, \mathrm{d}x = 0,$$
so that $\operatorname{KL}(f \mid g) = -\int_{\mathbb{R}^d} f(x) \log\big(\frac{g(x)}{f(x)}\big)\, \mathrm{d}x \ge 0$. Further, we have equality if and only if it holds almost everywhere that $\log\big(\frac{g(x)}{f(x)}\big) = \frac{g(x)}{f(x)} - 1$, which is equivalent to $\frac{g(x)}{f(x)} = 1$.

Remark 2.12. We can formulate the Kullback-Leibler divergence also for discrete probability measures. Let $P$ and $Q$ be two probability measures on a countable set $\Omega$ such that $Q(\{\omega\}) = 0$ implies $P(\{\omega\}) = 0$ for $\omega \in \Omega$. Then we define the Kullback-Leibler divergence by
$$\operatorname{KL}(P \mid Q) = \sum_{\omega \in \Omega} P(\{\omega\}) \log\Big(\frac{P(\{\omega\})}{Q(\{\omega\})}\Big).$$
We can show analogously to Lemma 2.11 that $\operatorname{KL}(P \mid Q) \ge 0$ with equality if and only if $P(\{\omega\}) = Q(\{\omega\})$ for all $\omega \in \Omega$.
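The discrete version of Remark 2.12 is easy to check numerically. The following small sketch (illustrative only) verifies nonnegativity, the equality case, and the fact that the Kullback-Leibler divergence is not symmetric, which is why it is a divergence rather than a metric.

```python
import numpy as np

# Discrete Kullback-Leibler divergence KL(P | Q) = sum_w P(w) log(P(w)/Q(w)),
# with the convention 0 * log(0/q) = 0; assumes Q(w) = 0 implies P(w) = 0.

def kl(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.3, 0.4, 0.3])

assert kl(p, q) >= 0.0               # Lemma 2.11 / Remark 2.12
assert kl(p, p) == 0.0               # equality if and only if P = Q
assert abs(kl(p, q) - kl(q, p)) > 0  # KL is not symmetric
```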

2.2. The EM Algorithm

In many cases the maximum likelihood estimator cannot be computed analytically. In those cases the Expectation Maximization (EM) algorithm is often used. In the following we give a short introduction to the EM algorithm and summarize its convergence properties. We follow the lines of [22].

The EM algorithm was first introduced in [11]. Let $x = (x_1, \dots, x_n) = X(\omega) \in \mathbb{R}^{d_1 \times n}$ be a given realization of a random vector $X\colon \Omega \to \mathbb{R}^{d_1 \times n}$ with density function $f_X(x \mid \vartheta)$. The rough idea is to introduce an artificial random vector $Z\colon \Omega \to \mathbb{R}^{d_2}$ and a measurable mapping $h\colon \mathbb{R}^{d_2} \to \mathbb{R}^{d_1 \times n}$ such that $X = h(Z)$. Further, we assume that $Z$ has the density function $f_Z(z \mid \vartheta)$ and that we can compute the maximum likelihood estimator with respect to $Z$ for $z \in \mathbb{R}^{d_2}$. This yields the relation
$$f_X(x \mid \vartheta) = \int_{h^{-1}(\{x\})} f_Z(z \mid \vartheta)\, \mathrm{d}z, \quad \text{where } h^{-1}(\{x\}) = \{z \in \mathbb{R}^{d_2} : h(z) = x\}.$$

Now we want to find the parameter $\vartheta$ which maximizes the likelihood function of $z = Z(\omega)$ for the density function $f_Z(z \mid \vartheta)$. Since $z$ is unknown, the EM algorithm iterates the following two steps:

E-Step: For a fixed estimate $\vartheta^{(r)}$ of $\vartheta$, we estimate the log-likelihood function $\log(f_Z(Z \mid \vartheta))$ by the minimum mean square error (MMSE) estimator, which is given by
$$Q(\vartheta, \vartheta^{(r)}) = E_{P_{\vartheta^{(r)}}}\big(\log(f_Z(Z \mid \vartheta)) \,\big|\, X = x\big).$$
This function is usually called the Q-function. It is the MMSE estimator, since Theorem 2.10 ensures that for any other measurable function $\varphi\colon \mathbb{R}^{d_1 \times n} \to \mathbb{R}$ it holds
$$\int_\Omega \big(\log(f_Z(Z \mid \vartheta)) - \varphi \circ X\big)^2\, \mathrm{d}P \ge \int_\Omega \big(\log(f_Z(Z \mid \vartheta)) - E(\log(f_Z(Z \mid \vartheta)) \mid X = \cdot\,) \circ X\big)^2\, \mathrm{d}P.$$

M-Step: In this step we update the estimate for $\vartheta$ by maximizing the Q-function, which is an estimate of the log-likelihood function:
$$\vartheta^{(r+1)} \in \operatorname*{argmax}_{\vartheta \in \Theta}\, \{Q(\vartheta, \vartheta^{(r)})\}.$$

We summarize the EM algorithm in Algorithm 2.1. Note that in many applications of the EM algorithm the artificial random vector $Z$ is of the form $Z = (X, Y)$ for some random vector $Y$. In this case the function $h$ is given by $h(x, y) = x$ and the Q-function reads as
$$Q(\vartheta, \vartheta^{(r)}) = E_{P_{\vartheta^{(r)}}}\big(\log(f_{X,Y}(X, Y \mid \vartheta)) \,\big|\, X = x\big).$$

Algorithm 2.1 EM Algorithm
Input: $x = (x_1, \dots, x_n) \in \mathbb{R}^{d_1 \times n}$, initial estimate $\vartheta^{(0)} \in \Theta$
for $r = 0, 1, \dots$ do
  E-Step: Compute the Q-function
  $$Q(\vartheta, \vartheta^{(r)}) = E_{P_{\vartheta^{(r)}}}\big(\log(f_Z(Z \mid \vartheta)) \,\big|\, X = x\big).$$
  M-Step: Update the estimate for $\vartheta$ by
  $$\vartheta^{(r+1)} \in \operatorname*{argmax}_{\vartheta \in \Theta}\, \{Q(\vartheta, \vartheta^{(r)})\}.$$
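To make Algorithm 2.1 concrete, the following sketch runs EM for a model in which both steps are explicit: an equal-weight mixture of two univariate normal distributions with unit variances and unknown means $\vartheta = (\mu_0, \mu_1)$. Here $Z = (X, Y)$ with $Y$ the hidden component label, so the E-step reduces to computing posterior label probabilities. The setup only anticipates the mixture models derived in Appendix A; all names are ours.

```python
import numpy as np

def normal_pdf(x, mu):
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)

def em_two_means(x, mu, n_iter=100):
    """EM for the equal-weight mixture 0.5*N(mu[0], 1) + 0.5*N(mu[1], 1)."""
    for _ in range(n_iter):
        # E-step: posterior probability that sample x_i belongs to component 1
        p0, p1 = normal_pdf(x, mu[0]), normal_pdf(x, mu[1])
        gamma = p1 / (p0 + p1)
        # M-step: the maximizer of the Q-function consists of weighted means
        mu = np.array([np.sum((1.0 - gamma) * x) / np.sum(1.0 - gamma),
                       np.sum(gamma * x) / np.sum(gamma)])
    return mu

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])
mu = em_two_means(x, mu=np.array([-1.0, 1.0]))
# The estimated means approach the true values (-2, 3).
```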

Convergence Analysis

Now we analyze the convergence properties of the EM algorithm. The convergence proof in the original paper [11] was incorrect. The convergence properties were first proven in [41] based on results of [43]. We present a convergence analysis based on the Kullback-Leibler proximal point algorithm proposed in [8, 9]. We again follow the lines of [22] and stick to the case that $Z = (X, Y)$ and $h(x, y) = x$. Now, we rewrite the EM algorithm as
$$\vartheta^{(r+1)} \in \operatorname*{argmax}_{\vartheta \in \Theta}\, \Big\{ E_{P_{\vartheta^{(r)}}}\big(\log(f_{X,Y}(x, Y \mid \vartheta)) \,\big|\, X = x\big) \Big\} = \operatorname*{argmax}_{\vartheta \in \Theta}\, \Big\{ \log(f_X(x \mid \vartheta)) + E_{P_{\vartheta^{(r)}}}\Big(\log\Big(\frac{f_{X,Y}(x, Y \mid \vartheta)}{f_X(x \mid \vartheta)}\Big) \,\Big|\, X = x\Big) \Big\}.$$

By definition this becomes
$$\vartheta^{(r+1)} \in \operatorname*{argmax}_{\vartheta \in \Theta}\, \Big\{ \log(f_X(x \mid \vartheta)) + E_{P_{\vartheta^{(r)}}}\big(\log\big(f_{(Y \mid X = x)}(Y \mid \vartheta)\big) \,\big|\, X = x\big) \Big\}.$$

By adding some constant (independent of $\vartheta$) in the objective function we get
$$\vartheta^{(r+1)} \in \operatorname*{argmax}_{\vartheta \in \Theta}\, \Big\{ \log(f_X(x \mid \vartheta)) + E_{P_{\vartheta^{(r)}}}\big(\log f_{(Y \mid X = x)}(Y \mid \vartheta) \,\big|\, X = x\big) - E_{P_{\vartheta^{(r)}}}\big(\log f_{(Y \mid X = x)}(Y \mid \vartheta^{(r)}) \,\big|\, X = x\big) \Big\}$$
$$= \operatorname*{argmax}_{\vartheta \in \Theta}\, \Big\{ \log(f_X(x \mid \vartheta)) - E_{P_{\vartheta^{(r)}}}\Big(\log\Big(\frac{f_{(Y \mid X = x)}(Y \mid \vartheta^{(r)})}{f_{(Y \mid X = x)}(Y \mid \vartheta)}\Big) \,\Big|\, X = x\Big) \Big\}$$
$$= \operatorname*{argmax}_{\vartheta \in \Theta}\, \Big\{ \log(f_X(x \mid \vartheta)) - \operatorname{KL}\big(f_{(Y \mid X = x)}(\cdot \mid \vartheta^{(r)}) \,\big|\, f_{(Y \mid X = x)}(\cdot \mid \vartheta)\big) \Big\},$$
with
$$\operatorname{KL}\big(f_{(Y \mid X = x)}(\cdot \mid \vartheta^{(r)}) \,\big|\, f_{(Y \mid X = x)}(\cdot \mid \vartheta)\big) = \int f_{(Y \mid X = x)}(y \mid \vartheta^{(r)}) \log\Big(\frac{f_{(Y \mid X = x)}(y \mid \vartheta^{(r)})}{f_{(Y \mid X = x)}(y \mid \vartheta)}\Big)\, \mathrm{d}y = E_{P_{\vartheta^{(r)}}}\Big(\log\Big(\frac{f_{(Y \mid X = x)}(Y \mid \vartheta^{(r)})}{f_{(Y \mid X = x)}(Y \mid \vartheta)}\Big) \,\Big|\, X = x\Big).$$

This is the Kullback-Leibler divergence between $f_{(Y \mid X = x)}(\cdot \mid \vartheta^{(r)})$ and $f_{(Y \mid X = x)}(\cdot \mid \vartheta)$. By Lemma 2.11 it holds for two such densities $f$ and $g$ that $\operatorname{KL}(f \mid g) \ge 0$, where $\operatorname{KL}(f \mid g) = 0$ if and only if $f = g$ almost everywhere. For simplicity we use the notation
$$D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta) = \operatorname{KL}\big(f_{(Y \mid X = x)}(\cdot \mid \vartheta^{(r)}) \,\big|\, f_{(Y \mid X = x)}(\cdot \mid \vartheta)\big)$$
and denote by
$$L(\vartheta \mid x_1, \dots, x_n) = -\ell(\vartheta \mid x_1, \dots, x_n) = -\log(f_X(x \mid \vartheta))$$
the negative log-likelihood function. Now we can write the EM algorithm as the Kullback-Leibler proximal point algorithm
$$\vartheta^{(r+1)} \in \operatorname*{argmin}_{\vartheta \in \Theta}\, \big\{ D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta) + L(\vartheta \mid x_1, \dots, x_n) \big\}.$$

We summarize this algorithm in Algorithm 2.2.

Algorithm 2.2 EM Algorithm as Kullback-Leibler Proximal Point Algorithm
Input: $x = (x_1, \dots, x_n) \in \mathbb{R}^{d_1 \times n}$, initial estimate $\vartheta^{(0)} \in \Theta$
for $r = 0, 1, \dots$ do
$$\vartheta^{(r+1)} \in \operatorname*{argmin}_{\vartheta \in \Theta}\, \big\{ D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta) + L(\vartheta \mid x_1, \dots, x_n) \big\} = \operatorname*{argmin}_{\vartheta \in \Theta}\, \Big\{ E_{P_{\vartheta^{(r)}}}\Big(\log\Big(\frac{f_{(Y \mid X = x)}(Y \mid \vartheta^{(r)})}{f_{(Y \mid X = x)}(Y \mid \vartheta)}\Big) \,\Big|\, X = x\Big) - \log(f_X(x \mid \vartheta)) \Big\}.$$

Now assume that the minimum in Algorithm 2.2 is attained in each step and let $(\vartheta^{(r)})_{r \in \mathbb{N}}$ be generated by Algorithm 2.2. Then we can prove that the negative log-likelihood function $L(\vartheta^{(r)} \mid x_1, \dots, x_n)$ is monotonically decreasing.

Proposition 2.13. Let $(P_\vartheta)_{\vartheta \in \Theta}$ be a parametric distribution family. If the minimum in Algorithm 2.2 is attained in each step, then the sequence $(L(\vartheta^{(r)} \mid x))_{r \in \mathbb{N}}$ is monotonically decreasing and satisfies
$$L(\vartheta^{(r)} \mid x) - L(\vartheta^{(r+1)} \mid x) \ge D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta^{(r+1)}).$$

Proof. By the definition of $\vartheta^{(r+1)}$ we get
$$D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta^{(r+1)}) + L(\vartheta^{(r+1)} \mid x) \le D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta^{(r)}) + L(\vartheta^{(r)} \mid x).$$
Now the claim follows from the facts that $D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta^{(r)}) = 0$ and $D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta^{(r+1)}) \ge 0$.

To show the convergence of some subsequence of $(\vartheta^{(r)})_r$ we need some stronger assumptions.

Assumption 2.14. Let $(P_\vartheta)_{\vartheta \in \Theta}$ be a parametric distribution family. Further, let $L(\vartheta \mid x)$ be the negative log-likelihood function and let $D_{\operatorname{KL}}(\vartheta, \tilde{\vartheta})$ be the Kullback-Leibler distance. Define the following assumptions:

(A1) $L(\cdot \mid x) \in C^2(\mathbb{R}^d)$ and $D_{\operatorname{KL}} \in C^2(\mathbb{R}^d \times \mathbb{R}^d)$.

(A2) $L(\cdot \mid x)$ is coercive, i.e. $\lim_{\|\vartheta\| \to \infty} L(\vartheta \mid x) = \infty$.

(A3) $D_{\operatorname{KL}}(\tilde{\vartheta}, \vartheta) < \infty$ and $\nabla_\vartheta^2 D_{\operatorname{KL}}(\tilde{\vartheta}, \vartheta)$ is positive definite on every bounded $\vartheta$-set for every $\tilde{\vartheta} \in \mathbb{R}^d$.

(A4) $L(\vartheta \mid x) < \infty$ and $\nabla^2 L(\vartheta \mid x)$ is positive definite on every bounded $\vartheta$-set.

Theorem 2.15 (Convergence of the EM algorithm). Let $x_1, \dots, x_n$ be independent identically distributed samples of a parametric distribution family $(P_\vartheta)_{\vartheta \in \Theta}$. Further, let $L(\vartheta \mid x)$ be the negative log-likelihood function and $D_{\operatorname{KL}}(\vartheta, \tilde{\vartheta})$ the Kullback-Leibler distance. Then the following holds:

(i) If (A1) is fulfilled, the fixed points of Algorithm 2.2 are critical points of the negative log-likelihood function $L$. Further, they are minimizers if additionally (A4) is fulfilled.

(ii) If (A2) is fulfilled, the sequence $(\vartheta^{(r)})_r$ is bounded.

(iii) If (A1), (A2) and (A3) are fulfilled, then $\lim_{r \to \infty} \|\vartheta^{(r+1)} - \vartheta^{(r)}\| = 0$.

(iv) If (A1), (A2) and (A3) are fulfilled, then there exists a subsequence $(\vartheta^{(r_k)})_k$ that converges to a critical point of $L$. If additionally (A4) is fulfilled, the whole sequence $(\vartheta^{(r)})_r$ converges to a minimizer of $L$.

Proof. (i) For any fixed point $\vartheta^*$ of Algorithm 2.2 it holds
$$\vartheta^* \in \operatorname*{argmin}_{\vartheta \in \Theta}\, \{ D_{\operatorname{KL}}(\vartheta^*, \vartheta) + L(\vartheta \mid x) \}.$$
Since $L$ and $D_{\operatorname{KL}}$ are smooth, (A1) yields that
$$\nabla_\vartheta D_{\operatorname{KL}}(\vartheta^*, \vartheta^*) + \nabla L(\vartheta^* \mid x) = 0.$$
Since $(\vartheta^*, \vartheta^*)$ is a global minimum of $D_{\operatorname{KL}}$, this implies that
$$\nabla L(\vartheta^* \mid x) = 0.$$

(ii) Since $(L(\vartheta^{(r)} \mid x))_r$ is monotonically decreasing by Proposition 2.13, the coercivity of $L$ implies that $(\vartheta^{(r)})_r$ is bounded.

(iii) Since $(L(\vartheta^{(r)} \mid x))_r$ is monotonically decreasing and bounded from below, it converges and it holds
$$\lim_{r \to \infty} L(\vartheta^{(r)} \mid x) - L(\vartheta^{(r+1)} \mid x) = 0.$$
Now Proposition 2.13 yields that
$$\lim_{r \to \infty} D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta^{(r+1)}) = 0.$$
Choose $\lambda > 0$ such that every eigenvalue of $\nabla_\vartheta^2 D_{\operatorname{KL}}(\vartheta^{(r)}, \cdot)$ on the convex hull of $\{\vartheta^{(r)} : r \in \mathbb{N}\}$ is greater than or equal to $\lambda$, which is possible since $\nabla^2 D_{\operatorname{KL}}$ is positive definite by (A3) and continuous by (A1). Further, Taylor's theorem yields that there exists some $\eta_r \in \{t\vartheta^{(r+1)} + (1 - t)\vartheta^{(r)} : t \in [0, 1]\}$ such that
$$D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta^{(r+1)}) = D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta^{(r)}) + \nabla_\vartheta D_{\operatorname{KL}}(\vartheta^{(r)}, \vartheta^{(r)})^{\mathrm{T}} (\vartheta^{(r+1)} - \vartheta^{(r)}) + \tfrac{1}{2} (\vartheta^{(r+1)} - \vartheta^{(r)})^{\mathrm{T}} \nabla_\vartheta^2 D_{\operatorname{KL}}(\vartheta^{(r)}, \eta_r) (\vartheta^{(r+1)} - \vartheta^{(r)}) \ge \tfrac{\lambda}{2} \|\vartheta^{(r+1)} - \vartheta^{(r)}\|^2.$$

The claim follows for $r \to \infty$.

(iv) By (ii) we know that $(\vartheta^{(r)})_r$ is bounded. Thus there exists a convergent subsequence $(\vartheta^{(r_k)})_k$. Define $\vartheta^* = \lim_{k \to \infty} \vartheta^{(r_k)}$ and note that the definition of $\vartheta^{(r_k + 1)}$ yields that
$$\nabla_\vartheta D_{\operatorname{KL}}(\vartheta^{(r_k)}, \vartheta^{(r_k + 1)}) + \nabla L(\vartheta^{(r_k + 1)} \mid x) = 0.$$
Due to part (iii) of the proof it holds
$$\lim_{k \to \infty} \vartheta^{(r_k + 1)} = \vartheta^*.$$
Since $\nabla D_{\operatorname{KL}}$ and $\nabla L$ are continuous, this yields that
$$\nabla L(\vartheta^* \mid x) = \nabla_\vartheta D_{\operatorname{KL}}(\vartheta^*, \vartheta^*) + \nabla L(\vartheta^* \mid x) = 0.$$
Thus $\vartheta^*$ is a critical point of $L(\cdot \mid x)$. If (A4) is fulfilled, then $L$ is strictly convex. In this case the unique critical point is the global minimizer and the sequence $(\vartheta^{(r)})_r$ has only one accumulation point. Since it is bounded, this yields that the whole sequence converges to the global minimizer.

In Appendix A we derive the EM algorithm for some examples.

2.3. The PALM and iPALM Algorithms

An alternative to the EM algorithm is to minimize the negative log-likelihood function using other numerical algorithms. Later, in Section 4, we will use the Proximal Alternating Linearized Minimization method (PALM) and its inertial version (iPALM). The PALM and iPALM algorithms were first introduced in [6] and [34], respectively. In this section we introduce the general forms of PALM and iPALM and cite the corresponding convergence results.

Problem setting

Let f, g and H be functions fulfilling the following assumption:

Assumption 2.16. (i) The functions $f\colon \mathbb{R}^{d_1} \to (-\infty, \infty]$ and $g\colon \mathbb{R}^{d_2} \to (-\infty, \infty]$ are proper and lower semicontinuous.

(ii) The function $H\colon \mathbb{R}^{d_1} \times \mathbb{R}^{d_2} \to \mathbb{R}$ is continuously differentiable.

(iii) For any $y \in \mathbb{R}^{d_2}$ the function $\nabla_x H(\cdot, y)$ is globally Lipschitz continuous with Lipschitz constant $L_1(y)$. Similarly, for any $x \in \mathbb{R}^{d_1}$ the function $\nabla_y H(x, \cdot)$ is globally Lipschitz continuous with Lipschitz constant $L_2(x)$.

We consider a function $\Psi\colon \mathbb{R}^{d_1} \times \mathbb{R}^{d_2} \to (-\infty, \infty]$ defined by
$$\Psi(x, y) = f(x) + g(y) + H(x, y). \tag{1}$$

2.3.1. Proximal Alternating Linearized Minimization (PALM)

Now, the authors of [6] proposed Algorithm 2.3 for minimizing (1).

Algorithm 2.3 Proximal Alternating Linearized Minimization (PALM)
Input: starting point $(x^{(0)}, y^{(0)}) \in \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, parameters $\gamma_1, \gamma_2 > 1$.
for $r = 0, 1, \dots$ do
  Set $c_r = \gamma_1 L_1(y^{(r)})$ and
  $$x^{(r+1)} \in \operatorname{prox}_{c_r}^{f}\Big( x^{(r)} - \frac{1}{c_r} \nabla_x H(x^{(r)}, y^{(r)}) \Big).$$
  Set $d_r = \gamma_2 L_2(x^{(r+1)})$ and
  $$y^{(r+1)} \in \operatorname{prox}_{d_r}^{g}\Big( y^{(r)} - \frac{1}{d_r} \nabla_y H(x^{(r+1)}, y^{(r)}) \Big).$$
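To illustrate Algorithm 2.3, the following sketch (our own toy example, not from [6]) applies PALM to $\Psi(x, y) = \|x\|_1 + \|y\|_1 + \frac{1}{2}\|Ax + By - b\|^2$, i.e. $f = g = \|\cdot\|_1$ and $H$ the smooth coupling term. Here $\nabla_x H(\cdot, y)$ is Lipschitz with constant $\|A^{\mathrm{T}}A\|_2$ (independent of $y$), and the proximal mapping of the $\ell_1$-norm is soft-thresholding.

```python
import numpy as np

def soft(v, t):
    """Proximal mapping of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(3)
A, B = rng.normal(size=(20, 10)), rng.normal(size=(20, 10))
b = rng.normal(size=20)

L1 = np.linalg.norm(A.T @ A, 2)   # Lipschitz constant of grad_x H(., y)
L2 = np.linalg.norm(B.T @ B, 2)   # Lipschitz constant of grad_y H(x, .)
gamma = 1.1                       # gamma_1 = gamma_2 > 1

def psi(x, y):
    return (np.abs(x).sum() + np.abs(y).sum()
            + 0.5 * np.sum((A @ x + B @ y - b) ** 2))

x, y = np.zeros(10), np.zeros(10)
values = [psi(x, y)]
for _ in range(200):
    c = gamma * L1                # c_r = gamma_1 * L_1(y^(r))
    x = soft(x - A.T @ (A @ x + B @ y - b) / c, 1.0 / c)
    d = gamma * L2                # d_r = gamma_2 * L_2(x^(r+1))
    y = soft(y - B.T @ (A @ x + B @ y - b) / d, 1.0 / d)
    values.append(psi(x, y))

# Theorem 2.19(i) below: the objective values are nonincreasing.
assert all(v1 <= v0 + 1e-10 for v0, v1 in zip(values, values[1:]))
```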

For the convergence result of the PALM algorithm we need the following additional assumptions:

Assumption 2.17. (i) $\inf_{\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}} \Psi > -\infty$, $\inf_{\mathbb{R}^{d_1}} f > -\infty$ and $\inf_{\mathbb{R}^{d_2}} g > -\infty$.

(ii) There exist $\lambda_1^-, \lambda_2^-, \lambda_1^+, \lambda_2^+ > 0$ such that
$$\inf\{L_1(y^{(k)}) : k \in \mathbb{N}\} \ge \lambda_1^- \quad \text{and} \quad \inf\{L_2(x^{(k)}) : k \in \mathbb{N}\} \ge \lambda_2^-,$$
$$\sup\{L_1(y^{(k)}) : k \in \mathbb{N}\} \le \lambda_1^+ \quad \text{and} \quad \sup\{L_2(x^{(k)}) : k \in \mathbb{N}\} \le \lambda_2^+.$$

(iii) $\nabla H$ is Lipschitz continuous on bounded subsets of $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$.

For $\eta \in (0, \infty]$ we denote by $\Phi_\eta$ the set of all concave continuous functions $\varphi\colon [0, \eta) \to \mathbb{R}_{\ge 0}$ which fulfill the following properties:

(i) $\varphi(0) = 0$.

(ii) $\varphi$ is continuously differentiable on $(0, \eta)$.

(iii) It holds $\varphi'(s) > 0$ for all $s \in (0, \eta)$.

Definition 2.18 (Kurdyka-Łojasiewicz property). Let $\sigma\colon \mathbb{R}^d \to (-\infty, +\infty]$ be proper and lower semicontinuous.

(i) We say that $\sigma$ has the Kurdyka-Łojasiewicz (KL) property at $\bar{u} \in \operatorname{dom} \partial\sigma = \{u \in \mathbb{R}^d : \partial\sigma(u) \ne \emptyset\}$ if there exist $\eta \in (0, \infty]$, a neighborhood $U$ of $\bar{u}$ and a function $\varphi \in \Phi_\eta$ such that for all
$$u \in U \cap \{v \in \mathbb{R}^d : \sigma(\bar{u}) < \sigma(v) < \sigma(\bar{u}) + \eta\}$$
it holds
$$\varphi'(\sigma(u) - \sigma(\bar{u}))\, \operatorname{dist}(0, \partial\sigma(u)) \ge 1.$$

(ii) We say that $\sigma$ is a KL function if it satisfies the KL property at each point $u \in \operatorname{dom} \partial\sigma$.

Now, the following theorem was proven in [6, Lemma 3, Theorem 1].

Theorem 2.19. Suppose that Assumptions 2.16 and 2.17 hold and denote by $(x^{(r)}, y^{(r)})_r$ the sequence generated by PALM. Then the following holds true:

(i) The sequence $(\Psi(x^{(r)}, y^{(r)}))_r$ is nonincreasing and it holds that
$$\frac{\rho}{2} \big\| (x^{(r+1)}, y^{(r+1)}) - (x^{(r)}, y^{(r)}) \big\|^2 \le \Psi(x^{(r)}, y^{(r)}) - \Psi(x^{(r+1)}, y^{(r+1)}),$$
where $\rho = \min\{(\gamma_1 - 1)\lambda_1^-, (\gamma_2 - 1)\lambda_2^-\}$.

(ii) If $\Psi$ is additionally a KL function, then the sequence $(x^{(r)}, y^{(r)})_r$ converges to a critical point $(x^*, y^*)$ of $\Psi$.

2.3.2. Inertial Proximal Alternating Linearized Minimization (iPALM)

A generalization of the PALM algorithm is given by the inertial Proximal Alternating Linearized Minimization algorithm (iPALM). This algorithm was proposed in [34] and reads as Algorithm 2.4. For the convergence result of iPALM we need some further assumptions.

Assumption 2.20. There exists some $\epsilon > 0$ such that the following holds true:

(i) For all $k \in \mathbb{N}$ and $i = 1, 2$ there exist $0 < \bar{\alpha}_i < \frac{1 - \epsilon}{2}$ such that $0 \le \alpha_i^{(k)} \le \bar{\alpha}_i$. Further, it holds $0 \le \beta_i^{(k)} \le \bar{\beta}_i$ for some $\bar{\beta}_i > 0$.

Algorithm 2.4 Inertial Proximal Alternating Linearized Minimization (iPALM)
Input: starting point $(x_1^{(0)}, x_2^{(0)}) \in \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, parameters $\alpha_1^{(r)}, \alpha_2^{(r)}, \beta_1^{(r)}, \beta_2^{(r)} \in [0, 1]$ and $\tau_1^{(r)}, \tau_2^{(r)} > 0$ for $r = 0, 1, \dots$.
for $r = 0, 1, \dots$ do
  Set
  $$y_1^{(r)} = x_1^{(r)} + \alpha_1^{(r)} (x_1^{(r)} - x_1^{(r-1)}), \quad z_1^{(r)} = x_1^{(r)} + \beta_1^{(r)} (x_1^{(r)} - x_1^{(r-1)}),$$
  $$x_1^{(r+1)} \in \operatorname{prox}_{\tau_1^{(r)}}^{f}\Big( y_1^{(r)} - \frac{1}{\tau_1^{(r)}} \nabla_{x_1} H(z_1^{(r)}, x_2^{(r)}) \Big).$$
  Set
  $$y_2^{(r)} = x_2^{(r)} + \alpha_2^{(r)} (x_2^{(r)} - x_2^{(r-1)}), \quad z_2^{(r)} = x_2^{(r)} + \beta_2^{(r)} (x_2^{(r)} - x_2^{(r-1)}),$$
  $$x_2^{(r+1)} \in \operatorname{prox}_{\tau_2^{(r)}}^{g}\Big( y_2^{(r)} - \frac{1}{\tau_2^{(r)}} \nabla_{x_2} H(x_1^{(r+1)}, z_2^{(r)}) \Big).$$

(ii) The parameters $\tau_1^{(k)}$ and $\tau_2^{(k)}$ are given by
$$\tau_1^{(k)} = \frac{(1 + \epsilon)\delta_1 + (1 + \bar{\beta}_1) L_1(x_2^{(k)})}{1 - \alpha_1^{(k)}} \quad \text{and} \quad \tau_2^{(k)} = \frac{(1 + \epsilon)\delta_2 + (1 + \bar{\beta}_2) L_2(x_1^{(k)})}{1 - \alpha_2^{(k)}},$$
where $\lambda_1^+$ and $\lambda_2^+$ are defined in Assumption 2.17 and
$$\delta_1 = \frac{\bar{\alpha}_1 + \bar{\beta}_1}{1 - \epsilon - 2\bar{\alpha}_1} \lambda_1^+ \quad \text{and} \quad \delta_2 = \frac{\bar{\alpha}_2 + \bar{\beta}_2}{1 - \epsilon - 2\bar{\alpha}_2} \lambda_2^+.$$

Now, the following theorem was proven in [34, Theorem 4.1].

Theorem 2.21. Suppose that Assumptions 2.16, 2.17 and 2.20 hold and denote by $(x_1^{(r)}, x_2^{(r)})_r$ the sequence generated by the iPALM algorithm. Further, assume that $\Psi$ is a KL function. Then the sequence $(x_1^{(r)}, x_2^{(r)})_r$ converges to a critical point of $\Psi$.
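For comparison with PALM, the following sketch applies Algorithm 2.4 to the same kind of toy objective, $\Psi(x_1, x_2) = \|x_1\|_1 + \|x_2\|_1 + \frac{1}{2}\|Ax_1 + Bx_2 - b\|^2$. The constant inertial parameters $\alpha = \beta = 0.4 < \frac{1}{2}$ and the conservative step sizes are illustrative choices in the spirit of Assumption 2.20, not the exact parameter rule from [34].

```python
import numpy as np

def soft(v, t):
    """Proximal mapping of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(4)
A, B = rng.normal(size=(20, 10)), rng.normal(size=(20, 10))
b = rng.normal(size=20)
L1 = np.linalg.norm(A.T @ A, 2)
L2 = np.linalg.norm(B.T @ B, 2)

def psi(x1, x2):
    return (np.abs(x1).sum() + np.abs(x2).sum()
            + 0.5 * np.sum((A @ x1 + B @ x2 - b) ** 2))

alpha = beta = 0.4                 # inertial parameters, < 1/2
tau1, tau2 = 4.0 * L1, 4.0 * L2    # conservative step size parameters
x1 = x1_old = np.zeros(10)
x2 = x2_old = np.zeros(10)
start = psi(x1, x2)
for _ in range(300):
    y1 = x1 + alpha * (x1 - x1_old)   # extrapolation for the prox point
    z1 = x1 + beta * (x1 - x1_old)    # extrapolation for the gradient
    x1_old, x1 = x1, soft(y1 - A.T @ (A @ z1 + B @ x2 - b) / tau1, 1.0 / tau1)
    y2 = x2 + alpha * (x2 - x2_old)
    z2 = x2 + beta * (x2 - x2_old)
    x2_old, x2 = x2, soft(y2 - B.T @ (A @ x1 + B @ z2 - b) / tau2, 1.0 / tau2)

# Unlike PALM, iPALM is not monotone, but the iterates still reduce Psi overall.
assert psi(x1, x2) < start
```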

3. Alternatives of the EM Algorithm for Estimating the Parameters of the Student-t Distribution

In this section we consider maximum likelihood estimation for multivariate Student-t distributions. This section (including Appendix B) is already contained in the arXiv preprint [16] and submitted for a journal publication. It is a joint work with Marzieh Hasannasab, Friederike Laus and Gabriele Steidl.

The motivation for this section arises from certain tasks in image processing, where the robustness of methods plays an important role. In this context, the Student-t distribution and the closely related Student-t mixture models became popular in various image processing tasks. In [39] it has been shown that Student-t mixture models are superior to Gaussian mixture models for modeling image patches, and the authors proposed an application in image compression. Image denoising based on Student-t models was addressed in [24] and image deblurring in [12, 42]. Further applications include robust methods [4, 29, 37] as well as robust registration [15, 44]. In one dimension, $d = 1$, and for $\nu = 1$ the Student-t distribution coincides with the one-dimensional Cauchy distribution. One of the first papers which suggested a variational approach for denoising of images corrupted by Cauchy noise was [2]. A variational method consisting of a data term that resembles the noise and a total variation regularization term was proposed in [27, 36]. Based on an ML approach, the authors of [23] introduced a so-called generalized myriad filter which estimates both the location and the scale parameter of the Cauchy distribution. They used the filter in a nonlocal denoising approach, where for each pixel of the image they chose as samples of the distribution those pixels having a similar neighborhood and replaced the initial pixel by its filtered version. We also want to mention that a unified framework for images corrupted by white noise that can handle (range constrained) Cauchy noise as well was suggested in [21]. In contrast to the above pixelwise replacement, the state-of-the-art algorithm of Lebrun et al. [25] for denoising images corrupted by white Gaussian noise restores the image patchwise based on a maximum a posteriori approach.
In the Gaussian setting, their approach is equivalent to minimum mean square error estimation, and more general, the resulting estimator can be seen as a particular instance of a best linear unbiased estimator (BLUE). For denoising images corrupted by additive Cauchy noise, a similar approach was addressed in [24] based on ML estimation for the family of Student-t distributions, of which the Cauchy distribution forms a special case. The authors call this approach generalized multivariate myriad filter. However, all these approaches assume that the degree of freedom parameter ν of the Student-t distribution is known, which might not be the case in practice. In this section we consider the estimation of the degree of freedom parameter based on an ML approach. In contrast to maximum likelihood estimators of the location and/or scatter parameter(s)

µ and Σ, to the best of our knowledge the question of the existence of a joint maximum likelihood estimator has not been analyzed before. In this section we provide first results in this direction. Usually, the negative log-likelihood function of the Student-t distribution and of the corresponding mixture models is minimized using the EM algorithm derived, e.g., in [26, 32]. For fixed ν there exists an accelerated EM algorithm [19, 28, 40] which appears to be more efficient than the classical one for smaller parameters ν. We examine the convergence of the accelerated version if the degree of freedom parameter ν has to be estimated as well. Further, we propose two modifications of the ν iteration step which lead to efficient algorithms for a wide range of parameters ν.

3.1. Likelihood of the Multivariate Student-t Distribution

The density function of the d-dimensional Student-t distribution T_ν(µ, Σ) with ν > 0 degrees of freedom, location parameter µ ∈ R^d and symmetric, positive definite scatter matrix Σ ∈ SPD(d) is given by

p(x | ν, µ, Σ) = Γ((d+ν)/2) / ( Γ(ν/2) ν^{d/2} π^{d/2} |Σ|^{1/2} ( 1 + (1/ν)(x−µ)^T Σ^{−1} (x−µ) )^{(d+ν)/2} ),

with the Gamma function Γ(s) := ∫_0^∞ t^{s−1} e^{−t} dt. The expectation of the Student-t distribution is E(X) = µ for ν > 1 and the covariance matrix is given by Cov(X) = ν/(ν−2) Σ for ν > 2; otherwise these quantities are undefined. The smaller the value of ν, the heavier are the tails of the T_ν(µ, Σ) distribution. For ν → ∞, the Student-t distribution T_ν(µ, Σ) converges to the normal distribution N(µ, Σ), and for ν = 0 it is related to the projected normal distribution on the sphere S^{d−1} ⊂ R^d. Figure 1 illustrates this behavior for the one-dimensional standard Student-t distribution.

As the normal distribution, the d-dimensional Student-t distribution belongs to the class of elliptically symmetric distributions. These distributions are stable under linear transforms in the following sense: Let X ∼ T_ν(µ, Σ), let A ∈ R^{d×d} be an invertible matrix and let b ∈ R^d. Then AX + b ∼ T_ν(Aµ + b, AΣA^T). Furthermore, the Student-t distribution T_ν(µ, Σ) admits the following stochastic representation, which can be used to generate samples from T_ν(µ, Σ) based on samples from the multivariate standard normal distribution N(0, I) and the Gamma distribution Γ(ν/2, ν/2): Let Z ∼ N(0, I) and Y ∼ Γ(ν/2, ν/2) be independent; then

X = µ + Σ^{1/2} Z / √Y ∼ T_ν(µ, Σ).    (2)
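The representation (2) directly yields a sampler. The following is a minimal stdlib-only sketch (the function name is ours); instead of the symmetric square root Σ^{1/2} it uses a Cholesky factor L with LL^T = Σ, which yields the same distribution since Z is rotation invariant.

```python
import math
import random

def sample_student_t(nu, mu, L, rng):
    """One sample from T_nu(mu, Sigma) with Sigma = L L^T via the stochastic
    representation (2): X = mu + Sigma^{1/2} Z / sqrt(Y), with independent
    Z ~ N(0, I) and Y ~ Gamma(nu/2, nu/2) (shape/rate parametrization)."""
    d = len(mu)
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    # random.gammavariate takes (shape, scale); rate nu/2 corresponds to scale 2/nu
    y = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return [mu[i] + sum(L[i][j] * z[j] for j in range(d)) / math.sqrt(y)
            for i in range(d)]

rng = random.Random(0)
nu, mu = 5.0, [1.0, -1.0]
L = [[1.0, 0.0], [0.5, 1.0]]  # Cholesky factor, Sigma = [[1.0, 0.5], [0.5, 1.25]]
samples = [sample_student_t(nu, mu, L, rng) for _ in range(20000)]
mean = [sum(s[i] for s in samples) / len(samples) for i in range(2)]
```

Since ν = 5 > 1, the sample mean should be close to µ.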


Figure 1: Standard Student-t distribution T_ν(0, 1) for different values of ν in comparison with the standard normal distribution N(0, 1).

For i.i.d. samples x_i ∈ R^d, i = 1, ..., n, the likelihood function of the Student-t distribution T_ν(µ, Σ) is given by

L(ν, µ, Σ | x_1, ..., x_n) = Γ((d+ν)/2)^n / ( Γ(ν/2)^n (πν)^{nd/2} |Σ|^{n/2} ) · ∏_{i=1}^n ( 1 + (1/ν)(x_i−µ)^T Σ^{−1} (x_i−µ) )^{−(d+ν)/2},

and the log-likelihood function by

ℓ(ν, µ, Σ | x_1, ..., x_n) = n log Γ((d+ν)/2) − n log Γ(ν/2) − (nd/2) log(πν) − (n/2) log|Σ| − ((d+ν)/2) Σ_{i=1}^n log( 1 + (1/ν)(x_i−µ)^T Σ^{−1}(x_i−µ) ).

In the following, we are interested in the negative log-likelihood function, which, up to the factor 2/n, an additive constant and with weights w_i = 1/n, reads as

L(ν, µ, Σ) = −2 log Γ((d+ν)/2) + 2 log Γ(ν/2) − ν log(ν) + (d+ν) Σ_{i=1}^n w_i log( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ) + log|Σ|.
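As a sanity check, L and ℓ can be implemented directly and compared: up to the factor −2/n and the constant d log π they agree. A minimal sketch for d = 2 with uniform weights (function names are ours):

```python
import math

def neg_log_likelihood(nu, mu, Sigma, xs):
    """L(nu, mu, Sigma) for d = 2 with uniform weights w_i = 1/n."""
    d, n = 2, len(xs)
    det = Sigma[0][0] * Sigma[1][1] - Sigma[0][1] * Sigma[1][0]
    inv = [[Sigma[1][1] / det, -Sigma[0][1] / det],
           [-Sigma[1][0] / det, Sigma[0][0] / det]]
    val = (-2 * math.lgamma((d + nu) / 2) + 2 * math.lgamma(nu / 2)
           - nu * math.log(nu) + math.log(det))
    for x in xs:
        v = [x[0] - mu[0], x[1] - mu[1]]
        delta = sum(v[i] * inv[i][j] * v[j] for i in range(d) for j in range(d))
        val += (d + nu) * math.log(nu + delta) / n
    return val

def log_likelihood(nu, mu, Sigma, xs):
    """Log-likelihood ell(nu, mu, Sigma | x_1, ..., x_n) for d = 2."""
    d, n = 2, len(xs)
    det = Sigma[0][0] * Sigma[1][1] - Sigma[0][1] * Sigma[1][0]
    inv = [[Sigma[1][1] / det, -Sigma[0][1] / det],
           [-Sigma[1][0] / det, Sigma[0][0] / det]]
    val = (n * math.lgamma((d + nu) / 2) - n * math.lgamma(nu / 2)
           - n * d / 2 * math.log(math.pi * nu) - n / 2 * math.log(det))
    for x in xs:
        v = [x[0] - mu[0], x[1] - mu[1]]
        delta = sum(v[i] * inv[i][j] * v[j] for i in range(d) for j in range(d))
        val -= (d + nu) / 2 * math.log(1 + delta / nu)
    return val

xs = [[0.3, -0.2], [1.5, 0.7], [-0.8, 0.1], [0.2, 2.0]]
nu, mu, Sigma = 3.0, [0.2, 0.3], [[1.0, 0.2], [0.2, 1.5]]
lhs = neg_log_likelihood(nu, mu, Sigma, xs)
rhs = -2 / len(xs) * log_likelihood(nu, mu, Sigma, xs) - 2 * math.log(math.pi)
```

Here lhs and rhs coincide, confirming L = −(2/n) ℓ − d log π.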

In this section, we allow for arbitrary weights from the open probability simplex ∆̊_n := { w = (w_1, ..., w_n) ∈ R^n_{>0} : Σ_{i=1}^n w_i = 1 }. In this way, we may express different levels of confidence in single samples or handle the occurrence of multiple samples. Using ∂ log|X| / ∂X = X^{−1} and ∂ (a^T X^{−1} b) / ∂X = −X^{−T} a b^T X^{−T}, see [33], the derivatives of L with

respect to µ, Σ and ν are given by

∂L/∂µ (ν, µ, Σ) = −2(d+ν) Σ_{i=1}^n w_i Σ^{−1}(x_i−µ) / ( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ),

∂L/∂Σ (ν, µ, Σ) = −(d+ν) Σ_{i=1}^n w_i Σ^{−1}(x_i−µ)(x_i−µ)^T Σ^{−1} / ( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ) + Σ^{−1},

∂L/∂ν (ν, µ, Σ) = φ(ν/2) − φ((ν+d)/2) + Σ_{i=1}^n w_i ( (ν+d)/( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ) − log( (ν+d)/( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ) ) − 1 ),

with

φ(x) := ψ(x) − log(x), x > 0,

and the digamma function

ψ(x) = d/dx log Γ(x) = Γ'(x)/Γ(x).

Setting the derivatives to zero results in the equations

0 = Σ_{i=1}^n w_i (x_i−µ) / ( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ),    (3)

I = (d+ν) Σ_{i=1}^n w_i Σ^{−1/2}(x_i−µ)(x_i−µ)^T Σ^{−1/2} / ( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ),    (4)

0 = F(ν/2) := φ(ν/2) − φ((ν+d)/2) + Σ_{i=1}^n w_i ( (ν+d)/( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ) − log( (ν+d)/( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ) ) − 1 ).    (5)

Computing the trace of both sides of (4) and using the linearity and permutation invariance of the trace operator, we obtain

d = tr(I) = (d+ν) Σ_{i=1}^n w_i tr( Σ^{−1/2}(x_i−µ)(x_i−µ)^T Σ^{−1/2} ) / ( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ) = (d+ν) Σ_{i=1}^n w_i (x_i−µ)^T Σ^{−1}(x_i−µ) / ( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ),

which yields

1 = (d+ν) Σ_{i=1}^n w_i / ( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ).

We are interested in critical points of the negative log-likelihood function L, i.e., in solutions (µ, Σ, ν) of (3)-(5), and in particular in minimizers of L.

3.2. Existence of Critical Points

In this section, we examine whether the negative log-likelihood function L has a minimizer. We restrict our attention to the case µ = 0. For an approach how to extend the results to arbitrary µ for fixed ν we refer to [24]. For fixed ν > 0, it is known that there exists a unique solution of (4), and for ν = 0 that there exist solutions of (4) which differ only by a multiplicative positive constant, see, e.g., [24]. In contrast, if we do not fix ν, we roughly have to distinguish between the two cases that the samples tend to come from a Gaussian distribution, i.e. ν → ∞, or not. The results are presented in Theorem 3.2. We make the following general assumption:

Assumption 3.1. Any subset of d samples x_i, i ∈ {1, ..., n}, is linearly independent and max{ w_i : i = 1, ..., n } < 1/d.

For µ = 0, the negative log-likelihood function becomes

L(ν, Σ) := −2 log Γ((d+ν)/2) + 2 log Γ(ν/2) − ν log(ν) + (d+ν) Σ_{i=1}^n w_i log( ν + x_i^T Σ^{−1} x_i ) + log|Σ|
= −2 log Γ((d+ν)/2) + 2 log Γ(ν/2) − ν log(ν) + (d+ν) log(ν) + (d+ν) Σ_{i=1}^n w_i log( 1 + (1/ν) x_i^T Σ^{−1} x_i ) + log|Σ|.

Further, for fixed ν > 0, set

L_ν(Σ) := (d+ν) Σ_{i=1}^n w_i log( ν + x_i^T Σ^{−1} x_i ) + log|Σ|.

To prove the next existence theorem we will need two lemmas, whose proofs are given in Appendix B.

Theorem 3.2. Let x_i ∈ R^d, i = 1, ..., n, and w ∈ ∆̊_n fulfill Assumption 3.1. Then exactly one of the following statements holds:

(i) There exists a minimizing sequence (ν_r, Σ_r)_r of L such that { ν_r : r ∈ N } has a finite cluster point. Then argmin_{(ν,Σ) ∈ R_{>0} × SPD(d)} L(ν, Σ) ≠ ∅ and every (ν̂, Σ̂) ∈ argmin_{(ν,Σ) ∈ R_{>0} × SPD(d)} L(ν, Σ) is a critical point of L.

(ii) For every minimizing sequence (ν_r, Σ_r)_r of L(ν, Σ) we have lim_{r→∞} ν_r = ∞. Then (Σ_r)_r converges to the maximum likelihood estimator Σ̂ = Σ_{i=1}^n w_i x_i x_i^T of the normal distribution N(0, Σ).

Proof. Case 1: Assume that there exists a minimizing sequence (ν_r, Σ_r)_r of L such that (ν_r)_r has a bounded subsequence. In particular, using Lemma B.1, we have that (ν_r)_r has a cluster point ν* > 0 and a subsequence (ν_{r_k})_k converging to ν*. Clearly, the sequence (ν_{r_k}, Σ_{r_k})_k is again a minimizing sequence, so that we skip the second index in the following. By Lemma B.2, the set { Σ_r : r ∈ N } is a compact subset of SPD(d). Therefore there exists a subsequence (Σ_{r_k})_k which converges to some Σ* ∈ SPD(d). Now we have by the continuity of L(ν, Σ) that

L(ν*, Σ*) = lim_{k→∞} L(ν_{r_k}, Σ_{r_k}) = min_{(ν,Σ) ∈ R_{>0} × SPD(d)} L(ν, Σ).

Case 2: Assume that for every minimizing sequence (ν_r, Σ_r)_r it holds that ν_r → ∞ as r → ∞. We rewrite the negative log-likelihood function as

L(ν, Σ) = 2 log( Γ(ν/2) (ν/2)^{d/2} / Γ((d+ν)/2) ) + d log(2) + (d+ν) Σ_{i=1}^n w_i log( 1 + (1/ν) x_i^T Σ^{−1} x_i ) + log|Σ|.

Since

lim_{ν→∞} Γ(ν/2) (ν/2)^{d/2} / Γ((d+ν)/2) = 1,

we obtain

lim_{r→∞} L(ν_r, Σ_r) = d log(2) + lim_{r→∞} ( (d+ν_r) Σ_{i=1}^n w_i log( 1 + (1/ν_r) x_i^T Σ_r^{−1} x_i ) + log|Σ_r| ).    (6)

Next we show by contradiction that { Σ_r : r ∈ N } is contained in SPD(d) and bounded: Denote the eigenvalues of Σ_r by λ_{r,1} ≥ ... ≥ λ_{r,d}. Assume that either { λ_{r,1} : r ∈ N } is unbounded or that { λ_{r,d} : r ∈ N } has zero as a cluster point. Then we know by [24, Theorem 4.3] that there exists a subsequence of (Σ_r)_r, which we again denote by (Σ_r)_r, such that for any fixed ν > 0 it holds

lim_{r→∞} L_ν(Σ_r) = ∞.

Since k ↦ (1 + x/k)^k is monotone increasing, for ν_r ≥ d + 1 we have

(d+ν_r) Σ_{i=1}^n w_i log( 1 + (1/ν_r) x_i^T Σ_r^{−1} x_i )
= Σ_{i=1}^n w_i log( ( 1 + (1/ν_r) x_i^T Σ_r^{−1} x_i )^{ν_r+d} )
≥ Σ_{i=1}^n w_i log( ( 1 + (1/ν_r) x_i^T Σ_r^{−1} x_i )^{ν_r} )
≥ Σ_{i=1}^n w_i log( ( 1 + (1/(d+1)) x_i^T Σ_r^{−1} x_i )^{d+1} )
= (d+1) Σ_{i=1}^n w_i log( 1 + (1/(d+1)) x_i^T Σ_r^{−1} x_i )
≥ (d+1) Σ_{i=1}^n w_i log( 1 + x_i^T Σ_r^{−1} x_i ) − log( (d+1)^{d+1} ).

By (6) this yields

lim_{r→∞} L(ν_r, Σ_r) ≥ d log(2) − log( (d+1)^{d+1} ) + lim_{r→∞} ( (d+1) Σ_{i=1}^n w_i log( 1 + x_i^T Σ_r^{−1} x_i ) + log|Σ_r| )
= d log(2) − log( (d+1)^{d+1} ) + lim_{r→∞} L_1(Σ_r) = ∞.

This contradicts the assumption that (ν_r, Σ_r)_r is a minimizing sequence of L. Hence { Σ_r : r ∈ N } is a bounded subset of SPD(d).

Finally, we show that any subsequence of (Σ_r)_r has a subsequence which converges to Σ̂ = Σ_{i=1}^n w_i x_i x_i^T. Then the whole sequence (Σ_r)_r converges to Σ̂.

Let (Σ_{r_k})_k be a subsequence of (Σ_r)_r. Since it is bounded, it has a convergent subsequence (Σ_{r_{k_l}})_l which converges to some Σ̃ in the closure of { Σ_r : r ∈ N } ⊂ SPD(d). For simplicity, we denote (Σ_{r_{k_l}})_l again by (Σ_r)_r. Since (Σ_r)_r converges, we know that also (x_i^T Σ_r^{−1} x_i)_r converges and is bounded. By lim_{r→∞} ν_r = ∞ we know that the functions x ↦ (1 + x/ν_r)^{ν_r} converge locally uniformly to x ↦ exp(x) as r → ∞. Thus we obtain

lim_{r→∞} (d+ν_r) Σ_{i=1}^n w_i log( 1 + (1/ν_r) x_i^T Σ_r^{−1} x_i )
= lim_{r→∞} Σ_{i=1}^n w_i log( ( 1 + (1/ν_r) x_i^T Σ_r^{−1} x_i )^{d+ν_r} )
= lim_{r→∞} Σ_{i=1}^n w_i log( ( 1 + (1/ν_r) x_i^T Σ_r^{−1} x_i )^{ν_r} ( 1 + (1/ν_r) x_i^T Σ_r^{−1} x_i )^{d} )
= Σ_{i=1}^n w_i log( lim_{r→∞} ( 1 + (1/ν_r) x_i^T Σ_r^{−1} x_i )^{ν_r} )
= Σ_{i=1}^n w_i log( exp( x_i^T Σ̃^{−1} x_i ) ) = Σ_{i=1}^n w_i x_i^T Σ̃^{−1} x_i.

Hence we have

inf_{(ν,Σ) ∈ R_{>0} × SPD(d)} L(ν, Σ) = lim_{r→∞} L(ν_r, Σ_r) = d log(2) + Σ_{i=1}^n w_i x_i^T Σ̃^{−1} x_i + log|Σ̃|.

By taking the derivative with respect to Σ we see that the right-hand side is minimal if and only if Σ̃ = Σ̂ = Σ_{i=1}^n w_i x_i x_i^T. On the other hand, by similar computations as above we get

inf_{(ν,Σ) ∈ R_{>0} × SPD(d)} L(ν, Σ) ≤ lim_{r→∞} L(ν_r, Σ̂)
= d log(2) + log|Σ̂| + lim_{r→∞} (d+ν_r) Σ_{i=1}^n w_i log( 1 + (1/ν_r) x_i^T Σ̂^{−1} x_i )
= d log(2) + log|Σ̂| + Σ_{i=1}^n w_i x_i^T Σ̂^{−1} x_i,

so that Σ̃ = Σ̂. This finishes the proof.

3.3. Zeros of F

In this section, we are interested in the existence of solutions of (5), i.e., in zeros of F for arbitrary fixed µ and Σ. Setting x := ν/2 > 0, t := d/2 and

s_i := (1/2) (x_i−µ)^T Σ^{−1} (x_i−µ), i = 1, ..., n,

we rewrite the function F in (5) as

F(x) = φ(x) − φ(x+t) + Σ_{i=1}^n w_i ( (x+t)/(x+s_i) − log( (x+t)/(x+s_i) ) − 1 )    (7)
     = Σ_{i=1}^n w_i F_{s_i}(x) = Σ_{i=1}^n w_i ( A(x) + B_{s_i}(x) ),

where

F_s(x) := A(x) + B_s(x)    (8)

and

A(x) := φ(x) − φ(x+t), B_s(x) := (x+t)/(x+s) − log( (x+t)/(x+s) ) − 1.

The digamma function ψ and φ = ψ − log(·) are well examined in the literature, see [1]. The value φ(x) is the expectation of the logarithm of a random variable which is Γ(x, x) distributed. It holds −1/x < φ(x) < −1/(2x), and it is well known that −φ is completely monotone. This implies that −A is also completely monotone, i.e., for all x > 0 and m ∈ N_0 we have

(−1)^{m+1} φ^{(m)}(x) > 0, (−1)^{m+1} A^{(m)}(x) > 0,

in particular A < 0, A' > 0 and A'' < 0. Further, it is easy to check that

lim_{x→0} φ(x) = −∞, lim_{x→∞} φ(x) = 0^−,    (9)
lim_{x→0} A(x) = −∞, lim_{x→∞} A(x) = 0^−.    (10)

On the other hand, we have B_s ≡ 0 if s = t, in which case F_s = A < 0 has no zero. If s ≠ t, then B_s is completely monotone, i.e., for all x > 0 and m ∈ N_0,

(−1)^m B_s^{(m)}(x) > 0,

in particular B_s > 0, B_s' < 0 and B_s'' > 0, and

B_s(0) = t/s − log(t/s) − 1 > 0, lim_{x→∞} B_s(x) = 0^+.

Hence we have

lim_{x→0} F_s(x) = −∞, lim_{x→∞} F_s(x) = 0.    (11)

If X ∼ N(µ, Σ) is a d-dimensional random vector, then Y := (X−µ)^T Σ^{−1} (X−µ) ∼ χ²_d with E(Y) = d and Var(Y) = 2d. Thus we would expect that for samples x_i from such a random variable X the corresponding values (x_i−µ)^T Σ^{−1}(x_i−µ) lie with high probability in the interval [d−√(2d), d+√(2d)], respectively s_i ∈ [t−√t, t+√t]. These considerations are reflected in the following theorem and corollary.
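The χ²_d concentration behind this heuristic is easy to check numerically. The following stdlib-only Monte Carlo sketch (names are ours) estimates the mean, the variance, and the probability mass of the one-standard-deviation interval [d−√(2d), d+√(2d)] for d = 8:

```python
import math
import random

rng = random.Random(1)
d, n = 8, 20000
# Y = ||X||^2 for X ~ N(0, I_d) is chi-squared distributed with d degrees of freedom
ys = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d)) for _ in range(n)]
mean = sum(ys) / n
var = sum((y - mean) ** 2 for y in ys) / n
lo, hi = d - math.sqrt(2 * d), d + math.sqrt(2 * d)
frac_inside = sum(lo <= y <= hi for y in ys) / n
```

Here mean ≈ d = 8, var ≈ 2d = 16, and roughly 70 percent of the samples fall into [d−√(2d), d+√(2d)], so "with high probability" should be read as a one-sigma statement.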

Theorem 3.3. For F_s : R_{>0} → R given by (8) the following relations hold true:

i) If s ∈ [t−√t, t+√t] ∩ R_{>0}, then F_s(x) < 0 for all x > 0, so that F_s has no zero.

ii) If s > 0 and s ∉ [t−√t, t+√t], then there exists x_+ such that F_s(x) > 0 for all x ≥ x_+. In particular, F_s has a zero.

Proof. We have

F_s'(x) = φ'(x) − φ'(x+t) − (s−t)² / ( (x+s)²(x+t) )
        = ψ'(x) − ψ'(x+t) − t / ( x(x+t) ) − (s−t)² / ( (x+s)²(x+t) ).

We want to sandwich F_s' between two rational functions P_s and P_s + Q whose zeros can easily be described. Since the trigamma function ψ' has the series representation

ψ'(x) = Σ_{k=0}^∞ 1/(x+k)²,

see [1], we obtain

F_s'(x) = Σ_{k=0}^∞ ( 1/(x+k)² − 1/(x+k+t)² ) − t/( x(x+t) ) − (s−t)²/( (x+s)²(x+t) ).    (12)

For x > 0, we have

I(x) = ∫_0^∞ ( 1/(x+u)² − 1/(x+u+t)² ) du = 1/x − 1/(x+t) = t/( x(x+t) ),

where we denote the integrand by g(u). Let R(x) and T(x) denote the rectangular and trapezoidal rule, respectively, for computing this integral with step size 1. Then we verify

R(x) = Σ_{k=0}^∞ g(k) = Σ_{k=0}^∞ ( 1/(x+k)² − 1/(x+k+t)² ),

so that

F_s'(x) = ( R(x) − T(x) ) + ( T(x) − I(x) ) − (s−t)²/( (x+s)²(x+t) )
        = (1/2) ( 1/x² − 1/(x+t)² ) + ( T(x) − I(x) ) − (s−t)²/( (x+s)²(x+t) ).

By considering the first and second derivatives of g we see that the integrand in I(x) is strictly decreasing and strictly convex. Thus T(x) > I(x), and hence P_s(x) < F_s'(x), where

P_s(x) := (1/2) ( 1/x² − 1/(x+t)² ) − (s−t)²/( (x+s)²(x+t) ) = p_s(x) / ( 2x²(x+s)²(x+t)² )

with p_s(x) := a_3 x³ + a_2 x² + a_1 x + a_0 and

a_0 = t²s² > 0, a_1 = 2st(s+t) > 0, a_2 = t( 4s + t − (s−t)² ), a_3 = 2( t − (s−t)² ).

We have

a_3 ≥ 0 ⟺ s ∈ [t−√t, t+√t]    (13)

and

a_2 ≥ 0 ⟺ s ∈ [t+2−√(4+5t), t+2+√(4+5t)] ⊃ [t−√t, t+√t]

for t ≥ 1. For t = 1/2, it holds [t+2−√(4+5t), t+2+√(4+5t)] ⊃ [0, t+√t]. Thus, for s ∈ [t−√t, t+√t], by Descartes' rule of signs, p_s(x) has no positive zero, which implies

0 ≤ P_s(x) < F_s'(x) for s ∈ [t−√t, t+√t] ∩ R_{>0}.

Hence the continuous function F_s is monotone increasing, and by (11) we obtain F_s(x) < 0 for all x > 0 if s ∈ [t−√t, t+√t] ∩ R_{>0}.

Let s > 0 and s ∉ [t−√t, t+√t]. By

T(x) − I(x) = Σ_{k=0}^∞ ( (1/2)( g(k+1) + g(k) ) − ∫_0^1 g(k+u) du )

and Euler's summation formula, we obtain

T(x) − I(x) = Σ_{k=0}^∞ ( (1/12)( g'(k+1) − g'(k) ) − (1/720) g^{(4)}(ξ_k) ), ξ_k ∈ (k, k+1),

with g'(u) = −2/(x+u)³ + 2/(x+u+t)³ and g^{(4)}(u) = 5!/(x+u)⁶ − 5!/(x+u+t)⁶, so that

T(x) − I(x) = −(1/12) g'(0) + (1/6) Σ_{k=0}^∞ ( 1/(x+ξ_k+t)⁶ − 1/(x+ξ_k)⁶ )    (14)
            < −(1/12) g'(0) = (1/6) ( 3tx² + 3t²x + t³ ) / ( x³(x+t)³ ).

Therefore, we conclude F_s'(x) < P_s(x) + Q(x), where Q(x) := (1/6)( 3tx² + 3t²x + t³ )/( x³(x+t)³ ) and

P_s(x) + Q(x) = ( p_s(x) x(x+t) + ( tx² + t²x + (1/3)t³ )(x+s)² ) / ( 2x³(x+s)²(x+t)³ ).

The leading coefficient (of x⁵) of the polynomial in the numerator is 2( t − (s−t)² ), which fulfills (13). Therefore, if s ∉ [t−√t, t+√t], then there exists x_+ large enough such that the numerator is negative for all x ≥ x_+. Consequently, F_s'(x) ≤ P_s(x) + Q(x) < 0 for all x ≥ x_+. Thus F_s is decreasing on [x_+, ∞). By (11),

we conclude that F_s(x) > 0 for all x ≥ x_+ and that F_s has a zero.

The following corollary states that F_s has exactly one zero if s > t + √t. Unfortunately, we do not have such a result for s < t − √t.

Corollary 3.4. Let F_s : R_{>0} → R be given by (8). If s > t + √t and t ≥ 1, then F_s has exactly one zero.

Proof. By Theorem 3.3 ii) and since lim_{x→0} F_s(x) = −∞ and lim_{x→∞} F_s(x) = 0^+, it remains to prove that F_s has at most one zero. Let x_0 > 0 be the smallest number such that F_s'(x_0) = 0. We prove that F_s'(x) < 0 for all x > x_0. To this end, we show that h_s(x) := F_s'(x) (x+s)² (x+t) is strictly decreasing. By (12) we have

h_s(x) = (x+s)²(x+t) ( Σ_{k=0}^∞ ( 1/(x+k)² − 1/(x+k+t)² ) − t/( x(x+t) ) ) − (s−t)²,

and for s > t further

h_s'(x) = ( 2(x+s)(x+t) + (x+s)² ) ( Σ_{k=0}^∞ ( 1/(x+k)² − 1/(x+k+t)² ) − t/( x(x+t) ) )
        + (x+s)²(x+t) ( Σ_{k=0}^∞ ( −2/(x+k)³ + 2/(x+k+t)³ ) + t(2x+t)/( x²(x+t)² ) )
        ≤ 3(x+s)² ( Σ_{k=0}^∞ ( 1/(x+k)² − 1/(x+k+t)² ) − t/( x(x+t) ) )
        + (x+s)²(x+t) ( Σ_{k=0}^∞ ( −2/(x+k)³ + 2/(x+k+t)³ ) + t(2x+t)/( x²(x+t)² ) )
        = (x+s)² ( R(x) − I(x) ),

where I(x) is the integral and R(x) the corresponding rectangular rule with step size 1 of the function g := g_1 + g_2 defined by

g_1(u) := 3 ( 1/(x+u)² − 1/(x+t+u)² ), g_2(u) := (x+t) ( −2/(x+u)³ + 2/(x+t+u)³ ).

We show that R(x) − I(x) < 0 for all x > 0. Let T(x) and T_i(x) be the trapezoidal rules with step size 1 corresponding to I(x) and I_i(x) = ∫_0^∞ g_i(u) du, i = 1, 2. Then it follows

R(x) − I(x) = R(x) − T(x) + T(x) − I(x) = R(x) − T(x) + T_1(x) − I_1(x) + T_2(x) − I_2(x).

Since g_2 is a decreasing, concave function, we conclude T_2(x) − I_2(x) < 0. Using Euler's summation formula as in (14) for g_1, we get

T_1(x) − I_1(x) = −(1/12) g_1'(0) − (1/720) Σ_{k=0}^∞ g_1^{(4)}(ξ_k), ξ_k ∈ (k, k+1).

Since g_1^{(4)} is a positive function, we can write

R(x) − I(x) < R(x) − T(x) + T_1(x) − I_1(x) ≤ (1/2) g(0) − (1/12) g_1'(0)
= (3/2) ( 1/x² − 1/(x+t)² ) + (1/2)(x+t) ( −2/x³ + 2/(x+t)³ ) + (1/2) ( 1/x³ − 1/(x+t)³ )
= (t/2) ( (−3t+3)x² + (−5t²+3t)x − 2t³ + t² ) / ( x³(x+t)³ ).

All coefficients of this polynomial in x are less than or equal to zero for t ≥ 1, which implies that h_s is strictly decreasing. Hence h_s(x) < h_s(x_0) = 0, i.e., F_s'(x) < 0, for all x > x_0, which finishes the proof.

Theorem 3.3 implies the following corollary.

Corollary 3.5. For F : R_{>0} → R given by (7) and δ_i := (x_i−µ)^T Σ^{−1} (x_i−µ), i = 1, ..., n, the following relations hold true:

i) If δ_i ∈ [d−√(2d), d+√(2d)] ∩ R_{>0} for all i ∈ {1, ..., n}, then F(x) < 0 for all x > 0, so that F has no zero.

ii) If δ_i > 0 and δ_i ∉ [d−√(2d), d+√(2d)] for all i ∈ {1, ..., n}, then there exists x_+ such that F(x) > 0 for all x ≥ x_+. In particular, F has a zero.

Proof. Consider F = Σ_{i=1}^n w_i F_{s_i} with s_i = δ_i/2. If δ_i ∈ [d−√(2d), d+√(2d)] ∩ R_{>0} for all i ∈ {1, ..., n}, then s_i ∈ [t−√t, t+√t] ∩ R_{>0} and we have by Theorem 3.3 that F_{s_i}(x) < 0 for all x > 0. Clearly, the same holds true for the whole function F, so that it cannot have a zero.

If δ_i ∉ [d−√(2d), d+√(2d)] for all i ∈ {1, ..., n}, then we know by Theorem 3.3 that there exist x_{i,+} > 0 such that F_{s_i}(x) > 0 for x ≥ x_{i,+}. Thus, F(x) > 0 for x ≥ x_+ := max_i x_{i,+}. Since lim_{x→0} F(x) = −∞, this implies that F has a zero.
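The sign behavior in Corollary 3.5 is easy to verify numerically. The following sketch evaluates F on a grid for d = 1 (so the interval is [1−√2, 1+√2] ∩ R_{>0}), once with all δ_i inside and once with all δ_i outside; the helper names are ours, and the digamma function is approximated by a standard recurrence-plus-asymptotic-series formula:

```python
import math

def digamma(x):
    """psi(x) via the recurrence psi(x) = psi(x + 1) - 1/x and an asymptotic series."""
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (r + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))

def phi(x):
    return digamma(x) - math.log(x)

def F(x, deltas, t):
    """F from (7) in the variable x = nu/2, with t = d/2, s_i = delta_i/2
    and uniform weights w_i = 1/n."""
    n = len(deltas)
    val = phi(x) - phi(x + t)
    for delta in deltas:
        q = (x + t) / (x + delta / 2.0)
        val += (q - math.log(q) - 1.0) / n
    return val

d = 1
t = d / 2.0
grid = [0.05 * k for k in range(1, 400)]  # x in (0, 20]
no_zero_case = all(F(x, [1.0, 1.2, 0.8], t) < 0 for x in grid)    # deltas inside [1 - sqrt(2), 1 + sqrt(2)]
zero_case = any(F(x, [10.0, 12.0, 9.0], t) > 0 for x in grid)     # deltas outside the interval
```

In the first case F stays negative on the whole grid (case i)), in the second case F becomes positive for moderate x (case ii)), so a zero exists.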

3.4. Algorithms

In this section, we propose alternatives to the classical EM algorithm for computing the parameters of the Student-t distribution, along with convergence results. In particular, we are interested in estimating the degree of freedom parameter ν, where the function F is of particular interest.

Algorithm 3.1 with weights w_i = 1/n, i = 1, ..., n, is the classical EM algorithm. Note that the function in the third M-step,

Φ_r(ν) := φ(ν/2) − φ( (ν_r+d)/2 ) + Σ_{i=1}^n w_i ( γ_{i,r} − log(γ_{i,r}) − 1 ) = φ(ν/2) + c_r,

has a unique zero, since by (9) the function φ < 0 is monotone increasing with lim_{x→∞} φ(x) = 0^− and c_r > 0. Concerning the convergence of the EM algorithm, it is known that the values of the objective function L(ν_r, µ_r, Σ_r) are monotonically decreasing in r and that a subsequence of the iterates converges to a critical point of L(ν, µ, Σ) if such a point exists, see [7].

Algorithm 3.1 EM Algorithm (EM)
Input: x_1, ..., x_n ∈ R^d, n ≥ d+1, w ∈ ∆̊_n
Initialization: ν_0 = ε > 0, µ_0 = (1/n) Σ_{i=1}^n x_i, Σ_0 = (1/n) Σ_{i=1}^n (x_i−µ_0)(x_i−µ_0)^T
for r = 0, 1, ... do

E-Step: Compute the weights

δ_{i,r} = (x_i−µ_r)^T Σ_r^{−1} (x_i−µ_r),
γ_{i,r} = (ν_r+d) / (ν_r+δ_{i,r}).

M-Step: Update the parameters

µ_{r+1} = ( Σ_{i=1}^n w_i γ_{i,r} x_i ) / ( Σ_{i=1}^n w_i γ_{i,r} ),
Σ_{r+1} = Σ_{i=1}^n w_i γ_{i,r} (x_i−µ_{r+1})(x_i−µ_{r+1})^T,
ν_{r+1} = zero of φ(ν/2) − φ( (ν_r+d)/2 ) + Σ_{i=1}^n w_i ( γ_{i,r} − log(γ_{i,r}) − 1 ).

Algorithm 3.2 differs from the EM algorithm in the iteration of Σ, where the factor 1/( Σ_{i=1}^n w_i γ_{i,r} ) is now incorporated. The computation of this factor requires no additional computational effort, but speeds up the performance, in particular for smaller ν. This kind of acceleration was suggested in [19, 28]. For fixed ν ≥ 1, it was shown in [40] that this algorithm is indeed an EM algorithm arising from another choice of the hidden variable than the one used in the standard approach, see also [22]. Thus, it follows for fixed ν ≥ 1 that the sequence L(ν, µ_r, Σ_r) is monotonically decreasing. However, we also iterate over ν. In contrast to the EM Algorithm 3.1, our ν iteration step depends on µ_{r+1} and Σ_{r+1} instead of µ_r and Σ_r. This is important for our convergence results. Note that in both cases the accelerated algorithm can no longer be interpreted as an EM algorithm, so that the convergence results of the classical EM approach are no longer available.

Let us mention that a Jacobi variant of Algorithm 3.2 for fixed ν, i.e.,

Σ_{r+1} = ( Σ_{i=1}^n w_i γ_{i,r} (x_i−µ_r)(x_i−µ_r)^T ) / ( Σ_{i=1}^n w_i γ_{i,r} ),

with µ_r instead of µ_{r+1}, including a convergence proof, was suggested in [24]. The main reason for this index choice was that we were able to prove monotone convergence of a simplified version of the algorithm for estimating the location and scale of Cauchy noise (d = 1, ν = 1), which could not be achieved with the variant incorporating µ_{r+1}, see [23]. This simplified version is known as the myriad filter in image processing. In this thesis, we keep the variant (15) from the EM algorithm, since we are mainly interested in the computation of ν. With the next two algorithms we suggest to take the critical point equation (5) more directly into account.

Algorithm 3.2 Accelerated EM-like Algorithm (aEM)
Same as Algorithm 3.1 except for

Σ_{r+1} = ( Σ_{i=1}^n w_i γ_{i,r} (x_i−µ_{r+1})(x_i−µ_{r+1})^T ) / ( Σ_{i=1}^n w_i γ_{i,r} ),    (15)
ν_{r+1} = zero of φ(ν/2) − φ( (ν_r+d)/2 ) + Σ_{i=1}^n w_i ( (ν_r+d)/(ν_r+δ_{i,r+1}) − log( (ν_r+d)/(ν_r+δ_{i,r+1}) ) − 1 ).

Algorithm 3.3 computes a zero of

Ψ_r(ν) := φ(ν/2) − φ( (ν+d)/2 ) + Σ_{i=1}^n w_i ( (ν_r+d)/(ν_r+δ_{i,r+1}) − log( (ν_r+d)/(ν_r+δ_{i,r+1}) ) − 1 ) = A(ν/2) + b_r.

This function has a unique zero, since by (10) the function A(x) = φ(x) − φ(x+t) < 0 is monotone increasing with lim_{x→∞} A(x) = 0^− and b_r > 0.

Algorithm 3.3 Multivariate Myriad Filter (MMF)
Same as Algorithm 3.2 except for

ν_{r+1} = zero of φ(ν/2) − φ( (ν+d)/2 ) + Σ_{i=1}^n w_i ( (ν_r+d)/(ν_r+δ_{i,r+1}) − log( (ν_r+d)/(ν_r+δ_{i,r+1}) ) − 1 ).

Finally, Algorithm 3.4 computes the update of ν by directly finding a zero of the whole function F in (5) given µ_{r+1} and Σ_{r+1}. The existence of such a zero was discussed in the previous section. The zero is computed by an inner loop which iterates the update step of ν from Algorithm 3.3. We will see that these inner iterations indeed converge to a zero of F.

Algorithm 3.4 General Multivariate Myriad Filter (GMMF)
Same as Algorithm 3.2 except for

ν_{r+1} = zero of φ(ν/2) − φ( (ν+d)/2 ) + Σ_{i=1}^n w_i ( (ν+d)/(ν+δ_{i,r+1}) − log( (ν+d)/(ν+δ_{i,r+1}) ) − 1 ),

computed by the inner loop

ν_{r,0} = ν_r,
for l = 0, 1, ... do
ν_{r,l+1} = zero of φ(ν/2) − φ( (ν+d)/2 ) + Σ_{i=1}^n w_i ( (ν_{r,l}+d)/(ν_{r,l}+δ_{i,r+1}) − log( (ν_{r,l}+d)/(ν_{r,l}+δ_{i,r+1}) ) − 1 ).

In the rest of this section, we prove that the sequence (L(ν_r, µ_r, Σ_r))_r generated by Algorithms 3.2 and 3.3 decreases in each iteration step and that there exists a subsequence of the iterates which converges to a critical point. We will need the following auxiliary lemma.

Lemma 3.6. Let F_a, F_b : R_{>0} → R be continuous functions, where F_a is strictly increasing and F_b is strictly decreasing. Define F := F_a + F_b. For any initial value x_0 > 0, assume that the sequence generated by

x_{l+1} = zero of F_a(x) + F_b(x_l)

is uniquely determined, i.e., the function on the right-hand side has a unique zero. Then the following holds:

i) If F(x_0) < 0, then (x_l)_l is strictly increasing and F(x) < 0 for all x ∈ [x_l, x_{l+1}], l ∈ N_0.

ii) If F(x_0) > 0, then (x_l)_l is strictly decreasing and F(x) > 0 for all x ∈ [x_{l+1}, x_l], l ∈ N_0.

Furthermore, assume that there exists x_− > 0 with F(x) < 0 for all x < x_− and x_+ > 0 with F(x) > 0 for all x > x_+. Then the sequence (x_l)_l converges to a zero x* of F.

Proof. We consider case i), i.e., F(x_0) < 0. Case ii) follows in a similar way.

We show by induction that F(x_l) < 0 and that x_{l+1} > x_l for all l ∈ N_0. Then it holds for all l ∈ N_0 and x ∈ (x_l, x_{l+1}) that F_a(x) + F_b(x) < F_a(x) + F_b(x_l) < F_a(x_{l+1}) + F_b(x_l) = 0. Thus F(x) < 0 for all x ∈ [x_l, x_{l+1}], l ∈ N_0.

Induction step: Let F_a(x_l) + F_b(x_l) < 0. Since F_a(x_{l+1}) + F_b(x_l) = 0 > F_a(x_l) + F_b(x_l) and F_a is strictly increasing, we have x_{l+1} > x_l. Using that F_b is strictly decreasing, we get F_b(x_{l+1}) < F_b(x_l) and consequently

F(x_{l+1}) = F_a(x_{l+1}) + F_b(x_{l+1}) < F_a(x_{l+1}) + F_b(x_l) = 0.

Assume now that F(x) > 0 for all x > x_+. Since the sequence (x_l)_l is strictly increasing and F(x_l) < 0, it must be bounded from above by x_+. Therefore it converges to some x* ∈ R_{>0}. By the continuity of F_a and F_b it holds that

0 = lim_{l→∞} ( F_a(x_{l+1}) + F_b(x_l) ) = F_a(x*) + F_b(x*) = F(x*).

Hence x* is a zero of F.

For the setting in Algorithm 3.4, Lemma 3.6 implies the following corollary.

Corollary 3.7. Let F_a(ν) := φ(ν/2) − φ( (ν+d)/2 ) and

F_b(ν) := Σ_{i=1}^n w_i ( (ν+d)/(ν+δ_{i,r+1}) − log( (ν+d)/(ν+δ_{i,r+1}) ) − 1 ), r ∈ N_0.

Assume that there exists ν_+ > 0 such that F := F_a + F_b > 0 for all ν ≥ ν_+. Then the sequence (ν_{r,l})_l generated by the r-th inner loop of Algorithm 3.4 converges to a zero of F.

Note that by Corollary 3.5 the above condition on F is fulfilled in each iteration step, e.g., if δ_{i,r} ∉ [d−√(2d), d+√(2d)] for i = 1, ..., n and r ∈ N_0.

Proof. From the previous section we know that F_a is strictly increasing and F_b is strictly decreasing. Both functions are continuous. If F(ν_r) < 0, then we know from Lemma 3.6 that (ν_{r,l})_l is increasing and converges to a zero ν_r* of F. If F(ν_r) > 0, then we know from Lemma 3.6 that (ν_{r,l})_l is decreasing. The condition that there exists x_− ∈ R_{>0} with F(x) < 0 for all x < x_− is fulfilled since lim_{x→0} F(x) = −∞. Hence, by Lemma 3.6, the sequence converges to a zero ν_r* of F.

To prove that the objective function decreases in each step of Algorithms 3.2-3.4, we need the following lemma.

Lemma 3.8. Let F_a, F_b : R_{>0} → R be continuous functions, where F_a is strictly increasing and F_b is strictly decreasing. Define F := F_a + F_b and let G : R_{>0} → R be an antiderivative of F, i.e., F = G'. For an arbitrary x_0 > 0, let (x_l)_l be the sequence generated by

x_{l+1} = zero of F_a(x) + F_b(x_l).

Then the following holds true:

i) The sequence (G(x_l))_l is monotonically decreasing, with G(x_l) = G(x_{l+1}) if and only if x_0 is a critical point of G. If (x_l)_l converges, then the limit x* fulfills

G(x_0) ≥ G(x_1) ≥ G(x*),

with equality if and only if x_0 is a critical point of G.

ii) Let F = F̃_a + F̃_b be another splitting of F with continuous functions F̃_a, F̃_b, where the first one is strictly increasing and the second one strictly decreasing. Assume that F_a, F̃_a are differentiable with F̃_a'(x) > F_a'(x) for all x > 0. Then it holds for y_1 := zero of F̃_a(x) + F̃_b(x_0) that G(x_0) ≥ G(y_1) ≥ G(x_1), with equality if and only if x_0 is a critical point of G.

Proof. i) If F(x_0) = 0, then x_0 is a critical point of G and x_l = x_0 for all l.

Let F(x_0) < 0. By Lemma 3.6 we know that (x_l)_l is strictly increasing and that F(x) < 0 for x ∈ [x_l, x_{l+1}], l ∈ N_0. By the Fundamental Theorem of Calculus it holds

G(x_{l+1}) = G(x_l) + ∫_{x_l}^{x_{l+1}} F(ν) dν.

Thus G(x_{l+1}) < G(x_l).

Let F(x_0) > 0. By Lemma 3.6 we know that (x_l)_l is strictly decreasing and that F(x) > 0 for x ∈ [x_{l+1}, x_l], l ∈ N_0. Then

G(x_l) = G(x_{l+1}) + ∫_{x_{l+1}}^{x_l} F(ν) dν

implies G(x_{l+1}) < G(x_l). Now the rest of assertion i) follows immediately.

ii) It remains to show that G(x_1) ≤ G(y_1). Let F(x_0) < 0. Then we have y_1 ≥ x_0 and x_1 ≥ x_0. By the Fundamental Theorem of Calculus we obtain

F(x_0) + ∫_{x_0}^{x_1} F_a'(x) dx = F_a(x_0) + ∫_{x_0}^{x_1} F_a'(x) dx + F_b(x_0) = F_a(x_1) + F_b(x_0) = 0,
F(x_0) + ∫_{x_0}^{y_1} F̃_a'(x) dx = F̃_a(x_0) + ∫_{x_0}^{y_1} F̃_a'(x) dx + F̃_b(x_0) = F̃_a(y_1) + F̃_b(x_0) = 0.

This yields

∫_{x_0}^{x_1} F_a'(x) dx = ∫_{x_0}^{y_1} F̃_a'(x) dx,

and since F̃_a'(x) > F_a'(x), further y_1 ≤ x_1, with equality if and only if x_0 = x_1, i.e., if x_0 is a critical point of G. Since F(x) < 0 on (x_0, x_1), it holds

G(x_1) = G(y_1) + ∫_{y_1}^{x_1} F(x) dx ≤ G(y_1),

with equality if and only if x_0 = x_1. The case F(x_0) > 0 can be handled similarly.

Lemma 3.8 implies the following relation between the values of the objective function L for Algorithms 3.2-3.4.

Corollary 3.9. For the same fixed ν_r > 0, µ_r ∈ R^d, Σ_r ∈ SPD(d), define µ_{r+1}, Σ_{r+1} and ν_{r+1}^{aEM}, ν_{r+1}^{MMF}, ν_{r+1}^{GMMF} by Algorithms 3.2, 3.3 and 3.4, respectively. For the GMMF algorithm assume that the inner loop converges. Then it holds

L(ν_r, µ_{r+1}, Σ_{r+1}) ≥ L(ν_{r+1}^{aEM}, µ_{r+1}, Σ_{r+1}) ≥ L(ν_{r+1}^{MMF}, µ_{r+1}, Σ_{r+1}) ≥ L(ν_{r+1}^{GMMF}, µ_{r+1}, Σ_{r+1}).

Equality holds if and only if (d/dν) L(ν, µ_{r+1}, Σ_{r+1}) |_{ν=ν_r} = 0, and in this case ν_r = ν_{r+1}^{aEM} = ν_{r+1}^{MMF} = ν_{r+1}^{GMMF}.

Proof. For G(ν) := L(ν, µ_{r+1}, Σ_{r+1}) we have (d/dν) L(ν, µ_{r+1}, Σ_{r+1}) = F(ν), where

F(ν) := φ(ν/2) − φ( (ν+d)/2 ) + Σ_{i=1}^n w_i ( (ν+d)/(ν+δ_{i,r+1}) − log( (ν+d)/(ν+δ_{i,r+1}) ) − 1 ).

We use the splitting F = F_a + F_b = F̃_a + F̃_b with

F_a(ν) := φ(ν/2) − φ( (ν+d)/2 ), F̃_a(ν) := φ(ν/2)

and

F_b(ν) := Σ_{i=1}^n w_i ( (ν+d)/(ν+δ_{i,r+1}) − log( (ν+d)/(ν+δ_{i,r+1}) ) − 1 ), F̃_b(ν) := −φ( (ν+d)/2 ) + F_b(ν).

By the considerations in the previous section we know that F_a and F̃_a are strictly increasing and that F_b and F̃_b are strictly decreasing. Moreover, since φ' > 0, we have F̃_a' > F_a'. Hence it follows from Lemma 3.8 ii) that L(ν_r, µ_{r+1}, Σ_{r+1}) ≥ L(ν_{r+1}^{aEM}, µ_{r+1}, Σ_{r+1}) ≥ L(ν_{r+1}^{MMF}, µ_{r+1}, Σ_{r+1}). Finally, we conclude by Lemma 3.8 i) that L(ν_{r+1}^{MMF}, µ_{r+1}, Σ_{r+1}) ≥ L(ν_{r+1}^{GMMF}, µ_{r+1}, Σ_{r+1}).

Concerning the convergence of the three algorithms we have the following result.

Theorem 3.10. Let (ν_r, µ_r, Σ_r)_r be the sequence generated by Algorithm 3.2, 3.3 or 3.4, respectively, starting with arbitrary initial values ν_0 > 0, µ_0 ∈ R^d, Σ_0 ∈ SPD(d). For the GMMF algorithm we assume that in each step the inner loop converges. Then it holds for all r ∈ N_0 that

L(ν_r, µ_r, Σ_r) ≥ L(ν_{r+1}, µ_{r+1}, Σ_{r+1}),

with equality if and only if (ν_r, µ_r, Σ_r) = (ν_{r+1}, µ_{r+1}, Σ_{r+1}).

Proof. By the general convergence results of the accelerated EM algorithm for fixed ν, see also [24], it holds

L(ν_r, µ_{r+1}, Σ_{r+1}) ≤ L(ν_r, µ_r, Σ_r),

with equality if and only if (µ_r, Σ_r) = (µ_{r+1}, Σ_{r+1}). By Corollary 3.9 it holds

L(ν_{r+1}, µ_{r+1}, Σ_{r+1}) ≤ L(ν_r, µ_{r+1}, Σ_{r+1}),

with equality if and only if ν_r = ν_{r+1}. The combination of both results proves the claim.

Lemma 3.11. Let T = (T_1, T_2, T_3) : R_{>0} × R^d × SPD(d) → R_{>0} × R^d × SPD(d) be the operator of one iteration step of Algorithm 3.2 (or 3.3). Then T is continuous.

Proof. We show the statement for Algorithm 3.3; for Algorithm 3.2 it can be shown analogously. Clearly, the mapping (T_2, T_3)(ν, µ, Σ) is continuous. Further,

T_1(ν, µ, Σ) = zero of Ψ( ·, ν, T_2(ν, µ, Σ), T_3(ν, µ, Σ) ),

where

Ψ(x, ν, µ, Σ) = φ(x/2) − φ( (x+d)/2 ) + Σ_{i=1}^n w_i ( (ν+d)/( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ) − log( (ν+d)/( ν + (x_i−µ)^T Σ^{−1}(x_i−µ) ) ) − 1 ).

It is sufficient to show that the zero of Ψ depends continuously on ν, T_2 and T_3. The continuously differentiable function Ψ is strictly increasing in x, so that (∂/∂x) Ψ(x, ν, T_2, T_3) > 0. By Ψ(T_1, ν, T_2, T_3) = 0, the Implicit Function Theorem yields the following statement: There exists an open neighborhood U × V of (T_1, ν, T_2, T_3) with U ⊂ R_{>0} and V ⊂ R_{>0} × R^d × SPD(d), and a continuously differentiable function G : V → U, such that for all (x, ν, µ, Σ) ∈ U × V it holds

Ψ(x, ν, µ, Σ) = 0 if and only if G(ν, µ, Σ) = x.

Thus the zero of Ψ depends continuously on ν, T_2 and T_3.

This implies the following theorem.

Theorem 3.12. Let (ν_r, µ_r, Σ_r)_r be the sequence generated by Algorithm 3.2 or 3.3 with arbitrary initial values ν_0 > 0, µ_0 ∈ R^d, Σ_0 ∈ SPD(d). Then every cluster point of (ν_r, µ_r, Σ_r)_r is a critical point of L.

Proof. The mapping T defined in Lemma 3.11 is continuous. Further, we know from its definition that (ν, µ, Σ) is a critical point of L if and only if it is a fixed point of T. Let (ν̂, µ̂, Σ̂) be a cluster point of (ν_r, µ_r, Σ_r)_r. Then there exists a subsequence (ν_{r_s}, µ_{r_s}, Σ_{r_s})_s which converges to (ν̂, µ̂, Σ̂). Further, we know by Theorem 3.10 that L_r = L(ν_r, µ_r, Σ_r) is decreasing. Since (L_r)_r is bounded from below, it converges. Now it holds

L(ν̂, µ̂, Σ̂) = lim_{s→∞} L(ν_{r_s}, µ_{r_s}, Σ_{r_s}) = lim_{s→∞} L_{r_s} = lim_{s→∞} L_{r_s+1} = lim_{s→∞} L(ν_{r_s+1}, µ_{r_s+1}, Σ_{r_s+1}) = lim_{s→∞} L( T(ν_{r_s}, µ_{r_s}, Σ_{r_s}) ) = L( T(ν̂, µ̂, Σ̂) ).

By Theorem 3.10 and the definition of T we have that L(ν, µ, Σ) = L(T(ν, µ, Σ)) if and only if (ν, µ, Σ) = T(ν, µ, Σ). By the definition of the algorithm this is the case if and only if (ν, µ, Σ) is a critical point of L. Thus (ν̂, µ̂, Σ̂) is a critical point of L.

3.5. Numerical Results

In this section we give two numerical examples of the developed theory. First, we compare the four different algorithms in Subsection 3.5.1. Then, in Subsection 3.5.2, we provide an application in image analysis by estimating the degrees of freedom parameter in images corrupted by Student-t noise.

3.5.1. Comparison of Algorithms

In this section, we compare the numerical performance of the classical EM algorithm 3.1 and the proposed Algorithms 3.2, 3.3 and 3.4. To this aim, we did the following Monte Carlo simulation: based on the stochastic representation of the Student-t distribution, see equation (2), we draw n = 1000 i.i.d. realizations of the T_ν(µ, Σ) distribution with location parameter µ = 0 and different scatter matrices Σ and degrees of freedom parameters ν. Then, we used the algorithms to compute the ML-estimator (ν̂, µ̂, Σ̂). We initialize all algorithms with the sample mean for µ and the sample covariance matrix for Σ. Furthermore, we initialize ν_0 = 3, and in all algorithms the zero of the respective function is computed by Newton's method. As a stopping criterion we use the following relative distance:

√( ‖µ_{r+1} − µ_r‖² + ‖Σ_{r+1} − Σ_r‖_F² ) / √( ‖µ_r‖² + ‖Σ_r‖_F² ) + |log(ν_{r+1}) − log(ν_r)| / |log(ν_r)| < 10^{−5}.

We take the logarithm of ν in the stopping criterion because T_ν(µ, Σ) converges to the normal distribution as ν → ∞, and therefore the difference between T_ν(µ, Σ) and T_{ν+1}(µ, Σ) becomes small for large ν. To quantify the performance of the algorithms, we count the number of iterations until the stopping criterion is reached. Since the inner loop of the GMMF is potentially time consuming, we additionally measure the execution time until the stopping criterion is reached. This experiment is repeated N = 10,000 times for different values of ν ∈ {1, 2, 5, 10}. Afterwards we calculate the average number of iterations and the average execution times. The results are given in Table 1. We observe that the performance of the algorithms depends on Σ. Further, the aEM algorithm always performs better than the classical EM algorithm. All algorithms need a longer time to estimate large ν; this seems natural since the likelihood function becomes very flat for large ν. Further, the GMMF needs the lowest number of iterations, but for small ν the execution time of the GMMF is larger than that of the aEM algorithm. This can be explained by the fact that the ν-step has a smaller relevance for small ν but is still time consuming in the GMMF. The MMF needs slightly more iterations than the GMMF, but if ν is not extremely large its execution time is smaller than those of the GMMF and the aEM algorithm. In summary, the MMF algorithm is proposed as the algorithm of choice.
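The relative-distance stopping criterion above can be sketched in code. The following is a minimal numpy sketch; the function and variable names are ours, not from the thesis.

```python
import numpy as np

def stopping_criterion(nu_old, mu_old, Sigma_old, nu_new, mu_new, Sigma_new,
                       tol=1e-5):
    """Relative-distance stopping criterion; nu enters through its logarithm
    since T_nu(mu, Sigma) approaches a normal distribution for large nu."""
    num = np.sqrt(np.linalg.norm(mu_new - mu_old) ** 2
                  + np.linalg.norm(Sigma_new - Sigma_old, 'fro') ** 2)
    den = np.sqrt(np.linalg.norm(mu_old) ** 2
                  + np.linalg.norm(Sigma_old, 'fro') ** 2)
    rel_nu = abs(np.log(nu_new) - np.log(nu_old)) / abs(np.log(nu_old))
    return num / den + rel_nu < tol
```

A fixed-point iteration would call this with the parameters of two consecutive iterates and stop as soon as it returns true.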

In Figure 2 we exemplarily show the functional values L(ν_r, µ_r, Σ_r) of the four algorithms for samples generated with different values of ν and Σ = I. Note that the x-axis of the plots is in log-scale. We see that the convergence speed (in terms of the number of iterations) of the EM algorithm is much slower than that of the MMF/GMMF. For small ν the convergence speed of the aEM algorithm is close to that of the GMMF/MMF, but for large ν it is close to that of the EM algorithm. In Figure 3 we show the histograms of the ν-output of 1000 runs for different values of ν and Σ = I. Since the ν-outputs of all algorithms are very close together, we only plot the output of the MMF. Only for ν = 100 do the ν-outputs of the GMMF and MMF differ from the outputs of the aEM algorithm. Here, we give the histograms for both cases. We see that the ν_r of the GMMF and MMF are greater in the case that a minimum of L does not exist.

3.5.2. Unsupervised Estimation of Noise Parameters

Next, we provide an application in image analysis. To this aim, we consider images corrupted by one-dimensional Student-t noise with µ = 0 and unknown Σ ≡ σ² and ν. We provide a method that allows us to estimate ν and σ in an unsupervised way. The basic

Σ = diag(0.1, 0.1)
  ν     EM                   aEM                  MMF                 GMMF
  1     77.90 ± 6.73         27.82 ± 1.96         26.38 ± 1.86        25.23 ± 1.85
  2     49.72 ± 1.87         28.48 ± 1.25         23.39 ± 0.92        21.01 ± 0.82
  5     60.03 ± 12.79        58.86 ± 9.48         31.36 ± 3.62        21.16 ± 2.75
  10    161.43 ± 56.89       155.82 ± 56.81       55.37 ± 10.61       34.68 ± 5.49
  100   5528.99 ± 4613.79    5525.43 ± 4614.94    580.76 ± 1115.79    261.95 ± 933.98

Σ = diag(1, 1)
  ν     EM                   aEM                  MMF                 GMMF
  1     77.79 ± 6.74         27.79 ± 1.97         26.34 ± 1.87        25.20 ± 1.85
  2     49.70 ± 1.86         28.48 ± 1.24         23.38 ± 0.91        21.00 ± 0.81
  5     59.98 ± 13.13        58.90 ± 9.66         31.37 ± 3.68        21.18 ± 2.77
  10    161.98 ± 54.63       156.37 ± 54.53       55.51 ± 10.55       34.76 ± 5.47
  100   5447.00 ± 4571.26    5443.43 ± 4572.41    582.47 ± 1111.33    259.67 ± 920.55

Σ = diag(10, 10)
  ν     EM                   aEM                  MMF                 GMMF
  1     77.80 ± 6.83         27.79 ± 1.99         26.35 ± 1.89        25.21 ± 1.88
  2     49.69 ± 1.90         28.45 ± 1.26         23.37 ± 0.92        20.99 ± 0.82
  5     59.93 ± 13.01        58.80 ± 9.68         31.33 ± 3.69        21.15 ± 2.78
  10    159.92 ± 50.61       154.30 ± 50.51       55.14 ± 10.03       34.58 ± 5.21
  100   5456.18 ± 4605.11    5452.62 ± 4606.26    562.69 ± 1082.63    257.36 ± 902.01

Σ = (2, −1; −1, 2)
  ν     EM                   aEM                  MMF                 GMMF
  1     77.83 ± 1.97         27.81 ± 6.75         26.36 ± 1.87        25.23 ± 1.86
  2     49.66 ± 1.91         28.46 ± 1.26         23.37 ± 0.93        20.98 ± 0.83
  5     60.08 ± 12.74        58.90 ± 9.37         31.39 ± 3.59        21.18 ± 2.71
  10    161.10 ± 54.71       155.49 ± 54.60       55.34 ± 10.49       34.68 ± 5.43
  100   5584.13 ± 4597.97    5580.59 ± 4599.13    589.88 ± 1107.12    267.17 ± 908.90

Σ = diag(0.1, 0.1)
  ν     EM                   aEM                  MMF                  GMMF
  1     0.010173 ± 0.00181   0.004083 ± 0.00067   0.004074 ± 0.00069   0.005703 ± 0.00108
  2     0.006713 ± 0.00143   0.004223 ± 0.00090   0.003654 ± 0.00078   0.004396 ± 0.00109
  5     0.008844 ± 0.00342   0.009402 ± 0.00325   0.005314 ± 0.00177   0.004464 ± 0.00163
  10    0.020038 ± 0.00712   0.020871 ± 0.00761   0.007940 ± 0.00162   0.007161 ± 0.00143
  100   0.661000 ± 0.55210   0.702780 ± 0.58644   0.076444 ± 0.14557   0.050393 ± 0.12556

Σ = diag(1, 1)
  ν     EM                   aEM                  MMF                  GMMF
  1     0.010025 ± 0.00141   0.004043 ± 0.00049   0.004044 ± 0.00051   0.005653 ± 0.00086
  2     0.006870 ± 0.00151   0.004301 ± 0.00094   0.003718 ± 0.00086   0.004467 ± 0.00117
  5     0.008299 ± 0.00265   0.008881 ± 0.00250   0.004979 ± 0.00131   0.004195 ± 0.00131
  10    0.023450 ± 0.00987   0.024536 ± 0.01049   0.009249 ± 0.00301   0.008309 ± 0.00272
  100   0.810580 ± 0.71158   0.876640 ± 0.77246   0.096937 ± 0.18602   0.063198 ± 0.15795

Σ = diag(10, 10)
  ν     EM                   aEM                  MMF                  GMMF
  1     0.011491 ± 0.00326   0.004588 ± 0.00133   0.004573 ± 0.00132   0.006445 ± 0.00199
  2     0.007862 ± 0.00285   0.004945 ± 0.00185   0.004279 ± 0.00164   0.005164 ± 0.00206
  5     0.008674 ± 0.00329   0.009244 ± 0.00315   0.005216 ± 0.00176   0.004384 ± 0.00161
  10    0.021255 ± 0.00824   0.022352 ± 0.00876   0.008494 ± 0.00252   0.007586 ± 0.00230
  100   0.700530 ± 0.60473   0.750020 ± 0.64754   0.080188 ± 0.15852   0.053959 ± 0.13666

Σ = (2, −1; −1, 2)
  ν     EM                   aEM                  MMF                  GMMF
  1     0.011177 ± 0.00320   0.004496 ± 0.00135   0.004472 ± 0.00137   0.006245 ± 0.00198
  2     0.007527 ± 0.00251   0.004726 ± 0.00159   0.004100 ± 0.00143   0.004935 ± 0.00180
  5     0.008771 ± 0.00330   0.009387 ± 0.00322   0.005291 ± 0.00181   0.004426 ± 0.00167
  10    0.021643 ± 0.00908   0.022709 ± 0.00961   0.008608 ± 0.00265   0.007719 ± 0.00242
  100   0.869910 ± 0.77962   0.926510 ± 0.82809   0.102000 ± 0.20195   0.068415 ± 0.17330

Table 1: Average number of iterations (top) and execution times (bottom) and the corresponding standard deviations of the different algorithms.

(a) ν = 1. (b) ν = 2.

(c) ν = 5. (d) ν = 10.

(e) ν = 100. (f) ν = 200.

Figure 2: Plots of L(νr, µr, Σr) on the y-axis and r on the x-axis for all algorithms.

(a) ν = 1. (b) ν = 2.

(c) ν = 5. (d) ν = 10.

(e) ν = 100 using the GMMF. (f) ν = 100 using the EM algorithm.

Figure 3: Histograms of the output ν from all algorithms.

idea is to consider constant areas of an image, where the signal-to-noise ratio is weak and differences between pixel values are solely caused by the noise.

Constant area detection: In order to detect constant regions in an image, we adopt an idea presented in [38]. It is based on Kendall's τ-coefficient, which is a measure of rank correlation, and the associated z-score, see [17, 18]. In the following, we briefly summarize the main ideas behind this approach. For finding constant regions we proceed as follows: first, the image grid G is partitioned into K small, non-overlapping regions G = ∪_{k=1}^K R_k, and for each region we consider the hypothesis testing problem

H_0: R_k is constant   vs.   H_1: R_k is not constant.

To decide whether to reject H_0 or not, we observe the following: consider a fixed region R_k and let I, J ⊆ R_k be two disjoint subsets of R_k with the same cardinality. Denote by u_I and u_J the vectors containing the values of u at the positions indexed by I and J. Then, under H_0, the vectors u_I and u_J are uncorrelated (in fact even independent) for all choices of I, J ⊆ R_k with I ∩ J = ∅ and |I| = |J|. As a consequence, the rejection of H_0 can be reformulated as the question whether we can find I, J such that u_I and u_J are significantly correlated, since in this case there has to be some structure in the image region R_k and it cannot be constant. In order to quantify the correlation, we make use of Kendall's τ-coefficient. The key idea is to focus on the rank (i.e., on the relative order) of the values rather than on the values themselves. In this vein, a block is considered homogeneous if the ranking of the pixel values is uniformly distributed, regardless of the spatial arrangement of the pixels. In the following, we assume that we have extracted two disjoint subsequences x = u_I and y = u_J from a region R_k with I and J as above. Let (x_i, y_i) and (x_j, y_j) be two pairs of observations. Then, the pairs are said to be

concordant if x_i < x_j and y_i < y_j, or x_i > x_j and y_i > y_j;
discordant if x_i < x_j and y_i > y_j, or x_i > x_j and y_i < y_j;
tied if x_i = x_j or y_i = y_j.

Next, let x, y ∈ ℝ^n be two sequences without tied pairs and let n_c and n_d be the number of concordant and discordant pairs, respectively. Then, Kendall's τ-coefficient [17] is defined as τ: ℝ^n × ℝ^n → [−1, 1],

τ(x, y) = (n_c − n_d) / ( n(n−1)/2 ).

From this definition we see that if the agreement between the two rankings is perfect, i.e. the two rankings are the same, then the coefficient attains its maximal value 1. At the other extreme, if the disagreement between the two rankings is perfect, that is, one ranking is the reverse of the other, then the coefficient attains its minimal value −1. If the sequences x and y are uncorrelated, we expect the coefficient to be approximately zero. Denoting by X and Y the underlying random variables that generated the sequences x and y, we have the following result, whose proof can be found in [17].
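A direct O(n²) implementation of Kendall's τ-coefficient for tie-free sequences can be sketched as follows (an illustrative sketch; the function name is ours):

```python
def kendall_tau(x, y):
    """Kendall's tau for sequences without tied pairs:
    tau = (n_c - n_d) / (n (n - 1) / 2)."""
    n = len(x)
    nc = nd = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:       # concordant pair
                nc += 1
            elif prod < 0:     # discordant pair
                nd += 1
    return (nc - nd) / (n * (n - 1) / 2)
```

Identical rankings give τ = 1 and reversed rankings give τ = −1, matching the extreme cases discussed above.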

Theorem 3.13. Let X and Y be two arbitrary sequences under H_0 without tied pairs. Then, the random variable τ(X, Y) has an expected value of 0 and a variance of 2(2n+5)/(9n(n−1)). Moreover, for n → ∞, the associated z-score z: ℝ^n × ℝ^n → ℝ,

z(x, y) = ( 3√(n(n−1)) / √(2(2n+5)) ) τ(x, y) = 3√2 (n_c − n_d) / √(n(n−1)(2n+5)),

is asymptotically standard normally distributed,

z(X, Y) → N(0, 1) in distribution as n → ∞.

With a slight adaptation, Kendall's τ-coefficient can be generalized to sequences with tied pairs, see [18]. As a consequence of Theorem 3.13, for a given significance level α ∈ (0, 1), we can use the quantiles of the standard normal distribution to decide whether to reject H_0 or not. In practice, we cannot test every kind of region and every kind of disjoint sequences. As in [38], we restrict our attention to quadratic regions and pairwise comparisons of neighboring pixels. We use four kinds of neighboring relations (horizontal, vertical and two diagonal neighbors) and thus perform in total four tests. We reject the hypothesis H_0 that the region is constant as soon as one of the four tests rejects it. Note that by doing so, the final significance level is smaller than the initially chosen one. We start with blocks of size 64 × 64 whose side length is incrementally decreased until enough constant areas are found.
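The resulting homogeneity test can be sketched as follows, assuming the tie-free case and the asymptotic normal approximation of Theorem 3.13; the default threshold 1.96 corresponds to a two-sided significance level of roughly α = 0.05. All names are ours.

```python
import numpy as np

def kendall_z(x, y):
    """Asymptotic z-score of Kendall's tau under H0 (no tied pairs assumed)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    # n_c - n_d as a sum of sign products over all pairs i < j
    s = sum(np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
            for i in range(n) for j in range(i + 1, n))
    tau = s / (n * (n - 1) / 2)
    return 3.0 * np.sqrt(n * (n - 1)) * tau / np.sqrt(2.0 * (2 * n + 5))

def reject_constant(x, y, z_alpha=1.96):
    """Reject H0 (region constant) if the rank correlation is significant."""
    return abs(kendall_z(x, y)) > z_alpha
```

In the full procedure, this test would be applied to the four pairwise-neighbor sequences of a block, rejecting H_0 as soon as one of them rejects.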

Parameter estimation. In each constant region we consider the pixel values as i.i.d. samples of a univariate Student-t distribution T_ν(µ, σ²), where we estimate the parameters using Algorithm 3.3. After estimating the parameters in each found constant region, the estimated location parameters µ are discarded, while the estimated scale and degrees of freedom parameters σ and ν, respectively, are averaged to obtain the final estimate of the global noise parameters. At this point, as both ν and σ influence the resulting distribution in a multiplicative way, instead of an arithmetic mean one might use a geometric mean, which is slightly less affected by outliers. In Figure 4 we illustrate this procedure for two different noise scenarios. The left column in each figure depicts the detected constant areas. The middle and right columns show histograms of the estimated values for ν and σ, respectively. For the constant area detection we use the code of [38]¹. The true parameters used to generate the noisy images were ν = 1 and σ = 10 for the top row and ν = 5 and σ = 10 for the bottom row, while the obtained estimates are (geometric mean in brackets) ν̂ = 1.0437 (1.0291) and σ̂ = 10.3845 (10.3111) for the top row and ν̂ = 5.4140 (5.0423) and σ̂ = 10.5500 (10.1897) for the bottom row. A further example is given in Figure 5. Here, the obtained estimates are (geometric mean in brackets) ν̂ = 1.0075 (0.99799) and σ̂ = 10.2969 (10.1508) for the top row and ν̂ = 5.4184 (5.1255) and σ̂ = 10.2295 (10.1669) for the bottom row.
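The aggregation of the per-region estimates can be sketched as follows; the helper name is ours, and the function simply returns both the arithmetic and the geometric mean for comparison:

```python
import numpy as np

def aggregate_noise_estimates(nu_estimates, sigma_estimates):
    """Combine per-region estimates into global noise parameters.
    Returns (arithmetic mean, geometric mean) for nu and sigma; the
    geometric mean is less affected by outliers in these multiplicative
    parameters."""
    nu = np.asarray(nu_estimates, float)
    sigma = np.asarray(sigma_estimates, float)
    geometric = lambda v: float(np.exp(np.log(v).mean()))
    return {"nu": (float(nu.mean()), geometric(nu)),
            "sigma": (float(sigma.mean()), geometric(sigma))}
```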

4. Superresolution via Student-t Mixture Models

In this section, we consider Student-t mixture models. We start with the definition. Then, we present some algorithms to compute the maximum likelihood estimator for the parameters of a Student-t mixture model. At the end of this section we apply Student-t mixture models to patch-based superresolution methods.

Mixture models are motivated by the following scenario: we have K random number generators sampling from different distributions. We first choose one of the random number generators randomly, using the probability weights α = (α_1, ..., α_K)^T ∈ ∆, and then sample from the corresponding distribution. If all random number generators sample from Student-t distributions with different parameters, we arrive at the following formal definition of Student-t mixture models.

1https://github.com/csutour/RNLF


(a) Noisy image with detected homogeneous areas. (b) Histogram of estimates for ν. (c) Histogram of estimates for σ².
(d) Noisy image with detected homogeneous areas. (e) Histogram of estimates for ν. (f) Histogram of estimates for σ².

Figure 4: Unsupervised estimation of the noise parameters ν and σ2.

Top row: noisy image with detected homogeneous areas, histogram of estimates for ν, histogram of estimates for σ².
Bottom row: noisy image with detected homogeneous areas, histogram of estimates for ν, histogram of estimates for σ².

Figure 5: Unsupervised estimation of the noise parameters ν and σ2.

54 A Student-t mixture model is a random variable given by the probability density function

p(x) = ∑_{k=1}^K α_k f(x | ν_k, µ_k, Σ_k),    f(x | ν_k, µ_k, Σ_k) = Γ((d+ν_k)/2) / ( Γ(ν_k/2) (πν_k)^{d/2} |Σ_k|^{1/2} ) · (1 + δ_k/ν_k)^{−(d+ν_k)/2},   (16)

where

δ_k = (x − µ_k)^T Σ_k^{−1} (x − µ_k)

and α = (α_1, ..., α_K)^T ∈ ∆, ν_k ∈ ℝ_{>0}, µ_k ∈ ℝ^d and Σ_k ∈ SPD(d).

We can sample from this distribution as described in the motivation above: let Y be a random variable mapping into {1, ..., K} with P(Y = k) = α_k, and let X_1, ..., X_K be random variables with X_k ∼ T_{ν_k}(µ_k, Σ_k). Then the random variable X_Y is a Student-t mixture model with probability density function (16).
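Sampling X_Y can be sketched directly from this description. For the component draws, the sketch below assumes the standard stochastic representation T = µ_k + Z / √(G/ν_k) with Z ∼ N(0, Σ_k) and G ∼ χ²_{ν_k} (presumably the representation referred to as equation (2)); the names are ours.

```python
import numpy as np

def sample_student_t_mixture(n, alphas, nus, mus, Sigmas, rng=None):
    """Draw n samples of X_Y: first draw the label Y with P(Y = k) = alpha_k,
    then draw from T_{nu_k}(mu_k, Sigma_k) via the stochastic representation
    T = mu_k + Z / sqrt(G / nu_k), Z ~ N(0, Sigma_k), G ~ chi^2_{nu_k}."""
    rng = np.random.default_rng(rng)
    K, d = len(alphas), len(mus[0])
    labels = rng.choice(K, size=n, p=alphas)
    samples = np.empty((n, d))
    for i, k in enumerate(labels):
        z = rng.multivariate_normal(np.zeros(d), Sigmas[k])
        g = rng.chisquare(nus[k])
        samples[i] = mus[k] + z / np.sqrt(g / nus[k])
    return samples, labels
```

Returning the labels as well is convenient for simulation studies where the true component of each sample is needed.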

For (α, ν, µ, Σ) ∈ ∆ × ℝ_{>0}^K × (ℝ^d)^K × (SPD(d))^K the likelihood function is given by

ℒ(α, ν, µ, Σ | x_1, ..., x_n) = ∏_{i=1}^n ∑_{k=1}^K α_k f(x_i | ν_k, µ_k, Σ_k).

Thus the negative log-likelihood function is given by

L(α, ν, µ, Σ | x_1, ..., x_n) = − ∑_{i=1}^n log ( ∑_{k=1}^K α_k f(x_i | ν_k, µ_k, Σ_k) ).   (17)
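Evaluating (17) numerically is delicate because the densities can underflow for distant points; a standard remedy is to work with log-densities and a log-sum-exp. A sketch (the function names are ours):

```python
import numpy as np
from scipy.special import gammaln, logsumexp

def student_t_logpdf(x, nu, mu, Sigma):
    """Log-density of the multivariate Student-t distribution T_nu(mu, Sigma)."""
    d = len(mu)
    diff = x - mu
    delta = diff @ np.linalg.solve(Sigma, diff)
    _, logdet = np.linalg.slogdet(Sigma)
    return (gammaln((d + nu) / 2) - gammaln(nu / 2)
            - 0.5 * d * np.log(np.pi * nu) - 0.5 * logdet
            - 0.5 * (d + nu) * np.log1p(delta / nu))

def mixture_neg_log_likelihood(X, alphas, nus, mus, Sigmas):
    """Negative log-likelihood (17), evaluated stably via log-sum-exp."""
    K = len(alphas)
    logp = np.array([[np.log(alphas[k]) + student_t_logpdf(x, nus[k], mus[k], Sigmas[k])
                      for k in range(K)] for x in X])
    return float(-logsumexp(logp, axis=1).sum())
```

For K = 1, d = 1 and ν = 1 the density reduces to the standard Cauchy density, which gives a simple sanity check.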

4.1. Estimating the Parameters

We present four algorithms to find critical points of (17). Algorithm 4.1 is the classical EM algorithm; its derivation for this problem can be found in Appendix A.3. Note that, analogously to Algorithm 3.1, the function in the fourth M-step update,

Φ_j^{(r)}(x) = ψ(x/2) − log(x/2) − ψ((d + ν_j^{(r)})/2) + log((d + ν_j^{(r)})/2) + 1 + c_j^{(r)},
c_j^{(r)} = ∑_{i=1}^n β_ij^{(r)} (log(γ_ij^{(r)}) − γ_ij^{(r)}) / ∑_{i=1}^n β_ij^{(r)},

has a unique zero. By [7] we know that the values of the objective function L(α^{(r)}, ν^{(r)}, µ^{(r)}, Σ^{(r)} | x_1, ..., x_n) are monotonically decreasing in r and that a subsequence of the iterates converges to a critical point of L(α, ν, µ, Σ | x_1, ..., x_n) if the iterates are bounded.

Algorithm 4.1 EM Algorithm for Student-t mixture models
Input: x_1, ..., x_n ∈ ℝ^d, (α^{(0)}, ν^{(0)}, µ^{(0)}, Σ^{(0)}) ∈ ∆ × (ℝ_{>0})^K × (ℝ^d)^K × (SPD(d))^K
for r = 0, 1, ... do
  E-Step: For i = 1, ..., n and j = 1, ..., K compute
    β_ij^{(r)} = α_j^{(r)} f(x_i | ν_j^{(r)}, µ_j^{(r)}, Σ_j^{(r)}) / ∑_{l=1}^K α_l^{(r)} f(x_i | ν_l^{(r)}, µ_l^{(r)}, Σ_l^{(r)}),
    γ_ij^{(r)} = (ν_j^{(r)} + d) / ( ν_j^{(r)} + (x_i − µ_j^{(r)})^T (Σ_j^{(r)})^{−1} (x_i − µ_j^{(r)}) ).
  M-Step: For j = 1, ..., K compute
    α_j^{(r+1)} = (1/n) ∑_{i=1}^n β_ij^{(r)},
    µ_j^{(r+1)} = ∑_{i=1}^n β_ij^{(r)} γ_ij^{(r)} x_i / ∑_{i=1}^n β_ij^{(r)} γ_ij^{(r)},
    Σ_j^{(r+1)} = ∑_{i=1}^n β_ij^{(r)} γ_ij^{(r)} (x_i − µ_j^{(r+1)})(x_i − µ_j^{(r+1)})^T / ∑_{i=1}^n β_ij^{(r)},
    ν_j^{(r+1)} = zero of ψ(x/2) − log(x/2) − ψ((d + ν_j^{(r)})/2) + log((d + ν_j^{(r)})/2) + 1 + ∑_{i=1}^n β_ij^{(r)} (log(γ_ij^{(r)}) − γ_ij^{(r)}) / ∑_{i=1}^n β_ij^{(r)}.
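One E/M iteration of Algorithm 4.1 can be sketched with numpy/scipy as follows. This is an illustrative, unoptimized sketch (names are ours); in particular, the ν-update brackets the zero with `brentq` instead of Newton's method and simply keeps the old ν_j if no sign change is found in the bracket.

```python
import numpy as np
from scipy.special import gammaln, digamma
from scipy.optimize import brentq

def t_pdf(x, nu, mu, Sigma):
    # density f(x | nu, mu, Sigma) of the multivariate Student-t distribution
    d = len(mu)
    diff = x - mu
    delta = diff @ np.linalg.solve(Sigma, diff)
    _, logdet = np.linalg.slogdet(Sigma)
    return np.exp(gammaln((d + nu) / 2) - gammaln(nu / 2)
                  - 0.5 * d * np.log(np.pi * nu) - 0.5 * logdet
                  - 0.5 * (d + nu) * np.log1p(delta / nu))

def em_step(X, alphas, nus, mus, Sigmas):
    """One E/M iteration for a Student-t mixture model (sketch)."""
    n, d = X.shape
    K = len(alphas)
    nus = np.asarray(nus, float)
    # E-step: responsibilities beta_ij and precision weights gamma_ij
    F = np.array([[alphas[j] * t_pdf(x, nus[j], mus[j], Sigmas[j])
                   for j in range(K)] for x in X])
    beta = F / F.sum(axis=1, keepdims=True)
    delta = np.array([[(x - mus[j]) @ np.linalg.solve(Sigmas[j], x - mus[j])
                       for j in range(K)] for x in X])
    gamma = (nus + d) / (nus + delta)
    # M-step
    alphas_new = beta.mean(axis=0)
    mus_new, Sigmas_new, nus_new = [], [], []
    for j in range(K):
        w = beta[:, j] * gamma[:, j]
        mu_j = (w[:, None] * X).sum(axis=0) / w.sum()
        D = X - mu_j
        Sigma_j = (w[:, None, None] * D[:, :, None] * D[:, None, :]).sum(axis=0) \
                  / beta[:, j].sum()
        # nu-update: bracketed root search; keep the old nu_j if no sign change
        c = (beta[:, j] * (np.log(gamma[:, j]) - gamma[:, j])).sum() / beta[:, j].sum()
        t = (d + nus[j]) / 2
        def Phi(x, t=t, c=c):
            return digamma(x / 2) - np.log(x / 2) - digamma(t) + np.log(t) + 1 + c
        try:
            nu_j = brentq(Phi, 1e-6, 1e6)
        except ValueError:
            nu_j = nus[j]
        mus_new.append(mu_j)
        Sigmas_new.append(Sigma_j)
        nus_new.append(nu_j)
    return alphas_new, np.array(nus_new), np.array(mus_new), np.array(Sigmas_new)
```

Iterating `em_step` until the relative-distance stopping criterion is met gives the full algorithm.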

Algorithm 4.2 is inspired by Algorithm 3.3. It differs from the EM algorithm in the Σ- and ν-step. Analogously to Algorithm 3.3, the function

Ψ_j^{(r)}(x) = ψ(x/2) − log(x/2) − ψ((d + x)/2) + log((d + x)/2) + 1 + b_j^{(r)},
b_j^{(r)} = ∑_{i=1}^n β_ij^{(r)} (log(γ_ij^{(r)}) − γ_ij^{(r)}) / ∑_{i=1}^n β_ij^{(r)},

has a unique zero. For Algorithm 4.3 we want to use the Proximal Alternating Linearized Minimization

Algorithm 4.2 Variant of the EM Algorithm for Student-t mixture models
Input: x_1, ..., x_n ∈ ℝ^d, (α^{(0)}, ν^{(0)}, µ^{(0)}, Σ^{(0)}) ∈ ∆ × (ℝ_{>0})^K × (ℝ^d)^K × (SPD(d))^K
for r = 0, 1, ... do
  E-Step: For i = 1, ..., n and j = 1, ..., K compute
    β_ij^{(r)} = α_j^{(r)} f(x_i | ν_j^{(r)}, µ_j^{(r)}, Σ_j^{(r)}) / ∑_{l=1}^K α_l^{(r)} f(x_i | ν_l^{(r)}, µ_l^{(r)}, Σ_l^{(r)}),
    γ_ij^{(r)} = (ν_j^{(r)} + d) / (ν_j^{(r)} + δ_ij^{(r)}),  where  δ_ij^{(r)} = (x_i − µ_j^{(r)})^T (Σ_j^{(r)})^{−1} (x_i − µ_j^{(r)}).
  M-Step: For j = 1, ..., K compute
    α_j^{(r+1)} = (1/n) ∑_{i=1}^n β_ij^{(r)},
    µ_j^{(r+1)} = ∑_{i=1}^n β_ij^{(r)} γ_ij^{(r)} x_i / ∑_{i=1}^n β_ij^{(r)} γ_ij^{(r)},
    Σ_j^{(r+1)} = ∑_{i=1}^n β_ij^{(r)} γ_ij^{(r)} (x_i − µ_j^{(r+1)})(x_i − µ_j^{(r+1)})^T / ∑_{i=1}^n β_ij^{(r)} γ_ij^{(r)},
    ν_j^{(r+1)} = zero of ψ(x/2) − log(x/2) − ψ((d + x)/2) + log((d + x)/2) + 1 + ∑_{i=1}^n β_ij^{(r)} (log(γ_ij^{(r)}) − γ_ij^{(r)}) / ∑_{i=1}^n β_ij^{(r)}.

(PALM) algorithm as proposed in [6]. This algorithm is formulated for minimizing functions of the form Ψ: ℝ^{d_1} × ℝ^{d_2} → ℝ,

Ψ(x, y) = f(x) + g(y) + H(x, y),   (18)

where f and g are proper and lower semi-continuous and H is continuously differentiable. For computing the minimizer of (17) we consider the optimization problem

argmin_{α ∈ ℝ^K, ν ∈ ℝ^K_{>0}, µ ∈ (ℝ^d)^K, Σ ∈ (SPD(d))^K}  L(α, ν, µ, Σ) + ι_∆(α)
= argmin_{α ∈ ℝ^K, ν ∈ ℝ^K, µ ∈ (ℝ^d)^K, Σ ∈ (ℝ^{d×d})^K}  L(α, ν, µ, Σ) + ι_∆(α) + ι_{ℝ^K_{>0}}(ν) + ι_{(SPD(d))^K}(Σ).   (19)

We choose f(α) = ι_∆(α), g(ν, µ, Σ) = ι_{ℝ^K_{>0}}(ν) + ι_{(SPD(d))^K}(Σ) and H(α, ν, µ, Σ) = L(α, ν, µ, Σ). Unfortunately, the function g is not lower semi-continuous, so that the proximal mapping of g does not exist. To overcome this problem we choose

g̃(ν, µ, Σ) = ι_{ℝ^K_{≥ε_1}}(ν) + ι_{{A ∈ Sym(d): A ⪰ ε_2 I_d}^K}(Σ)

for small ε_1, ε_2 > 0 instead. Now PALM reads as Algorithm 4.3. Unfortunately, the function L is not defined on the whole ℝ^K × (ℝ^d)^K × ℝ^K × (ℝ^{d×d})^K, and we cannot extend it in a continuously differentiable way to the whole domain. Thus the assumptions of the convergence proof for PALM in [6] are not fulfilled and we cannot prove convergence of Algorithm 4.3. For the implementation of Algorithm 4.3 we need the derivatives of L with respect to α, ν, µ and Σ. These derivatives are computed in Appendix C.

Algorithm 4.3 Proximal Alternating Linearized Minimization (PALM) for Student-t Mixture Models
Input: x_1, ..., x_n ∈ ℝ^d, α^{(0)} ∈ ∆̊_K, ν^{(0)} ∈ ℝ^K_{>0}, µ^{(0)} ∈ (ℝ^d)^K, Σ^{(0)} ∈ (SPD(d))^K, τ_1^r, τ_2^r for r ∈ ℕ
for r = 1, ... do
  α-Update:
    α^{(r+1)} = Π_{∆_K}( α^{(r)} − (1/τ_1^r) ∇_α L(α^{(r)}, ν^{(r)}, µ^{(r)}, Σ^{(r)}) )
  ν, µ, Σ-Update:
    ν^{(r+1)} = Π_{ℝ^K_{≥ε_1}}( ν^{(r)} − (1/τ_2^r) ∇_ν L(α^{(r+1)}, ν^{(r)}, µ^{(r)}, Σ^{(r)}) )
    µ^{(r+1)} = µ^{(r)} − (1/τ_2^r) ∇_µ L(α^{(r+1)}, ν^{(r)}, µ^{(r)}, Σ^{(r)})
    Σ^{(r+1)} = Π_{{A ∈ Sym(d): A ⪰ ε_2 I_d}^K}( Σ^{(r)} − (1/τ_2^r) ∇_Σ L(α^{(r+1)}, ν^{(r)}, µ^{(r)}, Σ^{(r)}) )

For Algorithm 4.4 we want to use the inertial Proximal Alternating Linearized Minimization (iPALM) algorithm as proposed in [34]. It is a generalization of PALM. Similar to PALM, it can be formulated for minimizing functions of the form (18). In the special case that a_1^r = a_2^r, b_1^r = b_2^r = 0 and τ_1^r = τ_2^r for all r ∈ ℕ, iPALM coincides with iPiano (see [30]). To minimize (19) we choose f and g as above in PALM. Since g is not lower semi-continuous, we again choose g̃ instead. Now iPALM reads as Algorithm 4.4. Note that L is still not defined on the whole ℝ^K × (ℝ^d)^K × ℝ^K × (ℝ^{d×d})^K, so that the assumptions of the convergence proof of iPALM in [34] are not fulfilled. Note further that the algorithm does not ensure that α_z^{(r)} ∈ ∆_K, that ν_z^{(r)} ∈ ℝ^K_{>0} or that Σ_z^{(r)} ∈ (SPD(d))^K. This means that the algorithm might not be well defined in some cases. Thus, we cannot show convergence of Algorithm 4.4. For the implementation we again need the derivatives computed in Appendix C.

Algorithm 4.4 Inertial Proximal Alternating Linearized Minimization (iPALM) for Student-t Mixture Models
Input: x_1, ..., x_n ∈ ℝ^d, α^{(0)} ∈ ∆̊_K, ν^{(0)} ∈ ℝ^K_{>0}, µ^{(0)} ∈ (ℝ^d)^K, Σ^{(0)} ∈ (SPD(d))^K, a_1^r, a_2^r, b_1^r, b_2^r ∈ [0, 1], τ_1^r, τ_2^r for r ∈ ℕ
for r = 1, ... do
  α-Update:
    α_y^{(r)} = α^{(r)} + a_1^r (α^{(r)} − α^{(r−1)})
    α_z^{(r)} = α^{(r)} + b_1^r (α^{(r)} − α^{(r−1)})
    α^{(r+1)} = Π_{∆_K}( α_y^{(r)} − (1/τ_1^r) ∇_α L(α_z^{(r)}, ν^{(r)}, µ^{(r)}, Σ^{(r)}) )
  ν, µ, Σ-Update:
    ν_y^{(r)} = ν^{(r)} + a_2^r (ν^{(r)} − ν^{(r−1)}),   ν_z^{(r)} = ν^{(r)} + b_2^r (ν^{(r)} − ν^{(r−1)})
    µ_y^{(r)} = µ^{(r)} + a_2^r (µ^{(r)} − µ^{(r−1)}),   µ_z^{(r)} = µ^{(r)} + b_2^r (µ^{(r)} − µ^{(r−1)})
    Σ_y^{(r)} = Σ^{(r)} + a_2^r (Σ^{(r)} − Σ^{(r−1)}),   Σ_z^{(r)} = Σ^{(r)} + b_2^r (Σ^{(r)} − Σ^{(r−1)})
    ν^{(r+1)} = Π_{ℝ^K_{≥ε_1}}( ν_y^{(r)} − (1/τ_2^r) ∇_ν L(α^{(r+1)}, ν_z^{(r)}, µ_z^{(r)}, Σ_z^{(r)}) )
    µ^{(r+1)} = µ_y^{(r)} − (1/τ_2^r) ∇_µ L(α^{(r+1)}, ν_z^{(r)}, µ_z^{(r)}, Σ_z^{(r)})
    Σ^{(r+1)} = Π_{{A ∈ Sym(d): A ⪰ ε_2 I_d}^K}( Σ_y^{(r)} − (1/τ_2^r) ∇_Σ L(α^{(r+1)}, ν_z^{(r)}, µ_z^{(r)}, Σ_z^{(r)}) )

Choice of the parameters: PALM and iPALM have many parameters and the performance of the algorithms is very sensitive to them. We found numerically that the following heuristics are a good choice. For iPALM we use in our numerical experiments the extrapolation parameters

a_1^r = a_2^r = b_1^r = b_2^r = (r − 1)/(r + 2),   r ∈ {1, 2, ...}.

Now we consider the parameters τ_1^r and τ_2^r in PALM and iPALM. Even though our setting does not fulfill the assumptions of the convergence results of PALM and iPALM (Theorems 2.19 and 2.21), these results indicate that the step size 1/τ_1^r should be chosen proportional to 1/L_1(ν^{(r)}, µ^{(r)}, Σ^{(r)}), where L_1(ν^{(r)}, µ^{(r)}, Σ^{(r)}) is the Lipschitz constant of ∇_α L(·, ν^{(r)}, µ^{(r)}, Σ^{(r)}). Similarly, 1/τ_2^r should be chosen proportional to 1/L_2(α^{(r)}), where L_2(α^{(r)}) is the Lipschitz constant of ∇_{(ν,µ,Σ)} L(α^{(r)}, ·, ·, ·). Since L_1(ν, µ, Σ) and L_2(α) are not necessarily finite, we approximate the Lipschitz constants of ∇_α L(·, ν^{(r)}, µ^{(r)}, Σ^{(r)}) and ∇_{(ν,µ,Σ)} L(α^{(r)}, ·, ·, ·) locally by the following heuristics.

We consider the following well-known lemma.

Lemma 4.1 (Descent Lemma). Let f: ℝ^d → ℝ be a differentiable function with L-Lipschitz continuous gradient. Then it holds for any x, y ∈ ℝ^d that

f(y) ≤ f(x) + ⟨∇f(x), y − x⟩ + (L/2) ‖y − x‖_2².   (20)

In particular, (20) is a necessary condition for L to be a Lipschitz constant of ∇f. Now we use the following iterative heuristic to compute τ_1^r:

1. First we set τ_1^r to τ_1^{r−1}/2.
2. Now we perform the α-update from Algorithm 4.3 or 4.4, respectively.

3. Check if

L(α^{(r+1)}, ν^{(r)}, µ^{(r)}, Σ^{(r)}) ≤ L(α_z^{(r)}, ν^{(r)}, µ^{(r)}, Σ^{(r)}) + ⟨∇_α L(α_z^{(r)}, ν^{(r)}, µ^{(r)}, Σ^{(r)}), α^{(r+1)} − α_z^{(r)}⟩ + (τ_1^r/2) ‖α^{(r+1)} − α_z^{(r)}‖_2².   (21)

4. If (21) does not hold true, set τ_1^r to 2τ_1^r and repeat steps 2, 3 and 4.

We use a similar heuristic for τ_2^r:

1. First we set τ_2^r to τ_2^{r−1}/2.
2. Now we perform the ν, µ, Σ-update from Algorithm 4.3 or 4.4, respectively.

3. Check if

L(α^{(r+1)}, ν^{(r+1)}, µ^{(r+1)}, Σ^{(r+1)}) ≤ L(α^{(r+1)}, ν_z^{(r)}, µ_z^{(r)}, Σ_z^{(r)}) + ⟨∇_{(ν,µ,Σ)} L(α^{(r+1)}, ν_z^{(r)}, µ_z^{(r)}, Σ_z^{(r)}), (ν^{(r+1)}, µ^{(r+1)}, Σ^{(r+1)}) − (ν_z^{(r)}, µ_z^{(r)}, Σ_z^{(r)})⟩ + (τ_2^r/2) ‖(ν^{(r+1)}, µ^{(r+1)}, Σ^{(r+1)}) − (ν_z^{(r)}, µ_z^{(r)}, Σ_z^{(r)})‖_2².   (22)

4. If (22) does not hold true, set τ_2^r to 2τ_2^r and repeat steps 2, 3 and 4.
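For a plain (unprojected) gradient step, this backtracking heuristic can be sketched as follows. We assume the quadratic term in the check is (τ/2)‖·‖², matching the descent lemma with L ≈ τ; the function names are ours.

```python
import numpy as np

def backtracking_step(L, grad_L, x, tau_prev):
    """Descent-lemma backtracking for the step-size parameter tau (sketch):
    start from tau_prev / 2 and double tau until the quadratic upper bound
    holds for the gradient step x -> x - grad_L(x) / tau."""
    tau = tau_prev / 2
    g = grad_L(x)
    while True:
        x_new = x - g / tau
        # check: L(x_new) <= L(x) + <g, x_new - x> + (tau / 2) ||x_new - x||^2
        bound = L(x) + g @ (x_new - x) + 0.5 * tau * np.sum((x_new - x) ** 2)
        if L(x_new) <= bound:
            return x_new, tau
        tau *= 2
```

For a quadratic objective the loop doubles τ until it reaches an estimate of the gradient's Lipschitz constant, after which the step is accepted.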

4.1.1. Initialization

Since the likelihood function of a Student-t mixture model is high-dimensional and non-convex, the performance of the algorithms above depends crucially on the initialization. In the following, we present four strategies to initialize the parameters of a Student-t mixture model with K components.

1. Random initialization: The easiest initialization strategy is to randomly assign each data point x_i a class k_i ∈ {1, ..., K}, i = 1, ..., n, and to initialize the parameters of the k-th component as follows: denote by x_{k1}, ..., x_{kn_k} the data points with k_i = k. Then initialize (ν_k, µ_k, Σ_k) by

(ν_k, µ_k, Σ_k) = argmin_{(ν,µ,Σ) ∈ ℝ_{>0} × ℝ^d × SPD(d)} L̃(ν, µ, Σ | x_{k1}, ..., x_{kn_k}),

where L̃ denotes the negative log-likelihood function of the multivariate Student-t distribution. The solution of this minimization problem can be computed e.g. with Algorithm 3.3. Further, we initialize the parameter α_k by α_k = n_k/n.

In practice, the initialization generated with this approach can be far away from a minimizer of the negative log-likelihood function.

2. K-means clustering: A very intuitive idea is to cluster the data points x_1, ..., x_n into K classes. In our implementation we use the K-means algorithm to perform the clustering. Then we initialize the parameters of the k-th component of the mixture model as in the first strategy: denote by x_{k1}, ..., x_{kn_k} the data points which belong to class k. Now we initialize the parameters (ν_k, µ_k, Σ_k) by

(ν_k, µ_k, Σ_k) = argmin_{(ν,µ,Σ) ∈ ℝ_{>0} × ℝ^d × SPD(d)} L̃(ν, µ, Σ | x_{k1}, ..., x_{kn_k}),   (23)

where L̃ denotes the negative log-likelihood function of the multivariate Student-t distribution. The solution of this minimization problem can be computed e.g. with Algorithm 3.3. Again we initialize the parameter α_k by α_k = n_k/n.

In our numerical experiments, we observed that if we apply the K-means algorithm to Student-t distributed data, some of the classes are very small. This can be explained by the fact that the Student-t distribution has heavier tails than the normal distribution. However, small classes lead to numerical problems in (23). Further, this initialization strategy only makes sense if we assume that the location parameters of the components of the mixture model are pairwise distinct.

3. K-nearest neighbors: The next approach is to choose randomly K data points {y_1, ..., y_K} ⊂ {x_1, ..., x_n}. Then we initialize the parameters of the k-th component of the mixture model as follows: denote by {x_{k1}, ..., x_{km}} ⊂ {x_1, ..., x_n} the m nearest neighbors of y_k. Then we initialize the parameters (ν_k, µ_k, Σ_k) by

(ν_k, µ_k, Σ_k) = argmin_{(ν,µ,Σ) ∈ ℝ_{>0} × ℝ^d × SPD(d)} L̃(ν, µ, Σ | x_{k1}, ..., x_{km}),

where L̃ denotes the negative log-likelihood function of the multivariate Student-t distribution. Again we compute the solution of the minimization problem using Algorithm 3.3. Here we initialize α_k by α_k = 1/K.

This approach is used in [3] for Gaussian mixture models. Similar to the second approach, it only makes sense if we assume that the location parameters of the components of the mixture model are pairwise distinct.

4. Nearest center: This approach is a slight modification of the third approach. We again randomly choose K data points {y_1, ..., y_K} ⊂ {x_1, ..., x_n}. Then we assign to each data point x_1, ..., x_n a class k_1, ..., k_n ∈ {1, ..., K} by

k_i ∈ argmin_{k=1,...,K} ‖x_i − y_k‖_2,   i = 1, ..., n.

Now we initialize the parameters as in the first approach: denote by x_{k1}, ..., x_{kn_k} the data points which belong to class k and initialize (ν_k, µ_k, Σ_k) by

(ν_k, µ_k, Σ_k) = argmin_{(ν,µ,Σ) ∈ ℝ_{>0} × ℝ^d × SPD(d)} L̃(ν, µ, Σ | x_{k1}, ..., x_{kn_k}),

where L̃ denotes the negative log-likelihood function of the multivariate Student-t distribution. The solution of this minimization problem can be computed e.g. with Algorithm 3.3. Again we initialize the parameter α_k by α_k = n_k/n.

This approach is used in [35] for Gaussian mixture models. Again, it only makes sense if the location parameters of the components of the mixture model are not equal.
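The nearest-center partition underlying this initialization can be sketched as follows (names are ours; the per-class Student-t fits via Algorithm 3.3 are left out):

```python
import numpy as np

def nearest_center_partition(X, K, rng=None):
    """Nearest-center initialization (sketch): pick K random data points as
    centers, assign every sample to its closest center, and set
    alpha_k = n_k / n. Each returned class would then be fitted separately,
    e.g. with a single-component Student-t ML estimator."""
    rng = np.random.default_rng(rng)
    centers = X[rng.choice(len(X), size=K, replace=False)]
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    classes = [X[labels == k] for k in range(K)]
    alphas = np.array([len(c) / len(X) for c in classes])
    return classes, alphas
```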

In our simulation study in Section 4.1.2 we use the first approach. For the numerical examples in Section 4.3 we use the third approach with m ≈ n/K.

4.1.2. Simulation Study

In this section, we compare the numerical performance of the classical EM algorithm 4.1, the proposed Algorithm 4.2, the PALM algorithm 4.3 and the iPALM algorithm 4.4. To generate n i.i.d. samples x_1, ..., x_n of a given Student-t mixture model with parameters α ∈ ∆_K, ν = (ν_1, ..., ν_K), µ = (µ_1, ..., µ_K) ∈ ℝ^{d×K} and Σ = (Σ_1, ..., Σ_K) ∈ (SPD(d))^K we use the following procedure:

1. Generate i.i.d. samples k_1, ..., k_n from the distribution on {1, ..., K} defined by P({k}) = α_k, k = 1, ..., K.
2. Based on the stochastic representation of the Student-t distribution, see equation (2), draw independent realizations x_1, ..., x_n with x_i ∼ T_{ν_{k_i}}(µ_{k_i}, Σ_{k_i}).

To compare the algorithms we do the following Monte Carlo simulation: we draw n = 10000 samples from a Student-t mixture model as described above with parameters (α, ν, µ, Σ) ∈ ∆_K × ℝ^K_{>0} × ℝ^{d×K} × (SPD(d))^K. Then we use Algorithms 4.1, 4.2, 4.3 and 4.4 to compute the ML-estimator (α̂, ν̂, µ̂, Σ̂). We use random initialization (see Section 4.1.1) for all algorithms. To compute the zeros in Algorithms 4.1 and 4.2 we use Newton's method. As a stopping criterion we take the relative distance

√( ‖α^{(r+1)} − α^{(r)}‖² + ‖log(ν^{(r+1)}) − log(ν^{(r)})‖² + ‖µ^{(r+1)} − µ^{(r)}‖_F² + ∑_{k=1}^K ‖Σ_k^{(r+1)} − Σ_k^{(r)}‖_F² ) / √( ‖α^{(r)}‖² + ‖log(ν^{(r)})‖² + ‖µ^{(r)}‖_F² + ∑_{k=1}^K ‖Σ_k^{(r)}‖_F² ) < 10^{−5}.

We iterate this procedure N = 100 times. We generate the samples with the parameters K = 3, d = 2 and

α = (1/2, 1/3, 1/6),
µ = ( (1, 0)^T, (1, 0)^T, (1, 0)^T ),
Σ = ( (1, 0; 0, 2), (3, 0; 0, 2), (4, 1; 1, 2) ).

For ν we use different values. The resulting average numbers of steps and execution times of the algorithms are given in Table 2. To compare the quality of the estimates we also compute the average value of the negative log-likelihood function of the outcomes of the algorithms. In Figure 6 we plot the likelihood values of the estimates after r steps against the execution time of the first r steps of the algorithms.

Average number of iterations:
  ν                EM              Variant of EM   PALM            iPALM
  (1, 2, 5)        6246 ± 4398     5332 ± 3310     4279 ± 2983     3071 ± 4110
  (2, 4, 10)       7369 ± 4190     5525 ± 3574     4503 ± 3741     4270 ± 4800
  (5, 10, 25)      7940 ± 4235     5777 ± 3447     7823 ± 4171     7278 ± 6028
  (10, 20, 50)     6553 ± 3401     4615 ± 3959     4259 ± 3800     8794 ± 5896
  (20, 40, 100)    7134 ± 3769     4706 ± 3480     3412 ± 3054     8759 ± 5069

Average execution times:
  ν                EM              Variant of EM   PALM             iPALM
  (1, 2, 5)        57.35 ± 40.12   46.97 ± 29.57   69.59 ± 48.58    53.28 ± 70.56
  (2, 4, 10)       67.67 ± 38.71   49.94 ± 32.37   73.69 ± 61.25    74.83 ± 83.91
  (5, 10, 25)      67.15 ± 35.67   47.76 ± 29.30   117.28 ± 63.99   117.11 ± 99.30
  (10, 20, 50)     43.05 ± 22.24   29.43 ± 25.21   51.46 ± 45.94    110.54 ± 74.23
  (20, 40, 100)    46.98 ± 24.78   29.87 ± 22.06   41.10 ± 36.80    109.88 ± 63.52

Average negative log-likelihood values:
  ν                EM                  Variant of EM       PALM                iPALM
  (1, 2, 5)        46152.48 ± 202.03   46152.51 ± 201.91   46153.72 ± 201.69   46152.37 ± 201.82
  (2, 4, 10)       41635.15 ± 130.18   41635.12 ± 130.05   41639.15 ± 130.29   41636.01 ± 130.94
  (5, 10, 25)      38991.62 ± 127.23   38991.56 ± 129.10   38996.20 ± 126.73   38992.62 ± 126.76
  (10, 20, 50)     38163.39 ± 110.48   38163.54 ± 110.62   38169.00 ± 110.67   38165.02 ± 111.12
  (20, 40, 100)    37736.13 ± 104.48   37736.36 ± 104.44   37741.18 ± 104.39   37737.64 ± 103.92

Table 2: Average number of iterations (top), execution times (middle) and negative log-likelihood values (bottom) and the corresponding standard deviations of the different algorithms.

We observe that for every choice of $\nu$, Algorithm 4.2 has the lowest execution time. Compared with the EM algorithm 4.1, it also executes fewer steps and reaches a similar value of $L(\alpha^{(r)}, \nu^{(r)}, \mu^{(r)}, \Sigma^{(r)})$ before the stopping criterion is reached. Note that the execution times of a single step of Algorithms 4.1 and 4.2 are similar. In Figure 7 we plot the negative log-likelihood values of the estimates against the number of steps. If we compare PALM and iPALM, we observe that the values of the negative log-likelihood function of the results of PALM are larger, in particular for large $\nu$. This means that PALM stops earlier. Further, until the stopping criterion is reached, iPALM has a lower execution time for small $\nu$ and a higher execution time for large $\nu$. In

Figure 6: Plots of $L(\alpha^{(r)}, \nu^{(r)}, \mu^{(r)}, \Sigma^{(r)})$ on the y-axis and the execution time of the first $r$ steps on the x-axis for all algorithms, for $r = 1, \dots, 1000$. Panels (a)-(f) correspond to $\nu = (1, 2, 5)$, $(2, 4, 10)$, $(5, 10, 25)$, $(10, 20, 50)$, $(20, 40, 100)$ and $(50, 100, 250)$.

particular, the plots in Figure 6 indicate that the lower execution time of PALM is a consequence of the fact that PALM reaches the stopping criterion at a higher value of $L(\alpha^{(r)}, \nu^{(r)}, \mu^{(r)}, \Sigma^{(r)})$ than iPALM. We conclude that iPALM leads to better results than PALM. Overall, the variant of the EM algorithm, Algorithm 4.2, is the fastest algorithm. Therefore we use Algorithm 4.2 for our numerical experiments in Section 4.3.

4.2. Superresolution

In this section we consider the problem of superresolution, that is, creating a high resolution image from a low resolution image. Based on the assumption that the patches of natural images approximately follow a Student-t mixture model, we introduce two algorithms for this task. The algorithms were originally proposed for Gaussian mixture models in [45] and [35].

4.2.1. Expected Patch Log-Likelihood for Student-t Mixture Models

For a given Student-t mixture model with density function $p$ defined by (16), we formulate the Expected Patch Log-Likelihood (EPLL) algorithm proposed in [45]. In practice we estimate the parameters of the mixture model from a given high resolution image using Algorithm 4.2. We use the notations from [31]. For an image $x \in \mathbb{R}^N$ let $y \in \mathbb{R}^M$ be an observation generated by $y = Ax + w$, where $A \in \mathbb{R}^{M\times N}$ is a matrix and $w$ is Gaussian noise with standard deviation $\sigma$. Here, the matrix $A$ is a superresolution operator $A = SH$, which consists of a blur operator $H \in \mathbb{R}^{N\times N}$ and a downsampling operator $S \in \mathbb{R}^{M\times N}$. Using another choice for $A$, the method can also be applied to other problems in imaging (see [31, 45]). Now, for a given observation $y \in \mathbb{R}^M$ we want to reconstruct $x \in \mathbb{R}^N$. The rough idea of this algorithm is to approximate the solutions of the optimization problem

\[
\operatorname*{argmin}_{x\in\mathbb{R}^N}\ \frac{d}{2\sigma^2}\|Ax-y\|_2^2 - \sum_{i\in I}\log p(P_i(x)), \tag{24}
\]
where $P_i$ extracts the $i$-th patch of $x$. The solution of this minimization problem can be interpreted as the MAP estimator (see Section 2.1.3) of $x$ using the prior that the patches of $x$ follow the Student-t mixture model $p$. For simplicity we assume that each pixel is covered by at least one patch. Since this is a large non-convex problem, we use

Figure 7: Plots of $L(\alpha^{(r)}, \nu^{(r)}, \mu^{(r)}, \Sigma^{(r)})$ on the y-axis and the number of steps $r$ on the x-axis for Algorithms 4.1 and 4.2. Panels (a)-(f) correspond to $\nu = (1, 2, 5)$, $(2, 4, 10)$, $(5, 10, 25)$, $(10, 20, 50)$, $(20, 40, 100)$ and $(50, 100, 250)$.

the half-quadratic splitting method [14] and consider the problem

\[
\operatorname*{argmin}_{x\in\mathbb{R}^N}\ \frac{d}{2\sigma^2}\|Ax-y\|_2^2 + \frac{\beta}{2}\sum_{i\in I}\|P_i x - z_i\|_2^2 - \sum_{i\in I}\log p(z_i). \tag{25}
\]

For $\beta \to \infty$ this problem is equivalent to (24). To approximate a solution of (25) we use the alternating optimization scheme

\begin{align}
\hat z_i &:= \operatorname*{argmin}_{z_i\in\mathbb{R}^d}\ \frac{\beta}{2}\|P_i\hat x - z_i\|_2^2 - \log p(z_i), \tag{26}\\
\hat x &:= \operatorname*{argmin}_{x\in\mathbb{R}^N}\ \frac{d}{2\sigma^2}\|Ax-y\|_2^2 + \frac{\beta}{2}\sum_{i\in I}\|P_i x - \hat z_i\|_2^2. \tag{27}
\end{align}

The Hessian of (27) is given by $\frac{d}{\sigma^2}A^{\mathrm T}A + \beta\sum_{i\in I}P_i^{\mathrm T}P_i$. Note that $P_i^{\mathrm T}P_i$ is an $N\times N$ diagonal matrix with diagonal entries $1$ for all pixels which are covered by the $i$-th patch and $0$ for all pixels which are not. Since we assume that each pixel is covered by at least one patch, the Hessian of (27) is positive definite. Thus (27) is strictly convex and we derive its solution by setting its derivative to zero. This can be rewritten as

\[
\hat x = \Big(A^{\mathrm T}A + \frac{\beta\sigma^2}{d}\sum_{i\in I}P_i^{\mathrm T}P_i\Big)^{-1}\Big(A^{\mathrm T}y + \frac{\beta\sigma^2}{d}\sum_{i\in I}P_i^{\mathrm T}\hat z_i\Big).
\]
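The structure of this update is easy to mirror in code. The following sketch solves the linear system behind (27); the dense matrices and the representation of each patch operator $P_i$ by an index array are our own simplifications, and for realistic image sizes one would use an iterative solver instead.

```python
import numpy as np

def x_update(A, y, patch_indices, z_hat, beta, sigma, d):
    """Solve (A^T A + (beta sigma^2 / d) sum_i P_i^T P_i) x
             = A^T y + (beta sigma^2 / d) sum_i P_i^T z_i.
    Each patch operator P_i is given as an index array into x."""
    c = beta * sigma ** 2 / d
    M = A.T @ A
    rhs = A.T @ y
    for idx, z in zip(patch_indices, z_hat):
        M[idx, idx] += c      # P_i^T P_i: ones on the covered diagonal entries
        rhs[idx] += c * z     # P_i^T z_i: scatter the patch back into the image
    return np.linalg.solve(M, rhs)
```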

Since we still cannot solve (26) directly, we use an approximation proposed in [45] for Gaussian mixture models: We select $k_i^* \in \{1, \dots, K\}$ which maximizes the likelihood that the patch $P_i\hat x$ belongs to the $k_i^*$-th component. Then we compute

\[
\hat z_i := \operatorname*{argmin}_{z_i}\ \frac{\beta}{2}\|P_i\hat x - z_i\|_2^2 - \log f_{k_i^*}(z_i).
\]

Now we get Algorithm 4.5. We iterate it for an increasing sequence of β.

Note that if we choose the patches such that each pixel is covered by exactly $d$ patches, then we can rewrite step 4 of Algorithm 4.5 as

\[
\hat x = \big(A^{\mathrm T}A + \beta\sigma^2\,\mathrm{Id}\big)^{-1}\Big(A^{\mathrm T}y + \frac{\beta\sigma^2}{d}\sum_{i\in I}P_i^{\mathrm T}\hat z_i\Big). \tag{28}
\]

Algorithm 4.5 EPLL for Student-t mixture models
Input: initialization $\hat x \in \mathbb{R}^N$, low resolution image $y \in \mathbb{R}^M$, superresolution operator $A \in \mathbb{R}^{M\times N}$, Student-t mixture model $(\alpha, \nu, \mu, \Sigma) \in \Delta_K \times \mathbb{R}_{>0}^K \times (\mathbb{R}^d)^K \times \mathrm{SPD}(d)^K$ and parameters $\beta \in \mathbb{R}_{>0}$ and $\sigma \in \mathbb{R}_{>0}$.
Output: high resolution image $\hat x \in \mathbb{R}^N$.
for $i \in I$ do
1. $\tilde z_i = P_i\hat x$.
2. $k_i^* = \operatorname*{argmin}_{1\le k\le K}\ -\log(\alpha_k) - \log(f_k(\tilde z_i))$.
3. $\hat z_i = \operatorname*{argmin}_{z_i}\ \frac{\beta}{2}\|\tilde z_i - z_i\|^2 + \frac{d+\nu_{k_i^*}}{2}\log\big(1 + \frac{1}{\nu_{k_i^*}}(z_i - \mu_{k_i^*})^{\mathrm T}\Sigma_{k_i^*}^{-1}(z_i - \mu_{k_i^*})\big)$.
4. $\hat x = \big(A^{\mathrm T}A + \frac{\beta\sigma^2}{d}\sum_{i\in I}P_i^{\mathrm T}P_i\big)^{-1}\big(A^{\mathrm T}y + \frac{\beta\sigma^2}{d}\sum_{i\in I}P_i^{\mathrm T}\hat z_i\big)$.

Now we show that for sufficiently large $\beta$ the optimization problem in step 3 has a unique solution, i.e. that the function
\[
\phi(z) = \frac{\beta}{2}\|\tilde z_i - z\|^2 + \frac{d+\nu_{k_i^*}}{2}\log\Big(1 + \frac{1}{\nu_{k_i^*}}(z - \mu_{k_i^*})^{\mathrm T}\Sigma_{k_i^*}^{-1}(z - \mu_{k_i^*})\Big)
\]
has a unique minimizer. Further, we provide an efficient way to compute this minimizer.

Using the notation $\Sigma_{k_i^*} = P^{\mathrm T}DP$ for an orthogonal matrix $P$ and a diagonal matrix $D = \operatorname{diag}(\lambda_1, \dots, \lambda_d)$, we rewrite $\phi$ as
\[
\phi(z) = \frac{\beta}{2}\|P\tilde z_i - Pz\|^2 + \frac{d+\nu_{k_i^*}}{2}\log\Big(1 + \frac{1}{\nu_{k_i^*}}(Pz - P\mu_{k_i^*})^{\mathrm T}D^{-1}(Pz - P\mu_{k_i^*})\Big).
\]

Now use the notations $\tilde z = P\tilde z_i$, $\mu = P\mu_{k_i^*}$ and $\nu = \nu_{k_i^*}$. Define
\[
\psi(z) = \phi(P^{\mathrm T}z) = \frac{\beta}{2}\|\tilde z - z\|_2^2 + \frac{d+\nu}{2}\log\Big(1 + \frac{1}{\nu}(z-\mu)^{\mathrm T}D^{-1}(z-\mu)\Big).
\]
Thus it holds that
\[
\operatorname*{argmin}_{z\in\mathbb{R}^d}\phi(z) = P^{\mathrm T}\operatorname*{argmin}_{z\in\mathbb{R}^d}\psi(z).
\]

Lemma 4.2. There exists a global minimum of $\psi$. Further, if $\beta \ge \frac{1}{2\lambda_j}\frac{d+\nu}{\nu}$ for all $j = 1, \dots, d$, then $\psi$ has a unique global minimizer, which is the unique critical point of $\psi$. Moreover, it can be computed by
\[
C = \text{zero of } g(x) = \sum_{j=1}^d \frac{1}{\lambda_j}\Bigg(\frac{\beta + x\frac{\beta}{\nu}}{\beta + x\frac{\beta}{\nu} + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)}\Bigg)^2(\tilde z_j - \mu_j)^2 - x,
\]
\[
\hat z = \Big(\big(\beta + C\tfrac{\beta}{\nu}\big)\mathrm{Id} + \big(\tfrac{d}{\nu}+1\big)D^{-1}\Big)^{-1}\Big(\big(\beta + C\tfrac{\beta}{\nu}\big)\tilde z + \big(\tfrac{d}{\nu}+1\big)D^{-1}\mu\Big).
\]

In particular, the lemma yields that for sufficiently large $\beta$ there exists a unique critical point of $\phi$, which is the global minimizer.
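A direct NumPy implementation of Lemma 4.2 might look as follows. This is a sketch under the assumption $\beta \ge \frac{1}{2\lambda_j}\frac{d+\nu}{\nu}$; a plain bisection stands in for Newton's method for simplicity.

```python
import numpy as np

def prox_student_t(z_tilde, mu, Sigma, nu, beta):
    """Unique minimizer of beta/2 ||z_tilde - z||^2
       + (d+nu)/2 log(1 + (z-mu)^T Sigma^{-1} (z-mu)/nu), via Lemma 4.2."""
    d = len(z_tilde)
    lam, P = np.linalg.eigh(Sigma)        # Sigma = P diag(lam) P^T
    zt, m = P.T @ z_tilde, P.T @ mu       # rotate into the eigenbasis
    b = (d / nu + 1) / lam

    def g(x):                             # the scalar equation for C
        a = beta + x * beta / nu
        return np.sum((a / (a + b)) ** 2 * (zt - m) ** 2 / lam) - x

    lo, hi = 0.0, 1.0
    while g(hi) > 0:                      # bracket the unique root
        hi *= 2.0
    for _ in range(200):                  # bisection
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    C = 0.5 * (lo + hi)
    a = beta + C * beta / nu
    return P @ ((a * zt + b * m) / (a + b))   # componentwise solution
```

A quick sanity check is that the gradient of the objective vanishes at the returned point.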

Proof. 1. We observe that ψ is coercive and continuous. This yields the existence of a global minimizer.

2. We show the uniqueness of critical points. The gradient of ψ is given by

\[
\nabla\psi(z) = \beta(z - \tilde z) + \Big(\frac{d}{\nu}+1\Big)\frac{D^{-1}(z-\mu)}{1 + \frac{1}{\nu}(z-\mu)^{\mathrm T}D^{-1}(z-\mu)}.
\]
Thus $z$ is a critical point of $\psi$ if and only if
\[
0 = \beta(z - \tilde z) + \frac{\beta}{\nu}(z-\mu)^{\mathrm T}D^{-1}(z-\mu)(z - \tilde z) + \Big(\frac{d}{\nu}+1\Big)D^{-1}(z-\mu).
\]

Setting $C = C(z) := (z-\mu)^{\mathrm T}D^{-1}(z-\mu)$, this becomes
\[
0 = \beta(z - \tilde z) + \frac{\beta}{\nu}C(z - \tilde z) + \Big(\frac{d}{\nu}+1\Big)D^{-1}(z-\mu). \tag{29}
\]

We rewrite (29) as

\[
\Big(\beta + C\frac{\beta}{\nu}\Big)\tilde z + \Big(\frac{d}{\nu}+1\Big)D^{-1}\mu = \Big(\Big(\beta + C\frac{\beta}{\nu}\Big)\mathrm{Id} + \Big(\frac{d}{\nu}+1\Big)D^{-1}\Big)z.
\]

This becomes

\[
z = \Big(\Big(\beta + C\frac{\beta}{\nu}\Big)\mathrm{Id} + \Big(\frac{d}{\nu}+1\Big)D^{-1}\Big)^{-1}\Big(\Big(\beta + C\frac{\beta}{\nu}\Big)\tilde z + \Big(\frac{d}{\nu}+1\Big)D^{-1}\mu\Big). \tag{30}
\]

Thus we get for the components zj of z, j = 1, ..., d that

\[
z_j = \frac{\big(\beta + C\frac{\beta}{\nu}\big)\tilde z_j + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)\mu_j}{\beta + C\frac{\beta}{\nu} + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)}.
\]

Inserting this in the definition of C we get

\begin{align*}
C &= \sum_{j=1}^d \frac{1}{\lambda_j}(z_j - \mu_j)^2\\
&= \sum_{j=1}^d \frac{1}{\lambda_j}\Bigg(\frac{\big(\beta + C\frac{\beta}{\nu}\big)\tilde z_j + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)\mu_j}{\beta + C\frac{\beta}{\nu} + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)} - \frac{\big(\beta + C\frac{\beta}{\nu}\big)\mu_j + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)\mu_j}{\beta + C\frac{\beta}{\nu} + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)}\Bigg)^2\\
&= \sum_{j=1}^d \frac{1}{\lambda_j}\Bigg(\frac{\beta + C\frac{\beta}{\nu}}{\beta + C\frac{\beta}{\nu} + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)}\Bigg)^2(\tilde z_j - \mu_j)^2.
\end{align*}

This is equivalent to

\[
0 = \sum_{j=1}^d \frac{1}{\lambda_j}\Bigg(\frac{\beta + C\frac{\beta}{\nu}}{\beta + C\frac{\beta}{\nu} + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)}\Bigg)^2(\tilde z_j - \mu_j)^2 - C. \tag{31}
\]

Hence it is sufficient to solve (31) and insert the solution into (30) to solve the original problem. To see that this solution is unique, we show that (31) has a unique solution. We see that for $C < 0$ the right hand side of (31) is greater than zero. Thus (31) is equivalent to

\begin{align*}
0 &= \sum_{j=1}^d \frac{1}{\lambda_j}\frac{1}{C}\Bigg(\frac{\beta + C\frac{\beta}{\nu}}{\beta + C\frac{\beta}{\nu} + \frac{1}{\lambda_j}\big(\frac{d}{\nu}+1\big)}\Bigg)^2(\tilde z_j - \mu_j)^2 - 1\\
&= \sum_{j=1}^d \frac{1}{\lambda_j}\frac{1}{C}(\tilde z_j - \mu_j)^2\Bigg(1 - \frac{d+\nu}{\beta\lambda_j(\nu+C) + (d+\nu)}\Bigg)^2 - 1 \tag{32}\\
&= \sum_{j=1}^d \frac{1}{\lambda_j}\frac{1}{C}(\tilde z_j - \mu_j)^2\Bigg(1 - \frac{1}{\beta\lambda_j\frac{\nu+C}{\nu+d}+1}\Bigg)^2 - 1, \qquad C > 0.
\end{align*}

The derivative with respect to C of the right hand side is given by

\begin{align*}
&\sum_{j=1}^d \frac{1}{\lambda_j}(\tilde z_j - \mu_j)^2\Bigg(-\frac{1}{C^2}\Big(1 - \frac{1}{\beta\lambda_j\frac{\nu+C}{\nu+d}+1}\Big)^2 + \frac{2}{C}\Big(1 - \frac{1}{\beta\lambda_j\frac{\nu+C}{\nu+d}+1}\Big)\frac{\frac{\lambda_j\beta}{\nu+d}}{\big(\beta\lambda_j\frac{\nu+C}{\nu+d}+1\big)^2}\Bigg)\\
&= -\sum_{j=1}^d \frac{1}{\lambda_j}(\tilde z_j - \mu_j)^2\frac{1}{C^2}\Big(1 - \frac{1}{\beta\lambda_j\frac{\nu+C}{\nu+d}+1}\Big)\Bigg(1 - \frac{(\nu+d)\big(\beta\lambda_j\frac{\nu+C}{\nu+d}+1\big) + 2C\beta\lambda_j}{(\nu+d)\big(\beta\lambda_j\frac{\nu+C}{\nu+d}+1\big)^2}\Bigg)\\
&= -\sum_{j=1}^d \frac{1}{\lambda_j}(\tilde z_j - \mu_j)^2\frac{1}{C^2}\Big(1 - \frac{1}{\beta\lambda_j\frac{\nu+C}{\nu+d}+1}\Big)\Bigg(1 - \frac{\beta\lambda_j\frac{\nu+3C}{\nu+d}+1}{\big(\beta\lambda_j\frac{\nu+C}{\nu+d}+1\big)^2}\Bigg).
\end{align*}

This is smaller than zero if the last factor is greater than zero for all $j = 1, \dots, d$. This is the case if for all $j = 1, \dots, d$ it holds

\[
\beta\lambda_j\frac{\nu+3C}{\nu+d} + 1 \le \Big(\beta\lambda_j\frac{\nu+C}{\nu+d} + 1\Big)^2.
\]

This is equivalent to

\[
0 \le \frac{\beta\lambda_j}{\nu+d}C^2 + \Big(2\beta\lambda_j\frac{\nu}{\nu+d} - 1\Big)C + \beta\lambda_j\frac{\nu^2}{\nu+d} + \nu. \tag{33}
\]

If $\beta \ge \frac{1}{2\lambda_j}\frac{d+\nu}{\nu}$, then all coefficients in (33) are nonnegative and (33) is fulfilled. In this case, the right hand side of (32) is strictly decreasing. Further, it tends to $\infty$ as $C \to 0$ and to $-1$ as $C \to \infty$. So (31) has exactly one solution on $\mathbb{R}_{\ge 0}$. Now (30) has exactly one solution for $z$.

Remark 4.3. In [31], three modifications of the EPLL for Gaussian mixture models were suggested, resulting in a dramatic acceleration of the algorithm. The first modification is called flat tail spectrum approximation and reduces the dimension of the vectors in steps 1 to 3. The second approximation speeds up the selection of $k_i^*$ using a balanced search tree. However, if we consider Student-t mixture models, the bottleneck of Algorithm 4.5 is the computation of the constant $C$ in step 3 using Lemma 4.2. This step cannot be accelerated by these two modifications. The third modification proposed in [31] is the restriction to a random subset of patches. This can also be easily implemented for Algorithm 4.5 and works as follows:

Instead of computing the loop in Algorithm 4.5 for each patch of $\hat x$, we compute it only for a random subset of the patches. To ensure that each pixel is covered by at least one patch, we consider a regular grid with spacing $s \in [1, \sqrt d]$. Now for each point $(i_0, j_0)$ on the grid we choose one patch with location $(i, j)$ uniformly at random such that
\[
i_0 - \Big\lfloor\frac{\sqrt d - s}{2}\Big\rfloor \le i \le i_0 + \Big\lfloor\frac{\sqrt d - s}{2}\Big\rfloor
\quad\text{and}\quad
j_0 - \Big\lfloor\frac{\sqrt d - s}{2}\Big\rfloor \le j \le j_0 + \Big\lfloor\frac{\sqrt d - s}{2}\Big\rfloor.
\]

This method of patch sampling is also called "jittering", see [10]. Note that the simplification (28) of step 4 in Algorithm 4.5 is no longer applicable, since each pixel is now covered by fewer than $d$ patches. Nevertheless, the authors of [31] suggest using the simplification

\[
\hat x = \big(A^{\mathrm T}A + \beta\sigma^2\,\mathrm{Id}\big)^{-1}\Bigg(A^{\mathrm T}y + \beta\sigma^2\Big(\sum_{i\in I}P_i^{\mathrm T}P_i\Big)^{-1}\sum_{i\in I}P_i^{\mathrm T}\hat z_i\Bigg),
\]
since
\[
\Big(\sum_{i\in I}P_i^{\mathrm T}P_i\Big)^{-1}\sum_{i\in I}P_i^{\mathrm T}\hat z_i
\]

corresponds just to an averaging of the patches, and in the case that each pixel is covered by exactly $d$ patches it holds
\[
\Big(\sum_{i\in I}P_i^{\mathrm T}P_i\Big)^{-1} = \frac{1}{d}\,\mathrm{Id}.
\]
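The jittered patch sampling can be sketched as follows. The clipping at the image border is our own simplification and is not specified in the text above.

```python
import numpy as np

def jitter_positions(height, width, side, s, rng=None):
    """Sample one patch location per grid cell ("jittering"); side = sqrt(d) is
    the patch side length and s in [1, side] the grid spacing."""
    rng = np.random.default_rng(rng)
    r = (side - s) // 2                    # jitter radius floor((sqrt(d)-s)/2)
    positions = []
    for i0 in range(0, height - side + 1, s):
        for j0 in range(0, width - side + 1, s):
            i = min(max(i0 + rng.integers(-r, r + 1), 0), height - side)
            j = min(max(j0 + rng.integers(-r, r + 1), 0), width - side)
            positions.append((int(i), int(j)))
    return positions
```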

4.2.2. Joint Student-t Mixture Models

We adapt the method proposed in [35] for Student-t mixture models. Given some pairs $(h_i, l_i)$, $i = 1, \dots, N$, with $h_i \in \mathbb{R}^{(q\tau)^2}$, $l_i \in \mathbb{R}^{\tau^2}$ of high resolution and low resolution patches, we estimate the parameters of a Student-t mixture model describing the vectors $v_i := \binom{h_i}{l_i}$, $i = 1, \dots, N$, using Algorithm 4.2. We get a Student-t mixture model
\[
p(x) = \sum_{k=1}^K \alpha_k f_k(x), \qquad
f_k(x) = \frac{\Gamma\big(\frac{d+\nu_k}{2}\big)}{\Gamma\big(\frac{\nu_k}{2}\big)(\pi\nu_k)^{\frac d2}|\Sigma_k|^{\frac12}}\Big(1 + \frac{1}{\nu_k}(x-\mu_k)^{\mathrm T}\Sigma_k^{-1}(x-\mu_k)\Big)^{-\frac{d+\nu_k}{2}}
\]
with
\[
\mu_k = \begin{pmatrix}\mu_{H_k}\\ \mu_{L_k}\end{pmatrix}
\quad\text{and}\quad
\Sigma_k = \begin{pmatrix}\Sigma_{H_k} & \Sigma_{HL_k}\\ \Sigma_{HL_k}^{\mathrm T} & \Sigma_{L_k}\end{pmatrix}.
\]
Using this mixture model we estimate for each low resolution patch the corresponding high resolution patch separately: Given a low resolution patch $y$, we first compute the likelihood $\gamma_k$ that $y$ belongs to the $k$-th component of our mixture model, i.e.
\[
\gamma_k = \alpha_k f(y\,|\,\nu_k, \mu_{L_k}, \Sigma_{L_k}),
\]
where $f$ is the density function of the Student-t distribution. Then we select the component $k^*$ of the mixture model such that the likelihood that $y$ belongs to it is maximal, i.e.
\[
k^* = \operatorname*{argmax}_{1\le k\le K}\gamma_k.
\]
Now we estimate the high resolution patch $\hat x$ using the minimal mean square error (MMSE) estimator, assuming that $\binom{x}{y}$ is a realization of a random vector $\binom{X}{Y} \sim T_{\nu_{k^*}}(\mu_{k^*}, \Sigma_{k^*})$. The MMSE estimator given $y$ is defined by

\[
\hat x = \operatorname*{argmin}_{Z}\operatorname{MSE}(Z\,|\,Y=y), \quad\text{where}\quad \operatorname{MSE}(Z\,|\,Y=y) = \mathbb{E}\big((Z - X)^2\,|\,Y=y\big).
\]

One can rewrite the MMSE estimator as $\hat x = \mathbb{E}(X\,|\,Y=y)$, see e.g. [3]. The following lemma was proven in [13].

Lemma 4.4. Let $x \in \mathbb{R}^{d_1}$, $y \in \mathbb{R}^{d_2}$ be realizations of random vectors $X$ and $Y$ with
\[
\begin{pmatrix}X\\ Y\end{pmatrix} \sim T_\nu\left(\begin{pmatrix}\mu_H\\ \mu_L\end{pmatrix}, \begin{pmatrix}\Sigma_H & \Sigma_{HL}\\ \Sigma_{HL}^{\mathrm T} & \Sigma_L\end{pmatrix}\right).
\]

Then the conditional distribution of X given Y = y reads as

\[
P_{(X|Y=y)} = T_{\nu+d_2}\left(\mu_H + \Sigma_{HL}\Sigma_L^{-1}(y - \mu_L),\ \frac{\nu + (y-\mu_L)^{\mathrm T}\Sigma_L^{-1}(y-\mu_L)}{\nu + d_2}\big(\Sigma_H - \Sigma_{HL}\Sigma_L^{-1}\Sigma_{HL}^{\mathrm T}\big)\right).
\]

Since for $\nu > 1$ the expectation of a Student-t distributed random variable is given by its location parameter, Lemma 4.4 yields that
\[
\hat x = \mu_{H_{k^*}} + \Sigma_{HL_{k^*}}\Sigma_{L_{k^*}}^{-1}(y - \mu_{L_{k^*}}).
\]
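This component selection and conditional-mean step translate directly into code. The sketch below is our own (the function names and the list-based parameter layout are assumptions); it implements the Student-t log-density, the selection of the most likely component and the conditional mean.

```python
import numpy as np
from math import lgamma, log, pi

def t_logpdf(x, nu, mu, Sigma):
    """Log-density of the multivariate Student-t distribution T_nu(mu, Sigma)."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)
    _, logdet = np.linalg.slogdet(Sigma)
    return (lgamma((d + nu) / 2) - lgamma(nu / 2) - d / 2 * log(pi * nu)
            - logdet / 2 - (d + nu) / 2 * log(1 + quad / nu))

def mmse_high_res_patch(y, alpha, nu, mu_H, mu_L, Sigma_HL, Sigma_L):
    """Select the most likely component for the low resolution patch y and
    return the conditional mean of the high resolution patch (needs nu_k > 1)."""
    scores = [log(a) + t_logpdf(y, n, m, S)
              for a, n, m, S in zip(alpha, nu, mu_L, Sigma_L)]
    k = int(np.argmax(scores))
    return mu_H[k] + Sigma_HL[k] @ np.linalg.solve(Sigma_L[k], y - mu_L[k])
```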

To combine the estimates for the patches we use Gaussian windows. For this, we define the weights
\[
w_{ij} = \exp\left(-\frac{\gamma}{2}\left(\Big(i - \frac{q\tau+1}{2}\Big)^2 + \Big(j - \frac{q\tau+1}{2}\Big)^2\right)\right).
\]

Now, for a patch covering the pixels $\{k+1, \dots, k+q\tau\} \times \{l+1, \dots, l+q\tau\}$ we assign the weight $w_{ij}$ to the pixel $(k+i, l+j)$. Then we compute for each pixel the weighted mean of the values from the patches overlapping this pixel. We summarize the method in Algorithm 4.6.
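The Gaussian window above can be sketched as:

```python
import numpy as np

def gaussian_window(q_tau, gamma):
    """Patch weights w_ij = exp(-gamma/2 ((i - c)^2 + (j - c)^2))
    with c = (q*tau + 1)/2 and 1-based indices i, j = 1, ..., q*tau."""
    idx = np.arange(1, q_tau + 1)
    c = (q_tau + 1) / 2
    w = np.exp(-gamma / 2 * (idx - c) ** 2)
    return np.outer(w, w)
```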

Algorithm 4.6 Joint Student-t mixture models (Student-t MMSE)
Input: low resolution image $y \in \mathbb{R}^M$, Student-t mixture model $(\alpha, \nu, \mu, \Sigma) \in \Delta_K \times \mathbb{R}_{>0}^K \times (\mathbb{R}^d)^K \times \mathrm{SPD}(d)^K$ and the parameter $\gamma \in \mathbb{R}_{>0}$.
Output: high resolution image $\hat x \in \mathbb{R}^N$.
for $i \in I$ do
1. $y_i = P_i y$.
2. $k_i^* = \operatorname*{argmin}_{1\le k\le K}\ -\log(\alpha_k) - \log(f(y_i\,|\,\nu_k, \mu_{L_k}, \Sigma_{L_k}))$.
3. $\hat z_i = \mu_{H_{k_i^*}} + \Sigma_{HL_{k_i^*}}\Sigma_{L_{k_i^*}}^{-1}(y_i - \mu_{L_{k_i^*}})$.
4. Average the patches $\{\hat z_i\}_{i\in I}$ by $\hat x_{kl} = \frac{\sum_{i\in I} w_{ikl}\hat z_{ikl}}{\sum_{i\in I} w_{ikl}}$, where the weights $w_{ikl}$ are determined as described above.

4.3. Numerical Results

In this section we apply our methods to some example images. First, in Section 4.3.1, we compare our results with the results from [3]. Then, in Section 4.3.2, we apply our method to material images generated with FIB-SEM microscopy.

As error measures we choose the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM). For two images $f, g \in [0,1]^{m\times n}$ the PSNR is defined by
\[
\operatorname{PSNR}(f, g) := 10\log_{10}\left(\frac{mn}{\sum_{i=1}^m\sum_{j=1}^n (f_{ij} - g_{ij})^2}\right).
\]
This means that a higher PSNR indicates a higher similarity of $f$ and $g$. Further, the SSIM of $f$ and $g$ is defined by
\[
\operatorname{SSIM}(f, g) := \frac{(2\mu_f\mu_g + 0.01^2)(2\sigma_{fg} + 0.03^2)}{(\mu_f^2 + \mu_g^2 + 0.01^2)(\sigma_f^2 + \sigma_g^2 + 0.03^2)},
\]
where $\mu_f$, $\mu_g$, $\sigma_f^2$, $\sigma_g^2$ and $\sigma_{fg}$ are defined as follows:

• $\mu_f = \frac{1}{mn}\sum_{i=1}^m\sum_{j=1}^n f_{ij}$ and $\mu_g = \frac{1}{mn}\sum_{i=1}^m\sum_{j=1}^n g_{ij}$ are the means of $f$ and $g$.

• $\sigma_f^2 = \frac{1}{mn}\sum_{i=1}^m\sum_{j=1}^n (f_{ij} - \mu_f)^2$ and $\sigma_g^2 = \frac{1}{mn}\sum_{i=1}^m\sum_{j=1}^n (g_{ij} - \mu_g)^2$ are the variances of $f$ and $g$.

• $\sigma_{fg} = \frac{1}{mn}\sum_{i=1}^m\sum_{j=1}^n (f_{ij} - \mu_f)(g_{ij} - \mu_g)$ is the covariance of $f$ and $g$.

By some simple calculations one can see that $\operatorname{SSIM}(f, g) \in (0, 1]$ for all $f, g \in [0,1]^{m\times n}$ and that $\operatorname{SSIM}(f, g) = 1$ if and only if $f = g$. Thus, an SSIM close to one indicates a high similarity between $f$ and $g$.
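Both error measures are straightforward to compute. The following sketch (the helper names are our own) implements the PSNR and the global, single-window SSIM as defined above:

```python
import numpy as np

def psnr(f, g):
    """PSNR for images with values in [0, 1]: 10 log10(mn / sum (f-g)^2)."""
    return 10 * np.log10(1.0 / np.mean((f - g) ** 2))

def ssim_global(f, g):
    """Global (single-window) SSIM with the constants 0.01^2 and 0.03^2."""
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    mf, mg = f.mean(), g.mean()
    cov = ((f - mf) * (g - mg)).mean()
    return ((2 * mf * mg + c1) * (2 * cov + c2)) / \
           ((mf ** 2 + mg ** 2 + c1) * (f.var() + g.var() + c2))
```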

4.3.1. Comparison to Gaussian Mixture models

The methods described in the previous sections were implemented for Gaussian mixture models several times (see e.g. [3, 31, 35, 45]). To compare our implementation for Student-t mixture models with Gaussian mixture models we use the results from [3]2. The implementation used in [3] is based on the implementation of [31]3.

2https://github.com/BenoitAune/Stage-Super-Resolution 3https://github.com/pshibby/fepll_public

For the comparison we use the superresolution operator $A$ from the implementation of [31]. This operator consists of a blur operator $H$ and a downsampling operator $S$. The blur operator is given by a convolution with a Gaussian kernel of width $0.5$. The downsampling operator $S : \mathbb{R}^{m,n} \to \mathbb{R}^{m_2,n_2}$ is given by
\[
S = \frac{m_2 n_2}{mn}\,\mathcal{F}_{m_2,n_2}^{-1}\, D\, \mathcal{F}_{m,n},
\]
where $\mathcal{F}_{m,n}$ is the discrete Fourier transform and where for $x \in \mathbb{R}^{m,n}$ the $(i,j)$-th entry of $D(x)$ is given by
\[
D(x)_{i,j} = \begin{cases}
x_{i,j}, & \text{if } i \le \frac{m_2}{2} \text{ and } j \le \frac{n_2}{2},\\
x_{i+m-m_2,j}, & \text{if } i > \frac{m_2}{2} \text{ and } j \le \frac{n_2}{2},\\
x_{i,j+n-n_2}, & \text{if } i \le \frac{m_2}{2} \text{ and } j > \frac{n_2}{2},\\
x_{i+m-m_2,j+n-n_2}, & \text{if } i > \frac{m_2}{2} \text{ and } j > \frac{n_2}{2}.
\end{cases}
\]
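A minimal sketch of the downsampling operator $S$ (assuming even $m_2$, $n_2$; the 0-based FFT indexing is our own bookkeeping) reads:

```python
import numpy as np

def fourier_downsample(x, m2, n2):
    """Apply S = (m2 n2)/(m n) F^{-1} D F: keep the m2 x n2 lowest frequencies."""
    m, n = x.shape
    X = np.fft.fft2(x)
    rows = np.r_[0:m2 // 2, m - m2 // 2:m]
    cols = np.r_[0:n2 // 2, n - n2 // 2:n]
    D = X[np.ix_(rows, cols)]
    return (m2 * n2) / (m * n) * np.fft.ifft2(D).real
```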

We create the low resolution image $y$ by applying the operator $A$ to some example images of size $512 \times 512$ and adding Gaussian noise with standard deviation $\sigma = 2$. We estimate a mixture model based on the upper left quarter of the high resolution image. Then we apply the two methods described in the previous sections. Figure 8 shows the results for $K = 200$ components. The resulting PSNRs and SSIMs with respect to the original image are given in Table 3. We refer to the methods described in Sections 4.2.1 and 4.2.2 as Student-t EPLL and Student-t MMSE, respectively. With GMM-EPLL and GMM-MMSE we refer to the methods proposed in [45] and [35], respectively. The results for the GMM-EPLL and GMM-MMSE are taken from [3]. On the low resolution image, we use the patch size $\tau = 4$ and compute our results for the magnification factor $q = 2$. For the Student-t MMSE we used the parameter $\gamma = 0.12$. Further, we compare our results with the L2-TV minimizer, i.e. the solution of the minimization problem

\[
\hat x = \operatorname*{argmin}_{x\in\mathbb{R}^N}\|Ax - y\|_2^2 + \lambda\|x\|_{TV}.
\]

We observe that the result of the Student-t EPLL is in most cases slightly better than the result of the GMM-EPLL. The differences between the results of the Student-t MMSE and GMM-MMSE are very small. Note that the estimation of the mixture model for the MMSE might be imprecise: We have only approximately 15000 samples to estimate a mixture model of dimension 80 with $K = 100, 200, 300$ components. The parameters of this mixture model live on a manifold of dimension approximately 325000 for $K = 100$ and 975000 for $K = 300$. Thus the number of samples might not be sufficient for an accurate estimation of the mixture model.

Image      | Algorithm        | K = 100 PSNR / SSIM | K = 200 PSNR / SSIM | K = 300 PSNR / SSIM
Cameraman  | GMM-EPLL         | 32.46 / 0.774 | 32.44 / 0.770 | 32.43 / 0.769
           | Student-t EPLL   | 32.20 / 0.770 | 32.21 / 0.771 | 32.17 / 0.769
           | GMM-MMSE         | 32.00 / 0.771 | 32.08 / 0.782 | 32.16 / 0.787
           | Student-t MMSE   | 32.10 / 0.775 | 32.09 / 0.779 | 31.84 / 0.770
           | L2-TV, λ = 0.25  | 31.25 / 0.767 | –             | –
Barbara    | GMM-EPLL         | 25.15 / 0.777 | 25.15 / 0.778 | 25.14 / 0.777
           | Student-t EPLL   | 25.21 / 0.773 | 25.32 / 0.793 | 25.30 / 0.791
           | GMM-MMSE         | 25.01 / 0.776 | 25.01 / 0.778 | 25.01 / 0.778
           | Student-t MMSE   | 24.92 / 0.766 | 24.94 / 0.770 | 24.90 / 0.769
           | L2-TV, λ = 0.07  | 25.17 / 0.754 | –             | –
Hill       | GMM-EPLL         | 31.15 / 0.827 | 31.19 / 0.828 | 31.21 / 0.828
           | Student-t EPLL   | 31.48 / 0.844 | 31.39 / 0.843 | 31.52 / 0.845
           | GMM-MMSE         | 30.90 / 0.835 | 30.99 / 0.838 | 31.14 / 0.844
           | Student-t MMSE   | 30.95 / 0.834 | 31.08 / 0.838 | 31.15 / 0.841
           | L2-TV, λ = 0.35  | 31.02 / 0.834 | –             | –

Table 3: Results of the superresolution methods.

Note that the assumptions of the EPLL and the MMSE differ: The EPLL assumes that the superresolution operator $A$ is known. For the MMSE we consider $A$ as unknown. Further, the MMSE assumes for the estimation of the mixture model that pairs of high resolution and low resolution patches are given. These pairs have to belong to the same part of the image. For the EPLL we only need the high resolution patches to estimate the mixture model.

4.3.2. FIB-SEM images

We apply the Student-t EPLL (see Section 4.2.1) to material images produced by Focused Ion Beam and Scanning Electron Microscopy (FIB-SEM). Here we are given some images (see Figure 9) of resolution $2560 \times 2560$. To test our method we estimate a Student-t mixture model based on the patches of one of the images. We artificially reduce the resolution of the other image using the operator $A$ from the previous section and reconstruct it using the Student-t EPLL. We use magnification factors $q \in \{4, 8\}$. The results are given in Figures 10, 11, 12 and 13. We compare them with L2-TV with $\lambda = 0.05$. We observe that the error measures of the results are not very good. One reason for this can be that the original images are very noisy.

Figure 8: Reconstruction of the high resolution images using the Student-t EPLL with $K = 200$ classes. Each row shows a high resolution image, the corresponding low resolution image and the reconstructed image.

5. Conclusion and Future Work

In Section 3 we proposed Algorithms 3.2, 3.3 and 3.4 as alternatives to the classical EM algorithm 3.1 for estimating the parameters of a multivariate Student-t distribution, in particular the degree of freedom parameter $\nu$. We showed that for all of these algorithms the negative log-likelihood function decreases in each iteration step and that for Algorithms 3.2 and 3.3 cluster points of the iterates are critical points. To prove that the whole sequence of iterates converges is still open. Further, we examined the convergence behavior of all algorithms in some numerical experiments and observed that Algorithm 3.3 has the best performance.

In Section 4 we considered Student-t mixture models. To estimate the parameters of the mixture models, we proposed Algorithms 4.2, 4.3 and 4.4 as alternatives to the classical EM algorithm 4.1. We compared these algorithms in a simulation study and observed that Algorithm 4.2 has the best performance. However, a theoretical result concerning the convergence of Algorithms 4.2, 4.3 and 4.4 is still open. We used Student-t mixture models for superresolution by adapting two superresolution methods which were originally proposed for Gaussian mixture models. Finally, we provided some proof-of-concept examples using these methods. To refine these numerical examples and to compare or combine them with other methods is left for further research. In particular, a comparison to further methods would be interesting. Furthermore, we would like to adapt the numerical examples to 3D images.

A. Examples for the EM Algorithm

A.1. EM Algorithm for Student-t distributions

In this section we derive the classical EM algorithm for the multivariate Student-t distribution $T_\nu(\mu, \Sigma)$, where the degree of freedom $\nu > 0$ is unknown. We follow the lines of [22]. We can represent a Student-t distributed random variable $X$ as
\[
X = \mu + \frac{RZ}{\sqrt Y}
\]

Figure 9: Material images (Material I, images I and II; Material II, images I and II).

Figure 10: Reconstruction of the high resolution image using the Student-t EPLL with $K = 100$ classes and magnification factor $q = 4$ (PSNR = 27.59, SSIM = 0.499). The columns show the high resolution image, the low resolution image, the EPLL result and the L2-TV result; in the second, third and fourth row we zoom into different regions of the image. We compare our results with L2-TV ($\lambda = 0.05$, PSNR = 27.70, SSIM = 0.528).

Figure 11: Reconstruction of the high resolution image using the Student-t EPLL with $K = 100$ classes and magnification factor $q = 8$ (PSNR = 24.32, SSIM = 0.414). The columns show the high resolution image, the low resolution image, the EPLL result and the L2-TV result; in the second, third and fourth row we zoom into different regions of the image. We compare our results with L2-TV ($\lambda = 0.05$, PSNR = 24.37, SSIM = 0.411).

Figure 12: Reconstruction of the high resolution image using the Student-t EPLL with $K = 100$ classes and magnification factor $q = 4$ (PSNR = 29.80, SSIM = 0.734). The columns show the high resolution image, the low resolution image, the EPLL result and the L2-TV result; in the second, third and fourth row we zoom into different regions of the image. We compare our results with L2-TV ($\lambda = 0.05$, PSNR = 30.11, SSIM = 0.7480).

Figure 13: Reconstruction of the high resolution image using the Student-t EPLL with $K = 100$ classes and magnification factor $q = 8$ (PSNR = 23.34, SSIM = 0.5853). The columns show the high resolution image, the low resolution image, the EPLL result and the L2-TV result; in the second, third and fourth row we zoom into different regions of the image. We compare our results with L2-TV ($\lambda = 0.05$, PSNR = 23.75, SSIM = 0.5872).

with $Z \sim N(0, \mathrm{Id})$, $RR^{\mathrm T} = \Sigma$ and $Y \sim \Gamma(\frac\nu2, \frac\nu2)$, where $Y$ and $Z$ are independent. Now let $x_1, \dots, x_n$ be i.i.d. samples of $X_i = \mu + \frac{RZ_i}{\sqrt{Y_i}} \sim T_\nu(\mu, \Sigma)$ and let $y_1, \dots, y_n$ be the corresponding i.i.d. realizations of $Y_i \sim \Gamma(\frac\nu2, \frac\nu2)$. Then the distribution of $X_i$ given $Y_i = y_i$ is given by
\[
(X_i\,|\,Y_i = y_i) \sim N\Big(\mu, \frac{\Sigma}{y_i}\Big).
\]
Thus, for given $y_i$, we can estimate the parameters $\mu$ and $\Sigma$ using the sample mean and covariance. Now we derive the two steps of the EM algorithm as follows:

E-Step: It holds by Bayes formula that

\[
f_{X,Y}(x, y\,|\,\vartheta) = f_{(X|Y=y)}(x\,|\,\vartheta)\,f_Y(y\,|\,\vartheta).
\]

This yields that

\[
\log(f_{X,Y}(x, y\,|\,\vartheta)) = \log(f_{(X|Y=y)}(x\,|\,\vartheta)) + \log(f_Y(y\,|\,\vartheta)).
\]

Thus the complete log-likelihood function $\log(f(x, y\,|\,\vartheta))$ is up to some constants given by
\begin{align*}
\log(f_{X,Y}(x, y\,|\,\vartheta)) =\ &-\frac n2\log(|\Sigma|) + \frac d2\sum_{i=1}^n\log(y_i) - \frac12\sum_{i=1}^n y_i(x_i-\mu)^{\mathrm T}\Sigma^{-1}(x_i-\mu) + \frac{n\nu}2\log\Big(\frac\nu2\Big)\\
&- n\log\Big(\Gamma\Big(\frac\nu2\Big)\Big) + \Big(\frac\nu2 - 1\Big)\sum_{i=1}^n\log(y_i) - \frac\nu2\sum_{i=1}^n y_i + \mathrm{const}.
\end{align*}

Now we compute the Q-function. Note that Bayes formula yields that

\[
f_{(Y|X=x)}(y\,|\,\vartheta)\,f_X(x\,|\,\vartheta) = f_{(X|Y=y)}(x\,|\,\vartheta)\,f_Y(y\,|\,\vartheta).
\]

Thus it holds

\begin{align*}
f_{(Y|X=x)}(y) &\propto \exp\Big(-\frac12(x-\mu)^{\mathrm T}\Big(\frac\Sigma y\Big)^{-1}(x-\mu)\Big)\Big|\frac\Sigma y\Big|^{-\frac12}\,y^{\frac\nu2-1}\exp\Big(-\frac\nu2 y\Big)\mathbb{1}_{\mathbb{R}_{\ge0}}(y)\\
&\propto y^{\frac{\nu+d}2-1}\exp\Big(-\frac12\big(\nu + (x-\mu)^{\mathrm T}\Sigma^{-1}(x-\mu)\big)y\Big)\mathbb{1}_{\mathbb{R}_{\ge0}}(y).
\end{align*}

Up to a normalization constant, this is the density function of a gamma distribution. Thus it holds for the conditional distribution of $Y$ given $X = x$ with respect to $P_\vartheta$ that

\[
P_{(Y|X=x)} = \Gamma\Big(\frac{\nu+d}2,\ \frac{\nu + (x-\mu)^{\mathrm T}\Sigma^{-1}(x-\mu)}2\Big).
\]

This yields
\[
\mathbb{E}_{P_\vartheta}(Y\,|\,X=x) = \frac{\nu+d}{\nu + (x-\mu)^{\mathrm T}\Sigma^{-1}(x-\mu)}.
\]
Now the following lemma allows us to compute the conditional expectation of $\log(Y)$ given $X = x$. For the proof we refer to [22].

Lemma A.1. Let $Y \sim \Gamma(\alpha, \beta)$. Then it holds $\mathbb{E}(\log(Y)) = \psi(\alpha) - \log(\beta)$.

Thus we have that

\[
\mathbb{E}_{P_\vartheta}(\log(Y)\,|\,X=x) = \psi\Big(\frac{\nu+d}2\Big) - \log\Big(\frac{\nu + (x-\mu)^{\mathrm T}\Sigma^{-1}(x-\mu)}2\Big).
\]

Putting everything together we get (up to some constants) that

\begin{align*}
Q(\vartheta, \vartheta^{(r)}) =\ &\mathbb{E}_{P_{\vartheta^{(r)}}}\big(\log(f_{X,Y}(X, Y\,|\,\vartheta))\,|\,X=x\big)\\
=\ &-\frac n2\log(|\Sigma|) + \frac d2\sum_{i=1}^n\Big(\psi\Big(\frac{\nu^{(r)}+d}2\Big) - \log\Big(\frac{\nu^{(r)} + (x_i-\mu^{(r)})^{\mathrm T}(\Sigma^{(r)})^{-1}(x_i-\mu^{(r)})}2\Big)\Big)\\
&-\frac12\sum_{i=1}^n\frac{\nu^{(r)}+d}{\nu^{(r)} + (x_i-\mu^{(r)})^{\mathrm T}(\Sigma^{(r)})^{-1}(x_i-\mu^{(r)})}(x_i-\mu)^{\mathrm T}\Sigma^{-1}(x_i-\mu)\\
&+\frac{n\nu}2\log\Big(\frac\nu2\Big) - n\log\Big(\Gamma\Big(\frac\nu2\Big)\Big)\\
&+\Big(\frac\nu2 - 1\Big)\sum_{i=1}^n\Big(\psi\Big(\frac{\nu^{(r)}+d}2\Big) - \log\Big(\frac{\nu^{(r)} + (x_i-\mu^{(r)})^{\mathrm T}(\Sigma^{(r)})^{-1}(x_i-\mu^{(r)})}2\Big)\Big)\\
&-\frac\nu2\sum_{i=1}^n\frac{\nu^{(r)}+d}{\nu^{(r)} + (x_i-\mu^{(r)})^{\mathrm T}(\Sigma^{(r)})^{-1}(x_i-\mu^{(r)})}.
\end{align*}

Using the notation
\[
\gamma_i^{(r)} = \frac{\nu^{(r)}+d}{\nu^{(r)} + (x_i-\mu^{(r)})^{\mathrm T}(\Sigma^{(r)})^{-1}(x_i-\mu^{(r)})},
\]
the Q-function reads as follows

\[
Q(\vartheta, \vartheta^{(r)}) = Q_1(\nu, \vartheta^{(r)}) + Q_2((\mu, \Sigma), \vartheta^{(r)}) + C, \tag{34}
\]
where $C$ does not depend on $\vartheta$ and $Q_1$ and $Q_2$ are given by
\[
Q_1(\nu, \vartheta^{(r)}) = \frac{n\nu}2\log\Big(\frac\nu2\Big) - n\log\Big(\Gamma\Big(\frac\nu2\Big)\Big) + \frac{n\nu}2\psi\Big(\frac{\nu^{(r)}+d}2\Big) - \frac{n\nu}2\log\Big(\frac{\nu^{(r)}+d}2\Big) + \frac\nu2\sum_{i=1}^n\big(\log(\gamma_i^{(r)}) - \gamma_i^{(r)}\big)
\]
and
\[
Q_2((\mu, \Sigma), \vartheta^{(r)}) = -\frac n2\log(|\Sigma|) - \frac12\sum_{i=1}^n\gamma_i^{(r)}(x_i-\mu)^{\mathrm T}\Sigma^{-1}(x_i-\mu).
\]

M-Step: We maximize the Q function (34). For this we have to maximize Q1 and Q2.

Note that $Q_2$ is up to some constants the log-likelihood function of normal distributed random variables with mean $\mu$ and covariances $\frac{1}{\gamma_i^{(r)}}\Sigma$. Thus the maximizers of $Q_2$ are given by

\[
\mu^{(r+1)} = \frac{\sum_{i=1}^n\gamma_i^{(r)}x_i}{\sum_{i=1}^n\gamma_i^{(r)}}, \qquad
\Sigma^{(r+1)} = \frac1n\sum_{i=1}^n\gamma_i^{(r)}(x_i - \mu^{(r+1)})(x_i - \mu^{(r+1)})^{\mathrm T}.
\]

We derive the maximizer of $Q_1$ by setting the derivative with respect to $\nu$ to zero. Thus the maximizer is given by the solution of
\[
0 = -\psi\Big(\frac\nu2\Big) + \log\Big(\frac\nu2\Big) + \psi\Big(\frac{\nu^{(r)}+d}2\Big) - \log\Big(\frac{\nu^{(r)}+d}2\Big) + 1 + \frac1n\sum_{i=1}^n\big(\log(\gamma_i^{(r)}) - \gamma_i^{(r)}\big).
\]

The solution exists and is unique because the right hand side is strictly decreasing in $\nu$ and tends to $\infty$ as $\nu \to 0$.

Now the whole algorithm is given by Algorithm 3.1.
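The E-step weights and the $\mu$, $\Sigma$ updates of this EM algorithm can be sketched as follows. In this sketch $\nu$ is kept fixed; the scalar root-finding for the $\nu$ update, which requires the digamma function, is omitted.

```python
import numpy as np

def em_step(x, nu, mu, Sigma):
    """One EM iteration for T_nu(mu, Sigma) with fixed nu:
    gamma_i = (nu + d) / (nu + (x_i - mu)^T Sigma^{-1} (x_i - mu)),
    then weighted mean and weighted covariance (scaled by 1/n)."""
    n, d = x.shape
    diff = x - mu
    delta = np.einsum('ij,ij->i', diff @ np.linalg.inv(Sigma), diff)
    gamma = (nu + d) / (nu + delta)
    mu_new = gamma @ x / gamma.sum()
    diff_new = x - mu_new
    Sigma_new = (gamma[:, None] * diff_new).T @ diff_new / n
    return mu_new, Sigma_new
```

For very large $\nu$ the weights $\gamma_i$ approach $1$ and the update reduces to the ordinary sample mean, which gives a quick sanity check.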

A.2. EM Algorithm for Mixture Models

In the following we derive the EM algorithm for mixture models. We again follow the lines of [22]. Let $f(x\,|\,\vartheta)$ be the density function of a parametric distribution. We consider a mixture model with $K$ components and density function
\[
f(x\,|\,\alpha, \vartheta) = \sum_{k=1}^K\alpha_k f(x\,|\,\vartheta_k), \qquad \vartheta = (\vartheta_1, \dots, \vartheta_K),\ \alpha \in \Delta_K.
\]

Given some i.i.d. samples $x_1, \dots, x_n$ we want to estimate $(\alpha, \vartheta)$. For $i = 1, \dots, n$ we introduce the labels $z_i = (z_{i1}, \dots, z_{iK})$, where $z_{ij} = 1$ if the sample $x_i$ was generated by the $j$-th component and $z_{ij} = 0$ otherwise. Now the log-likelihood function with respect to $x$ and

87 z is given by

$z$ is given by
\[
\ell(\alpha, \vartheta\,|\,x, z) = \log\Big(\prod_{i=1}^n f(x_i, z_i\,|\,\alpha, \vartheta)\Big) = \sum_{i=1}^n\sum_{k=1}^K z_{ik}\log(\alpha_k f(x_i\,|\,\vartheta_k)).
\]

Now we derive the two steps of the EM algorithm.

E-Step: Since ` is linear in z we get that

\begin{align*}
Q((\alpha, \vartheta), (\alpha^{(r)}, \vartheta^{(r)})) &= \mathbb{E}_{P_{(\alpha^{(r)},\vartheta^{(r)})}}(\ell(\alpha, \vartheta\,|\,X, Z)\,|\,X=x)\\
&= \sum_{i=1}^n\sum_{k=1}^K\mathbb{E}_{P_{(\alpha^{(r)},\vartheta^{(r)})}}(Z_{ik}\log(\alpha_k f(X_i\,|\,\vartheta_k))\,|\,X=x)\\
&= \sum_{i=1}^n\sum_{k=1}^K\mathbb{E}_{P_{(\alpha^{(r)},\vartheta^{(r)})}}(Z_{ik}\,|\,X=x)\log(\alpha_k f(x_i\,|\,\vartheta_k))\\
&= \ell(\alpha, \vartheta\,|\,x, \mathbb{E}_{P_{(\alpha^{(r)},\vartheta^{(r)})}}(Z\,|\,X=x))\\
&= \ell(\alpha, \vartheta\,|\,x, \beta^{(r)}),
\end{align*}

(r) (r) where β = (βik )i∈{1,...,n},k∈{1,...,K} is given by

(r) (r) (r) αk f(xi ϑk ) β = EP (r) (r) (Zik X = x) = P(α(r),ϑ(r))(Zik = 1 X = x) = | . ik (α ,ϑ ) PK (r) (r) | | α f(xi ϑ ) j=1 j | j M-Step: For fixed ϑ Θ we maximize Q((α, ϑ), (α(r), ϑ(r))) under the constraint that ∈ PK g(α) = 0, where g(α) = αk 1 using Lagrange multipliers. It holds k=1 − ∂ g(α) = 1 ∂αk and n ∂ 1 X (r) Q((α, ϑ), (α(r), ϑ(r))) = β . ∂α α ik k k i=1 Thus for any maximizer αˆ of Q((α, ϑ), (α(r), ϑ(r))) there exists λ R such that for all ∈ k = 1, ..., K it holds n X (r) αˆk = λ βik . i=1

Using the constraint $g(\hat\alpha) = 0$ we get
\[
1 = \lambda\sum_{k=1}^K\sum_{i=1}^n\beta_{ik}^{(r)} = \lambda n.
\]

Therefore it holds $\hat\alpha_k = \frac1n\sum_{i=1}^n\beta_{ik}^{(r)}$. Further, it holds
\[
\hat\vartheta = \operatorname*{argmax}_{\vartheta\in\Theta^K}\sum_{k=1}^K\sum_{i=1}^n\beta_{ik}^{(r)}\log(\alpha_k f(x_i\,|\,\vartheta_k)) = \Big(\operatorname*{argmax}_{\vartheta_k\in\Theta}\sum_{i=1}^n\beta_{ik}^{(r)}\log(f(x_i\,|\,\vartheta_k))\Big)_{k=1,\dots,K}.
\]

Since $\hat\alpha$ and $\hat\vartheta$ do not depend on each other, the maximizer of $Q$ is given by $(\hat\alpha, \hat\vartheta)$. We summarize the algorithm in Algorithm A.1.

Algorithm A.1 EM Algorithm for Mixture Models
Input: $x = (x_1, \dots, x_n) \in \mathbb{R}^{d_1\times n}$, initial estimates $\alpha^{(0)} \in \Delta_K$, $\vartheta^{(0)} \in \Theta^K$
for $r = 0, 1, \dots$ do
E-Step: For $k = 1, \dots, K$ compute
\[
\beta_{ik}^{(r)} = \frac{\alpha_k^{(r)} f(x_i\,|\,\vartheta_k^{(r)})}{\sum_{j=1}^K\alpha_j^{(r)} f(x_i\,|\,\vartheta_j^{(r)})}.
\]
M-Step: For $k = 1, \dots, K$ compute
\[
\alpha_k^{(r+1)} = \frac1n\sum_{i=1}^n\beta_{ik}^{(r)}, \qquad
\vartheta_k^{(r+1)} = \operatorname*{argmax}_{\vartheta_k}\Big\{\sum_{i=1}^n\beta_{ik}^{(r)}\log(f(x_i\,|\,\vartheta_k))\Big\}.
\]
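The E-step and the weight update of Algorithm A.1 can be sketched as follows; the log-sum-exp normalization is our own numerical safeguard, and `logpdf` stands for any component log-density.

```python
import numpy as np

def em_mixture_step(x, alpha, theta, logpdf):
    """Responsibilities beta_ik and the weight update alpha_k = mean_i beta_ik."""
    logw = np.log(alpha)[None, :] + np.array(
        [[logpdf(xi, th) for th in theta] for xi in x])
    logw -= logw.max(axis=1, keepdims=True)     # stabilize the softmax
    beta = np.exp(logw)
    beta /= beta.sum(axis=1, keepdims=True)
    alpha_new = beta.mean(axis=0)
    return beta, alpha_new
```

The component parameters $\vartheta_k$ are then updated by separate weighted maximum likelihood problems, as in the M-step above.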

A.3. EM Algorithm for Student-t Mixture Models

Now we consider the EM algorithm for Student-t mixture models. Here the density function is given by
\[
f(x\,|\,\alpha, \vartheta) = \sum_{k=1}^K\alpha_k f(x\,|\,\vartheta_k), \qquad \vartheta_k = (\nu_k, \mu_k, \Sigma_k) \in \mathbb{R}_{>0}\times\mathbb{R}^d\times\mathrm{SPD}(d),
\]
where
\[
f(x\,|\,\nu, \mu, \Sigma) = \frac{\Gamma\big(\frac{d+\nu}2\big)}{\Gamma\big(\frac\nu2\big)(\pi\nu)^{\frac d2}|\Sigma|^{\frac12}}\Big(1 + \frac1\nu(x-\mu)^{\mathrm T}\Sigma^{-1}(x-\mu)\Big)^{-\frac{d+\nu}2}.
\]

We assign to each random variable $X_i$ the random variables $Z_{ik}$, $k = 1, \dots, K$, with $Z_{ik} = 1$ if and only if $X_i$ belongs to the $k$-th component of the mixture model. Further, we assign to $X_i$ the random variable $Y_i$ such that the conditional distribution of $(Y_i\,|\,Z_{ik} = 1)$ is given by $\Gamma(\frac{\nu_k}2, \frac{\nu_k}2)$ and the distribution of $(X_i\,|\,Y_i = y_i, Z_{ik} = 1)$ is given by $N\big(\mu_k, \frac{\Sigma_k}{y_i}\big)$.

E-Step: Using Bayes formula, the Q-function is given by

\begin{align*}
Q((\alpha, \vartheta), (\alpha^{(r)}, \vartheta^{(r)})) =\ &\mathbb{E}_{P_{(\alpha^{(r)},\vartheta^{(r)})}}(\log(f_{Y,Z}(Y, Z\,|\,\alpha, \vartheta))\,|\,X=x)\\
=\ &\mathbb{E}_{P_{(\alpha^{(r)},\vartheta^{(r)})}}(\log(f_{(Y|Z=Z)}(Y\,|\,\alpha, \vartheta))\,|\,X=x) \tag{35}\\
&+ \mathbb{E}_{P_{(\alpha^{(r)},\vartheta^{(r)})}}(\log(f_Z(Z\,|\,\alpha, \vartheta))\,|\,X=x).
\end{align*}

Now the first summand is given by

\[
\sum_{i=1}^n\sum_{k=1}^K P_{(\alpha^{(r)},\vartheta^{(r)})}(Z_{ik}=1\,|\,X=x)\,\mathbb{E}_{P_{(\alpha^{(r)},\vartheta^{(r)})}}(\log(f_{(Y_i|Z_{ik}=1)}(y_i\,|\,\alpha, \vartheta))\,|\,X=x).
\]

By Section A.2 we know that $P_{\vartheta^{(r)}}(Z_{ik}=1\,|\,X=x) = \beta_{ik}^{(r)}$ and by Section A.1 we know that

\[
\mathbb{E}_{P_{\vartheta^{(r)}}}(\log(f_{(Y_i|Z_{ik}=1)}(y_i\,|\,\alpha, \vartheta))\,|\,X=x) = Q_{i,k,1}(\nu_k, \vartheta^{(r)}) + Q_{i,k,2}((\mu_k, \Sigma_k), \vartheta^{(r)}) + \mathrm{const},
\]
where
\[
Q_{i,k,1}(\nu_k, \vartheta^{(r)}) = \frac{\nu_k}2\log\Big(\frac{\nu_k}2\Big) - \log\Big(\Gamma\Big(\frac{\nu_k}2\Big)\Big) + \frac{\nu_k}2\psi\Big(\frac{\nu_k^{(r)}+d}2\Big) - \frac{\nu_k}2\log\Big(\frac{\nu_k^{(r)}+d}2\Big) + \frac{\nu_k}2\big(\log(\gamma_{ik}^{(r)}) - \gamma_{ik}^{(r)}\big)
\]
and
\[
Q_{i,k,2}((\mu_k, \Sigma_k), \vartheta^{(r)}) = -\frac12\log(|\Sigma_k|) - \frac12\gamma_{ik}^{(r)}(x_i-\mu_k)^{\mathrm T}\Sigma_k^{-1}(x_i-\mu_k)
\]
with
\[
\gamma_{ik}^{(r)} = \frac{\nu_k^{(r)}+d}{\nu_k^{(r)} + (x_i-\mu_k^{(r)})^{\mathrm T}(\Sigma_k^{(r)})^{-1}(x_i-\mu_k^{(r)})}.
\]

Now the second summand in (35) is given by
\[
\sum_{i=1}^n\sum_{k=1}^K P_{(\alpha^{(r)},\vartheta^{(r)})}(Z_{ik}=1\,|\,X=x)\log(f_Z(1\,|\,\alpha, \vartheta)) = \sum_{i=1}^n\sum_{k=1}^K\beta_{ik}^{(r)}\log(\alpha_k).
\]

Thus the Q-function is given by

(r) (r) (r) (r) (r) (r) (r) (r) Q((α, ϑ), (α , ϑ )) = Q1(ν, (α , ϑ )) + Q2((µ, Σ), (α , ϑ )) + Q3(α, (α , ϑ )), where

$$\begin{aligned} Q_1(\nu, (\alpha^{(r)}, \vartheta^{(r)})) &= \sum_{k=1}^K \sum_{i=1}^n \beta_{ik}^{(r)}\, Q_{i,k,1}(\nu_k, \vartheta^{(r)}), \\ Q_2((\mu, \Sigma), (\alpha^{(r)}, \vartheta^{(r)})) &= \sum_{k=1}^K \sum_{i=1}^n \beta_{ik}^{(r)}\, Q_{i,k,2}((\mu_k, \Sigma_k), \vartheta^{(r)}), \\ Q_3(\alpha, (\alpha^{(r)}, \vartheta^{(r)})) &= \sum_{k=1}^K \sum_{i=1}^n \beta_{ik}^{(r)} \log(\alpha_k). \end{aligned}$$

M-Step: We maximize $Q_1$, $Q_2$ and $Q_3$ separately. With the same calculation as in the M-step of Section A.2, the maximizer of $Q_3$ is given by

$$\hat\alpha_k = \frac{1}{n}\sum_{i=1}^n \beta_{ik}^{(r)}.$$

Using the same arguments as in Section A.1, the maximizer of $\sum_{i=1}^n \beta_{ik}^{(r)} Q_{i,k,2}((\mu_k, \Sigma_k), \vartheta^{(r)})$ is given by

$$\hat\mu_k = \frac{\sum_{i=1}^n \beta_{ik}^{(r)} \gamma_{ik}^{(r)} x_i}{\sum_{i=1}^n \beta_{ik}^{(r)} \gamma_{ik}^{(r)}}, \qquad \hat\Sigma_k = \frac{\sum_{i=1}^n \beta_{ik}^{(r)} \gamma_{ik}^{(r)} (x_i - \hat\mu_k)(x_i - \hat\mu_k)^\mathrm{T}}{\sum_{i=1}^n \beta_{ik}^{(r)}},$$

for $k = 1, \ldots, K$. Again by the same arguments as in Section A.1, the maximizer $\hat\nu_k$ of $\sum_{i=1}^n \beta_{ik}^{(r)} Q_{i,k,1}(\nu_k, \vartheta^{(r)})$ is given by the solution of

$$0 = \psi\Big(\frac{d+\nu_k^{(r)}}{2}\Big) - \log\Big(\frac{d+\nu_k^{(r)}}{2}\Big) - \psi\Big(\frac{\nu_k}{2}\Big) + \log\Big(\frac{\nu_k}{2}\Big) + 1 + \frac{\sum_{i=1}^n \beta_{ik}^{(r)}\big(\log(\gamma_{ik}^{(r)}) - \gamma_{ik}^{(r)}\big)}{\sum_{i=1}^n \beta_{ik}^{(r)}}.$$
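This equation has no closed-form solution, but since $\log(x) - \psi(x)$ is strictly decreasing, its left-hand side is strictly decreasing in $\nu_k$ and the root can be found by any bracketing method. A sketch using `scipy.optimize.brentq` (illustrative, not from the thesis; the bracket endpoints are our own choice):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma

def update_nu(beta_k, gamma_k, nu_r, d, bracket=(1e-4, 1e4)):
    """Solve the nu-update equation for one component k.

    beta_k, gamma_k: per-sample weights beta_ik^(r), gamma_ik^(r);
    nu_r: previous iterate nu_k^(r); bracket: assumed to contain the root."""
    c = (digamma((d + nu_r) / 2) - np.log((d + nu_r) / 2) + 1
         + (beta_k * (np.log(gamma_k) - gamma_k)).sum() / beta_k.sum())
    # -psi(nu/2) + log(nu/2) decreases strictly from +inf to 0, so the root
    # is unique whenever c < 0
    g = lambda nu: -digamma(nu / 2) + np.log(nu / 2) + c
    return brentq(g, *bracket)
```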

The resulting algorithm is given in Algorithm 4.1.
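The M-step updates $\hat\mu_k$ and $\hat\Sigma_k$ above can be sketched as follows (illustrative code, not from the thesis; `beta_k` and `gamma_k` denote the per-sample weights of one fixed component $k$):

```python
import numpy as np

def m_step_mu_sigma(X, beta_k, gamma_k):
    """Weighted M-step updates for one component k:
    mu_hat    = sum_i beta*gamma*x_i / sum_i beta*gamma,
    Sigma_hat = sum_i beta*gamma*(x_i - mu_hat)(x_i - mu_hat)^T / sum_i beta."""
    bg = beta_k * gamma_k
    mu_hat = (bg[:, None] * X).sum(axis=0) / bg.sum()
    diff = X - mu_hat
    # weighted sum of outer products (x_i - mu_hat)(x_i - mu_hat)^T
    Sigma_hat = (bg[:, None, None] * diff[:, :, None] * diff[:, None, :]).sum(axis=0) / beta_k.sum()
    return mu_hat, Sigma_hat
```

With all weights equal to one, these updates reduce to the sample mean and the (biased) sample covariance, which serves as a quick consistency check.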

B. Auxiliary Lemmas

Lemma B.1. Let $x_i \in \mathbb{R}^d$, $i = 1, \ldots, n$, and $w \in \mathring\Delta_n$ fulfill Assumption 3.1. Let $(\nu_r, \Sigma_r)_r$ be a sequence in $\mathbb{R}_{>0} \times \mathrm{SPD}(d)$ with $\nu_r \to 0$ as $r \to \infty$ (or such that $(\nu_r)_r$ has a subsequence which converges to zero). Then $(\nu_r, \Sigma_r)_r$ cannot be a minimizing sequence of $L(\nu, \Sigma)$.

Proof. We write

$$L(\nu, \Sigma) = g(\nu) + L_\nu(\Sigma), \quad \text{where} \quad g(\nu) = 2\log\Big(\Gamma\Big(\frac{\nu}{2}\Big)\Big) - 2\log\Big(\Gamma\Big(\frac{d+\nu}{2}\Big)\Big) - \nu\log(\nu).$$

Then it holds $\lim_{\nu \to 0} g(\nu) = \infty$. Hence it is sufficient to show that $(\nu_r, \Sigma_r)_r$ has a subsequence $(\nu_{r_k}, \Sigma_{r_k})_k$ such that $\big(L_{\nu_{r_k}}(\Sigma_{r_k})\big)_k$ is bounded from below. Denote by $\lambda_{r,1} \ge \ldots \ge \lambda_{r,d}$ the eigenvalues of $\Sigma_r$.

Case 1: Let $\{\lambda_{r,i} : r \in \mathbb{N},\ i = 1, \ldots, d\} \subseteq [a, b]$ for some $0 < a \le b < \infty$. Then it holds $\liminf_{r\to\infty} \log|\Sigma_r| \ge \log(a^d) = d\log(a)$ and

$$\liminf_{r\to\infty} (d+\nu_r) \sum_{i=1}^n w_i \log(\nu_r + x_i^\mathrm{T} \Sigma_r^{-1} x_i) \ge \lim_{r\to\infty} (d+\nu_r) \sum_{i=1}^n w_i \log\Big(\frac{1}{b} x_i^\mathrm{T} x_i\Big) = d \sum_{i=1}^n w_i \log\Big(\frac{1}{b} x_i^\mathrm{T} x_i\Big).$$

Note that Assumption 3.1 ensures $x_i \ne 0$ and thus $x_i^\mathrm{T} x_i > 0$ for $i = 1, \ldots, n$. Then we get

$$\liminf_{r\to\infty} L_{\nu_r}(\Sigma_r) = \liminf_{r\to\infty} \Big( (d+\nu_r) \sum_{i=1}^n w_i \log(\nu_r + x_i^\mathrm{T} \Sigma_r^{-1} x_i) + \log|\Sigma_r| \Big) \ge d \sum_{i=1}^n w_i \log\Big(\frac{1}{b} x_i^\mathrm{T} x_i\Big) + d\log(a).$$

Hence $(L_{\nu_r}(\Sigma_r))_r$ is bounded from below and $(\nu_r, \Sigma_r)_r$ cannot be a minimizing sequence.

Case 2: Let $\{\lambda_{r,i} : r \in \mathbb{N},\ i = 1, \ldots, d\} \not\subseteq [a, b]$ for all $0 < a \le b < \infty$. Define $\rho_r = \|\Sigma_r\|_F$ and $P_r = \frac{\Sigma_r}{\rho_r}$. Then, by concavity of the log function, it holds

$$\begin{aligned} L_{\nu_r}(\Sigma_r) &= (d+\nu_r) \sum_{i=1}^n w_i \log(\nu_r + x_i^\mathrm{T} \Sigma_r^{-1} x_i) + \log(|\Sigma_r|) \\ &\ge d \sum_{i=1}^n w_i \log(x_i^\mathrm{T} \Sigma_r^{-1} x_i) + \nu_r \log(\nu_r) + \log(|\Sigma_r|) \\ &\ge d \sum_{i=1}^n w_i \log\Big(\frac{1}{\rho_r} x_i^\mathrm{T} P_r^{-1} x_i\Big) + \log(\rho_r^d |P_r|) + \mathrm{const} \\ &= \underbrace{d \sum_{i=1}^n w_i \log(x_i^\mathrm{T} P_r^{-1} x_i) + \log(|P_r|)}_{=: L_0(P_r)} + \mathrm{const}. \end{aligned} \tag{36}$$

Denote by $p_{r,1} \ge \ldots \ge p_{r,d} > 0$ the eigenvalues of $P_r$. Since $\{P_r : r \in \mathbb{N}\}$ is bounded, there exists some $C > 0$ with $C \ge p_{r,1}$ for all $r \in \mathbb{N}$. Thus one of the following cases is fulfilled:

i) There exists a constant $c > 0$ such that $p_{r,d} > c$ for all $r \in \mathbb{N}$.

ii) There exists a subsequence $(P_{r_k})_k$ of $(P_r)_r$ which converges to some $P \in \partial\,\mathrm{SPD}(d)$.

Case 2i): Let $c > 0$ with $p_{r,d} \ge c$ for all $r \in \mathbb{N}$. Then $\liminf_{r\to\infty} \log(|P_r|) \ge \log(c^d) = d\log(c)$ and

$$\liminf_{r\to\infty} d \sum_{i=1}^n w_i \log(x_i^\mathrm{T} P_r^{-1} x_i) \ge d \sum_{i=1}^n w_i \log\Big(\frac{1}{C} x_i^\mathrm{T} x_i\Big).$$

By (36) this yields

$$\liminf_{r\to\infty} L_{\nu_r}(\Sigma_r) \ge \liminf_{r\to\infty} \Big( d \sum_{i=1}^n w_i \log(x_i^\mathrm{T} P_r^{-1} x_i) + \log(|P_r|) \Big) + \mathrm{const} \ge d \sum_{i=1}^n w_i \log\Big(\frac{1}{C} x_i^\mathrm{T} x_i\Big) + d\log(c) + \mathrm{const}.$$

Hence $(L_{\nu_r}(\Sigma_r))_r$ is bounded from below and $(\nu_r, \Sigma_r)_r$ cannot be a minimizing sequence.

Case 2ii): We use similar arguments as in the proof of [24, Theorem 4.3]. Let $(P_{r_k})_k$ be a subsequence of $(P_r)_r$ which converges to some $P \in \partial\,\mathrm{SPD}(d)$. For simplicity we denote $(P_{r_k})_k$ again by $(P_r)_r$. Let $p_1 \ge \ldots \ge p_d \ge 0$ be the eigenvalues of $P$. Since $\|P\|_F = \lim_{r\to\infty} \|P_r\|_F = 1$, it holds $p_1 > 0$. Let $q \in \{1, \ldots, d-1\}$ be such that $p_1 \ge \ldots \ge p_q > p_{q+1} = \ldots = p_d = 0$. By $e_{r,1}, \ldots, e_{r,d}$ we denote the orthonormal eigenvectors corresponding to $p_{r,1}, \ldots, p_{r,d}$. Since $(\mathbb{S}^{d-1})^d$ is compact, we can assume (by going over to a subsequence) that $(e_{r,1}, \ldots, e_{r,d})_r$ converges to orthonormal vectors $(e_1, \ldots, e_d)$. Define $S_0 := \{0\}$ and for $k = 1, \ldots, d$ set $S_k := \mathrm{span}\{e_1, \ldots, e_k\}$. Now, for $k = 1, \ldots, d$ define

$$W_k := S_k \setminus S_{k-1} = \big\{ y \in \mathbb{R}^d : \langle y, e_k \rangle \ne 0,\ \langle y, e_l \rangle = 0 \text{ for } l = k+1, \ldots, d \big\}.$$

Further, let

$$\tilde I_k := \big\{ i \in \{1, \ldots, n\} : x_i \in S_k \big\} \quad \text{and} \quad I_k := \big\{ i \in \{1, \ldots, n\} : x_i \in W_k \big\}.$$

Because of $S_k = W_k \,\dot\cup\, S_{k-1}$ we have $\tilde I_k = I_k \,\dot\cup\, \tilde I_{k-1}$ for $k = 1, \ldots, d$. Due to Assumption 3.1 we have $|I_k| \le |\tilde I_k| \le \dim(S_k) = k$ for $k = 1, \ldots, d-1$. Defining, for $j = 1, \ldots, d$,

$$L_j(P_r) := d \sum_{i \in I_j} w_i \log(x_i^\mathrm{T} P_r^{-1} x_i) + \log(p_{r,j}),$$

it holds $L_0(P_r) = \sum_{j=1}^d L_j(P_r)$. For $j \le q$ we get

$$\liminf_{r\to\infty} L_j(P_r) \ge \liminf_{r\to\infty} \Big( d \sum_{i \in I_j} w_i \log\Big(\frac{1}{C} x_i^\mathrm{T} x_i\Big) + \log(p_{r,j}) \Big) = d \sum_{i \in I_j} w_i \log\Big(\frac{1}{C} x_i^\mathrm{T} x_i\Big) + \log(p_j).$$

Since for $k \in \{1, \ldots, d\}$ and $i \in I_k$,

$$x_i^\mathrm{T} P_r^{-1} x_i = \sum_{j=1}^d \frac{1}{p_{r,j}} \langle x_i, e_{r,j} \rangle^2 \ge \frac{1}{p_{r,k}} \langle x_i, e_{r,k} \rangle^2,$$

and $\lim_{r\to\infty} \langle x_i, e_{r,k} \rangle = \langle x_i, e_k \rangle \ne 0$, we obtain

$$\liminf_{r\to\infty} p_{r,k}\, x_i^\mathrm{T} P_r^{-1} x_i \ge \liminf_{r\to\infty} \langle x_i, e_{r,k} \rangle^2 = \langle x_i, e_k \rangle^2 > 0.$$

Hence it holds for $j \ge q+1$ that

$$\begin{aligned} L_j(P_r) &= d \sum_{i \in I_j} w_i \big(\log(x_i^\mathrm{T} P_r^{-1} x_i) + \log(p_{r,j})\big) + \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}) \\ &= d \sum_{i \in I_j} w_i \log\big(p_{r,j}\, x_i^\mathrm{T} P_r^{-1} x_i\big) + \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}). \end{aligned}$$

Thus we conclude

$$\begin{aligned} \liminf_{r\to\infty} L_0(P_r) &= \liminf_{r\to\infty} \sum_{j=1}^d L_j(P_r) \ge \sum_{j=1}^q \liminf_{r\to\infty} L_j(P_r) + \liminf_{r\to\infty} \sum_{j=q+1}^d L_j(P_r) \\ &\ge \sum_{j=1}^q \Big( d \sum_{i \in I_j} w_i \log\Big(\frac{1}{C} x_i^\mathrm{T} x_i\Big) + \log(p_j) \Big) + \liminf_{r\to\infty} \sum_{j=q+1}^d d \sum_{i \in I_j} w_i \log\big(p_{r,j}\, x_i^\mathrm{T} P_r^{-1} x_i\big) \\ &\quad + \liminf_{r\to\infty} \sum_{j=q+1}^d \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}) \\ &\ge \sum_{j=1}^q \Big( d \sum_{i \in I_j} w_i \log\Big(\frac{1}{C} x_i^\mathrm{T} x_i\Big) + \log(p_j) \Big) + \sum_{j=q+1}^d d \sum_{i \in I_j} w_i \log\big(\langle x_i, e_j \rangle^2\big) \\ &\quad + \liminf_{r\to\infty} \sum_{j=q+1}^d \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}) \\ &= \mathrm{const} + \liminf_{r\to\infty} \sum_{j=q+1}^d \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}). \end{aligned}$$

It remains to show that there exists $\tilde c \in \mathbb{R}$ such that

$$\liminf_{r\to\infty} \sum_{j=q+1}^d \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}) \ge \tilde c. \tag{37}$$

We prove for $k \ge q+1$ by induction that for sufficiently large $r \in \mathbb{N}$ it holds

$$\sum_{j=k}^d \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}) \ge \Big( d \sum_{i \in \tilde I_{k-1}} w_i - (k-1) \Big) \log(p_{r,k}). \tag{38}$$

Induction basis $k = d$: Since $\tilde I_k = I_k \,\dot\cup\, \tilde I_{k-1}$ we have

$$\sum_{i \in \tilde I_k} w_i - \sum_{i \in \tilde I_{k-1}} w_i = \sum_{i \in I_k} w_i,$$

and further

$$1 - d \sum_{i \in I_d} w_i = 1 - d \Big( \sum_{i \in \tilde I_d} w_i - \sum_{i \in \tilde I_{d-1}} w_i \Big) = 1 - d \Big( 1 - \sum_{i \in \tilde I_{d-1}} w_i \Big) = d \sum_{i \in \tilde I_{d-1}} w_i - (d-1).$$

If we multiply both sides with $\log(p_{r,d})$, this yields (38) for $k = d$.

Induction step: Assume that (38) holds for some $k+1$ with $d \ge k+1 > q+1$, i.e.,

$$\sum_{j=k+1}^d \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}) \ge \Big( d \sum_{i \in \tilde I_k} w_i - k \Big) \log(p_{r,k+1}).$$

Then we obtain

$$\begin{aligned} \sum_{j=k}^d \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}) &= \sum_{j=k+1}^d \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}) + \Big(1 - d \sum_{i \in I_k} w_i\Big) \log(p_{r,k}) \\ &\ge \Big( d \sum_{i \in \tilde I_k} w_i - k \Big) \log(p_{r,k+1}) + \Big(1 - d \sum_{i \in I_k} w_i\Big) \log(p_{r,k}), \end{aligned}$$

and since $\sum_{i \in \tilde I_k} w_i < \frac{1}{d} |\tilde I_k| \le \frac{k}{d}$ by Assumption 3.1 and $p_{r,k+1} \le p_{r,k} < 1$ for sufficiently large $r$, finally

$$\ge \Big( d \sum_{i \in \tilde I_k} w_i - k \Big) \log(p_{r,k}) + \Big(1 - d \sum_{i \in I_k} w_i\Big) \log(p_{r,k}) = \Big( d \sum_{i \in \tilde I_{k-1}} w_i - (k-1) \Big) \log(p_{r,k}).$$

This shows (38) for $k \ge q+1$. Using $k = q+1$ in (38), we get

$$\liminf_{r\to\infty} \sum_{j=q+1}^d \Big(1 - d \sum_{i \in I_j} w_i\Big) \log(p_{r,j}) \ge \liminf_{r\to\infty} \underbrace{\Big( d \sum_{i \in \tilde I_q} w_i - q \Big)}_{<0} \underbrace{\log(p_{r,q+1})}_{\text{bounded from above}} > -\infty,$$

which proves (37).

This finishes the proof.

Lemma B.2. Let $(\nu_r, \Sigma_r)_r$ be a sequence in $\mathbb{R}_{>0} \times \mathrm{SPD}(d)$ such that there exists $\nu_- \in \mathbb{R}_{>0}$ with $\nu_- \le \nu_r$ for all $r \in \mathbb{N}$. Denote by $\lambda_{r,1} \ge \cdots \ge \lambda_{r,d}$ the eigenvalues of $\Sigma_r$. If $\{\lambda_{r,1} : r \in \mathbb{N}\}$ is unbounded or $\{\lambda_{r,d} : r \in \mathbb{N}\}$ has zero as a cluster point, then there exists a subsequence $(\nu_{r_k}, \Sigma_{r_k})_k$ of $(\nu_r, \Sigma_r)_r$ such that $\lim_{k\to\infty} L(\nu_{r_k}, \Sigma_{r_k}) = \infty$.

Proof. Without loss of generality we assume (by considering a subsequence) that either $\lambda_{r,1} \to \infty$ as $r \to \infty$ and $\lambda_{r,d} \ge c > 0$ for all $r \in \mathbb{N}$, or that $\lambda_{r,d} \to 0$ as $r \to \infty$. By [24, Theorem 4.3], for fixed $\nu = \nu_-$, we have $L_{\nu_-}(\Sigma_r) \to \infty$ as $r \to \infty$.

The function $h\colon \mathbb{R}_{>0} \to \mathbb{R}$ defined by $\nu \mapsto (d+\nu)\log(\nu+k)$ is monotone increasing for all $k \in \mathbb{R}_{\ge 0}$. This can be seen as follows: The derivative of $h$ fulfills

$$h'(\nu) = \frac{d+\nu}{k+\nu} + \log(\nu+k) \ge \frac{1+\nu}{k+\nu} + \log(\nu+k),$$

and since

$$\frac{\partial}{\partial k} \Big( \frac{1+\nu}{k+\nu} + \log(\nu+k) \Big) = \frac{k-1}{(k+\nu)^2},$$

the latter function is minimal for $k = 1$, so that

$$h'(\nu) \ge \frac{1+\nu}{k+\nu} + \log(\nu+k) \ge \frac{1+\nu}{1+\nu} + \log(\nu+1) = 1 + \log(1+\nu) > 0.$$

Using this relation, we obtain

$$(d+\nu_r) \sum_{i=1}^n w_i \log\big(\nu_r + x_i^\mathrm{T} \Sigma_r^{-1} x_i\big) \ge (d+\nu_-) \sum_{i=1}^n w_i \log\big(\nu_- + x_i^\mathrm{T} \Sigma_r^{-1} x_i\big),$$

and further

$$\begin{aligned} L(\nu_r, \Sigma_r) &= (d+\nu_r) \sum_{i=1}^n w_i \log\big(\nu_r + x_i^\mathrm{T} \Sigma_r^{-1} x_i\big) + \log(|\Sigma_r|) \\ &\ge (d+\nu_-) \sum_{i=1}^n w_i \log\big(\nu_- + x_i^\mathrm{T} \Sigma_r^{-1} x_i\big) + \log(|\Sigma_r|) \\ &= L_{\nu_-}(\Sigma_r) \to \infty \quad \text{as } r \to \infty. \end{aligned}$$

C. Derivatives of the Negative Log-Likelihood Function for Student-t Mixture Models

We compute the derivatives of

$$L(\alpha, \nu, \mu, \Sigma \,|\, x_1, \ldots, x_n) = -\sum_{i=1}^n \log\Big( \sum_{k=1}^K \alpha_k f(x_i \,|\, \nu_k, \mu_k, \Sigma_k) \Big),$$

where

$$f(x_i \,|\, \nu_k, \mu_k, \Sigma_k) = \frac{\Gamma\big(\frac{d+\nu_k}{2}\big)}{\Gamma\big(\frac{\nu_k}{2}\big)\,(\pi\nu_k)^{\frac{d}{2}}\,|\Sigma_k|^{\frac12}\,\big(1 + \frac{1}{\nu_k}\delta_{i,k}\big)^{\frac{d+\nu_k}{2}}}, \qquad \delta_{i,k} = (x_i - \mu_k)^\mathrm{T} \Sigma_k^{-1} (x_i - \mu_k).$$

We use the short notation $\beta_i = \frac{1}{\sum_{k=1}^K \alpha_k f(x_i | \nu_k, \mu_k, \Sigma_k)}$. The derivative with respect to $\alpha_l$ is given by

$$\frac{\partial L(\alpha, \nu, \mu, \Sigma)}{\partial \alpha_l} = -\sum_{i=1}^n \beta_i f(x_i \,|\, \nu_l, \mu_l, \Sigma_l).$$

We compute the derivative with respect to $\mu_l$. We use that $\nabla_{\mu_l} \delta_{i,l} = 2\Sigma_l^{-1}(\mu_l - x_i)$:

$$\begin{aligned} \nabla_{\mu_l} L(\alpha, \nu, \mu, \Sigma) &= -\sum_{i=1}^n \beta_i \alpha_l \nabla_{\mu_l} f(x_i \,|\, \nu_l, \mu_l, \Sigma_l) \\ &= -\sum_{i=1}^n \beta_i \alpha_l \frac{\Gamma\big(\frac{d+\nu_l}{2}\big)}{\Gamma\big(\frac{\nu_l}{2}\big)\,(\pi\nu_l)^{\frac{d}{2}}\,|\Sigma_l|^{\frac12}} \Big( -\frac{d+\nu_l}{2} \Big) \Big( 1 + \frac{1}{\nu_l}\delta_{i,l} \Big)^{-\frac{d+\nu_l+2}{2}} \nabla_{\mu_l} \Big( 1 + \frac{1}{\nu_l}\delta_{i,l} \Big) \\ &= \sum_{i=1}^n \beta_i \alpha_l f(x_i \,|\, \nu_l, \mu_l, \Sigma_l) \frac{d+\nu_l}{2} \frac{1}{1 + \frac{1}{\nu_l}\delta_{i,l}} \frac{1}{\nu_l} \big( 2\Sigma_l^{-1}(\mu_l - x_i) \big) \\ &= \sum_{i=1}^n \beta_i \alpha_l f(x_i \,|\, \nu_l, \mu_l, \Sigma_l) \frac{d+\nu_l}{\nu_l + \delta_{i,l}} \Sigma_l^{-1}(\mu_l - x_i). \end{aligned}$$

Now we compute the derivative with respect to $\Sigma_l$:

$$\nabla_{\Sigma_l} L(\alpha, \nu, \mu, \Sigma) = -\sum_{i=1}^n \beta_i \alpha_l \nabla_{\Sigma_l} f(x_i \,|\, \nu_l, \mu_l, \Sigma_l)$$

$$\begin{aligned} &= -\sum_{i=1}^n \beta_i \alpha_l \Bigg( \nabla_{\Sigma_l}\Big( \frac{1}{|\Sigma_l|^{\frac12}} \Big) \frac{\Gamma\big(\frac{d+\nu_l}{2}\big)}{\Gamma\big(\frac{\nu_l}{2}\big)\,(\pi\nu_l)^{\frac{d}{2}}\,\big(1 + \frac{1}{\nu_l}\delta_{i,l}\big)^{\frac{d+\nu_l}{2}}} + \frac{1}{|\Sigma_l|^{\frac12}}\, \nabla_{\Sigma_l}\Bigg( \frac{\Gamma\big(\frac{d+\nu_l}{2}\big)}{\Gamma\big(\frac{\nu_l}{2}\big)\,(\pi\nu_l)^{\frac{d}{2}}\,\big(1 + \frac{1}{\nu_l}\delta_{i,l}\big)^{\frac{d+\nu_l}{2}}} \Bigg) \Bigg) \\ &= -\sum_{i=1}^n \beta_i \alpha_l \Bigg( -\frac{1}{2} \frac{1}{|\Sigma_l|^{\frac32}}\, \nabla_{\Sigma_l}|\Sigma_l|\, \frac{\Gamma\big(\frac{d+\nu_l}{2}\big)}{\Gamma\big(\frac{\nu_l}{2}\big)\,(\pi\nu_l)^{\frac{d}{2}}\,\big(1 + \frac{1}{\nu_l}\delta_{i,l}\big)^{\frac{d+\nu_l}{2}}} \\ &\qquad\qquad\qquad + \frac{1}{|\Sigma_l|^{\frac12}} \frac{\Gamma\big(\frac{d+\nu_l}{2}\big)}{\Gamma\big(\frac{\nu_l}{2}\big)\,(\pi\nu_l)^{\frac{d}{2}}} \Big( -\frac{d+\nu_l}{2} \Big) \Big( 1 + \frac{1}{\nu_l}\delta_{i,l} \Big)^{-\frac{d+\nu_l+2}{2}} \frac{1}{\nu_l}\, \nabla_{\Sigma_l} \delta_{i,l} \Bigg). \end{aligned}$$

We use Jacobi's formula, i.e., $\nabla_A |A| = \mathrm{adj}(A)^\mathrm{T} = |A| A^{-\mathrm{T}} = |A| A^{-1}$ for symmetric $A$, and $\frac{\partial a^\mathrm{T} A^{-1} b}{\partial A} = -A^{-\mathrm{T}} a b^\mathrm{T} A^{-\mathrm{T}} = -A^{-1} a b^\mathrm{T} A^{-1}$ (see [33]). Then we get:

$$\begin{aligned} \nabla_{\Sigma_l} L(\alpha, \nu, \mu, \Sigma) &= -\sum_{i=1}^n \beta_i \alpha_l \Big( -\frac{1}{2} \Sigma_l^{-1} f(x_i \,|\, \nu_l, \mu_l, \Sigma_l) + \frac{1}{2} \frac{d+\nu_l}{1 + \frac{1}{\nu_l}\delta_{i,l}} \frac{1}{\nu_l} f(x_i \,|\, \nu_l, \mu_l, \Sigma_l)\, \Sigma_l^{-1} (x_i - \mu_l)(x_i - \mu_l)^\mathrm{T} \Sigma_l^{-1} \Big) \\ &= \frac{1}{2} \sum_{i=1}^n \beta_i \alpha_l f(x_i \,|\, \nu_l, \mu_l, \Sigma_l) \Big( \Sigma_l^{-1} - \frac{d+\nu_l}{\nu_l + \delta_{i,l}} \Sigma_l^{-1} (x_i - \mu_l)(x_i - \mu_l)^\mathrm{T} \Sigma_l^{-1} \Big). \end{aligned}$$

We compute the derivative with respect to $\nu_l$:

$$\frac{\partial L(\alpha, \nu, \mu, \Sigma)}{\partial \nu_l} = -\sum_{i=1}^n \beta_i \alpha_l \frac{\partial}{\partial \nu_l} f(x_i \,|\, \nu_l, \mu_l, \Sigma_l) = -\sum_{i=1}^n \beta_i \alpha_l \frac{1}{|\Sigma_l|^{\frac12}} \frac{\partial}{\partial \nu_l} \frac{\overbrace{\Gamma\big(\frac{d+\nu_l}{2}\big)}^{A(\nu_l)}}{\underbrace{\Gamma\big(\frac{\nu_l}{2}\big)}_{B(\nu_l)}\, \underbrace{(\pi\nu_l)^{\frac{d}{2}}}_{C(\nu_l)}\, \underbrace{\big(1 + \frac{1}{\nu_l}\delta_{i,l}\big)^{\frac{d+\nu_l}{2}}}_{D_i(\nu_l)}}$$

$$= -\sum_{i=1}^n \beta_i \alpha_l \frac{1}{|\Sigma_l|^{\frac12}} \frac{A'(\nu_l) B(\nu_l) C(\nu_l) D_i(\nu_l) - A(\nu_l) B'(\nu_l) C(\nu_l) D_i(\nu_l) - A(\nu_l) B(\nu_l) C'(\nu_l) D_i(\nu_l) - A(\nu_l) B(\nu_l) C(\nu_l) D_i'(\nu_l)}{\big(B(\nu_l) C(\nu_l) D_i(\nu_l)\big)^2},$$

where

$$A'(x) = \frac{1}{2}\Gamma'\Big(\frac{d+x}{2}\Big), \qquad B'(x) = \frac{1}{2}\Gamma'\Big(\frac{x}{2}\Big), \qquad C'(x) = \frac{d}{2}\pi^{\frac{d}{2}} x^{\frac{d-2}{2}},$$

and

$$\begin{aligned} D_i'(x) &= \frac{\partial}{\partial x} \exp\Big( \frac{d+x}{2} \log\Big(1 + \frac{1}{x}\delta_{i,l}\Big) \Big) = \exp\Big( \frac{d+x}{2} \log\Big(1 + \frac{1}{x}\delta_{i,l}\Big) \Big) \frac{\partial}{\partial x}\Big( \frac{d+x}{2} \log\Big(1 + \frac{1}{x}\delta_{i,l}\Big) \Big) \\ &= \Big(1 + \frac{1}{x}\delta_{i,l}\Big)^{\frac{d+x}{2}} \Bigg( \frac{1}{2}\log\Big(1 + \frac{1}{x}\delta_{i,l}\Big) - \frac{d+x}{2}\, \frac{\frac{\delta_{i,l}}{x^2}}{1 + \frac{1}{x}\delta_{i,l}} \Bigg) \\ &= \frac{1}{2}\Big(1 + \frac{1}{x}\delta_{i,l}\Big)^{\frac{d+x}{2}} \Bigg( \log\Big(1 + \frac{1}{x}\delta_{i,l}\Big) - \frac{(d+x)\delta_{i,l}}{x^2 + x\delta_{i,l}} \Bigg). \end{aligned}$$
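As a sanity check for these derivatives, the closed-form gradient with respect to $\mu_l$ can be compared against finite differences of the negative log-likelihood. The following sketch (our own illustration, not part of the thesis) implements both:

```python
import numpy as np
from scipy.special import gammaln

def component_pdf(x, nu, mu, Sigma):
    """Returns (f(x | nu, mu, Sigma), delta) for one mixture component."""
    d = len(mu)
    diff = x - mu
    delta = diff @ np.linalg.solve(Sigma, diff)
    logf = (gammaln((d + nu) / 2) - gammaln(nu / 2)
            - 0.5 * d * np.log(np.pi * nu)
            - 0.5 * np.linalg.slogdet(Sigma)[1]
            - 0.5 * (d + nu) * np.log1p(delta / nu))
    return np.exp(logf), delta

def neg_log_likelihood(X, alphas, nus, mus, Sigmas):
    """L = -sum_i log( sum_k alpha_k f(x_i | nu_k, mu_k, Sigma_k) )."""
    return -sum(np.log(sum(a * component_pdf(x, nu, mu, S)[0]
                           for a, nu, mu, S in zip(alphas, nus, mus, Sigmas)))
                for x in X)

def grad_mu_l(X, alphas, nus, mus, Sigmas, l):
    """Closed form: sum_i beta_i alpha_l f_il (d+nu_l)/(nu_l+delta_il) Sigma_l^{-1}(mu_l - x_i)."""
    g = np.zeros_like(mus[l])
    for x in X:
        vals = [component_pdf(x, nu, mu, S) for nu, mu, S in zip(nus, mus, Sigmas)]
        beta = 1.0 / sum(a * f for a, (f, _) in zip(alphas, vals))
        f_l, delta_l = vals[l]
        g += (beta * alphas[l] * f_l * (len(x) + nus[l]) / (nus[l] + delta_l)
              * np.linalg.solve(Sigmas[l], mus[l] - x))
    return g
```

Comparing `grad_mu_l` against central differences of `neg_log_likelihood` confirms the formula for $\nabla_{\mu_l} L$ numerically.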

References

[1] M. Abramowitz and I. A. Stegun. Handbook of mathematical functions: with formulas, graphs, and mathematical tables, volume 55. Courier Corporation, 1965.

[2] A. Antoniadis, D. Leporini, and J.-C. Pesquet. Wavelet thresholding for some classes of non-Gaussian noise. Statistica Neerlandica, 56(4):434–453, 2002.

[3] B. Aune. La super résolution d'images à base de patchs. Master Thesis, University of Bordeaux, 2019.

[4] A. Banerjee and P. Maji. Spatially constrained Student’s t-distribution based mixture model for robust image segmentation. Journal of Mathematical Imaging Vision, 60(3):355–381, 2018.

[5] H. Bauer. Probability theory, volume 23 of De Gruyter Studies in Mathematics. Walter de Gruyter & Co., Berlin, 1996. Translated from the fourth (1991) German edition by Robert B. Burckel and revised by the author.

[6] J. Bolte, S. Sabach, and M. Teboulle. Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming, 146(1-2, Ser. A):459–494, 2014.

[7] C. L. Byrne. The EM Algorithm: Theory, Applications and Related Methods. Lecture Notes, University of Massachusetts, 2017.

[8] S. Chrétien and A. O. Hero. Kullback proximal algorithms for maximum-likelihood estimation. IEEE Transactions on Information Theory, 46(5):1800–1810, 2000.

[9] S. Chrétien and A. O. Hero. On EM algorithms and their proximal generalizations. ESAIM: Probability and Statistics, 12:308–326, 2008.

[10] R. L. Cook. Stochastic sampling in computer graphics. ACM Transactions on Graphics, 5(1):51–72, 1986.

[11] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1–38, 1977.

[12] M. Ding, T. Huang, S. Wang, J. Mei, and X. Zhao. Total variation with overlapping group sparsity for deblurring images under Cauchy noise. Applied Mathematics and Computation, 341:128–147, 2019.

[13] P. Ding. On the conditional distribution of the multivariate t distribution. The American Statistician, 70(3):293–295, 2016.

[14] D. Geman and C. Yang. Nonlinear image recovery with half-quadratic regularization. IEEE Transactions on Image Processing, 4(7):932–946, 1995.

[15] D. Gerogiannis, C. Nikou, and A. Likas. The mixtures of Student’s t-distributions as a robust framework for rigid registration. Image and Vision Computing, 27(9):1285– 1294, 2009.

[16] M. Hasannasab, J. Hertrich, F. Laus, and G. Steidl. Alternatives of the EM algorithm for estimating the parameters of the Student-t distribution. arXiv preprint arXiv:1910.06623, 2019.

[17] M. G. Kendall. A new measure of rank correlation. Biometrika, 30(1/2):81–93, 1938.

[18] M. G. Kendall. The treatment of ties in ranking problems. Biometrika, pages 239–251, 1945.

[19] J. T. Kent, D. E. Tyler, and Y. Vard. A curious likelihood identity for the multivariate t-distribution. Communications in Statistics-Simulation and Computation, 23(2):441– 453, 1994.

[20] A. Klenke. Probability theory. Universitext. Springer-Verlag London, Ltd., London, 2008. A comprehensive course, Translated from the 2006 German original.

[21] A. Lanza, S. Morigi, F. Sciacchitano, and F. Sgallari. Whiteness constraints in a unified variational framework for image restoration. Journal of Mathematical Imaging and Vision, 60(9):1503–1526, 2018.

[22] F. Laus. Statistical Analysis and Optimal Transport for Euclidean and Manifold- Valued Data. PhD Thesis, TU Kaiserslautern, 2019.

[23] F. Laus, F. Pierre, and G. Steidl. Nonlocal myriad filters for Cauchy noise removal. Journal of Mathematical Imaging and Vision, 60(8):1324–1354, 2018.

[24] F. Laus and G. Steidl. Multivariate myriad filters based on parameter estimation of Student-t distributions. SIAM Journal on Imaging Sciences, 12(4):1864–1904, 2019.

[25] M. Lebrun, A. Buades, and J.-M. Morel. A nonlocal Bayesian image denoising algorithm. SIAM Journal on Imaging Sciences, 6(3):1665–1688, 2013.

[26] G. McLachlan and T. Krishnan. The EM Algorithm and Extensions. John Wiley and Sons, Inc., 1997.

[27] J.-J. Mei, Y. Dong, T.-Z. Huang, and W. Yin. Cauchy noise removal by nonconvex ADMM with convergence guarantees. Journal of Scientific Computing, 74(2):743–766, 2018.

[28] X.-L. Meng and D. Van Dyk. The EM algorithm - an old folk-song sung to a fast new tune. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 59(3):511–567, 1997.

[29] T. M. Nguyen and Q. J. Wu. Robust Student’s-t mixture model with spatial constraints and its application in medical image segmentation. IEEE Transactions on Medical Imaging, 31(1):103–116, 2012.

[30] P. Ochs, Y. Chen, T. Brox, and T. Pock. iPiano: Inertial proximal algorithm for nonconvex optimization. SIAM Journal on Imaging Sciences, 7(2):1388–1419, 2014.

[31] S. Parameswaran, C. Deledalle, L. Denis, and T. Q. Nguyen. Accelerating GMM-based patch priors for image restoration: Three ingredients for a 100× speed-up. IEEE Transactions on Image Processing, 28(2):687–698, 2019.

[32] D. Peel and G. J. McLachlan. Robust mixture modelling using the t distribution. Statistics and Computing, 10(4):339–348, 2000.

[33] K. B. Petersen and M. S. Pedersen. The Matrix Cookbook. Lecture Notes, Technical University of Denmark, 2008.

[34] T. Pock and S. Sabach. Inertial proximal alternating linearized minimization (iPALM) for nonconvex and nonsmooth problems. SIAM Journal on Imaging Sciences, 9(4):1756–1787, 2016.

[35] P. Sandeep and T. Jacob. Single image super-resolution using a joint GMM method. IEEE Transactions on Image Processing, 25(9):4233–4244, 2016.

[36] F. Sciacchitano, Y. Dong, and T. Zeng. Variational approach for restoring blurred images with Cauchy noise. SIAM Journal on Imaging Sciences, 8(3):1894–1922, 2015.

[37] G. Sfikas, C. Nikou, and N. Galatsanos. Robust image segmentation with mixtures of Student's t-distributions. In 2007 IEEE International Conference on Image Processing, volume 1, pages I-273–I-276, 2007.

[38] C. Sutour, C.-A. Deledalle, and J.-F. Aujol. Estimation of the noise level function based on a nonparametric detection of homogeneous image regions. SIAM Journal on Imaging Sciences, 8(4):2622–2661, 2015.

[39] A. Van Den Oord and B. Schrauwen. The Student-t mixture as a natural image patch prior with application to image compression. Journal of Machine Learning Research, 15(1):2061–2086, 2014.

[40] D. A. van Dyk. Construction, implementation, and theory of algorithms based on data augmentation and model reduction. PhD Thesis, The University of Chicago, 1995.

[41] C. J. Wu. On the convergence properties of the EM algorithm. The Annals of Statistics, 11(1):95–103, 1983.

[42] Z. Yang, Z. Yang, and G. Gui. A convex constraint variational method for restoring blurred images in the presence of alpha-stable noises. Sensors, 18(4):1175, 2018.

[43] W. I. Zangwill. Nonlinear programming: a unified approach, volume 196. Prentice- Hall Englewood Cliffs, NJ, 1969.

[44] Z. Zhou, J. Zheng, Y. Dai, Z. Zhou, and S. Chen. Robust non-rigid point set registration using Student's-t mixture model. PLoS ONE, 9(3):e91381, 2014.

[45] D. Zoran and Y. Weiss. From learning models of natural image patches to whole image restoration. In Computer Vision (ICCV), 2011 IEEE International Conference on, pages 479–486. IEEE, 2011.
