Optimal Principal Component Analysis of STEM XEDS Spectrum Images
Pavel Potapov1,2* and Axel Lubk2
Potapov and Lubk, Adv Struct Chem Imag (2019) 5:4. https://doi.org/10.1186/s40679-019-0066-0

Abstract
STEM XEDS spectrum images can be drastically denoised by application of principal component analysis (PCA). This paper looks inside the PCA workflow step by step on the example of a complex semiconductor structure consisting of a number of different phases. Typical problems distorting the principal component decomposition are highlighted and solutions for successful PCA are described. Particular attention is paid to the optimal truncation of principal components in the course of reconstructing denoised data. A novel, accurate and robust method, which outperforms the existing truncation methods, is suggested for the first time and described in detail.

Keywords: PCA, Spectrum image, Reconstruction, Denoising, STEM, XEDS, EDS, EDX

Background
Scanning transmission electron microscopy (STEM) delivers images of nanostructures at high spatial resolution, matching that of broad-beam transmission electron microscopy (TEM). Additionally, modern STEM instruments are typically equipped with electron energy-loss spectrometers (EELS) and/or X-ray energy-dispersive spectroscopy (XEDS, sometimes abbreviated as EDS or EDX) detectors, which allows images to be turned into spectrum-images, i.e. pixelated images in which each pixel represents an EEL or XED spectrum. In particular, the recent progress in STEM instrumentation and large collection-angle silicon drift detectors (SDD) [1, 2] has made possible the fast acquisition of large STEM XEDS spectrum-images consisting of 10–1000 million data points. These huge datasets typically show some correlations in the data distribution, which might be retrieved by application of statistical analysis and then utilized for improving data quality.

The simplest and probably the most popular multivariate statistical technique is principal component analysis (PCA), which expresses the available data in terms of orthogonal, linearly uncorrelated variables called principal components [3–9]. In general terms, PCA reduces the dimensionality of a large dataset by projecting it onto an orthogonal basis of lower dimension. It can be shown that, among all possible linear projections, PCA ensures the smallest Euclidean difference between the initial and projected datasets or, in other words, provides the minimal least-squares error when approximating the data with a smaller number of variables [10]. Due to that, PCA has found many applications in imaging science for data compression, denoising and pattern recognition (see for example [11–18]), including applications to STEM XEDS spectrum-imaging [19–24].

A starting point for the PCA treatment is the conversion of a dataset into a matrix D, where spectra are placed on the matrix rows and each row represents an individual STEM probe position (pixel). Assume for definiteness that the m × n matrix D consists of m pixels and n energy channels. Although STEM pixels may originally be arranged in 1D (linescan), 2D (datacube) or in a configuration with higher dimensions, they can always be recast into a 1D train, as the neighborhood among pixels does not play any role in the PCA treatment; a minimal reshaping sketch is shown below.
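As an illustration of this recasting step, the following NumPy sketch flattens a hypothetical 2D spectrum-image datacube into the m × n matrix D. The array name, scan size and channel count are assumptions made for the example, not values from the paper:

```python
import numpy as np

# Hypothetical spectrum-image datacube: a 64 x 64 scan of probe positions
# with 1024 energy channels per pixel; Poisson counts mimic sparse XEDS data.
datacube = np.random.poisson(0.1, size=(64, 64, 1024)).astype(float)

# Recast the 2D pixel arrangement into a 1D "train" of pixels:
# D is an (m x n) matrix with m = 64*64 pixels (rows) and n = 1024 channels.
m = datacube.shape[0] * datacube.shape[1]
n = datacube.shape[2]
D = datacube.reshape(m, n)

# The spatial neighborhood plays no role in PCA; the original scan shape is
# kept only to map scores back onto images afterwards.
original_shape = datacube.shape[:2]
```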
PCA is based on the assumption that there are certain correlations among the spectra constituting the data matrix D. These correlations appear because the data variations are governed by a limited number of latent factors, for example by the presence of chemical phases with fixed composition. The spectral signatures of the latent factors might, however, not be apparent, as they are masked by noise. In this respect, PCA relates closely to factor analysis [7], although principal components generally do not coincide with the latent factors but rather represent their linear combinations [25].

Fig. 1 Schematic showing how the data matrix D is decomposed into the loading matrix P and the score matrix T in PCA. A given column of T and a row of P^T form a principal component. The components are sorted such that the variance of the data points in the columns of T decreases from left to right. The principal components that will be retained after the truncation are filled green.

Principal components can be found, for example, through the diagonalization of the covariance matrix DD^T or by applying the singular value decomposition (SVD) directly to D. SVD decomposes the data as:

D = UΣV^T  (1)

where U and V are the left- and right-hand singular vector matrices and Σ is a diagonal matrix with the singular values of D on the diagonal. For the purpose of PCA, formula (1) can be rewritten as:

D = TP^T  (2)

where P = V is an n × n loading matrix describing the principal components and T = UΣ is an m × n score matrix showing the contribution of the components to the dataset. Figure 1 illustrates the principal component decomposition in graphical form. The columns of the loading matrix P (rows in P^T) represent the spectra of the principal components expressed in the original energy channels. Typically, the columns of P are normalized to unity, such that all the scaling information is moved into the score matrix T.

It is important to sort the principal components in the order of their significance. In PCA, the components are ranked according to their variance, i.e. the variance of the data along the corresponding column of T.
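For concreteness, a minimal NumPy sketch of the decomposition in Eqs. (1) and (2) is shown below. It reuses the matrix D from the reshaping sketch above; the function names and the truncation level k are hypothetical choices for illustration, not part of the authors' workflow:

```python
import numpy as np

def pca_decompose(D):
    """SVD-based PCA: D = U Sigma V^T (Eq. 1), rewritten as D = T P^T (Eq. 2)
    with the score matrix T = U Sigma and the loading matrix P = V."""
    # Economy-size SVD: U is (m x n), s holds the n singular values in
    # descending order, Vt is (n x n).
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    T = U * s    # score matrix; carries all the scaling information
    P = Vt.T     # loading matrix; columns are unit-norm component spectra
    return T, P, s

def pca_reconstruct(T, P, k):
    """Reconstruct the data keeping only the first k principal components
    (the truncation step discussed in the text)."""
    return T[:, :k] @ P[:, :k].T

# Usage (D from the previous sketch; k = 10 is an arbitrary example value):
# T, P, s = pca_decompose(D)
# variance = s**2 / (D.shape[0] - 1)   # component ranking; equals the data
#                                      # variance along T's columns if D was
#                                      # mean-centered beforehand
# D_denoised = pca_reconstruct(T, P, k=10)
```

Because the singular values are returned in descending order, the components come out already ranked by variance, matching the sorting convention of Fig. 1.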
Typi- P factors—typically, there are less components than fac- cally, the columns of are normalized to unity, such that tors. Te reason for this deviation is unavoidable cor- all the scaling information is moved into the score matrix T ruption of data with noise. It is important to sort the principal components in the To explore the topic most comprehensively, we con- order of their signifcance. In PCA, the components are sidered an object with a very large number of latent ranked according their variance, i.e. the variance of the T factors and analyzed its experimental XEDS spectrum data along the corresponding column of . image. In parallel, we generated a twin synthetic object that mimicked the real one in all its essential features. An advantage of the synthetic data is the possibility to Potapov and Lubk Adv Struct Chem Imag (2019) 5:4 Page 3 of 21 Fig. 2 Objects employed for evaluation of PCA in the present paper: a shows the mean image of the experimentally characterized CMOS device and b represent a twin synthetic object generated to reproduce the key features of the real object. The twin object was designed to mimic the mixture of the layers composing the real object but not necessarily their exact geometry. A colored legend identifes all the constituent phases labeled according the notations in Table 1. The simulations of spectrum-images were performed in the noise-free and noisy variants. Correspondingly (c) and (d) show the mean images of the noise-free and noisy datasets exclude noise in simulations and, therefore, compare background for truncation of principal components and the noisy data with the noise-free reference. discuss the existing practical truncation methods. A PCA is often considered as a fxed procedure, where lit- novel method for automatic determination of the opti- tle can be altered or tuned. In reality, there is a number mal number of components is introduced in "Anisotropy of hidden issues hampering the treatment and leading method for truncation of principal components" section. to dubious results. Better understanding of the poten- At the end of "Truncation of principal components and tial issues might help to design the optimal treatment reconstruction" section, the results of the spectrum- fow improving the efciency and avoiding artifacts. Te image reconstruction are shown and the denoising ability systematic comparison between the experimental and of PCA is demonstrated. synthetic data sets on the one hand and between the syn- thetic noisy set and the noise-free reference on the other Results and discussion hand, allowed us to identify the typical obstacles in the Multi‑component object for spectrum‑imaging treatment fow and fnd the solutions for the optimal A modern CMOS device was chosen as an object for principal component decomposition and reconstruction XEDS spectrum imaging. Figure 2a shows a so-called of the denoised data. mean image of the device, i.e. the spectrum imaging sig- Below, it will be demonstrated that certain pre-treat- nal integrated over all available energy channels.