Partial Least Squares Regression. Projection on Latent Structures


Wiley Interdisciplinary Reviews: Computational Statistics
Focus article (Manuscript ID: EOCS-039.R1, submitted 08-Jan-2009)

Partial Least Squares Regression and Projection on Latent Structure Regression (PLS-Regression)

Hervé Abdi

Abstract

Partial least squares (PLS) regression (a.k.a. projection on latent structures) is a recent technique that combines features from and generalizes principal component analysis (PCA) and multiple linear regression. Its goal is to predict a set of dependent variables from a set of independent variables or predictors. This prediction is achieved by extracting from the predictors a set of orthogonal factors called latent variables which have the best predictive power. These latent variables can be used to create displays akin to PCA displays. The quality of the prediction obtained from a PLS regression model is evaluated with cross-validation techniques such as the bootstrap and jackknife. There are two main variants of PLS regression: the most common one separates the roles of dependent and independent variables; the second one, used mostly to analyze brain imaging data, gives the same role to dependent and independent variables.

Keywords: Partial least squares, projection to latent structures, principal component analysis, principal component regression, multiple regression, multicollinearity, NIPALS, eigenvalue decomposition, singular value decomposition, bootstrap, jackknife, small N large P problem.

1 Introduction

PLS regression is an acronym which originally stood for Partial Least Squares Regression, but, recently, some authors have preferred to develop this acronym as Projection to Latent Structures. In any case, PLS regression combines features from and generalizes principal component analysis and multiple linear regression. Its goal is to analyze or predict a set of dependent variables from a set of independent variables or predictors. This prediction is achieved by extracting from the predictors a set of orthogonal factors called latent variables which have the best predictive power.

PLS regression is particularly useful when we need to predict a set of dependent variables from a (very) large set of independent variables (i.e., predictors). It originated in the social sciences (specifically economics; Herman Wold, 1966) but became popular first in chemometrics (i.e., computational chemistry), due in part to Herman's son Svante (Wold, 2001), and in sensory evaluation (Martens & Naes, 1989). But PLS regression is also becoming a tool of choice in the social sciences as a multivariate technique for non-experimental (e.g., Fornell, Lorange, & Roos, 1990; Hulland, 1999; Graham, Evenko, & Rajan, 1992) and experimental data alike (e.g., neuroimaging; see Worsley, 1997; McIntosh & Lobaugh, 2004; Giessing et al., 2007; Kovacevic & McIntosh, 2007; Wang et al., 2008). It was first presented as an algorithm akin to the power method (used for computing eigenvectors) but was rapidly interpreted in a statistical framework (see, e.g., Burnham, 1996; Garthwaite, 1994; Höskuldsson, 2001; Phatak & de Jong, 1997; Tenenhaus, 1998; Ter Braak & de Jong, 1998).

Recent developments, including extensions to multiple table analysis, are explored in Höskuldsson (in press) and in the volume edited by Esposito Vinzi, Chin, Henseler, and Wang (2009).

2 Prerequisite notions and notations

The I observations described by K dependent variables are stored in an I × K matrix denoted Y; the values of J predictors collected on these I observations are collected in an I × J matrix X.
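To make these notations concrete, the following minimal NumPy sketch (an illustration added to this excerpt, not part of the original article; the sizes and the random data are arbitrary) sets up matrices of the required shapes. Centering is included because PCA and PLS computations conventionally start from column-centered data:

```python
import numpy as np

rng = np.random.default_rng(0)

I, J, K = 20, 7, 3            # observations, predictors, dependent variables
X = rng.normal(size=(I, J))   # I x J matrix of predictors
Y = rng.normal(size=(I, K))   # I x K matrix of dependent variables

# Column-center both matrices (a common preprocessing step; many
# applications also scale each column to unit variance).
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)
```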
3 Goal of PLS regression: Predict Y from X

The goal of PLS regression is to predict Y from X and to describe their common structure. When Y is a vector and X is a full rank matrix, this goal could be accomplished using ordinary multiple regression. When the number of predictors is large compared to the number of observations, X is likely to be singular and the regression approach is no longer feasible (i.e., because of multicollinearity). This data configuration has recently come to be called the "small N large P problem." It is characteristic of recent data analysis domains such as bio-informatics, brain imaging, chemometrics, data mining, and genomics.

3.1 Principal Component Regression

Several approaches have been developed to cope with the multicollinearity problem. For example, one approach is to eliminate some predictors (e.g., using stepwise methods; see Draper & Smith, 1998); another is to use ridge regression (Hoerl & Kennard, 1970). One method closely related to PLS regression is principal component regression (PCR): it performs a principal component analysis (PCA) of the X matrix and then uses the principal components of X as the independent variables of a multiple regression model predicting Y. Technically, in PCA, X is decomposed using its singular value decomposition (see Abdi, 2007a,b for more details) as

$$\mathbf{X} = \mathbf{R}\,\boldsymbol{\Delta}\,\mathbf{V}^{\mathsf{T}} \tag{1}$$

with

$$\mathbf{R}^{\mathsf{T}}\mathbf{R} = \mathbf{V}^{\mathsf{T}}\mathbf{V} = \mathbf{I}, \tag{2}$$

where R and V are the matrices of the left and right singular vectors, and Δ is a diagonal matrix with the singular values as diagonal elements. The singular vectors are ordered according to their corresponding singular value, which is the square root of the variance (i.e., the eigenvalue) of X explained by that singular vector. The columns of V are called the loadings. The columns of G = RΔ are called the factor scores or principal components of X, or simply scores or components. The matrix R of the left singular vectors of X (or the matrix G of the principal components) is then used to predict Y using standard multiple linear regression. This approach works well because the orthogonality of the singular vectors eliminates the multicollinearity problem. But the problem of choosing an optimum subset of predictors remains. A possible strategy is to keep only a few of the first components. But these components were originally chosen to explain X rather than Y, and so nothing guarantees that the principal components, which "explain" X optimally, will be relevant for the prediction of Y.
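As a concrete illustration of the PCR procedure just described, here is a short sketch built only on NumPy (added to this excerpt; the function name pcr_fit_predict and the choice of two retained components are illustrative, and X and Y are the centered matrices from the sketch in Section 2):

```python
import numpy as np

def pcr_fit_predict(X, Y, n_comp):
    """Principal component regression: regress Y on the first
    n_comp principal components of X (cf. Eqs. 1 and 2)."""
    # Singular value decomposition X = R Delta V^T.
    R, delta, Vt = np.linalg.svd(X, full_matrices=False)
    G = R * delta                  # factor scores G = R Delta
    G_k = G[:, :n_comp]            # keep only the first components

    # Ordinary least squares of Y on the retained components; their
    # orthogonality removes the multicollinearity problem.
    B, *_ = np.linalg.lstsq(G_k, Y, rcond=None)
    return G_k @ B, B              # predicted Y, regression weights

Y_hat, B = pcr_fit_predict(X, Y, n_comp=2)
```

The sketch makes the limitation above easy to see: the components are computed from X alone, before Y ever enters the fit.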
3.2 Simultaneous decomposition of predictors and dependent variables

Principal component regression decomposes X in order to obtain components which best explain X. By contrast, PLS regression finds components from X that best predict Y. Specifically, PLS regression searches for a set of components (called latent vectors) that performs a simultaneous decomposition of X and Y, with the constraint that these components explain as much as possible of the covariance between X and Y. This step generalizes PCA. It is followed by a regression step where the latent vectors obtained from X are used to predict Y.

PLS regression decomposes both X and Y as a product of a common set of orthogonal factors and a set of specific loadings. So, the independent variables are decomposed as

$$\mathbf{X} = \mathbf{T}\,\mathbf{P}^{\mathsf{T}} \quad \text{with} \quad \mathbf{T}^{\mathsf{T}}\mathbf{T} = \mathbf{I}, \tag{3}$$

with I being the identity matrix (some variants of the technique do not require T to have unit norms; these variants differ mostly in the choice of normalization, and while they do not differ in their final prediction, the differences in normalization may make comparisons between different implementations of the technique delicate). By analogy with PCA, T is called the score matrix and P the loading matrix (in PLS regression the loadings are not orthogonal). Likewise, Y is estimated as

$$\widehat{\mathbf{Y}} = \mathbf{T}\,\mathbf{B}\,\mathbf{C}^{\mathsf{T}}, \tag{4}$$

where B is a diagonal matrix with the "regression weights" as diagonal elements and C is the "weight matrix" of the dependent variables (see below for more details on the regression weights and the weight matrix). The columns of T are the latent vectors. When their number is equal to the rank of X, they perform an exact decomposition of X. Note, however, that the latent vectors provide only an estimate of Y (i.e., in general Ŷ is not equal to Y).

4 PLS regression and covariance

The latent vectors could be chosen in many different ways.
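A classical way to compute them is the iterative NIPALS algorithm listed in the keywords. The sketch below is one common formulation, added here for illustration and not a verbatim transcription of the article's algorithm: t is normalized to unit length so that TᵀT = I as in Eq. (3), and X and Y are deflated after each latent vector so that Ŷ = TBCᵀ as in Eq. (4).

```python
import numpy as np

def pls_nipals(X, Y, n_comp, tol=1e-10, max_iter=500):
    """One common NIPALS-style PLS regression iteration.

    Returns T (scores, T^T T = I), P (X loadings), b (diagonal of B),
    C (Y weights), and Y_hat = T B C^T (cf. Eqs. 3 and 4)."""
    E, F = X.copy(), Y.copy()              # deflated copies of X and Y
    T = np.zeros((X.shape[0], n_comp))
    P = np.zeros((X.shape[1], n_comp))
    C = np.zeros((Y.shape[1], n_comp))
    b = np.zeros(n_comp)

    for k in range(n_comp):
        u = F[:, [0]]                      # initialize from a column of Y
        for _ in range(max_iter):
            w = E.T @ u                    # weights of X
            w /= np.linalg.norm(w)
            t = E @ w                      # latent vector (score of X)
            t /= np.linalg.norm(t)         # unit norm, so T^T T = I
            c = F.T @ t                    # weights of Y
            c /= np.linalg.norm(c)
            u_new = F @ c                  # score of Y
            if np.linalg.norm(u_new - u) < tol:
                u = u_new
                break
            u = u_new
        p = E.T @ t                        # loadings of X
        b[k] = (t.T @ u).item()            # regression weight
        E -= t @ p.T                       # deflate X
        F -= b[k] * (t @ c.T)              # deflate Y
        T[:, [k]], P[:, [k]], C[:, [k]] = t, p, c

    Y_hat = T @ np.diag(b) @ C.T           # Eq. (4)
    return T, P, b, C, Y_hat

T, P, b, C, Y_hat = pls_nipals(X, Y, n_comp=2)   # X, Y as defined earlier
```

When n_comp equals the rank of X, the deflated copy of X shrinks to zero and the decomposition of X is exact, matching the remark after Eq. (4).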