OXFORD LIBRARY OF PSYCHOLOGY

Editor-in-Chief: PETER E. NATHAN

The Oxford Handbook of Quantitative Methods

Edited by Todd D. Little

VOLUME 2: STATISTICAL ANALYSIS

OXFORD UNIVERSITY PRESS

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide.

Oxford  New York
Auckland  Cape Town  Dar es Salaam  Hong Kong  Karachi  Kuala Lumpur  Madrid  Melbourne  Mexico City  Nairobi  New Delhi  Shanghai  Taipei  Toronto

With offices in
Argentina  Austria  Brazil  Chile  Czech Republic  France  Greece  Guatemala  Hungary  Italy  Japan  Poland  Portugal  Singapore  South Korea  Switzerland  Thailand  Turkey  Ukraine  Vietnam

Oxford is a registered trademark of Oxford University Press in the UK and certain other countries.

Published in the United States of America by Oxford University Press 198 M

© Oxford University Press 2013

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by license, or under terms agreed with the appropriate reproduction rights organization. Inquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.

You must not circulate this work in any other form and you must impose this same condition on any acquirer.

Library of Congress Cataloging-in-Publication Data
The Oxford handbook of quantitative methods / edited by Todd D. Little.
v. cm. (Oxford library of psychology)
ISBN 978-0-19-993487-4
ISBN 978-0-19-993489-8
1. Psychology: Statistical methods. 2. Psychology: Mathematical models. I. Little, Todd D.
BF39.O927 2012
150.7211-dc23
2012015005

9 8 7 6 5 4 3 2
Printed in the United States of America on acid-free paper

CHAPTER 25

Latent Class Analysis and Finite Mixture Modeling

Katherine E. Masyn

Abstract

Finite mixture models, which are a type of latent variable model, express the overall distribution of one or more variables as a mixture of a finite number of component distributions. In direct applications, one assumes that the overall population heterogeneity with respect to a set of manifest variables results from the existence of two or more distinct homogeneous subgroups, or latent classes, of individuals. This chapter presents the prevailing "best practices" for direct applications of basic finite mixture modeling, specifically latent class analysis (LCA) and latent profile analysis (LPA), in terms of model assumptions, specification, estimation, evaluation, selection, and interpretation. In addition, a brief introduction to structural equation mixture modeling in the form of latent class regression is provided, as well as a partial overview of the many more advanced mixture models currently in use. The chapter closes with a cautionary note about the limitations and common misuses of latent class models and a look toward promising future developments in mixture modeling.

Key Words: finite mixture, latent class, latent profile, latent variable
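The core object defined above, an overall distribution expressed as a weighted combination of component distributions, can be sketched numerically in a few lines. The sketch below is a minimal illustration only: the mixing proportions and the normal component parameters (meant to evoke the chapter's male/female height example) are invented, not taken from any data set.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a N(mu, sigma^2) component evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, proportions, components):
    """Finite mixture density: f(x) = sum_k pi_k * f_k(x)."""
    return sum(p * f(x) for p, f in zip(proportions, components))

# Two hypothetical subpopulations of adult height in cm (parameters invented).
components = [lambda x: normal_pdf(x, 176.0, 7.0),   # e.g., males
              lambda x: normal_pdf(x, 163.0, 6.5)]   # e.g., females
proportions = [0.5, 0.5]  # mixing proportions; must sum to 1

f_height = lambda x: mixture_pdf(x, proportions, components)
```

Because the mixing proportions sum to 1 and each component is a proper density, the mixture is itself a proper density, yet it can be skewed or bimodal even though every component is normal.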

Introduction

Like many modern statistical techniques, mixture modeling has a rich and varied history: it is known by different names in different fields; it has been implemented using different parameterizations and estimation algorithms in different software packages; and it has been applied and extended in various ways according to the substantive interests and empirical demands of different disciplines as well as the varying curiosities of quantitative methodologists, statisticians, biostatisticians, psychometricians, and econometricians. As such, the label is quite equivocal, subsuming a range of specific models, including, but not limited to: latent class analysis (LCA), latent profile analysis (LPA), latent class cluster analysis, discrete latent trait analysis, factor mixture models, growth mixture models, semi-parametric group-based models, semi-nonparametric group-mixed models, regression mixture models, latent state models, latent structure analysis, and hidden Markov models.

Despite the equivocal label, all of the different mixture models listed above have two common features. First, they are all finite mixture models in that they express the overall distribution of one or more variables as a mixture of, or composite of, a finite number of component distributions, usually simpler and more tractable in form than the overall distribution. As an example, consider the distribution of adult heights in the general population. Knowing that males are taller, on average, than females, one could choose to express the distribution of heights as a mixture of two component distributions for males and

551 females, respectively. Iff(height) is the probability where the number of mixing components, K, in density function of the distribution of heights the number of categories or classes of c (c = in the overall population, it could be expressed 1, ... , K); the mixing proportions are the class pro­ as: portions, Pr(c = 1), ... , Pr(c = K); and the /(height) = Pmale · fmate(height) component distribution density functions are the distribution functions of the response variable, con­ + PJemale ffernate(height), (1) ditional on latent class membership, f(heightic = where Pmale and Pftmale are the proportions of males 1), ... ,f(heightjc = K). and females in the overall population, respectively, Recognizing mixture models as latent variable and fma/e(height) and fiemate(height) are the dis­ models allows use of the discourse language of the tributions of heights within the male and female latent variable modeling world. There are two pri­ mary types of variables: (1) variables (e.g., the subpopulations, respectively. Pm11k and Pftmale are latent referred to as the mixingproportions andfinale (height) latent class variable, c) that are not directly observed and /rernate(height) are the component distribution or measured, and (2) manifest variables (e.g., the density functions. response variables) that are observable and are pre­ The second common feature for all the differ­ sumed to be influenced by or caused by the latent ent kinds of mixture models previously listed is variable. The manifest variables are also referred to that the components themselves are not directly as indicator variables, as their observed values for a observed-that is, mixture component membership given individual are imagined to be imperfect indi­ is unobserved or latent for some or all individ­ cations of the individual's "true" underlying latent uals in the overall population. So, rather than class membership. 
Framed as a latent variable model, expressing the overall population distribution as a there are two parts to any mixture model: (1) the mixture of known groups, as with the height exam­ measurementmodel, and {2) the structural model. The ple, mixture models express the overall population statistical measurement model specifies the relation­ distribution as a finite mixture of some number, K, ship between the underlying latent variable and the of unknown groups or components. For the dis­ corresponding manifest variables. In the case of mix­ tribution of height, this finite mixture would be ture models, the measurement model encompasses expressed as: the number oflatent classes and the class-specific dis­ tributions of the indicator variables. The structural /(height)= PI ·fi (height)+ P2 ·fi(height) model specifies the distribution of the latent vari­ + · · · + PK · /K(height), (2) able in the population and the relationships between where the number of components, K, the mix­ latent variables and between latent variables and cor­ ing proportions, PI> ... ,pK, and the component­ responding observed predictors and outcomes (i.e., specific height distributions, fi (height), ... ,fK latent variable antecedent and consequent variables). (height), are all unknown but can be estimated, In the case of unconditional mixture models, the under certain identifYing assumptions, using height structural model encompasses just the latent class data measured on a representative sample from the proportions. total population. Finite Mixture Modeling As a Finite Mixture Models As Latent Person-Centered Approach Variable Models Mixture models are obviously distinct from the It is the unknown nature of the mixing more familiar latent variable factor models in which components--in number, proportion, and form­ the underlying latent structure is made up of one that situates finite mixture models in the broader or more continuous latent variables. 
The designa­ category of latent variable models. The finite mix­ tion for mixture modeling often used in applied ture distribution given in Equation 2 can be re­ literature to highlight this distinction from factor expressed in terms of a latent unordered categorical analytic models does not involve the overt cate­ variable, usually referred to as a latent class variable gorical versus contimtous latent variable scale com­ and denoted by c, as follows: parison but instead references mixture modeling as a person-certtered or person-oriented approach {in /(height)= Pr(c = 1) · f(heightjc = 1) contrast to variable-centered or variable-oriented). + · · · + Pr(c = K) ·f(heightlc = K), Person-centered approaches describe similarities and (3) differences among individuals with respect to how

variables relate to each other and are predicated on the assumption that the population is heterogeneous. "In person-centered compared with variable-centered analyses, the theorem of Ed

is clear. However, these rationales are not necessary for the purposed application of mixture models and I will touch on this topic again throughout the chapter, to recapitulate what constitutes principled use of mixture models.

Chapter Scope

This chapter is intended to provide the reader with a general overview of mixture modeling. I aim to summarize the current "best practices" for model specification, estimation, selection, evaluation, comparison, interpretation, and presentation for the two primary types of cross-sectional mixture analyses: latent class analysis (LCA), in which there are observed categorical indicators for a single latent class variable, and latent profile analysis (LPA), also known as latent class cluster analysis (LCCA), in which there are observed continuous indicators for a single latent class variable. As with other latent variable techniques, the procedures for model building and testing in these settings readily extend to more complex data settings, for example, longitudinal and multilevel variable systems. I begin by providing a brief historic overview of the two primary roots of modern-day mixture modeling in the social sciences and the foci of this chapter, finite mixture modeling and LCA, along with a summary of the purposed applications of the models. For each broad type of model, the general model formulation is presented, in both equations and path diagrams, followed by an in-depth discussion of model interpretation. Then a description of the model estimation, including a presentation of current tools available for model evaluation and testing, is provided, leading to a detailed illustration of a principled model building process with a full presentation and interpretation of results. Next, an extension of the unconditional mixture models already presented in the chapter is made to accommodate covariates using a latent class regression (LCR) formulation. I conclude the chapter with a brief cataloging of (some of) the many extensions of finite mixture modeling beyond the scope of this chapter, some cautionary notes about the misconceptions and misuses of mixture modeling, and a synopsis of prospective developments in the mixture modeling realm.

A Brief and Selective History of Mixture Modeling

Finite Mixture Modeling

Finite mixture modeling, in its most classic form, is a cross-sectional latent variable model in which the latent variable is nominal and the corresponding manifest variables are continuous. This form of finite mixture modeling is also known as LPA or LCCA. One of the first demonstrations of finite mixture modeling was done by a father of modern-day statistics, Karl Pearson, in 1894, when he fit a two-component (i.e., two-class) univariate normal mixture model to crab measurement data belonging to his colleague, Walter Weldon (1893), who had suspected that the skewness in the sample distribution of the crab measurements (the ratio of forehead to body length) might be an indication that this crab species from the Bay of Naples was evolving to two subspecies (McLachlan & Peel, 2000). Pearson used the method-of-moments to estimate his model and found evidence of the presence of two normally distributed mixing components that were subsequently identified as crab subspecies. There weren't many other mixture model applications that immediately followed suit because the daunting moments-based fitting was far too computationally intensive for mixtures. And it would take statisticians nearly 80 years to find more viable, as well as superior, alternative estimation procedures. Tan and Chang (1972) were among the researchers of their time who proved the maximum likelihood solution to be better for mixture models than the method-of-moments. Following on the heels of this insight was the release of the landmark article by Dempster, Laird, and Rubin (1977) that explicated, in general terms, an iterative estimation scheme, the expectation-maximization (EM) algorithm, for maximum-likelihood estimation from incomplete data. The recognition that finite mixture models could be easily reconceived as missing data problems (because latent class membership is missing for all individuals), and thus estimated via the EM algorithm, represented a true turning point in the development of mixture modeling. Since that time, there has been rapid advancement in a variety of applications and extensions of mixture modeling, which are covered briefly in the section on "The More Recent Past" following the separate historical accounting of LCA.

Before moving on, there is another feature of the finite mixture history that is worth remarking on, as it relates to the earlier discussion of person-centered versus variable-centered approaches. Over the course of the twentieth century, there was a bifurcation in the development and application of finite mixture models in the statistical community following that early mixture modeling by Pearson, both before and after the advancement of the estimation algorithms. There was a distinction that

began to be made between direct and indirect applications (Titterington, Smith, & Makov, 1985) of finite mixture modeling. In direct applications, as in person-centered approaches, mixture models are used with the a priori assumption that the overall population is heterogeneous, and made up of a finite number of (latent and substantively meaningful) homogeneous groups or subpopulations, usually specified to have tractable distributions of indicators within groups, such as a multivariate normal distribution. In indirect applications, as in variable-centered approaches, it is assumed that the overall population is homogeneous and finite mixtures are simply used as a more tractable, semi-parametric technique for modeling a population distribution of outcomes for which it may not be possible (practically or analytically speaking) to specify a parametric model. Mathematical work was done to prove that virtually any continuous distribution (even highly skewed, highly kurtotic, multimodal, or in other ways non-normal) could be approximated by the mixing of K normal distributions if K was permitted to be indiscriminately large, and that a reasonably good approximation of most distributions could be obtained by the mixing of a relatively small number of normal distributions (Titterington, Smith, & Makov, 1985). Figure 25.1 provides an illustration of a univariate non-normal distribution that is the result of the mixing of two normally distributed components. The focus for indirect applications is then not on the resultant mixture components nor their interpretation but, rather, on the overall population distribution approximated by the mixing.

[Figure 25.1: Hypothetical overall univariate non-normal population distribution (solid line) resulting from a mixing of two normally distributed subpopulations (dashed lines).]

I find the indirect versus direct application distinction for mixture modeling less ambiguous than the person-centered versus variable-centered labels and, thus, will favor that language throughout the remainder of this chapter. Furthermore, the focus in this chapter is almost exclusively on direct applications of mixture models as I devote considerable time to the processes of class enumeration and interpretation and give weight to matters of classification quality, all of which are of little consequence for indirect applications.

Latent Class Analysis

Latent class models can be considered a special subset of finite mixture models formulated as a mixture of generalized linear models; that is, finite mixtures with discrete response variables with class-specific multinomial distributions. However, LCA has a rich history within the psychometric tradition, somewhat independent of the development of finite mixture models, that is worthy of remark, not unlike the way in which analysis of variance (ANOVA) and analysis of covariance (ANCOVA) models, although easily characterized as a special subset of multiple linear regression models, have their own historical timeline.

It didn't take long after Spearman's seminal work on factor analysis in 1904 for suggestions regarding categorical latent variables to appear in the literature. However, it wasn't until Lazarsfeld and Henry summarized their two decades of work on latent structure analysis (which included LCA as a subdomain of models) in 1968 that social scientists were presented with a comprehensive treatment of the theoretical and analytic features of LCA that had been in serious development since the 1950s.

Despite the expansive presentation and motivation for LCA provided by Lazarsfeld and Henry (1968), there were still two primary barriers to larger scale adoption of latent class models by applied researchers: (1) the categorical indicators could only be binary, and (2) there was no general, reliable, or widely implemented estimation method for obtaining parameter estimates (Goodman, 2002). Goodman (1974) resolved the first and part of the second problem with the development of a method for obtaining maximum likelihood estimates of latent class parameters for dichotomous and polytomous indicators. Once Goodman's estimation algorithm was implemented in readily available statistical software, first by Clogg in 1977, and Goodman's approach was shown to be closely related to the EM algorithm of Dempster, Laird, and Rubin (1977), currently the most widely utilized estimation algorithm for LCA software (Collins & Lanza, 2010), the remaining portion of the second barrier to the application of LCA was annulled. I will

return to matters related to maximum likelihood estimation for LCA parameters later in this chapter.

As with the history of finite mixture modeling, there is some comment necessary on the features of LCA history related to person-centered versus variable-centered approaches. Latent class models, with categorical indicators of a categorical latent variable, have, at different times, been described in both person-centered and variable-centered terms. For example, one of the fundamental assumptions in classical LCA is that the relationship among the observed categorical variables is "explained" by an underlying categorical latent variable (the latent class variable); that is, the observed variables are conditionally (locally) independent given latent class membership. In this way, LCA was framed as the pure categorical variable-centered analog to continuous variable-centered factor analysis (in which the covariances among the observed continuous variables are explained by one or more underlying continuous factors). Alternatively, LCA can be framed as a multivariate data reduction technique for categorical response variables, similar to how factor analysis may be framed as a dimension-reduction technique that enables a system of m variables to be reduced to a more parsimonious system of q factors with q ≪ m. Consider a set of 10 binary indicator variables. There are 2^10 = 1024 possible observed response patterns, and one could exactly represent the n × 10 observed data matrix as a frequency table with 1024 (or fewer) rows corresponding to the actual observed response patterns. Essentially, in the observed data there are a maximum of 1024 groupings of individuals based on their observed responses. Latent class analysis then enables the researcher to group or cluster these response patterns (and, thus, the individuals with those response patterns) into a smaller number of K latent classes (K ≪ 1024) such that the response patterns for individuals within each class are more similar than response patterns across classes. For example, response patterns (1 1 1 1 1 1 1 1 1 1) and (0 1 1 1 1 1 1 1 1 1) might be grouped in the same latent class, different from (0 0 0 0 0 0 0 0 0 0) and (0 0 0 0 0 0 0 0 1 0). The classes are then characterized not by exact response patterns but by response profiles or typologies described by the relative frequencies of item endorsements. Because grouping the observed response patterns is tantamount to grouping individuals, this framing of LCA is more person-oriented. Thus, in both the psychometric tradition in which LCA was developed and in the classical mathematical statistics tradition in which finite mixture modeling was developed, mixture models have been used as both a person-centered and variable-centered approach, leading to some of the confusion surrounding the misleading association of mixture models as implicitly person-centered models and the false dichotomy between person-centered and variable-centered approaches.

The More Recent Past

In both finite mixture modeling and LCA, the utilization of the EM algorithm for maximum likelihood estimation of the models, coupled with rapid and widespread advancements in statistical computing, resulted in a remarkable acceleration in the development, extension, application, and understanding of mixture modeling over the last three decades, as well as a general blurring of the line that delineated latent class models from more general finite mixture models. A few of the many notable developments include the placement of latent class models within the framework of log-linear models (Formann, 1982, 1992; Vermunt, 1999); LCR and conditional finite mixture models, incorporating predictors of class membership (Bandeen-Roche, Miglioretti, Zeger, & Rathouz, 1997; Dayton & Macready, 1988); and the placement of finite mixture modeling within a general latent structure framework, enabling multiple and mixed measurement modalities (discrete and continuous) for both manifest and latent variables (Hancock & Samuelson, 2008; Muthén & Shedden, 1999; Skrondal & Rabe-Hesketh, 2004). For an overview of the most recent developments in finite mixture modeling, see McLachlan and Peel (2000) and Vermunt and McCutcheon (2012). For more recent developments specifically related to LCA, see Hagenaars and McCutcheon (2002) and Collins and Lanza (2010).

There has also been conspicuous growth in the number of statistical software packages that enable the application of a variety of mixture models in real data settings. The two most prominent self-contained modeling software packages are Mplus V6.11 (Muthén & Muthén, 1998-2011), which is the software used for all the empirical examples in this chapter, and Latent GOLD V4.5 (Statistical Innovations, Inc., 2005-2011), both capable of general and comprehensive latent variable modeling, including, but not limited to, finite mixture modeling. The two most popular modular packages that operate within existing software are PROC LCA and PROC LTA for SAS (Lanza, Dziak, Huang, Xu, &

Collins, 2011), which are limited to traditional categorical indicator latent class and latent transition analysis models, and GLLAMM for Stata (Rabe-Hesketh, Skrondal, & Pickles, 2004), which is a comprehensive generalized linear latent and mixed model framework utilizing adaptive quadrature for maximum likelihood estimation.

Access to software and the advancements in high-speed computing have also led to a remarkable expansion in the number of disciplines that have made use of mixture models as an analytic tool. There has been particularly notable growth in the direct application of mixture models within the behavioral and educational sciences over the last decade. Mixture models have been used in the empirical investigations of such varied topics as typologies of adolescent smoking within and across schools (Henry & Muthén, 2010); marijuana use and attitudes among high school seniors (Chung, Flaherty, & Schafer, 2006); profiles of gambling and substance use (Bray, 2007); risk profiles for overweight in adolescent populations (BeLue, Francis, Rollins, & Colaco, 2009); patterns of peer victimization in middle school (Nylund-Gibson, Graham, & Juvonen, 2010); liability to externalizing disorders (Markon & Krueger, 2005); profiles of academic self-concept (Marsh, Ludtke, Trautwein, & Morin, 2009); profiles of program evaluators' self-reported practices (Christie & Masyn, 2010); rater behavior in essay grading (DeCarlo, 2005); mathematical ability for special education students (Yang, Shaftel, Glasnapp, & Poggio, 2005); patterns of public assistance receipt among female high school dropouts (Hamil-Luker, 2005); marital expectations of adolescents (Crissey, 2005); and psychosocial needs of cancer patients (Soothill, Francis, Awwad, Morris, Thomas, & McIllmurray, 2004).

Latent Class Analysis

Although latent class models (mixture models with exclusively categorical indicator variables for the latent class variable) emerged more than a half-century after the inception of finite mixture models, I choose to use LCA for this initial foray into the details of mixture modeling because I believe it is the most accessible point of entry for applied readers.

Model Formulation

As with any latent variable model, there are two parts to a latent class model: (1) the measurement model, which relates the observed response variables (also called indicator or manifest variables) to the underlying latent variable(s); and (2) the structural model, which characterizes the distribution of the latent variable(s) and the relationships among latent variables and between latent variables and observed antecedent and consequent variables. In a traditional latent variable model-building process, the unconditional measurement model for each latent variable of interest is established prior to any structural model-based hypothesis testing. It is the results of the final measurement model that researchers use to assign meaning to the latent classes that are then used in the substantive interpretations of any structural relationships that emerge. Thus, the formal LCA model specification begins here with an unconditional model in which the only observed variables are the categorical manifest variables of the latent class variable.

Suppose there are M categorical (binary, ordinal, and/or multinomial) latent class indicators, u_1, u_2, ..., u_M, observed on n study participants, where u_mi is the observed response to item m for participant i. It is assumed for the unconditional LCA that there is an underlying unordered categorical latent class variable, denoted by c, with K classes, where c_i = k if individual i belongs to Class k. The proportion of individuals in Class k, Pr(c = k), is denoted by π_k. The K classes are exhaustive and mutually exclusive such that each individual in the population has membership in exactly one of the K latent classes and Σ_k π_k = 1. The relationship between the observed responses on the M items and the latent class variable, c, is expressed as

Pr(u_1i, u_2i, ..., u_Mi) = Σ_{k=1}^{K} [ π_k · Pr(u_1i, u_2i, ..., u_Mi | c_i = k) ].   (4)

The above expression is the latent class measurement model. The measurement parameters are all those related to the class-specific response pattern probabilities, Pr(u_1i, u_2i, ..., u_Mi | c_i = k), and the structural parameters are those related to the distribution of the latent class variable, which for the unconditional model are simply the class proportions, π_k.

The model expressed in Equation 4 can be represented by a path diagram as shown in Figure 25.2. All the path diagrams in this chapter follow the diagramming conventions used in the Mplus V6.11 software manual (Muthén & Muthén, 1998-2011): boxes to enclose observed variables; circles to enclose latent variables; single-headed arrow paths

to represent direct (causal) relationships; double-headed arrow paths to represent nondirectional (correlational) relationships; "u" to denote observed categorical variables; "y" to denote observed continuous variables; "c" to denote latent categorical variables (finite mixtures or latent class variables); and "η" to denote latent continuous variables (factors).

[Figure 25.2: Generic path diagram for an unconditional latent class model.]

Similarly to the typical default model specification in traditional factor analysis, conditional or local independence is assumed for the M items conditional on class membership. This assumption implies that latent class membership explains all of the associations among the observed items. Thus, the formation of the latent classes (in number and nature) based on sample data will be driven by the set of associations among the observed items in the overall sample. If all the items were independent from each other in the sample, that is, if all the items were uncorrelated in the overall sample, then it would not be possible to estimate a latent class model with more than K = 1 classes because there would be no observed associations to be explained by class membership. Under the local independence assumption, Equation 4 simplifies to

Pr(u_1i, u_2i, ..., u_Mi) = Σ_{k=1}^{K} [ π_k · ∏_{m=1}^{M} Pr(u_mi | c_i = k) ].   (5)

This assumption is represented in Figure 25.2 by the absence of any nondirectional (double-headed arrow) paths between the u's that would represent item correlations within or conditional on latent class membership. The tenability of the local independence assumption can be evaluated and may also be partially relaxed (see, e.g., Huang & Bandeen-Roche, 2004). However, some degree of local independence is necessary for latent class model identification. It is not possible to fully relax this assumption for models with K > 1 classes; that is, an unconditional latent class model with all the items allowed to co-vary with all other items within class is not identified for K > 1 classes unless other parameter restrictions are imposed. I will revisit this assumption in the context of finite mixture modeling with continuous indicators. In that setting, models with K > 1 classes are identified even with all items co-varying within latent classes under certain other assumptions, for example, the distributional assumption of multivariate normality of the indicators within class.

Model Interpretation

As I mentioned earlier, it is the results of the final unconditional LCA, the measurement model, that are used to assign meaning to the latent classes, which augments the substantive interpretations of any structural relationships that emerge. Unless you are using mixture models in an indirect application as a semi-parametric approximation for an overall homogeneous population, such that your attention will only be on parameter estimates for the overall (re)mixed population, you will focus your interpretation on the separate mixing components, interpreting each latent class based on the relationships between the classes and their indicators, just as you use factor loadings and item communalities to interpret factors in a factor analysis. And just as with factor analysis, to reasonably interpret the latent class variable, you must have "good" measures of each of the classes.

A good item is one that measures the latent class variable well (i.e., reliably). A good latent class indicator is one for which there is a strong relationship between the item and the latent class variable. Strong item-class relationships must have both of the following features: (1) a particular item response, for example, item endorsement in the case of binary items, epitomizes members in at least one of the K latent classes in the model; and (2) the item can be used to distinguish between members across at least one pair of classes among the K latent classes in the model. The first quality is referred to as latent class homogeneity and the second quality is referred to as latent class separation (Collins & Lanza, 2010).

To better understand the concepts of latent class homogeneity and latent class separation, and how these concepts both relate to the parameters of the unconditional measurement model and ultimately qualify the interpretation of the resultant latent classes, consider a hypothetical example with five

binary response variables (M = 5) measuring a three-class categorical latent variable (K = 3). The unconditional model is given by

Pr(u_1i, u_2i, u_3i, u_4i, u_5i) = Σ_{k=1}^{K} [ π_k · Π_{m=1}^{5} ω_{m|k}^{u_mi} (1 − ω_{m|k})^{1−u_mi} ],   (6)

where ω_{m|k} is the probability that an individual belonging to Class k would endorse item m; that is, Pr(u_mi = 1 | c_i = k) = ω_{m|k}.

Class Homogeneity. To interpret each of the K classes, you first need to identify items that epitomize each class. If a class has a high degree of homogeneity with respect to a particular item, then there is a particular response category on that item that can be considered a response that typifies that class. In the case of binary items, a strong association with a particular class, or high class homogeneity, is indicated by high or low model-estimated probabilities of endorsement; that is, ω_{m|k} or 1 − ω_{m|k} close to 1, with "close" defined by ω_{m|k} > .7 or ω_{m|k} < .3. For example, consider a class with an estimated class-specific item probability of 0.90. This means that in that class, an estimated 90% of individuals will endorse that particular item whereas only 10% will not. You could then consider this item endorsement as "typical" or "characteristic of" that class and could say that class has high homogeneity with respect to that item. Now consider a class with an estimated class-specific item probability of 0.55. This means that in that class, only an estimated 55% of individuals will endorse that particular item whereas 45% will not. Item endorsement is neither typical nor characteristic of that class, nor is lack of item endorsement, for that matter, and you could say that class has low homogeneity with respect to that item and would not consider that item a good indicator of membership for that particular class.

Class Separation. To interpret each of the K classes, you must not only have class homogeneity with respect to the items such that the classes are each well characterized by the item set, you also need to be able to distinguish between the classes; this quality is referred to as the degree of class separation. It is possible to have high class homogeneity and still have low class separation. For example, consider two classes, one of which has an estimated class-specific item probability of 0.90 and another class with an estimated class-specific item probability of 0.95. In this case, since item endorsement is "typical" for both of these classes and the two classes can be characterized by a high rate of endorsement for that item, they are not distinct from each other with respect to that item. Now consider two classes, one of which has an estimated class-specific item probability of 0.90 and another with an estimated class-specific item probability of 0.05. In this case, each class has good homogeneity with respect to the item and they also have a high degree of separation because the first class may be characterized by a high rate of item endorsement whereas the other class may be characterized by a high rate of item non-endorsement. To quantify class separation between Class j and Class k with respect to a particular item, m, compute the estimated item endorsement odds ratio as given by:

OR_{m|jk} = [ ω̂_{m|j} / (1 − ω̂_{m|j}) ] / [ ω̂_{m|k} / (1 − ω̂_{m|k}) ].   (7)

Thus, OR_{m|jk} is the ratio of the odds of endorsement of item m in Class j to the odds of endorsement of item m in Class k. A large OR_{m|jk} > 5 (corresponding to approximately ω_{m|j} > .7 and ω_{m|k} < .3) or small OR_{m|jk} < .2 (corresponding to approximately ω_{m|j} < .3 and ω_{m|k} > .7) indicates a high degree of separation between Classes j and k with respect to item m. Thus, high class homogeneity with respect to an item is a necessary but not sufficient condition for a high degree of class separation with respect to an item.

I should note here that although simply taking the ratio of the class-specific item response probabilities may seem more intuitive, the use of the odds ratio of item response rather than the response probability ratio is preferred because the odds ratio doesn't depend on whether you emphasize item endorsement or item non-endorsement separation, or whether you are assessing the item-endorsement separation for classes with relatively high endorsement rates overall or low endorsement rates overall for the item in question. For example, an OR_{m|jk} of 0.44 corresponding to ω_{m|j} = .80 versus ω_{m|k} = .90 is the same as the OR_{m|jk} corresponding to ω_{m|j} = .10 versus ω_{m|k} = .20, whereas the class-specific item probability ratios would be .80/.90 = 0.89 and .10/.20 = 0.50.

Class Proportions. It is possible, to a certain extent, to use the class proportion values themselves to assign meaning to the classes. Consider the case in which you have a population-based sample and one of the resultant classes has an estimated class proportion of greater than 0.50; that is, the class represents more than 50% of the overall population. Then part of your interpretation of this class

may include an attribution of "normal," "regular," or "typical" in that the class represents the statistical majority of the overall population. Similarly, if you had a resultant class with a small estimated class proportion (e.g., 0.10), part of your interpretation of that class might include an attribution of "rare," "unusual," or "atypical," ever mindful that such attribution labels, depending on the context, could carry an unintended negative connotation, implying the presence of deviance or pathology in the subpopulation represented by that class. Also remember that the estimated class proportion reflects the distribution of the latent classes in the sample. Thus, if you have a nonrandom or nonrepresentative sample, exercise caution when using the estimated class proportions in the class interpretations. For example, a "normal" class in a clinical sample may still be present in a nonclinical sample but may have a much smaller "atypical" representation in the overall population.

Hypothetical Example. Continuing with the hypothetical example of a three-class LCA with five binary indicators, Table 25.1 provides hypothetical model-estimated item response probabilities for each class along with the item response odds ratios calculated following Equation 7. Classes 1, 2, and 3 all have high homogeneity with respect to items u1, u2, and u4 because all class-specific item response probabilities are greater than 0.70 or less than 0.30. Class 1 also has high homogeneity with respect to item u3, whereas Classes 2 and 3 do not. Thus, Classes 2 and 3 are not well characterized by item u3; that is, there is not a response to item u3 that typifies either Class 2 or 3. None of the classes are well characterized by item u5, and this might be an item that is considered for revision or elimination in future studies.

Class 1 is well separated from Class 2 by all the items except the last, with OR_{m|1,2} > 5. Class 1 is not well distinguished from Class 3 by items u1 and u2 but is well separated from Class 3 by items u3 and u4. Classes 2 and 3 are well separated by items u1 and u2 but not by items u3 and u4. Thus, as a result of Classes 2 and 3 not being well characterized by item u3, they are consequently not distinguishable from each other with respect to item u3. Because none of the classes have a high degree of homogeneity with respect to item u5, none of the classes are well separated from each other by that item.

The class homogeneity and separation information contained in Table 25.1 is usually depicted graphically in what is often referred to as a "profile plot," in which the class-specific item probabilities (y-values) are plotted in a line graph for each of the items (x-values). Figure 25.3 depicts a profile plot using the hypothetical model results presented in Table 25.1. I have added horizontal lines to the profile plot at 0.70 and 0.30 to assist in the visual inspection with respect to both class homogeneity and class separation. Class 1 can be interpreted as a group of individuals with a high propensity for endorsing items u1-u4; Class 2, a group of individuals with a low propensity for endorsing items u1, u2, and u4; and Class 3, a group of individuals with a high propensity for endorsing items u1 and u2 with a low propensity for endorsing item u4. Notice that I do not use items with low class homogeneity for the interpretation of that class, nor do I use language in the class interpretation that would imply a class separation with respect to an item that isn't meaningful.

Table 25.1. Hypothetical Example: Model-Estimated, Class-Specific Item Response Probabilities and Odds Ratios Based on a Three-Class Unconditional Latent Class Analysis

                         ω̂_m|k                                        OR_m|jk
Item   Class 1 (70%)   Class 2 (20%)   Class 3 (10%)   Class 1 vs. 2   Class 1 vs. 3   Class 2 vs. 3
u1         0.90*           0.10            0.90            81.00**          1.00            0.01
u2         0.80            0.20            0.90            16.00            0.44            0.03
u3         0.90            0.40            0.50            13.50            9.00            0.67
u4         0.80            0.10            0.20            36.00           16.00            0.44
u5         0.60            0.50            0.40             1.50            2.25            1.50

*Item probabilities >0.7 or <0.3 are bolded to indicate a high degree of class homogeneity.
**Odds ratios >5 or <0.2 are bolded to indicate a high degree of class separation.
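The odds-ratio columns of Table 25.1 follow directly from Equation 7 applied to the class-specific endorsement probabilities. The short Python sketch below is purely illustrative (the names `omega`, `odds`, and `odds_ratio` are mine, not the chapter's); it reproduces the three odds-ratio columns of the table:

```python
# Class-specific item endorsement probabilities from Table 25.1
# (keys: items u1-u5; tuple entries: Classes 1-3).
omega = {
    "u1": (0.90, 0.10, 0.90),
    "u2": (0.80, 0.20, 0.90),
    "u3": (0.90, 0.40, 0.50),
    "u4": (0.80, 0.10, 0.20),
    "u5": (0.60, 0.50, 0.40),
}

def odds(p):
    """Odds of endorsement given an endorsement probability p."""
    return p / (1.0 - p)

def odds_ratio(p_j, p_k):
    """Equation 7: odds of endorsement in Class j over Class k."""
    return odds(p_j) / odds(p_k)

# Columns: Class 1 vs. 2, Class 1 vs. 3, Class 2 vs. 3 (as in Table 25.1).
or_table = {
    item: (round(odds_ratio(w1, w2), 2),
           round(odds_ratio(w1, w3), 2),
           round(odds_ratio(w2, w3), 2))
    for item, (w1, w2, w3) in omega.items()
}
```

For instance, for item u1 comparing Class 1 with Class 2, the two odds are 0.90/0.10 = 9 and 0.10/0.90 ≈ 0.11, giving the odds ratio of 81 reported in the table.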

Figure 25.3 Hypothetical example: Class-specific item probability profile plot for a three-class unconditional LCA (Class 1, 70%; Class 2, 20%; Class 3, 10%).
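The 0.70 and 0.30 reference lines on the profile plot amount to a simple decision rule for reading class homogeneity off the plotted probabilities. A hypothetical Python sketch (the function and variable names are my own, not the chapter's) that applies that rule to the Table 25.1 values:

```python
# Table 25.1 probabilities; the 0.70/0.30 reference lines from the
# profile plot translate into a simple three-way labeling rule.
omega = {
    "u1": (0.90, 0.10, 0.90),
    "u2": (0.80, 0.20, 0.90),
    "u3": (0.90, 0.40, 0.50),
    "u4": (0.80, 0.10, 0.20),
    "u5": (0.60, 0.50, 0.40),
}

def propensity(p, high=0.70, low=0.30):
    """Classify an item probability against the profile-plot cutoffs."""
    if p > high:
        return "high"       # endorsement typifies the class
    if p < low:
        return "low"        # non-endorsement typifies the class
    return "uninformative"  # item does not characterize the class

# One homogeneity profile per class, item by item.
profiles = {
    f"Class {k + 1}": {item: propensity(probs[k]) for item, probs in omega.items()}
    for k in range(3)
}
```

The resulting profiles match the interpretations given in the text: all of Class 1's propensities for u1-u4 are "high," Class 3 is "low" on u4, and u5 is uninformative for every class.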

For example, both Classes 2 and 3 are interpreted as groups of individuals with low propensity for endorsing item u4, but I do not, in the interpretation of Classes 2 and 3, imply that the two classes are somehow distinct with respect to u4, only that they are both distinct from Class 1 with respect to u4. I am also careful in my interpretation of the classes with categorical indicators to use explicit language regarding the probability or propensity of item endorsement rather than language that might incorrectly imply continuous indicators. For example, in this setting it would be incorrect to interpret Class 1 as a group of individuals with high levels of u1 and u2 with low levels of u4, on average.

Based on the estimated class proportions, assuming a random and representative sample from the overall population, one might also apply a modifier label of "normal" or "typical" to Class 1 because its members make up an estimated 70% of the population.

The next three subsections present some of the technical details of LCA related to model estimation, model selection, and missing data. For the novice mixture modeler, I suggest that you may want to skip these subsections on your first reading of this chapter and go directly to the real data example that follows.

Model Estimation

As discussed in the mixture modeling historical overview, the most significant turning point for mixture modeling estimation was the development of the EM algorithm by Dempster, Laird, and Rubin (1977) for maximum likelihood (ML) estimation from incomplete data and the realization that if one reconceives of latent class membership as missing class membership, then the EM algorithm can be used to obtain maximum likelihood estimates (MLEs) of LCA parameters.

The first step in any ML estimation is specifying the likelihood function. The complete data likelihood function, put simply, is the probability density of all the data (the array of all values on all variables, latent and observed, in the model for all individuals in the sample) given a set of parameter values. Maximizing the likelihood function with respect to those parameters yields the maximum likelihood estimates (MLEs) of those parameters; that is, the MLEs are the values of the parameters that maximize the likelihood of the data. For a traditional LCA model, the complete data likelihood for a single individual i, with the missing latent class variable, c_i, is given by

l_i(θ) = Pr(u_i, c_i | θ) = Pr(u_i | c_i, θ) · Pr(c_i | θ),   (8)

where θ is a vector of all the model parameters to be estimated. Typically, it is assumed that all cases are identically distributed such that the individual likelihood function, as expressed in Equation 8, is applicable for all cases. In the hypothetical LCA example with five binary indicators and three classes, θ would include 18 separate parameters: all the class-specific item response probabilities along with the class proportions; that is, θ = (ω_·|1, ω_·|2, ω_·|3, π_1, π_2, π_3), with ω_·|k = (ω_1|k, ω_2|k, ω_3|k, ω_4|k, ω_5|k). The likelihood function, L, for the whole sample is just the product of the individual likelihoods when assuming that all individuals in the sample are independent observations; that is, L(θ) = Π_i l_i(θ). Usually, it is easier mathematically to maximize the natural log of the likelihood function, ln(L(θ)) = LL(θ). Because the natural log is a monotonically increasing function, the values for θ that maximize the log likelihood function are the maximum likelihood estimates, θ̂_ML.

As previously noted, the most common estimation algorithm in use for mixture models is the EM algorithm (Dempster, Laird, & Rubin, 1977; Muthén & Shedden, 1999). Each iteration of the EM algorithm involves an expectation step (E-step) in which the estimated expected value for each missing data value is computed based on the current parameter estimates and observed data for the individual. In the case of LCA, the E-step estimates expected class membership for each individual. The E-step is followed by a maximization step (M-step) in which new parameter estimates, θ̂, are obtained that maximize the log likelihood function using the complete data from the E-step. Those parameter estimates are then used in the E-step of the next iteration, and the algorithm iterates until one of the stopping rules applies.

For most mixture models, with all individuals missing values for c, it is not possible to obtain the MLEs by just applying the rules of calculus and solving a system of equations based on partial derivatives of the log likelihood function with respect to each parameter; that is, there is not a closed-form solution.
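To make the E- and M-steps concrete, here is a bare-bones Python sketch of a single EM run for an unconditional LCA with binary items. The function `em_lca` and its defaults are my own illustrative choices, not code from the chapter; a real analysis would add multiple random starts, underflow protection, and identification checks:

```python
import math
import random

def em_lca(data, K, n_iter=200, seed=1):
    """One EM run for an unconditional LCA with binary items.

    data: list of 0/1 response vectors; K: number of latent classes.
    Returns (pi, omega, log_likelihood). Illustrative sketch only:
    no safeguards against empty classes or underflow.
    """
    rng = random.Random(seed)
    n, M = len(data), len(data[0])
    pi = [1.0 / K] * K                                   # class proportions
    omega = [[rng.uniform(0.25, 0.75) for _ in range(M)] for _ in range(K)]
    ll = float("-inf")
    for _ in range(n_iter):
        # E-step: posterior class probabilities for each individual,
        # plus the observed-data log likelihood at the current estimates.
        post, ll_new = [], 0.0
        for u in data:
            joint = []
            for k in range(K):
                p = pi[k]
                for m in range(M):
                    p *= omega[k][m] if u[m] else (1.0 - omega[k][m])
                joint.append(p)
            total = sum(joint)
            ll_new += math.log(total)
            post.append([p / total for p in joint])
        # M-step: closed-form weighted-proportion updates of pi and omega.
        for k in range(K):
            weight = sum(post[i][k] for i in range(n))
            pi[k] = weight / n
            for m in range(M):
                omega[k][m] = sum(post[i][k] * data[i][m]
                                  for i in range(n)) / weight
        if ll_new - ll < 1e-8:   # convergence criterion (stopping rule 2)
            ll = ll_new
            break
        ll = ll_new              # n_iter acts as stopping rule 1
    return pi, omega, ll
```

Note that although the observed-data log likelihood has no closed-form maximizer, the M-step updates for π_k and ω_m|k given the posteriors are simple weighted proportions, which is what makes EM attractive here.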
Rather, an iterative approach must be taken in which successive sets of parameter estimates are tried using a principled search algorithm with a pair of stopping rules: (1) a maximum number of iterations and (2) a convergence criterion. To understand the concept behind iterative maximum likelihood estimation, consider the following analogy: imagine that the log likelihood function is a mountain range and the estimation algorithm is a fearless mountain climber. The goal of the climber is to reach the highest peak (global maximum) in the range, but the climber can't see where the highest peak is from the base of the mountain range. So the climber chooses an informed starting point (the initial starting values for the parameter estimates), using what he can see (the observed data), and begins to climb. Each foothold is a new set of parameter estimates. After each step the climber stops and assesses which of the footholds within reach (nearby parameter estimate values) will give him the greatest gain in height in a single step, and he then leaves his current position to move to this higher point. He repeats this stepping process until he reaches a peak such that a step in any direction either takes him lower or not noticeably higher. The climber then knows he is at the peak (the convergence criterion is met), and it is here that he plants his flag, at the maximum log likelihood function value. But the climber, even as skilled as he is, cannot climb forever. He has limited food and water, and so even if he has not reached the peak, there is a point at which he must stop climbing (the maximum number of iterations). If he runs out of supplies before he reaches a peak (exceeds the maximum number of iterations before meeting the convergence criterion), then he does not plant his flag (fails to converge).

Although it would seem to go without saying, for the EM algorithm "mountain climber" to have even the slightest possibility of success in reaching the global peak of the log likelihood function, such a peak must exist. In other words, the model for which the parameters are being estimated must be identified; that is, there must be a unique solution for the model's parameters. However, this necessary fact may not be as trivial to establish as it would initially appear. When there is not a closed-form solution for the MLEs available, you cannot prove, mathematically speaking, that there is a global maximum. In this case, you are also unable to determine, theoretically, whether the solution you obtain from the estimation procedure is a global or local maximum, nor can you tell, when faced with multiple local maxima (a mountainous range with many peaks of varying heights), whether the highest local maximum is actually the global maximum (highest peak). If the estimation algorithm fails to converge, then it could be an indication that the model is not theoretically identified, but it is not a solid proof. There is also a gray area of empirical underidentification and weak identification in the span between identified models and unidentified models (i.e., models with no proper solution for all the model's parameters; failure of even one parameter to be identified causes the model to be under- or unidentified). This predicament is made more troublesome by the reality that the log likelihood surfaces for most mixture models are notoriously difficult for estimation algorithms to navigate, tending to have multiple local maxima, saddle points, and regions that are virtually flat, confusing even the most expert "climbers." To better understand some of the challenging log likelihood functions that may present

themselves, I discuss some exemplar log likelihood function plots for a unidimensional parameter space while providing some practical strategies to apply during the mixture model estimation process to help ensure the model you specify is well identified and the MLEs you obtain are stable and trustworthy solutions corresponding to a global maximum.

Figure 25.4 has six panels that represent a range of hypothetical log likelihood functions for a single parameter, θ. The unimodal log likelihood function in Figure 25.4.a has only one local maximum, which is the global maximum. θ̂_(0) is the MLE for θ because LL(θ̂_(0)) is the maximum value achieved by LL(θ) across all values of θ. It is clear that no matter what starting position on the x-axis (starting value, θ_(s), for the estimate of the parameter, θ) is selected, the mountain climber would easily find that global peak. This LL function reflects a well-identified model. There is one unique global maximum (MLE) that would be readily reached from any starting point. Now examine the multimodal likelihood function in Figure 25.4.b. There is still a single global maximum, θ̂_(0), but there are three other local maxima, θ̂_(1), θ̂_(2), and θ̂_(3). You can imagine that if you started your algorithm mountain climber at a point θ_(s) < θ̂_(2), then he might conclude his climb, reaching the convergence criterion and planting his flag, on the peak of the log likelihood above θ̂_(2), never realizing there were higher peaks down range. Similarly, if you started your climber at a point θ_(s) > θ̂_(1), then he might conclude his climb on the peak of the log likelihood above θ̂_(1), never reaching the global peak above θ̂_(0).

With a log likelihood function like the one depicted in Figure 25.4.b, one could expect the estimation algorithm to converge on a local rather than global maximum. If you obtained only one solution, θ̂, using one starting value, θ_(s), then you have no way of knowing whether θ̂ corresponds to the highest peak in the range or just a peak of the log likelihood in the range of θ. Because it isn't possible to resolve this ambiguity mathematically, it must be resolved empirically. In keeping with the analogy, if you want to find the highest peak in the range, then rather than retaining a single expert mountain climber, you could retain the services of a whole army of expert mountain climbers. You start each climber at a different point in the range. A few will "converge" to the lower local peaks, but most should reach the global peak. The more climbers from different starting points (random sets of starting values) that converge to the same peak (solution replication), the more confident you are in that particular peak being the global maximum. This strategy corresponds to using multiple sets of random starting values for the EM algorithm, iterating each set of starting values to convergence, and demanding a high frequency (in absolute and relative terms) of replication of the best log likelihood value.

I should note here that although replication of the maximum likelihood solution from different

Figure 25.4 Hypothetical log likelihood (LL) functions for a single parameter, θ: (a) unimodal LL; (b) multimodal LL; (c) bimodal LL with proximate local maxima; (d) bimodal LL with distant local maxima; (e) unbounded LL; and (f) LL with flat region.

sets of starting values increases confidence in that solution as the global optimum, replication of the likelihood value is neither a necessary nor a sufficient requirement to ensure that a global (rather than a local) maximum has been reached (Lubke, 2010). Thus, failure to replicate the best log likelihood value does not mean that you must discard the model. However, further exploration should be done to inform your final model selection. Consider the cases depicted in Figures 25.4.c and 25.4.d, for which there is a global maximum at θ̂_(0) and a local maximum of nearly the same log likelihood value at θ̂_(1). In cases such as these, the relative frequency of replication for each of the two solutions across a random set of start values may also be comparable. In Figure 25.4.c, not only are the two solutions very close in terms of the log likelihood values, they are also close to each other in the range of θ such that θ̂_(0) ≈ θ̂_(1). In this case you can feel comforted by the fact that even if you had inadvertently missed the global maximum at θ̂_(0) and incorrectly taken θ̂_(1) as your MLE, your inferences and interpretations would be close to the mark. However, in the case depicted in Figure 25.4.d, θ̂_(1) is quite distant on the scale of θ from θ̂_(0), and you would not want to base conclusions on the θ̂_(1) estimate. To get a sense of whether the highest local peaks in your log likelihood function are proximal or distal solutions in the parameter space, obtain the actual parameter estimates for the best log likelihood value across all the sets of random starting values and make a descriptive comparison to the parameter estimates corresponding to the "second-best" log likelihood value. (For more about comparing local maximum log likelihood solutions to determine model stability, see, for example, Hipp & Bauer, 2006.)

I pause here to make the reader aware of a nagging clerical issue that must be tended to whenever different maximum likelihood solutions for mixture models are being compared, whether for models with the same or differing numbers of classes: label switching (Chung, Loken, & Schafer, 2004). The ordering of the latent classes as they are outputted by an estimation algorithm is completely arbitrary; for example, "Class 1" for starting values Set 1 may correspond to "Class 3" for starting values Set 2. Even solutions identical in maximum likelihood values can have class labels switched. This phenomenon is not a problem statistically speaking; it merely poses a bookkeeping challenge. So be cognizant of label switching whenever you are comparing mixture model solutions.

Figures 25.4.e and 25.4.f depict log likelihood functions that would be likely to result in either some or all of the random sets of starting values failing to converge; that is, the estimation algorithm stops because the maximum number of iterations is exceeded before a peak is reached. In Figure 25.4.e, the log likelihood function is unbounded at the boundary of the range of θ (which is not an uncommon feature for the LL function of mixture models with more complex within-class variance-covariance structures) but also has a maximum in the interior of the range of θ. θ̂_(0) represents the proper maximum likelihood solution, and that solution should replicate for the majority of random sets of starting values; however, some in the army of expert mountain climbers are likely to find themselves climbing the endless peak, running out of supplies and stopping before convergence is achieved. The log likelihood function in Figure 25.4.f corresponds to an unidentified model. The highest portion of the log likelihood function is flat and there are no singular peaks or unique solutions. No matter where the estimation algorithm starts, it is unlikely to converge. If it does converge, then that solution is unlikely to replicate because it will be a false optimum.

A model that is weakly identified or empirically underidentified is a model that, although theoretically identified, has a shape with particular sample data that is nearly flat and/or has many, many local maxima of approximately the same height (think: egg-crate) such that the estimation algorithm fails to converge for all or a considerable number of random sets of starting values. For a model to be identified, there must be enough "known" or observed information in the data to estimate the parameters that are not known. Ensuring positive degrees of freedom for the model is a necessary but not sufficient criterion for model identification. As the ratio of "known" to "unknown" decreases, the model can become weakly identified. One quantification of this ratio of information for MLE is known as the condition number. It is computed as the ratio of the smallest to largest eigenvalue of the information matrix estimate based on the maximum likelihood solution. A low condition number, less than 10⁻⁶, may indicate singularity (or near singularity) of the information matrix and, hence, model non-identification (or empirical underidentification) (Muthén & Muthén, 1998-2011). A final indication that you may be "spreading" your data "too thin" is class collapsing, which can occur when you are attempting to extract more latent classes than your data will support. This

564 LATENT CLASS ANALYSIS AND FINITE MIXTURE MODELING collapsing usually presents as one or more estimated failure to consistendy converge can indicate weak class proportions nearing zero but can also emerge identification); as a nearly complete lack of separation between two 2. the number and proportion of replicated or more of the latent classes. maximum likelihood values for each local and the Su·ategies to achieve identification all involve apparent global solution (as a high frequency of reducing the complexity of the model to increase the replication of the apparent global solution across ratio of "known" to "unknown" information. The the sets of random starting values increases number of latent classes could be reduced. Alter­ confidence that the "best" solution found is the natively, the response categories for one or more of true maximum likelihood solution); the indicator variables in the measurement model 3. the condition number for the best model (as a could be collapsed. For response categories with low small condition number can indicate weak or frequencies, this category aggregation will remove nonidemification}; and very litde information abour population heterogene­ 4. the smallest estimated class proportion and ity while reducing the number of class-specific item estimated class size among all the latent classes parameters that must be estimated. Additionally, estimated in the model (as a class proportion near one or more items might be combined or elimi­ zero can be a sign of class collapsing and class nated from the model. This item-dropping must be done with extreme care, making sure that removal overextraction). or aggregation does not negatively impact the model This information, when examined collectively, estimation (Collins & Lanza, 2010). 
Conventional will assist in tagging models that are nonidentified univariate, bivariate, and multivariate data screening or not well identified and whose maximum likeli­ procedures should result in careful data recodingand hoods solutions, if obtained, are not likely to be reconfiguration that will protect against the most stable or trustworthy. Any nor well-identified model obvious threats to empirical identification. should be discarded from further consideration or In summary, MLE for mixture models can mindfully modified in such a way that the empirical present statistical and numeric challenges that must issues surrounding the estimation for that particular be addressed during the application of mixture model are resolved without compromising the the­ modeling. Withour a closed-form solution for the oretical integrity and substantive foundations of the maximization of the log likelihood function, an analytic model. iterative estimation algorithm-typically the EM algorithm-is used. It is usually not possible to prove Model Building that the model specified is theoretically identified, Placing LCA in a broader latent variable model­ and, even ifitwas, rhere could still be issues related to ing framework conveniently provides a ready-made weak identification or empiricalunderidentificarion general sequence to follow with respect to the model­ that causes problems with convergence in estima­ building process. The first step is always to establish tion. Furthermore, since the log likelihood surface the measurement model for each of the latent vari­ for mixtures is often mulcimodal, if the estimation ables that appear in the structural equations. For algorithm does converge on a solution, there is no a traditional LCA, this step corresponds to estab­ way to know for sure that the point of convergence lishing the measurement model for the latent class is at a global rather than local maximum. To address variable. 
these challenges, it is recommended the following Arguably, the most fundamental and critical fea­ strategy be utilized during mixture model estima­ ture of the measurement model for a latent class tion. First and foremost, use multiple random sets variable is the number of latent classes. Thus far, in of starting values with the estimation algorithm (it my discussion ofmodel specification, interpretation, is recommended that a minimum of 50-100 sets of and estimation, the number oflatent classes, K, has extensively, randomly varied starting values be used been treated as if it were a known quantity. How­ (Hipp & Bauer, 2006}, but more may be necessary ever, in most all applications of LCA, the number to observe satisfactory replication of the best max­ of classes is not known. Even in direct applications, imum log likelihood value} and keep track of the when one assumes a priori that the population is information below: heterogeneous, you rarely have specific hypotheses 1. the number and proportion of sets of random regarding the exact number or nature of the subpop­ starting values that converge to proper solution (as ulations. You may have certain hunches about one or more subpopulations you expect to find, but rarely those five binary indicators could be represented in are these ideas so well formed that they translate the following format: into an exact total number of classes and constraints on the class-specific parameters that would inform Ul U2 1l3 U4 us fr a measurement model specification similar to the 1 1 1 1 1 sort associated with CFA. And in indirect appli­ fi cations, as you are only interested in making sure 1 1 1 1 0 fi you use enough mixture components (classes) to 1 1 0 1 adequately describe the overall population distri­ 13 bution of the indicator variables, there is no pre­ formed notion of class number. Thus, in either case {direct or indirect), you must begin with the model 0 0 0 0 0 /32 building with an exploratory class enumeration step. 
where fr is the number of individuals in the sample with response pattern r, corresponding to the specific responses to the u's displayed in row r of the table, and Σ fr = n. Thus, when evaluating absolute fit for a latent class measurement model, comparing the model representation of the data to the actual data will mean comparing the model-estimated frequencies to the observed frequencies across all the response patterns.

Deciding on the number of classes is often the most arduous phase of the mixture modeling process. It is labor intensive because it requires consideration (and, therefore, estimation) of a set of models with varying numbers of classes, and it is complicated in that the selection of a "final" model from the set of models under consideration requires the examination of a host of fit indices along with substantive scrutiny and practical reflection, as there is no single method for comparing models with differing numbers of latent classes that is widely accepted as best (Muthén & Asparouhov, 2006; Nylund, Asparouhov, & Muthén, 2007). This section first reviews the preferred tools available for the statistical evaluation of latent class models and then explains how these tools may be applied in concert with substantive evaluation and the parsimony principle in making the class enumeration determination. The tools are divided into three categories: (1) evaluations of absolute fit; (2) evaluations of relative fit; and (3) evaluations of classification.

Absolute Fit. In evaluating the absolute fit of a model, you are comparing the model's representation of the data to the actual data. The most common test of absolute fit for observed categorical data, and the one preferred in the LCA setting, is the likelihood ratio (LR) chi-square goodness-of-fit test (Agresti, 2002; Collins & Lanza, 2010; McCutcheon, 1987). The test statistic, X²_LR (sometimes denoted by G² or L²), is calculated as follows:

X²_LR = 2 Σ_{r=1}^{R} fr · ln(fr / f̂r),   (9)

where R is the total number of observed data response patterns; fr is the observed frequency count for response pattern r; and f̂r is the model-estimated frequency count for response pattern r.
That is, you are evaluating the overall model-data consistency. Recall that in traditional LCA, the observed data for individual responses on a set of categorical indicator variables can be summarized by a frequency table in which each row represents one of the finite number of possible response patterns and the frequency column contains the number of individuals in the sample manifesting each particular pattern. The entire n × M data matrix can be identically represented by an R × (M + 1) frequency table, where R is the number of total observed response patterns. For example, in the hypothetical LCA example with five binary indicator variables, there would be 2^5 = 32 possible response patterns, with R ≤ 32. Assuming for the moment that R = 32, all the observed data on the five binary indicators can be represented in such a frequency table.

Under the null hypothesis that the data are governed by the assumed distribution of the specified model, the test statistic given in Equation 9 is distributed chi-square with degrees of freedom given by

df = R − d − 1,   (10)

where d is the number of parameters estimated in the model. When the model fits the sample data perfectly (i.e., fr = f̂r, ∀r), the test statistic, X²_LR, is equal to zero and the p-value is equal to 1. Failure to reject the null hypothesis implies adequate model-data consistency; rejection of the null implies the model does not adequately fit the data: the larger the test statistic, the larger the discrepancy and the poorer the fit between the model representation and the actual observed data.
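The LR goodness-of-fit test in Equations 9 and 10 might be computed as in the following sketch. The pattern frequencies are invented for illustration; in practice the model-estimated frequencies would come from fitted LCA parameters.

```python
import numpy as np
from scipy.stats import chi2

def lr_goodness_of_fit(f_obs, f_hat, d):
    """LR chi-square goodness-of-fit test for an LCA model (Eqs. 9-10).

    f_obs : observed frequency for each response pattern (length R)
    f_hat : model-estimated frequency for each pattern
    d     : number of free parameters estimated in the model
    """
    f_obs = np.asarray(f_obs, dtype=float)
    f_hat = np.asarray(f_hat, dtype=float)
    # X^2_LR = 2 * sum_r f_r * ln(f_r / fhat_r); empty patterns contribute 0
    mask = f_obs > 0
    x2_lr = 2.0 * np.sum(f_obs[mask] * np.log(f_obs[mask] / f_hat[mask]))
    df = len(f_obs) - d - 1          # df = R - d - 1 (Equation 10)
    p = chi2.sf(x2_lr, df)           # upper-tail p-value
    return x2_lr, df, p

# Invented example: 4 response patterns, a model with d = 2 parameters
x2, df, p = lr_goodness_of_fit([50, 30, 15, 5], [48.0, 33.0, 13.0, 6.0], d=2)
```

When the model-estimated frequencies equal the observed frequencies exactly, the statistic is zero and the p-value is 1, matching the perfect-fit case described above.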

566 LATENT CLASS ANALYSIS AND FINITE MIXTURE MODELING

Although it is very useful to have a way to statistically evaluate the overall goodness of fit of a model to the data, the X²_LR test statistic relies on large-sample theory and may not work as intended (i.e., X²_LR may not be well approximated by a chi-square distribution under the null hypothesis, making the p-values based on that distribution of questionable validity) when the data set is small or the data are sparse, meaning there is a non-negligible number of response patterns with small frequencies (Agresti, 2002). There are some solutions, including parametric bootstrapping and posterior predictive checking, that are available to address this shortcoming (Collins & Lanza, 2010), but they are not widely implemented for this particular goodness-of-fit test in most mixture modeling software and are beyond the scope of this chapter.

Because the number of possible response patterns can become large very quickly with increasing numbers of indicators and/or response categories per indicator, it is common to have an overwhelmingly large number of response patterns, many with observed and expected frequencies that are very small, that is, approaching or equal to zero. However, there is usually a much smaller subset of response patterns with relatively high frequencies, and it can be helpful to focus your attention on the residuals of these patterns, where the bulk of the data reside (Muthén & Asparouhov, 2006). In addition to examining the particular response patterns with large standardized residuals, it is also relevant to examine the overall proportion of response patterns with large standardized residuals.
For a well-fitting model, one would still expect, by chance, some small percentage of the response patterns to have significant residual values, so you would likely take only proportions in notable excess of, say, 1% to 5% as an indication of a poor-fitting model.

Chi-square goodness-of-fit tests, in general, are also known to be sensitive to what would be considered negligible or inconsequential misfit in very large samples. In these cases, the null hypothesis may be rejected and the model determined to be statistically inadequate but, upon closer practical inspection, may be judged to have a "close enough" fit. In factor analysis models, there is a wide array of closeness-of-fit indices to reference in addition to the exact-fit chi-square test, but this is not the case for mixture models. However, you can still inspect the closeness of fit for latent class models by examining the standardized residuals. Unlike residual diagnostics in the regression model, which compare each individual's predicted outcome to the observed values, or residual diagnostics in factor analysis, which compare the model-estimated means, variances, covariances, and correlations to the observed values, the LCA residuals are constructed using the same information that goes into the overall goodness-of-fit test statistic: the model-estimated response pattern frequencies and the observed frequencies.

Relative Fit. In evaluating the relative fit of a model, you are comparing the model's representation of the data to another model's representation. Evaluations of relative fit do not tell you anything about absolute fit, so keep in mind that even if one model fits the data far better than another, both could be poor in overall goodness of fit. There are two categories of relative fit comparisons: (1) inferential and (2) information-heuristic. The most common ML-based inferential comparison is the likelihood ratio test (LRT) for nested models.
The raw residual for each response pattern is simply the difference between the observed and model-estimated frequency, res_r = fr − f̂r, and the standardized residual is calculated by

stdres_r = (fr − f̂r) / √[ f̂r (1 − f̂r / n) ].   (11)

The values of the standardized residuals can be compared to a standard normal distribution (Haberman, 1973), with large values (e.g., |stdres_r| > 3) indicating response patterns that are more poorly fit, contributing the most to X²_LR and the rejection of the model.

For a Model 0 (null model) to be nested within a Model 1 (alternative model), Model 0 must be a "special case" of Model 1; that is, Model 0 is Model 1 with certain parameter restrictions in place. The likelihood ratio test statistic (LRTS) is computed as

X²_diff = −2(LL_0 − LL_1),   (12)

where LL_0 and LL_1 are the maximized log likelihood values to which the EM algorithm converges during model estimation for Model 0 and Model 1, respectively. Under the null hypothesis that there is no difference between the two models (i.e., that the parameter restrictions placed on Model 1 to obtain Model 0 match the true population model) and with certain regularity conditions in place (e.g., the parameter restrictions do not fall on the boundary of the parameter space), X²_diff has a chi-square distribution with degrees of freedom given by

df = d_1 − d_0,   (13)

where d_1 and d_0 are the numbers of parameters estimated in Model 1 and Model 0, respectively. Failure to reject the null hypothesis implies that there is not a statistically significant difference in fit to the data between Model 0 and Model 1.
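Returning to the absolute-fit diagnostics, the standardized residuals of Equation 11 might be computed as follows (a sketch; the frequencies are invented for illustration):

```python
import numpy as np

def standardized_residuals(f_obs, f_hat):
    """Standardized response-pattern residuals (Equation 11):
    stdres_r = (f_r - fhat_r) / sqrt(fhat_r * (1 - fhat_r / n))."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_hat = np.asarray(f_hat, dtype=float)
    n = f_obs.sum()                      # total sample size
    return (f_obs - f_hat) / np.sqrt(f_hat * (1 - f_hat / n))

# Invented example: flag any pattern with |stdres| > 3
stdres = standardized_residuals([50, 30, 15, 5], [40.0, 38.0, 14.0, 8.0])
flagged = np.abs(stdres) > 3
```

The proportion of flagged patterns, `flagged.mean()`, is the quantity compared against the 1% to 5% chance level discussed above.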
Thus, Model 0 would be favored over Model 1, since it is a simpler model with comparable fit. Rejection of the null hypothesis would imply that the parameter restrictions placed on Model 1 to obtain Model 0 resulted in a statistically significant decrement of fit. In general, this result would lead you to favor Model 1, unless the absolute fit of Model 0 was already deemed adequate. Following the principle of parsimony, if Model 0 had adequate absolute fit, then it would likely be favored over any more complicated and parameter-laden model, even if the more complicated model fit significantly better, relatively speaking.

There are two primary limitations of the likelihood ratio test comparison of relative model fit: (1) it allows the comparison of only two models at a time, and (2) those two models must be nested under certain regularity conditions. Information-heuristic tools overcome those two limitations by allowing the comparison of relative fit across a set of models that may or may not be nested. Three of the most common information criteria used in mixture modeling are the following:

• Bayesian Information Criterion (BIC; Schwarz, 1978):

BIC = −2LL + d log(n),   (15)

where d is the number of parameters estimated in the model and n is the number of subjects, or cases, in the analysis sample.

• Consistent Akaike Information Criterion (CAIC; Bozdogan, 1987):

CAIC = −2LL + d[log(n) + 1].   (16)

• Approximate Weight of Evidence Criterion (AWE; Banfield & Raftery, 1993):

AWE = −2LL + 2d[log(n) + 1.5].   (17)

Although the information-heuristic descriptive comparisons of models are usually ordinal in nature, there are a few descriptive quantifications of relative fit based on information criteria that, although still noninferential, do allow you to get a sense of "how much" better one model is relative to another model or relative to a whole set of models. The two quantifications presented here are based on rough approximations to comparisons available in a Bayesian estimation framework and have been popularized by Nagin (1999) in the latent class growth modeling literature.
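The three criteria in Equations 15 through 17 can be sketched in a few lines (the log likelihood, parameter count, and sample size below are invented for illustration):

```python
import numpy as np

def information_criteria(LL, d, n):
    """BIC (Eq. 15), CAIC (Eq. 16), and AWE (Eq. 17) for one fitted model.
    LL = maximized log likelihood, d = free parameters, n = sample size."""
    return {
        "BIC": -2 * LL + d * np.log(n),
        "CAIC": -2 * LL + d * (np.log(n) + 1),
        "AWE": -2 * LL + 2 * d * (np.log(n) + 1.5),
    }

# Invented example: a hypothetical 2-class model fit to n = 500 cases
ic = information_criteria(LL=-1200.0, d=11, n=500)
```

Because the three penalties grow at different rates in d and n, the criteria can disagree about which candidate model is "best," which is why the chapter recommends consulting all of them.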
The downside is that the comparisons are descriptive; that is, you can use these tools to say that one model is "better" than another according to a particular criterion, but you cannot test, in a statistical sense (as you can with X²_diff), how much better. Most information-heuristic comparisons of relative fit are based on information criteria that weigh the fit of the model (as captured by the maximized log likelihood value) against the model complexity. These criteria recognize that although one can always improve the fit of a model by adding parameters, there is a cost for that improvement in fit to model parsimony. They can be expressed in the form

−2LL + penalty,   (14)

where LL is the maximized log likelihood function value to which the EM algorithm converges during the model estimation. The penalty term is some measure of the complexity of the model involving the sample size and the number of parameters being estimated in the model; the criteria differ only in the computation of the penalty term.

The first quantification, the approximate Bayes factor (BF), is a pairwise comparison of relative fit between two models, Model A and Model B. It is calculated as

BF_A,B = exp[SIC_A − SIC_B],   (18)

where SIC is the Schwarz Information Criterion (Schwarz, 1978), given by

SIC = −0.5 · BIC.   (19)

BF_A,B represents the ratio of the probability of Model A being the correct model to the probability of Model B being the correct model when Models A and B are considered the competing models. According to Jeffreys' Scale of Evidence (Wasserman, 1997), 1 < BF_A,B < 3 is weak evidence for Model A, 3 < BF_A,B < 10 is moderate evidence for Model A, and BF_A,B > 10 is considered strong evidence for Model A.
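Equations 18 and 19 might be computed as follows (the BIC values are invented for illustration):

```python
import numpy as np

def approx_bayes_factor(bic_a, bic_b):
    """Approximate Bayes factor BF_{A,B} = exp(SIC_A - SIC_B) (Eq. 18),
    with SIC = -0.5 * BIC (Eq. 19)."""
    return float(np.exp(-0.5 * bic_a + 0.5 * bic_b))

# Invented example: Model A (BIC = 2468.4) vs. Model B (BIC = 2471.2)
bf_ab = approx_bayes_factor(2468.4, 2471.2)   # > 3: moderate evidence for A
```

Note that the model with the smaller BIC has the larger SIC, so a BF above 1 favors Model A, consistent with Jeffreys' scale as described above.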
For model comparisons, a particular information criterion value is computed for each of the models under consideration, and the model with the minimum value for that criterion is judged the (relative) best among that set of models. Schwarz (1978) and Kass and Wasserman (1995) showed that BF_A,B as defined in Equation 18 is a reasonable approximation of the Bayes factor when equal weight is placed on the prior probabilities of Models A and B (Nagin, 1999).

The second quantification, the approximate correct model probability (cmP), allows relative comparisons of each of the J models under consideration to the entire set of J models.

There is a cmP value for each of the models; for Model A (A = 1, ..., J), it is computed as

cmP_A = exp(SIC_A − SIC_max) / [ Σ_{j=1}^{J} exp(SIC_j − SIC_max) ],   (20)

where SIC_max is the maximum SIC score of the J models under consideration. In comparison to BF_A,B, which compares only two models, the cmP is a metric for comparing a set of more than two models. The sum of the cmP values across the set of models under consideration is equal to 1.00; that is, the true model is assumed to be one of the models in the set. Schwarz (1978) and Kass and Wasserman (1995) showed that cmP_A as defined in Equation 20 is a reasonable approximation of the actual probability of Model A being the correct model, relative to the other models under consideration, when equal weight is placed on the prior probabilities of all the models (Nagin, 1999).

Posterior class probabilities are the model-estimated values of each individual's probabilities of being in each of the latent classes, based on the maximum likelihood parameter estimates and the individual's observed responses on the indicator variables. The posterior class probability for individual i corresponding to latent Class k, p̂_ik, is given by

p̂_ik = P̂(c_i = k | u_i; θ̂) = [ P̂(u_i | c_i = k; θ̂) · P̂(c_i = k) ] / P̂(u_i),   (21)

where θ̂ is the set of parameter estimates for the class-specific item response probabilities and the class proportions. Standard post hoc model-based individual classification is done using modal class assignment, such that each individual in the sample is assigned to the latent class for which he or she has the largest posterior class probability. In more formal terms, model-based modally assigned class membership for individual i, ĉ_modal,i, is given by

ĉ_modal,i = k : max(p̂_i1, ..., p̂_iK) = p̂_ik.   (22)
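The quantities in Equations 20 through 22 can be sketched as follows. All function names, BIC values, and parameter estimates are invented for illustration; the posterior computation assumes the conditionally independent binary-indicator LCA described earlier in the chapter.

```python
import numpy as np

def correct_model_probabilities(bics):
    """Approximate correct model probabilities (Eq. 20) from BIC values,
    using SIC = -0.5 * BIC for each model."""
    sic = -0.5 * np.asarray(bics, dtype=float)
    w = np.exp(sic - sic.max())            # subtract SIC_max for stability
    return w / w.sum()

def posterior_class_probs(pi, theta, u):
    """Posterior class probabilities (Eq. 21) and modal class (Eq. 22)
    for one vector u of binary indicator responses.

    pi    : estimated class proportions (length K)
    theta : estimated item-response probabilities, shape (K, M)
    """
    u = np.asarray(u, dtype=float)
    # Pr(u | c = k) under conditional independence of the indicators
    like = np.prod(theta**u * (1 - theta)**(1 - u), axis=1)
    joint = np.asarray(pi) * like          # Pr(u | c = k) * Pr(c = k)
    post = joint / joint.sum()             # divide by Pr(u)
    return post, int(np.argmax(post)) + 1  # classes labeled 1..K

# Invented two-class example with three binary indicators
pi = np.array([0.6, 0.4])
theta = np.array([[0.9, 0.8, 0.7],
                  [0.2, 0.3, 0.1]])
post, modal = posterior_class_probs(pi, theta, [1, 1, 1])
cmps = correct_model_probabilities([2468.4, 2471.2, 2480.0])
```

Here a response vector of all 1s is far more consistent with the high-endorsement class, so its posterior probability for Class 1 is close to 1 and the modal assignment is Class 1.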
The ratio of cmP_A to cmP_B, when the set of models under consideration is limited to only Models A and B, reduces to BF_A,B.

Classification Diagnostics. Evaluating the precision of the latent class assignment for individuals by a candidate model is another way of assessing the degree of class separation and is most useful in direct applications, wherein one of the primary objectives is to extract from the full sample empirically well-separated, highly differentiated groups whose members have a high degree of homogeneity in their responses on the class indicators. Indeed, if there is a plan to conduct latent class assignment for use in a subsequent analysis, that is, in a multistage classify-analyze approach, then within-class homogeneity and across-class separation and differentiation are of primary importance for assessing the quality of the model (Collins & Lanza, 2010). Quality of classification could, however, be completely irrelevant for indirect applications.

Table 25.2 provides examples of four individuals' sets of estimated posterior class probabilities and the corresponding modal class assignments for the hypothetical three-class LCA example. Although individuals 1 and 2 are both modally assigned to Class 1, individual 1 has a very high estimated posterior class probability for Class 1, whereas individual 2 is not well classified. If there were many cases like individual 2, then the overall classification accuracy would be low, as the model would do almost no better than random guessing at predicting latent class membership. If there were many cases like individual 1, then the overall classification accuracy would be high. The first classification diagnostic, relative entropy, offers a systematic summary of the levels of posterior class probabilities across classes and individuals in the sample.

Relative entropy, E_K, is an index that summarizes the overall precision of classification for the whole sample across all the latent classes (Ramaswamy, DeSarbo, Reibstein, & Robinson, 1993).
It is computed by

E_K = 1 − [ Σ_{i=1}^{n} Σ_{k=1}^{K} (−p̂_ik · ln p̂_ik) ] / [ n · ln(K) ].   (23)

E_K measures the posterior classification uncertainty for a K-class model and is bounded between 0 and 1; E_K = 0 when posterior classification is no better than random guessing, and E_K = 1 when there is perfect posterior classification for all individuals in the sample.

Further, it is important to keep in mind that it is possible for a mixture model to have a good fit to the data but still have poor latent class assignment accuracy. In other words, model classification diagnostics can be used to evaluate the utility of the latent class analysis as a model-based clustering tool for a given set of indicators observed on a particular sample, but they should not be used to evaluate model-data consistency in either absolute or relative terms. All of the classification diagnostics presented here are based on estimated posterior class probabilities.

Table 25.2. Hypothetical Example: Estimated Posterior Class Probabilities and Modal Class Assignment Based on a Three-Class Unconditional Latent Class Analysis for Four Sample Participants

i    p̂_i1    p̂_i2    p̂_i3    ĉ_modal,i
1    0.95    0.05    0.00    1
2    0.40    0.30    0.30    1
3    0.20    0.70    0.10    2
4    0.00    0.00    1.00    3

Nagin (2005) suggests a rule of thumb that all average posterior class probability (AvePP) values be above 0.70 (i.e., AvePP_k > .70, ∀k) to consider the classes well separated and the latent class assignment accuracy adequate. The odds of correct classification ratio (OCC; Nagin, 2005) is based on AvePP_k and provides a similar class-specific summary of classification accuracy. The odds of correct classification ratio for Class k, OCC_k, is given by

OCC_k = [ AvePP_k / (1 − AvePP_k) ] / [ π̂_k / (1 − π̂_k) ],   (25)

where π̂_k is the model-estimated proportion for Class k. The denominator is the odds of correct classification based on random assignment using the model-estimated marginal class proportions, π̂_k. The numerator is the odds of correct classification based on the maximum posterior class probability assignment rule (i.e., modal class assignment). When the modal class assignment for Class k is no better than chance, then OCC_k = 1.00. As AvePP_k gets close to the ideal value of 1.00, OCC_k gets larger. Thus, large values of OCC_k (i.e., values of 5.00 or larger; Nagin, 2005) for all K classes indicate a latent class model with good latent class separation and high assignment accuracy.

Perfect posterior classification would mean max(p̂_i1, ..., p̂_iK) = 1.00, ∀i. Because even when E_K is close to 1.00 there can be a high degree of latent class assignment error for particular individuals, and because posterior classification uncertainty may increase simply by chance for models with more latent classes, E_K was never intended for, nor should it be used for, model selection during the class enumeration process. However, E_K values near 0 may indicate that the latent classes are not sufficiently well separated for the K classes that have been estimated (Ramaswamy et al., 1993). Thus, E_K may be used to identify problematic overextraction of latent classes and may also be used to judge the utility of the LCA, directly applied to a particular set of indicators, for producing highly differentiated groups in the sample.

The final classification diagnostic presented here is the modal class assignment proportion (mcaP). This diagnostic is also a class-specific index of classification certainty. The modal class assignment proportion for Class k, mcaP_k, is given by

mcaP_k = [ Σ_{i=1}^{n} I(ĉ_modal,i = k) ] / n.   (26)

The next classification diagnostic, the average posterior class probability (AvePP), enables evaluation of the specific classification uncertainty for each of the latent classes.
The AvePP for Class k, AvePPk> Pur simply, mcaPk is the proportion of individ­ is given by uals in the sample modally assigned to Class k. If individuals were assigned to Class k with perfect certainty, then mcaPk = irk. Larger discrepan­ AvePPk = Mean{hk• Vi: Cmod.-tl,i = k}. (24) cies between mcaPk and irk are indicative of larger latent class assignment errors. To gage the discrep­ That is, AvePPk is the mean of the Class k pos­ ancy, each mcaPk can be compared to the to 95% terior class probabilities across all individuals whose confidence interval for the corresponding irk. maximum posterior class probability is for Class k. Class Enumeration. Now that you have a full set In contrast to EK which provides an overall sum­ of tools for evaluating models in terms of absolute mary of latent class assignment error, the set of fit; relative fir, and classification accuracy, I can dis­ AvePPk quantities provide class-specific measures of cuss how to apply them to the first critical step how well the set of indicators predict class mem­ in the latent class modeling process: deciding on bership in the sample. Similarly to EK, AvePPk is the number of latent classes. This process usually bounded between 0 and 1; AvePPk = 1 when the begins by specifYing a one-class LCA model and Class k posteriori probability for every individual then fitting additional models, incrementing the in the sample modally assigned to Class k is equal number of classes by one, unril the models are no to 1. Nagin (2005) suggests a rule-of-thumb that longer well identified (as defined in the subsection

The fit of each of the models is evaluated in absolute and relative terms. The parsimony principle is also applied, such that the model with the fewest number of classes that is statistically and substantively adequate and useful is favored.

In terms of the relative fit comparisons, the standard likelihood ratio chi-square difference test presented earlier cannot be used in this setting, because the necessary regularity conditions of the test are violated when comparing a K-class model to a (K − g)-class model (McLachlan & Peel, 2000); in other words, although X²_diff can be calculated, it does not have a chi-square sampling distribution under the null hypothesis. However, two alternatives, currently implemented in mainstream mixture modeling software, are available: (1) the adjusted Lo-Mendell-Rubin likelihood ratio test and (2) the parametric bootstrapped likelihood ratio test, both described below.

1. Fit a one-class model, recording the log likelihood value (LL); the number of parameters estimated (npar); the likelihood ratio chi-square goodness-of-fit statistic (X²_LR, with df and corresponding p-value); and the model BIC, CAIC, and AWE values.

2. Fit a two-class model, recording the same quantities as listed in Step 1, along with: the adjusted LMR-LRT p-value, testing the two-class model against the null one-class model; the BLRT p-value, testing the two-class model against the null one-class model; and the approximate Bayes factor (BF_1,2), estimating the ratio of the probability of the one-class model being the correct model to the probability of the two-class model being the correct model.
The adjusted Lo-Mendell-Rubin likelihood ratio test (adjusted LMR-LRT; Lo, Mendell, & Rubin, 2001) analytically approximates the X²_diff sampling distribution when comparing a K-class to a (K − g)-class finite mixture model for which the classes differ only in the mean structure. The parametric bootstrapped likelihood ratio test (BLRT), recommended by McLachlan and Peel (2000), uses bootstrap samples (generated using parameter estimates from a (K − g)-class model) to empirically derive the sampling distribution of X²_diff under the null model. Both of these tests and their performance across a range of finite mixture models have been explored in detail in the simulation study by Nylund, Asparouhov, and Muthén (2007). As executed in Mplus V6.1 (Muthén & Muthén, 1998-2011), these tests compare a (K − 1)-class model (the null model) with a K-class model (the alternative, less restrictive model), and a statistically significant p-value suggests that the K-class model fits the data significantly better than a model with one less class.

3. Repeat the following for K ≥ 3, increasing K by 1 at each repetition until the K-class model is no longer well identified: fit a K-class model, recording the same quantities as listed in Step 1, along with the adjusted LMR-LRT p-value, testing the K-class model against the null (K − 1)-class model; the BLRT p-value, testing the K-class model against the null (K − 1)-class model; and the approximate Bayes factor (BF_{K−1,K}), estimating the ratio of the probability of the (K − 1)-class model being the correct model to the probability of the K-class model being the correct model.

4. Let Kmax be the largest number of classes that could be extracted in a single model from Step 3. Compute the approximate correct model probability (cmP) across the one-class through Kmax-class models fit in Steps 1-3.

5. From the Kmax models fit in Steps 1 through 3, select a smaller subset of two to three candidate models based on the absolute and relative fit
indices, using the guidelines (a) through (e) that follow. I assume here, since it is almost always the case in practice, that there will be more than one "best" model identified across the different indices. Typically, the candidate models are adjacent to each other with respect to the number of classes (e.g., three-class and four-class candidate models).

As mentioned before, there is no single method for comparing models with differing numbers of latent classes that is widely accepted as best (Muthén & Asparouhov, 2006; Nylund et al., 2007). However, by careful and systematic consideration of a set of plausible models, and by utilizing a combination of statistical and substantive model checking (Muthén, 2003), researchers can improve their confidence in the tenability of their decision regarding the number of latent classes. The recommended sequence given here is illustrated in detail with the empirical example that follows after the next subsection.

a. For absolute fit, the "best" model should be the model with the fewest number of classes that has adequate overall goodness of fit; that is, the most parsimonious model that is not rejected by the exact-fit test.

MASYN 571

b. For the BIC, CAIC, and AWE, the "best" model is the model with the smallest value. However, because none of the information criteria are guaranteed to arrive at a single lowest value corresponding to a K-class model with K < Kmax, these indices may have their smallest values at the Kmax-class model. In such cases, you can explore the diminishing gains in model fit according to these indices with the use of "elbow" plots, similar to the use of scree plots of eigenvalues in exploratory factor analysis (EFA). For example, if you graph the BIC values versus the number of classes, the addition of the second and third classes may add much more information, but as the number of classes increases, the marginal gain may drop, resulting in a (hopefully) pronounced angle in the plot. The number of classes at this point meets the "elbow criterion" for that index.

Weigh the simplicity and clarity of each of the candidate models (Bergman & Trost, 2006) and evaluate the utility of the additional classes for the less parsimonious of the candidate models. Compare the modal class assignments of individuals across the candidate models. Don't forget about label switching when you are making your model comparisons. And, beyond label switching, remember that if you estimate a K-class model and then a (K + 1)-class model, there is no guarantee that any of the K classes from the K-class model match up in substance or in label to any of the classes in the (K + 1)-class model.

7. On the basis of all the comparisons made in Steps 5 and 6, select the final model in the class enumeration process.

c.
For the adjusted LMR-LRT and BLRT, the "best" model is the model with the smallest number of classes that is not significantly improved by the addition of another class; that is, the most parsimonious K-class model that is not rejected in favor of a (K + 1)-class model. Note that the adjusted LMR-LRT and BLRT may never yield a nonsignificant p-value favoring a K-class model over a (K + 1)-class model before the number of classes reaches Kmax. In these cases, you can examine a plot of the log likelihood values for an "elbow," as explained in Substep b.

d. For the approximate BF, the "best" model is the model with the smallest number of classes for which there is moderate to strong evidence compared to the next-largest model; that is, the most parsimonious K-class model with BF > 3 when compared to a (K + 1)-class model.

e. For the approximate correct model probabilities, the "best" model is the model with the highest probability of being correct. Any model with cmP_K > .10 could be considered a candidate model.

If you have the good fortune of a very large sample, then the class enumeration process can be expanded and strengthened using a split-sample cross-validation procedure. In evaluating the "largeness" of your sample, keep in mind that sample size plays a critical role in the detection of what may be less prevalent classes in the population and in the selection between competing models with differing class structures (Lubke, 2010), and you don't want to split your sample for cross-validation if such a split compromises the quality and validity of the analyses within each of the subsamples because they are not of adequate size. For a split-sample cross-validation approach:

i. Randomly partition the full sample into two (approximately) equally sized subsamples: Subsample A (the "calibration" data set) and Subsample B (the "validation" data set).

ii. Conduct latent class enumeration Steps 1-7 on Subsample A.

iii. Retain all the model parameter estimates
from the final K-class model selected in Step 7.

6. Examine the standardized residuals and the classification diagnostics (if germane for your application of mixture modeling) for the subset of candidate models selected in Step 5. Render an interpretation of each latent class in each of the candidate models and consider the collective substantive meaning of the resultant classes for each of the models. Ask yourself whether the resultant latent classes of one model help you to understand the phenomenon of interest (Magnusson, 1998) better than those of another.

iv. Fit the K-class model to Subsample B, fixing all parameters to the estimated values retained in Step iii.

v. Evaluate the overall fit of the model. If the parameter estimates obtained from the K-class model fit to Subsample A provide an acceptable fit when used as fixed parameter values for a K-class model applied to Subsample B, then the model validates well and the selection of the

K-class model is supported (Collins, Graham, Long, & Hansen, 1994).

vi. Next, fit a K-class model to Subsample B, allowing all parameters to be freely estimated.

vii. Using a nested-model likelihood ratio test, compare the fit of the K-class model applied to Subsample B using fixed parameter values based on the estimates from the Subsample A K-class model estimation to the fit of the K-class model applied to Subsample B with freely estimated parameters. If there is not a significant decrement in fit for the Subsample B K-class model when fixing parameter values to the Subsample A K-class model parameter estimates, then the model validates well, the nature and distribution of the K latent classes can be considered stable across the two subsamples, and the selection of the K-class model is supported.

There are variations on this cross-validation process that can be made. One variation is to carry out Steps iii through vii for all of the candidate models selected in Step 5, rather than just the final model selected in Step 7, and then integrate into Step 6 the additional information regarding which of the candidate models validated in Subsample B according to the results from both Steps v and vii. Another variation is to do a double (or twofold) cross-validation (Collins et al., 1994; Cudeck & Browne, 1983), whereby Steps ii through vii are applied using Subsample A as the calibration data set and Subsample B as the validation data set and then are repeated using Subsample B as the calibration data set and Subsample A as the validation data set. Ideally, the same "best" model will emerge in both cross-validation iterations, although it is not guaranteed (Collins & Lanza, 2010). I illustrate the double cross-validation procedure in the empirical example that follows after the next subsection.

Missing Data

Because most mixture modeling software already utilizes a maximum likelihood estimation algorithm designed for ignorable missing data (primarily the EM algorithm), it is possible to accommodate missingness on the manifest indicators as well, as long as the missing data are either missing completely at random (MCAR) or missing at random (MAR). Assuming the data on the indicator variables are missing at random means that the probability of a missing response for an individual on a given indicator is unrelated to the response that would have been observed, conditional on the individual's actual observed data for the other response variables. Estimation with the EM algorithm is a full information maximum likelihood (FIML) method in which individuals with complete data and partially complete data all contribute to the observed data likelihood function. The details of missing data analysis, including the multiple imputation alternative to FIML, are beyond the scope of this chapter. Interested readers are referred to Little and Rubin (2002), Schafer (1997), and Enders (2010) for more information.

Of all the evaluations of model fit presented prior, the only one that is different in the presence of missing data is the likelihood ratio goodness-of-fit test. With partially complete data, the number of observed response patterns is increased to include observed response patterns with missingness. Returning to the five binary indicator hypothetical example, you might have some of the following incomplete response patterns:

u1  u2  u3  u4  u5   fr*
 1   1   1   1   1   f1
 .   .   .   .   .   .
 1   0   0   0   0   f(R*-2)
 0   0   0   0   0   f(R*-1)
 •   0   0   0   0   fR*

where "•" indicates a missing response and R* is the number of observed response patterns, including partially complete response patterns. The LR chi-square goodness-of-fit test statistic is now calculated as

X²_LR = 2 Σ_{r*=1}^{R*} f_{r*} · ln(f_{r*} / f̂_{r*}),   (27)

where f_{r*} is the observed frequency count for response pattern r* and f̂_{r*} is the model-estimated frequency count for response pattern r*. The degrees of freedom for the test is given by

df = R* − d − 1.   (28)

This test statistic, because it includes contributions from both complete and partially complete response patterns using model-estimated frequencies from a model estimated under the MAR

MASYN 573 assumption, is actually a test of both the exact For this example, nine items were selected from fit and the degree to which the data depart from the eighth grade {Fall, 1998) student survey related MCAR against the MAR alternative (Collins & to math attitudes for use as observed response indi­ Lanza, 2010; Little & Rubin, 2002). Thus, the cators for an unconditional latent class variable that X'fR with missing data is inflated version of a sim­ was intended to represent profiles of latent math ple test of only model goodness-of-fit. However, the dispositions. The nine self-report items were mea­ XfR is easily adjusted by subtracting the contribu­ sured on a five-point, Likert-type scale (1 =strongly tion to the chi-square from the MCAR component, agree; 2 = agree; 3 = not sure; 4 = disagree; 5 = and this adjusted X'fR can then be compared to the strongly disagree). For the analysis, I dichotomized reference chi-square distribution (Collins & Lanza, the items to a 0/1 scale after reverse coding certain 2010; Shafer, 1997). Note that the standardized items so that all item endorsements (indicated by a residuals for partially complete response patterns value of 1) represented pro-mathematics responses. are similarly inflated, and this should be considered Table 25.3 presents the original language of the sur­ when examining residuals for specific complete and vey prompt for the set of math attitude items along partially complete response patterns. with the full text of the each item statement and the The next subsection should be the most illu­ response categories from the original scale that were minating of all the subsections under Latent Class recoded as a pro-math item endorsements. 
In exam­ Analysis, as it is here that I fully illustrate the uncon­ ining the items, I determined that the items could ditional LCA modeling process with a real data be tentatively grouped into three separate aspects of exan1ple, show the use of all the fit indices, clas­ math disposition: items 1-3 are indicators of posi­ sification diagnostics, the double cross-validation tive math affect and efficacy; items 4-5 are indicators procedures, and demonstrate the graphical presen­ of math anxiety; and items 6-9 are indicators of tation and substamive imerpretation of a selected the student assessment of the utility of mathematics model. knowledge. I anticipated that this conceptual three­ part formation of the items might assist in the inter­ pretation of the resultant latent classes from the LCA Longitudinal Study ofAmerican Youth modeling. Example for Latent Class Analysis Table 25.3 also displays the frequencies and rela­ The data used for the LCA example come from tive frequencies of pro-math item endorsements for Cohort 2 of the Longitudinal Study of Ameri­ the full analysis sample of n = 2, 675 (excluding can Youth (LSAY), a national longitudinal study, 441 of the total participant sample who had missing funded by the National Science Foundation (NSF) responses on all nine of selected items). Note that {Miller, Kimmel, Hoffer, & Nelson, 2000). The all nine items have a reasonable degree of variabil­ LSAY was designed to investigate the development ity in responses and therefore contain information of student learning and achievement-particularly about individual differences in math dispositions. If related to mathematics, science, and technology­ there were items with relative frequencies very near and to examine the relationship of those student 0 or 1, there would be very little information about ourcomes across middle and high school to post­ individual differences to inform the formation of the secondary educational and early career choices. 
The latent classes. 9 students of Cohort 2 were first measured in the fall of With nine binary response items, there are 2 = 1988 when they were in eighth grade. Study partici­ 512 possible response pattern, but only 362 of pants were recruited through their schools, which those were observed in the sample data. Of the were selected from a probability sample of U.S. total sample, 2,464 participants (92%) have com­ public school districts {Kimmel & Miller, 2008). plete data on all the items. There are 166 observed .For simplicity's sake, I do not incorporate informa­ response patterns in the data with at least one tion related to the complex sampling design or the missing response. Of the total sample, 211 par­ clustering of schools within districts and students ticipants (8%) have missing data on one of more within school for the modeling illustrations in this of the items, with 135 (64%) of those partic­ chapter; however, the analytic framework presented ipants missing on only one item. Upon closer does extend ro accommodate sampling weights and inspection, there is not any single item that stands multilevel data. There were a total of n = 3116 stu­ out with a high frequency of missingness that dents in the original LSAY Cohort 2 (48o/o female; might indicate a systematic skip pattern of respond­ 52% male). ing that would make one reconsider that item's

Table 25.3. LSAY Example: Pro-math Item Endorsement Frequencies (f) and Relative Frequencies (rf) for the Total Sample and the Two Random Cross-Validation Subsamples, A and B

Survey prompt: "Now we would like you to tell us how you feel about math and science. Please indicate how you feel about each of the following statements."

                                                      Pro-math      Total sample     Subsample A      Subsample B
                                                      response      (nT = 2675)      (nA = 1338)      (nB = 1337)
Item statement                                        categories*   f       rf       f       rf       f       rf
1) I enjoy math.                                      sa/a          1784    0.67     894     0.67     890     0.67
2) I am good at math.                                 sa/a          1850    0.69     912     0.68     938     0.70
3) I usually understand what we are doing in math.    sa/a          2020    0.76     1011    0.76     1009    0.76
4) Doing math often makes me nervous or upset.        d/sd
5) I often get scared when I open my math book        d/sd          1821    0.69     917     0.69     904     0.68
   and see a page of problems.
6) Math is useful in everyday problems.               sa/a          1835    0.70     908     0.69     927     0.70
7) Math helps a person think logically.               sa/a          1686    0.64     854     0.65     832     0.63
8) It is important to know math to get a good job.    sa/a          1947    0.74     975     0.74     972     0.74
9) I will use math in many ways as an adult.          sa/a          1858    0.70     932     0.70     926     0.70

*Original rating scale: 1 = strongly agree (sa); 2 = agree (a); 3 = not sure (ns); 4 = disagree (d); 5 = strongly disagree (sd). Recoded to 0/1 with 1 indicating a pro-math response.

inclusion in the analysis. The three most frequent complete data response patterns, with observed frequency counts, are: (1,1,1,1,1,1,1,1,1), f = 502; (1,1,1,0,0,1,1,1,1), f = 111; and (1,1,1,0,1,1,1,1,1), f = 94. More than 70% (258 of 362) of the complete data response patterns have f < 5. The three most frequent incomplete data response patterns, with observed frequency counts, are: (1,1,1,?,1,1,1,1,1), f = 9; (1,1,1,1,1,1,1,?,1), f = 7; and (1,1,1,1,1,1,?,1,1), f = 6 (where "?" indicates a missing value).

Because this is a large sample, it is possible to utilize a double cross-validation procedure for establishing the unconditional latent class model for math dispositions. Beginning with Step i, the sample is randomly split into halves, Subsample A and Subsample B. Table 25.3 provides the item response frequencies and relative frequencies for both subsamples.

The class enumeration process begins by fitting unconditional latent class models with K = 1 to K = 10 classes. After K = 8, the models ceased to be well identified (e.g., there was a high level of nonconvergence across the random sets of starting values; a low level of maximum log likelihood solution replication; a small condition number; and/or the smallest class proportion corresponded to fewer than 20 individuals). For K = 1 to K = 8, the models appeared well identified. For example, Figure 25.5 illustrates a high degree of replication of the "best" maximum likelihood value, -6250.94, for the five-class model, depicting the relative frequencies of the final stage log likelihood values at the local maxima across 1000 random sets of start values.

Figure 25.5. LSAY example: Relative frequency plot of final stage log likelihood values at local maxima across 1000 random sets of start values for the five-class unconditional LCA.

Table 25.4 summarizes the results from class enumeration Steps 1 through 5 for Subsample A. Bolded values indicate the value corresponding to the "best" model according to each fit index, and the boxes indicate the candidate models based on each index (which include the "best" and the "second best" models). For the adjusted LR chi-square test of exact fit, the four-class model is marginally adequate and the five-class model has a high level of model-data consistency. Although the six-, seven-, and eight-class models also have a good fit to the data, the five-class model is the most parsimonious. The BIC has the smallest value for the five-class model, but the six-class BIC value is very close. The same is true for the CAIC. The AWE has the smallest value for the four-class model, with the five-class value a close second. The four-class model is rejected in favor of the five-class model by the adjusted LMR-LRT, but the five-class model is not rejected in favor of the six-class model. All K-class models were rejected in favor of a (K + 1)-class model by the BLRT for all values of K considered, so there was no "best" model, or even candidate models, to be selected based on the BLRT, and those results are not presented in the summary table. According to the approximate BF, there was strong evidence for the five-class model over the four-class model, and there was strong evidence for the five-class model over the six-class model. Finally, based on the approximate correct model probabilities, of the eight models, the five-class model has the highest probability of being correct, followed by the six-class model. Based on all these indices, I select the four-, five-, and six-class models for attempted cross-validation in Subsample B, noting that the five-class model is the one of the three candidate models most favored across all of the indices.

The first three rows of Table 25.5 summarize the cross-validation results for the Subsample A candidate models. For the first row, I took the four-class parameter estimates obtained by fitting a four-class model to Subsample A, used those estimates as fixed parameter values in Subsample B, and evaluated the overall fit of the model, following cross-validation Steps iv through v. The overall fit of the model, as determined by the LR chi-square goodness-of-fit test, was not adequate, and by this criterion the estimated four-class model from Subsample A did not replicate in Subsample B. I next estimated a four-class model in Subsample B, allowing all parameters to be freely estimated, and compared the fit to the model with all the parameters fixed to the estimated values from Subsample A, following cross-validation Steps vi through vii. The likelihood ratio test of


Table 25.4. LSAY Example: Model Fit Indices for Exploratory Latent Class Analysis Using Calibration Subsample A (nA = 1338)

                                                                                       adj. LMR-LRT p-value
Model         LL         npar*   adj. X²_LR (df), p     BIC       CAIC      AWE        (H0: K; H1: K+1)      BF_(K,K+1)   cmP_K
one-class     -7328.10    9      1289.21 (368), <.01    14721.00  14730.00  14812.79   <0.01                 <0.10        <0.01
two-class     -6612.88   19       909.29 (358), <.01    13362.55  13381.55  13556.33   <0.01                 <0.10        <0.01
three-class   -6432.53   29       551.76 (348), <.01    13073.83  13102.83  13369.60   <0.01                 <0.10        <0.01
four-class    -6331.81   39       347.24 (338), .35     12944.38  12983.38  13342.13   <0.01                 <0.10        <0.01
five-class    -6250.94   49       199.81 (328), >.99    12854.63  12903.63  13354.37   0.15                  6.26         0.87
six-class     -6216.81   59       157.25 (318), >.99    12858.35  12917.35  13460.09   0.13                  >10          0.13
seven-class   -6192.32   69       105.70 (308), >.99    12881.37  12950.37  13585.09   0.23                  >10          <0.01
eight-class   -6171.11   79        69.55 (298), >.99    12910.93  12989.93  13716.64   -                     -            <0.01
nine-class    Not well identified
ten-class     Not well identified

*Number of parameters estimated.

Table 25.5. LSAY Example: Double Cross-Validation Summary of Model Fit Using the Two Random Subsamples, A and B (nA = 1338; nB = 1337)

Model        Calibration   Validation    adj. X²_LR*   df    p-value   LRTS**   df***   p-value
four-class   Subsample A   Subsample B   501.975       363   <0.001    38.50    39      0.49
five-class   Subsample A   Subsample B   353.036       363   0.64      59.71    49      0.14
six-class    Subsample A   Subsample B   365.876       363   0.45      136.66   59      <0.001
four-class   Subsample B   Subsample A   425.04        377   0.04      43.67    39      0.28
five-class   Subsample B   Subsample A   282.63        377   1.00      64.21    49      0.07
six-class    Subsample B   Subsample A   260.37        377   1.00      101.85   59      <0.001

* Goodness-of-fit of the model to the validation subsample with all parameter values fixed at the estimates obtained from the calibration subsample.
** LRTS = -2(LL0 - LL1), where LL0 is the maximized log likelihood value for the K-class model fit to the validation subsample with all parameter values fixed at the estimates obtained from the calibration subsample, and LL1 is the maximized log likelihood value for the K-class model fit to the validation subsample with all parameters freely estimated.
*** df = number of parameters in the K-class model.
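The quantities in Tables 25.4 and 25.5 can be reproduced from the log likelihood, the number of parameters, and n. A minimal sketch (the BIC/CAIC/AWE penalty forms below are inferred from the reported values, not quoted from this chapter's equations; the chi-square tail probability uses a standard series expansion of the regularized incomplete gamma function):

```python
import math

def bic(ll, d, n):
    """BIC = -2LL + d * ln(n)."""
    return -2 * ll + d * math.log(n)

def caic(ll, d, n):
    """CAIC = -2LL + d * (ln(n) + 1)."""
    return -2 * ll + d * (math.log(n) + 1)

def awe(ll, d, n):
    """AWE = -2LL + 2d * (ln(n) + 1.5)."""
    return -2 * ll + 2 * d * (math.log(n) + 1.5)

def lrts(ll_fixed, ll_free):
    """LRTS = -2(LL0 - LL1), as defined under Table 25.5."""
    return -2 * (ll_fixed - ll_free)

def chi2_sf(x, df):
    """P(X > x) for a chi-square variate, via the lower regularized
    incomplete gamma series (adequate for the moderate values used here)."""
    s, t = df / 2.0, x / 2.0
    term = math.exp(s * math.log(t) - t - math.lgamma(s + 1.0))
    total, n = term, 1
    while term > 1e-12 * total:
        term *= t / (s + n)
        total += term
        n += 1
    return 1.0 - total

# One-class model for Subsample A (Table 25.4): LL = -7328.10, 9 parameters.
print(round(bic(-7328.10, 9, 1338), 1))   # close to the reported 14721.00
print(round(awe(-7328.10, 9, 1338), 1))   # close to the reported 14812.79
# Four-class LRTS in Table 25.5: 38.50 on 39 df gives a p-value near 0.49.
print(round(chi2_sf(38.50, 39), 2))
```

The small penalty differences among BIC, CAIC, and AWE explain why they can disagree on the "best" K, as they do for the four- versus five-class models above.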

these nested models was not significant, indicating that the parameter estimates for the four-class model using Subsample B data were not significantly different from the parameter estimates from Subsample A. Thus, by this criterion, the estimated four-class model from Subsample A did replicate in Subsample B (indicated by bolded text in the table). The five-class model from Subsample A was the only one of the three candidate models that validated by both criteria (indicated by the boxed text).

For a double cross-validation, the full process above is repeated for Subsample B. I estimated K = 1 to K = 10 class models; selected a subset of candidate models, which were the same four-, five-, and six-class models as I had selected for Subsample A, again favoring the five-class model; and then cross-validated using Subsample A. As shown in Table 25.5, the five-class model from Subsample B was the only one of the three candidate models that cross-validated by both criteria in Subsample A.

Before the five-class model is anointed as the "final" unconditional model, there are a few more evaluations necessary. Although the five-class model is not rejected in the LR chi-square exact fit test, it is still advisable to examine the standardized residuals. Only six of the response patterns with model-estimated frequencies above 1.0 have standardized residuals greater than 3.0, only slightly more than the 1% one would expect by chance, and only one of those standardized residuals is greater than 5.0. Thus, closer examination of the model residuals does not raise concern about the fit of the five-class model to the data. Table 25.6 provides a summary of the observed and model-estimated frequencies for all observed response patterns with frequencies greater than 10, along with the standardized residual values.

Because I have approached this analysis as a direct application of mixture modeling, in that I am assuming a priori that the population is heterogeneous with regard to math dispositions and that the items selected for the analysis are indicators of membership in one of an unknown number of subgroups with characteristically different math disposition profiles, it is also necessary to examine the classification diagnostics for the five-class model as well as to evaluate the substantive meaning and utility of the resultant classes. Table 25.7 summarizes the classification diagnostic measures for the five-class model, which has a relative entropy of E5 = .77. The modal class assignment proportions (mcaP) are all very near the estimated class proportions and well within the corresponding 95% (bias-corrected bootstrap) confidence intervals for π̂k; the AvePP are all greater than 0.70; and the odds of correct classification ratios are all well above 5.0, collectively indicating that the five classes are well separated and there is high accuracy in the latent class assignment. This result further endorses the choice of the five-class model.

The interpretation of the resultant five classes is based primarily on the model-estimated, class-specific item response probabilities provided in Table 25.8 and depicted graphically in the profile plot

Table 25.6. LSAY Example: Observed Response Patterns (f > 10), Observed and Estimated Frequencies, and Standardized Residuals for Subsample A with Estimated Posterior Class Probabilities and Modal Class Assignments Based on the Five-Class Unconditional LCA

r*   Item+ response pattern (1)-(9)   f_r*     f̂_r*     stdres_r*   p̂_i1   p̂_i2   p̂_i3   p̂_i4   p̂_i5   c_modal
1    1 1 1 1 1 1 1 1 1               254.00   234.24    1.44       0.99   0.01   0.01   0.00   0.00   1
2    1 1 1 0 0 1 1 1 1                53.00    47.91    0.75       0.00   0.99   0.00   0.01   0.00   2
3    1 1 1 0 1 1 1 1 1                46.00    44.80    0.18       0.86   0.12   0.01   0.01   0.00   1
4    0 0 0 0 0 0 0 0 0                36.00    23.90    2.50       0.00   0.00   0.00   0.00   1.00   5
5    · · · · · · · · ·                31.00    39.62   -1.39       0.93   0.01   0.06   0.00   0.00   1
6    · · · · · · · · ·                26.00    29.00   -0.56       0.95   0.00   0.02   0.03   0.00   1
7    · · · · · · · · ·                22.00    22.24   -0.05       0.85   0.02   0.13   0.00   0.00   1
8    · · · · · · · · ·                19.00    16.91    0.51       0.00   0.97   0.02   0.01   0.00   2
9    1 1 1 1 1 0 0 0 0                18.00    10.54    2.31       0.00   0.00   0.99   0.00   0.01   3
10   · · · · · · · · ·                17.00    18.12   -0.27       0.84   0.01   0.15   0.00   0.00   1
11   0 0 0 0 0 1 1 1 1                17.00     9.51    2.44       0.00   0.00   0.00   1.00   0.00   4
12   1 1 1 1 1 0 0 1 1                15.00     8.07    2.45       0.37   0.01   0.61   0.00   0.00   3
13   0 0 0 1 1 0 0 0 0                15.00     4.75    4.72       0.00   0.00   0.02   0.01   0.98   5
14   · · · · · · · · ·                14.00    19.63   -1.28       0.93   0.01   0.01   0.05   0.00   1
15   0 0 1 1 1 1 1 1 1                14.00     5.87    3.36       0.37   0.00   0.02   0.61   0.00   4
16   · · · · · · · · ·                13.00    14.87   -0.49       0.88   0.02   0.11   0.00   0.00   1
17   · · · · · · · · ·                11.00     6.74    1.65       0.19   0.01   0.81   0.00   0.00   3
+ Item statements: (1) I enjoy math; (2) I am good at math; (3) I usually understand what we are doing in math; (4) Doing math often makes me nervous or upset; (5) I often get scared when I open my math book and see a page of problems; (6) Math is useful in everyday problems; (7) Math helps a person think logically; (8) It is important to know math to get a good job; (9) I will use math in many ways as an adult. (~ Reverse coded.)

Table 25.7. LSAY Example: Model Classification Diagnostics for the Five-Class Unconditional Latent Class Analysis (E5 = .77) for Subsample A (nA = 1338)

Class     π̂k      95% C.I.*         mcaPk    AvePPk    OCCk
Class 1   0.392    (0.326, 0.470)    0.400    0.905     14.78
Class 2   0.130    (0.082, 0.194)    0.125    0.874     46.42
Class 3   0.182    (0.098, 0.255)    0.176    0.791     17.01
Class 4   0.190    (0.139, 0.248)    0.189    0.833     21.26
Class 5   0.105    (0.080, 0.136)    0.109    0.874     59.13

* Bias-corrected bootstrap 95% confidence intervals.
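The diagnostics in Table 25.7 derive from the estimated class proportions and individual posterior class probabilities. A minimal sketch (the OCC and relative entropy formulas follow the definitions used in this chapter; the three-row posterior matrix is a toy illustration, not the LSAY data):

```python
import math

def occ(avepp_k, pi_k):
    """Odds of correct classification for class k:
    odds(AvePP_k) divided by odds(pi_k)."""
    return (avepp_k / (1 - avepp_k)) / (pi_k / (1 - pi_k))

def modal_class(row):
    """Index of the largest posterior probability (modal assignment)."""
    return max(range(len(row)), key=row.__getitem__)

def avepp(post, k):
    """Average posterior probability for class k among individuals
    modally assigned to class k."""
    rows = [r for r in post if modal_class(r) == k]
    return sum(r[k] for r in rows) / len(rows)

def relative_entropy(post):
    """E_K = 1 - sum_ik(-p_ik * ln p_ik) / (n * ln K); 1 = perfect assignment."""
    n, K = len(post), len(post[0])
    ent = sum(-p * math.log(p) for row in post for p in row if p > 0)
    return 1 - ent / (n * math.log(K))

# Reproducing two OCC entries of Table 25.7 from the reported class
# proportions and average posterior probabilities:
print(round(occ(0.905, 0.392), 2))  # 14.78 for Class 1
print(round(occ(0.874, 0.130), 2))  # 46.42 for Class 2
```

Because OCC compares the posterior odds of correct assignment to the odds implied by chance assignment at the class proportion, values well above 5.0 (as in every row of Table 25.7) indicate good class separation.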

Table 25.8. LSAY Example: Model-Estimated, Class-Specific Item Response Probabilities (ω̂mk) Based on the Five-Class Unconditional Latent Class Analysis Using Subsample A (nA = 1338)

Item aspect / Item statement                         Class 1 (39%)  Class 2 (13%)  Class 3 (18%)  Class 4 (19%)  Class 5 (10%)
Math affect and math efficacy
  I enjoy math.                                           0.89           0.99           0.72           0.21           0.18
  I am good at math.                                      0.93           0.91           0.84           0.17           0.14
  I usually understand what we are doing in math.         0.96           0.89           0.91           0.43           0.23
Math anxiety
  ~Doing math often makes me nervous or upset.            0.86           0.26           0.71           0.32           0.25
  ~I often get scared when I open my math book            1.00           0.10           0.82           0.52           0.37
   and see a page of problems.
Math utility
  Math is useful in everyday problems.                    0.92           0.85           0.33           0.77           0.09
  Math helps a person think logically.                    0.86           0.83           0.37           0.67           0.06
  It is important to know math to get a good job.         0.95           0.89           0.47           0.83           0.11
  I will use math in many ways as an adult.               0.94           0.89           0.35           0.79           0.05

~ Reverse coded.

Figure 25.6. LSAY example: Model-estimated, class-specific item probability profile plot for the five-class unconditional LCA (Class 1, 39%; Class 2, 13%; Class 3, 18%; Class 4, 19%; Class 5, 10%).
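The homogeneity and separation screens applied to Tables 25.8 and 25.9 are simple to automate. A sketch (the profile values are transcribed from Table 25.8; the published odds ratios were computed from unrounded estimates, so back-calculating from two-decimal probabilities only approximates Table 25.9):

```python
# Class-specific pro-math endorsement probabilities for items 1-9 (Table 25.8)
CLASS_1 = [0.89, 0.93, 0.96, 0.86, 1.00, 0.92, 0.86, 0.95, 0.94]
CLASS_4 = [0.21, 0.17, 0.43, 0.32, 0.52, 0.77, 0.67, 0.83, 0.79]

def high_homogeneity(profile, lo=0.3, hi=0.7):
    """Items whose class-specific response probability is below lo or above
    hi, i.e., items with a characteristically low or high response
    propensity for that class."""
    return [m for m, w in enumerate(profile, start=1) if w < lo or w > hi]

def item_odds_ratio(w_j, w_k):
    """Item response odds ratio comparing class j to class k:
    OR = [w_j / (1 - w_j)] / [w_k / (1 - w_k)]; values far from 1
    (e.g., above 5 or below 0.2) indicate good separation on the item."""
    return (w_j / (1 - w_j)) / (w_k / (1 - w_k))

print(high_homogeneity(CLASS_1))            # all nine items are homogeneous
print(high_homogeneity(CLASS_4))            # only items 1, 2, 6, 8, 9
print(round(item_odds_ratio(0.9, 0.3), 1))  # 21.0, well above the 5.0 benchmark
```

This mirrors the visual use of the 0.7 and 0.3 reference lines in Figure 25.6: Class 1 is homogeneous on every item, while Class 4 is characterized only by its low affect/efficacy and high utility responses.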

in Figure 25.6. Item response probabilities with a high degree of class homogeneity (i.e., estimated values greater than 0.7 or less than 0.3) are bolded in Table 25.8. All the items have high class homogeneity for at least three of the five classes, indicating that all nine items are useful for characterizing the latent classes. In Figure 25.6, the horizontal lines at the 0.7 and 0.3 endorsement probability levels help provide a visual guide for high levels of class homogeneity. These lines also help with the visual inspection of class separation with respect to each item: for example, two classes with item response probabilities above the 0.7 line for a given item are likely not well separated with respect to that item.

Table 25.9 provides all the model-estimated item response odds ratios for each pairwise latent class comparison. Bolded values indicate that the two classes being compared are well separated with respect to that set of items. The numbers in Table 25.9 correspond to visual impressions based on Figure 25.6; for example, Class 1 and Class 5 both have high homogeneity with respect to items 1 through 3 and appear to be well separated, as confirmed by very large item response odds ratios (all in great excess of 5.0).

Tables 25.8 and 25.9, along with Figure 25.6, also distinguish the observed items by their affiliation with one of the three substantive aspects of math disposition previously discussed. As can be seen in both the tables and the figure, the class-specific item probabilities are similar in level of class homogeneity within each of these three aspects, as are the patterns of class separation; that is, most pairs of classes are either well separated with respect to all or none of the items within an aspect group. Thus, as anticipated earlier, these three aspects can be used to refine the substantive interpretation of the five classes rather than characterizing the classes item by item. In attaching substantive meaning to the classes, I take into account both class homogeneity and class separation with respect to all the items. It is also useful to return to the actual observed response patterns in the data to identify prototypical response patterns for each of the classes. Prototypical patterns should have reasonably sized observed frequencies, non-significant standardized residuals, and an estimated posterior probability near 1.0 for the class to which an individual with that response pattern would be modally assigned. I identify prototypical patterns for each of the five classes using the information provided in Table 25.6; some prototypical responses are boxed by solid lines in the table.

Class 1, with an estimated proportion of 39%, is characterized by an overall positive math disposition, with high probabilities of endorsing positive math affect and efficacy items, positive math anxiety items (indicating a low propensity for math anxiety), and positive math utility items. Class 1 has a high
Table 25.9. LSAY Example: Model-Estimated Item Response Odds Ratios for All Pairwise Latent Class Comparisons Based on the Five-Class Unconditional Latent Class Analysis Using Subsample A (nA = 1338)

Item aspect / Item statement     1 vs. 2   1 vs. 3   1 vs. 4   1 vs. 5   2 vs. 3   2 vs. 4   2 vs. 5   3 vs. 4   3 vs. 5   4 vs. 5
Math affect and math efficacy
  I enjoy math.                    0.11      3.28     30.91     37.83     30.72     >100      >100       9.42     11.53      1.22
  I am good at math.               1.31      2.34     59.92     78.96      1.78     45.60     60.10     25.61     33.75      1.32
  I usually understand what        2.70      2.15     28.99     71.52      0.80     10.75     26.52     13.49     33.28      2.47
  we are doing in math.
Math anxiety
  ~Doing math often makes me      17.32      2.39     13.03     18.47      0.14      0.75      1.07      5.45      7.72      1.42
  nervous or upset.
  ~I often get scared when I       >100      >100      >100      >100      0.03      0.10      0.19      4.03      7.36      1.82
  open my math book and see
  a page of problems.
Math utility
  Math is useful in everyday       2.16     24.48      3.67      >100     11.36      1.70     60.04      0.15      5.29     35.30
  problems.
  Math helps a person think        1.32     10.85      3.05      >100      8.19      2.30     71.66      0.28      8.75     31.16
  logically.
  It is important to know math     2.13     19.83      3.74      >100      9.29      1.75     68.99      0.19      7.43     39.33
  to get a good job.
  I will use math in many ways     1.81     28.79      4.17      >100     15.91      2.30      >100      0.14      9.99     69.06
  as an adult.
~ Reverse coded.

level of homogeneity with respect to all the items. This class might be labeled the "Pro-math without anxiety" class, where "pro-math" implies both liking and valuing the utility of mathematics. Response pattern 1 in Table 25.6 is a prototypical response pattern for Class 1, with individuals endorsing all nine items.

Class 5, with an estimated proportion of 10%, is characterized by an overall negative math disposition, with low probabilities of endorsing positive math affect and efficacy items, positive math anxiety items (indicating a high propensity for math anxiety), and positive math utility items. Class 5 has a high level of homogeneity with respect to all the items and is extremely well separated from Class 1 with respect to all the items. This class might be labeled the "Anti-math with anxiety" class, where "anti-math" implies both disliking and undervaluing the utility of mathematics. Response pattern 4 in Table 25.6 is a prototypical response pattern for Class 5, with individuals endorsing none of the nine items.

Because Classes 1 and 5 represent clear profiles of positive and negative math dispositions across the entire set of items, with high levels of class homogeneity across all the items (with the exception of item 5 in Class 5), and are well separated from each other with respect to all items (with item response odds ratios all well in excess of 5.0), the class separation of the remaining three classes will be evaluated primarily with respect to Classes 1 and 5.

Class 2, with an estimated proportion of 13%, is characterized by an overall positive math disposition like Class 1, with the exception that this class has very low probabilities of endorsing positive math anxiety items (indicating a high propensity for math anxiety). Class 2 has a high level of homogeneity with respect to all the items; is well separated from Class 1 with respect to the math anxiety items but not the math affect and efficacy or the math utility items (with the exception of item 1); and is well separated from Class 5 with respect to the math affect and efficacy and the math utility items. This class might be labeled the "Pro-math with anxiety" class. Response pattern 2 in Table 25.6 is a prototypical response pattern for Class 2, with individuals endorsing all but the two math anxiety items.

Class 3, with an estimated proportion of 18%, is characterized by high probabilities of endorsing positive math affect and efficacy items and positive math anxiety items (indicating a low propensity for math anxiety). Class 3 does not have a high level of homogeneity with respect to the math utility items, which means that this class is not characterized by either high or low response propensities on those items. However, Class 3 is well separated from Class 1 and Class 5 with respect to those items. Generally speaking, Class 3 is not well separated from Class 1 with respect to the math affect and efficacy and the math anxiety items but is well separated from Class 5 with respect to those same items. This class might be labeled the "Math lover" class, where "love" implies both a positive math affect and a low propensity for math anxiety. Response pattern 9 in Table 25.6 is a prototypical response pattern for Class 3, with individuals endorsing all but the math utility items.

Class 4, with an estimated proportion of 19%, is mostly characterized by low probabilities of endorsing positive math affect and efficacy items and high probabilities of endorsing positive math utility items. Class 4 does not have a high level of homogeneity with respect to the math anxiety items, which means that this class is not characterized by either high or low response probabilities on those items. It is well separated from Class 1 with respect to the math anxiety items as well as the math affect and efficacy items, but not well separated from Class 5 for those same items. Class 4 is well separated from Class 5 with respect to the math utility items but not well separated from Class 1. This class might be labeled the "I don't like math but I know it's good for me" class. Response pattern 11 in Table 25.6 is a prototypical response pattern for Class 4, with individuals endorsing only the math utility items.

None of the five resultant classes has an estimated class proportion corresponding to a majority share of the overall population, nor are any of the classes distinguished from the rest by a relatively small proportion. Thus, although it is quite interesting that the "Pro-math without anxiety" class is the largest at 40% and the "Anti-math with anxiety" class is the smallest at 10%, the estimated class proportions themselves, in this case, did not contribute directly to the interpretation of the classes.

As a final piece of the interpretation process, I also examine response patterns that are not well fit and/or not well classified by the selected model. These patterns could suggest additional population heterogeneity that does not have a strong "signal" in the present data and is not captured by the resultant latent classes. Noticing patterns that are not well fit or well classified by the model can deepen understanding of the latent classes that do emerge and may also suggest directions for future research, particularly regarding enhancing the item set. Enclosed by a dashed box in Table 25.6, response pattern 13 has a large standardized residual and is not well fit by the model. Although individuals with this response pattern have a high posterior probability for Class 5, their pattern of response, endorsing only the math anxiety items, is not prototypical of any of the classes. These cases are individuals who have a low propensity toward math anxiety but are inclined to dislike and undervalue mathematics. They don't like math but are "fearless." These individuals could represent just a few random outliers, or they could be indicative of a smaller class that is not detected in this model but might emerge in a future study with a larger sample and an expanded item set. Individuals with response pattern 15 in Table 25.6 are also not well classified. Although individuals with this response pattern would be modally assigned to Class 4, the estimated posterior probability for Class 4 is only 0.61, while the estimated posterior probability for Class 1 is 0.37. These individuals, endorsing all but the first two math affect and efficacy items, although more consistent with the Class 4 profile, are very similar to individuals with response patterns such as pattern 14 in Table 25.6 that endorse all but one of the math affect and efficacy items and have a high estimated posterior probability for Class 1. Not surprisingly, it is harder to classify response patterns into classes without a high degree of homogeneity on the full set of items, such as Classes 3 and 4, as is evident from the relatively lower AvePPs found in Table 25.7 for Classes 3 and 4 compared to Classes 1, 2, and 5.

Concluding now the full empirical illustration of latent class analysis, I switch gears to introduce traditional finite mixture modeling, also known as LPA (the moniker used herein) and LCCA.

Latent Profile Analysis

Model Formulation

I begin the formal LPA model specification with an unconditional model in which the only observed variables are the continuous manifest variables of the latent class variable. This model is the unconditional measurement model for the latent class variable. Suppose there are M continuous (interval scale) latent class indicators, y1, y2, ..., yM, observed on n study participants, where ymi is the observed response to item m for participant i. It is assumed for the unconditional LPA that there is an underlying unordered categorical latent class variable, denoted by c, with K classes, where ci = k if individual i belongs to Class k. As before, the proportion of individuals in Class k, Pr(c = k), is denoted by πk. The K classes are exhaustive and mutually exclusive, such that each individual in the population has membership in exactly one of the K latent classes, and Σk πk = 1. The relationship between the observed responses on the M items and the latent class variable, c, is expressed as

    f(y_i) = Σ_(k=1…K) [ π_k · f_k(y_i) ],    (29)

where y_i = (y_1i, y_2i, ..., y_Mi), f(y_i) is the multivariate probability density function for the overall population, and f_k(y_i) = f(y_i | c_i = k) is the class-specific density function for Class k. Thus, the LPA measurement model specifies that the overall joint distribution of the M continuous indicators is the result of a mixing of K component distributions of the M indicators, with f_k(y_i) representing the component-specific joint distribution for y_i.

As with the LCA model, the structural parameters are those related to the distribution of the latent class variable, which for the unconditional LPA model are simply the class proportions, πk. The measurement parameters are all those related to the class-specific probability distributions. Usually, as was done in the very first finite mixture model applications, the within-class distribution of the continuous indicator variables is assumed to be multivariate normal.
That Essentially, a latent profile model is simply a is, latent class model with continuous-rather than [y;lc; = k] "'MVN(ak, :Ek), (30) categorical-indicators of the latent class vari­ able. Almost everything learned in the previous where

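As a small numerical sketch of Equations 29 and 30 (all values invented for illustration), the overall density is a π-weighted sum of class-specific normal densities; for simplicity the within-class covariance here is diagonal, so each class-specific density factors into univariate normal densities.

```python
import math

def normal_pdf(y, mean, sd):
    """Univariate normal density."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def class_density(y, means, sds):
    """f_k(y_i): joint density of the M indicators in Class k under a
    diagonal (conditional-independence) within-class covariance."""
    p = 1.0
    for y_m, mean, sd in zip(y, means, sds):
        p *= normal_pdf(y_m, mean, sd)
    return p

def mixture_density(y, pis, class_means, class_sds):
    """Equation 29: f(y_i) = sum over k of pi_k * f_k(y_i)."""
    return sum(pi * class_density(y, mu, sd)
               for pi, mu, sd in zip(pis, class_means, class_sds))

# Hypothetical K = 2, M = 2 example
pis = [0.33, 0.67]                            # class proportions, sum to 1
class_means = [(-2.93, 2.97), (1.55, 0.73)]   # alpha_k
class_sds = [(1.00, 1.12), (1.93, 1.41)]      # sqrt of the diagonal of Sigma_k

f_y = mixture_density([0.0, 1.0], pis, class_means, class_sds)
```

Evaluating `mixture_density` over a grid of y values traces out the overall (generally non-normal) population density implied by the mixture.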
584 LATENT CLASS ANALYSIS AND FINITE MIXTURE MODELING

The measurement parameters are then the class-specific means, variances, and covariances of the indicator variables. Notice that although one necessarily assumes a particular parametric distribution within each class that is appropriate for the measurement scales of the variables, there are not any assumptions made about the joint distribution of the indicators in the overall population.

The model expressed in Equations 29 and 30 can be represented by a path diagram as shown in Figure 25.7. If you compare Figure 25.7 to Figure 25.2, along with replacing the u with y to represent continuous rather than categorical manifest variables, "residuals" terms have been added, represented by the ε indexed by k, to indicate that there is within-class variability on the continuous indicators that may differ across the classes in addition to the mean structure of the y that may vary across the classes as indicated by the arrows from c directly to the y. Unlike with categorical indicators, the class-specific estimated means and variances/covariances (assuming normality within class) can be uniquely identified for each class.

Figure 25.7 Generic path diagram for an unconditional latent profile model.

Traditionally, the means of the y are automatically allowed to vary across the classes as part of the measurement model; that is, the mean structure is always class-varying. The within-class variances may be class-varying or constrained to be class-invariant (i.e., within-class variances held equal across the classes). And, as implied by Figure 25.7, the conditional independence assumption is not necessary for the within-class covariance structure. Unlike LCA, latent profile models do not require partial conditional independence for model identification: all indicators can covary with all other indicators within class. Hence, the latent class variable does not have to be specified to explain all of the covariation between the indicators in the overall population.

With increased flexibility in the within-class model specification comes additional complexity in the model-building process. But before getting into the details of model building for latent profile models, let me formally summarize the main within-class variance-covariance structures that may be specified for Σk (presuming here that αk will be left unconstrained within and across the classes in all cases). Starting from the least restrictive of variance-covariance structures, there is the class-varying, unrestricted Σk of the form

    Σk = | θ11k                   |
         | θ21k  θ22k             |    (32)
         | ...   ...   ...        |
         | θM1k  θM2k  ...  θMMk  |

where θmmk is the variance of item m in Class k and θmjk is the covariance between items m and j in Class k. In this structure for Σk, all the indicator variables are allowed to covary within class, and the variances and covariances are allowed to be different across the latent classes. The class-invariant, unrestricted Σk has the form

    Σk = Σ = | θ11                |
             | θ21  θ22           |    for all k in (1, ..., K),    (33)
             | ...  ...  ...      |
             | θM1  θM2  ...  θMM |

such that all the indicator variables are allowed to covary within class, and the variances and covariances are constrained to be equal across the latent classes (class-invariant). The class-varying, diagonal Σk has the form

    Σk = | θ11k                  |
         | 0     θ22k            |    (34)
         | ...   ...   ...       |
         | 0     0    ...  θMMk  |

such that conditional independence is imposed and the covariances between the indicators are fixed at zero within class while the variances are freely estimated and allowed to be different across the latent classes. The most constrained within-class variance-covariance structure is the class-invariant, diagonal Σk with the form

    Σk = Σ = | θ11               |
             | 0    θ22          |    for all k in (1, ..., K),    (35)
             | ...  ...  ...     |
             | 0    0   ...  θMM |

such that conditional independence is imposed and the covariances between the indicators are fixed at zero within class while the variances are constrained to be equal across the latent classes.

The determination of the number of latent classes, as well as the estimates of the structural parameters (class proportions) and the measurement parameters (class-specific means, variances, and covariances) and the interpretation of the resultant classes, will very much depend on the specification of the within-class joint distribution of the latent class indicators. This dependence is analogous to the dependence of clustering on the selection of the attribute space and the resemblance coefficient in a cluster analysis. As it happens, specifying a class-invariant, diagonal Σk in a K-class LPA model will yield a solution that is the model-based equivalent of applying a K-means clustering algorithm to the latent profile indicators (Vermunt & Magidson, 2002).

To better understand how the number and nature of the latent classes can be influenced by the specification of Σk, let's consider a hypothetical data sample drawn from an unknown but distinctly non-normal bivariate population distribution. The scatter plot for the sample observations is displayed in Figure 25.8.a. Figure 25.8.b shows a path diagram for a three-class latent profile model with a class-invariant, diagonal Σk along with the empirical results of applying the three-class LPA model to the sample data, depicted as a scatter plot with: individual observations marked with symbols corresponding to modal assignment into one of the three latent classes (circles, x's, and triangles); diamonds representing the class centroids, (α̂1k, α̂2k), that is, the model-estimated, class-specific means for y1 and y2; trend lines representing the class-specific linear associations for y2 versus y1; and ellipses to provide a visual impression of the model-estimated, class-specific variances for y1 and y2, where the width of each ellipse is equal to three model-estimated, class-specific standard deviations for y1 and the height is equal to three model-estimated, class-specific standard deviations for y2. The model in Figure 25.8.b imposes the conditional independence assumption, and thus y1 and y2 are uncorrelated within class, shown by the flat trend lines for each of the three classes. The model also constrains the within-class variance-covariance structure to be the same across the classes, shown by the same-size ellipses for each of the three classes.

Figure 25.8 (a) Bivariate scatterplot based on a hypothetical sample from an overall bivariate non-normal population distribution; (b) Path diagram for a three-class model with class-invariant, diagonal Σk and the scatter plot of sample values marked by modal latent class assignment based on the three-class model; and (c) Path diagram for a two-class model with class-varying, unrestricted Σk and the scatter plot of sample values marked by modal latent class assignment based on the two-class model. In (b) and (c), diamonds represent the model-estimated class-specific bivariate mean values, trend lines depict the model-estimated within-class bivariate associations, and the ellipse heights and widths correspond to 3.0 model-estimated within-class standard deviations on y2 and y1, respectively.

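The four Σk structures in Equations 32 through 35 differ only in how many free variance-covariance parameters they spend. A small sketch (the function name and labels are mine, not the chapter's) makes the trade-off explicit:

```python
def sigma_free_params(M, K, structure):
    """Free within-class variance-covariance parameters implied by the
    four Sigma_k specifications of Equations 32-35."""
    full = M * (M + 1) // 2  # M variances + M(M-1)/2 covariances
    counts = {
        "class-varying, unrestricted": K * full,   # Eq. 32
        "class-invariant, unrestricted": full,     # Eq. 33
        "class-varying, diagonal": K * M,          # Eq. 34
        "class-invariant, diagonal": M,            # Eq. 35
    }
    return counts[structure]

# Example: M = 4 indicators, K = 3 latent classes
for spec in ("class-varying, unrestricted", "class-invariant, unrestricted",
             "class-varying, diagonal", "class-invariant, diagonal"):
    print(f"{spec}: {sigma_free_params(4, 3, spec)}")
```

For M = 4 and K = 3 the counts are 30, 10, 12, and 4, which is one way to see why the least restrictive structure can be costly to estimate as K grows.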
Figure 25.8.c displays a path diagram for a two-class latent profile model with a class-varying, unrestricted Σk along with the empirical results of applying the two-class LPA model to the sample data, depicted as a scatter plot using the same conventions as Figure 25.8.b. The results of these two models, shown in Figures 25.8.b and 25.8.c, applied to the same sample data shown in Figure 25.8.a, are different in both the number and nature of the latent classes. They provide alternative representations of the population heterogeneity with respect to the latent class continuous indicators, y1 and y2. And they would lead to quite different substantive interpretations. You could make comparisons of fit between the two models to determine whether one is more consistent with the observed data, but if they both provide adequate fit and/or are comparable in fit to each other, then you must rely on theoretical and practical considerations to choose one representation over the other. Because you don't ever know the "true" within-class variance-covariance structure, just as you don't ever know the "correct" number of latent classes when you embark on a latent profile analysis, and now understanding how profoundly the specification of Σk could influence the formation of the latent classes, the LPA model-building process must compare models, statistically and substantively, across a full range of Σk specifications.

Model Interpretation
If you were engaged in an indirect application of finite mixture modeling to obtain a semi-parametric approximation for an overall non-normal homogeneous population, then you would focus on the "remixed" results for the overall population and would not be concerned with the distinctiveness or separation of the latent classes and would not interpret the separate mixture components. However, if you are using a latent profile analysis in a direct application, assuming a priori that the population is made up of two or more normal homogeneous subpopulations, then you would place high value on results that yield classes that are disparate enough from each other that it is reasonable to interpret each class as representative of a distinct subpopulation.

In some sense, the direct application of finite mixture modeling is a kind of stochastic model-based clustering method in which one endeavors to arrive at a latent class solution with the number and nature of latent classes (clusters) such that the individual variability with respect to the indicator variables within the classes is minimized and/or the between-class variability is maximized. (For more on mixture modeling as a clustering method and comparison to other clustering techniques, see Vermunt & Magidson, 2002, and the chapter on clustering within this handbook.) These clustering objectives can be restated in the terms used when presenting the interpretation of latent class models: For distinct and optimally interpretable latent classes, it is desirable to have a latent profile model with a high degree of class homogeneity (low within-class variability) along with a high degree of class separation (high between-class variability).

Just as was done with LCA, the concepts of latent class homogeneity and latent class separation, how they both relate to the parameters of the unconditional measurement model, and how they inform the interpretation of the latent classes resulting from an LPA will be discussed. To assist this discussion, consider a hypothetical example with two continuous indicators (M = 2) measuring a two-class categorical latent variable (K = 2). And suppose that you decide to use a class-varying, unrestricted Σk specification for the LPA. The unconditional model is given by

    f(y1i, y2i) = Σ_{k=1}^{2} [πk · fk(y1i, y2i)],    (36)

where

    [y1i, y2i | ci = k] ~ MVN( αk = [α1k, α2k]',  Σk = | θ11k       | ).    (37)
                                                       | θ21k  θ22k |

Class Homogeneity. The first and primary way that you can evaluate the degree of class homogeneity is by examining the model-estimated within-class variances, θ̂mmk, for each indicator m across the K classes and comparing them to the total overall sample variance, θ̂mm, for the continuous indicator. It is expected that all of the within-class variances will be notably smaller than the overall variance. Classes with smaller values of θ̂mmk are more homogeneous with respect to item m than classes with larger values of θ̂mmk. You can equivalently compare within-class standard deviations, √θ̂mmk, for each item m across the K classes, which approximate for each class the average distance of class members' individual values on item m to the corresponding model-estimated class mean, α̂mk. You want classes for which class members are close, on average, to the class-specific mean because you want to be able to use the class mean values in your interpretation of the latent classes as values that "typify" the observed responses on the indicator variables for members of that class.

You cannot, of course, directly compare values of θ̂mmk across items because different items may have very different scales and the magnitude of the variance (and, hence, the standard deviation) is scale-dependent. Even for items with the same measurement scales, you cannot compare within-class variances across items unless the overall variances of those items are comparable. However, it is possible to summarize class homogeneity across items and classes by calculating the percent of the overall total variance in the indicator set explained by the latent class variable, similarly to the calculations done in a principal component analysis (Thorndike, 1953).

The phrase "class homogeneity" refers here to an expectation that the individuals belonging to the same class will be more similar to each other with respect to their values on the indicator variables than they are to individuals in other classes. However, you should still keep in mind that an LPA assumes a priori that the classes are homogeneous in the sense that all members of a given class are assumed to draw from a single, usually multivariate normal, population distribution. And, as such, any within-class correlation between continuous indicators, if estimated, is assumed to be an association between those variables that holds for all members of that class. Evaluating the statistical and practical significance of an estimated within-class indicator correlation, if not fixed at zero in the model specification, can assist in judging whether that correlation could be used in the characterization of the subpopulation represented by that particular latent class. Significant within-class correlations, when present, may be as much a part of what distinguishes the classes as the class-specific means and variances.

Class Separation. The first and primary way you can evaluate the degree of class separation is by assessing the actual distance between the class-specific means. It is not enough to simply calculate the raw differences in estimated means (i.e., α̂mj − α̂mk) because what is most relevant is the degree of overlap between the class-specific distributions. And the degree of overlap between two normal distributions depends not only on the distance between the means but on the variances of the distributions as well. Consider, for example, the two scenarios shown in Figure 25.9. Figure 25.9.a depicts two hypothetical class-specific indicator distributions with means 3.0 units apart and class-specific standard deviations of 3, and Figure 25.9.b depicts two hypothetical class-specific indicator distributions with the same mean separation as in Figure 25.9.a but with class-specific standard deviations of 1. There is considerable overlap of the distributions in Figure 25.9.a and very little overlap in Figure 25.9.b. The two classes in Figure 25.9.b are far better separated than the two classes in Figure 25.9.a with respect to the indicator, although the difference in means is the same.

Figure 25.9 Hypothetical finite mixture distributions for a single continuous indicator variable with K = 2 underlying latent classes with class-specific means of 0 and 3, respectively, and with class-specific standard deviations of (a) 3, and (b) 1.

To quantify class separation between Class j and Class k with respect to a particular item m, compute a standardized mean difference, adapting the formula for Cohen's d (Cohen, 1988), as given below,

    d̂mjk = (α̂mj − α̂mk) / σ̂mjk,    (38)

where σ̂mjk is a pooled standard deviation given by

    σ̂mjk = √[ (n̂j · θ̂mmj + n̂k · θ̂mmk) / (n̂j + n̂k) ].    (39)

A large |d̂mjk| > 2.0 corresponds to less than 20% overlap in the distributions, meaning that less than 20% of individuals belonging to either Class j or Class k have values on item m that fall in the range of ym corresponding to the area of overlap between the two class-specific distributions of ym.

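Equations 38 and 39 translate directly into code. In the Figure 25.9 scenario (class means 3.0 units apart, equal class sizes assumed here for illustration), shrinking the within-class standard deviations from 3 to 1 triples the separation index; a hypothetical sketch:

```python
import math

def pooled_sd(n_j, var_j, n_k, var_k):
    """Equation 39: class-size-weighted pooled standard deviation."""
    return math.sqrt((n_j * var_j + n_k * var_k) / (n_j + n_k))

def d_hat(mean_j, mean_k, n_j, var_j, n_k, var_k):
    """Equation 38: standardized mean difference between Classes j and k."""
    return (mean_j - mean_k) / pooled_sd(n_j, var_j, n_k, var_k)

# Figure 25.9: class means 0 and 3, equal (hypothetical) class sizes of 50
d_sd3 = d_hat(3.0, 0.0, 50, 3.0 ** 2, 50, 3.0 ** 2)  # panel (a): d = 1.0
d_sd1 = d_hat(3.0, 0.0, 50, 1.0 ** 2, 50, 1.0 ** 2)  # panel (b): d = 3.0
```

By the cutoff just given, only panel (b)'s value of 3.0 exceeds 2.0 and so indicates a high degree of separation, even though the raw mean difference is identical in the two panels.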
A large |d̂mjk| indicates a high degree of separation between Classes j and k with respect to item m. A small |d̂mjk| < 0.85 corresponds to more than 50% overlap and a low degree of separation between Classes j and k with respect to item m.

If you are using a latent profile model specification that allows a class-varying variance-covariance structure for the classes, then you can also evaluate whether the classes are distinct from each other with respect to the item variances or covariances. To make a descriptive assessment of the separation of the classes in this regard, you can examine whether there is any overlap in the 95% confidence intervals for the estimates of the class-specific variances and covariances, with non-overlap indicating good separation. An equivalent assessment can be made using the model-estimated class-specific item standard deviations and correlations.

Class Proportions. The guidelines and cautions provided in the section on LCA for the use of the estimated class proportions in the interpretation of the latent classes are all applicable for LPA as well.

Hypothetical Example. Continuing with the hypothetical example of a two-class LPA with two continuous indicators initially presented in Figure 25.8.c, Table 25.10 provides the overall sample means and standard deviations along with the model-estimated class-specific means, standard deviations, and correlations (with the standard deviation and correlation estimates calculated using the measurement parameter estimates for the class-specific item variances and covariances). The class-specific standard deviations for the items y1 and y2 are all noticeably smaller than the corresponding overall sample standard deviations, but Class 1 is more homogenous than Class 2 with respect to both indicators (particularly y1). There is a small, non-significant correlation between y1 and y2 in Class 1 but a large and significant positive correlation between y1 and y2 in Class 2 that should therefore be considered in the interpretation of Class 2.

Applying Equations 38 and 39 to the class-specific mean and standard deviation estimates given in Table 25.10, the standardized differences in indicator means between Class 1 and Class 2 were calculated as d̂1 = −2.67, indicating a high degree of separation with respect to y1, and d̂2 = 1.70, indicating a moderate degree of separation with respect to y2. In terms of the class-specific variance-covariance structures, evaluate the separation between the classes with respect to the within-class item standard deviations and correlations by examining the differences in the point estimates and also observing the presence of overlap in the 95% confidence intervals for the point estimates.

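Under the assumption (mine, for illustration) that the estimated class proportions of .33 and .67 serve as the relative n̂k weights in Equation 39, the d̂ values just reported can be reproduced from the Table 25.10 estimates:

```python
import math

def d_hat(a_j, a_k, p_j, sd_j, p_k, sd_k):
    """Equations 38-39, with estimated class proportions as the weights."""
    pooled = math.sqrt((p_j * sd_j ** 2 + p_k * sd_k ** 2) / (p_j + p_k))
    return (a_j - a_k) / pooled

# Class 1 (33%) versus Class 2 (67%) estimates from Table 25.10
d1 = d_hat(-2.93, 1.55, 0.33, 1.00, 0.67, 1.93)  # y1: about -2.67
d2 = d_hat(2.97, 0.73, 0.33, 1.12, 0.67, 1.41)   # y2: about 1.70
```

The signs simply reflect which class mean is subtracted from which; it is the magnitudes |d̂1| and |d̂2| that are compared against the 2.0 and 0.85 benchmarks.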
Table 25.10. Hypothetical Example: Overall Sample Means and Standard Deviations (SD); Model-Estimated, Class-Specific Means, Standard Deviations, and Correlations with Corresponding Bias-Corrected Bootstrap 95% Confidence Intervals Based on a Two-Class Latent Profile Analysis with Class-Varying, Unrestricted Σk

Overall sample
                                                  Correlations
  Variable   Mean     SD                     (1)            (2)
  y1         0.06     2.71                   1.00
  y2         1.47     1.70                   -.21           1.00

Class 1 (33%)
  Variable   Mean (α̂mk)       SD             (1)            (2)
  y1         -2.93             1.00           1.00
             (-3.19, -2.58)    (0.80, 1.26)
  y2         2.97              1.12           0.04           1.00
             (2.65, 3.35)      (0.95, 1.38)   (-0.25, 0.26)

Class 2 (67%)
  y1         1.55              1.93           1.00
             (1.18, 1.93)      (1.77, 2.11)
  y2         0.73              1.41           0.68           1.00
             (0.45, 0.99)      (1.26, 1.59)   (0.54, 0.76)

Note that the 95% confidence intervals provided in Table 25.10 are estimated using a bias-corrected bootstrap technique because the sampling distributions for standard deviations are not symmetric and estimated correlations are nonlinear functions of three different maximum likelihood parameter estimates. The variability in Class 2 for y1 is notably larger than in Class 1, whereas the classes are not well separated with respect to the standard deviations for y2. The correlation between y1 and y2 is very different for Classes 1 and 2: there is virtually no correlation at all in Class 1, but there is a large and significant correlation within Class 2. Thus, there is a high degree of separation between Classes 1 and 2 with respect to the relationship between y1 and y2.

The class homogeneity and separation information contained in Table 25.10 is not always, but can be, depicted graphically in a series of bivariate scatter plots, particularly when the total number of latent class indicator variables is small. In this example, with only two items, a single bivariate plot is all that is needed. The estimated class-specific means are plotted and specially identified with data point markers different from the observed data points. All the observed data points are included in the plot and are marked according to their modal class assignment. A trend line is drawn through each class centroid derived from the model-estimated class-specific correlations between the two items. Ellipses are drawn, one centered around each class centroid, with the axis lengths of the ellipse corresponding to three standard deviations on the corresponding indicator variable. All of these plot features are displayed in Figure 25.8.c and help to provide a visual impression of all the aspects of the class-specific distributions that distinguish the classes (along with those aspects that don't) and the overall degree of class separation.

You can see visually in Figure 25.8.c what I have already remarked on using the information in Table 25.10: Class 1 (individual cases in the sample with modal class assignment to Class 1 have data points marked by circles) is more homogenous with respect to both y1 and y2 (particularly y1) than Class 2 (individual cases with modal class assignment to Class 2 have data points marked by x); there is a high degree of separation between Classes 1 and 2 with respect to values on y1 and only moderate separation with respect to values on y2; there is a strong positive association between y1 and y2 in Class 2 that is not present in Class 1. You could interpret Class 1 as a homogenous group of individuals with a low average level on y1 (α̂11 = −2.93), relative to the overall sample mean, and a high average level of y2 (α̂21 = 2.97). You could characterize Class 2 as a less homogeneous (relative to Class 1) group of individuals with a high average level on y1 (α̂12 = 1.55), low average level of y2 (α̂22 = 0.73), and a strong positive association between individual levels on y1 and y2 (r̂2 = .68).

Based on the estimated class proportions, assuming a random and representative sample from the overall population, you might also apply a modifier label of "normal" or "typical" to Class 2 because its members make up an estimated 67% of the population.

Model Estimation
As with LCA, the most common approach for latent profile model estimation is FIML estimation using the EM algorithm under the MAR assumption. And, as with latent class models, the log likelihood surfaces for finite mixture models can be challenging for the estimation algorithms to navigate. Additionally, although the log likelihood function of an identified latent profile model with class-invariant Σk usually has a global maximum in the interior of the parameter space, the log likelihood functions for LPA models with class-varying Σk are unbounded (like Fig. 25.4.e), which means that the maximum likelihood estimate (MLE) as a global maximizer does not exist. But you may still proceed, as the MLE may still exist as a local maximizer possessing the necessary properties of consistency, efficiency, and asymptotic normality (McLachlan & Peel, 2000). When estimating latent profile models, I recommend following the same strategy of using multiple random sets of starting values and keeping track of all the convergence, maximum likelihood replication, condition number, and class size information as with LCA model estimation, to single out models that are not well identified.

Model Building
Principled model building for LPA proceeds in the same manner described in the section on LCA, beginning with the establishment of the (unconditional) measurement model for the latent class variable, with the chief focus during that stage of model building on latent class enumeration.

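The multiple-random-starts strategy recommended above can be sketched for the simplest possible case: a univariate two-class normal mixture fit by a toy EM routine. This is my own illustration, not the chapter's software, and it bounds the within-class variances away from zero as a crude guard against the unbounded-likelihood problem just described; the bookkeeping to notice is running many random starts, keeping the best log likelihood, and checking whether it is replicated.

```python
import math
import random

def norm_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def em_mixture(y, k=2, iters=100, seed=0):
    """Toy EM for a univariate k-class normal mixture; returns the final
    log likelihood and the parameters (proportions, means, variances)."""
    rng = random.Random(seed)
    pi = [1.0 / k] * k
    mu = [rng.choice(y) for _ in range(k)]  # random starting values
    ybar = sum(y) / len(y)
    var = [sum((v - ybar) ** 2 for v in y) / len(y)] * k
    ll = float("-inf")
    for _ in range(iters):
        # E-step: posterior class probabilities for each observation
        post, ll = [], 0.0
        for x in y:
            dens = [pi[j] * norm_pdf(x, mu[j], var[j]) for j in range(k)]
            total = sum(dens) or 1e-300
            ll += math.log(total)
            post.append([d / total for d in dens])
        # M-step: update proportions, means, and (floored) variances
        for j in range(k):
            nj = sum(p[j] for p in post)
            pi[j] = nj / len(y)
            mu[j] = sum(p[j] * x for p, x in zip(post, y)) / nj
            var[j] = max(sum(p[j] * (x - mu[j]) ** 2
                             for p, x in zip(post, y)) / nj, 0.05)
    return ll, (pi, mu, var)

# Simulated two-class data, then many random starts
rng = random.Random(42)
data = ([rng.gauss(0.0, 1.0) for _ in range(150)]
        + [rng.gauss(3.0, 1.0) for _ in range(150)])
lls = sorted(em_mixture(data, seed=s)[0] for s in range(10))
best_ll = lls[-1]
replicated = (best_ll - lls[-2]) < 0.01  # is the best log likelihood replicated?
```

In practice the same logging (convergence, replication of the best log likelihood, and class sizes) is done by the mixture software itself; a best value reached from only one start is a warning sign that the model is not well identified.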
590 LATENT CLASS ANALYSIS AND FINITE MIXTURE MODELING following subsections highlight any differences in all higher-order momcnrs have values of zero. This the evaluations of absolute fir, relative fit, classifica­ corresponds to fitting a one-class LPA with an tion accuracy; and the class enumeration process for unrestricted :E specification. In the model-building LPA compared ro what has already been advanced process, you would want to arrive at a measurement in this chapter for LCA. model that lit the individual data better (as ascer­ Absolute Fit. Ar presenr, there arc nor widely tained by various relative fit indices) than a model accepted or implememcd measures ofabsolute fit for only infiJrmed by the sam pie means and covariances. latent profile models. Although it would be theoreti­ Relative Fit. All of rhc measures of relative fir cally possible to modifY exact tests of fir or closeness­ presented and demonstrated for latenr class models of-fit indices available for factor analysis, most of are calculated and applied in d1e same way for latenr these indices are limited to assessing the model­ profile models. data consistency with respect to only the mean and Classification Diagnostics. It is possible to obtain variance-covariance structure, which would not be estimated posterior class probabilities for all individ­ appropriate for evaluation ofoverall fir for finite mix­ uals in the sample using the maximum likelihood rurc models. With finite mixture modeling, you are parameters estimates from the LPA and the individ­ using an approach that requires individual level data uals' observed values on d1e continuous indicator because the formation of the 1atcnr class variable variables. Thus, all of the classification diagnostics depends on all the high-order moments in rhe data previously described and illustrated for latent class (e.g., the skewness and kurtosis)-not just the first­ models are calculable and may be used in the san1e and second-order moments. 
You would choose finite manner for evaluating larenr class separation and mixture modeling over a robust method for estimat­ latent class assignment accuracy for latent profile ing just the mean and variance-covariance structure models. (robust to non-normality in the overall population), Class Enumeration. The class enumeration pro­ even for indirect applications, if you believed that cess for LPA is similar to the one for LCA but those higher order moments in the observed data with the added complication that because the spec­ provide subsrantively important information about ification of :Ek can influence the formation of the the overall population heterogeneity wid1 respect latent classes, you should consider a full range of :Ek to the item set. Because the separate individual specifications. I recommend the following approach: observations are necessary for the model estimation, any overall goodness-of-fir index for LPA models Stage I: Conduct a separate class enumeration sequence would need to compare each observed and model­ following Steps I through 7 as outlined in the LCA estimated individual value across all the indicaror section of this chapter for each type of Lk variables, similarly to techniques used in linear specification: class-invarianr, diagonal 2. k; regression diagnostics. class-varying, diagonal Lk; class-invariant, Although you are without measures of absolute unrestricted Lk; and class-varying, unrestricted "i:.k· model fir, you are nor without some absolute fit Note rhat the one-class models for the class-invariant, diagnostic tools. It is possible to compute the over­ diagonal "i:.k and class-varying, diagonal Lk all model-estimated means, variances, covariances, specifications will be the same, as will the one-class univariate skewness, and univariate kurtosis of the models for class-invariant, unrestricted Lk and latent class indicaror variables and compare them to class-varying, unrestricted Lk specifications. 
The the sample values, providing residuals for the first­ "benchmark" model mentioned in rite subsection on and second-order multivariate moments and the absolure fit is the initial one-class model for univariate third- and fourth-order moments for the class-invarianr, unrestricted "i:.k specification. observed items. These limited residuals allow at least Stage II: Take the four candidate modeh yielded by (I) some determination to be made about how well the and recalculate the approximate correct model model is fitting the observed data beyond the first­ probabilities using just those four models as the full and second-order moments and also allow some set under consideration. Repeat Steps 5 rluough 7 comparisons of relative overall fit across models. with the four candidate models to arrive at your final In addition to d1ese residuals, you can provide model selection. yourself with an absolute fir bencllmark by estimat­ ing a fully-saturated mean and variance-covariance The only two modifications of class enumeration model that is an exact fir to the data with respect Steps 5 through 7 necessary for applying Stages I and to the first- and second-order moments but assumes II in LPA are in Steps 5a and 6. In regards to Step 5a:

MASYN 591

Rather than relying on the exact test of fit for absolute fit, the "best" model should be the model with the fewest number of classes that has a better relative fit (in terms of the log likelihood value) than the "benchmark" model. Regarding Step 6: Rather than examining the standardized residuals and the classification diagnostics, you should examine the residuals for the means, variances, covariances, univariate skewness, and univariate kurtosis of the indicator variables along with the classification diagnostics. Cross-validation of the final measurement model can be done in the same fashion as described for latent class analysis.

I should note that in Step 7, for both Stages I and II, there may be occasions in the LPA setting for which the model favored by the parsimony principle is not the same as the model favored by the interests of conceptual simplicity and clarity. Take the hypothetical example in Figure 25.8. Let's suppose that the two models depicted in Figures 25.8.b and 25.8.c are comparable on all relative fit measures as well as residuals and classification diagnostics. One might perceive the two-class model as more parsimonious than the three-class model (although the two-class model has one more freely estimated parameter than the three-class model), but to interpret and assign substantive labels to the latent classes, you have to account for not only the different means (locations) of the latent classes but also the differences between the classes with respect to the within-class variability and the within-class correlations, which could get decidedly unsimple and unclear in its presentation. However, for the three-class model, you only need to consider the different class-specific means (and the corresponding class separation) to interpret and assign substantive labels to the latent classes because the model imposes constraints such that the classes are identical with respect to within-class variability, and the class indicators are assumed to be unrelated within class for all the classes. There is not an obvious model choice in this scenario. In such a situation, and in cases where the models are agonizingly similar with respect to their fit indices, it is essential to apply substantive and theoretical reflections in the further scrutiny of the models' usefulness, especially keeping in mind the intended conditional models to be specified once the measurement model is established.

In the next subsection, I fully illustrate the unconditional LPA modeling process with a real data example, with special attention to elements of the process that are distinct for LPA in comparison to what was previously demonstrated for LCA.

Diabetes Example for Latent Profile Analysis

The data used for the LPA example come from a study of the etiology of diabetes conducted by Reaven and Miller (1979). The data were first made publicly available by Herzberg and Andrews (1985) and have become a "classic" example for illustrating multivariate clustering-type techniques (see, for example, Fraley & Raftery, 1998, and Vermunt & Magidson, 2002). The original study of 145 non-obese subjects measured participants' ages and relative weights and collected experimental data on a set of four metabolic variables commonly used for diabetes diagnosis: fasting plasma glucose; area under the plasma glucose curve for the 3-hour oral glucose tolerance test (a measure of glucose intolerance); area under the plasma insulin curve for the oral glucose tolerance test (a measure of insulin response to oral glucose); and the steady state plasma glucose response (a measure of insulin resistance) (Reaven & Miller, 1979). The correlation between the fasting plasma glucose and area under the plasma glucose curve was 0.97, and so the original authors excluded the fasting plasma glucose measure in their analyses of the data. For this illustration, Table 25.11 lists the same three remaining metabolic measures utilized, by name and label, along with descriptive statistics for the study sample. Also included in the example data are the conventional clinical classifications of the subjects into one of the three diagnostic groups (non-diabetics, chemical diabetics, and overt diabetics) made by Reaven and Miller (1979) applying standard clinical criteria that each take into account only one aspect of a participant's carbohydrate metabolism. In their 1979 paper, Reaven and Miller were interested in using their data to explore the viability of a multivariate analytic technique that could classify subjects on the basis of multiple metabolic characteristics, independent of prior clinical assessments, as an alternative to the rigid clinical classification with arbitrary cut-off value criteria (e.g., individuals with fasting plasma glucose levels in excess of 110 mg/mL are classified as overt diabetics). In this example, the original research aim is furthered by investigating the classification of subjects using LPA and comparing the results to the conventional clinical classifications.

In conducting the class enumeration process, knowledge of the existing clinical classification scheme is ignored so that it does not influence decisions with respect to either the number of classes or their interpretation. I begin Stage I of the class enumeration by fitting six models with K = 1

Table 25.11. Diabetes Example: Descriptive Statistics for Indicator Measures (n = 145)

Measure (units)                             Variable name   Mean    SD      Skewness   Kurtosis   [Min, Max]         (1)       (2)
(1) Glucose area (mg/10mL/hr)               Glucose         54.36   31.70   1.78       2.16       [26.90, 156.80]    1.00
(2) Insulin area (μU/0.10mL/hr)             Insulin         18.61   12.09   1.80       4.45       [1.00, 74.80]      -0.34**   1.00
(3) Steady state plasma glucose (mg/10mL)   SSPG            18.42   10.60   0.69       -0.23      [2.90, 48.00]      0.77**    0.01

**p < 0.01
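Descriptive statistics of the kind reported in Table 25.11 (means, SDs, the third- and fourth-order univariate moments, and pairwise correlations) are straightforward to compute directly. The sketch below uses NumPy and SciPy on a small illustrative stand-in for the three indicators; the values are invented for the example and are not the Reaven and Miller data.

```python
import numpy as np
from scipy import stats

# Illustrative stand-in data for the three indicators (not the actual
# Reaven & Miller values): columns = Glucose, Insulin, SSPG.
X = np.array([
    [35.0, 16.0, 10.0],
    [40.0, 20.0, 12.0],
    [48.0, 34.0, 24.0],
    [55.0, 30.0, 20.0],
    [94.0, 10.0, 28.0],
    [120.0, 8.0, 35.0],
])

def describe(x):
    """Mean, SD, skewness, and excess kurtosis for one indicator."""
    return {
        "mean": x.mean(),
        "sd": x.std(ddof=1),
        "skew": stats.skew(x),          # third-order moment
        "kurtosis": stats.kurtosis(x),  # fourth-order (excess) moment
        "min": x.min(),
        "max": x.max(),
    }

summary = {name: describe(X[:, j])
           for j, name in enumerate(["Glucose", "Insulin", "SSPG"])}
corr = np.corrcoef(X, rowvar=False)  # pairwise correlations, as in Table 25.11
```

Note that `scipy.stats.kurtosis` returns excess kurtosis (zero for a normal distribution) by default, which matches the convention implied by the negative kurtosis entry for SSPG in the table.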

to K = 6 classes for each of four within-class variance-covariance specifications: class-invariant, diagonal Σk; class-varying, diagonal Σk; class-invariant, unrestricted Σk; and class-varying, unrestricted Σk. After K = 5, the models for the diagonal Σk specifications ceased to be well identified, as was the case after K = 4 for the class-invariant, unrestricted Σk specification and after K = 3 for the class-varying, unrestricted Σk. Table 25.12 summarizes the results from the set of class enumerations for each of the Σk specifications. Only the results from the well-identified models are presented. Recall that the one-class models for the class-invariant, diagonal Σk and class-varying, diagonal Σk specifications are the same, as are the K = 1 models for the class-invariant, unrestricted Σk and class-varying, unrestricted Σk specifications. Recall also that the one-class model for the unrestricted Σk specification is the minimum-goodness-of-fit benchmark model, and results from this model are enclosed by a bold dashed box for visual recognition. Bolded values in Columns 5 through 10 indicate the value corresponding to the "best" model within each set of enumerations according to each fit index. As was the case for the LCA example, all K-class models were rejected in favor of a (K + 1)-class model by the BLRT for all values of K considered, so there were no "best" or even candidate models to be selected based on the BLRT, and those results are not presented in the summary table.

Figure 25.10 displays four panels with plots of the (a) LL, (b) BIC, (c) CAIC, and (d) AWE model values, all plotted on the y-axis versus the number of classes. Each panel has four plot lines, one for each of the Σk specifications. The double horizontal line corresponds to the index value of the minimum-goodness-of-fit benchmark of the one-class, unrestricted Σk specification. These plots clearly show that all of the models with K >= 2 are improvements over the benchmark model. These plots also illustrate the concept of the "elbow" criteria mentioned in the initial description of the class enumeration process in the LCA section. Observe the BIC plot for the class-varying, diagonal Σk specification. Although the smallest BIC value out of the K = 1 to K = 5 class models corresponds to the four-class model, the BIC values for the three-, four-, and five-class models are nearly the same compared to the values for the one- and two-class models. There is evidence of an "elbow" in the BIC plot at K = 3. The bolded values in Column 2 of Table 25.12 indicate the pair of candidate models selected within each of the class enumerations for further scrutiny (following class enumeration Steps 5 and 6 in Stage I), and the boxed values indicate the "best" model selected within each of the class enumeration sets (Step 7 of Stage I). The selection of the four "best" models concluded Stage I of the LPA class enumeration process.

For Stage II, I compared the four candidate models, one from each of the Σk specifications. Column 11 in Table 25.12 displays the results of recalculating the correct model probabilities using only those four models. This index strongly favors the three-class model with class-varying, unrestricted Σk, enclosed by a solid box in Table 25.12. The single horizontal line in all the panel plots of Figure 25.10 corresponds to the best index values across all the models considered. It is clear from Figure 25.10 that the models with class-varying Σk specifications (either diagonal or unrestricted) offer consistently better fit over the models with class-invariant specifications, although the five-class model with class-invariant, diagonal Σk approaches the fit of the three- and four-class

Table 25.12. Diabetes Example: Model Fit Indices for Exploratory Latent Profile Analysis Using Four Different Within-Class Variance-Covariance Structure Specifications (n = 145)

Columns: (1) Σk specification; (2) # of classes (K); (3) LL; (4) npar*; (5) BIC; (6) CAIC; (7) AWE; (8) adj. LMR-LRT p-value (H0: K classes; H1: K+1 classes); (9) BF(K, K+1); (10) cmP_K; (11) cmP recalculated in Stage II over the four candidate models.

Class-invariant, diagonal Σk (Σk = Σ):
  K=1    -1820.68    6     3671.22    3677.22    3719.08    <0.01    <0.10    <0.01
  K=2    -1702.55    10    3454.88    3464.88    3534.64    <0.01    <0.10    <0.01
  K=3    -1653.24    14    3376.15    3390.15    3487.82    <0.01    <0.10    <0.01
  K=4    -1606.30    18    3302.18    3320.18    3445.76    0.29     <0.10    <0.01
  K=5†   -1578.21    22    3265.90    3287.90    3441.39    --       --       >0.99    <0.01

Class-varying, diagonal Σk:
  K=1    -1820.68    6     3671.22    3677.22    3719.08    <0.01    <0.10    <0.01
  K=2    -1641.95    13    3348.60    3361.60    3452.30    <0.01    <0.10    <0.01
  K=3    -1562.48    20    3224.49    3244.49    3384.03    <0.01    0.38     0.25
  K=4†   -1544.10    27    3222.57    3249.57    3437.95    0.15     7.76     0.66     0.08
  K=5    -1528.73    34    3226.67    3260.67    3497.88    --       --       0.09

Class-invariant, unrestricted Σk (Σk = Σ):
  K=1‡   -1730.40    9     3505.60    3514.60    3577.39    <0.01    <0.10    <0.01
  K=2    -1666.63    13    3397.95    3410.95    3501.65    <0.01    <0.10    <0.01
  K=3    -1628.86    17    3342.33    3359.33    3477.93    0.19     <0.10    <0.01
  K=4†   -1591.84    21    3288.19    3309.19    3455.70    --       --       >0.99    <0.01

Class-varying, unrestricted Σk:
  K=1    -1730.40    9     3505.60    3514.60    3577.39    <0.01    <0.10    <0.01
  K=2    -1590.57    19    3275.69    3294.69    3427.25    <0.01    <0.10    <0.01
  K=3†§  -1536.64    29    3217.61    3246.61    3448.93    --       --       >0.99    0.92

*Number of parameters estimated.
†"Best" model selected within its enumeration set (Step 7 of Stage I) and carried into Stage II.
‡Minimum-goodness-of-fit benchmark model.
§Final model favored in Stage II.

Figure 25.10 Diabetes example: Plots of model (a) LL, (b) BIC, (c) CAIC, and (d) AWE values versus the latent class enumeration (K = 1, 2, 3, 4, 5) across four different within-class variance-covariance structure specifications.
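The Stage I enumeration loop (fitting K = 1, ..., Kmax under several within-class variance-covariance specifications and collecting an information criterion for each fit) can be sketched with scikit-learn's GaussianMixture. The mapping to the chapter's four Σk specifications is only partial, which is an assumption to flag: sklearn's covariance_type='diag' corresponds to a class-varying diagonal Σk, 'tied' to a class-invariant unrestricted Σk, and 'full' to a class-varying unrestricted Σk; a class-invariant diagonal Σk has no direct sklearn option, and CAIC and AWE would have to be computed by hand from the log likelihood and parameter count. The data here are synthetic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic three-indicator data with two well-separated clusters
X = np.vstack([
    rng.normal([35.0, 16.0, 10.0], [4.0, 5.0, 4.0], size=(80, 3)),
    rng.normal([95.0, 10.0, 28.0], [30.0, 6.0, 10.0], size=(40, 3)),
])

results = {}
for cov in ("diag", "tied", "full"):   # three of the four Sigma_k patterns
    for k in range(1, 5):              # K = 1..4
        gm = GaussianMixture(n_components=k, covariance_type=cov,
                             n_init=5, random_state=0).fit(X)
        # BIC = -2*LL + npar*log(n); sklearn counts npar per specification
        results[(cov, k)] = gm.bic(X)

# Minimum-BIC scan within each specification (an "elbow" plot of these
# values would mirror panel (b) of Figure 25.10)
best = {cov: min(range(1, 5), key=lambda k: results[(cov, k)])
        for cov in ("diag", "tied", "full")}
```

Multiple random starts per fit (n_init) matter here for the same reason they matter in the chapter's software: a poorly started EM run can report a spuriously bad log likelihood for a given K.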

models with class-varying Σk. The three-class model with class-varying, unrestricted Σk and the four-class model with class-varying, diagonal Σk were the two candidate models selected for further inspection. Following Step 6 in Stage II, I closely examined the residuals and classification diagnostics of the final two candidate models. Table 25.13 displays the observed, model-estimated, and residual values for the means, variances, covariances, and univariate skewness and kurtosis values of the data for the three-class model with class-varying, unrestricted Σk, showing a satisfactory fit across all these moments. The four-class model with class-varying, diagonal Σk had satisfactory fit in this regard as well, although the fit to the variance-covariance structure of the data was not quite as close. Table 25.14 summarizes the classification diagnostic measures for the three-class model with class-varying, unrestricted Σk. All the measures indicate that the three classes are very well separated and there is high accuracy in the latent class assignment. The four-class model with class-varying, diagonal Σk had comparably good values on the classification diagnostics. Considering all the information from Stage II, Steps 5 and 6, the three-class model with class-varying, unrestricted Σk was selected as the "final" unconditional latent profile model. I should remark here that this model was not in any way conspicuously better fitting than the other candidate model, and another researcher examining the same results could ultimately select the other model by giving slightly less weight to model parsimony and less consideration to the fact that a match between the final class enumeration and the number of diagnostic groups greatly simplifies the planned comparison between subjects' latent class assignments and their conventional clinical classifications.

For the interpretation of the resultant three classes from the final model, it is necessary to examine the model-estimated, class-specific item means, standard deviations, and correlations, provided in Table 25.15 and depicted graphically by the three scatter plots in Figure 25.11. Inspecting the class-specific standard deviation estimates, Class 1 has a high level of homogeneity with respect to all three indicator variables, with notably less variability than in the overall sample and less than either of the other two classes. Class 3 is the least homogeneous with respect to glucose and SSPG, with variability in both actually greater than the overall sample. Class

Table 25.13. Diabetes Example: Observed, Model-Estimated, and Residual Values for Means, Variances, Covariances and Correlations, Univariate Skewness, and Univariate Kurtosis Based on the Three-Class Latent Profile Analysis with Class-Varying, Unrestricted Σk (n = 145)

Variable                 Observed    Model-estimated    Residual
Mean(Glucose)            54.36       54.36              0.00
Mean(Insulin)            18.61       18.61              0.00
Mean(SSPG)               18.42       18.42              0.00
Var(Glucose)             1004.58     997.65             6.93
Var(Insulin)             146.25      145.24             1.01
Var(SSPG)                112.42      111.65             0.78
Cov(Glucose,Insulin)     -129.18     -128.29            -0.89
  (Correlation)          (-0.34)     (-0.34)            (0.00)
Cov(Glucose,SSPG)        259.09      257.30             1.79
  (Correlation)          (0.77)      (0.77)             (0.00)
Cov(Insulin,SSPG)        1.02        1.01               0.01
  (Correlation)          (0.01)      (0.01)             (0.00)
Skewness(Glucose)        1.78        1.75               0.03
Skewness(Insulin)        1.80        1.49               0.31
Skewness(SSPG)           0.69        0.72               -0.03
Kurtosis(Glucose)        2.16        2.49               -0.32
Kurtosis(Insulin)        4.45        2.96               1.49
Kurtosis(SSPG)           -0.23       0.19               -0.42
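Residual checks like those in Table 25.13 compare observed moments with model-estimated ones. One simple way to approximate the model-implied skewness and kurtosis of a fitted mixture is Monte Carlo: draw a large sample from the estimated model and compute the moments of the draws. This is a sketch of the idea on synthetic data, not the chapter's exact computation (which could also be done analytically from the mixture parameters).

```python
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(7)
# Two-component synthetic data: a symmetric cluster plus an offset cluster
# induces skewness/kurtosis beyond the first two moments.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(300, 1)),
    rng.normal(6.0, 2.0, size=(100, 1)),
])

gm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Model-implied moments approximated by simulating from the fitted mixture
sim, _ = gm.sample(200_000)

residuals = {
    "mean": X.mean() - sim.mean(),
    "variance": X.var(ddof=1) - sim.var(ddof=1),
    "skewness": stats.skew(X[:, 0]) - stats.skew(sim[:, 0]),
    "kurtosis": stats.kurtosis(X[:, 0]) - stats.kurtosis(sim[:, 0]),
}
```

Small third- and fourth-order residuals, as here, are the LPA analogue of the "satisfactory fit across all these moments" judgment made for Table 25.13; large ones would signal misfit beyond the mean and covariance structure.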

Table 25.14. Diabetes Example: Model Classification Diagnostics for the Three-Class Latent Profile Analysis With Class-Varying, Unrestricted Σk (E3 = .88; n = 145)

Class k   π̂k      95% C.I.*        mcaPk   AvePPk   OCCk
Class 1   0.512   (0.400, 0.620)   0.524   0.958    21.74
Class 2   0.211   (0.119, 0.307)   0.221   0.918    41.86
Class 3   0.277   (0.191, 0.386)   0.255   0.973    94.06

*Bias-corrected bootstrap 95% confidence intervals
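The quantities in Table 25.14 can be computed from the estimated posterior class probabilities, consistent with the definitions used earlier in the chapter: mcaPk is the modal class assignment proportion, AvePPk is the average posterior class probability among individuals modally assigned to class k, and OCCk is the odds-of-correct-classification ratio comparing the AvePP-based odds to the odds implied by the estimated class proportion π̂k. A sketch on synthetic data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Two well-separated synthetic clusters (sizes 150 and 50)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(150, 2)),
    rng.normal(3.0, 1.0, size=(50, 2)),
])

gm = GaussianMixture(n_components=2, n_init=5, random_state=0).fit(X)
post = gm.predict_proba(X)      # n x K estimated posterior class probabilities
modal = post.argmax(axis=1)     # modal class assignment
pi_hat = gm.weights_            # estimated class proportions (pi-hat_k)

diagnostics = {}
for k in range(post.shape[1]):
    mcaP = (modal == k).mean()          # modal class assignment proportion
    avePP = post[modal == k, k].mean()  # average posterior probability in class k
    # Odds-of-correct-classification ratio: assignment odds vs. chance odds
    occ = (avePP / (1.0 - avePP)) / (pi_hat[k] / (1.0 - pi_hat[k]))
    diagnostics[k] = {"mcaP": mcaP, "AvePP": avePP, "OCC": occ}
```

Values of OCCk well above 1, as in Table 25.14, indicate that the posterior classification does far better than assigning individuals to classes at chance rates.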

2 is the least homogeneous with respect to insulin, also having greater variability than the overall sample. The similarities and differences in the level of class homogeneity with respect to each of the three items can be judged visually in Figure 25.11 by the length and width of the overlaid ellipses in the three plots.

In judging class separation for the two classes that do not have a high degree of homogeneity for at least one of the indicator variables, the distances

Table 25.15. Diabetes Example: Model-Estimated, Class-Specific Means, Standard Deviations (SDs), and Correlations with Corresponding Bias-Corrected Bootstrap 95% Confidence Intervals Based on the Three-Class Latent Profile Analysis With Class-Varying, Unrestricted Σk (n = 145)

Class 1 (52%):
  (1) Glucose   Mean 35.69 (34.09, 37.18)    SD 4.39 (3.11, 5.50)
  (2) Insulin   Mean 16.58 (15.31, 17.96)    SD 5.17 (4.24, 6.11)     r(1,2) = 0.15 (-0.14, 0.42)
  (3) SSPG      Mean 10.50 (8.90, 12.43)     SD 4.33 (3.48, 5.97)     r(1,3) = 0.29 (-0.05, 0.57)    r(2,3) = 0.36** (0.08, 0.57)

Class 2 (22%):
  (1) Glucose   Mean 47.66 (43.93, 52.72)    SD 7.29 (4.95, 10.73)
  (2) Insulin   Mean 34.35 (27.38, 44.06)    SD 15.12 (11.43, 19.40)  r(1,2) = 0.36 (-0.33, 0.77)
  (3) SSPG      Mean 24.41 (22.52, 25.99)    SD 3.71 (2.15, 5.49)     r(1,3) = 0.03 (-0.40, 0.50)    r(2,3) = -0.10 (-0.73, 0.54)

Class 3 (26%):
  (1) Glucose   Mean 93.92 (78.13, 112.48)   SD 35.76 (30.30, 41.51)
  (2) Insulin   Mean 10.38 (7.97, 13.31)     SD 6.03 (4.74, 8.34)     r(1,2) = -0.76** (-0.87, -0.58)
  (3) SSPG      Mean 28.48 (24.42, 33.93)    SD 10.65 (8.22, 12.80)   r(1,3) = 0.73** (0.41, 0.85)   r(2,3) = -0.40** (-0.61, -0.08)

**p < 0.01

between the class means for those variables must be large for the overlap between the classes to still be small. Table 25.16 presents the distance estimates (i.e., standardized differences in means) for each pairwise class comparison on each of the three indicator variables. Large estimated absolute distance values greater than 2.0, corresponding to less than 20% overlap, are bolded for visual clarity. All classes are well separated with moderate to large estimated distances on at least two of the three items, and every item distinguishes between at least two of the three classes. The classes are all well separated with respect to their means on glucose, with the greatest distances between Class 1 and the other two classes. There is a similar pattern for the separation on SSPG, with large distances between Class 1 and Classes 2 and 3. However, in the case of SSPG, there is a very small separation between Classes 2 and 3, meaning there is a high degree of overlap in the distributions of individual SSPG values across those two classes, rendering those two classes difficult to distinguish with respect to SSPG. In contrast, Classes 2 and 3 have a large distance between their means for insulin, whereas there is only a modest separation between Classes 1 and 3. Figure 25.11 provides a visual impression of these varying degrees of separation across the classes with respect to each of the three measures.

Because the final model selected had a class-varying, unrestricted Σk, the distinctness of the classes must also be evaluated with regard to the class-specific variance-covariance structure before a full substantive interpretation of the classes is rendered. I have already remarked, when assessing class homogeneity, that Class 3 was much more variable than the other two classes with respect to glucose and SSPG and that Class 2 was much more variable with respect to insulin.
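The distance estimates in Table 25.16 can be reproduced, to rounding error, from the class proportions in Table 25.14 and the class-specific means and SDs in Table 25.15, by standardizing each mean difference by a pooled within-class standard deviation. The proportion-weighted pooling used below is a reconstruction that matches the printed values, not a formula quoted verbatim from the chapter.

```python
import math

# Class-specific estimates taken from Tables 25.14 and 25.15
pi = {1: 0.512, 2: 0.211, 3: 0.277}                 # estimated class proportions
mean = {  # indicator -> class -> model-estimated mean
    "Glucose": {1: 35.69, 2: 47.66, 3: 93.92},
    "Insulin": {1: 16.58, 2: 34.35, 3: 10.38},
    "SSPG":    {1: 10.50, 2: 24.41, 3: 28.48},
}
sd = {
    "Glucose": {1: 4.39, 2: 7.29, 3: 35.76},
    "Insulin": {1: 5.17, 2: 15.12, 3: 6.03},
    "SSPG":    {1: 4.33, 2: 3.71, 3: 10.65},
}

def distance(var, j, k):
    """Standardized mean difference between classes j and k, pooling the
    two class variances weighted by the estimated class proportions."""
    pooled_var = (pi[j] * sd[var][j] ** 2 + pi[k] * sd[var][k] ** 2) / (pi[j] + pi[k])
    return (mean[var][j] - mean[var][k]) / math.sqrt(pooled_var)

d_glucose_12 = distance("Glucose", 1, 2)   # cf. the -2.21 entry in Table 25.16
d_insulin_13 = distance("Insulin", 1, 3)   # cf. the 1.13 entry
```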
In terms of the covariance structure, presented as correlations in Table 25.15, Class 3 has a large and significant negative

correlation between glucose and insulin, whereas that correlation is positive and non-significant for both Classes 1 and 2. Class 3 has a large and significant positive correlation between glucose and SSPG, whereas that correlation, although positive, is non-significant for both Classes 1 and 2. Class 3 has a moderate and significant negative correlation between insulin and SSPG, whereas Class 1 has a moderate and significant positive correlation and Class 2's correlation is negative and non-significant. Because the correlation between insulin and SSPG is the only significant correlation for Class 1 and none of the correlations were significant for Class 2, Classes 1 and 2 are not well separated by their covariance structure. Class 3 is the class with two quite large and all significant correlations, and these features are an important part of what distinguishes Class 3, and Class 3 is well separated from both Class 1 and Class 2 with respect to all covariance elements. However, because Class 3 only represents 26% of the population, it is not surprising that the results of the three-class model with a class-varying, diagonal Σk were so close to the results of this model, which allows the within-class correlations.

For the substantive class interpretation, I begin with the class most distinct in means and variance-covariance structure from the other classes, Class 3, with an estimated proportion of 26%. Class 3 consists of individuals with high values on glucose, on average, compared to the overall sample and Classes 1 and 2. Within this class, there is a strong negative association between glucose and insulin and a strong positive association between glucose and SSPG such that the individuals in Class 3 with higher values on glucose have lower values on insulin and higher values on SSPG, on average. The high average values on glucose and SSPG, with the lower average value on insulin, along with the very strong associations across the three indicators, lead this class to be labeled the "overt" diabetic class.

Table 25.16. Diabetes Example: Estimated Standardized Differences in Class-Specific Indicator Means, d̂, Based on Model-Estimated, Class-Specific Indicator Means and Variances from the Three-Class Latent Profile Analysis With Class-Varying, Unrestricted Σk (n = 145)

Variable        Class 1 vs. Class 2    Class 1 vs. Class 3    Class 2 vs. Class 3
(1) Glucose     -2.21                  -2.78                  -1.73
(2) Insulin     -1.91                  1.13                   -2.15
(3) SSPG        -3.34                  -2.53                  -0.49

Figure 25.11 Diabetes example: Scatter plots of observed sample values marked by modal latent class assignment (Class 1: "Normal"; Class 2: "Chemical"; Class 3: "Overt") based on the unconditional three-class LPA for (a) insulin versus glucose, (b) SSPG versus insulin, and (c) SSPG versus glucose. For (a)-(c), diamonds represent the model-estimated class-specific bivariate mean values, trend lines depict the model-estimated within-class bivariate associations, and the ellipse heights and widths correspond to 3.0 model-estimated within-class standard deviations for the indicators on the y- and x-axes, respectively.

598 LATENT CLASS ANALYSIS AND FINITE MIXTURE MODELING

Class 2, with an estimated proportion of 22%, consists of individuals with insulin and SSPG values higher, on average, than the overall population means and notably higher than Class 1. Class 2 is not much different than Class 3 with respect to SSPG but has a much higher mean on insulin. Individuals in Class 2 have higher-than-average values for glucose, and Class 2 is nearly as different from Class 1 as Class 3 is in terms of average glucose values even though the mean in Class 2 is lower than Class 3. There are no significant associations between the three indicators in Class 2. The higher-than-average values on glucose, insulin, and SSPG, with a notably higher insulin mean but lower glucose mean than the "overt" diabetic class, suggest the label of "chemical" diabetic for this class.

Class 1, with an estimated proportion of 52%, consists of individuals with glucose and SSPG values lower, on average, than the overall population means and notably lower than either Class 2 or Class 3. The individuals in Class 1 have insulin values, on average, near the overall population mean, higher than the "chemical" diabetic class and lower than the "overt" diabetic class. For the Class 1 subpopulation, there is a moderate positive association between insulin and SSPG such that individuals with higher values on insulin in Class 1 have higher values, on average, for SSPG. This association is quite different from the moderate negative association in the "overt" diabetic group such that among those in Class 3, individuals with higher values on insulin have lower values on SSPG, on average. The lower glucose and SSPG average levels, the average insulin levels, the positive association between insulin and SSPG, and the estimated class proportion greater than 50% suggest the label of "normal" (non-diabetic) for this class.

With the results from the unconditional LPA in hand, I can compare individual model-estimated class membership for individuals in the sample to their clinical classifications. As it happens, a three-class model for the LPA was selected and the latent classes were interpreted in a way that matched, at the conceptual level, the three clinical classification categories. To make the descriptive, post hoc comparisons, I use the modal class assignment for each participant to compare to the clinical classification. Because the comparison is descriptive (rather than inferential) and there is a very high level of classification accuracy for all three classes (see Table 25.14), it is reasonable to use the modal class assignment to get a sense of the correspondence between "true" class membership and the clinical classifications.

Table 25.17 displays a cross-tabulation comparison between latent class (modal) membership and the clinical classifications. Cells corresponding to "matches" between the modal class assignments and the clinical classifications are boxed in bold. In general, there is a strong concordance across all three latent classes, with only 22 (15%) of the participants having a mismatch between modal latent classification and clinical status. The highest correspondence is between the "normal" latent class and the non-diabetic clinical classification, with 91% of those modally assigned to the "normal" class also having a non-diabetic clinical status. The lowest correspondence is between the "chemical" latent class and the chemical clinical classification but was still reasonably high, with 72% of those modally assigned to the "chemical" diabetes class also having a chemical diabetic clinical status. It is also informative to examine the nature of the noncorrespondence. Of those individuals modally assigned to the "normal" class, none had an overt diabetic clinical status. Similarly, of those individuals modally assigned to the "overt" class, none had a non-diabetic clinical status. In both cases, the mismatch involved individuals with a chemical diabetic clinical status. Of the individuals modally assigned to the "chemical" diabetes class that did not have a chemical diabetic clinical status, most had a non-diabetic clinical status, but two did have an overt diabetic clinical status.

Because it was originally of interest whether a multivariate model-based classification could offer improvements over the univariate cut-off criteria used in the clinical classifications, I closely examined the 22 cases for which there is a mismatch. Table 25.18 summarizes the average posterior class probabilities stratified by both modal class assignments and clinical classifications. What can be seen in this table is that the average posterior class probabilities for the modally assigned classes, bolded and boxed in Table 25.18, are all reasonably high. In other words, even those groups of individuals with a mismatch between the modal latent class membership and clinical status are relatively well classified, on average, by the model. If one examines the raw data for these individuals, it can be seen that these individuals were not well classified by the clinical criteria. For example, most of the patients with a chemical diabetic clinical status and a "normal" modal class assignment were borderline on clinical diagnosis criteria. Some of the patients with a non-diabetic clinical status who were hyperinsulinemic and insulin-resistant, but managed to maintain

Table 25.17. Diabetes Example: Modal Latent Class Assignment vs. Clinical Classification Frequencies and Row Percentages

                          Clinical classification
Modal class assignment    Non-diabetic   Chemical diabetic   Overt diabetic   Total
"Normal"                  69 (91%)       7 (9%)              0 (0%)           76 (100%)
"Chemical"                7 (22%)        23 (72%)            2 (6%)           32 (100%)
"Overt"                   0 (0%)         6 (16%)             31 (84%)         37 (100%)
Total                     76             36                  33               145
Table 25.18. Diabetes Example: Average Posterior Class Probabilities by Modal Latent Class Assignment and Clinical Classification

                                                        Average posterior class probability
Modal class assignment   Clinical classification   f    Class 1   Class 2   Class 3
"Normal"                 Non-diabetic              69   0.97      <0.01     0.02
"Normal"                 Chemical diabetic         7    0.79      0.05      0.15
"Normal"                 Overt diabetic            0    --        --        --
"Chemical"               Non-diabetic              7    0.06      0.85      0.09
"Chemical"               Chemical diabetic         23   0.02      0.93      0.04
"Chemical"               Overt diabetic            2    <0.01     >0.99     <0.01
"Overt"                  Non-diabetic              0    --        --        --
"Overt"                  Chemical diabetic         6    0.07      0.08      0.85
"Overt"                  Overt diabetic            31   <0.01     <0.01     >0.99
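Likewise, the stratified averages in Table 25.18 amount to a groupby over modal class and external status applied to the posterior probability columns; again with illustrative values rather than the study data.

```python
import pandas as pd

# Illustrative posterior class probabilities (rows sum to 1; invented values)
post = pd.DataFrame({
    "p1": [0.97, 0.80, 0.10, 0.05],
    "p2": [0.02, 0.15, 0.85, 0.90],
    "p3": [0.01, 0.05, 0.05, 0.05],
})
post["modal"] = post[["p1", "p2", "p3"]].values.argmax(axis=1) + 1
post["clinical"] = ["Non-diabetic", "Chemical diabetic",
                    "Non-diabetic", "Chemical diabetic"]

# Average posterior class probabilities by modal class and clinical status,
# mirroring the layout of Table 25.18
table = post.groupby(["modal", "clinical"])[["p1", "p2", "p3"]].mean()
```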

normal glucose tolerance, were modally assigned by the model to the "chemical" diabetes class. These differences suggest that using a model that takes into account multiple metabolic characteristics may offer improved and more medically comprehensive classification over the rigid and arbitrary univariate clinical cut-off criteria.

Latent Class Regression

The primary focus, thus far, has been on the process for establishing the measurement model for latent class variables with either categorical indicators (LCA) or continuous indicators (LPA). However, that process is usually just the first step in a structural equation mixture analysis in which the latent class variable (with its measurement model) is placed in a larger system of variables that may include hypothesized predictors and outcomes of latent class membership. To provide readers with a sense of how these structural relationships can be specified, I present in this section a latent class regression (LCR) model for incorporating predictors of latent class membership. This presentation is applicable for both LCA and LPA models.

Covariates of latent class membership may serve different purposes depending on the particular aims of the study analysis. If attention is on developing and validating the measurement model for a given construct using a latent class variable, covariates can be used to assess criterion-related validity of the latent class measurement model. It may be possible, based on the conceptual framework for the latent class variable, to generate hypotheses about how the latent classes should relate to a select set of covariates. These hypotheses can then be evaluated using a LCR model (Dayton & Macready, 2002); support for the hypotheses equates to increased validation of the latent class variable. You may also gain a richer characterization and interpretation of

600 LATENT CLASS ANALYSIS AND FINITE MIXTURE MODELING

the latent classes through their relationships with covariates. Beyond construct validation, covariates can be used to test hypotheses related to a theoretical variable system in which the latent class variable operates. In such a variable system, you may have one or more theory-driven covariates that are hypothesized to explain individual variability in an outcome where the individual variability is captured by the latent class variable. In the remainder of this section I describe the formulation of the LCR model and illustrate its use in the LSAY example.

Figure 25.12 Generic path diagram for a latent class regression model.

Model Formulation

For a LCR model, the measurement model parameterization, describing the relationships between the latent class variable and its indicators, remains the same as for the unconditional models but the structural model changes in that the latent class proportions are now conditional on one or more covariates. For example, in the LCA specification, the conditional version of Equation 4 is given by

\Pr(u_{1i}, u_{2i}, \ldots, u_{Mi} \mid x_i) = \sum_{k=1}^{K} \left[ \Pr(c_i = k \mid x_i) \cdot \Pr(u_{1i}, u_{2i}, \ldots, u_{Mi} \mid c_i = k) \right]. \qquad (40)

A multinomial regression is used to parameterize the relationship between the probability of latent class membership and a single covariate, x, such that

\Pr(c_i = k \mid x_i) = \frac{\exp(\gamma_{0k} + \gamma_{1k} x_i)}{\sum_{j=1}^{K} \exp(\gamma_{0j} + \gamma_{1j} x_i)}, \qquad (41)

where Class K is the reference class and γ0K = γ1K = 0 for identification. γ0k is the log odds of membership in Class k versus the reference class, Class K, when x = 0, and γ1k is the log odds ratio for membership in Class k (versus Class K) corresponding to a one-unit difference on x. Equations 40 and 41 are represented in path diagram format as shown in Figure 25.12. Equation 41 can easily be expanded to include multiple covariates. (For more general information about multinomial regression, see, for example, Agresti, 2002.) It is also possible to examine latent class differences with respect to a grouping or concomitant variable using a multiple-group approach similar to multiple-group factor analysis (Collins & Lanza, 2010; Dayton & Macready, 2002), but such models are beyond the scope of this chapter.

Model Building

As previously explained in the earlier model-building subsections, the first step in the model-building process, even if the ultimate aims of the analysis include testing hypotheses regarding the relationships between predicting covariates and latent class membership, is to establish the measurement model for the latent class variable. Based on simulation work (Nylund & Masyn, 2008) showing misspecification of covariate effects in a LCA can lead to bias in the class enumeration, it is strongly recommended that the building of the measurement model, particularly the class enumeration stage, is conducted with unconditional models, only adding covariates once the final measurement model has been selected. The selection and order of covariate inclusion should be theory-driven and follow the same process as with any regular regression model with respect to risk factors or predictors of interest, control of potential confounders, and so forth.

The specification provided in Equations 40 and 41 assumes that there is no direct effect of x on the latent class indicator variables (which would be represented in Figure 25.12 by a path from x to one or more of the us). However, omission of direct covariate effects (if actually present) can lead to biased results (similarly to the omission of direct effects in a latent factor model). If direct effects are incorrectly omitted, then the measurement parameters for the latent class variable can be distorted, shifting from their unconditional model estimates and potentially misrepresenting the nature of the latent classes; in
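Equation 41 is a softmax over class-specific linear predictors. The sketch below is illustrative Python (the γ values are made up for demonstration, not estimates from the chapter) showing how class-membership probabilities and the implied odds ratios are computed:

```python
import math

def class_probabilities(x, gamma0, gamma1):
    """Multinomial logistic model for latent class membership (Equation 41).

    gamma0, gamma1: intercepts and slopes for Classes 1..K, with the
    reference Class K constrained to gamma0[K-1] = gamma1[K-1] = 0.
    Returns [Pr(c = 1 | x), ..., Pr(c = K | x)].
    """
    logits = [g0 + g1 * x for g0, g1 in zip(gamma0, gamma1)]
    denom = sum(math.exp(l) for l in logits)
    return [math.exp(l) / denom for l in logits]

# Hypothetical 3-class example; Class 3 is the reference class.
gamma0 = [0.5, -0.2, 0.0]
gamma1 = [0.8, 0.3, 0.0]

p_x0 = class_probabilities(0.0, gamma0, gamma1)
p_x1 = class_probabilities(1.0, gamma0, gamma1)

# gamma1[k] is the log odds ratio for Class k versus the reference class
# per one-unit difference in x, so exp(gamma1[k]) is the odds ratio.
print(p_x0, p_x1, math.exp(gamma1[0]))
```

Note that the ratio of the Class-1-versus-Class-3 odds at x = 1 to the same odds at x = 0 reproduces exp(γ11) exactly, which is the interpretation of γ1k given above.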

addition, the estimated latent class proportions and the effects of the covariate on latent class membership can be biased. In fact, if no direct effects are included and the latent classes in the LCR model change substantively in size or meaning relative to the final unconditional latent class model, then this can signal a misspecification of the covariate associations with the latent class indicators, recommending a more explicit test of direct effects. The presence of direct effects is analogous to the presence of measurement non-invariance in a factor model or differential item functioning in an item response theory model: a direct effect on an indicator variable means that individuals belonging to the same latent class but with different values of x have different expected outcomes for that observed indicator. Although elaborating on the process of testing for direct effects and measurement non-invariance with respect to covariates being incorporated into a LCR model is beyond the scope of this chapter, I do recommend that, in the absence of prior knowledge or strong theoretical justification, direct effects should be tested as part of the conditional model-building process and the addition of latent class covariates. (For more on covariate direct effects, measurement non-invariance, and violations of the conditional independence assumption resulting from direct covariate effects in LCRs, see, for example, Bandeen-Roche, Miglioretti, Zeger, Rathouz, & Paul, 1997; Hagenaars, 1988; and Reboussin, Ip, & Wolfson, 2008.)

I should remark here that a LCR analysis following the building of a latent class measurement model using a full latent class enumeration process without any a priori restrictions on the number and nature of the latent classes is a blend of confirmatory (LCR) and exploratory (latent class enumeration) elements. Although the establishment of the measurement model proceeds in a more exploratory way, the model that you carry forward to inferential structural models is not constrained in the same way it would be when conducting an EFA and then subsequent CFA in the same sample, and thus you do not face the same dangers of inflating Type I error rates and capitalizing on chance. However, it is preferable, if possible, to validate the measurement model with new data so that you can feel more confident that the measurement model might generalize to other samples and that your latent classes are not being driven by sampling variability and are not overfit to the particular sample data at hand. Otherwise, it is important to acknowledge in the interpretation of the results the exploratory and confirmatory character of the analysis (Lubke, 2010).

Longitudinal Study of American Youth Example for Latent Class Regression

To illustrate LCR, I return to the LSAY example used in the Latent Class Analysis section. In addition to the nine math disposition items, the example dataset also included the variable of student sex (coded here as female = 1 for female students and female = 0 for male students). Beginning with the five-class unconditional model, I fit two models: Model 0, a five-class model with the latent class variable regressed on female but with all multinomial regression coefficients for female fixed at zero; Model 1, a five-class model with the latent class variable regressed on female with all multinomial regression coefficients for female freely estimated. I conducted parallel analyses for both Subsamples A and B and found similar results; only the results for Subsample A are presented here.

There is a significant overall association between student sex and math disposition class membership (Model 0 vs. Model 1: χ²_diff = 27.76, df = 4, p < .001). There was no significant shift in the measurement parameters between Model 0 and Model 1 that would have suggested the presence of one or more direct effects of female on the items themselves. This descriptive comparison of parameter estimates is not a concrete test of direct effects (that should normally be done), but because explicit testing for differential item functioning in latent class models is beyond the scope of this chapter, I will cautiously treat this model comparison as a satisfactory heuristic evaluation of measurement invariance that allows me to proceed with an interpretation of the LCR results without including direct effects.

Examining the results of the LCR presented in Table 25.19, the multinomial regression parameters represent the effects of student sex on class membership in each class relative to the reference class (selected here as Class 1: "Pro-math without anxiety"). Given membership in either Class 1 ("Pro-math without anxiety") or Class 2 ("Pro-math with anxiety"), females are significantly less likely to be in Class 2 than Class 1 (OR = 0.52), whereas females are significantly more likely to be in Class 4 ("I don't like math but I know it's good for me") than Class 1 (OR = 1.72). There is no significant difference in the likelihood of membership in Class 5 ("Anti-math with anxiety") among males
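The reported difference test can be checked directly. For even degrees of freedom the chi-square survival function has a closed form, so no statistics library is needed; only that closed form is added here, the test itself is as reported in the chapter:

```python
import math

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with even df = 2m, using the
    closed form exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!."""
    assert df % 2 == 0 and df > 0
    m = df // 2
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(m))

# Model 0 (coefficients fixed at zero) vs. Model 1 (freely estimated):
# the likelihood ratio difference test statistic reported in the text.
p = chi2_sf_even_df(27.76, df=4)
print(p)  # well below .001, consistent with the reported p < .001
```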

Table 25.19. LSAY Example: Five-Class Latent Class Regression Results for the Effects of Student Sex (female = 1 for female; female = 0 for male) on Latent Class Membership for Subsample A (nA = 1338)

C regressed on female                                      γ1k     s.e.    p-value   OR
Class 1 (ref) "Pro-math without anxiety"                   0.00    -       -         1.00
Class 2 "Pro-math with anxiety"                           -0.66    0.21    <0.01     0.52
Class 3 "Math lover"                                       0.17    0.21    0.43      1.18
Class 4 "I don't like math but I know it's good for me"    0.55    0.19    <0.01     1.72
Class 5 "Anti-math with anxiety"                          -0.32    0.22    0.14      0.73
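The OR column of Table 25.19 is the exponentiated coefficient, OR = exp(γ1k). It can be reproduced from the reported γ values; small mismatches (e.g., 1.73 vs. the tabled 1.72) reflect the table's rounding of the coefficients:

```python
import math

# Reported multinomial regression coefficients (gamma_1k) from Table 25.19;
# the odds ratio for each class versus the reference class is exp(gamma_1k).
coefs = {
    "Class 2": -0.66,
    "Class 3": 0.17,
    "Class 4": 0.55,
    "Class 5": -0.32,
}
for label, g in coefs.items():
    print(f"{label}: OR = exp({g}) = {math.exp(g):.2f}")
```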

and females in either Class 1 or 5. Rather than making all pairwise class comparisons for student sex by changing the reference class, a better impression of the sex differences in class membership can be given through a graphical presentation such as the one depicted in Figure 25.13, which shows the model-estimated class proportions for the total population and for the two values of the covariate, that is, for males and females. You can see in this figure that the sex differences are primarily in the distribution of individuals across Classes 2 ("Pro-Math With Anxiety") and 4 ("I Don't Like Math but I Know It's Good for Me") with females more likely than males, overall, to be in Class 4 and less likely to be in Class 2.

Post Hoc Class Comparisons

This section has presented a LCR model that simultaneously estimates the latent class measurement model and the structural relationships between the latent class variable and one or more covariates. The simultaneous estimation of the measurement and structural models is recommended whenever possible. However, there is a not-so-unusual practice in the applied literature of doing post hoc class comparisons, taking the modal class assignments based on the unconditional latent class measurement model and treating those values as observed values on a manifest multinomial variable in subsequent analyses. This is what I did for the diabetes example, comparing the modal class assignments to the clinical classifications, and such a post hoc comparison can be a very useful descriptive technique for further understanding and validation of the latent classes. The problem of this post hoc classification approach comes when modal class assignments are used in formal hypothesis testing, moving beyond the descriptive to inferential analyses.

Such a "classify-analyze" approach is problematic because it ignores the error rates in assigning subjects to classes. Because the error rates can vary from class to class, with smaller classes having higher prior probabilities of incorrect assignment, even with well-separated classes, there can be bias in the point estimates as well as the standard errors for parameters related to latent class membership. In addition, there is error introduced from the posterior class probabilities that are used for the modal class assignment because they are computed using parameter estimates and contain the uncertainty from those estimates. Studies have shown that assignment error rates can be considerable (Tueller, Drotar, & Lubke, 2011), posing serious threats to the validity of post hoc testing.

Advanced Mixture Modeling

Although a substantial amount of information has been covered in this chapter, I have only scratched the surface in terms of the many types of population heterogeneity that can be modeled using finite mixtures. However, what is provided here is the foundational understanding that will enable you to explore these more advanced models. Just as with factor analysis and traditional structural equation modeling, the basic principles of model specification, estimation, evaluation, and
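The posterior class probabilities behind a modal assignment come from Bayes' rule applied to the estimated model. The sketch below uses a hypothetical two-class, three-item LCA with made-up parameters; it shows that some response patterns are assigned with posterior probabilities well below 1, which is exactly the classification uncertainty that the classify-analyze approach then ignores:

```python
from itertools import product

# Hypothetical 2-class LCA with 3 binary items: class proportions and
# class-specific item endorsement probabilities (illustrative values only).
pi = [0.6, 0.4]
theta = [[0.8, 0.7, 0.9],   # Pr(u_m = 1 | c = 1)
         [0.2, 0.3, 0.1]]   # Pr(u_m = 1 | c = 2)

def posterior(u):
    """Posterior class probabilities Pr(c = k | u) via Bayes' rule."""
    joint = []
    for k in range(2):
        p = pi[k]
        for m, um in enumerate(u):
            p *= theta[k][m] if um == 1 else 1 - theta[k][m]
        joint.append(p)
    total = sum(joint)
    return [p / total for p in joint]

# Modal assignment keeps only the argmax and discards the remaining
# uncertainty in the posterior, which can be substantial for mixed patterns.
for u in product([0, 1], repeat=3):
    post = posterior(u)
    modal = post.index(max(post)) + 1
    print(u, [round(p, 3) for p in post], "-> assigned Class", modal)
```

For example, the pattern (1, 0, 0) is modally assigned to Class 2 with a posterior of only about 0.78, so roughly one in five such individuals would be misassigned under these parameters.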

[Figure 25.13 here: stacked bar chart of model-estimated proportions in Classes 1-5 ("Pro-math without anxiety," "Pro-math with anxiety," "Math lover," "I don't like math but I know it's good for me," "Anti-math with anxiety") for the overall sample, for males, and for females.]

Figure 25.13 LSAY example: Model-estimated overall and sex-specific latent class proportions for the five-class LCR.

interpretation extend quite naturally into more complicated modeling scenarios.

This section provides a very brief overview of some of the modeling extensions currently possible in a mixture modeling framework. The first extension relates to the latent class indicators and their within-class distributions. I presented two models, LCA and LPA, that had exclusively categorical or exclusively continuous indicator variables. However, recent advances in maximum likelihood estimation using complex algorithms in a general latent variable modeling framework (see, for example, Asparouhov & Muthén, 2004, and Skrondal & Rabe-Hesketh, 2004) have rendered the necessity of uniformity of measurement scales among the indicators obsolete, allowing indicators for a single latent class variable to be of mixed measurement modalities, while also expanding the permissible scales of measures and error distributions for the manifest variables. It is now possible to specify a latent class variable with indicators of mixed modalities or measurement scales including interval and ratio scales of measures, censored interval scales, count scales, ordinal or Likert scales, binary or multinomial responses, and so forth. It is also possible to specify a range of within-class distributions for those indicators, for example, Poisson, zero-inflated Poisson, or negative binomial for count scales; normal, censored normal, or censored-inflated normal for interval scales; and so forth. Additionally, the class-specific distribution functions can be from different parametric families across the classes.

Another extension involves the scale of the latent class variable. In this presentation, I used the traditional formulation of the latent class variable as a latent multinomial variable. However, there are latent class models that bridge the gap between the latent multinomial variable models and the latent factor models, such as discretized latent trait models, located latent class models, and latent class scaling models (Croon, 1990, 2002; Dayton, 1998; Heinen, 1996), all forms of ordered latent class models. In addition, recent advances have further blurred the lines of conventional classification schemes for latent variable models (Heinen, 1996) by allowing both latent factors and latent class variables to be included in the same analytic model. These so-called hybrid models, also termed factor mixture models, include both continuous and categorical latent variables as part of the same measurement model (Arminger, Stein, & Wittenburg, 1999; Dolan & van der Maas, 1998; Draney, Wilson, Gluck, & Spiel, 2008; Jedidi, Jagpal, & DeSarbo, 1997; Masyn, Henderson, & Greenbaum, 2010; Muthén, 2008; Vermunt & Magidson, 2002; von Davier & Yamamoto, 2006; Yung, 1997). These models combine features from both conventional factor analysis and LCA. Special cases of these hybrid models include mixture item response theory models and growth mixture models.

Extensions in mixture model specification and estimation include the accommodation of complex sampling weights (Patterson, Dayton, & Graubard, 2002); the use of Bayesian estimation
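As a concrete sketch of mixed-modality indicators, consider a two-class mixture with one Bernoulli item and one Poisson count item. All parameter values below are illustrative, and conditional independence of the indicators within class is assumed, as in the LCA formulation earlier in the chapter:

```python
import math

# Sketch of a class-conditional likelihood mixing indicator scales
# (one binary item, one count item); parameter values are illustrative.
pi = [0.5, 0.5]
p_binary = [0.9, 0.2]     # Bernoulli parameter per class
lam_count = [1.0, 4.0]    # Poisson rate per class

def mixture_density(u_bin, u_count):
    """Marginal likelihood sum_k pi_k * P(u_bin | k) * P(u_count | k),
    assuming conditional independence of the indicators within class."""
    total = 0.0
    for k in range(2):
        p_b = p_binary[k] if u_bin == 1 else 1 - p_binary[k]
        p_c = math.exp(-lam_count[k]) * lam_count[k] ** u_count / math.factorial(u_count)
        total += pi[k] * p_b * p_c
    return total

print(mixture_density(1, 0), mixture_density(0, 5))
```

Summing this density over all response patterns returns 1, confirming it is a proper mixed Bernoulli-Poisson mixture distribution.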

[Figure 25.14 here: six generic path diagrams, panels (a) through (f).]

Figure 25.14 Generic path diagrams for a (a) latent class mediation model, (b) regression mixture model, (c) latent transition model, (d) multilevel latent class model, (e) discrete-time survival factor mixture model, and (f) growth mixture model.

techniques (Asparouhov & Muthén, 2010; Garrett & Zeger, 2000; Gelfand & Smith, 1990; Lanza, Collins, Schafer, & Flaherty, 2005) in place of full-information maximum likelihood; the adaptation of fuzzy clustering algorithms and allowing graded latent class membership (Asparouhov & Muthén, 2008; Yang & Yu, 2005); and the use of multiple imputation for missing data combined with MLE (Vermunt, Van Ginkel, Van der Ark, & Sijtsma, 2008).

The six panels of Figure 25.14 display path diagram representations of some of the many advanced mixture models available to researchers. Figure 25.14.a depicts a latent class mediation model

(Petras, Masyn, & Ialongo, 2011), extending the LCR model to include an outcome of latent class membership that may also be influenced by the covariate. Figure 25.14.b depicts a regression mixture model (RMM; DeSarbo, Jedidi, & Sinha, 2001; Van Horn, Jaki, Masyn, Ramey, Antaramian, & Lamanski, 2009) in which the latent class variable is measured by the conditional distribution of an outcome variable, y, regressed on x; that is, the latent class is specified to characterize differential effects of x on y present in the overall population. Figure 25.14.c displays the longitudinal extension of latent class analysis: latent transition analysis (LTA). In LTA (Collins & Lanza, 2010; Nylund, 2007), a special case of a broader class of mixture models called Markov chain models (Langeheine & van de Pol, 2002), there is a latent class variable at each time-point or wave, and the relationships between the classes across time describe individual transitions in class membership through time. Figure 25.14.d displays the multilevel extension of LCA. In MLCA (Asparouhov & Muthén, 2008; Henry & Muthén, 2010; Nylund-Gibson, Graham, & Juvonen, 2010), the class proportions within cluster (represented by shaded circles on the boundary of the within-cluster latent class variable, cW) vary across clusters. And the variability in class proportions across clusters is captured by a between-cluster latent class variable, cB. The classes of cB represent different groups of clusters characterized by their distributions of individuals across classes of cW. Figures 25.14.e and 25.14.f depict two special types of factor mixture models. The diagram in Figure 25.14.e represents a discrete-time survival factor mixture model (Masyn, 2009) in which there is an underlying factor that captures individual-level frailty in the discrete-time survival process measured by the event history indicators, and the latent class variable characterizes variability in the individual frailties. The diagram in Figure 25.14.f represents a growth mixture model (Feldman, Masyn, & Conger, 2009; Muthén & Asparouhov, 2009; Petras & Masyn, 2010) in which there are latent growth factors that capture the intra-individual growth process, defining individual growth trajectories, and a latent class variable that characterizes (part of) the interindividual variability in the growth trajectories. Examples of other advanced mixture models not depicted in Figure 25.14 include pattern-mixture and selection models for non-ignorable missing data (Muthén, Asparouhov, Hunter, & Leuchter, 2011) and complier average causal effect models (Jo, 2002). What I have provided here is by no means a fully comprehensive or exhaustive list of advanced mixture models but is intended to give the reader a flavor of what extensions are possible.

Conclusion

This chapter represents what I believe to be the current, prevailing "best practices" for basic mixture modeling, specifically LCA and LPA, in terms of model specification, estimation, evaluation, selection, and interpretation. I have also provided a very limited introduction to structural equation mixture modeling in the form of LCR. In addition, in the previous section, you have been given a partial survey of the many more advanced mixture models currently in use. It should be evident that mixture models offer a flexible and powerful way of modeling population heterogeneity. However, mixture modeling, like all statistical models, has limitations and is perhaps even more susceptible to misapplication than other more established techniques. Thus, I take the opportunity in closing to remind readers about some of the necessary (and untestable) assumptions of mixture modeling and caution against the most common misuses.

Most of this chapter has focused on direct applications of mixture modeling, for which one assumes a priori that the overall population consists of two or more homogeneous subpopulations. The direct application is far more common in social science applications than the indirect application. One assumes that there are, in truth, distinct types or groups of individuals in the population to be revealed. "This assumption is critical, because it is always possible to organize any set of data into classes, which then can be said to indicate types, but there is no real finding if an analysis merely indicates classifications in a particular sample. To be of scientific value, the classifications must represent lawful phenomena, must be replicable, and must be related to other variables within a network that defines construct validity." (Horn, 2000, p. 927) Because this assumption is an a priori assumption of a mixture model, utilizing a direct mixture modeling approach does not test a hypothesis about the existence of discrete groups or subtypes. (There are analytic approaches that are designed to explore the underlying latent structure of a given construct, e.g., whether the underlying construct is continuous or categorical in nature, and interested readers are referred to the chapter in this handbook on taxometric methods and also Masyn, Henderson, and

Greenbaum, 2010.) Nor does the fact that a K-class model is estimable with the sample data prove there are K classes in the population from which the sample was drawn.

Furthermore, the (subjective) selection of a final K-class model does not prove the existence of exactly K subgroups. Recall how the specification of Σk in a latent profile analysis can influence which class enumeration is "best." The number of latent classes that you settle on at the conclusion of the class enumeration process could very well not reflect the actual number of distinct groups in the population. Attention must also be paid, during the interpretation process, to the fact that the latent classes extracted from the data are inextricably linked to the items used to identify those classes because the psychometric properties of the items can influence the formation of the classes. You assume that you have at your disposal the necessary indicators to identify all the distinct subgroups in the population and can only increase confidence in this assumption through validation of the latent class structure.

I did not provide concrete guidelines about sample size requirements for mixture modeling because they depend very much on the model complexity; the number, nature, and separation of the "true" classes in the population; and the properties of the latent class indicators themselves (Lubke, 2010). "Analyses for a very simple latent class model may be carried out probably with as few as 30 subjects, whereas other analyses require thousands of subjects." (Lubke, 2010, p. 215) Thus, what is critical to be mindful of in your interpretation of findings from a mixture model is that mixture models can be sensitive to sampling fluctuation that may limit the generalizability of the class structure found in a given sample and that smaller samples may be underpowered to detect smaller and/or not well-separated classes (Lubke, 2010).

None of these limitations detracts from the usefulness of mixture modeling or the scientific value of the emergent latent class structure for characterizing the population heterogeneity of interest. However, any interpretation must be made with these limitations in mind and care must be taken not to reify the resultant latent classes or to make claims about proof of their existence.

Future Directions

In the historical overview of mixture modeling at the beginning of this chapter, I remarked on the rapid expansion in the statistical theory (model specification and estimation), software implementation, and applications of mixture modeling in the last 30 years. And the evolution of mixture modeling shows no signs of slowing. There are numerous areas of development in mixture modeling, and many investigations are currently underway. Among those areas of development are: measures of overall goodness-of-fit, individual fit indices, graphical residual diagnostics, and assumption-checking post hoc analyses, particularly for mixture models with continuous indicators and factor mixture models; Bayesian estimation and mixture model selection; class enumeration processes for multilevel mixture models with latent class variables on two or more levels; missing data analysis, particularly maximum likelihood approaches and multiple imputation approaches for non-ignorable missingness related to latent class membership; detection procedures for differential item functioning in latent class measurement models; multistage and simultaneous approaches for analyzing predictors and distal outcomes of latent class membership including multiple imputation of latent class membership by way of plausible values from Bayesian estimation techniques; integration of causal inference techniques such as propensity scores and principal stratification with mixture models; and informed study design, including sample size determination, power calculations, and item selection. In addition to these more specific areas of methods development, the striking trend of extending other statistical models by integrating or overlaying finite mixtures will surely continue and more hybrid models are likely to emerge. Furthermore, there will be advancing substantive areas, yielding new kinds of data, for which mixture modeling may prove invaluable, for example, genotypic profile analysis of single nucleotide polymorphisms. And although it is difficult to predict which area of development will prove most fruitful in the coming decades, it is certain that mixture modeling will continue to play an increasingly prominent role in ongoing empirical quests to describe and explain general patterns and individual variability in social science phenomena.

List of Abbreviations

ANOVA - Analysis of variance (ANCOVA - Analysis of covariance)
AvePP - Average posterior class probability

AWE - Approximate weight of evidence criterion
BF - Bayes factor
BIC - Bayesian information criterion
CACE - Complier average causal effect
CAIC - Consistent Akaike information criterion
CFA - Confirmatory factor analysis (EFA - Exploratory factor analysis)
cmP - Correct model probability
df - Degree(s) of freedom
DIF - Differential item functioning
EK - Entropy
EM - Expectation-maximization algorithm
GMM - Growth mixture model
IRT - Item response theory
LCA - Latent class analysis
LCCA - Latent class cluster analysis
LCR - Latent class regression
LL - Log likelihood
LPA - Latent profile analysis
LR - Likelihood ratio (LRT - Likelihood ratio test; LRTS - LRT statistic; LMR-LRT - Lo, Mendell, & Rubin LRT; BLRT - bootstrapped LRT)
LSAY - Longitudinal Study of American Youth
LTA - Latent transition analysis
MAR - Missing at random (MCAR - Missing completely at random)
mcaP - Modal class assignment proportion
ML - Maximum likelihood (MLE - Maximum likelihood estimate; FIML - Full information maximum likelihood)
MVN - Multivariate normal distribution
npar - Number of free parameters
OCC - Odds of correct classification ratio
OR - Odds ratio
RMM - Regression mixture model
SIC - Schwarz information criterion
SSPG - Steady state plasma glucose

Appendix

A technical appendix with Mplus syntax and supplementary Excel files for tabulating and constructing graphical summaries of modeling results is available by request from the chapter author.

Author Note

The author would like to thank Dr. Karen Nylund-Gibson and her graduate students, Amber Gonzalez and Christine Victorino, for research and editorial support. Correspondence concerning this chapter should be addressed to Katherine Masyn, Graduate School of Education, Harvard University, Cambridge, MA 02138. E-mail: [email protected].

References

Agresti, A. (2002). Categorical data analysis. Hoboken, NJ: John Wiley & Sons.
Arminger, G., Stein, P., & Wittenburg, J. (1999). Mixtures of conditional mean- and covariance-structure models. Psychometrika, 64(4), 475-494.
Asparouhov, T., & Muthén, B. (2004). Maximum-likelihood estimation in general latent variable modeling. Unpublished manuscript.
Asparouhov, T., & Muthén, B. (2008). Multilevel mixture models. In G. R. Hancock & K. M. Samuelsen (Eds.), Advances in latent variable mixture models (pp. 27-51). Charlotte, NC: Information Age Publishing.
Bandeen-Roche, K., Miglioretti, D. L., Zeger, S. L., & Rathouz, P. J. (1997). Latent variable regression for multiple discrete outcomes. Journal of the American Statistical Association, 92(440), 1375-1386.
Banfield, J. D., & Raftery, A. E. (1993). Model-based Gaussian and non-Gaussian clustering. Biometrics, 49, 803-821.
BeLue, R., Francis, L. A., Rollins, B., & Colaco, B. (2009). One size does not fit all: Identifying risk profiles for overweight in adolescent population subsets. The Journal of Adolescent Health, 45(5), 517-524.
Bergman, L. R., & Trost, K. (2006). The person-oriented versus the variable-oriented approach: Are they complementary, opposites, or exploring different worlds? Merrill-Palmer Quarterly, 52(3), 601-632.
Bozdogan, H. (1987). Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions. Psychometrika, 52(3), 345-370.
Bray, B. C. (2007). Examining gambling and substance use: Applications of advanced latent class modeling techniques for cross-sectional and longitudinal data. Unpublished doctoral dissertation, The Pennsylvania State University, University Park.
Christie, C. A., & Masyn, K. E. (2010). Latent profiles of evaluators' self-reported practices. The Canadian Journal of Program Evaluation, 23(2), 225-254.
Chung, H., Flaherty, B. P., & Schafer, J. L. (2006). Latent class logistic regression: Application to marijuana use and attitudes among high school seniors. Journal of the Royal Statistical Society: Series A, 169(4), 723-743.
Chung, H., Loken, E., & Schafer, J. L. (2004). Difficulties in drawing inferences with finite-mixture models. The American Statistician, 58(2), 152-158.
Clogg, C. C. (1977). Unrestricted and restricted maximum likelihood latent structure analysis: A manual for users. University Park, PA: Population Issues Research Center, Pennsylvania State University.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates.

Collins, L. M., Graham, J. W., Long, J. D., & Hansen, W. B. (1994). Crossvalidation of latent class models of early substance use onset. Multivariate Behavioral Research, 29(2), 165-183.
Collins, L. M., & Lanza, S. T. (2010). Latent class and latent transition analysis with applications in the social, behavioral, and health sciences. Hoboken, NJ: John Wiley & Sons.
Crissey, S. R. (2005). Race/ethnic differences in the marital expectations of adolescents: The role of romantic relationships. Journal of Marriage and the Family, 67(3), 697-709.
Croon, M. (1990). Latent class analysis with ordered latent classes. British Journal of Mathematical and Statistical Psychology, 43, 171-192.
Croon, M. (2002). Ordering the classes. In J. A. Hagenaars & A. L. McCutcheon (Eds.), Applied latent class analysis (pp. 137-162). Cambridge: Cambridge University Press.
Cudeck, R., & Browne, M. W. (1983). Cross-validation of covariance structures. Multivariate Behavioral Research, 18, 147-167.
Dayton, C. M. (1998). Latent class scaling analysis. Thousand Oaks, CA: Sage.
Dayton, C. M., & Macready, G. B. (1988). Concomitant-variable latent-class models. Journal of the American Statistical Association, 83, 173-178.
Goodman, L. A. (1974). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika, 61, 215-231.
Goodman, L. A. (2002). Latent class analysis: The empirical study of latent types, latent variables, and latent structures. In J. A. Hagenaars & A. L. McCutcheon (Eds.), Applied latent class analysis (pp. 3-55). Cambridge: Cambridge University Press.
Haberman, S. J. (1973). The analysis of residuals in cross-classified tables. Biometrics, 29, 205-220.
Hagenaars, J. A. (1998). Categorical causal modeling: Latent class analysis and directed log-linear models with latent variables. Sociological Methods & Research, 26, 436-486.
Hagenaars, J. A., & McCutcheon, A. L. (2002). Applied latent class analysis. Cambridge; New York: Cambridge University Press.
Hamil-Luker, J. (2005). Trajectories of public assistance receipt among female high school dropouts. Population Research and Policy Review, 24(6), 673-694.
Hancock, G. R., & Samuelsen, K. M. (Eds.). (2008). Advances in latent variable mixture models. Charlotte, NC: Information Age Publishing.
Heinen, T. (1996). Latent class and discrete latent trait models: Similarities and differences. Thousand Oaks, CA: Sage.

MASYN 609

… errors and conduct hypothesis tests in latent class and latent transition analysis. Psychological Methods, 10, 84–100.
Lanza, S. T., Dziak, J. J., Huang, L., Xu, S., & Collins, L. M. (2011). PROC LCA & PROC LTA user's guide (Version 1.2.6). University Park: The Methodology Center, Penn State.
Laursen, B., & Hoff, E. (2006). Person-centered and variable-centered approaches to longitudinal data. Merrill-Palmer Quarterly, 52, 377–389.
Lazarsfeld, P., & Henry, N. (1968). Latent structure analysis. New York: Houghton Mifflin.
Little, R. J., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). New York: John Wiley and Sons.
Lo, Y., Mendell, N., & Rubin, D. (2001). Testing the number of components in a normal mixture. Biometrika, 88, 767–778.
Lubke, G. H. (2010). Latent variable mixture modeling. In G. R. Hancock & R. O. Mueller (Eds.), The reviewer's guide to quantitative methods in the social sciences. New York, NY: Routledge.
Magnusson, D. (1998). The logic and implications of a person-oriented approach. In R. B. Cairns, L. R. Bergman, & J. Kagan (Eds.), Methods and models for studying the individual (pp. 33–64). Thousand Oaks, CA: Sage.
Markon, K. E., & Krueger, R. F. (2005). Categorical and continuous models of liability to externalizing disorders: A direct comparison in NESARC. Archives of General Psychiatry, 62(12), 1352–1359.
Marsh, H. W., Lüdtke, O., Trautwein, U., & Morin, A. J. S. (2009). Classical latent profile analysis of academic self-concept dimensions: Synergy of person- and variable-centered approaches to theoretical models of self-concept. Structural Equation Modeling, 16(2), 191–225.
Masyn, K., Henderson, C., & Greenbaum, P. (2010). Exploring the latent structures of psychological constructs in social development using the Dimensional-Categorical Spectrum. Social Development, 19(3), 470–493.
Masyn, K. E. (2009). Discrete-time survival factor mixture analysis for low-frequency recurrent event histories. Research in Human Development, 6, 165–194.
McCutcheon, A. L. (1987). Latent class analysis. Beverly Hills, CA: Sage.
McLachlan, G., & Peel, D. (2000). Finite mixture models. New York: John Wiley & Sons.
Miller, J. D., Kimmel, L., Hoffer, T. B., & Nelson, C. (2000). Longitudinal study of American youth: User's manual. Chicago, IL: Northwestern University, International Center for the Advancement of Scientific Literacy.
Muthén & Muthén. (September, 2010). Bayesian analysis of latent variable models using Mplus (Technical Report, Version 4; prepared by Asparouhov, T., & Muthén, B.). Los Angeles, CA: Muthén & Muthén.
Muthén, B. (2003). Statistical and substantive checking in growth mixture modeling. Psychological Methods, 8, 369–377.
Muthén, B. (2008). Latent variable hybrids: Overview of old and new models. In G. R. Hancock & K. M. Samuelsen (Eds.), Advances in latent variable mixture models (pp. 1–24). Charlotte, NC: Information Age Publishing.
Muthén, B., & Asparouhov, T. (2006). Item response mixture modeling: Application to tobacco dependence criteria. Addictive Behaviors, 31, 1050–1066.
Muthén, B., & Asparouhov, T. (2009). Growth mixture modeling: Analysis with non-Gaussian random effects. In G. Fitzmaurice, M. Davidian, G. Verbeke, & G. Molenberghs (Eds.), Longitudinal data analysis (pp. 143–165). Boca Raton: Chapman & Hall/CRC Press.
Muthén, B., Asparouhov, T., Hunter, A., & Leuchter, A. (2011). Growth modeling with non-ignorable dropout: Alternative analyses of the STAR*D antidepressant trial. Psychological Methods, 16, 17–33.
Muthén, B., & Muthén, L. K. (1998–2011). Mplus (Version 6.11) [Computer software]. Los Angeles, CA: Muthén & Muthén.
Muthén, B., & Shedden, K. (1999). Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics, 55, 463–469.
Nagin, D. S. (1999). Analyzing developmental trajectories: A semiparametric, group-based approach. Psychological Methods, 4, 139–157.
Nagin, D. S. (2005). Group-based modeling of development. Cambridge: Harvard University Press.
Nylund, K. (2007). Latent transition analysis: Modeling extensions and an application to peer victimization. Unpublished doctoral dissertation, University of California, Los Angeles.
Nylund, K., Asparouhov, T., & Muthén, B. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14, 535–569.
Nylund, K. L., & Masyn, K. E. (May, 2008). Covariates and latent class analysis: Results of a simulation study. Paper presented at the meeting of the Society for Prevention Research, San Francisco, CA.
Nylund-Gibson, K., Graham, S., & Juvonen, J. (2010). An application of multilevel LCA to study peer victimization in middle school. Advances and Applications in Statistical Sciences, 3(2), 343–363.
Patterson, B., Dayton, C. M., & Graubard, B. (2002). Latent class analysis of complex survey data: Application to dietary data. Journal of the American Statistical Association, 97, 721–729.
Pearson, K. (1894). Contributions to the mathematical theory of evolution. Philosophical Transactions of the Royal Society of London (Series A), 185, 71–110.
Petras, H., & Masyn, K. (2010). General growth mixture analysis with antecedents and consequences of change. In A. Piquero & D. Weisburd (Eds.), Handbook of quantitative criminology (pp. 69–100). New York: Springer.
Petras, H., Masyn, K., & Ialongo, N. (2011). The developmental impact of two first-grade preventive interventions on aggressive/disruptive behavior in childhood and adolescence: An application of latent transition growth mixture modeling. Prevention Science, 12, 300–313.
Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2004). Generalized multilevel structural equation modeling. Psychometrika, 69, 167–190.
Ramaswamy, V., DeSarbo, W. S., Reibstein, D. J., & Robinson, W. T. (1993). An empirical pooling approach for estimating marketing mix elasticities with PIMS data. Marketing Science, 12(1), 103–124.
Reaven, G. M., & Miller, R. G. (1979). An attempt to define the nature of chemical diabetes using a multidimensional analysis. Diabetologia, 16(1), 17–24.
Reboussin, B. A., Ip, E. H., & Wolfson, M. (2008). Locally dependent latent class models with covariates: An application to under-age drinking in the USA. Journal of the Royal Statistical Society: Series A, 171, 877–897.
Schafer, J. L. (1997). Analysis of incomplete multivariate data. London: Chapman & Hall.

Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464.
Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal, and structural equation models. London: Chapman & Hall.
Soothill, K., Francis, B., Awwad, F., Morris, Thomas, & McIllmurray. (2004). Grouping cancer patients by psychosocial needs. Journal of Psychosocial Oncology, 22(2), 89–109.
Spearman, C. (1904). General intelligence, objectively determined and measured. American Journal of Psychology, 15, 201–293.
Tan, W. Y., & Chang, W. C. (1972). Some comparisons of the method of moments and the method of maximum likelihood in estimating parameters of a mixture of two normal densities. Journal of the American Statistical Association, 67(339), 702–708.
Thorndike, R. L. (1953). Who belongs in the family? Psychometrika, 18(4), 267–276.
Titterington, D. M., Smith, A. F. M., & Makov, U. E. (1985). Statistical analysis of finite mixture distributions. Chichester, UK: John Wiley & Sons.
Tueller, S. J., Drotar, S., & Lubke, G. H. (2011). Addressing the problem of switched class labels in latent variable mixture model simulation studies. Structural Equation Modeling, 18, 110–131.
Van Horn, M. L., Jaki, T., Masyn, K., Ramey, S. L., Smith, J. A., & Antaramian, S. (2009). Assessing differential effects: Applying regression mixture models to identify variations in the influence of family resources on academic achievement. Developmental Psychology, 45(5), 1298–1313.
Vermunt, J. K. (1999). A general nonparametric approach to the analysis of ordinal categorical data. Sociological Methodology, 29, 197–221.
Vermunt, J. K., & Magidson, J. (2002). Latent class cluster analysis. In J. A. Hagenaars & A. L. McCutcheon (Eds.), Applied latent class analysis (pp. 89–106). Cambridge: Cambridge University Press.
Vermunt, J. K., & Magidson, J. (2005). Latent GOLD 4.0 user's guide. Belmont, MA: Statistical Innovations.
Vermunt, J. K., & McCutcheon, A. L. (2012). An introduction to modern categorical data analysis. Thousand Oaks, CA: Sage.
Vermunt, J. K., Van Ginkel, J. R., Van der Ark, L. A., & Sijtsma, K. (2008). Multiple imputation of categorical data using latent class analysis. Sociological Methodology, 38, 369–397.
von Davier, M., & Yamamoto, K. (2006). Mixture distribution Rasch models and hybrid Rasch models. In M. von Davier & C. H. Carstensen (Eds.), Multivariate and mixture distribution Rasch models (pp. 99–115). New York: Springer-Verlag.
Weldon, W. F. R. (1893). On certain correlated variations in Carcinus moenas. Proceedings of the Royal Society of London, 54, 318–329.
Yang, M., & Yu, N. (2005). Estimation of parameters in latent class models using fuzzy clustering algorithms. European Journal of Operational Research, 160, 515–531.
Yang, X. D., Shaftel, J., Glasnapp, D., & Poggio, J. (2005). Qualitative or quantitative differences? Latent class analysis of mathematical ability for special education students. Journal of Special Education, 38(4), 194–207.
Yung, Y. F. (1997). Finite mixtures in confirmatory factor analysis models. Psychometrika, 62, 297–330.
