Wright, AGC (In Press). Latent Variable Models in Clinical

Latent Variable Models in Clinical Psychology

Aidan G.C. Wright
University of Pittsburgh

Please cite as: Wright, A.G.C. (in press). Latent variable models in clinical psychology. In Wright, A.G.C., & Hallquist, M.N. (Eds.). Cambridge handbook of research methods in clinical psychology. New York, NY: Cambridge University Press.

Correspondence concerning this article should be addressed to Aidan Wright, Department of Psychology, University of Pittsburgh, 4119 Sennott Square, 210 S. Bouquet St., Pittsburgh, PA, 15260. E-mail: [email protected]

Abstract

Most of what clinical psychology concerns itself with is not directly observable. Concepts like neuroticism and depression, but also learning and development, represent dispositions, states, or processes that must be inferred and cannot (currently) be directly measured. Latent variable modeling, as a statistical framework, encompasses a range of techniques that involve estimating the presence and effect of unobserved variables from observed data. This chapter provides a non-technical overview of latent variable modeling in clinical psychology. Dimensional latent variable models are emphasized, although categorical and hybrid models are touched on briefly. Challenges with specific models, such as the bifactor model, are discussed. Examples draw from the psychopathology literature.

Keywords: Latent Variable Models; Factor Analysis; Exploratory Structural Equation Modeling; Bifactor Models; Factor Mixture Models

Latent Variable Models in Clinical Psychology

Most of what clinical psychology concerns itself with is not directly observable. Concepts like neuroticism and depression, but also learning and development, represent dispositions, states, or processes that must be inferred and cannot (currently) be directly measured. Latent variable modeling encompasses a range of techniques that involve estimating the presence and effect of unobserved variables from observed data.
In other words, latent variable models are a class of statistical techniques that allow investigators to work at the construct level, even when the constructs in question elude direct measurement. They can also be used to directly compare distinct and often competing conceptualizations of mental disorder. The basic logic underpinning these approaches is similar to the diagnostician’s task of inferring an inaccessible disease state from outwardly available signs and symptoms. For instance, if an individual patient were to complain of lack of interest and/or pleasure, persistent low mood, guilt, as well as a number of vegetative symptoms like fatigue and appetite disturbance, the clinician might infer that they are suffering from depression. In this example, depression is a clinical concept that is presumed to drive the manifestation of these debilitating mental and physical states. If one wants to study depression, one would ideally have access not just to those observable features, all of which have other potential causes, but to what they share in the form of the unobserved depression episode. Latent variable models provide investigators direct access to the inferred construct. This is the crux of this set of techniques, although as I discuss below, they can be leveraged in creative ways to understand the very nature of psychopathology.

This chapter provides a non-technical introduction to latent variable models with an emphasis on how they can be used to answer challenging questions relevant to clinical psychology. Therefore, the emphasis is largely conceptual, and the reader is directed elsewhere for technical treatments and detailed instructions for applications (e.g., Bollen, 1989; Brown, 2015; Collins & Lanza, 2010; Loehlin, 2004; Mulaik, 2010). I begin with a discussion of the definition of latent variables, then follow with an elaboration of several exemplar models including factor analytic techniques and mixture modeling.
Portions of the model overviews borrow heavily from Wright (2017) and Wright & Zimmerman (2015), which can be consulted for additional detail. Throughout, examples from the clinical literature are provided.

Definitions and Conceptual Underpinnings

How should we think about latent variables? What are their defining features? What role can they play in applied clinical research programs? Scholars and methodologists have answered these questions in different ways over the years, with some providing formal but narrow definitions in specific quantitative terms and others providing informal and loosely defined criteria. In a review on the use of latent variable models in social science, Bollen (2002) summarized several of the most common definitions, and in so doing contrasted informal and formal definitions.

On the informal side, Bollen (2002) noted that several authors have argued that latent variables are “hypothetical variables” or “hypothetical constructs” (Edwards & Bagozzi, 2000; Harman, 1960; Nunnally, 1978). Although this definition is certainly accurate, it is only conceptual, and does not link back to the formal statistical models in any clear way. Another common informal definition that Bollen (2002) identified is that latent variables are “unobservable or unmeasurable.” The major concern with this definition is that it rules out the possibility that advances in technology will some day make it feasible to measure what currently defies measurement. Finally, some have argued that latent variables are merely summaries of the observed variables, serving little more than a descriptive function. Adopting this perspective relieves the researcher of making any challenging assumptions about the true nature of latent variables, but it also defangs the models and reduces them to their weakest form.
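The contrast between a latent variable and a descriptive summary of observed variables can be made concrete with a small simulation. The following is a minimal sketch and is not taken from the chapter: the function name `simulate_one_factor`, the loading values, and the sample size are all illustrative assumptions. It generates four indicators from a single simulated latent variable and then compares a simple sum score, which summarizes only the observed data, against the latent variable that produced them.

```python
import numpy as np


def simulate_one_factor(n=5000, loadings=(0.8, 0.7, 0.6, 0.5), seed=0):
    """Simulate indicators driven by one latent variable (hypothetical values)."""
    rng = np.random.default_rng(seed)
    lam = np.asarray(loadings, dtype=float)
    # The latent variable: known here only because the data are simulated.
    eta = rng.standard_normal(n)
    # Residual SDs chosen so that each indicator has unit variance.
    resid_sd = np.sqrt(1.0 - lam**2)
    eps = rng.standard_normal((n, lam.size)) * resid_sd
    # Each indicator is its loading times the latent variable plus unique noise.
    x = eta[:, None] * lam + eps
    return eta, x


eta, x = simulate_one_factor()

# A purely descriptive summary of the observed indicators.
sum_score = x.sum(axis=1)

# The indicators correlate with one another only because they share eta.
r_indicators = np.corrcoef(x, rowvar=False)

# The summary tracks the latent variable closely but is not identical to it.
r_sum_latent = np.corrcoef(sum_score, eta)[0, 1]

print(np.round(r_indicators, 2))
print(round(r_sum_latent, 3))
```

In applied work `eta` is never available, which is the point of the exercise. The sum score can always be computed from the observed data, whereas treating it as a measure of something beyond those data requires assumptions about how the indicators were generated.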
Although Bollen (2002) also reviewed a number of formal definitions that have been offered for latent variables over the years (e.g., the local independence definition), he noted that each of these is overly restrictive or too narrow for various reasons. Ultimately, he offered a new definition that is simultaneously formal yet non-restrictive: a latent variable is a variable (i.e., not a constant) for which there is no sample realization in a given sample. What this means is that any variable that is not directly observed in (at least some portion of) a sample is latent. One attractive feature of this definition is that it acknowledges that a variable may be latent in one sample (because it is unmeasured), but observed in another, and that this may be something that changes over time (e.g., as new technology is developed). This approach is also useful for accommodating certain technical aspects of latent variable models, such as their use in handling missing data, allowing for correlated residuals, and treating residuals in regressions and random effects in mixed effects models as latent variables. Finally, although very abstract, this definition does provide the necessary link between the unobserved and the observed data. Thus, this definition encompasses all of the informal definitions listed above, and other formal definitions often serve as special cases of this more general definition.

The “sample realization” definition is useful for establishing when a variable is latent, but it does not speak to the ontology of latent variables. That is, how should we conceive of latent variables? Borsboom, Mellenbergh, and van Heerden (2003) took up this question in what has become a classic treatment on the conceptualization of latent variables. Borsboom and colleagues raise a number of technical points to motivate this question, but the fundamental issue they consider is whether latent variables exist independent of the data used to estimate them.
That is to say, given any set of data, one can estimate a latent variable model and, if it fits well, interpret the latent variable. However, this does not mean that anything meaningfully independent of the data has been ascertained. One needs to make an ontological assumption to link the operational latent variable estimated from the observed data to the formal latent variable of theoretical interest.

Borsboom and colleagues describe two distinct ontological stances, although one of these has layers of stringency. The first is the realist stance, which assumes that the latent variable in question is something real in nature, distinct from the data that are used to measure it. This can be contrasted with the constructivist stance, which regards latent variables as constructions of the human mind that do not exist independent of their measurement. An extreme variation on this latter perspective is that the latent variable is just a data reduction method, much like the informal definition of latent variables as summaries that Bollen (2002) discussed. Borsboom and colleagues (2003) argue that to take latent variable modeling seriously, one must adopt a realist stance. That is, one must assume that the latent variable is something that exists in nature independent of the data and causes the patterns in observed data. In contrast, an operationalist perspective is fundamentally at odds with latent variable theory, and assumes nothing more than a summary of the available data. That is, any latent variable estimated from the data is not independent from it, and it is just a construction of our minds that does not otherwise exist in reality.

Borsboom and colleagues’ arguments are well conceived and articulated, but they would seem to place clinical psychologists in an uncomfortable position by drawing a crisp distinction between realism and constructivism. Either
