Chapter 8 The Ising Model in Psychometrics Abstract This chapter provides a general introduction of network modeling in psychometrics. The chapter starts with an introduction to the statistical model formulation of pairwise Markov random fields (PMRF), followed by an introduction of the PMRF suitable for binary data: the Ising model. The Ising model is a model used in ferromagnetism to explain phase transitions in a field of particles. Following the description of the Ising model in statistical physics, the chapter continues to show that the Ising model is closely related to models used in psychometrics. The Ising model can be shown to be equivalent to certain kinds of logistic regression models, loglinear models and multi-dimensional item response theory (MIRT) models. The equivalence between the Ising model and the MIRT model puts standard psychometrics in a new light and leads to a strikingly di↵erent interpretation of well-known latent variable models. The chapter gives an overview of methods that can be used to estimate the Ising model, and concludes with a discussion on the interpretation of latent variables given the equivalence between the Ising model and MIRT. 8.1 Introduction In recent years, network models have been proposed as an alternative way of looking at psychometric problems (Van Der Maas et al., 2006; Cramer et al., 2010; Borsboom & Cramer, 2013). In these models, psychometric item responses are conceived of as proxies for variables that directly interact with each other. For example, the symptoms of depression (such as loss of energy, sleep problems, and low self esteem) are traditionally thought of as being determined by a common latent variable (depression, or the liability to become depressed; Aggen, Neale, & This chapter has been adapted from: Epskamp, S., Maris, G., Waldorp, L.J., and Borsboom, D. (in press). Network Psychometrics. In Irwing, P., Hughes, D., and Booth, T. (Eds.), Handbook of Psychometrics. New York: Wiley. 143 8. The Ising Model in Psychometrics Kendler, 2005). In network models, these symptoms are instead hypothesized to form networks of mutually reinforcing variables (e.g., sleep problems may lead to loss of energy, which may lead to low self esteem, which may cause rumination that in turn may reinforce sleep problems). On the face of it, such network models o↵er an entirely di↵erent conceptualization of why psychometric variables cluster in the way that they do. However, it has also been suggested in the literature that latent variables may somehow correspond to sets of tightly intertwined observables (e.g., see the Appendix of Van Der Maas et al., 2006). In the current chapter, we aim to make this connection explicit. As we will show, a particular class of latent variable models (namely, multidimensional Item Response Theory models) yields exactly the same probability distribution over the observed variables as a particular class of network models (namely, Ising models). In the current chapter, we exploit the consequences of this equivalence. We will first introduce the general class of models used in network analysis called Markov Random Fields. Specifically, we will discuss the Markov random field for binary data called the Ising Model, which originated from statistical physics but has since been used in many fields of science. We will show how the Ising Model relates to psychometrical practice, with a focus on the equivalence between the Ising Model and multidimensional item response theory. We will demonstrate how the Ising model can be estimated and finally, we will discuss the conceptual implications of this equivalence. Notation Throughout this chapter we will denote random variables with capital letters and possible realizations with lower case letters; vectors will be represented with bold- faced letters. For parameters, we will use boldfaced capital letters to indicate matrices instead of vectors whereas for random variables we will use boldfaced capital letters to indicate a random vector. Roman letters will be used to denote observable variables and parameters (such as the number of nodes) and Greek letters will be used to denote unobservable variables and parameters that need to be estimated. In this chapter we will mainly model the random vector X: X> = X1 X2 ... XP , containing P binary variables that⇥ take the values 1⇤ (e.g., correct, true or yes) and 1 (e.g., incorrect, false or no). We will denote a realization, or state, of − X with x> = x1 x2 ... xp . Let N be the number of observations and n(x) the number of observations that have response pattern x. Furthermore, let i denote the⇥ subscript of a random⇤ variable and j the subscript of a dif- ferent random variable (j = i). Thus, Xi is the ith random variable and xi its realization. The superscript6 (...) will indicate that elements are removed −(i) from a vector; for example, X− indicates the random vector X without Xi: (i) (i) X− = X1,...,Xi 1,Xi+1,....XP , and x− indicates its realization. Sim- − (i,j) X (i,j) ilarly, X−⇥ indicates X without X⇤ i and Xj and x− its realization. An overview of all notations used in this chapter can be seen in Appendix B. 144 8.2. Markov Random Fields X2 X1 X3 Figure 8.1: Example of a PMRF of three nodes, X1, X2 and X3 , connected by two edges, one between X1 and X2 and one between X2 and X3. 8.2 Markov Random Fields A network, also called a graph, can be encoded as a set G consisting of two sets: V , which contains the nodes in the network, and E, which contains the edges that connect these nodes. For example, the graph in Figure 8.1 contains three nodes: V = 1, 2, 3 , which are connected by two edges: E = (1, 2), (2, 3) . We will use this{ type} of network to represent a pairwise Markov{ random field} (PMRF; Lauritzen, 1996; Murphy, 2012), in which nodes represent observed ran- dom variables1 and edges represent (conditional) association between two nodes. More importantly, the absence of an edge represents the Markov property that two nodes are conditionally independent given all other nodes in the network: (i,j) (i,j) X X X− = x− (i, j) E (8.1) i ?? j | () 62 Thus, a PMRF encodes the independence structure of the system of nodes. In the case of Figure 8.1, X1 and X3 are independent given that we know X2 = x2. This could be due to several reasons; there might be a causal path from X1 to X3 or vise versa, X2 might be the common cause of X1 and X3, unobserved variables might cause the dependencies between X1 and X2 and X2 and X3, or the edges in the network might indicate actual pairwise interactions between X1 and X2 and X2 and X3. Of particular interest to psychometrics are models in which the presence of latent common causes induces associations among the observed variables. If such a common cause model holds, we cannot condition on any observed variable to completely remove the association between two nodes (Pearl, 2000). Thus, if an unobserved variable acts as a common cause to some of the observed variables, we should find a fully connected clique in the PMRF that describes the associations 1Throughout this chapter, nodes in a network designate variables, hence the terms are used interchangeably. 145 8. The Ising Model in Psychometrics among these nodes. The network in Figure 8.1, for example, cannot represent associations between three nodes that are subject to the influence of a latent common cause; if that were the case, it would be impossible to obtain conditional independence between X1 and X3 by conditioning on X2. Parameterizing Markov Random Fields A PMRF can be parameterized as a product of strictly positive potential functions φ(x) (Murphy, 2012): 1 Pr (X = x)= φ (x ) φ (x ,x ) , (8.2) Z i i ij i j i <ij> Y Y in which i takes the product over all nodes, i =1, 2,...,P, <ij> takes the product over all distinct pairs of nodes i and j (j>i), and Z is a normalizing constant suchQ that the probability function sums to unity over allQ possible patterns of observations in the sample space: Z = φi (xi) φij (xi,xj) . x i <ij> X Y Y Here, x takes the sum over all possible realizations of X. All φ(x) functions result in positive real numbers, which encode the potentials: the preference for P the relevant part of X to be in some state. The φi(xi) functions encode the node potentials of the network; the preference of node Xi to be in state xi, regardless of the state of the other nodes in the network. Thus, φi(xi) maps the potential for Xi to take the value xi regardless of the rest of the network. If φi(xi)=0, for instance, then Xi will never take the value xi, while φi(xi) = 1 indicates that there is no preference for X to take any particular value and φ (x )= indicates i i i 1 that the system always prefers Xi to take the value xi. The φij (xi,xj) functions encode the pairwise potentials of the network; the preference of nodes Xi and Xj to both be in states xi and xj. As φij (xi,xj) grows higher we would expect to observe Xj = xj whenever Xi = xi. Note that the potential functions are not identified; we can multiply both φi(xi) or φij (xi,xj) with some constant for all possible outcomes of xi, in which case this constant becomes a constant multiplier to (8.2) and is cancelled out in the normalizing constant Z. A typical identification constraint on the potential functions is to set the marginal geometric means of all outcomes equal to 1; over all possible outcomes of each argument, the logarithm of each potential function should sum to 0: ln φ (x )= ln φ (x ,x )= ln φ (x ,x )=0 x ,x (8.3) i i ij i j ij i j 8 i j x x x Xi Xi Xj in which denotes the sum over all possible realizations for X , and xi i xj denotes the sum over all possible realizations of Xj.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages31 Page
-
File Size-