Tilburg University Using Scalability Coefficients and Conditional

Tilburg University Using scalability coefficients and conditional association to assess monotone homogeneity Straat, J.H. Publication date: 2012 Link to publication in Tilburg University Research Portal Citation for published version (APA): Straat, J. H. (2012). Using scalability coefficients and conditional association to assess monotone homogeneity. Ridderprint. General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal Take down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. Download date: 29. sep. 2021 Using Scalability Coefficients and Conditional Association to Assess Monotone Homogeneity Printed by Ridderprint BV, Ridderkerk ISBN/EAN: 978-90-5335-598-5 Copyright c 2012 J. Hendrik Straat. All rights reserved. Using Scalability Coefficients and Conditional Association to Assess Monotone Homogeneity Proefschrift ter verkrijging van de graad van doctor aan Tilburg University op gezag van de rector magnificus, prof. dr. Ph. Eijlander, in het openbaar te verdedigen ten overstaan van een door het college voor promoties aangewezen commissie in de aula van de Universiteit op vrijdag 23 november 2012 om 14.15 uur door Johannes Hendrikus Straat geboren op 24 juli 1983 te Eindhoven Promotiecommissie: Promotor: prof. dr. K. Sijtsma Copromotor: dr. L. A. van der Ark Overige leden: prof. dr. J. K. L. Denollet prof. dr. W. J. Heiser prof. dr. R. R. Meijer dr. A. A. Béguin dr. W. H. M. Emons Contents 1 Introduction 1 1.1 Outline of the thesis . .6 2 Comparing Optimization Algorithms for Item Selection in Mokken Scale Analysis 9 2.1 Introduction . 10 2.2 Mokken's Nonparametric Item Response Theory . 13 2.3 New Item Selection Methods . 16 2.4 Comparing Three Item Selection Methods . 18 2.5 Real Data Example . 27 2.6 Discussion . 28 Appendix 1 . 30 Appendix 2 . 31 3 Multi-Method Analysis of the Internal Structure of the Type D Scale-14 (DS14) 37 3.1 Introduction . 38 3.2 Factor Analysis Results for Type D . 41 3.3 Mokken Scale Analysis . 45 3.4 Method . 47 3.5 Results . 49 3.6 Discussion . 54 4 Methodological Artifacts in Dimensionality Assessment of the Hospital Anxiety and Depression Scale (HADS) 57 4.1 Introduction . 58 4.2 Hierarchical Structure of Psychological Attributes . 61 4.3 Method . 63 4.4 Results . 67 4.5 Discussion . 69 5 Minimum Sample Size Requirements for Mokken Scale Analysis 73 5.1 Introduction . 74 5.2 Monotone Homogeneity Model . 75 5.3 Method . 78 5.4 Results . 81 5.5 Discussion . 83 Appendix . 84 6 Using Conditional Association to Identify Locally Independent Item Sets 85 6.1 Introduction . 86 6.2 Conditional Association . 88 6.3 A Procedure to Identify a Locally Independent Item Set . 93 6.4 Comparison of Methods Assessing Fit to UMLVM . 95 6.5 Empirical Example: The Type D Scale-14 . 99 6.6 Discussion . 102 Appendix . 103 7 Epilogue 107 References 111 Summary 127 Samenvatting 131 Dankwoord 137 Chapter 1 Introduction In the social and behavioral sciences, researchers commonly use tests and questionnaires to measure attributes such as cognitive abilities including aspects of intelligence, personality traits, and attitudes. Measurement of these attributes is liable to more random measurement error than measurement of, for example, temperature or distance and also to systematic but undesirable influences, such as social desirability, tiredness, and cheating. Consequently, a single item does not provide a reliable and valid measurement of the attribute and researchers have to construct multiple items to control for random measurement error and to cover different aspects of the attribute well. The responses to the items contain information about the attribute that a researcher intends to measure. If the items have been constructed adequately, test takers with a higher attribute score (e.g., being more verbally intelligent, more extravert, or having a more positive attitude towards euthanasia) are expected to score higher on an item than test takers with lower attribute scores. Researchers are predominantly interested in the positions of test takers on the scale for the attribute of interest rather than their scores on single items. Psychometric measurement models summarize a pattern of item scores into a score on a latent variable that represents the attribute. In validity research, researchers have to evaluate whether the latent variable is a valid representation of the attribute of interest and whether the latent variable covers all relevant aspects of the attribute. Measurement models restrict the relation of the item scores and the latent variable. A general class of measurement models is item 1 2 response theory (IRT; Embretson & Reise, 2000; Van der Linden & Hambleton, 1997). The evaluation of the nonparametric IRT model known as the monotone homogeneity model (Mokken, 1971; Molenaar, 1997) in real data is the central theme of this study. The monotone homogeneity model is also known as the unidimensional monotone latent variable model (Holland & Rosenbaum, 1986). Three assumptions define the monotone homogeneity model: unidimensionality, local independence, and monotonicity. The unidimensionality assumption posits that the items in the test measure only one latent variable. This assumption reflects the ideal that items should measure one attribute so as to simplify the test performance's interpretation. The local independence assumption posits that the unidimensional latent variable is the only source of association between the items, so that the multivariate, distribution of the item scores conditional on the latent variable equals the product of the univariate, conditional distributions. The local independence assumption implies that for test takers with the same score on the latent variable, the item scores are independent, hence local independence. In combination with unidimensionality, local independence ascertains that the test measures one latent variable and nothing else. The monotonicity assumption encompasses the intuitively appealing idea that the expected item score is a monotone nondecreasing function of the latent variable. This means that as the latent-variable value increases, for each item the expected score remains the same or increases. Many well-known parametric IRT models assume unidimensionality and local independence and assume a parametric function such as the logistic to describe the relation between the expected item score and the latent variable. These parametric IRT models are special cases of the nonparametric monotone homogeneity model that only restricts the relation between the expected item score and the latent variable to be nondecreasing. Examples of parametric IRT models for dichotomous items are the Rasch model (Rasch, 1960), the 2-parameter logistic model and the 3-parameter logistic model (Birnbaum, 1968), and examples of parametric IRT models for polytomous items are the partial credit model (Masters, 1982), and the graded response model (Samejima, 1969). The parametric IRT models reject items that have monotone relations with the latent variable that are not logistic. Because the monotone homogeneity model is less stringent with respect to the relation between the 3 item scores and the latent variable, if different models could select items from a large pool then the monotone homogeneity model would select more items than parametric IRT models that are special cases of the former model. Including many items in a scale may be considered to be a desirable property of a measurement model. Moreover, because the unidimensional parametric IRT models are special cases of the monotone homogeneity model, assessment of the fit of the monotone homogeneity model to the data also provides information about a parametric model's data fit. An important question is how the latent variable can summarize a person's pattern of item scores. The monotone homogeneity model justifies the use of the easily interpretable total score, which is the unweighed sum of the item scores, as an ordinal estimator of the latent variable. Thus, test takers with a higher total score on average have a higher score on the latent variable than test takers with a lower total score. Grayson (1988; also, Huynh, 1994) showed that the ordering of the latent variable by the total score holds for dichotomous items, but Hemker, Sijtsma, Molenaar, and Junker (1997) proved that for polytomous items the total score strictly is not an ordinal estimator of the latent variable. However, Van der Ark (2005) demonstrated that violations of ordinal measurement of the latent variable are rare. Moreover, Van der Ark and Bergsma (2010) proved that for polytomous items a weaker form of the ordering of the latent variable by the total score holds. Hence, the monotone homogeneity model suffices as a measurement model when the measurement purpose requires ordinal measurement, for example, to identify the most capable applicants in personnel selection (ranking of total scores) or patients who need a particular treatment more than others (dichotomization of the total-score scale). This thesis discusses methods that use observable consequences of the monotone homogeneity model to assess the fit of the model to the data and the measurement quality of items. Observable consequences provide necessary but not sufficient conditions for the measurement model. Hence, observable consequences are particularly useful to investigate whether one or more items are inconsistent with the monotone homogeneity model. The property of conditional association (Holland & Rosenbaum, 1986; Rosenbaum, 1984) defines a large set of observable consequences of the monotone homogeneity model.

Tilburg University Using Scalability Coefficients and Conditional

Psychometric Analysis of the Flow Short Scale Translated to Finnish

A Tutorial on How to Do a Mokken Scale Analysis on Your Test and Questionnaire Data Klaas Sijtsma1* and L

Examining Invariant Item Ordering Using Mokken Scale Analysis for Polytomously Scored Items*

Investigating Invariant Item Ordering Using Mokken Scale Analysis for Dichotomously Scored Items

An Instructional Module on Mokken Scale Analysis

A Mokken Scale to Assess Secondary Pupils' Experience of Violence In

Using Mokken Scaling to Compare the Italian and English Versions of the MOLES Index

Item Response Theory: How Mokken Scaling Can Be Used in Clinical Practice

An Instructional Module on Mokken Scale Analysis

A Tutorial on How to Do a Mokken Scale Analysis on Your Test and Questionnaire Data

Package 'Mokken'

Measuring Unobserved Variables in Educational Inequality Research: Mathematics and Language ‘Abilities’ in Early Childhood