Cross-Classified and Multiple Membership Structures in Multilevel Models: an Introduction and Review
Research Report RR791

Antony Fielding, University of Birmingham
Harvey Goldstein, University of Bristol

The views expressed in this report are the authors’ and do not necessarily reflect those of the Department for Education and Skills.

© University of Birmingham 2006
ISBN 1 84478 797 2

ACKNOWLEDGEMENTS

The authors of this report would like to thank all the members and visiting fellows of the Centre for Multilevel Modelling, Graduate School of Education, University of Bristol, for the many discussions in the area which have informed this review. We are also extremely grateful to Professor Hywel Thomas and Dr Ian Davison of the University of Birmingham School of Education. The former effectively co-directed the project of which this review forms a part and was a constant source of advice. The latter undertook valuable literature searches. Finally, we are very grateful to the DfES researchers and officials for their patience and invaluable guidance.

CONTENTS

1 Introduction
2 Basic Multilevel Modelling in Hierarchical Social Structures
  2.1 Explanatory Models Using Multiple Regression
  2.2 Hierarchical Data Structures And Multilevel Models
  2.3 Basic Ideas Through An Example
  2.4 An Example Of A Basic Two Level Variance Components Model
  2.5 Extending Hierarchical Models
3 Cross-Classified Data Structures
  3.1 The Nature Of Cross-Classifications And Their Effects
  3.2 Some Objectives Of Analysing Cross-Classified Multilevel Models
    3.2.1 Improving the quality of estimates of explanatory variable effects
    3.2.2 Identifying components of variance in the outcomes
    3.2.3 The study of differential effect
    3.2.4 Estimating level 2 effects
4 Further Examples of Cross-Classified Structures And Their Analysis
  4.1 Some Examples In Education And Repeated Measures Studies
  4.2 Some Notation For Cross-Classified Models
  4.3 An Example Analysis: Sixteen Year Examination Performance
5 More Complex Structures: Multiple Membership
  5.1 The Idea of Multiple Membership
  5.2 Classification Diagrams And A More General Notation For Cross-Classified And Multiple Membership Structures
  5.3 Examples Of Application Of Multiple Membership And More Complex Structures
    5.3.1 Teachers, teaching groups and students in GCE Advanced Level Results (Fielding, 2002)
    5.3.2 Spatial models using multiple membership relation
6 Estimation Methodology And Software Issues
  6.1 Introduction
  6.2 Approaches To Estimating Complex Multilevel Models
  6.3 Software
  6.4 Brief Comments on Generalised Models For Discrete Responses
7 More Applications In The Literature And Potentiality For Similar Approaches in Education Research
  7.1 Health Research
  7.2 Survey Methodology And Interviewer Response Variance
  7.3 Social Networks
  7.4 Veterinary Epidemiology, Animal Ecology and Genetics
  7.5 Transportation research
  7.6 Missing identification of units
  7.7 Generalisability theory
  7.8 Psychometrics
  7.9 Further examples in education
8 Conclusion and Additional Comments
Appendix
References

1 Introduction

The aim of this report is partly to introduce, in a readable way, some of the key ideas of fairly recent statistical methodology for modelling data on complex social structures, including those in education. It reviews the ‘state of the art’ in the development of such methodology and its software implementation. It also considers a wide range of examples and published applications which are either drawn directly from education or suggest potential uses in that area.

Since many of the key ideas of statistical modelling of effects, and the necessity for statistical control of variables, are well established in traditional explanatory multiple regression, this is considered first. It establishes notions that are essential to understand as the statistical models become more complex. Section 2 then goes on to consider how data can arise from hierarchical structures, such as pupils within schools, and why standard regression models should be extended to encompass multilevel models. Examples from educational progress research are then considered to illustrate the applicability of such models and to introduce further major concepts such as variance components. A variety of relevant extensions and applications are then introduced to fix ideas further.

Section 3 then argues that hierarchical structures, and the models to handle them, are only the starting point for statistical modelling of complex reality. For instance, students are not only nested within schools but may also be lodged in a parallel hierarchy of area of residence which cuts across the school hierarchy.
The example of education production functions incorporating both school and area effects is given. Further examples follow, and cross-classified random effects models are then introduced as an appropriate way of handling data on such structures. We then examine some of the aims of such analyses.

By considering some fairly complex repeated measures designs, Section 4.1 reveals even more detailed structural complexity that falls into a cross-classified model framework. To formulate and understand the statistical aspects of the models, some fairly detailed structured algebraic notation is required. This is outlined in Section 4.2. The detailed examination of a published application and its results in Section 4.3 illustrates the variety of detailed answers to research questions which may be revealed. This example, which cross-classifies students by secondary school attended and by previous primary school, shows that achievement at secondary school may depend not only on the secondary school but also on large carry-over effects of the prior primary school.

Further complexity is introduced into models in Section 5 through the idea of multiple membership. For instance, in an educational setting students can attend more than one institution, so that a strict hierarchy of students within institutions is no longer applicable. Effects on a response variable may thus consist of contributions from more than one unit at the institutional level. It is shown that, by conceptualising these random effects as weighted contributions from these several units, the multilevel modelling framework may be further extended. Classification diagrams and a simplified notation are then discussed; together they form a heuristic way of grasping the essential features of such complex structures. Section 5 then concludes by considering detailed examples where the multiple membership ideas are seen in practical operation.
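
As a rough illustration of the weighted-contribution idea just described, the following minimal Python sketch computes the institutional contribution to one student's response as a weighted sum of unit effects. It is not taken from the report: the function name, the school labels, the effect values, and the convention that each student's membership weights sum to one are all assumptions made for the example.

```python
# Hypothetical illustration only: names, weights and effect values are
# assumptions for this sketch, not quantities from the report.

def multiple_membership_effect(weights, unit_effects):
    """Return the weighted sum of unit-level random effects for one student.

    weights: dict mapping unit id -> membership weight (assumed to sum to 1)
    unit_effects: dict mapping unit id -> random effect of that unit
    """
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("membership weights are assumed to sum to 1")
    return sum(w * unit_effects[unit] for unit, w in weights.items())


# Assumed school effects on the response scale.
school_effects = {"A": 2.0, "B": -1.0, "C": 0.5}

# A student who attended only school A (the strict hierarchical case).
single = multiple_membership_effect({"A": 1.0}, school_effects)

# A student who spent 75% of the time in school A and 25% in school B.
mover = multiple_membership_effect({"A": 0.75, "B": 0.25}, school_effects)

print(single, mover)
```

Under these assumed values the strict hierarchy appears as the special case of a single weight equal to one, while the student who moved receives a blend of the two school effects.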

It is also seen how units with multiple memberships may be combined with existing cross-classifications in illuminating ways. In an educational setting, a set of students may be crossed with a set of teaching groups for the purposes of studying their GCE A levels. The various A level grades are nested within a cross-classification of students and teaching groups. Teachers may make contributions to several groups, and each group may be handled by several teachers during its operation. By conceptualising each grade response as being in multiple membership relation with the set of teachers, alongside a crossing of students and groups, it is shown how the model framework enables the disentangling of the separate effects of individual student characteristics, group features, and teachers.

Section 6 discusses the contrasting approaches that are taken to the estimation and statistical fitting of the complex models that have been discussed. In particular, the focus is on two contrasted approaches: maximum likelihood (ML) and Markov chain Monte Carlo (MCMC). The wide range of statistical software which has facilities for handling the model frameworks is also briefly evaluated. Some crucial features of MLwiN, which is the most widely used in the UK research community, are outlined.

The penultimate Section 7 considers a range of quite complex applications that have appeared in the literature. This comprehensive, up-to-date review covers the following areas: health research, survey methodology and interviewer variance, social networks, veterinary epidemiology, animal ecology, genetics, transportation, missing unit identification, generalisability theory, psychometrics and additional education applications. Where possible attention is drawn to parallel structures in education where some of the methodology of