Guidelines for Reporting Reliability and Agreement Studies (GRRAS) Were Proposed Jan Kottnera,*, Laurent AudigEb, Stig Brorsonc, Allan Donnerd, Byron J

Total Page:16

File Type:pdf, Size:1020Kb

Guidelines for Reporting Reliability and Agreement Studies (GRRAS) Were Proposed Jan Kottnera,*, Laurent AudigEb, Stig Brorsonc, Allan Donnerd, Byron J Journal of Clinical Epidemiology 64 (2011) 96e106 Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed Jan Kottnera,*, Laurent Audigeb, Stig Brorsonc, Allan Donnerd, Byron J. Gajewskie, Asbjørn Hro´bjartssonf, Chris Robertsg, Mohamed Shoukrih, David L. Streineri aDepartment of Nursing Science, Centre for Humanities and Health Sciences, Charite-Universita ¨tsmedizin Berlin, Berlin, Germany bAO Clinical Investigation and Documentation, Du¨bendorf, Switzerland cDepartment of Orthopaedic Surgery, Herlev University Hospital, Herlev, Denmark dDepartment of Epidemiology and Biostatistics, Schulich School of Medicine and Dentistry, The University of Western Ontario, London, Ontario, Canada eDepartment of Biostatistics, University of Kansas Schools of Medicine & Nursing, Kansas City, KS, USA fThe Nordic Cochrane Centre, Rigshospitalet, Copenhagen, Denmark gSchool of Community Based Medicine, The University of Manchester, Manchester, UK hDepartment of Biostatistics, Epidemiology and Scientific Computing, King Faisal Specialist Hospital and Research Center, The University of Western Ontario, London, Ontario, Canada iDepartment of Psychiatry, University of Toronto, Toronto, Ontario, Canada Accepted 2 March 2010 Abstract Objective: Results of reliability and agreement studies are intended to provide information about the amount of error inherent in any diagnosis, score, or measurement. The level of reliability and agreement among users of scales, instruments, or classifications is widely unknown. Therefore, there is a need for rigorously conducted interrater and intrarater reliability and agreement studies. Information about sample selection, study design, and statistical analysis is often incomplete. Because of inadequate reporting, interpretation and synthesis of study results are often difficult. Widely accepted criteria, standards, or guidelines for reporting reliability and agreement in the health care and medical field are lacking. The objective was to develop guidelines for reporting reliability and agreement studies. Study Design and Setting: Eight experts in reliability and agreement investigation developed guidelines for reporting. Results: Fifteen issues that should be addressed when reliability and agreement are reported are proposed. The issues correspond to the headings usually used in publications. Conclusion: The proposed guidelines intend to improve the quality of reporting. Ó 2011 Elsevier Inc. All rights reserved. Keywords: Agreement; Guideline; Interrater; Intrarater; Reliability; Reproducibility 1. Background conceptually distinct [7e12]. Reliability may be defined as the ratio of variability between subjects (e.g., patients) or ob- Reliability and agreement are important issues in classifi- jects (e.g., computed tomography scans) to the total variabil- cation, scale and instrument development, quality assurance, e ity of all measurements in the sample [1,6]. Therefore, and in the conduct of clinical studies [1 3]. Results of reliability can be defined as the ability of a measurement to reliability and agreement studies provide information about differentiate between subjects or objects. On the other hand, the amount of error inherent in any diagnosis, score, or agreement is the degree to which scores or ratings are iden- measurement, where the amount of measurement error deter- tical. Both concepts are important, because they provide mines the validity of the study results or scores [1,3e6]. information about the quality of measurements. Further- The terms ‘‘reliability’’ and ‘‘agreement’’ are often more, the study designs for examining the two concepts are used interchangeably. However, the two concepts are similar. We focus on two aspects of these concepts: e Interrater agreement/reliability (different raters, using * Corresponding author. Department of Nursing Science, Centre for the same scale, classification, instrument, or proce- Humanities and Health Sciences, Charite-Universitatsmedizin Berlin, ¨ dure, assess the same subjects or objects) Augustenburger Platz 1, 13353 Berlin, Germany. Tel.: þ49-30-450-529- e 054; fax: þ49-30-450-529-900. Intrarater agreement/reliability (also referred to as E-mail address: [email protected] (J. Kottner). testeretest) (the same rater, using the same scale, 0895-4356/$ - see front matter Ó 2011 Elsevier Inc. All rights reserved. doi: 10.1016/j.jclinepi.2010.03.002 J. Kottner et al. / Journal of Clinical Epidemiology 64 (2011) 96e106 97 widely accepted formal criteria, standards, or guidelines What is new? for reliability and agreement reporting in the health care and medical fields are lacking [2,22,23]. Authors of reviews Key finding have often established their own criteria for data extraction e Reporting of interater/intrarater reliability and and quality assessment [12,15,17,19,21e25]. The Ameri- agreement is often incomplete and inadequate. can Educational Research Association, American Psycho- logical Association (APA), and the National Council on e Widely accepted criteria, standards, or guide- Measurement in Education have attempted to improve the lines for reliability and agreement reporting in reporting of reliability and agreement, but they focus on the health care and medical fields are lacking. psychological testing [26]. e Fifteen issues that should be addressed when reli- ability and agreement are reported are proposed. 2. Project What this adds to what is known e There is a need for rigorous interrater and intra- In the absence of standards for reporting reliability and rater reliability and agreement studies to be agreement studies in the medical field, we evolved the idea performed in the future. that formal guidelines might be useful for researchers, authors, reviewers, and journal editors. The lead author e Systematic reviews and meta-analysis of reli- initially contacted 13 experts in reliability and agreement ability and agreement studies will become investigation and asked whether they saw a need for such more frequent. guidelines and whether they wished to take part in this project. The experts were informally identified based on What is the implication and what should change substantial contributions and publications in the field of now? agreement and reliability. Each expert was also asked e Reporting of reliability and agreement studies whether he knew an additional expert who would be inter- should be improved. ested in participating. Finally, from all these individuals, e The proposed guidelines help to improve an international group of eight researchers, who were reporting. experienced in instrument development and evaluation (L.A., D.L.S., S.B., and A.H.), reliability and agreement estimation (A.D., B.J.G., C.R., and M.S.), or in systematic reviews (L.A., S.B., and A.H.) of reliability studies, was classification, instrument, or procedure, assesses the formed. same subjects or objects at different times). The objective was to develop guidelines for reporting e Issues regarding internal consistency are not ad- reliability and agreement studies. The specific aim was to dressed here. establish items that should be addressed when reliability When reporting the results of reliability and agreement and agreement studies are reported. studies, it is necessary to provide information sufficient to The development process contained elements of Glaser’s understand how the study was designed and conducted and state-of-the-art method [27] and of the nominal group how the results were obtained. Reliability and agreement technique [28]. Based on an extensive literature review, are not fixed properties of measurement tools but, rather, the project coordinator (J.K.) produced a first draft of are the product of interactions between the tools, the sub- guidelines, including an initial list of possible items. It jects/objects, and the context of the assessment. was sent to all team members, and they were asked to Reliability and agreement estimates are affected by vari- review, comment, change the document and wording, and ous sources of variability in the measurement setting to indicate whether they agree or disagree. Feedback indi- (e.g., rater and sample characteristics, type of instrument, cated that there was a need for clarification of key concepts administration process) and the statistical approach (e.g., that are related to reliability and agreement studies. There- assumptions concerning measurement level, statistical fore, definitions of key concepts were discussed among model). Therefore, study results are only interpretable the team members, and definitions were established when the measurement setting is sufficiently described (Appendix). Based on the comments from the first review, and the method of calculation or graphical presentation a second draft was created and again sent to team members, is fully explained. accompanied by a summary report of all criticisms, discus- After reviewing many reliability and agreement studies, sions, and suggestions that had been made. Based on the it becomes apparent that important information about the critiques of the second draft, a third draft was created, study design and statistical analysis is often incomplete reviewed, and discussed again. After the review of the [2,12e21]. Because of inadequate reporting, interpretation fourth draft, consensus was achieved, and team members and synthesis of the results are often difficult. Moreover, approved these guidelines in their current form. 98 J. Kottner et al. / Journal of Clinical Epidemiology 64 (2011) 96e106 Table 1 Guidelines for Reporting Reliability
Recommended publications
  • Volume Quantification by Contrast-Enhanced Ultrasound
    Herold et al. Cardiovascular Ultrasound 2013, 11:36 http://www.cardiovascularultrasound.com/content/11/1/36 CARDIOVASCULAR ULTRASOUND RESEARCH Open Access Volume quantification by contrast-enhanced ultrasound: an in-vitro comparison with true volumes and thermodilution Ingeborg HF Herold1*, Gianna Russo2, Massimo Mischi2, Patrick Houthuizen3, Tamerlan Saidov2, Marcel van het Veer3, Hans C van Assen2 and Hendrikus HM Korsten1,2 Abstract Background: Contrast-enhanced ultrasound (CEUS) has recently been proposed as a minimally- invasive, alternative method for blood volume measurement. This study aims at comparing the accuracy of CEUS and the classical thermodilution techniques for volume assessment in an in-vitro set-up. Methods: The in-vitro set-up consisted of a variable network between an inflow and outflow tube and a roller pump. The inflow and outflow tubes were insonified with an ultrasound array transducer and a thermistor was placed in each tube. Indicator dilution curves were made by injecting indicator which consisted of an ultrasound- contrast-agent diluted in ice-cold saline. Both acoustic intensity- and thermo-dilution curves were used to calculate the indicator mean transit time between the inflow and outflow tube. The volumes were derived by multiplying the estimated mean transit time by the flow rate. We compared the volumes measured by CEUS with the true volumes of the variable network and those measured by thermodilution by Bland-Altman and intraclass-correlation analysis. Results: The measurements by CEUS and thermodilution showed a very strong correlation (rs=0.94) with a modest volume underestimation by CEUS of −40 ± 28 mL and an overestimation of 84 ± 62 mL by thermodilution compared with the true volumes.
    [Show full text]
  • On the Sampling Variance of Intraclass Correlations and Genetic Correlations
    Copyright 1998 by the Genetics Society of America On the Sampling Variance of Intraclass Correlations and Genetic Correlations Peter M. Visscher University of Edinburgh, Institute of Ecology and Resource Management, Edinburgh EH9 3JG, Scotland Manuscript received February 3, 1997 Accepted for publication March 23, 1998 ABSTRACT Widely used standard expressions for the sampling variance of intraclass correlations and genetic correlation coef®cients were reviewed for small and large sample sizes. For the sampling variance of the intraclass correlation, it was shown by simulation that the commonly used expression, derived using a ®rst-order Taylor series performs better than alternative expressions found in the literature, when the between-sire degrees of freedom were small. The expressions for the sampling variance of the genetic correlation are signi®cantly biased for small sample sizes, in particular when the population values, or their estimates, are close to zero. It was shown, both analytically and by simulation, that this is because the estimate of the sampling variance becomes very large in these cases due to very small values of the denominator of the expressions. It was concluded, therefore, that for small samples, estimates of the heritabilities and genetic correlations should not be used in the expressions for the sampling variance of the genetic correlation. It was shown analytically that in cases where the population values of the heritabili- ties are known, using the estimated heritabilities rather than their true values to estimate the genetic correlation results in a lower sampling variance for the genetic correlation. Therefore, for large samples, estimates of heritabilities, and not their true values, should be used.
    [Show full text]
  • HERITABILITY ESTIMATES from TWIN STUDIES I. Formulae of Heritability Estimates
    HERITABILITY ESTIMATES FROM TWIN STUDIES I. Formulae of Heritability Estimates K.W. KANG. J.C. CHRISTIAN, J.A. NORTON, Jr. Departments of Medical Genetics and Psychiatry, Indiana University School of Medicine, Indianapolis, Indiana, USA Over the past 50 years a large number of methods have been proposed for estimating herita­ bility from twin studies. The present paper describes the most commonly cited of these esti­ mates as a first step in evaluating their usefulness. A critical review will then follow. Family studies of human quantitative traits have, in general, three goals: first, to determine whether a given phenotypic trait is genetically influenced; second, to determine if discrete segregating genetic factors influence the trait; and third, to determine whether the trait is linked with some other genetic trait. Twin data are often used to attain the first goal by measuring the relative importance of heredity and environment on the development of quan­ titative traits. The concept of heritability originated in an attempt to describe the degree to which the differences actually observed between individuals arose from differences in genetic makeup between individuals as contrasted to the effects of different environmental forces. For a review of the origin of the concept of heritability, see " Heritability in retro­ spect " by Bell (1977). The term heritability is used in both a broad and narrow sense. For the broad sense, the genotype is considered as a unit in relation to the environment. Genes come together in new combinations exhibiting intraallelic interaction (dominance) and interallelic interaction (epistasis). Heritability in the narrow sense considers only the additive portion of the genetic variability in relation to the phenotypic variability.
    [Show full text]
  • Title: Assessing Test-Retest Reliability of Psychological Measures: Persistent
    Title: Assessing test-retest reliability of psychological measures: persistent methodological problems Running header: Reliability of psychological measures Victoria K. Aldridge1, Terence M. Dovey2, & Angie Wade1 1Clinical Epidemiology, Nutrition and Biostatistics Section UCL Great Ormond Street Institute of Child Health, 30 Guilford Street London WC1N 1EH, UK 2Department of Psychology Marie Jahoda Building Brunel University Kingston Lane Uxbridge Middlesex UB8 3PH, UK *Author for Correspondence ([email protected]) Health and Life Sciences De Montfort University Leicester LE1 9BH, UK +44 116 2078158 This version of the article may not completely replicate the final version published in European Psychologist. It is not the version of record and is therefore not suitable for citation. Final published version of this manuscript: https://doi.org/10.1027/1016-9040/a000298 1 Abstract Psychological research and clinical practice relies heavily on psychometric testing for measuring psychological constructs that represent symptoms of psychopathology, individual difference characteristics, or cognitive profiles. Test-retest reliability assessment is crucial in the development of psychometric tools, helping to ensure that measurement variation is due to replicable differences between people regardless of time, target behaviour, or user profile. While psychological studies testing the reliability of measurement tools are pervasive in the literature, many still discuss and assess this form of reliability inappropriately with regard to the specified
    [Show full text]
  • Inference from Complex Samples
    Inference from Complex Samples By LESLIE KISH and MARTIN RICHARD FRANKEL The University ofMichigan and The University ofChicago [Read before the ROYAL STATISTICAL SOCIETY at a meeting organized by the RESEARCH SECTION on Wednesday, October 17th, 1973, Professor J. GANI in the Chair] SUMMARY The design of complex samples induces correlations between element values. In stratification negative correlation reduces the variance; but that gain is less for subclass means, and even less for their differences and for complex statistics. Clustering induces larger and positive correlations between element values. The resulting increase in variance is measured by the ratio deff, and is often severe. This is reduced but persists for subclass means, their differences, and for analytical statistics. Three methods for computing variances are compared in a large empirical study. The results are encouraging and useful. Keywords: CLUSTERS; COMPLEX SAMPLE; SAMPLING ERROR; DESIGN EFFECT; BRR; JACKKNIFE; INFERENCE; STANDARD ERROR; REPLICATION; INTRACLASS CORRELATION; SAMPLE DESIGN; SAMPLING VARIANCE; SUBCLASS ANALYSIS; STRATIFICATION; REPLICATION 1. INTRODUCTION STANDARD statistical methods have been developed on the assumption of simple random sampling. The assumption of the independent selection of elements (hence independence of observations) greatly facilitates obtaining theoretical results of interest. It is essential for most measures of reliability used in probability statements, such as aNn, chi-squared contingency tests, analysis of variance, the nonparametric literature and standarderrors for regression coefficients. Assumptions ofindependence yield the mathematical simplicity that becomes more desirable-and at present necessary-as we move from simple statistics such as means, to the complex statistics typified by regression analysis. Independence is often assumed automatically and needlessly, even when its relaxation would permit broader conclusions.
    [Show full text]
  • We Need to Talk About Reliability: Making Better Use of Test-Retest Studies for Study Design and Interpretation
    We need to talk about reliability: making better use of test-retest studies for study design and interpretation Granville J. Matheson Department of Clinical Neuroscience, Center for Psychiatry Research, Karolinska Institutet and Stockholm County Council, Stockholm, Sweden ABSTRACT Neuroimaging, in addition to many other fields of clinical research, is both time- consuming and expensive, and recruitable patients can be scarce. These constraints limit the possibility of large-sample experimental designs, and often lead to statistically underpowered studies. This problem is exacerbated by the use of outcome measures whose accuracy is sometimes insufficient to answer the scientific questions posed. Reliability is usually assessed in validation studies using healthy participants, however these results are often not easily applicable to clinical studies examining different populations. I present a new method and tools for using summary statistics from previously published test-retest studies to approximate the reliability of outcomes in new samples. In this way, the feasibility of a new study can be assessed during planning stages, and before collecting any new data. An R package called relfeas also accompanies this article for performing these calculations. In summary, these methods and tools will allow researchers to avoid performing costly studies which are, by virtue of their design, unlikely to yield informative conclusions. Subjects Neuroscience, Psychiatry and Psychology, Radiology and Medical Imaging, Statistics Keywords Reliability, Positron Emission Tomography, Neuroimaging, Study design, R package, Power analysis, Intraclass correlation coefficient Submitted 27 September 2018 Accepted 7 April 2019 Published 24 May 2019 INTRODUCTION Corresponding author Granville J. Matheson, In the assessment of individual differences, reliability is typically assessed using test- [email protected], retest reliability, inter-rater reliability or internal consistency.
    [Show full text]
  • The Intraclass Covariance Matrix
    Behavior Genetics, Vol. 35, No. 5, September 2005 (Ó 2005) DOI: 10.1007/s10519-005-5877-1 The Intraclass Covariance Matrix Gregory Carey1,2 Received 03 March 2005—Final 10 May 2005 Introduced by C.R. Rao in 1945, the intraclass covariance matrix has seen little use in behavioral genetic research, despite the fact that it was developed to deal with family data. Here, I reintroduce this matrix, and outline its estimation and basic properties for data sets on pairs of relatives. The intraclass covariance matrix is appropriate whenever the research design or mathematical model treats the ordering of the members of a pair as random. Because the matrix has only one estimate of a population variance and covariance, both the observed matrix and the residual matrix from a fitted model are easy to inspect visually; there is no need to mentally average homologous statistics. Fitting a model to the intraclass matrix also gives the same log likelihood, likelihood-ratio (LR) v2, and parameter estimates as fitting that model to the raw data. A major advantage of the intraclass matrix is that only two factors influence the LR v2—the sampling error in estimating population parameters and the discrepancy between the model and the observed statistics. The more frequently used interclass covariance matrix adds a third factor to the v2—sampling error of homologous statistics. Because of this, the degrees of freedom for fitting models to an intraclass matrix differ from fitting that model to an interclass matrix. Future research is needed to establish differences in power—if any—between the interclass and the intraclass matrix.
    [Show full text]
  • Psychometrics Using Stata
    Psychometrics With Stata Introduction Psychometrics Using Stata Chuck Huber Senior Statistician StataCorp LP San Diego, CA C. Huber (StataCorp) July 27, 2012 1 / 127 Psychometrics With Stata Introduction Humor involves two risks May not be funny May offend Disclaimer The use of clinical depression and its diagnosis in this talk is not meant to make light of this very serious condition. The examples use a depression index and the examples are silly (and fictitious!). This is not meant to imply that depression is silly. C. Huber (StataCorp) July 27, 2012 2 / 127 Psychometrics With Stata Introduction Given the current state of the world economy, the staff at StataCorp began to worry about the emotional well-being of our users. In early 2010, the marketing department hired a consultant to design a depression index to assess our users. The index was pilot-tested on StataCorp employees to determine its psychometric properties. The index was then sent to 1000 randomly selected Stata users. (All data are simulated and this is completely fictitious!) C. Huber (StataCorp) July 27, 2012 3 / 127 Psychometrics With Stata The Pilot Study C. Huber (StataCorp) July 27, 2012 4 / 127 Psychometrics With Stata The Pilot Study C. Huber (StataCorp) July 27, 2012 5 / 127 Psychometrics With Stata The Pilot Study C. Huber (StataCorp) July 27, 2012 6 / 127 Psychometrics With Stata The Pilot Study Administered to all StataCorp employees Explore the psychometric properties of the index Descriptive Statistics Item Response Characteristics Reliability Validity Dimensionality and Exploratory Factor Analysis C. Huber (StataCorp) July 27, 2012 7 / 127 Psychometrics With Stata The Pilot Study Descriptive Statistics .
    [Show full text]
  • Confidence Interval Estimation for Intraclass Correlation Coefficient
    Journal of Modern Applied Statistical Methods Volume 8 | Issue 2 Article 16 11-1-2009 Confidence Interval Estimation for Intraclass Correlation Coefficient Under Unequal Family Sizes Madhusudan Bhandary Columbus State University, [email protected] Koji Fujiwara North Dakota State University, [email protected] Follow this and additional works at: http://digitalcommons.wayne.edu/jmasm Part of the Applied Statistics Commons, Social and Behavioral Sciences Commons, and the Statistical Theory Commons Recommended Citation Bhandary, Madhusudan and Fujiwara, Koji (2009) "Confidence Interval Estimation for Intraclass Correlation Coefficient Under Unequal Family Sizes," Journal of Modern Applied Statistical Methods: Vol. 8 : Iss. 2 , Article 16. DOI: 10.22237/jmasm/1257034500 Available at: http://digitalcommons.wayne.edu/jmasm/vol8/iss2/16 This Regular Article is brought to you for free and open access by the Open Access Journals at DigitalCommons@WayneState. It has been accepted for inclusion in Journal of Modern Applied Statistical Methods by an authorized editor of DigitalCommons@WayneState. Journal of Modern Applied Statistical Methods Copyright © 2009 JMASM, Inc. November 2009, Vol. 8, No. 2, 520-525 1538 – 9472/09/$95.00 Confidence Interval Estimation for Intraclass Correlation Coefficient Under Unequal Family Sizes Madhusudan Bhandary Koji Fujiwara Columbus State University North Dakota State University Confidence intervals (based on the χ 2 -distribution and (Z) standard normal distribution) for the intraclass correlation coefficient under unequal family sizes based on a single multinormal sample have been proposed. It has been found that the confidence interval based on the χ 2 -distribution consistently and reliably produces better results in terms of shorter average interval length than the confidence interval based on the standard normal distribution: especially for larger sample sizes for various intraclass correlation coefficient values.
    [Show full text]
  • Statistical Methods for Research Workers BIOLOGICAL MONOGRAPHS and MANUALS
    BIOLOGICAL MONOGRAPHS AND MANUALS General Editors: F. A. E. CREW, Edinburgh D. WARD CUTLER, Rothamsted No. V Statistical Methods for Research Workers BIOLOGICAL MONOGRAPHS AND MANUALS THE PIGMENTARY EFFECTOR SYSTEM By L. T. Hogben, London School of Economics. THE ERYTHROCYTE AND THE ACTION OF SIMPLE HlEl\IOLYSINS By E. Ponder, New York University. ANIMAL GENETICS: AN INTRODUCTION TO THE SCIENCE OF ANIMAL BREEDING By F. A. E. Crew, Edinburgh University. REPRODUCTION IN THE RABBIT By J. Hammon~, School of Agriculture, Cambridge. STATISTICAL METHODS FOR RESI<:ARCH WORKERS-· By R. A. Fisher, University of London. .., THE COMPARATIVE ANATOMY, HISTOLOGY, AND DE~ELOPMENT OF THIe PITUITARY BODY By G. R. de Beer, Oxford University. THE COMPOSITION AND DISTRIBUTION OF THE PROTOZOAN FAUNA OF THE SOIL By H. Sandon, Egyptian University, Cairo. THE SPECIES PROBLEM By G. C. Robson, British Museum (Natural History). • THE MIGRATION OF BUTTERFLIES By C. B. Williams, Rothamsted Experimental Station. GROWTH AND THE DEVELOP:\IENT OF MUTTON QUALITIES IN THE SHEEP By John Hammond, Cambridge. MATERNAL BEHAVIOUR IN THE RAT By Bertold P. Wiesner and Norah M. Sheard. And other volumes are in course 0/ puJilication. Statistical Methods for Research Workers BY R. A. FISHER, Sc.D., F.R.S. Formerly Fellow oj Conville and Caius Colle/;e, Cambridge Honorary Member, American Statistical Assoc~f!tl0!1-.._~ .. Calton Professor, ziver~¥YA'l0 LOI!,ffn I 1,.'· I, j FIFTH EDITION-REVISED AND ENLARGED - ANGRAU Central Library Hydl'rabad G3681 /11.1111.1111111111111111111111 1111 OL~C~r;-.~OYD EDINBURGH:" TWEEDDALE COURT LONDON: 33 PATERNOSTER ROW, E.C. I934 }IADE; IN OLIVER AND :R~AT BRITAIN BY OYD LTD ., EDINBURGH EDITORS' PREFACE THE increasing specialisation in biological inquiry has made it impossible for anyone author to deal adequately with current advances in knowledge.
    [Show full text]
  • Estimation and Inference of the Three-Level Intraclass Correlation Coefficient
    University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations 2014 Estimation and Inference of the Three-Level Intraclass Correlation Coefficient Matthew David Davis University of Pennsylvania, [email protected] Follow this and additional works at: https://repository.upenn.edu/edissertations Part of the Biostatistics Commons Recommended Citation Davis, Matthew David, "Estimation and Inference of the Three-Level Intraclass Correlation Coefficient" (2014). Publicly Accessible Penn Dissertations. 1252. https://repository.upenn.edu/edissertations/1252 This paper is posted at ScholarlyCommons. https://repository.upenn.edu/edissertations/1252 For more information, please contact [email protected]. Estimation and Inference of the Three-Level Intraclass Correlation Coefficient Abstract Since the early 1900's, the intraclass correlation coefficient (ICC) has been used to quantify the level of agreement among different assessments on the same object. By comparing the level of variability that exists within subjects to the overall error, a measure of the agreement among the different assessments can be calculated. Historically, this has been performed using subject as the only random effect. However, there are many cases where other nested effects, such as site, should be controlled for when calculating the ICC to determine the chance corrected agreement adjusted for other nested factors. We will present a unified framework to estimate both the two-level and three-level ICC for both binomial and multinomial outcomes. In addition, the corresponding standard errors and confidence intervals for both ICC measurements will be displayed. Finally, an example of the effect that controlling for site can have on ICC measures will be presented for subjects nested within genotyping plates comparing genetically determined race to patient reported race.
    [Show full text]
  • Robust Methods for Estimating the Intraclass Correlation Coefficient and for Analyzing Recurrent Event Data
    Robust Methods for Estimating the Intraclass Correlation Coefficient and for Analyzing Recurrent Event Data The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:39947185 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#LAA Robust Methods for Estimating the Intraclass Correlation Coefficient and for Analyzing Recurrent Event Data A dissertation presented by Tom Chen to The Department of Biostatistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biostatistics Harvard University Cambridge, Massachusetts September 2018 © 2018 Tom Chen All rights reserved. Dissertation Advisors: Tom Chen Rui Wang Eric Tchetgen Tchetgen Robust Methods for Estimating the Intraclass Correlation Coefficient and for Analyzing Recurrent Event Data Abstract Robust statistics have emerged as a family of theories and techniques for estimating parameters of a model while dealing with deviations from idealized assumptions. Examples of deviations include misspecification of parametric assumptions or missing data; while there are others, these two deviations are the focus of this work. When unaccounted for, naive analysis with existing techniques may lead to biased estimators and/or undercovered confidence intervals. At the same time, research poured into clustered/correlated data is extensive and a large body of methods have been developed. Many works have already connected topics within robust statistics and correlated data, but a plethora of open problems remain.
    [Show full text]