Confidence Interval Estimation for Intraclass Correlation Coefficient Under Unequal Family Sizes
Journal of Modern Applied Statistical Methods, Volume 8, Issue 2, Article 16, 11-1-2009

Madhusudan Bhandary, Columbus State University
Koji Fujiwara, North Dakota State University

Recommended Citation: Bhandary, Madhusudan and Fujiwara, Koji (2009) "Confidence Interval Estimation for Intraclass Correlation Coefficient Under Unequal Family Sizes," Journal of Modern Applied Statistical Methods: Vol. 8 : Iss. 2 , Article 16. DOI: 10.22237/jmasm/1257034500. Available at: http://digitalcommons.wayne.edu/jmasm/vol8/iss2/16

Copyright © 2009 JMASM, Inc. November 2009, Vol. 8, No. 2, 520-525. 1538-9472/09/$95.00

Confidence intervals (based on the χ²-distribution and (Z) standard normal distribution) for the intraclass correlation coefficient under unequal family sizes based on a single multinormal sample have been proposed. It has been found that the confidence interval based on the χ²-distribution consistently and reliably produces better results in terms of shorter average interval length than the confidence interval based on the standard normal distribution, especially for larger sample sizes, for various intraclass correlation coefficient values.
The coverage probability of the interval based on the χ²-distribution is competitive with the coverage probability of the interval based on the standard normal distribution. An example with real data is presented.

Key words: Z-distribution, χ²-distribution, intraclass correlation coefficient, confidence interval.

Introduction
Suppose it is required to estimate the correlation coefficient between blood pressures of children on the basis of measurements taken on p children in each of n families. The p measurements on a family provide p(p − 1) pairs of observations (x, y), x being the blood pressure of one child and y that of another. From the n families we generate a total of np(p − 1) pairs from which a typical correlation coefficient is computed. The correlation coefficient thus computed is called an intraclass correlation coefficient. Statistical inference concerning intraclass correlations is important because it provides information regarding blood pressure, cholesterol, etc. in a family within a particular race.

The intraclass correlation coefficient ρ has a wide variety of uses for measuring the degree of intrafamily resemblance with respect to characteristics such as blood pressure, cholesterol, weight, height, stature, lung capacity, etc. Several authors have studied statistical inference concerning ρ based on single multinormal samples (Scheffe, 1959; Rao, 1973; Rosner, et al., 1977, 1979; Donner & Bull, 1983; Srivastava, 1984; Konishi, 1985; Gokhale & SenGupta, 1986; SenGupta, 1988; Velu & Rao, 1990).

Donner and Bull (1983) discussed the likelihood ratio test for testing the equality of two intraclass correlation coefficients based on two independent multinormal samples under equal family sizes. Konishi and Gupta (1987) proposed a modified likelihood ratio test and derived its asymptotic null distribution. They also discussed another test procedure based on a modification of Fisher’s Z-transformation following Konishi (1985).
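The pairwise construction described in the first paragraph of the Introduction can be illustrated with a minimal Python sketch. It forms every ordered pair of distinct members within each family and computes the ordinary product-moment correlation over all pairs. The function name `pairwise_icc` and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import permutations
from math import sqrt


def pairwise_icc(families):
    """Pairwise intraclass correlation from a list of families.

    Each family is a list of measurements. Every ordered pair (x, y) of
    distinct members is formed, giving p(p - 1) pairs for a family of
    size p, and the Pearson correlation is computed over all pairs.
    Returns the correlation and the number of pairs used.
    """
    pairs = [pair for fam in families for pair in permutations(fam, 2)]
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy), len(pairs)


# Toy data (hypothetical): n = 3 families with p = 3 members each,
# so n * p * (p - 1) = 18 ordered pairs are formed.
r, n_pairs = pairwise_icc([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(r, n_pairs)
```

Because each pair enters in both orders, the x and y margins are identical, which is what distinguishes this coefficient from an ordinary interclass correlation.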
Huang and Sinha (1993) considered an optimum invariant test for the equality of intraclass correlation coefficients under equal family sizes for more than two intraclass correlation coefficients based on independent samples from several multinormal distributions.

For unequal family sizes, Young and Bhandary (1998) proposed a likelihood ratio test, a large sample Z-test and a large sample Z*-test for the equality of two intraclass correlation coefficients based on two independent multinormal samples. For several populations and unequal family sizes, Bhandary and Alam (2000) proposed the likelihood ratio and large sample ANOVA tests for the equality of several intraclass correlation coefficients based on several independent multinormal samples. Donner and Zou (2002) proposed an asymptotic test for the equality of dependent intraclass correlation coefficients under unequal family sizes. None of the above authors, however, derived any confidence interval estimator for intraclass correlation coefficients under unequal family sizes. In this article, confidence interval estimators for intraclass correlation coefficients are considered based on a single multinormal sample under unequal family sizes, and the analyses are conditional, assuming family sizes are fixed though unequal.

It could be of interest to estimate the blood pressure or cholesterol or lung capacity for families in American races. Therefore, an interval estimator for the intraclass correlation coefficient under unequal family sizes must be developed. To address this need, this paper proposes two confidence interval estimators for the intraclass correlation coefficient under unequal family sizes, and these interval estimators are compared using simulation techniques.

Madhusudan Bhandary is an Associate Professor in the Department of Mathematics. Koji Fujiwara is a graduate student in the Department of Statistics.

Methodology
Proposed Confidence Intervals: Interval Based on the Standard Normal Distribution
Consider a random sample of k families. Let

\[
\underset{\sim}{x}_i = (x_{i1}, x_{i2}, \ldots, x_{ip_i})'
\]

be a $p_i \times 1$ vector of observations from the $i$th family; $i = 1, 2, \ldots, k$. The structure of the mean vector and the covariance matrix for the familial data is given by the following (Rao, 1973):

\[
\underset{\sim}{\mu}_i = \mu \underset{\sim}{1}_i
\]

and

\[
\Sigma_i = \sigma^2
\begin{pmatrix}
1 & \rho & \cdots & \rho \\
\rho & 1 & \cdots & \rho \\
\vdots & \vdots & \ddots & \vdots \\
\rho & \rho & \cdots & 1
\end{pmatrix}_{p_i \times p_i},
\tag{2.1}
\]

where $\underset{\sim}{1}_i$ is a $p_i \times 1$ vector of 1's, $\mu$ $(-\infty < \mu < \infty)$ is the common mean and $\sigma^2$ $(\sigma^2 > 0)$ is the common variance of members of the family and $\rho$, which is called the intraclass correlation coefficient, is the coefficient of correlation among the members of the family and

\[
\max_{1 \le i \le k} \left( -\frac{1}{p_i - 1} \right) \le \rho \le 1.
\]

It is assumed that $\underset{\sim}{x}_i \sim N_{p_i}(\underset{\sim}{\mu}_i, \Sigma_i)$; $i = 1, \ldots, k$, where $N_{p_i}$ represents a $p_i$-variate normal distribution and $\underset{\sim}{\mu}_i$, $\Sigma_i$'s are defined in (2.1). Let

\[
\underset{\sim}{u}_i = (u_{i1}, u_{i2}, \ldots, u_{ip_i})' = Q \underset{\sim}{x}_i,
\tag{2.2}
\]

where Q is an orthogonal matrix. Under the orthogonal transformation (2.2), it can be shown that $\underset{\sim}{u}_i \sim N_{p_i}(\underset{\sim}{\mu}_i^*, \Sigma_i^*)$; $i = 1, \ldots, k$, where

\[
\underset{\sim}{\mu}_i^* = (\mu, 0, \ldots, 0)'_{p_i \times 1}
\]

and

\[
\Sigma_i^* = \sigma^2
\begin{pmatrix}
\eta_i & 0 & \cdots & 0 \\
0 & 1 - \rho & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1 - \rho
\end{pmatrix}
\]

and $\eta_i = p_i^{-1}\{1 + (p_i - 1)\rho\}$. The transformation used on the data from $\underset{\sim}{x}$ to $\underset{\sim}{u}$ above is independent of $\rho$. Helmert’s orthogonal transformation can also be used.

Srivastava (1984) gave estimators of $\rho$ and $\sigma^2$ under unequal family sizes which are good substitutes for the maximum likelihood estimators and are given by the following:

\[
\begin{aligned}
\hat{\rho} &= 1 - \frac{\hat{\gamma}^2}{\hat{\sigma}^2}, \quad \text{where} \\
\hat{\sigma}^2 &= (k - 1)^{-1} \sum_{i=1}^{k} (u_{i1} - \hat{\mu})^2 + k^{-1} \hat{\gamma}^2 \Big( \sum_{i=1}^{k} a_i \Big), \\
\hat{\gamma}^2 &= \frac{\sum_{i=1}^{k} \sum_{r=2}^{p_i} u_{ir}^2}{\sum_{i=1}^{k} (p_i - 1)}, \\
\hat{\mu} &= k^{-1} \sum_{i=1}^{k} u_{i1}
\end{aligned}
\tag{2.3}
\]

and $a_i = 1 - p_i^{-1}$.

Srivastava and Katapa (1986) derived the asymptotic distribution of $\hat{\rho}$; they showed that $\hat{\rho} \sim N(\rho, \mathrm{Var}/k)$ asymptotically, where

\[
\mathrm{Var} = 2(1 - \rho)^2 \Big\{ (\bar{p} - 1)^{-1} + c^2 - 2(1 - \rho)(\bar{p} - 1)^{-1} k^{-1} \sum_{i=1}^{k} a_i \Big\},
\tag{2.4}
\]

$k$ = number of families in the sample,

\[
\bar{p} = k^{-1} \sum_{i=1}^{k} p_i,
\]

\[
c^2 = 1 - 2(1 - \rho)^2 k^{-1} \sum_{i=1}^{k} a_i + (1 - \rho)^2 \Big[ k^{-1} \sum_{i=1}^{k} a_i + (\bar{p} - 1)^{-1} \Big( k^{-1} \sum_{i=1}^{k} a_i \Big)^2 \Big]
\]

and $a_i = 1 - p_i^{-1}$.

Under the above setup, it is observed (using Srivastava & Katapa, 1986) that:

\[
Z = \frac{\hat{\rho} - \rho}{\sqrt{\mathrm{Var}/k}} \sim N(0, 1),
\tag{2.5}
\]

asymptotically, where Var is to be determined from (2.4) and $\hat{\rho}$ is obtained from (2.3).

Using the expression (2.5), it is found that the $100(1 - \alpha)\%$ confidence interval for $\rho$ is

\[
\hat{\rho} \pm Z_{\alpha/2} \sqrt{\frac{\mathrm{Var}}{k}}.
\tag{2.6}
\]

Interval Based on the χ² Distribution
It can be shown, by using the distribution of $\underset{\sim}{u}_i$ given by (2.2):

\[
\frac{\sum_{i=1}^{k} \sum_{r=2}^{p_i} u_{ir}^2}{\sigma^2 (1 - \rho)} \sim \chi^2_{\sum_{i=1}^{k} (p_i - 1)},
\tag{2.7}
\]

where $\chi^2_n$ denotes the Chi-square distribution with $n$ degrees of freedom. Using (2.7), a $100(1 - \alpha)\%$ confidence interval for $\rho$ can be found as follows:

\[
1 - \frac{\sum_{i=1}^{k} \sum_{r=2}^{p_i} u_{ir}^2}{\chi^2_{1 - \alpha/2;\, n} \, \hat{\sigma}^2} < \rho < 1 - \frac{\sum_{i=1}^{k} \sum_{r=2}^{p_i} u_{ir}^2}{\chi^2_{\alpha/2;\, n} \, \hat{\sigma}^2},
\tag{2.8}
\]

where $n = \sum_{i=1}^{k} (p_i - 1)$ and $\chi^2_{\alpha;\, n}$ denotes the upper $100\alpha\%$ point of a Chi-square distribution with $n$ degrees of freedom and $\hat{\sigma}^2$ can be obtained from (2.3).

Data Simulation
Multivariate normal random vectors were generated using an R program in order to evaluate the average lengths and coverage probability of the intervals given by (2.6) and (2.8).

[...] in simulating family sizes (Rosner, et al., 1977; and Srivastava & Keen, 1988) determined the parameter setting for FORTRAN IMSL negative binomial subroutine with a mean = 2.86 and a success probability = 0.483. Given this, the mean was set to equal 2.86 and theta was set to equal 41.2552. All parameters were set the same except for the value of ρ which took values from 0.1 to 0.9 at increments of 0.1. The R program produced 3,000 estimates of ρ along with the coverage probability and the confidence intervals given by
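The estimator (2.3), the variance (2.4) and the two intervals (2.6) and (2.8) can be sketched in Python as follows. This is a minimal illustration under assumptions that are not stated in the paper: the first transformed coordinate is taken to be the family mean (which matches the form of η_i), so that the sum of squares of the remaining coordinates equals the within-family sum of squares; Var in (2.4) is evaluated at the estimate of ρ (a plug-in choice); and the chi-square points are approximated with the Wilson-Hilferty formula so that only the standard library is needed. The function name `icc_intervals` and the toy data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def icc_intervals(families, alpha=0.05):
    """Return (rho_hat, z_interval, chi2_interval) for a list of families."""
    k = len(families)
    p = [len(f) for f in families]
    means = [sum(f) / len(f) for f in families]  # assumed u_i1 = family mean
    # Assumed identity: sum over r >= 2 of u_ir^2 = within-family sum of squares.
    ss_within = sum(sum((x - m) ** 2 for x in f) for f, m in zip(families, means))
    n = sum(pi - 1 for pi in p)                   # degrees of freedom in (2.7)
    a_bar = sum(1 - 1 / pi for pi in p) / k       # k^{-1} * sum of a_i

    # Estimators in (2.3).
    mu_hat = sum(means) / k
    gamma2 = ss_within / n
    sigma2 = sum((m - mu_hat) ** 2 for m in means) / (k - 1) + gamma2 * a_bar
    rho_hat = 1 - gamma2 / sigma2

    # Variance (2.4), evaluated at rho_hat (plug-in assumption).
    p_bar = sum(p) / k
    q = 1 - rho_hat
    c2 = 1 - 2 * q ** 2 * a_bar + q ** 2 * (a_bar + a_bar ** 2 / (p_bar - 1))
    var = 2 * q ** 2 * (1 / (p_bar - 1) + c2 - 2 * q * a_bar / (p_bar - 1))

    # Interval (2.6) based on the standard normal distribution.
    half = NormalDist().inv_cdf(1 - alpha / 2) * sqrt(var / k)
    z_interval = (rho_hat - half, rho_hat + half)

    def chi2_upper(prob):
        """Wilson-Hilferty approximation to the upper 100*prob% point."""
        z = NormalDist().inv_cdf(1 - prob)
        return n * (1 - 2 / (9 * n) + z * sqrt(2 / (9 * n))) ** 3

    # Interval (2.8) based on the chi-square distribution.
    chi2_interval = (
        1 - ss_within / (chi2_upper(1 - alpha / 2) * sigma2),
        1 - ss_within / (chi2_upper(alpha / 2) * sigma2),
    )
    return rho_hat, z_interval, chi2_interval


# Toy data (hypothetical): four families of unequal size.
toy = [[1.0, 2.0], [3.0, 5.0, 4.0], [6.0, 7.0], [2.0, 2.5, 3.0, 3.5]]
print(icc_intervals(toy))
```

The interval (2.6) is symmetric about the point estimate and may extend beyond 1, whereas (2.8) is asymmetric and its upper limit is always below 1. An exact chi-square quantile routine could be substituted for the approximation when a statistics library is available.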