Estimation of Age-Specific Reference Intervals for Skewed Data
Mohamed A.A. Moussa
Faculty of Medicine, Department of Community Medicine & Behavioural Sciences, P.O. Box 24923, Safat 13110, Kuwait. E-mail: amoussa@hsc.kuniv.edu.kw
1. Introduction
Age-specific reference intervals (RIs) are used in medical practice. An RI is the range of values bounded by a pair of centiles which are symmetric about the median. Values which lie outside the limits of an RI are regarded as unusual and may indicate the presence of a disease or disorder. The measurements may depend on a covariate, commonly age. The effect of age may be modelled, and the resulting age-specific RI is then bounded by centile curves rather than by point estimates. Several methods have been proposed for estimating such intervals (Cole, 1988; Wright and Royston, 1997).
2. Example
As part of a cross-sectional survey to study cardiovascular risk factors associated with obesity in children (Moussa et al., 1998), body mass index (BMI) measurements were collected on 2400 healthy children 6 to 13 years old. The aim is to determine appropriate reference intervals for the positively skewed, age-dependent BMI, which may be used to screen for obesity.
3. Methods
The data consist of pairs of observations $(Y, T)$, where $T$ is age and $Y$ is the measurement of interest, possibly after natural logarithmic transformation. Let $\mu_T$ and $\sigma_T$ be the population mean and standard deviation (SD) of $Y$ as a function of age. If $Y$ is normally distributed, then $Z = (Y - \mu_T)/\sigma_T$ has a standard normal distribution, $N(0,1)$. A $100p$th centile curve for $Y$ is then calculated as
$$C_p = \mu_T + q_p \sigma_T, \qquad (1)$$
where $q_p$ is the $100p$th percentile of $N(0,1)$. The distribution of $Z$ may not be Normal; we therefore followed two methods to reach approximate Normality.
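As an illustration of equation (1), the following minimal sketch computes a centile and a symmetric reference interval at a single age under the Normal assumption. The function names `normal_centile` and `reference_interval` and the numerical values of the mean and SD are hypothetical and are not taken from the study.

```python
from statistics import NormalDist


def normal_centile(mu_t, sigma_t, p):
    """Equation (1): the 100p-th centile, C_p = mu_T + q_p * sigma_T."""
    q_p = NormalDist().inv_cdf(p)  # 100p-th percentile of N(0, 1)
    return mu_t + q_p * sigma_t


def reference_interval(mu_t, sigma_t, coverage=0.95):
    """Pair of centiles symmetric about the median, e.g. the 2.5th and 97.5th."""
    tail = (1.0 - coverage) / 2.0
    lower = normal_centile(mu_t, sigma_t, tail)
    upper = normal_centile(mu_t, sigma_t, 1.0 - tail)
    return lower, upper


if __name__ == "__main__":
    # Illustrative (made-up) mean and SD at one age, not estimates from the survey.
    print(reference_interval(mu_t=16.0, sigma_t=2.0, coverage=0.95))
```

In an age-specific analysis, `mu_t` and `sigma_t` would be replaced by the fitted values of the mean and SD curves at each age, giving centile curves rather than a single pair of limits.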
1. The LMS method (Cole, 1990)
In this method, a Box-Cox transformation to normality is applied to the data within each age interval to obtain initial age-specific estimates of each of the three LMS parameters. These parameters are then smoothed individually by polynomials and combined to produce estimated centiles as
$$C_p = M_T \left(1 + L_T S_T q_p\right)^{1/L_T}, \qquad (2)$$
where $M_T$ is a generalized median, $S_T$ is a generalized coefficient of variation, $L_T$ is the Box-Cox power of transformation, and $q_p$ is the $100p$th centile of $N(0,1)$.
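The LMS centile formula in equation (2) can be sketched as follows for a single age, given already-smoothed parameter values. The function name `lms_centile` and the example parameter values are hypothetical. The branch for $L_T = 0$ uses the limiting form $M_T \exp(S_T q_p)$ of the Box-Cox power, which is not written out in the text above.

```python
import math
from statistics import NormalDist


def lms_centile(m_t, s_t, l_t, p):
    """Equation (2): C_p = M_T * (1 + L_T * S_T * q_p) ** (1 / L_T)."""
    q_p = NormalDist().inv_cdf(p)  # 100p-th centile of N(0, 1)
    if abs(l_t) < 1e-12:
        # Limiting case L_T -> 0 (log transformation).
        return m_t * math.exp(s_t * q_p)
    base = 1.0 + l_t * s_t * q_p
    if base <= 0.0:
        raise ValueError("centile undefined: 1 + L*S*q_p must be positive")
    return m_t * base ** (1.0 / l_t)


if __name__ == "__main__":
    # Illustrative (made-up) parameters; a negative L corresponds to positive skewness.
    m, s, l = 16.0, 0.10, -1.5
    print(lms_centile(m, s, l, 0.025), lms_centile(m, s, l, 0.5), lms_centile(m, s, l, 0.975))
```

With a negative power the upper centile lies further from the median than the lower one, which is the behaviour expected for positively skewed measurements such as BMI.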
2. Exponential-Normal distribution (Wright and Royston, 1997)
In this method, polynomials (or fractional polynomials) in age are fitted separately to each of the three parameters of an exponential-Normal distribution by maximum likelihood. The parameters are the age-specific mean, SD and skewness ($\mu_T$, $\sigma_T$, $\kappa_T$). A $100p$th centile curve for $Y$ is estimated as
$$C_p = \hat{\mu}_T + Q_p \hat{\sigma}_T, \qquad (3)$$
where ($\hat{\mu}_T$, $\hat{\sigma}_T$) are the estimated age-specific mean and SD from polynomial regression,
4. Conclusion
The Exponential-Normal distribution method has the advantage of being parametric, with explicit expressions for the estimated parameters and centiles. Moreover, it requires only weighted least squares regressions and transformations, which are available in most statistical packages.
REFERENCES
Cole, T. J. (1988). Fitting smoothed centile curves to reference data (with discussion). Journal of the Royal Statistical Society A 151, 385-418.
Cole, T. J. (1990). The LMS method for constructing normalized growth standards. European Journal of Clinical Nutrition 44, 45-60.
Moussa, M. A. A., Shaltout, A. A., Nkansa-Dwamena, D., Mourad, M., Al-Sheikh, N., Agha, N., and Galal, D. O. (1998). Association of fasting insulin with serum lipids and blood pressure in Kuwaiti children. Metabolism 47, 420-424.
Wright, E. and Royston, P. (1997). Simplified estimation of age-specific reference intervals for skewed data. Statistics in Medicine 16, 2785-2803.
SUMMARY
This paper deals with age-specific reference intervals for skewed data. Two methods were used to estimate the reference intervals: the LMS method and the exponential-Normal distribution method. The second method has the advantage of being parametric and requires only weighted least squares regression and transformations, which are available in most statistical software packages.