Quick viewing(Text Mode)

Type I Error Rates of the Two-Sample Pseudo-Median Procedure," Journal of Modern Applied Statistical Methods: Vol

Type I Error Rates of the Two-Sample Pseudo-Median Procedure," Journal of Modern Applied Statistical Methods: Vol

Journal of Modern Applied Statistical Methods

Volume 10 | Issue 2 Article 3

11-1-2011 Type I Error Rates of the Two-Sample Pseudo- Procedure Nor Aishah Ahad Universiti Utara Malaysia

Abdul Rahman Othman Universiti Sains Malaysia

Sharipah Soaad Syed Yahaya Universiti Utara Malaysia

Follow this and additional works at: http://digitalcommons.wayne.edu/jmasm Part of the Applied Commons, Social and Behavioral Sciences Commons, and the Statistical Theory Commons

Recommended Citation Ahad, Nor Aishah; Othman, Abdul Rahman; and Yahaya, Sharipah Soaad Syed (2011) "Type I Error Rates of the Two-Sample Pseudo-Median Procedure," Journal of Modern Applied Statistical Methods: Vol. 10 : Iss. 2 , Article 3. DOI: 10.22237/jmasm/1320120120 Available at: http://digitalcommons.wayne.edu/jmasm/vol10/iss2/3

This Regular Article is brought to you for free and open access by the Open Access Journals at DigitalCommons@WayneState. It has been accepted for inclusion in Journal of Modern Applied Statistical Methods by an authorized editor of DigitalCommons@WayneState. Journal of Modern Applied Statistical Methods Copyright © 2011 JMASM, Inc. November 2011, Vol. 10, No. 2, 418-423 1538 – 9472/11/$95.00

Regular Articles Type I Error Rates of the Two-Sample Pseudo-Median Procedure

Nor Aishah Ahad Abdul Rahman Othman Sharipah Soaad Syed Yahaya Universiti Utara Malaysia, Universiti Sains Malaysia, Universiti Utara Malaysia, Kedah, Malaysia Penang, Malaysia Kedah, Malaysia

The performance of the pseudo-median based procedure is examined in terms of controlling Type I error for a two independent groups test. The procedure is a modification of the one-sample Wilcoxon using the pseudo-median of differences between group values as the central measure of location. The proposed procedure was shown to have good control of Type I error rates under the study conditions regardless of distribution type.

Key words: Mann-Whitney-Wilcoxon, pseudo-median, t-test, type I error.

Introduction similar to parametric tests (Pratt, 1964; Testing the equality of Zimmerman & Zumbo, 1992). Further, parameters between two independent samples by nonparametric methods are more appropriate for controlling Type I error is a common statistical non-normal symmetric . Many attempts problem. If an underlying distribution is have been made to deal with asymmetric normally distributed with equal population distributions. In this study, a method to handle , the most suitable test statistic to use is the problem of asymmetric data, as well as the Student’s t-test. Student’s t, however, is heterogeneity of variances, is suggested. The sensitive to non-normal data and heterogeneity method is known as the pseudo-median of variances. Under these situations, Welch’s procedure, where the pseudo-median of approximate test (Welch, 1938) usually offers differences between group values are employed the best practical solution, but this statistic does as the central measure of location with the one- not adequately control Type I error probabilities sample nonparametric Wilcoxon procedure in a under non-normal distributions. two group setting. The pseudo-median of a To surmount the problem of non- distribution F is defined to be the median of the normality, researchers typically seek distribution (Z1 + Z2)/2, where Z1 and Z2 are all nonparametric test alternatives, such as the possible differences between two observations Mann-Whitney-Wilcoxon, which is believed to from each group. Z1 and Z2 are independent and be effective against violations of normality. have the same distribution as F (Hoyland, 1965; Although methods are often useful when Hollander & Wolfe, 1999). samples are obtained from heavy-tailed The pseudo-median is a location distributions, they are influenced by unequal parameter. The estimation of this parameter is variances accomplished using the Hodges-Lehmann estimator. According to Hollander and Wolfe (1999), the Hodges-Lehmann estimator ()θˆ is a Nor Aishah Ahad is an Academician in the consistent estimator of the pseudo-median, School of Quantitative Sciences. Email: which in general may differ from the median. [email protected]. Abdul Rahman Othman is However, when F is symmetric, the median and a Professor in the School of Distance Education. pseudo-median coincide. The pseudo-median is Email: [email protected]. Sharipah Soaad selected as the central measure of location Syed Yahaya is an Associate Professor in the because it is convenient and the asymptotic School of Quantitative Sciences. Email: properties of the pseudo-median are the same as [email protected].

418

AHAD, OTHMAN & SYED YAHAYA

median. In this study, the performance of the nn12 = pseudo- procedure in terms of Type I WRe ij ij . (4) == error was measured via Monte Carlo simulation. ij1 Because the distribution of this pseudo-median procedure is intractable, the The modification of the Wilcoxon bootstrap method was used to arrive at the procedure is performed by adding the pseudo- significant values. median value to the second sample to form a + ˆ new sample, Xd2 . The aligned difference, Methodology based on the location-aligned samples, becomes: This study addresses both symmetric and asymmetric distribution and the methods applied DXˆ =−() X +=− dDdˆˆ. (5) to the two types of distributions are very ij12 i j ij different. Let XXX= (), , ..., X and 111121n1 Let Rˆ denote the rank of Dˆ . The indicator = ij ij XXXX221222(), , ..., n be samples from 2 function and the aligned statistic are expressed distributions F1 and F2 respectively. The as: pseudo-median is defined as: 0, Dˆ < 0  ij == ˆ DD+ eDˆij0.5, ij 0 (6) ˆ ij i′′ j d = Median  2 1, Dˆ > 0   ij (1) ()XX−+−() X X and 12ij 12ij'' = Median nn12  ˆˆ= ˆ 2 WRe ij ij . (7)  ij==1

≠ ≠ where ii' and j j ' . When F1 and F2 are Because the second sample was symmetric, d can be defined as the difference realigned with the estimate d, it is necessary to between the centers of symmetry. Hence, the find the pseudo for the hypothesis is given as: estimate W. Use of a bootstrap procedure is proposed in order to construct the hypothesis = test. Separately bootstrap ni observations from Hd0 :0 X group and n observations from Xd+ ˆ versus (2) 1 j 2 group to obtain bootstrap samples, X * and X * . Hd:0.≠ 1 2 1 The bootstrap difference becomes

***=− * = = DXij12 i X j where Rij denotes the rank of Let DXXij12 i- j , in1,2,..., 1 , * = = Dij . The indicator function and the bootstrap j 1,2,...,n2 and Nnn12. The statistic is a one-sample Wilcoxon statistic based on the statistic can be defined as:

NDij ’s. Let Rij denote the rank of Dij . The  * < indicator function and the statistic are expressed 0, Dij 0  as: **== eDij0.5, ij 0 (8)  <  0, Dij 0 * >  1, Dij 0 ==  eDij0.5, ij 0 (3)  and 1, D > 0  ij nn12 WRe***= . (9) and  ij ij ij==1

419

TYPE I ERROR RATES OF THE TWO-SAMPLE PSEUDO-MEDIAN PROCEDURE

The steps to obtain the p value using the example, let X1 and X 2 be two skewed bootstrap method for symmetric distribution are distributions where the standard deviations need as follows: not be the same. Let YYY= (), and 11112 YYY= (), represent the new generated 1. Calculate W from X1 and X 2 . 22122 samples of size two, which have the same

ˆ distribution with X1 and X 2 , respectively. 2. Calculate d from X1 and X 2 . Compute ai as follows: ˆ 3. Add d to X 2 to form a new sample, ()()YY−+− YY Xd+ ˆ . = 11 21 12 22 2 amediani   (10)  2  ˆ 4. Calculate W from X1 and the new sample in step 3. Repeat the process of generating new samples of size two 9,999 times and repeat the computation 5. Generate bootstrap samples by randomly of ai to obtain aa1, 2 ,..., a 10,000 . Therefore, the sampling with replacement n observations i median of aa1, 2 ,..., a 10,000 is the value of a . from the X1 group, and nj observations For asymmetric distributions, the steps * to obtain the p value using a bootstrap method from the new sample in step 3 yielding X1 are the same except for one small alteration in and X * . 2 step 1. In this step, a constant a is introduced to the members of the second sample (X2) to form * 6. Calculate W from the bootstrap samples, a new sample, Xnew2 . Steps 2-10 proceed as X * and X * . 1 2 noted, with the one difference that X 2 has

become Xnew2 . 7. Find ()WW* − ˆ . To study the robustness of this procedure, four variables were manipulated to create conditions known to highlight the 8. Repeat Steps 5 - 7 B times. strengths and weaknesses of the test for the equality of location parameters. The variables 9. Compare the value of ()WW* − ˆ with are (1) types of distributions, (2) degree of inequality, (3) balanced/unbalanced ()WEWH− ()| . 0 sample sizes, and (4) pairings of unequal group variance and sample sizes. In this study, =−>−* ˆ () empirical Type 1 error rates were collected and Let UWW()() WEWH| 0 and later compared under various study conditions. =−<−* ˆ () LWW()() WEWH| 0 . The number of groups and sample sizes were fixed. This study covered only the two = 2 groups case with total sample size of N 40 . 10. Calculate the p value as × min() #LU , # . This value was later divided into two groups B forming the balanced and unbalanced design. For the balanced design, the value is equally For asymmetric distributions, the divided into nn==20 , and for the difference between the centers of symmetry 12 between the two groups cannot be assumed to be unbalanced design the groups were divided into = and = . To investigate the zero; therefore, to ensure the setting for the null n1 15 n2 25 condition, a constant a must be determined and distribution types, this study focused on (1) added to the members of the second sample. The heavy tailed symmetric non-normal distribution, value of a is obtained via simulation. For and (2) heavy tailed asymmetric distribution.

420

AHAD, OTHMAN & SYED YAHAYA

The normal distribution was used as the basis for (positive/negative) of the pairings has been comparison. The symmetric non-normal shown to exert some effect on the results. distribution was generated from a g-and-h Positive pairings typically produce conservative distribution (Hoaglin, 1985); specifically, g = 0 results and negative pairings tend to produce ()γ liberal results (Keselman, Wilcox, Othman & and h = 0.225 with 1 = 0 and ()γ Fradette, 2002; Cribbie & Keselman, 2003; 2 = 154.84 was chosen for Othman, et al., 2004; Syed Yahaya, Othman & investigation. The Chi-square with three degrees Keselman, 2004, 2006). Therefore, both positive γ = γ = of freedom ( 1 1.63 and 2 4 ) was selected and negative pairings were evaluated. to represent the asymmetric distribution. The operating characteristics of the The pseudo-random normal variates procedures investigated in this study could be were generated using the SAS generator described as extreme because they substantially RANDGEN function (SAS Institute, 1999); this depart from homogeneity and normality. These involved the (RANDGEN(Y, ‘NORMAL’)) conditions were used because it is reasonable to function to generate normal variates with assume that, if a procedure works under the most equals to zero and equals to extreme conditions, it will probably also work one. To generate data from the g-and-h under most conditions likely to be encountered distribution, standard unit normal variables by researchers. ()Z were converted to the h random variates The simulation program was written in ij SAS/IML (SAS Institute, 1999). For each via condition examined, 5,000 data sets were hZ 2 generated and within each data set, 599 YZ= expij . (11) ij ij 2 bootstrap samples were obtained. The level of  significance was set at α = 0.05.

For the Chi-square distribution, data were Results generated using the (RANDGEN(Y, To evaluate whether the test is robust ‘CHISQUARE’, 3)) function. (insensitive to assumption violations) under each Apart from the types of distribution, two particular condition, the Bradley criterion of other manipulated variables were the degrees of robustness (Bradley, 1978) was employed. variance inequality and pairings of variances and According to this criterion, for the five percent group sizes. The nature of pairings of variances nominal level used in this study, a test is and sample sizes affect Type I error rates considered robust if its empirical Type I error (Keselman, et al., 1998; Keselman, Othman, rate is within [0.025, 0.075]. Correspondingly, a Wilcox & Fradette, 2004; Othman, et al., 2004). test is considered to be non-robust if, for a The variances were manipulated in the following particular condition, its Type I error rate is not manner: In the case of equal variances, both contained in this criterion. This criterion was group variances were set at 1; for the unequal chosen because it provides a reasonable standard case, the variances were set at 1 and 36. for judging robustness. The empirical Type I For positive pairings, the group with the error rates for the pseudo-median procedure largest number of observations was paired with (PM), t-test and Mann-Whitney-Wilcoxon the group having largest variance, and the group (MWW) across all distributions are displayed in with the smallest number of observations was Table 1. paired with the group having smallest variance. With respect to the procedures, results For the negative pairings, the group with largest show that all Type I error rates for the pseudo- number of observations was paired with the median procedure are robust under Bradley’s group having the smallest variance, and the liberal criterion and are very close to the group with smallest number of observations was nominal level (0.05) regardless of distribution or paired with the group having the largest conditions. The disparity between Type I error variance. This condition was included in the rates from balanced and unbalanced designs is investigation because the direction minuscule and the rates are consistent across the

421

TYPE I ERROR RATES OF THE TWO-SAMPLE PSEUDO-MEDIAN PROCEDURE investigated conditions. The t-test also produces With respect to variance equality and inequality, robust Type I error rates for all distributions and results show a contradiction between symmetric conditions, however, for the Chi square and asymmetric distributions for both the distribution, the Type I error rates inflate to a pseudo-median and the t-test. For the g = 0, h = level above 0.065 when the variances are 0.225 distributions, homogeneous variances unequal and worsen under negative pairing. For produced greater Type I error rates compared to the Mann-Whitney-Wilcoxon test, half of the heterogeneous variances. For the Chi-square Type I error rates are above the robustness distribution, homogeneous variances produced criterion under unequal variances, especially smaller Type I error rates compared to negative pairing. The Type I error rates for heterogeneous variances. However, no specific MWW under the Chi-square distribution are too pattern could be identified for the Mann- liberal and not robust except under the Whitney-Wilcoxon test. homogeneous variance condition. With respect to the pairings of group In terms of distributional shapes, the sizes and variances, results show that the g-and- Chi-square distribution produced better h distribution produced liberal (> 0.05) Type I empirical Type I error rates compared to the g- error rates for the pseudo-median procedure and and-h distribution in most conditions for the conservative (< 0.05) results for the t-test. The pseudo-median procedure. Higher values of Chi-square distribution for the pseudo-median Type I error rates from Chi-square distribution procedure produced conservative Type I error are apparent for the t-test and Mann-Whitney- rates for the positive pairing, and liberal results Wilcoxon. for the negative pairing. The t-test produced liberal results for both pairings

Table1: Empirical Type I Error Rates of Pseudo-Medians Procedure, t-test and Mann-Whitney- Wilcoxon*

Group Sizes

(20, 20) (15, 25)

Method Distribution Variance Variance Variance Variance (1:36) (36:1) (1:1) (1:36) +ve pairing -ve pairing

Normal 0.0552 0.049 0.0486 0.0492 PM g=0, h=0.225 0.0588 0.0544 0.0518 0.0532 χ 2 3 0.0454 0.0504 0.0476 0.055

Normal 0.054 0.052 0.0492 0.0514 g=0, h=0.225 t-test 0.0522 0.0458 0.0448 0.044 χ 2 3 0.052 0.0696 0.0654 0.0736

Normal 0.0516 0.0912 0.0458 0.1142

MWW g=0, h=0.225 0.0516 0.0854 0.0436 0.108 χ 2 3 0.052 0.2428 0.1812 0.2398

*Bolded entries indicate Type I error rates of the test exceeding the 0.075 criterion.

422

AHAD, OTHMAN & SYED YAHAYA

Conclusion Hoyland, A. (1965). Robustness of the The purpose of this study was to investigate how Hodges-Lehmann estimates for shift. The Annals well the pseudo-medians procedure responded to of , 36, 174-197. the violations of assumptions compared to the Keselman, H. J., et al. (1998). Statistical traditional t-test and Mann-Whitney-Wilcoxon practices of educational researchers: An analysis method. The procedure was tested the heavy- of their ANOVA, MANOVA, and ANCOVA tailed distributions, namely the g = 0 and h = analyses. Review of Educational Research, 0.225 and the Chi-square with three degrees of 68(3), 350-386. freedom. Results show that the Type I error rates Keselman, H. J., et al. (2004). The new for the pseudo-median procedure and the t-test and improved two-sample t-test. American are robust under Bradley’s criterion of Psychological Society, 15(1), 57-51. robustness and close to the nominal value. The Keselman, H. J., et al. (2002). nature of the sample sizes - balanced or Trimming, transforming statistics, and unbalanced - did not show much difference in bootstrapping: Circumventing the biasing effects the procedure’s ability to control Type I error of and nonnormality. Journal rates. of Modern Applied Statistical Methods, 1(2), The pseudo-median procedure 288-309. performed better than t-test, especially for a Othman, A. R., et al. (2004). Comparing skewed distribution with unbalanced design and measures of the “typical” score across treatment heterogeneous variances. This procedure also groups. British Journal of Mathematical and outperforms the popular Mann-Whitney- Statistical Psychology, 57(2), 215-234. Wilcoxon method in most conditions. The Pratt, J. W. (1964). Robustness of some pseudo-median procedure was observed to have procedures for two-sample location problem. good control of Type I error rates, regardless of Journal of the American Statistical Association, distributions under the study conditions. The 59, 665-680. pseudo-median procedure can thus be SAS Institute, Inc. (1999). SAS/IML recommended as an alternative for testing the User’s Guide version 8. Cary, NC:SAS Institute. differences between two groups, particularly Syed Yahaya, S. S., Othman, A. R., & when assumptions of normality and variance Keselman, H. J. (2004). Testing the equality of homogeneity are not met. location parameters for skewed distributions using S1 with high breakdown robust scale estimators. In Theory and Applications of Recent References Robust Methods, Series: Statistics for Industry Bradley, J. V. (1978). Robustness? and Technology, M. Hubert, G. Pison, A. Struyf British Journal of Mathematical and Statistical & S. Van Aelst (Eds.), 319-328. Birkhauser: Psychology, 31, 321-339. Basel, Switzerland. Cribbie, R. A., & Keselman, H. J. Syed Yahaya, S. S., Othman, A. R., & (2003). The effects of nonnormality on Keselman, H. J. (2006). Comparing the “typical parametric, nonparametric, and model score” across independent groups based on comparison approaches to pairwise comparisons. different criteria for trimming. Methodoloski Educational and Psychological Measurement, Zvezki-Advances in Methodology and Statistics, 63(4), 615-635. 3, 49-62 Hoaglin, D. C. (1985). Summarizing Welch, B. L. (1938). The significance of shape numerically: The g- and h- distributions. the difference between two means when the In Exploring data tables, trends, and shapes, D. population variances are unequal. Biometrika, C. Hoaglin, F. Mosteller & J. W. Tukey (Eds.), 29, 350-362. 461-513. New York: NY: Wiley. Zimmerman, D. W., & Zumbo, B. D. Hollander, M., & Wolfe, D. A. (1999). (1992). Parametric alternatives to the Student t Nonparametric statistical methods (2nd Ed.). test under violation of normality and New York: NY: Wiley. homogeneity of variance. Perceptual Motor Skills, 74, 835-844.

423