Modified t Tests and Confidence Intervals for Asymmetrical Populations
NORMAN J. JOHNSON*

Source: Journal of the American Statistical Association, Vol. 73, No. 363 (Sep., 1978), pp. 536-544
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2286597

This article considers a procedure that reduces the effect of population skewness on the distribution of the t variable so that tests about the mean can be more correctly computed. A modification of the t variable is obtained that is useful for distributions with skewness as severe as that of the exponential distribution. The procedure is generalized and applied to the jackknife t variable for a class of statistics with moments similar to those of the sample mean. Tests of the correlation coefficient obtained using this procedure are compared empirically with corresponding tests determined using Fisher's z transformation and the usual jackknife estimate.

KEY WORDS: t variable; Population skewness; Hypothesis tests; Confidence intervals; Jackknife; Cornish-Fisher expansions.

* Norman J. Johnson is Assistant Professor, Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27514. The work was done while the author was Visiting Assistant Professor, Department of Statistics, University of Kentucky, whose support is gratefully acknowledged. The author wishes to thank J.A. Hartigan for many helpful discussions, and L.J. Gleser, M.H. DeGroot, two referees, and an associate editor for helpful suggestions.

© Journal of the American Statistical Association, September 1978, Volume 73, Number 363, Theory and Methods Section

1. INTRODUCTION

Let $\bar{x}$ and $s^2$ be the sample mean and variance, respectively, of a sample of $N$ independent, identically distributed observations taken from a population having mean $\mu$ and finite variance $\sigma^2$. The variable $t = \sqrt{N}(\bar{x} - \mu)/s$ is frequently used to make hypothesis tests about $\mu$ and form confidence intervals for $\mu$. The distribution of the variable t was first obtained by Student (1908), assuming a normally distributed population. Since most populations are not normally distributed, the question of the robustness of the distribution of t (and the robustness of associated tests and confidence intervals) to the nonnormality of the population has been of considerable interest to statisticians. This article deals with a modification of the t variable using properties of the data. The modified variable is less biased than the t variable and its distribution is less subject to effects of population asymmetry when the population is nonnormal. Thus, the modification results in a more robust procedure for making tests and determining confidence intervals for $\mu$.

Much of the previous research on the robustness of the t variable has treated the variable as given, and concentrated on indicating changes in the distribution function of t resulting from the nonnormality of the population. Early studies were empirical. Sophister (1928), Neyman and Pearson (1928), and Nair (1941) showed that skewness in a population affects the distribution of t more than kurtosis. Their results also showed that positive skewness in the population results in negative skewness in the distribution of t and vice versa. These authors studied the use of $|t|$ in forming confidence intervals for $\mu$ in order to reduce the skewness effect. Their results showed that the use of $|t|$ was valid only in the case of slight population skewness.

Because the mathematics becomes complicated very quickly for sample sizes greater than two or three, few exact theoretical results are available (see Rider 1929, Geary 1936, Laderman 1939, and Perlo 1933). A different approach is to transform t (Anscombe 1950); another is to avoid entirely the question of a distribution by using various nonparametric techniques (Box and Andersen 1955; Hartigan 1969). Most of the results obtained by these procedures are established assuming symmetrical distributions. When sampling from long-tailed symmetric distributions, Winsorizing or trimming procedures have been considered by Andrews et al. (1972), Tukey (1964), and Yuen (1974).

The most useful results have been derived by approximation techniques. Hotelling (1961) obtained an expression giving the ratio of the tail area of the t distribution computed for samples from a known but arbitrary distribution to the tail area of the usual Student t distribution. His findings, established for symmetrical distributions, support the early empirical results, which showed that larger tails in the population result in smaller tails in the t distribution. Chung (1946), Bartlett (1935), Geary (1936), and Gayen (1949) determined the distribution of t by means of an Edgeworth or a Gram-Charlier expansion. The approximation to the distribution obtained by Bartlett and Geary is similar to that of Gayen. The first few terms of Gayen's expression are given by

$$P(t) = P_0(t, N) + \beta_1 P_1(t, N) - \beta_2 P_2(t, N) + \cdots, \qquad (1.1)$$

where $\beta_1 = \mu_3/\sigma^3$, $\beta_2 = \mu_4/\sigma^4$, $\sigma^2$ and $\mu_3$ are the second and third central moments, respectively, of the population, $P_0(t, N)$ is the usual Student t distribution, and $P_i(t, N)$ are corrective terms for $\beta_i$ obtained from incomplete beta functions. For convenience, tables of $P_i(t, N)$ were given in Gayen's article for selected values of t and several different sample sizes $N$.

Difficulties may arise when using (1.1) in a sampling situation. The series given by (1.1) is exact only for an infinite number of terms. When the series is truncated, probabilities smaller than zero or larger than one, or distribution functions which are nonmonotonically increasing, may result. Gayen assumed in his work that population parameters were known. This is not generally the case in sampling situations, so that the $\beta_i$ must be estimated from the sample. Empirical results show that when appropriate estimates are used, errors in estimation affect the accuracy of the approximation. Finally, the Gayen method is cumbersome since the adjustments to parameters, $P_i(t, N)$, depend on the sample size and many tables are required.

This article seeks to correct the t variables for the nonnormality of the population distribution, not by abandoning the t distribution as a standard, but by adjusting the t variable using properties of the data so that the adjusted version has Student's t distribution to a sufficiently good approximation. The form of the corrected t variable is derived by using a Cornish-Fisher expansion. This form for t, given in equation (2.2), differs from the usual variable in that the numerator is adjusted by a term involving $(\bar{x} - \mu)^2$ and a constant. These adjustments correct bias and skewness effects due to the skewness of the nonnormal population distribution.

This technique avoids many of the annoying features of previously proposed methods. Empirical studies show that hypothesis tests determined by this procedure compare favorably with tests determined by other methods for samples as small as 13 drawn from distributions as asymmetrical as $\chi^2$ with two degrees of freedom. This method also has a natural extension to other problems of parametric estimation involving

which are of the same order as the moments of $\bar{x}$. Thus, the form of the Cornish-Fisher expansion for these variables will be analogous to the one given in (2.2) for $\bar{x}$. Note in this expansion that the skewness of the population, $\beta_1$, is the coefficient of the term $(v^2 - 1)$. (It is also the coefficient of other terms but these are of smaller order.) The method of this section for deriving a modification of the t variable is to eliminate the $(t^2 - 1)$ term in the expansion of the modified t variable. In so doing, the largest-order term involving skewness is eliminated from the expansion and the effect of all other terms involving skewness is of small order. The appropriateness of modifications made by this method is established for a variety of different distributions by simulation.

Let the modified t variable be

$$t_1 = \left[(\bar{x} - \mu) + \lambda + \gamma\left\{(\bar{x} - \mu)^2 - (\sigma^2/N)\right\}\right]\left[s^2/N\right]^{-1/2}. \qquad (2.3)$$

This form is suggested by replacing $\bar{x} - \mu$ in the numerator of the t variable by the first few terms of the inverse Cornish-Fisher expansion for $\bar{x} - \mu$; i.e., in the notation of (2.1), an expansion for $v$ in terms of $\bar{x} - \mu$. The resulting expansion for $t_1$ is similar to that given for $\bar{x}$ in (2.2). The constant $\lambda$, a function of $N$, is chosen so that constant terms in the Cornish-Fisher expansion of $t_1$ sum to zero, thus eliminating the low-order bias; $\gamma$ is chosen so that the coefficient of the $t^2$ term in the Cornish-Fisher expansion of $t_1$ is zero, thereby eliminating the low-order effects of skewness. The derivation of the $\gamma$ and $\lambda$ parameters is given in Appendix A.