SOME NONPARAMETRIC TESTS


1. Introduction

In parametric statistics, we may consider particular parametric families, such as the normal distribution in testing for equality of variances via F tests, or for equality of means via t tests or analysis of variance. In regression, the assumption of i.i.d. $N(0,\sigma^2)$ errors is used in testing whether regression coefficients are significantly different from 0. The Wilks test applies to more general families of distributions indexed by $\theta$ in a finite-dimensional parameter space $\Theta$. Similarly, the $\chi^2$ test of composite hypotheses applies to fairly general parametric subfamilies of multinomial distributions.

In nonparametric statistics, there actually still are parameters in a sense, such as the median $m$ or other quantiles, but we don't have distributions determined uniquely by such a parameter. Instead there are more general restrictions on the distribution function $F$ of the observations, such as that $F$ is continuous. So the families of possible distributions are infinite-dimensional. Given $n$ observations $X_1,\ldots,X_n$, we can always form their order statistics $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ and their empirical distribution function $F_n(x) := \frac{1}{n}\sum_{j=1}^n 1_{X_j \le x}$. Some tests can be based on these. This handout will consider two tests of whether two samples both came from the same (unknown) distribution, the Mann–Whitney–Wilcoxon rank-sum test and the Kolmogorov–Smirnov test. Also we will have the Wilcoxon "signed rank" test of whether paired variables $(X_j, Y_j)$ have the same distribution as $(Y_j, X_j)$.

2. Symmetry of random variables

Two real random variables $X$ and $Y$ are said to have the same distribution, or to be equal in distribution, written $X =_d Y$, if for some $F$, $\Pr(X \le x) = \Pr(Y \le x) = F(x)$ for all $x$. If $X =_d Y$, then for any constant $c$, $X + c =_d Y + c$. But one cannot necessarily add the same random variable to both sides of an equality in distribution.
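As a concrete illustration (not part of the original handout; the sample values are arbitrary), the order statistics and the empirical distribution function can be computed directly from their definitions:

```python
# Sketch: order statistics X_(1) <= ... <= X_(n) and the empirical
# distribution function F_n(x) = (1/n) * #{j : X_j <= x}.

def order_statistics(xs):
    """Return the sample sorted into X_(1) <= X_(2) <= ... <= X_(n)."""
    return sorted(xs)

def empirical_cdf(xs):
    """Return the function F_n for the sample xs."""
    n = len(xs)
    def F_n(x):
        return sum(1 for xj in xs if xj <= x) / n
    return F_n

sample = [2.1, -0.3, 1.4, 0.9, -1.7]
print(order_statistics(sample))   # [-1.7, -0.3, 0.9, 1.4, 2.1]
F5 = empirical_cdf(sample)
print(F5(0.0))                    # 2 of the 5 points are <= 0, so 0.4
print(F5(10.0))                   # all points are <= 10, so 1.0
```

$F_n$ is a step function that jumps by $1/n$ at each order statistic, which is what both two-sample tests named above exploit.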
Let $X$ and $Y$ be i.i.d. $N(0,1)$. Then $X =_d Y$, but $X + X \ne_d X + Y$, because $X + X = 2X$ is $N(0,4)$, while $X + Y$ is $N(0,2)$.

A real random variable $X$ is said to have a symmetric distribution (around 0), or to be symmetric, if $X$ and $-X$ have the same distribution.

Date: 18.650, Nov. 20, 2015.

Then, given a real number $m$, (the distribution of) $X$ is said to be symmetric around $m$ if $X - m$ is symmetric around 0, i.e. $X - m$ and $m - X$ have the same distribution. Equivalently, $X$ and $2m - X$ have the same distribution.

Examples. Any $N(\mu,\sigma^2)$ distribution is symmetric around $\mu$. Any $t(d)$ distribution is symmetric around 0. A Beta$(a,b)$ distribution is symmetric around $m$ if and only if both $m = 1/2$ and $a = b$. If $X$ has a density $f$ with $f(x) > 0$ for all $x > 0$ and $f(x) = 0$ for all $x \le 0$, then $X$ is not symmetric around any $m$. This applies, for example, to gamma distributions, including $\chi^2$ and exponential distributions, and to F distributions.

Fact 1. Suppose a random variable $X$ is symmetric around some $m$.
(a) Then $m$ is a median of $X$.
(b) Actually $m$ is the median of $X$, in the sense that if $X$ has a nondegenerate interval of medians, $m$ must be the midpoint of that interval.
(c) If $E|X| < +\infty$, then also $EX = m$.

Proof. (a) Let $X$ be symmetric around $m$. Then from the definitions,
$$\Pr(X \ge m) = \Pr(X - m \ge 0) = \Pr(m - X \ge 0) = \Pr(X \le m).$$
Then $2\Pr(X \le m) = \Pr(X \le m) + \Pr(X \ge m) = 1 + \Pr(X = m) \ge 1$, and so $\Pr(X \le m) = \Pr(X \ge m) \ge 1/2$, so $m$ is indeed a median of $X$.

(b) Suppose for some $c \ne 0$, $m + c$ is a median of $X$. Then $c$ is a median of $X - m$, so it is also a median of $m - X$. It follows that $-c$ is a median of $X - m$, so $m - c$ is a median of $X$. So the interval of medians of $X$ is symmetric around $m$, and $m$ is the midpoint of it.

(c) $E(X - m) = EX - m = E(m - X) = m - EX$, so $2EX = 2m$ and $EX = m$.

Example. Consider the binomial$(5, 1/2)$ distribution. Its distribution function $F$ satisfies $F(x) = 1/2$ for $2 \le x < 3$. So it has an interval $[2, 3]$ of medians.
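The binomial example is easy to check numerically; this minimal sketch (not part of the handout) computes the binomial$(5, 1/2)$ probabilities from $\Pr(X = k) = \binom{5}{k}/2^5$:

```python
from math import comb

# Binomial(5, 1/2): Pr(X = k) = C(5, k) / 2^5.
n = 5
pmf = [comb(n, k) / 2**n for k in range(n + 1)]

# Distribution function F(x) = Pr(X <= x) at the integers 0, ..., 5.
F = [sum(pmf[:k + 1]) for k in range(n + 1)]
print(F[2])   # (1 + 5 + 10)/32 = 0.5, so [2, 3] is the interval of medians

# Symmetry around 5/2: Pr(X = k) = Pr(X = 5 - k) for every k.
print(all(pmf[k] == pmf[n - k] for k in range(n + 1)))   # True
```

The pmf identity `pmf[k] == pmf[n - k]` is exactly the symmetry around $5/2$ of this distribution.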
The distribution is symmetric around $5/2 = 2.5$, the median as the midpoint of the interval of medians, and also the mean.

3. The Mann–Whitney–Wilcoxon rank-sum test

This is a test of whether two samples come from the same distribution, against the alternative that members of one sample tend to be larger than those of the other sample (a location or shift alternative). No parametric form of the distributions is assumed. They can be quite general, as long as the distribution functions are continuous. One might want to use such a test, called a nonparametric test, if, for example, the data have outliers and so appear not to be normally distributed. Rice considers this test in subsection 11.2.3, pp. 435–443.

The general assumption for the test is that real random variables $X_1, \ldots, X_m$ are i.i.d. with a distribution function $F$, and independent of $Y_1, \ldots, Y_n$, which are i.i.d. with another distribution function $G$, with both $F$ and $G$ continuous. The hypothesis to be tested is $H_0$: $F = G$.

The test works as follows: let $N = m + n$ and combine the samples of $X$'s and $Y$'s into a total sample $Z_1, \ldots, Z_N$. Arrange the $Z_k$ in order (take their order statistics) to get $Z_{(1)} < Z_{(2)} < \cdots < Z_{(N)}$. With probability 1, no two of the order statistics are equal, because $F$ and $G$ are continuous. Let $\mathrm{rank}(V) = k$ if $V = Z_{(k)}$, for $V = X_i$ or $Y_j$. Let $T_X = \sum_{i=1}^m \mathrm{rank}(X_i)$. Then $T_X$ will be the test statistic. $H_0$ will be rejected if either $T_X$ is too small, indicating that the $X$'s tend to be less than the $Y$'s, or if $T_X$ is too large, indicating that the $Y$'s tend to be less than the $X$'s.

To determine quantitatively what values are too small or too large, we need to look into the distribution of $T_X$ under $H_0$. If $H_0$ holds, then $Z_1, \ldots, Z_N$ are i.i.d. with distribution $F = G$. Let $E_0$ be expectation, and $\mathrm{Var}_0$ the variance, when the hypothesis $H_0$ is true. Let $R_i$ be the rank of $X_i$. Then $R_i$ has the discrete uniform distribution on $\{1, 2, \ldots, N\}$: $\Pr(R_i = k) = 1/N$ for $k = 1, \ldots, N$.
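To make the procedure concrete, here is a small sketch (not from the handout; the data are arbitrary and assumed tie-free, as in the continuous setting) that pools two samples, assigns ranks, and computes $T_X$:

```python
# Sketch of the rank-sum statistic T_X for two small samples.

xs = [1.3, 0.2, 2.8]        # X-sample, m = 3
ys = [0.7, 3.1, 1.9, 2.2]   # Y-sample, n = 4

pooled = sorted(xs + ys)    # Z_(1) < Z_(2) < ... < Z_(N), N = m + n = 7
rank = {z: k for k, z in enumerate(pooled, start=1)}   # rank(V) = k if V = Z_(k)

T_X = sum(rank[x] for x in xs)
print(T_X)   # ranks 3 + 1 + 6 = 10; compare E0(T_X) = m(N+1)/2 = 12 under H0
```

Here $T_X = 10$ falls a little below its null mean $m(N+1)/2 = 12$, so these $X$'s are slightly smaller-ranked than expected under $H_0$.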
This distribution has mean $E_0 R_i = \frac{1}{N} \cdot \frac{N(N+1)}{2} = \frac{N+1}{2}$. A variable $U$ with this distribution has

(1)  $E(U^2) = \frac{1}{N}\sum_{k=1}^N k^2 = \frac{1}{N} \cdot \frac{N(N+1)(2N+1)}{6} = \frac{(N+1)(2N+1)}{6}$.

It follows that the variance $\mathrm{Var}_0(R_i)$ of the distribution is

(2)  $\frac{(N+1)(2N+1)}{6} - \frac{(N+1)^2}{4} = \frac{4N^2 + 6N + 2 - 3N^2 - 6N - 3}{12} = \frac{N^2 - 1}{12}$.

Recalling that the continuous $U[a,b]$ distribution has variance $(b-a)^2/12$, the 12 in the denominator is to be expected. Moreover, let $X$ have the discrete uniform distribution on $\{1, \ldots, N\}$, as each $R_i$ does. Let $V$ have a $U[-1/2, 1/2]$ distribution and be independent of $X$. Then $X + V$ is easily seen to have a $U[1/2, N + \frac{1}{2}]$ distribution, so $\mathrm{Var}(X + V) = N^2/12$, while by independence, $\mathrm{Var}(X + V) = \mathrm{Var}(X) + \mathrm{Var}(V) = \mathrm{Var}(X) + 1/12$, so $\mathrm{Var}(X) = (N^2 - 1)/12$, giving another proof of (2).

We know the mean and variance of each rank $R_i$ for $i = 1, \ldots, m$. To find the mean and variance of the sum $T_X = \sum_{i=1}^m R_i$, the mean is easy, namely $E_0 T_X = m(N+1)/2$. For the variance, we need to find the covariances of ranks $R_i$ and $R_j$ for $i \ne j$, all of which equal the covariance of $R_1$ and $R_2$. These two ranks are not independent, because they cannot have the same value. First we find $E_0(R_1 R_2)$. To make this easier, we can express it as $E_0(R_1 E_0(R_2 \mid R_1))$. (Using the conditional expectation breaks the calculation into two easier ones.) We have

$$E_0(R_2 \mid R_1) = \frac{1}{N-1}\left(\frac{N(N+1)}{2} - R_1\right)$$

because, given $R_1$, $R_2$ can have any of the $N - 1$ values in $\{1, 2, \ldots, N\}$ other than $R_1$, each with probability $1/(N-1)$. It follows by (1) that

$$E_0(R_1 R_2) = E_0(R_1 E_0(R_2 \mid R_1)) = \frac{1}{N-1}\left(\frac{N(N+1)}{2} \cdot \frac{N+1}{2} - \frac{2N^2 + 3N + 1}{6}\right)$$
$$= \frac{1}{N-1} \cdot \frac{3N^3 + 6N^2 + 3N - (4N^2 + 6N + 2)}{12} = \frac{1}{N-1} \cdot \frac{3N^3 + 2N^2 - 3N - 2}{12} = \frac{(3N+2)(N^2-1)}{12(N-1)} = \frac{(3N+2)(N+1)}{12}.$$

Thus under $H_0$ the covariance of $R_1$ and $R_2$ is

$$E_0(R_1 R_2) - (E_0 R_1)^2 = \frac{3N^2 + 5N + 2}{12} - \frac{N^2 + 2N + 1}{4} = \frac{3N^2 + 5N + 2 - 3N^2 - 6N - 3}{12} = -\frac{N+1}{12},$$

and so for $1 \le i < k \le m$,

(3)  $\mathrm{Cov}_0(R_i, R_k) = \mathrm{Cov}_0(R_1, R_2) = -\frac{N+1}{12}$.
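The rank moments above can be verified exactly by brute force: under $H_0$ every ordered pair of distinct ranks $(R_1, R_2)$ is equally likely, so enumerating all such pairs for a small $N$ (here $N = 6$, an arbitrary choice) reproduces the formulas. This check is not part of the handout:

```python
from itertools import permutations
from fractions import Fraction

# Exact check of E0(R_1) = (N+1)/2, Var0(R_1) = (N^2-1)/12, and
# Cov0(R_1, R_2) = -(N+1)/12 by enumerating all equally likely
# ordered pairs of distinct ranks.
N = 6
pairs = list(permutations(range(1, N + 1), 2))   # all (r1, r2), r1 != r2
num = Fraction(len(pairs))

E_R1 = sum(Fraction(r1) for r1, _ in pairs) / num
Var_R1 = sum(Fraction(r1 * r1) for r1, _ in pairs) / num - E_R1**2
E_R1R2 = sum(Fraction(r1 * r2) for r1, r2 in pairs) / num
Cov = E_R1R2 - E_R1**2

print(E_R1)     # (N+1)/2   = 7/2
print(Var_R1)   # (N^2-1)/12 = 35/12
print(Cov)      # -(N+1)/12 = -7/12
```

Using `Fraction` keeps the arithmetic exact, so the agreement with (2) and (3) is an identity rather than a floating-point coincidence.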