DISCERNIBLE PERIODS IN THE HISTORICAL DEVELOPMENT OF STATISTICAL INFERENCE, by RICHARD WAYNE GOBER, A Dissertation


DISCERNIBLE PERIODS IN THE HISTORICAL DEVELOPMENT OF STATISTICAL INFERENCE

by RICHARD WAYNE GOBER

A DISSERTATION submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the School of Commerce and Business Administration in the Graduate School of the University of Alabama. University, Alabama, 1967.

ACKNOWLEDGMENTS

Special acknowledgment is given to my dissertation committee, Dr. John P. Gill, Chairman, and Dr. A. Lee Cobb, for their skilled guidance, valuable suggestions, and encouragement; Dr. Albert Drake and Dr. Tom D. Moore, for their assistance in preparing the final copy. To my wife and family I am grateful for their patience and approbation.

CONTENTS

Chapter I. INTRODUCTION
    Purpose and Scope of the Study
    Background of the Study
    Limitations and Procedures
Chapter II. THE PERIOD OF INDUCTIVE STATISTICAL INFERENCE
    The Principle of Inverse Probability
        Origination by Bayes
        Enunciation and Generalization by Laplace
    Controversy Relating to the Principle
        Criticism of the Principle
        Defense of the Principle
        Repudiation by Fisher
    Sampling Distributions
        The Normal Distribution
            Discovery of the Normal Curve
            The Normal Sampling Distribution
            Use of the Modulus and the Probable Error
        The Chi-Square Distribution
            Discovery by Helmert
            The Chi-Square Test of Goodness of Fit
            Contingency Tables
            Controversy Relating to Degrees of Freedom
        Student's Distribution
            Discovery by Gosset
            Extended Application by Fisher
        Fisher's z Distribution
            Fisher's z Transformation
            The Distribution of z
            Analysis of Variance
Chapter III. THE PERIOD OF CLASSICAL STATISTICAL INFERENCE
    Statistical Interval Estimation
        The Problem of Interval Estimation
        Fiducial Probability
        Confidence Intervals
        Differences in Formulation of Theories
    Statistical Hypotheses Testing
        Statement of the Neyman-Pearson Theory
        Development of the Neyman-Pearson Theory
        The Likelihood Ratio Criterion
        Additional Principles for Test Selection
        The Power of Classical Tests
        Fisher's Reaction to the Neyman-Pearson Theory
        The Behrens-Fisher Problem
        Controversy Over the Behrens-Fisher Solution
        The Controversy Evolves Into a Clash
Chapter IV. THE PERIODS OF NONPARAMETRIC AND DECISION-THEORETIC STATISTICAL INFERENCE
    Nonparametric Statistical Inference
        Nonparametric Test Problems
        The Sign Test
        Order Statistics
        Rank Tests
        Randomization and Independence
        Power of Nonparametric Tests
    Decision-Theoretic Statistical Inference
        Basic Concepts of the Theory
        Historical Development
Chapter V. THE PERIOD OF BAYESIAN STATISTICAL INFERENCE
    Beginning and Basis
        Bayes' Theorem
        A Priori Distributions
        Frequentist Interpretation
        Bayesian Interpretation
        Epistemological Approach
        Personal or Individualistic Approach
    Likelihood Functions
    Bayesian Analysis
        Confidence Distributions
        Robustness
        Tolerance Regions
Chapter VI. SUMMARY AND CONCLUSIONS
    An Overview of the Development
    Discernible Patterns in the Development
        A Priori Information
        Interaction Between Theory and Application
        The Pattern of Controversy
        Increasing Mathematical Content
    Conclusions
APPENDIX
BIBLIOGRAPHY

CHAPTER I. INTRODUCTION

Purpose and Scope of the Study

The purpose of this study is to trace the historical development of that part of modern statistical procedures known as statistical inference.
Although the application of statistical methods is concerned more than ever with the study of great masses of data, percentages, and columns of figures, statistics has moved far beyond the descriptive stage. Using concepts from mathematics, logic, economics, and psychology, modern statistics has developed into a designed "way of thinking" about conclusions or decisions, intended to help a person choose a reasonable course of action under uncertainty. The general theory and methodology is called statistical inference. Statistical inference, roughly described, is that part of statistics in which quantitative measures of uncertainty are employed. Generally, statistical inference is a way to formalize the many mental assumptions, facts, and goals that go into making a decision under uncertainty, where a decision is made by following certain logical principles to reduce the chance of error and inconsistent action. Modern textbook writers avoid the historical approach in favor of a logical presentation of statistical practices and techniques in current use. As a result, the student of statistics is denied the opportunity to learn about the evolution of a flourishing and expanding branch of statistical knowledge. To provide this historical perspective, the scope of this study includes salient contributions of the past two hundred years.

Background of the Study

The problem of using the facts of experience to infer general conclusions is common to all branches of science seeking to establish laws in terms of which a rational explanation may be given for observable phenomena. The facts of experience that provide the impetus for this problem of inductive reasoning do not unrelentingly lead to categorical conclusions. The uncertainty characteristic of drawing conclusions or making decisions on the basis of limited information is analogous to the uncertainty characteristic of the outcomes of games of chance.
The success which seventeenth-century mathematicians had in dealing with the uncertainty characteristic of the outcomes of games of chance made probability a logical basis for handling the uncertainty characteristic of the conclusions or decisions drawn from limited information. Such early attempts to use probability inductively were the beginning of the use of inductive inference in statistics. Various periods are discernible in the development of statistical inference, each of which has a very definite beginning but no end. Until the latter part of the nineteenth century the concept was that of a loosely defined set of ideas expressed in terms of justifying propositions on the basis of data. The problem was that of inductive inference, which introduced methods of making inferences about the general from the study of the particular. After the turn of the century a measure of the uncertainty of an inference was added. In the 1920's and 1930's the number of persons competent to develop new statistical theory increased rapidly, and the concept of inductive behavior appeared, along with the concepts of tests of statistical hypotheses, of errors of the first and second kind, of the power of tests, and of interval estimates. The emphasis was now on the importance of decisions or actions, as opposed to assertions or conclusions, in the face of uncertainty. In 1936, attention was focused on the restrictive assumptions about the functional form of the distribution of the population sampled. The recognition that statistical methods need not be limited to inferences concerning certain parameters resulted in the development of the nonparametric, or distribution-free, approach to statistical inference. During and shortly after World War II, statistical inference advanced with a decision-theoretic, or economic, outlook.
The decision-theoretic formulation made use of sequential analysis to add the concept of risk to the decisions-or-actions approach. The latest period in the development of statistical inference started in the 1950's as the influence of the decision-theoretic framework combined with the notion of personal probability and a procedure for probability calculation, Bayes' theorem, to form the concept of Bayesian statistical inference. The steadily increasing importance of statistical inference in such areas as government, industry, and education, and the rapid growth in the theory and application of statistical methods designed for this purpose, underscore the desirability of an account of the historical development of statistical inference. By increasing the efficiency with which the scientific method can be applied, research in the use of statistical inference has made a lasting contribution to the solution of problems in many divergent fields. Some of these problems, such as predicting the performance of a complex new missile system, ascertaining the effectiveness of new vaccines, evaluating the possible effects of smoking on health, and determining public opinion on various national and local issues, have gained much publicity. Other problems to which statistical inference has made a notable contribution have received less publicity. These problems, no less significant with respect to the implications of their solutions, include the following: discovering how galaxies of stars change shape as they grow old, forecasting the change in long-distance traffic that would follow an increase in toll rates, controlling the quality of manufactured articles, and constructing mathematical models for the purpose of explaining human behavior.
Limitations and Procedures

When sketching a history of scientific thought, such as statistical inference, and trying to indicate periods of birth and development of particular ideas, there is invariably
Recommended publications
  • Confidence Interval Estimation in System Dynamics Models: Bootstrapping Vs
    Confidence Interval Estimation in System Dynamics Models: Bootstrapping vs. Likelihood Ratio Method. Gokhan Dogan, MIT Sloan School of Management, 30 Wadsworth Street, E53-364, Cambridge, Massachusetts 02142. [email protected] Abstract: In this paper we discuss confidence interval estimation for system dynamics models. Confidence interval estimation is important because without confidence intervals, we cannot determine whether an estimated parameter value is significantly different from 0 or any other value, and therefore we cannot determine how much confidence to place in the estimate. We compare two methods for confidence interval estimation. The first, the "likelihood ratio method," is based on maximum likelihood estimation. This method has been used in the system dynamics literature and is built into some popular software packages. It is computationally efficient but requires strong assumptions about the model and data. These assumptions are frequently violated by the autocorrelation, endogeneity of explanatory variables, and heteroskedasticity properties of dynamic models. The second method is called "bootstrapping." Bootstrapping requires more computation but does not impose strong assumptions on the model or data. We describe the methods and illustrate them with a series of applications from actual modeling projects. Considering the empirical results presented in the paper and the fact that the bootstrapping method requires fewer assumptions, we suggest that bootstrapping is a better tool for confidence interval estimation in system dynamics
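The percentile bootstrap the abstract refers to can be sketched in a few lines. This is a generic illustration, not the paper's procedure: the toy normal data, seed, and function names are assumptions introduced here.

```python
import random

random.seed(0)
data = [random.gauss(10, 2) for _ in range(100)]  # toy sample, not from the paper

def percentile_bootstrap_ci(xs, stat, n_boot=2000, alpha=0.05):
    """Resample with replacement, recompute the statistic each time,
    and take empirical quantiles of the bootstrap replicates."""
    reps = sorted(stat([random.choice(xs) for _ in xs]) for _ in range(n_boot))
    lo = reps[int(n_boot * alpha / 2)]
    hi = reps[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

def sample_mean(xs):
    return sum(xs) / len(xs)

lo, hi = percentile_bootstrap_ci(data, sample_mean)
```

As the abstract notes, nothing here assumes normal errors or homoskedasticity; the price is the few thousand re-evaluations of the statistic.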
  • WHAT DID FISHER MEAN by an ESTIMATE?
    Submitted to the Annals of Applied Probability. WHAT DID FISHER MEAN BY AN ESTIMATE? By Esa Uusipaikka, University of Turku. Fisher's Method of Maximum Likelihood is shown to be a procedure for the construction of likelihood intervals or regions, instead of a procedure of point estimation. Based on Fisher's articles and books it is justified that by estimation Fisher meant the construction of likelihood intervals or regions from an appropriate likelihood function, and that an estimate is a statistic, that is, a function from a sample space to a parameter space such that the likelihood function obtained from the sampling distribution of the statistic at the observed value of the statistic is used to construct likelihood intervals or regions. Thus the Problem of Estimation is how to choose the 'best' estimate. Fisher's solution for the problem of estimation is the Maximum Likelihood Estimate (MLE). Fisher's Theory of Statistical Estimation is a chain of ideas used to justify the MLE as the solution of the problem of estimation. The construction of confidence intervals by the delta method from the asymptotic normal distribution of the MLE is based on Fisher's ideas, but is against his 'logic of statistical inference'. Instead, the construction of confidence intervals from the profile likelihood function of a given interest function of the parameter vector is considered a solution more in line with Fisher's 'ideology'. A new method of calculation of profile likelihood-based confidence intervals for general smooth interest functions in general statistical models is considered. 1. Introduction. 'Collected Papers of R.A.
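Likelihood intervals of the kind discussed here are easiest to see for a normal mean with known sigma, where the set {mu : log L(mu) >= log L(muhat) - c} is an exact interval and the cutoff c = 1.92 (half the 0.95 quantile of chi-square with 1 degree of freedom) reproduces the familiar 95% interval. A minimal sketch; the data values and the cutoff default are illustrative assumptions, not taken from the paper:

```python
import math
from statistics import mean

def likelihood_interval(xs, sigma, c=1.92):
    """Likelihood interval {mu : logL(mu) >= logL(muhat) - c} for a normal
    mean with known sigma. For this model
        logL(mu) - logL(muhat) = -n * (mu - muhat)**2 / (2 * sigma**2),
    so the interval is muhat +/- sigma * sqrt(2 * c / n)."""
    n = len(xs)
    muhat = mean(xs)
    half = sigma * math.sqrt(2 * c / n)
    return muhat - half, muhat + half

lo, hi = likelihood_interval([0.0, 1.0, 2.0, 3.0], sigma=1.0)
```

For general models the interval has to be found numerically from the profile likelihood; only in this known-sigma normal case does it close in one line.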
  • A Framework for Sample Efficient Interval Estimation with Control
    A Framework for Sample Efficient Interval Estimation with Control Variates. Shengjia Zhao, Christopher Yeh, Stefano Ermon (Stanford University). Abstract: We consider the problem of estimating confidence intervals for the mean of a random variable, where the goal is to produce the smallest possible interval for a given number of samples. While minimax optimal algorithms are known for this problem in the general case, improved performance is possible under additional assumptions. In particular, we design an estimation algorithm to take advantage of side information in the form of a control variate, leveraging order statistics. Under certain conditions on the quality of the control variates, we show improved asymptotic efficiency compared to existing estimation algorithms. Empirically, we demonstrate superior performance. From the body of the paper: …timates, e.g. for each individual district or county, meaning that we need to draw a sufficient number of samples for every district or county. Another difficulty can arise when we need high accuracy (i.e. a small confidence interval), since confidence intervals produced by typical concentration inequalities have size O(1/√(number of samples)). In other words, to reduce the size of a confidence interval by a factor of 10, we need 100 times more samples. This dilemma is generally unavoidable because concentration inequalities such as Chernoff or Chebychev are minimax optimal: there exist distributions for which these inequalities cannot be improved. No-free-lunch results (Van der Vaart, 2000) imply that any alternative estimation algorithm that performs better (i.e. outputs a confidence interval with smaller size) on some problems must perform worse on other problems.
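The O(1/√n) behavior quoted above is easy to check concretely with Hoeffding's inequality, which for a [0, 1]-valued variable gives a 1 - alpha interval of half-width √(ln(2/alpha) / (2n)). This generic sketch (not the paper's control-variate algorithm) shows the 10x-narrower-needs-100x-samples trade-off:

```python
import math

def hoeffding_halfwidth(n, alpha=0.05):
    """Half-width of the 1 - alpha Hoeffding interval for the mean of a
    [0, 1]-valued variable: solve 2 * exp(-2 * n * t**2) = alpha for t."""
    return math.sqrt(math.log(2 / alpha) / (2 * n))

w_small = hoeffding_halfwidth(100)
w_large = hoeffding_halfwidth(10_000)  # 100x more samples
ratio = w_small / w_large             # width shrinks by exactly 10x
```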
  • Likelihood Confidence Intervals When Only Ranges Are Available
    Article. Likelihood Confidence Intervals When Only Ranges are Available. Szilárd Nemes, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg 405 30, Sweden; [email protected]. Received: 22 December 2018; Accepted: 3 February 2019; Published: 6 February 2019. Abstract: Research papers represent an important and rich source of comparative data. The challenge is to extract the information of interest. Herein, we look at the possibilities for constructing confidence intervals for sample averages when only ranges are available, using maximum likelihood estimation with order statistics (MLEOS). Using Monte Carlo simulation, we looked at the confidence interval coverage characteristics for likelihood ratio and Wald-type approximate 95% confidence intervals. We saw indications that the likelihood ratio interval had better coverage and narrower intervals. For single-parameter distributions, MLEOS is directly applicable. For location-scale distributions, it is recommended that the variance (or a combination involving it) be estimated using standard formulas and used as a plug-in. Keywords: range, likelihood, order statistics, coverage. 1. Introduction. One of the tasks statisticians face is extracting, and possibly inferring, biologically/clinically relevant information from published papers. This aspect of applied statistics is well developed, and one can choose from many easy-to-use and performant algorithms that aid problem solving. Often, these algorithms aim to help statisticians/practitioners extract the variability of different measures or biomarkers that is needed for power calculation and research design [1,2]. While these algorithms are efficient and easy to use, they mostly are not probabilistic in nature, thus they do not offer means for statistical inference.
Yet another field of applied statistics, one that aims to help practitioners extract relevant information when only partial data is available, proposes a probabilistic approach based on order statistics.
  • Methods to Calculate Uncertainty in the Estimated Overall Effect Size from a Random-Effects Meta-Analysis
    Methods to calculate uncertainty in the estimated overall effect size from a random-effects meta-analysis. Areti Angeliki Veroniki1,2* (Email: [email protected]); Dan Jackson3 (Email: [email protected]); Ralf Bender4 (Email: [email protected]); Oliver Kuss5,6 (Email: [email protected]); Dean Langan7 (Email: [email protected]); Julian PT Higgins8 (Email: [email protected]); Guido Knapp9 (Email: [email protected]); Georgia Salanti10 (Email: [email protected]). 1 Li Ka Shing Knowledge Institute, St. Michael's Hospital, 209 Victoria Street, East Building, Toronto, Ontario, M5B 1T8, Canada. 2 Department of Primary Education, School of Education, University of Ioannina, Ioannina, Greece. 3 MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 0SR, U.K. 4 Department of Medical Biometry, Institute for Quality and Efficiency in Health Care (IQWiG), Im Mediapark 8, 50670 Cologne, Germany. 5 Institute for Biometrics and Epidemiology, German Diabetes Center, Leibniz Institute for Diabetes Research at Heinrich Heine University, 40225 Düsseldorf, Germany. 6 Institute of Medical Statistics, Heinrich-Heine-University, Medical Faculty, Düsseldorf, Germany. 7 Institute of Child Health, UCL, London, WC1E 6BT, UK. 8 Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, U.K. This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1002/jrsm.1319. This article is protected by copyright. All rights reserved.
  • 32. Statistics
    32. STATISTICS. Revised September 2007 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in High Energy Physics. In statistics, we are interested in using a given sample of data to make inferences about a probabilistic model, e.g., to assess the model's validity or to determine the values of its parameters. There are two main approaches to statistical inference, which we may call frequentist and Bayesian. In frequentist statistics, probability is interpreted as the frequency of the outcome of a repeatable experiment. The most important tools in this framework are parameter estimation, covered in Section 32.1, and statistical tests, discussed in Section 32.2. Frequentist confidence intervals, which are constructed so as to cover the true value of a parameter with a specified probability, are treated in Section 32.3.2. Note that in frequentist statistics one does not define a probability for a hypothesis or for a parameter. Frequentist statistics provides the usual tools for reporting objectively the outcome of an experiment without needing to incorporate prior beliefs concerning the parameter being measured or the theory being tested. As such they are used for reporting most measurements and their statistical uncertainties in High Energy Physics. In Bayesian statistics, the interpretation of probability is more general and includes degree of belief (called subjective probability). One can then speak of a probability density function (p.d.f.) for a parameter, which expresses one's state of knowledge about where its true value lies. Bayesian methods allow for a natural way to input additional information, such as physical boundaries and subjective information; in fact they require as input the prior p.d.f.
  • Interval Estimation Statistics (OA3102)
    Module 5: Interval Estimation. Statistics (OA3102). Professor Ron Fricker, Naval Postgraduate School, Monterey, California. Reading assignment: WM&S chapter 8.5-8.9. Revision: 1-12. Goals for this Module: interval estimation, i.e., confidence intervals (terminology; the pivotal method for creating confidence intervals); types of intervals (large-sample confidence intervals; one-sided vs. two-sided intervals; small-sample confidence intervals for the mean and for differences in two means; a confidence interval for the variance); sample size calculations. Interval Estimation: Instead of estimating a parameter with a single number, estimate it with an interval. Ideally, the interval will have two properties: it will contain the target parameter θ, and it will be relatively narrow. But, as we will see, since the interval endpoints are a function of the data, they will be variable, so we cannot be sure θ will fall in the interval. Objective for Interval Estimation: So, we can't be sure that the interval contains θ, but we will be able to calculate the probability that the interval contains θ. Interval estimation objective: find an interval estimator capable of generating narrow intervals with a high probability of enclosing θ. Why Interval Estimation? As before, we want to use a sample to infer something about a larger population. However, samples are variable; we'd get different values with each new sample, so our point estimates are variable. Point estimates do not give any information about how far
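The "sample size calculations" item in the module outline typically answers: how many observations are needed so that a 95% interval's margin of error is at most E? For a known-sigma z interval, solve z·σ/√n ≤ E for n. A sketch with illustrative numbers (not from the slides):

```python
import math
from statistics import NormalDist

def sample_size_for_margin(sigma, margin, conf=0.95):
    """Smallest n such that z * sigma / sqrt(n) <= margin
    for the known-sigma z interval; z is about 1.96 at 95%."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return math.ceil((z * sigma / margin) ** 2)

n_needed = sample_size_for_margin(sigma=10, margin=2)
```

Because n grows with the square of z·σ/E, halving the desired margin quadruples the required sample size.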
  • Confidence Intervals in Analysis and Reporting of Clinical Trials Guangbin Peng, Eli Lilly and Company, Indianapolis, IN
    Confidence Intervals in Analysis and Reporting of Clinical Trials. Guangbin Peng, Eli Lilly and Company, Indianapolis, IN. ABSTRACT: Regulatory agencies around the world have recommended reporting confidence intervals for treatment differences along with the results of significance tests. SAS provides easy and convenient ways to produce confidence intervals using procedures such as PROC GLM and PROC UNIVARIATE in conjunction with ODS (output delivery system). In this paper, I will discuss the relationship between significance tests and confidence intervals, summarize the types of confidence intervals used in clinical study reports, and provide examples from clinical trials to illustrate the computation of distribution-dependent confidence intervals for the mean treatment difference and distribution-free confidence intervals for the median response within each treatment group using SAS. INTRODUCTION: Confidence interval estimation and significance testing (hypothesis testing) … for more frequent use of confidence intervals (Simon, 1993). There is a close relationship between confidence intervals and significance tests (Hahn and Meeker, 1991); in fact, a confidence interval can often be used to test a hypothesis. If the 100(1-α)% confidence interval for the mean treatment difference in a clinical trial does not contain zero, there is evidence to indicate a treatment difference at the 100α% significance level. This strategy is equivalent to the hypothesis test that rejects the null hypothesis of no mean treatment difference at the level of α. Compared to p-values, confidence intervals are generally more informative. They provide quantitative bounds that express the uncertainty inherent in estimation, instead of merely an accept or reject statement. The length of a confidence interval depends on the sample size; this influence of sample size is evident from observing the length of the interval, while this is not the case for a significance test.
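The duality described in the abstract (a 95% interval excluding zero gives the same verdict as a 5% two-sided test) can be sketched with a large-sample z interval. The treatment-difference values below are invented for illustration, and for a sample this small a t interval would really be preferred:

```python
from statistics import NormalDist, mean, stdev

def mean_diff_ci(diffs, conf=0.95):
    """Large-sample normal-approximation CI for a mean treatment difference."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    se = stdev(diffs) / len(diffs) ** 0.5
    m = mean(diffs)
    return m - z * se, m + z * se

diffs = [0.8, 1.2, -0.1, 0.9, 1.5, 0.4, 0.7, 1.1, 0.2, 0.6]  # hypothetical data
lo, hi = mean_diff_ci(diffs)
rejects_null = not (lo <= 0.0 <= hi)  # same verdict as a 5% two-sided z test
```

Unlike the bare accept/reject answer, the interval also reports how large the difference plausibly is.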
  • Interval Estimation for Drop-The-Losers Designs
    Biometrika (2010), 97, 2, pp. 405–418. doi: 10.1093/biomet/asq003. © 2010 Biometrika Trust. Advance Access publication 14 April 2010. Printed in Great Britain. Interval estimation for drop-the-losers designs. BY SAMUEL S. WU, Department of Epidemiology and Health Policy Research, University of Florida, Gainesville, Florida 32610, U.S.A. ([email protected]); WEIZHEN WANG, Department of Mathematics and Statistics, Wright State University, Dayton, Ohio 45435, U.S.A. ([email protected]); AND MARK C. K. YANG, Department of Statistics, University of Florida, Gainesville, Florida 32610, U.S.A. ([email protected]). SUMMARY: In the first stage of a two-stage, drop-the-losers design, a candidate for the best treatment is selected. At the second stage, additional observations are collected to decide whether the candidate is actually better than the control. The design also allows the investigator to stop the trial for ethical reasons at the end of the first stage if there is already strong evidence of futility or superiority. Two types of tests have recently been developed, one based on the combined means and the other based on the combined p-values, but corresponding interval estimators are unavailable except in special cases. The problem is that, in most cases, the interval estimators depend on the mean configuration of all treatments in the first stage, which is unknown. In this paper, we prove a basic stochastic ordering lemma that enables us to bridge the gap between hypothesis testing and interval estimation.
  • Interval Estimation, Point Estimation, and Null Hypothesis Significance Testing Calibrated by an Estimated Posterior Probability of the Null Hypothesis, by David R. Bickel
    Interval estimation, point estimation, and null hypothesis significance testing calibrated by an estimated posterior probability of the null hypothesis. David R. Bickel, 2020. HAL Id: hal-02496126, https://hal.archives-ouvertes.fr/hal-02496126. Preprint submitted on 2 Mar 2020. HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. March 2, 2020. David R. Bickel, Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, Department of Mathematics and Statistics, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, K1H 8M5. +01 (613) 562-5800, ext. 8670. [email protected] Abstract: Much of the blame for failed attempts to replicate reports of scientific findings has been placed on ubiquitous and persistent misinterpretations of the p value. An increasingly popular solution is to transform a two-sided p value to a lower bound on a Bayes factor. Another solution is to interpret a one-sided p value as an approximate posterior probability.
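One widely used transformation of the kind the abstract mentions is the -e·p·ln(p) lower bound on the Bayes factor (Sellke, Bayarri and Berger, 2001). This is a generic illustration of that bound, not necessarily the calibration Bickel develops:

```python
import math

def bayes_factor_lower_bound(p):
    """-e * p * ln(p): a lower bound on the Bayes factor in favor of H0,
    valid for 0 < p < 1/e (Sellke-Bayarri-Berger calibration)."""
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("bound applies only for 0 < p < 1/e")
    return -math.e * p * math.log(p)

def posterior_null_lower_bound(p, prior_odds=1.0):
    """Corresponding lower bound on P(H0 | data), given prior odds for H0."""
    b = prior_odds * bayes_factor_lower_bound(p)
    return b / (1.0 + b)
```

At p = 0.05 the bound gives a Bayes factor of about 0.41, i.e. a posterior probability of H0 of at least roughly 0.29 under equal prior odds, which is the usual argument that p = 0.05 is weaker evidence than it appears.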
  • Principles of Statistical Inference
    Principles of Statistical Inference In this important book, D. R. Cox develops the key concepts of the theory of statistical inference, in particular describing and comparing the main ideas and controversies over foundational issues that have rumbled on for more than 200 years. Continuing a 60-year career of contribution to statistical thought, Professor Cox is ideally placed to give the comprehensive, balanced account of the field that is now needed. The careful comparison of frequentist and Bayesian approaches to inference allows readers to form their own opinion of the advantages and disadvantages. Two appendices give a brief historical overview and the author’s more personal assessment of the merits of different ideas. The content ranges from the traditional to the contemporary. While specific applications are not treated, the book is strongly motivated by applications across the sciences and associated technologies. The underlying mathematics is kept as elementary as feasible, though some previous knowledge of statistics is assumed. This book is for every serious user or student of statistics – in particular, for anyone wanting to understand the uncertainty inherent in conclusions from statistical analyses. Principles of Statistical Inference D.R. COX Nuffield College, Oxford CAMBRIDGE UNIVERSITY PRESS Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo Cambridge University Press The Edinburgh Building, Cambridge CB2 8RU, UK Published in the United States of America by Cambridge University Press, New York www.cambridge.org Information on this title: www.cambridge.org/9780521866736 © D. R. Cox 2006 This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.
  • Statistical Inference II: Interval Estimation and Hypothesis Tests
    Statistical Inference II: The Principles of Interval Estimation and Hypothesis Testing. 1. Introduction. In Statistical Inference I we described how to estimate the mean and variance of a population, and the properties of those estimation procedures. In Statistical Inference II we introduce two more aspects of statistical inference: confidence intervals and hypothesis tests. In contrast to a point estimate of the population mean β, like b = 17.158, a confidence interval estimate is a range of values which may contain the true population mean. A confidence interval estimate contains information not only about the location of the population mean but also about the precision with which we estimate it. A hypothesis test is a statistical procedure for using data to check the compatibility of a conjecture about a population with the information contained in a sample of data. Continuing the example from Statistical Inference I, suppose airplane designers have been basing seat designs on the assumption that the average hip width of U.S. passengers is 16 inches. Is the information contained in the random sample of 50 hip measurements compatible with this conjecture, or not? These are the issues we consider in Statistical Inference II. 2. Interval Estimation for the Mean of a Normal Population When σ² is Known. Let Y be a random variable from a normal population; that is, assume Y ~ N(β, σ²). Assume that we have a random sample of size T from this population, Y1, Y2, ..., YT. The least squares estimator of the population mean is b = (1/T) Σ_{i=1}^{T} Y_i (2.1). This estimator has a normal distribution if the population is normal: b ~ N(β, σ²/T) (2.2). For the present, let us assume that the population variance σ² is known.
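Given b ~ N(β, σ²/T) as in (2.2), the standard 95% interval is b ± 1.96·σ/√T. A numeric sketch in the spirit of the hip-width example: b = 17.158 and T = 50 come from the text, but σ = 3.5 inches is assumed purely for illustration, since the excerpt does not give σ.

```python
from statistics import NormalDist

def known_sigma_interval(b, sigma, T, conf=0.95):
    """Interval for beta based on b ~ N(beta, sigma^2 / T)."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)  # about 1.96 at 95%
    half = z * sigma / T ** 0.5
    return b - half, b + half

# b = 17.158 is from the text; sigma = 3.5 is an assumed value for illustration.
lo, hi = known_sigma_interval(b=17.158, sigma=3.5, T=50)
compatible_with_16 = lo <= 16.0 <= hi
```

With these assumed numbers the interval lies entirely above 16 inches, previewing the hypothesis-test question the passage poses about the seat-design conjecture.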