2 Yule: the Time–Correlation Problem, Nonsense Correlations, Periodicity and Autoregressions
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
F:\RSS\Me\Society's Mathemarica
School of Social Sciences Economics Division University of Southampton Southampton SO17 1BJ, UK Discussion Papers in Economics and Econometrics Mathematics in the Statistical Society 1883-1933 John Aldrich No. 0919 This paper is available on our website http://www.southampton.ac.uk/socsci/economics/research/papers ISSN 0966-4246 Mathematics in the Statistical Society 1883-1933* John Aldrich Economics Division School of Social Sciences University of Southampton Southampton SO17 1BJ UK e-mail: [email protected] Abstract This paper considers the place of mathematical methods based on probability in the work of the London (later Royal) Statistical Society in the half-century 1883-1933. The end-points are chosen because mathematical work started to appear regularly in 1883 and 1933 saw the formation of the Industrial and Agricultural Research Section– to promote these particular applications was to encourage mathematical methods. In the period three movements are distinguished, associated with major figures in the history of mathematical statistics–F. Y. Edgeworth, Karl Pearson and R. A. Fisher. The first two movements were based on the conviction that the use of mathematical methods could transform the way the Society did its traditional work in economic/social statistics while the third movement was associated with an enlargement in the scope of statistics. The study tries to synthesise research based on the Society’s archives with research on the wider history of statistics. Key names : Arthur Bowley, F. Y. Edgeworth, R. A. Fisher, Egon Pearson, Karl Pearson, Ernest Snow, John Wishart, G. Udny Yule. Keywords : History of Statistics, Royal Statistical Society, mathematical methods. -
Two Principles of Evidence and Their Implications for the Philosophy of Scientific Method
TWO PRINCIPLES OF EVIDENCE AND THEIR IMPLICATIONS FOR THE PHILOSOPHY OF SCIENTIFIC METHOD by Gregory Stephen Gandenberger BA, Philosophy, Washington University in St. Louis, 2009 MA, Statistics, University of Pittsburgh, 2014 Submitted to the Graduate Faculty of the Kenneth P. Dietrich School of Arts and Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Pittsburgh 2015 UNIVERSITY OF PITTSBURGH KENNETH P. DIETRICH SCHOOL OF ARTS AND SCIENCES This dissertation was presented by Gregory Stephen Gandenberger It was defended on April 14, 2015 and approved by Edouard Machery, Pittsburgh, Dietrich School of Arts and Sciences Satish Iyengar, Pittsburgh, Dietrich School of Arts and Sciences John Norton, Pittsburgh, Dietrich School of Arts and Sciences Teddy Seidenfeld, Carnegie Mellon University, Dietrich College of Humanities & Social Sciences James Woodward, Pittsburgh, Dietrich School of Arts and Sciences Dissertation Director: Edouard Machery, Pittsburgh, Dietrich School of Arts and Sciences ii Copyright © by Gregory Stephen Gandenberger 2015 iii TWO PRINCIPLES OF EVIDENCE AND THEIR IMPLICATIONS FOR THE PHILOSOPHY OF SCIENTIFIC METHOD Gregory Stephen Gandenberger, PhD University of Pittsburgh, 2015 The notion of evidence is of great importance, but there are substantial disagreements about how it should be understood. One major locus of disagreement is the Likelihood Principle, which says roughly that an observation supports a hypothesis to the extent that the hy- pothesis predicts it. The Likelihood Principle is supported by axiomatic arguments, but the frequentist methods that are most commonly used in science violate it. This dissertation advances debates about the Likelihood Principle, its near-corollary the Law of Likelihood, and related questions about statistical practice. -
Bayesian Inference for Heterogeneous Event Counts
Bayesian Inference for Heterogeneous Event Counts ANDREW D. MARTIN Washington University This article presents an integrated set of Bayesian tools one can use to model hetero- geneous event counts. While models for event count cross sections are now widely used, little has been written about how to model counts when contextual factors introduce heterogeneity. The author begins with a discussion of Bayesian cross-sectional count models and discusses an alternative model for counts with overdispersion. To illustrate the Bayesian framework, the author fits the model to the number of women’s rights cosponsorships for each member of the 83rd to 102nd House of Representatives. The model is generalized to allow for contextual heterogeneity. The hierarchical model allows one to explicitly model contextual factors and test alternative contextual explana- tions, even with a small number of contextual units. The author compares the estimates from this model with traditional approaches and discusses software one can use to easily implement these Bayesian models with little start-up cost. Keywords: event count; Markov chain Monte Carlo; hierarchical Bayes; multilevel models; BUGS INTRODUCTION Heterogeneity is what makes society and, for that matter, statistics interesting. However, until recently, little attention has been paidto modeling count data observed in heterogeneous clusters (a notable exception is Sampson, Raudenbush, and Earls 1997). Explanations AUTHOR’S NOTE: This research is supported by the National Science Foundation Law and Social Sciences and Methodology, Measurement, and Statistics Sections, Grant SES-0135855 to Washing- ton University. I would like to thank Dave Peterson, Kevin Quinn, and Kyle Saunders for their helpful comments. I am particularly indebted to Christina Wolbrecht for sharing her valuable data set with me. -
The Likelihood Principle
1 01/28/99 ãMarc Nerlove 1999 Chapter 1: The Likelihood Principle "What has now appeared is that the mathematical concept of probability is ... inadequate to express our mental confidence or diffidence in making ... inferences, and that the mathematical quantity which usually appears to be appropriate for measuring our order of preference among different possible populations does not in fact obey the laws of probability. To distinguish it from probability, I have used the term 'Likelihood' to designate this quantity; since both the words 'likelihood' and 'probability' are loosely used in common speech to cover both kinds of relationship." R. A. Fisher, Statistical Methods for Research Workers, 1925. "What we can find from a sample is the likelihood of any particular value of r [a parameter], if we define the likelihood as a quantity proportional to the probability that, from a particular population having that particular value of r, a sample having the observed value r [a statistic] should be obtained. So defined, probability and likelihood are quantities of an entirely different nature." R. A. Fisher, "On the 'Probable Error' of a Coefficient of Correlation Deduced from a Small Sample," Metron, 1:3-32, 1921. Introduction The likelihood principle as stated by Edwards (1972, p. 30) is that Within the framework of a statistical model, all the information which the data provide concerning the relative merits of two hypotheses is contained in the likelihood ratio of those hypotheses on the data. ...For a continuum of hypotheses, this principle -
School of Social Sciences Economics Division University of Southampton Southampton SO17 1BJ, UK
School of Social Sciences Economics Division University of Southampton Southampton SO17 1BJ, UK Discussion Papers in Economics and Econometrics Professor A L Bowley’s Theory of the Representative Method John Aldrich No. 0801 This paper is available on our website http://www.socsci.soton.ac.uk/economics/Research/Discussion_Papers ISSN 0966-4246 Key names: Arthur L. Bowley, F. Y. Edgeworth, , R. A. Fisher, Adolph Jensen, J. M. Keynes, Jerzy Neyman, Karl Pearson, G. U. Yule. Keywords: History of Statistics, Sampling theory, Bayesian inference. Professor A. L. Bowley’s Theory of the Representative Method * John Aldrich Economics Division School of Social Sciences University of Southampton Southampton SO17 1BJ UK e-mail: [email protected] Abstract Arthur. L. Bowley (1869-1957) first advocated the use of surveys–the “representative method”–in 1906 and started to conduct surveys of economic and social conditions in 1912. Bowley’s 1926 memorandum for the International Statistical Institute on the “Measurement of the precision attained in sampling” was the first large-scale theoretical treatment of sample surveys as he conducted them. This paper examines Bowley’s arguments in the context of the statistical inference theory of the time. The great influence on Bowley’s conception of statistical inference was F. Y. Edgeworth but by 1926 R. A. Fisher was on the scene and was attacking Bayesian methods and promoting a replacement of his own. Bowley defended his Bayesian method against Fisher and against Jerzy Neyman when the latter put forward his concept of a confidence interval and applied it to the representative method. * Based on a talk given at the Sample Surveys and Bayesian Statistics Conference, Southampton, August 2008. -
The Contrasting Strategies of Almroth Wright and Bradford Hill to Capture the Nomenclature of Controlled Trials
6 Whose Words are they Anyway? The Contrasting Strategies of Almroth Wright and Bradford Hill to Capture the Nomenclature of Controlled Trials When Almroth Wright (1861–1947) began to endure the experience common to many new fathers, of sleepless nights caused by his crying infant, he characteristically applied his physiological knowledge to the problem in a deductive fashion. Reasoning that the child’s distress was the result of milk clotting in its stomach, he added citric acid to the baby’s bottle to prevent the formation of clots. The intervention appeared to succeed; Wright published his conclusions,1 and citrated milk was eventually advocated by many paediatricians in the management of crying babies.2 By contrast, Austin Bradford Hill (1897–1991) approached his wife’s insomnia in an empirical, inductive manner. He proposed persuading his wife to take hot milk before retiring, on randomly determined nights, and recording her subsequent sleeping patterns. Fortunately for domestic harmony, he described this experiment merely as an example – actually to perform it would, he considered, be ‘exceedingly rash’.3 These anecdotes serve to illustrate the very different approaches to therapeutic experiment maintained by the two men. Hill has widely been credited as the main progenitor of the modern randomised controlled trial (RCT).4 He was principally responsible for the design of the first two published British RCTs, to assess the efficacy of streptomycin in tuberculosis5 and the effectiveness of whooping cough vaccine6 and, in particular, was responsible for the introduction of randomisation into their design.7 Almroth Wright remained, throughout his life, an implacable opponent of this ‘statistical method’, preferring what he perceived as the greater certainty of a ‘crucial experiment’ performed in the laboratory. -
Frequentist and Bayesian Statistics
Faculty of Life Sciences Frequentist and Bayesian statistics Claus Ekstrøm E-mail: [email protected] Outline 1 Frequentists and Bayesians • What is a probability? • Interpretation of results / inference 2 Comparisons 3 Markov chain Monte Carlo Slide 2— PhD (Aug 23rd 2011) — Frequentist and Bayesian statistics What is a probability? Two schools in statistics: frequentists and Bayesians. Slide 3— PhD (Aug 23rd 2011) — Frequentist and Bayesian statistics Frequentist school School of Jerzy Neyman, Egon Pearson and Ronald Fischer. Slide 4— PhD (Aug 23rd 2011) — Frequentist and Bayesian statistics Bayesian school “School” of Thomas Bayes P(D|H) · P(H) P(H|D)= P(D|H) · P(H)dH Slide 5— PhD (Aug 23rd 2011) — Frequentist and Bayesian statistics Frequentists Frequentists talk about probabilities in relation to experiments with a random component. Relative frequency of an event, A,isdefinedas number of outcomes consistent with A P(A)= number of experiments The probability of event A is the limiting relative frequency. Relative frequency 0.0 0.2 0.4 0.6 0.8 1.0 020406080100 n Slide 6— PhD (Aug 23rd 2011) — Frequentist and Bayesian statistics Frequentists — 2 The definition restricts the things we can add probabilities to: What is the probability of there being life on Mars 100 billion years ago? We assume that there is an unknown but fixed underlying parameter, θ, for a population (i.e., the mean height on Danish men). Random variation (environmental factors, measurement errors, ...) means that each observation does not result in the true value. Slide 7— PhD (Aug 23rd 2011) — Frequentist and Bayesian statistics The meta-experiment idea Frequentists think of meta-experiments and consider the current dataset as a single realization from all possible datasets. -
Sequential Monte Carlo Methods for Inference and Prediction of Latent Time-Series
SSStttooonnnyyy BBBrrrooooookkk UUUnnniiivvveeerrrsssiiitttyyy The official electronic file of this thesis or dissertation is maintained by the University Libraries on behalf of The Graduate School at Stony Brook University. ©©© AAAllllll RRRiiiggghhhtttsss RRReeessseeerrrvvveeeddd bbbyyy AAAuuuttthhhooorrr... Sequential Monte Carlo Methods for Inference and Prediction of Latent Time-series A Dissertation presented by Iñigo Urteaga to The Graduate School in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Electrical Engineering Stony Brook University August 2016 Stony Brook University The Graduate School Iñigo Urteaga We, the dissertation committee for the above candidate for the Doctor of Philosophy degree, hereby recommend acceptance of this dissertation. Petar M. Djuri´c,Dissertation Advisor Professor, Department of Electrical and Computer Engineering Mónica F. Bugallo, Chairperson of Defense Associate Professor, Department of Electrical and Computer Engineering Yue Zhao Assistant Professor, Department of Electrical and Computer Engineering Svetlozar Rachev Professor, Department of Applied Math and Statistics This dissertation is accepted by the Graduate School. Nancy Goroff Interim Dean of the Graduate School ii Abstract of the Dissertation Sequential Monte Carlo Methods for Inference and Prediction of Latent Time-series by Iñigo Urteaga Doctor of Philosophy in Electrical Engineering Stony Brook University 2016 In the era of information-sensing mobile devices, the Internet- of-Things and Big Data, research on advanced methods for extracting information from data has become extremely critical. One important task in this area of work is the analysis of time-varying phenomena, observed sequentially in time. This endeavor is relevant in many applications, where the goal is to infer the dynamics of events of interest described by the data, as soon as new data-samples are acquired. -
Identification of Spikes in Time Series Arxiv:1801.08061V1 [Stat.AP]
Identification of Spikes in Time Series Dana E. Goin1 and Jennifer Ahern1 1 Division of Epidemiology, School of Public Health, University of California, Berkeley, California January 25, 2018 arXiv:1801.08061v1 [stat.AP] 24 Jan 2018 1 Abstract Identification of unexpectedly high values in a time series is useful for epi- demiologists, economists, and other social scientists interested in the effect of an exposure spike on an outcome variable. However, the best method to identify spikes in time series is not known. This paper aims to fill this gap by testing the performance of several spike detection methods in a sim- ulation setting. We created simulations parameterized by monthly violence rates in nine California cities that represented different series features, and randomly inserted spikes into the series. We then compared the ability to detect spikes of the following methods: ARIMA modeling, Kalman filtering and smoothing, wavelet modeling with soft thresholding, and an iterative outlier detection method. We varied the magnitude of spikes from 10-50% of the mean rate over the study period and varied the number of spikes inserted from 1 to 10. We assessed performance of each method using sensitivity and specificity. The Kalman filtering and smoothing procedure had the best over- all performance. We applied Kalman filtering and smoothing to the monthly violence rates in nine California cities and identified spikes in the rate over the 2005-2012 period. 2 List of Tables 1 ARIMA models and parameters by city . 21 2 Average sensitivity of spike identification methods for spikes of magnitudes ranging from 10-50% increase over series mean . -
Statistical Inference a Work in Progress
Ronald Christensen Department of Mathematics and Statistics University of New Mexico Copyright c 2019 Statistical Inference A Work in Progress Springer v Seymour and Wes. Preface “But to us, probability is the very guide of life.” Joseph Butler (1736). The Analogy of Religion, Natural and Revealed, to the Constitution and Course of Nature, Introduction. https://www.loc.gov/resource/dcmsiabooks. analogyofreligio00butl_1/?sp=41 Normally, I wouldn’t put anything this incomplete on the internet but I wanted to make parts of it available to my Advanced Inference Class, and once it is up, you have lost control. Seymour Geisser was a mentor to Wes Johnson and me. He was Wes’s Ph.D. advisor. Near the end of his life, Seymour was finishing his 2005 book Modes of Parametric Statistical Inference and needed some help. Seymour asked Wes and Wes asked me. I had quite a few ideas for the book but then I discovered that Sey- mour hated anyone changing his prose. That was the end of my direct involvement. The first idea for this book was to revisit Seymour’s. (So far, that seems only to occur in Chapter 1.) Thinking about what Seymour was doing was the inspiration for me to think about what I had to say about statistical inference. And much of what I have to say is inspired by Seymour’s work as well as the work of my other professors at Min- nesota, notably Christopher Bingham, R. Dennis Cook, Somesh Das Gupta, Mor- ris L. Eaton, Stephen E. Fienberg, Narish Jain, F. Kinley Larntz, Frank B. -
“It Took a Global Conflict”— the Second World War and Probability in British
Keynames: M. S. Bartlett, D.G. Kendall, stochastic processes, World War II Wordcount: 17,843 words “It took a global conflict”— the Second World War and Probability in British Mathematics John Aldrich Economics Department University of Southampton Southampton SO17 1BJ UK e-mail: [email protected] Abstract In the twentieth century probability became a “respectable” branch of mathematics. This paper describes how in Britain the transformation came after the Second World War and was due largely to David Kendall and Maurice Bartlett who met and worked together in the war and afterwards worked on stochastic processes. Their interests later diverged and, while Bartlett stayed in applied probability, Kendall took an increasingly pure line. March 2020 Probability played no part in a respectable mathematics course, and it took a global conflict to change both British mathematics and D. G. Kendall. Kingman “Obituary: David George Kendall” Introduction In the twentieth century probability is said to have become a “respectable” or “bona fide” branch of mathematics, the transformation occurring at different times in different countries.1 In Britain it came after the Second World War with research on stochastic processes by Maurice Stevenson Bartlett (1910-2002; FRS 1961) and David George Kendall (1918-2007; FRS 1964).2 They also contributed as teachers, especially Kendall who was the “effective beginning of the probability tradition in this country”—his pupils and his pupils’ pupils are “everywhere” reported Bingham (1996: 185). Bartlett and Kendall had full careers—extending beyond retirement in 1975 and ‘85— but I concentrate on the years of setting-up, 1940-55. -
State Space and Unobserved Component Models
STATE SPACE AND UNOBSERVED COMPONENT MODELS Theory and Applications This volume offers a broad overview of the state-of-the-art developments in the theory and applications of state space modelling. With fourteen chap- ters from twenty three contributors, it offers a unique synthesis of state space methods and unobserved component models that are important in a wide range of subjects, including economics, finance, environmental science, medicine and engineering. The bookis divided into four sections: introduc- tory papers, testing, Bayesian inference and the bootstrap, and applications. It will give those unfamiliar with state space models a flavour of the work being carried out as well as providing experts with valuable state-of-the- art summaries of different topics. Offering a useful reference for all, this accessible volume makes a significant contribution to the advancement of time series analysis. Andrew Harvey is Professor of Econometrics and Fellow of Corpus Christi College, University of Cambridge. He is the author of The Economet- ric Analysis of Time Series (1981), Time Series Models (1981) and Fore- casting, Structural Time Series Models and the Kalman Filter (1989). Siem Jan Koopman is Professor of Econometrics at the Free University Amsterdam and Research Fellow of Tinbergen Institute, Amsterdam. He has published in international journals and is coauthor of Time Series Anaysis by State Space Models (with J. Durbin, 2001). Neil Shephard is Professor of Economics and Official Fellow, Nuffield College, Oxford University. STATE SPACE AND