Chapter 10. Experimental Design: Statistical Analysis of Data Purpose
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Introduction to Biostatistics
Introduction to Biostatistics Jie Yang, Ph.D. Associate Professor Department of Family, Population and Preventive Medicine Director Biostatistical Consulting Core Director Biostatistics and Bioinformatics Shared Resource, Stony Brook Cancer Center In collaboration with Clinical Translational Science Center (CTSC) and the Biostatistics and Bioinformatics Shared Resource (BB-SR), Stony Brook Cancer Center (SBCC). OUTLINE What is Biostatistics What does a biostatistician do • Experiment design, clinical trial design • Descriptive and Inferential analysis • Result interpretation What you should bring while consulting with a biostatistician WHAT IS BIOSTATISTICS • The science of (bio)statistics encompasses the design of biological/clinical experiments the collection, summarization, and analysis of data from those experiments the interpretation of, and inference from, the results How to Lie with Statistics (1954) by Darrell Huff. http://www.youtube.com/watch?v=PbODigCZqL8 GOAL OF STATISTICS Sampling POPULATION Probability SAMPLE Theory Descriptive Descriptive Statistics Statistics Inference Population Sample Parameters: Inferential Statistics Statistics: 흁, 흈, 흅… 푿ഥ , 풔, 풑ෝ,… PROPERTIES OF A “GOOD” SAMPLE • Adequate sample size (statistical power) • Random selection (representative) Sampling Techniques: 1.Simple random sampling 2.Stratified sampling 3.Systematic sampling 4.Cluster sampling 5.Convenience sampling STUDY DESIGN EXPERIEMENT DESIGN Completely Randomized Design (CRD) - Randomly assign the experiment units to the treatments -
The Savvy Survey #16: Data Analysis and Survey Results1 Milton G
AEC409 The Savvy Survey #16: Data Analysis and Survey Results1 Milton G. Newberry, III, Jessica L. O’Leary, and Glenn D. Israel2 Introduction In order to evaluate the effectiveness of a program, an agent or specialist may opt to survey the knowledge or Continuing the Savvy Survey Series, this fact sheet is one of attitudes of the clients attending the program. To capture three focused on working with survey data. In its most raw what participants know or feel at the end of a program, he form, the data collected from surveys do not tell much of a or she could implement a posttest questionnaire. However, story except who completed the survey, partially completed if one is curious about how much information their clients it, or did not respond at all. To truly interpret the data, it learned or how their attitudes changed after participation must be analyzed. Where does one begin the data analysis in a program, then either a pretest/posttest questionnaire process? What computer program(s) can be used to analyze or a pretest/follow-up questionnaire would need to be data? How should the data be analyzed? This publication implemented. It is important to consider realistic expecta- serves to answer these questions. tions when evaluating a program. Participants should not be expected to make a drastic change in behavior within the Extension faculty members often use surveys to explore duration of a program. Therefore, an agent should consider relevant situations when conducting needs and asset implementing a posttest/follow up questionnaire in several assessments, when planning for programs, or when waves in a specific timeframe after the program (e.g., six assessing customer satisfaction. -
Effect Size (ES)
Lee A. Becker <http://web.uccs.edu/lbecker/Psy590/es.htm> Effect Size (ES) © 2000 Lee A. Becker I. Overview II. Effect Size Measures for Two Independent Groups 1. Standardized difference between two groups. 2. Correlation measures of effect size. 3. Computational examples III. Effect Size Measures for Two Dependent Groups. IV. Meta Analysis V. Effect Size Measures in Analysis of Variance VI. References Effect Size Calculators Answers to the Effect Size Computation Questions I. Overview Effect size (ES) is a name given to a family of indices that measure the magnitude of a treatment effect. Unlike significance tests, these indices are independent of sample size. ES measures are the common currency of meta-analysis studies that summarize the findings from a specific area of research. See, for example, the influential meta- analysis of psychological, educational, and behavioral treatments by Lipsey and Wilson (1993). There is a wide array of formulas used to measure ES. For the occasional reader of meta-analysis studies, like myself, this diversity can be confusing. One of my objectives in putting together this set of lecture notes was to organize and summarize the various measures of ES. In general, ES can be measured in two ways: a) as the standardized difference between two means, or b) as the correlation between the independent variable classification and the individual scores on the dependent variable. This correlation is called the "effect size correlation" (Rosnow & Rosenthal, 1996). These notes begin with the presentation of the basic ES measures for studies with two independent groups. The issues involved when assessing ES for two dependent groups are then described. -
Experimentation Science: a Process Approach for the Complete Design of an Experiment
Kansas State University Libraries New Prairie Press Conference on Applied Statistics in Agriculture 1996 - 8th Annual Conference Proceedings EXPERIMENTATION SCIENCE: A PROCESS APPROACH FOR THE COMPLETE DESIGN OF AN EXPERIMENT D. D. Kratzer K. A. Ash Follow this and additional works at: https://newprairiepress.org/agstatconference Part of the Agriculture Commons, and the Applied Statistics Commons This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License. Recommended Citation Kratzer, D. D. and Ash, K. A. (1996). "EXPERIMENTATION SCIENCE: A PROCESS APPROACH FOR THE COMPLETE DESIGN OF AN EXPERIMENT," Conference on Applied Statistics in Agriculture. https://doi.org/ 10.4148/2475-7772.1322 This is brought to you for free and open access by the Conferences at New Prairie Press. It has been accepted for inclusion in Conference on Applied Statistics in Agriculture by an authorized administrator of New Prairie Press. For more information, please contact [email protected]. Conference on Applied Statistics in Agriculture Kansas State University Applied Statistics in Agriculture 109 EXPERIMENTATION SCIENCE: A PROCESS APPROACH FOR THE COMPLETE DESIGN OF AN EXPERIMENT. D. D. Kratzer Ph.D., Pharmacia and Upjohn Inc., Kalamazoo MI, and K. A. Ash D.V.M., Ph.D., Town and Country Animal Hospital, Charlotte MI ABSTRACT Experimentation Science is introduced as a process through which the necessary steps of experimental design are all sufficiently addressed. Experimentation Science is defined as a nearly linear process of objective formulation, selection of experimentation unit and decision variable(s), deciding treatment, design and error structure, defining the randomization, statistical analyses and decision procedures, outlining quality control procedures for data collection, and finally analysis, presentation and interpretation of results. -
Evolution of the Infographic
EVOLUTION OF THE INFOGRAPHIC: Then, now, and future-now. EVOLUTION People have been using images and data to tell stories for ages—long before the days of the Internet, smartphones, and Excel. In fact, the history of infographics pre-dates the web by more than 30,000 years with the earliest forms of these visuals being cave paintings that helped early humans find food, resources, and shelter. But as technology has advanced, so has our ability to tell meaningful stories. Here’s a look into the evolution of modern infographics—where they’ve been, how they’ve evolved, and where they’re headed. Then: Printed, static infographics The 20th Century introduced the infographic—a staple for how we communicate, visualize, and share information today. Early on, these print graphics married illustration and data to communicate information in a revolutionary way. ADVANTAGE Design elements enable people to quickly absorb information previously confined to long paragraphs of text. LIMITATION Static infographics didn’t allow for deeper dives into the data to explore granularities. Hoping to drill down for more detail or context? Tough luck—what you see is what you get. Source: http://www.wired.co.uk/news/archive/2012-01/16/painting- by-numbers-at-london-transport-museum INFOGRAPHICS THROUGH THE AGES DOMO 03 Now: Web-based, interactive infographics While the first wave of modern infographics made complex data more consumable, web-based, interactive infographics made data more explorable. These are everywhere today. ADVANTAGE Everyone looking to make data an asset, from executives to graphic designers, are now building interactive data stories that deliver additional context and value. -
1.2. INDUCTIVE REASONING Much Mathematical Discovery Starts with Inductive Reasoning – the Process of Reaching General Conclus
1.2. INDUCTIVE REASONING Much mathematical discovery starts with inductive reasoning – the process of reaching general conclusions, called conjectures, through the examination of particular cases and the recognition of patterns. These conjectures are then more formally proved using deductive methods, which will be discussed in the next section. Below we look at three examples that use inductive reasoning. Number Patterns Suppose you had to predict the sixth number in the following sequence: 1, -3, 6, -10, 15, ? How would you proceed with such a question? The trick, it seems, is to discern a specific pattern in the given sequence of numbers. This kind of approach is a classic example of inductive reasoning. By identifying a rule that generates all five numbers in this sequence, your hope is to establish a pattern and be in a position to predict the sixth number with confidence. So can you do it? Try to work out this number before continuing. One fact is immediately clear: the numbers alternate sign from positive to negative. We thus expect the answer to be a negative number since it follows 15, a positive number. On closer inspection, we also realize that the difference between the magnitude (or absolute value) of successive numbers increases by 1 each time: 3 – 1 = 2 6 – 3 = 3 10 – 6 = 4 15 – 10 = 5 … We have then found the rule we sought for generating the next number in this sequence. The sixth number should be -21 since the difference between 15 and 21 is 6 and we had already determined that the answer should be negative. -
There Is No Pure Empirical Reasoning
There Is No Pure Empirical Reasoning 1. Empiricism and the Question of Empirical Reasons Empiricism may be defined as the view there is no a priori justification for any synthetic claim. Critics object that empiricism cannot account for all the kinds of knowledge we seem to possess, such as moral knowledge, metaphysical knowledge, mathematical knowledge, and modal knowledge.1 In some cases, empiricists try to account for these types of knowledge; in other cases, they shrug off the objections, happily concluding, for example, that there is no moral knowledge, or that there is no metaphysical knowledge.2 But empiricism cannot shrug off just any type of knowledge; to be minimally plausible, empiricism must, for example, at least be able to account for paradigm instances of empirical knowledge, including especially scientific knowledge. Empirical knowledge can be divided into three categories: (a) knowledge by direct observation; (b) knowledge that is deductively inferred from observations; and (c) knowledge that is non-deductively inferred from observations, including knowledge arrived at by induction and inference to the best explanation. Category (c) includes all scientific knowledge. This category is of particular import to empiricists, many of whom take scientific knowledge as a sort of paradigm for knowledge in general; indeed, this forms a central source of motivation for empiricism.3 Thus, if there is any kind of knowledge that empiricists need to be able to account for, it is knowledge of type (c). I use the term “empirical reasoning” to refer to the reasoning involved in acquiring this type of knowledge – that is, to any instance of reasoning in which (i) the premises are justified directly by observation, (ii) the reasoning is non- deductive, and (iii) the reasoning provides adequate justification for the conclusion. -
Programming in Stata
Programming and Post-Estimation • Bootstrapping • Monte Carlo • Post-Estimation Simulation (Clarify) • Extending Clarify to Other Models – Censored Probit Example What is Bootstrapping? • A computer-simulated nonparametric technique for making inferences about a population parameter based on sample statistics. • If the sample is a good approximation of the population, the sampling distribution of interest can be estimated by generating a large number of new samples from the original. • Useful when no analytic formula for the sampling distribution is available. B 2 ˆ B B b How do I do it? SE B = s(x )- s(x )/ B /(B -1) å[ åb =1 ] b=1 1. Obtain a Sample from the population of interest. Call this x = (x1, x2, … , xn). 2. Re-sample based on x by randomly sampling with replacement from it. 3. Generate many such samples, x1, x2, …, xB – each of length n. 4. Estimate the desired parameter in each sample, s(x1), s(x2), …, s(xB). 5. For instance the bootstrap estimate of the standard error is the standard deviation of the bootstrap replications. Example: Standard Error of a Sample Mean Canned in Stata use "I:\general\PRISM Programming\auto.dta", clear Type: sum mpg Example: Standard Error of a Sample Mean Canned in Stata Type: bootstrap "summarize mpg" r(mean), reps(1000) » 21.2973±1.96*.6790806 x B - x Example: Difference of Medians Test Type: sum length, detail Type: return list Example: Difference of Medians Test Example: Difference of Medians Test Example: Difference of Medians Test Type: bootstrap "mymedian" r(diff), reps(1000) The medians are not very different. -
Discussion Notes for Aristotle's Politics
Sean Hannan Classics of Social & Political Thought I Autumn 2014 Discussion Notes for Aristotle’s Politics BOOK I 1. Introducing Aristotle a. Aristotle was born around 384 BCE (in Stagira, far north of Athens but still a ‘Greek’ city) and died around 322 BCE, so he lived into his early sixties. b. That means he was born about fifteen years after the trial and execution of Socrates. He would have been approximately 45 years younger than Plato, under whom he was eventually sent to study at the Academy in Athens. c. Aristotle stayed at the Academy for twenty years, eventually becoming a teacher there himself. When Plato died in 347 BCE, though, the leadership of the school passed on not to Aristotle, but to Plato’s nephew Speusippus. (As in the Republic, the stubborn reality of Plato’s family connections loomed large.) d. After living in Asia Minor from 347-343 BCE, Aristotle was invited by King Philip of Macedon to serve as the tutor for Philip’s son Alexander (yes, the Great). Aristotle taught Alexander for eight years, then returned to Athens in 335 BCE. There he founded his own school, the Lyceum. i. Aside: We should remember that these schools had substantial afterlives, not simply as ideas in texts, but as living sites of intellectual energy and exchange. The Academy lasted from 387 BCE until 83 BCE, then was re-founded as a ‘Neo-Platonic’ school in 410 CE. It was finally closed by Justinian in 529 CE. (Platonic philosophy was still being taught at Athens from 83 BCE through 410 CE, though it was not disseminated through a formalized Academy.) The Lyceum lasted from 334 BCE until 86 BCE, when it was abandoned as the Romans sacked Athens. -
Simple Mean Weighted Mean Or Harmonic Mean
MultiplyMultiply oror Divide?Divide? AA BestBest PracticePractice forfor FactorFactor AnalysisAnalysis 77 ––10 10 JuneJune 20112011 Dr.Dr. ShuShu-Ping-Ping HuHu AlfredAlfred SmithSmith CCEACCEA Los Angeles Washington, D.C. Boston Chantilly Huntsville Dayton Santa Barbara Albuquerque Colorado Springs Ft. Meade Ft. Monmouth Goddard Space Flight Center Ogden Patuxent River Silver Spring Washington Navy Yard Cleveland Dahlgren Denver Johnson Space Center Montgomery New Orleans Oklahoma City Tampa Tacoma Vandenberg AFB Warner Robins ALC Presented at the 2011 ISPA/SCEA Joint Annual Conference and Training Workshop - www.iceaaonline.com PRT-70, 01 Apr 2011 ObjectivesObjectives It is common to estimate hours as a simple factor of a technical parameter such as weight, aperture, power or source lines of code (SLOC), i.e., hours = a*TechParameter z “Software development hours = a * SLOC” is used as an example z Concept is applicable to any factor cost estimating relationship (CER) Our objective is to address how to best estimate “a” z Multiply SLOC by Hour/SLOC or Divide SLOC by SLOC/Hour? z Simple, weighted, or harmonic mean? z Role of regression analysis z Base uncertainty on the prediction interval rather than just the range Our goal is to provide analysts a better understanding of choices available and how to select the right approach Presented at the 2011 ISPA/SCEA Joint Annual Conference and Training Workshop - www.iceaaonline.com PR-70, 01 Apr 2011 Approved for Public Release 2 of 25 OutlineOutline Definitions -
The Cascade Bayesian Approach: Prior Transformation for a Controlled Integration of Internal Data, External Data and Scenarios
risks Article The Cascade Bayesian Approach: Prior Transformation for a Controlled Integration of Internal Data, External Data and Scenarios Bertrand K. Hassani 1,2,3,* and Alexis Renaudin 4 1 University College London Computer Science, 66-72 Gower Street, London WC1E 6EA, UK 2 LabEx ReFi, Université Paris 1 Panthéon-Sorbonne, CESCES, 106 bd de l’Hôpital, 75013 Paris, France 3 Capegemini Consulting, Tour Europlaza, 92400 Paris-La Défense, France 4 Aon Benfield, The Aon Centre, 122 Leadenhall Street, London EC3V 4AN, UK; [email protected] * Correspondence: [email protected]; Tel.: +44-(0)753-016-7299 Received: 14 March 2018; Accepted: 23 April 2018; Published: 27 April 2018 Abstract: According to the last proposals of the Basel Committee on Banking Supervision, banks or insurance companies under the advanced measurement approach (AMA) must use four different sources of information to assess their operational risk capital requirement. The fourth includes ’business environment and internal control factors’, i.e., qualitative criteria, whereas the three main quantitative sources available to banks for building the loss distribution are internal loss data, external loss data and scenario analysis. This paper proposes an innovative methodology to bring together these three different sources in the loss distribution approach (LDA) framework through a Bayesian strategy. The integration of the different elements is performed in two different steps to ensure an internal data-driven model is obtained. In the first step, scenarios are used to inform the prior distributions and external data inform the likelihood component of the posterior function. In the second step, the initial posterior function is used as the prior distribution and the internal loss data inform the likelihood component of the second posterior function. -
Uses for Census Data in Market Research
Uses for Census Data in Market Research MRS Presentation 4 July 2011 Andrew Zelin, Director, Sampling & RMC, Ipsos MORI Why Census Data is so critical to us Use of Census Data underlies almost all of our quantitative work activities and projects Needed to ensure that: – Samples are balanced and representative – ie Sampling; – Results derived from samples reflect that of target population – ie Weighting; – Understanding how demographic and geographic factors affect the findings from surveys – ie Analysis. How would we do our jobs without the use of Census Data? Our work withoutCensusData Our work Regional Assembly What I will cover Introduction, overview and why Census Data is so critical; Census data for sampling; Census data for survey weighting; Census data for statistical analysis; Use of SARs; The future (eg use of “hypercubes”); Our work without Census Data / use of alternatives; Conclusions Census Data for Sampling For Random (pre-selected) Samples: – Need to define stratum sample sizes and sampling fractions; – Need to define cluster size, esp. if PPS – (Need to define stratum / cluster) membership; For any type of Sample: – To balance samples across demographics that cannot be included as quota controls or stratification variables – To determine number of sample points by geography – To create accurate booster samples (eg young people / ethnic groups) Census Data for Sampling For Quota (non-probability) Samples: – To set appropriate quotas (based on demographics); Data are used here at very localised level – ie Census Outputs Census Data for Sampling Census Data for Survey Weighting To ensure that the sample profile balances the population profile Census Data are needed to tell you what the population profile is ….and hence provide accurate weights ….and hence provide an accurate measure of the design effect and the true level of statistical reliability / confidence intervals on your survey measures But also pre-weighting, for interviewer field dept.