Statistical Modelling

Statistical Modelling Dave Woods and Antony Overstall (Chapters 1–2 closely based on original notes by Anthony Davison and Jon Forster) c 2016 StatisticalModelling ............................... ......................... 0 1. Model Selection 1 Overview............................................ .................. 2 Basic Ideas 3 Whymodel?.......................................... .................. 4 Criteriaformodelselection............................ ...................... 5 Motivation......................................... .................... 6 Setting ............................................ ................... 9 Logisticregression................................... .................... 10 Nodalinvolvement................................... .................... 11 Loglikelihood...................................... .................... 14 Wrongmodel.......................................... ................ 15 Out-of-sampleprediction ............................. ..................... 17 Informationcriteria................................... ................... 18 Nodalinvolvement................................... .................... 20 Theoreticalaspects.................................. .................... 21 PropertiesofAIC,NIC,BIC............................... ................. 22 Linear Model 23 Variableselection ................................... .................... 24 Stepwisemethods.................................... ................... 25 Nuclearpowerstationdata............................ ..................... 26 StepwiseMethods: Comments ........................... ................... 28 Predictionerror........................................ ................. 29 Example ............................................ ................. 31 Cross-validation .................................... .................... 32 Othercriteria........................................ .................. 34 Experiment .......................................... ................. 35 Sparse Variable Selection 39 Motivation......................................... ................... 40 1 Desiderata.......................................... .................. 41 Example: Lasso...................................... ................... 42 Softthresholding.................................... .................... 43 Example ............................................ ................. 45 Penalties.......................................... ................... 46 Thresholdfunctions.................................. .................... 47 Propertiesofpenalties................................. ................... 48 Oracle.............................................. ................. 49 Bayesian Inference 50 ThomasBayes(1702–1761) ............................. ................... 51 Bayesianinference................................... .................... 52 Encompassingmodel.................................. ................... 54 Inference........................................... .................. 55 Lindley’sparadox .................................... ................... 56 Modelaveraging...................................... .................. 57 Cementdata ......................................... ................. 58 DIC................................................ ................. 62 2. Beyond the Generalised Linear Model 63 Overview............................................ ................. 64 Generalised Linear Models 65 GLMrecap............................................ ................ 66 GLMfailure.......................................... ................. 67 Overdispersion 68 Example1........................................... ................. 69 Quasi-likelihood ................................... ..................... 73 Reasons............................................ .................. 75 Directmodels........................................ .................. 77 Randomeffects........................................ ................. 79 Dependence 81 Example1revisited................................... ................... 82 Reasons............................................ .................. 83 Randomeffects........................................ ................. 84 Marginalmodels...................................... .................. 85 Clustereddata ....................................... .................. 87 Example2: Ratgrowth .................................. ................. 88 Random Effects and Mixed Models 92 Linearmixedmodels .................................. ................... 93 Discussion......................................... ................... 96 LMMfitting.......................................... ................. 98 REML............................................... ................ 99 Estimatingrandomeffects ............................... ................. 100 BayesianLMMs ....................................... ................ 101 Example2revisited................................... .................. 103 GLMMs.............................................. ............... 107 GLMMfitting......................................... ................ 110 BayesianGLMMS...................................... ................ 114 Example1revisited................................... .................. 115 2 3. Design of Experiments 117 Overview............................................ ................ 118 Introduction and principles of experimentation 119 Modesofdatacollection .............................. ................... 120 Roleofexperimentation............................... ................... 121 Definitions......................................... .................. 122 A unit-treatment statistical model. ...................... 123 Example ............................................ ................ 124 Principlesofexperimentation ........................... ................... 125 Factorial designs 127 Examplerevisited .................................... .................. 128 Treatmentsandaregressionmodel ......................... ................. 129 (Two-level) Factorialdesign .......................... ..................... 130 Regressionmodel ..................................... ................. 131 Maineffectsandinteractions ............................ .................. 132 Orthogonality....................................... .................. 135 Replication........................................ ................... 136 Principles of factorial experimentation . ..................... 137 Regular fractional factorial designs 138 Example ............................................ ................ 139 Choosingsubsetsoftreatments......................... .................... 140 The defining relation and aliasscheme. .................... 143 Thealiasmatrix...................................... ................. 144 The role of fractional factorial designs in a sequential strategy........................ 146 D-optimality and non-regular designs 147 Introduction........................................ .................. 148 D-optimality....................................... ................... 149 Comments........................................... ................ 150 Example: Plackett-Burman design.. .. .. .. .. .. .. .. .. .. ................... 151 Example: supersaturateddesign ......................... ................... 153 Approximate designs 157 Ageneralised definitionofadesign ...................... .................... 158 Informationmatrix .................................... ................. 159 Example: logisticregression ........................... .................... 160 Generalequivalencetheorem............................ ................... 162 3 Statistical Modelling 1. Model Selection 2. Beyond the Generalised Linear Model 3. Design of Experiments APTS: Statistical Modelling April 2016 – slide 0 4 1. Model Selection slide 1 Overview 1. Basic ideas 2. Linear model 3. Sparse variable selection 4. Bayesian inference APTS: Statistical Modelling April 2016 – slide 2 Basic Ideas slide 3 Why model? George E. P. Box (1919–2013): All models are wrong, but some models are useful. Some reasons we construct models: – to simplify reality (efficient representation); – to gain understanding; – to compare scientific, economic, . theories; – to predict future events/data; – to control a process. We (statisticians!) rarely believe in our models, but regard them as temporary constructs subject to improvement. Often we have several and must decide which is preferable, if any. APTS: Statistical Modelling April 2016 – slide 4 Criteria for model selection Substantive knowledge, from prior studies, theoretical arguments, dimensional or other general considerations (often qualitative) Sensitivity to failure of assumptions (prefer models that are robustly valid) Quality of fit—residuals, graphical assessment (informal), or goodness-of-fit tests (formal) Prior knowledge in Bayesian sense (quantitative) Generalisability of conclusions and/or predictions: same/similar models give good fit for many different datasets . but often we have just one dataset . APTS: Statistical Modelling April 2016 – slide 5 5 Motivation Even after applying these criteria (but also before!) we may compare many models: linear regression with p covariates, there are 2p possible combinations of covariates (each in/out), before allowing for transformations, etc.— if p = 20 then we have a problem; choice of bandwidth h> 0 in smoothing problems the number of different

Statistical Modelling

Introduction to Bayesian Estimation

Hybrid of Naive Bayes and Gaussian Naive Bayes for Classification: a Map Reduce Approach

Adjusted Probability Naive Bayesian Induction

The Bayesian New Statistics: Hypothesis Testing, Estimation, Meta-Analysis, and Power Analysis from a Bayesian Perspective

Applied Bayesian Inference

Iterative Bayes Jo˜Ao Gama∗ LIACC, FEP-University of Porto, Rua Campo Alegre 823, 4150 Porto, Portugal

Bayesian Analysis 1 Introduction

A Brief Overview of Probability Theory in Data Science by Geert

Arxiv:2002.00269V2 [Cs.LG] 8 Mar 2021

Naive Bayes Classifier Assumes That the Presence (Or Absence) of a Particular Feature of a Class Is Unrelated to the Presence (Or Absence) of Any Other Feature

Bayesian Statistics

Bayesian Probability Theory