Linear Models with R
Julian J. Faraway
Texts in Statistical Science
Downloaded by [University of Toronto] at 16:20 23 May 2014
CHAPMAN & HALL/CRC
Texts in Statistical Science Series

Series Editors
Chris Chatfield, University of Bath, UK
Martin Tanner, Northwestern University, USA
Jim Zidek, University of British Columbia, Canada

Analysis of Failure and Survival Data. Peter J. Smith
The Analysis and Interpretation of Multivariate Data for Social Scientists. David J. Bartholomew, Fiona Steele, Irini Moustaki, and Jane Galbraith
The Analysis of Time Series: An Introduction, Sixth Edition. Chris Chatfield
Applied Bayesian Forecasting and Time Series Analysis. A. Pole, M. West and J. Harrison
Applied Nonparametric Statistical Methods, Third Edition. P. Sprent and N.C. Smeeton
Applied Statistics: Handbook of GENSTAT Analysis. E.J. Snell and H. Simpson
Applied Statistics: Principles and Examples. D.R. Cox and E.J. Snell
Bayes and Empirical Bayes Methods for Data Analysis, Second Edition. Bradley P. Carlin and Thomas A. Louis
Bayesian Data Analysis, Second Edition. Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin
Beyond ANOVA: Basics of Applied Statistics. R.G. Miller, Jr.
Computer-Aided Multivariate Analysis, Third Edition. A.A. Afifi and V.A. Clark
A Course in Categorical Data Analysis. T. Leonard
A Course in Large Sample Theory. T.S. Ferguson
Data Driven Statistical Methods. P. Sprent
Decision Analysis: A Bayesian Approach. J.Q. Smith
Elementary Applications of Probability Theory, Second Edition. H.C. Tuckwell
Elements of Simulation. B.J.T. Morgan
Epidemiology: Study Design and Data Analysis. M. Woodward
Essential Statistics, Fourth Edition. D.A.G. Rees
A First Course in Linear Model Theory. Nalini Ravishanker and Dipak K. Dey
Interpreting Data: A First Course in Statistics. A.J.B. Anderson
An Introduction to Generalized Linear Models, Second Edition. A.J. Dobson
Introduction to Multivariate Analysis. C. Chatfield and A.J. Collins
Introduction to Optimization Methods and their Applications in Statistics. B.S. Everitt
Large Sample Methods in Statistics. P.K. Sen and J. da Motta Singer
Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference. D. Gamerman
Mathematical Statistics. K. Knight
Modeling and Analysis of Stochastic Systems. V. Kulkarni
Modelling Binary Data, Second Edition. D. Collett
Modelling Survival Data in Medical Research, Second Edition. D. Collett
Multivariate Analysis of Variance and Repeated Measures: A Practical Approach for Behavioural Scientists. D.J. Hand and C.C. Taylor
Multivariate Statistics: A Practical Approach. B. Flury and H. Riedwyl
Practical Data Analysis for Designed Experiments. B.S. Yandell
Practical Longitudinal Data Analysis. D.J. Hand and M. Crowder
Practical Statistics for Medical Research. D.G. Altman
Probability: Methods and Measurement. A. O’Hagan
Problem Solving: A Statistician’s Guide, Second Edition. C. Chatfield
Randomization, Bootstrap and Monte Carlo Methods in Biology, Second Edition. B.F.J. Manly
Readings in Decision Analysis. S. French
Sampling Methodologies with Applications. Poduri S.R.S. Rao
Statistical Analysis of Reliability Data. M.J. Crowder, A.C. Kimber, T.J. Sweeting, and R.L. Smith
Statistical Methods for SPC and TQM. D. Bissell
Statistical Methods in Agriculture and Experimental Biology, Second Edition. R. Mead, R.N. Curnow, and A.M. Hasted
Statistical Process Control: Theory and Practice, Third Edition. G.B. Wetherill and D.W. Brown
Statistical Theory, Fourth Edition. B.W. Lindgren
Statistics for Accountants. S. Letchford
Statistics for Epidemiology. Nicholas P. Jewell
Statistics for Technology: A Course in Applied Statistics, Third Edition. C. Chatfield
Statistics in Engineering: A Practical Approach. A.V. Metcalfe
Statistics in Research and Development, Second Edition. R. Caulcutt
Survival Analysis Using S: Analysis of Time-to-Event Data. Mara Tableman and Jong Sung Kim
The Theory of Linear Models. B. Jørgensen

Texts in Statistical Science
Linear Models with R
Julian J. Faraway
CHAPMAN & HALL/CRC
A CRC Press Company
Boca Raton London New York Washington, D.C.

This edition published in the Taylor & Francis e-Library, 2009.
To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.

Library of Congress Cataloging-in-Publication Data
Faraway, Julian James.
Linear models with R / Julian J. Faraway.
p. cm. (Chapman & Hall/CRC texts in statistical science series; v. 63)
Includes bibliographical references and index.
ISBN 1-58488-425-8 (alk. paper)
1. Analysis of variance. 2. Regression analysis. I. Title. II. Texts in statistical science; v. 63.
QA279.F37 2004
519.5'38–dc22 2004051916

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed.
Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying.

Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.

Visit the CRC Press Web site at www.crcpress.com
© 2005 by Chapman & Hall/CRC
No claim to original U.S. Government works
ISBN 0-203-50727-4 Master e-book ISBN
ISBN 0-203-59454-1 (Adobe ebook Reader Format)
International Standard Book Number 1-58488-425-8
Library of Congress Card Number 2004051916

Contents

Preface
1 Introduction
  1.1 Before You Start
  1.2 Initial Data Analysis
  1.3 When to Use Regression Analysis
  1.4 History
2 Estimation
  2.1 Linear Model
  2.2 Matrix Representation
  2.3 Estimating β
  2.4 Least Squares Estimation
  2.5 Examples of Calculating β̂
  2.6 Gauss-Markov Theorem
  2.7 Goodness of Fit
  2.8 Example
  2.9 Identifiability
3 Inference
  3.1 Hypothesis Tests to Compare Models
  3.2 Testing Examples
  3.3 Permutation Tests
  3.4 Confidence Intervals for β
  3.5 Confidence Intervals for Predictions
  3.6 Designed Experiments
  3.7 Observational Data
  3.8 Practical Difficulties
4 Diagnostics
  4.1 Checking Error Assumptions
  4.2 Finding Unusual Observations
  4.3 Checking the Structure of the Model
5 Problems with the Predictors
  5.1 Errors in the Predictors
  5.2 Changes of Scale
  5.3 Collinearity
6 Problems with the Error
  6.1 Generalized Least Squares
  6.2 Weighted Least Squares
  6.3 Testing for Lack of Fit
  6.4 Robust Regression
7 Transformation
  7.1 Transforming the Response
  7.2 Transforming the Predictors
8 Variable Selection
  8.1 Hierarchical Models
  8.2 Testing-Based Procedures
  8.3 Criterion-Based Procedures
  8.4 Summary
9 Shrinkage Methods
  9.1 Principal Components
  9.2 Partial Least Squares
  9.3 Ridge Regression
10 Statistical Strategy and Model Uncertainty
  10.1 Strategy
  10.2 An Experiment in Model Building
  10.3 Discussion
11 Insurance Redlining: A Complete Example
  11.1 Ecological Correlation
  11.2 Initial Data Analysis
  11.3 Initial Model and Diagnostics
  11.4 Transformation and Variable Selection
  11.5 Discussion
12 Missing Data
13 Analysis of Covariance
  13.1 A Two-Level Example
  13.2 Coding Qualitative Predictors
  13.3 A Multilevel Factor Example
14 One-Way Analysis of Variance
  14.1 The Model
  14.2 An Example
  14.3 Diagnostics
  14.4 Pairwise Comparisons
15 Factorial Designs
  15.1 Two-Way ANOVA
  15.2 Two-Way ANOVA with One Observation per Cell
  15.3 Two-Way ANOVA with More than One Observation per Cell
  15.4 Larger Factorial Experiments
16 Block Designs
  16.1 Randomized Block Design
  16.2 Latin Squares
  16.3 Balanced Incomplete Block Design
A R Installation, Functions and Data
B Quick Introduction to R
  B.1 Reading the Data In
  B.2 Numerical Summaries
  B.3 Graphical Summaries
  B.4 Selecting Subsets of the Data
  B.5 Learning More about R
Bibliography
Index

Preface

There are many books on regression and analysis of variance. These books expect different levels of preparedness and place different emphases on the material. This book is not introductory. It presumes some knowledge of basic statistical theory and practice.