
CHAPMAN & HALL/CRC Texts in Statistical Science Series

Series Editors
C. Chatfield, University of Bath, UK
J. Zidek, University of British Columbia, Canada

AN INTRODUCTION TO GENERALIZED LINEAR MODELS
SECOND EDITION

Annette J. Dobson

CHAPMAN & HALL/CRC

A CRC Press Company Boca Raton London New York Washington, D.C.

Library of Congress Cataloging-in-Publication Data

Dobson, Annette J., 1945-
An introduction to generalized linear models / Annette J. Dobson.—2nd ed.
p. cm.—(Chapman & Hall/CRC texts in statistical science series)
Includes bibliographical references and index.
ISBN 1-58488-165-8 (alk. paper)
1. Linear models (Statistics) I. Title. II. Texts in statistical science.

QA276 .D589 2001 519.5′35—dc21 2001047417

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying.

Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe. Visit the CRC Press Web site at www.crcpress.com

© 2002 by Chapman & Hall/CRC

No claim to original U.S. Government works
International Standard Book Number 1-58488-165-8
Library of Congress Card Number 2001047417
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper

Contents

Preface

1 Introduction
1.1 Background
1.2 Scope
1.3 Notation
1.4 Distributions related to the Normal distribution
1.5 Quadratic forms
1.6 Estimation
1.7 Exercises

2 Model Fitting
2.1 Introduction
2.2 Examples
2.3 Some principles of statistical modelling
2.4 Notation and coding for explanatory variables
2.5 Exercises

3 Exponential Family and Generalized Linear Models
3.1 Introduction
3.2 Exponential family of distributions
3.3 Properties of distributions in the exponential family
3.4 Generalized linear models
3.5 Examples
3.6 Exercises

4 Estimation
4.1 Introduction
4.2 Example: Failure times for pressure vessels
4.3 Maximum likelihood estimation
4.4 Poisson regression example
4.5 Exercises

5 Inference
5.1 Introduction
5.2 Sampling distribution for score statistics

5.3 Taylor series approximations
5.4 Sampling distribution for maximum likelihood estimators
5.5 Log-likelihood ratio statistic
5.6 Sampling distribution for the deviance
5.7 Hypothesis testing
5.8 Exercises

6 Normal Linear Models
6.1 Introduction
6.2 Basic results
6.3 Multiple linear regression
6.4 Analysis of variance
6.5 Analysis of covariance
6.6 General linear models
6.7 Exercises

7 Binary Variables and Logistic Regression
7.1 Probability distributions
7.2 Generalized linear models
7.3 Dose response models
7.4 General logistic regression model
7.5 Goodness of fit statistics
7.6 Residuals
7.7 Other diagnostics
7.8 Example: Senility and WAIS
7.9 Exercises

8 Nominal and Ordinal Logistic Regression
8.1 Introduction
8.2 Multinomial distribution
8.3 Nominal logistic regression
8.4 Ordinal logistic regression
8.5 General comments
8.6 Exercises

9 Count Data, Poisson Regression and Log-Linear Models
9.1 Introduction
9.2 Poisson regression
9.3 Examples of contingency tables
9.4 Probability models for contingency tables
9.5 Log-linear models
9.6 Inference for log-linear models
9.7 Numerical examples
9.8 Remarks
9.9 Exercises

10 Survival Analysis
10.1 Introduction
10.2 Survivor functions and hazard functions
10.3 Empirical survivor function
10.4 Estimation
10.5 Inference
10.6 Model checking
10.7 Example: remission times
10.8 Exercises

11 Clustered and Longitudinal Data
11.1 Introduction
11.2 Example: Recovery from stroke
11.3 Repeated measures models for Normal data
11.4 Repeated measures models for non-Normal data
11.5 Multilevel models
11.6 Stroke example continued
11.7 Comments
11.8 Exercises

Software

References

Preface

Statistical tools for analyzing data are developing rapidly so that the 1990 edition of this book is now out of date.

The original purpose of the book was to present a unified theoretical and conceptual framework for statistical modelling in a way that was accessible to undergraduate students and researchers in other fields. This new edition has been expanded to include nominal (or multinomial) and ordinal logistic regression, and the analysis of longitudinal and clustered data. Although these topics do not fall strictly within the definition of generalized linear models, the underlying principles and methods are very similar and their inclusion is consistent with the original purpose of the book.

The new edition relies on numerical methods more than the previous edition did. Some of the calculations can be performed with a spreadsheet while others require statistical software. There is an emphasis on graphical methods for exploratory data analysis, visualizing numerical optimization (for example, of the likelihood function) and plotting residuals to check the adequacy of models.

The data sets and outline solutions of the exercises are available on the publisher's website: www.crcpress.com/us/ElectronicProducts/downandup.asp?mscssid=

I am grateful to colleagues and students at the Universities of Queensland and Newcastle, Australia, for their helpful suggestions and comments about the material.

Annette Dobson

1 Introduction

1.1 Background

This book is designed to introduce the reader to generalized linear models; these provide a unifying framework for many commonly used statistical techniques. They also illustrate the ideas of statistical modelling.

The reader is assumed to have some familiarity with statistical principles and methods. In particular, understanding the concepts of estimation, sampling distributions and hypothesis testing is necessary. Experience in the use of t-tests, analysis of variance, simple linear regression and chi-squared tests of independence for two-dimensional contingency tables is assumed. In addition, some knowledge of matrix algebra and calculus is required.

The reader will find it necessary to have access to statistical computing facilities. Many statistical programs, languages or packages can now perform the analyses discussed in this book. Often, however, they do so with a different program or procedure for each type of analysis so that the unifying structure is not apparent. Some programs or languages which have procedures consistent with the approach used in this book are: Stata, S-PLUS, Glim, Genstat and SYSTAT. This list is not comprehensive as appropriate modules are continually being added to other programs.

In addition, anyone working through this book may find it helpful to be able to use mathematical software that can perform algebra, differentiation and iterative calculations.

1.2 Scope

The statistical methods considered in this book all involve the analysis of relationships between measurements made on groups of subjects or objects. For example, the measurements might be the heights or weights and the ages of boys and girls, or the yield of plants under various growing conditions. We use the terms response, outcome or dependent variable for measurements that are free to vary in response to other variables called explanatory variables or predictor variables or independent variables – although this last term can sometimes be misleading. Responses are regarded as random variables. Explanatory variables are usually treated as though they are non-random measurements or observations; for example, they may be fixed by the experimental design.

Responses and explanatory variables are measured on one of the following scales.

1. Nominal classifications: e.g., red, green, blue; yes, no, do not know, not applicable. In particular, for binary, dichotomous or binomial variables

there are only two categories: male, female; dead, alive; smooth leaves, serrated leaves. If there are more than two categories the variable is called polychotomous, polytomous or multinomial.

2. Ordinal classifications in which there is some natural order or ranking between the categories: e.g., young, middle aged, old; diastolic blood pressures grouped as ≤ 70, 71-90, 91-110, 111-130, ≥ 131 mm Hg.

3. Continuous measurements where observations may, at least in theory, fall anywhere on a continuum: e.g., weight, length or time. This scale includes both interval scale and ratio scale measurements – the latter have a well-defined zero. A particular example of a continuous measurement is the time until a specific event occurs, such as the failure of an electronic component; the length of time from a known starting point is called the failure time.

Nominal and ordinal data are sometimes called categorical or discrete variables and the numbers of observations, counts or frequencies in each category are usually recorded. For continuous data the individual measurements are recorded. The term quantitative is often used for a variable measured on a continuous scale and the term qualitative for nominal and sometimes for ordinal measurements. A qualitative, explanatory variable is called a factor and its categories are called the levels for the factor. A quantitative explanatory variable is sometimes called a covariate.

Methods of statistical analysis depend on the measurement scales of the response and explanatory variables. This book is mainly concerned with those statistical methods which are relevant when there is just one response variable, although there will usually be several explanatory variables. The responses measured on different subjects are usually assumed to be statistically independent random variables, although this requirement is dropped in the final chapter which is about correlated data. Table 1.1 shows the main methods of statistical analysis for various combinations of response and explanatory variables and the chapters in which these are described.

The present chapter summarizes some of the statistical theory used throughout the book. Chapters 2 to 5 cover the theoretical framework that is common to the subsequent chapters. Later chapters focus on methods for analyzing particular kinds of data.

Chapter 2 develops the main ideas of statistical modelling. The modelling process involves four steps:

1. Specifying models in two parts: equations linking the response and explanatory variables, and the probability distribution of the response variable.
2. Estimating parameters used in the models.
3. Checking how well the models fit the actual data.
4. Making inferences; for example, calculating confidence intervals and testing hypotheses about the parameters.

Table 1.1 Major methods of statistical analysis for response and explanatory variables measured on various scales and chapter references for this book.

Response (chapter)        Explanatory variables       Methods
Continuous (Chapter 6)    Binary                      t-test
                          Nominal, >2 categories      Analysis of variance
                          Ordinal                     Analysis of variance
                          Continuous                  Multiple regression
                          Nominal & some continuous   Analysis of covariance
                          Categorical & continuous    Multiple regression
Binary (Chapter 7)        Categorical                 Contingency tables, logistic regression
                          Continuous                  Logistic, probit & other dose-response models
                          Categorical & continuous    Logistic regression
Nominal with              Nominal                     Contingency tables
>2 categories
(Chapters 8 & 9)          Categorical & continuous    Nominal logistic regression
Ordinal (Chapter 8)       Categorical & continuous    Ordinal logistic regression
Counts (Chapter 9)        Categorical                 Log-linear models
                          Categorical & continuous    Poisson regression
Failure times             Categorical & continuous    Survival analysis (parametric)
(Chapter 10)
Correlated responses      Categorical & continuous    Generalized estimating equations,
(Chapter 11)                                          multilevel models

The next three chapters provide the theoretical background. Chapter 3 is about the exponential family of distributions, which includes the Normal, Poisson and binomial distributions. It also covers generalized linear models (as defined by Nelder and Wedderburn, 1972). Linear regression and many other models are special cases of generalized linear models. In Chapter 4 methods of estimation and model fitting are described.

Chapter 5 outlines methods of statistical inference for generalized linear models. Most of these are based on how well a model describes the set of data. For example, hypothesis testing is carried out by first specifying alternative models (one corresponding to the null hypothesis and the other to a more general hypothesis). Then test statistics are calculated which measure the 'goodness of fit' of each model and these are compared. Typically the model corresponding to the null hypothesis is simpler, so if it fits the data about as well as a more complex model it is usually preferred on the grounds of parsimony (i.e., we retain the null hypothesis).

Chapter 6 is about multiple linear regression and analysis of variance (ANOVA). Regression is the standard method for relating a continuous response variable to several continuous explanatory (or predictor) variables. ANOVA is used for a continuous response variable and categorical or qualitative explanatory variables (factors). Analysis of covariance (ANCOVA) is used when at least one of the explanatory variables is continuous. Nowadays it is common to use the same computational tools for all such situations. The terms multiple regression or general linear model are used to cover the range of methods for analyzing one continuous response variable and multiple explanatory variables.

Chapter 7 is about methods for analyzing binary response data. The most common one is logistic regression which is used to model relationships between the response variable and several explanatory variables which may be categorical or continuous. Methods for relating the response to a single continuous variable, the dose, are also considered; these include probit analysis which was originally developed for analyzing dose-response data from bioassays. Logistic regression has been generalized in recent years to include responses with more than two nominal categories (nominal, multinomial, polytomous or polychotomous logistic regression) or ordinal categories (ordinal logistic regression). These new methods are discussed in Chapter 8.

Chapter 9 concerns count data. The counts may be frequencies displayed in a contingency table or numbers of events, such as traffic accidents, which need to be analyzed in relation to some 'exposure' variable such as the number of motor vehicles registered or the distances travelled by the drivers. Modelling methods are based on assuming that the distribution of counts can be described by the Poisson distribution, at least approximately. These methods include Poisson regression and log-linear models.

Survival analysis is the usual term for methods of analyzing failure time data. The parametric methods described in Chapter 10 fit into the framework

of generalized linear models although the probability distribution assumed for the failure times may not belong to the exponential family.

Generalized linear models have been extended to situations where the responses are correlated rather than independent random variables. This may occur, for instance, if they are repeated measurements on the same subject or measurements on a group of related subjects obtained, for example, from clustered sampling. The method of generalized estimating equations (GEE's) has been developed for analyzing such data using techniques analogous to those for generalized linear models. This method is outlined in Chapter 11 together with a different approach to correlated data, namely multilevel modelling.

Further examples of generalized linear models are discussed in the books by McCullagh and Nelder (1989), Aitkin et al. (1989) and Healy (1988). Also there are many books about specific generalized linear models such as Hosmer and Lemeshow (2000), Agresti (1990, 1996), Collett (1991, 1994), Diggle, Liang and Zeger (1994), and Goldstein (1995).

1.3 Notation

Generally we follow the convention of denoting random variables by upper case italic letters and observed values by the corresponding lower case letters. For example, the observations y1, y2, ..., yn are regarded as realizations of the random variables Y1, Y2, ..., Yn. Greek letters are used to denote parameters and the corresponding lower case roman letters are used to denote estimators and estimates; occasionally the symbol ^ (hat) is used for estimators or estimates. For example, the parameter β is estimated by β̂ or b. Sometimes these conventions are not strictly adhered to, either to avoid excessive notation in cases where the meaning should be apparent from the context, or when there is a strong tradition of alternative notation (e.g., e or ε for random error terms).

Vectors and matrices, whether random or not, are denoted by bold face lower and upper case letters, respectively. Thus, y represents a vector of observations

    y = [y1, ..., yn]ᵀ

or a vector of random variables

    y = [Y1, ..., Yn]ᵀ,

β denotes a vector of parameters and X is a matrix. The superscript T is used for a matrix transpose or when a column vector is written as a row, e.g., yᵀ = [Y1, ..., Yn].

The probability density function of a continuous random variable Y (or the probability mass function if Y is discrete) is referred to simply as a probability distribution and denoted by

    f(y; θ)

where θ represents the parameters of the distribution.

We use dot (·) subscripts for summation and bars (¯) for means; thus

    ȳ = (1/N) Σᵢ₌₁ᴺ yᵢ = (1/N) y· .

The expected value and variance of a random variable Y are denoted by E(Y) and var(Y) respectively. Suppose random variables Y1, ..., Yn are independent with E(Yi) = µi and var(Yi) = σi² for i = 1, ..., n. Let the random variable W be a linear combination of the Yi's

W = a1Y1 + a2Y2 + ... + anYn, (1.1)

where the ai’s are constants. Then the expected value of W is

    E(W) = a1µ1 + a2µ2 + ... + anµn    (1.2)

and its variance is

    var(W) = a1²σ1² + a2²σ2² + ... + an²σn².    (1.3)
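These two results can be checked quickly by simulation. The following sketch assumes the NumPy library is available; the particular constants ai, µi and σi are arbitrary illustrative choices, not values from the text.

    import numpy as np

    rng = np.random.default_rng(0)
    a = np.array([1.0, -2.0, 0.5])       # the constants a_i
    mu = np.array([0.0, 1.0, 3.0])       # E(Y_i) = mu_i
    sigma = np.array([1.0, 2.0, 0.5])    # standard deviations sigma_i

    # 100 000 independent draws of (Y_1, Y_2, Y_3) and of W = sum a_i Y_i
    Y = rng.normal(mu, sigma, size=(100_000, 3))
    W = Y @ a

    print(W.mean(), a @ mu)              # both close to equation (1.2)
    print(W.var(), a**2 @ sigma**2)      # both close to equation (1.3)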

1.4 Distributions related to the Normal distribution

The sampling distributions of many of the estimators and test statistics used in this book depend on the Normal distribution. They do so either directly because they are derived from Normally distributed random variables, or asymptotically, via the Central Limit Theorem for large samples. In this section we give definitions and notation for these distributions and summarize the relationships between them. The exercises at the end of the chapter provide practice in using these results, which are employed extensively in subsequent chapters.

1.4.1 Normal distributions

1. If the random variable Y has the Normal distribution with mean µ and variance σ², its probability density function is

    f(y; µ, σ²) = (1/√(2πσ²)) exp[−(y − µ)²/(2σ²)].

We denote this by Y ∼ N(µ, σ²).

2. The Normal distribution with µ = 0 and σ² = 1, Y ∼ N(0, 1), is called the standard Normal distribution.

3. Let Y1, ..., Yn denote Normally distributed random variables with Yi ∼ N(µi, σi²) for i = 1, ..., n and let the covariance of Yi and Yj be denoted by

    cov(Yi, Yj) = ρij σi σj,

where ρij is the correlation coefficient for Yi and Yj. Then the joint distribution of the Yi's is the multivariate Normal distribution with mean vector µ = [µ1, ..., µn]ᵀ and variance-covariance matrix V with diagonal elements σi² and non-diagonal elements ρij σi σj for i ≠ j. We write this as y ∼ N(µ, V), where y = [Y1, ..., Yn]ᵀ.

4. Suppose the random variables Y1, ..., Yn are independent and Normally distributed with the distributions Yi ∼ N(µi, σi²) for i = 1, ..., n. If

    W = a1Y1 + a2Y2 + ... + anYn,

where the ai's are constants, then W is also Normally distributed, so that

    W = Σᵢ₌₁ⁿ aiYi ∼ N(Σᵢ₌₁ⁿ aiµi, Σᵢ₌₁ⁿ ai²σi²)

by equations (1.2) and (1.3).

1.4.2 Chi-squared distribution

1. The central chi-squared distribution with n degrees of freedom is defined as the sum of squares of n independent random variables Z1, ..., Zn each with the standard Normal distribution. It is denoted by

    X² = Σᵢ₌₁ⁿ Zi² ∼ χ²(n).

In matrix notation, if z = [Z1, ..., Zn]ᵀ then zᵀz = Σᵢ₌₁ⁿ Zi², so that X² = zᵀz ∼ χ²(n).

2. If X² has the distribution χ²(n), then its expected value is E(X²) = n and its variance is var(X²) = 2n (see the simulation sketch after this list).

3. If Y1, ..., Yn are independent Normally distributed random variables each with the distribution Yi ∼ N(µi, σi²) then

    X² = Σᵢ₌₁ⁿ [(Yi − µi)/σi]² ∼ χ²(n)    (1.4)

because each of the variables Zi = (Yi − µi)/σi has the standard Normal distribution N(0, 1).

4. Let Z1, ..., Zn be independent random variables each with the distribution N(0, 1) and let Yi = Zi + µi, where at least one of the µi's is non-zero. Then the distribution of

    Σ Yi² = Σ (Zi + µi)² = Σ Zi² + 2 Σ Ziµi + Σ µi²

has larger mean n + λ and larger variance 2n + 4λ than χ²(n), where λ = Σ µi². This is called the non-central chi-squared distribution with n degrees of freedom and non-centrality parameter λ. It is denoted by χ²(n, λ).

5. Suppose that the Yi's are not necessarily independent and the vector y = [Y1, ..., Yn]ᵀ has the multivariate Normal distribution y ∼ N(µ, V) where the variance-covariance matrix V is non-singular and its inverse is V⁻¹. Then

    X² = (y − µ)ᵀ V⁻¹ (y − µ) ∼ χ²(n).    (1.5)

6. More generally if y ∼ N(µ, V) then the random variable yᵀV⁻¹y has the non-central chi-squared distribution χ²(n, λ) where λ = µᵀV⁻¹µ.

7. If X1², ..., Xm² are m independent random variables with the chi-squared distributions Xi² ∼ χ²(ni, λi), which may or may not be central, then their sum also has a chi-squared distribution with Σ ni degrees of freedom and non-centrality parameter Σ λi, i.e.,

    Σᵢ₌₁ᵐ Xi² ∼ χ²(Σᵢ₌₁ᵐ ni, Σᵢ₌₁ᵐ λi).

This is called the reproductive property of the chi-squared distribution.

8. Let y ∼ N(µ, V), where y has n elements but the Yi's are not independent so that V is singular with rank k < n and the inverse of V is not uniquely defined. Let V⁻ denote a generalized inverse of V. Then the random variable yᵀV⁻y has the non-central chi-squared distribution with k degrees of freedom and non-centrality parameter λ = µᵀV⁻µ.
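Properties 1, 2 and 4 above are easy to illustrate numerically. The sketch below assumes NumPy; the values of n and λ are arbitrary, and the µi are chosen so that Σ µi² = λ.

    import numpy as np

    rng = np.random.default_rng(1)
    n, lam = 5, 4.0
    mu = np.full(n, np.sqrt(lam / n))    # so that sum(mu_i^2) = lambda

    Z = rng.normal(size=(200_000, n))
    X2_central = (Z**2).sum(axis=1)              # X^2 ~ chi^2(n)
    X2_noncentral = ((Z + mu)**2).sum(axis=1)    # ~ chi^2(n, lambda)

    print(X2_central.mean(), X2_central.var())        # approximately n and 2n
    print(X2_noncentral.mean(), X2_noncentral.var())  # approximately n + lambda and 2n + 4*lambda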

1.4.3 t-distribution

The t-distribution with n degrees of freedom is defined as the ratio of two independent random variables. The numerator has the standard Normal distribution and the denominator is the square root of a central chi-squared random variable divided by its degrees of freedom; that is,

    T = Z / (X²/n)^(1/2)    (1.6)

where Z ∼ N(0, 1), X² ∼ χ²(n) and Z and X² are independent. This is denoted by T ∼ t(n).

1.4.4 F-distribution

1. The central F-distribution with n and m degrees of freedom is defined as the ratio of two independent central chi-squared random variables each divided by its degrees of freedom,

    F = (X1²/n) / (X2²/m)    (1.7)

where X1² ∼ χ²(n), X2² ∼ χ²(m) and X1² and X2² are independent. This is denoted by F ∼ F(n, m).

2. The relationship between the t-distribution and the F-distribution can be derived by squaring the terms in equation (1.6) and using definition (1.7) to obtain

    T² = (Z²/1) / (X²/n) ∼ F(1, n),    (1.8)

that is, the square of a random variable with the t-distribution, t(n), has the F-distribution, F(1, n) (see the simulation sketch after this list).

3. The non-central F-distribution is defined as the ratio of two independent random variables, each divided by its degrees of freedom, where the numerator has a non-central chi-squared distribution and the denominator has a central chi-squared distribution, i.e.,

    F = (X1²/n) / (X2²/m)

where X1² ∼ χ²(n, λ) with λ = µᵀV⁻¹µ, X2² ∼ χ²(m) and X1² and X2² are independent. The mean of a non-central F-distribution is larger than the mean of a central F-distribution with the same degrees of freedom.
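Relationship (1.8) can be checked by simulating t(n) variates from definition (1.6) and comparing their squares with the F(1, n) distribution. This sketch assumes NumPy and SciPy; n is an arbitrary choice.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n = 8
    Z = rng.normal(size=500_000)
    X2 = rng.chisquare(n, size=500_000)
    T = Z / np.sqrt(X2 / n)            # T ~ t(n), equation (1.6)

    # Compare an upper quantile of T^2 with the same quantile of F(1, n)
    print(np.quantile(T**2, 0.95))     # empirical 95th percentile
    print(stats.f.ppf(0.95, 1, n))     # theoretical F(1, n) percentile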

1.5 Quadratic forms

1. A quadratic form is a polynomial expression in which each term has degree 2. Thus y1² + y2² and 2y1² + y2² + 3y1y2 are quadratic forms in y1 and y2, but y1² + y2² + 2y1 or y1² + 3y2² + 2 are not.

2. Let A be a symmetric matrix

    [ a11  a12  ...  a1n ]
    [ a21  a22  ...  a2n ]
    [  .    .   ...   .  ]
    [ an1  an2  ...  ann ]

where aij = aji; then the expression yᵀAy = Σᵢ Σⱼ aij yi yj is a quadratic form in the yi's. The expression (y − µ)ᵀV⁻¹(y − µ) is a quadratic form in the terms (yi − µi) but not in the yi's.

3. The quadratic form yᵀAy and the matrix A are said to be positive definite if yᵀAy > 0 whenever the elements of y are not all zero. A necessary and sufficient condition for positive definiteness is that all the determinants

    |A1| = a11, |A2| = det [ a11 a12 ; a21 a22 ], |A3| = det [ a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ], ..., and |An| = det A

are all positive.

4. The rank of the matrix A is also called the degrees of freedom of the quadratic form Q = yᵀAy.

5. Suppose Y1, ..., Yn are independent random variables each with the Normal distribution N(0, σ²). Let Q = Σᵢ₌₁ⁿ Yi² and let Q1, ..., Qk be quadratic forms in the Yi's such that

    Q = Q1 + ... + Qk

where Qi has mi degrees of freedom (i = 1, ..., k). Then Q1, ..., Qk are independent random variables and

    Q1/σ² ∼ χ²(m1), Q2/σ² ∼ χ²(m2), ... and Qk/σ² ∼ χ²(mk),

if and only if

    m1 + m2 + ... + mk = n.

This is Cochran's theorem; for a proof see, for example, Hogg and Craig (1995). A similar result holds for non-central distributions; see Chapter 3 of Rao (1973).

6. A consequence of Cochran's theorem is that the difference of two independent random variables, X1² ∼ χ²(m) and X2² ∼ χ²(k), also has a chi-squared distribution

    X² = X1² − X2² ∼ χ²(m − k)

provided that X² ≥ 0 and m > k.

1.6 Estimation

1.6.1 Maximum likelihood estimation

Let y = [Y1, ..., Yn]ᵀ denote a random vector and let the joint probability density function of the Yi's be

    f(y; θ)

which depends on the vector of parameters θ = [θ1, ..., θp]ᵀ.

The likelihood function L(θ; y) is algebraically the same as the joint probability density function f(y; θ) but the change in notation reflects a shift of emphasis from the random variables y, with θ fixed, to the parameters θ with y fixed. Since L is defined in terms of the random vector y, it is itself a random variable. Let Ω denote the set of all possible values of the parameter vector θ; Ω is called the parameter space. The maximum likelihood estimator of θ is the value θ̂ which maximizes the likelihood function, that is

    L(θ̂; y) ≥ L(θ; y) for all θ in Ω.

Equivalently, θ̂ is the value which maximizes the log-likelihood function

    l(θ; y) = log L(θ; y),

since the logarithmic function is monotonic. Thus

    l(θ̂; y) ≥ l(θ; y) for all θ in Ω.

Often it is easier to work with the log-likelihood function than with the likelihood function itself. Usually the estimator θ̂ is obtained by differentiating the log-likelihood function with respect to each element θj of θ and solving the simultaneous equations

    ∂l(θ; y)/∂θj = 0 for j = 1, ..., p.    (1.9)

It is necessary to check that the solutions do correspond to maxima of l(θ; y) by verifying that the matrix of second derivatives

    ∂²l(θ; y)/∂θj∂θk

evaluated at θ = θ̂ is negative definite. For example, if θ has only one element θ this means it is necessary to check that

    ∂²l(θ; y)/∂θ² < 0 at θ = θ̂.

It is also necessary to check if there are any values of θ at the edges of the parameter space Ω that give local maxima of l(θ; y). When all local maxima have been identified, the value of θ̂ corresponding to the largest one is the maximum likelihood estimator. (For most of the models considered in this book there is only one maximum and it corresponds to the solution of the equations ∂l/∂θj = 0, j = 1, ..., p.)

An important property of maximum likelihood estimators is that if g(θ) is any function of the parameters θ, then the maximum likelihood estimator of g(θ) is g(θ̂). This follows from the definition of θ̂. It is sometimes called the invariance property of maximum likelihood estimators. A consequence is that we can work with a function of the parameters that is convenient for maximum likelihood estimation and then use the invariance property to obtain maximum likelihood estimates for the required parameters.

In principle, it is not necessary to be able to find the derivatives of the likelihood or log-likelihood functions or to solve equation (1.9) if θ̂ can be found numerically. In practice, numerical approximations are very important for generalized linear models.

Other properties of maximum likelihood estimators include consistency, sufficiency, asymptotic efficiency and asymptotic normality. These are discussed in books such as Cox and Hinkley (1974) or Kalbfleisch (1985, Chapters 1 and 2).

1.6.2 Example: Poisson distribution

Let Y1, ..., Yn be independent random variables each with the Poisson distribution

    f(yi; θ) = θ^yi e^(−θ) / yi!,   yi = 0, 1, 2, ...

with the same parameter θ. Their joint distribution is

    f(y1, ..., yn; θ) = Πᵢ₌₁ⁿ f(yi; θ) = [θ^y1 e^(−θ)/y1!] × [θ^y2 e^(−θ)/y2!] × ... × [θ^yn e^(−θ)/yn!]
                      = θ^(Σyi) e^(−nθ) / (y1! y2! ... yn!).

This is also the likelihood function L(θ; y1, ..., yn). It is easier to use the log-likelihood function

    l(θ; y1, ..., yn) = log L(θ; y1, ..., yn) = (Σ yi) log θ − nθ − Σ (log yi!).

To find the maximum likelihood estimate θ̂, use

    dl/dθ = (1/θ) Σ yi − n.

Equate this to zero to obtain the solution

    θ̂ = Σ yi / n = ȳ.

Since d²l/dθ² = −Σ yi/θ² < 0, l has its maximum value when θ = θ̂, confirming that ȳ is the maximum likelihood estimate.
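This analytic solution can be checked numerically. The sketch below assumes NumPy and SciPy are available; the count vector y is a hypothetical illustration, and gammaln(y + 1) supplies log(y!).

    import numpy as np
    from scipy import optimize, special

    y = np.array([2, 0, 3, 1, 4, 2, 1])    # hypothetical Poisson counts

    def neg_log_lik(theta):
        # minus the log-likelihood: -[(sum y) log(theta) - n*theta - sum log(y_i!)]
        return -(y.sum() * np.log(theta) - len(y) * theta
                 - special.gammaln(y + 1).sum())

    res = optimize.minimize_scalar(neg_log_lik, bounds=(1e-6, 20.0), method="bounded")
    print(res.x, y.mean())    # the numerical maximum agrees with theta_hat = y_bar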

1.6.3 Least squares estimation

Let Y1, ..., Yn be independent random variables with expected values µ1, ..., µn respectively. Suppose that the µi's are functions of the parameter vector that we want to estimate, β = [β1, ..., βp]ᵀ, p < n. Thus

    E(Yi) = µi(β).

The simplest form of the method of least squares consists of finding the estimator β̂ that minimizes the sum of squares of the differences between the Yi's and their expected values

    S = Σ [Yi − µi(β)]².

Usually β̂ is obtained by differentiating S with respect to each element βj of β and solving the simultaneous equations

    ∂S/∂βj = 0,   j = 1, ..., p.

Of course it is necessary to check that the solutions correspond to minima

© 2002 by Chapman & Hall/CRC (i.e., the matrix ofsecond derivatives is positive definite) and to identifythe global minimum from among these solutions and any local minima at the boundary ofthe parameter space. 2 Now suppose that the Yi’s have σi that are not all equal. Then it may be desirable to minimize the weighted sum ofsquared differences  2 S = wi [Yi − µi (β)]

2 −1 where the weights are wi =(σi ) . In this way, the observations which are less reliable (that is, the Yi ’s with the larger variances) will have less influence on the estimates. T More generally, let y =[Y1, ..., Yn] denote a random vector with mean vec- T tor µ =[µ1, ..., µn] and variance-covariance matrix V. Then the estimator is obtained by minimizing S =(y − µ)T V−1(y − µ).
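The weighted minimization can also be done numerically. The following sketch assumes NumPy and SciPy; the data, the weights and the linear form chosen for µi(β) are all hypothetical illustrations.

    import numpy as np
    from scipy import optimize

    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([2.1, 3.9, 6.2, 7.8])
    w = 1.0 / np.array([0.5, 0.5, 2.0, 2.0])    # weights w_i = 1/sigma_i^2

    def S(beta):
        mu = beta[0] + beta[1] * x               # mu_i(beta), here a straight line
        return np.sum(w * (y - mu) ** 2)         # weighted sum of squared differences

    res = optimize.minimize(S, x0=np.array([0.0, 1.0]))
    print(res.x)    # least squares estimates of the two elements of beta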

1.6.4 Comments on estimation

1. An important distinction between the methods of maximum likelihood and least squares is that the method of least squares can be used without making assumptions about the distributions of the response variables Yi beyond specifying their expected values and possibly their variance-covariance structure. In contrast, to obtain maximum likelihood estimators we need to specify the joint probability distribution of the Yi's.

2. For many situations maximum likelihood and least squares estimators are identical.

3. Often numerical methods rather than calculus may be needed to obtain parameter estimates that maximize the likelihood or log-likelihood function or minimize the sum of squares. The following example illustrates this approach.

1.6.5 Example: Tropical cyclones

Table 1.2 shows the numbers of tropical cyclones in Northeastern Australia for the seasons 1956-7 (season 1) to 1968-9 (season 13), a period of fairly consistent conditions for the definition and tracking of cyclones (Dobson and Stewart, 1974).

Table 1.2 Numbers of tropical cyclones in 13 successive seasons.

Season:          1  2  3  4  5  6  7   8  9  10  11  12  13
No. of cyclones: 6  5  4  6  6  3  12  7  4  2   6   7   4

Let Yi denote the number of cyclones in season i, where i = 1, ..., 13. Suppose the Yi's are independent random variables with the Poisson distribution

© 2002 by Chapman & Hall/CRC 55

50

45

40

35 345678

Figure 1.1 Graph showing the location of the maximum likelihood estimate for the data in Table 1.2 on tropical cyclones.

with parameter θ. From Example 1.6.2, θ̂ = ȳ = 72/13 = 5.538. An alternative approach would be to find numerically the value of θ that maximizes the log-likelihood function. The component of the log-likelihood function due to yi is

    li = yi log θ − θ − log yi!.

The log-likelihood function is the sum of these terms

    l = Σᵢ₌₁¹³ li = Σᵢ₌₁¹³ (yi log θ − θ − log yi!).

Only the first two terms in the brackets involve θ and so are relevant to the optimization calculation, because the term Σᵢ₌₁¹³ log yi! is a constant. To plot the log-likelihood function (without the constant term) against θ, for various values of θ, calculate (yi log θ − θ) for each yi and add the results to obtain

    l* = Σ (yi log θ − θ).

Figure 1.1 shows l* plotted against θ. Clearly the maximum value is between θ = 5 and θ = 6. This can provide a starting point for an iterative procedure for obtaining θ̂. The results of a simple bisection calculation are shown in Table 1.3. The function l* is first calculated for approximations θ⁽¹⁾ = 5 and θ⁽²⁾ = 6. Then subsequent approximations θ⁽ᵏ⁾ for k = 3, 4, ... are the average values of the two previous estimates of θ with the largest values of l* (for example, θ⁽⁶⁾ = ½(θ⁽⁵⁾ + θ⁽³⁾)). After 7 steps this process gives θ̂ ≈ 5.54 which is correct to 2 decimal places.
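The bisection scheme just described is easy to program. The sketch below assumes NumPy; it uses the data of Table 1.2 and reproduces the successive approximations of Table 1.3 and the analytic estimate θ̂ = ȳ.

    import numpy as np

    y = np.array([6, 5, 4, 6, 6, 3, 12, 7, 4, 2, 6, 7, 4])    # Table 1.2

    def l_star(theta):
        # log-likelihood without the constant term sum(log y_i!)
        return y.sum() * np.log(theta) - len(y) * theta

    # Start from theta = 5 and theta = 6; each new trial value is the
    # average of the two previous trials with the largest l*.
    trials = [5.0, 6.0]
    for k in range(3, 11):
        best_two = sorted(trials, key=l_star)[-2:]
        trials.append(sum(best_two) / 2.0)
        print(k, round(trials[-1], 4), round(l_star(trials[-1]), 5))

    print(y.mean())    # the analytic maximum likelihood estimate, 72/13 = 5.538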

1.7 Exercises

1.1 Let Y1 and Y2 be independent random variables with

    Y1 ∼ N(1, 3) and Y2 ∼ N(2, 5).

If W1 = Y1 + 2Y2 and W2 = 4Y1 − Y2, what is the joint distribution of W1 and W2?

1.2 Let Y1 and Y2 be independent random variables with

Y1 ∼ N(0, 1) and Y2 ∼ N(3, 4).

Table 1.3 Successive approximations to the maximum likelihood estimate of the mean number of cyclones per season.

k    θ⁽ᵏ⁾      l*
1    5         50.878
2    6         51.007
3    5.5       51.242
4    5.75      51.192
5    5.625     51.235
6    5.5625    51.243
7    5.5313    51.24354
8    5.5469    51.24352
9    5.5391    51.24360
10   5.5352    51.24359

(a) What is the distribution of Y1²?
(b) If

    y = [Y1, (Y2 − 3)/2]ᵀ,

obtain an expression for yᵀy. What is its distribution?
(c) If y = [Y1, Y2]ᵀ and its distribution is y ∼ N(µ, V), obtain an expression for yᵀV⁻¹y. What is its distribution?

1.3 Let the joint distribution of Y1 and Y2 be N(µ, V) with

    µ = [ 2 ]    and    V = [ 4  1 ]
        [ 3 ]               [ 1  9 ].

(a) Obtain an expression for (y − µ)ᵀV⁻¹(y − µ). What is its distribution?
(b) Obtain an expression for yᵀV⁻¹y. What is its distribution?

1.4 Let Y1, ..., Yn be independent random variables each with the distribution N(µ, σ²). Let

    Ȳ = (1/n) Σᵢ₌₁ⁿ Yi    and    S² = (1/(n − 1)) Σᵢ₌₁ⁿ (Yi − Ȳ)².

(a) What is the distribution of Ȳ?
(b) Show that S² = (1/(n − 1)) [Σᵢ₌₁ⁿ (Yi − µ)² − n(Ȳ − µ)²].
(c) From (b) it follows that Σ (Yi − µ)²/σ² = (n − 1)S²/σ² + (Ȳ − µ)² n/σ². How does this allow you to deduce that Ȳ and S² are independent?
(d) What is the distribution of (n − 1)S²/σ²?
(e) What is the distribution of (Ȳ − µ)/(S/√n)?

Table 1.4 Progeny of light brown apple moths.

Progeny group   Females   Males
1               18        11
2               31        22
3               34        27
4               33        29
5               27        24
6               33        29
7               28        25
8               23        26
9               33        38
10              12        14
11              19        23
12              25        31
13              14        20
14              4         6
15              22        34
16              7         12

1.5 This exercise is a continuation of the example in Section 1.6.2 in which Y1, ..., Yn are independent Poisson random variables with the parameter θ.

(a) Show that E(Yi) = θ for i = 1, ..., n.
(b) Suppose θ = e^β. Find the maximum likelihood estimator of β.
(c) Minimize S = Σ (Yi − e^β)² to obtain a least squares estimator of β.

1.6 The data in Table 1.4 are the numbers of females and males in the progeny of 16 female light brown apple moths in Muswellbrook, New South Wales, Australia (from Lewis, 1987).

(a) Calculate the proportion of females in each of the 16 groups of progeny.

(b) Let Yi denote the number of females and ni the number of progeny in each group (i = 1, ..., 16). Suppose the Yi's are independent random variables each with the binomial distribution

    f(yi; θ) = (ni choose yi) θ^yi (1 − θ)^(ni − yi).

Find the maximum likelihood estimator of θ using calculus and evaluate it for these data.
(c) Use a numerical method to estimate θ̂ and compare the answer with the one from (b).

2 Model Fitting

2.1 Introduction

The model fitting process described in this book involves four steps:

1. Model specification – a model is specified in two parts: an equation linking the response and explanatory variables and the probability distribution of the response variable.
2. Estimation of the parameters of the model.
3. Checking the adequacy of the model – how well it fits or summarizes the data.
4. Inference – calculating confidence intervals and testing hypotheses about the parameters in the model and interpreting the results.

In this chapter these steps are first illustrated using two small examples. Then some general principles are discussed. Finally there are sections about notation and coding of explanatory variables which are needed in subsequent chapters.

2.2 Examples

2.2.1 Chronic medical conditions

Data from the Australian Longitudinal Study on Women's Health (Brown et al., 1996) show that women who live in country areas tend to have fewer consultations with general practitioners (family physicians) than women who live near a wider range of health services. It is not clear whether this is because they are healthier or because structural factors, such as shortage of doctors, higher costs of visits and longer distances to travel, act as barriers to the use of general practitioner (GP) services. Table 2.1 shows the numbers of chronic medical conditions (for example, high blood pressure or arthritis) reported by samples of women living in large country towns (town group) or in more rural areas (country group) in New South Wales, Australia. All the women were aged 70-75 years, had the same socio-economic status and had three or fewer GP visits during 1996. The question of interest is: do women who have similar levels of use of GP services in the two groups have the same need as indicated by their number of chronic medical conditions?

The Poisson distribution provides a plausible way of modelling these data as they are counts and within each group the sample mean and variance are approximately equal. Let Yjk be a random variable representing the number of conditions for the kth woman in the jth group, where j = 1 for the town group and j = 2 for the country group and k = 1, ..., Kj with K1 = 26 and K2 = 23.

Table 2.1 Numbers of chronic medical conditions of 26 town women and 23 country women with similar use of general practitioner services.

Town
0 1 1 0 2 3 0 1 1 1 1 2 0 1 3 0 1 2 1 3 3 4 1 3 2 0
n = 26, mean = 1.423, standard deviation = 1.172, variance = 1.374

Country
2 0 3 0 0 1 1 1 1 0 0 2 2 0 1 2 0 0 1 1 1 0 2
n = 23, mean = 0.913, standard deviation = 0.900, variance = 0.810

Suppose the Yjk's are all independent and have the Poisson distribution with parameter θj representing the expected number of conditions. The question of interest can be formulated as a test of the null hypothesis H0: θ1 = θ2 = θ against the alternative hypothesis H1: θ1 ≠ θ2. The model fitting approach to testing H0 is to fit two models, one assuming H0 is true, that is

    E(Yjk) = θ;   Yjk ∼ Poisson(θ)    (2.1)

and the other assuming it is not, so that

    E(Yjk) = θj;   Yjk ∼ Poisson(θj),    (2.2)

where j = 1 or 2. Testing H0 against H1 involves comparing how well models (2.1) and (2.2) fit the data. If they are about equally good then there is little reason for rejecting H0. However if model (2.2) is clearly better, then H0 would be rejected in favor of H1.

If H0 is true, then the log-likelihood function of the Yjk's is

    l0 = l(θ; y) = Σⱼ Σₖ (yjk log θ − θ − log yjk!),   j = 1, ..., J; k = 1, ..., Kj,    (2.3)

where J = 2 in this case. The maximum likelihood estimate, which can be obtained as shown in the example in Section 1.6.2, is

    θ̂ = Σⱼ Σₖ yjk / N,

where N = Σⱼ Kj. For these data the estimate is θ̂ = 1.184 and the maximum value of the log-likelihood function, obtained by substituting this value of θ̂ and the data values yjk into (2.3), is l0 = −68.3868.

If H1 is true, then the log-likelihood function is

    l1 = l(θ1, θ2; y) = Σₖ (y1k log θ1 − θ1 − log y1k!) + Σₖ (y2k log θ2 − θ2 − log y2k!),    (2.4)

where the first sum runs over k = 1, ..., K1 and the second over k = 1, ..., K2.

(The subscripts on l0 and l1 in (2.3) and (2.4) are used to emphasize the connections with the hypotheses H0 and H1, respectively.) From (2.4) the maximum likelihood estimates are θ̂j = Σₖ yjk / Kj for j = 1 or 2. In this case θ̂1 = 1.423, θ̂2 = 0.913 and the maximum value of the log-likelihood function, obtained by substituting these values and the data into (2.4), is l1 = −67.0230.

The maximum value of the log-likelihood function l1 will always be greater than or equal to that of l0 because one more parameter has been fitted. To decide whether the difference is statistically significant we need to know the sampling distribution of the log-likelihood function. This is discussed in Chapter 4.

If Y ∼ Poisson(θ) then E(Y) = var(Y) = θ. The estimate θ̂ of E(Y) is called the fitted value of Y. The difference Y − θ̂ is called a residual (other definitions of residuals are also possible, see Section 2.3.4). Residuals form the basis of many methods for examining the adequacy of a model. A residual is usually standardized by dividing by its standard error. For the Poisson distribution an approximate standardized residual is

    r = (Y − θ̂) / √θ̂.

The standardized residuals for models (2.1) and (2.2) are shown in Table 2.2 and Figure 2.1. Examination of individual residuals is useful for assessing certain features of a model such as the appropriateness of the probability distribution used for the responses or the inclusion of specific explanatory variables. For example, the residuals in Table 2.2 and Figure 2.1 exhibit some skewness, as might be expected for the Poisson distribution.

The residuals can also be aggregated to produce summary statistics measuring the overall adequacy of the model. For example, for Poisson data denoted by independent random variables Yi, provided that the expected values θi are not too small, the standardized residuals ri = (Yi − θ̂i)/√θ̂i approximately have the standard Normal distribution N(0, 1), although they are not usually independent. An intuitive argument is that, approximately, ri ∼ N(0, 1) so ri² ∼ χ²(1) and hence

    Σ ri² = Σ (Yi − θ̂i)²/θ̂i ∼ χ²(m).    (2.5)

In fact, it can be shown that for large samples, (2.5) is a good approximation with m equal to the number of observations minus the number of parameters

Table 2.2 Observed values and standardized residuals for the data on chronic medical conditions (Table 2.1), with estimates obtained from models (2.1) and (2.2).

Value of Y   Frequency   Standardized residuals    Standardized residuals
                         from (2.1); θ̂ = 1.184     from (2.2); θ̂1 = 1.423 and θ̂2 = 0.913
Town
0            6           -1.088                    -1.193
1            10          -0.169                    -0.355
2            4            0.750                     0.484
3            5            1.669                     1.322
4            1            2.589                     2.160
Country
0            9           -1.088                    -0.956
1            8           -0.169                     0.091
2            5            0.750                     1.138
3            1            1.669                     2.184


Figure 2.1 Plots of residuals for models (2.1) and (2.2) for the data in Table 2.2 on chronic medical conditions.

estimated in order to calculate the fitted values θ̂i (for example, see Agresti, 1990, page 479). Expression (2.5) is, in fact, the usual chi-squared goodness of fit statistic for count data which is often written as

    X² = Σ (oi − ei)²/ei ∼ χ²(m)

where oi denotes the observed frequency and ei denotes the corresponding expected frequency. In this case oi = Yi, ei = θ̂i and Σ ri² = X².

For the data on chronic medical conditions, for model (2.1)

    Σ ri² = 6 × (−1.088)² + 10 × (−0.169)² + ... + 1 × 1.669² = 46.759.

This value is consistent with Σ ri² being an observation from the central chi-squared distribution with m = 23 + 26 − 1 = 48 degrees of freedom. (Recall from Section 1.4.2, that if X² ∼ χ²(m) then E(X²) = m, and notice that the calculated value X² = Σ ri² = 46.759 is near the expected value of 48.) Similarly, for model (2.2)

    Σ ri² = 6 × (−1.193)² + ... + 1 × 2.184² = 43.659,

which is consistent with the central chi-squared distribution with m = 49 − 2 = 47 degrees of freedom. The difference between the values of Σ ri² from models (2.1) and (2.2) is small: 46.759 − 43.659 = 3.10. This suggests that model (2.2), with two parameters, may not describe the data much better than the simpler model (2.1). If this is so, then the data provide evidence supporting the null hypothesis H0: θ1 = θ2. More formal testing of the hypothesis is discussed in Chapter 4.
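These log-likelihood and residual calculations can be reproduced directly from the data in Table 2.1. The sketch below assumes NumPy and SciPy; gammaln(y + 1) supplies the log yjk! terms.

    import numpy as np
    from scipy import special

    town = np.array([0,1,1,0,2,3,0,1,1,1,1,2,0,1,3,0,1,2,1,3,3,4,1,3,2,0])
    country = np.array([2,0,3,0,0,1,1,1,1,0,0,2,2,0,1,2,0,0,1,1,1,0,2])
    y = np.concatenate([town, country])

    def poisson_loglik(y, theta):
        return np.sum(y * np.log(theta) - theta - special.gammaln(y + 1))

    theta = y.mean()                                # 1.184, model (2.1)
    theta1, theta2 = town.mean(), country.mean()    # 1.423 and 0.913, model (2.2)

    l0 = poisson_loglik(y, theta)
    l1 = poisson_loglik(town, theta1) + poisson_loglik(country, theta2)
    print(l0, l1)           # about -68.3868 and -67.0230

    # Sums of squared standardized residuals, expression (2.5)
    r0 = np.sum((y - theta) ** 2 / theta)
    r1 = (np.sum((town - theta1) ** 2 / theta1)
          + np.sum((country - theta2) ** 2 / theta2))
    print(r0, r1, r0 - r1)  # about 46.759, 43.659 and 3.10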

The next example illustrates steps of the model fitting process with continuous data.

2.2.2 Birthweight and gestational age

The data in Table 2.3 are the birthweights (in grams) and estimated gestational ages (in weeks) of 12 male and 12 female babies born in a certain hospital. The mean ages are almost the same for both sexes but the mean birthweight for boys is higher than the mean birthweight for girls. The data are shown in the scatter plot in Figure 2.2. There is a linear trend of birthweight increasing with gestational age and the girls tend to weigh less than the boys of the same gestational age. The question of interest is whether the rate of increase of birthweight with gestational age is the same for boys and girls.

Let Yjk be a random variable representing the birthweight of the kth baby in group j where j = 1 for boys and j = 2 for girls and k = 1, ..., 12. Suppose that the Yjk's are all independent and are Normally distributed with means µjk = E(Yjk), which may differ among babies, and variance σ², which is the same for all of them. A fairly general model relating birthweight to gestational age is

E(Yjk)=µjk = αj + βjxjk

Table 2.3 Birthweight and gestational age for boys and girls.

        Boys                   Girls
Age     Birthweight     Age     Birthweight
40      2968            40      3317
38      2795            36      2729
40      3163            40      2935
35      2925            38      2754
36      2625            42      3210
37      2847            39      2817
41      3292            40      3126
40      3473            37      2539
37      2628            36      2412
38      3176            38      2991
40      3421            39      2875
38      2975            40      3231
Means   38.33  3024.00  38.75   2911.33


Figure 2.2 Birthweight plotted against gestational age for boys (open circles) and girls (solid circles); data in Table 2.3.

© 2002 by Chapman & Hall/CRC where xjk is the gestational age ofthe kth baby in group j. The intercept parameters α1 and α2 are likely to differ because, on average, the boys were heavier than the girls. The slope parameters β1 and β2 represent the average increases in birthweight for each additional week of gestational age. The ques- tion ofinterest can be formulated in terms oftesting the null hypothesis H 0 : β1 = β2 = β (that is, the growth rates are equal and so the lines are parallel), against the alternative hypothesis H1 : β1 = β2. We can test H0 against H1 by fitting two models

    E(Yjk) = µjk = αj + βxjk;   Yjk ∼ N(µjk, σ²),    (2.6)

    E(Yjk) = µjk = αj + βjxjk;   Yjk ∼ N(µjk, σ²).    (2.7)

The probability density function for Yjk is

    f(yjk; µjk) = (1/√(2πσ²)) exp[−(yjk − µjk)²/(2σ²)].

We begin by fitting the more general model (2.7). The log-likelihood function is

    l1(α1, α2, β1, β2; y) = Σⱼ₌₁ᴶ Σₖ₌₁ᴷ [−½ log(2πσ²) − (yjk − µjk)²/(2σ²)]
                          = −½ JK log(2πσ²) − (1/(2σ²)) Σⱼ₌₁ᴶ Σₖ₌₁ᴷ (yjk − αj − βjxjk)²

where J = 2 and K = 12 in this case. When obtaining maximum likelihood estimates of α1, α2, β1 and β2 we treat the parameter σ² as a known constant, or nuisance parameter, and we do not estimate it. The maximum likelihood estimates are the solutions of the simultaneous equations

    ∂l1/∂αj = (1/σ²) Σₖ (yjk − αj − βjxjk) = 0,
    ∂l1/∂βj = (1/σ²) Σₖ xjk (yjk − αj − βjxjk) = 0,    (2.8)

where j = 1 or 2.

An alternative to maximum likelihood estimation is least squares estimation. For model (2.7), this involves minimizing the expression

    S1 = Σⱼ₌₁ᴶ Σₖ₌₁ᴷ (yjk − µjk)² = Σⱼ₌₁ᴶ Σₖ₌₁ᴷ (yjk − αj − βjxjk)².    (2.9)

The least squares estimates are the solutions of the equations

    ∂S1/∂αj = −2 Σₖ₌₁ᴷ (yjk − αj − βjxjk) = 0,
    ∂S1/∂βj = −2 Σₖ₌₁ᴷ xjk (yjk − αj − βjxjk) = 0.    (2.10)

The equations to be solved in (2.8) and (2.10) are the same and so maximizing l1 is equivalent to minimizing S1. For the remainder of this example we will use the least squares approach.

The estimating equations (2.10) can be simplified to

    Σₖ₌₁ᴷ yjk − Kαj − βj Σₖ₌₁ᴷ xjk = 0,
    Σₖ₌₁ᴷ xjk yjk − αj Σₖ₌₁ᴷ xjk − βj Σₖ₌₁ᴷ xjk² = 0

for j = 1 or 2. These are called the normal equations. The solution is

    bj = [K Σₖ xjk yjk − (Σₖ xjk)(Σₖ yjk)] / [K Σₖ xjk² − (Σₖ xjk)²],
    aj = ȳj − bj x̄j,

where aj is the estimate of αj and bj is the estimate of βj, for j = 1 or 2. By considering the second derivatives of (2.9) it can be verified that the solution of equations (2.10) does correspond to the minimum of S1. The numerical value for the minimum of S1 for a particular data set can be obtained by substituting the estimates for αj and βj and the data values for yjk and xjk into (2.9).

To test H0: β1 = β2 = β against the more general alternative hypothesis H1, the estimation procedure described above for model (2.7) is repeated but with the expression in (2.6) used for µjk. In this case there are three parameters, α1, α2 and β, instead of four to be estimated. The least squares expression to be minimized is

    S0 = Σⱼ₌₁ᴶ Σₖ₌₁ᴷ (yjk − αj − βxjk)².    (2.11)

From (2.11) the least squares estimates are given by the solution of the simultaneous equations

    ∂S0/∂αj = −2 Σₖ₌₁ᴷ (yjk − αj − βxjk) = 0,
    ∂S0/∂β = −2 Σⱼ₌₁ᴶ Σₖ₌₁ᴷ xjk (yjk − αj − βxjk) = 0,    (2.12)

Table 2.4 Summary of data on birthweight and gestational age in Table 2.3 (summation is over k = 1, ..., K where K = 12).

        Boys (j = 1)   Girls (j = 2)
Σx      460            465
Σy      36288          34936
Σx²     17672          18055
Σy²     110623496      102575468
Σxy     1395370        1358497

for j = 1 and 2. The solution is

    b = [K Σⱼ Σₖ xjk yjk − Σⱼ (Σₖ xjk)(Σₖ yjk)] / [K Σⱼ Σₖ xjk² − Σⱼ (Σₖ xjk)²],
    aj = ȳj − b x̄j.

These estimates and the minimum values for S0 and S1 can be calculated from the data.

For the example on birthweight and gestational age, the data are summarized in Table 2.4 and the least squares estimates and minimum values for S0 and S1 are given in Table 2.5. The fitted values ŷjk are shown in Table 2.6. For model (2.6), ŷjk = aj + bxjk is calculated from the estimates in the top part of Table 2.5. For model (2.7), ŷjk = aj + bjxjk is calculated using estimates in the bottom part of Table 2.5. The residual for each observation is yjk − ŷjk. The standard deviation s of the residuals can be calculated and used to obtain approximate standardized residuals (yjk − ŷjk)/s. Figures 2.3 and 2.4 show for models (2.6) and (2.7), respectively: the standardized residuals plotted against the fitted values; the standardized residuals plotted against gestational age; and Normal probability plots. These types of plots are discussed in Section 2.3.4. The Figures show that:

1. Standardized residuals show no systematic patterns in relation to either the fitted values or the explanatory variable, gestational age.
2. Standardized residuals are approximately Normally distributed (as the points are near the solid lines in the bottom graphs).
3. Very little difference exists between the two models.

The apparent lack of difference between the models can be examined by testing the null hypothesis H0 (corresponding to model (2.6)) against the alternative hypothesis H1 (corresponding to model (2.7)). If H0 is correct, then the minimum values S1 and S0 should be nearly equal. If the data support this hypothesis, we would feel justified in using the simpler model (2.6) to describe the data. On the other hand, if the more general hypothesis H1 is true then S0 should be much larger than S1 and model (2.7) would be preferable.

Table 2.5 Analysis of data on birthweight and gestational age in Table 2.3.

Model   Slopes          Intercepts          Minimum sum of squares
(2.6)   b = 120.894     a1 = −1610.283      S0 = 658770.8
                        a2 = −1773.322
(2.7)   b1 = 111.983    a1 = −1268.672      S1 = 652424.5
        b2 = 130.400    a2 = −2141.667
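The entries in Table 2.5 can be reproduced from the raw data of Table 2.3 using the normal-equation solutions given above. A sketch, assuming NumPy:

    import numpy as np

    # Data from Table 2.3
    x1 = np.array([40, 38, 40, 35, 36, 37, 41, 40, 37, 38, 40, 38])    # boys' ages
    y1 = np.array([2968, 2795, 3163, 2925, 2625, 2847, 3292, 3473,
                   2628, 3176, 3421, 2975])                             # boys' weights
    x2 = np.array([40, 36, 40, 38, 42, 39, 40, 37, 36, 38, 39, 40])    # girls' ages
    y2 = np.array([3317, 2729, 2935, 2754, 3210, 2817, 3126, 2539,
                   2412, 2991, 2875, 3231])                             # girls' weights
    K = 12

    def slope(x, y):
        # b_j for model (2.7)
        return (K * (x * y).sum() - x.sum() * y.sum()) / (K * (x**2).sum() - x.sum()**2)

    b1, b2 = slope(x1, y1), slope(x2, y2)
    a1, a2 = y1.mean() - b1 * x1.mean(), y2.mean() - b2 * x2.mean()
    S1 = ((y1 - a1 - b1 * x1)**2).sum() + ((y2 - a2 - b2 * x2)**2).sum()

    # Common slope b for model (2.6)
    num = K * ((x1 * y1).sum() + (x2 * y2).sum()) - (x1.sum() * y1.sum() + x2.sum() * y2.sum())
    den = K * ((x1**2).sum() + (x2**2).sum()) - (x1.sum()**2 + x2.sum()**2)
    b = num / den
    a1c, a2c = y1.mean() - b * x1.mean(), y2.mean() - b * x2.mean()
    S0 = ((y1 - a1c - b * x1)**2).sum() + ((y2 - a2c - b * x2)**2).sum()

    print(b1, b2, a1, a2, S1)   # 111.983, 130.400, -1268.672, -2141.667, 652424.5
    print(b, a1c, a2c, S0)      # 120.894, -1610.283, -1773.322, 658770.8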

Table 2.6 Observed values and fitted values under model (2.6) and model (2.7) for data in Table 2.3.

Sex    Gestational   Birthweight   Fitted value   Fitted value
       age                         under (2.6)    under (2.7)
Boys   40            2968          3225.5         3210.6
       38            2795          2983.7         2986.7
       40            3163          3225.5         3210.6
       35            2925          2621.0         2650.7
       36            2625          2741.9         2762.7
       37            2847          2862.8         2874.7
       41            3292          3346.4         3322.6
       40            3473          3225.5         3210.6
       37            2628          2862.8         2874.7
       38            3176          2983.7         2986.7
       40            3421          3225.5         3210.6
       38            2975          2983.7         2986.7
Girls  40            3317          3062.5         3074.3
       36            2729          2578.9         2552.7
       40            2935          3062.5         3074.3
       38            2754          2820.7         2813.5
       42            3210          3304.2         3335.1
       39            2817          2941.6         2943.9
       40            3126          3062.5         3074.3
       37            2539          2699.8         2683.1
       36            2412          2578.9         2552.7
       38            2991          2820.7         2813.5
       39            2875          2941.6         2943.9
       40            3231          3062.5         3074.3

© 2002 by Chapman & Hall/CRC Model (2.6)

2

1

0 Residuals

-1

2600 2800 3000 3200 3400 Fitted values

Model (2.6)

2

1

0 Residuals

-1

34 36 38 40 42 Gestational age

Model (2.6)

99

90

50 Percent

10

1 -1-2 210 Residuals

Figure 2.3 Plots of standardized residuals for model (2.6) for the data on birthweight and gestational age (Table 2.3); for the top and middle plots, open circles correspond to data from boys and solid circles correspond to data from girls.

© 2002 by Chapman & Hall/CRC Model (2.7)

2

1

Residuals 0

-1

2600 2800 3000 3200 3400 Fittted values

Model (2.7)

2

1

Residuals 0

-1

34 36 38 40 42 Gestational age

Model (2.7)

99

90

50 Percent

10

1 -1-2 210 Residuals

Figure 2.4 Plots of standardized residuals for model (2.7) for the data on birthweight and gestational age (Table 2.3); for the top and middle plots, open circles correspond to data from boys and solid circles correspond to data from girls.

To assess the relative magnitude of the values S1 and S0 we need to use the sampling distributions of the corresponding random variables

    S1 = Σⱼ₌₁ᴶ Σₖ₌₁ᴷ (Yjk − aj − bjxjk)²

and

    S0 = Σⱼ₌₁ᴶ Σₖ₌₁ᴷ (Yjk − aj − bxjk)².

It can be shown (see Exercise 2.3) that

    S1 = Σⱼ₌₁ᴶ Σₖ₌₁ᴷ [Yjk − (αj + βjxjk)]² − K Σⱼ₌₁ᴶ (Ȳj − αj − βjx̄j)² − Σⱼ₌₁ᴶ (bj − βj)² (Σₖ₌₁ᴷ xjk² − K x̄j²)

and that the random variables Yjk, Ȳj and bj are all independent and have the following distributions:

    Yjk ∼ N(αj + βjxjk, σ²),
    Ȳj ∼ N(αj + βjx̄j, σ²/K),
    bj ∼ N(βj, σ²/(Σₖ₌₁ᴷ xjk² − K x̄j²)).

Therefore S1/σ² is a linear combination of sums of squares of random variables with Normal distributions. In general, there are JK random variables (Yjk − αj − βjxjk)²/σ², J random variables (Ȳj − αj − βjx̄j)² K/σ² and J random variables (bj − βj)²(Σₖ xjk² − K x̄j²)/σ². They are all independent and each has the χ²(1) distribution. From the properties of the chi-squared distribution in Section 1.5, it follows that S1/σ² ∼ χ²(JK − 2J). Similarly, if H0 is correct then S0/σ² ∼ χ²[JK − (J + 1)]. In this example J = 2 so S1/σ² ∼ χ²(2K − 4) and S0/σ² ∼ χ²(2K − 3). In each case the value for the degrees of freedom is the number of observations minus the number of parameters estimated.

If β1 and β2 are not equal (corresponding to H1), then S0/σ² will have a non-central chi-squared distribution with JK − (J + 1) degrees of freedom. On the other hand, provided that model (2.7) describes the data well, S1/σ² will have a central chi-squared distribution with JK − 2J degrees of freedom.

The statistic S0 − S1 represents the improvement in fit of (2.7) compared to (2.6). If H0 is correct, then

    (1/σ²)(S0 − S1) ∼ χ²(J − 1).

If H0 is not correct then (S0 − S1)/σ² has a non-central chi-squared distribution.


Figure 2.5 Central and non-central F distributions.

However, as $\sigma^2$ is unknown, we cannot compare $(S_0 - S_1)/\sigma^2$ directly with the $\chi^2(J-1)$ distribution. Instead we eliminate $\sigma^2$ by using the ratio of $(S_0 - S_1)/\sigma^2$ and the random variable $S_1/\sigma^2$ with a central chi-squared distribution, each divided by the relevant degrees of freedom,
$$F = \frac{(S_0 - S_1)/\sigma^2}{J-1} \Big/ \frac{S_1/\sigma^2}{JK - 2J} = \frac{(S_0 - S_1)/(J-1)}{S_1/(JK - 2J)}.$$

If $H_0$ is correct then, from Section 1.4.4, $F$ has the central distribution $F(J-1, JK-2J)$. If $H_0$ is not correct, $F$ has a non-central $F$-distribution and the calculated value of $F$ will be larger than expected from the central $F$-distribution (see Figure 2.5).

For the example on birthweight and gestational age, the value of $F$ is
$$F = \frac{(658770.8 - 652424.5)/1}{652424.5/20} = 0.19.$$
This value is certainly not statistically significant when compared with the $F(1, 20)$ distribution. Thus the data do not provide evidence against the hypothesis $H_0: \beta_1 = \beta_2$, and on the grounds of simplicity model (2.6), which specifies a common slope but different intercepts, is preferable.

These two examples illustrate the main ideas and methods of statistical modelling, which are now discussed more generally.
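The arithmetic above is easy to check by computer; the sketch below (an addition, assuming scipy is available) reproduces the $F$ ratio from the sums of squares in Table 2.5.

```python
# Check of the F statistic for comparing models (2.6) and (2.7),
# using the minimum sums of squares from Table 2.5.
from scipy import stats

S0, S1 = 658770.8, 652424.5   # models (2.6) and (2.7)
J, K = 2, 12                  # two groups (sexes), twelve babies per group

F = ((S0 - S1) / (J - 1)) / (S1 / (J * K - 2 * J))
p = stats.f.sf(F, J - 1, J * K - 2 * J)
print(F, p)   # F is about 0.19 on (1, 20) d.f., so H0 is retained
```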

2.3 Some principles of statistical modelling

2.3.1 Exploratory data analysis

Any analysis of data should begin with a consideration of each variable separately, both to check on data quality (for example, are the values plausible?) and to help with model formulation.

1. What is the scale of measurement? Is it continuous or categorical? If it is categorical, how many categories does it have and are they nominal or ordinal?

2. What is the shape of the distribution? This can be examined using frequency tables, dot plots, histograms and other graphical methods.

3. How is it associated with other variables? Cross tabulations for categorical variables, scatter plots for continuous variables, side-by-side box plots for continuous scale measurements grouped according to the factor levels of a categorical variable, and other such summaries can help to identify patterns of association. For example, do the points on a scatter plot suggest linear or non-linear relationships? Do the group means increase or decrease consistently with an ordinal variable defining the groups? (A minimal sketch of such checks follows this list.)
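The checks in points 1-3 are straightforward to script. The following sketch is an illustration added here; pandas and matplotlib are assumed tools and the function and column names are hypothetical.

```python
# A minimal EDA sketch, assuming a pandas DataFrame `df` with a continuous
# response, a continuous covariate and a grouping factor (hypothetical names).
import pandas as pd
import matplotlib.pyplot as plt

def explore(df: pd.DataFrame, response: str, covariate: str, factor: str):
    print(df[response].describe())             # scale of measurement and summary statistics
    df[response].plot.hist(title="Shape of distribution")   # point 2
    df.plot.scatter(x=covariate, y=response)   # association of continuous variables
    df.boxplot(column=response, by=factor)     # side-by-side box plots by factor level
    plt.show()
```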

2.3.2 Model formulation

The models described in this book involve a single response variable Y and usually several explanatory variables. Knowledge of the context in which the data were obtained, including the substantive questions of interest, theoretical relationships among the variables, the study design and results of the exploratory data analysis can all be used to help formulate a model. The model has two components:

1. Probability distribution of $Y$, for example, $Y \sim N(\mu, \sigma^2)$.

2. Equation linking the expected value of $Y$ with a linear combination of the explanatory variables, for example, $\mathrm{E}(Y) = \alpha + \beta x$ or $\ln[\mathrm{E}(Y)] = \beta_0 + \beta_1 \sin(\alpha x)$.

For generalized linear models the probability distributions all belong to the exponential family of distributions, which includes the Normal, binomial, Poisson and many other distributions. This family of distributions is discussed in Chapter 3. The equation in the second part of the model has the general form

$$g[\mathrm{E}(Y)] = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m$$

where the part $\beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m$ is called the linear component. Notation for the linear component is discussed in Section 2.4.

2.3.3 Parameter estimation

The most commonly used estimation methods are maximum likelihood and least squares. These are described in Section 1.6. In this book numerical and graphical methods are used, where appropriate, to complement calculus and algebraic methods of optimization.

2.3.4 Residuals and model checking

Firstly, consider residuals for a model involving the Normal distribution. Suppose that the response variable $Y_i$ is modelled by
$$\mathrm{E}(Y_i) = \mu_i; \quad Y_i \sim N(\mu_i, \sigma^2).$$
The fitted values are the estimates $\hat{\mu}_i$. Residuals can be defined as $y_i - \hat{\mu}_i$ and the approximate standardized residuals as
$$r_i = (y_i - \hat{\mu}_i)/\hat{\sigma},$$
where $\hat{\sigma}$ is an estimate of the unknown parameter $\sigma$. These standardized residuals are slightly correlated because they all depend on the estimates $\hat{\mu}_i$ and $\hat{\sigma}$ that were calculated from the observations. Also they are not exactly Normally distributed because $\sigma$ has been estimated by $\hat{\sigma}$. Nevertheless, they are approximately Normally distributed and the adequacy of the approximation can be checked using appropriate graphical methods (see below).

The parameters $\mu_i$ are functions of the explanatory variables. If the model is a good description of the relationship between the response and the explanatory variables, this should be well 'captured' or 'explained' by the $\hat{\mu}_i$'s. Therefore there should be little remaining information in the residuals $y_i - \hat{\mu}_i$. This too can be checked graphically (see below). Additionally, the sum of squared residuals $\sum (y_i - \hat{\mu}_i)^2$ provides an overall statistic for assessing the adequacy of the model; in fact, it is the component of the log-likelihood function or least squares expression which is optimized in the estimation process.

Secondly, consider residuals from a Poisson model. Recall the model for chronic medical conditions

$$\mathrm{E}(Y_i) = \theta_i; \quad Y_i \sim Poisson(\theta_i).$$
In this case approximate standardized residuals are of the form
$$r_i = \frac{y_i - \hat{\theta}_i}{\sqrt{\hat{\theta}_i}}.$$
These can be regarded as signed square roots of contributions to the Pearson goodness-of-fit statistic
$$\sum_i \frac{(o_i - e_i)^2}{e_i},$$
where $o_i$ is the observed value $y_i$ and $e_i$ is the fitted value $\hat{\theta}_i$ 'expected' from the model.

For other distributions a variety of definitions of standardized residuals are used. Some of these are transformations of the terms $(y_i - \hat{\mu}_i)$ designed to improve their Normality or independence (for example, see Chapter 9 of Neter et al., 1996). Others are based on signed square roots of contributions to statistics, such as the log-likelihood function or the sum of squares, which are used as overall measures of the adequacy of the model (for example, see

Cox and Snell, 1968; Pregibon, 1981; and Pierce and Schafer, 1986). Many of these residuals are discussed in more detail in McCullagh and Nelder (1989) or Krzanowski (1998).

Residuals are important tools for checking the assumptions made in formulating a model. This is because they should usually be independent and have a distribution which is approximately Normal with a mean of zero and constant variance. They should also be unrelated to the explanatory variables. Therefore, the standardized residuals can be compared with the Normal distribution to assess the adequacy of the distributional assumptions and to identify any unusual values. This can be done by inspecting their frequency distribution and looking for values beyond the likely range; for example, no more than 5% should be less than −1.96 or greater than +1.96 and no more than 1% should be beyond ±2.58.

A more sensitive method for assessing Normality, however, is to use a Normal probability plot. This involves plotting the residuals against their expected values, defined according to their rank order, if they were Normally distributed. These values are called the Normal order statistics and they depend on the number of observations. Normal probability plots are available in all good statistical software (and analogous probability plots for other distributions are also commonly available). In the plot the points should lie on or near a straight line representing Normality, and systematic deviations or outlying observations indicate a departure from this distribution.

The standardized residuals should also be plotted against each of the explanatory variables that are included in the model. If the model adequately describes the effect of the variable, there should be no apparent pattern in the plot. If it is inadequate, the points may display curvature or some other systematic pattern which would suggest that additional or alternative terms may need to be included in the model. The residuals should also be plotted against other potential explanatory variables that are not in the model. If there is any systematic pattern, this suggests that additional variables should be included. Several different residual plots for detecting non-linearity in generalized linear models have been compared by Cai and Tsai (1999).

In addition, the standardized residuals should be plotted against the fitted values $\hat{y}_i$, especially to detect changes in variance. For example, an increase in the spread of the residuals towards the end of the range of fitted values would indicate a departure from the assumption of constant variance (sometimes termed homoscedasticity).

Finally, a sequence plot of the residuals should be made using the order in which the values $y_i$ were measured. This might be in time order, spatial order or any other sequential effect that might cause lack of independence among the observations. If the residuals are independent, the points should fluctuate randomly without any systematic pattern, such as alternating up and down or steadily increasing or decreasing. If there is evidence of associations among the residuals, this can be checked by calculating serial correlation coefficients among them. If the residuals are correlated, special modelling methods are needed – these are outlined in Chapter 11.
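These checks are straightforward to automate. The sketch below is an added illustration; numpy, scipy and matplotlib are assumed tools and the variable names are hypothetical.

```python
# A sketch of the residual checks described above for a Normal-theory model,
# assuming fitted values `mu_hat` and an estimate `sigma_hat` are available.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

def standardized_residuals(y, mu_hat, sigma_hat):
    """Approximate standardized residuals r_i = (y_i - mu_hat_i)/sigma_hat."""
    return (np.asarray(y) - np.asarray(mu_hat)) / sigma_hat

def residual_plots(y, mu_hat, x, sigma_hat):
    r = standardized_residuals(y, mu_hat, sigma_hat)
    fig, axes = plt.subplots(1, 3, figsize=(12, 4))
    axes[0].scatter(mu_hat, r)           # residuals against fitted values
    axes[0].set(xlabel="Fitted values", ylabel="Residuals")
    axes[1].scatter(x, r)                # residuals against an explanatory variable
    axes[1].set(xlabel="Explanatory variable", ylabel="Residuals")
    stats.probplot(r, dist="norm", plot=axes[2])   # Normal probability plot
    plt.show()
```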

2.3.5 Inference and interpretation

It is sometimes useful to think of scientific data as measurements composed of a message, or signal, that is distorted by noise. For instance, in the example about birthweight the 'signal' is the usual growth rate of babies and the 'noise' comes from all the genetic and environmental factors that lead to individual variation. A goal of statistical modelling is to extract as much information as possible about the signal. In practice, this has to be balanced against other criteria such as simplicity. The Oxford Dictionary describes the law of parsimony (otherwise known as Occam's Razor) as the principle that no more causes should be assumed than will account for the effect. Accordingly a simpler or more parsimonious model that describes the data adequately is preferable to a more complicated one which leaves little of the variability 'unexplained'. To determine a parsimonious model consistent with the data, we test hypotheses about the parameters.

Hypothesis testing is performed in the context of model fitting by defining a series of nested models corresponding to different hypotheses. Then the question about whether the data support a particular hypothesis can be formulated in terms of the adequacy of fit of the corresponding model relative to other more complicated models. This logic is illustrated in the examples earlier in this chapter. Chapter 5 provides a more detailed explanation of the concepts and methods used, including the sampling distributions for the statistics used to describe 'goodness of fit'.

While hypothesis testing is useful for identifying a good model, it is much less useful for interpreting it. Wherever possible, the parameters in a model should have some natural interpretation; for example, the rate of growth of babies, the relative risk of acquiring a disease or the mean difference in profit from two marketing strategies. The estimated magnitude of the parameter and the reliability of the estimate as indicated by its standard error or a confidence interval are far more informative than significance levels or p-values. They make it possible to answer questions such as: is the effect estimated with sufficient precision to be useful, or is the effect large enough to be of practical, social or biological significance?

2.3.6 Further reading

An excellent discussion of the principles of statistical modelling is in the introductory part of Cox and Snell (1981). The importance of adopting a systematic approach is stressed by Kleinbaum et al. (1998). The various steps of model choice, criticism and validation are outlined by Krzanowski (1998). The use of residuals is described in Neter et al. (1996), Draper and Smith (1998), Belsley et al. (1980) and Cook and Weisberg (1999).

2.4 Notation and coding for explanatory variables

For the models in this book the equation linking each response variable $Y$ and a set of explanatory variables $x_1, x_2, \ldots, x_m$ has the form
$$g[\mathrm{E}(Y)] = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m.$$
For responses $Y_1, \ldots, Y_N$, this can be written in matrix notation as
$$g[\mathrm{E}(\mathbf{y})] = X\boldsymbol{\beta} \quad (2.13)$$
where
$$\mathbf{y} = \begin{bmatrix} Y_1 \\ \vdots \\ Y_N \end{bmatrix} \text{ is a vector of responses,} \qquad g[\mathrm{E}(\mathbf{y})] = \begin{bmatrix} g[\mathrm{E}(Y_1)] \\ \vdots \\ g[\mathrm{E}(Y_N)] \end{bmatrix}$$
denotes a vector of functions of the terms $\mathrm{E}(Y_i)$ (with the same $g$ for every element),
$$\boldsymbol{\beta} = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_p \end{bmatrix} \text{ is a vector of parameters,}$$
and $X$ is a matrix whose elements are constants representing levels of categorical explanatory variables or measured values of continuous explanatory variables.

For a continuous explanatory variable $x$ (such as gestational age in the example on birthweight) the model contains a term $\beta x$ where the parameter $\beta$ represents the change in the response corresponding to a change of one unit in $x$.

For categorical explanatory variables there are parameters for the different levels of a factor. The corresponding elements of $X$ are chosen to exclude or include the appropriate parameters for each observation; they are called dummy variables. If they are only zeros and ones, the term indicator variable is used.

If there are $p$ parameters in the model and $N$ observations, then $\mathbf{y}$ is an $N \times 1$ random vector, $\boldsymbol{\beta}$ is a $p \times 1$ vector of parameters and $X$ is an $N \times p$ matrix of known constants. $X$ is often called the design matrix and $X\boldsymbol{\beta}$ is the linear component of the model. Various ways of defining the elements of $X$ are illustrated in the following examples.

2.4.1 Example: Means for two groups

For the data on chronic medical conditions the equation in the model
$$\mathrm{E}(Y_{jk}) = \theta_j; \quad Y_{jk} \sim Poisson(\theta_j), \quad j = 1, 2$$
can be written in the form of (2.13) with $g$ as the identity function (i.e., $g(\theta_j) = \theta_j$),
$$\mathbf{y} = \begin{bmatrix} Y_{1,1} \\ Y_{1,2} \\ \vdots \\ Y_{1,26} \\ Y_{2,1} \\ \vdots \\ Y_{2,23} \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} \quad \text{and} \quad X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ \vdots & \vdots \\ 1 & 0 \\ 0 & 1 \\ \vdots & \vdots \\ 0 & 1 \end{bmatrix}.$$

The top part of $X$ picks out the terms $\theta_1$ corresponding to $\mathrm{E}(Y_{1k})$ and the bottom part picks out $\theta_2$ for $\mathrm{E}(Y_{2k})$. With this model the group means $\theta_1$ and $\theta_2$ can be estimated and compared.

2.4.2 Example: Simple linear regression for two groups

The more general model for the data on birthweight and gestational age is
$$\mathrm{E}(Y_{jk}) = \mu_{jk} = \alpha_j + \beta_j x_{jk}; \quad Y_{jk} \sim N(\mu_{jk}, \sigma^2).$$
This can be written in the form of (2.13) if $g$ is the identity function,
$$\mathbf{y} = \begin{bmatrix} Y_{11} \\ Y_{12} \\ \vdots \\ Y_{1K} \\ Y_{21} \\ \vdots \\ Y_{2K} \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \beta_1 \\ \beta_2 \end{bmatrix} \quad \text{and} \quad X = \begin{bmatrix} 1 & 0 & x_{11} & 0 \\ 1 & 0 & x_{12} & 0 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & x_{1K} & 0 \\ 0 & 1 & 0 & x_{21} \\ \vdots & \vdots & \vdots & \vdots \\ 0 & 1 & 0 & x_{2K} \end{bmatrix}.$$

2.4.3 Example: Alternative formulations for comparing the means of two groups

There are several alternative ways of formulating the linear components for comparing means of two groups: $Y_{11}, \ldots, Y_{1K_1}$ and $Y_{21}, \ldots, Y_{2K_2}$.

(a) $\mathrm{E}(Y_{1k}) = \beta_1$ and $\mathrm{E}(Y_{2k}) = \beta_2$.

This is the version used in Example 2.4.1 above. In this case $\boldsymbol{\beta} = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}$

and the rows of $X$ are as follows:
Group 1: $[1 \quad 0]$
Group 2: $[0 \quad 1]$.

(b) $\mathrm{E}(Y_{1k}) = \mu + \alpha_1$ and $\mathrm{E}(Y_{2k}) = \mu + \alpha_2$.

In this version $\mu$ represents the overall mean and $\alpha_1$ and $\alpha_2$ are the group differences from $\mu$. In this case $\boldsymbol{\beta} = \begin{bmatrix} \mu \\ \alpha_1 \\ \alpha_2 \end{bmatrix}$ and the rows of $X$ are
Group 1: $[1 \quad 1 \quad 0]$
Group 2: $[1 \quad 0 \quad 1]$.

This formulation, however, has too many parameters as only two parameters can be estimated from the two sets of observations. Therefore some modification or constraint is needed.

(c) $\mathrm{E}(Y_{1k}) = \mu$ and $\mathrm{E}(Y_{2k}) = \mu + \alpha$.

Here Group 1 is treated as the reference group and $\alpha$ represents the additional effect of Group 2. For this version $\boldsymbol{\beta} = \begin{bmatrix} \mu \\ \alpha \end{bmatrix}$ and the rows of $X$ are
Group 1: $[1 \quad 0]$
Group 2: $[1 \quad 1]$.

This is an example of corner point parameterization in which group effects are defined as differences from a reference category called the 'corner point'.

(d) $\mathrm{E}(Y_{1k}) = \mu + \alpha$ and $\mathrm{E}(Y_{2k}) = \mu - \alpha$.

This version treats the two groups symmetrically; $\mu$ is the overall average effect and $\alpha$ represents the group differences. This is an example of a sum-to-zero constraint because

$$[\mathrm{E}(Y_{1k}) - \mu] + [\mathrm{E}(Y_{2k}) - \mu] = \alpha + (-\alpha) = 0.$$
In this case $\boldsymbol{\beta} = \begin{bmatrix} \mu \\ \alpha \end{bmatrix}$ and the rows of $X$ are
Group 1: $[1 \quad 1]$
Group 2: $[1 \quad -1]$.
A numerical illustration of these alternative formulations is sketched below.
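The following sketch (an addition, with made-up toy data) confirms numerically that versions (a), (c) and (d) are reparameterizations of the same model, since each design matrix yields the same fitted group means; version (b) is omitted because its $X$ is not of full rank.

```python
import numpy as np

y = np.array([3.0, 5.0, 4.0, 6.0, 8.0, 7.0])   # toy data: three values per group
Xa = np.array([[1, 0]] * 3 + [[0, 1]] * 3)     # (a) separate group means
Xc = np.array([[1, 0]] * 3 + [[1, 1]] * 3)     # (c) corner point
Xd = np.array([[1, 1]] * 3 + [[1, -1]] * 3)    # (d) sum-to-zero

for name, X in [("a", Xa), ("c", Xc), ("d", Xd)]:
    b, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares fit
    print(name, "beta =", b, "fitted means =", np.unique(X @ b))
    # all three versions give the same fitted group means (4 and 7)
```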

2.4.4 Example: Ordinal explanatory variables

Let $Y_{jk}$ denote a continuous measurement of quality of life. Data are collected for three groups of patients with mild, moderate or severe disease. The groups can be described by levels of an ordinal variable. This can be specified by

defining the model using
$$\mathrm{E}(Y_{1k}) = \mu$$
$$\mathrm{E}(Y_{2k}) = \mu + \alpha_1$$
$$\mathrm{E}(Y_{3k}) = \mu + \alpha_1 + \alpha_2$$
and hence $\boldsymbol{\beta} = \begin{bmatrix} \mu \\ \alpha_1 \\ \alpha_2 \end{bmatrix}$ and the rows of $X$ are
Group 1: $[1 \quad 0 \quad 0]$
Group 2: $[1 \quad 1 \quad 0]$
Group 3: $[1 \quad 1 \quad 1]$.

Thus $\alpha_1$ represents the effect of Group 2 relative to Group 1 and $\alpha_2$ represents the effect of Group 3 relative to Group 2.

2.5 Exercises

2.1 Genetically similar seeds are randomly assigned to be raised in either a nutritionally enriched environment (treatment group) or standard conditions (control group) using a completely randomized experimental design. After a predetermined time all plants are harvested, dried and weighed. The results, expressed in grams, for 20 plants in each group are shown in Table 2.7.

Table 2.7 Dried weight of plants grown under two conditions.

Treatment group        Control group
4.81    5.36           4.17    4.66
4.17    3.48           3.05    5.58
4.41    4.69           5.18    3.66
3.59    4.44           4.01    4.50
5.87    4.89           6.11    3.90
3.83    4.71           4.10    4.61
6.03    5.48           5.17    5.62
4.98    4.32           3.57    4.53
4.90    5.15           5.33    6.05
5.75    6.34           5.59    5.14

We want to test whether there is any difference in yield between the two groups. Let $Y_{jk}$ denote the $k$th observation in the $j$th group where $j = 1$ for the treatment group, $j = 2$ for the control group and $k = 1, \ldots, 20$ for both groups. Assume that the $Y_{jk}$'s are independent random variables with $Y_{jk} \sim N(\mu_j, \sigma^2)$. The null hypothesis $H_0: \mu_1 = \mu_2 = \mu$, that there is no difference, is to be compared to the alternative hypothesis $H_1: \mu_1 \neq \mu_2$.

(a) Conduct an exploratory analysis of the data looking at the distributions for each group (e.g., using dot plots, stem and leaf plots or Normal probability plots) and calculating summary statistics (e.g., means, medians, standard deviations, maxima and minima). What can you infer from these investigations?

(b) Perform an unpaired t-test on these data and calculate a 95% confidence interval for the difference between the group means. Interpret these results.

(c) The following models can be used to test the null hypothesis $H_0$ against the alternative hypothesis $H_1$, where
$$H_0: \mathrm{E}(Y_{jk}) = \mu; \quad Y_{jk} \sim N(\mu, \sigma^2),$$
$$H_1: \mathrm{E}(Y_{jk}) = \mu_j; \quad Y_{jk} \sim N(\mu_j, \sigma^2),$$
for $j = 1, 2$ and $k = 1, \ldots, 20$. Find the maximum likelihood and least squares estimates of the parameters $\mu$, $\mu_1$ and $\mu_2$, assuming $\sigma^2$ is a known constant.

(d) Show that the minimum values of the least squares criteria are:
for $H_0$, $S_0 = \sum_{j=1}^{2}\sum_{k=1}^{20}(Y_{jk} - \overline{Y})^2$ where $\overline{Y} = \sum_{j=1}^{2}\sum_{k=1}^{20} Y_{jk}/40$;
for $H_1$, $S_1 = \sum_{j=1}^{2}\sum_{k=1}^{20}(Y_{jk} - \overline{Y}_j)^2$ where $\overline{Y}_j = \sum_{k=1}^{20} Y_{jk}/20$ for $j = 1, 2$.

(e) Using the results of Exercise 1.4 show that

$$\frac{1}{\sigma^2}S_1 = \frac{1}{\sigma^2}\sum_{j=1}^{2}\sum_{k=1}^{20}(Y_{jk} - \mu_j)^2 - \frac{20}{\sigma^2}\sum_{j=1}^{2}(\overline{Y}_j - \mu_j)^2$$

and deduce that if $H_1$ is true
$$\frac{1}{\sigma^2}S_1 \sim \chi^2(38).$$
Similarly show that
$$\frac{1}{\sigma^2}S_0 = \frac{1}{\sigma^2}\sum_{j=1}^{2}\sum_{k=1}^{20}(Y_{jk} - \mu)^2 - \frac{40}{\sigma^2}(\overline{Y} - \mu)^2$$

and if $H_0$ is true then
$$\frac{1}{\sigma^2}S_0 \sim \chi^2(39).$$
(f) Use an argument similar to the one in Example 2.2.2 and the results from (e) to deduce that the statistic
$$F = \frac{S_0 - S_1}{S_1/38}$$

has the central $F$-distribution $F(1, 38)$ if $H_0$ is true and a non-central distribution if $H_0$ is not true.

(g) Calculate the $F$-statistic from (f) and use it to test $H_0$ against $H_1$. What do you conclude?

(h) Compare the value of the $F$-statistic from (g) with the $t$-statistic from (b), recalling the relationship between the $t$-distribution and the $F$-distribution (see Section 1.4.4). Also compare the conclusions from (b) and (g).

(i) Calculate residuals from the model for $H_0$ and use them to explore the distributional assumptions. (A sketch of the computations for parts (b) and (g) follows.)
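For checking answers, here is a minimal sketch of the computations in parts (b) and (g), assuming scipy is available and taking the group data from Table 2.7.

```python
import numpy as np
from scipy import stats

treatment = np.array([4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.98, 4.90, 5.75,
                      5.36, 3.48, 4.69, 4.44, 4.89, 4.71, 5.48, 4.32, 5.15, 6.34])
control = np.array([4.17, 3.05, 5.18, 4.01, 6.11, 4.10, 5.17, 3.57, 5.33, 5.59,
                    4.66, 5.58, 3.66, 4.50, 3.90, 4.61, 5.62, 4.53, 6.05, 5.14])

# (b) unpaired t-test assuming equal variances
t, p = stats.ttest_ind(treatment, control, equal_var=True)

# (g) F statistic from the model comparison in parts (d)-(f)
y = np.concatenate([treatment, control])
S0 = np.sum((y - y.mean())**2)
S1 = (np.sum((treatment - treatment.mean())**2)
      + np.sum((control - control.mean())**2))
F = (S0 - S1) / (S1 / 38)
print(t**2, F)   # these agree, since t(38) squared is F(1, 38)
```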

2.2 The weights, in kilograms, of twenty men before and after participation in a 'waist loss' program are shown in Table 2.8 (Egger et al., 1999). We want to know if, on average, they retain a weight loss twelve months after the program.

Table 2.8 Weights of twenty men before and after participation in a 'waist loss' program.

Man   Before   After      Man   Before   After
1     100.8    97.0       11    105.0    105.0
2     102.0    107.5      12    85.0     82.4
3     105.9    97.0       13    107.2    98.2
4     108.0    108.0      14    80.0     83.6
5     92.0     84.0       15    115.1    115.0
6     116.7    111.5      16    103.5    103.0
7     110.2    102.5      17    82.0     80.0
8     135.0    127.5      18    101.5    101.5
9     123.5    118.5      19    103.5    102.6
10    95.0     94.2       20    93.0     93.0

Let $Y_{jk}$ denote the weight of the $k$th man at the $j$th time where $j = 1$ before the program and $j = 2$ twelve months later. Assume the $Y_{jk}$'s are independent random variables with $Y_{jk} \sim N(\mu_j, \sigma^2)$ for $j = 1, 2$ and $k = 1, \ldots, 20$.

(a) Use an unpaired t-test to test the hypothesis

$$H_0: \mu_1 = \mu_2 \quad \text{versus} \quad H_1: \mu_1 \neq \mu_2.$$

(b) Let $D_k = Y_{1k} - Y_{2k}$, for $k = 1, \ldots, 20$. Formulate models for testing $H_0$ against $H_1$ using the $D_k$'s. Using analogous methods to Exercise 2.1 above, assuming $\sigma^2$ is a known constant, test $H_0$ against $H_1$.

(c) The analysis in (b) is a paired t-test which uses the natural relationship between weights of the same person before and after the program. Are the conclusions the same from (a) and (b)?

(d) List the assumptions made for (a) and (b). Which analysis is more appropriate for these data?

2.3 For model (2.7) for the data on birthweight and gestational age, using methods similar to those for Exercise 1.4, show that
$$S_1 = \sum_{j=1}^{J}\sum_{k=1}^{K}(Y_{jk} - a_j - b_j x_{jk})^2 = \sum_{j=1}^{J}\sum_{k=1}^{K}\left[Y_{jk} - (\alpha_j + \beta_j x_{jk})\right]^2 - K\sum_{j=1}^{J}(\overline{Y}_j - \alpha_j - \beta_j \overline{x}_j)^2 - \sum_{j=1}^{J}(b_j - \beta_j)^2\left(\sum_{k=1}^{K}x_{jk}^2 - K\overline{x}_j^2\right)$$

and that the random variables $Y_{jk}$, $\overline{Y}_j$ and $b_j$ are all independent and have the following distributions:
$$Y_{jk} \sim N(\alpha_j + \beta_j x_{jk}, \sigma^2), \quad \overline{Y}_j \sim N(\alpha_j + \beta_j \overline{x}_j, \sigma^2/K), \quad b_j \sim N\left(\beta_j,\; \sigma^2\Big/\left(\sum_{k=1}^{K}x_{jk}^2 - K\overline{x}_j^2\right)\right).$$

2.4 Suppose you have the following data

x:  1.0   1.2   1.4   1.6   1.8   2.0
y:  3.15  4.85  6.50  7.20  8.25  16.50

and you want to fit a model with
$$\mathrm{E}(Y) = \ln(\beta_0 + \beta_1 x + \beta_2 x^2).$$
Write this model in the form of (2.13), specifying the vectors $\mathbf{y}$ and $\boldsymbol{\beta}$ and the matrix $X$.

2.5 The model for two-factor analysis of variance with two levels of one factor, three levels of the other and no interaction is
$$\mathrm{E}(Y_{jk}) = \mu_{jk} = \mu + \alpha_j + \beta_k; \quad Y_{jk} \sim N(\mu_{jk}, \sigma^2)$$

where $j = 1, 2$; $k = 1, 2, 3$ and, using the sum-to-zero constraints, $\alpha_1 + \alpha_2 = 0$ and $\beta_1 + \beta_2 + \beta_3 = 0$. Also the $Y_{jk}$'s are assumed to be independent. Write the equation for $\mathrm{E}(Y_{jk})$ in matrix notation. (Hint: let $\alpha_2 = -\alpha_1$ and $\beta_3 = -\beta_1 - \beta_2$.)

3 Exponential Family and Generalized Linear Models

3.1 Introduction

Linear models of the form
$$\mathrm{E}(Y_i) = \mu_i = \mathbf{x}_i^T\boldsymbol{\beta}; \quad Y_i \sim N(\mu_i, \sigma^2) \quad (3.1)$$

where the random variables $Y_i$ are independent are the basis of most analyses of continuous data. The transposed vector $\mathbf{x}_i^T$ represents the $i$th row of the design matrix $X$. The example about the relationship between birthweight and gestational age is of this form, see Section 2.2.2. So is the exercise on plant growth where $Y_i$ is the dry weight of plants and $X$ has elements to identify the treatment and control groups (Exercise 2.1). Generalizations of these examples to the relationship between a continuous response and several explanatory variables (multiple regression) and comparisons of more than two means (analysis of variance) are also of this form.

Advances in statistical theory and computer software allow us to use methods analogous to those developed for linear models in the following more general situations:

1. Response variables have distributions other than the Normal distribution – they may even be categorical rather than continuous.

2. The relationship between the response and explanatory variables need not be of the simple linear form in (3.1).

One of these advances has been the recognition that many of the 'nice' properties of the Normal distribution are shared by a wider class of distributions called the exponential family of distributions. These distributions and their properties are discussed in the next section.

A second advance is the extension of the numerical methods to estimate the parameters $\boldsymbol{\beta}$ from the linear model described in (3.1) to the situation where there is some non-linear function relating $\mathrm{E}(Y_i) = \mu_i$ to the linear component $\mathbf{x}_i^T\boldsymbol{\beta}$, that is
$$g(\mu_i) = \mathbf{x}_i^T\boldsymbol{\beta}$$
(see Section 2.4). The function $g$ is called the link function. In the initial formulation of generalized linear models by Nelder and Wedderburn (1972), and in most of the examples considered in this book, $g$ is a simple mathematical function. These models have now been further generalized to situations where functions may be estimated numerically; such models are called generalized additive models (see Hastie and Tibshirani, 1990). In theory, the estimation is straightforward. In practice, it may require a considerable amount of

computation involving numerical optimization of non-linear functions. Procedures to do these calculations are now included in many statistical programs.

This chapter introduces the exponential family of distributions and defines generalized linear models. Methods for parameter estimation and hypothesis testing are developed in Chapters 4 and 5, respectively.

3.2 Exponential family of distributions

Consider a single random variable $Y$ whose probability distribution depends on a single parameter $\theta$. The distribution belongs to the exponential family if it can be written in the form
$$f(y; \theta) = s(y)\,t(\theta)\,e^{a(y)b(\theta)} \quad (3.2)$$
where $a$, $b$, $s$ and $t$ are known functions. Notice the symmetry between $y$ and $\theta$. This is emphasized if equation (3.2) is rewritten as
$$f(y; \theta) = \exp[a(y)b(\theta) + c(\theta) + d(y)] \quad (3.3)$$
where $s(y) = \exp d(y)$ and $t(\theta) = \exp c(\theta)$.

If $a(y) = y$, the distribution is said to be in canonical (that is, standard) form and $b(\theta)$ is sometimes called the natural parameter of the distribution.

If there are other parameters, in addition to the parameter of interest $\theta$, they are regarded as nuisance parameters forming parts of the functions $a$, $b$, $c$ and $d$, and they are treated as though they are known.

Many well-known distributions belong to the exponential family. For example, the Poisson, Normal and binomial distributions can all be written in the canonical form – see Table 3.1.

3.2.1 Poisson distribution

The probability function for the discrete random variable $Y$ is
$$f(y; \theta) = \frac{\theta^y e^{-\theta}}{y!}$$

Table 3.1 Poisson, Normal and binomial distributions as members of the exponential family.

Distribution   Natural parameter      c                                        d
Poisson        log θ                  −θ                                       −log y!
Normal         µ/σ²                   −µ²/(2σ²) − ½ log(2πσ²)                  −y²/(2σ²)
Binomial       log[π/(1 − π)]         n log(1 − π)                             log (n choose y)

where $y$ takes the values $0, 1, 2, \ldots$. This can be rewritten as
$$f(y; \theta) = \exp(y\log\theta - \theta - \log y!)$$
which is in the canonical form because $a(y) = y$. Also the natural parameter is $\log\theta$.

The Poisson distribution, denoted by $Y \sim Poisson(\theta)$, is used to model count data. Typically these are the number of occurrences of some event in a defined time period or space, when the probability of an event occurring in a very small time (or space) is low and the events occur independently. Examples include: the number of medical conditions reported by a person (Example 2.2.1), the number of tropical cyclones during a season (Example 1.6.4), the number of spelling mistakes on the page of a newspaper, or the number of faulty components in a computer or in a batch of manufactured items. If a random variable has the Poisson distribution, its expected value and variance are equal. Real data that might be plausibly modelled by the Poisson distribution often have a larger variance and are said to be overdispersed, and the model may have to be adapted to reflect this feature. Chapter 9 describes various models based on the Poisson distribution.

3.2.2 Normal distribution

The probability density function is
$$f(y; \mu) = \frac{1}{(2\pi\sigma^2)^{1/2}}\exp\left[-\frac{1}{2\sigma^2}(y-\mu)^2\right]$$
where $\mu$ is the parameter of interest and $\sigma^2$ is regarded as a nuisance parameter. This can be rewritten as
$$f(y; \mu) = \exp\left[-\frac{y^2}{2\sigma^2} + \frac{y\mu}{\sigma^2} - \frac{\mu^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2)\right].$$
This is in the canonical form. The natural parameter is $b(\mu) = \mu/\sigma^2$ and the other terms in (3.3) are
$$c(\mu) = -\frac{\mu^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2) \quad \text{and} \quad d(y) = -\frac{y^2}{2\sigma^2}$$
(alternatively, the term $-\frac{1}{2}\log(2\pi\sigma^2)$ could be included in $d(y)$).

The Normal distribution is used to model continuous data that have a symmetric distribution. It is widely used for three main reasons. First, many naturally occurring phenomena are well described by the Normal distribution; for example, height or blood pressure of people. Second, even if data are not Normally distributed (e.g., if their distribution is skewed) the average or total of a random sample of values will be approximately Normally distributed; this result is proved in the Central Limit Theorem. Third, there is a great deal of statistical theory developed for the Normal distribution, including sampling distributions derived from it and approximations to other distributions. For these reasons, if continuous data $y$ are not Normally distributed it is often

worthwhile trying to identify a transformation, such as $y' = \log y$ or $y' = \sqrt{y}$, which produces data $y'$ that are approximately Normal.

3.2.3 Binomial distribution

Consider a series of binary events, called 'trials', each with only two possible outcomes: 'success' or 'failure'. Let the random variable $Y$ be the number of 'successes' in $n$ independent trials in which the probability of success, $\pi$, is the same in all trials. Then $Y$ has the binomial distribution with probability density function

$$f(y; \pi) = \binom{n}{y}\pi^y(1-\pi)^{n-y}$$
where $y$ takes the values $0, 1, 2, \ldots, n$. This is denoted by $Y \sim binomial(n, \pi)$. Here $\pi$ is the parameter of interest and $n$ is assumed to be known. The probability function can be rewritten as
$$f(y; \pi) = \exp\left[y\log\pi - y\log(1-\pi) + n\log(1-\pi) + \log\binom{n}{y}\right]$$
which is of the form (3.3) with $b(\pi) = \log\pi - \log(1-\pi) = \log[\pi/(1-\pi)]$.

The binomial distribution is usually the model of first choice for observations of a process with binary outcomes. Examples include: the number of candidates who pass a test (the possible outcomes for each candidate being to pass or to fail), or the number of patients with some disease who are alive at a specified time since diagnosis (the possible outcomes being survival or death).

Other examples of distributions belonging to the exponential family are given in the exercises at the end of the chapter; not all of them are of the canonical form.

3.3 Properties of distributions in the exponential family

We need expressions for the expected value and variance of $a(Y)$. To find these we use the following results that apply for any probability density function provided that the order of integration and differentiation can be interchanged. From the definition of a probability density function, the area under the curve is unity so
$$\int f(y; \theta)\,dy = 1 \quad (3.4)$$

where integration is over all possible values of $y$. (If the random variable $Y$ is discrete then integration is replaced by summation.)

If we differentiate both sides of (3.4) with respect to $\theta$ we obtain
$$\frac{d}{d\theta}\int f(y; \theta)\,dy = \frac{d}{d\theta}\,1 = 0. \quad (3.5)$$
If the order of integration and differentiation in the first term is reversed

then (3.5) becomes
$$\int \frac{df(y; \theta)}{d\theta}\,dy = 0. \quad (3.6)$$
Similarly if (3.4) is differentiated twice with respect to $\theta$ and the order of integration and differentiation is reversed we obtain
$$\int \frac{d^2 f(y; \theta)}{d\theta^2}\,dy = 0. \quad (3.7)$$
These results can now be used for distributions in the exponential family. From (3.3)
$$f(y; \theta) = \exp[a(y)b(\theta) + c(\theta) + d(y)]$$
so
$$\frac{df(y; \theta)}{d\theta} = [a(y)b'(\theta) + c'(\theta)]\,f(y; \theta).$$
By (3.6)
$$\int [a(y)b'(\theta) + c'(\theta)]\,f(y; \theta)\,dy = 0.$$

This can be simplified to
$$b'(\theta)\,\mathrm{E}[a(Y)] + c'(\theta) = 0 \quad (3.8)$$
because $\int a(y)f(y; \theta)\,dy = \mathrm{E}[a(Y)]$ by the definition of the expected value and $\int c'(\theta)f(y; \theta)\,dy = c'(\theta)$ by (3.4). Rearranging (3.8) gives
$$\mathrm{E}[a(Y)] = -c'(\theta)/b'(\theta). \quad (3.9)$$
A similar argument can be used to obtain $\mathrm{var}[a(Y)]$:
$$\frac{d^2 f(y; \theta)}{d\theta^2} = [a(y)b''(\theta) + c''(\theta)]\,f(y; \theta) + [a(y)b'(\theta) + c'(\theta)]^2\,f(y; \theta). \quad (3.10)$$
The second term on the right hand side of (3.10) can be rewritten as
$$[b'(\theta)]^2\{a(y) - \mathrm{E}[a(Y)]\}^2 f(y; \theta)$$
using (3.9). Then by (3.7)
$$\int \frac{d^2 f(y; \theta)}{d\theta^2}\,dy = b''(\theta)\,\mathrm{E}[a(Y)] + c''(\theta) + [b'(\theta)]^2\,\mathrm{var}[a(Y)] = 0 \quad (3.11)$$
because $\int \{a(y) - \mathrm{E}[a(Y)]\}^2 f(y; \theta)\,dy = \mathrm{var}[a(Y)]$ by definition. Rearranging (3.11) and substituting (3.9) gives
$$\mathrm{var}[a(Y)] = \frac{b''(\theta)c'(\theta) - c''(\theta)b'(\theta)}{[b'(\theta)]^3}. \quad (3.12)$$
Equations (3.9) and (3.12) can readily be verified for the Poisson, Normal and binomial distributions (see Exercise 3.4) and used to obtain the expected value and variance for other distributions in the exponential family.
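As an added illustration, the verification for the Poisson distribution can be carried out symbolically (here with sympy, an assumed tool): with $a(y) = y$, $b(\theta) = \log\theta$ and $c(\theta) = -\theta$, both (3.9) and (3.12) should reduce to $\theta$.

```python
import sympy as sp

theta = sp.symbols("theta", positive=True)
b = sp.log(theta)   # natural parameter of the Poisson distribution
c = -theta

# Equation (3.9): E[a(Y)] = -c'(theta) / b'(theta)
E_aY = -sp.diff(c, theta) / sp.diff(b, theta)

# Equation (3.12): var[a(Y)] = (b''c' - c''b') / (b')^3
var_aY = (sp.diff(b, theta, 2) * sp.diff(c, theta)
          - sp.diff(c, theta, 2) * sp.diff(b, theta)) / sp.diff(b, theta)**3

print(sp.simplify(E_aY), sp.simplify(var_aY))   # both print theta
```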

We also need expressions for the expected value and variance of the derivatives of the log-likelihood function. From (3.3), the log-likelihood function for a distribution in the exponential family is
$$l(\theta; y) = a(y)b(\theta) + c(\theta) + d(y).$$
The derivative of $l(\theta; y)$ with respect to $\theta$ is
$$U(\theta; y) = \frac{dl(\theta; y)}{d\theta} = a(y)b'(\theta) + c'(\theta).$$
The function $U$ is called the score statistic and, as it depends on $y$, it can be regarded as a random variable, that is
$$U = a(Y)b'(\theta) + c'(\theta). \quad (3.13)$$
Its expected value is $\mathrm{E}(U) = b'(\theta)\,\mathrm{E}[a(Y)] + c'(\theta)$. From (3.9)
$$\mathrm{E}(U) = b'(\theta)\left[-\frac{c'(\theta)}{b'(\theta)}\right] + c'(\theta) = 0. \quad (3.14)$$
The variance of $U$ is called the information and will be denoted by $\mathcal{I}$. Using the formula for the variance of a linear transformation of random variables (see (1.3) and (3.13)),
$$\mathcal{I} = \mathrm{var}(U) = [b'(\theta)]^2\,\mathrm{var}[a(Y)].$$
Substituting (3.12) gives
$$\mathrm{var}(U) = \frac{b''(\theta)c'(\theta)}{b'(\theta)} - c''(\theta). \quad (3.15)$$
The score statistic $U$ is used for inference about parameter values in generalized linear models (see Chapter 5).

Another property of $U$ which will be used later is
$$\mathrm{var}(U) = \mathrm{E}(U^2) = -\mathrm{E}(U'). \quad (3.16)$$
The first equality follows from the general result $\mathrm{var}(X) = \mathrm{E}(X^2) - [\mathrm{E}(X)]^2$ for any random variable, and the fact that $\mathrm{E}(U) = 0$ from (3.14). To obtain the second equality, we differentiate $U$ with respect to $\theta$; from (3.13)
$$U' = \frac{dU}{d\theta} = a(Y)b''(\theta) + c''(\theta).$$
Therefore the expected value of $U'$ is
$$\mathrm{E}(U') = b''(\theta)\,\mathrm{E}[a(Y)] + c''(\theta) = b''(\theta)\left[-\frac{c'(\theta)}{b'(\theta)}\right] + c''(\theta) = -\mathrm{var}(U) = -\mathcal{I} \quad (3.17)$$

by substituting (3.9) and then using (3.15).

3.4 Generalized linear models

The unity of many statistical methods was demonstrated by Nelder and Wedderburn (1972) using the idea of a generalized linear model. This model is defined in terms of a set of independent random variables $Y_1, \ldots, Y_N$, each with a distribution from the exponential family and the following properties:

1. The distribution of each $Y_i$ has the canonical form and depends on a single parameter $\theta_i$ (the $\theta_i$'s do not all have to be the same), thus

$$f(y_i; \theta_i) = \exp[y_i\,b_i(\theta_i) + c_i(\theta_i) + d_i(y_i)].$$

2. The distributions of all the $Y_i$'s are of the same form (e.g., all Normal or all binomial) so that the subscripts on $b$, $c$ and $d$ are not needed.

Thus the joint probability density function of $Y_1, \ldots, Y_N$ is
$$f(y_1, \ldots, y_N; \theta_1, \ldots, \theta_N) = \prod_{i=1}^{N}\exp[y_i\,b(\theta_i) + c(\theta_i) + d(y_i)] \quad (3.18)$$
$$= \exp\left[\sum_{i=1}^{N}y_i\,b(\theta_i) + \sum_{i=1}^{N}c(\theta_i) + \sum_{i=1}^{N}d(y_i)\right]. \quad (3.19)$$

The parameters $\theta_i$ are typically not of direct interest (since there may be one for each observation). For model specification we are usually interested in a smaller set of parameters $\beta_1, \ldots, \beta_p$ (where $p < N$). Thus a generalized linear model has three components:

1. Response variables $Y_1, \ldots, Y_N$ which are assumed to share the same distribution from the exponential family;

2. A set of parameters $\boldsymbol{\beta}$ and explanatory variables
$$X = \begin{bmatrix} \mathbf{x}_1^T \\ \vdots \\ \mathbf{x}_N^T \end{bmatrix} = \begin{bmatrix} x_{11} & \cdots & x_{1p} \\ \vdots & & \vdots \\ x_{N1} & \cdots & x_{Np} \end{bmatrix};$$

3. A monotone link function $g$ such that
$$g(\mu_i) = \mathbf{x}_i^T\boldsymbol{\beta}$$
where

$\mu_i = \mathrm{E}(Y_i)$.

This chapter concludes with three examples of generalized linear models.

3.5 Examples

3.5.1 Normal Linear Model

The best known special case of a generalized linear model is the model
$$\mathrm{E}(Y_i) = \mu_i = \mathbf{x}_i^T\boldsymbol{\beta}; \quad Y_i \sim N(\mu_i, \sigma^2)$$

where $Y_1, \ldots, Y_N$ are independent. Here the link function is the identity function, $g(\mu_i) = \mu_i$. This model is usually written in the form
$$\mathbf{y} = X\boldsymbol{\beta} + \mathbf{e}$$
where $\mathbf{e} = [e_1, \ldots, e_N]^T$ and the $e_i$'s are independent, identically distributed random variables with $e_i \sim N(0, \sigma^2)$ for $i = 1, \ldots, N$. In this form, the linear component $\boldsymbol{\mu} = X\boldsymbol{\beta}$ represents the 'signal' and $\mathbf{e}$ represents the 'noise', random variation or 'error'. Multiple regression, analysis of variance and analysis of covariance are all of this form. These models are considered in Chapter 6.

3.5.2 Historical Linguistics

Consider a language which is the descendent of another language; for example, modern Greek is a descendent of ancient Greek, and the Romance languages are descendents of Latin. A simple model for the change in vocabulary is that if the languages are separated by time $t$ then the probability that they have cognate words for a particular meaning is $e^{-\theta t}$ where $\theta$ is a parameter (see Figure 3.1). It is believed that $\theta$ is approximately the same for many commonly used meanings. For a test list of $N$ different commonly used meanings suppose that a linguist judges, for each meaning, whether the corresponding words in

[Figure 3.1 schematic: a Latin word diverging, over time, into a Modern French word and a Modern Spanish word.]

Figure 3.1 Schematic diagram for the example on historical linguistics.

two languages are cognate or not cognate. We can develop a generalized linear model to describe this situation. Define random variables $Y_1, \ldots, Y_N$ as follows:
$$Y_i = \begin{cases} 1 & \text{if the languages have cognate words for meaning } i, \\ 0 & \text{if the words are not cognate.} \end{cases}$$
Then
$$P(Y_i = 1) = e^{-\theta t} \quad \text{and} \quad P(Y_i = 0) = 1 - e^{-\theta t}.$$

This is a special case of the distribution $binomial(n, \pi)$ with $n = 1$ and $\mathrm{E}(Y_i) = \pi = e^{-\theta t}$. In this case the link function $g$ is taken as logarithmic
$$g(\pi) = \log\pi = -\theta t$$
so that $g[\mathrm{E}(Y)]$ is linear in the parameter $\theta$. In the notation used above, $\mathbf{x}_i = [-t]$ (the same for all $i$) and $\boldsymbol{\beta} = [\theta]$.

3.5.3 Mortality Rates

For a large population the probability of a randomly chosen individual dying at a particular time is small. If we assume that deaths from a non-infectious disease are independent events, then the number of deaths $Y$ in a population can be modelled by a Poisson distribution
$$f(y; \mu) = \frac{\mu^y e^{-\mu}}{y!}$$
where $y$ can take the values $0, 1, 2, \ldots$ and $\mu = \mathrm{E}(Y)$ is the expected number of deaths in a specified time period, such as a year.

The parameter $\mu$ will depend on the population size, the period of observation and various characteristics of the population (e.g., age, sex and medical history). It can be modelled, for example, by
$$\mathrm{E}(Y) = \mu = n\lambda(\mathbf{x}^T\boldsymbol{\beta})$$

Table 3.2 Numbers of deaths from coronary heart disease and population sizes by 5-year age groups for men in the Hunter region of New South Wales, Australia in 1991.

Age group   Number of     Population   Rate per 100,000 men
(years)     deaths, yi    size, ni     per year, yi/ni × 100,000
30-34       1             17,742       5.6
35-39       5             16,554       30.2
40-44       5             16,059       31.1
45-49       12            13,083       91.7
50-54       25            10,784       231.8
55-59       38            9,645        394.0
60-64       54            10,706       504.4
65-69       65            9,933        654.4


Figure 3.2 Death rate per 100,000 men (on a logarithmic scale) plotted against age.

where $n$ is the population size and $\lambda(\mathbf{x}^T\boldsymbol{\beta})$ is the rate per 100,000 people per year (which depends on the population characteristics described by the linear component $\mathbf{x}^T\boldsymbol{\beta}$).

Changes in mortality with age can be modelled by taking independent random variables $Y_1, \ldots, Y_N$ to be the numbers of deaths occurring in successive age groups. For example, Table 3.2 shows age-specific data for deaths from coronary heart disease.

Figure 3.2 shows how the mortality rate $y_i/n_i \times 100{,}000$ increases with age. Note that a logarithmic scale has been used on the vertical axis. On this scale the scatter plot is approximately linear, suggesting that the relationship between $y_i/n_i$ and age group $i$ is approximately exponential. Therefore a

possible model is
$$\mathrm{E}(Y_i) = \mu_i = n_i e^{\theta i}; \quad Y_i \sim Poisson(\mu_i),$$
where $i = 1$ for the age group 30-34 years, $i = 2$ for 35-39, $\ldots$, $i = 8$ for 65-69 years. This can be written as a generalized linear model using the logarithmic link function

$$g(\mu_i) = \log\mu_i = \log n_i + \theta i$$
which has the linear component $\mathbf{x}_i^T\boldsymbol{\beta}$ with $\mathbf{x}_i^T = [\log n_i \quad i]$ and $\boldsymbol{\beta} = \begin{bmatrix} 1 \\ \theta \end{bmatrix}$.
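A hedged sketch of fitting this model follows, using statsmodels (an assumed tool, not part of the text): the term $\log n_i$ enters as an offset so that its coefficient is fixed at one, and the single column $i$ carries the parameter $\theta$.

```python
import numpy as np
import statsmodels.api as sm

# Data from Table 3.2
deaths = np.array([1, 5, 5, 12, 25, 38, 54, 65])
population = np.array([17742, 16554, 16059, 13083, 10784, 9645, 10706, 9933])
i = np.arange(1, 9, dtype=float)   # age-group index: 30-34 -> 1, ..., 65-69 -> 8

# log mu_i = log(n_i) + theta * i : Poisson family, log link (the default),
# offset log(n_i), no intercept
model = sm.GLM(deaths, i.reshape(-1, 1),
               family=sm.families.Poisson(),
               offset=np.log(population))
print(model.fit().params)   # the estimate of theta
```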

3.6 Exercises

3.1 The following relationships can be described by generalized linear models. For each one, identify the response variable and the explanatory variables, select a probability distribution for the response (justifying your choice) and write down the linear component.

(a) The effect of age, sex, height, mean daily food intake and mean daily energy expenditure on a person's weight.

(b) The proportions of laboratory mice that became infected after exposure to bacteria when five different exposure levels are used and 20 mice are exposed at each level.

(c) The relationship between the number of trips per week to the supermarket for a household and the number of people in the household, the household income and the distance to the supermarket.

3.2 If the random variable $Y$ has the gamma distribution with a scale parameter $\theta$, which is the parameter of interest, and a known shape parameter $\phi$, then its probability density function is
$$f(y; \theta) = \frac{y^{\phi-1}\theta^\phi e^{-y\theta}}{\Gamma(\phi)}.$$
Show that this distribution belongs to the exponential family and find the natural parameter. Also, using results in this chapter, find $\mathrm{E}(Y)$ and $\mathrm{var}(Y)$.

3.3 Show that the following probability density functions belong to the exponential family:

(a) Pareto distribution, $f(y; \theta) = \theta y^{-\theta-1}$.

(b) Exponential distribution, $f(y; \theta) = \theta e^{-y\theta}$.

(c) Negative binomial distribution,

$$f(y; \theta) = \binom{y+r-1}{r-1}\theta^r(1-\theta)^y$$
where $r$ is known.

3.4 Use results (3.9) and (3.12) to verify the following results:

(a) For $Y \sim Poisson(\theta)$, $\mathrm{E}(Y) = \mathrm{var}(Y) = \theta$.

(b) For $Y \sim N(\mu, \sigma^2)$, $\mathrm{E}(Y) = \mu$ and $\mathrm{var}(Y) = \sigma^2$.

(c) For $Y \sim binomial(n, \pi)$, $\mathrm{E}(Y) = n\pi$ and $\mathrm{var}(Y) = n\pi(1-\pi)$.

3.5 Do you consider the model suggested in Example 3.5.3 to be adequate for the data shown in Figure 3.2? Justify your answer. Use simple linear regression (with suitable transformations of the variables) to obtain a model for the change of death rates with age. How well does the model fit the data? (Hint: compare observed and expected numbers of deaths in each group.)

3.6 Consider N independent binary random variables Y1,... ,YN with

$$P(Y_i = 1) = \pi_i \quad \text{and} \quad P(Y_i = 0) = 1 - \pi_i.$$

The probability function of $Y_i$ can be written as
$$\pi_i^{y_i}(1-\pi_i)^{1-y_i}$$

where yi = 0 or 1. (a) Show that this probability function belongs to the exponential family of distributions. (b) Show that the natural parameter is

$$\log\frac{\pi_i}{1-\pi_i}.$$

This function, the logarithm of the odds πi/(1 − πi), is called the logit function.

(c) Show that $\mathrm{E}(Y_i) = \pi_i$.

(d) If the link function is

$$g(\pi) = \log\frac{\pi}{1-\pi} = \mathbf{x}^T\boldsymbol{\beta},$$
show that this is equivalent to modelling the probability $\pi$ as

$$\pi = \frac{e^{\mathbf{x}^T\boldsymbol{\beta}}}{1+e^{\mathbf{x}^T\boldsymbol{\beta}}}.$$

(e) In the particular case where $\mathbf{x}^T\boldsymbol{\beta} = \beta_1 + \beta_2 x$, this gives

$$\pi = \frac{e^{\beta_1+\beta_2 x}}{1+e^{\beta_1+\beta_2 x}}$$
which is the logistic function.

(f) Sketch the graph of $\pi$ against $x$ in this case, taking $\beta_1$ and $\beta_2$ as constants. How would you interpret this graph if $x$ is the dose of an insecticide and $\pi$ is the probability of an insect dying?

3.7 Is the extreme value (Gumbel) distribution, with probability density function
$$f(y; \theta) = \frac{1}{\phi}\exp\left\{\frac{y-\theta}{\phi} - \exp\left(\frac{y-\theta}{\phi}\right)\right\}$$
(where $\phi > 0$ is regarded as a nuisance parameter), a member of the exponential family?

3.8 Suppose $Y_1, \ldots, Y_N$ are independent random variables each with the Pareto distribution and
$$\mathrm{E}(Y_i) = (\beta_0 + \beta_1 x_i)^2.$$
Is this a generalized linear model? Give reasons for your answer.

3.9 Let $Y_1, \ldots, Y_N$ be independent random variables with
$$\mathrm{E}(Y_i) = \mu_i = \beta_0 + \log(\beta_1 + \beta_2 x_i); \quad Y_i \sim N(\mu_i, \sigma^2)$$
for all $i = 1, \ldots, N$. Is this a generalized linear model? Give reasons for your answer.

3.10 For the Pareto distribution find the score statistic $U$ and the information $\mathcal{I} = \mathrm{var}(U)$. Verify that $\mathrm{E}(U) = 0$.

4 Estimation

4.1 Introduction

This chapter is about obtaining point and interval estimates of parameters for generalized linear models using methods based on maximum likelihood. Although explicit mathematical expressions can be found for estimators in some special cases, numerical methods are usually needed. Typically these methods are iterative and are based on the Newton-Raphson algorithm. To illustrate this principle, the chapter begins with a numerical example. Then the theory of estimation for generalized linear models is developed. Finally there is another numerical example to demonstrate the methods in detail.

4.2 Example: Failure times for pressure vessels

The data in Table 4.1 are the lifetimes (times to failure in hours) of Kevlar epoxy strand pressure vessels at 70% stress level. They are given in Table 29.1 of the book of data sets by Andrews and Herzberg (1985). Figure 4.1 shows the shape of their distribution.

A commonly used model for times to failure (or survival times) is the Weibull distribution which has the probability density function
$$f(y; \lambda, \theta) = \frac{\lambda y^{\lambda-1}}{\theta^\lambda}\exp\left[-\left(\frac{y}{\theta}\right)^\lambda\right] \quad (4.1)$$
where $y > 0$ is the time to failure, $\lambda$ is a parameter that determines the shape of the distribution and $\theta$ is a parameter that determines the scale.

Figure 4.2 is a probability plot of the data in Table 4.1 compared to the Weibull distribution with $\lambda = 2$. Although there are discrepancies between the distribution and the data for some of the shorter times, for most of the

Table 4.1 Lifetimes of pressure vessels.

1051    4921    7886    10861   13520
1337    5445    8108    11026   13670
1389    5620    8546    11214   14110
1921    5817    8666    11362   14496
1942    5905    8831    11604   15395
2322    5956    9106    11608   16179
3629    6068    9711    11745   17092
4006    6121    9806    11762   17568
4012    6473    10205   11895   17568
4063    7501    10396   12044


Figure 4.1 Distribution of lifetimes of pressure vessels.


Figure 4.2 Probability plot of the data on lifetimes of pressure vessels compared to the Weibull distribution with shape parameter = 2.

observations the distribution appears to provide a good model for the data. Therefore we will use a Weibull distribution with $\lambda = 2$ and estimate $\theta$.

The distribution in (4.1) can be written as
$$f(y; \theta) = \exp\left[\log\lambda + (\lambda-1)\log y - \lambda\log\theta - (y/\theta)^\lambda\right].$$

This belongs to the exponential family (3.2) with

$$a(y) = y^\lambda, \quad b(\theta) = -\theta^{-\lambda}, \quad c(\theta) = \log\lambda - \lambda\log\theta \quad \text{and} \quad d(y) = (\lambda-1)\log y \quad (4.2)$$

where $\lambda$ is a nuisance parameter. This is not in the canonical form (unless $\lambda = 1$, corresponding to the exponential distribution) and so it cannot be used directly in the specification of a generalized linear model. However it is


Figure 4.3 Newton-Raphson method for finding the solution of the equation t(x) = 0.

suitable for illustrating the estimation of parameters for distributions in the exponential family.

Let $Y_1, \ldots, Y_N$ denote the data, with $N = 49$. If the data are from a random sample of pressure vessels, we assume the $Y_i$'s are independent random variables. If they all have the Weibull distribution with the same parameters, their joint probability distribution is

$$f(y_1, \ldots, y_N; \theta, \lambda) = \prod_{i=1}^{N}\frac{\lambda y_i^{\lambda-1}}{\theta^\lambda}\exp\left[-\left(\frac{y_i}{\theta}\right)^\lambda\right].$$
The log-likelihood function is

$$l(\theta; y_1, \ldots, y_N, \lambda) = \sum_{i=1}^{N}\left[(\lambda-1)\log y_i + \log\lambda - \lambda\log\theta\right] - \sum_{i=1}^{N}\left(\frac{y_i}{\theta}\right)^\lambda. \quad (4.3)$$
To maximize this function we require the derivative with respect to $\theta$. This is the score function

The maximum likelihood estimator θ is the solution ofthe equation U(θ)=0. In this case it is easy to find an explicit expression for θ if λ is a known constant, but for illustrative purposes, we will obtain a numerical solution using the Newton-Raphson approximation. Figure4.3showstheprincipleofth eNewton-Raphsonalgorithm.Wewant to find the value of x at which the function t crosses the x-axis, i.e., where

$t(x) = 0$. The slope of $t$ at a value $x^{(m-1)}$ is given by
$$\left.\frac{dt}{dx}\right|_{x=x^{(m-1)}} = t'(x^{(m-1)}) = \frac{t(x^{(m)}) - t(x^{(m-1)})}{x^{(m)} - x^{(m-1)}} \quad (4.5)$$
where the distance $x^{(m)} - x^{(m-1)}$ is small. If $x^{(m)}$ is the required solution so that $t(x^{(m)}) = 0$, then (4.5) can be rearranged to give
$$x^{(m)} = x^{(m-1)} - \frac{t(x^{(m-1)})}{t'(x^{(m-1)})}. \quad (4.6)$$
This is the Newton-Raphson formula for solving $t(x) = 0$. Starting with an initial guess $x^{(1)}$, successive approximations are obtained using (4.6) until the iterative process converges.

For maximum likelihood estimation using the score function, the estimating equation equivalent to (4.6) is
$$\theta^{(m)} = \theta^{(m-1)} - \frac{U^{(m-1)}}{U'^{(m-1)}}. \quad (4.7)$$
From (4.4), for the Weibull distribution with $\lambda = 2$,
$$U = -\frac{2N}{\theta} + \frac{2\sum y_i^2}{\theta^3} \quad (4.8)$$
which is evaluated at successive estimates $\theta^{(m)}$. The derivative of $U$, obtained by differentiating (4.4), is
$$\frac{dU}{d\theta} = U' = \sum_{i=1}^{N}\left[\frac{\lambda}{\theta^2} - \frac{\lambda(\lambda+1)y_i^\lambda}{\theta^{\lambda+2}}\right] = \frac{2N}{\theta^2} - \frac{2\times 3\times\sum y_i^2}{\theta^4}. \quad (4.9)$$
For maximum likelihood estimation, it is common to approximate $U'$ by its expected value $\mathrm{E}(U')$. For distributions in the exponential family, this is readily obtained using expression (3.17). The information $\mathcal{I}$ is

$$\mathcal{I} = \mathrm{E}(-U') = \mathrm{E}\left(-\sum_{i=1}^{N}U_i'\right) = \sum_{i=1}^{N}\mathrm{E}(-U_i') = \sum_{i=1}^{N}\left[\frac{b''(\theta)c'(\theta)}{b'(\theta)} - c''(\theta)\right] = \frac{\lambda^2 N}{\theta^2} \quad (4.10)$$

where $U_i$ is the score for $Y_i$ and expressions for $b$ and $c$ are given in (4.2).

Thus an alternative estimating equation is
$$\theta^{(m)} = \theta^{(m-1)} + \frac{U^{(m-1)}}{\mathcal{I}^{(m-1)}}. \quad (4.11)$$

Table 4.2 Details of Newton-Raphson iterations to obtain a maximum likelihood estimate for the scale parameter for the Weibull distribution to model the data in Table 4.1.

Iteration        1          2          3          4
θ                8805.9     9633.9     9876.4     9892.1
U × 10^6         2915.10    552.80     31.78      0.21
U′ × 10^6        −3.52      −2.28      −2.02      −2.00
E(U′) × 10^6     −2.53      −2.11      −2.01      −2.00
U/U′             −827.98    −242.46    −15.73     −0.105
U/E(U′)          −1152.21   −261.99    −15.81     −0.105

This is called the method of scoring.

Table 4.2 shows the results of using equation (4.7) iteratively, taking the mean of the data in Table 4.1, $\overline{y} = 8805.9$, as the initial value $\theta^{(1)}$; this and subsequent approximations are shown in the top row of Table 4.2. Numbers in the second row were obtained by evaluating (4.8) at $\theta^{(m)}$ and the data values; they approach zero rapidly. The third and fourth rows, $U'$ and $\mathrm{E}(U') = -\mathcal{I}$, have similar values, illustrating that either could be used; this is further shown by the similarity of the numbers in the fifth and sixth rows. The final estimate is $\theta^{(5)} = 9892.1 - (-0.105) = 9892.2$ – this is the maximum likelihood estimate $\hat{\theta}$ for these data. At this value the log-likelihood function, calculated from (4.3), is $l = -480.850$.

Figure 4.4 shows the log-likelihood function for these data and the Weibull distribution with $\lambda = 2$. The maximum value is at $\hat{\theta} = 9892.2$. The curvature of the function in the vicinity of the maximum determines the reliability of $\hat{\theta}$. The curvature of $l$ is defined by the rate of change of $U$, that is, by $U'$. If $U'$, or $\mathrm{E}(U')$, is small then $l$ is flat so that $U$ is approximately zero for a wide interval of $\theta$ values. In this case $\hat{\theta}$ is not well-determined and its standard error is large. In fact, it is shown in Chapter 5 that the variance of $\hat{\theta}$ is inversely related to $\mathcal{I} = \mathrm{E}(-U')$ and the standard error of $\hat{\theta}$ is approximately
$$\mathrm{s.e.}(\hat{\theta}) = \sqrt{1/\mathcal{I}}. \quad (4.12)$$

For this example, at $\hat{\theta} = 9892.2$, $\mathcal{I} = -\mathrm{E}(U') = 2.00\times 10^{-6}$ so $\mathrm{s.e.}(\hat{\theta}) = 1/\sqrt{0.000002} = 707$. If the sampling distribution of $\hat{\theta}$ is approximately Normal, a 95% confidence interval for $\theta$ is given approximately by

9892 ± 1.96 × 707, or (8506, 11278).
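A minimal sketch of these calculations (an addition, assuming numpy) applies the scoring iteration (4.11) to the 49 lifetimes of Table 4.1 and should reproduce the estimate, standard error and confidence interval quoted above.

```python
import numpy as np

# The 49 lifetimes of Table 4.1 (hours)
y = np.array([1051, 1337, 1389, 1921, 1942, 2322, 3629, 4006, 4012, 4063,
              4921, 5445, 5620, 5817, 5905, 5956, 6068, 6121, 6473, 7501,
              7886, 8108, 8546, 8666, 8831, 9106, 9711, 9806, 10205, 10396,
              10861, 11026, 11214, 11362, 11604, 11608, 11745, 11762, 11895,
              12044, 13520, 13670, 14110, 14496, 15395, 16179, 17092, 17568,
              17568], dtype=float)
N, lam = len(y), 2
sum_y_lam = np.sum(y**lam)

theta = y.mean()                 # initial value theta(1) = 8805.9, as in the text
for m in range(6):
    U = -lam*N/theta + lam*sum_y_lam/theta**(lam + 1)   # score, equation (4.8)
    I = lam**2 * N / theta**2                           # information, equation (4.10)
    theta = theta + U/I                                 # scoring step, equation (4.11)

se = 1/np.sqrt(lam**2 * N / theta**2)                   # equation (4.12)
print(theta, se)                          # approximately 9892.2 and 707
print(theta - 1.96*se, theta + 1.96*se)   # approximately (8506, 11278)
```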

The methods illustrated in this example are now developed for generalized linear models.


Figure 4.4 Log-likelihood function for the pressure vessel data in Table 4.1.

4.3 Maximum likelihood estimation

Consider independent random variables $Y_1, \ldots, Y_N$ satisfying the properties of a generalized linear model. We wish to estimate parameters $\boldsymbol{\beta}$ which are related to the $Y_i$'s through $\mathrm{E}(Y_i) = \mu_i$ and $g(\mu_i) = \mathbf{x}_i^T\boldsymbol{\beta}$.

For each $Y_i$, the log-likelihood function is

$$l_i = y_i\,b(\theta_i) + c(\theta_i) + d(y_i) \quad (4.13)$$
where the functions $b$, $c$ and $d$ are defined in (3.3). Also

where xi is a vector with elements xij,j =1, ...p. The log-likelihood function for all the Yi’s is N    l = li = yib(θi)+ c(θi)+ d(yi). i=1

To obtain the maximum likelihood estimator for the parameter βj we need     ∂l N ∂l N ∂l ∂θ ∂µ = U = i = i . i . i (4.17) ∂β j ∂β ∂θ ∂µ ∂β j i=1 j i=1 i i j using the chain rule for differentiation. We will consider each term on the right hand side of(4.17) separately. First

∂li    = yib (θi)+c (θi)=b (θi)(yi − µi) ∂θi

© 2002 by Chapman & Hall/CRC by differentiating (4.13) and substituting (4.14). Next ! ∂θ ∂µ i =1 i . ∂µi ∂θi Differentiation of(4.14) gives ∂µ −c(θ ) c(θ )b(θ ) i = i + i i   2 ∂θi b (θi) [b (θi)]  = b (θi)var(Yi) from (4.15). Finally, from (4.16)

∂µi ∂µi ∂ηi ∂µi = . = xij. ∂βj ∂ηi ∂βj ∂ηi Hence the score, given in (4.17), is   N (y − µ ) ∂µ U = i i x i . (4.18) j var(Y ) ij ∂η i=1 i i

The variance-covariance matrix ofthe Uj’s has terms

Ijk = E [UjUk] which form the information matrix I. From (4.18) "    # N N (Yi − µi) ∂µi (Yl − µl) ∂µl Ijk =E xij xlk var(Yi) ∂ηi var(Yl) ∂ηl i=1 l=1   N E (Y − µ )2 x x ∂µ 2 = i i ij ik i (4.19) 2 ∂η i=1 [var(Yi)] i

because E[(Yi − µi)(Yl − µl)] = 0 for i = l as the Yi’s are independent. Using 2 E (Yi − µi) = var(Yi), (4.19) can be simplified to

N x x ∂µ 2 I = ij ik i . (4.20) jk var(Y ) ∂η i=1 i i The estimating equation (4.11) for the method of scoring generalizes to $ %− − 1 b(m) = b(m−1) + I(m 1) U(m−1) (4.21)

(m) where b is the vector ofestimates ofthe parameters β1, ..., βp at the mth $ %− − 1 iteration. In equation (4.21), I(m 1) is the inverse ofthe information (m−1) matrix with elements Ijk given by (4.20) and U is the vector ofelements given by (4.18), all evaluated at b(m−1). Ifboth sides ofequation (4.21) are − multiplied by I(m 1) we obtain − − I(m 1)b(m) = I(m 1)b(m−1) + U(m−1). (4.22)

© 2002 by Chapman & Hall/CRC From (4.20) I can be written as I = XT WX where W is the N × N diagonal matrix with elements 2 1 ∂µi wii = . (4.23) var(Yi) ∂ηi The expression on the right-hand side of(4.22) is the vector with elements

p N 2 N xijxik ∂µi (m−1) (yi − µi)xij ∂µi bk + var(Yi) ∂ηi var(Yi) ∂ηi k=1 i=1 i=1 evaluated at b(m−1); this follows from equations (4.20) and (4.18). Thus the right-hand side ofequation (4.22) can be written as XT Wz where z has elements p (m−1) − ∂ηi zi = xikbk +(yi µi) (4.24) ∂µi k=1 (m−1) with µi and ∂ηi/∂µi evaluated at b . Hence the iterative equation (4.22), can be written as XT WXb(m) = XT Wz. (4.25) This is the same form as the normal equations for a linear model obtained by weighted least squares, except that it has to be solved iteratively because, in general, z and W depend on b. Thus for generalized linear models, maximum likelihood estimators are obtained by an iterative weighted least squares procedure (Charnes et al., 1976). Most statistical packages that include procedures for fitting generalized lin- ear models have an efficient algorithm based on (4.25). They begin by using some initial approximation b(0) to evaluate z and W, then (4.25) is solved to give b(1) which in turn is used to obtain better approximations for z and W, and so on until adequate convergence is achieved. When the difference between successive approximations b(m−1) and b(m) is sufficiently small, b(m) is taken as the maximum likelihood estimate. The example below illustrates the use ofthis estimation procedure.

4.4 Poisson regression example

The artificial data in Table 4.3 are counts $y$ observed at various values of a covariate $x$. They are plotted in Figure 4.5.

Let us assume that the responses $Y_i$ are Poisson random variables. In practice, such an assumption would be made either on substantive grounds or from noticing that in Figure 4.5 the variability increases with $Y$. This observation

Table 4.3 Data for Poisson regression example.

yi:   2    3    6    7    8    9    10   12   15
xi:  −1   −1    0    0    0    0    1    1    1


Figure 4.5 Poisson regression example (data in Table 4.3).

supports the use of the Poisson distribution which has the property that the expected value and variance of $Y_i$ are equal

E(Yi) = var(Yi). (4.26)

Let us model the relationship between Yi and xi by the straight line

$$\mathrm{E}(Y_i) = \mu_i = \beta_1 + \beta_2 x_i = \mathbf{x}_i^T\boldsymbol{\beta}$$
where
$$\boldsymbol{\beta} = \begin{bmatrix}\beta_1 \\ \beta_2\end{bmatrix} \quad \text{and} \quad \mathbf{x}_i = \begin{bmatrix}1 \\ x_i\end{bmatrix}$$

for i =1, ..., N. Thus we take the link function g(µi) to be the identity function

$$g(\mu_i) = \mu_i = \mathbf{x}_i^T\boldsymbol{\beta} = \eta_i.$$

Therefore $\partial\mu_i/\partial\eta_i = 1$, which simplifies equations (4.23) and (4.24). From (4.23) and (4.26)
$$w_{ii} = \frac{1}{\mathrm{var}(Y_i)} = \frac{1}{\beta_1+\beta_2 x_i}.$$

Using the estimate $\mathbf{b} = \begin{bmatrix}b_1 \\ b_2\end{bmatrix}$ for $\boldsymbol{\beta}$, equation (4.24) becomes

$$z_i = b_1 + b_2 x_i + (y_i - b_1 - b_2 x_i) = y_i.$$
Also
$$\mathcal{I} = X^TWX = \begin{bmatrix} \sum_{i=1}^{N}\dfrac{1}{b_1+b_2 x_i} & \sum_{i=1}^{N}\dfrac{x_i}{b_1+b_2 x_i} \\ \sum_{i=1}^{N}\dfrac{x_i}{b_1+b_2 x_i} & \sum_{i=1}^{N}\dfrac{x_i^2}{b_1+b_2 x_i} \end{bmatrix} \quad \text{and} \quad X^TW\mathbf{z} = \begin{bmatrix} \sum_{i=1}^{N}\dfrac{y_i}{b_1+b_2 x_i} \\ \sum_{i=1}^{N}\dfrac{x_i y_i}{b_1+b_2 x_i} \end{bmatrix}.$$
The maximum likelihood estimates are obtained iteratively from the equations
$$(X^TWX)^{(m-1)}\mathbf{b}^{(m)} = (X^TW\mathbf{z})^{(m-1)}$$
where the superscript $(m-1)$ denotes evaluation at $\mathbf{b}^{(m-1)}$.

For these data, $N = 9$,
$$\mathbf{y} = \mathbf{z} = \begin{bmatrix} 2 \\ 3 \\ \vdots \\ 15 \end{bmatrix} \quad \text{and} \quad X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_9 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 1 & -1 \\ \vdots & \vdots \\ 1 & 1 \end{bmatrix}.$$
From Figure 4.5 we can obtain initial estimates $b_1^{(1)} = 7$ and $b_2^{(1)} = 5$. Therefore
$$(X^TWX)^{(1)} = \begin{bmatrix} 1.821429 & -0.75 \\ -0.75 & 1.25 \end{bmatrix}, \quad (X^TW\mathbf{z})^{(1)} = \begin{bmatrix} 9.869048 \\ 0.583333 \end{bmatrix}$$
so
$$\mathbf{b}^{(2)} = \left[(X^TWX)^{(1)}\right]^{-1}(X^TW\mathbf{z})^{(1)} = \begin{bmatrix} 0.729167 & 0.4375 \\ 0.4375 & 1.0625 \end{bmatrix}\begin{bmatrix} 9.869048 \\ 0.583333 \end{bmatrix} = \begin{bmatrix} 7.4514 \\ 4.9375 \end{bmatrix}.$$
This iterative process is continued until it converges. The results are shown in Table 4.4.

The maximum likelihood estimates are $\hat{\beta}_1 = 7.45163$ and $\hat{\beta}_2 = 4.93530$. At these values the inverse of the information matrix $\mathcal{I} = X^TWX$ is
$$\mathcal{I}^{-1} = \begin{bmatrix} 0.7817 & 0.4166 \\ 0.4166 & 1.1863 \end{bmatrix}$$
(this is the variance-covariance matrix for $\hat{\boldsymbol{\beta}}$ – see Section 5.4). So, for example,

Table 4.4 Successive approximations for regression coefficients in the Poisson regression example.

$m$            1     2         3         4
$b_1^{(m)}$    7     7.45139   7.45163   7.45163
$b_2^{(m)}$    5     4.93750   4.93531   4.93530

Table 4.5 Numbers of cases of AIDS in Australia for successive quarters from 1984 to 1988.

         Quarter
Year     1      2      3      4
1984     1      6      16     23
1985     27     39     31     30
1986     43     51     63     70
1987     88     97     91     104
1988     110    113    149    159

an approximate 95% confidence interval for the slope $\beta_2$ is $4.9353 \pm 1.96\sqrt{1.1863}$, or (2.80, 7.07).
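The iteration above is easy to reproduce numerically. The following is a minimal sketch (Python with numpy; the code and names are ours, not from any statistical package) of the iterative weighted least squares procedure (4.25) for the data in Table 4.3; it reproduces the successive approximations in Table 4.4.

```python
# Minimal sketch of iterative weighted least squares for the Poisson
# regression example with the identity link (data from Table 4.3).
import numpy as np

y = np.array([2, 3, 6, 7, 8, 9, 10, 12, 15], dtype=float)
x = np.array([-1, -1, 0, 0, 0, 0, 1, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])       # rows are x_i^T = (1, x_i)

b = np.array([7.0, 5.0])                        # initial estimates from Figure 4.5
for m in range(20):
    mu = X @ b                                  # identity link: mu_i = b1 + b2*x_i
    W = np.diag(1.0 / mu)                       # w_ii = 1/var(Y_i) = 1/mu_i
    z = y                                       # z_i = y_i because d(eta)/d(mu) = 1
    b_new = np.linalg.solve(X.T @ W @ X, X.T @ W @ z)
    if np.max(np.abs(b_new - b)) < 1e-8:        # adequate convergence
        b = b_new
        break
    b = b_new

print(b)                                        # [7.45163, 4.93530], as in Table 4.4
info = X.T @ np.diag(1.0 / (X @ b)) @ X
print(np.linalg.inv(info))                      # approx [[0.7817, 0.4166], [0.4166, 1.1863]]
```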

4.5 Exercises

4.1 The data in Table 4.5 show the numbers of cases of AIDS in Australia by date of diagnosis for successive 3-month periods from 1984 to 1988. (Data from National Centre for HIV Epidemiology and Clinical Research, 1994.) In this early phase of the epidemic, the numbers of cases seemed to be increasing exponentially.

(a) Plot the number of cases $y_i$ against time period $i$ ($i = 1, \ldots, 20$).

(b) A possible model is the Poisson distribution with parameter $\lambda_i = i^\theta$, or equivalently

$$\log \lambda_i = \theta \log i.$$

Plot $\log y_i$ against $\log i$ to examine this model.

(c) Fit a generalized linear model to these data using the Poisson distribution, the log link function and the equation

$$g(\lambda_i) = \log \lambda_i = \beta_1 + \beta_2 x_i,$$

where $x_i = \log i$. Firstly, do this from first principles, working out expressions for the weight matrix $W$ and other terms needed for the iterative equation

$$X^T W X b^{(m)} = X^T W z$$

Table 4.6 Survival time, $y_i$, in weeks and $\log_{10}$(initial white blood cell count), $x_i$, for seventeen leukemia patients.

$y_i$   65     156    100    134    16     108    121    4      39
$x_i$   3.36   2.88   3.63   3.41   3.78   4.02   4.00   4.23   3.73

$y_i$   143    56     26     22     1      1      5      65
$x_i$   3.85   3.97   4.51   4.54   5.00   5.00   4.72   5.00

and using software which can perform matrix operations to carry out the calculations.

(d) Fit the model described in (c) using statistical software which can perform Poisson regression. Compare the results with those obtained in (c).

4.2 The data in Table 4.6 are times to death, $y_i$, in weeks from diagnosis and $\log_{10}$(initial white blood cell count), $x_i$, for seventeen patients suffering from leukemia. (This is Example U from Cox and Snell, 1981.)

(a) Plot $y_i$ against $x_i$. Do the data show any trend?

(b) A possible specification for $E(Y_i)$ is

$$E(Y_i) = \exp(\beta_1 + \beta_2 x_i),$$

which will ensure that $E(Y_i)$ is non-negative for all values of the parameters and all values of x. Which link function is appropriate in this case?

(c) The exponential distribution is often used to describe survival times. The probability distribution is $f(y; \theta) = \theta e^{-y\theta}$. This is a special case of the gamma distribution with shape parameter $\phi = 1$. Show that $E(Y) = 1/\theta$ and $\text{var}(Y) = 1/\theta^2$. Fit a model with the equation for $E(Y_i)$ given in (b) and the exponential distribution using appropriate statistical software.

(d) For the model fitted in (c), compare the observed values $y_i$ and fitted values $\hat y_i = \exp(\hat\beta_1 + \hat\beta_2 x_i)$, and use the standardized residuals $r_i = (y_i - \hat y_i)/\hat y_i$ to investigate the adequacy of the model. (Note: $\hat y_i$ is used as the denominator of $r_i$ because it is an estimate of the standard deviation of $Y_i$ – see (c) above.)

4.3 Let $Y_1, \ldots, Y_N$ be a random sample from the Normal distribution $Y_i \sim N(\log\beta, \sigma^2)$ where $\sigma^2$ is known. Find the maximum likelihood estimator of $\beta$ from first principles. Also verify equations (4.18) and (4.25) in this case.

5 Inference

5.1 Introduction

The two main tools of statistical inference are confidence intervals and hypothesis tests. Their derivation and use for generalized linear models are covered in this chapter.

Confidence intervals, also known as interval estimates, are increasingly regarded as more useful than hypothesis tests because the width of a confidence interval provides a measure of the precision with which inferences can be made. It does so in a way which is conceptually simpler than the power of a statistical test (Altman et al., 2000).

Hypothesis tests in a statistical modelling framework are performed by comparing how well two related models fit the data (see the examples in Chapter 2). For generalized linear models, the two models should have the same probability distribution and the same link function but the linear component of one model has more parameters than the other. The simpler model, corresponding to the null hypothesis H0, must be a special case of the other more general model. If the simpler model fits the data as well as the more general model does, then it is preferred on the grounds of parsimony and H0 is retained. If the more general model fits significantly better, then H0 is rejected in favor of an alternative hypothesis H1 which corresponds to the more general model. To make these comparisons, we use summary statistics to describe how well the models fit the data. These goodness of fit statistics may be based on the maximum value of the likelihood function, the maximum value of the log-likelihood function, the minimum value of the sum of squares criterion or a composite statistic based on the residuals. The process and logic can be summarized as follows:

1. Specify a model M0 corresponding to H0. Specify a more general model M1 (with M0 as a special case of M1).

2. Fit M0 and calculate the goodness of fit statistic G0. Fit M1 and calculate the goodness of fit statistic G1.

3. Calculate the improvement in fit, usually G1 − G0 but G1/G0 is another possibility.

4. Use the sampling distribution of G1 − G0 (or some related statistic) to test the null hypothesis that G1 = G0 against the alternative hypothesis that G1 ≠ G0.

5. If the hypothesis that G1 = G0 is not rejected, then H0 is not rejected and M0 is the preferred model. If the hypothesis G1 = G0 is rejected, then H0 is rejected and M1 is regarded as the better model.

For both forms of inference, sampling distributions are required.

To calculate a confidence interval, the sampling distribution of the estimator is required. To test a hypothesis, the sampling distribution of the goodness of fit statistic is required. This chapter is about the relevant sampling distributions for generalized linear models.

If the response variables are Normally distributed, the sampling distributions used for inference can often be determined exactly. For other distributions we need to rely on large-sample asymptotic results based on the Central Limit Theorem. The rigorous development of these results requires careful attention to various regularity conditions. For independent observations from distributions which belong to the exponential family, and in particular for generalized linear models, the necessary conditions are indeed satisfied. In this book we consider only the major steps and not the finer points involved in deriving the sampling distributions. Details of the distribution theory for generalized linear models are given by Fahrmeir and Kaufman (1985).

The basic idea is that under appropriate conditions, if S is a statistic of interest, then approximately

$$\frac{S - E(S)}{\sqrt{\text{var}(S)}} \sim N(0, 1)$$

or equivalently

$$\frac{[S - E(S)]^2}{\text{var}(S)} \sim \chi^2(1)$$

where E(S) and var(S) are the expectation and variance of S respectively. If there is a vector of statistics of interest

$$s = \begin{bmatrix} S_1 \\ \vdots \\ S_p \end{bmatrix}$$

with asymptotic expectation E(s) and asymptotic variance-covariance matrix V, then approximately

$$[s - E(s)]^T V^{-1} [s - E(s)] \sim \chi^2(p) \qquad (5.1)$$

provided V is non-singular, so a unique inverse matrix $V^{-1}$ exists.

5.2 Sampling distribution for score statistics

Suppose $Y_1, \ldots, Y_N$ are independent random variables in a generalized linear model with parameters $\beta$, where $E(Y_i) = \mu_i$ and $g(\mu_i) = x_i^T \beta = \eta_i$. From equation (4.18) the score statistics are

$$U_j = \frac{\partial l}{\partial \beta_j} = \sum_{i=1}^{N} \left[ \frac{(Y_i - \mu_i)}{\text{var}(Y_i)}\, x_{ij} \left( \frac{\partial \mu_i}{\partial \eta_i} \right) \right] \quad \text{for } j = 1, \ldots, p.$$

As $E(Y_i) = \mu_i$ for all i,

$$E(U_j) = 0 \quad \text{for } j = 1, \ldots, p, \qquad (5.2)$$

© 2002 by Chapman & Hall/CRC consistent with the general result (3.14). The variance-covariance matrix of the score statistics is the information matrix I with elements

$$\mathcal{I}_{jk} = E[U_j U_k]$$

given by equation (4.20). If there is only one parameter $\beta$, the score statistic has the asymptotic sampling distribution

$$\frac{U}{\sqrt{\mathcal{I}}} \sim N(0, 1), \quad \text{or equivalently} \quad \frac{U^2}{\mathcal{I}} \sim \chi^2(1),$$

because E(U) = 0 and $\text{var}(U) = \mathcal{I}$. If there is a vector of parameters

$$\beta = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_p \end{bmatrix} \quad \text{then the score vector} \quad U = \begin{bmatrix} U_1 \\ \vdots \\ U_p \end{bmatrix}$$

has the multivariate Normal distribution $U \sim N(0, \mathcal{I})$, at least asymptotically, and so

$$U^T \mathcal{I}^{-1} U \sim \chi^2(p) \qquad (5.3)$$

for large samples.

5.2.1 Example: Score statistic for the Normal distribution

Let $Y_1, \ldots, Y_N$ be independent, identically distributed random variables with $Y_i \sim N(\mu, \sigma^2)$, where $\sigma^2$ is a known constant. The log-likelihood function is

$$l = -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - \mu)^2 - N \log(\sigma\sqrt{2\pi}).$$

The score statistic is

$$U = \frac{dl}{d\mu} = \frac{1}{\sigma^2} \sum (Y_i - \mu) = \frac{N}{\sigma^2} (\overline{Y} - \mu)$$

so the maximum likelihood estimator, obtained by solving the equation U = 0, is $\hat\mu = \overline{Y}$. The expected value of the statistic U is

$$E(U) = \frac{1}{\sigma^2} \sum \left[ E(Y_i) - \mu \right]$$

from equation (1.2). As $E(Y_i) = \mu$, it follows that E(U) = 0 as expected. The variance of U is

$$\mathcal{I} = \text{var}(U) = \frac{1}{\sigma^4} \sum \text{var}(Y_i) = \frac{N}{\sigma^2}$$

from equation (1.3) and $\text{var}(Y_i) = \sigma^2$. Therefore

$$\frac{U}{\sqrt{\mathcal{I}}} = \frac{\overline{Y} - \mu}{\sigma/\sqrt{N}}.$$

© 2002 by Chapman & Hall/CRC According to result (5.1) this has the asymptotic distribution N (0, 1). In fact, the result is exact because Y ∼ N(µ, σ2/N ) (see Exercise 1.4(a)). Similarly

$$U^T \mathcal{I}^{-1} U = \frac{U^2}{\mathcal{I}} = \frac{(\overline{Y} - \mu)^2}{\sigma^2/N} \sim \chi^2(1)$$

is an exact result. The sampling distribution of U can be used to make inferences about $\mu$. For example, a 95% confidence interval for $\mu$ is $\overline{y} \pm 1.96\,\sigma/\sqrt{N}$, where $\sigma$ is assumed to be known.

5.2.2 Example: Score statistic for the binomial distribution

If $Y \sim \text{binomial}(n, \pi)$ the log-likelihood function is

$$l(\pi; y) = y \log \pi + (n - y) \log(1 - \pi) + \log \binom{n}{y}$$

so the score statistic is

$$U = \frac{dl}{d\pi} = \frac{Y}{\pi} - \frac{n - Y}{1 - \pi} = \frac{Y - n\pi}{\pi(1 - \pi)}.$$

But $E(Y) = n\pi$ and so E(U) = 0 as expected. Also $\text{var}(Y) = n\pi(1 - \pi)$, so

$$\mathcal{I} = \text{var}(U) = \frac{1}{\pi^2(1 - \pi)^2} \text{var}(Y) = \frac{n}{\pi(1 - \pi)}$$

and hence

$$\frac{U}{\sqrt{\mathcal{I}}} = \frac{Y - n\pi}{\sqrt{n\pi(1 - \pi)}} \sim N(0, 1)$$

approximately. This is the Normal approximation to the binomial distribution (without any continuity correction). It is used to find confidence intervals for, and test hypotheses about, $\pi$.
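As a small numerical sketch (ours, not from the text), the statistic can be evaluated directly; the values n = 10 and y = 3 below are those used later in Exercise 5.1, with hypothesized value $\pi = 0.5$.

```python
# Sketch: score statistic for a single binomial observation.
import numpy as np

n, y, pi0 = 10, 3, 0.5                    # data and hypothesized probability
U = (y - n * pi0) / (pi0 * (1 - pi0))     # score dl/dpi evaluated at pi0
info = n / (pi0 * (1 - pi0))              # information, var(U)
z = U / np.sqrt(info)                     # approximately N(0, 1) under H0
print(z, z**2)                            # -1.265 and 1.6; compare 1.6 with chi2(1)
```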

5.3 Taylor series approximations

To obtain the asymptotic sampling distributions for various other statistics it is useful to use Taylor series approximations. The Taylor series approximation for a function f(x) of a single variable x about a value t is

$$f(x) = f(t) + (x - t)\left[\frac{df}{dx}\right]_{x=t} + \frac{1}{2}(x - t)^2 \left[\frac{d^2 f}{dx^2}\right]_{x=t} + \ldots$$

provided that x is near t.

For a log-likelihood function of a single parameter $\beta$ the first three terms of the Taylor series approximation near an estimate b are

$$l(\beta) = l(b) + (\beta - b)U(b) + \frac{1}{2}(\beta - b)^2 U'(b)$$

where $U(b) = dl/d\beta$ is the score function evaluated at $\beta = b$. If $U' = d^2l/d\beta^2$ is

© 2002 by Chapman & Hall/CRC approximated by its expected value E(U )=−I, the approximation becomes 1 l(β)=l(b)+(β − b)U(b) − (β − b)2I(b) 2 where I(b) is the information evaluated at β = b. The corresponding approx- imation for the log-likelihood function for a vector parameter β is 1 l(β)=l(b)+(β − b)T U(b) − (β − b)T I(b)(β − b) (5.4) 2 where U is the vector ofscores and I is the information matrix. For the score function of a single parameter β the first two terms ofthe Taylor series approximation near an estimate b give U(β)=U(b)+(β − b)U (b). If U  is approximated by E(U )=−I we obtain U(β)=U(b) − (β − b)I(b). The corresponding expression for a vector parameter β is U(β)=U(b) − I(b)(β − b). (5.5)

5.4 Sampling distribution for maximum likelihood estimators

Equation (5.5) can be used to obtain the sampling distribution of the maximum likelihood estimator $b = \hat\beta$. By definition, b is the estimator which maximizes l(b), and so U(b) = 0. Therefore

$$U(\beta) = -\mathcal{I}(b)(\beta - b)$$

or equivalently,

$$(b - \beta) = \mathcal{I}^{-1} U,$$

provided that $\mathcal{I}$ is non-singular. If $\mathcal{I}$ is regarded as constant then $E(b - \beta) = 0$ because E(U) = 0 by equation (5.2). Therefore $E(b) = \beta$, at least asymptotically, so b is a consistent estimator of $\beta$. The variance-covariance matrix for b is

$$E\left[ (b - \beta)(b - \beta)^T \right] = \mathcal{I}^{-1} E(UU^T)\, \mathcal{I}^{-1} = \mathcal{I}^{-1} \qquad (5.6)$$

because $\mathcal{I} = E(UU^T)$ and $(\mathcal{I}^{-1})^T = \mathcal{I}^{-1}$ as $\mathcal{I}$ is symmetric. The asymptotic sampling distribution for b, by (5.1), is

$$(b - \beta)^T \mathcal{I}(b)(b - \beta) \sim \chi^2(p). \qquad (5.7)$$

This is the Wald statistic. For the one-parameter case, the more commonly used form is

$$b \sim N(\beta, \mathcal{I}^{-1}). \qquad (5.8)$$

If the response variables in the generalized linear model are Normally distributed then (5.7) and (5.8) are exact results (see Example 5.4.1 below).

5.4.1 Example: Maximum likelihood estimators for the Normal linear model

Consider the model

$$E(Y_i) = \mu_i = x_i^T \beta; \quad Y_i \sim N(\mu_i, \sigma^2) \qquad (5.9)$$

where the $Y_i$'s are N independent random variables and $\beta$ is a vector of p parameters (p < N). Here the link is the identity function, so $\partial\mu_i/\partial\eta_i = 1$ and, from equation (4.20), the information matrix has elements

$$\mathcal{I}_{jk} = \sum_{i=1}^{N} \frac{x_{ij} x_{ik}}{\sigma^2}$$

because $\text{var}(Y_i) = \sigma^2$. Therefore the information matrix can be written as

$$\mathcal{I} = \frac{1}{\sigma^2} X^T X. \qquad (5.10)$$

Similarly the expression in (4.24) has the simpler form

$$z_i = \sum_{k=1}^{p} x_{ik} b_k^{(m-1)} + (y_i - \mu_i).$$

But $\mu_i$ evaluated at $b^{(m-1)}$ is $x_i^T b^{(m-1)} = \sum_{k=1}^{p} x_{ik} b_k^{(m-1)}$. Therefore $z_i = y_i$ in this case. The estimating equation (4.25) is

$$\frac{1}{\sigma^2} X^T X b = \frac{1}{\sigma^2} X^T y$$

and hence the maximum likelihood estimator is

$$b = (X^T X)^{-1} X^T y. \qquad (5.11)$$

The model (5.9) can be written in vector notation as $y \sim N(X\beta, \sigma^2 I)$, where I is the N × N unit matrix with ones on the diagonal and zeros elsewhere. From (5.11),

$$E(b) = (X^T X)^{-1} X^T X \beta = \beta,$$

so b is an unbiased estimator of $\beta$. To obtain the variance-covariance matrix for b we use

$$b - \beta = (X^T X)^{-1} X^T y - \beta = (X^T X)^{-1} X^T (y - X\beta).$$

Hence

$$E\left[ (b - \beta)(b - \beta)^T \right] = (X^T X)^{-1} X^T E\left[ (y - X\beta)(y - X\beta)^T \right] X (X^T X)^{-1}$$
$$= (X^T X)^{-1} X^T \left[ \text{var}(y) \right] X (X^T X)^{-1} = \sigma^2 (X^T X)^{-1}.$$

But $\sigma^2 (X^T X)^{-1} = \mathcal{I}^{-1}$ from (5.10), so the variance-covariance matrix for b is $\mathcal{I}^{-1}$ as in (5.6).

The maximum likelihood estimator b is a linear combination of the elements $Y_i$ of y, from (5.11). As the $Y_i$'s are Normally distributed, from the results in Section 1.4.1, the elements of b are also Normally distributed. Hence the exact sampling distribution of b, in this case, is

$$b \sim N(\beta, \mathcal{I}^{-1}) \quad \text{or} \quad (b - \beta)^T \mathcal{I} (b - \beta) \sim \chi^2(p).$$

5.5 Log-likelihood ratio statistic

One way of assessing the adequacy of a model is to compare it with a more general model with the maximum number of parameters that can be estimated. This is called a saturated model. It is a generalized linear model with the same distribution and same link function as the model of interest.

If there are N observations $Y_i$, $i = 1, \ldots, N$, all with potentially different values for the linear component $x_i^T \beta$, then a saturated model can be specified with N parameters. This is also called a maximal or full model.

If some of the observations have the same linear component or covariate pattern, i.e., they correspond to the same combination of factor levels and have the same values of any continuous explanatory variables, they are called replicates. In this case, the maximum number of parameters that can be estimated for the saturated model is equal to the number of potentially different linear components, which may be less than N.

In general, let m denote the maximum number of parameters that can be estimated. Let $\beta_{\max}$ denote the parameter vector for the saturated model and $b_{\max}$ denote the maximum likelihood estimator of $\beta_{\max}$. The likelihood function for the saturated model evaluated at $b_{\max}$, $L(b_{\max}; y)$, will be larger than any other likelihood function for these observations, with the same assumed distribution and link function, because it provides the most complete description of the data. Let $L(b; y)$ denote the maximum value of the likelihood function for the model of interest. Then the likelihood ratio

$$\lambda = \frac{L(b_{\max}; y)}{L(b; y)}$$

provides a way of assessing the goodness of fit for the model. In practice, the logarithm of the likelihood ratio, which is the difference between the log-

likelihood functions,

$$\log \lambda = l(b_{\max}; y) - l(b; y),$$

is used. Large values of $\log\lambda$ suggest that the model of interest is a poor description of the data relative to the saturated model. To determine the critical region for $\log\lambda$ we need its sampling distribution.

In the next section we see that $2\log\lambda$ has a chi-squared distribution. Therefore $2\log\lambda$ rather than $\log\lambda$ is the more commonly used statistic. It was called the deviance by Nelder and Wedderburn (1972).

5.6 Sampling distribution for the deviance

The deviance, also called the log likelihood (ratio) statistic, is

$$D = 2[l(b_{\max}; y) - l(b; y)].$$

From equation (5.4), if b is the maximum likelihood estimator of the parameter $\beta$ (so that U(b) = 0),

$$l(\beta) - l(b) = -\frac{1}{2}(\beta - b)^T \mathcal{I}(b)(\beta - b)$$

approximately. Therefore the statistic

$$2[l(b; y) - l(\beta; y)] = (\beta - b)^T \mathcal{I}(b)(\beta - b)$$

has the chi-squared distribution $\chi^2(p)$, where p is the number of parameters, from (5.7). From this result the sampling distribution for the deviance can be derived:

$$D = 2[l(b_{\max}; y) - l(b; y)]$$
$$= 2[l(b_{\max}; y) - l(\beta_{\max}; y)] - 2[l(b; y) - l(\beta; y)] + 2[l(\beta_{\max}; y) - l(\beta; y)]. \qquad (5.12)$$

The first term in square brackets in (5.12) has the distribution $\chi^2(m)$, where m is the number of parameters in the saturated model. The second term has the distribution $\chi^2(p)$, where p is the number of parameters in the model of interest. The third term, $\upsilon = 2[l(\beta_{\max}; y) - l(\beta; y)]$, is a positive constant which will be near zero if the model of interest fits the data almost as well as the saturated model fits. Therefore the sampling distribution of the deviance is, approximately,

$$D \sim \chi^2(m - p, \upsilon)$$

where $\upsilon$ is the non-centrality parameter, by the results in Section 1.5. The deviance forms the basis for most hypothesis testing for generalized linear models. This is described in Section 5.7.

If the response variables $Y_i$ are Normally distributed then D has a chi-squared distribution exactly. In this case, however, D depends on $\text{var}(Y_i) = \sigma^2$ which, in practice, is usually unknown. This means that D cannot be used directly as a goodness of fit statistic (see Example 5.6.2).

© 2002 by Chapman & Hall/CRC For Yi’s with other distributions, the sampling distribution of D may be only approximately chi-squared. However for the binomial and Poisson distri- butions, for example, D can be calculated and used directly as a goodness of fit statistic (see Example 5.6.1 and 5.6.3).

5.6.1 Example: Deviance for a binomial model

If the response variables $Y_1, \ldots, Y_N$ are independent and $Y_i \sim \text{binomial}(n_i, \pi_i)$, then the log-likelihood function is

$$l(\beta; y) = \sum_{i=1}^{N} \left[ y_i \log \pi_i - y_i \log(1 - \pi_i) + n_i \log(1 - \pi_i) + \log \binom{n_i}{y_i} \right].$$

For a saturated model, the $\pi_i$'s are all different, so $\beta = [\pi_1, \ldots, \pi_N]^T$. The maximum likelihood estimates are $\hat\pi_i = y_i/n_i$, so the maximum value of the log-likelihood function is

$$l(b_{\max}; y) = \sum \left[ y_i \log\left(\frac{y_i}{n_i}\right) - y_i \log\left(\frac{n_i - y_i}{n_i}\right) + n_i \log\left(\frac{n_i - y_i}{n_i}\right) + \log \binom{n_i}{y_i} \right].$$

For any other model with p < N parameters, let $\hat\pi_i$ denote the maximum likelihood estimates for the probabilities and let $\hat y_i = n_i \hat\pi_i$ denote the fitted values. Then the deviance is

$$D = 2[l(b_{\max}; y) - l(b; y)] = 2 \sum_{i=1}^{N} \left[ y_i \log\left(\frac{y_i}{\hat y_i}\right) + (n_i - y_i) \log\left(\frac{n_i - y_i}{n_i - \hat y_i}\right) \right].$$
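This deviance is straightforward to compute once the fitted values $\hat y_i = n_i \hat\pi_i$ are available. The sketch below is ours, not from the text; it assumes no $y_i$ equals 0 or $n_i$, cases which need the convention $0 \log 0 = 0$.

```python
# Sketch: deviance for a binomial model, given fitted probabilities pihat.
import numpy as np

def binomial_deviance(y, n, pihat):
    """y: successes, n: trials, pihat: fitted probabilities (arrays)."""
    y, n, pihat = map(np.asarray, (y, n, pihat))
    yhat = n * pihat                              # fitted values
    return 2 * np.sum(y * np.log(y / yhat)
                      + (n - y) * np.log((n - y) / (n - yhat)))
```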

5.6.2 Example: Deviance for a Normal linear model

Consider the model

$$E(Y_i) = \mu_i = x_i^T \beta; \quad Y_i \sim N(\mu_i, \sigma^2), \quad i = 1, \ldots, N,$$

where the $Y_i$'s are independent. The log-likelihood function is

$$l(\beta; y) = -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - \mu_i)^2 - \frac{1}{2} N \log(2\pi\sigma^2).$$

For a saturated model all the $\mu_i$'s can be different, so $\beta$ has N elements $\mu_1, \ldots, \mu_N$. By differentiating the log-likelihood function with respect to each $\mu_i$ and solving the estimating equations, we obtain $\hat\mu_i = y_i$. Therefore the maximum value of the log-likelihood function for the saturated model is

$$l(b_{\max}; y) = -\frac{1}{2} N \log(2\pi\sigma^2).$$

© 2002 by Chapman & Hall/CRC For any other model with p

$$l(b; y) = -\frac{1}{2\sigma^2} \sum \left( y_i - x_i^T b \right)^2 - \frac{1}{2} N \log(2\pi\sigma^2).$$

Therefore the deviance is

$$D = 2[l(b_{\max}; y) - l(b; y)] = \frac{1}{\sigma^2} \sum_{i=1}^{N} (y_i - x_i^T b)^2 \qquad (5.13)$$
$$= \frac{1}{\sigma^2} \sum_{i=1}^{N} (y_i - \hat\mu_i)^2 \qquad (5.14)$$

where $\hat\mu_i$ denotes the fitted value $x_i^T b$.

In the particular case where there is only one parameter, for example when $E(Y_i) = \mu$ for all i, X is a vector of N ones and so $b = \hat\mu = \sum_{i=1}^{N} y_i / N = \overline{y}$, and $\hat\mu_i = \overline{y}$ for all i. Therefore

$$D = \frac{1}{\sigma^2} \sum_{i=1}^{N} (y_i - \overline{y})^2.$$

But this statistic is related to the sample variance $S^2$:

$$S^2 = \frac{1}{N - 1} \sum_{i=1}^{N} (y_i - \overline{y})^2 = \frac{\sigma^2 D}{N - 1}.$$

From Exercise 1.4(d), $(N - 1)S^2/\sigma^2 \sim \chi^2(N - 1)$, so $D \sim \chi^2(N - 1)$ exactly.

More generally, from (5.13),

$$D = \frac{1}{\sigma^2} \sum (y_i - x_i^T b)^2 = \frac{1}{\sigma^2} (y - Xb)^T (y - Xb)$$

where the design matrix X has rows $x_i^T$. The term $(y - Xb)$ can be written as

$$y - Xb = y - X(X^T X)^{-1} X^T y = \left[ I - X(X^T X)^{-1} X^T \right] y = [I - H] y$$

where $H = X(X^T X)^{-1} X^T$, which is called the 'hat' matrix. Therefore the quadratic form in D can be written as

$$(y - Xb)^T (y - Xb) = \left\{ [I - H] y \right\}^T [I - H] y = y^T [I - H] y$$

because H is idempotent (i.e., $H = H^T$ and $HH = H$). The rank of I is N and the rank of H is p, so the rank of I − H is N − p, and so, from Section 1.4.2, part

8, D has a chi-squared distribution with N − p degrees of freedom and non-centrality parameter $\lambda = (X\beta)^T (I - H)(X\beta)/\sigma^2$. But $(I - H)X = 0$, so D has the central distribution $\chi^2(N - p)$ exactly (for more details, see Graybill, 1976).

The term scaled deviance is sometimes used for

$$\sigma^2 D = \sum (y_i - \hat\mu_i)^2.$$

If the model fits the data well, then $D \sim \chi^2(N - p)$. The expected value of a random variable with the distribution $\chi^2(N - p)$ is N − p (from Section 1.4.2, part 2), so the expected value of D is N − p. This provides an estimate of $\sigma^2$ as

$$\hat\sigma^2 = \frac{\sum (y_i - \hat\mu_i)^2}{N - p}.$$

Some statistical programs, such as Glim, output the scaled deviance for a Normal linear model and call $\hat\sigma^2$ the scale parameter.

The deviance is also related to the sum of squares of the standardized residuals (see Section 2.3.4)

$$\sum_{i=1}^{N} r_i^2 = \frac{1}{\hat\sigma^2} \sum_{i=1}^{N} (y_i - \hat\mu_i)^2$$

where $\hat\sigma^2$ is an estimate of $\sigma^2$. This provides a rough rule of thumb for the overall magnitude of the standardized residuals. If the model fits well, so that $D \sim \chi^2(N - p)$, you could expect $\sum r_i^2 = N - p$, approximately.

5.6.3 Example: Deviance for a Poisson model

If the response variables $Y_1, \ldots, Y_N$ are independent and $Y_i \sim \text{Poisson}(\lambda_i)$, the log-likelihood function is

$$l(\beta; y) = \sum y_i \log \lambda_i - \sum \lambda_i - \sum \log y_i!.$$

For the saturated model, the $\lambda_i$'s are all different, so $\beta = [\lambda_1, \ldots, \lambda_N]^T$. The maximum likelihood estimates are $\hat\lambda_i = y_i$, and so the maximum value of the log-likelihood function is

$$l(b_{\max}; y) = \sum y_i \log y_i - \sum y_i - \sum \log y_i!.$$

Suppose the model of interest has p < N parameters. The maximum likelihood estimates can be used to calculate fitted values $\hat y_i = \hat\lambda_i$, and the deviance is

$$D = 2[l(b_{\max}; y) - l(b; y)] = 2\left[ \sum y_i \log\left(\frac{y_i}{\hat y_i}\right) - \sum (y_i - \hat y_i) \right].$$

© 2002 by Chapman & Hall/CRC   For most models it can shown that yi = yi – see Exercise 9.1. Therefore D can be written in the form  D =2 oi log(oi/ei)

if $o_i$ is used to denote the observed value $y_i$ and $e_i$ is used to denote the estimated expected value $\hat y_i$. The value of D can be calculated from the data in this case (unlike the case for the Normal distribution, where D depends on the unknown constant $\sigma^2$). This value can be compared with the distribution $\chi^2(N - p)$. The following example illustrates the idea.

The data in Table 5.1 relate to Example 4.4 where a straight line was fitted to Poisson responses. The fitted values are

$$\hat y_i = b_1 + b_2 x_i$$

where $b_1 = 7.45163$ and $b_2 = 4.93530$ (from Table 4.4). The value of D is

$$D = 2 \times (0.94735 - 0) = 1.8947,$$

which is small relative to the degrees of freedom, $N - p = 9 - 2 = 7$. In fact, D is below the lower 5% tail of the distribution $\chi^2(7)$, indicating that the model fits the data well – perhaps not surprisingly for such a small set of artificial data!

Table 5.1 Results from the Poisson regression Example 4.4.

$x_i$    $y_i$   $\hat y_i$   $y_i \log(y_i/\hat y_i)$
−1       2       2.51633      −0.45931
−1       3       2.51633       0.52743
 0       6       7.45163      −1.30004
 0       7       7.45163      −0.43766
 0       8       7.45163       0.56807
 0       9       7.45163       1.69913
 1       10      12.38693     −2.14057
 1       12      12.38693     −0.38082
 1       15      12.38693      2.87112
Total    72      72            0.94735
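The calculation in Table 5.1 is easily verified; the following sketch (ours) recomputes D from the fitted line of Example 4.4 and locates it in the $\chi^2(7)$ distribution.

```python
# Sketch: Poisson deviance D = 2 * sum o_i log(o_i / e_i) for Example 4.4.
import numpy as np
from scipy import stats

x = np.array([-1, -1, 0, 0, 0, 0, 1, 1, 1], dtype=float)
o = np.array([2, 3, 6, 7, 8, 9, 10, 12, 15], dtype=float)  # observed values
e = 7.45163 + 4.93530 * x                                   # fitted values
D = 2 * np.sum(o * np.log(o / e))
print(D)                                  # 1.8947
print(stats.chi2.cdf(D, df=9 - 2))        # below 0.05: lower 5% tail of chi2(7)
```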

5.7 Hypothesis testing

Hypotheses about a parameter vector $\beta$ of length p can be tested using the sampling distribution of the Wald statistic $(\hat\beta - \beta)^T \mathcal{I} (\hat\beta - \beta) \sim \chi^2(p)$, from (5.7). Occasionally the score statistic is used: $U^T \mathcal{I}^{-1} U \sim \chi^2(p)$, from (5.3). An alternative approach, outlined in Section 5.1 and used in Chapter 2, is to compare the goodness of fit of two models. The models need to be nested or hierarchical, that is, they have the same probability distribution and the same link function, but the linear component of the simpler model M0 is a special case of the linear component of the more general model M1.

© 2002 by Chapman & Hall/CRC Consider the null hypothesis   β1  .  H0 : β = β0 =  .  βq

corresponding to model M0 and a more general hypothesis   β1  .  H1 : β = β1 =  .  βp

corresponding to M1, with q

$$D = D_0 - D_1 = 2[l(b_{\max}; y) - l(b_0; y)] - 2[l(b_{\max}; y) - l(b_1; y)] = 2[l(b_1; y) - l(b_0; y)].$$

If both models describe the data well, then $D_0 \sim \chi^2(N - q)$ and $D_1 \sim \chi^2(N - p)$, so that $D \sim \chi^2(p - q)$, provided that certain independence conditions hold.

If the value of D is consistent with the $\chi^2(p - q)$ distribution we would generally choose the model M0 corresponding to H0 because it is simpler. If the value of D is in the critical region (i.e., greater than the upper tail $100 \times \alpha$% point of the $\chi^2(p - q)$ distribution), then we would reject H0 in favor of H1 on the grounds that model M1 provides a significantly better description of the data (even though it too may not fit the data particularly well).

Provided that the deviance can be calculated from the data, D provides a good method for hypothesis testing. The sampling distribution of D is usually better approximated by the chi-squared distribution than is the sampling distribution of a single deviance.

For models based on the Normal distribution, or other distributions with nuisance parameters that are not estimated, the deviance may not be fully determined from the data. The following example shows how this problem may be overcome.

5.7.1 Example: Hypothesis testing for a Normal linear model

For the Normal linear model

$$E(Y_i) = \mu_i = x_i^T \beta; \quad Y_i \sim N(\mu_i, \sigma^2),$$

for independent random variables $Y_1, \ldots, Y_N$, the deviance is

$$D = \frac{1}{\sigma^2} \sum_{i=1}^{N} (y_i - \hat\mu_i)^2,$$

from equation (5.14).

© 2002 by Chapman & Hall/CRC Let µi(0) and µi(1) denote the fitted values for model M0 (corresponding to null hypothesis H0) and model M1(corresponding to the alternative hypothesis H1) respectively. Then

$$D_0 = \frac{1}{\sigma^2} \sum_{i=1}^{N} \left[ y_i - \hat\mu_i(0) \right]^2 \quad \text{and} \quad D_1 = \frac{1}{\sigma^2} \sum_{i=1}^{N} \left[ y_i - \hat\mu_i(1) \right]^2.$$

It is usual to assume that M1 fits the data well (and so H1 is correct), so that $D_1 \sim \chi^2(N - p)$. If M0 also fits well, then $D_0 \sim \chi^2(N - q)$, and so $D = D_0 - D_1 \sim \chi^2(p - q)$. If M0 does not fit well (i.e., H0 is not correct), then D will have a non-central $\chi^2$ distribution. To eliminate the term $\sigma^2$ we use the ratio

$$F = \frac{D_0 - D_1}{p - q} \bigg/ \frac{D_1}{N - p} = \frac{\left\{ \sum \left[ y_i - \hat\mu_i(0) \right]^2 - \sum \left[ y_i - \hat\mu_i(1) \right]^2 \right\} / (p - q)}{\sum \left[ y_i - \hat\mu_i(1) \right]^2 / (N - p)}.$$

Thus F can be calculated directly from the fitted values. If H0 is correct, F will have the central $F(p - q, N - p)$ distribution (at least approximately). If H0 is not correct, the value of F will be larger than expected from the distribution $F(p - q, N - p)$.

A numerical illustration is provided by the example on birthweights and gestational age in Section 2.2.2. The models are given in (2.6) and (2.7). The minimum values of the sums of squares are related to the deviances by $\hat S_0 = \sigma^2 D_0$ and $\hat S_1 = \sigma^2 D_1$. There are N = 24 observations. The simpler model (2.6) has q = 3 parameters to be estimated and the more general model (2.7) has p = 4 parameters to be estimated. From Table 2.5,

$$D_0 = 658770.8/\sigma^2 \text{ with } N - q = 21 \text{ degrees of freedom}$$

and $D_1 = 652424.5/\sigma^2$ with $N - p = 20$ degrees of freedom. Therefore

$$F = \frac{(658770.8 - 652424.5)/1}{652424.5/20} = 0.19,$$

which is certainly not significant compared to the F(1, 20) distribution. So the data are consistent with model (2.6), in which birthweight increases with gestational age at the same rate for boys and girls.
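The arithmetic of this example can be checked in a few lines; the sketch below (ours) uses the sums of squares quoted above and the central F distribution.

```python
# Sketch: F test comparing two nested Normal linear models (Section 5.7.1).
from scipy import stats

S0, S1 = 658770.8, 652424.5     # minimum sums of squares under H0 and H1
N, q, p = 24, 3, 4              # observations and numbers of parameters
F = ((S0 - S1) / (p - q)) / (S1 / (N - p))
print(F)                                 # approx 0.19
print(1 - stats.f.cdf(F, p - q, N - p))  # large p-value: no evidence against H0
```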

5.8 Exercises

5.1 Consider the single response variable Y with $Y \sim \text{binomial}(n, \pi)$.

© 2002 by Chapman & Hall/CRC (a) Find the Wald statistic (π − π)T I(π − π) where π is the maximum likelihood estimator of π and I is the information. − (b) Verify that the Wald statistic is the same as the score statistic U T I 1U in this case (see Example 5.2.2). (c) Find the deviance 2[l(π; y) − l(π; y)].

(d) For large samples, both the Wald/score statistic and the deviance approximately have the $\chi^2(1)$ distribution. For n = 10 and y = 3, use both statistics to assess the adequacy of the models: (i) $\pi = 0.1$; (ii) $\pi = 0.3$; (iii) $\pi = 0.5$. Do the two statistics lead to the same conclusions?

5.2 Consider a random sample Y1, ..., YN with the exponential distribution

$$f(y_i; \theta_i) = \theta_i \exp(-y_i \theta_i).$$

Derive the deviance by comparing the maximal model with different values of $\theta_i$ for each $Y_i$ and the model with $\theta_i = \theta$ for all i.

5.3 Suppose $Y_1, \ldots, Y_N$ are independent identically distributed random variables with the Pareto distribution with parameter $\theta$.

(a) Find the maximum likelihood estimator $\hat\theta$ of $\theta$.

(b) Find the Wald statistic for making inferences about $\theta$ (Hint: use the results from Exercise 3.10).

(c) Use the Wald statistic to obtain an expression for an approximate 95% confidence interval for $\theta$.

(d) Random variables Y with the Pareto distribution with the parameter $\theta$ can be generated from random numbers U which are uniformly distributed between 0 and 1 using the relationship $Y = (1/U)^{1/\theta}$ (Evans et al., 2000). Use this relationship to generate a sample of 100 values of Y with $\theta = 2$. From these data calculate an estimate $\hat\theta$. Repeat this process 20 times and also calculate 95% confidence intervals for $\theta$. Compare the average of the estimates $\hat\theta$ with $\theta = 2$. How many of the confidence intervals contain $\theta$?

5.4 For the leukemia survival data in Exercise 4.2:

(a) Use the Wald statistic to obtain an approximate 95% confidence interval for the parameter $\beta_1$.

(b) By comparing the deviances for two appropriate models, test the null hypothesis $\beta_2 = 0$ against the alternative hypothesis $\beta_2 \neq 0$. What can you conclude about the use of the initial white blood cell count as a predictor of survival time?

6 Normal Linear Models

6.1 Introduction

This chapter is about models of the form

$$E(Y_i) = \mu_i = x_i^T \beta; \quad Y_i \sim N(\mu_i, \sigma^2) \qquad (6.1)$$

where $Y_1, \ldots, Y_N$ are independent random variables. The link function is the identity function, i.e., $g(\mu_i) = \mu_i$. This model is usually written as

$$y = X\beta + e \qquad (6.2)$$

where

$$y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_N \end{bmatrix}, \quad X = \begin{bmatrix} x_1^T \\ \vdots \\ x_N^T \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ \vdots \\ e_N \end{bmatrix}$$

and the $e_i$'s are independently, identically distributed random variables with $e_i \sim N(0, \sigma^2)$ for $i = 1, \ldots, N$. Multiple linear regression, analysis of variance (ANOVA) and analysis of covariance (ANCOVA) are all of this form and together are sometimes called general linear models.

The coverage in this book is not detailed; rather, the emphasis is on those aspects which are particularly relevant for the model fitting approach to statistical analysis. Many books provide much more detail; for example, see Neter et al. (1996).

The chapter begins with a summary of basic results, mainly derived in previous chapters. Then the main issues are illustrated through four numerical examples.

6.2 Basic results

6.2.1 Maximum likelihood estimation

From Section 5.4.1, the maximum likelihood estimator of $\beta$ is given by

$$b = (X^T X)^{-1} X^T y, \qquad (6.3)$$

provided $(X^T X)$ is non-singular. As $E(b) = \beta$, the estimator is unbiased. It has variance-covariance matrix $\sigma^2 (X^T X)^{-1} = \mathcal{I}^{-1}$.

In the context of generalized linear models, $\sigma^2$ is treated as a nuisance parameter. However, it can be shown that

$$\hat\sigma^2 = \frac{1}{N - p} (y - Xb)^T (y - Xb) \qquad (6.4)$$

is an unbiased estimator of $\sigma^2$, and this can be used to estimate $\mathcal{I}$ and hence make inferences about b.

© 2002 by Chapman & Hall/CRC 6.2.2 Least squares estimation IfE( y)=Xb and E[(y − Xb)(y − Xb)T ]=V where V is known, we can obtain the least squares estimator β& of β without making any further assump- tions about the distribution of y. We minimize T −1 Sw =(y − Xb) V (y − Xb). The solution of ∂S w = −2XT V−1(y − Xb)=0 ∂β is β& =(XT V−1X)−1XT V−1y, provided the matrix inverses exist. In particular, for model (6.1), where the elements of y are independent and have a common variance then β& =(XT X)−1XT y. So in this case, maximum likelihood estimators and least squares estimators are the same.

6.2.3 Deviance

From Section 5.6.2,

$$D = \frac{1}{\sigma^2} (y - Xb)^T (y - Xb) = \frac{1}{\sigma^2} \left( y^T y - 2b^T X^T y + b^T X^T X b \right) = \frac{1}{\sigma^2} \left( y^T y - b^T X^T y \right) \qquad (6.5)$$

because $X^T X b = X^T y$ from equation (6.3).

6.2.4 Hypothesis testing

Consider a null hypothesis H0 and a more general hypothesis H1 specified as

$$H_0: \beta = \beta_0 = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_q \end{bmatrix} \quad \text{and} \quad H_1: \beta = \beta_1 = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}$$

where q < p < N. If $X_0$ and $X_1$ denote the corresponding design matrices, and $b_0$ and $b_1$ the maximum likelihood estimators, under H0 and H1 respectively, then the deviances $D_0$ and $D_1$ are given

© 2002 by Chapman & Hall/CRC Table 6.1 table. Source ofDegrees ofSumofsquaresMean square variance freedom

T T Model with β0 q b0 X0 y bT XT y − bT XT y Improvement due − T T − T T 1 1 0 0 p q b1 X1 y b0 X0 y to model with β1 p − q yT y − bT XT y Residual N − p yT y − bT XT y 1 1 1 1 N − p Total N yT y

by (6.5). As the model corresponding to H1 is more general, it is more likely to fit the data well, so we assume that $D_1$ has the central distribution $\chi^2(N - p)$. On the other hand, $D_0$ may have a non-central distribution $\chi^2(N - q, v)$ if H0 is not correct – see Section 5.6. In this case, $D = D_0 - D_1$ would have the non-central distribution $\chi^2(p - q, v)$ (provided appropriate conditions are satisfied – see Section 1.5). Therefore the statistic

$$F = \frac{D_0 - D_1}{p - q} \bigg/ \frac{D_1}{N - p} = \frac{b_1^T X_1^T y - b_0^T X_0^T y}{p - q} \bigg/ \frac{y^T y - b_1^T X_1^T y}{N - p}$$

will have the central distribution $F(p - q, N - p)$ if H0 is correct; otherwise F will have a non-central distribution. Therefore values of F that are large relative to the distribution $F(p - q, N - p)$ provide evidence against H0 (see Figure 2.5). This hypothesis test is often summarized by the Analysis of Variance table shown in Table 6.1.

6.2.5 Orthogonality

Usually inferences about a parameter for one explanatory variable depend on which other explanatory variables are included in the model. An exception is when the design matrix can be partitioned into components $X_1, \ldots, X_m$ corresponding to submodels of interest,

$$X = [X_1, \ldots, X_m] \quad \text{for } m \leq p,$$

where $X_j^T X_k = O$, a matrix of zeros, for each $j \neq k$. In this case, X is said to be orthogonal. Let $\beta$ have corresponding components $\beta_1, \ldots, \beta_m$ so that

$$E(y) = X\beta = X_1\beta_1 + X_2\beta_2 + \ldots + X_m\beta_m.$$

Typically, the components correspond to individual covariates or groups of associated explanatory variables such as dummy variables denoting levels of a factor. If X can be partitioned in this way, then $X^T X$ is the block diagonal

Table 6.2 Multiple hypothesis tests when the design matrix X is orthogonal.

Source of variance               Degrees of freedom         Sum of squares
Model corresponding to $H_1$     $p_1$                      $b_1^T X_1^T y$
$\vdots$                         $\vdots$                   $\vdots$
Model corresponding to $H_m$     $p_m$                      $b_m^T X_m^T y$
Residual                         $N - \sum_{j=1}^{m} p_j$   $y^T y - b^T X^T y$
Total                            N                          $y^T y$

matrix

$$X^T X = \begin{bmatrix} X_1^T X_1 & & O \\ & \ddots & \\ O & & X_m^T X_m \end{bmatrix}. \quad \text{Also} \quad X^T y = \begin{bmatrix} X_1^T y \\ \vdots \\ X_m^T y \end{bmatrix}.$$

Therefore the estimates $b_j = (X_j^T X_j)^{-1} X_j^T y$ are unaltered by the inclusion of other elements in the model, and also

$$b^T X^T y = b_1^T X_1^T y + \ldots + b_m^T X_m^T y.$$

Consequently, the hypotheses

H1 : β1 = 0, ..., Hm : βm = 0

can be tested independently, as shown in Table 6.2.

In practice, except for some well-designed experiments, the design matrix X is hardly ever orthogonal. Therefore inferences about any subset of parameters, $\beta_j$ say, depend on the order in which other terms are included in the model. To overcome this ambiguity many statistical programs provide tests based on all other terms being included before $X_j \beta_j$ is added. The resulting sums of squares and hypothesis tests are sometimes called Type III tests (if the tests depend on the sequential order of fitting terms they are called Type I).

6.2.6 Residuals

Corresponding to the model formulation (6.2), the residuals are defined as

$$\hat e_i = y_i - x_i^T b = y_i - \hat\mu_i$$

© 2002 by Chapman & Hall/CRC where µi is the fitted value. The variance-covariance matrix ofthe vector of residuals e is E(eeT ) = E[( y − Xb)( y − Xb)T ]     =EyyT − XE bbT XT   = σ2 I − X(XT X)−1XT where I is the unit matrix. So the standardized residuals are e r = i i 1/2 σ(1 − hii)

where $h_{ii}$ is the ith element on the diagonal of the projection or hat matrix $H = X(X^T X)^{-1} X^T$ and $\hat\sigma^2$ is an estimate of $\sigma^2$. These residuals should be used to check the adequacy of the fitted model using the various plots and other methods discussed in Section 2.3.4. These diagnostic tools include checking linearity of relationships between variables, serial independence of observations, Normality of residuals, and associations with other potential explanatory variables that are not included in the model.

6.2.7 Other diagnostics

In addition to residuals, there are numerous other methods to assess the adequacy of a model and to identify unusual or influential observations.

An outlier is an observation which is not well fitted by the model. An influential observation is one which has a relatively large effect on inferences based on the model. Influential observations may or may not be outliers and vice versa.

The value $h_{ii}$, the ith element on the diagonal of the hat matrix, is called the leverage of the ith observation. An observation with high leverage can make a substantial difference to the fit of the model. As a rule of thumb, if $h_{ii}$ is greater than two or three times p/N it may be a concern (where p is the number of parameters and N the number of observations).

Measures which combine standardized residuals and leverage include

$$\text{DFITS}_i = r_i \left( \frac{h_{ii}}{1 - h_{ii}} \right)^{1/2}$$

and Cook's distance

$$D_i = \frac{1}{p} \left( \frac{h_{ii}}{1 - h_{ii}} \right) r_i^2.$$

Large values of these statistics indicate that the ith observation is influential. Details of hypothesis tests for these and related statistics are given, for example, by Cook and Weisberg (1999).

Another approach to identifying influential observations is to fit a model with and without each observation and see what difference this makes to the estimates b and the overall goodness of fit statistics such as the deviance or

the minimum value of the sum of squares criterion. For example, the statistic delta-beta is defined by

$$\Delta_i \beta_j = b_j - b_{j(i)}$$

where $b_{j(i)}$ denotes the estimate of $\beta_j$ obtained when the ith observation is omitted from the data. These statistics can be standardized by dividing by their standard errors, and then they can be compared with the standard Normal distribution to identify unusually large ones. They can be plotted against the observation numbers i so that the 'offending' observations can be easily identified. The delta-betas can be combined over all parameters using

$$D_i = \frac{1}{p\hat\sigma^2} \left( b - b_{(i)} \right)^T X^T X \left( b - b_{(i)} \right)$$

where $b_{(i)}$ denotes the vector of estimates $b_{j(i)}$. This statistic is, in fact, equal to Cook's distance (Neter et al., 1996).

Similarly the influence of the ith observation on the deviance, called delta-deviance, can be calculated as the difference between the deviance for the model fitted from all the data and the deviance for the same model with the ith observation omitted.

For Normal linear models there are algebraic simplifications of these statistics which mean that, in fact, the models do not have to be refitted omitting one observation at a time. The statistics can be calculated easily and are provided routinely by most statistical software. An overview of these diagnostic tools is given in the article by Chatterjee and Hadi (1986).

Once an influential observation or an outlier is detected, the first step is to determine whether it might be a measurement error, transcription error or some other mistake. It should be removed from the data set only if there is a good substantive reason for doing so. Otherwise a possible solution is to retain it and report the results that are obtained with and without its inclusion in the calculations.
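For a Normal linear model all of these quantities can be computed from the hat matrix without refitting. The following is a minimal sketch (ours; `X` and `y` stand for any design matrix and response vector) collecting the leverages, standardized residuals and Cook's distances described above.

```python
# Sketch: regression diagnostics for a Normal linear model.
import numpy as np

def diagnostics(X, y):
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T     # 'hat' (projection) matrix
    e = y - H @ y                            # residuals y_i - muhat_i
    s2 = e @ e / (n - p)                     # sigma^2-hat, equation (6.4)
    h = np.diag(H)                           # leverages h_ii
    r = e / np.sqrt(s2 * (1 - h))            # standardized residuals
    cook = (r**2 / p) * h / (1 - h)          # Cook's distance
    return h, r, cook
```

A rule-of-thumb scan would then flag observations with leverage above 2p/n or 3p/n, or with unusually large Cook's distance.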

6.3 Multiple linear regression

If the explanatory variables are all continuous, the design matrix has a column of ones, corresponding to an intercept term in the linear component, and all the other elements are observed values of the explanatory variables. Multiple linear regression is the simplest Normal linear model for this situation. The following example provides an illustration.

6.3.1 Carbohydrate diet

The data in Table 6.3 show responses, percentages of total calories obtained from complex carbohydrates, for twenty male insulin-dependent diabetics who had been on a high-carbohydrate diet for six months. Compliance with the regime was thought to be related to age (in years), body weight (relative to

© 2002 by Chapman & Hall/CRC Table 6.3 Carbohydrate, age, relative weight and protein for twenty male insulin- dependent diabetics; for units, see text (data from K. Webb, personal communica- tion). Carbohydrate Age Weight Protein yx1 x2 x3 33 33 100 14 40 47 92 15 37 49 135 18 27 35 144 12 30 46 140 15 43 52 101 15 34 62 95 14 48 23 101 17 30 32 98 15 38 42 105 14 50 31 108 17 51 61 85 19 30 63 130 19 36 40 127 20 41 50 109 15 42 64 107 16 46 56 117 18 24 61 100 13 35 48 118 18 37 28 102 14

'ideal' weight for height) and other components of the diet, such as the percentage of calories as protein. These other variables are treated as explanatory variables. We begin by fitting the model

$$E(Y_i) = \mu_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3}; \quad Y_i \sim N(\mu_i, \sigma^2) \qquad (6.6)$$

in which carbohydrate Y is linearly related to age $x_1$, relative weight $x_2$ and protein $x_3$ ($i = 1, \ldots, N = 20$). In this case

$$y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_N \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_{11} & x_{12} & x_{13} \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_{N1} & x_{N2} & x_{N3} \end{bmatrix} \quad \text{and} \quad \beta = \begin{bmatrix} \beta_0 \\ \vdots \\ \beta_3 \end{bmatrix}.$$

For these data

$$X^T y = \begin{bmatrix} 752 \\ 34596 \\ 82270 \\ 12105 \end{bmatrix}$$

© 2002 by Chapman & Hall/CRC and   20 923 2214 318  923 45697 102003 14780  XT X =   .  2214 102003 250346 35306  318 14780 35306 5150 Therefore the solution of XT Xb = XT y is   36.9601  −0.1137  b =    −0.2280  1.9577 and   4.8158 −0.0113 −0.0188 −0.1362  −0.0113 0.0003 0.0000 −0.0004  (XT X)−1 =    −0.0188 0.0000 0.0002 −0.0002  −0.1362 −0.0004 −0.0002 0.0114 correct to four decimal places. Also yT y = 29368,Ny2 = 28275.2 and bT XT y = 28800.337. Using (6.4) to obtain an unbiased estimator of σ2 we getσ 2=35.479andhenceweobtainthestandarderrorsforelementsofb whichareshowninTable6.4.

Table 6.4 Estimates for model (6.6).

Term                      Estimate $b_j$*   Standard error
Constant                  36.960            13.071
Coefficient for age       −0.114            0.109
Coefficient for weight    −0.228            0.083
Coefficient for protein   1.958             0.635

*Values calculated using more significant figures for $(X^T X)^{-1}$ than shown above.

To illustrate the use of the deviance we test the hypothesis, H0, that the response does not depend on age, i.e., $\beta_1 = 0$. The corresponding model is

$$E(Y_i) = \beta_0 + \beta_2 x_{i2} + \beta_3 x_{i3}. \qquad (6.7)$$

The matrix X for this model is obtained from the previous one by omitting the second column, so that

$$X^T y = \begin{bmatrix} 752 \\ 82270 \\ 12105 \end{bmatrix}, \quad X^T X = \begin{bmatrix} 20 & 2214 & 318 \\ 2214 & 250346 & 35306 \\ 318 & 35306 & 5150 \end{bmatrix}$$

and hence

$$b = \begin{bmatrix} 33.130 \\ -0.222 \\ 1.824 \end{bmatrix}.$$

© 2002 by Chapman & Hall/CRC T T For model (6.7), b X y = 28761.978. The significance test for H0 is sum- marizedinTable6.5.ThevalueF=38.36/35.48=1.08isnotsignificantcom- pared with the F (1, 16) distribution so the data provide no evidence against H0, i.e., the response appears to be unrelated to age.

Table 6.5 Analysis of Variance table comparing models (6.6) and (6.7).

Source of variation              Degrees of freedom   Sum of squares   Mean square
Model (6.7)                      3                    28761.978
Improvement due to model (6.6)   1                    38.359           38.36
Residual                         16                   567.663          35.48
Total                            20                   29368.000

Notice that the parameter estimates for models (6.6) and (6.7) differ; for example, the coefficient for protein is 1.958 for the model including a term for age but 1.824 when the age term is omitted. This is an example of lack of orthogonality. It is illustrated further in Exercise 6.3(c), as the ANOVA table for testing the hypothesis that the coefficient for age is zero when both weight and protein are in the model, Table 6.5, differs from the ANOVA table when weight is not included.
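The matrix calculations for this example are easy to reproduce. The sketch below (ours, not from the text) fits model (6.6) to the data of Table 6.3 via $b = (X^T X)^{-1} X^T y$ and recovers the estimates and standard errors of Table 6.4.

```python
# Sketch: multiple linear regression for the carbohydrate data (Table 6.3).
import numpy as np

carb = np.array([33,40,37,27,30,43,34,48,30,38,50,51,30,36,41,42,46,24,35,37], float)
age  = np.array([33,47,49,35,46,52,62,23,32,42,31,61,63,40,50,64,56,61,48,28], float)
wgt  = np.array([100,92,135,144,140,101,95,101,98,105,108,85,
                 130,127,109,107,117,100,118,102], float)
prot = np.array([14,15,18,12,15,15,14,17,15,14,17,19,19,20,15,16,18,13,18,14], float)

X = np.column_stack([np.ones(20), age, wgt, prot])
b = np.linalg.solve(X.T @ X, X.T @ carb)      # [36.960, -0.114, -0.228, 1.958]
e = carb - X @ b
s2 = e @ e / (20 - 4)                         # sigma^2-hat = 35.48, from (6.4)
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
print(b, s2, se)                              # compare with Table 6.4
```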

6.3.2 Coefficient of determination, $R^2$

A commonly used measure of goodness of fit for multiple linear regression models is based on a comparison with the simplest or minimal model using the least squares criterion (in contrast to the maximal model and the log-likelihood function, which are used to define the deviance). For the model specified in (6.2), the least squares criterion is

$$S = \sum_{i=1}^{N} e_i^2 = e^T e = (y - X\beta)^T (y - X\beta)$$

and, from Section 6.2.2, the least squares estimate is $b = (X^T X)^{-1} X^T y$, so the minimum value of S is

$$\hat S = (y - Xb)^T (y - Xb) = y^T y - b^T X^T y.$$

The simplest model is $E(Y_i) = \mu$ for all i. In this case, $\beta$ has the single element $\mu$ and X is a vector of N ones. So $X^T X = N$ and $X^T y = \sum y_i$, so that $b = \hat\mu = \overline{y}$. In this case, the value of S is

$$\hat S_0 = y^T y - N\overline{y}^2 = \sum (y_i - \overline{y})^2.$$

So $\hat S_0$ is proportional to the variance of the observations and it is the largest

© 2002 by Chapman & Hall/CRC or ‘worst possible’ value of S. The relative improvement in fit for any other model is S − S bT XT y−Ny2 R2 = 0 = . T − 2 S0 y y Ny R2 is called the coefficient of determination. It can be interpreted as the proportion ofthe total variation in the data which is explained by the model. For example, for the carbohydrate data R2 =0.48 for model (6.5), so 48% ofthe variation is ‘explained’ by the model. Ifthe term forage is dropped, for model (6.6) R2 =0.445, so 44.5% ofvariation is ‘explained’. Ifthe model does not fit the data much better than the minimal model then 2 S will be almost equal to S0 and R will be almost zero. On the other hand ifthe maximal model is fitted, with one parameter µi for each observation Yi, then β has N elements, X is the N × N unit matrix I and b = y (i.e., T T T µi = yi). So for the maximal model b X y = y y and hence S = 0 and R2 = 1, corresponding to a ‘perfect’ fit. In general, 0

6.3.3

Many applications of multiple linear regression involve numerous explanatory variables and it is important to identify a subset of these variables that provides a good, yet parsimonious, model for the response. The usual procedure is to add or delete terms sequentially from the model; this is called stepwise regression. Details of the methods are given in standard textbooks on regression such as Draper and Smith (1998) or Neter et al. (1996).

If some of the explanatory variables are highly correlated with one another, this is called collinearity or multicollinearity. This condition has several undesirable consequences. Firstly, the columns of the design matrix X may be nearly linearly dependent, so that $X^T X$ is nearly singular and the estimating equation $X^T X b = X^T y$ is ill-conditioned. This means that the solution b will be unstable in the sense that small changes in the data may cause large changes in b (see Section 6.2.7). Also at least some of the elements of $\sigma^2 (X^T X)^{-1}$ will be large, giving large variances or standard errors for elements of b. Secondly, collinearity means that choosing the best subset of explanatory variables may be difficult.

Collinearity can be detected by calculating the variance inflation factor for each explanatory variable,

$$\text{VIF}_j = \frac{1}{1 - R^2_{(j)}},$$

© 2002 by Chapman & Hall/CRC 2 where R(j) is the coefficient ofdetermination obtained fromregressing the jth explanatory variable against all the other explanatory variables. Ifit is uncorrelated with all the others then VIF = 1. VIF increases as the correlation increases. It is suggest, by Montgomery and Peck (1992) for example, that one should be concerned ifVIF > 5. Ifseveral explanatory variables are highly correlated it may be impossible, on statistical grounds alone, to determine which one should be included in the model. In this case extra information from the substantive area from which the data came, an alternative specification ofthe model or some other non- computational approach may be needed.

6.4 Analysis of variance

Analysis of variance is the term used for statistical methods for comparing means of groups of continuous observations where the groups are defined by the levels of factors. In this case all the explanatory variables are categorical and all the elements of the design matrix X are dummy variables. As illustrated in Example 2.4.3, the choice of dummy variables is, to some extent, arbitrary. An important consideration is the optimal choice of specification of X. The major issues are illustrated by two numerical examples with data from two (fictitious) designed experiments.

6.4.1 One factor analysis of variance

The data in Table 6.6 are similar to the plant weight data in Exercise 2.1. An experiment was conducted to compare yields $Y_i$ (measured by dried weight of plants) under a control condition and two different treatment conditions. Thus the response, dried weight, depends on one factor, growing condition, with three levels. We are interested in whether the response means differ among the groups.

More generally, if experimental units are randomly allocated to groups corresponding to J levels of a factor, this is called a completely randomized experiment. The data can be set out as shown in Table 6.7.

The responses at level j, $Y_{j1}, \ldots, Y_{jn_j}$, all have the same expected value and so they are called replicates. In general there may be different numbers of observations $n_j$ at each level. To simplify the discussion, suppose all the groups have the same sample size, so $n_j = K$ for $j = 1, \ldots, J$. The response y is the column vector of all N = JK measurements

$$y = [Y_{11}, Y_{12}, \ldots, Y_{1K}, Y_{21}, \ldots, Y_{2K}, \ldots, Y_{J1}, \ldots, Y_{JK}]^T.$$

We consider three different specifications of a model to test the hypothesis that the response means differ among the factor levels.

(a) The simplest specification is

$$E(Y_{jk}) = \mu_j \quad \text{for } j = 1, \ldots, J. \qquad (6.8)$$

© 2002 by Chapman & Hall/CRC Table 6.6 Dried weights yi of plants from three different growing conditions. Control Treatment A Treatment B 4.17 4.81 6.31 5.58 4.17 5.12 5.18 4.41 5.54 6.11 3.59 5.50 4.50 5.87 5.37 4.61 3.83 5.29 5.17 6.03 4.92 4.53 4.89 6.15 5.33 4.32 5.80 5.14 4.69 5.26   yi 50.32 46.61 55.26 2 yi 256.27 222.92 307.13

Table 6.7 Data from a completely randomized experiment with J levels of a factor A.

Factor level   $A_1$        $A_2$        $\cdots$   $A_J$
               $Y_{11}$     $Y_{21}$                $Y_{J1}$
               $Y_{12}$     $Y_{22}$                $Y_{J2}$
               $\vdots$     $\vdots$                $\vdots$
               $Y_{1n_1}$   $Y_{2n_2}$              $Y_{Jn_J}$
Total          $Y_{1.}$     $Y_{2.}$     $\cdots$   $Y_{J.}$

This can be written as

$$E(Y_i) = \sum_{j=1}^{J} x_{ij} \mu_j, \quad i = 1, \ldots, N,$$

where $x_{ij} = 1$ if response $Y_i$ corresponds to level $A_j$ and $x_{ij} = 0$ otherwise. Thus $E(y) = X\beta$ with

$$\beta = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_J \end{bmatrix} \quad \text{and} \quad X = \begin{bmatrix} \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{1} & & \vdots \\ \vdots & & \ddots & \mathbf{0} \\ \mathbf{0} & \cdots & \mathbf{0} & \mathbf{1} \end{bmatrix}$$

where $\mathbf{0}$ and $\mathbf{1}$ are vectors of length K of zeros and ones respectively, and O

© 2002 by Chapman & Hall/CRC indicates that the remaining terms ofthe matrix are all zeros. Then XT X is the J × J diagonal matrix   K    .  Y1.  .. O       Y2.  XT X =  K  and XT y =   .    .   .  .  ..  YJ. O K So from equation (6.3)     Y1. Y 1     1  Y2.   Y 2  b =  .  =  .  K  .   .  YJ. Y J and 1 J bT XT y = Y 2 . K j. j=1 T The fitted values are y =[y1, y1, ..., y1, y2, ..., yJ ] . The disadvantage of this simple formulation of the model is that it cannot be extended to more than one factor. To generalize further, we need to specify the model so that parameters forlevels and combinations oflevels offactorsreflect differential effects beyond some average or specified response. (b) The second model is one such formulation:

$$E(Y_{jk}) = \mu + \alpha_j, \quad j = 1, \ldots, J,$$

where $\mu$ is the average effect for all levels and $\alpha_j$ is an additional effect due to level $A_j$. For this parameterization there are J + 1 parameters:

$$\beta = \begin{bmatrix} \mu \\ \alpha_1 \\ \vdots \\ \alpha_J \end{bmatrix}, \quad X = \begin{bmatrix} \mathbf{1} & \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{1} & \mathbf{0} & \mathbf{1} & & \vdots \\ \vdots & \vdots & & \ddots & \mathbf{0} \\ \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{1} \end{bmatrix}$$

where $\mathbf{0}$ and $\mathbf{1}$ are vectors of length K and O denotes a matrix of zeros. Thus

$$X^T y = \begin{bmatrix} Y_{..} \\ Y_{1.} \\ \vdots \\ Y_{J.} \end{bmatrix} \quad \text{and} \quad X^T X = \begin{bmatrix} N & K & \cdots & K \\ K & K & & O \\ \vdots & & \ddots & \\ K & O & & K \end{bmatrix}.$$

© 2002 by Chapman & Hall/CRC The first row (or column) ofthe ( J +1)×(J +1) matrix XT X is the sum of the remaining rows (or columns) so XT X is singular and there is no unique solution ofthe normal equations XT Xb = XT y. The general solution can be written as       µ 0 −1        α1  1  Y1.   1  b =  .  =  .  − λ  .   .  K  .   .  αJ YJ. 1 where λ is an arbitrary constant. It is traditional to impose the additional sum-to-zero constraint J αj =0 j=1 so that 1 J Y − Jλ =0 K j. j=1 and hence 1 J Y λ = Y = .. . JK j. N j=1 This gives the solution Y Y Y µ = .. and α = j. − .. for j =1, ..., J. N j K N Hence

$$b^T X^T y = \frac{Y_{..}^2}{N} + \sum_{j=1}^{J} Y_{j.} \left( \frac{Y_{j.}}{K} - \frac{Y_{..}}{N} \right) = \frac{1}{K} \sum_{j=1}^{J} Y_{j.}^2,$$

which is the same as for the first version of the model, and the fitted values $\hat y = [\overline{y}_1, \overline{y}_1, \ldots, \overline{y}_J]^T$ are also the same. Sum-to-zero constraints are used in most standard statistical software.

(c) A third version of the model is $E(Y_{jk}) = \mu + \alpha_j$ with the constraint that $\alpha_1 = 0$. Thus $\mu$ represents the effect of the first level, and $\alpha_j$ measures the difference between the first level and the jth level of the factor. This is called a corner-point parameterization. For this version there are J parameters:

$$\beta = \begin{bmatrix} \mu \\ \alpha_2 \\ \vdots \\ \alpha_J \end{bmatrix}. \quad \text{Also} \quad X = \begin{bmatrix} \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{1} & \mathbf{1} & & \vdots \\ \vdots & & \ddots & \mathbf{0} \\ \mathbf{1} & \mathbf{0} & \cdots & \mathbf{1} \end{bmatrix},$$

© 2002 by Chapman & Hall/CRC     N K ... K Y..  KK       Y2.   . .  so XT y =   and XT X =  . .. O  .  .    .  .   . O  YJ. KK The J × J matrix XT X is non-singular so there is a unique solution   Y1.  −  1  Y2. Y1.  b =  .  K  .  YJ. − Y1. . $  %  T T 1 J − 1 J 2 Also b X y = K Y..Y1. + j=2 Yj.(Yj. Y1.) = K j=1 Yj. and the T fitted values y =[y1, y1, ..., yJ ] are the same as before. Thus, although the three specifications ofthe model differ, the value of bT XT y and hence   1   1 J K 1 J D = yT y − bT XT y =  Y 2 − Y 2 1 σ2 σ2 jk K j. j=1 k=1 j=1 is the same in each case. These three versions ofthe model all correspond to the hypothesis H1 that the response means for each level may differ. To compare this with the null hypothesis H0 that the means are all equal, we consider the model E(Yjk)=µ T T so that β =[µ] and X is a vector of N ones. Then X X = N,X y = Y.. and T T 2 hence b = µ = Y../N so that b X y = Y /N and  ..  1 J K Y 2 D =  Y 2 − ..  . 0 σ2 jk N j=1 k=1

To test H0 against H1 we assume that H1 is correct, so that $D_1 \sim \chi^2(N - J)$. If, in addition, H0 is correct, then $D_0 \sim \chi^2(N - 1)$; otherwise $D_0$ has a non-central chi-squared distribution. Thus if H0 is correct,

$$D_0 - D_1 = \frac{1}{\sigma^2} \left[ \frac{1}{K} \sum_{j=1}^{J} Y_{j.}^2 - \frac{1}{N} Y_{..}^2 \right] \sim \chi^2(J - 1)$$

and so

$$F = \frac{D_0 - D_1}{J - 1} \bigg/ \frac{D_1}{N - J} \sim F(J - 1, N - J).$$

If H0 is not correct then F is likely to be larger than predicted from the distribution $F(J - 1, N - J)$. Conventionally this hypothesis test is set out in an ANOVA table.

For the plant weight data,

$$\frac{Y_{..}^2}{N} = 772.0599, \quad \frac{1}{K} \sum_{j=1}^{J} Y_{j.}^2 = 775.8262,$$

so $D_0 - D_1 = 3.7663/\sigma^2$, and

$$\sum_{j=1}^{J} \sum_{k=1}^{K} Y_{jk}^2 = 786.3183,$$

so $D_1 = 10.4921/\sigma^2$. Hence the hypothesis test is summarized in Table 6.8.

Table 6.8 ANOVA table for plant weight data in Table 6.6.

Source of variation   Degrees of freedom   Sum of squares   Mean square   F
Mean                  1                    772.0599
Between treatments    2                    3.7663           1.883         4.85
Residual              27                   10.4921          0.389
Total                 30                   786.3183

Since F = 4.85 is significant at the 5% level when compared with the F(2, 27) distribution, we conclude that the group means differ.

To investigate this result further it is convenient to use the first version of the model (6.8), $E(Y_{jk}) = \mu_j$. The estimated means are

$$b = \begin{bmatrix} \hat\mu_1 \\ \hat\mu_2 \\ \hat\mu_3 \end{bmatrix} = \begin{bmatrix} 5.032 \\ 4.661 \\ 5.526 \end{bmatrix}.$$

If we use the estimator

$$\hat\sigma^2 = \frac{1}{N - J} (y - Xb)^T (y - Xb) = \frac{1}{N - J} \left( y^T y - b^T X^T y \right)$$

(equation (6.4)), we obtain $\hat\sigma^2 = 10.4921/27 = 0.389$ (i.e., the residual mean square in Table 6.8). The variance-covariance matrix of b is $\hat\sigma^2 (X^T X)^{-1}$ where

$$X^T X = \begin{bmatrix} 10 & 0 & 0 \\ 0 & 10 & 0 \\ 0 & 0 & 10 \end{bmatrix},$$

so the standard error of each element of b is $\sqrt{0.389/10} = 0.197$. Now it can be seen that the significant effect is due to the mean for treatment B,

© 2002 by Chapman & Hall/CRC µ3 =5.526, being significantly (more than two standard deviations) larger than the other two means. Note that ifseveral pairwise comparisons are made among elements of b, the standard errors should be adjusted to take account ofmultiple comparisons – see, forexample, Neter et al. (1996).

6.4.2 Two factor analysis of variance

Consider the fictitious data in Table 6.9, in which factor A (with J = 3 levels) and factor B (with K = 2 levels) are crossed so that there are JK subgroups formed by all combinations of A and B levels. In each subgroup there are L = 2 observations or replicates.

Table 6.9 Fictitious data for two-factor ANOVA with equal numbers of observations in each subgroup.

                     Levels of factor B
Levels of factor A   $B_1$       $B_2$       Total
$A_1$                6.8, 6.6    5.3, 6.1    24.8
$A_2$                7.5, 7.4    7.2, 6.5    28.6
$A_3$                7.8, 9.1    8.8, 9.1    34.8
Total                45.2        43.0        88.2

The main hypotheses are:

$H_I$: there are no interaction effects, i.e., the effects of A and B are additive;
$H_A$: there are no differences in response associated with different levels of factor A;
$H_B$: there are no differences in response associated with different levels of factor B.

Thus we need to consider a saturated model and three reduced models formed by omitting various terms from the saturated model.

1. The saturated model is

$$E(Y_{jkl}) = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk} \qquad (6.9)$$

where the terms $(\alpha\beta)_{jk}$ correspond to interaction effects and $\alpha_j$ and $\beta_k$ to main effects of the factors;

2. The additive model is

$$E(Y_{jkl}) = \mu + \alpha_j + \beta_k. \qquad (6.10)$$

This is compared to the saturated model to test hypothesis $H_I$.

3. The model formed by omitting effects due to B is

$$E(Y_{jkl}) = \mu + \alpha_j. \qquad (6.11)$$

This is compared to the additive model to test hypothesis HB.

4. The model formed by omitting effects due to A is

$$E(Y_{jkl}) = \mu + \beta_k. \qquad (6.12)$$

This is compared to the additive model to test hypothesis $H_A$.

The models (6.9) to (6.12) have too many parameters because replicates in the same subgroup have the same expected value, so there can be at most JK independent expected values, but the saturated model has $1 + J + K + JK = (J+1)(K+1)$ parameters. To overcome this difficulty (which leads to the singularity of $X^T X$), we can impose the extra constraints

α1 + α2 + α3 = 0,   β1 + β2 = 0,

(αβ)11 + (αβ)12 = 0,   (αβ)21 + (αβ)22 = 0,   (αβ)31 + (αβ)32 = 0,

(αβ)11 +(αβ)21 +(αβ)31 =0

(the remaining condition (αβ)12 + (αβ)22 + (αβ)32 = 0 follows from the last four equations). These are the conventional sum-to-zero constraint equations for ANOVA. Alternatively, we can take

α1 = β1 =(αβ)11 =(αβ)12 =(αβ)21 =(αβ)31 =0 as the corner-point constraints. In either case the numbers of(linearly) inde- pendent parameters are: 1 for µ, J − 1 for the αj’s, K − 1 for the βk’s, and − − (J 1)(K 1) for the (αβ)jk’s, giving a total of JK parameters. We will fit all four models using, for simplicity, the corner point constraints. The response vector is y =[6.8, 6.6, 5.3, 6.1, 7.5, 7.4, 7.2, 6.5, 7.8, 9.1, 8.8, 9.1]T and yT y = 664.1. For the saturated model (6.9) with constraints

α1 = β1 = (αβ)11 = (αβ)12 = (αβ)21 = (αβ)31 = 0,

\[ \boldsymbol{\beta} = \begin{bmatrix} \mu \\ \alpha_2 \\ \alpha_3 \\ \beta_2 \\ (\alpha\beta)_{22} \\ (\alpha\beta)_{32} \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix}
1&0&0&0&0&0\\ 1&0&0&0&0&0\\ 1&0&0&1&0&0\\ 1&0&0&1&0&0\\
1&1&0&0&0&0\\ 1&1&0&0&0&0\\ 1&1&0&1&1&0\\ 1&1&0&1&1&0\\
1&0&1&0&0&0\\ 1&0&1&0&0&0\\ 1&0&1&1&0&1\\ 1&0&1&1&0&1
\end{bmatrix}, \quad
\mathbf{X}^T\mathbf{y} = \begin{bmatrix} Y_{...} \\ Y_{2..} \\ Y_{3..} \\ Y_{.2.} \\ Y_{22.} \\ Y_{32.} \end{bmatrix} = \begin{bmatrix} 88.2 \\ 28.6 \\ 34.8 \\ 43.0 \\ 13.7 \\ 17.9 \end{bmatrix}, \]

\[ \mathbf{X}^T\mathbf{X} = \begin{bmatrix}
12&4&4&6&2&2\\ 4&4&0&2&2&0\\ 4&0&4&2&0&2\\ 6&2&2&6&2&2\\ 2&2&0&2&2&0\\ 2&0&2&2&0&2
\end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} 6.7 \\ 0.75 \\ 1.75 \\ -1.0 \\ 0.4 \\ 1.5 \end{bmatrix} \]

and bᵀXᵀy = 662.62.
For the additive model (6.10) with the constraints α1 = β1 = 0, the design matrix is obtained by omitting the last two columns of the design matrix for the saturated model. Thus

\[ \boldsymbol{\beta} = \begin{bmatrix} \mu\\ \alpha_2\\ \alpha_3\\ \beta_2 \end{bmatrix}, \quad
\mathbf{X}^T\mathbf{X} = \begin{bmatrix} 12&4&4&6\\ 4&4&0&2\\ 4&0&4&2\\ 6&2&2&6 \end{bmatrix}, \quad
\mathbf{X}^T\mathbf{y} = \begin{bmatrix} 88.2\\ 28.6\\ 34.8\\ 43.0 \end{bmatrix} \]

and hence

\[ \mathbf{b} = \begin{bmatrix} 6.383\\ 0.950\\ 2.500\\ -0.367 \end{bmatrix} \]

so that bᵀXᵀy = 661.4133.
For model (6.11), omitting the effects of levels of factor B and using the constraint α1 = 0, the design matrix is obtained by omitting the last three columns of the design matrix for the saturated model. Therefore

\[ \boldsymbol{\beta} = \begin{bmatrix} \mu\\ \alpha_2\\ \alpha_3 \end{bmatrix}, \quad
\mathbf{X}^T\mathbf{X} = \begin{bmatrix} 12&4&4\\ 4&4&0\\ 4&0&4 \end{bmatrix}, \quad
\mathbf{X}^T\mathbf{y} = \begin{bmatrix} 88.2\\ 28.6\\ 34.8 \end{bmatrix} \]

and hence b = [6.20, 0.95, 2.50]ᵀ so that bᵀXᵀy = 661.01.
The design matrix for model (6.12) with constraint β1 = 0 comprises the first and fourth columns of the design matrix for the saturated model. Therefore

\[ \boldsymbol{\beta} = \begin{bmatrix} \mu\\ \beta_2 \end{bmatrix}, \quad
\mathbf{X}^T\mathbf{X} = \begin{bmatrix} 12&6\\ 6&6 \end{bmatrix}, \quad
\mathbf{X}^T\mathbf{y} = \begin{bmatrix} 88.2\\ 43.0 \end{bmatrix} \]

and hence b = [7.533, −0.367]ᵀ so that bᵀXᵀy = 648.6733.

Finally, for the model with only a mean effect, E(Yjkl) = µ, the estimate is b = [µ̂] = 7.35 and so bᵀXᵀy = 648.27.
The results of these calculations are summarized in Table 6.10. The subscripts S, I, A, B and M refer to the saturated model, the models corresponding to HI, HA and HB, and the model with only the overall mean, respectively. The scaled deviances are the terms σ²D = yᵀy − bᵀXᵀy. The degrees of freedom, d.f., are given by N minus the number of parameters in the model.

Table 6.10 Summary of calculations for data in Table 6.9.

Model                         d.f.   bᵀXᵀy      Scaled deviance
µ + αj + βk + (αβ)jk          6      662.6200   σ²DS = 1.4800
µ + αj + βk                   8      661.4133   σ²DI = 2.6867
µ + αj                        9      661.0100   σ²DB = 3.0900
µ + βk                        10     648.6733   σ²DA = 15.4267
µ                             11     648.2700   σ²DM = 15.8300
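Table 6.10 can be reproduced with elementary matrix arithmetic. The following Python sketch is illustrative only (not part of the original text); it builds the corner-point design matrices directly and solves the normal equations:

    import numpy as np

    y = np.array([6.8, 6.6, 5.3, 6.1, 7.5, 7.4, 7.2, 6.5, 7.8, 9.1, 8.8, 9.1])
    A = np.repeat([1, 2, 3], 4)          # factor A level for each observation
    B = np.tile([1, 1, 2, 2], 3)         # factor B level for each observation

    # corner-point indicator columns and their products for the interactions
    cols = {"mu": np.ones(12),
            "a2": (A == 2).astype(float), "a3": (A == 3).astype(float),
            "b2": (B == 2).astype(float)}
    cols["ab22"] = cols["a2"] * cols["b2"]
    cols["ab32"] = cols["a3"] * cols["b2"]

    models = {"saturated": ["mu", "a2", "a3", "b2", "ab22", "ab32"],
              "additive":  ["mu", "a2", "a3", "b2"],
              "A only":    ["mu", "a2", "a3"],
              "B only":    ["mu", "b2"],
              "mean only": ["mu"]}

    for name, terms in models.items():
        X = np.column_stack([cols[t] for t in terms])
        b = np.linalg.solve(X.T @ X, X.T @ y)
        fit = b @ (X.T @ y)
        print(name, round(fit, 4), "scaled deviance:", round(y @ y - fit, 4))

The printed fits and scaled deviances should match the columns of Table 6.10.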

To test HI we assume that the saturated model is correct so that DS ∼ χ²(6). If HI is also correct then DI ∼ χ²(8) so that DI − DS ∼ χ²(2) and

\[ F = \frac{(D_I - D_S)/2}{D_S/6} \sim F(2, 6). \]

The value of

\[ F = \frac{(2.6867 - 1.48)/(2\sigma^2)}{1.48/(6\sigma^2)} = 2.45 \]

is not statistically significant so the data do not provide evidence against HI. Since HI is not rejected we proceed to test HA and HB. For HB we consider the difference in fit between the models (6.10) and (6.11), i.e., DB − DI, and compare this with DS using

\[ F = \frac{(D_B - D_I)/1}{D_S/6} = \frac{(3.09 - 2.6867)/\sigma^2}{1.48/(6\sigma^2)} = 1.63, \]

which is not significant compared to the F(1, 6) distribution, suggesting that there are no differences due to levels of factor B. The corresponding test for HA gives F = 25.82 which is significant compared with the F(2, 6) distribution. Thus we conclude that the response means are affected only by differences in the levels of factor A.
The most appropriate choice for the denominator of the F ratio, DS or DI, is debatable. DS comes from a more complex model and is more likely to correspond to a central chi-squared distribution, but it has fewer degrees of freedom.
The ANOVA table for these data is shown in Table 6.11. The first number in the sum of squares column is the value of bᵀXᵀy corresponding to the simplest model E(Yjkl) = µ.
A feature of these data is that the hypothesis tests are independent in the

Table 6.11 ANOVA table for data in Table 6.9.

Source of variation   Degrees of freedom   Sum of squares   Mean square   F
Mean                  1                    648.2700
Levels of A           2                    12.7400          6.3700        25.82
Levels of B           1                    0.4033           0.4033        1.63
Interactions          2                    1.2067           0.6033        2.45
Residual              6                    1.4800           0.2467
Total                 12                   664.1000

sense that the results are not affected by which terms – other than those relating to the hypothesis in question – are also in the model. For example, the hypothesis of no differences due to factor B, HB: βk = 0 for all k, could equally well be tested using either the models E(Yjkl) = µ + αj + βk and E(Yjkl) = µ + αj, and hence

\[ \sigma^2 D_B - \sigma^2 D_I = 3.0900 - 2.6867 = 0.4033, \]

or the models

E(Yjkl) = µ + βk and E(Yjkl) = µ, and hence

\[ \sigma^2 D_M - \sigma^2 D_A = 15.8300 - 15.4267 = 0.4033. \]

The reason is that the data are balanced, that is, there are equal numbers of observations in each subgroup. For balanced data it is possible to specify the design matrix in such a way that it is orthogonal (see Section 6.2.5 and Exercise 6.7). An example in which the hypothesis tests are not independent is given in Exercise 6.8.
The estimated sample means for each subgroup can be calculated from the values of b. For example, for the saturated model (6.9) the estimated mean of the subgroup with the treatment combination A3 and B2 is µ̂ + α̂3 + β̂2 + (α̂β)32 = 6.7 + 1.75 − 1.0 + 1.5 = 8.95. The estimate for the same mean from the additive model (6.10) is

µ̂ + α̂3 + β̂2 = 6.383 + 2.5 − 0.367 = 8.516. This shows the importance of deciding which model to use to summarize the data.
To assess the adequacy of an ANOVA model, residuals should be calculated and examined for unusual patterns, Normality, independence, and so on, as described in Section 6.2.6.

6.5 Analysis of covariance

Analysis of covariance is the term used for models in which some of the explanatory variables are dummy variables representing factor levels and others are continuous measurements, called covariates. As with ANOVA, we are interested in comparing means of subgroups defined by factor levels but, recognizing that the covariates may also affect the responses, we compare the means after 'adjustment' for covariate effects.
A typical example is provided by the data in Table 6.12. The responses Yjk are achievement scores measured at three levels of a factor representing three different training methods, and the covariates xjk are aptitude scores measured before training commenced. We want to compare the training methods, taking into account differences in initial aptitude between the three groups of subjects.
The data are plotted in Figure 6.1. There is evidence that the achievement scores y increase linearly with aptitude x and that the y values are generally higher for training groups B and C than for A.

Table 6.12 Achievement scores (data from Winer, 1971, p. 776).

                  Training method
                  A            B            C
                  y     x      y     x      y     x
                  6     3      8     4      6     3
                  4     1      9     5      7     2
                  5     3      7     5      7     2
                  3     1      9     4      7     3
                  4     2      8     3      8     4
                  3     1      5     1      5     1
                  6     4      7     2      7     4
Total             31    15     53    24     47    19
Sums of squares   147   41     413   96     321   59
Σxy               75           191          132

To test the hypothesis that there are no differences in mean achievement scores among the three training methods, after adjustment for initial aptitude, we compare the saturated model

\[ E(Y_{jk}) = \mu_j + \gamma x_{jk} \tag{6.13} \]

with the reduced model

\[ E(Y_{jk}) = \mu + \gamma x_{jk} \tag{6.14} \]

where j = 1 for method A, j = 2 for method B and j = 3 for method C, and

[Figure 6.1 Achievement score, y, plotted against initial aptitude, x: circles denote training method A, crosses denote method B and diamonds denote method C.]

k = 1, ..., 7. Let

\[ \mathbf{y}_j = \begin{bmatrix} Y_{j1}\\ \vdots\\ Y_{j7} \end{bmatrix} \quad \text{and} \quad \mathbf{x}_j = \begin{bmatrix} x_{j1}\\ \vdots\\ x_{j7} \end{bmatrix} \]

so that, in matrix notation, the saturated model (6.13) is E(y) = Xβ with

\[ \mathbf{y} = \begin{bmatrix} \mathbf{y}_1\\ \mathbf{y}_2\\ \mathbf{y}_3 \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \mu_1\\ \mu_2\\ \mu_3\\ \gamma \end{bmatrix} \quad \text{and} \quad \mathbf{X} = \begin{bmatrix} \mathbf{1}&\mathbf{0}&\mathbf{0}&\mathbf{x}_1\\ \mathbf{0}&\mathbf{1}&\mathbf{0}&\mathbf{x}_2\\ \mathbf{0}&\mathbf{0}&\mathbf{1}&\mathbf{x}_3 \end{bmatrix} \]

where 0 and 1 are vectors of length 7. Then

\[ \mathbf{X}^T\mathbf{X} = \begin{bmatrix} 7&0&0&15\\ 0&7&0&24\\ 0&0&7&19\\ 15&24&19&196 \end{bmatrix}, \quad \mathbf{X}^T\mathbf{y} = \begin{bmatrix} 31\\ 53\\ 47\\ 398 \end{bmatrix} \]

and so

\[ \mathbf{b} = \begin{bmatrix} 2.837\\ 5.024\\ 4.698\\ 0.743 \end{bmatrix}. \]

Also yᵀy = 881 and bᵀXᵀy = 870.698, so for the saturated model (6.13)

\[ \sigma^2 D_1 = \mathbf{y}^T\mathbf{y} - \mathbf{b}^T\mathbf{X}^T\mathbf{y} = 10.302. \]

For the reduced model (6.14)

\[ \boldsymbol{\beta} = \begin{bmatrix} \mu\\ \gamma \end{bmatrix}, \quad \mathbf{X} = \begin{bmatrix} \mathbf{1}&\mathbf{x}_1\\ \mathbf{1}&\mathbf{x}_2\\ \mathbf{1}&\mathbf{x}_3 \end{bmatrix} \quad \text{so} \quad \mathbf{X}^T\mathbf{X} = \begin{bmatrix} 21&58\\ 58&196 \end{bmatrix} \]

and

\[ \mathbf{X}^T\mathbf{y} = \begin{bmatrix} 131\\ 398 \end{bmatrix}. \]

Hence

\[ \mathbf{b} = \begin{bmatrix} 3.447\\ 1.011 \end{bmatrix}, \quad \mathbf{b}^T\mathbf{X}^T\mathbf{y} = 853.766 \quad \text{and so} \quad \sigma^2 D_0 = 27.234. \]

If we assume that the saturated model (6.13) is correct, then D1 ∼ χ²(17). If the null hypothesis corresponding to model (6.14) is true then D0 ∼ χ²(19), so

\[ F = \frac{(D_0 - D_1)/2}{D_1/17} \sim F(2, 17). \]

For these data

\[ F = \frac{16.932/2}{10.302/17} = 13.97, \]

indicating a significant difference in achievement scores for the training methods, after adjustment for initial differences in aptitude. The usual presentation of this analysis is given in Table 6.13.

Table 6.13 ANCOVA table for data in Table 6.12.

Source of variation   Degrees of freedom   Sum of squares   Mean square   F
Mean and covariate    2                    853.766
Factor levels         2                    16.932           8.466         13.97
Residuals             17                   10.302           0.606
Total                 21                   881.000
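The same normal-equation arithmetic reproduces this analysis of covariance. A Python sketch (illustrative only, not part of the original text), using the data of Table 6.12:

    import numpy as np

    # Table 6.12: y = achievement, x = initial aptitude, for methods A, B, C
    y = np.array([6,4,5,3,4,3,6,  8,9,7,9,8,5,7,  6,7,7,7,8,5,7], dtype=float)
    x = np.array([3,1,3,1,2,1,4,  4,5,5,4,3,1,2,  3,2,2,3,4,1,4], dtype=float)
    g = np.repeat([0, 1, 2], 7)                       # group index

    # saturated model (6.13): separate intercepts, common slope gamma
    Xs = np.column_stack([(g == 0), (g == 1), (g == 2), x]).astype(float)
    # reduced model (6.14): single intercept plus slope
    Xr = np.column_stack([np.ones(21), x])

    for name, X in [("saturated", Xs), ("reduced", Xr)]:
        b = np.linalg.solve(X.T @ X, X.T @ y)
        rss = y @ y - b @ (X.T @ y)
        print(name, np.round(b, 3), "sigma^2 D =", round(rss, 3))

    # F = ((27.234 - 10.302)/2) / (10.302/17), approximately 13.97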

6.6 General linear models

The term general linear model is used for Normal linear models with any combination of categorical and continuous explanatory variables. The factors may be crossed, as in Section 6.4.2, so that there are observations for each combination of levels of the factors. Alternatively, they may be nested, as illustrated in the following example.
Table 6.14 shows a two-factor nested design which represents an experiment

to compare two drugs (A1 and A2), one of which is tested in three hospitals (B1, B2 and B3) and the other in two different hospitals (B4 and B5). We want to compare the effects of the two drugs and possible differences among hospitals using the same drug. In this case, the saturated model would be

\[ E(Y_{jkl}) = \mu + \alpha_1 + \alpha_2 + (\alpha\beta)_{11} + (\alpha\beta)_{12} + (\alpha\beta)_{13} + (\alpha\beta)_{24} + (\alpha\beta)_{25} \]

subject to some constraints (the corner-point constraints are α1 = 0, (αβ)11 = 0 and (αβ)24 = 0). Hospitals B1, B2 and B3 can only be compared within drug A1, and hospitals B4 and B5 within drug A2.

Table 6.14 Nested two-factor experiment.

            Drug A1                          Drug A2
Hospitals   B1        B2        B3           B4        B5
Responses   Y111      Y121      Y131         Y241      Y251
            ...       ...       ...          ...       ...
            Y11n1     Y12n2     Y13n3        Y24n4     Y25n5

Analysis for nested designs is not, in principle, different from analysis for studies with crossed factors. Key assumptions for general linear models are that the response variable has the Normal distribution, the response and explanatory variables are linearly related, and the variance σ² is the same for all responses. For the models considered in this chapter, the responses are also assumed to be independent (though this assumption is dropped in Chapter 11). All these assumptions can be examined through the use of residuals (Section 6.2.6). If they are not justified, for example, because the residuals have a skewed distribution, then it is usually worthwhile to consider transforming the response variable so that the assumption of Normality is more plausible. A useful tool, now available in many statistical programs, is the Box-Cox transformation (Box and Cox, 1964). Let y be the original variable and y* the transformed one; then the function

\[ y^{*} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\[4pt] \log y, & \lambda = 0 \end{cases} \]

provides a family of transformations. For example, except for a location shift, λ = 1 leaves y unchanged; λ = 1/2 corresponds to taking the square root; λ = −1 corresponds to the reciprocal; and λ = 0 corresponds to the logarithmic transformation. The value of λ which produces the 'most Normal' distribution can be estimated by the method of maximum likelihood.
Similarly, transformation of continuous explanatory variables may improve the linearity of relationships with the response.
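The maximum likelihood estimation of λ is automated in standard software; for example, scipy provides it directly. A minimal sketch (not part of the original text, and assuming scipy is available), using simulated positively skewed data for illustration:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    y = rng.lognormal(mean=1.0, sigma=0.4, size=100)   # skewed positive responses

    # boxcox returns the transformed values and the ML estimate of lambda
    y_star, lam = stats.boxcox(y)
    print(round(lam, 2))   # near 0 for lognormal data, i.e., close to a log transform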

6.7 Exercises

6.1 Table 6.15 shows the average apparent per capita consumption of sugar (in kg per year) in Australia, as refined sugar and in manufactured foods (from Australian Bureau of Statistics, 1998).

Table 6.15 Australian sugar consumption.

Period    Refined sugar   Sugar in manufactured food
1936-39   32.0            16.3
1946-49   31.2            23.1
1956-59   27.0            23.6
1966-69   21.0            27.7
1976-79   14.9            34.6
1986-89   8.8             33.9

(a) Plot sugar consumption against time separately for refined sugar and sugar in manufactured foods. Fit simple linear regression models to summarize the pattern of consumption of each form of sugar. Calculate 95% confidence intervals for the average annual change in consumption for each form.
(b) Calculate the total average sugar consumption for each period and plot these data against time. Using suitable models, test the hypothesis that total sugar consumption did not change over time.

6.2 Table 6.16 shows the response of a grass and legume pasture system to various quantities of phosphorus fertilizer (data from D. F. Sinclair; the results were reported in Sinclair and Probert, 1986). The total yield, of grass and legume together, and the amount of phosphorus (K) are both given in kilograms per hectare. Find a suitable model for describing the relationship between yield and quantity of fertilizer.

(a) Plot yield against phosphorus to obtain an approximately linear relationship – you may need to try several transformations of either or both variables in order to achieve approximate linearity.
(b) Use the results of (a) to specify a possible model. Fit the model.
(c) Calculate the standardized residuals for the model and use appropriate plots to check for any systematic effects that might suggest alternative models and to investigate the validity of any assumptions made.

6.3 Analyze the carbohydrate data in Table 6.3 using appropriate software (or, preferably, repeat the analyses using several different regression programs and compare the results).

Table 6.16 Yield of grass and legume pasture and phosphorus levels (K).

K    Yield     K    Yield     K    Yield
0    1753.9    15   3107.7    10   2400.0
40   4923.1    30   4415.4    5    2861.6
50   5246.2    50   4938.4    40   3723.0
5    3184.6    5    3046.2    30   4892.3
10   3538.5    0    2553.8    40   4784.6
30   4000.0    10   3323.1    20   3184.6
15   4184.6    40   4461.5    0    2723.1
40   4692.3    20   4215.4    50   4784.6
20   3600.0    40   4153.9    15   3169.3

(a) Plot the responses y against each of the explanatory variables x1, x2 and x3 to see if y appears to be linearly related to them.
(b) Fit the model (6.6) and examine the residuals to assess the adequacy of the model and the assumptions.
(c) Fit the models

E(Yi)=β0 + β1xi1 + β3xi3 and

E(Yi)=β0 + β3xi3,

(note the variable x2, relative weight, is omitted from both models) and use these to test the hypothesis: β1 = 0. Compare your results with Table 6.5.

6.4 It is well known that the concentration of cholesterol in blood serum increases with age, but it is less clear whether cholesterol level is also associated with body weight. Table 6.17 shows, for thirty women, serum cholesterol (millimoles per liter), age (years) and body mass index (weight divided by height squared, where weight was measured in kilograms and height in meters). Use multiple regression to test whether serum cholesterol is associated with body mass index when age is already included in the model.

6.5 Table 6.18 shows plasma inorganic phosphate levels (mg/dl) one hour after a standard glucose tolerance test for obese subjects, with or without hyperinsulinemia, and controls (data from Jones, 1987).

(a) Perform a one-factor analysis of variance to test the hypothesis that there are no mean differences among the three groups. What conclusions can you draw? (b) Obtain a 95% confidence interval for the difference in means between the two obese groups.

Table 6.17 Cholesterol (CHOL), age and body mass index (BMI) for thirty women.

CHOL   Age   BMI     CHOL   Age   BMI
5.94   52    20.7    6.48   65    26.3
4.71   46    21.3    8.83   76    22.7
5.86   51    25.4    5.10   47    21.5
6.52   44    22.7    5.81   43    20.7
6.80   70    23.9    4.65   30    18.9
5.23   33    24.3    6.82   58    23.9
4.97   21    22.2    6.28   78    24.3
8.78   63    26.2    5.15   49    23.8
5.13   56    23.3    2.92   36    19.6
6.74   54    29.2    9.27   67    24.3
5.95   44    22.7    5.57   42    22.0
5.83   71    21.9    4.92   29    22.5
5.74   39    22.4    6.72   33    24.1
4.92   58    20.2    5.57   42    22.7
6.69   58    24.4    6.25   66    27.3

Table 6.18 Plasma phosphate levels in obese and control subjects.

Hyperinsulinemic   Non-hyperinsulinemic   Controls
obese              obese
2.3                3.0                    3.0
4.1                4.1                    2.6
4.2                3.9                    3.1
4.0                3.1                    2.2
4.6                3.3                    2.1
4.6                2.9                    2.4
3.8                3.3                    2.8
5.2                3.9                    3.4
3.1                2.9
3.7                2.6
3.8                3.1
3.2

(c) Using an appropriate model, examine the standardized residuals for all the observations to look for any systematic effects and to check the Normality assumption.

6.6 The weights (in grams) of machine components of a standard size made by four different workers on two different days are shown in Table 6.19; five components were chosen randomly from the output of each worker on each

Table 6.19 Weights of machine components made by workers on different days.

         Workers
         1      2      3      4
Day 1    35.7   38.4   34.9   37.1
         37.1   37.2   34.3   35.5
         36.7   38.1   34.5   36.5
         37.7   36.9   33.7   36.0
         35.3   37.2   36.2   33.8

Day 2    34.7   36.9   32.0   35.8
         35.2   38.5   35.2   32.9
         34.6   36.4   33.5   35.7
         36.4   37.8   32.9   38.0
         35.2   36.1   33.3   36.1

day. Perform an analysis of variance to test for differences among workers, among days, and possible interaction effects. What are your conclusions?

6.7 For the balanced data in Table 6.9, the analyses in Section 6.4.2 showed that the hypothesis tests were independent. An alternative specification of the design matrix for the saturated model (6.9), with the constraints α1 + α2 + α3 = 0 and β1 + β2 = 0 (the sum-to-zero constraints) so that

\[ \boldsymbol{\beta} = \begin{bmatrix} \mu\\ \alpha_2\\ \alpha_3\\ \beta_2\\ (\alpha\beta)_{22}\\ (\alpha\beta)_{32} \end{bmatrix}, \quad \text{is} \quad
\mathbf{X} = \begin{bmatrix}
1&-1&-1&-1&1&1\\ 1&-1&-1&-1&1&1\\ 1&-1&-1&1&-1&-1\\ 1&-1&-1&1&-1&-1\\
1&1&0&-1&-1&0\\ 1&1&0&-1&-1&0\\ 1&1&0&1&1&0\\ 1&1&0&1&1&0\\
1&0&1&-1&0&-1\\ 1&0&1&-1&0&-1\\ 1&0&1&1&0&1\\ 1&0&1&1&0&1
\end{bmatrix} \]

where the columns of X corresponding to the terms (αβ)jk are the products of the columns corresponding to the terms αj and βk.
(a) Show that XᵀX has the block diagonal form described in Section 6.2.5. Fit the model (6.9) and also models (6.10) to (6.12) and verify that the results in Table 6.10 are the same for this specification of X.
(b) Show that the estimates for the mean of the subgroup with treatments A3 and B2 for two different models are the same as the values given at the end of Section 6.4.2.

6.8 Table 6.20 shows the data from a fictitious two-factor experiment.

(a) Test the hypothesis that there are no interaction effects.
(b) Test the hypothesis that there is no effect due to factor A
(i) by comparing the models

E(Yjkl) = µ + αj + βk and E(Yjkl) = µ + βk;
(ii) by comparing the models

E(Yjkl) = µ + αj and E(Yjkl) = µ.
Explain the results.

Table 6.20 Two-factor experiment with unbalanced data.

            Factor B
Factor A    B1       B2
A1          5        3, 4
A2          6, 4     4, 3
A3          7        6, 8

7
Binary Variables and Logistic Regression

7.1 Probability distributions

In this chapter we consider generalized linear models in which the outcome variables are measured on a binary scale. For example, the responses may be alive or dead, or present or absent. 'Success' and 'failure' are used as generic terms for the two categories.
First, we define the binary random variable

\[ Z = \begin{cases} 1 & \text{if the outcome is a success}\\ 0 & \text{if the outcome is a failure} \end{cases} \]

with probabilities Pr(Z = 1) = π and Pr(Z = 0) = 1 − π. If there are n such random variables Z1, ..., Zn which are independent with Pr(Zj = 1) = πj, then their joint probability is

\[ \prod_{j=1}^{n} \pi_j^{z_j}(1-\pi_j)^{1-z_j} = \exp\left[\sum_{j=1}^{n} z_j \log\left(\frac{\pi_j}{1-\pi_j}\right) + \sum_{j=1}^{n} \log(1-\pi_j)\right] \tag{7.1} \]

which is a member of the exponential family (see equation (3.3)).
Next, for the case where the πj's are all equal, we can define

\[ Y = \sum_{j=1}^{n} Z_j \]

so that Y is the number of successes in n 'trials'. The random variable Y has the distribution binomial(n, π):

\[ \Pr(Y = y) = \binom{n}{y}\pi^{y}(1-\pi)^{n-y}, \quad y = 0, 1, \ldots, n. \tag{7.2} \]

Finally, we consider the general case of N independent random variables Y1, Y2, ..., YN corresponding to the numbers of successes in N different subgroups or strata (Table 7.1). If Yi ∼ binomial(ni, πi) the log-likelihood function is

\[ l(\pi_1, \ldots, \pi_N; y_1, \ldots, y_N) = \sum_{i=1}^{N}\left[y_i \log\left(\frac{\pi_i}{1-\pi_i}\right) + n_i \log(1-\pi_i) + \log\binom{n_i}{y_i}\right]. \tag{7.3} \]

Table 7.1 Frequencies for N binomial distributions.

            Subgroups
            1          2          ...   N
Successes   Y1         Y2         ...   YN
Failures    n1 − Y1    n2 − Y2    ...   nN − YN
Totals      n1         n2         ...   nN

7.2 Generalized linear models

We want to describe the proportion of successes, Pi = Yi/ni, in each subgroup in terms of factor levels and other explanatory variables which characterize the subgroup. As E(Yi) = niπi and so E(Pi) = πi, we model the probabilities πi as

\[ g(\pi_i) = \mathbf{x}_i^T \boldsymbol{\beta} \]

where xi is a vector of explanatory variables (dummy variables for factor levels and measured values for covariates), β is a vector of parameters and g is a link function.
The simplest case is the linear model π = xᵀβ. This is used in some practical applications but it has the disadvantage that although π is a probability, the fitted values xᵀb may be less than zero or greater than one.
To ensure that π is restricted to the interval [0, 1] it is often modelled using a cumulative probability distribution

\[ \pi = \int_{-\infty}^{t} f(s)\, ds \]

where f(s) ≥ 0 and ∫ f(s) ds = 1 over the whole real line. The probability density function f(s) is called the tolerance distribution. Some commonly used examples are considered in Section 7.3.

7.3 Dose response models

Historically, one of the first uses of regression-like models for binomial data was for bioassay results (Finney, 1973). Responses were the proportions or percentages of 'successes'; for example, the proportion of experimental animals killed by various dose levels of a toxic substance. Such data are sometimes called quantal responses. The aim is to describe the probability of 'success', π, as a function of the dose, x; for example, g(π) = β1 + β2x.
If the tolerance distribution f(s) is the uniform distribution on the interval


[c1, c2],

\[ f(s) = \begin{cases} \dfrac{1}{c_2 - c_1} & \text{if } c_1 \le s \le c_2\\[4pt] 0 & \text{otherwise,} \end{cases} \]

then

\[ \pi = \int_{c_1}^{x} f(s)\, ds = \frac{x - c_1}{c_2 - c_1} \quad \text{for } c_1 \le x \le c_2 \]

(see Figure 7.1). This equation has the form π = β1 + β2x where

\[ \beta_1 = \frac{-c_1}{c_2 - c_1} \quad \text{and} \quad \beta_2 = \frac{1}{c_2 - c_1}. \]

[Figure 7.1 Uniform distribution: f(s) and π.]

This linear model is equivalent to using the identity function as the link function g and imposing conditions on x, β1 and β2 corresponding to c1 ≤ x ≤ c2. These extra conditions mean that the standard methods for estimating β1 and β2 for generalized linear models cannot be directly applied. In practice, this model is not widely used.
One of the original models used for bioassay data is called the probit model. The Normal distribution is used as the tolerance distribution (see Figure 7.2):

\[ \pi = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x} \exp\left[-\frac{1}{2}\left(\frac{s-\mu}{\sigma}\right)^2\right] ds = \Phi\left(\frac{x-\mu}{\sigma}\right) \]

where Φ denotes the cumulative probability function for the standard Normal distribution N(0, 1). Thus

\[ \Phi^{-1}(\pi) = \beta_1 + \beta_2 x \]

where β1 = −µ/σ and β2 = 1/σ, and the link function g is the inverse cumulative Normal probability function Φ⁻¹. Probit models are used in several areas of biological and social sciences in which there are natural interpretations of the model; for example, x = µ is called the lethal dose LD(50)

because it corresponds to the dose that can be expected to kill half of the animals.

[Figure 7.2 Normal distribution: f(s) and π.]

Another model that gives numerical results very much like those from the probit model, but which is computationally somewhat easier, is the logistic or logit model. The tolerance distribution is

\[ f(s) = \frac{\beta_2 \exp(\beta_1 + \beta_2 s)}{\left[1 + \exp(\beta_1 + \beta_2 s)\right]^2} \]

so

\[ \pi = \int_{-\infty}^{x} f(s)\, ds = \frac{\exp(\beta_1 + \beta_2 x)}{1 + \exp(\beta_1 + \beta_2 x)}. \]

This gives the link function

\[ \log\left(\frac{\pi}{1-\pi}\right) = \beta_1 + \beta_2 x. \]

The term log[π/(1 − π)] is sometimes called the logit function and it has a natural interpretation as the logarithm of odds (see Exercise 7.2). The logistic model is widely used for binomial data and is implemented in many statistical programs. The shapes of the functions f(s) and π(x) are similar to those for the probit model (Figure 7.2) except in the tails of the distributions (see Cox and Snell, 1989).
Several other models are also used for dose response data. For example, if the extreme value distribution

\[ f(s) = \beta_2 \exp\left[(\beta_1 + \beta_2 s) - \exp(\beta_1 + \beta_2 s)\right] \]

is used as the tolerance distribution then

\[ \pi = 1 - \exp\left[-\exp(\beta_1 + \beta_2 x)\right] \]

and so log[− log(1 − π)] = β1 + β2x. This link, log[− log(1 − π)], is called the complementary log log function. The model is similar to the logistic and probit models for values of π near 0.5 but differs from them for π near 0 or 1. These models are illustrated in the following example.

[Figure 7.3 Beetle mortality data from Table 7.2: proportion killed, pi = yi/ni, plotted against dose, xi (log10 CS2 mg l⁻¹).]

7.3.1 Example: Beetle mortality

Table 7.2 shows numbers of beetles dead after five hours of exposure to gaseous carbon disulphide at various concentrations (data from Bliss, 1935).

Table 7.2 Beetle mortality data.

Dose, xi             Number of      Number
(log10 CS2 mg l⁻¹)   beetles, ni    killed, yi
1.6907               59             6
1.7242               60             13
1.7552               62             18
1.7842               56             28
1.8113               63             52
1.8369               59             53
1.8610               62             61
1.8839               60             60

Figure 7.3 shows the proportions pi = yi/ni plotted against dose xi (actually xi is the logarithm of the quantity of carbon disulphide). We begin by fitting the logistic model

\[ \pi_i = \frac{\exp(\beta_1 + \beta_2 x_i)}{1 + \exp(\beta_1 + \beta_2 x_i)} \]

so

\[ \log\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_1 + \beta_2 x_i \]

© 2002 by Chapman & Hall/CRC and

log(1 − πi)=− log [1 + exp (β1 + β2xi)] . Therefore from equation (7.3) the log-likelihood function is   N ni l = yi (β1 + β2xi) − ni log [1 + exp (β1 + β2xi)] + log yi i=1 and the scores with respect to β1 and β2 are    ∂l  exp (β + β x )  U = = y − n 1 2 i = (y − n π ) 1 ∂β i i 1 + exp (β + β x ) i i i 1   1 2 i  ∂l  exp (β + β x ) U = = y x − n x 1 2 i 2 ∂β i i i i 1 + exp (β + β x ) 2 1 2 i = xi(yi − niπi). Similarly the information matrix is     niπi(1 − πi) nixiπi(1 − πi)   I=   . − 2 − nixiπi(1 πi) nixi πi(1 πi) Maximum likelihood estimates are obtained by solving the iterative equa- tion − − I(m 1)bm = I(m 1)b(m−1) + U(m−1) (from (4.22)) where the superscript (m) indicates the mth approximation and (0) (0) b is the vector ofestimates. Starting with b1 = 0 and b2 =0, successive approximationsareshowninTable7.3.Theestimatesconvergebythesixth iteration. The table also shows the increase in values ofthe log-likelihood n function (7.3), omitting the constant term log i . The fitted values are yi 1 yi=niπicalculatedateachstage(initiallyπi=2 for all i). For the final approximation, the estimated variance-covariance matrix for b, I(b)−1 ,isshownatthebottomof Table7.3togetherwiththedeviance   N y n − y D =2 y log i +(n − y ) log i i y i i n − y i=1 i i (from Section 5.6.1). The estimates and their standard errors are: √ − b1 = 60.72, standard error = √26.840=5.18 and b2 =34.72, standard error = 8.481 = 2.91. Ifthe model is a good fit ofthe data the deviance should approximately have the distribution χ2(6) because there are N = 8 covariate patterns (i.e., different values of xi) and p = 2 parameters. But the calculated value of D is almost twice the ‘expected’ value of6 and is almost as large as the upper 5%

point of the χ²(6) distribution, which is 12.59. This suggests that the model does not fit particularly well.

Table 7.3 Fitting a linear logistic model to the beetle mortality data.

                 Initial     Approximation
                 estimate    First       Second      Sixth
β1               0           −37.856     −53.853     −60.717
β2               0           21.337      30.384      34.270
log-likelihood   −333.404    −200.010    −187.274    −186.235

     Observations   Fitted values
y1   6              29.5     8.505     4.543     3.458
y2   13             30.0     15.366    11.254    9.842
y3   18             31.0     24.808    23.058    22.451
y4   28             28.0     30.983    32.947    33.898
y5   52             31.5     43.362    48.197    50.096
y6   53             29.5     46.741    51.705    53.291
y7   61             31.0     53.595    58.061    59.222
y8   60             30.0     54.734    58.036    58.743

\[ [\mathcal{I}(\mathbf{b})]^{-1} = \begin{bmatrix} 26.840 & -15.082\\ -15.082 & 8.481 \end{bmatrix}, \quad D = 11.23 \]
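The iteration is easy to program directly. The following Python sketch is illustrative, not part of the original text; it implements the Newton-Raphson scheme above for the beetle mortality data and should reproduce, approximately, the sixth-iteration results in Table 7.3:

    import numpy as np

    x = np.array([1.6907, 1.7242, 1.7552, 1.7842, 1.8113, 1.8369, 1.8610, 1.8839])
    n = np.array([59, 60, 62, 56, 63, 59, 62, 60], dtype=float)
    y = np.array([6, 13, 18, 28, 52, 53, 61, 60], dtype=float)
    X = np.column_stack([np.ones(8), x])

    b = np.zeros(2)                           # starting values b1 = b2 = 0
    for m in range(6):
        pi = 1 / (1 + np.exp(-X @ b))         # fitted probabilities at current estimates
        U = X.T @ (y - n * pi)                # score vector
        W = n * pi * (1 - pi)
        I = X.T @ (W[:, None] * X)            # information matrix
        b = b + np.linalg.solve(I, U)         # iteration (4.22)

    pi = 1 / (1 + np.exp(-X @ b))             # recompute I at the final estimates
    W = n * pi * (1 - pi)
    I = X.T @ (W[:, None] * X)
    print(np.round(b, 3))                     # approx (-60.72, 34.27)
    print(np.round(np.linalg.inv(I), 3))      # approx [[26.84, -15.08], [-15.08, 8.48]]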

Several alternative models were fitted to the data. The results are shown in Table 7.4. Among these models the extreme value model appears to fit the data best.

Table 7.4 Comparison of observed numbers killed with fitted values obtained from various dose-response models for the beetle mortality data. Deviance statistics are also given.

Observed     Logistic   Probit   Extreme
value of Y   model      model    value model
6            3.46       3.36     5.59
13           9.84       10.72    11.28
18           22.45      23.48    20.95
28           33.90      33.82    30.37
52           50.10      49.62    47.78
53           53.29      53.32    54.14
61           59.22      59.66    61.11
60           58.74      59.23    59.95
D            11.23      10.12    3.45
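In practice such models are fitted with standard GLM software rather than by hand. The sketch below uses the statsmodels package, which is an assumption on my part: any binomial GLM routine with a selectable link would do, and the capitalized link class names are those of recent statsmodels releases. The deviances should be close to the row D of Table 7.4:

    import numpy as np
    import statsmodels.api as sm

    x = np.array([1.6907, 1.7242, 1.7552, 1.7842, 1.8113, 1.8369, 1.8610, 1.8839])
    n = np.array([59, 60, 62, 56, 63, 59, 62, 60], dtype=float)
    y = np.array([6, 13, 18, 28, 52, 53, 61, 60], dtype=float)

    X = sm.add_constant(x)
    resp = np.column_stack([y, n - y])        # (successes, failures) for grouped data

    links = {"logistic": sm.families.links.Logit(),
             "probit": sm.families.links.Probit(),
             "extreme value": sm.families.links.CLogLog()}
    for name, link in links.items():
        res = sm.GLM(resp, X, family=sm.families.Binomial(link=link)).fit()
        print(name, "deviance:", round(res.deviance, 2))   # approx 11.23, 10.12, 3.45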

7.4 General logistic regression model

The simple linear logistic model log[πi/(1 − πi)] = β1 + β2xi used in Example 7.3.1 is a special case of the general logistic regression model

\[ \text{logit}\,\pi_i = \log\left(\frac{\pi_i}{1-\pi_i}\right) = \mathbf{x}_i^T\boldsymbol{\beta} \]

where xi is a vector of continuous measurements corresponding to covariates and dummy variables corresponding to factor levels, and β is the parameter vector. This model is very widely used for analyzing data involving binary or binomial responses and several explanatory variables. It provides a powerful technique analogous to multiple regression and ANOVA for continuous responses.
Maximum likelihood estimates of the parameters β, and consequently of the probabilities πi = g⁻¹(xiᵀβ), are obtained by maximizing the log-likelihood

function

\[ l(\boldsymbol{\pi}; \mathbf{y}) = \sum_{i=1}^{N}\left[y_i \log \pi_i + (n_i - y_i)\log(1 - \pi_i) + \log\binom{n_i}{y_i}\right] \tag{7.4} \]

using the methods described in Chapter 4.
The estimation process is essentially the same whether the data are grouped as frequencies for each covariate pattern (i.e., observations with the same values of all the explanatory variables) or each observation is coded 0 or 1 and its covariate pattern is listed separately. If the data can be grouped, the response Yi, the number of 'successes' for covariate pattern i, may be modelled by the binomial distribution. If each observation has a different covariate pattern, then ni = 1 and the response Yi is binary.
The deviance, derived in Section 5.6.1, is

\[ D = 2\sum_{i=1}^{N}\left[y_i \log\left(\frac{y_i}{\hat{y}_i}\right) + (n_i - y_i)\log\left(\frac{n_i - y_i}{n_i - \hat{y}_i}\right)\right]. \tag{7.5} \]

This has the form

\[ D = 2\sum o \log\left(\frac{o}{e}\right) \]

where o denotes the observed frequencies yi and (ni − yi) from the cells of Table 7.1 and e denotes the corresponding estimated expected frequencies or fitted values ŷi = niπ̂i and (ni − ŷi) = (ni − niπ̂i). Summation is over all 2 × N cells of the table.
Notice that D does not involve any nuisance parameters (like σ² for Normal response data), so goodness of fit can be assessed and hypotheses can be tested

© 2002 by Chapman & Hall/CRC directly using the approximation D ∼ χ2(N − p) where p is the number ofparameters estimated and N the number ofcovariate patterns. The estimation methods and sampling distributions used for inference de- pend on asymptotic results. For small studies or situations where there are few observations for each covariate pattern, the asymptotic results may be poor approximations. However software, such as StatXact and Log Xact, has been developed using ‘exact’ methods so that the methods described in this chapter can be used even when sample sizes are small.

7.4.1 Example: Embryogenic anthers

The data in Table 7.5, cited by Wood (1978), are taken from Sangwan-Norrell (1977). They are numbers yjk of embryogenic anthers of the plant species Datura innoxia Mill. obtained when numbers njk of anthers were prepared under several different conditions. There is one qualitative factor with two levels, a treatment consisting of storage at 3°C for 48 hours or a control storage condition, and one continuous explanatory variable represented by three values of centrifuging force. We will compare the treatment and control effects on the proportions after adjustment (if necessary) for centrifuging force.

Table 7.5 Embryogenic anther data.

                         Centrifuging force (g)
Storage condition        40      150     350
Control        y1k       55      52      57
               n1k       102     99      108
Treatment      y2k       55      50      50
               n2k       76      81      90

The proportions pjk = yjk/njk in the control and treatment groups are plotted against xk, the logarithm of the centrifuging force, in Figure 7.4. The response proportions appear to be higher in the treatment group than in the control group and, at least for the treated group, the response decreases with centrifuging force.
We will compare three logistic models for πjk, the probability of the anthers being embryogenic, where j = 1 for the control group and j = 2 for the treatment group, and x1 = log 40 = 3.689, x2 = log 150 = 5.011 and x3 = log 350 = 5.858.

Model 1: logit πjk = αj + βjxk (i.e., different intercepts and slopes);
Model 2: logit πjk = αj + βxk (i.e., different intercepts but the same slope);
Model 3: logit πjk = α + βxk (i.e., same intercept and slope).

[Figure 7.4 Anther data from Table 7.5: proportion that germinated, pjk = yjk/njk, plotted against log(centrifuging force); dots represent the treatment condition and diamonds represent the control condition.]

These models were fitted by the method of maximum likelihood. The results are summarized in Table 7.6. To test the null hypothesis that the slope is the same for the treatment and control groups, we use D2 − D1 = 2.591. From the tables for the χ²(1) distribution, the significance level is between 0.1 and 0.2, and so we could conclude that the data provide little evidence against the null hypothesis of equal slopes. On the other hand, the power of this test is very low and both Figure 7.4 and the estimates for Model 1 suggest that although the slope for the control group may be zero, the slope for the treatment group is negative. Comparison of the deviances from Models 2 and 3 gives a test for equality of the control and treatment effects after a common adjustment for centrifuging force: D3 − D2 = 0.491, which is consistent with the hypothesis that the storage effects are not different.

Table 7.6 Maximum likelihood estimates and deviances for logistic models for the embryogenic anther data (standard errors of estimates in brackets).

Model 1                      Model 2                      Model 3
a1 = 0.234 (0.628)           a1 = 0.877 (0.487)           a = 1.021 (0.481)
a2 − a1 = 1.977 (0.998)      a2 − a1 = 0.407 (0.175)      b = −0.148 (0.096)
b1 = −0.023 (0.127)          b = −0.155 (0.097)
b2 − b1 = −0.319 (0.199)
D1 = 0.028                   D2 = 2.619                   D3 = 3.110

The observed proportions and the corresponding fitted values for Models 1 and 2 are shown in Table 7.7. Obviously, Model 1 fits the data very well, but this is hardly surprising since four parameters have been used to describe six data points – such 'over-fitting' is not recommended!

Table 7.7 Observed and expected frequencies for the embryogenic anther data for various models.

Storage      Covariate   Observed    Expected frequencies
condition    value       frequency   Model 1   Model 2   Model 3
Control      x1          55          54.82     58.75     62.91
             x2          52          52.47     52.03     56.40
             x3          57          56.72     53.22     58.18
Treatment    x1          55          54.83     51.01     46.88
             x2          50          50.43     50.59     46.14
             x3          50          49.74     53.40     48.49
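These fits are also easy to reproduce with GLM software. A sketch, again assuming statsmodels is available; the design matrices are written to mirror the parametrization reported in Table 7.6:

    import numpy as np
    import statsmodels.api as sm

    y = np.array([55, 52, 57, 55, 50, 50], dtype=float)     # embryogenic anthers
    n = np.array([102, 99, 108, 76, 81, 90], dtype=float)   # anthers prepared
    x = np.tile(np.log([40, 150, 350]), 2)                  # log centrifuging force
    t = np.repeat([0.0, 1.0], 3)                            # 0 = control, 1 = treatment
    resp = np.column_stack([y, n - y])

    designs = {"Model 1": np.column_stack([np.ones(6), t, x, t * x]),
               "Model 2": np.column_stack([np.ones(6), t, x]),
               "Model 3": np.column_stack([np.ones(6), x])}
    for name, X in designs.items():
        res = sm.GLM(resp, X, family=sm.families.Binomial()).fit()
        print(name, "deviance:", round(res.deviance, 3))    # approx 0.028, 2.619, 3.110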

7.5 Goodness of fit statistics

Instead of using maximum likelihood estimation we could estimate the parameters by minimizing the weighted sum of squares

\[ S_w = \sum_{i=1}^{N} \frac{(y_i - n_i\pi_i)^2}{n_i\pi_i(1-\pi_i)} \]

since E(Yi) = niπi and var(Yi) = niπi(1 − πi).


This is equivalent to minimizing the Pearson chi-squared statistic

\[ X^2 = \sum \frac{(o - e)^2}{e} \]

where o represents the observed frequencies in Table 7.1, e represents the expected frequencies and summation is over all 2 × N cells of the table. The reason is that

\[ X^2 = \sum_{i=1}^{N} \frac{(y_i - n_i\pi_i)^2}{n_i\pi_i} + \sum_{i=1}^{N} \frac{\left[(n_i - y_i) - n_i(1-\pi_i)\right]^2}{n_i(1-\pi_i)} = \sum_{i=1}^{N} \frac{(y_i - n_i\pi_i)^2}{n_i\pi_i(1-\pi_i)}(1 - \pi_i + \pi_i) = S_w. \]

When X² is evaluated at the estimated expected frequencies, the statistic is

\[ X^2 = \sum_{i=1}^{N} \frac{(y_i - n_i\hat{\pi}_i)^2}{n_i\hat{\pi}_i(1-\hat{\pi}_i)} \tag{7.6} \]

which is asymptotically equivalent to the deviance in (7.5),

\[ D = 2\sum_{i=1}^{N}\left[y_i \log\left(\frac{y_i}{n_i\hat{\pi}_i}\right) + (n_i - y_i)\log\left(\frac{n_i - y_i}{n_i - n_i\hat{\pi}_i}\right)\right]. \]

The proof of the relationship between X² and D uses the Taylor series expansion of s log(s/t) about s = t, namely,

\[ s\log\frac{s}{t} = (s - t) + \frac{1}{2}\frac{(s-t)^2}{t} + \ldots\,. \]

Thus

\[ D = 2\sum_{i=1}^{N}\left\{(y_i - n_i\hat{\pi}_i) + \frac{1}{2}\frac{(y_i - n_i\hat{\pi}_i)^2}{n_i\hat{\pi}_i} + \left[(n_i - y_i) - (n_i - n_i\hat{\pi}_i)\right] + \frac{1}{2}\frac{\left[(n_i - y_i) - (n_i - n_i\hat{\pi}_i)\right]^2}{n_i - n_i\hat{\pi}_i} + \ldots\right\} \cong \sum_{i=1}^{N} \frac{(y_i - n_i\hat{\pi}_i)^2}{n_i\hat{\pi}_i(1-\hat{\pi}_i)} = X^2. \]

The asymptotic distribution of D, under the hypothesis that the model is correct, is D ∼ χ²(N − p); therefore approximately X² ∼ χ²(N − p). The choice between D and X² depends on the adequacy of the approximation to the χ²(N − p) distribution. There is some evidence to suggest that X² is often better than D because D is unduly influenced by very small frequencies (Cressie and Read, 1989). Both approximations are likely to be poor, however, if the expected frequencies are too small (e.g., less than 1).
In particular, if each observation has a different covariate pattern so that yi is zero or one, then neither D nor X² provides a useful measure of fit. This can happen if the explanatory variables are continuous, for example. The most commonly used approach in this situation is due to Hosmer and Lemeshow (1980). Their idea was to group observations into categories on the basis of their predicted probabilities. Typically about 10 groups are used, with approximately equal numbers of observations in each group. The observed numbers of successes and failures in each of the g groups are summarized as in Table 7.1. Then the Pearson chi-squared statistic for a g × 2 contingency table is calculated and used as a measure of fit. We denote this Hosmer-Lemeshow statistic by X²HL. The sampling distribution of X²HL has been found by simulation to be approximately χ²(g − 2). The use of this statistic is illustrated in the example in Section 7.8.
Sometimes the log-likelihood function for the fitted model is compared with the log-likelihood function for a minimal model, in which the values πi are all equal (in contrast to the saturated model which is used to define the deviance). Under the minimal model π̃ = (Σyi)/(Σni). Let π̂i denote the estimated probability for Yi under the model of interest (so the fitted value is ŷi = niπ̂i). The statistic is defined by

\[ C = 2\left[l(\hat{\boldsymbol{\pi}}; \mathbf{y}) - l(\tilde{\boldsymbol{\pi}}; \mathbf{y})\right] \]

where l is the log-likelihood function given by (7.4). Thus

\[ C = 2\sum\left[y_i \log\left(\frac{\hat{y}_i}{n_i\tilde{\pi}}\right) + (n_i - y_i)\log\left(\frac{n_i - \hat{y}_i}{n_i - n_i\tilde{\pi}}\right)\right]. \]

From the results in Section 5.5, the approximate sampling distribution for C is χ²(p − 1) if all the p parameters except the intercept term β1 are zero (see Exercise 7.4). Otherwise C will have a non-central distribution. Thus C is a test statistic for the hypothesis that none of the explanatory variables is needed for a parsimonious model. C is sometimes called the likelihood ratio chi-squared statistic.
In the beetle mortality example (Section 7.3.1), C = 272.97 with one degree of freedom, indicating that the slope parameter β2 is definitely needed!
By analogy with R² for multiple linear regression (see Section 6.3.2), another statistic sometimes used is

\[ \text{pseudo } R^2 = \frac{l(\tilde{\boldsymbol{\pi}}; \mathbf{y}) - l(\hat{\boldsymbol{\pi}}; \mathbf{y})}{l(\tilde{\boldsymbol{\pi}}; \mathbf{y})} \]

which represents the proportional improvement in the log-likelihood function due to the terms in the model of interest, compared to the minimal model. This is produced by some statistical programs as a goodness of fit statistic.
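Both C and the pseudo R² follow directly from the two log-likelihoods. A Python sketch for the beetle mortality example (illustrative only; the coefficients are the rounded estimates from Section 7.3.1, so the result matches C = 272.97 only up to rounding):

    import numpy as np
    from scipy.special import gammaln

    def loglik(y, n, pi):
        # log-likelihood (7.4); pi may be a scalar (minimal model) or a vector
        return np.sum(y * np.log(pi) + (n - y) * np.log(1 - pi)
                      + gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1))

    x = np.array([1.6907, 1.7242, 1.7552, 1.7842, 1.8113, 1.8369, 1.8610, 1.8839])
    n = np.array([59, 60, 62, 56, 63, 59, 62, 60], dtype=float)
    y = np.array([6, 13, 18, 28, 52, 53, 61, 60], dtype=float)

    pi_hat = 1 / (1 + np.exp(-(-60.72 + 34.27 * x)))   # fitted logistic model
    pi_min = y.sum() / n.sum()                          # minimal model probability

    C = 2 * (loglik(y, n, pi_hat) - loglik(y, n, pi_min))
    pseudo_R2 = (loglik(y, n, pi_min) - loglik(y, n, pi_hat)) / loglik(y, n, pi_min)
    print(round(C, 1), round(pseudo_R2, 3))             # C approx 273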

7.6 Residuals

For logistic regression there are two main forms of residuals, corresponding to the goodness of fit measures D and X². If there are m different covariate patterns then m residuals can be calculated. Let Yk denote the number of successes, nk the number of trials and π̂k the estimated probability of success for the kth covariate pattern.
The Pearson, or chi-squared, residual is

\[ X_k = \frac{y_k - n_k\hat{\pi}_k}{\sqrt{n_k\hat{\pi}_k(1-\hat{\pi}_k)}}, \quad k = 1, \ldots, m. \tag{7.7} \]

From (7.6), Σ Xk² = X², the Pearson chi-squared goodness of fit statistic. The standardized Pearson residuals are

\[ r_{Pk} = \frac{X_k}{\sqrt{1 - h_k}} \]

where hk is the leverage, which is obtained from the hat matrix (see Section 6.2.6). Deviance residuals can be defined similarly,

\[ d_k = \operatorname{sign}(y_k - n_k\hat{\pi}_k)\left\{2\left[y_k \log\left(\frac{y_k}{n_k\hat{\pi}_k}\right) + (n_k - y_k)\log\left(\frac{n_k - y_k}{n_k - n_k\hat{\pi}_k}\right)\right]\right\}^{1/2} \tag{7.8} \]

where the term sign(yk − nkπ̂k) ensures that dk has the same sign as Xk.

From equation (7.5), Σ dk² = D, the deviance. Also standardized deviance residuals are defined by

\[ r_{Dk} = \frac{d_k}{\sqrt{1 - h_k}}. \]

These residuals can be used for checking the adequacy of a model, as described in Section 2.3.4. For example, they should be plotted against each continuous explanatory variable in the model to check if the assumption of linearity is appropriate and against other possible explanatory variables not included in the model. They should be plotted in the order of the measurements, if applicable, to check for serial correlation. Normal probability plots can also be used because the standardized residuals should have, approximately, the standard Normal distribution N(0, 1), provided the numbers of observations for each covariate pattern are not too small.
If the data are binary, or if ni is small for most covariate patterns, then there are few distinct values of the residuals and the plots may be relatively uninformative. In this case, it may be necessary to rely on the aggregated goodness of fit statistics X² and D and other diagnostics (see Section 7.7).
For more details about the use of residuals for binomial and binary data see Chapter 5 of Collett (1991), for example.
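All of these residuals can be computed in a few lines once the hat matrix H = W^(1/2)X(XᵀWX)⁻¹XᵀW^(1/2), with W = diag[nkπ̂k(1 − π̂k)], is available. A Python sketch for the beetle mortality fit (illustrative only; scipy's xlogy handles the 0 log 0 convention that arises when yk = 0 or yk = nk):

    import numpy as np
    from scipy.special import xlogy   # xlogy(a, b) = a*log(b) with 0*log(0) = 0

    x = np.array([1.6907, 1.7242, 1.7552, 1.7842, 1.8113, 1.8369, 1.8610, 1.8839])
    n = np.array([59, 60, 62, 56, 63, 59, 62, 60], dtype=float)
    y = np.array([6, 13, 18, 28, 52, 53, 61, 60], dtype=float)
    X = np.column_stack([np.ones(8), x])
    pi_hat = 1 / (1 + np.exp(-(-60.72 + 34.27 * x)))    # fitted probabilities

    W = n * pi_hat * (1 - pi_hat)
    Xw = np.sqrt(W)[:, None] * X
    H = Xw @ np.linalg.solve(X.T @ (W[:, None] * X), Xw.T)
    h = np.diag(H)                                      # leverages h_k

    Xk = (y - n * pi_hat) / np.sqrt(W)                  # Pearson residuals (7.7)
    d2 = 2 * (xlogy(y, y / (n * pi_hat))
              + xlogy(n - y, (n - y) / (n - n * pi_hat)))
    dk = np.sign(y - n * pi_hat) * np.sqrt(d2)          # deviance residuals (7.8)

    rP = Xk / np.sqrt(1 - h)                            # standardized Pearson residuals
    rD = dk / np.sqrt(1 - h)                            # standardized deviance residuals
    print(round(np.sum(Xk**2), 2), round(np.sum(dk**2), 2))   # X^2 and D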

7.7 Other diagnostics

By analogy with the statistics used to detect influential observations in multiple linear regression, the statistics delta-beta, delta-chi-squared and delta-deviance are also available for logistic regression (see Section 6.2.7).
For binary or binomial data there are additional issues to consider. The first is to check the choice of the link function. Brown (1982) developed a test for the logit link which is implemented in some software. The approach suggested by Aranda-Ordaz (1981) is to consider a more general family of link functions

\[ g(\pi, \alpha) = \log\left[\frac{(1-\pi)^{-\alpha} - 1}{\alpha}\right]. \]

If α = 1 then g(π) = log[π/(1 − π)], the logit link. As α → 0, then g(π) → log[− log(1 − π)], the complementary log-log link. In principle, an optimal value of α can be estimated from the data, but the process requires several steps. In the absence of suitable software to identify the best link function, it is advisable to experiment with several alternative links.
The second issue in assessing the adequacy of models for binary or binomial data is overdispersion. Observations Yi which might be expected to correspond to the binomial distribution may have variance greater than niπi(1 − πi). There is an indicator of this problem if the deviance D is much greater than the expected value of N − p. This could be due to inadequate specification of the model (e.g., relevant explanatory variables have been omitted or the link function is incorrect) or to a more complex structure. One approach is to include an extra parameter φ in the model so that var(Yi) = niπi(1 − πi)φ.

This is implemented in various ways in statistical software. Another possible explanation for overdispersion is that the Yi's are not independent. Methods for modelling correlated data are outlined in Chapter 11. For a detailed discussion of overdispersion for binomial data, see Collett (1991), Chapter 6.

7.8 Example: Senility and WAIS

A sample of elderly people was given a psychiatric examination to determine whether symptoms of senility were present. Other measurements taken at the same time included the score on a subset of the Wechsler Adult Intelligence Scale (WAIS). The data are shown in Table 7.8.

Table 7.8 Symptoms of senility (s = 1 if symptoms are present and s = 0 otherwise) and WAIS scores (x) for N = 54 people.

x   s    x   s    x   s    x   s    x   s
9   1    7   1    7   0    17  0    13  0
13  1    5   1    16  0    14  0    13  0
6   1    14  1    9   0    19  0    9   0
8   1    13  0    9   0    9   0    15  0
10  1    16  0    11  0    11  0    10  0
4   1    10  0    13  0    14  0    11  0
14  1    12  0    15  0    10  0    12  0
8   1    11  0    13  0    16  0    4   0
11  1    14  0    10  0    10  0    14  0
7   1    15  0    11  0    16  0    20  0
9   1    18  0    6   0    14  0

The data in Table 7.8 are binary although some people have the same WAIS scores and so there are m = 17 different covariate patterns (see Table 7.9). Let Yi denote the number of people with symptoms among ni people with the ith covariate pattern. The logistic regression model

\[ \log\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_1 + \beta_2 x_i; \quad Y_i \sim \text{binomial}(n_i, \pi_i), \quad i = 1, \ldots, m, \]

was fitted with the following results:

b1 = 2.404, standard error (b1) = 1.192,
b2 = −0.3235, standard error (b2) = 0.1140,
X² = Σ Xi² = 8.083 and D = Σ di² = 9.419.

As there are m = 17 covariate patterns and p = 2 parameters, X² and D can be compared with χ²(15) – by these criteria the model appears to fit well. For the minimal model, without x, the maximum value of the log-likelihood function is l(π̃; y) = −30.9032. For the model with x, the corresponding value is l(π̂; y) = −25.5087. Therefore, from Section 7.5, C = 10.789, which is highly

significant compared with χ²(1), showing that the slope parameter is non-zero. Also pseudo R² = 0.17, which suggests the model is not particularly good.

[Figure 7.5 Relationship between presence of symptoms and WAIS score from the data in Tables 7.8 and 7.9; dots represent estimated probabilities and diamonds represent observed proportions.]

Figure 7.5 shows the observed relative frequencies yi/ni for each covariate pattern and the fitted probabilities π̂i plotted against WAIS score, x (for i = 1, ..., m). The model appears to fit better for higher values of x.

Table 7.9 shows the covariate patterns, estimates π̂i and the corresponding chi-squared and deviance residuals calculated using equations (7.7) and (7.8) respectively.
The residuals and associated residual plots (not shown) do not suggest that there are any unusual observations, but the small numbers of observations for each covariate value make the residuals difficult to assess. The Hosmer-Lemeshow approach provides some simplification; Table 7.10 shows the data in categories defined by grouping values of π̂i so that the total numbers of observations per category are approximately equal. For this illustration, g = 3 categories were chosen. The expected frequencies are obtained from the values in Table 7.9; there are Σ niπ̂i with symptoms and Σ ni(1 − π̂i) without symptoms for each category. The Hosmer-Lemeshow statistic X²HL is obtained by calculating X² = Σ(o − e)²/e, where the observed frequencies, o, and expected frequencies, e, are given in Table 7.10 and summation is over all 6 cells of the table; X²HL = 1.15, which is not significant when compared with the χ²(1) distribution.

Table 7.9 Covariate patterns and responses, estimated probabilities (π̂), Pearson residuals (X) and deviances (d) for senility and WAIS.

x    y    n    π̂       X        d
4    1    2    0.751    −0.826   −0.766
5    0    1    0.687    0.675    0.866
6    1    2    0.614    −0.330   −0.326
7    1    3    0.535    0.458    0.464
8    0    2    0.454    1.551    1.777
9    4    6    0.376    −0.214   −0.216
10   5    6    0.303    −0.728   −0.771
11   5    6    0.240    −0.419   −0.436
12   2    2    0.186    −0.675   −0.906
13   5    6    0.142    0.176    0.172
14   5    7    0.107    1.535    1.306
15   3    3    0.080    −0.509   −0.705
16   4    4    0.059    −0.500   −0.696
17   1    1    0.043    −0.213   −0.297
18   1    1    0.032    −0.181   −0.254
19   1    1    0.023    −0.154   −0.216
20   1    1    0.017    −0.131   −0.184
Sum  40   54   Sum of squares    8.084*   9.418*

* Sums of squares differ slightly from the goodness of fit statistics X² and D mentioned in the text due to rounding errors.
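A Hosmer-Lemeshow type statistic is straightforward to compute from the fitted probabilities. The sketch below is illustrative and not part of the original text; it groups covariate patterns, rather than individual observations as in Table 7.10 below, so for these data it need not reproduce X²HL = 1.15 exactly:

    import numpy as np

    def hosmer_lemeshow(y, n, pi_hat, g=3):
        # y = successes, n = trials, pi_hat = fitted probability per covariate pattern
        order = np.argsort(pi_hat)
        x2 = 0.0
        for idx in np.array_split(order, g):     # g groups of patterns, ordered by pi_hat
            o1, e1 = y[idx].sum(), (n[idx] * pi_hat[idx]).sum()                   # with symptoms
            o0, e0 = (n[idx] - y[idx]).sum(), (n[idx] * (1 - pi_hat[idx])).sum()  # without
            x2 += (o1 - e1)**2 / e1 + (o0 - e0)**2 / e0
        return x2                                # compare with chi-squared(g - 2)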

Table 7.10 Hosmer-Lemeshow test for data in Table 7.9: observed frequencies (o) and expected frequencies (e) for numbers of people with or without symptoms, grouped by values of π̂.

Values of π̂                         ≤ 0.107    0.108−0.303    > 0.303
Corresponding values of x            14−20      10−13          4−9
Number of people       o             2          3              9
with symptoms          e             1.335      4.479          8.186
Number of people       o             16         17             7
without symptoms       e             16.665     15.521         7.814
Total number of people               18         20             16

7.9 Exercises

7.1 The number of deaths from leukemia and other cancers among survivors of the Hiroshima atom bomb are shown in Table 7.11, classified by the radiation dose received. The data refer to deaths during the period 1950-59 among survivors who were aged 25 to 64 years in 1950 (from data set 13 of Cox and Snell, 1981, attributed to Otake, 1979). Obtain a suitable model to describe the dose-response relationship between radiation and the proportional mortality rates for leukemia.

Table 7.11 Deaths from leukemia and other cancers classified by radiation dose received from the Hiroshima atomic bomb.

                 Radiation dose (rads)
Deaths           0     1-9   10-49   50-99   100-199   200+
Leukemia         13    5     5       3       4         18
Other cancers    378   200   151     47      31        33
Total cancers    391   205   156     50      35        51

7.2 Odds ratios. Consider a 2 × 2 contingency table from a prospective study in which people who were or were not exposed to some pollutant are followed up and, after several years, categorized according to the presence or absence of a disease. Table 7.12 shows the probabilities for each cell. The odds of disease for either exposure group is Oi = πi/(1 − πi), for i = 1, 2, and so the odds ratio

\[ \phi = \frac{O_1}{O_2} = \frac{\pi_1(1-\pi_2)}{\pi_2(1-\pi_1)} \]

is a measure of the relative odds of disease for the exposed and not exposed groups.

Table 7.12 2 × 2 table for a prospective study of exposure and disease outcome.

               Diseased   Not diseased
Exposed        π1         1 − π1
Not exposed    π2         1 − π2

(a) For the simple logistic model πi = e^{βi}/(1 + e^{βi}), show that if there is no difference between the exposed and not exposed groups (i.e., β1 = β2) then φ = 1.

(b) Consider J 2 × 2 tables like Table 7.12, one for each level xj of a factor, such as age group, with j = 1, ..., J. For the logistic model

\[ \pi_{ij} = \frac{\exp(\alpha_i + \beta_i x_j)}{1 + \exp(\alpha_i + \beta_i x_j)}, \quad i = 1, 2, \quad j = 1, \ldots, J, \]

show that log φ is constant over all tables if β1 = β2 (McKinlay, 1978).

7.3 Tables 7.13 and 7.14 show the survival 50 years after graduation of men and women who graduated each year from 1938 to 1947 from various Faculties of the University of Adelaide (data compiled by J.A. Keats). The columns labelled S contain the number of graduates who survived and the columns labelled T contain the total number of graduates. There were insufficient women graduates from the Faculties of Medicine and Engineering to warrant analysis.

Table 7.13 Fifty years survival for men after graduation from the University of Adelaide.

Year of      Faculty of
graduation   Medicine     Arts        Science     Engineering
             S     T      S    T      S    T      S     T
1938         18    22     16   30     9    14     10    16
1939         16    23     13   22     9    12     7     11
1940         7     17     11   25     12   19     12    15
1941         12    25     12   14     12   15     8     9
1942         24    50     8    12     20   28     5     7
1943         16    21     11   20     16   21     1     2
1944         22    32     4    10     25   31     16    22
1945         12    14     4    12     32   38     19    25
1946         22    34                 4    5
1947         28    37     13   23     25   31     25    35
Total        177   275    92   168    164  214    100   139

Table 7.14 Fifty years survival for women after graduation from the University of Adelaide.

Year of      Faculty of
graduation   Arts         Science
             S     T      S    T
1938         14    19     1    1
1939         11    16     4    4
1940         15    18     6    7
1941         15    21     3    3
1942         8     9      4    4
1943         13    13     8    9
1944         18    22     5    5
1945         18    22     16   17
1946         1     1      1    1
1947         13    16     10   10
Total        126   157    58   61

(a) Are the proportions of graduates who survived for 50 years after graduation the same for all years of graduation?
(b) Are the proportions of male graduates who survived for 50 years after graduation the same for all Faculties?
(c) Are the proportions of female graduates who survived for 50 years after graduation the same for Arts and Science?
(d) Is the difference between men and women in the proportion of graduates who survived for 50 years after graduation the same for Arts and Science?

7.4 Let l(bmin) denote the maximum value of the log-likelihood function for the minimal model with linear predictor xᵀβ = β1, and let l(b) be the corresponding value for a more general model xᵀβ = β1 + β2x1 + ... + βpxp−1.
(a) Show that the likelihood ratio chi-squared statistic is

\[ C = 2\left[l(\mathbf{b}) - l(\mathbf{b}_{\min})\right] = D_0 - D_1 \]

where D0 is the deviance for the minimal model and D1 is the deviance for the more general model.

(b) Deduce that if β2 = ... = βp = 0 then C has the central chi-squared distribution with (p − 1) degrees of freedom.

8
Nominal and Ordinal Logistic Regression

8.1 Introduction

If the response variable is categorical, with more than two categories, then there are two options for generalized linear models. One relies on generalizations of logistic regression from dichotomous responses, described in Chapter 7, to nominal or ordinal responses with more than two categories. This first approach is the subject of this chapter. The other option is to model the frequencies or counts for the covariate patterns as the response variables with Poisson distributions. The second approach, called log-linear modelling, is covered in Chapter 9.
For nominal or ordinal logistic regression one of the measured or observed categorical variables is regarded as the response, and all other variables are explanatory variables. For log-linear models, all the variables are treated alike. The choice of which approach to use in a particular situation depends on whether one variable is clearly a 'response' (for example, the outcome of a prospective study) or several variables have the same status (as may be the situation in a cross-sectional study). Additionally, the choice may depend on how the results are to be presented and interpreted. Nominal and ordinal logistic regression yield odds ratio estimates which are relatively easy to interpret if there are no interactions (or only fairly simple interactions). Log-linear models are good for testing hypotheses about complex interactions, but the parameter estimates are less easily interpreted.
This chapter begins with the multinomial distribution which provides the basis for modelling categorical data with more than two categories. Then the various formulations for nominal and ordinal logistic regression models are discussed, including the interpretation of parameter estimates and methods for checking the adequacy of a model. A numerical example is used to illustrate the methods.

8.2 Multinomial distribution

Consider a random variable Y with J categories. Let π1, π2, ..., πJ denote the respective probabilities, with π1 + π2 + ... + πJ = 1. If there are n independent observations of Y which result in y1 outcomes in category 1, y2 outcomes in

category 2, and so on, then let

\[ \mathbf{y} = \begin{bmatrix} y_1\\ y_2\\ \vdots\\ y_J \end{bmatrix}, \quad \text{with} \quad \sum_{j=1}^{J} y_j = n. \]

The multinomial distribution is

\[ f(\mathbf{y} \mid n) = \frac{n!}{y_1!\, y_2! \ldots y_J!}\, \pi_1^{y_1}\pi_2^{y_2}\ldots\pi_J^{y_J}. \tag{8.1} \]

If J = 2, then π2 = 1 − π1, y2 = n − y1 and (8.1) is the binomial distribution; see (7.2). In general, (8.1) does not satisfy the requirements for being a member of the exponential family of distributions (3.3). However the following relationship with the Poisson distribution ensures that generalized linear modelling is appropriate.
Let Y1, ..., YJ denote independent random variables with distributions Yj ∼ Poisson(λj). Their joint probability distribution is

\[ f(\mathbf{y}) = \prod_{j=1}^{J} \frac{\lambda_j^{y_j} e^{-\lambda_j}}{y_j!} \tag{8.2} \]

where

\[ \mathbf{y} = \begin{bmatrix} y_1\\ \vdots\\ y_J \end{bmatrix}. \]

Let n = Y1 + Y2 + ... + YJ; then n is a random variable with the distribution n ∼ Poisson(λ1 + λ2 + ... + λJ) (see, for example, Kalbfleisch, 1985, page 142). Therefore the distribution of y conditional on n is

\[ f(\mathbf{y} \mid n) = \left[\prod_{j=1}^{J} \frac{\lambda_j^{y_j} e^{-\lambda_j}}{y_j!}\right] \bigg/ \left[\frac{(\lambda_1 + \ldots + \lambda_J)^n e^{-(\lambda_1 + \ldots + \lambda_J)}}{n!}\right] \]

which can be simplified to

\[ f(\mathbf{y} \mid n) = \left(\frac{\lambda_1}{\sum \lambda_k}\right)^{y_1} \cdots \left(\frac{\lambda_J}{\sum \lambda_k}\right)^{y_J} \frac{n!}{y_1! \ldots y_J!}. \tag{8.3} \]

If πj = λj / Σ_{k=1}^{J} λk, for j = 1, ..., J, then (8.3) is the same as (8.1) and Σ πj = 1, as required. Therefore the multinomial distribution can be regarded as the joint distribution of Poisson random variables, conditional upon their sum n. This result provides a justification for the use of generalized linear modelling.
For the multinomial distribution (8.1) it can be shown that E(Yj) = nπj, var(Yj) = nπj(1 − πj) and cov(Yj, Yk) = −nπjπk (see, for example, Agresti, 1990, page 44).
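The identity between (8.1) and (8.3) can be checked numerically. A small Python sketch (illustrative only; the rates λj are arbitrary):

    import numpy as np
    from scipy.stats import multinomial, poisson

    lam = np.array([2.0, 1.0, 3.0])                  # illustrative Poisson rates
    y = np.array([3, 1, 4]); n = y.sum()

    joint = np.prod(poisson.pmf(y, lam))             # joint Poisson probability (8.2)
    cond = joint / poisson.pmf(n, lam.sum())         # condition on the total n
    direct = multinomial.pmf(y, n, lam / lam.sum())  # multinomial (8.1) with pi_j = lambda_j / sum
    print(np.isclose(cond, direct))                  # True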

In this chapter models based on the binomial distribution are considered, because pairs of response categories are compared, rather than all J categories simultaneously.

8.3 Nominal logistic regression

Nominal logistic regression models are used when there is no natural order among the response categories. One category is arbitrarily chosen as the reference category. Suppose this is the first category. Then the logits for the other categories are defined by

\[ \text{logit}(\pi_j) = \log\left(\frac{\pi_j}{\pi_1}\right) = \mathbf{x}_j^T\boldsymbol{\beta}_j, \quad \text{for } j = 2, \ldots, J. \tag{8.4} \]

The (J − 1) logit equations are used simultaneously to estimate the parameters βj. Once the parameter estimates bj have been obtained, the linear predictors xjᵀbj can be calculated. From (8.4)

\[ \hat{\pi}_j = \hat{\pi}_1 \exp\left(\mathbf{x}_j^T\mathbf{b}_j\right) \quad \text{for } j = 2, \ldots, J. \]

But π̂1 + π̂2 + ... + π̂J = 1, so

\[ \hat{\pi}_1 = \frac{1}{1 + \sum_{j=2}^{J} \exp\left(\mathbf{x}_j^T\mathbf{b}_j\right)} \]

and

\[ \hat{\pi}_j = \frac{\exp\left(\mathbf{x}_j^T\mathbf{b}_j\right)}{1 + \sum_{j=2}^{J} \exp\left(\mathbf{x}_j^T\mathbf{b}_j\right)}, \quad \text{for } j = 2, \ldots, J. \]

Fitted values, or 'expected frequencies', for each covariate pattern can be calculated by multiplying the estimated probabilities π̂j by the total frequency of the covariate pattern.
The Pearson chi-squared residuals are given by

\[ r_i = \frac{o_i - e_i}{\sqrt{e_i}} \tag{8.5} \]

where oi and ei are the observed and expected frequencies for i = 1, ..., N, where N is J times the number of distinct covariate patterns. The residuals can be used to assess the adequacy of the model.
Summary statistics for goodness of fit are analogous to those for binomial logistic regression:
(i) Chi-squared statistic

\[ X^2 = \sum_{i=1}^{N} r_i^2; \tag{8.6} \]

(ii) Deviance, defined in terms of the maximum values of the log-likelihood function for the fitted model, l(b), and for the maximal model, l(bmax),

D =2[l(bmax) − l(b)] ; (8.7)

(iii) Likelihood ratio chi-squared statistic, defined in terms of the maximum value of the log-likelihood function for the minimal model, l(bmin), and l(b),
$$
C = 2\left[ l(\mathbf{b}) - l(\mathbf{b}_{\min}) \right]; \tag{8.8}
$$
(iv) Pseudo R²,
$$
\text{pseudo } R^2 = \frac{l(\mathbf{b}_{\min}) - l(\mathbf{b})}{l(\mathbf{b}_{\min})}. \tag{8.9}
$$
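As a quick arithmetic illustration of (8.8) and (8.9), here is a minimal sketch using the log-likelihood values reported for the car preference example in Section 8.3.1 below:

```python
# Log-likelihoods from Section 8.3.1: minimal model and fitted model (8.10)
l_min, l_fit = -329.27, -290.35
C = 2 * (l_fit - l_min)                # likelihood ratio statistic (8.8): 77.84
pseudo_R2 = (l_min - l_fit) / l_min    # pseudo R-squared (8.9): ~0.118
print(C, pseudo_R2)
```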

If the model fits well then both X² and D have, asymptotically, the distribution χ²(N − p) where p is the number of parameters estimated. C has the asymptotic distribution χ²[p − (J − 1)] because the minimal model will have one parameter for each logit defined in (8.4).

Often it is easier to interpret the effects of explanatory factors in terms of odds ratios than the parameters β. For simplicity, consider a response variable with J categories and a binary explanatory variable x which denotes whether an 'exposure' factor is present (x = 1) or absent (x = 0). The odds ratio for exposure for response j (j = 2, ..., J) relative to the reference category j = 1 is
$$
OR_j = \frac{\pi_{jp}/\pi_{1p}}{\pi_{ja}/\pi_{1a}}
$$
where πjp and πja denote the probabilities of response category j (j = 1, ..., J) according to whether exposure is present or absent, respectively. For the model
$$
\log\left( \frac{\pi_j}{\pi_1} \right) = \beta_{0j} + \beta_{1j} x, \quad j = 2, \ldots, J,
$$
the log odds are
$$
\log\left( \frac{\pi_{ja}}{\pi_{1a}} \right) = \beta_{0j} \quad \text{when } x = 0, \text{ indicating the exposure is absent, and}
$$
$$
\log\left( \frac{\pi_{jp}}{\pi_{1p}} \right) = \beta_{0j} + \beta_{1j} \quad \text{when } x = 1, \text{ indicating the exposure is present.}
$$
Therefore the logarithm of the odds ratio can be written as
$$
\log OR_j = \log\left( \frac{\pi_{jp}}{\pi_{1p}} \right) - \log\left( \frac{\pi_{ja}}{\pi_{1a}} \right) = \beta_{1j}.
$$
Hence ORj = exp(β1j), which is estimated by exp(b1j). If β1j = 0 then ORj = 1, which corresponds to the exposure factor having no effect. Also, for example, 95% confidence limits for ORj are given by exp[b1j ± 1.96 × s.e.(b1j)], where s.e.(b1j) denotes the standard error of b1j. Confidence intervals which do not include unity correspond to β values significantly different from zero.
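Turning a fitted coefficient and its standard error into an odds ratio with confidence limits is a one-line calculation; a minimal sketch (the numerical values are the 'men' coefficient for the 'very important' logit reported in Table 8.2 below):

```python
import numpy as np

def odds_ratio_ci(b, se, z=1.96):
    """Odds ratio exp(b) and 95% confidence limits exp(b +/- z * s.e.(b))."""
    return np.exp(b), np.exp(b - z * se), np.exp(b + z * se)

print(odds_ratio_ci(-0.813, 0.321))   # ~ (0.44, 0.24, 0.83), as in Table 8.2
```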

For nominal logistic regression, the explanatory variables may be categorical or continuous. The choice of the reference category for the response variable will affect the parameter estimates b but not the estimated probabilities $\hat{\pi}$ or the fitted values. The following example illustrates the main characteristics of nominal logistic regression.

8.3.1 Example: Car preferences

In a study of motor vehicle safety, men and women driving small, medium sized and large cars were interviewed about vehicle safety and their preferences for cars, and various measurements were made of how close they sat to the steering wheel (McFadden et al., 2000). There were 50 subjects in each of the six categories (two sexes and three car sizes). They were asked to rate how important various features were to them when they were buying a car. Table 8.1 shows the ratings for air conditioning and power steering, according to the sex and age of the subject (the categories 'not important' and 'of little importance' have been combined).

Table 8.1 Importance of air conditioning and power steering in cars (row percentages in brackets*)

                     Response
Sex     Age      No or little   Important   Very        Total
                 importance                 important
Women   18-23    26 (58%)       12 (27%)     7 (16%)     45
        24-40     9 (20%)       21 (47%)    15 (33%)     45
        > 40      5 (8%)        14 (23%)    41 (68%)     60
Men     18-23    40 (62%)       17 (26%)     8 (12%)     65
        24-40    17 (39%)       15 (34%)    12 (27%)     44
        > 40      8 (20%)       15 (37%)    18 (44%)     41
Total           105             94         101          300

* Row percentages may not add to 100 due to rounding.

The proportions of responses in each category by age and sex are shown in Figure 8.1. For these data the response, importance of air conditioning and power steering, is rated on an ordinal scale but for the purpose of this example the order is ignored and the 3-point scale is treated as nominal. The category 'no or little' importance is chosen as the reference category. Age is also ordinal, but initially we will regard it as nominal.

Table 8.2 shows the results of fitting the nominal logistic regression model with reference categories of 'Women' and '18-23 years', and
$$
\log\left( \frac{\pi_j}{\pi_1} \right) = \beta_{0j} + \beta_{1j} x_1 + \beta_{2j} x_2 + \beta_{3j} x_3, \quad j = 2, 3 \tag{8.10}
$$

[Figure 8.1 appears here: two panels, 'Women: preference for air conditioning and power steering' and 'Men: preference for air conditioning and power steering', each plotting proportion (0.0 to 0.8) against age group (18-23, 24-40, over 40).]

Figure 8.1 Preferences for air conditioning and power steering: proportions of responses in each category by age and sex of respondents (solid lines denote 'no/little importance', dashed lines denote 'important' and dotted lines denote 'very important').

where
$$
x_1 = \begin{cases} 1 & \text{for men} \\ 0 & \text{for women} \end{cases}, \qquad
x_2 = \begin{cases} 1 & \text{for age 24-40 years} \\ 0 & \text{otherwise} \end{cases}
$$
and
$$
x_3 = \begin{cases} 1 & \text{for age} > 40 \text{ years} \\ 0 & \text{otherwise.} \end{cases}
$$

Table 8.2 Results of fitting the nominal logistic regression model (8.10) to the data in Table 8.1.

Parameter β       Estimate b (std. error)   Odds ratio, OR = e^b (95% confidence interval)

log(π2/π1): important vs. no/little importance
β02: constant     -0.591 (0.284)
β12: men          -0.388 (0.301)            0.68 (0.38, 1.22)
β22: 24-40         1.128 (0.342)            3.09 (1.58, 6.04)
β32: > 40          1.588 (0.403)            4.89 (2.22, 10.78)

log(π3/π1): very important vs. no/little importance
β03: constant     -1.039 (0.331)
β13: men          -0.813 (0.321)            0.44 (0.24, 0.83)
β23: 24-40         1.478 (0.401)            4.38 (2.00, 9.62)
β33: > 40          2.917 (0.423)            18.48 (8.07, 42.34)
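Fits like Table 8.2 can be reproduced in standard software. Below is a minimal sketch using Python's statsmodels (assuming its MNLogit class, which takes the lowest response code as the reference category); the grouped counts of Table 8.1 are first expanded to one row per respondent. Output labels may differ between software versions:

```python
import pandas as pd
import statsmodels.api as sm

# Grouped counts from Table 8.1, expanded to individual responses
counts = {('Women', '18-23'): [26, 12, 7],
          ('Women', '24-40'): [9, 21, 15],
          ('Women', '>40'):   [5, 14, 41],
          ('Men',   '18-23'): [40, 17, 8],
          ('Men',   '24-40'): [17, 15, 12],
          ('Men',   '>40'):   [8, 15, 18]}
rows = []
for (sex, age), freqs in counts.items():
    for response, f in enumerate(freqs):   # 0 = no/little, 1 = important, 2 = very
        rows.extend([(sex, age, response)] * f)
df = pd.DataFrame(rows, columns=['sex', 'age', 'response'])

X = pd.DataFrame({'x1': (df['sex'] == 'Men').astype(int),
                  'x2': (df['age'] == '24-40').astype(int),
                  'x3': (df['age'] == '>40').astype(int)})
X = sm.add_constant(X)

fit = sm.MNLogit(df['response'], X).fit()
print(fit.params)   # columns correspond to j = 2, 3; compare with Table 8.2
```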

The maximum value of the log-likelihood function for the minimal model (with only two parameters, β02 and β03) is −329.27 and for the fitted model (8.10) is −290.35, giving the likelihood ratio chi-squared statistic C = 2 × (−290.35 + 329.27) = 77.84 and pseudo R² = (−329.27 + 290.35)/(−329.27) = 0.118. The first statistic, which has 6 degrees of freedom (8 parameters in the fitted model minus 2 for the minimal model), is very significant compared with the χ²(6) distribution, showing the overall importance of the explanatory variables. However, the second statistic suggests that only 11.8% of the 'variation' is 'explained' by these factors. From the Wald statistics [b/s.e.(b)] and the odds ratios and the confidence intervals, it is clear that the importance of air-conditioning and power steering increased significantly with age. Also men considered these features less important than women did, although the statistical significance of this finding is dubious (especially considering the small frequencies in some cells).

To estimate the probabilities, first consider the preferences of women (x1 = 0) aged 18-23 (so x2 = 0 and x3 = 0). For this group
$$
\log\left( \frac{\pi_2}{\pi_1} \right) = -0.591 \quad \text{so} \quad \frac{\pi_2}{\pi_1} = e^{-0.591} = 0.5539,
$$
$$
\log\left( \frac{\pi_3}{\pi_1} \right) = -1.039 \quad \text{so} \quad \frac{\pi_3}{\pi_1} = e^{-1.039} = 0.3538
$$

Table 8.3 Results from fitting the nominal logistic regression model (8.10) to the data in Table 8.1.

Sex     Age     Importance   Obs.    Estimated     Fitted   Pearson
                rating*      freq.   probability   value    residual
Women   18-23   1            26      0.524         23.59     0.496
                2            12      0.290         13.07    -0.295
                3             7      0.186          8.35    -0.466
        24-40   1             9      0.234         10.56    -0.479
                2            21      0.402         18.07     0.690
                3            15      0.364         16.37    -0.340
        > 40    1             5      0.098          5.85    -0.353
                2            14      0.264         15.87    -0.468
                3            41      0.638         38.28     0.440
Men     18-23   1            40      0.652         42.41    -0.370
                2            17      0.245         15.93     0.267
                3             8      0.102          6.65     0.522
        24-40   1            17      0.351         15.44     0.396
                2            15      0.408         17.93    -0.692
                3            12      0.241         10.63     0.422
        > 40    1             8      0.174          7.15     0.320
                2            15      0.320         13.13     0.515
                3            18      0.505         20.72    -0.600

Total                        300                   300

Sum of squares                                                3.931

* 1 denotes 'no/little' importance, 2 denotes 'important', 3 denotes 'very important'.

but π1 + π2 + π3 = 1 so π1(1 + 0.5539 + 0.3538) = 1, therefore π1 = 1/1.9077 = 0.524 and hence π2 = 0.290 and π3 = 0.186. Now consider men (x1 = 1) aged over 40 (so x2 = 0, but x3 = 1) so that log(π2/π1) = −0.591 − 0.388 + 1.588 = 0.609 and log(π3/π1) = 1.065, and hence π1 = 0.174, π2 = 0.320 and π3 = 0.505 (correct to 3 decimal places). These estimated probabilities can be multiplied by the total frequency for each sex × age group to obtain the 'expected' frequencies or fitted values. These are shown in Table 8.3, together with the Pearson residuals defined in (8.5). The sum of squares of the Pearson residuals, the chi-squared goodness of fit statistic (8.6), is X² = 3.93.

The maximal model that can be fitted to these data involves terms for age, sex and age × sex interactions. It has 6 parameters (a constant and coefficients for sex, two age categories and two age × sex interactions) for j = 2 and 6 parameters for j = 3, giving a total of 12 parameters. The maximum value of the log-likelihood function for the maximal model is −288.38. Therefore the deviance for the fitted model (8.10) is D = 2 × (−288.38 + 290.35) = 3.94.
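The back-calculation from fitted logits to estimated probabilities is mechanical. A minimal sketch (`category_probs` is our hypothetical helper, not a library function) reproducing the two covariate patterns worked above:

```python
import numpy as np

def category_probs(eta):
    """Probabilities for all J categories from the J-1 fitted logits
    log(pi_j / pi_1); the reference category comes first in the output."""
    expd = np.exp(np.asarray(eta))
    pi1 = 1.0 / (1.0 + expd.sum())
    return np.concatenate(([pi1], pi1 * expd))

print(category_probs([-0.591, -1.039]))   # women 18-23: 0.524, 0.290, 0.186
print(category_probs([0.609, 1.065]))     # men > 40:    0.174, 0.320, 0.505
```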

The degrees of freedom associated with this deviance are 12 − 8 = 4 because the maximal model has 12 parameters and the fitted model has 8 parameters. As expected, the values of the goodness of fit statistics D = 3.94 and X² = 3.93 are very similar; when compared to the distribution χ²(4) they suggest that model (8.10) provides a good description of the data.

An alternative model can be fitted with age group as a covariate, that is
$$
\log\left( \frac{\pi_j}{\pi_1} \right) = \beta_{0j} + \beta_{1j} x_1 + \beta_{2j} x_2; \quad j = 2, 3, \tag{8.11}
$$
where
$$
x_1 = \begin{cases} 1 & \text{for men} \\ 0 & \text{for women} \end{cases}
\qquad \text{and} \qquad
x_2 = \begin{cases} 0 & \text{for age group 18-23} \\ 1 & \text{for age group 24-40} \\ 2 & \text{for age group} > 40. \end{cases}
$$
This model fits the data almost as well as (8.10) but with two fewer parameters. The maximum value of the log-likelihood function is −291.05, so the difference in deviance from model (8.10) is
$$
D = 2 \times (-290.35 + 291.05) = 1.4,
$$
which is not significant compared with the distribution χ²(2). So on the grounds of parsimony model (8.11) is preferable.

8.4 Ordinal logistic regression

If there is an obvious natural order among the response categories then this can be taken into account in the model specification. The example on car preferences (Section 8.3.1) provides an illustration as the study participants rated the importance of air conditioning and power steering in four categories from 'not important' to 'very important'. Ordinal responses like this are common in areas such as market research, opinion polls and fields like psychiatry where 'soft' measures are common (Ashby et al., 1989).

In some situations there may, conceptually, be a continuous variable z which is difficult to measure, such as severity of disease. It is assessed by some crude method that amounts to identifying 'cut points', Cj, for the latent variable so that, for example, patients with small values are classified as having 'no disease', those with larger values of z are classified as having 'mild disease' or 'moderate disease' and those with high values are classified as having 'severe disease' (see Figure 8.2). The cut points C1, ..., C_{J−1} define J ordinal categories with associated probabilities π1, ..., πJ (with $\sum_{j=1}^{J} \pi_j = 1$).

Not all ordinal variables can be thought of in this way, because the underlying process may have many components, as in the car preference example. Nevertheless, the idea is helpful for interpreting the results from statistical models. For ordinal categories, there are several different commonly used models which are described in the next sections.

[Figure 8.2 appears here: a continuous distribution partitioned by cutpoints C1, C2 and C3 into regions with probabilities π1, π2, π3 and π4.]

Figure 8.2 Distribution of continuous latent variable and cutpoints that define an ordinal response variable.

8.4.1 Cumulative logit model

The cumulative odds for the jth category are
$$
\frac{P(z \le C_j)}{P(z > C_j)} = \frac{\pi_1 + \pi_2 + \cdots + \pi_j}{\pi_{j+1} + \cdots + \pi_J};
$$
see Figure 8.2. The cumulative logit model is
$$
\log\left( \frac{\pi_1 + \cdots + \pi_j}{\pi_{j+1} + \cdots + \pi_J} \right) = \mathbf{x}_j^T \boldsymbol{\beta}_j. \tag{8.12}
$$

8.4.2 Proportional odds model

If the linear predictor $\mathbf{x}_j^T \boldsymbol{\beta}_j$ in (8.12) has an intercept term β0j which depends on the category j, but the other explanatory variables do not depend on j, then the model is
$$
\log\left( \frac{\pi_1 + \cdots + \pi_j}{\pi_{j+1} + \cdots + \pi_J} \right) = \beta_{0j} + \beta_1 x_1 + \cdots + \beta_{p-1} x_{p-1}. \tag{8.13}
$$
This is called the proportional odds model. It is based on the assumption that the effects of the covariates x1, ..., x_{p−1} are the same for all categories, on the logarithmic scale. Figure 8.3 shows the model for J = 3 response categories and one continuous explanatory variable x; on the log odds scale the probabilities for categories are represented by parallel lines.

[Figure 8.3 appears here: three parallel lines for j = 1, 2, 3 with intercepts β01, β02 and β03, plotting log odds against x.]

Figure 8.3 Proportional odds model, on log odds scale.

As for the nominal logistic regression model (8.4), the odds ratio associated with an increase of one unit in an explanatory variable xk is exp(βk), where k = 1, ..., p − 1.

If some of the categories are amalgamated, this does not change the parameter estimates β1, ..., β_{p−1} in (8.13) – although, of course, the terms β0j will be affected (this is called the collapsibility property; see Ananth and Kleinbaum, 1997). This form of independence between the cutpoints Cj (in Figure 8.2) and the explanatory variables xk is desirable for many applications.

Another useful property of the proportional odds model is that it is not affected if the labelling of the categories is reversed – only the signs of the parameters will be changed.

The appropriateness of the proportional odds assumption can be tested by comparing models (8.12) and (8.13), if there is only one explanatory variable x. If there are several explanatory variables the assumption can be tested separately for each variable by fitting (8.12) with the relevant parameter not depending on j.

The proportional odds model is the usual (or default) form of ordinal logistic regression provided by statistical software.
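To make the parameterization concrete, the following minimal sketch (`prop_odds_probs` is a hypothetical helper, and the numerical values are illustrative only) converts cutpoint intercepts and a linear predictor value into category probabilities under (8.13):

```python
import numpy as np
from scipy.special import expit

def prop_odds_probs(intercepts, xb):
    """Category probabilities under (8.13), where
    logit P(Y <= j) = beta_0j + x'beta for j = 1, ..., J-1."""
    cum = expit(np.asarray(intercepts) + xb)   # cumulative probabilities
    cum = np.append(cum, 1.0)                  # P(Y <= J) = 1
    return np.diff(np.concatenate(([0.0], cum)))

# e.g. two increasing cutpoints and x'beta = 0.5:
print(prop_odds_probs([-1.0, 1.0], 0.5))       # sums to 1
```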

8.4.3 Adjacent categories logit model

One alternative to the cumulative odds model is to consider ratios of probabilities for successive categories, for example
$$
\frac{\pi_1}{\pi_2}, \frac{\pi_2}{\pi_3}, \ldots, \frac{\pi_{J-1}}{\pi_J}.
$$
The adjacent category logit model is
$$
\log\left( \frac{\pi_j}{\pi_{j+1}} \right) = \mathbf{x}_j^T \boldsymbol{\beta}_j. \tag{8.14}
$$
If this is simplified to
$$
\log\left( \frac{\pi_j}{\pi_{j+1}} \right) = \beta_{0j} + \beta_1 x_1 + \cdots + \beta_{p-1} x_{p-1},
$$
the effect of each explanatory variable is assumed to be the same for all adjacent pairs of categories. The parameters βk are usually interpreted as odds ratios using OR = exp(βk).

8.4.4 Continuation ratio logit model

Another alternative is to model the ratios of probabilities
$$
\frac{\pi_1}{\pi_2}, \frac{\pi_1 + \pi_2}{\pi_3}, \ldots, \frac{\pi_1 + \cdots + \pi_{J-1}}{\pi_J}
$$
or
$$
\frac{\pi_1}{\pi_2 + \cdots + \pi_J}, \frac{\pi_2}{\pi_3 + \cdots + \pi_J}, \ldots, \frac{\pi_{J-1}}{\pi_J}.
$$
The equation
$$
\log\left( \frac{\pi_j}{\pi_{j+1} + \cdots + \pi_J} \right) = \mathbf{x}_j^T \boldsymbol{\beta}_j \tag{8.15}
$$

models the odds of the response being in category j, i.e., $C_{j-1} < z \le C_j$, conditional upon $z > C_{j-1}$. For example, for the car preferences data (Section 8.3.1), with J = 3 response categories, (8.15) corresponds to modelling
$$
\log\left( \frac{\pi_1}{\pi_2 + \pi_3} \right) \quad \text{and} \quad \log\left( \frac{\pi_2}{\pi_3} \right).
$$
This model may be easier to interpret than the proportional odds model if the probabilities for individual categories πj are of interest (Agresti, 1996, Section 8.3.4).

8.4.5 Comments

Hypothesis tests for ordinal logistic regression models can be performed by comparing the fit of nested models or by using Wald statistics (or, less commonly, score statistics) based on the parameter estimates. Residuals and goodness of fit statistics are analogous to those for nominal logistic regression (Section 8.3).

The choice of model for ordinal data depends mainly on the practical problem being investigated. Comparisons of the models described in this chapter and some other models have been published by Holtbrugger and Schumacher (1991) and Ananth and Kleinbaum (1997), for example.

8.4.6 Example: Car preferences

The response variable for the car preference data is, of course, ordinal (Table 8.1). The following proportional odds model was fitted to these data:
$$
\log\left( \frac{\pi_1}{\pi_2 + \pi_3} \right) = \beta_{01} + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3,
$$
$$
\log\left( \frac{\pi_1 + \pi_2}{\pi_3} \right) = \beta_{02} + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 \tag{8.16}
$$

where x1, x2 and x3 are as defined for model (8.10).

The results are shown in Table 8.4. For model (8.16), the maximum value of the log-likelihood function is l(b) = −290.648. For the minimal model, with only β01 and β02, the maximum value is l(bmin) = −329.272 so, from (8.8), C = 2 × (−290.648 + 329.272) = 77.248 and, from (8.9), pseudo R² = (−329.272 + 290.648)/(−329.272) = 0.117.

The parameter estimates for the proportional odds model are all quite similar to those from the nominal logistic regression model (see Table 8.2). The estimated probabilities are also similar; for example, for females aged 18-23, x1 = 0, x2 = 0 and x3 = 0 so, from (8.16),
$$
\log\left( \frac{\pi_3}{\pi_1 + \pi_2} \right) = -1.6550 \quad \text{and} \quad \log\left( \frac{\pi_2 + \pi_3}{\pi_1} \right) = -0.0435.
$$
If these equations are solved with π1 + π2 + π3 = 1, the estimates are $\hat{\pi}_1 = 0.5109$, $\hat{\pi}_2 = 0.3287$ and $\hat{\pi}_3 = 0.1604$. The probabilities for other covariate patterns can be estimated similarly and hence expected frequencies can be calculated, together with residuals and goodness of fit statistics. For the proportional odds model, X² = 4.564 which is consistent with the distribution χ²(7), indicating that the model describes the data well (in this case N = 18, the maximal model has 12 parameters and model (8.13) has 5 parameters so degrees of freedom = 7).

For this example, the proportional odds logistic model for ordinal data and the nominal logistic model produce similar results. On the grounds of parsimony, model (8.16) would be preferred because it is simpler and takes into account the order of the response categories.
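The two fitted equations above can be solved directly for the probabilities; a minimal sketch of that arithmetic:

```python
import numpy as np
from scipy.special import expit

# Women aged 18-23 (x1 = x2 = x3 = 0), estimates from Table 8.4:
# log[pi3/(pi1 + pi2)] = -1.6550 and log[(pi2 + pi3)/pi1] = -0.0435
pi3 = expit(-1.6550)                   # since pi1 + pi2 = 1 - pi3
pi1 = 1.0 / (1.0 + np.exp(-0.0435))    # since (1 - pi1)/pi1 = exp(-0.0435)
pi2 = 1.0 - pi1 - pi3
print(pi1, pi2, pi3)                   # ~ 0.5109, 0.3287, 0.1604
```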

8.5 General comments

Although the models described in this chapter are developed from the logistic regression model for binary data, other link functions such as the probit or complementary log-log functions can also be used. If the response categories are regarded as crude measures of some underlying latent variable, z (as in Figure 8.2), then the optimal choice of the link function can depend on the shape of the distribution of z (McCullagh, 1980). Logits and probits are

Table 8.4 Results of proportional odds model (8.16) for the data in Table 8.1.

Parameter      Estimate b   Standard error, s.e.(b)   Odds ratio, OR (95% confidence interval)
β01            -1.655       0.256
β02            -0.044       0.232
β1: men        -0.576       0.226                     0.56 (0.36, 0.88)
β2: 24-40       1.147       0.278                     3.15 (1.83, 5.42)
β3: > 40        2.232       0.291                     9.32 (5.28, 16.47)

appropriate if the distribution is symmetric but the complementary log-log link may be better if the distribution is very skewed.

If there is doubt about the order of the categories then nominal logistic regression will usually be a more appropriate model than any of the models based on assumptions that the response categories are ordinal. Although the resulting model will have more parameters and hence fewer degrees of freedom and less statistical power, it may give results very similar to the ordinal models (as in the car preference example).

The estimation methods and sampling distributions used for inference depend on asymptotic results. For small studies, or numerous covariate patterns, each with few observations, the asymptotic results may be poor approximations.

Multicategory logistic models have only been readily available in statistical software from the 1990s. Their use has grown because the results are relatively easy to interpret provided that one variable can clearly be regarded as a response and the others as explanatory variables. If this distinction is unclear, for example, if data from a cross-sectional study are cross-tabulated, then log-linear models may be more appropriate. These are discussed in Chapter 9.

8.6 Exercises

8.1 If there are only J = 2 response categories, show that models (8.4), (8.12), (8.14) and (8.15) all reduce to the logistic regression model for binary data.

8.2 The data in Table 8.5 are from an investigation into satisfaction with housing conditions in Copenhagen (derived from Example W in Cox and Snell, 1981, from original data from Madsen, 1971). Residents in selected areas living in rented homes built between 1960 and 1968 were questioned about their satisfaction and the degree of contact with other residents. The data were tabulated by type of housing.

(a) Summarize the data using appropriate tables of percentages to show the associations between levels of satisfaction and contact with other residents, levels of satisfaction and type of housing, and contact and type of housing.

Table 8.5 Satisfaction with housing conditions.

                               Satisfaction
                    Low            Medium         High
Contact with
other residents     Low    High    Low    High    Low    High
Tower block         65     34      54     47      100    100
Apartment           130    141     76     116     111    191
House               67     130     48     105     62     104

(b) Use nominal logistic regression to model associations between level of satisfaction and the other two variables. Obtain a parsimonious model that summarizes the patterns in the data.

(c) Do you think an ordinal model would be appropriate for associations between the levels of satisfaction and the other variables? Justify your answer. If you consider such a model to be appropriate, fit a suitable one and compare the results with those from (b).

(d) From the best model you obtained in (c), calculate the standardized residuals and use them to find where the largest discrepancies are between the observed frequencies and expected frequencies estimated from the model.

8.3 The data in Table 8.6 show tumor responses of male and female patients receiving treatment for small-cell lung cancer. There were two treatment regimes. For the sequential treatment, the same combination of chemotherapeutic agents was administered at each treatment cycle. For the alternating treatment, different combinations were alternated from cycle to cycle (data from Holtbrugger and Schumacher, 1991).

Table 8.6 Tumor responses to two different treatments: numbers of patients in each category.

Treatment     Sex      Progressive   No       Partial     Complete
                       disease       change   remission   remission
Sequential    Male     28            45       29          26
              Female   4             12       5           2
Alternating   Male     41            44       20          20
              Female   12            7        3           1

(a) Fit a proportional odds model to estimate the probabilities for each response category taking treatment and sex effects into account.

(b) Examine the adequacy of the model fitted in (a) using residuals and goodness of fit statistics.

(c) Use a Wald statistic to test the hypothesis that there is no difference in responses for the two treatment regimes.

(d) Fit two proportional odds models to test the hypothesis of no treatment difference. Compare the results with those for (c) above.

(e) Fit adjacent category models and continuation ratio models using logit, probit and complementary log-log link functions. How do the different models affect the interpretation of the results?

8.4 Consider ordinal response categories which can be interpreted in terms of a continuous latent variable as shown in Figure 8.2. Suppose the distribution of this underlying variable is Normal. Show that the probit is the natural link function in this situation (Hint: see Section 7.3).

9
Count Data, Poisson Regression and Log-Linear Models

9.1 Introduction

The number of times an event occurs is a common form of data. Examples of count or frequency data include the number of tropical cyclones crossing the North Queensland coast (Section 1.6.5) or the numbers of people in each cell of a contingency table summarizing survey responses (e.g., satisfaction ratings for housing conditions, Exercise 8.2).

The Poisson distribution is often used to model count data. If Y is the number of occurrences, its probability distribution can be written as
$$
f(y) = \frac{\mu^y e^{-\mu}}{y!}, \quad y = 0, 1, 2, \ldots
$$
where μ is the average number of occurrences. It can be shown that E(Y) = μ and var(Y) = μ (see Exercise 3.4).

The parameter μ requires careful definition. Often it needs to be described as a rate; for example, the average number of customers who buy a particular product out of every 100 customers who enter the store. For motor vehicle crashes the rate parameter may be defined in many different ways: crashes per 1,000 population, crashes per 1,000 licensed drivers, crashes per 1,000 motor vehicles, or crashes per 100,000 kms travelled by motor vehicles. The time scale should be included in the definition; for example, the motor vehicle crash rate is usually specified as the rate per year (e.g., crashes per 100,000 kms per year), while the rate of tropical cyclones refers to the cyclone season from November to April in Northeastern Australia. More generally, the rate is specified in terms of units of 'exposure'; for instance, customers entering a store are 'exposed' to the opportunity to buy the product of interest. For occupational injuries, each worker is exposed for the period he or she is at work, so the rate may be defined in terms of person-years 'at risk'.

The effect of explanatory variables on the response Y is modelled through the parameter μ. This chapter describes models for two situations.

In the first situation, the events relate to varying amounts of 'exposure' which need to be taken into account when modelling the rate of events. Poisson regression is used in this case. The other explanatory variables (in addition to 'exposure') may be continuous or categorical.

In the second situation, 'exposure' is constant (and therefore not relevant to the model) and the explanatory variables are usually categorical. If there are only a few explanatory variables the data are summarized in a cross-classified table. The response variable is the frequency or count in each cell of the table. The variables used to define the table are all treated as explanatory variables. The study design may mean that there are some constraints on the cell frequencies (for example, the totals for each row of the table may be equal) and these need to be taken into account in the modelling. The term log-linear model, which basically describes the role of the link function, is used for the generalized linear models appropriate for this situation.

The next section describes Poisson regression. A numerical example is used to illustrate the concepts and methods, including model checking and inference. Subsequent sections describe relationships between probability distributions for count data, constrained in various ways, and the log-linear models that can be used to analyze the data.

9.2 Poisson regression

Let Y1, ..., YN be independent random variables with Yi denoting the number of events observed from exposure ni for the ith covariate pattern. The expected value of Yi can be written as
$$
E(Y_i) = \mu_i = n_i \theta_i.
$$

For example, suppose Yi is the number of insurance claims for a particular make and model of car. This will depend on the number of cars of this type that are insured, ni, and other variables that affect θi, such as the age of the cars and the location where they are used. The subscript i is used to denote the different combinations of make and model, age, location and so on.

The dependence of θi on the explanatory variables is usually modelled by
$$
\theta_i = e^{\mathbf{x}_i^T \boldsymbol{\beta}}. \tag{9.1}
$$
Therefore the generalized linear model is
$$
E(Y_i) = \mu_i = n_i e^{\mathbf{x}_i^T \boldsymbol{\beta}}; \quad Y_i \sim \text{Poisson}(\mu_i). \tag{9.2}
$$
The natural link function is the logarithmic function
$$
\log \mu_i = \log n_i + \mathbf{x}_i^T \boldsymbol{\beta}. \tag{9.3}
$$
Equation (9.3) differs from the usual specification of the linear component due to the inclusion of the term log ni. This term is called the offset. It is a known constant which is readily incorporated into the estimation procedure. As usual, the terms xi and β describe the covariate pattern and parameters, respectively.

For a binary explanatory variable denoted by an indicator variable, xj = 0 if the factor is absent and xj = 1 if it is present, the rate ratio, RR, for presence vs. absence is
$$
RR = \frac{E(Y_i \mid \text{present})}{E(Y_i \mid \text{absent})} = e^{\beta_j}
$$
from (9.1), provided all the other explanatory variables remain the same. Similarly, for a continuous explanatory variable xk, a one-unit increase will result in a multiplicative effect of $e^{\beta_k}$ on the rate μ. Therefore, parameter

estimates are often interpreted on the exponential scale $e^{\beta}$ in terms of ratios of rates.

Hypotheses about the parameters βj can be tested using the Wald, score or likelihood ratio statistics. Confidence intervals can be estimated similarly. For example, for parameter βj
$$
\frac{b_j - \beta_j}{\text{s.e.}(b_j)} \sim N(0, 1) \tag{9.4}
$$
approximately. Alternatively, hypothesis testing can be performed by comparing the goodness of fit of appropriately defined nested models (see Chapter 4).

The fitted values are given by
$$
\widehat{Y}_i = \widehat{\mu}_i = n_i e^{\mathbf{x}_i^T \mathbf{b}}, \quad i = 1, \ldots, N.
$$

These are often denoted by ei because they are estimates of the expected values E(Yi) = μi. As var(Yi) = E(Yi) for the Poisson distribution, the standard error of Yi is estimated by $\sqrt{e_i}$, so the Pearson residuals are
$$
r_i = \frac{o_i - e_i}{\sqrt{e_i}} \tag{9.5}
$$

where oi denotes the observed value of Yi. As outlined in Section 6.2.6, these residuals may be further refined to
$$
r_{pi} = \frac{o_i - e_i}{\sqrt{e_i}\sqrt{1 - h_i}}
$$
where the leverage, hi, is the ith element on the diagonal of the hat matrix.

For the Poisson distribution, the residuals given by (9.5) and the chi-squared goodness of fit statistic are related by
$$
X^2 = \sum r_i^2 = \sum \frac{(o_i - e_i)^2}{e_i}
$$
which is the usual definition of the chi-squared statistic for contingency tables.

The deviance for a Poisson model is given in Section 5.6.3. It can be written in the form
$$
D = 2 \sum \left[ o_i \log(o_i/e_i) - (o_i - e_i) \right]. \tag{9.6}
$$
However for most models $\sum o_i = \sum e_i$ (see Exercise 9.1), so the deviance simplifies to
$$
D = 2 \sum o_i \log(o_i/e_i). \tag{9.7}
$$
The deviance residuals are the components of D in (9.6),
$$
d_i = \operatorname{sign}(o_i - e_i)\sqrt{2\left[ o_i \log(o_i/e_i) - (o_i - e_i) \right]}, \quad i = 1, \ldots, N \tag{9.8}
$$
so that $D = \sum d_i^2$.

The goodness of fit statistics X² and D are closely related. Using the Taylor

series expansion given in Section 7.5,
$$
o \log\left( \frac{o}{e} \right) = (o - e) + \frac{1}{2} \frac{(o - e)^2}{e} + \ldots
$$
so that, approximately, from (9.6)
$$
D = 2 \sum \left[ (o_i - e_i) + \frac{1}{2} \frac{(o_i - e_i)^2}{e_i} - (o_i - e_i) \right] = \sum \frac{(o_i - e_i)^2}{e_i} = X^2.
$$
The statistics D and X² can be used directly as measures of goodness of fit, as they can be calculated from the data and the fitted model (because they do not involve any nuisance parameters like σ² for the Normal distribution). They can be compared with the central chi-squared distribution with N − p degrees of freedom, where p is the number of parameters that are estimated. The chi-squared distribution is likely to be a better approximation for the sampling distribution of X² than for the sampling distribution of D (see Section 7.5).

Two other summary statistics provided by some software are the likelihood ratio chi-squared statistic and pseudo-R². These are based on comparisons between the maximum value of the log-likelihood function for a minimal model with no covariates, log μi = log ni + β1, and the maximum value of the log-likelihood function for model (9.3) with p parameters. The likelihood ratio chi-squared statistic $C = 2[l(\mathbf{b}) - l(\mathbf{b}_{\min})]$ provides an overall test of the hypotheses that β2 = ... = βp = 0, by comparison with the central chi-squared distribution with p − 1 degrees of freedom (see Exercise 7.4). Less formally, pseudo $R^2 = [l(\mathbf{b}_{\min}) - l(\mathbf{b})]/l(\mathbf{b}_{\min})$ provides an intuitive measure of fit.

Other diagnostics, such as delta-betas and related statistics, are also available for Poisson models.
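A minimal sketch of the Pearson (9.5) and deviance (9.8) residual calculations (`poisson_residuals` is our hypothetical helper name, not from the text):

```python
import numpy as np

def poisson_residuals(o, e):
    """Pearson (9.5) and deviance (9.8) residuals for observed counts o
    and fitted values e from a Poisson model."""
    o = np.asarray(o, dtype=float)
    e = np.asarray(e, dtype=float)
    pearson = (o - e) / np.sqrt(e)
    # o * log(o/e), with the convention that the term is 0 when o = 0
    olog = np.where(o > 0, o * np.log(np.where(o > 0, o, 1.0) / e), 0.0)
    deviance = np.sign(o - e) * np.sqrt(2.0 * (olog - (o - e)))
    return pearson, deviance
```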

9.2.1 Example of Poisson regression: British doctors' smoking and coronary death

The data in Table 9.1 are from a famous study conducted by Sir Richard Doll and colleagues. In 1951, all British doctors were sent a brief questionnaire about whether they smoked tobacco. Since then information about their deaths has been collected. Table 9.1 shows the numbers of deaths from coronary heart disease among male doctors 10 years after the survey. It also shows the total number of person-years of observation at the time of the analysis (Breslow and Day, 1987: Appendix 1A and page 112). The questions of interest are:

1. Is the death rate higher for smokers than non-smokers?
2. If so, by how much?
3. Is the differential effect related to age?

Table 9.1 Deaths from coronary heart disease after 10 years among British male doctors categorized by age and smoking status in 1951.

Age        Smokers                   Non-smokers
group      Deaths   Person-years     Deaths   Person-years
35 - 44    32       52407            2        18790
45 - 54    104      43248            12       10673
55 - 64    206      28612            28       5710
65 - 74    186      12663            28       2585
75 - 84    102      5317             31       1462

[Figure 9.1 appears here: death rates per 100,000 person-years (0 to 2000) plotted against age group (35-44 to 75-84) for smokers and non-smokers.]

Figure 9.1 Death rates from coronary heart disease per 100,000 person-years for smokers (diamonds) and non-smokers (dots).

Figure 9.1 shows the death rates per 100,000 person-years from coronary heart disease for smokers and non-smokers. It is clear that the rates increase with age but more steeply than in a straight line. Death rates appear to be generally higher among smokers than non-smokers but they do not rise as rapidly with age. Various models can be specified to describe these data well (see Exercise 9.2). One model, in the form of (9.3), is
$$
\log(\text{deaths}_i) = \log(\text{population}_i) + \beta_1 + \beta_2\,\text{smoke}_i + \beta_3\,\text{agecat}_i + \beta_4\,\text{agesq}_i + \beta_5\,\text{smkage}_i \tag{9.9}
$$
where the subscript i denotes the ith subgroup defined by age group and smoking status (i = 1, ..., 5 for ages 35 − 44, ..., 75 − 84 for smokers and i = 6, ..., 10 for the corresponding age groups for non-smokers). The term deathsi denotes the expected number of deaths and populationi denotes the number of doctors at risk in group i. For the other terms, smokei is equal to one for smokers and zero for non-smokers; agecati takes the values 1, ..., 5 for age groups 35 − 44, ..., 75 − 84; agesqi is the square of agecati to take account of the non-linearity of the rate of increase; and smkagei is equal to agecati for smokers and zero for non-smokers, thus describing a differential rate of increase with age.

Table 9.2 shows the parameter estimates in the form of rate ratios $e^{\beta_j}$. The Wald statistics (9.4) to test βj = 0 all have very small p-values and the 95% confidence intervals for $e^{\beta_j}$ do not contain unity, showing that all the terms are needed in the model. The estimates show that the risk of coronary deaths was, on average, about 4 times higher for smokers than non-smokers (based on the rate ratio for smoke), after the effect of age is taken into account. However, the effect is attenuated as age increases (coefficient for smkage). Table 9.3 shows that the model fits the data very well; the expected numbers of deaths estimated from (9.9) are quite similar to the observed numbers of deaths, and so the Pearson residuals calculated from (9.5) and deviance residuals from (9.8) are very small.

For the minimal model, with only the parameter β1, the maximum value of the log-likelihood function is l(bmin) = −495.067. The corresponding value for model (9.9) is l(b) = −28.352. Therefore, an overall test of the model (testing βj = 0 for j = 2, ..., 5) is C = 2[l(b) − l(bmin)] = 933.43 which is highly statistically significant compared to the chi-squared distribution with 4 degrees of freedom. The pseudo R² value is 0.94, or 94%, which suggests a good fit. More formal tests of the goodness of fit are provided by the statistics X² = 1.550 and D = 1.635 which are small compared to the chi-squared distribution with N − p = 10 − 5 = 5 degrees of freedom.

Table 9.2 Parameter estimates obtained by fitting model (9.9) to the data in Table 9.1.

Term                       agecat      agesq        smoke        smkage
β                          2.376       -0.198       1.441        -0.308
s.e.(β)                    0.208       0.027        0.372        0.097
Wald statistic             11.43       -7.22        3.87         -3.17
p-value                    <0.001      <0.001       <0.001       0.002
Rate ratio                 10.77       0.82         4.22         0.74
95% confidence interval    7.2, 16.2   0.78, 0.87   2.04, 8.76   0.61, 0.89
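The fit in Table 9.2 can be reproduced as a Poisson GLM with an offset. A minimal sketch assuming Python's statsmodels (the design-matrix construction is our choice, not from the text); the estimates should reproduce Table 9.2:

```python
import numpy as np
import statsmodels.api as sm

# Table 9.1: age groups 35-44, ..., 75-84; smokers first, then non-smokers
deaths = np.array([32, 104, 206, 186, 102, 2, 12, 28, 28, 31])
pyears = np.array([52407, 43248, 28612, 12663, 5317,
                   18790, 10673, 5710, 2585, 1462])
smoke  = np.repeat([1, 0], 5)
agecat = np.tile([1, 2, 3, 4, 5], 2)

# Model (9.9): constant, smoke, agecat, agesq, smkage
X = sm.add_constant(np.column_stack([smoke, agecat, agecat**2,
                                     smoke * agecat]))
fit = sm.GLM(deaths, X, family=sm.families.Poisson(),
             offset=np.log(pyears)).fit()   # offset log(person-years), as in (9.3)
print(np.exp(fit.params))                   # rate ratios, e.g. ~4.22 for smoke
```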

9.3 Examples of contingency tables

Before specifying log-linear models for frequency data summarized in contingency tables, it is important to consider how the design of the study may determine constraints on the data. The study design also affects the choice of probability models to describe the data. These issues are illustrated in the following three examples.

Table 9.3 Observed and estimated expected numbers of deaths and residuals for the model described in Table 9.2.

Age        Smoking    Observed   Expected   Pearson    Deviance
category   category   deaths     deaths     residual   residual
1          1          32         29.58       0.444      0.438
2          1          104        106.81     -0.272     -0.273
3          1          206        208.20     -0.152     -0.153
4          1          186        182.83      0.235      0.234
5          1          102        102.58     -0.057     -0.057
1          0          2          3.41       -0.766     -0.830
2          0          12         11.54       0.135      0.134
3          0          28         24.74       0.655      0.641
4          0          28         30.23      -0.405     -0.411
5          0          31         31.07      -0.013     -0.013

Sum of squares*                              1.550      1.635

* Calculated from residuals correct to more significant figures than shown here.

9.3.1 Example: Cross-sectional study of malignant melanoma

These data are from a cross-sectional study of patients with a form of skin cancer called malignant melanoma. For a sample of n = 400 patients, the site of the tumor and its histological type were recorded. The data, numbers of patients with each combination of tumor type and site, are given in Table 9.4.

Table 9.4 Malignant melanoma: frequencies for tumor type and site (Roberts et al., 1981).

Tumor type                        Head & neck   Trunk   Extremities   Total
Hutchinson's melanotic freckle    22            2       10            34
Superficial spreading melanoma    16            54      115           185
Nodular                           19            33      73            125
Indeterminate                     11            17      28            56
Total                             68            106     226           400

The question of interest is whether there is any association between tumor type and site. Table 9.5 shows the data displayed as percentages of row and column totals. It appears that Hutchinson's melanotic freckle is more common on the head and neck but there is little evidence of association between other tumor types and sites.

Table 9.5 Malignant melanoma: row and column percentages for tumor type and site.

Tumor type                        Head & neck   Trunk   Extremities   Total
Row percentages
Hutchinson's melanotic freckle    64.7          5.9     29.4          100
Superficial spreading melanoma    8.6           29.2    62.2          100
Nodular                           15.2          26.4    58.4          100
Indeterminate                     19.6          30.4    50.0          100
All types                         17.0          26.5    56.5          100

Column percentages
Hutchinson's melanotic freckle    32.4          1.9     4.4           8.50
Superficial spreading melanoma    23.5          50.9    50.9          46.25
Nodular                           27.9          31.1    32.3          31.25
Indeterminate                     16.2          16.0    12.4          14.00
All types                         100.0         99.9    100.0         100.0

Let Yjk denote the frequency for the (j, k)th cell with j = 1, ..., J and k = 1, ..., K. In this example, there are J = 4 rows, K = 3 columns and the constraint that $\sum_{j=1}^{J}\sum_{k=1}^{K} Y_{jk} = n$, where n = 400 is fixed by the design of the study. If the Yjk's are independent random variables with Poisson distributions with parameters E(Yjk) = μjk, then their sum has the Poisson distribution with parameter E(n) = μ = Σμjk. Hence the joint probability distribution of the Yjk's, conditional on their sum n, is the multinomial distribution
$$
f(\mathbf{y} \mid n) = n! \prod_{j=1}^{J} \prod_{k=1}^{K} \frac{\theta_{jk}^{y_{jk}}}{y_{jk}!}
$$
where θjk = μjk/μ. This result is derived in Section 8.2. The sum of the terms θjk is unity because Σμjk = μ; also 0 < θjk < 1. Thus θjk can be interpreted as the probability of an observation in the (j, k)th cell of the table. Also the expected value of Yjk is
$$
E(Y_{jk}) = \mu_{jk} = n\theta_{jk}.
$$
The usual link function for a Poisson model gives
$$
\log \mu_{jk} = \log n + \log \theta_{jk}
$$
which is like equation (9.3), except that the term log n is the same for all the Yjk's.

9.3.2 Example: Randomized controlled trial of influenza vaccine

In a prospective study of a new living attenuated recombinant vaccine for influenza, patients were randomly allocated to two groups, one of which was given the new vaccine and the other a saline placebo. The responses were titre levels of hemagglutinin inhibiting antibody found in the blood six weeks after vaccination; they were categorized as 'small', 'medium' or 'large'. The cell frequencies in the rows of Table 9.6 are constrained to add to the number of subjects in each treatment group (35 and 38 respectively). We want to know if the pattern of responses is the same for each treatment group.

Table 9.6 Flu vaccine trial.

           Response
           Small   Moderate   Large   Total
Placebo    25      8          5       38
Vaccine    6       18         11      35

(Data from R.S. Gillett, personal communication)

In this example the row totals are fixed. Thus the joint probability distribution for each row is multinomial
$$
f(y_{j1}, y_{j2}, \ldots, y_{jK} \mid y_{j.}) = y_{j.}! \prod_{k=1}^{K} \frac{\theta_{jk}^{y_{jk}}}{y_{jk}!},
$$
where $y_{j.} = \sum_{k=1}^{K} y_{jk}$ is the row total and $\sum_{k=1}^{K} \theta_{jk} = 1$. So the joint probability distribution for all the cells in the table is the product multinomial distribution
$$
f(\mathbf{y} \mid y_{1.}, y_{2.}, \ldots, y_{J.}) = \prod_{j=1}^{J} y_{j.}! \prod_{k=1}^{K} \frac{\theta_{jk}^{y_{jk}}}{y_{jk}!}
$$
where $\sum_{k=1}^{K} \theta_{jk} = 1$ for each row. In this case E(Yjk) = yj.θjk so that
$$
\log E(Y_{jk}) = \log \mu_{jk} = \log y_{j.} + \log \theta_{jk}.
$$
If the response pattern was the same for both groups then θjk = θ.k for k = 1, ..., K.

9.3.3 Example: Case control study of gastric and duodenal ulcers and aspirin use

In this retrospective case-control study a group of ulcer patients was compared to a group of control patients not known to have peptic ulcer, but who were similar to the ulcer patients with respect to age, sex and socio-economic status. The ulcer patients were classified according to the site of the ulcer: gastric or duodenal. Aspirin use was ascertained for all subjects. The results are shown in Table 9.7.

Table 9.7 Gastric and duodenal ulcers and aspirin use: frequencies (Duggan et al., 1986).

                            Aspirin use
                 Non-user   User   Total
Gastric ulcer
  Control        62         6      68
  Cases          39         25     64
Duodenal ulcer
  Control        53         8      61
  Cases          49         8      57

This is a 2 × 2 × 2 contingency table. Some questions of interest are:

1. Is gastric ulcer associated with aspirin use?
2. Is duodenal ulcer associated with aspirin use?
3. Is any association with aspirin use the same for both ulcer sites?

When the data are presented as percentages of row totals (Table 9.8) it appears that aspirin use is more common among ulcer patients than among controls for gastric ulcer but not for duodenal ulcer.

In this example, the numbers of patients with each type of ulcer and the numbers in each of the groups of controls, that is, the four row totals in Table 9.7, were all fixed.

Let j = 1 or 2 denote the controls or cases, respectively; k = 1 or 2 denote gastric ulcers or duodenal ulcers, respectively; and l = 1 for patients who did not use aspirin and l = 2 for those who did.

Table 9.8 Gastric and duodenal ulcers and aspirin use: row percentages for the data in Table 9.7.

                            Aspirin use
                 Non-user   User   Total
Gastric ulcer
  Control        91         9      100
  Cases          61         39     100
Duodenal ulcer
  Control        87         13     100
  Cases          86         14     100

In general, let Yjkl denote the frequency of observations in category (j, k, l) with j = 1, ..., J, k = 1, ..., K and l = 1, ..., L. If the marginal totals yjk. are fixed, the joint probability distribution for the Yjkl's is
$$
f(\mathbf{y} \mid y_{11.}, \ldots, y_{JK.}) = \prod_{j=1}^{J} \prod_{k=1}^{K} y_{jk.}! \prod_{l=1}^{L} \frac{\theta_{jkl}^{y_{jkl}}}{y_{jkl}!}
$$
where y is the vector of the Yjkl's and $\sum_{l} \theta_{jkl} = 1$ for j = 1, ..., J and k = 1, ..., K. This is another form of product multinomial distribution. In this case, E(Yjkl) = μjkl = yjk.θjkl, so that
$$
\log \mu_{jkl} = \log y_{jk.} + \log \theta_{jkl}.
$$

9.4 Probability models for contingency tables

The examples in Section 9.3 illustrate the main probability models for contingency table data. In general, let the vector y denote the frequencies Yi in N cells of a cross-classified table.

9.4.1 Poisson model

If there were no constraints on the Yi's they could be modelled as independent random variables with parameters E(Yi) = μi and joint probability distribution
$$
f(\mathbf{y}; \boldsymbol{\mu}) = \prod_{i=1}^{N} \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!}
$$

where µ is a vector of µi’s.

9.4.2 Multinomial model

If the only constraint is that the sum of the Yi's is n, then the following multinomial distribution may be used
$$
f(\mathbf{y}; \boldsymbol{\mu} \mid n) = n! \prod_{i=1}^{N} \frac{\theta_i^{y_i}}{y_i!}
$$
where $\sum_{i=1}^{N} \theta_i = 1$ and $\sum_{i=1}^{N} y_i = n$. In this case, E(Yi) = nθi.

For a two dimensional contingency table (like Table 9.4 for the melanoma data), if j and k denote the rows and columns then the most commonly considered hypothesis is that the row and column variables are independent so that
$$
\theta_{jk} = \theta_{j.}\theta_{.k}
$$
where θj. and θ.k denote the marginal probabilities with $\sum_j \theta_{j.} = 1$ and $\sum_k \theta_{.k} = 1$. This hypothesis can be tested by comparing the fit of two linear models for the logarithm of μjk = E(Yjk); namely
$$
\log \mu_{jk} = \log n + \log \theta_{jk}
$$
and
$$
\log \mu_{jk} = \log n + \log \theta_{j.} + \log \theta_{.k}.
$$

9.4.3 Product multinomial models

If there are more fixed marginal totals than just the overall total n, then appropriate products of multinomial distributions can be used to model the data. For example, for a three dimensional table with J rows, K columns and L layers, if the row totals are fixed in each layer the joint probability for the Yjkl's is
$$
f(\mathbf{y} \mid y_{j.l},\; j = 1, \ldots, J,\; l = 1, \ldots, L) = \prod_{j=1}^{J} \prod_{l=1}^{L} y_{j.l}! \prod_{k=1}^{K} \frac{\theta_{jkl}^{y_{jkl}}}{y_{jkl}!}
$$
where $\sum_k \theta_{jkl} = 1$ for each combination of j and l. In this case, E(Yjkl) = yj.lθjkl.

If only the layer totals are fixed, then
$$
f(\mathbf{y} \mid y_{..l},\; l = 1, \ldots, L) = \prod_{l=1}^{L} y_{..l}! \prod_{j=1}^{J} \prod_{k=1}^{K} \frac{\theta_{jkl}^{y_{jkl}}}{y_{jkl}!}
$$
with $\sum_j \sum_k \theta_{jkl} = 1$ for l = 1, ..., L. Also E(Yjkl) = y..lθjkl.

9.5 Log-linear models

All the probability models given in Section 9.4 are based on the Poisson distribution and in all cases E(Yi) can be written as a product of parameters and other terms. Thus, the natural link function for the Poisson distribution, the logarithmic function, yields a linear component
$$
\log E(Y_i) = \text{constant} + \mathbf{x}_i^T \boldsymbol{\beta}.
$$
The term log-linear model is used to describe all these generalized linear models.

For the melanoma Example 9.3.1, if there are no associations between site and type of tumor so that these two variables are independent, their joint probability θjk is the product of the marginal probabilities
$$
\theta_{jk} = \theta_{j.}\theta_{.k}, \quad j = 1, \ldots, J \text{ and } k = 1, \ldots, K.
$$
The hypothesis of independence can be tested by comparing the additive model (on the logarithmic scale)
$$
\log E(Y_{jk}) = \log n + \log \theta_{j.} + \log \theta_{.k} \tag{9.10}
$$
with the model
$$
\log E(Y_{jk}) = \log n + \log \theta_{jk}. \tag{9.11}
$$
This is analogous to analysis of variance for a two factor experiment without replication (see Section 6.4.2). Equation (9.11) can be written as the saturated model
$$
\log E(Y_{jk}) = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk}
$$
and equation (9.10) can be written as the additive model
$$
\log E(Y_{jk}) = \mu + \alpha_j + \beta_k.
$$
Since the term log n has to be in all models, the minimal model is
$$
\log E(Y_{jk}) = \mu.
$$

For the flu vaccine trial, Example 9.3.2, E(Yjk) = yj.θjk if the distribution of responses described by the θjk's differs for the j groups, or E(Yjk) = yj.θ.k if it is the same for all groups. So the hypothesis of homogeneity of the response distributions can be tested by comparing the model
$$
\log E(Y_{jk}) = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk},
$$
corresponding to E(Yjk) = yj.θjk, and the model
$$
\log E(Y_{jk}) = \mu + \alpha_j + \beta_k,
$$
corresponding to E(Yjk) = yj.θ.k. The minimal model for these data is
$$
\log E(Y_{jk}) = \mu + \alpha_j
$$
because the row totals, corresponding to the subscript j, are fixed by the design of the study.

More generally, the specification of the linear components for log-linear models bears many resemblances to the specification for ANOVA models. The models are hierarchical, meaning that if a higher-order (interaction) term is included in the model then all the related lower-order terms are also included. Thus, if the two-way (first-order) interaction (αβ)jk is included then so are the main effects αj and βk and the constant μ. Similarly, if second-order interactions (αβγ)jkl are included then so are the first-order interactions (αβ)jk, (αγ)jl and (βγ)kl.

If log-linear models are specified analogously to ANOVA models, they include too many parameters, so sum-to-zero or corner-point constraints are needed. Interpretation of the parameters is usually simpler if reference or corner-point categories are identified so that parameter estimates describe effects for other categories relative to the reference categories.

For contingency tables the main questions almost always relate to associations between variables. Therefore, in log-linear models, the terms of primary interest are the interactions involving two or more variables.

9.6 Inference for log-linear models

Although three types of probability distributions are used to describe contingency table data (see Section 9.4), Birch (1963) showed that for any log-linear model the maximum likelihood estimators are the same for all these distributions provided that the parameters which correspond to the fixed marginal totals are always included in the model. This means that for the purpose of estimation, the Poisson distribution can always be assumed. As the Poisson distribution belongs to the exponential family and the parameter constraints can be incorporated into the linear component, all the standard methods for generalized linear models can be used.

The adequacy of a model can be assessed using the goodness of fit statistics X² or D (and sometimes C and pseudo R²) summarized in Section 9.2 for Poisson regression. More insight into model adequacy can often be obtained by examining the Pearson or deviance residuals given by equations (9.5) and (9.8) respectively. Hypothesis tests can be conducted by comparing the difference in goodness of fit statistics between a general model corresponding to an alternative hypothesis and a nested, simpler model corresponding to a null hypothesis. These methods are illustrated in the following examples.

9.7 Numerical examples

9.7.1 Cross-sectional study of malignant melanoma

For the data in Table 9.4 the question of interest is whether there is an association between tumor type and site. This can be examined by testing the null hypothesis that the variables are independent.

The conventional chi-squared test of independence for a two dimensional table is performed by calculating expected frequencies for each cell based on the marginal totals, $e_{jk} = y_{j.} y_{.k}/n$, calculating the chi-squared statistic
$$
X^2 = \sum_{j} \sum_{k} \frac{(y_{jk} - e_{jk})^2}{e_{jk}}
$$
and comparing this with the central chi-squared distribution with (J − 1)(K − 1) degrees of freedom. The observed and expected frequencies are shown in Table 9.9. These give

$$
X^2 = \frac{(22 - 5.78)^2}{5.78} + \ldots + \frac{(28 - 31.64)^2}{31.64} = 65.8.
$$
The value X² = 65.8 is very significant compared to the χ²(6) distribution. Examination of the observed frequencies yjk and expected frequencies ejk shows that Hutchinson's melanotic freckle is more common on the head and neck than would be expected if site and type were independent.

The corresponding analysis using log-linear models involves fitting the additive model (9.10) corresponding to the hypothesis of independence. The saturated model (9.11) and the minimal model with only a term for the mean effect are also fitted for illustrative purposes. The results for all three models are shown in Table 9.10.

Table 9.9 Conventional chi-squared test of independence for melanoma data in Table 9.4; expected frequencies are shown in brackets.

Tumor type                        Head & neck   Trunk        Extremities     Total
Hutchinson's melanotic freckle    22 (5.78)     2 (9.01)     10 (19.21)      34
Superficial spreading melanoma    16 (31.45)    54 (49.03)   115 (104.52)    185
Nodular                           19 (21.25)    33 (33.13)   73 (70.62)      125
Indeterminate                     11 (9.52)     17 (14.84)   28 (31.64)      56
Total                             68            106          226             400
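The conventional test is a one-liner in standard software; a minimal sketch, assuming scipy is available:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Table 9.4: rows = tumor types, columns = sites
obs = np.array([[22,  2,  10],
                [16, 54, 115],
                [19, 33,  73],
                [11, 17,  28]])
X2, pval, dof, expected = chi2_contingency(obs, correction=False)
print(X2, dof)    # X2 ~ 65.8 with (J-1)(K-1) = 6 degrees of freedom
```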

For the reference category of Hutchinson's melanotic freckle (HMF) on the head or neck (HNK), the expected frequencies are as follows:

minimal model: $e^{3.507} = 33.35$;
additive model: $e^{1.754} = 5.78$, as in Table 9.9;
saturated model: $e^{3.091} = 22$, equal to the observed frequency.

For indeterminate tumors (IND) in the extremities (EXT), the expected frequencies are:

minimal model: $e^{3.507} = 33.35$;
additive model: $e^{1.754 + 0.499 + 1.201} = 31.64$, as in Table 9.9;
saturated model: $e^{3.091 - 0.693 - 0.788 + 1.723} = 28$, equal to the observed frequency.

The saturated model with 12 parameters fits the 12 data points exactly. The additive model corresponds to the conventional analysis. The deviance for the additive model can be calculated from the sum of squares of the deviance residuals given by (9.8), or from twice the difference between the maximum values of the log-likelihood function for this model and the saturated model,
$$
D = 2\left[ -29.556 - (-55.453) \right] = 51.79.
$$
For this example, the conventional chi-squared test for independence and log-linear modelling produce exactly the same results. The advantage of log-linear modelling is that it provides a method for analyzing more complicated cross-tabulated data, as illustrated by the next example.
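Equivalently, the additive (independence) model can be fitted as a Poisson GLM with indicator variables for rows and columns. A minimal sketch assuming statsmodels:

```python
import numpy as np
import statsmodels.api as sm

# Frequencies of Table 9.4, row by row (4 tumor types x 3 sites)
y = np.array([22, 2, 10, 16, 54, 115, 19, 33, 73, 11, 17, 28])
row = np.repeat(np.arange(4), 3)   # tumor type index j
col = np.tile(np.arange(3), 4)     # site index k

# Additive model (9.10): constant + row effects + column effects
X = np.column_stack([np.ones(12)]
                    + [(row == j).astype(float) for j in (1, 2, 3)]
                    + [(col == k).astype(float) for k in (1, 2)])
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(fit.deviance)                # ~ 51.79, as in Table 9.10
```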

9.7.2 Case control study of gastric and duodenal ulcer and aspirin use

Preliminary analysis of the 2 × 2 tables for gastric ulcer and duodenal ulcer separately suggests that aspirin use may be a risk factor for gastric ulcer but not for duodenal ulcer. For analysis of the full data set, Table 9.7, the main effects for case-control status (CC), ulcer site (GD) and the interaction

Table 9.10 Log-linear models for the melanoma data in Table 9.4; coefficients, b, with standard errors in brackets.

Term*            Saturated model (9.11)   Additive model (9.10)   Minimal model
Constant         3.091 (0.213)            1.754 (0.204)           3.507 (0.05)
SSM              -0.318 (0.329)           1.694 (0.187)
NOD              -0.147 (0.313)           1.302 (0.193)
IND              -0.693 (0.369)           0.499 (0.217)
TNK              -2.398 (0.739)           0.444 (0.155)
EXT              -0.788 (0.381)           1.201 (0.138)
SSM × TNK        3.614 (0.792)
SSM × EXT        2.761 (0.465)
NOD × TNK        2.950 (0.793)
NOD × EXT        2.134 (0.460)
IND × TNK        2.833 (0.834)
IND × EXT        1.723 (0.522)

log-likelihood   -29.556                  -55.453                 -177.16
X²               0.0                      65.813
D                0.0                      51.795

* Reference categories are: Hutchinson's melanotic freckle (HMF) and head and neck (HNK). Other categories are: for type, superficial spreading melanoma (SSM), nodular (NOD) and indeterminate (IND); for site, trunk (TNK) and extremities (EXT).

between these terms (CC × GD) have to be included in all models (as these correspond to the fixed marginal totals). Table 9.11 shows the results of fitting this and several more complex models involving aspirin use (AP).

The comparison of aspirin use between cases and controls can be summarized by the deviance difference for the second and third rows of Table 9.11,
$$
D = 2\left[ -25.08 - (-30.70) \right] = 11.24.
$$
This value is statistically significant compared with the χ²(1) distribution, suggesting that aspirin is a risk factor for ulcers. Comparison between the third and fourth rows of the table, D = 2[−22.95 − (−25.08)] = 4.26, provides only weak evidence of a difference between ulcer sites, possibly due to the lack of statistical power (p-value = 0.04 from the distribution χ²(1)).

The fit of the model with all three two-way interactions is shown in Table 9.12. The goodness of fit statistics for this table are X² = 6.49 and D = 6.28 which suggest that the model is not particularly good (compared with the χ²(1) distribution) even though p = 7 parameters have been used to describe N = 8 data points.

© 2002 by Chapman & Hall/CRC Table9.11Resultsoflog-linearmodellingofdatainTable9.7. Terms in model d.f.∗ log- likelihood∗∗ GD + CC + GD × CC 4 -83.16 GD + CC + GD × CC + AP 3 -30.70 GD + CC + GD × CC + AP + AP × CC 2 -25.08 GD + CC + GD × CC + AP + AP × CC +AP × GD 1 -22.95

∗d.f. denotes degrees of freedom = number of observations (8) minus number of parameters; ∗∗maximum value of the log-likelihood function.
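A quick check of these deviance comparisons (a sketch assuming scipy):

```python
from scipy.stats import chi2

# Deviance differences between nested models in Table 9.11
D_ap_cc = 2 * (-25.08 - (-30.70))   # adding AP x CC: 11.24
D_ap_gd = 2 * (-22.95 - (-25.08))   # adding AP x GD: 4.26
print(chi2.sf(D_ap_cc, df=1))       # p ~ 0.0008
print(chi2.sf(D_ap_gd, df=1))       # p ~ 0.04, as quoted in the text
```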

Table 9.12 Comparison of observed frequencies and expected frequencies obtained from the log-linear model with all two-way interaction terms for the data in Table 9.7; expected frequencies in brackets.

                            Aspirin use
                 Non-user      User         Total
Gastric ulcer
  Controls       62 (58.53)    6 (9.47)     68
  Cases          39 (42.47)    25 (21.53)   64
Duodenal ulcer
  Controls       53 (56.47)    8 (4.53)     61
  Cases          49 (45.53)    8 (11.47)    57

9.8 Remarks

Two issues relevant to the analysis of count data have not yet been discussed in this chapter.

First, overdispersion occurs when var(Yi) is greater than E(Yi), although var(Yi) = E(Yi) for the Poisson distribution. The negative binomial distribution provides an alternative model with var(Yi) = φ E(Yi), where φ > 1 is a parameter that can be estimated. Overdispersion can be due to lack of independence between the observations, in which case the methods described in Chapter 11 for correlated data can be used.

Second, contingency tables may include cells which cannot have any observations (e.g., male hysterectomy cases). This phenomenon, termed structural zeros, may not be easily incorporated in Poisson regression unless the parameters can be specified to accommodate the situation. Alternative approaches are discussed by Agresti (1990).

© 2002 by Chapman & Hall/CRC 9.9 Exercises ∼ 9.1 Let Y1, ..., YN be independent random variables with Yi Poisson (µi) and J log µi = β1 + j=2 xijβj,i=1, ..., N.  (a) Show that the score statistic for β is U = N (Y − µ ). 1 1 i=1 i i  (c) Hence show that for maximum likelihood estimates µi, µi = yi. (d) Deduce that the expression for the deviance in (9.6) simplifies to (9.7) in this case. 9.2ThedatainTable9.13arenumbersofinsuranc epolicies,n,andnumbersof claims, y, for cars in various insurance categories, CAR, tabulated by age of policy holder, AGE, and district where the policy holder lived (DIST =1, for London and other major cities and DIST =0, otherwise). The table is derived from the CLAIMS data set in Aitkin et al. (1989) obtained from a paper by Baxter, Coutts and Ross (1980). (a) Calculate the rate ofclaims y/n for each category and plot the rates by AGE, CAR and DIST to get an idea ofthe main effects ofthese factors. (b) Use Poisson regression to estimate the main effects (each treated as cate- gorical and modelled using indicator variables) and interaction terms. (c) Based on the modelling in (b), Aitkin et al. (1989) determined that all the interactions were unimportant and decided that AGE and CAR could be treated as though they were continuous variables. Fit a model incorporating these features and compare it with the best model obtained in (b). What conclusions do you reach?

9.3(a) Using a conventional chi-squared test and an appropriate log-linear model, test the hypothesis that the distribution ofresponses is the same forthe placeboandvaccinegroupsforthefluvaccinetrialdatainTable9.6. (b) For the model corresponding to the hypothesis ofhomogeneity ofre- sponse distributions, calculate the fitted values, the Pearson and de- viance residuals and the goodness offit statistics X2 and D. Which ofthe cells ofthe table contribute most to X2 (or D)? Explain and interpret these results. (c) Re-analyze these data using ordinal logistic regression to estimate cut- points for a latent continuous response variable and to estimate a loca- tion shift between the two treatment groups. Sketch a rough diagram to illustrate the model which forms the conceptual base for this analysis (see Exercise 8.4).

9.4 For a 2 × 2 contingency table, the maximal log-linear model can be written as

  η_{11} = µ + α + β + (αβ),   η_{12} = µ + α − β − (αβ),

  η_{21} = µ − α + β − (αβ),   η_{22} = µ − α − β + (αβ),

where η_{jk} = log E(Y_{jk}) = log(nθ_{jk}) and n = Σ Y_{jk}.

Table 9.13 Car insurance claims: based on the CLAIMS data set reported by Aitkin et al. (1989).

                 DIST = 0          DIST = 1
CAR    AGE       y       n         y      n
1      1         65      317       2      20
1      2         65      476       5      33
1      3         52      486       4      40
1      4         310     3259      36     316
2      1         98      486       7      31
2      2         159     1004      10     81
2      3         175     1355      22     122
2      4         877     7660      102    724
3      1         41      223       5      18
3      2         117     539       7      39
3      3         137     697       16     68
3      4         477     3442      63     344
4      1         11      40        0      3
4      2         35      148       6      16
4      3         39      214       8      25
4      4         167     1019      33     114

Show that the interaction term (αβ) is given by

  (αβ) = (1/4) log φ

where φ is the odds ratio (θ_{11} θ_{22})/(θ_{12} θ_{21}), and hence that φ = 1 corresponds to no interaction.

9.5 Use log-linear models to examine the housing satisfaction data in Table 8.5. The numbers of people surveyed in each type of housing can be regarded as fixed.
(a) First analyze the associations between level of satisfaction (treated as a nominal categorical variable) and contact with other residents, separately for each type of housing.
(b) Next conduct the analyses in (a) simultaneously for all types of housing.
(c) Compare the results from log-linear modelling with those obtained using nominal or ordinal logistic regression (see Exercise 8.2).

9.6 Consider a 2 × K contingency table (Table 9.14) in which the column totals y_{.k} are fixed for k = 1, ..., K.
(a) Show that the product multinomial distribution for this table reduces to

  f(z_1, ..., z_K | n_1, ..., n_K) = Π_{k=1}^{K} (n_k choose z_k) π_k^{z_k} (1 − π_k)^{n_k − z_k}

Table 9.14 Contingency table with 2 rows and K columns.

            1       ...    k       ...    K
Success     y_11           y_1k           y_1K
Failure     y_21           y_2k           y_2K
Total       y_.1           y_.k           y_.K

where n_k = y_{.k}, z_k = y_{1k}, n_k − z_k = y_{2k}, π_k = θ_{1k} and 1 − π_k = θ_{2k}, for k = 1, ..., K. This is the product binomial distribution and is the joint distribution for Table 7.1 (with appropriate changes in notation).
(b) Show that the log-linear model with

  η_{1k} = log E(Z_k) = x_{1k}^T β   and   η_{2k} = log E(n_k − Z_k) = x_{2k}^T β

is equivalent to the logistic model

  log [π_k / (1 − π_k)] = x_k^T β

where x_k = x_{1k} − x_{2k}, k = 1, ..., K.
(c) Based on (b), analyze the case-control study data on aspirin use and ulcers using logistic regression and compare the results with those obtained using log-linear models.

10 Survival Analysis

10.1 Introduction

An important type of data is the time from a well-defined starting point until some event, called 'failure', occurs. In engineering, this may be the time from initial use of a component until it fails to operate properly. In medicine, it may be the time from when a patient is diagnosed with a disease until he or she dies. Analysis of these data focuses on summarizing the main features of the distribution, such as the median or other percentiles of time to failure, and on examining the effects of explanatory variables.

Data on times until failure, or more optimistically, duration of survival or survival times, have two important features:
(a) the times are non-negative and typically have skewed distributions with long tails;
(b) some of the subjects may survive beyond the study period so that their actual failure times may not be known; in this case, and other cases where the failure times are not known completely, the data are said to be censored.

Examples of various forms of censoring are shown in Figure 10.1. The horizontal lines represent the survival times of subjects. T_O and T_C are the beginning and end of the study period, respectively. D denotes 'death' or 'failure' and A denotes 'alive at the end of the study'. L indicates that the subject was known to be alive at the time shown but then became lost to the study and so the subsequent life course is unknown. For subjects 1 and 2, the entire survival period (e.g., from diagnosis until death, or from installation of a machine until failure) occurred within the study period. For subject 3, 'death' occurred after the end of the study so that only the solid part of the line is recorded and the time is said to be right censored at time T_C. For subject 4, the observed survival time was right censored due to loss of follow up at time T_L. For subject 5, the survival time commenced before the study began so the period before T_O (i.e., the dotted line) is not recorded and the recorded survival time is said to be left censored at time T_O.

The analysis of survival time data is the topic of numerous books and papers. Procedures to implement the calculations are available in most statistical programs. In this book, only continuous scale survival time data are considered. Furthermore only parametric models are considered; that is, models requiring the specification of a probability distribution for the survival times. In particular, this means that one of the best known forms of survival analysis, the Cox proportional hazards model (Cox, 1972), is not considered because it is a semi-parametric model in which dependence on the explanatory variables is modelled explicitly but no specific probability distribution is assumed for the survival times.

Figure 10.1 Diagram of types of censoring for survival times.

An advantage of parametric models, compared to the Cox proportional hazards model, is that inferences are usually more precise and there is a wider range of models with which to describe the data, including accelerated failure time models (Wei, 1992). Important topics not considered here include time dependent explanatory variables (Kalbfleisch and Prentice, 1980) and discrete survival time models (Fleming and Harrington, 1991). Fairly recent books that describe the analysis of survival data in detail include Collett (1994), Lee (1992), Cox and Oakes (1984) and Crowder et al. (1991).

The next section explains various functions of the probability distribution of survival times which are useful for model specification. This is followed by descriptions of the two distributions most commonly used for survival data – the exponential and Weibull distributions.

Estimation and inference for survival data are complicated by the presence of censored survival times. The likelihood function contains two components, one involving the uncensored survival times and the other making as much use as possible of information about the survival times which are censored. For several of the more commonly used probability distributions the requirements for generalized linear models are not fully met. Nevertheless, estimation based on the Newton-Raphson method for maximizing the likelihood function, described in Chapter 4, and the inference methods described in Chapter 5 all apply quite well, at least for large sample sizes. The methods discussed in this chapter are illustrated using a small data set so that the calculations are relatively easy, even though the asymptotic properties of the methods apply only approximately.

10.2 Survivor functions and hazard functions

Let the random variable Y denote the survival time and let f(y) denote its probability density function. Then the probability of failure before a specific time y is given by the cumulative probability distribution

  F(y) = Pr(Y < y) = ∫_0^y f(t) dt.

The survivor function is the probability of survival beyond time y. It is given by

  S(y) = Pr(Y ≥ y) = 1 − F(y).   (10.1)

The hazard function is the probability of death in an infinitesimally small time between y and (y + δy), given survival up to time y,

  h(y) = lim_{δy→0} Pr(y ≤ Y < y + δy | Y ≥ y) / δy
       = lim_{δy→0} {[F(y + δy) − F(y)] / δy} × 1/S(y).

But

  lim_{δy→0} [F(y + δy) − F(y)] / δy = f(y)

by the definition of a derivative. Therefore,

  h(y) = f(y)/S(y)   (10.2)

which can also be written as

  h(y) = −(d/dy){log[S(y)]}.   (10.3)

Hence

  S(y) = exp[−H(y)]   where   H(y) = ∫_0^y h(t) dt,

or

  H(y) = − log[S(y)].   (10.4)

H(y) is called the cumulative hazard function or integrated hazard function.

The 'average' survival time is usually estimated by the median of the distribution. This is preferable to the expected value because of the skewness of the distribution. The median survival time, y(50), is given by the solution of the equation F(y) = 1/2. Other percentiles can be obtained similarly; for example, the pth percentile y(p) is the solution of F[y(p)] = p/100 or S[y(p)] = 1 − (p/100). For some distributions these percentiles may be obtained explicitly; for others, the percentiles may need to be calculated from the estimated survivor function (see Section 10.6).

10.2.1 Exponential distribution

The simplest model for a survival time Y is the exponential distribution with probability density function

  f(y; θ) = θe^{−θy},   y ≥ 0, θ > 0.   (10.5)

This is a member of the exponential family of distributions (see Exercise 3.3(b)) and has E(Y) = 1/θ and var(Y) = 1/θ² (see Exercise 4.2). The cumulative distribution is

  F(y; θ) = ∫_0^y θe^{−θt} dt = 1 − e^{−θy}.

So the survivor function is

  S(y; θ) = e^{−θy},   (10.6)

the hazard function is

  h(y; θ) = θ

and the cumulative hazard function is

  H(y; θ) = θy.

The hazard function does not depend on y so the probability of failure in the time interval [y, y + δy] is not related to how long the subject has already survived. This 'lack of memory' property may be a limitation because, in practice, the probability of failure often increases with time. In such situations an accelerated failure time model, such as the Weibull distribution, may be more appropriate. One way to examine whether data satisfy the constant hazard property is to estimate the cumulative hazard function H(y) (see Section 10.3) and plot it against survival time y. If the plot is nearly linear then the exponential distribution may provide a useful model for the data.

The median survival time is given by the solution of the equation F(y; θ) = 1/2, which is y(50) = (1/θ) log 2. This is a more appropriate description of the 'average' survival time than E(Y) = 1/θ because of the skewness of the exponential distribution.
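The constant-hazard relationships above are easy to check numerically. The following sketch (in Python, with an arbitrary illustrative rate θ = 0.5; the choice of θ and the use of scipy are assumptions made for this example, not part of the text) evaluates S(y) and H(y) from (10.6) and compares them with a standard library implementation of the exponential distribution.

```python
import numpy as np
from scipy import stats

theta = 0.5                        # illustrative rate parameter (assumption)
y = np.linspace(0.1, 10.0, 50)

S = np.exp(-theta * y)             # survivor function (10.6)
H = theta * y                      # cumulative hazard H(y; theta) = theta * y

# scipy parameterizes the exponential distribution by scale = 1/theta
dist = stats.expon(scale=1.0 / theta)
assert np.allclose(S, dist.sf(y))  # agrees with the library survivor function

# Median survival time y(50) = (1/theta) * log 2
print(np.log(2) / theta, dist.median())   # both approximately 1.386
```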

© 2002 by Chapman & Hall/CRC 10.2.2 Proportional hazards models

For an exponential distribution, the dependence of Y on explanatory variables could be modelled as E(Y )=xT β. In this case the identity link function would be used. To ensure that θ>0, however, it is more common to use

T θ = ex β.

In this case the hazard function has the multiplicative form p xT β h(y; β)=θ = e = exp( xiβi). i=1

For a binary explanatory variable with values xk = 0 ifthe exposure is absent and xk = 1 ifthe exposure is present, the hazard ratio or relative hazard for presence vs. absence of exposure is

h1(y; β) = eβk (10.7) h0(y; β)  provided that i= k xiβi is constant. A one-unit change in a continuous ex- planatory variable xk will also result in the hazard ratio given in (10.7). More generally, models ofthe form

xT β h1(y)=h0(y)e (10.8)

are called proportional hazards models and h0(y), which is the hazard function corresponding to the reference levels for all the explanatory variables, is called the baseline hazard. For proportional hazards models, the cumulative hazard function is given by

y y xT β xT β H1(y)= h1(t)dt = h0(t)e dt = H0(y)e 0 0 so p log H1(y) = log H0(y)+ xiβi. i=1 Therefore, for two groups of subjects which differ only with respect to the presence (denoted by P ) or absence (denoted by A) ofsome exposure, from (10.7)

log HP (y) = log HA(y)+βk (10.9)

so the log cumulative hazard functions differ by a constant.
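Relationship (10.9) can be illustrated in a few lines of code. The sketch below (hypothetical parameter values, not from the text) sets up exponential hazards for exposed and unexposed groups and confirms that the log cumulative hazards differ by the constant β_k.

```python
import numpy as np

beta0, beta_k = -2.0, 0.7       # illustrative intercept and log hazard ratio (assumptions)
y = np.linspace(0.5, 20.0, 40)

H_A = np.exp(beta0) * y               # cumulative hazard, exposure absent
H_P = np.exp(beta0 + beta_k) * y      # cumulative hazard, exposure present

# (10.9): log H_P(y) = log H_A(y) + beta_k for every y
assert np.allclose(np.log(H_P) - np.log(H_A), beta_k)
print(np.exp(beta_k))                 # hazard ratio exp(beta_k), about 2.01
```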

© 2002 by Chapman & Hall/CRC 10.2.3 Weibull distribution Another commonly used model for survival times is the Weibull distribution which has the probability density function $ % λyλ−1 y f(y; λ, θ)= exp −( )λ ,y≥ 0,λ>0,θ>0 θλ θ (see Example 4.2). The parameters λ and θ determine the shape ofthe dis- tribution and the scale, respectively. To simplify some of the notation, it is convenient to reparameterize the distribution using θ−λ = φ. Then the prob- ability density function is   f(y; λ, φ)=λφyλ−1 exp −φyλ . (10.10) The exponential distribution is a special case ofthe Weibull distribution with λ =1. The survivor function for the Weibull distribution is ∞   S (y; λ, φ)= λφuλ−1 exp −φuλ du

y   = exp −φyλ , (10.11) the hazard function is h (y; λ, φ)=λφyλ−1 (10.12) and the cumulative hazard function is H (y; λ, φ)=φyλ. The hazard function depends on y and with suitable values of λ it can increase or decrease with increasing survival time. Thus, the Weibull distri- bution yields accelerated failure time models. The appropriateness ofthis feature for modelling a particular data set can be assessed using log H(y) = log φ + λ log y (10.13) = log[− log S(y)].

The empirical survivor function Ŝ(y) can be used to plot log[− log Ŝ(y)] (or Ŝ(y) can be plotted on the complementary log-log scale) against the logarithm of the survival times. For the Weibull (or exponential) distribution the points should lie approximately on a straight line. This technique is illustrated in Section 10.3.

It can be shown that the expected value of the survival time Y is

  E(Y) = ∫_0^∞ λφ y^λ exp(−φ y^λ) dy = φ^{−1/λ} Γ(1 + 1/λ)

where Γ(u) = ∫_0^∞ s^{u−1} e^{−s} ds. Also the median, given by the solution of S(y; λ, φ) = 1/2, is

  y(50) = φ^{−1/λ} (log 2)^{1/λ}.

These statistics suggest that the relationship between Y and explanatory variables should be modelled in terms of φ and it should be multiplicative. In particular, if

  φ = α e^{x^T β}

then the hazard function (10.12) becomes

  h(y; λ, φ) = λα y^{λ−1} e^{x^T β}.   (10.14)

If h_0(y) is the baseline hazard function corresponding to reference levels of all the explanatory variables, then

  h(y) = h_0(y) e^{x^T β}

which is a proportional hazards model. In fact, the Weibull distribution is the only distribution for survival time data that has the properties of accelerated failure times and proportional hazards; see Exercises 10.3 and 10.4 and Cox and Oakes (1984).
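To connect the (λ, φ) parameterization used here with a standard library implementation, the sketch below (illustrative parameter values only) checks the survivor function (10.11) and the median formula against scipy's Weibull distribution, for which the scale parameter is θ = φ^{−1/λ}.

```python
import numpy as np
from scipy import stats

lam, phi = 1.5, 0.2             # illustrative shape and scale parameters (assumptions)
y = np.linspace(0.1, 5.0, 50)

S = np.exp(-phi * y**lam)       # survivor function (10.11)
h = lam * phi * y**(lam - 1)    # hazard function (10.12); increasing here since lam > 1

# scipy uses shape c and scale theta, with S(y) = exp(-(y/theta)^c)
dist = stats.weibull_min(c=lam, scale=phi**(-1.0 / lam))
assert np.allclose(S, dist.sf(y))

# Median y(50) = phi^(-1/lam) * (log 2)^(1/lam)
print(phi**(-1.0 / lam) * np.log(2) ** (1.0 / lam), dist.median())
```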

10.3 Empirical survivor function

The cumulative hazard function H(y) is an important tool for examining how well a particular distribution describes a set of survival time data. For example, for the exponential distribution, H(y) = θy is a linear function of time (see Section 10.2.1) and this can be assessed from the data.

The empirical survivor function, an estimate of the probability of survival beyond time y, is given by

  Ŝ(y) = (number of subjects with survival times ≥ y) / (total number of subjects).

The most common way to calculate this function is to use the Kaplan Meier estimate, which is also called the product limit estimate. It is calculated by first arranging the observed survival times in order of increasing magnitude y(1) ≤ y(2) ≤ ... ≤ y(k). Let n_j denote the number of subjects who are alive just before time y(j) and let d_j denote the number of deaths that occur at time y(j) (or, strictly, within a small time interval from y(j) − δ to y(j)). Then the estimated probability of survival past y(j) is (n_j − d_j)/n_j. Assuming that the times y(j) are independent, the Kaplan Meier estimate of the survivor function at time y is

Table 10.1 Remission times of leukemia patients; data from Gehan (1965).

Controls:    1 1 2 2 3 4 4 5 5 8 8 8 8 11 11 12 12 15 17 22 23
Treatment:   6 6 6 6* 7 9* 10 10* 11* 13 16 17* 19* 20* 22 23 25* 32* 32* 34* 35*

* indicates censoring

Table 10.2 Calculation of Kaplan Meier estimate of the survivor function for the treatment group for the data in Table 10.1.

Time y_j    No. n_j alive just     No. d_j deaths    Ŝ(y) = Π (n_j − d_j)/n_j
            before time y_j        at time y_j
0–<6        21                     0                 1
6–<7        21                     3                 0.857
7–<10       17                     1                 0.807
10–<13      15                     1                 0.753
13–<16      12                     1                 0.690
16–<22      11                     1                 0.627
22–<23      7                      1                 0.538
≥23         6                      1                 0.448

  Ŝ(y) = Π_{j=1}^{k} (n_j − d_j) / n_j

for y between times y(j) and y(j+1).
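The product limit calculation is short enough to implement directly. The following sketch is a minimal implementation (the function name and the convention of treating subjects censored at a death time as still at risk at that time are choices made here); applied to the treatment group of Table 10.1 it reproduces the values in Table 10.2.

```python
import numpy as np

def kaplan_meier(times, delta):
    """Kaplan Meier (product limit) estimate; delta = 1 for death, 0 for censoring."""
    times = np.asarray(times, dtype=float)
    delta = np.asarray(delta)
    estimate = []
    surv = 1.0
    for t in np.unique(times[delta == 1]):         # distinct death times y(j)
        n_j = np.sum(times >= t)                   # n_j: alive just before y(j)
        d_j = np.sum((times == t) & (delta == 1))  # d_j: deaths at y(j)
        surv *= (n_j - d_j) / n_j
        estimate.append((t, n_j, d_j, surv))
    return estimate

# Treatment group from Table 10.1 (delta = 0 marks the censored times)
t = [6, 6, 6, 6, 7, 9, 10, 10, 11, 13, 16, 17, 19, 20, 22, 23, 25, 32, 32, 34, 35]
d = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]

for row in kaplan_meier(t, d):
    print("t=%2.0f  n=%2d  d=%d  S=%.3f" % row)
# S = 0.857, 0.807, 0.753, 0.690, 0.627, 0.538, 0.448, as in Table 10.2
```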

10.3.1 Example: Remission times

The calculation of Ŝ(y) is illustrated using an old data set of times to remission of leukemia patients (Gehan, 1965). There are two groups each of n = 21 patients. In the control group, who were treated with a placebo, there was no censoring, whereas in the active treatment group, who were given 6-mercaptopurine, more than half of the observations were censored. The data for both groups are given in Table 10.1. Details of the calculation of Ŝ(y) for the treatment group are shown in Table 10.2.

Figure 10.2 shows dot plots of the uncensored times (dots) and censored times (squares) for each group. Due to the high level of censoring in the treatment group, the distributions are not really comparable.

Figure 10.2 Dot plots of remission time data in Table 10.1: dots represent uncensored times and squares represent censored times.

Figure 10.3 Empirical survivor functions (Kaplan Meier estimates) for data in Table 10.1: the solid line represents the control group and the dotted line represents the treatment group.

Figure 10.4 Log cumulative hazard function plotted against log of remission time for data in Table 10.1; dots represent the control group and diamonds represent the treatment group.

Nevertheless, the plots show the skewed distributions and suggest that survival times were longer in the treatment group. Figure 10.3 shows the Kaplan Meier estimates of the survivor functions for the two groups. The solid line represents the control group and the dotted line represents the treatment group. Survival was obviously better in the treatment group. Figure 10.4 shows the logarithm of the cumulative hazard function plotted against log y. The two lines are fairly straight which suggests that the Weibull distribution is appropriate, from (10.13). Furthermore, the lines are parallel which suggests that the proportional hazards model is appropriate, from (10.9). The slopes of the lines are near unity which suggests that the simpler exponential distribution may provide as good a model as the Weibull distribution. The distance between the lines is about 1.4 which indicates that the hazard ratio is about exp(1.4) ≈ 4, from (10.9).

10.4 Estimation

For the jth subject, the data recorded are: y_j the survival time; δ_j a censoring indicator with δ_j = 1 if the survival time is uncensored and δ_j = 0 if it is censored; and x_j a vector of explanatory variables. Let y_1, ..., y_r denote the uncensored observations and y_{r+1}, ..., y_n denote the censored ones.

The contribution of the uncensored variables to the likelihood function is

  Π_{j=1}^{r} f(y_j).

For a censored variable we know the survival time Y is at least y_j (r + 1 ≤ j ≤ n) and the probability of this is Pr(Y ≥ y_j) = S(y_j), so the contribution of the censored variables to the likelihood function is

  Π_{j=r+1}^{n} S(y_j).

The full likelihood is

  L = Π_{j=1}^{n} f(y_j)^{δ_j} S(y_j)^{1−δ_j}   (10.15)

so the log-likelihood function is

  l = Σ_{j=1}^{n} [δ_j log f(y_j) + (1 − δ_j) log S(y_j)]
    = Σ_{j=1}^{n} [δ_j log h(y_j) + log S(y_j)]   (10.16)

from Equation (10.2). These functions depend on the parameters of the probability distributions and the parameters in the linear component x^T β.

The parameters can be estimated using the methods described in Chapter 4. Usually numerical maximization of the log-likelihood function, based on the Newton-Raphson method, is employed. The inverse of the information matrix which is used in the iterative procedure provides an asymptotic estimate of the variance-covariance matrix of the parameter estimates.

The main difference between the parametric models for survival data described in this book and the commonly used Cox proportional hazards regression model is in the function (10.15). For the Cox model, the functions f and S are not fully specified; for more details, see Collett (1994), for example.

10.4.1 Example: simple exponential model

Suppose we have survival time data with censoring but no explanatory variables and that we believe that the exponential distribution is a suitable model. Then the likelihood function is

  L(θ; y) = Π_{j=1}^{n} (θe^{−θy_j})^{δ_j} (e^{−θy_j})^{1−δ_j}

from equations (10.5), (10.6) and (10.15). The log-likelihood function is

  l(θ; y) = Σ_{j=1}^{n} δ_j log θ + Σ_{j=1}^{n} [δ_j(−θy_j) + (1 − δ_j)(−θy_j)].   (10.17)

As there are r uncensored observations (with δ_j = 1) and (n − r) censored observations (with δ_j = 0), equation (10.17) can be simplified to

  l(θ; y) = r log θ − θ Σ_{j=1}^{n} y_j.

The solution of the equation

  U = dl(θ; y)/dθ = r/θ − Σ y_j = 0

gives the maximum likelihood estimator

  θ̂ = r / Σ Y_j.

If there were no censored observations then r = n and 1/θ̂ is just the mean of the survival times, as might be expected because E(Y) = 1/θ.

The variance of θ̂ can be obtained from the information,

  var(θ̂) = 1/I = −1/E(U′)

where

  U′ = d²l/dθ² = −r/θ².

So var(θ̂) = θ²/r, which can be estimated by θ̂²/r. Therefore, for example, an approximate 95% confidence interval for θ is θ̂ ± 1.96 θ̂/√r.
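Applying these formulas takes only a few lines. The sketch below uses the treatment group of Table 10.1 to compute θ̂ = r/Σ y_j and the approximate 95% confidence interval; it is an illustration of the formulas just derived, not a reproduction of any table in the text.

```python
import numpy as np

# Treatment group from Table 10.1: 21 subjects, r = 9 uncensored
y = np.array([6, 6, 6, 6, 7, 9, 10, 10, 11, 13, 16, 17,
              19, 20, 22, 23, 25, 32, 32, 34, 35], dtype=float)
delta = np.array([1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0])

r = delta.sum()                  # number of uncensored observations
theta_hat = r / y.sum()          # maximum likelihood estimate r / sum(y_j)
se = theta_hat / np.sqrt(r)      # standard error from var(theta_hat) = theta^2 / r

print(theta_hat)                                      # about 0.025 per week
print(theta_hat - 1.96 * se, theta_hat + 1.96 * se)   # approximate 95% CI
```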

10.4.2 Example: Weibull proportional hazards model

If the data for subject j are {y_j, δ_j and x_j} and the Weibull distribution is thought to provide a suitable model (for example, based on initial exploratory analysis) then the log-likelihood function is

  l = Σ_{j=1}^{n} [δ_j log(λα y_j^{λ−1} e^{x_j^T β}) − α y_j^λ e^{x_j^T β}]

from equations (10.14) and (10.16). This function can be maximized numerically to obtain estimates λ̂, α̂ and β̂.
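A direct numerical maximization of this log-likelihood is sketched below for the remission time data, using the parameterization of Section 10.7 in which α is absorbed into an intercept exp(β_0). The choice of optimizer, starting values and the reparameterization via log λ (to keep the shape positive) are assumptions of this sketch; with x = 1 coding the treatment group, the fitted shape should be close to the Weibull estimates in Table 10.3, while the sign of the estimated β_1 depends on the coding.

```python
import numpy as np
from scipy.optimize import minimize

# Remission times from Table 10.1; delta = 1 uncensored, x = 1 for treatment
ctrl = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23]
trt = [6, 6, 6, 6, 7, 9, 10, 10, 11, 13, 16, 17, 19, 20, 22, 23, 25, 32, 32, 34, 35]
trt_d = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]

y = np.array(ctrl + trt, dtype=float)
d = np.array([1] * len(ctrl) + trt_d, dtype=float)
x = np.array([0] * len(ctrl) + [1] * len(trt), dtype=float)

def negloglik(params):
    log_lam, b0, b1 = params
    lam = np.exp(log_lam)          # log(lambda) keeps the shape parameter positive
    eta = b0 + b1 * x
    # l = sum{ d_j [log lam + (lam - 1) log y_j + eta_j] - y_j^lam exp(eta_j) },
    # the Weibull version of (10.16) with hazard lam * y^(lam - 1) * exp(eta)
    return -np.sum(d * (log_lam + (lam - 1.0) * np.log(y) + eta)
                   - y**lam * np.exp(eta))

res = minimize(negloglik, x0=[0.0, -2.0, 0.0], method="BFGS")
print("lambda:", np.exp(res.x[0]), " beta0:", res.x[1], " beta1:", res.x[2])
```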

10.5 Inference

The Newton-Raphson iteration procedure used to obtain maximum likelihood estimates also produces the information matrix I which can be inverted to give the approximate variance-covariance matrix for the estimators. Hence inferences for any parameter θ can be based on the maximum likelihood estimator θ̂ and the standard error, s.e.(θ̂), obtained by taking the square root of the relevant element of the diagonal of I^{−1}. Then the Wald statistic (θ̂ − θ)/s.e.(θ̂) can be used to test hypotheses about θ or to calculate approximate confidence limits for θ assuming that the statistic has the standard Normal distribution N(0, 1) (see Section 5.4).

For the Weibull and exponential distributions, the maximum value of the log-likelihood function can be calculated by substituting the maximum likelihood estimates of the parameters, denoted by the vector θ̂, into the expression in (10.16) to obtain l(θ̂; y). For censored data, the statistic −2l(θ̂; y) may not have a chi-squared distribution, even approximately. For nested models M_1, with p parameters and maximum value l_1 of the log-likelihood function, and M_0, with q < p parameters and maximum value l_0, the difference D = 2(l_1 − l_0) can nevertheless be compared, at least approximately for large samples, with the chi-squared distribution with p − q degrees of freedom (this comparison is used in Section 10.7).

10.6 Model checking

To assess the adequacy of a model it is necessary to check assumptions, such as the proportional hazards and accelerated failure time properties, in addition to looking for patterns in the residuals (see Section 2.3.4) and examining influential observations using statistics analogous to those for multiple linear regression (see Section 6.2.7).

The empirical survivor function Ŝ(y) described in Section 10.3 can be used to examine the appropriateness of the probability model. For example, for the exponential distribution, the plot of − log[Ŝ(y)] against y should be approximately linear, from (10.6). More generally, for the Weibull distribution, the plot of the log cumulative hazard function log[− log Ŝ(y)] against log y should be linear, from (10.13). If the plot shows curvature then some alternative model such as the log-logistic distribution may be better (see Exercise 10.2).

The general proportional hazards model given in (10.8) is

  h(y) = h_0(y) e^{x^T β}

where h_0 is the baseline hazard. Consider a binary explanatory variable x_k with values x_k = 0 if a characteristic is absent and x_k = 1 if it is present. The log-cumulative hazard functions are related by

  log H_P = log H_A + β_k;

see (10.9). Therefore if the empirical survivor functions Ŝ(y) are calculated separately for subjects with and without the characteristic and the log-cumulative hazard functions log[− log Ŝ(y)] are plotted against log y, the lines should have the same slope but be separated by a distance β_k.

More generally, parallel lines for the plots of the log cumulative hazard functions provide support for the proportional hazards assumption. For a fairly

small number of categorical explanatory variables, the proportional hazards assumption can be examined in this way. If the lines are not parallel this may suggest that there are interaction effects among the explanatory variables. If the lines are curved but still parallel this supports the proportional hazards assumption but suggests that the accelerated failure time model is inadequate.

For more complex situations it may be necessary to rely on general diagnostics based on residuals, although these are not specific for investigating the proportional hazards property. The simplest residuals for survival time data are the Cox-Snell residuals. If the survival time for subject j is uncensored then the Cox-Snell residual is

  r_Cj = Ĥ_j(y_j) = − log[Ŝ_j(y_j)]   (10.18)

where Ĥ_j and Ŝ_j are the estimated cumulative hazard and survivor functions for subject j at time y_j. For proportional hazards models, (10.18) can be written as

  r_Cj = exp(x_j^T β̂) Ĥ_0(y_j)

where Ĥ_0(y_j) is the baseline cumulative hazard function evaluated at y_j.

It can be shown that if the model fits the data well then these residuals have an exponential distribution with a parameter of one. In particular, their mean and variance should be approximately equal to one.

For censored observations, r_Cj will be too small and various modifications have been proposed of the form

  r'_Cj = r_Cj         for uncensored observations,
  r'_Cj = r_Cj + ∆     for censored observations,

where ∆ = 1 or ∆ = log 2 (Crowley and Hu, 1977). The distribution of the r'_Cj's can be compared with the exponential distribution with unit mean using exponential probability plots (analogous to Normal probability plots) which are available in various statistical software. An exponential probability plot of the residuals r'_Cj may be used to identify outliers and systematic departures from the assumed distribution.

Martingale residuals provide an alternative approach. For the jth subject the martingale residual is

  r_Mj = δ_j − r_Cj

where δ_j = 1 if the survival time is uncensored and δ_j = 0 if it is censored. These residuals have an expected value of zero but a negatively skewed distribution.

Deviance residuals (which are somewhat misnamed because the sum of their squares is not, in fact, equal to the deviance mentioned in Section 10.5) are defined by

  r_Dj = sign(r_Mj) {−2[r_Mj + δ_j log(r_Cj)]}^{1/2}.

The rDj’s are approximately symmetrically distributed about zero and large values may indicate outlying observations.
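The three kinds of residuals can be computed together once fitted cumulative hazards Ĥ_j(y_j) are available; for the exponential model of Section 10.7, for example, Ĥ_j(y_j) = exp(β̂_0 + β̂_1 x_j) y_j. The function below is a minimal sketch (names and interface are choices made here, not from the text).

```python
import numpy as np

def survival_residuals(delta, H_hat):
    """Cox-Snell, martingale and deviance residuals from fitted
    cumulative hazards H_hat evaluated at the observed times."""
    r_C = np.asarray(H_hat, dtype=float)   # Cox-Snell residuals (10.18)
    delta = np.asarray(delta, dtype=float)
    r_M = delta - r_C                      # martingale residuals
    # The bracketed term is non-positive, so the square root is real:
    # for delta = 1, r_M + log r_C = 1 - r_C + log r_C <= 0 since log u <= u - 1
    r_D = np.sign(r_M) * np.sqrt(-2.0 * (r_M + delta * np.log(r_C)))
    return r_C, r_M, r_D
```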

Table 10.3 Results of fitting proportional hazards models based on the exponential and Weibull distributions to the data in Table 10.1.

                         Exponential model    Weibull model
Group       β_1          1.53 (0.40)          1.27 (0.31)
Intercept   β_0          0.63 (0.55)          0.98 (0.43)
Shape       λ            1.00*                1.37 (0.20)

* shape parameter is unity for the exponential distribution.

In principle, any of the residuals r'_Cj, r_Mj or r_Dj are suitable for sequence plots against the order in which the survival times were measured, or any other relevant order (to detect lack of independence among the observations), and for plots against explanatory variables that have been included in the model (and those that have not) to detect any systematic patterns which would indicate that the model is not correctly specified. However, the skewness of the distributions of r'_Cj and r_Mj makes them less useful than r_Dj, in practice.

Diagnostics to identify influential observations can be defined for survival time data, by analogy with similar statistics for multiple linear regression and other generalized linear models. For example, for any parameter β_k, delta-betas ∆_j β_k, one for each subject j, show the effect on the estimate of β_k caused by omitting the data for subject j from the calculations. Plotting the ∆_j β_k's against the order of the observations or against the survival times y_j may indicate systematic effects or particularly influential observations.

10.7 Example: remission times

Figure 10.4 suggests that a proportional hazards model with a Weibull, or even an exponential, distribution should provide a good model for the remission time data in Table 10.1. The models are

  h(y) = exp(β_0 + β_1 x),               y ∼ Exponential;   (10.19)
  h(y) = λ y^{λ−1} exp(β_0 + β_1 x),     y ∼ Weibull,

where x = 0 for the control group, x = 1 for the treatment group and λ is the shape parameter. The results of fitting these models are shown in Table 10.3. The hypothesis that λ = 1 can be tested either using the Wald statistic from the Weibull model, i.e., z = (1.37 − 1.00)/0.20 = 1.85, or from D = 2(l_W − l_E) = 3.89 where l_W and l_E are the maximum values of the log-likelihood functions for the Weibull and exponential models respectively (details not shown here). Comparing z with the standard Normal distribution or D with the chi-squared distribution with one degree of freedom provides only weak evidence against the hypothesis. Therefore, we can conclude that the exponential distribution is about as good as the Weibull distribution for modelling the data. Both models suggest that the parameter β_1 is non-zero and the exponential model provides the estimate exp(1.53) = 4.62 for the relative hazard.

Figure 10.5 Boxplots of Cox-Snell and deviance residuals from the exponential model (10.19) for the data in Table 10.1.

Figure 10.5 shows boxplots of Cox-Snell and deviance residuals for the exponential model. The skewness of the Cox-Snell residuals and the more symmetric distribution of the deviance residuals are apparent. Additionally, the difference in location between the distributions of the treatment and control groups suggests the model has failed to describe fully the patterns of remission times for the two groups of patients.

Table 10.4 Leukemia survival times.

        AG positive                  AG negative
Survival    White blood      Survival    White blood
time        cell count       time        cell count
65          2.30             56          4.40
156         0.75             65          3.00
100         4.30             17          4.00
134         2.60             7           1.50
16          6.00             16          9.00
108         10.50            22          5.30
121         10.00            3           10.00
4           17.00            4           19.00
39          5.40             2           27.00
143         7.00             3           28.00
56          9.40             8           31.00
26          32.00            4           26.00
22          35.00            3           21.00
1           100.00           30          79.00
1           100.00           4           100.00
5           52.00            43          100.00
65          100.00

10.8 Exercises

10.1 The data in Table 10.4 are survival times, in weeks, for leukemia patients. There is no censoring. There are two covariates, white blood cell count (WBC) and the results of a test (AG positive and AG negative). The data set is from Feigl and Zelen (1965) and the data for the 17 patients with AG positive test results are described in Exercise 4.2.
(a) Obtain the empirical survivor functions Ŝ(y) for each group (AG positive and AG negative), ignoring WBC.
(b) Use suitable plots of the estimates Ŝ(y) to select an appropriate probability distribution to model the data.
(c) Use a parametric model to compare the survival times for the two groups, after adjustment for the covariate WBC, which is best transformed to log(WBC).
(d) Check the adequacy of the model using residuals and other diagnostic tests.
(e) Based on this analysis, is AG a useful prognostic indicator?

10.2 The log-logistic distribution with the probability density function

  f(y) = e^θ λ y^{λ−1} / (1 + e^θ y^λ)²

is sometimes used for modelling survival times.
(a) Find the survivor function S(y), the hazard function h(y) and the cumulative hazard function H(y).
(b) Show that the median survival time is exp(−θ/λ).
(c) Plot the hazard function for λ = 1 and λ = 5 with θ = −5, θ = −2 and θ = 1/2.

10.3 For accelerated failure time models the explanatory variables for subject i, η_i, act multiplicatively on the time variable so that the hazard function for subject i is

  h_i(y) = η_i h_0(η_i y)

where h_0(y) is the baseline hazard function. Show that the Weibull and log-logistic distributions both have this property but the exponential distribution does not. (Hint: obtain the hazard function for the random variable T = η_i Y.)

10.4 For proportional hazards models the explanatory variables for subject i, η_i, act multiplicatively on the hazard function. If η_i = e^{x_i^T β} then the hazard function for subject i is

  h_i(y) = e^{x_i^T β} h_0(y)   (10.20)

where h_0(y) is the baseline hazard function.
(a) For the exponential distribution, if h_0 = θ, show that if θ_i = e^{x_i^T β} θ for the ith subject, then (10.20) is satisfied.
(b) For the Weibull distribution, if h_0 = λφ y^{λ−1}, show that if φ_i = e^{x_i^T β} φ for the ith subject, then (10.20) is satisfied.
(c) For the log-logistic distribution, if h_0 = e^θ λ y^{λ−1}/(1 + e^θ y^λ), show that if e^{θ_i} = e^{θ + x_i^T β} for the ith subject, then (10.20) is not satisfied. Hence, or otherwise, deduce that the log-logistic distribution does not have the proportional hazards property.

10.5 As the survivor function S(y) is the probability of surviving beyond time y, the odds of survival past time y is

  O(y) = S(y) / [1 − S(y)].

For proportional odds models the explanatory variables for subject i, η_i, act multiplicatively on the odds of survival beyond time y

  O_i = η_i O_0

where O_0 is the baseline odds.
(a) Find the odds of survival beyond time y for the exponential, Weibull and log-logistic distributions.
(b) Show that only the log-logistic distribution has the proportional odds property.

Table 10.5 Survival times in months of patients with chronic active hepatitis in a randomized controlled trial of prednisolone versus no treatment; data from Altman and Bland (1998).

Prednisolone:   2 6 12 54 56** 68 89 96 96 125* 128* 131* 140* 141* 143 145* 146 148* 162* 168 173* 181*
No treatment:   2 3 4 7 10 22 28 29 32 37 40 41 54 61 63 71 127* 140* 146* 158* 167* 182*

* indicates censoring, ** indicates loss to follow-up.

(c) For the log-logistic distribution show that the log odds of survival beyond time y is

  log O(y) = log {S(y) / [1 − S(y)]} = −θ − λ log y.

Therefore if log O_i (estimated from the empirical survivor function) plotted against log y is approximately linear, then the log-logistic distribution may provide a suitable model.
(d) From (b) and (c) deduce that for two groups of subjects with explanatory variables η_1 and η_2, plots of log O_1 and log O_2 against log y should produce approximately parallel straight lines.

10.6 The data in Table 10.5 are survival times, in months, of 44 patients with chronic active hepatitis. They participated in a randomized controlled trial of prednisolone compared with no treatment. There were 22 patients in each group. One patient was lost to follow-up and several in each group were still alive at the end of the trial. The data are from Altman and Bland (1998).
(a) Calculate the empirical survivor functions for each group.
(b) Use suitable plots to investigate the properties of accelerated failure times, proportional hazards and proportional odds, using the results from Exercises 10.3, 10.4 and 10.5 respectively.
(c) Based on the results from (b), fit an appropriate model to the data in Table 10.5 to estimate the relative effect of prednisolone.

11 Clustered and Longitudinal Data

11.1 Introduction

In all the models considered so far the outcomes Y_i, i = 1, ..., n are assumed to be independent. There are two common situations where this assumption is implausible. In one situation the outcomes are repeated measurements over time on the same subjects; for example, the weights of the same people when they are 30, 40, 50 and 60 years old. This is an example of longitudinal data. Measurements on the same person at different times may be more alike than measurements on different people, because they are affected by persistent characteristics as well as potentially more variable factors; for instance, weight is likely to be related to an adult's genetic makeup and height as well as their eating habits and level of physical activity. For this reason longitudinal data for a group of subjects are likely to exhibit correlation between successive measurements.

The other situation in which data are likely to be correlated is where they are measurements on related subjects; for example, the weights of samples of women aged 40 years selected from specific locations in different countries. In this case the countries are the primary sampling units or clusters and the women are sub-samples within each primary sampling unit. Women from the same geographic area are likely to be more similar to one another, due to shared socio-economic and environmental conditions, than they are to women from other locations. Any comparison of women's weights between areas that failed to take this within-area correlation into account could produce misleading results. For example, the standard deviation of the mean difference in weights between two areas will be over-estimated if the observations which are correlated are assumed to be independent.

The term repeated measures is used to describe both longitudinal and clustered data. In both cases, models that include correlation are needed in order to make valid statistical inferences. There are two approaches to modelling such data.

One approach involves dropping the usual assumption of independence between the outcomes Y_i and modelling the correlation structure explicitly. This method goes under various names such as repeated measures (for example, repeated measures analysis of variance) and the generalized estimating equation approach. The estimation and inference procedures for these models are, in principle, analogous to those for generalized linear models for independent outcomes; although in practice, software may not be readily available to do the calculations.

The alternative approach for modelling repeated measures is based on considering the hierarchical structure of the study design. This is called multilevel modelling.

Figure 11.1 Multilevel study.

For example, suppose there are repeated, longitudinal measurements, level 1, on different subjects, level 2, who were randomized to experimental groups, level 3. This nested structure is illustrated in Figure 11.1 which shows three groups, each of four subjects, on whom measurements are made at two times (for example, before and after some intervention). On each branch, outcomes at the same level are assumed to be independent and the correlation is a result of the multilevel structure (see Section 11.4).

In the next section an example is presented of an experiment with longitudinal outcome measures. Descriptive data analyses are used to explore the study hypothesis and the assumptions that are made in various models which might be used to test the hypothesis. Repeated measures models for Normal data are described in Section 11.3. In Section 11.4, repeated measures models are described for non-Normal data such as counts and proportions which might be analyzed using Poisson, binomial and other distributions (usually from the exponential family). These sections include details of the relevant estimation and inferential procedures.

For repeated measures models, it is necessary to choose a correlation structure likely to reflect the relationships between the observations. Usually the correlation parameters are not of particular interest (i.e., they are nuisance parameters) but they need to be included in the model in order to obtain consistent estimates of those parameters that are of interest and to correctly calculate the standard errors of these estimates.

For multilevel models described in Section 11.5, the effects of levels may be described either by fixed parameters (e.g., for group effects) or random variables (e.g., for subjects randomly allocated to groups). If the linear predictor of the model has both fixed and random effects the term mixed model is used. The correlation between observations is due to the random effects. This may make the correlation easier to interpret in multilevel models than in repeated measures models. Also the correlation parameters may be of direct interest.

For Normally distributed data, multilevel models are well-established and estimation and model checking procedures are available in most general purpose statistical software. For counts and proportions, although the model specification is conceptually straightforward, there is less software available to fit the models.

In Section 11.6, both repeated measures and multilevel models are fitted to

data from the stroke example in Section 11.2. The results are used to illustrate the connections between the various models.

Finally, in Section 11.7, a number of issues that arise in the modelling of clustered and longitudinal data are mentioned. These include methods of exploratory analysis, consequences of using inappropriate models and problems that arise from missing data.

11.2 Example: Recovery from stroke

The data in Table 11.1 are from an experiment to promote the recovery of stroke patients. There were three experimental groups:
A was a new occupational therapy intervention;
B was the existing stroke rehabilitation program conducted in the same hospital where A was conducted;
C was the usual care regime for stroke patients provided in a different hospital.
There were eight patients in each experimental group. The response variable was a measure of functional ability, the Bartel index; higher scores correspond to better outcomes and the maximum score is 100. Each patient was assessed weekly over the eight weeks of the study. The study was conducted by C. Cropper, at the University of Queensland, and the data were obtained from the OzDasl website developed by Gordon Smyth (http://www.maths.uq.edu.au/~gks/data/index.html).

The hypothesis was that the patients in group A would do better than those in groups B or C. Figure 11.2 shows the time course of scores for every patient. Figure 11.3 shows the time course of the average scores for each experimental group. Clearly most patients improved. Also it appears that those in group A recovered best and those in group C did worst (however, people in group C may have started at a lower level).

The scatter plot matrix in Figure 11.4 shows data for all 24 patients at different times. The corresponding Pearson correlation coefficients are given in Table 11.2. These show high positive correlation between measurements made one week apart and decreasing correlation between observations further apart in time.

A naive analysis, sometimes called a pooled analysis, of these data is to fit an analysis of covariance model in which all 192 observations (for 3 groups × 8 subjects × 8 times) are assumed to be independent with

  Y_ijk = α_i + β t_k + e_ijk   (11.1)

where Y_ijk is the score at time t_k (k = 1, ..., 8) for patient j (j = 1, ..., 8) in group i (where i = 1 for group A, i = 2 for group B and i = 3 for group C); α_i is the mean score for group i; β is a common slope parameter; t_k denotes time (t_k = k for week k, k = 1, ..., 8); and the random error terms e_ijk are all assumed to be independent. The null hypothesis H_0: α_1 = α_2 = α_3 can be compared with an alternative hypothesis such as H_1: α_1 > α_2 > α_3 by fitting models with different group parameters α_i.

Table 11.1 Functional ability scores measuring recovery from stroke for patients in three experimental groups over 8 weeks of the study.

                          Week
Subject  Group    1    2    3    4    5    6    7    8
1        A        45   45   45   45   80   80   80   90
2        A        20   25   25   25   30   35   30   50
3        A        50   50   55   70   70   75   90   90
4        A        25   25   35   40   60   60   70   80
5        A        100  100  100  100  100  100  100  100
6        A        20   20   30   50   50   60   85   95
7        A        30   35   35   40   50   60   75   85
8        A        30   35   45   50   55   65   65   70
9        B        40   55   60   70   80   85   90   90
10       B        65   65   70   70   80   80   80   80
11       B        30   30   40   45   65   85   85   85
12       B        25   35   35   35   40   45   45   45
13       B        45   45   80   80   80   80   80   80
14       B        15   15   10   10   10   20   20   20
15       B        35   35   35   45   45   45   50   50
16       B        40   40   40   55   55   55   60   65
17       C        20   20   30   30   30   30   30   30
18       C        35   35   35   40   40   40   40   40
19       C        35   35   35   40   40   40   45   45
20       C        45   65   65   65   80   85   95   100
21       C        45   65   70   90   90   95   95   100
22       C        25   30   30   35   40   40   40   40
23       C        25   25   30   30   30   30   35   40
24       C        15   35   35   35   40   50   65   65

Table 11.2 Correlation coefficients for the stroke recovery scores in Table 11.1.

         Week 1   Week 2   Week 3   Week 4   Week 5   Week 6   Week 7
Week 2   0.93
Week 3   0.88     0.92
Week 4   0.83     0.88     0.95
Week 5   0.79     0.85     0.91     0.92
Week 6   0.71     0.79     0.85     0.88     0.97
Week 7   0.62     0.70     0.77     0.83     0.92     0.96
Week 8   0.55     0.64     0.70     0.77     0.88     0.93     0.98

Figure 11.2 Stroke recovery scores of individual patients.

Figure 11.3 Average stroke recovery scores for groups of patients: long dashed line corresponds to group A; solid line to group B; short dashed line to group C.

© 2002 by Chapman & Hall/CRC Week1 Week2

Week3

Week4

Week5

Week6

Week7

Week8

Figure11.4ScatterplotmatrixforstrokerecoveryscoresinTable11.2.

Figure 11.3 suggests that the slopes may differ between the three groups so the following model was also fitted

  Y_ijk = α_i + β_i t_k + e_ijk   (11.2)

where the slope parameter β_i denotes the rate of recovery for group i. Models (11.1) and (11.2) can be compared to test the hypothesis H_0: β_1 = β_2 = β_3 against an alternative hypothesis that the β's differ. Neither of these naive models takes account of the fact that measurements of the same patient at different times are likely to be more similar than measurements of different patients. This is analogous to using an unpaired t-test for paired data (see Exercise 2.2).

Table 11.3 shows the results of fitting these models, which will be compared later with results from more appropriate analyses. Note, however, that for model (11.2) the Wald statistics for α_2 − α_1 (3.348/8.166 = 0.41) and for α_3 − α_1 (−0.022/8.166 = −0.003) are very small compared to the standard Normal distribution which suggests that the intercepts are not different (i.e., on average the groups started with the same level of functional ability).

A preferable form of exploratory analysis, sometimes called data reduction or data summary, consists of summarizing the response profiles for each subject by a small number of descriptive statistics, based on assuming that measurements on the same subject are independent. For the stroke data, appropriate summary statistics are the intercept and slope of the individual regression lines. Other examples of summary statistics that may be appropriate in particular situations include peak values, areas under curves, or coefficients of quadratic or exponential terms in non-linear growth curves. These subject-specific statistics are used as the data for subsequent analyses.

© 2002 by Chapman & Hall/CRC Table11.3ResultsofnaiveanalysesofstrokerecoveryscoresinTable11.1,assuming all the data are independent and usingmodels (11.1) and (11.2). Parameter Estimate Standard error Model (11.1) α1 36.842 3.971 α2 − α1 -5.625 3.715 α3 − α1 -12.109 3.715 β 4.764 0.662 Model (11.2) α1 29.821 5.774 α2 − α1 3.348 8.166 α3 − α1 -0.022 8.166 β1 6.324 1.143 β2 − β1 -1.994 1.617 β3 − β1 -2.686 1.617

The intercept and slope estimates and their standard errors for each of the 24 stroke patients are shown in Table 11.4. These results show considerable variability between subjects which should, in principle, be taken into account in any further analyses. Tables 11.5 and 11.6 show analyses comparing intercepts and slopes between the experimental groups, assuming independence between the subjects but ignoring the differences in precision (standard errors) between the estimates. Notice that although the estimates are the same as those for model (11.2) in Table 11.3, the standard errors are (correctly) much larger and the data do not provide much evidence of differences in either the intercepts or the slopes.

Although the analysis of subject specific summary statistics does not require the implausible assumption of independence between observations within subjects, it ignores the random error in the estimates. Ignoring this information can lead to underestimation of effect sizes and underestimation of the overall variation (Fuller, 1987). To avoid these biases, models are needed that better describe the data structure that arises from the study design. Such models are described in the next three sections.
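The pooled analysis is straightforward to reproduce by ordinary least squares. The sketch below enters the scores of Table 11.1 and fits model (11.1) with indicator variables for groups B and C; the coefficients should essentially match the model (11.1) column of Table 11.3 (the use of numpy's least-squares routine is an implementation choice of this sketch).

```python
import numpy as np

# Table 11.1: 24 subjects (rows; subjects 1-8 group A, 9-16 B, 17-24 C) x 8 weeks
scores = np.array([
    [45, 45, 45, 45, 80, 80, 80, 90], [20, 25, 25, 25, 30, 35, 30, 50],
    [50, 50, 55, 70, 70, 75, 90, 90], [25, 25, 35, 40, 60, 60, 70, 80],
    [100, 100, 100, 100, 100, 100, 100, 100], [20, 20, 30, 50, 50, 60, 85, 95],
    [30, 35, 35, 40, 50, 60, 75, 85], [30, 35, 45, 50, 55, 65, 65, 70],
    [40, 55, 60, 70, 80, 85, 90, 90], [65, 65, 70, 70, 80, 80, 80, 80],
    [30, 30, 40, 45, 65, 85, 85, 85], [25, 35, 35, 35, 40, 45, 45, 45],
    [45, 45, 80, 80, 80, 80, 80, 80], [15, 15, 10, 10, 10, 20, 20, 20],
    [35, 35, 35, 45, 45, 45, 50, 50], [40, 40, 40, 55, 55, 55, 60, 65],
    [20, 20, 30, 30, 30, 30, 30, 30], [35, 35, 35, 40, 40, 40, 40, 40],
    [35, 35, 35, 40, 40, 40, 45, 45], [45, 65, 65, 65, 80, 85, 95, 100],
    [45, 65, 70, 90, 90, 95, 95, 100], [25, 30, 30, 35, 40, 40, 40, 40],
    [25, 25, 30, 30, 30, 30, 35, 40], [15, 35, 35, 35, 40, 50, 65, 65],
], dtype=float)

y = scores.ravel()                  # long format: 192 observations
g = np.repeat([0, 1, 2], 64)        # group A, B, C (8 subjects x 8 weeks each)
t = np.tile(np.arange(1, 9), 24)    # week number t_k

# Model (11.1), coded as alpha_1, (alpha_2 - alpha_1), (alpha_3 - alpha_1), beta
X = np.column_stack([np.ones(192), (g == 1).astype(float),
                     (g == 2).astype(float), t])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)   # approximately [36.842, -5.625, -12.109, 4.764], as in Table 11.3
```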

11.3 Repeated measures models for Normal data

Suppose there are N study units or subjects with n_i measurements for subject i (e.g., n_i longitudinal observations for person i or n_i observations for cluster i). Let y_i denote the vector of responses for subject i and let y denote the vector of responses for all subjects

Table 11.4 Estimates of intercepts and slopes (and their standard errors) for each subject in Table 11.1.

Subject    Intercept (std. error)    Slope (std. error)
1          30.000 (7.289)            7.500 (1.443)
2          15.536 (4.099)            3.214 (0.812)
3          39.821 (3.209)            6.429 (0.636)
4          11.607 (3.387)            8.393 (0.671)
5          100.000 (0.000)           0.000 (0.000)
6          0.893 (5.304)             11.190 (1.050)
7          15.357 (4.669)            7.976 (0.925)
8          25.357 (1.971)            5.893 (0.390)
9          38.571 (3.522)            7.262 (0.698)
10         61.964 (2.236)            2.619 (0.443)
11         14.464 (5.893)            9.702 (1.167)
12         26.071 (2.147)            2.679 (0.425)
13         48.750 (8.927)            5.000 (1.768)
14         10.179 (3.209)            1.071 (0.636)
15         31.250 (1.948)            2.500 (0.386)
16         34.107 (2.809)            3.810 (0.556)
17         21.071 (2.551)            1.429 (0.505)
18         34.107 (1.164)            0.893 (0.231)
19         32.143 (1.164)            1.607 (0.231)
20         42.321 (3.698)            7.262 (0.732)
21         48.571 (6.140)            7.262 (1.216)
22         24.821 (1.885)            2.262 (0.373)
23         22.321 (1.709)            1.845 (0.339)
24         13.036 (4.492)            6.548 (0.890)

Table 11.5 Analysis of variance of intercept estimates in Table 11.4.

Source    d.f.    Mean square    F       p-value
Groups    2       30             0.07    0.94
Error     21      459

Parameter    Estimate    Std. error
α_1          29.821      7.572
α_2 − α_1    3.348       10.709
α_3 − α_1    -0.018      10.709

Table 11.6 Analysis of variance of slope estimates in Table 11.4.

Source    d.f.    Mean square    F       p-value
Groups    2       15.56          1.67    0.21
Error     21      9.34

Parameter    Estimate    Std. error
β_1          6.324       1.080
β_2 − β_1    -1.994      1.528
β_3 − β_1    -2.686      1.528

  y = (y_1^T, y_2^T, ..., y_N^T)^T,

so y has length Σ_{i=1}^{N} n_i. A Normal linear model for y is

  E(y) = Xβ = µ;   y ∼ N(µ, V),   (11.3)

where X = (X_1^T, X_2^T, ..., X_N^T)^T is the design matrix formed by stacking the matrices X_i one above another, and β = (β_1, ..., β_p)^T;

X_i is the n_i × p design matrix for subject i and β is a parameter vector of length p. The variance-covariance matrix for the measurements on subject i is

  V_i = [ σ_{i11}     σ_{i12}   ···   σ_{i1n_i}   ]
        [ σ_{i21}       ⋱                          ]
        [   ⋮                    ⋱                 ]
        [ σ_{in_i1}              ···   σ_{in_in_i} ]

and the overall variance-covariance matrix has the block diagonal form

  V = [ V_1   O    ···   O   ]
      [ O     V_2         O   ]
      [             ⋱         ]
      [ O     O          V_N ]

assuming that responses for different subjects are independent (where O denotes a matrix of zeros). Usually the matrices V_i are assumed to have the same form for all subjects.

If the elements of V are known constants then β can be estimated from the

likelihood function for model (11.3), or by the method of least squares. The maximum likelihood estimator is obtained by solving the score equations

  U(β) = ∂l/∂β = X^T V^{−1}(y − Xβ) = Σ_{i=1}^{N} X_i^T V_i^{−1}(y_i − X_i β) = 0   (11.4)

where l is the log-likelihood function. The solution is

  β̂ = (X^T V^{−1} X)^{−1} X^T V^{−1} y = (Σ_{i=1}^{N} X_i^T V_i^{−1} X_i)^{−1} (Σ_{i=1}^{N} X_i^T V_i^{−1} y_i)   (11.5)

with

  var(β̂) = (X^T V^{−1} X)^{−1} = (Σ_{i=1}^{N} X_i^T V_i^{−1} X_i)^{−1}   (11.6)

and β̂ is asymptotically Normal (see Chapter 6).

In practice, V is usually not known and has to be estimated from the data by an iterative process. This involves starting with an initial V (for instance the identity matrix), calculating an estimate β̂ and hence the linear predictors µ̂ = Xβ̂ and the residuals r = y − µ̂. The variances and covariances of the residuals are used to calculate V̂ which in turn is used in (11.5) to obtain a new estimate β̂. The process alternates between estimating β̂ and estimating V̂ until convergence is achieved.

If the estimate V̂ is substituted for V in equation (11.6), the variance of β̂ is likely to be underestimated. Therefore a preferable alternative is

  V_s(β̂) = I^{−1} C I^{−1}

where

  I = X^T V^{−1} X = Σ_{i=1}^{N} X_i^T V_i^{−1} X_i

and

  C = Σ_{i=1}^{N} X_i^T V_i^{−1} (y_i − X_i β̂)(y_i − X_i β̂)^T V_i^{−1} X_i

where V_i denotes the ith sub-matrix of V. V_s(β̂) is called the information sandwich estimator, because I is the information matrix (see Chapter 5). It is also sometimes called the Huber estimator. It is a consistent estimator of var(β̂) when V is not known and it is robust to mis-specification of V.
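The alternating scheme just described, together with the sandwich formula, can be sketched in a few lines of numpy. Everything below (the simulated equicorrelated data, the number of iterations and the unstructured estimate of V_i) is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 200, 4                                        # subjects, measurements each

X = np.dstack([np.ones((N, n)), rng.normal(size=(N, n))])   # N x n x p design
beta_true = np.array([1.0, 0.5])
a = rng.normal(size=(N, 1))                          # shared subject effect
Y = X @ beta_true + a + rng.normal(size=(N, n))      # induces equicorrelation

V = np.eye(n)                                        # initial working V_i
for _ in range(10):                                  # alternate beta and V
    Vinv = np.linalg.inv(V)
    I = sum(X[i].T @ Vinv @ X[i] for i in range(N))  # information, as in (11.6)
    b = sum(X[i].T @ Vinv @ Y[i] for i in range(N))
    beta = np.linalg.solve(I, b)                     # GLS estimate (11.5)
    R = Y - X @ beta                                 # residuals
    V = R.T @ R / N                                  # re-estimate V_i

# Information sandwich estimator V_s(beta)
Vinv = np.linalg.inv(V)
I = sum(X[i].T @ Vinv @ X[i] for i in range(N))
C = sum(X[i].T @ Vinv @ np.outer(R[i], R[i]) @ Vinv @ X[i] for i in range(N))
Vs = np.linalg.inv(I) @ C @ np.linalg.inv(I)
print(beta, np.sqrt(np.diag(Vs)))                    # estimates and robust s.e.
```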

There are several commonly used forms for the matrix Vi.

© 2002 by Chapman & Hall/CRC 1. All the off-diagonal elements are equal so that   1 ρ ··· ρ  ρ 1 ρ  2   Vi = σ  . . .  . (11.6)  . .. .  ρρ··· 1 This is appropriate for clustered data where it is plausible that all mea- surements are equally correlated, for example, for elements within the same primary sampling unit such as people living in the same area. The term ρ is called the intra-class correlation coefficient. The equicorrelation matrix in (11.7) is called exchangeable or spherical. Ifthe off-diagonal 2 2 2 term ρ can be written in the form σa/(σa + σb ), the matrix is said to have compound symmetry. 2. The off-diagonal terms decrease with ‘distance’ between observations; for example, ifall the vectors yi have the same length n and   1 ρ12 ··· ρ1n   ρ21 1 ρ2n 2   Vi = σ  . . .  (11.7)  . .. .  ρn1 ρn2 ··· 1

where ρ_jk depends on the 'distance' between observations j and k. Examples include ρ_jk = ρ^{|t_j − t_k|} for measurements at times t_j and t_k, or ρ_jk = exp(−|j − k|). One commonly used form is the first order autoregressive model with ρ^{|j−k|} where |ρ| < 1 so that

     V_i = σ² [ 1         ρ      ρ²    ···   ρ^{n−1} ]
              [ ρ         1      ρ           ρ^{n−2} ]
              [ ρ²        ρ      1                   ]
              [ ⋮                      ⋱             ]
              [ ρ^{n−1}   ···    ρ            1      ].   (11.9)

3. All the correlation terms may be different:

     V_i = σ² [ 1      ρ_12   ···   ρ_1n ]
              [ ρ_21   1            ρ_2n ]
              [ ⋮             ⋱          ]
              [ ρ_n1   ρ_n2   ···   1   ].

This unstructured correlation matrix involves no assumptions about correlations between measurements but all the vectors y_i must be the same length n. It is only practical to use this form when the matrix V_i is not large relative to the number of subjects because the number, n(n − 1)/2, of nuisance parameters ρ_jk may be excessive and may lead to convergence problems in the iterative estimation process.

The term repeated measures analysis of variance is often used when the data are assumed to be Normally distributed. The calculations can be performed using most general purpose statistical software although, sometimes, the correlation structure is assumed to be either spherical or unstructured and correlations which are functions of the times between measurements cannot be modelled. Some programs treat repeated measures as a special case of multivariate data – for example, by not distinguishing between heights of children in the same class (i.e., clustered data), heights of children when they are measured at different ages (i.e., longitudinal data), and heights, weights and girths of children (multivariate data). This is especially inappropriate for longitudinal data in which the time order of the observations matters. The multivariate approach to analyzing Normally distributed repeated measures data is explained in detail by Hand and Crowder (1996), while the inappropriateness of these methods for longitudinal data is illustrated by Senn et al. (2000).

11.4 Repeated measures models for non-Normal data

The score equations for Normal models (11.4) can be generalized to other distributions using ideas from Chapter 4. For the generalized linear model

  E(Y_i) = µ_i,   g(µ_i) = x_i^T β = η_i

for independent random variables Y_1, Y_2, ..., Y_N with a distribution from the exponential family, the scores given by equation (4.18) are

  U_j = Σ_{i=1}^{N} [(y_i − µ_i) / var(Y_i)] x_{ij} (∂µ_i/∂η_i)

for parameters β_j, j = 1, ..., p. The last two terms come from

  ∂µ_i/∂β_j = (∂µ_i/∂η_i)(∂η_i/∂β_j) = (∂µ_i/∂η_i) x_{ij}.

Therefore the score equations for the generalized model (with independent responses Y_i, i = 1, ..., N) can be written as

  U_j = Σ_{i=1}^{N} [(y_i − µ_i) / var(Y_i)] (∂µ_i/∂β_j) = 0,   j = 1, ..., p.   (11.10)

For repeated measures, let y_i denote the vector of responses for subject i with E(y_i) = µ_i, g(µ_i) = X_i β, and let D_i be the matrix of derivatives ∂µ_i/∂β_j. To simplify the notation, assume that all the subjects have the same number of measurements n. The generalized estimating equations (GEE's) analogous to equations (11.10) are

  U = Σ_{i=1}^{N} D_i^T V_i^{−1} (y_i − µ_i) = 0.   (11.11)

These are also called the quasi-score equations. The matrix V_i can be written as

  V_i = A_i^{1/2} R_i A_i^{1/2} φ

where A_i is the diagonal matrix with elements var(y_ik), R_i is the correlation matrix for y_i and φ is a constant to allow for overdispersion.

Liang and Zeger (1986) showed that if the correlation matrices R_i are correctly specified, the estimator β̂ is consistent and asymptotically Normal. Furthermore, β̂ is fairly robust against mis-specification of R_i. They used the term working correlation matrix for R_i and suggested that knowledge of the study design and results from exploratory analyses should be used to select a plausible form. Preferably, R_i should depend on only a small number of parameters, using assumptions such as equicorrelation or autoregressive correlation (see Section 11.3 above).

The GEE's given by equation (11.11) are used iteratively. Starting with R_i as the identity matrix and φ = 1, the parameters β are estimated by solving equations (11.11). The estimates are used to calculate fitted values µ̂_i = g^{−1}(X_i β̂) and hence the residuals y_i − µ̂_i. These are used to estimate the parameters of A_i, R_i and φ. Then (11.11) is solved again to obtain improved estimates β̂, and so on, until convergence is achieved.

Software for solving GEE's is now available in several commercially available software and free-ware programs. While the concepts underlying GEE's are relatively simple there are a number of complications that occur in practice. For example, for binary data, correlation is not a natural measure of association and alternative measures using odds ratios have been proposed (Lipsitz, Laird and Harrington, 1991).

For GEE's it is even more important to use a sandwich estimator for var(β̂) than for the Normal case (see Section 11.3). This is given by

  V_s(β̂) = I^{−1} C I^{−1}

where

  I = Σ_{i=1}^{N} D_i^T V_i^{−1} D_i

is the information matrix and

  C = Σ_{i=1}^{N} D_i^T V_i^{−1} (y_i − µ_i)(y_i − µ_i)^T V_i^{−1} D_i.

Then asymptotically, β̂ has the distribution N(β, V_s(β̂)) and inferences can be made using Wald statistics.
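As an indication of how such software is used, the sketch below fits a Poisson GEE with an exchangeable working correlation to simulated clustered counts, using the statsmodels package (the package, the simulated data and the parameter values are all assumptions of this illustration, not part of the text).

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.genmod.cov_struct import Exchangeable

rng = np.random.default_rng(2)

# 40 clusters of size 5; a shared cluster effect makes counts within a
# cluster positively correlated
groups = np.repeat(np.arange(40), 5)
x = rng.normal(size=200)
u = np.repeat(rng.normal(scale=0.3, size=40), 5)
y = rng.poisson(np.exp(0.5 + 0.4 * x + u))

X = sm.add_constant(x)
model = sm.GEE(y, X, groups=groups, family=sm.families.Poisson(),
               cov_struct=Exchangeable())
result = model.fit()
print(result.params)    # regression coefficient estimates
print(result.bse)       # robust (sandwich) standard errors
```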

11.5 Multilevel models

An alternative approach for analyzing repeated measures data is to use hierarchical models based on the study design. Consider a survey conducted

using cluster randomized sampling. Let Y_{jk} denote the response of the kth subject in the jth cluster. For example, suppose Y_{jk} is the income of the kth randomly selected household in council area j, where council areas, the primary sampling units, are chosen randomly from all councils within a country or state. If the goal is to estimate the average household income \mu, then a suitable model might be

Y_{jk} = \mu + a_j + e_{jk} \qquad (11.11)

where a_j is the effect of area j and e_{jk} is the random error term. As areas were randomly selected and the area effects are not of primary interest, the terms a_j can be defined as independent, identically distributed random variables with a_j \sim N(0, \sigma_a^2). Similarly, the terms e_{jk} are independent, identically distributed random variables with e_{jk} \sim N(0, \sigma_e^2), and the a_j's and e_{jk}'s are independent. In this case

E(Y_{jk}) = \mu,

\text{var}(Y_{jk}) = E[(Y_{jk} - \mu)^2] = E[(a_j + e_{jk})^2] = \sigma_a^2 + \sigma_e^2,

\text{cov}(Y_{jk}, Y_{jm}) = E[(a_j + e_{jk})(a_j + e_{jm})] = \sigma_a^2

for households in the same area, and

\text{cov}(Y_{jk}, Y_{lm}) = E[(a_j + e_{jk})(a_l + e_{lm})] = 0

for households in different areas. If \mathbf{y}_j is the vector of responses for households in area j, then the variance-covariance matrix for \mathbf{y}_j is

\mathbf{V}_j = \begin{bmatrix} \sigma_a^2 + \sigma_e^2 & \sigma_a^2 & \cdots & \sigma_a^2 \\ \sigma_a^2 & \sigma_a^2 + \sigma_e^2 & \cdots & \sigma_a^2 \\ \vdots & & \ddots & \vdots \\ \sigma_a^2 & \sigma_a^2 & \cdots & \sigma_a^2 + \sigma_e^2 \end{bmatrix} = (\sigma_a^2 + \sigma_e^2) \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}

where \rho = \sigma_a^2/(\sigma_a^2 + \sigma_e^2) is the intra-class (or intra-cluster) correlation coefficient; it describes the proportion of the total variance that is due to the variance between clusters. If the responses within a cluster are much more alike than responses from different clusters, then \sigma_e^2 is much smaller than \sigma_a^2, so \rho will be near unity; thus \rho is a relative measure of the within-cluster similarity. The matrix \mathbf{V}_j is the same as (11.7), the equicorrelation matrix.
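The intra-class correlation can be estimated from fitted variance components. The sketch below simulates data from model (11.11) and fits it with the MixedLM class of the Python package statsmodels, one of several programs that fit such models; the variable names and simulated parameter values are hypothetical.

```python
# Estimating rho = sigma_a^2/(sigma_a^2 + sigma_e^2) for model (11.11).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
areas = np.repeat(np.arange(40), 10)                # 40 areas, 10 households each
a = rng.normal(scale=2.0, size=40)[areas]           # area effects, sigma_a = 2
income = 50 + a + rng.normal(scale=3.0, size=400)   # errors with sigma_e = 3
df = pd.DataFrame({"income": income, "area": areas})

fit = sm.MixedLM.from_formula("income ~ 1", groups="area", data=df).fit()
sigma_a2 = float(fit.cov_re.iloc[0, 0])             # between-area variance
sigma_e2 = fit.scale                                # within-area variance
print("rho =", sigma_a2 / (sigma_a2 + sigma_e2))    # true value 4/13 = 0.31
```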

In model (11.11), the parameter \mu is a fixed effect and the term a_j is a random effect. This is an example of a mixed model with both fixed and random effects. The parameters of interest are \mu, \sigma_a^2 and \sigma_e^2 (and hence \rho).
As another example, consider longitudinal data in which Y_{jk} is the measurement at time t_k on subject j, who was selected at random from the population of interest. A linear model for this situation is

Y_{jk} = \beta_0 + a_j + (\beta_1 + b_j) t_k + e_{jk} \qquad (11.12)

where \beta_0 and \beta_1 are the intercept and slope parameters for the population, a_j and b_j are the differences from these parameters specific to subject j, t_k denotes the time of the kth measurement and e_{jk} is the random error term. The terms a_j, b_j and e_{jk} may be considered as random variables with a_j \sim N(0, \sigma_a^2), b_j \sim N(0, \sigma_b^2) and e_{jk} \sim N(0, \sigma_e^2), and they are all assumed to be independent. For this model

E(Y_{jk}) = \beta_0 + \beta_1 t_k,

\text{var}(Y_{jk}) = \text{var}(a_j) + t_k^2\, \text{var}(b_j) + \text{var}(e_{jk}) = \sigma_a^2 + t_k^2 \sigma_b^2 + \sigma_e^2,

\text{cov}(Y_{jk}, Y_{jm}) = \sigma_a^2 + t_k t_m \sigma_b^2

for measurements on the same subject, and

\text{cov}(Y_{jk}, Y_{lm}) = 0

for measurements on different subjects. Therefore the variance-covariance matrix for subject j is of the form shown in (11.8), with terms dependent on t_k and t_m. In model (11.12), \beta_0 and \beta_1 are fixed effects, a_j and b_j are random effects, and we want to estimate \beta_0, \beta_1, \sigma_a^2, \sigma_b^2 and \sigma_e^2.
In general, mixed models for Normal responses can be written in the form

\mathbf{y} = \mathbf{X}\boldsymbol\beta + \mathbf{Z}\mathbf{u} + \mathbf{e} \qquad (11.13)

where \boldsymbol\beta are the fixed effects, and \mathbf{u} and \mathbf{e} are random effects. The matrices \mathbf{X} and \mathbf{Z} are design matrices. Both \mathbf{u} and \mathbf{e} are assumed to be Normally distributed. E(\mathbf{y}) = \mathbf{X}\boldsymbol\beta summarizes the non-random component of the model; \mathbf{Z}\mathbf{u} describes the between-subjects random effects and \mathbf{e} the within-subjects random effects. If \mathbf{G} and \mathbf{R} denote the variance-covariance matrices for \mathbf{u} and \mathbf{e} respectively, then the variance-covariance matrix for \mathbf{y} is

V(\mathbf{y}) = \mathbf{Z}\mathbf{G}\mathbf{Z}^T + \mathbf{R}. \qquad (11.14)

The parameters of interest are the elements of \boldsymbol\beta and the variance and covariance elements in \mathbf{G} and \mathbf{R}. For Normal models these can be estimated using the methods of maximum likelihood or residual maximum likelihood (REML). Computational procedures are available in many general purpose statistical programs and more specialized software such as MLn (Rasbash et al., 1998; Bryk and Raudenbush, 1992). Good descriptions of the use of linear mixed models (especially using the software SAS) are given by Verbeke and Molenberghs (1997) and Littell et al. (2000). The books by Longford (1993)

and Goldstein (1995) provide detailed descriptions of multilevel, mixed or random coefficient models, predominantly for Normal data.
Mixed models for non-Normal data are less readily implemented, although they were first described by Zeger, Liang and Albert (1988) and have been the subject of much research; see, for example, Lee and Nelder (1996). The models are specified as follows:

E(\mathbf{y}|\mathbf{u}) = \boldsymbol\mu, \quad \text{var}(\mathbf{y}|\mathbf{u}) = \phi V(\boldsymbol\mu), \quad g(\boldsymbol\mu) = \mathbf{X}\boldsymbol\beta + \mathbf{Z}\mathbf{u}

where the random coefficients \mathbf{u} have some distribution f(\mathbf{u}) and the conditional distribution of \mathbf{y} given \mathbf{u}, written as \mathbf{y}|\mathbf{u}, follows the usual properties for a generalized linear model with link function g. The unconditional mean and variance-covariance for \mathbf{y} can, in principle, be obtained by integrating over the distribution of \mathbf{u}. To make the calculations more tractable, it is common to use conjugate distributions; for example, Normal for \mathbf{y}|\mathbf{u} and Normal for \mathbf{u}; Poisson for \mathbf{y}|\mathbf{u} and gamma for \mathbf{u}; binomial for \mathbf{y}|\mathbf{u} and beta for \mathbf{u}; or binomial for \mathbf{y}|\mathbf{u} and Normal for \mathbf{u}. Some software, for example, MLn and Stata, can be used to fit mixed or multilevel generalized linear models.
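As an illustration of fitting the random-intercept-and-slope model (11.12), and hence one instance of the general mixed model (11.13), the sketch below again uses statsmodels' MixedLM on simulated data; the variable names and parameter values are hypothetical, and reml=True would give the REML estimates instead of maximum likelihood.

```python
# Fitting model (11.12) by maximum likelihood on simulated data.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
J, K = 25, 8                                  # subjects, measurement times
t = np.tile(np.arange(K, dtype=float), J)
subj = np.repeat(np.arange(J), K)
a = rng.normal(scale=2.0, size=J)[subj]       # random intercepts a_j
b = rng.normal(scale=0.5, size=J)[subj]       # random slopes b_j
y = 10 + a + (1.5 + b) * t + rng.normal(scale=1.0, size=J * K)
df = pd.DataFrame({"y": y, "t": t, "subject": subj})

fit = sm.MixedLM.from_formula("y ~ t", groups="subject",
                              re_formula="~t", data=df).fit(reml=False)
print(fit.summary())   # fixed effects beta0, beta1 and the covariance G of (a_j, b_j)
```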

11.6 Stroke example continued

The results of the exploratory analyses and of fitting GEEs and mixed models with different intercepts and slopes to the stroke recovery data are shown in Table 11.7. The models were fitted using Stata. Sandwich estimates of the standard errors were calculated for all the GEE models.
Fitting a GEE, assuming independence between observations for the same subject, is the same as the naive or pooled analysis in Table 11.3. The estimate of \sigma_e is 20.96 (this is the square root of the deviance divided by the degrees of freedom, 192 − 6 = 186). These results suggest that neither intercepts nor slopes differ between groups, as the estimates of differences from \alpha_1 and \beta_1 are small relative to their standard errors.
The data reduction approach, which uses the estimated intercepts and slopes for every subject as the data for comparisons of group effects, produces the same point estimates but different standard errors. From Tables 11.5 and 11.6, the standard deviations are 21.42 for the intercepts and 3.056 for the slopes, and the data do not support hypotheses of differences between the groups.
The GEE analysis, assuming equal correlation among the observations in different weeks, produced the same estimates for the intercept and slope parameters but different standard errors (larger for the intercepts and smaller for the slopes). The estimate of the common correlation coefficient, \rho = 0.812, is about the average of the values in Table 11.2, but the assumption of equal correlation is not very plausible. The estimate of \sigma_e is 20.96, the same as for the models based on independence.
In view of the pattern of correlation coefficients in Table 11.2, an autoregressive model of order 1, AR(1) (see Section 11.3), seems plausible. In this case, the estimates for \rho and \sigma_e are 0.964 and 21.08 respectively. The estimates of intercepts and slopes, and their standard errors, differ from those of the previous models.

Table 11.7 Comparison of analyses of the stroke recovery data using various models.

Intercept estimates
                      α1 (s.e.)        α2 − α1 (s.e.)    α3 − α1 (s.e.)
Pooled                29.821 (5.774)    3.348 (8.166)    -0.022 (8.166)
Data reduction        29.821 (5.772)    3.348 (10.709)   -0.018 (10.709)
GEE, independent      29.821 (5.774)    3.348 (8.166)    -0.022 (8.166)
GEE, equicorrelated   29.821 (7.131)    3.348 (10.085)   -0.022 (10.085)
GEE, AR(1)            33.538 (7.719)   -0.342 (10.916)   -6.474 (10.916)
GEE, unstructured     30.588 (7.462)    2.319 (10.552)   -1.195 (10.552)
Random effects        29.821 (7.047)    3.348 (9.966)    -0.022 (9.966)

Slope estimates
                      β1 (s.e.)        β2 − β1 (s.e.)    β3 − β1 (s.e.)
Pooled                 6.324 (1.143)   -1.994 (1.617)    -2.686 (1.617)
Data reduction         6.324 (1.080)   -1.994 (1.528)    -2.686 (1.528)
GEE, independent       6.324 (1.143)   -1.994 (1.617)    -2.686 (1.617)
GEE, equicorrelated    6.324 (0.496)   -1.994 (0.701)    -2.686 (0.701)
GEE, AR(1)             6.073 (0.714)   -2.142 (1.009)    -2.686 (1.009)
GEE, unstructured      6.926 (0.941)   -3.214 (1.331)    -2.686 (1.331)
Random effects         6.324 (0.463)   -1.994 (0.655)    -2.686 (0.655)

Wald statistics for the differences in slope support the hypothesis that the patients in group A improved significantly faster than patients in the other two groups.
The GEE model with an unstructured correlation matrix involved fitting 28 (= 8 × 7/2) correlation parameters. The estimate of \sigma_e was 21.21. While the point estimates differ from those for the other GEE models with correlation, the conclusion that the slopes differ significantly is the same.
The final model fitted was the mixed model (11.13), estimated by the method of maximum likelihood. The point estimates and standard errors for the fixed parameters were similar to those from the GEE model with the equicorrelated matrix. This is not surprising, as the estimated intra-class correlation coefficient is \rho = 0.831.
This example illustrates both the importance of taking into account the correlation between repeated measures and the robustness of the results regardless of how the correlation is modelled. Without considering the correlation it was not possible to detect the statistically significantly better outcomes for patients in group A.
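The comparisons in Table 11.7 can be reproduced, in outline, with any GEE implementation. The sketch below shows the corresponding calls in Python with statsmodels rather than Stata; the `stroke` data frame here is a simulated placeholder with the layout of the stroke recovery data (substitute the real scores from Table 11.1 to reproduce the table). With the coding used, the parameters correspond to \alpha_1, \alpha_2 - \alpha_1, \alpha_3 - \alpha_1, \beta_1, \beta_2 - \beta_1 and \beta_3 - \beta_1.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Placeholder data: 24 subjects in groups A/B/C, abilities over 8 weeks.
rng = np.random.default_rng(4)
stroke = pd.DataFrame({
    "subject": np.repeat(np.arange(24), 8),
    "group": np.repeat(np.array(list("ABC"))[np.arange(24) % 3], 8),
    "week": np.tile(np.arange(1, 9), 24),
})
stroke["ability"] = 30 + 6 * stroke["week"] + rng.normal(0, 20, len(stroke))

structures = {
    "independent": sm.cov_struct.Independence(),
    "equicorrelated": sm.cov_struct.Exchangeable(),
    "AR(1)": sm.cov_struct.Autoregressive(),
    "unstructured": sm.cov_struct.Unstructured(),
}
for name, cov in structures.items():
    fit = smf.gee("ability ~ C(group) * week", "subject", data=stroke,
                  time=(stroke["week"] - 1).values,   # positions for AR(1)/unstructured
                  family=sm.families.Gaussian(), cov_struct=cov).fit()
    print(name)
    print(fit.params.round(3))
    print(fit.bse.round(3))   # sandwich (robust) standard errors by default
```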

11.7 Comments

Exploratory analyses for repeated measures data should follow the main steps outlined in Section 11.2. For longitudinal data these include plotting

the time course for individual subjects or groups of subjects, and using an appropriate form of data reduction to produce summary statistics that can be examined to identify patterns for the population overall or for sub-samples. For clustered data it is worthwhile to calculate summary statistics at each level of a multilevel model to examine both the main effects and the variability.

Missing data can present problems. With suitable software it may be possible to perform calculations on unbalanced data (e.g., different numbers of observations per subject), but this is dangerous without careful consideration of why the data are missing. Occasionally they may be missing completely at random, unrelated to observed responses or any covariates (Little and Rubin, 1987); in this case, the results should be unbiased. More commonly, there are reasons why the data are missing: for example, in a longitudinal study of treatment some subjects may have become too ill to continue in the study, or in a clustered survey, outlying areas may have been omitted due to lack of resources. In these cases results based on the available data will be biased. Diggle, Liang and Zeger (1994), Diggle and Kenward (1994) and Troxel, Harrington and Lipsitz (1998) discuss the problem in more detail and provide some suggestions about how adjustments may be made in some situations.

Unbalanced data, and longitudinal data in which the observations are not equally spaced or do not all occur at the planned times, can be accommodated in mixed models and generalized estimating equations; for example, see Cnaan et al. (1997), Burton et al. (1998) and Carlin et al. (1999).

Inference for models fitted by GEEs is best undertaken using Wald statistics with a robust sandwich estimator for the variance. The choice of the working correlation matrix is not critical, because the estimator is robust with respect to that choice, but a poor choice can reduce the efficiency of the estimator. In practice, the choice may be affected by the number of correlation parameters to be estimated; for example, use of a large unstructured correlation matrix may produce unstable estimates or the calculations may not converge. Selection of the correlation matrix can be done by fitting models with alternative covariance structures and comparing the Akaike information criterion, which is a function of the log-likelihood adjusted for the number of covariance parameters (Cnaan et al., 1997). Model checking can be carried out with the usual range of residual plots.

For multilevel data, nested models can be compared using likelihood ratio statistics. Residuals used for checking the model assumptions need to be standardized or 'shrunk' to apportion the variance appropriately at each level of the model (Goldstein, 1995). If the primary interest is in the random effects, then Bayesian methods for analyzing the data, for example using BUGS, may be more appropriate than the frequentist approach adopted here (Best et al., 1996).

Table 11.8 Measurements of left ventricular volume (y) and parallel conductance volume (x) on five dogs under eight different load conditions: data from Boltwood et al. (1989).

                                  Condition
Dog        1      2      3      4      5      6      7      8
1    y   81.7   84.3   72.8   71.7   76.7   75.8   77.3   86.3
     x   54.3   62.0   62.3   47.3   53.6   38.0   54.2   54.0
2    y  105.0  113.6  108.7   83.9   89.0   86.1   88.7  117.6
     x   81.5   80.8   74.5   71.9   79.5   73.0   74.7   88.6
3    y   95.5   95.7   84.0   85.8   98.8  106.2  106.4  115.0
     x   65.0   68.3   67.9   61.0   66.0   81.8   71.4   96.0
4    y  113.1  116.5  100.8  101.5  120.8   95.0   91.9   94.0
     x   87.5   93.6   70.4   66.1  101.4   57.0   82.5   80.9
5    y   99.5   99.2  106.1   85.2  106.3   84.6   92.1  101.2
     x   79.4   82.5   87.9   66.4   68.4   59.5   58.5   69.2

11.8 Exercises

11.1 The measurement of left ventricular volume of the heart is important for studies of cardiac physiology and clinical management of patients with heart disease. An indirect way of measuring the volume, y, involves a measurement called the parallel conductance volume, x. Boltwood et al. (1989) found an approximately linear association between y and x in a study of dogs under various 'load' conditions. The results, reported by Glantz and Slinker (1990), are shown in Table 11.8.
(a) Conduct an exploratory analysis of these data.

(b) Let (Y_{jk}, x_{jk}) denote the kth measurement on dog j (j = 1, \ldots, 5; k = 1, \ldots, 8). Fit the linear model

E(Y_{jk}) = \mu_{jk} = \alpha + \beta x_{jk}, \quad Y_{jk} \sim N(\mu_{jk}, \sigma^2),

assuming the random variables Y_{jk} are independent (i.e., ignoring the repeated measures on the same dogs). Compare the estimates of the intercept \alpha and slope \beta, and their standard errors, from this pooled analysis with the results you obtain using a data reduction approach (a sketch of the pooled fit is given below).
(c) Fit a suitable random effects model.
(d) Fit a longitudinal model using a GEE.
(e) Compare the results you obtain from each approach. Which method(s) do you think are most appropriate? Why?
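For part (b), a minimal sketch of the pooled fit, entering the Table 11.8 data and using ordinary least squares in Python (any regression package would serve equally well):

```python
# Pooled fit for Exercise 11.1(b), ignoring the clustering by dog.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

y = [[81.7, 84.3, 72.8, 71.7, 76.7, 75.8, 77.3, 86.3],
     [105.0, 113.6, 108.7, 83.9, 89.0, 86.1, 88.7, 117.6],
     [95.5, 95.7, 84.0, 85.8, 98.8, 106.2, 106.4, 115.0],
     [113.1, 116.5, 100.8, 101.5, 120.8, 95.0, 91.9, 94.0],
     [99.5, 99.2, 106.1, 85.2, 106.3, 84.6, 92.1, 101.2]]
x = [[54.3, 62.0, 62.3, 47.3, 53.6, 38.0, 54.2, 54.0],
     [81.5, 80.8, 74.5, 71.9, 79.5, 73.0, 74.7, 88.6],
     [65.0, 68.3, 67.9, 61.0, 66.0, 81.8, 71.4, 96.0],
     [87.5, 93.6, 70.4, 66.1, 101.4, 57.0, 82.5, 80.9],
     [79.4, 82.5, 87.9, 66.4, 68.4, 59.5, 58.5, 69.2]]
dogs = pd.DataFrame({"y": np.ravel(y), "x": np.ravel(x),
                     "dog": np.repeat(np.arange(1, 6), 8)})

pooled = smf.ols("y ~ x", data=dogs).fit()
print(pooled.params, pooled.bse)   # estimates of alpha, beta and naive s.e.
```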

11.2 Suppose that (Y_{jk}, x_{jk}) are observations on the kth subject in cluster j (with j = 1, \ldots, J; k = 1, \ldots, K) and we want to fit a 'regression through the origin' model

E(Y_{jk}) = \beta x_{jk}

where the variance-covariance matrix for Y's in the same cluster is

\mathbf{V}_j = \sigma^2 \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}

and Y's in different clusters are independent.
(a) From Section 11.3, if the Y's are Normally distributed then

\hat\beta = \Big(\sum_{j=1}^{J} \mathbf{x}_j^T \mathbf{V}_j^{-1} \mathbf{x}_j\Big)^{-1} \Big(\sum_{j=1}^{J} \mathbf{x}_j^T \mathbf{V}_j^{-1} \mathbf{y}_j\Big) \quad \text{with} \quad \text{var}(\hat\beta) = \Big(\sum_{j=1}^{J} \mathbf{x}_j^T \mathbf{V}_j^{-1} \mathbf{x}_j\Big)^{-1}

where \mathbf{x}_j = [x_{j1}, \ldots, x_{jK}]^T. Deduce that the estimate b of \beta is unbiased.
(b) As

\mathbf{V}_j^{-1} = c \begin{bmatrix} 1 & \phi & \cdots & \phi \\ \phi & 1 & \cdots & \phi \\ \vdots & & \ddots & \vdots \\ \phi & \phi & \cdots & 1 \end{bmatrix}

where

c = \frac{1}{\sigma^2 [1 + (K-1)\phi\rho]} \quad \text{and} \quad \phi = \frac{-\rho}{1 + (K-2)\rho},

show that

\text{var}(b) = \frac{\sigma^2 [1 + (K-1)\phi\rho]}{\sum_j \big\{ \sum_k x_{jk}^2 + \phi \big[ \big(\sum_k x_{jk}\big)^2 - \sum_k x_{jk}^2 \big] \big\}}.

(c) If the clustering is ignored, show that the estimate b^* of \beta has \text{var}(b^*) = \sigma^2 / \sum_j \sum_k x_{jk}^2.
(d) If \rho = 0, show that \text{var}(b) = \text{var}(b^*), as expected if there is no correlation within clusters.
(e) If \rho = 1, \mathbf{V}_j / \sigma^2 is a matrix of ones, so the inverse does not exist. But the case of maximum correlation is equivalent to having just one element per cluster. If K = 1, show that \text{var}(b) = \text{var}(b^*) in this situation.
(f) If the study is designed so that \sum_k x_{jk} = 0 and \sum_k x_{jk}^2 is the same for all clusters, let W = \sum_j \sum_k x_{jk}^2 and show that

\text{var}(b) = \frac{\sigma^2 [1 + (K-1)\phi\rho]}{W(1 - \phi)}.

(g) With this notation, \text{var}(b^*) = \sigma^2 / W; hence show that

\frac{\text{var}(b)}{\text{var}(b^*)} = \frac{1 + (K-1)\phi\rho}{1 - \phi} = 1 - \rho.

Deduce the effect on the estimated standard error of the slope estimate for this model if the clustering is ignored.
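A quick numeric check of the identity in (g), for an arbitrary choice of K and \rho:

```python
# Verify [1 + (K-1)*phi*rho] / (1 - phi) = 1 - rho for one (K, rho) pair.
K, rho = 5, 0.3
phi = -rho / (1 + (K - 2) * rho)
print((1 + (K - 1) * phi * rho) / (1 - phi))   # prints 0.7 = 1 - rho
```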

Table 11.9 Numbers of ears clear of acute otitis media at 14 days, tabulated by antibiotic treatment and age of the child: data from Rosner (1989).

                  CEF                        AMO
             Number clear               Number clear
Age        0    1    2   Total        0    1    2   Total
< 2        8    2    8    18         11    2    2    15
2 − 5      6    6   10    22          3    1    5     9
≥ 6        0    1    3     4          1    0    6     7
Total     14    9   21    44         15    3   13    31

11.3 Data on the ears or eyes of subjects are a classical example of clustering – the ears or eyes of the same subject are unlikely to be independent. The data in Table 11.9 are the responses to two treatments, coded CEF and AMO, of children who had acute otitis media in both ears (data from Rosner, 1989).
(a) Conduct an exploratory analysis to compare the effects of treatment and age of the child on the success of the treatments, ignoring the clustering within each child.

(b) Let Y_{ijkl} denote the response of the lth ear of the kth child in treatment group j and age group i. The Y_{ijkl}'s are binary variables with possible values of 1 denoting 'cured' and 0 denoting 'not cured'. A possible model is

\text{logit}\, \pi_{ijkl} = \log \frac{\pi_{ijkl}}{1 - \pi_{ijkl}} = \beta_0 + \beta_1 \text{age} + \beta_2 \text{treatment} + b_k

where b_k denotes the random effect for the kth child, and \beta_0, \beta_1 and \beta_2 are fixed parameters. Fit this model (and possibly other related models) to compare the two treatments; see the sketch below for one way to set up the data. How well do the models fit? What do you conclude about the treatments?
(c) An alternative approach, similar to the one proposed by Rosner, is to use nominal logistic regression with response categories 0, 1 or 2 cured ears for each child. Fit a model of this type and compare the results with those obtained in (b). Which approach is preferable, considering the assumptions made, ease of computation and ease of interpretation?
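For part (b), the data must first be expanded from Table 11.9 into one record per ear. The sketch below does this and then, as a simpler stand-in for the random-effects logit in (b) (which requires specialised mixed-model software), fits the corresponding marginal model by GEE with an exchangeable working correlation, clustering on child; this is a named alternative, not the random-effects model itself.

```python
# Expand Table 11.9 to ear-level records and fit a marginal logit by GEE.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# (age, treatment, ears clear) -> number of children, from Table 11.9
counts = {("<2", "CEF", 0): 8, ("<2", "CEF", 1): 2, ("<2", "CEF", 2): 8,
          ("<2", "AMO", 0): 11, ("<2", "AMO", 1): 2, ("<2", "AMO", 2): 2,
          ("2-5", "CEF", 0): 6, ("2-5", "CEF", 1): 6, ("2-5", "CEF", 2): 10,
          ("2-5", "AMO", 0): 3, ("2-5", "AMO", 1): 1, ("2-5", "AMO", 2): 5,
          (">=6", "CEF", 0): 0, (">=6", "CEF", 1): 1, (">=6", "CEF", 2): 3,
          (">=6", "AMO", 0): 1, (">=6", "AMO", 1): 0, (">=6", "AMO", 2): 6}
rows, child = [], 0
for (age, trt, clear), n in counts.items():
    for _ in range(n):                        # each child contributes two ears
        rows += [{"age": age, "trt": trt, "child": child, "y": int(e < clear)}
                 for e in range(2)]
        child += 1
df = pd.DataFrame(rows)

fit = smf.gee("y ~ C(age) + C(trt)", groups="child", data=df,
              family=sm.families.Binomial(),
              cov_struct=sm.cov_struct.Exchangeable()).fit()
print(fit.summary())
```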

Software

Genstat: Numerical Algorithms Group, NAG Ltd, Wilkinson House, Jordan Hill Road, Oxford, OX2 8DR, United Kingdom. http://www.nag.co.uk/
Glim: Numerical Algorithms Group, NAG Ltd, Wilkinson House, Jordan Hill Road, Oxford, OX2 8DR, United Kingdom. http://www.nag.co.uk/
Minitab: Minitab Inc., 3081 Enterprise Drive, State College, PA 16801-3008, U.S.A. http://www.minitab.com
MLwiN: Multilevel Models Project, Institute of Education, 20 Bedford Way, London, WC1H 0AL, United Kingdom. http://multilevel.ioe.ac.uk/index.html
Stata: Stata Corporation, 4905 Lakeway Drive, College Station, Texas 77845, U.S.A. http://www.stata.com
SAS: SAS Institute Inc., SAS Campus Drive, Cary, NC 27513-2414, U.S.A. http://www.sas.com
SYSTAT: SPSS Science, 233 S. Wacker Drive, 11th Floor, Chicago, IL 60606-6307, U.S.A. http://www.spssscience.com/SYSTAT/
S-PLUS: Insightful Corporation, 1700 Westlake Avenue North, Suite 500, Seattle, WA 98109-3044, U.S.A. http://www.insightful.com/products/splus/splus2000/splusstdintro.html
StatXact and LogXact: Cytel Software Corporation, 675 Massachusetts Avenue, Cambridge, MA 02139, U.S.A. http://www.cytel.com/index.html

References

Aitkin, M., Anderson, D., Francis, B. and Hinde, J. (1989) Statistical Modelling in GLIM, Clarendon Press, Oxford.
Altman, D. G., Machin, D., Bryant, T. N. and Gardner, M. J. (2000) Statistics with Confidence, British Medical Journal, London.
Altman, D. G. and Bland, J. M. (1998) Statistical notes: time to event (survival) data. British Medical Journal, 317, 468-469.
Agresti, A. (1990) Categorical Data Analysis, Wiley, New York.
Agresti, A. (1996) An Introduction to Categorical Data Analysis, Wiley, New York.
Ananth, C. V. and Kleinbaum, D. G. (1997) Regression models for ordinal responses: a review of methods and applications. International Journal of Epidemiology, 26, 1323-1333.
Andrews, D. F. and Herzberg, A. M. (1985) Data: A Collection of Problems from Many Fields for the Student and Research Worker, Springer Verlag, New York.
Aranda-Ordaz, F. J. (1981) On two families of transformations to additivity for binary response data. Biometrika, 68, 357-363.
Ashby, D., West, C. R. and Ames, D. (1989) The ordered logistic regression model in psychiatry: rising prevalence of dementia in old people's homes. Statistics in Medicine, 8, 1317-1326.
Australian Bureau of Statistics (1998) Apparent consumption of foodstuffs 1996-97, publication 4306.0, Australian Bureau of Statistics, Canberra.
Baxter, L. A., Coutts, S. M. and Ross, G. A. F. (1980) Applications of linear models in motor insurance, in Proceedings of the 21st International Congress of Actuaries, Zurich, 11-29.
Belsley, D. A., Kuh, E. and Welsch, R. E. (1980) Regression Diagnostics: Identifying Influential Observations and Sources of Collinearity, Wiley, New York.
Best, N. G., Spiegelhalter, D. J., Thomas, A. and Brayne, C. E. G. (1996) Bayesian analysis of realistically complex models. Journal of the Royal Statistical Society, Series A, 159, 323-342.
Birch, M. W. (1963) Maximum likelihood in three-way contingency tables. Journal of the Royal Statistical Society, Series B, 25, 220-233.
Bliss, C. I. (1935) The calculation of the dose-mortality curve. Annals of Applied Biology, 22, 134-167.
Boltwood, C. M., Appleyard, R. and Glantz, S. A. (1989) Left ventricular volume measurement by conductance catheter in intact dogs: the parallel conductance volume increases with end-systolic volume. Circulation, 80, 1360-1377.
Box, G. E. P. and Cox, D. R. (1964) An analysis of transformations. Journal of the Royal Statistical Society, Series B, 26, 211-234.
Breslow, N. E. and Day, N. E. (1987) Statistical Methods in Cancer Research, Volume 2: The Design and Analysis of Cohort Studies, International Agency for Research on Cancer, Lyon.

Brown, C. C. (1982) On a goodness of fit test for the logistic model based on score statistics. Communications in Statistics – Theory and Methods, 11, 1087-1105.
Brown, W., Bryson, L., Byles, J., Dobson, A., Manderson, L., Schofield, M. and Williams, G. (1996) Women's Health Australia: establishment of the Australian Longitudinal Study on Women's Health. Journal of Women's Health, 5, 467-472.
Bryk, A. S. and Raudenbush, S. W. (1992) Hierarchical Linear Models: Applications and Data Analysis Methods, Sage, Thousand Oaks, California.
Burton, P., Gurrin, L. and Sly, P. (1998) Tutorial in biostatistics – extending the simple linear regression model to account for correlated responses: an introduction to generalized estimating equations and multi-level modelling. Statistics in Medicine, 17, 1261-1291.
Cai, Z. and Tsai, C.-L. (1999) Diagnostics for non-linearity in generalized linear models. Computational Statistics and Data Analysis, 29, 445-469.
Carlin, J. B., Wolfe, R., Coffey, C. and Patton, G. (1999) Tutorial in biostatistics – analysis of binary outcomes in longitudinal studies using weighted estimating equations and discrete-time survival methods: prevalence and incidence of smoking in an adolescent cohort. Statistics in Medicine, 18, 2655-2679.
Charnes, A., Frome, E. L. and Yu, P. L. (1976) The equivalence of generalized least squares and maximum likelihood estimates in the exponential family. Journal of the American Statistical Association, 71, 169-171.
Chatterjee, S. and Hadi, A. S. (1986) Influential observations, high leverage points, and outliers in linear regression. Statistical Science, 1, 379-416.
Cnaan, A., Laird, N. M. and Slasor, P. (1997) Tutorial in biostatistics – using the generalized linear model to analyze unbalanced repeated measures and longitudinal data. Statistics in Medicine, 16, 2349-2380.
Collett, D. (1991) Modelling Binary Data, Chapman and Hall, London.
Collett, D. (1994) Modelling Survival Data in Medical Research, Chapman and Hall, London.
Cook, R. D. and Weisberg, S. (1999) Applied Regression including Computing and Graphics, Wiley, New York.
Cox, D. R. (1972) Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B, 34, 187-220.
Cox, D. R. and Hinkley, D. V. (1974) Theoretical Statistics, Chapman and Hall, London.
Cox, D. R. and Oakes, D. V. (1984) Analysis of Survival Data, Chapman and Hall, London.
Cox, D. R. and Snell, E. J. (1968) A general definition of residuals. Journal of the Royal Statistical Society, Series B, 30, 248-275.
Cox, D. R. and Snell, E. J. (1981) Applied Statistics: Principles and Examples, Chapman and Hall, London.
Cox, D. R. and Snell, E. J. (1989) Analysis of Binary Data, Second Edition, Chapman and Hall, London.
Cressie, N. and Read, T. R. C. (1989) Pearson's χ2 and the loglikelihood ratio statistic G2: a comparative review. International Statistical Review, 57, 19-43.
Crowder, M. J., Kimber, A. C., Smith, R. L. and Sweeting, T. J. (1991) Statistical Analysis of Reliability Data, Chapman and Hall, London.
Crowley, J. and Hu, M. (1977) Covariance analysis of heart transplant survival data. Journal of the American Statistical Association, 72, 27-36.
Diggle, P. and Kenward, M. G. (1994) Informative drop-out in longitudinal data analysis. Applied Statistics, 43, 49-93.

Diggle, P. J., Liang, K.-Y. and Zeger, S. L. (1994) Analysis of Longitudinal Data, Oxford University Press, Oxford.
Dobson, A. J. and Stewart, J. (1974) Frequencies of tropical cyclones in the north-eastern Australian area. Australian Meteorological Magazine, 22, 27-36.
Draper, N. R. and Smith, H. (1998) Applied Regression Analysis, Third Edition, Wiley-Interscience, New York.
Duggan, J. M., Dobson, A. J., Johnson, H. and Fahey, P. P. (1986) Peptic ulcer and non-steroidal anti-inflammatory agents. Gut, 27, 929-933.
Egger, G., Fisher, G., Piers, S., Bedford, K., Morseau, G., Sabasio, S., Taipim, B., Bani, G., Assan, M. and Mills, P. (1999) Abdominal obesity reduction in Indigenous men. International Journal of Obesity, 23, 564-569.
Evans, M., Hastings, N. and Peacock, B. (2000) Statistical Distributions, Third Edition, Wiley, New York.
Fahrmeir, L. and Kaufmann, H. (1985) Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. Annals of Statistics, 13, 342-368.
Feigl, P. and Zelen, M. (1965) Estimation of exponential probabilities with concomitant information. Biometrics, 21, 826-838.
Finney, D. J. (1973) Statistical Methods in Bioassay, Second Edition, Hafner, New York.
Fleming, T. R. and Harrington, D. P. (1991) Counting Processes and Survival Analysis, Wiley, New York.
Fuller, W. A. (1987) Measurement Error Models, Wiley, New York.
Gehan, E. A. (1965) A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika, 52, 203-223.
Glantz, S. A. and Slinker, B. K. (1990) Primer of Applied Regression and Analysis of Variance, McGraw Hill, New York.
Goldstein, H. (1995) Multilevel Statistical Modelling, Second Edition, Arnold, London.
Graybill, F. A. (1976) Theory and Application of the Linear Model, Duxbury, North Scituate, Massachusetts.
Hand, D. and Crowder, M. (1996) Practical Longitudinal Data Analysis, Chapman and Hall, London.
Hastie, T. J. and Tibshirani, R. J. (1990) Generalized Additive Models, Chapman and Hall, London.
Healy, M. J. R. (1988) GLIM: An Introduction, Clarendon Press, Oxford.
Hogg, R. V. and Craig, A. T. (1995) Introduction to Mathematical Statistics, Fifth Edition, Prentice Hall, New Jersey.
Holtbrugger, W. and Schumacher, M. (1991) A comparison of regression models for the analysis of ordered categorical data. Applied Statistics, 40, 249-259.
Hosmer, D. W. and Lemeshow, S. (1980) Goodness of fit tests for the multiple logistic model. Communications in Statistics – Theory and Methods, A9, 1043-1069.
Hosmer, D. W. and Lemeshow, S. (2000) Applied Logistic Regression, Second Edition, Wiley, New York.
Jones, R. H. (1987) Serial correlation in unbalanced mixed models. Bulletin of the International Statistical Institute, 52, Book 4, 105-122.
Kalbfleisch, J. G. (1985) Probability and Statistical Inference, Volume 2: Statistical Inference, Second Edition, Springer-Verlag, New York.
Kalbfleisch, J. D. and Prentice, R. L. (1980) The Statistical Analysis of Failure Time Data, Wiley, New York.

Kleinbaum, D. G., Kupper, L. L., Muller, K. E. and Nizam, A. (1998) Applied Regression Analysis and Multivariable Methods, Third Edition, Duxbury, Pacific Grove, California.
Krzanowski, W. J. (1998) An Introduction to Statistical Modelling, Arnold, London.
Lee, E. T. (1992) Statistical Methods for Survival Data Analysis, Wiley, New York.
Lee, Y. and Nelder, J. A. (1996) Hierarchical generalized linear models (with discussion). Journal of the Royal Statistical Society, Series B, 58, 619-678.
Lewis, T. (1987) Uneven sex ratios in the light brown apple moth: a problem in outlier allocation, in The Statistical Consultant in Action, edited by D. J. Hand and B. S. Everitt, Cambridge University Press, Cambridge.
Liang, K.-Y. and Zeger, S. L. (1986) Longitudinal data analysis using generalized linear models. Biometrika, 73, 13-22.
Lipsitz, S. R., Laird, N. M. and Harrington, D. P. (1991) Generalized estimating equations for correlated binary data: using the odds ratio as a measure of association. Biometrika, 78, 153-160.
Littell, R. C., Pendergast, J. and Natarajan, R. (2000) Tutorial in biostatistics – modelling covariance structure in the analysis of repeated measures data. Statistics in Medicine, 19, 1793-1819.
Little, R. J. A. and Rubin, D. B. (1987) Statistical Analysis with Missing Data, Wiley, New York.
Longford, N. (1993) Random Coefficient Models, Oxford University Press, Oxford.
Madsen, M. (1971) Statistical analysis of multiple contingency tables. Two examples. Scandinavian Journal of Statistics, 3, 97-106.
McCullagh, P. (1980) Regression models for ordinal data (with discussion). Journal of the Royal Statistical Society, Series B, 42, 109-142.
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, Second Edition, Chapman and Hall, London.
McFadden, M., Powers, J., Brown, W. and Walker, M. (2000) Vehicle and driver attributes affecting distance from the steering wheel in motor vehicles. Human Factors, 42, 676-682.
McKinlay, S. M. (1978) The effect of nonzero second-order interaction on combined estimators of the odds ratio. Biometrika, 65, 191-202.
Montgomery, D. C. and Peck, E. A. (1992) Introduction to Linear Regression Analysis, Second Edition, Wiley, New York.
National Centre for HIV Epidemiology and Clinical Research (1994) Australian HIV Surveillance Report, 10.
Nelder, J. A. and Wedderburn, R. W. M. (1972) Generalized linear models. Journal of the Royal Statistical Society, Series A, 135, 370-384.
Neter, J., Kutner, M. H., Nachtsheim, C. J. and Wasserman, W. (1996) Applied Linear Statistical Models, Fourth Edition, Irwin, Chicago.
Otake, M. (1979) Comparison of Time Risks Based on a Multinomial Logistic Response Model in Longitudinal Studies, Technical Report No. 5, RERF, Hiroshima, Japan.
Pierce, D. A. and Schafer, D. W. (1986) Residuals in generalized linear models. Journal of the American Statistical Association, 81, 977-986.
Pregibon, D. (1981) Logistic regression diagnostics. Annals of Statistics, 9, 705-724.
Rasbash, J., Woodhouse, G., Goldstein, H., Yang, M. and Plewis, I. (1998) MLwiN Command Reference, Institute of Education, London.
Rao, C. R. (1973) Linear Statistical Inference and Its Applications, Second Edition, Wiley, New York.

Roberts, G., Martyn, A. L., Dobson, A. J. and McCarthy, W. H. (1981) Tumour thickness and histological type in malignant melanoma in New South Wales, Australia, 1970-76. Pathology, 13, 763-770.
Rosner, B. (1989) Multivariate methods for clustered binary data with more than one level of nesting. Journal of the American Statistical Association, 84, 373-380.
Sangwan-Norrell, B. S. (1977) Androgenic stimulating factor in the anther and isolated pollen grain culture of Datura innoxia Mill. Journal of Experimental Biology, 28, 843-852.
Senn, S., Stevens, L. and Chaturvedi, N. (2000) Tutorial in biostatistics – repeated measures in clinical trials: simple strategies for analysis using summary measures. Statistics in Medicine, 19, 861-877.
Sinclair, D. F. and Probert, M. E. (1986) A fertilizer response model for a mixed pasture system, in Pacific Statistical Congress (Eds I. S. Francis et al.), Elsevier, Amsterdam, 470-474.
Troxel, A. B., Harrington, D. P. and Lipsitz, S. R. (1998) Analysis of longitudinal data with non-ignorable non-monotone missing values. Applied Statistics, 47, 425-438.
Verbeke, G. and Molenberghs, G. (1997) Linear Mixed Models in Practice, Springer Verlag, New York.
Wei, L. J. (1992) The accelerated failure time model: a useful alternative to the Cox regression model in survival analysis. Statistics in Medicine, 11, 1871-1879.
Winer, B. J. (1971) Statistical Principles in Experimental Design, Second Edition, McGraw-Hill, New York.
Wood, C. L. (1978) Comparison of linear trends in binomial proportions. Biometrics, 34, 496-504.
Zeger, S. L., Liang, K.-Y. and Albert, P. (1988) Models for longitudinal data: a generalized estimating equation approach. Biometrics, 44, 1049-1060.
