P.J. Green 1994.Pdf
Total Page:16
File Type:pdf, Size:1020Kb
MONOGRAPHS ON STATISTICS AND APPLIED PROBABILITY General Editors D.R. Cox, D.V. Hinkley, N. Reid, D.B. Rubin and B.W. Silverman Stochastic Population Models in Ecology and Epidemology MS. Bartlett (1960) 2 Queues D.R. Cox and W.L. Smith (1961) 3 Monte Carlo Methods J.M Hammersley and D.C. Handscomb (1964) 4 The Statistical Analysis of Series of Events D.R. Cox and P.A.W. Lewis (1966) 5 Population Genetics W.J. Ewens (1969) 6 Probability, Statistics and Time MS. Barlett (1975) 7 Statistical Inference S.D. S/ivey (1975) 8 The Analysis .of Contingency Tables B.S. Everitt (1977) 9 Multivariate Analysis in Behavioural Research A.E. Maxwell (1977) 10 Stochastic Abundance Models S. Engen (1978) 11 Some Basic Theory for Statistical Inference E.J.G. Pitman (1978) 12 Point Processes D.R. Cox and V. Isham (1980) 13 Identification of Outliers D.M Hawkins (1980) 14 Optimal Design S.D. Silvey (1980) 15 Finite Mixture Distributions B.S. Everitt and D.J. Hand (1981) 16 Classification A.D. Gordon (1981) 17 Distribution-free Statistical Methods J.S. Maritz (1981) 18 Residuals and Influence in Regression R.D. Cook and S. Weisberg(1982) 19 Applications of Queueing Theory G.F. Newell (1982) 20 Risk Theory, 3rd edition R.E. Beard, T Pentikainen and E. Pesonen (1984) 21 Analysis of Survival Data D.R. Cox and D. Oakes (1984) 22 An Introduction to Latent Variable Models B.S. Everitt (1984) 23 Bandit Problems D.A. Berry and B. Fristedt (1985) 24 Stochastic Modelling and Control MHA. Davis and R. Vinter (1985) 25 The Statistical Analysis of Compositional Data J. Aitchison (1986) 26 Density Estimation for Statistical and Data Analysis B. W. Silverman 27 Regression Analysis with Application G.B. Wetherill (1986) 28 Sequential Methods in Statistics, 3rd edition G.B. Wetherill (1986) 29 Tensor Methods in Statistics P. McCullagh (1987) 30 Transformation and Weighting in Regression R.J. Carroll and D. Ruppert (1988) 31 Asymptotic Techniques foe Use in Statistics O.E. Barndor.ff-Nielson and D.R. Cox (1989) 32 Analysis of Binary Data, 2nd edition D.R Cox and E.J. Snell (1989) 33 Analysis of Infectious Disease Data N.G. Becker (1989) 34 Design and Analysis of Cross-Over Trials B. Jones and MG. Kenward (1989) 35 Empirical Bayes Method, 2nd edition J.S. Maritz and T. Lwin (1989) 36 Symmetric Multivariate and Related Distributions K.-T. Fang S. Kotz and K. Ng (1989) 37 Generalized Linear Models, 2nd edition P. McCullagh and J.A. Neider (1989) 38 Cyclic Designs J.A. John (1987) 39 Analog Estimation Mehtods in Econometrics C.F. Manski (1988) 40 Subset Selection in Regression A.J. Miller (1990) 41 Analysis of Repeated Measures M Crowder and D.J. Hand (1990) 42 Statistical Reasoning with Imprecise Probabilities P. Walley (1990) 43 Generalized Additive Models T.J. Hastie and R.J. Tibshirani (1990) 44 Inspection Errors for Attributes in Quality Control N.L. Johnson, S. Kotz and X Wu (1991) 45 The Analysis of Contingency Tables, 2nd edition B.S. Everitt (1992) 4 7 Longitudinal Data with Serial Correlation: A State-space Approach R.H Jones (1993) 48 Differential Geometry and Statistics MK. Murray and J.W. Rice (1993) 49 Markov Models and Optimization MHA. Davis (1993) 50 Networks and Chaos - Statistical and Probalistic Aspects O.E. Barndorff-Nielson, J.H Jensen and W.S. Kendall (1993) 51 Number Theoretic Methods in Statistics K.-T. Fang andY. Wang (1993) 52 Inference and Asymptotics D.R Cox and O.E. Barndorff-Nielson (1994) 53 Practical Risk Theory for Actuaries C.D. Daykin, T. Pentikiiinen and M Pesonen (1993) 54 Statistical Concepts and Applications in Medicine J. Aitchison and LJ. Lauder (1994) 55 Predictive Inference S. Geisser (1993) 56 Model Free Curve Estimation M Tartar and M Lock (1994) 57 An Introduction to the Bootstrap B. Efron and R. Tibshirani (1993) 58 Nonparametric Regression and Generalized Linear Models P.J. Green and B. W. Silverman (1994) (Full details concerning this series are available from the publisher) Nonparametric Regression and Generalized Linear Models A ROUGHNESS PENALTY APPROACH P.J.GREEN and B.W. SILVERMAN School of Mathematics University of Bristol UK SPRINGER-SCIENCE+BUSINESS MEDIA, B.V. First edition 1994 © P.J. Green and B.B. Silverman 1994 Originally published by Chapman & Hall in 1994 So ftc over reprint of the hardcover 1st edition 1994 ISBN 978-0-412-30040-0 ISBN 978-1-4899-4473-3 (eBook) DOI 10.1007/978-1-4899-4473-3 Apart from any fair dealing for the purposes of research or private study, or criticism or review, as pennitted under the UK Copyright Designs and Patents Act, 1988, this publication may not be reproduced, stored, or transmitted, in any form or by any means, without the prior permission in writing of the publishers, or in the case of reprographic reproduction only in accordance with the terms of the licences issued by the Copyright Licensing Agency in the UK, or in accordance with the terms of licences issued by the appropriate Reproduction Rights Organization outside the UK. Enquiries concerning reproduction outside the terms stated here should be sent to the publishers at the London address printed on this page. The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made. A catalogue record for this book is available from the British Library Library of Congress Cataloging-in-Publication data available Contents Preface xi 1 Introduction 1 1.1 Approaches to regression 1 1.1.1 Linear regression 1 1.1.2 Polynomial regression 2 1.2 Roughness penalties 2 1.2.1 The aims of curve fitting 2 1.2.2 Quantifying the roughness of a curve 4 1.2.3 Penalized least squares regression 5 1.3 Extensions of the roughness penalty approach 7 1.4 Computing the estimates 9 1.5 Further reading 9 2 Interpolating and smoothing splines 11 2.1 Cubic splines 11 2.1.1 What is a cubic spline? 11 2.1.2 The value-second derivative representation 12 2.2 Interpolating splines 13 2.2.1 Constructing the interpolating natural cubic spline 15 2.2.2 Optimality properties of the natural cubic spline interpolant 16 2.3 Smoothing splines 17 2.3.1 Restricting the class of functions to be considered 18 2.3.2 Existence and uniqueness of the minimizing spline curve 18 2.3.3 The Reinsch algorithm 19 2.3.4 Some concluding remarks 21 2.4 Plotting a natural cubic spline 22 vi CONTENTS 2.4.1 Constructing a cubic given values and second derivatives at the ends of an interval 22 2.4.2 Plotting the entire cubic spline 23 2.5 Some background technical properties 24 2.5.1 The key property for g to be a natural cubic spline 24 2.5.2 Expressions for the roughness penalty 24 2.6 Band matrix manipulations 25 2.6.1 The Cholesky decomposition 26 2.6.2 Solving linear equations 26 3 One-dimensional case: further topics 29 3.1 Choosing the smoothing parameter 29 3.2 Cross-validation 30 3.2.1 Efficient calculation of the cross-validation score 31 3.2.2 Finding the diagonal elements of the hat matrix 33 3.3 Generalized cross-validation 35 3.3.1 The basic idea 35 3.3.2 Computational aspects 35 3.3.3 Leverage values 36 3.3.4 Degrees of freedom 37 3.4 Estimating the residual variance 38 3.4.1 Local differencing 38 3.4.2 Residual sum of squares about a fitted curve 39 3.4.3 Some comparisons 40 3.5 Weighted smoothing 40 3.5.1 Basic properties of the weighted formulation 41 3.5.2 The Reinsch algorithm for weighted smoothing 41 3.5.3 Cross-validation for weighted smoothing 42 3.5.4 Tied design points 43 3.6 The basis functions approach 44 3.6.1 Details of the calculations 46 3.7 The equivalent kernel 47 3.7.1 Roughness penalty and kernel methods 47 3.7.2 Approximating the weight function 47 3.8 The philosophical basis of roughness penalties 49 3.8.1 Penalized likelihood 50 3.8.2 The bounded roughness approach 51 3.8.3 The Bayesian approach 51 3.8.4 A finite-dimensional Bayesian formulation 54 3.8.5 Bayesian inference for functionals of the curve 55 3.9 Nonparametric Bayesian calibration 57 3.9.1 The monotonicity constraint 58 CONTENTS vii 3.9.2 Accounting for the error in the prediction observation 59 3.9.3 Considerations of efficiency 59 3.9.4 An application in forensic odontology 60 4 Partial splines 63 4.1 Introduction 63 4.2 The semiparametric formulation 64 4.3 Penalized least squares for semi parametric models 65 4.3.1 Incidence matrices 65 4.3.2 Characterizing the minimum 65 4.3.3 Uniqueness of the solution 66 4.3.4 Finding the solution in practice 68 4.3.5 A direct method 69 4.4 Cross-validation for partial spline models 70 4.5 A marketing example 71 4.5.1 A partial spline approach 72 4.5.2 Comparison with a parametric model 73 4.6 Application to agricultural field trials 75 4.6.1 A discrete roughness penalty approach 75 4.6.2 Two spring barley trials 77 4.7 The relation between weather and electricity sales 79 4.7.1 The observed data and the model assumed 79 4.7.2 Estimating the temperature response 81 4.8 Additive models 83 4.9 An alternative approach to partial spline fitting 85 4.9.1 Speckman's algorithm 85 4.9.2 Application: the marketing data 86 4.9.3 Comparison with the penalized least squares method 86 5 Generalized linear models 89 5.1 Introduction 89 5.1.1 Unifying regression models 89 5.1.2 Extending the model 90 5.2 Generalized linear models 91 5.2.1 Exponential families 91 5.2.2 Maximum likelihood estimation 93 5.2.3 Fisher scoring 94 5.2.4 Iteratively reweighted least squares 95 5.2.5 Inference in GLMs 95 5.3 A first look at nonparametric GLMs 98 viii CONTENTS 5.3.1 Relaxing parametric assumptions