

Introduction to Empirical Processes and Semiparametric Inference

Michael R. Kosorok

August 2006

© 2006 SPRINGER SCIENCE+BUSINESS MEDIA, INC. All rights reserved. Permission is granted to print a copy of this preliminary version for non-commercial purposes but not to distribute it in printed or electronic form. The author would appreciate notification of typos and other suggestions for improvement.

Preface

The goal of this book is to introduce statisticians, and other researchers with a background in mathematical statistics, to empirical processes and semiparametric inference. These powerful research techniques are surprisingly useful for studying large sample properties of statistical estimates from realistically complex models as well as for developing new and improved approaches to statistical inference. This book is more a textbook than a research monograph, although some new results are presented in later chapters. The level of the book is more introductory than the seminal work of van der Vaart and Wellner (1996). In fact, another purpose of this work is to help readers prepare for the mathematically advanced van der Vaart and Wellner text, as well as for the semiparametric inference work of Bickel, Klaassen, Ritov and Wellner (1997). These two books, along with Pollard (1990) and chapters 19 and 25 of van der Vaart (1998), formulate a very complete and successful elucidation of modern empirical process methods. The present book owes much by the way of inspiration, concept, and notation to these previous works. What is perhaps new is the introductory, gradual and unified way this book introduces the reader to the field.

The book consists of three parts. The first part is an overview which concisely covers the basic concepts in both empirical processes and semiparametric inference, while avoiding many technicalities. The second part is devoted to empirical processes, while the third part is devoted to semiparametric efficiency and inference. In each of the last two parts, the second chapter (after the introductory chapter) is devoted to the relevant mathematical concepts and techniques. For example, an overview of metric spaces, which are necessary to the study of weak convergence, is included in the second chapter of the second part. Thus the book is largely self-contained. In addition, a chapter devoted to case studies is included at the end of each of the three parts of the book. These case studies explore in detail practical examples which illustrate applications of theoretical concepts.

The impetus for this work came from a course the author gave in the Department of Statistics at the University of Wisconsin-Madison, during the Spring semester of 2001. Accordingly, the book is designed to be used as a text in a one or two semester sequence in empirical processes and semiparametric inference. In a one semester course, some of the material would need to be skipped. Students should have had at least half a year of graduate level probability as well as a year of graduate level statistics.

Contents

Preface

I Overview

1 Introduction

2 An Overview of Empirical Processes
  2.1 The Main Features
  2.2 Empirical Process Techniques
    2.2.1 Stochastic Convergence
    2.2.2 Entropy for Glivenko-Cantelli and Donsker Theorems
    2.2.3 Bootstrapping Empirical Processes
    2.2.4 The Functional Delta Method
    2.2.5 Z-Estimators
    2.2.6 M-Estimators
  2.3 Other Topics
  2.4 Exercises
  2.5 Notes

3 Overview of Semiparametric Inference
  3.1 Semiparametric Models and Efficiency
  3.2 Score Functions and Estimating Equations
  3.3 Maximum Likelihood Estimation

  3.4 Other Topics
  3.5 Exercises
  3.6 Notes

4 Case Studies I
  4.1 Linear Regression
    4.1.1 Mean Zero Residuals
    4.1.2 Median Zero Residuals
  4.2 Counting Process Regression
    4.2.1 The General Case
    4.2.2 The Cox Model
  4.3 The Kaplan-Meier Estimator
  4.4 Efficient Estimating Equations for Regression
    4.4.1 Simple Linear Regression
    4.4.2 A Poisson Mixture Regression Model
  4.5 Partly Linear Logistic Regression
  4.6 Exercises
  4.7 Notes

II Empirical Processes

5 Introduction to Empirical Processes

6 Preliminaries for Empirical Processes
  6.1 Metric Spaces
  6.2 Outer Expectation
  6.3 Linear Operators and Functional Differentiation
  6.4 Proofs
  6.5 Exercises
  6.6 Notes

7 Stochastic Convergence
  7.1 Stochastic Processes in Metric Spaces
  7.2 Weak Convergence
    7.2.1 General Theory
    7.2.2 Spaces of Bounded Functions
  7.3 Other Modes of Convergence
  7.4 Proofs
  7.5 Exercises
  7.6 Notes

8 Empirical Process Methods
  8.1 Maximal Inequalities
    8.1.1 Orlicz Norms and Maxima

    8.1.2 Maximal Inequalities for Processes
  8.2 The Symmetrization Inequality and Measurability
  8.3 Glivenko-Cantelli Results
  8.4 Donsker Results
  8.5 Exercises
  8.6 Notes

9 Entropy Calculations
  9.1 Uniform Entropy
    9.1.1 VC-Classes
    9.1.2 BUEI-Classes
  9.2 Bracketing Entropy
  9.3 Glivenko-Cantelli Preservation
  9.4 Donsker Preservation
  9.5 Proofs
  9.6 Exercises
  9.7 Notes

10 Bootstrapping Empirical Processes
  10.1 The Bootstrap for Donsker Classes
    10.1.1 An Unconditional Multiplier Central Limit Theorem
    10.1.2 Conditional Multiplier Central Limit Theorems
    10.1.3 Bootstrap Central Limit Theorems
    10.1.4 Continuous Mapping Results
  10.2 The Bootstrap for Glivenko-Cantelli Classes
  10.3 A Simple Z-Estimator Master Theorem
  10.4 Proofs
  10.5 Exercises
  10.6 Notes

11 Additional Empirical Process Results
  11.1 Bounding Moments and Tail Probabilities
  11.2 Sequences of Functions
  11.3 Contiguous Alternatives
  11.4 Sums of Independent but not Identically Distributed Stochastic Processes
    11.4.1 Central Limit Theorems
    11.4.2 Bootstrap Results
  11.5 Function Classes Changing with n
  11.6 Dependent Observations
  11.7 Proofs
  11.8 Exercises
  11.9 Notes

12 The Functional Delta Method

  12.1 Main Results and Proofs
  12.2 Examples
    12.2.1 Composition
    12.2.2 Integration
    12.2.3 Product Integration
    12.2.4 Inversion
    12.2.5 Other Mappings
  12.3 Exercises
  12.4 Notes

13 Z-Estimators
  13.1 Consistency
  13.2 Weak Convergence
    13.2.1 The General Setting
    13.2.2 Using Donsker Classes
    13.2.3 A Master Theorem and the Bootstrap
  13.3 Using the Delta Method
  13.4 Exercises
  13.5 Notes

14 M-Estimators
  14.1 The Argmax Theorem
  14.2 Consistency
  14.3 Rate of Convergence
  14.4 Regular Euclidean M-Estimators
  14.5 Non-Regular Examples
    14.5.1 A Change-Point Model
    14.5.2 Monotone Density Estimation
  14.6 Exercises
  14.7 Notes

15 Case Studies II

III Semiparametric Inference

16 Introduction to Semiparametric Inference

17 Preliminaries for Semiparametric Inference

18 Semiparametric Models and Efficiency

19 Score Functions and Estimating Equations

20 Maximum Likelihood Estimation and Inference

21 Semiparametric M-Estimation

22 Case Studies III

References

List of Symbols

Part I

Overview

1 Introduction

Both empirical processes and semiparametric inference techniques have become increasingly important tools for solving statistical estimation and inference problems. These tools are particularly important when the statistical model for the data at hand is semiparametric, in that it has one or more unknown components that are functions, measures, or other infinite-dimensional quantities. Semiparametric models also typically have one or more finite-dimensional Euclidean parameters of particular interest. The term nonparametric is often reserved for semiparametric models with no Euclidean parameters. Empirical process methods are powerful techniques for evaluating the large sample properties of estimators based on semiparametric models, including consistency, distributional convergence, and validity of the bootstrap. Semiparametric inference tools complement empirical process methods by evaluating whether estimators make efficient use of the data. Consider, for example, the semiparametric model

(1.1)    Y = β'Z + e,

where β, Z ∈ R^p are restricted to bounded sets, prime denotes transpose, (Y, Z) are the observed data, E[e|Z] = 0 and E[e²|Z] ≤ K < ∞ almost surely, and E[ZZ'] is positive definite. Given an independent and identically distributed (i.i.d.) sample of such data (Y_i, Z_i), i = 1...n, we are interested in estimating β without having to further specify the joint distribution of (e, Z). This is a very simple semiparametric model, and we definitely do not need empirical process methods to verify that

    β̂ = (∑_{i=1}^n Z_i Z_i')^{-1} ∑_{i=1}^n Z_i Y_i

is consistent for β and that √n(β̂ − β) is asymptotically normal with bounded variance.

A deeper question is whether β̂ achieves the lowest possible variance among all "reasonable" estimators. One important criterion for "reasonableness" is regularity, which is satisfied by β̂ and most standard √n-consistent estimators, and which we will define more precisely in chapter 3. A regular estimator is efficient if it achieves the lowest possible variance among regular estimators. Semiparametric inference tools are required to establish this kind of optimality. Unfortunately, β̂ does not have the lowest possible variance among all regular estimators, unless we are willing to make some very strong assumptions. For instance, β̂ has optimal variance if we are willing to assume, in addition to what we have already assumed, that e has a Gaussian distribution and is independent of Z. In this instance, the model is almost fully parametric (except that the distribution of Z remains unspecified). Returning to the more general model in the previous paragraph, there is a modification of β̂ that does have the lowest possible variance among regular estimators, but computation of this modified estimator requires estimation of the function z ↦ E[e²|Z = z]. We will explore this particular example in greater detail in chapters 3 and 4.

There is an interesting semiparametric model part way between the fully parametric Gaussian residual model and the more general model which only assumes E[e|Z] = 0 almost surely. This alternative model assumes that the residual e and covariate Z are independent, with E[e] = 0 and E[e²] < ∞, but no additional restrictions are placed on the residual distribution F. Unfortunately, β̂ still does not have optimal variance. However, β̂ is a very good estimator and may be good enough for most purposes. In this setting, it may be useful to estimate F to determine whether the residuals are Gaussian. One promising estimator is

(1.2)    F̂(t) = n^{-1} ∑_{i=1}^n 1{Y_i − β̂'Z_i ≤ t},

where 1{A} is the indicator of A. Empirical process methods can be used to show that F̂ is uniformly consistent for F and that √n(F̂ − F) converges "in distribution" in a uniform sense to a certain Gaussian quantity, provided the density f of F is uniformly bounded. Quotes are used here because the convergence in question involves random real functions rather than Euclidean random variables. This kind of convergence is called weak convergence and is a generalization of convergence in distribution which will be defined more precisely in chapter 2.
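To make the two estimators above concrete, here is a minimal numerical sketch, not from the text: it simulates data from model (1.1), computes β̂ by least squares, and evaluates the residual distribution estimator (1.2) on a grid. The sample size, covariate distribution, and residual law are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 2
beta_true = np.array([1.0, -0.5])

Z = rng.uniform(-1.0, 1.0, size=(n, p))      # bounded covariates
e = rng.normal(scale=0.5, size=n)            # residuals with E[e|Z] = 0
Y = Z @ beta_true + e

# beta_hat = (sum_i Z_i Z_i')^{-1} sum_i Z_i Y_i
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)

# F_hat(t) = n^{-1} sum_i 1{Y_i - beta_hat' Z_i <= t}, evaluated on a grid
resid = Y - Z @ beta_hat
t_grid = np.linspace(-2.0, 2.0, 81)
F_hat = (resid[:, None] <= t_grid[None, :]).mean(axis=0)

print(beta_hat)       # close to beta_true for moderate n
print(F_hat[::20])    # residual distribution estimate at a few grid points
```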

Now we will consider a more complex example. Let N(t) be a counting process over the finite interval [0, τ] which is free to jump as long as t ∈ [0, V], where V ∈ (0, τ] is a random time. A counting process, by definition, is nonnegative and piecewise constant with positive jumps of size 1. Typically, the process counts a certain kind of event, such as hospitalizations, for an individual (see chapter 1 of Fleming and Harrington, 1991, or chapter II of Andersen, Borgan, Gill and Keiding, 1993). Define also the "at-risk" process Y(t) = 1{V ≥ t}. This process indicates whether an individual is at risk at time t− (just to the left of t) for a jump in N at time t. Suppose we also have baseline covariates Z ∈ R^p, and, for all t ∈ [0, τ], we assume

(1.3)    E{N(t) | Z} = ∫_0^t E{Y(s) | Z} e^{β'Z} dΛ(s),

for some β ∈ R^p and continuous nondecreasing function Λ(t) with Λ(0) = 0 and 0 < Λ(τ) < ∞. The model (1.3) is a variant of the "multiplicative intensity model" (see definition 4.2.1 of Fleming and Harrington, 1991). Basically, we are assuming that the mean of the counting process is proportional to e^{β'Z}. We also need to assume that E{Y(τ)} > 0, E{N²(τ)} < ∞, and that Z is restricted to a bounded set, but we do not otherwise restrict the distribution of N. Given an i.i.d. sample (N_i, Y_i, Z_i), i = 1...n, we are interested in estimating β and Λ. Under mild regularity conditions, the estimating equation

(1.4)    U_n(t, β) = n^{-1} ∑_{i=1}^n ∫_0^t [Z_i − E_n(s, β)] dN_i(s),

where

    E_n(s, β) = [n^{-1} ∑_{i=1}^n Z_i Y_i(s) e^{β'Z_i}] / [n^{-1} ∑_{i=1}^n Y_i(s) e^{β'Z_i}],

can be used for estimating β. The motivation for this estimating equation is that it arises as the score equation from the celebrated Cox partial likelihood (Cox, 1975) for either failure time data (where the counting process N(t) simply indicates whether the failure time has occurred by time t) or the multiplicative intensity model under an independent increment assumption on N (see chapter 4 of Fleming and Harrington, 1991). Interestingly, this estimating equation can be shown to work under the more general model (1.3). Specifically, we can establish that (1.4) has an asymptotically unique zero β̂ at t = τ, that β̂ is consistent for β, and that √n(β̂ − β) is asymptotically mean zero Gaussian. Empirical process tools are needed to accomplish this. These same techniques can also establish that

    Λ̂(t) = ∫_0^t [n^{-1} ∑_{i=1}^n dN_i(s)] / [n^{-1} ∑_{i=1}^n Y_i(s) e^{β̂'Z_i}]

is uniformly consistent for Λ(t) over t ∈ [0, τ] and that √n(Λ̂ − Λ) converges weakly to a mean zero Gaussian quantity. These methods can also be used to construct valid confidence intervals and confidence bands for β and Λ.

At this point, efficiency of these estimators is difficult to determine because so little has been specified about the distribution of N. Consider, however, the special case of right-censored failure time data. In this setting, the observed data are (V_i, d_i, Z_i), i = 1...n, where V_i = T_i ∧ C_i, d_i = 1{T_i ≤ C_i}, T_i is a failure time of interest with integrated hazard function e^{β'Z_i} Λ(t) given Z_i, and C_i is a right censoring time independent of T_i given Z_i with distribution not depending on β or Λ. Here, N_i(t) = d_i 1{V_i ≤ t} and a ∧ b denotes the minimum of a and b. Semiparametric inference techniques can now establish that both β̂ and Λ̂ are efficient. We will revisit this example in greater detail in chapters 3 and 4.
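As a concrete illustration of the estimating equation (1.4) and the estimator Λ̂ above, the following rough numerical sketch, not from the text, solves U_n(τ, β) = 0 by Newton-Raphson for simulated right-censored data with a one-dimensional covariate. The data-generating mechanism and all tuning choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
Z = rng.uniform(-1.0, 1.0, size=n)                  # one-dimensional covariate
beta_true = 0.7
T = rng.exponential(scale=np.exp(-beta_true * Z))   # hazard e^{beta Z}, Lambda(t) = t
C = rng.exponential(scale=2.0, size=n)              # independent right censoring
V = np.minimum(T, C)
d = (T <= C).astype(float)

def score_info(beta):
    # U_n(tau, beta) and its negative derivative, built from E_n(s, beta)
    w = np.exp(beta * Z)
    U, I = 0.0, 0.0
    for i in np.flatnonzero(d == 1.0):
        risk = V >= V[i]                            # at-risk indicators Y_j(V_i)
        s0 = np.sum(w[risk])
        s1 = np.sum(w[risk] * Z[risk])
        s2 = np.sum(w[risk] * Z[risk] ** 2)
        U += Z[i] - s1 / s0
        I += s2 / s0 - (s1 / s0) ** 2
    return U / n, I / n

beta_hat = 0.0
for _ in range(20):                                 # Newton-Raphson on U_n(tau, .) = 0
    U, I = score_info(beta_hat)
    beta_hat += U / I

# Lambda_hat(t) from the display above, evaluated at the observed failure times
event_times = np.sort(V[d == 1.0])
w = np.exp(beta_hat * Z)
Lambda_hat = np.cumsum([1.0 / np.sum(w[V >= s]) for s in event_times])
print(beta_hat)          # close to beta_true
print(Lambda_hat[-1])    # Lambda_hat at the last observed failure time
```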
In both of the previous examples, the estimator of the infinite-dimensional parameter (F in the first example and Λ in the second) is √n-consistent, but slower rates of convergence for the infinite-dimensional part are also possible. As a third example, consider the partly linear logistic regression model described in Mammen and van de Geer (1997) and van der Vaart (1998, page 405). The observed data are n independent realizations of the random triplet (Y, Z, U), where Z ∈ R^p and U ∈ R are covariates which are not linearly dependent, and Y is a dichotomous outcome with

(1.5)    E{Y | Z, U} = ν[β'Z + η(U)],

where β ∈ R^p, Z is restricted to a bounded set, U ∈ [0, 1], ν(t) = 1/(1 + e^{−t}), and η is an unknown smooth function. We assume, for some integer k ≥ 1, that the first k − 1 derivatives of η exist and are absolutely continuous with J²(η) ≡ ∫_0^1 [η^{(k)}(t)]² dt < ∞, where the superscript (k) denotes the k-th derivative. Given an i.i.d. sample X_i = (Y_i, Z_i, U_i), i = 1...n, we are interested in estimating β and η. The conditional density at Y = y given the covariates (Z, U) = (z, u) has the form

    p_{β,η}(x) = {ν[β'z + η(u)]}^y {1 − ν[β'z + η(u)]}^{1−y}.

This cannot be used directly for defining a likelihood since, for any 1 ≤ n < ∞ and fixed sample x_1, ..., x_n, there exists a sequence of smooth functions {η̂_m} satisfying our criteria which converges to η̂, where η̂(u_i) = ∞ when y_i = 1 and η̂(u_i) = −∞ when y_i = 0. The issue is that requiring J(η̂) < ∞ does not restrict η̂ on any finite collection of points.

There are a number of methods for addressing this problem, including requiring J(η̂) ≤ M_n for each n, where M_n ↑ ∞ at an appropriately slow rate, or using a series of increasingly complex spline approximations. An important alternative is to use the penalized log-likelihood

(1.6)    L̃_n(β, η) = n^{-1} ∑_{i=1}^n log p_{β,η}(X_i) − λ̂_n² J²(η),

where λ̂_n is a possibly data-dependent smoothing parameter. L̃_n is maximized over β and η to obtain the estimators β̂ and η̂. Large values of λ̂_n lead to very smooth but somewhat biased η̂, while small values of λ̂_n lead to less smooth but less biased η̂. The proper trade-off between smoothness and bias is usually best achieved by data-dependent schemes such as cross-validation. If λ̂_n is chosen to satisfy λ̂_n = o_P(n^{−1/4}) and λ̂_n^{−1} = O_P(n^{k/(2k+1)}), then both β̂ and η̂ are uniformly consistent and √n(β̂ − β) converges to a mean zero Gaussian vector. Furthermore, β̂ can be shown to be efficient even though η̂ is not √n-consistent. More about this example will be discussed in chapter 4; a rough numerical sketch is given below.
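The following sketch, not from the text, maximizes a penalized log-likelihood of the form (1.6) numerically, assuming k = 2, representing η by its values on a grid over [0, 1], and approximating J²(η) by scaled squared second differences. The grid size, the fixed smoothing parameter lam (standing in for a data-dependent λ̂_n), and the data-generating choices are all illustrative assumptions, not the author's method.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n, m = 400, 25
Z = rng.uniform(-1.0, 1.0, size=n)
U = rng.uniform(0.0, 1.0, size=n)
eta_true = lambda u: np.sin(2 * np.pi * u)       # illustrative smooth eta
beta_true = 1.0
prob = 1.0 / (1.0 + np.exp(-(beta_true * Z + eta_true(U))))
Y = rng.binomial(1, prob)

grid = np.linspace(0.0, 1.0, m)
idx = np.argmin(np.abs(U[:, None] - grid[None, :]), axis=1)  # nearest knot
h = grid[1] - grid[0]
lam = 1e-3                                       # fixed smoothing parameter

def neg_pen_loglik(par):
    beta, eta = par[0], par[1:]
    lin = beta * Z + eta[idx]
    # logistic log-likelihood: y * lin - log(1 + e^lin), computed stably
    loglik = np.mean(Y * lin - np.logaddexp(0.0, lin))
    d2 = np.diff(eta, n=2) / h**2                # second differences ~ eta''
    pen = lam * np.sum(d2**2) * h                # ~ lam * J^2(eta) for k = 2
    return -(loglik - pen)

fit = minimize(neg_pen_loglik, x0=np.zeros(1 + m), method="L-BFGS-B")
beta_hat, eta_hat = fit.x[0], fit.x[1:]
print(beta_hat)                                  # roughly beta_true
```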

These three examples illustrate the goals of empirical process and semiparametric inference research as well as hint at the power of these methods for solving statistical inference problems involving infinite-dimensional parameters. The goal of the first part of this book is to present the key ideas of empirical processes and semiparametric inference in a concise and heuristic way, without being distracted by technical issues, and to provide motivation to pursue the subject in greater depth as given in the remaining parts of the book. Even for those anxious to pursue the subject in depth, the broad view contained in this first part provides a valuable context for learning the details.

Chapter 2 presents an overview of empirical process methods and results, while chapter 3 presents an overview of semiparametric inference techniques. Several case studies illustrating these methods, including further details on the examples given above, are presented in chapter 4, which concludes the overview part. The empirical process part of the book (part II) will be introduced in chapter 5, while the semiparametric inference part (part III) will be introduced in chapter 16.

2 An Overview of Empirical Processes

This chapter presents an overview of the main ideas and techniques of empirical process research. The emphasis is on those concepts which directly impact statistical estimation and inference. The major distinction between empirical process theory and more standard asymptotics is that the random quantities studied have realizations as functions rather than as real numbers or vectors. Proofs of results and certain details in definitions are postponed until part II of the book.

We begin by defining and sketching the main features and asymptotic concepts of empirical processes with a view towards statistical issues. An outline of the main empirical process techniques covered in this book is presented next. This chapter concludes with a discussion of several additional related topics which will not be pursued in later chapters.

2.1 The Main Features

A stochastic process is a collection of random variables {X_t, t ∈ T} on the same probability space, indexed by an arbitrary index set T. An empirical process is a stochastic process based on a random sample. For example, consider a random sample X_1, ..., X_n of i.i.d. real random variables with distribution F. The empirical distribution function is

(2.1)    F_n(t) = n^{-1} ∑_{i=1}^n 1{X_i ≤ t},

where the index t is allowed to vary over T = R, the real line.

More generally, we can consider a random sample X_1, ..., X_n of independent draws from a probability measure P on an arbitrary sample space X. We define the empirical measure to be P_n = n^{-1} ∑_{i=1}^n δ_{X_i}, where δ_x is the measure which assigns mass 1 at x and zero elsewhere. For a measurable function f: X → R, we denote P_n f = n^{-1} ∑_{i=1}^n f(X_i). For any class F of measurable functions f: X → R, an empirical process {P_n f, f ∈ F} can be defined. This simple approach can generate a surprising variety of empirical processes, many of which we will consider in later sections in this chapter as well as in part II.

Setting X = R, we can now re-express F_n as the empirical process {P_n f, f ∈ F}, where F = {1{x ≤ t}, t ∈ R}. Thus one can view the stochastic process F_n as indexed by either t ∈ R or f ∈ F. We will use either indexing approach, depending on which is most convenient for the task at hand. However, because of its generality, indexing empirical processes by classes of functions will be the primary approach taken throughout this book. By the law of large numbers, we know that

(2.2)    F_n(t) →^{as} F(t)

for each t ∈ R, where →^{as} denotes almost sure convergence. A primary goal of empirical process research is to study empirical processes as random functions over the associated index set. Each realization of one of these random functions is a sample path. To this end, Glivenko (1933) and Cantelli (1933) demonstrated that (2.2) could be strengthened to

(2.3)    sup_{t∈R} |F_n(t) − F(t)| →^{as} 0.

Another way of saying this is that the sample paths of F_n get uniformly closer to F as n → ∞. Returning to general empirical processes, a class F of measurable functions f: X → R is said to be a P-Glivenko-Cantelli class if

(2.4)    sup_{f∈F} |P_n f − Pf| →^{as*} 0,

where Pf = ∫_X f(x) dP(x) and →^{as*} is a mode of convergence slightly stronger than →^{as} but which will not be precisely defined until later in this chapter (both modes of convergence are equivalent in the setting of (2.3)). Sometimes the P in P-Glivenko-Cantelli can be dropped if the context is clear.

Returning to F_n, we know by the central limit theorem that, for each t ∈ R,

    G_n(t) ≡ √n[F_n(t) − F(t)] ⇝ G(t),

where ⇝ denotes convergence in distribution and G(t) is a mean zero normal random variable with variance F(t)[1 − F(t)]. In fact, we know that G_n, simultaneously for all t in a finite set T_k = {t_1, ..., t_k} ⊂ R, will converge in distribution to a mean zero multivariate normal vector G = {G(t_1), ..., G(t_k)}', where

(2.5)    cov[G(s), G(t)] = E[G(s)G(t)] = F(s ∧ t) − F(s)F(t)

for all s, t ∈ T_k, and where a ∧ b is the minimum of a and b.

Much more can be said. Donsker (1952) showed that the sample paths of G_n, as functions on R, converge in distribution to a certain stochastic process G. Weak convergence is the generalization of convergence in distribution from vectors of random variables to sample paths of stochastic processes. Donsker's result can be stated succinctly as G_n ⇝ G in ℓ^∞(R), where, for any index set T, ℓ^∞(T) is the collection of all bounded functions f: T → R. ℓ^∞(T) is used in settings like this to remind us that we are thinking of distributional convergence in terms of the sample paths. The limiting process G is a mean zero Gaussian process with E[G(s)G(t)] given by (2.5) for every s, t ∈ R. A Gaussian process is a stochastic process {Z_t, t ∈ T}, where for every finite T_k ⊂ T, {Z_t, t ∈ T_k} is multivariate normal, and where all sample paths are continuous in a certain sense which will be made more explicit later in this chapter.

The process G can be written G(t) = B(F(t)), where B is a standard Brownian bridge on the unit interval. The process B has covariance s ∧ t − st and is equivalent to the process W(t) − tW(1), for t ∈ [0, 1], where W is a standard Brownian motion process. The standard Brownian motion is a Gaussian process on [0, ∞) with continuous sample paths, with W(0) = 0, and with covariance s ∧ t. Both B and W are important examples of Gaussian processes.

Returning again to general empirical processes, define the random measure G_n = √n(P_n − P), and, for any class F of measurable functions f: X → R, let G be a mean zero Gaussian process indexed by F, with covariance E[f(X)g(X)] − Ef(X)Eg(X) for all f, g ∈ F, and having appropriately continuous sample paths. Both G_n and G can be thought of as being indexed by F. We say that F is P-Donsker if G_n ⇝ G in ℓ^∞(F). The P and/or the ℓ^∞(F) may be dropped if the context is clear. Donsker's (1952) theorem tells us that F = {1{x ≤ t}, t ∈ R} is Donsker for all probability measures which are based on some real distribution function F. With f(x) = 1{x ≤ t} and g(x) = 1{x ≤ s},

    E[f(X)g(X)] − Ef(X)Eg(X) = F(s ∧ t) − F(s)F(t).

For this reason, G is also referred to as a Brownian bridge.

Suppose we are interested in forming confidence bands for F over some subset T ⊂ R. Because F = {1{x ≤ t}, t ∈ R} is Glivenko-Cantelli, we can uniformly consistently estimate the covariance σ(s, t) = F(s ∧ t) − F(s)F(t) of G with σ̂(s, t) = F_n(s ∧ t) − F_n(s)F_n(t). While such a covariance could be used to form confidence bands when T is finite, it is of little use when T is infinite, such as when T is a subinterval of R. In this case, it is preferable to make use of the Donsker result for G_n. Let U_n = sup_{t∈T} |G_n(t)|. The continuous mapping theorem tells us that whenever a process {Z_n(t), t ∈ T} converges weakly to a tight limiting process {Z(t), t ∈ T} in ℓ^∞(T), then h(Z_n) ⇝ h(Z) in h(ℓ^∞(T)) for any continuous map h. In our setting, U_n = h(G_n), where h(g) = sup_{t∈T} |g(t)|, for any g ∈ ℓ^∞(R), is a continuous real function. Thus the continuous mapping theorem tells us that U_n ⇝ U = sup_{t∈T} |G(t)|. When F is continuous and T = R, U = sup_{t∈[0,1]} |B(t)| has a known distribution from which it is easy to compute quantiles. If we let u_p be the p-th quantile of U, then an asymptotically valid symmetric 1 − α level confidence band for F is F_n ± u_{1−α}/√n.

An alternative is to construct confidence bands based on a large number of bootstraps of F_n. The bootstrap for F_n can be written as F̂_n(t) = n^{-1} ∑_{i=1}^n W_{ni} 1{X_i ≤ t}, where (W_{n1}, ..., W_{nn}) is a multinomial random n-vector, with probabilities 1/n, ..., 1/n and number of trials n, and which is independent of the data X_1, ..., X_n. The conditional distribution of Ĝ_n = √n(F̂_n − F_n) given X_1, ..., X_n can be shown to converge weakly to the distribution of G in ℓ^∞(R). Thus the bootstrap is an asymptotically valid way to obtain confidence bands for F.

Returning to the general empirical process set-up, let F be a Donsker class and suppose we wish to construct confidence bands for Ef(X) which are simultaneously valid for all f ∈ H ⊂ F. Provided certain second moment conditions hold on F, the estimator σ̂(f, g) = P_n[f(X)g(X)] − P_n f(X) P_n g(X) is consistent for σ(f, g) = E[f(X)g(X)] − Ef(X)Eg(X) uniformly over all f, g ∈ F. As with the empirical distribution function estimator, this covariance is enough to form confidence bands provided H is finite. Fortunately, the bootstrap is always asymptotically valid when F is Donsker and can therefore be used for infinite H. More precisely, if Ĝ_n = √n(P̂_n − P_n), where P̂_n f = n^{-1} ∑_{i=1}^n W_{ni} f(X_i) and (W_{n1}, ..., W_{nn}) is defined as before, then the conditional distribution of Ĝ_n given the data converges weakly to G in ℓ^∞(F). Since this is true for all of F, it is certainly true for any H ⊂ F. The bootstrap result for F_n is clearly a special case of this more general result.

Many important statistics based on i.i.d. data cannot be written as empirical processes, but they can frequently be written in the form φ(P_n), where P_n is indexed by some F and φ is a smooth map from ℓ^∞(F) to some set B (possibly infinite-dimensional). Consider, for example, the quantile process ξ_n(p) = F_n^{-1}(p) for p ∈ [a, b], where H^{-1}(p) = inf{t: H(t) ≥ p} for a distribution function H and 0 < a ≤ b < 1. If F is continuously differentiable over N = [F^{-1}(a) − ε, F^{-1}(b) + ε], for some ε > 0, with continuous density f such that 0 < inf_{t∈N} f(t) ≤ sup_{t∈N} f(t) < ∞, then √n(ξ_n(p) − ξ_p), where ξ_p = F^{-1}(p), is uniformly

asymptotically equivalent to −G_n(F^{-1}(p))/f(F^{-1}(p)) and hence converges weakly to G(F^{-1}(p))/f(F^{-1}(p)) in ℓ^∞([a, b]). (Because the process G is symmetric around zero, both −G and G have the same distribution.) The above weak convergence result is a special case of the functional delta method principle, which states that √n[φ(P_n) − φ(P)] converges weakly in B to φ'_P(G) whenever F is Donsker and φ has a "Hadamard derivative" φ'_P, which will be defined more precisely later in this chapter.

Many additional statistics can be written as zeros or maximizers of certain data-dependent processes. The former are known as Z-estimators and the latter as M-estimators. Consider the linear regression example given in chapter 1. Since β̂ is the zero of U_n(β) = P_n[Z(Y − β'Z)], β̂ is a Z-estimator. In contrast, the penalized likelihood estimators (β̂, η̂) in the partly linear logistic regression example of the same chapter are M-estimators, since they are maximizers of L̃_n(β, η) given in (1.6). As is the case with U_n and L̃_n, the data-dependent objective functions used in Z- and M-estimation are often empirical processes, and thus empirical process methods are frequently required when studying the large sample properties of the associated statistics.

The key attribute of empirical processes is that they are random functions, or stochastic processes, based on a random data sample. The main asymptotic issue is studying the limiting behavior of these processes in terms of their sample paths. Primary achievements in this direction are Glivenko-Cantelli results which extend the law of large numbers, Donsker results which extend the central limit theorem, the validity of the bootstrap for Donsker classes, and the functional delta method.
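The two band constructions described above can be checked in a small simulation sketch, not from the text: one band uses the approximate 0.95 quantile of sup_t |B(t)|, which is about 1.358, and the other uses bootstrap realizations of sup_t |Ĝ_n(t)| with multinomial weights. The choice of F, the sample size, and the number of bootstrap samples are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, B = 200, 1000
X = np.sort(rng.normal(size=n))                 # sorted data
Fn = np.arange(1, n + 1) / n                    # F_n at the order statistics

# Kolmogorov band: u_{0.95} ~ 1.358 for sup_t |B(t)|
u95 = 1.358
band_lo, band_hi = Fn - u95 / np.sqrt(n), Fn + u95 / np.sqrt(n)

# Bootstrap band: multinomial weights, sup_t |sqrt(n)(F_hat_n - F_n)(t)|
sups = np.empty(B)
for b in range(B):
    W = rng.multinomial(n, np.full(n, 1.0 / n))
    F_boot = np.cumsum(W) / n                   # F_hat_n at the sorted data
    sups[b] = np.sqrt(n) * np.max(np.abs(F_boot - Fn))
c95 = np.quantile(sups, 0.95)

print(u95, c95)    # the two critical values roughly agree
```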

2.2 Empirical Process Techniques

In this section, we expand on several important techniques used in empirical processes. We first define and discuss several important kinds of stochastic convergence, including convergence in probability as well as almost sure and weak convergence. We then introduce the concept of entropy and present several Glivenko-Cantelli and Donsker theorems based on entropy. The empirical bootstrap and functional delta method are described next. A brief outline of Z- and M-estimator methods is then presented. This section is essentially a review in miniature of the main points covered in part II of this book, with a minimum of technicalities.

2.2.1 Stochastic Convergence

When discussing convergence of stochastic processes, there is always a metric space (D, d) implicitly involved, where D is the space of possible values for the processes and d is a metric (distance measure) satisfying d(x, y) ≥ 0, d(x, y) = d(y, x), d(x, z) ≤ d(x, y) + d(y, z), and d(x, y) = 0 if and only if x = y, for all x, y, z ∈ D. Frequently, D = ℓ^∞(T), where T is the index set for the processes involved, and d is the uniform distance on D, i.e., d(x, y) = sup_{t∈T} |x(t) − y(t)| for any x, y ∈ D. We are primarily interested in the convergence properties of the sample paths of stochastic processes.

Weak convergence, or convergence in distribution, of a stochastic process X_n happens when the sample paths of X_n begin to behave in distribution, as n → ∞, more and more like a specific random process X. When X_n and X are Borel measurable, weak convergence is equivalent to saying that Ef(X_n) → Ef(X) for every bounded, continuous function f: D → R, where the notation f: A → B means that f is a mapping from A to B, and where continuity is in terms of d. Hereafter, we will let C_b(D) denote the space of bounded, continuous maps f: D → R. We will define Borel measurability in detail later in part II, but, for now, it is enough to say that lack of this property means that there are certain important subsets A ⊂ D where the probability that X_n ∈ A is not defined.

In many statistical applications, X_n may not be Borel measurable. To resolve this problem, we need to introduce the notion of outer expectation for arbitrary maps T: Ω → R̄ ≡ [−∞, ∞], where Ω is the sample space. T is not necessarily a random variable because it is not necessarily Borel measurable. The outer expectation of T, denoted E*T, is the infimum over all EU, where U: Ω → R̄ is measurable, U ≥ T, and EU exists. For EU to exist, it must not be indeterminate, although it can be ±∞, provided the sign is clear. We analogously define inner expectation: E_*T = −E*[−T]. There also exists a measurable function T*: Ω → R̄, called the minimal measurable majorant, satisfying T*(ω) ≥ T(ω) for all ω ∈ Ω and which is almost surely the smallest measurable function ≥ T. Furthermore, when E*T < ∞, E*T = E T*. The maximal measurable minorant is T_* = −(−T)*. We also define outer probability for possibly nonmeasurable sets: P*(A) is the infimum over all P(B) with A ⊂ B ⊂ Ω and B a Borel measurable set. Inner probability is defined as P_*(A) = 1 − P*(Ω − A). This use of outer measure permits defining weak convergence, for possibly nonmeasurable X_n, as E*f(X_n) → Ef(X) for all f ∈ C_b(D). We denote this convergence by X_n ⇝ X. Notice that we require the limiting process X to be measurable. This definition of weak convergence also carries with it an implicit measurability requirement on X_n: X_n ⇝ X implies that X_n is asymptotically measurable, in that E*f(X_n) − E_*f(X_n) → 0 for every f ∈ C_b(D).

We now consider convergence in probability and almost surely. We say X_n converges to X in probability if P*{d(X_n, X) > ε} → 0 for every ε > 0, and we denote this X_n →^P X. We say that X_n converges outer almost surely to X if there exists a sequence Δ_n of measurable random variables with d(X_n, X) ≤ Δ_n for all n and with P{limsup_{n→∞} Δ_n = 0} = 1. We denote this kind of convergence X_n →^{as*} X. While these modes of convergence are slightly different than the standard ones, they are identical when all the quantities involved are measurable.
The properties of the standard modes are also generally preserved in these new modes. The major difference is that these new modes can accommodate many situations in statistics and in other fields which could not be as easily accommodated with the standard ones. As much as possible, we will keep measurability issues suppressed throughout this book, except where it is necessary for clarity. From this point on, the metric d of choice will be the uniform metric, unless noted otherwise.

For almost all of the weak convergence applications in this book, the limiting quantity X will be tight, in the sense that the sample paths of X will have a certain minimum amount of smoothness. To be more precise, for an index set T, let ρ be a semimetric on T, in that ρ has all the properties of a metric except that ρ(s, t) = 0 does not necessarily imply s = t. We say that T is totally bounded by ρ if, for every ε > 0, there exists a finite collection T_k = {t_1, ..., t_k} ⊂ T such that for all t ∈ T we have ρ(t, s) ≤ ε for some s ∈ T_k. Now define UC(T, ρ) to be the subset of ℓ^∞(T) where each x ∈ UC(T, ρ) satisfies

    lim_{δ↓0} sup_{s,t∈T: ρ(s,t)≤δ} |x(t) − x(s)| = 0.

The "UC" refers to uniform continuity. The stochastic process X is tight if X ∈ UC(T, ρ) almost surely for some ρ for which T is totally bounded. If X is a Gaussian process, then ρ can be chosen as ρ(s, t) = (var[X(s) − X(t)])^{1/2}. Tight Gaussian processes will be the most important limiting processes considered in this book.

Two conditions need to be met in order for X_n to converge weakly in ℓ^∞(T) to a tight X. This is summarized in the following theorem, which we present now but prove later in chapter 7:

Theorem 2.1 X_n converges weakly to a tight X in ℓ^∞(T) if and only if:

(i) For all finite {t_1, ..., t_k} ⊂ T, the multivariate distribution of {X_n(t_1), ..., X_n(t_k)} converges to that of {X(t_1), ..., X(t_k)}.

(ii) There exists a semimetric ρ for which T is totally bounded and

(2.6)    lim_{δ↓0} limsup_{n→∞} P*( sup_{s,t∈T: ρ(s,t)<δ} |X_n(s) − X_n(t)| > ε ) = 0,

for all ε > 0.

Condition (i) is convergence of all finite-dimensional distributions and condition (ii) implies asymptotic tightness. In many applications, condition (i) is not hard to verify while condition (ii) is much more difficult. In the empirical process setting based on i.i.d. data, we are interested in establishing that G_n ⇝ G in ℓ^∞(F), where F is some class of measurable functions f: X → R, and where X is the sample space. When Ef²(X) < ∞ for all f ∈ F, condition (i) above is automatically satisfied by the standard central limit theorem, whereas establishing condition (ii) is much more work and is the primary motivator behind the development of much of modern empirical process theory. Whenever F is Donsker, the limiting process G is always a tight Gaussian process, and F is totally bounded by the semimetric ρ(f, g) = {var[f(X) − g(X)]}^{1/2}. Thus conditions (i) and (ii) of theorem 2.1 are both satisfied with T = F, X_n(f) = G_n f, and X(f) = G f, for all f ∈ F.

Another important result is the continuous mapping theorem. This theorem states that if g: D → E is continuous at every point of a set D_0 ⊂ D, and if X_n ⇝ X, where X takes all its values in D_0, then g(X_n) ⇝ g(X). For example, if F is a Donsker class, then sup_{f∈F} |G_n f| has the same limiting distribution as sup_{f∈F} |G f|, since the supremum map is uniformly continuous, i.e., |sup_{f∈F} |x(f)| − sup_{f∈F} |y(f)|| ≤ sup_{f∈F} |x(f) − y(f)| for all x, y ∈ ℓ^∞(F). This fact can be used to construct confidence bands for Pf. The continuous mapping theorem has many other practical uses which we will utilize at various points throughout this book.
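Condition (i) of theorem 2.1 can be visualized for G_n in a small simulation sketch with illustrative choices (F uniform on [0, 1], a few fixed points): the empirical covariance of replicated draws of G_n should approach F(s ∧ t) − F(s)F(t).

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 500, 2000
ts = np.array([0.2, 0.5, 0.8])                 # a fixed finite set of points

draws = np.empty((reps, ts.size))
for r in range(reps):
    X = rng.uniform(size=n)
    Fn = (X[:, None] <= ts[None, :]).mean(axis=0)
    draws[r] = np.sqrt(n) * (Fn - ts)          # G_n at the three points

emp_cov = np.cov(draws, rowvar=False)
lim_cov = np.minimum.outer(ts, ts) - np.outer(ts, ts)  # F(s^t) - F(s)F(t)
print(np.round(emp_cov, 3))
print(np.round(lim_cov, 3))                    # the two matrices agree
```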

2.2.2 Entropy for Glivenko-Cantelli and Donsker Theorems

The major challenge in obtaining Glivenko-Cantelli or Donsker theorems for classes of functions F is to somehow show that going from pointwise convergence to uniform convergence is feasible. Clearly the complexity, or entropy, of F plays a major role. The easiest entropy to introduce is entropy with bracketing. For 1 ≤ r < ∞, let L_r(P) denote the collection of functions g: X → R such that ‖g‖_{r,P} ≡ [∫_X |g(x)|^r dP(x)]^{1/r} < ∞. An ε-bracket in L_r(P) is a pair of functions l, u ∈ L_r(P) with P{l(X) ≤ u(X)} = 1 and with ‖l − u‖_{r,P} ≤ ε. A function f ∈ F lies in the bracket (l, u) if P{l(X) ≤ f(X) ≤ u(X)} = 1. The bracketing number N_{[]}(ε, F, L_r(P)) is the minimum number of ε-brackets in L_r(P) needed to ensure that every f ∈ F lies in at least one bracket. The logarithm of the bracketing number is the entropy with bracketing. The following is one of the simplest Glivenko-Cantelli theorems (the proof is deferred until part II):

Theorem 2.2 Let F be a class of measurable functions and suppose that N_{[]}(ε, F, L_1(P)) < ∞ for every ε > 0. Then F is P-Glivenko-Cantelli.

Consider, for example, the empirical distribution function F_n based on an i.i.d. sample X_1, ..., X_n of real random variables with distribution F (which defines the probability measure P on X = R). In this setting, F_n is the empirical measure P_n indexed by F = {1{x ≤ t}, t ∈ R}. For any ε > 0, a finite collection of real numbers −∞ = t_0 < t_1 < ··· < t_k = ∞ can be found so that F(t_j−) − F(t_{j−1}) < ε for all 1 ≤ j ≤ k, where F(t−) denotes the left-hand limit of F at t. This can always be done in such a way that k ≤ 2 + 1/ε. Consider the collection of brackets {(l_j, u_j), 1 ≤ j ≤ k}, with l_j(x) = 1{x ≤ t_{j−1}} and u_j(x) = 1{x < t_j}. Every f ∈ F lies in at least one bracket, and ‖u_j − l_j‖_{P,1} = F(t_j−) − F(t_{j−1}) < ε for each j. Hence N_{[]}(ε, F, L_1(P)) ≤ 2 + 1/ε < ∞ for every ε > 0, and the conditions of theorem 2.2 are met.

Donsker theorems based on entropy with bracketing require more stringent conditions on the number of brackets needed to cover F. The bracketing integral,

    J_{[]}(δ, F, L_r(P)) ≡ ∫_0^δ √(log N_{[]}(ε, F, L_r(P))) dε,

needs to be bounded for r = 2 and δ = ∞ to establish that F is Donsker. Hence the bracketing entropy is permitted to go to ∞ as ε ↓ 0, but not too quickly. For most of the classes F of interest, the entropy does go to ∞ as ε ↓ 0. However, a surprisingly large number of these classes satisfy the conditions of theorem 2.3 below, our first Donsker theorem (which we prove in chapter 8):

Theorem 2.3 Let F be a class of measurable functions with J_{[]}(∞, F, L_2(P)) < ∞. Then F is P-Donsker.

Returning again to the empirical distribution function example, we have for the ε-brackets used previously that ‖u_j − l_j‖_{P,2} = (‖u_j − l_j‖_{P,1})^{1/2} ≤ ε^{1/2}. Hence the minimum number of L_2 ε-brackets needed to cover F is bounded by 1 + 1/ε², since an L_1 ε²-bracket is an L_2 ε-bracket. For ε > 1, the number of brackets needed is just 1. J_{[]}(∞, F, L_2(P)) will therefore be finite if ∫_0^1 √(log(1 + 1/ε²)) dε < ∞. Using the fact that log(1 + a) ≤ 1 + log(a) for a ≥ 1 and the variable substitution u = 1 + log(1/ε²), we obtain that this integral is bounded by ∫_0^∞ u^{1/2} e^{−u/2} du = √(2π). Thus the conditions of theorem 2.3 are easily satisfied.

We now give two other examples of classes with bounded L_r(P) bracketing integral. Parametric classes of the form F = {f_θ: θ ∈ Θ} work, provided Θ is a bounded subset of R^p and there exists an m ∈ L_r(P) such that |f_{θ_1}(x) − f_{θ_2}(x)| ≤ m(x)‖θ_1 − θ_2‖ for all θ_1, θ_2 ∈ Θ. Here, ‖·‖ is the standard Euclidean norm on R^p. The class F of all monotone functions f: R → [0, 1] also works for all 1 ≤ r < ∞ and all probability measures P.

Entropy calculations for other classes that arise in statistical applications can be difficult. However, there are a number of techniques for doing this which are not difficult to apply in practice and which we will explore briefly later on in this section. Unfortunately, there are also many classes F for which entropy with bracketing does not work at all. An alternative which can be useful in such settings is entropy based on covering numbers. The covering number N(ε, F, L_r(Q)) is the minimum number of L_r(Q) ε-balls needed to cover F, where an L_r(Q) ε-ball around a function g ∈ L_r(Q) is the set {h ∈ L_r(Q): ‖h − g‖_{Q,r} < ε}. For a collection of balls to cover F, all elements of F must be included in at least one of the balls, but it is not necessary that the centers of the balls be contained in F. The entropy is the logarithm of the covering number. The bracketing entropy conditions in theorems 2.2 and 2.3 can be replaced by conditions based on the uniform covering numbers

(2.7)    sup_Q N(ε‖F‖_{Q,r}, F, L_r(Q)),

where F: X → R is an envelope for F, meaning that |f(x)| ≤ F(x) for all x ∈ X and all f ∈ F, and where the supremum is taken over all finitely discrete probability measures Q with ‖F‖_{Q,r} > 0. A finitely discrete probability measure on X puts mass only at a finite number of points in X. Notice that the uniform covering number does not depend on the probability measure P for the observed data. The uniform entropy integral is

    J(δ, F, L_r) = ∫_0^δ √(log sup_Q N(ε‖F‖_{Q,r}, F, L_r(Q))) dε,

where the supremum is taken over the same set used in (2.7). The following two theorems (given without proof) are Glivenko-Cantelli and Donsker results for uniform entropy:

Theorem 2.4 Let F be an appropriately measurable class of measurable functions with sup_Q N(ε‖F‖_{1,Q}, F, L_1(Q)) < ∞ for every ε > 0, where the supremum is taken over the same set used in (2.7). If P*F < ∞, then F is P-Glivenko-Cantelli.

Theorem 2.5 Let F be an appropriately measurable class of measurable functions with J(1, F, L_2) < ∞. If P*F² < ∞, then F is P-Donsker.

Discussion of the "appropriately measurable" condition will be postponed until part II, but suffice it to say that it is satisfied for many function classes of interest in statistical applications. An important collection of function classes F which satisfy J(1, F, L_r) < ∞ for any 1 ≤ r < ∞ are the Vapnik-Červonenkis classes, or VC classes. Many classes of interest in statistics are VC, including the class of indicator functions explored earlier in the empirical distribution function example and also vector space classes. A vector space class F has the form {∑_{i=1}^k λ_i f_i(x), (λ_1, ..., λ_k) ∈ R^k} for fixed functions f_1, ..., f_k. We will postpone further definition and discussion of VC classes until part II.

The important thing to know at this point is that one does not need to calculate entropy for each new problem. There are a number of easy methods which can be used to determine whether a given class is Glivenko-Cantelli or Donsker based on whether the class is built up of other, well-known classes. For example, subsets of Donsker classes are Donsker, since condition (ii) of theorem 2.1 is clearly satisfied for any subset of T if it is satisfied for T. One can also use theorem 2.1 to show that finite unions of Donsker classes are Donsker. When F and G are Donsker, the following are also Donsker: {f ∧ g: f ∈ F, g ∈ G}, {f ∨ g: f ∈ F, g ∈ G}, where ∨ denotes maximum, and {f + g: f ∈ F, g ∈ G}. If F and G are bounded Donsker classes, then {fg: f ∈ F, g ∈ G} is Donsker. Also, Lipschitz continuous functions of Donsker classes are Donsker. Furthermore, if F is Donsker, then it is also Glivenko-Cantelli. These, and many other tools for verifying that a given class is Glivenko-Cantelli or Donsker, will be discussed in greater detail in chapter 9.
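As a small numerical aside, not from the text, the bracket construction used in the empirical distribution function example above can be carried out explicitly: grid points with adjacent F-increments of size ε give at most 2 + 1/ε brackets. Taking F standard normal is an illustrative assumption.

```python
import numpy as np
from scipy.stats import norm

def bracketing_points(eps):
    # t_j = F^{-1}(j * eps), so F(t_j) - F(t_{j-1}) = eps exactly;
    # together with t_0 = -inf and t_k = +inf these define the brackets
    probs = np.arange(eps, 1.0, eps)
    return norm.ppf(probs)

for eps in [0.5, 0.1, 0.01]:
    k = len(bracketing_points(eps)) + 1      # number of brackets (l_j, u_j)
    print(eps, k, 2 + 1 / eps)               # k stays below the 2 + 1/eps bound
```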

2.2.3 Bootstrapping Empirical Processes

An important aspect of inference for empirical processes is to be able to obtain covariance and confidence band estimates. The limiting covariance for a P-Donsker class F is σ: F × F → R, where σ(f, g) = Pfg − Pf Pg. The covariance estimate σ̂: F × F → R, where σ̂(f, g) = P_n fg − P_n f P_n g, is uniformly consistent for σ outer almost surely if and only if P*sup_{f∈F}(f(X) − Pf)² < ∞. This will be proved later in part II. However, this is only of limited use since critical values for confidence bands cannot in general be determined from the covariance when F is not finite. The bootstrap is an effective alternative.

As mentioned earlier, some care must be taken to ensure that the concept of weak convergence makes sense when the statistics of interest may not be measurable. This issue becomes more delicate with bootstrap results which involve convergence of conditional laws given the observed data. In this setting, there are two sources of randomness, the observed data and the resampling done by the bootstrap. For this reason, convergence of conditional laws is assessed in a slightly different manner than regular weak convergence. An important result is that X_n ⇝ X in the metric space (D, d) if and only if

(2.8)    sup_{f∈BL_1} |E*f(X_n) − Ef(X)| → 0,

where BL_1 is the space of functions f: D → R with Lipschitz norm bounded by 1, i.e., ‖f‖_∞ ≤ 1 and |f(x) − f(y)| ≤ d(x, y) for all x, y ∈ D, and where ‖·‖_∞ is the uniform norm.

We can now use this alternative definition of weak convergence to define convergence of the conditional limit laws of bootstraps. Let X̂_n be a sequence of bootstrapped processes in D with random weights which we will denote M. For some tight process X in D, we use the notation X̂_n ⇝^P_M X to mean that sup_{h∈BL_1} |E_M h(X̂_n) − Eh(X)| →^P 0 and E_M h(X̂_n)* − E_M h(X̂_n)_* →^P 0, for all h ∈ BL_1, where the subscript M in the expectations indicates conditional expectation over the weights M given the remaining data, and where h(X̂_n)* and h(X̂_n)_* denote measurable majorants and minorants with respect to the joint data (including the weights M). We use the notation X̂_n ⇝^{as*}_M X to mean the same thing except with all →^P's replaced by →^{as*}'s. Note that the h(X̂_n) inside of the supremum does not have an asterisk: this is because the Lipschitz continuous functions of the bootstrapped processes we will study in this book will always be measurable functions of the random weights when conditioning on the data.

As mentioned previously, the bootstrap empirical measure can be defined as P̂_n f = n^{-1} ∑_{i=1}^n W_{ni} f(X_i), where W_n = (W_{n1}, ..., W_{nn}) is a multinomial vector with probabilities (1/n, ..., 1/n) and number of trials n, and where W_n is independent of the data sequence X = (X_1, X_2, ...). We can now define a useful and simple alternative to this standard nonparametric bootstrap. Let ξ = (ξ_1, ξ_2, ...) be an infinite sequence of nonnegative i.i.d. random variables, also independent of X, which have mean 0 < μ < ∞ and variance 0 < τ² < ∞, and which satisfy ‖ξ‖_{2,1} < ∞, where ‖ξ‖_{2,1} = ∫_0^∞ √(P(|ξ_1| > x)) dx. This last condition is slightly stronger than a bounded second moment but is implied whenever the 2 + ε moment exists for any ε > 0. We can now define a multiplier bootstrap empirical measure P̃_n f = n^{-1} ∑_{i=1}^n (ξ_i/ξ̄_n) f(X_i), where ξ̄_n = n^{-1} ∑_{i=1}^n ξ_i and P̃_n is defined to be zero if ξ̄_n = 0. Note that the weights add up to n for both bootstraps. When ξ_1 has a standard exponential distribution, for example, the moment conditions are clearly satisfied, and the resulting multiplier bootstrap has Dirichlet weights.

Under these conditions, we have the following two theorems (which we prove in part II), for convergence of the bootstrap, both in probability and outer almost surely. Let Ĝ_n = √n(P̂_n − P_n), G̃_n = √n(μ/τ)(P̃_n − P_n), and let G be the standard Brownian bridge in ℓ^∞(F).

Theorem 2.6 The following are equivalent:

(i) F is P-Donsker.

(ii) Ĝ_n ⇝^P_W G in ℓ^∞(F) and the sequence Ĝ_n is asymptotically measurable.

(iii) G̃_n ⇝^P_ξ G in ℓ^∞(F) and the sequence G̃_n is asymptotically measurable.

Theorem 2.7 The following are equivalent:

(i) F is P-Donsker and P*sup_{f∈F}(f(X) − Pf)² < ∞.

(ii) Ĝ_n ⇝^{as*}_W G in ℓ^∞(F).

(iii) G̃_n ⇝^{as*}_ξ G in ℓ^∞(F).

According to theorem 2.7, the almost sure consistency of the bootstrap requires the same moment condition required for almost sure uniform consistency of the covariance estimator σ̂. In contrast, the consistency in probability of the bootstrap given in theorem 2.6 only requires that F is Donsker. Thus consistency in probability of the bootstrap empirical process is an automatic consequence of weak convergence in the first place. Fortunately, consistency in probability is adequate for most statistical applications, since this implies that confidence bands constructed from the bootstrap are asymptotically valid. This follows because, as we will also establish in part II, whenever the conditional law of a bootstrapped quantity (say X̂_n) in a normed space (with norm ‖·‖) converges to a limiting law (say of X), either in probability or outer almost surely, then the conditional law of ‖X̂_n‖ converges to that of ‖X‖ under mild regularity conditions. We will also establish a slightly more general in-probability continuous mapping theorem for the bootstrap when the continuous map g is real valued.

Suppose we wish to construct a 1 − α level confidence band for {Pf, f ∈ F}, where F is P-Donsker. We can obtain a large number, say N, of bootstrap realizations of sup_{f∈F} |Ĝ_n f| to estimate the 1 − α quantile of sup_{f∈F} |G f|. If we call this estimate ĉ_{1−α}, then theorem 2.6 tells us that {P_n f ± ĉ_{1−α}/√n, f ∈ F} has coverage 1 − α for large enough n and N. For a more specific example, consider estimating F(t_1, t_2) = P{Y_1 ≤ t_1, Y_2 ≤ t_2}, where X = (Y_1, Y_2) has an arbitrary bivariate distribution. We can estimate F(t_1, t_2) with F̂_n(t_1, t_2) = n^{-1} ∑_{i=1}^n 1{Y_{1i} ≤ t_1, Y_{2i} ≤ t_2}. This is the same as estimating {Pf, f ∈ F}, where F = {f(x) = 1{y_1 ≤ t_1, y_2 ≤ t_2}: t_1, t_2 ∈ R}. This is a bounded Donsker class since F = {f_1 f_2: f_1 ∈ F_1, f_2 ∈ F_2}, where F_j = {1{y_j ≤ t}, t ∈ R} is a bounded Donsker class for j = 1, 2. We thus obtain consistency in probability of the bootstrap. We also obtain outer almost sure consistency of the bootstrap by theorem 2.7, since F is bounded by 1.
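Here is a minimal simulation sketch, not from the text, comparing the two bootstraps of this subsection for F = {1{x ≤ t}, t ∈ R}: the multinomial bootstrap Ĝ_n and the multiplier bootstrap G̃_n with standard exponential multipliers, for which μ = τ = 1. All simulation settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, B = 300, 1000
X = np.sort(rng.normal(size=n))
Fn = np.arange(1, n + 1) / n                    # F_n at the order statistics

sup_hat = np.empty(B)
sup_tilde = np.empty(B)
for b in range(B):
    # multinomial bootstrap: G_hat_n = sqrt(n)(P_hat_n - P_n)
    W = rng.multinomial(n, np.full(n, 1.0 / n))
    sup_hat[b] = np.sqrt(n) * np.max(np.abs(np.cumsum(W) / n - Fn))
    # multiplier bootstrap: weights xi_i / xi_bar_n, and mu/tau = 1 here
    xi = rng.exponential(size=n)
    Pn_tilde = np.cumsum(xi / xi.mean()) / n
    sup_tilde[b] = np.sqrt(n) * np.max(np.abs(Pn_tilde - Fn))

# both conditional distributions approximate that of sup_t |G(t)|
print(np.quantile(sup_hat, [0.5, 0.95]))
print(np.quantile(sup_tilde, [0.5, 0.95]))
```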

2.2.4 The Functional Delta Method

Suppose X_n is a sequence of random variables with √n(X_n − θ) ⇝ X for some θ ∈ R^p, and the function φ: R^p → R^q has a derivative φ'(θ) at θ. The standard delta method now tells us that √n(φ(X_n) − φ(θ)) ⇝ φ'(θ)X. However, many important statistics based on i.i.d. data involve maps from empirical processes to spaces of functions, and hence cannot be handled by the standard delta method. A simple example is the map φ_ξ which takes a cumulative distribution function H and computes {ξ_p, p ∈ [a, b]}, where ξ_p = H^{-1}(p) = inf{t: H(t) ≥ p} and [a, b] ⊂ (0, 1). The sample p-th quantile is then ξ̂_n(p) = φ_ξ(F_n)(p). Although the standard delta method cannot be used here, the functional delta method can be.

Before giving the main functional delta method results, we need to define derivatives for functions between normed spaces D and E. A normed space is a metric space (D, d), where d(x, y) = ‖x − y‖ for any x, y ∈ D, and where ‖·‖ is a norm. A norm satisfies ‖x + y‖ ≤ ‖x‖ + ‖y‖, ‖αx‖ = |α| × ‖x‖, ‖x‖ ≥ 0, and ‖x‖ = 0 if and only if x = 0, for all x, y ∈ D and all complex numbers α. A map φ: D_φ ⊂ D → E is Gateaux-differentiable at θ ∈ D if, for every fixed h ∈ D with θ + th ∈ D_φ for all t > 0 small enough, there exists an element φ'_θ(h) ∈ E such that

    [φ(θ + th) − φ(θ)]/t → φ'_θ(h)

as t ↓ 0. For the functional delta method, however, we need φ to have the stronger property of being Hadamard-differentiable. A map φ: D_φ → E is Hadamard-differentiable at θ ∈ D, tangentially to a set D_0 ⊂ D, if there exists a continuous linear map φ'_θ: D → E such that

    [φ(θ + t_n h_n) − φ(θ)]/t_n → φ'_θ(h),

as n → ∞, for all converging sequences t_n → 0 and h_n → h ∈ D_0, with h_n ∈ D and θ + t_n h_n ∈ D_φ for all n ≥ 1 sufficiently large.

For example, let D = D[0, 1], where D_A, for any interval A ⊂ R, is the space of cadlag (right-continuous with left-hand limits) real functions on A with the uniform norm. Let D_φ = {f ∈ D[0, 1]: |f| > 0}. Consider the function φ: D_φ → E = D[0, 1] defined by φ(g) = 1/g. Notice that for any θ ∈ D_φ we have, for any converging sequences t_n ↓ 0 and h_n → h ∈ D, with h_n ∈ D and θ + t_n h_n ∈ D_φ for all n ≥ 1,

    [φ(θ + t_n h_n) − φ(θ)]/t_n = [1/(θ + t_n h_n) − 1/θ]/t_n = −h_n/[θ(θ + t_n h_n)] → −h/θ²,

where we have suppressed the argument in g for clarity. Thus φ is Hadamard-differentiable, tangentially to D, with φ'_θ(h) = −h/θ².

Sometimes Hadamard differentiability is also called compact differentiability. Another important property of this kind of derivative is that it satisfies a chain rule, in that compositions of Hadamard-differentiable functions are also Hadamard-differentiable. Details on this and several other aspects of functional differentiation will be postponed until part II. We have the following important result (the proof of which will be given in part II):

Theorem 2.8 For normed spaces D and E, let φ: D_φ ⊂ D → E be Hadamard-differentiable at θ tangentially to D_0 ⊂ D. Assume that r_n(X_n − θ) ⇝ X for some sequence of constants r_n → ∞, where X_n takes its values in D_φ, and X is a tight process taking its values in D_0. Then r_n(φ(X_n) − φ(θ)) ⇝ φ'_θ(X).

Consider again the quantile map φ_ξ, and let the distribution function F be absolutely continuous over N = [u, v] = [F^{-1}(a) − ε, F^{-1}(b) + ε], for some ε > 0, with continuous density f such that 0 < inf_{t∈N} f(t) ≤ sup_{t∈N} f(t) < ∞. Also let D_1 ⊂ D[u, v] be the space of all distribution functions restricted to [u, v]. We will now argue that φ_ξ is Hadamard-differentiable at F tangentially to C[u, v], where, for any interval A ⊂ R, C_A is the space of continuous real functions on A. Let t_n → 0 and {h_n} ∈ D[u, v] converge uniformly to h ∈ C[u, v] such that F + t_n h_n ∈ D_1 for all n ≥ 1, and denote ξ_p = F^{-1}(p), ξ_{pn} = (F + t_n h_n)^{-1}(p), ξ^N_{pn} = (ξ_{pn} ∨ u) ∧ v, and ε_{pn} = t_n² ∧ (ξ^N_{pn} − u). The reason for the modification ξ^N_{pn} is to ensure that the quantile estimate is contained in [u, v], and hence also that ε_{pn} ≥ 0. Thus there exists an n_0 < ∞ such that, for all n ≥ n_0, (F + t_n h_n)(u) < a, (F + t_n h_n)(v) > b, ε_{pn} > 0, and ξ^N_{pn} = ξ_{pn} for all p ∈ [a, b], and therefore

(2.9)    (F + t_n h_n)(ξ^N_{pn} − ε_{pn}) ≤ F(ξ_p) ≤ (F + t_n h_n)(ξ^N_{pn})

for all p ∈ [a, b], since (F + t_n h_n)^{-1}(p) is the smallest x satisfying (F + t_n h_n)(x) ≥ p and F(ξ_p) = p.

Since F(ξ^N_{pn} − ε_{pn}) = F(ξ^N_{pn}) + O(ε_{pn}), h_n(ξ^N_{pn}) − h(ξ^N_{pn}) = o(1), and h_n(ξ^N_{pn} − ε_{pn}) − h(ξ^N_{pn} − ε_{pn}) = o(1), where O and o are uniform over p ∈ [a, b] (here and for the remainder of our argument), we have that (2.9) implies

(2.10)    F(ξ^N_{pn}) + t_n h(ξ^N_{pn} − ε_{pn}) + o(t_n) ≤ F(ξ_p) ≤ F(ξ^N_{pn}) + t_n h(ξ^N_{pn}) + o(t_n).

But this implies that F(ξ^N_{pn}) + O(t_n) ≤ F(ξ_p) ≤ F(ξ^N_{pn}) + O(t_n), which implies that |ξ_{pn} − ξ_p| = O(t_n). This, together with (2.10) and the fact that h is continuous, implies that F(ξ_{pn}) − F(ξ_p) = −t_n h(ξ_p) + o(t_n). This now yields

    (ξ_{pn} − ξ_p)/t_n = −h(ξ_p)/f(ξ_p) + o(1),

and the desired Hadamard-differentiability of φ_ξ follows, with derivative φ'_F(h) = {−h(F^{-1}(p))/f(F^{-1}(p)), p ∈ [a, b]}.

The functional delta method also applies to the bootstrap. Consider a sequence of random elements X_n = X_n(X_1, ..., X_n) in a normed space D, and assume that r_n(X_n − μ) ⇝ X, where X is tight in D, for some sequence of constants 0 < r_n ↑ ∞. We have the following theorem:

Theorem 2.9 For normed spaces D and E, let φ: D_φ ⊂ D → E be Hadamard-differentiable at μ tangentially to D_0 ⊂ D, with derivative φ'_μ. Let X_n and X̂_n have values in D_φ, with r_n(X_n − μ) ⇝ X, where X is tight and takes its values in D_0, the maps W_n → X̂_n are appropriately measurable, and where r_n c(X̂_n − X_n) ⇝^P_W X for a constant 0 < c < ∞. Then r_n c(φ(X̂_n) − φ(X_n)) ⇝^P_W φ'_μ(X).

We will postpone until part II a more precise discussion of what "appropriately measurable" means in this context. When $X_n$ in the previous theorem is the empirical process $\mathbb{P}_n$ indexed by a Donsker class $\mathcal{F}$ and $r_n = \sqrt{n}$, the results of theorem 2.6 apply with $\mu = P$ for either the nonparametric or multiplier bootstrap weights. Moreover, the above measurability condition also holds (this will be verified in chapter 12). Thus the bootstrap is automatically valid for Hadamard-differentiable functions applied to empirical processes indexed by Donsker classes. As a simple example, bootstraps of the quantile process $\{\hat{\xi}_n(p), p \in [a,b] \subset (0,1)\}$ are valid, provided the conditions given in the example following theorem 2.8 for the density $f$ over the interval $N$ are satisfied. This can be used, for example, to create asymptotically valid confidence bands for $\{F^{-1}(p), p \in [a,b]\}$. There are also results for outer almost sure conditional convergence of the conditional laws of the bootstrapped process $r_n(\phi(\hat{X}_n) - \phi(X_n))$, but this requires stronger conditions on the differentiability of $\phi$, and we will not pursue this further in this book.
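As a companion sketch of the bootstrap validity just described (simulated data and all tuning constants are arbitrary assumptions), a simultaneous band for $\{F^{-1}(p), p \in [a,b]\}$ can be built from the bootstrap distribution of the sup-norm deviation of the quantile process:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(1.0, size=500)            # observed sample
ps = np.linspace(0.25, 0.75, 51)              # [a, b] = [0.25, 0.75]
xi_hat = np.quantile(x, ps)                   # quantile process on [a, b]

# Nonparametric bootstrap of the quantile process.
B = 1000
boot = np.empty((B, ps.size))
for b in range(B):
    xb = rng.choice(x, size=x.size, replace=True)
    boot[b] = np.quantile(xb, ps)

# Sup-norm bootstrap critical value gives an asymptotically valid band.
sup_dev = np.abs(boot - xi_hat).max(axis=1)
c = np.quantile(sup_dev, 0.95)
lower, upper = xi_hat - c, xi_hat + c
print("95% simultaneous band half-width:", c)
```

Replacing the sup-norm statistic by pointwise quantiles would give pointwise intervals instead of a band.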

2.2.5 Z-Estimators

A Z-estimator $\hat{\theta}_n$ is the approximate zero of a data-dependent function. To be more precise, let the parameter space be $\Theta$ and let $\Psi_n : \Theta \to \mathbb{L}$ be a data-dependent function between two normed spaces, with norms $\|\cdot\|$ and $\|\cdot\|_{\mathbb{L}}$, respectively. If $\|\Psi_n(\hat{\theta}_n)\|_{\mathbb{L}} \overset{P}{\to} 0$, then $\hat{\theta}_n$ is a Z-estimator. The main statistical issues for such estimators are consistency, asymptotic normality and validity of the bootstrap. Usually, $\Psi_n$ is an estimator of a fixed function $\Psi : \Theta \to \mathbb{L}$ with $\Psi(\theta_0) = 0$ for some parameter of interest $\theta_0 \in \Theta$. We save the proof of the following theorem as an exercise:

Theorem 2.10 Let $\Psi(\theta_0) = 0$ for some $\theta_0 \in \Theta$, and assume $\|\Psi(\theta_n)\|_{\mathbb{L}} \to 0$ implies $\|\theta_n - \theta_0\| \to 0$ for any sequence $\{\theta_n\} \in \Theta$ (this is an "identifiability" condition). Then

(i) If $\|\Psi_n(\hat{\theta}_n)\|_{\mathbb{L}} \overset{P}{\to} 0$ for some sequence of estimators $\hat{\theta}_n \in \Theta$ and $\sup_{\theta \in \Theta}\|\Psi_n(\theta) - \Psi(\theta)\|_{\mathbb{L}} \overset{P}{\to} 0$, then $\|\hat{\theta}_n - \theta_0\| \overset{P}{\to} 0$.

(ii) If $\|\Psi_n(\hat{\theta}_n)\|_{\mathbb{L}} \overset{as*}{\to} 0$ for some sequence of estimators $\hat{\theta}_n \in \Theta$ and $\sup_{\theta \in \Theta}\|\Psi_n(\theta) - \Psi(\theta)\|_{\mathbb{L}} \overset{as*}{\to} 0$, then $\|\hat{\theta}_n - \theta_0\| \overset{as*}{\to} 0$.

Consider, for example, estimating the survival function for right-censored failure time data. In this setting, we observe $X = (U, \delta)$, where $U = T \wedge C$, $\delta = 1\{T \leq C\}$, $T$ is a failure time of interest with distribution function $F_0$ and survival function $S_0 = 1 - F_0$ with $S_0(0) = 1$, and $C$ is a censoring time with distribution and survival functions $G$ and $L = 1 - G$, respectively, with $L(0) = 1$. For a sample of $n$ observations $\{X_i, i = 1,\ldots,n\}$, let $\{\tilde{T}_j, j = 1,\ldots,m_n\}$ be the unique observed failure times. The Kaplan-Meier estimator

$\hat{S}_n$ of $S_0$ is then given by
$$\hat{S}_n(t) = \prod_{j : \tilde{T}_j \leq t}\left(1 - \frac{\sum_{i=1}^n \delta_i 1\{U_i = \tilde{T}_j\}}{\sum_{i=1}^n 1\{U_i \geq \tilde{T}_j\}}\right).$$
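The product form above translates directly into code. The following minimal sketch (with arbitrary exponential failure and censoring laws assumed for illustration) computes $\hat{S}_n$ at the unique observed failure times:

```python
import numpy as np

def kaplan_meier(u, d):
    """Kaplan-Meier estimate at the unique observed failure times.

    u : observed times U_i = T_i ^ C_i;  d : indicators 1{T_i <= C_i}.
    Returns (times, S_hat), the product over failure times <= t.
    """
    t_fail = np.unique(u[d == 1])                          # unique failures
    deaths = np.array([np.sum((u == t) & (d == 1)) for t in t_fail])
    at_risk = np.array([np.sum(u >= t) for t in t_fail])   # at risk just before t
    return t_fail, np.cumprod(1.0 - deaths / at_risk)

rng = np.random.default_rng(2)
t = rng.exponential(1.0, 300)          # failure times (rate 1, arbitrary)
c = rng.exponential(1.5, 300)          # censoring times (arbitrary law)
u, d = np.minimum(t, c), (t <= c).astype(int)
times, s_hat = kaplan_meier(u, d)
print(times[:3], s_hat[:3])
```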

Consistency and other properties of this estimator can be demonstrated via standard continuous-time martingale arguments (Fleming and Harrington, 1991; Andersen, Borgan, Gill and Keiding, 1993); however, it is instructive to use empirical process arguments for Z-estimators. Let $\tau < \infty$ satisfy $L(\tau-)S_0(\tau-) > 0$, and let $\Theta$ be the space of all survival functions $S$ with $S(0) = 1$ and restricted to $[0,\tau]$. We will use the uniform norm $\|\cdot\|_\infty$ on $\Theta$. After some algebra, the Kaplan-Meier estimator can be shown to be the solution of $\Psi_n(\hat{S}_n) = 0$, where $\Psi_n : \Theta \to \Theta$ has the form $\Psi_n(S)(t) = \mathbb{P}_n\psi_{S,t}$, where

$$\psi_{S,t}(X) = 1\{U > t\} + (1 - \delta)1\{U \leq t\}1\{S(U) > 0\}\frac{S(t)}{S(U)} - S(t).$$

This is Efron's (1967) "self-consistency" expression for the Kaplan-Meier estimator. For the fixed function $\Psi$, we use $\Psi(S)(t) = P\psi_{S,t}$. Somewhat surprisingly, the class of functions $\mathcal{F} = \{\psi_{S,t} : S \in \Theta, t \in [0,\tau]\}$ is $P$-Donsker. To see this, first note that the class $\mathcal{M}$ of monotone functions $f : [0,\tau] \to [0,1]$ of the real random variable $U$ has bounded entropy (with bracketing) integral, a fact which we establish later in part II. Now the class of functions $\mathcal{M}_1 = \{\tilde{\psi}_{S,t} : S \in \Theta, t \in [0,\tau]\}$, where

$$\tilde{\psi}_{S,t}(U) = 1\{U > t\} + 1\{U \leq t\}1\{S(U) > 0\}\frac{S(t)}{S(U)},$$
is a subset of $\mathcal{M}$, since $\tilde{\psi}_{S,t}(U)$ is monotone in $U$ on $[0,\tau]$ and takes values only in $[0,1]$ for all $S \in \Theta$ and $t \in [0,\tau]$. Note that $\{1\{U \leq t\} : t \in [0,\tau]\}$ is also Donsker (as argued previously), and so is $\{\delta\}$ (trivially) and $\{S(t) : S \in \Theta, t \in [0,\tau]\}$, since any class of fixed functions is always Donsker. Since all of these Donsker classes are bounded, we now have that $\mathcal{F}$ is Donsker since sums and products of bounded Donsker classes are also Donsker. Since Donsker classes are also Glivenko-Cantelli, we have that $\sup_{S \in \Theta}\|\Psi_n(S) - \Psi(S)\|_\infty \overset{as*}{\to} 0$. If we can establish the identifiability condition for $\Psi$, the outer almost sure version of theorem 2.10 gives us that $\|\hat{S}_n - S_0\|_\infty \overset{as*}{\to} 0$.

After taking expectations, the function $\Psi$ can be shown to have the form
$$(2.11)\qquad \Psi(S)(t) = P\psi_{S,t} = S_0(t)L(t) + \int_0^t \frac{S_0(u)}{S(u)}\,dG(u)\,S(t) - S(t).$$
Thus, if we make the substitution $\epsilon_n(t) = S_0(t)/S_n(t) - 1$, then $\Psi(S_n)(t) \to 0$ uniformly over $t \in [0,\tau]$ implies that $u_n(t) = \epsilon_n(t)L(t) + \int_0^t \epsilon_n(u)dG(u) \to 0$ uniformly over the same interval. By solving this integral equation, we obtain $\epsilon_n(t) = u_n(t)/L(t-) - \int_0^{t-}[L(s)L(s-)]^{-1}u_n(s)dG(s)$, which implies $\epsilon_n(t) \to 0$ uniformly, since $L(t-) \geq L(\tau-) > 0$. Thus $\|S_n - S_0\|_\infty \to 0$, implying the desired identifiability.

We now consider weak convergence of Z-estimators. Let $\Psi_n$, $\Psi$, $\Theta$ and $\mathbb{L}$ be as at the beginning of this section. We have the following master theorem for Z-estimators, the proof of which will be given in part II:

Theorem 2.11 Assume that $\Psi(\theta_0) = 0$ for some $\theta_0$ in the interior of $\Theta$, $\sqrt{n}\Psi_n(\hat{\theta}_n) \overset{P}{\to} 0$, and $\|\hat{\theta}_n - \theta_0\| \overset{P}{\to} 0$ for the random sequence $\{\hat{\theta}_n\} \in \Theta$. Assume also that $\sqrt{n}(\Psi_n - \Psi)(\theta_0) \rightsquigarrow Z$, for some tight random $Z$, and that
$$(2.12)\qquad \frac{\left\|\sqrt{n}(\Psi_n(\hat{\theta}_n) - \Psi(\hat{\theta}_n)) - \sqrt{n}(\Psi_n(\theta_0) - \Psi(\theta_0))\right\|_{\mathbb{L}}}{1 + \sqrt{n}\|\hat{\theta}_n - \theta_0\|} \overset{P}{\to} 0.$$

If $\theta \mapsto \Psi(\theta)$ is Fréchet-differentiable at $\theta_0$ (defined below) with continuously-invertible (also defined below) derivative $\dot{\Psi}_{\theta_0}$, then
$$(2.13)\qquad \left\|\sqrt{n}\dot{\Psi}_{\theta_0}(\hat{\theta}_n - \theta_0) + \sqrt{n}(\Psi_n - \Psi)(\theta_0)\right\|_{\mathbb{L}} \overset{P}{\to} 0,$$
and thus $\sqrt{n}(\hat{\theta}_n - \theta_0) \rightsquigarrow -\dot{\Psi}_{\theta_0}^{-1}(Z)$.

Fréchet-differentiability of a map $\phi : \Theta \subset \mathbb{D} \to \mathbb{L}$ at $\theta \in \Theta$ is stronger than Hadamard-differentiability, in that it means there exists a continuous, linear map $\phi'_\theta : \mathbb{D} \to \mathbb{L}$ with
$$(2.14)\qquad \frac{\|\phi(\theta + h_n) - \phi(\theta) - \phi'_\theta(h_n)\|_{\mathbb{L}}}{\|h_n\|} \to 0$$
for all sequences $\{h_n\} \subset \mathbb{D}$ with $\|h_n\| \to 0$ and $\theta + h_n \in \Theta$ for all $n \geq 1$. Continuous invertibility of an operator $A : \Theta \to \mathbb{L}$ essentially means $A$ is invertible with the property that for a constant $c > 0$ and all $\theta_1, \theta_2 \in \Theta$,

$$(2.15)\qquad \|A(\theta_1) - A(\theta_2)\|_{\mathbb{L}} \geq c\|\theta_1 - \theta_2\|.$$
An operator is a map between spaces of functions, such as the maps $\Psi$ and $\Psi_n$. We will postpone further discussion of operators and continuous invertibility until part II.

Returning to our Kaplan-Meier example, with $\Psi_n(S)(t) = \mathbb{P}_n\psi_{S,t}$ and $\Psi(S)(t) = P\psi_{S,t}$ as before, note that since $\mathcal{F} = \{\psi_{S,t} : S \in \Theta, t \in [0,\tau]\}$ is Donsker, we easily have that $\sqrt{n}(\Psi_n - \Psi)(\theta_0) \rightsquigarrow Z$, for $\theta_0 = S_0$ and some tight random $Z$. We also have that, for any $\{S_n\} \in \Theta$ converging uniformly to $S_0$,
$$\sup_{t \in [0,\tau]} P(\psi_{S_n,t} - \psi_{S_0,t})^2 \leq 2\sup_{t \in [0,\tau]}\int_0^t\left(\frac{S_n(t)}{S_n(u)} - \frac{S_0(t)}{S_0(u)}\right)^2 S_0(u)\,dG(u) + 2\sup_{t \in [0,\tau]}(S_n(t) - S_0(t))^2 \to 0.$$

This can be shown to imply (2.12). After some analysis, $\Psi$ can be shown to be Fréchet-differentiable at $S_0$, with derivative
$$(2.16)\qquad \dot{\Psi}_{\theta_0}(h)(t) = -\int_0^t \frac{S_0(t)h(u)}{S_0(u)}\,dG(u) - L(t)h(t),$$
for all $h \in D[0,\tau]$, having continuous inverse

$$(2.17)\qquad \dot{\Psi}_{\theta_0}^{-1}(a)(t) = -S_0(t)\times\left(a(0) + \int_0^t \frac{1}{L(u-)S_0(u-)}\left[da(u) + \frac{a(u)\,dF_0(u)}{S_0(u)}\right]\right),$$
for all $a \in D[0,\tau]$. Thus all of the conditions of theorem 2.11 are satisfied, and we obtain the desired weak convergence of $\sqrt{n}(\hat{S}_n - S_0)$ to a tight, mean zero Gaussian process. The covariance of this process is
$$V(s,t) = S_0(s)S_0(t)\int_0^{s \wedge t}\frac{dF_0(u)}{L(u-)S_0(u)S_0(u-)},$$
which can be derived after lengthy but straightforward calculations (which we omit).

Returning to general Z-estimators, there are a number of methods for showing that the conditional law of a bootstrapped Z-estimator, given the observed data, converges to the limiting law of the original Z-estimator. One important approach which is applicable to non-i.i.d. data involves establishing Hadamard-differentiability of the map $\phi$ which extracts a zero from the function $\Psi$. We will explore this approach in part II. We close this section with a simple bootstrap result for the setting where $\Psi_n(\theta)(h) = \mathbb{P}_n\psi_{\theta,h}$ and $\Psi(\theta)(h) = P\psi_{\theta,h}$, for random and fixed real maps indexed by $\theta \in \Theta$ and $h \in \mathcal{H}$. Assume that $\Psi(\theta_0)(h) = 0$ for some $\theta_0 \in \Theta$ and all $h \in \mathcal{H}$, that $\sup_{h \in \mathcal{H}}|\Psi(\theta_n)(h)| \to 0$ implies $\|\theta_n - \theta_0\| \to 0$ for any sequence $\{\theta_n\} \in \Theta$, and that $\Psi$ is Fréchet-differentiable with continuously invertible derivative $\dot{\Psi}_{\theta_0}$. Also assume that $\mathcal{F} = \{\psi_{\theta,h} : \theta \in \Theta, h \in \mathcal{H}\}$ is $P$-Glivenko-Cantelli with $\sup_{\theta \in \Theta, h \in \mathcal{H}} P|\psi_{\theta,h}| < \infty$. Furthermore, assume that $\mathcal{G} = \{\psi_{\theta,h} : \theta \in \Theta, \|\theta - \theta_0\| \leq \delta, h \in \mathcal{H}\}$, where $\delta > 0$, is $P$-Donsker and that $\sup_{h \in \mathcal{H}} P(\psi_{\theta_n,h} - \psi_{\theta_0,h})^2 \to 0$ for any sequence $\{\theta_n\} \in \Theta$ with $\|\theta_n - \theta_0\| \to 0$. Then, using arguments similar to those used in the Kaplan-Meier example and with the help of theorems 2.10 and 2.11, we have that if $\hat{\theta}_n$ satisfies $\sup_{h \in \mathcal{H}}\sqrt{n}|\Psi_n(\hat{\theta}_n)(h)| \overset{P}{\to} 0$, then $\|\hat{\theta}_n - \theta_0\| \overset{P}{\to} 0$ and $\sqrt{n}(\hat{\theta}_n - \theta_0) \rightsquigarrow -\dot{\Psi}_{\theta_0}^{-1}(Z)$, where $Z$ is the tight limiting distribution of $\sqrt{n}(\Psi_n(\theta_0) - \Psi(\theta_0))$.

Let $\Psi_n^\circ(\theta)(h) = \mathbb{P}_n^\circ\psi_{\theta,h}$, where $\mathbb{P}_n^\circ$ is either the nonparametric bootstrap $\hat{\mathbb{P}}_n$ or the multiplier bootstrap $\tilde{\mathbb{P}}_n$ defined in section 2.2.3, and define the bootstrap estimator $\hat{\theta}_n^\circ \in \Theta$ to be a minimizer of $\sup_{h \in \mathcal{H}}|\Psi_n^\circ(\theta)(h)|$ over $\theta \in \Theta$. We will prove in part II that these conditions are more than enough

to ensure that $\sqrt{n}(\hat{\theta}_n^\circ - \hat{\theta}_n) \overset{\mathsf{P}}{\underset{W}{\rightsquigarrow}} -\dot{\Psi}_{\theta_0}^{-1}(Z)$, where $W$ refers to either the nonparametric or multiplier bootstrap weights. Thus the bootstrap is valid. These conditions for the bootstrap are satisfied in the Kaplan-Meier example, for either the nonparametric or multiplier weights, thus enabling the construction of confidence bands for $S_0(t)$ over $t \in [0,\tau]$.
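A minimal sketch of this band construction with nonparametric bootstrap weights follows (the data-generating laws, grid, and choice of $\tau$ are arbitrary assumptions; the helper name km_on_grid is ours):

```python
import numpy as np

def km_on_grid(u, d, grid):
    """Kaplan-Meier survival estimate evaluated on a fixed time grid."""
    t_fail = np.unique(u[d == 1])
    deaths = np.array([np.sum((u == tf) & (d == 1)) for tf in t_fail])
    at_risk = np.array([np.sum(u >= tf) for tf in t_fail])
    s = np.cumprod(1.0 - deaths / at_risk)
    idx = np.searchsorted(t_fail, grid, side="right") - 1   # step function
    return np.where(idx >= 0, s[np.maximum(idx, 0)], 1.0)

rng = np.random.default_rng(3)
n = 300
t, c = rng.exponential(1.0, n), rng.exponential(1.5, n)
u, d = np.minimum(t, c), (t <= c).astype(int)
grid = np.linspace(0.0, 1.5, 60)                  # [0, tau], tau = 1.5 arbitrary
s_hat = km_on_grid(u, d, grid)

B = 500
sup_dev = np.empty(B)
for b in range(B):
    i = rng.integers(0, n, n)                     # nonparametric bootstrap
    sup_dev[b] = np.abs(km_on_grid(u[i], d[i], grid) - s_hat).max()
crit = np.quantile(sup_dev, 0.95)
band = (np.clip(s_hat - crit, 0, 1), np.clip(s_hat + crit, 0, 1))
print("95% band half-width:", crit)
```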

2.2.6 M-Estimators

An M-estimator $\hat{\theta}_n$ is the approximate maximum of a data-dependent function. To be more precise, let the parameter set be a metric space $(\Theta, d)$ and let $M_n : \Theta \to \mathbb{R}$ be a data-dependent real function. If $M_n(\hat{\theta}_n) = \sup_{\theta \in \Theta} M_n(\theta) - o_P(1)$, then $\hat{\theta}_n$ is an M-estimator. Maximum likelihood and least-squares (after changing the sign of the objective function) estimators are some of the most important examples, but there are many other examples as well. As with Z-estimators, the main statistical issues for M-estimators are consistency, weak convergence and validity of the bootstrap. Unlike Z-estimators, the rate of convergence for M-estimators is not necessarily $\sqrt{n}$, even for i.i.d. data, and finding the right rate can be quite challenging. For establishing consistency, $M_n$ is often an estimator of a fixed function $M : \Theta \to \mathbb{R}$. We now present the following consistency theorem (the proof of which is deferred to part II):

Theorem 2.12 Assume for some $\theta_0 \in \Theta$ that $\liminf_{n \to \infty} M(\theta_n) \geq M(\theta_0)$ implies $d(\theta_n, \theta_0) \to 0$ for any sequence $\{\theta_n\} \in \Theta$ (this is another identifiability condition). Then, for a sequence of estimators $\hat{\theta}_n \in \Theta$,

(i) If $M_n(\hat{\theta}_n) = \sup_{\theta \in \Theta} M_n(\theta) - o_P(1)$ and $\sup_{\theta \in \Theta}|M_n(\theta) - M(\theta)| \overset{P}{\to} 0$, then $d(\hat{\theta}_n, \theta_0) \overset{P}{\to} 0$.

(ii) If $M_n(\hat{\theta}_n) = \sup_{\theta \in \Theta} M_n(\theta) - o_{as*}(1)$ and $\sup_{\theta \in \Theta}|M_n(\theta) - M(\theta)| \overset{as*}{\to} 0$, then $d(\hat{\theta}_n, \theta_0) \overset{as*}{\to} 0$.

Suppose, for now, we know that the rate of convergence for the M-estimator $\hat{\theta}_n$ is $r_n$, or, in other words, we know that $Z_n = r_n(\hat{\theta}_n - \theta_0) = O_P(1)$. $Z_n$ can now be re-expressed as the approximate maximum of the criterion function $h \mapsto H_n(h) = M_n(\theta_0 + h/r_n)$ for $h$ ranging over some metric space $\mathcal{H}$. If the argmax of $H_n$ over bounded subsets of $\mathcal{H}$ can now be shown to converge weakly to the argmax of a tight limiting process $H$ over the same bounded subsets, then $Z_n$ converges weakly to $\arg\max_{h \in \mathcal{H}} H(h)$. We will postpone the technical challenges associated with determining these rates of convergence until part II, and restrict ourselves to an interesting special case involving Euclidean parameters, where the rate is known to be $\sqrt{n}$. The proof of the following theorem is also deferred to part II:

Theorem 2.13 Let $X_1,\ldots,X_n$ be i.i.d. with sample space $\mathcal{X}$ and law $P$, and let $m_\theta : \mathcal{X} \to \mathbb{R}$ be measurable functions indexed by $\theta$ ranging over an open subset of Euclidean space $\Theta \subset \mathbb{R}^p$. Let $\theta_0$ be a bounded point of maximum of $Pm_\theta$ in the interior of $\Theta$, and assume, for some neighborhood $\Theta_0 \subset \Theta$ including $\theta_0$, that there exist measurable functions $\dot{m} : \mathcal{X} \to \mathbb{R}$ and $\dot{m}_{\theta_0} : \mathcal{X} \to \mathbb{R}^p$ satisfying
$$(2.18)\qquad |m_{\theta_1}(x) - m_{\theta_2}(x)| \leq \dot{m}(x)\|\theta_1 - \theta_2\|,$$
$$(2.19)\qquad P\left[m_\theta - m_{\theta_0} - (\theta - \theta_0)'\dot{m}_{\theta_0}\right]^2 = o(\|\theta - \theta_0\|^2),$$
$P\dot{m}^2 < \infty$, and $P\|\dot{m}_{\theta_0}\|^2 < \infty$, for all $\theta_1, \theta_2, \theta \in \Theta_0$. Assume also that $M(\theta) = Pm_\theta$ admits a second order Taylor expansion with nonsingular second derivative matrix $V$. Denote $M_n(\theta) = \mathbb{P}_n m_\theta$, and assume the approximate maximizer $\hat{\theta}_n$ satisfies $M_n(\hat{\theta}_n) = \sup_{\theta \in \Theta} M_n(\theta) - o_P(n^{-1})$ and $\|\hat{\theta}_n - \theta_0\| \overset{P}{\to} 0$. Then $\sqrt{n}(\hat{\theta}_n - \theta_0) \rightsquigarrow -V^{-1}Z$, where $Z$ is the limiting Gaussian distribution of $\mathbb{G}_n\dot{m}_{\theta_0}$.

Consider, for example, least absolute deviation regression. In this setting, we have i.i.d. random vectors $U_1,\ldots,U_n$ in $\mathbb{R}^p$ and random errors $e_1,\ldots,e_n$, but we observe only the data $X_i = (Y_i, U_i)$, where $Y_i = \theta_0'U_i + e_i$, $i = 1,\ldots,n$. The least-absolute-deviation estimator $\hat{\theta}_n$ minimizes the function $\theta \mapsto \mathbb{P}_n\tilde{m}_\theta$, where $\tilde{m}_\theta(X) = |Y - \theta'U|$. Since a minimizer of a criterion function $M_n$ is also a maximizer of $-M_n$, M-estimation methods can be used in this context with only a change in sign. Although boundedness of the parameter space $\Theta$ is not necessary for this regression setting, we restrict, for ease of discourse, $\Theta$ to be a bounded, open subset of $\mathbb{R}^p$ containing $\theta_0$. We also assume that the distribution of the errors $e_i$ has median zero and positive density at zero, which we denote $f(0)$, and that $P[UU']$ is positive definite. Note that since we are not assuming $E|e_i| < \infty$, it is possible that $P\tilde{m}_\theta = \infty$ for all $\theta \in \Theta$. Since minimizing $\mathbb{P}_n\tilde{m}_\theta$ is the same as minimizing $\mathbb{P}_n m_\theta$, where $m_\theta = \tilde{m}_\theta - \tilde{m}_{\theta_0}$, we will use $M_n(\theta) = \mathbb{P}_n m_\theta$ as our criterion function hereafter (without modifying the estimator $\hat{\theta}_n$). By the definition of $Y$, $m_\theta(X) = |e - (\theta - \theta_0)'U| - |e|$, and we now have that $Pm_\theta \leq \|\theta - \theta_0\|(E\|U\|^2)^{1/2} < \infty$ for all $\theta \in \Theta$. Since
$$(2.20)\qquad |m_{\theta_1}(x) - m_{\theta_2}(x)| \leq \|\theta_1 - \theta_2\| \times \|u\|,$$
it is not hard to show that the class of functions $\{m_\theta : \theta \in \Theta\}$ is $P$-Glivenko-Cantelli. It can also be shown that $Pm_\theta \geq 0$ with equality only when $\theta = \theta_0$. Hence theorem 2.12, part (ii), yields that $\hat{\theta}_n \overset{as*}{\to} \theta_0$.

Now we consider $M(\theta) = Pm_\theta$. By conditioning on $U$, one can show after some analysis that $M(\theta)$ is two times continuously differentiable, with second derivative $V = 2f(0)P[UU']$ at $\theta_0$. Note that (2.20) satisfies condition (2.18); and with $\dot{m}_{\theta_0}(X) = -U\,\mathrm{sign}(e)$, we also have that
$$\left|m_\theta(X) - m_{\theta_0}(X) - (\theta - \theta_0)'\dot{m}_{\theta_0}(X)\right| \leq 2 \cdot 1\{|e| \leq |(\theta - \theta_0)'U|\}\,|(\theta - \theta_0)'U|$$
satisfies condition (2.19). Thus all the conditions of theorem 2.13 are satisfied. Hence $\sqrt{n}(\hat{\theta}_n - \theta_0)$ is asymptotically mean zero normal, with variance $V^{-1}P[\dot{m}_{\theta_0}\dot{m}_{\theta_0}']V^{-1} = (P[UU'])^{-1}/(4f^2(0))$. This variance is not difficult to estimate from the data, but we postpone presenting the details.

Another technique for obtaining weak convergence of M-estimators which are $\sqrt{n}$ consistent is to first establish consistency and then take an appropriate derivative of the criterion function $M_n(\theta)$, $\Psi_n(\theta)(h)$, for $h$ ranging over some index set $\mathcal{H}$, and apply Z-estimator techniques to $\Psi_n$. This works because the derivative of a smooth criterion function at an approximate maximizer is approximately zero.
This approach facilitates establishing the validity of the bootstrap, since such validity is often easier to obtain for Z-estimators than for M-estimators. It is also applicable to certain nonparametric maximum likelihood estimators which we will consider in part III.
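A Monte Carlo sketch of the least absolute deviation example closes this section (the $t_3$ error law and standard normal covariates are assumptions chosen so that $f(0)$ is known and $P[UU'] = I$): the scaled estimator's covariance should approach $(P[UU'])^{-1}/(4f^2(0))$.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import t as t_dist

rng = np.random.default_rng(4)
p, n, reps = 2, 400, 300
theta0 = np.array([1.0, -0.5])

def lad_fit(y, u):
    # minimize theta -> P_n|Y - theta'U|  (an M-estimator after a sign flip)
    obj = lambda th: np.abs(y - u @ th).mean()
    return minimize(obj, x0=np.zeros(p), method="Nelder-Mead").x

est = np.empty((reps, p))
for r in range(reps):
    u = rng.normal(size=(n, p))                # P[UU'] = identity
    e = rng.standard_t(df=3, size=n)           # median-zero errors
    est[r] = lad_fit(u @ theta0 + e, u)

f0 = t_dist(df=3).pdf(0.0)                     # density of e at its median
v_theory = np.eye(p) / (4 * f0 ** 2)           # (P[UU'])^{-1} / (4 f^2(0))
v_mc = np.cov(np.sqrt(n) * (est - theta0), rowvar=False)
print(np.round(v_theory, 2))
print(np.round(v_mc, 2))
```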

2.3 Other Topics

In addition to the empirical process topics outlined in the previous sections, we will cover a few other related topics in part II, including results for sums of independent but not identically distributed stochastic processes and, briefly, for dependent but stationary processes. However, there are a number of interesting empirical process topics we will not pursue in later chapters, including general results for convergence of nets. In the remainder of this section, we briefly outline a few additional topics not covered later which involve sequences of empirical processes based on i.i.d. data. For simplicity, we will primarily restrict ourselves to the empirical process $\mathbb{G}_n = \sqrt{n}(\mathbb{F}_n - F)$, although many of these results have extensions which apply to more general empirical processes. The law of the iterated logarithm for $\mathbb{G}_n$ states that

$$(2.21)\qquad \limsup_{n \to \infty}\frac{\|\mathbb{G}_n\|_\infty}{\sqrt{2\log\log n}} \leq \frac{1}{2},\quad \text{a.s.},$$
with equality if $1/2$ is in the range of $F$, where $\|\cdot\|_\infty$ is the uniform norm. This can be generalized to empirical processes on $P$-Donsker classes $\mathcal{F}$ which have a measurable envelope with bounded second moment (Dudley and Philipp, 1983):
$$\limsup_{n \to \infty}\frac{\sup_{f \in \mathcal{F}}|\mathbb{G}_n(f)|^*}{\sqrt{(2\log\log n)\sup_{f \in \mathcal{F}}P|f - Pf|^2}} \leq 1,\quad \text{a.s.}$$

Result (2.21) can be further strengthened to Strassen's (1964) theorem, which states that on a set with probability 1, the set of all limiting paths of $(2\log\log n)^{-1/2}\,\mathbb{G}_n$ is exactly the set of all functions of the form $h(F)$,

where $h(0) = h(1) = 0$ and $h$ is absolutely continuous with derivative $h'$ satisfying $\int_0^1[h'(s)]^2 ds \leq 1$.

While the previous results give upper bounds on $\|\mathbb{G}_n\|_\infty$, it is also known that
$$\liminf_{n \to \infty}\sqrt{2\log\log n}\,\|\mathbb{G}_n\|_\infty = \frac{\pi}{2},\quad \text{a.s.},$$
implying that the smallest uniform distance between $\mathbb{F}_n$ and $F$ is at least $O(1/\sqrt{n\log\log n})$.

A topic of interest regarding Donsker theorems is the closeness of the empirical process sample paths to the limiting Brownian bridge sample paths. The strongest result on this question for the empirical process $\mathbb{G}_n$ is the KMT construction, named after Komlós, Major and Tusnády (1975, 1976). The KMT construction states that there exist fixed positive constants $a$, $b$, and $c$, and a sequence of standard Brownian bridges $\{B_n\}$, such that
$$P\left(\|\mathbb{G}_n - B_n(F)\|_\infty > \frac{a\log n + x}{\sqrt{n}}\right) \leq b e^{-cx},$$
for all $x > 0$ and $n \geq 1$. This powerful result can be shown to imply both
$$\limsup_{n \to \infty}\frac{\sqrt{n}}{\log n}\|\mathbb{G}_n - B_n(F)\|_\infty < \infty,\quad \text{a.s., and}\qquad \limsup_{n \to \infty}E\left(\frac{\sqrt{n}}{\log n}\|\mathbb{G}_n - B_n(F)\|_\infty\right)^m < \infty,$$
for all $0 < m < \infty$.
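A quick numerical illustration of (2.21) follows (only a sketch: it looks at single realizations for a few $n$, while the statement concerns the limit superior along the whole sequence). For uniform data, $\|\mathbb{G}_n\|_\infty$ is the scaled Kolmogorov-Smirnov distance, and the displayed ratio should settle near or below $1/2$.

```python
import numpy as np

rng = np.random.default_rng(5)
for n in [10**3, 10**4, 10**5, 10**6]:
    x = np.sort(rng.uniform(size=n))
    # ||G_n||_inf = sqrt(n) * sup_t |F_n(t) - t| for F uniform on [0, 1]
    up = np.abs(np.arange(1, n + 1) / n - x).max()
    lo = np.abs(np.arange(0, n) / n - x).max()
    ratio = np.sqrt(n) * max(up, lo) / np.sqrt(2 * np.log(np.log(n)))
    print(n, round(ratio, 3))
```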

2.4 Exercises

2.4.1. Let $X, Y$ be a pair of real random numbers with joint distribution $P$. Compute upper bounds for $N_{[]}(\epsilon, \mathcal{F}, L_r(P))$, for $r = 1, 2$, where $\mathcal{F} = \{1\{X \leq s, Y \leq t\} : s, t \in \mathbb{R}\}$.

2.4.2. Prove theorem 2.10.

2.4.3. Consider the Z-estimation framework for the Kaplan-Meier estimator discussed in section 2.2.5. Let $\Psi(S)(t)$ be as defined in (2.11). Show that $\Psi$ is Fréchet-differentiable at $S_0$, with derivative $\dot{\Psi}_{\theta_0}(h)(t)$ given by (2.16), for all $h \in D[0,\tau]$.

2.4.4. Continuing with the set-up of the previous problem, show that $\dot{\Psi}_{\theta_0}$ is continuously invertible, with inverse $\dot{\Psi}_{\theta_0}^{-1}$ given in (2.17). The following approach may be easiest: First show that for any $a \in D[0,\tau]$,

$h(t) = \dot{\Psi}_{\theta_0}^{-1}(a)(t)$ satisfies $\dot{\Psi}_{\theta_0}(h)(t) = a(t)$. The following identity may be helpful:
$$d\left(\frac{a(t)}{S_0(t)}\right) = \frac{da(t)}{S_0(t-)} + \frac{a(t)\,dF_0(t)}{S_0(t-)S_0(t)}.$$
Now show that there exists an $M < \infty$ such that $\|\dot{\Psi}_{\theta_0}^{-1}(a)\| \leq M\|a\|$, where $\|\cdot\|$ is the uniform norm. This then implies that there exists a $c > 0$ such that $\|\dot{\Psi}_{\theta_0}(h)\| \geq c\|h\|$.

2.5 Notes

Theorem 2.1 is a composite of theorems 1.5.4 and 1.5.7 of van der Vaart and Wellner (1996) (hereafter abbreviated VW). Theorems 2.2, 2.3, 2.4 and 2.5 correspond to theorems 19.4, 19.5, 19.13 and 19.14, respectively, of van der Vaart (1998). The if-and-only-if implications of (2.8) are described in VW, page 73. The implications (i)⇔(ii) in theorems 2.6 and 2.7 are given in theorems 3.6.1 and 3.6.2, respectively, of VW. Theorems 2.8 and 2.11 correspond to theorems 3.9.4 and 3.3.1 of VW, while theorem 2.13 comes from example 3.2.22 of VW.

3 Overview of Semiparametric Inference

This chapter presents an overview of the main ideas and techniques of semiparametric inference, with particular emphasis on semiparametric efficiency. The major distinction between this kind of efficiency and the standard notion of efficiency for parametric maximum likelihood estimators, as expressed in the Cramér-Rao lower bound, is the presence of an infinite-dimensional nuisance parameter in semiparametric models. Proofs and other technical details will generally be postponed until part III. In the first section, we define and sketch the main features of semiparametric models and semiparametric efficiency. The second section discusses efficient score functions and estimating equations and their connection to efficient estimation. The third section discusses nonparametric maximum likelihood estimation, the main tool for constructing efficient estimators. The fourth and final section briefly discusses several additional related topics, including variance estimation and confidence band construction for efficient estimators.

3.1 Semiparametric Models and Efficiency

A is a collection of probability measures {P ∈P}on a sam- ple space X . Such models can be expressed in the form P = {Pθ : θ ∈ Θ}, where Θ is some parameter space. Semiparametric models are statistical models where Θ has one or more infinite-dimensional component. For ex- ample, the parameter space for the linear regression model (1.1), where 34 3. Overview of Semiparametric Inference

$Y = \beta'Z + e$, consists of two components, a subset of $p$-dimensional Euclidean space (for the regression parameter $\beta$) and the infinite-dimensional space of all joint distribution functions of $(e,Z)$ with $E[e|Z] = 0$ and $E[e^2|Z] \leq K < \infty$ almost surely.

The goal of semiparametric inference is to construct optimal estimators and test statistics for evaluating semiparametric model parameters. The parameter component of interest can be succinctly expressed as a function of the form $\psi : \mathcal{P} \to \mathbb{D}$, where $\psi$ extracts the component of interest and takes values in $\mathbb{D}$. For now, we assume $\mathbb{D}$ is finite dimensional. As an illustration, if we are interested in the unconditional variance of the residual errors in the regression model (1.1), $\psi$ would be the map $\psi(P) = \int_{\mathbb{R}} e^2\,dF(e)$, where $F(t) = P[e \leq t]$ is the unconditional residual distribution component of $P$.

Throughout this book, the statistics of interest will be based on an i.i.d. sample, $X_1,\ldots,X_n$, of realizations from some $P \in \mathcal{P}$. An estimator $T_n$ of the parameter $\psi(P)$, based on such a sample, is efficient if the limiting variance $V$ of $\sqrt{n}(T_n - \psi(P))$ is the smallest possible among all regular estimators of $\psi(P)$. The inverse of $V$ is the information for $T_n$. Regularity will be defined more explicitly later in this section, but suffice it to say for now that the limiting distribution of $\sqrt{n}(T_n - \psi(P))$ (as $n \to \infty$), for a regular estimator $T_n$, is continuous in $P$. Note that as we are changing $P$, the distribution of $T_n$ changes as well as the parameter $\psi(P)$. Not all estimators are regular, but most commonly used estimators in statistics are. Optimality of test statistics is closely related to efficient estimation, in that the most powerful test statistics for a hypothesis about a parameter are usually based on efficient estimators for that parameter.

The optimal efficiency for estimators of a parameter $\psi(P)$ depends in part on the complexity of the model $\mathcal{P}$. Estimation under the model $\mathcal{P}$ is more taxing than estimation under any parametric submodel $\mathcal{P}_0 = \{P_\theta : \theta \in \Theta_0\} \subset \mathcal{P}$, where $\Theta_0$ is finite dimensional. Thus the information for estimation under model $\mathcal{P}$ is worse than the information under any parametric submodel $\mathcal{P}_0$. If the information for the regular estimator $T_n$ is equal to the minimum of the information over all efficient estimators for all parametric submodels $\mathcal{P}_0$, then $T_n$ is semiparametric efficient. For semiparametric models, this minimum is the best possible, since the only models with more information are parametric models. A parametric submodel which achieves this minimum, if such a model exists, is called a least favorable or hardest submodel. Note that efficient estimators for parametric models are trivially semiparametric efficient since such models are their own parametric submodels.

Fortunately, finding the minimum information over parametric submodels usually only requires consideration of one-dimensional parametric submodels $\{P_t : t \in N\}$ surrounding representative distributions $P \in \mathcal{P}$, where $N = [0,\epsilon)$ for some $\epsilon > 0$, $P_0 = P$, and $P_t \in \mathcal{P}$ for all $t \in N$. If $\mathcal{P}$ has a dominating measure $\mu$, then each $P \in \mathcal{P}$ can be expressed as a density $p$. In this case, we require the submodels around a representative density $p$ to be smooth enough so that the real function $g(x) = \partial\log p_t(x)/(\partial t)|_{t=0}$ exists with $\int_{\mathcal{X}} g^2(x)p(x)\mu(dx) < \infty$. This idea can be restated in sufficient generality to allow for models which may not be dominated. In this more general case, we require
$$(3.1)\qquad \int\left[\frac{(dP_t(x))^{1/2} - (dP(x))^{1/2}}{t} - \frac{1}{2}g(x)(dP(x))^{1/2}\right]^2 \to 0,$$
as $t \downarrow 0$.
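As a quick numerical check of (3.1), the property named directly below, consider the assumed example $P = \mathrm{Uniform}(0,1)$ with bounded score $g(x) = x - 1/2$ and $dP_t = (1 + tg)\,dP$ (arbitrary illustrative choices, not from the text); the squared remainder should vanish as $t \downarrow 0$:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200001)
p = np.ones_like(x)              # density of P = Uniform(0, 1)
g = x - 0.5                      # bounded score with Pg = 0, Pg^2 finite

for t in [0.1, 0.01, 0.001]:
    pt = 1.0 + t * g             # density of dP_t = (1 + t g) dP
    rem = (np.sqrt(pt) - np.sqrt(p)) / t - 0.5 * g * np.sqrt(p)
    # Riemann approximation of the squared-remainder integral over [0, 1]
    print(t, (rem ** 2).mean())  # decreases like t^2, consistent with (3.1)
```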
In this setting, we say that the submodel $\{P_t : t \in N\}$ is differentiable in quadratic mean at $t = 0$, with score function $g : \mathcal{X} \to \mathbb{R}$.

In evaluating efficiency, it is necessary to consider many such one-dimensional submodels surrounding the representative $P$, each with a different score function. Such a collection of score functions is called a tangent set of the model $\mathcal{P}$ at $P$, and is denoted $\dot{\mathcal{P}}_P$. Because $Pg = 0$ and $Pg^2 < \infty$ for any $g \in \dot{\mathcal{P}}_P$, such tangent sets are subsets of $L_2^0(P)$, the space of all functions $h : \mathcal{X} \to \mathbb{R}$ with $Ph = 0$ and $Ph^2 < \infty$. Note that, as we have done here, we will sometimes omit function arguments for simplicity, provided the context is clear. When the tangent set is closed under linear combinations, it is called a tangent space. Usually, one can take the closed linear span (the closure under linear combinations) of a tangent set to make a tangent space.

Consider $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$, where $\Theta$ is an open subset of $\mathbb{R}^k$. Assume that $\mathcal{P}$ is dominated by $\mu$, and that the classical score function $\dot{\ell}_\theta(x) = \partial\log p_\theta(x)/(\partial\theta)$ exists with $P_\theta[\dot{\ell}_\theta\dot{\ell}_\theta']$ bounded. Now for each $h \in \mathbb{R}^k$, let $\epsilon > 0$ be small enough so that $\{P_t : t \in N\} \subset \mathcal{P}$, where for each $t \in N$, $P_t = P_{\theta + th}$. One can show that each of these one-dimensional submodels satisfies (3.1), with $g = h'\dot{\ell}_\theta$, resulting in the tangent space $\dot{\mathcal{P}}_{P_\theta} = \{h'\dot{\ell}_\theta : h \in \mathbb{R}^k\}$. Thus there is a simple connection between the classical score function and the more general idea of tangent sets and tangent spaces. Continuing with the parametric setting, the estimator $\hat{\theta}$ is efficient for estimating $\theta$ if it is regular with information achieving the Cramér-Rao lower bound $P[\dot{\ell}_\theta\dot{\ell}_\theta']$. Thus the tangent set for the model contains information about the optimal efficiency. This is also true for semiparametric models in general, although the relationship between tangent sets and the optimal information is more complex.

Consider estimation of the parameter $\psi(P) \in \mathbb{R}^k$ for the semiparametric model $\mathcal{P}$. For any estimator $T_n$ of $\psi(P)$, if $\sqrt{n}(T_n - \psi(P)) = \sqrt{n}\mathbb{P}_n\check{\psi}_P + o_P(1)$, where $o_P(1)$ denotes a quantity going to zero in probability, then $\check{\psi}_P$ is an influence function for $\psi(P)$ and $T_n$ is asymptotically linear. For a given tangent set $\dot{\mathcal{P}}_P$, assume for each submodel $\{P_t : t \in N\}$ satisfying (3.1) with some $g \in \dot{\mathcal{P}}_P$ and some $\epsilon > 0$, that $d\psi(P_t)/(dt)|_{t=0} = \dot{\psi}_P(g)$ for some linear map $\dot{\psi}_P : L_2^0(P) \to \mathbb{R}^k$. In this setting, we say that $\psi$ is differentiable at $P$ relative to $\dot{\mathcal{P}}_P$. When $\dot{\mathcal{P}}_P$ is a linear space, there exists a measurable function $\tilde{\psi}_P : \mathcal{X} \to \mathbb{R}^k$ such that $\dot{\psi}_P(g) = P[\tilde{\psi}_P(X)g(X)]$, for each $g \in \dot{\mathcal{P}}_P$. The function $\tilde{\psi}_P \in \dot{\mathcal{P}}_P \subset L_2^0(P)$ is unique and is called the efficient influence function for the parameter $\psi$ in the model $\mathcal{P}$ (relative to the tangent space $\dot{\mathcal{P}}_P$). Here, and throughout the book, we abuse notation slightly by declaring that a random vector is in a given linear space if and only if each component of the vector is. Frequently, the efficient influence function can be found by taking a candidate influence function $\check{\psi}_P \in L_2^0(P)$ and projecting it onto $\dot{\mathcal{P}}_P$ to obtain $\tilde{\psi}_P$. If $\sqrt{n}(T_n - \psi(P))$ is asymptotically equivalent to $\sqrt{n}\mathbb{P}_n\tilde{\psi}_P$, then $T_n$ can be shown to be semiparametric efficient (which we refer to hereafter simply as "efficient").

Consider again the parametric example, with $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$, where $\Theta \subset \mathbb{R}^k$. Suppose the parameter of interest is $\psi(P_\theta) = f(\theta)$, where $f : \mathbb{R}^k \to \mathbb{R}^d$ has derivative $\dot{f}_\theta$ at $\theta$, and that the Fisher information matrix $I_\theta = P[\dot{\ell}_\theta\dot{\ell}_\theta']$ is invertible.
It is not hard to show that $\dot{\psi}_P(g) = \dot{f}_\theta I_\theta^{-1}P[\dot{\ell}_\theta(X)g(X)]$, and thus $\tilde{\psi}_P(x) = \dot{f}_\theta I_\theta^{-1}\dot{\ell}_\theta(x)$. Any estimator $T_n$ for which $\sqrt{n}(T_n - f(\theta))$ is asymptotically equivalent to $\sqrt{n}\mathbb{P}_n[\dot{f}_\theta I_\theta^{-1}\dot{\ell}_\theta]$ has asymptotic variance equal to the Cramér-Rao lower bound $\dot{f}_\theta I_\theta^{-1}\dot{f}_\theta'$.

Returning to the semiparametric setting, any one-dimensional submodel $\{P_t : t \in N\}$, satisfying (3.1) for the score function $g \in \dot{\mathcal{P}}_P$ and some $\epsilon > 0$, is a parametric model with parameter $t$. The Fisher information for $t$, evaluated at $t = 0$, is $Pg^2$. Thus the Cramér-Rao lower bound for estimating a univariate parameter $\psi(P)$ based on this model is $(P[\tilde{\psi}_P g])^2/Pg^2$, since $d\psi(P_t)/(dt)|_{t=0} = P[\tilde{\psi}_P g]$. Provided that $\tilde{\psi}_P \in \dot{\mathcal{P}}_P$ and all necessary derivatives exist, the maximum Cramér-Rao lower bound over all such submodels in $\dot{\mathcal{P}}_P$ is thus $P\tilde{\psi}_P^2$. For more general Euclidean parameters $\psi(P)$, this lower bound on the asymptotic variance is $P[\tilde{\psi}_P\tilde{\psi}_P']$.

Hence, for $P[\tilde{\psi}_P\tilde{\psi}_P']$ to be the upper bound for all parametric submodels, the tangent set must be sufficiently large. Obviously, the tangent set must also be restricted to score functions which reflect valid submodels. In addition, the larger the tangent set, the fewer the number of regular estimators. To see this, we will now provide a more precise definition of regularity. Let $\{P_{t,g}\}$ denote a submodel $\{P_t : t \in N\}$ satisfying (3.1) for the score $g$ and some $\epsilon > 0$. $T_n$ is regular for $\psi(P)$ if the limiting distribution of $\sqrt{n}(T_n - \psi(P_{1/\sqrt{n},g}))$, over the sequence of distributions $P_{1/\sqrt{n},g}$, exists and is constant over all $g \in \dot{\mathcal{P}}_P$. Thus the tangent set must be chosen to be large enough but not too large. Fortunately, most estimators in common use for semiparametric inference are regular for large tangent sets, and thus there is usually quite a lot to be gained by making the effort to obtain an efficient estimator. Once the tangent set has been identified, the corresponding efficient estimator $T_n$ of $\psi(P)$ will always be asymptotically linear with influence function equal to the efficient influence function $\tilde{\psi}_P$.

Consider, for example, the unrestricted model $\mathcal{P}$ of all distributions on $\mathcal{X}$. Suppose we are interested in estimating $\psi(P) = Pf$ for some $f \in L_2(P)$, the space of all measurable functions $h$ with $Ph^2 < \infty$. For bounded $g \in L_2^0(P)$, the one-dimensional submodel $\{P_t : dP_t = (1 + tg)dP, t \in N\} \subset \mathcal{P}$ for $\epsilon$ small enough. Furthermore, (3.1) is satisfied with $\partial\psi(P_t)/(\partial t)|_{t=0} = P[fg]$. It is not hard to show, in fact, that one-dimensional submodels satisfying (3.1) with $\partial\psi(P_t)/(\partial t)|_{t=0} = P[fg]$ exist for all $g \in L_2^0(P)$. Thus $\dot{\mathcal{P}}_P = L_2^0(P)$ is a tangent set for the unrestricted model and $\tilde{\psi}_P(x) = f(x) - Pf$ is the corresponding efficient influence function. Since $\sqrt{n}(\mathbb{P}_n f - \psi(P)) = \sqrt{n}\mathbb{P}_n\tilde{\psi}_P$, $\mathbb{P}_n f$ is efficient for estimating $Pf$. Note that, in this unrestricted model, $\dot{\mathcal{P}}_P$ is the maximal possible tangent set, since all tangent sets must be subsets of $L_2^0(P)$. In general, the size of a tangent set reflects the amount of restrictions placed on a model, in that larger tangent sets reflect fewer restrictions.

Efficiency can also be established for infinite dimensional parameters $\psi(P)$ when $\sqrt{n}$ consistent regular estimators for $\psi(P)$ exist. There are a number of ways of expressing efficiency in this context, but we will only mention the convolution approach here.
The convolution theorem states that for any regular estimator $T_n$ of $\psi(P)$, $\sqrt{n}(T_n - \psi(P))$ has a weak limiting distribution which is the convolution of a Gaussian process $Z$ and an independent process $M$, where $Z$ has the same limiting distribution as $\sqrt{n}\mathbb{P}_n\tilde{\psi}_P$. In other words, an inefficient estimator always has an asymptotically non-negligible independent noise process $M$ added to the efficient estimator distribution. A regular estimator $T_n$ for which $M$ is zero is efficient. Occasionally, we will use the term uniformly efficient when it is helpful to emphasize the fact that $\psi(P)$ is infinite dimensional. If $\psi(P)$ is indexed by $t \in T$, and if $T_n(t)$ is efficient for $\psi(P)(t)$ for each $t \in T$, then it can be shown that weak convergence of $\sqrt{n}(T_n - \psi(P))$ to a tight, mean zero Gaussian process implies uniform efficiency of $T_n$. Another important fact is that if $T_n$ is an efficient estimator for $\psi(P)$ and $\phi$ is a suitable Hadamard-differentiable function, then $\phi(T_n)$ is an efficient estimator for $\phi(\psi(P))$. We will make these results more explicit in part III.
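The convolution theorem can be glimpsed in a two-line simulation (a sketch under an assumed normal location model, where the sample mean is efficient): the sample median is regular but carries extra independent noise, inflating the limiting variance from 1 to $\pi/2$.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 500, 4000
x = rng.normal(size=(reps, n))
z_mean = np.sqrt(n) * x.mean(axis=1)         # efficient: variance -> 1
z_med = np.sqrt(n) * np.median(x, axis=1)    # regular but inefficient
print("mean:  ", round(z_mean.var(), 3))
print("median:", round(z_med.var(), 3), "(compare pi/2 =", round(np.pi / 2, 3), ")")
```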

3.2 Score Functions and Estimating Equations

A parameter $\psi(P)$ of particular interest is the parametric component $\theta$ of a semiparametric model $\{P_{\theta,\eta} : \theta \in \Theta, \eta \in H\}$, where $\Theta$ is an open subset of $\mathbb{R}^k$ and $H$ is an arbitrary subset that may be infinite dimensional. Tangent sets can be used to develop an efficient estimator for $\psi(P_{\theta,\eta}) = \theta$ through the formation of an efficient score function. In this setting, we consider submodels of the form $\{P_{\theta + ta, \eta_t}, t \in N\}$ which are differentiable in quadratic mean with score function $\partial\log dP_{\theta + ta, \eta_t}/(\partial t)|_{t=0} = a'\dot{\ell}_{\theta,\eta} + g$, where $a \in \mathbb{R}^k$, $\dot{\ell}_{\theta,\eta} : \mathcal{X} \to \mathbb{R}^k$ is the ordinary score for $\theta$ when $\eta$ is fixed, and where $g : \mathcal{X} \to \mathbb{R}$ is an element of a tangent set $\dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)}$ for the submodel $\mathcal{P}_\theta = \{P_{\theta,\eta} : \eta \in H\}$ (holding $\theta$ fixed). This tangent set is the tangent set for

$\eta$ and should be rich enough to reflect all parametric submodels of $\mathcal{P}_\theta$. The tangent set for the full model is $\dot{\mathcal{P}}_{P_{\theta,\eta}} = \{a'\dot{\ell}_{\theta,\eta} + g : a \in \mathbb{R}^k,\ g \in \dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)}\}$.

While $\psi(P_{\theta + ta, \eta_t}) = \theta + ta$ is clearly differentiable with respect to $t$, we also need, as in the previous section, that there exists a function $\tilde{\psi}_{\theta,\eta} : \mathcal{X} \to \mathbb{R}^k$ such that
$$(3.2)\qquad \frac{\partial\psi(P_{\theta + ta, \eta_t})}{\partial t}\bigg|_{t=0} = a = P\left[\tilde{\psi}_{\theta,\eta}\left(a'\dot{\ell}_{\theta,\eta} + g\right)\right],$$
for all $a \in \mathbb{R}^k$ and all $g \in \dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)}$. After setting $a = 0$, we see that such a function must be uncorrelated with all of the elements of $\dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)}$.

Define $\Pi_{\theta,\eta}$ to be the orthogonal projection onto the closed linear span of $\dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)}$ in $L_2^0(P_{\theta,\eta})$. We will describe how to obtain such projections in detail in part III, but suffice it to say that for any $h \in L_2^0(P_{\theta,\eta})$, $h = h - \Pi_{\theta,\eta}h + \Pi_{\theta,\eta}h$, where $\Pi_{\theta,\eta}h \in \dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)}$ but $P[(h - \Pi_{\theta,\eta}h)g] = 0$ for all $g \in \dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)}$. The efficient score function for $\theta$ is $\tilde{\ell}_{\theta,\eta} = \dot{\ell}_{\theta,\eta} - \Pi_{\theta,\eta}\dot{\ell}_{\theta,\eta}$, while the efficient information matrix for $\theta$ is $\tilde{I}_{\theta,\eta} = P[\tilde{\ell}_{\theta,\eta}\tilde{\ell}_{\theta,\eta}']$.

Provided that $\tilde{I}_{\theta,\eta}$ is nonsingular, the function $\tilde{\psi}_{\theta,\eta} = \tilde{I}_{\theta,\eta}^{-1}\tilde{\ell}_{\theta,\eta}$ satisfies (3.2) for all $a \in \mathbb{R}^k$ and all $g \in \dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)}$. Thus the functional (parameter) $\psi(P_{\theta,\eta}) = \theta$ is differentiable at $P_{\theta,\eta}$ relative to the tangent set $\dot{\mathcal{P}}_{P_{\theta,\eta}}$, with efficient influence function $\tilde{\psi}_{\theta,\eta}$. Hence the search for an efficient estimator of $\theta$ is over if one can find an estimator $T_n$ satisfying $\sqrt{n}(T_n - \theta) = \sqrt{n}\mathbb{P}_n\tilde{\psi}_{\theta,\eta} + o_P(1)$. Note that $\tilde{I}_{\theta,\eta} = I_{\theta,\eta} - P[\Pi_{\theta,\eta}\dot{\ell}_{\theta,\eta}(\Pi_{\theta,\eta}\dot{\ell}_{\theta,\eta})']$, where $I_{\theta,\eta} = P[\dot{\ell}_{\theta,\eta}\dot{\ell}_{\theta,\eta}']$. An intuitive justification for the form of the efficient score is that some information for estimating $\theta$ is lost due to a lack of knowledge about $\eta$. The amount subtracted off of the score, $\Pi_{\theta,\eta}\dot{\ell}_{\theta,\eta}$, is the minimum possible amount for regular estimators when $\eta$ is unknown.

Consider again the semiparametric regression model (1.1), where $Y = \beta'Z + e$, $E[e|Z] = 0$ and $E[e^2|Z] \leq K < \infty$ almost surely, and where we observe $(Y,Z)$, with the joint density $\eta$ of $(e,Z)$ satisfying $\int_{\mathbb{R}} e\eta(e,Z)de = 0$ almost surely. Assume $\eta$ has partial derivative with respect to the first argument, $\dot{\eta}_1$, satisfying $\dot{\eta}_1/\eta \in L_2(P_{\beta,\eta})$, and hence $\dot{\eta}_1/\eta \in L_2^0(P_{\beta,\eta})$, where $P_{\beta,\eta}$ is the joint distribution of $(Y,Z)$. The Euclidean parameter of interest in this semiparametric model is $\theta = \beta$. The score for $\beta$, assuming $\eta$ is known, is $\dot{\ell}_{\beta,\eta} = -Z(\dot{\eta}_1/\eta)(Y - \beta'Z, Z)$, where we use the shorthand $(f/g)(u,v) = f(u,v)/g(u,v)$ for ratios of functions.

One can show that the tangent set $\dot{\mathcal{P}}_{P_{\beta,\eta}}^{(\eta)}$ for $\eta$ is the subset of $L_2^0(P_{\beta,\eta})$ which consists of all functions $g(e,Z) \in L_2^0(P_{\beta,\eta})$ which satisfy
$$E[eg(e,Z)|Z] = \frac{\int_{\mathbb{R}} eg(e,Z)\eta(e,Z)de}{\int_{\mathbb{R}} \eta(e,Z)de} = 0,$$
almost surely. One can also show that this set is the orthocomplement in $L_2^0(P_{\beta,\eta})$ of all functions of the form $ef(Z)$, where $f$ satisfies $P_{\beta,\eta}f^2(Z) < \infty$. This means that $\tilde{\ell}_{\beta,\eta} = (I - \Pi_{\beta,\eta})\dot{\ell}_{\beta,\eta}$ is the projection in $L_2^0(P_{\beta,\eta})$ of $-Z(\dot{\eta}_1/\eta)(e,Z)$ onto $\{ef(Z) : P_{\beta,\eta}f^2(Z) < \infty\}$, where $I$ is the identity. Thus
$$\tilde{\ell}_{\beta,\eta}(Y,Z) = \frac{-Ze\int_{\mathbb{R}}\dot{\eta}_1(e,Z)e\,de}{P_{\beta,\eta}[e^2|Z]} = \frac{-Ze(-1)}{P_{\beta,\eta}[e^2|Z]} = \frac{Z(Y - \beta'Z)}{P_{\beta,\eta}[e^2|Z]},$$
where the second-to-last step follows from the identity $\int_{\mathbb{R}}\dot{\eta}_1(e,Z)e\,de = \partial\int_{\mathbb{R}}\eta(te,Z)de/(\partial t)|_{t=1}$, and the last step follows since $e = Y - \beta'Z$. When the function $z \mapsto P_{\beta,\eta}[e^2|Z = z]$ is non-constant in $z$, $\tilde{\ell}_{\beta,\eta}(Y,Z)$ is not proportional to $Z(Y - \beta'Z)$, and the estimator $\hat{\beta}$ defined in chapter 1 will not be efficient. We will discuss efficient estimation for this model in greater detail in chapter 4.

Two very useful tools for computing efficient scores are score and information operators. Although we will provide more precise definitions in parts II and III, operators are maps between spaces of functions.
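Before developing score operators further, here is a small numerical sketch of the efficient score just derived for this regression model: with the conditional variance function treated as known (purely an illustrative assumption; chapter 4 discusses estimating it), solving $\mathbb{P}_n[Z(Y - \beta'Z)/\sigma^2(Z)] = 0$ is weighted least squares, and its Monte Carlo variance should beat ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 400, 2000
beta0 = np.array([1.0, 2.0])

def sigma2(z):                                  # assumed-known variance function
    return 0.25 + z[:, 1] ** 2

ols = np.empty((reps, 2)); wls = np.empty((reps, 2))
for r in range(reps):
    z = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])
    y = z @ beta0 + rng.normal(size=n) * np.sqrt(sigma2(z))
    w = 1.0 / sigma2(z)
    ols[r] = np.linalg.solve(z.T @ z, z.T @ y)                       # P_n[Z(Y-b'Z)] = 0
    wls[r] = np.linalg.solve(z.T @ (w[:, None] * z), z.T @ (w * y))  # efficient score eqn
print("n * var(OLS slope):", round(n * ols[:, 1].var(), 3))
print("n * var(WLS slope):", round(n * wls[:, 1].var(), 3))
```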
Returning to the generic semiparametric model $\{P_{\theta,\eta} : \theta \in \Theta, \eta \in H\}$, sometimes it is easier to represent an element $g$ in the tangent set for $\eta$, $\dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)}$, as $B_{\theta,\eta}b$, where $b$ is an element of another set $\mathcal{H}_\eta$ and $B_{\theta,\eta}$ is an operator satisfying $\dot{\mathcal{P}}_{P_{\theta,\eta}}^{(\eta)} = \{B_{\theta,\eta}b : b \in \mathcal{H}_\eta\}$. Such an operator is a score operator. The adjoint of the score operator $B_{\theta,\eta} : \mathcal{H}_\eta \to L_2^0(P_{\theta,\eta})$ is another operator $B_{\theta,\eta}^* : L_2^0(P_{\theta,\eta}) \to \overline{\mathrm{lin}}\,\mathcal{H}_\eta$ which is similar in spirit to a transpose for matrices. Here we use $\overline{\mathrm{lin}}\,A$ to denote the closed linear span (the closure of the linear space consisting of all linear combinations) of $A$. Additional details on adjoints and methods for computing them will be described in part III. One can now define the information operator $B_{\theta,\eta}^*B_{\theta,\eta} : \mathcal{H}_\eta \to \overline{\mathrm{lin}}\,\mathcal{H}_\eta$. If $B_{\theta,\eta}^*B_{\theta,\eta}$ has an inverse, then it can be shown that the efficient score for $\theta$ has the form $\tilde{\ell}_{\theta,\eta} = \left(I - B_{\theta,\eta}\left(B_{\theta,\eta}^*B_{\theta,\eta}\right)^{-1}B_{\theta,\eta}^*\right)\dot{\ell}_{\theta,\eta}$.

To illustrate these methods, consider the Cox model for right-censored data introduced in chapter 1. In this setting, we observe a sample of $n$ realizations of $X = (V, d, Z)$, where $V = T \wedge C$, $d = 1\{V = T\}$, $Z \in \mathbb{R}^k$ is a covariate vector, $T$ is a failure time, and $C$ is a censoring time. We assume that $T$ and $C$ are independent given $Z$, that $T$ given $Z$ has integrated hazard function $e^{\beta'Z}\Lambda(t)$ for $\beta$ in an open subset $B \subset \mathbb{R}^k$ and $\Lambda$ continuous and monotone increasing with $\Lambda(0) = 0$, and that the censoring distribution does not depend on $\beta$ or $\Lambda$ (i.e., censoring is uninformative). Define the counting and at-risk processes $N(t) = 1\{V \leq t\}d$ and $Y(t) = 1\{V \geq t\}$, and let $M(t) = N(t) - \int_0^t Y(s)e^{\beta'Z}d\Lambda(s)$. For some $0 < \tau < \infty$ with $P\{C \geq \tau\} > 0$, let $H$ be the set of all $\Lambda$'s satisfying our criteria with $\Lambda(\tau) < \infty$. Now the set of models $\mathcal{P}$ is indexed by $\beta \in B$ and $\Lambda \in H$. We let $P_{\beta,\Lambda}$ be the distribution of $(V,d,Z)$ corresponding to the given parameters.

The likelihood for a single observation is thus proportional to $p_{\beta,\Lambda}(X) = \left[e^{\beta'Z}\lambda(V)\right]^d\exp\left(-e^{\beta'Z}\Lambda(V)\right)$, where $\lambda$ is the derivative of $\Lambda$. Now let $L_2(\Lambda)$ be the set of measurable functions $b : [0,\tau] \to \mathbb{R}$ with $\int_0^\tau b^2(s)d\Lambda(s) < \infty$. If $b \in L_2(\Lambda)$ is bounded, then $\Lambda_t(s) = \int_0^s e^{tb(u)}d\Lambda(u) \in H$ for all $t$. The score function $\partial\log p_{\beta + ta, \Lambda_t}(X)/(\partial t)|_{t=0}$ is thus $\int_0^\tau[a'Z + b(s)]dM(s)$, for any $a \in \mathbb{R}^k$. The score function for $\beta$ is therefore $\dot{\ell}_{\beta,\Lambda}(X) = ZM(\tau)$, while the score function for $\Lambda$ is $\int_0^\tau b(s)dM(s)$. In fact, one can show that there exist one-dimensional submodels $\Lambda_t$ such that $\log p_{\beta + ta, \Lambda_t}$ is differentiable with score $a'\dot{\ell}_{\beta,\Lambda}(X) + \int_0^\tau b(s)dM(s)$, for any $b \in L_2(\Lambda)$ and $a \in \mathbb{R}^k$.

The operator $B_{\beta,\Lambda} : L_2(\Lambda) \to L_2^0(P_{\beta,\Lambda})$, given by $B_{\beta,\Lambda}(b) = \int_0^\tau b(s)dM(s)$, is the score operator which generates the tangent set for $\Lambda$, $\dot{\mathcal{P}}_{P_{\beta,\Lambda}}^{(\Lambda)} \equiv \{B_{\beta,\Lambda}b : b \in L_2(\Lambda)\}$. It can be shown that this tangent space spans all square-integrable score functions for $\Lambda$ generated by parametric submodels. The adjoint operator can be shown to be $B_{\beta,\Lambda}^* : L_2^0(P_{\beta,\Lambda}) \to L_2(\Lambda)$, where $B_{\beta,\Lambda}^*(g)(t) = P_{\beta,\Lambda}[g(X)dM(t)]/d\Lambda(t)$. The information operator $B_{\beta,\Lambda}^*B_{\beta,\Lambda} : L_2(\Lambda) \to L_2(\Lambda)$ is thus
$$B_{\beta,\Lambda}^*B_{\beta,\Lambda}(b)(t) = \frac{P_{\beta,\Lambda}\left[\int_0^\tau b(s)dM(s)\,dM(t)\right]}{d\Lambda(t)} = P_{\beta,\Lambda}\left[Y(t)e^{\beta'Z}\right]b(t),$$
using martingale methods.

Since $B_{\beta,\Lambda}^*\dot{\ell}_{\beta,\Lambda}(t) = P_{\beta,\Lambda}\left[ZY(t)e^{\beta'Z}\right]$, we have that the efficient score for $\beta$ is
$$(3.3)\qquad \tilde{\ell}_{\beta,\Lambda} = \left(I - B_{\beta,\Lambda}\left(B_{\beta,\Lambda}^*B_{\beta,\Lambda}\right)^{-1}B_{\beta,\Lambda}^*\right)\dot{\ell}_{\beta,\Lambda} = \int_0^\tau\left(Z - \frac{P_{\beta,\Lambda}\left[ZY(t)e^{\beta'Z}\right]}{P_{\beta,\Lambda}\left[Y(t)e^{\beta'Z}\right]}\right)dM(t).$$
When $\tilde{I}_{\beta,\Lambda} \equiv P_{\beta,\Lambda}[\tilde{\ell}_{\beta,\Lambda}\tilde{\ell}_{\beta,\Lambda}']$ is positive definite, the resulting efficient influence function is $\tilde{\psi}_{\beta,\Lambda} \equiv \tilde{I}_{\beta,\Lambda}^{-1}\tilde{\ell}_{\beta,\Lambda}$. Since the estimator $\hat{\beta}_n$ obtained from maximizing the partial likelihood
$$(3.4)\qquad \tilde{L}_n(\beta) = \prod_{i=1}^n\left(\frac{e^{\beta'Z_i}}{\sum_{j=1}^n 1\{V_j \geq V_i\}e^{\beta'Z_j}}\right)^{d_i}$$
can be shown to satisfy $\sqrt{n}(\hat{\beta}_n - \beta) = \sqrt{n}\mathbb{P}_n\tilde{\psi}_{\beta,\Lambda} + o_P(1)$, this estimator is efficient.

Returning to our discussion of score and information operators, these operators are also useful for generating scores for the entire model, not just for the nuisance component. With semiparametric models having score functions of the form $a'\dot{\ell}_{\theta,\eta} + B_{\theta,\eta}b$, for $a \in \mathbb{R}^k$ and $b \in \mathcal{H}_\eta$, we can define

a new operator $A_{\theta,\eta} : \{(a,b) : a \in \mathbb{R}^k, b \in \mathrm{lin}\,\mathcal{H}_\eta\} \to L_2^0(P_{\theta,\eta})$, where $A_{\theta,\eta}(a,b) = a'\dot{\ell}_{\theta,\eta} + B_{\theta,\eta}b$. More generally, we can define the score operator $A_\eta : \mathrm{lin}\,\mathcal{H}_\eta \to L_2^0(P_\eta)$ for the model $\{P_\eta : \eta \in H\}$, where $H$ indexes the entire model and may include both parametric and nonparametric components, and where $\mathrm{lin}\,\mathcal{H}_\eta$ indexes directions in $H$. Let the parameter of interest be $\psi(P_\eta) = \chi(\eta) \in \mathbb{R}^k$. We assume there exists a linear operator $\dot{\chi}_\eta : \mathrm{lin}\,\mathcal{H}_\eta \to \mathbb{R}^k$ such that, for every $b \in \mathrm{lin}\,\mathcal{H}_\eta$, there exists a one-dimensional submodel $\{P_{\eta_t} : \eta_t \in H, t \in N\}$ satisfying

$$\int\left[\frac{(dP_{\eta_t})^{1/2} - (dP_\eta)^{1/2}}{t} - \frac{1}{2}A_\eta b\,(dP_\eta)^{1/2}\right]^2 \to 0,$$

as $t \downarrow 0$, and $\partial\chi(\eta_t)/(\partial t)|_{t=0} = \dot{\chi}_\eta(b)$. We require $\mathcal{H}_\eta$ to have a suitable inner product $\langle\cdot,\cdot\rangle_\eta$, where an inner product is an operation on elements in $\mathcal{H}_\eta$ with the property that $\langle a,b\rangle_\eta = \langle b,a\rangle_\eta$, $\langle a+b,c\rangle_\eta = \langle a,c\rangle_\eta + \langle b,c\rangle_\eta$, $\langle a,a\rangle_\eta \geq 0$, and $\langle a,a\rangle_\eta = 0$ if and only if $a = 0$, for all $a,b,c \in \mathcal{H}_\eta$. The efficient influence function is the solution $\tilde{\psi}_{P_\eta} \in \overline{R(A_\eta)} \subset L_2^0(P_\eta)$ of

$$(3.5)\qquad A_\eta^*\tilde{\psi}_{P_\eta} = \tilde{\chi}_\eta,$$

where $R$ denotes range, $\overline{B}$ denotes the closure of the set $B$, $A_\eta^*$ is the adjoint of $A_\eta$, and $\tilde{\chi}_\eta \in \mathcal{H}_\eta$ satisfies $\langle\tilde{\chi}_\eta, b\rangle_\eta = \dot{\chi}_\eta(b)$ for all $b \in \mathcal{H}_\eta$. Methods for obtaining such a $\tilde{\chi}_\eta$ will be given in part III. When $A_\eta^*A_\eta$ is invertible, then the solution to (3.5) can be written $\tilde{\psi}_{P_\eta} = A_\eta\left(A_\eta^*A_\eta\right)^{-1}\tilde{\chi}_\eta$. In chapter 4, we will illustrate this approach to derive efficient estimators for all parameters of the Cox model.

Returning to the semiparametric model setting, where $\mathcal{P} = \{P_{\theta,\eta} : \theta \in \Theta, \eta \in H\}$, $\Theta$ is an open subset of $\mathbb{R}^k$, and $H$ is a set, the efficient score can be used to derive estimating equations for computing efficient estimators of $\theta$. An estimating equation is a data dependent function $\Psi_n : \Theta \to \mathbb{R}^k$ for which an approximate zero yields a Z-estimator for $\theta$. When $\Psi_n(\tilde{\theta})$ has the form $\mathbb{P}_n\hat{\ell}_{\tilde{\theta},n}$, where $\hat{\ell}_{\tilde{\theta},n}(X|X_1,\ldots,X_n)$ is a function for the generic observation $X$ which depends on the sample data $X_1,\ldots,X_n$, we have the following estimating equation result (the proof of which will be given in part III):

Theorem 3.1 Suppose that the model $\{P_{\theta,\eta} : \theta \in \Theta\}$, where $\Theta \subset \mathbb{R}^k$, is differentiable in quadratic mean with respect to $\theta$ at $(\theta,\eta)$, and let the efficient information matrix $\tilde{I}_{\theta,\eta}$ be nonsingular. Let $\hat{\theta}_n$ satisfy $\sqrt{n}\mathbb{P}_n\hat{\ell}_{\hat{\theta}_n,n} = o_P(1)$ and be consistent for $\theta$. Also assume that the following conditions hold:
$$(3.6)\qquad \sqrt{n}(\mathbb{P}_n - P_{\theta,\eta})\left(\hat{\ell}_{\hat{\theta}_n,n} - \tilde{\ell}_{\theta,\eta}\right) \overset{P}{\to} 0,$$
$$(3.7)\qquad P_{\hat{\theta}_n,\eta}\hat{\ell}_{\hat{\theta}_n,n} = o_P\left(n^{-1/2} + \|\hat{\theta}_n - \theta\|\right),$$
$$(3.8)\qquad P_{\theta,\eta}\left\|\hat{\ell}_{\hat{\theta}_n,n} - \tilde{\ell}_{\theta,\eta}\right\|^2 \overset{P}{\to} 0,\qquad P_{\hat{\theta}_n,\eta}\left\|\hat{\ell}_{\hat{\theta}_n,n}\right\|^2 = O_P(1).$$

Then $\hat{\theta}_n$ is asymptotically efficient at $(\theta,\eta)$.

Returning to the Cox model example, the profile likelihood score is the partial likelihood score $\Psi_n(\tilde{\beta}) = \mathbb{P}_n\hat{\ell}_{\tilde{\beta},n}$, where

$$(3.9)\qquad \hat{\ell}_{\tilde{\beta},n}(X = (V,d,Z)|X_1,\ldots,X_n) = \int_0^\tau\left(Z - \frac{\mathbb{P}_n\left[ZY(t)e^{\tilde{\beta}'Z}\right]}{\mathbb{P}_n\left[Y(t)e^{\tilde{\beta}'Z}\right]}\right)dM_{\tilde{\beta}}(t),$$
and $M_{\tilde{\beta}}(t) = N(t) - \int_0^t Y(u)e^{\tilde{\beta}'Z}d\Lambda(u)$. We will show in chapter 4 that all the conditions of theorem 3.1 are satisfied for the root $\hat{\beta}_n$ of $\Psi_n(\tilde{\beta}) = 0$, and thus the partial likelihood yields efficient estimation of $\beta$.

Returning to the general semiparametric setting, even if an estimating equation $\Psi_n$ is not close enough to $\mathbb{P}_n\tilde{\ell}_{\theta,\eta}$ to result in an efficient estimator, frequently it will still result in a $\sqrt{n}$-consistent estimator which is precise enough to be useful. In some cases, the computational effort needed to obtain an efficient estimator may be too great a cost, and one must settle for an inefficient estimating equation that works. Even in these settings, some modifications in the estimating equation can often be made which improve efficiency while maintaining computability. This issue will be explored in greater detail in part III.

3.3 Maximum Likelihood Estimation

The most common approach to efficient estimation is based on modifications of maximum likelihood estimation which lead to efficient estimates. These modifications, which we will call "likelihoods," are generally not really likelihoods (products of densities) because of complications resulting from the presence of an infinite dimensional nuisance parameter. Consider estimating an unknown real density $f(x)$ from an i.i.d. sample $X_1,\ldots,X_n$. The likelihood is $\prod_{i=1}^n f(X_i)$, and the maximizer over all densities has arbitrarily high peaks at the observations, with zero at the other values, and is therefore not a density. This problem can be fixed by using an empirical likelihood $\prod_{i=1}^n p_i$, where $p_1,\ldots,p_n$ are the masses assigned to the observations indexed by $i = 1,\ldots,n$ and are constrained to satisfy $\sum_{i=1}^n p_i = 1$.

This leads to the empirical distribution function estimator, which is known to be fully efficient.

Consider again the Cox model for right-censored data explored in the previous section. The density for a single observation $X = (V,d,Z)$ is proportional to $\left[e^{\beta'Z}\lambda(V)\right]^d\exp\left(-e^{\beta'Z}\Lambda(V)\right)$. Maximizing the likelihood based on this density will result in the same problem raised in the previous paragraph. A likelihood that works is the following, which assigns mass only at observed failure times:

$$(3.10)\qquad L_n(\beta,\Lambda) = \prod_{i=1}^n\left(e^{\beta'Z_i}\Delta\Lambda(V_i)\right)^{d_i}\exp\left(-e^{\beta'Z_i}\Lambda(V_i)\right),$$
where $\Delta\Lambda(t)$ is the jump size of $\Lambda$ at $t$. For each value of $\beta$, one can maximize or profile $L_n(\beta,\Lambda)$ over the "nuisance" parameter $\Lambda$ to obtain the profile likelihood $pL_n(\beta)$, which for the Cox model is $\exp(-\sum_{i=1}^n d_i)$ times the partial likelihood (3.4). Let $\hat{\beta}$ be the maximizer of $pL_n(\beta)$. Then the maximizer $\hat{\Lambda}$ of $L_n(\hat{\beta},\Lambda)$ is the "Breslow estimator"
$$\hat{\Lambda}(t) = \int_0^t\frac{\mathbb{P}_n\,dN(s)}{\mathbb{P}_n\left[Y(s)e^{\hat{\beta}'Z}\right]}.$$
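Both steps are easy to carry out numerically. The following sketch (simulated data; the exponential baseline and censoring laws are arbitrary assumptions) maximizes the log partial likelihood, which equals $\log pL_n(\beta)$ up to an additive constant, and then plugs $\hat{\beta}$ into the Breslow estimator:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)
n, beta_true = 400, 0.7
z = rng.normal(size=n)
t = rng.exponential(np.exp(-beta_true * z))    # hazard e^{beta z} x baseline rate 1
c = rng.exponential(1.5, n)                    # arbitrary censoring law
v, d = np.minimum(t, c), (t <= c).astype(float)

def neg_log_partial_lik(beta):
    risk = np.exp(beta[0] * z)
    # sum over failures of [beta z_i - log sum_{j: V_j >= V_i} e^{beta z_j}]
    log_denom = np.log([risk[v >= vi].sum() for vi in v])
    return -(d * (beta[0] * z - log_denom)).sum()

beta_hat = minimize(neg_log_partial_lik, x0=[0.0], method="BFGS").x[0]

# Breslow: cumulative sum of dN(s) / sum_{j: V_j >= s} e^{beta_hat Z_j}
order = np.argsort(v)
risk_hat = np.exp(beta_hat * z)
denom = np.array([risk_hat[v >= s].sum() for s in v[order]])
lambda_hat = np.cumsum(d[order] / denom)       # evaluated at the sorted times
print("beta_hat:", round(beta_hat, 3))
```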

We will see in chapter 4 that $\hat{\beta}$ and $\hat{\Lambda}$ are both efficient.

Another useful class of likelihood variants are penalized likelihoods. Penalized likelihoods add a penalty term in order to maintain an appropriate level of smoothness for one or more of the nuisance parameters. This method is used in the partly linear logistic regression model described in chapter 1. Other methods of generating likelihood variants that work are possible. The basic idea is that using the likelihood principle to guide estimation of semiparametric models often leads to efficient estimators for the model components which are $\sqrt{n}$ consistent. Because of the richness of this approach to estimation, one needs to verify for each new situation that a likelihood-inspired estimator is consistent, efficient and well-behaved for moderate sample sizes. Verifying efficiency usually entails demonstrating that the estimator satisfies the efficient score equation described in the previous section.

Unfortunately, there is no guarantee that the efficient score is a derivative of the log likelihood along some submodel. A way around this problem is to use approximately least-favorable submodels. This is done by finding a function $\eta_t(\theta,\eta)$ such that $\eta_0(\theta,\eta) = \eta$, for all $\theta \in \Theta$ and $\eta \in H$, where $\eta_t(\theta,\eta) \in H$ for all $t$ small enough, and such that $\tilde{\kappa}_{\theta_0,\eta_0} = \tilde{\ell}_{\theta_0,\eta_0}$, where $\tilde{\kappa}_{\theta,\eta}(x) = \partial l_{\theta + t, \eta_t(\theta,\eta)}(x)/(\partial t)|_{t=0}$, $l_{\theta,\eta}(x)$ is the log-likelihood for the observed value $x$ at the parameters $(\theta,\eta)$, and where $(\theta_0,\eta_0)$ are the true parameter values. Note that we require $\tilde{\kappa}_{\theta,\eta} = \tilde{\ell}_{\theta,\eta}$ only when $(\theta,\eta) = (\theta_0,\eta_0)$. If $(\hat{\theta}_n,\hat{\eta}_n)$ is the maximum likelihood estimator, i.e., the maximizer of $\mathbb{P}_n l_{\theta,\eta}$,

then the function $t \mapsto \mathbb{P}_n l_{\hat{\theta}_n + t, \eta_t(\hat{\theta}_n,\hat{\eta}_n)}$ is maximal at $t = 0$, and thus $\mathbb{P}_n\tilde{\kappa}_{\hat{\theta}_n,\hat{\eta}_n} = 0$. Now if $\hat{\theta}_n$ and $\hat{\ell}_{\tilde{\theta},n} = \tilde{\kappa}_{\tilde{\theta},\hat{\eta}_n}$ satisfy the conditions of theorem 3.1 at $(\theta,\eta) = (\theta_0,\eta_0)$, then the maximum likelihood estimator $\hat{\theta}_n$ is asymptotically efficient at $(\theta_0,\eta_0)$. We will explore in part III how this flexibility is helpful for certain models. Note that one should usually check first whether $\eta_t(\theta,\eta) = \eta$ works before trying more complicated functional forms.

Consider now the special case that both $\theta$ and $\eta$ are $\sqrt{n}$ consistent in the model $\{P_{\theta,\eta} : \theta \in \Theta, \eta \in H\}$, where $\Theta \subset \mathbb{R}^k$. Let $l_{\theta,\eta}$ be the log-likelihood for a single observation, and let $\hat{\theta}_n$ and $\hat{\eta}_n$ be the corresponding maximum likelihood estimators. Since $\theta$ is finite-dimensional, the log-likelihood can be varied with respect to $\theta$ in the usual way, so that the maximum likelihood estimators satisfy $\mathbb{P}_n\dot{\ell}_{\hat{\theta}_n,\hat{\eta}_n} = 0$.

In contrast, varying $\eta$ in the log-likelihood is more complex. We can typically use a subset of the submodels $t \mapsto \eta_t$ used for defining the tangent set and information for $\eta$ in the model. This is easiest when the score for $\eta$ is expressed as a score operator $B_{\theta,\eta}$ working on a set of indices $h \in \mathcal{H}$, as described in section 3.2. The likelihood equation for $\eta$ is then usually of the form $\mathbb{P}_n\left[B_{\hat{\theta}_n,\hat{\eta}_n}h - P_{\hat{\theta}_n,\hat{\eta}_n}B_{\hat{\theta}_n,\hat{\eta}_n}h\right] = 0$ for all $h \in \mathcal{H}$. Note that we have forced the scores to be mean zero by subtracting off the mean rather than simply assuming $P_{\theta,\eta}B_{\theta,\eta}h = 0$. The approach is valid if there exists some path $t \mapsto \eta_t(\theta,\eta)$, with $\eta_0(\theta,\eta) = \eta$, such that
$$(3.11)\qquad B_{\theta,\eta}h(x) - P_{\theta,\eta}B_{\theta,\eta}h = \partial l_{\theta, \eta_t(\theta,\eta)}(x)/(\partial t)\big|_{t=0}$$
for all $x$ in the sample space $\mathcal{X}$. We assume (3.11) is valid for all $h \in \mathcal{H}$, where $\mathcal{H}$ is chosen so that $B_{\theta,\eta}h(x) - P_{\theta,\eta}B_{\theta,\eta}h$ is uniformly bounded on $\mathcal{H}$ for all $x \in \mathcal{X}$, $(\theta,\eta) \in \mathbb{R}^k \times H$.

We can now express $(\hat{\theta}_n,\hat{\eta}_n)$ as a Z-estimator with estimating function $\Psi_n : \mathbb{R}^k \times H \to \mathbb{R}^k \times \ell^\infty(\mathcal{H})$, where $\Psi_n = (\Psi_{n1},\Psi_{n2})$, with $\Psi_{n1}(\theta,\eta) = \mathbb{P}_n\dot{\ell}_{\theta,\eta}$ and $\Psi_{n2}(\theta,\eta)(h) = \mathbb{P}_nB_{\theta,\eta}h - P_{\theta,\eta}B_{\theta,\eta}h$, for all $h \in \mathcal{H}$. The expectation of these maps under the true parameter $(\theta_0,\eta_0)$ is the deterministic map $\Psi = (\Psi_1,\Psi_2)$, where $\Psi_1(\theta,\eta) = P_{\theta_0,\eta_0}\dot{\ell}_{\theta,\eta}$ and $\Psi_2(\theta,\eta)(h) = P_{\theta_0,\eta_0}B_{\theta,\eta}h - P_{\theta,\eta}B_{\theta,\eta}h$, for all $h \in \mathcal{H}$. We have constructed these estimating equations so that the maximum likelihood estimators and true parameters satisfy $\Psi_n(\hat{\theta}_n,\hat{\eta}_n) = 0 = \Psi(\theta_0,\eta_0)$. Provided $H$ is a subset of a normed space, we can use the Z-estimator master theorem (theorem 2.11) to obtain weak convergence of $\sqrt{n}(\hat{\theta}_n - \theta_0, \hat{\eta}_n - \eta_0)$:

Corollary 3.2 Suppose that $\dot{\ell}_{\theta,\eta}$ and $B_{\theta,\eta}h$, with $h$ ranging over $\mathcal{H}$ and with $(\theta,\eta)$ ranging over a neighborhood of $(\theta_0,\eta_0)$, are contained in a $P_{\theta_0,\eta_0}$-Donsker class, and that both $P_{\theta_0,\eta_0}\|\dot{\ell}_{\theta,\eta} - \dot{\ell}_{\theta_0,\eta_0}\|^2 \overset{P}{\to} 0$ and $\sup_{h \in \mathcal{H}}P_{\theta_0,\eta_0}|B_{\theta,\eta}h - B_{\theta_0,\eta_0}h|^2 \overset{P}{\to} 0$, as $(\theta,\eta) \to (\theta_0,\eta_0)$. Also assume that $\Psi$ is Fréchet-differentiable at $(\theta_0,\eta_0)$ with derivative $\dot{\Psi}_0 : \mathbb{R}^k \times \mathrm{lin}\,H \to \mathbb{R}^k \times \ell^\infty(\mathcal{H})$

that is continuously-invertible and onto its range, with inverse $\dot{\Psi}_0^{-1} : \mathbb{R}^k \times \ell^\infty(\mathcal{H}) \to \mathbb{R}^k \times \mathrm{lin}\,H$. Then, provided $(\hat{\theta}_n,\hat{\eta}_n)$ is consistent for $(\theta_0,\eta_0)$ and $\Psi_n(\hat{\theta}_n,\hat{\eta}_n) = o_P(n^{-1/2})$ (uniformly over $\mathbb{R}^k \times \ell^\infty(\mathcal{H})$), $(\hat{\theta}_n,\hat{\eta}_n)$ is efficient at $(\theta_0,\eta_0)$ and $\sqrt{n}(\hat{\theta}_n - \theta_0, \hat{\eta}_n - \eta_0) \rightsquigarrow -\dot{\Psi}_0^{-1}Z$, where $Z$ is the Gaussian limiting distribution of $\sqrt{n}\Psi_n(\theta_0,\eta_0)$.

The proof of this will be given later in part III. A function $f : U \to V$ is onto if, for every $v \in V$, there exists a $u \in U$ with $v = f(u)$. Note that while $\mathcal{H}$ can be a subset of the tangent set used for information calculations, it must be rich enough to ensure that the inverse of $\dot{\Psi}_0$ exists. The efficiency in corollary 3.2 can be shown to follow from the score operator calculations given in section 3.2, but we will postpone further details until part III. As was done at the end of section 3.2, the above discussion can be completely re-expressed in terms of a single parameter model $\{P_\eta : \eta \in H\}$ with a single score operator $A_\eta : \mathcal{H}_\eta \to L_2^0(P_\eta)$, where $H$ is a richer parameter set, including, for example, both $\Theta$ and $H$ as defined in the previous paragraphs, and where $\mathcal{H}_\eta$ is similarly enriched to include the tangent sets for all subcomponents of the model.

3.4 Other Topics

Other topics of importance include frequentist and Bayesian methods for constructing confidence sets. We will focus primarily on frequentist approaches in this book and only briefly discuss Bayesian methods. While the bootstrap is generally valid in the setting of corollary 3.2, it is unclear whether this remains true when the nuisance parameter converges at a rate slower than $\sqrt{n}$, even if interest is limited to the parametric component. Even when the bootstrap is valid, it may be excessively cumbersome to re-estimate the entire model for many bootstrapped data sets. We will explore this issue in more detail in part III. We now present one approach for hypothesis testing and variance estimation for the parametric component $\theta$ of the semiparametric model $\{P_{\theta,\eta} : \theta \in \Theta, \eta \in H\}$. This approach is often valid even when the nuisance parameter is not $\sqrt{n}$ consistent.

Murphy and van der Vaart demonstrated that, under reasonable regularity conditions, the log-profile likelihood $pl_n(\theta)$ (profiling over the nuisance parameter) admits the following expansion about the maximum likelihood estimator $\hat{\theta}_n$ for the parametric component:
$$(3.12)\qquad \log pl_n(\tilde{\theta}_n) = \log pl_n(\hat{\theta}_n) - \frac{1}{2}n(\tilde{\theta}_n - \hat{\theta}_n)'\tilde{I}_{\theta_0,\eta_0}(\tilde{\theta}_n - \hat{\theta}_n) + o_P\left(\sqrt{n}\|\tilde{\theta}_n - \theta_0\| + 1\right)^2,$$

for any estimator $\tilde{\theta}_n \overset{P}{\to} \theta_0$ (Murphy and van der Vaart, 2000). This can be shown to lead naturally to chi-square tests of full versus reduced models.

Furthermore, this result demonstrates that the curvature of the log-profile likelihood can serve as a consistent estimator of the efficient information for $\theta$ at $\theta_0$, $\tilde{I}_{\theta_0,\eta_0}$, and thereby permits estimation of the limiting variance of $\sqrt{n}(\hat{\theta}_n - \theta_0)$. We will discuss this method, along with several other methods of inference, in greater detail toward the end of part III.
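In practice this suggests a generic recipe, sketched below with hypothetical names: second-difference the log profile likelihood numerically at $\hat{\theta}_n$ to estimate the efficient information, then invert to get a standard error.

```python
import numpy as np

def profile_curvature_se(log_profile_lik, theta_hat, n, h=1e-3):
    """Standard error for theta_hat via the curvature expansion (3.12).

    log_profile_lik : hypothetical callable theta -> log pl_n(theta), scalar theta.
    The second difference approximates -n * I_tilde, the curvature at theta_hat.
    """
    d2 = (log_profile_lik(theta_hat + h) - 2.0 * log_profile_lik(theta_hat)
          + log_profile_lik(theta_hat - h)) / h ** 2
    info = -d2 / n                       # estimate of the efficient information
    return np.sqrt(1.0 / (n * info))     # se(theta_hat) = 1 / sqrt(n * info)

# usage sketch: if pl(beta) returns a log profile (e.g., log partial) likelihood
# for scalar beta, then profile_curvature_se(pl, beta_hat, n) gives its se.
```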

3.5 Exercises

3.5.1. Consider the paragraphs leading up to the efficient score for $\beta$ in the Cox model, given in expression (3.3). Show that the tangent set for the nuisance parameter $\Lambda$, $\dot{\mathcal{P}}_{P_{\beta,\Lambda}}^{(\Lambda)}$, spans all square-integrable score functions for $\Lambda$ generated by parametric submodels (where the parameters for $\Lambda$ are independent of $\beta$).

3.5.2. Let $L_n(\beta,\Lambda)$ be the Cox model likelihood given in (3.10). Show that the corresponding profile likelihood for $\beta$, $pL_n(\beta)$, obtained by maximizing over $\Lambda$, equals $\exp(-\sum_{i=1}^n d_i)$ times the partial likelihood (3.4).

3.6 Notes

The linear regression example was partially inspired by example 25.28 of van der Vaart (1998), and theorem 3.1 is a generalization of his theorem 25.57 based on his condition (25.28).

4 Case Studies I

We now expand upon several examples introduced in chapters 1–3 to more fully illustrate the methods and theory we have outlined thus far. Certain technical aspects which involve concepts introduced later in the book will be glossed over to avoid getting bogged down with details. The main objective of this chapter is to initiate an appreciation for what empirical processes and efficiency calculations can accomplish.

The first example is linear regression with either mean zero or median zero residuals. In addition to efficiency calculations for model parameters, empirical processes are needed for inference on the distribution of the residuals. The second example is counting process regression for both general counting processes and the Cox model for right-censored failure time data. Empirical processes will be needed for parameter inference, and efficiency will be established under the Cox proportional hazards model for maximum likelihood estimation of both the regression parameters and the baseline hazard. The third example is the Kaplan-Meier estimator of the survival function for right-censored failure time data. Since weak convergence of the Kaplan-Meier has already been established in chapter 2 using empirical processes, the focus in this chapter will be on efficiency calculations. The fourth example considers estimating equations for general regression models when the residual variance may be a function of the covariates. Estimation of the variance function is needed for efficient estimation. We also consider optimality of a certain class of estimating equations. The general results are illustrated with both simple linear regression and a Poisson mixture regression model. In the latter case, the mixture distribution is not $\sqrt{n}$ consistent in the uniform norm, the proof of which fact we omit. The

fifth, and final, example is partly linear logistic regression. The emphasis will be on efficient estimation of the parametric regression parameter. For the last two examples, both empirical processes and efficiency calculations will be needed.

4.1 Linear Regression

The semiparametric linear regression model is $Y = \beta'Z + e$, where we observe $X=(Y,Z)$ and assume $E\|Z\|^2<\infty$, $E[e|Z]=0$ and $E[e^2|Z]\le K<\infty$ almost surely, $Z$ includes the constant 1, and $E[ZZ']$ is full rank. The model for $X$ is $\{P_{\beta,\eta}:\beta\in\mathbb{R}^k,\,\eta\in H\}$, where $\eta$ is the joint density of the residual $e$ and covariate $Z$, with partial derivative with respect to the first argument $\dot\eta_1$, which we assume satisfies $\dot\eta_1/\eta\in L_2(P_{\beta,\eta})$ and hence $\dot\eta_1/\eta\in L_2^0(P_{\beta,\eta})$. We consider this model under two assumptions on $\eta$: the first is that the residuals have conditional mean zero, i.e., $\int_{\mathbb{R}}u\,\eta(u,Z)\,du=0$ almost surely; and the second is that the residuals have median zero, i.e., $\int_{\mathbb{R}}\mathrm{sign}(u)\,\eta(u,Z)\,du=0$ almost surely.

4.1.1 Mean Zero Residuals

We have already demonstrated in section 3.2 that the usual least-squares estimator $\hat\beta = [\mathbb{P}_nZZ']^{-1}\mathbb{P}_nZY$ is $\sqrt{n}$ consistent but not always efficient for $\beta$ when the only assumption we are willing to make is that the residuals have mean zero conditional on the covariates. The basic argument for this was taken from the form of the efficient score $\tilde\ell_{\beta,\eta}(Y,Z) = Z(Y-\beta'Z)/P_{\beta,\eta}[e^2|Z]$, which yields a distinctly different estimator than $\hat\beta$ when $z\mapsto P_{\beta,\eta}[e^2|Z=z]$ is non-constant in $z$. In section 4.4 of this chapter, we will describe a data-driven procedure for efficient estimation in this context based on approximating the efficient score.

For now, however, we turn our attention to efficient estimation when we also assume that the covariates are independent of the residual. Accordingly, we denote $\eta$ to be the density of the residual $e$ and $\dot\eta$ to be the derivative of $\eta$. We discussed this model in chapter 1 and pointed out that $\hat\beta$ is still not efficient in this setting. We also claimed that an empirical estimator $\hat F$ of the residual distribution, based on the residuals $Y_1-\hat\beta'Z_1,\ldots,Y_n-\hat\beta'Z_n$, had the property that $\sqrt{n}(\hat F-F)$ converged in a uniform sense to a certain Gaussian quantity. We now derive the efficient score for estimating $\beta$ for this model and sketch a proof of the claimed convergence of $\sqrt{n}(\hat F-F)$.

We assume that $\eta$ is continuously differentiable with $P_{\beta,\eta}(\dot\eta/\eta)^2<\infty$. Using techniques described in section 3.2, it is not hard to verify that the tangent set for the full model is the linear span of $-(\dot\eta/\eta)(e)a'Z + b(e)$, as $a$ spans $\mathbb{R}^k$ and $b$ spans the tangent set $\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$ for $\eta$, consisting of all functions in $L_2^0(e)$ which are orthogonal to $e$, where we use $L_2^0(U)$ to denote all mean

zero real functions $f$ of the random variable $U$ for which $Pf^2(U)<\infty$. The structure of the tangent set follows from the constraint that $\int_{\mathbb{R}}e\,\eta(e)\,de=0$. The projection of the usual score for $\beta$, $\dot\ell_{\beta,\eta}\equiv-(\dot\eta/\eta)(e)Z$, onto $\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$ is a function $h\in\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$ such that $-(\dot\eta/\eta)(e)Z - h(e)$ is uncorrelated with $b(e)$ for all $b\in\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$. We will now verify that $h(e) = -(\dot\eta/\eta)(e)\mu - e\mu/\sigma^2$, where $\mu\equiv E[Z]$ and $\sigma^2 = E[e^2]$. It is easy to see that $h$ is square integrable. Moreover, since $\int_{\mathbb{R}}e\,\dot\eta(e)\,de = -1$, $h$ has mean zero and is orthogonal to $e$. Thus $h\in\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$. It also follows that $-(\dot\eta/\eta)(e)Z - h(e)$ is orthogonal to $\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$, after noting that $-(\dot\eta/\eta)(e)Z - h(e) = -(\dot\eta/\eta)(e)(Z-\mu) + e\mu/\sigma^2 \equiv \tilde\ell_{\beta,\eta}$ is orthogonal to all square-integrable mean zero functions $b(e)$ which satisfy $P_{\beta,\eta}b(e)e=0$. Thus the efficient information is
\[
\tilde I_{\beta,\eta} \equiv P_{\beta,\eta}\tilde\ell_{\beta,\eta}\tilde\ell_{\beta,\eta}' = P_{\beta,\eta}\!\left[\left(\frac{\dot\eta}{\eta}\right)^2(e)\,(Z-\mu)(Z-\mu)'\right] + \frac{\mu\mu'}{\sigma^2}. \tag{4.1}
\]

Since
\[
1 = \left(\int_{\mathbb{R}}e\,\dot\eta(e)\,de\right)^2 = \left(\int_{\mathbb{R}}e\,\frac{\dot\eta}{\eta}(e)\,\eta(e)\,de\right)^2 \le \sigma^2\,P_{\beta,\eta}\!\left(\frac{\dot\eta}{\eta}\right)^2(e),
\]
we have that
\[
(4.1) \ge P_{\beta,\eta}\!\left[\frac{(Z-\mu)(Z-\mu)'}{\sigma^2}\right] + \frac{\mu\mu'}{\sigma^2} = \frac{P[ZZ']}{\sigma^2},
\]
where, for two $k\times k$ symmetric matrices $A$ and $B$, $A\ge B$ means that $A-B$ is positive semidefinite. Thus the efficient estimator can have strictly lower variance than the least-squares estimator $\hat\beta$.

Developing a procedure for calculating an asymptotically efficient estimator appears to require nonparametric estimation of $\dot\eta/\eta$. We will show in the empirical process case studies of chapter 15 how to accomplish this via the residual distribution estimator $\hat F$ defined above. We now turn our attention to verifying that $\sqrt{n}(\hat F-F)$ converges weakly to a tight, mean zero Gaussian process. This result can also be used to check normality of the residuals. Recall that if the residuals are Gaussian, the least-squares estimator is fully efficient. Now, using $P = P_{\beta,\eta}$,
\[
\sqrt{n}\left(\hat F(v)-F(v)\right) = \sqrt{n}\left(\mathbb{P}_n1\{Y-\hat\beta'Z\le v\} - P\,1\{Y-\beta'Z\le v\}\right) = \sqrt{n}(\mathbb{P}_n-P)1\{Y-\hat\beta'Z\le v\} + \sqrt{n}\,P\left(1\{Y-\hat\beta'Z\le v\} - 1\{Y-\beta'Z\le v\}\right)
\]

$= U_n(v)+V_n(v)$.

We will show in part II that $\left\{1\{Y-b'Z\le v\}: b\in\mathbb{R}^k,\,v\in\mathbb{R}\right\}$ is a VC (and hence Donsker) class of functions. Thus, since
\[
\sup_{v\in\mathbb{R}}P\left(1\{Y-\hat\beta'Z\le v\} - 1\{Y-\beta'Z\le v\}\right)^2\stackrel{P}{\to}0,
\]
we have that $U_n(v) = \sqrt{n}(\mathbb{P}_n-P)1\{Y-\beta'Z\le v\} + \epsilon_n(v)$, where $\sup_{v\in\mathbb{R}}|\epsilon_n(v)|\stackrel{P}{\to}0$. It is not difficult to show that
\[
V_n(v) = \sqrt{n}\,P\!\int_v^{v+(\hat\beta-\beta)'Z}\eta(u)\,du. \tag{4.2}
\]

We leave it as an exercise to show that for any $u,v\in\mathbb{R}$,
\[
|\eta(u)-\eta(v)| \le |F(u)-F(v)|^{1/2}\left[P\!\left(\frac{\dot\eta}{\eta}\right)^2(e)\right]^{1/2}. \tag{4.3}
\]

Thus $\eta$ is both bounded and equicontinuous, and thus by (4.2), $V_n(v) = \sqrt{n}(\hat\beta-\beta)'\mu\,\eta(v) + \epsilon_n(v)$, where $\sup_{v\in\mathbb{R}}|\epsilon_n(v)|\stackrel{P}{\to}0$. Hence $\hat F$ is asymptotically linear with influence function

\[
\check\psi(v) = 1\{e\le v\} - F(v) + eZ'\left\{P[ZZ']\right\}^{-1}\mu\,\eta(v),
\]
and thus, since the class $\{\check\psi(v): v\in\mathbb{R}\}$ is Donsker (as the sum of two obviously Donsker classes), $\sqrt{n}(\hat F-F)$ converges weakly to a tight, mean zero Gaussian process.
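The following small simulation sketch (ours; the logistic residual distribution and all constants are illustrative choices) computes $\hat F$ from least-squares residuals and illustrates the stochastic boundedness of $\sup_v\sqrt{n}|\hat F(v)-F(v)|$ implied by the weak convergence just established.

```python
import numpy as np
from scipy.stats import logistic

rng = np.random.default_rng(0)
n = 2000
Z = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])   # includes constant 1
beta = np.array([0.5, -1.0])
e = rng.logistic(0.0, 1.0, n)        # residuals independent of Z, mean zero
Y = Z @ beta + e

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)    # least-squares estimator
resid = Y - Z @ beta_hat                        # estimated residuals

def F_hat(v):
    """Empirical distribution of the estimated residuals, at grid v."""
    return np.mean(resid[:, None] <= v, axis=0)

grid = np.linspace(-6, 6, 400)
# This supremum stays stochastically bounded across replications, consistent
# with weak convergence of sqrt(n)(F_hat - F) to a tight Gaussian process.
print(np.sqrt(n) * np.max(np.abs(F_hat(grid) - logistic.cdf(grid))))
```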

4.1.2 Median Zero Residuals

We have already established in section 2.2.6 that when the residuals have median zero and are independent of the covariates, then the least-absolute-deviation estimator $\hat\beta\equiv\mathop{\arg\min}_{b\in\mathbb{R}^k}\mathbb{P}_n|Y-b'Z|$ is consistent for $\beta$ and $\sqrt{n}(\hat\beta-\beta)$ is asymptotically linear with influence function

\[
\check\psi = \left\{2\eta(0)P[ZZ']\right\}^{-1}Z\,\mathrm{sign}(e).
\]
Thus $\sqrt{n}(\hat\beta-\beta)$ is asymptotically mean zero Gaussian with covariance equal to $\left\{4\eta^2(0)P[ZZ']\right\}^{-1}$. In this section, we will study efficiency for this model and show that $\hat\beta$ is not in general fully efficient. Before doing this, however, we will briefly study efficiency in the more general model where we only assume $E[\mathrm{sign}(e)|Z]=0$ almost surely.

Under this more general model, the joint density of $(e,Z)$, $\eta$, must satisfy $\int_{\mathbb{R}}\mathrm{sign}(e)\,\eta(e,Z)\,de=0$ almost surely. As we did when we studied the conditionally mean zero residual case in section 3.2, assume $\eta$ has partial derivative with respect to the first argument, $\dot\eta_1$, satisfying $\dot\eta_1/\eta\in L_2(P_{\beta,\eta})$. Clearly, $(\dot\eta_1/\eta)(e,Z)$ also has mean zero. The score for $\beta$, assuming $\eta$ is known, is $\dot\ell_{\beta,\eta} = -Z(\dot\eta_1/\eta)(Y-\beta'Z,Z)$. Similar to what was done in section 3.2 for the conditionally mean zero case, it can be shown that the tangent set $\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$ for $\eta$ is the subset of $L_2^0(P_{\beta,\eta})$ consisting of all functions $g(e,Z)\in L_2^0(P_{\beta,\eta})$ which satisfy
\[
E[\mathrm{sign}(e)g(e,Z)|Z] = \frac{\int_{\mathbb{R}}\mathrm{sign}(e)g(e,Z)\eta(e,Z)\,de}{\int_{\mathbb{R}}\eta(e,Z)\,de} = 0
\]
almost surely. It can also be shown that this set is the orthocomplement in $L_2^0(P_{\beta,\eta})$ of all functions of the form $\mathrm{sign}(e)f(Z)$, where $f$ satisfies $P_{\beta,\eta}f^2(Z)<\infty$. Hence the efficient score $\tilde\ell_{\beta,\eta}$ is the projection in $L_2^0(P_{\beta,\eta})$ of $-Z(\dot\eta_1/\eta)(e,Z)$ onto $\left\{\mathrm{sign}(e)f(Z): P_{\beta,\eta}f^2(Z)<\infty\right\}$. Hence

\[
\tilde\ell_{\beta,\eta}(Y,Z) = -Z\,\mathrm{sign}(e)\int_{\mathbb{R}}\dot\eta_1(e,Z)\,\mathrm{sign}(e)\,de = Z\,\mathrm{sign}(e)\,\eta(0,Z),
\]
where the second equality follows from the facts that $\int_{\mathbb{R}}\dot\eta_1(e,Z)\,de=0$ and $\int_{-\infty}^0\dot\eta_1(e,Z)\,de=\eta(0,Z)$. When $\eta(0,z)$ is non-constant in $z$, $\tilde\ell_{\beta,\eta}(Y,Z)$ is not proportional to $Z\,\mathrm{sign}(Y-\beta'Z)$, and thus the least-absolute-deviation estimator is not efficient in this instance. Efficient estimation in this situation appears to require estimation of $\eta(0,Z)$, but we will not pursue this further.

We now return our attention to the setting where the median zero residuals are independent of the covariates. We will also assume that $0<\eta(0)<\infty$. Recall that in this setting, the usual score for $\beta$ is $\dot\ell_{\beta,\eta}\equiv-Z(\dot\eta/\eta)(e)$, where $\dot\eta$ is the derivative of the density $\eta$ of $e$. If we temporarily make the fairly strong assumption that the residuals have a Laplace density (i.e., $\eta(e)=(\nu/2)\exp(-\nu|e|)$ for a real parameter $\nu>0$), then the usual score simplifies to $\nu Z\,\mathrm{sign}(e)$, which is proportional to $Z\,\mathrm{sign}(e)$, and thus the least-absolute-deviation estimator is fully efficient for this special case.

Relaxing the assumptions to allow for an arbitrary median zero density, we can follow arguments similar to those we used in the previous section to obtain that the tangent set $\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$ for $\eta$ consists of all functions in $L_2^0(e)$ which are orthogonal to $\mathrm{sign}(e)$. The structure of the tangent set follows from the median zero residual constraint, which can be expressed as $\int_{\mathbb{R}}\mathrm{sign}(e)\eta(e)\,de=0$. The projection of $\dot\ell_{\beta,\eta}$ on $\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$ is a function $h\in\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$ such that $-(\dot\eta/\eta)(e)Z - h(e)$ is uncorrelated with $b(e)$ for all $b\in\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$. We will now prove that $h(e) = -(\dot\eta/\eta)(e)\mu - 2\eta(0)\,\mathrm{sign}(e)\mu$ satisfies the above constraints. First, it is easy to see that $h(e)$ is square-integrable. Second, since $\int_{\mathbb{R}}\mathrm{sign}(e)\dot\eta(e)\,de = -2\eta(0)$, we have that $\mathrm{sign}(e)h(e)$ has zero expectation. Thus $h\in\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$. Thirdly, it is straightforward to verify that $\dot\ell_{\beta,\eta}(Y,Z)-h(e) = -(\dot\eta/\eta)(e)(Z-\mu) + 2\eta(0)\,\mathrm{sign}(e)\mu$ is orthogonal to

all elements of $\dot{\mathcal P}^{(\eta)}_{P_{\beta,\eta}}$. Hence the efficient score is $\tilde\ell_{\beta,\eta} = \dot\ell_{\beta,\eta} - h$. Thus the efficient information is
\[
\tilde I_{\beta,\eta} = P_{\beta,\eta}\tilde\ell_{\beta,\eta}\tilde\ell_{\beta,\eta}' = P_{\beta,\eta}\!\left[\left(\frac{\dot\eta}{\eta}\right)^2(e)\,(Z-\mu)(Z-\mu)'\right] + 4\eta^2(0)\mu\mu'.
\]

Note that

\[
4\eta^2(0) = 4\left(\int_{-\infty}^0\dot\eta(e)\,de\right)^2 = 4\left(\int_{-\infty}^0\frac{\dot\eta}{\eta}(e)\,\eta(e)\,de\right)^2 \le 4\int_{-\infty}^0\left(\frac{\dot\eta}{\eta}\right)^2(e)\,\eta(e)\,de\int_{-\infty}^0\eta(e)\,de = 2\int_{-\infty}^0\left(\frac{\dot\eta}{\eta}\right)^2(e)\,\eta(e)\,de. \tag{4.4}
\]

Similar arguments yield that

\[
4\eta^2(0) \le 2\int_0^\infty\left(\frac{\dot\eta}{\eta}\right)^2(e)\,\eta(e)\,de.
\]

Combining this last inequality with (4.4), we obtain that

\[
4\eta^2(0) \le \int_{\mathbb{R}}\left(\frac{\dot\eta}{\eta}\right)^2(e)\,\eta(e)\,de.
\]
Hence the efficient estimator can have strictly lower variance than the least-absolute-deviation estimator in this situation. While the least-absolute-deviation estimator is only guaranteed to be fully efficient for Laplace distributed residuals, it does have excellent robustness properties. In particular, the breakdown point of this estimator is 50%. The breakdown point is the maximum proportion of contamination—from arbitrarily large symmetrically distributed residuals—tolerated by the estimator without resulting in inconsistency.
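As a concrete illustration of this subsection (ours, not from the text), the following sketch computes the least-absolute-deviation estimator by direct minimization and evaluates the asymptotic covariance $\{4\eta^2(0)P[ZZ']\}^{-1}$ under Laplace residuals, where the estimator is fully efficient.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 1000
Z = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])
e = rng.laplace(0.0, 1.0, n)     # median zero; Laplace, so LAD is efficient here
Y = Z @ beta + e

# Least-absolute-deviation estimator: argmin_b P_n |Y - b'Z|
fit = minimize(lambda b: np.mean(np.abs(Y - Z @ b)),
               x0=np.zeros(2), method="Nelder-Mead")
beta_hat = fit.x

# Asymptotic covariance of sqrt(n)(beta_hat - beta): {4 eta^2(0) P[ZZ']}^{-1},
# with eta(0) = 1/2 for the standard Laplace density.
eta0 = 0.5
acov = np.linalg.inv(4 * eta0**2 * (Z.T @ Z / n))
print(beta_hat, acov / n)        # estimate and approximate sampling covariance
```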

4.2 Counting Process Regression

We now examine in detail the counting process regression model considered in chapter 1. The observed data are $X=(N,Y,Z)$, where for $t\in[0,\tau]$, $N(t)$ is a counting process and $Y(t)=1\{V\ge t\}$ is an at-risk process based on a random time $V\ge0$ which may depend on $N$, with $PY(0)=1$, $\inf_ZP[Y(\tau)|Z]>0$, $PN^2(\tau)<\infty$, and where $Z\in\mathbb{R}^k$ is a regression covariate. The regression model (1.3) is assumed, implying $E\{dN(t)|Z\} = E\{Y(t)|Z\}e^{\beta'Z}\,d\Lambda(t)$, for some $\beta\in B\subset\mathbb{R}^k$ and continuous nondecreasing function $\Lambda(t)$ with $\Lambda(0)=0$ and $0<\Lambda(\tau)<\infty$. We assume $Z$ is restricted to a bounded set, $\mathrm{var}(Z)$ is positive definite, and that $B$ is open, convex and bounded. We first consider inference for $\beta$ and $\Lambda$ in this general model, and then examine the specialization to the Cox proportional hazards model for right-censored failure time data.

4.2.1 The General Case

As described in chapter 1, we estimate $\beta$ with the estimating equation $U_n(t,\beta) = \mathbb{P}_n\int_0^t[Z-E_n(s,\beta)]\,dN(s)$, where

\[
E_n(t,\beta) = \frac{\mathbb{P}_nZY(t)e^{\beta'Z}}{\mathbb{P}_nY(t)e^{\beta'Z}}.
\]

Specifically, the estimator $\hat\beta$ is a root of $U_n(\tau,\beta)=0$. The estimator for $\Lambda$ is $\hat\Lambda(t) = \int_0^t\left[\mathbb{P}_nY(s)e^{\hat\beta'Z}\right]^{-1}\mathbb{P}_n\,dN(s)$. We first show that $\hat\beta$ is consistent for $\beta$, and that $\sqrt{n}(\hat\beta-\beta)$ is asymptotically normal with a consistently estimable covariance matrix. We then derive consistency and weak convergence results for $\hat\Lambda$ and suggest a simple method of inference based on the influence function. We first argue that certain classes of functions are Donsker, and therefore also Glivenko-Cantelli. We first present the following lemma:

Lemma 4.1 For $-\infty$

Now the derivative of $U_n(\tau,\beta)$ with respect to $\beta$ can be shown to be $-V_n(\beta)$, where
\[
V_n(\beta) = \int_0^\tau\left[\frac{\mathbb{P}_nZZ'Y(t)e^{\beta'Z}}{\mathbb{P}_nY(t)e^{\beta'Z}} - \left(\frac{\mathbb{P}_nZY(t)e^{\beta'Z}}{\mathbb{P}_nY(t)e^{\beta'Z}}\right)^{\otimes2}\right]\mathbb{P}_n\,dN(t),
\]
and where superscript $\otimes2$ denotes outer product. Because all of the classes involved are Glivenko-Cantelli and the limiting values of the denominators are bounded away from zero, we have that $\sup_{\beta\in B}\|V_n(\beta)-V(\beta)\|\stackrel{as*}{\to}0$, where
\[
V(\beta) = \int_0^\tau\left[\frac{PZZ'Y(t)e^{\beta'Z}}{PY(t)e^{\beta'Z}} - \left(\frac{PZY(t)e^{\beta'Z}}{PY(t)e^{\beta'Z}}\right)^{\otimes2}\right]PY(t)e^{\beta'Z}\,d\Lambda(t). \tag{4.5}
\]

After some work, it can be shown that there exists a $c>0$ such that $V(\beta)\ge c\,\mathrm{var}(Z)$, where for two symmetric matrices $A,B$, $A\ge B$ means that $A-B$ is positive semidefinite. Thus $U_n(\tau,\beta)$ is almost surely convex for all $n\ge1$ large enough, and hence $\hat\beta$ is almost surely consistent for the true parameter $\beta_0$.

Using algebra, $U_n(\tau,\beta) = \mathbb{P}_n\int_0^\tau[Z-E_n(s,\beta)]\,dM_\beta(s)$, where $M_\beta(t) = N(t)-\int_0^tY(s)e^{\beta'Z}\,d\Lambda_0(s)$ and $\Lambda_0$ is the true value of $\Lambda$. Let $U(t,\beta) = P\int_0^t[Z-E(s,\beta)]\,dM_\beta(s)$, where $E(t,\beta) = P[ZY(t)e^{\beta'Z}]/P[Y(t)e^{\beta'Z}]$. It is not difficult to verify that
\[
\sqrt{n}\left[U_n(\tau,\hat\beta)-U(\tau,\hat\beta)\right] - \sqrt{n}\left[U_n(\tau,\beta_0)-U(\tau,\beta_0)\right]\stackrel{P}{\to}0,
\]
since
\[
\sqrt{n}\,\mathbb{P}_n\int_0^\tau\left[E_n(s,\hat\beta)-E(s,\hat\beta)\right]dM_{\hat\beta}(s) = \sqrt{n}\,\mathbb{P}_n\int_0^\tau\left[E_n(s,\hat\beta)-E(s,\hat\beta)\right]\left[dM_{\beta_0}(s) - Y(s)\left(e^{\hat\beta'Z}-e^{\beta_0'Z}\right)d\Lambda_0(s)\right]\stackrel{P}{\to}0. \tag{4.6}
\]

This follows from the following lemma (which we prove later in part II), with $[a,b]=[0,\tau]$, $A_n(t) = E_n(t,\hat\beta)-E(t,\hat\beta)$, and $B_n(t) = \sqrt{n}\,\mathbb{P}_nM_{\beta_0}(t)$:

Lemma 4.2 Let $B_n\in D[a,b]$ and $A_n\in\ell^\infty([a,b])$ be either cadlag or caglad, and assume $\sup_{t\in(a,b]}|A_n(t)|\stackrel{P}{\to}0$, $A_n$ has uniformly bounded total variation, and $B_n$ converges weakly to a tight, mean zero process with sample paths in $D[a,b]$. Then $\int_a^bA_n(s)\,dB_n(s)\stackrel{P}{\to}0$.

Thus the conditions of the Z-estimator master theorem, theorem 2.11, are all satisfied, and $\sqrt{n}(\hat\beta-\beta_0)$ converges weakly to a mean zero random vector with covariance
\[
C = V^{-1}(\beta_0)\,P\!\left[\int_0^\tau[Z-E(s,\beta_0)]\,dM_{\beta_0}(s)\right]^{\otimes2}V^{-1}(\beta_0).
\]
With a little additional work, it can be verified that $C$ can be consistently estimated with
\[
\hat C = V_n^{-1}(\hat\beta)\,\mathbb{P}_n\!\left[\int_0^\tau\left[Z-E_n(s,\hat\beta)\right]d\hat M(s)\right]^{\otimes2}V_n^{-1}(\hat\beta),
\]
where $\hat M(t) = N(t)-\int_0^tY(s)e^{\hat\beta'Z}\,d\hat\Lambda(s)$, and where
\[
\hat\Lambda(t) = \int_0^t\frac{\mathbb{P}_n\,dN(s)}{\mathbb{P}_nY(s)e^{\hat\beta'Z}},
\]
as defined in chapter 1. Let $\Lambda_0$ be the true value of $\Lambda$. Then
\[
\hat\Lambda(t)-\Lambda_0(t) = \int_0^t1\{\mathbb{P}_nY(s)>0\}\frac{(\mathbb{P}_n-P)\,dN(s)}{\mathbb{P}_nY(s)e^{\hat\beta'Z}} - \int_0^t1\{\mathbb{P}_nY(s)=0\}\frac{P\,dN(s)}{PY(s)e^{\hat\beta'Z}} - \int_0^t\frac{(\mathbb{P}_n-P)Y(s)e^{\hat\beta'Z}}{PY(s)e^{\hat\beta'Z}\cdot\mathbb{P}_nY(s)e^{\hat\beta'Z}}\,P\,dN(s) - \int_0^t\frac{PY(s)\left(e^{\hat\beta'Z}-e^{\beta_0'Z}\right)}{PY(s)e^{\hat\beta'Z}}\,d\Lambda_0(s)
\]
\[
= A_n(t)-B_n(t)-C_n(t)-D_n(t).
\]

By the smoothness of these functions of $\hat\beta$ and the almost sure consistency of $\hat\beta$, each of the processes $A_n$, $B_n$, $C_n$ and $D_n$ converges uniformly to zero, and thus $\sup_{t\in[0,\tau]}|\hat\Lambda(t)-\Lambda_0(t)|\stackrel{as*}{\to}0$. Since $P\{\mathbb{P}_nY(t)=0\}\le[P\{V<\tau\}]^n$, $B_n(t) = o_p(n^{-1/2})$, where the $o_p(n^{-1/2})$ term is uniform in $t$. It is also not hard to verify that $A_n(t) = (\mathbb{P}_n-P)\tilde A(t)+o_p(n^{-1/2})$ and $C_n(t) = (\mathbb{P}_n-P)\tilde C(t)+o_p(n^{-1/2})$, where $\tilde A(t) = \int_0^t\left[PY(s)e^{\beta_0'Z}\right]^{-1}dN(s)$, $\tilde C(t) = \int_0^t\left[PY(s)e^{\beta_0'Z}\right]^{-1}Y(s)e^{\beta_0'Z}\,d\Lambda_0(s)$, and both remainder terms are uniform in $t$. In addition,
\[
D_n(t) = (\hat\beta-\beta_0)'\int_0^t\frac{PZY(s)e^{\beta_0'Z}}{PY(s)e^{\beta_0'Z}}\,d\Lambda_0(s) + o_p(n^{-1/2}),
\]
where the remainder term is again uniform in $t$.

Taking this all together, we obtain the expansion
\[
\sqrt{n}\left(\hat\Lambda(t)-\Lambda_0(t)\right) = \sqrt{n}(\mathbb{P}_n-P)\int_0^t\frac{dM_{\beta_0}(s)}{PY(s)e^{\beta_0'Z}} - \sqrt{n}(\hat\beta-\beta_0)'\int_0^tE(s,\beta_0)\,d\Lambda_0(s) + o_p(1),
\]
where the remainder term is uniform in $t$. By previous arguments, $\sqrt{n}(\hat\beta-\beta_0) = V^{-1}(\beta_0)\sqrt{n}\,\mathbb{P}_n\int_0^\tau[Z-E(s,\beta_0)]\,dM_{\beta_0}(s)+o_p(1)$, and thus $\hat\Lambda$ is asymptotically linear with influence function
\[
\psi(t) = \int_0^t\frac{dM_{\beta_0}(s)}{PY(s)e^{\beta_0'Z}} - \left(\int_0^\tau[Z-E(s,\beta_0)]\,dM_{\beta_0}(s)\right)'V^{-1}(\beta_0)\int_0^tE(s,\beta_0)\,d\Lambda_0(s). \tag{4.7}
\]
Since $\{\psi(t): t\in[0,\tau]\}$ is Donsker, $\sqrt{n}(\hat\Lambda-\Lambda)$ converges weakly in $D[0,\tau]$ to a tight, mean zero Gaussian process $\mathcal{Z}$ with covariance $P[\psi(s)\psi(t)]$. Let $\hat\psi(t)$ be $\psi(t)$ with $\hat\beta$ and $\hat\Lambda$ substituted for $\beta_0$ and $\Lambda_0$, respectively. After some additional analysis, it can be verified that
\[
\sup_{t\in[0,\tau]}\mathbb{P}_n\left[\hat\psi(t)-\psi(t)\right]^2\stackrel{P}{\to}0.
\]

Since $\psi$ has envelope $kN(\tau)$, for some fixed $k<\infty$, the class $\{\psi(s)\psi(t): s,t\in[0,\tau]\}$ is Glivenko-Cantelli, and thus $\mathbb{P}_n\left[\hat\psi(s)\hat\psi(t)\right]$ is uniformly consistent (in probability) for $P[\psi(s)\psi(t)]$ by the discussion in the beginning of section 2.2.3. However, this is not particularly helpful for inference, and the following approach is better. Let $\xi$ be standard normal and independent of the data $X=(N,Y,Z)$, and consider the wild bootstrap $\tilde\Delta(t) = \mathbb{P}_n\xi\hat\psi(t)$. Although some work is needed, it can be shown that $\sqrt{n}\tilde\Delta$ converges weakly to $\mathcal{Z}$, conditionally on the data, in probability. This is computationally quite simple, since it requires saving $\hat\psi_1(t_j),\ldots,\hat\psi_n(t_j)$ only at the observed jump points $t_1,\ldots,t_{m_n}$, drawing a sample of standard normals $\xi_1,\ldots,\xi_n$, evaluating $\sup_{1\le j\le m_n}|\tilde\Delta(t_j)|$, and repeating often enough to obtain a reliable estimate of the $(1-\alpha)$-level quantile $\hat c_\alpha$. The confidence band $\{\hat\Lambda(t)\pm\hat c_\alpha: t\in[0,\tau]\}$ thus has approximate coverage $1-\alpha$. A number of modifications of this are possible, including a modification where the width of the band at $t$ is roughly proportional to the variance of $\sqrt{n}(\hat\Lambda(t)-\Lambda_0(t))$.
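The following sketch (ours) implements the multiplier bootstrap band just described, assuming the influence-function values $\hat\psi_i(t_j)$ have already been computed and stored in an $n\times m_n$ array.

```python
import numpy as np

def wild_bootstrap_band(psi_hat, Lambda_hat, alpha=0.05, n_boot=1000, seed=0):
    """Multiplier ("wild") bootstrap confidence band for Lambda.

    psi_hat    : (n, m) array, psi_hat[i, j] = hat-psi_i(t_j) at the observed
                 jump points t_1, ..., t_m (assumed precomputed)
    Lambda_hat : (m,) array, the estimator evaluated at the same points
    """
    rng = np.random.default_rng(seed)
    n = psi_hat.shape[0]
    sups = np.empty(n_boot)
    for b in range(n_boot):
        xi = rng.standard_normal(n)        # multipliers: mean 0, variance 1
        delta = xi @ psi_hat / n           # tilde-Delta(t_j) = P_n xi psi_hat(t_j)
        sups[b] = np.max(np.abs(delta))    # sup_j |tilde-Delta(t_j)|
    c_alpha = np.quantile(sups, 1 - alpha) # (1 - alpha)-level quantile
    return Lambda_hat - c_alpha, Lambda_hat + c_alpha
```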

4.2.2 The Cox Model

For the Cox regression model applied to right-censored failure time data, we observe $X=(W,\delta,Z)$, where $W=T\wedge C$, $\delta=1\{W=T\}$, $Z\in\mathbb{R}^k$ is a regression covariate, $T$ is a right-censored failure time with integrated hazard $e^{\beta'Z}\Lambda(t)$ given the covariate, and where $C$ is a censoring time independent of $T$ given $Z$. We also assume that censoring is uninformative of $\beta$ or $\Lambda$. This is a special case of the general counting process regression model of the previous section, with $N(t)=\delta1\{W\le t\}$ and $Y(t)=1\{W\ge t\}$. The consistency of $\hat\beta$, a zero of $U_n(\tau,\beta)$, and of $\hat\Lambda$ both follow from the previous general results, as do the asymptotic normality and the validity of the multiplier bootstrap based on the estimated influence function. There are, however, some special features of the Cox model that are of interest, including the martingale structure, the limiting covariance, and the efficiency of the estimators.

First, it is not difficult to show that $M_{\beta_0}(t)$ and $U_n(t,\beta_0)$ are both continuous-time martingales (Fleming and Harrington, 1991; Andersen, Borgan, Gill and Keiding, 1993). This implies that
\[
P\!\left[\int_0^\tau[Z-E(s,\beta_0)]\,dM_{\beta_0}(s)\right]^{\otimes2} = V(\beta_0),
\]

and thus the asymptotic limiting variance of $\sqrt{n}(\hat\beta-\beta_0)$ is simply $V^{-1}(\beta_0)$. Thus $\hat\beta$ is efficient. In addition, the results of this section verify conditions (3.6) and (3.8) of theorem 3.1 for $\hat{\tilde\ell}_{\beta,n}$ defined in (3.9). Verification of (3.7) is left as an exercise. This provides another proof of the efficiency of $\hat\beta$. What remains to be verified is that $\hat\Lambda$ is also efficient (in the uniform sense).

The influence function for $\hat\Lambda$ is $\psi$ given in (4.7). We will now use the methods of section 3.2 to verify that $\psi(u)$ is the efficient influence function for the parameter $\Lambda(u)$, for each $u\in[0,\tau]$. This will imply that $\hat\Lambda$ is uniformly efficient for $\Lambda$, since $\sqrt{n}(\hat\Lambda-\Lambda)$ converges weakly to a tight, mean zero Gaussian process.

The tangent space for the Cox model (for both parameters together) is $\left\{A(a,b)=\int_0^\tau[Z'a+b(s)]\,dM_\beta(s):(a,b)\in H\right\}$, where $H = \mathbb{R}^k\times L_2^0(\Lambda)$. Thus $A:H\to L_2^0(P_{\beta,\Lambda})$ is a score operator for the full model. The natural inner product for pairs of elements in $H$ is $\langle(a,b),(c,d)\rangle_H = a'c + \int_0^\tau b(s)d(s)\,d\Lambda(s)$, for $(a,b),(c,d)\in H$; and the natural inner product for $g,h\in L_2^0(P_{\beta,\Lambda})$ is $\langle g,h\rangle_{P_{\beta,\Lambda}} = P_{\beta,\Lambda}[gh]$. Hence the adjoint of $A$, $A^*$, satisfies $\langle A(a,b),g\rangle_{P_{\beta,\Lambda}} = \langle(a,b),A^*g\rangle_H$ for all $(a,b)\in H$ and all $g\in L_2^0(P_{\beta,\Lambda})$. It can be shown that
\[
A^*g = \left(P_{\beta,\Lambda}\left[\int_0^\tau Z\,dM(s)\,g\right],\ \frac{P_{\beta,\Lambda}[dM(t)\,g]}{d\Lambda(t)}\right)
\]
satisfies these equations and is thus the adjoint of $A$.

Let the one-dimensional submodel $\{P_t: t\in N_\epsilon\}$ be a perturbation in the direction $A(a,b)$, for some $(a,b)\in H$. In section 3.2, we showed that $\Lambda_t(u) = \int_0^u(1+tb(s))\,d\Lambda(s)+o(t)$, and thus, for the parameter $\chi(P_{\beta,\Lambda}) = \Lambda(u)$, $\partial\chi(P_t)/(\partial t)\big|_{t=0} = \int_0^ub(s)\,d\Lambda(s)$. Thus $\langle\tilde\chi,(a,b)\rangle_H = \int_0^ub(s)\,d\Lambda(s)$, and therefore $\tilde\chi = (0,\,1\{s\le u\})\in H$ ($s$ is the function argument here). Our proof will be complete if we can show that $A^*\psi = \tilde\chi$, since this would imply that $\psi(u)$ is the efficient influence function for $\Lambda(u)$. First, the first component of $A^*\psi(u)$ satisfies
\[
\begin{aligned}
P_{\beta,\Lambda}\left[\int_0^\tau Z\,dM(s)\,\psi(u)\right] &= P_{\beta,\Lambda}\left[\int_0^\tau Z\,dM(s)\int_0^u\frac{dM(s)}{P_{\beta,\Lambda}[Y(s)e^{\beta'Z}]}\right]\\
&\quad - P_{\beta,\Lambda}\left[\int_0^\tau Z\,dM(s)\int_0^\tau[Z-E(s,\beta)]'\,dM(s)\right]V^{-1}(\beta)\int_0^uE(s,\beta)\,d\Lambda(s)\\
&= \int_0^uE(s,\beta)\,d\Lambda(s) - V(\beta)V^{-1}(\beta)\int_0^uE(s,\beta)\,d\Lambda(s)\\
&= 0.
\end{aligned}
\]

Second, it is not difficult to verify that $P_{\beta,\Lambda}[dM(s)\,\psi(u)] = 1\{s\le u\}\,d\Lambda(s)$, and thus we obtain the desired result. Therefore, both parameter estimators $\hat\beta$ and $\hat\Lambda$ are uniformly efficient for estimating $\beta$ and $\Lambda$ in the Cox model.
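For concreteness, here is a compact sketch (ours, assuming no tied event times; it is not the book's code) of Newton-Raphson for the Cox partial likelihood together with the Breslow-type estimator $\hat\Lambda$.

```python
import numpy as np

def cox_fit(W, delta, Z, n_iter=20):
    """Newton-Raphson for the Cox partial likelihood plus the Breslow-type
    estimator of the baseline cumulative hazard (no tied event times)."""
    order = np.argsort(W)
    W, delta, Z = W[order], delta[order].astype(float), Z[order]
    n, k = Z.shape
    beta = np.zeros(k)
    for _ in range(n_iter):
        risk = np.exp(Z @ beta)                           # e^{beta'Z_j}
        S0 = np.cumsum(risk[::-1])[::-1]                  # sums over risk set {j: W_j >= W_i}
        S1 = np.cumsum((risk[:, None] * Z)[::-1], 0)[::-1]
        E = S1 / S0[:, None]                              # E_n(W_i, beta)
        score = (delta[:, None] * (Z - E)).sum(0)         # n * U_n(tau, beta)
        S2 = np.cumsum((risk[:, None, None]
                        * Z[:, :, None] * Z[:, None, :])[::-1], 0)[::-1]
        info = (delta[:, None, None] * (S2 / S0[:, None, None]
                - E[:, :, None] * E[:, None, :])).sum(0)  # n * V_n(beta)
        beta = beta + np.linalg.solve(info, score)        # Newton step
    risk = np.exp(Z @ beta)
    S0 = np.cumsum(risk[::-1])[::-1]
    Lambda_hat = np.cumsum(delta / S0)   # jumps delta_i / sum_j Y_j(W_i) e^{beta'Z_j}
    return beta, W, Lambda_hat
```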

4.3 The Kaplan-Meier Estimator

In this section, we consider the same right-censored failure time data considered in section 4.2.2, except that there is no regression covariate. The precise set-up is described in section 2.2.5. Thus $T$ and $C$ are assumed to be independent, where $T$ has distribution function $F$ and $C$ has distribution function $G$. Assume also that $F(0)=0$. We denote $P_F$ to be the probability measure for the observed data $X=(W,\delta)$, and allow both $F$ and $G$ to have jumps. Define $S=1-F$, $L=1-G$, and $\pi(t)=P_FY(t)$, and let $\tau\in(0,\infty)$ satisfy $F(\tau)>0$ and $\pi(\tau)>0$. Also define $\hat\Lambda(t) = \int_0^t[\mathbb{P}_nY(s)]^{-1}\mathbb{P}_n\,dN(s)$. The Kaplan-Meier estimator $\hat S$ has the product integral form $\hat S(t) = \prod_{0<s\le t}\left(1-d\hat\Lambda(s)\right)$, and satisfies the identity
\[
\hat S(u)-S(u) = -\hat S(u)\int_0^u\frac{S(v-)}{\hat S(v)}1\{\mathbb{P}_nY(v)>0\}\frac{\mathbb{P}_n\,dM(v)}{\mathbb{P}_nY(v)} + \hat S(u)\int_0^u\frac{S(v-)}{\hat S(v)}1\{\mathbb{P}_nY(v)=0\}\,d\Lambda(v)
\]

$= A_n(u)+B_n(u)$, where $M(v) = N(v)-\int_0^vY(s)\,d\Lambda(s)$ is a martingale. Since $\pi(\tau)>0$, $B_n(u) = o_p(n^{-1/2})$. Using martingale methods, it can be shown that $A_n(u) = \mathbb{P}_n\check\psi(u)+o_p(n^{-1/2})$, where
\[
\check\psi(u) = -S(u)\int_0^u\frac{1}{1-\Delta\Lambda(v)}\,\frac{dM(v)}{\pi(v)}.
\]

Since all error terms are uniform for $u\in[0,\tau]$, we obtain that $\hat S$ is asymptotically linear, with influence function $\check\psi$. As an element of $L_2^0(P_F)$, $\check\psi(u)(W,\delta) = \check g(W,\delta)$, where
\[
\check g(W,1) = -S(u)\left[\frac{1\{W\le u\}}{[1-\Delta\Lambda(W)]\pi(W)} - \int_0^u\frac{1\{W\ge s\}\,d\Lambda(s)}{[1-\Delta\Lambda(s)]\pi(s)}\right] \quad\text{and}\quad \check g(W,0) = S(u)\int_0^u\frac{1\{W\ge s\}\,d\Lambda(s)}{[1-\Delta\Lambda(s)]\pi(s)}.
\]
For each $h\in L_2^0(F)$, there exists a one-dimensional submodel $\{F_t: t\in N_\epsilon\}$, with $F_t(v) = \int_0^v(1+th(s))\,dF(s)+o(t)$. This is clearly the maximal tangent set for $F$. This collection of submodels can be shown to generate the tangent set for the observed data model $\{P_F: F\in\mathcal{D}\}$, where $\mathcal{D}$ is the collection of all failure time distribution functions with $F(0)=0$, via the score operator $A:L_2^0(F)\to L_2^0(P_F)$ defined by
\[
(Ah)(W,\delta) = \delta h(W) + (1-\delta)\frac{\int_W^\infty h(v)\,dF(v)}{S(W)}.
\]
For $a,b\in L_2^0(F)$, let $\langle a,b\rangle_F = \int_0^\infty a(s)b(s)\,dF(s)$; and for $j,k\in L_2^0(P_F)$, let $\langle j,k\rangle_{P_F} = P_F[jk]$. Thus the adjoint of $A$ must satisfy $\langle Ah,g\rangle_{P_F} = \langle h,A^*g\rangle_F$. Accordingly,
\[
\begin{aligned}
P_F[(Ah)g] &= P_F\left[\delta h(W)g(W,1) + (1-\delta)\frac{\int_W^\infty h(v)\,dF(v)}{S(W)}\,g(W,0)\right]\\
&= \int_0^\infty g(v,1)L(v-)h(v)\,dF(v) + \int_0^\infty\frac{\int_w^\infty h(v)\,dF(v)}{S(w)}\,g(w,0)\,S(w)\,dG(w)\\
&= \int_0^\infty g(v,1)L(v-)h(v)\,dF(v) - \int_0^\infty\left[\int_{[v,\infty]}g(w,0)\,dG(w)\right]h(v)\,dF(v),
\end{aligned}
\]
by the fact that $\int_s^\infty h(v)\,dF(v) = -\int_0^sh(v)\,dF(v)$ and by changing the order of integration on the right-hand-side. Thus $A^*g(v) = g(v,1)L(v-) - \int_{[v,\infty]}g(s,0)\,dG(s)$.

With $S_t(u) = 1-F_t(u)$ based on a submodel perturbed in the direction $h\in L_2^0(F)$, we have that $\partial S_t(u)/(\partial t)\big|_{t=0} = \int_u^\infty h(v)\,dF(v) = -\int_0^uh(v)\,dF(v) = \langle\tilde\chi,h\rangle_F$, where $\tilde\chi = -1\{v\le u\}$. We now verify that $A^*[\check\psi(u)] = \tilde\chi$, and thereby prove that $\hat S(u)$ is efficient for estimating the parameter $S(u)$, since it can be shown that $\check\psi\in R(A)$. We now have

\[
A^*[\check\psi(u)](v) = \check\psi(u)(v,1)L(v-) - \int_{[v,\infty]}\check\psi(u)(s,0)\,dG(s) \tag{4.8}
\]
\[
= -\frac{S(u)}{S(v)}1\{v\le u\} + S(u)L(v-)\int_0^u\frac{1\{v\ge s\}\,d\Lambda(s)}{[1-\Delta\Lambda(s)]\pi(s)} - \int_{[v,\infty]}S(u)\int_0^u\frac{1\{s\ge r\}\,d\Lambda(r)}{[1-\Delta\Lambda(r)]\pi(r)}\,dG(s).
\]
Since $\int_{[v,\infty]}1\{s\ge r\}\,dG(s) = L([v\vee r]-) = 1\{v\ge r\}L(v-) + 1\{v<r\}L(r-)$, we now have that
\[
A^*[\check\psi(u)](v) = -\frac{S(u)}{S(v)}1\{v\le u\} - S(u)\int_0^u1\{v<r\}\frac{L(r-)\,d\Lambda(r)}{[1-\Delta\Lambda(r)]\pi(r)} = -\frac{S(u)}{S(v)}1\{v\le u\} - S(u)\int_0^u1\{v<r\}\frac{d\Lambda(r)}{S(r)} = -1\{v\le u\} = \tilde\chi(v),
\]
where the last two equalities use $\pi(r) = S(r-)L(r-)$, $S(r) = S(r-)[1-\Delta\Lambda(r)]$, and $\int_{(v,u]}d\Lambda(r)/S(r) = 1/S(u)-1/S(v)$ for $v\le u$. Thus $\hat S(u)$ is efficient for estimating $S(u)$, for each $u\in[0,\tau]$.
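A minimal implementation of the Kaplan-Meier estimator itself (ours) may help fix ideas:

```python
import numpy as np

def kaplan_meier(W, delta):
    """Product-limit estimator: S_hat(t) = prod_{s <= t} (1 - dLambda_hat(s)),
    with dLambda_hat(s) = (# events at s) / (# at risk at s)."""
    times = np.unique(W)
    S, surv = 1.0, []
    for t in times:
        at_risk = np.sum(W >= t)                  # n * P_n Y(t)
        events = np.sum((W == t) & (delta == 1))  # n * P_n dN(t)
        S *= 1.0 - events / at_risk
        surv.append(S)
    return times, np.array(surv)
```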

4.4 Efficient Estimating Equations for Regression

We now consider a generalization of the conditionally mean zero residual linear regression model considered previously. A typical observation is assumed to be $X=(Y,Z)$, where $Y = g_\theta(Z)+e$, $E\{e|Z\}=0$, $Z,\theta\in\mathbb{R}^k$, and $g_\theta(Z)$ is a known, sufficiently smooth function of $\theta$. In addition to linear regression, generalized linear models—as well as many nonlinear regression models—fall into this structure. We assume that $(Z,e)$ has a density $\eta$, and, therefore, that the observation $(Y,Z)$ has density $\eta(y-g_\theta(z),z)$ with the only restriction being that $\int_{\mathbb{R}}e\,\eta(e,z)\,de=0$. As we observed in section 3.2, these conditions force the score functions for $\eta$ to be all square-integrable functions $a(e,z)$ which satisfy
\[
E\{ea(e,Z)|Z\} = \frac{\int_{\mathbb{R}}e\,a(e,Z)\eta(e,Z)\,de}{\int_{\mathbb{R}}\eta(e,Z)\,de} = 0
\]
almost surely. As also demonstrated in section 3.2, the above equality implies that the tangent space for $\eta$ is the orthocomplement in $L_2^0(P_{\theta,\eta})$ of the set $H$ of all functions of the form $eh(Z)$, where $Eh^2(Z)<\infty$.

Hence, as pointed out previously, the efficient score for $\theta$ is obtained by projecting the ordinary score $\dot\ell_{\theta,\eta}(e,z) = -[\dot\eta_1(e,z)/\eta(e,z)]\,\dot g_\theta(z)$ onto $H$, where $\dot\eta_1$ is the derivative with respect to the first argument of $\eta$ and $\dot g_\theta$ is the derivative of $g_\theta$ with respect to $\theta$. Of course, we are assuming that these derivatives exist and are square-integrable. Since the projection of an arbitrary $b(e,z)$ onto $H$ is $eE\{eb(e,Z)|Z\}/V(Z)$, where $V(z)\equiv E\{e^2|Z=z\}$, the efficient score for $\theta$ is
\[
\tilde\ell_{\theta,\eta}(Y,Z) = -\frac{\dot g_\theta(Z)\,e\int_{\mathbb{R}}\dot\eta_1(u,Z)\,u\,du}{V(Z)\int_{\mathbb{R}}\eta(u,Z)\,du} = \frac{\dot g_\theta(Z)(Y-g_\theta(Z))}{V(Z)}. \tag{4.9}
\]
This, of course, implies the result in section 3.2 for the special case of linear regression. In practice, the form of $V(Z)$ is typically not known and needs to be estimated. Let this estimator be denoted $\hat V(Z)$. It can be shown that, even if $\hat V$ is not consistent but converges to some $\tilde V\neq V$, the estimating equation
\[
\hat S_n(\theta) \equiv \mathbb{P}_n\left[\frac{\dot g_\theta(Z)(Y-g_\theta(Z))}{\hat V(Z)}\right]
\]
is approximately equivalent to the estimating equation
\[
\tilde S_n(\theta) \equiv \mathbb{P}_n\left[\frac{\dot g_\theta(Z)(Y-g_\theta(Z))}{\tilde V(Z)}\right],
\]
both of which can still yield $\sqrt{n}$ consistent estimators of $\theta$. The closer $\tilde V$ is to $V$, the more efficient will be the estimator based on solving $\hat S_n(\theta)=0$. Another variant of the question of optimality is "for what choice of $\tilde V$ will the estimator obtained by solving $\tilde S_n(\theta)=0$ yield the smallest possible variance?" Godambe (1960) showed that, for univariate $\theta$, the answer is $\tilde V = V$. This result is not based on semiparametric efficiency analysis, but is obtained from minimizing the limiting variance of the estimator over all "reasonable" choices of $\tilde V$. For any real, measurable function $w$ of $z\in\mathbb{R}^k$ (measurable on the probability space for $Z$), let

\[
S_{n,w}(\theta) \equiv \mathbb{P}_n\left[\dot g_\theta(Z)(Y-g_\theta(Z))\,w(Z)/V(Z)\right],
\]

$U(z)\equiv\dot g_\theta(z)\dot g_\theta'(z)/V(z)$, and let $\hat\theta_{n,w}$ be a solution of $S_{n,w}(\theta)=0$. Standard methods can be used to show that if both $E[U(Z)]$ and $E[U(Z)w(Z)]$ are positive definite, then the limiting variance of $\sqrt{n}(\hat\theta_{n,w}-\theta)$ is

\[
\left\{E[U(Z)w(Z)]\right\}^{-1}E\left[U(Z)w^2(Z)\right]\left\{E[U(Z)w(Z)]\right\}^{-1}.
\]
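As a sketch (ours), the plug-in version of this sandwich formula can be computed as follows, given user-supplied arrays of $\dot g_\theta(Z_i)$, $w(Z_i)$ and $V(Z_i)$:

```python
import numpy as np

def sandwich_variance(gdot, w, V):
    """Plug-in estimate of the limiting variance of sqrt(n)(theta_hat_w - theta):
    {E[Uw]}^{-1} E[Uw^2] {E[Uw]}^{-1}, with U(z) = gdot(z) gdot(z)' / V(z).

    gdot : (n, k) array of derivatives of g_theta, evaluated at the data
    w, V : (n,) arrays of weight and variance function values at Z_i
    """
    U = gdot[:, :, None] * gdot[:, None, :] / V[:, None, None]  # U(Z_i), (n, k, k)
    Uw = (U * w[:, None, None]).mean(axis=0)                    # E[U(Z) w(Z)]
    Uw2 = (U * (w ** 2)[:, None, None]).mean(axis=0)            # E[U(Z) w^2(Z)]
    Uw_inv = np.linalg.inv(Uw)
    return Uw_inv @ Uw2 @ Uw_inv
```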

The following proposition yields Godambe's (1960) result generalized for arbitrary $k\ge1$:

Proposition 4.3 Assume E[U(Z)] is positive definite. Then, for any real, Z-measurable function w for which E[U(Z)w(Z)] is positive definite,

\[
C_{0,w} \equiv \left\{E[U(Z)w(Z)]\right\}^{-1}E\left[U(Z)w^2(Z)\right]\left\{E[U(Z)w(Z)]\right\}^{-1} - \left\{E[U(Z)]\right\}^{-1}
\]
is positive semidefinite.

Proof. Define $B(z)\equiv\left\{E[U(Z)w(Z)]\right\}^{-1}w(z) - \left\{E[U(Z)]\right\}^{-1}$, and note that $C_w(Z)\equiv B(Z)U(Z)B'(Z)$ must therefore be positive semidefinite. The desired result now follows since $E[C_w(Z)] = C_{0,w}$.

Note that when $A-B$ is positive semidefinite, for two $k\times k$ variance matrices $A$ and $B$, we know that $B$ is the smaller variance. This follows since, for any $v\in\mathbb{R}^k$, $v'Av\ge v'Bv$. Thus the choice $w=1$ will yield the minimum variance, or, in other words, the choice $\tilde V = V$ will yield the lowest possible variance for estimators asymptotically equivalent to the solution of $\tilde S_n(\theta)=0$.

We now verify that estimation based on solving $\hat S_n(\theta)=0$ is asymptotically equivalent to estimation based on solving $\tilde S_n(\theta)=0$, under reasonable regularity conditions. Assume that $\hat\theta$ satisfies $\hat S_n(\hat\theta) = o_p(n^{-1/2})$ and that $\hat\theta\stackrel{P}{\to}\theta$. Assume also that for every $\epsilon>0$, there exists a $P$-Donsker class $\mathcal{G}$ such that the inner probability that $\hat V\in\mathcal{G}$ is $>1-\epsilon$ for all $n$ large enough, and that for some $\tau>0$, the class
\[
\mathcal{F}_1 = \left\{\frac{\dot g_{\theta_1}(Z)(Y-g_{\theta_2}(Z))}{W(Z)}: \|\theta_1-\theta\|\le\tau,\ \|\theta_2-\theta\|\le\tau,\ W\in\mathcal{G}\right\}
\]
is $P$-Donsker, and the class

\[
\mathcal{F}_2 = \left\{\frac{\dot g_{\theta_1}(Z)\,\dot g_{\theta_2}'(Z)}{W(Z)}: \|\theta_1-\theta\|\le\tau,\ \|\theta_2-\theta\|\le\tau,\ W\in\mathcal{G}\right\}
\]
is $P$-Glivenko-Cantelli. We also need that
\[
P\left[\frac{\dot g_{\hat\theta}(Z)}{\hat V(Z)} - \frac{\dot g_\theta(Z)}{\tilde V(Z)}\right]^2\stackrel{P}{\to}0, \tag{4.10}
\]

and, for any $\tilde\theta\stackrel{P}{\to}\theta$,
\[
P\left[\frac{\dot g_{\hat\theta}(Z)\,\dot g_{\tilde\theta}'(Z)}{\hat V(Z)}\right] - P\left[\frac{\dot g_\theta(Z)\,\dot g_\theta'(Z)}{\tilde V(Z)}\right]\stackrel{P}{\to}0. \tag{4.11}
\]

We have the following lemma:

Lemma 4.4 Assume that $E[Y|Z=z] = g_\theta(z)$, that
\[
U_0 \equiv E\left[\frac{\dot g_\theta(Z)\,\dot g_\theta'(Z)}{\tilde V(Z)}\right]
\]

is positive definite, that $\hat\theta$ satisfies $\hat S_n(\hat\theta) = o_p(n^{-1/2})$, and that $\hat\theta\stackrel{P}{\to}\theta$. Suppose also that $\mathcal{F}_1$ is $P$-Donsker, that $\mathcal{F}_2$ is $P$-Glivenko-Cantelli, and that both (4.10) and (4.11) hold for any $\tilde\theta\stackrel{P}{\to}\theta$. Then $\sqrt{n}(\hat\theta-\theta)$ is asymptotically normal with variance
\[
U_0^{-1}\,E\left[\frac{\dot g_\theta(Z)\,\dot g_\theta'(Z)\,V(Z)}{\tilde V^2(Z)}\right]U_0^{-1}.
\]

Moreover, if $\tilde V = V$, $Z$-almost surely, then $\hat\theta$ is optimal in the sense of proposition 4.3.

Proof. For any $h\in\mathbb{R}^k$,
\[
o_p(n^{-1/2}) = h'\mathbb{P}_n\left[\frac{\dot g_{\hat\theta}(Z)(Y-g_{\hat\theta}(Z))}{\hat V(Z)}\right] = h'\mathbb{P}_n\left[\frac{\dot g_\theta(Z)(Y-g_\theta(Z))}{\tilde V(Z)}\right] + h'\mathbb{P}_n\left[\left(\frac{\dot g_{\hat\theta}(Z)}{\hat V(Z)} - \frac{\dot g_\theta(Z)}{\tilde V(Z)}\right)(Y-g_\theta(Z))\right] - h'\mathbb{P}_n\left[\frac{\dot g_{\hat\theta}(Z)\,\dot g_{\tilde\theta}'(Z)}{\hat V(Z)}\right](\hat\theta-\theta),
\]
where $\tilde\theta$ is on the line segment between $\theta$ and $\hat\theta$. Now the conditions of the theorem can be seen to imply that
\[
\sqrt{n}(\hat\theta-\theta) = \sqrt{n}\,U_0^{-1}\,\mathbb{P}_n\left[\frac{\dot g_\theta(Z)(Y-g_\theta(Z))}{\tilde V(Z)}\right] + o_p(1),
\]
and the first conclusion of the lemma follows. The final conclusion now follows directly from proposition 4.3.

The conditions of lemma 4.4 are easily satisfied for a variety of regression models and variance estimators $\hat V$. For example, if $g_\theta(Z) = (1+e^{-\theta'Z})^{-1}$ is the conditional expectation of a Bernoulli outcome $Y$, given $Z$, and both $\theta$ and $Z$ are assumed to be bounded, then all the conditions are easily satisfied with $\hat\theta$ being a zero of $\hat S_n$ and $\hat V(Z) = g_{\hat\theta}(Z)\left[1-g_{\hat\theta}(Z)\right]$, provided $E[ZZ']$ is positive definite. Note that for this example, $\hat\theta$ is also semiparametric efficient.

We now consider two additional examples in some detail. The first example is a special case of the semiparametric model discussed at the beginning of this section. The model is simple linear regression but with an unspecified form for $V(Z)$. Note that for general $g_\theta$, if the conditions of lemma 4.4 are satisfied for $\tilde V = V$, then the estimator $\hat\theta$ is semiparametric efficient by the form of the efficient score given in (4.9). The second example considers estimation of a semiparametric Poisson mixture regression model, where the mixture induces extra-Poisson variation. We will develop an optimal estimating equation procedure in the sense of proposition 4.3. Unfortunately, it is unclear in this instance how to strengthen this result to obtain semiparametric efficiency.
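Before turning to these examples, note that the Bernoulli illustration above is easy to make concrete: with $\hat V = g_{\hat\theta}(1-g_{\hat\theta})$ and $\dot g_\theta(z) = g_\theta(z)[1-g_\theta(z)]z$, the weights cancel and $\hat S_n(\theta)=0$ reduces to the ordinary logistic score equation. A minimal sketch (ours):

```python
import numpy as np
from scipy.optimize import fsolve

def bernoulli_theta_hat(Y, Z):
    """Solve S_hat_n(theta) = P_n[ gdot(Z)(Y - g(Z)) / V_hat(Z) ] = 0 for
    g_theta(z) = 1/(1 + exp(-theta'z)) and V_hat = g(1 - g).  Since
    gdot = g(1 - g) z, the weights cancel: this is the logistic score."""
    def S(theta):
        g = 1.0 / (1.0 + np.exp(-Z @ theta))
        return Z.T @ (Y - g) / len(Y)
    return fsolve(S, np.zeros(Z.shape[1]))
```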

4.4.1 Simple Linear Regression

Consider simple linear regression based on a univariate $Z$. Let $\theta = (\alpha,\beta)'\in\mathbb{R}^2$ and assume $g_\theta(Z) = \alpha+\beta Z$. We also assume that the support of $Z$ is a known compact interval $[a,b]$. We will use a modified kernel method of estimating $V(z)\equiv E[e^2|Z=z]$. Let the kernel $L:\mathbb{R}\to[0,1]$ satisfy $L(x)=0$ for all $|x|>1$ and $L(x)=1-|x|$ otherwise, and let $h\le(b-a)/2$ be the bandwidth for this kernel. For the sample $(Y_1,Z_1),\ldots,(Y_n,Z_n)$, let $\hat\theta = (\hat\alpha,\hat\beta)'$ be the usual least-squares estimator of $\theta$, and let $\hat e_i = Y_i-\hat\alpha-\hat\beta Z_i$ be the estimated residuals. Let $\hat F_n(u)$ be the empirical distribution of $Z_1,\ldots,Z_n$, and let $\hat H_n(u)\equiv n^{-1}\sum_{i=1}^n\hat e_i^2\,1\{Z_i\le u\}$. We will denote $F$ as the true distribution of $Z$, with density $f$, and also define $H(z)\equiv\int_a^zV(u)\,dF(u)$. Now define
\[
\hat V(z) \equiv \frac{\int_{\mathbb{R}}h^{-1}L\left(\frac{z-u}{h}\right)\hat H_n(du)}{\int_{\mathbb{R}}h^{-1}L\left(\frac{z-u}{h}\right)\hat F_n(du)},
\]
for $z\in[a+h,b-h]$, and let $\hat V(z) = \hat V(a+h)$ for all $z\in[a,a+h)$ and $\hat V(z) = \hat V(b-h)$ for all $z\in(b-h,b]$. We need to assume in addition that both $F$ and $V$ are twice differentiable with second derivatives uniformly bounded on $[a,b]$, that for some $M<\infty$ we have $M^{-1}\le f(z),V(z)\le M$ for all $a\le z\le b$, and that the possibly data-dependent bandwidth satisfies $h = o_P(1)$ and $h^{-1} = o_P(n^{1/4})$. If we let $U_i\equiv(1,Z_i)'$, $i=1,\ldots,n$, then $\hat H_n(z) =$

\[
\begin{aligned}
n^{-1}\sum_{i=1}^n\hat e_i^2\,1\{Z_i\le z\} &= n^{-1}\sum_{i=1}^n\left[e_i-(\hat\theta-\theta)'U_i\right]^21\{Z_i\le z\}\\
&= n^{-1}\sum_{i=1}^ne_i^2\,1\{Z_i\le z\} - 2(\hat\theta-\theta)'n^{-1}\sum_{i=1}^nU_ie_i\,1\{Z_i\le z\} + (\hat\theta-\theta)'n^{-1}\sum_{i=1}^nU_iU_i'\,1\{Z_i\le z\}\,(\hat\theta-\theta)\\
&= A_n(z) - B_n(z) + C_n(z).
\end{aligned}
\]

In chapter 9, we will show in an exercise that $\mathcal{G}_1\equiv\{Ue\cdot1\{Z\le z\}: z\in[a,b]\}$ is Donsker. Hence $\|\mathbb{P}_n-P\|_{\mathcal{G}_1} = O_P(n^{-1/2})$. Since also $E[e|Z]=0$ and $\|\hat\theta-\theta\| = O_P(n^{-1/2})$, we now have that $\sup_{z\in[a,b]}|B_n(z)| = O_P(n^{-1})$. By noting that $U$ is bounded under our assumptions, we also obtain that $\sup_{z\in[a,b]}|C_n(z)| = O_P(n^{-1})$. In another exercise in chapter 9, we will verify that $\mathcal{G}_2\equiv\{e^2\cdot1\{Z\le z\}: z\in[a,b]\}$ is also Donsker. Hence $\sup_{z\in[a,b]}|A_n(z)-H(z)| = O_P(n^{-1/2})$, and thus also

$\sup_{z\in[a,b]}|\hat H_n(z)-H(z)| = O_P(n^{-1/2})$. Standard results also verify that $\sup_{z\in[a,b]}|\hat F_n(z)-F(z)| = O_P(n^{-1/2})$. Let $D_n\equiv\hat H_n-H$, let $\dot L$ be the derivative of $L$, and note that for any $z\in[a+h,b-h]$ we have, by integration by parts and by the form of $\dot L$, that
\[
\int_{\mathbb{R}}h^{-1}L\left(\frac{z-u}{h}\right)D_n(du) = -\int_{z-h}^{z+h}D_n(u)\,h^{-2}\dot L\left(\frac{z-u}{h}\right)du.
\]
Thus
\[
\left|\int_{\mathbb{R}}h^{-1}L\left(\frac{z-u}{h}\right)D_n(du)\right| \le h^{-1}\sup_{z\in[a,b]}\left|\hat H_n(z)-H(z)\right|.
\]

Since the right-hand-side does not depend on $z$, and by the result of the previous paragraph, we obtain that the supremum of the left-hand-side over $z\in[a+h,b-h]$ is $O_P(h^{-1}n^{-1/2})$. Letting $\dot H$ be the derivative of $H$, we save it as an exercise to verify that both
\[
\sup_{z\in[a+h,b-h]}\left|\int_{\mathbb{R}}h^{-1}L\left(\frac{z-u}{h}\right)H(du) - \dot H(z)\right| = O(h)
\]
and $\sup_{z\in[a,a+h)}\left|\dot H(z)-\dot H(a+h)\right|\vee\sup_{z\in(b-h,b]}\left|\dot H(z)-\dot H(b-h)\right| = O(h)$. Hence
\[
\hat R_n(z) \equiv \begin{cases} \int_{\mathbb{R}}h^{-1}L\left(\frac{z-u}{h}\right)\hat H_n(du) & \text{for } z\in[a+h,b-h],\\[4pt] \hat R_n(a+h) & \text{for } z\in[a,a+h),\\[4pt] \hat R_n(b-h) & \text{for } z\in(b-h,b] \end{cases}
\]

is uniformly consistent for $\dot H$ with uniform error $O_P(h+h^{-1}n^{-1/2}) = o_P(1)$. Similar, but somewhat simpler, arguments can also be used to verify that
\[
\hat Q_n(z) \equiv \begin{cases} \int_{\mathbb{R}}h^{-1}L\left(\frac{z-u}{h}\right)\hat F_n(du) & \text{for } z\in[a+h,b-h],\\[4pt] \hat Q_n(a+h) & \text{for } z\in[a,a+h),\\[4pt] \hat Q_n(b-h) & \text{for } z\in(b-h,b] \end{cases}
\]

is uniformly consistent for $f$, also with uniform error $O_P(h+h^{-1}n^{-1/2}) = o_P(1)$. Since $f$ is bounded below, we now have that $\sup_{z\in[a,b]}|\hat V(z)-V(z)| = o_P(1)$. Since $\dot g_\theta(Z) = (1,Z)'$, we have established that both (4.10) and (4.11) hold for this example, since $V$ is also bounded below.

We will now show that there exists a $k_0<\infty$ such that the probability that "$\hat V$ goes below $1/k_0$ or the first derivative of $\hat V$ exceeds $k_0$" goes to

zero as $n\to\infty$. If we let $\mathcal{G}$ be the class of all functions $q:[a,b]\to[k_0^{-1},k_0]$ such that the first derivative of $q$, $\dot q$, satisfies $|\dot q|\le k_0$, then our result will imply that the inner probability that $\hat V\in\mathcal{G}$ is $>1-\epsilon$ for all $n$ large enough and all $\epsilon>0$. An additional exercise in chapter 9 will then show that, for this simple linear regression example, the class $\mathcal{F}_1$ defined above is Donsker and $\mathcal{F}_2$ defined above is Glivenko-Cantelli. Thus lemma 4.4 applies. This means that if we first use least-squares estimators of $\alpha$ and $\beta$ to construct the estimator $\hat V$, and then compute the "two-stage" estimator

\[
\tilde\theta \equiv \left[\sum_{i=1}^n\frac{U_iU_i'}{\hat V(Z_i)}\right]^{-1}\sum_{i=1}^n\frac{U_iY_i}{\hat V(Z_i)},
\]
then this $\tilde\theta$ will be efficient for $\theta$. Since $\hat V$ is uniformly consistent, the only thing remaining to show is that the derivative of $\hat V$, denoted $\dot V_n$, is uniformly bounded. Note that the derivative of $\hat R_n$, which we will denote $\dot R_n$, satisfies the following for all $z\in[a+h,b-h]$:
\[
\dot R_n(z) = -\int_{\mathbb{R}}h^{-2}\dot L\left(\frac{z-u}{h}\right)\hat H_n(du) = h^{-2}\left[\hat H_n(z)-\hat H_n(z-h) - \hat H_n(z+h)+\hat H_n(z)\right]
\]

\[
= O_P(h^{-2}n^{-1/2}) + h^{-2}\left[2H(z)-H(z-h)-H(z+h)\right],
\]
where the last equality follows from the previously established fact that $\sup_{z\in[a+h,b-h]}|\hat H_n(z)-H(z)| = O_P(n^{-1/2})$. Now the uniform boundedness of the second derivative of $H$ ensures that $\sup_{z\in[a+h,b-h]}|\dot R_n(z)| = O_P(1)$. Similar arguments can be used to establish that the derivative of $\hat Q_n$, which we will denote $\dot Q_n$, satisfies $\sup_{z\in[a+h,b-h]}|\dot Q_n(z)| = O_P(1)$.

Now we have uniform boundedness in probability of $\dot V_n$ over $[a+h,b-h]$. Since $\hat V_n$ does not change over either $[a,a+h)$ or $(b-h,b]$, we have also established that $\sup_{z\in[a,b]}|\dot V_n(z)| = O_P(1)$, and the desired results follow.
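The kernel variance estimator and the two-stage estimator $\tilde\theta$ just analyzed can be coded compactly; the following is our sketch (the sample range stands in for the known interval $[a,b]$):

```python
import numpy as np

def V_hat(z_grid, Z, e_hat, h):
    """Modified triangular-kernel estimator of V(z) = E[e^2 | Z = z]:
    ratio of kernel-smoothed dH_n (weighted by squared residuals) to
    kernel-smoothed dF_n, held constant on the boundary strips."""
    a, b = Z.min(), Z.max()                      # sample range used as [a, b]
    z_eval = np.clip(z_grid, a + h, b - h)       # boundary modification
    L = lambda x: np.clip(1.0 - np.abs(x), 0.0, None)   # triangular kernel
    K = L((z_eval[:, None] - Z[None, :]) / h)    # (m, n) kernel weights
    return (K * e_hat**2).sum(axis=1) / K.sum(axis=1)

def two_stage(Y, Z, h):
    """Least-squares first stage, then weighted least squares with
    weights 1 / V_hat(Z_i): the 'two-stage' estimator tilde-theta."""
    U = np.column_stack([np.ones_like(Z), Z])
    theta0 = np.linalg.lstsq(U, Y, rcond=None)[0]
    e_hat = Y - U @ theta0
    w = 1.0 / V_hat(Z, Z, e_hat, h)
    return np.linalg.solve(U.T @ (w[:, None] * U), U.T @ (w * Y))
```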

4.4.2 A Poisson Mixture Regression Model

In this section, we consider a Poisson mixture regression model in which the nuisance parameter is not $\sqrt{n}$ consistent in the uniform norm. Given a regression vector $Z\in\mathbb{R}^k$ and a nonnegative random quantity $W\in\mathbb{R}$, the observation $Y$ is Poisson with parameter $We^{\beta'Z}$, for some $\beta\in\mathbb{R}^k$. We only observe the pair $(Y,Z)$. Thus the density of $Y$ given $Z$ is

\[
Q_{\beta,G}(y) = \int_0^\infty\frac{e^{-we^{\beta'Z}}\left[we^{\beta'Z}\right]^y}{y!}\,dG(w),
\]
where $G$ is the unknown distribution function for $W$. For identifiability, we assume that one of the components of $Z$ is the constant 1 and that the expectation of $W$ is also 1. We denote the joint distribution of the observed data as $P_{\beta,G}$, and assume that $Z$ is bounded, $P_{\beta,G}[ZZ']$ is full rank, and that $P_{\beta,G}Y^2<\infty$. In this situation, it is not hard to verify that
\[
V(z) = E\left[\left(Y-e^{\beta'Z}\right)^2\Big|\,Z=z\right] = E\left[E\left\{\left(Y-We^{\beta'Z}\right)^2\Big|\,W,Z=z\right\} + (W-1)^2e^{2\beta'z}\,\Big|\,Z=z\right]
\]

$= e^{\beta'z}+\sigma^2e^{2\beta'z}$, where $\sigma^2\equiv E[W-1]^2$. Let $\tilde\beta$ be the estimator obtained by solving $\mathbb{P}_n\left[Z(Y-e^{\beta'Z})\right]=0$. Standard arguments reveal that $\sqrt{n}(\tilde\beta-\beta)$ is asymptotically mean zero Gaussian with finite variance matrix. Relatively simple calculations also reveal that $E[Y(Y-1)|Z=z] = \int_0^\infty w^2\,dG(w)\,e^{2\beta'z} = (\sigma^2+1)e^{2\beta'z}$. Hence
\[
\hat\sigma^2 \equiv -1 + n^{-1}\sum_{i=1}^ne^{-2\tilde\beta'Z_i}Y_i(Y_i-1)
\]
will satisfy $\hat\sigma^2 = \sigma^2+O_P(n^{-1/2})$. Now let $\hat\beta$ be the solution of
\[
\mathbb{P}_n\left[\frac{Z(Y-e^{\beta'Z})}{\hat V(Z)}\right] = 0,
\]

where $\hat V(z)\equiv e^{\tilde\beta'z}+\hat\sigma^2e^{2\tilde\beta'z}$. It is left as an exercise to verify that $\hat\beta$ satisfies the conditions of lemma 4.4 for $\tilde V = V$, and thus the desired optimality is achieved.
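The three-step procedure just described (unweighted equation, moment estimator $\hat\sigma^2$, reweighted equation) can be sketched as follows (our code, not the book's):

```python
import numpy as np
from scipy.optimize import fsolve

def poisson_mixture_fit(Y, Z):
    """Optimal estimating equation for the Poisson mixture model (sketch).
    Step 1: solve P_n[Z(Y - e^{b'Z})] = 0 for beta_tilde.
    Step 2: sigma2_hat from E[Y(Y-1)|Z] = (sigma^2 + 1) e^{2 beta'Z}.
    Step 3: reweight by V_hat(z) = e^{b'z} + sigma2_hat e^{2 b'z}."""
    n, k = Z.shape
    S0 = lambda b: Z.T @ (Y - np.exp(Z @ b)) / n
    beta_tilde = fsolve(S0, np.zeros(k))
    sigma2_hat = -1.0 + np.mean(np.exp(-2 * Z @ beta_tilde) * Y * (Y - 1))
    V_hat = np.exp(Z @ beta_tilde) + sigma2_hat * np.exp(2 * Z @ beta_tilde)
    S1 = lambda b: Z.T @ ((Y - np.exp(Z @ b)) / V_hat) / n
    return fsolve(S1, beta_tilde), sigma2_hat
```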

4.5 Partly Linear Logistic Regression

For the partly linear logistic regression example given in chapter 1, the observed data are $n$ independent realizations of the random triplet $(Y,Z,U)$, where $Z\in\mathbb{R}^p$ and $U\in\mathbb{R}$ are covariates which are not linearly dependent, $Y$ is a dichotomous outcome with conditional expectation $\nu[\beta'Z+\eta(U)]$, $\beta\in\mathbb{R}^p$, $Z$ is restricted to a bounded set, $U\in[0,1]$, $\nu(t) = 1/(1+e^{-t})$, and where $\eta$ is an unknown smooth function. Hereafter, for simplicity, we will also assume that $p=1$. We further assume, for some integer $k\ge1$, that the first $k-1$ derivatives of $\eta$ exist and are absolutely continuous with $J^2(\eta) = \int_0^1\left[\eta^{(k)}(t)\right]^2dt<\infty$. To estimate $\beta$ and $\eta$ based on an i.i.d. sample $X_i=(Y_i,Z_i,U_i)$, $i=1,\ldots,n$, we use the following penalized log-likelihood:

\[
\tilde L_n(\beta,\eta) = n^{-1}\sum_{i=1}^n\log p_{\beta,\eta}(X_i) - \hat\lambda_n^2J^2(\eta),
\]
where
\[
p_{\beta,\eta}(x) = \left\{\nu[\beta z+\eta(u)]\right\}^y\left\{1-\nu[\beta z+\eta(u)]\right\}^{1-y}
\]
and $\hat\lambda_n$ is chosen to satisfy $\hat\lambda_n = o_p(n^{-1/4})$ and $\hat\lambda_n^{-1} = O_p(n^{k/(2k+1)})$. Denote $\hat\beta_n$ and $\hat\eta_n$ to be the maximizers of $\tilde L_n(\beta,\eta)$, let $P_{\beta,\eta}$ denote expectation under the model, and let $\beta_0$ and $\eta_0$ be the true values of the parameters. Consistency of $\hat\beta_n$ and $\hat\eta_n$ and efficiency of $\hat\beta_n$ are established for partly linear generalized linear models in Mammen and van de Geer (1997). We now derive the efficient score for $\beta$ and then sketch a verification that $\hat\beta_n$ is asymptotically linear with influence function equal to the efficient influence function.

Let $\mathcal{H}$ be the linear space of functions $h:[0,1]\to\mathbb{R}$ with $J(h)<\infty$. For $t\in[0,\epsilon)$ and $\epsilon$ sufficiently small, let $\beta_t = \beta+tv$ and $\eta_t = \eta+th$ for $v\in\mathbb{R}$ and $h\in\mathcal{H}$. If we differentiate the non-penalized log-likelihood, we deduce that the score for $\beta$ and $\eta$, in the direction $(v,h)$, is $(vZ+h(U))(Y-\mu_{\beta,\eta}(Z,U))$, where $\mu_{\beta,\eta}(Z,U) = \nu[\beta Z+\eta(U)]$. Now let

\[
h_1(u) = \frac{E\{ZV_{\beta,\eta}(Z,U)|U=u\}}{E\{V_{\beta,\eta}(Z,U)|U=u\}},
\]
where $V_{\beta,\eta} = \mu_{\beta,\eta}(1-\mu_{\beta,\eta})$, and assume that $h_1\in\mathcal{H}$. It can easily be verified that $(Z-h_1(U))(Y-\mu_{\beta,\eta}(Z,U))$ is uncorrelated with $h(U)(Y-\mu_{\beta,\eta}(Z,U))$ for any $h\in\mathcal{H}$, and thus the efficient score for $\beta$ is $\tilde\ell_{\beta,\eta} = (Z-h_1(U))(Y-\mu_{\beta,\eta}(Z,U))$. Hence the efficient information for $\beta$ is $\tilde I_{\beta,\eta} = P_{\beta,\eta}\left[(Z-h_1(U))^2V_{\beta,\eta}(Z,U)\right]$ and the efficient influence function is $\tilde\psi_{\beta,\eta} = \tilde I_{\beta,\eta}^{-1}\tilde\ell_{\beta,\eta}$, provided $\tilde I_{\beta,\eta}>0$, which we assume hereafter to be true for $\beta=\beta_0$ and $\eta=\eta_0$.

In order to prove asymptotic linearity of $\hat\beta_n$, we need to also assume that $P_{\beta_0,\eta_0}\left[Z-\tilde h_1(U)\right]^2>0$, where $\tilde h_1(u) = E\{Z|U=u\}$. First, Mammen and van de Geer established that $\hat\beta_n$ and $\hat\eta_n$ are both uniformly consistent for $\beta_0$ and $\eta_0$, respectively, and that

\[
\mathbb{P}_n\left[(\hat\beta_n-\beta_0)Z + \hat\eta_n(U)-\eta_0(U)\right]^2 = o_p(n^{-1/2}). \tag{4.12}
\]

Let $\hat\beta_{ns} = \hat\beta_n+s$ and $\hat\eta_{ns}(u) = \hat\eta_n(u)-sh_1(u)$. If we now differentiate $\tilde L_n(\hat\beta_{ns},\hat\eta_{ns})$ and evaluate at $s=0$, we obtain
\[
0 = \mathbb{P}_n\left[(Y-\mu_{\beta_0,\eta_0})(Z-h_1(U))\right] - \mathbb{P}_n\left[(\mu_{\hat\beta_n,\hat\eta_n}-\mu_{\beta_0,\eta_0})(Z-h_1(U))\right] - \hat\lambda_n^2\,\partial J^2(\hat\eta_{ns})/(\partial s)\big|_{s=0}
\]
\[
= A_n - B_n - C_n,
\]
since $\tilde L_n(\beta,\eta)$ is maximized at $\hat\beta_n$ and $\hat\eta_n$ by definition.

Using (4.12), we obtain
\[
B_n = \mathbb{P}_n\left[V_{\beta_0,\eta_0}(Z,U)\left\{(\hat\beta_n-\beta_0)Z + \hat\eta_n(U)-\eta_0(U)\right\}(Z-h_1(U))\right] + o_p(n^{-1/2}),
\]

since $\dot\nu(t) = \nu(t)[1-\nu(t)]$. By definition of $h_1$, we have
\[
P_{\beta_0,\eta_0}\left[V_{\beta_0,\eta_0}(Z,U)(\hat\eta_n(U)-\eta_0(U))(Z-h_1(U))\right] = 0,
\]
and we also have that $P_{\beta_0,\eta_0}\left[V_{\beta_0,\eta_0}(Z,U)(\hat\eta_n(U)-\eta_0(U))(Z-h_1(U))\right]^2\stackrel{P}{\to}0$. Thus, if we can establish that for each $\tau>0$, $\hat\eta_n(U)-\eta_0(U)$ lies in a bounded $P_{\beta_0,\eta_0}$-Donsker class with probability $>(1-\tau)$ for all $n\ge1$ large enough and all $\tau>0$, then
\[
\mathbb{P}_n\left[V_{\beta_0,\eta_0}(Z,U)(\hat\eta_n(U)-\eta_0(U))(Z-h_1(U))\right] = o_p(n^{-1/2}),
\]
and thus
\[
B_n = (\hat\beta_n-\beta_0)\,P_{\beta_0,\eta_0}\left[V_{\beta_0,\eta_0}(Z,U)(Z-h_1(U))^2\right] + o_p(n^{-1/2}), \tag{4.13}
\]
since products of bounded Donsker classes are Donsker (and therefore also Glivenko-Cantelli), and since
\[
P_{\beta_0,\eta_0}\left[V_{\beta_0,\eta_0}(Z,U)Z(Z-h_1(U))\right] = P_{\beta_0,\eta_0}\left[V_{\beta_0,\eta_0}(Z,U)(Z-h_1(U))^2\right]
\]
by definition of $h_1$. Let $\mathcal{H}_c$ be the subset of $\mathcal{H}$ with functions $h$ satisfying $J(h)\le c$. We will show in part II that $\{h(U): h\in\mathcal{H}_c\}$ is indeed Donsker for each $c<\infty$. Since Mammen and van de Geer verify that $J(\hat\eta_n) = O_p(1)$, we have the desired Donsker property, and (4.13) follows.

It is not difficult to verify that $C_n\le2\hat\lambda_n^2J(\hat\eta_n)J(h_1) = o_p(n^{-1/2})$, since $\hat\lambda_n = o_p(n^{-1/4})$ by assumption. Hence $\hat\beta_n-\beta_0 = \mathbb{P}_n\tilde\psi_{\beta_0,\eta_0} + o_p(n^{-1/2})$, and we have verified that $\hat\beta_n$ is efficient for $\beta_0$.
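As a rough illustration only (ours; the finite spline basis with a quadratic penalty on the spline coefficients merely stands in for the penalty $J^2(\eta)$ and is not the estimator analyzed above), one can approximate the penalized maximizer numerically:

```python
import numpy as np
from scipy.optimize import minimize

def partly_linear_logistic(Y, Z, U, lam=0.01, n_knots=10):
    """Crude finite-basis approximation to the penalized log-likelihood
    maximizer: eta(u) is modeled by an intercept, a linear term, and
    truncated-line spline terms; the quadratic penalty on the spline
    coefficients is a stand-in for lambda_n^2 J^2(eta)."""
    knots = np.linspace(0, 1, n_knots + 2)[1:-1]
    B = np.column_stack([np.ones_like(U), U,
                         np.clip(U[:, None] - knots[None, :], 0.0, None)])
    X = np.column_stack([Z, B])                 # [beta | eta coefficients]
    pen = np.zeros(X.shape[1]); pen[3:] = 1.0   # penalize spline coefficients only

    def neg_pen_loglik(par):
        lin = X @ par
        loglik = np.mean(Y * lin - np.logaddexp(0.0, lin))  # Bernoulli log-lik
        return -(loglik - lam**2 * np.sum(pen * par**2))

    fit = minimize(neg_pen_loglik, np.zeros(X.shape[1]), method="BFGS")
    return fit.x[0], fit.x[1:]                  # (beta_hat, eta coefficients)
```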

4.6 Exercises

4.6.1. For the linear regression example, verify the inequality (4.3).

4.6.2. For the general counting process regression model setting, show that $V(\beta)\ge c\,\mathrm{var}(Z)$ for some $c>0$, where $V(\beta)$ is given in (4.5).

4.6.3. Show how to use lemma 4.2 to establish (4.6). Hint: Let $A_n(t) = E_n(t,\hat\beta)-E(t,\hat\beta)$ and $B_n(t) = \sqrt{n}\,\mathbb{P}_nM_{\beta_0}(t)$ for part of it, and let
\[
A_n(t) = \mathbb{P}_n\int_0^tY(s)\left(e^{\hat\beta'Z}-e^{\beta_0'Z}\right)d\Lambda_0(s)
\]
and $B_n(t) = \sqrt{n}\left[E_n(t,\hat\beta)-E(t,\hat\beta)\right]$ for the other part.

4.6.4. For the Cox model example (section 4.2.2), verify condition (3.7) of theorem 3.1 for $\hat{\tilde\ell}_{\beta,n}$ defined in (3.9).

4.6.5. For the Kaplan-Meier example of section 4.3, verify that $\check\psi\in R(A)$.

4.6.6. For the Poisson mixture model example of section 4.4.2, verify that the conditions of lemma 4.4 are satisfied for the given choice of $\hat V$:

(a) Show that the class $\mathcal{G}\equiv\left\{e^{t'Z}+se^{2t'Z}: \|t-\beta\|\le\epsilon_1,\,|s-\sigma^2|\le\epsilon_2\right\}$ is Donsker for some $\epsilon_1,\epsilon_2>0$. Hint: First show that $\left\{t'Z: \|t-\beta\|\le\epsilon_1\right\}$ is Donsker from the fact that the product of two (in this case trivial) bounded Donsker classes is also Donsker. Now complete the proof by using the facts that Lipschitz functions of Donsker classes are Donsker, that products of bounded Donsker classes are Donsker (used earlier), and that sums of Donsker classes are Donsker.

(b) Now complete the verification of lemma 4.4.

4.6.7. For the partly linear logistic regression example of section 4.5, verify that $(Z-h_1(U))(Y-\mu_{\beta_0,\eta_0})$ is uncorrelated with $h(U)(Y-\mu_{\beta_0,\eta_0})$ for all $h\in\mathcal{H}$, where the quantities are as defined in the example.

4.7 Notes

Least absolute deviation regression was studied using equicontinuity arguments in Bassett and Koenker (1978). More succinct results based on empirical processes can be found in Pollard (1991). An excellent discussion of breakdown points and other robustness issues is given in Huber (1981).

Part II

Empirical Processes

5 Introduction to Empirical Processes

The goal of part II is to provide an in-depth coverage of the basics of empirical process techniques which are useful in statistics. Chapter 6 presents preliminary mathematical background which provides a foundation for later technical development. The topics covered include metric spaces, outer expectations, linear operators and functional differentiation. The main topics overviewed in chapter 2 of part I will then be covered in greater depth, along with several additional topics, in chapters 7 through 14. Part II finishes in chapter 15 with several case studies. The main approach is to present the mathematical and statistical ideas in a logical, linear progression, and then to illustrate the application and integration of these ideas in the case study examples. The scaffolding provided by the overview, part I, should enable the reader to maintain perspective during the sometimes rigorous developments of this section.

Stochastic convergence is studied in chapter 7. Important aspects of the modes of convergence explored in this book are the notions of outer integrals and outer measure, which were mentioned briefly in section 2.2.1. While many of the standard relationships between the modes of stochastic convergence apply when using outer measure, there are a few important differences which we will examine. While these differences may, in some cases, add complexity to an already difficult asymptotic theory, the gain in breadth of applicability to semiparametric statistical estimators is well worth the trouble. For example, convergence based on outer measure permits the use of the uniform topology for studying convergence of empirical processes with complex index sets. This contrasts with more traditional approaches which require special topologies that can be harder to use in applications, such as the Skorohod topology for cadlag processes (see chapter 3 of Billingsley, 1968).

The main techniques for proving empirical process central limit theorems will be presented in chapter 8. Establishing Glivenko-Cantelli and Donsker theorems requires bounding expectations involving suprema of stochastic processes. Maximal inequalities and symmetrization techniques are important tools for accomplishing this, and careful measurability arguments are also sometimes needed. Symmetrization involves replacing inequalities for the empirical process $f\mapsto(\mathbb{P}_n-P)f$, $f\in\mathcal{F}$, with inequalities for the "symmetrized" process $n^{-1}\sum_{i=1}^n\epsilon_if(X_i)$, where $\epsilon_1,\ldots,\epsilon_n$ are i.i.d. Rademacher random variables (e.g., $P\{\epsilon_1=-1\} = P\{\epsilon_1=1\} = 1/2$) independent of $X_1,\ldots,X_n$. Several tools for assessing measurability in statistical applications will also be discussed.

Entropy with bracketing, uniform entropy, and other measures of entropy are essential aspects in all of these results. This is the topic of chapter 9. The associated entropy calculations can be quite challenging, but the work is often greatly simplified by using Donsker preservation results to build larger Donsker classes from smaller ones. Similar preservation results are available for Glivenko-Cantelli classes.

Bootstrapping of empirical processes, based on multinomial or other Monte Carlo weights, is studied in chapter 10. The bootstrap is a valuable way to conduct inference for empirical processes because of its broad applicability. In many semiparametric settings, there are no viable alternatives for inference.
A central role in establishing validity of the bootstrap is played by multiplier central limit theorems, which establish weak convergence of processes of the form $\sqrt{n}\,\mathbb{P}_n\xi(f(X)-Pf)$, $f\in\mathcal{F}$, where $\xi$ is independent of $X$, has mean zero and variance 1, and satisfies $\int_0^\infty\sqrt{P(|\xi|>x)}\,dx<\infty$.

In chapter 11, several extensions of empirical process results are presented for function classes which either consist of sequences or change with the sample size $n$, as well as results for independent but not identically distributed data. These results are useful in a number of statistical settings, including asymptotic analysis of the Cramér-von Mises statistic and regression settings where the covariates are assumed fixed or when a biased coin study design (see Wei, 1978, for example) is used. Extensions of the bootstrap for conducting inference in these new situations are also discussed.

Many interesting statistical quantities can be expressed as functionals of empirical processes. The functional delta method, discussed in chapter 12, can be used to translate weak convergence and bootstrap results for empirical processes to corresponding inference results for these functionals. Most Z- and M-estimators are functionals of empirical processes. For example, under reasonable regularity conditions, the functional which extracts the zero (root) of a Z-estimating equation is sufficiently smooth to permit the delta method to carry over inference results for the estimating equation to the corresponding Z-estimator. The results also apply to M-estimators which can be expressed as approximate Z-estimators.

Z-estimation is discussed in chapter 13, while M-estimation is discussed in chapter 14. A key challenge with many important M-estimators is to establish the rate of convergence, especially in settings where the estimators are not $\sqrt{n}$ consistent. This issue was only briefly mentioned in section 2.2.6 because of the technical complexity of the problem. There are a number of tools which can be used to establish these rates, and several such tools will be studied in chapter 14. These techniques rely significantly on accurate entropy calculations for the M-estimator empirical process, as indexed by the parameter set, within a small neighborhood of the true parameter. The case studies presented in chapter 15 demonstrate that the technical power of empirical process methods facilitates valid inference for flexible models in many interesting and important statistical settings.

6 Preliminaries for Empirical Processes

In this chapter, we cover several mathematical topics that play a central role in the empirical process results we present later. Metric spaces are crucial since they provide the descriptive language by which the most im- portant results about stochastic processes are derived and expressed. Outer expectations, or, more correctly, outer integrals are key to defining and uti- lizing outer modes of convergence for quantities which are not measurable. Since many statistical quantities of interest are not measurable with re- spect to the uniform topology, which is often the topology of choice for applications, outer modes of convergence will be the primary approach for stochastic convergence throughout this book. Linear operators and func- tional derivatives also play a major role in empirical process methods and are key tools for the functional delta method and Z-estimator theory dis- cussed in chapters 12 and 13.

6.1 Metric Spaces

We now introduce a number of concepts and results for metric spaces. Before defining metric spaces, we briefly review topological spaces, σ-fields, and measure spaces. A collection $\mathcal{O}$ of subsets of a set $X$ is a topology in $X$ if:

(i) $\emptyset\in\mathcal{O}$ and $X\in\mathcal{O}$, where $\emptyset$ is the empty set;

(ii) If $U_j\in\mathcal{O}$ for $j=1,\ldots,m$, then $\bigcap_{j=1}^mU_j\in\mathcal{O}$;

(iii) If $\{U_\alpha\}$ is an arbitrary collection of members of $\mathcal{O}$ (finite, countable or uncountable), then $\bigcup_\alpha U_\alpha\in\mathcal{O}$.

When $\mathcal{O}$ is a topology in $X$, then $X$ (or the pair $(X,\mathcal{O})$) is a topological space, and the members of $\mathcal{O}$ are called the open sets in $X$. For a subset $A\subset X$, the relative topology on $A$ consists of the sets $\{A\cap B: B\in\mathcal{O}\}$. A map $f:X\to Y$ between topological spaces is continuous if $f^{-1}(U)$ is open in $X$ whenever $U$ is open in $Y$. A set $B$ in $X$ is closed if and only if its complement in $X$, denoted $X-B$, is open. The closure of an arbitrary set $E\subset X$, denoted $\bar E$, is the smallest closed set containing $E$; while the interior of an arbitrary set $E\subset X$, denoted $E^\circ$, is the largest open set contained in $E$. A subset $A$ of a topological space $X$ is dense if $\bar A = X$. A topological space $X$ is separable if it has a countable dense subset.

A neighborhood of a point $x\in X$ is any open set that contains $x$. A topological space is Hausdorff if distinct points in $X$ have disjoint neighborhoods. A sequence of points $\{x_n\}$ in a topological space $X$ converges to a point $x\in X$ if every neighborhood of $x$ contains all but finitely many of the $x_n$. This convergence is denoted $x_n\to x$. Suppose $x_n\to x$ and $x_n\to y$. Then $x$ and $y$ share all neighborhoods, and $x=y$ when $X$ is Hausdorff. If a map $f:X\to Y$ between topological spaces is continuous, then $f(x_n)\to f(x)$ whenever $x_n\to x$ in $X$. To see this, let $\{x_n\}\subset X$ be a sequence with $x_n\to x\in X$. Then for any open $U\subset Y$ containing $f(x)$, all but finitely many $\{x_n\}$ are in $f^{-1}(U)$, and thus all but finitely many $\{f(x_n)\}$ are in $U$. Since $U$ was arbitrary, we have $f(x_n)\to f(x)$.

We now review the important concept of compactness. A subset $K$ of a topological space is compact if for every set $A\supset K$, where $A$ is the union of a collection of open sets $\mathcal{S}$, $K$ is also contained in some finite union of sets in $\mathcal{S}$. When the topological space involved is also Hausdorff, then compactness of $K$ is equivalent to the assertion that every sequence in $K$ has a convergent subsequence (converging to a point in $K$). We omit the proof of this equivalence. This result implies that compact subsets of Hausdorff topological spaces are necessarily closed. Note that a compact set is sometimes called a compact for short. A σ-compact set is a countable union of compacts.

A collection $\mathcal{A}$ of subsets of a set $X$ is a σ-field in $X$ (sometimes called a σ-algebra) if:

(i) $X\in\mathcal{A}$;

(ii) If $U\in\mathcal{A}$, then $X-U\in\mathcal{A}$;

(iii) The countable union $\bigcup_{j=1}^\infty U_j\in\mathcal{A}$ whenever $U_j\in\mathcal{A}$ for all $j\ge1$.

Note that (iii) clearly includes finite unions. When (iii) is only required to hold for finite unions, then $\mathcal{A}$ is called a field. When $\mathcal{A}$ is a σ-field in $X$, then $X$ (or the pair $(X,\mathcal{A})$) is a measurable space, and the members of $\mathcal{A}$ are called the measurable sets in $X$. If $X$ is a measurable space and $Y$ is a topological space, then a map $f:X\to Y$ is measurable if $f^{-1}(U)$ is measurable in $X$ whenever $U$ is open in $Y$.

If $\mathcal{O}$ is a collection of subsets of $X$ (not necessarily open), then there exists a smallest σ-field $\mathcal{A}^*$ in $X$ so that $\mathcal{O}\subset\mathcal{A}^*$. This $\mathcal{A}^*$ is called the σ-field generated by $\mathcal{O}$. To see that such an $\mathcal{A}^*$ exists, let $\mathcal{S}$ be the collection of all σ-fields in $X$ which contain $\mathcal{O}$. Since the collection of all subsets of $X$ is one such σ-field, $\mathcal{S}$ is not empty. Define $\mathcal{A}^*$ to be the intersection of all $\mathcal{A}\in\mathcal{S}$. Clearly, $\mathcal{O}\subset\mathcal{A}^*$ and $\mathcal{A}^*$ is in every σ-field containing $\mathcal{O}$. All that remains is to show that $\mathcal{A}^*$ is itself a σ-field. Assume that $A_j\in\mathcal{A}^*$ for all integers $j\ge1$.
If $\mathcal{A}\in\mathcal{S}$, then $\bigcup_{j\ge1}A_j\in\mathcal{A}$. Since $\bigcup_{j\ge1}A_j\in\mathcal{A}$ for every $\mathcal{A}\in\mathcal{S}$, we have $\bigcup_{j\ge1}A_j\in\mathcal{A}^*$. Also $X\in\mathcal{A}^*$ since $X\in\mathcal{A}$ for all $\mathcal{A}\in\mathcal{S}$; and for any $A\in\mathcal{A}^*$, both $A$ and $X-A$ are in every $\mathcal{A}\in\mathcal{S}$. Thus $\mathcal{A}^*$ is indeed a σ-field.

A σ-field is separable if it is generated by a countable collection of subsets. Note that we have already defined "separable" as a characteristic of certain topological spaces. There is a connection between the two definitions which we will point out in a few paragraphs when we discuss metric spaces. When $X$ is a topological space, the smallest σ-field $\mathcal{B}$ generated by the open sets is called the Borel σ-field of $X$. Elements of $\mathcal{B}$ are called Borel sets. A function $f:X\to Y$ between topological spaces is Borel-measurable if it is measurable with respect to the Borel σ-field of $X$. Clearly, a continuous function between topological spaces is also Borel-measurable.

For a σ-field $\mathcal{A}$ in a set $X$, a map $\mu:\mathcal{A}\to\bar{\mathbb{R}}$ is a measure if:

(i) $\mu(A)\in[0,\infty]$ for all $A\in\mathcal{A}$;

(ii) $\mu(\emptyset)=0$;

(iii) For a disjoint sequence $\{A_j\}\in\mathcal{A}$, $\mu\left(\bigcup_{j=1}^\infty A_j\right) = \sum_{j=1}^\infty\mu(A_j)$ (countable additivity).

If $X = A_1\cup A_2\cup\cdots$ for some finite or countable sequence of sets in $\mathcal{A}$ with $\mu(A_j)<\infty$ for all indices $j$, then $\mu$ is σ-finite. The triple $(X,\mathcal{A},\mu)$ is called a measure space. If $\mu(X)=1$, then $\mu$ is a probability measure. For a probability measure $P$ on a set $\Omega$ with σ-field $\mathcal{A}$, the triple $(\Omega,\mathcal{A},P)$ is called a probability space. If the set $[0,\infty]$ in part (i) is extended to $(-\infty,\infty]$ or replaced by $[-\infty,\infty)$ (but not both), then $\mu$ is a signed measure. For a measure space $(X,\mathcal{A},\mu)$, let $\mathcal{A}^*$ be the collection of all $E\subset X$ for which there exists $A,B\in\mathcal{A}$ with $A\subset E\subset B$ and $\mu(B-A)=0$, and define $\mu(E)=\mu(A)$ in this setting. Then $\mathcal{A}^*$ is a σ-field, $\mu$ is still a measure, and $\mathcal{A}^*$ is called the μ-completion of $\mathcal{A}$.

A metric space is a set $D$ together with a metric. A metric or distance function is a map $d:D\times D\to[0,\infty)$ where:

(i) $d(x,y) = d(y,x)$;

(ii) $d(x,z)\le d(x,y)+d(y,z)$ (the triangle inequality);

(iii) $d(x,y)=0$ if and only if $x=y$.

A semimetric or pseudometric satisfies (i) and (ii) but not necessarily (iii). Technically, a metric space consists of the pair $(D,d)$, but usually only $D$ is given and the underlying metric $d$ is implied by the context. This is similar to topological and measurable spaces, where only the set of all points $X$ is given while the remaining components are omitted except where needed to clarify the context. A semimetric space is also a topological space with the open sets generated by applying arbitrary unions to the open $r$-balls $B_r(x)\equiv\{y: d(x,y)<r\}$, for $r>0$ and $x\in D$.

Lemma 6.1 For a function $f:D\to\mathbb{R}$ on the semimetric space $D$, the following are equivalent:

(i) the set $\{y: f(y)\ge c\}$ is closed for each $c\in\mathbb{R}$;

(ii) $\limsup_{y\to y_0}f(y)\le f(y_0)$ for each $y_0\in D$.

Proof. Assume (i) holds but that (ii) is untrue for some $y_0\in D$.

This implies that for some $\delta>0$, $\limsup_{y\to y_0}f(y) = f(y_0)+\delta$. Thus $H\cap\{y: d(y,y_0)<\epsilon\}$, where $H\equiv\{y: f(y)\ge f(y_0)+\delta\}$, is nonempty for all $\epsilon>0$. Since $H$ is closed by (i), we now have that $y_0\in H$. But this implies that $f(y_0) = f(y_0)+\delta$, which is impossible. Hence (ii) holds. The proof that (ii) implies (i) is saved as an exercise.

A function $f:D\to\mathbb{R}$ satisfying either (i) or (ii) (and hence both) of the conditions in lemma 6.1 is said to be upper semicontinuous. A function $f:D\to\mathbb{R}$ is lower semicontinuous if $-f$ is upper semicontinuous. Using condition (ii), it is easy to see that a function which is both upper and lower semicontinuous is also continuous. The set of all continuous and bounded functions $f:D\to\mathbb{R}$, which we denote $C_b(D)$, plays an important role in weak convergence on the metric space $D$, which we will explore in chapter 7.

It is not hard to see that the Borel σ-field on a metric space D is the smallest σ-field generated by the open balls. It turns out that the Borel σ-field B of D is also the smallest σ-field A making all of Cb(D) measurable. To see this, note that any closed A ⊂ D is the preimage of the closed set {0} for the continuous bounded function x ↦ d(x, A) ∧ 1, where for any set B ⊂ D, d(x, B) ≡ inf{d(x, y) : y ∈ B}. Thus B ⊂ A. Since it is obvious that A ⊂ B, we now have A = B.

A Borel-measurable map X : Ω → D defined on a probability space (Ω, A, P) is called a random element or random map with values in D. Borel measurability is, in many ways, the natural concept to use on metric spaces since it connects nicely with the topological structure.

A Cauchy sequence is a sequence {xn} in a semimetric space (D, d) such that d(xn, xm) → 0 as n, m → ∞. A semimetric space D is complete if every Cauchy sequence has a limit x ∈ D. Every metric space D has a completion D̄ which has a dense subset isometric with D. Two metric spaces are isometric if there exists a bijection (a one-to-one and onto map) between them which preserves distances. When a metric space D is separable, and therefore has a countable dense subset, the Borel σ-field for D is itself a separable σ-field. To see this, let A ⊂ D be a countable dense subset and consider the collection of open balls with centers at points in A and with rational radii. Clearly, the set of such balls is countable and generates all open sets in D.

A topological space X is Polish if it is separable and if there exists a metric making X into a complete metric space. Hence any complete and separable metric space is Polish. Furthermore, any open subset of a Polish space is also Polish. Examples of Polish spaces include Euclidean spaces and many other interesting spaces which we will explore shortly. A Suslin set is the continuous image of a Polish space. If a Suslin set is also a Hausdorff topological space, then it is a Suslin space. An analytic set is a subset A of a Polish space (X, O) which is Suslin with respect to the relative topology {A ∩ B : B ∈ O}. Since there always exists a continuous and onto map f : X → A for any Borel subset A of a Polish space (X, O), every Borel subset of a Polish space is Suslin and therefore also analytic.

A subset K is totally bounded if and only if for every r > 0, K can be covered by finitely many open r-balls. Furthermore, it can be shown that a subset K of a complete semimetric space is compact if and only if it is totally bounded and closed. A totally bounded subset K is also called precompact because every sequence in K has a Cauchy subsequence. To see this, assume K is totally bounded and choose any sequence {xn} ∈ K. There exists a nested series of subsequence indices {Nm} and a nested series of 2^{−m}-balls {Am} ⊂ K, such that for each integer m ≥ 1, Nm is infinite, N_{m+1} ⊂ Nm, A_{m+1} ⊂ Am, and xj ∈ Am for all j ∈ Nm. This follows from the total boundedness properties. For each m ≥ 1, choose an nm ∈ Nm, and note that the subsequence {x_{nm}} is Cauchy. Now assume every sequence in K has a Cauchy subsequence. It is not difficult to verify that if K were not totally bounded, then it is possible to come up with a sequence which has no Cauchy subsequences (see exercise 6.5.4). This relationship between compactness and total boundedness implies that a σ-compact set in a metric space is separable. These definitions of compactness agree with the previously given compactness properties for Hausdorff spaces.
This happens because a semimetric space D can be made into a metric—and hence Hausdorff—space DH by equating points in DH with equivalence classes in D.

A very important example of a metric space is a normed space. A normed space D is a vector space (also called a linear space) equipped with a norm, and a norm is a map ‖·‖ : D → [0, ∞) such that, for all x, y ∈ D and α ∈ R,

(i) ‖x + y‖ ≤ ‖x‖ + ‖y‖ (another triangle inequality);

(ii) ‖αx‖ = |α| × ‖x‖;

(iii) ‖x‖ = 0 if and only if x = 0.

A seminorm satisfies (i) and (ii) but not necessarily (iii). A normed space is a metric space (and a seminormed space is a semimetric space) with d(x, y) = ‖x − y‖, for all x, y ∈ D. A complete normed space is called a Banach space. Two seminorms ‖·‖1 and ‖·‖2 on a set D are equivalent if the following is true for all x, {xn} ∈ D: ‖xn − x‖1 → 0 if and only if ‖xn − x‖2 → 0.

In our definition of a normed space D, we require the space to also be a vector space (and therefore it contains all linear combinations of elements in D). However, it is sometimes of interest to apply norms to subsets K ⊂ D which may not be linear subspaces. In this setting, let linK denote the linear span of K (all linear combinations of elements in K), and let l̄inK denote the closure of linK. Note that both linK and l̄inK are now vector spaces and that l̄inK is also a Banach space.

We now present several specific examples of metric spaces. The Euclidean space R^d is a Banach space with squared norm ‖x‖² = Σ_{j=1}^d x_j². This space is equivalent under several other norms, including ‖x‖ = max_{1≤j≤d} |x_j| and ‖x‖ = Σ_{j=1}^d |x_j|. A Euclidean space is separable, with a countable dense subset consisting of all vectors with rational coordinates. By the Heine-Borel theorem, a subset in a Euclidean space is compact if and only if it is closed and bounded. The Borel σ-field is generated by the intervals of the type (−∞, x], for rational x, where the interval is defined as follows: y ∈ (−∞, x] if and only if yj ∈ (−∞, xj] for all coordinates j = 1,...,d. For one-dimensional Euclidean space, R, the norm is ‖x‖ = |x| (absolute value).

The extended real line R̄ = [−∞, ∞] is a metric space with respect to the metric d(x, y) = |G(x) − G(y)|, where G : R̄ → R is any strictly monotone increasing, continuous and bounded function, such as the arctan function. For any sequence {xn} ∈ R̄, |xn − x| → 0 implies d(xn, x) → 0, while divergence of d(xn, x) implies divergence of |xn − x|. In addition, it is possible for a sequence to converge, with respect to d, to either −∞ or ∞. This makes R̄ compact.

Another important example is the set of bounded real functions f : T → R, where T is an arbitrary set. This is a vector space if sums z1 + z2 and products with scalars, αz, are defined pointwise for all z, z1, z2 ∈ ℓ∞(T). Specifically, (z1 + z2)(t) = z1(t) + z2(t) and (αz)(t) = αz(t), for all t ∈ T. This space is denoted ℓ∞(T). The uniform norm ‖x‖_T ≡ sup_{t∈T} |x(t)| makes ℓ∞(T) into a Banach space consisting exactly of all functions z : T → R satisfying ‖z‖_T < ∞. It is not hard to show that ℓ∞(T) is separable if and only if T is countable.

Two useful subspaces of ℓ∞([a, b]), where a, b ∈ R̄, are C[a, b] and D[a, b]. The space C[a, b] consists of continuous functions z : [a, b] → R, and D[a, b] is the space of cadlag functions which are right-continuous with left-hand limits (cadlag is an abbreviation for continue à droite, limites à gauche). We usually equip these spaces with the uniform norm ‖·‖_{[a,b]} inherited from ℓ∞([a, b]). Note that C[a, b] ⊂ D[a, b] ⊂ ℓ∞([a, b]). Relative to the uniform norm, C[a, b] is separable, and thus also Polish by the completeness established in exercise 6.5.5(a), but D[a, b] is not separable. Sometimes, D[a, b] is called the Skorohod space, although Skorohod equipped D[a, b] with a special metric—quite different from the uniform metric—resulting in a separable space.

An important subspace of ℓ∞(T) is the space UC(T, ρ), where ρ is a semimetric on T. UC(T, ρ) consists of all bounded functions f : T → R which are uniformly ρ-continuous, i.e.,

lim_{δ↓0} sup_{ρ(s,t)<δ} |f(s) − f(t)| = 0.

When (T, ρ) is totally bounded, the boundedness requirement for functions in UC(T, ρ) is superfluous, since a uniformly continuous function on a totally bounded set must necessarily be bounded. We denote C(T, ρ) to be the space of bounded, ρ-continuous (but not necessarily uniformly ρ-continuous) functions on T. It is left as an exercise to show that the spaces C[a, b], D[a, b], UC(T, ρ) and C(T, ρ), when (T, ρ) is a totally bounded semimetric space, and UC(T, ρ) and ℓ∞(T), for an arbitrary set T, are all complete with respect to the uniform metric. When (T, ρ) is a compact semimetric space, T is totally bounded, and a ρ-continuous function on T is automatically uniformly ρ-continuous. Thus, when T is compact, C(T, ρ) = UC(T, ρ). Actually, every space UC(T, ρ) is equivalent to a space C(T̄, ρ), because the completion T̄ of a totally bounded space T is compact and, furthermore, every uniformly continuous function on T has a unique continuous extension to T̄. Showing this is saved as an exercise.

The foregoing structure makes it clear that UC(T, ρ) is a Polish space which is made complete by the uniform norm. Hence UC(T, ρ) is also σ-compact. In fact, any σ-compact set in ℓ∞(T) is contained in UC(T, ρ), for some totally bounded semimetric space (T, ρ), and all compact sets in ℓ∞(T) have a specific form:

Theorem 6.2 (Arzelà-Ascoli)

(a) The closure of K ⊂ UC(T, ρ), where (T, ρ) is totally bounded, is compact if and only if

(i) sup_{x∈K} |x(t0)| < ∞, for some t0 ∈ T; and

(ii) lim_{δ↓0} sup_{x∈K} sup_{s,t∈T: ρ(s,t)<δ} |x(s) − x(t)| = 0.

(b) The closure of K ⊂ ℓ∞(T) is σ-compact if and only if K ⊂ UC(T, ρ) for some semimetric ρ making T totally bounded.

The proof is given in section 6.4. Since all compact sets are trivially σ-compact, theorem 6.2 implies that any compact set in ℓ∞(T) is actually contained in UC(T, ρ) for some semimetric ρ making T totally bounded.

Another important class of metric spaces are product spaces. For a pair of metric spaces (D, d) and (E, e), the Cartesian product D × E is a metric space with respect to the metric ρ((x1, y1), (x2, y2)) ≡ d(x1, x2) ∨ e(y1, y2), for x1, x2 ∈ D and y1, y2 ∈ E. The resulting topology is the product topology. In this setting, convergence of (xn, yn) → (x, y) is equivalent to convergence of both xn → x and yn → y.

There are two natural σ-fields for D × E which we can consider. The first is the Borel σ-field for D × E generated from the product topology. The second is the product σ-field generated by all sets of the form A × B, where A ∈ A, B ∈ B, and A and B are the respective Borel σ-fields for D and E. These two are equal when D and E are separable, but they may be unequal otherwise, with the first σ-field larger than the second. Suppose X : Ω → D and Y : Ω → E are Borel-measurable maps defined on a measurable space (Ω, A). Then (X, Y) : Ω → D × E is a measurable map for the product of the two σ-fields by the definition of a measurable map. Unfortunately, when the Borel σ-field for D × E is larger than the product σ-field, then it is possible for (X, Y) to not be Borel-measurable.
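Before moving to outer expectations, a small numerical sketch can make a few of these spaces concrete; the sample size, seed, and names below are illustrative choices only. The empirical distribution function of n uniform draws is a cadlag step function, hence an element of D[0, 1] ⊂ ℓ∞([0, 1]), and its uniform distance to the true cdf F(t) = t is attained at the jump points, so it can be computed exactly:

```python
# Illustrative sketch: the empirical cdf as a cadlag element of D[0,1],
# and its uniform-norm distance ||Fhat - F||_[0,1] to the identity cdf.
import numpy as np

rng = np.random.default_rng(0)
n = 200
u = np.sort(rng.uniform(size=n))           # order statistics of the sample

def F_hat(t, data=u):
    """Empirical cdf: right-continuous with left limits (cadlag)."""
    return np.searchsorted(data, t, side="right") / len(data)

# The sup over [0,1] is attained at jump points, so a finite check suffices.
i = np.arange(1, n + 1)
sup_dist = max(np.max(np.abs(i / n - u)), np.max(np.abs((i - 1) / n - u)))
print(f"||F_hat - F||_[0,1] = {sup_dist:.4f}")  # typically of order n^(-1/2)
```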

6.2 Outer Expectation

An excellent overview of outer expectations is given in chapter 1.2 of van der Vaart and Wellner (1996). The concept applies to an arbitrary probability space (Ω, A, P) and an arbitrary map T : Ω → R̄, where R̄ ≡ [−∞, ∞]. As described in chapter 2, the outer expectation of T, denoted E^*T, is the infimum over all EU, where U : Ω → R̄ is measurable, U ≥ T, and EU exists. For EU to exist, it must not be indeterminate, although it can be ±∞, provided the sign is clear. Since T is not necessarily a random variable, the proper term for E^*T is outer integral. However, we will use the term outer expectation throughout the remainder of this book in deference to its connection with the classical notion of expectation. We analogously define inner expectation: E_*T = −E^*[−T]. The following lemma verifies the existence of a minimal measurable majorant T^* ≥ T:

Lemma 6.3 For any T : Ω → R̄, there exists a minimal measurable majorant T^* : Ω → R̄ with

(i) T^* ≥ T;

(ii) For every measurable U : Ω → R̄ with U ≥ T a.s., T^* ≤ U a.s.

For any T^* satisfying (i) and (ii), E^*T = ET^*, provided ET^* exists. The last statement is true if E^*T < ∞.

The proof is given in section 6.4 at the end of this chapter. The following lemma, the proof of which is left as an exercise, is an immediate consequence of lemma 6.3 and verifies the existence of a maximal measurable minorant:

Lemma 6.4 For any T : Ω → R̄, the maximal measurable minorant T_* ≡ −(−T)^* exists and satisfies

(i) T_* ≤ T;

(ii) For every measurable U : Ω → R̄ with U ≤ T a.s., T_* ≥ U a.s.

For any T_* satisfying (i) and (ii), E_*T = ET_*, provided ET_* exists. The last statement is true if E_*T > −∞.

An important special case of outer expectation is outer probability. The outer probability of an arbitrary B ⊂ Ω, denoted P^*(B), is the infimum over all P(A) such that A ⊃ B and A ∈ A. The inner probability of an arbitrary B ⊂ Ω is defined to be P_*(B) = 1 − P^*(Ω − B). The following lemma gives the precise connection between outer/inner expectations and outer/inner probabilities:

Lemma 6.5 For any B ⊂ Ω,

(i) P^*(B) = E^*1{B} and P_*(B) = E_*1{B};

(ii) there exists a measurable set B^* ⊃ B so that P(B^*) = P^*(B); for any such B^*, 1{B^*} = (1{B})^*;

(iii) For B_* ≡ Ω − {Ω − B}^*, P_*(B) = P(B_*);

(iv) (1{B})^* + (1{Ω − B})_* = 1.

Proof. From the definitions, P^*(B) = inf_{A∈A: A⊃B} E1{A} ≥ E^*1{B}. Next, E^*1{B} = E(1{B})^* ≥ E1{(1{B})^* ≥ 1} = P{(1{B})^* ≥ 1} ≥

P^*(B), where the last inequality follows from the definition of P^*. Combining the two conclusions yields that all inequalities are actually equalities. This gives the first parts of (i) and (ii), with B^* = {(1{B})^* ≥ 1}. The second part of (i) results from P_*(B) = 1 − P^*(Ω − B) = 1 − E(1 − 1{B})^* = 1 − E(1 − (1{B})_*). The second part of (ii) follows from (1{B})^* ≤ 1{B^*} = 1{(1{B})^* ≥ 1} ≤ (1{B})^*. The definition of P_* implies (iii) directly. To verify (iv), we have (1{Ω − B})_* = (1 − 1{B})_* = −(1{B} − 1)^* = 1 − (1{B})^*.

The following three lemmas, lemmas 6.6–6.8, provide several relations which will prove useful later on but which might be skipped on a first reading. The proofs are given in section 6.4 and in the exercises.

Lemma 6.6 Let S, T : Ω → R be arbitrary maps. The following statements are true almost surely, provided the statements are well-defined:

(i) S_* + T^* ≤ (S + T)^* ≤ S^* + T^*, with all equalities if S is measurable;

(ii) S_* + T_* ≤ (S + T)_* ≤ S_* + T^*, with all equalities if T is measurable;

(iii) (S − T)^* ≥ S^* − T^*;

(iv) |S^* − T^*| ≤ |S − T|^*;

(v) (1{T > c})^* = 1{T^* > c}, for any c ∈ R;

(vi) (1{T ≥ c})_* = 1{T_* ≥ c}, for any c ∈ R;

(vii) (S ∨ T)^* = S^* ∨ T^*;

(viii) (S ∧ T)^* ≤ S^* ∧ T^*, with equality if S is measurable.

Lemma 6.7 For any sets A, B ⊂ Ω,

(i) (A ∪ B)^* = A^* ∪ B^* and (A ∩ B)_* = A_* ∩ B_*;

(ii) (A ∩ B)^* ⊂ A^* ∩ B^* and (A ∪ B)_* ⊃ A_* ∪ B_*, with the inclusions replaced by equalities if either A or B is measurable;

(iii) If A ∩ B = ∅, then P_*(A) + P_*(B) ≤ P_*(A ∪ B) ≤ P^*(A ∪ B) ≤ P^*(A) + P^*(B).

Lemma 6.8 Let T : Ω → R be an arbitrary map and let φ : R → R be monotone, with an extension to R̄. The following statements are true almost surely, provided the statements are well-defined:

A. If φ is nondecreasing, then

(i) φ(T^*) ≥ [φ(T)]^*, with equality if φ is left-continuous on [−∞, ∞);

(ii) φ(T_*) ≤ [φ(T)]_*, with equality if φ is right-continuous on (−∞, ∞].

B. If φ is nonincreasing, then

(i) φ(T^*) ≤ [φ(T)]_*, with equality if φ is left-continuous on [−∞, ∞);

(ii) φ(T_*) ≥ [φ(T)]^*, with equality if φ is right-continuous on (−∞, ∞].

We next present an outer-expectation version of the unconditional Jensen's inequality:

Lemma 6.9 (Jensen's inequality) Let T : Ω → R be an arbitrary map, with E^*|T| < ∞, and assume φ : R → R is convex. Then

(i) E^*φ(T) ≥ φ(E_*T) ∨ φ(E^*T);

(ii) if φ is also monotone, E_*φ(T) ≥ φ(E_*T) ∧ φ(E^*T).

Proof. Assume first that φ is monotone increasing. Since φ is also continuous (by convexity), E^*φ(T) = Eφ(T^*) ≥ φ(E^*T), where the equality follows from A(i) of lemma 6.8 and the inequality from the usual Jensen's inequality. Similar arguments verify that E_*φ(T) ≥ φ(E_*T), based on A(ii) of the same lemma. Note also that φ(E^*T) ≥ φ(E_*T). Now assume that φ is monotone decreasing. Using B(i) and B(ii) of lemma 6.8 and arguments similar to those used for increasing φ, we obtain that both E^*φ(T) ≥ φ(E_*T) and E_*φ(T) ≥ φ(E^*T). Note in this case that φ(E_*T) ≥ φ(E^*T). Thus when φ is monotone (either increasing or decreasing), we have that both E^*φ(T) ≥ φ(E^*T) ∨ φ(E_*T) and E_*φ(T) ≥ φ(E^*T) ∧ φ(E_*T). Hence (ii) follows. We have also proved (i) in the case where φ is monotone.

Suppose now that φ is not monotone. Then there exists a c ∈ R so that φ is nonincreasing over (−∞, c] and nondecreasing over (c, ∞). Let g1(t) ≡ φ(t)1{t ≤ c} + φ(c)1{t > c} and g2(t) ≡ φ(c)1{t ≤ c} + φ(t)1{t > c}, and note that φ(t) = g1(t) + g2(t) − φ(c) and that both g1 and g2 are convex and monotone. Now [φ(T)]^* = [g1(T) + g2(T) − φ(c)]^* ≥ [g1(T)]_* + [g2(T)]^* − φ(c) by part (i) of lemma 6.6. Thus, using the results in the previous paragraph, we have that E^*φ(T) ≥ g1(E^*T) + g2(E^*T) − φ(c) = φ(E^*T). However, we also have that [g1(T) + g2(T) − φ(c)]^* ≥ [g1(T)]^* + [g2(T)]_* − φ(c). Thus, again using results from the previous paragraph, we have E^*φ(T) ≥ g1(E_*T) + g2(E_*T) − φ(c) = φ(E_*T). Hence (i) follows.

The following outer-expectation version of Chebyshev's inequality is also useful:

Lemma 6.10 (Chebyshev's inequality) Let T : Ω → R be an arbitrary map, with φ : [0, ∞) → [0, ∞) nondecreasing and strictly positive on (0, ∞). Then, for every u > 0,

P^*(|T| ≥ u) ≤ E^*φ(|T|)/φ(u).

Proof. The result follows from the standard Chebyshev inequality as a result of the following chain of inequalities:

(1{|T| ≥ u})^* ≤ (1{φ(|T|) ≥ φ(u)})^* ≤ 1{[φ(|T|)]^* ≥ φ(u)}.

The first inequality follows from the fact that |T| ≥ u implies φ(|T|) ≥ φ(u), and the second inequality follows from A(i) of lemma 6.8.

There are many other analogies between outer and standard versions of expectation and probability, but we only present a few more in this chapter. We next present versions of the monotone and dominated convergence theorems. The proofs are given in section 6.4. Some additional results are given in the exercises.

Lemma 6.11 (Monotone convergence) Let Tn, T : Ω → R be arbitrary maps on a probability space, with Tn ↑ T pointwise on a set of inner probability one. Then Tn^* ↑ T^* almost surely. Provided E^*Tn > −∞ for some n, then E^*Tn ↑ E^*T.

Lemma 6.12 (Dominated convergence) Let Tn, T, S : Ω → R be maps on a probability space, with |Tn − T|^* → 0 almost surely, |Tn| ≤ S for all n, and E^*S < ∞. Then E^*Tn → E^*T.
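The preceding definitions can be checked by brute force on a finite probability space; the partition, weights, and map T in the sketch below are illustrative choices only. When A is generated by a partition, a map is measurable exactly when it is constant on each block, so the minimal measurable majorant T^* takes block-wise maxima of T, and the outer probability of an arbitrary set is the probability of the smallest union of blocks covering it:

```python
# Brute-force sketch of T^*, T_*, E^*, E_*, and P^* on a finite Omega with
# the sigma-field generated by a (hypothetical) partition into blocks.
blocks = [(0, 1), (2, 3), (4, 5)]                  # partition generating A
p = {0: .1, 1: .1, 2: .2, 3: .2, 4: .2, 5: .2}     # point masses (sum to 1)
T = {0: 1.0, 1: 3.0, 2: -1.0, 3: 0.0, 4: 2.0, 5: 2.0}  # a non-measurable map

T_star = {w: max(T[v] for v in B) for B in blocks for w in B}  # T^*
T_low  = {w: min(T[v] for v in B) for B in blocks for w in B}  # T_*

E = lambda f: sum(p[w] * f[w] for w in p)
print(E(T_star), E(T_low))                          # E^*T >= E_*T

# Outer probability of an arbitrary B: smallest measurable superset is the
# union of the blocks that B touches.
B = {1, 2}
P_outer = sum(p[w] for blk in blocks if set(blk) & B for w in blk)
print(P_outer)                                      # here 0.1+0.1+0.2+0.2 = 0.6
```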

Let (Ω, Ã, P̃) be the P-completion of the probability space (Ω, A, P), as defined in the previous section (for general measure spaces). Recall that a completion of a measure space is itself a measure space. One can also show that Ã is the σ-field of all sets of the form A ∪ N, with A ∈ A and N ⊂ Ω so that P^*(N) = 0, and P̃ is the probability measure satisfying P̃(A ∪ N) = P(A). A nice property of (Ω, Ã, P̃) is that for every measurable map S̃ : (Ω, Ã) → R, there is a measurable map S : (Ω, A) → R such that P^*(S ≠ S̃) = 0. Furthermore, a minimal measurable cover T^* of a map T : (Ω, A, P) → R̄ is a version of a minimal measurable cover T̃^* for T as a map on the P-completion of (Ω, A, P), i.e., P^*(T^* ≠ T̃^*) = 0. While it is not difficult to show this, we do not prove it.

We close this section with two results which have application to product probability spaces. The first result involves perfect maps, and the second result is a special formulation of Fubini's theorem. Consider composing a map T : Ω → R with a measurable map φ : (Ω̃, Ã, P̃) → (Ω, A, P), defined on some probability space, to form T ∘ φ : (Ω̃, Ã, P̃) → R. Denote T^* as the minimal measurable cover of T for P̃ ∘ φ^{-1}. It is easy to see that since T^* ∘ φ ≥ T ∘ φ, we have (T ∘ φ)^* ≤ T^* ∘ φ. The map φ is perfect if (T ∘ φ)^* = T^* ∘ φ, for every bounded T : Ω → R. This property ensures that P̃^*(φ ∈ A) = (P̃ ∘ φ^{-1})^*(A) for every set A ⊂ Ω. An important example of a perfect map is a coordinate projection in a product probability space. Specifically, let T be a real valued map defined on (Ω1 × Ω2, A1 × A2, P1 × P2) which only depends on the first coordinate of ω = (ω1, ω2). T^* can then be computed by just ignoring Ω2 and thinking of T as a map on Ω1. More precisely, suppose T = T1 ∘ π1, where π1 is the projection on the first coordinate. The following lemma shows that T^* = T1^* ∘ π1, and thus coordinate projections such as π1 are perfect. We will see other examples of perfect maps later on in chapter 7.

Lemma 6.13 A coordinate projection on a product probability space with product measure is perfect.

Proof. Let π1 : (Ω1 × Ω2, A1 × A2, P1 × P2) → Ω1 be the projection onto the first coordinate, and let T : Ω1 → R be bounded, but otherwise arbitrary. Let T^* be the minimal measurable cover of T for P1 = (P1 × P2) ∘ π1^{-1}. By definition, (T ∘ π1)^* ≤ T^* ∘ π1. Now suppose U ≥ T ∘ π1, P1 × P2-a.s., and is measurable, where U : Ω1 × Ω2 → R. Fubini's theorem yields that for P2-almost all ω2, we have U(ω1, ω2) ≥ T(ω1) for P1-almost all ω1. But for fixed ω2, U is a measurable function of ω1. Thus for P2-almost all ω2, U(ω1, ω2) ≥ T^*(ω1) for P1-almost all ω1. Applying Fubini's theorem again, the jointly measurable set {(ω1, ω2) : U(ω1, ω2) < T^*(ω1)} has P1 × P2-measure zero, so that U ≥ T^* ∘ π1, P1 × P2-a.s. Taking U = (T ∘ π1)^* now yields (T ∘ π1)^* = T^* ∘ π1 almost surely, and thus π1 is perfect.

6.3 Linear Operators and Functional Differentiation

A linear operator is a map T : D → E between normed spaces with the property that T(ax + by) = aT(x) + bT(y) for all scalars a, b and any x, y ∈ D. When the range space E is R, then T is a linear functional. When T is linear, we will often use Tx instead of T(x). A linear operator T : D → E is a bounded linear operator if

(6.1) ‖T‖ ≡ sup_{x∈D: ‖x‖≤1} ‖Tx‖ < ∞.

Here, the norms ‖·‖ are defined by the context. We have the following proposition:

Proposition 6.15 For a linear operator T : D → E between normed spaces, the following are equivalent:

(i) T is continuous at a point x0 ∈ D;

(ii) T is continuous on all of D;

(iii) T is bounded.

Proof. We save the implication (i) ⇒ (ii) as an exercise. Note that by linearity, boundedness of T is equivalent to there existing some 0 < c < ∞ for which ‖Tx‖ ≤ c‖x‖ for all x ∈ D. Thus if (iii) holds and xn → x0 in D, then ‖Txn − Tx0‖ = ‖T(xn − x0)‖ ≤ c‖xn − x0‖ → 0, and hence (iii) implies (i). Conversely, if (ii) holds, then T is continuous at 0, and there exists a δ > 0 so that ‖Tx‖ ≤ 1 whenever ‖x‖ ≤ δ. By linearity, ‖Tx‖ ≤ δ^{-1}‖x‖ for all x ∈ D, and thus (ii) implies (iii).

For a linear operator T : D → E, let N(T) ≡ {x ∈ D : Tx = 0} denote its null space and R(T) ≡ {Tx : x ∈ D} its range. T is continuously invertible if it is one-to-one and its inverse T^{-1} : R(T) → D is continuous. We have the following lemma:

Lemma 6.16 Let T : D → E be a linear operator between normed spaces. Then:

(i) T is continuously invertible if and only if there exists a c > 0 so that ‖Tx‖ ≥ c‖x‖ for all x ∈ D;

(ii) (Banach's theorem) If D and E are complete and T is continuous with N(T) = {0}, then T^{-1} is continuous if and only if R(T) is closed.
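In finite dimensions, part (i) can be visualized numerically; the matrix below is an arbitrary illustrative choice. For a matrix operator, the best constant c with ‖Tx‖ ≥ c‖x‖ is the smallest singular value, and when that value is positive the operator norm of T^{-1} equals 1/c:

```python
# Illustrative finite-dimensional sketch of lemma 6.16(i): for a matrix T,
# inf over ||x|| = 1 of ||Tx|| is the smallest singular value s_min; T is
# continuously invertible exactly when s_min > 0, with ||T^{-1}|| = 1/s_min.
import numpy as np

T = np.array([[2.0, 1.0],
              [0.0, 0.5]])
s_min = np.linalg.svd(T, compute_uv=False).min()
print(f"best c with ||Tx|| >= c||x||: {s_min:.4f}")

T_inv = np.linalg.inv(T)
op_norm_inv = np.linalg.norm(T_inv, 2)       # spectral norm of T^{-1}
assert np.isclose(op_norm_inv, 1.0 / s_min)  # continuity constant of T^{-1}
```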

A linear operator T : D → E between normed spaces is a compact operator if T(U) is compact in E, where U = {x ∈ D : ‖x‖ ≤ 1} is the unit ball in D and, for a set A ⊂ D, T(A) ≡ {Tx : x ∈ A}. The operator T is onto if for every y ∈ E, there exists an x ∈ D so that Tx = y. Later on in the book, we will encounter linear operators of the form T + K, where T is continuously invertible and onto and K is compact. The following result will be useful:

Lemma 6.17 Let A = T + K : D → E be a linear operator between Banach spaces, where T is both continuously invertible and onto and K is compact. Then if N(A) = {0}, A is also continuously invertible and onto.

Proof. We only sketch the proof. Since T^{-1} is continuous, the operator T^{-1}K : D → D is compact. Hence I + T^{-1}K is one-to-one and therefore also onto by a result of Riesz for compact operators (see, for example, theorem 3.4 of Kress, 1999). Thus T + K is also onto. We will be done if we can show that I + T^{-1}K is continuously invertible, since that would imply that (T + K)^{-1} = (I + T^{-1}K)^{-1}T^{-1} is bounded. Assume L^{-1}, where L ≡ I + T^{-1}K, is not bounded. Then there exists a sequence {xn} ∈ D with ‖xn‖ = 1 and ‖L^{-1}xn‖ ≥ n for all integers n ≥ 1. Let yn = xn/‖L^{-1}xn‖ and φn = L^{-1}xn/‖L^{-1}xn‖, and note that yn → 0 while ‖φn‖ = 1 for all n ≥ 1. Since T^{-1}K is compact, there exists a subsequence {n′} so that T^{-1}Kφ_{n′} → φ ∈ D. Since φn + T^{-1}Kφn = yn for all n ≥ 1, we have φ_{n′} → −φ. Hence φ ∈ N(L), which implies φ = 0 since L is one-to-one. But this contradicts ‖φn‖ = 1 for all n ≥ 1. Thus L^{-1} is bounded.

The following simple inversion result for contraction operators is also useful. An operator A is a contraction operator if ‖A‖ < 1.

Proposition 6.18 Let A : D → D be a linear operator with ‖A‖ < 1. Then I − A, where I is the identity, is continuously invertible and onto with inverse (I − A)^{-1} = Σ_{j=0}^∞ A^j.

Proof. Let B ≡ Σ_{j=0}^∞ A^j, and note that ‖B‖ ≤ Σ_{j=0}^∞ ‖A‖^j = (1 − ‖A‖)^{-1} < ∞. Thus B is a bounded linear operator on D. Since (I − A)B = I by simple algebra, we have that B = (I − A)^{-1}, and the result follows.

We now shift our attention to differentiation. Let D and E be two normed spaces, and let φ : Dφ ⊂ D → E be a function. We allow the domain Dφ of the function to be an arbitrary subset of D. The function φ : Dφ ⊂ D → E is Gateaux-differentiable at θ ∈ Dφ, in the direction h, if there exists a quantity φ′θ(h) ∈ E so that

(φ(θ + tnh) − φ(θ))/tn → φ′θ(h),

as n → ∞, for any scalar sequence tn → 0. Gateaux differentiability is usually not strong enough for the applications of functional derivatives needed for Z-estimators and the delta method. The stronger differentiability we need is Hadamard and Fréchet differentiability.

A map φ : Dφ ⊂ D → E is Hadamard differentiable at θ ∈ Dφ if there exists a continuous linear operator φ′θ : D → E such that

(6.3) (φ(θ + tnhn) − φ(θ))/tn → φ′θ(h),

as n → ∞, for any scalar sequence tn → 0 and any h, {hn} ∈ D, with hn → h, and so that θ + tnhn ∈ Dφ for all n. It is left as an exercise to show that Hadamard differentiability is equivalent to compact differentiability, where compact differentiability satisfies

(6.4) sup_{h∈K, θ+th∈Dφ} ‖(φ(θ + th) − φ(θ))/t − φ′θ(h)‖ → 0, as t → 0,

for every compact K ⊂ D. Hadamard differentiability can be refined by restricting the h values to be in a set D0 ⊂ D. More precisely, if in (6.3) it is required that hn → h only for h ∈ D0 ⊂ D, we say φ is Hadamard-differentiable tangentially to the set D0. There appears to be no easy way to refine compact differentiability in an equivalent manner.
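A simple numerical sketch can illustrate definition (6.3); the choice of map φ, the grid, and all names below are illustrative assumptions, not examples from the text. For φ(f) = ∫₀¹ f²(t) dt, the candidate derivative at θ is the continuous linear map h ↦ 2∫₀¹ θ(t)h(t) dt, and the difference quotients along tn → 0 and perturbed directions hn → h approach it:

```python
# Illustrative sketch: checking a Hadamard (here also Frechet) derivative
# numerically for phi(f) = integral of f^2 over [0,1] on a fine grid.
import numpy as np

grid = np.linspace(0.0, 1.0, 1001)
integral = lambda f: float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(grid)))

phi = lambda f: integral(f ** 2)
theta = np.sin(2 * np.pi * grid)        # point of differentiation (assumed)
h = np.cos(np.pi * grid)                # direction h

deriv = 2.0 * integral(theta * h)       # phi'_theta(h) = 2 * int(theta * h)
for t in [1e-1, 1e-2, 1e-3]:
    h_n = h + t * grid ** 2             # h_n -> h as t -> 0
    quotient = (phi(theta + t * h_n) - phi(theta)) / t
    print(t, quotient, deriv)           # quotient converges to deriv
```

Here the quotient equals 2∫θhn + t∫hn², which makes the convergence along perturbed directions transparent.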
A map φ : Dφ ⊂ D → E is Fréchet-differentiable at θ if there exists a continuous linear operator φ′θ : D → E so that (6.4) holds uniformly in h on bounded subsets of D. This is equivalent to ‖φ(θ + h) − φ(θ) − φ′θ(h)‖ = o(‖h‖), as ‖h‖ → 0. Since compact sets are bounded, Fréchet differentiability implies Hadamard differentiability. Fréchet differentiability will be needed for Z-estimator theory, while Hadamard differentiability is useful in the delta method. The following chain rule for Hadamard differentiability will also prove useful:

Lemma 6.19 (Chain rule) Assume φ : Dφ ⊂ D → Eψ ⊂ E is Hadamard differentiable at θ ∈ Dφ tangentially to D0 ⊂ D, and ψ : Eψ ⊂ E → F is Hadamard differentiable at φ(θ) tangentially to φ′θ(D0). Then ψ ∘ φ : Dφ → F is Hadamard differentiable at θ tangentially to D0 with derivative ψ′φ(θ) ∘ φ′θ.

Proof. First, ψ ∘ φ(θ + tht) − ψ ∘ φ(θ) = ψ(φ(θ) + tkt) − ψ(φ(θ)), where kt = [φ(θ + tht) − φ(θ)]/t. Note that if h ∈ D0, then kt → k ≡ φ′θ(h) ∈ φ′θ(D0), as t → 0, by the Hadamard differentiability of φ. Now, [ψ(φ(θ) + tkt) − ψ(φ(θ))]/t → ψ′φ(θ)(k) by the Hadamard differentiability of ψ.
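To close this section, proposition 6.18 can also be made concrete with matrices; the dimension, seed, and rescaling below are arbitrary illustrative choices. When ‖A‖ < 1, partial sums of the Neumann series Σ_j A^j converge to (I − A)^{-1}:

```python
# Illustrative sketch of proposition 6.18: Neumann series inversion of
# I - A for a matrix A rescaled to have operator norm 0.9 < 1.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
A *= 0.9 / np.linalg.norm(A, 2)          # rescale so ||A|| = 0.9 < 1

B = np.zeros_like(A)
term = np.eye(4)
for _ in range(300):                     # partial sums of sum_j A^j
    B += term
    term = term @ A                      # term = A^j after j iterations

assert np.allclose(B, np.linalg.inv(np.eye(4) - A), atol=1e-8)
```

The geometric bound ‖Σ_{j>m} A^j‖ ≤ ‖A‖^{m+1}/(1 − ‖A‖) from the proof also tells us how many terms are needed for a given tolerance.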

6.4 Proofs

Proof of theorem 6.2. First assume K ⊂ ℓ∞(T) is compact. Let ρ(s, t) = sup_{x∈K} |x(s) − x(t)|. We will now establish that (T, ρ) is totally bounded.

Fix η > 0 and cover K with finitely many open balls of radius η, centered at z1,...,zk. Partition R^k into cubes with edges of length η. For every such cube for which it is possible, choose at most one t ∈ T so that (z1(t),...,zk(t)) is in the cube. This results in a finite set Tη ≡ {t1,...,tm} in T because z1,...,zk are uniformly bounded. For each s ∈ Tη, we have for all values of t ∈ T for which (z1(t),...,zk(t)) and (z1(s),...,zk(s)) are in the same cube, that

ρ(s, t) = sup_{z∈K} |z(s) − z(t)|

≤ sup_{1≤j≤k} |zj(s) − zj(t)| + 2 sup_{z∈K} inf_{1≤j≤k} sup_{u∈T} |zj(u) − z(u)| < 3η.

Thus the balls {t : ρ(t, t1) < 3η},...,{t : ρ(t, tm) < 3η} completely cover T. Hence (T, ρ) is totally bounded since η was arbitrary. Also, by construction, the condition in part (a.ii) of the theorem is satisfied. Combining this with the total boundedness of (T, ρ) yields condition (a.i). We have now obtained the fairly strong result that compactness of the closure of K ⊂ ℓ∞(T) implies that there exists a semimetric ρ which makes (T, ρ) totally bounded and which enables conditions (a.i) and (a.ii) to be satisfied for K.

Now assume that the closure of K ⊂ ℓ∞(T) is σ-compact. This implies that there exists a sequence of compact sets K1 ⊂ K2 ⊂ ··· for which K = ∪_{i≥1}Ki. The previous result yields for each i ≥ 1, that there exists a semimetric ρi making T totally bounded and for which conditions (a.i) and (a.ii) are satisfied for each Ki. Now let ρ(s, t) = Σ_{i=1}^∞ 2^{-i}(ρi(s, t) ∧ 1). Fix η > 0, and select a finite integer m so that 2^{-m} < η. Cover T with finitely many open ρm-balls of radius η, and let Tη = {t1,...,tk} be their centers. Because ρ1 ≤ ρ2 ≤ ···, there is for every t ∈ T an s ∈ Tη with ρ(s, t) ≤ Σ_{i=1}^m 2^{-i}ρi(s, t) + 2^{-m} ≤ 2η. Thus T is totally bounded by ρ since η was arbitrary. Now, for any x ∈ K, x ∈ Km for some finite m ≥ 1, and thus x is both bounded and uniformly ρ-continuous since ρm ≤ 2^m ρ. Hence σ-compactness of K ⊂ ℓ∞(T) implies K ⊂ UC(T, ρ) for some semimetric ρ making T totally bounded. In the discussion preceding the statement of theorem 6.2, we argued that UC(T, ρ) is σ-compact whenever (T, ρ) is totally bounded. Hence we have proven part (b).

The only part of the proof which remains is to show that if K ⊂ UC(T, ρ) satisfies conditions (a.i) and (a.ii), then the closure of K is compact. Assume conditions (a.i) and (a.ii) hold for K. Define

mδ(x) ≡ sup_{s,t∈T: ρ(s,t)<δ} |x(s) − x(t)|

and mδ ≡ sup_{x∈K} mδ(x), and note that mδ(x) is continuous in x and that m_{1/n}(x) is nonincreasing in n, with lim_{n→∞} m_{1/n}(x) = 0. Choose k < ∞ large enough so that m_{1/k} < ∞. For every δ > 0, let Tδ ⊂ T be a finite mesh satisfying sup_{t∈T} inf_{s∈Tδ} ρ(s, t) < δ, and let Nδ be the number of points in

Tδ. Now, for any t ∈ T, |x(t)| ≤ |x(t0)| + |x(t) − x(t0)| ≤ |x(t0)| + N_{1/k}m_{1/k}, and thus α ≡ sup_{x∈K} sup_{t∈T} |x(t)| < ∞. For each ε > 0, pick a δ > 0 so that mδ < ε and an integer n < ∞ so that α/n ≤ ε. Let U ≡ T_{δ/2} and define a "bracket" to be a finite collection h = {ht : t ∈ U} so that ht = −α + jα/n, for some integer 1 ≤ j ≤ 2n − 1, for each t ∈ U. Say that x ∈ ℓ∞(T) is "in" the bracket h, denoted x ∈ h, if x(t) ∈ [ht, ht + α/n] for all t ∈ U. Let B(K) be the set of all brackets h for which x ∈ h for some x ∈ K. For each h ∈ B(K), choose one and only one x ∈ K with x ∈ h, discard duplicates, and denote the resulting set X(K). It is not hard to verify that sup_{x∈K} inf_{y∈X(K)} ‖x − y‖_T < 2ε, and thus the union of 2ε-balls with centers in X(K) is a finite cover of K. Since ε is arbitrary, K is totally bounded, and since ℓ∞(T) is complete, the closure of K is compact.

Proof of lemma 6.3. Begin by selecting a measurable sequence Um ≥ T such that E arctan Um ↓ E^* arctan T, and set T^*(ω) = lim_{m→∞} inf_{1≤k≤m} Uk(ω). This gives a measurable function T^* taking values in R̄, with T^* ≥ T, and E arctan T^* = E^* arctan T by monotone convergence. For any measurable U ≥ T, arctan(U ∧ T^*) ≥ arctan T, and thus E arctan(U ∧ T^*) ≥ E^* arctan T = E arctan T^*. However, U ∧ T^* is trivially smaller than T^*, and since both quantities therefore have the same expectation, arctan(U ∧ T^*) = arctan T^* a.s. Hence T^* ≤ U a.s., and (i) and (ii) follow. When ET^* exists, it is larger than E^*T by (i) yet smaller by (ii), and thus ET^* = E^*T. When E^*T < ∞, there exists a measurable U ≥ T with EU^+ < ∞, where z^+ is the positive part of z. Hence E(T^*)^+ ≤ EU^+ and ET^* exists.

Proof of lemma 6.6. The second inequality in (i) is obvious. If S and U ≥ S + T are both measurable, then U − S ≥ T and U − S ≥ T^* since U − S is also measurable. Hence U = (S + T)^* satisfies (S + T)^* ≥ S + T^*, and the second inequality is an equality. Now (S + T)^* ≥ (S_* + T)^* = S_* + T^*, and we obtain the first inequality. If S is measurable, then S_* = S and thus S_* + T^* = S^* + T^*. Part (ii) is left as an exercise. Part (iii) follows from the second inequality in (i) after relabeling and rearranging. Part (iv) follows from S^* − T^* ≤ (S − T)^* ≤ |S − T|^* and then exchanging the roles of S and T. For part (v), it is clear that (1{T > c})^* ≤ 1{T^* > c}. If U ≥ 1{T > c} is measurable, then S = T^*1{U ≥ 1} + (T^* ∧ c)1{U < 1} ≥ T and is measurable. Hence S ≥ T^*, and thus T^* ≤ c whenever U < 1. This trivially implies 1{T^* > c} = 0 when U < 1, and thus 1{T^* > c} ≤ U. Part (vi) is left as an exercise. For part (vii), (S ∨ T)^* ≤ S^* ∨ T^* trivially. Let U = (S ∨ T)^* and note that U is measurable and both U ≥ T and U ≥ S. Hence both U ≥ T^* and U ≥ S^*, and thus (S ∨ T)^* ≥ S^* ∨ T^*, yielding the desired equality. The inequality in (viii) is obvious. Assume S is measurable and let U = (S ∧ T)^*. Clearly, U ≤ S^* ∧ T^* and, since S ∧ T ≤ S, also U ≤ S^* = S. Define T̃ ≡ U1{U < S} + T^*1{U ≥ S}. On the event {U < S} we must have S ∧ T = T, so that T̃ = U ≥ S ∧ T = T there, while T̃ = T^* ≥ T on {U ≥ S}. Thus T̃ is a measurable majorant of T, and hence T̃ ≥ T^* a.s. On the event {U < S},

U = T̃ ≥ T^* ≥ S ∧ T^* a.s.; and on the event {U ≥ S}, U ≥ S ≥ S ∧ T^*. Hence U ≥ S ∧ T^* a.s., and the desired equality in (viii) follows.

Proof of lemma 6.7. The first part of (i) is a consequence of the following chain of equalities: 1{(A ∪ B)^*} = (1{A ∪ B})^* = (1{A} ∨ 1{B})^* = (1{A})^* ∨ (1{B})^* = 1{A^*} ∨ 1{B^*} = 1{A^* ∪ B^*}. The first and fourth equalities follow from (ii) of lemma 6.5, the second and fifth equalities follow directly, and the third equality follows from (vii) of lemma 6.6. For the second part of (i), we have (A ∩ B)_* = Ω − [(Ω − A) ∪ (Ω − B)]^* = Ω − [(Ω − A)^* ∪ (Ω − B)^*] = Ω − [(Ω − A_*) ∪ (Ω − B_*)] = A_* ∩ B_*, where the second equality follows from the first part of (i).

The inclusions in part (ii) are obvious. Assume A is measurable. Then (viii) of lemma 6.6 can be used to validate the following string of equalities: 1{(A ∩ B)^*} = (1{A ∩ B})^* = (1{A} ∧ 1{B})^* = (1{A})^* ∧ (1{B})^* = 1{A^*} ∧ 1{B^*} = 1{A^* ∩ B^*}. Thus (A ∩ B)^* = A^* ∩ B^*. By symmetry, this works whether A or B is measurable. The proof that (A ∪ B)_* = A_* ∪ B_* when either A or B is measurable is left as an exercise. The proof of part (iii) is also left as an exercise.

Proof of lemma 6.8. All of the inequalities follow from the definitions. For the equality in A(i), assume that φ is nondecreasing and left-continuous. Define φ^{-1}(u) = inf{t : φ(t) ≥ u}, and note that φ(t) > u if and only if t > φ^{-1}(u). Thus, for any c ∈ R, 1{φ(T^*) > c} = 1{T^* > φ^{-1}(c)} = (1{T > φ^{-1}(c)})^* = (1{φ(T) > c})^* = 1{[φ(T)]^* > c}. The second and fourth equalities follow from (v) of lemma 6.6. Hence φ(T^*) = [φ(T)]^*. For the equality in A(ii), assume that φ is nondecreasing and right-continuous, and define φ^{-1}(u) = sup{t : φ(t) ≤ u}. Note that φ(t) ≥ u if and only if t ≥ φ^{-1}(u). The proof proceeds in the same manner as for A(i), only part (vi) in lemma 6.6 is used in place of part (v). We leave the proof of part B as an exercise.

Proof of lemma 6.11. Clearly, lim inf Tn^* ≤ lim sup Tn^* ≤ T^*. Conversely, lim inf Tn^* ≥ lim inf Tn = T and is measurable, and thus lim inf Tn^* ≥ T^*. Hence Tn^* ↑ T^* almost surely. Now E^*Tn = ETn^* ↑ ET^* = E^*T by monotone convergence for measurable maps. Note we are allowing +∞ as a possible value for E^*Tn, for some n, or E^*T.

Proof of lemma 6.12. Since |T| ≤ |Tn| + |T − Tn| for all n, we have |T|^* ≤ S^* + |T − Tn|^* and hence |T − Tn|^* ≤ |T|^* + |Tn|^* ≤ 2S^* a.s. Fix ε > 0. Since E^*S < ∞, there exists a 0 < k < ∞ so that E[S^*1{S^* > k}] ≤ ε/4. Thus E^*|T − Tn| ≤ E[2k ∧ |T − Tn|^*] + 2E[S^*1{S^* > k}] → ε/2, since 2k ∧ |T − Tn|^* → 0 a.s. and is bounded by 2k. The result now follows since ε was arbitrary.

6.5 Exercises

6.5.1. Show that part (iii) in the definition of σ-field can be replaced, without really changing the definition, by the following: The countable intersection ⋂_{j=1}^∞ Uj ∈ A whenever Uj ∈ A for all j ≥ 1.

6.5.2. Show the following:

(a) For a metric space D and set A ⊂ D, the closure Ā consists of all limits of sequences {xn} ∈ A.

(b) Two metrics d1 and d2 on a set D are equivalent if and only if we have the following for any sequence {xj} ∈ D, as n, m → ∞: d1(xn, xm) → 0 if and only if d2(xn, xm) → 0.

(c) A function f : D → E between two metric spaces is continuous (in the topological sense) if and only if, for all x ∈ D and all sequences {xn} ∈ D, f(xn) → f(x) whenever xn → x.

6.5.3. Verify the implication (ii) ⇒ (i) in lemma 6.1.

6.5.4. Show that if a subset K of a metric space is not totally bounded, then it is possible to construct a sequence {xn} ∈ K which has no Cauchy subsequences.

6.5.5. Show that the following spaces are complete with respect to the uniform metric:

(a) C[a, b] and D[a, b];

(b) UC(T, ρ) and C(T, ρ), where (T, ρ) is a totally bounded semimetric space;

(c) UC(T, ρ) and ℓ∞(T), where T is an arbitrary set.

6.5.6. Show that a uniformly continuous function f : T → R, where T is totally bounded, has a unique continuous extension to the completion T̄.

6.5.7. Let CL[0, 1] ⊂ C[0, 1] be the space of all Lipschitz-continuous functions on [0, 1], and endow it with the uniform metric:

(a) Show that CL[0, 1] is dense in C[0, 1].

(b) Show that no open ball in C[0, 1] is contained in CL[0, 1].

(c) Show that CL[0, 1] is not complete.

(d) Show that for 0 < c < ∞, the set

{f ∈ CL[0, 1] : |f(x)|≤c and |f(x) − f(y)|≤c|x − y|, ∀x, y ∈ [0, 1]}

is compact.

6.5.8. A collection F of maps f : D → E between metric spaces, with respective Borel σ-fields D and E, can generate a (possibly) new σ-field for D by considering all inverse images f^{-1}(A), for f ∈ F and A ∈ E. Show that the σ-field σp generated by the coordinate projections x ↦ x(t) on

C[a, b] is equal to the Borel σ-field σc generated by the uniform norm. Hint: Show first that continuity of the projection maps implies σp ⊂ σc. Second, show that the open balls in σc can be created from countable set operations on sets in σp.

6.5.9. Show that the following metrics generate the product topology on D × E, where d and e are the respective metrics for D and E:

(i) ρ1((x1, y1), (x2, y2)) ≡ d(x1, x2) + e(y1, y2).

(ii) ρ2((x1, y1), (x2, y2)) ≡ {d²(x1, x2) + e²(y1, y2)}^{1/2}.

6.5.10. For any map T : Ω → R̄, show that E_*T is the supremum of all EU, where U ≤ T, U : Ω → R̄ measurable and EU exists. Show also that for any set B ⊂ Ω, P_*(B) is the supremum of all P(A), where A ⊂ B and A ∈ A.

6.5.11. Use lemma 6.3 to prove lemma 6.4.

6.5.12. Prove parts (ii) and (vi) of lemma 6.6 using parts (i) and (v), respectively.

6.5.13. Let S, T :Ω→ R¯ be arbitrary. Show the following:

(a) |S_* − T_*| ≤ |S − T|^* + (S^* − S_*) ∧ (T^* − T_*);

(b) |S − T|^* ≤ (S^* − T_*) ∨ (T^* − S_*) ≤ |S − T|^* + (S^* − S_*) ∧ (T^* − T_*).

6.5.14. Finish the proof of lemma 6.7:

(a) Show that (A ∪ B)_* = A_* ∪ B_* if either A or B is measurable.

(b) Prove part (iii) of the lemma.

6.5.15. Prove part B of lemma 6.8 using the approach outlined in the proof of part A.

6.5.16. Prove the following "converse" to Jensen's inequality:

Lemma 6.20 (converse to Jensen's inequality) Let T : Ω → R be an arbitrary map, with E^*|T| < ∞, and assume φ : R → R is concave. Then

(a) E_*φ(T) ≤ φ(E_*T) ∧ φ(E^*T);

(b) if φ is also monotone, E^*φ(T) ≤ φ(E_*T) ∨ φ(E^*T).

6.5.17. In the proof of proposition 6.15, show that (i) implies (ii).

6.5.18. Show that Hadamard and compact differentiability are equivalent.

6.6 Notes

Much of the material in section 6.1 is an amalgamation of several concepts and presentation styles found in chapter 2 of Billingsley (1986), sections 1.3 and 1.7 of van der Vaart and Wellner (1996), section 18.1 of van der Vaart (1998), and in chapter 1 of both Rudin (1987) and Rudin (1991). A nice proof of the equivalence of the several definitions of compactness can be found in appendix I of Billingsley (1968).

Many of the ideas in section 6.2 come from chapter 1.2 of van der Vaart and Wellner (1996), abbreviated VW hereafter. Lemma 6.3 corresponds to lemma 1.2.1 of VW, lemma 6.4 is given in exercise 1.2.1 of VW, and parts (i), (ii) and (iii) of lemma 6.5 correspond to lemma 1.2.3 of VW. Most of lemma 6.6 is given in lemma 1.2.2 of VW, although the first inequalities in parts (i) and (ii), as well as part (vi), are new. Lemma 6.7 is given in exercise 1.2.15 in VW. Lemmas 6.11 and 6.12 correspond to exercises 1.2.3 and 1.2.4 of VW, respectively, after some modification. Also, lemmas 6.13 and 6.14 correspond to lemmas 1.2.5 and 1.2.6 of VW, respectively.

Much of the material on linear operators can be found in appendix A.1 of Bickel, Klaassen, Ritov and Wellner (1997), and in chapter 2 of Kress (1999). Lemma 6.16 is proposition 7, parts A and B, in appendix A.1 of Bickel, et al (1997). The presentation on functional differentiation is motivated by the first few pages in chapter 3.9 of VW, and the chain rule (lemma 6.19) is lemma 3.9.3 of VW.

7 Stochastic Convergence

In this chapter, we study concepts and theory useful in understanding the limiting behavior of stochastic processes. We begin with a general discussion of stochastic processes in metric spaces. The focus of this discussion is on measurable stochastic processes since most limits of empirical processes in statistical applications are measurable. We next discuss weak convergence both in general and in the specific case of bounded stochastic processes. One of the interesting aspects of the approach we take to weak convergence is that the processes studied need not be measurable except in the limit. This is useful in applications since many empirical processes in statistics are not measurable with respect to the uniform metric. The final section of this chapter considers other modes of convergence, such as in probability and outer almost surely, and their relationships to weak convergence.

7.1 Stochastic Processes in Metric Spaces

In this section, we introduce several important concepts about stochastic processes in metric spaces. Recall that for a stochastic process {X(t), t ∈ T}, X(t) is a measurable real random variable for each t ∈ T on a probability space (Ω, A, P). The sample paths of such a process typically reside in the metric space D = ℓ∞(T) with the uniform metric. Often, however, when X is viewed as a map from Ω to D, it is no longer Borel measurable. A classic example of this duality comes from Billingsley (1968, pages 152–153). The example hinges on the fact that there exists a set H ⊂ [0, 1] which is not a

Borel set. Define the stochastic process X(t) = 1{U ≤ t}, where t ∈ [0, 1] and U is uniformly distributed on [0, 1]. The probability space is (Ω, B, P), where Ω = [0, 1], B are the Borel sets on [0, 1], and P is the uniform probability measure on [0, 1]. A natural metric space for the sample paths of X is ℓ∞([0, 1]). Define the set A = ∪_{s∈H} Bs(1/2), where Bs(1/2) is the uniform open ball of radius 1/2 around the function t ↦ fs(t) ≡ 1{t ≤ s}. Since A is an open set in ℓ∞([0, 1]), and since the uniform distance between fs1 and fs2 is 1 whenever s1 ≠ s2, the set {ω ∈ Ω : X(ω) ∈ A} equals H. Since H is not a Borel set, X is not Borel measurable.

This lack of measurability is actually the usual state for most of the empirical processes we are interested in studying, especially since most of the time the uniform metric is the natural choice of metric. Many of the associated technical difficulties can be resolved through the use of outer measure and outer expectation as defined in the previous chapter, which we will utilize in our study of weak convergence.

In contrast, most of the limiting processes we will be studying are, in fact, Borel measurable. For this reason, a brief study of Borel measurable processes is valuable. The following lemma, for example, provides two ways of establishing equivalence between Borel probability measures. Recall from section 2.2.3 that BL1(D) is the set of all functions f : D → R bounded by 1 and with Lipschitz norm bounded by 1, i.e., with |f(x) − f(y)| ≤ d(x, y) for all x, y ∈ D. When the choice of metric space is clear by the context, we will sometimes use the abbreviated notation BL1, as was done in chapter 2. Define also a vector lattice F ⊂ Cb(D) to be a vector space for which f ∈ F implies f ∨ 0 ∈ F. We also say that a set F of real functions on D separates points of D if, for any x, y ∈ D with x ≠ y, there exists f ∈ F such that f(x) ≠ f(y). We are now ready for the lemma:

Lemma 7.1 Let L1 and L2 be Borel probability measures on a metric space D. The following are equivalent:

(i) L1 = L2.

(ii) ∫ f dL1 = ∫ f dL2 for all f ∈ Cb(D).

If L1 and L2 are also separable, then (i) and (ii) are both equivalent to

(iii) ∫ f dL1 = ∫ f dL2 for all f ∈ BL1.

Moreover, if L1 and L2 are also tight, then (i)–(iii) are all equivalent to

(iv) ∫ f dL1 = ∫ f dL2 for all f in a vector lattice F ⊂ Cb(D) that both contains the constant functions and separates points in D.

The proof is given in section 7.4. We say that two Borel random maps X and X′, with respective laws L and L′, are versions of each other if L = L′.

In addition to being Borel measurable, most of the limiting stochastic processes of interest are tight. A Borel probability measure L on a metric space D is tight if for every ε > 0, there exists a compact K ⊂ D so that L(K) ≥ 1 − ε. A Borel random map X : Ω → D is tight if its law L is tight. Tightness is equivalent to there being a σ-compact set that has probability 1 under L or X. L or X is separable if there is a measurable and separable set which has probability 1. L or X is Polish if there is a measurable Polish set having probability 1. Note that tightness, separability and Polishness are all topological properties and do not depend on the metric. Since both σ-compact and Polish sets are also separable, separability is the weakest of the three properties. Whenever we say X has any one of these three properties, we tacitly imply that X is also Borel measurable.

On a complete metric space, tightness, separability and Polishness are equivalent. This equivalence for Polishness and separability follows from the definitions. To see the remaining equivalence, assume L is separable. By completeness, there is a D0 ⊂ D having probability 1 which is both separable and closed. Fix any ε ∈ (0, 1). By separability, there exists a sequence {xk} ∈ D0 which is dense in D0. For every δ > 0, the union of the balls of radius δ centered on the {xk} covers D0. Hence for every integer j ≥ 1, there exists a finite collection of balls of radius 1/j whose union Dj has probability ≥ 1 − ε/2^j. Thus the closure of the intersection ∩_{j≥1}Dj is totally bounded and has probability ≥ 1 − ε. Since ε is arbitrary, L is tight.

For a stochastic process {X(t), t ∈ T}, where (T, ρ) is a separable, semimetric space, there is another meaning for separable. X is separable (as a stochastic process) if there exists a countable subset S ⊂ T and a null set N so that, for each ω ∉ N and t ∈ T, there exists a sequence {sm} ∈ S with ρ(sm, t) → 0 and |X(sm, ω) − X(t, ω)| → 0. It turns out that many of the empirical processes we will be studying are separable in this sense, even though they are not Borel measurable and therefore cannot satisfy the other meaning for separable. Throughout the remainder of the book, the distinction between these two definitions will either be explicitly stated or made clear by the context.

Most limiting processes X of interest will reside in ℓ∞(T), where the index set T is often a class of real functions F with domain equal to the sample space. When such limiting processes are tight, the following lemma demands that X reside in UC(T, ρ), where ρ is some semimetric making T totally bounded, with probability 1:

(i) X is tight.

(ii) There exists a semimetric ρ making T totally bounded and for which X ∈ UC(T,ρ) with probability 1.

Furthermore, if (ii) holds for any ρ, then it also holds for the semimetric ρ0(s, t) ≡ E arctan|X(s) − X(t)|.

The proof is given in section 7.4. A nice feature of tight processes in ℓ∞(T) is that the laws of such processes are completely defined by their finite-dimensional marginal distributions (X(t1),...,X(tk)), where t1,...,tk ∈ T and k ≥ 1 is an integer:

Lemma 7.3 Let X and Y be tight, Borel measurable stochastic processes in ℓ∞(T). Then the Borel laws of X and Y are equal if and only if all corresponding finite-dimensional marginal distributions are equal.

Proof. Consider the collection F ⊂ Cb(ℓ∞(T)) of all functions f : ℓ∞(T) → R of the form f(x) = g(x(t1),...,x(tk)), where g ∈ Cb(R^k) and k ≥ 1 is an integer. We leave it as an exercise to show that F is a vector lattice, an algebra, and separates points of ℓ∞(T). The desired result now follows from lemma 7.1.

While the semimetric ρ0 defined in lemma 7.2 is always applicable when X is tight, it is frequently not the most convenient semimetric to work with. The family of semimetrics ρp(s, t) ≡ (E|X(s) − X(t)|^p)^{1/(p∨1)}, for some choice of p ∈ (0, ∞), is sometimes more useful. There is an interesting link between ρp and other semimetrics for which lemma 7.2 holds. For a process X in ℓ∞(T) and a semimetric ρ on T, we say that X is uniformly ρ-continuous in pth mean if E|X(sn) − X(tn)|^p → 0 whenever ρ(sn, tn) → 0. The following lemma is a conclusion from lemma 1.5.9 of van der Vaart and Wellner (1996) (abbreviated VW hereafter), and we omit the proof:

Lemma 7.4 Let X be a tight Borel measurable random element in ℓ∞(T), and let ρ be any semimetric for which (ii) of lemma 7.2 holds. If X is ρ-continuous in pth mean for some p ∈ (0, ∞), then (ii) of lemma 7.2 also holds for the semimetric ρp.

Perhaps the most frequently occurring limiting process in ℓ∞(T) is a Gaussian process. A stochastic process {X(t), t ∈ T} is Gaussian if all finite-dimensional marginals (X(t1),...,X(tk)) are multivariate normal. If a Gaussian process X is tight, then by lemma 7.2, there is a semimetric ρ making T totally bounded and for which the sample paths t ↦ X(t) are uniformly ρ-continuous. An interesting feature of Gaussian processes is that this result implies that the map t ↦ X(t) is uniformly ρ-continuous in pth mean for all p ∈ (0, ∞). To see this, take p = 2, and note that |X(sn) − X(tn)| → 0 in probability if and only if E|X(sn) − X(tn)|² → 0. Thus whenever ρ(sn, tn) → 0, E|X(sn) − X(tn)|² → 0 and hence also E|X(sn) − X(tn)|^p → 0 for any p ∈ (0, ∞), since X(sn) − X(tn) is normally distributed for all n ≥ 1. Lemma 7.4 now implies that tightness of a Gaussian process is equivalent to T being totally bounded by ρp, with almost all sample paths of X being uniformly ρp-continuous, for all p ∈ (0, ∞).

For a general Banach space D, a Borel measurable random element X on D is Gaussian if and only if f(X) is Gaussian for every continuous, linear map f : D → R. When D = ℓ∞(T) for some set T, this definition appears to contradict the definition of Gaussianity given in the preceding paragraph, since now we are using all continuous linear functionals instead of just linear combinations of coordinate projections. These two definitions are not really reconcilable in general, and so some care must be taken in reading the literature to determine the appropriate context. However, when the process in question is tight, the two definitions are equivalent, as verified in the following proposition:

Proposition 7.5 Let X be a tight, Borel measurable map into ℓ∞(T). Then the following are equivalent:

(i) The vector (X(t1),...,X(tk)) is multivariate normal for every finite set {t1,...,tk} ⊂ T.

(ii) φ(X) is Gaussian for every continuous, linear map φ : ℓ∞(T) → R.

(iii) φ(X) is Gaussian for every continuous, linear map φ : ℓ∞(T) → D into any Banach space D.

Proof. The proof that (i) ⇒ (ii) is given in the proof of lemma 3.9.8 of VW, and we omit the details here. Now assume (ii), and fix any Banach space D and any continuous, linear map φ : ℓ∞(T) → D. Now for any continuous, linear map ψ : D → R, the composition map ψ ∘ φ : ℓ∞(T) → R is continuous and linear, and thus by (ii) we have that ψ(φ(X)) is Gaussian. Since ψ is arbitrary, we have by the definition of a Gaussian process on a Banach space that φ(X) is Gaussian. Since both D and φ were also arbitrary, conclusion (iii) follows. Finally, (iii) ⇒ (i) since multivariate coordinate projections are special examples of continuous, linear maps into Banach spaces.
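A short simulation can illustrate the preceding discussion; the grid size, seed, and lag choices below are illustrative assumptions. Standard Brownian motion, with cov(X(s), X(t)) = s ∧ t and hence ρ2(s, t) = |s − t|^{1/2}, is a tight Gaussian element of ℓ∞([0, 1]), and the empirical modulus of continuity of a simulated path shrinks as δ decreases, in line with lemma 7.2:

```python
# Illustrative sketch: a simulated Brownian motion path on [0,1] and its
# empirical modulus of continuity sup over |s - t| = delta of |X(s) - X(t)|.
import numpy as np

rng = np.random.default_rng(2)
m = 2 ** 12                                  # grid resolution (assumed)
dt = 1.0 / m
X = np.concatenate([[0.0],
                    np.cumsum(rng.standard_normal(m) * np.sqrt(dt))])

for k in [64, 16, 4]:                        # lags giving delta = k/m
    delta = k * dt
    modulus = np.max(np.abs(X[k:] - X[:-k])) # modulus at separation delta
    print(f"delta={delta:.5f}  modulus~{modulus:.3f}")
```

The shrinking moduli reflect the uniform ρ2-continuity of almost all sample paths, which is exactly the content of condition (ii) in lemma 7.2 for this process.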

7.2 Weak Convergence

We first discuss the general theory of weak convergence in metric spaces and then discuss results for the special metric space of uniformly bounded functions, ℓ∞(T), for an arbitrary index set T. This last space is where most—if not all—of the action occurs for statistical applications of empirical processes.

7.2.1 General Theory

The extremely important concept of weak convergence of sequences arises in many areas of statistics. To be as flexible as possible, we allow the probability spaces associated with the sequences to change with n. Let (Ωn, An, Pn) be a sequence of probability spaces and Xn : Ωn → D a sequence of maps. We say that Xn converges weakly to a Borel measurable X : Ω → D if

(7.1) E^*f(Xn) → Ef(X), for every f ∈ Cb(D).

If L is the law of X, (7.1) can be reexpressed as

E^*f(Xn) → ∫ f(x) dL(x), for every f ∈ Cb(D).

This weak convergence is denoted Xn ⇝ X or, equivalently, Xn ⇝ L. Weak convergence is equivalent to "convergence in distribution" and "convergence in law." By lemma 7.1, this definition of weak convergence ensures that the limiting distributions are unique. Note that the choice of probability spaces (Ωn, An, Pn) is important since these dictate the outer expectation used in the definition of weak convergence. In most of the settings discussed in this book, Ωn = Ω for all n ≥ 1. Some important exceptions to this rule will be discussed in chapter 11. Fortunately, even in those settings where Ωn does change with n, one can frequently readjust the probability spaces so that the sample spaces are all the same. It is also possible to generalize the concept of weak convergence of sequences to weak convergence of nets as done in VW, but we will restrict ourselves to sequences throughout this book.

The foregoing definition of weak convergence does not obviously appear to be related to convergence of probabilities, but this is in fact true for the probabilities of sets B ⊂ D which have boundaries δB satisfying L(δB) = 0. Here and elsewhere, we define the boundary δB of a set B in a topological space to be the closure of B minus the interior of B. Several interesting equivalent formulations of weak convergence on a metric space D are given in the following portmanteau theorem:

Theorem 7.6 (Portmanteau) The following are equivalent:

(i) Xn ⇝ L;

(ii) lim inf P_*(Xn ∈ G) ≥ L(G) for every open G;

(iii) lim sup P^*(Xn ∈ F) ≤ L(F) for every closed F;

(iv) lim inf E_*f(Xn) ≥ ∫ f(x) dL(x) for every lower semicontinuous f bounded below;

(v) lim sup E^*f(Xn) ≤ ∫ f(x) dL(x) for every upper semicontinuous f bounded above;

(vi) lim P^*(Xn ∈ B) = lim P_*(Xn ∈ B) = L(B) for every Borel B with L(δB) = 0;

(vii) lim inf E_*f(Xn) ≥ ∫ f(x) dL(x) for every bounded, Lipschitz continuous, nonnegative f.

Furthermore, if L is separable, then (i)–(vii) are also equivalent to

(viii) sup_{f∈BL1} |E^*f(Xn) − Ef(X)| → 0.

The proof is given in section 7.4. Depending on the setting, one or more of these alternative definitions will prove more useful than the others. For example, definition (vi) is probably the most intuitive from a statistical point of view, while definition (viii) is convenient for studying certain properties of the bootstrap. Another very useful result is the continuous mapping theorem:

Theorem 7.7 (Continuous mapping) Let g : D → E be continuous at all points in D0 ⊂ D, where D and E are metric spaces. Then if Xn ⇝ X in D, with P_*(X ∈ D0) = 1, then g(Xn) ⇝ g(X).

The proof is given in section 7.4. As mentioned in chapter 2, a common application of this theorem is in the construction of confidence bands based on the supremum distance. A potential issue is that there may sometimes be more than one choice of metric space D to work with in a given weak convergence setting. For example, if we are studying weak convergence of the usual empirical process √n(F̂n(t) − F(t)) based on data in [0, 1], we could let D be either ℓ∞([0, 1]) or D[0, 1]. The following lemma tells us that the choice of metric space is generally not a problem. Recall from chapter 6 that for a topological space (X, O), the relative topology on A ⊂ X consists of the open sets {A ∩ B : B ∈ O}.

Lemma 7.8 Let the metric spaces D0 ⊂ D have the same metric, and assume X and Xn reside in D0. Then Xn ⇝ X in D0 if and only if Xn ⇝ X in D.

Proof. Since any set B0 ⊂ D0 is open if and only if it is of the form B ∩ D0 for some open B in D, the result follows from part (ii) of the portmanteau theorem.

Recall from chapter 2 that a sequence Xn is asymptotically measurable if and only if

(7.2) E^*f(Xn) − E_*f(Xn) → 0, for all f ∈ Cb(D).

An important, related concept is that of asymptotic tightness. A sequence Xn is asymptotically tight if for every ε > 0, there is a compact K so that lim inf P_*(Xn ∈ K^δ) ≥ 1 − ε, for every δ > 0, where for a set A ⊂ D, A^δ = {x ∈ D : d(x, A) < δ} is the "δ-enlargement" around A. The following lemma tells us that when Xn is asymptotically tight, we can determine asymptotic measurability by verifying (7.2) only for a subset of functions in Cb(D). For the purposes of this lemma, an algebra F ⊂ Cb(D) is a vector space for which f, g ∈ F implies fg ∈ F.

Lemma 7.9 Assume the sequence Xn is asymptotically tight and that (7.2) holds for all f in a subalgebra F ⊂ Cb(D) that separates points of D. Then Xn is asymptotically measurable.

We omit the proof of this lemma, but it can be found in chapter 1.3 of VW. When D is a Polish space and Xn and X are both Borel measurable, tightness of Xn for each n ≥ 1 plus asymptotic tightness is equivalent to the concept of uniform tightness used in the classical theory of weak convergence (see p. 37, Billingsley, 1968). More precisely, a Borel measurable sequence {Xn} is uniformly tight if for every ε > 0, there is a compact K so that P(Xn ∈ K) ≥ 1 − ε for all n ≥ 1. The following is a more formal statement of the equivalence we are describing:

Lemma 7.10 Assume D is a Polish space and that the maps Xn and X are Borel measurable. Then {Xn} is uniformly tight if and only if Xn is tight for each n ≥ 1 and {Xn} is asymptotically tight.

The proof is given in section 7.4. Because Xn will typically not be measurable in many of the applications of interest to us, uniform tightness will not prove as useful a concept as asymptotic tightness. Two good properties of asymptotic tightness are that it does not depend on the metric chosen—only on the topology—and that weak convergence often implies asymptotic tightness. The first of these two properties is verified in the following lemma:

Lemma 7.11 Xn is asymptotically tight if and only if for every ε > 0 there exists a compact K so that lim inf P_*(Xn ∈ G) ≥ 1 − ε for every open G ⊃ K.

Proof. Assume first that Xn is asymptotically tight. Fix ε > 0, and let the compact set K satisfy lim inf P_*(Xn ∈ K^δ) ≥ 1 − ε for every δ > 0. If G ⊃ K is open, then there exists a δ0 > 0 so that G ⊃ K^{δ0}. If this were not true, then there would exist a sequence {xn} ∉ G so that d(xn, K) → 0. This implies the existence of a sequence {yn} ∈ K so that d(xn, yn) → 0. Since K is compact and the complement of G is closed, there would then be a subsequence n' and a point y ∉ G so that d(y_{n'}, y) → 0; but y is also a limit of points in K, so y ∈ K ⊂ G, which is impossible. Hence lim inf P_*(Xn ∈ G) ≥ lim inf P_*(Xn ∈ K^{δ0}) ≥ 1 − ε. Now assume that Xn satisfies the alternative definition. Fix ε > 0, and let the compact set K satisfy lim inf P_*(Xn ∈ G) ≥ 1 − ε for every open G ⊃ K. For every δ > 0, K^δ is an open set containing K. Thus lim inf P_*(Xn ∈ K^δ) ≥ 1 − ε for every δ > 0.

The second good property of asymptotic tightness is given in the second part of the following lemma, the first part of which gives the necessity of asymptotic measurability for weakly convergent sequences:

Lemma 7.12 Assume Xn ⇝ X. Then

(i) Xn is asymptotically measurable.

(ii) Xn is asymptotically tight if and only if X is tight.

Proof. For part (i), fix f ∈ Cb(D). Note that weak convergence implies both E^*f(Xn) → Ef(X) and E_*f(Xn) = −E^*[−f(Xn)] → −E[−f(X)] = Ef(X), and the desired result follows since f is arbitrary. For part (ii), fix ε > 0. Assume X is tight, and choose a compact K so that P(X ∈ K) ≥ 1 − ε. By part (ii) of the portmanteau theorem, lim inf P_*(Xn ∈ K^δ) ≥ P(X ∈ K^δ) ≥ 1 − ε for every δ > 0. Hence Xn is asymptotically tight. Now assume that Xn is asymptotically tight, fix ε > 0, and choose a compact K so that lim inf P_*(Xn ∈ K^δ) ≥ 1 − ε for every δ > 0. By part (iii) of the portmanteau theorem, P(X ∈ K^δ) ≥ lim sup P^*(Xn ∈ K^δ) ≥ lim inf P_*(Xn ∈ K^δ) ≥ 1 − ε. By letting δ ↓ 0, we obtain that X is tight.

Prohorov's theorem (given below) tells us that asymptotic measurability and asymptotic tightness together almost give us weak convergence. This "almost-weak-convergence" is relative compactness. A sequence Xn is relatively compact if every subsequence X_{n'} has a further subsequence X_{n''} which converges weakly to a tight Borel law. Weak convergence happens when all of the limiting Borel laws are the same. Note that when all of the limiting laws assign probability one to a fixed Polish space, there is a converse of Prohorov's theorem, that relative compactness of Xn implies asymptotic tightness of Xn (and hence also uniform tightness). Details of this result are discussed in chapter 1.12 of VW, but we do not pursue it further here.

Theorem 7.13 (Prohorov's theorem) If the sequence Xn is asymptotically measurable and asymptotically tight, then it has a subsequence X_{n'} that converges weakly to a tight Borel law.

The proof, which we omit, is given in chapter 1.3 of VW. Note that the conclusion of Prohorov's theorem does not state that Xn is relatively compact, and thus it appears as if we have broken our earlier promise. However, if Xn is asymptotically measurable and asymptotically tight, then every subsequence X_{n'} is also asymptotically measurable and asymptotically tight. Thus repeated application of Prohorov's theorem does indeed imply relative compactness of Xn. A natural question to ask at this juncture is: under what circumstances does asymptotic measurability and/or tightness of the marginal sequences Xn and Yn imply asymptotic measurability and/or tightness of the joint sequence (Xn, Yn)? This question is answered in the following lemma:

Lemma 7.14 Let Xn :Ωn → D and Yn :Ωn → E be sequences of maps. Then the following are true:

(i) Xn and Yn are both asymptotically tight if and only if the same is true for the joint sequence (Xn,Yn):Ωn → D × E.

(ii) Asymptotically tight sequences Xn and Yn are both asymptotically measurable if and only if (Xn, Yn) : Ωn → D × E is asymptotically measurable.

Proof. Let d and e be the metrics for D and E, respectively, and let D × E be endowed with the product topology. Now note that a set in D × E of the form K1 × K2 is compact if and only if K1 and K2 are both compact. Let πjK, j = 1, 2, be the projections of K onto D and E, respectively. To be precise, the projection π1K consists of all x ∈ D such that (x, y) ∈ K for some y ∈ E, and π2K is analogously defined for E. We leave it as an exercise to show that π1K and π2K are both compact. It is easy to see that K is contained in π1K × π2K. Using the product space metric ρ((x1, y1), (x2, y2)) = d(x1, x2) ∨ e(y1, y2) (one of several metrics generating the product topology), we now have for K1 ⊂ D and K2 ⊂ E that (K1 × K2)^δ = K1^δ × K2^δ. Part (i) now follows from the definition of asymptotic tightness.

Let π1 : D × E → D be the projection onto the first coordinate, and note that π1 is continuous. Thus for any f ∈ Cb(D), f ◦ π1 ∈ Cb(D × E). Hence joint asymptotic measurability of (Xn, Yn) implies asymptotic measurability of Xn by the definition of asymptotic measurability. The same argument holds for Yn. The difficult part of proving part (ii) is the implication that asymptotic tightness plus asymptotic measurability of both marginal sequences yields asymptotic measurability of the joint sequence. We omit this part of the proof, but it can be found in chapter 1.4 of VW.

A very useful consequence of lemma 7.14 is Slutsky's theorem. Note that the proof of Slutsky's theorem (given below) also utilizes both Prohorov's theorem and the continuous mapping theorem.

Theorem 7.15 (Slutsky's theorem) Suppose Xn ⇝ X and Yn ⇝ c, where X is separable and c is a fixed constant. Then the following are true:

(i) (Xn, Yn) ⇝ (X, c).

(ii) If Xn and Yn are in the same metric space, then Xn + Yn ⇝ X + c.

(iii) Assume in addition that the Yn are scalars. Then whenever c ∈ R, YnXn ⇝ cX. Also, whenever c ≠ 0, Xn/Yn ⇝ X/c.

Proof. By completing the metric space for X, we can without loss of generality assume that X is tight. Thus by lemma 7.14, (Xn, Yn) is asymptotically tight and asymptotically measurable. Thus by Prohorov's theorem, all subsequences of (Xn, Yn) have further subsequences which converge to tight limits. Since these limit points have marginals X and c, and since the marginals in this case completely determine the joint distribution, we have that all limiting distributions are uniquely determined as (X, c). This proves part (i). Parts (ii) and (iii) now follow from the continuous mapping theorem.
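As a quick numerical aside (an added illustration, not part of the original development), theorem 7.15 can be seen in action by studentizing a sample mean: the normalized mean plays the role of Xn ⇝ N(0, σ²), the sample standard deviation plays the role of Yn ⇝ σ, and part (iii) gives Xn/Yn ⇝ N(0, 1). The exponential distribution and all constants below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def slutsky_demo(n, reps=20000):
    """Slutsky sketch: X_n = sqrt(n)*(mean - mu) ~> N(0, sigma^2) and
    Y_n = sample std -> sigma in probability, so X_n / Y_n ~> N(0, 1)."""
    draws = rng.exponential(scale=2.0, size=(reps, n))  # mu = sigma = 2
    x_n = np.sqrt(n) * (draws.mean(axis=1) - 2.0)       # converges weakly
    y_n = draws.std(axis=1, ddof=1)                     # converges in probability
    return x_n / y_n                                    # approximately N(0, 1)

for n in (10, 100, 1000):
    t = slutsky_demo(n)
    print(n, round(float(t.mean()), 3), round(float(t.std()), 3))  # std -> 1
```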

7.2.2 Spaces of Bounded Functions

Now we consider the setting where the Xn are stochastic processes with index set T. The natural metric space for weak convergence in this setting is ℓ∞(T). A nice feature of this setting is the fact that asymptotic measurability of Xn follows from asymptotic measurability of Xn(t) for each t ∈ T:

Lemma 7.16 Let the sequence of maps Xn in ℓ∞(T) be asymptotically tight. Then Xn is asymptotically measurable if and only if Xn(t) is asymptotically measurable for each t ∈ T.

Proof. Let ft : ℓ∞(T) → R be the marginal projection at t ∈ T, i.e., ft(x) = x(t) for any x ∈ ℓ∞(T). Since each ft is continuous, asymptotic measurability of Xn implies asymptotic measurability of Xn(t) for each t ∈ T. Now assume that Xn(t) is asymptotically measurable for each t ∈ T. Then lemma 7.14 implies asymptotic measurability for all finite-dimensional marginals (Xn(t1), ..., Xn(tk)). Consequently, f(Xn) is asymptotically measurable for all f ∈ F, for the subset F of Cb(D) defined in the proof of lemma 7.3 given above in section 7.2, where D = ℓ∞(T). Since F is an algebra that separates points in ℓ∞(T), asymptotic measurability of Xn follows from lemma 7.9.

We now verify that convergence of finite-dimensional distributions plus asymptotic tightness is equivalent to weak convergence in ℓ∞(T):

Theorem 7.17 The sequence Xn converges to a tight limit in ℓ∞(T) if and only if Xn is asymptotically tight and all finite-dimensional marginals converge weakly to limits. Moreover, if Xn is asymptotically tight and all of its finite-dimensional marginals (Xn(t1), ..., Xn(tk)) converge weakly to the marginals (X(t1), ..., X(tk)) of a stochastic process X, then there is a version of X such that Xn ⇝ X and X resides in UC(T, ρ) for some semimetric ρ making T totally bounded.

Proof. The result that "asymptotic tightness plus convergence of finite-dimensional distributions implies weak convergence" follows from Prohorov's theorem and lemmas 7.9 and 7.1, using the vector lattice and subalgebra F ⊂ Cb(D) defined above in the proof of lemma 7.3. The implication in the opposite direction follows easily from lemma 7.12 and the continuous mapping theorem. Now assume Xn is asymptotically tight and that all finite-dimensional distributions of Xn converge to those of a stochastic process X. By asymptotic tightness of Xn, the probability that a version of X lies in some σ-compact K ⊂ ℓ∞(T) is one. By theorem 6.2, K ⊂ UC(T, ρ) for some semimetric ρ making T totally bounded.

Recall theorem 2.1 and the condition (2.6) from chapter 2. When condition (2.6) holds for every ε > 0, then we say that the sequence Xn is asymptotically uniformly ρ-equicontinuous in probability. We are now in a position to prove theorem 2.1. Note that the statement of this theorem is slightly informal: conditions (i) and (ii) of the theorem actually imply that Xn ⇝ X in ℓ∞(T) for some tight version X of X. Recall that ‖x‖_T ≡ sup_{t∈T} |x(t)|.

Proof of theorem 2.1. First assume Xn ⇝ X in ℓ∞(T), where X is tight. Convergence of all finite-dimensional distributions follows from the continuous mapping theorem. Now by theorem 6.2, P(X ∈ UC(T, ρ)) = 1 for some semimetric ρ making T totally bounded. Hence for every η > 0, there exists some compact K ⊂ UC(T, ρ) so that

(7.3) lim inf_{n→∞} P_*(Xn ∈ K^δ) ≥ 1 − η, for all δ > 0.

Fix ε, η > 0, and let the compact set K satisfy (7.3). By theorem 6.2, there exists a δ0 > 0 so that sup_{x∈K} sup_{s,t: ρ(s,t)<δ0} |x(s) − x(t)| ≤ ε/3. Now

P^*( sup_{s,t∈T: ρ(s,t)<δ0} |Xn(s) − Xn(t)| > ε )
  ≤ P^*( sup_{s,t∈T: ρ(s,t)<δ0} |Xn(s) − Xn(t)| > ε, Xn ∈ K^{ε/3} ) + P^*(Xn ∉ K^{ε/3})
  ≡ En

satisfies lim sup_{n→∞} En ≤ η, since if x ∈ K^{ε/3} then sup_{s,t∈T: ρ(s,t)<δ0} |x(s) − x(t)| < ε. Thus Xn is asymptotically uniformly ρ-equicontinuous in probability since ε and η were arbitrary. Now assume that conditions (i) and (ii) of the theorem hold. Lemma 7.18 below, the proof of which is given in section 7.4, yields that Xn is asymptotically tight. Thus the desired weak convergence of Xn follows from theorem 7.17 above.

Lemma 7.18 Assume conditions (i) and (ii) of theorem 2.1 hold. Then Xn is asymptotically tight.

The proof of theorem 2.1 verifies that whenever Xn ⇝ X and X is tight, any semimetric ρ defining a σ-compact set UC(T, ρ) such that P(X ∈ UC(T, ρ)) = 1 will also result in Xn being asymptotically uniformly ρ-equicontinuous in probability. What is not clear at this point is the converse: that any semimetric ρ* which enables uniform asymptotic equicontinuity of Xn will also define a σ-compact set UC(T, ρ*) wherein X resides with probability 1. The following theorem shows that, in fact, any semimetric which works for one of these implications will work for the other:

Theorem 7.19 Assume Xn ⇝ X in ℓ∞(T), and let ρ be a semimetric making (T, ρ) totally bounded. Then the following are equivalent:

(i) Xn is asymptotically uniformly ρ-equicontinuous in probability.

(ii) P(X ∈ UC(T, ρ)) = 1.

Proof. If we assume (ii), then (i) will follow by arguments given in the proof of theorem 2.1 above. Now assume (i). For x ∈ ℓ∞(T), define Mδ(x) ≡ sup_{s,t∈T: ρ(s,t)<δ} |x(s) − x(t)|. Note that if we restrict δ to (0, 1), then x → M_{(·)}(x), as a map from ℓ∞(T) to ℓ∞((0, 1)), is continuous, since |Mδ(x) − Mδ(y)| ≤ 2‖x − y‖_T for all δ ∈ (0, 1). Hence M_{(·)}(Xn) ⇝ M_{(·)}(X) in ℓ∞((0, 1)). Condition (i) now implies that there exists a positive sequence δn ↓ 0 so that P^*(M_{δn}(Xn) > ε) → 0 for every ε > 0. Hence M_{δn}(X) ⇝ 0. This implies (ii) since X is tight by theorem 2.1.

An interesting consequence of theorems 2.1 and 7.19, in conjunction with lemma 7.4, happens when Xn ⇝ X in ℓ∞(T) and X is a tight Gaussian process. Recall from section 7.1 the semimetric ρp(s, t) ≡ (E|X(s) − X(t)|^p)^{1/(p∨1)}, for any p ∈ (0, ∞). Then for any p ∈ (0, ∞), (T, ρp) is totally bounded, the sample paths of X are ρp-continuous, and, furthermore, Xn is asymptotically uniformly ρp-equicontinuous in probability. While any value of p ∈ (0, ∞) will work, the choice p = 2 (the "standard deviation" metric) is often the most convenient to work with.

We now point out an equivalent condition for Xn to be asymptotically uniformly ρ-equicontinuous in probability. This new condition, which is expressed in the following lemma, is sometimes easier to verify in certain settings (one of which occurs in the next chapter):

Lemma 7.20 Let Xn be a sequence of stochastic processes indexed by T. Then the following are equivalent:

(i) There exists a semimetric ρ making T totally bounded and for which Xn is asymptotically uniformly ρ-equicontinuous in probability.

(ii) For every ε, η > 0, there exists a finite partition T = ∪_{i=1}^k Ti such that

(7.4) lim sup_{n→∞} P^*( max_{1≤i≤k} sup_{s,t∈Ti} |Xn(s) − Xn(t)| > ε ) < η.

The proof is given in section 7.4 below.

7.3 Other Modes of Convergence

Recall the definitions of convergence in probability and outer almost surely, for arbitrary maps Xn : Ω → D, as defined in chapter 2. We now introduce two additional modes of convergence which can be useful in some settings. Xn converges almost uniformly to X if, for every ε > 0, there exists a measurable set A such that P(A) ≥ 1 − ε and d(Xn, X) → 0 uniformly on A. Xn converges almost surely to X if P_*(lim_{n→∞} d(Xn, X) = 0) = 1. Note that an important distinction between almost sure and outer almost sure convergence is that, in the latter mode, there must exist a measurable majorant of d(Xn, X) which goes to zero. This distinction is quite important because it can be shown that almost sure convergence does not in general imply convergence in probability when d(Xn, X) is not measurable. For this reason, we do not use the almost sure convergence mode in this book except rarely. One of those rare times is in exercise 7.5.7; another is in proposition 7.22 below, which will be used in chapter 10. The following lemma characterizes the relationships among the three remaining modes:

Lemma 7.21 Let Xn, X : Ω → D be maps with X Borel measurable. Then

(i) Xn →^{as*} X implies Xn →^P X.

(ii) Xn →^P X if and only if every subsequence X_{n'} has a further subsequence X_{n''} such that X_{n''} →^{as*} X.

(iii) Xn →^{as*} X if and only if Xn converges almost uniformly to X if and only if sup_{m≥n} d(Xm, X) →^P 0.

The proof is given in section 7.4. Since almost uniform convergence and outer almost sure convergence are equivalent for sequences, we will not use the almost uniform mode very much. The following proposition gives a connection between almost sure convergence and convergence in probability. We need this proposition for a continuous mapping result for bootstrapped processes presented in chapter 10:

Proposition 7.22 Let Xn, Yn : Ω → D be maps with Yn measurable. Suppose every subsequence n' has a further subsequence n'' such that X_{n''} → 0 almost surely. Suppose also that d(Xn, Yn) →^P 0. Then Xn →^P 0.

Proof. For every subsequence n' there exists a further subsequence n'' such that both X_{n''} → 0 and d(X_{n''}, Y_{n''})^* → 0 almost surely, for some versions d(X_{n''}, Y_{n''})^* of the measurable majorants. Since d(Yn, 0) ≤ d(Xn, 0) + d(Xn, Yn), we have that Y_{n''} → 0 almost surely. But this implies Y_{n''} →^{as*} 0 since the Yn are measurable. Since the subsequence n' was arbitrary, we now have that Yn →^P 0. Thus Xn →^P 0 since d(Xn, 0) ≤ d(Yn, 0) + d(Xn, Yn).

The next lemma describes several important relationships between weak convergence and convergence in probability. Before presenting it, we need to extend the definition of convergence in probability, in the setting where the limit is a constant, to allow the probability spaces involved to change with n, as is already permitted for weak convergence. We denote this modified convergence Xn →^P c, and distinguish it from the previous form of convergence in probability only by context.

Lemma 7.23 Let Xn, Yn : Ωn → D be maps, X : Ω → D be Borel measurable, and c ∈ D be a constant. Then

(i) If Xn ⇝ X and d(Xn, Yn) →^P 0, then Yn ⇝ X.

(ii) Xn →^P X implies Xn ⇝ X.

(iii) Xn →^P c if and only if Xn ⇝ c.

Proof. We first prove (i). Let F ⊂ D be closed, and fix ε > 0. Then lim sup_{n→∞} P^*(Yn ∈ F) = lim sup_{n→∞} P^*(Yn ∈ F, d(Xn, Yn) ≤ ε) ≤ lim sup_{n→∞} P^*(Xn ∈ F^ε) ≤ P(X ∈ F^ε). The result follows by letting ε ↓ 0. Now assume Xn →^P X. Since X ⇝ X and d(X, Xn) →^P 0, Xn ⇝ X by (i), and thus (ii) follows. We now prove (iii). Xn →^P c implies Xn ⇝ c by (ii). Now assume Xn ⇝ c, and fix ε > 0. Note that P^*(d(Xn, c) ≥ ε) = P^*(Xn ∉ B(c, ε)), where B(c, ε) is the open ε-ball around c ∈ D. By the portmanteau theorem, lim sup_{n→∞} P^*(Xn ∉ B(c, ε)) ≤ P(c ∉ B(c, ε)) = 0. Thus Xn →^P c since ε is arbitrary, and (iii) follows.

We now present a generalized continuous mapping theorem that allows for sequences of maps gn which converge to g in a fairly general sense. In the exercises below, we consider an instance of this where one is interested in maximizing a stochastic process {Xn(t), t ∈ T} over an "approximation" Tn of a subset T0 ⊂ T. As a specific motivation, suppose T is high dimensional. The computational burden of computing the supremum of Xn(t) over T may be reduced by choosing a finite mesh Tn which closely approximates T.

Theorem 7.24 (Extended continuous mapping). Let Dn ⊂ D and gn : Dn → E satisfy the following: if xn → x with xn ∈ Dn for all n ≥ 1 and x ∈ D0, then gn(xn) → g(x), where D0 ⊂ D and g : D0 → E. Let Xn be maps taking values in Dn, and let X be Borel measurable and separable. Then

(i) Xn ⇝ X implies gn(Xn) ⇝ g(X).

(ii) Xn →^P X implies gn(Xn) →^P g(X).

(iii) Xn →^{as*} X implies gn(Xn) →^{as*} g(X).

The proof can be found in chapter 1.11 of VW, and we omit it here. It is easy to see that if g : D → E is continuous at all points in D0, and if we set gn = g and Dn = D for all n ≥ 1, then the standard continuous mapping theorem (theorem 7.7), specialized to the setting where X is separable, is a corollary of part (i) of the above theorem. The following theorem gives another kind of continuous mapping result for sequences which converge in probability and outer almost surely. When X is separable, the conclusions of this theorem are a simple corollary of theorem 7.24.

Theorem 7.25 Let g : D → E be continuous at all points in D0 ⊂ D, and let X be Borel measurable with P_*(X ∈ D0) = 1. Then

(i) Xn →^P X implies g(Xn) →^P g(X).

(ii) Xn →^{as*} X implies g(Xn) →^{as*} g(X).

Proof. Assume Xn →^P X, and fix ε > 0. Define Bk to be all x ∈ D such that the 1/k-ball around x contains points y and z with e(g(y), g(z)) > ε. Part of the proof of theorem 7.7 in section 7.4 verifies that Bk is open. It is clear that Bk decreases as k increases. Furthermore, P(X ∈ Bk) ↓ 0, since every point in ∩_{k=1}^∞ Bk is a point of discontinuity of g. Now the outer probability that e(g(Xn), g(X)) > ε is bounded above by the outer probability that either X ∈ Bk or d(Xn, X) ≥ 1/k. But this last outer probability converges to P^*(X ∈ Bk) since d(Xn, X) →^P 0. Part (i) now follows by letting k → ∞ and noting that ε was arbitrary. Assume that Xn →^{as*} X. Note that a minor modification of the proof of part (i) verifies that sup_{m≥n} d(Xm, X) →^P 0 implies sup_{m≥n} e(g(Xm), g(X)) →^P 0. Now part (iii) of lemma 7.21 yields that Xn →^{as*} X implies g(Xn) →^{as*} g(X).

We now present a useful outer almost sure representation result for weak convergence. Such representations allow the conversion of certain weak convergence problems into problems about convergence of fixed sequences. We give an illustration of this approach in the proof of proposition 7.27 below.

Theorem 7.26 Let Xn : Ωn → D be a sequence of maps, and let X∞ be Borel measurable and separable. If Xn ⇝ X∞, then there exists a probability space (Ω̃, Ã, P̃) and maps X̃n : Ω̃ → D with

(i) X̃n →^{as*} X̃∞;

(ii) E^*f(X̃n) = E^*f(Xn), for every bounded f : D → R and all 1 ≤ n ≤ ∞.

Moreover, X̃n can be chosen to be equal to Xn ◦ φn, for all 1 ≤ n ≤ ∞, where the φn : Ω̃ → Ωn are measurable and perfect maps and Pn = P̃ ◦ φn^{−1}.

The proof can be found in chapter 1.10 of VW, and we omit it here. Recall the definition of perfect maps from chapter 6. In the setting of the above theorem, if the X̃n are constructed from the perfect maps φn, then f(X̃n)^* = [f(Xn)]^* ◦ φn for all bounded f : D → R. Thus the equivalence between X̃n and Xn can be made much stronger than simply equivalence in law.

The following proposition can be useful in studying weak convergence of certain statistics which can be expressed as stochastic integrals. For example, the Wilcoxon statistic can be expressed in this way. The proof of the proposition provides the illustration of theorem 7.26 promised above.

Proposition 7.27 Let Xn, Gn ∈ D[a, b] be stochastic processes with Xn ⇝ X and Gn →^P G in D[a, b], where X is bounded with continuous sample paths, G is fixed, and Gn and G have total variation bounded by K < ∞. Then ∫_a^{(·)} Xn(s)dGn(s) ⇝ ∫_a^{(·)} X(s)dG(s) in D[a, b].

Proof. First, Slutsky's theorem and lemma 7.23 establish that (Xn, Gn) ⇝ (X, G). Next, theorem 7.26 tells us that there exists a new probability space and processes X̃n, X̃, G̃n and G̃ which have the same outer integrals for bounded functions as Xn, X, Gn and G, respectively, but which also satisfy (X̃n, G̃n) →^{as*} (X̃, G̃). For each integer m ≥ 1, define tj = a + (b − a)j/m, j = 0, ..., m; let

Mm ≡ max_{1≤j≤m} sup_{s,t∈(t_{j−1}, t_j]} |X̃(s) − X̃(t)|;

and define X̃m ∈ D[a, b] such that X̃m(a) = X̃(a) and X̃m(t) ≡ Σ_{j=1}^m 1{t_{j−1} < t ≤ t_j} X̃(t_j) for t ∈ (a, b]. Then, for any t ∈ [a, b],

| ∫_a^t X̃n(s)dG̃n(s) − ∫_a^t X̃(s)dG̃(s) |
  ≤ ∫_a^b |X̃n(s) − X̃(s)| |dG̃n(s)| + ∫_a^b |X̃m(s) − X̃(s)| |dG̃n(s)| + | ∫_a^t X̃m(s)[dG̃n(s) − dG̃(s)] |
  ≤ K‖X̃n − X̃‖_{[a,b]} + K Mm + | Σ_{j=1}^m X̃(t_j) ∫_{(t_{j−1},t_j]∩(a,t]} [dG̃n(s) − dG̃(s)] |
  ≤ K‖X̃n − X̃‖^*_{[a,b]} + K Mm + m‖X̃‖_{[a,b]} × ‖G̃n − G̃‖^*_{[a,b]}
  ≡ En(m).

Note that En(m) is measurable and that, since X̃ has continuous sample paths, Mm → 0 almost surely as m → ∞. Define Dn to be the infimum of En(m) over all integers m ≥ 1. Then Dn → 0 almost surely, and since Dn is measurable, we have that ∫_a^{(·)} X̃n(s)dG̃n(s) →^{as*} ∫_a^{(·)} X̃(s)dG̃(s). Choose any f ∈ Cb(D[a, b]), and note that the map (x, y) → f( ∫_a^{(·)} x(s)dy(s) ), for x, y ∈ D[a, b] with the total variation of y bounded, is bounded. Thus

E^*f( ∫_a^{(·)} Xn(s)dGn(s) ) = E^*f( ∫_a^{(·)} X̃n(s)dG̃n(s) )
  → Ef( ∫_a^{(·)} X̃(s)dG̃(s) )
  = Ef( ∫_a^{(·)} X(s)dG(s) ).

Since this convergence holds for all f ∈ Cb(D[a, b]), the desired result now follows.

We give one more result before closing this chapter. The result applies to certain weak convergence settings involving questions that are easier to answer for measurable maps. The following lemma shows that a nonmeasurable, weakly convergent sequence Xn is usually quite close to a measurable sequence Yn:

Lemma 7.28 Let Xn : Ωn → D be a sequence of maps. If Xn ⇝ X, where X is Borel measurable and separable, then there exists a Borel measurable sequence Yn : Ωn → D with d(Xn, Yn) →^P 0.

The proof can be found in chapter 1.10 of VW, and we omit it here.
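To connect proposition 7.27 to practice, here is a small numerical sketch (an added illustration, not from the original text) of the Wilcoxon-type statistic ∫ F̂n dĜn, with F̂n and Ĝn the empirical distribution functions of two independent samples. Under the illustrative assumption that both samples are uniform on [0, 1], ∫ F dG = P(X ≤ Y) = 1/2, and the simulated integrals settle near that value.

```python
import numpy as np

rng = np.random.default_rng(1)

def wilcoxon_integral(n):
    """Monte Carlo check that the Wilcoxon-type statistic
    int F-hat_n dG-hat_n approaches int F dG = P(X <= Y) = 1/2
    for two independent Uniform(0,1) samples."""
    x = rng.uniform(size=n)  # sample defining F-hat
    y = rng.uniform(size=n)  # sample defining G-hat
    # int F-hat_n dG-hat_n = (1/n) * sum_j F-hat_n(y_j)
    fhat_at_y = np.searchsorted(np.sort(x), y, side="right") / n
    return fhat_at_y.mean()

for n in (10, 100, 10000):
    print(n, round(wilcoxon_integral(n), 4))  # tends to 0.5
```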

7.4 Proofs

Proof of lemma 7.1. Clearly (i) implies (ii). Now assume (ii). For every open G ⊂ D, define the sequence of functions fm(x) = [m d(x, D − G)] ∧ 1, for integers m ≥ 1, and note that each fm is bounded and Lipschitz continuous and fm ↑ 1{G} as m → ∞. By monotone convergence, L1(G) = L2(G). Since this is true for every open G ⊂ D, including G = D, the collection of Borel sets B for which L1(B) = L2(B) is a σ-field and is at least as large as the Borel σ-field. Hence (ii) implies (i). The equivalence of (i) and (iii) under separability follows from theorem 1.12.2 of VW, and we omit the details here. The fact that (i) implies (iv) is obvious. Now assume L1 and L2 are tight and that (iv) holds. Fix ε > 0, and choose a compact K ⊂ D such that L1(K) ∧ L2(K) ≥ 1 − ε. According to a version of the Stone-Weierstrass theorem given in Jameson (1974, p. 263), a vector lattice F ⊂ Cb(K) that includes the constants and separates points of K is uniformly dense in Cb(K). Choose a g ∈ Cb(D) for which 0 ≤ g ≤ 1, and select an f ∈ F such that sup_{x∈K} |g(x) − f(x)| ≤ ε. Now we have

| ∫ g dL1 − ∫ g dL2 | ≤ | ∫_K g dL1 − ∫_K g dL2 | + 2ε ≤ | ∫_K (f ∧ 1)^+ dL1 − ∫_K (f ∧ 1)^+ dL2 | + 4ε = 4ε.

The last equality follows since (f ∧ 1)^+ ∈ F. Thus ∫ g dL1 = ∫ g dL2 since ε is arbitrary. By adding and subtracting scalars, we can verify that the same result holds for all g ∈ Cb(D). Hence (i) holds.

Proof of lemma 7.2. The equivalence of (i) and (ii) is an immediate consequence of theorem 6.2. Now assume (ii) holds. Then |X| is bounded almost surely. Hence for any pair of sequences sn, tn ∈ T such that ρ0(sn, tn) → 0, X(sn) − X(tn) →^P 0. Thus X ∈ UC(T, ρ0) with probability 1. It remains to show that (T, ρ0) is totally bounded. Let the pair of sequences sn, tn ∈ T satisfy ρ(sn, tn) → 0. Then X(sn) − X(tn) →^P 0 and thus ρ0(sn, tn) → 0. This means that, since (T, ρ) is totally bounded, we have for every ε > 0 that there exists a finite Tε ⊂ T such that sup_{t∈T} inf_{s∈Tε} ρ0(s, t) < ε. Thus (T, ρ0) is also totally bounded, and the desired result follows.

Proof of theorem 7.6. Assume (i), and note that (i) implies (vii) trivially. Now assume (vii), and fix an open G ⊂ D. As in the proof of lemma 7.1 above, there exists a sequence of nonnegative, Lipschitz continuous functions fm with 0 ≤ fm ↑ 1{G}. Now for each integer m ≥ 1, lim inf P_*(Xn ∈ G) ≥ lim inf E_*fm(Xn) = Efm(X). Taking the limit as m → ∞ yields (ii). Thus (vii) ⇒ (ii). The equivalence of (ii) and (iii) follows by taking complements. Assume (ii) and let f be lower semicontinuous with f ≥ 0. Define the sequence of functions fm = Σ_{i=1}^{m²} (1/m) 1{Gi}, where Gi = {x : f(x) > i/m}, i = 1, ..., m². Thus fm "rounds" f down to i/m if f(x) ∈ (i/m, (i + 1)/m] for any i = 0, ..., m² − 1, and fm = m when f(x) > m. Hence 0 ≤ fm ≤ f ∧ m and |fm − f|(x) ≤ 1/m whenever f(x) ≤ m. Fix m. Note that each Gi is open by the definition of lower semicontinuity. Thus

lim inf_{n→∞} E_*f(Xn) ≥ lim inf_{n→∞} E_*fm(Xn)
  ≥ Σ_{i=1}^{m²} (1/m) lim inf_{n→∞} P_*(Xn ∈ Gi)
  ≥ Σ_{i=1}^{m²} (1/m) P(X ∈ Gi)
  = Efm(X).

Thus (ii) implies (iv) after letting m → ∞ and adding then subtracting a constant as needed to compensate for the lower bound of f. The equivalence of (iv) and (v) follows by replacing f with −f. Assume (v) (and thus also (iv)). Since a continuous function is both upper and lower semicontinuous, we have for any f ∈ Cb(D) that Ef(X) ≥ lim sup E^*f(Xn) ≥ lim inf E_*f(Xn) ≥ Ef(X). Hence (v) implies (i).

Assume (ii) (and hence also (iii)). For any Borel set B ⊂ D, L(B°) ≤ lim inf_{n→∞} P_*(Xn ∈ B°) ≤ lim sup_{n→∞} P^*(Xn ∈ B̄) ≤ L(B̄); however, the foregoing inequalities all become equalities when L(δB) = 0. Thus (ii) implies (vi). Assume (vi), and let F be closed. For each ε > 0 define F^ε = {x : d(x, F) < ε}. Since the sets δF^ε, ε > 0, are disjoint, L(δF^ε) > 0 for at most countably many ε. Hence we can choose a sequence εm ↓ 0 so that L(δF^{εm}) = 0 for each integer m ≥ 1. Note that for fixed m, lim sup_{n→∞} P^*(Xn ∈ F) ≤ lim sup_{n→∞} P^*(Xn ∈ F^{εm}) = L(F^{εm}). By letting m → ∞, we obtain that (vi) implies (iii) (and hence also (ii)). Thus conditions (i)–(vii) are all equivalent. The equivalence of (i)–(vii) to (viii) when L is separable follows from theorem 1.12.2 of VW, and we omit the details.

Proof of theorem 7.7. The set of all points at which g is not continuous can be expressed as Dg ≡ ∪_{m=1}^∞ ∩_{k=1}^∞ G_k^m, where G_k^m consists of all x ∈ D so that e(g(y), g(z)) > 1/m for some y, z ∈ B_{1/k}(x), and where e is the metric for E. Note that the complement of G_k^m, (G_k^m)^c, consists of all x for which e(g(y), g(z)) ≤ 1/m for all y, z ∈ B_{1/k}(x). Now if xn → x, then for any y, z ∈ B_{1/k}(x), we have that y, z ∈ B_{1/k}(xn) for all n large enough. Hence (G_k^m)^c is closed and thus G_k^m is open. This means that Dg is a Borel set. Let F ⊂ E be closed, and let {xn} ∈ g^{−1}(F) be a sequence for which xn → x. If x is a continuity point of g, then x ∈ g^{−1}(F). Otherwise x ∈ Dg. Hence the closure of g^{−1}(F) satisfies cl[g^{−1}(F)] ⊂ g^{−1}(F) ∪ Dg. Since g is continuous on the range of X, there is a version of g(X) that is Borel measurable. By the portmanteau theorem, lim sup P^*(g(Xn) ∈ F) ≤ lim sup P^*(Xn ∈ cl[g^{−1}(F)]) ≤ P(X ∈ cl[g^{−1}(F)]). Since Dg has probability zero under the law of X, P(X ∈ cl[g^{−1}(F)]) = P(g(X) ∈ F). Reapplying the portmanteau theorem, we obtain the desired result.

Proof of lemma 7.10. It is easy to see that uniform tightness implies asymptotic tightness. To verify equivalence going the other direction, assume that Xn is tight for each n ≥ 1 and that the sequence is asymptotically tight. Fix ε > 0, and choose a compact K0 for which lim inf P(Xn ∈ K0^δ) ≥ 1 − ε for all δ > 0. For each integer m ≥ 1, choose an nm < ∞ so that P(Xn ∈ K0^{1/m}) ≥ 1 − 2ε for all n ≥ nm. For each integer n ∈ (nm, n_{m+1}], choose a compact K̃n so that P(Xn ∈ K̃n) ≥ 1 − ε/2 and an ηn ∈ (0, 1/m) so that P(Xn ∈ K0^{1/m} − K0^{ηn}) < ε/2. Let Kn = (K̃n ∪ K0) ∩ cl(K0^{ηn}), and note that K0 ⊂ Kn ⊂ K0^{1/m} and that Kn is compact. We leave it as an exercise to show that K ≡ ∪_{n=1}^∞ Kn is also compact. Now P(Xn ∈ Kn) ≥ 1 − 3ε for all n ≥ 1, and thus P(Xn ∈ K) ≥ 1 − 3ε for all n ≥ 1. Uniform tightness follows since ε is arbitrary.

Proof of lemma 7.18. Fix ζ > 0. The conditions imply that P^*(‖Xn‖_T > M) < ζ for some M < ∞. Let εm be a positive sequence converging down to zero and let ηm ≡ 2^{−m}ζ. By condition (ii), there exists a positive sequence δm ↓ 0 so that

lim sup_{n→∞} P^*( sup_{s,t∈T: ρ(s,t)<δm} |Xn(s) − Xn(t)| > εm ) < ηm.

Now fix m. By the total boundedness of T, there exists a finite collection of disjoint sets T1, ..., Tk, each of ρ-diameter less than δm, so that T = ∪_{i=1}^k Ti and so that

lim sup_{n→∞} P^*( max_{1≤i≤k} sup_{s,t∈Ti} |Xn(s) − Xn(t)| > εm ) < ηm.

Let z1, ..., zp be the set of all functions in ℓ∞(T) which are constant on each Ti and which take only the values ±iεm, for i = 0, ..., K, where K is the largest integer ≤ M/εm. Now let Km be the union of the closed balls of radius εm around the zi, i = 1, ..., p. This construction ensures that if ‖x‖_T ≤ M and max_{1≤i≤k} sup_{s,t∈Ti} |x(s) − x(t)| ≤ εm, then x ∈ Km. This construction can be repeated for each m ≥ 1.

Let K = ∩_{m=1}^∞ Km, and note that K is totally bounded and closed. Total boundedness follows since each Km is a union of finitely many εm-balls which cover K and εm ↓ 0. We leave the proof that K is closed as an exercise. Now we show that for every δ > 0, there is an m < ∞ so that K^δ ⊃ ∩_{i=1}^m Ki. If this were not true, then there would be a sequence {zm} ∉ K^δ with zm ∈ ∩_{i=1}^m Ki for every m ≥ 1. This sequence has a subsequence {z_{m1(k)}, k ≥ 1} contained in one of the closed balls making up K1. This subsequence has a further subsequence {z_{m2(k)}, k ≥ 1} contained in one of the closed balls making up K2. We can continue with this process to generate, for each integer j ≥ 1, a subsequence {z_{mj(k)}, k ≥ 1} contained in the intersection ∩_{i=1}^j Bi, where each Bi is one of the closed balls making up Ki. Define a new subsequence z̃k = z_{mk(k)}, and note that z̃k is Cauchy with limit in K since K is closed. However, this contradicts d(zm, K) ≥ δ for all m. Hence the complement of K^δ is contained in the complement of ∩_{i=1}^m Ki for some m < ∞. Thus lim sup_{n→∞} P^*(Xn ∉ K^δ) ≤ lim sup_{n→∞} P^*(Xn ∉ ∩_{i=1}^m Ki) ≤ ζ + Σ_{i=1}^m ηi ≤ 2ζ. Since this result holds for all δ > 0, we now have that lim inf_{n→∞} P_*(Xn ∈ K^δ) ≥ 1 − 2ζ for all δ > 0. Asymptotic tightness follows since ζ is arbitrary.

Proof of lemma 7.20. The fact that (i) implies (ii) we leave as an exercise. Assume (ii). Then there exists a sequence of finite partitions 𝒯1, 𝒯2, ... such that

lim sup_{n→∞} P^*( sup_{U∈𝒯k} sup_{s,t∈U} |Xn(s) − Xn(t)| > 2^{−k} ) < 2^{−k},

for all integers k ≥ 1. Here, each partition 𝒯k is a collection of disjoint sets U1, ..., U_{mk} with T = ∪_{i=1}^{mk} Ui. Without loss of generality, we can insist that the 𝒯k are nested in the sense that any U ∈ 𝒯k is a union of sets in 𝒯_{k+1}. Such a nested sequence of partitions is easy to construct from any other sequence of partitions 𝒯k^* by letting 𝒯k consist of all nontrivial intersections of all sets in 𝒯1^* up through and including 𝒯k^*. Let 𝒯0 denote the partition consisting of the single set T.

For any s, t ∈ T, define K(s, t) ≡ sup{k : s, t ∈ U for some U ∈ 𝒯k} and ρ(s, t) ≡ 2^{−K(s,t)}. Also, for any δ > 0, let J(δ) ≡ inf{k : 2^{−k} < δ}. It is not hard to verify that for any s, t ∈ T and δ > 0, ρ(s, t) < δ if and only if s, t ∈ U for some U ∈ 𝒯_{J(δ)}. Thus

sup_{s,t∈T: ρ(s,t)<δ} |Xn(s) − Xn(t)| = sup_{U∈𝒯_{J(δ)}} sup_{s,t∈U} |Xn(s) − Xn(t)|,

for all 0 < δ ≤ 1. Since J(δ) → ∞ as δ ↓ 0, we now have that Xn is asymptotically uniformly ρ-equicontinuous in probability, as long as ρ is a semimetric. The only difficulty here is to verify that the triangle inequality holds for ρ, which we leave as an exercise. It is easy to see that T is totally bounded with respect to ρ, and (i) follows.

Proof of lemma 7.21. We first prove (iii). Assume that Xn →^{as*} X, and define A_n^k ≡ {sup_{m≥n} d(Xm, X)^* > 1/k}. For each integer k ≥ 1, we have P(A_n^k) ↓ 0 as n → ∞. Now fix ε > 0, and note that for each k ≥ 1 we can choose an nk so that P(A_{nk}^k) ≤ ε/2^k. Let A = Ω − ∪_{k=1}^∞ A_{nk}^k, and observe that, by this construction, P(A) ≥ 1 − ε and d(Xn, X) ≤ 1/k, for all n ≥ nk and all ω ∈ A. Thus Xn converges to X almost uniformly since ε is arbitrary.

Assume now that Xn converges to X almost uniformly. Fix ε > 0, and let A be measurable with P(A) ≥ 1 − ε and d(Xn, X) → 0 uniformly over ω ∈ A. Fix η > 0, and note that η ≥ (d(Xn, X)1{A})^* for all sufficiently large n, since η is measurable and satisfies η ≥ d(Xn, X)1{A} for sufficiently large n. Now let S, T : Ω → [0, ∞) be maps with S measurable and T^* bounded. Then for any c > 0, [(S + c)T]^* ≤ (S + c)T^*, and (S + c)T^* ≤ [(S + c)T]^* since T^* ≤ [(S + c)T]^*/(S + c). Hence [(S + c)T]^* = (S + c)T^*. By letting c ↓ 0, we obtain that (ST)^* = ST^* for measurable S. Hence d(Xn, X)^* 1{A} = (d(Xn, X)1{A})^* ≤ η for all n large enough, and thus d(Xn, X)^* → 0 for almost all ω ∈ A. Since ε is arbitrary, Xn →^{as*} X.

Now assume that Xn →^{as*} X. This clearly implies that sup_{m≥n} d(Xm, X) →^P 0. Next, assume that sup_{m≥n} d(Xm, X) →^P 0, and fix ε > 0. Generate a subsequence nk by finding, for each integer k ≥ 1, an integer nk ≥ n_{k−1} ≥ 1 which satisfies P^*(sup_{m≥nk} d(Xm, X) > 1/k) ≤ ε/2^k. Call the set inside this outer probability statement Ak, and define A = Ω − ∪_{k=1}^∞ Ak^*, where the Ak^* are measurable covers of the Ak. Now P(A) ≥ 1 − ε, and for each ω ∈ A and all m ≥ nk, d(Xm, X) ≤ 1/k for all k ≥ 1. Hence sup_{ω∈A} d(Xn, X)(ω) → 0 as n → ∞. Thus Xn converges almost uniformly to X, since ε is arbitrary, and therefore Xn →^{as*} X. Thus we have proven (iii).

We now prove (i). It is easy to see that Xn converging to X almost uniformly implies Xn →^P X. Thus (i) follows from (iii). We next prove (ii). Assume Xn →^P X. Construct a subsequence 1 ≤ n1 < n2 < ··· such that P^*(d(X_{nj}, X) > 1/j) < 2^{−j} for all integers j ≥ 1. Then

P( d(X_{nj}, X)^* > 1/j, for infinitely many j ) = 0

by the Borel-Cantelli lemma. Hence X_{nj} →^{as*} X as a sequence in j. This now implies that every subsequence has an outer almost surely convergent further subsequence. Assume now that every subsequence has an outer almost surely convergent further subsequence. By (i), the almost surely convergent subsequences also converge in probability. Hence Xn →^P X.

7.5 Exercises

7.5.1. Show that F defined in the proof of lemma 7.3 is a vector lattice, an algebra, and separates points in D.

7.5.2. Show that the set K defined in the proof of lemma 7.10 (given in section 7.4) is compact.

7.5.3. In the setting of the proof of lemma 7.14, show that π1K and π2K are both compact whenever K ⊂ D × E is compact.

7.5.4. In the proof of lemma 7.18 (given in section 7.4), show that the set K = ∩_{m=1}^∞ Km is closed.

7.5.5. Suppose that Tn and T0 are subsets of a semimetric space (T,ρ) such that Tn → T0 in the sense that

(i) Every t ∈ T0 is the limit of a sequence tn ∈ Tn;

(ii) For every closed S ⊂ T − T0, S ∩ Tn = ∅ for all n large enough.

Suppose that Xn and X are stochastic processes indexed by T for which Xn ⇝ X in ℓ∞(T) and X is Borel measurable with P(X ∈ UC(T, ρ)) = 1, where (T, ρ) is not necessarily totally bounded. Show that sup_{t∈Tn} Xn(t) ⇝ sup_{t∈T0} X(t). Hint: show first that for any xn → x, where {xn} ∈ ℓ∞(T) and x ∈ UC(T, ρ), we have lim_{n→∞} sup_{t∈Tn} xn(t) = sup_{t∈T0} x(t).

7.5.6. Complete the proof of lemma 7.20:

1. Show that (i) implies (ii).

2. Show that for any s, t ∈ T and δ > 0, ρ(s, t) < δ if and only if s, t ∈ U for some U ∈ 𝒯_{J(δ)}, where ρ(s, t) ≡ 2^{−K(s,t)} and K is as defined in the proof.

3. Verify that the triangle inequality holds for ρ.

7.5.7. Let Xn and X be maps into R with X Borel measurable. Show the following:

(i) Xn →^{as*} X if and only if both Xn^* and X_{n*} ≡ (Xn)_* converge almost surely to X.

(ii) Xn ⇝ X if and only if Xn^* ⇝ X and X_{n*} ⇝ X.

Hints: For (i), first show |Xn − X|^* = |Xn^* − X| ∨ |X_{n*} − X|. For (ii), assume Xn ⇝ X and apply the extended almost sure representation theorem to find a new probability space and a perfect sequence of maps φn such that X̃n = Xn ◦ φn →^{as*} X̃. By (i), X̃n^* → X̃ almost surely, and thus X̃n^* ⇝ X̃. Since φn is perfect, X̃n^* = Xn^* ◦ φn; and thus Ef(X̃n^*) = Ef(Xn^*) for every

measurable f. Hence Xn^* ⇝ X. Now show X_{n*} ⇝ X. For the converse, use the facts that P(Xn^* ≤ x) ≤ P^*(Xn ≤ x) ≤ P(X_{n*} ≤ x) and that distributions for real random variables are completely determined by their cumulative distribution functions.

7.5.8. Using the ideas in the proof of proposition 7.27, prove lemma 4.2.

7.6 Notes

Many of the ideas and results of this chapter come from chapters 1.3–1.5 and 1.9–1.12 of VW specialized to sequences (rather than nets). Lemma 7.1 is a composite of lemma 1.3.12 and theorem 1.12.2 of VW, while lemma 7.4 is lemma 1.5.3 of VW. Components (i)–(vii) of the portmanteau theorem are a specialization to sequences of the portmanteau theorem in chapter 1.3 of VW. Theorem 7.7, lemmas 7.8, 7.9, and 7.12, and theorem 7.13 correspond to VW theorems 1.3.6 and 1.3.10, lemmas 1.3.13 and 1.3.8, and theorem 1.3.9, respectively. Lemma 7.14 is a composite of lemmas 1.4.3 and 1.4.4 of VW. Lemma 7.16 and theorem 7.17 are essentially VW lemma 1.5.2 and theorem 1.5.4. Lemmas 7.21 and 7.23 and theorem 7.24 are specializations to sequences of VW lemmas 1.9.2 and 1.10.2 and theorem 1.11.1. Theorem 7.26 is a composition of VW theorem 1.10.4 and addendum 1.10.5 applied to sequences, while lemma 7.28 is essentially proposition 1.10.12 of VW. Proposition 7.5 on Gaussian processes is essentially a variation of lemma 3.9.8 of VW. Proposition 7.27 is a modification of lemma A.3 of Bilias, Gu and Ying (1997), who use this result to obtain weak convergence of the proportional hazards regression parameter in a continuously monitored sequential clinical trial.

8 Empirical Process Methods

Recall from section 2.2.2 the concepts of bracketing and uniform entropy along with the corresponding Glivenko-Cantelli and Donsker theorems. We now briefly review the set-up. Given a probability space (Ω, A, P), the data of interest consist of n independent copies X1, ..., Xn of a map X : Ω → 𝒳, where 𝒳 is the sample space. We are interested in studying the limiting behavior of empirical processes indexed by classes F of functions f : 𝒳 → R which are measurable in the sense that each composite map f(X) : Ω → 𝒳 → R is A-measurable. With Pn denoting the empirical measure based on the data X1, ..., Xn, the empirical process of interest is Pnf viewed as a stochastic process indexed by f ∈ F.

A class F is P-Glivenko-Cantelli if ‖Pn − P‖_F →^{as*} 0, where for any u ∈ ℓ∞(T), ‖u‖_T ≡ sup_{t∈T} |u(t)|. We say F is weak P-Glivenko-Cantelli if the outer almost sure convergence is replaced by convergence in probability. Sometimes, for clarification, we will call a P-Glivenko-Cantelli class a strong P-Glivenko-Cantelli class to remind ourselves that the convergence is outer almost sure. A class F is P-Donsker if Gn ⇝ G in ℓ∞(F), where G is a tight Brownian bridge. Of course, the P prefix can be dropped if the context is clear. As mentioned in section 4.2.1, these ideas also apply directly to i.i.d. samples of stochastic processes. In this setting, X has the form {X(t), t ∈ T}, where X(t) is measurable for each t ∈ T, and 𝒳 is typically ℓ∞(T). We say that X is P-Glivenko-Cantelli if sup_{t∈T} |(Pn − P)X(t)| →^{as*} 0 and that X is P-Donsker if GnX converges weakly to a tight Gaussian process. This is exactly equivalent to considering whether the class FT ≡ {ft, t ∈ T}, where ft(x) ≡ x(t) for all x ∈ 𝒳 and t ∈ T, is Glivenko-Cantelli or Donsker. The limiting Brownian bridge G for the class FT has covariance P[(Gfs)(Gft)] = P[(X(s) − PX(s))(X(t) − PX(t))]. This duality between the function class and stochastic process viewpoints will prove useful from time to time, and which approach we take will depend on the setting.

The main goal of this chapter is to present the empirical process techniques needed to prove the Glivenko-Cantelli and Donsker theorems of section 2.2.2. The approach we take is guided by chapters 2.2–2.5 of VW, although we leave out many technical details. The most difficult step in these proofs is going from point-wise convergence to uniform convergence. Maximal inequalities are very useful tools for accomplishing this step. For uniform entropy results, an additional tool, symmetrization, is also needed. To use symmetrization, several measurability conditions are required on the class of functions F beyond the usual requirement that each f ∈ F be measurable. In the sections presenting the Glivenko-Cantelli and Donsker theorem proofs, results for bracketing entropy are presented before the uniform entropy results.
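Both properties are easy to observe numerically. The following small Python sketch (an added illustration under the assumption P = Uniform(0, 1)) uses the class F = {1{x ≤ t} : t ∈ [0, 1]}, for which ‖Pn − P‖_F is the Kolmogorov-Smirnov distance: it tends to zero (Glivenko-Cantelli), while √n ‖Pn − P‖_F remains stochastically bounded (Donsker).

```python
import numpy as np

rng = np.random.default_rng(2)

def sup_norm_deviation(n):
    """Kolmogorov-Smirnov distance ||P_n - P||_F for F = {1{x <= t}}
    and P = Uniform(0,1); the supremum over t is attained at order
    statistics, so it can be computed exactly on the sample."""
    x = np.sort(rng.uniform(size=n))
    grid = np.arange(1, n + 1) / n
    return np.max(np.maximum(grid - x, x - (grid - 1.0 / n)))

for n in (100, 10000, 1000000):
    d = sup_norm_deviation(n)
    # Glivenko-Cantelli: d -> 0; Donsker: sqrt(n) * d = O_P(1)
    print(n, round(d, 5), round(np.sqrt(n) * d, 3))
```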

8.1 Maximal Inequalities

We first present several results about Orlicz norms which are useful for controlling the size of the maximum of a finite collection of random variables. Several maximal inequalities for stochastic processes will be given next. These inequalities include a general maximal inequality for separable stochastic processes and a maximal inequality for sub-Gaussian processes. The results will utilize Orlicz norms combined with a method known as chaining. The results of this section will play a key role in the proofs of the Donsker theorems developed later on in this chapter.

8.1.1 Orlicz Norms and Maxima

A very useful class of norms for random variables used in maximal inequalities are the Orlicz norms. For a nondecreasing, nonzero convex function ψ : [0, ∞] → [0, ∞], with ψ(0) = 0, the Orlicz norm ‖X‖_ψ of a real random variable X is defined as

‖X‖_ψ ≡ inf{ c > 0 : Eψ(|X|/c) ≤ 1 },

where the norm takes the value ∞ if no finite c exists for which Eψ(|X|/c) ≤ 1. Exercise 8.5.1 below verifies that ‖·‖_ψ is indeed a norm on the space of random variables with ‖X‖_ψ < ∞. The Orlicz norm ‖·‖_ψ is also called the ψ-norm, in order to specify the choice of ψ. When ψ is of the form x → x^p, where p ≥ 1, the corresponding Orlicz norm is just the Lp-norm

‖X‖_p ≡ (E|X|^p)^{1/p}. For maximal inequalities, Orlicz norms defined with ψp(x) ≡ e^{x^p} − 1, for p ≥ 1, are of greater interest because of their sensitivity to behavior in the tails. Clearly, since x^p ≤ ψp(x), we have ‖X‖_p ≤ ‖X‖_{ψp}. Also, by the series representation for exponentiation, ‖X‖_p ≤ p! ‖X‖_{ψ1} for all p ≥ 1. The following result shows how Orlicz norms based on ψp relate fairly precisely to the tail probabilities:

Lemma 8.1 For a real random variable X and any p ∈ [1, ∞), the following are equivalent:

(i) ‖X‖_{ψp} < ∞.

(ii) There exist constants 0 < C, K < ∞ such that

(8.1) P(|X| > x) ≤ K e^{−Cx^p}, for all x > 0.

Moreover, if either condition holds, then K = 2 and C = ‖X‖_{ψp}^{−p} satisfies (8.1), and, for any C, K ∈ (0, ∞) satisfying (8.1), ‖X‖_{ψp} ≤ ((1 + K)/C)^{1/p}.

Proof. Assume (i). Then P(|X| > x) equals

(8.2) P( ψp(|X|/‖X‖_{ψp}) ≥ ψp(x/‖X‖_{ψp}) ) ≤ 1 ∧ [ 1/ψp(x/‖X‖_{ψp}) ],

by Markov's inequality. By exercise 8.5.2 below, 1 ∧ (e^u − 1)^{−1} ≤ 2e^{−u} for all u > 0. Thus the right-hand side of (8.2) is bounded above by (8.1) with K = 2 and C = ‖X‖_{ψp}^{−p}. Hence (ii) and the first half of the last sentence of the lemma follow. Now assume (ii). For any c ∈ (0, C), Fubini's theorem gives us

E( e^{c|X|^p} − 1 ) = E ∫_0^{|X|^p} c e^{cs} ds = ∫_0^∞ P(|X| > s^{1/p}) c e^{cs} ds ≤ ∫_0^∞ K e^{−Cs} c e^{cs} ds = Kc/(C − c),

where the inequality follows from the assumption. Now Kc/(C − c) ≤ 1 whenever c ≤ C/(1 + K) or, in other words, whenever c^{−1/p} ≥ ((1 + K)/C)^{1/p}. This implies (i) and the rest of the lemma.
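As a concrete aside (not part of the original text), a ψp-norm can be approximated numerically: c → Eψp(|X|/c) is decreasing in c, so ‖X‖_{ψp} can be located by bisection, with the expectation estimated by Monte Carlo. For a standard normal X and p = 2, the exact value is √(8/3) ≈ 1.633, which the sketch below recovers approximately; the sample size and bracket are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.abs(rng.standard_normal(200000))  # |X| for X ~ N(0, 1)

def mean_psi2(c):
    """Monte Carlo estimate of E psi_2(|X|/c), psi_2(u) = exp(u^2) - 1."""
    return np.mean(np.expm1((x / c) ** 2))

# Bisection for the smallest c with E psi_2(|X|/c) <= 1.
lo, hi = 0.5, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if mean_psi2(mid) <= 1.0 else (mid, hi)
print(round(hi, 4))  # approximately sqrt(8/3) ~ 1.633
```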

An important use for Orlicz norms is to control the behavior of maxima. This control is somewhat of an extension of the following simple result for Lp-norms: for any random variables X1, ..., Xm,

‖max_{1≤i≤m} Xi‖_p ≤ ( E max_{1≤i≤m} |Xi|^p )^{1/p} ≤ ( E Σ_{i=1}^m |Xi|^p )^{1/p} ≤ m^{1/p} max_{1≤i≤m} ‖Xi‖_p.

The following lemma shows that a similar result holds for certain Orlicz norms but with the m^{1/p} replaced with a constant times ψ^{−1}(m):

Lemma 8.2 Let ψ : [0, ∞) → [0, ∞) be convex, nondecreasing and nonzero, with ψ(0) = 0 and lim sup_{x,y→∞} ψ(x)ψ(y)/ψ(cxy) < ∞ for some constant c < ∞. Then, for any random variables X1, ..., Xm,

‖max_{1≤i≤m} Xi‖_ψ ≤ K ψ^{−1}(m) max_{1≤i≤m} ‖Xi‖_ψ,

where the constant K depends only on ψ.

Proof. We first make the stronger assumption that ψ(1) ≤ 1/2 and that ψ(x)ψ(y) ≤ ψ(cxy) for all x, y ≥ 1. Under this stronger assumption, we also have ψ(x/y) ≤ ψ(cx)/ψ(y) for all x ≥ y ≥ 1. Hence, for any y ≥ 1 and k > 0,

max_i ψ( |Xi|/(ky) ) ≤ max_i [ (ψ(c|Xi|/k)/ψ(y)) 1{|Xi|/(ky) ≥ 1} + ψ( |Xi|/(ky) ) 1{|Xi|/(ky) < 1} ]
  ≤ Σ_{i=1}^m ψ(c|Xi|/k)/ψ(y) + ψ(1).

In the summation, set k = c max_i ‖Xi‖_ψ and take expectations of both sides to obtain

E ψ( max_i |Xi| / (ky) ) ≤ m/ψ(y) + 1/2.

With y = ψ^{−1}(2m), the right-hand side is ≤ 1. Thus ‖max_i |Xi|‖_ψ ≤ c ψ^{−1}(2m) max_i ‖Xi‖_ψ. Since ψ is convex and ψ(0) = 0, x → ψ^{−1}(x) is concave and one-to-one for x > 0. Thus ψ^{−1}(2m) ≤ 2ψ^{−1}(m), and the result follows with K = 2c for the special ψ functions specified at the beginning of the proof.

By exercise 8.5.3 below, we have for any ψ satisfying the conditions of the lemma that there exist constants 0 < σ ≤ 1 and τ > 0 such that φ(x) ≡ σψ(τx) satisfies φ(1) ≤ 1/2 and φ(x)φ(y) ≤ φ(cxy) for all x, y ≥ 1. Furthermore, for this φ, φ^{−1}(u) ≤ ψ^{−1}(u)/(στ), for all u > 0, and, for any random variable X, ‖X‖_ψ ≤ ‖X‖_φ/(στ) ≤ ‖X‖_ψ/σ. Hence

στ ‖max_i Xi‖_ψ ≤ ‖max_i Xi‖_φ ≤ 2c φ^{−1}(m) max_i ‖Xi‖_φ ≤ (2c/σ) ψ^{−1}(m) max_i ‖Xi‖_ψ,

and the desired result follows with K = 2c/(σ²τ).

An important consequence of lemma 8.2 is that maxima of random variables with bounded ψ-norm grow at most at the rate ψ^{−1}(m). Based on exercise 8.5.4, ψp satisfies the conditions of lemma 8.2 with c = 1, for any p ∈ [1, ∞). The implication in this situation is that the growth of maxima is at most logarithmic, since ψp^{−1}(m) = (log(m + 1))^{1/p}; a small simulation of this logarithmic growth is given after lemma 8.3 below. These results will prove quite useful in the next section.

We now present an inequality for collections X1, ..., Xm of random variables which satisfy

(8.3) P( |Xi| > x ) ≤ 2 exp( −(1/2) x²/(b + ax) ), for all x > 0,

for i = 1, ..., m and some a, b ≥ 0. This setting will arise later in the development of a Donsker theorem based on bracketing entropy.

Lemma 8.3 Let X1, ..., Xm be random variables that satisfy the tail bound (8.3) for 1 ≤ i ≤ m and some a, b ≥ 0. Then

‖max_{1≤i≤m} |Xi|‖_{ψ1} ≤ K[ a log(1 + m) + √(b log(1 + m)) ],

where the constant K is universal, in the sense that it does not depend on a, b, or on the random variables.

Proof. Assume for now that a, b > 0. The condition implies for all x ≤ b/a the upper bound 2 exp(−x²/(4b)) for P(|Xi| > x), since in this case b + ax ≤ 2b. For all x > b/a, the condition implies an upper bound of 2 exp(−x/(4a)), since b/x + a ≤ 2a in this case. This implies that P(|Xi|1{|Xi| ≤ b/a} > x) ≤ 2 exp(−x²/(4b)) and P(|Xi|1{|Xi| > b/a} > x) ≤ 2 exp(−x/(4a)) for all x > 0. Hence, by lemma 8.1, the Orlicz norms ‖|Xi|1{|Xi| ≤ b/a}‖_{ψ2} and ‖|Xi|1{|Xi| > b/a}‖_{ψ1} are bounded by √(12b) and 12a, respectively. The result now follows by lemma 8.2 combined with the inequality

‖max_i |Xi|‖_{ψ1} ≤ ‖max_i [ |Xi|1{|Xi| > b/a} ]‖_{ψ1} + ‖max_i [ |Xi|1{|Xi| ≤ b/a} ]‖_{ψ2},

where the replacement of ψ1 with ψ2 in the last term follows since ψp norms increase in p.

Suppose now that a > 0 but b = 0. Then the tail bound (8.3) holds for all b > 0, and the result of the lemma is thus true for all b > 0. The desired result now follows by letting b ↓ 0. A similar argument will verify that the result holds when a = 0 and b > 0. Finally, the result is trivially true when a = b = 0 since, in this case, Xi = 0 almost surely for i = 1, ..., m.
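The at-most-logarithmic growth of maxima promised by lemma 8.2 (anticipated in the forward reference above) is easy to observe numerically. In the added sketch below, the Xi are standard normals, whose ψ2-norms are bounded, so max_{i≤m} |Xi| should grow no faster than a constant times ψ2^{−1}(m) = √(log(1 + m)); the sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Growth of the maximum of m standard normals (bounded psi_2-norms)
# versus psi_2^{-1}(m) = sqrt(log(1 + m)), as lemma 8.2 predicts.
for m in (10, 100, 10000, 1000000):
    z = np.abs(rng.standard_normal(m))
    ratio = z.max() / np.sqrt(np.log1p(m))
    print(m, round(float(z.max()), 3), round(float(ratio), 3))  # ratio stays bounded
```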

8.1.2 Maximal Inequalities for Processes

The goals of this section are to first establish a general maximal inequality for separable stochastic processes and then specialize the result to sub-Gaussian processes. A stochastic process {X(t), t ∈ T} is separable when

there exists a countable subset T* ⊂ T such that sup_{t∈T} inf_{s∈T*} |X(t) − X(s)| = 0 almost surely. For example, any cadlag process indexed by a closed interval in R is separable because the rationals form a countable dense subset of R. The need for separability of certain processes in the Glivenko-Cantelli and Donsker theorems is hidden in other conditions of the involved theorems, and direct verification of separability is seldom required in statistical applications. A stochastic process is sub-Gaussian when

(8.4) P( |X(t) − X(s)| > x ) ≤ 2 e^{−(1/2) x²/d²(s,t)}, for all s, t ∈ T, x > 0,

for a semimetric d on T. In this case, we say that X is sub-Gaussian with respect to d. An important example of a separable sub-Gaussian stochastic process, the Rademacher process, will be presented at the end of this section. These processes will be utilized later in this chapter in the development of a Donsker theorem based on uniform entropy. Another example of a sub-Gaussian process is Brownian motion on [0, 1], which can easily be shown to be sub-Gaussian with respect to d(s, t) = |s − t|^{1/2}. Because the sample paths are continuous, Brownian motion is also separable.

The conclusion of lemma 8.2 above is not immediately useful for maximizing X(t) over t ∈ T since a potentially infinite number of random variables is involved. However, a method called chaining, which involves linking up increasingly refined finite subsets of T and repeatedly applying lemma 8.2, does make such maximization possible in some settings. The technique depends on the metric entropy of the index set T based on the semimetric d(s, t) = ‖X(s) − X(t)‖_ψ.

For an arbitrary semimetric space (T, d), the covering number N(ε, T, d) is the minimal number of closed d-balls of radius ε required to cover T. The packing number D(ε, T, d) is the maximal number of points that can fit in T while maintaining a distance greater than ε between all points. When the choice of index set T is clear by context, the notation for covering and packing numbers will be abbreviated as N(ε, d) and D(ε, d), respectively. The associated entropy numbers are the respective logarithms of the covering and packing numbers. Taken together, these concepts define metric entropy. For a semimetric space (T, d) and each ε > 0,

N(ε, d) ≤ D(ε, d) ≤ N(ε/2, d).

To see this, note that there exists a subset Tε ⊂ T of cardinality D(ε, d) such that the minimum distance between distinct points in Tε is > ε. If we now place closed ε-balls around each point in Tε, we have a covering of T. If this were not true, there would exist a point t ∈ T which has distance > ε from all the points in Tε, but this would mean that D(ε, d) + 1 points can fit into T while still maintaining a separation > ε between all points. But this contradicts the maximality of D(ε, d). Thus N(ε, d) ≤ D(ε, d).
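As an aside, for finite index sets a packing can be built by exactly the greedy idea used in this argument: scan the points and keep each one that is more than ε from everything kept so far. The following added Python sketch (with an arbitrary random index set in the unit square as an assumption) illustrates the polynomial growth D(ε, d) ≲ (1/ε)^r discussed later in this section:

```python
import numpy as np

def packing_number_greedy(points, eps):
    """Greedy lower bound for the packing number D(eps, T, d): keep a
    point only if it is more than eps from every point kept so far."""
    kept = []
    for p in points:
        if all(np.linalg.norm(p - q) > eps for q in kept):
            kept.append(p)
    return len(kept)

rng = np.random.default_rng(5)
T = rng.uniform(size=(2000, 2))  # finite index set in the unit square
for eps in (0.5, 0.25, 0.125):
    print(eps, packing_number_greedy(T, eps))  # grows roughly like (1/eps)^2
```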

Now note that no ball of radius ≤ ε/2 can cover more than one point in Tε, and thus at least D(ε, d) closed ε/2-balls are needed to cover T. Hence D(ε, d) ≤ N(ε/2, d).

The foregoing discussion reveals that covering and packing numbers are essentially equivalent in behavior as ε ↓ 0. However, it turns out to be slightly more convenient for our purposes to focus on packing numbers in this section. Note that T is totally bounded if and only if D(ε, d) is finite for each ε > 0. The success of the following maximal inequality depends on how fast D(ε, d) increases as ε ↓ 0:

Theorem 8.4 (General maximal inequality) Let ψ satisfy the conditions of lemma 8.2, and let {X(t), t ∈ T} be a separable stochastic process with ‖X(s) − X(t)‖_ψ ≤ r d(s, t), for all s, t ∈ T, some semimetric d on T, and a constant r < ∞. Then for any η, δ > 0,

‖ sup_{s,t∈T: d(s,t)≤δ} |X(s) − X(t)| ‖_ψ ≤ K [ ∫_0^η ψ^{−1}(D(ε, d)) dε + δ ψ^{−1}(D²(η, d)) ],

for a constant K < ∞ which depends only on ψ and r. Moreover,

‖ sup_{s,t∈T} |X(s) − X(t)| ‖_ψ ≤ 2K ∫_0^{diam T} ψ^{−1}(D(ε, d)) dε,

where diam T ≡ sup_{s,t∈T} d(s, t) is the diameter of T.

Before we present the proof of this theorem, recall from the discussion following the proof of lemma 8.2 that ψp-norms, for any p ∈ [1, ∞), satisfy the conditions of this lemma. The case p = 2, which applies to sub-Gaussian processes, is of most interest to us and is explicitly evaluated in corollary 8.5 below. This corollary plays a key role in the proof of the Donsker theorem for uniform entropy (see theorem 8.19 in section 8.4).

Proof of theorem 8.4. Note that if the first integral were infinite, the inequalities would be trivially true. Hence we can, without loss of generality, assume that the packing numbers and associated integral are bounded. Construct a sequence of finite nested sets T0 ⊂ T1 ⊂ ··· ⊂ T such that for each Tj, d(s, t) > η2^{−j} for every distinct s, t ∈ Tj, and such that each Tj is "maximal" in the sense that no additional points can be added to Tj without violating the inequality. Note that by the definition of packing numbers, the number of points in Tj is bounded above by D(η2^{−j}, d).

Now we will do the chaining part of the proof. Begin by "linking" every point t_{j+1} ∈ T_{j+1} to one and only one tj ∈ Tj such that d(tj, t_{j+1}) ≤ η2^{−j}, for all points in T_{j+1}. Continue this process to link all points in Tj with points in T_{j−1}, and so on, to obtain for every t_{j+1} (∈ T_{j+1}) a chain t_{j+1}, tj, t_{j−1}, ..., t0 that connects to a point in T0. For any integer k ≥ 0 and arbitrary points s_{k+1}, t_{k+1} ∈ T_{k+1}, the difference in increments along their respective chains connecting to s0, t0 can be bounded as follows:

|{X(s_{k+1}) − X(t_{k+1})} − {X(s0) − X(t0)}|
  = | Σ_{j=0}^k {X(s_{j+1}) − X(sj)} − Σ_{j=0}^k {X(t_{j+1}) − X(tj)} |
  ≤ 2 Σ_{j=0}^k max |X(u) − X(v)|,

where for fixed j the maximum is taken over all links (u, v) from T_{j+1} to Tj. Hence the jth maximum is taken over at most the cardinality of T_{j+1} links, with each link having ‖X(u) − X(v)‖_ψ bounded by r d(u, v) ≤ rη2^{−j}. By lemma 8.2, we have for a constant K0 < ∞ depending only on ψ and r,

(8.5) ‖ max_{s,t∈T_{k+1}} |{X(s) − X(s0)} − {X(t) − X(t0)}| ‖_ψ
  ≤ K0 Σ_{j=0}^k ψ^{−1}(D(η2^{−j−1}, d)) η2^{−j}
  = 4K0 Σ_{j=0}^k ψ^{−1}(D(η2^{−j−1}, d)) η2^{−j−2}
  ≤ 4ηK0 ∫_0^1 ψ^{−1}(D(ηu, d)) du
  = 4K0 ∫_0^η ψ^{−1}(D(ε, d)) dε.

In this bound, s0 and t0 depend on s and t in that they are the endpoints of the chains starting at s and t, respectively.

The maximum of the increments |X(s_{k+1}) − X(t_{k+1})|, over all s_{k+1} and t_{k+1} in T_{k+1} with d(s_{k+1}, t_{k+1}) < δ, is bounded by the left-hand side of (8.5) plus the maximum of the discrepancies at the ends of the chains |X(s0) − X(t0)| for those points in T_{k+1} which are less than δ apart. For every such pair of endpoints s0, t0 of chains starting at two points in T_{k+1} within distance δ of each other, choose one and only one pair s_{k+1}, t_{k+1} in T_{k+1}, with d(s_{k+1}, t_{k+1}) < δ, whose chains end at s0, t0. By definition of T0, this results in at most D²(η, d) pairs. Now,

(8.6) |X(s0) − X(t0)| ≤ |{X(s0) − X(s_{k+1})} − {X(t0) − X(t_{k+1})}|

+ |X(s_{k+1}) − X(t_{k+1})|.

Take the maximum of (8.6) over all pairs of endpoints s0, t0. The maximum of the first term of the right-hand side of (8.6) is bounded by the left-hand side of (8.5). The maximum of the second term of the right-hand side of (8.6) is the maximum of D²(η, d) terms with ψ-norm bounded by rδ. By lemma 8.2, this maximum is bounded by some constant C times δψ^{−1}(D²(η, d)). Combining this with (8.5), we obtain

‖ max_{s,t∈T_{k+1}: d(s,t)<δ} |X(s) − X(t)| ‖_ψ ≤ 8K0 ∫_0^η ψ^{−1}(D(ε, d)) dε + Cδψ^{−1}(D²(η, d)).

By the fact that the right-hand side does not depend on k, we can replace T_{k+1} with T∞ = ∪_{j=0}^∞ Tj by the monotone convergence theorem. If we can verify that taking the supremum over T∞ is equivalent to taking the supremum over T, then the first conclusion of the theorem follows with K = (8K0) ∨ C.

Since X is separable, there exists a countable subset T* ⊂ T such that sup_{t∈T} inf_{s∈T*} |X(t) − X(s)| = 0 almost surely. Let Ω* denote the subset of the sample space of X for which this supremum is zero. Accordingly, P(Ω*) = 1. Now, for any point t and sequence {tn} in T, it is easy to see that d(t, tn) → 0 implies |X(t) − X(tn)| → 0 almost surely (see exercise 8.5.5 below). For each t ∈ T*, let Ωt be the subset of the sample space of X for which inf_{s∈T∞} |X(s) − X(t)| = 0. Since T∞ is a dense subset of the semimetric space (T, d), P(Ωt) = 1. Letting Ω̃ ≡ Ω* ∩ (∩_{t∈T*} Ωt), we now have P(Ω̃) = 1. This, combined with the fact that

sup inf |X(t) − X(s)|≤sup inf |X(t) − X(s)| t∈T s∈T∞ t∈T s∈T∗ +sup inf |X(s) − X(t)|, ∈ t∈T∗ s T∞ | − | implies that supt∈T infs∈T∞ X(t) X(s) = 0 almost surely. Thus taking the supremum over T is equivalent to taking the supremum over T∞. The second conclusion of the theorem follows from the previous result by setting δ = η =diamT and noting that, in this case, D(η, d)=1.Now we have

δψ−1(D2(η, d)) = ηψ−1(D(η, d)) η = ψ−1(D(η, d))d 0 η ≤ ψ−1(D( , d))d , 0 and the second conclusion follows. As a consequence of exercise 8.5.5 below, the conclusions of theorem 8.4 show that X has d-continuous sample paths almost surely whenever the in- η −1 tegral 0 ψ (D( , d))d is bounded for some η>0. It is also easy to verify ≤ that the maximum of the process of X is bounded, since supt∈T X(t) ψ | − | ∈ X(t0) ψ + sups,t∈T X(t) X(s) ψ, for any choice of t0 T .ThusX 132 8. Empirical Process Methods is tight and takes its values in UC(T,d) almost surely. These results will prove quite useful in later developments. An important application of theorem 8.4 is to sub-Gaussian processes: Corollary 8.5 Let {X(t),t ∈ T } be a separable sub-Gaussian process with respect to d. Then for all δ>0, δ E sup |X(s) − X(t)| ≤ K log D( , d)d , s,t∈T :d(s,t)≤δ 0 where K is a universal constant. Also, for any t0 ∈ T , diam T E sup |X(t)| ≤ E|X(t0)| + K log D( , d)d . t∈T 0

−1 Proof. Apply theorem 8.4 with√ ψ = ψ2 and η = δ. Because ψ2 (m)= −1 2 −1 log(1 + m), ψ2 (D (δ, d)) ≤ 2ψ (D(δ, d)). Hence the second term of the general maximal inequality can be replaced by √ √ δ 2δψ−1(D(δ, d)) ≤ 2 ψ−1(D( , d))d , 0 and we obtain δ | − | ≤ sup X(s) X(t) K log(1 + D( , d))d , d(s,t)≤δ 0 ψ2 for an enlarged universal constant K.NotethatD( , d) ≥ 2 for all strictly less than diam T .Since(1+m) ≤ m2 for all m ≥ 2, the 1 inside of the logarithm can be removed at the cost of increasing K again, whenever δ

mX (δ) ≡ sup |X(s) − X(t)|. s,t∈T :d(s,t)≤δ

Corollary 8.6 Assume the conditions of corollary 8.5. Also assume there exists a differentiable function δ → h(δ), with derivative h˙ (δ), satisfy- ing h(δ) ≥ log D(δ, d) for all δ>0 small enough and limδ↓0[δh˙ (δ)/h(δ)] = 0.Then m (δ) lim lim sup P X >M =0. M→∞ δ↓0 δh(δ) 8.1 Maximal Inequalities 133

Proof. Using L’Hospital’s rule and the assumptions of the theorem, we obtain that δ log D( , d)d δ h( )d 0 ≤ 0 → 1, δh(δ) δh(δ) as δ ↓ 0. The result now follows from the first assertion of corollary 8.5. In the situation where D( , d) ≤ K(1/ )r,forconstants00 small enough, the above corollary works for h(δ)= c log(1/δ), for some constant 0 M =0. M→∞ η↓0 η log(1/η)

The rate in the denominator is quite precise in this instance since the L´evy modulus theorem (see theorem 9.25 of Karatzas and Shreve, 1991) yields m (η) √ P lim sup X = 2 =1. η↓0 η log(1/η)

The above discussion is also applicable to the modulus of continuity of certain empirical processes, and we will examine this briefly in chapter 11. We now consider an important example of a sub-Gaussian process useful for studying empirical processes. This is the Rademacher process n n X(a)= iai,a∈ R , i=1 where 1,..., n are i.i.d. Rademacher random variables satisfying P ( = −1) = P ( =1)=1/2. We will verify shortly that this is indeed a sub- Gaussian process with respect to the Euclidean distance d(a, b)= a − b (which obviously makes T = Rn into a metric space). This process will emerge in our development of Donsker results based on uniform entropy. The following lemma, also known as Hoeffding’s inequality, verifies that Rademacher processes are sub-Gaussian: n Lemma 8.7 (Hoeffding’s inequality) Let a =(a1,...,an) ∈ R and 1,..., n be independent Rademacher random variables. Then n − 1 2  2 ≤ 2 x / a P iai >x 2e , i=1 √ · ≤ for the Euclidean norm .Hence a ψ2 6 a . 134 8. Empirical Process Methods

λ λ Proof. For any λ and Rademacher variable , one has Ee =(e + −λ ∞ 2i ≤ λ2/2 e )/2= i=0 λ /(2i)! e , where the last inequality follows from the relation (2i)! ≥ 2ii! for all nonnegative integers. Hence Markov’s inequality gives for any λ>0 n n −λx 2 2 P iai >x ≤ e Eexp λ iai ≤ exp{(λ /2) a − λx}. i=1 i=1

Setting λ = x/ a 2 yields the desired upper bound. Since multiplying 1,..., n by −1 does not change the joint distribution, we obtain n n P − iai >x =P iai >x , i=1 i=1 and the desired upper bound for the absolute value of the sum follows. The bound on the ψ2-norm follows directly from lemma 8.1.

8.2 The Symmetrization Inequality and Measurability

We now discuss a powerful technique for empirical processes called sym- metrization. We begin by defining the “symmetrized” empirical process → P◦ ≡ −1 n f nf n i=1 if(Xi), where 1,..., n are independet Rademacher random variables which are also independent of X1,...,Xn. The basic idea behind symmetrization is to replace supremums of the form (Pn − P )f F P◦ with supremums of the form nf F . This replacement is very useful in Glivenko-Cantelli and Donsker theorems based on uniform entropy, and a proof of the validity of this replacement is the primary goal of this sec- P − P◦ tion. Note that the processes ( n P )f and nf both have mean zero. A deeper connection between these two processes is that a Donsker theorem or Glivenko-Cantelli theorem holds for one of these processes if and only if it holds for the other. One potentially troublesome difficulty is that the supremums involved may not be measurable, and we need to be clear about the underlying product probability spaces so that the outer expectations are well defined. In this setting, we will assume that X1,...,Xn are the coordinate projec- tions of the product space (X n, An,Pn), where (X , A,P) is the product space for a single observation and An is shorthand for the product σ-field generated from sets of the form A1 ×··· × An,whereA1,...,An ∈A. In many of the settings of interest to us, the σ-field An will be strictly smaller than the Borel σ-field generated from the product topology, as discussed in section 6.1, but the results we obtain using An will be suf- ficient for our purposes. In some settings, an additional source of ran- 8.2 The Symmetrization Inequality and Measurability 135 domness, independent of X1,...,Xn, will be involved which we will de- note Z. If we let the probability space for Z be (Z, D,Q), we will as- sume that the resulting underlying joint probability space has the form (X n, An,Pn) × (Z, D,Q)=(X n ×Z, An ×D,Pn × Q),wherewedefine n the product σ-field A ×D in the same manner as before. Now X1,...,Xn are equal to the coordinate projections on the first n coordinates, while Z is equal to the coordinate projection on the (n +1)stcoordinate. We now present the symmetrization theorem. After its proof, we will discuss a few additional important measurability issues. Theorem 8.8 (Symmetrization) For every nondecreasing, convex φ : R → R and class of measurable functions F, ∗ 1 ∗ ◦ ∗ E φ P − P F ≤ E φ ( P F ) ≤ E φ (2 P − P F + |R |· P F ) , 2 n n n n ≡ P◦ −1 n where Rn n1=n i=1 i and the outer expectations are computed based on the product σ-field described in the previous paragraph. Before giving the proof of this theorem, we make a few observations. Firstly, the constants 1/2, 1 and 2 appearing in front of the three respective supremum norms in the chain of inequalities can all be replaced by c/2, c and 2c, respectively, for any positive constant c. This follows trivially since, for any positive c, x → φ(cx) is nondecreasing and convex whenever x → φ(x) is nondecreasing and convex. Secondly, we note that most of our applications of this theorem will be for the setting φ(x)=x.Thirdly, we note that the first inequality in the chain of inequalities will be of greatest use to us. However, the second inequality in the chain can be used to establish the following Glivenko-Cantelli result, the complete proof of which will be given later on, at the tail end of section 8.3: Proposition 8.9 For any class of measurable functions F, the following are equivalent:

(i) F is P -Glivenko-Cantelli and P F < ∞. P◦ as→∗ (ii) n F 0. As mentioned previously, there is also a similar equivalence involving Donsker results, but we will postpone further discussion of this until we encounter multiplier central limit theorems in chapter 10. Proof of theorem 8.8. Let Y1,...,Yn be independent copies of X1,..., Xn. Formally, Y1,...,Yn are the coordinate projections on the last n co- ordinates in the product space (X n, An,Pn) × (Z, D,Q) × (X n, An,Pn). Here, (Z, D,Q) is the probability space for the n-vector of independent P◦ Rademacher random variables 1,..., n used in n. Since, by lemma 6.13, coordinate projections are perfect maps, the outer expectations in the the- orem are unaffected by the enlarged product probability space. For fixed X1,...,Xn, Pn − P F = 136 8. Empirical Process Methods n n 1 − ≤ ∗ 1 − sup [f(Xi) Ef(Yi)] EY sup [f(Xi) f(Yi)] , ∈F n ∈F n f i=1 f i=1 ∗ where EY is the outer expectation with respect to Y1,...,Yn computed by treating the X1,...,Xn as constants and using the probability space (X n, An,Pn). Applying Jensen’s inequality, we obtain ⎛ ⎞ ∗ n Y ⎝ 1 ⎠ φ ( P − P F ) ≤ E φ [f(X ) − f(Y )] , n Y n i i i=1 F where ∗Y denotes the minimal measurable majorant of the supremum with respect to Y1,...,Yn and holding X1,...,Xn fixed. Because φ is nonde- creasing and continuous, the ∗Y inside of the φ in the forgoing expression ∗ can be removed after replacing EY with EY , as a consequence of lemma 6.8. Now take the expectation of both sides with respect to X1,...,Xn to obtain n ∗ ∗ ∗ 1 E φ ( P − P F ) ≤ E E φ [f(X ) − f(Y )] . n X Y n i i i=1 F The repeated outer expectation can now be bounded above by the joint outer expectation E∗ by lemma 6.14 (Fubini’s theorem for outer expecta- tions). By the product space structure of the underlying probability space, the outer expectation of any function g(X1,...,Xn,Y1,...,Yn) remains un- changed under permutations of its 2n arguments. Since −[f(Xi)−f(Yi)] = − ∈{− }n [f(Yi) f(Xi)],wehaveforany n-vector (e1,...,en) 1, 1 ,that −1 n − n i=1 ei[f(Xi) f(Yi)] F is just a permutation of n ≡ −1 − h(X1,...,Xn,Y1,...,Yn) n [f(Xi) f(Yi)] . i=1 F Hence n ∗ ∗ 1 E φ ( P − P F ) ≤ E E φ e [f(X ) − f(Y )] . n X,Y n i i i i=1 F Now the triangle inequality combined with the convexity of φ yields ∗ P − ≤ ∗ P◦ E φ ( n P F ) E EX,Y φ (2 n F ) . ∗ By the perfectness of coordinate projections, EX,Y can be replaced by ∗ ∗ ∗ ∗ ∗ EX EY .NowE EX EY is bounded above by the joint expectation E by reapplication of lemma 6.14. This proves the first inequality. For the second inequality, let Y1,...,Yn be independent copies of X1,..., P◦ Xn as before. Holding X1,...,Xn and 1,..., n fixed, we have nf F = P◦ − P◦ n(f Pf)+ nPf F = 8.2 The Symmetrization Inequality and Measurability 137 n ◦ ∗ 1 P (f − Ef(Y )) + R Pf F ≤ E [f(X ) − f(Y )] + R P F . n n Y n i i i n i=1 Applying Jensen’s inequality, we now have n ∗ 1 φ ( P − P F ) ≤ E φ [f(X ) − f(Y )] + R P F . n Y n i i i n i=1 F Using the permutation argument we used for the first part of the proof, we can replace the 1,..., n in the summation with all 1’s, and take expecta- tions with respect to X1,...,Xn and 1,..., n (which are still present in Rn). This gives us n ∗ ∗ ∗ 1 E φ ( P − P F ) ≤ E E E φ [f(X ) − f(Y )] + R P F . n X Y n i i n i=1 F After adding and subtracting Pf in the summation and applying the con- vexity of φ, we can bound the right-hand-side by n 1 ∗ ∗ 1 E E E φ 2 [f(X ) − Pf] + R P F 2 X Y n i n i=1 F n 1 ∗ ∗ 1 + E E E φ 2 [f(Y ) − Pf] + R P F . 
2 X Y n i n i=1 F By reapplication of the permutation argument and lemma 6.14, we obtain the desired upper bound. The above symmetrization results will be most useful when the supre- P◦ mum n F is measurable and Fubini’s theorem permits taking the expec- tation first with respect to 1,..., n given X1,...,Xn and secondly with respect to X1,...,Xn. Without this measurability, only the weaker version of Fubini’s theorem for outer expectations applies (theorem 6.14), and thus the desired reordering of expectations may not be valid. To overcome this difficulty, we will assume that the class F is a P -measurable class.Aclass F of measurable functions f : X → R, on the probability space (X , A,P), → n is P -measurable if (X1,...,Xn) i=1 eif(Xi) F is measurable on the n n n n completion of (X , A ,P ) for every constant vector (e1,...,en) ∈ R . It is possible to weaken this condition, but at least some measurability assumptions will usually be needed. In the Donsker theorem for uniform entropy, it will be necessary to assume that several related classes of F are also P -measurable. These additional classes are Fδ ≡{f − g : f,g ∈ 2 2 F, f − g P,2 <δ}, for all δ>0, and F∞ ≡{(f − g) : f,g ∈F}(recall 2 1/2 that f P,2 ≡ (Pf ) ). Another assumption on F which is stronger than P -measurability and often easier to verify in statistical applications is pointwise measurability. 138 8. Empirical Process Methods

AclassF of measurable functions is pointwise measurable if there exists a countable subset G⊂Fsuch that for every f ∈F, there exists a sequence { }∈G → ∈X gm with gm(x) f(x) for every x . Since, by exercise 8.5.6 n below, eif(Xi) F = eif(Xi) G for all (e1,...,en) ∈ R ,pointwise measurable classes are P -measurable for all P .Consider,forexample,the class F = {1{x ≤ t} : t ∈ R} where the sample space X = R.LetG = {1{x ≤ t} : t ∈ Q}, and fix the function x → f(x)=1{x ≤ t0} for some t0 ∈ R.NotethatG is countable. Let {tm} be a sequence of rationals with tm ≥ t0, for all m ≥ 1, and with tm ↓ t0.Thenx → gm(x)=1{x ≤ tm} satisfies gm ∈G, for all m ≥ 1, and gm(x) → f(x) for all x ∈ R.Since t0 was arbitrary, we have just proven that F is pointwise measurable (and hence also P -measurable for all P ). Hereafter, we will use the abbreviation PM as a shorthand for denoting pointwise measurable classes. Another nice feature of PM classes is that they have a number of useful preservation features. An obvious example is that when F1 and F2 are PM classes, then so is F1 ∪F2. The following lemma provides a number of additional preservation results:

Lemma 8.10 Let F1,...,Fk be PM classes of real functions on X ,and k let φ : R → R be continuous. Then the class φ◦ (F1,...,Fk) is PM, where φ◦(F1,...,Fk) denotes the class {φ(f1,...,fk):(f1,...,fk) ∈F1 ×···×Fk}.

Proof. Denote H≡φ◦(F1,...,Fk). Fix an arbitrary h = φ(f1,...,fk) ∈ H.Byassumption,eachFj has a countable subset Gj ⊂Fj such that { j }∈G j → →∞ there exists a subsequence gm j with gm(x) fj(x), as m , for all x ∈Xand j =1,...,k.Bycontinuityofφ,wethushavethat 1 k → →∞ φ(gm(x),...,gm(x)) φ(f1(x),...,fk(x)) = h(x), as m , for all x ∈X. Since the choice of h was arbitrary, we therefore have that the set φ(G1,...,Gk) is a countable subset of H making H pointwise measurable. Lemma 8.10 automatically yields many other useful PM preservation results, including the following for PM classes F1 and F2:

•F1 ∧F2 (all possible pairwise minimums) is PM.

•F1 ∨F2 (all possible pairwise maximums) is PM.

•F1 + F2 is PM.

•F1 ·F2 ≡{f1f2 : f1 ∈F1,f2 ∈F2} is PM.

We will use these properties of PM classes to establish Donsker properties for some specific statistical examples later on in the case studies presented in chapter 15. The following proposition shows an additional property of PM classes that potentially simplifies the measurability requirements of the Donsker theorem for uniform entropy, theorem 8.19, given in section 8.4 below: 8.2 The Symmetrization Inequality and Measurability 139

Proposition 8.11 Let F be a class of measurable functions f : X → R on the probability space (X , A,P).ProvidedF is PM with envelope F such ∗ 2 2 that P F < ∞,thenF, Fδ and F∞ are PM for all δ>0. 2 Proof. The fact that both F∞ and F∞ are PM follows easily from lemma 8.10. Assume, without loss of generality, that the envelope F is ∗ measurable (if not, simply replace F with F ). Next, let H⊂F∞ be a countable subset for which there exists for each g ∈F∞ a sequence {hm}∈Hsuch that hm(x) → g(x) for all x ∈X.Fixδ>0andh ∈Fδ. 2 2 Then there exists an >0 such that Ph = δ − .Let{gm}∈Hbe a → ∈X 2 ≥ 2 sequence for which gm(x) h(x) for all x , and assume that Pgm δ infinitely often. Then there exists another sequence {g˜m}∈Hsuch that 2 ≥ 2 ≥ → ∈X P g˜m δ for all m 1andalso˜gm(x) h(x) for all x .Since |g˜m|≤F , for all m ≥ 1, we have by the dominated convergence theorem 2 ≤ 2 2 2 − that δ lim inf m→∞ P g˜m = Ph = δ , which is impossible. Hence, returning to the original sequence {gm}, gm P,2 cannot be ≥ δ infinitely often. Thus there exists a sequence {gˇm}∈Hδ ≡{g ∈H: g P,2 <δ} such thatg ˇm(x) → h(x) for all x ∈X.ThusFδ is PM since h was arbitrary and Hδ does not depend on h.Sinceδ was also arbitrary, the proof is complete. We next consider establishing P -measurability for the class ) * 1{Y − βT Z ≤ t} : β ∈ Rk,t∈ R , where X ≡ (Y,Z) ∈X ≡R × Rk has distribution P , for arbitrary P .This class was considered in the linear regression example of section 4.1. The desired measurability result is stated in the following lemma: ) * T k Lemma 8.12 Let F≡ 1{Y − β Z ≤ t} : β ∈ R ,t∈) R . Then the classes* 2 2 F, Fδ ≡{f − g : f,g ∈F, f − g P,2 <δ},andF∞ ≡ (f − g) : f,g ∈F are all P -measurable for any probability measure on X . Proof. We first assume that Z ≤M for some fixed M<∞. Hence the sample space is )XM ≡{(y,z):(y,z) ∈X, z ≤* M}. Consider the countable set G = 1{Y − βT Z ≤ t} : β ∈ Qk,t∈ Q ,whereQ are the k rationals. Fix β ∈ R and t ∈ R, and construct a sequence {(βm,tm)} as k follows: for each m ≥ 1, pick βm ∈ Q so that βm − β < 1/(2mM) and pick tm ∈ Q so that tm ∈ (t +1/(2m),t+1/m]. Now, for any (y,z) ∈ R ∈ Rk ≤ { − T ≤ } with y and z with z M,wehavethat1 y βmz tm = T T T 1{y − β z ≤ tm +(βm − β) z}.Since|(βm − β) z| < 1/(2m)bydesign, T we have that rm ≡ tm +(βm − β) z − t>0 for all m and that rm → 0as m →∞. Since the function t →{u ≤ t} is right-continuous and since (y,z) { − T ≤ }→{ − T ≤ } was arbitrary, we have just proven that 1 y βmz tm y β z t for all (y,z) ∈XM .ThusF is pointwise measurable with respect to the countable subset G. 2 We can also verify that Fδ and F∞ are likewise PM classes, under the constraint that the random variable Z satisfies Z ≤M.Toseethisfor 140 8. Empirical Process Methods

Fδ,letf1,f2 ∈Fsatisfy f1 − f2 P,2 <δand let {gm,1}, {gm,2}∈Gbe such that gm,1 → f1 and gm,2 → f2 pointwise in XM . Then, by dominated convergence, gm,1 − gm,2 P,2 → f1 − f2 , and thus gm,1 − gm,2 P,2 <δ for all m large enough. Hence

gm,1 − gm,2 ∈Gδ ≡{f − g : f,g ∈G, f − g P,2 <δ} for all m large enough, and thus Gδ is a separable subset of Fδ making 2 Fδ into a PM class. The proof that F∞ is also PM follows directly from lemma 8.10. Now let JM (x1,...,xn)=1{max1≤i≤n zi ≤M},wherexi =(yi,zi), 1 ≤ i ≤ n.SinceM was arbitrary, the previous two paragraphs have established that n → (8.7) (x1,...,xn) eif(xi) JM (x1,...,xn) i=1 H

n is measurable for every n-tuple (e1,...,en) ∈ R ,everyM<∞,and 2 n with H being replaced by F, Fδ or F∞. Now, for any (x1,...,xn) ∈X , JM (x1,...,xn) = 1 for all M large enough. Thus the map (8.7) is also measurable after replacing JM with its pointwise limit 1 = limM→∞ JM . 2 Hence F, Fδ and F∞ are all P -measurable classes for any measure P on X . Another example of a P -measurable class occurs when F is a Suslin topological space (for an arbitrary topology O), and the map (x, f) → f(x) is jointly measurable on X×Ffor the product σ-field of A and the Borel σ-field generated from O. Further insights and results on this measurable Suslin condition can be found in example 2.3.5 and chapter 1.7 of VW. While this approach to establishing measurability can be useful in some settings, a genuine need for it does not often occur in statistical applications, and we will not pursue it further here.

8.3 Glivenko-Cantelli Results

We now present several Glivenko-Cantelli (G-C) results. First, we discuss an interesting necessary condition for a class F to be P -G-C. Next, we present the proofs of G-C theorems for bracketing (theorem 2.2) and uni- form (theorem 2.4) entropy. Part of the proof in the uniform entropy case will include the presentation of a new G-C theorem, theorem 8.15 below. Finally, we give the proof of proposition 8.9 which was promised in the previous section. The following lemma shows that the existence of an integrable envelope of the centered functions of a class F is a necessary condition for F to be P -G-C: 8.3 Glivenko-Cantelli Results 141

Lemma 8.13 If the class of functions F is strong P -G-C, then P f − ∗ ∗ Pf F < ∞. If in addition P F < ∞,thenalsoP f F < ∞.

Proof. Since f(Xn) − Pf = n(Pn − P )f − (n − 1)(Pn−1 − P )f,wehave −1 −1 n f(Xn)−Pf F ≤ Pn −P F +(1−n ) Pn−1 −P F .SinceF is strong − ∗ ≥ P -G-C, we now have that P ( f(Xn) Pf F n infinitely often) = 0. The ∞ − ∗ ≥ ∞ Borel-Cantelli lemma now yields that n=1 P ( f(Xn) Pf F n) < . Since the Xn are i.i.d., the f(Xn) in the summands can be replaced with f(X1) for all n ≥ 1. Now we have

∞ ∗ ∗ P f − Pf F ≤ P ( f(X1) − Pf F >x) dx 0 ∞ ∗ ≤ 1+ P ( f(X1) − Pf F ≥ n) n=1 < ∞.

Proof of theorem 2.2. Fix >0. Since the L1-bracketing entropy is bounded, it is possible to choose finitely many -brackets [li,ui] so that their union contains F and P (ui −li) < for every i. Now, for every f ∈F,there is a bracket [li,ui] containing f with (Pn −P )f ≤ (Pn −P )ui +P (ui −f) ≤ (Pn − P )ui + . Hence

sup(Pn − P )f ≤ max(Pn − P )ui + f∈F i as∗ → .

Similar arguments can beusedtoverifythat

inf (Pn − P )f ≥ min(Pn − P )li − f∈F i as∗ →− .

The desired result now follows since was arbitrary. To prove theorem 2.4, we first restate the theorem to clarify the meaning of “appropriately measurable” in the original statement of the theorem, and then prove a more general version (theorem 8.15 below): Theorem 8.14 (Restated theorem 2.4) Let F be a P -measurable class F of measurable functions with envelope F and supQ N( F Q,1, ,L1(Q)) < ∞, for every >0, where the supremum is taken over all finite probability ∗ measures Q with F Q,1 > 0.IfP F<∞,theF is P -G-C. Proof. The result is trivial if P ∗F = 0. Hence we will assume without loss of generality that P ∗F>0. Thus there exists an η>0 such that, with probability 1, PnF>ηfor all n large enough. Fix >0. By assump- tion, there is a K<∞ such that 1{PnF>0} log N( PnF, F,L1(Pn)) ≤ K 142 8. Empirical Process Methods almost surely, since Pn is a finite probability measure. Hence, with proba- bility 1, log N( η, F,L1(Pn)) ≤ K for all n large enough. Since was arbi- F P ∗ trary, we now have that log N( , ,L1( n)) = OP (1) for all >0. Now fix >0 (again) and M<∞, and define FM ≡{f1{F ≤ M} : f ∈F}. Since, − { ≤ } ≤ − ∈F F P ≤ (f g)1 F M 1,Pn f g 1,Pn for any f,g , N( , M ,L1( n)) F P F P ∗ N( , ,L1( n)). Hence log N( , M ,L1( n)) = OP (1). Finally, since and M are both arbitrary, the desired result follows from theorem 8.15 below.

Theorem 8.15 Let F be a P -measurable class of measurable functions ∗ with envelope F such that P F<∞.LetFM be as defined in the above F P ∗ ∞ proof. If log N( , M ,L1( n)) = oP (n) for every >0 and M< ,then ∗ P Pn − P F → 0 and F is strong P -G-C.

Before giving the proof of theorem 8.15, we give the following lemma which will be needed. This is lemma 2.4.5 of VW, and we omit the proof: Lemma 8.16 Let F be a class of measurable functions f : X → R with integrable envelope. Define a filtration by letting Σn be the σ-field generated by all measurable functions h : X ∞ → R that are permutation-symmetric ∗ ∗ in their first n arguments. Then E( Pn − P F | Σn+1) ≥ Pn+1 − P F , almost surely. Furthermore, there exist versions of the measurable cover ∗ functions Pn − P F that are adapted to the filtration. Any such versions form a reverse submartingale and converge almost surely to a constant. Proof of theorem 8.15. By the symmetrization theorem 8.8, P -measura- bility of F, and by Fubini’s theorem, we have for all M>0that n ∗ 1 E P − P F ≤ 2E E f(X ) n X n i i i=1 F 1 n ≤ 2E E f(X )1{F ≤ M} X n i i i=1 F 1 n +2E E f(X )1{F>M} X n i i i=1 F 1 n ≤ ∗ { } 2EX E if(Xi) +2P [F 1 F>M ] . n =1 i FM

The last term can be made arbitrarily small by making M large enough. Thus, for convergence in mean, it is enough to show that the first term goes to zero for each M. Accordingly, fix M<∞.FixalsoX1,...,Xn,andlet G be a finite δ-mesh in L1(Pn)overFM .Thus 1 n 1 n ≤ (8.8) E if(Xi) if(Xi) + δ. n =1 n =1 i FM i G 8.4 Donsker Results 143

By definition of the entropy number, the cardinality of G can be chosen equal to N(δ, FM ,L1(Pn)). Now, we can bound the L1-norm on the right- 2 hand-side of (8.8) by the Orlicz-norm for ψ2(x)=exp(x ) − 1, and apply the maximal inequality lemma 8.2 to find that the left-hand-side of (8.8) is bounded by a universal constant times 1 n F P 1+logN(δ, M ,L1( n)) sup if(Xi) + δ, f∈F n =1 i ψ2|X · where the Orlicz norms ψ2|X are taken over 1,..., n with X1,...,Xn still fixed. From exercise 8.5.7 below, we have—by Hoeffding’s inequal- ity (lemma 8.1) combined with lemma 8.1—that the Orlicz norms are all 2 1/2 bounded by 6/n (Pnf ) , which is bounded by 6/n M. The last dis- played expression is thus bounded by 6{1+logN(δ, F ,L1(P ))} P M n M + δ → δ. n Thus the left-hand-side of (8.8) goes to zero in probability. Since it is also bounded by M, the bounded convergence theorem implies that its expectation also goes to zero. Since M was arbitrary, we now have that ∗ E Pn − P F → 0. This now implies that F is weak P -G-C. ∗ From lemma 9.13, we know that there exists a version of Pn −P F that converges almost surely to a constant. Since we already know that Pn − ∗ P P F → 0, this constant must be zero. The desired result now follows. Proof of proposition 8.9. Assume (i). By the second inequality of P◦ →P the symmetrization theorem (theorem 8.8), n F 0. This convergence P◦ can be strengthened to outer almost surely, since n F for a reverse sub- martingale as in the previous proof. Now assume (ii). By lemma 8.13 and the fact that P [ f(X)] = 0 for a Rademacher independent of X,we ∗ ∗ obtain that P f F = P f(X) F < ∞.Now,thefactthatF is weak P -G-C follows from the first inequality in the symmetrization theorem. The convergence can be strengthened to outer almost sure by the reverse martingale argument used previously. Thus (ii) follows.

8.4 Donsker Results

We now present several Donsker results. We begin with several interesting necessary and sufficient conditions for a class to be P -Donsker. We next present the proofs of Donsker theorems for bracketing (theorem 2.3) and uniform (theorem 2.5) entropy. Before proceeding, let Fδ ≡{f − g : f,g ∈ F, f − g P,2 <δ}, for any 0 <δ≤∞. The following lemma outlines several properties of Donsker classes and shows that Donsker classes are automatically strong Glivenko-Cantelli classes: 144 8. Empirical Process Methods

Lemma 8.17 Let F be a class of measurable functions, with envelope 2 F ≡ f F . For any f,g ∈F, define ρ(f,g) ≡ P (f − Pf − g + Pg) ;and, for any δ>0,letFδ ≡{f − g : ρ(f,g) <δ}. Then the following are equivalent: (i) F is P -Donsker;

F G →P ↓ (ii) ( ,ρ) is totally bounded and n Fδn 0 for every δn 0; F ∗ G → ↓ (iii) ( ,ρ) is totally bounded and E n Fδn 0 for every δn 0. ∗ r r These conditions imply that E Gn F → G F < ∞, for every 0 x)=o(x ) as x →∞; and that F is strong P -G-C. ∗ −2 If in addition P F < ∞,thenalsoP (F >x)=o(x ) as x →∞. Proof. The equivalence of (i)–(iii) and the first assertion is lemma 2.3.11 of VW, and we omit the equivalence part of the proof. Now assume condi- tions (i)–(iii) hold. Lemma 2.3.9 of VW states that if F is Donsker, then

2 ∗ (8.9) lim x sup P ( G F >x)=0. →∞ n x n≥1

This immediately implies that the rth moment of Gn F is uniformly r bounded in n and that E G F < ∞ for all 0 x)=o(x ) as x →∞, and the remaining assertions follow. Proofoftheorem2.3(Donskerwithbracketingentropy).With agivensetof /2-brackets [li,ui]coveringF, we can construct a set of -brackets covering F∞ by taking differences [li − uj,ui − lj] of upper and lower bounds, i.e., if f ∈ [li,ui]andg ∈ [lj,uj], then f −g ∈ [li −uj,ui −lj]. F ≤ 2 F Thus N[]( , ∞,L2(P )) N[] ( /2, √,L2(P )). From exercise 8.5.8 below, we now have that J[](δ, F∞,L2(P )) ≤ 8J[](δ, F,L2(P )). Now, for a given, small δ>0, select a minimal number of δ-brackets F F ∪m F that cover , and use them to construct a finite partition = i=1 i consisting of sets of L2(P )-diameter δ. For any i ∈{1,...,m}, the subset of F∞ consisting of all f −g with f,g ∈Fi consists of functions with L2(P ) norm strictly smaller than δ. Hence by lemma 8.18 below, there exists a number a(δ) > 0 satisfying ∗ (8.10) E sup sup |Gn(f − g)|  J[](δ, F,L2(P )) 1≤i≤m f,g∈Fi √ √ + nP ∗ G1{G>a(δ) n} , where G is an envelope for F∞ and the relation  means that the left-hand- side is bounded above by a universal positive constant times the right-hand- side. In this setting, “universal” means that the constant does not depend 8.4 Donsker Results 145 on n or δ.If[l, u] is a minimal bracket for covering all of F,thenG can be taken to be u − l. Boundedness of the entropy integral implies that there exists some k<∞ so that only one L2(P ) bracket of size k is needed to cover F. This implies PG2 < ∞. By exercise 8.5.9 below, the second term on the√ right-hand-side of (8.10) is bounded above by [a(δ)]−1P ∗ G21 {G>a(δ) n} and thus goes to zero as n →∞.SinceJ[](δ, F,L2(P )) = o(δ), as δ ↓ 0, we now have that limδ↓0 lim supn→∞ of the left-hand-side of (8.10) goes to zero. In view of Markov’s inequality for outer probability (which follows from Chebyshev’s inequality for outer probability as given in lemma 6.10), condition (ii) in lemma 7.20 is satisfied for the stochastic process Xn(f)=Gn(f) with index set T = F. Now, theorem 2.1 yields the desired result. Lemma 8.18 For any class F of measurable functions f : X → R with Pf2 <δ2 for every f, we have, with δ a(δ) ≡ , 1 ∨ log N[](δ, F,L2(P )) and F an envelope function for F, that √ √ ∗ ∗ E Gn F  J[](δ, F,L2(P )) + nP F 1{F> na(δ)} . This is lemma 19.34 of van der Vaart (1998), who gives a nice proof which utilizes the maximal inequality result given in lemma 8.3 above. The argu- ments are lengthy, and we omit the proof. To prove theorem 2.5 (Donsker with uniform entropy), we first restate the theorem with a clarification of the measurability assumption, as done in the previous section for theorem 2.4: Theorem 8.19 (Restated theorem 2.5) Let F be a class of measurable functions with envelope F and J(1, F,L2) < ∞. Let the classes Fδ and 2 2 ∗ 2 F∞ ≡{h : h ∈F∞} be P -measurable for every δ>0.IfP F < ∞,then F is P -Donsker. 2 We note here that by proposition 8.11, if F is PM, then so are Fδ and F∞, for all δ>0, provided F has envelope F such that P ∗F 2 < ∞.SincePM implies P -measurability, all measurability requirements for theorem 8.19 are thus satisfied whenever F is PM. Proof of theorem 8.19. Let the positive, decreasing sequence δn ↓ 0 be arbitrary. By Markov’s inequality for outer probability (see lemma 6.10) and the symmetrization theorem 8.8, n ∗ 2 ∗ 1 P G F >x ≤ E √ f(X ) , n δn i i x n =1 i Fδn for i.i.d. 
Rademachers 1,..., n independent of X1,...,Xn.BytheP - measurability assumption for Fδ, for all δ>0, the standard version of Fu- 146 8. Empirical Process Methods bini’s theorem applies, and the outer expectation is just a standard expecta- tion and can be calculated in the order EX E . Accordingly, fix X1,...,Xn. −1/2 By Hoeffding’s inequality (lemma 8.1), the stochastic process f → n × n P i=1 if(Xi) is sub-Gaussian for the L2( n)-seminorm 6 7 7 1 n f ≡ 8 f 2(X ). n n i i=1 This stochastic process is also separable since, for any measure Q and >0, F ≤ F ≤ 2 F N( , δn ,L2(Q)) N( , ∞,L2(Q)) N ( /2, ,L2(Q)), and the latter is finite for any finite dimensional probability measure Q and any >0. Thus the second conclusion of corollary 8.5 holds with 1 n ∞ √  F P (8.11) E if(Xi) log N( , δn ,L2( n))d . n =1 0 i Fδn −1/2 n Note that we can omit the term E n i=1 if0(Xi) from the con- clusion of the corollary because it is also bounded by the right-hand-side of (8.11). F P For sufficiently large ,theset δn fits in a single ball of L2( n)-radius around the origin, in which case the integrand on the right-hand-side of (8.11) is zero. This will definitely happen when is larger than θn, where n 2 ≡ 2 1 2 θn sup f n = f (Xi) . ∈F f δn n =1 i Fδn Thus the right-hand-side of (8.11) is bounded by θn F P log N( , δn ,L2( n))d 0 θn 2  log N ( /2, F,L2(Pn))d 0 θn/(2F n)  log N( F n, F,L2(Pn))d F n 0

 F nJ(θn, F,L2).

The second inequality follows from the change of variables u = /(2 F n) (and then renaming u to ). For the third inequality, note that we can add 1/2 to the envelope function F without changing the existence of its second moment. Hence F n ≥ 1/2 without loss of generality, and thus θn/(2 F n) ≤ θn. Because F n = Op(1), we can now conclude that the left-hand-side of (8.11) goes to zero in probability, provided we can verify P that θn → 0. This would then imply asymptotic L2(P )-equicontinuity in probability. 8.5 Exercises 147

2 P → F ⊂F P − 2 → Since Pf Fδn 0and δn ∞, establishing that n P F∞ 0 P 2 2 would prove that θn → 0. The class F∞ has integrable envelope (2F ) and 2 is P -measurable by assumption. Since also, for any f,g ∈F∞, Pn|f − 2 g |≤Pn(|f − g|4F ) ≤ f − g n 4F n, we have that the covering number 2 F 2 P F P N( 2F n, ∞,L1( n)) is bounded by N( F n, ∞,L2( n)). Since this 2 F ∞ last covering number is bounded by supQ N ( F Q,2/2, ,L2(Q)) < , where the supremum is taken over all finitely discrete probability measures 2 with F Q,2 > 0, we have by theorem 8.14 that F∞ is P -Glivenko-Cantelli. P Thus θˆn → 0. This completes the proof of asymptotic equicontinuity. The last thing we need to prove is that F is totally bounded in L2(P ). By the result of the last paragraph, there exists a sequence of discrete − 2 → probability measures Pn with (Pn P )f F∞ 0. Fix >0andtaken − 2 2 F large enough so that (Pn P )f F∞ < .NotethatN( , ,L2(Pn)) is ∈F − − 2 ≤ finite by assumption, and, for any f,g with f g Pn,2 < , P (f g) 2 2 2 Pn√(f − g) + |(Pn − P )(f − g) |≤2 . Thus any -net in L2(Pn)isalso a 2 -net in L2(P ). Hence F is totally bounded in L2(P )since was arbitrary.

8.5 Exercises

8.5.1. For any ψ valid for defining an Orlicz norm · ψ, show that the space Hψ of real random variables X with X ψ < ∞ defines a Banach space,whereweequatearandomvariableX with zero if X =0almost surely:

(a) Show first that · ψ defines a norm on Hψ. Hint: Use the convexity of ψ to establish that for any X, Y ∈ Hψ and any c1,c2 > 0, |X + Y | c1 |X| c2 |Y | Eψ ≤ Eψ + Eψ . c1 + c2 c1 + c2 c1 c1 + c2 c2

(b) Now show that Hψ is complete. Hint: Show that for any Cauchy { }∈ ∞ sequence of random variables Xn Hψ, lim supn→∞ Xn ψ < . Use Prohorov’s theorem to show that every such Cauchy sequence converges to an almost surely unique element of Hψ. 8.5.2. Show that 1 ∧ (eu − 1)−1 ≤ 2e−u, for any u>0.

8.5.3. Forafunctionψ meeting the conditions of lemma 8.2, show that there exists constants 0 <σ≤ 1andτ>0 such that φ(x) ≡ σψ(τx) satisfies φ(x)φ(y) ≤ φ(cxy) for all x, y ≥ 1andφ(1) ≤ 1/2. Show that this φ also satisfies the following

(a) For all u>0, φ−1(u) ≤ ψ−1(u)/(στ). 148 8. Empirical Process Methods

(b) For any random variable X, X ψ ≤ X φ/(στ) ≤ X ψ/σ.

8.5.4. Show that for any p ∈ [1, ∞), ψp satisfies the conditions of lemma 8.2 with c =1. 8.5.5. Let ψ satisfy the conditions of lemma 8.2. Show that for any se- P quence of random variables {Xn}, Xn ψ → 0 implies Xn → 0. Hint: Show that lim infx→∞ ψ(x)/x > 0, and hence Xn ψ → 0 implies E|Xn|→0. 8.5.6. Show that if the class of functions F has a countable subset G⊂ F ∈F { }∈G such that for each f there exists a sequence gm with gm(x) → f(x) for every x ∈X,then eif(Xi) F = eif(Xi) G n for all (e1,...,en) ∈ R . 8.5.7. In the context of the proof of theorem 8.15, use Hoeffding’s in- equality (lemma 8.7) combined with lemma 8.1 to show that 1 n ≤ P 2 1/2 if(Xi) 6/n( nf ) , n =1 i ψ2|X where the meaning of the subscript ψ2|X is given in the body of the proof.

8.5.8. In the context of the proof of theorem 2.3 above, show that by taking logarithms followed by square roots, √ J[](δ, F∞,L2(P )) ≤ 8J[](δ, F,L2(P )).

8.5.9. Show, for any map X :Ω→ R and constants α ∈ [0, ∞)and m ∈ (0, ∞), that ∗ ∗ E [|X|α1{|X| >m}] ≤ m−1E |X|α+11{|X| >m} .

8.6 Notes

Lemma 8.2, corollary 8.5, and theorems 8.15 and 8.19 correspond to lemma 2.2.1, corollary 2.2.8, and theorems 2.4.3 and 2.5.2 of VW, respectively. The first inequality in theorem 8.8 corresponds to lemma 2.3.1 of VW. Lemma 8.3 is a modification of lemma 2.2.10 of VW, and theorem 8.4 is a modification and combination of both theorem 2.2.4 and corollary 2.2.5 of VW. The version of Hoeffding’s inequality we use (lemma 8.7) is lemma 2.2.7 of VW, and lemma 8.13 was inspired by exercise 2.4.1 of VW. The proof of theorem 2.3 follows closely van der Vaart’s (1998) proof of his theorem 19.5. This is page 149 Printer: Opaque this 9 Entropy Calculations

The focus of this chapter is on computing entropy for empirical processes. An important use of such entropy calculations is in evaluating whether a class of functions F is Glivenko-Cantelli and/or Donsker or neither. Several additional uses of entropy bounds will be discussed in chapter 11. Some of these uses will be very helpful in chapter 14 for establishing rates of con- vergence for M-estimators. An additional use of entropy bounds, one which will receive only limited discussion in this book, is in precisely assessing the asymptotic smoothness of certain empirical processes. Such smooth- ness results play a role in the asymptotic analysis of a number of statistical applications, including confidence bands for kernel density estimation (eg., Bickel and Rosenblatt, 1973) and certain hypothesis tests for multimodality (Polonik, 1995). We begin the chapter by describing methods to evaluate uniform entropy. Provided the uniform entropy for a class F is not too large, F might be G-C or Donsker, as long as sufficient measurability holds. Since many of the techniques we will describe for building bounded uniform entropy inte- gral (BUEI) classes (which include VC classes) closely parallel the methods for building pointwise measurable (PM) classes described in the previous chapter, we will include a discussion on joint BUEI and PM preservation towards the end of section 9.1.2. We then present methods based on brack- eting entropy. Several important results for building G-C classes from other G-C classes (G-C preservation), are presented next. Finally, we discuss sev- eral useful Donsker preservation results. One can think of this chapter as a handbag of tools for establishing weak convergence properties of empirical processes. Illustrations of how to 150 9. Entropy Calculations use these tools will be given in various applications scattered throughout later chapters. To help anchor the context for these tools in practice, it might be worthwhile rereading the counting process regression example of section 4.2.1, in the first case studies chapter of this book. In the second case studies chapter of this book (chapter 15), we will provide additional— and more complicated—illustrations of these tools, with special emphasis on Donsker preservation techniques.

9.1 Uniform Entropy

We first discuss the very powerful concept of VC-classes of sets and func- tions. Such classes are extremely important tools in assessing and control- ling bounded uniform entropy. We then discuss several useful and powerful preservation results for bounded uniform entropy integral (BUEI) classes.

9.1.1 VC-Classes In this section, we introduce Vapnik-Cervonenkisˇ (VC) classes of sets, VC- classes of functions, and several related function classes. We then present several examples of VC-classes. Consider an arbitrary collection {x1,...,xn} of points in a set X and a collection C of subsets of X .WesaythatC picks out a certain subset A of {x1,...,xn} if A = C ∩{x1,...,xn} for some C ∈C.WesaythatC shatters n {x1,...,xn} if all of the 2 possible subsets of {x1,...,xn} are picked out by the sets in C.TheVC-index V (C)oftheclassC is the smallest n for which no set of size n {x1,...,xn}⊂X is shattered by C.IfC shatters all sets {x1,...,xn} for all n ≥ 1, we set V (C)=∞. Clearly, the more refined C is, the higher the VC-index. We say that C is a VC-class if V (C) < ∞. For example, let X = R and define the collection of sets C = {(−∞,c]: c ∈ R}. Consider any two point set {x1,x2}⊂R and assume, without loss of generality, that x1

V (C)−1 n max Δn(C,x1,...,xn) ≤ . x1,...,xn∈X j j=1

Since the right-hand-side is bounded by V (C)nV (C)−1, the left-hand-side grows polynomially of order at most O(nV (C)−1). This is a corollary of lemma 2.6.2 of VW, and we omit the proof. Let 1{C} denote the collection of all indicator functions of sets in the class C. The following theorem gives a bound on the Lr covering numbers of 1{C}: Theorem 9.2 There exists a universal constant K<∞ such that for any VC-class of sets C,anyr ≥ 1,andany0 < <1,

( (C)−1) 1 r V N( , 1{C},L (Q)) ≤ KV (C)(4e)V (C) . r

This is theorem 2.6.4 of VW, and we omit the proof. Since F =1servesas an envelope for 1{C}, we have as an immediate corollary that, for F =1{C}, F ∞ supQ N( F 1,Q, ,L1(Q)) < and

1 ∞ 1/2 −u J(1, F,L2)  log(1/ )d = u e du ≤ 1, 0 0 where the supremum is over all finite probability measures Q with F Q,2 > 0. Thus the uniform entropy conditions required in the G-C and Donsker theorems of the previous chapter are satisfied for indicators of VC-classes of sets. Since the constant 1 serves as a universally applicable envelope function, these classes of indicator functions are therefore G-C and Donsker, provided the requisite measurability conditions hold. For a function f : X → R, the subset of X×R given by {(x, t):t< f(x)} is the subgraph of f. A collection F of measurable real functions on thesamplespaceX is a VC-subgraph class or VC-class (for short), if the collection of all subgraphs of functions in F forms a VC-class of sets (as sets in X×R). Let V (F) denote the VC-index of the set of subgraphs of F. The following theorem, the proof of which is given in section 9.5, shows that covering numbers of VC-classes of functions grow at a polynomial rate just like VC-classes of sets: Theorem 9.3 There exists a universal constant K<∞ such that, for any VC-class of measurable functions F with integrable envelope F ,any r ≥ 1, any probability measure Q with F Q,r > 0,andany0 < <1,

( (F)−1) 2 r V N( F , F,L (Q)) ≤ KV (F)(4e)V (F) . Q,r r 152 9. Entropy Calculations

Thus VC-classes of functions easily satisfy the uniform entropy require- ments of the G-C and Donsker theorems of the previous chapter. A related kind of function class is the VC-hull class. A class of measurable functions G is a VC-hull class if there exists a VC-class F such that each f ∈Gis the { } pointwise limit of a sequence of functions fm in the symmetric convex F F F m hull of (denoted sconv ). A function f is in sconv if f = i=1 αifi, m | |≤ where the αis are real numbers satisfying i=1 αi 1andthefisare in F.Theconvex hull of a class of functions F, denoted convF, is simi- larly defined but with the requirement that the αi’s are positive. We use convF to denote pointwise closure of convF and sconvF to denote the pointwise closure of sconvF. Thus the class of functions F is a VC-hull class if F = sconvG for some VC-class G. The following theorem provides a useful relationship between the L2 covering numbers of a class F (not necessarily a VC-class) and the L2 covering numbers of convF when the covering numbers for F are polynomial in 1/ : Theorem 9.4 Let Q be a probability measure on (X , A),andletF be a class of measurable functions with measurable envelope F , such that QF 2 < ∞ and, for 0 < <1, 1 V N( F 2, F,L2(Q)) ≤ C , Q, for constants C, V < ∞ (possibly dependent on Q). Then there exist a constant K depending only on V and C such that

2 ( +2) 1 V/ V log N( F 2, convF,L2(Q)) ≤ K . Q,

This is theorem 2.6.9 of VW, and we omit the proof. It is not hard to verify that sconvF is a subset of the convex hull of F∪{−F}∪{0},where−F ≡ {−f : f ∈F}(see exercise 9.6.1 below). Since the covering numbers of F∪{−F}∪{0} are at most one plus twice the covering numbers of F, the conclusion of theorem 9.4 also holds if convF is replaced with sconvF. This leads to the following easy corollary for VC-hull classes, the proof of which we save as an exercise: Corollary 9.5 For any VC-hull class F of measurable functions and all 0 < <1,

2−2 1 /V sup log N( F Q,2, F,L2(Q)) ≤ K , 0 < <1, Q where the supremum is taken over all probability measures Q with F Q,2 > 0, V is the VC-index of the VC-subgraph class associated with F,andthe constant K<∞ depends only on V . 9.1 Uniform Entropy 153

We now present several important examples and results about VC-classes of sets and both VC-subgraph and VC-hull classes of functions. The first lemma, lemma 9.6, applies to vector spaces of functions, a frequently oc- curring function class in statistical applications.

Lemma 9.6 Let F be a finite-dimensional vector space of measurable functions f : X → R.ThenF is VC-subgraph with V (F) ≤ dim(F)+2.

Proof. Fix any collection G of k =dim(F)+2 points (x1,t1),...,(xk,tk) T in X×R. By assumption, the vectors (f(x1) − t1,...,f(xk) − tk) ,asf ranges over F, are restricted to a dim(F)+1 = k − 1-dimensional subspace H of Rk. Any vector 0 = a ∈ Rk orthogonal to H satisfies (9.1) aj(f(xj) − tj)= (−aj)(f(xj ) − tj), j:aj >0 j:aj <0 for all f ∈F, where we define sums over empty sets to be zero. There exists such a vector a with at least one strictly positive coordinate. For this a, the subset of G of the form {(xj,tj):aj > 0} cannot also be of the form {(xj,tj):tj 0}.SinceG was arbitrary, we have just shown that the subgraphs of F cannot shatter any set of k points in X×R. The desired result now follows.. The next three lemmas, lemmas 9.7 through 9.9, consist of useful tools for building VC-classes from other VC-classes. One of these lemmas, lemma 9.8, is the important identity that classes of sets are VC if and only if the cor- responding classes of indicator functions are VC-subgraph. The proof of lemma 9.9 is relegated to section 9.5.

Lemma 9.7 Let C and D be VC-classes of sets in a set X , with respective VC-indices VC and VD;andletE be a VC-class of sets in W, with VC-index VE .Alsoletφ : X →Y and ψ : Z →X be fixed functions. Then

c c c (i) C ≡{C : C ∈C}is VC with V (C )=VC ;

(ii) CD≡{C ∩ D : C ∈C,D∈D}is VC with index ≤ VC + VD − 1;

(iii) CD≡{C ∪ D : C ∈C,D∈D}is VC with index ≤ VC + VD − 1;

(iv) D×E is VC in X×Wwith VC index ≤ VD + VE − 1;

(v) φ(C) is VC with index VC if φ is one-to-one;

−1 (vi) ψ (C) is VC with index ≤ VC . 154 9. Entropy Calculations

Proof. For any C ∈C,thesetCc picks out the points of a given set x1,...,xn that C does not pick out. Thus if C shatters a given set of points, so does Cc, and vice versa. This proves (i). From n points, C can pick out O(nVC −1) subsets. From each of these subsets, D can pick out O(nVD −1) further subsets. Hence CD can pick out at most O(nVC +VD−2) subsets, and thus (ii) follows from the definition of a VC-class. We save it as an exercise to show that (i) and (ii) imply (iii). For (iv), note that D×W and X×E are both VC-classes with respective VC-indices VD and VE . The desired conclusion now follows from part (ii), since D×E=(D×W)  (X×E). For part (v), if φ(C) shatters {y1,...,yn},theneachyi must be in the range of φ and there must therefore exist x1,...,xn such that φ is a bijec- tion between x1,...,xn and y1,...,yn. Hence C must shatter x1,...,xn, and thus V (φ(C)) ≤ V (C). Conversely, it is obvious that if C shatters x1,...,xn,thenφ(C) shatters φ(x1),...,φ(xn). Hence V (C) ≤ V (φ(C)). −1 For (vi), if ψ (C) shatters z1,...,zn,thenallψ(zi) must be distinct and the restriction of ψ to z1,...,zn is a bijection onto its range. Thus C shat- −1 ters ψ(z1),...,ψ(zn), and hence V (ψ (C)) ≤ V (C).

Lemma 9.8 For any class C of sets in a set X , the class FC of indicator functionsofsetsinC is VC-subgraph if and only if C is a VC-class. More- over, whenever at least one of C or FC is VC, the respective VC-indices are equal.

Proof. Let D be the collection of sets of the form {(x, t):t<1{x ∈ C}} for all C ∈C. Suppose that D is VC, and let k = V (D). Then no set of the form {(x1, 0),...,(xk, 0)} canbyshatteredbyD, and hence V (C) ≤ V (D). Now suppose that C is VC with VC-index k.Sinceforanyt<0, 1{x ∈ C} >tfor all x and all C, we have that no collection {(x1,t1),...,(xk,tk)} can be shattered by D if any of the tjsare< 0. It is similarly true that no collection {(x1,t2),...,(xk,tk)} canbeshatteredbyD if any of the tjsare ≥ 1, since 1{x ∈ C} >tis never true when t ≥ 1. It can now be deduced that {(x1,t1),...,(xk,tk)} can only be shattered if {(x1, 0),...,(xk, 0)} can be shattered. But this can only happen if {x1,...,xk} canbeshatteredby C.ThusV (D) ≤ V (C).

Lemma 9.9 Let F and G be VC-subgraph classes of functions on a set X , with respective VC indices VF and VG .Letg : X → R, φ : R → R,and ψ : Z →X be fixed functions. Then

(i) F∧G ≡{f ∧ g : f ∈F,g ∈G}is VC-subgraph with index ≤ VF + VG − 1;

(ii) F∨Gis VC with index ≤ VF + VG − 1;

(iii) {F > 0}≡{{f>0} : f ∈F}is a VC-class of sets with index VF ;

(iv) −F is VC-subgraph with index VF ; 9.1 Uniform Entropy 155

(v) F + g ≡{f + g : f ∈F}is VC with index VF ;

(vi) F·g ≡{fg : f ∈F}is VC with index ≤ 2VF − 1;

(vii) F◦ψ ≡{f(ψ):f ∈F}is VC with index ≤ VF ;

(viii) φ ◦F is VC with index ≤ VF for monotone φ.

The next two lemmas, lemmas 9.10 and 9.11, refer to properties of mono- tone processes and classes of monotone functions. The proof of lemma 9.11 is relegated to section 9.5. Lemma 9.10 Let {X(t),t∈ T } be a monotone increasing stochastic pro- cess, where T ⊂ R.ThenX is VC-subgraph with index V (X)=2. Proof. Let X be the set of all monotone increasing functions g : T → R; and for any s ∈ T and x ∈X, define (s, x) → fs(x)=x(s). Thus the proof is complete if we can show that the class of functions F≡{fs : s ∈ T } is VC-subgraph with VC index 2. Now let (x1,t1), (x2,t2) be any two points 2 in X×R. F shatters (x1,t1), (x2,t2) if the graph G of (fs(x1),fs(x2)) in R “surrounds” the point (t1,t2)ass ranges over T . By surrounding a point (a, b) ∈ R2, we mean that the graph must pass through all four of the sets {(u, v):u ≤ a, v ≤ b}, {(u, v):u>a,v≤ b}, {(u, v):u ≤ a, v > b} and {(u, v):u>a,v>b}. By the assumed monotonicity of x1 and x2,the graph G forms a monotone curve in R2, and it is thus impossible for it to 2 surround any point in R .Thus(x1,t1), (x2,t2) cannot be shattered by F, and the desired result follows.

Lemma 9.11 The set F of all monotone functions f : R → [0, 1] satisfies

K sup log N( , F,L2(Q)) ≤ , 0 < <1, Q where the supremum is taken over all probability measures Q,andthecon- stant K<∞ is universal.

The final lemma of this section, lemma 9.12 below, addresses the claim raised in section 4.1 that the class of functions F≡{1{Y − bZ ≤ v} : b ∈ Rk,v ∈ R} is Donsker. Because the indicator functions in F are a subset of the indicator functions for half-spaces in Rk+1, part (i) of the lemma implies that F is VC with index k + 3. Since lemma 8.12 from chapter 8 verifies 2 that F, Fδ and F∞ are all P -measurable, for any probability measure P , theorem 9.3 combined with theorem 8.19 and the fact that indicator func- tions are bounded, establishes that F is P -Donsker for any P . Lemma 9.12 also gives a related result on closed balls in Rd. In the lemma, a, b denotes the Euclidean inner product. Lemma 9.12 The following are true: 156 9. Entropy Calculations

(i) The collection of all half-spaces in Rd, consisting of the sets {x ∈ Rd : x, u≤c} with u ranging over Rd and c ranging over R,isVC with index d +2.

(ii) The collection of all closed balls in Rd is VC with index ≤ d +3.

Proof. The class A+ of sets {x : x, u≤c},withu ranging over Rd and c ranging over (0, ∞), is equivalent to the class of sets {x : x, u−1 ≤ 0} with u ranging over Rd. In this last class, since x, u spans a d-dimensional vector space, lemma 9.6 and part (v) of lemma 9.9 yield that the class of functions spanned by x, u−1 is VC with index d+2. Part (iii) of lemma 9.9 combined with part (i) of lemma 9.7 now yields that the class A+ is VC with index d + 2. Similar arguments verify that both the class A−,with c restricted to (−∞, 0), and the class A0,withc = 0, are VC with index d + 2. It is easy to verify that the union of finite VC classes has VC index equal to the maximum of the respective VC indices. This concludes the proofof(i). Closed balls in Rd are sets of the form {x : x, x−2x, u + u, u≤c}, where u ranges over Rd and c ranges over [0, ∞). It is straightforward to check that the class G all functions of the form x →−2x, u + u, u−c are contained in a d + 1 dimensional vector space, and thus G is VC with index ≤ d + 3. Combining this with part (v) of lemma 9.9 yields that the class F = G +x, x is also VC with index d+3. Now the desired conclusion follows from part (iii) lf lemma 9.9 combined with part (i) of lemma 9.7.

9.1.2 BUEI-Classes Recall for a class of measurable functions F,withenvelopeF , the uniform entropy integral δ J(δ, F,L2) ≡ sup log N( F Q,2, F,L2(Q))d , 0 Q where the supremum is taken over all finitely discrete probability measures Q with F Q,2 > 0. Note the dependence on choice of envelope F .This is crucial since there are many random functions which can serve as an envelope. For example, if F is an envelope, then so is F +1and2F .One must allow that different envelopes may be needed in different settings. We say that the class F has bounded uniform entropy integral (BUEI) with envelope F —or is BUEI with envelope F —if J(1, F,L2) < ∞ for that particular choice of envelope. Theorem 9.3 tells us that a VC-class F is automatically BUEI with any envelope. We leave it as an exercise to show that if F and G are BUEI with respective envelopes F and G,thenFGis BUEI with envelope F ∨ G. The following lemma, which is closely related to an important Donsker 9.1 Uniform Entropy 157 preservation theorem in section 9.4 below, is also useful for building BUEI classes from other BUEI classes:

Lemma 9.13 Let F1,...,Fk be BUEI classes with respective envelopes k F1,...,Fk,andletφ : R → R satisfy

k 2 2 2 (9.2) |φ ◦ f(x) − φ ◦ g(x)| ≤ c (fj(x) − gj(x)) , j=1 ∈F ×···×F ∞ for every f,g 1 k and x for a constant 0 0 and a finitely discrete probability measure Q,andlet f,g ∈F1 ×···×Fk satisfy fj −gj Q,2 ≤ Fj Q,2 for 1 ≤ j ≤ k.Now(9.2) implies that 6 7 7k ◦ − ◦ ≤ 8 − 2 φ f φ g Q,2 c fj gj Q,2 j=1 k ≤ c Fj Q,2 j=1 ≤ H.

Hence k N( H, φ ◦ (F1,...,Fk),L2(Q)) ≤ N( Fj Q,2, Fj,L2(Q)), j=1 and the desired result follows since and Q were arbitrary. Some useful consequences of lemma 9.13 are given in the following lemma: Lemma 9.14 Let F and G be BUEI with respective envelopes F and G,andletφ : R → R be a Lipschitz continuous function with Lipschitz constant 0

(iv) φ(F) is BUEI with envelope |φ(f0)| + c(|f0| + F ),providedf0 ∈F. The proof, which we omit, is straightforward. As mentioned earlier, lemma 9.13 is very similar to a Donsker preserva- tion result we will present later in this chapter. In fact, most of the BUEI 158 9. Entropy Calculations preservation results we give in this section have parallel Donsker preser- vation properties. An important exception, and one which is perhaps the primary justification for the use of BUEI preservation techniques, applies to products of Donsker classes. As verified in the following theorem, the prod- uct of two BUEI classes is BUEI, whether or not the two classes involved are bounded (compare with corollary 9.15 below): Theorem 9.15 Let F and G be BUEI classes with respective envelopes F and G.ThenF·G≡{fg : f ∈F,g ∈G}is BUEI with envelope FG. Proof. Fix >0 and a finitely discrete probability measure Q˜ with ∗ 2 2 ∗ FG ˜ > 0, and let dQ ≡ G dQ/˜ G . Clearly, Q is a finitely Q,2 Q,˜ 2 discrete probability measure with F Q∗,2 > 0. Let f1,f2 ∈Fsatisfy f1 − f2 Q∗,2 ≤ F Q∗,2.Then

f1 − f2 ∗ 2 (f1 − f2)G ˜ 2 ≥ Q , = Q, , ∗ F Q ,2 FG Q,˜ 2 and thus, if we let F·G ≡{fG : f ∈F},

∗ F· ˜ ≤ ∗ F N( FG Q,˜ 2, G, L2(Q)) N( F Q ,2, ,L2(Q )) (9.3) ≤ sup N( F Q,2, F,L2(Q)), Q where the supremum is taken over all finitely discrete probability measures Q for which F Q,2 > 0. Since the right-hand-side of (9.3) does not depend ˜ ˜ on Q, and since Q satisfies FG Q,˜ 2 > 0 but is otherwise arbitrary, we have that

sup N( FG Q,2, F·G, L2(Q)) ≤ sup N( F Q,2, F,L2(Q)), Q Q where the supremums are taken over all finitely discrete probability mea- sures Q but with the left side taken over the subset for which FG Q,2 > 0 while the right side is taken over the subset for which F Q,2 > 0. We can similarly show that the uniform entropy numbers for the class G·F with envelope FG is bounded by the uniform entropy numbers for G with envelope G.Since|f1g1 − f2g2|≤|f1 − f2|G + |g1 − g2|F for all f1,f2 ∈F and g1,g2 ∈G, the forgoing results imply that

sup N( FG Q,2, F·G,L2(Q)) ≤ sup N( F Q,2/2, F,L2(Q)) Q Q

× sup N( G Q,2/2, G,L2(Q)), Q where the supremums are all taken over the appropriate subsets of all finitely discrete probability measures. After taking logs, square roots, and then integrating both sides with respect to , the desired conclusion follows. 9.1 Uniform Entropy 159

In order for BUEI results to be useful for obtaining Donsker results, it is necessary that sufficient measurability be established so that theorem 8.19 can be used. As shown in proposition 8.11 and the comments following theorem 8.19, pointwise measurability (PM) is sufficient measurability for this purpose. Since there are significant similarities between PM preserva- tion and BUEI preservation results, one can construct useful joint PM and BUEI preservation results. Here is one such result:

Lemma 9.16 Let the classes F1,...,Fk be both BUEI and PM with re- k spective envelopes F1,...,Fk,andletφ : R → R satisfy (9.2) for ev- ery f,g ∈F1 × ··· × Fk and x for a constant 0

(vi) φ(F) is both BUEI and PM with envelope |φ(f0)| + c(|f0| + F ),where f0 ∈F. Proof. Verifying (i) is straightforward. Results (ii), (iii), (iv) and (vi) are consequences of lemma 9.16. Result (v) is a consequence of lemma 8.10 and theorem 9.15. If a class of measurable functions F is both BUEI and PM with enve- lope F , then theorem 8.19 implies that F is P -Donsker whenever P ∗F 2 < ∞. Note that we have somehow avoided discussing preservation for subsets of classes. This is because it is unclear whether a subset of a PM class F is itself a PM class. The difficulty is that while F may have a countable dense subset G (dense in terms of pointwise convergence), it is unclear whether any arbitrary subset H⊂Falso has a suitable countable dense subset. An easy way around this problem is to use various preservation results to establish that F is P -Donsker, and then it follows directly that any H⊂F is also P -Donsker by the definition of weak convergence. We will explore several additional preservation results as well as several practical examples later in this chapter and in the case studies of chapter 15. 160 9. Entropy Calculations 9.2 Bracketing Entropy

We now present several useful bracketing entropy results for certain func- tion classes as well as a few preservation results. We first mention that bracketing numbers are in general larger than covering numbers, as veri- fied in the following lemma: Lemma 9.18 Let F be any class of real function on X and · any norm on F.Then N( , F, · ) ≤ N[]( , F, · ) for all >0. Proof. Fix >0, and let B be collection of -brackets that covers F. From each bracket B ∈B, take a function g(B) ∈ B ∩F to form a finite colleciton of functions G⊂Fof the same cardinality as B consisting of one function from each bracket in B. Now every f ∈Flies in a bracket B ∈Bsuch that f − g(B) ≤ by the definition of an -bracket. Thus G is an cover of F of the same cardinality as B. The desired conclusion now follows. The first substantive bracketing entropy result we present considers classes of smooth functions on a bounded set X⊂Rd. For any vector K = k (k1,...,kd) of nonnegative integers define the differential operator D ≡ k1 kd ≡ ··· ∂k•/(∂x1 ,...,∂xd ), where k• k1 + + kd. As defined previously, let x be the largest integer j ≤ x, for any x ∈ R. For any function f : X → R and α>0, define the norm

| k − k | ≡ | k | D f(x) D f(y) f α max sup D f(x) +maxsup α−α , k•≤α x k:k• =α x,y x − y where the suprema are taken over x = y in the interior of X .Nowlet α X X → R ≤ CM ( ) be the set of all continuous functions f : with f α M. Recall that for a set A in a metric space, diam A =supx,y∈A d(x, y). We have the following theorem: Theorem 9.19 Let X⊂Rd be bounded and convex with nonempty in- terior. There exists a constant K<∞ depending only on α, diam X ,and d such that d/α α 1 log N[]( , C (X ),L (Q)) ≤ K , 1 r for every r ≥ 1, >0, and any probability measure Q on Rd. This is corollary 2.7.2 of VW, and we omit the proof. We now consider several results for Lipschitz and Sobolev function classes. We first present the results for covering numbers based on the uniform norm and then present the relationship to bracketing entropy. 9.2 Bracketing Entropy 161

Theorem 9.20 For a compact, convex subset C ⊂ Rd,letF be the class of all convex functions f : C → [0, 1] with |f(x) − f(y)|≤L x − y for ≥ G every x, y. For some integer m 1,let be the class of all functions 1 (m) 2 g :[0, 1] → [0, 1] with 0 [g (x)] dx ≤ 1,wheresuperscript(m) denotes the m’th derivative. Then d/2 d/2 1 log N( , F, · ∞) ≤ K(1 + L) , and

1 1 /m log N( , G, · ∞) ≤ M ,

where · ∞ is the uniform norm and the constant K<∞ depends only on d and C and the constant M depends only on m. The first displayed result is corollary 2.7.10 of VW, while the second dis- played result is theorem 2.4 of van de Geer (2000). We omit the proofs. The following lemma shows how theorem 9.20 applies to bracketing en- tropy:

Lemma 9.21 For any norm · dominated by · ∞ and any class of functions F, log N[](2 , F, · ) ≤ log N( , F, · ∞), for all >0.

Proof. Let f1,...,fm be a uniform -cover of F.Sincethe2 -brackets [fi − , fi + ]nowcoverF, the result follows. We now present a second Lipschitz continuity result which is in fact a generalization of lemma 9.21. The result applies to function classes of the form F = {ft : t ∈ T },where

(9.4) |fs(x) − ft(x)|≤d(s, t)F (x) for some metric d on T , some real function F on the sample space X ,and for all x ∈X. This special Lipschitz structure arises in a number of settings, including parametric Z- and M- estimation. For example, consider the least absolute deviation regression setting of section 2.2.6, under the assumption that the random covariate U and regression parameter θ are constrained to known compact subsets U, Θ ⊂ Rp. Recall that, in this setting, the outcome given U is modeled as Y = θU + e, where the residual error e has median zero. Estimation of the true parameter value θ0 is accomplished by minimizing θ → Pnmθ,wheremθ(X) ≡|e − (θ − θ0) U|−|e|, X ≡ (Y,U) and e = Y − θ0U. From (2.20) in section 2.2.6, we know that the class F = {mθ : θ ∈ Θ} satisfies (9.4) with T =Θ,d(s, t)= s − t and F (x)= u ,wherex =(y,u) is a realization of X. The following theorem shows that the bracketing numbers for a general F satisfying (9.4) are bounded by the covering numbers for the associated index set T . 162 9. Entropy Calculations

Theorem 9.22 Suppose the class of functions F = {ft : t ∈ T } satis- fies (9.4) for every s, t ∈ T and some fixed function F . Then, for any norm · , N[](2 F , F, · ) ≤ N( , T, d).

Proof. Note that for any -net t1,...,tk that covers T with respect to − F d, the brackets [ftj F, ftj + F ]cover . Since these brackets are all of size 2 F , the proof is complete. Note that when · is any norm dominated by · ∞, theorem 9.22 simplifies to lemma 9.21 when T = F and d = · ∞ (and thus automatically F =1). We move now from continuous functions to monotone functions. As was done in lemma 9.11 above for uniform entropy, we can study bracketing entropy of the class of all monotone functions mapping into [0, 1]: Theorem 9.23 For each integer r ≥ 1, there exists a constant K<∞ such that the class F of monotone functions f : R → [0, 1] satisfies

K log N[]( , F,L (Q)) ≤ , r for all >0 and every probability measure Q. The lengthy proof, which we omit, is given in chapter 2.7 of VW. We now briefly discuss preservation results. Unfortunately, it appears that there are not as many useful preservation results for bracketing entropy as there are for uniform entropy, but the following lemma contains two such results which are easily verified: Lemma 9.24 Let F and G be classes of measurable function. Then for any probability measure Q and any 1 ≤ r ≤∞,

(i) N[](2 , F + G,Lr(Q)) ≤ N[]( , F,Lr(Q))N[]( , G,Lr(Q)); (ii) Provided F and G are bounded by 1,

N[](2 , F·G,Lr(Q)) ≤ N[]( , F,Lr(Q))N[]( , G,Lr(Q)).

The straightforward proof is saved as an exercise.

9.3 Glivenko-Cantelli Preservation

In this section, we discuss methods which are useful for building up Glivenko- Cantelli (G-C) classes from other G-C classes. Such results can be useful for establishing consistency for Z- and M- estimators and their bootstrapped versions. It is clear from the definition of P -G-C classes, that if F and G are P -G-C, then F∪Gand any subset thereof is also P -G-C. The purpose 9.3 Glivenko-Cantelli Preservation 163 of the remainder of this section is to discuss more substantive preservation results. The main tool for this is the following theorem, which is a minor modification of theorem 3 of van der Vaart and Wellner (2000) and which we give without proof:

Theorem 9.25 Suppose that F1,...,Fk are strong P -G-C classes of ∞ Rk → R functions with max1≤j≤k P Fj < , and that φ : is continu- ous. Then the class H≡φ(F1,...,Fk) is strong P -G-C provided it has an integrable envelope. The following corollary lists some obvious consequences of this theorem: Corollary 9.26 Let F and G be P -G-C classes with respective inte- grable envelopes F and G. Then the following are true: (i) F + G is P -G-C. (ii) F·Gis P -G-C provided P [FG] < ∞. (iii) Let R be the union of the ranges of functions in F,andletψ : R → R be continuous. Then ψ(F) is P -G-C provided it has an integrable envelope. Proof. The statement (i) is obvious. Since (x, y) → xy is continuous in R2, statement (ii) follows from theorem 9.25. Statement (iii) also follows from the theorem since ψ has a continuous extension to R, ψ˜, such that P ψ˜(f) F = Pψ(f) F . It is interesting to note that the “preservation of products” result in part (ii) of the above corollary does not hold in general for Donsker classes (although, as was shown in section 9.1.2, it does hold for BUEI classes). This preservation result for G-C classes can be useful in formulating master theorems for bootstrapped Z- and M- estimators. Consider, for example, verifying the validity of the bootstrap for a parametric Z-estimator θˆn which is a zero of θ → Pnψθ,forθ ∈ Θ, where ψθ is a suitable random function. Let Ψ(θ)=Pψθ, where we assume that for any sequence {θn}∈Θ, Ψ(θn) → 0 implies θn → θ0 ∈ Θ (ie., the parameter is identifiable). Usually, to obtain consistency, it is reasonable to assume that the class {ψθ,θ ∈ Θ) is P -G-C. as∗ Clearly, this condition is sufficient to ensure that θˆn → θ0. Now, under a few additional assumptions, the Z-estimator master the- √orem, theorem 2.11 can be applied, to obtain asymptotic normality of n(θˆn − θ0). In section 2.2.5, we made the claim that if Ψ is appropriately differentiable and the parameter is identifiable (as defined in the previous paragraph), sufficient additional conditions for this asymptotic normality to hold and for the bootstrap to be valid are that the {ψθ : θ ∈ Θ} is | | ∞ { ∈ − ≤ } strong P -G-C with supθ∈Θ P ψθ < ,that ψθ : θ Θ, θ θ0 δ is − 2 → → P -Donsker for some δ>0, and that P ψθ ψθ0 0asθ θ0.Aswe will see in chapter 13, where we present the arguments for this result in detail, an important step in the proof of bootstrap validity is to show that 164 9. Entropy Calculations

ˆ◦ the bootstrap estimate θn is unconditionally consistent for θ0.Ifweusea weighted bootstrap with i.i.d. non-negative weights ξ1,...,ξn,whichare independent of the data and which satisfy Eξ1 = 1, then result (ii) from the above corollary tells us that F≡{ξψθ : θ ∈ Θ} is P -G-C. This follows since both classes of functions {ξ} (a trivial class with one member) and {ψθ : θ ∈ Θ} are P -G-C and since the product class F has an integral envelope by lemma 8.13. Note here that we are tacetly augmenting P to be the product probability measure of both the data and the independent bootstrap weights. We will expand on these ideas in section 10.3 of the next chapter for the special case where Θ ⊂ Rp and in chapter 13 for the more general case. Another result which can be useful for inference is the following lemma on covariance estimation. We mentioned this result in the first paragraph of section 2.2.3 in the context of conducting uniform inference for Pf as f ranges over a class of functions F. The lemma answers the question of when the limiting covariance of Gn, indexed by F, can be consistently estimated. Recall that this covariance is σ(f,g) ≡ Pfg− PfPg, and its estimator is σˆ(f,g) ≡ Pnfg− PnfPng. Although knowledge of this covariance matrix is usually not sufficient in itself to obtain inference on {Pf : f ∈F}, it still provides useful information. as∗ Lemma 9.27 Let F be Donsker. Then σˆ(f,g)− σ(f,g) F·F → 0 if and ∗ 2 only if P f − Pf F < ∞. Proof. Note that since F is Donsker, F is also G-C. Hence F≡{˙ f˙ : f ∈F}is G-C, where for any f ∈F, f˙ = f − Pf. Now we first assume ∗ 2 that P f − Pf F < ∞. By theorem 9.25, F·˙ F˙ is also G-C. Uniform consistency ofσ ˆ now follows since, for any f,g ∈F,ˆσ(f,g) − σ(f,g)= as∗ (Pn − P )f˙g˙ − Pnf˙Png˙. Assume next that σˆ(f,g) − σ(f,g) F·F → 0. This ∗ 2 implies that F·˙ F˙ is G-C. Now lemma 8.13 implies that P f − Pf F = ∗ P fg F˙ ·F˙ < ∞. We close this section with the following theorem which provided several interesting necessary and sufficient conditions for F to be strong G-C: Theorem 9.28 Let F be a class of measurable functions. Then the fol- lowing are equivalent: (i) F is strong P -G-C; ∗ ∗ (ii) E Pn − P F → 0 and E f − Pf F < ∞;

P ∗ (iii) Pn − P F → 0 and E f − Pf F < ∞.

Proof. Since Pn − P does not change when the class F is replaced by {f − Pf : f ∈F}, we will assume hereafter that P F = 0 without loss of generality. ∗ (i)⇒(ii): That (i) implies E f F < ∞ follows from lemma 8.13. Fix 0

∗ ∗ (9.5) E Pn − P F ≤ E (Pn − P )f × 1{F ≤ M} F ∗ +2E [F × 1{F>M}] .

By assertion (ii) of corollary 9.26, F·1{F ≤ M} is strong P -G-C, and thus the first term on the right of (9.5) → 0 by the bounded convergence theorem. Since the second term on the right of (9.5) can be made arbitrarily small by increasing M, the left side of (9.5) → 0, and the desired result follows. (ii)⇒(iii): This is obvious. (iii)⇒(i): By the assumed integrability of the envelope F , lemma 8.16 can ∗ be employed to verify that there is a version of Pn − P F that converges outer almost surely to a constant. Condition (iii) implies that this constant must be zero.

9.4 Donsker Preservation

In this section, we describe several techniques for building Donsker from other Donsker classes. The first theorem, theorem 9.29, gives results for subsets, pointwise closures and symmetric convex hulls of Donsker classes. The second theorem, theorem 9.30, presents a very powerful result for Lip- schitz functions of Donsker classes. The corollary that follows present con- sequences of this theorem that are quite useful in statistical applications. For a class F of real-valued, measurable functions on the sample space (P,2) X ,letF be the set of all f : X → R for which there exists a sequence {fm}∈Fsuch that fm → f both pointwise (ie., for every argument x ∈X) (P,2) and in L2(P ). Similarly, let sconv F be the pointwise and L2(P )closure of sconv F defined in section 9.1.1. Theorem 9.29 Let F be a P -Donsker class. Then

(i) For any G⊂F, G is P -Donsker.

(P,2) (ii) F is P -Donsker.

(iii) sconv(P,2) F is P -Donsker.

Proof. The proof of (i) is obvious by the facts that weak convergence consists of marginal convergence plus asymptotic equicontinuity and that the maximum modulus of continuity does not increase when maximizing over a smaller set. For (ii), one can without loss of generality assume that (P,2) both F and F are mean zero classes. For a class of measurable functions G, denote the modulus of continuity

MG(δ) ≡ sup |Gn(f − g)|. f,g∈G:f−gP,2 <δ 166 9. Entropy Calculations

(P,2) Fix δ>0. We can choose f,g ∈ F such that |Gn(f − g)| is arbitrarily − ∈F close to MF (P,2) (δ)and f g P,2 <δ. We can also choose f∗,g∗ such that f − f∗ P,2 and g − g∗ P,2 are arbitrarily small (for fixed data). ≤ Thus MF (P,2) (δ) MF (2δ). Since δ>0 was arbitrary, we obtain that (P,2) asymptotic equicontinuity in probability for F follows from asymptotic equicontinuity in probability of {Gn(f):f ∈F}. Part (iii) is theorem 2.10.3 of VW, and we omit its proof. The following theorem, theorem 2.10.6 of VW, is one of the most useful Donsker preservation results for statistical applications. We omit the proof. Theorem 9.30 F F Let 1,..., k be Donsker classes with max1≤i≤k P Fi < ∞.Letφ : Rk → R satisfy

k 2 2 2 |φ ◦ f(x) − φ ◦ g(x)| ≤ c (fi(x) − gi(x)) , i=1 for every f,g ∈F1 ×···×Fk and x ∈X and for some constant c<∞. Then φ ◦ (F1,...,Fk) is Donsker provided φ ◦ f is square integrable for at least one f ∈F1 ×···×Fk. The following corollary is a useful consequence of this theorem: Corollary 9.31 Let F and G be Donsker classes. Then:

(i) F∪Gand F + G are Donsker.

(ii) If P F∪G < ∞, then the classes of pairwise infima, F∧G,and pairwise suprema, F∨G, are both Donsker.

(iii) IF F and G are both uniformly bounded, F·Gis Donsker.

(iv) If ψ : R → R is Lipschitz continuous, where R is the range of func- tions in F,and ψ(f) P,2 < ∞ for at least one f ∈F,thenψ(F) is Donsker.

(v) If P F < ∞ and g is a uniformly bounded, measurable function, then F·g is Donsker.

Proof. For any measurable function f,letf˙ ≡ f − Pf. Also define F≡{˙ f˙ : f ∈F}and G≡{˙ f˙ : f ∈G}. Note that for any f ∈Fand g ∈G, Gnf = Gnf˙, Gng = Gng˙ and Gn(f +g)=Gn(f˙+˙g). Hence F∪G is Donsker if and only if F∪˙ G˙ is Donsker; and, similarly, F + G is Donsker if and only F˙ + G˙ is Donsker. Clearly, P F∪˙ G˙ = 0. Hence Lipschitz continuity of the map (x, y) → x + y on R2 yields that F˙ + G˙ is Donsker, via theorem 9.30. Hence also F + G is Donsker. Since F∪˙ G˙ is contained in the union of F˙ ∪{0} and G∪{˙ 0}, F∪˙ G˙ is Donsker and hence so is F∪G.Thuspart(i) follows. 9.5 Proofs 167

Proving parts (ii) and (iv) is saved as an exercise. Part (iii) follows since the map (x, y) → xy is Lipschitz continuous on bounded subsets of R2. For part (v), note that for any f1,f2 ∈F, |f1(x)g(x) − f2(x)g(x)|≤ g ∞|f1(x) − f2(x)|. Hence φ ◦{F, {g}},whereφ(x, y)=xy,isLipschitz continuous in the required manner.

9.5 Proofs

Proof of theorem 9.3. Let C denote the set of all subgraphs Cf of func- tions f ∈F. Note that for any probability measure Q on X and any f,g ∈F, Q|f − g| = |1{t

=(Q × λ)(Cf ΔCg), where λ is Lebesgue measure, AΔB ≡ A∪B −A∩B foranytwosetsA, B, and the second equality follows from Fubini’s theorem. Construct a proba- bility measure P on X×R by restricting Q×λ to the set {(x, t):|t|≤F (x)} and letting P =(Q × λ)/(2 F Q,1). Now Q|f − g| =2 F Q,1P |1{Cf }− 1{Cg}|. Thus, by theorem 9.2 above,

(9.6) N( 2 F Q,1, F,L1(Q)) = N( , C,L1(P )) (C)−1 1 V ≤ KV (C)(4e)V (C) .

For r>1, we have ) * Q|f − g|r ≤ Q |f − g|(2F )r−1 =2r−1R|f − g|QF r−1, where R is the probability measure with density F r−1/QF r−1 with respect to Q.Thus − ≤ 1−1/r − 1/r r−1 1/r f g Q,r 2 f g R,1 QF r−1 1/r 1 QF =21−1/r f − g /r F , R,1 Q,r QF r which implies 1/r f − g f − g 1 Q,r ≤ R, . 2 F Q,r 2 F R,1 r Hence N( 2 F Q,r, F,Lr(Q)) ≤ N( 2 F R,1, F,L1(R)). Since (9.6) ap- plies equally well with Q replaced by R,wenowhavethat 168 9. Entropy Calculations

( (C)−1) 1 r V N( 2 F , F,L (Q)) ≤ KV (C)(4e)V (C) . Q,r r The desired result now follows by bringing the factor 2 in the left-hand-side over to the numerator of 1/ in the right-hand-side. Proofoflemma9.9.We leave it as an exercise to show that parts (i) and (ii) follow from parts (ii) and (iii) of lemma 9.7. To prove (iii), note first that the sets {x : f(x) > 0} are (obviously) one-to-one images of the sets {(x, 0) : f(x) > 0} which are the intersections of the open subgraphs {(x, t):f(x) >t} with the set X×{0}. Since a single set has VC index 1, the result now follows by applying (ii) and then (iv) of lemma 9.7. The subgraphs of −F are the images of the open supergraphs {(x, t):t>f(x)} under the one-to-one map (x, t) → (x, −t). Since the open supergraphs are the complements of the closed subgraphs {(x, t):t ≥ f(x)}, they have the same VC-index as F by lemma 9.32 below and by part (i) of lemma 9.7. Part (v) follows from the fact that F + g shatters a given set of points F (x1,t1),...,(xn,tn) if and only if shatters (x1,t1),...,(xn,tn), where − ≤ ≤ ∈F ti = ti g(xi), 1 i n. For part (vi), note that for any f the + ≡{ subgraph of fg is the union of the sets Cf (x, t):t } − ≡{ } 0 ≡{ 0 , Cf (x, t):t 0}×R, X∩{x : g(x) < 0}×R and X∩{x : g(x)=0}×R, with respective VC indices + bounded by VF , VF and 1. Consider first C on X∩{x : g(x) < 0}.Note + that the subset (x1,t1),...,(xm,tm) is shattered by C if and only if the subset (x1,t1/g(x1)),...,(xm,tm/g(xm)) is shattered by the subgraphs of + F. Thus the VC-index of C on the relevant subset of X×R is VF .The same VC-index occurs for C−, but the VC-index for C0 is clearly 1. This concludes the proof of (vi). For (vii), the result follows from part (vi) of lemma 9.7 since the sub- graphs of the class F◦ψ are the inverse images of the subgraphs of F uner the map (z,t) → (ψ(z),t). For part (viii), suppose that the subgraphs of φ ◦F shatter the set of points (x1,t1),...,(xn,tn). Now choose f1,...,fm n from F so that the subgraphs of the functions φ ◦ fj pick out all m =2 subsets. For each 1 ≤ i ≤ n, define si to be the largest value of fj(xi)over those j ∈{1,...,m} for which φ(fj(xi)) ≤ ti. Now note that fj(xi) ≤ si if and only if φ(fj(xi)) ≤ ti, for all 1 ≤ i ≤ n and 1 ≤ j ≤ m. Hence the subgraphs of f1,...,fm shatter the points (x1,s1),...,(xn,sn). Proof of lemma 9.11. First consider the class H+,r of monotone in- creasing, right-continuous functions h : R → [0, 1]. For each h ∈H+,r, define h−1(t) ≡ inf{x : h(x) ≥ t}, and note that for any x, t ∈ R, −1 h(x) ≥ t if and only if x ≥)h (t). Thus the class of* indicator functions −1 {1{h(x) ≥ t} : h ∈H+,r} = 1{x ≥ h (t)} : h ∈H+,r ⊂{1{x ≥ t} : t ∈ R}. Since the last class of sets has VC index 2, the first class is also VC 9.5 Proofs 169 with index 2. Since each function h ∈H+,r is the pointwise limit of the sequence m 1 j h = 1 h(x) ≥ , m m m j=1 we have that H+,r is contained in the closed convex hull of a VC-subgraph class with VC index 2. Thus, by corollary 9.5, we have for all 0 < <1,

K0 sup log N( , H+,r,L2(Q)) ≤ , Q where the supremum is taken over all probability measures Q and the con- stant K0 is universal. Now consider the class H+,l of monotone increasing, left-continuous functions, and define h˜−1(x) ≡ sup{x : h(x) ≤ t}.Nownote that for any x, t ∈ R, h(x) >tif and only if x>h˜−1(t). Arguing as before, we deduce that {1{h(x) >t} : h ∈H+,l} is a VC-class with index 2. Since each h ∈H+,l is the pointwise limit of the sequence m 1 j h = 1 h(x) > , m m m j=1 we can again apply corollary 9.5 to arrive at the same uniform entropy bound we arrived at for H+,r. Now let H+ be the class of all monotone increasing functions h : R → [0, 1], and note that each h ∈H+ can be written as hr +hl,wherehr ∈H+,r (1) (2) and hl ∈H+,l. Hence for any probability measure Q and any h ,h ∈ H (1) − (2) ≤ (1) − (2) (1) − (2) (1) (2) ∈ +, h h Q,2 hr hr Q,r + hl hl Q,2,wherehr ,hr H (1) (2) ∈H H ≤ H +,r and hl ,hl +,l.ThusN( , +,L2(Q)) N( /2, +,r,L2(Q)) ×N( /2, H+,l,L2(Q)), and hence

K1 sup log N( , H+,L2(Q)) ≤ , Q where K1 =4K0. Since any monotone decreasing function g : R → [0, 1] can be written as 1 − h,whereh ∈H+, the uniform entropy numbers for the class of all monotone functions f : R → [0, 1], which we denote F, is log(2) plus the uniform entropy numbers for H+.Since0< <1, we obtain√ the desired√ conclusion giveninthestatementofthelemma,with K = 2K1 = 32K0. Lemma 9.32 Let F be a set of measurable functions on X . Then the closed subgraphs have the same VC-index as the open subgraphs. Proof. Suppose the closed subgraphs (the subgraphs of the form {(x, t): t ≤ f(x)}) shatter the set of points (x1,t1),...,(xn,tn). Now select out of n F functions f1,...,fm whose closed subgraphs shatter all m =2 subsets. 170 9. Entropy Calculations

Let =(1/2) inf{ti − fj(xi):ti − fj(xi) > 0}, and note that the open subgraphs (the subgraphs of the form {(x, t),t < f(x)})ofthef1,...,fm shatter the set of points (x1,t1 − ),...,(xn,tn − ). This follows since, by construction, ti − ≥ fj(xi) if and only if ti >fj(xi), for all 1 ≤ i ≤ n and 1 ≤ j ≤ m. Now suppose the open subgraphs shatter the set of points (x1,t1),...,(xn,tn). Select out of F functions f1,...,fm whose n open subgraphs shatter all m =2 subsets, and let =(1/2) inf{fj(xi) − ti : fj(xi) − ti > 0}. Note now that the closed subgraphs of f1,...,fm shatter the set of points (x1,t1 + ),...,(xn,tn + ), since, by construction, ti

9.6 Exercises

9.6.1. Show that sconvF⊂conv G,whereG = F∪{−F}∪{0}.

9.6.2. Show that the expression N( aF bQ,r,aF,Lr(bQ)) does not de- pend on the constants 0

(a) Show that both G1 and G2 are Donsker even though neither U nor e are bounded. Hint: Use BUEI preservation results. (b) Verify that both −1 z − u sup h L H(du) − H˙ (z) = O(h) z∈[a+h,b−h] R h 9.7 Notes 171

and sup H˙ (z) − H˙ (a + h) ∨ sup H˙ (z) − H˙ (b − h) z∈[a,a+h) z∈(b−h,b]

= O(h).

(c) Show that F1 is Donsker and F2 is Glivenko-Cantelli. 9.6.9. Prove lemma 9.24. 9.6.10. Consider the class F of all functions f :[0, 1] → [0, 1] such that |f(x) − f(y)|≤|x − y|. Show that a set of -brackets can be constructed to cover F with cardinality bounded by exp(C/ )forsome00, and let n be the smallest integer ≥ 3/ . For any p = n+1 (k0,...,kn) ∈ P ≡{1,...,n} , define the pathp ¯ to be the collection of all function in F such that f ∈ p¯ only if f(i/n) ∈ [(ki − 1)/n, ki/n] for all i =0...n. Show that for all f ∈F,iff(i/n) ∈ [j/n,(j +1)/n], then i +1 (j − 1) ∨ 0 (j +2)∧ n f ∈ , n n n for i, j =0,...,n− 1. Show that this implies that the number of paths of the formp ¯ for p ∈ P needed to “capture” all elements of F is bounded by n3n. Now show that for each p ∈ P , there exists a pair of right-continuous “bracketing” functions Lp,Up :[0, 1] → [0, 1] such that ∀x ∈ [0, 1], Lp(x) < Up(x), Up(x) − Lp(x) ≤ 3/n ≤ ,andLp(x) ≤ f(x) ≤ Up(x) for all f ∈ p¯. Now complete the proof.

9.6.11. Show that if F is Donsker with P F < ∞ and f ≥ δ for all f ∈F and some constant δ>0, then 1/F≡{1/f : f ∈F}is Donsker. 9.6.12. Complete the proof of corollary 9.31:

1. Prove part (ii). Hint: show first that for any real numbers a1,a2,b1,b2, |a1 ∧ b1 − a2 ∧ b2|≤|a1 − a2| + |b1 − b2|. 2. Prove part (iv).

9.7 Notes

Theorem 9.3 is a minor modification of theorem 2.6.7 of VW. Corollary 9.5, lemma 9.6 and theorem 9.22 are corollary 2.6.12, lemma 2.6.15 and the- orem 2.7.11, respectively, of VW. Lemmas 9.7 and 9.9 are modification of lemmas 2.6.17 and 2.6.18, respectively, of VW. Lemma 9.11 was sug- gested by example 2.6.21 of VW, and lemma 9.12 is a modification of ex- ercise 2.6.14 of VW. Parts (i) and (ii) of theorem 9.29 are theorems 2.10.1 172 9. Entropy Calculations and 2.10.2, respectively, of VW. Corollary 9.31 includes some modifications of examples 2.10.7, 2.10.8 and 2.10.10 of VW. Lemma 9.32 was suggested by exercise 2.6.10 of VW. Exercise 9.6.10 is a modification of exercise 19.5 of van der Vaart (1998). The bounded uniform entropy integral (BUEI) preservation techniques presented here grew out of the author’s work on estimating equations for functional data described in Fine, Yan and Kosorok (2004). This is page 173 Printer: Opaque this 10 Bootstrapping Empirical Processes

The purpose of this chapter is to obtain consistency results for bootstrapped empirical processes. These results can then be applied to many kinds of bootstrapped estimators since most estimators can be expressed as func- tionals of empirical processes. Much of the bootstrap results for such esti- mators will be deferred to later chapters where we discuss the functional delta method, Z-estimation and M-estimation. We do, however, present one specialized result for parametric Z-estimators in section 3 of this chapter as a practical illustration of bootstrap techniques. We note that both conditional and unconditional bootstrap consistency results can be useful depending on the application. For the conditional bootstrap, the goal is to establish convergence of the conditional law given the data to an unconditional limit law. This convergence can be either in probability or outer almost sure. While the later convergence is certainly stronger, convergence in probability is usually sufficient for statistical ap- plications. The best choice of bootstrap weights for a given statistical application is also an important question, and the answer depends on the application. While the multinomial bootstrap is conceptually simple, its use in survival analysis applications may result in too much tied data. In the presence of censoring, it is even possible that a bootstrap sample could be drawn which consists of only censored observations. To avoid complications of this kind, it may be better to use the Bayesian bootstrap (Rubin, 1981). The weights for the Bayesian bootstrap are ξ1/ξ,...,ξ¯ n/ξ¯,whereξ1,...,ξn are i.i.d. standard exponential (mean and variance 1), independent of the data ¯ ≡ −1 n X1,...,Xn,andwhereξ n i=1 ξi. Since these weights are strictly 174 10. Bootstrapping Empirical Processes positive, all observations are represented in each bootstrap realization, and the aforementioned problem with tied data won’t happen unless the original data has ties. Both the multinomial and Bayesian bootstraps are included in the bootstrap weights we discuss in this chapter. The multinomial weighted bootstrap is sometimes called the nonparamet- ric bootstrap since it amounts to sampling from the empirical distribution which is a nonparametric estimate of the true distribution. In contrast, the parametric bootstrap is obtained by sampling from a parametric estimate ˆ ˆ Pθn of the true distribution, where θn is a consistent estimate of the true value of θ (see, for example, chapter 1 of Shao and Tu, 1995). A detailed discussion of the parametric bootstrap is beyond the scope of this chapter. Another kind of bootstrap is the exchangeable weighted bootstrap, which we only mention briefly in lemma 10.18 below. This lemma is needed for the proof of theorem 10.15. We also note that the asymptotic results of this chapter are all first order, and in this situation the limiting results do not vary among those schemes that satisfy the stated conditions. A more refined analysis of differences between weighting schemes is beyond the scope of this chapter, but such differences may be important in small samples. A good reference for higher order properties of the bootstrap is Hall (1992). The first section of this chapter considers unconditional and conditional convergence of bootstrapped empirical processes to limiting laws when the class of functions involved is Donsker. The main result of this section is a proof of theorems 2.6 and 2.7 of section 2.2.3. 
At the end of the section, we present several special continuous mapping results for bootstrapped pro- cesses. The second section considers parallel results when the function class involved is Glivenko-Cantelli. In this case, the limiting laws are degenerate, i.e., constant with probability 1. Such results are helpful for establishing consistency of bootstrapped estimators. The third section presents the sim- ple Z-estimator illustration promised above. Throughout this chapter, we will sometimes for simplicity omit the subscript when referring to a rep- resentative of an i.i.d. sample. For example, we may use E|ξ| to refer to E|ξ1|,whereξ1 is the first member of the sample ξ1,...,ξn. The context will make the meaning clear.

10.1 The Bootstrap for Donsker Classes

The overall goal of this section is to prove the validity of the bootstrap central limit theorems given in theorems 2.6 and 2.7 of chapter 2. Both unconditional and conditional multiplier central limit theorems play a piv- otal role in this development and will be presented first. At the end of the section, we also present several special continuous mapping results which apply to bootstrapped processes. These results allow the construction of 10.1 The Bootstrap for Donsker Classes 175 asymptotically uniformly valid confidence bands for {Pf : f ∈F}when F is Donsker.

10.1.1 An Unconditional Multiplier Central Limit Theorem In this section, we present a multiplier central limit theorem which forms the basis for proving the unconditional central limit theorems of the next section. We also present an interesting corollary. For a real random vari- ∞ able ξ, recall from section 2.2.3 the quantity ξ 2,1 ≡ 0 P(|ξ| >x)dx. Exercise 10.5.1 below verifies this is a norm which is slightly larger than · 2. Also recall that δXi is the probability measure that assigns a mass P −1 n G −1/2 n − of 1 to Xi so that n = n i=1 δXi and n = n i=1(δXi P ). Theorem 10.1 (Multiplier central limit theorem) Let F be a class of measurable functions, and let ξ1,..., ξn be i.i.d. random variables with ∞ mean zero, variance 1, and with ξ 2,1 < , independent of the sample G ≡ −1/2 n − G ≡ −1/2 n data X1,...,Xn.Let n n i=1 ξi(δXi P ) and n n i=1 − ¯ ¯ ≡ −1 n (ξi ξ)δXi ,whereξ n i=1 ξi. Then the following are equivalent: (i) F is P -Donsker; G ∞ F (ii) n converges weakly to a tight process in  ( ); G  G ∞ F (iii) n in  ( ); G  G ∞ F (iv) n in  ( ). Before giving the proof of this theorem, we will need the following tool. This lemma is lemma 2.9.1 of VW, and we give it without proof:

Lemma 10.2 (Multiplier inequalities) Let Z1,...,Zn be i.i.d. stochastic ∗ processes, with index F such that E Z F < ∞, independent of the i.i.d. Rademacher variables 1,..., n. Then for every i.i.d. sample ξ1,...,ξn of real, mean-zero random variables independent of Z1,...,Zn,andany 1 ≤ n0 ≤ n, n n 1 ∗ 1 ∗ 1 ξ 1E √ Z ≤ E √ ξ Z 2 n i i n i i i=1 F i=1 F

∗ |ξi| ≤ 2(n0 − 1)E Z F Emax√ 1≤i≤n n √ k ∗ 1 +2 2 ξ 2,1 max E √ iZi . n0≤k≤n k i=n0 F √ When the ξi are symmetrically distributed, the constants 1/2,2and2 2 can all be replaced by 1. 176 10. Bootstrapping Empirical Processes

G G G G Proof of theorem 10.1. Note that the processes , n, n and n do not change if they are indexed by F≡{˙ f − Pf : f ∈F}rather than F. Thus we can assume throughout the proof that P F = 0 without loss of generality. (i)⇔(ii): Convergence of the finite-dimensional marginal distributions of G G F⊂ n and n is equivalent to L2(P ), and thus it suffices to show that the asymptotic equicontinuity conditions of both processes are equivalent. By lemma 8.17, if F is Donsker, then P ∗(F>x)=o(x−2)asx →∞. Similarly, if ξ ·F is Donsker, then P ∗(|ξ|×F>x)=o(x−2)asx →∞. In both cases, P ∗F<∞. Since the variance of ξ is finite, we have by ∗ √ exercise 10.5.2 below that E max1≤i≤n |ξi|/ n → 0. Combining this with the multiplier inequality (lemma 10.2), we have n n 1 ∗ 1 ∗ 1 ξ 1 lim sup E √ if(Xi) ≤ lim sup E √ ξif(Xi) →∞ →∞ 2 n n =1 n n =1 i Fδ i Fδ √ 1 k ≤ ∗ √ 2 2 ξ 2,1 sup E if(Xi) , k≥n0 k =1 i Fδ for every δ>0andn0 ≤ n. By the symmetrization theorem (theorem 8.8), we can remove the Rademacher variables 1,..., n at the cost of chang- ↓ ∗ −1/2 n − ing the constants. Hence, for any sequence δn 0, E n i=1(δXi ∗ −1/2 n F → − F → P ) δn 0 if and only if E n i=1 ξi(δXi P ) δn 0. By lemma 8.17, these mean versions of the asymptotic equicontinuity conditions imply the probability versions, and the desired results follow. We have actually proved that the first three assertions are equivalent. (iii)⇒(iv): Note that by the equivalence of (i) and (iii), F is Glivenko- √ G − G ¯P G − G →P Cantelli. Since n n = nξ n,wenowhavethat n n F 0. Thus (iv) follows. (iv)⇒(i): Let (Y1,...,Yn) be an independent copy of (X1,...,Xn), and let (ξ˜1,...,ξ˜n) be an independent copy of (ξ1,...,ξn), so that (ξ1,...,ξn, ξ˜1, ...,ξ˜n) is independent of (X1,...,Xn,Y1,...,Yn). Let ξ¯ be the pooled mean of the ξisandξ˜is; set n n G −1/2 − ¯ ˜ − ¯ 2n =(2n) (ξi ξ)δXi + (ξi ξ)δYi i=1 i=1 and define n n G˜ ≡ −1/2 ˜ − ¯ − ¯ n (2n) (ξi ξ)δXi + (ξi ξ)δYi . i=1 i=1 G  G G˜  G ∞ F We now have that both 2n and 2n in  ( ). Thus, by the definition of weak convergence, we have that (F,ρP )is ↓ G →P totally bounded and that for any sequence δn 0both 2n Fδn 0and 10.1 The Bootstrap for Donsker Classes 177

G˜ →P G − G˜ →P 2n Fδn 0. Hence also 2n 2n Fδn 0. However, since n − ˜ G − G˜ −1/2 (ξ√ ξ) − 2n 2n = n (f(Xi) f(Yi)) , i=1 2 √ ˇ ≡ − ˜ and since the weights ξi (ξi ξi)/ 2 satisfy the moment conditions for Gˇ ≡ −1/2 n − the theorem√ we are proving, we now have the n n i=1(f(Xi) ∞ f(Yi))  2G in  (F) by the already proved equivalence between (iii) ↓ ∗ Gˇ → and (i). Thus, for any sequence δn 0, E n Fδn 0. Since also n n n − ≥ − EY f(Xi) f(Yi) f(Xi) Ef(Yi) = f(Xi) , i=1 i=1 i=1 we can invoke Fubini’s theorem (lemma 6.14) to yield ∗ Gˇ ≥ ∗ G → E n Fδn E n Fδn 0. Hence F is Donsker. We now present the following interesting corollary which shows the pos- sibly unexpected result that the multiplier empirical process is asymptot- ically independent of the usual empirical process, even though the same data X1,...,Xn are used in both processes: Corollary 10.3 Assume the conditions of theorem 10.1 hold and that F G G G  G G G ∞ F 3 G is Donsker. Then ( n, n, n) ( , , , ) in [ ( )] ,where and G are independent P -Brownian bridges. Proof. By the preceding theorem, the three processes are asymptotically tight marginally and hence asymptotically tight jointly. Since the first pro- cess is uncorrelated with the second process, the limiting distribution of the first process is independent of the limiting distribution of the second process. As argued in the proof of the multiplier central limit theorem, the G G uniform difference between n and n goes to zero in probability, and thus the remainder of the corollary follows.

10.1.2 Conditional Multiplier Central Limit Theorems In this section, the convergence properties of the multiplier processes in the previous section are studied conditional on the data. This yields in- probability and outer-almost-sure conditional multiplier central limit the- orems. These results are one step closer to the bootstrap validity results of the next section. For a metric space (D,d), define BL1(D)tobethe space of all functions f : D → R with Lipschitz norm bounded by 1, i.e., f ∞ ≤ 1and|f(x)−f(y)|≤d(x, y) for all x, y ∈ D. In the current set-up, D = ∞(F), for some class of measurable functions F,andd is the corre- sponding uniform metric. As we did in section 2.2.3, we will use BL1 as 178 10. Bootstrapping Empirical Processes

∞ shorthand for BL1( (F)). The conditional weak convergence arrows we use in theorems 10.4 and 10.6 below were also defined in section 2.2.3. We now present the in-probability conditional multiplier central limit theorem:

Theorem 10.4 Let F be a class of measurable functions, and let ξ1,..., ξn be i.i.d. random variables with mean zero, variance 1, and ξ 2,1 < ∞, G G ¯ independent of the sample data X1,...,Xn.Let n, n and ξ be as defined in theorem 10.1. Then the following are equivalent:

(i) F is Donsker;

P G  G ∞ F G (ii) n in  ( ) and n is asymptotically measurable. ξ

P G G ∞ F G (iii) n in  ( ) and n is asymptotically measurable. ξ Before giving the proof of this theorem, we make a few points and present lemma 10.5 below to aid in the proof. In the above theorem, Eξ denotes taking the expectation conditional on X1,...,Xn.Notethat ∞ F → R for a continuous function h :  ( ) ,ifwefixX1,...,Xn,then → −1/2 n − Rn (a1,...,an) h(n i=1 ai(δXi P )) is a measurable map from ∗ to R,provided f(X) − Pf F < ∞ almost surely. This last inequality is tacetly assumed so that the empirical processes under investigation reside ∞ in  (F). Thus the expectation Eξ in conclusions (ii) and (iii) is proper. The following lemma is a conditional multiplier central limit theorem for i.i.d. Euclidean data:

Lemma 10.5 Let Z1,...,Zn be i.i.d. Euclidean random vectors, with EZ =0and E Z 2 < ∞, independent of the i.i.d. sequence of real ran- 2 dom variables ξ1,...,ξn with Eξ =0and Eξ =1. Then, conditionally −1/2 n  on Z1,Z2,..., n i=1 ξiZi N(0, covZ), for almost all sequences Z1,Z2,.... Proof. By the Lindeberg central limit theorem, convergence to the given normal limit will occur for every sequence Z1,Z2,... for which

n −1 T → n ZiZi covZ i=1 and

1 n √ (10.1) Z 2E ξ21{|ξ |× Z > n}→0, n i ξ i i i i=1 for all >0, where Eξ is the conditional expectation given the Z1,Z2,.... The first condition is true for almost all sequences by the strong law of 10.1 The Bootstrap for Donsker Classes 179 large numbers. We now evaluate the second condition. Fix >0. Now, for any τ>0, the sum in (10.1) is bounded by

1 n 1 n √ Z 2E[ξ21{|ξ| > /τ}]+ Z 21{ Z > nτ}. n i n i i i=1 i=1

The first sum has an arbitrarily small upper bound in the limit if we choose sufficiently small τ.SinceE Z 2 < ∞, the second sum will go to zero for almost all sequences Z1,Z2,.... Thus, for almost all sequences Z1,Z2,..., (10.1) will hold for any >0. For, the intersection of the two sets of sequences, all required conditions hold, and the desired result follows. G G G G Proof of theorem 10.4. Since the processes , n, n and n are unaffected if the class F is replaced with {f − Pf : f ∈F}, we will assume P F = 0 throughout the proof, without loss of generality. ⇒ F G (i) (ii): If is Donsker, the sequence n converges in distribution to a Brownian bridge process by the unconditional multiplier central limit G theorem (theorem 10.1). Thus n is asymptotically measurable. Now, by lemma 8.17, a Donsker class is totally bounded by the semimetric ρP (f,g) ≡ 2 1/2 (P [f −g] ) . For each fixed δ>0andf ∈F,denoteΠδf to be the closest element in a given, finite δ-net (with respect to the metric ρP )forF.We have by continuity of the limit process G,thatG ◦ Πδ → G,almostsurely, as δ ↓ 0. Hence, for any sequence δn ↓ 0,

| G ◦ − G |→ (10.2) sup Eh( Πδn ) Eh( ) 0. h∈BL1

By lemma 10.5 above, we also have for any fixed δ>0that

| G ◦ − G ◦ |→ (10.3) sup Eξh( n Πδ) Eh( Πδ) 0, h∈BL1 as n →∞, for almost all sequences X1,X2,....Toseethis,letf1,...,fm m ∞ be the δ-mesh of F that defines Πδ. Now define the map A : R →  (F) by (A(y))(f)=yk,wherey =(y1,...,ym) and the integer k satisfies Πδf = fk.Nowh(G ◦ Πδ)=g(G(f1),...,G(fm)) for the function g : Rm → R defined by g(y)=h(A(y)). It is not hard to see that if h is bounded Lipschitz on ∞(F), then g is also bounded Lipschitz on Rm with a Lipschitz norm no larger than the Lipschitz norm for h. Now (10.3) follows m from lemma 10.5. Note also that BL1(R ) is separable with respect to the ≡ ∞ −i | − | ⊂ ⊂ metric ρ(m)(f,g) i=1 2 supx∈Ki f(x) g(x) ,whereK1 K2 ··· ∪∞ Rm G ◦ are compact sets satisfying i=1Ki = . Hence, since n Πδ and G◦Πδ are both tight, the supremum in 10.3 can be replaced by a countable G ◦ supremum. Thus the displayed quantity is measurable, since h( n Πδ)is measurable. Now, still holding δ fixed, 180 10. Bootstrapping Empirical Processes

| G ◦ − G |≤ | G ◦ − G | sup Eξh( n Πδ) Eξh( n) sup Eξ h( n Πδ) h( n) h∈BL1 h∈BL1 ≤ G ◦ − G ∗ Eξ n Πδ n F ≤ G ∗ Eξ n Fδ , where Fδ ≡{f − g : ρP (f,g) <δ,f,g ∈F}. Thus the outer expectation of ∗ G the left-hand-side is bounded above by E n Fδ .Aswedemonstratedin ∗ G → ↓ the proof of theorem 10.1, E n Fδn 0, for any sequence δn 0. Now, we choose the sequence δn so that it goes to zero slowly enough to ensure that (10.3) still holds with δ replaced by δn. Combining this with (10.2), the desired result follows. ⇒ G ∗ G (ii) (i): Let h( n) and h( n)∗ denote measurable majorants and mi- norants with respect to (ξ1,...,ξn,X1,...,Xn)jointly.Wenowhave,by the triangle inequality and Fubini’s theorem (lemma 6.14), | ∗ G − G |≤| G ∗ − ∗ G | E h( n) Eh( ) EX Eξh( n) EX Eξh( n) ∗ | G − G | +EX Eξh( n) Eh( ) , where EX denotes taking the expectation over X1,...,Xn. By (ii) and the dominated convergence theorem, the second term on the right side converges to zero for all h ∈ BL1. Since the first term on the right is G ∗ − G bounded above by EX Eξh( n) EX Eξh( n)∗, it also converges to zero G since n is asymptotically measurable. It is easy to see that the same result holds true if BL1 is replaced by the class of all bounded, Lipschitz ∞ F → R G  G continuous nonnegative functions h :  ( ) , and thus n unconditionally by the Portmanteau theorem (theorem 7.6). Hence F is Donsker by the converse part of theorem 10.1. (ii)⇒(iii): Since we can assume P F =0,wehave | G − G |≤|¯G | (10.4) h( n) h( n) ξ n . ∗ Moreover, since (ii) also implies (i), we have that E ξ¯Gn F → 0by | G − G |→ lemma 8.17. Thus suph∈BL1 Eξh( n) Eξh( n) 0 in outer proba- G bility. Since (10.4) also implies that n is asymptotically measurable, (iii) follows. (iii)⇒(i): Arguing as we did in the proof that (ii)⇒(i), it is not hard to G  G F show that n unconditionally. Now theorem 10.1 yields that is Donsker. We now present the outer-almost-sure conditional multiplier central limit theorem: Theorem 10.6 Assume the conditions of theorem 10.4. Then the fol- lowing are equivalent: ∗ 2 (i) F is Donsker and P f − Pf F < ∞; as∗ G  G ∞ F (ii) n in  ( ). ξ 10.1 The Bootstrap for Donsker Classes 181

as∗ G  G ∞ F (iii) n in  ( ). ξ Proof. The equivalence of (i) and (ii) is given in theorem 2.9.7 of VW, and we omit its proof. (ii)⇒(iii): As in the proof of theorem 10.4, we assume that P F =0 without loss of generality. Since √ | G − G |≤| ¯|× P (10.5) h( n) h( n) nξ n F , for any h ∈ BL1,wehave √ | G − G |≤ | ¯|× P ≤ P as→∗ sup Eξh( n) Eξ( n) Eξ nξ n F n F 0, h∈BL1 since the equivalence of (i) and (ii) implies that F is both Donsker and Glivenko-Cantelli. Hence | G − G | as→∗ sup Eξh( n) Eh( ) 0. h∈BL1

G ∗− G as→∗ The relation (10.5) also yields that Eξh( n) Eξh( n)∗ 0, and thus (iii) follows. ⇒ ∈ G ∗− G as→∗ ∗ G → (iii) (ii): Let h BL1.SinceEξh( n) Eh( ) 0, we have E h( n) G ∈ G  G Eh( ). Since this holds for all h BL1,wenowhavethat n uncon- ditionally by the Portmanteau theorem (theorem 7.6). Now we can invoke theorem 10.4 to conclude that F is both Donsker and Glivenko-Cantelli. Now (10.5) implies (ii) by using an argument almost identical to the one used in the previous paragraph.

10.1.3 Bootstrap Central Limit Theorems Theorems 10.4 and 10.6 will now be used to prove theorems 2.6 and 2.7 from section 2.2.3. Recall that the multinomial bootstrap is obtained by resampling from the data X1,...,Xn, with replacement, n times to ob- ∗ ∗ Pˆ∗ tain a bootstrapped sample X1 ,...,Xn. The empirical measure n of the bootstrapped sample has the same distribution—given the data—as Pˆ ≡ −1 n ≡ the measure n n i=1 WniδXi ,whereWn (Wn1,...,Wnn)isa −1 −1 multinomial(n, n ,...,n ) deviate independent√ of the data. As in sec- Pˆ ≡ −1 n Gˆ ≡ Pˆ − P tion 2.2.3, let n n i=1 WniδXi and n √n( n n). Also recall P˜ ≡ −1 n ¯ G˜ ≡ P˜ −P the definitions n n i=1(ξ/ξ)δXi and n n(μ/τ)( n n), where the weights ξ1,...,ξn are i.i.d. nonnegative, independent of X1,...,Xn, 2 with mean 0 <μ<∞ and variance 0 <τ < ∞,andwith ξ 2,1 < ∞. When ξ¯ = 0, we define P˜n to be zero. Note that the weights ξ1,...,ξn in this section must have μ substracted from them and then divided by τ before they satisfy the criteria of the multiplier weights in the previous section. 182 10. Bootstrapping Empirical Processes

Proof of theorem 2.6. The equivalence of (i) and (ii) follows from the- orem 3.6.1 of VW, which proof we omit. We note, however, that a key com- ponent of this proof is a clever approximation of the multinomial weights with i.i.d. Poisson mean 1 weights. We will use this approximation in our proof of theorem 10.15 below. ◦ ≡ −1 − We now prove the equivalence of (i) and (iii). Let ξi τ (ξi μ), i = G◦ ≡ −1/2 n ◦ − ¯◦ ¯◦ ≡ −1 n ◦ 1,...,n, and define n n i=1(ξi ξ )δXi ,whereξ n i=1 ξi . G˜ G◦ The basic idea is to show the asymptotic equivalence of n and n.Then theorem 10.4 can be used to establish the desired result. Accordingly, μ ξ¯ (10.6) G◦ − G˜ = 1 − G◦ = − 1 G˜ . n n ξ¯ n μ n F ◦ ◦ First, assume that is Donsker. Since the weights ξ1 ,...,ξn satisfy the conditions of the unconditional multiplier central limit theorem, we have P G◦  G G◦  G that n . Theorem 10.4 also implies that n . Now (10.6) can be ξ G˜ − G◦ →P G˜ applied to verify that n n F 0, and thus n is asymptotically measurable and P G◦ − G˜ → sup Eξh( n) Eξh( n) 0. h∈BL1 Thus (i)⇒(iii). P Second, assume that G˜ n G. It is not hard to show, arguing as we did ξ in the proof of theorem 10.4 for the implication (ii)⇒(i), that G˜ n  G in ∞ F G◦ −  ( ) unconditionally. By applying (10.6) again, we now have that n G˜ →P G◦  G ∞ F n F 0, and thus n in  ( ) unconditionally. The unconditional multiplier central limit theorem now verifies that F is Donsker, and thus (iii)⇒(i). Proof of theorem 2.7. The equivalence of (i) and (ii) follows from theo- rem 3.6.2 of VW, which proof we again omit. We now prove the equivalence of (i) and (iii). as∗ G◦  G First, assume (i). Then n by theorem 10.6. Fix ρ>0, and note ξ that by using the first equality in (10.6), we have for any h ∈ BL1 that ˜ ◦ μ ◦ (10.7) h(Gn) − h(G ) ≤ 2 × 1 1 − >ρ +(ρ G F ) ∧ 1. n ξ¯ n

as∗ ∞ The first term on the right → 0. Since the map · F ∧ 1: (F) → R is G◦ ∧ as→∗ G ∧ in BL1, we have by theorem 10.6 that Eξ [(ρ n F ) 1] E[ ρ F 1]. Let the sequence 0 <ρ ↓ 0 converge slowly enough so that that the n as∗ first term on the right in (10.7) → 0afterreplacingρ with ρn.Since E[ ρnG F ∧ 1] → 0, we can apply Eξ to both sides of (10.7)—after replac- ing ρ with ρn—to obtain 10.1 The Bootstrap for Donsker Classes 183 as∗ G˜ − G◦ → sup h( n) h( n) 0. h∈BL1

G◦ ∗ − G◦ as→∗ Combining the fact that h( n) h( n)∗ 0 with additional applications ∗ as∗ of (10.7) yields h(G˜ n) − h(G˜ n)∗ → 0. Since h was arbitrary, we have as∗ established that G˜ n  G, and thus (iii) follows. ξ Second, assume (iii). Fix ρ>0, and note that by using the second equality in (10.6), we have for any h ∈ BL1 that ¯ ◦ ˜ ξ ˜ h(G ) − h(Gn) ≤ 2 × 1 − 1 >ρ + ρ Gn F ∧ 1. n μ

as∗ Since the first term on the right → 0, we can use virtually identical argu- G◦ ments to those used in the previous paragraph—but with the roles of n as∗ G˜ G◦  G F and n reversed—to obtain that n . Now theorem 10.6 yields that ξ is Donsker, and thus (i) follows.

10.1.4 Continuous Mapping Results

We now assume a more general set-up, where Xˆn is a bootstrapped pro- cess in a Banach space (D, · ) and is composed of the sample data n Xn ≡ (X1,...,Xn) and a random weight vector Mn ∈ R independent of Xn.WedonotrequirethatX1,...,Xn be i.i.d. In this section, we ob- tain two continuous mapping results. The first result, proposition 10.7, is a simple continuous mapping results for the very special case of Lips- chitz continuous maps. It is applicable to both the in-probability or outer- almost-sure versions of bootstrap consistency. An interesting special case is the map g(x)= x . In this case, the proposition validates the use of the bootstrap to construct asymptotically uniformly valid confidence bands for {Pf : f ∈F}whenever Pf is estimated by Pnf and F is P -Donsker. P Now assume that Xˆn X and that the distribution of X is continuous. M Lemma 10.11 towards the end of this section reveals that P( Xˆn ≤t|Xn) converges uniformly to P ( X ≤t), in probability. A parallel outer almost as∗ sure result holds when Xˆn  X. M The second result, theorem 10.8, is a considerably deeper result for gen- eral continuous maps applied to bootstraps which are consistent in prob- ability. Because of this generality, we must require certain measurability conditions on the map Mn → Xˆn. Fortunately, based on the discussion in the paragraph following theorem 10.4 above, these measurability conditions are easily satisfied when either Xˆn = Gˆ n or Xˆn = G˜ n. It appears that other continuous mapping results for bootstrapped empirical processes hold, such as for bootstraps which are outer almost surely consistent, but such results seem to be very challenging to verify. 184 10. Bootstrapping Empirical Processes

Proposition 10.7 Let D and E be Banach spaces, X a tight random variable on D,andg : D → E Lipschitz continuous. We have the following:

P P (i) If Xˆn X,theng(Xˆn) g(X). M M

as∗ as∗ (ii) If Xˆn  X,theng(Xˆn) g(X). M M

Proof. Let c0 < ∞ be the Lipschitz constant for g, and, without loss of generality, assume c0 ≥ 1. Note that for any h ∈ BL1(E), the map x → h(g(x)) is an element of c0BL1(D). Thus sup EM h(g(Xˆn)) − Eh(g(X)) ≤ sup EM h(Xˆn) − Eh(X) ∈ (E) ∈ (D) h BL1 h c0BL1 = c0 sup EM h(Xˆn) − Eh(X) , h∈BL1(D)

P as∗ and the desired result follows by the respective definitions of  and  . M M

Theorem 10.8 Let g : D → E be continuous at all points in D0 ⊂ D, where D and E are Banach spaces and D0 is closed. Assume that Mn → h(Xˆn) is measurable for every h ∈ Cb(D) outer almost surely. Then if P ∗ P Xˆn X in D,whereX is tight and P (X ∈ D0)=1, g(Xˆn) g(X). M M

Proof. As in the proof of the implication (ii)⇒(i) of theorem 10.4, we can argue that Xˆn  X unconditionally, and thus g(Xˆn)  g(X) uncon- ditionally by the standard continuous mapping theorem. Moreover, we can replace E with its closed linear span so that the restriction of g to D0 has an extensiong ˜ : D → E which is continuous on all of D by Dugundji’s exten- sion theorem (theorem 10.9 below). Thus (g(Xˆn), g˜(Xˆn))  (g(X), g˜(X)), P and hence g(Xˆn) − g˜(Xˆn) → 0. Therefore we can assume without loss of generality that g is continuous on all of D. We can also assume without loss of generality that D0 is a separable Banach space since X is tight. Hence E0 ≡ g(D0) is also a separable Banach space. Fix >0. There now exists a compact K ⊂ E0 such that P(X ∈ K) < . By theorem 10.9 below, the proof of which is given in sectoin 10.4, we know there exists an integer k<∞,elementsz1,...,zk ∈ C[0, 1], continuous E → R functions f1,...,fk : , and a Lipschitz continuous function J : → E → ≡ k lin (z1,...,zk) , such that the map x T (x) J j=1 zjfj(x) E ⊂ E − has domain and range and satisfies supx∈K T (x) x < .Let BL1 ≡ BL1(E). We now have 10.1 The Bootstrap for Donsker Classes 185 sup EM h(g(Xˆn)) − Eh(g(X)) h∈BL1 ≤ sup EM h(T g(Xˆn)) − Eh(T g(X)) h∈BL1 +EM T g(Xˆn) − g(Xˆn) ∧ 2 +E{ T g(X) − g(X) ∧2} . However, the outer expectation of the second term on the right converges to the third term, as n →∞, by the usual continuous mapping theorem. Thus, provided P (10.8) sup EM h(T g(Xˆn)) − Eh(T g(X)) → 0, h∈BL1 we have that ∗ (10.9) lim sup E sup EM h(g(Xˆn)) − Eh(g(X)) n→∞ h∈BL1

≤ 2E { T g(X) − g(X) ∨2}

≤ 2E {T g(X) − g(X)} 1{g(X) ∈ K} +4P(g(X) ∈ K) < 6 . ∈ k ˜ Now note that for each h BL1, h J j=1 zjaj = h(a1,...,ak)for k k all (a1,...,a ) ∈ R and some h˜ ∈ c0BL1(R ), where 1 ≤ c0 < ∞ (this k k follows since J is Lipschitz continuous and =1 zjaj ≤ max1≤j≤k |aj|× j k j=1 zj ). Hence (10.10) sup EM h(T g(Xˆn)) − Eh(T g(X)) h∈BL1 ≤ sup EM h(u(Xˆn)) − Eh(u(X)) ∈ (Rk) h c0BL1 = c0 sup EM h(u(Xˆn)) − Eh(u(X)) , k h∈BL1(R )

k where x → u(x) ≡ (f1(g(x)),...,fk(g(x))). Fix any v : R → [0, 1] which is Lipschitz continuous (the Lipschitz constant may be > 1). Then, ∗ ∗ since Xˆ  X unconditionally, E E v(u(Xˆ )) − E v(u(Xˆ ))∗ ≤ n M n M n ∗ ∗ E v(u(Xˆn)) − v(u(Xˆn))∗ → 0, where sub- and super- script ∗ denote measurable majorants and minorants, respectively, with respect to the joint probability space of (Xn,Mn). Thus ∗ P (10.11) EM v(u(Xˆn)) − EM v(u(Xˆn)) → 0. Note that we are using at this point the outer almost sure measurability of Mn → v(u(Xˆn)) to ensure that EM v(u(Xˆn)) is well defined, even if the resulting random expectation is not itself measurable. 186 10. Bootstrapping Empirical Processes

Now, for every subsequence n, there exists a further subsequence n such as∗ that Xˆn  X. This means that for this subsequence, the set B of data sub- M sequences {Xn : n ≥ 1} for which EM v(u(Xˆn ))−Ev(u(X)) → 0 has inner probability 1. Combining this with (10.11) and proposition 7.22, we obtain P that EM v(u(Xˆn))−Ev(u(X)) → 0. Since v was an arbitrary real, Lipschitz continuous function on Rk, we now have by part (i) of lemma 10.11 below followed by lemma 10.12 below, that P sup EM h(u(Xˆn)) − Eh(u(X)) → 0. k h∈BL1(R )

Combining this with (10.10), we obtain that (10.8) is satisfied. The desired result now follows from (10.9), since >0 was arbitrary.

Theorem 10.9 (Dugundji’s extension theorem) Let X be an arbitrary metric space, A a closed subset of X, L a locally convex linear space (which includes Banach vector spaces), and f : A → L a continuous map. Then there exists a continuous extension of f, F : X → L. Moreover, F (X) lies in the closed linear span of the convex hull of f(A).

Proof. This is theorem 4.1 of Dugundji (1951), and the proof can be found therein.

Theorem 10.10 Let E0 ⊂ E be Banach spaces with E0 separable and lin E0 ⊂ E. Then for every >0 and every compact K ⊂ E0,thereex- ists an integer k<∞,elementsz1,...,zk ∈ C[0, 1], continuous functions f1,...,f : E → R, and a Lipschitz continuous function J : lin (z1,...,z ) → k k E → ≡ k E , such that the map x T (x) J j=1 zjfj(x) has domain and ⊂ E − range , is continuous, and satisfies supx∈K T (x) x < . The proof of this theorem is given in section 10.4. For the next two lemmas, we use the usual partial ordering on Rk to define relations between k points, e.g., for any s, t ∈ R , s ≤ t is equivalent to s1 ≤ t1,...,sk ≤ tk.

k Lemma 10.11 Let Xn and X be random variables in R for all n ≥ 1. Define S⊂[R ∪{−∞, ∞}]k to be the set of all continuity points of t → F (t) ≡ P(X ≤ t) and H to be the set of all Lipschitz continuous functions h : Rk → [0, 1] (the Lipschitz constants may be > 1). Then, provided the expectations are well defined, we have:

|Y →P ∈ | ≤ |Y − (i) If E[h(Xn) n] Eh(X) for all h H,thensupt∈A P(Xn t n) P F (t)| → 0 for all closed A ⊂S;

|Y as→∗ ∈ | ≤ |Y − (ii) If E[h(Xn) n] Eh(X) for all h H,thensupt∈A P(Xn t n) as∗ F (t)| → 0 for all closed A ⊂S. 10.2 The Bootstrap for Glivenko-Cantelli Classes 187

Proof. Let t0 ∈S. For every δ>0, there exists h1,h2 ∈ H, such that k h1(u) ≤ 1{u ≤ t0}≤h2(u) for all u ∈ R and E[h2(X) − h1(X)] <δ. P Under the condition in (i), we therefore have that P(Xn ≤ t0|Yn) → F (t0), since δ was arbitrary. The conclusion of (i) follows since this convergence holds for all t0 ∈S,sincebothP(Xn ≤ t|Yn)andF (t) are monotone in t with range ⊂ [0, 1], and since [0, 1] is compact. The proof for part (ii) follows similarly.

k Lemma 10.12 Let {Fn} and F be distribution functions on R ,andlet S⊂[R ∪{−∞, ∞}]k be the set of all continuity points of F . Then the following are equivalent: | − |→ ⊂S (i) supt∈A Fn(t) F (t) 0 for all closed A . k − → (ii) suph∈BL1(R ) Rk h(dFn dF ) 0. The relatively straightforward proof is saved as exercise 10.5.3.

10.2 The Bootstrap for Glivenko-Cantelli Classes

We now present several results for the bootstrap applied to Glivenko- Cantelli classes. The primary use of these results is to assist verification of consistency of bootstrapped estimators. The first theorem (theorem 10.13) consists of various multiplier bootstrap results, and it is followed by a corol- lary (corollary 10.14) which applies to certain weighted bootstrap results. The final theorem of this section (theorem 10.15) gives gives correspond- ing results for the multinomial bootstrap. On a first reading through this section, it might be best to skip the proofs and focus on the results and discussion between the proofs.

Theorem 10.13 Let F be a class of measurable functions, and let ξ1,..., | | ∞ ξn be i.i.d. nonconstant random variables with 0 < Eξ < and indepen- W ≡ −1 n − dent of the sample data X1,...,Xn.Let n n i=1 ξi(δXi P ) and W˜ ≡ −1 n − ¯ ¯ ≡ −1 n n n i=1(ξi ξ)δXi ,whereξ n i=1 ξi. Then the following are equivalent:

(i) F is strong Glivenko-Cantelli;

as∗ (ii) Wn F → 0;

as∗ ∗ (iii) Eξ Wn F → 0 and P f − Pf F < ∞;

as∗ ∗ (iv) For every η>0, P( Wn F >η|Xn) → 0 and P f − Pf F < ∞, where Xn ≡ (X1,...,Xn); 188 10. Bootstrapping Empirical Processes

∗ as∗ ∗ (v) For every η>0, P( Wn F >η|Xn) → 0 and P f−Pf F < ∞,for ∗ some version of Wn F , where the superscript ∗ denotes a measurable majorant with respect to (ξ1,...,ξn,X1,...,Xn) jointly;

as∗ (vi) W˜ n F → 0;

as∗ ∗ (vii) Eξ W˜ n F → 0 and P f − Pf F < ∞; as∗ ∗ (viii) For every η>0, P W˜ n F >η Xn → 0 and P f − Pf F < ∞; ∗ as∗ ∗ (ix) For every η>0, P W˜ n F >η Xn → 0 and P f − Pf F < ∞, ∗ for some version of W˜ n F .

The lengthy proof is given in section 4 below. As is shown in the proof, the conditional expectations and conditional probabilities in (iii), (iv), (vii) and (viii) are well defined. This is because the quantities inside of the expectations in parts (iii) and (vii) (and in the conditional probabilities of (iv) and (viii)) are measurable as functions of ξ1,...,ξn conditional on the data. The distinctions between (iv) and (v) and between (viii) and (ix) are not as trivial as they appear. This is because the measurable majorants involved are computed with respect to (ξ1,...,ξn,X1,...,X1)jointly,and ∗ thus the differences between Wn F and Wn F or between W˜ n F and W˜ n F may be nontrivial. The following corollary applies to a class of weighted bootstraps which includes the Bayesian bootstrap mentioned earlier: Corollary 10.14 Let F be a class of measurable functions, and let ξ1,..., ξn be i.i.d. nonconstant, nonnegative random variables with 0 < ∞ P˜ ≡ −1 n ¯ Eξ< and independent of X1,...,Xn.Let n n i=1(ξi/ξ)δXi , wherewesetP˜n =0when ξ¯ =0. Then the following are equivalent:

(i) F is strong Glivenko-Cantelli.

as∗ ∗ (ii) P˜n − Pn F → 0 and P f − Pf F < ∞.

as∗ ∗ (iii) Eξ P˜n − Pn F → 0 and P f − Pf F < ∞. as∗ ∗ (iv) For every η>0, P P˜n − Pn F >η Xn → 0 and P f − Pf F < ∞; ∗ as∗ ∗ (v) For every η>0, P P˜n − Pn F >η Xn → 0 and P f − Pf F < ∗ ∞, for some version of P˜n − Pn F .

∗ If in addition P (ξ =0)=0, then the requirement that P f − Pf F < ∞ in (ii) may be dropped. 10.2 The Bootstrap for Glivenko-Cantelli Classes 189

Proof. Since the processes Pn − P and P˜n − Pn do not change when the class F is replaced with F≡{˙ f − Pf : f ∈F}, we can assume P F =0 F˙ ˙ ≡ ∗ without loss of generality. Let the envelope of be denoted F f F˙ . Since multiplying the ξi by a constant does not change ξi/ξ¯, we can also assume Eξ = 1 without loss of generality. The fact that the conditional expressions in (iii) and (iv) are well defined can be argued is in the proof of theorem 10.13 (given below in section 4), and we do not repeat the details here. (i)⇒(ii): Since 1 (10.12) P˜ − P − W˜ = − 1 1{ξ>¯ 0}W˜ − 1{ξ¯ =0}P , n n n ξ¯ n n

as∗ (ii) follows by theorem 10.13 and the fact that ξ¯ → 1. (ii)⇒(i): Note that (10.13) P˜n − Pn − W˜ n = − ξ¯− 1 1{ξ>¯ 0}(P˜n − Pn) − 1{ξ¯ =0}Pn.

as∗ The first term on the right → 0 by (ii), while the second term on the right ¯ ¯ as∗ is bounded in absolute value by 1{ξ =0} Pn F˙ ≤ 2 × 1{ξ =0}PnF˙ → 0, by the moment condition. (ii)⇒(iii): The method of proof will be to use the expansion (10.12) to as∗ show that Eξ P˜n −Pn −W˜ n F˙ → 0. Then (iii) will follow by theorem 10.13 and the established equivalence between (ii) and (i). Along this vein, we have by symmetry followed by an application of theorem 9.28 that n 1 ¯ ˜ 1 ˙ 1 ¯ Eξ − 1 1{ξ>0} Wn F˙ ≤ F (Xi)Eξ ξi − 1 1{ξ>0} ξ¯ n ξ¯ i=1 ) * = PnF˙ Eξ |1 − ξ¯|1{ξ>¯ 0} as∗ → 0. ¯ as∗ Since also Eξ 1{ξ =0} Pn F˙ → 0, the desired conclusion follows. (iii)⇒(iv): This is obvious. (iv)⇒(i): Consider again expansion (10.13). The moment conditions eas- ¯ ¯ ily give us, conditional on X1,X2,...,that1{ξ =0} Pn F˙ ≤ 1{ξ = P 0}PnF˙ → 0 for almost all sequences X1,X2,.... By (iv), we also obtain ¯ ¯ P that |ξ − 1|1{ξ>0} P˜n − Pn F˙ → 0 for almost all sequences X1,X2,.... Thus assertion (viii) of theorem 10.13 follows, and we obtain (i). If P (ξ =0)=0,then1{ξ¯ =0}Pn = 0 almost surely, and we no longer need the moment condition P F<˙ ∞ in the proofs of (ii)⇒(i) and (ii)⇒(iii), and thus the moment condition in (ii) can be dropped in this setting. (ii)⇒(v): Assertion (ii) implies that there exists a measurable set B of infinite sequences (ξ1,X1), (ξ2,X2),... with P(B) = 1 such that P˜n − ∗ ∗ Pn F → 0onB for some version of P˜n − Pn F . Thus by the bounded 190 10. Bootstrapping Empirical Processes convergence theorem, we have for any η>0 and almost all sequences X1,X2,..., ∗ ∗ lim sup P P˜n − Pn F >η = lim sup Eξ,∞1 P˜n − Pn F >η 1{B} n→∞ n→∞ ∗ =Eξ,∞ lim sup 1 P˜n − Pn F >η 1{B} n→∞ =0.

Thus (v) follows. (v)⇒(iv): This is obvious. The following theorem verifies consistency of the multinomial bootstrapped empirical measure defined in section 10.1.3, which we denote Pˆn,whenF is strong G-C. The proof is given in section 4 below. Theorem 10.15 Let F be a class of measurable functions, and let the multinomial vectors Wn in Pˆn be independent of the data. Then the follow- ing are equivalent: (i) F is strong Glivenko-Cantelli;

as∗ ∗ (ii) Pˆn − Pn F → 0 and P f − Pf F < ∞;

as∗ ∗ (iii) EW Pˆn − Pn F → 0 and P f − Pf F < ∞; as∗ ∗ (iv) For every η>0, P Pˆn − Pn F >η Xn → 0 and P f − Pf F < ∞; ∗ as∗ ∗ (v) For every η>0, P Pˆn − Pn F >η Xn → 0 and P f − Pf F < ∗ ∞, for some version of Pˆn − Pn F .

10.3 A Simple Z-Estimator Master Theorem

Consider Z-estimation based on the estimating equation θ → Ψn(θ) ≡ p Pnψθ,whereθ ∈ Θ ⊂ R and x → ψθ(x) is a measurable p-vector valued function for each θ. This is a special case of the more general Z-estimation approach discussed in section 2.2.5. Define the map θ → Ψ(θ) ≡ Pψθ, and assume θ0 ∈ ΘsatisfiesΨ(θ0)=0.Letθˆn be an approximate zero ˆ◦ of Ψn,andletθn be an approximate zero of the bootstrapped estimating → ◦ ≡ P◦ P◦ P˜ equation θ Ψn(θ) nψθ,where n is either n of corollary 10.14— with ξ1,...,ξn satisfying the conditions specified in the first paragraph of section 10.1.3 (the multiplier bootstrap)—or Pˆn of theorem 10.15 (the multinomial bootstrap). The goal of√ this section is to determine reasonably general conditions under which n(θˆn − θ0)  Z,whereZ is mean zero normally distributed, 10.3 A Simple Z-Estimator Master Theorem 191

√ P P ˆ◦ − ˆ P P   and n(θn θn) k0Z. Here, we use  to denote either or de- ◦ ◦ ξ W pending on which bootstrap is being used, and k0 = τ/μ for the multiplier bootstrap while k0 = 1 for the multinomial bootstrap. One could also es- timate the limiting variance rather than use the bootstrap, but there are many settings, such as least absolute deviation regression, where variance estimation may be more awkward than the bootstrap. For theoretical vali- dation of the bootstrap approach, we have the following theorem, which is related to theorem 2.11 and which utilizes some of the bootstrap results of this chapter: p Theorem 10.16 Let Θ ⊂ R be open, and assume θ0 ∈ Θ satisfies Ψ(θ0)=0. Also assume the following:

(A) For any sequence {θn}∈Θ, Ψ(θn) → 0 implies θn − θ0 →0;

(B) The class {ψθ : θ ∈ Θ} is strong Glivenko-Cantelli;

(C) For some η>0, the class F≡{ψθ : θ ∈ Θ, θ − θ0 ≤η} is Donsker − 2 → − → and P (ψθ ψθ0 ) 0 as θ θ0 0; 2 ∞ (D) P ψθ0 < and Ψ(θ) is differentiable at θ0 with nonsingular deriva-

tive matrix Vθ0 ;

ˆ −1/2 ◦ ˆ◦ −1/2 (E) Ψn(θn)=oP (n ) and Ψn(θn)=oP (n ). Then √ −1 T −1 T ˆ − 0  ∼ n(θn θ ) Z N 0,Vθ0 P [ψθ0 ψθ0 ](Vθ0 ) √ ◦ P and n(θˆ − θˆn) k0Z. n ◦ Before giving the proof, we make a few comments about the conditions (A)–(E) of the theorem. Condition (A) is one of several possible identifi- ability conditions. Condition (B) is a sufficient condition, when combined with (A), to yield consistency of a zero of Ψn. This condition is gener- ally reasonable to√ verify in practice. Condition (C) is needed for asymp- totic normality of n(θˆn − θ0) and is also not hard to verify in practice. Condition (D) enables application of the delta method at the appropriate juncture in the proof below, and (E) is a specification of the level of ap- proximation permitted in obtaining the zeros of the estimating equations. See exercise 10.5.5 below for a specific example of an estimation setting that satisfies these conditions. Proof of theorem 10.16. By (B) and (E),

Ψ(θˆn) ≤ Ψn(θˆn) +sup Ψn(θ) − Ψ(θ) ≤oP (1). θ∈Θ

P Thus θˆn → θ0 by the identifiability condition (A). By assertion (ii) of either corollary 10.14 or theorem 10.15 (depending on which bootstrap is used), 192 10. Bootstrapping Empirical Processes

◦ − as→∗ condition (B) implies supθ∈Θ Ψn(θ) Ψ(θ) 0. Thus reapplication of ˆ◦ →P conditions (A) and (E) yield θn θ0. Note that for the first part of the proof we will be using unconditional bootstrap results, while the associated conditional bootstrap results will be used only at the end. P By (C) and the consistency of θˆ0,wehaveG ψˆ −G ψ → 0. Since (E) √ n θn n θ0 G ˆ − ˆ now implies that nψθn = nP (ψθ0 ψθn )+oP (1), we can use the para- metric (Euclidean) delta method plus differentiability of Ψ to obtain √ √ − ˆ ˆ − G (10.14) nVθ0 (θ0 θn)+ noP ( θn θ0 )= nψθ0 + oP (1). √ ˆ − Since Vθ0 √is nonsingular, this yields that n θn θ0 (1 + oP (1)) = OP (1), and thus n(θˆn − θ0)=OP (1). Combining this with (10.14), we obtain

√ −1√ ˆ − 0 − P (10.15) n(θn θ )= Vθ0 n nψθ0 + oP (1), √ ˆ −  and thus n(θn θ0) Z with the specified covariance. √ G◦ ≡ −1 P◦ − The first part of condition (C) and theorem 2.6 imply that n k0 n( n ∞ Pn)  G in  (F) unconditionally, by arguments similar to those used in the (ii)⇒(i) part of the proof of theorem 10.4. Combining this with the sec- ◦ ◦ ond part of condition (C), we obtain k0G (ψˆ◦ )+G (ψˆ◦ ) − k0G (ψ 0 ) − n θn n θn n θ P √ √ ◦ G (ψ 0 ) → 0. Condition (E) now implies nP (ψ 0 − ψˆ◦ )= nP ψ 0 + n θ θ θn n θ oP (1). Using similar arguments to those used in the previous paragraph, we obtain √ ◦ −1√ ◦ ˆ − 0 − P n(θn θ )= Vθ0 n nψθ0 + oP (1). Combining with (10.15), we have √ √ ˆ◦ − ˆ − −1 P◦ − P n(θn θn)= Vθ n( n n)ψθ0 + oP (1). The desired conditional bootstrap convergence now follows from theorem 2.6, part (ii) or part (iii) (depending on which bootstrap is used).

10.4 Proofs

Proof of theorem 10.10. Fix >0 and a compact K ⊂ E0. The proof stems from certain properties of separable Banach spaces which can be found in Megginson (1998). Specifically, the fact that every separable Ba- nach space is isometrically isomorphic to a subspace of C[0, 1], implies the existence of an isometric isomorphism J∗ : E0 → A0,whereA0 is a subspace of C[0, 1]. Since C[0, 1] has a basis, we know by theorem 4.1.33 of Megginson (1998) that it also has the “approximation property.” This means by theorem 3.4.32 of Megginson (1998) that since J∗(K)iscom- pact, there exists a finite rank, bounded linear operator T∗ : A0 → A0 10.4 Proofs 193

− such that supy∈J∗(K) T∗(y) y < . Because T∗ is finite rank, this means ∈ A ⊂ there exists elements z1,...,zk 0 C[0, 1] and bounded linear func- ∗ ∗ A → R k ∗ tionals f1 ,...,fk : 0 such that T∗(y)= j=1 zjfj (y). Note that −1 since both J∗ and J∗ : A0 → E0 are isometric isomorphisms, they are also both Lipschitz continuous with Lipschitz constant 1. This means that −1 the map x → T˜ (x) ≡ J∗ (T∗(J∗(x))), with domain and range E0,satisfies ˜ − ∞ supx∈K T (x) x < . We now need to verify the existence of several important extensions. By Dugundji’s extension theorem (theorem 10.9 above), there exists a continuous extension of J∗, J˜∗ : E → lin A0. Also, by the Hahn-Banach ∗ ∗ extension theorem, there exist bounded linear extensions of f1 ,...,fk , ∗ ∗ −1 f˜ ,...,f˜ : lin A0 → R.NowletJ˜ denote the restriction of J to the 1 k ∗ k ∗ ∈ E ˜ domain j=1 zjfj (y):y J∗( 0) .SinceJ is Lipschitz continuous, as noted previously, we now have by theorem 10.17 below that there exists a Lipschitz continuous extension of J˜, J : lin (z1,...,zk) → E, with Lipschitz ∗ constant possibly larger than 1. Now define x → f (x) ≡ f˜ (J˜∗(x)), for j j → ≡ k j =1,...,k,andx T (x) J i=1 zjfj(x) ; and note that T is a continuous extension of T˜ . Now the quantities k, z1,...,zk, f1,...,fk, J and T all satisfy the given requirements. Proof of theorem 10.13. Since the processes Pn − P , Wn and W˜ n do not change when F is replaced by F≡{˙ f − Pf : f ∈F},wecanuse for the index set either F or F≡{˙ f − Pf : f ∈F}without changing ∗ the processes. Define also F˙ ≡ f − Pf F . We first need to show that the expectations and probabilities in (iii), (iv), (vii) and (viii) are well defined. n Note that for fixed x1,...,x and a ≡ (a1,...,a ) ∈ R ,themap n n n −1 T (a1,...,an) → n aif(xi) =sup|a u|, u∈B i=1 F˙ n where B ≡{(f(x1),...,f(xn)) : f ∈ F}˙ ⊂ R . By the continuity of the map (a, u) →|aT u| and the separability of R, this map is a measurable function even if the set B is not a Borel set. Thus the conditional expecta- tions and conditional probabilities are indeed well defined. (i)⇒(ii): Note that F being P -G-C implies that P F<˙ ∞ by lemma 8.13. Because F˙ is G-C and ξ is trivially G-C, the desired result follows from ∗ corollary 9.26 (of section 9.3) and the fact that ξ(f − Pf) F ≤|ξ|×F˙ is integrable. (ii)⇒(i): Since both sign(ξ)andξ · F˙ are P -G-C, corollary 9.26 can be applied to verify that sign(ξ) · ξ · F˙ = |ξ|·F˙ is also P -G-C. We also have by lemma 8.13 that P ∗F<˙ ∞ since P |ξ| > 0.Nowwehaveforfixed X1,...,Xn, n P −1 ≤ W (Eξ) n F˙ = n (Eξi)δXi F˙ Eξ n F , i=1 194 10. Bootstrapping Empirical Processes

∗ ∗ and thus (Eξ)E Pn − P F ≤ E Wn F . By applying theorem 9.28 twice, we obtain the desired result. ˙ (ii)⇒(iii): Note first that (ii) immediately implies P |ξ|F(X) < ∞ by ˙ ∞ | | ≡ −1 n | | ˙ lemma 8.13. Thus P F< since E ξ > 0. Define Rn n i=1 ξi F (Xi), let B be the set of all infinite sequences (ξ1,X1), (ξ2,X2),... such that as∗ Wn F → 0andRn → E[|ξ|F˙ ], and let Eξ,∞ be the expectation taken over the infinite sequence ξ1,ξ2,...holding the infinite sequence X1,X2,... fixed. Note that the set B has probability 1. Moreover, by the bounded con- vergence theorem,

lim sup Eξ Wn F˙ 1{Rn ≤ K} = lim sup Eξ,∞ Wn F˙ 1{Rn ≤ K}1{B} ≤ W ∗ { ≤ } { } Eξ,∞ lim sup n F˙ 1 Rn K 1 B =0, ∞ −1 n | | outer almost surely, for any K< .Inaddition,ifweletSn = n i=1 ξi , we have for any 0

Eξ Wn F˙ 1{Rn >K}≤EξRn1{Rn >K}

≤ Eξ [Sn1{Rn >K}] PnF˙ N(E|ξ|)[P F˙ ]2 ≤ n +E [S 1{S >N}]P F,˙ K ξ n n n where the second-to-last inequality follows by symmetry. By exercise 10.5.4, 2 2 the last line of the display has a lim√ sup ≤ N(P F˙ ) /K +(E|ξ|) /N outer almost surely. Thus, if we let N = K and allow K ↑∞slowly enough, W → we ensure that lim supn→∞ Eξ n F 0, outer almost surely. Hence (iii) follows. (iii)⇒(iv): This is obvious. P (iv)⇒(ii): (iv) clearly implies that Wn F → 0. Now lemma 8.16 implies ∗ that since the class |ξ|×F˙ has an integrable envelope, a version of Wn F must converge outer almost surely to a constant. Thus (ii) follows. (ii)⇒(v): Assertion (ii) implies that there exists a measurable set B of ∗ infinite sequences (ξ1,X1), (ξ2,X2),...with P(B) = 1 such that Wn F → ∗ 0onB for some version of Wn F . Thus by the bounded convergence theorem, we have for any η>0 and almost all sequences X1,X2,...,

∗ ∗ lim sup P( Wn F >η|) = lim sup Eξ,∞1 { Wn F >η} 1{B} n→∞ n→∞ ∗ =Eξ,∞ lim sup 1 { Wn F >η} 1{B} n→∞ =0.

Thus (v) follows. (v)⇒(iv): This is obvious. (ii)⇒(vi): Note that 10.4 Proofs 195

n ¯ −1 (10.16) W˜ n − Wn F˙ ≤|ξ − Eξ|×|n F˙ (Xi)| +(Eξ) Pn F˙ . i=1

∗ Since (ii)⇒(i), P f − Pf F < ∞. Thus, since the centered weights ξi − Eξ satisfy the conditions of the theorem as well as the original weights, as∗ the right side converges to zero outer almost surely. Hence W˜ n F → 0, and (vi) follows. (vi)⇒(ii): Since ξi can be replaced with ξi − Eξ without changing W˜ n, we will assume without loss of generality that Eξ = 0 (for this paragraph only). Accordingly, (10.16) implies

n ˜ ¯ −1 (10.17) Wn − Wn F˙ ≤|ξ − Eξ|×|n F˙ (Xi)|. i=1

Thus (ii) will follow by the strong law of large numbers if we can show that EF<˙ ∞.NowletY1,...,Yn be i.i.d. P independent of X1,...,Xn,andlet ˜ ˜ ξ1,...,ξn be i.i.d. copies of ξ1,...,ξ n independent of X1,...,X n,Y1,...,Yn. W˜ ≡ −1 n − ¯ ˜ − ¯ W˜ ≡ Define 2n (2n) i=1 (ξi ξ)f(Xi)+(ξi ξ)f(Yi) and 2n −1 n ˜ − ¯ − ¯ ¯ ≡ −1 n (2n) i=1 (ξi ξ)f(Xi)+(ξi ξ)f(Yi) ,whereξ (2n) i=1(ξi + ˜ W˜ as→∗ W˜ as→∗ W˜ − ξi). Since both 2n F˙ 0and 2n F˙ 0, we have that n W˜ as→∗ 2n F˙ 0. However,

1 n ξ − ξ˜ W˜ − W˜ = i i [f(X ) − f(Y )] , n 2n n 2 i i i=1 −1 n − as→∗ and thus n i=1[f(Xi) f(Yi)] F˙ 0 by the previously established equivalence between (i) and (ii) and the fact that the new weights (ξi −ξ˜i)/2 satisfy the requisite conditions. Thus

∗ ∗ ∗ E F˙ =E f(X) − Ef(Y ) F˙ ≤ E f(X) − f(Y ) F˙ < ∞, where the last inequality holds by lemma 8.13, and (ii) now follows. (iii)⇒(vii): Since

Eξ W˜ n − Wn F˙ ≤ (E|ξ|) Pn F˙ ,

as∗ we have that Eξ W˜ n F˙ → 0, because (iii) also implies (i). (vii)⇒(viii): This is obvious. (viii)⇒(vi): Since W˜ n does not change if the ξis are replaced by ξi − Eξ, we will assume—as we did in the proof that (vi)⇒(ii)—that Eξ =0 without loss of generality. By reapplication of (10.17) and the strong law of P large numbers, we obtain that W˜ n F → 0. Since the class |ξ|×F˙ has an integrable envelope, reapplication of lemma 8.16 yields the desired result. 196 10. Bootstrapping Empirical Processes

(vi)⇒(ix): The proof here is identical to the proof that (ii)⇒(v), after exchanging Wn with W˜ n. (ix)⇒(viii): This is obvious. Proof of theorem 10.15. The fact that the conditional expressions in assertions (iii) and (iv) are well defined can be argued as in the proof of theorem 10.13 above, and we omit the details. (i)⇒(v): This follows from lemma 10.18 below since the vectors Wn/n are exchangeable and satisfy the other required conditions. (v)⇒(iv): This is obvious. (iv)⇒(i): For each integer n ≥ 1, generate an infinite sequence of in- (1) (2) (k) dependent random row n-vectors mn ,mn ,... as follows. Set m1 =1 for all integers k ≥ 1, and for each n>1, generate an infinite sequence (1) (2) of i.i.d. Bernoullies Bn ,Bn ,... with probability of success 1/n,andset (k) − (k) (k) (k) mn =[1 Bn ](mn−1, 0) + Bn (0,...,0, 1). Note that for each fixed n, (1) (2) −1 −1 mn ,mn ,... are i.i.d. multinomial(1,n ,...,n ) vectors. Independent of these random quantities, generate an infinite sequenceU1,U2,...of i.i.d. n Poisson random variables with mean 1, and set Nn = i=1 Ui.Alsomake sure that all of these random quantities are independent of X1,X2,.... ( ) Without loss of generality assume W = n m i and define ξ(n) ≡ n i=1 n Nn (i) −1 i=1 mn .ItiseasytoverifythattheWn are indeed multinomial(n, n , −1 (n) (n) (n) (n) ≡ ...,n ) vectors as claimed, and that ξi ,...,ξi ,where(ξ1 ,...,ξn ) ξ(n), are i.i.d. Possion mean 1 random variables. Note also that these ran- dom weights are independent of X1,X2,.... W ≡ −1 n (n) − − Let n n i=1(ξi 1)(δXi P ), and note that

n Pˆ − P − W −1 − (n) − n n n = n (Wni ξi )(δXi P ). i=1

− (n) Since the nonzero elements of (Wni ξi ) all have the same sign by con- struction, we have that

n Pˆ − P − W ≤ −1 | − (n)| − EW,ξ n n n F EW,ξ n Wni ξi (δXi P ) F i=1 − Nn n ˙ ˙ ≤ E PnF + P F n as∗ → 0, where the last inequality follows from the exchangeability result EW,ξ |Wni− (n)| | − | ≤ ≤ ξi =E[Nn n /n], 1 i n, and the outer almost sure convergence to −1/2 zero follows from the fact that E[|Nn − n|/n] ≤ n combined with the moment conditions. In the forgoing, we have used EW,ξ to denote taking (n) expectations over Wn and ξ conditional on X1,X2,....Wehavejust 10.4 Proofs 197

(n) established that assertion (iv) holds in theorem 10.13 with weights (ξ1 − (n) 1,...,ξn − 1) that satisfy the necessary conditions. Thus F is Glivenko- Cantelli. (v)⇒(ii): Let EW,ξ,∞ be the expecation over the infinite sequences of the weights conditional on X1,X2,....Forfixedη>0andX1,X2,...,wehave by the bounded convergence theorem ∗ ∗ EW,ξ,∞ lim sup 1 Pˆn − Pn F >η = lim sup EW,ξ,∞1 Pˆn − Pn F >η . n→∞ n→∞

But the right side→ 0 for almost all X1,X2,... by (v). This implies ∗ 1 Pˆn − Pn F >η → 0, almost surely. Now (ii) follows since η was arbi- trary. (ii)⇒(iii): Let B be the set of all sequences of weights and data for which ∗ Pˆn − Pn F → 0. From (ii), we know that B is measurable, P (B) = 1 and, by the bounded convergence theorem, we have for every η>0andall X1,X2,... ∗ 0=EW,ξ,∞ lim sup 1 Pˆn − Pn F >η 1{B} n→∞ ∗ = lim sup EW,ξ,∞ 1 Pˆn − Pn F >η 1{B} . n→∞ Since P (B) = 1, this last line implies (v), since η was arbitrary, and hence assertions (i) and (iv) also hold by the previously established equivalences. Fix 0

E Pˆ − P ˙ ≤ E Pˆ − P ˙ ˙ W n n F W n n F·1{F ≤K} +EW (Pˆn + Pn)F˙ 1{F>K˙ } ≤ Pˆ − P P ˙ { ˙ } EW n n F˙ ·1{F˙ ≤K} +2 n[F 1 F>K ] as∗ → 2P [F˙ 1{F>K˙ }], by assertion (iv). Since this last term can be made arbitrarily small by choosing K large enough, assertion (iii) follows. (iii)⇒(iv): This is obvious. Theorem 10.17 Let X, Z be metric spaces, with the dimension of X being finite, and let Y ⊂ X. For any Lipschitz continuous map f : Y → Z, there exists a Lipschitz continuous extension F : X → Z. Proof. This is a simplification of theorem 2 of Johnson, Lindenstrauss and Schechtman (1986), and the proof can be found therein. Lemma 10.18 Let F be a strong Glivenko-Cantelli class of measurable functions. For each n,let(Mn1,...,Mnn) be an exchangeable nonnegative 198 10. Bootstrapping Empirical Processes n random vector independent of X1,X2,... such that i=1 Mni =1and P max1≤i≤n |Mni| → 0. Then, for every η>0,

∗ n as∗ − X → P Mni(δXi P ) >η n 0. i=1 F

This is lemma 3.6.16 of VW, and we omit the proof.

10.5 Exercises

10.5.1. Show the following:

(a) · 2,1 is a norm on the space of real, square-integrable random vari- ables.

(b) For any real random variable ξ,andanyr>2, (1/2) ξ 2 ≤ ξ 2,1 ≤ 2 (r/ (r − 2)) ξ r. Hints: For the first inequality, show that E|ξ| ≤ ∞ | | ≤ × 2 0 P( ξ >u)udu 2 ξ 2,1 ξ 2. For the second inequality, show first that ∞ r 1/2 ξ r 2 1 ≤ ξ , a + r dx a x for any a>0.

10.5.2. Show that for any p>1, and any real i.i.d. X1,...,Xn with E|X|p < ∞,wehave | | Xi → Emax 1 0, 1≤i≤n n /p as n →∞. Hint: Show first that for any x>0, | | | |p Xi ≤ − −E X lim sup P max 1/p >x 1 exp p . n→∞ 1≤i≤n n x

10.5.3. Prove lemma 10.12.

10.5.4. Let ξ1,...,ξn be i.i.d. nonnegative random variables with Eξ< ∞ −1 n , and denote Sn = n i=1 ξi. Show that

lim lim sup E[Sn1{Sn >m}]=0. m→∞ n→∞ Hint: Theorem 9.28 may be useful here.

10.5.5. Assume that, given the covariate Z ∈ Rp, Y is Bernoulli with T T probability of success eθ Z /(1 + eθ Z ), where θ ∈ Θ=Rp and E[ZZT ]is positive definite. Assume that we observe an i.i.d. sample (Y1,Z1),...,(Yn, 10.6 Notes 199

Zn) generated from this model with true parameter θ0 ∈ R. Show that the conditions of theorem 10.16 are satisfied for Z-estimators based on T eθ Z ψ (y,z)=Z Y − . θ 1+eθT Z

Note that one of the challenges here is the noncompactness of Θ.

10.6 Notes

Much of the material in section 10.1 is inspired by chapters 2.9 and 3.6 of VW, although the results for the weights (ξ1 − ξ,...,ξ¯ n − ξ¯)and(ξ1/ξ¯− 1,...,ξn/ξ¯) and the continuous mapping results are essentially new. The equivalence of assertions (i) and (ii) of theorem 10.4 is theorem 2.9.2 of VW, while the equivalence of (i) and (ii) of theorem 10.6 is theorem 2.9.6 of VW. Lemma 10.5 is lemma 2.9.5 of VW. Theorem 10.16 is an expansion of theorem 5.21 of van der Vaart (1998), and part (b) of exercise 10.5.1 is exercise 2.9.1 of VW. 200 10. Bootstrapping Empirical Processes This is page 201 Printer: Opaque this 11 Additional Empirical Process Results

In this chapter, we study several additional empirical process results that are useful but don’t fall neatly into the framework of the other chapters. Because the contents of this chapter are somewhat specialized, some readers may want to skip it the first time they read the book. Although some of the results given herein will be used in later chapters, the results of this chapter are not really necessary for a philosophical understanding of the remainder of the book. On the other hand, this chapter does contain results and references that are useful for readers interested in the deeper potential of empirical process methods in genuinely hard statistical problems. We first discuss bounding tail probabilities and moments of Gn F . These results will be useful in chapter 14 for determining rates of con- vergence of M-estimators. We then discuss Donsker results for classes com- posed of sequences of functions and present several related statistical appli- cations. After this, we discuss contiguous alternative probability measures Pn that get progressively closer to a fixed “null” probability measure P as n gets larger. These results will be useful in part III of the book, especially in chapter 18, where we discuss optimality of tests. We then discuss weak convergence of sums of independent but not iden- tically distributed stochastic processes which arise, for example, in clinical trials with non-independent randomization schemes such as biased coin de- signs (see, for example, Wei, 1978). We develop this topic in some depth, discussing both a central limit theorem and validity of a certain weighted bootstrap procedure. We also specialize these results to empirical processes based on i.i.d. data but with functions classes Fn that change with n. 202 11. Additional Empirical Process Results

The final topic we cover is Donsker results for dependent observations. Here, our brief treatment is primarily meant to introduce the subject and point the interested reader toward the key results and references.

11.1 Bounding Moments and Tail Probabilities

∗ We first consider bounding moments of Gn F under assumptions similar to the Donsker theorems of chapter 8. While these do not provide sharp bounds, the bounds are still useful for certain problems. We need to in- troduce some slightly modified entropy integrals. The first is based on a modified uniform entropy integral: δ ∗ J (δ, F) ≡ sup 1+logN( F Q,2, F,L2(Q))d , Q 0 where the supremum is over all finitely discrete probably measures Q with F Q,2 > 0. The only difference between this and the previously defined uniform entropy integral J(δ, F,L2), is the presence of the 1 under the radical. The following theorem, which we give without proof, is a subset of theorem 2.14.1 of VW: Theorem 11.1 Let F be a P -measurable class of measurable functions, with measurable envelope F . Then, for each p ≥ 1,

G ∗ ≤ ∗ F n F P,p cpJ (1, ) F P,2∧p, where the constant cp < ∞ depends only on p. We next provide an analogue of theorem 11.1 for bracketing entropy, based on the modified bracketing integral: δ ∗ J[] (δ, F) ≡ 1+logN[]( F P,2, F,L2(P ))d . 0

The difference between this definition and the previously defined J[](δ, F, L2(P )) is twofold: the presence of the 1 under the radical and a rescaling of by the factor F P,2. The following theorem, a subset of theorem 2.14.2 of VW, is given without proof: Theorem 11.2 Let F be a class of measurable functions with measurable envelope F .Then

G ∗ ≤ ∗ F n F P,1 cJ[] (1, ) F P,2, for some universal constant c<∞. 11.1 Bounding Moments and Tail Probabilities 203

For some problems, the previous moment bounds are sufficient. In other settings, more refined tail probability bounds are needed. To accomplish this, stronger assumptions are needed for the involved function classes. Re- call the definition of pointwise measurable (PM) classes from section 8.2. The following tail probability results, the proofs of which are given in sec- tion 11.7 below, apply only to bounded and PM classes: Theorem 11.3 Let F be a pointwise measurable class of functions f : X → [−M,M],forsomeM<∞, such that 1 W (11.1) sup log N ( , F,L2(Q)) ≤ K , Q for all 0 < ≤ M and some constants 0 204 11. Additional Empirical Process Results

2 x) ≤ Ke−Cx for some constants K<∞ and C>0. These results can be significantly refined under stronger conditions to yield more precise bounds on the constants. Some results along this line can be found in chapter 2.14 of VW. A very strong result applies to the empirical distribution function Fn,whereF consists of left half-lines in R:

Theorem 11.5 For any i.i.d. sample X1,...,Xn with distribution F , √ −2x2 P sup n|Fn(t) − F (t)| >x ≤ 2e , t∈R for all x>0. Proof. This is the celebrated result of Dvoretsky, Kiefer and Wolfowitz (1956), given in their lemma 2, as refined by Massart (1990) in his corol- lary 1. We omit the proof of their result but note that their result applies to the special case where F is continuous. We now show that it also applies when F may be discontinuous. Without loss of generality, assume that F has discontinuities, and let t1,...,tm be the locations of the discontinuities of F ,wherem may be infinity. Note that the number of discontinuities can be at most countable. Let p1,...,pm be the jump sizes of F at t1,...,tm. Now let U1,...,Un be i.i.d. uniform random variables independent of the X1,...,Xn, and define new random variables

m Yi = Xi + pj [1{Xi >tj} + Ui1{Xi = tj}] , j=1 ≤ ≤ → m { ≥ 1 i n. Define also the transformation t T (t)=t + j=1 pj1 t } F∗ ∗ tj ;let n be the empirical distribution of Y1,...,Yn;andletF be the distribution of Y1. It is not hard to verify that

|F − | |F∗ − ∗ | sup n(t) F (t) =supn(T (t)) F (T (t)) t∈R t∈R ≤ |F∗ − ∗ | sup n(s) F (s) , s∈R and the desired result now follows since F ∗ is continuous.

11.2 Sequences of Functions

Whether a countable class of functions F is P -Donsker can be verified using the methods of chapters 9 and 10, but sometimes the special structure of certain countable classes simplifies the evaluation. This is true for certain classes composed of sequences of functions. The following is our first result in this direction: 11.2 Sequences of Functions 205

Theorem 11.6 { ≥ } Let fi,i 1 be any sequence of measurable functions ∞ − 2 ∞ satisfying i=1 P (fi Pfi) < . Then the class ∞ ∞ | |≤ cifi : i=1 ci 1 and the series converges pointwise i=1 is P -Donsker. Proof. Since the class given in the conclusion of the theorem is the pointwise closure of the symmetric convex hull (see the comments given in section 9.1.1 just before theorem 9.4) of the class {fi}, it is enough to verify that {fi} is Donsker, by theorem 9.29. To this end, fix >0and { } ∪m+1F define for each positive integer m, a partition fi = i=1 i as follows. For each i =1,...,m,letFi consist of the single point fi,andletFm+1 = { } |G − | fm1 ,fm+2,... .Sincesupf,g∈Fi n(f g) = 0 (trivially) for i =1,...,m, we have, by Chebyshev’s inequality,

P sup sup |Gn(f − g)| > ≤ P sup |Gnf| > i f,g∈Fi f∈Fm+1 2 ∞ 4 ≤ P (f − Pf )2. 2 i i i=m+1

Since this last term can be made arbitrarily small by choosing m large enough, and since was arbitrary, the desired result now follows by theo- rem 2.1 via lemma 7.20. When the sequence {fi} satisfies Pfifj = 0 whenever i = j, theorem 11.6 can be strengthened as follows: Theorem 11.7 { ≥ } Let fi,i 1 be any sequence of measurable functions  ∞ 2 ∞ satisfying Pfifj =0for all i = j and i=1 Pfi < . Then the class ∞ ∞ 2 ≤ cifi : i=1 ci 1 and the series converges pointwise i=1 is P -Donsker. ≡ ∞ ≤ Proof. Since the conditions on c (c1,c2,...) ensure i=1 cifi ∞ 2 i=1 fi , we have by the dominated convergence theorem that point- wise converging sums also converge in L2(P ). Now we argue that the class F of all of these sequences is totally bounded in L2(P ). This follows be- cause F can be arbitrarily closely approximated by a finite-dimensional set, since 2 2 2 ≤ 2 → P cifi = ci Pfi Pfi 0, i>m i>m i>m 206 11. Additional Empirical Process Results as m →∞. Thus the theorem is proved if we can show that the sequence Gn, as a process indexed by F, is asymptotically equicontinuous with re- ∞ spect to the L2(P )-seminorm. Accordingly, note that for any f = i= cifi, ∞ ≥ g = i=1 difi,andintegerk 1, ∞ 2 |G − G |2 − G n(f) n(g) = (ci di) n(fi) i=1 ∞ k k G2 (f ) ≤ 2 (c − d )2Pf2 n i +2 (c − d )2 G2 (f ). i i i Pf2 i i n i i=1 i=1 i i=k+1 i=k+1

Since, by assumption, c − d ≤ c + d ≤2(here, · is the infinite- dimensional Euclidean norm), the above expression is bounded by

∞ k G2 (f ) 2 f − g 2 n i +8 G2 (f ). P,2 Pf2 n i i=1 i i=k+1

Now take the supremum over all pairs of series f and g with f −g P,2 <δ. G2 ≤ 2 2 Since also E n(fi) Pfi , the expectation is bounded above by 2δ k + ∞ 2 8 i=k+1 Pfi . This quantity can now be made arbitrarily small by first choosing k large and then choosing δ small enough. If the functions {fi} involved in the preceding theorem are an orthonor- mal sequence {ψi} in L2(P ), then the result can be reexpressed in terms of an elliptical class for a fixed sequence of constants {bi}: ∞ ∞ c2 F≡ c ψ : i ≤ 1 and the series converges pointwise . i i b2 i=1 i=1 i F ∞ 2 ∞ More precisely, theorem 11.7 implies that is P -Donsker if i=1 bi < . Note that a sufficient condition for thestated pointwise convergence to hold { } ∞ 2 2 ≤ ∞ 2 2 at the point x for all ci satisfying i=1 ci /bi 1isfor i=1 bi ψi (x) < ∞. A very important property of an empirical process indexed by an ellip- tical class F is the following: ∞ 2 ∞ G 2 G 2G2 (11.3) n F =sup ci n(ψi) = bi n(ψi). ∈F f i=1 i=1 In the central quantity, each function f ∈Fis represented by its series rep- resentation {ci}. For the second equality, it is easy to see that the last term is an upper bound for the second term by the Cauchy-Schwartz inequality ∞ 2 2 ≤ combined with the fact that i=1 ci /bi 1. The next thing to note is that 2G ∞ 2G2 this maximum can be achieved by setting ci = bi n(ψi)/ i=1 bi n(ψ). An important use for elliptical classes is to characterize the limiting distribution of one- and two- sample Cram´er-von Mises, Anderson-Darling, 11.2 Sequences of Functions 207 and Watson statistics. We will now demonstrate this for both the Cram´er- von Mises and Anderson-Darling statistics. For a study of the one-sample Watson statistic, see example 2.13.4 of VW. We now have the following key result, the proof of which we give in section 11.6. Note that the result (11.3) plays an important role in the proof. Theorem 11.8 P Let n be the empirical distribution√ of an i.i.d. sample of uniform [0, 1] random variables, let Gn = n(Pn − P ) be the classi- cal empirical process indexed by t ∈ [0, 1],andletZ1,Z2,... be an i.i.d. sequence of standard normal deviates independent of Pn. Also define the function classes ⎧ ⎫ ⎨∞ √ ∞ ⎬ F ≡ 2 2 2 ≤ 1 ⎩ cj 2cosπjt : cj π j 1⎭ j=1 j=1 and ⎧ ⎫ ⎨∞ √ ∞ ⎬ F ≡ − 2 ≤ 2 ⎩ cj 2pj(2t 1) : cj j(j +1) 1 and pointwise convergence⎭ , j=1 j=1 √ √ √ where the functions p0(u)√≡ (1/2) 2, p1(u) ≡ (1/2) 6u, p2(u) ≡ (1/4) 10 2 3 ×(3u −1), p3(u) ≡ (1/4) 14(5u −3u), and so on, are the orthonormalized Legendre polynomials in L2[−1, 1]. Then the following are true:

(i) The one-sample Cram´er-von Mises statistic for uniform data satisfies

∞ 1 Z2 G2 G 2  1 j ≡ n(t)dt = n F1 2 2 T1. 0 π j j=1

(ii) The one-sample Anderson-Darling statistic for uniform data satisfies

∞ 1 G2 Z2 n(t) G 2  j ≡ dt = n F2 T2. 0 t(1 − t) j(j +1) j=1

Theorem 11.8 applies to testing whether i.i.d. real data X1,...,Xn comes from an arbitrary continuous distribution F . This is realized be replacing t with F (x) throughout the theorem. A more interesting result can be obtained by applying the theorem to testing whether two samples have the same distribution. Let Fˆn,j be the empirical distribution of sample j of nj i.i.d. real random variables, j =1, 2, where the two samples are independent, and where n = n1 + n2.LetFˆn,0 ≡ (n1Fˆn,1 + n2Fˆn,2)/n be the pooled empirical distribution. The two-sample Cram´er-von Mises statistic is 208 11. Additional Empirical Process Results ∞ 2 n1n2 Tˆ1 ≡ Fˆn,1(s) − Fˆn,2(s) dFˆn,0(s), n1 + n1 −∞ while the two-sample Anderson-Darling statistics is

2 ∞ Fˆn,1(s) − Fˆn,2(s) n1n2 Tˆ2 ≡ dFˆn,0(s). n1 + n1 −∞ Fˆn,0(s) 1 − Fˆn,0(s)

The proof of the following corollary is given in section 11.6: Corollary 11.9 Under the null hypothesis that the two samples come from the same continuous distribution F0, Tˆj  Tj,asn1 ∧ n2  ∞,for j =1, 2.

Since the limiting distributions do not depend on F0,criticalvaluescanbe easily calculated by Monte Carlo simulation. Our own calculations resulted in critical values at the 0.05 level of 0.46 for T1 and 2.50 for T2.

11.3 Contiguous Alternatives

For each n ≥ 1, let Xn1,...,Xnn be i.i.d. random elements in a measurable space (X ,A). Let P denote the common probability distribution under the “null hypothesis,” and let Pn be a “contiguous alternative hypothesis” distribution satisfying

1 2 √ 1 / (11.4) n(dP 1/2 − dP 1/2) − hdP 1/2 → 0, n 2 as n →∞, for some measurable function h : X → R. The following lemma, which is part of lemma 3.10.11 of VW and which we give without proof, provides some properties for h:

Lemma 11.10 IfthesequenceofprobabilitymeasuresPn satisfy (11.4), then necessarily Ph =0and Ph2 < ∞. The following theorem gives very general weak convergence properties of the empirical process under the contiguous alternative Pn. Such weak convergence will be useful for studying efficiency of tests in chapter 18. This is theorem 3.10.12 of VW which we give without proof: Theorem 11.11 Let F be a P -Donsker class of measurable functions ∞ with P F < , and√ assume the sequence of probability measures Pn sat- ∞ isfies (11.4). Then n(Pn−P ) converges under Pn in distribution in  (F) to the process f → G(f)+Pfh,where√ G is a tight Brownian bridge. 2 − − → If,√ moreover, Pnf F = O(1),then n(Pn P )f Pfh F 0 and n(Pn − Pn) converges under Pn in distribution to G. 11.3 Contiguous Alternatives 209

We now present a bootstrap result for contiguous alternatives. For i.i.d. nonnegative random weights ξ1,...,ξn with mean 0 <μ<∞ and vari- 2 ∞ P˜ ance 0 <τ < , recall the bootstrapped√ empirical measures nf = −1 n ¯ G˜ P˜ − P n i=1(ξi/ξ)f(Xi)and n = n(μ/τ)( n n) from sections 2.2.3 n Pn and 10.1. We define several new symbols: →P and  denote convergence in probability and weak convergence, respectively, under the distribution Pn. Pn In addition, Xˆn X in a metric space D denotes conditional bootstrap con- M | ˆ − | →Pn vergence in probability under Pn, i.e., supg∈BL1 EM g(Xn) Eg(X) 0 ∗ Pn and EM g(Xˆn) − EM g(Xˆn)∗ → 0, for all g ∈ BL1,whereBL1 is the same bounded Lipschitz function space defined in section 2.2.3. Note that we re- quire the weights ξ1,...,ξn to have the same distribution and independence from Xn1,...,Xnn under both Pn and P . Theorem 11.12 Let F be a P -Donsker class of measurable functions, let Pn satisfy (11.4), and assume

2 (11.5) lim lim sup Pn(f − Pf) 1{|f − Pf| >M} =0 M→∞ n→∞ for all f ∈F.Alsoletξ1,...,ξn be i.i.d. nonnegative random variables, 2 independent of Xn1,...,Xnn,withmean0 <μ<∞,variance0 <τ < ∞, Pn ∞ and with ξ1 2,1 < ∞.ThenG˜ n G in  (F) and G˜ n is asymptotically ξ measurable. −1 Proof. Let ηi ≡ τ (ξi − μ), i =1,...,n, and note that

n G˜ −1/2 ¯− (11.6) n = n (μ/τ) (ξi/ξ 1)δXi i=1 n −1/2 ¯− − = n (μ/τ) (ξi/ξ 1)(δXi P ) i=1 n −1/2 − = n ηi(δXi P ) =1 i μ n + − 1 n−1/2 η (δ − P ) ξ¯ i Xi i=1 n μ μ + − 1 n−1/2 (δ − P ). τ ξ¯ Xi i=1

Since F is P -Donsker, we also have that F≡{˙ f − Pf : f ∈F}is P -Donsker. Thus by the unconditional multiplier central limit theorem (theorem 10.13), we have that η ·F is also P -Donsker. Now, by the fact that P (f − Pf) F = 0 (trivially) combined with theorem 11.11, both 210 11. Additional Empirical Process Results −1/2 n − Pn G −1/2 n − Pn G − n i=1 ηi(δXi P ) (f)andn i=1(δXi P ) (f)+P (f Pf)h in ∞(F). The reason the first limiting process has mean zero is be- cause η is independent of X and thus Pη(f − Pf)h =0forallf ∈F.Thus Pn Pn ∞ the last two terms in (11.6) → 0andG˜ n  G in  (F). This now implies the unconditional asymptotic tightness and desired asymptotic measura- bility of G˜ n. By the same kinds of arguments we used in the proof of theorem 10.4, all we need to verify now is that all finite dimensional collections f1,...,fm ∈ F converge under Pn in distribution, conditional on the data, to the appro- − priate limiting Gaussian process. Accordingly, let Zi =(f1(Xi) Pf1,..., − −1/2 n fm(Xi) Pfm) , i =1,...,n. What we need to show is that n i=1 ηiZi converges weakly under Pn, conditional on the Z1,...,Zn, to a mean zero Gaussian process with variance Σ ≡ PZ1Z . By lemma 10.5, we are done 1 √ −1 n →Pn →Pn if we can verify that n i=1 ZiZi Σandmax1≤i≤n Zi / n 0. By the assumptions of the theorem,

lim sup {En(M) ≡ Pn[Z1Z11{ Z1 >M}]}→0 n→∞ as M →∞. Note also that (11.4) can be shown to imply that Pnh → Ph,as n →∞, for any bounded h (this verification is saved as an exercise). Thus, for any M<∞, PnZ1Z11{ Z1 ≤M} convergences to PZ1Z11{ Z1 ≤ M}.SinceM is arbitrary, this convergence continues to hold if M is re- placed by a sequence Mn going to infinity slowly enough. Accordingly,

n n −1 −1 { } n ZiZi = n ZiZi1 Zi >Mn i=1 i=1 −1 { ≤ } +n ZiZi1 Zi Mn i=1 Pn → PZ1Z1, as n →∞.Nowwealsohave 9 Z Z 2 max √i = max i 1≤i≤n n 1≤i≤n n 6 7 7 2 n 8 Zi 1 2 ≤ max 1{ Zi ≤M} + Zi 1{ Zi >M}. 1≤i≤n n n i=1

The first term under the last square root sign →Pn 0 trivially, while the expectation under Pn of the second term, En(M), goes to zero as n → ∞ and M →∞sufficiently slowly with n, as argued previously. Thus √ Pn max1≤i≤n Zi / n → 0, and the proof is complete. 11.4 Sums of Independent but not Identically Distributed Stochastic Processes 211 11.4 Sums of Independent but not Identically Distributed Stochastic Processes

In this section, we are interested in deriving the limiting distribution of mn sums of the form i=1 fni(ω,t), where the real-valued stochastic processes {fni(ω,t),t∈ T,1 ≤ i ≤ mn}, for all integers n ≥ 1, are independent within rows on the probability space (Ω, A,P) for some index set T . In addition to a central limit theorem, we will present a multiplier bootstrap result to aid in inference. An example using these techniques will be presented in the upcoming case studies II chapter. Throughout this section, function arguments or subscripts will sometimes be suppressed for notational clarity.

11.4.1 Central Limit Theorems The notation and set-up are similar to that found in Pollard (1990) and Kosorok (2003). A slightly more general approach to the same question can be found in chapter 2.11 of VW. Part of the generality in VW is the ability to utilize bracketing entropy in addition to uniform entropy for establishing tightness. An advantage of Pollard’s approach, on the other hand, is that total boundedness of the index set T is a conclusion rather than a condition. Both approaches have their merits and appear to be roughly equally useful in practice. We need to introduce a few measurability conditions which are different from but related to conditions introduced in previous chapters. The first condition is almost measurable Suslin: Call a triangular array {fni(ω,t),t∈ T } almost measurable Suslin (AMS) if for all integers n ≥ 1, there exists a Suslin topological space Tn ⊂ T with Borel sets Bn such that

(i) mn ∗ 2 P sup inf (fni(ω,s) − fni(ω,t)) > 0 =0, ∈ s∈Tn t T i=1

(ii) For i =1...mn, fni :Ω× Tn → R is A×Bn-measurable. The second condition is stronger yet seems to be more easily verified in applications: Call a triangular array of processes {fni(ω,t),t∈ T } separable if for every integer n ≥ 1, there exists a countable subset Tn ⊂ T such that mn ∗ 2 P sup inf (fni(ω,s) − fni(ω,t)) > 0 =0. ∈ s∈Tn t T i=1 The following lemma shows that separability implies AMS:

Lemma 11.13 If the triangular array of stochastic processes {fni(ω,t),t∈ T } is separable, then it is AMS. 212 11. Additional Empirical Process Results

Proof. The discrete topology applied to Tn makes it into a Suslin topol- ogy by countability, with resulting Borel sets Bn.Fori =1...mn, fni : Ω × Tn → R is A×Bn-measurable since, for every α ∈ R, : {(ω,t) ∈ Ω × Tn : fni(ω,t) >α} = {(ω,s):fni(ω,s) >α} , s∈Tn and the right-hand-side is a countable union of A×Bn-measurable sets. The forgoing measurable Suslin condition is closely related to the defini- tion given in Example 2.3.5 of VW, while the definition of separable arrays is similar in spirit to the definition of separable stochastic processes given in the discussion preceding lemma 7.2 in section 7.1 above. The modifications of these definitions presented in this section have been made to accom- modate nonidentically distributed arrays for a broad scope of statistical applications. However, finding the best possible measurability conditions was not the primary goal. We need the following definition of manageability (Definition 7.9 of Pol- m lard, 1990, with minor modification). First, for any set A ∈ R ,letDm(x, A) be the packing number for the set A at Euclidean distance x, i.e., the largest k such that there exist k points in A with the smallest Euclidean distance between any two distinct points being greater than x.AlsoletFnω ≡ { ∈ Rmn ∈ } ∈ Rm [fn1(ω,t),...,fnmn (ω,t)] : t T ; and for any vectors u, v , u " v ∈ Rm is the pointwise product and · denotes Euclidean distance. A triangular array of processes {fni(ω,t)} is manageable, with respect to ≡ ∈ Rmn the envelopes Fn(ω) [Fn1(ω),...,Fnmn (ω)] , if there exists a de- terministic function λ (the capacity bound)forwhich 1 ∞ (i) 0 log λ(x)dx < , (ii) there exists N ⊂ Ω such that P∗(N)=0andforeachω ∈ N, " "F ≤ Dmn (x α Fn(ω) ,α nω) λ(x),

for 0

We now state a minor modification of Pollard’s Functional Central Limit Theorem for the stochastic process sum

mn Xn(ω,t) ≡ [fni(ω,t) − Efni(·,t)] . i=1 The modification is the inclusion of a sufficient measurability requirement which was omitted in Pollard’s (1990) version of the theorem.

Theorem 11.14 Suppose the triangular array {fni(ω, t),t∈ T } consists of independent processes within rows, is AMS, and satisfies: 11.4 Sums of Independent but not Identically Distributed Stochastic Processes 213

(A) the {fni} are manageable, with envelopes {Fni} which are also inde- pendent within rows;

(B) H(s, t) = limn→∞ EXn(s)Xn(t) exists for every s, t ∈ T ; mn ∗ 2 ∞ (C) lim supn→∞ i=1 E Fni < ; mn ∗ 2 { } (D) limn→∞ i=1 E Fni1 Fni > =0,foreach >0;

(E) ρ(s, t) = limn→∞ ρn(s, t),where

1/2 mn 2 ρn(s, t) ≡ E |fni(·,s) − fni(·,t)| , i=1

exists for every s, t ∈ T , and for all deterministic sequences {sn} and {tn} in T ,ifρ(sn,tn) → 0 then ρn(sn,tn) → 0. Then (i) T is totally bounded under the ρ pseudometric; ∞ (ii) Xn converges weakly on  (T ) to a tight mean zero Gaussian process X concentrated on UC(T,ρ), with covariance H(s, t). The proof is given in Kosorok (2003) who relies on chapter 10 of Pollard (1990) for some of the steps. We omit the details. We will use this theorem when discussing function classes changing with n, later in this chapter, as well as in one of the case studies of chapter 15. One can think of this the- orem as a Lindeberg central limit theorem for stochastic processes, where condition (D) is a modified Lindeberg condition. Note that the manageability condition (A) is an entropy condition quite similar to the BUEI condition of chapter 9. Pollard (1990) discussed sev- eral methods and preservation results for establishing manageability, in- cluding bounded pseudodimension classes which are very close in spirit to n VC-classes of functions (see chapter 4 of Pollard): The set Fn ⊂ R has pseudodimension of at most V if, for every point t ∈ RV +1,noproper co- ordinate projection of Fn can surround t. A proper coordinate projection F k n is obtained by choosing a subset) i1,...,ik of indices 1,...,mn,where* ≤ F k ∈ Rk k mn, and then letting n = fni1 (t1),...,fnik (tk):(t1,...,tk) .

It is not hard to verify that if fn1(t),...,fnmn (t) are always monotone increasing functions in t, then the resulting Fn has pseudodimension 1. This happens because all two-dimensional projections always form monotone increasing trajectories, and thus can never surround any point in R2.The proof is almost identical to the proof of lemma 9.10. By theorem 4.8 of Pollard (1990), every triangular array of stochastic processes for which Fn has pseudodimension bounded above by V<∞, for all n ≥ 1, is manageable. As verified in the following theorem, complicated manageable classes can be built up from simpler manageable classes: 214 11. Additional Empirical Process Results

Theorem 11.15 Let fn1,...,fnmn and gn1,...,gnmn be manageable ar- rays with respective index sets T and U and with respective envelopes Fn1,...,

Fnmn and Gn1,...,Gnmn . Then the following are true: { ∈ × } (i) fn1(t)+gn1(u),...,fnmn (t)+gnmn (u): (t, u) T U , is manage-

able with envelopes Fn1 + Gn1,...,Fnmn + Gnmn ; { ∧ ∧ ∈ × } (ii) fn1(t) gn1(u),...,fnmn (t) gnmn (u): (t, u) T U is manageable

with envelopes Fn1 + Gn1,...,Fnmn + Gnmn ; { ∨ ∨ ∈ × } (iii) fn1(t) gn1(u),...,fnmn (t) gnmn (u): (t, u) T U is manageable

with envelopes Fn1 + Gn1,...,Fnmn + Gnmn ; { ∈ × } (iv) fn1(t)gn1(u),...,fnmn (t)gnmn (u): (t, u) T U is manageable

with envelopes Fn1Gn1,...,Fnmn Gnmn .

mn Proof. For any vectors x1,x2,y1,y2 ∈ R ,itiseasytoverifythat x1y1 − x2y2 ≤ x1 − x2 + y1 − y2 ,where is any one of the operations ∧, ∨,or+.Thus

" "F G ≤ " "F Nmn ( α Fn + Gn ,α n n) Nmn ( α Fn ,α n) × " "G Nmn ( α Gn ,α n) , for any 0 < ≤ 1 and all vectors α ∈ Rmn of nonnegative weights, where G ≡{ Nmn denotes the covering number version of Dmn , n gn1(u),..., ∈ } F G gnmn (u): u U ,and n n has the obvious interpretation. Thus parts (i)–(iii) of the theorem follow by the relationship between packing and covering numbers discussed in section 8.1.2 in the paragraphs preced- ing theorem 8.4. Proving part (iv) requires a slightly different approach. Let x1,x2,y1,y2 by any vectors in Rmn , and choose the vectorsx, ˜ y˜ ∈ Rmn with nonnegative components so that bothx ˜ − [x1 ∨ x2 ∨ (−x1) ∨ (−x2)] andy ˜ − [y1 ∨ y2 ∨ (−y1) ∨ (−y2)] have only nonnegative components. It is not hard to verify that x1 " y1 − x2 " y2 ≤y˜ x1 − x2 +˜x y1 − y2 .Fromthis,wecan deduce

" " "F "G Nmn (2 α Fn Gn ,α n n) ≤ " " " "F Nmn ( α Fn Gn ,α Gn n) × " " " "G Nmn ( α Fn Gn ,α Fn n) " "F × " "G = Nmn ( α Fn ,α n) Nmn ( α Gn ,α n) , for any 0 < ≤ 1 and all vectors α ∈ Rmn of nonnegative weights, where α ≡ α " Gn and α ≡ α " Fn. Since capacity bounds do not depend on the nonnegative weight vector (either α, α or α), part (iv) now follows, and the theorem is proved. 11.4 Sums of Independent but not Identically Distributed Stochastic Processes 215

11.4.2 Bootstrap Results We now present a weighted bootstrap for inference about the limiting pro- cess X of theorem 11.14. The basic idea shares some similarities with the wild bootstrap (Praestgaard and Wellner, 1993). Let Z ≡{zi,i ≥ 1} be a sequence of random variables satisfying

(F) The {zi} are independent and identically distributed, on the proba- bility space {Ωz, Az, Πz}, with mean zero and variance 1.

Denote μni(t) ≡ Efni(·,t), and letμ ˆni(t) be estimators of μni(t). The weighted bootstrapped process we propose for inference is

mn Xˆnω(t) ≡ zi [fni(ω,t) − μˆni(ω,t)] , i=1 which is defined on the product probability space {Ω, A, Π}×{Ωz, Az, Πz}, similar to what was done in chapter 9 for the weighted bootstrap in the i.i.d. case. What is unusual about this bootstrap is the need to estimate the μni terms: this need is a consequence of the terms being non-identically distributed. The proposed method of inference is to resample Xˆn, using many realiza- tions of z1,...,zmn , to approximate the distribution of Xn. The following theorem gives us conditions under which this procedure is asymptotically valid:

Theorem 11.16 Suppose the triangular array {fni} satisfies the condi- tions of theorem 11.14 and the sequence {zi,i ≥ 1} satisfies condition (F) above. Suppose also that the array of estimators {μˆni(ω,t),t ∈ T,1 ≤ i ≤ mn,n≥ 1} is AMS and satisfies the following: mn − 2 (G) supt∈T i=1 [ˆμni(ω,t) μni(t)] = oP (1);

(H) the stochastic processes {μˆni(ω,t)} are manageable with envelopes Fˆni(ω) ;

2 ∨ mn ˆ →∞ (I) k i=1 Fni(ω) converges to k in outer probability as n ,for some k<∞.

Then the conclusions of Theorem 11.14 obtain, Xˆn is asymptotically mea- P surable, and Xˆn X. Z The main idea of the proof is to first study the conditional limiting ˜ ≡ mn − distribution of Xnω(t) i=1 zi [fni(ω,t) μni(t)], and then show that the limiting result is unchanged after replacing μni withμ ˆni. The first step is summarized in the following theorem: 216 11. Additional Empirical Process Results

Theorem 11.17 Suppose the triangular array {fni} satisfies the condi- tions of theorem 11.14 and the sequence {zi,i ≥ 1} satisfies condition (F) above. Then the conclusions of Theorem 11.14 obtain, X˜n is asymptotically P measurable, and X˜n X. Z An interesting step in the proof of this theorem is verifying that manageabil- ity of the triangular array z1fn1,...,zmn fnmn follows directly from man- ∈ Rmn ageability of fn1,...,fnmn . We now demonstrate this. For vectors u , let |u| denote pointwise absolute value and sign(u) denote pointwise sign. Now, for any nonnegative α ∈ Rmn ,

"| |" " "F Dmn (x α zn Fn(ω) ,α zn nω) " " "F = Dmn (x α˜ Fn(ω) , α˜ sign(zn) nω) " "F = Dmn (x α˜ Fn(ω) , α˜ nω) ,

≡{ }T { } where zn z1,...,zmn , since the absolute value of the zi can be absorbed into the α to makeα ˜ and since any coordinate change of sign does not effect the geometry of Fnω. Thus the foregoing triangular array is manageable with envelopes {|zi|Fni(ω)}. The remaining details of the proofs of both theorems 11.16 and 11.17, which we omit here, can be found in Kosorok (2003).

11.5 Function Classes Changing with n

We now return to the i.i.d. empirical process setting where the i.i.d. ob- servations X1,X2,... are drawn from a measurable space {X , A},with probability measure P . What is new, however, is that we allow the func- tion class to depend on n. Specifically, we assume the function class has the form Fn ≡{fn,t : t ∈ T }, where the functions x → fn,t(x)arein- dexed by a fixed T but are allowed to change with sample size n.Note that this trivially includes the standard empirical process set-up with an arbitrary but fixed function class F by setting T = F and fn,t = t for all n ≥ 1andt ∈F. The approach we take is to specialize the results of section 11.4 after replacing manageability with a bounded uniform entropy integral condition. An alternative approach which can utilize either uni- form or bracketing entropy is given in section 2.11.3 of VW, but we do not pursue this second approach here. ≡ −1/2 n − ∈ Let Xn(t) n i=1 (fn,t(Xi) Pfn,t), for all t T ,andletFn be an envelope for Fn. We say that the sequence Fn is AMS if for all n ≥ 1, there exists a Suslin topological space Tn ⊂ T with Borel sets Bn such that ∗ (11.7) P sup sup |fn,s(X1) − fn,t(X1)| > 0 =0 t∈T s∈Tn 11.5 Function Classes Changing with n 217 and fn,· : X×Tn → R is A×Bn-measurable. Moreover, the sequence Fn is said to be separable if, for all n ≥ 1, there exists a countable subset Tn ⊂ T such that (11.7) holds. The arguments in the proof of lemma 11.13 verify that separability implies AMS as is true for the more general setting of section 11.4. We also require the following bounded uniform entropy integral condi- tion:

1 (11.8) lim sup sup log N ( Fn Q,2, Fn,L2(Q))d < ∞, n→∞ Q 0 ≥ where, for each n 1, supQ is the supremum taken over all finitely discrete probability measures Q with Fn Q,2 > 0. We are now ready to present the following functional central limit theorem:

Theorem 11.18 Suppose Fn is AMS and the following hold:

(A) Fn satisfies (11.8) with envelop Fn;

(B) H(s, t) = limn→∞ EXn(s)Xn(t) for every s, t ∈ T ; ∗ 2 ∞ (C) lim supn→∞ E Fn < ; √ ∗ 2 { } (D) limn→∞ E Fn 1 Fn > n =0,foreach >0; (E) ρ(s, t) = limn→∞ ρn(s, t),whereρn(s, t) ≡ E|fn,s(X1) − fn,t(X2)|, exists for every s, t ∈ T , and for all deterministic sequences {sn} and {tn} in T ,ifρ(sn,tn) → 0 then ρn(sn,tn) → 0.

Then

(i) T is totally bounded under the ρ pseudometric;

∞ (ii) Xn converges weakly in  (T ) toatight,meanzeroGaussianprocess X concentrated on UC(T,ρ), with covariance H(s, t).

Proof. The proof consists in showing that the current setting is just a special case of theorem 11.14. Specifically, we let fni(t)=fn,t(Xi)and mn = n, and we study the array {fni(t),t ∈ T }. First, it is clear that Fn being AMS implies that {fni(t),t∈ T } is AMS. Now let

n F˜n ≡{[fn1(t),...,fnn(t)] ∈ R : t ∈ T } √ and F˜n ≡ F˜n1,...,F˜nn ,whereF˜ni ≡ Fn(Xi)/ n; and note that for any α ∈ Rn, " ˜ " F˜ ≤ F Dn α Fn ,α n D ( Fn Qα,2, n,L2(Qα)) , 218 11. Additional Empirical Process Results ≡ −1 n 2 where Qα (n α ) i=1 αi δXi is a finitely discrete probability measure. Thus, by the relationship between packing and covering numbers given in section 8.1.2, we have that if we let

λ(x) = lim sup sup N (x Fn Q,2/2, Fn,L2(Q)) , n→∞ Q where supQ is taken over all finitely discrete probability measures, then condition 11.8 implies that Dn α " F˜n ,α" F˜n ≤ λ( ), for all 0 < ≤ 1, all vectors α ∈ Rn of nonnegative weights, and all n ≥ 1; 1 and that 0 log λ( )d < 0. Note that without loss of generality, we can set Dn(·, ·) = 1 whenever α " F˜n =0andletλ(1) = 1, and thus the foregoing arguments yield that condition (11.8) implies manageability of the triangular array {fn1(t),...,fnn(t),t∈ T }. Now the remaining conditions of the theorem can easily be show to im- ply conditions (B) through (E) of theorem 11.14 for the new triangular array and envelope vector F˜n. Hence the desired results follow from theo- rem 11.14. The following lemma gives us an important example of a sequence of classes Fn that satisfies condition (11.8):

Lemma 11.19 For fixed index set T ,letFn = {fn,t : t ∈ T } be a VC class of measurable functions with VC-index Vn and integrable envelope ≥ ∞ F Fn, for all n 1, and assume supn≥1 Vn = V< . Then the sequence n satisfies condition (11.8). Proof. By theorem 9.3, there exists a universal constant K depending only on V such that

( −1) 1 r V N ( F 2, F ,L2(Q)) ≤ K , n Q, n for all 0 < ≤ 1. Note that we have extended the range of to include 1, but this presents not difficulty since N( Fn Q,2, Fn,L2(Q)) = 1 always holds by the definition of an envelope function. The desired result now follows since Vn ≤ V for all n ≥ 1. As a simple example, let T ⊂ R and assume that fn,t(x) is always mono- tone increasing in t.ThenFn always has VC-index 2 by lemma 9.10, and hence lemma 11.19 applies. Thus (11.8) holds. This particular situation will apply later in section 14.5.2 when we study the weak convergence of a cer- tain monotone density estimator. This condition (11.8) is quite similar to the BUEI condition for fixed function classes, and most of the preservation results of section 9.1.2 will also apply. The following proposition, the proof of which is saved as an exercise, is one such preservation result: 11.5 Function Classes Changing with n 219

Proposition 11.20 Let Gn and Hn be sequences of classes of mea- surable functions with respective envelope sequences Gn and Hn,where condition (11.8) is satisfied for the sequences (Fn,Fn)=(Gn,Gn) and (Fn,Fn)=(Hn,Hn). Then condition (11.8) is also satisfied for the se- quence of classes Fn = Gn ·Hn (consisting of all pairwise products) and the envelope sequence Fn = GnHn. We now present a weighted bootstrap result for this setting. Let Z ≡ {zi,i≥ 1} be a sequence of random variables satisfying

(F) The {zi} are positive, i.i.d. random variables which are independent of the data X1,X2,...and which have mean 0 <μ<∞ and variance 0 <τ2 < ∞. The weighted bootstrapped process we propose for use here is μ n z Xˆ (t) ≡ n−1/2 i − 1 f (X ), n τ z¯ n,t i i=1 n ≡ −1 n wherez ¯n n i=1 zi. The following theorem tells us that this is a valid bootstrap procedure:

Theorem 11.21 Suppose the class of functions Fn, with envelope Fn, satisfies the conditions of theorem 11.18 and the sequence {zi,i ≥ 1} sat- isfies condition (F) above. Then the conclusions of theorem 11.18 obtain, P Xˆn is asymptotically measurable, and Xˆn X. Z Proof. Note that n τ −1/2 zi Xˆn(t)=n − 1 (fn,t(Xi) − Pfn,t) μ =1 z¯n i n −1/2 zi = n − 1 (fn,t(Xi) − Pfn,t) =1 μ i n −1/2 μ zi +n − 1 − 1 (fn,t(Xi) − Pfn,t) z¯n =1 μ i μ n +n−1/2 − 1 (f (X ) − Pf ) z¯ n,t i n,t n i=1 = An(t)+Bn(t)+Cn(t).

G ∗T ∗T By theorem 11.18, nfn,t = OP (1), where OP (1) denotes a term bounded in outer probability uniformly over T . Since theorem 11.18 is really a special case of theorem 11.14, we have by theorem 11.17 that n−1/2 n μ zi − 1 i=1 τ μ P μ z1 × (fn,t(Xi) − Pfn,t)  X,since − 1 has mean zero and variance 1. Z τ μ 220 11. Additional Empirical Process Results

P Hence (μ/τ)An is asymptotically measurable and (μ/τ)An X.Moreover, Z as→∗ ∗T ∗T sincez ¯n μ, we now have that both Bn = oP (1) and Cn = oP (1), where ∗T oP (1) denotes a term going to zero outer almost surely uniformly over T . The desired results now follow.

11.6 Dependent Observations

In the section, we will review a number of empirical process results for dependent observations. A survey of recent results on this subject is Em- pirical Process Techniques for Dependent Data, edited by Dehling, Mikosch and Sørensen (2002); and a helpful general reference on theory for depen- dent observations is Dependence in Probability and Statistics: A Survey of Recent Results, edited by Eberlein and Taqqu (1986). Our focus here will be on strongly mixing stationary sequences (see Bradley, 1986). For the interested reader, a few results for non-stationary dependent sequences can be found in Andrews (1991), while several results for long range dependent sequences can be found in Dehling and Taqqu (1989), Yu (1994) and Wu (2003), among other references. Let X1,X2,... be a stationary sequence of possibly dependent random D Mb variables on a probability space (Ω, ,Q), and let a be the σ-field gen- erated by Xa,...,Xb. By stationary, we mean that for any set of positive integers m1,...,mk, the joint distribution of Xm1+j,Xm2+j,...,Xmk+j is unchanging for all integers j ≥−m1 + 1. The sequence {Xi,i ≥ 1} is strongly mixing (also α-mixing)if ) * ≡ | ∩ − | ∈Mm ∈M∞ → α(k) sup P (A B) P (A)P (B) : A 1 ,B m+k 0, m≥1 as k →∞, and it is absolutely regular (also β-mixing)if ) * ≡ | |Mm − | ∈M∞ → β(k) Esup P (B 1 ) P (B) : B m+k 0, m≥1 as k →∞. Other forms of mixing include ρ-mixing, φ-mixing, ψ-mixing and ∗-mixing (see Definition 3.1 of Dehling and Philipp, 2002). Note that the stronger notion of m-dependence, where observations more than m lags apart are independent, implies that β(k) = 0 for all k>mand therefore also implies absolute regularity. It is also known that absolute regularity implies strong mixing (see section 3.1 of Dehling and Philipp, 2002). Here- after, we will restrict our attention to β-mixing sequences since these will be the most useful for our purposes. We now present several empirical process Donsker and bootstrap results for absolutely regular stationary sequences. Let the values of X1 lie in a X G Polish space with distribution P ,andlet n be the empirical measure for G 1/2 n − the first n observations of the sequence, i.e., nf = n i=1(f(Xi) Pf), 11.6 Dependent Observations 221 for any measurable f : X → R. We now present the following bracketing central limit theorem:

Theorem 11.22 Let X1,X2,... be a stationary sequence in a Polish space with marginal distribution P ,andletF be a class of functions in L2(P ). Suppose there exists a 2

(b) J[](∞, F,Lp(P )) < ∞. ∞ Then Gn  H in  (F),whereH is a tight, mean zero Gaussian process with covariance ∞ (11.9) Γ(f,g) ≡ lim cov(f(Xk),g(Xi)), for all f,g ∈F. k→∞ i=1

Proof. The result follows through condition 2 of theorem 5.2 of Dedecker and Louhichi (2002), after noting that their condition 2 can be shown to be implied by conditions (a) and (b) above, via arguments contained in section 4.3 of Dedecker and Louhichi (2002). We omit the details. We next present a result for VC classes F. In this case, we need to address the issue of measurability with some care. For what follows, let B be the σ-field of the measurable sets on X . The class of functions F is permissible if it can be indexed by some set T , i.e., F = {f(·,t):t ∈ T } (T could potentially be F for this purpose), in such a way that the following holds:

(a) T is a Suslin metric space with Borel σ-field B(T ),

(b) f(·, ·)isaB×B(T )-measurable function from R × T to R.

Note that this definition is similar to the almost measurable Suslin condi- tion of section 11.5. We now have the following theorem:

Theorem 11.23 Let X1,X2,... be a stationary sequence in a Polish space with marginal distribution P ,andletF be a class of functions in L2(P ). Suppose there exists a 2

2/(p−2) 2(p−1)/(p−2) (a) limk→∞ k (log k) β(k)=0,and

(b) F is permissible, VC, and has envelope F satisfying P ∗F p < ∞.

∞ Then Gn  H in  (F),whereH is as defined in theorem 11.22 above. Proof. This theorem is essentially theorem 2.1 of Arcones and Yu (1994), and the proof, under slightly different measurability conditions, can be found therein. The measurability issues, including the sufficiency of the permissibility condition, are addressed in the appendix of Yu (1994). We omit the details. 222 11. Additional Empirical Process Results

Now we consider the bootstrap. A problem with the usual nonpara- metric bootstrap is that samples of X1,X2,... randomly drawn with re- placement will be independent and will loose the dependency structure of the stationary sequence. Hence the usual bootstrap will generally not work. A modified bootstrap, the moving blocks bootstrap (MBB), was in- dependently introduce by K¨unsch (1989) and Liu and Singh (1992) to ad- dress this problem. The method works as follows for a stationary sample X1,...,Xn: For a chosen block length b ≤ n, extend the sample by defining Xn+i,...,Xn+b−1 = X1,...,Xb and let k be the smallest integer such that kb ≥ n. Now define blocks (as row vectors) Bi =(Xi,Xi+1,...,Xi+b−1), for i =1,...,n,andsamplefromtheBis with replacement to obtain k blocks ∗ ∗ ∗ ∗ ∗ B1 ,B2 ,...,Bk. The bootstrapped sample X1 ,...,Xn consists of the first n ∗ ∗ observations from the row vector (B1 ,...,Bk). The bootstrapped empirical measure indexed by the class F is then defined as

n G∗ ≡ −1/2 ∗ − P nf n (f(Xi ) nf), i=1 ∈F P ≡ −1 n for all f ,where nf n i=1 f(Xi) is the usual empirical proba- bility measure (except that the data are now potentially dependent). For now, we will assume that X1,X2,...are real-valued, although the re- sults probably could be extended to general Polish-valued random variables. MBB bootstrap consistency has been established for bracketing entropy in B¨uhlmann (1995), although the entropy requirements are much stronger than those of theorem 11.22 above, and also for VC classes in Radulovi´c (1996). Other interesting, related references are Naik-Nimbalker and Ra- jarshi (1994) and Peligrad (1998), among others. We conclude this section by presenting Radulovi´c’s (1996) theorem 1 (slightly modified to address measurability) without proof:

Theorem 11.24 Let X1,X2,...be a stationary sequence of real random variables with marginal distribution P ,andletF be a class of functions in ∗ ∗ L2(P ). Also assume that X1 ,...,Xn are generated by the MBB procedure with block size b(n) →∞,asn →∞, and that there exists 2 p/(p − 2),and0 <ρ<(p − 2)/[2(p − 1)] such that

q ∞ (a) lim supk→∞ k β(k) < ,

(b) F is permissible, VC, and has envelope F satisfying P ∗F p < ∞,and

−ρ ∞ (c) lim supn→∞ n b(n) < .

P Then G∗  H in ∞(F),whereH is as defined in theorem 11.22 above. n ∗ 11.7 Proofs 223 11.7 Proofs

Proofs of theorems 11.3 and 11.4. Consider the class F ≡ 1/2+ F/(2M), and note that all functions f ∈F satisfy 0 ≤ f ≤ 1. Moreover, Gn F =2M Gn F . Thus, if we prove the results for F , we are done. For theorem 11.3, the condition (11.1) is also satisfied if we replace F with F . This can be done without changing W , although we may need to change K to some K < ∞. Theorem 2.14.10 of VW now yields that

U+δ 2 ∗ Dt −2t (11.10) P ( Gn F >t) ≤ Ce e , for every t>0andδ>0, where U = W (6 − W )/(2 + W )andC and D depend only on K, W ,andδ.Since00sothatU + δ<2. Now, it can be ∗ 2 shown that there exists a C∗ < ∞ and K∗ > 0 so that (11.10)≤ C∗e−K t , for every t>0. Theorem 11.3 now follows by lemma 8.1. For theorem 11.4, the condition (11.2) implies the existence of a K so V that N[]( , F ,L2(P )) ≤ (K / ) , for all 0 <

∗ V −2t2 (11.11) P ( Gn F >t) ≤ Ct e , for every t>0, where the constant C depends only on K and V .Thus ∗ 2 there exists a C∗ < ∞ and K∗ > 0 such that (11.11)≤ C∗e−K t , for every t>0. The desired result now follows. Proof of theorem 11.8. For the proof of result (i), note that the cosines are bounded, and thus the series defining F1 is automatically pointwise convergent by the discussion prior to theorem 11.8. Now, the Cram´er-von Mises statistic is the square of the L2[0, 1]√ norm of the function t → Gn(t) ≡ Gn1{X ≤ t}. Since the functions {gi ≡ 2sinπjt : j =1, 2,...} form an orthonormal basis for L2[0, 1], Parseval’s formula tells us that the integral of the square of any function in L2[0, 1] can be replaces by the sum of the squares of the Fourier coefficients. This yields: 2 1 ∞ 1 2 ∞ X G2 G G (11.12) n(t)dt = n(t)gi(t)dt = n gi(t)dt . 0 0 0 i=1 i=1 √ x − −1 But since √0 gi(t)dt = (πi) 2cosπix, the last term in (11.12) becomes ∞ G2 2 i=1 n( 2cosπiX)/(πi) . Standard methods can now be used to estab- lish that this converges weakly to the appropriate limiting distribution. An alternative proof can be obtained via the relation (11.3) and the fact that F1 is an elliptical class and hence Donsker. Now (i) follows. For result (ii), note that the orthonormalized Legendre polynomials can be obtained by applying the Gram-Schmidt procedure to the functions 1, u, u2, .... By problem 2.13.1 of VW, the orthonormalized Legendre 224 11. Additional Empirical Process Results

− 2 − polynomials satisfy the differential equations (1 u )pj (u) 2upj(u)= −j(j+1)pj(u), for all u ∈ [−1, 1] and integers j ≥ 1. By change of variables, followed by partial integration and use of this differential identity, we obtain 1 − − − 2 pi(2t 1)pj(2t 1)t(1 t)dt 0 1 1 − 2 = pi(u)pj(u)(1 u )du 4 −1 1 −1 − 2 − = pi(u) pj (u)(1 u ) 2upj(u) du 4 −1 1 = j(j +1)1{i = j}. 4 √ − − Thus the functions 2 2pj(2t 1) t(1 t)/ j(j + 1), with j ranging over the positive integers, form an orthonormal basis for L2[0, 1]. By Parseval’s formula, we obtain 1 G2 ∞ 1 2 n(t) G − 8 dt = n(t)pi(2t 1)dt 0 t(1 − t) 0 j(j +1) j=1 ∞ 2 = G2 (p (2t − 1)) . j(j +1) n j j=1 By arguments similar to those used to establish (i), we can now verify that (ii) follows. Proof of corollary 11.9. By transforming the data using the trans- form x → F0(x), we can, without loss of generality, assume that the data are all i.i.d. uniform and reduce the interval of integration to [0, 1]. Now ˆ 1 2 proposition 7.27 yields that T1  0 G (t)dt,whereG(t) is a standard Brownian bridge. Now Parseval’s formula and arguments used in the proof of theorem 11.8 yield that 1 ∞ √ 2 2 2 ∗ G (t)dt = G ( 2cosπx)/(πi) ≡ T1 , 0 i=1 where now G is a mean zero Gaussian Brownian bridge , G G → R where the covariance between (f)and (g), where f,g :[0, 1] ,is 1 − 1 1 ∗ 0 f(s)g(s)ds 0 f(s)ds 0 g(s)ds.ThefactthatT1 is tight combined ∗ with the covariance structure of G yields that T1 has the same distribution as T1, and the desired weak convergence result for Tˆ1 follows. For Tˆ2, apply the same transformation as above so that, without loss of generality, we can again assume that the data are i.i.d. uniform and that the interval of integration is [0, 1]. Let n1n2 G˜ n ≡ Fˆn,1 − Fˆn,2 , n1 + n2 11.7 Proofs 225 and fix ∈ (0, 1/2). We can now apply proposition 7.27 to verify that 1− G˜ 2 1− G2 n(s) (s) dFˆ 0(s)  ds. n, s(1 − s) Fˆn,0(s) 1 − Fˆn,0(s) ) * Note also that Fubini’s theorem yields that both E G2(s)/[s(1 − s)]ds = 0 1 2 and E 1− G (s)/[s(1 − s)]ds = . ˜ ˆ ˆ We will now work towards bounding 0 Gn(s)/ Fn,0(s)[1 − Fn,0(s)] ds. Fix s ∈ (0, ) and note that, under the null hypothesis, the conditional dis- tribution of G˜ n(s)givenFˆn,0(s)=m has the form n1n2 A m − A − , n1 + n2 n1 n2 where A is hypergeometric with density n1 n1 n P(A = a)= / , a m − a m where a is any integer between (m − n2) ∨ 0andm ∧ n1. Hence G˜ 2 ˆ n1 + n2 E n(s) Fn,0(s)=m = var(A) n1n2 n n1n2(n − m) = m 2 − n1n2 n (n 1) n m(n − m) = . n − 1 n2

Thus G˜ 2 n(s) n ˆ ≤ E ds = − EFn,0( ) 2 , 0 Fˆn,0(s)[1 − Fˆn,0(s)] n 1 for all n ≥ 2. Similar arguments verify that 1 G˜ 2 (s) E n ds ≤ 2 , 1− Fˆn,0(s)[1 − Fˆn,0(s)] for all n ≥ 2. Since was arbitrary, we now have that

1 2 G (s) ∗ Tˆ2  ds ≡ T2 , 0 s(1 − s)

∗ where G is the same Brownian bridge process used in defining T1 .Nowwe ∗ can again use arguments from the proof of theorem 11.8 to obtain that T2 has the same distribution as T2. 226 11. Additional Empirical Process Results 11.8 Exercises

11.8.1. Verify that F ∗ in the proof of theorem 11.5 is continuous.

11.8.2. Show that when Pn and P satisfy (11.4), we have that Pnh → Ph, as n →∞, for all bounded and measurable h. 11.8.3. Prove proposition 11.20. Hint: Consider the arguments used in the proof of theorem 9.15.

11.9 Notes

Theorems 11.3 and 11.4 are inspired by theorem 2.14.9 of VW, and the- orem 11.6 is derived from theorem 2.13.1 of VW. Theorem 11.7 is the- orem 2.13.2 of VW, while theorem 11.8 is derived from examples 2.13.3 and 2.13.5 of VW. Much of the structure of section 11.4 comes from Kosorok (2003), although theorem 11.15 was derived from material in section 5 of Pollard (1990). Lemma 11.13 and theorems 11.14, 11.16 and 11.17, are lemma 2 and theorems 1 (with a minor modification), 3 and 2, respec- tively, of Kosorok (2003). This is page 227 Printer: Opaque this 12 The Functional Delta Method

In this chapter, we build on the presentation of the functional delta method given in section 2.2.4. Recall the concept of Hadamard differentiability in- troduced in this section and also defined more precisely in section 6.3. The key result of section 2.2.4 is that the delta method and its bootstrap coun- terpart work provided the map φ is Hadamard differentiable tangentially to a suitable set D0. We first present in section 12.1 clarifications and proofs of the two main theorems given in section 2.2.4, the functional delta method for Hadamard differentiable maps (theorem 2.8) and the conditional ana- log for the bootstrap (theorem 2.9). We then give in section 12.2 several important examples of Hadamard differentiable maps of use in statistics, along with specific illustrations of how those maps are utilized.

12.1 Main Results and Proofs

In this section, we first prove the functional delta method theorem (theo- rem 2.8) and then restate and prove theorem 2.9. Before proceeding, recall that Xn in the statement of theorem 2.8 is a random quantity that takes its values in a normed space D. → −1 − ≡ Proof of theorem 2.8. Consider the map h rn(φ(θ+rn h) φ(θ)) D ≡{ −1 ∈ D } gn(h), and note that it is defined on the domain n h : θ + rn h φ → → ∈ D ∈ D and satisfies gn(hn) φθ(h) for every hn h 0 with hn n.Thus the conditions of the extended continuous mapping theorem (theorem 7.24) 228 12. The Functional Delta Method

· · are satisfied by g( )=φθ( ). Hence conclusion (i) of that theorem implies −   gn(rn(Xn θ)) φθ(X). We now restate and prove theorem 2.9. The restatement clarifies the measurability condition. Before proceeding, recall the definitions of Xn and Xˆ n in the statement of theorem 2.9. Specifically, Xn(Xn) is a sequence of random elements in a normed space D based on the data sequence {Xn,n≥ 1}, while Xˆ n(Xn,Wn) is a bootstrapped version of Xn basedonboththe data sequence and a sequence of weights W = {Wn,n≥ 1}. Note that the proof of this theorem utilizes the bootstrap continuous mapping theorem (theorem 10.8). Here is the restated version of theorem 2.9:

Theorem 12.1 For normed spaces D and E,letφ : Dφ ⊂ D → E be D ⊂ D Hadamard-differentiable at μ tangentially to 0 , with derivative φμ. Let Xn and Xˆ n have values in Dφ,withrn(Xn − μ)  X,whereX is tight and takes its values in D0 for some sequence of constants 0

˜ ˜ where EWn ,EX1 and EX2 are expectations taken over the bootstrap weights, X˜ 1 and X˜ 2, respectively. The first term on the right in the above expression goes to zero by the asymptotic measurability of Uˆ n + Un = rn(Xˆ n − μ). P −1 The second term goes to zero by the fact that Uˆ n c X combined with W 12.2 Examples 229 the fact that the map x → h(x+Un) is Lipschitz continuous with Lipschitz constant 1 outer almost surely. The second term goes to zero since Un  X → ˜ X˜ 1 and the map x EX1 h( + x) is also Lipschitz continuous with Lipschitz constant 1. Since h was arbitrary, we have by the Portmanteau theorem that, unconditionally, −1 ˜ ˜ Xˆ n − μ c X1 + X2 rn  . Xn − μ X˜ 2

Now the functional delta method (theorem 2.8) yields ⎛ ⎞ ⎛ ⎞ Xˆ − −1X˜ X φ( n) φ(μ) φμ(c 1 + 2) ⎜ X − ⎟ ⎜ X ⎟ ⎜ φ( n) φ(μ) ⎟  ⎜ φμ( 2) ⎟ rn ⎝ ⎠ ⎝ −1 ⎠ , Xˆ n − μ c X˜ 1 + X2 Xn − μ X2 since the map (x, y) → (φ(x),φ(y),x,y) is Hadamard differentiable at (μ, μ) tangentially to D0. This implies two things. First, Xˆ − X X φ( n) φ( n)  φμ( ) rnc X , Xˆ n − Xn D since φμ is linear on 0. Second, the usual continuous mapping theorem now yields that, unconditionally,

Xˆ − X − Xˆ − X →P (12.1) rnc(φ( n) φ( n)) φμ(rnc( n n)) 0, → − E × D since the map (x, y) x φμ(y) is continuous on all of . Now for any map h ∈ Cb(D), the map x → h(rnc(x − Xn)) is continu- ous and bounded for all x ∈ D outer almost surely. Thus the maps Wn → h(rnc(Xˆ n − Xn)) are measurable for every h ∈ Cb(D) outer almost surely. Hence the bootstrap continuous mapping theorem, theorem 10.8, yields P Xˆ −X  X  that φμ(rnc( n n)) φμ( ). The desired result now follows from (12.1). W

12.2 Examples

We now give several important examples of Hadamard differentiable maps, along with illustrations of how these maps are utilized in statistical appli- cations.

12.2.1 Composition

Recall from section 2.2.4 the map φ : Dφ → D[0, 1], where φ(f)=1/f and Dφ = {f ∈ D[0, 1] : |f| > 0}. In that section, we established that φ was 230 12. The Functional Delta Method

Hadamard differentiable, tangentially to D[0, 1], with derivative at θ ∈ Dφ equal to h →−h/θ2. This is a simple example of the following general composition result: Lemma 12.2 Let g : B ⊂ R¯ ≡ [−∞, ∞] → R be differentiable with ∞ derivative continuous on all closed subsets of B,andletDφ = {A ∈  (X ): {R(A)}δ ⊂ B for some δ>0},whereX is a set, R(C) denotes the range of the function C ∈ ∞(X ),andDδ is the δ-enlargement of the set D. ∞ Then A → g ◦ A is Hadamard-differentiable as a map from Dφ ⊂  (X ) ∞ X ∈ D to  ( ), at every A φ. The derivative is given by φA(α)=g (A)α, where g is the derivative of g. Before giving the proof, we briefly return to our simple example of the reciprocal map A → 1/A. The differentiability of this map easily generalizes from D[0, 1] to ∞(X ), for arbitrary X , provided we restrict the domain of ∞ the reciprocal map to Dφ = {A ∈  (X ):infx∈X |A(x)| > 0}. This follows after applying lemma 12.2 to the set B =[−∞, 0) ∪ (0, ∞]. Proof of lemma 12.2. Note that D = ∞(X ) in this case, and that the tangent set for the derivative is all of D.Lettn be any real sequence ∞ ∞ with tn → 0, let {hn}∈ (X ) be any sequence converging to α ∈  (X ) uniformly, and define An = A + tnhn. Then, by the conditions of the δ theorem, there exists a closed B1 ⊂ B such that {R(A) ∪ R(An)} ⊂ B1 for some δ>0andalln large enough. Hence − g(A(x)+tnhn(x)) g(A(x)) sup − g (A(x))α(x) → 0, x∈X tn as n →∞, since continuous functions on closed sets are bounded and thus g is uniformly continuous on B1.

12.2.2 Integration ¯ For an M<∞ andaninterval[a, b] ∈ R ,letBVM [a, b]bethesetofall ∈ | |≤ functions A D[a, b] with total variation (a,b] dA(s) M. In this section, we consider, for given functions A ∈ D[a, b]andB ∈ BVM [a, b] and domain DM ≡ D[a, b] × BVM [a, b], the maps φ : DM → R and ψ : DM → D[a, b] defined by (12.2) φ(A, B)= A(s)dB(s)andψ(A, B)(t)= A(s)dB(s). (a,b] (a,t]

The following lemma verifies that these two maps are Hadamard differen- tiable:

Lemma 12.3 For each fixed M<∞, the maps φ : DM → R and ψ : DM → D[a, b]defined in (12.2) are Hadamard differentiable at each ∈ D | | ∞ (A, B) M with (a,b] dA < . The derivatives are given by 12.2 Examples 231 φA,B(α, β)= Adβ + αdB, and (a,b] (a,b] ψA,B(α, β)(t)= Adβ + αdB. (a,t] (a,t] Note that in the above lemma we define Adβ = A(t)β(t)−A(a)β(a)− (a,t] − (a,t] β(s )dA(s) so that the integral is well defined even when β does not have bounded variation. We will present the proof of this lemma at the end of this section. We now look at two statistical applications of lemma 12.3, the two-sample Wilcoxon rank sum statistic, and the Nelson-Aalen integrated hazard esti- mator. Consider first the Wilcoxon statistic. Let X1,...,Xm and Y1,...,Yn be independent samples from distributions F and G on the reals. If Fm and Gn are the respective empirical distribution functions, the Wilcoxon rank sum statistic for comparing F and G has the form

T1 = m (mFm(x)+nGn(x))dFm(x). R

If we temporarily assume that F and G are continuous, then 2 T1 = mn Gn(x)dFm(x)+m Fm(x)dFm(x) R R m2 + m = mn Gn(x)dFm(x)+ R 2 m2 + m ≡ mnT2 + , 2 where T2 is the Mann-Whitney statistic. When F or G have atoms, the rela- tionship between the Wilcoxon and Mann-Whitney statistics is more com- plex. We will now study the asymptotic properties of the Mann-Whitney version of the rank sum statistic, T2. For arbitrary F and G, T2 = φ(Gn, Fm), where φ is as defined in lemma 12.3. Note that F , G, Fm and Gn all have total variation ≤ 1. Thus lemma 12.3 applies, and we obtain that the Hadamard derivative of φ at (A, B)= (G, F )isthemapφG,F (α, β)= R Gdβ + R αdF , which is continuous and linear over α, β ∈ D[−∞, ∞]. If we assume that m/(m + n) → λ ∈ [0, 1], as m ∧ n →∞,then √ mn G − G λ B1(G) n  √ , m + n Fm − F 1 − λ B2(F ) where B1 and B2 are independent standard Brownian bridges. Hence GG(·) ≡ B1(G(·)) and GF (·) ≡ B2(F (·)) both live in D[−∞, ∞]. Now theorem 2.8 yields 232 12. The Functional Delta Method √ √ T2  λ GdGF + 1 − λ GF dG, R R as m ∧ n →∞.WhenF = G and F is continuous, this limiting distribu- tion is mean zero normal with variance 1/12. The delta method bootstrap, theorem 12.1, is also applicable and can be used to obtain an estimate of the limiting distribution under more general hypotheses on F and G. We now shift our attention to the Nelson-Aalen estimator under right censoring. In the right censored survival data setting, an observation con- sists of the pair (X, δ), where X = T ∧ C is the minimum of a failure time T and censoring time C,andδ =1{T ≤ C}. T and C are assumed to be independent. Let F be the distribution function for T , and define t − the integrated baseline hazard for F to be Λ(t)= 0 dF (s)/S(s ), where S ≡ 1−F is the survival function. The Nelson-Aalen estimator for Λ, based on the i.i.d. sample (X1,δ1),...,(Xn,δn), is dNˆn(s) Λˆ n(t) ≡ , [0,t] Yˆn(s) ˆ ≡ −1 n { ≤ } ˆ ≡ −1 n { ≥ } where Nn(t) n i=1 δi1 Xi t and Yn(t) n i=1 1 Xi t .It is easy to verify that the classes {δ1{X ≤ t},t≥ 0} and {1{X ≥ t} : t ≥ 0} are both Donsker and hence that √ ˆ − G Nn N0  1 (12.3) n G , Yˆn − Y0 2 where N0(t) ≡ P (T ≤ t, C ≥ T ), Y0(t) ≡ P (X ≥ t), and G1 and G2 are tight Gaussian processes with respective covariances N0(s ∧ t) − N0(s)N0(t)andY0(s ∨ t) − Y0(s)Y0(t) and with cross-covariance 1{s ≥ t} [N0(s) − N0(t−)]−N0(s)Y0(t). Note that while we have already seen this survival set-up several times (eg., sections 2.2.5 and 4.2.2), we are choosing to use slightly different notation than previously used to emphasize certain features of the underlying empirical processes. The Nelson-Aalen estimator depends on the pair (Nˆn, Yˆn) through the two maps 1 1 (A, B) → A, → dA. B [0,t] B From section 12.1.1, lemma 12.3, and the chain rule (lemma 6.19), it is easy to see that this composition map is Hadamard differentiable on a { | |≤ | |≥ } domain of the type (A, B): [0,τ] dA(t) M,inft∈[0,τ] B(t) for agivenM<∞ and >0, at every point (A, B) such that 1/B has bounded variation. Note that the interval of integration we are using, [0,τ], is left-closed rather than left-open as in the definition of ψ given in (12.2). However, if we pick some η>0, then in fact integrals over [0,t], for any t>0, of functions which have zero variation over (−∞, 0) are unchanged if 12.2 Examples 233 we replace the interval of integration with (−η, t]. Thus we will still be able to utilize lemma 12.3 in our current set-up. 
In this case, the point (A, B)of interest is A = N0 and B = Y0.Thusift is restricted to the interval [0,τ], where τ satisfied Y0(τ) > 0, then it is easy to see that the pair (Nˆn, Yˆn) is contained in the given domain with probability tending to 1 as n →∞. The derivative of the composition map is given by −β dα βdN0 (α, β) → α, 2 → − 2 . Y0 [0,t] Y0 [0,t] Y0

Thus from (12.3), we obtain via theorem 2.8 that √ G G ˆ d 1 2dN0 (12.4) n(Λn − Λ)  − 2 . [0,(·)] Y0 [0,(·)] Y0 The Gaussian process on the right side of (12.4) is equal to dM/Y0, [0,(·)] where M(t) ≡ G1(t) − G2dΛ can be shown to be a Gaussian mar- [0,t] − tingale with independent increments and covariance [0,s∧t](1 ΔΛ)dΛ, where ΔA(t) ≡ A(t) − A(t−)isthemassatt of a signed-measure A. This means that the Gaussian process on the right side of (12.4) is also a Gaussian martingale with independent increments but with covariance − 0 [0,s∧t](1 ΔΛ)dΛ/Y . A useful discussion of continuous time martingales arising in right censored survival data can be found in Fleming and Har- rington (1991). The delta method bootstrap, theorem 12.1, is also applicable here and can be used to obtain an estimate of the limiting distribution. However, when Λ is continuous over [0,τ], the independent increments structure im- plies that the limiting distribution is time-transformed Brownian motion. More precisely, the limiting process can be expressed as W(v(t)), where W ∞ ≡ 0 is standard Brownian motion on [0, )andv(t) (0,t] dΛ/Y .Asdis- cussed in chapter 7 of Fleming and Harrington (1991), this fact can be used to compute asymptotically exact simultaneous confidence bands for Λ. Proof of lemma 12.3. For sequences tn → 0, αn → α,andβn → β, define An ≡ A+tnαn and Bn ≡ B +tnβn. Since we require that (An,Bn) ∈ DM , we know that the total variation of Bn is bounded by M.Consider first the derivative of ψ, and note that A dB − AdB (0,t] n n (0,t] − (12.5) ψA,B(αn,βn)= tn

αd(Bn − B)+ (αn − α)d(Bn − B). (0,t] (0,t] → Since it is easy to verify that the map (α, β) ψA,B(α, β) is continuous and linear, the desired Hadamard differentiability of ψ will follow provided the right side of (12.5) goes to zero. To begin with, the second term on the 234 12. The Functional Delta Method right side goes to zero uniformly over t ∈ (a, b], since both Bn and B have total variation bounded by M. Now, for the first term on the right side of (12.5), fix >0. Since α is cadlag, there exists a partition a = t0

≤ 2M +(2m +1) Bn − B ∞ α ∞ → 2M, as n →∞.Since was arbitrary, we have that the first term on the right side of (12.5) goes to zero, as n →∞, and the desired Hadamard differentiability of ψ follows. Now the desired Hadamard differentiability of φ follows from the trivial but useful lemma 12.4 below, by taking the extraction map f : D[a, b] → R defined by f(x)=x(b), noting that φ = f(ψ), and then applying the chain rule for Hadamard derivatives (lemma 6.19).

Lemma 12.4 Let T beasetandfixT0 ⊂ T . Define the extraction map ∞ ∞ f :  (T ) →  (T0) as f(x)={x(t):t ∈ T0}.Thenf is Hadamard ∈ ∞ { ∈ } differentiable at all x  (T ) with derivative fx(h)= h(t):t T0 . ∞ Proof. Let tn be any real sequence with tn → 0, and let {hn}∈ (T )be any sequence converging to h ∈ ∞(T ). The desired conclusion follows after −1 − { ∈ }→{ ∈ } noting that tn [f(x + tnhn) f(x)] = hn(t):t T0 h(t):t T0 , as n →∞.

12.2.3 Product Integration ∈ c ≡ − For a function A D(0,b], let A (t) A(t) 0

The first product is merely notation, but it is motivated by the mathemat- ical definition of the product integral: φ(A)(t) = lim {1+[A(ti) − A(ti−1)]} , maxi |ti−ti−1|→0 i where the limit is over partitions 0 = t0 τ) > 0. Under these circumstances, there exists an M<∞, such that Λ(τ)

− ≤ M 2 − (ii) φ(A) φ(B) (s,t] e (1 + M) A B (s,t].

Proof. For any u ∈ (s, t], consider the function Cu ∈ D(s, t] defined as ⎧ ⎨ A(x) − A(s), for s ≤ x

Using the Peano series expansion of exercise 12.3.2, part (b), we obtain:

φ(A)(s, u)φ(B)(u, t]=φ(Cu)(s, t]=1 + A(dt1) ···A(dtm) ··· ··· ≤ m,n≥0: m+n≥1 s

×B(dtm+1) ···B(dtm+n).

Thus φ(A)(s, u)φ(B)(u, t](B − A)(du) (s,t] ⎡ ⎤ ⎣ ⎦ = 1+ A(dt1) ···A(dtm) ··· ≤ ··· n≥1 s

× A(dt1) ···A(dtn) 12.2 Examples 237 = B(dx1) ···B(dxn) ··· ≤ n≥1 s

This proves part (i). For part (ii), we need to derive an integration by parts formula for the − ≡ u Duhamel equation. Define G = B A and H(u) 0 φ(B)(v, t]G(dv). Now integration by parts gives us (12.6) φ(A)(0,u)φ(B)(u, t]G(du) (s,t] = φ(A)(t)H(t) − φ(A)(s)H(s) − H(u)φ(A)(du). (s,t] From the backwards integral equation (part (c) of exercise 12.3.2), we know that φ(B)(dv, t]=−φ(B)(v, t]B(dv), and thus, by integration by parts, we obtain H(u)=G(u)φ(B)(u, t]+ G(v−)φ(B)(v, t]B(dv). (0,u] Combining this with (12.6) and the fact that φ(A)(du)=φ(A)(u−)A(du), we get (12.7) φ(A)(0,u)φ(B)(u, t]G(du) (s,t] = φ(A)(t) G(u−)φ(B)(u, t]B(du) (0,t] − φ(A)(s)φ(B)(s, t]G(s) −φ(A)(s) G(u−)φ(B)(u, t]B(du) (0,s] − G(u)φ(B)(u, t]φ(A)(u−)A(du) (s,t] − G(v−)φ(B)(v, t]B(dv)φ(A)(u−)A(du). (s,t] (0,u] From exercise 12.3.3, we know that φ(A)andφ(B) are bounded by the ex- ponentiation of the respective total variations of A and B. Now the desired result follows. Proof of lemma 12.5. Set An = A + tnαn for a sequence αn → α with the total variation of both A and An bounded by M.Inviewofthe Duhamel equation (part (i) of lemma 12.6 above), it suffices to show that 238 12. The Functional Delta Method

φ(A)(0,u)φ(An)(u, t]dαn(u) → φ(A)(0,u)φ(A)(u, t]dα(u), (0,t] (0,t] uniformly in 0 ≤ t ≤ b.Fix >0. Since α ∈ D[0,b], there exists a function α˜ with total variation V<∞ such that α − α˜ ∞ ≤ . Now recall that the derivation of the integration by parts formula (12.7) from the proof of lemma 12.6 did not depend on the definition of G,other than the necessity of G being right-continuous. If we replace G with α − α˜, we obtain from (12.7) that φ(A)(0,u)φ(An)(u, t]d(αn − α˜)(u) (0 (·)] , ∞ 2M 2 ≤ e (1 + 2M) αn − α˜ ∞ → e2M (1 + 2M)2 , as n →∞,since αn − α ∞ → 0. Moreover, φ(A)(0,u)[φ(An) − φ(A)] (u, t]dα˜(u) (0 (·)] , ∞

≤ φ(An) − φ(A) ∞ φ(A) ∞V → 0, as n →∞. Now using again the integration by parts formula (12.7), but with G =˜α − α,weobtain φ(A)(0,u)φ(A)(u, t]d(˜α − α)(u) ≤ e2M (1 + 2M)2 . 0 (·)] , ∞ Thus the desired result follows since was arbitrary.

12.2.4 Inversion Recall the derivation given in the paragraphs following theorem 2.8 of the Hadamard derivative of the inverse of a distribution function F .Notethat this derivation did not depend on F being a distribution function per se. In fact, the derivation will carry through unchanged if we replace the dis- tribution function F with any nondecreasing, cadlag function A satisfying mild regularity conditions. For a non-decreasing function B ∈ D(−∞, ∞), define the left-continuous inverse r → B−1(r) ≡ inf{x : B(x) ≥ r}. We will hereafter use the notation D˜[u, v] to denote all left-continuous functions with right-hand limits (caglad) on [u, v]andD1[u, v] to denote the restric- tion of all non-decreasing functions in D(−∞, ∞)totheinterval[u, v]. Here is a precise statement of the general Hadamard differentiation result for non-decreasing functions: 12.2 Examples 239

Lemma 12.7 Let −∞ 0, with derivative A being strictly positive and bounded over [u, v]. Then the inverse map B → B−1 as a map D1[u, v] ⊂ D[u, v] → D˜[p, q] is Hadamard differentiable at A tangentially to C[u, v], with derivative α →−(α/A) ◦ A−1. We will give the proof of lemma 12.7 at the end of this section. We now restrict ourselves to the setting where A is a distribution function which we will now denote by F . The following lemma provides two results similar to lemma 12.7 but which utilize knowledge about the support of the distribution function F .LetD2[u, v] be the subset of distribution functions in D1[u, v] with support only on [u, ∞), and let D3[u, v] be the subset of distribution functions in D2[u, v] which have support only on [u, v]. Lemma 12.8 Let F be a distribution function. We have the following:

(i) Let F ∈ D2[u, ∞), for finite u,andletq ∈ (0, 1).AssumeF is continuously differentiable on the interval [u, v]=[u, F −1(q)+ ],for some >0, with derivative f being strictly positive and bounded over −1 [u, v]. Then the inverse map G → G as a map D2[u, v] ⊂ D[u, v] → D˜(0,q] is Hadamard differentiable at F tangentially to C[u, v].

(ii) Let F ∈ D3[u, v],for[u, v] compact, and assume that F is con- tinuously differentiable on [u, v] with derivative f strictly positive and bounded over [u, v]. Then the inverse map G → G−1 as a map D3[u, v] ⊂ D[u, v] → D˜(0, 1) is Hadamard differentiable at F tangen- tially to C[u, v].

In either case, the derivative is the map α →−(α/f) ◦ F −1. Before giving the proof of the above two lemmas, we will discuss some im- portant statistical applications. As discussed in section 2.2.4, an important application of these results is to estimation and inference for the quantile function p → F −1(p) based on the usual empirical distribution function for i.i.d. data. Lemma 12.8 is useful when some information is available on the support of F , since it allows the range of p to extend as far as possible. These results are applicable to other estimators of the distribution func- tion F besides the usual empirical distribution, provided the standardized estimators converge to a tight limiting process over the necessary intervals. Several examples of such estimators include the Kaplan-Meier estimator, the self-consistent estimator of Chang (1990) for doubly-censored data, and certain estimators from dependent data as mentioned in Kosorok (1999). We now apply lemma 12.8 to the construction of quantile processes based on the Kaplan-Meier estimator discussed in section 12.2.3 above. Since it is known that the support of a survival function is on [0, ∞), we can utilize part (i) of this lemma. Define the Kaplan-Meier quantile process 240 12. The Functional Delta Method

{ˆ ≡ ˆ−1 ≤ } ˆ − ˆ ˆ ξ(p) Fn (p), 0

ufor all 0

(12.8) (F + tnαn)(ξpn − pn) ≤ p ≤ (F + tnαn)(ξpn), for all sufficiently large n. By the smoothness of F ,wehaveF (ξp)=p and F (ξpn − pn)=F (ξpn)+ O( pn), uniformly over p ∈ (0,q]. Thus from (12.8) we obtain

(12.9) −tnα(ξpn)+o(tn) ≤ F (ξpn) − F (ξp) ≤−tnα(ξpn − pn)+o(tn), where the o(tn) terms are uniform over 0

12.2.5 Other Mappings We now mention briefly a few additional interesting examples. The first example is the copula map. For a bivariate distribution function H,with marginals FH (x) ≡ H(x, ∞)andGH (y) ≡ H(∞,y), the copula map is the map φ from bivariate distributions on R2 to bivariate distributions on [0, 1]2 defined as follows:

→ −1 −1 ∈ 2 H φ(H)(u, v)=H(FH (u),GH (v)), (u, v) [0, 1] , where the inverse functions are the left-continuous quantile functions de- fined in the previous section. Section 3.9.4.4 of VW verifies that this map is Hadamard differentiable in a manner which permits developing inferential procedures for the copula function based on i.i.d. bivariate data. The second example is multivariate trimming. Let P be a given proba- bility distribution on Rd,fixα ∈ (0, 1/2], and define H to be the collection d of all closed half-spaces in R .ThesetKP ≡∩{H ∈H: P (H) ≥ 1 − α} can be easily shown to be compact and convex (see exercise 12.3.4). The α-trimmed mean is the quantity 1 T (P ) ≡ xdλ(x), λ(KP ) KP where λ is the Lebesgue measure on Rd. Using non-trivial arguments, sec- tion 3.9.4.6 of VW shows how P → T (P ) can be formulated as a Hadamard differentiable functional of P and how this formulation can be applied to develop inference for T (P ) based on i.i.d. data from P . There are many other important examples in statistics, some of which we will explore later on in this book, including a delta method formulation of Z-estimator theory which we will describe in the next chapter (chapter 13) and several other examples in the case studies of chapter 15.

12.3 Exercises

12.3.1. In the Wilcoxon statistic example of section 12.2.2, verify explic- itly that every hypothesis of theorem 2.8 is satisfied.

12.3.2. Show that the product integral of A, φ(A)(s, t], is equivalent to the following:

(a) The unique solution B of the following Volterra integral equation: B(s, t]=1+ B(s, u)A(du). (s,t] 242 12. The Functional Delta Method

(b) The following Peano series representation: ∞ φ(A)(s, t]=1+ A(dt1) ···A(dtm), ··· ≤ m=1 s

Hint: Start at t and go backwards in time to s. 12.3.3. Let φ be the product integral map of section 12.2.3. Show that if the total variation of A over the interval (s, t]isM,then|φ(A)(s, t]|≤eM . Hint: Recall that log(1 + x) ≤ x for all x>0.

d 12.3.4. Show that the set KP ⊂ R defined in section 12.2.5 is compact and convex.

12.4 Notes

Much of the material of this chapter is inspired by chapter 3.9 of VW, although there is some new material and the method of presentation is different. Section 12.1 contains results from sections 3.9.1 and 3.9.3 of VW, although our results are specialized to Banach spaces (rather than the more general topological vector spaces). The examples of sections 12.2.1 through 12.2.4 are modified versions of the examples of sections 3.9.4.3, 3.9.4.1, 3.9.4.5 and 3.9.4.2, respectively, of VW. The order has be changed to emphasize a natural progression leading up to quantile inference based on the Kaplan-Meier estimator. Lemma 12.2 is a generalization of lemma 3.9.25 of VW, while lemmas 12.3 and 12.5 correspond to lemmas 3.9.17 and 3.9.30 of VW. Lemma 12.7 is a generalization of part (i) of lemma 3.9.23 of VW, while part (ii) of lemma 12.8 corresponds to part (ii) of lemma 3.9.23 of VW. Exercise 12.3.2 is based on exercises 3.9.5 and 3.9.6 of VW. This is page 243 Printer: Opaque this 13 Z-Estimators

Recall from section 2.2.5 that Z-estimators are approximate zeros of data- dependent functions. These data-dependent functions, denoted Ψn,are maps between a possibly infinite dimensional normed parameter space Θ and a normed space L, where the respective norms are · and · L. The Ψn are frequently called estimating equations. A quantity θˆn ∈ Θis P a Z-estimator if Ψn(θˆn) L → 0. In this chapter, we extend and prove the results of section 2.2.5. As part of this, we extend the Z-estimator master theorem, theorem 10.16, to the infinite dimensional case, although we divide the result into two parts, one for consistency and one for weak convergence. We first discuss consistency and present a Z-estimator master theorem for consistency. We then discuss weak convergence and examine closely the special case of Z-estimators which are empirical measures of Donsker classes. We then use this structure to develop a Z-estimator master theorem for weak convergence. Both master theorems, the one for consistency and the one for weak convergence, will include results for the bootstrap. Finally, we demonstrate how Z-estimators can be viewed as Hadamard differentiable functionals of the involved estimating equations and how this structure enables use of a modified delta method to obtain very general results for Z- estimators. Recall from section 2.2.5 that the Kaplan-Meier estimator is an important and instructive example of a Z-estimator. A more sophisticated example, which will be presented later in the case studies of chapter 15, is the nonparametric maximum likelihood estimator for the proportional odds survival model. 244 13. Z-Estimators 13.1 Consistency

The main consistency result we have already presented in theorem 2.10 of section 2.2.5, and the proof of this theorem was given as an exercise (exercise 2.4.2). We will now extend this result to the bootstrapped Z- estimator. First, we restate the identifiability condition of theorem 2.10: The map Ψ : Θ → L is identifiable at θ0 ∈ Θif

(13.1) Ψ(θn) L → 0 implies θn − θ0 for any {θn}∈Θ.

Note that there are alternative identifiability conditions that will also work, including the stronger condition that both Ψ(θ0)=0andΨ:Θ→ L be one-to-one. Nevertheless, condition (13.1) seems to be the most efficient for our purposes. P◦ In what follows, we will use the bootstrap-weighted empirical process n to denote either the nonparametric bootstrapped empirical process (with multinomial weights) or the multiplier bootstrapped empirical process de- → P◦ −1 n ¯ fined by f nf = n i=1(ξi/ξ)f(Xi), whereξ1,...,ξn are i.i.d. posi- ∞ ¯ −1 n tive weights with 0 <μ=Eξ1 < and ξ = n i=1 ξi. Note that this is a special case of the weighted bootstrap introduced in theorem 10.13 but with the addition of ξ¯ in the denominator. We leave it as an exercise (exer- cise 13.4.1) to verify that the conclusions of theorem 10.13 are not affected by this addition. Let Xn ≡{X1,...,Xn} as given in theorem 10.13. The following is the main result of this section: Theorem 13.1 (Master Z-estimator theorem for consistency) Let θ → → P → ◦ P◦ Ψ(θ)=Pψθ, θ Ψn(θ)= nψθ and θ Ψn(θ)= nψθ,whereΨ satisfies (13.1) and the class {ψθ : θ ∈ Θ} is P -Glivenko-Cantelli. Then, provided Ψn(θˆn) L = oP (1) and ◦ ˆ◦ X (13.2) P Ψn(θn) L >η n = oP (1) for every η>0, ˆ − ˆ◦ − X we have both θn θ0 = oP (1) and P θn θ0 >η n = oP (1) for every η>0. Note in (13.2) the absence of an outer probability on the left side. This is because, as argued in section 2.2.3, a Lipschitz continuous map of either of these bootstrapped empirical processes is measurable with respect to the random weights conditional on the data. Proof of theorem 13.1. The result that θˆn − θ0 = oP (1) is a conclu- sion from theorem 2.10. For the conditional bootstrap result, (13.2) implies ↓ ˆ◦ X that for some sequence ηn 0, P Ψ(θn) L >ηn n = oP (1), since ◦ − X P sup Ψn(θ) Ψ(θ) >η n = oP (1) θ∈Θ 13.2 Weak Convergence 245 for all η>0, by theorems 10.13 and 10.15. Thus, for any >0, ˆ◦ − X ≤ ˆ◦ − ˆ◦ ≤ X P θn θ0 > n P θn θ0 > , Ψ(θn) L ηn n ˆ◦ X +P Ψ(θn) L >ηn n P → 0, since the identifiability condition (13.1) implies that for all δ>0there exists an η>0 such that Ψ(θ) L <ηimplies θ − θ0 <δ. Hence it is impossible for there to exist any θ ∈ Θ such that both θ − θ0 > and Ψ(θ) L <ηn for all n ≥ 1. The conclusion now follows since was arbitrary. Note that we might have worked toward obtaining outer almost sure re- sults since we are making a strong Glivenko-Cantelli assumption for the class of functions involved. However, we only need convergence in probabil- ◦ ˆ◦ ity for statistical applications. Notice also that we only assumed Ψn(θn) goes to zero conditionally rather than unconditionally as done in theo- rem 10.16. However, it seems to be easier to check the conditional version in practice. Moreover, the unconditional version is actually stronger than the conditional version, since ∗ ◦ ˆ◦ X ≤ ∗ ◦ ˆ◦ E P Ψn(θn) L >η n P Ψn(θn) L >η by the version of Fubini’s theorem given as theorem 6.14. It is unclear how to extend this argument to the outer almost sure setting. This is another reason for restricting our attention to the convergence in probability results. Nevertheless, we still need the strong Glivenko-Cantelli assumption since this enables the use of theorems 10.13 and 10.15.

13.2 Weak Convergence

In this section, we first provide general results for Z-estimators which may not be based on i.i.d. data. We then present sufficient conditions for the i.i.d. case when the estimating equation is an empirical measure ranging over a Donsker class. Finally, we give a master theorem for Z-estimators based on i.i.d. data which includes bootstrap validity.

13.2.1 The General Setting We now prove theorem 2.11 and give a method of weakening the differentia- bility requirement for Ψ. An important thing to note is that no assumptions about the data being i.i.d. are required. The proof follows closely the proof of theorem 3.3.1 given in VW. 246 13. Z-Estimators

Proof of theorem 2.11. By the definitions of θˆn and θ0, √ √ (13.3) n Ψ(θˆn) − Ψ(θ0) = − n Ψn(θˆn) − Ψ(θˆn) + oP (1) √ √ = − n(Ψn − Ψ)(θ0)+oP (1 + n θˆn − θ0 ), by assumption (2.12). Note the error terms throughout this theorem are L ˙ with respect to the norms of the spaces, e.g. Θ or , involved. Since Ψθ0 is continuously invertible, we have by part (i) of lemma 6.16 that there exists ˙ − ≥ − a constant c>0 such that Ψθ0 (θ θ0) c θ θ0 for all θ and θ0 in lin Θ. Combining this with the differentiability of Ψ yields Ψ(θ)−Ψ(θ0) ≥ c θ − θ0 + o( θ − θ0 ). Combining this with (13.3), we obtain √ √ n θˆn − θ0 (c + oP (1)) ≤ OP (1) + oP (1 + n θˆn − θ0 ). √ ˆ · We now have that θn is n-consistent for θ0 with respect to√ .Bythe ˙ ˆ − differentiability√ of Ψ, the left side of (13.3) can be replaced by nΨθ0 (θn θ0)+oP (1 + n θˆn − θ0 ). This last error term is now oP (1) as also is the error term on the right side of (13.3). Now the result (2.13) follows. ˙ −1 Next the continuity of Ψθ0 and the continuous mapping theorem yield √ −1 ˆ − 0  − ˙  nθn θ ) Ψθ0 (Z) as desired. The following lemma allows us to weaken the Fr´echet differentiabil- √ity requirement to Hadamard differentiability when it is also known that n(θˆn − θ0) is asymptotically tight: Lemma 13.2 Assume the conditions of theorem 2.11√ except that consis- tency of θˆn is strengthened to asymptotic tightness of n(θˆn − θ0) and the Fr´echet differentiability of Ψ is weakened to Hadamard differentiability at θ0. Then the results of theorem 2.11 still hold. √ Proof. The asymptotic tightness of n(θˆ −θ0) enables expression (13.3) √ √ n to imply n Ψ(θˆn) − Ψ(θ0) = − n(Ψn −Ψ)(θ0)+oP (1). The Hadamard √ √ ˆ − ˙ ˆ − differentiability of Ψ yields n Ψ(θn) Ψ(θ0) = nΨθ0 (θn θ0)+oP (1). √ √ ˙ ˆ − − − Combining, we now have nΨθ0 (θn θ0)= n(Ψn Ψ)(θ0)+oP (1), and all of the results of the theorem follow.

13.2.2 Using Donsker Classes We now consider the special case where the data involved are i.i.d., i.e., Ψn(θ)(h)=Pnψθ,h and Ψ(θ)(h)=Pψθ,h, for measurable functions ψθ,h, where h ranges over an index set H. The following lemma gives us reason- ably verifiable sufficient conditions for (2.12) to hold: Lemma 13.3 Suppose the class of functions { − − ∈H} (13.4) ψθ,h ψθ0,h : θ θ0 <δ,h 13.2 Weak Convergence 247 is P -Donsker for some δ>0 and − 2 → → (13.5) sup P (ψθ,h ψθ0,h) 0, as θ θ0. h∈H √ √ −1/2 P Then if Ψn(θˆn)=oP (n ) and θˆn → θ0, n(Ψn − Ψ)(θˆn) − n(Ψn − Ψ)(θ0)=oP (1). Before giving the proof of this lemma, we make the somewhat trivial observation that the conclusion of this lemma implies (2.12). Proof of lemma 13.3. Let Θδ ≡{θ : θ − θ0 <δ} and define the ∞ ∞ extraction function f :  (Θδ ×H) × Θδ →  (H)asf(z,θ)(h) ≡ z(θ, h), ∞ where z ∈  (Θδ ×H). Note that f is continuous at every point (z,θ1) | − |→ → such that suph∈H z(θ, h) z(θ1,h) 0asθ θ1. Define the stochas- ≡ G − ×H tic process Zn(θ, h) n(ψθ,h ψθ0,h) indexed by Θδ . As assumed, ∞ the process Zn converges weakly in  (Θδ ×H) to a tight Gaussian pro- cess Z0 with continuous sample paths with respect to the metric ρ de- 2 − − 2 fined by ρ ((θ1,h1), (θ2,h2)) = P (ψθ1,h1 ψθ0,h1 ψθ2,h2 + ψθ0,h2 ) . Since, → suph∈H ρ((θ, h), (θ0,h)) 0 by assumption, we have that f is continuous at almost all sample paths of Z0. By Slutksy’s theorem (theorem 7.15), (Zn, θˆn)  (Z0,θ0). The continuous mapping theorem (theorem 7.7) now implies that Zn(θˆn)=f(Zn, θˆn)  f(Z0,θ0)=0. If, in addition to the assumptions of lemma 13.3, we are willing to assume { ∈H} (13.6) ψθ0,h : h √ is P -Donsker, then n(Ψn − Ψ)(θ0)  Z, and all of the weak convergence assumptions of theorem 2.11 are satisfied. Alternatively, we could just as- sume that

(13.7) Fδ ≡{ψθ,h : θ − θ0 <δ,h∈H} is P -Donsker for some δ>0, then both (13.4) and (13.6) are P -Donsker for some δ>0. We are now well poised for a Z-estimator master theorem for weak convergence.

13.2.3 A Master Theorem and the Bootstrap In this section, we augment the results of the previous section to achieve a general Z-estimator master theorem that includes both weak convergence and validity of the bootstrap. Here we consider the two bootstrapped Z- estimators described in section 13.1, except that for the multiplier boot- 2 strap we make the additional requirements that 0 <τ =var(ξ1) < ∞ and P P P ξ1 2,1 < ∞.Weuse to denote either  or  depending on which ◦ ξ W bootstrap is being used, and we let the constant k0 = τ/μ for the multi- plier bootstrap and k0 = 1 for the multinomial bootstrap. Here is the main result: 248 13. Z-Estimators

Theorem 13.4 Assume Ψ(θ0)=0and the following hold:

(A) θ → Ψ(θ) satisfies (13.1);

(B) The class {ψθ,h; θ ∈ Θ,h∈H}is P -Glivenko-Cantelli;

(C) The class Fδ in (13.7) is P -Donsker for some δ>0;

(D) Condition (13.5) holds; √ ˆ −1/2 ◦ ˆ◦ X (E) Ψn(θn) L = oP (n ) and P n Ψn(θn) L >η n = oP (1) for every η>0;

(F) θ → Ψ(θ) is Fr´echet-differentiable at θ0 with continuously invertible ˙ derivative Ψθ0 .

√ −1 ∞ Then n(θˆ − θ0)  −Ψ˙ Z,whereZ ∈  (H) is the tight, mean zero n θ0 √ √ ◦ P Gaussian limiting distribution of n(Ψn −Ψ)(θ0),and n(θˆ −θˆn) k0Z. n ◦ Condition (A) is identifiability. Conditions (B) and (C) are consistency and asymptotic normality conditions for the estimating equation. Condi- tion (D) is an asymptotic equicontinuity condition for the estimating equa- tion at θ0. Condition (E) simply states that the estimators are approximate zeros of the estimating equation, while condition (F) specifies the smooth- ness and invertibility requirements of the derivative of Ψ. Except for the last half of condition√ (E), all of the conditions are requirements for asymp- totic normality of n(θˆn − θ0). What is perhaps surprising is how little additional assumptions are needed to obtain bootstrap validity. Only an assurance that the bootstrapped estimator is an approximate zero of the bootstrapped estimating equation is required. Thus bootstrap validity is almost an automatic consequence of asymptotic normality. ˆ Proof√ of theorem 13.4. The consistency of θn and weak convergence of n(θˆn − θ0) follow from theorems 13.1 and 13.4 and lemma 13.3. Theo- rem 13.1 also yields that there exists a decreasing sequence 0 <ηn ↓ 0such that ˆ◦ − X P θn θ0 >ηn n = oP (1).

Now we can use the same arguments used in√ the proof of lemma√ 13.3, in ◦ − ˆ◦ − ◦ − combination with theorem 2.6, to obtain that n(Ψn Ψ)(θn) n(Ψn Ψ)(θ0)=En,whereP(En >η|Xn)=oP (1) for all η>0. Combining √this with arguments√ used in the proof of theorem 2.11, we can deduce that ◦ −1 ◦ ˆ − 0 − ˙ − 0 |X n(θn θ )= Ψθ0 n(Ψn Ψ)(θ )+En,whereP(En >η n)=oP (1) for all η>0. Combining this with the conclusion of theorem 2.11, we obtain √ ◦ −1√ ◦ ˆ − ˆ − ˙ − 0 |X n(θn θn)= Ψθ0 n(Ψn Ψn)(θ )+En,whereP(En >η n)= oP (1) for all η>0. The final conclusion now follows from reapplication of theorem 2.6. 13.3 Using the Delta Method 249 13.3 Using the Delta Method

There is an alternative approach to Z-estimators which may be more effec- tive for more general data settings, including non-i.i.d. and dependent data settings. The idea is to view the extraction of the zero from the estimating equation as a continuous mapping. Our approach is closely related to the approach in section 3.9.4.7 of VW but with some important modifications which simplify the required assumptions. We require Θ to be the subset of a Banach space and L to be a Banach space. Let ∞(Θ, L) be the Banach space of all uniformly norm-bounded functions z :Θ→ L.LetZ(Θ, L)be the subset consisting of all maps with at least one zero, and let Φ(Θ, L)be the collection of all maps (or algorithms) φ : Z(Θ, L) → Θthatforeach element z ∈ Z(Θ, L) extract one of its zeros φ(z). This structure allows for multiple zeros. The following lemma gives us a kind of uniform Hadamard differentia- bility of members of Φ(Θ, L) which we will be able to use to obtain a delta ˆ ˆ −1 method result for Z-estimators θn that satisfy Ψn(θn)=oP (rn )forsome sequence rn →∞for which Xn(θ) ≡ rn(Ψn − Ψ)(θ) converges weakly to ∞ a tight, random element in X ∈  (Θ0, L), where Θ0 ⊂ Θisanopen neighborhood of θ0 and X(θ)− X(θ0) L → 0asθ → θ0 almost surely, i.e., ∞ X has continuous sample paths in θ. Define 0 (Θ, L) to be the elements ∞ x ∈  (Θ, L)forwhich x(θ) − x(θ0) L → 0asθ → θ0. Theorem 13.5 Assume Ψ:Θ→ L is uniformly norm-bounded over Θ, Ψ(θ0)=0, and condition (13.1) holds. Let Ψ also be Fr´echet differentiable ˙ at θ0 with continuously invertible derivative Ψθ0 . Then the continuous lin- ∞ −1 L → → ≡−˙ 0 ear operator φΨ : 0 (Θ, ) lin Θ defined by z φΨ(z) Ψθ0 (z(θ )) satisfies: − φ(Ψ + tnzn) φ(Ψ) sup − φΨ(z(θ0)) → 0, φ∈Φ(Θ,L) tn

as n → ∞, for any sequences (tn, zn) ∈ (0, ∞) × ℓ∞(Θ, L) such that tn ↓ 0, Ψ + tn zn ∈ Z(Θ, L), and zn → z ∈ ℓ0∞(Θ, L).

Proof. Let 0 < tn ↓ 0 and zn → z ∈ ℓ0∞(Θ, L) be arbitrary sequences satisfying Ψ + tn zn ∈ Z(Θ, L) for all n ≥ 1, let φn ∈ Φ(Θ, L) be an arbitrary sequence, and define θn ≡ φn(Ψ + tn zn). Then Ψ(θn) + tn zn(θn) = 0, so that ‖Ψ(θn)‖L ≤ tn‖zn‖Θ → 0 and hence, by the identifiability condition (13.1), θn → θ0. By the Fréchet differentiability of Ψ,

lim inf_{n→∞} ‖Ψ(θn) − Ψ(θ0)‖L / ‖θn − θ0‖ ≥ lim inf_{n→∞} ‖Ψ̇θ0(θn − θ0)‖L / ‖θn − θ0‖ ≥ inf_{‖g‖=1} ‖Ψ̇θ0(g)‖L,

where g ranges over lin Θ. Since the inverse of Ψ̇θ0 is continuous, the right side of the above is positive. Thus there exists a universal constant c < ∞ (depending only on Ψ̇θ0 and lin Θ) for which ‖θn − θ0‖ ≤ c‖Ψ(θn) − Ψ(θ0)‖L = c tn‖zn(θn)‖L for all n large enough. Hence ‖θn − θ0‖ = O(tn). By Fréchet differentiability, Ψ(θn) − Ψ(θ0) = Ψ̇θ0(θn − θ0) + o(‖θn − θ0‖), where Ψ̇θ0 is linear and continuous on lin Θ. The remainder term is o(tn) by previous arguments. Combining this with the fact that tn⁻¹(Ψ(θn) − Ψ(θ0)) = −zn(θn) → −z(θ0), we obtain

(θn − θ0)/tn = (Ψ̇θ0⁻¹ + o(1)) [ (Ψ(θn) − Ψ(θ0))/tn ] → −Ψ̇θ0⁻¹(z(θ0)).

The conclusion now follows since the sequence φn was arbitrary.

The following simple corollary allows the delta method to be applied to Z-estimators. Let φ̃ : ℓ∞(Θ, L) → Θ be a map such that for each x ∈ ℓ∞(Θ, L), φ̃(x) = θ1 ≠ θ0 when x ∉ Z(Θ, L) and φ̃(x) = φ(x) for some φ ∈ Φ(Θ, L) otherwise.

Corollary 13.6 Suppose Ψ satisfies the conditions of theorem 13.5, θ̂n = φ̃(Ψn), and Ψn has at least one zero for all n large enough, outer almost surely. Suppose also that rn(Ψn − Ψ) ⇝ X in ℓ∞(Θ, L), with X tight and ‖X(θ) − X(θ0)‖L → 0 as θ → θ0 almost surely. Then rn(θ̂n − θ0) ⇝ −Ψ̇θ0⁻¹(X(θ0)).

Proof. Since Ψn has a zero for all n large enough, outer almost surely, we can, without loss of generality, assume that Ψn has a zero for all n. Thus we can assume that φ̃ ∈ Φ(Θ, L). By theorem 13.5, we know that φ̃ is Hadamard differentiable tangentially to ℓ0∞(Θ, L), which, by assumption, contains X with probability 1. Thus theorem 2.8 applies, and the desired result follows.

We leave it as an exercise (see exercise 13.4.2 below) to develop a corollary which utilizes φ̃ to obtain a bootstrap result for Z-estimators. A drawback with this approach is that root finding algorithms in practice are seldom exact, and room needs to be allowed for computational error. The following corollary of theorem 13.5 yields a very general Z-estimator result based on a modified delta method. We make the fairly realistic assumption that the Z-estimator θ̂n is computed from Ψn using a deterministic algorithm (e.g., a computer program) that is allowed to depend on n and which is not required to yield an exact root of Ψn.

Corollary 13.7 Suppose Ψ satisfies the conditions of theorem 13.5, and θ̂n = An(Ψn) for some sequence of deterministic algorithms An : ℓ∞(Θ, L) → Θ and random sequence Ψn : Θ → L of estimating equations such that Ψn →P Ψ in ℓ∞(Θ, L) and Ψn(θ̂n) = oP(rn⁻¹), where 0 < rn → ∞, rn(Ψn − Ψ) ⇝ X in ℓ∞(Θ0, L), X is tight, and ‖X(θ) − X(θ0)‖L → 0 as θ → θ0 almost surely. Then rn(θ̂n − θ0) ⇝ φΨ(X) = −Ψ̇θ0⁻¹(X(θ0)).
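Before proceeding to the proof, the derivative formula z ↦ −Ψ̇θ0⁻¹(z(θ0)) can be checked numerically. The following sketch (an added illustration with a hypothetical one-dimensional Ψ, not from the text) perturbs Ψ by tn zn and extracts zeros with a deterministic bisection algorithm, in the spirit of the maps in Φ(Θ, L):

```python
import numpy as np

# Hypothetical example: Psi(theta) = theta + 0.2*theta^3, so Psi(theta_0) = 0
# at theta_0 = 0 with derivative Psi'(0) = 1; perturbation z(theta) = cos(theta).
def extract_zero(f, lo=-5.0, hi=5.0):
    # phi: a deterministic algorithm extracting a zero by bisection;
    # assumes f is increasing with a sign change on [lo, hi].
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

Psi = lambda th: th + 0.2 * th ** 3
z = np.cos
theta0 = extract_zero(Psi)                    # = 0

for t in [1e-1, 1e-2, 1e-3]:
    theta_t = extract_zero(lambda th: Psi(th) + t * z(th))
    # difference quotient should approach -z(0)/Psi'(0) = -1
    print(t, (theta_t - theta0) / t)
```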

Proof. Let Xn ≡ rn(Ψn − Ψ). By theorem 7.26, there exists a new probability space (Ω̃, Ã, P̃) on which: E*f(Ψ̃n) = E*f(Ψn) for all bounded f : ℓ∞(Θ, L) → R and all n ≥ 1; Ψ̃n → Ψ in ℓ∞(Θ, L) outer almost surely; rn(Ψ̃n − Ψ) → X̃ in ℓ∞(Θ0, L) outer almost surely; X̃ and X have the same distributions; and rn(An(Ψ̃n) − θ0) and rn(An(Ψn) − θ0) have the same distributions. Note that for any bounded f, f(Ψ̃n(An(Ψ̃n))) = g(Ψ̃n) for some bounded g. Thus Ψ̃n(θ̃n) = oP̃(rn⁻¹) for θ̃n ≡ An(Ψ̃n).

Hence for any subsequence n′ there exists a further subsequence n′′ such that Ψ̃n′′ → Ψ in ℓ∞(Θ, L) outer almost surely, rn′′(Ψ̃n′′ − Ψ) → X̃ in ℓ∞(Θ0, L) outer almost surely, and Ψ̃n′′(θ̃n′′) → 0 in L outer almost surely. Thus also Ψ(θ̃n′′) → 0, which implies θ̃n′′ → θ0, outer almost surely. Note that θ̃n′′ is a zero of θ ↦ Ψ̃n′′(θ) − Ψ̃n′′(θ̃n′′) by definition and is contained in Θ0 for all n′′ large enough. Hence, for all n′′ large enough, rn′′(θ̃n′′ − θ0) = rn′′[ φn′′(Ψ̃n′′ − Ψ̃n′′(θ̃n′′)) − φn′′(Ψ) ] for some sequence φn′′ ∈ Φ(Θ0, L) possibly dependent on the sample realization ω̃ ∈ Ω̃. This implies that for all n′′ large enough,

‖ rn′′(θ̃n′′ − θ0) − φΨ(X̃) ‖ ≤ sup_{φ∈Φ(Θ0,L)} ‖ rn′′[ φ(Ψ̃n′′ − Ψ̃n′′(θ̃n′′)) − φ(Ψ) ] − φΨ(X̃) ‖ → 0,

outer almost surely, by theorem 13.5 (with Θ0 replacing Θ). This implies rn′′(An(Ψ̃n′′) − θ0) − φΨ(X̃) → 0 outer almost surely. Since this holds for every subsequence, we have rn(An(Ψ̃n) − θ0) ⇝ φΨ(X̃). This of course implies rn(An(Ψn) − θ0) ⇝ φΨ(X).

The following corollary extends the previous result to generalized bootstrapped processes. Let Ψn° be a bootstrapped version of Ψn based on both the data sequence Xn (the data used in Ψn) and a sequence of weights W = {Wn, n ≥ 1}.

Corollary 13.8 Assume the conditions of corollary 13.7 and, in addition, that θ̂n° = An(Ψn°) for a sequence of bootstrapped estimating equations Ψn°(Xn, Wn), with Ψn° − Ψ ⇝_W^P 0 and rnΨn°(θ̂n°) ⇝_W^P 0 in ℓ∞(Θ, L), and with rn c(Ψn° − Ψn) ⇝_W^P X in ℓ∞(Θ0, L) for some 0 < c < ∞. Then rn c(θ̂n° − θ̂n) ⇝_W^P φΨ(X).

Proof. A minor adaptation of corollary 13.7, the verification of which is saved as exercise 13.4.3, applies to the unconditional versions of the bootstrap assumptions.

This implies two things. First,

rn c ( θ̂n° − θ̂n , (Ψn° − Ψn)(θ0) ) ⇝ ( φΨ(X) , X(θ0) )

unconditionally, since φΨ is linear on L. Second, the usual continuous mapping theorem now yields unconditionally that

(13.8) rn c(θ̂n° − θ̂n) + Ψ̇θ0⁻¹( rn c(Ψn° − Ψn)(θ0) ) →P 0,

since the map (x, y) ↦ x + Ψ̇θ0⁻¹(y) is continuous on all of lin Θ × L (recall that x ↦ φΨ(x) = −Ψ̇θ0⁻¹(x(θ0))). Now for any map h ∈ Cb(L), the map x ↦ h(rn c(x − Ψn(θ0))) is continuous and bounded for all x ∈ L outer almost surely. Thus the maps Wn ↦ h(rn c(Ψn° − Ψn)(θ0)) are measurable for every h ∈ Cb(L) outer almost surely. Hence the bootstrap continuous mapping theorem, theorem 10.8, yields that Ψ̇θ0⁻¹( rn c(Ψn° − Ψn)(θ0) ) ⇝_W^P Ψ̇θ0⁻¹(X(θ0)). The desired result now follows from (13.8).

An interesting application of the above results is to estimating equations for empirical processes from dependent data as discussed in section 11.6. Specifically, suppose Ψn(θ)(h) = Pn ψθ,h, where the stationary sample data X1, X2, ... and F = {ψθ,h : θ ∈ Θ, h ∈ H} satisfy the conditions of theorem 11.22 with marginal distribution P, and let Ψ(θ)(h) = P ψθ,h. Then the conclusion of theorem 11.22 is that √n(Ψn − Ψ) ⇝ H in ℓ∞(Θ × H), where H is a tight, mean zero Gaussian process. Provided Ψ satisfies the conditions of theorem 13.1, and provided a few other conditions hold, corollary 13.7 will give us weak convergence of the standardized Z-estimators √n(θ̂n − θ0) based on Ψn. Under certain regularity conditions, a moving blocks bootstrapped estimating equation Ψn° can be shown by theorem 11.24 to satisfy the requirements of corollary 13.8. This enables valid bootstrap estimation of the limiting distribution of √n(θ̂n − θ0). These results can also be extended to stationary sequences with long range dependence, where the normalizing rate rn may differ from √n.

13.4 Exercises

13.4.1. Show that the addition of ξ̄ in the denominator of the weights in the weighted bootstrap introduced in theorem 10.13, as discussed in section 13.1, does not affect the conclusions of that theorem.

13.4.2. Develop a bootstrap central limit theorem for Z-estimators based on theorem 12.1 which utilizes the Hadamard differentiability of the zero-extraction map φ̃ used in corollary 13.6.

13.4.3. Verify the validity of the "minor adaptation" of corollary 13.7 used in the proof of corollary 13.8.

13.5 Notes

Theorem 2.11 and lemma 13.2 are essentially a decomposition of theorem 3.3.1 of VW into two parts. Lemma 13.3 is lemma 3.3.5 of VW.

14 M-Estimators

M-estimators, as introduced in section 2.2.6, are approximate maximizers of objective functions computed from data. Note that estimators based on minimizing objective functions are trivially also M-estimators after taking the negative of the objective function. In some ways, M-estimators are more basic than Z-estimators since Z-estimators can always be expressed as M-estimators. The reverse is not true, however, since there are M-estimators which cannot be effectively formulated as Z-estimators. Nevertheless, Z-estimator theory is usually much easier to use whenever it can be applied. The focus, then, of this chapter is on M-estimator settings for which it is not practical to directly use Z-estimator theory. The usual issues for M-estimation theory are to establish consistency, determine the correct rate of convergence, establish weak convergence, and, finally, to conduct inference.

We first present a key result central to M-estimation theory, the argmax theorem, which permits deriving weak limits of M-estimators as the argmax of the limiting process. This is useful for both consistency, which we discuss next, and weak convergence. The section on consistency includes a proof of theorem 2.12. We then discuss how to determine the correct rate of convergence, which is necessary for establishing weak convergence based on the argmax theorem. We then present general results for "regular estimators," i.e., estimators whose rate of convergence is √n. We then give several examples which illustrate M-estimation theory for non-regular estimators which have rates distinct from √n. Much of the content of this chapter is inspired by chapter 3.2 of VW.

The M-estimators in both the regular and non-regular examples we present will be i.i.d. empirical processes of the form Mn(θ) = Pn mθ for a class of measurable, real valued functions M = {mθ : θ ∈ Θ}, where the parameter space Θ is usually a subset of a semimetric space. The entropy of the class M plays a crucial role in determining the proper rate of convergence. The rate-of-convergence step is often the most difficult part technically in M-estimation theory and usually requires fairly precise bounds on moments of the empirical processes involved, such as those described in section 11.1. We note that our presentation involves only a small amount of the useful results in the area. Many more results of this kind can be found in chapter 3.4 of VW and in van de Geer (2000).

Inference for M-estimators is more challenging than it is for Z-estimators because the bootstrap is not in general automatically valid, especially when the convergence rate is not √n. On the other hand, subsampling m out of n observations (see Politis and Romano, 1994) can be shown to be universally valid, provided m → ∞ and m/n → 0. However, even this result is not entirely satisfactory because it requires n to be quite large, since m must also be large yet small relative to n. Bootstrapping and other methods of inference for M-estimators are currently an area of active research, but we do not pursue them further in this chapter.
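To make the subsampling recipe concrete, here is a minimal sketch (an added illustration, not from the text). For simplicity it uses the sample median, an M-estimator maximizing Mn(θ) = −Pn|X − θ| with the regular rate rn = √n; the same template applies with rn = n^{1/3} in cube-root problems.

```python
import numpy as np

rng = np.random.default_rng(1)

# m-out-of-n subsampling for an M-estimator: m -> infinity, m/n -> 0.
n, m, B = 2000, 100, 500
x = rng.standard_normal(n)
theta_hat = np.median(x)

r = np.sqrt                                      # known convergence rate r_n
reps = np.empty(B)
for b in range(B):
    sub = rng.choice(x, size=m, replace=False)   # subsample without replacement
    reps[b] = r(m) * (np.median(sub) - theta_hat)

# Subsampling approximation to the law of sqrt(n)(theta_hat - theta_0):
lo, hi = np.quantile(reps, [0.025, 0.975])
print("95% CI:", theta_hat - hi / r(n), theta_hat - lo / r(n))
```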

14.1 The Argmax Theorem

We now consider a sequence {Mn(h) : h ∈ H} of stochastic processes indexed by a metric space H. Let ĥn denote an M-estimator obtained by nearly-maximizing Mn. The idea of the argmax theorem presented below is that under reasonable regularity conditions, when Mn ⇝ M, where M is another stochastic process in ℓ∞(H), then ĥn ⇝ ĥ, where ĥ is the argmax of M. If we know that the rate of convergence of an M-estimator θ̂n is rn (a nondecreasing, positive sequence), where θ̂n is the argmax of θ ↦ Mn(θ), then ĥn = rn(θ̂n − θ0) can be expressed as the argmax of h ↦ M̃n(h) ≡ rn[Mn(θ0 + h/rn) − Mn(θ0)]. Provided M̃n ⇝ M, and the regularity conditions of the argmax theorem apply, ĥn = rn(θ̂n − θ0) ⇝ ĥ, where ĥ is the argmax of M. Consistency results will follow for the choice rn = 1 for all n ≥ 1.

Note that in the theorem, we require the sequence ĥn to be uniformly tight. This is stronger than asymptotic tightness, as pointed out in lemma 7.10, but is also quite easy to establish for Euclidean parameters, which will be our main focus in this chapter. For finite Euclidean estimators that are measurable, uniform tightness follows automatically from asymptotic tightness (see exercise 14.6.1). This is a reasonable restriction, since, in practice, most infinite-dimensional estimators that converge weakly can usually be expressed as Z-estimators. The consistency results that we present later on will not require uniform tightness and will thus be more readily applicable to infinite-dimensional estimators. Returning to the theorem at hand, most weak convergence results for non-regular estimators apply to finite-dimensional parameters, and thus the theorem below will be applicable. We also note that it is not hard to modify these results for applicability to specific settings, including some infinite-dimensional settings. For interesting examples in this direction, see Ma and Kosorok (2005) and Kosorok and Song (2006).

We now present the argmax theorem, which utilizes upper semicontinuity as defined in section 6.1:

Theorem 14.1 (Argmax theorem) Let Mn, M be stochastic processes indexed by a metric space H such that Mn ⇝ M in ℓ∞(K) for every compact K ⊂ H. Suppose also that almost all sample paths h ↦ M(h) are upper semicontinuous and possess a unique maximum at a (random) point ĥ, which as a random map in H is tight. If the sequence ĥn is uniformly tight and satisfies Mn(ĥn) ≥ sup_{h∈H} Mn(h) − oP(1), then ĥn ⇝ ĥ in H.

Proof. Fix ε > 0. By the uniform tightness of ĥn and the tightness of ĥ, there exists a compact set K ⊂ H such that lim sup_{n→∞} P*(ĥn ∈ K) ≥ 1 − ε and P(ĥ ∈ K) ≥ 1 − ε. Then almost surely

M(ĥ) > sup_{h∉G, h∈K} M(h),

for every open G ∋ ĥ, by the upper semicontinuity of M. To see this, suppose it were not true. Then, by the compactness of K, there would exist a convergent sequence hm ∈ Gᶜ ∩ K, for some open G ∋ ĥ, with hm → h and M(hm) → M(ĥ). The upper semicontinuity forces M(h) ≥ M(ĥ), which contradicts the uniqueness of the maximum. Now apply lemma 14.2 below, with the sets A = B = K, to obtain

lim sup_{n→∞} P*(ĥn ∈ F) ≤ lim sup_{n→∞} P*(ĥn ∈ F ∩ K) + lim sup_{n→∞} P*(ĥn ∉ K)
≤ P(ĥ ∈ F ∪ Kᶜ) + ε
≤ P(ĥ ∈ F) + P(ĥ ∈ Kᶜ) + ε
≤ P(ĥ ∈ F) + 2ε.

The desired result now follows from the Portmanteau theorem since ε was arbitrary.

Lemma 14.2 Let Mn, M be stochastic processes indexed by a metric space H, and let A, B ⊂ H be arbitrary. Suppose there exists a random element ĥ such that almost surely

(14.1) M(ĥ) > sup_{h∉G, h∈A} M(h), for every open G ∋ ĥ.

Suppose the sequence ĥn satisfies Mn(ĥn) ≥ sup_{h∈H} Mn(h) − oP(1). Then, if Mn ⇝ M in ℓ∞(A ∪ B), we have for every closed set F,

lim sup_{n→∞} P*(ĥn ∈ F ∩ A) ≤ P(ĥ ∈ F ∪ Bᶜ).

Proof. By the continuous mapping theorem,

sup_{h∈F∩A} Mn(h) − sup_{h∈B} Mn(h) ⇝ sup_{h∈F∩A} M(h) − sup_{h∈B} M(h),

and thus

lim sup_{n→∞} P*(ĥn ∈ F ∩ A)
≤ lim sup_{n→∞} P*( sup_{h∈F∩A} Mn(h) ≥ sup_{h∈H} Mn(h) − oP(1) )
≤ lim sup_{n→∞} P*( sup_{h∈F∩A} Mn(h) ≥ sup_{h∈B} Mn(h) − oP(1) )
≤ P( sup_{h∈F∩A} M(h) ≥ sup_{h∈B} M(h) ),

by Slutsky's theorem (to get rid of the oP(1) part) followed by the Portmanteau theorem. Note that the event E in the last probability cannot happen when ĥ ∈ Fᶜ ∩ B, because of assumption (14.1) and the fact that Fᶜ is open. Thus E is contained in the set {ĥ ∈ F} ∪ {ĥ ∉ B}, and the conclusion of the lemma follows.

14.2 Consistency

We can obtain a consistency result by specializing the argmax theorem to the setting where M is fixed. This will not yield as general a result as theorem 2.12 because of the uniform tightness requirement. The primary goal of this section is to prove theorem 2.12. Before giving the proof, we want to present a result comparing a few different ways of establishing identifiability. We assume throughout this section that (Θ, d) is a metric space. In the following lemma, the condition given in (i) is the identifiability condition assumed in theorem 2.12, while condition (ii) is often called the "well-separated maximum" condition:

Lemma 14.3 Let M : Θ → R be a map and θ0 ∈ Θ a point. The following conditions are equivalent:

(i) For any sequence {θn} ∈ Θ, lim inf_{n→∞} M(θn) ≥ M(θ0) implies d(θn, θ0) → 0.

(ii) For every open G ∋ θ0, M(θ0) > sup_{θ∉G} M(θ).

The following condition implies both (i) and (ii):

(iii) M is upper semicontinuous with a unique maximum at θ0.

Proof. Suppose (i) is true but (ii) is not. Then there exists an open G ∋ θ0 such that sup_{θ∉G} M(θ) ≥ M(θ0). This implies the existence of a sequence {θn} ∉ G with both lim inf_{n→∞} M(θn) ≥ M(θ0) and d(θn, θ0) ≥ τ for some τ > 0, which is a contradiction. Thus (i) implies (ii). Now assume (ii) is true but (i) is not. Then there exists a sequence {θn} with lim inf_{n→∞} M(θn) ≥ M(θ0) but with θn ∉ G for infinitely many n and some open G ∋ θ0. Of course, this contradicts (ii), and thus (ii) implies (i). Now suppose M is upper semicontinuous with a unique maximum at θ0 but (ii) does not hold. Then there exists an open G ∋ θ0 for which sup_{θ∉G} M(θ) ≥ M(θ0). But this implies that the set {θ : M(θ) ≥ M(θ0)} contains at least one point in addition to θ0, since Gᶜ is closed and M is upper semicontinuous. This contradiction completes the proof.

Any one of the three identifiability conditions given in the above lemma is sufficient for theorem 2.12. The most convenient condition in practice will depend on the setting. Here is the awaited proof:

Proof of theorem 2.12. Since lim inf_{n→∞} M(θn) ≥ M(θ0) implies d(θn, θ0) → 0 for any sequence {θn} ∈ Θ, we know that there exists a nondecreasing cadlag function f : [0, ∞] → [0, ∞] that satisfies both f(0) = 0 and d(θ, θ0) ≤ f(|M(θ) − M(θ0)|) for all θ ∈ Θ. The details for constructing such an f are left as an exercise (see exercise 14.6.2). For part (i), note that

M(θ0) ≥ M(θ̂n) ≥ Mn(θ̂n) − ‖Mn − M‖Θ ≥ Mn(θ0) − oP(1) ≥ M(θ0) − oP(1).

By the previous paragraph, this implies d(θ̂n, θ0) ≤ f(|M(θ̂n) − M(θ0)|) →P 0. An almost identical argument yields part (ii).
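The interplay between the uniform convergence ‖Mn − M‖Θ →P 0 and identifiability can be seen numerically. The following sketch (an added illustration, not from the text) uses the criterion Mn(θ) = −Pn|X − θ|, whose maximizer is the sample median; for standard normal data, M(θ) = −E|X − θ| has a well-separated maximum at θ0 = 0.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

# M(theta) = -E|X - theta| = -(2*phi(theta) + theta*(2*Phi(theta) - 1))
theta = np.linspace(-3, 3, 601)
M = -(2 * norm.pdf(theta) + theta * (2 * norm.cdf(theta) - 1))

for n in [100, 1000, 10000]:
    x = rng.standard_normal(n)
    Mn = -np.abs(x[:, None] - theta[None, :]).mean(axis=0)
    # sup-distance ||M_n - M|| shrinks, and the argmax converges to theta_0 = 0
    print(n, np.abs(Mn - M).max(), theta[np.argmax(Mn)])
```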

14.3 Rate of Convergence

In this section, we relax the requirement that (Θ, d) be a metric space to only requiring that it be a semimetric space. If θ ↦ M(θ) is two times differentiable at a point of maximum θ0, then the first derivative of M at θ0 must vanish while the second derivative should be negative definite. Thus it is not unreasonable to require that M(θ) − M(θ0) ≤ −c d²(θ, θ0) for all θ in a neighborhood of θ0 and some c > 0. The following theorem shows that an upper bound for the rate of convergence of a near-maximizer of a random objective function Mn can be obtained from the modulus of continuity of Mn − M at θ0. In practice, one may need to try several rates that satisfy the conditions of this theorem before finding the right rn for which the weak limit of rn(θ̂n − θ0) is nontrivial.

Theorem 14.4 (Rate of convergence) Let Mn be a sequence of stochastic processes indexed by a semimetric space (Θ, d) and M : Θ → R a deterministic function such that, for some c1 > 0 and every θ in a neighborhood of θ0,

(14.2) M(θ) − M(θ0) ≤ −c1 d̃²(θ, θ0),

where d̃ : Θ × Θ → [0, ∞) satisfies d̃(θn, θ0) → 0 whenever d(θn, θ0) → 0. Suppose that, for all n large enough and sufficiently small δ, the centered process Mn − M satisfies

(14.3) E* sup_{d̃(θ,θ0)<δ} √n |(Mn − M)(θ) − (Mn − M)(θ0)| ≤ c2 φn(δ),

for c2 < ∞ and functions φn such that δ ↦ φn(δ)/δ^α is decreasing for some α < 2 not depending on n. Let

(14.4) rn² φn(rn⁻¹) ≤ c3 √n, for every n and some c3 < ∞.

If the sequence θ̂n satisfies Mn(θ̂n) ≥ sup_{θ∈Θ} Mn(θ) − OP(rn⁻²) and converges in outer probability to θ0, then rn d̃(θ̂n, θ0) = OP(1).

Proof. We will use a modified "peeling device" (see, for example, section 5.3 of van de Geer, 2000) for the proof. For every η > 0, let η′ > 0 be a number for which d̃(θ, θ0) ≤ η whenever θ ∈ Θ satisfies d(θ, θ0) ≤ η′, and also d̃(θ, θ0) ≤ η/2 whenever θ ∈ Θ satisfies d(θ, θ0) ≤ η′/2. Such an η′ always exists for each η by the assumed relationship between d and d̃. Note also that Mn(θ̂n) ≥ sup_{θ∈Θ} Mn(θ) − OP(rn⁻²) ≥ Mn(θ0) − OP(rn⁻²). Now fix ε > 0, and choose K < ∞ such that the probability that Mn(θ̂n) − Mn(θ0) < −K rn⁻² is ≤ ε.

For each n, the parameter space minus the point θ0 can be partitioned into "peels" Sj,n = {θ : 2^{j−1} < rn d̃(θ, θ0) ≤ 2^j}, with j ranging over the integers. If rn d̃(θ̂n, θ0) > 2^M for a given integer M, then θ̂n is in one of the peels Sj,n, with j > M. In that situation, the supremum of the map θ ↦ Mn(θ) − Mn(θ0) + K rn⁻² over that peel is nonnegative by the property of θ̂n. Conclude that for every η > 0,

(14.5) P*( rn d̃(θ̂n, θ0) > 2^M )
≤ Σ_{j≥M, 2^j≤η rn} P*( sup_{θ∈Sj,n} [ Mn(θ) − Mn(θ0) ] + K rn⁻² ≥ 0 )
+ P*( 2 d(θ̂n, θ0) ≥ η′ ) + P*( Mn(θ̂n) − Mn(θ0) < −K rn⁻² ).

The lim sup_{n→∞} of the sum of the two probabilities after the summation on the right side is ≤ ε by the consistency of θ̂n and the choice of K. Now choose η small enough so that (14.2) holds for all d̃(θ, θ0) ≤ η and (14.3) holds for all δ ≤ η. Then for every j involved in the sum, we have for every θ ∈ Sj,n, M(θ) − M(θ0) ≤ −c1 d̃²(θ, θ0) ≤ −c1 2^{2j−2} rn⁻². In terms of the centered process Wn ≡ Mn − M, the summation on the right side of (14.5) may be bounded by

Σ_{j≥M, 2^j≤η rn} P*( ‖Wn(θ) − Wn(θ0)‖_{Sj,n} ≥ (c1 2^{2j−2} − K) rn⁻² )
≤ Σ_{j≥M} c2 φn(2^j/rn) rn² / [ √n (c1 2^{2j−2} − K) ]
≤ Σ_{j≥M} c2 c3 2^{jα} / (c1 2^{2j−2} − K),

by Markov's inequality, the conditions on rn, and the fact that φn(cδ) ≤ c^α φn(δ) for every c > 1 as a consequence of the assumptions on φn. It is not difficult to show that this last sum goes to zero as M → ∞ (verifying this is saved as an exercise). Thus we can choose an M < ∞ such that the lim sup_{n→∞} of the left side of (14.5) is ≤ 2ε. The desired result now follows since ε was arbitrary.

Consider the i.i.d. setting with criterion functions of the form Mn(θ) = Pn mθ and M(θ) = P mθ. The scaled and centered process √n(Mn − M) = Gn mθ equals the empirical process at mθ. The assertion (14.3) involves assessing the suprema of the empirical process over classes of functions Mδ ≡ {mθ − mθ0 : d̃(θ, θ0) < δ}. Taking this view, establishing (14.3) will require fairly precise, but not unreasonably precise, knowledge of the involved empirical process. The moment results in section 11.1 will be useful here, and we will illustrate this with several examples later on in this chapter. We note that the problems we address in this book represent only a small subset of the scope and capabilities of empirical process techniques for determining rates of M-estimators. We close this section with the following corollary, which essentially specializes theorem 14.4 to the i.i.d. setting. Because the specialization is straightforward, we omit the somewhat trivial proof. Recall that the relation a ≲ b means that a is less than or equal to b times a universal finite and positive constant.

Corollary 14.5 In the i.i.d. setting, assume that for every θ in a neighborhood of θ0, P(mθ − mθ0) ≲ −d̃²(θ, θ0), where d̃ satisfies the conditions given in theorem 14.4. Assume moreover that there exists a function φ such that δ ↦ φ(δ)/δ^α is decreasing for some α < 2 and, for every n, E*‖Gn‖_{Mδ} ≲ φ(δ). If the sequence θ̂n satisfies Pn m_{θ̂n} ≥ sup_{θ∈Θ} Pn mθ − OP(rn⁻²) and converges in outer probability to θ0, then rn d̃(θ̂n, θ0) = OP(1) for every sequence rn for which rn² φ(1/rn) ≲ √n for all n ≥ 1.
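As a worked illustration (an added example, not part of the original development) of how (14.4) pins down the rate, suppose the modulus bound (14.3) holds with φn(δ) = δ^γ for some 0 < γ < 2. Then

```latex
r_n^2\,\phi_n(1/r_n) \;=\; r_n^{\,2-\gamma} \;\le\; c_3\sqrt{n}
\quad\Longleftrightarrow\quad
r_n \;\lesssim\; n^{\frac{1}{2(2-\gamma)}} .
```

Thus γ = 1 gives the regular rate rn = √n, as in the proof of theorem 2.13 in the next section, while γ = 1/2 gives the cube-root rate rn = n^{1/3} encountered in the monotone density example of section 14.5.2.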

14.4 Regular Euclidean M-Estimators

A general result for Euclidean M-estimators based on i.i.d. data was given in theorem 2.13 of section 2.2.6. We now prove this theorem. In section 2.2.6, the theorem was used to establish asymptotic normality of a least-absolute-deviation regression estimator. Establishing asymptotic normality in this situation is quite difficult without empirical process methods.

Proof of theorem 2.13. We first utilize corollary 14.5 to verify that √n is the correct rate of convergence. We will use Euclidean distance as both the discrepancy measure and the distance throughout, i.e., d̃(θ1, θ2) = d(θ1, θ2) = ‖θ1 − θ2‖. Condition (2.18) of the theorem indicates that Mδ in this instance is a Lipschitz class, and thus theorem 9.22 implies that

(14.6) N[]( 2ε‖Fδ‖_{P,2}, Mδ, L2(P) ) ≲ ε⁻ᵖ,

where Fδ ≡ δ ṁ is an envelope for Mδ. To see this, it may be helpful to rewrite condition (2.18) as

|mθ1(x) − mθ2(x)| ≤ (‖θ1 − θ2‖/δ) Fδ(x).

Now (14.6) can be utilized in theorem 11.2 to obtain E*‖Gn‖_{Mδ} ≲ ‖Fδ‖_{P,2} ≲ δ. Hence the modulus of continuity condition in corollary 14.5 is satisfied for φ(δ) = δ. Combining condition (2.19), the maximality of θ0, and the fact that the second derivative matrix V is nonsingular and continuous yields that M(θ) − M(θ0) ≲ −‖θ − θ0‖². Since φ(δ)/δ^α = δ^{1−α} is decreasing for any α ∈ (1, 2), and rn² φ(1/rn) = n/√n = √n for rn = √n, the remaining conditions of corollary 14.5 are satisfied for rn = √n, and thus √n(θ̂n − θ0) = OP(1).

The next step is to apply the argmax theorem (theorem 14.1) to the process h ↦ Un(h) ≡ n( Mn(θ0 + h/√n) − Mn(θ0) ). Now fix a compact K ⊂ Rᵖ, and note that

Un(h) = Gn[ √n( m_{θ0+h/√n} − m_{θ0} ) − hᵀ ṁ_{θ0} ] + hᵀ Gn ṁ_{θ0} + n( M(θ0 + h/√n) − M(θ0) )
≡ En(h) + hᵀ Gn ṁ_{θ0} + (1/2) hᵀ V h + o(1),

where o(1) denotes a quantity going to zero uniformly over K. Note that ĥn ≡ √n(θ̂n − θ0) satisfies Un(ĥn) ≥ sup_{h∈Rᵖ} Un(h) − oP(1). Thus, provided we can establish that ‖En‖_K = oP(1), the argmax theorem will yield that ĥn ⇝ ĥ, where ĥ is the argmax of h ↦ U(h) ≡ hᵀZ + (1/2)hᵀVh and Z is the Gaussian limiting distribution of Gn ṁ_{θ0}. Hence ĥ = −V⁻¹Z, and the desired result will follow.

We now prove that ‖En‖_K = oP(1) for all compact K ⊂ Rᵖ. Let u_h^n(x) ≡ √n( m_{θ0+h/√n}(x) − m_{θ0}(x) ) − hᵀ ṁ_{θ0}(x), and note that by (2.18),

| u_{h1}^n(x) − u_{h2}^n(x) | ≤ ( ṁ(x) + ‖ṁ_{θ0}(x)‖ ) ‖h1 − h2‖,

for all h1, h2 ∈ Rᵖ and all n ≥ 1. Fix a compact K ⊂ Rᵖ, and let Fn ≡ {u_h^n : h ∈ K}. Applying theorem 9.22 once again, but with ‖·‖_{Q,2} (for any probability measure Q on X) instead of ‖·‖_{P,2}, we obtain

N[]( 2ε‖Fn‖_{Q,2}, Fn, L2(Q) ) ≤ k ε⁻ᵖ,

where the envelope Fn ≡ ( ṁ + ‖ṁ_{θ0}‖ ) sup_{h∈K} ‖h‖ and k < ∞ does not depend on K or n. Lemma 9.18 in chapter 9 now yields that

sup_{n≥1} sup_Q N( 2ε‖Fn‖_{Q,2}, Fn, L2(Q) ) ≤ k ε⁻ᵖ,

where the second supremum is taken over all finitely discrete probability measures on X. This implies that condition (A) of theorem 11.18 holds for Fn and Fn. In addition, condition (2.19) implies that condition (B) of theorem 11.18 also holds with H(s, t) = 0 for all s, t ∈ K. It is not difficult to verify that all of the remaining conditions of theorem 11.18 also hold (we save this as an exercise), and thus Gn u_h^n ⇝ 0 in ℓ∞(K). This, of course, is the desired result.
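The √n behavior in theorem 2.13 can be checked by simulation for the least-absolute-deviation regression example mentioned above. The sketch below (an added illustration, not from the text) uses an iteratively reweighted least squares approximation to the LAD fit, a simple hypothetical stand-in for an exact linear programming solver.

```python
import numpy as np

rng = np.random.default_rng(2)

# Model: Y = Z'theta_0 + e with median(e) = 0; theta_hat maximizes
# M_n(theta) = -P_n |Y - Z'theta|.
def lad(z, y, iters=200):
    # IRLS approximation: weighted least squares with weights 1/|residual|.
    theta = np.linalg.lstsq(z, y, rcond=None)[0]
    for _ in range(iters):
        w = 1.0 / np.maximum(np.abs(y - z @ theta), 1e-8)
        zw = z * w[:, None]
        theta = np.linalg.solve(z.T @ zw, zw.T @ y)
    return theta

theta0 = np.array([1.0, -0.5])
for n in [200, 800, 3200]:
    err = []
    for _ in range(200):
        z = np.column_stack([np.ones(n), rng.standard_normal(n)])
        y = z @ theta0 + rng.standard_cauchy(n)   # heavy tails, median zero
        err.append(np.sqrt(n) * (lad(z, y) - theta0))
    # componentwise sd of sqrt(n)(theta_hat - theta_0): roughly stable in n
    print(n, np.std(np.array(err), axis=0))
```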

14.5 Non-Regular Examples

We now present two examples in detail that illustrate the techniques presented in this chapter for parameter estimation with non-regular rates of convergence. The first example considers a simple change-point model with three parameters, wherein two of the parameter estimates converge at the regular √n-rate while the third converges at the n-rate, i.e., it converges faster than √n. The second example is monotone density estimation based on the Grenander estimator, which is shown to yield convergence at the cube-root rate.

14.5.1 A Change-Point Model

For this model, we observe i.i.d. realizations of X = (Y, Z), where Y = α 1{Z ≤ ζ} + β 1{Z > ζ} + ε, Z and ε are independent, ε has a continuous distribution with Eε = 0 and σ² ≡ Eε² < ∞, γ ≡ (α, β) ∈ R², and ζ is known to lie in a bounded interval [a, b]. The unknown parameters can be collected as θ = (γ, ζ), and the subscript zero will be used to denote the true parameter values. We make the very important assumption that α0 ≠ β0 and also assume that Z has a strictly bounded and positive density f over [a, b] with P(Z < a) > 0 and P(Z > b) > 0. Our goal is to estimate θ through least squares. This is the same as maximizing Mn(θ) = Pn mθ, where

mθ(x) ≡ −( y − α 1{z ≤ ζ} − β 1{z > ζ} )².

Let θ̂n be maximizers of Mn(θ), where θ̂n ≡ (γ̂n, ζ̂n) and γ̂n ≡ (α̂n, β̂n). Since we are not assuming that γ is bounded, we first need to prove the existence of γ̂n, i.e., we need to prove that ‖γ̂n‖ = OP(1). We then need to prove consistency of all the parameters and then establish the rates of convergence for the parameters. Finally, we need to obtain the joint limiting distribution of the parameter estimates.

Existence. Note that the covariate Z and parameter ζ can be partitioned into four mutually exclusive sets: {Z ≤ ζ ∧ ζ0}, {ζ < Z ≤ ζ0}, {ζ0 < Z ≤ ζ}, and {Z > ζ ∨ ζ0}. Since also 1{Z < a} ≤ 1{Z ≤ ζ ∧ ζ0} and 1{Z > b} ≤ 1{Z > ζ ∨ ζ0} by assumption, we obtain

−Pn ε² = Mn(θ0) ≤ Mn(θ̂n) ≤ −Pn[ (ε − α̂n + α0)² 1{Z < a} ] − Pn[ (ε − β̂n + β0)² 1{Z > b} ].

By decomposing the squares, we now have

(α̂n − α0)² Pn[1{Z < a}] + (β̂n − β0)² Pn[1{Z > b}]
≤ Pn[ ε² 1{a ≤ Z ≤ b} ] + 2|α̂n − α0| |Pn[ε 1{Z < a}]| + 2|β̂n − β0| |Pn[ε 1{Z > b}]|
≤ OP(1) + oP(1) ‖γ̂n − γ0‖.

Since P(Z < a) > 0 and P(Z > b) > 0, the above now implies that ‖γ̂n − γ0‖² = OP(1 + ‖γ̂n − γ0‖), and hence that ‖γ̂n − γ0‖ = OP(1). Thus all the parameters are bounded in probability and therefore exist.

Consistency. Our approach to establishing consistency will be to utilize the argmax theorem (theorem 14.1). We first need to establish that Mn ⇝ M in ℓ∞(K) for all compact K ⊂ H ≡ R² × [a, b], where M(θ) ≡ P mθ. We then need to show that θ ↦ M(θ) is upper semicontinuous with a unique maximum at θ0. We already know from the previous paragraph that θ̂n is asymptotically tight (i.e., θ̂n = OP(1)). The argmax theorem will then yield that θ̂n ⇝ θ0, as desired.

Fix a compact K ⊂ H. We now verify that FK ≡ {mθ : θ ∈ K} is Glivenko-Cantelli. Note that

mθ(X) = −(ε − α + α0)² 1{Z ≤ ζ ∧ ζ0} − (ε − β + α0)² 1{ζ < Z ≤ ζ0} − (ε − α + β0)² 1{ζ0 < Z ≤ ζ} − (ε − β + β0)² 1{Z > ζ ∨ ζ0}.

It is not difficult to verify that {(ε − α + α0)² : θ ∈ K} and {1{Z ≤ ζ ∧ ζ0} : θ ∈ K} are separately Glivenko-Cantelli classes. Thus the product of the two classes is also Glivenko-Cantelli by corollary 9.26, since the product of the two envelopes is integrable. Similar arguments reveal that the remaining components of the sum are also Glivenko-Cantelli, and reapplication of corollary 9.26 yields that FK itself is Glivenko-Cantelli. Thus Mn ⇝ M in ℓ∞(K) for all compact K.

We now establish upper semicontinuity of θ ↦ M(θ) and uniqueness of the maximum. Using the decomposition of the sets for (Z, ζ) used in the Existence paragraph above, we have

M(θ) = −Pε² − (α − α0)² P(Z ≤ ζ ∧ ζ0) − (β − α0)² P(ζ < Z ≤ ζ0) − (α − β0)² P(ζ0 < Z ≤ ζ) − (β − β0)² P(Z > ζ ∨ ζ0) ≤ −Pε² = M(θ0).

Because Z has a bounded density on [a, b], we obtain that M is continuous. It is also clear that M has a unique maximum at θ0 because the density of Z is bounded below and α0 ≠ β0 (see exercise 14.6.5 below). Now the conditions of the argmax theorem are met, and the desired consistency follows.

Rate of convergence. We will utilize corollary 14.5 to obtain the convergence rates via the discrepancy function d̃(θ, θ0) ≡ ‖γ − γ0‖ + |ζ − ζ0|^{1/2}. Note that, because of the square root on the last term, d̃ is not induced by a norm on θ − θ0. Nevertheless, d̃(θ, θ0) → 0 if and only if ‖θ − θ0‖ → 0. Moreover, from the Consistency paragraph above, we have that

M(θ) − M(θ0) = −P{Z ≤ ζ ∧ ζ0}(α − α0)² − P{Z > ζ ∨ ζ0}(β − β0)² − P{ζ < Z ≤ ζ0}(β − α0)² − P{ζ0 < Z ≤ ζ}(α − β0)²
≤ −P{Z < a}(α − α0)² − P{Z > b}(β − β0)² − k1(1 − o(1))|ζ − ζ0|
≤ −(k1 ∧ δ1 − o(1)) d̃²(θ, θ0),

where the first inequality follows from the fact that the product of the density of Z and (α0 − β0)² is bounded below by some k1 > 0, and the second inequality follows from both P(Z < a) and P(Z > b) being bounded below by some δ1 > 0, combined with the fact that d̃²(θ, θ0) ≤ 2(‖γ − γ0‖² + |ζ − ζ0|). Thus M(θ) − M(θ0) ≲ −d̃²(θ, θ0) for all ‖θ − θ0‖ small enough, as desired.

Consider now the class of functions Mδ ≡ {mθ − mθ0 : d̃(θ, θ0) < δ}. Using previous calculations, we have

(14.7) mθ − mθ0 = 2(α − α0) ε 1{Z ≤ ζ ∧ ζ0} + 2(β − β0) ε 1{Z > ζ ∨ ζ0}
+ 2(β − α0) ε 1{ζ < Z ≤ ζ0} + 2(α − β0) ε 1{ζ0 < Z ≤ ζ}
− (α − α0)² 1{Z ≤ ζ ∧ ζ0} − (β − β0)² 1{Z > ζ ∨ ζ0}
− (β − α0)² 1{ζ < Z ≤ ζ0} − (α − β0)² 1{ζ0 < Z ≤ ζ}
≡ A1(θ) + A2(θ) + B1(θ) + B2(θ) − C1(θ) − C2(θ) − D1(θ) − D2(θ).

Consider first A1. Since {1{Z ≤ t} : t ∈ [a, b]} is a VC class, it is easy to compute that

E* sup_{d̃(θ,θ0)<δ} |Gn A1(θ)| ≲ δ,

as a consequence of lemma 8.17. Similar calculations apply to A2. Similar calculations also apply to C1 and C2, except that the upper bounds will be ≲ δ² instead of ≲ δ. Now we consider B1. An envelope for the class F = {B1(θ) : d̃(θ, θ0) < δ} is F = 2(|β0 − α0| + δ)|ε| 1{ζ0 − δ² < Z ≤ ζ0}, and it is not hard to verify that

(14.8) log N[]( η‖F‖_{P,2}, F, L2(P) ) ≲ log(1/η)

(see exercise 14.6.6). Now theorem 11.4 yields that

E* sup_{d̃(θ,θ0)<δ} |Gn B1(θ)| = E*‖Gn‖_F ≲ ‖F‖_{P,2} ≲ δ.
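The final bound ‖F‖_{P,2} ≲ δ in the last display is worth spelling out; it follows from the independence of ε and Z together with the boundedness of the density of Z, since

```latex
P F^2 \;=\; 4\bigl(|\beta_0-\alpha_0|+\delta\bigr)^2\,
E[\varepsilon^2]\;P\bigl(\zeta_0-\delta^2 < Z \le \zeta_0\bigr)
\;\lesssim\; \sigma^2\,\delta^2 ,
\qquad\text{so}\qquad \|F\|_{P,2}\lesssim\delta .
```

(This one-line verification is added here for completeness.)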

Similar calculations apply also to B2, D1 and D2. Combining all of these results with the fact that O(δ²) = O(δ) as δ → 0, we obtain E*‖Gn‖_{Mδ} ≲ δ. Now, when φ(δ) = δ, φ(δ)/δ^α is decreasing for any α ∈ (1, 2). Thus the conditions of corollary 14.5 are satisfied with φ(δ) = δ. Since rn² φ(1/rn) = rn, we obtain that √n d̃(θ̂n, θ0) = OP(1). By the form of d̃, this now implies that √n ‖γ̂n − γ0‖ = OP(1) and n|ζ̂n − ζ0| = OP(1).
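The n-rate for ζ̂n can be checked by simulation. The sketch below (an added illustration, not from the text) takes Z uniform on [0, 1], standard normal errors, and hypothetical values α0 = 0, β0 = 1, ζ0 = 1/2; it profiles the least squares criterion over the order statistics of Z (between order statistics the criterion does not change, as noted in the Weak convergence paragraph below) and monitors n|ζ̂n − ζ0|.

```python
import numpy as np

rng = np.random.default_rng(3)

alpha0, beta0, zeta0 = 0.0, 1.0, 0.5

def fit(n):
    z = np.sort(rng.uniform(0.0, 1.0, n))
    y = np.where(z <= zeta0, alpha0, beta0) + rng.standard_normal(n)
    # Profile over splits after the k-th order statistic using cumulative sums.
    k = np.arange(1, n)
    csum, tot = np.cumsum(y)[:-1], y.sum()
    a, b = csum / k, (tot - csum) / (n - k)     # gamma-hat given the split
    ysq = np.cumsum(y ** 2)
    sse = (ysq[:-1] - k * a ** 2) + (ysq[-1] - ysq[:-1] - (n - k) * b ** 2)
    j = np.argmin(sse)                          # least squares = min SSE
    return a[j], b[j], z[j]

for n in [200, 800, 3200]:
    zhat = np.array([fit(n)[2] for _ in range(200)])
    print(n, n * np.mean(np.abs(zhat - zeta0)))  # roughly stable: n-rate
```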

Weak convergence. We will utilize a minor modification of the argmax theorem and the rate result above to obtain the limiting distribution of ĥn = (√n(γ̂n − γ0), n(ζ̂n − ζ0)). From the rate result, we know that ĥn is uniformly tight and is the smallest argmax of h ↦ Qn(h) ≡ n Pn( m_{θn,h} − m_{θ0} ), where θn,h ≡ θ0 + (h1/√n, h2/√n, h3/n) and h ≡ (h1, h2, h3) ∈ R³ ≡ H. Note that we have qualified ĥn as being the smallest argmax, which is interpreted componentwise since H is three dimensional. This is because, if we hold (h1, h2) fixed, Mn(θn,h) does not vary in h3 over the interval [n(Z(j) − ζ0), n(Z(j+1) − ζ0)), for j = 1, ..., n, where Z(1), ..., Z(n) are the order statistics for Z1, ..., Zn, Z(0) ≡ −∞, and Z(n+1) ≡ ∞. Because P(Z < a) > 0, we only need to consider h3 at the values n(Z(j) − ζ0), j = 1, ..., n, provided n is large enough.

Let DK be the space of functions q : K ⊂ H → R that are continuous in the first two arguments (h1, h2) and right-continuous and piecewise constant in the third argument h3. For each q ∈ DK, let h3 ↦ Jq(h3) be the cadlag counting process with Jq(0−) = 0, jumps of size 1 at each jump point of q(·, ·, h3) for h3 ≥ 0, and with h3 ↦ Jq(−h3) also having jumps of size 1 (but left-continuous) at each jump point of q(·, ·, h3) for h3 < 0 (the left-continuity comes from the reversed time scale). Thus Jq(h3) is decreasing for h3 < 0 and increasing for h3 ≥ 0. For q1, q2 ∈ DK, define the distance dK(q1, q2) to be the sum of the uniform

distance ‖q1 − q2‖K and the Skorohod distance between Jq1 and Jq2. Now it is not difficult to see that the smallest argmax function is continuous on DK with respect to dK. We will argue that Qn(h) ⇝ Q(h) in (DK, dK), for some limiting process Q and for each compact K ⊂ H. By the continuous mapping theorem, the smallest argmax of the restriction of Qn to K will converge weakly to the smallest argmax of the restriction of Q to K. Since ĥn is uniformly tight, we obtain ĥn ⇝ ĥ, where ĥ is the smallest argmax of Q.

All that remains is to establish the specified weak convergence and to characterize Q. We first argue that Qn − Q̃n = oP^K(1) in (DK, dK) for each compact K ⊂ H, where

Q̃n(h) ≡ 2h1 √n Pn[ε 1{Z ≤ ζ0}] − h1² P(Z ≤ ζ0)
+ 2h2 √n Pn[ε 1{Z > ζ0}] − h2² P(Z > ζ0)
+ n Pn[ (−2(α0 − β0)ε − (α0 − β0)²) 1{ζ0 + h3/n < Z ≤ ζ0} ]
+ n Pn[ (2(α0 − β0)ε − (α0 − β0)²) 1{ζ0 < Z ≤ ζ0 + h3/n} ]

≡ Ãn(h1) + B̃n(h2) + C̃n(h3) + D̃n(h3).

The superscript K in oP^K(1) indicates that the error is in terms of dK. Fix a compact K ⊂ H. Note that by (14.7), Ãn(h1) = n Pn[ A1(θn,h) − C1(θn,h) − A1(θ0) + C1(θ0) ] + Ẽn(h), where

Ẽn(h) = 2h1 Gn[ ε( 1{Z ≤ (ζ0 + h3/n) ∧ ζ0} − 1{Z ≤ ζ0} ) ]
− h1² [ Pn 1{Z ≤ (ζ0 + h3/n) ∧ ζ0} − P(Z ≤ ζ0) ] → 0

in probability, as n → ∞, uniformly over h ∈ K. A similar analysis reveals the uniform equivalence of B̃n(h2) and n Pn[ A2(θn,h) − C2(θn,h) − A2(θ0) + C2(θ0) ]. It is fairly easy to see that C̃n(h3) and n Pn[ B1(θn,h) − D1(θn,h) − B1(θ0) + D1(θ0) ] are asymptotically uniformly equivalent in probability, as are also D̃n(h3) and n Pn[ B2(θn,h) − D2(θn,h) − B2(θ0) + D2(θ0) ]. Thus Qn − Q̃n goes to zero, in probability, uniformly over h ∈ K. Note that the potential jump points in h3 for Qn and Q̃n remain the same, and thus Qn − Q̃n = oP^K(1) as desired.

Lemma 14.6 below shows that Q̃n ⇝ Q ≡ 2h1 Z1 − h1² P(Z ≤ ζ0) + 2h2 Z2 − h2² P(Z > ζ0) + Q⁺(h3) + Q⁻(h3) in (DK, dK), where Z1, Z2, Q⁺ and Q⁻ are all independent, and Z1 and Z2 are mean zero Gaussian with respective variances σ² P(Z ≤ ζ0) and σ² P(Z > ζ0). Let s ↦ ν⁺(s) be a right-continuous homogeneous Poisson process on [0, ∞) with intensity parameter f(ζ0) (recall that f is the density of Z), and let s ↦ ν⁻(s) be another Poisson process, independent of ν⁺, on [−∞, 0), which is left-continuous and goes backward in time with intensity f(ζ0). Let (Vk⁺)_{k≥1} and (Vk⁻)_{k≥1} be independent sequences of i.i.d. random variables, with V1⁺ being a realization of 2(α0 − β0)ε − (α0 − β0)² and V1⁻ being

a realization of −2(α0 − β0)ε − (α0 − β0)². Also define V0⁺ = V0⁻ = 0 for convenience. Then h3 ↦ Q⁺(h3) ≡ 1{h3 > 0} Σ_{0≤k≤ν⁺(h3)} Vk⁺ and h3 ↦ Q⁻(h3) ≡ 1{h3 < 0} Σ_{0≤k≤ν⁻(h3)} Vk⁻.

Putting this all together, we conclude that ĥ = (ĥ1, ĥ2, ĥ3), where all three components are mutually independent, ĥ1 and ĥ2 are both mean zero Gaussian with respective variances σ²/P(Z ≤ ζ0) and σ²/P(Z > ζ0), and where ĥ3 is the smallest argmax of h3 ↦ Q⁺(h3) + Q⁻(h3). Note that the expected value of both V1⁺ and V1⁻ is −(α0 − β0)². Thus Q⁺ + Q⁻ will be zero at h3 = 0 and eventually always negative for all h3 far enough away from zero. This means that the smallest argmax of Q⁺ + Q⁻ will be bounded in probability, as desired.
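The limit process Q⁺ + Q⁻ can be simulated directly. The following sketch (an added illustration, not from the text) assumes, purely for concreteness, α0 − β0 = −1, f(ζ0) = 1 and standard normal errors, and approximates the smallest argmax over a grid, so that the endpoint conventions of the jump processes are only resolved up to the grid spacing.

```python
import numpy as np

rng = np.random.default_rng(4)

d, f0, T = -1.0, 1.0, 40.0            # d = alpha0 - beta0; window [-T, T]
grid = np.linspace(-T, T, 8001)       # includes h3 = 0, where Q = 0

def q_path():
    q = np.zeros_like(grid)
    for sign in (+1, -1):
        m = rng.poisson(f0 * T)                       # number of jumps
        locs = np.sort(rng.uniform(0.0, T, m))        # jump locations
        v = sign * 2.0 * d * rng.standard_normal(m) - d ** 2  # V_k^{+/-}
        mask = sign * grid > 0
        # number of jump points passed at each grid location
        k = np.searchsorted(locs, np.abs(grid[mask]), side="right")
        cum = np.concatenate([[0.0], np.cumsum(v)])
        q[mask] += cum[k]
    return q

h3 = np.empty(2000)
for b in range(h3.size):
    q = q_path()
    h3[b] = grid[np.flatnonzero(q == q.max())[0]]     # smallest grid argmax
print(np.quantile(h3, [0.025, 0.5, 0.975]))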

Lemma 14.6 For each compact K ⊂ H, Q̃n ⇝ Q in (DK, dK).

Proof. The details of the proof of weak convergence in the uniform norm follow very closely the proof of theorem 5 in Kosorok and Song (2006), and we omit the details. The required convergence of the jump locations follows from the assumed continuity of ε, which ensures that the jump sizes will never be tied, combined with the joint independence of ν⁺, ν⁻, (Vk⁺)_{k≥1} and (Vk⁻)_{k≥1}.

14.5.2 Monotone Density Estimation

This example is a special case of the cube root asymptotic results of Kim and Pollard (1990), which was analyzed in detail in section 3.2.14 of VW. Let X1, ..., Xn be a sample of size n from a Lebesgue density f on [0, ∞) that is known to be decreasing. The maximum likelihood estimator f̂n of f is the non-increasing step function equal to the left derivative of the least concave majorant of the empirical distribution function Fn. This f̂n is the celebrated Grenander estimator (Grenander, 1956). For a fixed value of t > 0, we will study the properties of f̂n(t) under the assumption that f is differentiable at t with derivative satisfying −∞ < f′(t) < 0.

Consistency.

Let F̂n denote the least concave majorant of Fn. In general, the least concave majorant of a function g is the smallest concave function h such that h ≥ g. One can construct F̂n by imagining a string tied at (x, y) = (0, 0) which is pulled tight over the top of the function graph (x, y = Fn(x)). The slope of each of the piecewise linear segments will be non-increasing, and the string (F̂n) will touch Fn at two or more points (xj, yj), j = 0, ..., k, where k ≥ 1, (x0, y0) ≡ (0, 0) and xk is the last observation in the sample. For all x > xk, we set F̂n(x) = 1. Note also that F̂n is continuous. We leave it as an exercise to verify that this algorithm does indeed produce the least concave majorant of Fn. The following lemma (Marshall's lemma) yields that F̂n is uniformly consistent for F. We save the proof as another exercise.

Lemma 14.7 (Marshall's lemma) Under the given conditions, sup_{t≥0} |F̂n(t) − F(t)| ≤ sup_{t≥0} |Fn(t) − F(t)|.
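The string construction just described translates directly into an upper convex hull (monotone chain) computation. The following sketch (an added illustration, not from the text) computes f̂n and evaluates it at a point:

```python
import numpy as np

rng = np.random.default_rng(5)

# Grenander estimator: left derivative of the least concave majorant (LCM)
# of the empirical distribution function F_n.
def grenander(x):
    x = np.sort(x)
    n = x.size
    px = np.concatenate([[0.0], x])        # knot candidates (0, 0), (x_(j), j/n)
    py = np.arange(n + 1) / n
    hull = [0]                             # indices of LCM touch points
    for j in range(1, n + 1):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            # drop i1 whenever it lies on or below the chord from i0 to j,
            # which keeps the piecewise-linear slopes non-increasing
            if (py[i1] - py[i0]) * (px[j] - px[i1]) <= (py[j] - py[i1]) * (px[i1] - px[i0]):
                hull.pop()
            else:
                break
        hull.append(j)
    kx, ky = px[hull], py[hull]
    return kx, np.diff(ky) / np.diff(kx)   # f-hat is the slope on (kx[i], kx[i+1]]

x = rng.exponential(size=500)              # decreasing true density f(t) = e^{-t}
kx, fhat = grenander(x)
t = 1.0
print(fhat[np.searchsorted(kx, t, side="left") - 1], np.exp(-t))
```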

Now fix 0 < δ < t. Since F̂n is concave, we have

( F̂n(t + δ) − F̂n(t) )/δ ≤ f̂n(t) ≤ ( F̂n(t) − F̂n(t − δ) )/δ.

By Marshall's lemma, the upper and lower bounds converge almost surely to δ⁻¹(F(t) − F(t − δ)) and δ⁻¹(F(t + δ) − F(t)), respectively. By the assumptions on F and the arbitrariness of δ, we obtain f̂n(t) → f(t) outer almost surely.

Rate of convergence. To determine the rate of convergence, we need to perform an interesting inverse transformation of the problem that will also be useful for obtaining the weak limiting distribution. Define the stochastic process {ŝn(a) : a > 0} by ŝn(a) ≡ argmax_{s≥0} {Fn(s) − as}, where the largest value is selected when multiple maximizers exist. The function ŝn is a sort of inverse of the function f̂n, in the sense that f̂n(t) ≤ a if and only if ŝn(a) ≤ t, for every t ≥ 0 and a > 0. To see this, first assume that f̂n(t) ≤ a. This means that the left derivative of F̂n is ≤ a at t. Hence a line of slope a which is moved down vertically from +∞ will first touch F̂n at a point s0 to the left of (or equal to) t. That point is also the point at which Fn is furthest away from the line s ↦ as passing through the origin. Thus s0 = argmax_{s≥0} {Fn(s) − as}, and hence ŝn(a) ≤ t. Now suppose ŝn(a) ≤ t. Then the argument can be taken in reverse to see that the slope of the line that touches F̂n at ŝn(a) is less than or equal to the left derivative of F̂n at t, and thus f̂n(t) ≤ a. Hence,

(14.9) P( n^{1/3}(f̂n(t) − f(t)) ≤ x ) = P( ŝn(f(t) + x n^{-1/3}) ≤ t ),

and the desired rate and weak convergence result can be deduced from the argmax values of x ↦ ŝn(f(t) + x n^{-1/3}). Applying the change of variable s ↦ t + g in the definition of ŝn, we obtain

ŝn(f(t) + x n^{-1/3}) − t = argmax_{g>−t} { Fn(t + g) − (f(t) + x n^{-1/3})(t + g) }.

In this manner, the probability on the left side of (14.9) is precisely P(ĝn ≤ 0), where ĝn is the argmax above. Now, by the previous argmax expression combined with the fact that the location of the maximum of a function does not change when the function is shifted vertically, we have ĝn = argmax_{g>−t} { Mn(g) ≡ Fn(t + g) −

Fn(t) − f(t)g − x g n^{-1/3} }. It is not hard to see that ĝn = OP(1) and that Mn(g) →P M(g) ≡ F(t + g) − F(t) − f(t)g uniformly on compacts, and thus ĝn = oP(1). We now utilize theorem 14.4 to obtain the rate for ĝn, with the metric d(θ1, θ2) = |θ1 − θ2|, θ = g, θ0 = 0 and d̃ = d. Note that the fact that Mn(0) = M(0) = 0 will simplify the calculations. It is now easy to see that M(g) ≲ −g², and, by using theorem 11.2, that

E* sup_{|g|<δ} √n |Mn(g) − M(g)| ≤ E* sup_{|g|<δ} |Gn( 1{X ≤ t + g} − 1{X ≤ t} )| + O(√n δ n^{-1/3})
≲ φn(δ) ≡ δ^{1/2} + √n δ n^{-1/3}.

Clearly, φn(δ)/δ^α is decreasing in δ for α = 3/2. Since n^{2/3} φn(n^{-1/3}) = n^{1/2} + n^{1/2} = O(n^{1/2}), theorem 14.4 yields n^{1/3} ĝn = OP(1). We show in the next section how this enables weak convergence of n^{1/3}(f̂n(t) − f(t)).

Weak convergence. Let ĥn = n^{1/3} ĝn, and note that, since the location of the maximum of a function does not change when the function is multiplied by a positive constant, we have that ĥn is the argmax of the process

(14.10) h ↦ n^{2/3} Mn(n^{-1/3} h)
= n^{2/3} (Pn − P)( 1{X ≤ t + h n^{-1/3}} − 1{X ≤ t} )
+ n^{2/3} ( F(t + h n^{-1/3}) − F(t) − f(t) h n^{-1/3} ) − x h.

Fix 0 < K < ∞. The first term on the right side of (14.10) converges weakly in ℓ∞([−K, K]) to √f(t) Z(h), where Z is a standard two-sided Brownian motion with Z(0) = 0, while the second term converges uniformly over [−K, K] to (1/2) f′(t) h². Since ĥn = OP(1), the argmax theorem now yields ĥn ⇝ ĥ ≡ argmax_h { √f(t) Z(h) + (1/2) f′(t) h² − x h }. Combining exercise 14.6.9 below (with a = √f(t), b = |f′(t)|/2 and c = x) with (14.9) and the identity P(ĝn ≤ 0) = P(ĥn ≤ 0), we obtain

n^{1/3}( f̂n(t) − f(t) ) ⇝ |4 f′(t) f(t)|^{1/3} C,

where the random variable C ≡ argmax_h { Z(h) − h² } has Chernoff's distribution (see Groeneboom, 1989).

14.6 Exercises

14.6.1. Show that for a sequence Xn of measurable, Euclidean random variables which are finite almost surely, measurability plus asymptotic tightness implies uniform tightness.

14.6.2. For a metric space (D, d), let H : D → [0, ∞] be a function such that H(x0) = 0 for a point x0 ∈ D and H(xn) → 0 implies d(xn, x0) → 0 for any sequence {xn} ∈ D. Show that there exists a non-decreasing cadlag function f : [0, ∞] → [0, ∞] that satisfies both f(0) = 0 and d(x, x0) ≤ f(|H(x)|) for all x ∈ D. Hint: Use the fact that the given conditions on H imply the existence of a decreasing sequence 0 < τn ↓ 0 such that H(x) < τn implies d(x, x0) < 1/n, and note that it is permissible to have f(u) = ∞ for all u ≥ τ1.

14.6.3. In the proof of theorem 14.4, verify that for fixed c < ∞ and α < 2,

Σ_{j≥M} 2^{jα} / (2^{2j} − c) → 0,

as M → ∞.

14.6.4. In the context of the last paragraph of the proof of theorem 2.13, given in section 14.4, complete the verification of the conditions of theorem 11.18.

14.6.5. Consider the function θ ↦ M(θ) defined in section 14.5.1. Show that it has a unique maximum over R² × [a, b]. Also show that the maximum is not unique if α0 = β0.

14.6.6. Verify (14.8).

14.6.7. Verify that the algorithm described in the second paragraph of section 14.5.2 does indeed generate the least concave majorant of Fn.

14.6.8. The goal of this exercise is to prove Marshall's lemma given in section 14.5.2. Denoting An(t) ≡ F̂n(t) − F(t) and Bn(t) ≡ Fn(t) − F(t), the proof can be broken into the following steps:

(a) Show that 0 ≥ inf_{t≥0} An(t) ≥ inf_{t≥0} Bn(t).

(b) Show that:

i. sup_{t≥0} An(t) ≥ 0 and sup_{t≥0} Bn(t) ≥ 0.

ii. If sup_{t≥0} Bn(t) = 0, then sup_{t≥0} An(t) = 0.

iii. If sup_{t≥0} Bn(t) > 0, then sup_{t≥0} An(t) ≤ sup_{t≥0} Bn(t) (this last step is tricky).

Now verify that 0 ≤ sup_{t≥0} An(t) ≤ sup_{t≥0} Bn(t).

(c) Now complete the proof.

14.6.9. Let {Z(h) : h ∈ R} be a standard two-sided Brownian motion with Z(0) = 0. (The process is zero-mean Gaussian and the increment Z(g) − Z(h) has variance |g − h|.) Then argmax_h { a Z(h) − b h² − c h } is equal in distribution to (a/b)^{2/3} argmax_g { Z(g) − g² } − c/(2b), where a, b, c > 0. Hint: The process h ↦ Z(σh − μ) is equal in distribution to the process h ↦ √σ Z(h − μ/σ), where σ > 0 and μ ∈ R. Apply the change of variable h = (a/b)^{2/3} g − c/(2b), and note that the location of a maximum does not change by multiplication by a positive constant or a vertical shift.

14.7 Notes

Theorem 14.1 and lemma 14.2 are theorem 3.2.2 and lemma 3.2.1, respectively, of VW, while theorem 14.4 and corollary 14.5 are modified versions of theorem 3.2.5 and corollary 3.2.6 of VW. The monotone density estimation example in section 14.5.2 is a variation of example 3.2.14 of VW. The limiting behavior of the Grenander estimator of this example was obtained by Prakasa Rao (1969). Exercise 14.6.8 is an expanded version of exercise 24.5 of van der Vaart (1998), and exercise 14.6.9 is an expanded version of exercise 3.2.5 of VW.

References

Andersen, P. K., Borgan, Ø., Gill, R. D., and Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer-Verlag, New York.

Andrews, D. W. K. (1991). An empirical process central limit theorem for dependent non-identically distributed random variables. Journal of Multivariate Analysis, 38:187–203.

Arcones, M. A. and Yu, B. (1994). Central limit theorems for empirical and U-processes of stationary mixing sequences. Journal of Theoretical Probability, 7:47–71.

Bassett, Jr., G. and Koenker, R. (1978). Asymptotic theory of least absolute error regression. Journal of the American Statistical Association, 73:618–622.

Bickel, P. J., Klaassen, C. A., Ritov, Y., and Wellner, J. A. (1997). Efficient and Adaptive Estimation for Semiparametric Models. Springer-Verlag, New York.

Bickel, P. J. and Rosenblatt, M. (1973). On some global measures of the deviations of density function estimates. Annals of Statistics, 1:1071–1095.

Bilias, Y., Gu, M., and Ying, Z. (1997). Towards a general theory for Cox model with staggered entry. Annals of Statistics, 25:662–682.

Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.

Billingsley, P. (1986). Probability and Measure. Wiley, New York, second edition.

Bradley, R. C. (1986). Basic properties of strong mixing conditions. In Eberlein, E. and Taqqu, M. S., editors, Dependence in Probability and Statistics: A Survey of Recent Results, pages 165–192. Birkhäuser, Basel.

Bühlmann, P. (1995). The blockwise bootstrap for general empirical processes of stationary sequences. Stochastic Processes and Their Applications, 58:247–265.

Cantelli, F. P. (1933). Sulla determinazione empirica delle leggi di probabilità. Giornale dell'Istituto Italiano degli Attuari, 4:421–424.

Chang, M. N. (1990). Weak convergence of a self-consistent estimator of the survival function with doubly censored data. Annals of Statistics, 18:391–404.

Cox, D. R. (1975). Partial likelihood. Biometrika, 62:269–276.

Dedecker, J. and Louhichi, S. (2002). Maximal inequalities and empirical central limit theorems. In Dehling, H., Mikosch, T., and Sørensen, M., editors, Empirical Process Techniques for Dependent Data, pages 137–159. Birkhäuser, Boston.

Dehling, H., Mikosch, T., and Sørensen, M., editors (2002). Empirical Process Techniques for Dependent Data. Birkhäuser, Boston.

Dehling, H. and Philipp, W. (2002). Empirical process techniques for dependent data. In Dehling, H., Mikosch, T., and Sørensen, M., editors, Empirical Process Techniques for Dependent Data, pages 3–113. Birkhäuser, Boston.

Dehling, H. and Taqqu, M. (1989). The empirical process of some long-range dependent sequences with an application to U-statistics. Annals of Statistics, 17:1767–1783.

Donsker, M. D. (1952). Justification and extension of Doob's heuristic approach to the Kolmogorov-Smirnov theorems. Annals of Mathematical Statistics, 23:277–281.

Dudley, R. M. and Philipp, W. (1983). Invariance principles for sums of Banach space valued random elements and empirical processes. Probability Theory and Related Fields, 62:509–552.

Dugundji, J. (1951). An extension of Tietze's theorem. Pacific Journal of Mathematics, 1:353–367.

Dvoretzky, A., Kiefer, J., and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Annals of Mathematical Statistics, 27:642–669.

Eberlein, E. and Taqqu, M. S., editors (1986). Dependence in Probability and Statistics: A Survey of Recent Results. Birkhäuser, Basel.

Efron, B. (1967). The two sample problem with censored data. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 4:831–855.

Fine, J. P., Yan, J., and Kosorok, M. R. (2004). Temporal process regression. Biometrika, 91:683–703.

Fleming, T. R. and Harrington, D. P. (1991). Counting Processes and Survival Analysis. Wiley, New York.

Glivenko, V. (1933). Sulla determinazione empirica delle leggi di probabilità. Giornale dell'Istituto Italiano degli Attuari, 4:92–99.

Godambe, V. P. (1960). An optimum property of regular maximum likelihood estimation. Annals of Mathematical Statistics, 31:1208–1211.

Grenander, U. (1956). On the theory of mortality measurement, Part II. Skandinavisk Aktuarietidskrift, 39:125–153.

Groeneboom, P. (1989). Brownian motion with a parabolic drift and Airy functions. Probability Theory and Related Fields, 81:79–109.

Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer-Verlag, New York.

Huber, P. J. (1981). Robust Statistics. Wiley, New York.

Jameson, J. O. (1974). Topology and Normed Spaces. Chapman and Hall, London.

Johnson, W. B., Lindenstrauss, J., and Schechtman, G. (1986). Extensions of Lipschitz maps into Banach spaces. Israel Journal of Mathematics, 54:129–138.

Karatzas, I. and Shreve, S. E. (1991). Brownian Motion and Stochastic Calculus. Springer, New York, second edition.

Kim, J. and Pollard, D. (1990). Cube root asymptotics. Annals of Statistics, 18:191–219.

Komlós, J., Major, P., and Tusnády, G. (1975). An approximation of partial sums of independent RV's and the sample DF. I. Probability Theory and Related Fields, 32:111–131.

Komlós, J., Major, P., and Tusnády, G. (1976). An approximation of partial sums of independent RV's and the sample DF. II. Probability Theory and Related Fields, 34:33–58.

Kosorok, M. R. (1999). Two-sample quantile tests under general conditions. Biometrika, 86:909–921.

Kosorok, M. R. (2003). Bootstraps of sums of independent but not identically distributed stochastic processes. Journal of Multivariate Analysis, 84:299–318.

Kosorok, M. R. and Song, R. (2006). Inference under right censoring for transformation models with a change-point based on a covariate threshold. Annals of Statistics, to appear.

Kress, R. (1999). Linear Integral Equations. Springer, New York, second edition.

Künsch, H. R. (1989). The jackknife and the bootstrap for general stationary observations. Annals of Statistics, 17:1217–1241.

Liu, R. Y. and Singh, K. (1992). Moving blocks jackknife and bootstrap capture weak dependence. In LePage, R. and Billard, L., editors, Exploring the Limits of Bootstrap, pages 225–248. Wiley, New York.

Ma, S. and Kosorok, M. R. (2005). Robust semiparametric M-estimation and the weighted bootstrap. Journal of Multivariate Analysis, 96:190–217.

Mammen, E. and van de Geer, S. (1997). Penalized quasi-likelihood estimation in partial linear models. Annals of Statistics, 25:1014–1035.

Massart, P. (1990). The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. Annals of Probability, 18:1269–1283.

Megginson, R. E. (1998). An Introduction to Banach Space Theory. Springer, New York.

Murphy, S. A. and van der Vaart, A. W. (2000). On profile likelihood. With comments and a rejoinder by the authors. Journal of the American Statistical Association, 95:449–485.

Naik-Nimbalkar, U. V. and Rajarshi, M. B. (1994). Validity of blockwise bootstrap for empirical processes with stationary observations. Annals of Statistics, 22:980–994.

Peligrad, M. (1998). On the blockwise bootstrap for empirical processes for stationary sequences. Annals of Probability, 26:877–901.

Politis, D. N. and Romano, J. P. (1994). Large sample confidence regions based on subsampling under minimal assumptions. Annals of Statistics, 22:2031–2050.

Pollard, D. (1990). Empirical Processes: Theory and Applications. NSF-CBMS Regional Conference Series in Probability and Statistics 2. Institute of Mathematical Statistics and American Statistical Association, Hayward, California.

Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory, 7:186–199.

Polonik, W. (1995). Measuring mass concentrations and estimating density contour clusters: an excess mass approach. Annals of Statistics, 23:855–881.

Praestgaard, J. and Wellner, J. A. (1993). Exchangeably weighted bootstraps of the general empirical process. Annals of Probability, 21:2053–2086.

Radulović, D. (1996). The bootstrap for empirical processes based on stationary observations. Stochastic Processes and Their Applications, 65:259–279.

Rao, B. L. S. P. (1969). Estimation of a unimodal density. Sankhya, Series A, 31:23–36.

Rubin, D. B. (1981). The Bayesian bootstrap. Annals of Statistics, 9:130–134.

Rudin, W. (1987). Real and Complex Analysis. McGraw-Hill, Inc., New York, third edition.

Rudin, W. (1991). Functional Analysis. McGraw-Hill, Inc., New York, second edition.

Shao, J. and Tu, D. (1995). The Jackknife and Bootstrap. Springer-Verlag, New York.

Strassen, V. (1964). An invariance principle for the law of the iterated logarithm. Probability Theory and Related Fields, 3:211–226.

van de Geer, S. A. (2000). Empirical Processes in M-Estimation. Cambridge University Press, New York.

van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge University Press, New York.

van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications in Statistics. Springer-Verlag, New York.

Wei, L. J. (1978). The adaptive biased coin design for sequential experiments. Annals of Statistics, 6:92–100.

Wu, W. B. (2003). Empirical processes of long-memory sequences. Bernoulli, 9:809–831.

Yu, B. (1994). Rates of convergence for empirical processes of stationary mixing sequences. Annals of Probability, 22:94–116.

List of Symbols

B^*: adjoint of the operator B, 39
B: standard Brownian bridge process, 11
BL_1: space of functions with Lipschitz norm bounded by 1, 19
C[a, b]: continuous, real functions on [a, b], 23
C_b(D): space of bounded, continuous maps f : D → R, 14
conv F: convex hull of a class F, 152
\overline{conv} F: closed convex hull of a class F, 152
sconv F: symmetric convex hull of a class F, 152
\overline{sconv} F: symmetric closed convex hull of a class F, 152
cov[X, Y]: covariance of X and Y, 11
D[a, b]: space of cadlag functions on [a, b], 22
δB: boundary of the set B, 104
δ_x: point mass at x or Dirac measure, 10
E_*: inner expectation, 14
E^*: outer expectation, 14
F: distribution function, 9
F_n: empirical distribution function, 10
F, G: collections of functions, 10, 19
G: general Brownian bridge, 11
G_n: empirical process, 11
G_n: standardized empirical distribution function, 11
1{A}: indicator of A, 4
⟨·, ·⟩: inner product, 41
J_[]: bracketing integral, 17
ℓ^∞(T): set of all uniformly bounded real functions on T, 11
L_r: equivalence classes of r-integrable functions, 16
L_2^0(P): mean zero subspace of L_2(P), 35
lin H: linear span of H, 39
\overline{lin} H: closed linear span of H, 39
↦: function specifier ("maps to"), 11
N_[]: bracketing number, 16
‖·‖_{2,1}: special variant of the L_2 norm, 20
‖·‖_{r,P}: L_r(P) norm, 16
‖·‖_∞: uniform norm, 19
‖·‖_ψ: Orlicz ψ-norm, 124
Ω, Ω_n: sample spaces, 14, 103
P, Q: probability measures on the underlying sample space X, 10, 17
P: collection of probability measures, 33
P_n: empirical measure, 10
P_n°: symmetrized empirical process, 134
Ṗ_P: tangent set, 35
P_*: inner probability, 14
P^*: outer probability, 14
ψ_p: the function x ↦ ψ_p(x) = e^{x^p} − 1, 125
R: real numbers, 3
ρ: semimetric on a set T, 15
T: index set for a stochastic process, 9
T_*: maximal measurable minorant of T, 14
T^*: minimal measurable majorant of T, 14
UC(T, ρ): set of uniformly continuous functions from (T, ρ) to R, 15
V(C), V(F): VC-index of a set C or function class F, 150, 151
var[X]: variance of X, 15
X: sample space, 10
(X, O): topological space, 78
a ∨ b: maximum of a and b, 19
a ∧ b: minimum of a and b, 6
→^{as}: convergence almost surely, 10
→^{as*}: convergence outer almost surely, 14
→^P: convergence in probability, 14
⇝: weak convergence, 11
⇝_M^P: conditional convergence of bootstrap in probability, 19
⇝_M^{as*}: conditional convergence of bootstrap outer almost surely, 20