Invariance Principles via Studentization in Linear Structural and Functional Error-in-Variables Models

by

Yuliya V. Martsynyuk, B.Sc., M.Sc.

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

School of Mathematics and Statistics and Ottawa-Carleton Institute for Mathematics and Statistics
Carleton University
Ottawa, Ontario, Canada
August 2005

(c) Copyright 2005 - Yuliya V. Martsynyuk

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Library and Archives Canada
Published Heritage Branch
395 Wellington Street
Ottawa ON K1A 0N4
Canada

ISBN: 978-0-494-33479-9

NOTICE: The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats.

The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis. While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.

Abstract

This dissertation deals with asymptotic methods in probability and statistics, and some of their applications. Linear structural and functional error-in-variables models (SEIVM's and FEIVM's) with univariate observations, and without equation error, are considered. Depending on which one of the major identifiability conditions is assumed, well-known weighted or modified least squares estimators for the slope and intercept, as well as certain estimators for the unknown error variances, are studied under some distribution-free assumptions on the error terms and new conditions on the explanatory variables that are the most general considered so far. For all these estimators, corresponding processes are introduced that are believed to be new objects of study for SEIVM's and FEIVM's. In this thesis, using the approach of Studentization and self-normalization, various invariance principles are established for these estimators and their corresponding processes that are defined in D[0,1]-space. The consistency and CLT results, which imply and improve on those available in the literature, and also the obtained marginal weak invariance principles and sup-norm approximations in probability, are believed to be new, first time around results. Due to their Studentized forms, all the established invariance principles are invariant in form within the introduced classes of the explanatory and error variables, and free of any unknown parameters associated with these variables. This, in turn, readily facilitates their complete data-based versions. In particular, the latter resolves the long-standing issue in the context of SEIVM's and FEIVM's that CLT's in the literature to date are dependent on various unknown parameters that are contained in the covariance matrices of the limiting normal distributions and are, generally speaking, hard to estimate from the data. Some potential applications of the obtained results are described.

The thesis was initially influenced and inspired by some recent new trends of research in probability and statistics at the Laboratory for Research in Statistics and Probability at Carleton University. As a consequence, all the results for SEIVM's in Chapter 1 and FEIVM's in Chapter 2 stem from an interaction of invariance principles via Studentization and related topics with error-in-variables models. The results for FEIVM's in Chapter 2 are also on account of an interplay between the SEIVM's and FEIVM's in hand.


Acknowledgements

It has been a long awaited opportunity to express my deep heartfelt gratitude to my best ever Teacher in life and mathematics, my thesis supervisor, Professor Emeritus Miklos Csorgo, Distinguished Research Professor of Mathematics and Statistics at Carleton University. Above all the things I am grateful to you for, Thank You for believing in me right from the very beginning, for your moral support, patience, unselfishness and interest in taking this long journey with me. Thank you for sharing your mathematical wisdom and experience with me, which not only provided invaluable supervision of this thesis, but also enriched me substantially in a general mathematical-cultural way. Thank you for your sincere interest in my work and for our numerous discussions that have been a constant source of my inspiration throughout the preparation of this thesis and have helped me considerably. I am truly grateful to life for the unique and invaluable experience of being your student.

I am thankful to the School of Mathematics and Statistics, the Faculty of Graduate Studies and Research of Carleton University and to my supervisor for their continuous financial support in the past five years.

Special thanks are due to Gillian Murray, the professional and diligent coordinator, true heart and sweetheart of the Laboratory for Research in Statistics and Probability (LRSP). Gillian, thank you sincerely for all the technical supervision and help with the numerous versions of my various research manuscripts, for your help and attentive attitude during my first days in Canada, and simply for being a good supportive friend to me. Your lovely personality makes the LRSP a special


place for all of us.

My experience as a graduate student at Carleton University would not be the same without being a natural part of the LRSP. This mini-institution with multiple functions, created in part by Professor Miklos Csorgo, Co-Director of the LRSP, and continuously run by its irreplaceable coordinator Gillian Murray, has been a unique integral research environment for me.

From the bottom of my heart, I would like to thank Anna and Bordy Semchyshyn for creating a home atmosphere in their home for me, and for having been like a family to me. Not only has this made numerous peaceful and productive days of work on my thesis at home possible, but it has also contributed to my whole well-being in these years.

Finally, my deep heartfelt gratitude goes to my family. If not for the trust and invaluable investments of my wonderful parents in me, blended into their unconditional love and constant support, I would not have gone this far.


Contents

Abstract ii

Acknowledgements iv

Contents vi

List of Notations, Definitions and Abbreviations viii

Introduction 1

1 Invariance Principles via Studentization in Linear Structural Error-in-Variables Models 9
1.1 Introduction, Main Results and Applications ...... 9
1.1.1 Model and Assumptions ...... 9
1.1.2 Introduction of Estimators and Processes under Study ...... 11
1.1.3 Prelude to Main Results ...... 17
1.1.4 Main Results with Remarks and Observations ...... 24
1.1.5 A Note on Some Applications ...... 50
1.2 Auxiliary Results and Proofs ...... 61
1.2.1 Basics on DAN, Self-Normalization and Studentization with Some Complements ...... 62
1.2.2 Auxiliary Results and Proofs of Theorems 1.1.1-1.1.3, Propositions 1.1.1, 1.1.2 and Observation 1.1.1 ...... 70
1.2.3 Survey on Major Results on GDAN and Studentization with New Characterizations ...... 103
1.2.4 Auxiliary Results and Proofs of Theorems 1.1.4, 1.1.5 and Observation 1.1.2 ...... 116
1.2.5 Appendix ...... 134

2 Invariance Principles via Studentization in Linear Functional Error-in-Variables Models 141
2.1 Introduction, Main Results and Applications ...... 141
2.1.1 Model and Assumptions ...... 141
2.1.2 Introduction of Estimators and Processes under Study ...... 143
2.1.3 Prelude to Main Results ...... 151
2.1.4 Main Results with Remarks ...... 160
2.1.5 A Note on Some Applications ...... 181
2.2 Auxiliary Results and Proofs ...... 185
2.2.1 Invariance Principles via Studentization for Independent Non-identically Distributed Random Variables with Two Moments ...... 186
2.2.2 Auxiliary Results and Proofs of Theorems 2.1.1-2.1.3 ...... 192
2.2.3 Appendix ...... 241

Epilogue 248

Bibliography 250


List of Notations, Definitions and Abbreviations

ū  = (1/n) Σ_{i=1}^n u_i, for {u_i ∈ ℝ^d, 1 ≤ i ≤ n}
s_{i,uv}  = u_i v_i, if α₀ = 0; = (u_i − ū)(v_i − v̄), if α₀ ≠ 0, 1 ≤ i ≤ n
S_{uv}  = (1/n) Σ_{i=1}^n s_{i,uv}, for {(u_i, v_i) ∈ ℝ², 1 ≤ i ≤ n}
ū_t  = (1/n) Σ_{i=1}^{[nt]} u_i, if t ∈ [1/n, 1]; = 0, if t ∈ [0, 1/n), for {u_i ∈ ℝ^d, 1 ≤ i ≤ n}
S_{uv,t}  = (1/n) Σ_{i=1}^{[nt]} s_{i,uv}, if t ∈ [1/n, 1]; = 0, if t ∈ [0, 1/n), for {(u_i, v_i) ∈ ℝ², 1 ≤ i ≤ n}
1_A  indicator of set A
sign(·)  sign of a real-valued variable
cov(·,·)  covariance of two r.v.'s
Cov  covariance matrix of a random vector
const  absolute constant
log  logarithm with natural base
o(1)  numerical sequence converging to zero
O(1)  bounded numerical sequence
→_P  convergence in probability P, the measure generated by model (1.1.1)-(1.1.2) or (2.1.1)-(2.1.2); for a random matrix B_n = (b_n^{ij})_{i,j} and a matrix B = (b^{ij})_{i,j}, B_n →_P B means b_n^{ij} →_P b^{ij} for all i, j
o_P(1)  sequence of r.v.'s that converges to zero in probability P
O_P(1)  sequence of r.v.'s that is bounded in probability P
a.s. convergence  convergence almost surely (a.s. convergence)
X = Y a.s.  P(X = Y) = 1, for r.v.'s X and Y
o(1) a.s.  sequence of r.v.'s that converges to zero almost surely
→_d  convergence in distribution
X =_d Y  r.v.'s X and Y are equal in distribution
W(t) or {W(t), 0 ≤ t < ∞}  standard real-valued Wiener process or Brownian motion
W_d(t)  ℝ^d-valued standard Wiener process, i.e., {W_d(t) = (W^{(1)}(t), ..., W^{(d)}(t)), 0 ≤ t < ∞}, where W^{(1)}(t), ..., W^{(d)}(t) are i.i.d. standard real-valued Wiener processes
(D[0,1], ρ)  space of real-valued D[0,1]-functions with the sup-norm metric ρ
C([0,1], ℝ^d)  space of ℝ^d-valued C[0,1]-functions
(C([0,1], ℝ^d), ρ)  C([0,1], ℝ^d)-space endowed with the sup-norm metric ρ
‖·‖  Euclidean norm of a vector
⟨·,·⟩  Euclidean inner product of two vectors
Z^{(j)}  the jth component of vector Z
Z^{(k,...,k+l)}  = (Z^{(k)}, ..., Z^{(k+l)}), a subvector of Z ∈ ℝ^d that has all the components of vector Z ∈ ℝ^d starting with Z^{(k)} and ending with Z^{(k+l)}
linearly independent vectors  nonrandom vectors Z_1, Z_2, ..., Z_k in ℝ^d, k ≥ 2, such that the relationship c_1 Z_1 + c_2 Z_2 + ... + c_k Z_k = 0 implies that the constants c_1, c_2, ..., c_k are all zero
full vector  random vector Z such that ⟨Z, u⟩ is a nondegenerate random variable for all deterministic vectors u with ‖u‖ = 1
full distribution  distribution of a full vector
spherically symmetric vector  random vector Z such that all ⟨Z, u⟩ with nonrandom vectors u, ‖u‖ = 1, have the same distribution
spherically symmetric distribution  distribution of a spherically symmetric vector
det(·)  determinant of a matrix
tr(·)  trace of a matrix
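As a small numerical illustration of the partial-sum notation above (added for this list, not part of the thesis), the following sketch computes ū_t for a toy sample; the function name ubar_t is ours.

```python
import numpy as np

# ubar_t averages the first [nt] of the u_i, and is 0 for t < 1/n,
# matching the piecewise definition of the partial-sum process above.
def ubar_t(u, t):
    """(1/n) * sum_{i=1}^{[nt]} u_i for t in [1/n, 1], and 0 for t in [0, 1/n)."""
    n = len(u)
    k = int(np.floor(n * t))
    return u[:k].sum() / n if k >= 1 else 0.0

u = np.array([1.0, 2.0, 3.0, 4.0])
assert ubar_t(u, 1.0) == 2.5   # at t = 1 this is the full sample mean ubar
assert ubar_t(u, 0.5) == 0.75  # (1 + 2)/4
assert ubar_t(u, 0.1) == 0.0   # t < 1/n gives 0
```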


I_d  unit d × d matrix
diag(·)  block-diagonal matrix with the listed square matrix blocks on its diagonal
A > 0  matrix A is positive definite
A ≥ 0  matrix A is positive semidefinite
A^T  transpose of vector/matrix A
A^{Γ/2}  the (left) Cholesky square root of matrix A, i.e., the uniquely existing lower triangular matrix with positive diagonal elements such that A^{Γ/2}(A^{Γ/2})^T = A, for A > 0
A^{−Γ/2}  = (A^{Γ/2})^{−1}
A^{1/2}  the symmetric positive definite square root of matrix A; for A > 0, A^{1/2} exists and satisfies (A^{1/2})^2 = A
A^{−1/2}  = (A^{1/2})^{−1}
A^{Γ/2}, A^{1/2}  the Cholesky and symmetric positive definite square roots of matrix A
A^{T/2}  = (A^{1/2})^T
A^{−T/2}  = (A^{−1/2})^T
λ_min(·)  minimum eigenvalue of a matrix
λ_max(·)  maximum eigenvalue of a matrix
DAN  domain of attraction of the (univariate) normal law, i.e., the collection of sequences of i.i.d.r.v.'s {Z, Z_i, i ≥ 1} for which there are constants a_n and b_n, b_n > 0, such that (Σ_{i=1}^n Z_i − a_n)/b_n →_d N(0, 1), n → ∞
GDAN  generalized domain of attraction of the multivariate normal law, i.e., all sequences {Z, Z_i, i ≥ 1} of i.i.d. random vectors in ℝ^d for which there exist nonstochastic sequences of vectors a_n and d × d matrices B_n such that (Σ_{i=1}^n Z_i − a_n)B_n^T →_d N(0, I_d), n → ∞
LSA CI  large-sample approximate confidence interval
z_{α/2}  the 100(1 − α/2)th percentile of the standard normal distribution
t_{n,α/2}  the 100(1 − α/2)th percentile of the Student t-distribution with n degrees of freedom
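As a quick numerical sanity check on the two matrix square roots in this list (an illustration added here, not part of the thesis), the following numpy sketch contrasts the Cholesky root with the symmetric positive definite root for a small positive definite matrix.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])   # a positive definite matrix

# Cholesky square root: lower triangular with positive diagonal,
# satisfying L @ L.T == A.
L = np.linalg.cholesky(A)

# Symmetric positive definite square root: via the spectral
# decomposition A = Q diag(lam) Q^T, take S = Q diag(sqrt(lam)) Q^T.
lam, Q = np.linalg.eigh(A)
S = Q @ np.diag(np.sqrt(lam)) @ Q.T

assert np.allclose(L @ L.T, A)   # Cholesky root reproduces A
assert np.allclose(S @ S, A)     # symmetric root squares to A
assert np.allclose(S, S.T)       # and is symmetric
assert not np.allclose(L, S)     # the two roots generally differ
```

Both roots "undo" to the same A, but only the symmetric one is its own transpose; this is why the list keeps separate symbols for them.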


EIVM  error-in-variables model
SEIVM  structural EIVM
FEIVM  functional EIVM
i.i.d.  independent identically distributed
r.v.'s  random variables
i.i.d.r.v.'s  i.i.d. r.v.'s
WLSE  weighted least squares estimator
WLSP  weighted least squares process
MLE  maximum likelihood estimator
MLSE  modified least squares estimator
MLSP  modified least squares process
CLT  central limit theorem
WLLN  Kolmogorov's weak law of large numbers for i.i.d.r.v.'s with finite mean
SLLN  Kolmogorov's strong law of large numbers for i.i.d.r.v.'s with finite mean
WPA1  with probability approaching one; used to describe a property related to random matrices/variables that holds on sets whose probabilities approach one
LRSP  Laboratory for Research in Statistics and Probability


Introduction

This dissertation deals with asymptotic methods in probability and statistics, and some of their applications. It was initially influenced and inspired by some recent new trends of research in probability and statistics at the Laboratory for Research in Statistics and Probability (LRSP) at Carleton University. Namely, the papers by Csorgo, Szyszkowicz and Wang ([15], [16], [17]) and, later on, as a consequence, the theme term seminar series on recent advances in invariance principles for self-normalized and Studentized partial sums of random variables and their applications in probability and statistics, given by Professor Miklos Csorgo in Fall 2003/Winter 2004 under the auspices of the LRSP, prompted me to get involved in asymptotic studies concerned with the domain of attraction of the normal law (DAN), self-normalization and Studentization.
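As a toy illustration of the Studentization theme (added here for orientation, not material from the thesis): for i.i.d. data with finite variance, the Studentized sample mean is asymptotically standard normal even for markedly non-normal data, with no unknown scale parameter remaining. A minimal simulation sketch, under hypothetical parameter choices:

```python
import numpy as np

# Studentized sample mean T_n = sqrt(n) * (xbar - mu) / s_n for skewed
# Exponential(1) data; across replications T_n behaves like N(0, 1),
# even though the data are far from normal.
rng = np.random.default_rng(0)
n, reps, mu = 400, 3000, 1.0

x = rng.exponential(scale=1.0, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)          # data-based normalizer (Studentization)
T = np.sqrt(n) * (xbar - mu) / s

assert abs(T.mean()) < 0.15        # empirical mean near 0
assert abs(T.std() - 1.0) < 0.15   # empirical standard deviation near 1
```

The point of the sketch is that the limiting law is parameter-free: the unknown variance is replaced by its sample version, which is the simplest instance of the data-based limit theorems pursued in this thesis.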

In view of my previous experience in dealing with linear error-in-variables models (cf. [34], [35]), the idea of incorporating the ideology of Studentization into the asymptotic theory for such models became inviting and challenging. This soon led to studying linear structural error-in-variables models (SEIVM’s) as in Chapter 1, with some new assumptions on the explanatory variables and distribution-free conditions on the error terms. Developing a basic asymptotic theory by using a new Studentization way of thinking in the nontrivial context of such models required first to survey some basics on DAN, the generalized domain of attraction of multivariate normal law (GDAN), self-normalization and Studentization, and then, to obtain



some important complements and new characterizations in regard to these areas of research. These, as well as some other auxiliary results of Section 1.2 of Chapter 1 that were inspired by the special context of SEIVM's, not only led to the main asymptotic results of Chapter 1 for such models, but may also be of general interest in DAN and GDAN theory and demonstrative of an interplay between the latter theory and the error-in-variables model (EIVM) area. The new Studentization approach to SEIVM's of Chapter 1, naturally fitting into the context of such models, produced new asymptotic results that have distinctive features and resulted in various benefits and important applications. In contrast to the core probabilistic Section 1.2 of Chapter 1, with auxiliary results and proofs of the main results, Section 1.1 of Chapter 1, with its introduction, main results and applications, is immersed in the statistical context of EIVM's as much as possible, and is less technical. More details follow in the outline of Chapter 1 in this Introduction.

In view of the success of the Studentization approach in the SEIVM's of Chapter 1, it became desirable to extend this approach to the corresponding companion linear functional error-in-variables models (FEIVM's). Via gaining an insight into the interplay of the SEIVM's of Chapter 1 and the FEIVM's of Chapter 2, similarly featured limit theorems were obtained for FEIVM's in Chapter 2, and thus a connection was established between the asymptotic theories of Chapters 1 and 2. Chapter 2 has a structure similar to that of Chapter 1. An introduction, the main results and some applications constitute Section 2.1, which is embedded in the context of EIVM's, and it can be read independently of the more technical Section 2.2 with its auxiliary results of various nature and proofs of the main results. Chapter 1 in this regard served as a general inspirational guide in developing Chapter 2, showing the desirable forms of the main results and a global scheme for obtaining them. However, the methods and proofs of the major auxiliary results of Section 2.2 in Chapter 2 are different from those of the corresponding Section 1.2 in Chapter 1, and present a blend of


invariance principles via Studentization for independent nonidentically distributed random variables and matrix analysis. Later on in the course of this Introduction we also address Chapter 2 more specifically.

We note that, in spite of being generally aware of a diversity of EIVM's in the literature, in this thesis we have deliberately chosen to work with EIVM's that are simplest in form. Namely, linear SEIVM's and FEIVM's with univariate observations, and without equation error, are considered respectively in Chapter 1 and Chapter 2. Also, another principal distinction of our models is that they are studied under assumed identifiability conditions. This allows us to introduce and develop new Studentization ideas, and to present the respective asymptotic theories for such models thoroughly, extensively and in detail, i.e., to accomplish the major objective of this thesis without being distracted by possible secondary technical difficulties that might have arisen under a more complicated choice of the models. At the same time, we wish to emphasize that our distribution-free assumptions on explanatory variables (moment-like assumptions on nonrandom explanatory variables in Chapter 2) are, to the best of our knowledge, the most general considered in the context of SEIVM's and FEIVM's so far. As to our distribution-free moment conditions on the error terms, they are the best that have been used so far. We also note that, in contrast with the tradition in asymptotic studies of linear EIVM's, whereby FEIVM's are studied first, our approach naturally originates from first studying SEIVM's, as in Chapter 1. Further remarks on the interplay of SEIVM's and FEIVM's will follow in this Introduction and, in more detail, also in Section 2.1.3 of Chapter 2.

Below we provide some preliminary introductions to Chapters 1 and 2. As to more detailed respective introductions, for better presentation, it appears to be more convenient to locate them respectively as Section 1.1.3 of Chapter 1 and Section 2.1.3 of Chapter 2, after the respective SEIVM’s and FEIVM’s and all the estimators and processes under study have been introduced respectively in Sections 1.1.1, 1.1.2 of


Chapter 1 and Sections 2.1.1, 2.1.2 of Chapter 2.

In Chapter 1, linear SEIVM's with univariate observations, and without equation error, are considered. Depending on which one of the major identifiability conditions is assumed, well-known weighted or modified least squares estimators for the slope and intercept, as well as certain estimators for the unknown error variances, are studied under some new distribution-free conditions on the random explanatory variables and error terms. For all these estimators, corresponding processes are introduced that are believed to be new objects of study for SEIVM's.
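To fix ideas, here is a small simulation sketch of a linear SEIVM, added for illustration only: the parameter values are hypothetical, and the slope estimator shown is the classical one based on a known error-variance ratio (as treated, e.g., in the Cheng and Van Ness and Fuller monographs cited below), not necessarily the estimators analyzed in this thesis. It shows the attenuation of the naive least squares slope, which is what makes error-corrected estimators necessary.

```python
import numpy as np

# Simulated SEIVM: y_i = alpha + beta*xi_i + delta_i, x_i = xi_i + eps_i.
rng = np.random.default_rng(1)
n, beta, alpha = 5000, 2.0, 1.0
xi = rng.normal(0.0, 1.0, n)                        # latent explanatory variables
x = xi + rng.normal(0.0, 0.5, n)                    # observed with error eps_i
y = alpha + beta * xi + rng.normal(0.0, 0.5, n)     # observed with error delta_i

sxx = np.var(x)
syy = np.var(y)
sxy = np.cov(x, y, bias=True)[0, 1]

# Naive least squares of y on x: converges to beta * Var(xi)/(Var(xi)+Var(eps)),
# here 2 * 1/1.25 = 1.6, i.e., it is attenuated toward zero.
b_ols = sxy / sxx

# Classical slope estimator with known ratio lam = Var(delta)/Var(eps) (= 1 here).
lam = 1.0
b_eiv = ((syy - lam * sxx)
         + np.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)

assert abs(b_ols - beta) > 0.2   # naive slope is biased toward zero
assert abs(b_eiv - beta) < 0.1   # error-corrected slope is close to beta
```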

In Chapter 1, using a Studentization approach, various invariance principles are established for these estimators and their corresponding processes that are defined in D[0,1]-space. While some of the consistency and CLT results imply and improve on those available in the literature, our other consistency and CLT results, the obtained marginal weak invariance principles and sup-norm approximations in probability are believed to be new, first time around results. Due to their Studentized forms, all the established invariance principles are invariant in form within the introduced class of the explanatory and error variables and free of any unknown moments of these variables; this, in turn, readily facilitates their complete data-based versions. In particular, this resolves the long-standing issue in the context of SEIVM's that CLT's in the literature to date are dependent on various unknown parameters that are contained in the covariance matrices of the limiting normal distributions and are, generally speaking, hard to estimate from the data.

All the results of Chapter 1 have strongly been inspired and motivated by genuine interaction with recent advances in DAN and GDAN via Studentization and self-normalization. As a consequence, the traditional two-moment space of the explanatory variables ξ_i that has been used in SEIVM's to date is extended here, and the ξ_i are allowed to be simply in DAN. This new class of ξ_i is then seen to be nearly optimal for most of the obtained invariance principles, and also leads to some new


features of the SEIVM's in hand. The richness of the context of such SEIVM's in turn helps one to contribute in part to an answer to a fundamental open question of characterization related to DAN and GDAN. Section 1.2 of Chapter 1, with auxiliary results and proofs of the main results, also contains further results that may be of interest in a general theory of DAN and GDAN, and contribute to our attempt to see this theory interacting with the EIVM area.

In Section 1.1 of Chapter 1, the introduction and presentation of the main results of Chapter 1 and their applications are well embedded in the context of EIVM's. In particular, in the prelude to the main results in Section 1.1.3 of Chapter 1 we give some general historical remarks on EIVM's and survey asymptotic results in the literature that are related to our main results. Section 1.1 of Chapter 1 is concluded with some important applications of the main results concerning finite-sample properties of the estimators for the slope, confidence intervals for the slope, and some other applications.

In Chapter 2, natural companion models to those of Chapter 1, i.e., linear FEIVM's with univariate observations, and without equation error, are considered. Depending on which one of the major identifiability conditions is assumed, the well-known weighted or modified least squares estimators for the slope and intercept, as well as certain estimators for the unknown error variances, continue to be studied in the context of the FEIVM's in hand, under some new conditions on the deterministic explanatory variables and distribution-free assumptions on the error terms. Just like in Chapter 1, these estimators are accompanied by their appropriate processes here, for the first time.

In Chapter 2 we continue to explore and develop Studentization ideas in connection with some basic asymptotic theory for the FEIVM's. Our consistency and CLT results, which imply and improve on those available in the literature, and also the marginal weak invariance principles and sup-norm approximations in probability for


the processes in D[0,1]-space are believed to be new, first time around results. Due to their Studentized forms, all the CLT's and some of the weak invariance principles that are established here are free of any unknown moments of the error terms and parameters associated with the explanatory variables, and hence are completely data-based and readily applicable. In contrast, the known CLT's in the context of FEIVM's in the literature to date are dependent on unknown parameters that are, generally speaking, hard to estimate from data. Thus, it is quite natural to address some immediate applications of the main results in Section 2.1.5 of Chapter 2.

All the results for FEIVM's of Chapter 2 have strongly been inspired and motivated by those for SEIVM's of Chapter 1. The multiple benefits of the new, first time around asymptotic results that are due to incorporating innovative Studentization ideas into SEIVM's in Chapter 1 made it desirable to obtain similar results for appropriate FEIVM's in Chapter 2. Indeed, with a proper choice of the corresponding FEIVM's, our aim was to establish asymptotic results that would be identical in form to those of Chapter 1. Hence, for the nonrandom explanatory variables of the FEIVM's in hand, we introduce assumptions that are true companions to those on the random explanatory variables in the corresponding SEIVM's of Chapter 1. This indeed led to the synchrony of all the main results obtained in Chapter 2 with, and genuine identity in form of many of them to, the respective results in Chapter 1. As a consequence, the main results of Chapter 1 and those of Chapter 2 share similar features, and a striking interplay takes place between the SEIVM's of Chapter 1 and the FEIVM's of Chapter 2. Though this interplay contains the conclusions of an already known interplay between linear SEIVM's and FEIVM's (cf. [22]), it also establishes a new, different approach to asymptotic studies of these models and the models' interrelationship in this regard. In conclusion to describing the interaction of the two chapters, we wish to emphasize that though Chapter 1 inspired Chapter 2 with a general direction towards obtaining the main results, from a technical point of view, the


methods and details of the proofs in Chapter 2 are based on ideas different from those used in Chapter 1. The established interplay of the SEIVM's of Chapter 1 and the FEIVM's of Chapter 2 will be covered extensively in Section 2.1.3 of Chapter 2. Chapters 1 and 2 contain appendices as respective Sections 1.2.5 and 2.2.3. These appendices are computer codes written in Maple, with comments, and constitute a crucial step in the proofs of the key auxiliary Lemma 1.2.8 of Chapter 1 and Lemma 2.2.8 of Chapter 2, respectively.

A preliminary version of Chapter 1 appeared as technical report [44], while technical report [45], which is about to appear, corresponds to Chapter 2.

The thesis is concluded with an Epilogue that follows right after Appendix 2. In this Epilogue we outline some works by the author, presently in progress, that are immediately related to this thesis and could have constituted an organic part of it.

In the rest of this Introduction, we give some navigational remarks for reading this thesis. The structure of this thesis allows convenient access for various readers. The Introduction provides a thesis summary that is as informative as possible without the EIVM's and their objects of study having been introduced yet.

After reading the Introduction, one may like to proceed with Chapter 1 and/or Chapter 2. Despite being put in their natural logical and chronological order and having the earlier mentioned strong connections, the two chapters are written in as self-contained a way as possible, for independent readability and convenient reference. That is why sometimes in Chapter 2 we prefer some repetitions to sending the reader to similar places in Chapter 1. Those who are going to read both chapters will, due to the similarities of the contexts in these two chapters, find it easier to relate to the one read second. Having chosen to read any one of the two chapters, one has further reading


options. As described at the beginning of the Introduction, Chapter 1 consists of two big sections: Section 1.1, with introduction, main results and applications, and Section 1.2, with auxiliary results and proofs of the main results. In Chapter 2, Sections 2.1 and 2.2 play roles similar to those of Sections 1.1 and 1.2 of Chapter 1, respectively. If one is not interested in the core probabilistic background of the main results, one should skip the auxiliary Section 1.2 in Chapter 1 and Section 2.2 in Chapter 2, and concentrate only on Section 1.1 in Chapter 1, or Section 2.1 in Chapter 2. The latter two sections are embedded in the context of EIVM's as much as possible, and are less technical. Furthermore, at one's convenience, the reading of Section 1.1 in Chapter 1 and Section 2.1 in Chapter 2 can be reduced respectively to Sections 1.1.3 and 2.1.3, which are both called Prelude to Main Results and provide a good idea about the nature of the main results in view of related results in the literature. On the other hand, readers who are mainly interested in the probabilistic machinery behind the main results of this thesis should concentrate more on Section 1.2 in Chapter 1 and/or Section 2.2 in Chapter 2. Moreover, the core Sections 1.2.1 and 2.2.1 of Sections 1.2 and 2.2, respectively, can be of interest to those who would simply like to learn about invariance principles via Studentization and related topics, independently of the EIVM context.

The numbering system of this thesis is organized as follows. For any item, such as an equation, lemma, theorem, remark, observation, proposition or subsection, the first number identifies the chapter, the second number stands for the section, and the third number represents consecutive item numbering within the section. Throughout the thesis, when a subsection is identified with a number, it is, nevertheless, called a section.
Similarly, the first digit of the two-digit number of a section stands for the chapter number, and the second digit is the consecutive number of the section within each of the chapters. In the course of the thesis, in some places, as a convenient reminder, together with an item number we spell out its location as well.

Chapter 1

Invariance Principles via Studentization in Linear Structural Error-in-Variables Models

1.1 Introduction, Main Results and Applications

1.1.1 Model and Assumptions

In the linear error-in-variables model (EIVM) of this thesis we observe pairs $(y_i, x_i) \in \mathbb{R}^2$ according to

$$y_i = \beta \xi_i + \alpha + \delta_i, \qquad (1.1.1)$$

$$x_i = \xi_i + \varepsilon_i, \qquad (1.1.2)$$

where $\xi_i$ are unknown explanatory/latent variables, the real-valued slope $\beta$ and intercept $\alpha$ are to be estimated, and $\delta_i$ and $\varepsilon_i$ are unknown measurement error terms/variables, $1 \le i \le n$, $n \ge 1$. EIVM (1.1.1)-(1.1.2) is also known as a measurement error model, or structural/functional relationship, or regression with errors in variables. It is a generalization of simple linear regression in that in (1.1.1)-(1.1.2) it is



assumed that two variables $\eta$ and $\xi$ are linearly related, $\eta = \beta\xi + \alpha$, however now not only $\eta$, but also $\xi$, are observed with respective measurement errors $\delta_i$ and $\varepsilon_i$. In Chapter 1, the explanatory variables $\xi_i$ are assumed to be independent identically distributed (i.i.d.) random variables (r.v.'s) that are independent of the error terms, i.e., we deal here with the so-called structural EIVM (SEIVM). The case of (1.1.1)-(1.1.2) with $\alpha$ known to be zero is distinguished in the literature as the model without intercept. Convenient notations that are introduced in this chapter (cf. (1.1.6)-(1.1.7)) allow us to study both the no-intercept model and the model with unknown $\alpha$ simultaneously. We also note that model (1.1.1)-(1.1.2) is classified in form as one without the so-called equation error (cf. Cheng and Van Ness [12] and Fuller [19] for details on equation error models). Throughout Chapter 1, one of the following two conditions is assumed on the error terms:

(A) $\{(\delta, \varepsilon), (\delta_i, \varepsilon_i),\ i \ge 1\}$ is a sequence of independent identically distributed (i.i.d.) random vectors of error terms with mean zero and positive definite covariance matrix

$$\Gamma = \begin{pmatrix} \mathrm{Var}\,\delta & \mathrm{cov}(\delta,\varepsilon) \\ \mathrm{cov}(\delta,\varepsilon) & \mathrm{Var}\,\varepsilon \end{pmatrix} = \begin{pmatrix} \lambda\theta & \mu \\ \mu & \theta \end{pmatrix}, \qquad (1.1.3)$$

or

(B) The sequence $\{(\delta, \varepsilon), (\delta_i, \varepsilon_i),\ i \ge 1\}$ is as in (A), with the fourth error moments assumed to exist, i.e.,

$$E\delta^4 < \infty \quad \text{and} \quad E\varepsilon^4 < \infty.$$

As to the explanatory variables, we assume that they are as in (C) or (D), and their joint distribution with the error terms obeys (E):

(C) $\{\xi, \xi_i,\ i \ge 1\}$ are i.i.d. nondegenerate r.v.'s with finite mean $E\xi = m$,


or

(D) $\{\xi, \xi_i,\ i \ge 1\}$ are i.i.d. r.v.'s (i.i.d.r.v.'s) in the domain of attraction of the normal law (DAN) (cf. Definition 1.2.1 of Section 1.2.1);

(E) $(\delta, \varepsilon)$ and $\xi$ are independent.

Apart from assumptions on the joint distribution of $(\xi, \delta, \varepsilon)$, to ensure identifiability of the unknown parameters in model (1.1.1)-(1.1.2), it is common to make use of some side conditions in this regard (cf. Remark 1.1.6 of Section 1.1.4). In the main lines of development in this chapter, we distinguish only three major identifiability assumptions. Namely, one of the following conditions is assumed about matrix $\Gamma$ of (1.1.3):

(1) the positive ratio of the error variances $\lambda = \mathrm{Var}\,\delta/\mathrm{Var}\,\varepsilon$ and the correlation coefficient of the error terms $\mathrm{cov}(\delta,\varepsilon)/\sqrt{\mathrm{Var}\,\delta\,\mathrm{Var}\,\varepsilon} = \mu/(\sqrt{\lambda}\,\theta)$ are known, where $\mathrm{cov}(\delta,\varepsilon)$ is the covariance of $\delta$ and $\varepsilon$; equivalently, matrix $\Gamma$ is known at least up to the unknown multiple $\theta = \mathrm{Var}\,\varepsilon$;

(2) $\mathrm{Var}\,\delta = \lambda\theta$ and $\mathrm{cov}(\delta,\varepsilon) = \mu$ are known, while $\mathrm{Var}\,\varepsilon = \theta$ is unknown;

(3) $\mathrm{Var}\,\varepsilon = \theta$ and $\mathrm{cov}(\delta,\varepsilon) = \mu$ are known, while $\mathrm{Var}\,\delta = \lambda\theta$ is unknown.

The rest of the major identifiability assumptions that appear in the literature regarding SEIVM's (1.1.1)-(1.1.2) are briefly addressed in Remark 1.1.6 and then in Section 1.1.5.

1.1.2 Introduction of Estimators and Processes under Study

In this subsection, depending on which one of the identifiability conditions (1)-(3) is assumed, well-known weighted or modified least squares estimators are given for the slope and intercept, the principal parameters, as well as for certain estimators of


unknown error variances. For all these estimators, corresponding processes are introduced in $D[0,1]$-space, which are believed to be new objects of study for SEIVM's (1.1.1)-(1.1.2). Formulae for most of the estimators are not displayed separately since they are absorbed into those for the corresponding processes. For convenience, necessary notations, definitions and abbreviations that are used more than once throughout this chapter are introduced in the course of the developments of this chapter at the beginning of each of the subsections where they first occur, and are also summarized in the List of Notations, Definitions and Abbreviations of this thesis, just before the Introduction. For further use throughout, we introduce a set of notations. For sets of real-valued variables $\{u_i,\ 1 \le i \le n\}$ and $\{v_i,\ 1 \le i \le n\}$, put

$$\bar{u} = \frac{1}{n}\sum_{i=1}^{n} u_i \qquad (1.1.4)$$

and

$$S_{uv} = \frac{1}{n}\sum_{i=1}^{n} s_{i,uv}, \qquad (1.1.5)$$

where

$$s_{i,uv} = (u_i - c\bar{u})(v_i - c\bar{v}), \qquad (1.1.6)$$

with constant

$$c = \begin{cases} 0, & \text{if intercept } \alpha \text{ is known to be zero}, \\ 1, & \text{otherwise}, \end{cases} \qquad (1.1.7)$$

and also set the functions in D[0,1]

$$\bar{u}_t = \frac{1}{n}\sum_{i=1}^{[nt]} u_i, \quad 0 \le t \le 1, \quad \text{where } \bar{u}_t := 0 \text{ for } t \in [0, \tfrac{1}{n}), \qquad (1.1.8)$$

and

$$S_{uv,t} = \frac{1}{n}\sum_{i=1}^{[nt]} s_{i,uv}, \quad 0 \le t \le 1, \quad \text{where } S_{uv,t} := 0 \text{ for } t \in [0, \tfrac{1}{n}). \qquad (1.1.9)$$

The sign of a real-valued variable is denoted by $\mathrm{sign}(\cdot)$.
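For concreteness, the partial-sum functions (1.1.8)-(1.1.9) can be sketched in code as follows (a small illustration of ours, not part of the thesis; the function names are invented for the sketch):

```python
# Partial-sum functions on [0,1]: sum the first [nt] terms and divide by n,
# so that t = 1 recovers the full mean u_bar of (1.1.4) and the full S_uv of (1.1.5).
def ubar_t(u, t):
    n = len(u)
    m = int(n * t)                     # [nt]; equals 0 on [0, 1/n)
    return sum(u[:m]) / n

def S_uv_t(u, v, t, c=1):
    # c = 0 when the intercept is known to be zero, c = 1 otherwise (cf. (1.1.7))
    n = len(u)
    m = int(n * t)
    ub, vb = sum(u) / n, sum(v) / n
    return sum((u[i] - c * ub) * (v[i] - c * vb) for i in range(m)) / n

u = [1.0, 2.0, 3.0, 4.0]
print(ubar_t(u, 1.0))                  # 2.5, the full mean
print(S_uv_t(u, u, 1.0))               # 1.25, the full S_uu
print(ubar_t(u, 0.5))                  # 0.75: only the first [4 * 0.5] = 2 terms, still divided by n
```

Note that both functions keep the divisor $n$ fixed for all $t$, which is what makes them elements of $D[0,1]$ suitable for invariance principles.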


Deviating somewhat from the original agreement that all abbreviations are to be introduced right at the beginning of each subsection, for the sake of better presentation, we introduce new abbreviations throughout this subsection as well. When identifiability condition (1) is assumed, it is common to estimate $\beta$ and $\alpha$

with the weighted least squares estimators (WLSE's) $\hat{\beta}_{1n}$ and $\hat{\alpha}_{1n}$ (cf. [12], Section 1.3.3 of [19]). These WLSE's minimize the functional

$$F_{1n}(\beta, \alpha) = \frac{1}{n}\sum_{i=1}^{n} \min_{\xi \in \mathbb{R}} \left( (y_i - \xi\beta - c\alpha,\ x_i - \xi)\,\Gamma^{-1}\,(y_i - \xi\beta - c\alpha,\ x_i - \xi)^T \right), \quad \beta, \alpha \in \mathbb{R}, \qquad (1.1.10)$$

where $c$ is from (1.1.7), $\Gamma^{-1}$ is the inverse of matrix $\Gamma$ of (1.1.3) and $(y_i - \xi\beta - c\alpha, x_i - \xi)^T$ is the transpose of the vector $(y_i - \xi\beta - c\alpha, x_i - \xi)$. Formulae for $\hat{\beta}_{1n}$ and $\hat{\alpha}_{1n}$ are much simpler looking when $\mu = 0$ in (1.1.3). Therefore, it is quite common to study WLSE's via first assuming that the error terms of model (1.1.1)-(1.1.2) are uncorrelated. Then, using a data-transformation based interplay between models (1.1.1)-(1.1.2) with uncorrelated and correlated errors, one can carry over asymptotic results for WLSE's from the case $\mu = 0$ to $\mu \ne 0$ (cf. Remark 1.1.7 of Section 1.1.4). In this chapter, along with studying WLSE's, we introduce weighted least squares processes (WLSP's) that are believed to be new, first time around objects of study in the EIVM area. WLSP's are elements of $D[0,1]$ that at $t = 1$ correspond to the WLSE's centered by $\beta$ and $\alpha$ respectively, namely $\hat{\beta}_{1n} - \beta$ and $\hat{\alpha}_{1n} - \alpha$. Assuming that $\mu = 0$ in (1.1.3) and using notations (1.1.4)-(1.1.9), we put the WLSP for $\beta$ to be

$$(\hat{\beta}_{1n} - \beta)_t = \mathrm{sign}(S_{xy})\sqrt{\left((z_n - z)_t + z\right)^2 + \lambda} + (z_n - z)_t + z - \beta, \quad 0 \le t \le 1, \qquad (1.1.11)$$

with

$$(z_n - z)_t = \frac{S_{yy,t} - \lambda S_{xx,t} - 2zS_{xy,t}}{2S_{xy}} \quad \text{and} \quad z = \frac{\beta^2 - \lambda}{2\beta}, \qquad (1.1.12)$$

and, respectively, the WLSP for $\alpha$ to be

$$(\hat{\alpha}_{1n} - \alpha)_t = -\bar{x}(\hat{\beta}_{1n} - \beta)_t + (\overline{y - x\beta - \alpha})_t, \quad 0 \le t \le 1. \qquad (1.1.13)$$
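To make the closed forms behind these processes at $t = 1$ concrete, here is a small simulation sketch of ours (not from the thesis): with $\mu = 0$ and writing $z_n = (S_{yy} - \lambda S_{xx})/(2S_{xy})$, the WLSE's are the classical weighted (orthogonal) least squares solutions $\hat\beta_{1n} = z_n + \mathrm{sign}(S_{xy})\sqrt{z_n^2 + \lambda}$ and $\hat\alpha_{1n} = \bar y - \hat\beta_{1n}\bar x$; the normal choices for $\xi_i$, $\delta_i$, $\varepsilon_i$ below are for convenience only:

```python
import math
import random

def wlse(y, x, lam):
    # WLSE's of beta and alpha for known variance ratio lam and mu = 0
    n = len(y)
    ybar, xbar = sum(y) / n, sum(x) / n
    syy = sum((v - ybar) ** 2 for v in y) / n
    sxx = sum((v - xbar) ** 2 for v in x) / n
    sxy = sum((y[i] - ybar) * (x[i] - xbar) for i in range(n)) / n
    z = (syy - lam * sxx) / (2 * sxy)
    b = z + math.copysign(1.0, sxy) * math.sqrt(z * z + lam)
    return b, ybar - b * xbar

random.seed(1)
beta, alpha, lam, theta = 2.0, 1.0, 0.5, 1.0    # Var(eps) = theta, Var(delta) = lam * theta
n = 100_000
xi = [random.gauss(0.0, 2.0) for _ in range(n)]                                 # latent xi_i
y = [beta * v + alpha + random.gauss(0.0, math.sqrt(lam * theta)) for v in xi]  # (1.1.1)
x = [v + random.gauss(0.0, math.sqrt(theta)) for v in xi]                       # (1.1.2)
b_hat, a_hat = wlse(y, x, lam)
print(b_hat, a_hat)     # close to the true beta = 2.0 and alpha = 1.0
```

For comparison, the naive least squares slope of $y$ on $x$ would converge to $\beta\,\mathrm{Var}\,\xi/(\mathrm{Var}\,\xi + \theta) = 1.6$ here, the familiar attenuation bias that the WLSE removes.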


When dealing with the assumptions on covariance matrix $\Gamma$ in case (1), with $\mu = 0$, there are two further cases to consider: both variances $\lambda\theta$ and $\theta$ in (1.1.3) are known, or both are unknown. In the latter case, it may also be of interest to estimate one of the variances, say $\theta$ (cf. (d) of Remark 1.1.4 in Section 1.1.4 on where estimators for $\theta$ may be used). Now, the weighted least squares approach, via functional (1.1.10), is unable to supply us with any estimator for $\theta$. So it seems natural to adapt here the maximum likelihood estimator (MLE) of $\theta$ that is derived when the explanatory and error variables are assumed to follow a normal distribution. Then the MLE method not only produces estimators for $\beta$ and $\alpha$, which are known to coincide with the WLSE's, but also gives us the MLE for $\theta$, which is usually adjusted with the factor $2n/(n-2)$. This adjustment, also called "correction for degrees of freedom", results, in particular, in consistency of the MLE of $\theta$. Here we introduce the process in $D[0,1]$

$$(\hat{\theta}_{1n} - \theta)_t = (\tilde{\theta}_{1n} - \theta)_t + \tilde{\theta}_{1n}\left(\frac{n}{n-2} - 1\right), \quad 0 \le t \le 1, \qquad (1.1.14)$$

where

$$(\tilde{\theta}_{1n} - \theta)_t = \frac{(S_{yy,t} - \lambda\theta[nt]/n) - 2S_{xy,t}\hat{\beta}_{1n} + (S_{xx,t} - \theta[nt]/n)\hat{\beta}_{1n}^2}{\lambda + \hat{\beta}_{1n}^2}, \quad 0 \le t \le 1. \qquad (1.1.15)$$

We note that $\hat{\theta}_{1n} = (\hat{\theta}_{1n} - \theta)_1 + \theta$ is the MLE of $\theta$ after the aforementioned adjustment. This new process as defined by (1.1.14)-(1.1.15), as well as $\hat{\theta}_{1n}$, are studied here along with the WLSP's and WLSE's that are our primary interest. When either (2) or (3) is assumed, i.e., when covariance matrix $\Gamma$ in (1.1.3) is known up to one of the error variances, there are the so-called modified least squares estimators (MLSE's) available for slope $\beta$ and intercept $\alpha$. Originating from Cheng and Tsai [9], these MLSE's are not new in form (they basically coincide with the MLE's for $\beta$ and $\alpha$). However, they are obtained by a rather general method applicable to a broad class of the models (1.1.1)-(1.1.2) simultaneously, independently of whether the explanatory variables and/or the error terms are normally distributed or not.


According to this method, the MLSE's of $\beta$ and $\alpha$ minimize, in cases (2) and (3) respectively, the following functionals, constructed as unbiased estimators of the appropriate functions of the unknown error variances:

$$F_{2n}(\beta, \alpha) = \frac{1}{n}\sum_{i=1}^{n}\left((y_i - x_i\beta - c\alpha)^2 - \lambda\theta + 2\beta\mu\right), \quad \beta, \alpha \in \mathbb{R}, \qquad (1.1.16)$$

and

$$F_{3n}(\beta, \alpha) = \frac{1}{n}\sum_{i=1}^{n}\left((y_i - x_i\beta - c\alpha)^2 - \beta^2\theta + 2\beta\mu\right), \quad \beta, \alpha \in \mathbb{R}. \qquad (1.1.17)$$

We observe that this approach only requires the existence of variances for the i.i.d. error terms, disregards the nature of the explanatory variables, and thus is also suitable in the context of our SEIVM's (1.1.1)-(1.1.2) under (A). When condition (2) or (3) is assumed, going beyond studying MLSE's only, we present modified least squares processes (MLSP's) for $\beta$ and $\alpha$, which are also believed to be the first in the context of (1.1.1)-(1.1.2). In cases (2) and (3) respectively, these MLSP's in $D[0,1]$ are given as follows:

$$(\hat{\beta}_{2n} - \beta)_t = \frac{(S_{yy,t} - \lambda\theta[nt]/n) - \beta(S_{xy,t} - \mu[nt]/n)}{S_{xy} - \mu}, \quad 0 \le t \le 1, \qquad (1.1.18)$$

provided that

$$S_{xy} - \mu \ne 0 \quad \text{and} \quad S_{yy} - \lambda\theta > 0, \qquad (1.1.19)$$

while

$$(\hat{\beta}_{3n} - \beta)_t = \frac{(S_{xy,t} - \mu[nt]/n) - \beta(S_{xx,t} - \theta[nt]/n)}{S_{xx} - \theta}, \quad 0 \le t \le 1, \qquad (1.1.20)$$

if

$$S_{xx} - \theta > 0. \qquad (1.1.21)$$

As to the corresponding MLSP's for $\alpha$, we define

$$(\hat{\alpha}_{jn} - \alpha)_t = -\bar{x}(\hat{\beta}_{jn} - \beta)_t + (\overline{y - x\beta - \alpha})_t, \quad 0 \le t \le 1, \quad j = 2, 3. \qquad (1.1.22)$$


We note that $(\hat{\beta}_{jn} - \beta)_1$ and $(\hat{\alpha}_{jn} - \alpha)_1$ are the centered MLSE's $\hat{\beta}_{jn}$ and $\hat{\alpha}_{jn}$, $j = 2$ and $3$, i.e., the respective minimizers of (1.1.16) and (1.1.17), centered by $\beta$ and $\alpha$. Just like in case (1), for estimation of the unknown error variances, $\theta$ in case (2) and $\lambda\theta$ in case (3), their MLE's are adapted here too, namely,

$$\hat{\theta}_{2n} = S_{xx} - \frac{S_{xy} - \mu}{\hat{\beta}_{2n}} \qquad (1.1.23)$$

and

$$\lambda\hat{\theta}_{3n} = S_{yy} - (S_{xy} - \mu)\hat{\beta}_{3n}. \qquad (1.1.24)$$
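Numerically, cases (2) and (3) can be illustrated as follows (our own sketch, not from the thesis): the closed forms used below are the $t = 1$ values $\hat\beta_{2n} = (S_{yy} - \lambda\theta)/(S_{xy} - \mu)$ and $\hat\beta_{3n} = (S_{xy} - \mu)/(S_{xx} - \theta)$, together with the variance estimators $\hat\theta_{2n} = S_{xx} - (S_{xy} - \mu)/\hat\beta_{2n}$ and $\lambda\hat\theta_{3n} = S_{yy} - (S_{xy} - \mu)\hat\beta_{3n}$; normal data and the particular parameter values are our choices:

```python
import math
import random

random.seed(2)
beta, alpha = 2.0, 1.0
theta, lam, mu = 1.0, 0.5, 0.3          # Var(eps) = theta, Var(delta) = lam * theta, cov = mu
n = 200_000
xi = [random.gauss(0.0, 2.0) for _ in range(n)]
eps = [random.gauss(0.0, math.sqrt(theta)) for _ in range(n)]
# build delta correlated with eps so that cov(delta, eps) = mu and Var(delta) = lam * theta
a = mu / theta
resid_sd = math.sqrt(lam * theta - a * a * theta)
delta = [a * e + random.gauss(0.0, resid_sd) for e in eps]
y = [beta * xi[i] + alpha + delta[i] for i in range(n)]
x = [xi[i] + eps[i] for i in range(n)]

ybar, xbar = sum(y) / n, sum(x) / n
syy = sum((v - ybar) ** 2 for v in y) / n
sxx = sum((v - xbar) ** 2 for v in x) / n
sxy = sum((y[i] - ybar) * (x[i] - xbar) for i in range(n)) / n

b2 = (syy - lam * theta) / (sxy - mu)      # case (2): lam * theta and mu known
b3 = (sxy - mu) / (sxx - theta)            # case (3): theta and mu known
th2 = sxx - (sxy - mu) / b2                # estimates theta in case (2)
lam_th3 = syy - (sxy - mu) * b3            # estimates lam * theta in case (3)
print(b2, b3, th2, lam_th3)                # near 2.0, 2.0, 1.0, 0.5
```

Both slope estimators are consistent for the same $\beta$; which one is usable depends on which pieces of $\Gamma$ are actually known.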

Moreover, to the best of our knowledge, we are the first to introduce and study here processes for model (1.1.1)-(1.1.2) that at $t = 1$ retain the above MLE's centered by $\theta$ and $\lambda\theta$, respectively. The somewhat complicated forms of these processes are motivated by the appeal of their asymptotic properties (cf. Theorems 1.1.2c, 1.1.3c, 1.1.4, 1.1.5 and Remarks 1.1.9, 1.1.12 in Section 1.1.4). Our process that corresponds to the MLE of $\theta$ in $D[0,1]$ is put as

$$(\hat{\theta}_{2n} - \theta)_t = \frac{(S_{yy,t} - \lambda\theta[nt]/n)(S_{xx,t} - \theta[nt]/n) - (S_{xy,t} - \mu[nt]/n)^2}{S_{yy} - \lambda\theta} + \left(S_{yy} - \lambda\theta - S_{yy,t} + \lambda\theta[nt]/n\right)\frac{S_{yy,t} - \lambda\theta[nt]/n - 2\hat{\beta}_{2n}(S_{xy,t} - \mu[nt]/n) + \hat{\beta}_{2n}^2(S_{xx,t} - \theta[nt]/n)}{\hat{\beta}_{2n}^2(S_{yy} - \lambda\theta)}, \quad 0 \le t \le 1, \qquad (1.1.25)$$

provided that (1.1.19) is satisfied, where $\hat{\beta}_{2n} = (\hat{\beta}_{2n} - \beta)_1 + \beta$, and $(\hat{\beta}_{2n} - \beta)_t$ is from (1.1.18). We also introduce the process for $\lambda\theta$ in $D[0,1]$ as

$$(\lambda\hat{\theta}_{3n} - \lambda\theta)_t = \frac{(S_{xx,t} - \theta[nt]/n)(S_{yy,t} - \lambda\theta[nt]/n) - (S_{xy,t} - \mu[nt]/n)^2}{S_{xx} - \theta} + \left(S_{xx} - \theta - S_{xx,t} + \theta[nt]/n\right)\frac{S_{yy,t} - \lambda\theta[nt]/n - 2\hat{\beta}_{3n}(S_{xy,t} - \mu[nt]/n) + \hat{\beta}_{3n}^2(S_{xx,t} - \theta[nt]/n)}{S_{xx} - \theta}, \quad 0 \le t \le 1, \qquad (1.1.26)$$


assuming that (1.1.21) holds true, with $\hat{\beta}_{3n} = (\hat{\beta}_{3n} - \beta)_1 + \beta$, where $(\hat{\beta}_{3n} - \beta)_t$ is from (1.1.20).
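As a quick sanity check of ours (not from the thesis) that the process for $\lambda\theta$ collapses at $t = 1$ to the centered estimator $\lambda\hat\theta_{3n} - \lambda\theta$ with $\lambda\hat\theta_{3n} = S_{yy} - (S_{xy} - \mu)\hat\beta_{3n}$: at $t = 1$ the second summand carries the vanishing factor $S_{xx} - \theta - S_{xx} + \theta = 0$, and the first summand reduces algebraically to $S_{yy} - \lambda\theta - (S_{xy} - \mu)\hat\beta_{3n}$. With made-up sample moments:

```python
# Illustrative (made-up) sample moments and known parameters
sxx, syy, sxy = 5.0, 17.0, 8.3
theta, lam, mu = 1.0, 0.5, 0.3
assert sxx - theta > 0                      # condition (1.1.21)

b3 = (sxy - mu) / (sxx - theta)             # MLSE of beta at t = 1
lam_th3 = syy - (sxy - mu) * b3             # estimator of lam * theta

# At t = 1: S_{., t} = S_. and [nt]/n = 1, so the second summand is zero
first = ((sxx - theta) * (syy - lam * theta) - (sxy - mu) ** 2) / (sxx - theta)
second = (sxx - theta - sxx + theta) * (
    syy - lam * theta - 2 * b3 * (sxy - mu) + b3 ** 2 * (sxx - theta)
) / (sxx - theta)

print(first + second, lam_th3 - lam * theta)   # both equal 0.5
```

The point of the extra, vanishing-at-$t = 1$ summand is thus purely asymptotic: it shapes the behavior of the process for $t < 1$ without changing the endpoint.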

We note in passing that since $\hat{\theta}_{1n}$, $\hat{\theta}_{2n}$ and $\lambda\hat{\theta}_{3n}$ (cf. (1.1.14), (1.1.23) and (1.1.24) respectively) are estimators of unknown positive variances, due to their consistency (cf. Theorems 1.1.1a, 1.1.1b), they are eventually nonnegative. Throughout this chapter, an important convention is that all the presented estimators, namely $\hat{\beta}_{in}$, $\hat{\alpha}_{in}$, $i = 1, 2, 3$, $\hat{\theta}_{1n}$, $\hat{\theta}_{2n}$ and $\lambda\hat{\theta}_{3n}$, and their corresponding processes in $D[0,1]$, are introduced and studied here on the assumption that model (1.1.1)-(1.1.2) is nondegenerate, i.e., that $\beta \ne 0$. When $\beta = 0$, model (1.1.1)-(1.1.2) reduces to

$$y_i = \alpha + \delta_i, \qquad (1.1.27)$$

$$x_i = \xi_i + \varepsilon_i. \qquad (1.1.28)$$

Thus, the observations $y_i$ and $x_i$ in (1.1.27)-(1.1.28) are no longer linearly associated and, when the errors $\delta_i$ and $\varepsilon_i$ are uncorrelated or independent, they do not seem to provide meaningful information for estimating $\alpha$ and $\beta$ in the EIVM context.

1.1.3 Prelude to Main Results

For better understanding and appreciation of our main results in Section 1.1.4, in this subsection we give an introduction to their origin and nature in view of related results in the literature. As to abbreviations used in this subsection, CLT stands for central limit theorem, and GDAN denotes the notion of the generalized domain of attraction of the multivariate normal law. A square block-diagonal matrix is defined below by listing square matrix blocks on its diagonal as follows: $\mathrm{diag}(\cdot, \dots, \cdot)$. The subject of EIVM's, the models that are also known as measurement error models, or structural relationships (in the case of SEIVM's), or regression with errors in


variables, has a considerable history dating back to Adcock ([1], [2]), who is usually regarded as the first person to have seriously considered the problem of fitting a straight-line relationship when both variables are subject to error. During over 125 years of its history, the area has been developed by a number of probabilist-statisticians, such as K. Pearson, A. Wald, D.V. Lindley, J. Neyman, E.L. Scott, T.W. Anderson, J. Kiefer, J. Wolfowitz, M.G. Kendall, P. Sprent, L.J. Gleser and W.A. Fuller, to mention only some of the well-known names (cf., e.g., Sprent [62]). However, according to Sprent [62], only slow progress was made before World War II. EIVM's have been applied in virtually every area of science and technology and, in turn, have been stimulated by the demands of data analysis in, e.g., medical, agricultural and econometric studies. The vast literature on EIVM's has been insightfully reviewed in surveys by Madansky [40], Moran [52], Anderson ([3], [4]), Sprent [62], Gleser [25] and Cheng and Van Ness [11], as well as in the comprehensive texts by Kendall and Stuart [33] (Chapter 29), Fuller [19], Carroll, Ruppert and Stefanski [7] and Cheng and Van Ness [12].

Though being generally aware of the diversity of SEIVM's in the literature, in Chapter 1 we have nevertheless deliberately chosen to work with the simplest in form SEIVM's. Namely, linear SEIVM's (1.1.1)-(1.1.2) with univariate observations, and without equation error, are considered. We also call attention again to the fact that another principal distinction of our SEIVM's is that they are studied under one of the identifiability assumptions (1)-(3). This allows us to introduce and develop Studentization ideas, and to present the respective asymptotic theory for such SEIVM's thoroughly, extensively and in detail, i.e., to accomplish the major objective of Chapter 1 without being distracted by possible secondary technical difficulties that might have arisen under a more complicated model choice. In sum, for a better presentation of our new approach to SEIVM's, it is beyond our present challenge and intention to also make progress in studying other special forms of


SEIVM's (1.1.1)-(1.1.2). At the same time, as will be frequently emphasized later, our distribution-free moment assumptions on the explanatory variables and error terms of SEIVM's (1.1.1)-(1.1.2) are, to the best of our knowledge, the most general ever considered so far. In particular, neither the explanatory variables nor the error terms are assumed to follow a normal distribution here.

In the next two paragraphs, leaning on the earlier mentioned survey papers and texts, we will attempt to sort out the state of the art of CLT's and consistency theorems for linear SEIVM's (1.1.1)-(1.1.2) under (1)-(3), sometimes by means of results that hold true for somewhat more general models (1.1.1)-(1.1.2). This summary is not meant to cover all the results in the vast literature. It should, however, provide a sufficient introduction to the main results in Section 1.1.4. We also note that as far as some other aspects of asymptotic theory for SEIVM's

(1.1.1)-(1.1.2) are concerned, like, for example, problems of efficiencies of estimators, they will not be discussed here. On the other hand, some of them will be briefly addressed in Section 1.1.5.

First, as a general comment on the development of consistency theorems and CLT's in SEIVM's (1.1.1)-(1.1.2) under (1)-(3), we would like to mention the following. With the possible exception of at most a handful of early works, the first consistency results for the models seem to appear in the immediate post-World War II era, and were obtained by many authors under different assumptions that eventually led to Gleser ([21], [22]), Cheng and Van Ness ([10], [11]) and Cheng and Tsai [9], which will be our main references in regards to consistency, and which are believed to contain the most general results for linear SEIVM's (1.1.1)-(1.1.2) under (1)-(3). As to CLT studies in EIVM's (1.1.1)-(1.1.2) under (1)-(3) in general, they seem to really begin only in the 1970's, after, according to Sprent [62], some initial estimation problems in EIVM's had been clarified. Thus, in the 1959 survey paper of Madansky [40] we find a suggestion on how to find the unknown asymptotic variances for the MLE's


$\hat{\beta}_{2n}$ and $\hat{\beta}_{3n}$ of $\beta$ (cf. (1.1.18) and (1.1.20)), assuming that $y_i$ and $x_i$ are normally distributed. In the 1971 survey paper of Moran [52], the question of finding the asymptotic variances of the MLE's (explanatory and error variables are normally distributed) in SEIVM's (1.1.1)-(1.1.2) under identifiability conditions (1)-(3) still appears as one of seven open problems. Initially, consistency and asymptotic normality of the estimators in SEIVM's (1.1.1)-(1.1.2) were mainly studied under the umbrella of maximum likelihood estimation, assuming normality of the explanatory and error

variables. Since in Chapter 1 we do not assume any such normality restrictions, just like in the case of the consistency results, our collection of references in regards to the CLT's for linear SEIVM's (1.1.1)-(1.1.2) under (1)-(3) is mainly represented by relatively recent works that contain the most general CLT's so far for the model in hand.

Speaking of our main references in regards to consistency and CLT’s, it follows from Gleser ([21], [22]), Cheng and Van Ness ([11], [10]) and Cheng and Tsai [9] that,

under conditions (1), (2) and (3) respectively, $(\hat{\beta}_{1n}, \hat{\alpha}_{1n}, \hat{\theta}_{1n})$ ($(\hat{\beta}_{1n}, \hat{\alpha}_{1n})$ in case $\Gamma$ of (1.1.3) is completely known), $(\hat{\beta}_{2n}, \hat{\alpha}_{2n}, \hat{\theta}_{2n})$ and $(\hat{\beta}_{3n}, \hat{\alpha}_{3n}, \lambda\hat{\theta}_{3n})$ are consistent and $\sqrt{n}$-asymptotically normal estimators of the vector of the parameters of interest in linear SEIVM's (1.1.1)-(1.1.2). Delegating the complete description of the assumptions required for these theorems to Remarks 1.1.1 and 1.1.3 of Section 1.1.4, we only note now that, as discussed in Remark 1.1.4 of Section 1.1.4, a common feature of all these CLT's is that their limiting normal distributions depend on the error and explanatory variables via various unknown moments of these variables. As a consequence of this, applications of these CLT's are aggravated by having to additionally estimate these typically unknown explanatory and error variable moments, and the latter are hard to come by in practice, except for some special cases (for example, when the moments of the error terms $\{(\delta_i, \varepsilon_i),\ i \ge 1\}$ are like those of a $N(0, \mathrm{diag}(\theta, \theta))$-distribution, as illustrated in [21] and seen via [22]). Similarly, to overcome the difficulties of estimating the covariance matrix of the asymptotic normal distribution, in Theorem 1.2.1


of Fuller [19], provided that (3) is satisfied, $(\hat{\beta}_{3n}, \hat{\alpha}_{3n})$ is shown to be a consistent and $\sqrt{n}$-asymptotically normal estimator on assuming that $(\xi, \delta, \varepsilon)$ follows a trivariate normal distribution with positive definite diagonal covariance matrix.

In contrast, in Section 1.1.4, using Studentization ideas and assuming less about the explanatory variables $\xi_i$ of SEIVM's (1.1.1)-(1.1.2), we establish various large sample results that are invariant in form within the introduced class of $\{(\xi, \delta, \varepsilon), (\xi_i, \delta_i, \varepsilon_i),\ i \ge 1\}$, free of unknown distribution parameters of $(\xi, \delta, \varepsilon)$, completely data-based and, hence, do not require any additional estimation in this regard for their applications as in Section 1.1.5. The progression of the main asymptotic results of Section 1.1.4 goes from strong and weak consistency results for the WLSE's, MLSE's, and estimators of the unknown error variances that are obtained in Theorems 1.1.1a, 1.1.1b (collectively referred to as Theorem 1.1.1 from now on), to various invariance principles, including CLT's, weak invariance principles on $D[0,1]$ and sup-norm approximations in probability, proved for each of these estimators and their corresponding processes marginally (cf. Theorems 1.1.2a-1.1.2c and 1.1.3a-1.1.3c, collectively referred to as Theorems 1.1.2 and 1.1.3 from now on, and Propositions 1.1.1, 1.1.2), and finally to CLT's for our various joint estimators of interest (cf. Theorems 1.1.4, 1.1.5 and Remarks 1.1.9, 1.1.12). In particular, all the processes we introduced in Section 1.1.2 are believed to be new objects of study in the context of (1.1.1)-(1.1.2), and thus, so are also the marginal invariance principles of Section 1.1.4 associated with them, i.e., the respective (ii) and (iii) in Theorems 1.1.2 and 1.1.3. As to our CLT's in the (i)'s of Theorems 1.1.2, 1.1.3 and in Theorems 1.1.4, 1.1.5, to the best of our knowledge, they present the first and only CLT's when $\xi \in$ DAN and $\mathrm{Var}\,\xi = \infty$, an important case of (D) (for a description of (D) we refer to Remark 1.2.1 of Section 1.2.1). Moreover, even when $0 < \mathrm{Var}\,\xi < \infty$, but neither the explanatory variables nor the error terms follow a normal distribution, due to their Studentized forms and the respective earlier mentioned features, the CLT's of Chapter 1 appear to be new as well


(cf. Remarks 1.1.4 and 1.1.10 of Section 1.1.4). Naturally, the CLT's of Section 1.1.4 also imply the known ones that follow from [21], [22], [11], [10] and [9] (cf. Remarks 1.1.3 and 1.1.8 of Section 1.1.4). Also, the consistency results of Theorems 1.1.1a and 1.1.1b extend related known results (cf. Remark 1.1.1 of Section 1.1.4). In addition to the main results, many of the remarks of Section 1.1.4 contain complementary and similar results and facts that are not organized in separate statements.

All the results of Chapter 1, including the ones of Section 1.1.4, are strongly inspired and influenced by recent advances in DAN and GDAN via Studentization and self-normalization, in particular by the papers of Csörgő, Szyszkowicz and Wang ([15], [16], [17]), Giné, Götze and Mason [20], Maller ([41], [42]), Sepanski [61], Vu, Maller and Klass [64], as well as by some other papers that appear therein as references. The developments in these papers prompted us to think about, and succeed in, enriching the traditional two-moment space of explanatory variables that has been used so far for asymptotic studies in SEIVM's (1.1.1)-(1.1.2) by allowing $\xi_i$ to be simply in DAN (cf. condition (D)). Also, for the consistency results in Theorem 1.1.1 of Section 1.1.4, we just replace the classical two-moment space for $\xi_i$ by the one characterized by (C) rather than (D) (clearly, (C) is weaker than (D)). As a consequence of assuming (D) in general, the $\xi_i$ are allowed to have an infinite variance (cf. Remark 1.2.1 of Section 1.2.1). Moreover, condition (D) led to bringing the power of the Studentization ideas of the newest achievements in DAN and GDAN research into the world of EIVM's. This, in turn, made all the invariance principles of Section 1.1.4 readily applicable, as described in Remarks 1.1.4 and 1.1.10 of Section 1.1.4. Furthermore, for our purposes in Section 1.1.4, the new class of explanatory variables as in (D) turned out to be nearly optimal in that (D) practically exhausts the choice of specifying $\xi_i$ as in (C) for which we can have most of the invariance principles of Theorems 1.1.2, 1.1.3 (cf. Propositions 1.1.1, 1.1.2 of Section 1.1.4 for details). Also, according to Observation 1.1.1 of Section


1.1.4, condition (D) leads to $\sqrt{n}\,\ell_\xi(n)$-rate asymptotic normality for the WLSE and MLSE's of $\beta$, where $\ell_\xi(n)$ is a slowly varying function at infinity, and it goes to infinity when $\mathrm{Var}\,\xi = \infty$. In addition, in part (e) of Remark 1.1.6, the new type of SEIVM's as in Chapter 1, with $\mathrm{Var}\,\xi = \infty$, are shown to have a formal connection and nearness to regression models.
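To illustrate the Studentization phenomenon behind condition (D) (a toy example of ours, not from the thesis): a symmetric r.v. with $P(|\xi| > x) = x^{-2}$, $x \ge 1$, is in DAN with $\mathrm{Var}\,\xi = \infty$, and yet its Student $t$-statistic is still asymptotically standard normal (cf. Giné, Götze and Mason [20]), so the nominal normal confidence levels remain approximately valid:

```python
import math
import random

random.seed(3)

def sample_dan():
    # Symmetric r.v. with P(|X| > x) = x^(-2), x >= 1: in DAN, infinite variance
    u = 1.0 - random.random()              # u in (0, 1]
    return math.copysign(u ** -0.5, random.random() - 0.5)

def t_stat(xs):
    # Classical Student t-statistic sqrt(n) * mean / sample standard deviation
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((v - m) ** 2 for v in xs) / (n - 1)
    return math.sqrt(n) * m / math.sqrt(s2)

reps, n = 2000, 1000
cover = sum(abs(t_stat([sample_dan() for _ in range(n)])) <= 1.96 for _ in range(reps))
print(cover / reps)    # empirical coverage of the nominal 95% t-interval
```

The same Studentized statistic would fail to be asymptotically normal for heavier tails outside DAN, which is the sense in which (D) is a natural boundary for the Gaussian theory developed here.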

Just like the Studentization ideas associated with the new space of explanatory variables as in (D) became beneficial for SEIVM's (1.1.1)-(1.1.2), the special context of the latter models in turn proved to be rich enough to yield results that are also of interest in the general theory of DAN and GDAN. Such results appear here as auxiliary ones in Section 1.2. In particular, we extend recently obtained Studentization invariance principles for i.i.d.r.v.'s in [16], [17] to special triangular sequences of dependent r.v.'s (cf. Remarks 1.1.4 and 1.1.10 of Section 1.1.4). Moreover, in Section 1.2.3, via first obtaining Lemma 1.2.12 and then Lemma 1.2.15, a summary on various characterizations of GDAN, we also hope to have contributed in part to an answer concerning a fundamental open characterization question related to

GDAN studies, seeking a rhyming multivariate analogue of the main result in [20]. This open question can be put as follows: 'When is the multivariate Student statistic asymptotically standard normal and, furthermore, when is its corresponding process asymptotically a standard Wiener process?'. Moreover, it is the context of SEIVM's

(1.1.1)-(1.1.2) that suggested to us a simple alternative proof of the first part of Lemma 1.2.6, and helped us to come up with another useful result of Section 1.2.1, Lemma 1.2.7. All the obtained results that are mentioned in this paragraph contribute to our attempt to see the EIVM area interacting with the general theory of DAN and GDAN.

In the process of preparing Chapter 1, the author became aware of some other connections between DAN and GDAN theory and statistical applications. This is one of the reasons why the papers by Maller ([41], [42]) play a special role in our reference list. In particular, Maller [41] proves his main result on DAN and applies it to establish the asymptotic normality of the regression coefficient in a linear regression when the error variance is not necessarily finite. First, in Remark (iv) of [42] it is pointed out that "negligibility" conditions characterizing DAN and GDAN (cf. Lemma 1.2.1 of Section 1.2.1 and the equivalence of (a) and (c) of Lemma 1.2.11 in Section 1.2.3) have been successfully explored in the asymptotic theory of generalized linear models and in the analysis of many other models. Maller [42] then concludes on the importance of understanding such conditions in statistics. It appears that "negligibility" conditions have not yet been explored in the context of SEIVM's (1.1.1)-(1.1.2), and that our assumption (D) is introduced here the first time around. In fact, not only does this condition meet our empirical expectations when $\mathrm{Var}\,\xi = \infty$ in that it genuinely fits and characterizes the role of

$\xi_i$ in (1.1.1)-(1.1.2), which should be dominant over the role of the error terms with

finite variances, but, from a rigorous mathematical point of view, if $\{\xi, \xi_i,\ i \ge 1\}$ are i.i.d.r.v.'s with finite mean, then (D) is necessary and sufficient for the Gaussian asymptotic theory that we are to develop below (cf. Propositions 1.1.1, 1.1.2 of Section 1.1.4).

1.1.4 Main Results with Remarks and Observations

In this section we list our main results, Theorems 1.1.1-1.1.5 and Propositions 1.1.1, 1.1.2, with Remarks 1.1.1-1.1.13 and Observations 1.1.1, 1.1.2 on them. All the statements appearing in Observations 1.1.1, 1.1.2 are to be proved along with these theorems and propositions in Sections 1.2.2 and 1.2.4. Many of the remarks contain complementary results, frequently with immediate short proofs. First, we introduce new necessary notations, definitions and an abbreviation. Convergence in probability is denoted by $\stackrel{P}{\to}$, where $P$ is the probability measure

generated by all finite-dimensional distributions in $n$ of the model (1.1.1)-(1.1.2).


Sometimes, we say "convergence in $P$" for convergence in probability $P$. For convergence almost surely, frequently called a.s. convergence here, we write $\stackrel{\mathrm{a.s.}}{\longrightarrow}$. The notation $\stackrel{\mathcal{D}}{\to}$ stands for convergence in distribution. For r.v.'s $X$ and $Y$, $X \stackrel{\mathcal{D}}{=} Y$ reads as $X$ and $Y$ are equal in distribution. By writing $\{W(t),\ 0 \le t < \infty\}$, or $W(t)$, we mean a standard real-valued Wiener process (Brownian motion). The space of real-valued

functions $D[0,1]$ with the sup-norm metric $\rho$ is denoted by $(D[0,1], \rho)$. Throughout Chapter 1 all the vectors are row-vectors, $\|\cdot\|$ denotes their Euclidean norm, while $(\cdot,\cdot)$ stands for the Euclidean inner product of two vectors. Notation (1.1.4) is also employed for vectors in $\mathbb{R}^d$. A random vector $Z$ is called full or, equivalently, $Z$ has a full distribution if for any deterministic vector $u$, with $\|u\| = 1$, $(Z, u)$ is a nondegenerate random variable. We say that nonrandom vectors $Z_1, Z_2, \dots, Z_k$

in $\mathbb{R}^d$ are linearly independent if the relationship $c_1Z_1 + c_2Z_2 + \cdots + c_kZ_k = 0$ implies that the constants $c_1, c_2, \dots, c_k$ are all zeroes, $k \ge 2$. If $A$ is a vector/matrix, then $A^T$ is used for its transpose. We write $I_n$ for the unit $n \times n$ matrix. If matrix $A$ is positive definite, we write $A > 0$. For an appropriate square matrix $A$, its Cholesky and symmetric positive definite square roots are both designated by the universal $A^{1/2}$,

and A ~x!2 — (A1/2) 1 and A~T!2 — {A ^l2^ (cf. more on matrix square roots at the beginning of Section 1.2.3). Throughout the chapter, const universally stands for various absolute constants. Sometimes, instead of saying that a certain property related to a sequence of random matrices/variables holds on sets whose probabilities approach one, i.e., that the property holds “with probability approaching one”, we use abbreviation WPA1. Our first main result, Theorem 1.1.1, is on consistency of WLSE’s, MLSE’s,

0ire> 02n and Xd3n. It is split into parts Theorem 1.1.1a and Theorem 1.1.1b, and collectively called Theorem 1.1.1 on occasions.

Theorem 1.1.1a. Let assumptions (A), (C), (E) and the identifiability assumption in (1)-(3) that is appropriate for the estimator in hand be satisfied. For studying $\hat\beta_{1n}$ and $\hat\alpha_{1n}$, it is additionally assumed that $\mu = 0$ in (1.1.3). Then, as $n \to \infty$,

$$\hat\beta_{in} \xrightarrow{a.s.} \beta,\quad \hat\alpha_{in} \xrightarrow{a.s.} \alpha,\quad \hat\theta_{2n} \xrightarrow{a.s.} \theta \quad\text{and}\quad \widehat{\lambda\theta}_{3n} \xrightarrow{a.s.} \lambda\theta, \qquad i=\overline{1,3}. \tag{1.1.29}$$

Theorem 1.1.1b. Let assumptions (A), (C), (E) and (1) be satisfied. Assume that $\mu = 0$ in (1.1.3). When $\operatorname{Var}\xi = \infty$, suppose additionally that (B) and (D) are valid. Then, as $n \to \infty$,

$$\hat\theta_{1n} \xrightarrow{a.s.} \theta,\ \text{if } \operatorname{Var}\xi < \infty, \qquad \hat\theta_{1n} \xrightarrow{P} \theta,\ \text{if } \operatorname{Var}\xi = \infty. \tag{1.1.30}$$

Remark 1.1.1. We note that the estimators of Theorem 1.1.1a are strongly consistent both in case $\operatorname{Var}\xi < \infty$ and $\operatorname{Var}\xi = \infty$. Furthermore, Theorem 1.1.1a requires only the existence of two moments of the error terms and the mean of the explanatory variables. This is not so for $\hat\theta_{1n}$ in Theorem 1.1.1b: when $\operatorname{Var}\xi = \infty$, model (1.1.1)-(1.1.2) is not sensitive enough to capture consistency of $\hat\theta_{1n}$ only under (A), (C) and (E). Finally, while the case $\operatorname{Var}\xi = \infty$ seems not to have been covered in the literature so far, Theorems 1.1.1a and 1.1.1b imply the corresponding consistency results that follow for $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$ from Gleser ([21], [22] combined), for $\hat\beta_{2n}$, $\hat\alpha_{2n}$, $\hat\beta_{3n}$ and $\hat\alpha_{3n}$ from Cheng and Tsai [9], and for $\hat\theta_{2n}$ and $\widehat{\lambda\theta}_{3n}$ from Cheng and Van Ness ([11], [10]) under (A), (C), (E) and the assumption that $\operatorname{Var}\xi < \infty$. In fact, the consistency from [21], [22] combined is for the model with $\Gamma$ of (1.1.3) where $\lambda = 1$, $\mu = 0$ and $\theta$ is unknown, but we note that, naturally, the results for $\hat\beta_{1n}$ and $\hat\alpha_{1n}$ are also valid when $\lambda = 1$, $\mu = 0$ and $\theta$ is known. Moreover, in view of upcoming Remark 1.1.7, consistency of $\hat\beta_{1n}$ and $\hat\alpha_{1n}$ (whether $\theta$ is known or not), and $\hat\theta_{1n}$ can be extended from the case of $\lambda = 1$ and $\mu = 0$ to the case of arbitrary $\lambda$ and $\mu = 0$. In addition to the aforementioned conditions for their consistency results, the authors of [9] assume that $\delta$ and $\varepsilon$ are independent and as in (B), and $E\xi^4 < \infty$, while those of [11] and [10] also make the assumption that $(\xi, \delta, \varepsilon)$ follows a normal distribution with diagonal covariance matrix. We also note in passing that, on account of (1.2.19) and (1.2.21) of the proof of Theorem 1.1.1a, all the estimators under study that were introduced in Section 1.1.3 are well-defined WPA1 as $n \to \infty$.
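For illustration purposes only (the sketch below is ours and is not part of the thesis's formal development), the consistency phenomenon above can be checked numerically in the special case of identifiability assumption (1) with $\mu = 0$, where the slope estimator has the textbook Deming/orthogonal-regression closed form under a known error-variance ratio. All variable names and the simulated design are our own choices.

```python
import math
import random

def deming_slope(x, y, lam):
    """Closed-form slope of the error-in-variables line fit when the ratio
    lam = Var(eps)/Var(delta) of the error variances is known (classical
    Deming regression; lam = 1 is orthogonal regression)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x) / (n - 1)
    syy = sum((b - ybar) ** 2 for b in y) / (n - 1)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (n - 1)
    d = syy - lam * sxx
    return (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)

random.seed(0)
alpha, beta, lam, n = 1.0, 2.0, 1.0, 200_000
xi = [random.gauss(0.0, 3.0) for _ in range(n)]              # explanatory variables
x = [v + random.gauss(0.0, 1.0) for v in xi]                 # x_i = xi_i + delta_i
y = [alpha + beta * v + random.gauss(0.0, 1.0) for v in xi]  # y_i = alpha + beta*xi_i + eps_i
beta_hat = deming_slope(x, y, lam)
alpha_hat = sum(y) / n - beta_hat * sum(x) / n
print(beta_hat, alpha_hat)
```

At the population values here ($S_{yy} = 37$, $S_{xx} = 10$, $S_{xy} = 18$) the closed form returns the true slope exactly, so for large $n$ the printed estimates settle near $\beta = 2$ and $\alpha = 1$; by contrast, the uncorrected least squares slope would converge to the attenuated value $1.8$.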

Our second main result is Theorem 1.1.2. For convenience, it is split into Theorems 1.1.2a, 1.1.2b and 1.1.2c, respectively for the estimators and processes of $\beta$, $\alpha$ and the unknown error variances.

Theorem 1.1.2a. Let assumptions (B), (D), (E) and the identifiability assumption in (1)-(3) that is appropriate for the process in hand be satisfied. When studying $(\hat\beta_{1n} - \beta)t$, suppose also that $\mu = 0$ in (1.1.3). Let

$$U(j,n) = \cdots, \qquad j=\overline{1,3}, \tag{1.1.31}$$

and

$$u_i(j,n) = \begin{cases} \cdots, & \text{if } j = 1,\\ (s_{i,yy} - \lambda\theta) - \beta(s_{i,xy} - \mu), & \text{if } j = 2,\\ (s_{i,xy} - \mu) - \beta(s_{i,xx} - \theta), & \text{if } j = 3. \end{cases} \tag{1.1.32}$$

Then, as $n \to \infty$, the following statements hold true:

(i) $\dfrac{\sqrt{n}\,U(j,n)(\hat\beta_{jn} - \beta)\,t_0}{\big(\sum_{i=1}^n (u_i(j,n) - \bar u(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{D} N(0, t_0)$, for $t_0 \in (0,1]$;

(ii) $\dfrac{\sqrt{n}\,U(j,n)(\hat\beta_{jn} - \beta)\,t}{\big(\sum_{i=1}^n (u_i(j,n) - \bar u(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{D} W(t)$ on $(D[0,1],\rho)$;

(iii) On an appropriate probability space for $\{(\xi,\delta,\varepsilon),\,(\xi_i,\delta_i,\varepsilon_i),\ i \ge 1\}$ we can construct a standard Wiener process $\{W(t),\ 0 \le t < \infty\}$ such that

$$\sup_{0 \le t \le 1}\left|\frac{\sqrt{n}\,U(j,n)(\hat\beta_{jn} - \beta)\,t}{\big(\sum_{i=1}^n (u_i(j,n) - \bar u(j,n))^2/(n-1)\big)^{1/2}} - \frac{W(nt)}{\sqrt{n}}\right| = o_P(1).$$

Theorem 1.1.2b. Let assumptions (B), (D), (E) and the identifiability assumption in (1)-(3) that is appropriate for the process in hand be satisfied. When studying $(\hat\alpha_{1n} - \alpha)t$, we also suppose that $\mu = 0$ in (1.1.3). Define

$$v_i(j,n) = (y_i - \alpha) - \beta x_i - \frac{\bar x}{U(j,n)}\,u_i(j,n), \qquad j=\overline{1,3}. \tag{1.1.33}$$


Then, as $n \to \infty$, we have:

(i) $\dfrac{\sqrt{n}\,(\hat\alpha_{jn} - \alpha)\,t_0}{\big(\sum_{i=1}^n (v_i(j,n) - \bar v(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{D} N(0, t_0)$, for $t_0 \in (0,1]$;

(ii) $\dfrac{\sqrt{n}\,(\hat\alpha_{jn} - \alpha)\,t}{\big(\sum_{i=1}^n (v_i(j,n) - \bar v(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{D} W(t)$ on $(D[0,1],\rho)$;

(iii) On an appropriate probability space for $\{(\xi,\delta,\varepsilon),\,(\xi_i,\delta_i,\varepsilon_i),\ i \ge 1\}$ we can construct a standard Wiener process $\{W(t),\ 0 \le t < \infty\}$ such that

$$\sup_{0 \le t \le 1}\left|\frac{\sqrt{n}\,(\hat\alpha_{jn} - \alpha)\,t}{\big(\sum_{i=1}^n (v_i(j,n) - \bar v(j,n))^2/(n-1)\big)^{1/2}} - \frac{W(nt)}{\sqrt{n}}\right| = o_P(1).$$

Theorem 1.1.2c. Let assumptions (B), (D), (E) and the identifiability assumption in (1)-(3) that is appropriate for the process in hand be satisfied. Suppose that $\mu = 0$ in (1.1.3) when $(\hat\theta_{1n} - \theta)t$ is studied. Define

$$L(j,n) = \begin{cases} (n-2)(\lambda + \hat\beta_{1n}^2)/n, & \text{if } j = 1,\\ 1, & \text{if } j = 2,\\ 1, & \text{if } j = 3, \end{cases} \tag{1.1.34}$$

and

$$w_i(j,n) = \begin{cases} (s_{i,yy} - \lambda\theta) - 2\beta(s_{i,xy} - \mu) + \beta^2(s_{i,xx} - \theta), & \text{if } j = 1,\\ \beta^{-2}\big((s_{i,yy} - \lambda\theta) - 2\beta(s_{i,xy} - \mu) + \beta^2(s_{i,xx} - \theta)\big), & \text{if } j = 2,\\ (s_{i,yy} - \lambda\theta) - 2\beta(s_{i,xy} - \mu) + \beta^2(s_{i,xx} - \theta), & \text{if } j = 3. \end{cases} \tag{1.1.35}$$

Then, for $j = 1$ and $2$, as $n \to \infty$, the following are valid:

(i) $\dfrac{\sqrt{n}\,L(j,n)(\hat\theta_{jn} - \theta)\,t_0}{\big(\sum_{i=1}^n (w_i(j,n) - \bar w(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{D} N(0, t_0)$, for $t_0 \in (0,1]$;

(ii) $\dfrac{\sqrt{n}\,L(j,n)(\hat\theta_{jn} - \theta)\,t}{\big(\sum_{i=1}^n (w_i(j,n) - \bar w(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{D} W(t)$ on $(D[0,1],\rho)$;

(iii) On an appropriate probability space for $\{(\xi,\delta,\varepsilon),\,(\xi_i,\delta_i,\varepsilon_i),\ i \ge 1\}$ we can construct a standard Wiener process $\{W(t),\ 0 \le t < \infty\}$ such that

$$\sup_{0 \le t \le 1}\left|\frac{\sqrt{n}\,L(j,n)(\hat\theta_{jn} - \theta)\,t}{\big(\sum_{i=1}^n (w_i(j,n) - \bar w(j,n))^2/(n-1)\big)^{1/2}} - \frac{W(nt)}{\sqrt{n}}\right| = o_P(1).$$


Moreover, (i)-(iii) are valid for $(\widehat{\lambda\theta}_{3n} - \lambda\theta)t$, with $w_i(3,n)$ and $L(3,n)$ respectively in place of $w_i(j,n)$ and $L(j,n)$.

Remark 1.1.2. Weak convergence on $(D[0,1],\rho)$ in the (ii)'s of Theorem 1.1.2 is to be understood in the following way. Let $\mathcal{D}$ be the sigma-field of subsets of $D[0,1]$ generated by the finite-dimensional subsets of $D[0,1]$. We say that a sequence of random elements $\{X_n(t),\ 0 \le t \le 1,\ n \ge 1\}$ of $D[0,1]$ converges weakly to $W(t)$ on $(D[0,1],\rho)$ if

$$h(X_n(t)) \xrightarrow{D} h(W(t)), \qquad n \to \infty,$$

for all $h: D[0,1] \to \mathbb{R}$ that are $(D[0,1],\mathcal{D})$-measurable and $\rho$-continuous, or $\rho$-continuous except at points forming a set of Wiener measure zero on $(D[0,1],\mathcal{D})$. With this definition in mind, we conclude the (ii)'s of Theorem 1.1.2 from the respective (iii)'s. For example, (iii) for the WLSP for $\beta$ implies that for any $h$ as above,

$$h\left(\frac{\sqrt{n}\,U(1,n)(\hat\beta_{1n} - \beta)\,t}{\big(\sum_{i=1}^n (u_i(1,n) - \bar u(1,n))^2/(n-1)\big)^{1/2}}\right) - h\left(\frac{W(nt)}{\sqrt{n}}\right) \xrightarrow{P} 0, \qquad n \to \infty,$$

and since $\{W(nt)/\sqrt{n},\ 0 \le t \le 1\} \stackrel{D}{=} \{W(t),\ 0 \le t \le 1\}$ for each $n \ge 1$, we have

$$h\big(W(nt)/\sqrt{n}\big) \stackrel{D}{=} h\big(W(t)\big), \qquad \text{for each } n \ge 1,$$

and the respective (ii) of Theorem 1.1.2a follows. Clearly, the (ii)'s, in turn, yield the respective cases (i) in Theorem 1.1.2. Thus, the proof of Theorem 1.1.2 will be reduced to establishing the (iii)'s only. In spite of this, cases (i)-(iii) are spelled out in Theorem 1.1.2 as separate results for further use and convenient reference.

Remark 1.1.3. While the (i) parts with $t_0 = 1$ of Theorems 1.1.2a-1.1.2c seem to present the first and only CLT's for SEIVM's (1.1.1)-(1.1.2) if $\xi \in \mathrm{DAN}$ and $\operatorname{Var}\xi = \infty$ (as allowed by (D) in view of Remark 1.2.1 of Section 1.2.1), when $\operatorname{Var}\xi < \infty$, they imply already known related CLT's. As has been mentioned earlier in Section 1.1.3, the vectors of estimators $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$ (cf. [21], [22] combined), $(\hat\beta_{2n}, \hat\alpha_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n})$ (cf. [9]), $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$ (cf. [11]) and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$ (cf. [10]) are known to be $\sqrt{n}$-asymptotically normal. The result that follows from [21], [22] combined requires (1) ($\Gamma$ of (1.1.3) is such that $\lambda = 1$, $\mu = 0$ and $\theta$ is unknown), (B), (C), (E) and $\operatorname{Var}\xi < \infty$. We note that this result is also valid with arbitrary $\lambda$ of $\Gamma$ in view of Remark 1.1.7 that, naturally, implies $\sqrt{n}$-asymptotic normality for $(\hat\beta_{1n}, \hat\alpha_{1n})$ when $\lambda$ is arbitrary, $\mu = 0$ and $\theta$ is known in $\Gamma$ of (1.1.3). The two results following from [9] are proved under the respective condition in (2)-(3), (B), independence of $\delta$ and $\varepsilon$, (C), (E) and $E\xi^4 < \infty$. We note that Cheng and Tsai could have concluded these results simply under $\operatorname{Var}\xi < \infty$, if they had combined the results of their paper for the so-called functional EIVM with [22]. As to the CLT's in [11] and [10], they are stated under (2) and (3) respectively, (B), (C), (E), $\operatorname{Var}\xi < \infty$ and the assumption that $(\xi, \delta, \varepsilon)$ has a trivariate normal distribution with diagonal covariance matrix. Thus, in the respective (i) parts with $t_0 = 1$ of Theorems 1.1.2a-1.1.2c, restricting ourselves to the conditions of the aforementioned papers, i.e., assuming (C) and $\operatorname{Var}\xi < \infty$ instead of (D), but not necessarily assuming that $\lambda = 1$ in $\Gamma$ of (1.1.3) in regards of $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$ and $(\hat\beta_{1n}, \hat\alpha_{1n})$, assuming independence of $\delta$ and $\varepsilon$, (C) and $E\xi^4 < \infty$ instead of (D) in regards of $(\hat\beta_{2n}, \hat\alpha_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n})$, and assuming (C) and $\operatorname{Var}\xi < \infty$ instead of (D) and that $(\xi, \delta, \varepsilon)$ has $N\big((E\xi, 0, 0), \operatorname{diag}(\operatorname{Var}\xi, \lambda\theta, \theta)\big)$ distribution as far as $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$ go, we can obtain all the marginal CLT's that follow from [21], [22] combined, and [9], [11] and [10]. Indeed, under such conditions, as $n \to \infty$, due to (1.2.19), (1.2.69), (1.2.79), (1.2.83) of Section 1.2.2 and consistency of $\hat\beta_{jn}$, the expressions $\big(\sum_{i=1}^n (u_i(j,n) - \bar u(j,n))^2/(n-1)\big)U^{-2}(j,n)$, $\sum_{i=1}^n (v_i(j,n) - \bar v(j,n))^2/(n-1)$ and $\big(\sum_{i=1}^n (w_i(j,n) - \bar w(j,n))^2/(n-1)\big)L^{-2}(j,n)$, $j=\overline{1,3}$, converge in probability to positive constants that are equal to the variances of the asymptotic normal distributions of the corresponding estimators in accordance with [21], [22] combined, and [9], [11] and [10].


Remark 1.1.4. In (a)-(c) of this remark we address the main Studentization features of Theorem 1.1.2.

(a) Theorems 1.1.2a-1.1.2c may be viewed as nontrivial extensions of known limit theorems based on Studentization in that all the processes in Theorem 1.1.2, namely, $\sqrt{n}\,U(1,n)(\hat\beta_{1n} - \beta)t\big(\sum_{i=1}^n (u_i(1,n) - \bar u(1,n))^2/(n-1)\big)^{-1/2}$, are essentially Student processes along the lines of (1.2.13) of Section 1.2.1 in a somewhat loose sense. More precisely, according to forthcoming (i) of Proposition 1.1.1, for $j=\overline{1,3}$, the processes in $D[0,1]$

$$\frac{\sqrt{n}\,\bar u(j,n)\,t}{\big(\sum_{i=1}^n (u_i(j,n) - \bar u(j,n))^2/(n-1)\big)^{1/2}}, \qquad \frac{\sqrt{n}\,\bar v'(j,n)\,t}{\big(\sum_{i=1}^n (v'_i(j,n) - \bar v'(j,n))^2/(n-1)\big)^{1/2}} \qquad\text{and}\qquad \frac{\sqrt{n}\,\bar w(j,n)\,t}{\big(\sum_{i=1}^n (w_i(j,n) - \bar w(j,n))^2/(n-1)\big)^{1/2}}, \tag{1.1.36}$$

where

$$v'_i(j,n) = \begin{cases} (y_i - \alpha) - \beta x_i - \cdots\, u_i(j,n), & \text{if } j=\overline{1,3} \text{ and } \operatorname{Var}\xi < \infty,\\ (y_i - \alpha) - \beta x_i, & \text{if } \operatorname{Var}\xi = \infty,\ j=\overline{1,3}, \end{cases} \tag{1.1.37}$$

are the main term processes respectively for those studied in Theorems 1.1.2a, 1.1.2b and 1.1.2c, and can be viewed as special Student processes for the triangular sequences $\{u_i(j,n),\ 1 \le i \le n,\ n \ge 1\}$, $\{v'_i(j,n),\ 1 \le i \le n,\ n \ge 1\}$ and $\{w_i(j,n),\ 1 \le i \le n,\ n \ge 1\}$ of dependent random variables. Moreover, all the processes in (1.1.36) are handled here via our auxiliary Lemma 1.2.10 of Section 1.2.2. The latter lemma extends the recently obtained Studentization invariance principles in [16] and [17], and also serves as a universal result that is applicable to all reasonable estimators in the context of (1.1.1)-(1.1.2) that are based on $(\bar y, \bar x, S_{yy}, S_{xy}, S_{xx})$ and their corresponding processes (cf. also Remark 1.2.7 of Section 1.2.2). Due to Studentization, the invariance principles of Theorems 1.1.2a-1.1.2c are invariant with respect to the distribution of $(\xi, \delta, \varepsilon)$ satisfying (B), (D), (E) and


the condition in (1)-(3) that is appropriate for the process in hand, strikingly free of any unknown parameters of this distribution (they depend only on error moments that are assumed to be known according to the corresponding (1)-(3)), and the only unknown parameter appearing in the normalizers $\big(\sum_{i=1}^n (u_i(j,n) - \bar u(j,n))^2/(n-1)\big)^{1/2}$, $\big(\sum_{i=1}^n (v_i(j,n) - \bar v(j,n))^2/(n-1)\big)^{1/2}$ and $\big(\sum_{i=1}^n (w_i(j,n) - \bar w(j,n))^2/(n-1)\big)^{1/2}$, $j=\overline{1,3}$, is $\beta$. Some immediate applications of Theorem 1.1.2 and the upcoming Theorem 1.1.3, where $\beta$ will be replaced with its WLSE or MLSE's that are available in the corresponding cases (1)-(3), will be discussed in Section 1.1.5.

(b) Speaking of CLT's, as opposed to the (i)'s with $t_0 = 1$ of Theorems 1.1.2a-1.1.2c, although the $\sqrt{n}$-asymptotic normality of $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$, or that of $(\hat\beta_{1n}, \hat\alpha_{1n})$, which follow from Gleser ([21], [22]) (cf. Remark 1.1.3 for details), is generally proved without assuming normality or normality-like conditions on the error terms, its applicable form requires the error moments being as if $(\delta, \varepsilon)$ had a $N(0, \operatorname{diag}(\lambda\theta, \theta))$ distribution. This is because, in general, the expressions for the covariance matrices of the asymptotic distributions of $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$ and $(\hat\beta_{1n}, \hat\alpha_{1n})$ in these CLT's are complicated and involve unknown error term cross-moments up to and including moments of order four, in addition to the unknown parameters $\beta$, $E\xi$ and $\operatorname{Var}\xi$. As pointed out in Gleser [21], these error moments are hard to estimate from data. On the other hand, when Gleser assumes that the error moments are identical to those in $N(0, \theta I_2)$, the covariance matrix of the asymptotic normal distribution of $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$, or of $(\hat\beta_{1n}, \hat\alpha_{1n})$, becomes simple in form and contains only the following unknown parameters: $\beta$, $\theta$, $E\xi$ and $\operatorname{Var}\xi$. In this set-up, in addition, estimation of this covariance matrix only requires consistent estimators of $E\xi$ and $\operatorname{Var}\xi$, which are available due to [21], [22]. Consequently, even when $\operatorname{Var}\xi < \infty$, the CLT's of the (i)'s with $t_0 = 1$ and $j = 1$ in Theorem 1.1.2 provide essentially new results as compared to the marginal CLT's for $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$ that follow from [21], [22], since our results do not involve $E\xi$, $\operatorname{Var}\xi$ and various hard-to-handle error moments, and are almost completely data-based


(will be made exactly data-based in Theorem 1.1.3). Similarly to [21], [22], the CLT's for $(\hat\beta_{2n}, \hat\alpha_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n})$ in [9] also suffer from complicated covariance matrices of their asymptotic normal distributions unless $\delta$ and $\varepsilon$ are assumed to be normal, while the normality assumptions of [11] and [10] allow simple estimable forms for the covariance matrices of the asymptotic normal distributions of $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$ respectively (cf. Remark 1.1.3 for details on these CLT's). Also, in Theorem 1.2.1 in [19], where $(\hat\beta_{3n}, \hat\alpha_{3n})$ are shown to be consistent and $\sqrt{n}$-asymptotically normal, normality of $(\xi, \delta, \varepsilon)$ with positive definite diagonal covariance matrix had to be assumed to overcome difficulties with "Slutskying" of the covariance matrix of the asymptotic distribution of $(\hat\beta_{3n}, \hat\alpha_{3n})$. We note that the consistent estimators for the variances of the asymptotic distributions of the estimators of $\beta$ proposed in [21], [22] combined, and [19] are different from the expressions $\big(\sum_{i=1}^n (\hat u_i(j,n) - \bar{\hat u}(j,n))^2/(n-1)\big)U^{-2}(j,n)$ and $\big(\sum_{i=1}^n \hat u_i^2(j,n)/n\big)U^{-2}(j,n)$ of the upcoming Theorem 1.1.3 and part (b) of Remark 1.1.5 respectively, $j=\overline{1,3}$, that serve as consistent estimators of those variances in case $\operatorname{Var}\xi < \infty$ here. In sum, we are appreciative of the fact that the basic original idea behind Studentization naturally accommodates itself in the CLT's of Theorem 1.1.2 in the advanced enough context of SEIVM's (1.1.1)-(1.1.2) via making these CLT's data-based and free of the unknown moments of the unobservable explanatory and error variables.

(c) Concerning the results of Theorem 1.1.2, we could have taken an alternative self-normalization approach (cf. more in Section 1.2.1). In fact, in this regard, Theorem 1.1.2 continues to hold true when the normalizers $\big(\sum_{i=1}^n (u_i(j,n) - \bar u(j,n))^2/(n-1)\big)^{1/2}$, $\big(\sum_{i=1}^n (v_i(j,n) - \bar v(j,n))^2/(n-1)\big)^{1/2}$ and $\big(\sum_{i=1}^n (w_i(j,n) - \bar w(j,n))^2/(n-1)\big)^{1/2}$ of the corresponding processes are respectively replaced with $\big(\sum_{i=1}^n u_i^2(j,n)/n\big)^{1/2}$, $\big(\sum_{i=1}^n v_i^2(j,n)/n\big)^{1/2}$ and $\big(\sum_{i=1}^n w_i^2(j,n)/n\big)^{1/2}$, $j=\overline{1,3}$. The proof of this fact follows from Lemma 1.2.8 of Section 1.2.2, with $\big(\sum_{i=1}^n \langle\zeta_i, b\rangle^2/n\big)^{1/2}$ in place of $\big(\sum_{i=1}^n \langle\zeta_i - \bar\zeta, b\rangle^2/(n-1)\big)^{1/2}$ (based on the WLLN and Remark 1.2.5), and from Lemma 1.2.10 of Section 1.2.2 with $M_{n,t}$ having a new denominator $\big(\sum_{i=1}^n (d^{(1)}y_i + d^{(2)}x_i + d^{(3)}s_{i,yy} + d^{(4)}s_{i,xy} + d^{(5)}s_{i,xx})^2\big)^{1/2}$ (this follows from the modified Lemma 1.2.8 and Remark 1.2.5, similarly to the proof of the original Lemma 1.2.10). Omitting the details of this proof, we only note here that, as opposed to self-normalization, Studentization in Theorem 1.1.2 makes us benefit from the normalizers $\big(\sum_{i=1}^n (v_i(j,n) - \bar v(j,n))^2/(n-1)\big)^{1/2}$ being $\alpha$-free, and from the absence of the unknown error variances in $\big(\sum_{i=1}^n (w_i(j,n) - \bar w(j,n))^2/(n-1)\big)^{1/2}$. It appears that the two approaches are equally effective for Theorem 1.1.2a. However, in view of the agreement of the latter theorem with the joint CLT's established in the upcoming Theorem 1.1.4, it is more convenient to keep (i)-(iii) in Theorem 1.1.2a in their present Studentized forms.

(d) In view of the conclusive lines of (b) of this remark, one may wonder about the necessity of working out asymptotics at all for the estimators and processes for the unknown error variances in Chapter 1. Indeed, $\hat\theta_{1n}$, $\hat\theta_{2n}$ and $\widehat{\lambda\theta}_{3n}$ are no longer needed to make the CLT's in SEIVM's (1.1.1)-(1.1.2) applicable. In fact, this is also true in regards of all the other invariance principles of Section 1.1.4. Nevertheless, when $\operatorname{Var}\xi < \infty$, $\hat\theta_{1n}$ and $\hat\theta_{2n}$ may come in handy for estimating the so-called reliability ratio (cf. (c) of Remark 1.1.6), which may also serve as an indicator for reliability of confidence intervals for $\beta$ (cf. Section 1.1.5). In any case, technically, asymptotics for $\hat\theta_{1n}$ and $\hat\theta_{2n}$ under $\operatorname{Var}\xi = \infty$ are handled simultaneously with the case $\operatorname{Var}\xi < \infty$.

Enriching the space of explanatory variables from the traditional two-moment class to the DAN class in SEIVM's (1.1.1)-(1.1.2), i.e., assuming (D), practically exhausts the choice of explanatory variables from (C) that allow the respective Gaussian limits in (i)-(iii) of Theorems 1.1.2a-1.1.2b to hold true, as stated in the following Proposition 1.1.1. For a more general result along these lines, we refer to Remark 1.2.8 of Section 1.2.2, where we study special Studentized partial sums processes corresponding to all estimators that are reasonable functions of $(\bar y, \bar x, S_{yy}, S_{xy}, S_{xx})$ in SEIVM's (1.1.1)-(1.1.2) (cf. also Remark 1.2.7 of Section 1.2.2).

Proposition 1.1.1.

(i) Under the respective conditions of Theorems 1.1.2a-1.1.2c, the processes in (1.1.36) are the respective main terms for the processes studied in these theorems.

(ii) Suppose that assumptions (B), (C), (E) and the assumption in (1)-(3) that is appropriate for the process in hand are satisfied. Then, mutatis mutandis, statements (i)-(iii) of Theorem 1.1.2a for the processes of this theorem hold respectively true for the main terms of the corresponding processes in (1.1.36) if and only if $\xi \in \mathrm{DAN}$. In addition, mutatis mutandis, in the case of (C) with $\operatorname{Var}\xi < \infty$, (i)-(iii) of Theorem 1.1.2b for the processes of this theorem hold respectively true for the main terms of the corresponding processes in (1.1.36) if and only if $\xi \in \mathrm{DAN}$.

(iii) Assume (B), (C), (E) and the assumption in (2)-(3) that is appropriate for the process in hand. Suppose that the intercept $\alpha$ is known to be zero. Then each of (i)-(iii) of Theorem 1.1.2a for the MLSP's of $\beta$ (cases $j = 2$ and $3$) is equivalent to the assumption that $\xi \in \mathrm{DAN}$.
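The special role that the DAN condition plays here can be seen already on the simplest Studentized object, the classical Student $t$-statistic: by the known characterization of Giné, Götze and Mason, it is asymptotically standard normal exactly for centered summands in DAN, even when their variance is infinite. The following small simulation (ours, purely for illustration; all names are our own) exhibits this with $t_2$-distributed summands, which have infinite variance yet belong to DAN.

```python
import math
import random

def t_statistic(sample):
    """Classical Student t-statistic: sqrt(n) * mean / sample standard deviation."""
    n = len(sample)
    m = sum(sample) / n
    s2 = sum((v - m) ** 2 for v in sample) / (n - 1)
    return math.sqrt(n) * m / math.sqrt(s2)

def t2(rng):
    """Symmetric Student t variate with 2 degrees of freedom: mean 0,
    infinite variance, but in the domain of attraction of the normal law."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2
    return z / math.sqrt(chi2 / 2.0)

rng = random.Random(1)
reps, n = 2000, 500
tstats = [t_statistic([t2(rng) for _ in range(n)]) for _ in range(reps)]
# Fraction of Studentized means falling inside the standard normal 95% band:
coverage = sum(abs(t) <= 1.96 for t in tstats) / reps
print(coverage)
```

Despite the infinite variance of the summands, the empirical coverage of the $\pm 1.96$ band stays close to 0.95, which is the finite-sample face of the self-normalized/Studentized CLT that the DAN assumption (D) buys in the theorems above.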

Observation 1.1.1. Using the (i)'s with $t_0 = 1$ of Theorem 1.1.2a, it will be proved in Section 1.2.2 that for the WLSE and MLSE's of $\beta$, as $n \to \infty$,

$$\sqrt{n}\,\ell_\xi(n)\,(\hat\beta_{jn} - \beta) \xrightarrow{D} N(0, c_j), \qquad j=\overline{1,3}, \tag{1.1.38}$$

where $\ell_\xi(n)$ is a slowly varying at infinity, typically unknown, function (the one from (1.2.28)) that converges to infinity when $\operatorname{Var}\xi = \infty$, and equals a positive constant when $\operatorname{Var}\xi < \infty$, while the $c_j$ are positive constants. Thus, the $\hat\beta_{jn}$ are $\sqrt{n}\,\ell_\xi(n)$-asymptotically normal estimators of $\beta$. In this regard we note that when $\operatorname{Var}\xi = \infty$, as first allowed in this chapter (cf. (D)), the degree of precision of the WLSE and MLSE's of $\beta$ increases. This effect naturally meets our empirical expectations in the sense that, intuitively, letting the $\xi_i$ in (1.1.1)-(1.1.2) have an infinite variance makes them more dominant over the errors with finite variances. This, in turn, makes the observations $y_i$ and $x_i$ more robust to noise (errors) and thus more precise. As to the estimators $\hat\alpha_{jn}$ of $\alpha$, $j=\overline{1,3}$, and $\hat\theta_{1n}$, $\hat\theta_{2n}$ and $\widehat{\lambda\theta}_{3n}$ of the unknown variances, from the respective (i)'s of Theorems 1.1.2b and 1.1.2c, under (D) assumed, they will be shown to be $\sqrt{n}$-asymptotically normal.

As anticipated in Remark 1.1.4, eliminating $\beta$, the only unknown parameter in the normalizers $\big(\sum_{i=1}^n (u_i(j,n) - \bar u(j,n))^2/(n-1)\big)^{1/2}$, $\big(\sum_{i=1}^n (v_i(j,n) - \bar v(j,n))^2/(n-1)\big)^{1/2}$ and $\big(\sum_{i=1}^n (w_i(j,n) - \bar w(j,n))^2/(n-1)\big)^{1/2}$ of Theorems 1.1.2a-1.1.2c, we provide below complete data-based versions of these theorems in Theorems 1.1.3a-1.1.3c, which are readily available for some immediate applications as in Section 1.1.5 and are collectively called Theorem 1.1.3 on occasions.

Theorem 1.1.3a. Let all the assumptions of Theorem 1.1.2a be satisfied. Introduce

$$\hat u_i(j,n) = \begin{cases} \dfrac{\lambda - \hat\beta_{1n}^2}{\lambda + \hat\beta_{1n}^2}\Big(\cdots\Big), & \text{if } j = 1,\\[4pt] (s_{i,yy} - \lambda\theta) - \hat\beta_{2n}(s_{i,xy} - \mu), & \text{if } j = 2,\\[2pt] (s_{i,xy} - \mu) - \hat\beta_{3n}(s_{i,xx} - \theta), & \text{if } j = 3. \end{cases} \tag{1.1.39}$$

Then, as $n \to \infty$, statements (i)-(iii) of Theorem 1.1.2a continue to hold true with $\hat u_i(j,n)$ in place of $u_i(j,n)$, $j=\overline{1,3}$.

Theorem 1.1.3b. Suppose that all the conditions of Theorem 1.1.2b are valid. Let

$$\hat v_i(j,n) = (y_i - \hat\alpha_{jn}) - \hat\beta_{jn}x_i - \frac{\bar x}{U(j,n)}\,\hat u_i(j,n), \qquad j=\overline{1,3}, \tag{1.1.40}$$

where $U(j,n)$ and $\hat u_i(j,n)$ are as in (1.1.31) and (1.1.39), $j=\overline{1,3}$. Then, as $n \to \infty$, statements (i)-(iii) of Theorem 1.1.2b with $\hat v_i(j,n)$ replacing $v_i(j,n)$ remain valid, $j=\overline{1,3}$.


Theorem 1.1.3c. Assume all the assumptions of Theorem 1.1.2c. Let

$$\hat w_i(j,n) = \begin{cases} (s_{i,yy} - \lambda\hat\theta_{1n}) - 2\hat\beta_{1n}(s_{i,xy} - \mu) + \hat\beta_{1n}^2(s_{i,xx} - \hat\theta_{1n}), & \text{if } j = 1,\\ \hat\beta_{2n}^{-2}\big((s_{i,yy} - \lambda\theta) - 2\hat\beta_{2n}(s_{i,xy} - \mu) + \hat\beta_{2n}^2(s_{i,xx} - \hat\theta_{2n})\big), & \text{if } j = 2,\\ (s_{i,yy} - \widehat{\lambda\theta}_{3n}) - 2\hat\beta_{3n}(s_{i,xy} - \mu) + \hat\beta_{3n}^2(s_{i,xx} - \theta), & \text{if } j = 3. \end{cases} \tag{1.1.41}$$

Then, as $n \to \infty$, statements (i)-(iii) of Theorem 1.1.2c, with $\hat w_i(j,n)$ replacing $w_i(j,n)$, continue to hold true, $j=\overline{1,3}$.

The newly introduced condition $\xi \in \mathrm{DAN}$ turns out to be nearly optimal also for the invariance principles of Theorems 1.1.3a-1.1.3b, as stated in the following companion of Proposition 1.1.1.

Proposition 1.1.2. Mutatis mutandis, (i) and (ii) of Proposition 1.1.1 hold for the respective processes that are under study in Theorem 1.1.3.

Remark 1.1.5. (a) Just like the CLT's of Theorem 1.1.2 (cf. (b) of Remark 1.1.4), the CLT's of Theorem 1.1.3 are the first CLT's that are completely data-based and free of the unknown moments of the unobservable explanatory and error variables, both in case $\operatorname{Var}\xi < \infty$, when neither the explanatory variables nor the error terms are assumed to be normal or normal-like, and, of course, in case $\operatorname{Var}\xi = \infty$. In particular, under $\operatorname{Var}\xi < \infty$ without the normality assumptions, the normalizers $\sqrt{n}\,U(1,n)\big(\sum_{i=1}^n (\hat u_i(1,n) - \bar{\hat u}(1,n))^2/(n-1)\big)^{-1/2}$, $\sqrt{n}\big(\sum_{i=1}^n (\hat v_i(1,n) - \bar{\hat v}(1,n))^2/(n-1)\big)^{-1/2}$ and $\sqrt{n}\big(\sum_{i=1}^n (\hat w_i(1,n) - \bar{\hat w}(1,n))^2/(n-1)\big)^{-1/2}$ can be viewed as first estimators for the square roots of the inverses of the respective variances of the asymptotic distributions in the CLT's that follow from [21], [22] combined and [9] (cf. (b) of Remark 1.1.4 for details on these CLT's). These normalizers are different from the estimators of the asymptotic variances in the aforementioned CLT's that are available under $\operatorname{Var}\xi < \infty$ and normality or normality-like assumptions on the error terms.

(b) Theorem 1.1.3 is also valid when the denominators $\big(\sum_{i=1}^n (\hat u_i(j,n) - \bar{\hat u}(j,n))^2/(n-1)\big)^{1/2}$, $\big(\sum_{i=1}^n (\hat v_i(j,n) - \bar{\hat v}(j,n))^2/(n-1)\big)^{1/2}$ and $\big(\sum_{i=1}^n (\hat w_i(j,n) - \bar{\hat w}(j,n))^2/(n-1)\big)^{1/2}$ of the corresponding processes are replaced with $\big(\sum_{i=1}^n \hat u_i^2(j,n)/n\big)^{1/2}$, $\big(\sum_{i=1}^n \hat v_i^2(j,n)/n\big)^{1/2}$ and $\big(\sum_{i=1}^n \hat w_i^2(j,n)/n\big)^{1/2}$, respectively, $j=\overline{1,3}$. The proof of this result is based on its companion in part (c) of Remark 1.1.4 and the proof of the original Theorem 1.1.3. We note that such a Theorem 1.1.3a is completely data-based.

(c) The process

$$\frac{\sqrt{n}\,L(1,n)(\hat\theta_{1n} - \theta)\,t}{\big(\sum_{i=1}^n (\hat w_i(1,n) - \bar{\hat w}(1,n))^2/(n-1)\big)^{1/2}}, \qquad 0 \le t \le 1,$$

studied in Theorem 1.1.3c, with main term process

$$\frac{\sqrt{n}\,\bar{\hat w}(1,n)\,t}{\big(\sum_{i=1}^n (\hat w_i(1,n) - \bar{\hat w}(1,n))^2/(n-1)\big)^{1/2}},$$

is yet another special univariate Student process that is built on a triangular sequence of dependent random variables. For a collection of our other special univariate Student processes in Chapter 1 we refer to (1.1.36) and Remark 1.2.7 of Section 1.2.2.

Remark 1.1.6. In this remark we address (1)-(3) and the rest of the major identifiability assumptions for SEIVM's (1.1.1)-(1.1.2) that are frequently used in the literature.

(a) As emphasized already right at the very beginning, the whole estimation inference of Chapter 1 is always based on the appropriate additional identifiability assumption in (1)-(3). For some discussions on identifiability of the unknown parameters in SEIVM's (1.1.1)-(1.1.2), concerning the areas of research where identifiability assumptions are commonly found, on how to obtain identifiability assumption information, and on some alternative approaches to EIVM's that do not require such supplementary information to make the model parameters identifiable, we refer, e.g., to the texts [12] and [19], and Gleser [21]. As to identifiability condition (1), we note in particular that the situation of uncorrelated errors ($\mu = 0$) with equal variances ($\lambda = 1$) seems to present quite a reasonable and natural special case in which (1) is automatically satisfied. It is interesting to note that any identifiability assumption (e.g., one of (1)-(3)) was originally introduced to make the normal SEIVM's (1.1.1)-(1.1.2) (the $(y_i, x_i)$ are normally distributed) identifiable, while the parameters in the non-normal models are identifiable in some cases without further assumptions (cf. Section 1.2.1 of [12]). In such non-normal models, for estimation that does not require any supplementary information, instead of the estimators introduced in Section 1.1.2, estimators based on the method of sample higher moments or cumulants and the method of characteristic functions may also be used (cf. Van Montfort [63] and the texts [12], [33]). These methods are not within the scope of the present work, since we do not exclude the possibility of normal SEIVM's (1.1.1)-(1.1.2), for which one has to assume one of (1)-(3) in any case. More importantly, we attempt to develop asymptotic normal theory under the best possible moment assumptions on the explanatory and error variables. Moreover, independently of their technical origins, the restrictions of identifiability assumptions (1)-(3) may, in general, be viewed as ones that make intuitive sense, in that they fairly concretize model (1.1.1)-(1.1.2), and thus help to control and diminish the effect of its error terms for the sake of the possibility of meaningful inference under reasonable conditions on $\xi$ and the error moments.

(b) As to further major identifiability assumptions for SEIVM's (1.1.1)-(1.1.2) that are found in the literature along with (1)-(3), but left out from the main lines of development of Chapter 1, they are:

(4) the reliability ratio $k_\xi = \operatorname{Var}\xi/(\operatorname{Var}\xi + \operatorname{Var}\delta)$ is known, provided that $\operatorname{Var}\xi < \infty$, $E(\delta\varepsilon) = 0$ and the intercept $\alpha$ is known;

(5) the intercept $\alpha$ is known and $E\xi \neq 0$.

In fact, in (1), the cases of $\Gamma$ being known up to an unknown multiple and $\Gamma$ being completely known are usually treated as two separate identifiability assumptions in the literature. We note that the weighted least squares approach (cf. Section 1.1.2) allows us to accommodate them in the single (1) here.

(c) From [12], assumption (4) is commonly found in the social science and psychology literatures, and is also referred to as heritability in genetics. The coefficient $k_\xi$ adjusts the estimator $\hat\beta_{0n}$ for consistency (under (A) with $\mu = 0$ in (1.1.3), (C), (E) and $\operatorname{Var}\xi < \infty$):

$$\hat\beta_{4n} = k_\xi^{-1}\hat\beta_{0n} = \frac{S_{xy}}{k_\xi S_{xx}}.$$

As always,

$$\hat\alpha_{4n} = \bar y - \bar x\,\hat\beta_{4n}$$

serves as a consistent estimator for $\alpha$, while, according to [12], the MLE's (the explanatory and error variables are normally distributed) for the unknown error variances are

$$\hat\theta_{4n} = (1 - k_\xi)S_{xx} \qquad\text{and}\qquad \widehat{\lambda\theta}_{4n} = S_{yy} - \hat\beta_{4n}S_{xy},$$

provided that $S_{xx} - S_{xy}/\hat\beta_{4n} \ge 0$, $S_{yy} - S_{xy}\hat\beta_{4n} \ge 0$ and $\operatorname{sign}(S_{xy}) = \operatorname{sign}(\hat\beta_{4n})$. It is noted in [12] that case (4) is closely connected to case (3). We also note that $\hat\beta_{4n}$, $\hat\alpha_{4n}$ and $\widehat{\lambda\theta}_{4n}$ are similar in form to $\hat\beta_{3n}$, $\hat\alpha_{3n}$ and $\widehat{\lambda\theta}_{3n}$, introduced in Section 1.1.2. From Section 1.1.2 of Fuller [19], assuming that the $(\xi_i, \delta_i, \varepsilon_i)$ are i.i.d. full normally distributed random vectors and that $c = 1$ in (1.1.7), we have

$$T_n = \frac{k_\xi\sqrt{n S_{xx}}\,(\hat\beta_{4n} - \beta)}{\big(\sum_{i=1}^n \big((y_i - \bar y) - k_\xi\hat\beta_{4n}(x_i - \bar x)\big)^2/(n-2)\big)^{1/2}},$$

which is known to have a Student $t$ distribution with $n - 2$ degrees of freedom, and

$$T_n \xrightarrow{D} N(0,1), \qquad n \to \infty.$$
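The attenuation-and-correction mechanics behind $\hat\beta_{4n}$ can be sketched numerically as follows (the simulation and all names are ours, not the thesis's notation): the ordinary least squares slope $S_{xy}/S_{xx}$ converges to $k_\xi\beta$ rather than $\beta$, and dividing by the known reliability ratio $k_\xi$ removes the bias.

```python
import random

random.seed(2)
beta, n = 2.0, 100_000
var_xi, var_delta = 4.0, 1.0
k_xi = var_xi / (var_xi + var_delta)   # reliability ratio, assumed known under (4)

xi = [random.gauss(0.0, var_xi ** 0.5) for _ in range(n)]
x = [v + random.gauss(0.0, var_delta ** 0.5) for v in xi]   # x_i = xi_i + delta_i
y = [beta * v + random.gauss(0.0, 1.0) for v in xi]         # alpha = 0 (known)

xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((a - xbar) ** 2 for a in x)
sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
beta_0n = sxy / sxx          # OLS slope: attenuated, tends to k_xi * beta = 1.6
beta_4n = beta_0n / k_xi     # reliability-ratio-adjusted slope, consistent for beta
print(beta_0n, beta_4n)
```

This is the same correction that $(1 - k_\xi)S_{xx}$ exploits for the error variance: the excess variance of $x$ over $\xi$ is exactly the fraction $1 - k_\xi$ attributable to $\delta$.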

(d) We learn from [12] that (5) serves as yet another identifiability assumption for the normal SEIVM's (1.1.1)-(1.1.2), and the respective MLE's are

$$\hat\beta_{5n} = \frac{\bar y}{\bar x},\ \bar x \neq 0, \qquad \widehat{\lambda\theta}_{5n} = \widehat{\lambda\theta}_{3n} \qquad\text{and}\qquad \hat\theta_{5n} = \hat\theta_{2n},$$

where $\widehat{\lambda\theta}_{3n}$ and $\hat\theta_{2n}$ are as in (1.1.24) and (1.1.23). Omitting the details, we note that $(\hat\beta_{5n}, \widehat{\lambda\theta}_{5n}, \hat\theta_{5n})$ and its properly defined corresponding process can be studied asymptotically via the methods of Chapter 1, without assuming normality for $(\xi, \delta, \varepsilon)$. In particular, under (A) and (C) with $E\xi \neq 0$, it is easy to see that, as $n \to \infty$,

$$T_n = \frac{\sqrt{n}\,\bar x\,(\hat\beta_{5n} - \beta)}{\big(\sum_{i=1}^n \big((y_i - \bar y) - \beta(x_i - \bar x)\big)^2/(n-1)\big)^{1/2}} \xrightarrow{D} N(0,1)$$

and

$$\frac{\sqrt{n}\,\bar x\,(\hat\beta_{5n} - \beta)}{\big(\sum_{i=1}^n \big((y_i - \bar y) - \hat\beta_{5n}(x_i - \bar x)\big)^2/(n-1)\big)^{1/2}} \xrightarrow{D} N(0,1).$$

However, if $(\delta, \varepsilon)$ has a normal distribution, an important observation here is that $T_n$ has a Student $t$ distribution with $n - 1$ degrees of freedom. We also note in passing that while, on one hand, throughout Chapter 1, the distinction between the no-intercept model and the one with unknown intercept is made disregarding the identifiability problem, on the other hand, when the intercept $\alpha$ is known to be zero, one can also use the estimator $\bar y/\bar x$ in place of $\hat\beta_{1n}$, $\hat\beta_{2n}$ and $\hat\beta_{3n}$, which are studied respectively under (1), (2) and (3).

(e) In Chapter 1, SEIVM's (1.1.1)-(1.1.2) with

(D') $\xi \in \mathrm{DAN}$ and $\operatorname{Var}\xi = \infty$

are studied under (l)-(3) along with their “classical” companions with explana­ tory variables satisfying 0 < Varf < oo. However, it is interesting, and also of interest, to observe that none of the identifiability conditions (l)-(3 ) is necessary for constructing consistent estimators for ft and a under (D ’). For example, using (1.2.19) and (1.2.21), it is easy to see that pn = Syy/ $xy Pn — Sxy/Sxx are consistent estimators for p. Defining reliability ratio fcg of (4) to be 1 if (D ’) is satisfied, estimator Sxy/S xx is seen to be equal to fan used under (4) in part (c) of this remark. The existence of these consistent estimators for p implies that p is


identifiable, and the latter fact, when $(\delta, \varepsilon)$ has a normal distribution, can also be concluded from Reiersøl [56], as it holds if and only if $\xi$ is not normally distributed. As to a consistent estimator for $\alpha$ under (D′), one has $\bar y - \bar x\tilde\beta_n$, while the construction of respective consistent estimators for $\lambda\theta$ and $\theta$ is being developed in [48]. In fact, it is reasonable to believe that no side conditions are needed in order to provide consistent estimators for all the parameters of interest in SEIVM's (1.1.1)-(1.1.2) under (D′), namely for $\beta$, $\alpha$, $\lambda\theta$ and $\theta$. This is because under (D′) our model is more robust in resisting the impact of the error terms with finite variances in $x_i$, is close in spirit to, and behaves as if it were, the ordinary regression model, which, by the way, is free of the identifiability problem. Such a view of SEIVM's (1.1.1)-(1.1.2) under (D′) can also shed light on why the ordinary least squares estimator $\hat\beta_{0n}$ for $\beta$ does not require an adjustment in this case (cf. also (c)).

Remark 1.1.7. When studying $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$ in Theorems 1.1.2 and 1.1.3, we assume for convenience that $\mu = 0$ in (1.1.3) and thus, the respective identifiability condition (1) reduces to the ratio of the error variances being known. Between such a model and the other type of model covered by (1), with $\mu \neq 0$ and matrix $\Gamma$ known up to an unknown multiple, there is a close data-transformation based interplay that follows from [21]. We note that this interplay concerns the so-called functional EIVM with $\lambda = 1$ and $\mu = 0$ in $\Gamma$ of (1.1.3) and the one with arbitrary $\Gamma$ as in (1), and is also valid in the respective context of our SEIVM's (1.1.1)-(1.1.2) with explanatory variables satisfying (C) or (D), and (E). Moreover, it allows one to convert the WLSE's of $\beta$ and $\alpha$ and the MLE of $\theta$ obtained in the model with arbitrary $\Gamma$ as in (1) to the corresponding $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$ with $\lambda = 1$. Hence, this interplay can be adapted to obtain asymptotic results for the model with $\mu \neq 0$ in $\Gamma$ of (1.1.3) from Theorems 1.1.2 and 1.1.3. It can also be used to translate asymptotic results from the SEIVM (1.1.1)-(1.1.2) with $\lambda = 1$ and $\mu = 0$ to the one with arbitrary $\lambda$ and $\mu = 0$. We also note in passing, in regards of $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$, that, though


it does not take any extra effort to deal with arbitrary $\lambda$ in (1.1.3), in view of the described interplay we could have assumed that $\lambda = 1$, similarly to [21].

The rest of this section is dedicated to various multivariate CLT's for the estimators studied heretofore.

Theorem 1.1.4. Let assumptions (B), (D), (E) and, depending on the joint estimators in hand, the identifiability assumption in (1) ($\Gamma$ of (1.1.3) is known up to an unknown multiple) with $\mu = 0$ in (1.1.3), (2) or (3) be satisfied. For $j = \overline{1,3}$, assume also the respective conditions

\[
\begin{cases}
\text{vector } \big((\zeta, b_j), (\zeta, a_j), (\zeta, h)\big) \text{ is full}, & \text{if } \operatorname{Var}\xi < \infty,\\
\text{vector } \big((\zeta, a_j), (\zeta, h)\big) \text{ is full}, & \text{if } \operatorname{Var}\xi = \infty,
\end{cases}
\tag{1.1.42}
\]

where

\[
\zeta = \big((\xi - m)\delta,\ (\xi - m)\varepsilon,\ \delta,\ \varepsilon,\ \delta\varepsilon - \mu,\ \delta^2 - \lambda\theta,\ \varepsilon^2 - \theta\big),
\tag{1.1.43}
\]

\[
b_j =
\begin{cases}
\Big(2\beta,\ -2\beta^2,\ 0,\ 0,\ \dfrac{2\beta(\lambda - \beta^2)}{\lambda + \beta^2},\ \dfrac{2\beta^2}{\lambda + \beta^2},\ -\dfrac{2\lambda\beta^2}{\lambda + \beta^2}\Big), & \text{if } j = 1,\\[6pt]
(\beta,\ -\beta^2,\ 0,\ 0,\ -\beta,\ 1,\ 0), & \text{if } j = 2,\\[2pt]
(1,\ -\beta,\ 0,\ 0,\ 1,\ 0,\ -\beta), & \text{if } j = 3,
\end{cases}
\tag{1.1.44}
\]

\[
a_j =
\begin{cases}
(0,\ 0,\ 1,\ -\beta,\ 0,\ 0,\ 0) - \dfrac{m}{2M\beta}\, b_1, & \text{if } j = 1 \text{ and } \operatorname{Var}\xi = M < \infty,\\[6pt]
(0,\ 0,\ 1,\ -\beta,\ 0,\ 0,\ 0) - \dfrac{m}{M\beta}\, b_2, & \text{if } j = 2 \text{ and } \operatorname{Var}\xi = M < \infty,\\[6pt]
(0,\ 0,\ 1,\ -\beta,\ 0,\ 0,\ 0) - \dfrac{m}{M}\, b_3, & \text{if } j = 3 \text{ and } \operatorname{Var}\xi = M < \infty,\\[6pt]
(0,\ 0,\ 1,\ -\beta,\ 0,\ 0,\ 0), & \text{if } \operatorname{Var}\xi = \infty,\ j = \overline{1,3},
\end{cases}
\tag{1.1.45}
\]

and

\[
h = (0,\ 0,\ 0,\ 0,\ -2\beta,\ 1,\ \beta^2),
\tag{1.1.46}
\]

with $m = E\xi$. For $j = \overline{1,3}$, using $u_i(j,n)$, $v_i(j,n)$ and $w_i(j,n)$ of (1.1.32), (1.1.33) and (1.1.35) respectively, define the vector

\[
p_i(j,n) = \big(u_i(j,n),\ v_i(j,n),\ w_i(j,n)\big)
\tag{1.1.47}
\]

and the matrix

\[
V(j,n) = \sum_{i=1}^{n} \big(p_i(j,n) - \bar p(j,n)\big)^{T}\big(p_i(j,n) - \bar p(j,n)\big).
\tag{1.1.48}
\]
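In code, the matrix $V(j,n)$ of (1.1.48) is just the sum of outer products of the centered score triples; a minimal sketch (the arrays `u`, `v`, `w` stand for $u_i(j,n)$, $v_i(j,n)$, $w_i(j,n)$, whose exact definitions are those of (1.1.32)-(1.1.35)):

```python
import numpy as np

def scatter_matrix(u, v, w):
    """V(j, n) of (1.1.48): sum over i of the outer products
    (p_i - pbar)^T (p_i - pbar), where p_i = (u_i, v_i, w_i)."""
    p = np.column_stack([u, v, w])   # n x 3 matrix of triples
    c = p - p.mean(axis=0)           # center each column
    return c.T @ c                   # 3 x 3, symmetric positive semidefinite
```

Note that $(n-1)^{-1}V(j,n)$ is exactly the sample covariance matrix of the triples, which is what makes it a natural Studentizer.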


Let $U(j,n)$ and $L(j,n)$ be as in (1.1.31) and (1.1.34). Then, for $j = 1$ and $2$, as $n \to \infty$,

\[
\sqrt{n}\,\big(U(j,n)(\hat\beta_{jn} - \beta),\ (\hat\alpha_{jn} - \alpha),\ L(j,n)(\hat\theta_{jn} - \theta)\big)\big((n-1)^{-1}V(j,n)\big)^{-T/2} \overset{d}{\to} N(0, I_3).
\]

Moreover, this convergence remains valid when $U(j,n)$, $\hat\beta_{jn}$, $\hat\alpha_{jn}$, $\hat\theta_{jn}$, $L(j,n)$ and $V(j,n)$ are replaced with $U(3,n)$, $\hat\beta_{3n}$, $\hat\alpha_{3n}$, $\widehat{\lambda\theta}_{3n}$, $L(3,n)$ and $V(3,n)$, respectively.

Remark 1.1.8. This remark presents, for Theorem 1.1.4, similar points to those of Remark 1.1.3. First, it appears that Theorem 1.1.4 provides first-time CLT's in case $\xi \in \mathrm{DAN}$ and $\operatorname{Var}\xi = \infty$ (as allowed by (D) in view of Remark 1.2.1 of Section 1.2.1), and CLT's for $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$, when $0 < \operatorname{Var}\xi < \infty$ and neither the explanatory variables nor the error terms are normally distributed. Second, just like the CLT's of Theorem 1.1.2 imply corresponding marginal CLT's that follow from [21], [22] combined, [9], [11] and [10] (cf. Remark 1.1.3), the multivariate CLT's implied by the latter papers, namely $\sqrt{n}$-asymptotic normality for the estimator triples $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$, $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$, respectively, are partial cases of Theorem 1.1.4. For example, restricting ourselves to the conditions in [21], [22] combined in regards of $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$, i.e., assuming (C) and $\operatorname{Var}\xi < \infty$ instead of (D), but not necessarily assuming that $\lambda = 1$ in $\Gamma$ of (1.1.3), the arguments go as follows. Since for $B_n$ of (1.2.191) of Section 1.2.4, $\sqrt{n}\,B_n = A_1^{-1/2}$, where $A_1 = \operatorname{Cov}\big((\zeta, b_1), (\zeta, a_1), (\zeta, h)\big) > 0$ on account of (1.1.42), the convergence in (1.2.204) with such $B_n$ implies that, as $n \to \infty$,

\[
\big((n-1)^{-1}V(1,n)\big)^{T/2} \overset{P}{\to} A_1^{1/2}.
\tag{1.1.49}
\]

Next, due to (1.2.19) and Lemma 2.2 of Maller [42], as $n \to \infty$,

\[
\operatorname{diag}\big(U(1,n),\ 1,\ L(1,n)\big) \overset{P}{\to} D_1 \quad \text{and} \quad \big(\operatorname{diag}(U(1,n),\ 1,\ L(1,n))\big)^{-1} \overset{P}{\to} D_1^{-1},
\]

with some diagonal matrix $D_1$ having nonzero diagonal elements. Therefore, as $n \to \infty$, the convergence

\[
\sqrt{n}\,\big(U(1,n)(\hat\beta_{1n} - \beta),\ (\hat\alpha_{1n} - \alpha),\ L(1,n)(\hat\theta_{1n} - \theta)\big)\big((n-1)^{-1}V(1,n)\big)^{-T/2}
\]


\[
= \sqrt{n}\,\big(\hat\beta_{1n} - \beta,\ \hat\alpha_{1n} - \alpha,\ \hat\theta_{1n} - \theta\big)\operatorname{diag}\big(U(1,n),\ 1,\ L(1,n)\big)\big((n-1)^{-1}V(1,n)\big)^{-T/2} \overset{d}{\to} N(0, I_3)
\]

leads to

\[
\sqrt{n}\,\big(\hat\beta_{1n} - \beta,\ \hat\alpha_{1n} - \alpha,\ \hat\theta_{1n} - \theta\big) \overset{d}{\to} N\big(0,\ D_1^{-1}A_1 D_1^{-1}\big),
\]

where the matrix $D_1^{-1}A_1 D_1^{-1}$ coincides with the covariance matrix of the corresponding asymptotic normal distribution that is obtained from [21], [22]. As to condition (1.1.42) being assumed in Theorem 1.1.4, it is necessary and sufficient for $A_1 > 0$ in (1.1.49) and thus also guarantees nondegeneracy of the latter normal limiting distribution. In particular, when the error terms $\delta$ and $\varepsilon$ are assumed to be independent and normally distributed, (1.1.42) is satisfied (cf. part (a) of Observation 1.1.2). Further remarks and observations on condition (1.1.42) are summarized in Observation 1.1.2. Similar arguments and Remark 1.1.9 can be applied to show that cases $j = 2$ and $j = 3$ of Theorem 1.1.4 contain the CLT's respectively in [9] and [11], and in [9] and [10] (cf. Remark 1.1.3 for the respective conditions of these CLT's).

Remark 1.1.9. Naturally, the conditions of Theorem 1.1.4 also imply joint CLT's for every pair in $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$, i.e., for $(\hat\beta_{1n}, \hat\alpha_{1n})$ (this also holds when $\Gamma$ of (1.1.3) is completely known), $(\hat\alpha_{1n}, \hat\theta_{1n})$ and $(\hat\beta_{1n}, \hat\theta_{1n})$, and the marginal CLT's in the respective (i)'s of Theorem 1.1.2 for each of $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$. To see this, we first note that, just like Theorem 1.1.4 itself, all these CLT's are based on our auxiliary Lemma 1.2.17 of Section 1.2.4. Therefore, it suffices to show that condition (1.1.42) of Theorem 1.1.4 is stronger than its respective prototypes in (1.2.144) and (1.2.145) in regards of any 2- and 1-dimensional CLT's at which we aim. While this fact is true in general, for all these CLT's, we are to see now how this implication works in some particular cases. For example, for $(\hat\beta_{1n}, \hat\alpha_{1n})$ (under (1)), (1.2.144) and (1.2.145) read as

\[
\begin{cases}
\text{vector } \big((\zeta, b_1), (\zeta, a_1)\big) \text{ is full}, & \text{if } \operatorname{Var}\xi < \infty,\\
\operatorname{Var}(\zeta, a_1) > 0, & \text{if } \operatorname{Var}\xi = \infty,
\end{cases}
\tag{1.1.50}
\]


where $b_1$ and $a_1$ are from (1.1.44) and (1.1.45) (the second condition in (1.1.50) is automatically satisfied due to (B)), while, say, for $(\hat\alpha_{1n}, \hat\theta_{1n})$ these conditions reduce to

\[
\text{vector } \big((\zeta, a_1), (\zeta, h)\big) \text{ is full},
\tag{1.1.51}
\]

with $h$ of (1.1.46). Clearly, on using the definition of a full vector, both (1.1.50) and (1.1.51) follow from (1.1.42). Naturally, this remark is also true in regards of the CLT's for $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$.

Remark 1.1.10. Supplementing Remark 1.1.4, we further underline the power of Studentization in our main results, which distinguishes them from related results in the literature. In particular, due to the fact that the properly centered and normalized triples of estimators studied in Theorem 1.1.4 (e.g., $\sqrt{n}\,(U(1,n)(\hat\beta_{1n} - \beta),\ (\hat\alpha_{1n} - \alpha),\ L(1,n)(\hat\theta_{1n} - \theta))((n-1)^{-1}V(1,n))^{-T/2}$) are essentially special Student statistics (cf. (1.2.135) for the definition), as follows from the proof of Theorem 1.1.4 and Remark 1.2.20, the CLT's of Theorem 1.1.4, just like those of Theorem 1.1.2, have the following features. They are invariant with respect to the distribution of $(\xi, \delta, \varepsilon)$ satisfying (B), (D), (E) and, depending on the joint estimators in hand, the condition in (1) ($\Gamma$ of (1.1.3) is known up to an unknown multiple) with $\mu = 0$ in (1.1.3), (2) or (3); they do not contain unknown parameters of this distribution (they depend only on error moments that are assumed to be known according to the corresponding (1)-(3)); and the only unknown parameter appearing in the normalizing matrices $V(j,n)$ of Theorem 1.1.4 is $\beta$, $j = \overline{1,3}$. Some immediate applications of Theorem 1.1.4 may require a completely data-based form of this result, i.e., estimating $\beta$ of $V(j,n)$. The latter is accomplished in our upcoming main Theorem 1.1.5 in this regard. In contrast, as discussed in (b) of Remark 1.1.4, related results in the literature, namely the CLT's that follow from [21], [22] combined and [9], contain a number of unknown hard-to-estimate parameters in the covariance matrices of the asymptotic normal distributions, and this, in general, aggravates estimability of these matrices and


hence applicability of the corresponding CLT's. In fact, the matrix $(n-1)^{-1}V(1,n)$ and, in view of Remark 1.1.9, the appropriate normalizing matrices in regards of our CLT's for the corresponding pairs of estimators are, to the best of our knowledge, the first consistent estimators of the just mentioned corresponding covariance matrices when $\operatorname{Var}\xi < \infty$ and neither the explanatory variables nor the error terms are assumed to be normally or normal-like distributed. This shows the CLT's of Theorem 1.1.4 to be new ones of this kind. As to the CLT's in [11], [10] and Theorem 1.2.1 in [19] (cf. (b) of Remark 1.1.4), to avoid the difficulties of estimating the covariance matrices of the asymptotic normal distributions, $(\xi, \delta, \varepsilon)$ is assumed there to be a priori normally distributed with covariance matrix $\operatorname{diag}(\operatorname{Var}\xi, \lambda\theta, \theta) > 0$. Coming back to the CLT's of Theorem 1.1.4, we also note that they are obtained with the help of Lemma 1.2.17, a nontrivial extension of a Studentization CLT that is universal for all reasonable joint estimators based on $(\bar y, \bar x, S_{yy}, S_{xy}, S_{xx})$ (cf. Remark 1.2.20 for details).

Remark 1.1.11. To be in total correspondence with the univariate invariance principles of Theorem 1.1.2, it is desirable that the multivariate Theorem 1.1.4 also contain weak invariance principles and sup-norm approximations in probability, companions to the respective parts (ii) and (iii) of Theorem 1.1.2.

We conclude Section 1.1.4 with a multivariate companion to Theorem 1.1.3, a completely data-based version of Theorem 1.1.4 where $\beta$ in the normalizing matrices $V(j,n)$, $j = \overline{1,3}$, is estimated.

Theorem 1.1.5. Let all the assumptions of Theorem 1.1.4 be satisfied. For $j = \overline{1,3}$, via $\hat u(j,n)$, $\hat v(j,n)$ and $\hat w(j,n)$ from (1.1.39)-(1.1.41), define the vector

\[
\hat p_i(j,n) = \big(\hat u_i(j,n),\ \hat v_i(j,n),\ \hat w_i(j,n)\big)
\tag{1.1.52}
\]

and the matrix

\[
\hat V(j,n) = \sum_{i=1}^{n} \big(\hat p_i(j,n) - \bar{\hat p}(j,n)\big)^{T}\big(\hat p_i(j,n) - \bar{\hat p}(j,n)\big).
\tag{1.1.53}
\]


Then, Theorem 1.1.4 continues to be true after replacing $V(j,n)$ with $\hat V(j,n)$, $j = \overline{1,3}$.

Remark 1.1.12. Employing arguments similar to those used in Remark 1.1.9, from the conditions of Theorem 1.1.5 one can also obtain various immediately applicable 2-dimensional CLT's for our estimators of interest, as well as the univariate CLT's of Theorem 1.1.3.

Observation 1.1.2. In (a)-(d) below, we comment on the conditions of (1.1.42) in Theorems 1.1.4 and 1.1.5. These conditions are introduced for handling the main terms in the expansions for the centered and normalized estimator triples studied in Theorem 1.1.4 (cf. the proof of Theorem 1.1.4 in Section 1.2.2). Also, (1.1.42) is a multivariate analogue of the assumptions $\operatorname{Var}(\zeta, a_j) > 0$ (case $\operatorname{Var}\xi = \infty$ and/or $E\xi = m = 0$) and $\operatorname{Var}(\zeta, h) > 0$ that play similar roles in the proof of Theorem 1.1.2 and are easily checked within that proof without appearing as conditions in the formulation of Theorem 1.1.2. As opposed to such verifiable univariate conditions, verification of (1.1.42) in general is more challenging and is a subject of the author's current investigations. While a more general answer to when (1.1.42) is satisfied is still on its way, we provide below some immediate sufficient conditions for (1.1.42), and also discuss some conditions equivalent to (1.1.42) that may be helpful for its further analysis.

(a) Using part (a) of Remark 1.2.19 on sufficient conditions for the prototypes (1.2.144) and (1.2.145) of (1.1.42), we conclude that (1.1.42) holds true if the error-based vector $(\delta, \varepsilon, \delta\varepsilon, \delta^2, \varepsilon^2)$ is full. In particular, the latter vector is full if the moments and cross-moments of $(\delta, \varepsilon)$ are identical, up to and including moments of order four, to the corresponding moments of the nondegenerate $N(0, \operatorname{diag}(\lambda\theta, \theta))$ distribution, since the covariance matrix of $(\delta, \varepsilon, \delta\varepsilon, \delta^2, \varepsilon^2)$ is positive diagonal in this case. We note that the second respective parts of the sufficient conditions for (1.2.144) and (1.2.145), given in (a) of Remark 1.2.19, are automatically satisfied in the case of


(1.1.42) in hand. Namely, the vectors $b_j$, $a_j$ and $h$ of (1.1.44)-(1.1.46), when $\operatorname{Var}\xi < \infty$, and $a_j$ and $h$, when $\operatorname{Var}\xi = \infty$, are linearly independent, $j = \overline{1,3}$.
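The sufficient condition in (a) is easy to probe numerically. The sketch below (a Monte Carlo check with assumed illustrative values $\lambda = 2$, $\theta = 1$) estimates the covariance matrix of $(\delta, \varepsilon, \delta\varepsilon, \delta^2, \varepsilon^2)$ for independent centered normal errors; its theoretical form is the positive diagonal $\operatorname{diag}(\lambda\theta,\ \theta,\ \lambda\theta^2,\ 2\lambda^2\theta^2,\ 2\theta^2)$:

```python
import numpy as np

def error_vector_cov(lam, theta, n=400_000, seed=1):
    """Monte Carlo estimate of Cov(delta, eps, delta*eps, delta^2, eps^2)
    for independent delta ~ N(0, lam*theta) and eps ~ N(0, theta); all
    off-diagonal entries vanish, so the vector is full."""
    rng = np.random.default_rng(seed)
    d = rng.normal(scale=np.sqrt(lam * theta), size=n)
    e = rng.normal(scale=np.sqrt(theta), size=n)
    return np.cov(np.vstack([d, e, d * e, d ** 2, e ** 2]))

C = error_vector_cov(2.0, 1.0)
```

The empirical diagonal should be close to $(2, 1, 2, 8, 2)$ here, with the off-diagonal entries near zero.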

(b) As discussed in part (b) of Remark 1.2.19, the conditions in (1.1.42) are respectively equivalent to the inequalities $\det\big(\operatorname{Cov}((\zeta, b_j), (\zeta, a_j), (\zeta, h))\big) > 0$ and $\det\big(\operatorname{Cov}((\zeta, a_j), (\zeta, h))\big) > 0$, $j = \overline{1,3}$, which, in turn, are verifiable, at least in principle, with computer software like, e.g., "Maple". Having acquired some experience in this regard (cf. the Appendix in Section 1.2.5), the author hopes to be able to conclude her further current investigations on (1.1.42) by using this approach.

(c) Some other equivalent conditions are available for (1.1.42). In Section 1.2.4, the respective assumptions in (1.1.42) will be seen to be equivalent to the following data-based conditions:

\[
\begin{cases}
n^{-1}V(j,n) \overset{P}{\to} \text{a positive definite matrix}, & \text{if } \operatorname{Var}\xi < \infty,\\[6pt]
n^{-1}\displaystyle\sum_{i=1}^{n} \big(v_i(j,n) - \bar v(j,n),\ w_i(j,n) - \bar w(j,n)\big)^{T}\big(v_i(j,n) - \bar v(j,n),\ w_i(j,n) - \bar w(j,n)\big)\\[4pt]
\qquad \overset{P}{\to} \text{a positive definite matrix}, & \text{if } \operatorname{Var}\xi = \infty.
\end{cases}
\tag{1.1.54}
\]

(d) Naturally, mutatis mutandis, the remarks and observations on the conditions of (1.1.42) in (a)-(c) above also hold in regards of the respective conditions required for any of the 2-dimensional CLT's for our estimators of interest (cf. Remarks 1.1.9 and 1.1.12).

Remark 1.1.13. Due to the convergence in (1.2.196) and (1.2.206) with the invertible matrix $B_n$ of (1.2.191), the norming symmetric matrices $V(j,n)$ and $\hat V(j,n)$ in Theorems 1.1.4 and 1.1.5 respectively are positive definite WPA1, as $n \to \infty$ (cf. also Remark 1.2.13 of Section 1.2.3). Hence, according to the definitions related to matrix square roots that are given in Section 1.2.3, $V^{-T/2}(j,n)$ and $\hat V^{-T/2}(j,n)$ are well defined WPA1, $n \to \infty$. A similar remark holds true for the norming matrices regarding the CLT's of Remarks 1.1.9 and 1.1.12.
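For a symmetric positive definite matrix, a symmetric choice of the (transposed) square root is available, and the normalizer $V^{-T/2}(j,n)$ is then computable by eigendecomposition; a sketch:

```python
import numpy as np

def inv_sym_sqrt(V):
    """Inverse symmetric square root of a symmetric positive definite
    matrix V: returns W with W = W^T and W @ V @ W = I."""
    vals, vecs = np.linalg.eigh(V)
    if vals.min() <= 0:
        raise ValueError("V is not positive definite")
    return vecs @ np.diag(vals ** -0.5) @ vecs.T
```

Applying this `W` on the right of the centered, scaled estimator triple produces the Studentized statistic of Theorem 1.1.4.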


1.1.5 A Note on Some Applications

In this section we address some immediate applications of the main results of Section 1.1.4. The abbreviations LSA and CI stand for large-sample approximate and confidence interval respectively, while $z_{\alpha/2}$ and $t_{n,\alpha/2}$ denote the $100(1-\alpha/2)$th percentiles of the standard normal distribution and the Student $t$-distribution with $n$ degrees of freedom respectively, $0 < \alpha < 1$.

Finite-sample properties of estimators for $\beta$

So far, we have only been discussing various aspects of the asymptotic theory for SEIVM's (1.1.1)-(1.1.2). In particular, all the estimators for the slope $\beta$ that are introduced in Section 1.1.2 are seen to have reasonable asymptotic properties. We are now to address some unsatisfactory finite-sample properties of these estimators. In order to compare the performances of several competing estimators in finite samples under a common identifiability condition in (1)-(3), or under different assumptions in (1)-(3), one may like to use a popular mean square error comparison. However, none of $\hat\beta_{1n}$, $\hat\beta_{2n}$ and $\hat\beta_{3n}$ of Section 1.1.2 can be compared to another competing estimator of $\beta$ using this moment-based criterion. This is because $\hat\beta_{1n}$, $\hat\beta_{2n}$ and $\hat\beta_{3n}$ have neither means nor variances, which also leads to the failure of properties such as unbiasedness, minimum variance and minimum mean square error. It is well known that if $(\xi, \delta, \varepsilon)$ follows a normal distribution with covariance

matrix $\operatorname{diag}(\operatorname{Var}\xi, \lambda\theta, \theta) > 0$, then $E|\hat\beta_{jn}| = \infty$, $j = \overline{1,3}$ (cf. Section 2.3 of Cheng and Van Ness [12]). Moreover, it is noted in [12] that this holds under some general, not necessarily normal distributional assumptions, as well as that $\operatorname{Var}\hat\beta_{jn} = \infty$, $j = \overline{1,3}$, and that the same moment properties are shared by $\hat\beta_{5n}$ introduced in (d) of Remark 1.1.6. An exception in this regard is $\hat\beta_{4n}$ from (c) of Remark 1.1.6, since it is a multiple of the ordinary least squares estimator, which possesses finite mean


and variance.

According to [12], Anderson [3] was the first one to discover the unfortunate moment properties of slope estimators when he was dealing with $\hat\beta_{1n}$, as an MLE of $\beta$, under (1) and $\mu = 0$ in (1.1.3), in the context of the so-called functional EIVM's, companions to SEIVM's (1.1.1)-(1.1.2). Based on the density of the exact distribution of $\hat\beta_{1n}$, which is too complicated to be widely useful, it is concluded in [3] that "no moment of positive integral order exists". In order to favour the consistent $\hat\beta_{1n}$ over the inconsistent ordinary least squares estimator $\hat\beta_{0n}$ of Remark 1.1.6 in finite samples, Anderson [3] introduces distributional concentration as an alternative to the expected mean squared error comparison (cf. also Section 2.3 of [12] for details).

The unfortunate lack of moments for most estimators of $\beta$ in SEIVM's (1.1.1)-(1.1.2) prompted some researchers to "fix" these estimators so that the new estimators would have finite moments. For example, in Section 2.5.1 of Fuller [19], the author modifies $\hat\beta_{3n} = S_{xy}/(S_{xx} - \theta)$ in the context of SEIVM's (1.1.1)-(1.1.2), assuming that $(\xi, \delta, \varepsilon)$ is normally distributed with covariance matrix $\operatorname{diag}(\operatorname{Var}\xi, \lambda\theta, \theta) > 0$ and (3) is satisfied. In fact, Fuller modifies the denominator of $\hat\beta_{3n}$ so that it becomes bounded below by a positive number. The resulting estimator is nearly unbiased and has a finite variance. Another approach is proposed in Cheng and Van Ness [11], where the authors argue that in practice it seems quite reasonable that one would know a bound $K$ on $|\beta|$, and hence $\hat\beta_{jn,K} = \min(|\hat\beta_{jn}|, K)\operatorname{sign}(\hat\beta_{jn})$ can serve as a modified estimator for any $\hat\beta_{jn}$, $j = 1, 2, 3, 5$. The new estimator is consistent and, clearly, has finite moments of all orders.
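The Cheng and Van Ness modification is a one-liner in code; a sketch (the bound `K` on $|\beta|$ is whatever prior knowledge supplies):

```python
import math

def truncated_slope(beta_hat, K):
    """beta_{jn,K} = min(|beta_hat|, K) * sign(beta_hat): clipping a slope
    estimator at a known bound K on |beta| makes all moments finite while
    preserving consistency."""
    return math.copysign(min(abs(beta_hat), K), beta_hat)
```

Since the clipped estimator is bounded by `K`, every moment exists trivially, and for large samples the clipping is inactive with probability tending to one.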

It would be of interest to know whether the situation with the moments of $\hat\beta_{jn}$, $j = 1, 2, 3, 5$, is any different under our newly introduced $\xi \in \mathrm{DAN}$ and $\operatorname{Var}\xi = \infty$.

In conclusion, we propose an observation that allows one to look at the problem of slope estimator modification, which is addressed in the second to last paragraph, from a somewhat different angle.


Observation 1.1.3. As mentioned earlier, $\hat\beta_{1n}$, $\hat\beta_{2n}$ and $\hat\beta_{3n}$ have no means and variances when $0 < \operatorname{Var}\xi < \infty$. Suppose now that (B)-(D) are satisfied and the intercept $\alpha$ is known to be zero. Then the self-normalized $\hat\beta_{2n}$ and $\hat\beta_{3n}$, namely

\[
B_{2n} = \frac{n(S_{xy} - \mu)(\hat\beta_{2n} - \beta)}{\big(\sum_{i=1}^{n} u_i^2(2,n)\big)^{1/2}} \quad \text{and} \quad B_{3n} = \frac{n(S_{xx} - \theta)(\hat\beta_{3n} - \beta)}{\big(\sum_{i=1}^{n} u_i^2(3,n)\big)^{1/2}},
\]

are uniformly sub-Gaussian in the sense that

\[
\sup_{n \in \mathbb{N}} E\, e^{t B_{jn}} \le 2 e^{c t^2} \quad \text{for all } t \in \mathbb{R} \text{ and some } c < \infty, \quad j = 2, 3.
\]

Hence, moments of all orders of $B_{2n}$ and $B_{3n}$ exist, both in case $\operatorname{Var}\xi < \infty$ and $\operatorname{Var}\xi = \infty$. As to the proof of this fact, using part (c) of Remark 1.1.4 and the proof of (iii) of Proposition 1.1.1, it is easy to see that $B_{2n}$ and $B_{3n}$ are self-normalized partial sums $\sum_{i=1}^{n} Z_i/\big(\sum_{i=1}^{n} Z_i^2\big)^{1/2}$, where $\{Z, Z_i, i \ge 1\}$ is a sequence of i.i.d.r.v.'s such that $EZ = 0$ and $Z \in \mathrm{DAN}$. Further, part (i) with $t_0 = 1$ of Theorem 1.1.2 and part (c) of Remark 1.1.4 imply that $B_{2n}$ and $B_{3n}$ are stochastically bounded. Finally, according to one of the key results of Giné, Götze and Mason [20], such $B_{2n}$ and $B_{3n}$ are sub-Gaussian in the above sense. Arguing similarly under (5) of Remark 1.1.6, but not assuming that $\alpha$ is known to be zero, we conclude that $n\,\bar x\,(\hat\beta_{5n} - \beta)\big(\sum_{i=1}^{n}(y_i - x_i\beta)^2\big)^{-1/2}$ is also sub-Gaussian.

Confidence Intervals for Slope $\beta$

The completely data-based CLT's of Theorem 1.1.3a and its self-normalized companions from part (b) of Remark 1.1.5 allow us to construct LSA $1-\alpha$ CI's for $\beta$, $0 < \alpha < 1$, respectively as follows:

\[
\hat\beta_{jn} - z_{\alpha/2}\,\frac{\big(\sum_{i=1}^{n}(\hat u_i(j,n) - \bar{\hat u}(j,n))^2\big)^{1/2}}{\sqrt{n(n-1)}\,\hat U(j,n)} \le \beta \le \hat\beta_{jn} + z_{\alpha/2}\,\frac{\big(\sum_{i=1}^{n}(\hat u_i(j,n) - \bar{\hat u}(j,n))^2\big)^{1/2}}{\sqrt{n(n-1)}\,\hat U(j,n)}
\tag{1.1.55}
\]
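Given the score array and the normalizer, the interval of (1.1.55) is immediate to compute; a sketch (here `u` stands for the array of $\hat u_i(j,n)$ scores, `U` for the scalar $\hat U(j,n) \neq 0$, and `z` for the normal percentile, e.g., $z_{0.025} \approx 1.96$):

```python
import numpy as np

def lsa_ci(beta_hat, u, U, z=1.96):
    """LSA 1-alpha CI of (1.1.55): beta_hat plus/minus
    z * (sum_i (u_i - ubar)^2)^{1/2} / (sqrt(n(n-1)) * |U|)."""
    n = len(u)
    half = z * np.sqrt(((u - u.mean()) ** 2).sum()) / (np.sqrt(n * (n - 1)) * abs(U))
    return beta_hat - half, beta_hat + half
```

The half-width shrinks at the $1/\sqrt{n}$ rate, and no parameter of the distribution of $(\xi, \delta, \varepsilon)$ beyond the known error moments is needed.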


and

\[
\hat\beta_{jn} - z_{\alpha/2}\,\frac{\big(\sum_{i=1}^{n} \hat u_i^2(j,n)\big)^{1/2}}{n\,|\hat U(j,n)|} \le \beta \le \hat\beta_{jn} + z_{\alpha/2}\,\frac{\big(\sum_{i=1}^{n} \hat u_i^2(j,n)\big)^{1/2}}{n\,|\hat U(j,n)|},
\tag{1.1.56}
\]

where, due to (1.2.19), $\hat U(j,n) \neq 0$ WPA1, $n \to \infty$, and $j = \overline{1,3}$. Under (2) and (3), in addition to the CI's of (1.1.55) and (1.1.56), yet another two LSA CI's for $\beta$ are easily available from Theorem 1.1.2a and part (c) of Remark 1.1.4. In Section 9.4.2 of Casella and Berger [8] on approximate interval estimation, the authors remark that "Generally speaking, a reasonable rule of thumb is to use as few estimates and as many parameters as possible in an approximation", and they support this point of view with concrete examples. This prompted us to derive LSA CI's for $\beta$ from the CLT's of Theorem 1.1.2a and the corresponding self-normalization CLT's of part (c) of Remark 1.1.4, where $\beta$ in the normalizers $\big(\sum_{i=1}^{n}(u_i(j,n) - \bar u(j,n))^2/(n-1)\big)^{1/2}$ and $\big(\sum_{i=1}^{n} u_i^2(j,n)/n\big)^{1/2}$ respectively, $j = 2, 3$, is left unestimated, as opposed to the corresponding CLT's of Theorem 1.1.3a and part (b) of Remark 1.1.5 that are used to obtain (1.1.55) and (1.1.56). Now, we obtain such CI's when (2) is satisfied. The case of (3) being assumed is handled similarly. Using part (i) with $t_0 = 1$ and $j = 2$ of Theorem 1.1.2a and its companion CLT from part (c) of Remark 1.1.4, we define the respective LSA $1-\alpha$ CI's (generally, confidence sets) for $\beta$ as

\[
\left\{\beta:\ \frac{\sqrt{n}\,|S_{xy} - \mu|\,|\hat\beta_{2n} - \beta|}{\Big(\sum_{i=1}^{n}\big((s_{i,yy} - S_{yy}) - \beta(s_{i,xy} - S_{xy})\big)^2/(n-1)\Big)^{1/2}} \le z_{\alpha/2}\right\}
\tag{1.1.57}
\]

and

\[
\left\{\beta:\ \frac{n\,|S_{xy} - \mu|\,|\hat\beta_{2n} - \beta|}{\Big(\sum_{i=1}^{n}\big((s_{i,yy} - \lambda\theta) - \beta(s_{i,xy} - \mu)\big)^2\Big)^{1/2}} \le z_{\alpha/2}\right\}.
\tag{1.1.58}
\]

The sets of (1.1.57) and (1.1.58) are respectively equivalent to

\[
\{\beta:\ Q_{1n}(\beta) \le 0\} \quad \text{and} \quad \{\beta:\ Q_{2n}(\beta) \le 0\},
\]


with the quadratic functions

\[
Q_{1n}(\beta) = \Big(n(n-1)(S_{xy} - \mu)^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,xy} - S_{xy})^2\Big)\beta^2 + \Big(-2n(n-1)(S_{xy} - \mu)^2\,\hat\beta_{2n} + 2z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,yy} - S_{yy})(s_{i,xy} - S_{xy})\Big)\beta + n(n-1)(S_{xy} - \mu)^2\,\hat\beta_{2n}^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,yy} - S_{yy})^2
\tag{1.1.59}
\]

and

\[
Q_{2n}(\beta) = \Big(n^2(S_{xy} - \mu)^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,xy} - \mu)^2\Big)\beta^2 + \Big(-2n^2(S_{xy} - \mu)^2\,\hat\beta_{2n} + 2z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,yy} - \lambda\theta)(s_{i,xy} - \mu)\Big)\beta + n^2(S_{xy} - \mu)^2\,\hat\beta_{2n}^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,yy} - \lambda\theta)^2,
\tag{1.1.60}
\]

whose respective discriminants are

\[
D_{1n} = 4z_{\alpha/2}^2\, n(n-1)(S_{xy} - \mu)^2 \sum_{i=1}^{n}\big((s_{i,yy} - S_{yy}) - \hat\beta_{2n}(s_{i,xy} - S_{xy})\big)^2 - 4z_{\alpha/2}^4 \bigg(\sum_{i=1}^{n}(s_{i,yy} - S_{yy})^2 \sum_{i=1}^{n}(s_{i,xy} - S_{xy})^2 - \Big(\sum_{i=1}^{n}(s_{i,yy} - S_{yy})(s_{i,xy} - S_{xy})\Big)^2\bigg)
\tag{1.1.61}
\]

and

\[
D_{2n} = 4z_{\alpha/2}^2\, n^2(S_{xy} - \mu)^2 \sum_{i=1}^{n}\big((s_{i,yy} - \lambda\theta) - \hat\beta_{2n}(s_{i,xy} - \mu)\big)^2 - 4z_{\alpha/2}^4 \bigg(\sum_{i=1}^{n}(s_{i,yy} - \lambda\theta)^2 \sum_{i=1}^{n}(s_{i,xy} - \mu)^2 - \Big(\sum_{i=1}^{n}(s_{i,yy} - \lambda\theta)(s_{i,xy} - \mu)\Big)^2\bigg).
\tag{1.1.62}
\]

It is crucial to determine the signs of $D_{1n}$, $D_{2n}$ and of the coefficients of $\beta^2$ in $Q_{1n}(\beta)$ and $Q_{2n}(\beta)$ in order to proceed. While it does not seem immediate to describe all the cases when these signs stabilize WPA1, $n \to \infty$, it is clear that if, e.g.,


$E\xi^4 < \infty$, then on account of (1.2.19), (1.1.38) and Theorem 1.1.3a, and the WLLN in regards of $(S_{xy} - \mu)^2$, $\sum_{i=1}^{n}\big((s_{i,yy} - \lambda\theta) - \hat\beta_{2n}(s_{i,xy} - \mu)\big)^2$, and $\sum_{i=1}^{n}(s_{i,yy} - S_{yy})^2$ and $\sum_{i=1}^{n}(s_{i,xy} - S_{xy})^2$ respectively,

\[
\operatorname{sign}(D_{1n}) = \operatorname{sign}\Bigg((S_{xy} - \mu)^2\, \frac{\sum_{i=1}^{n}\big((s_{i,yy} - S_{yy}) - \hat\beta_{2n}(s_{i,xy} - S_{xy})\big)^2}{n-1} - \frac{z_{\alpha/2}^2\, n}{(n-1)^2}\bigg(\frac{\sum_{i=1}^{n}(s_{i,yy} - S_{yy})^2}{n}\cdot\frac{\sum_{i=1}^{n}(s_{i,xy} - S_{xy})^2}{n} - \Big(\frac{\sum_{i=1}^{n}(s_{i,yy} - S_{yy})(s_{i,xy} - S_{xy})}{n}\Big)^2\bigg)\Bigg)
\]
\[
= \operatorname{sign}\Bigg((S_{xy} - \mu)^2 \sum_{i=1}^{n}\big((s_{i,yy} - S_{yy}) - \hat\beta_{2n}(s_{i,xy} - S_{xy})\big)^2/(n-1)\Bigg) = 1
\]

and

\[
\operatorname{sign}\Big(n(n-1)(S_{xy} - \mu)^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,xy} - S_{xy})^2\Big) = \operatorname{sign}\Big((S_{xy} - \mu)^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,xy} - S_{xy})^2/(n(n-1))\Big) = \operatorname{sign}\big((S_{xy} - \mu)^2\big) = 1,
\]

and, similarly,

\[
\operatorname{sign}(D_{2n}) = 1 \quad \text{and} \quad \operatorname{sign}\Big(n^2(S_{xy} - \mu)^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,xy} - \mu)^2\Big) = 1,
\]

WPA1, as $n \to \infty$. Consequently, in this case the CI's defined by (1.1.57) and (1.1.58) are respectively equivalent to

\[
B_{1n}^{L} \le \beta \le B_{1n}^{U}
\tag{1.1.63}
\]

and

\[
B_{2n}^{L} \le \beta \le B_{2n}^{U},
\tag{1.1.64}
\]

WPA1, $n \to \infty$, where $B_{1n}^{L}$ and $B_{1n}^{U}$ are the respective smaller and bigger real roots of $Q_{1n}(\beta)$ of (1.1.59), and $B_{2n}^{L}$ and $B_{2n}^{U}$ are those of $Q_{2n}(\beta)$ of (1.1.60),

\[
B_{1n}^{L,U} = \frac{n(n-1)(S_{xy} - \mu)^2\,\hat\beta_{2n} - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,yy} - S_{yy})(s_{i,xy} - S_{xy}) \mp \sqrt{D_{1n}/4}}{n(n-1)(S_{xy} - \mu)^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,xy} - S_{xy})^2}
\tag{1.1.65}
\]


and

\[
B_{2n}^{L,U} = \frac{n^2(S_{xy} - \mu)^2\,\hat\beta_{2n} - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,yy} - \lambda\theta)(s_{i,xy} - \mu) \mp \sqrt{D_{2n}/4}}{n^2(S_{xy} - \mu)^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(s_{i,xy} - \mu)^2},
\tag{1.1.66}
\]

with $D_{1n}$ of (1.1.61) and $D_{2n}$ of (1.1.62).

It follows from (c) of Remark 1.1.6 that under (4) of (b) of Remark 1.1.6, assuming that $(\xi_i, \delta_i, \varepsilon_i)$ are i.i.d. full normally distributed vectors and $c = 1$ in (1.1.7), an exact pivot-based $1-\alpha$ CI for $\beta$ is given by

\[
\left\{\beta:\ \frac{\sqrt{n S_{xx}}\,k_\xi\,|\hat\beta_{4n} - \beta|}{\Big(\sum_{i=1}^{n}\big((y_i - \bar y) - k_\xi\hat\beta_{4n}(x_i - \bar x)\big)^2/(n-2)\Big)^{1/2}} \le t_{n-2,\alpha/2}\right\},
\tag{1.1.67}
\]

where $S_{xx} \neq 0$ WPA1, $n \to \infty$. If (5) of (b) of Remark 1.1.6 is assumed, then on account of (d) of Remark 1.1.6 ((A) and (C) with $E\xi \neq 0$ are assumed),

\[
\left\{\beta:\ \frac{\sqrt{n}\,|\bar x|\,|\hat\beta_{5n} - \beta|}{\Big(\sum_{i=1}^{n}\big((y_i - \bar y) - \beta(x_i - \bar x)\big)^2/(n-1)\Big)^{1/2}} \le z_{\alpha/2}\right\}
\tag{1.1.68}
\]

and

\[
\left\{\beta:\ \frac{\sqrt{n}\,|\bar x|\,|\hat\beta_{5n} - \beta|}{\Big(\sum_{i=1}^{n}\big((y_i - \bar y) - \hat\beta_{5n}(x_i - \bar x)\big)^2/(n-1)\Big)^{1/2}} \le z_{\alpha/2}\right\},
\tag{1.1.69}
\]

where $\bar x \neq 0$ WPA1, $n \to \infty$, are LSA $1-\alpha$ CI's for $\beta$, while if, additionally, $(\delta, \varepsilon)$ has a normal distribution, we have a pivot-based CI for $\beta$ as follows:

\[
\left\{\beta:\ \frac{\sqrt{n}\,|\bar x|\,|\hat\beta_{5n} - \beta|}{\Big(\sum_{i=1}^{n}\big((y_i - \bar y) - \beta(x_i - \bar x)\big)^2/(n-1)\Big)^{1/2}} \le t_{n-1,\alpha/2}\right\}.
\tag{1.1.70}
\]

As to the explicit form of the CI of (1.1.68), arguments similar to those used in regards of (1.1.57) lead to the conclusion that, WPA1, $n \to \infty$, (1.1.68) is equivalent to

\[
B_{3n}^{L}(z_{\alpha/2}) \le \beta \le B_{3n}^{U}(z_{\alpha/2}),
\tag{1.1.71}
\]


where $B_{3n}^{L}(z_{\alpha/2})$ and $B_{3n}^{U}(z_{\alpha/2})$ are the respective smaller and bigger real roots of the quadratic function

\[
Q_{3n}(z_{\alpha/2}) = \Big(n(n-1)(\bar x)^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(x_i - \bar x)^2\Big)\beta^2 + \Big(-2n(n-1)(\bar x)^2\,\hat\beta_{5n} + 2z_{\alpha/2}^2 \sum_{i=1}^{n}(y_i - \bar y)(x_i - \bar x)\Big)\beta + n(n-1)(\bar x)^2\,\hat\beta_{5n}^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(y_i - \bar y)^2,
\tag{1.1.72}
\]

with discriminant

\[
D_{3n}(z_{\alpha/2}) = 4z_{\alpha/2}^2\, n(n-1)(\bar x)^2 \sum_{i=1}^{n}\big((y_i - \bar y) - \hat\beta_{5n}(x_i - \bar x)\big)^2 - 4z_{\alpha/2}^4 \bigg(\sum_{i=1}^{n}(y_i - \bar y)^2 \sum_{i=1}^{n}(x_i - \bar x)^2 - \Big(\sum_{i=1}^{n}(y_i - \bar y)(x_i - \bar x)\Big)^2\bigg),
\tag{1.1.73}
\]

where, due to the WLLN and $E\xi \neq 0$, the CLT's of (d) of Remark 1.1.6 and the WLLN, and (1.2.28) and the WLLN in regards of $(\bar x)^2$, $\sum_{i=1}^{n}\big((y_i - \bar y) - \hat\beta_{5n}(x_i - \bar x)\big)^2$, and $\sum_{i=1}^{n}(y_i - \bar y)^2$ and $\sum_{i=1}^{n}(x_i - \bar x)^2$ respectively,

\[
\operatorname{sign}\big(D_{3n}(z_{\alpha/2})\big) = 1 \quad \text{and} \quad \operatorname{sign}\Big(n(n-1)(\bar x)^2 - z_{\alpha/2}^2 \sum_{i=1}^{n}(x_i - \bar x)^2\Big) = 1,
\]

WPA1, $n \to \infty$. Similarly, (1.1.70) is equivalent, WPA1, $n \to \infty$, to

\[
B_{3n}^{L}(t_{n-1,\alpha/2}) \le \beta \le B_{3n}^{U}(t_{n-1,\alpha/2}),
\tag{1.1.74}
\]

where $B_{3n}^{L}(t_{n-1,\alpha/2})$ and $B_{3n}^{U}(t_{n-1,\alpha/2})$ are the respective smaller and bigger real roots of the quadratic function $Q_{3n}(z_{\alpha/2})$ of (1.1.72), with $t_{n-1,\alpha/2}$ in place of $z_{\alpha/2}$.

The availability of any LSA CI for $\beta$ in SEIVM's (1.1.1)-(1.1.2) under an identifiability assumption should be generally appreciated, since the exact distributions of $\hat\beta_{jn}$, $j = \overline{1,3}$, are too complicated to be made use of in this regard (in fact, only a handful of results exist under (1), cf. Section 2.3 of Cheng and Van Ness [12]), and also exact pivot-based CI's are difficult to come by (cf. Section 2.4 of [12] and Section 12.3.4 of [8] for a few results available under normality of the explanatory variables


and/or the error terms under (1), and for the fact that no pivot under (2) and (3) is known). As far as the situation with LSA CI's for $\beta$ in SEIVM's (1.1.1)-(1.1.2) under identifiability assumptions goes in the literature, one can get an idea from the summary on the corresponding CLT's from Section 1.1.3 and Remarks 1.1.3, 1.1.4 (part (b)), 1.1.5 (part (a)) and 1.1.6 (parts (c), (d)) of Section 1.1.4.

To summarize our contributions to CI's for $\beta$ in SEIVM's (1.1.1)-(1.1.2) under identifiability assumptions that are due to the main results (CLT's) in Section 1.1.4, we first note that, in view of part (a) of Remark 1.1.5, the LSA CI's of (1.1.55) and (1.1.56), $j = \overline{1,3}$, are the first CI's under $\operatorname{Var}\xi < \infty$ when neither the explanatory variables nor the error terms are assumed to be normal or normal-like, and also under $\operatorname{Var}\xi = \infty$. When $\operatorname{Var}\xi < \infty$ and the explanatory variables and/or the error terms are normal or normal-like, the LSA CI's (1.1.55) and (1.1.56) are different from those in the literature (cf. Remark 1.1.5). This is also true in regards of the CI's of (1.1.63) and (1.1.64), and their analogues under (3), which all additionally require that $E\xi^4 < \infty$. Secondly, our approach has provided a variety of CI's for $\beta$ under each of (1)-(3) and (5). Thus, under (1), we now have (1.1.55) and (1.1.56) with $j = 1$, in addition to the LSA CI for $\beta$ that follows from [21], [22] combined, which requires (B), (C), (E), $\operatorname{Var}\xi < \infty$ and a normal or normal-like distribution for $(\delta, \varepsilon)$ (cf. Remark 1.1.3, (b) of Remark 1.1.4 and Gleser [23]). In cases (2) and (3), we have proposed the four CI's of (1.1.55) and (1.1.56) with $j = 2$, (1.1.63) and (1.1.64) under (2) and their four analogues for (3), while assuming (2) or (3), (B), (C), (E), $\operatorname{Var}\xi < \infty$ and a normal or normal-like distribution for $(\delta, \varepsilon)$, one can derive a (different) CI for $\beta$ from the corresponding CLT in [9] (cf. Remark 1.1.3 and (b) of Remark 1.1.4). Also, under (5), we have worked out the three CI's of (1.1.69), (1.1.71) and (1.1.74). Naturally, having now a variety of CI's for $\beta$ under each of (1)-(3) and (5), it would be of interest to compare, at least numerically, the performances of the corresponding CI's. This important investigation is not within the scope of


this dissertation.

In the rest of this section, we discuss liabilities that some of the LSA CI's for β in SEIVM's (1.1.1)-(1.1.2) may suffer from in view of the so-called Gleser-Hwang effect, and address the types of models that make it possible to prevent this effect and provide reasonable CI's.

Unfortunately, although the coverage probability of an LSA CI for β may approach 1 − α as n becomes large for fixed values of the parameters of a SEIVM (1.1.1)-(1.1.2), not for every LSA CI does it do so uniformly over the parameter space. Hence, there may be no sample size n for which the minimum coverage probability (which defines the confidence of the CI) of an LSA CI for β equals 1 − α. More precisely, the following phenomenon was discovered by Gleser and Hwang [27] and is also summarized in [8], [11], [12], [23] and [24]. For SEIVM's (1.1.1)-(1.1.2) under (A), with (δ, ε) assumed to have a joint density satisfying some general assumptions, (E) and 0 < Var ξ < ∞, it follows from [27] that every LSA CI of finite length for β has minimal coverage probability equal to zero, regardless of how large the sample size n may be. Conversely, every LSA CI for β with nonzero confidence level 1 − α is unbounded with positive probability. This implies that such an LSA CI has infinite expected size. Moreover, for any scalar function g(β) with finite range, any LSA CI for g(β) which has confidence level greater than 1/2 must have positive probability of containing the entire range. In Cheng and Van Ness [12] the described phenomenon is referred to as the Gleser-Hwang effect.

Despite the limitations of the LSA CI's for β that are due to the Gleser-Hwang effect, it is nevertheless possible to ensure reasonable CI's with a proper, robust choice of SEIVM's (1.1.1)-(1.1.2). In this regard, Gleser [23] examines SEIVM's (1.1.1)-(1.1.2) under (1) with λ = 1 and (ξ, δ, ε) normal, introduces the signal to noise ratio κ = Var ξ/Var ε, and concludes that if κ is at least as large as 1 (or the reliability ratio κ/(κ + 1) > 0.5, cf. (b) of Remark 1.1.6), then the confidence of his CI based on a moderate sample size n > 25 is reasonably close to the desired level. Hence the importance of prior knowledge about the reliability ratio.

We note that the idea behind the Gleser-Hwang effect is, roughly speaking, based on the fact that, when 0 < Var ξ < ∞, a SEIVM (1.1.1)-(1.1.2) can be close enough to the degenerate one with Var ξ = 0, i.e., to y_i = (Eξ)β + α + δ_i, x_i = Eξ + ε_i, in which the explanatory variables do not vary and thus make it impossible to fit a unique straight line through the data points. If one a priori restricts Var ξ to be bounded away from zero, the effect disappears. This is, perhaps, an intuitive predecessor of the idea of Gleser [23] that the signal to noise ratio κ should be large enough to ensure reasonable LSA CI's for β.
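As a quick numerical illustration of this intuition (a sketch only: the helper names and the orthogonal-regression slope for the case λ = 1 are our own choices here, not the thesis's estimators), one can simulate samples from the model with Var ξ well separated from zero and with Var ξ close to zero, and compare the spread of the resulting slope estimates:

```python
import numpy as np

rng = np.random.default_rng(1)

def orth_slope(x, y):
    # Orthogonal-regression slope for the case lambda = 1 (illustrative helper,
    # not one of the thesis's estimators).
    sxx, syy = x.var(), y.var()
    sxy = np.cov(x, y, ddof=0)[0, 1]
    return (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)

def slope_spread(var_xi, beta=1.0, n=100, reps=300):
    # Empirical spread of the slope estimate over repeated simulated samples.
    est = []
    for _ in range(reps):
        xi = rng.normal(0.0, np.sqrt(var_xi), n)  # latent explanatory variables
        x = xi + rng.normal(0.0, 1.0, n)          # observed with error eps
        y = beta * xi + rng.normal(0.0, 1.0, n)   # response with error delta
        est.append(orth_slope(x, y))
    return float(np.std(est))

spread_informative = slope_spread(var_xi=4.0)   # xi varies strongly
spread_degenerate = slope_spread(var_xi=0.01)   # nearly degenerate model
print(spread_informative, spread_degenerate)
```

With the explanatory variables nearly constant, the sample covariance S_xy fluctuates around zero and the estimated slope becomes erratic, which is exactly the degeneracy described above.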

Gleser's [23] idea of reasonable SEIVM's (1.1.1)-(1.1.2) with a large enough signal to noise or reliability ratio (0 < Var ξ < ∞) as regards LSA CI's for β rhymes well with our general idea of the robustness of first time around SEIVM's (1.1.1)-(1.1.2), in which the explanatory variables with infinite variance (as allowed by (D) in view of Remark 1.2.1) dominate the error terms with a finite variance (cf. the concluding lines of Section 1.1.3, Observation 1.1.1, part (e) of Remark 1.1.6). Indeed, in view of the basic idea behind the Gleser-Hwang effect (cf. the previous paragraph), the latter models are a priori resistant to this effect, as opposed to those in [23], which require preliminary estimation of the reliability ratio in order to be deemed robust enough against this effect. In fact, we can extend the definition of the reliability ratio to the models with Var ξ = ∞ by setting it equal to 1, and view such models as those with the maximal possible reliability ratio. Hence, although the LSA CI's for β derived under (1)-(3) and (5) of (b) of Remark 1.1.6 in this section have not yet been carefully examined when (D) with Var ξ = ∞ is assumed, due to the discussions in the concluding lines of Section 1.1.3, Observation 1.1.1, part (e) of Remark 1.1.6 and the disappearance of the Gleser-Hwang


effect in this case, one has every reason to believe that these CI's are reasonable ones, i.e., their confidences are close enough to the desired level for large enough n. Some interesting comments on the Gleser-Hwang effect from the practitioners' point of view can be found in Hasabelnaby, Ware and Fuller [30].

Other Applications

As to some other applications of our main results of Section 1.1.4, the CLT's of Theorems 1.1.3b, 1.1.3c and their companions in part (b) of Remark 1.1.5, and the CLT's of Theorem 1.1.5 and Remark 1.1.12, can be used to construct LSA confidence sets/intervals for the intercept α (if the intercept is unknown) and the unknown error variance(s), as well as for various vectors composed of β, α and the unknown error variance(s). Naturally, the LSA CI's for the parameters of interest of SEIVM's (1.1.1)-(1.1.2), in particular the LSA CI's that we have just constructed for β, can also be inverted the usual way to obtain asymptotic size tests for the corresponding statistical hypotheses. Also, the weak invariance principles in (ii)'s of Theorems 1.1.2 and 1.1.3 can be used to study appropriate functionals of interest of the WLSP's, MLSP's and the processes for the unknown error variances. These possibilities are not explored in this thesis.

1.2 Auxiliary Results and Proofs

Within Section 1.2, Sections 1.2.1 and 1.2.3, with their auxiliary results and proofs, are special in that they can independently serve as surveys of, with our complements to, some main results on domains of attraction of univariate and generalized (multivariate) normal laws (DAN and GDAN), and on recent advances in Studentization and self-normalization. These surveys are then utilized in developing our main lines of contributions in Section 1.2, respectively in Sections 1.2.2 and 1.2.4. The concluding Section 1.2.5 constitutes a crucial step in the proof of the key auxiliary Lemma 1.2.8 of Section 1.2.2 and amounts to a computer code in "Maple".


1.2.1 Basics on DAN, Self-Normalization and Studentization with Some Complements

In this subsection we take a little detour into the world of DAN, self-normalization and Studentization. We review some basic well-known results, introduced as Lemmas 1.2.1-1.2.5 here, give a very simple proof in the context of Chapter 1 for the first part of the already known result of Lemma 1.2.6, and also establish Lemma 1.2.7. Further developments in Sections 1.2.2-1.2.4 are heavily based on these lemmas. The survey of Section 1.2.1 may also be of independent interest. First, we introduce an abbreviation and a few new notations. WLLN is short for the Kolmogorov weak law of large numbers for i.i.d.r.v.'s with finite mean. The notation o_P(1) is used for a sequence of r.v.'s that converges to zero in probability P, while o(1) stands for a nonrandom numerical sequence that converges to zero. Now, we recall a well-known definition.

Definition 1.2.1. Let {Z, Z_i, i ≥ 1} be a sequence of i.i.d.r.v.'s. We say that Z belongs to DAN if there are constants a_n and b_n, b_n > 0, for which

(Σ_{i=1}^n Z_i − a_n)/b_n →_D N(0, 1), as n → ∞. (1.2.1)

Remark 1.2.1. Here are some known facts related to Definition 1.2.1. In (1.2.1), a_n can be taken as nEZ and b_n = n^{1/2} ℓ(n), where ℓ(n) is a slowly varying function at infinity defined by the distribution of Z. In fact, as will be seen from the proof of the upcoming Lemma 1.2.2, ℓ(n) = const > 0, if Var Z < ∞, and ℓ(n) ↗ ∞, as n → ∞, if Var Z = ∞. Also, Z has moments of all orders less than 2, and the variance of Z is positive, but need not be finite.

One of the several necessary and sufficient conditions for {Z, Zi, i > 1} to be in DAN is commonly associated with O’Brien [53] as, e.g., in [20] (for more details see also [42], p. 194), and it reads as follows.


Lemma 1.2.1. Let {Z, Z_i, i ≥ 1} be i.i.d.r.v.'s. Then, as n → ∞,

Z ∈ DAN if and only if max_{1≤i≤n} Z_i² / Σ_{i=1}^n Z_i² →_P 0. (1.2.2)

The following result will come in handy for us later on. This lemma was rediscovered by Maller [41] to answer in part a conjecture of Logan, Mallows, Rice and Shepp [39] (cf. the necessity part of Lemma 1.2.3 here) and is essentially a variation of Theorems 4 and 5 on pp. 143-144 in [28]. From Maller [42], we learn that there is also a converse to this lemma (cf. Feller [18], p. 236). However, for our purposes, the one-sided Lemma 1.2.2 will be sufficient.
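The condition (1.2.2) is easy to probe numerically; the following sketch (illustrative only, with our own choice of distributions) compares the max-to-sum ratio for a distribution in DAN with that for the Cauchy distribution, which is not in DAN:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

def max_to_sum(z):
    # O'Brien ratio max_i z_i^2 / sum_i z_i^2 from (1.2.2)
    z2 = z ** 2
    return float(z2.max() / z2.sum())

ratio_normal = max_to_sum(rng.normal(size=n))      # N(0,1) is in DAN
ratio_cauchy = max_to_sum(rng.standard_cauchy(n))  # Cauchy is not in DAN
print(ratio_normal, ratio_cauchy)
```

For the normal sample the largest squared observation is asymptotically negligible relative to the total, while for the Cauchy sample a single observation keeps carrying a nonvanishing share of Σ Z_i².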

Lemma 1.2.2. Let {Z, Z_i, i ≥ 1} be i.i.d.r.v.'s in DAN. Then, as n → ∞,

Σ_{i=1}^n (Z_i − EZ)² b_n^{−2} →_P 1, (1.2.3)

where b_n is a positive sequence of numbers such that

Σ_{i=1}^n (Z_i − EZ) b_n^{−1} →_D N(0, 1). (1.2.4)

Proof. Suppose that Var Z < ∞ (from Remark 1.2.1, Var Z > 0). Then, on account of the CLT, b_n = (n Var Z)^{1/2} in (1.2.4), and (1.2.3) holds true by Kolmogorov's WLLN. When Var Z = ∞, one finds a direct simple proof of the following fact in [20]: for such {Z, Z_i, i ≥ 1} and b_n as in (1.2.4),

Σ_{i=1}^n Z_i² b_n^{−2} →_P 1, n → ∞. (1.2.5)

In this case not only is it true that b_n = n^{1/2} ℓ(n), where ℓ(n) is a slowly varying function at infinity defined by the distribution of Z, but also ℓ(n) ↗ ∞, as n → ∞. The latter fact together with Kolmogorov's WLLN imply (1.2.3) as follows:

Σ_{i=1}^n (Z_i − EZ)² b_n^{−2} = Σ_{i=1}^n Z_i² b_n^{−2} − 2EZ Σ_{i=1}^n Z_i b_n^{−2} + n(EZ)² b_n^{−2} →_P 1, (1.2.6)

on account of (1.2.5), the WLLN and n b_n^{−2} = ℓ^{−2}(n) → 0. □


Remark 1.2.2. When Var Z = ∞, Lemma 1.2.2 can also be viewed as an addition to a converse to the Kolmogorov law of large numbers found on p. 80 of [59]. This converse states that

if EZ⁺ = ∞ and EZ⁻ < ∞, then n^{−1} Σ_{i=1}^n Z_i →_{a.s.} +∞, n → ∞, (1.2.7)

where {Z, Z_i, i ≥ 1} are i.i.d.r.v.'s and

Z⁺ = Z, if Z ≥ 0, and Z⁺ = 0, if Z < 0; Z⁻ = −Z, if Z < 0, and Z⁻ = 0, if Z ≥ 0.

On assuming that EZ exists and Var Z = ∞, by (1.2.7) one concludes

n^{−1} Σ_{i=1}^n (Z_i − EZ)² →_{a.s.} +∞, n → ∞. (1.2.8)

In view of (1.2.8), under Z ∈ DAN, (1.2.3) of Lemma 1.2.2 specifies how fast n^{−1} Σ_{i=1}^n (Z_i − EZ)² converges to infinity in probability.

In the aforementioned conjecture of Logan, Mallows, Rice and Shepp [39] on the characterization of DAN via asymptotic normality of self-normalized partial sums, the most difficult part of the question had remained open until Giné, Götze and Mason [20]. Now, this fundamental characterization reads as follows.

Lemma 1.2.3. Let {Z, Z_i, i ≥ 1} be i.i.d.r.v.'s. Then, as n → ∞,

Z ∈ DAN if and only if Σ_{i=1}^n (Z_i − EZ)(Σ_{i=1}^n (Z_i − EZ)²)^{−1/2} →_D N(0, 1). (1.2.9)

Proof. On assuming that Z ∈ DAN, convergence in (1.2.9) was concluded by Maller (1981) via (1.2.3) and (1.2.4) of Lemma 1.2.2. Conversely, assume now that convergence to normality in (1.2.9) holds true. The main achievement of [20] is in showing that, for i.i.d.r.v.'s {Z, Z_i, i ≥ 1},

Σ_{i=1}^n Z_i (Σ_{i=1}^n Z_i²)^{−1/2} →_D N(0, 1), n → ∞, (1.2.10)


implies

Z ∈ DAN and EZ = 0. (1.2.11)

After showing that Z ∈ DAN is centered if (1.2.10) obtains, this proof reduces to studying Σ_{i=1}^n Z_i (Σ_{i=1}^n Z_i²)^{−1/2} in order to verify the convergence in Lemma 1.2.1. On noting also that Z ∈ DAN if and only if Z − EZ ∈ DAN, Giné, Götze and Mason [20] conclude the second part of (1.2.9) via (1.2.10) implying (1.2.11). In view of the thus obtained equivalence of (1.2.10) and (1.2.11), the fact that EZ = 0 in (1.2.11) rhymes very well with the role of centering, or the choice of a_n in Definition 1.2.1, that is also underlined in Observation 2.1 of [17]. □

Lemma 1.2.3 leads to a full justification of the title of [20]: "When is the Student t-statistic asymptotically standard normal?". Namely, for the classical Student t-statistic

T_n(Z) = √n Z̄ / (Σ_{i=1}^n (Z_i − Z̄)² / (n − 1))^{1/2}, (1.2.12)

the conclusion of Giné, Götze and Mason [20] can be stated as Lemma 1.2.4 below. We note that if {Z, Z_i, i ≥ 1} are i.i.d.r.v.'s in DAN, then, on using (1.2.3) of Lemma 1.2.2 and Remark 1.2.1, the denominator of T_n(Z) is well-defined with probability that goes to one, as n → ∞.

Lemma 1.2.4. As n → ∞, the following two statements are equivalent:

(a) Z ∈ DAN and EZ = a;

(b) T_n(Z − a) →_D N(0, 1).

Proof. Easily follows from representing the t-statistic as

T_n(Z) = [Σ_{i=1}^n Z_i / (Σ_{i=1}^n Z_i²)^{1/2}] / {[n − (Σ_{i=1}^n Z_i / (Σ_{i=1}^n Z_i²)^{1/2})²] / (n − 1)}^{1/2},


noticing that Z ∈ DAN if and only if Z − a ∈ DAN, and applying the equivalence of (1.2.10) and (1.2.11) to Z − a. □
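The representation of T_n(Z) through the self-normalized sum is a purely algebraic identity, which can be checked numerically; the following sketch (illustrative only) does so on an arbitrary sample:

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.exponential(size=50)  # any sample will do: the identity is algebraic
n = len(z)

# Classical Student t-statistic (1.2.12)
t_classical = np.sqrt(n) * z.mean() / np.sqrt(((z - z.mean()) ** 2).sum() / (n - 1))

# Representation through the self-normalized sum r = sum z_i / (sum z_i^2)^(1/2)
r = z.sum() / np.sqrt((z ** 2).sum())
t_selfnorm = r / np.sqrt((n - r ** 2) / (n - 1))

print(t_classical, t_selfnorm)
```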

Recently, Csörgő, Szyszkowicz and Wang [16], [17] extended Lemma 1.2.4 by obtaining a characterization of DAN also in terms of the asymptotic behaviour of the sequence of Student processes defined in D[0, 1] as

T_{n,t}(Z) = √n Z̄_t / (Σ_{i=1}^n (Z_i − Z̄)² / (n − 1))^{1/2}, 0 ≤ t ≤ 1. (1.2.13)

More precisely, they have additionally shown that (a) of Lemma 1.2.4 is equivalent to weak convergence of (1.2.13) to a standard Wiener process W(t) (cf. Remark 1.1.2 for the notion of weak convergence on (D[0, 1], ρ)), as well as to sup-norm approximation in probability of (1.2.13) by a Wiener process, as summarized next.

Lemma 1.2.5. Let {Z, Z_i, i ≥ 1} be i.i.d.r.v.'s. As n → ∞, the following statements are equivalent:

(a) Z ∈ DAN and EZ = a;

(b) T_{n,t₀}(Z − a) →_D N(0, t₀), for t₀ ∈ (0, 1];

(c) T_{n,t}(Z − a) ⇒ W(t) on (D[0, 1], ρ);

(d) On an appropriate probability space for {Z, Z_i, i ≥ 1}, one can construct a standard Wiener process {W(t), 0 ≤ t < ∞} such that

sup_{0≤t≤1} |T_{n,t}(Z − a) − W(nt)/√n| = o_P(1).
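Statement (b) can be probed by simulation; the sketch below (illustrative only, with Z̄_{t₀} computed as the partial sum up to [nt₀] divided by n, which is an assumption of this sketch) checks that T_{n,t₀}(Z − a) has variance close to t₀ for a finite-variance choice of Z:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps, t0 = 500, 2000, 0.5
k = int(n * t0)

vals = np.empty(reps)
for i in range(reps):
    z = rng.normal(size=n)  # N(0,1) is in DAN with EZ = 0, so (b) should hold
    s = np.sqrt(((z - z.mean()) ** 2).sum() / (n - 1))
    # T_{n,t0}(Z): partial sum up to [n*t0] over the full-sample denominator
    vals[i] = z[:k].sum() / (np.sqrt(n) * s)
print(vals.mean(), vals.var())
```

The empirical variance of the simulated values should be close to t₀ = 0.5, in line with the N(0, t₀) limit.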

We are now to introduce the first of the two lemmas that are of supreme importance in Section 1.2.2. The first part of the first one says that the DAN class of r.v.'s is closed under multiplication. The second part of Lemma 1.2.6 amounts to a converse. Lemma 1.2.6 is established by Maller in [41] and is also applied there to prove the asymptotic normality of the regression coefficient in a linear regression when the error variance is not necessarily finite. The proof of Lemma 1.2.6 in [41] is quite technical, and is based on checking the classical conditions of Theorem 2 on p. 128 in [28] guaranteeing (1.2.1) for suitably chosen constants b_n. Here, we present a simple proof of the first part of Maller's result in the context of our model, namely under conditions similar to those in (B), (D) and (E), and state the converse without a proof.

Lemma 1.2.6. Let {(U, V), (U_i, V_i), i ≥ 1} be i.i.d. random vectors and assume that U and V are independent. If U ∈ DAN and V ∈ DAN, then UV ∈ DAN. Conversely, if EV² < ∞ and UV ∈ DAN, then U ∈ DAN.

Proof. We prove here the first part of Lemma 1.2.6, assuming additionally to V ∈ DAN that EV⁴ < ∞, similarly to (B). If EU² < ∞, then, since U and V are independent and nondegenerate (both are in DAN), we have

Var UV = EU²EV² − (EU)²(EV)² > 0.

Thus, on account of the CLT, via Definition 1.2.1, UV ∈ DAN. Suppose now that EU² = ∞. First, without loss of generality, we assume that EV² = 1 and prove the following key observation:

Σ_{i=1}^n (U_i − EU)² V_i² / Σ_{i=1}^n (U_i − EU)² →_P 1, n → ∞. (1.2.14)

Since U ∈ DAN, then U − EU ∈ DAN and, combining (1.2.9) of Lemma 1.2.3 and (3.7) of [20], one of the key results of that paper, we have

E(max_{1≤i≤n} (U_i − EU)² / Σ_{i=1}^n (U_i − EU)²) = o(1). (1.2.15)

For any ε > 0, on account of the independence of U and V, and (1.2.15),

P(|Σ_{i=1}^n (U_i − EU)²(V_i² − 1) / Σ_{i=1}^n (U_i − EU)²| > ε) ≤ ε^{−2} n E(V² − 1)² E((U_1 − EU)² / Σ_{i=1}^n (U_i − EU)²)² = o(1),


i.e., we have (1.2.14). Furthermore, without loss of generality, we can assume that EU = 0, since when EU ≠ 0, it is easy to see that

(U − EU)V ∈ DAN implies UV ∈ DAN. (1.2.16)

Indeed, if (U − EU)V ∈ DAN, then Lemma 1.2.2, (1.2.14) and the fact that EU² = ∞ yield, as n → ∞,

Σ_{i=1}^n (U_i − EU)V_i / (√n ℓ(n)) →_D N(0, 1), (1.2.17)

where the slowly varying function ℓ(n) ↗ ∞. Also, since 0 < Var V < ∞,

Σ_{i=1}^n (V_i − EV)EU / (n Var V (EU)²)^{1/2} →_D N(0, 1), n → ∞. (1.2.18)

Hence, on account of (1.2.17) and (1.2.18), with ℓ(n) from (1.2.17), as n → ∞,

Σ_{i=1}^n (U_iV_i − EU·EV) / (√n ℓ(n)) = Σ_{i=1}^n (U_i − EU)V_i / (√n ℓ(n)) + Σ_{i=1}^n (V_i − EV)EU / (√n ℓ(n)) →_D N(0, 1),

i.e., UV ∈ DAN via (1.2.16) with EU ≠ 0 and EU² = ∞. Continuing the proof when EU = 0 and EU² = ∞, by Lemma 1.2.1, one needs to verify that

max_{1≤i≤n} U_i²V_i² / Σ_{i=1}^n U_i²V_i² = o_P(1), as n → ∞,

or, on account of (1.2.14), that

max_{1≤i≤n} U_i²V_i² / Σ_{i=1}^n U_i² = o_P(1), as n → ∞.

Indeed, for any ε > 0,

P(max_{1≤i≤n} U_i²V_i² / Σ_{j=1}^n U_j² > ε) ≤ n P(U_1²V_1² / Σ_{j=1}^n U_j² > ε) = o(1). □
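The first part of Lemma 1.2.6 can be illustrated by simulation (a sketch only; the particular distributions are our own choices): with U of Pareto type with tail index 2, hence in DAN with infinite variance, and V an independent Rademacher sign, the self-normalized sums of U_iV_i should be approximately standard normal, in line with UV ∈ DAN and Lemma 1.2.3:

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 2000, 1000

vals = np.empty(reps)
for i in range(reps):
    # U: Pareto-type tail of index 2 -- infinite variance, yet in DAN;
    # V: Rademacher signs -- bounded, independent of U, EV = 0, so E(UV) = 0.
    u = rng.pareto(2.0, n)
    v = rng.choice([-1.0, 1.0], n)
    w = u * v
    vals[i] = w.sum() / np.sqrt((w ** 2).sum())  # self-normalized sum, cf. (1.2.10)
print(vals.mean(), vals.std())
```

The simulated values should have mean near 0 and standard deviation near 1, although at the tail index 2 boundary the convergence is known to be slow.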


Remark 1.2.3. Having established (1.2.14) here, we can also give a simple proof for a part of the conclusions in the Remark of Maller [41] in the context of our model. Namely, assuming that U ∈ DAN, V ∈ DAN and EV⁴ < ∞, rather than EV² < ∞ as in [41], we conclude from (1.2.14) and Lemma 1.2.2 that Σ_{i=1}^n (U_i − EU)V_i and Σ_{i=1}^n (U_i − EU) are of the same order of magnitude, i.e., Σ_{i=1}^n (U_i − EU)V_i ((n Var V)^{1/2} ℓ(n))^{−1} →_D N(0, 1) and Σ_{i=1}^n (U_i − EU)(√n ℓ(n))^{−1} →_D N(0, 1), n → ∞, with the same slowly varying function at infinity ℓ(n).

The results of Lemma 1.2.6 will usually be coupled with those of the next one.

Lemma 1.2.7. Let {(U, V), (U_i, V_i), i ≥ 1} be i.i.d. random vectors with E|UV| < ∞.

(a) If U ∈ DAN and V ∈ DAN, then U + V ∈ DAN, provided that P(U + V = const) ≠ 1.

(b) Conversely, if U + V ∈ DAN, U ∈ DAN and EU² < ∞, then V ∈ DAN, provided that P(V = const) ≠ 1.

Proof. (a) If EU² < ∞, EV² < ∞ and P(U + V = const) ≠ 1, then 0 < Var(U + V) < ∞ and, due to the CLT, U + V ∈ DAN. Assume now that, say, EU² = ∞. Then, on account of Lemma 1.2.1, Lemma 1.2.2 for {U, U_i, i ≥ 1} with b_n = √n ℓ(n), where the slowly varying function ℓ(n) ↗ ∞, and also the Kolmogorov WLLN for {U_iV_i, i ≥ 1},

max_{1≤i≤n} (U_i + V_i)² / Σ_{i=1}^n (U_i + V_i)² ≤ 2(max_{1≤i≤n} U_i² + max_{1≤i≤n} V_i²) / Σ_{i=1}^n (U_i + V_i)² = o_P(1).

Via Lemma 1.2.1, this proves that U + V ∈ DAN.

(b) Since EU² < ∞ and also E|UV| < ∞, then E|(U + V)(−U)| < ∞, and via applying part (a) of Lemma 1.2.7 to U + V and −U, one concludes that V ∈ DAN. □


1.2.2 Auxiliary Results and Proofs of Theorems 1.1.1-1.1.3, Propositions 1.1.1, 1.1.2 and Observation 1.1.1

The main aim of this subsection is to provide the proofs for the univariate main results of Section 1.1.4. First, Theorems 1.1.1a and 1.1.1b are to be proved. Then, to obtain the invariance principles in Theorems 1.1.2, 1.1.3 and Propositions 1.1.1, 1.1.2, and also to prove Observation 1.1.1, we will employ two auxiliary processes, namely the one in (1.2.42) and M_{n,t}.

The developments of this section are based on the results of Section 1.2.1.

A new abbreviation used below is SLLN, which is short for the Kolmogorov strong law of large numbers for i.i.d.r.v.'s with finite mean. The notations O(1) and O_P(1) stand respectively for bounded nonrandom numerical sequences and for sequences of r.v.'s that are bounded in probability P. For r.v.'s X and Y, by X =_{a.s.} Y we mean that P(X = Y) = 1. If Z is a d-dimensional vector, then Z^{(j)} is its jth component, while Z^{(k,k+l)} = (Z^{(k)}, Z^{(k+1)}, …, Z^{(k+l)}) is a subvector of Z that has all the components of Z starting with Z^{(k)} and ending with Z^{(k+l)}, 1 ≤ k ≤ d − 1, 1 ≤ l ≤ d − k, d ≥ 2. For a square matrix A, λ_min(A), det(A) and tr(A) denote respectively the minimum eigenvalue, the determinant and the trace of A.


Proof of Theorem 1.1.1a. First, we show that, as n → ∞,

(S_{xx} − θ)/S_{ξξ} →_{a.s.} 1, (S_{xy} − μ)/S_{ξξ} →_{a.s.} β and (S_{yy} − λθ)/S_{ξξ} →_{a.s.} β². (1.2.19)

We do this by proving only the second convergence in (1.2.19), as an example. We have

(S_{xy} − μ)/S_{ξξ} = β + S_{ξδ}/S_{ξξ} + S_{ξε}β/S_{ξξ} + (S_{εδ} − μ)/S_{ξξ}. (1.2.20)

On account of (C), the SLLN and (1.2.8) of Remark 1.2.2,

S_{ξξ} →_{a.s.} Eξ² − cm² > 0, if Var ξ < ∞, and S_{ξξ} →_{a.s.} +∞, otherwise. (1.2.21)

Observing that (A), (C) and (E) imply that E(ξδ) = 0, E(ξε) = 0 and E(δε) = μ, one concludes from the SLLN and (1.2.21) that

S_{ξδ}/S_{ξξ}, S_{ξε}/S_{ξξ} and (S_{εδ} − μ)/S_{ξξ} are all =_{a.s.} o(1),

and thus, via (1.2.20), arrives at the second convergence in (1.2.19).

Now, assuming that μ = 0 in (1.1.3), consider the estimator β̂_{1n} = (β̂_{1n} − β)_1 + β, with (β̂_{1n} − β)_t from (1.1.11). It follows from (1.2.19) and (1.2.21) that

sign(S_{xy}) →_{a.s.} sign(β), n → ∞. (1.2.22)

On combining (1.2.19) and (1.2.22), β̂_{1n} is concluded to be a strongly consistent estimator of β. Similarly, for i = 2, 3, (1.2.19) implies

β̂_{in} →_{a.s.} β, n → ∞. (1.2.23)

As to the strong consistency of α̂_{in}, i = 1, 3, it follows from the SLLN and (1.2.23) for i = 1, 3. Consider now the estimator θ̂_{2n} from (1.1.23). Thus,

θ̂_{2n} − θ = ((S_{xx} − θ)(S_{yy} − λθ) − (S_{xy} − μ)²)/(S_{yy} − λθ), (1.2.24)


where

(S_{xx} − θ)(S_{yy} − λθ) − (S_{xy} − μ)²

= (S_{ξξ} + 2S_{ξε} + S_{εε} − θ)(S_{ξξ}β² + 2S_{ξδ}β + S_{δδ} − λθ) − (S_{ξξ}β + S_{ξδ} + S_{ξε}β + S_{εδ} − μ)²

= S_{ξξ}(−2β(S_{εδ} − μ) + (S_{δδ} − λθ) + β²(S_{εε} − θ)) + B, (1.2.25)

with

B = (2S_{ξε} + S_{εε} − θ)(2S_{ξδ}β + S_{δδ} − λθ) − (S_{ξδ} + S_{ξε}β + S_{εδ} − μ)². (1.2.26)

From the SLLN, as n → ∞,

−2β(S_{εδ} − μ) + (S_{δδ} − λθ) + β²(S_{εε} − θ) =_{a.s.} o(1) and B =_{a.s.} o(1), (1.2.27)

that, combined with (1.2.19), via (1.2.24) and (1.2.25), yield the strong consistency of θ̂_{2n}.

For λθ̂_{3n} of (1.1.24), from (1.2.25), (1.2.27), the SLLN and (1.2.19) we easily conclude that, as n → ∞,

λθ̂_{3n} − λθ = ((S_{xx} − θ)(S_{yy} − λθ) − (S_{xy} − μ)²)/S_{xx}

= (S_{ξξ}(−2β(S_{εδ} − μ) + (S_{δδ} − λθ) + β²(S_{εε} − θ)) + B)/S_{xx} =_{a.s.} o(1). □


Proof of Theorem 1.1.1b. Assuming that μ = 0 in (1.1.3), consider the estimator θ̂_{1n} = (θ̂_{1n} − θ)_1 + θ of θ, where (θ̂_{1n} − θ)_t is as in (1.1.14). If Var ξ < ∞, then the strong consistency of θ̂_{1n} follows from the strong consistency of β̂_{1n} and (1.2.19). It is to be seen below that the proof for the case Var ξ = ∞ is not straightforward and is only available here for the weak consistency of θ̂_{1n}, under (B) and (D) additionally assumed. First, we prove the following useful observation that will also be used for further developments in Section 1.2. There exists a slowly varying function ℓ_ξ(n) such that

S_{ξξ}/ℓ_ξ²(n) →_P 1, n → ∞. (1.2.28)

Indeed, if Var ξ < ∞, then, by the WLLN that follows from (1.2.21), ℓ_ξ²(n) = Eξ² − cm² > 0, with c of (1.1.7). Otherwise, when Var ξ = ∞, on account of (D), Remark 1.2.1 and Lemma 1.2.2, there is a slowly varying function at infinity ℓ_ξ(n), such that ℓ_ξ(n) ↗ ∞ and

Σ_{i=1}^n (ξ_i − m)² / (n ℓ_ξ²(n)) →_P 1, n → ∞. (1.2.29)

Moreover, the function ℓ_ξ(n) in (1.2.28) can be chosen as the one in (1.2.29). Indeed, (1.2.29) and the WLLN imply that, as n → ∞,

S_{ξξ}/ℓ_ξ²(n) = (n^{−1} Σ_{i=1}^n (ξ_i − m)² + 2(ξ̄ − m)(m − cξ̄) + (m − cξ̄)²)/ℓ_ξ²(n) = 1 + o_P(1).

In case Var ξ = ∞, we rewrite θ̂_{1n} − θ as

θ̂_{1n} − θ = (n/(n − 2))((S_{yy} − λθ) − 2S_{xy}β + (S_{xx} − θ)β²)/(λ + β̂²_{1n})
+ (n/(n − 2))((S_{xx} − θ)(β̂²_{1n} − β²) − 2S_{xy}(β̂_{1n} − β))/(λ + β̂²_{1n})
+ 2θ/(n − 2) (1.2.30)

=: V₁ + V₂ + V₃, (1.2.31)


where V₁, V₂ and V₃ stand respectively for the first, second and third summands in (1.2.30). Clearly, as n → ∞,

V₁ = (n/(n − 2))((S_{δδ} − λθ) − 2S_{δε}β + (S_{εε} − θ)β²)/(λ + β̂²_{1n}) =_{a.s.} o(1) and V₃ = o(1).

Hence, to conclude the consistency of θ̂_{1n}, it suffices to show that, as n → ∞,

V₂ = o_P(1). (1.2.32)

The arguments for (1.2.32) are not immediate. We make use of (1.2.28) for having the exact rate of convergence in probability of S_{ξξ} to infinity (cf. also Remark 1.2.2) and also need information about the rate of consistency of β̂_{1n}. From (1.1.38) of Observation 1.1.1 in Section 1.1.4 (conditions (B) and (D) are additionally assumed) we have

β̂_{1n} − β = O_P(1)/ℓ_ξ²(n), n → ∞, (1.2.33)

where ℓ_ξ(n) is as in (1.2.28). Finally, (1.2.33), (1.2.28) and the WLLN yield, as n → ∞,

V₂ ((n − 2)/n)(λ + β̂²_{1n}) = (S_{xx} − θ)(β̂²_{1n} − β²) − 2S_{xy}(β̂_{1n} − β)

= S_{ξξ}(β̂_{1n} − β)² + (2S_{ξε} + S_{εε} − θ)(β̂²_{1n} − β²) − 2(S_{ξδ} + S_{ξε}β + S_{εδ})(β̂_{1n} − β)

= S_{ξξ}(β̂_{1n} − β)² + o_P(1) = O_P(1)/ℓ_ξ²(n) + o_P(1) = o_P(1), (1.2.34)

and, therefore, V₂ = o_P(1). □

Remark 1.2.4. For further asymptotic studies, when dealing with the proofs related to the invariance principles for (β̂_{1n} − β)_t of (1.1.11), it suffices to replace it with the process

(β̃_{1n} − β)_t = sign(β)((z_n − z)_t + (((z_n − z)_t + z)² + λ)^{1/2} − (z² + λ)^{1/2}), 0 ≤ t ≤ 1, (1.2.35)

where (z_n − z)_t and z are as in (1.1.12). For example, in view of Theorem 1.1.2a, we have, for each t ∈ [0, 1],

√n U(1, n)(β̂_{1n} − β)_t / (Σ_{i=1}^n (u_i(1, n) − ū(1, n))²/(n − 1))^{1/2}
= √n U(1, n)(β̃_{1n} − β)_t / (Σ_{i=1}^n (u_i(1, n) − ū(1, n))²/(n − 1))^{1/2} + ρ_{n,t}, (1.2.36)

with U(1, n) of (1.1.31), u_i(1, n) of (1.1.32) and (Σ_{i=1}^n (u_i(1, n) − ū(1, n))²/(n − 1))^{1/2} being well-defined WPA1, n → ∞ (cf. the forthcoming Remark 1.2.6 applied to M_{n,t} = √n ū(1, n)_t (Σ_{i=1}^n (u_i(1, n) − ū(1, n))²/(n − 1))^{−1/2}). Moreover, due to (1.2.22),

sup_{0≤t≤1} |ρ_{n,t}| = o_P(1), (1.2.37)

since for any ε > 0, as n → ∞,

P(sup_{0≤t≤1} |ρ_{n,t}| > ε) ≤ P(sign(S_{xy}) ≠ sign(β)) → 0. (1.2.38)

Similarly, in the proofs of Theorems 1.1.2b, 1.1.3b and Propositions 1.1.1, 1.1.2, we will study

(α̃_{1n} − α)_t = −x̄(β̃_{1n} − β)_t + (ȳ − x̄β − α)_t, 0 ≤ t ≤ 1, (1.2.39)

in place of (α̂_{1n} − α)_t of (1.1.13). This remark is also valid as regards the proofs of Theorems 1.1.4, 1.1.5 and Remarks 1.1.9, 1.1.12. Namely, β̂_{1n} and α̂_{1n} in the various couples and triples of estimators appearing therein can be replaced with their corresponding auxiliary estimators β̃_{1n} = (β̃_{1n} − β)_1 + β and α̃_{1n} = (α̃_{1n} − α)_1 + α as in (1.2.35) and (1.2.39).

Introduce the vectors

ζ = ((ξ − cm)δ, (ξ − cm)ε, δ, ε, δε − μ, δ² − λθ, ε² − θ) (1.2.40)

and

ζ_i = ((ξ_i − cm)δ_i, (ξ_i − cm)ε_i, δ_i, ε_i, δ_iε_i − μ, δ_i² − λθ, ε_i² − θ), i = 1, …, n, (1.2.41)

with c as in (1.1.7). Let b = (b^{(1)}, …, b^{(7)}) be a nonzero vector of constants. We are to study a version of the Student process (cf. the definition in (1.2.13)), namely

√n ⟨ζ̄, b⟩_t / (Σ_{i=1}^n ⟨ζ_i − ζ̄, b⟩²/(n − 1))^{1/2}, 0 ≤ t ≤ 1. (1.2.42)

The process in (1.2.42) is one of the two key auxiliary processes for the proofs of the univariate main results of Section 1.1.4 (cf. the introduction to Section 1.2.2). Invariance principles for (1.2.42) are summarized in our next lemma (for the notion of weak convergence on (D[0, 1], ρ), we refer to Remark 1.1.2).

Lemma 1.2.8. Let assumptions (B), (D) and (E) hold true. When b^{(1)} = b^{(2)} = 0, assume additionally that

Var⟨ζ, b⟩ > 0. (1.2.43)

Then, as n → ∞, the following statements are valid and equivalent:

(a) √n ⟨ζ̄, b⟩_{t₀} / (Σ_{i=1}^n ⟨ζ_i − ζ̄, b⟩²/(n − 1))^{1/2} →_D N(0, t₀), for t₀ ∈ (0, 1];

(b) √n ⟨ζ̄, b⟩_t / (Σ_{i=1}^n ⟨ζ_i − ζ̄, b⟩²/(n − 1))^{1/2} ⇒ W(t) on (D[0, 1], ρ);

(c) On an appropriate probability space for {(ξ, δ, ε), (ξ_i, δ_i, ε_i), i ≥ 1} we can construct a standard Wiener process {W(t), 0 ≤ t < ∞} such that

sup_{0≤t≤1} |√n ⟨ζ̄, b⟩_t / (Σ_{i=1}^n ⟨ζ_i − ζ̄, b⟩²/(n − 1))^{1/2} − W(nt)/√n| = o_P(1).

Proof. By Lemma 1.2.5, (a)-(c) are valid and equivalent whenever

⟨ζ, b⟩ ∈ DAN (1.2.44)

and E⟨ζ, b⟩ = 0. While from (B), (D) and (E) the latter is true, (1.2.44) needs to be shown.


If b^{(1)} = b^{(2)} = 0, then, since (1.2.43) holds true, we have 0 < Var⟨ζ, b⟩ < ∞ and thus (1.2.44) as well.

Suppose now that |b^{(1)}| + |b^{(2)}| > 0. Then, on account of Lemma 1.2.6 and the fact that ξ ∈ DAN if and only if ξ − cm ∈ DAN,

(ξ − cm)(b^{(1)}δ + b^{(2)}ε) ∈ DAN, (1.2.45)

where, due to the fact that the error covariance matrix Γ is positive definite,

Var(b^{(1)}δ + b^{(2)}ε) > 0. (1.2.46)

If Var(⟨ζ, b⟩ − (ξ − cm)(b^{(1)}δ + b^{(2)}ε)) = 0, then (1.2.45) implies (1.2.44). Otherwise, (1.2.44) is implied by part (a) of Lemma 1.2.7 applied to (ξ − cm)(b^{(1)}δ + b^{(2)}ε) and ⟨ζ, b⟩ − (ξ − cm)(b^{(1)}δ + b^{(2)}ε), that are now both from DAN. Two conditions of this lemma are left to be verified. First, from the finiteness of Eξ, the independence of ξ and (δ, ε), and the existence of the fourth error moments, it is seen that

E|(ξ − cm)(b^{(1)}δ + b^{(2)}ε)(⟨ζ, b⟩ − (ξ − cm)(b^{(1)}δ + b^{(2)}ε))|

= E|(ξ − cm)(b^{(1)}δ + b^{(2)}ε)(b^{(3)}δ + b^{(4)}ε + b^{(5)}(δε − μ) + b^{(6)}(δ² − λθ) + b^{(7)}(ε² − θ))| < ∞. (1.2.47)

If Var ξ = ∞, then the second assumption of part (a) of Lemma 1.2.7, i.e., that P(⟨ζ, b⟩ = const) ≠ 1, is automatically satisfied, since Var⟨ζ, b⟩ = ∞ under |b^{(1)}| + |b^{(2)}| > 0. Otherwise, the proof of the fact that Var⟨ζ, b⟩ > 0 is less obvious and is to be argued next.

It is left to be shown that Var⟨ζ, b⟩ > 0 under the assumptions Var ξ < ∞ and |b^{(1)}| + |b^{(2)}| > 0. If Var⟨ζ^{(3,7)}, b^{(3,7)}⟩ = 0, then, on account of (B), (D) and (E),

Var⟨ζ, b⟩ = Var⟨ζ^{(1,2)}, b^{(1,2)}⟩ = E(ξ − cm)²⟨Γb^{(1,2)}, b^{(1,2)}⟩ ≥ Var ξ λ_min(Γ)((b^{(1)})² + (b^{(2)})²) > 0.

Suppose now that Var⟨ζ^{(3,7)}, b^{(3,7)}⟩ > 0. It is not hard to see that there exist a full subvector ζ̃^{(3,7)} of ζ^{(3,7)} and a nonrandom vector b̃^{(3,7)} such that ⟨ζ^{(3,7)}, b^{(3,7)}⟩ =_{a.s.} ⟨ζ̃^{(3,7)}, b̃^{(3,7)}⟩.


Indeed, if ζ^{(3,7)} is full, then we put ζ̃^{(3,7)} = ζ^{(3,7)} and b̃^{(3,7)} = b^{(3,7)}. Otherwise, there is a vector of constants e, such that ‖e‖ = 1 and

⟨ζ^{(3,7)}, e⟩ =_{a.s.} 0,

and we construct the desired ζ̃^{(3,7)} and b̃^{(3,7)} as follows. First, we cut ζ^{(3,7)} by one of its components that has a nonzero coefficient in the latter equality and denote the reduced vector, with the original order of the remaining components kept, by ζ̃^{(3,7)}. Then, using the same equality, the cut component also gets eliminated from ⟨ζ^{(3,7)}, b^{(3,7)}⟩, which produces b̃^{(3,7)} with ⟨ζ^{(3,7)}, b^{(3,7)}⟩ =_{a.s.} ⟨ζ̃^{(3,7)}, b̃^{(3,7)}⟩. If ζ̃^{(3,7)} is full, the construction is over. Otherwise, replacing ⟨ζ^{(3,7)}, e⟩ =_{a.s.} 0 with ⟨ζ̃^{(3,7)}, ẽ⟩ =_{a.s.} 0, where ẽ is some nonrandom vector, ‖ẽ‖ = 1, and the expression ⟨ζ^{(3,7)}, b^{(3,7)}⟩ with ⟨ζ̃^{(3,7)}, b̃^{(3,7)}⟩, we keep on cutting ζ̃^{(3,7)} and modifying b̃^{(3,7)} until ζ̃^{(3,7)} is full, each time keeping the same notations for the newly obtained vectors. Clearly, on account of the positivity of the error covariance matrix (cf. (B)), ζ^{(3,4)} = (δ, ε) is full, and one can always produce ζ̃^{(3,7)} out of ζ^{(3,7)} by preserving these two components in ζ̃^{(3,7)}. Hence, the described construction of ζ̃^{(3,7)} and b̃^{(3,7)} has at most three steps, and the dimension of these vectors varies from 2 to 4. Having obtained ζ̃^{(3,7)} and b̃^{(3,7)}, we proceed as follows. It is well known that for a symmetric k × k matrix A, the eigenvalues λ_min(A) = λ^{(1)}(A) ≤ λ^{(2)}(A) ≤ ⋯ ≤ λ^{(k)}(A) are real and related as follows:

Σ_{i=1}^k λ^{(i)}(A) = tr(A) and Π_{i=1}^k λ^{(i)}(A) = det(A).

Hence, also due to ⟨ζ^{(3,7)}, b^{(3,7)}⟩ =_{a.s.} ⟨ζ̃^{(3,7)}, b̃^{(3,7)}⟩,

Var⟨ζ, b⟩ = Var(⟨ζ^{(1,2)}, b^{(1,2)}⟩ + ⟨ζ^{(3,7)}, b^{(3,7)}⟩) = Var⟨(ζ^{(1,2)}, ζ̃^{(3,7)}), (b^{(1,2)}, b̃^{(3,7)})⟩

≥ λ_min(Cov(ζ^{(1,2)}, ζ̃^{(3,7)}))((b^{(1)})² + (b^{(2)})²)

≥ det(Cov(ζ^{(1,2)}, ζ̃^{(3,7)}))(tr(Cov(ζ^{(1,2)}, ζ̃^{(3,7)})))^{−(k−1)}((b^{(1)})² + (b^{(2)})²),


where $k$ is the dimension of the vector $(\zeta^{(1,2)},\tilde\zeta^{(3,7)})$. Finally, combining the latter inequality, the fact that, by (B),

$$0<\mathrm{tr}\big(\mathrm{Cov}(\zeta^{(1,2)},\tilde\zeta^{(3,7)})\big)<\infty,$$

and the factorization

$$\det\big(\mathrm{Cov}(\zeta^{(1,2)},\tilde\zeta^{(3,7)})\big)=\big(E(\xi-cm)^2-(E(\xi-cm))^2\big)^2\det(\Gamma)\det\big(\mathrm{Cov}\,\tilde\zeta^{(3,7)}\big)=(\mathrm{Var}\,\xi)^2\det(\Gamma)\det\big(\mathrm{Cov}\,\tilde\zeta^{(3,7)}\big),$$

where, due to (B) and (D), $\det(\Gamma)>0$ and $\mathrm{Var}\,\xi>0$, and, on account of the fullness of the constructed $\tilde\zeta^{(3,7)}$, $\mathrm{Cov}\,\tilde\zeta^{(3,7)}>0$ and $\det(\mathrm{Cov}\,\tilde\zeta^{(3,7)})>0$, we conclude that $\mathrm{Var}\langle\zeta,b\rangle>0$. We note that the latter factorization (the first equality) is obtained with the help of the computer algebra system Maple, by verifying this identity for all possible $\mathrm{Cov}\,\tilde\zeta^{(3,7)}$ that correspond to vectors $\tilde\zeta^{(3,7)}$ whose first two components are $\delta$ and $\varepsilon$, and whose remaining components (at most three), if any, form a subvector of the vector $(\delta\varepsilon-\mu,\ \delta^2-\lambda\theta,\ \varepsilon^2-\theta)$, with the same order of components as in $\zeta$ of (1.2.40). In other words, this factorization is checked for all 8 possible matrices $\mathrm{Cov}(\zeta^{(1,2)},\tilde\zeta^{(3,7)})$ defined by varying their lower right block $\mathrm{Cov}\,\tilde\zeta^{(3,7)}$. For details of the Maple code, we refer to Section 1.2.5, Appendix. □
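The determinant computation behind such a verification ultimately rests on a block-determinant fact: once the cross-covariances between the two groups of components are accounted for, the determinant factors into determinants of diagonal blocks. A minimal numerical stand-in (generic positive definite blocks, not the thesis's actual covariance matrices, which live in the Maple code of Section 1.2.5):

```python
import numpy as np

# Block-determinant sanity check: for a block-diagonal covariance matrix
# M = diag(P, Q), det(M) = det(P) * det(Q).  P and Q below are arbitrary
# positive definite stand-ins for the two covariance blocks.
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 2)); P = X.T @ X      # 2 x 2 block
Y = rng.standard_normal((7, 3)); Q = Y.T @ Y      # 3 x 3 block
M = np.block([[P, np.zeros((2, 3))],
              [np.zeros((3, 2)), Q]])

assert np.isclose(np.linalg.det(M), np.linalg.det(P) * np.linalg.det(Q))
```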

Remark 1.2.5. Let (B), (D) and (E) be satisfied. If $\mathrm{Var}\,\xi<\infty$ and/or $b^{(1)}=b^{(2)}=0$, then the WLLN, (1.2.43) and the positivity of $\mathrm{Var}\langle\zeta,b\rangle$ under $\mathrm{Var}\,\xi<\infty$ and $|b^{(1)}|+|b^{(2)}|>0$, shown in the last paragraph of the proof of Lemma 1.2.8, result in, as $n\to\infty$,

$$\frac{1}{n-1}\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,b\rangle^2\ \stackrel{P}{\to}\ \mathrm{Var}\langle\zeta,b\rangle>0,\qquad(1.2.48)$$

and thus, from part (b) of Lemma 1.2.8 and the fact that $\sup_{0\le t\le1}|W(t)|=O_P(1)$,

$$\sup_{0\le t\le1}\big|\langle\bar\zeta,b\rangle_t\big|=\frac{O_P(1)}{\sqrt n}.$$


On the other hand, if $\mathrm{Var}\,\xi=\infty$ and $|b^{(1)}|+|b^{(2)}|>0$, then Remark 1.2.1, Lemma 1.2.2, Remark 1.2.3 and (1.2.46) imply, as $n\to\infty$,

$$\frac{\sum_{i=1}^{n}(\xi_i-cm)^2}{n\,\ell_\xi^2(n)}\,\mathrm{Var}\big(b^{(1)}\delta+b^{(2)}\varepsilon\big)\ \stackrel{P}{\to}\ \mathrm{Var}\big(b^{(1)}\delta+b^{(2)}\varepsilon\big)>0,$$

where the slowly varying function $\ell_\xi(n)\nearrow\infty$ satisfies (1.2.29) (or (1.2.28)). As $n\to\infty$, the latter convergence and the WLLN yield

$$\frac{\sum_{i=1}^{n}(\xi_i-cm)^2\big(b^{(1)}\delta_i+b^{(2)}\varepsilon_i\big)^2}{n\,\ell_\xi^2(n)}\ \stackrel{P}{\to}\ \mathrm{Var}\big(b^{(1)}\delta+b^{(2)}\varepsilon\big)>0,$$

and therefore,

$$\frac{\sum_{i=1}^{n}\Big((\xi_i-cm)\big(b^{(1)}\delta_i+b^{(2)}\varepsilon_i\big)-\overline{(\xi-cm)(b^{(1)}\delta+b^{(2)}\varepsilon)}\Big)^2}{n\,\ell_\xi^2(n)}=\frac{\sum_{i=1}^{n}(\xi_i-cm)^2\big(b^{(1)}\delta_i+b^{(2)}\varepsilon_i\big)^2-n\Big(\overline{(\xi-cm)(b^{(1)}\delta+b^{(2)}\varepsilon)}\Big)^2}{n\,\ell_\xi^2(n)}\ \stackrel{P}{\to}\ \mathrm{Var}\big(b^{(1)}\delta+b^{(2)}\varepsilon\big)>0.$$

Similarly, since $\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,b\rangle^2\big/\big(n\,\ell_\xi^2(n)\big)$ has the same limit as the latter ratio in case $\mathrm{Var}\,\xi=\infty$ and $|b^{(1)}|+|b^{(2)}|>0$, then, as $n\to\infty$,

$$\frac{1}{(n-1)\,\ell_\xi^2(n)}\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,b\rangle^2\ \stackrel{P}{\to}\ \mathrm{Var}\big(b^{(1)}\delta+b^{(2)}\varepsilon\big)>0,\qquad(1.2.49)$$

and hence,

$$\sup_{0\le t\le1}\big|\langle\bar\zeta,b\rangle_t\big|=\frac{\ell_\xi(n)}{\sqrt n}\,O_P(1).$$

The following result concludes that for i.i.d.r.v.'s $\{\xi,\xi_i,\ i\ge1\}$ having a finite mean, the condition $\xi\in\mathrm{DAN}$ is optimal for having (a)-(c) of Lemma 1.2.8 for the process in (1.2.42) with $|b^{(1)}|+|b^{(2)}|>0$, i.e., for the process in (1.2.42) that is truly based on $\{\xi_i,\ i\ge1\}$, as opposed to the case $b^{(1)}=b^{(2)}=0$, when (1.2.42) is error-based only. Lemma 1.2.9 is a special version of Lemma 1.2.5 of Section 1.2.1, where (a) exhausts the class of sequences $\{Z,Z_i,\ i\ge1\}$ of i.i.d.r.v.'s that possess the invariance principles in (b)-(d) of Lemma 1.2.5.

Lemma 1.2.9. Suppose that (B), (C) and (E) hold true and that $|b^{(1)}|+|b^{(2)}|>0$. Then, as $n\to\infty$, for the process in (1.2.42) the following statements are equivalent:

(a) $\xi\in\mathrm{DAN}$;

(b) $\sqrt n\,\langle\bar\zeta,b\rangle_{t_0}\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{-1/2}\ \stackrel{D}{\to}\ N(0,t_0)$, $t_0\in(0,1]$;

(c) $\sqrt n\,\langle\bar\zeta,b\rangle_{t}\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{-1/2}\ \stackrel{D}{\to}\ W(t)$ on $(D[0,1],\rho)$;

(d) On an appropriate probability space for $\{(\xi,\delta,\varepsilon),(\xi_i,\delta_i,\varepsilon_i),\ i\ge1\}$ we can construct a standard Wiener process $\{W(t),\ 0\le t<\infty\}$ such that

$$\sup_{0\le t\le1}\left|\frac{\sqrt n\,\langle\bar\zeta,b\rangle_t}{\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}}-\frac{W(nt)}{\sqrt n}\right|=o_P(1).$$

Proof. According to Lemma 1.2.5, (b), (c) and (d) above are equivalent to (1.2.44) and $E\langle\zeta,b\rangle=0$. The latter condition is satisfied on account of (B), (E) and the finiteness of $E\xi$. Thus, we need to show that (1.2.44) and (a) are equivalent or, due to the proof of Lemma 1.2.8, only that (1.2.44) implies (a). If $\mathrm{Var}\big(\langle\zeta,b\rangle-(\xi-cm)(b^{(1)}\delta+b^{(2)}\varepsilon)\big)=0$, then (1.2.44) immediately implies (1.2.45). Otherwise, (1.2.45) follows from part (b) of Lemma 1.2.7 applied to $U+V=\langle\zeta,b\rangle$ and $U=\langle\zeta,b\rangle-(\xi-cm)(b^{(1)}\delta+b^{(2)}\varepsilon)$. The assumptions of this lemma are easily seen to be satisfied. We have $E|UV|<\infty$ on account of (1.2.47). Clearly, $EU^2=E\big(b^{(3)}\delta+b^{(4)}\varepsilon+b^{(5)}(\delta\varepsilon-\mu)+b^{(6)}(\delta^2-\lambda\theta)+b^{(7)}(\varepsilon^2-\theta)\big)^2<\infty$ due to (B). Finally, we are to verify that for $V=(\xi-cm)(b^{(1)}\delta+b^{(2)}\varepsilon)$,

$$P(V=\mathrm{const})\ne1,$$


where, since $E\big((\xi-cm)(b^{(1)}\delta+b^{(2)}\varepsilon)\big)=0$, the only candidate for this constant is zero. So, we will show that

$$P\big((\xi-cm)(b^{(1)}\delta+b^{(2)}\varepsilon)=0\big)\ne1.$$

Let $A_1=\{\xi-cm=0\}$ and $A_2=\{b^{(1)}\delta+b^{(2)}\varepsilon=0\}$.

Since $\mathrm{Var}\,\xi=\mathrm{Var}(\xi-cm)>0$ and (1.2.46) is valid ($|b^{(1)}|+|b^{(2)}|>0$),

$$P(A_1)<1\quad\text{and}\quad P(A_2)<1.$$

Therefore, also taking into account (E), we get

$$P\big((\xi-cm)(b^{(1)}\delta+b^{(2)}\varepsilon)=0\big)=P(A_1\cup A_2)=P(A_1)+P(A_2)-P(A_1)P(A_2)<1,$$

on account of the independence of $A_1$ and $A_2$ (cf. (E)) and the bounds $P(A_1)<1$, $P(A_2)<1$, since $P(A_1)+P(A_2)-P(A_1)P(A_2)=1-(1-P(A_1))(1-P(A_2))<1$. Thus, we have checked the conditions of part (b) of Lemma 1.2.7 for obtaining (1.2.45). Now, from (1.2.45), (1.2.46) and the converse part of Lemma 1.2.6 we conclude that $(\xi-cm)\in\mathrm{DAN}$ and hence, also, (a). □
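The elementary probability bound used above can be checked directly: for independent events, $P(A_1\cup A_2)=1-(1-P(A_1))(1-P(A_2))$, which stays below 1 whenever both probabilities do. A quick numerical sweep (illustration only):

```python
# For independent A1, A2 with P(A1) < 1 and P(A2) < 1:
# P(A1 u A2) = p1 + p2 - p1*p2 = 1 - (1 - p1)(1 - p2) < 1.
grid = [i / 100.0 for i in range(100)]        # probabilities in [0, 1)
union = lambda p1, p2: p1 + p2 - p1 * p2

assert all(union(p1, p2) < 1 for p1 in grid for p2 in grid)
assert all(abs(union(p1, p2) - (1 - (1 - p1) * (1 - p2))) < 1e-12
           for p1 in grid for p2 in grid)
```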

Lemma 1.2.8 allows us to study the Studentized process $M_{n,t}$ (cf. the upcoming (1.2.52)), a prototype for the main terms in the expansions of our original processes of interest (WLSP's, etc.), and thus we establish Lemma 1.2.10. In view of Remark 1.1.2 of Section 1.1.4, the conclusion of (b) follows from (c), while (a) is a simple consequence of (b). Hence, the proof of Lemma 1.2.10 reduces to verifying (c).
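For orientation, the scalar object underlying all of these processes is the Student (self-normalized) statistic $T_n=\sqrt n\,\bar Z\big(\sum_{i=1}^n(Z_i-\bar Z)^2/(n-1)\big)^{-1/2}$. One reason Studentization is natural here is that $T_n$ is invariant under rescaling of the data, so no knowledge of the (possibly infinite) scale is needed; a small sketch (illustrative code, not the thesis's notation):

```python
import numpy as np

def student_statistic(z):
    """Self-normalized (Studentized) mean: sqrt(n) * zbar / S_n."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    s = np.sqrt(np.sum((z - z.mean()) ** 2) / (n - 1))
    return np.sqrt(n) * z.mean() / s

rng = np.random.default_rng(2)
z = rng.standard_normal(100) + 0.3
# Scale invariance: multiplying the sample by any c > 0 leaves T_n unchanged.
assert np.isclose(student_statistic(z), student_statistic(7.3 * z))
```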

Lemma 1.2.10. Let assumptions (B), (D) and (E) be satisfied, let $d=(d^{(1)},\dots,d^{(5)})$ be a nonzero vector of constants such that

$$d^{(1)}\beta+d^{(2)}=0\quad\text{and}\quad d^{(3)}\beta^2+d^{(4)}\beta+d^{(5)}=0,\qquad(1.2.50)$$

and let the vector $e$ be given by

$$e=\big(2\beta d^{(3)}+d^{(4)},\ \beta d^{(4)}+2d^{(5)},\ d^{(1)},\ d^{(2)},\ d^{(4)},\ d^{(3)},\ d^{(5)}\big).\qquad(1.2.51)$$


When $e^{(1)}=e^{(2)}=0$, assume additionally that (1.2.43) with $e$ in place of $b$ is satisfied. Then, as $n\to\infty$, for the process

$$M_{n,t}=\sqrt{n(n-1)}\;\frac{\big(d^{(1)}(\bar y-\alpha)+d^{(2)}\bar x+d^{(3)}(S_{yy}-\lambda\theta)+d^{(4)}(S_{xy}-\mu)+d^{(5)}(S_{xx}-\theta)\big)_t}{\Big(\sum_{i=1}^{n}\big(d^{(1)}(y_i-\bar y)+d^{(2)}(x_i-\bar x)+d^{(3)}(s_{i,yy}-S_{yy})+d^{(4)}(s_{i,xy}-S_{xy})+d^{(5)}(s_{i,xx}-S_{xx})\big)^2\Big)^{1/2}}\qquad(1.2.52)$$

the following statements are valid:

(a) $M_{n,t_0}\ \stackrel{D}{\to}\ N(0,t_0)$, for $t_0\in(0,1]$;

(b) $M_{n,t}\ \stackrel{D}{\to}\ W(t)$ on $(D[0,1],\rho)$;

(c) On an appropriate probability space for $\{(\xi,\delta,\varepsilon),(\xi_i,\delta_i,\varepsilon_i),\ i\ge1\}$ we can construct a standard Wiener process $\{W(t),\ 0\le t<\infty\}$ such that

$$\sup_{0\le t\le1}\Big|M_{n,t}-\frac{W(nt)}{\sqrt n}\Big|=o_P(1).$$

Proof. On account of (1.2.50),

$$d^{(1)}(y_i-\alpha)+d^{(2)}x_i+d^{(3)}(s_{i,yy}-\lambda\theta)+d^{(4)}(s_{i,xy}-\mu)+d^{(5)}(s_{i,xx}-\theta)$$
$$=d^{(1)}(\beta\xi_i+\delta_i)+d^{(2)}(\xi_i+\varepsilon_i)+d^{(3)}\big(s_{i,\xi\xi}\beta^2+2s_{i,\xi\delta}\beta+(s_{i,\delta\delta}-\lambda\theta)\big)$$
$$\quad+d^{(4)}\big(s_{i,\xi\xi}\beta+s_{i,\xi\delta}+s_{i,\xi\varepsilon}\beta+(s_{i,\delta\varepsilon}-\mu)\big)+d^{(5)}\big(s_{i,\xi\xi}+2s_{i,\xi\varepsilon}+(s_{i,\varepsilon\varepsilon}-\theta)\big)$$
$$=(2\beta d^{(3)}+d^{(4)})(\xi_i-c\bar\xi)(\delta_i-c\bar\delta)+(\beta d^{(4)}+2d^{(5)})(\xi_i-c\bar\xi)(\varepsilon_i-c\bar\varepsilon)$$
$$\quad+d^{(1)}\delta_i+d^{(2)}\varepsilon_i+d^{(4)}\big((\delta_i-c\bar\delta)(\varepsilon_i-c\bar\varepsilon)-\mu\big)$$
$$\quad+d^{(3)}\big((\delta_i-c\bar\delta)^2-\lambda\theta\big)+d^{(5)}\big((\varepsilon_i-c\bar\varepsilon)^2-\theta\big)$$
$$=\langle\zeta_i,e\rangle+cR_i(n),\qquad(1.2.53)$$

where the vectors $\zeta_i$ and $e$ are as in (1.2.41) and (1.2.51), $c$ is from (1.1.7), and the term $R_i(n)$ is

$$R_i(n)=e^{(1)}\big(-\bar\delta(\xi_i-m)+(m-\bar\xi)(\delta_i-\bar\delta)\big)+e^{(2)}\big(-\bar\varepsilon(\xi_i-m)+(m-\bar\xi)(\varepsilon_i-\bar\varepsilon)\big)$$
$$\quad+e^{(5)}\big(-\bar\varepsilon\delta_i-\bar\delta\varepsilon_i+\bar\delta\bar\varepsilon\big)+e^{(6)}\big(-2\bar\delta\delta_i+(\bar\delta)^2\big)+e^{(7)}\big(-2\bar\varepsilon\varepsilon_i+(\bar\varepsilon)^2\big).\qquad(1.2.54)$$

If the intercept $\alpha$ is known to be zero, i.e., $c=0$, then the processes in (1.2.42) and (1.2.52) coincide in view of (1.2.53), and therefore Lemma 1.2.10 amounts to Lemma 1.2.8. Suppose now that $\alpha$ is not known to be zero, i.e., $c=1$. It suffices to prove (c). First, we will show that, as $n\to\infty$,

$$\sup_{0\le t\le1}\big|\sqrt n\,\bar R(n)_t\big|=o_P(1),\quad\text{if }\mathrm{Var}\,\xi<\infty\text{ and/or }e^{(1)}=e^{(2)}=0,\qquad(1.2.55)$$

and

$$\sup_{0\le t\le1}\frac{\big|\sqrt n\,\bar R(n)_t\big|}{\ell_\xi(n)}=o_P(1),\quad\text{if }\mathrm{Var}\,\xi=\infty\text{ and }|e^{(1)}|+|e^{(2)}|>0.\qquad(1.2.56)$$

Typical summands in $\sqrt n\,\bar R(n)_t$ are handled as follows. From (D), part (c) of Lemma 1.2.5 and the CLT, we get

$$\sup_{0\le t\le1}\big|\sqrt n\,\big(\bar\delta(\xi-m)\big)_t\big|\le|\sqrt n\,\bar\delta|\sup_{0\le t\le1}\big|(\xi-m)_t\big|=O_P(1)\,o_P(1)=o_P(1)\qquad(1.2.57)$$

and

$$\sup_{0\le t\le1}\big|\sqrt n\,\bar\delta\,\bar\varepsilon_t\big|\le|\sqrt n\,\bar\delta|\sup_{0\le t\le1}|\bar\varepsilon_t|=O_P(1)\,o_P(1)=o_P(1).\qquad(1.2.58)$$

Consequently,

$$\sup_{0\le t\le1}\frac{\big|\sqrt n\,\bar R(n)_t\big|}{\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,e\rangle^2/(n-1)\big)^{1/2}}=o_P(1).\qquad(1.2.59)$$

Part (c) of Lemma 1.2.8 for $\sqrt n\,\langle\bar\zeta,e\rangle_t\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,e\rangle^2/(n-1)\big)^{-1/2}$ (under condition (1.2.43) with the vector $e$ in place of $b$), (1.2.53) and (1.2.59) result in (c) of Lemma 1.2.8 for

$$\frac{\sqrt n\,\big(d^{(1)}(\bar y-\alpha)_t+d^{(2)}\bar x_t+d^{(3)}(S_{yy,t}-\lambda\theta[nt]/n)+d^{(4)}(S_{xy,t}-\mu[nt]/n)+d^{(5)}(S_{xx,t}-\theta[nt]/n)\big)}{\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,e\rangle^2/(n-1)\big)^{1/2}}$$

in place of $\sqrt n\,\langle\bar\zeta,e\rangle_t\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,e\rangle^2/(n-1)\big)^{-1/2}$. Thus, to finish the proof of (c) of Lemma 1.2.10, it suffices to show that, as $n\to\infty$,

$$\frac{\sum_{i=1}^{n}\big(d^{(1)}(y_i-\bar y)+d^{(2)}(x_i-\bar x)+d^{(3)}(s_{i,yy}-S_{yy})+d^{(4)}(s_{i,xy}-S_{xy})+d^{(5)}(s_{i,xx}-S_{xx})\big)^2}{\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,e\rangle^2}\ \stackrel{P}{\to}\ 1.\qquad(1.2.60)$$

In order to show (1.2.60), using (1.2.41), one writes

$$\langle\zeta_i-\bar\zeta,e\rangle=e^{(1)}\big((\xi_i-m)\delta_i-\overline{(\xi-m)\delta}\big)+e^{(2)}\big((\xi_i-m)\varepsilon_i-\overline{(\xi-m)\varepsilon}\big)+e^{(3)}(\delta_i-\bar\delta)+e^{(4)}(\varepsilon_i-\bar\varepsilon)$$
$$\quad+e^{(5)}\big(\delta_i\varepsilon_i-\overline{\delta\varepsilon}\big)+e^{(6)}\big(\delta_i^2-\overline{\delta^2}\big)+e^{(7)}\big(\varepsilon_i^2-\overline{\varepsilon^2}\big),\qquad(1.2.61)$$

while from (1.2.53),

$$d^{(1)}(y_i-\bar y)+d^{(2)}(x_i-\bar x)+d^{(3)}(s_{i,yy}-S_{yy})+d^{(4)}(s_{i,xy}-S_{xy})+d^{(5)}(s_{i,xx}-S_{xx})$$
$$=d^{(1)}(\delta_i-\bar\delta)+d^{(2)}(\varepsilon_i-\bar\varepsilon)+d^{(3)}\big(2\beta(s_{i,\xi\delta}-S_{\xi\delta})+(s_{i,\delta\delta}-S_{\delta\delta})\big)$$
$$\quad+d^{(4)}\big(\beta(s_{i,\xi\varepsilon}-S_{\xi\varepsilon})+(s_{i,\xi\delta}-S_{\xi\delta})+(s_{i,\delta\varepsilon}-S_{\delta\varepsilon})\big)+d^{(5)}\big(2(s_{i,\xi\varepsilon}-S_{\xi\varepsilon})+(s_{i,\varepsilon\varepsilon}-S_{\varepsilon\varepsilon})\big)$$
$$=e^{(1)}\big((\xi_i-\bar\xi)(\delta_i-\bar\delta)-(\overline{\xi\delta}-\bar\xi\bar\delta)\big)+e^{(2)}\big((\xi_i-\bar\xi)(\varepsilon_i-\bar\varepsilon)-(\overline{\xi\varepsilon}-\bar\xi\bar\varepsilon)\big)$$
$$\quad+e^{(3)}(\delta_i-\bar\delta)+e^{(4)}(\varepsilon_i-\bar\varepsilon)+e^{(5)}\big((\delta_i-\bar\delta)(\varepsilon_i-\bar\varepsilon)-(\overline{\delta\varepsilon}-\bar\delta\bar\varepsilon)\big)$$
$$\quad+e^{(6)}\big((\delta_i-\bar\delta)^2-(\overline{\delta^2}-(\bar\delta)^2)\big)+e^{(7)}\big((\varepsilon_i-\bar\varepsilon)^2-(\overline{\varepsilon^2}-(\bar\varepsilon)^2)\big).\qquad(1.2.62)$$

Due to (1.2.48), (1.2.49) and the Cauchy-Schwarz inequality, both in case $\mathrm{Var}\,\xi<\infty$ and in case $\mathrm{Var}\,\xi=\infty$, it suffices to prove that

$$\frac1n\sum_{i=1}^{n}\Big(d^{(1)}(y_i-\bar y)+d^{(2)}(x_i-\bar x)+d^{(3)}(s_{i,yy}-S_{yy})+d^{(4)}(s_{i,xy}-S_{xy})+d^{(5)}(s_{i,xx}-S_{xx})-\langle\zeta_i-\bar\zeta,e\rangle\Big)^2=o_P(1).\qquad(1.2.63)$$

On using the Cauchy-Schwarz inequality again, the latter convergence follows from the following statements for the corresponding components of (1.2.61) and (1.2.62):

$$\frac1n\sum_{i=1}^{n}\Big(e^{(1)}\big((\xi_i-\bar\xi)(\delta_i-\bar\delta)-(\overline{\xi\delta}-\bar\xi\bar\delta)\big)-e^{(1)}\big((\xi_i-m)\delta_i-\overline{(\xi-m)\delta}\big)\Big)^2=o_P(1),\qquad(1.2.64)$$


$$\frac1n\sum_{i=1}^{n}\Big(e^{(2)}\big((\xi_i-\bar\xi)(\varepsilon_i-\bar\varepsilon)-(\overline{\xi\varepsilon}-\bar\xi\bar\varepsilon)\big)-e^{(2)}\big((\xi_i-m)\varepsilon_i-\overline{(\xi-m)\varepsilon}\big)\Big)^2=o_P(1),\qquad(1.2.65)$$

$$\frac1n\sum_{i=1}^{n}\Big(e^{(5)}\big((\delta_i-\bar\delta)(\varepsilon_i-\bar\varepsilon)-(\overline{\delta\varepsilon}-\bar\delta\bar\varepsilon)\big)-e^{(5)}\big(\delta_i\varepsilon_i-\overline{\delta\varepsilon}\big)\Big)^2=o_P(1),\qquad(1.2.66)$$

$$\frac1n\sum_{i=1}^{n}\Big(e^{(6)}\big((\delta_i-\bar\delta)^2-(\overline{\delta^2}-(\bar\delta)^2)\big)-e^{(6)}\big(\delta_i^2-\overline{\delta^2}\big)\Big)^2=o_P(1)\qquad(1.2.67)$$

and

$$\frac1n\sum_{i=1}^{n}\Big(e^{(7)}\big((\varepsilon_i-\bar\varepsilon)^2-(\overline{\varepsilon^2}-(\bar\varepsilon)^2)\big)-e^{(7)}\big(\varepsilon_i^2-\overline{\varepsilon^2}\big)\Big)^2=o_P(1).\qquad(1.2.68)$$

We conclude (1.2.64) via applying the same Cauchy-Schwarz argument, the WLLN, the CLT and the Marcinkiewicz law of large numbers for $(\xi_i-m)^2$, where $E\big((\xi-m)^2\big)^{1/2}=E|\xi-m|<\infty$, as follows:

$$\frac1n\sum_{i=1}^{n}\big((\xi_i-\bar\xi)(\delta_i-\bar\delta)-(\xi_i-m)\delta_i\big)^2=\frac1n\sum_{i=1}^{n}\big(-\bar\delta(\xi_i-m)+(m-\bar\xi)(\delta_i-\bar\delta)\big)^2$$
$$\le\frac2n\sum_{i=1}^{n}(\xi_i-m)^2(\bar\delta)^2+\frac2n\sum_{i=1}^{n}(\delta_i-\bar\delta)^2(m-\bar\xi)^2=O_P(1)\,\frac1{n^2}\sum_{i=1}^{n}(\xi_i-m)^2+o_P(1)=o_P(1)$$

and

$$\frac1n\sum_{i=1}^{n}\big((\overline{\xi\delta}-\bar\xi\bar\delta)-\overline{(\xi-m)\delta}\big)^2=\big((\overline{\xi\delta}-\bar\xi\bar\delta)-\overline{(\xi-m)\delta}\big)^2=o_P(1).$$

In the same manner (1.2.65) is obtained. Now, by noting that (1.2.66)-(1.2.68) are all handled similarly, we only show (1.2.66). The proof of (1.2.66) easily results from the WLLN as follows:

$$\frac1n\sum_{i=1}^{n}\big((\delta_i-\bar\delta)(\varepsilon_i-\bar\varepsilon)-\delta_i\varepsilon_i\big)^2=\frac1n\sum_{i=1}^{n}\big(-\delta_i\bar\varepsilon-\bar\delta\varepsilon_i+\bar\delta\bar\varepsilon\big)^2\le\frac3n\sum_{i=1}^{n}\big(\delta_i^2(\bar\varepsilon)^2+\varepsilon_i^2(\bar\delta)^2+(\bar\delta\bar\varepsilon)^2\big)=o_P(1),$$

and, similarly, $\big((\overline{\delta\varepsilon}-\bar\delta\bar\varepsilon)-\overline{\delta\varepsilon}\big)^2=(\bar\delta\bar\varepsilon)^2=o_P(1)$.
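The algebraic identity driving the last displays can be verified directly: centering both factors changes each cross product $\delta_i\varepsilon_i$ by $-\delta_i\bar\varepsilon-\bar\delta\varepsilon_i+\bar\delta\bar\varepsilon$, a term built only from sample means. A quick numerical check (illustrative data):

```python
import numpy as np

# Identity: (d_i - dbar)(e_i - ebar) - d_i*e_i = -d_i*ebar - dbar*e_i + dbar*ebar
rng = np.random.default_rng(3)
d = rng.standard_normal(50)
e = rng.standard_normal(50)

lhs = (d - d.mean()) * (e - e.mean()) - d * e
rhs = -d * e.mean() - d.mean() * e + d.mean() * e.mean()
assert np.allclose(lhs, rhs)
```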


This completes the proof of (1.2.60), and hence also that of (c) of Lemma 1.2.10. □

Remark 1.2.6. Let all the assumptions of Lemma 1.2.10 be satisfied. It is easily seen from (1.2.48) and (1.2.49) of Remark 1.2.5, and from (1.2.60), that, as $n\to\infty$,

$$\frac{\sum_{i=1}^{n}\big(d^{(1)}(y_i-\bar y)+d^{(2)}(x_i-\bar x)+d^{(3)}(s_{i,yy}-S_{yy})+d^{(4)}(s_{i,xy}-S_{xy})+d^{(5)}(s_{i,xx}-S_{xx})\big)^2}{n-1}\ \stackrel{P}{\to}\ \text{positive constant},$$

if $\mathrm{Var}\,\xi<\infty$ and/or $e^{(1)}=e^{(2)}=0$, while

$$\frac{\sum_{i=1}^{n}\big(d^{(1)}(y_i-\bar y)+d^{(2)}(x_i-\bar x)+d^{(3)}(s_{i,yy}-S_{yy})+d^{(4)}(s_{i,xy}-S_{xy})+d^{(5)}(s_{i,xx}-S_{xx})\big)^2}{(n-1)\,\ell_\xi^2(n)}\ \stackrel{P}{\to}\ \text{positive constant},\qquad(1.2.69)$$

if $\mathrm{Var}\,\xi=\infty$ and $|e^{(1)}|+|e^{(2)}|>0$, where the vector $d$ is as in (1.2.50) and relates to the vector $e$ via (1.2.51), while the slowly varying function $\ell_\xi(n)\nearrow\infty$ is from (1.2.28). Hence, from (1.2.69) and part (b) of Lemma 1.2.10,

$$\sup_{0\le t\le1}\big|d^{(1)}(\bar y-\alpha)_t+d^{(2)}\bar x_t+d^{(3)}(S_{yy,t}-[nt]\lambda\theta/n)+d^{(4)}(S_{xy,t}-[nt]\mu/n)+d^{(5)}(S_{xx,t}-[nt]\theta/n)\big|$$
$$=\begin{cases}\dfrac{\ell_\xi(n)}{\sqrt n}\,O_P(1),&\text{if }|e^{(1)}|+|e^{(2)}|>0,\\[2mm]\dfrac{1}{\sqrt n}\,O_P(1),&\text{if }e^{(1)}=e^{(2)}=0,\end{cases}\qquad(1.2.70)$$

with $\ell_\xi(n)$ of (1.2.28). We also note in passing that (1.2.69) well defines the denominator of the $M_{n,t}$ process in (1.2.52), WPA1 (with probability approaching one), as $n\to\infty$.

Remark 1.2.7. Using the nonzero vector of constants $d$ in (1.2.50) and the random vector

$$\eta_i(n)=\big(y_i-\alpha,\ x_i,\ s_{i,yy}-\lambda\theta,\ s_{i,xy}-\mu,\ s_{i,xx}-\theta\big),\qquad(1.2.71)$$

the process $M_{n,t}$ in (1.2.52) can be rewritten as

$$M_{n,t}=\frac{\sqrt n\,\langle\bar\eta(n),d\rangle_t}{\big(\sum_{i=1}^{n}\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\big)^{1/2}}.$$

Thus, $M_{n,t}$ is a version of the Student process (cf. (1.2.13)) in a somewhat loose sense, for the special triangular sequence $\{\langle\eta_i(n),d\rangle,\ 1\le i\le n,\ n\ge1\}$ of dependent r.v.'s. Therefore, Lemma 1.2.10 can be viewed as a nontrivial extension of the invariance principles based on Studentization that are summarized in (b)-(d) of Lemma 1.2.5. Moreover, in the context of model (1.1.1)-(1.1.2), Lemma 1.2.10 can be applied to all the estimators that are linear combinations of $(\bar y,\bar x,S_{yy},S_{xy},S_{xx})$ and their corresponding processes, and thus, also, to various reasonable estimators based on the latter vector and their corresponding processes.
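A scalar-sequence analogue of such a Student process (i.i.d. data rather than the triangular array above; illustrative code only) evaluates the partial sums of the observations, normalized by the full-sample centered sum of squares, and at $t=1$ reduces to the classical Studentized mean:

```python
import numpy as np

def student_process(z, t):
    """Student-process analogue for i.i.d. data: partial sum up to [nt],
    normalized by the full-sample centered sum of squares."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    k = int(n * t)
    s2 = np.sum((z - z.mean()) ** 2) / (n - 1)
    return z[:k].sum() / np.sqrt(n * s2)

rng = np.random.default_rng(4)
z = rng.standard_normal(200)
n = len(z)
s2 = np.sum((z - z.mean()) ** 2) / (n - 1)
# At t = 1 the process value is the usual Studentized mean sqrt(n)*zbar/s.
assert np.isclose(student_process(z, 1.0), np.sqrt(n) * z.mean() / np.sqrt(s2))
```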

Remark 1.2.8. Suppose that $|e^{(1)}|+|e^{(2)}|>0$ in Lemma 1.2.10, i.e., as seen from (1.2.53), the process $M_{n,t}$ of (1.2.52) truly depends on $\{\xi_i,\ i\ge1\}$ (when $e^{(1)}=e^{(2)}=0$, $M_{n,t}$ is error-based only). Then condition (D) is practically optimal for the invariance principles in (a)-(c) of Lemma 1.2.10. Indeed, the process $\sqrt n\,\langle\bar\zeta,e\rangle_t\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,e\rangle^2/(n-1)\big)^{-1/2}$ is the main term for $M_{n,t}$ in (1.2.52) (cf. the proof of Lemma 1.2.10 for details). In the sense of Lemma 1.2.9 for this main term process ($|e^{(1)}|+|e^{(2)}|>0$), when $\alpha\ne0$, we say that (D) is nearly optimal for the asymptotics in Lemma 1.2.10.

Proof of Theorem 1.1.2a. By Remark 1.1.2 of Section 1.1.4, it is enough to prove (iii). In view of (1.2.36) and (1.2.37) of Remark 1.2.4, to prove (iii) of Theorem 1.1.2a for $\sqrt n\,U(1,n)(\hat\beta_{1n}-\beta)_t\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$, it suffices to establish (iii) for $\sqrt n\,U(1,n)(\beta_{1n}-\beta)_t\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$, with $(\beta_{1n}-\beta)_t$ of (1.2.35). First, we argue that $\sqrt n\,\bar u(1,n)_t\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$ is a special case of $M_{n,t}$ in (1.2.52) and obeys (c) of Lemma 1.2.10. Indeed, since $\mu$ in (1.1.3) is assumed to be zero,

$$u_i(1,n)=-2\beta^2(\lambda+\beta^2)^{-1}\big(\lambda s_{i,xx}-s_{i,yy}-\beta^{-1}(\lambda-\beta^2)s_{i,xy}\big)=-2\beta^2(\lambda+\beta^2)^{-1}\big(\lambda(s_{i,xx}-\theta)-(s_{i,yy}-\lambda\theta)-\beta^{-1}(\lambda-\beta^2)(s_{i,xy}-\mu)\big),$$

the corresponding vector $d$ is $d=-2\beta^2(\lambda+\beta^2)^{-1}\big(0,\,0,\,-1,\,-\beta^{-1}(\lambda-\beta^2),\,\lambda\big)$, and condition (1.2.50) with such $d$ is satisfied, since $d^{(1)}=d^{(2)}=0$ and

$$-(2\beta^2)^{-1}(\lambda+\beta^2)\big(d^{(3)}\beta^2+d^{(4)}\beta+d^{(5)}\big)=-\beta^2-\beta^{-1}(\lambda-\beta^2)\beta+\lambda=0.$$

Moreover, for the vector $e$ of (1.2.51) that corresponds to $d$, $|e^{(1)}|+|e^{(2)}|=|2\beta d^{(3)}+d^{(4)}|+|\beta d^{(4)}+2d^{(5)}|>0$. From (1.2.19), (1.2.28) and (1.2.70) of Remark 1.2.6 (case $|e^{(1)}|+|e^{(2)}|>0$), for $(\hat z_n-z)_t$ of (1.1.12),

$$\sup_{0\le t\le1}\big|(\hat z_n-z)_t\big|=\frac{\lambda+\beta^2}{4\beta^2S_{xy}}\sup_{0\le t\le1}\big|\bar u(1,n)_t\big|=\frac{1}{\ell_\xi(n)\sqrt n}\,O_P(1)=o_P(1),\qquad(1.2.72)$$

with $\ell_\xi(n)$ of (1.2.28). Consequently, as $n\to\infty$, for the process in (1.2.35), for each fixed $t\in[0,1]$, via applying the Taylor expansion in $(\hat z_n-z)_t$ around zero,

$$(\beta_{1n}-\beta)_t=\frac{2\beta^2}{\lambda+\beta^2}\,(\hat z_n-z)_t+\frac{\mathrm{sign}(\beta)\,\lambda\,(\hat z_n-z)_t^2}{2\big((v(\hat z_n-z)_t+z)^2+\lambda\big)^{3/2}},\qquad(1.2.73)$$

with some random $v=v(n)\in(0,1)$. The inequality $\big((v(\hat z_n-z)_t+z)^2+\lambda\big)^{3/2}\ge\lambda^{3/2}$, which holds uniformly in $t\in[0,1]$, and (1.2.73) yield

$$\frac{\sqrt n\,U(1,n)(\beta_{1n}-\beta)_t}{\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{1/2}}=\frac{\sqrt n\,\dfrac{2\beta^2}{\lambda+\beta^2}\,U(1,n)(\hat z_n-z)_t\big(1+O(1)(\hat z_n-z)_t\big)}{\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{1/2}}$$
$$=\frac{\sqrt n\,\bar u(1,n)_t\big(1+O(1)(\hat z_n-z)_t\big)}{\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{1/2}}=:\frac{\sqrt n\,\bar u(1,n)_t+\rho^{(1)}_{n,t}}{\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{1/2}}.\qquad(1.2.74)$$


On account of (1.2.72) and (1.2.70) of Remark 1.2.6 for $\bar u(1,n)_t$,

$$\sup_{0\le t\le1}\big|\rho^{(1)}_{n,t}\big|\le O(1)\sup_{0\le t\le1}\big|\sqrt n\,\bar u(1,n)_t\big|\sup_{0\le t\le1}\big|(\hat z_n-z)_t\big|=O(1)\,\ell_\xi(n)O_P(1)\,\frac{O_P(1)}{\ell_\xi(n)\sqrt n}=o_P(1),\qquad(1.2.75)$$

which, combined with (c) of Lemma 1.2.10 for $\sqrt n\,\bar u(1,n)_t\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$ and (1.2.69) for $\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)$, yields (iii) of Theorem 1.1.2a for the initial left-hand side process in (1.2.74). The proof for the MLSP's of $\beta$ follows directly from Lemma 1.2.10, since

$$\frac{\sqrt n\,U(j,n)(\beta_{jn}-\beta)_t}{\big(\sum_{i=1}^{n}(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{1/2}}=\frac{\sqrt n\,\bar u(j,n)_t}{\big(\sum_{i=1}^{n}(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{1/2}},\qquad(1.2.76)$$

$j=2$ and $3$, i.e., since the so normalized $(\beta_{jn}-\beta)_t$ of (1.1.18) and (1.1.20) are versions of $M_{n,t}$ as in Lemma 1.2.10, with the respective vectors $d$ as in (1.2.50) and $e$ of (1.2.51) such that $|e^{(1)}|+|e^{(2)}|>0$. □

Proof of Theorem 1.1.2b. It suffices to prove (iii), since (iii) implies both (ii) and (i) (cf. Remark 1.1.2 of Section 1.1.4). Via Remark 1.2.4, the proof of (iii) for the WLSP of $\alpha$ reduces to the one for the process in (1.2.39). Suppose first that $\mathrm{Var}\,\xi=M<\infty$ and consider the process in (1.2.39). Using (1.2.73), we have

$$\sqrt n\,(\alpha_{1n}-\alpha)_t=\sqrt n\,\big((\bar y-\bar x\beta-\alpha)_t-\bar x(\beta_{1n}-\beta)_t\big)=:\sqrt n\,\bar v'(1,n)_t+\rho^{(2)}_{n,t},\qquad(1.2.77)$$


where $v'_i(1,n)$ is as in (1.1.37) and, on account of the WLLN, (1.2.19), (1.2.28) and (1.2.72),

$$\sup_{0\le t\le1}\big|\rho^{(2)}_{n,t}\big|\le O_P(1)\sup_{0\le t\le1}\big|(\hat z_n-z)_t\big|+\sqrt n\,|\bar x|\,O(1)\sup_{0\le t\le1}(\hat z_n-z)_t^2=\ell_\xi^2(n)\,o_P(1)=o_P(1),\qquad(1.2.78)$$

where $\ell_\xi(n)$ of (1.2.28) is such that $\ell_\xi(n)=\mathrm{const}>0$ when $\mathrm{Var}\,\xi<\infty$. Combining (c) of Lemma 1.2.10 for $\sqrt n\,\bar v'(1,n)_t\big(\sum_{i=1}^{n}(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{-1/2}$ (the respective vector $d$ satisfies (1.2.50), and $e$ of (1.2.51) is such that $|e^{(1)}|+|e^{(2)}|>0$ if $m\ne0$, while if $m=0$, then $e^{(1)}=e^{(2)}=0$ with (1.2.43), i.e., with the inequality $\mathrm{Var}(\delta-\beta\varepsilon)>0$, satisfied on account of (B)), (1.2.77), (1.2.78) and (1.2.69) applied to $\sum_{i=1}^{n}(v'_i(1,n)-\bar v'(1,n))^2/(n-1)$, one concludes that (c) of this lemma is also valid for the process

$$\frac{\sqrt n\,(\alpha_{1n}-\alpha)_t}{\big(\sum_{i=1}^{n}(v_i(1,n)-\bar v(1,n))^2/(n-1)\big)^{1/2}}=\frac{\sqrt n\,\bar v'(1,n)_t+\rho^{(2)}_{n,t}}{\big(\sum_{i=1}^{n}(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{1/2}}.$$

Thus, to complete the proof of (iii) of Theorem 1.1.2b for the WLSP of $\alpha$ when $\mathrm{Var}\,\xi<\infty$, the following convergence has to be shown:

$$\frac{\sum_{i=1}^{n}\big(v_i(1,n)-\bar v(1,n)\big)^2}{\sum_{i=1}^{n}\big(v'_i(1,n)-\bar v'(1,n)\big)^2}\ \stackrel{P}{\to}\ 1,\qquad n\to\infty.\qquad(1.2.79)$$

On observing that the WLLN, (1.2.19) and (1.2.69) of Remark 1.2.6 for $\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)$ imply, as $n\to\infty$,

$$\Big(\frac{\bar x}{2S_{xy}}\Big)^2\frac{1}{n-1}\sum_{i=1}^{n}\big(u_i(1,n)-\bar u(1,n)\big)^2\ \stackrel{P}{\to}\ 0,$$

the proof of (1.2.79) follows from the Cauchy-Schwarz inequality and (1.2.69) for $\sum_{i=1}^{n}(v'_i(1,n)-\bar v'(1,n))^2/(n-1)$.


Assume now that $\mathrm{Var}\,\xi=\infty$. We are to prove that (c) of Lemma 1.2.10 continues to be valid for

$$\frac{\sqrt n\,(\alpha_{1n}-\alpha)_t}{\big(\sum_{i=1}^{n}(v_i(1,n)-\bar v(1,n))^2/(n-1)\big)^{1/2}}=\frac{\sqrt n\,(\bar y-\bar x\beta-\alpha)_t-\sqrt n\,\bar x(\beta_{1n}-\beta)_t}{\big(\sum_{i=1}^{n}(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{1/2}}\left(\frac{\sum_{i=1}^{n}(v'_i(1,n)-\bar v'(1,n))^2}{\sum_{i=1}^{n}(v_i(1,n)-\bar v(1,n))^2}\right)^{1/2}$$
$$=\frac{\sqrt n\,\bar v'(1,n)_t+\rho^{(3)}_{n,t}}{\big(\sum_{i=1}^{n}(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{1/2}}\left(\frac{\sum_{i=1}^{n}(v'_i(1,n)-\bar v'(1,n))^2}{\sum_{i=1}^{n}(v_i(1,n)-\bar v(1,n))^2}\right)^{1/2},\qquad(1.2.80)$$

where $v'_i(1,n)$ is as in (1.1.37). As $n\to\infty$, by (1.2.72), (1.2.73) and the WLLN,

$$\sup_{0\le t\le1}\big|\rho^{(3)}_{n,t}\big|=\sup_{0\le t\le1}\big|\sqrt n\,\bar x(\beta_{1n}-\beta)_t\big|=o_P(1),\qquad(1.2.81)$$

and

$$\frac{1}{n-1}\sum_{i=1}^{n}\big(v'_i(1,n)-\bar v'(1,n)\big)^2=\frac{1}{n-1}\sum_{i=1}^{n}\big((y_i-\bar y)-\beta(x_i-\bar x)\big)^2=\frac{1}{n-1}\sum_{i=1}^{n}\big((\delta_i-\bar\delta)-\beta(\varepsilon_i-\bar\varepsilon)\big)^2\ \stackrel{P}{\to}\ \mathrm{Var}(\delta-\beta\varepsilon)>0.\qquad(1.2.82)$$

Clearly, (1.2.19), (1.2.28), the WLLN for $\bar x$ and (1.2.69) for $\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)$ imply

$$\Big(\frac{\bar x}{2S_{xy}}\Big)^2\frac{1}{n-1}\sum_{i=1}^{n}\big(u_i(1,n)-\bar u(1,n)\big)^2=\frac{O_P(1)}{\ell_\xi^4(n)}\,\ell_\xi^2(n)=o_P(1),$$

and hence, via the Cauchy-Schwarz inequality and (1.2.82),

$$\frac{\sum_{i=1}^{n}\big(v_i(1,n)-\bar v(1,n)\big)^2}{\sum_{i=1}^{n}\big(v'_i(1,n)-\bar v'(1,n)\big)^2}\ \stackrel{P}{\to}\ 1,\qquad n\to\infty.\qquad(1.2.83)$$


Finally, combining (1.2.80), (1.2.81), (1.2.83) and (c) of Lemma 1.2.10 for $\sqrt n\,\bar v'(1,n)_t\big(\sum_{i=1}^{n}(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{-1/2}$ (the respective vector $d$ is as in (1.2.50), $e$ of (1.2.51) is such that $e^{(1)}=e^{(2)}=0$, and condition (1.2.43), i.e., the inequality $\mathrm{Var}(\delta-\beta\varepsilon)>0$, is satisfied on account of (B)), one concludes the proof of (iii) for the process on the initial left-hand side of (1.2.80). The proof for the WLSP of $\alpha$ is now complete. The proof for the MLSP's of $\alpha$ follows the same pattern. □

Proof of Theorem 1.1.2c. Only part (iii) of Theorem 1.1.2c has to be proved (cf. Remark 1.1.2 of Section 1.1.4). Assuming that $\mu=0$ in (1.1.3), for the process in (1.1.14) we have

$$\sqrt n\,L(1,n)(\theta_{1n}-\theta)_t=\sqrt n\,(\lambda+\beta_{1n}^2)(\theta_{1n}-\theta)_t=\sqrt n\,\big((S_{yy,t}-\lambda\theta[nt]/n)-2S_{xy,t}\beta_{1n}+(S_{xx,t}-\theta[nt]/n)\beta_{1n}^2\big)=:\sqrt n\,\bar w(1,n)_t+\rho^{(4)}_{n,t},\qquad(1.2.84)$$

with $w_i(1,n)$ of (1.1.35). It is to be shown that, as $n\to\infty$,

$$\sup_{0\le t\le1}\big|\rho^{(4)}_{n,t}\big|=\sup_{0\le t\le1}\big|\sqrt n\,\big((S_{xx,t}-\theta[nt]/n)(\beta_{1n}^2-\beta^2)-2S_{xy,t}(\beta_{1n}-\beta)\big)\big|=o_P(1).\qquad(1.2.85)$$

Clearly, for (1.2.85) it suffices to verify that

$$\sqrt n\sup_{0\le t\le1}\big|(S_{xx,t}-\theta[nt]/n)(\beta_{1n}^2-\beta^2)-2S_{xy,t}(\beta_{1n}-\beta)\big|=\sqrt n\,\frac{O_P(1)}{n}=o_P(1).\qquad(1.2.86)$$

On using Remark 1.2.5, (1.2.28) and part (c) of Lemma 1.2.5 applied to $\xi_i$,

$$\sup_{0\le t\le1}\big|S_{\xi\varepsilon,t}\big|=\sup_{0\le t\le1}\big|\big((\xi-c\bar\xi)(\varepsilon-c\bar\varepsilon)\big)_t\big|\le\sup_{0\le t\le1}\big|\big((\xi-m)\varepsilon\big)_t\big|+\sup_{0\le t\le1}\big|m\,\bar\varepsilon_t\big|\ \ (\text{if }c=0)\ =\ \frac{\ell_\xi(n)}{\sqrt n}\,O_P(1),\qquad(1.2.87)$$

where $\ell_\xi(n)$ is as in (1.2.28) or (1.2.29). Similarly,

$$\sup_{0\le t\le1}\big|S_{\xi\delta,t}\big|=\frac{\ell_\xi(n)}{\sqrt n}\,O_P(1),\quad\sup_{0\le t\le1}\big|S_{\varepsilon\varepsilon,t}-\theta[nt]/n\big|=\frac{O_P(1)}{\sqrt n},\quad\sup_{0\le t\le1}\big|S_{\delta\varepsilon,t}-\mu[nt]/n\big|=\frac{O_P(1)}{\sqrt n},$$
$$\text{and}\quad\sup_{0\le t\le1}\big|S_{\delta\delta,t}-\lambda\theta[nt]/n\big|=\frac{O_P(1)}{\sqrt n}.\qquad(1.2.88)$$

Employing (1.2.87), (1.2.88), (1.2.33), (1.2.28) and the lines in (1.2.34), we get

$$\sqrt n\sup_{0\le t\le1}\big|(S_{xx,t}-\theta[nt]/n)(\beta_{1n}^2-\beta^2)-2S_{xy,t}(\beta_{1n}-\beta)\big|$$
$$=\sqrt n\sup_{0\le t\le1}\big|\big(S_{\xi\xi,t}+2S_{\xi\varepsilon,t}+(S_{\varepsilon\varepsilon,t}-\theta[nt]/n)\big)(\beta_{1n}-\beta)(\beta_{1n}+\beta)-2\big(\beta S_{\xi\xi,t}+S_{\xi\delta,t}+\beta S_{\xi\varepsilon,t}+S_{\delta\varepsilon,t}\big)(\beta_{1n}-\beta)\big|$$
$$\le\sqrt n\,O_P(1)\Big(\ell_\xi^2(n)\frac{1}{n\,\ell_\xi^2(n)}+\frac{\ell_\xi(n)}{\sqrt n}\frac{1}{\sqrt n\,\ell_\xi(n)}+\frac{1}{\sqrt n}\frac{1}{\sqrt n\,\ell_\xi(n)}\Big)=\sqrt n\,\frac{O_P(1)}{n}=o_P(1).$$

The latter proves (1.2.86) and hence, also, (1.2.85). Next, we argue that $\sqrt n\,\bar w(1,n)_t\big(\sum_{i=1}^{n}(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2}$ satisfies Lemma 1.2.10. Since for the vector $e$ of (1.2.51) corresponding to the respective $d$ of (1.2.50) we have $e^{(1)}=e^{(2)}=0$, condition (1.2.43) has to be verified. Suppose that (1.2.43) fails, i.e., that $\mathrm{Var}\big(\delta^2-\lambda\theta-2\beta\delta\varepsilon+\beta^2(\varepsilon^2-\theta)\big)=0$. Then $(\delta-\beta\varepsilon)^2\stackrel{a.s.}{=}\lambda\theta+\beta^2\theta$, so that $\delta-\beta\varepsilon\stackrel{a.s.}{=}\pm\sqrt{\lambda\theta+\beta^2\theta}$, and, since $E(\delta-\beta\varepsilon)=0$, $\lambda\theta+\beta^2\theta=0$. The latter equality contradicts the positivity of $\Gamma$ of (1.1.3) (cf. (B)) and proves (1.2.43). Combining part (c) of Lemma 1.2.10 for $\sqrt n\,\bar w(1,n)_t\big(\sum_{i=1}^{n}(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2}$, (1.2.69) implying that $(n-1)^{-1}\sum_{i=1}^{n}(w_i(1,n)-\bar w(1,n))^2\stackrel{P}{\to}\text{positive constant}$, $n\to\infty$, (1.2.84) and (1.2.85), we get (iii) of Theorem 1.1.2c for the process $\sqrt n\,L(1,n)(\theta_{1n}-\theta)_{[nt]}\big(\sum_{i=1}^{n}(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2}$. Now, we are to establish (iii) of Theorem 1.1.2c for $\sqrt n\,L(2,n)(\theta_{2n}-\theta)_t\big(\sum_{i=1}^{n}(w_i(2,n)-\bar w(2,n))^2/(n-1)\big)^{-1/2}$. On account of Observation 1.1.1 in Section 1.1.4,

$$\beta_{2n}=\beta+\frac{O_P(1)}{\sqrt n\,\ell_\xi(n)}\qquad\text{and}\qquad\beta_{2n}^2=\beta^2+\frac{O_P(1)}{\sqrt n\,\ell_\xi(n)},\qquad(1.2.89)$$

where $\ell_\xi(n)$ is as in (1.2.28). Due to the lines in (1.2.25) and (1.2.89), for $(\theta_{2n}-\theta)_t$ of (1.1.25) we have

$$(\theta_{2n}-\theta)_t=\frac{(S_{yy,t}-\lambda\theta[nt]/n)(S_{xx,t}-\theta[nt]/n)-(S_{xy,t}-\mu[nt]/n)^2}{S_{yy}-\lambda\theta}$$
$$\quad+\frac{\beta^2(S_{\xi\xi}-S_{\xi\xi,t})\big((S_{yy,t}-\lambda\theta[nt]/n)-2\beta(S_{xy,t}-\mu[nt]/n)+\beta^2(S_{xx,t}-\theta[nt]/n)\big)+R_{3,n,t}}{S_{yy}-\lambda\theta}$$
$$=\frac{S_{\xi\xi,t}\,\beta^2\big(\beta^{-2}(S_{yy,t}-\lambda\theta[nt]/n)-2\beta^{-1}(S_{xy,t}-\mu[nt]/n)+(S_{xx,t}-\theta[nt]/n)\big)+R_{n,t}}{S_{yy}-\lambda\theta}$$
$$\quad+\frac{\beta^2(S_{\xi\xi}-S_{\xi\xi,t})\big((S_{yy,t}-\lambda\theta[nt]/n)-2\beta(S_{xy,t}-\mu[nt]/n)+\beta^2(S_{xx,t}-\theta[nt]/n)\big)+R_{3,n,t}}{S_{yy}-\lambda\theta}$$
$$=\frac{S_{\xi\xi}\,\beta^2}{S_{yy}}\,\bar w(2,n)_t+\frac{R_{n,t}+R_{3,n,t}}{S_{yy}-\lambda\theta},\qquad(1.2.90)$$


where

$$R_{1,n,t}=2\beta\,(S_{\xi\delta}-S_{\xi\delta,t})+\big(S_{\delta\delta}-\lambda\theta-S_{\delta\delta,t}+\lambda\theta[nt]/n\big),\qquad(1.2.91)$$

$$R_{2,n,t}=2(\beta_{2n}-\beta)\,(S_{xy,t}-\mu[nt]/n)+(\beta_{2n}^2-\beta^2)\,(S_{xx,t}-\theta[nt]/n),\qquad(1.2.92)$$

$$R_{3,n,t}=2\beta\,(S_{\xi\xi}-S_{\xi\xi,t})R_{2,n,t}+\beta^2R_{1,n,t}\big((S_{yy,t}-\lambda\theta[nt]/n)-2\beta(S_{xy,t}-\mu[nt]/n)+\beta^2(S_{xx,t}-\theta[nt]/n)+R_{2,n,t}\big)$$
$$\qquad\quad+R_{2,n,t}\big(\beta^2(S_{\xi\xi}-S_{\xi\xi,t})+R_{1,n,t}\big),\qquad(1.2.93)$$

and, similarly to (1.2.26),

$$R_{n,t}=-S_{\xi\delta,t}^2-S_{\xi\varepsilon,t}^2\beta^2+2S_{\xi\delta,t}S_{\xi\varepsilon,t}\beta+(S_{\varepsilon\varepsilon,t}-\theta[nt]/n)(S_{\delta\delta,t}-\lambda\theta[nt]/n)-(S_{\delta\varepsilon,t}-\mu[nt]/n)^2$$
$$\qquad+2S_{\xi\delta,t}\beta(S_{\varepsilon\varepsilon,t}-\theta[nt]/n)+2S_{\xi\varepsilon,t}(S_{\delta\delta,t}-\lambda\theta[nt]/n)-2S_{\xi\delta,t}(S_{\delta\varepsilon,t}-\mu[nt]/n)-2S_{\xi\varepsilon,t}\beta(S_{\delta\varepsilon,t}-\mu[nt]/n).$$

On account of (c) of Lemma 1.2.10 for the process $\sqrt n\,\bar w(2,n)_t\big(\sum_{i=1}^{n}(w_i(2,n)-\bar w(2,n))^2/(n-1)\big)^{-1/2}$ (cf. the similar arguments regarding $\sqrt n\,\bar w(1,n)_t\big(\sum_{i=1}^{n}(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2}$), (1.2.69) for $(n-1)^{-1}\sum_{i=1}^{n}(w_i(2,n)-\bar w(2,n))^2$, (1.2.19) and (1.2.28), to complete the proof of part (iii) of Theorem 1.1.2c, it is left to be shown that, as $n\to\infty$,

$$\frac{\sqrt n}{S_{yy}-\lambda\theta}\sup_{0\le t\le1}\big|R_{n,t}\big|=o_P(1)\qquad(1.2.94)$$

and

$$\frac{\sqrt n}{S_{yy}-\lambda\theta}\sup_{0\le t\le1}\big|R_{3,n,t}\big|=o_P(1).\qquad(1.2.95)$$

Indeed, by (1.2.87) and (1.2.88), $\sqrt n\sup_{0\le t\le1}|R_{n,t}|=\sqrt n\,O_P(1)/n=O_P(1)/\sqrt n=o_P(1)$, and (1.2.94) follows,


while for $R_{1,n,t}$ in (1.2.91),

$$\sup_{0\le t\le1}\big|R_{1,n,t}\big|=\frac{\ell_\xi(n)}{\sqrt n}\,O_P(1)+\frac{1}{\sqrt n}\,O_P(1)=\frac{\ell_\xi(n)}{\sqrt n}\,O_P(1),\qquad(1.2.96)$$

where the slowly varying function $\ell_\xi(n)$ is as in (1.2.28). Mutatis mutandis, the proofs of (1.2.86) and (1.2.89) lead to

$$\sup_{0\le t\le1}\big|R_{2,n,t}\big|=\frac{\ell_\xi(n)}{\sqrt n}\,O_P(1)\qquad(1.2.97)$$

and

$$\sup_{0\le t\le1}\big|S_{\xi\xi}-S_{\xi\xi,t}\big|=\ell_\xi^2(n)\,O_P(1).\qquad(1.2.98)$$

Also, (1.2.70) of Remark 1.2.6 (case $e^{(1)}=e^{(2)}=0$) yields

$$\sup_{0\le t\le1}\big|\beta^{-2}(S_{yy,t}-\lambda\theta[nt]/n)-2\beta^{-1}(S_{xy,t}-\mu[nt]/n)+(S_{xx,t}-\theta[nt]/n)\big|=\frac{O_P(1)}{\sqrt n}.$$

Finally, combining (1.2.96)-(1.2.98) with the latter bound and with $S_{yy}-\lambda\theta=\ell_\xi^2(n)\,O_P(1)$ bounded away from zero WPA1 (cf. (1.2.19) and (1.2.28)), we get

$$\frac{\sqrt n}{S_{yy}-\lambda\theta}\sup_{0\le t\le1}\big|R_{3,n,t}\big|=\frac{\sqrt n\,O_P(1)}{\ell_\xi^2(n)}\Big(\ell_\xi^2(n)\frac{\ell_\xi(n)}{\sqrt n}\frac{1}{\sqrt n\,\ell_\xi(n)}+\frac{\ell_\xi(n)}{\sqrt n}\Big(\frac{1}{\sqrt n}+\frac{\ell_\xi(n)}{\sqrt n}\Big)\Big)=o_P(1),$$

which proves (1.2.95).

The proof of Theorem 1.1.2c for $(\widehat{\lambda\theta}_{3n}-\lambda\theta)_t$ of (1.1.26) goes the same way as the one for $(\theta_{2n}-\theta)_t$ and hence is omitted here. □

Proof of Proposition 1.1.1. Statement (i) follows from Remark 1.2.4 and the proof of Theorem 1.1.2. For example, that for $j=1$ the first process in (1.1.37) is the main term for $\sqrt n\,U(1,n)(\beta_{1n}-\beta)_t\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$ is seen from (1.2.36) and (1.2.37) of Remark 1.2.4, (1.2.74), (1.2.75), (1.2.69) for $\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)$ and Lemma 1.2.10 for $\sqrt n\,\bar u(1,n)_t\big(\sum_{i=1}^{n}(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$. We are to show (ii). The first and, if $m\ne0$ in (C) and $\mathrm{Var}\,\xi<\infty$, the second process in (1.1.37) are examples of the general $M_{n,t}$ process of (1.2.52) with $|e^{(1)}|+|e^{(2)}|>0$ for the corresponding vector $e$ in (1.2.51). Next, as summarized in Remark 1.2.8, the main term processes for such $M_{n,t}$ are the processes $\sqrt n\,\langle\bar\zeta,e\rangle_t\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,e\rangle^2/(n-1)\big)^{-1/2}$ with $e$ chosen respectively. Moreover, mutatis mutandis, statements (i)-(iii) of the respective Theorems 1.1.2a and 1.1.2b hold for these $\sqrt n\,\langle\bar\zeta,e\rangle_t\big(\sum_{i=1}^{n}\langle\zeta_i-\bar\zeta,e\rangle^2/(n-1)\big)^{-1/2}$ processes if and only if $\xi\in\mathrm{DAN}$. By noting that $\sqrt n\,U(j,n)(\beta_{jn}-\beta)_t\big(\sum_{i=1}^{n}(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{-1/2}$ are cases of the $M_{n,t}$ process in (1.2.52) with $|e^{(1)}|+|e^{(2)}|>0$, $j=2,3$, and applying Remark 1.2.8 (case $\alpha=0$), we arrive at the conclusion of (iii). □

Proof of Observation 1.1.1. On account of (1.2.69) of Remark 1.2.6 (case $|e^{(1)}|+|e^{(2)}|>0$), as $n\to\infty$,

$$\frac{\sum_{i=1}^{n}\big(u_i(j,n)-\bar u(j,n)\big)^2}{(n-1)\,\ell_\xi^2(n)}\ \stackrel{P}{\to}\ \text{positive constant},\qquad j=\overline{1,3},\qquad(1.2.100)$$

with $\ell_\xi(n)$ of (1.2.28). Now, (1.2.100) combined with (1.2.19) and (1.2.28) results in

$$\frac{U(j,n)}{\ell_\xi(n)\big(\sum_{i=1}^{n}(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{1/2}}=\frac{U(j,n)}{\ell_\xi^2(n)}\left(\frac{(n-1)\,\ell_\xi^2(n)}{\sum_{i=1}^{n}(u_i(j,n)-\bar u(j,n))^2}\right)^{1/2}\ \stackrel{P}{\to}\ \text{nonzero constant},\qquad j=\overline{1,3}.$$

From the latter convergence and (i) with $t_0=1$ of Theorem 1.1.2a, we get (1.1.38). Due to (1.2.69) for $\sum_{i=1}^{n}(v'_i(j,n)-\bar v'(j,n))^2/(n-1)$, with $v'_i(j,n)$ of (1.1.37) (case $e^{(1)}=e^{(2)}=0$ when $\mathrm{Var}\,\xi=\infty$), by (1.2.79) and (1.2.83), as $n\to\infty$,

$$\frac{\sum_{i=1}^{n}\big(v_i(j,n)-\bar v(j,n)\big)^2}{n-1}\ \stackrel{P}{\to}\ \text{positive constant},\qquad j=\overline{1,3}.\qquad(1.2.101)$$


Similarly, via (1.2.69) for $\sum_{i=1}^{n}(w_i(j,n)-\bar w(j,n))^2/(n-1)$ (case $e^{(1)}=e^{(2)}=0$), as $n\to\infty$,

$$\frac{\sum_{i=1}^{n}\big(w_i(j,n)-\bar w(j,n)\big)^2}{n-1}\ \stackrel{P}{\to}\ \text{positive constant},\qquad j=\overline{1,2}.\qquad(1.2.102)$$

Hence, (1.2.101), (1.2.102) and the respective (i) with $t_0=1$ of Theorems 1.1.2b and 1.1.2c imply that, both in case $\mathrm{Var}\,\xi<\infty$ and in case $\mathrm{Var}\,\xi=\infty$, $\alpha_{jn}$, $j=\overline{1,3}$, $\theta_{1n}$, $\theta_{2n}$ and $\widehat{\lambda\theta}_{3n}$ are $\sqrt n$-asymptotically normal. □
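As a numerical illustration of $\sqrt n$-asymptotic normality via Studentization (generic i.i.d. data and the classical Studentized mean, not the thesis's estimators), one can check the coverage of the asymptotic 95% normal interval:

```python
import numpy as np

# Monte Carlo check: the Studentized mean T_n = sqrt(n)*zbar/S_n is
# asymptotically N(0,1), so |T_n| <= 1.96 should hold for ~95% of samples.
rng = np.random.default_rng(5)
reps, n = 2000, 200
covered = 0
for _ in range(reps):
    z = rng.exponential(size=n) - 1.0          # mean-zero, skewed data
    s = np.sqrt(np.sum((z - z.mean()) ** 2) / (n - 1))
    t = np.sqrt(n) * z.mean() / s
    covered += abs(t) <= 1.96
coverage = covered / reps
assert 0.90 < coverage < 0.99                  # loose band around 0.95
```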

Proof of Theorem 1.1.3a. First, we will show that, as $n\to\infty$,

$$\frac{\sum_{i=1}^{n}\big(\hat u_i(j,n)-\bar{\hat u}(j,n)\big)^2}{\sum_{i=1}^{n}\big(u_i(j,n)-\bar u(j,n)\big)^2}\ \stackrel{P}{\to}\ 1,\qquad j=\overline{1,3}.\qquad(1.2.103)$$

To establish (1.2.103), we are to verify that, as $n\to\infty$,

$$\frac{\sum_{i=1}^{n}\big(s_{i,\xi\xi}-S_{\xi\xi}\big)^2}{n^2\ell_\xi^4(n)}=O_P(1),\qquad(1.2.104)$$

$$\frac{\sum_{i=1}^{n}\big(s_{i,\xi\delta}-S_{\xi\delta}\big)^2}{n^2\ell_\xi^2(n)}=o_P(1),\qquad\frac{\sum_{i=1}^{n}\big(s_{i,\xi\varepsilon}-S_{\xi\varepsilon}\big)^2}{n^2\ell_\xi^2(n)}=o_P(1),\qquad(1.2.105)$$

and that

$$\frac{\sum_{i=1}^{n}\big(s_{i,\delta\delta}-S_{\delta\delta}\big)^2}{n^2\ell_\xi^2(n)}=o_P(1),\qquad\frac{\sum_{i=1}^{n}\big(s_{i,\delta\varepsilon}-S_{\delta\varepsilon}\big)^2}{n^2\ell_\xi^2(n)}=o_P(1),\qquad\frac{\sum_{i=1}^{n}\big(s_{i,\varepsilon\varepsilon}-S_{\varepsilon\varepsilon}\big)^2}{n^2\ell_\xi^2(n)}=o_P(1).\qquad(1.2.106)$$

Clearly, on account of (1.2.28), (1.2.29) and the WLLN,

$$\frac{\sum_{i=1}^{n}\big(s_{i,\xi\xi}-S_{\xi\xi}\big)^2}{n^2\ell_\xi^4(n)}\le\frac{2\sum_{i=1}^{n}(\xi_i-c\bar\xi)^4}{n^2\ell_\xi^4(n)}+O_P(1)\le16\Big(\frac{\sum_{i=1}^{n}(\xi_i-m)^4}{n^2\ell_\xi^4(n)}+\frac{n\,(m-c\bar\xi)^4}{n^2\ell_\xi^4(n)}\Big)+O_P(1)=O_P(1),$$


where, due to (1.2.15) and the identical distribution of the random variables $(\xi_i-m)^4\big(\sum_{i=1}^{n}(\xi_i-m)^2\big)^{-2}$, $1\le i\le n$, as $n\to\infty$,

$$E\,\frac{\frac1n\sum_{i=1}^{n}(\xi_i-m)^4}{\big(\sum_{i=1}^{n}(\xi_i-m)^2\big)^2}=E\,\frac{(\xi_1-m)^4}{\big(\sum_{i=1}^{n}(\xi_i-m)^2\big)^2}=o(1),$$

and thus (1.2.104) is proved. As to (1.2.105), on using Remark 1.2.5, (1.2.29) and the WLLN,

$$\frac{\sum_{i=1}^{n}\big(s_{i,\xi\varepsilon}-S_{\xi\varepsilon}\big)^2}{n^2\ell_\xi^2(n)}\le\frac{2\sum_{i=1}^{n}\big((\xi_i-c\bar\xi)(\varepsilon_i-c\bar\varepsilon)\big)^2}{n^2\ell_\xi^2(n)}+\frac{2n\,S_{\xi\varepsilon}^2}{n^2\ell_\xi^2(n)}$$
$$\le4\Big(\frac{3\sum_{i=1}^{n}(\xi_i-cm)^2(\varepsilon_i-c\bar\varepsilon)^2}{n^2\ell_\xi^2(n)}+\frac{c\,(m-\bar\xi)^2\sum_{i=1}^{n}(\varepsilon_i-c\bar\varepsilon)^2}{n^2\ell_\xi^2(n)}\Big)+\frac{o_P(1)}{n\,\ell_\xi^2(n)}=o_P(1),$$

where $\ell_\xi(n)$ obeys (1.2.28) and (1.2.29). The other statement in (1.2.105) can be proved in the same way. It suffices to establish (1.2.106) on the example of $\sum_{i=1}^{n}(s_{i,\delta\delta}-S_{\delta\delta})^2\big(n^2\ell_\xi^2(n)\big)^{-1}$. Due to the WLLN,

$$\frac{\sum_{i=1}^{n}\big(s_{i,\delta\delta}-S_{\delta\delta}\big)^2}{n^2\ell_\xi^2(n)}\le\frac{2\sum_{i=1}^{n}(\delta_i-c\bar\delta)^4}{n^2\ell_\xi^2(n)}+\frac{2n\,S_{\delta\delta}^2}{n^2\ell_\xi^2(n)}\le\frac{16\sum_{i=1}^{n}\delta_i^4}{n^2\ell_\xi^2(n)}+\frac{O_P(1)}{n\,\ell_\xi^2(n)}=\frac{O_P(1)}{n\,\ell_\xi^2(n)}=o_P(1).$$

Convergence (1.2.103) for $j=2$ follows from (1.2.100) and the Cauchy-Schwarz inequality, since, by (1.2.89) and (1.2.104)-(1.2.106),

$$\frac{\sum_{i=1}^{n}\big((\hat u_i(2,n)-\bar{\hat u}(2,n))-(u_i(2,n)-\bar u(2,n))\big)^2}{n\,\ell_\xi^2(n)}=\frac{\sum_{i=1}^{n}\big(s_{i,xy}-S_{xy}\big)^2(\beta-\beta_{2n})^2}{n\,\ell_\xi^2(n)}+o_P(1)$$
$$\le\frac{O_P(1)}{n^2\ell_\xi^4(n)}\sum_{i=1}^{n}\Big((s_{i,\xi\varepsilon}-S_{\xi\varepsilon})^2\beta^2+(s_{i,\xi\delta}-S_{\xi\delta})^2+(s_{i,\delta\varepsilon}-S_{\delta\varepsilon})^2\beta^2+(s_{i,\delta\varepsilon}-S_{\delta\varepsilon})^2\Big)+o_P(1)=o_P(1).\qquad(1.2.107)$$


For $j=1$ and $j=3$ the proofs of the respective (1.2.103) go the same way and are based on (1.2.104)-(1.2.106), with the additional note for $j=1$ that $\beta_{1n}$ is a consistent estimator of $\beta$ and

$$\lambda-\beta_{1n}^2=\lambda-\beta^2-(\beta_{1n}-\beta)(\beta_{1n}+\beta)=\lambda-\beta^2+o_P(1).$$

Finally, (1.2.103) allows us to replace $u_i(j,n)$ in (i)-(iii) of Theorem 1.1.2a with the respective $\hat u_i(j,n)$, $j=\overline{1,3}$. □

P ro o f of T heorem 1.1.3b. Similarly to the proof of Theorem 1.1.3a, this proof reduces to establishing convergence

T?=i(viU,n) ~ v(j,n ))2 P i „ . i ’ n °°’ (1-2-108) All Vi(j,n) possess similar forms and hence, (1.2.108) is shown for j = 1 only. From (1.1.38) of Observation 1.1.1, (1.2.28) and the WLLN

$$\frac{1}{n}\sum_{i=1}^n(\beta-\beta_{1n})^2(x_i-\bar x)^2\le 2(\beta-\beta_{1n})^2\left(S_{\delta\delta}+\frac{1}{n}\sum_{i=1}^n(\xi_i-\bar\xi)^2\right)=(\beta-\beta_{1n})^2\big(O_P(1)+\ell_\xi^2(n)O_P(1)\big)=o_P(1).\tag{1.2.109}$$

Due to (1.2.19) and (1.2.28),

(1.2.110) holds true, while (1.2.100) for j = 1 and the version of (1.2.107) for $\hat u_i(1,n)$ and $u_i(1,n)$ yield

$$\frac{1}{n}\sum_{i=1}^n\big((\hat u_i(1,n)-\bar{\hat u}(1,n))-(u_i(1,n)-\bar u(1,n))\big)^2=\ell_\xi^2(n)\,o_P(1).$$

Thus, by the latter equality, (1.2.109) and (1.2.110),

$$\frac{1}{n}\sum_{i=1}^n\big((\hat v_i(1,n)-\bar{\hat v}(1,n))-(v_i(1,n)-\bar v(1,n))\big)^2$$

$$=\frac{1}{n}\sum_{i=1}^n\Big((\beta-\beta_{1n})(x_i-\bar x)-\big((\hat u_i(1,n)-\bar{\hat u}(1,n))-(u_i(1,n)-\bar u(1,n))\big)\Big)^2$$


$$\le\frac{2}{n}\sum_{i=1}^n(\beta-\beta_{1n})^2(x_i-\bar x)^2+\frac{2}{n}\sum_{i=1}^n\big((\hat u_i(1,n)-\bar{\hat u}(1,n))-(u_i(1,n)-\bar u(1,n))\big)^2$$

$$=o_P(1)+\ell_\xi^2(n)\,o_P(1),$$

which, combined with the Cauchy-Schwarz inequality, results in (1.2.108) for j = 1. □

Proof of Theorem 1.1.3c. Analogously to the proofs of Theorems 1.1.3a and 1.1.3b, it suffices to show

$$\frac{1}{n}\sum_{i=1}^n\big((\hat w_i(1,n)-\bar{\hat w}(1,n))-(w_i(1,n)-\bar w(1,n))\big)^2=o_P(1).\tag{1.2.111}$$

Using (1.1.38) of Observation 1.1.1, the lines in (1.2.34) and (1.2.104)-(1.2.106), we get

$$\frac{1}{n}\sum_{i=1}^n\big((\hat w_i(1,n)-\bar{\hat w}(1,n))-(w_i(1,n)-\bar w(1,n))\big)^2$$

$$=\frac{1}{n}\sum_{i=1}^n\big(-2(\beta_{1n}-\beta)(s_{i,xy}-S_{xy})+(\beta_{1n}^2-\beta^2)(s_{i,xx}-S_{xx})\big)^2$$

$$=\frac{1}{n}\sum_{i=1}^n\big((s_{i,\xi\xi}-S_{\xi\xi})(\beta_{1n}-\beta)^2+2(s_{i,\xi\delta}-S_{\xi\delta})(\beta_{1n}-\beta)\beta_{1n}+(s_{i,\delta\delta}-S_{\delta\delta})(\beta_{1n}-\beta)(\beta_{1n}+\beta)$$
$$\qquad-2(s_{i,\xi\varepsilon}-S_{\xi\varepsilon})(\beta_{1n}-\beta)-2(s_{i,\delta\varepsilon}-S_{\delta\varepsilon})(\beta_{1n}-\beta)\big)^2$$

$$\le O_P(1)\,\frac{\sum_{i=1}^n(s_{i,\xi\xi}-S_{\xi\xi})^2}{n^2\ell_\xi^4(n)}+O_P(1)\,\frac{\sum_{i=1}^n(s_{i,\xi\delta}-S_{\xi\delta})^2}{n^2\ell_\xi^4(n)}+O_P(1)\,\frac{\sum_{i=1}^n(s_{i,\delta\delta}-S_{\delta\delta})^2}{n^2\ell_\xi^2(n)}$$
$$\qquad+O_P(1)\,\frac{\sum_{i=1}^n(s_{i,\xi\varepsilon}-S_{\xi\varepsilon})^2}{n^2\ell_\xi^2(n)}+O_P(1)\,\frac{\sum_{i=1}^n(s_{i,\delta\varepsilon}-S_{\delta\varepsilon})^2}{n^2\ell_\xi^2(n)}=o_P(1).\ \Box$$


Proof of Proposition 1.1.2. Follows from Proposition 1.1.1 and the convergence in (1.2.103), (1.2.108) and (1.2.111). □

1.2.3 Survey on Major Results on GDAN and Studentization with New Characterizations

In this subsection we will deal with the notion of, and results on, the generalized domain of attraction of a d-variate normal law, denoted here by GDAN in view of the previously used DAN for the univariate case. Lemmas 1.2.11-1.2.15 survey the major known, as well as a few new, results on GDAN and Studentization.

Section 1.2.3 attempts to achieve two major goals. On the one hand, our excursion into basic known results on GDAN helps us first to obtain the observation of Lemma 1.2.12 and then to conclude Lemma 1.2.15. In the context of Chapter 1, the latter lemma is a crucial auxiliary tool for further developments in Section 1.2.4. On the other hand, the survey of the results in this subsection can also be read independently of Chapter 1, just like its univariate companion Section 1.2.1. We emphasize, however, that it is the richness of the special EIVM context of the present Chapter 1 that suggested to us to take a close look at the relationship of the notions and general results of Sections 1.2.1 and 1.2.3 and to come up with new additions to the topics of these sections. Consequently, when proving the characterization Lemmas 1.2.12 and 1.2.15 in this subsection, we also hope to be contributing to a general theory on DAN, GDAN and Studentization.

Lemma 1.2.15 features an extensive summary of various characterizations of GDAN, combining the results of Lemma 1.2.5 of Section 1.2.1 and those of Lemma 1.2.12 that is obtained here. In particular, Lemma 1.2.15 contributes in part to answering the fundamental open problem of characterizing when the multivariate Student statistic is asymptotically standard normal and when its corresponding process is asymptotically a standard Wiener process (cf. Remark 1.2.15).


All the notations, definitions and abbreviations below are in the corresponding list provided in this thesis. We introduce here only those that first appear in this subsection. The notation of (1.1.8) is employed below for vectors in $\mathrm{IR}^d$.

A random vector Z is called spherically symmetric if all $\langle Z,u\rangle$ with nonrandom vectors u, $\|u\|=1$, have the same distribution, coinciding with the distribution of each single component $Z^{(j)}$, $1\le j\le d$. A spherically symmetric distribution is the distribution of a spherically symmetric random vector. For a random matrix $B_n=(b_{ij}^{(n)})_{i,j=\overline{1,d}}$ and a matrix $B=(b_{ij})_{i,j=\overline{1,d}}$, as $n\to\infty$, $B_n\xrightarrow{P}B$ means that each entry $b_{ij}^{(n)}$ of $B_n$ converges in P to the corresponding entry $b_{ij}$ of the matrix B. For the space $C([0,1],\mathrm{IR}^d)$ of $\mathrm{IR}^d$-valued $C[0,1]$-functions that is endowed with the sup-norm metric $\rho$, we apply the notation $(C([0,1],\mathrm{IR}^d),\rho)$. $\{W_d(t)=(W^{(1)}(t),\cdots,W^{(d)}(t)),\ 0\le t<\infty\}$, or $W_d(t)$, is used for an $\mathrm{IR}^d$-valued standard Wiener process, i.e., the components $W^{(1)}(t),\cdots,W^{(d)}(t)$ are i.i.d. standard real-valued Wiener processes.

For a positive definite matrix A, two kinds of square roots are to be introduced. $\bar A^{1/2}$ denotes the (left) Cholesky square root of A: $\bar A^{1/2}$ is the uniquely existing lower triangular matrix with positive diagonal elements such that $\bar A^{1/2}\big(\bar A^{1/2}\big)^T=A$. Clearly, $\bar A^{1/2}$ is invertible. Yet another "version" of a matrix square root of A we will deal with is the symmetric positive definite square root of A, denoted here by $A^{1/2}$. The latter exists and satisfies $\big(A^{1/2}\big)^2=A$. Here are some further relevant definitions: $\bar A^{-1/2}=\big(\bar A^{1/2}\big)^{-1}$, $\bar A^{T/2}=\big(\bar A^{1/2}\big)^T$ and $\bar A^{-T/2}=\big(\bar A^{-1/2}\big)^T$. Sometimes, the Cholesky and symmetric positive definite square roots of A are both designated by the universal $A^{1/2}$, and $A^{-1/2}=\big(A^{1/2}\big)^{-1}$, $A^{T/2}=\big(A^{1/2}\big)^T$ and $A^{-T/2}=\big(A^{-1/2}\big)^T$. Now, we introduce a definition of GDAN in view of Hahn and Klass [29] and Maller [42].
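The two matrix square roots just described can be computed explicitly in the 2x2 case. The following sketch (pure Python; the matrix A is a hypothetical illustration, not from the thesis) computes both roots and checks the defining identities $\bar A^{1/2}(\bar A^{1/2})^T=A$ and $(A^{1/2})^2=A$; for the symmetric root it uses the standard 2x2 closed form $A^{1/2}=(A+\sqrt{\det A}\,I)/\sqrt{\mathrm{tr}\,A+2\sqrt{\det A}}$, a consequence of the Cayley-Hamilton theorem.

```python
import math

# Hypothetical 2x2 symmetric positive definite matrix A (illustration only).
A = [[4.0, 2.0],
     [2.0, 3.0]]

def cholesky2(M):
    """(Left) Cholesky square root: the lower triangular L with positive
    diagonal such that L L^T = M."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

def sym_sqrt2(M):
    """Symmetric positive definite square root of a 2x2 SPD matrix via
    M^{1/2} = (M + sqrt(det M) I) / sqrt(tr M + 2 sqrt(det M))."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    s = math.sqrt(a * c - b * b)      # sqrt(det M)
    t = math.sqrt(a + c + 2.0 * s)    # sqrt(tr M + 2 sqrt(det M))
    return [[(a + s) / t, b / t], [b / t, (c + s) / t]]

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

L = cholesky2(A)                                   # Cholesky root, lower triangular
S = sym_sqrt2(A)                                   # symmetric positive definite root
LLT = matmul2(L, [[L[0][0], L[1][0]], [L[0][1], L[1][1]]])
SS = matmul2(S, S)
# Both square roots reproduce A: L L^T = A and S S = A.
```

The two roots differ (L is triangular, S is symmetric), yet both recover A, which is exactly the freedom exploited in the Studentization results below.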

Definition 1.2.2. Let $\{Z, Z_i,\ i\ge1\}$ be i.i.d. random vectors in $\mathrm{IR}^d$. We say that


Z belongs to GDAN if there exist nonstochastic sequences of vectors $a_n$ and $d\times d$ matrices $B_n$ such that

$$\Big(\sum_{i=1}^n Z_i-a_n\Big)B_n\xrightarrow{D}N(0,I_d),\quad\text{as } n\to\infty.\tag{1.2.112}$$

Remark 1.2.9. It was shown in Maller [42] that if (1.2.112) holds, then $E\|Z\|^\alpha<\infty$ for $0\le\alpha<2$ and $a_n$ can be taken as $nEZ$, while the norming matrix $B_n$ is invertible for large enough n and may be chosen to be symmetric ($B_n=B_n^T$). Also, $B_n$ converges to zero, as $n\to\infty$. In fact, according to Meerschaert [50], we may assume that $B_n=n^{-1/2}L_n$, where the matrix $L_n$ is slowly varying, which means that $L_{[\lambda n]}L_n^{-1}\to I_d$ for all $\lambda>0$, $n\to\infty$. As a general fact, it is also known that (1.2.112) implies that Z is full (cf. Lemma 3.3.3 in Meerschaert and Scheffler [51]).

In the main result of Maller [42], various equivalent characterizations of GDAN can be found. Such equivalences from Theorem 1.1 of [42] that are most relevant for the aims of our survey in this subsection are summarized in the following lemma.

Lemma 1.2.11. Let $\{Z,Z_i,\ i\ge1\}$ be i.i.d. random vectors in $\mathrm{IR}^d$ having a full distribution. As $n\to\infty$, the following statements are equivalent:

(a) $Z\in$ GDAN, i.e., there exist nonstochastic sequences of vectors $a_n$ and matrices $B_n$ as in (1.2.112);

(b) there exist nonstochastic square matrices $B_n$ such that $\displaystyle B_n^T\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\,B_n\xrightarrow{P}I_d$;

(c) $\displaystyle\max_{1\le i\le n}Z_i\Big(\sum_{j=1}^nZ_j^TZ_j\Big)^{-1}Z_i^T\xrightarrow{P}0$;

(d) $\displaystyle\sup_{u\in\mathrm{IR}^d,\,\|u\|=1}\frac{x^2P(|\langle Z,u\rangle|>x)}{E\big(\langle Z,u\rangle^2\,1\{|\langle Z,u\rangle|\le x\}\big)}\to0$, as $x\to\infty$.
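The quadratic negligibility condition (c) is easy to probe numerically. The sketch below (an illustrative assumption: i.i.d. standard normal planar data and the hypothetical helper name `max_quadratic_term`; it is not part of [42] or of this thesis) computes $\max_{1\le i\le n}Z_i(\sum_{j=1}^nZ_j^TZ_j)^{-1}Z_i^T$ for two sample sizes. Each summand is a "leverage" value lying in (0, 1], and for light-tailed data the maximum shrinks as n grows.

```python
import random

random.seed(12345)

def max_quadratic_term(n):
    """Largest value of Z_i (sum_j Z_j^T Z_j)^{-1} Z_i^T over i = 1..n
    for an i.i.d. standard normal sample of 1x2 row vectors Z_i."""
    Z = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(n)]
    # M = sum_j Z_j^T Z_j, a 2x2 symmetric matrix
    m11 = sum(z[0] * z[0] for z in Z)
    m12 = sum(z[0] * z[1] for z in Z)
    m22 = sum(z[1] * z[1] for z in Z)
    det = m11 * m22 - m12 * m12
    i11, i12, i22 = m22 / det, -m12 / det, m11 / det   # M^{-1}
    return max(z[0] * (i11 * z[0] + i12 * z[1]) +
               z[1] * (i12 * z[0] + i22 * z[1]) for z in Z)

small, large = max_quadratic_term(50), max_quadratic_term(5000)
# Each term lies in (0, 1]; for a normal sample the maximum shrinks as
# n grows, illustrating the "quadratic negligibility" condition (c).
```

For heavy-tailed data outside GDAN, a single observation can keep dominating the Gram matrix, and this maximum need not vanish.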

Remark 1.2.10. In fact, the equivalence of (a) and (d) under $EZ=0$ is one of the main results of the seminal paper by Hahn and Klass [29] that has apparently


stimulated intensive studies of GDAN. On the other hand, condition (c), also called "quadratic negligibility" in [42], is a natural d-dimensional version of O'Brien's univariate condition in (1.2.2) and first studied for d > 1 in [42]. There, the matrix $\sum_{i=1}^nZ_i^TZ_i$ is seen to be invertible WPA1, $n\to\infty$. In turn, condition (b) can be interpreted as a modified d-dimensional condition (1.2.3) for $Z_i$. From the course of the proofs in [42], the matrix $B_n$ in (b) is invertible for large enough n and may be chosen to be symmetric. Furthermore, the equivalence of (a) and (b) in Lemma 1.2.11 (with the same $B_n$) can be viewed as a modified generalization of Lemma 1.2.2 of Section 1.2.1.

Remark 1.2.11. Via (d) of Lemma 1.2.11, it is easy to see that if $Z\in$ GDAN, then each component $Z^{(j)}$ of Z is in DAN, $1\le j\le d$. Indeed, (d) implies that

$$\frac{x^2P(|Z^{(j)}|>x)}{E\big((Z^{(j)})^2\,1\{|Z^{(j)}|\le x\}\big)}\to0,\quad x\to\infty.\tag{1.2.113}$$

The latter is known as Levy's necessary and sufficient condition for $Z^{(j)}$ to be in DAN (cf. [37]). On the other hand, all $Z^{(j)}\in$ DAN alone is not sufficient to guarantee that $Z\in$ GDAN. Indeed, modifying somewhat Remark (ii) on p. 193 of [42], suppose that $EZ=0$ and all $Z^{(j)}$ are identically distributed and belong to DAN. Then, in (1.2.3) of Lemma 1.2.2, with $Z_i^{(j)}$ replacing $Z_i$, $b_n$ may be chosen to be the same for all sequences $\{Z_i^{(j)},\ i\ge1\}$, $j=\overline{1,d}$, and therefore,

$$b_n^{-2}\sum_{i=1}^n\|Z_i\|^2\xrightarrow{P}1,\quad n\to\infty.\tag{1.2.114}$$

Further, from Remark (ii) on p. 193 of [42], via p. 236 of Feller [18], (1.2.114) is equivalent to (1.2.113) with $\|Z\|$ in place of $|Z^{(j)}|$, and such a form of (1.2.113) does not imply (d) of Lemma 1.2.11. However, as an exception in this regard, the class of spherically symmetric random vectors Z is pointed out in Remark (ii) on p. 217 of [42]. As noted there, for this class of Z, it is seen that $Z\in$ GDAN is equivalent to each $Z^{(j)}\in$ DAN, on account of each projection of Z having the same distribution.


In fact, getting away from the condition that the components $Z^{(j)}$ of the vector Z are identically distributed, as in the exceptional case of Remark 1.2.11, we now give yet another interesting example of special vectors $Z=(Z^{(1)},\cdots,Z^{(d)})$ for which $Z^{(j)}\in$ DAN for all $j=\overline{1,d}$ characterizes that $Z\in$ GDAN. This special class of random vectors was suggested to us by the context of model (1.1.1)-(1.1.2) in this chapter, and it is crucial for Lemma 1.2.15 and Lemma 1.2.16, the main auxiliary results for Section 1.2.4.

Lemma 1.2.12. Let $Z=(Z^{(1)},\cdots,Z^{(d)})$ be a vector in $\mathrm{IR}^d$. Suppose that

for $j\ne k$, $E|Z^{(j)}Z^{(k)}|<\infty$, if $E(Z^{(j)})^2=\infty$ and/or $E(Z^{(k)})^2=\infty$. (1.2.115)

For the vector $\tilde Z$ formed by all the components of Z whose second moments exist, if any, assume that

$\tilde Z$ is full. (1.2.116)

Then the following two statements are equivalent:

(a) $Z\in$ GDAN;

(b) $Z^{(j)}\in$ DAN for all $j=\overline{1,d}$.

Proof. That (a) implies (b) follows from Remark 1.2.11. Conversely, assume that (b) holds true. Via the equivalence of parts (a) and (b) of Lemma 1.2.11, the proof of (a) of this lemma reduces to verifying the convergence in (b) of Lemma 1.2.11 for suitably chosen matrices $B_n$. If $E(Z^{(j)})^2<\infty$ for all $j=\overline{1,d}$, then from (1.2.116), $\tilde Z=Z$ is full. The notion of a full vector enables one to conclude that $\mathrm{Cov}\,Z>0$ and, due to the WLLN applied componentwise, part (b) of Lemma 1.2.11 is satisfied with the matrices

$$B_n=n^{-1/2}\,\overline{\mathrm{Cov}\,Z}^{\,-T/2}\quad\text{and}\quad B_n=n^{-1/2}\,(\mathrm{Cov}\,Z)^{-1/2}.\tag{1.2.117}$$

Suppose now that, without loss of generality, $E(Z^{(j)})^2=\infty$ for all $j=\overline{1,m}$, $m\le d$, while $E(Z^{(j)})^2<\infty$ for all $j=\overline{m+1,d}$. First, note that such a vector


Z is full and thus, Lemma 1.2.11 is applicable. Indeed, for any scalar unit norm vector u, $\mathrm{Var}\langle Z,u\rangle=\sum_{j=1}^d(u^{(j)})^2\,\mathrm{Var}\,Z^{(j)}+2\sum_{j<k}u^{(j)}u^{(k)}\,\mathrm{cov}(Z^{(j)},Z^{(k)})$, and, if $|u^{(1)}|+\cdots+|u^{(m)}|>0$, on account of (1.2.115), $\mathrm{Var}\langle Z,u\rangle=\infty$, while when $u^{(1)}=\cdots=u^{(m)}=0$, $\mathrm{Var}\langle Z,u\rangle>0$ due to (1.2.116). Next, it is not hard to verify the convergence in part (b) of Lemma 1.2.11 with the block-diagonal matrices

$$B_n=n^{-1/2}\,\mathrm{diag}\Big((\ell^{(1)}(n))^{-1},\cdots,(\ell^{(m)}(n))^{-1},\ \overline{\mathrm{Cov}\,Z^{(m+1,d)}}^{\,-T/2}\Big)$$

and

$$B_n=n^{-1/2}\,\mathrm{diag}\Big((\ell^{(1)}(n))^{-1},\cdots,(\ell^{(m)}(n))^{-1},\ \big(\mathrm{Cov}\,Z^{(m+1,d)}\big)^{-1/2}\Big),\tag{1.2.118}$$

where $\ell^{(1)}(n)\nearrow\infty,\cdots,\ell^{(m)}(n)\nearrow\infty$ are slowly varying functions, such that, as $n\to\infty$,

$$\frac{\sum_{i=1}^n\big(Z_i^{(j)}-EZ^{(j)}\big)}{\sqrt n\,\ell^{(j)}(n)}\xrightarrow{D}N(0,1),\quad j=\overline{1,m},\tag{1.2.119}$$

(cf. Remark 1.2.1). Note that the matrices in (1.2.118) are well-defined on account of (1.2.116) for $\tilde Z=Z^{(m+1,d)}$. For the matrices in (1.2.118), we have

$$B_n^T\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\,B_n=:E_n=\big(e_n^{jk}\big)_{j,k=\overline{1,d}},\tag{1.2.120}$$

where the respective symmetric matrices $E_n$ are given via their matrix blocks as follows:

$$E_n^{11}=\big(e_n^{jk}\big)_{j,k=\overline{1,m}},\quad\text{with}\quad e_n^{jk}=\frac{\sum_{i=1}^n(Z_i^{(j)}-\bar Z^{(j)})(Z_i^{(k)}-\bar Z^{(k)})}{n\,\ell^{(j)}(n)\,\ell^{(k)}(n)};\tag{1.2.121}$$

$$E_n^{12}=\big(e_n^{jk}\big)_{j=\overline{1,m},\,k=\overline{m+1,d}},\quad\text{where}$$
$$e_n^{jk}=\mathrm{const}\cdot\frac{\sum_{i=1}^n(Z_i^{(j)}-\bar Z^{(j)})(Z_i^{(m+1)}-\bar Z^{(m+1)})}{n\,\ell^{(j)}(n)}+\cdots+\mathrm{const}\cdot\frac{\sum_{i=1}^n(Z_i^{(j)}-\bar Z^{(j)})(Z_i^{(d)}-\bar Z^{(d)})}{n\,\ell^{(j)}(n)},\tag{1.2.122}$$

with the constants in (1.2.122) depending on j and k, and

$$E_n^{22}=\overline{\mathrm{Cov}\,Z^{(m+1,d)}}^{\,-1/2}\,\frac{\sum_{i=1}^n\big(Z_i^{(m+1,d)}-\bar Z^{(m+1,d)}\big)^T\big(Z_i^{(m+1,d)}-\bar Z^{(m+1,d)}\big)}{n}\,\overline{\mathrm{Cov}\,Z^{(m+1,d)}}^{\,-T/2}$$

and

$$E_n^{22}=\big(\mathrm{Cov}\,Z^{(m+1,d)}\big)^{-1/2}\,\frac{\sum_{i=1}^n\big(Z_i^{(m+1,d)}-\bar Z^{(m+1,d)}\big)^T\big(Z_i^{(m+1,d)}-\bar Z^{(m+1,d)}\big)}{n}\,\big(\mathrm{Cov}\,Z^{(m+1,d)}\big)^{-1/2}.\tag{1.2.123}$$

As $n\to\infty$, on account of (1.2.119) and Lemma 1.2.2,

$$e_n^{jj}\xrightarrow{P}1\quad\text{for all } j=\overline{1,m},\tag{1.2.124}$$

and, due to (1.2.115) and the fact that $\ell^{(j)}(n)\nearrow\infty$, $j=\overline{1,m}$,

$$e_n^{jk}\xrightarrow{P}0\quad\text{for } j\ne k,\ j=\overline{1,m}\ \text{and}\ k=\overline{1,d},\tag{1.2.125}$$

while, clearly,

$$E_n^{22}\xrightarrow{P}I_{d-m}.\tag{1.2.126}$$

Thus, for the matrix $E_n$ in (1.2.120), (1.2.124)-(1.2.126) result in

$$E_n\xrightarrow{P}I_d,\quad n\to\infty,\tag{1.2.127}$$

i.e., the convergence in part (b) of Lemma 1.2.11 for the matrices in (1.2.118) holds true. Finally, the third type of vectors satisfying (1.2.115), (1.2.116) and having all their components from DAN consists of vectors Z with $E(Z^{(j)})^2=\infty$ for all $j=\overline{1,d}$. Such vectors Z are full, since for any scalar unit norm vector u, on account of (1.2.115), $\mathrm{Var}\langle Z,u\rangle=\infty$. This allows one to apply Lemma 1.2.11 via checking the convergence in part (b) of Lemma 1.2.11 with

$$B_n=n^{-1/2}\,\mathrm{diag}\big((\ell^{(1)}(n))^{-1},\cdots,(\ell^{(d)}(n))^{-1}\big),\tag{1.2.128}$$

where the $\ell^{(j)}(n)$ are as in (1.2.119). For the elements $e_n^{jk}$ of the matrix $E_n$ in (1.2.120) defined by $B_n$ of (1.2.128), similarly to (1.2.124) and (1.2.125), we obtain, as $n\to\infty$,

$$e_n^{jk}=\frac{\sum_{i=1}^n(Z_i^{(j)}-\bar Z^{(j)})(Z_i^{(k)}-\bar Z^{(k)})}{n\,\ell^{(j)}(n)\,\ell^{(k)}(n)}\xrightarrow{P}\begin{cases}1,& j=k,\\ 0,& j\ne k.\end{cases}$$


Hence, (1.2.127) for such $E_n$ is valid. □

Remark 1.2.12. Further comments follow on conditions (1.2.115) and (1.2.116) defining the special class of vectors Z for which (b) implies (a) in Lemma 1.2.12. It was shown there that if Z is as in (1.2.115), (1.2.116) and (b) of Lemma 1.2.12, then Z belongs to GDAN with the norming block-diagonal matrix $B_n$ of (1.2.117), (1.2.118) or (1.2.128) in Definition 1.2.2. In the course of the proof, when showing that such matrices are suitable for the convergence in (1.2.127) with the respective matrices $E_n$, assumption (1.2.115) appears to be quite reasonable. More precisely, when $E(Z^{(j)})^2=\infty$ for some j, given the fact that the respective $\ell^{(j)}(n)\nearrow\infty$ are slowly varying and typically unknown functions, condition (1.2.115) well takes care of the nondiagonal elements of $E_n$

$$\frac{\sum_{i=1}^n(Z_i^{(j)}-\bar Z^{(j)})(Z_i^{(k)}-\bar Z^{(k)})}{n\,\ell^{(j)}(n)}\quad\text{and}\quad\frac{\sum_{i=1}^n(Z_i^{(j)}-\bar Z^{(j)})(Z_i^{(k)}-\bar Z^{(k)})}{n\,\ell^{(j)}(n)\,\ell^{(k)}(n)},\quad j\ne k,$$

namely, it brings them to zero in P, as $n\to\infty$. In particular, (1.2.115) may follow from the independence of $Z^{(j)}$ and $Z^{(k)}$. According to the Cauchy-Schwarz inequality, (1.2.115) is also satisfied when, e.g., $Z^{(1)}\in$ DAN with $E(Z^{(1)})^2=\infty$ (from Remark 1.2.1, $E(Z^{(1)})^{2-\Delta}<\infty$ for any $\Delta\in(0,2]$), while $E(Z^{(j)})^{2+\Delta}<\infty$ for some $\Delta>0$, $j=\overline{2,d}$. Condition (1.2.116) is a natural one. It guarantees that the whole vector Z is full and also allows one to use Lemma 1.2.11 for the proof of Lemma 1.2.12. We also note that the cases of diagonal norming matrices $B_n$ in Definition 1.2.2 were earlier considered by Resnick and Greenwood [58] in connection with studies on stable distributions in $\mathrm{IR}^2$.

Motivated by a natural demand for matrix Studentization results for random vectors converging in distribution to a spherically symmetric random vector, Vu, Maller and Klass [64] establish the following rather general result of such a nature.

Lemma 1.2.13. For $n\ge1$, let $C_n$ be $d\times d$ real invertible nonstochastic matrices, $V_n$ be $d\times d$ real symmetric stochastic matrices and $S_n$ be $1\times d$ stochastic vectors.


If, as $n\to\infty$,

$$C_nV_nC_n^T\xrightarrow{P}I_d\tag{1.2.129}$$

and

$$S_nC_n^T\xrightarrow{D}Z,\tag{1.2.130}$$

where Z is a spherically symmetric random vector in $\mathrm{IR}^d$, then

$$S_n\bar V_n^{-T/2}\xrightarrow{D}Z\tag{1.2.131}$$

and

$$S_nV_n^{-1/2}\xrightarrow{D}Z.\tag{1.2.132}$$

Remark 1.2.13. As noted in [64], (1.2.129) implies that $V_n$ is positive definite WPA1, $n\to\infty$. Hence, the square roots of $V_n$ and, consequently, $\bar V_n^{-T/2}$ in (1.2.131) and $V_n^{-1/2}$ in (1.2.132) are well-defined WPA1, $n\to\infty$. It is also noted there that if instead of (1.2.129) and (1.2.130) one assumes, as $n\to\infty$,

$$\bar A_n^{-1/2}V_n\bar A_n^{-T/2}\xrightarrow{P}I_d\quad\text{and}\quad S_n\bar A_n^{-T/2}\xrightarrow{D}Z$$

or (1.2.133)

$$A_n^{-1/2}V_nA_n^{-1/2}\xrightarrow{P}I_d\quad\text{and}\quad S_nA_n^{-1/2}\xrightarrow{D}Z,$$

with $d\times d$ real positive definite matrices $A_n$, $n\ge1$, then the spherical symmetry of Z is not required for the conclusions in the respective (1.2.131) or (1.2.132). Moreover, we note here that if (1.2.133) is assumed to begin with, then the proof of the respective (1.2.131) or (1.2.132) in [64] reduces to showing that, as $n\to\infty$,

$$\bar A_n^{-1/2}V_n\bar A_n^{-T/2}\xrightarrow{P}I_d\quad\text{implies}\quad\bar A_n^{T/2}\bar V_n^{-T/2}=\big(\bar V_n^{-1/2}\bar A_n^{1/2}\big)^T\xrightarrow{P}I_d$$

or (1.2.134)

$$A_n^{-1/2}V_nA_n^{-1/2}\xrightarrow{P}I_d\quad\text{implies}\quad A_n^{1/2}V_n^{-1/2}\xrightarrow{P}I_d.$$


Hence, the nature of the probability space of the vector $S_n$ and thus, also, the kind of convergence in distribution in (1.2.131)-(1.2.133) are irrelevant, and $S_n$ can be viewed as a random element of any metric space.
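The matrix-algebraic character of implications such as (1.2.134) can be illustrated numerically: whenever $A_n^{-1/2}V_nA_n^{-1/2}$ is close to $I_d$, so is $A_n^{1/2}V_n^{-1/2}$. Below is a minimal 2x2 sketch with symmetric positive definite square roots; the matrices A and V and the tolerance are hypothetical illustrations, not from the thesis.

```python
import math

def sym_sqrt2(a, b, c):
    """Symmetric positive definite square root of [[a, b], [b, c]]."""
    s = math.sqrt(a * c - b * b)
    t = math.sqrt(a + c + 2.0 * s)
    return [[(a + s) / t, b / t], [b / t, (c + s) / t]]

def inv2(M):
    a, b, c = M[0][0], M[0][1], M[1][1]
    det = a * c - b * b
    return [[c / det, -b / det], [-b / det, a / det]]

def mul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical A_n-like matrix A and a "sample" matrix V close to A.
A = [[4.0, 2.0], [2.0, 3.0]]
V = [[4.1, 2.05], [2.05, 3.02]]

A_half = sym_sqrt2(A[0][0], A[0][1], A[1][1])                 # A^{1/2}
A_minus_half = inv2(A_half)                                   # A^{-1/2}
Vinv = inv2(V)
V_minus_half = sym_sqrt2(Vinv[0][0], Vinv[0][1], Vinv[1][1])  # V^{-1/2}

M1 = mul2(mul2(A_minus_half, V), A_minus_half)  # A^{-1/2} V A^{-1/2}, near I
M2 = mul2(A_half, V_minus_half)                 # A^{1/2} V^{-1/2},   near I
```

The point of the sketch is that closeness of M1 to the identity forces closeness of M2 to the identity, which is why the probability space of $S_n$ plays no role in (1.2.134).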

For a sequence of random vectors $\{Z_i,\ i\ge1\}$, generalizing the univariate definitions of (1.2.12) and (1.2.13) from Section 1.2.1, we define the multivariate Student statistic

$$V_n(Z)=\sqrt n\,\bar Z\Big((n-1)^{-1}\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\Big)^{-T/2},\tag{1.2.135}$$

and the multivariate Student process in $C([0,1],\mathrm{IR}^d)$

$$V_{n,t}(Z)=n^{-1/2}\sum_{i=1}^{[nt]}Z_i\Big((n-1)^{-1}\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\Big)^{-T/2}$$
$$\qquad+(nt-[nt])\,n^{-1/2}Z_{[nt]+1}\Big((n-1)^{-1}\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\Big)^{-T/2},\tag{1.2.136}$$

where, clearly,

$$\Big(\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\Big)^{-T/2}=\Big(\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\Big)^{-1/2}\tag{1.2.137}$$

if $\big(\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\big)^{1/2}$ is the symmetric positive definite square root of $\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)$. As explained in the forthcoming Remark 1.2.16, if $\{Z,Z_i,\ i\ge1\}$ are i.i.d. random vectors and $Z\in$ GDAN, then the matrix $\big((n-1)^{-1}\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\big)^{-T/2}$ in (1.2.135) and (1.2.136) is well-defined WPA1, $n\to\infty$.

A combination of results on GDAN enables one to conclude the convergence in (a) and (b) of Lemma 1.2.14 below, the respective weak convergence of the multivariate Student statistic of (1.2.135) and the weak invariance principle for the multivariate Student process of (1.2.136) on $(C([0,1],\mathrm{IR}^d),\rho)$. The results in Lemma 1.2.14 are natural multivariate one-sided generalizations of Lemma 1.2.4 and of the equivalence of (a) and (c) of Lemma 1.2.5 in Section 1.2.1. Though, naturally, (b) implies (a) (cf. Lemma 4 in [61] for an equivalent characterization of the convergence in (b) of Lemma 1.2.14), we state (a) and (b) below as two separate statements and also provide independent proofs of them according to their historical order.
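For concreteness, the multivariate Student statistic of (1.2.135) with the symmetric positive definite square root, cf. (1.2.137), can be sketched in the bivariate case as follows. This is a pure Python illustration under stated assumptions (a simulated Gaussian sample; the helper names are hypothetical), not an implementation from the thesis.

```python
import math
import random

random.seed(2005)

def sym_sqrt2(a, b, c):
    """Symmetric positive definite square root of [[a, b], [b, c]]."""
    s = math.sqrt(a * c - b * b)
    t = math.sqrt(a + c + 2.0 * s)
    return (a + s) / t, b / t, (c + s) / t

def inv2(a, b, c):
    """Inverse of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    det = a * c - b * b
    return c / det, -b / det, a / det

def student_statistic(Z):
    """Sketch of V_n(Z) of (1.2.135) with the symmetric square root:
    sqrt(n) * Zbar * C^{-1/2}, where
    C = (n-1)^{-1} sum_i (Z_i - Zbar)^T (Z_i - Zbar)."""
    n = len(Z)
    zb = [sum(z[k] for z in Z) / n for k in range(2)]
    c11 = sum((z[0] - zb[0]) ** 2 for z in Z) / (n - 1)
    c12 = sum((z[0] - zb[0]) * (z[1] - zb[1]) for z in Z) / (n - 1)
    c22 = sum((z[1] - zb[1]) ** 2 for z in Z) / (n - 1)
    # C^{-1/2} = symmetric square root of C^{-1}
    r11, r12, r22 = sym_sqrt2(*inv2(c11, c12, c22))
    return (math.sqrt(n) * (zb[0] * r11 + zb[1] * r12),
            math.sqrt(n) * (zb[0] * r12 + zb[1] * r22))

# Illustrative sample: 400 i.i.d. centered Gaussian vectors in IR^2.
Z = [[random.gauss(0.0, 1.0), random.gauss(0.0, 2.0)] for _ in range(400)]
V = student_statistic(Z)
```

By Lemma 1.2.14, for i.i.d. data in GDAN with mean zero, such a vector V is approximately N(0, I_2) in distribution for large n, whatever the (unknown) scaling of the coordinates.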


Lemma 1.2.14. Let $\{Z,Z_i,\ i\ge1\}$ be i.i.d. random vectors in $\mathrm{IR}^d$ and $Z\in$ GDAN. Then, as $n\to\infty$,

(a) $V_n(Z-EZ)\xrightarrow{D}N(0,I_d)$;

(b) $V_{n,t}(Z-EZ)\xrightarrow{D}W_d(t)$ on $(C([0,1],\mathrm{IR}^d),\rho)$.

Proof. As noted in the Remarks of [64], (a) follows from the conclusions of Lemma 1.2.13 via combining (1.2.112) and (b) of Lemma 1.2.11 with $B_n$ invertible for large n (cf. Remark 1.2.10). Applying Lemma 1.2.11, we note that Z in GDAN has a full distribution (cf. Remark 1.2.9). The proof of part (b) is based on the same arguments, only (1.2.112) is replaced here with the weak convergence, as $n\to\infty$,

$$\sum_{i=1}^{[nt]}(Z_i-EZ)B_n+(nt-[nt])(Z_{[nt]+1}-EZ)B_n\xrightarrow{D}W_d(t)\ \text{on }(C([0,1],\mathrm{IR}^d),\rho),\tag{1.2.138}$$

which amounts to Theorem 1 in Sepanski [61], with the norming matrix $B_n$ as in (1.2.112). □

Remark 1.2.14. Though the Studentization in Lemma 1.2.14 and its more general parental result in Lemma 1.2.13 can be performed by both the Cholesky and symmetric positive definite square roots, one's preference, as pointed out in Vu, Maller and Klass [64], depends on the problem at hand. For example, it is also conjectured in [64] that for the purpose of transforming $\sqrt n(\bar Z-EZ)$ or $\sqrt n(\bar Z-EZ)^T$ so as to have approximately a spherically symmetric distribution, the symmetric positive definite square root is likely a better choice to accomplish this in small samples.

In general, as far as we know, it remains an open problem whether or not there exists a converse to conclusion (a) of Lemma 1.2.14 and thus, also, one to (b) of this lemma, as opposed to the recently obtained univariate characterization results in [20] and [16], [17] (cf. Lemmas 1.2.4 and 1.2.5 of Section 1.2.1). However, we are now to point out a subclass of GDAN which is characterized by asymptotic normality of all


the univariate Student statistics $T_n(Z^{(j)})$ of (1.2.12) that correspond to the multivariate Student statistic $V_n(Z)$ of (1.2.135), rather than by asymptotic normality of (1.2.135) itself (cf. more in Remark 1.2.15 below). For doing this, we introduce the class of random vectors

$$\mathcal Z:=\big\{\text{random vectors } Z\in\mathrm{IR}^d\text{ such that } Z\text{ has a full spherically symmetric distribution and/or } Z\text{ satisfies (1.2.115) and (1.2.116)}\big\}.\tag{1.2.139}$$

Combining the results of Lemma 1.2.5 of Section 1.2.1 and Lemma 1.2.12 of the present Section 1.2.3, we summarize various characterizations of GDAN for random vectors from $\mathcal Z$ as in (1.2.139) in the following Lemma 1.2.15.

Lemma 1.2.15. Let $\{Z,Z_i,\ i\ge1\}$ be i.i.d. random vectors from $\mathcal Z$ as in (1.2.139). Consider the processes $V_{n,t}(Z)$ from (1.2.136) and $T_{n,t}(Z^{(j)})$ of (1.2.13), $j=\overline{1,d}$. As $n\to\infty$, the following statements are equivalent:

(a) $Z\in$ GDAN and $EZ=a$;

(b) $Z^{(j)}\in$ DAN and $EZ^{(j)}=a^{(j)}$, for all $j=\overline{1,d}$;

(c) $T_{n,1}(Z^{(j)}-a^{(j)})\xrightarrow{D}N(0,1)$, for all $j=\overline{1,d}$;

(d) $T_{n,t}(Z^{(j)}-a^{(j)})\xrightarrow{D}W(t)$ on $(D[0,1],\rho)$, for all $j=\overline{1,d}$;

(e) for each $j=\overline{1,d}$, on an appropriate probability space for $\{Z^{(j)},Z_i^{(j)},\ i\ge1\}$, we can construct a standard Wiener process $\{W(t),\ 0\le t<\infty\}$ such that

$$\sup_{0\le t\le1}\Big|T_{n,t}(Z^{(j)}-a^{(j)})-\frac{W(nt)}{\sqrt n}\Big|=o_P(1);$$

(f) $V_{n,t_0}(Z-a)\xrightarrow{D}N(0,t_0I_d)$ and $T_{n,t_0}(Z^{(j)}-a^{(j)})\xrightarrow{D}N(0,t_0)$, for $t_0\in(0,1]$, for all $j=\overline{1,d}$;

(g) $V_{n,t}(Z-a)\xrightarrow{D}W_d(t)$ on $(C([0,1],\mathrm{IR}^d),\rho)$ and $T_{n,t}(Z^{(j)}-a^{(j)})\xrightarrow{D}W(t)$ on $(D[0,1],\rho)$, for all $j=\overline{1,d}$.


Proof. The equivalence of (b), (c), (d) and (e) is due to Lemma 1.2.5 of Section 1.2.1. The equivalence of (a), (b), (f) and (g) is argued as follows. From Lemmas 1.2.12 and 1.2.14, (a) yields the multivariate weak invariance principle in (g) and the conclusions of (b), which, in turn, via Lemma 1.2.5 of Section 1.2.1, lead to the second part of (g), i.e., to the weak convergence of $T_{n,t}(Z^{(j)}-a^{(j)})$, for all $j=\overline{1,d}$. Clearly, (g) implies (f), and (f) (with $t_0=1$) results in (b) according to Lemma 1.2.5. On account of Remark 1.2.11 for spherically symmetric vectors and Lemma 1.2.12 for vectors satisfying (1.2.115) and (1.2.116), (b) implies (a). □

Remark 1.2.15. Parts (f) and (g) of Lemma 1.2.15 indicate the major difficulties one faces to answer, even in our special context, the fundamental open characterization question concerning when the multivariate Student statistic of (1.2.135) is asymptotically standard normal and, consequently, when its corresponding process in (1.2.136) is asymptotically a standard Wiener process. Thus, in order to conclude that (f) implies (a), being unable to work with the multivariate Student statistic $V_{n,1}(Z-a)$ as such and make use of its weak convergence, we employ the weak convergence of its corresponding univariate Student statistics $T_{n,1}(Z^{(j)}-a^{(j)})$, $j=\overline{1,d}$, and proceed via (b). Consequently, (a) is characterized by asymptotic normality of $T_{n,1}(Z^{(j)}-a^{(j)})$ for all $j=\overline{1,d}$, rather than by that of $V_{n,1}(Z-a)$ itself. Alternatively, (a) is valid whenever all the Student processes $T_{n,t}(Z^{(j)}-a^{(j)})$ of the corresponding multivariate Student process $V_{n,t}(Z-a)$ are asymptotically standard Wiener processes (cf. (d) of Lemma 1.2.15).

Remark 1.2.16. If $\{Z,Z_i,\ i\ge1\}$ are i.i.d. random vectors and $Z\in$ GDAN (in particular, according to Lemma 1.2.12, the components $Z^{(j)}$ may be all in DAN, while $Z\in\mathcal Z$), then Z is full (cf. Remark 1.2.9) and we have (b) of Lemma 1.2.11 with an invertible matrix $B_n$ (cf. Remark 1.2.10). Hence, on account of Remark 1.2.13, the matrix $\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)$ is positive definite and thus, its Cholesky and symmetric positive definite square roots, as well as $\big(\sum_{i=1}^n(Z_i-\bar Z)^T(Z_i-\bar Z)\big)^{-T/2}$, are well-defined WPA1, $n\to\infty$. This also means that the denominators $\big(\sum_{i=1}^n(Z_i^{(j)}-\bar Z^{(j)})^2/(n-1)\big)^{1/2}$ of the processes as in (1.2.13) are well-defined WPA1, $n\to\infty$.

1.2.4 Auxiliary Results and Proofs of Theorems 1.1.4, 1.1.5 and Observation 1.1.2

This subsection provides the proofs for our main multivariate results of Section 1.1.4.

For the sake of proving Theorems 1.1.4 and 1.1.5, first, two auxiliary random vectors are studied, namely those in (1.2.143) and (1.2.151) below, multivariate analogues of the processes in (1.2.42) and (1.2.52) at t = 1 used in Section 1.2.2. First, we introduce the random vector of (1.2.143) and establish Lemma 1.2.16 for it. This enables us then to examine another auxiliary random vector, the one in (1.2.151), via obtaining Lemma 1.2.17 with related remarks. The latter random vector is a special multivariate Student statistic (cf. Remark 1.2.20) that plays a crucial role in Chapter 1. More precisely, the random vector of (1.2.151) is the prototype for the properly centered and normalized joint estimators studied in our main multivariate results and also for those random vectors that correspond to all reasonable joint estimators based on $(\bar y,\bar x,S_{yy},S_{xy},S_{xx})$. Hence, our auxiliary Lemma 1.2.17, a summary of the invariance principles for (1.2.151), can also be useful beyond Chapter 1 in the context of (1.1.1)-(1.1.2). Section 1.2.4 is a companion to Section 1.2.2, based on the developments of Sections 1.2.1 and 1.2.3.

For notations, definitions and abbreviations that are used below and have been introduced earlier, one may also refer to the corresponding list provided before the Introduction in this thesis.

We are now to introduce the first of the two aforementioned auxiliary random vectors. Let $b_1,\cdots,b_d$ be vectors of constants in $\mathrm{IR}^7$ whose first two coordinates $b_j^{(1)}$ and $b_j^{(2)}$ are such that

when $\mathrm{Var}\,\xi=\infty$, $|b_1^{(1)}|+|b_1^{(2)}|>0$, while $b_j^{(1)}=b_j^{(2)}=0$ for $j=\overline{2,d}$. (1.2.140)

Define the d-dimensional vectors

$$J=\big(\langle\zeta,b_1\rangle,\cdots,\langle\zeta,b_d\rangle\big)\tag{1.2.141}$$

and

$$J_i=\big(\langle\zeta_i,b_1\rangle,\cdots,\langle\zeta_i,b_d\rangle\big),\quad i=\overline{1,n},\tag{1.2.142}$$

where $\zeta$ and $\zeta_i$ are as in (1.2.40) and (1.2.41). Using the notations in (1.2.141) and (1.2.142), we consider a special case of the multivariate Student statistic in (1.2.135) as follows:

$$\sqrt n\,\bar J\Big((n-1)^{-1}\sum_{i=1}^n(J_i-\bar J)^T(J_i-\bar J)\Big)^{-T/2},\tag{1.2.143}$$

and assume that

J is full, if $\mathrm{Var}\,\xi<\infty$ and/or $b_1^{(1)}=b_1^{(2)}=0$, (1.2.144)

and

$J^{(2,d)}=(J^{(2)},\cdots,J^{(d)})$ is full, if $\mathrm{Var}\,\xi=\infty$ and $|b_1^{(1)}|+|b_1^{(2)}|>0$. (1.2.145)

It will be argued in Remark 1.2.17 that under the conditions of the forthcoming Lemma 1.2.16, such a random vector of (1.2.143) is well-defined WPA1, $n\to\infty$. We present a multivariate extension of Lemma 1.2.8 of Section 1.2.2 on obtaining a CLT for (1.2.143) as follows.

Lemma 1.2.16. Let (B), (D) and (E) be valid. Consider the random vector in (1.2.143) that is defined via the deterministic vectors $b_1,\cdots,b_d$ in $\mathrm{IR}^7$ satisfying (1.2.140). Suppose that (1.2.144) and (1.2.145) hold true. Then, as $n\to\infty$,

$$\sqrt n\,\bar J\Big((n-1)^{-1}\sum_{i=1}^n(J_i-\bar J)^T(J_i-\bar J)\Big)^{-T/2}\xrightarrow{D}N(0,I_d).$$


Proof. The proof is based on Lemma 1.2.15 of Section 1.2.3.

First, we are to show that the vector J from (1.2.141) belongs to the special class of random vectors $\mathcal Z$ defined by (1.2.139). In fact, the components of J obey (1.2.115) and (1.2.116). Indeed, if $\mathrm{Var}\,\xi<\infty$ and/or $b_1^{(1)}=b_1^{(2)}=0$, then $E\langle\zeta,b_j\rangle^2<\infty$ for all $j=\overline{1,d}$, and also, due to (1.2.144), conditions (1.2.115) and (1.2.116) are clearly satisfied. If $\mathrm{Var}\,\xi=\infty$ and $|b_1^{(1)}|+|b_1^{(2)}|>0$, then (1.2.140) implies that

$$E\langle\zeta,b_1\rangle^2=\infty,\quad\text{while}\quad E\langle\zeta,b_j\rangle^2<\infty\ \text{for all } j=\overline{2,d}.\tag{1.2.146}$$

In this case (1.2.116) is guaranteed by (1.2.145), while (1.2.115) follows from

$$E\big|\langle\zeta,b_1\rangle\langle\zeta,b_j\rangle\big|<\infty\quad\text{for all } j=\overline{2,d},$$

which is argued similarly to (1.2.47). Thus, on account of Lemma 1.2.15 of Section 1.2.3 and the fact that $EJ_i=0$ for the i.i.d. random vectors $J_i$, the conclusion of Lemma 1.2.16 holds true if

$$\langle\zeta,b_j\rangle\in\mathrm{DAN}\quad\text{for all } j=\overline{1,d}.\tag{1.2.147}$$

For each fixed j, (1.2.147) follows from (1.2.44) of the proof of Lemma 1.2.8 in Section 1.2.2, by noticing that condition (1.2.43) of Lemma 1.2.8 for $\langle\zeta,b_j\rangle$ is now a part of (1.2.144) and (1.2.145). □

Remark 1.2.17. Since, from the proof of Lemma 1.2.16, $J\in\mathcal Z$ and (1.2.147) holds true, where the vector J is from (1.2.141) and $\mathcal Z$ is defined via (1.2.139), then on account of Remark 1.2.16 with regard to the i.i.d. random vectors $\{J,J_i,\ i\ge1\}$, the matrix $\big((n-1)^{-1}\sum_{i=1}^n(J_i-\bar J)^T(J_i-\bar J)\big)^{-T/2}$ in (1.2.143) is well-defined WPA1, $n\to\infty$.

We are to define the other auxiliary random vector of this subsection in (1.2.151) below, which is the prototype for the main terms in the expansions for the properly centered and normalized joint estimators of Theorems 1.1.4 and 1.1.5.


Let $c_1,\cdots,c_d$ be vectors of constants in $\mathrm{IR}^5$ whose components satisfy

$$c_j^{(1)}\beta+c_j^{(2)}=0\quad\text{and}\quad c_j^{(3)}\beta^2+c_j^{(4)}\beta+c_j^{(5)}=0\quad\text{for all } j=\overline{1,d}.\tag{1.2.148}$$

Let

$$b_j=\big(2\beta c_j^{(1)}+c_j^{(2)},\ \beta c_j^{(2)}+2c_j^{(3)},\ c_j^{(1)},\ c_j^{(2)},\ c_j^{(3)},\ c_j^{(4)},\ c_j^{(5)}\big),\quad j=\overline{1,d}.\tag{1.2.149}$$

Using $c_1,\cdots,c_d$ and the vector $\eta_i(n)$ in (1.2.71), we put

$$K_i(n)=\big(\langle\eta_i(n),c_1\rangle,\cdots,\langle\eta_i(n),c_d\rangle\big),\quad i=\overline{1,n},\tag{1.2.150}$$

and define the random vector

$$\sqrt n\,\bar K(n)\Big((n-1)^{-1}\sum_{i=1}^n(K_i(n)-\bar K(n))^T(K_i(n)-\bar K(n))\Big)^{-T/2}.\tag{1.2.151}$$

Clearly, if $\big(\sum_{i=1}^n(K_i(n)-\bar K(n))^T(K_i(n)-\bar K(n))\big)^{1/2}$ is the symmetric square root of $\sum_{i=1}^n(K_i(n)-\bar K(n))^T(K_i(n)-\bar K(n))$, then $\big(\sum_{i=1}^n(K_i(n)-\bar K(n))^T(K_i(n)-\bar K(n))\big)^{-T/2}=\big(\sum_{i=1}^n(K_i(n)-\bar K(n))^T(K_i(n)-\bar K(n))\big)^{-1/2}$. We note that (1.2.151) is a multivariate version of $M_{n,t}$ at t = 1 from (1.2.52), the key auxiliary process for Section 1.2.2. Indeed, as seen from Remark 1.2.7,

$$\frac{\sqrt n\,\langle\bar\eta(n),c_j\rangle}{\big(\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),c_j\rangle^2/(n-1)\big)^{1/2}}=M_{n,1},\tag{1.2.152}$$

with $M_{n,1}$ defined with $c_j$ in place of $c$ in (1.2.52). The following Lemma 1.2.17 on the CLT for the random vector in (1.2.151) is a multivariate extension of Lemma 1.2.10 that dealt with $M_{n,t}$.

Lemma 1.2.17. Let (B), (D) and (E) be valid. Consider the random vector in (1.2.151) defined via (1.2.150) and the deterministic vectors $c_1,\cdots,c_d$ in $\mathrm{IR}^5$ satisfying (1.2.148). Assume also (1.2.140), (1.2.144) and (1.2.145) with the vectors $b_1,\cdots,b_d$ in $\mathrm{IR}^7$ defined by $c_1,\cdots,c_d$ according to (1.2.149). Then, as $n\to\infty$,

$$\sqrt n\,\bar K(n)\Big((n-1)^{-1}\sum_{i=1}^n(K_i(n)-\bar K(n))^T(K_i(n)-\bar K(n))\Big)^{-T/2}\xrightarrow{D}N(0,I_d).$$

Proof. If the intercept $\alpha$ is known to be zero, then, due to (1.2.53),

$$K_i(n) = J_i, \eqno(1.2.153)$$

where $J_i$ is as in (1.2.142), and hence,

$$\sqrt{n}\,\overline{K}(n)\Bigl((n-1)^{-1}\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)^{-T/2} = \sqrt{n}\,\overline{J}\Bigl((n-1)^{-1}\sum_{i=1}^n \bigl(J_i - \overline{J}\bigr)^T\bigl(J_i - \overline{J}\bigr)\Bigr)^{-T/2}. \eqno(1.2.154)$$

Thus, in case $\alpha = 0$, the statement of this lemma is identical to that of the just proved Lemma 1.2.16. Assume now that $\alpha \neq 0$. From (1.2.53),

$$\sqrt{n}\,\overline{K}(n)\Bigl((n-1)^{-1}\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)^{-T/2}$$
$$= \Bigl(\sqrt{n}\,\overline{J}\bigl((n-1)^{-1}\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\bigr)^{-T/2} + \sqrt{n}\,\overline{Q}(n)\bigl((n-1)^{-1}\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\bigr)^{-T/2}\Bigr)$$
$$\times \Bigl(\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\Bigr)^{T/2}\Bigl(\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)^{-T/2}, \eqno(1.2.155)$$

where the vector $\sqrt{n}\,\overline{Q}(n) = \bigl(\sqrt{n}\,\overline{Q}^{(1)}(n), \dots, \sqrt{n}\,\overline{Q}^{(d)}(n)\bigr)$,

and, according to (1.2.55) and (1.2.140) for $b_j$, each component $\sqrt{n}\,\overline{Q}^{(j)}(n)$ of $\sqrt{n}\,\overline{Q}(n)$ is such that

$$\bigl|\sqrt{n}\,\overline{Q}^{(j)}(n)\bigr| \big/ f_n^{(j)} = o_P(1), \quad n \to \infty, \eqno(1.2.156)$$


where

$$f_n^{(1)} = \begin{cases} 1, & \text{if } \operatorname{Var}\xi < \infty \text{ and/or } b_1^{(1)} = b_1^{(2)} = 0,\\ \ell_\xi^{1/2}(n), & \text{if } \operatorname{Var}\xi = \infty \text{ and } |b_1^{(1)}| + |b_1^{(2)}| > 0, \end{cases} \qquad f_n^{(j)} = 1 \text{ for all } j = \overline{2,d}, \eqno(1.2.157)$$

with the slowly varying function $\ell_\xi(n)$ of (1.2.28), such that $\ell_\xi(n) \nearrow \infty$ when $\operatorname{Var}\xi = \infty$, $n \to \infty$. If in (1.2.155), as $n \to \infty$,

$$\Bigl\|\sqrt{n}\,\overline{Q}(n)\Bigl((n-1)^{-1}\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\Bigr)^{-T/2}\Bigr\| = o_P(1) \eqno(1.2.158)$$

and

$$\Bigl(\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\Bigr)^{T/2}\Bigl(\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)^{-T/2} \xrightarrow{P} I_d, \eqno(1.2.159)$$

then Lemma 1.2.17 follows from Lemma 1.2.16 for $\sqrt{n}\,\overline{J}\bigl((n-1)^{-1}\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\bigr)^{-T/2}$.

First, we are to show (1.2.158). Introduce the nonrandom matrix

$$B_n = \begin{cases} n^{-1/2}\operatorname{diag}\Bigl(\bigl(\ell_\xi^{1/2}(n)\sqrt{\operatorname{Var}(b_1^{(1)}\delta + b_1^{(2)}\varepsilon)}\bigr)^{-1},\; \bigl(\operatorname{Cov}J^{(2,d)}\bigr)^{-1/2}\Bigr), & \text{if } \operatorname{Var}\xi = \infty \text{ and } |b_1^{(1)}| + |b_1^{(2)}| > 0,\\ n^{-1/2}\bigl(\operatorname{Cov}J\bigr)^{-1/2}, & \text{otherwise}, \end{cases} \eqno(1.2.160)$$

where the vectors $b_1$ and $J^{(2,d)}$ are defined in (1.2.149) and (1.2.145), respectively. Note that $B_n$ is well-defined on account of (1.2.46) of Section 1.2.2, (1.2.144) and (1.2.145). Interpreting (1.2.158) as a degenerate weak convergence on $(\mathbb{R}^d, \|\cdot\|)$, and using the fact from Remark 1.2.13 that (1.2.133) implies the respective statements of (1.2.131), (1.2.132) with a not necessarily spherically symmetric $Z$, for (1.2.158) it suffices to show that, as $n \to \infty$,

$$B_n\Bigl(\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\Bigr)B_n^T \xrightarrow{P} I_d \eqno(1.2.161)$$

and

$$\bigl\|\sqrt{n(n-1)}\,\overline{Q}(n)\,B_n^T\bigr\| = o_P(1). \eqno(1.2.162)$$


Convergence in (1.2.161) follows from the fact that $J \in \mathcal{Z}$ (shown in the proof of Lemma 1.2.16), and from convergence in (1.2.127) for $E_n$ of (1.2.120) defined with $B_n$ as in (1.2.117) or (1.2.118) (cf. the proof of Lemma 1.2.12). In this regard, we also note that the correspondence of $B_n$ in (1.2.160) to $B_n$ in (1.2.117) or (1.2.118) is seen via (1.2.49) in Section 1.2.2, with $b_1$ instead of $b$, and (1.2.140). As to (1.2.162), when $\operatorname{Var}\xi < \infty$ and/or $b_1^{(1)} = b_1^{(2)} = 0$, it is a direct consequence of (1.2.156), (1.2.157) and (1.2.160). If $\operatorname{Var}\xi = \infty$ and $|b_1^{(1)}| + |b_1^{(2)}| > 0$, then, due to (1.2.156) and (1.2.157),

$$\bigl\|\sqrt{n(n-1)}\,\overline{Q}(n)\,B_n^T\bigr\| \le \bigl|\sqrt{n(n-1)}\,\overline{Q}^{(1)}(n)\,n^{-1/2}\bigl(\ell_\xi^{1/2}(n)\sqrt{\operatorname{Var}(b_1^{(1)}\delta + b_1^{(2)}\varepsilon)}\bigr)^{-1}\bigr|$$
$$+ \bigl\|\sqrt{n(n-1)}\,\overline{Q}^{(2,d)}(n)\,n^{-1/2}\bigl(\operatorname{Cov}J^{(2,d)}\bigr)^{-T/2}\bigr\| = o_P(1). \eqno(1.2.163)$$

For establishing (1.2.159), it suffices to prove that, as $n \to \infty$,

$$\Bigl(\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\Bigr)^{T/2}B_n^T \xrightarrow{P} I_d \eqno(1.2.164)$$

and

$$\bigl(B_n^{-1}\bigr)^T\Bigl(\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)^{-T/2} \xrightarrow{P} I_d, \eqno(1.2.165)$$

with the matrix $B_n$ given by (1.2.160). Since

$$\Bigl(\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\Bigr)^{T/2}B_n^T = \Bigl(\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})\Bigr)^{-T/2}B_n^{-1}\Bigl(B_n\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})B_n^T\Bigr),$$

then (1.2.164) follows from convergence in (1.2.161) and (1.2.134) with the symmetric matrix $V_n = \sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})$ and $A^{-1/2} = B_n$. Similarly, (1.2.165) is a direct consequence of (1.2.134) and

$$B_n\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)B_n^T \xrightarrow{P} I_d, \quad n \to \infty. \eqno(1.2.166)$$


The latter convergence is left to be verified, which we now proceed to do. Defining the vector

$$\Delta_i(n) := K_i(n) - \overline{K}(n) - (J_i - \overline{J}), \eqno(1.2.167)$$

we have

$$\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr) = \sum_{i=1}^n \bigl(J_i - \overline{J} + \Delta_i(n)\bigr)^T\bigl(J_i - \overline{J} + \Delta_i(n)\bigr)$$
$$= \sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J}) + \sum_{i=1}^n (J_i - \overline{J})^T\Delta_i(n) + \sum_{i=1}^n \Delta_i^T(n)(J_i - \overline{J}) + \sum_{i=1}^n \Delta_i^T(n)\Delta_i(n). \eqno(1.2.168)$$

Due to (1.2.63) of Lemma 1.2.10 in Section 1.2.2 (condition (1.2.43) with $b_j$ in place of $b$ is satisfied on account of (1.2.140), (1.2.144) and (1.2.145)), for each component $\Delta_i^{(j)}(n)$ of $\Delta_i(n)$,

$$\frac{1}{n}\sum_{i=1}^n \bigl(\Delta_i^{(j)}(n)\bigr)^2 = o_P(1), \quad\text{for all } j = \overline{1,d}. \eqno(1.2.169)$$

From (1.2.169) and the Cauchy-Schwarz inequality applied to each component of the matrix $n^{-1}\sum_{i=1}^n \Delta_i^T(n)\Delta_i(n)$, as $n \to \infty$,

$$\frac{1}{n}\sum_{i=1}^n \Delta_i^T(n)\Delta_i(n) \xrightarrow{P} 0, \eqno(1.2.170)$$

and therefore,

$$B_n\sum_{i=1}^n \Delta_i^T(n)\Delta_i(n)B_n^T \xrightarrow{P} 0, \eqno(1.2.171)$$

with the matrix $B_n$ of (1.2.160). Now, in view of (1.2.161), (1.2.168) and (1.2.171), in order to establish (1.2.166), we only need to show that

$$B_n\Bigl(\sum_{i=1}^n (J_i - \overline{J})^T\Delta_i(n) + \sum_{i=1}^n \Delta_i^T(n)(J_i - \overline{J})\Bigr)B_n^T \xrightarrow{P} 0, \quad n \to \infty. \eqno(1.2.172)$$

As $n \to \infty$, when $\operatorname{Var}\xi < \infty$ and/or $b_1^{(1)} = b_1^{(2)} = 0$, (1.2.172) follows from the convergence

$$\frac{1}{n}\sum_{i=1}^n (J_i - \overline{J})^T\Delta_i(n) + \frac{1}{n}\sum_{i=1}^n \Delta_i^T(n)(J_i - \overline{J}) \xrightarrow{P} 0, \eqno(1.2.173)$$


resulting from (1.2.140), (1.2.48) of Section 1.2.2 for $\sum_{i=1}^n (J_i^{(j)} - \overline{J}^{(j)})^2$, (1.2.169) and the Cauchy-Schwarz inequality, since

$$\frac{1}{n}\Bigl|\sum_{i=1}^n \bigl(J_i^{(j)} - \overline{J}^{(j)}\bigr)\Delta_i^{(k)}(n)\Bigr| \le \Bigl(\frac{1}{n}\sum_{i=1}^n \bigl(J_i^{(j)} - \overline{J}^{(j)}\bigr)^2\Bigr)^{1/2}\Bigl(\frac{1}{n}\sum_{i=1}^n \bigl(\Delta_i^{(k)}(n)\bigr)^2\Bigr)^{1/2} = O_P(1)o_P(1) = o_P(1), \quad\text{for all } j, k = \overline{1,d}.$$

If $\operatorname{Var}\xi = \infty$ and $|b_1^{(1)}| + |b_1^{(2)}| > 0$, then the proof of (1.2.172) goes as follows. Similarly to (1.2.120) of Section 1.2.3, we have

$$B_n\Bigl(\sum_{i=1}^n (J_i - \overline{J})^T\Delta_i(n) + \sum_{i=1}^n \Delta_i^T(n)(J_i - \overline{J})\Bigr)B_n^T = W_n, \eqno(1.2.174)$$

where the symmetric matrix $W_n$ is given via its matrix blocks as follows:

$$W_n^{(1,1)} = \frac{\sum_{i=1}^n 2\bigl(J_i^{(1)} - \overline{J}^{(1)}\bigr)\Delta_i^{(1)}(n)}{n\,\ell_\xi(n)\operatorname{Var}\bigl(b_1^{(1)}\delta + b_1^{(2)}\varepsilon\bigr)},$$
$$W_n^{(1,k)} = \bigl(W_n^{(k,1)}\bigr)^T = \mathrm{const}\,\frac{\sum_{i=1}^n \Bigl(\bigl(J_i^{(1)} - \overline{J}^{(1)}\bigr)\Delta_i^{(k)}(n) + \bigl(J_i^{(k)} - \overline{J}^{(k)}\bigr)\Delta_i^{(1)}(n)\Bigr)}{n\,\ell_\xi^{1/2}(n)}, \quad k = \overline{2,d}, \eqno(1.2.175)$$

and

$$W_n^{(2,d)} = \bigl(\operatorname{Cov}J^{(2,d)}\bigr)^{-1/2}\,\frac{\sum_{i=1}^n \Bigl(\bigl(J_i^{(2,d)} - \overline{J}^{(2,d)}\bigr)^T\Delta_i^{(2,d)}(n) + \bigl(\Delta_i^{(2,d)}(n)\bigr)^T\bigl(J_i^{(2,d)} - \overline{J}^{(2,d)}\bigr)\Bigr)}{n}\,\bigl(\operatorname{Cov}J^{(2,d)}\bigr)^{-T/2}, \eqno(1.2.176)$$

where $\ell_\xi(n)$ is from (1.2.28) and such that $\ell_\xi(n) \nearrow \infty$ when $\operatorname{Var}\xi = \infty$, $n \to \infty$. Condition (1.2.140) and statements (1.2.49) and (1.2.48) of Remark 1.2.5 in Section


1.2.2, applied respectively to $\sum_{i=1}^n (J_i^{(1)} - \overline{J}^{(1)})^2$ and $\sum_{i=1}^n (J_i^{(j)} - \overline{J}^{(j)})^2$, $j = \overline{2,d}$, yield

$$\frac{\sum_{i=1}^n \bigl(J_i^{(1)} - \overline{J}^{(1)}\bigr)^2}{n\,\ell_\xi(n)} = O_P(1), \quad\text{while}\quad \frac{\sum_{i=1}^n \bigl(J_i^{(j)} - \overline{J}^{(j)}\bigr)^2}{n} = O_P(1) \text{ for all } j = \overline{2,d}. \eqno(1.2.177)$$

Now, (1.2.169), (1.2.177) and the Cauchy-Schwarz inequality result in

$$W_n \xrightarrow{P} 0, \quad n \to \infty. \eqno(1.2.178)$$

This leads to (1.2.172) in case $\operatorname{Var}\xi = \infty$ and $|b_1^{(1)}| + |b_1^{(2)}| > 0$ and, also, completes the proof of (1.2.166). Hence, on (1.2.166) implying (1.2.165), and (1.2.164) and (1.2.165) implying (1.2.159), we finally conclude Lemma 1.2.17 when $\alpha \neq 0$ by (1.2.155), (1.2.158), (1.2.159) and Lemma 1.2.16. $\square$

Remark 1.2.18. We note that $\bigl((n-1)^{-1}\sum_{i=1}^n (K_i(n) - \overline{K}(n))^T(K_i(n) - \overline{K}(n))\bigr)^{-T/2}$ in (1.2.151) is well-defined WPA1, $n \to \infty$. This is because the matrix $\sum_{i=1}^n (K_i(n) - \overline{K}(n))^T(K_i(n) - \overline{K}(n)) > 0$ WPA1, $n \to \infty$, which follows from convergence in (1.2.166) with the invertible matrix $B_n$ and the symmetric $\sum_{i=1}^n (K_i(n) - \overline{K}(n))^T(K_i(n) - \overline{K}(n))$, by Remark 1.2.13.

Remark 1.2.19. In (a)-(c) below, we provide some assumptions that are sufficient for, and equivalent to, those in (1.2.144) and (1.2.145) of Lemma 1.2.17. (a) First, we give some reasonable sufficient conditions for (1.2.144) and (1.2.145). On using the notion of a full vector, assumption (1.2.144) is satisfied if the vector $\zeta$ in (1.2.40) is full and the nonrandom vectors $b_1, \dots, b_d$ in (1.2.144) are linearly independent. Similarly, fullness of $\zeta$ and linear independence of $b_2, \dots, b_d$ guarantee (1.2.145). Following the lines of the proof of Lemma 1.2.8 (the paragraph where $\operatorname{Var}\langle\zeta, b\rangle > 0$ is shown under $\operatorname{Var}\xi < \infty$ and $|b^{(1)}| + |b^{(2)}| > 0$), it is not hard to see that $\zeta$ of (1.2.40) is full if and only if $\det(\operatorname{Cov}\zeta) > 0$ (by (B), $0 < \operatorname{tr}(\operatorname{Cov}\zeta)$), while

$\det(\operatorname{Cov}\zeta) > 0$ holds true if and only if the vector $\zeta^{(3,7)} = (\delta, \varepsilon, \delta\varepsilon - \mu, \delta^2 - \lambda\theta, \varepsilon^2 - \theta)$ is full or, equivalently, $(\delta, \varepsilon, \delta\varepsilon, \delta^2, \varepsilon^2)$ is full. The latter error-based vector is full


if, e.g., $E\delta^3 = E\varepsilon^3 = 0$ and all the error cross-moments of order $\le 4$ are zeroes. As to verifying linear independence of the nonrandom vectors $b_1, \dots, b_d$ in (1.2.144) or $b_2, \dots, b_d$ in (1.2.145), having concrete vectors at hand, one can turn to computer software for doing this.
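Such a software check reduces to a matrix rank computation. The snippet below is a hypothetical illustration (the vectors shown are made up for demonstration and are not the $b_j$'s of (1.2.144)):

```python
import numpy as np

# Hypothetical vectors for illustration only; the b_j of (1.2.144) are not these.
def linearly_independent(vectors):
    B = np.vstack(vectors)                       # rows are the candidate vectors
    return np.linalg.matrix_rank(B) == len(vectors)

b1 = np.array([1.0, 0, 0, 1, 0, 0, 0])
b2 = np.array([0.0, 1, 0, 0, 1, 0, 0])
b3 = np.array([0.0, 0, 1, 0, 0, 1, 0])
print(linearly_independent([b1, b2, b3]))       # True: rank 3
print(linearly_independent([b1, b2, b1 + b2]))  # False: third row is dependent
```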

(b) Using the lines of the proof of Lemma 1.2.8 (the paragraph where $\operatorname{Var}\langle\zeta, b\rangle > 0$ is shown under $\operatorname{Var}\xi < \infty$ and $|b^{(1)}| + |b^{(2)}| > 0$), it is not hard to see that (1.2.144) and (1.2.145) are satisfied if and only if $\det\bigl(\operatorname{Cov}(\langle\zeta, b_1\rangle, \dots, \langle\zeta, b_d\rangle)\bigr) > 0$ and $\det\bigl(\operatorname{Cov}(\langle\zeta, b_2\rangle, \dots, \langle\zeta, b_d\rangle)\bigr) > 0$, respectively (by (B), $0 < \operatorname{tr}\operatorname{Cov}(\langle\zeta, b_1\rangle, \dots, \langle\zeta, b_d\rangle) < \infty$ and $0 < \operatorname{tr}\operatorname{Cov}(\langle\zeta, b_2\rangle, \dots, \langle\zeta, b_d\rangle) < \infty$). (c) Suppose first that $\operatorname{Var}\xi < \infty$ and/or $b_1^{(1)} = b_1^{(2)} = 0$. Then assumption (1.2.144) amounts to $\det(\operatorname{Cov}J) > 0$ that, by the WLLN, holds true whenever

$$\frac{\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J})}{n} \xrightarrow{P} \operatorname{Cov}J > 0, \quad n \to \infty. \eqno(1.2.179)$$

In turn, via (1.2.167), (1.2.169) (proved without assuming (1.2.144)) and the representation

$$\sum_{i=1}^n (J_i - \overline{J})^T(J_i - \overline{J}) = \sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n) - \Delta_i(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n) - \Delta_i(n)\bigr)$$
$$= \sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr) - \sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\Delta_i(n)$$
$$- \sum_{i=1}^n \Delta_i^T(n)\bigl(K_i(n) - \overline{K}(n)\bigr) + \sum_{i=1}^n \Delta_i^T(n)\Delta_i(n),$$

with $K_i(n)$ from (1.2.150) and $\Delta_i(n)$ of (1.2.167), convergence in (1.2.179) is equiv­


alent to

$$\frac{\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)}{n} \xrightarrow{P} \text{positive definite matrix}, \quad n \to \infty, \eqno(1.2.180)$$

with the positive definite matrix equal to $\operatorname{Cov}J$. Thus, condition (1.2.144) amounts to (1.2.180). Suppose now that $\operatorname{Var}\xi = \infty$ and $|b_1^{(1)}| + |b_1^{(2)}| > 0$. Then, similarly to the above, assumption (1.2.145) is equivalent to, as $n \to \infty$,

$$\frac{\sum_{i=1}^n \bigl(J_i^{(2,d)} - \overline{J}^{(2,d)}\bigr)^T\bigl(J_i^{(2,d)} - \overline{J}^{(2,d)}\bigr)}{n} \xrightarrow{P} \operatorname{Cov}J^{(2,d)} > 0 \eqno(1.2.181)$$

or

$$\frac{\sum_{i=1}^n \bigl(K_i^{(2,d)}(n) - \overline{K}^{(2,d)}(n)\bigr)^T\bigl(K_i^{(2,d)}(n) - \overline{K}^{(2,d)}(n)\bigr)}{n} \xrightarrow{P} \text{positive definite matrix}, \eqno(1.2.182)$$

where the positive definite matrix in (1.2.182) coincides with $\operatorname{Cov}J^{(2,d)}$.

Remark 1.2.20. The random vector in (1.2.151) can be viewed, in a somewhat loose sense, as a version of the multivariate Student statistic of (1.2.135), based on the triangular sequence of dependent random vectors $\{K_i(n), 1 \le i \le n, n \ge 1\}$. Moreover, Lemma 1.2.17 for (1.2.151) is not only an extension of the multivariate CLT based on Studentization (cf. (a) of Lemma 1.2.14), but its use can also go beyond the studies of Chapter 1, since the results are suitable for all joint estimators that are reasonable functions of $(\bar y, \bar x, S_{yy}, S_{xy}, S_{xx})$. In view of (1.2.152), this remark is a multivariate companion of Remark 1.2.7 in Section 1.2.2 for the process (1.2.52).

Proof of Theorem 1.1.4. First, in view of Remark 1.2.4 in Section 1.2.2, we show the CLT for the vector $\sqrt{n}\bigl(U(1,n)(\tilde\beta_{1n} - \beta), (\tilde\alpha_{1n} - \alpha), L(1,n)(\tilde\theta_{1n} - \theta)\bigr)$ instead of $\sqrt{n}\bigl(U(1,n)(\hat\beta_{1n} - \beta), (\hat\alpha_{1n} - \alpha), L(1,n)(\hat\theta_{1n} - \theta)\bigr)$, where $\tilde\beta_{1n}$ and $\tilde\alpha_{1n}$ are as in (1.2.35) and (1.2.39). From (1.2.74), (1.2.77), (1.2.80) and (1.2.84) of the proof of Theorem 1.1.2 in


Section 1.2.2, we have

$$\sqrt{n}\bigl(U(1,n)(\tilde\beta_{1n} - \beta),\; (\tilde\alpha_{1n} - \alpha),\; L(1,n)(\tilde\theta_{1n} - \theta)\bigr)V^{-T/2}(1,n)$$
$$= \bigl(\sqrt{n}\,\overline{K}(n) + \sqrt{n}\,\overline{\rho}_{n,1}\bigr)\Bigl((n-1)^{-1}\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)^{-T/2}$$
$$\times \Bigl(\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)/(n-1)\Bigr)^{T/2}V^{-T/2}(1,n), \eqno(1.2.183)$$

where

$$K_i(n) = \bigl(u_i(1,n),\; v_i(1,n),\; w_i(1,n)\bigr), \eqno(1.2.184)$$

with $u_i(1,n)$, $v_i(1,n)$ and $w_i(1,n)$ of (1.1.32), (1.1.37) and (1.1.35), respectively; in other words, $K_i(n)$ is as in (1.2.150), with the vectors

$$c_1 = \frac{1}{\lambda + \beta^2}\bigl(0,\; 0,\; \beta,\; \lambda - \beta^2,\; -\lambda\beta\bigr), \eqno(1.2.185)$$

$$c_2 = \begin{cases} (1,\, -\beta,\, 0,\, 0,\, 0) - m\,c_1, & \text{if } \operatorname{Var}\xi < \infty,\\ (1,\, -\beta,\, 0,\, 0,\, 0), & \text{if } \operatorname{Var}\xi = \infty, \end{cases} \eqno(1.2.186)$$

and

$$c_3 = \bigl(0,\; 0,\; 1,\; -2\beta,\; \beta^2\bigr). \eqno(1.2.187)$$

In (1.2.183), the vector

$$\overline{\rho}_{n,1} = \bigl(\overline{\rho}_{n,1}^{(1)},\; \overline{\rho}_{n,1}^{(2)},\; \overline{\rho}_{n,1}^{(3)}\bigr), \eqno(1.2.188)$$

with components whose respective processes $\rho_n^{(1)}$, $\rho_n^{(2)}$ and $\rho_n^{(3)}$ are as in (1.2.75), in (1.2.78) (case $\operatorname{Var}\xi < \infty$) and (1.2.81) (case $\operatorname{Var}\xi = \infty$), and in (1.2.85), respectively. Next, we argue that $\sqrt{n}\,\overline{K}(n)\bigl((n-1)^{-1}\sum_{i=1}^n (K_i(n) - \overline{K}(n))^T(K_i(n) - \overline{K}(n))\bigr)^{-T/2}$ in (1.2.183) obeys Lemma 1.2.17. From the course of the proof of Theorem 1.1.2, where Lemma 1.2.10 for (1.2.152) with the vectors $c_1$, $c_2$ and $c_3$ in (1.2.185)-(1.2.187) is verified, these $c_j$ satisfy (1.2.148), and their respective vectors $b_1$, $b_2$ and $b_3$ defined via (1.2.149) (equal to those in (1.1.44) and (1.1.45) with $j = 1$, and (1.1.46), respectively) are seen to follow (1.2.140). Conditions (1.2.144) and (1.2.145) of Lemma


1.2.17 for $J$ and $J^{(2,d)}$ of (1.2.141) and (1.2.142) with such $b_1$, $b_2$ and $b_3$ amount to the assumed (1.1.42).

Due to Lemma 1.2.17 for $\sqrt{n}\,\overline{K}(n)\bigl((n-1)^{-1}\sum_{i=1}^n (K_i(n) - \overline{K}(n))^T(K_i(n) - \overline{K}(n))\bigr)^{-T/2}$ and (1.2.183), the proof of Theorem 1.1.4 for the normalized joint estimator on the initial left-hand side in (1.2.183) reduces to showing, as $n \to \infty$,

$$\Bigl\|\sqrt{n}\,\overline{\rho}_{n,1}\Bigl(\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)/(n-1)\Bigr)^{-T/2}\Bigr\| = o_P(1) \eqno(1.2.189)$$

and

$$\Bigl(\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)/(n-1)\Bigr)^{T/2}V^{-T/2}(1,n) \xrightarrow{P} I_3. \eqno(1.2.190)$$

To obtain (1.2.189), we consider a special case of the matrix $B_n$ in (1.2.160), defined via the vectors $b_1$, $a_1$ and $l_1$ that correspond to $c_1$, $c_2$ and $c_3$ in (1.2.185)-(1.2.187):

$$B_n = \begin{cases} n^{-1/2}\operatorname{diag}\Bigl(\bigl(\ell_\xi^{1/2}(n)\sqrt{\operatorname{Var}(b_1^{(1)}\delta + b_1^{(2)}\varepsilon)}\bigr)^{-1},\; \bigl(\operatorname{Cov}(\langle\zeta, a_1\rangle, \langle\zeta, l_1\rangle)\bigr)^{-1/2}\Bigr), & \text{if } \operatorname{Var}\xi = \infty \text{ and } |b_1^{(1)}| + |b_1^{(2)}| > 0,\\ n^{-1/2}\bigl(\operatorname{Cov}(\langle\zeta, b_1\rangle, \langle\zeta, a_1\rangle, \langle\zeta, l_1\rangle)\bigr)^{-1/2}, & \text{otherwise}. \end{cases} \eqno(1.2.191)$$

Just like (1.2.158) in the proof of Lemma 1.2.17 results from (1.2.161) and (1.2.162), convergence in (1.2.189) follows from (1.2.166) of Lemma 1.2.17 and

$$\bigl\|\sqrt{n(n-1)}\,\overline{\rho}_{n,1}B_n^T\bigr\| = o_P(1). \eqno(1.2.192)$$

On account of (1.2.75), (1.2.78) and (1.2.81), and (1.2.85), the components of $\overline{\rho}_{n,1}$ behave similarly to those of $\sqrt{n}\,\overline{Q}(n)$ (cf. (1.2.156) and (1.2.157), $|b_1^{(1)}| + |b_1^{(2)}| > 0$), i.e.,

$$\frac{\sqrt{n}\,\overline{\rho}_{n,1}^{(1)}}{\ell_\xi^{1/2}(n)} = o_P(1), \qquad \sqrt{n}\,\overline{\rho}_{n,1}^{(2)} = o_P(1) \qquad\text{and}\qquad \sqrt{n}\,\overline{\rho}_{n,1}^{(3)} = o_P(1). \eqno(1.2.193)$$


Therefore, the proof of (1.2.192) is similar to that of (1.2.162) and is thus omitted here. As to convergence in (1.2.190), the following statements will guarantee it, as $n \to \infty$:

$$\Bigl(\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)^{T/2}B_n^T \xrightarrow{P} I_3 \eqno(1.2.194)$$

and

$$\bigl(B_n^{-1}\bigr)^T V^{-T/2}(1,n) \xrightarrow{P} I_3, \eqno(1.2.195)$$

with $B_n$ from (1.2.191). Now, since

$$\Bigl(\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)^{T/2}B_n^T$$
$$= \Bigl(\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)^{-T/2}B_n^{-1}\Bigl(B_n\sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)B_n^T\Bigr),$$

(1.2.194) is a consequence of (1.2.165) and (1.2.166). In view of (1.2.134), for (1.2.195) one only needs to check that, as $n \to \infty$,

$$B_n V(1,n) B_n^T \xrightarrow{P} I_3. \eqno(1.2.196)$$

First, using (1.1.47), (1.1.48) and (1.2.184), one writes

$$V(1,n) = \sum_{i=1}^n \bigl(K_i(n) - \overline{K}(n) + \Delta_i(n) - \overline{\Delta}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n) + \Delta_i(n) - \overline{\Delta}(n)\bigr), \eqno(1.2.197)$$

with the vector

$$\Delta_i(n) = \bigl(0,\; \nu_i(1,n) - v_i(1,n),\; 0\bigr), \eqno(1.2.198)$$

where, on account of (1.2.79), (1.2.83) and (1.2.69) for $\sum_{i=1}^n \bigl(\nu_i(1,n) - \bar\nu(1,n)\bigr)^2$ (case $\operatorname{Var}\xi < \infty$ and/or $b_1^{(1)} = b_1^{(2)} = 0$), as $n \to \infty$,

$$\frac{\sum_{i=1}^n \bigl(\Delta_i(n) - \overline{\Delta}(n)\bigr)^T\bigl(\Delta_i(n) - \overline{\Delta}(n)\bigr)}{n} \xrightarrow{P} 0, \eqno(1.2.199)$$


and thus, also,

$$B_n\sum_{i=1}^n \bigl(\Delta_i(n) - \overline{\Delta}(n)\bigr)^T\bigl(\Delta_i(n) - \overline{\Delta}(n)\bigr)B_n^T \xrightarrow{P} 0. \eqno(1.2.200)$$

Hence, (1.2.166), (1.2.197) and (1.2.200) reduce (1.2.196) to

$$B_n\sum_{i=1}^n \Bigl(\bigl(K_i(n) - \overline{K}(n)\bigr)^T\bigl(\Delta_i(n) - \overline{\Delta}(n)\bigr) + \bigl(\Delta_i(n) - \overline{\Delta}(n)\bigr)^T\bigl(K_i(n) - \overline{K}(n)\bigr)\Bigr)B_n^T \xrightarrow{P} 0, \eqno(1.2.201)$$

$n \to \infty$. Using (1.2.100) and (1.2.102) of Section 1.2.2 for $K_i^{(1)}(n) = u_i(1,n)$ and $K_i^{(3)}(n) = w_i(1,n)$, as well as (1.2.101) and (1.2.69) for $\sum_{i=1}^n \bigl(v_i(1,n) - \bar v(1,n)\bigr)^2 = \sum_{i=1}^n \bigl(K_i^{(2)}(n) - \overline{K}^{(2)}(n)\bigr)^2$, we get

$$\frac{\sum_{i=1}^n \bigl(K_i^{(1)}(n) - \overline{K}^{(1)}(n)\bigr)^2}{n\,\ell_\xi(n)} = O_P(1), \qquad \frac{\sum_{i=1}^n \bigl(K_i^{(j)}(n) - \overline{K}^{(j)}(n)\bigr)^2}{n} = O_P(1), \quad j = 2, 3, \eqno(1.2.202)$$

with $\ell_\xi(n)$ of (1.2.28). Mutatis mutandis, convergence in (1.2.201) follows from (1.2.172), since the relations in (1.2.198)-(1.2.199) and (1.2.202) are the respective special cases of those in (1.2.169) and (1.2.177). The proof of Theorem 1.1.4 for the triples $\sqrt{n}\bigl(U(2,n)(\tilde\beta_{2n} - \beta), (\tilde\alpha_{2n} - \alpha), L(2,n)(\tilde\theta_{2n} - \theta)\bigr)$ and $\sqrt{n}\bigl(U(3,n)(\tilde\beta_{3n} - \beta), (\tilde\alpha_{3n} - \alpha), L(3,n)(\widetilde{\lambda\theta}_{3n} - \lambda\theta)\bigr)$ is similar to the one for $\sqrt{n}\bigl(U(1,n)(\tilde\beta_{1n} - \beta), (\tilde\alpha_{1n} - \alpha), L(1,n)(\tilde\theta_{1n} - \theta)\bigr)$ and is thus omitted. $\square$

Proof of Theorem 1.1.5. If, as $n \to \infty$,

$$\widehat V^{T/2}(j,n)\,V^{-T/2}(j,n) \xrightarrow{P} I_3, \quad j = \overline{1,3}, \eqno(1.2.203)$$

where $V(j,n)$ and $\widehat V(j,n)$ are as in (1.1.48) and (1.1.53), then Theorem 1.1.5 follows from Theorem 1.1.4. Due to similar arguments, we deal with the case $j = 1$ only and omit the proofs of (1.2.203) for $j = 2$ and $j = 3$.


For $j = 1$, to obtain (1.2.203), it suffices to show that, as $n \to \infty$,

$$V^{T/2}(1,n)\,B_n^T \xrightarrow{P} I_3 \eqno(1.2.204)$$

and

$$\bigl(B_n^{-1}\bigr)^T\,\widehat V^{-T/2}(1,n) \xrightarrow{P} I_3, \eqno(1.2.205)$$

with the matrix $B_n$ of (1.2.191). Clearly, (1.2.204) follows directly from (1.2.196), (1.2.134) and the representation

$$V^{T/2}(1,n)\,B_n^T = V^{-T/2}(1,n)\,B_n^{-1}\bigl(B_n V(1,n)B_n^T\bigr).$$

As to the proof of (1.2.205), in view of (1.2.134), one needs to establish the following convergence:

$$B_n\,\widehat V(1,n)\,B_n^T \xrightarrow{P} I_3, \quad n \to \infty. \eqno(1.2.206)$$

Next,

$$B_n\,\widehat V(1,n)\,B_n^T = B_n\sum_{i=1}^n \bigl(\hat p_i(1,n) - \bar{\hat p}(1,n)\bigr)^T\bigl(\hat p_i(1,n) - \bar{\hat p}(1,n)\bigr)B_n^T$$
$$= B_n\sum_{i=1}^n \bigl((p_i(1,n) - \bar p(1,n)) + \Delta_i(n)\bigr)^T\bigl((p_i(1,n) - \bar p(1,n)) + \Delta_i(n)\bigr)B_n^T \eqno(1.2.207)$$
$$= B_n V(1,n)B_n^T + B_n\sum_{i=1}^n \Delta_i^T(n)\Delta_i(n)B_n^T$$
$$+ B_n\sum_{i=1}^n \Bigl(\bigl(p_i(1,n) - \bar p(1,n)\bigr)^T\Delta_i(n) + \Delta_i^T(n)\bigl(p_i(1,n) - \bar p(1,n)\bigr)\Bigr)B_n^T, \eqno(1.2.208)$$

where the vector $\Delta_i(n)$ is given by

$$\Delta_i(n) = \bigl(\hat p_i(1,n) - \bar{\hat p}(1,n)\bigr) - \bigl(p_i(1,n) - \bar p(1,n)\bigr)$$
$$= \Bigl(\bigl(\hat u_i(1,n) - \bar{\hat u}(1,n)\bigr) - \bigl(u_i(1,n) - \bar u(1,n)\bigr),\; \bigl(\hat v_i(1,n) - \bar{\hat v}(1,n)\bigr) - \bigl(v_i(1,n) - \bar v(1,n)\bigr),$$
$$\bigl(\hat w_i(1,n) - \bar{\hat w}(1,n)\bigr) - \bigl(w_i(1,n) - \bar w(1,n)\bigr)\Bigr), \eqno(1.2.209)$$


with the vectors $p_i(1,n)$ and $\hat p_i(1,n)$ of (1.1.47) and (1.1.52), respectively. From the lines of the proof of Theorem 1.1.3,

$$\frac{\sum_{i=1}^n \bigl(\Delta_i^{(1)}(n)\bigr)^2}{n\,\ell_\xi(n)} = o_P(1), \qquad \frac{\sum_{i=1}^n \bigl(\Delta_i^{(j)}(n)\bigr)^2}{n} = o_P(1), \quad j = 2, 3, \eqno(1.2.210)$$

where $\ell_\xi(n)$ is as in (1.2.28). Viewing (1.2.210) and (1.2.100)-(1.2.102) for $p_i^{(j)}(1,n)$, $j = \overline{1,3}$, as respective special cases of (1.2.169) and (1.2.177) (with $p_i^{(j)}(1,n)$ in place of $J_i^{(j)}$, $j = \overline{1,d}$), we obtain (1.2.171) with $\Delta_i(n)$ of (1.2.209) and convergence in probability to zero of the matrix in (1.2.208) (as a special case of (1.2.172)). Combining this with (1.2.196), one concludes (1.2.206). $\square$

Proof of part (c) of Observation 1.1.2. Assuming that $j = 1$ and $\operatorname{Var}\xi < \infty$, we prove that the condition of (1.1.42),

$$\text{the vector } \bigl(\langle\zeta, b_j\rangle,\; \langle\zeta, a_j\rangle,\; \langle\zeta, l_j\rangle\bigr) \text{ is full}, \eqno(1.2.211)$$

and the convergence of (1.1.54),

$$n^{-1}\,\widehat V(j,n) \xrightarrow{P} \text{positive definite matrix}, \quad n \to \infty, \eqno(1.2.212)$$

are equivalent, where the positive definite matrix in (1.2.212) coincides with $\operatorname{Cov}\bigl(\langle\zeta, b_1\rangle, \langle\zeta, a_1\rangle, \langle\zeta, l_1\rangle\bigr)$. From Remark 1.2.19, one concludes that (1.2.211) for $j = 1$ is equivalent to (1.2.180), with $K_i(n)$ as in (1.2.184) and the positive definite matrix equal to $\operatorname{Cov}\bigl(\langle\zeta, b_1\rangle, \langle\zeta, a_1\rangle, \langle\zeta, l_1\rangle\bigr)$. Then, via (1.2.196), (1.2.206) with $B_n = n^{-1/2}I_3$ (the respective proofs are similar to those of the original (1.2.196) and (1.2.206)) and the Cauchy-Schwarz inequality, it is not hard to see that the latter condition and (1.2.212) with $j = 1$ imply each other if, as $n \to \infty$,

$$n^{-1}\sum_{i=1}^n \bigl((p_i(1,n) - \bar p(1,n)) - (K_i(n) - \overline{K}(n))\bigr)^T\bigl((p_i(1,n) - \bar p(1,n)) - (K_i(n) - \overline{K}(n))\bigr) \xrightarrow{P} 0 \eqno(1.2.213)$$


and

$$n^{-1}\sum_{i=1}^n \bigl((\hat p_i(1,n) - \bar{\hat p}(1,n)) - (p_i(1,n) - \bar p(1,n))\bigr)^T\bigl((\hat p_i(1,n) - \bar{\hat p}(1,n)) - (p_i(1,n) - \bar p(1,n))\bigr) \xrightarrow{P} 0, \eqno(1.2.214)$$

where the vectors $p_i(1,n)$ and $\hat p_i(1,n)$ are as in (1.1.47) and (1.1.52), respectively. Convergence in (1.2.213) and (1.2.214) follows in turn from the proof of Theorem 1.1.2 ($n^{-1}\sum_{i=1}^n \bigl(\nu_i(1,n) - \bar\nu(1,n)\bigr)^2 = o_P(1)$) and (1.2.209)-(1.2.210). The situation of $\operatorname{Var}\xi = \infty$ and $j = 1$, as well as the cases $j = 2$ and $3$, are treated similarly. $\square$

1.2.5 Appendix

This Appendix constitutes a crucial step in the key auxiliary Lemma 1.2.8 of Section 1.2.2. Namely, it provides the computer code, written in Maple, that verifies the following factorization used in the proof of Lemma 1.2.8 (cf. the paragraph where $\operatorname{Var}\langle\zeta, b\rangle > 0$ is shown under $\operatorname{Var}\xi < \infty$ and $|b^{(1)}| + |b^{(2)}| > 0$):

$$\det\bigl(\operatorname{Cov}(\zeta^{(1,2)}, \tilde\zeta^{(3,7)})\bigr) = \bigl(E(\xi - cm)^2 - (E(\xi - cm))^2\bigr)^2\det(\Gamma)\det\bigl(\operatorname{Cov}\tilde\zeta^{(3,7)}\bigr). \eqno(1.2.215)$$

In (1.2.215), $\zeta^{(1,2)}$ is a subvector of $\zeta$ of (1.2.40), i.e., of

$$\zeta = \bigl((\xi - cm)\delta,\; (\xi - cm)\varepsilon,\; \delta,\; \varepsilon,\; \delta\varepsilon - \mu,\; \delta^2 - \lambda\theta,\; \varepsilon^2 - \theta\bigr), \eqno(1.2.216)$$

the matrix $\Gamma$ is as in (1.1.3), i.e.,

$$\Gamma = \begin{pmatrix} \lambda\theta & \mu \\ \mu & \theta \end{pmatrix},$$

and $\tilde\zeta^{(3,7)}$ is a subvector of the subvector $\zeta^{(3,7)}$ of $\zeta$. The first two components of $\tilde\zeta^{(3,7)}$ are $\delta$ and $\varepsilon$, and the rest of its components (at most three), if any, compose a subvector of the vector $(\delta\varepsilon - \mu, \delta^2 - \lambda\theta, \varepsilon^2 - \theta)$, with the same order of components


as in $\zeta$ of (1.2.216). Hence, (1.2.215) is to be checked for all 8 possible matrices $\operatorname{Cov}(\zeta^{(1,2)}, \tilde\zeta^{(3,7)})$ that are defined by varying their right lower block $\operatorname{Cov}\tilde\zeta^{(3,7)}$ and are submatrices of

$$\operatorname{Cov}\zeta = \begin{pmatrix}
a_\xi\lambda\theta & a_\xi\mu & d_\xi\lambda\theta & d_\xi\mu & d_\xi m_{21} & d_\xi m_{30} & d_\xi m_{12}\\
a_\xi\mu & a_\xi\theta & d_\xi\mu & d_\xi\theta & d_\xi m_{12} & d_\xi m_{21} & d_\xi m_{03}\\
d_\xi\lambda\theta & d_\xi\mu & \lambda\theta & \mu & m_{21} & m_{30} & m_{12}\\
d_\xi\mu & d_\xi\theta & \mu & \theta & m_{12} & m_{21} & m_{03}\\
d_\xi m_{21} & d_\xi m_{12} & m_{21} & m_{12} & m_{22}-\mu^2 & m_{31}-\lambda\theta\mu & m_{13}-\theta\mu\\
d_\xi m_{30} & d_\xi m_{21} & m_{30} & m_{21} & m_{31}-\lambda\theta\mu & m_{40}-(\lambda\theta)^2 & m_{22}-\lambda\theta^2\\
d_\xi m_{12} & d_\xi m_{03} & m_{12} & m_{03} & m_{13}-\theta\mu & m_{22}-\lambda\theta^2 & m_{04}-\theta^2
\end{pmatrix}, \eqno(1.2.217)$$

where $\zeta$ is as in (1.2.216) and

$$a_\xi = E(\xi - cm)^2, \quad d_\xi = E(\xi - cm) \quad\text{and}\quad m_{ij} = E(\delta^i\varepsilon^j), \quad i, j = \overline{0,4}. \eqno(1.2.218)$$

The code below (lines starting with >) is supplied with preliminary comments inserted into { } that follow the % sign. Within the code, the various matrices $\operatorname{Cov}(\zeta^{(1,2)}, \tilde\zeta^{(3,7)})$ and $\operatorname{Cov}\tilde\zeta^{(3,7)}$ are denoted by Ai and Bi respectively, i = $\overline{1,8}$, while l, t, mu, a, d and mij stand respectively for $\lambda$, $\theta$, $\mu$, $a_\xi$, $d_\xi$ and $m_{ij}$, $i, j = \overline{0,4}$. For shortness, the displaying of the matrices, which automatically follows their inputting, is omitted.

> with(linalg);
%{ Inputting Cov(ζ^(1,2), ζ̃^(3,7)) and Cov ζ̃^(3,7) that correspond to ζ̃^(3,7) = ζ^(3,7) = (δ, ε, δε − μ, δ² − λθ, ε² − θ). }
> A1:=matrix([
> [a*l*t,a*mu,d*l*t,d*mu,d*m21,d*m30,d*m12],
> [a*mu,a*t,d*mu,d*t,d*m12,d*m21,d*m03],
> [d*l*t,d*mu,l*t,mu,m21,m30,m12],
> [d*mu,d*t,mu,t,m12,m21,m03],
> [d*m21,d*m12,m21,m12,m22-mu^2,m31-l*t*mu,m13-t*mu],
> [d*m30,d*m21,m30,m21,m31-l*t*mu,m40-(l*t)^2,m22-l*t^2],
> [d*m12,d*m03,m12,m03,m13-t*mu,m22-l*t^2,m04-t^2]
> ]);


> B1:=matrix([
> [l*t,mu,m21,m30,m12],
> [mu,t,m12,m21,m03],
> [m21,m12,m22-mu^2,m31-l*t*mu,m13-t*mu],
> [m30,m21,m31-l*t*mu,m40-(l*t)^2,m22-l*t^2],
> [m12,m03,m13-t*mu,m22-l*t^2,m04-t^2]
> ]);
%{ Verifying (1.2.215), i.e., that det(Cov(ζ^(1,2), ζ̃^(3,7)))/det(Cov ζ̃^(3,7)) = (a_ξ − d_ξ²)² det(Γ) (= (a − d²)²(l t² − mu²)). }
> normal(factor(det(A1))/factor(det(B1)));

(a - d^2)^2 (l t^2 - mu^2)

%{ Inputting Cov(ζ^(1,2), ζ̃^(3,7)) and Cov ζ̃^(3,7) that correspond to ζ̃^(3,7) = (δ, ε, δε − μ, δ² − λθ), denoted by A2 and B2, and checking (1.2.215). }
> A2:=matrix([
> [a*l*t,a*mu,d*l*t,d*mu,d*m21,d*m30],
> [a*mu,a*t,d*mu,d*t,d*m12,d*m21],
> [d*l*t,d*mu,l*t,mu,m21,m30],
> [d*mu,d*t,mu,t,m12,m21],
> [d*m21,d*m12,m21,m12,m22-mu^2,m31-l*t*mu],
> [d*m30,d*m21,m30,m21,m31-l*t*mu,m40-(l*t)^2]
> ]);
> B2:=matrix([
> [l*t,mu,m21,m30],
> [mu,t,m12,m21],
> [m21,m12,m22-mu^2,m31-l*t*mu],
> [m30,m21,m31-l*t*mu,m40-(l*t)^2]
> ]);


> normal(factor(det(A2))/factor(det(B2)));

(a - d^2)^2 (l t^2 - mu^2)

%{ Inputting Cov(ζ^(1,2), ζ̃^(3,7)) and Cov ζ̃^(3,7) that correspond to ζ̃^(3,7) = (δ, ε, δε − μ, ε² − θ), denoted by A3 and B3, and checking (1.2.215). }
> A3:=matrix([
> [a*l*t,a*mu,d*l*t,d*mu,d*m21,d*m12],
> [a*mu,a*t,d*mu,d*t,d*m12,d*m03],
> [d*l*t,d*mu,l*t,mu,m21,m12],
> [d*mu,d*t,mu,t,m12,m03],
> [d*m21,d*m12,m21,m12,m22-mu^2,m13-t*mu],
> [d*m12,d*m03,m12,m03,m13-t*mu,m04-t^2]
> ]);
> B3:=matrix([
> [l*t,mu,m21,m12],
> [mu,t,m12,m03],
> [m21,m12,m22-mu^2,m13-t*mu],
> [m12,m03,m13-t*mu,m04-t^2]
> ]);
> normal(factor(det(A3))/factor(det(B3)));

(a - d^2)^2 (l t^2 - mu^2)

%{ Inputting Cov(ζ^(1,2), ζ̃^(3,7)) and Cov ζ̃^(3,7) that correspond to ζ̃^(3,7) = (δ, ε, δ² − λθ, ε² − θ), denoted by A4 and B4, and checking (1.2.215). }
> A4:=matrix([
> [a*l*t,a*mu,d*l*t,d*mu,d*m30,d*m12],
> [a*mu,a*t,d*mu,d*t,d*m21,d*m03],
> [d*l*t,d*mu,l*t,mu,m30,m12],


> [d*mu,d*t,mu,t,m21,m03],
> [d*m30,d*m21,m30,m21,m40-(l*t)^2,m22-l*t^2],
> [d*m12,d*m03,m12,m03,m22-l*t^2,m04-t^2]
> ]);
> B4:=matrix([
> [l*t,mu,m30,m12],
> [mu,t,m21,m03],
> [m30,m21,m40-(l*t)^2,m22-l*t^2],
> [m12,m03,m22-l*t^2,m04-t^2]
> ]);
> normal(factor(det(A4))/factor(det(B4)));

(a - d^2)^2 (l t^2 - mu^2)

%{ Inputting Cov(ζ^(1,2), ζ̃^(3,7)) and Cov ζ̃^(3,7) that correspond to ζ̃^(3,7) = (δ, ε, δε − μ), denoted by A5 and B5, and checking (1.2.215). }
> A5:=matrix([
> [a*l*t,a*mu,d*l*t,d*mu,d*m21],
> [a*mu,a*t,d*mu,d*t,d*m12],
> [d*l*t,d*mu,l*t,mu,m21],
> [d*mu,d*t,mu,t,m12],
> [d*m21,d*m12,m21,m12,m22-mu^2]
> ]);
> B5:=matrix([
> [l*t,mu,m21],
> [mu,t,m12],
> [m21,m12,m22-mu^2]
> ]);
> normal(factor(det(A5))/factor(det(B5)));

(a - d^2)^2 (l t^2 - mu^2)

%{ Inputting Cov(ζ^(1,2), ζ̃^(3,7)) and Cov ζ̃^(3,7) that correspond to ζ̃^(3,7) = (δ, ε, δ² − λθ), denoted by A6 and B6, and checking (1.2.215). }


> A6:=matrix([
> [a*l*t,a*mu,d*l*t,d*mu,d*m30],
> [a*mu,a*t,d*mu,d*t,d*m21],
> [d*l*t,d*mu,l*t,mu,m30],
> [d*mu,d*t,mu,t,m21],
> [d*m30,d*m21,m30,m21,m40-(l*t)^2]
> ]);
> B6:=matrix([
> [l*t,mu,m30],
> [mu,t,m21],
> [m30,m21,m40-(l*t)^2]
> ]);
> normal(factor(det(A6))/factor(det(B6)));

(a - d^2)^2 (l t^2 - mu^2)

%{ Inputting Cov(ζ^(1,2), ζ̃^(3,7)) and Cov ζ̃^(3,7) that correspond to ζ̃^(3,7) = (δ, ε, ε² − θ), denoted by A7 and B7, and checking (1.2.215). }
> A7:=matrix([
> [a*l*t,a*mu,d*l*t,d*mu,d*m12],
> [a*mu,a*t,d*mu,d*t,d*m03],
> [d*l*t,d*mu,l*t,mu,m12],
> [d*mu,d*t,mu,t,m03],
> [d*m12,d*m03,m12,m03,m04-t^2]
> ]);
> B7:=matrix([
> [l*t,mu,m12],
> [mu,t,m03],
> [m12,m03,m04-t^2]
> ]);
> normal(factor(det(A7))/factor(det(B7)));

(a - d^2)^2 (l t^2 - mu^2)


%{ Inputting Cov(ζ^(1,2), ζ̃^(3,7)) and Cov ζ̃^(3,7) that correspond to ζ̃^(3,7) = (δ, ε), denoted by A8 and B8, and checking (1.2.215). }
> A8:=matrix([
> [a*l*t,a*mu,d*l*t,d*mu],
> [a*mu,a*t,d*mu,d*t],
> [d*l*t,d*mu,l*t,mu],
> [d*mu,d*t,mu,t]
> ]);
> B8:=matrix([
> [l*t,mu],
> [mu,t]
> ]);
> normal(factor(det(A8))/factor(det(B8)));

(a - d^2)^2 (l t^2 - mu^2)
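The symbolic computations above can also be spot-checked numerically. The sketch below is a consistency check, not a proof: the parameter and moment values are arbitrary test numbers (chosen to mimic a correlated normal error pair so that B1 is well-conditioned). It rebuilds the matrices A1 and B1 of the first case and compares det(A1)/det(B1) with (a − d²)²(l t² − mu²).

```python
import numpy as np

# Arbitrary test values (mimicking correlated normal errors with Var(delta)=l*t,
# Var(eps)=t, cov=mu); any values with det(B1) != 0 should give the same ratio.
a, d, l, t, mu = 3.0, 0.5, 2.0, 1.0, 0.2
m30 = m21 = m12 = m03 = 0.0                     # odd moments vanish for normals
m22, m31, m13 = l*t**2 + 2*mu**2, 3*mu*l*t, 3*mu*t
m40, m04 = 3*(l*t)**2, 3*t**2

B1 = np.array([
    [l*t, mu,  m21,          m30,            m12],
    [mu,  t,   m12,          m21,            m03],
    [m21, m12, m22 - mu**2,  m31 - l*t*mu,   m13 - t*mu],
    [m30, m21, m31 - l*t*mu, m40 - (l*t)**2, m22 - l*t**2],
    [m12, m03, m13 - t*mu,   m22 - l*t**2,   m04 - t**2],
])
# A1 borders B1: upper-left 2x2 block is a*Gamma, the off-diagonal blocks are
# d times the first two rows of B1, exactly as in the Maple input of A1 above.
C = B1[:2, :]                                   # 2 x 5, first two rows of B1
A1 = np.block([[a * B1[:2, :2], d * C],
               [d * C.T,        B1]])           # 7 x 7

ratio = np.linalg.det(A1) / np.linalg.det(B1)
print(ratio, (a - d**2)**2 * (l * t**2 - mu**2))  # the two numbers agree
```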

Chapter 2

Invariance Principles via Studentization in Linear Functional Error-in-Variables Models

2.1 Introduction, Main Results and Applications

2.1.1 Model and Assumptions

In the linear error-in-variables model (EIVM) of this thesis, we observe pairs $(y_i, x_i) \in \mathbb{R}^2$ according to

$$y_i = \beta\xi_i + \alpha + \delta_i, \eqno(2.1.1)$$

$$x_i = \xi_i + \varepsilon_i, \eqno(2.1.2)$$

where $\xi_i$ are unknown explanatory/latent variables, the real-valued slope $\beta$ and intercept $\alpha$ are to be estimated, and $\delta_i$ and $\varepsilon_i$ are unknown measurement error terms/variables, $1 \le i \le n$, $n \ge 1$. EIVM (2.1.1)-(2.1.2) is also known as a measurement error model, a structural/functional relationship, or regression with errors in variables. It is a generalization of ordinary regression in that in (2.1.1)-(2.1.2) it is



assumed that the two variables $\eta$ and $\xi$ are linearly related, $\eta = \beta\xi + \alpha$; however, now not only $\eta$, but also $\xi$, is observed with measurement error, $\delta_i$ and $\varepsilon_i$ respectively.
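As a concrete illustration of (2.1.1)-(2.1.2), the simulation sketch below (made-up parameter values, not taken from the thesis) also shows why errors in $\xi$ matter: naive least squares of $y$ on $x$ is inconsistent here, which is what motivates EIVM-specific estimators.

```python
import numpy as np

# Made-up illustrative values: beta=2, alpha=1, a deterministic xi grid (the
# functional model), normal errors with Var(delta)=1, Var(eps)=4 (lambda=1/4).
rng = np.random.default_rng(42)
n = 2000
beta, alpha = 2.0, 1.0
xi = np.linspace(0.0, 10.0, n)       # unknown, nonrandom explanatory variables
delta = rng.normal(0.0, 1.0, n)
eps = rng.normal(0.0, 2.0, n)
y = beta * xi + alpha + delta        # (2.1.1)
x = xi + eps                         # (2.1.2)

# Ordinary least squares of y on x is attenuated toward zero, since
# Cov(x, y)/Var(x) tends to beta * S / (S + Var(eps)) < beta, S the xi spread.
b_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
print(b_ols)  # well below beta = 2
```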

In Chapter 2, the explanatory variables $\xi_i$ are assumed to be deterministic, i.e., we deal here with the so-called functional EIVM (FEIVM). As opposed to FEIVM's, in the corresponding structural EIVM's (SEIVM's) (cf., e.g., Chapter 1), the $\xi_i$ are assumed to be independent identically distributed (i.i.d.) random variables (r.v.'s) that are independent of the error terms. The case of (2.1.1)-(2.1.2) with $\alpha$ known to be zero is distinguished in the literature as the model without intercept. Convenient notations introduced in this chapter (cf. the forthcoming (2.1.6), (2.1.7)) allow us to study both the no-intercept model and the model with unknown $\alpha$ simultaneously. We note that model (2.1.1)-(2.1.2) is classified in form as one without the so-called equation error (cf. Cheng and Van Ness [12] and Fuller [19] for details on equation error models). Throughout Chapter 2, one of the following three conditions is imposed on the error terms:

(A) $\{(\delta, \varepsilon), (\delta_i, \varepsilon_i), i \ge 1\}$ is a sequence of i.i.d. random vectors of error terms with mean zero and positive definite covariance matrix

$$\Gamma = \begin{pmatrix} \lambda\theta & \mu \\ \mu & \theta \end{pmatrix}, \eqno(2.1.3)$$

or

(B) the sequence $\{(\delta, \varepsilon), (\delta_i, \varepsilon_i), i \ge 1\}$ is as in (A) and, for some $\Delta \in (0, 2)$,

$$E|\delta|^{2+\Delta} < \infty \quad\text{and}\quad E|\varepsilon|^{2+\Delta} < \infty,$$

or

(C) the sequence $\{(\delta, \varepsilon), (\delta_i, \varepsilon_i), i \ge 1\}$ is as in (A), with the fourth error moments assumed to exist, i.e.,

$$E\delta^4 < \infty \quad\text{and}\quad E\varepsilon^4 < \infty.$$


As to the explanatory variables $\{\xi_i, i \ge 1\}$, we assume that they obey some or all of the following conditions (D), (E) and (F):

(D) $\displaystyle\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^n \xi_i = m$ and $m$ is finite;

(E) $\displaystyle\liminf_{n\to\infty}\Bigl(\frac{1}{n}\sum_{i=1}^n \xi_i^2 - \Bigl(\frac{1}{n}\sum_{i=1}^n \xi_i\Bigr)^2\Bigr) > 0.$

Apart from assumptions on the joint distribution of $(\delta, \varepsilon)$, identifiability conditions are also needed; one of the following is assumed:

(1) the positive ratio of the error variances $\lambda = \operatorname{Var}\delta/\operatorname{Var}\varepsilon$ and the correlation coefficient of the error terms $\operatorname{cov}(\delta, \varepsilon)/\sqrt{\operatorname{Var}\delta\operatorname{Var}\varepsilon} = \mu/(\sqrt{\lambda}\,\theta)$ are known, where $\operatorname{cov}(\delta, \varepsilon)$ is the covariance of $\delta$ and $\varepsilon$; equivalently, the matrix $\Gamma$ is known at least up to the unknown multiple $\theta = \operatorname{Var}\varepsilon$;

(2) $\operatorname{Var}\delta = \lambda\theta$ and $\operatorname{cov}(\delta, \varepsilon) = \mu$ are known, while $\operatorname{Var}\varepsilon = \theta$ is unknown;

(3) $\operatorname{Var}\varepsilon = \theta$ and $\operatorname{cov}(\delta, \varepsilon) = \mu$ are known, while $\operatorname{Var}\delta = \lambda\theta$ is unknown.

Other identifiability conditions for FEIVM’s (2.1.1)-(2.1.2) are briefly addressed in Remark 2.1.10 and then, also, in Section 2.1.5.
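Under identifiability condition (1) with uncorrelated errors ($\mu = 0$) and known $\lambda$, the slope estimator treated in Section 2.1.2 below admits the classical closed form $\bigl(S_{yy} - \lambda S_{xx} + ((S_{yy} - \lambda S_{xx})^2 + 4\lambda S_{xy}^2)^{1/2}\bigr)/(2S_{xy})$ (cf. Fuller [19], Section 1.3.3). A numerical sketch with illustrative made-up values, not the thesis' data:

```python
import numpy as np

# Illustrative check of the classical lambda-known slope estimator; values made up.
rng = np.random.default_rng(7)
n = 5000
beta, alpha, lam = 2.0, 1.0, 0.25                 # lam = Var(delta)/Var(eps), known
xi = np.linspace(0.0, 10.0, n)
y = beta * xi + alpha + rng.normal(0.0, 1.0, n)   # Var(delta) = 1
x = xi + rng.normal(0.0, 2.0, n)                  # Var(eps) = 4

Sxx = np.var(x)
Syy = np.var(y)
Sxy = np.mean((x - x.mean()) * (y - y.mean()))

g = Syy - lam * Sxx
beta_hat = (g + np.sqrt(g**2 + 4.0 * lam * Sxy**2)) / (2.0 * Sxy)
alpha_hat = y.mean() - beta_hat * x.mean()
print(beta_hat, alpha_hat)  # close to (2.0, 1.0), unlike naive least squares
```

Plugging the population limits of $S_{xx}$, $S_{yy}$, $S_{xy}$ into the formula returns $\beta$ exactly, which is the consistency argument behind this root choice.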

2.1.2 Introduction of Estimators and Processes under Study

In this subsection, depending on which one of the identifiability conditions (1)-(3) is assumed, well-known weighted or modified least squares estimators are given


for the slope and intercept, the principal parameters of interest, as well as certain estimators of the unknown error variances. For all these estimators, corresponding processes are introduced in $D[0,1]$-space; these are believed to be new objects of study for FEIVM's (2.1.1)-(2.1.2). Formulae for most of the estimators are not displayed separately, since they are absorbed into those for the corresponding processes. We also note that estimators for the intercept $\alpha$ and their corresponding processes are introduced below and studied afterwards only under the assumption that either $\lim_{n\to\infty}\bigl(n^{-1}\sum_{i=1}^n \xi_i^2 - (n^{-1}\sum_{i=1}^n \xi_i)^2\bigr) = M < \infty$, or $\lim_{n\to\infty}\bigl(n^{-1}\sum_{i=1}^n \xi_i^2 - (n^{-1}\sum_{i=1}^n \xi_i)^2\bigr) = \infty$, rather than (E). In the light of the interplay of Chapter 2 with Chapter 1 of this thesis, these assumptions regarding the estimation of $\alpha$ will be seen to be quite natural special cases of (E) (cf. the discussions in Section 2.1.3). For convenience, necessary notations, definitions and abbreviations that are used more than once throughout Chapter 2 are introduced in the course of the developments of Chapter 2, at the beginning of the subsection where they first occur, and are also summarized in the List of Notations, Definitions and Abbreviations provided just before the Introduction of the thesis. For further use, we now introduce a set of notations. For sets of real-valued

variables $\{u_i, 1 \le i \le n\}$ and $\{v_i, 1 \le i \le n\}$, put

$$\bar u = \frac{1}{n}\sum_{i=1}^n u_i, \qquad \bar v = \frac{1}{n}\sum_{i=1}^n v_i, \eqno(2.1.4)$$

and

$$S_{uv} = \frac{1}{n}\sum_{i=1}^n s_{i,uv}, \eqno(2.1.5)$$

where

$$s_{i,uv} = (u_i - c\bar u)(v_i - c\bar v), \eqno(2.1.6)$$

with the constant

$$c = \begin{cases} 0, & \text{if the intercept } \alpha \text{ is known to be zero},\\ 1, & \text{otherwise}. \end{cases} \eqno(2.1.7)$$
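The notation (2.1.4)-(2.1.7) transcribes directly into code; a sketch (the function name is ours, not the thesis'):

```python
import numpy as np

# S_uv of (2.1.5)-(2.1.6): s_{i,uv} = (u_i - c*ubar)(v_i - c*vbar), averaged over i,
# with c = 0 when the intercept is known to be zero and c = 1 otherwise (2.1.7).
def S_uv(u, v, zero_intercept=False):
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    c = 0.0 if zero_intercept else 1.0
    s = (u - c * u.mean()) * (v - c * v.mean())
    return s.mean()

u, v = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(S_uv(u, v))                       # centered sample covariance: 4/3
print(S_uv(u, v, zero_intercept=True))  # raw mixed moment: 28/3
```

With $c = 1$ this is the usual (biased) sample covariance; with $c = 0$ no centering is done, matching the no-intercept model.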


For a time function K_n(t) of t, K_n(t) : [0,1] → {0, 1, ..., n}, n ≥ 1, we define the following functions in D[0,1]:

ū_{K_n(t)} = (1/n) ∑_{i=1}^{K_n(t)} u_i, 0 ≤ t ≤ 1, where ū_{K_n(t)} := 0 for t such that K_n(t) = 0,   (2.1.8)

and

S_{uv,K_n(t)} = (1/n) ∑_{i=1}^{K_n(t)} s_{i,uv}, 0 ≤ t ≤ 1, where S_{uv,K_n(t)} := 0 for t such that K_n(t) = 0.   (2.1.9)

The sign of a real-valued variable is denoted by sign(·). Deviating somewhat from the original agreement that all abbreviations are to be introduced right at the beginning of each subsection, for the sake of better presentation we introduce new abbreviations throughout this subsection as well. When identifiability condition (1) is assumed, it is common to estimate β and α with the weighted least squares estimators (WLSE’s) β̂_{1n} and α̂_{1n} (cf. [12], Section 1.3.3 of [19], [4] and, for the early, 1879–1946 references, [40]). These WLSE’s minimize the functional

F_{1n}(β, α) = (1/n) ∑_{i=1}^n (y_i − ξ_i β − cα, x_i − ξ_i) Γ⁻¹ (y_i − ξ_i β − cα, x_i − ξ_i)^T, β, α ∈ ℝ,   (2.1.10)

where c is as in (2.1.7), Γ⁻¹ is the inverse of the matrix Γ of (1.1.3) and (y_i − ξ_i β − cα, x_i − ξ_i)^T is the transpose of the vector (y_i − ξ_i β − cα, x_i − ξ_i). The formulae for β̂_{1n} and α̂_{1n} are much simpler looking when μ = 0 in (2.1.3). It is therefore quite common to study WLSE’s via first assuming that the error terms of model (2.1.1)–(2.1.2) are uncorrelated. Then, using a data-transformation based interplay between models (2.1.1)–(2.1.2) with uncorrelated and correlated errors, one can carry over asymptotic results for WLSE’s from the case μ = 0 to μ ≠ 0 (cf. Remark 2.1.11 of Section 2.1.4).

In Chapter 2, along with studying WLSE’s, we introduce weighted least squares processes (WLSP’s) that do not seem to have been explored so far by researchers in the EIVM area. WLSP’s are elements of D[0,1] that at t = 1 coincide with the WLSE’s centered by β and α respectively, namely with β̂_{1n} − β and α̂_{1n} − α. Assuming that μ = 0 in (2.1.3) and using notations (2.1.5)–(2.1.7) and (2.1.9), for 0 ≤ t ≤ 1 we put the WLSP for β to be

(β̂_{1n} − β)_{K_{1n}(t)} = sign(S_{xy}) √(((ẑ_n − z)_{K_{1n}(t)} + z)² + λ) + (ẑ_n − z)_{K_{1n}(t)} + z − β,   (2.1.11)

with

(ẑ_n − z)_{K_{1n}(t)} = (S_{yy,K_{1n}(t)} − λ S_{xx,K_{1n}(t)} − 2z S_{xy,K_{1n}(t)}) / (2 S_{xy}), S_{xy} ≠ 0, and z = (β² − λ)/(2β),   (2.1.12)

and the time function

K_{1n}(t) = sup{m ≤ n : ∑_{i=1}^m Var u_i(1,n) ≤ t ∑_{i=1}^n Var u_i(1,n)}, 0 ≤ t ≤ 1,   (2.1.13)

where

u_i(1,n) = (β/(λ + β²)) (s_{i,yy} − λ s_{i,xx} − 2z s_{i,xy}).   (2.1.14)
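To make the closed form behind (2.1.11)–(2.1.13) concrete, here is a small Python sketch (our own illustration; the variable names are ours, not the thesis’s). It computes the full-sample WLSE of β for a known variance ratio λ, i.e. β̂ = ẑ + sign(S_xy)√(ẑ² + λ) with ẑ = (S_yy − λS_xx)/(2S_xy), together with a discrete version of a Studentizing time function; the accompanying check uses the defining quadratic S_xy β² − (S_yy − λS_xx)β − λS_xy = 0.

```python
import math

def moments(x, y):
    # sample second moments about the means, i.e. S_xx, S_yy, S_xy with c = 1 in (2.1.7)
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    sxx = sum((a - xb) ** 2 for a in x) / n
    syy = sum((b - yb) ** 2 for b in y) / n
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / n
    return sxx, syy, sxy

def beta1_wlse(x, y, lam):
    # WLSE of the slope when lam = Var(delta)/Var(eps) is known and mu = 0
    sxx, syy, sxy = moments(x, y)
    zhat = (syy - lam * sxx) / (2.0 * sxy)
    return zhat + math.copysign(math.sqrt(zhat ** 2 + lam), sxy)

def time_function(var_terms, t):
    # discrete analogue of K_n(t) = sup{m <= n : sum_{i<=m} Var u_i <= t * sum_{i<=n} Var u_i}
    total = sum(var_terms)
    running, m = 0.0, 0
    for v in var_terms:
        if running + v <= t * total:
            running += v
            m += 1
        else:
            break
    return m
```

Only the time change depends on the variances Var u_i(1, n); in the theory these are theoretical variances, so the sketch should be read as a structural illustration rather than a data-driven recipe.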

The respective WLSP for α is defined by

(α̂_{1n} − α)_{L_{1n}(t)} = −x̄ (β̂_{1n} − β)_{L_{1n}(t)} + (ȳ − x̄β − α)_{L_{1n}(t)}, 0 ≤ t ≤ 1,   (2.1.15)

where for 0 ≤ t ≤ 1,

L_{1n}(t) = { sup{m ≤ n : ∑_{i=1}^m Var v_i(1,n) ≤ t ∑_{i=1}^n Var v_i(1,n)}, if lim_{n→∞} (ξ̄² − (ξ̄)²) = M < ∞,
             [nt], if lim_{n→∞} (ξ̄² − (ξ̄)²) = ∞,   (2.1.16)

with

v_i(1,n) = { (y_i − α) − βx_i − (m/M) u_i(1,n), if lim_{n→∞} (ξ̄² − (ξ̄)²) = M < ∞,
             (y_i − α) − βx_i, if lim_{n→∞} (ξ̄² − (ξ̄)²) = ∞,   (2.1.17)


and (β̂_{1n} − β)_{L_{1n}(t)} is as in (2.1.11), with L_{1n}(t) of (2.1.16) in place of K_{1n}(t) of (2.1.13). When dealing with assumptions on the covariance matrix Γ in case (1), with μ = 0, there are two further cases to consider: both variances λθ and θ in (2.1.3) are known, or both are unknown. In the latter case, it may also be of interest to estimate one of the variances, say θ (cf. part (e) of Remark 2.1.6 of Section 2.1.4 on where estimators for θ may be used). Now the weighted least squares approach, via the functional (2.1.10), is unable to supply us with any estimator for θ. So it seems natural to adapt here the maximum likelihood estimator (MLE) of θ that is derived when the error terms are assumed to follow a normal distribution. The MLE method then not only produces estimators for β and α that are known to coincide with the WLSE’s, but also gives us the MLE of θ, which is usually adjusted with the factor 2n/(n − 2). This adjustment, also called the “correction for degrees of freedom”, results, in particular, in consistency of the MLE of θ. In this context we introduce the process in D[0,1]

(θ̂_{1n} − θ)_{[nt]} = (n/(n − 2)) (θ̃_{1n} − θ)_{[nt]} + (2θ/(n − 2)) ([nt]/n), 0 ≤ t ≤ 1,   (2.1.18)

where for 0 ≤ t ≤ 1,

(θ̃_{1n} − θ)_{[nt]} = ((S_{yy,[nt]} − λθ[nt]/n) − 2 S_{xy,[nt]} β̂_{1n} + (S_{xx,[nt]} − θ[nt]/n) β̂_{1n}²) / (λ + β̂_{1n}²),   (2.1.19)

and note that θ̂_{1n} = (θ̂_{1n} − θ)_{[nt]}|_{t=1} + θ is the MLE of θ after the aforementioned adjustment. This new process, as defined by (2.1.18)–(2.1.19), as well as θ̂_{1n}, are studied here along with the WLSP’s and WLSE’s that are our primary interest. When either (2) or (3) is assumed, i.e., when the covariance matrix Γ in (2.1.3) is known up to one of the error variances, the so-called modified least squares estimators (MLSE’s) of the slope β and intercept α, introduced in Cheng and Tsai [9], are available. These MLSE’s are not new in form (they basically coincide with the method of moments estimators of β and α). However, they are obtained by a rather general


method that is applicable to a broad class of the models (2.1.1)–(2.1.2) simultaneously, e.g., to FEIVM’s and SEIVM’s (2.1.1)–(2.1.2), independently of whether the explanatory variables and/or the error terms are normally distributed. According to this

method, the MLSE’s of β and α minimize, in cases (2) and (3) respectively, the following functionals, which are unbiased estimators of the appropriate unknown error variances:

F_{2n}(β, α) = (1/(nβ²)) ∑_{i=1}^n ((y_i − x_i β − cα)² − λθ + 2βμ), β ≠ 0, α ∈ ℝ,   (2.1.20)

and

F_{3n}(β, α) = (1/n) ∑_{i=1}^n ((y_i − x_i β − cα)² − β²θ + 2βμ), β, α ∈ ℝ,   (2.1.21)

with c of (2.1.7). It is not hard to see that F_{2n}(β, α) and F_{3n}(β, α) are consistent estimators of θ and λθ on assuming simply that the variances of the i.i.d. error terms exist, disregarding the nature of the explanatory variables; thus the MLSE method is also suitable for our FEIVM’s (2.1.1)–(2.1.2) under (A). When condition (2) or (3) is assumed, going beyond studying MLSE’s only, we present modified least squares processes (MLSP’s) for β and α, also believed to be the first in the context of (2.1.1)–(2.1.2). In cases (2) and (3) respectively, the MLSP’s for β are given in D[0,1] as follows:

(β̂_{2n} − β)_{K_{2n}(t)} = ((S_{yy,K_{2n}(t)} − λθ K_{2n}(t)/n) − β (S_{xy,K_{2n}(t)} − μ K_{2n}(t)/n)) / (S_{xy} − μ), 0 ≤ t ≤ 1,   (2.1.22)

provided that

S_{xy} − μ ≠ 0 and S_{yy} − λθ > 0,   (2.1.23)

while

(β̂_{3n} − β)_{K_{3n}(t)} = ((S_{xy,K_{3n}(t)} − μ K_{3n}(t)/n) − β (S_{xx,K_{3n}(t)} − θ K_{3n}(t)/n)) / (S_{xx} − θ), 0 ≤ t ≤ 1,   (2.1.24)

provided that

S_{xx} − θ > 0.   (2.1.25)


In (2.1.22) and (2.1.24),

K_{jn}(t) = sup{m ≤ n : ∑_{i=1}^m Var u_i(j,n) ≤ t ∑_{i=1}^n Var u_i(j,n)}, 0 ≤ t ≤ 1, j = 2 and 3,   (2.1.26)

with

u_i(j,n) = { (s_{i,yy} − λθ) − β(s_{i,xy} − μ), if j = 2,
             (s_{i,xy} − μ) − β(s_{i,xx} − θ), if j = 3.   (2.1.27)

As to the corresponding MLSP’s for α, we define

(α̂_{jn} − α)_{L_{jn}(t)} = −x̄ (β̂_{jn} − β)_{L_{jn}(t)} + (ȳ − x̄β − α)_{L_{jn}(t)}, 0 ≤ t ≤ 1, j = 2 and 3,   (2.1.28)

where for 0 ≤ t ≤ 1 and j = 2 and 3,

L_{jn}(t) = { sup{m ≤ n : ∑_{i=1}^m Var v_i(j,n) ≤ t ∑_{i=1}^n Var v_i(j,n)}, if lim_{n→∞} (ξ̄² − (ξ̄)²) = M < ∞,
             [nt], if lim_{n→∞} (ξ̄² − (ξ̄)²) = ∞,   (2.1.29)

with

v_i(j,n) = { (y_i − α) − βx_i − (m/(Mβ)) u_i(2,n), if j = 2 and lim_{n→∞} (ξ̄² − (ξ̄)²) = M < ∞,
             (y_i − α) − βx_i − (m/M) u_i(3,n), if j = 3 and lim_{n→∞} (ξ̄² − (ξ̄)²) = M < ∞,
             (y_i − α) − βx_i, if lim_{n→∞} (ξ̄² − (ξ̄)²) = ∞, j = 2 and 3,   (2.1.30)

and (β̂_{jn} − β)_{L_{jn}(t)} are as in (2.1.22) and (2.1.24), with L_{jn}(t) of (2.1.29) in place of K_{jn}(t) of (2.1.26), j = 2 and 3. We note that (β̂_{jn} − β)_{K_{jn}(1)} and (α̂_{jn} − α)_{L_{jn}(1)} are the centered MLSE’s β̂_{jn} and α̂_{jn}, j = 2 and 3, i.e., the respective minimizers of (2.1.20) and (2.1.21), centered by β and α.
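As a numerical cross-check (our own sketch; the data and parameter values are hypothetical), the minimizers of the modified least squares functionals admit closed forms consistent with (2.1.22)–(2.1.25) at t = 1, namely β̂_{2n} = (S_yy − λθ)/(S_xy − μ) and β̂_{3n} = (S_xy − μ)/(S_xx − θ), with α̂_{jn} = ȳ − β̂_{jn} x̄. The sketch verifies that the case-(3) closed form is a minimizer of F_{3n}, which is a convex quadratic in (β, α) whenever S_xx − θ > 0:

```python
def sample_moments(x, y):
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    sxx = sum((a - xb) ** 2 for a in x) / n
    syy = sum((b - yb) ** 2 for b in y) / n
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / n
    return xb, yb, sxx, syy, sxy

def F3(beta, alpha, x, y, theta, mu):
    # functional (2.1.21) with c = 1: at the true (beta, alpha) it is unbiased for
    # Var(delta) = lam*theta, since E(y - x*beta - alpha)^2 = lam*theta - 2*beta*mu + beta^2*theta
    n = len(x)
    rss = sum((b - a * beta - alpha) ** 2 for a, b in zip(x, y)) / n
    return rss - beta ** 2 * theta + 2.0 * beta * mu

def mlse_case3(x, y, theta, mu):
    # closed-form minimizer of F3 over (beta, alpha), valid when S_xx - theta > 0
    xb, yb, sxx, syy, sxy = sample_moments(x, y)
    beta3 = (sxy - mu) / (sxx - theta)
    return beta3, yb - beta3 * xb
```

The analogous check for case (2) would compare F_{2n} at β̂_{2n} with nearby values of β ≠ 0.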


For estimation of the unknown error variances, θ in case (2) and λθ in case (3), we adapt here their method of moments estimators, which are also MLE’s in the corresponding SEIVM’s (2.1.1)–(2.1.2) with (ξ, δ, ε) being N(0, diag(Var ξ, Γ)) distributed, namely

θ̂_{2n} = S_{xx} − (S_{xy} − μ)/β̂_{2n}   (2.1.31)

and

λθ̂_{3n} = S_{yy} − (S_{xy} − μ) β̂_{3n}.   (2.1.32)
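A quick simulation sketch (ours; all numerical values are hypothetical) of the consistency of (2.1.31)–(2.1.32): with λθ and μ treated as known in case (2), and θ and μ known in case (3), the plug-in estimators recover θ and λθ on a large sample.

```python
import random

random.seed(7)
n = 20000
beta, alpha = 1.5, 0.7
lam, theta, mu = 2.0, 1.0, 0.0   # Var(delta) = lam*theta, Var(eps) = theta, Cov(delta, eps) = mu
xi = [0.5 + 3.0 * random.random() for _ in range(n)]                          # explanatory variables
x = [u + random.gauss(0.0, theta ** 0.5) for u in xi]                         # x_i = xi_i + eps_i
y = [u * beta + alpha + random.gauss(0.0, (lam * theta) ** 0.5) for u in xi]  # y_i = xi_i*beta + alpha + delta_i

xb, yb = sum(x) / n, sum(y) / n
sxx = sum((a - xb) ** 2 for a in x) / n
syy = sum((b - yb) ** 2 for b in y) / n
sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / n

beta2 = (syy - lam * theta) / (sxy - mu)   # MLSE of beta in case (2)
beta3 = (sxy - mu) / (sxx - theta)         # MLSE of beta in case (3)
theta2 = sxx - (sxy - mu) / beta2          # estimator (2.1.31) of theta
lam_theta3 = syy - (sxy - mu) * beta3      # estimator (2.1.32) of lam*theta
```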

Moreover, to the best of our knowledge, we are the first to introduce and study here processes for the model (2.1.1)–(2.1.2) that at t = 1 retain the above estimators centered by θ and λθ, respectively. The somewhat complicated forms of these processes are motivated by the appeal of their asymptotic properties (cf. Theorems 2.1.2c and 2.1.3c of Section 2.1.4). Our process in D[0,1] that corresponds to θ̂_{2n} of (2.1.31) is defined as

(θ̂_{2n} − θ)_{[nt]} = ((S_{yy,[nt]} − λθ[nt]/n)(S_{xx,[nt]} − θ[nt]/n) − (S_{xy,[nt]} − μ[nt]/n)²) / (S_{yy} − λθ)
  + (S_{yy} − λθ)⁻¹ (S_{yy} − λθ − S_{yy,[nt]} + λθ[nt]/n) × ((S_{yy,[nt]} − λθ[nt]/n) − 2β̂_{2n}(S_{xy,[nt]} − μ[nt]/n) + β̂_{2n}²(S_{xx,[nt]} − θ[nt]/n)) / β̂_{2n}²,   (2.1.33)

provided that (2.1.23) is satisfied, where β̂_{2n} = (β̂_{2n} − β)_{K_{2n}(1)} + β and (β̂_{2n} − β)_{K_{2n}(t)} is from (2.1.22). Assuming that (2.1.25) holds true, we also introduce the process for λθ in D[0,1] as

(λθ̂_{3n} − λθ)_{[nt]} = (S_{yy,[nt]} − λθ[nt]/n) − 2β̂_{3n}(S_{xy,[nt]} − μ[nt]/n) + β̂_{3n}²(S_{xx,[nt]} − θ[nt]/n), 0 ≤ t ≤ 1,   (2.1.34)


with β̂_{3n} = (β̂_{3n} − β)_{K_{3n}(1)} + β from (2.1.24).

We note in passing that since θ̂_{1n}, θ̂_{2n} and λθ̂_{3n} (cf. (2.1.18), (2.1.31) and (2.1.32), respectively) are estimators of unknown positive variances, due to their consistency (cf. Theorems 2.1.1a and 2.1.1b of Section 2.1.4), they are eventually nonnegative. Throughout Chapter 2, an important convention is that all the introduced estimators, namely β̂_{in}, α̂_{in}, θ̂_{1n}, θ̂_{2n} and λθ̂_{3n}, i = 1, 2, 3, and their corresponding processes in D[0,1] are introduced and studied here on assuming that the model (2.1.1)–(2.1.2) is nondegenerate, i.e., that β ≠ 0. When β = 0, the model (2.1.1)–(2.1.2) reduces to

y_i = α + δ_i,   (2.1.35)

x_i = ξ_i + ε_i.   (2.1.36)

Thus, the observations y_i and x_i in (2.1.35)–(2.1.36) are no longer linearly associated and, when the errors δ_i and ε_i are uncorrelated or independent, they do not seem to provide meaningful information for estimating α and β in the EIVM context.

2.1.3 Prelude to Main Results

In this subsection, for better appreciation and understanding of our main results in Section 2.1.4, we provide an introduction to their origin and nature in view of related results in the literature and in Chapter 1 of this thesis. As to abbreviations used in this subsection, CLT stands for central limit theorem, DAN denotes the notion of the domain of attraction of the univariate normal law, and i.i.d.r.v.’s is for independent identically distributed random variables. Further to the notion of DAN, DAN is the collection of sequences of i.i.d.r.v.’s {Z, Z_i, i ≥ 1} for which there are constants a_n and b_n, b_n > 0, such that (∑_{i=1}^n Z_i − a_n)/b_n →_D N(0,1), n → ∞. A square block-diagonal matrix will be defined by listing the square matrix blocks on its diagonal as follows: diag(·, ..., ·).


In Chapter 2 we continue to explore and develop Studentization ideas in connection with some basic asymptotic theory for the linear EIVM’s (2.1.1)–(2.1.2). Despite the strong interplay between the results of Chapter 1 for SEIVM’s and Chapter 2 for FEIVM’s, which is to be extensively covered in this subsection, the present chapter is written so that it can be read independently of Chapter 1. An exception in this regard is Section 2.1.5, where it is more convenient to lean on the corresponding section of Chapter 1, namely Section 1.1.5.

Some general remarks on the long history of EIVM’s, the models that are also known as measurement error models, or functional/structural relationships, or regressions with errors in variables, can be found in Section 1.1.3 of Chapter 1. In the next three paragraphs, we review the history and present state of CLT’s and consistency results in the literature on the linear FEIVM’s (2.1.1)–(2.1.2) with univariate observations and without equation error, under one of the identifiability conditions in (1)–(3). Given the present variety of different EIVM’s studied in the literature, one should not be misled or surprised by our choice of the simplest-in-form EIVM’s. Similarly to Chapter 1, such a set-up in Chapter 2 allows us to provide a better insight into the new Studentization approach to some basic asymptotic theory for FEIVM’s. Hence, we are not concerned here with possible generalizations of the main results to some more complex-in-form FEIVM’s. However, one of the main objectives of this chapter is to use the most general moment-like and distribution-free assumptions, respectively, on the explanatory variables and error terms of FEIVM’s (2.1.1)–(2.1.2). Time and again we will call attention to the generality of our model assumptions in the present context, in particular, to the fact that the error terms are not necessarily assumed to follow a normal distribution here, and that the assumptions on the explanatory variables that are commonly used in the literature are relaxed here.

According to Section 11 of Moran [52], “the basic theory” of FEIVM’s was first


considered by Lindley in the 1947 paper [38]. By that time, in view of some historical remarks of Section 1.1.3 of Chapter 1 that are based on the survey papers quoted therein, some initial studies on SEIVM’s had already been in progress. However, quoting from Section 3 of Sprent [62], “The distinction between functional and structural relationships was first clearly stated by Kendall” in [31] and [32], only in 1951. Making a clear distinction between the two types of EIVM’s is essential, since estimation problems, in particular, basic asymptotic theories in FEIVM’s and SEIVM’s, are approached quite differently. The main reason for this is that the number of FEIVM parameters increases with the sample size n, due to the explanatory variables ξ_1, ..., ξ_n being fixed unknown parameters, as opposed to the situation in SEIVM’s, where the ξ_i are i.i.d.r.v.’s. This principal feature of any FEIVM also leads to some major problems with maximum likelihood estimation that were studied by many authors (cf., e.g., [12], [52] and [62] for the references).

In view of the previous paragraph and [12], the first consistency results in FEIVM’s (1.1.1)–(1.1.2) under (1)–(3) are apparently due to [38]. In particular, the “correction for degrees of freedom” for the MLE of θ under (1) and {(δ_i, ε_i), i ≥ 1} being i.i.d. N(0, diag(λθ, θ)) distributed, which leads to θ̂_{1n} (cf. Section 2.1.2), is attributed to [38]. As to CLT studies in EIVM’s (FEIVM’s/SEIVM’s) (2.1.1)–(2.1.2) under (1)–(3), judging from the survey papers and texts listed in Section 1.1.3 of Chapter 1, they appear to really begin only in the 1970’s, after, according to [62], some initial estimation problems in EIVM’s had been clarified. One of the indicators of this is the stating of open questions (5) and (7) in the concluding part of the general survey paper [52].
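The degrees-of-freedom phenomenon behind this correction can be seen numerically. In the sketch below (our own illustration, using the classical normal-theory form of the MLE of θ in the λ-known functional model with μ = 0; the design and constants are hypothetical), the unadjusted MLE stabilizes near θ/2, while the 2n/(n − 2)-adjusted version is consistent for θ:

```python
import math
import random

random.seed(1)
n = 40000
beta, lam, theta = 2.0, 1.0, 0.5                 # lam = Var(delta)/Var(eps) is known
xi = [math.sin(i) + i % 5 for i in range(n)]     # a fixed, well-spread design
x = [u + random.gauss(0.0, math.sqrt(theta)) for u in xi]
y = [u * beta + random.gauss(0.0, math.sqrt(lam * theta)) for u in xi]

xb, yb = sum(x) / n, sum(y) / n
sxx = sum((a - xb) ** 2 for a in x) / n
syy = sum((b - yb) ** 2 for b in y) / n
sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / n

zhat = (syy - lam * sxx) / (2.0 * sxy)
b1 = zhat + math.copysign(math.sqrt(zhat ** 2 + lam), sxy)              # WLSE of beta
mle = (syy - 2.0 * b1 * sxy + b1 ** 2 * sxx) / (2.0 * (lam + b1 ** 2))  # normal-theory MLE of theta
theta1 = (2.0 * n / (n - 2.0)) * mle   # the "correction for degrees of freedom"
```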

Among the numerous works concerned with consistency and asymptotic normality of estimators in FEIVM’s (2.1.1)–(2.1.2) under an identifiability condition in (1)–(3), the most relevant to the ones in Chapter 2, i.e., those that treat the explanatory and error variables under the most general assumptions, are Gleser [21], Cheng and Tsai [9] and Cheng and Van Ness ([10], [11]). Delegating the details to Remarks 2.1.2, 2.1.3


and 2.1.5, we first note here that the results of these papers imply consistency and √n-asymptotic normality of (β̂_{1n}, α̂_{1n}, θ̂_{1n}) ((β̂_{1n}, α̂_{1n}) in case Γ of (1.1.3) is completely known), (β̂_{2n}, α̂_{2n}, θ̂_{2n}) and (β̂_{3n}, α̂_{3n}, λθ̂_{3n}), respectively under (1), (2) and (3). We would also like to call attention to the fact that, just like the CLT’s from our main references for the companion SEIVM’s (2.1.1)–(2.1.2) of Chapter 1, a common feature of all these CLT’s is that the covariances of their limiting normal distributions depend on various unknown error moments and on some unknown parameters associated with the explanatory variables (cf. part (c) of Remark 2.1.6). Consequently, applications of these CLT’s are aggravated by having to additionally estimate these typically unknown error moments and some unknown parameters of {ξ_i, i ≥ 1}, which is hard in practice, except in some special cases (for example, when the moments of the error terms {(δ_i, ε_i), i ≥ 1} are like those of the N(0, diag(λθ, θ)) distribution, as illustrated in [21]; cf. part (c) of Remark 2.1.6).

In Chapter 1, we obtained different types of CLT’s for the estimators of Section 2.1.2, along with new invariance principles for the appropriate corresponding processes of these estimators, in SEIVM’s (2.1.1)–(2.1.2). Strongly inspired and influenced by recent advances in DAN via Studentization and self-normalization (cf., e.g., [16] and other relevant references in Section 1.1.3 of Chapter 1), we introduced the new assumption ξ ∈ DAN for the i.i.d.r.v.’s {ξ, ξ_i, i ≥ 1}, which have usually been viewed as i.i.d.r.v.’s with positive finite variance. Such an enrichment of the traditional space of explanatory variables led to exploring Studentization ideas in the context of SEIVM’s (2.1.1)–(2.1.2). As a consequence of this approach, all the invariance principles in Chapter 1 (in particular, the CLT’s for (β̂_{1n}, α̂_{1n}, θ̂_{1n}), (β̂_{2n}, α̂_{2n}, θ̂_{2n}) and (β̂_{3n}, α̂_{3n}, λθ̂_{3n})) are invariant in form within the class of {(ξ, δ, ε), (ξ_i, δ_i, ε_i), i ≥ 1} and are readily available for some immediate applications as in Section 1.1.5 of Chapter 1, since, apart from the model parameters of interest, these asymptotic results, as opposed to related results in the literature, do not contain unknown distribution


parameters of (ξ, δ, ε).

Naturally, in view of Chapter 1, it became desirable to also replace the CLT’s for FEIVM’s in the aforementioned [21], [9], [11] and [10] by more satisfactory Studentization CLT’s similar to those of Chapter 1. Moreover, hoping for a natural interplay between the SEIVM’s (2.1.1)–(2.1.2) studied in Chapter 1 and appropriate FEIVM’s (2.1.1)–(2.1.2) prompted us to think of establishing various invariance principles in the latter models that would be identical in form to those obtained in Chapter 1. To achieve this goal, we first had to introduce FEIVM’s (2.1.1)–(2.1.2) that would be appropriate companions for the SEIVM’s (2.1.1)–(2.1.2) of Chapter 1. Consequently, specifying the nature of the explanatory variables in FEIVM’s (2.1.1)–(2.1.2), the conditions in (D)–(F) were introduced to match the assumptions that the i.i.d.r.v.’s {ξ, ξ_i, i ≥ 1} of the SEIVM’s in Chapter 1 are either nondegenerate with finite mean, or such that ξ ∈ DAN. We note that, in view of {ξ, ξ_i, i ≥ 1} in Chapter 1 having positive finite or infinite variance, leaning on Remark 1.2.2 of Chapter 1, one could have simply assumed in place of (E) that either 0 < lim_{n→∞} (ξ̄² − (ξ̄)²) < ∞, or lim_{n→∞} (ξ̄² − (ξ̄)²) = ∞. However, the behaviour of ξ̄² − (ξ̄)², as n → ∞, is not exhausted by these two conditions here, except in Theorems 2.1.2b and 2.1.3b in regard of the intercept α. Later on in this subsection we will see that the introduced conditions (D)–(F) are weaker than those imposed on the explanatory variables in the FEIVM’s studied in [21], [9], [11] and [10].

Due to introducing FEIVM’s in Chapter 2 that correspond to SEIVM’s in Chap­ ter 1, our main Theorems 2.1.1a, 2.1.1b, 2.1.2a-2.1.2c and 2.1.3a-2.1.3c (collectively called Theorems 2.1.1, 2.1.2 and 2.1.3 respectively) in Section 2.1.4 are synchronized and share similar features with their companion results in Chapter 1. In particular, under the appropriate conditions from (A)-(F), weak consistency results of The­ orem 2.1.1 and CLT’s and weak invariance principles in Theorems 2.1.2, 2.1.3 are identical to the corresponding theorems for estimators of Section 2.1.2 and their


appropriate processes in Chapter 1, which are proved under companion assumptions to (A)–(F). Thus, for example, the matching results to the aforementioned ones of Theorem 2.1.2 are proved in Chapter 1 under the present (C) and the appropriate condition in (1)–(3) on the error terms, under the condition ξ ∈ DAN, which is a random analogue of the present (D)–(F), and also under the condition that ξ and (δ, ε) are independent, which does not need an analogue here, since {ξ_i, i ≥ 1} are nonrandom in FEIVM’s (2.1.1)–(2.1.2). Also, similarly to the CLT’s in Chapter 1, dealing with less restrictive assumptions on the explanatory variables than those in the literature and having Studentized forms, the CLT’s of Theorem 2.1.2 and the completely data-based CLT’s of Theorem 2.1.3 do not contain unknown moments of δ and ε, or parameters associated with {ξ_i, i ≥ 1}, in the variances of their limiting normal distributions and, consequently, are readily applicable (cf. Remarks 2.1.6 and 2.1.9). Moreover, though our CLT’s imply those in our main references [21], [9], [11] and [10], due to their aforementioned main Studentization features they can also be viewed as new ones under the conditions of these papers (cf. Remarks 2.1.5 and 2.1.6). As to the weak invariance principles and weak approximations in probability of Theorems 2.1.2 and 2.1.3, they are believed to be first-time results, and some of the weak invariance principles of Theorem 2.1.3 are also completely data-based. Strong consistency results for the estimators of Section 2.1.2 turned out to be slightly more challenging in the context of Chapter 2 (cf., e.g., Remark 2.1.3); hence the lack of a striking correspondence with the strong consistency results of Chapter 1. All the consistency results of Theorem 2.1.1 extend related known results (cf. Remarks 2.1.1–2.1.3). In addition to the main results, many of the remarks of Section 2.1.4 contain complementary and similar results and facts that are not organized in separate statements.
We wish to note that in Chapter 1 there are also joint CLT’s available for (β̂_{1n}, α̂_{1n}, θ̂_{1n}), (β̂_{2n}, α̂_{2n}, θ̂_{2n}) and (β̂_{3n}, α̂_{3n}, λθ̂_{3n}). The work on establishing their analogues in the present context of FEIVM’s (2.1.1)–(2.1.2) is in progress by the author (cf. [47] and the Epilogue). We also note that, though a global scheme of the proofs of the main results in Chapter 1 was adapted for the proofs of Theorems 2.1.1–2.1.3 in Section 2.2.2 of Chapter 2, the methods and details of the proofs of the major auxiliary results in Chapters 1 and 2 are very different. For example, while the key auxiliary Lemma 2.2.6 for Theorem 2.1.2 is on invariance principles for a sequence of independent, nonidentically distributed r.v.’s satisfying Lindeberg’s condition, in the similar situation in Chapter 1 we deal with invariance principles for i.i.d.r.v.’s from DAN.

The desirable identity of Theorems 2.1.1–2.1.3 of Chapter 2 to the corresponding main results of Chapter 1, described in the previous paragraph, establishes a striking interplay between the FEIVM’s of Chapter 2 and the SEIVM’s of Chapter 1. In this regard, we especially note that the weak consistency results and CLT’s for all the estimators of Section 2.1.2, as well as the weak invariance principles for (α̂_{jn} − α)_{[nt]} (under lim_{n→∞} (ξ̄² − (ξ̄)²) = ∞ in Chapter 2 and Var ξ = ∞ in Chapter 1), (θ̂_{1n} − θ)_{[nt]}, (θ̂_{2n} − θ)_{[nt]} and (λθ̂_{3n} − λθ)_{[nt]}, are genuinely indistinguishable in form and are proved under companion model assumptions in the two chapters. Thus, these results are invariant with respect to explanatory variables {ξ, ξ_i, i ≥ 1} satisfying all or some of the deterministic assumptions in (D)–(F), or the random assumption that the i.i.d. ξ_i are either nondegenerate with finite mean, or such that ξ ∈ DAN. In other words, the aforementioned asymptotics disregard whether the explanatory variables have a deterministic nature, as in Chapter 2, or a random nature, as in Chapter 1.

Though FEIVM’s and SEIVM’s have always developed parallel to each other, a breakthrough in studying the interaction of these two types of models in a systematic way is due to Gleser [22] in 1983. In [22], it is noted that most papers had concentrated on FEIVM’s or SEIVM’s only, with a couple of exceptions where the authors make use of results known for one type of EIVM to infer results for the


other type of EIVM. In the next paragraph we give a glimpse of Gleser’s [22] interplay of FEIVM’s and SEIVM’s and relate it to the interplay between the FEIVM’s of Chapter 2 and the SEIVM’s of Chapter 1 described in the previous paragraph. In the literature on asymptotic theory concerning FEIVM’s (2.1.1)–(2.1.2), it has usually been assumed that the explanatory variables are such that

ξ̄ → m and ξ̄² − (ξ̄)² → M > 0, n → ∞, with finite m and M.   (2.1.37)

It is easy to see that convergence of ξ̄² to a finite limit implies that n⁻¹ξ_n² = ξ̄² − n⁻¹(n−1) ((n−1)⁻¹ ∑_{i=1}^{n−1} ξ_i²) → 0, n → ∞, and the latter convergence, via proof by contradiction, leads to n⁻¹ max_{1≤i≤n} ξ_i² → 0, n → ∞, and thus to (F). Hence (D)–(F), assumed in the present work, are weaker than (2.1.37). As to asymptotic studies of SEIVM’s (2.1.1)–(2.1.2), as mentioned earlier in this subsection, {ξ, ξ_i, i ≥ 1} have usually been referred to as i.i.d.r.v.’s such that

Eξ = m and Var ξ = M > 0, with finite m and M,   (2.1.38)

except in Chapter 1 of this thesis, where the conditions (2.1.38) were weakened, via the introduction of the assumption ξ ∈ DAN, by not necessarily having to assume that M < ∞. In particular, assuming exactly (2.1.37) when dealing with FEIVM’s, or (2.1.38) for SEIVM’s, and (C) on the error terms, the papers [21] and [22] combined, [9], [11] and [10] imply √n-asymptotic normality for the estimator triples (β̂_{1n}, α̂_{1n}, θ̂_{1n}), (β̂_{2n}, α̂_{2n}, θ̂_{2n}) and (β̂_{3n}, α̂_{3n}, λθ̂_{3n}), as already mentioned in regard to FEIVM’s at the beginning of this subsection. The technically convenient assumptions in (2.1.37) and (2.1.38) seem to have gained even more popularity and have become standard in FEIVM’s and SEIVM’s (2.1.1)–(2.1.2) after Gleser [22]. In the latter paper, Gleser proposes a unifying approach to such FEIVM’s and SEIVM’s and gives an insight into their interrelationship. In particular, a summary in [22] on the interplay of FEIVM’s, SEIVM’s and the asymptotic theories within these models


enables one to translate large sample results (consistency and asymptotic normality) for (β̂_{1n}, α̂_{1n}, θ̂_{1n}) proved in FEIVM’s under (2.1.37) (cf. [21]) to SEIVM’s with {ξ_i, i ≥ 1} satisfying (2.1.38). In view of the remarks of Section 1.1.4 in Chapter 1 of this thesis and Remarks 2.1.1–2.1.3, 2.1.5, 2.1.6 and 2.1.9 of Section 2.1.4, we conclude not only that (2.1.37) and (2.1.38) are not necessary for the consistency results and CLT’s in [21] and [22] combined, as well as in [9], [11] and [10], but also that, via assuming less about the explanatory variables, Theorems 2.1.2 and 2.1.3 and their companions in Chapter 1 provide essentially improved CLT’s for the estimators of Section 2.1.2, and some new invariance principles as well. Moreover, the conclusions of Gleser [22] on the identity of the consistency results and CLT’s in FEIVM’s and SEIVM’s (2.1.1)–(2.1.2) satisfying (2.1.37) and (2.1.38), respectively, are also a part of the interplay between our present FEIVM’s (2.1.1)–(2.1.2) and the SEIVM’s of Chapter 1 that is described in the second to last paragraph of this subsection. It is interesting to note that, while the interrelationship between FEIVM’s and SEIVM’s in [22] is based on studying FEIVM’s, the interplay established in Chapter 2 originated in a natural way from first sorting out limit theorems in the SEIVM’s of Chapter 1.
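To illustrate numerically how (D)–(F) go beyond (2.1.37) (a sketch of ours, with a hypothetical design): for the fixed design ξ_i = i^{1/4}, the sample variance ξ̄² − (ξ̄)² diverges, so the second convergence in (2.1.37) fails, while a negligibility condition of type (F), n⁻¹ max_{1≤i≤n} ξ_i² → 0, clearly holds.

```python
def design_stats(n):
    # for xi_i = i**(1/4): the sample variance grows like a constant times sqrt(n),
    # while the negligibility ratio max_i xi_i^2 / n = sqrt(n)/n = n**(-1/2) vanishes
    xi = [i ** 0.25 for i in range(1, n + 1)]
    mean_sq = sum(u * u for u in xi) / n
    mean = sum(xi) / n
    variance = mean_sq - mean * mean
    negligibility = max(u * u for u in xi) / n
    return variance, negligibility

var_small, neg_small = design_stats(100)
var_big, neg_big = design_stats(100000)
```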

The results of Chapter 2 and their companions in Chapter 1 share further duality properties. Just like replacing (2.1.38) with ξ ∈ DAN in Chapter 1 allows the estimators β̂_{jn} of β, j = 1, 2, 3, to have rates of convergence in the CLT’s that are faster than √n, introducing the conditions (D)–(F) that weaken (2.1.37) leads to the same effect in the FEIVM’s here (cf. Remark 2.1.7 of Section 2.1.4). Moreover, (F) here, and respectively the assumption ξ ∈ DAN in Chapter 1 that, due to Lemma 1.2.1 of Chapter 1, is characterized by a random version of (F), namely max_{1≤i≤n} ξ_i² / ∑_{i=1}^n ξ_i² →_P 0, n → ∞, are crucial for the invariance principles of our main Theorem 2.1.2 of Section 2.1.4 and its companion in Chapter 1 (cf. Remark 2.1.8 for details). A new insight on FEIVM’s (2.1.1)–(2.1.2) and their connection and nearness to regression models is given in part (e) of Remark 2.1.10, which is due to exploring further duality with SEIVM’s studied in Chapter 1. Naturally, in Section 2.1.5 we will see a great deal of interplay between the applications of the main results of Chapters 2 and 1. “Negligibility” conditions like (F) and ξ ∈ DAN have been explored in various statistical applications. For example, a DAN condition was used by Maller [41] to establish asymptotic normality of the regression coefficient in a linear regression, and Maller [42] lists further examples in this regard, concerned with the asymptotic theory of generalized linear models, of time series and of some other models as well. Kukush and Maschke [36] make use of condition (F), with an assumed rate of convergence to zero, to deal with the MLSE in a special case of FEIVM’s (2.1.1)–(2.1.2). It appears that, apart from Chapters 2 and 1 of this thesis, the “negligibility” conditions (F) and ξ ∈ DAN have not yet been explored in connection with the basic limit theorems in the context of (2.1.1)–(2.1.2). Both in Chapter 2 and in Chapter 1, these conditions have a natural intuitive appeal. Indeed, in Chapter 1, assuming ξ ∈ DAN instead of (2.1.38) lets the explanatory variables in (2.1.1)–(2.1.2) have infinite variance and thus dominate over the errors, which have finite variance. This, in turn, renders the observations y_i and x_i more robust to noise (errors) and thus more precise. A similar interpretation of enlarging the usual class (2.1.37) of {ξ_i, i ≥ 1} to that defined by (D)–(F) can be given in the context of the FEIVM’s (2.1.1)–(2.1.2) (cf. Remark 2.1.7).

2.1.4 Main Results with Remarks

In this subsection we list our main results, Theorems 2.1.1–2.1.3, with Remarks 2.1.1–2.1.12 on them. Many of the remarks contain complementary results, frequently with immediate short proofs. First, we introduce new necessary notations, definitions and an abbreviation. Convergence in probability is denoted by →_P, where P is the probability measure generated by all the finite-dimensional distributions in n of the model (2.1.1)–(2.1.2).


Sometimes we say “convergence in P” for convergence in probability P. Throughout this chapter, O_P(1) stands for random variables that are bounded in probability P, while o_P(1) indicates convergence of r.v.’s to zero in P. For convergence almost surely, frequently called a.s. convergence here, we write →_{a.s.}. The abbreviation SLLN is used for the classical Kolmogorov strong law of large numbers for i.i.d.r.v.’s with finite mean. The notation →_D stands for convergence in distribution. By writing {W(t), 0 ≤ t < ∞}, or W(t), we mean a standard real-valued Wiener process (Brownian motion). The space D[0,1] of real-valued functions on [0,1], equipped with the sup-norm metric ρ, is denoted by (D[0,1], ρ). Throughout this chapter all vectors are row-vectors, ‖·‖ denotes their Euclidean norm, while ⟨·, ·⟩ stands for the Euclidean inner product of two vectors. A random vector Z is called full or, equivalently, Z has a full distribution, if for any deterministic vector u with ‖u‖ = 1, ⟨Z, u⟩ is a nondegenerate r.v. Throughout this chapter, const universally stands for various absolute constants. Sometimes, instead of saying that a certain property related to a sequence of r.v.’s holds on sets whose probabilities approach one, i.e., that the property holds “with probability approaching one”, we use the abbreviation WPA1. Our first main result, Theorem 2.1.1, is on consistency of the WLSE’s, the MLSE’s, θ̂_{1n}, θ̂_{2n} and λθ̂_{3n}. It is split into parts, Theorem 2.1.1a and Theorem 2.1.1b, collectively called Theorem 2.1.1 on occasion.

Theorem 2.1.1a. Assume the identifiability assumption in (1)-(3) that is appropriate for the estimator in hand. Let (A), (D) and (E) be satisfied. Assume that $\mu = 0$ in (2.1.3) when studying $\hat\alpha_{1n}$ and $\hat\theta_{1n}$. Then, as $n\to\infty$,

$$\hat\beta_{2n}\xrightarrow{P}\beta,\quad \hat\beta_{3n}\xrightarrow{P}\beta,\quad \hat\alpha_{jn}\xrightarrow{P}\alpha,\quad \hat\theta_{in}\xrightarrow{P}\theta\ (i=1,2)\quad\text{and}\quad \widehat{\lambda\theta}_{3n}\xrightarrow{P}\lambda\theta,\qquad j=\overline{1,3}, \tag{2.1.39}$$

while under (B), (D), (E) and (F), (2.1.39) holds almost surely.

Theorem 2.1.1b. Assume that $\mu = 0$ in (2.1.3) and that (1) is valid.


(i) Suppose that
$$\limsup_{n\to\infty} S_{\xi\xi} < \infty; \tag{2.1.40}$$
then (A), (D) and (E) lead to

$$\hat\beta_{1n}\xrightarrow{P}\beta, \qquad n\to\infty, \tag{2.1.41}$$

while under (B), (D), (E) and (F), (2.1.41) holds almost surely.

(ii) Without assuming (2.1.40), under (C)-(F), (2.1.41) holds true.

Remark 2.1.1. Gleser [21] establishes large sample results in the context of the functional model (2.1.1)-(2.1.2) with vector-valued observations $y_i$ and $x_i$, and with slope $\beta$ being a matrix. In particular, when $y_i$ and $x_i$ are $\mathbb R$-valued, it follows from [21] that $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$ are strongly (hence, also weakly) consistent, under the therein postulated assumptions that (A) and identifiability condition (1) with $\mu=0$ are satisfied, as well as that assumption (D) and the conditions

$$\lim_{n\to\infty}\overline{\xi^2}\ \text{ exists} \tag{2.1.42}$$

and
$$\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) > 0 \tag{2.1.43}$$

are valid in regards of the explanatory variables $\{\xi_i,\ i\ge1\}$. In fact, the consistency in [21] is for the model with $\Gamma$ of (1.1.3) where $\lambda=1$, $\mu=0$ and $\theta$ is unknown, but we note that, naturally, the results for $\hat\beta_{1n}$ and $\hat\alpha_{1n}$ are also valid when $\lambda=1$, $\mu=0$ and $\theta$ is known. Moreover, in view of the upcoming Remark 2.1.11, consistency of $\hat\beta_{1n}$ and $\hat\alpha_{1n}$ ($\theta$ known or not), and of $\hat\theta_{1n}$, can be extended from the case $\lambda=1$ and $\mu=0$ to the case of arbitrary $\lambda$ and $\mu=0$. Clearly, (2.1.39) of Theorem 2.1.1a and part (i) of Theorem 2.1.1b imply the aforementioned weak consistency of $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$ that follows from Gleser [21]. Moreover, on assuming the existence of a bit more than two moments for the error terms (cf. (B)), we are able to relax conditions (2.1.42)


and (2.1.43) of [21], via assuming (E) and (F) (cf. the discussion in Section 2.1.3 after (2.1.37) on concluding that (2.1.42) with a positive limit implies (F)), and still retain strong consistency of $\hat\beta_{1n}$ and $\hat\theta_{1n}$ (cf. Theorem 2.1.1a). On the other hand, analysing the method of proof in [21], we see that condition (2.1.42) was crucial for proving strong consistency there. Indeed, just like the proof of strong consistency in Theorem 2.1.1a reduces to showing that $S_{\xi\delta}/S_{\xi\xi}\xrightarrow{a.s.}0$ and $S_{\xi\varepsilon}/S_{\xi\xi}\xrightarrow{a.s.}0$, the key step of the proof in [21] was establishing the convergence, as $n\to\infty$,

$$\overline{\xi\delta} - c\,\bar\delta \xrightarrow{a.s.} 0 \quad\text{and}\quad \overline{\xi\varepsilon} - c\,\bar\varepsilon \xrightarrow{a.s.} 0,$$

with $c$ as in (2.1.7), or, due to (D) and the SLLN for $\bar\delta$ and $\bar\varepsilon$, the convergence

$$\overline{\xi\delta}\xrightarrow{a.s.}0 \quad\text{and}\quad \overline{\xi\varepsilon}\xrightarrow{a.s.}0. \tag{2.1.44}$$

The condition in (2.1.42) assumed in [21] yielded

$$\sum_{i=1}^{\infty}\frac{\xi_i^2}{i^2} < \infty \tag{2.1.45}$$

that, in turn, combined with the Kolmogorov SLLN for nonidentically distributed independent r.v.'s with two moments, resulted in (2.1.44). However, via Kronecker's lemma, it is easy to see that if the condition in (2.1.42) is violated, then (2.1.45), and thus (2.1.44), may no longer be true (e.g., when $c=0$, and (D) does not have to be assumed to prove strong consistency of $\hat\beta_{1n}$ and $\hat\theta_{1n}$ in [21], one may take $\xi_i^2 = i$, $1\le i\le n$, implying that $\sum_{i=1}^n\xi_i^2 = n(n+1)/2$ and $n^{-2}\sum_{i=1}^n\xi_i^2 \to 1/2 \ne 0$, $n\to\infty$, and the latter, in turn, leads to violation of (2.1.45)). Consistency of the WLSE $\hat\beta_{1n}$ under $c=0$ obtained in Theorem 2.1.1a ((D) does not have to be assumed) also implies one of the conclusions of the papers Kukush and Martsynyuk ([34], [35]), where the authors deal with a criterion for the WLSE of $\beta$ in multivariate no-intercept FEIVM's (2.1.1)-(2.1.2). Namely, according to [34] and [35], in particular, under (A), $\liminf_{n\to\infty}\overline{\xi^2} > 0$ and $\limsup_{n\to\infty}\overline{\xi^2} < \infty$, conditions that are


weaker than (2.1.42) of [21] with a positive limit, but stronger than (E), $\hat\beta_{1n}$ is strongly consistent.
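The counterexample of Remark 2.1.1 is elementary to check numerically: with $\xi_i^2 = i$, one has $n^{-2}\sum_{i\le n}\xi_i^2 \to 1/2 \ne 0$, while the Kolmogorov-type series of (2.1.45), $\sum_i \xi_i^2/i^2$, becomes the divergent harmonic series. A minimal sketch (an illustration only, with our own numerical values; not part of the thesis' proofs):

```python
# Numerical check of the Remark 2.1.1 counterexample: xi_i^2 = i.
n = 10**6
sum_xi_sq = n * (n + 1) / 2             # sum_{i<=n} xi_i^2 = n(n+1)/2
print(sum_xi_sq / n**2)                 # -> 0.5000005, approaching 1/2, not 0

# The series in (2.1.45), sum xi_i^2 / i^2 = sum 1/i, is the harmonic series:
import math
partial = sum(1.0 / i for i in range(1, n + 1))
print(partial)                          # ~ log n + 0.5772..., i.e., divergent in n
```

By Kronecker's lemma, the nonvanishing limit on the first line is exactly what rules out convergence of the series on the second.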

Remark 2.1.2. In [9], it is shown that $\hat\beta_{2n}$, $\hat\alpha_{2n}$, $\hat\beta_{3n}$ and $\hat\alpha_{3n}$ are strongly consistent, while the authors of [11] and [10] prove strong consistency for the triples $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$ respectively. In these three papers (A) with independent $\delta$ and $\varepsilon$, (D), (2.1.42) and (2.1.43) are assumed. In [11] and [10], in addition, $\delta$ and $\varepsilon$ are normally distributed. Arguing similarly to Remark 2.1.1, we conclude that (2.1.39) of Theorem 2.1.1a implies weak consistency of the estimators in these three papers. Furthermore, replacing (A) with the slightly stronger condition (B), and (2.1.42) and (2.1.43) of [11] and [10] with the weaker conditions (E) and (F), in Theorem 2.1.1a we also obtain strong consistency of the above estimators, for a larger class of explanatory variables $\{\xi_i,\ i\ge1\}$.

Remark 2.1.3. From the proof of part (ii) of Theorem 2.1.1b, it is seen that without condition (2.1.40), model (2.1.1)-(2.1.2) is not sensitive enough to capture consistency of $\hat\beta_{1n}$ only under (A), (D) and (E), which are used in part (i) of Theorem 2.1.1b. However, assuming also (C), i.e., the existence of four moments for the error terms, as opposed to the two moments used in [21] (cf. Remark 2.1.1), but not necessarily assuming (2.1.42) and (2.1.43) (according to Remark 2.1.1, (E) and (F) of part (ii) of Theorem 2.1.1b are weaker than (2.1.42) and (2.1.43) as assumed in [21]), we extend the weak consistency result for $\hat\beta_{1n}$ available in [21] via allowing the explanatory variables $\{\xi_i,\ i\ge1\}$ to lie in a richer space of numerical sequences. Concluding our remarks on Theorem 2.1.1, we note that the weak consistency results of Theorem 2.1.1 are completely synchronized with those in the companion Chapter 1 (cf. Section 2.1.3 for the full summary on the interplay of Chapter 1 and Chapter 2). Thus, the conditions on the explanatory variables for (2.1.39) and (2.1.41) in Chapter 1 are simply random companions of (D), (E) and (F). Proofs of the strong consistency results of Theorem 2.1.1 are slightly more challenging as compared to those in Chapter 1, and hence,


the correspondence between the strong consistency results in these two chapters is less profound. We also note in passing that, on account of (D) and (2.2.34) of the proof of Theorem 2.1.1a, all the estimators under study that were introduced in Section 2.1.2 are well-defined WPA1, $n\to\infty$.

Our second main result is Theorem 2.1.2. For convenience, it is split into Theorems 2.1.2a, 2.1.2b and 2.1.2c, respectively for the estimators of $\beta$, $\alpha$ and the unknown error variances, and their corresponding processes.

Theorem 2.1.2a. Let assumptions (C)-(F) and the identifiability assumption in (1)-(3) that is appropriate for the process in hand be satisfied. Let

$$U(j,n) = \begin{cases} 2S_{xy}, & \text{if } j=1,\\ S_{xy}-\mu, & \text{if } j=2,\\ S_{xx}-\theta, & \text{if } j=3,\end{cases} \tag{2.1.46}$$

$u_i(j,n)$ be as in (2.1.14) and (2.1.27), and $K_{jn}(t)$ be as in (2.1.13) and (2.1.26), $j=\overline{1,3}$. When studying $(\hat\beta_{1n}-\beta)K_{1n}(t)$, suppose that $\mu=0$ in (2.1.3). Then, as $n\to\infty$, the following statements hold true:

(i) $\dfrac{\sqrt n\,U(j,n)(\hat\beta_{jn}-\beta)K_{jn}(t_0)}{\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{\mathcal D} N(0,t_0)$, for $t_0\in(0,1]$;

(ii) $\dfrac{\sqrt n\,U(j,n)(\hat\beta_{jn}-\beta)K_{jn}(t)}{\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{\mathcal D} W(t)$ on $(D[0,1],\rho)$;

(iii) we can redefine $\{(\delta_i,\varepsilon_i),\ i\ge1\}$ on a richer probability space together with a sequence of independent standard normal r.v.'s $\{V_i,\ i\ge1\}$ such that

$$\sup_{0\le t\le1}\left|\frac{\sqrt n\,U(j,n)(\hat\beta_{jn}-\beta)K_{jn}(t)}{\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{1/2}} - \frac{\sum_{i=1}^{[nt]}\big(\operatorname{Var}u_i(j,n)\big)^{1/2}V_i}{\big(\sum_{i=1}^n\operatorname{Var}u_i(j,n)\big)^{1/2}}\right| = o_P(1).$$

Theorem 2.1.2b. Let assumptions (C)-(F) and the identifiability assumption in (1)-(3) that is appropriate for the process in hand be satisfied, and either


$\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = M < \infty$, or $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = \infty$. When studying $(\hat\alpha_{1n}-\alpha)L_{1n}(t)$, we also assume that $\mu=0$ in (2.1.3). Define
$$v_i(j,n) = (y_i-\alpha) - \beta x_i - \frac{\bar x}{U(j,n)}\,u_i(j,n), \qquad j=\overline{1,3}, \tag{2.1.47}$$

with $U(j,n)$ of (2.1.46) and $u_i(j,n)$ of (2.1.14) and (2.1.27), $j=\overline{1,3}$. Let $v_i'(j,n)$ be as in (2.1.17) and (2.1.30), and $L_{jn}(t)$ be as in (2.1.16) and (2.1.29), $j=\overline{1,3}$. Then, as $n\to\infty$, we have:

(i) $\dfrac{\sqrt n\,(\hat\alpha_{jn}-\alpha)L_{jn}(t_0)}{\big(\sum_{i=1}^n(v_i(j,n)-\bar v(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{\mathcal D} N(0,t_0)$, for $t_0\in(0,1]$;

(ii) $\dfrac{\sqrt n\,(\hat\alpha_{jn}-\alpha)L_{jn}(t)}{\big(\sum_{i=1}^n(v_i(j,n)-\bar v(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{\mathcal D} W(t)$ on $(D[0,1],\rho)$;

(iii) we can redefine $\{(\delta_i,\varepsilon_i),\ i\ge1\}$ on a richer probability space together with a sequence of independent standard normal r.v.'s $\{V_i,\ i\ge1\}$ such that

$$\sup_{0\le t\le1}\left|\frac{\sqrt n\,(\hat\alpha_{jn}-\alpha)L_{jn}(t)}{\big(\sum_{i=1}^n(v_i(j,n)-\bar v(j,n))^2/(n-1)\big)^{1/2}} - \frac{\sum_{i=1}^{[nt]}\big(\operatorname{Var}v_i(j,n)\big)^{1/2}V_i}{\big(\sum_{i=1}^n\operatorname{Var}v_i(j,n)\big)^{1/2}}\right| = o_P(1).$$

Theorem 2.1.2c. Let assumptions (C)-(F) and the identifiability assumption in (1)-(3) that is appropriate for the process in hand be satisfied. Suppose that $\mu=0$ in (2.1.3) when $(\hat\theta_{1n}-\theta)[nt]$ is studied. Let

$$M(j,n) = \begin{cases} (n-2)\big(\lambda+\beta^2\big)/n, & \text{if } j=1,\\ 1, & \text{if } j=2 \text{ or } 3,\end{cases} \tag{2.1.48}$$
and

$$w_i(j,n) = \begin{cases} (s_{i,yy}-\lambda\theta) - 2\beta(s_{i,xy}-\mu) + \beta^2(s_{i,xx}-\theta), & \text{if } j=1,\\ \beta^{-2}\big((s_{i,yy}-\lambda\theta) - 2\beta(s_{i,xy}-\mu) + \beta^2(s_{i,xx}-\theta)\big), & \text{if } j=2,\\ (s_{i,yy}-\lambda\theta) - 2\beta(s_{i,xy}-\mu) + \beta^2(s_{i,xx}-\theta), & \text{if } j=3. \end{cases} \tag{2.1.49}$$
Then, for $j=1$ and $2$, as $n\to\infty$, the following are valid:

(i) $\dfrac{\sqrt n\,M(j,n)(\hat\theta_{jn}-\theta)[nt_0]}{\big(\sum_{i=1}^n(w_i(j,n)-\bar w(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{\mathcal D} N(0,t_0)$, for $t_0\in(0,1]$;


(ii) $\dfrac{\sqrt n\,M(j,n)(\hat\theta_{jn}-\theta)[nt]}{\big(\sum_{i=1}^n(w_i(j,n)-\bar w(j,n))^2/(n-1)\big)^{1/2}} \xrightarrow{\mathcal D} W(t)$ on $(D[0,1],\rho)$;

(iii) we can redefine $\{(\delta_i,\varepsilon_i),\ i\ge1\}$ on a richer probability space together with a sequence of independent standard normal r.v.'s $\{V_i,\ i\ge1\}$ such that

$$\sup_{0\le t\le1}\left|\frac{\sqrt n\,M(j,n)(\hat\theta_{jn}-\theta)[nt]}{\big(\sum_{i=1}^n(w_i(j,n)-\bar w(j,n))^2/(n-1)\big)^{1/2}} - \frac{\sum_{i=1}^{[nt]}\big(\operatorname{Var}w_i(j,n)\big)^{1/2}V_i}{\big(\sum_{i=1}^n\operatorname{Var}w_i(j,n)\big)^{1/2}}\right| = o_P(1).$$

Moreover, (i)-(iii) are valid for $(\widehat{\lambda\theta}_{3n}-\lambda\theta)[nt]$, with $M(3,n)$ and $w_i(3,n)$ respectively in place of $M(j,n)$ and $w_i(j,n)$.

Remark 2.1.4. Weak convergence on $(D[0,1],\rho)$ in the (ii)'s of Theorem 2.1.2 is to be understood in the following way. Let $\mathcal D$ be the sigma-field of subsets of $D[0,1]$ generated by the finite-dimensional subsets of $D[0,1]$. We say that a sequence of random elements $\{X_n(t),\ 0\le t\le1,\ n\ge1\}$ of $D[0,1]$ converges weakly to $W(t)$ on

$(D[0,1],\rho)$ if
$$h(X_n(t)) \xrightarrow{\mathcal D} h(W(t)), \qquad n\to\infty,$$

for all $h : D[0,1]\to\mathbb R$ that are $(D[0,1],\mathcal D)$-measurable and $\rho$-continuous, or $\rho$-continuous except at points forming a set of Wiener measure zero on $(D[0,1],\mathcal D)$. With this definition in mind, also by using (2.2.118) and (2.2.123), we conclude the (ii)'s of Theorem 2.1.2 from the respective (iii)'s. For example, (iii) for the WLSP for $\beta$ implies that for any $h$ as above, as $n\to\infty$,

$$h\left(\frac{\sqrt n\,U(1,n)(\hat\beta_{1n}-\beta)K_{1n}(t)}{\big(\sum_{i=1}^n(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{1/2}}\right) - h\left(\frac{\sum_{i=1}^{[nt]}\big(\operatorname{Var}u_i(1,n)\big)^{1/2}V_i}{\big(\sum_{i=1}^n\operatorname{Var}u_i(1,n)\big)^{1/2}}\right) \xrightarrow{P} 0,$$
while from (2.2.118), with $\zeta_i$ of (2.2.59), the appropriate vector $b$ (as in the proof of Theorem 2.1.2a), $K_n(t)$ of (2.2.61) and independent standard normal r.v.'s $U_i$ that are denoted by $V_i$ in Theorem 2.1.2a,
$$h\left(\frac{\sum_{i=1}^{K_n(t)}\big(\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^n\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}}\right) \xrightarrow{\mathcal D} h(W(t)), \qquad n\to\infty,$$


and also on account of (2.2.123) ($L_n(t) = K_{1n}(t)$ and $K_{1n}(t)$ is as in (2.1.13), $\langle\eta_i(n),d\rangle = u_i(1,n)$ and $u_i(1,n)$ is from (2.1.14), and the $U_i$ are as above),

$$h\left(\frac{\sum_{i=1}^{[nt]}\big(\operatorname{Var}u_i(1,n)\big)^{1/2}V_i}{\big(\sum_{i=1}^n\operatorname{Var}u_i(1,n)\big)^{1/2}}\right) - h\left(\frac{\sum_{i=1}^{K_n(t)}\big(\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^n\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}}\right) \xrightarrow{P} 0, \qquad n\to\infty.$$
Consequently, the respective (ii) of Theorem 2.1.2a follows. Clearly, the (ii)'s yield the respective cases (i) in Theorem 2.1.2. Thus, the proof of Theorem 2.1.2 will be reduced to establishing the (iii)'s only. Despite this, cases (i)-(iii) are spelled out in Theorem 2.1.2 as separate results for further use and convenient reference.
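The mechanism behind concluding the (ii)'s from the (iii)'s can be seen on the simplest $\rho$-continuous functionals. For instance, $h(f) = \sup_t|f(t)|$ is Lipschitz for the sup-norm metric, $|h(f)-h(g)| \le \rho(f,g)$, so an $o_P(1)$ sup-norm distance between the Studentized process and its Gaussian approximation forces the $h$-values to merge. A small numerical illustration on discretized paths (generic arrays standing in for elements of $D[0,1]$; our own toy construction, not the thesis' objects):

```python
import numpy as np

rng = np.random.default_rng(0)

def h(path):
    # sup-norm functional h(f) = sup_t |f(t)|; rho-(indeed Lipschitz-)continuous
    return np.max(np.abs(path))

# Two paths that are uniformly close: |h(f) - h(g)| <= rho(f, g) = sup|f - g|.
t = np.linspace(0.0, 1.0, 1001)
f = np.cumsum(rng.standard_normal(1001)) / np.sqrt(1001)   # a random path
g = f + 0.01 * np.sin(20 * np.pi * t)                      # uniformly 0.01-close
rho = np.max(np.abs(f - g))
print(abs(h(f) - h(g)), rho)   # the difference never exceeds rho (Lipschitz bound)
```

The same bound applied with $\rho(f,g) = o_P(1)$, as delivered by a (iii), is exactly what turns the weak approximation into convergence of $h$-values.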

Remark 2.1.5. We are to discuss how the CLT's of Theorem 2.1.2 relate to those in the literature. First, we note that when the pair of conditions (2.1.42) and (2.1.43), as a special case of (E) and (F), is not assumed, the (i) parts with $t_0=1$ of Theorems 2.1.2a-2.1.2c present first-time-around CLT's. On the other hand, under (2.1.42) and (2.1.43) instead of (E) and (F), the CLT's of Theorem 2.1.2 imply the already known related CLT's in our main references in this regard. Thus, the vectors of

estimators $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$ (cf. [21]), $(\hat\beta_{2n}, \hat\alpha_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n})$ (cf. [9]), $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$

(cf. [11]) and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$ (cf. [10]) are known to be $\sqrt n$-asymptotically normal. The result that follows from [21] requires (1) ($\Gamma$ of (1.1.3) is such that $\lambda=1$, $\mu=0$ and $\theta$ is unknown), (C), (D), (2.1.42) and (2.1.43). We note that this result is also valid with arbitrary $\lambda$ of $\Gamma$ in view of Remark 2.1.11, which, naturally, implies $\sqrt n$-asymptotic normality for $(\hat\beta_{1n}, \hat\alpha_{1n})$ when $\lambda$ is arbitrary, $\mu=0$ and $\theta$ is known in $\Gamma$

of (1.1.3). Asymptotic normality for $(\hat\beta_{2n}, \hat\alpha_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n})$ in [9] is proved under (2) and (3) respectively, (C) with independent $\delta$ and $\varepsilon$, (D), (2.1.42) and (2.1.43). In [11] and [10] the authors use the same conditions as in [9] and also suppose that $\delta$ and $\varepsilon$ are normally distributed. To see that the CLT's of Theorem 2.1.2 imply all the aforementioned previously known ones, one first replaces the appropriate conditions of this theorem with the stronger conditions employed in [21], [9], [11] and [10], i.e., one assumes (2.1.42) and (2.1.43) instead of (E) and (F), also independence of $\delta$ and


$\varepsilon$ in regards of $(\hat\beta_{2n}, \hat\alpha_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n})$, and also that $(\delta,\varepsilon)$ is $N(0,\operatorname{diag}(\lambda\theta,\theta))$ in

regards of $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$. Then in such a set-up one concludes that, due to (2.2.29) and (2.2.152) (case $b^{(1)} = b^{(2)} = 0$ and/or $\lim_{n\to\infty}\overline{\xi^2} < \infty$),

and consistency of $\hat\beta_{1n}$, as $n\to\infty$, $\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)U^{-2}(j,n)$, $\sum_{i=1}^n(v_i(j,n)-\bar v(j,n))^2/(n-1)$ and $\big(\sum_{i=1}^n(w_i(j,n)-\bar w(j,n))^2/(n-1)\big)M^{-2}(j,n)$, $j=\overline{1,3}$, converge in probability to positive constants that are equal to the variances of the asymptotic normal distributions of the corresponding estimators in [21], [9], [11] and [10].

Remark 2.1.6. Parts (a)-(e) of this remark are concerned with the main Studentization features of Theorem 2.1.2. (a) Theorems 2.1.2a-2.1.2c may be viewed as nontrivial extensions of limit theorems based on Studentization, in that all the processes in Theorem 2.1.2, namely, $\sqrt n\,U(1,n)(\hat\beta_{1n}-\beta)K_{1n}(t)\big(\sum_{i=1}^n(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$, ..., are essentially Student processes (cf. the definition in (2.2.7)) in a somewhat loose sense. More precisely, it is seen in the proof of Theorem 2.1.2 in Section 2.2.2 that for $j=\overline{1,3}$ the processes in $D[0,1]$

$$\frac{\sqrt n\,\bar u(j,n)\,K_{jn}(t)}{\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{1/2}},\qquad \frac{\sqrt n\,\bar v'(j,n)\,L_{jn}(t)}{\big(\sum_{i=1}^n(v_i'(j,n)-\bar v'(j,n))^2/(n-1)\big)^{1/2}} \qquad\text{and}\qquad \frac{\sqrt n\,\bar w(j,n)\,[nt]}{\big(\sum_{i=1}^n(w_i(j,n)-\bar w(j,n))^2/(n-1)\big)^{1/2}} \tag{2.1.50}$$

are the main term processes respectively for those studied in Theorems 2.1.2a, 2.1.2b and 2.1.2c, where $\bar u(j,n)$, $\bar v'(j,n)$ and $\bar w(j,n)$ are defined in Section 2.1.2, $j=\overline{1,3}$. The processes in (2.1.50) can be viewed as special Student processes for the triangular sequences $\{u_i(j,n),\ 1\le i\le n,\ n\ge1\}$, ..., of dependent random variables. Moreover, all the processes in (2.1.50) are handled here via our key auxiliary Lemma 2.2.9 of Section 2.2.2, with universal Studentization invariance principles for all reasonable

estimators that are based on $(\bar y, \bar x, S_{yy}, S_{xy}, S_{xx})$ and for their corresponding processes in the context of FEIVM's (2.1.1)-(2.1.2) (cf. also Remark 2.2.8). The benefits from Studentization in Theorem 2.1.2 are that the results are invariant with respect to the distribution of $(\delta,\varepsilon)$ satisfying (C) and the identifiability assumption in (1)-(3) that is appropriate for the process at hand, do not contain unknown parameters of this distribution (they depend only on error moments that are assumed to be known according to the corresponding (1)-(3)), do not contain parameters associated with $\{\xi_i,\ i\ge1\}$, and the only unknown parameter appearing in the normalizers $\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{-1/2}$, $\big(\sum_{i=1}^n(v_i(j,n)-\bar v(j,n))^2/(n-1)\big)^{-1/2}$ and $\big(\sum_{i=1}^n(w_i(j,n)-\bar w(j,n))^2/(n-1)\big)^{-1/2}$, $j=\overline{1,3}$, is $\beta$. Some immediate applications of Theorem 2.1.2 and the upcoming Theorem 2.1.3, where $\beta$ in the aforementioned normalizers will be replaced with its WLSE or MLSE's that are available in the corresponding cases (1)-(3), will be discussed in Section 2.1.5.

(b) Here is how Theorem 2.1.2 originated. In principle, the respective CLT's in the (i)'s with $t_0=1$ of Theorems 2.1.2a-2.1.2c could have been available along with the long-existing CLT's in [21], [9], [11] and [10], provided that conditions (E) and (F) of Theorem 2.1.2 were reduced to (2.1.42) and (2.1.43), stronger conditions according to Remark 2.1.1, and the error terms were to obey the respective conditions in these papers (cf. Remark 2.1.5). However, only in the presence of Chapter 1 of this thesis has it become easier and inviting to think about such Studentization CLT's. In fact, it is the enriching of the space of explanatory variables to the DAN class in SEIVM's (2.1.1)-(2.1.2) that has led to Studentization in the invariance principles (CLT's in particular) established in Chapter 1. Furthermore, when exploring a natural interplay between the SEIVM's studied in Chapter 1 and the FEIVM's under consideration in this chapter, it became desirable to establish invariance principles for the estimators and processes of Section 2.1.2 that would be identical in form to those obtained for their respective companions in Chapter 1. As a result, there is a strong correspondence between Theorem 2.1.2 in Chapter 2 and Theorem 1.1.2 in Chapter


1. Namely, (D), (E), (F) and the assumption that either $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big)=M<\infty$ or $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big)=\infty$ are replaced with their random analogues in the context of Chapter 1, and the time functions $K_{jn}(t)$ and $L_{jn}(t)$ for the processes for $\beta$ and $\alpha$, $j=\overline{1,3}$, are simply $[nt]$ in Chapter 1, while the approximating Gaussian processes in the corresponding weak approximation in probability results are simply $W(nt)/\sqrt n$ in Chapter 1 (cf. Section 2.1.3 for a full summary of the interplay between Chapter 1 and Chapter 2). It turned out that, in view of the conclusive lines of Remarks 2.2.6 and 2.2.7, due to not necessarily assuming the "classical" condition (2.1.42) here, Studentization in Theorem 2.1.2 became necessary.

(c) We elaborate further on the CLT's of Theorem 2.1.2 in view of the available CLT's in the literature. As opposed to the (i)'s with $t_0=1$ of Theorems 2.1.2a-2.1.2c, even though the $\sqrt n$-asymptotic normality of $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$, or that of $(\hat\beta_{1n}, \hat\alpha_{1n})$, which follow from Gleser [21] (cf. Remark 2.1.5 for details), is generally proved without assuming normality or normality-like conditions on the error terms, their applicable forms require the error moments to be as if $(\delta,\varepsilon)$ had a $N(0,\operatorname{diag}(\lambda\theta,\theta))$ distribution. This is because, in general, the covariance matrix of the asymptotic distribution of $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$ in [21] involves unknown error term cross-moments up to and including moments of order four, in addition to the unknown parameters $\beta$, $m = \lim_{n\to\infty}\bar\xi$ and $M = \lim_{n\to\infty}\overline{\xi^2} - m^2$. As pointed out by Gleser [21], these moments are hard to estimate from data. On the other hand, when he assumes that the error moments are identical to those in $N(0,\theta I_2)$, the covariance matrix of the asymptotic normal distribution of $(\hat\beta_{1n}, \hat\alpha_{1n}, \hat\theta_{1n})$ contains only the following unknown parameters: $\lambda\theta$, $m$ and $M$. In this set-up, with applications in mind, in addition, one needs consistent estimators of $m$ and $M$, which are available due to [21]. Similarly to [21], the CLT's for $(\hat\beta_{2n}, \hat\alpha_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n})$ in [9] are also laden with complicated covariances of their asymptotic distributions unless $\delta$ and $\varepsilon$ are assumed to be normal, while the normality assumptions of [11] and [10] allow simple estimable forms for the covariance


matrices of the asymptotic normal distributions of $(\hat\beta_{2n}, \hat\alpha_{2n}, \hat\theta_{2n})$ and $(\hat\beta_{3n}, \hat\alpha_{3n}, \widehat{\lambda\theta}_{3n})$ respectively (cf. Remark 2.1.5 for details on these CLT's). Further to studies of estimating the variances/covariances of asymptotic distributions in CLT's in FEIVM's

(2.1.1)-(2.1.2), Babu and Bai [5] prove $\sqrt n$-asymptotic normality of $\hat\beta_{1n}$, $\hat\beta_{2n}$ and $\hat\beta_{3n}$ and give jackknife-type estimators for the variances of their asymptotic distributions, under some strong conditions on the error terms and the explanatory variables, with the conditions on $\{\xi_i,\ i\ge1\}$ being stronger than (D), (2.1.42) and (2.1.43) combined, i.e., stronger than in our other references in this regard. We note that the consistent estimators for the variances of the asymptotic normal distributions of $\hat\beta_{1n}$, $\hat\beta_{2n}$ and $\hat\beta_{3n}$ proposed in [21], [9] and [5] are different from the expressions $\big(\sum_{i=1}^n(\hat u_i(j,n)-\bar{\hat u}(j,n))^2/(n-1)\big)^{1/2}U^{-1}(j,n)$ and $\big(\sum_{i=1}^n\hat u_i^2(j,n)/n\big)^{1/2}U^{-1}(j,n)$ of the forthcoming Theorem 2.1.3 and part (b) of Remark 2.1.9 respectively, $j=\overline{1,3}$, that serve as consistent estimators of the just mentioned variances. To summarize, properly used Studentization in Theorem 2.1.2 makes the CLT's data-based and free of any parameters associated with unobservable explanatory and error variables.
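The jackknife-type estimators of Babu and Bai [5] are not reproduced in this thesis; for orientation, a generic delete-one jackknife variance estimate for a slope statistic can be sketched as follows (our own minimal sketch of the general technique on hypothetical data, not the estimator of [5]):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x = rng.standard_normal(n)
y = 1.5 * x + rng.standard_normal(n)    # hypothetical data, true slope 1.5

def slope(x, y):
    # least squares slope of y on x
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Delete-one jackknife estimate of Var(slope):
leave_one_out = np.array(
    [slope(np.delete(x, i), np.delete(y, i)) for i in range(n)]
)
jack_var = (n - 1) / n * np.sum((leave_one_out - leave_one_out.mean())**2)
print(jack_var)   # comparable to the asymptotic variance sigma^2/(n Var x) ~ 1/n
```

The appeal of such estimators, as of the Studentized normalizers above, is that they are computed from the data alone; the price is recomputing the estimator $n$ times.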

(d) Theorem 2.1.2 continues to be valid when the Studentized processes of the parameters of interest are replaced with their self-normalized versions, namely when the normalizers $\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{-1/2}$, $\big(\sum_{i=1}^n(v_i(j,n)-\bar v(j,n))^2/(n-1)\big)^{-1/2}$ and $\big(\sum_{i=1}^n(w_i(j,n)-\bar w(j,n))^2/(n-1)\big)^{-1/2}$ of the corresponding processes are respectively replaced with $\big(\sum_{i=1}^n u_i^2(j,n)/n\big)^{-1/2}$, $\big(\sum_{i=1}^n v_i^2(j,n)/n\big)^{-1/2}$ and $\big(\sum_{i=1}^n w_i^2(j,n)/n\big)^{-1/2}$, $j=\overline{1,3}$. The proof of this goes as follows. First, using (2.2.33) and (2.2.105) of Remark 2.2.6, one shows that auxiliary Lemma 2.2.6 is also valid with $\big(\sum_{i=1}^n\langle\zeta_i,b\rangle^2/n\big)^{-1/2}$ in place of $\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{-1/2}$, and then one proves Lemma 2.2.9 with $\big(\sum_{i=1}^n\langle\eta_i(n),d\rangle^2/n\big)^{1/2}$ in place of $\big(\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\big)^{1/2}$, similarly to the proof of the original Lemma 2.2.9, via using the modified Lemma 2.2.6. Finally, the latter modification of Lemma 2.2.9 leads to self-normalization in Theorem 2.1.2, similarly to the proof of the Studentization Theorem 2.1.2. The Studentization approach to Theorem 2.1.2 is more effective than the self-normalization one in that it makes $\big(\sum_{i=1}^n(v_i(j,n)-\bar v(j,n))^2/(n-1)\big)^{-1/2}$ in Theorem 2.1.2b $\alpha$-free and eliminates unknown error variances in $\big(\sum_{i=1}^n(w_i(j,n)-\bar w(j,n))^2/(n-1)\big)^{-1/2}$ in Theorem 2.1.2c. It appears that the two approaches are equally effective for Theorem 2.1.2a.

(e) In view of the conclusive lines of (a) of this remark, one may wonder about the necessity of working out asymptotics at all for the estimators and processes for

the unknown error variances in Chapter 2. Indeed, $\hat\theta_{1n}$, $\hat\theta_{2n}$ and $\widehat{\lambda\theta}_{3n}$ are not needed anymore to make the CLT's in FEIVM's (2.1.1)-(2.1.2) applicable. In fact, this is also true in regards of all the other invariance principles of Section 2.1.4. Nevertheless,

when $\lim_{n\to\infty}\overline{\xi^2} < \infty$, $\hat\theta_{1n}$ and $\hat\theta_{2n}$ may come in handy for estimating the so-called reliability ratio (cf. (c) of Remark 2.1.10), which may also serve as an indicator of the reliability of confidence intervals for $\beta$ (cf. Section 2.1.5). In any case, technically,

asymptotics for $\hat\theta_{1n}$ and $\hat\theta_{2n}$ under $\lim_{n\to\infty}\overline{\xi^2} < \infty$ are handled simultaneously with the case when this limit does not exist, but (E) holds true.
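The interchangeability of the centered $(n-1)$-denominators and the uncentered $n$-denominators noted in part (d) of Remark 2.1.6 rests on the sample mean of the normalizing sequence tending to zero, in which case the two normalizers are asymptotically equivalent. A generic numerical sketch (plain i.i.d. arrays of our choosing; the thesis' $v_i(j,n)$ are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)

for n in (100, 10000):
    v = rng.standard_normal(n)           # stand-in for a centered sequence
    studentized = np.sqrt(np.sum((v - v.mean())**2) / (n - 1))  # centered, n-1
    self_norm = np.sqrt(np.sum(v**2) / n)                       # uncentered, n
    print(n, round(studentized / self_norm, 4))  # ratio approaches 1 as n grows
```

The ratio of the two normalizers is $\big(\tfrac{n}{n-1}(1-\bar v^2/\overline{v^2})\big)^{1/2}$, which tends to one precisely when $\bar v \to 0$, mirroring the role of (2.2.33) and (2.2.105) in the argument above.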

Remark 2.1.7. In view of (2.2.29) of the proof of Theorem 2.1.1a and (2.2.153) of Remark 2.2.7 (case $|b^{(1)}|+|b^{(2)}|>0$) in Section 2.2.2, under the assumptions of Theorem 2.1.2, for the normalizers $U(j,n)\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{-1/2}$ of the processes $(\hat\beta_{jn}-\beta)K_{jn}(t)$ for $\beta$, we have

$$U(j,n)\Big(\sum_{i=1}^n\big(u_i(j,n)-\bar u(j,n)\big)^2/(n-1)\Big)^{-1/2} = \sqrt{S_{\xi\xi}}\;O_P(1), \qquad n\to\infty, \tag{2.1.51}$$
where $S_{\xi\xi}$ is as in (E). On combining part (i) with $t_0=1$ of Theorem 2.1.2a and

(2.1.51), as $n\to\infty$,

$$\hat\beta_{jn} - \beta = O_P(1)\,\frac{1}{\sqrt{n\,S_{\xi\xi}}}\,, \qquad j=\overline{1,3}. \tag{2.1.52}$$

In particular, if $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big)=\infty$, as allowed in Chapter 2 via assuming (E) and (F) in place of the "classical" (2.1.42) and (2.1.43) (cf. more in Section 2.1.3),


the degree of precision of the WLSE and MLSE's of $\beta$ increases. This effect naturally meets our empirical expectations. Indeed, $\{\xi_i,\ i\ge1\}$ satisfying (D) and $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big)=\infty$ behave as if they were i.i.d.r.v.'s with finite mean and infinite variance. Hence, loosely speaking, these conditions make the explanatory variables more dominant over the errors with finite variances. This, in turn, makes the observations $y_i$ and $x_i$ more robust to noise (errors) and thus more precise. As to the estimators $\hat\alpha_{jn}$, $j=\overline{1,3}$, of $\alpha$, and $\hat\theta_{1n}$, $\hat\theta_{2n}$ and $\widehat{\lambda\theta}_{3n}$ of the unknown variances, on account of the respective (i)'s of Theorems 2.1.2b and 2.1.2c, and (2.2.153), they are $\sqrt n$-asymptotically normal (in regards of $\hat\alpha_{jn}$, we note that if $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big)=\infty$, the case $b^{(1)}=b^{(2)}=0$ of (2.2.153) is used). The observations of this remark go hand in hand with Observation 1.1.1 in Chapter 1, where similar phenomena take place in SEIVM's (2.1.1)-(2.1.2) under $\xi\in\mathrm{DAN}$, the companion assumption to the present (D)-(F).
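The rate improvement in (2.1.52) when $S_{\xi\xi}\to\infty$ has a familiar regression analogue: in a no-intercept regression with deterministic design $x_i$ and i.i.d. errors of variance $\sigma^2$, the least squares slope has standard deviation $\sigma/\big(\sum_{i\le n}x_i^2\big)^{1/2}$, so a design with growing spread yields errors shrinking faster than $n^{-1/2}$. A small deterministic check (the design $x_i=i$ is our own illustrative choice; this is not the EIVM estimator itself):

```python
import math

def slope_sd(n, sigma=1.0):
    # sd of the LS slope through the origin with design x_i = i:
    # sd = sigma / sqrt(sum_{i<=n} i^2), with sum i^2 = n(n+1)(2n+1)/6
    return sigma / math.sqrt(n * (n + 1) * (2 * n + 1) / 6)

print(slope_sd(100) / slope_sd(400))   # ~ 8: quadrupling n shrinks the sd ~ 8-fold
```

Under a bounded-spread design the same ratio would be about 2 (the usual $n^{-1/2}$ scaling); the factor close to 8 reflects the $n^{-3/2}$ rate produced by an exploding design variance, in the spirit of $1/\sqrt{n\,S_{\xi\xi}}$.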

Remark 2.1.8. The "negligibility" condition (F), which is first explored for various invariance principles in Chapter 2, is a crucial sufficient condition for Theorem 2.1.2. In fact, Lindeberg's condition (2.2.99) for the auxiliary sequence $\{\langle\zeta_i,b\rangle,\ i\ge1\}$, with $\zeta_i$ of (2.2.59) and $b\in\mathbb R^7$, $b\ne0$, is the principal sufficient condition for the key auxiliary Lemmas 2.2.6 and 2.2.9 for the proof of Theorem 2.1.2 (cf. also the introduction to Section 2.2.2). According to auxiliary Lemma 2.2.10 of Section 2.2.2, under (C)-(E), (F) is necessary and sufficient for Lindeberg's condition (2.2.99) when $|b^{(1)}|+|b^{(2)}|>0$ (such $\{\langle\zeta_i,b\rangle,\ i\ge1\}$ corresponds to Theorems 2.1.2a and 2.1.2b (case $\lim_{n\to\infty}(\overline{\xi^2}-(\bar\xi)^2)<\infty$ and $\lim_{n\to\infty}\bar\xi\ne0$)). The analogue of this remark in the predecessor Chapter 1 for SEIVM's (2.1.1)-(2.1.2) is Proposition 1.1.1, which amounts to optimality of $\xi\in\mathrm{DAN}$ for the companion of Theorem 2.1.2, a condition that is characterized by $\max_{1\le i\le n}(\xi_i-\bar\xi)^2\big/\sum_{i=1}^n(\xi_i-\bar\xi)^2 \xrightarrow{P} 0$, $n\to\infty$ (cf. Lemma 1.2.1 in Chapter 1).

As anticipated in Remark 2.1.6, eliminating $\beta$, the only unknown parameter in


the normalizers $\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{-1/2}$, $\big(\sum_{i=1}^n(v_i(j,n)-\bar v(j,n))^2/(n-1)\big)^{-1/2}$ and $\big(\sum_{i=1}^n(w_i(j,n)-\bar w(j,n))^2/(n-1)\big)^{-1/2}$ of Theorems 2.1.2a-2.1.2c, in Theorems 2.1.3a-2.1.3c, collectively called Theorem 2.1.3 on occasions, among other things we provide below completely data-based versions, readily available for some immediate applications as in Section 2.1.5, of the CLT's for the estimators of interest.

Theorem 2.1.3a. Let all the assumptions of Theorem 2.1.2a be satisfied. Introduce

$$\hat u_i(j,n) = u_i(j,n)\Big|_{\beta=\hat\beta_{jn}}\,, \qquad j=\overline{1,3}, \tag{2.1.53}$$
i.e., $\hat u_i(j,n)$ is $u_i(j,n)$ of (2.1.14) and (2.1.27) with $\beta$ replaced by $\hat\beta_{jn}$. Then, as $n\to\infty$, statements (i)-(iii) of Theorem 2.1.2a continue to hold true with $\hat u_i(j,n)$ of (2.1.53) in place of $u_i(j,n)$ of (2.1.14) and (2.1.27), $j=\overline{1,3}$.

Theorem 2.1.3b. Suppose that all the conditions of Theorem 2.1.2b are valid. Let
$$\hat v_i(j,n) = (y_i-\alpha) - \hat\beta_{jn}x_i - \frac{\bar x}{U(j,n)}\,\hat u_i(j,n), \qquad j=\overline{1,3}, \tag{2.1.54}$$

where $U(j,n)$ and $\hat u_i(j,n)$ are as in (2.1.46) and (2.1.53), $j=\overline{1,3}$. Then, as $n\to\infty$, statements (i)-(iii) of Theorem 2.1.2b, with $\hat v_i(j,n)$ of (2.1.54) replacing $v_i(j,n)$ of (2.1.47), remain valid, $j=\overline{1,3}$.

Theorem 2.1.3c. Assume all the assumptions of Theorem 2.1.2c. Let

$$\hat w_i(j,n) = \begin{cases} (s_{i,yy}-\lambda\theta) - 2\hat\beta_{1n}(s_{i,xy}-\mu) + \hat\beta_{1n}^2(s_{i,xx}-\theta), & \text{if } j=1,\\ \hat\beta_{2n}^{-2}\big((s_{i,yy}-\lambda\theta) - 2\hat\beta_{2n}(s_{i,xy}-\mu) + \hat\beta_{2n}^2(s_{i,xx}-\theta)\big), & \text{if } j=2,\\ (s_{i,yy}-\lambda\theta) - 2\hat\beta_{3n}(s_{i,xy}-\mu) + \hat\beta_{3n}^2(s_{i,xx}-\theta), & \text{if } j=3. \end{cases} \tag{2.1.55}$$
Then, as $n\to\infty$, statements (i)-(iii) of Theorem 2.1.2c, with $\hat w_i(j,n)$ of (2.1.55) replacing $w_i(j,n)$ of (2.1.49), continue to hold true, $j=\overline{1,3}$.

Remark 2.1.9. (a) Just like the CLT's of Theorem 2.1.2 (cf. (c) of Remark 2.1.6), the CLT's of Theorem 2.1.3 are first-time CLT's that are completely data-based and free


of the unknown moments of the unobservable error terms and of parameters associated with the explanatory variables, both under (2.1.42) and (2.1.43) on the explanatory variables, when neither the explanatory variables nor the error terms are assumed to be normal or normal-like, and, of course, under (E) and (F), when (2.1.42) and (2.1.43) are not satisfied. In particular, under (2.1.42) and (2.1.43) without the normality assumptions, the normalizers $\sqrt n\,U(1,n)\big(\sum_{i=1}^n(\hat u_i(1,n)-\bar{\hat u}(1,n))^2/(n-1)\big)^{-1/2}$,

... $\{\xi_i,\ i\ge1\}$, and thus they are also available for some immediate applications (cf. Section 2.1.5). Namely, these are weak invariance principles for $(\hat\alpha_{jn}-\alpha)L_{jn}(t) = (\hat\alpha_{jn}-\alpha)[nt]$ in Theorem 2.1.3b, under $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big)=\infty$ assumed, $j=\overline{1,3}$, and for

$(\hat\theta_{1n}-\theta)[nt]$, $(\hat\theta_{2n}-\theta)[nt]$ and $(\widehat{\lambda\theta}_{3n}-\lambda\theta)[nt]$ in Theorem 2.1.3c. The CLT's and aforementioned weak invariance principles of Theorem 2.1.3 are indistinguishable in form from those of Theorem 1.1.3 in Chapter 1 for SEIVM's (2.1.1)-(2.1.2). As to the rest of the invariance principles in Theorem 2.1.3, while those for $(\hat\beta_{jn}-\beta)K_{jn}(t)$ and $(\hat\alpha_{jn}-\alpha)L_{jn}(t)$ ...


$j=\overline{1,3}$. For some applications, it would be of interest to eliminate the unknown parameters from $K_{jn}(t)$ and $L_{jn}(t)$, $j=\overline{1,3}$. Research in this regard is presently in progress by the author (cf. [46] and the Epilogue).

(b) Theorem 2.1.3 is also valid when the denominators $\big(\sum_{i=1}^n(\hat u_i(j,n)-\bar{\hat u}(j,n))^2/(n-1)\big)^{1/2}$, $\big(\sum_{i=1}^n(\hat v_i(j,n)-\bar{\hat v}(j,n))^2/(n-1)\big)^{1/2}$ and $\big(\sum_{i=1}^n(\hat w_i(j,n)-\bar{\hat w}(j,n))^2/(n-1)\big)^{1/2}$ of the corresponding processes are replaced with $\big(\sum_{i=1}^n\hat u_i^2(j,n)/n\big)^{1/2}$, $\big(\sum_{i=1}^n\hat v_i^2(j,n)/n\big)^{1/2}$ and $\big(\sum_{i=1}^n\hat w_i^2(j,n)/n\big)^{1/2}$, respectively, $j=\overline{1,3}$. The proof of this result is based on its companion in part (d) of Remark 2.1.6 and the proof of the original Theorem 2.1.3. We note that such a Theorem 2.1.3a is completely data-based.

Remark 2.1.10. (a) At the beginning of Chapter 2 we emphasized that one of the principal distinctions of the FEIVM's (2.1.1)-(2.1.2) that are studied here is that the inference about the parameters in these models is based on additional knowledge as in one of (1)-(3). For the notion of identifiability of a parameter in EIVM's, for the areas of research where identifiability assumptions are commonly found, and for ways of obtaining identifiability assumption information, we refer to the texts [12] and [19], and Gleser [21]. We note in passing that the situation of uncorrelated errors

($\mu = 0$) with equal variances ($\lambda = 1$) seems to present quite a reasonable and natural special case when (1) is automatically satisfied. As noted in part (a) of Remark 1.1.6 of Chapter 1, identifiability assumptions were originally introduced to make the normal SEIVM (2.1.1)-(2.1.2) ($(y_i,x_i)$ are normally distributed) identifiable. In contrast, the corresponding normal FEIVM (2.1.1)-(2.1.2) is identifiable without any additional assumption, but lacks consistent estimators. In fact, there is a close relationship between identifiability of a parameter in a SEIVM and the possibility of this parameter being consistently estimated in the corresponding FEIVM (cf. pp. 7, 78, 239, 240 of [12] for details). Hence, not only do identifiability assumptions provide identifiability of parameters in SEIVM's, but they also allow, but do not


guarantee, consistent estimation of these parameters in the corresponding FEIVM's. If a FEIVM (2.1.1)-(2.1.2) is identifiable, instead of the estimators of Section 2.1.2, one can choose to work with those based on estimation methods that do not require any supplementary information, e.g., on the methods of sample higher moments or cumulants, characteristic functions and some other methods (cf. Van Montfort [63] and texts [12], [19] and [33]). A study of these methods is not within the scope of Chapter 2. In any case, the restrictions that are parameterized by identifiability assumptions (1)-(3) may, in general, be viewed as ones that make intuitive sense, in that they fairly concretize model (2.1.1)-(2.1.2), and thus help to control and diminish the effect of its error terms for the sake of the possibility of meaningful inference under reasonable conditions on $\xi$ and the error moments. (b) Apart from (1)-(3), the following two identifiability assumptions for FEIVM's (2.1.1)-(2.1.2) are also found in the literature:

(4) under $E(\delta\varepsilon) = 0$, (D), (2.1.42), (2.1.43) and intercept $\alpha$ being unknown, the limit of the reliability ratio $k_\xi = \big(\overline{\xi^2}-(\bar\xi)^2\big)\big/\big(\big(\overline{\xi^2}-(\bar\xi)^2\big)+\mathrm{Var}\,\varepsilon\big)$ is known, i.e., $k_\xi^\infty = \lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big)\big/\big(\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big)+\mathrm{Var}\,\varepsilon\big)$ is known;

(5) intercept $\alpha$ is known and $\lim_{n\to\infty}\bar\xi \neq 0$.
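The reliability ratio in (4) can be made concrete with a small numerical sketch. The function names and the plug-in estimate below are our own illustration (assuming $\mathrm{Var}\,\varepsilon$ is known), not notation from the thesis:

```python
# Illustrative only: k_xi as in (4), for a fixed design xi_1, ..., xi_n, and a
# naive plug-in estimate of it from the observed x_i = xi_i + eps_i, assuming
# the error variance Var(eps) is known. All names here are ours, not the thesis'.

def reliability_ratio(xi, var_eps):
    """k_xi = spread / (spread + Var(eps)), spread = mean(xi^2) - mean(xi)^2."""
    n = len(xi)
    spread = sum(v * v for v in xi) / n - (sum(xi) / n) ** 2
    return spread / (spread + var_eps)

def reliability_ratio_estimate(x, var_eps):
    """The sample spread of x estimates spread(xi) + Var(eps), so a natural
    estimate of k_xi is 1 - Var(eps) / spread(x)."""
    n = len(x)
    spread_x = sum(v * v for v in x) / n - (sum(x) / n) ** 2
    return 1.0 - var_eps / spread_x
```

As the design spread grows, $k_\xi$ tends to 1, which is the situation exploited under the explosive-spread condition discussed in part (e) of this remark.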

Conditions (4) and (5) are not considered in the main lines of development of Chapter 2 and will only be briefly addressed in (c) and (d) of this remark and then, also, in Section 2.1.5. Also, in (1), the cases of $\Gamma$ being known up to an unknown multiple and of $\Gamma$ being completely known are usually treated as two separate identifiability assumptions in the literature. We note that the weighted least squares approach (cf. Section 2.1.2) allows us to accommodate them both in the single condition (1) here. (c) Identifiability condition (4) is a companion to condition (4) of parts (b) and (c) of Remark 1.1.6 in Chapter 1, which amounts to the properly defined reliability


ratio being known. Similarly to Chapter 1, $k_\xi^\infty$ adjusts the ordinary least squares estimator as

$$\hat\beta_{4n} = \frac{S_{xy}}{k_\xi^\infty\,S_{xx}},$$

so that $\hat\beta_{4n}$ is a consistent estimator of $\beta$ under (A) with $\mu = 0$ in (2.1.3), (D), (2.1.42), (2.1.43) and $\alpha$ being unknown. It follows from Gleser [26] that $(\hat\beta_{4n}, \hat\alpha_{4n})$, where $\hat\alpha_{4n} = \bar y - \bar x\,\hat\beta_{4n}$, is $\sqrt n$-asymptotically normal if, in addition, (C) with independent $\delta$ and $\varepsilon$ holds true. (d) Just like in SEIVM's, sometimes in FEIVM's it is reasonable to assume (5) (cf. parts (b), (d) of Remark 1.1.6 of Chapter 1 and Chapter 1 of [12]). In this case, rather than dealing with problematic MLE's, one adapts and explores the MLE's from the corresponding SEIVM's (2.1.1)-(2.1.2) under (5) of Remark 1.1.6 in Chapter 1, namely,

$$\hat\beta_{5n} = \frac{\bar y - \alpha}{\bar x},\quad \bar x \neq 0,\quad \widehat{\lambda\theta}_{5n} = \widehat{\lambda\theta}_{2n} \quad\text{and}\quad \hat\theta_{5n} = \hat\theta_{2n},$$

where $\widehat{\lambda\theta}_{2n}$ and $\hat\theta_{2n}$ are as in (2.1.31) and (2.1.32). In fact, the triple $(\hat\beta_{5n}, \widehat{\lambda\theta}_{5n}, \hat\theta_{5n})$ and its properly defined corresponding process can be studied asymptotically by using the methods of Chapter 2. We only note here that under (A) and (D) with $\lim_{n\to\infty}\bar\xi \neq 0$, it is easy to see that, as $n \to \infty$,

$$T_n = \frac{\sqrt n\,\bar x\,(\hat\beta_{5n}-\beta)}{\Big(\sum_{i=1}^n\big((y_i-\bar y)-\beta(x_i-\bar x)\big)^2/(n-1)\Big)^{1/2}} \xrightarrow{\ D\ } N(0,1)$$

and

$$\frac{\sqrt n\,\bar x\,(\hat\beta_{5n}-\beta)}{\Big(\sum_{i=1}^n\big((y_i-\bar y)-\hat\beta_{5n}(x_i-\bar x)\big)^2/(n-1)\Big)^{1/2}} \xrightarrow{\ D\ } N(0,1).$$

However, if $(\delta, \varepsilon)$ has a normal distribution, an important observation here is that $T_n$ has a Student $t$ distribution with $n-1$ degrees of freedom. We also note in passing that while, on the one hand, throughout Chapter 2, the distinction between the no-intercept model and the one with unknown intercept is accommodated in notations


(2.1.6) and (2.1.7) that were introduced disregarding the identifiability problem, on the other hand, when intercept $\alpha$ is known to be zero, one can also use the estimator $\bar y/\bar x$ in place of $\hat\beta_{1n}$, $\hat\beta_{2n}$ and $\hat\beta_{3n}$, which are studied respectively under (1), (2) and (3). (e) Condition

$$\text{(E')}\qquad \lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = \infty$$

presents a special important case of (E) for a number of reasons. First of all, (E') seems not to have been explored in the literature in connection with some basic asymptotic theory in FEIVM's (2.1.1)-(2.1.2). Secondly, it is a companion of condition $\mathrm{Var}\,\xi = \infty$ in the corresponding SEIVM's (2.1.1)-(2.1.2) studied in Chapter 1 of this thesis, and the impacts of the two conditions on robustness of the EIVM's are synchronized (cf., e.g., Remark 2.1.7). Moreover, both conditions $\mathrm{Var}\,\xi = \infty$ and (E') allow one to treat the corresponding SEIVM's and FEIVM's like linear regressions (cf. part (e) of Remark 1.1.6 in Chapter 1 in connection with SEIVM's), in that under (E'), on account of (2.2.34), the ordinary least squares estimator $\hat\beta_n = S_{xy}/S_{xx}$ of $\beta$ is strongly consistent, and so are $\tilde\beta_n = S_{yy}/S_{xy}$ and $\hat\alpha_{4n} = \bar y - \bar x\,\hat\beta_n$ as well, and none of the identifiability assumptions (1)-(3) is necessary in this regard. It is interesting to note that under (E'), the limit $k_\xi^\infty$ of the reliability ratio in (4) of (b) and (c) of this remark is 1, and $\hat\beta_n = S_{xy}/S_{xx}$ is seen to be equal to $\hat\beta_{4n}$ used under (4). Further investigations of FEIVM's (2.1.1)-(2.1.2) under (E') are currently in progress in [48], along with studies of the corresponding SEIVM's under $\mathrm{Var}\,\xi = \infty$. In that work, we focus on the formal connection and nearness of EIVM's (2.1.1)-(2.1.2) to regression models, and also develop consistent estimators of the unknown variances $\lambda\theta$ and $\theta$.
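The regression-like behaviour under (E') can be illustrated by a small simulation. The setup below (a square-root design, unit-variance normal errors, the tolerance used) is entirely our own toy choice, not the thesis' model assumptions:

```python
import math
import random

# Toy simulation of part (e): when the design spread explodes as in (E'),
# mean(xi^2) - mean(xi)^2 -> infinity, the attenuation factor
# spread / (spread + Var(eps)) of ordinary least squares tends to 1, so the
# naive OLS slope S_xy / S_xx is consistent for beta even though the regressor
# is observed with error. All numbers below are our own illustrative choices.

def ols_slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    s_xx = sum((a - mx) ** 2 for a in x)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return s_xy / s_xx

random.seed(0)
beta, n = 2.0, 20000
xi = [math.sqrt(i + 1.0) for i in range(n)]          # sample spread grows with n
x = [v + random.gauss(0.0, 1.0) for v in xi]         # x_i = xi_i + eps_i
y = [beta * v + random.gauss(0.0, 1.0) for v in xi]  # y_i = beta * xi_i + delta_i
slope = ols_slope(x, y)
```

With this design the attenuation bias of the naive slope is of order $1/n$, so `slope` sits very close to the true $\beta$ despite the measurement error in $x$.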

Remark 2.1.11. When studying $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$ in Theorems 2.1.2 and 2.1.3, we assume for convenience that $\mu = 0$ in (2.1.3) and thus, the respective identifiability


condition (1) reduces to the ratio of the error variances being known. Between such a model and the other type of model covered by (1), with $\mu \neq 0$ and matrix $\Gamma$ known up to an unknown multiple, there is a close data-transformation based interplay that follows from [21]. We note that this interplay concerns a FEIVM with $\lambda = 1$ and $\mu = 0$ in $\Gamma$ of (2.1.3) and the one with arbitrary $\Gamma$ as in (1), and is also valid in the respective context of our FEIVM's (2.1.1)-(2.1.2) under consideration in Chapter 2, with explanatory variables satisfying all or some of the assumptions in (D)-(F). Moreover, it allows one to convert the WLSE's of $\beta$ and $\alpha$ and the MLE of $\theta$ obtained in the model with arbitrary $\Gamma$ as in (1) to the corresponding $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$ with $\lambda = 1$. Hence this interplay can be adapted to obtain asymptotic results for the model with $\mu \neq 0$ in $\Gamma$ of (2.1.3) from Theorems 2.1.2 and 2.1.3. It can also be used to translate asymptotic results from the FEIVM (2.1.1)-(2.1.2) with $\lambda = 1$ and $\mu = 0$ to the one with arbitrary $\lambda$ and $\mu = 0$. We also note in passing, in regard of $\hat\beta_{1n}$, $\hat\alpha_{1n}$ and $\hat\theta_{1n}$, that, though it does not take any extra effort to deal with arbitrary $\lambda$ in (2.1.3), in view of the described interplay, similarly to [21], we could have assumed that $\lambda = 1$.

Remark 2.1.12. We note in passing that, due to (2.2.152) and the convergences in (2.2.189), (2.2.194) and (2.2.199) of Section 2.2.2, the denominators of all the processes studied in Theorems 2.1.2 and 2.1.3 are well-defined WPA1, as $n \to \infty$.

2.1.5 A Note on Some Applications

In this section we discuss some immediate applications of the main results of Section 2.1.4. Due to the established interplay of the SEIVM's of Chapter 1 and the FEIVM's of Chapter 2, and the identity in form of the corresponding main results in these chapters (cf. summary in Section 2.1.3), these applications are similar to those described in Section 1.1.5 of Chapter 1 in connection with SEIVM's. Hence, in this section, we will take a short route by leaning on Section 1.1.5 of Chapter 1 as much as possible.


Throughout this section abbreviations LSA and CI stand, respectively, for large-sample approximate and confidence interval.

Below, under each of the identifiability conditions (1)-(3) and (5) as in part (b) of Remark 2.1.10, we address LSA $1-\alpha$ CI's for slope $\beta$, $0 < \alpha < 1$, that follow from the CLT's of Theorems 2.1.2a and 2.1.3a, the corresponding self-normalized CLT's of part (d) of Remark 2.1.6 and part (b) of Remark 2.1.9, and the CLT's in part (d) of Remark 2.1.10. Other applications of our main results in Section 2.1.4 are along the lines of the corresponding subsection of Section 1.1.5 of Chapter 1.

Since Theorem 1.1.3a of Chapter 1 and Theorem 2.1.3a, and the corresponding self-normalized CLT's for $\beta$ in part (b) of Remark 1.1.5 of Chapter 1 and part (b) of Remark 2.1.9, are indistinguishable in form, the completely data-based CLT's of Theorem 2.1.3a and the companion CLT's of part (b) of Remark 2.1.9 lead, respectively, to the LSA CI's in (1.1.55) and (1.1.56) obtained in Section 1.1.5 of Chapter 1, provided that one of (1)-(3) is assumed. Similar arguments yield yet another two LSA CI's under (2), namely those in (1.1.57) and (1.1.58) that are, respectively, WPA1 equivalent to (1.1.63) and (1.1.64) in Section 1.1.5 of Chapter 1, and follow in Chapter 2 from the CLT's of Theorem 2.1.2a and part (d) of Remark 2.1.6. We note that to conclude (1.1.63) and (1.1.64) in the context of the FEIVM's (2.1.1)-(2.1.2), we assume additionally that $\limsup_{n\to\infty}\overline{\xi^4} < \infty$ (this results in the signs of (1.1.61) and (1.1.62) being positive WPA1). In the manner of (1.1.63) and (1.1.64), under (3), one can obtain yet another two LSA CI's in addition to (1.1.55) and (1.1.56). Under (5) as in part (d) of Remark 2.1.10, LSA CI's for $\beta$, based on the CLT's therein, coincide with those under (5) as in part (d) of Remark 1.1.6 in Chapter 1, namely with (1.1.68) (or, equivalently WPA1, with (1.1.71)), (1.1.69) and (1.1.70) (or, equivalently WPA1, with (1.1.74)) of Section 1.1.5 in Chapter 1.

Just like in SEIVM’s (2.1.1)—(2.1.2), due to lack of any other type of CPs, avail­ ability of any LSA CPs for j3 in FEIVM’s (2.1.1)-(2.1.2) under identifiability assump­

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2.1.5 A Note on Some Applications 183

tions should be generally appreciated (cf. Cheng and Van Ness [12] and Section 1.1.5 in Chapter 1). From our summary on the CLT’s in the literature, i.e., from Section 2.1.3 and Remarks 2.1.5, 2.1.6 (parts (a), (c)), 2.1.10 (parts (c), (d)) of Section 2.1.4, one can get an idea about the nature of corresponding LSA Cl’s for j3 in FEIVM’s (2.1.2)-(2.1.2) under identifiability assumptions. As to the summary of our contri­ butions in this regard that are presented in the previous paragraph, the obtained LSA Cl’s under (l)-(3 ) are new when we do not assume (2.1.42) and (2.1.43), the usual assumptions on the explanatory variables in the literature that are relaxed to (E) and (F) here, and also when (E) and (F) are reduced to (2.1.42) and (2.1.43), but the error terms are not assumed to be normal or normal-like, as opposed to the LSA Cl’s in the literature (cf., e.g., part (c) of Remark 2.1.6). We note that, if (2.1.42) and (2.1.43) and normality or normality-like assumptions on the error terms are satisfied, our LSA Cl’s differ from the corresponding ones in the literature (cf. Remark 2.1.9). Moreover, similarly to Section 1.1.5 in Chapter 1, our approach in Chapter 2 has added a few new LSA Cl’s under each of (l)-(3) to the handful that have been known so far. Also, under (5), we introduce three LSA Cl’s. This allows one to consider a few LSA Cl’s under each of (l)-(3 ) and (5) and, naturally, compare the performances of these Cl’s. Such comparison studies are planned for the future by the author.

The phenomenon of the Gleser-Hwang effect, which was widely discussed in Section 1.1.5 of Chapter 1, also takes place in regard of LSA CI's for $\beta$ in FEIVM's (2.1.1)-(2.1.2) (cf. also [8], [11], [12], [23], [24] and [27]). As a result of this, under (A), with $(\delta, \varepsilon)$ assumed to have a joint density satisfying some general assumptions, it follows from [27] that LSA CI's for $\beta$ in FEIVM's (2.1.1)-(2.1.2) may suffer some unfortunate limitations that are nevertheless preventable with a proper, robust choice of the models. For example, Gleser [26] concludes that when in FEIVM's (2.1.1)-(2.1.2) under (1), with $\lambda = 1$ and $(\delta, \varepsilon)$ having $N(0, \mathrm{diag}(\lambda\theta, \theta))$ distribution, the reliability


ratio $k_\xi$ (cf. part (b) of Remark 2.1.10) is at least 0.5, then his LSA CI for $\beta$ based on a moderate sample of size $n \ge 25$ is reasonably good, in that its confidence level is close enough to the desired one. This can also be explained via the idea behind the Gleser-Hwang effect. Indeed, according to [27], this effect takes place in a FEIVM (2.1.1)-(2.1.2) that is, roughly speaking, close enough to the degenerate one with $\xi_1 = \cdots = \xi_n = m$, i.e., to $y_i = m\beta + \alpha + \delta_i$, $x_i = m + \varepsilon_i$, where the explanatory variables do not vary and hence make it impossible to fit a unique straight line through the data points. If one a priori restricts the explanatory variables to a closed subset for which $\overline{\xi^2} - (\bar\xi)^2$ does not approach 0, the effect disappears. In view of this, though no similar results under other identifiability conditions and model assumptions exist, the magnitude of the reliability ratio may be indicative of possible problems with LSA CI's for $\beta$ in FEIVM's (2.1.1)-(2.1.2) and, if not known, is worth estimating preliminarily.

The idea of the importance of the reliability ratio, which is defined as $k_\xi = \mathrm{Var}\,\xi/(\mathrm{Var}\,\xi + \mathrm{Var}\,\varepsilon)$ in SEIVM's and as $k_\xi = \big(\overline{\xi^2}-(\bar\xi)^2\big)/\big(\big(\overline{\xi^2}-(\bar\xi)^2\big)+\mathrm{Var}\,\varepsilon\big)$ in FEIVM's, is brought here to the next level, constituting one of the main lines of the thesis. Namely, in Chapter 1, the belief that the SEIVM's (1.1.1)-(1.1.2) with $\mathrm{Var}\,\xi = \infty$ must be more robust to the impact of the error terms with finite variances than the SEIVM's with $0 < \mathrm{Var}\,\xi < \infty$ prompted us to develop, for the first time, an asymptotic theory around SEIVM's (2.1.1)-(2.1.2) in which the explanatory variables are allowed to have infinite variance. Moreover, the conclusive lines of Section 1.1.3, Section 1.1.4 (Observation 1.1.1 and part (e) of Remark 1.1.6) and the conclusive lines of the second section of Section 1.1.5 in Chapter 1 provide various rigorous support of this belief. In Chapter 2, it is condition $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = \infty$, as a special case of (E), that plays a role in FEIVM's (2.1.1)-(2.1.2) similar to that of condition $\mathrm{Var}\,\xi = \infty$ in SEIVM's (2.1.1)-(2.1.2). Indeed, from Remark 1.2.2 of Chapter 1 we recall that if i.i.d.r.v.'s $\{\xi_i, i \ge 1\}$ are such that $E\xi$ exists and $\mathrm{Var}\,\xi = \infty$, then, as $n \to \infty$,


$n^{-1}\sum_{i=1}^n(\xi_i - E\xi)^2 \xrightarrow{\ P\ } \infty$, which rhymes well with condition $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = \infty$, i.e., with $\lim_{n\to\infty} n^{-1}\sum_{i=1}^n(\xi_i-\bar\xi)^2 = \infty$, in Chapter 2. Thus, while SEIVM's with $\mathrm{Var}\,\xi = \infty$ can be formally viewed as models with the maximal possible reliability ratio one, in the corresponding FEIVM's condition $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = \infty$ leads to the largest limit, one, of the reliability ratio. In such EIVM's (2.1.1)-(2.1.2), though present, the errors with finite variances are less prominent and disturbing in the observations $y_i$ and $x_i$, and that makes these models behave somewhat as if they were ordinary regression models. This is seen from the synchronized effect of the two conditions on SEIVM's and FEIVM's (2.1.1)-(2.1.2) (cf. part (e) of Remark 2.1.10), which, combined with the disappearance of the Gleser-Hwang effect under $\mathrm{Var}\,\xi = \infty$ in SEIVM's (2.1.1)-(2.1.2) (cf. Section 1.1.5 of Chapter 1), makes it reasonable to believe that under $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = \infty$, LSA CI's for $\beta$ in FEIVM's (2.1.1)-(2.1.2) are also resistant to this effect. In fact, it appears that, while the idea of closeness of regression models and EIVM's has always been attractive to statisticians in view of the simpler and well-available results for regressions, in this thesis we are the first to have found conditions that provide a formal connection and nearness in this regard (cf. part (e) of Remark 2.1.10).

2.2 Auxiliary Results and Proofs

The purpose of Section 2.2.1 is to present Lemma 2.2.1, a result of a general probabilistic nature, that is then applied in Section 2.2.2, when dealing with the special context of FEIVM's (2.1.1)-(2.1.2). Providing the proofs of our main Theorems 2.1.1-2.1.3 of Section 2.1.4, Section 2.2.2 contains a variety of auxiliary results, some of which may be useful beyond the context of Chapter 2 (cf. the first paragraph of Section 2.2.2 for details). Section 2.2.3 plays a crucial role in the proof of the key auxiliary Lemma 2.2.8 of Section 2.2.2 and amounts to a computer code in "Maple". Just like in Section 2.1, new important notations, definitions and abbreviations that


are used more than once throughout Chapter 2 are introduced at the beginning of the subsection where they first occur, and are also summarized in the corresponding list provided before the Introduction of the thesis.

2.2.1 Invariance Principles via Studentization for Independent Nonidentically Distributed Random Variables with Two Moments

The indicator of a set $A$ is denoted below by $I_A$; $\log$ stands for the logarithm with natural base. For r.v.'s $X$ and $Y$, by $X \overset{a.s.}{=} Y$ we mean that $P(X = Y) = 1$, while $X \overset{D}{=} Y$ designates equality in distribution of $X$ and $Y$. $O(1)$ denotes a uniformly bounded numerical sequence.

Recently, for sequences $\{Z_i, i \ge 1\}$ of i.i.d.r.v.'s, Giné, Götze and Mason [20] concluded a full answer to the question

"When is the Student $t$-statistic asymptotically standard normal?", \qquad (2.2.1)

where the Student $t$-statistic is given by

$$T_n(Z_1, \cdots, Z_n) = \frac{\sqrt n\,\bar Z}{\big(\sum_{i=1}^n (Z_i - \bar Z)^2/(n-1)\big)^{1/2}}.\qquad(2.2.2)$$
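For concreteness, (2.2.2) reads in code as follows (a direct transcription; the function name is our own):

```python
import math

# The Student t-statistic T_n of (2.2.2):
# T_n = sqrt(n) * mean(Z) / sqrt( sum((Z_i - mean(Z))^2) / (n - 1) ).

def student_t_statistic(z):
    n = len(z)
    zbar = sum(z) / n
    s2 = sum((v - zbar) ** 2 for v in z) / (n - 1)  # sample variance
    return math.sqrt(n) * zbar / math.sqrt(s2)
```

For $z = (1, 2, 3)$ this gives $\sqrt 3 \cdot 2/1 = 2\sqrt 3$.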

This paper stimulated further developments along the lines of (2.2.1). Thus, extending the main result of [20], Csörgő, Szyszkowicz and Wang in [16] and [17] obtained a characterization of DAN also in terms of weak convergence of the Student process

$$T_{n,t}(Z_1, \cdots, Z_n) = \frac{\sum_{i=1}^{[nt]} Z_i/\sqrt n}{\big(\sum_{i=1}^n (Z_i - \bar Z)^2/(n-1)\big)^{1/2}},\quad 0 \le t \le 1,\qquad(2.2.3)$$

via sup-norm approximation in probability of (2.2.3) by a Wiener process (cf. Theorem 2.6 of [17], and also, e.g., Section 1.2.1 of Chapter 1 on basics of DAN and Studentization). Moreover, the authors of [16] and [17] study the behaviour of the


Student $t$-statistic and the corresponding process also in the context of independent nonidentically distributed r.v.'s with two moments, and present Proposition 2.3 in [17], which is spelled out as Lemma 2.2.2 here. Investigations along the lines of question (2.2.1), seeking a multivariate analogue of the main result of [20], have also been developed. Though it appears that no answer to (2.2.1) for $\{Z_i, i \ge 1\}$ of i.i.d. random vectors that would be a true companion to the one in [20] has been given yet, some ramifying answers to such (2.2.1) for some special $\{Z_i, i \ge 1\}$ are obtained in Section 1.2.3 of Chapter 1. The aforementioned papers [16] and [17] of Csörgő et al., with numerous insights on Studentization and, among other things, new asymptotic results for $\{Z_i, i \ge 1\}$, where

$$\{Z_i, i \ge 1\}\ \text{is a sequence of independent r.v.'s with } E Z_i = 0,\ \mathrm{Var}\,Z_i = \sigma_i^2 < \infty,\ i \ge 1,\ \text{and}\ \sum_{i=1}^n \sigma_i^2 > 0\ \text{for } n \ge 1\ \text{sufficiently large},\qquad(2.2.4)$$

have become the mentor guidance papers for this subsection. Thus, [16] and [17] inspired us to sort out contributions on invariance principles via Studentization for (2.2.4) here, presenting, among other things, an answer to (2.2.1) for sequences as in (2.2.4). Namely, we establish here a one-sided analogue of Theorem 2.6 of [17] in the context of (2.2.4). In this regard, while Theorem 2.6 in [17] provides asymptotic characterizations of i.i.d.r.v.'s from DAN, our main Lemma 2.2.1 of this subsection contains different invariance principles via Studentization for sequences as in (2.2.4) that satisfy Lindeberg's condition, namely Studentized analogues of a few well-known theorems. In order to present Lemma 2.2.1, we have to introduce the Student process that corresponds to (2.2.2) based on (2.2.4).

For $\{Z_i, i \ge 1\}$ of (2.2.4) and $n$ large enough, let

$$s_n^2 = \sum_{i=1}^n \sigma_i^2,\quad s_n > 0,\qquad(2.2.5)$$


and

$$K_n(t) = \sup\{m:\ 0 \le m \le n,\ s_m^2 \le t\,s_n^2\},\quad 0 \le t \le 1,\ n \ge 1,\qquad(2.2.6)$$

with $s_0 := 0$. The Student process based on (2.2.4) is then defined as

$$T_{n,t}(Z_1, \cdots, Z_n) = \frac{\sum_{i=1}^{K_n(t)} Z_i/\sqrt n}{\big(\sum_{i=1}^n (Z_i - \bar Z)^2/(n-1)\big)^{1/2}},\quad 0 \le t \le 1.\qquad(2.2.7)$$

Under Lindeberg's condition (2.2.8), due to (2.2.4) and (2.2.26) of the proof of Lemma 2.2.1, $\sum_{i=1}^n(Z_i - \bar Z)^2$ in the denominator of (2.2.7) is positive WPA1, as $n \to \infty$. In the following Lemma 2.2.1, (c) implies (b) (cf. Remark 2.1.4 of Section 2.1.4), and (b), clearly, leads to (a) (cf. the proof of Lemma 2.2.1). However, (a), (b) and (c) all appear below as separate results for further convenience.
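Under our reading of (2.2.6)-(2.2.7) (with $s_0 = 0$), the Student process can be sketched as follows; the code is an illustration of the definitions only, with function names of our own choosing:

```python
import math

# Sketch of our reading of (2.2.6)-(2.2.7): K_n(t) is the largest 0 <= m <= n
# with s_m^2 <= t * s_n^2 (s_0 = 0), and T_{n,t} divides the partial sum over
# the first K_n(t) observations, scaled by sqrt(n), by the full-sample
# Studentizing denominator, so that t = 1 recovers T_n of (2.2.2).

def K_n(t, sigmas):
    s2 = [v * v for v in sigmas]
    sn2 = sum(s2)
    m, partial = 0, 0.0
    for i, v in enumerate(s2, start=1):
        partial += v
        if partial <= t * sn2:
            m = i
    return m

def student_process(t, z, sigmas):
    n = len(z)
    zbar = sum(z) / n
    denom = math.sqrt(sum((v - zbar) ** 2 for v in z) / (n - 1))
    return sum(z[: K_n(t, sigmas)]) / math.sqrt(n) / denom
```

With equal variances, $K_n(t)$ reduces to the familiar $[nt]$ of (2.2.3).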

Lemma 2.2.1. Let sequence $\{Z_i, i \ge 1\}$ of (2.2.4) satisfy the Lindeberg condition, i.e.,

$$\text{for each } \varepsilon > 0,\quad s_n^{-2}\sum_{i=1}^n E\big(Z_i^2 I_{\{|Z_i| > \varepsilon s_n\}}\big) \to 0,\quad n \to \infty,\qquad(2.2.8)$$

with $s_n$ as in (2.2.5). Then, as $n \to \infty$, for the Student process $T_{n,t}(Z_1, \cdots, Z_n)$ in (2.2.7) we have:

(a) $T_{n,t_0}(Z_1, \cdots, Z_n) \xrightarrow{\ D\ } N(0, t_0)$, $t_0 \in (0,1]$;

(b) $T_{n,t}(Z_1, \cdots, Z_n) \xrightarrow{\ D\ } W(t)$ on $(D[0,1], \rho)$;

(c) we can redefine $\{Z_i, i \ge 1\}$ on a richer probability space together with a sequence of independent standard normal r.v.'s $\{U_i, i \ge 1\}$ such that

$$\sup_{0 \le t \le 1}\Big|T_{n,t}(Z_1, \cdots, Z_n) - s_n^{-1}\sum_{i=1}^{K_n(t)} \sigma_i U_i\Big| = o_P(1).$$
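The Lindeberg sum in (2.2.8) can be evaluated exactly for simple discrete sequences; the two-point r.v.'s below are our own toy example, not a construction from the thesis:

```python
import math

# Lindeberg sum of (2.2.8) for Z_i = +sigma_i or -sigma_i with probability 1/2
# each (our toy choice): E(Z_i^2 1{|Z_i| > eps*s_n}) equals sigma_i^2 when
# sigma_i > eps*s_n and 0 otherwise, so the sum has a closed form.

def lindeberg_sum(sigmas, eps):
    sn2 = sum(v * v for v in sigmas)
    sn = math.sqrt(sn2)
    return sum(v * v for v in sigmas if v > eps * sn) / sn2
```

With a hundred comparable scales the sum vanishes for every fixed `eps`, while a single dominating scale keeps it close to 1, the typical failure of (2.2.8).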

Remark 2.2.1. Lemma 2.2.1 provides Studentized companions to a few well-known theorems. Thus, with the process

$$s_n^{-1}\sum_{i=1}^{K_n(t)} Z_i,\quad 0 \le t \le 1,\qquad(2.2.9)$$


in place of $T_{n,t}(Z_1, \cdots, Z_n)$ of (2.2.7), where $K_n(t)$ in (2.2.9) is defined as in (2.2.6), the conclusion of (a) with $t_0 = 1$ amounts to the Lindeberg-Feller CLT (sufficiency part, under (2.2.8)) for a regular sequence of r.v.'s rather than double arrays of r.v.'s. Theorem 3.1 of Prohorov [55], read for a regular sequence of r.v.'s, states that in the presence of the so-called uniform negligibility condition for (2.2.4), Lindeberg's condition in (2.2.8) is equivalent to weak convergence of a continuous version of (2.2.9) to $W(t)$ on $(C[0,1], \rho)$ with the sup-norm metric $\rho$. A version of this result is introduced as Problem 7 on p. 143 in [6], namely that (2.2.8) implies weak convergence of (2.2.9) to $W(t)$ on $(D[0,1], \rho_S)$, where $\rho_S$ is the Skorohod metric on $D[0,1]$. Since $C[0,1]$ supports the Wiener measure on $(D[0,1], \mathcal{D})$, with sigma-field $\mathcal{D}$ generated by the finite-dimensional subsets of $D[0,1]$, (b) of Lemma 2.2.1 can be viewed as the one-sided Studentized companion of the $(D[0,1], \rho)$ version of Prohorov's theorem. As to (2.2.8) implying (c) of Lemma 2.2.1, this is the Studentized companion of the result of Csörgő et al. in [17] for the self-normalized partial sums process (cf. forthcoming Lemma 2.2.2).

Remark 2.2.2. Conclusions (a) and (b) of Lemma 2.2.1 remain valid for double arrays $\{Z_i, 1 \le i \le k_n, n \ge 1\}$ of r.v.'s that are independent for each $n$, have zero means and finite variances, at least one of which is positive. The proofs are based on the Lindeberg-Feller CLT and Prohorov's theorem for such $\{Z_i, 1 \le i \le k_n, n \ge 1\}$, respectively (cf. Remark 2.2.1), and also on (2.2.26), which in turn follows from the version of Raikov's result of the forthcoming Lemma 2.2.4 for the just mentioned double arrays $\{Z_i, 1 \le i \le k_n, n \ge 1\}$ (cf. Theorem 4 on p. 143 in Gnedenko and Kolmogorov [28]).

The proof of Lemma 2.2.1 is based on Lemmas 2.2.2-2.2.4. The result of the following Lemma 2.2.2 is a sup-norm approximation in probability for a self-normalized partial sums process. It was recently obtained in Csörgő et al. [16] (cf. Proposition 2.3 in [17]).


Lemma 2.2.2. Let $\{Z_i, i \ge 1\}$ be as in (2.2.4) and satisfy Lindeberg's condition (2.2.8). Then we can redefine $\{Z_i, i \ge 1\}$ on a richer probability space together with a sequence of independent standard normal r.v.'s $\{U_i, i \ge 1\}$ such that

$$\sup_{0 \le t \le 1}\Big|\frac{\sum_{i=1}^{K_n(t)} Z_i}{\big(\sum_{i=1}^n Z_i^2\big)^{1/2}} - s_n^{-1}\sum_{i=1}^{K_n(t)} \sigma_i U_i\Big| = o_P(1),\qquad(2.2.10)$$

where $s_n$ is defined by (2.2.5).

We will also make use of Lévy's modulus of continuity of the Wiener process (cf., e.g., [14]).

Lemma 2.2.3. For a standard Wiener process $\{W(t), 0 \le t \le 1\}$,

$$\lim_{h \downarrow 0}\ \sup_{\substack{0 \le t_1 < t_2 \le 1\\ t_2 - t_1 \le h}} \frac{|W(t_2) - W(t_1)|}{\sqrt{2h\log(1/h)}} = 1\quad a.s.$$

Lemma 2.2.4 below is a special case of Raikov's theorem (cf. Theorem 4 on p. 143 in [28]).

Lemma 2.2.4. Suppose that sequence $\{Z_i, i \ge 1\}$ is as in (2.2.4) and $s_n$ is defined by (2.2.5). In order that the uniform negligibility condition be satisfied, i.e.,

$$\text{for each } \varepsilon > 0,\quad \max_{1 \le i \le n} P(|Z_i| > \varepsilon s_n) \to 0,\quad n \to \infty,\qquad(2.2.11)$$

and the CLT hold, namely,

$$s_n^{-1}\sum_{i=1}^n Z_i \xrightarrow{\ D\ } N(0,1),\quad n \to \infty,\qquad(2.2.12)$$

it is necessary and sufficient that the sums $s_n^{-2}\sum_{i=1}^n Z_i^2$ be relatively stable, namely,

$$s_n^{-2}\sum_{i=1}^n Z_i^2 \xrightarrow{\ P\ } 1,\quad n \to \infty.\qquad(2.2.13)$$

Proof of Lemma 2.2.1. First, we are to prove (c). Using the following representation for $T_{n,t}(Z_1, \cdots, Z_n)$:

$$T_{n,t}(Z_1, \cdots, Z_n) = \frac{\sum_{i=1}^{K_n(t)} Z_i\big/\big(\sum_{i=1}^n Z_i^2\big)^{1/2}}{\Big(\big(n - \big(\sum_{i=1}^n Z_i/(\sum_{i=1}^n Z_i^2)^{1/2}\big)^2\big)\big/(n-1)\Big)^{1/2}},\quad 0 \le t \le 1,\qquad(2.2.14)$$


we have

$$\sup_{0 \le t \le 1}\Big|T_{n,t}(Z_1, \cdots, Z_n) - s_n^{-1}\sum_{i=1}^{K_n(t)} \sigma_i U_i\Big| \le \sup_{0 \le t \le 1}\Big|\frac{\sum_{i=1}^{K_n(t)} Z_i}{\big(\sum_{i=1}^n Z_i^2\big)^{1/2}} - s_n^{-1}\sum_{i=1}^{K_n(t)} \sigma_i U_i\Big|$$
$$+\ \sup_{0 \le t \le 1}\Big|\frac{\sum_{i=1}^{K_n(t)} Z_i}{\big(\sum_{i=1}^n Z_i^2\big)^{1/2}}\Big|\,\Big|\Big(\big(n - \big(\textstyle\sum_{i=1}^n Z_i/(\sum_{i=1}^n Z_i^2)^{1/2}\big)^2\big)\big/(n-1)\Big)^{-1/2} - 1\Big|,\qquad(2.2.15)$$

where $\{U_i, i \ge 1\}$ is as in Lemma 2.2.2. In view of the latter lemma, (2.2.15) and the observation that, for each $n \ge 1$,

$$\sup_{0 \le t \le 1}\Big|\frac{\sum_{i=1}^{K_n(t)} Z_i}{\big(\sum_{i=1}^n Z_i^2\big)^{1/2}}\Big| = \max_{1 \le m \le n}\frac{\big|\sum_{i=1}^m Z_i\big|}{\big(\sum_{i=1}^n Z_i^2\big)^{1/2}},\qquad(2.2.16)$$

in order to conclude (c), it suffices to show that

$$\sup_{0 \le t \le 1}\Big|s_n^{-1}\sum_{i=1}^{K_n(t)} \sigma_i U_i\Big| = O_P(1).\qquad(2.2.18)$$

To this end, we first note that, for each $n \ge 1$,

$$\Big\{s_n^{-1}\sum_{i=1}^{K_n(t)} \sigma_i U_i,\ 0 \le t \le 1\Big\} \overset{D}{=} \Big\{W\big(s_{K_n(t)}^2/s_n^2\big),\ 0 \le t \le 1\Big\},\qquad(2.2.19)$$

since all finite-dimensional distributions of the two processes are the same. Let

$$h_n := s_n^{-2}\max_{1 \le i \le n}\sigma_i^2.\qquad(2.2.20)$$

Lindeberg's condition (2.2.8) implies that

$$h_n = o(1),\quad n \to \infty.\qquad(2.2.21)$$

Clearly, by the definition of $K_n(t)$ in (2.2.6),

$$\sup_{0 \le t \le 1}\big|s_{K_n(t)}^2/s_n^2 - t\big| \le h_n,\quad n \ge 1.\qquad(2.2.22)$$


Hence, by (2.2.21), (2.2.22) and Lemma 2.2.3, as $n \to \infty$,

$$\sup_{0 \le t \le 1}\big|W\big(s_{K_n(t)}^2/s_n^2\big) - W(t)\big| = O(1)\sqrt{h_n\log(1/h_n)}\quad a.s.\qquad(2.2.23)$$

Finally, combining (2.2.19), (2.2.21), (2.2.23) and the fact that

$$\sup_{0 \le t \le 1}|W(t)| = O_P(1),\qquad(2.2.24)$$

one concludes (2.2.18). As to part (b), it follows from (c), (2.2.19), (2.2.21) and (2.2.23). Alternatively, independently of the conclusion in (c), (b) can be argued as follows. Due to (2.2.21), for any $\varepsilon > 0$, as $n \to \infty$,

$$P\big(n(\bar Z)^2 > \varepsilon s_n^2\big) \le \big(\varepsilon n s_n^2\big)^{-1}E\Big(\sum_{i=1}^n Z_i\Big)^2 = \big(\varepsilon n s_n^2\big)^{-1}\sum_{i=1}^n \sigma_i^2 \le \big(\varepsilon s_n^2\big)^{-1}\max_{1 \le i \le n}\sigma_i^2 \to 0.\qquad(2.2.25)$$

Since Lindeberg's condition implies (2.2.11) and (2.2.12), from (2.2.13) and (2.2.25),

$$\frac{\sum_{i=1}^n (Z_i - \bar Z)^2}{s_n^2} = \frac{\sum_{i=1}^n Z_i^2}{s_n^2} - \frac{n(\bar Z)^2}{s_n^2} \xrightarrow{\ P\ } 1,\quad n \to \infty.\qquad(2.2.26)$$

Finally, Prohorov's theorem for (2.2.9) (cf. Remark 2.2.1) and (2.2.26) result in (b). Trivially, (b) yields (a). For $t_0 = 1$, another proof of (a) can be given by using the Lindeberg-Feller CLT (sufficiency part, under (2.2.8)) and (2.2.26). $\Box$
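Conclusion (a) at $t_0 = 1$ lends itself to a quick Monte Carlo check; the uniform two-point-scale sequence, the seed, the replication counts and the tolerances below are all our own choices:

```python
import math
import random

# Monte Carlo sketch of Lemma 2.2.1(a) at t_0 = 1: for independent but
# non-identically distributed Z_i (uniform on [-c_i, c_i] with bounded,
# non-constant scales c_i, so Lindeberg's condition (2.2.8) holds), the
# Student t-statistic should be approximately N(0, 1) for large n.

def t_stat(z):
    n = len(z)
    zbar = sum(z) / n
    return math.sqrt(n) * zbar / math.sqrt(sum((v - zbar) ** 2 for v in z) / (n - 1))

random.seed(1)
n, reps = 200, 2000
scales = [1.0 + (i % 5) for i in range(n)]  # bounded, non-constant variances
samples = [t_stat([random.uniform(-c, c) for c in scales]) for _ in range(reps)]
mean = sum(samples) / reps
var = sum((s - mean) ** 2 for s in samples) / reps
```

Across the replications, the empirical mean and variance of the statistic should be close to 0 and 1, respectively.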

2.2.2 Auxiliary Results and Proofs of Theorems 2.1.1-2.1.3

Aiming at the proofs of our main Theorems 2.1.1-2.1.3 and of their accompanying remarks introduced in Section 2.1.4, in this subsection we develop auxiliary Lemmas 2.2.5-2.2.10 of various nature. In particular, for the proof of Theorem 2.1.2, we employ two auxiliary processes, namely those in (2.2.60) and (2.2.114), and establish Lemmas 2.2.6 and 2.2.9 for them. The process in (2.2.60) will be seen to serve as the main


term in the expansion for (2.2.114) (cf. the proof of Lemma 2.2.9), while (2.2.114) is the prototype of the main terms in the expansions for the properly normalized processes of our main interest that are introduced in Section 2.1.2 and studied in Theorem 2.1.2. Among the auxiliary results of this subsection that may be of general interest, we distinguish Lemma 2.2.7 on matrices and the invariance principles of Lemma 2.2.9 for a special Student process (cf. Remarks 2.2.6 and 2.2.10 for details). The chain of our auxiliary results is concluded with the observational Lemma 2.2.10 that calls attention to the optimality of the newly introduced condition (F) in FEIVM's for the key Lemmas 2.2.6 and 2.2.9. Below, $o(1)$ stands for a numerical sequence that converges to zero. Abbreviation WLLN is for the Kolmogorov weak law of large numbers for i.i.d.r.v.'s with finite mean. For a square matrix $A$, $\lambda_{\min}(A)$, $\lambda_{\max}(A)$, $\det(A)$ and $\mathrm{tr}(A)$ denote respectively the minimum eigenvalue of $A$, the maximum eigenvalue of $A$, the determinant of $A$ and the trace of $A$, while by $A \ge 0$ or $A > 0$ we mean that $A$ is positive semidefinite or positive definite. If $Z$ is a $d$-dimensional vector, then $Z^{(j)}$ is its $j$th component, while $Z^{(k,k+l)} = \big(Z^{(k)}, Z^{(k+1)}, \cdots, Z^{(k+l)}\big)$ is the subvector of $Z \in \mathbb{R}^d$ that has all the components of $Z$ starting with $Z^{(k)}$ and ending with $Z^{(k+l)}$, $1 \le k \le k+l \le d$. Useful relations on the explanatory variables that are established in the following lemma play crucial roles throughout this subsection.

Lemma 2.2.5. Let the explanatory variables satisfy (D) and (E). Then,

(a) for sufficiently large $n$,

$$\overline{\xi^2} \ge \overline{\xi^2} - (\bar\xi)^2 \ge (1-a)\,\overline{\xi^2},\quad\text{for some } a \in [1/2, 1);$$

(b) we also have

$$\liminf_{n\to\infty}\Big(1 - \frac{(\bar\xi - cm)^2}{\overline{(\xi - cm)^2}}\Big) > 0,$$

with $c$ of (2.1.7).


Proof. To prove (a), we are to show first that the inequality in (E) is strengthened to the following one:

$$\liminf_{n\to\infty}\big(\varepsilon\,\overline{\xi^2} - (\bar\xi)^2\big) > 0,\quad\text{for any } \varepsilon \in (d, 1),\qquad(2.2.27)$$

where

$$d = \begin{cases} \Big(1 + \liminf_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big)/m^2\Big)^{-1}, & \text{if } \liminf_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big) < \infty \text{ and } m \neq 0,\\ 0, & \text{otherwise}, \end{cases}\qquad(2.2.28)$$

with $m$ as in (D). Assume first that $m \neq 0$ in (D). Then, on account of (E), there exists a positive constant $\delta$ such that

$$0 < \delta m^2 = \delta\limsup_{n\to\infty}(\bar\xi)^2 < \liminf_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big).$$

The very last inequality implies that, for any $\varepsilon \in \big((1+\delta)^{-1}, 1\big)$,

$$\liminf_{n\to\infty}\big(\varepsilon\,\overline{\xi^2} - (\bar\xi)^2\big) \ge \varepsilon\liminf_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big) - (1-\varepsilon)\limsup_{n\to\infty}(\bar\xi)^2 > \varepsilon\,\delta m^2 - (1-\varepsilon)m^2 = m^2\big(\varepsilon(1+\delta) - 1\big) > 0,$$

and, since $\delta$ as above can be taken arbitrarily close to $\liminf_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big)/m^2$ (or arbitrarily large when this liminf is infinite), we have (2.2.27) for any $\varepsilon \in (d, 1)$. If $m = 0$, then $\limsup_{n\to\infty}(\bar\xi)^2 = 0$ and, for any $\varepsilon \in (0, 1)$,

$$\liminf_{n\to\infty}\big(\varepsilon\,\overline{\xi^2} - (\bar\xi)^2\big) \ge \varepsilon\liminf_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big) - (1-\varepsilon)\limsup_{n\to\infty}(\bar\xi)^2 = \varepsilon\liminf_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big) > 0.$$

Thus, we have proved (2.2.27) with constant $d$ from (2.2.28) in case $m = 0$ as well. From (2.2.27), for sufficiently large $n$ and $\varepsilon \in (d, 1)$, with $d \in [0, 1)$ of (2.2.28),

$$\overline{\xi^2} - (\bar\xi)^2 = \big(\varepsilon\,\overline{\xi^2} - (\bar\xi)^2\big) + (1-\varepsilon)\,\overline{\xi^2} \ge (1-\varepsilon)\,\overline{\xi^2},$$


which, in particular, holds true for $\varepsilon = (1+d)/2$. Hence, constant $a$ in part (a) can be chosen as $a = (1+d)/2 \in [1/2, 1)$, while, trivially,

$$\overline{\xi^2} \ge \overline{\xi^2} - (\bar\xi)^2.$$

We are to prove part (b) now. Let $c = 0$, i.e., intercept $\alpha$ is known to be zero. For sufficiently large $n$, from part (a) and (E) guaranteeing that $\overline{\xi^2} > 0$,

$$1 - \frac{(\bar\xi)^2}{\overline{\xi^2}} = \frac{\overline{\xi^2} - (\bar\xi)^2}{\overline{\xi^2}} \ge 1 - a > 0,$$

which trivially leads to the inequality in part (b). When $c = 1$, (D) and (E) imply

$$\liminf_{n\to\infty}\Big(1 - \frac{(\bar\xi - m)^2}{\overline{(\xi - m)^2}}\Big) \ge 1 - \limsup_{n\to\infty}\frac{(\bar\xi - m)^2}{\overline{(\xi - m)^2}} \ge 1 - \frac{\limsup_{n\to\infty}(\bar\xi - m)^2}{\liminf_{n\to\infty}\big(\overline{\xi^2} - 2m\bar\xi + m^2\big)}$$
$$\ge 1 - \frac{\limsup_{n\to\infty}(\bar\xi - m)^2}{\liminf_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big) + \lim_{n\to\infty}(\bar\xi - m)^2} > 0. \qquad\Box$$

Proof of Theorem 2.1.1a. Assuming (A) and (E), we first prove that, as $n \to \infty$,

$$\frac{S_{xx} - \theta}{\overline{(\xi - c\bar\xi)^2}} \xrightarrow{\ P\ } 1,\quad \frac{S_{xy}}{\overline{(\xi - c\bar\xi)^2}} \xrightarrow{\ P\ } \beta\quad\text{and}\quad \frac{S_{yy} - \lambda\theta}{\overline{(\xi - c\bar\xi)^2}} \xrightarrow{\ P\ } \beta^2.\qquad(2.2.29)$$

For example,

$$\frac{S_{yy}}{\overline{(\xi - c\bar\xi)^2}} = \beta^2 + \frac{2\beta\,S_{\xi\delta}}{\overline{(\xi - c\bar\xi)^2}} + \frac{S_{\delta\delta}}{\overline{(\xi - c\bar\xi)^2}},\qquad(2.2.30)$$

where, since ES2 = \6 < 0 0 and (E) is valid, by the SLLN,

$$S_{\delta\delta} - \lambda\theta = \big(\overline{\delta^2} - \lambda\theta\big) - c\,(\bar\delta)^2 \overset{a.s.}{=} o(1), \quad n \to \infty, \qquad (2.2.31)$$

where $c$ is as in (2.1.7). Due to the representation
$$S_{\xi\delta} = \frac{1}{n}\sum_{i=1}^{n}(\xi_i - c\bar\xi)(\delta_i - c\bar\delta) = \frac{1}{n}\sum_{i=1}^{n}(\xi_i - c\bar\xi)\,\delta_i, \qquad (2.2.32)$$
and by Markov's inequality, for any $a > 0$,

$$P\big(|S_{\xi\delta}| > a\,S_{\xi\xi}\big) \le \frac{E\,S_{\xi\delta}^2}{a^2 S_{\xi\xi}^2} = \frac{\lambda\theta}{a^2\, n\, S_{\xi\xi}} \to 0, \quad n \to \infty. \qquad (2.2.33)$$
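The vanishing of $S_{\xi\delta}/S_{\xi\xi}$ can be illustrated numerically. The following Python sketch is a hypothetical simulation (not part of the thesis): the uniform design grid and the standard normal error law are illustrative assumptions. It estimates $E(S_{\xi\delta}/S_{\xi\xi})^2$ at two sample sizes for a nonrandom design, in line with a Markov-inequality bound of order $1/n$:

```python
import numpy as np

# Functional EIV setting: nonrandom design points xi_i, i.i.d. errors delta_i.
# The ratio S_xidelta / S_xixi has second moment of order 1/n.
rng = np.random.default_rng(1)

def ratio(n):
    xi = np.linspace(0.0, 1.0, n)            # nonrandom design (illustrative)
    delta = rng.normal(0.0, 1.0, n)          # i.i.d. measurement errors
    xc = xi - xi.mean()                      # centered design (c = 1 case)
    return (xc * delta).sum() / (xc ** 2).sum()

small = np.mean([ratio(10_000) ** 2 for _ in range(200)])
large = np.mean([ratio(100) ** 2 for _ in range(200)])
print(small < large)
```

The second-moment estimate at $n = 10{,}000$ is roughly a hundredth of the one at $n = 100$, matching the $1/n$ rate.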


Thus, (2.2.30)-(2.2.33) and (E) result in the last convergence in (2.2.29). The first and the second convergences in (2.2.29) are proved similarly. Suppose now that (B) and (D)-(F) are valid. Then, as $n \to \infty$,

$$\frac{S_{xy} - \mu}{S_{\xi\xi}} \overset{a.s.}{\longrightarrow} \beta \quad \text{and} \quad \frac{S_{yy} - \lambda\theta}{S_{\xi\xi}} \overset{a.s.}{\longrightarrow} \beta^2. \qquad (2.2.34)$$
Indeed, concentrating only on the last convergence in (2.2.34), we argue as follows. If $X_1, \ldots, X_n$ are independent r.v.'s with $EX_i = 0$, $i = \overline{1, n}$, and $p \ge 2$, then

$$E\left|\sum_{i=1}^{n} X_i\right|^p \le c(p)\left(\sum_{i=1}^{n} E|X_i|^p + \Big(\sum_{i=1}^{n} EX_i^2\Big)^{p/2}\right), \qquad (2.2.35)$$
where $c(p)$ is a positive constant depending only on $p$. Inequality (2.2.35) and its proof can be found in [54] (cf. Theorem 19 on p. 86 there). Applying (2.2.35) to $\big\{(\xi_i - c\bar\xi)\delta_i\big/\sum_{j=1}^{n}(\xi_j - c\bar\xi)^2,\ 1 \le i \le n\big\}$ with $p = 2 + \Delta$, where $\Delta > 0$ is as in (B), and employing (F) and part (a) of Lemma 2.2.5 (under (D) and (E)), for $n$ large enough, we have
$$E\left|\frac{S_{\xi\delta}}{S_{\xi\xi}}\right|^{2+\Delta} = E\left|\sum_{i=1}^{n}\frac{(\xi_i - c\bar\xi)\delta_i}{\sum_{j=1}^{n}(\xi_j - c\bar\xi)^2}\right|^{2+\Delta} \le \mathrm{const}\left(\frac{E|\delta_1|^{2+\Delta}\sum_{i=1}^{n}|\xi_i - c\bar\xi|^{2+\Delta}}{\big(\sum_{j=1}^{n}(\xi_j - c\bar\xi)^2\big)^{2+\Delta}} + \frac{(\lambda\theta)^{1+\Delta/2}}{\big(\sum_{j=1}^{n}(\xi_j - c\bar\xi)^2\big)^{1+\Delta/2}}\right)$$
$$\le \mathrm{const}\left(\frac{\big(\max_{1\le i\le n}|\xi_i - c\bar\xi|\big)^{\Delta}}{\big(\sum_{j=1}^{n}(\xi_j - c\bar\xi)^2\big)^{1+\Delta}} + \frac{1}{(n\,S_{\xi\xi})^{1+\Delta/2}}\right) \le \frac{\mathrm{const}}{n^{1+\Delta/2}}, \qquad (2.2.36)$$
and hence, by Markov's inequality, for any $a > 0$,
$$P\big(|S_{\xi\delta}| > a\,S_{\xi\xi}\big) \le a^{-(2+\Delta)}\,E\left|\frac{S_{\xi\delta}}{S_{\xi\xi}}\right|^{2+\Delta} \le \frac{\mathrm{const}}{a^{2+\Delta}\,n^{1+\Delta/2}}.$$
From the latter bound, via


the Borel-Cantelli lemma and (2.2.36), one concludes

$$\frac{S_{\xi\delta}}{S_{\xi\xi}} \overset{a.s.}{=} o(1), \quad n \to \infty. \qquad (2.2.37)$$
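Behind the Rosenthal-type inequality (2.2.35) for $p = 4$ there is an exact identity: for independent mean-zero $X_i$, $E(\sum X_i)^4 = \sum EX_i^4 + 6\sum_{i<j}EX_i^2\,EX_j^2$, which is bounded by $3\big(\sum EX_i^4 + (\sum EX_i^2)^2\big)$, so $c(4) = 3$ works. A sympy sketch of the identity for $n = 3$; the moment symbols $a_i = EX_i^2$, $b_i = EX_i^4$ are assumptions of this toy check, not the thesis's notation:

```python
import sympy as sp
from itertools import combinations

n = 3
X = sp.symbols(f'x1:{n + 1}')
a = sp.symbols(f'a1:{n + 1}')   # a_i stands for E X_i^2
b = sp.symbols(f'b1:{n + 1}')   # b_i stands for E X_i^4

def expect(expr):
    # Expectation via independence: factor each monomial over the X_i and
    # replace powers by moments. In this degree-4 expansion, every monomial
    # with an odd power of some X_i also has an odd power of another mean-zero
    # factor, so its expectation vanishes.
    total = sp.Integer(0)
    for mono, coeff in sp.Poly(sp.expand(expr), *X).terms():
        term = sp.Integer(1)
        for p, ai, bi in zip(mono, a, b):
            if p == 2:
                term *= ai
            elif p == 4:
                term *= bi
            elif p != 0:
                term = sp.Integer(0)
                break
        total += coeff * term
    return sp.expand(total)

lhs = expect(sum(X) ** 4)
rhs = sum(b) + 6 * sum(ai * aj for ai, aj in combinations(a, 2))
assert sp.simplify(lhs - rhs) == 0
print(lhs)
```

Since $3\big(\sum b_i + (\sum a_i)^2\big) - \text{lhs} = 2\sum b_i + 3\sum a_i^2 \ge 0$, the $p = 4$ case of (2.2.35) follows at once.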

Thus, (2.2.30), (2.2.31), (2.2.37) and (E) imply the last convergence in (2.2.34). Now, assuming that $\mu = 0$ in (2.1.3), consider the WLSE $\hat\beta_{1n} = (\hat\beta_{1n} - \beta)_{K_{1n}(1)} + \beta$, with $(\hat\beta_{1n} - \beta)_{K_{1n}(t)}$ from (2.1.11). It follows from (E) and (2.2.29) that

for any $\beta > 0$,
$$P\big(S_{xy} \le 0\big) = P\big(S_{xy}/(S_{\xi\xi}\beta) \le 0\big) = P\big(S_{xy}/(S_{\xi\xi}\beta) - 1 \le -1\big)$$
$$\le P\big(|S_{xy}/(S_{\xi\xi}\beta) - 1| \ge 1\big) \to 0, \quad n \to \infty. \qquad (2.2.38)$$

Similarly, for any $\beta < 0$, $P(S_{xy} \ge 0) \to 0$, $n \to \infty$. \qquad (2.2.39)

On combining (2.2.38) and (2.2.39),

$$\mathrm{sign}(S_{xy}) \xrightarrow{P} \mathrm{sign}(\beta), \quad n \to \infty. \qquad (2.2.40)$$

Convergence in probability to $z$ of $\hat z_n = (\hat z_n - z)_{K_{1n}(1)} + z$, with $(\hat z_n - z)_{K_{1n}(t)}$ of (2.1.12), which results from (2.2.29), together with (2.2.40), leads to weak consistency of $\hat\beta_{1n}$. Similarly, under the respective conditions, strong consistency of $\hat\beta_{1n}$ is concluded from (2.2.34) and (2.2.40), which now holds true in the almost sure sense. For $\hat\beta_{2n}$ and $\hat\beta_{3n}$, the desired consistency follows directly from (2.2.29) and (2.2.34), respectively. Consistency of $\hat\alpha_{in}$ follows from that of the respective $\hat\beta_{in}$, (D) and the SLLN (under (A)), $i = \overline{1, 3}$, since

$$\hat\alpha_{in} - \alpha = \bar y - \bar x\hat\beta_{in} - \alpha = (\bar y - \bar x\beta - \alpha) - \bar x(\hat\beta_{in} - \beta) = (\bar\delta - \bar\varepsilon\beta) - (\bar\xi + \bar\varepsilon)(\hat\beta_{in} - \beta).$$
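The sign-consistency claim (2.2.40) is easy to probe by simulation. A hypothetical Python sketch (the structural-model data generation, sample size, and error laws are illustrative assumptions):

```python
import numpy as np

# With beta != 0, the sample covariance S_xy inherits the sign of beta with
# probability tending to one, in the spirit of (2.2.40).
rng = np.random.default_rng(2)
beta, n, reps = -0.5, 500, 300
hits = 0
for _ in range(reps):
    xi = rng.normal(0.0, 2.0, n)                   # latent variable
    x = xi + rng.normal(0.0, 1.0, n)               # error-contaminated regressor
    y = 1.0 + beta * xi + rng.normal(0.0, 1.0, n)
    s_xy = np.cov(x, y, bias=True)[0, 1]
    hits += (np.sign(s_xy) == np.sign(beta))
print(hits / reps)
```

Here $\mathrm{Cov}(x, y) = \beta\,\mathrm{Var}(\xi) = -2$, so already at $n = 500$ the sign of $S_{xy}$ is essentially always that of $\beta$.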

Consider now the estimator $\hat\theta_{2n}$ from (2.1.31). Accordingly,

$$\hat\theta_{2n} - \theta = \frac{(S_{xx} - \theta)(S_{yy} - \lambda\theta) - (S_{xy} - \mu)^2}{S_{yy} - \lambda\theta}, \qquad (2.2.41)$$


where
$$(S_{xx} - \theta)(S_{yy} - \lambda\theta) - (S_{xy} - \mu)^2$$
$$= \big(S_{\xi\xi} + 2S_{\xi\varepsilon} + (S_{\varepsilon\varepsilon} - \theta)\big)\big(S_{\xi\xi}\beta^2 + 2S_{\xi\delta}\beta + (S_{\delta\delta} - \lambda\theta)\big)$$
$$\quad - \big(S_{\xi\xi}\beta + S_{\xi\delta} + S_{\xi\varepsilon}\beta + (S_{\delta\varepsilon} - \mu)\big)^2$$
$$= \Big(S_{\xi\xi}^2\beta^2 + 2S_{\xi\varepsilon}S_{\xi\xi}\beta^2 + (S_{\varepsilon\varepsilon} - \theta)S_{\xi\xi}\beta^2 + 2S_{\xi\xi}S_{\xi\delta}\beta + 4S_{\xi\varepsilon}S_{\xi\delta}\beta$$
$$\qquad + 2(S_{\varepsilon\varepsilon} - \theta)S_{\xi\delta}\beta + S_{\xi\xi}(S_{\delta\delta} - \lambda\theta) + 2S_{\xi\varepsilon}(S_{\delta\delta} - \lambda\theta) + (S_{\varepsilon\varepsilon} - \theta)(S_{\delta\delta} - \lambda\theta)\Big)$$
$$\quad - \Big(S_{\xi\xi}^2\beta^2 + S_{\xi\delta}^2 + S_{\xi\varepsilon}^2\beta^2 + (S_{\delta\varepsilon} - \mu)^2 + 2S_{\xi\xi}\beta S_{\xi\delta} + 2S_{\xi\xi}\beta\, S_{\xi\varepsilon}\beta$$
$$\qquad + 2S_{\xi\xi}\beta(S_{\delta\varepsilon} - \mu) + 2S_{\xi\delta}S_{\xi\varepsilon}\beta + 2S_{\xi\delta}(S_{\delta\varepsilon} - \mu) + 2S_{\xi\varepsilon}\beta(S_{\delta\varepsilon} - \mu)\Big)$$
$$= S_{\xi\xi}\beta^2\Big(-\frac{2}{\beta}(S_{\delta\varepsilon} - \mu) + \frac{1}{\beta^2}(S_{\delta\delta} - \lambda\theta) + (S_{\varepsilon\varepsilon} - \theta)\Big) + R_n, \qquad (2.2.42)$$
with
$$R_n = -S_{\xi\delta}^2 - S_{\xi\varepsilon}^2\beta^2 + 2S_{\xi\delta}S_{\xi\varepsilon}\beta + (S_{\varepsilon\varepsilon} - \theta)(S_{\delta\delta} - \lambda\theta) - (S_{\delta\varepsilon} - \mu)^2$$
$$\quad + 2S_{\xi\delta}\beta(S_{\varepsilon\varepsilon} - \theta) + 2S_{\xi\varepsilon}(S_{\delta\delta} - \lambda\theta) - 2S_{\xi\delta}(S_{\delta\varepsilon} - \mu) - 2S_{\xi\varepsilon}\beta(S_{\delta\varepsilon} - \mu). \qquad (2.2.43)$$
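Since (2.2.42)-(2.2.43) is a finite-sample algebraic identity in the sample moments, it can be checked numerically. A Python sketch (the model constants and the Gaussian error law are illustrative assumptions of this check):

```python
import numpy as np

# Numeric check of the factorization (2.2.42)-(2.2.43):
# (S_xx - th)(S_yy - lam*th) - (S_xy - mu)^2
#   = S_xixi * beta^2 * ( -(2/beta)E3 + E2/beta^2 + E1 ) + R_n
rng = np.random.default_rng(6)
n, beta, th, lam, mu = 300, 1.5, 0.8, 1.3, 0.2

xi = rng.normal(0, 2, n)
cov = np.array([[lam * th, mu], [mu, th]])      # Cov(delta, eps)
de = rng.standard_normal((n, 2)) @ np.linalg.cholesky(cov).T
delta, eps = de[:, 0], de[:, 1]
x, y = xi + eps, beta * xi + delta

def S(u, v):
    return ((u - u.mean()) * (v - v.mean())).mean()

Sxx, Syy, Sxy = S(x, x), S(y, y), S(x, y)
Sxixi, Sxd, Sxe = S(xi, xi), S(xi, delta), S(xi, eps)
E1, E2, E3 = S(eps, eps) - th, S(delta, delta) - lam * th, S(delta, eps) - mu
Rn = (-Sxd**2 - Sxe**2 * beta**2 + 2 * Sxd * Sxe * beta + E1 * E2 - E3**2
      + 2 * Sxd * beta * E1 + 2 * Sxe * E2 - 2 * Sxd * E3 - 2 * Sxe * beta * E3)
lhs = (Sxx - th) * (Syy - lam * th) - (Sxy - mu)**2
rhs = Sxixi * beta**2 * (-(2 / beta) * E3 + E2 / beta**2 + E1) + Rn
assert abs(lhs - rhs) < 1e-8
```

Both sides agree to floating-point precision for every sample, confirming the term-by-term cancellation behind (2.2.42).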

From the SLLN (under (A)), as $n \to \infty$,
$$-\frac{2}{\beta}(S_{\delta\varepsilon} - \mu) + \frac{1}{\beta^2}(S_{\delta\delta} - \lambda\theta) + (S_{\varepsilon\varepsilon} - \theta) \overset{a.s.}{=} o(1), \qquad (2.2.44)$$

while due to (2.2.33) and (2.2.37) respectively, their versions for $S_{\xi\varepsilon}/S_{\xi\xi}$, the SLLN (under (A)) and (E), as $n \to \infty$, result in

$$\frac{R_n}{S_{\xi\xi}} = o_P(1) \ \text{ or } \overset{a.s.}{=} o(1). \qquad (2.2.45)$$

Under the respective appropriate conditions, combining (2.2.41)-(2.2.45) and the last stated convergence results in (2.2.29) and (2.2.34) respectively, we obtain weak and strong consistency of $\hat\theta_{2n}$.

For $\widehat{\lambda\theta}_{3n}$ of (2.1.32), from (2.2.42)-(2.2.45) and the first stated convergence results in (2.2.29) and (2.2.34), we arrive at



$$\widehat{\lambda\theta}_{3n} - \lambda\theta = \frac{S_{\xi\xi}\big(-2\beta(S_{\delta\varepsilon} - \mu) + (S_{\delta\delta} - \lambda\theta) + \beta^2(S_{\varepsilon\varepsilon} - \theta)\big) + R_n}{S_{xx} - \theta} = o_P(1) \ \text{ or } \overset{a.s.}{=} o(1). \qquad (2.2.46) \quad \Box$$

Proof of Theorem 2.1.1b. For part (a), under the respective conditions, by (2.2.29), respectively (2.2.34), and weak, respectively strong, consistency of $\hat\beta_{1n}$, we arrive at, as $n \to \infty$,
$$(S_{yy} - \lambda\theta) - 2S_{xy}\hat\beta_{1n} + (S_{xx} - \theta)\hat\beta_{1n}^2 = o_P(1) \ \text{ or } \overset{a.s.}{=} o(1)$$

and
$$\lambda + \hat\beta_{1n}^2 \to \lambda + \beta^2 \ \text{ in } P \text{ or a.s.}, \qquad (2.2.47)$$
and hence, also by (2.1.40), for $\hat\theta_{1n} - \theta = (\hat\theta_{1n} - \theta)_{[n]}$, with $(\hat\theta_{1n} - \theta)_{[nt]}$ as in (2.1.19), we conclude
$$\hat\theta_{1n} - \theta = o_P(1) \ \text{ or } \overset{a.s.}{=} o(1). \qquad (2.2.48)$$

Assuming that (C)-(F) are satisfied, but not necessarily assuming (2.1.40), we are to prove part (b), i.e., (2.1.41). One rewrites $\hat\theta_{1n} - \theta$ as

$$\hat\theta_{1n} - \theta = \frac{n}{n-2}\cdot\frac{(S_{yy} - \lambda\theta) - 2S_{xy}\beta + (S_{xx} - \theta)\beta^2}{\lambda + \hat\beta_{1n}^2} + \frac{n}{n-2}\cdot\frac{(S_{xx} - \theta)(\hat\beta_{1n}^2 - \beta^2) - 2S_{xy}(\hat\beta_{1n} - \beta)}{\lambda + \hat\beta_{1n}^2} + \cdots \qquad (2.2.49)$$
$$=: V_1 + V_2 + V_3, \qquad (2.2.50)$$

where $V_1$, $V_2$ and $V_3$ stand, respectively, for the first, second and third summands in (2.2.49). Clearly, by the SLLN and consistency of $\hat\beta_{1n}$, as $n \to \infty$,
$$V_1 = \frac{n}{n-2}\cdot\frac{(S_{yy} - \lambda\theta) - 2S_{xy}\beta + (S_{xx} - \theta)\beta^2}{\lambda + \hat\beta_{1n}^2} \overset{a.s.}{=} o(1) \quad \text{and} \quad V_3 = o(1). \qquad (2.2.51)$$
Hence, to conclude consistency of $\hat\theta_{1n}$, it suffices to show that, as $n \to \infty$,

$$V_2 = o_P(1). \qquad (2.2.52)$$


From (2.1.52) of Remark 2.1.7 in Section 2.1.4 ((C)-(F) are assumed), we have

$$\hat\beta_{1n} - \beta = \frac{O_P(1)}{\sqrt{n}\,S_{\xi\xi}}, \quad n \to \infty. \qquad (2.2.53)$$

From (2.2.53), (2.2.33), the version of (2.2.33) for $S_{\xi\varepsilon}$, consistency of $\hat\beta_{1n}$ and (E), as $n \to \infty$,

$$(S_{xx} - \theta)(\hat\beta_{1n}^2 - \beta^2) - 2S_{xy}(\hat\beta_{1n} - \beta) = S_{\xi\xi}(\hat\beta_{1n} - \beta)^2 + 2S_{\xi\varepsilon}(\hat\beta_{1n} - \beta)\hat\beta_{1n} + (S_{\varepsilon\varepsilon} - \theta)(\hat\beta_{1n} - \beta)(\hat\beta_{1n} + \beta)$$
$$- 2S_{\xi\delta}(\hat\beta_{1n} - \beta) - 2S_{\delta\varepsilon}(\hat\beta_{1n} - \beta) = o_P(1), \qquad (2.2.54)$$

which yields (2.2.52). $\Box$

Remark 2.2.3. For further asymptotic studies, in the proof of Theorem 2.1.2 it suffices to replace $(\hat\beta_{1n} - \beta)_{K_{1n}(t)}$ of (2.1.11) with the process

$$(\tilde\beta_{1n} - \beta)_{K_{1n}(t)} = \mathrm{sign}(\beta)\sqrt{\big((\hat z_n - z)_{K_{1n}(t)} + z\big)^2 + \lambda}\; - (\hat z_n - z)_{K_{1n}(t)} - z - \beta, \qquad (2.2.55)$$

where $(\hat z_n - z)_{K_{1n}(t)}$ and $z$ are as in (2.1.12) and $0 \le t \le 1$. Indeed, in view of Theorem 2.1.2a, we have for each $t \in [0, 1]$,

$$\frac{\sqrt{n}\,U(1,n)\,(\hat\beta_{1n} - \beta)_{K_{1n}(t)}}{\big(\sum_{i=1}^{n}(u_i(1,n) - \bar u(1,n))^2/(n-1)\big)^{1/2}} = \frac{\sqrt{n}\,U(1,n)\,(\tilde\beta_{1n} - \beta)_{K_{1n}(t)}}{\big(\sum_{i=1}^{n}(u_i(1,n) - \bar u(1,n))^2/(n-1)\big)^{1/2}} + \rho_{n,K_{1n}(t)}, \qquad (2.2.56)$$

with $U(1,n)$ of (2.1.46), $u_i(1,n)$ of (2.1.14) and $\big(\sum_{i=1}^{n}(u_i(1,n) - \bar u(1,n))^2\big)^{1/2}$ being well-defined WPA1, $n \to \infty$, due to forthcoming Remark 2.2.7. Moreover, due to (2.2.40),

$$\sup_{0 \le t \le 1}\big|\rho_{n,K_{1n}(t)}\big| = o_P(1), \qquad (2.2.57)$$


since for any $\varepsilon > 0$, as $n \to \infty$,
$$P\Big(\sup_{0\le t\le 1}\big|\rho_{n,K_{1n}(t)}\big| > \varepsilon\Big) \le P\big(\mathrm{sign}(S_{xy}) \neq \mathrm{sign}(\beta)\big)$$
$$\le P\big(|\mathrm{sign}(S_{xy}) - \mathrm{sign}(\beta)| \ge 2\big) \to 0.$$

Similarly, in the proof of Theorem 2.1.2b, we will study

$$(\tilde\alpha_{1n} - \alpha)_{L_{1n}(t)} = -\bar x\,(\tilde\beta_{1n} - \beta)_{L_{1n}(t)} + (\bar y - \bar x\beta - \alpha)_{L_{1n}(t)}, \quad 0 \le t \le 1, \qquad (2.2.58)$$

in place of $(\hat\alpha_{1n} - \alpha)_{L_{1n}(t)}$ of (2.1.15), where $(\tilde\beta_{1n} - \beta)_{L_{1n}(t)}$ is as in (2.2.55), with $L_{1n}(t)$ of (2.1.16) in place of $K_{1n}(t)$ of (2.1.13).

We now introduce the process in (2.2.60), the first of the two key auxiliary processes for the proof of Theorem 2.1.2 of Section 2.1.4 (cf. the introduction to Section 2.2.2), and then summarize invariance principles for (2.2.60) in Lemma 2.2.6. Put the random vectors

$$\zeta_i = \big((\xi_i - cm)\delta_i,\; (\xi_i - cm)\varepsilon_i,\; \delta_i,\; \varepsilon_i,\; \delta_i\varepsilon_i - \mu,\; \delta_i^2 - \lambda\theta,\; \varepsilon_i^2 - \theta\big), \quad i = \overline{1, n}, \qquad (2.2.59)$$

with $c$ as in (2.1.7). Using (2.2.59) and a nonzero vector of constants $b \in \mathbb{R}^7$, we introduce a special case of the Student process of (2.2.7) in $D[0,1]$ as

$$\frac{\sqrt{n}\,\langle\bar\zeta, b\rangle_{K_n(t)}}{\big(\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2/(n-1)\big)^{1/2}}, \qquad (2.2.60)$$

with time function $K_n(t)$,
$$K_n(t) = \sup\Big\{m : \sum_{i=1}^{m}\mathrm{Var}\langle\zeta_i, b\rangle \le t\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle\Big\}, \quad 0 \le t \le 1. \qquad (2.2.61)$$

As argued in Remark 2.2.6, which follows the proof of Lemma 2.2.6, under the conditions of the latter lemma, $\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2 > 0$ WPA1 and thus, (2.2.60) is well-defined WPA1, $n \to \infty$.


Lemma 2.2.6. Let assumptions (C)-(F) be satisfied. If $b^{(1)} = b^{(2)} = 0$, suppose additionally that
$$\mathrm{Var}\big\langle(\delta, \varepsilon, \delta\varepsilon - \mu, \delta^2 - \lambda\theta, \varepsilon^2 - \theta),\; b^{(3,7)}\big\rangle > 0. \qquad (2.2.62)$$

Then, as $n \to \infty$,

(a) $\displaystyle \sqrt{n}\,\langle\bar\zeta, b\rangle_{K_n(t_0)}\Big(\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2/(n-1)\Big)^{-1/2} \xrightarrow{\mathcal{D}} N(0, t_0), \quad t_0 \in (0, 1]$;

(b) $\displaystyle \sqrt{n}\,\langle\bar\zeta, b\rangle_{K_n(t)}\Big(\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2/(n-1)\Big)^{-1/2} \xrightarrow{\mathcal{D}} W(t) \ \text{on} \ (D[0,1], \rho)$;

(c) we can redefine $\{(\delta_i, \varepsilon_i), i \ge 1\}$ on a richer probability space together with a sequence of independent standard normal r.v.'s $\{U_i, i \ge 1\}$ such that
$$\sup_{0\le t\le 1}\left|\frac{\sqrt{n}\,\langle\bar\zeta, b\rangle_{K_n(t)}}{\big(\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2/(n-1)\big)^{1/2}} - \frac{\sum_{i=1}^{K_n(t)}\big(\mathrm{Var}\langle\zeta_i, b\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle\big)^{1/2}}\right| = o_P(1).$$

In view of Lemma 2.2.1, the proof of Lemma 2.2.6, which follows after auxiliary Lemmas 2.2.7 and 2.2.8 that are developed for it, reduces to verifying Lindeberg's condition (2.2.8) for the sequence $\{\langle\zeta_i, b\rangle, i \ge 1\}$. In fact, it is the case $|b^{(1)}| + |b^{(2)}| > 0$ that is to receive special attention via these two forthcoming auxiliary lemmas. This is because when $b^{(1)} = b^{(2)} = 0$, $\{\langle\zeta_i, b\rangle, i \ge 1\}$ are i.i.d. r.v.'s and, due to the assumed (2.2.62), $\mathrm{Var}\langle\zeta_i, b\rangle = \mathrm{Var}\langle(\delta, \varepsilon, \delta\varepsilon - \mu, \delta^2 - \lambda\theta, \varepsilon^2 - \theta), b^{(3,7)}\rangle > 0$, and the Lindeberg condition is clearly valid in this situation. In the following auxiliary Lemma 2.2.7, an arbitrary sequence $\{A_n, n \ge 1\}$ of positive semidefinite matrices is considered. The result amounts to a criterion for the minimum and maximum eigenvalues of $A_n$ to be bounded away from zero and infinity, respectively, uniformly in $n$.

Lemma 2.2.7. Let $\{A_n, n \ge 1\}$ be a sequence of $k \times k$ positive semidefinite matrices. Then,
$$0 < \liminf_{n\to\infty}\lambda_{\min}(A_n) \le \limsup_{n\to\infty}\lambda_{\max}(A_n) < \infty \qquad (2.2.63)$$
if and only if
$$\liminf_{n\to\infty}\det(A_n) > 0 \qquad (2.2.64)$$


and
$$\limsup_{n\to\infty}\mathrm{tr}(A_n) < +\infty. \qquad (2.2.65)$$

Proof. Let (2.2.64) and (2.2.65) be satisfied. It is well known that the eigenvalues $\lambda_{\min}(A_n) = \lambda_n^{(1)} \le \lambda_n^{(2)} \le \cdots \le \lambda_n^{(k)} = \lambda_{\max}(A_n)$ of a symmetric matrix $A_n$ are real and related as follows:
$$\sum_{i=1}^{k}\lambda_n^{(i)} = \mathrm{tr}(A_n) \qquad (2.2.66)$$
and
$$\prod_{i=1}^{k}\lambda_n^{(i)} = \det(A_n). \qquad (2.2.67)$$
Moreover, since $A_n \ge 0$,
$$\lambda_n^{(i)} \ge 0, \quad 1 \le i \le k, \qquad (2.2.68)$$
and hence, (2.2.65), (2.2.66) and (2.2.68) result in
$$0 \le \limsup_{n\to\infty}\lambda_n^{(i)} \le \limsup_{n\to\infty}\mathrm{tr}(A_n) < \infty, \quad 1 \le i \le k, \qquad (2.2.69)$$
which, in particular, implies the last inequality in (2.2.63). In view of (2.2.68), $\liminf_{n\to\infty}\lambda_{\min}(A_n) \ge 0$. If
$$\liminf_{n\to\infty}\lambda_{\min}(A_n) = 0,$$
then, by (2.2.67) and (2.2.69),
$$\liminf_{n\to\infty}\det(A_n) \le \liminf_{n\to\infty}\lambda_{\min}(A_n)\prod_{i=2}^{k}\limsup_{n\to\infty}\lambda_n^{(i)} \le \liminf_{n\to\infty}\lambda_{\min}(A_n)\Big[\limsup_{n\to\infty}\mathrm{tr}(A_n)\Big]^{k-1} = 0.$$
This contradicts (2.2.64). Consequently, we have also proved the first inequality in (2.2.63), i.e., $0 < \liminf_{n\to\infty}\lambda_{\min}(A_n)$.

The middle inequality in (2.2.63) is trivial.


On assuming (2.2.63), due to (2.2.66)-(2.2.68),
$$\liminf_{n\to\infty}\det(A_n) \ge \liminf_{n\to\infty}\big(\lambda_{\min}(A_n)\big)^{k} \ge \Big[\liminf_{n\to\infty}\lambda_{\min}(A_n)\Big]^{k} > 0,$$
$$\limsup_{n\to\infty}\mathrm{tr}(A_n) \le \sum_{i=1}^{k}\limsup_{n\to\infty}\lambda_n^{(i)} \le k\limsup_{n\to\infty}\lambda_{\max}(A_n) < +\infty.$$

Thus, (2.2.64) and (2.2.65) are verified. □

Remark 2.2.4. Lemma 2.2.7 read for a single matrix $A$ (take $A_n = A$, $n \ge 1$) amounts to a criterion for a positive semidefinite matrix to be strictly positive definite. In order to appreciate the beauty and simplicity of conditions (2.2.64) and (2.2.65), one may like to compare them to the assumption that all the leading minor determinants of $A$ are positive, a well-known necessary and sufficient condition for $A > 0$. In the same spirit, we say that for $A_n \ge 0$, $n \ge 1$, conditions (2.2.64) and (2.2.65) are necessary and sufficient for $A_n$ to be positive definite uniformly in $n$, in the sense of (2.2.63). For the proof of inequality (2.2.80) of Lemma 2.2.8, a key auxiliary result for the proof of Lemma 2.2.6, (2.2.64) and (2.2.65) will become especially handy, as it will be possible to verify them.
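The single-matrix reading of Lemma 2.2.7 can be sketched numerically: for a $k \times k$ positive semidefinite $A$, relations (2.2.66)-(2.2.67) give $\lambda_{\min}(A) \ge \det(A)/\mathrm{tr}(A)^{k-1}$, so a positive determinant together with a finite trace keeps the smallest eigenvalue away from zero. A Python sketch (the random test matrix is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(3)
k = 4
B = rng.standard_normal((k, k))
A = B @ B.T                                   # random positive semidefinite matrix
eig = np.linalg.eigvalsh(A)

assert abs(eig.sum() - np.trace(A)) < 1e-8            # (2.2.66): sum = trace
assert abs(eig.prod() - np.linalg.det(A)) < 1e-6      # (2.2.67): product = det
# det = lambda_min * (other eigenvalues) <= lambda_min * tr^(k-1):
assert eig.min() >= np.linalg.det(A) / np.trace(A) ** (k - 1) - 1e-12
```

The same bound applied along a sequence $\{A_n\}$ is exactly how (2.2.64) and (2.2.65) pin $\liminf \lambda_{\min}(A_n)$ away from $0$.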

Proceeding with auxiliary developments on our way to verifying Lindeberg's condition (2.2.8) for $\{\langle\zeta_i, b\rangle, i \ge 1\}$, with $\zeta_i$ of (2.2.59) and $b \in \mathbb{R}^7$ such that $|b^{(1)}| + |b^{(2)}| > 0$, we consider a convenient representation for $\langle\zeta_i, b\rangle$, namely
$$\langle\zeta_i, b\rangle = \langle\gamma_{in}, b_n\rangle, \qquad (2.2.70)$$

where for $i = \overline{1, n}$,
$$\gamma_{in} = \left(\frac{(\xi_i - cm)\delta_i}{\big(\sum_{j=1}^{n}(\xi_j - cm)^2\big)^{1/2}},\; \frac{(\xi_i - cm)\varepsilon_i}{\big(\sum_{j=1}^{n}(\xi_j - cm)^2\big)^{1/2}},\; \frac{\delta_i}{\sqrt{n}},\; \frac{\varepsilon_i}{\sqrt{n}},\; \frac{\delta_i\varepsilon_i - \mu}{\sqrt{n}},\; \frac{\delta_i^2 - \lambda\theta}{\sqrt{n}},\; \frac{\varepsilon_i^2 - \theta}{\sqrt{n}}\right), \qquad (2.2.71)$$
with $c$ as in (2.1.7), and
$$b_n = \left(\Big(\sum_{j=1}^{n}(\xi_j - cm)^2\Big)^{1/2} b^{(1)},\; \Big(\sum_{j=1}^{n}(\xi_j - cm)^2\Big)^{1/2} b^{(2)},\; \sqrt{n}\,b^{(3)},\; \sqrt{n}\,b^{(4)},\; \sqrt{n}\,b^{(5)},\; \sqrt{n}\,b^{(6)},\; \sqrt{n}\,b^{(7)}\right). \qquad (2.2.72)$$


The vector $\gamma_{in}$ is well-defined for $n$ sufficiently large by (E) and (2.2.101). The representation in (2.2.70) will allow us to reduce Lindeberg's condition for the sequence $\{\langle\zeta_i, b\rangle, i \ge 1\}$ in the case $|b^{(1)}| + |b^{(2)}| > 0$ to the one for the double array $\{\langle\gamma_{in}, b_n\rangle, 1 \le i \le n, n \ge 1\}$ with $|b_n^{(1)}| + |b_n^{(2)}| > 0$ (that for large $n$, $|b^{(1)}| + |b^{(2)}| > 0$ if and only if $|b_n^{(1)}| + |b_n^{(2)}| > 0$ is due to (E) and (2.2.101)). In fact, this will be seen to be possible due to the componentwise prenormalized form of $\gamma_{in}$ and hence, also, due to certain desirable properties of the covariance matrix of
$$\gamma_n = \sum_{i=1}^{n}\gamma_{in}. \qquad (2.2.73)$$

By (C), $E\gamma_{in} = 0$ and $\mathrm{Cov}\,\gamma_n$ exists and has the following form:
$$A_n := \mathrm{Cov}\,\gamma_n = \begin{pmatrix}
\lambda\theta & \mu & d_n\lambda\theta & d_n\mu & d_n m_{21} & d_n m_{30} & d_n m_{12}\\
\mu & \theta & d_n\mu & d_n\theta & d_n m_{12} & d_n m_{21} & d_n m_{03}\\
d_n\lambda\theta & d_n\mu & \lambda\theta & \mu & m_{21} & m_{30} & m_{12}\\
d_n\mu & d_n\theta & \mu & \theta & m_{12} & m_{21} & m_{03}\\
d_n m_{21} & d_n m_{12} & m_{21} & m_{12} & m_{22}-\mu^2 & m_{31}-\lambda\theta\mu & m_{13}-\theta\mu\\
d_n m_{30} & d_n m_{21} & m_{30} & m_{21} & m_{31}-\lambda\theta\mu & m_{40}-(\lambda\theta)^2 & m_{22}-\lambda\theta^2\\
d_n m_{12} & d_n m_{03} & m_{12} & m_{03} & m_{13}-\theta\mu & m_{22}-\lambda\theta^2 & m_{04}-\theta^2
\end{pmatrix}, \qquad (2.2.74)$$
where
$$d_n = \frac{\bar\xi - cm}{\big(\overline{(\xi - cm)^2}\big)^{1/2}} \qquad (2.2.75)$$
and
$$m_{ij} = E\big(\delta^i\varepsilon^j\big), \quad i, j = 0, 1, 2, 3, 4. \qquad (2.2.76)$$

Among other useful properties of $A_n = \mathrm{Cov}\,\gamma_n$ in (2.2.74), the most crucial one is that either
$$0 < \liminf_{n\to\infty}\lambda_{\min}(A_n) \le \limsup_{n\to\infty}\lambda_{\max}(A_n) < \infty, \qquad (2.2.77)$$
or there is a subvector $\tilde\gamma_n$ of $\gamma_n$ such that this property holds for the submatrix $\mathrm{Cov}\,\tilde\gamma_n$ of $A_n$. More rigorously, this result reads as Lemma 2.2.8 below.


Lemma 2.2.8. Let assumptions (C)-(E) be satisfied. Then, there exist random vectors $\{\tilde\gamma_n, n \ge 1\}$ and nonrandom vectors $\{\tilde b_n, n \ge 1\}$ such that for $\gamma_n$ of (2.2.73) and $b_n \in \mathbb{R}^7$ with $|b_n^{(1)}| + |b_n^{(2)}| > 0$, we have
$$\langle\tilde\gamma_n, \tilde b_n\rangle \overset{a.s.}{=} \langle\gamma_n, b_n\rangle, \qquad (2.2.78)$$
and
$$0 < \phi\,\|b_n\|^2 \le \|\tilde b_n\|^2, \quad \text{with some } \phi \in (0, 1]. \qquad (2.2.79)$$
Also,
$$0 < \psi \le \liminf_{n\to\infty}\lambda_{\min}(\tilde A_n) \le \limsup_{n\to\infty}\lambda_{\max}(\tilde A_n) < \infty, \quad \text{where } \tilde A_n = \mathrm{Cov}\,\tilde\gamma_n. \qquad (2.2.80)$$

Proof. In the proof we distinguish three cases: (2.2.81) is satisfied; (2.2.81) is not valid, while (2.2.85) holds true; and both (2.2.81) and (2.2.85) fail. In each of these situations, appropriate sequences $\{\tilde\gamma_n, n \ge 1\}$ and $\{\tilde b_n, n \ge 1\}$ are found that obey (2.2.78)-(2.2.80) with certain absolute constants $\phi'$ and $\psi'$. Naturally, $\phi$ and $\psi$ in (2.2.79) and (2.2.80) that are universal for all three cases are taken as the minimums of the respective $\phi'$ and $\psi'$. First, we assume that

the vector $\gamma' := \big(\delta,\; \varepsilon,\; \delta\varepsilon - \mu,\; \delta^2 - \lambda\theta,\; \varepsilon^2 - \theta\big)$ is full. \qquad (2.2.81)

Choosing
$$\tilde\gamma_n = \gamma_n \quad \text{and} \quad \tilde b_n = b_n,$$
we only need to verify (2.2.80), since (2.2.78) and (2.2.79) with $\phi' = 1$ are trivial. By (C),
$$\limsup_{n\to\infty}\mathrm{tr}(A_n) < +\infty, \qquad (2.2.82)$$
where $A_n$ is from (2.2.74), while using the computer software Maple (cf. the Section 2.2.3 Appendix for the code details), we factorize $\det(A_n)$ (cf. the first equality in (2.2.83))


as follows:
$$\det(A_n) = (1 - d_n^2)^2\det(\Gamma)\det\big(\mathrm{Cov}\,\gamma_n^{(3,7)}\big) = (1 - d_n^2)^2\det(\Gamma)\det(\mathrm{Cov}\,\gamma'), \qquad (2.2.83)$$

where $\gamma'$ is as in (2.2.81). On account of part (b) of Lemma 2.2.5, positivity of $\Gamma$ of (2.1.3) (cf. (C)) and the assumed (2.2.81) implying that $\mathrm{Cov}\,\gamma' > 0$, we conclude
$$\liminf_{n\to\infty}\det(A_n) > 0. \qquad (2.2.84)$$

Via Lemma 2.2.7, (2.2.82) and (2.2.84) result in (2.2.80) with $\psi' = \liminf_{n\to\infty}\lambda_{\min}(A_n) > 0$. In fact, using this lemma, one can also conclude that the assumed (2.2.81) is necessary and sufficient for (2.2.84) and hence, for (2.2.80) with $\tilde A_n = A_n$. Suppose now that (2.2.81) fails and

$$\mathrm{Var}\langle\gamma', b^{(3,7)}\rangle = 0, \qquad (2.2.85)$$

with $\gamma'$ of (2.2.81) and subvector $b^{(3,7)}$ of $b$ in the representation (2.2.70)-(2.2.72). As seen from the previous paragraph, $\tilde\gamma_n = \gamma_n$ and $\tilde b_n = b_n$ are no longer a suitable choice, since when (2.2.81) is not valid, (2.2.80) with $\tilde A_n = A_n$ ceases to hold.

By noting that $\mathrm{Var}\langle\gamma_n^{(3,7)}, b_n^{(3,7)}\rangle = n\,\mathrm{Var}\langle\gamma', b^{(3,7)}\rangle$, we conclude from (2.2.85) that
$$\langle\gamma_n^{(3,7)}, b_n^{(3,7)}\rangle \overset{a.s.}{=} 0, \qquad (2.2.86)$$

which leads to
$$\langle\gamma_n, b_n\rangle \overset{a.s.}{=} \langle\gamma_n^{(1,2)}, b_n^{(1,2)}\rangle, \qquad (2.2.87)$$
and the natural choice of $\tilde\gamma_n$ and $\tilde b_n$ here is
$$\tilde\gamma_n = \gamma_n^{(1,2)} \quad \text{and} \quad \tilde b_n = b_n^{(1,2)}.$$

While (2.2.78) follows from (2.2.87), and (2.2.80) with $\tilde A_n = \Gamma$ of (2.1.3) and $\psi' = \lambda_{\min}(\Gamma) > 0$ is trivial (cf. (C)), (2.2.79) has to be shown. We have
$$\|b_n\|^2 = \sum_{i=1}^{n}(\xi_i - cm)^2\,\|b^{(1,2)}\|^2 + n\,\|b^{(3,7)}\|^2, \qquad \|\tilde b_n\|^2 = \|b_n^{(1,2)}\|^2 = \sum_{i=1}^{n}(\xi_i - cm)^2\,\|b^{(1,2)}\|^2, \qquad (2.2.88)$$


with $b^{(1,2)}$ and $b^{(3,7)}$ subvectors of the vector of constants $b \in \mathbb{R}^7$ as in (2.2.72), such that $\|b^{(1,2)}\| > 0$ (due to (E) and (2.2.101), for large $n$ the latter inequality amounts to the assumed $|b_n^{(1)}| + |b_n^{(2)}| > 0$). Clearly, if $\|b^{(3,7)}\| = 0$, then one can put $\phi' = 1$ in (2.2.79). Otherwise, it is not hard to see that (2.2.79) is satisfied with $\phi' = \tilde d(1 + \tilde d)^{-1} \in (0, 1)$, where $\tilde d = \liminf_{n\to\infty}(2n)^{-1}\sum_{i=1}^{n}(\xi_i - cm)^2\,\|b^{(1,2)}\|^2/\|b^{(3,7)}\|^2 > 0$ and, on account of (2.2.101) and (E), $\liminf_{n\to\infty} n^{-1}\sum_{i=1}^{n}(\xi_i - cm)^2 > 0$. Indeed, for large $n$, $\phi'(1 - \phi')^{-1} = \tilde d \le n^{-1}\sum_{i=1}^{n}(\xi_i - cm)^2\,\|b^{(1,2)}\|^2/\|b^{(3,7)}\|^2$, and hence, (2.2.79) with such $\phi'$ holds true. Suppose now that both (2.2.81) and (2.2.85) fail, in other words,

$$\mathrm{Var}\langle\gamma', b^{(3,7)}\rangle > 0, \qquad (2.2.89)$$

and $\gamma'$ of (2.2.81) is not full, i.e., there exists $e \in \mathbb{R}^5$, $e \neq 0$, $\|e\| = 1$, such that
$$\langle\gamma', e\rangle \overset{a.s.}{=} 0. \qquad (2.2.90)$$

To define appropriate $\tilde\gamma_n$ and $\tilde b_n$, we use the following reduction procedure. For $n \ge 1$, we modify the "tails" $\gamma_n^{(3,7)}$ of $\gamma_n$ into $\tilde\gamma_n^{(3,7)}$ and construct vectors of constants $\tilde b_n^{(3,7)}$ such that
$$\langle\gamma_n^{(3,7)}, b_n^{(3,7)}\rangle \overset{a.s.}{=} \langle\tilde\gamma_n^{(3,7)}, \tilde b_n^{(3,7)}\rangle, \qquad (2.2.91)$$

where, due to (2.2.89) and the fact that $\mathrm{Var}\langle\gamma_n^{(3,7)}, b_n^{(3,7)}\rangle = n\,\mathrm{Var}\langle\gamma', b^{(3,7)}\rangle$ for all $n$, the left-hand side of (2.2.91) is nondegenerate.

At the first step of our reduction procedure, we cut $\gamma'$ by one of its components that has a nonzero coefficient in (2.2.90), eliminate the component with the same position from $\gamma_n^{(3,7)}$, and denote the reduced vectors, with the original order of the remaining components, by $\gamma'$ and $\gamma_n^{(3,7)}$ respectively. Then, using
$$\langle\gamma_n^{(3,7)}, e\rangle \overset{a.s.}{=} 0, \qquad (2.2.92)$$


which holds for all $n$ with $e$ of (2.2.90) due to $\mathrm{Cov}\,\gamma' = \mathrm{Cov}\,\gamma_n^{(3,7)}$, the cut component of $\gamma_n^{(3,7)}$ also gets eliminated from $\langle\gamma_n^{(3,7)}, b_n^{(3,7)}\rangle$ and, modifying the coefficients of the remaining components in this expression, one gets (2.2.91). If $\gamma'$ is full, then the process of constructing $\tilde\gamma_n^{(3,7)}$ and $\tilde b_n^{(3,7)}$ is over. Otherwise, replacing (2.2.90) and (2.2.92) with $\langle\gamma', e\rangle \overset{a.s.}{=} 0$ and $\langle\gamma_n^{(3,7)}, e\rangle \overset{a.s.}{=} 0$ ($\mathrm{Cov}\,\gamma' = \mathrm{Cov}\,\gamma_n^{(3,7)}$), where $e$ is some nonrandom nonzero vector, and the expression $\langle\gamma_n^{(3,7)}, b_n^{(3,7)}\rangle$ with $\langle\tilde\gamma_n^{(3,7)}, \tilde b_n^{(3,7)}\rangle$, we keep applying the described reduction algorithm until $\gamma'$ is full, each time keeping the same notations for the newly obtained vectors. Clearly, on account of the positivity of the error covariance matrix (cf. (C)), $\gamma'^{(1,2)} = (\delta, \varepsilon)$ is full, and one can always produce $\gamma'$ out of the original $\gamma'$ by preserving these two components. Thus, $\sqrt{n}\,\bar\delta$ and $\sqrt{n}\,\bar\varepsilon$ are always contained in the vector $\tilde\gamma_n^{(3,7)}$. Hence, our construction procedure has at most three steps, and the dimension of the resulting vectors $\tilde\gamma_n^{(3,7)}$ and $\tilde b_n^{(3,7)}$ varies from 2 to 4. The vectors $\tilde\gamma_n^{(3,7)}$ and $\tilde b_n^{(3,7)}$ that satisfy (2.2.91) and are obtained at the last step of the construction define $\tilde\gamma_n$ and $\tilde b_n$ of interest as follows:

$$\tilde\gamma_n = \big(\gamma_n^{(1,2)},\, \tilde\gamma_n^{(3,7)}\big) \qquad (2.2.93)$$
and
$$\tilde b_n = \big(b_n^{(1,2)},\, \tilde b_n^{(3,7)}\big). \qquad (2.2.94)$$
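One step of this reduction can be sketched in Python: if $\langle\gamma, e\rangle = 0$ a.s. with $e_k \neq 0$, then component $k$ can be cut and the coefficients modified as $b'_j = b_j - b_k e_j/e_k$, $j \neq k$, preserving $\langle\gamma, b\rangle$. The degenerate three-dimensional Gaussian vector below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000
g1, g2 = rng.standard_normal(n), rng.standard_normal(n)
g3 = g1 + 2 * g2                       # degenerate: <gamma, (1, 2, -1)> = 0 a.s.
gamma = np.column_stack([g1, g2, g3])
e = np.array([1.0, 2.0, -1.0])         # null-space vector of Cov(gamma)
b = np.array([0.5, -1.0, 3.0])
k = 2                                  # cut the third component (e[k] != 0)
b_red = np.delete(b - b[k] * e / e[k], k)
gamma_red = np.delete(gamma, k, axis=1)
assert np.allclose(gamma @ b, gamma_red @ b_red)
```

The linear form is preserved exactly, which is the content of (2.2.91) at each reduction step.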

Finally, when both (2.2.81) and (2.2.85) fail, we are to show that $\tilde\gamma_n$ and $\tilde b_n$ of (2.2.93) and (2.2.94) satisfy (2.2.78)-(2.2.80). Clearly, (2.2.78) follows from (2.2.91). Concerning (2.2.80), we first establish the factorization
$$\det(\tilde A_n) = (1 - d_n^2)^2\det(\Gamma)\det\big(\mathrm{Cov}\,\tilde\gamma^{(3,7)}\big). \qquad (2.2.95)$$

The crucial result in (2.2.95) is obtained in Maple by verifying the identity (2.2.95) for all possible $\tilde A_n = \mathrm{Cov}\,\tilde\gamma_n$ that correspond to $\tilde\gamma_n$ of (2.2.93), with $\tilde\gamma_n^{(3,7)}$ having the first two components $\sqrt{n}\,\bar\delta$ and $\sqrt{n}\,\bar\varepsilon$, and the rest of the components (at most two), if any, composing a subvector of the vector $\big(\sqrt{n}(\overline{\delta\varepsilon} - \mu),\; \sqrt{n}(\overline{\delta^2} - \lambda\theta),\; \sqrt{n}(\overline{\varepsilon^2} - \theta)\big)$, with


the same order of components as in $\gamma_n$. In other words, (2.2.95) is checked for all 7 possible matrices $\tilde A_n$ defined by varying the lower right block $\mathrm{Cov}\,\tilde\gamma_n^{(3,7)} = \mathrm{Cov}\,\tilde\gamma'$ of $A_n$ in (2.2.74), with $\tilde\gamma'$ of the previous paragraph. For details of the computer code corresponding to (2.2.95), we refer to the Section 2.2.3 Appendix. Having (2.2.95) allows us to argue (2.2.80) as follows. By (C),
$$\limsup_{n\to\infty}\mathrm{tr}(\tilde A_n) \le \limsup_{n\to\infty}\mathrm{tr}(A_n) < +\infty,$$

which, combined with
$$\liminf_{n\to\infty}\det(\tilde A_n) \ge \liminf_{n\to\infty}(1 - d_n^2)^2\det(\Gamma)\det\big(\mathrm{Cov}\,\tilde\gamma'\big),$$

via part (b) of Lemma 2.2.5, positivity of $\Gamma$ of (2.1.3) (cf. (C)), fullness of $\tilde\gamma'$ (cf. the previous paragraph) and Lemma 2.2.7, leads to (2.2.80) with $\psi' = \liminf_{n\to\infty}\lambda_{\min}(\tilde A_n) > 0$. As to (2.2.79),
$$\|b_n\|^2 = \sum_{i=1}^{n}(\xi_i - cm)^2\,\|b^{(1,2)}\|^2 + n\,\|b^{(3,7)}\|^2, \quad \text{while} \quad \|\tilde b_n\|^2 = \sum_{i=1}^{n}(\xi_i - cm)^2\,\|b^{(1,2)}\|^2 + n\,\|\tilde b^{(3,7)}\|^2, \qquad (2.2.96)$$
where $\tilde b^{(3,7)} = \tilde b_n^{(3,7)}/\sqrt{n}$ and, according to the construction of $\tilde\gamma_n^{(3,7)}$, $\tilde b^{(3,7)}$ is a vector of constants. If $\|b^{(3,7)}\|^2 \le \|\tilde b^{(3,7)}\|^2$, then (2.2.79) is satisfied with $\phi' = 1$. If $\|b^{(3,7)}\|^2 > \|\tilde b^{(3,7)}\|^2$, then, clearly, $\|b^{(3,7)}\|^2 \neq 0$ and $\phi' = \|\tilde b^{(3,7)}\|^2/\|b^{(3,7)}\|^2 < 1$ in (2.2.79) for all large $n$. $\Box$

Remark 2.2.5. From the course of the proof of Lemma 2.2.8 we have seen that (2.2.77) is satisfied whenever (2.2.81) is valid. The latter condition can simply fail if, e.g., $\varepsilon^2 - \theta \overset{a.s.}{=} 0$, i.e., if $\mathrm{Var}(\varepsilon^2 - \theta) = 0$. In such a case, a subvector $\tilde\gamma_n$ of $\gamma_n$ and a nonrandom vector $\tilde b_n$ obeying (2.2.78)-(2.2.80) are found. To show (2.2.80) under (2.2.81), and to verify (2.2.80) when (2.2.81) is not valid but (2.2.89) is satisfied, the key factorizations in (2.2.83) and (2.2.95) are employed. These matrix properties


are somewhat suggested by the form of $A_n$ in (2.2.74) (e.g., by the upper left $4 \times 4$ submatrix of $A_n$ and by $A_n$ itself, with $d_n$ of (2.2.75) close to zero, as in the $\alpha \neq 0$ case). Once guessed, (2.2.83) and (2.2.95) have to be verified. This task might appear hopeless at first sight: the dimensionality of the matrix $\tilde A_n$ of (2.2.80), which varies from $4 \times 4$ to $7 \times 7$, and its complex, fully filled form are quite discouraging for manual handling. Crucially, the computer software Maple helps to overcome these difficulties and to conclude (2.2.83) and (2.2.95).
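The $(1 - d_n^2)^2$ pattern in (2.2.83) can be illustrated on a toy analog: a block matrix with the Kronecker structure $\begin{pmatrix} A & dA \\ dA & A \end{pmatrix}$ has determinant $(1 - d^2)^k\det(A)^2$ for $k \times k$ blocks. A sympy sketch for $k = 2$ (an illustrative analog only; the actual $7 \times 7$ matrix $A_n$ was factorized in Maple, per the thesis):

```python
import sympy as sp

d = sp.symbols('d')
a11, a12, a22 = sp.symbols('a11 a12 a22')
A = sp.Matrix([[a11, a12], [a12, a22]])              # symmetric 2x2 block
M = sp.Matrix(sp.BlockMatrix([[A, d * A], [d * A, A]]))  # structure P (x) A, P = [[1, d], [d, 1]]
lhs = M.det()
rhs = sp.expand((1 - d**2)**2 * A.det()**2)          # det(P)^2 * det(A)^2
assert sp.simplify(lhs - rhs) == 0
```

This is the Kronecker-product determinant rule $\det(P \otimes A) = \det(P)^k\det(A)^2$; the factor $\det(P)^k = (1 - d^2)^k$ is the toy counterpart of the $(1 - d_n^2)^2$ factor above.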

Proof of Lemma 2.2.6. First, we argue that the sequence $\{\langle\zeta_i, b\rangle, i \ge 1\}$ satisfies (2.2.4). That $\{\langle\zeta_i, b\rangle, i \ge 1\}$ are independent r.v.'s with $E\langle\zeta_i, b\rangle = 0$ and $\mathrm{Var}\langle\zeta_i, b\rangle < \infty$, $i \ge 1$, follows from (C). If $b^{(1)} = b^{(2)} = 0$, then

$$\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle > 0 \quad \text{for } n \ge 1 \text{ sufficiently large}, \qquad (2.2.97)$$

since $\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle = n\,\mathrm{Var}\langle\gamma', b^{(3,7)}\rangle$, with $\gamma'$ of (2.2.81), and (2.2.62) is assumed to be valid. Via (E) and (2.2.101) (argued independently of the developments of the current proof and, in particular, of (2.2.98)), for large $n$, $|b^{(1)}| + |b^{(2)}| > 0$ is equivalent to $|b_n^{(1)}| + |b_n^{(2)}| > 0$, as assumed in Lemma 2.2.8. Therefore, when $|b^{(1)}| + |b^{(2)}| > 0$, (2.2.97) is automatically satisfied. Indeed, using (2.2.70), (2.2.73), (2.2.78)-(2.2.80), (2.2.101) and (E), for large $n$, we have

$$\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle = \sum_{i=1}^{n}\mathrm{Var}\langle\gamma_{in}, b_n\rangle = \mathrm{Var}\langle\gamma_n, b_n\rangle = \mathrm{Var}\langle\tilde\gamma_n, \tilde b_n\rangle \ge \lambda_{\min}(\mathrm{Cov}\,\tilde\gamma_n)\,\|\tilde b_n\|^2$$
$$\ge \frac{\psi}{2}\,\phi\,\|b_n\|^2 \ge \frac{\psi\phi}{2}\sum_{i=1}^{n}(\xi_i - cm)^2\big((b^{(1)})^2 + (b^{(2)})^2\big)$$
$$\ge \frac{\psi\phi}{4}\,n\,\big((b^{(1)})^2 + (b^{(2)})^2\big)\liminf_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}(\xi_i - cm)^2 = \mathrm{const}\cdot n > 0. \qquad (2.2.98)$$


Now, in view of Lemma 2.2.1, it suffices to check Lindeberg's condition (2.2.8) for $\{\langle\zeta_i, b\rangle, i \ge 1\}$, i.e., as $n \to \infty$, for any $a > 0$,
$$\frac{1}{\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle}\sum_{i=1}^{n}E\Big(\langle\zeta_i, b\rangle^2\,\mathbb{1}\Big\{|\langle\zeta_i, b\rangle| > a\Big(\sum_{j=1}^{n}\mathrm{Var}\langle\zeta_j, b\rangle\Big)^{1/2}\Big\}\Big) \to 0. \qquad (2.2.99)$$
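The Lindeberg ratio in (2.2.99) can be estimated by Monte Carlo for a weighted-error array of the kind treated here. A Python sketch (the bounded weight pattern, the truncation level $a$, and the Gaussian error law are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

def lindeberg_ratio(n, a=0.1, reps=500):
    w = 1.0 + np.sin(np.arange(n))          # bounded nonrandom weights (stand-in for xi_i - cm)
    delta = rng.standard_normal((reps, n))  # i.i.d. mean-zero errors
    terms = w * delta                       # the array w_1 d_1, ..., w_n d_n
    s2 = (w ** 2).sum()                     # sum of Var(w_i delta_i)
    trunc = terms ** 2 * (np.abs(terms) > a * np.sqrt(s2))
    return trunc.mean(axis=0).sum() / s2    # Monte Carlo Lindeberg ratio

print(lindeberg_ratio(50), lindeberg_ratio(2000))
```

The estimated ratio is far smaller at $n = 2000$ than at $n = 50$: with bounded weights no single summand carries a non-negligible share of the total variance, which is exactly what (2.2.99) requires.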

If $b^{(1)} = b^{(2)} = 0$, then $\{\langle\zeta_i, b\rangle, i \ge 1\}$ are i.i.d. r.v.'s with $\mathrm{Var}\langle\zeta_i, b\rangle = \mathrm{Var}\langle\gamma', b^{(3,7)}\rangle$, $i \ge 1$, and hence, due to the assumed (2.2.62), (2.2.99) is clearly satisfied. Consider now the case $|b^{(1)}| + |b^{(2)}| > 0$. From the lines in (2.2.98),

$$\frac{1}{\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle}\sum_{i=1}^{n}E\Big(\langle\zeta_i, b\rangle^2\,\mathbb{1}\Big\{|\langle\zeta_i, b\rangle| > a\Big(\sum_{j=1}^{n}\mathrm{Var}\langle\zeta_j, b\rangle\Big)^{1/2}\Big\}\Big) \le \frac{2}{\psi\phi\,\|b_n\|^2}\sum_{i=1}^{n}E\Big(\|\gamma_{in}\|^2\|b_n\|^2\,\mathbb{1}\big\{\|\gamma_{in}\|^2\|b_n\|^2 > a^2\psi\phi\|b_n\|^2/2\big\}\Big),$$

with $\phi \in (0, 1]$ and $\psi > 0$ of (2.2.79) and (2.2.80). We note that since $|b^{(1)}| + |b^{(2)}| > 0$, via (E) and (2.2.101), $\|b_n\| > 0$ for large $n$. Thus, one only needs to show that

$$\text{for any } l > 0, \quad \sum_{i=1}^{n}E\big(\|\gamma_{in}\|^2\,\mathbb{1}\{\|\gamma_{in}\|^2 > l\}\big) \to 0, \quad n \to \infty. \qquad (2.2.100)$$

On account of (D), (E) and the Cauchy-Schwarz inequality, as $n \to \infty$,
$$\frac{\sum_{i=1}^{n}(\xi_i - cm)^2}{\sum_{i=1}^{n}(\xi_i - c\bar\xi)^2} = \frac{\sum_{i=1}^{n}\big((\xi_i - c\bar\xi)^2 + 2c(\xi_i - c\bar\xi)(\bar\xi - m) + c(\bar\xi - m)^2\big)}{\sum_{i=1}^{n}(\xi_i - c\bar\xi)^2} \to 1. \qquad (2.2.101)$$
Combining (2.2.101), part (a) of Lemma 2.2.5 with $a \in [1/2, 1)$, (D) and (F), one gets, as $n \to \infty$,

$$\max_{1\le i\le n}\frac{(\xi_i - cm)^2}{\sum_{j=1}^{n}(\xi_j - cm)^2} \to 0. \qquad (2.2.102)$$


Clearly,
$$\|\gamma_{in}\|^2 \le \nu_{in} := \max_{1\le j\le n}\frac{(\xi_j - cm)^2}{\sum_{k=1}^{n}(\xi_k - cm)^2}\,\big(\delta_i^2 + \varepsilon_i^2\big) + \frac{1}{n}\Big(\delta_i^2 + \varepsilon_i^2 + (\delta_i\varepsilon_i - \mu)^2 + (\delta_i^2 - \lambda\theta)^2 + (\varepsilon_i^2 - \theta)^2\Big), \qquad (2.2.103)$$
with i.i.d. r.v.'s $\nu_{in}$, $1 \le i \le n$. Markov's inequality, (C) and (2.2.102) imply that for any $l > 0$,
$$\mathbb{1}\{\nu_{1n} > l\} \xrightarrow{P} 0, \quad n \to \infty. \qquad (2.2.104)$$

Finally, by (2.2.103), (2.2.104) and the dominated convergence theorem, for any $l > 0$,
$$\sum_{i=1}^{n}E\big(\|\gamma_{in}\|^2\,\mathbb{1}\{\|\gamma_{in}\|^2 > l\}\big) \le \sum_{i=1}^{n}E\big(\nu_{in}\,\mathbb{1}\{\nu_{in} > l\}\big) = n\,E\big(\nu_{1n}\,\mathbb{1}\{\nu_{1n} > l\}\big)$$
$$= E\left(\mathbb{1}\{\nu_{1n} > l\}\Big(n\max_{1\le j\le n}\frac{(\xi_j - cm)^2}{\sum_{k=1}^{n}(\xi_k - cm)^2}\big(\delta_1^2 + \varepsilon_1^2\big) + \delta_1^2 + \varepsilon_1^2 + (\delta_1\varepsilon_1 - \mu)^2 + (\delta_1^2 - \lambda\theta)^2 + (\varepsilon_1^2 - \theta)^2\Big)\right)$$
$$\le \mathrm{const}\; E\Big(\mathbb{1}\{\nu_{1n} > l\}\big(\delta_1^2 + \varepsilon_1^2 + (\delta_1\varepsilon_1 - \mu)^2 + (\delta_1^2 - \lambda\theta)^2 + (\varepsilon_1^2 - \theta)^2\big)\Big) \to 0, \quad n \to \infty.$$

This leads to (2.2.100), which concludes the proof as well. $\Box$

Remark 2.2.6. The magnitude of the norming factor $\big(\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2/(n-1)\big)^{1/2}$ for $\sqrt{n}\,\langle\bar\zeta, b\rangle_{K_n(1)} = \sqrt{n}\,\langle\bar\zeta, b\rangle$ in the CLT from part (a) satisfies
$$\begin{cases}
\dfrac{1}{n-1}\displaystyle\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2 \xrightarrow{P} \mathrm{const}, & \text{if } b^{(1)} = b^{(2)} = 0 \text{ and/or } \lim_{n\to\infty}\overline{\xi^2} < \infty,\\[2mm]
\dfrac{\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2}{\sum_{i=1}^{n}(\xi_i - c\bar\xi)^2} \xrightarrow{P} \mathrm{const}, & \text{if } |b^{(1)}| + |b^{(2)}| > 0 \text{ and } \lim_{n\to\infty}\overline{\xi^2} = \infty,\\[2mm]
c_1 \le \dfrac{\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2}{n\,S_{\xi\xi}} \le c_2 \ \ \text{WPA1}, & \text{otherwise},
\end{cases} \qquad (2.2.105)$$


with some positive constants $c_1$ and $c_2$. In particular, due to (2.2.105), the process in (2.2.60) is well-defined WPA1, $n \to \infty$. The arguments for (2.2.105) go as follows. On account of (2.2.99) and (2.2.26) of the proof of Lemma 2.2.1,
$$\frac{\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2}{\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle} \xrightarrow{P} 1, \quad n \to \infty. \qquad (2.2.106)$$
In general,

$$\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle = \sum_{i=1}^{n}\Big(\big((b^{(1)})^2\lambda\theta + (b^{(2)})^2\theta\big)(\xi_i - cm)^2 + d_2(\xi_i - cm) + d_3\Big), \qquad (2.2.107)$$
with some constants $d_2$, $d_3 \ge 0$, where $\big((b^{(1)})^2\lambda\theta + (b^{(2)})^2\theta\big) + d_3 > 0$ for large $n$, due to (2.2.97). In particular, by (2.2.101),

$$n^{-1}\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle \to \mathrm{const}, \quad \text{if } b^{(1)} = b^{(2)} = 0 \text{ and/or } \lim_{n\to\infty}\overline{\xi^2} < \infty,$$
$$(n\,S_{\xi\xi})^{-1}\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle \to \mathrm{const}, \quad \text{if } |b^{(1)}| + |b^{(2)}| > 0 \text{ and } \lim_{n\to\infty}\overline{\xi^2} = \infty, \qquad (2.2.108)$$
where the limiting constants in (2.2.108) are positive, since when $b^{(1)} = b^{(2)} = 0$, $d_3 > 0$, while when $|b^{(1)}| + |b^{(2)}| > 0$, we have the lines in (2.2.98). Via (2.2.106) and (2.2.108), we conclude the first and the second parts of (2.2.105). Suppose that neither $\lim_{n\to\infty}\overline{\xi^2} < \infty$ nor $\lim_{n\to\infty}\overline{\xi^2} = \infty$, and that $|b^{(1)}| + |b^{(2)}| > 0$, which for large $n$

via (E) and (2.2.101) is equivalent to $|b_n^{(1)}| + |b_n^{(2)}| > 0$, as assumed in Lemma 2.2.8. Due to (2.2.70), (2.2.73), (2.2.78), (2.2.80), (2.2.101), (E) and the nature of $\tilde b_n$ in (2.2.78) (cf. the proof of Lemma 2.2.8), for large $n$ we have

$$\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle = \sum_{i=1}^{n}\mathrm{Var}\langle\gamma_{in}, b_n\rangle = \mathrm{Var}\langle\gamma_n, b_n\rangle = \mathrm{Var}\langle\tilde\gamma_n, \tilde b_n\rangle \le \lambda_{\max}(\mathrm{Cov}\,\tilde\gamma_n)\,\|\tilde b_n\|^2$$
$$\le 2\limsup_{n\to\infty}\lambda_{\max}(\tilde A_n)\left(\big((b^{(1)})^2 + (b^{(2)})^2\big)\sum_{i=1}^{n}(\xi_i - cm)^2 + \mathrm{const}\cdot n\right) \le c_2\, n\, S_{\xi\xi}, \qquad (2.2.109)$$
with some $c_2 > 0$.

with some % > 0. Combining (2.2.109) with the lines in (2.2.98), we get

cm% < EVar(Ci, b) < c2n % , if \b&\ + \b®\ > 0, (2.2.110) i= l


with some $c_1 > 0$, and hence, via (2.2.106), the estimation in the third part of (2.2.105) holds true. We also note that, due to not necessarily assuming $\lim_{n\to\infty}\overline{\xi^2} < \infty$ or $\lim_{n\to\infty}\overline{\xi^2} = \infty$ in Lemma 2.2.6, verification of Lindeberg's condition (2.2.99) becomes more challenging, since for $\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle$ as in (2.2.107), in general $\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle/(n\,S_{\xi\xi}) \not\to \mathrm{const}$, $n \to \infty$. This observation helps to appreciate the auxiliary Lemmas 2.2.7 and 2.2.8 that lead to the crucial inequality in (2.2.98). By (2.2.106), this also implies that, in general, $\big(\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2/(n-1)\big)^{1/2} \not\to \mathrm{const}$, $n \to \infty$, and hence one cannot then write the CLT in (a) with $t_0 = 1$ of Lemma 2.2.6 in the usual form of

$$g(n)\,\langle\bar\zeta, b\rangle \xrightarrow{\mathcal{D}} N(0, \mathrm{const}), \quad n \to \infty,$$

with $g(n) = \sqrt{n}$ or $g(n) = \sqrt{n/S_{\xi\xi}}$. That is why one needs Studentization, or the scalar normalizer $\big(\sum_{i=1}^{n}\mathrm{Var}\langle\zeta_i, b\rangle/(n-1)\big)^{1/2}$ in place of $\big(\sum_{i=1}^{n}\langle\zeta_i - \bar\zeta, b\rangle^2/(n-1)\big)^{1/2}$, for the CLT in part (a) of Lemma 2.2.6 to hold true.
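The point of Studentization here, that a data-driven norming repairs a CLT which fails under any fixed scalar norming, can be seen in a simple self-normalized simulation (the heterogeneous scale pattern and Gaussian errors are illustrative assumptions, not the thesis's model):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 2000, 2000
sigma = 1.0 + (np.arange(n) % 3)                  # heterogeneous, bounded scales
X = rng.standard_normal((reps, n)) * sigma        # independent mean-zero summands
# self-normalized (Studentized) sum: asymptotically N(0, 1) under Lindeberg's condition
T = X.sum(axis=1) / np.sqrt((X ** 2).sum(axis=1))
print(round(T.mean(), 2), round(T.std(), 2))
```

The empirical mean and standard deviation of $T$ come out near $0$ and $1$: the random denominator automatically tracks the total variance, which is exactly the role of the Studentizing factor in part (a) of Lemma 2.2.6.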

We now introduce the second auxiliary process of this subsection, the one in (2.2.114) below, which is a prototype for the main terms in expansions of our original processes of interest introduced in Section 2.1.2. Then, (2.2.114) is studied in Lemma 2.2.9, via Lemma 2.2.6 for the auxiliary process in (2.2.60). Let the random vector $\eta_i(n)$ be given by

ViiP’') ~ (.Vi ^ > ^i,xy Ah &i,xx ^)» (2.2.111)

and let the vector $d \in \mathbb{R}^5$ be such that
$$d^{(1)}\beta + d^{(2)} = 0 \quad \text{and} \quad d^{(3)}\beta^2 + d^{(4)}\beta + d^{(5)} = 0. \qquad (2.2.112)$$

Define the time function $L_n(t)$ as
$$L_n(t) = \sup\Big\{m : \sum_{i=1}^{m}\mathrm{Var}\langle\eta_i(n), d\rangle \le t\sum_{i=1}^{n}\mathrm{Var}\langle\eta_i(n), d\rangle\Big\}, \quad 0 \le t \le 1. \qquad (2.2.113)$$
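The time functions of the type (2.2.61)/(2.2.113) are straightforward to compute from the per-summand variances. A minimal Python sketch (the sample variance vector is an illustrative assumption):

```python
import numpy as np

# K_n(t) = sup{ m : v_1 + ... + v_m <= t * (v_1 + ... + v_n) }
# for per-summand variances v_1, ..., v_n.
def time_fn(v, t):
    cum = np.cumsum(v)
    return int(np.searchsorted(cum, t * cum[-1], side='right'))

v = np.array([1.0, 3.0, 2.0, 2.0])
print([time_fn(v, t) for t in (0.0, 0.125, 0.5, 1.0)])   # -> [0, 1, 2, 4]
```

When all variances are equal, this reduces to the familiar $[nt]$ time change; unequal variances warp the clock so that each increment of $t$ carries an equal share of the total variance.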


Using the vectors $\eta_i(n)$ of (2.2.111) and $d \neq 0$ as in (2.2.112), we introduce a process in $D[0,1]$ as
$$\frac{\sqrt{n}\,\langle\bar\eta(n), d\rangle_{L_n(t)}}{\big(\sum_{i=1}^{n}\langle\eta_i(n) - \bar\eta(n), d\rangle^2/(n-1)\big)^{1/2}} \qquad (2.2.114)$$
$$= \sqrt{n(n-1)}\Big[d^{(1)}(\bar y - \alpha)_{L_n(t)} + d^{(2)}\bar x_{L_n(t)} + d^{(3)}\big(S_{yy,t} - \lambda\theta L_n(t)/n\big) + d^{(4)}\big(S_{xy,t} - \mu L_n(t)/n\big) + d^{(5)}\big(S_{xx,t} - \theta L_n(t)/n\big)\Big]$$
$$\times \left(\sum_{i=1}^{n}\Big(d^{(1)}(y_i - \bar y) + d^{(2)}(x_i - \bar x) + d^{(3)}(s_{i,yy} - S_{yy}) + d^{(4)}(s_{i,xy} - S_{xy}) + d^{(5)}(s_{i,xx} - S_{xx})\Big)^2\right)^{-1/2}.$$

Lemma 2.2.9. Let assumptions (C)-(F) be satisfied. Define the vector
$$b = \big(2\beta d^{(3)} + d^{(4)},\; \beta d^{(4)} + 2d^{(5)},\; d^{(1)},\; d^{(2)},\; d^{(4)},\; d^{(3)},\; d^{(5)}\big), \qquad (2.2.115)$$
where $d = (d^{(1)}, \ldots, d^{(5)}) \in \mathbb{R}^5$ is as in (2.2.112). If $b^{(1)} = b^{(2)} = 0$, assume also (2.2.62). Then, as $n \to \infty$, for the process in (2.2.114) we have:

(a) $\displaystyle \frac{\sqrt{n}\,\langle\bar\eta(n), d\rangle_{L_n(t_0)}}{\big(\sum_{i=1}^{n}\langle\eta_i(n) - \bar\eta(n), d\rangle^2/(n-1)\big)^{1/2}} \xrightarrow{\mathcal{D}} N(0, t_0), \quad \text{for } t_0 \in (0, 1];$

(b) $\displaystyle \frac{\sqrt{n}\,\langle\bar\eta(n), d\rangle_{L_n(t)}}{\big(\sum_{i=1}^{n}\langle\eta_i(n) - \bar\eta(n), d\rangle^2/(n-1)\big)^{1/2}} \xrightarrow{\mathcal{D}} W(t)$ on $(D[0,1], \rho)$;

(c) we can redefine $\{(\delta_i, \varepsilon_i), i \ge 1\}$ on a richer probability space together with a sequence of independent standard normal r.v.'s $\{U_i, i \ge 1\}$ such that
$$\sup_{0\le t\le 1}\left|\frac{\sqrt{n}\,\langle\bar\eta(n), d\rangle_{L_n(t)}}{\big(\sum_{i=1}^{n}\langle\eta_i(n) - \bar\eta(n), d\rangle^2/(n-1)\big)^{1/2}} - \frac{\sum_{i=1}^{L_n(t)}\big(\mathrm{Var}\langle\eta_i(n), d\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^{n}\mathrm{Var}\langle\eta_i(n), d\rangle\big)^{1/2}}\right| = o_P(1).$$

Proof. Due to the properties of vector d in (2.2.112), we have


$$\langle\eta_i(n), d\rangle = d^{(1)}(y_i - \alpha) + d^{(2)}x_i + d^{(3)}(s_{i,yy} - \lambda\theta) + d^{(4)}(s_{i,xy} - \mu) + d^{(5)}(s_{i,xx} - \theta)$$
$$= d^{(1)}(\xi_i\beta + \delta_i) + d^{(2)}(\xi_i + \varepsilon_i) + d^{(3)}\big(s_{i,\xi\xi}\beta^2 + 2s_{i,\xi\delta}\beta + (s_{i,\delta\delta} - \lambda\theta)\big)$$
$$\quad + d^{(4)}\big(s_{i,\xi\xi}\beta + s_{i,\xi\delta} + s_{i,\xi\varepsilon}\beta + (s_{i,\delta\varepsilon} - \mu)\big) + d^{(5)}\big(s_{i,\xi\xi} + 2s_{i,\xi\varepsilon} + (s_{i,\varepsilon\varepsilon} - \theta)\big)$$
$$= \big(2\beta d^{(3)} + d^{(4)}\big)(\xi_i - c\bar\xi)(\delta_i - c\bar\delta) + \big(\beta d^{(4)} + 2d^{(5)}\big)(\xi_i - c\bar\xi)(\varepsilon_i - c\bar\varepsilon)$$
$$\quad + d^{(1)}\delta_i + d^{(2)}\varepsilon_i + d^{(4)}\big((\delta_i - c\bar\delta)(\varepsilon_i - c\bar\varepsilon) - \mu\big) + d^{(3)}\big((\delta_i - c\bar\delta)^2 - \lambda\theta\big) + d^{(5)}\big((\varepsilon_i - c\bar\varepsilon)^2 - \theta\big)$$
$$= \langle\zeta_i, b\rangle + c\,R_i(n), \qquad (2.2.116)$$

with Ci and b of (2.2.59) and (2.2.115) respectively, c of (2.1.7) and term

Riin) = 6(1)( - ~8{£i - m) + (m - 1)($ - 3)) + &(2)( - ~ ™) + (m - f)(e* - e)) +&(«)(_ z6t -8 e i +'5e) + b®( - 28Si + ® 2) + &<7>( - 2 e Si + (e)2).

(2.2.117)

If the intercept $\alpha$ is known to be zero, i.e., $c = 0$, then from (2.2.116) $\langle\eta_i(n),d\rangle = \langle\zeta_i,b\rangle$, the function $K_n(t)$ of (2.2.61) coincides with $L_n(t)$ of (2.2.113), the processes in (2.2.60) and (2.2.114) are identical and thus, Lemma 2.2.9 amounts to Lemma 2.2.6. Suppose now that $\alpha$ is not known to be zero, i.e., $c = 1$. Then, using the proof of Lemma 2.2.1 for the prototype of $\{\langle\zeta_i,b\rangle,\ i\ge 1\}$ (note that, according to the proof of Lemma 2.2.8, Lindeberg's condition (2.2.99) is satisfied), one concludes (cf. (2.2.19), (2.2.21) and (2.2.23)) that
$$\frac{\sum_{i=1}^{K_n(t)}\big(\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^{n}\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}} \xrightarrow{d} W(t) \ \text{on } (D[0,1],\rho), \quad (2.2.118)$$

where {£/*, i > 1} are as in part (c) of Lemma 2.2.6. Mutatis mutandis Remark 2.1.4 of Section 2.1.4, by (2.2.118) and (2.2.123), it suffices to prove (c). In view of (2.2.116),

$$\frac{\sqrt{n}\,\langle\bar\eta(n),d\rangle_{L_n(t)}}{\big(\sum_{i=1}^{n}\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\big)^{1/2}}$$


$$= \Bigg(\frac{\sqrt{n}\,\langle\bar\zeta,b\rangle_{K_n(t)}}{\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}} + \frac{\sqrt{n}\,\big(\langle\bar\zeta,b\rangle_{L_n(t)}-\langle\bar\zeta,b\rangle_{K_n(t)}\big)}{\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}} + \frac{\sqrt{n}\,\bar R(n)_{L_n(t)}}{\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}}\Bigg)\Bigg(\frac{\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2}{\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2}\Bigg)^{1/2}, \quad (2.2.119)$$
where $K_n(t)$ and $L_n(t)$ are defined by (2.2.61) and (2.2.113). If, as $n\to\infty$,
$$\frac{\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2}{\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2} \xrightarrow{P} 1, \quad (2.2.120)$$
$$\sup_{0\le t\le 1}\frac{\sqrt{n}\,\big|\bar R(n)_{L_n(t)}\big|}{\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}} = o_P(1), \quad (2.2.121)$$
$$\sup_{0\le t\le 1}\frac{\big|\sqrt{n}\,\big(\langle\bar\zeta,b\rangle_{L_n(t)}-\langle\bar\zeta,b\rangle_{K_n(t)}\big)\big|}{\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}} = o_P(1), \quad (2.2.122)$$
and

$$\sup_{0\le t\le 1}\Bigg|\frac{\sum_{i=1}^{L_n(t)}\big(\operatorname{Var}\langle\eta_i(n),d\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^{n}\operatorname{Var}\langle\eta_i(n),d\rangle\big)^{1/2}} - \frac{\sum_{i=1}^{K_n(t)}\big(\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^{n}\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}}\Bigg| = o_P(1), \quad (2.2.123)$$
where $\{U_i,\ i\ge 1\}$ are as in (c) of Lemma 2.2.6, then, via (2.2.119), the proof of part (c) of Lemma 2.2.9 for $\sqrt{n}\,\langle\bar\eta(n),d\rangle_{L_n(t)}\big(\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\big)^{-1/2}$ with such $\{U_i,\ i\ge 1\}$ follows from (c) of Lemma 2.2.6 for $\sqrt{n}\,\langle\bar\zeta,b\rangle_{K_n(t)}\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{-1/2}$, since
$$\sup_{0\le t\le 1}\Bigg|\frac{\sqrt{n}\,\langle\bar\eta(n),d\rangle_{L_n(t)}}{\big(\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\big)^{1/2}} - \frac{\sum_{i=1}^{L_n(t)}\big(\operatorname{Var}\langle\eta_i(n),d\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^{n}\operatorname{Var}\langle\eta_i(n),d\rangle\big)^{1/2}}\Bigg|$$
$$\le \sup_{0\le t\le 1}\Bigg|\frac{\sum_{i=1}^{L_n(t)}\big(\operatorname{Var}\langle\eta_i(n),d\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^{n}\operatorname{Var}\langle\eta_i(n),d\rangle\big)^{1/2}} - \frac{\sum_{i=1}^{K_n(t)}\big(\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^{n}\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}}\Bigg|$$
$$+ \sup_{0\le t\le 1}\Bigg|\frac{\sqrt{n}\,\langle\bar\zeta,b\rangle_{K_n(t)}}{\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}} - \frac{\sum_{i=1}^{K_n(t)}\big(\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}U_i}{\big(\sum_{i=1}^{n}\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2}}\Bigg|$$
$$+ \sup_{0\le t\le 1}\frac{\big|\sqrt{n}\,\langle\bar\zeta,b\rangle_{K_n(t)}\big|}{\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}}\,\Bigg|\Bigg(\frac{\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2}{\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2}\Bigg)^{1/2}-1\Bigg|$$
$$+ \Bigg(\sup_{0\le t\le 1}\frac{\big|\sqrt{n}\,\big(\langle\bar\zeta,b\rangle_{L_n(t)}-\langle\bar\zeta,b\rangle_{K_n(t)}\big)\big|}{\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}} + \sup_{0\le t\le 1}\frac{\sqrt{n}\,\big|\bar R(n)_{L_n(t)}\big|}{\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{1/2}}\Bigg)\Bigg(\frac{\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2}{\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2}\Bigg)^{1/2},$$
and (2.2.118) and (c) of Lemma 2.2.6 additionally imply that $\sup_{0\le t\le 1}\big|\sqrt{n}\,\langle\bar\zeta,b\rangle_{K_n(t)}\big|\big(\sum_{i=1}^n\langle\zeta_i-\bar\zeta,b\rangle^2/(n-1)\big)^{-1/2} = O_P(1)$, $n\to\infty$. First, we will show (2.2.120). In view of (2.2.105) of Remark 2.2.6 and the Cauchy-Schwarz inequality, it suffices to prove that

$$\frac{1}{n}\sum_{i=1}^n\big(\langle\eta_i(n)-\bar\eta(n),d\rangle - \langle\zeta_i-\bar\zeta,b\rangle\big)^2 = \frac{1}{n}\sum_{i=1}^n\big(R_i(n)-\bar R(n)\big)^2 = o_P(1), \ \text{if } b^{(1)}=b^{(2)}=0, \quad (2.2.124)$$
$$\frac{1}{nS_\#}\sum_{i=1}^n\big(R_i(n)-\bar R(n)\big)^2 = o_P(1), \ \text{if } |b^{(1)}|+|b^{(2)}|>0.$$

In order to show (2.2.124), using (2.2.59), one first writes
$$\langle\zeta_i-\bar\zeta,b\rangle = b^{(1)}\big((\xi_i-m)\delta_i - \overline{(\xi-m)\delta}\big) + b^{(2)}\big((\xi_i-m)\varepsilon_i - \overline{(\xi-m)\varepsilon}\big) + b^{(3)}(\delta_i-\bar\delta) + b^{(4)}(\varepsilon_i-\bar\varepsilon)$$
$$+ b^{(5)}\big(\delta_i\varepsilon_i-\overline{\delta\varepsilon}\big) + b^{(6)}\big(\delta_i^2-\overline{\delta^2}\big) + b^{(7)}\big(\varepsilon_i^2-\overline{\varepsilon^2}\big), \quad (2.2.125)$$
while from (2.2.116),
$$\langle\eta_i(n)-\bar\eta(n),d\rangle$$
$$= d^{(1)}(\delta_i-\bar\delta) + d^{(2)}(\varepsilon_i-\bar\varepsilon) + d^{(3)}\big(2\beta(s_{i,\xi\delta}-S_{\xi\delta}) + (s_{i,\delta\delta}-S_{\delta\delta})\big)$$
$$+ d^{(4)}\big(\beta(s_{i,\xi\varepsilon}-S_{\xi\varepsilon}) + (s_{i,\xi\delta}-S_{\xi\delta}) + (s_{i,\delta\varepsilon}-S_{\delta\varepsilon})\big) + d^{(5)}\big(2(s_{i,\xi\varepsilon}-S_{\xi\varepsilon}) + (s_{i,\varepsilon\varepsilon}-S_{\varepsilon\varepsilon})\big)$$
$$= b^{(1)}\big((\xi_i-\bar\xi)(\delta_i-\bar\delta) - (\overline{\xi\delta}-\bar\xi\bar\delta)\big) + b^{(2)}\big((\xi_i-\bar\xi)(\varepsilon_i-\bar\varepsilon) - (\overline{\xi\varepsilon}-\bar\xi\bar\varepsilon)\big)$$
$$+ b^{(3)}(\delta_i-\bar\delta) + b^{(4)}(\varepsilon_i-\bar\varepsilon) + b^{(5)}\big((\delta_i-\bar\delta)(\varepsilon_i-\bar\varepsilon) - (\overline{\delta\varepsilon}-\bar\delta\bar\varepsilon)\big)$$
$$+ b^{(6)}\big((\delta_i-\bar\delta)^2 - (\overline{\delta^2}-(\bar\delta)^2)\big) + b^{(7)}\big((\varepsilon_i-\bar\varepsilon)^2 - (\overline{\varepsilon^2}-(\bar\varepsilon)^2)\big). \quad (2.2.126)$$


Then, on using the Cauchy-Schwarz inequality, one reduces (2.2.124) to the following statements for the corresponding components of (2.2.125) and (2.2.126): as $n\to\infty$,
$$\frac{1}{nS_\#}\sum_{i=1}^n\Big(b^{(1)}\big((\xi_i-\bar\xi)(\delta_i-\bar\delta)-(\overline{\xi\delta}-\bar\xi\bar\delta)\big) - b^{(1)}\big((\xi_i-m)\delta_i-\overline{(\xi-m)\delta}\big)\Big)^2 = o_P(1) \quad (2.2.127)$$
and
$$\frac{1}{nS_\#}\sum_{i=1}^n\Big(b^{(2)}\big((\xi_i-\bar\xi)(\varepsilon_i-\bar\varepsilon)-(\overline{\xi\varepsilon}-\bar\xi\bar\varepsilon)\big) - b^{(2)}\big((\xi_i-m)\varepsilon_i-\overline{(\xi-m)\varepsilon}\big)\Big)^2 = o_P(1), \quad (2.2.128)$$
if $|b^{(1)}|+|b^{(2)}|>0$, as well as
$$\frac{1}{n}\sum_{i=1}^n\Big(b^{(5)}\big((\delta_i-\bar\delta)(\varepsilon_i-\bar\varepsilon)-(\overline{\delta\varepsilon}-\bar\delta\bar\varepsilon)\big) - b^{(5)}\big(\delta_i\varepsilon_i-\overline{\delta\varepsilon}\big)\Big)^2 = o_P(1), \quad (2.2.129)$$
$$\frac{1}{n}\sum_{i=1}^n\Big(b^{(6)}\big((\delta_i-\bar\delta)^2-(\overline{\delta^2}-(\bar\delta)^2)\big) - b^{(6)}\big(\delta_i^2-\overline{\delta^2}\big)\Big)^2 = o_P(1) \quad (2.2.130)$$
and
$$\frac{1}{n}\sum_{i=1}^n\Big(b^{(7)}\big((\varepsilon_i-\bar\varepsilon)^2-(\overline{\varepsilon^2}-(\bar\varepsilon)^2)\big) - b^{(7)}\big(\varepsilon_i^2-\overline{\varepsilon^2}\big)\Big)^2 = o_P(1). \quad (2.2.131)$$
We conclude (2.2.127) via applying the same Cauchy-Schwarz argument, the WLLN (under (C)), (D) and (E) as follows:

$$\frac{1}{nS_\#}\sum_{i=1}^n\big((\xi_i-\bar\xi)(\delta_i-\bar\delta)-(\xi_i-m)\delta_i\big)^2 = \frac{1}{nS_\#}\sum_{i=1}^n\big(-(\xi_i-m)\bar\delta+(m-\bar\xi)(\delta_i-\bar\delta)\big)^2$$
$$\le \frac{2}{S_\#}\Bigg((\bar\delta)^2\,\frac{1}{n}\sum_{i=1}^n(\xi_i-m)^2 + (m-\bar\xi)^2\,\frac{1}{n}\sum_{i=1}^n(\delta_i-\bar\delta)^2\Bigg) = o_P(1)$$
and
$$\frac{1}{nS_\#}\sum_{i=1}^n\big(\overline{(\xi-m)\delta}-(\overline{\xi\delta}-\bar\xi\bar\delta)\big)^2 = o_P(1).$$
In the same manner (2.2.128) is obtained. Now, on noting that (2.2.129)-(2.2.131) are all handled similarly, we only show (2.2.129). The proof of (2.2.129) easily results from the WLLN (under (C)):
$$\frac{1}{n}\sum_{i=1}^n\big((\delta_i-\bar\delta)(\varepsilon_i-\bar\varepsilon)-\delta_i\varepsilon_i\big)^2 = \frac{1}{n}\sum_{i=1}^n\big(-\delta_i\bar\varepsilon-\bar\delta\varepsilon_i+\bar\delta\bar\varepsilon\big)^2$$
$$\le \frac{3}{n}\sum_{i=1}^n\big(\delta_i^2(\bar\varepsilon)^2+(\bar\delta)^2\varepsilon_i^2+(\bar\delta)^2(\bar\varepsilon)^2\big) = o_P(1)$$


and
$$\frac{1}{n}\sum_{i=1}^n\big(\overline{\delta\varepsilon}-\bar\delta\bar\varepsilon\big)^2 = \big(\overline{\delta\varepsilon}-\bar\delta\bar\varepsilon\big)^2 = o_P(1).$$
This completes the proof of (2.2.120). Now, (2.2.121) is to be verified. In view of (2.2.105), (2.2.121) reduces to
$$\sup_{0\le t\le 1}\big|\sqrt{n}\,\bar R(n)_{L_n(t)}\big| = o_P(1), \ \text{if } b^{(1)}=b^{(2)}=0, \qquad \sup_{0\le t\le 1}\frac{\big|\sqrt{n}\,\bar R(n)_{L_n(t)}\big|}{\sqrt{S_\#}} = o_P(1), \ \text{if } |b^{(1)}|+|b^{(2)}|>0.$$

Analysing typical summands in (2.2.117), due to (D), the WLLN and the functional CLT, we get

M gSj^-O I £ Op 0) lELte-m)! 1- fe-n \JnS# V™ 1-/c-n yJnS# Op ( 1) k E t i 6 —r = — max — m = oP( 1), (2.2.133) n k

max m a x ______T .h (S j ~ *)| o(l) Op{ 1) = op(l) 1^ n yJnStf y/n (2.2.134) and

Op (1) n l

sup = Op(l), (2.2.136) 0


with
$$\sigma_i = \big(\operatorname{Var}\langle\zeta_i,b\rangle\big)^{1/2} \quad (2.2.137)$$
and independent standard normal r.v.'s $\{U_i,\ i\ge 1\}$. Let $\{W(t),\ t\ge 0\}$ be a standard Wiener process. For all $n$ sufficiently large, we have

f E h '" ’ ViU. r £ f > <7iU, „ „ _ , 1 IV5W " VCT J ( v max{Kn(t),L„{t)} tj ^min {#„(*),MO) tt ) v °>Uj U -1 I y a , o? J /^max{irn(t),Ln(t)} 2\ 'i {^ / ^i=min{jfn(*),Ln(t)}+l » \ Q < t < 1 >

and, on account of stationarity of a Wiener process, for all n sufficiently large

Kn(t) j E£?

max{A_n(i),i„(t)} 2 ,min{isrn (i),L „ (0 } ^ v E = sup i=l * i - W ' =1 0

Hence, by Lemma 2.2.3, it suffices to prove that
$$\sup_{0\le t\le 1}\Bigg|\frac{\sum_{i=1}^{L_n(t)}\nu_i^2}{\sum_{i=1}^{n}\nu_i^2} - \frac{\sum_{i=1}^{K_n(t)}\sigma_i^2}{\sum_{i=1}^{n}\sigma_i^2}\Bigg| = o(1), \quad n\to\infty. \quad (2.2.138)$$

The following auxiliary result will come in handy for the proof of (2.2.138):
$$\frac{\sum_{i=1}^n\operatorname{Var}R_i(n)}{\sum_{i=1}^n\sigma_i^2} = o(1), \quad n\to\infty, \quad (2.2.139)$$
where $R_i(n)$ is as in (2.2.117). Due to (2.2.108) and (2.2.110) in regard of $\sum_{i=1}^n\sigma_i^2$, (2.2.139) follows from
$$\frac{1}{n}\sum_{i=1}^n\operatorname{Var}R_i(n) = o(1), \ \text{if } b^{(1)}=b^{(2)}=0, \qquad \frac{1}{nS_\#}\sum_{i=1}^n\operatorname{Var}R_i(n) = o(1), \ \text{if } |b^{(1)}|+|b^{(2)}|>0. \quad (2.2.140)$$


For the typical summands of (2.2.117), due to (D), (E) and the WLLN (under (C)), as n —> oo,

+ 2 V a r — _ nS# n2S# nStf (2.2.141)

E L i Var ((m - fl (ft - S)) (m - £)2Y ,ti Var (ft - 5)

_ ( m — £)2Var (ft - 5) = o(l), (2.2.142)

E ”=i Var (e ft) Var (eft = o ( l ) (2.2.143) n n and

ZS.1 Var m _ Var m = nv^fe) = 0(1). (2.2.144) n tt Assumption (E), (2.2.141)-(2.2.144) and their analogues for other summands in (2.2.117) yield (2.2.140). Coming back to (2.2.138) with <7* of (2.2.137), with

$$\nu_i = \big(\operatorname{Var}\langle\eta_i(n),d\rangle\big)^{1/2}, \quad (2.2.145)$$

we have

y->JKn(t) 2 Ln(t) „2 ^ K n(t) 2 \-^Ln(t) o 2^=i sup L>i-1 ai i < sup - i + sup t - 0

Ef=i(<) 0? + sup + sup Z t P v ? 0 < t< l E t i "«2 0

That

Ai —> 0, n —* 00, (2.2.147)


follows from Lindeberg’s condition (2.2.99). By (2.2.116), (2.2.139) and the Cauchy- Schwarz inequality,

$$\frac{\sum_{i=1}^n\nu_i^2}{\sum_{i=1}^n\sigma_i^2} = 1 + \frac{\sum_{i=1}^n\operatorname{Var}R_i(n)}{\sum_{i=1}^n\sigma_i^2} + \frac{2\sum_{i=1}^n\operatorname{cov}\big(\langle\zeta_i,b\rangle,R_i(n)\big)}{\sum_{i=1}^n\sigma_i^2} = 1 + o(1), \quad n\to\infty. \quad (2.2.148)$$
Due to (2.2.139) and (2.2.148), as $n\to\infty$,

/ 2 maxi

$$\Bigg|\frac{\sum_{i=1}^n\nu_i^2}{\sum_{i=1}^n\sigma_i^2} - 1\Bigg| \le \frac{\sum_{i=1}^n\operatorname{Var}R_i(n) + 2\sum_{i=1}^n\big|\operatorname{cov}\big(\langle\zeta_i,b\rangle,R_i(n)\big)\big|}{\sum_{i=1}^n\sigma_i^2} \le o(1) + \frac{2\sum_{i=1}^n(\sigma_i^2)^{1/2}\big(\operatorname{Var}R_i(n)\big)^{1/2}}{\sum_{i=1}^n\sigma_i^2}$$
$$\le o(1) + 2\Bigg(\frac{\sum_{i=1}^n\operatorname{Var}R_i(n)}{\sum_{i=1}^n\sigma_i^2}\Bigg)^{1/2} = o(1). \quad (2.2.150)$$
Thus, (2.2.146)-(2.2.150) result in (2.2.138) and conclude the proof of (2.2.122). It remains to be shown that (2.2.123) is valid. We have

$$\sup_{0\le t\le 1}\Bigg|\frac{\sum_{i=1}^{L_n(t)}\sigma_iU_i}{\big(\sum_{i=1}^n\sigma_i^2\big)^{1/2}} - \frac{\sum_{i=1}^{L_n(t)}\nu_iU_i}{\big(\sum_{i=1}^n\nu_i^2\big)^{1/2}}\Bigg| \le \sup_{0\le t\le 1}\frac{\Big|\sum_{i=1}^{L_n(t)}(\sigma_i-\nu_i)U_i\Big|}{\big(\sum_{i=1}^n\sigma_i^2\big)^{1/2}} + \sup_{0\le t\le 1}\Bigg|\frac{\sum_{i=1}^{L_n(t)}\nu_iU_i}{\big(\sum_{i=1}^n\nu_i^2\big)^{1/2}}\Bigg|\,\Bigg|\Bigg(\frac{\sum_{i=1}^n\nu_i^2}{\sum_{i=1}^n\sigma_i^2}\Bigg)^{1/2}-1\Bigg|,$$
where $\{U_i,\ i\ge 1\}$ are i.i.d. $N(0,1)$ r.v.'s as in part (c) of Lemma 2.2.8. Hence, by (2.2.118), (2.2.136) and (2.2.148), it suffices to prove that
$$\sup_{0\le t\le 1}\frac{\Big|\sum_{i=1}^{L_n(t)}(\sigma_i-\nu_i)U_i\Big|}{\big(\sum_{i=1}^n\sigma_i^2\big)^{1/2}} = o_P(1). \quad (2.2.151)$$
For any $\lambda>0$,
$$P\Bigg(\sup_{0\le t\le 1}\Big|\sum_{i=1}^{L_n(t)}(\sigma_i-\nu_i)U_i\Big| > \lambda\Big(\sum_{i=1}^n\sigma_i^2\Big)^{1/2}\Bigg)$$


$$= P\Bigg(\max_{1\le k\le n}\Big|\sum_{i=1}^{k}(\sigma_i-\nu_i)U_i\Big| > \lambda\Big(\sum_{i=1}^n\sigma_i^2\Big)^{1/2}\Bigg) \le \frac{\operatorname{Var}\big(\sum_{i=1}^n(\sigma_i-\nu_i)U_i\big)}{\lambda^2\sum_{i=1}^n\sigma_i^2} = \frac{\sum_{i=1}^n(\sigma_i-\nu_i)^2}{\lambda^2\sum_{i=1}^n\sigma_i^2},$$
and hence, (2.2.151) follows from
$$\frac{\sum_{i=1}^n(\sigma_i-\nu_i)^2}{\sum_{i=1}^n\sigma_i^2} = o(1), \quad n\to\infty,$$
that, in turn, via (2.2.148), reduces to
$$\frac{\sum_{i=1}^n\sigma_i\nu_i}{\sum_{i=1}^n\sigma_i^2} \to 1, \quad n\to\infty.$$
The latter convergence is obtained by sandwiching. Namely, as $n\to\infty$, by (2.2.148),

E l l <711-1 _ E l l (g2!/2)1^ < E l ! (it? + V?) 1, E l l <7? E l l <7? - 2Ell<7i while by (2.2.116) and (2.2.139),

E”=iW i E l l <7? E li(^ )1/2 _ SllK fe, b) -E(Q, b))((W(»), d) — E{rji(n)t d))| E l ! <7? “ E l i<7? E l l b) —E (o, &))((£, 6) - E (Ci, 6) + cft(n) - c£fli(n))| E l i (7? _ E l i (<7,2 - K)(cB,(n) - cE JJi(n))|) E l i <7? E li |E(<6, b) -E (& 6>)(cfli(n) - cSJ^n))! Y r- <7* > j _ E li <7i(Var Rj{n))V2 > ‘ (Eli<7?)1/2(E li Varflifo))1'2 E l i <7? E ll <7? . /'EliVarfi,(n)V/2 1. □ I E l i * ? )

Remark 2.2.7. Combining (2.2.105) of Remark 2.2.6 and (2.2.120), for the norming factor $\big(\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\big)^{-1/2}$ of the process $\sqrt{n}\,\langle\bar\eta(n),d\rangle_{L_n(t)}$ in (2.2.114),


we have, as $n\to\infty$,
$$\frac{n-1}{\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2} \xrightarrow{P} \text{const}, \ \text{if } b^{(1)}=b^{(2)}=0 \ \text{and/or} \ \lim_{n\to\infty}\overline{\xi^2}<\infty,$$
$$\frac{S_\#(n-1)}{\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2} \xrightarrow{P} \text{const}, \ \text{if } |b^{(1)}|+|b^{(2)}|>0 \ \text{and} \ \lim_{n\to\infty}\overline{\xi^2}=\infty,$$
$$c_1 \le \frac{S_\#(n-1)}{\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2} \le c_2 \ \text{WPA1}, \ \text{otherwise}, \quad (2.2.152)$$
with $b$ of (2.2.115), some positive constants $c_1$ and $c_2$, and $S_\#$ as in (E). In particular, in view of (2.2.152), the process in (2.2.114) is well-defined WPA1, $n\to\infty$. It is not hard to see that (2.2.152) implies

$$\Big(\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\Big)^{-1/2} = O_P(1), \ \text{if } b^{(1)}=b^{(2)}=0,$$
$$\Big(\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\Big)^{-1/2} = O_P(1)/\sqrt{S_\#}, \ \text{if } |b^{(1)}|+|b^{(2)}|>0. \quad (2.2.153)$$

By part (b) of Lemma 2.2.9 and (2.2.153), as $n\to\infty$,
$$\sup_{0\le t\le 1}\big|\langle\bar\eta(n),d\rangle_{L_n(t)}\big| = \frac{1}{\sqrt{n}}\,O_P(1), \ \text{if } b^{(1)}=b^{(2)}=0, \quad \text{and} \quad = \frac{\sqrt{S_\#}}{\sqrt{n}}\,O_P(1), \ \text{if } |b^{(1)}|+|b^{(2)}|>0. \quad (2.2.154)$$
Combining the concluding lines of Remark 2.2.6 and (2.2.120), we note that the Studentization in Lemma 2.2.9 is necessary for the invariance principles in (a) and (b) of this lemma to hold true, and that the normalizing factor $\sqrt{n}\,\big(\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\big)^{-1/2}$ cannot in general be replaced with $\text{const}\,\sqrt{n}$ or $\text{const}\,\sqrt{n/S_\#}$.

Remark 2.2.8. The process $\sqrt{n}\,\langle\bar\eta(n),d\rangle_{L_n(t)}\big(\sum_{i=1}^n\langle\eta_i(n)-\bar\eta(n),d\rangle^2/(n-1)\big)^{-1/2}$ of (2.2.114) is a version of the Student process of (2.2.7) in a somewhat loose sense, for the special triangular sequence $\{\langle\eta_i(n),d\rangle,\ 1\le i\le n,\ n\ge 1\}$ of dependent r.v.'s. Therefore, Lemma 2.2.9 can be viewed as a nontrivial extension of invariance principles based on Studentization as in Lemma 2.2.1. Moreover, in the context of model (2.1.1)-(2.1.2), the results of Lemma 2.2.9 can be applied to all the estimators that are linear combinations of $(\bar y, \bar x, S_{yy}, S_{xy}, S_{xx})$ (with coefficients as in (2.2.112)) and their corresponding processes, and thus, also, to various reasonable estimators based on the latter vector and their corresponding processes. We also note that for Lemma 2.2.9, Lemma 1.2.10 in Chapter 1 is a predecessor and true analogue in the context of SEIVM's (2.1.1)-(2.1.2) with explanatory variables that are i.i.d.r.v.'s and satisfy some random companion conditions to those in (D)-(F).
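As an informal numerical aside (not part of the thesis), the practical content of such Studentized invariance principles is easy to visualize: the self-normalized mean of i.i.d. data is asymptotically standard normal at $t = 1$ for any finite-variance error distribution, heavy-tailed or not. The sketch below is ours; all names are illustrative, and the $t(3)$ errors merely stand in for generic $\delta$, $\varepsilon$.

```python
import numpy as np

def studentized_statistic(x):
    """Self-normalized (Studentized) mean: sqrt(n) * xbar / s_n,
    with s_n the sample standard deviation (ddof=1)."""
    n = len(x)
    return np.sqrt(n) * x.mean() / x.std(ddof=1)

def simulate(n=200, reps=2000, seed=0):
    """Terminal (t = 1) values of the Student process for heavy-tailed
    t(3) errors; by the Studentized CLT these are approximately N(0, 1)."""
    rng = np.random.default_rng(seed)
    return np.array([studentized_statistic(rng.standard_t(df=3, size=n))
                     for _ in range(reps)])

if __name__ == "__main__":
    stats = simulate()
    # Empirical mean and variance should be close to 0 and 1, respectively.
    print(round(float(stats.mean()), 3), round(float(stats.var()), 3))
```

Replacing the random normalizer by a deterministic constant would require knowing the error variance, which is exactly what Studentization avoids.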

Before proving Theorems 2.1.2 and 2.1.3, the main results of Section 2.1.4, we call attention to, and emphasize the importance of, the negligibility condition (F) for the main auxiliary results of this section, Lemmas 2.2.6 and 2.2.9. First we note that, in view of Lemma 2.2.1, Lindeberg's condition (2.2.99) for $\{\langle\zeta_i,b\rangle,\ i\ge 1\}$, with $\zeta_i$ of (2.2.59) and $b\in\mathbb{R}^7$, $b\ne 0$, is a key sufficient condition for Lemma 2.2.6 and hence, also, for Lemma 2.2.9. Moreover, according to Lemma 2.2.10 below, (F) is necessary and sufficient for (2.2.99) when $|b^{(1)}|+|b^{(2)}|>0$, i.e., when the $\langle\zeta_i,b\rangle$ are truly based on the explanatory variables $\xi_i$ (as opposed to the case $b^{(1)}=b^{(2)}=0$, when the $\langle\zeta_i,b\rangle$ are error-based only).

Lemma 2.2.10. Assume that conditions (C)-(E) are satisfied and $|b^{(1)}|+|b^{(2)}|>0$, $b\in\mathbb{R}^7$. Then, condition (F) is necessary and sufficient for Lindeberg's condition (2.2.99) for the sequence $\{\langle\zeta_i,b\rangle,\ i\ge 1\}$, with $\zeta_i$ of (2.2.59).
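For the reader's convenience — since (2.2.99) itself is stated elsewhere in the thesis — the classical Lindeberg and Feller conditions invoked here take, in generic form for a mean-centered independent sequence, the following shape (our restatement, with (2.2.99) being the Lindeberg condition specialized to $\{\langle\zeta_i,b\rangle,\ i\ge 1\}$):

```latex
% Generic forms only; the thesis' (2.2.99) is the Lindeberg condition
% specialized to the sequence {<zeta_i, b>, i >= 1}.
\[
  s_n^2 := \sum_{i=1}^{n}\operatorname{Var}\langle\zeta_i,b\rangle, \qquad
  \text{(Lindeberg)}\quad
  \frac{1}{s_n^2}\sum_{i=1}^{n}
  E\Big[\big(\langle\zeta_i,b\rangle - E\langle\zeta_i,b\rangle\big)^2
  \mathbf{1}\big\{\big|\langle\zeta_i,b\rangle - E\langle\zeta_i,b\rangle\big|
  > \epsilon s_n\big\}\Big] \longrightarrow 0
  \quad \text{for all } \epsilon > 0,
\]
\[
  \text{(Feller)}\quad
  \max_{1\le i\le n}\frac{\operatorname{Var}\langle\zeta_i,b\rangle}{s_n^2}
  \longrightarrow 0, \qquad n\to\infty.
\]
```

Lindeberg's condition implies Feller's, which is why disproving Feller's condition suffices for the necessity part of the proof below.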

Proof. That (F) is sufficient for Lindeberg’s condition (2.2.99) is seen from the proof of Lemma 2.2.6. To conclude the necessity part, it suffices to show that violation of (F) leads to violation of Feller’s condition, i.e., to

By (2.2.107) and (2.2.110), maxi


maxi

where $d_1 = (b^{(1)})^2\lambda\theta + (b^{(2)})^2\theta > 0$, and $d_2$, $d_3$ and $c_1$ are constants such that $|d_2|+|d_3|>0$ and $c_1>0$. Due to (2.2.156) and (E), (2.2.155) follows from

max1

on noting that violation of (F) leads to (2.2.157) via (E) and the inequality

max l

maxdig +2(^cm-

Proof of Theorem 2.1.2a. In view of Remark 2.1.4 of Section 2.1.4 and (2.2.56) and (2.2.57) of Remark 2.2.3, to prove Theorem 2.1.2a for $\sqrt{n}\,U(1,n)(\hat\beta_{1n}-\beta)_{K_{1n}(t)}\big(\sum_{i=1}^n(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$, it suffices to establish (iii) of Theorem 2.1.2a for this process, with $(\hat\beta_{1n}-\beta)_{K_{1n}(t)}$ of (2.2.55). First, we argue that $\sqrt{n}\,\bar u(1,n)_{K_{1n}(t)}\big(\sum_{i=1}^n(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$ is a special case of the process in (2.2.114) and obeys (c) of Lemma 2.2.9. Indeed, since $\mu$ in (2.1.3) is assumed to be zero,
$$u_i(1,n) = -\frac{2\beta^2}{\lambda+\beta^2}\Big(\lambda s_{i,xx} - s_{i,yy} - \beta^{-1}(\lambda-\beta^2)s_{i,xy}\Big) = -\frac{2\beta^2}{\lambda+\beta^2}\Big(\lambda(s_{i,xx}-\theta) - (s_{i,yy}-\lambda\theta) - \beta^{-1}(\lambda-\beta^2)(s_{i,xy}-\mu)\Big),$$
the corresponding vector $d$ is $d = -2\beta^2(\lambda+\beta^2)^{-1}\big(0,\ 0,\ -1,\ -\beta^{-1}(\lambda-\beta^2),\ \lambda\big)$, and condition (2.2.112) with such $d$ is satisfied, since $d^{(1)}\beta + d^{(2)} = 0$ and $-(2\beta^2)^{-1}(\lambda+\beta^2)\big(d^{(3)}\beta^2+d^{(4)}\beta+d^{(5)}\big) = -\beta^2-(\lambda-\beta^2)+\lambda = 0$. Moreover, for the vector $b$ of (2.2.115) that corresponds to $d$,
$$|b^{(1)}|+|b^{(2)}| = \big|2\beta d^{(3)}+d^{(4)}\big| + \big|\beta d^{(4)}+2d^{(5)}\big| = \frac{2\beta^2}{\lambda+\beta^2}\Big(\big|-2\beta-\beta^{-1}(\lambda-\beta^2)\big| + \big|-(\lambda-\beta^2)+2\lambda\big|\Big) > 0.$$


From (2.2.29) and (2.2.154) of Remark 2.2.7 (case $|b^{(1)}|+|b^{(2)}|>0$), for $(z_n - z)_{K_{1n}(t)}$ of (2.1.12),

A + /?2 r— sup ( Zn - z ) Kln{t) A /32 C SU p Op{ 1) = op(l), 0

(r _ m 2/?2 ^ ,A sign(/3)A(^- zfKin{t) ,9 9 1 . Qv {yin P)Kin(t) — \ i R 2 'n ' Kl "(*)"*■ / \ 3 /2 ’ (2-2.159)

with some random u = u(n) G (0,1). Inequality A < (2.2.160) 2 ((v(2n - z)Kln{t) + z)2 + a) 3/2 2 v ^

uniformly in $t\in[0,1]$, and (2.2.159) yield

VnU{l,ri)(/3ln - /3)Kln(t) 1/2 (E2=1 («<(!» n) -u(l,n) )2/(n - 1))

( E2=i(«i(l, n) - «(l,n) )2/(n - 1))1/2 -2/32 A + ^2 (■^n Z)Kin(t) (l + 0(1) {zn z)jcln(f)j V ^ u(h n )Kln(t) 372 (l + 0(1) (zn — z)Kln(t)) (EJLiM 1.*) - u{\,n))2/{n - 1))

y /^ U^ n)Kln(t) + P n k n(t) (2.2.161) (E tiW l,n)-^M )2/(n-l))1/2’

where on account of (2.2.158) and (2.2.154) of Remark 2.2.7 for u(l,n)Kln^ ,

sup & .(« ! ^ O(l) rap |vSu(l,n)ftn(„| rapJtS. - 2)Kl„w | 0

= 0(l)y/nJ— Op(l)^£QL = % £ > . (2.2.162) V n ^/n% y/n


Combining part (c) of Lemma 2.2.9 for $\sqrt{n}\,\bar u(1,n)_{K_{1n}(t)}\big(\sum_{i=1}^n(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$, (2.2.152) for $\big(\sum_{i=1}^n(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$ and (2.2.162), via (2.2.161) and (E), one gets (iii) of Theorem 2.1.2a for the initial left-hand side process in (2.2.161). The proof for the MLSP's for $\beta$ follows directly from Lemma 2.2.9, since, for $j = 2$ and $3$,
$$\frac{\sqrt{n}\,U(j,n)(\hat\beta_{jn}-\beta)_{K_{jn}(t)}}{\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{1/2}} = \frac{\sqrt{n}\,\bar u(j,n)_{K_{jn}(t)}}{\big(\sum_{i=1}^n(u_i(j,n)-\bar u(j,n))^2/(n-1)\big)^{1/2}}, \quad (2.2.163)$$
i.e., since the thus normalized respective $(\hat\beta_{jn}-\beta)_{K_{jn}(t)}$ of (2.1.22) and (2.1.24) are versions of (2.2.114), with respective vectors $d$ as in (2.2.112) and $b$ of (2.2.115) such that $|b^{(1)}|+|b^{(2)}|>0$. $\square$

Proof of Theorem 2.1.2b. Via Remarks 2.1.4 and 2.2.3, the proof for the WLSP for $\alpha$ reduces to establishing (iii) of Theorem 2.1.2b for the process in (2.2.58). Suppose first that $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = \lim_{n\to\infty}S_\# = M < \infty$ and consider the process in (2.2.58). Note that (2.2.158), and hence (2.2.159) as well, also hold with $L_{1n}(t)$ of (2.1.16) in place of $K_{1n}(t)$ of (2.1.13). Therefore,

V n(aln - a)Lln(t) = y/n({y - x(3 - a)Lln{t) - x 0 m ~ P h ln{t)) — yyi ...... (y - a)Lln(t) - fem (t) ~ 2 M p u M LM t) m/32 + ( 5/32 2Sxy(zn %)L\n(t) S * ,(X + F ) M/3(A + /32).

+0(l)x(zn-z)lln^

=: Vnu'(l, n)Lln(t) + p™Lln(t), o < t < 1, (2.2.164)

where $v'_i(1,n)$ and $m$ are as in (2.1.17) and (D), and, on account of the WLLN, (D), (2.2.29) and (2.2.158),

J2) 2/32y/n x m sup "n,Lin(t) l-SJ sup I (zn - z ) Lln{t) 0< t


+y/n\x\ 0 (1) sup I (zn - z ) l ln(t)\ 0

Combining (c) of Lemma 2.2.9 for $\sqrt{n}\,\bar v'(1,n)_{L_{1n}(t)}\big(\sum_{i=1}^n(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{-1/2}$ (the respective vector $d$ satisfies (2.2.112), and $b$ of (2.2.115) is such that $|b^{(1)}|+|b^{(2)}|>0$ if $m\ne 0$ in (D), and $b^{(1)}=b^{(2)}=0$ if $m=0$, with condition (2.2.62) satisfied in the latter case on account of $\operatorname{Var}(\delta-\beta\varepsilon)>0$ following from (C)), (2.2.152) as regards $\big(\sum_{i=1}^n(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{-1/2}$ and (2.2.165), one concludes that (c) of this lemma is also valid for the process
$$\frac{\sqrt{n}\,\bar v(1,n)_{L_{1n}(t)}}{\big(\sum_{i=1}^n(v_i(1,n)-\bar v(1,n))^2/(n-1)\big)^{1/2}}.$$
To complete the proof of (iii) of Theorem 2.1.2b for the WLSP for $\alpha$ when $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = M < \infty$, the following convergence has to be shown:

$$\frac{\sum_{i=1}^n\big(v_i(1,n)-\bar v(1,n)\big)^2}{\sum_{i=1}^n\big(v'_i(1,n)-\bar v'(1,n)\big)^2} \xrightarrow{P} 1, \quad n\to\infty. \quad (2.2.166)$$

On observing that the WLLN, (2.2.29) and (2.2.152) of Remark 2.2.7 as regards $\big(\sum_{i=1}^n(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$ imply, as $n\to\infty$,
$$\frac{\bar x^2}{n}\sum_{i=1}^n\big(u_i(1,n)-\bar u(1,n)\big)^2 = O_P(1)\,o_P(1) = o_P(1),$$
the proof of (2.2.166) follows from the Cauchy-Schwarz inequality and (2.2.152) for $\big(\sum_{i=1}^n(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{-1/2}$. Assume now that $\lim_{n\to\infty}\big(\overline{\xi^2}-(\bar\xi)^2\big) = \lim_{n\to\infty}S_\# = \infty$. We are to prove that (c) of Lemma 2.2.9 continues to be valid for

$$\frac{\sqrt{n}\,(\hat\alpha_{1n}-\alpha)_{L_{1n}(t)}}{\big(\sum_{i=1}^n(v_i(1,n)-\bar v(1,n))^2/(n-1)\big)^{1/2}}$$


\/n (y-x(3 - a)Lln(t) - y/n x 0 ln - p)Lln(t) 1/2

../EhWC1!") - v 'i h n ) ) 2\ 1/2 n)f J

y/nv'{\,n)Lln{t) + p™Lln{t) ' ^E?=1(t#l,n) ~^(l,n))2y /2

( E£=i(«i(l,n) - tf(Xn) )2/(ra - 1))1/2 V E?.i(v<(l,n) - v (l, n) )2 ) (2.2.167)

where $L_{1n}(t)$ and $v'_i(1,n)$ are as in (2.1.16) and (2.1.17), and $0\le t\le 1$. As $n\to\infty$, from (2.2.158)-(2.2.160) and the WLLN,

$$\sup_{0\le t\le 1}\big|\rho_{n,L_{1n}(t)}\big| = \sup_{0\le t\le 1}\big|\sqrt{n}\,\bar x\,(\hat\beta_{1n}-\beta)_{L_{1n}(t)}\big| = O_P(1), \quad (2.2.168)$$

while by (D), (2.2.29) and (2.2.152) as regards $\big(\sum_{i=1}^n(u_i(1,n)-\bar u(1,n))^2/(n-1)\big)^{-1/2}$,

r ~ 7 £(«i(l»«) - u(l,n) )2 = ■; ( 2^ ) _ U^1,n^ ^ = = ° p (2-2-169) 1 “ 1 i=l

The Cauchy-Schwarz inequality, (2.2.152) for $\big(\sum_{i=1}^n(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{-1/2}$ (case $b^{(1)}=b^{(2)}=0$) and (2.2.169) imply
$$\frac{\sum_{i=1}^n\big(v_i(1,n)-\bar v(1,n)\big)^2}{\sum_{i=1}^n\big(v'_i(1,n)-\bar v'(1,n)\big)^2} \xrightarrow{P} 1, \quad n\to\infty. \quad (2.2.170)$$

Finally, combining (2.2.167), (2.2.168) and (2.2.152) for $\big(\sum_{i=1}^n(v_i(1,n)-\bar v(1,n))^2/(n-1)\big)^{-1/2}$, (2.2.170) and (c) of Lemma 2.2.9 for $\sqrt{n}\,\bar v'(1,n)_{L_{1n}(t)}\big(\sum_{i=1}^n(v'_i(1,n)-\bar v'(1,n))^2/(n-1)\big)^{-1/2}$ (the respective vector $d$ is as in (2.2.112), $b^{(1)}=b^{(2)}=0$ and condition (2.2.62), i.e., $\operatorname{Var}(\delta-\beta\varepsilon)>0$, is satisfied on account of (C)), via (2.2.118) and (2.2.123), one concludes the proof of part (iii) for the process on the initial left-hand side of (2.2.167).


The proof for the MLSP's for $\alpha$ follows the same pattern. $\square$

Proof of Theorem 2.1.2c. Assuming that $\mu = 0$ in (2.1.3), for the process in (2.1.18) we have

V nM (1, n) (&ln $)[nt]

n — 2 /N y/n + Pin) fan — 0)[nt] = y/n([Syy,[nt] - Ad[nt]/n) - 2Sxyt[nt]P + (SXXi[nt} - 9[nt]/n)p2)

+ y/n((Sxx,[nt] ~ 9[nt]/n)0?n - p2) - 2Sxy,[nt}(Pin - P))

y/n26 ■:(* + & ) V nw ( 1, n)M + p£jnt,, 0 < t < 1, (2.2.171)

with $w_i(1,n)$ of (2.1.49). It is to be shown that, as $n\to\infty$,

sup VS((S1I,M-«[nt]/n)(a-^)-2SraiM](A „-«) + ^(A + & ) 0

Clearly, for (2.2.172) it suffices to verify that
$$\sqrt{n}\sup_{0\le t\le 1}\big|(S_{xx,[nt]}-\theta[nt]/n)(\hat\beta_{1n}^2-\beta^2) - 2S_{xy,[nt]}(\hat\beta_{1n}-\beta)\big| = o_P(1). \quad (2.2.173)$$
From (D),
$$\max_{1\le k\le n}\frac{k}{n}\,\big|\bar\xi_k - m\big| = o(1), \quad n\to\infty. \quad (2.2.174)$$

On using (2.2.174), (b) of Lemma 2.2.6 and (2.2.105) of Remark 2.2.6, as $n\to\infty$,

sup |«%e,[n*]| = sup I((£ - c£)(e - ce)) 0


sup |((£-m)e) 1+ sup |me[nt]| , if c = 0, o

n LOp( 1), (2.2.175)

where S# is in (E). Similarly,

sup |%,[nt]| = \ — Op( 1 ), sup |S'ee,[„t] - 0[nt]/n\ = ° Pj- \ o

sup \Sse\nt}-(Ant)/n\ = ° P9^ and sup \SSSi[nt}-X0[nt}/n = \ (2.2.176) o

y/n sup \(SXXt[nt] - 9[nt]/n)(P2n - p 2) - 2Sxy>int]0 ln - P)\ 0

+ (S££,[nt] -6[nt\/n) (Pin - P){Pin + P)

- 2<%,[nt](An - P) - 2Sse,{nt](Pln ~ P) j

$$= o_P(1).$$
The latter proves (2.2.173) and hence, also, (2.2.172). Next, we argue that $\sqrt{n}\,\bar w(1,n)_{[nt]}\big(\sum_{i=1}^n(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2}$ satisfies Lemma 2.2.9. Since for the vector $b$ of (2.2.115) corresponding to the respective $d$ of (2.2.112) we have $b^{(1)}=b^{(2)}=0$, condition (2.2.62) has to be verified. Suppose that (2.2.62) fails, i.e., that $\operatorname{Var}\big(\delta^2-\lambda\theta-2\beta\delta\varepsilon+\beta^2(\varepsilon^2-\theta)\big)=0$. Then $(\delta-\beta\varepsilon)^2 = \lambda\theta+\beta^2\theta$, $\delta-\beta\varepsilon = \big(\lambda\theta+\beta^2\theta\big)^{1/2}$ and, since $E(\delta-\beta\varepsilon)=0$, $\lambda\theta+\beta^2\theta=0$. The latter equality contradicts the positivity of $\Gamma$ of (2.1.3) (cf. (C)) and proves (2.2.62). Combining part (c) of Lemma


2.2.9 for $\sqrt{n}\,\bar w(1,n)_{[nt]}\big(\sum_{i=1}^n(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2}$, (2.2.153) implying $\big(\sum_{i=1}^n(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2} = O_P(1)$, $n\to\infty$, (2.2.171) and (2.2.172), we get (iii) of Theorem 2.1.2c for the process $\sqrt{n}\,M(1,n)(\hat\theta_{1n}-\theta)_{[nt]}\big(\sum_{i=1}^n(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2}$. Due to Remark 2.1.4 of Section 2.1.4, we also have (ii) and (i) of Theorem 2.1.2c for this process. Now, we are to establish Theorem 2.1.2c for $\sqrt{n}\,M(2,n)(\hat\theta_{2n}-\theta)_{[nt]}\big(\sum_{i=1}^n(w_i(2,n)-\bar w(2,n))^2/(n-1)\big)^{-1/2}$. On account of (2.1.52) of Remark 2.1.7 in Section 2.1.4,

/3=% = and &2 = r 2 + ^ = , (2.2.177) y n S # Yn<%£

where $S_\#$ is as in (E). Due to the lines in (2.2.42) and (2.2.177), for $(\hat\theta_{2n}-\theta)_{[nt]}$ of (2.1.33) we have

{9 In ^ ) [nt] [Syy fat] ^9[nt]/n)(^Sxx,[nt] 9[nt\/7l) (ffiry,[nt] fj>[nt\/7i)2 Syy — A 9

+ (j3 2 + —f= ) (^ 2( % “ <%,[nt]) + ^ 1 ,h )

(^WiM] X9[ni\/Ti) 2/3(iS'ayi[Ttt] 4~ (32{Sxxfat] 9\ni\/n) -t- Rg,[nt] Syy — A 9 (^W,[rat] X9\nt\/Ti) (iS'xx.jnt] 9\nt\/ii) (>S'a:^,[nt] fj,\nt]/n) Syy — A 9

P(S„ - A6) + ^3,[nt] %.M


where

Ri,[nt] = 2fi(Ses - + (Sgs - 6 - Sss,[nt] + 0[nt]/n), (2.2.179)

$$R_{2,[nt]} = -2(\hat\beta_{2n}-\beta)\big(S_{xy,[nt]}-\mu[nt]/n\big) + \big(\hat\beta_{2n}^2-\beta^2\big)\big(S_{xx,[nt]}-\theta[nt]/n\big), \quad (2.2.180)$$

^3,[nt] = (p P (*$€£~ )R-2,[nt\ + P ((*^3/y,[ni]— A0[n£]/n)

%P(Sxy,[nt] ~ + P (‘S'xi,[nt] — 0[nt]/n) + i?2,[nt])

- W { S v y ,[ n t\-n[nt]/n) + P 2 ( S xx>[nt] - 6[nt]/n) + R 2,[nt])j(Syy - A0)_1

(2.2.181)

and, similarly to (2.2.43),

R[nt] = -

+ (See,[nt) ~ 0[nt]/n)(Sss,[nt] ~ A6[nt]/n) - (^ £i[ntj - fj,[nt]/n)2

+ 2S^s,[nt]P(S£e,[nt} ~ 6\nt]/n) + 2Sze,[nt}(Ss8,[nt] ~ A0[nt\/n)

2

(2.2.182)

In view of (c) of Lemma 2.2.9 for the process $\sqrt{n}\,\bar w(2,n)_{[nt]}\big(\sum_{i=1}^n(w_i(2,n)-\bar w(2,n))^2/(n-1)\big)^{-1/2}$ (cf. the arguments for $\sqrt{n}\,\bar w(1,n)_{[nt]}\big(\sum_{i=1}^n(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2}$), (2.2.152) (case $b^{(1)}=b^{(2)}=0$) implying $\big(\sum_{i=1}^n(w_i(2,n)-\bar w(2,n))^2/(n-1)\big)^{-1/2} = O_P(1)$, $n\to\infty$, (2.2.29) and (2.2.178), to complete the proof of part (iii) of Theorem 2.1.2c, it remains to be shown that, as $n\to\infty$,

$$\frac{\sqrt{n}}{S_{yy}-\lambda\theta}\sup_{0\le t\le 1}\big|R_{[nt]}\big| = o_P(1) \quad (2.2.183)$$
and
$$\sqrt{n}\sup_{0\le t\le 1}\big|R_{3,[nt]}\big| = o_P(1). \quad (2.2.184)$$


From (2.2.29), (2.2.175) and (2.2.176), for $R_{[nt]}$ of (2.2.182), as $n\to\infty$,

_ ° r 0 ) _ „ m ~ " v T - 0p(1)’

while for R it\nt} in (2.2.179),

™p I^.M l = / ^ 0 , ( l ) + -j= 0 ,(1 ) = y ® 0 ,(1 ), (2.2.185)

with $S_\#$ as in (E) and, by (2.2.175)-(2.2.177), similarly to (2.2.173), for $R_{2,[nt]}$ of (2.2.180),

SUP |R2,M | = (2.2.186) o<*

$$\sup_{0\le t\le 1}\big|\beta^{-2}\big(S_{yy,[nt]}-\lambda\theta[nt]/n\big) - 2\beta^{-1}\big(S_{xy,[nt]}-\mu[nt]/n\big) + \big(S_{xx,[nt]}-\theta[nt]/n\big)\big| = \frac{O_P(1)}{\sqrt{n}}.$$

y/n sup |i?3 [nt] | 0

+7S (S«0'(1)+^ (1)) +

= 0p( « f ( f 0p« + ^ 5 < * » ) = ^ = Ml).

Parts (ii) and (i) of Theorem 2.1.2c follow from part (iii) of Theorem 2.1.2c and Remark 2.1.4 of Section 2.1.4.


The proof of Theorem 2.1.3c for $(\widehat{\lambda\theta}_{3n}-\lambda\theta)_{[nt]}$ of (2.1.34) goes the same way as that for $(\hat\theta_{2n}-\theta)_{[nt]}$, and hence is omitted here. $\square$

Proof of Theorem 2.1.3a. First, we will show that, as $n\to\infty$,
$$\frac{\sum_{i=1}^n\big(\hat u_i(j,n)-\bar{\hat u}(j,n)\big)^2}{\sum_{i=1}^n\big(u_i(j,n)-\bar u(j,n)\big)^2} \xrightarrow{P} 1, \quad j = 1,2,3. \quad (2.2.189)$$

To establish (2.2.189), we need to verify that, as n —> oo,

a i f e i t - M 2 = o(l), (2.2.190)

-°P(1) ^ ^ ~0P(1)’ (2‘2-191) and that

! C iL l( 5 i,5e — S s e )2 _ £ £ = 1{s i,SS ~ S g s)2 _ /.% — — -op(1)’ — — - op(1)

and See^ = oP(l). (2.2.192)

By part (a) of Lemma 2.2.1 with $a \in [\tfrac{1}{2}, 1)$ and (F),

T S - M x - S t t ? , 2E?,1fe~cB4 , 2nS& „ 8M ,<,<.g 2 ^ “(Et.te-cfjf "2s|£ - n —

that proves (2.2.190). Similarly, also by the WLLN (under (C)) and (2.2.175),

2 g ,1((6-cf)(e,-cg));! | 2 n S j , "2% “ n£ii(&-c«)2 n*Sttt .ma>d(1)

$$\le o_P(1) + o_P(1) = o_P(1), \quad n\to\infty,$$
i.e., the second statement in (2.2.191) is valid. The first statement of (2.2.191) is handled in the same way. Due to similar arguments, (2.2.192) is proved on the


example of $\sum_{i=1}^n(s_{i,\delta\delta}-S_{\delta\delta})^2\,(n^2S_\#)^{-1}$. By the WLLN (under (C)),

E U (si,5 6 -S s S)2 2EF=i (Si-c6)* 2nS2ss n2% ~ n2% n2% . w nc(5)A . 0P( 1) 0 P( 1) ..

Convergence (2.2.189) for $j = 2$ follows from (2.2.152) of Remark 2.2.7, implying that $\sum_{i=1}^n\big(u_i(2,n)-\bar u(2,n)\big)^2/(n-1) \xrightarrow{P} \text{const}$, and the Cauchy-Schwarz inequality, since by (2.2.177) and (2.2.190)-(2.2.192),

EU ((^(2, n) ~ 5(2, n)) - (uj(2, n) - u(2, n) ))2 n % Hi=i(Si,xy ~ Sxy)2{P ~ 02n)2 ^ f-,\Ei=l(si,xy~SXy)2 = = 0p(1) ^ ------„ . ,,, HU ((««£ - %)V + (*«* S(sf- + (S«t - S{,) 2/J2 + - ft.)2)

- 0p(1) — ^ ------— ~ ~ ~ ------= oP(l). (2.2.193)

The respective proofs of (2.2.189) for j = 1 and j = 3 go the same way and are based on (2.2.190)-(2.2.192), with the additional note for j = 1 that /3ln is a consistent estimator of j3 and

$$\frac{\lambda-\beta^2}{\beta} - \frac{\lambda-\hat\beta_{1n}^2}{\hat\beta_{1n}} = \frac{\lambda+\hat\beta_{1n}\beta}{\hat\beta_{1n}\beta}\,\big(\hat\beta_{1n}-\beta\big).$$
Finally, (2.2.189), (2.2.118) and (2.2.123) allow us to replace $u_i(j,n)$ in (i)-(iii) of Theorem 2.1.2a with the respective $\hat u_i(j,n)$, $j = 1,2,3$. $\square$

Proof of Theorem 2.1.3b. Similarly to the proof of Theorem 2.1.3a, this proof reduces to establishing the convergence
$$\frac{\sum_{i=1}^n\big(\hat v_i(j,n)-\bar{\hat v}(j,n)\big)^2}{\sum_{i=1}^n\big(v_i(j,n)-\bar v(j,n)\big)^2} \xrightarrow{P} 1, \quad n\to\infty. \quad (2.2.194)$$
All the $\hat v_i(j,n)$ possess similar forms and hence, (2.2.194) is shown for $j = 1$ only.


From (2.1.52) of Remark 2.1.7 in Section 2.1.4 and the WLLN (under (C)),
$$\frac{1}{n}\sum_{i=1}^n\big(\beta-\hat\beta_{1n}\big)^2(x_i-\bar x)^2 \le 2\big(\beta-\hat\beta_{1n}\big)^2\Bigg(\frac{1}{n}\sum_{i=1}^n(\xi_i-\bar\xi)^2 + \frac{1}{n}\sum_{i=1}^n(\varepsilon_i-\bar\varepsilon)^2\Bigg)$$
$$= \frac{1}{nS_\#}\Big(S_\#\,O_P(1) + O_P(1)\Big) = o_P(1). \quad (2.2.195)$$

Due to (2.2.29) and (E), ° P '1'1 (2.2.196)

while (2.2.189) for $j = 1$ and the version of (2.2.193) for $\hat u_i(1,n)$ and $u_i(1,n)$ yield

$$\frac{1}{n}\sum_{i=1}^n\Big(\big(\hat u_i(1,n)-\bar{\hat u}(1,n)\big) - \big(u_i(1,n)-\bar u(1,n)\big)\Big)^2 = S_\#\,o_P(1). \quad (2.2.197)$$
Thus, by (2.2.195)-(2.2.197),

$$\frac{1}{n}\sum_{i=1}^n\Big(\big(\hat v_i(1,n)-\bar{\hat v}(1,n)\big) - \big(v_i(1,n)-\bar v(1,n)\big)\Big)^2$$
$$= \frac{1}{n}\sum_{i=1}^n\Big(\big(\beta-\hat\beta_{1n}\big)(x_i-\bar x) - \bar x\Big(\big(\hat u_i(1,n)-\bar{\hat u}(1,n)\big) - \big(u_i(1,n)-\bar u(1,n)\big)\Big)\Big)^2$$
$$\le \frac{2}{n}\sum_{i=1}^n\big(\beta-\hat\beta_{1n}\big)^2(x_i-\bar x)^2 + 2\bar x^2\,\frac{1}{n}\sum_{i=1}^n\Big(\big(\hat u_i(1,n)-\bar{\hat u}(1,n)\big) - \big(u_i(1,n)-\bar u(1,n)\big)\Big)^2$$
$$= o_P(1) + o_P(1) = o_P(1), \quad (2.2.198)$$

where $S_\#$ is as in (E). Now, (2.2.198) and the Cauchy-Schwarz inequality result in (2.2.194) for $j = 1$. $\square$

Proof of Theorem 2.1.3c. Analogously to the proofs of Theorems 2.1.3a and 2.1.3b, it suffices to show
$$\frac{\sum_{i=1}^n\big(\hat w_i(j,n)-\bar{\hat w}(j,n)\big)^2}{\sum_{i=1}^n\big(w_i(j,n)-\bar w(j,n)\big)^2} \xrightarrow{P} 1, \quad n\to\infty, \ j = 1,2,3. \quad (2.2.199)$$
The similar structures of the $w_i(j,n)$ allow us to demonstrate the proof of (2.2.199) on the example of $j = 1$.


By (2.2.153) of Remark 2.2.7 (case $b^{(1)}=b^{(2)}=0$), implying that $\big(\sum_{i=1}^n(w_i(1,n)-\bar w(1,n))^2/(n-1)\big)^{-1/2} = O_P(1)$, and the Cauchy-Schwarz inequality, the proof reduces to establishing
$$\frac{1}{n}\sum_{i=1}^n\Big(\big(\hat w_i(1,n)-\bar{\hat w}(1,n)\big) - \big(w_i(1,n)-\bar w(1,n)\big)\Big)^2 = o_P(1), \quad n\to\infty. \quad (2.2.200)$$
Now, (2.2.200) results from (2.1.52) of Remark 2.1.7 in Section 2.1.4 and (2.2.190)-(2.2.192) as follows:

“ (wi{l,ri) - w(l,n)))2 n i=l

= “ S ( ” 2(A" “P)(Si,xy - S*y) + (A n - P2)(si,xx - Sxxj)2 H j=l

= ((*«£-S«f)(An-/3)2+2(s«.-S«.)(A„-/J)2 U i=l +(5j.ec ~ See) (A n—P) (An + P)

~ 2(Sitf — % )(A n ~ P) ~ 2(Sj.Je ~ Ae) (An ~ /?))

''' O Ml ^ = l(S*>K — Ac)2 . r\ / 1 \ Sr=l(si,C£ ~ Ae)2 . /-i\ SiLl(si,ee ~ A e )' -0p(1) ^ + 0p(1) ^ + 0p(1) ^ ------

+ 0p(1) ^ + °p(1) 0p(1)'

2.2.3 Appendix

This Appendix amounts to a crucial step in the key auxiliary Lemma 2.2.8 of Section 2.2.2. Namely, it provides computer code written in "Maple" that verifies the factorization in (2.2.83) (the first equality) and that in (2.2.95). First, the first equality of (2.2.83), i.e.,
$$\det(\operatorname{Cov}\gamma_n) = (1-d_n^2)^2\det(\Gamma)\det\big(\operatorname{Cov}\gamma_n^{(3,7)}\big), \quad (2.2.201)$$


is checked, with $\gamma_n$ of (2.2.73), $d_n$ of (2.2.75) and $\Gamma$ of (2.1.3), i.e., respectively with
$$\gamma_n = \Bigg(\frac{\sum_{i=1}^n(\xi_i-cm)\delta_i}{\big(\sum_{j=1}^n(\xi_j-cm)^2\big)^{1/2}},\ \frac{\sum_{i=1}^n(\xi_i-cm)\varepsilon_i}{\big(\sum_{j=1}^n(\xi_j-cm)^2\big)^{1/2}},\ \sqrt{n}\,\bar\delta,\ \sqrt{n}\,\bar\varepsilon,\ \sqrt{n}\big(\overline{\delta\varepsilon}-\mu\big),\ \sqrt{n}\big(\overline{\delta^2}-\lambda\theta\big),\ \sqrt{n}\big(\overline{\varepsilon^2}-\theta\big)\Bigg), \quad (2.2.202)$$
where $c$ is as in (2.1.7),

$$\operatorname{Cov}\gamma_n = \begin{pmatrix}
\lambda\theta & \mu & d_n\lambda\theta & d_n\mu & d_nm_{21} & d_nm_{30} & d_nm_{12}\\
\mu & \theta & d_n\mu & d_n\theta & d_nm_{12} & d_nm_{21} & d_nm_{03}\\
d_n\lambda\theta & d_n\mu & \lambda\theta & \mu & m_{21} & m_{30} & m_{12}\\
d_n\mu & d_n\theta & \mu & \theta & m_{12} & m_{21} & m_{03}\\
d_nm_{21} & d_nm_{12} & m_{21} & m_{12} & m_{22}-\mu^2 & m_{31}-\lambda\theta\mu & m_{13}-\theta\mu\\
d_nm_{30} & d_nm_{21} & m_{30} & m_{21} & m_{31}-\lambda\theta\mu & m_{40}-(\lambda\theta)^2 & m_{22}-\lambda\theta^2\\
d_nm_{12} & d_nm_{03} & m_{12} & m_{03} & m_{13}-\theta\mu & m_{22}-\lambda\theta^2 & m_{04}-\theta^2
\end{pmatrix} \quad (2.2.203)$$

and
$$m_{ij} = E\big(\delta^i\varepsilon^j\big), \quad i,j = 0,\dots,4, \quad (2.2.204)$$
$$d_n = \frac{\bar\xi-cm}{\big(\overline{(\xi-cm)^2}\big)^{1/2}}, \quad (2.2.205)$$
and $\Gamma = \operatorname{Cov}(\delta,\varepsilon) = \begin{pmatrix}\lambda\theta & \mu\\ \mu & \theta\end{pmatrix}$,

and with the subvector $\gamma_n^{(3,7)}$ of the vector $\gamma_n$ from (2.2.202), where $\gamma_n^{(3,7)}$ is assumed to be full. Then, (2.2.95), namely

$$\det(\operatorname{Cov}\tilde\gamma_n) = (1-d_n^2)^2\det(\Gamma)\det\big(\operatorname{Cov}\tilde\gamma_n^{(3,7)}\big), \quad (2.2.206)$$

is verified, with 7„ as in (2.2.93), i.e., with

$$\gamma_n = \big( \gamma_n^{(1,2)\,\prime},\ \gamma_n^{(3,7)\,\prime} \big)', \tag{2.2.207}$$

where the tail $\gamma_n^{(3,7)}$ is defined as follows. Vector $\gamma_n^{(3,7)}$ has the first two components $\sqrt{n}\,\bar\delta$ and $\sqrt{n}\,\bar\varepsilon$, and the rest of its components (at most two), if any, compose a subvector of the vector $\big(\sqrt{n}(\overline{\delta\varepsilon}-\mu),\ \sqrt{n}(\overline{\delta^2}-\lambda\theta),\ \sqrt{n}(\overline{\varepsilon^2}-\theta)\big)'$, in the same order in which these components appear in $\gamma_n$ of (2.2.202). Hence, (2.2.206) is shown for seven


possible matrices corresponding to $\operatorname{Cov}\gamma_n^{(3,7)}$. The code below (lines starting with >) is supplied with preliminary comments inserted into { } that follow the % sign.

> with(linalg);

%{ Inputting Cov $\gamma_n$, denoted by An, with l, t, mu, dn and mij standing respectively for $\lambda$, $\theta$, $\mu$, $d_n$ and the $m_{ij}$ of (2.2.204), $i, j = 0, \ldots, 4$. }
> An:=matrix([
> [l*t,mu,dn*l*t,dn*mu,dn*m21,dn*m30,dn*m12],
> [mu,t,dn*mu,dn*t,dn*m12,dn*m21,dn*m03],
> [dn*l*t,dn*mu,l*t,mu,m21,m30,m12],
> [dn*mu,dn*t,mu,t,m12,m21,m03],
> [dn*m21,dn*m12,m21,m12,m22-mu^2,m31-l*t*mu,m13-t*mu],
> [dn*m30,dn*m21,m30,m21,m31-l*t*mu,m40-(l*t)^2,m22-l*t^2],
> [dn*m12,dn*m03,m12,m03,m13-t*mu,m22-l*t^2,m04-t^2] ]);

%{ For shortness, the display of An that normally follows the input is cut here. }

%{ Inputting Cov $\gamma_n^{(3,7)}$, denoted by Bn. }
> Bn:=matrix([
> [l*t,mu,m21,m30,m12],
> [mu,t,m12,m21,m03],
> [m21,m12,m22-mu^2,m31-l*t*mu,m13-t*mu],
> [m30,m21,m31-l*t*mu,m40-(l*t)^2,m22-l*t^2],
> [m12,m03,m13-t*mu,m22-l*t^2,m04-t^2] ]);


%{ Verifying (2.2.201), i.e., that $\det(\operatorname{Cov}\gamma_n)/\det(\operatorname{Cov}\gamma_n^{(3,7)}) = (1-d_n^2)^2\det(\Gamma)$ $\big(= (d_n-1)^2(d_n+1)^2(\lambda\theta^2-\mu^2)\big)$. }
> normal(factor(det(An))/factor(det(Bn)));

(dn - 1)^2 (dn + 1)^2 (l t^2 - mu^2)
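As a cross-check of the symbolic Maple computation, the identity (2.2.201) can also be tested numerically with exact rational arithmetic. The sketch below is an editorial addition, not part of the original appendix; plain Python is used instead of Maple so the check runs without a Maple installation, and the variable names simply mirror the Maple input above. Since both sides of the factorization are polynomials in l, t, mu, dn and the mij, agreement at random rational points is strong, though of course not conclusive, numerical evidence for the symbolic identity.

```python
from fractions import Fraction as F
import random

def det(m):
    """Exact determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

random.seed(0)
def rnd():
    return F(random.randint(-9, 9), random.randint(1, 9))

# Random rational values for l, t, mu, dn and the moments mij of (2.2.204).
l, t, mu, dn = rnd(), rnd(), rnd(), rnd()
m21, m30, m12, m03, m22, m31, m13, m40, m04 = [rnd() for _ in range(9)]

# Cov(gamma_n), entered exactly as in the Maple matrix An above.
An = [
    [l*t,    mu,     dn*l*t, dn*mu, dn*m21,       dn*m30,         dn*m12],
    [mu,     t,      dn*mu,  dn*t,  dn*m12,       dn*m21,         dn*m03],
    [dn*l*t, dn*mu,  l*t,    mu,    m21,          m30,            m12],
    [dn*mu,  dn*t,   mu,     t,     m12,          m21,            m03],
    [dn*m21, dn*m12, m21,    m12,   m22 - mu**2,  m31 - l*t*mu,   m13 - t*mu],
    [dn*m30, dn*m21, m30,    m21,   m31 - l*t*mu, m40 - (l*t)**2, m22 - l*t**2],
    [dn*m12, dn*m03, m12,    m03,   m13 - t*mu,   m22 - l*t**2,   m04 - t**2],
]
# Cov(gamma_n^(3,7)) is the trailing 5x5 block of An (the Maple matrix Bn).
Bn = [row[2:] for row in An[2:]]

# (2.2.201): det An = (dn - 1)^2 (dn + 1)^2 (l t^2 - mu^2) det Bn.
assert det(An) == (dn - 1)**2 * (dn + 1)**2 * (l * t**2 - mu**2) * det(Bn)
print("(2.2.201) holds at these rational values")
```

Note that Bn sits inside An literally as the block obtained by deleting the first two (dn-weighted) rows and columns, which is how $\operatorname{Cov}\gamma_n^{(3,7)}$ sits inside $\operatorname{Cov}\gamma_n$.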


%{ Similarly checking (2.2.206), i.e., $\det(\operatorname{Cov}\gamma_n)/\det(\operatorname{Cov}\gamma_n^{(3,7)}) = (1-d_n^2)^2\det(\Gamma)$. The various Cov $\gamma_n$ and Cov $\gamma_n^{(3,7)}$ are denoted by Ain and Bin within the code, $i = 1, \ldots, 7$. As before, the code is shortened by omitting the displays of the matrices that automatically follow their input. }

%{ Inputting Cov $\gamma_n$ and Cov $\gamma_n^{(3,7)}$ that correspond to $\gamma_n^{(3,7)} = \big(\sqrt{n}\,\bar\delta, \sqrt{n}\,\bar\varepsilon, \sqrt{n}(\overline{\delta\varepsilon}-\mu), \sqrt{n}(\overline{\delta^2}-\lambda\theta)\big)'$ and checking (2.2.206). }
> A1n:=matrix([
> [l*t,mu,dn*l*t,dn*mu,dn*m21,dn*m30],
> [mu,t,dn*mu,dn*t,dn*m12,dn*m21],
> [dn*l*t,dn*mu,l*t,mu,m21,m30],
> [dn*mu,dn*t,mu,t,m12,m21],
> [dn*m21,dn*m12,m21,m12,m22-mu^2,m31-l*t*mu],
> [dn*m30,dn*m21,m30,m21,m31-l*t*mu,m40-(l*t)^2] ]);
> B1n:=matrix([
> [l*t,mu,m21,m30],
> [mu,t,m12,m21],
> [m21,m12,m22-mu^2,m31-l*t*mu],
> [m30,m21,m31-l*t*mu,m40-(l*t)^2] ]);
> normal(factor(det(A1n))/factor(det(B1n)));

(dn - 1)^2 (dn + 1)^2 (l t^2 - mu^2)

%{ Inputting Cov $\gamma_n$ and Cov $\gamma_n^{(3,7)}$ that correspond to $\gamma_n^{(3,7)} = \big(\sqrt{n}\,\bar\delta, \sqrt{n}\,\bar\varepsilon, \sqrt{n}(\overline{\delta\varepsilon}-\mu), \sqrt{n}(\overline{\varepsilon^2}-\theta)\big)'$ and checking (2.2.206). }
> A2n:=matrix([
> [l*t,mu,dn*l*t,dn*mu,dn*m21,dn*m12],
> [mu,t,dn*mu,dn*t,dn*m12,dn*m03],
> [dn*l*t,dn*mu,l*t,mu,m21,m12],
> [dn*mu,dn*t,mu,t,m12,m03],
> [dn*m21,dn*m12,m21,m12,m22-mu^2,m13-t*mu],
> [dn*m12,dn*m03,m12,m03,m13-t*mu,m04-t^2] ]);
> B2n:=matrix([
> [l*t,mu,m21,m12],
> [mu,t,m12,m03],
> [m21,m12,m22-mu^2,m13-t*mu],
> [m12,m03,m13-t*mu,m04-t^2] ]);
> normal(factor(det(A2n))/factor(det(B2n)));

(dn - 1)^2 (dn + 1)^2 (l t^2 - mu^2)

%{ Inputting Cov $\gamma_n$ and Cov $\gamma_n^{(3,7)}$ that correspond to $\gamma_n^{(3,7)} = \big(\sqrt{n}\,\bar\delta, \sqrt{n}\,\bar\varepsilon, \sqrt{n}(\overline{\delta^2}-\lambda\theta), \sqrt{n}(\overline{\varepsilon^2}-\theta)\big)'$ and checking (2.2.206). }
> A3n:=matrix([
> [l*t,mu,dn*l*t,dn*mu,dn*m30,dn*m12],
> [mu,t,dn*mu,dn*t,dn*m21,dn*m03],
> [dn*l*t,dn*mu,l*t,mu,m30,m12],
> [dn*mu,dn*t,mu,t,m21,m03],
> [dn*m30,dn*m21,m30,m21,m40-l^2*t^2,m22-l*t^2],
> [dn*m12,dn*m03,m12,m03,m22-l*t^2,m04-t^2] ]);
> B3n:=matrix([
> [l*t,mu,m30,m12],
> [mu,t,m21,m03],
> [m30,m21,m40-l^2*t^2,m22-l*t^2],
> [m12,m03,m22-l*t^2,m04-t^2] ]);
> normal(factor(det(A3n))/factor(det(B3n)));

(dn - 1)^2 (dn + 1)^2 (l t^2 - mu^2)


%{ Inputting Cov $\gamma_n$ and Cov $\gamma_n^{(3,7)}$ that correspond to $\gamma_n^{(3,7)} = \big(\sqrt{n}\,\bar\delta, \sqrt{n}\,\bar\varepsilon, \sqrt{n}(\overline{\delta\varepsilon}-\mu)\big)'$ and checking (2.2.206). }
> A4n:=matrix([
> [l*t,mu,dn*l*t,dn*mu,dn*m21],
> [mu,t,dn*mu,dn*t,dn*m12],
> [dn*l*t,dn*mu,l*t,mu,m21],
> [dn*mu,dn*t,mu,t,m12],
> [dn*m21,dn*m12,m21,m12,m22-mu^2] ]);
> B4n:=matrix([
> [l*t,mu,m21],
> [mu,t,m12],
> [m21,m12,m22-mu^2] ]);
> normal(factor(det(A4n))/factor(det(B4n)));

(dn - 1)^2 (dn + 1)^2 (l t^2 - mu^2)

%{ Inputting Cov $\gamma_n$ and Cov $\gamma_n^{(3,7)}$ that correspond to $\gamma_n^{(3,7)} = \big(\sqrt{n}\,\bar\delta, \sqrt{n}\,\bar\varepsilon, \sqrt{n}(\overline{\delta^2}-\lambda\theta)\big)'$ and checking (2.2.206). }
> A5n:=matrix([
> [l*t,mu,dn*l*t,dn*mu,dn*m30],
> [mu,t,dn*mu,dn*t,dn*m21],
> [dn*l*t,dn*mu,l*t,mu,m30],
> [dn*mu,dn*t,mu,t,m21],
> [dn*m30,dn*m21,m30,m21,m40-l^2*t^2] ]);
> B5n:=matrix([
> [l*t,mu,m30],
> [mu,t,m21],
> [m30,m21,m40-l^2*t^2] ]);
> normal(factor(det(A5n))/factor(det(B5n)));

(dn - 1)^2 (dn + 1)^2 (l t^2 - mu^2)

%{ Inputting Cov $\gamma_n$ and Cov $\gamma_n^{(3,7)}$ that correspond to $\gamma_n^{(3,7)} = \big(\sqrt{n}\,\bar\delta, \sqrt{n}\,\bar\varepsilon, \sqrt{n}(\overline{\varepsilon^2}-\theta)\big)'$ and checking (2.2.206). }
> A6n:=matrix([
> [l*t,mu,dn*l*t,dn*mu,dn*m12],
> [mu,t,dn*mu,dn*t,dn*m03],
> [dn*l*t,dn*mu,l*t,mu,m12],
> [dn*mu,dn*t,mu,t,m03],
> [dn*m12,dn*m03,m12,m03,m04-t^2] ]);
> B6n:=matrix([
> [l*t,mu,m12],
> [mu,t,m03],
> [m12,m03,m04-t^2] ]);
> normal(factor(det(A6n))/factor(det(B6n)));

(dn - 1)^2 (dn + 1)^2 (l t^2 - mu^2)

%{ Inputting Cov $\gamma_n$ and Cov $\gamma_n^{(3,7)}$ that correspond to $\gamma_n^{(3,7)} = \big(\sqrt{n}\,\bar\delta, \sqrt{n}\,\bar\varepsilon\big)'$ and checking (2.2.206). }
> A7n:=matrix([
> [l*t,mu,dn*l*t,dn*mu],
> [mu,t,dn*mu,dn*t],
> [dn*l*t,dn*mu,l*t,mu],
> [dn*mu,dn*t,mu,t] ]);
> B7n:=matrix([
> [l*t,mu],
> [mu,t] ]);
> normal(factor(det(A7n))/factor(det(B7n)));

(dn - 1)^2 (dn + 1)^2 (l t^2 - mu^2)
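All seven verifications return the same factor because each pair (Ain, Bin) shares one block structure: with $\Gamma$ the 2 x 2 block in the upper-left corner, M a 2 x k cross block and C a k x k tail block (k = 0, 1, 2), Ain has the form ((Γ, dnΓ, dnM), (dnΓ, Γ, M), (dnM', M', C)) and Bin = ((Γ, M), (M', C)); subtracting dn times the second block row from the first reduces the determinant to det((1 - dn^2)Γ) det(Bin). The sketch below, an editorial illustration rather than part of the thesis (plain Python, ad hoc names, exact rational arithmetic), checks this block identity for all tail sizes k = 0, ..., 3, which covers the seven reduced cases and the full pair (An, Bn) at once.

```python
from fractions import Fraction as F
import random

def det(m):
    """Exact determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

random.seed(1)
def rnd():
    return F(random.randint(-9, 9), random.randint(1, 9))

for k in range(4):  # k = size of the tail block; k = 0, 1, 2 give the seven reduced cases
    G = [[rnd() for _ in range(2)] for _ in range(2)]   # 2x2 block Gamma
    M = [[rnd() for _ in range(k)] for _ in range(2)]   # 2xk cross block
    C = [[rnd() for _ in range(k)] for _ in range(k)]   # kxk tail block
    d = rnd()                                           # scalar d_n
    # A = [[G, dG, dM], [dG, G, M], [dM', M', C]], as in the Maple matrices Ain.
    A = ([[G[i][0], G[i][1], d * G[i][0], d * G[i][1]] + [d * x for x in M[i]]
          for i in range(2)]
         + [[d * G[i][0], d * G[i][1], G[i][0], G[i][1]] + M[i] for i in range(2)]
         + [[d * M[0][r], d * M[1][r], M[0][r], M[1][r]] + C[r] for r in range(k)])
    # B = [[G, M], [M', C]], as in the Maple matrices Bin.
    B = ([[G[i][0], G[i][1]] + M[i] for i in range(2)]
         + [[M[0][r], M[1][r]] + C[r] for r in range(k)])
    assert det(A) == (1 - d**2)**2 * det(G) * det(B)
print("block factorization verified for k = 0, 1, 2, 3")
```

Since the identity holds for arbitrary blocks G, M, C and scalar d, it explains in one stroke why every one of the Maple quotients above factors as (1 - dn^2)^2 det(Γ).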

Epilogue

In conclusion of this thesis, we outline some works that are immediately related to it and are presently in progress by the author. Namely, we address items Martsynyuk [46]-[49] in our Bibliography, most of which have already been briefly mentioned in the course of the thesis.

Work [46] is related to a problem described in part (a) of Remark 2.1.9 of Chapter 2, and is concerned with noting that not all the weak invariance principles of Lemma 2.1.3 of Chapter 2 are yet completely free of unknown parameters, in contrast to those of the corresponding Lemma 1.1.3 of Chapter 1. Namely, the weak invariance principles for the key processes $(\hat\beta_{jn} - \beta)K_{jn}(t)$ and $(\hat\alpha_{jn} - \alpha)L_{jn}(t)$, under $\lim_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big) < \infty$, continue to contain some unknown parameters in the time functions $K_{jn}(t)$ and $L_{jn}(t)$, $j = 1, 2, 3$. In [46], via first obtaining a general result on invariance principles by way of self-randomization and Studentization for independent nonidentically distributed random variables, we eliminate all unknown parameters from $K_{jn}(t)$ and $L_{jn}(t)$, $j = 1, 2, 3$, and thus achieve completely data-based forms for the corresponding weak invariance principles that can in turn be useful for some applications.

Multivariate CLT's via Studentization in the context of FEIVM's of Chapter 2, which were originally planned to be a part of Chapter 2 to match their analogues for SEIVM's of Chapter 1, are left for more thorough investigation in [47].

In Chapter 1 we introduced a new type of SEIVM's, namely those with explanatory variables having an infinite variance. In [48] we revisit special robustness features of such models in view of the asymptotic theory of Chapter 1 (cf., e.g., Observation 1.1.1, part (e) of Remark 1.1.6, and the concluding lines of the second subsection of Section 1.1.5 in Chapter 1), and continue our investigations of the formal connection and nearness of such SEIVM's to regression models along the lines of Remark 1.1.6 of Chapter 1. FEIVM's as in Chapter 2 with $\lim_{n\to\infty}\big(\overline{\xi^2} - (\bar\xi)^2\big) = \infty$, as companions to SEIVM's of Chapter 1 with $\operatorname{Var}\xi = \infty$, are then closely examined in the same spirit. It would seem that, while the idea of closeness of regression models and EIVM's has always been attractive to statisticians in view of the simpler and well-available results for regressions, developing this idea on a formal level under appropriate conditions was first initiated in this thesis and is further explored in [48].

In [49] we consider processes for the slope and intercept of FEIVM's of Chapter 2 that are defined differently from those of Section 2.1.2 of Chapter 2. Namely, they are fashioned after Renyi's approach in [57], a la Csaki [13], to studying weighted versions of the classical Kolmogorov-Smirnov statistics. Convergence in distribution of their special sup-functionals is established.

Naturally, the problems considered in the works discussed in this Epilogue represent only a few of the various immediate lines of research this thesis can lead to, and reflect the author's current scientific interests.

Bibliography

[1] Adcock, R.J. (1877). Note on the method of least squares. The Analyst 4 183-184.

[2] Adcock, R.J. (1878). A problem in least squares. The Analyst 5 53-54.

[3] Anderson, T.W. (1976). Estimation of linear functional relationships: approximate distributions and connections with simultaneous equations in econometrics (with discussions). J. R. Statist. Soc. B 38 1-36.

[4] Anderson, T.W. (1984). Estimating linear statistical relationships. Ann. Statist. 12 1-45.

[5] Babu, G.J. and Bai, Z.D. (1992). Edgeworth expansions for error-in-variables models. J. Multivariate Anal. 42 226-244.

[6] Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.

[7] Carroll, R.J., Ruppert, D. and Stefanski, L.A. (1995). Measurement Error in Nonlinear Models. Chapman & Hall, London.

[8] Casella, G. and Berger, R.L. (1990). Statistical Inference. Wadsworth & Brooks/Cole, Pacific Grove, CA.

[9] Cheng, C.-L. and Tsai, C.-L. (1995). Estimating linear measurement error models via M-estimators. Symposia Gaussiana: Proceedings of Second Gauss Symposium, Conference B: Statistical Sciences (Mammitzsch, V. and Schneeweiss, H., eds.), pp. 247-259. Walter de Gruyter, Berlin.

[10] Cheng, C.-L. and Van Ness, J.W. (1991). On the unreplicated ultrastructural model. Biometrika 78 442-445.

[11] Cheng, C.-L. and Van Ness, J.W. (1994). On estimating linear relationship when both variables are subject to errors. J. R. Statist. Soc. B 56 167-183.

[12] Cheng, C.-L. and Van Ness, J.W. (1999). Statistical Regression with Measurement Error. Arnold, London.

[13] Csaki, E. (1977). Investigations concerning the empirical distribution function. Magyar Tud. Akad. Mat. Fiz. Oszt. Kozl. 22 239-327.

[14] Csorgo, M. and Revesz, P. (1981). Strong Approximations in Probability and Statistics. Akademiai Kiado, Budapest-Academic Press, New York.

[15] Csorgo, M., Szyszkowicz, B. and Wang, Q. (2001). Donsker's theorem and weighted approximations for self-normalized partial sums processes. Technical Report Series of the Laboratory for Research in Statistics and Probability, 360-October 2001, Carleton University-University of Ottawa.

[16] Csorgo, M., Szyszkowicz, B. and Wang, Q. (2003). Donsker's theorem for self-normalized partial sums processes. Ann. Probab. 31 1228-1240.

[17] Csorgo, M., Szyszkowicz, B. and Wang, Q. (2004). On weighted approximations and strong limit theorems for self-normalized partial sums processes. Asymptotic Methods in Stochastics: Festschrift for Miklos Csorgo (Horvath, L. and Szyszkowicz, B., eds.), pp. 489-521. Fields Institute Communications, Toronto.

[18] Feller, W. (1976). An Introduction to Probability Theory and Its Applications. Vol. 2, 2nd ed. Wiley, New York.


[19] Fuller, W.A. (1987). Measurement Error Models. Wiley, New York.

[20] Gine, E., Gotze, F. and Mason, D.M. (1997). When is the Student t-statistic asymptotically standard normal? Ann. Probab. 25 1514-1531.

[21] Gleser, L.J. (1981). Estimation in a multivariate ‘error-in-variables’ regression model: large sample results. Ann. Statist. 9 24-44.

[22] Gleser, L.J. (1983). Functional, structural and ultrastructural error-in-variables models. American Statistical Association Proceedings of the Business and Economic Statistics Section, pp. 57-66. American Statistical Assoc., Alexandria, VA.

[23] Gleser, L.J. (1987). Confidence intervals for the slope in a linear error-in-variables regression model. Advances in Multivariate Statistical Analysis (Gupta, R., ed.), pp. 85-109. D. Reidel, Dordrecht.

[24] Gleser, L.J. (1989). Commentary on "Indoor air pollution and pulmonary performance: investigating errors in exposure assessment". Statistics in Medicine 8 1127-1131.

[25] Gleser, L.J. (1991). Measurement error models. Chemometrics and Intelligent Laboratory Systems 10 45-57.

[26] Gleser, L.J. (1992). The importance of assessing measurement reliability in multivariate regression. J. Amer. Statist. Assoc. 87 696-707.

[27] Gleser, L.J. and Hwang, J.T. (1987). The nonexistence of 100(1 - α)% confidence sets of finite expected diameter in error-in-variables and related models. Ann. Statist. 15 1351-1362.

[28] Gnedenko, B.V. and Kolmogorov, A.N. (1954). Limit Distributions for Sums of Independent Random Variables. Addison-Wesley, Reading, MA.


[29] Hahn, M.G. and Klass, M.J. (1980). Matrix normalization of sums of random vectors in the domain of attraction of the multivariate normal. Ann. Probab. 8 262-280.

[30] Hasabelnaby, N.A., Ware, J.H. and Fuller, W.A. (1989). Rejoinder to comments by Leon Jay Gleser. Statistics in Medicine 8 1133-1135.

[31] Kendall, M.G. (1951). Regression, structure and functional relationship, I. Biometrika 38 11-25.

[32] Kendall, M.G. (1952). Regression, structure and functional relationship, II. Biometrika 39 96-108.

[33] Kendall, M.G. and Stuart, A. (1979). The Advanced Theory of Statistics. Vol. 2, 4th ed. Griffin, London.

[34] Kukush, A.G. and Martsynyuk, Yu.V. (1998). Consistency and inconsistency of the weighted least squares estimator in linear functional error-in-variables models. Theory of Stochastic Processes 4(20) 172-179.

[35] Kukush, A.G. and Martsynyuk, Yu.V. (1999/2000). A criterion for the consistency of the least squares estimator for a functional linear model with errors in variables. Teoriya Imovirnostey ta Matematychna Statystyka 60 95-101/105-112, in Ukrainian/English.

[36] Kukush, A. and Maschke, E.O. (2003). The efficiency of adjusted least squares in the linear functional relationship. J. Multivariate Anal. 87 261-274.

[37] Levy, P. (1937). Theorie de l'Addition des Variables Aleatoires. Gauthier-Villars, Paris.

[38] Lindley, D.V. (1947). Regression lines and the linear functional relationship. J. R. Statist. Soc. Suppl. 9 218-244.


[39] Logan, B.F., Mallows, C.L., Rice, S.O. and Shepp, L.A. (1973). Limit distributions of self-normalized sums. Ann. Probab. 1 788-809.

[40] Madansky, A. (1959). The fitting of straight lines when both variables are subject to error. J. Amer. Statist. Assoc. 54 173-205.

[41] Maller, R.A. (1981). A theorem on products of random variables, with application to regression. Austral. J. Statist. 23 177-185.

[42] Maller, R.A. (1993). Quadratic negligibility and the asymptotic normality of operator normed sums. J. Multivariate Anal. 44 191-219.

[43] Martsynyuk, Yu. (2001). Asymptotic behaviour of weighted least squares estimator in linear functional error-in-variables models. Technical Report Series of the Laboratory for Research in Statistics and Probability, 353-July 2001, Carleton University-University of Ottawa.

[44] Martsynyuk, Yu. (2004). Invariance principles via Studentization in linear structural error-in-variables models. Technical Report Series of the Laboratory for Research in Statistics and Probability, 406-October 2004, Carleton University-University of Ottawa.

[45] Martsynyuk, Yu. (2005). Invariance principles via Studentization in linear functional error-in-variables models. Technical Report Series of the Laboratory for Research in Statistics and Probability, Carleton University-University of Ottawa. To appear.

[46] Martsynyuk, Yu. Invariance principles via self-randomization and Studentization for independent nonidentically distributed random variables and their applications to linear functional error-in-variables models. In progress.


[47] Martsynyuk, Yu. Multivariate CLT’s via Studentization in linear functional error-in-variables models. In progress.

[48] Martsynyuk, Yu. A note on linear structural error-in-variables models with explanatory variables having an infinite variance. In progress.

[49] Martsynyuk, Yu. Renyi type limit theorems in linear functional error-in-variables models. In progress.

[50] Meerschaert, M. (1994). Norming operators for generalized domains of attraction. J. Theoret. Probab. 7 739-798.

[51] Meerschaert, M.M. and Scheffler, H.-P. (2001). Limit Theorems for Sums of Independent Random Vectors. Wiley, New York.

[52] Moran, P.A.P. (1971). Estimating structural and functional relationships. J. Multivariate Anal. 1 232-255.

[53] O’Brien, G.L. (1980). A limit theorem for sample maxima and heavy branches in Galton-Watson trees. J. Appl. Probab. 17 539-545.

[54] Petrov, V.V. (1987). Limit Theorems for Sums of Independent Random Variables. Nauka, Moscow. In Russian.

[55] Prohorov, Yu.V. (1956). Convergence of random processes and limit theorems in probability theory. Theory of Probability and Its Applications 1 157-214.

[56] Reiersøl, O. (1950). Identifiability of a linear relation between variables which are subject to errors. Econometrica 18 375-389.

[57] Renyi, A. (1953). On the theory of order statistics. Acta Math. Acad. Sci. Hung. 4 191-232.


[58] Resnick, S. and Greenwood, P. (1979). A bivariate stable characterization and domains of attraction. J. Multivariate Anal. 9 206-221.

[59] Revesz, P. (1968). The Laws of Large Numbers. Academic Press, New York.

[60] Schott, J.R. (1997). Matrix Analysis for Statistics. Wiley, New York.

[61] Sepanski, S.J. (1997). Some invariance principles for random vectors in the generalized domain of attraction of the multivariate normal law. J. Theoret. Probab. 10 1053-1063.

[62] Sprent, P. (1990). Some history of functional and structural relationship. Statistical Analysis of Measurement Error Models and Applications (Brown, P. and Fuller, W., eds.), Contemporary Mathematics 112, pp. 3-15. American Mathematical Society, Providence, RI.

[63] Van Montfort, K. (1988). Estimating in Structural Models with Non-Normal Distributed Variables: Some Alternative Approaches. M & T Series 12. DSWO Press, Leiden.

[64] Vu, H.T.V., Maller, R.A. and Klass, M.J. (1996). On the Studentization of random vectors. J. Multivariate Anal. 57 142-155.
