INVARIANT ESTIMATION WITH APPLICATION

TO LINEAR MODELS

by

Mary Sue Younger

Dissertation submitted to the Graduate Faculty of the

Virginia Polytechnic Institute and State University

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

in

Statistics

APPROVED:

Dr. Lawrence S. Mayer, Chairman
Department of Statistics

Dr. Raymond H. Myers, Department of Statistics
Dr. Boyd Harshbarger, Head, Department of Statistics
Dr. I. J. Good, Department of Statistics
Dr. Paul W. Hamelman, Department of Business Administration
Dr. David L. Klemmack, Department of Sociology

August, 1972

Blacksburg, Virginia

ACKNOWLEDGEMENTS

The author would like to express her appreciation to the following people for their help in the preparation of this dissertation:

To Dr. Boyd Harshbarger for his encouragement throughout the author's years of graduate study and for his securing the financial aid which made graduate study possible.

To Dr. Lawrence S. Mayer, who directed this research with tireless energy and without whose confidence, enthusiasm and guidance this dissertation would have never been undertaken.

To Dr. Raymond H. Myers for his invaluable suggestions while co-reading this manuscript and his continued concern and encouragement throughout the author's years of graduate study.

To the Faculty serving on the Committee for their interest and criticism.

To the author's colleagues in the College of Business for their interest and concern for the completion of this dissertation, and for their aid in having copies made.

To , who did the computer work necessary to the examples presented.

Finally, to , who typed this manuscript quickly and accurately, and who never became irritated at having to make so many corrections and revisions.

TABLE OF CONTENTS

ACKNOWLEDGMENTS

CHAPTER I: INTRODUCTION

CHAPTER II: REVIEW OF GROUP THEORY

CHAPTER III: THE STRUCTURE OF THE ESTIMATION PROBLEM
    Section 3.1: Definition of an Equivariant Estimation Problem
    Section 3.2: Additional Structure of the Proposed Problems
    Section 3.3: An Example of the Estimation Structure Under Consideration

CHAPTER IV: INVARIANT ESTIMATION OF ORBITS OF THE PARAMETER SPACE
    Section 4.1: A Group Theoretic Function Which Indexes Orbits
    Section 4.2: A Maximal Invariant Parametric Function

    Section 4.3: An Invariant Sufficient Estimator of the Index of Orbits

CHAPTER V: APPLICATION TO THE LINEAR MODEL: NONSTOCHASTIC INPUT MATRIX

CHAPTER VI: COMPARISON OF THE ESTIMATORS OF THE MAXIMAL INVARIANT PARAMETRIC FUNCTION
    Section 6.1: Properties of B̃
    Section 6.2: Properties of the Usual Social Science Estimator

CHAPTER VII: RELATIONSHIP OF PROPOSED ESTIMATOR TO THE MINIMUM RISK EQUIVARIANT AND FIDUCIAL ESTIMATORS
    Section 7.1: Relationship to the Minimum Risk Equivariant Estimator
    Section 7.2: Relationship to the Fiducial Estimator



CHAPTER VIII: APPLICATION TO THE LINEAR MODEL: STOCHASTIC INPUT MATRIX
    Section 8.1: Alternative Parametric Formulations
    Section 8.2: Properties of the Estimator in Formulation One
    Section 8.3: Properties of the Estimator in Formulation Two

CHAPTER IX: ESTIMATION IN THE LINEAR MODEL: INVARIANCE UNDER NONSINGULAR TRANSFORMATIONS

CHAPTER X: EXAMPLES OF APPLICATION OF THE PROPOSED PROCEDURES
    Section 10.1: Application to Sociological Data
    Section 10.2: Application to Engineering Research Data
    Section 10.3: Summary

REFERENCES

VITA

LIST OF FIGURES AND TABLES

Figure

10.1.1  Path diagram showing influence of family characteristics and age on marriage aspirations (from Klemmack)

Table

10.1.1  Values of B, B̃ and B* for the regressions indicated in Figure 10.1.1
10.1.2  Values of B, B̃ and B* for predicting x10 (FOC1) and x11 (FOC2) from the other nine variables

10.1.3  Correlation matrix for all eleven variables

10.1.4  Multiple coefficients of determination

10.1.5  Values of (n-1)^{1/2} B̃ for the regressions of FOC1 and FOC2 from x1, ..., x9
10.1.6  Values of (n-1)^{1/2} B̃ for the five regressions indicated in Figure 10.1.1

10.2.1  Correlation matrix for data from Gorman and Toman

10.2.2  Marginal standard deviations of x1, ..., x10 and y for Gorman and Toman data

10.2.3  Values of β, B, B̃ and B* for the Gorman and Toman data

10.2.4  Values of (n-1)^{1/2} B̃ for Gorman and Toman data

CHAPTER I

INTRODUCTION

The principle of adopting a group of transformations on the sample space in a statistical estimation problem is quite common. Usually the group structure adopted affects the problem in one of two ways: i) The group structure is isomorphic to the parameter space and is assumed to "carry" the estimation problem in a natural manner. Attention is restricted to estimators which are also "carried" by the group in a natural manner. ii) The group structure generates orbits on the parameter space and the problem is to estimate in which orbit a parameter lies. Attention is restricted to estimators which are unaffected by the group.

Estimation problems of type i) are called equivariant estimation problems and have been widely studied. For reviews of this work see Zacks (1971), Ferguson (1967), Fraser (1968), Pitman (1939), Brown (1966), and Younger (1969). Equivariant estimation procedures are usually adopted when it is impossible to find a "best" estimator within the class of all estimators; by restricting attention to the class of equivariant estimators an optimal estimator may appear. Much of the literature deals with the properties of the "best" equivariant estimator when considered as a member of the class of all estimators. Hora and Buehler (1966) show the relationship between the best fiducial and best equivariant estimators.

Estimation problems of type ii) are called invariant estimation problems. These have been considered by Wijsman (1965), Stein (1959), Hall, Wijsman and Ghosh (1965), and Zacks (1971). Invariant estimation problems arise when the orbit of the parameter under a certain group, but not the parameter itself, must be estimated. The group can be thought of as a "nuisance" group, since estimation is restricted to be modulo the group. The most notable results dealing with these problems include the Stein theorem relating invariance and sufficiency, as presented by Zacks (1971) or Hall, Wijsman and Ghosh (1965); Wijsman's (1965) results obtaining probability ratios of maximal invariants, utilizing invariant measures and differential manifolds; and the results in multivariate distribution theory using invariance under the orthogonal group presented by James (1954, 1964).

In this dissertation we study estimation problems which give rise to both types of group structures. In particular, the parameter space supports a nuisance group of transformations and the objective is to estimate the orbit in which the parameter lies, thus giving rise to an invariant estimation problem. In addition, a second group can be defined such that the set of ordered pairs drawn from the two groups forms a group isomorphic to the parameter space, giving rise to an equivariant estimation problem as well.

The motivation for the problem studied here is the problem of estimating the parameters in the general linear model, the nuisance group being the group of scale changes on the dependent and independent variables. This problem is of interest since, in many areas of research, most notably in the social sciences, an estimate of the relationship between variables is required which is independent of the units in which the variables are measured. The invariant estimator commonly used is called a "beta coefficient" or "standardized regression coefficient," but there appears to have been no rigorous derivation of this measurement and no investigation of its properties.
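The "standardized regression coefficient" referred to above can be sketched numerically. The sketch below is ours, not the dissertation's, and the data are invented; in simple regression the standardized coefficient is the ordinary least-squares slope rescaled by the ratio of marginal standard deviations, which makes it unchanged under separate changes of units on x and y:

```python
# Sketch (invented data): the "beta coefficient" is the OLS slope measured
# in standard-deviation units, so rescaling either variable leaves it fixed.

def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return (sum((u - m) ** 2 for u in v) / (len(v) - 1)) ** 0.5

def ols_slope(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def beta_coefficient(x, y):
    # standardized regression coefficient: slope * s_x / s_y
    return ols_slope(x, y) * sd(x) / sd(y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]          # e.g. a predictor measured in feet
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b = beta_coefficient(x, y)
b_rescaled = beta_coefficient([12 * a for a in x], [0.5 * v for v in y])
print(abs(b - b_rescaled) < 1e-12)      # invariant under scale changes
```

In the one-predictor case this quantity coincides with the sample correlation coefficient, which is why its magnitude is bounded by one.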

In Chapter II is a review of the definitions and concepts from group theory necessary to the development of invariant estimators.

While most of the notions presented can be found in any elementary text on group theory, some less conventional ideas are introduced in order to facilitate the application of group theory to statistical problems.

Chapter III considers the structure of the estimation problem, defining the parameter, sample, and estimation spaces of interest and the groups of transformations defined on them. Chapter IV develops the theoretic basis for invariant estimation of the orbit in which the parameter lies.

Application of the theory developed in Chapters III and IV to the problem of invariant estimation in the linear model for the case in which the independent variables are nonrandom is contained in Chapter

V. The primary result in Chapter V is that the invariant estimator obtained through group theoretic considerations is not the same as the commonly used "beta coefficient." Comparison of the properties of the two estimators in Chapter VI reveals that the two estimators differ with respect to consistency, simplicity of distribution, and magnitude of bias. Chapter VII compares the proposed estimator to the fiducial and minimum risk equivariant estimators, showing that the proposed estimator is a simple function of the fiducial and minimum risk equivariant estimators.

It is shown in Chapter VIII that the results obtained for the case in which the independent variables are fixed carry over to the case in which the input matrix is stochastic. However, in this latter case, some justification can be made for the use of the usual "beta coefficient."

Invariance under nonsingular transformations is considered in Chapter IX. It is shown that the proposed theory can be straightforwardly applied and an invariant estimator obtained. Interpretation of the physical meaning of this invariant estimator appears to be obscure, however.

In the last chapter, we illustrate the application of the methods developed herein to actual research problems: one from a sociological study and another from an engineering experiment. A dramatic change in the results is apparent in the engineering example, and to a lesser extent in the sociology example, due to the poor correlations among variables in the latter problem.

CHAPTER II

REVIEW OF GROUP THEORY

The concept of a group of transformations is fundamental to the notion of invariance. In this chapter we present some of the rudiments of group theory; in particular we present those results which will be utilized in our approach to invariant estimation. For a more complete yet elementary discussion of group theory, the reader is referred to Herstein (1964); for a more advanced discussion, see Goldhaber and Ehrlich (1970). Proofs of well-known lemmas will not be given here.

We begin with the notions of a group and a subgroup:

Definition 2.1. Let K be a nonempty set and let ∘ be a binary operation on K; that is, ∘ is a mapping defined on K × K. The pair (K, ∘), abbreviated K, is a group provided the following four properties are satisfied:

i) K is closed under ∘; that is, if a, b ∈ K, then a ∘ b ∈ K.

ii) ∘ is associative in K; that is, for a, b, c ∈ K, (a ∘ b) ∘ c = a ∘ (b ∘ c).

iii) K contains an identity element; that is, there exists an element e ∈ K such that for every element a ∈ K, a ∘ e = e ∘ a = a.

iv) Every element in K has an inverse which is also in K; that is, there exists an element a⁻¹ ∈ K such that a ∘ a⁻¹ = a⁻¹ ∘ a = e.

From now on, we will write simply ab for a ∘ b when the operation is clear from context.

Definition 2.2. A group K is a group of transformations on a space Ω if for all k ∈ K, k is a map from Ω to Ω and if a, b ∈ K and ω ∈ Ω then (ab)ω = a(bω); that is, if the group operator is composition.

Definition 2.3. A group K is called Abelian if the operation in K is commutative; that is, if for every a, b ∈ K, ab = ba.

Definition 2.4. A subset K₁ of a group K is a subgroup of K if K₁ is itself a group under the operation defined in K.

Lemma 2.1. Necessary and sufficient conditions for K₁ to be a subgroup of K are:

i) K₁ must be closed under the operation in K; that is, if a, b ∈ K₁, then ab ∈ K₁.

ii) K₁ must contain inverses; that is, if a ∈ K₁, then a⁻¹ ∈ K₁.

Lemma 2.2. If K₁ is a subgroup of a group K, then the identity element of K is in K₁ and is the identity element of K₁.

The concepts of a coset and a normal subgroup are central to our development of the theory of invariant estimation.

Definition 2.5. Let K₁ be a subgroup of a group K, and let k be an arbitrary element of K. Then the set

K₁k = {ℓk : ℓ ∈ K₁}

is called a right coset of K₁ in K. The class of all right cosets of K₁ in K will be denoted C_R(K₁:K).

Lemma 2.3. If K₁ is a subgroup of a group K, and a and b are elements of K, then the following conditions are equivalent:

i) a ∈ K₁b

ii) K₁a = K₁b

iii) ab⁻¹ ∈ K₁

Lemma 2.4. If K₁ is a subgroup of a group K, then the distinct right (left) cosets of K₁ in K partition K into equivalence classes, such that each element of K is in one and only one equivalence class and any element in an equivalence class can be reached from any other element in the same equivalence class via the operation of some element of K₁.

Definition 2.6. A subgroup K₁ of a group K is a normal subgroup if and only if kk₁k⁻¹ is in K₁ for every k₁ ∈ K₁ and k ∈ K.

Lemma 2.5. If K₁ is a normal subgroup of a group K, then kK₁ = K₁k for every k ∈ K. That is, if K₁ is a normal subgroup, then the right and left cosets of K₁ in K are identical. We note that if K is an Abelian group, then all subgroups of K will be normal subgroups.

Lemma 2.6. If K₁ is a normal subgroup of a group K, then for any a, b ∈ K, (K₁a)(K₁b) = K₁ab; that is, if K₁ is normal, then a natural operation can be defined on C_R(K₁:K). Note, however, that if K₁ is not normal, then while the cosets of K₁ in K partition K, they do not inherit a natural operation and thus do not possess group structure.

Lemma 2.7. If K₁ is a normal subgroup of a group K, then C_R(K₁:K), the set of all distinct right (or left) cosets of K₁ in K, is itself a group, called the quotient group of K modulo K₁ and denoted K/K₁.
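As a concrete finite illustration of Lemmas 2.4–2.7 — our own toy example, not part of the dissertation — the coset machinery can be checked on the symmetric group S3, with permutations represented as tuples and composition as the group operation:

```python
# Toy check on S3: the alternating subgroup A3 is normal, its right cosets
# partition S3, and coset multiplication yields the quotient group S3/A3.

from itertools import permutations

def compose(p, q):
    # (p o q)(i) = p[q[i]]
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

S3 = list(permutations(range(3)))
A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}       # the even permutations

# Definition 2.6: k a k^{-1} stays in A3 for every k in S3, a in A3
normal = all(compose(compose(k, a), inverse(k)) in A3
             for k in S3 for a in A3)
print(normal)

# Lemma 2.4: the distinct right cosets A3 k partition S3
cosets = {frozenset(compose(a, k) for a in A3) for k in S3}
print(len(cosets))                            # 2 distinct cosets

# Lemma 2.7: the product of two cosets is again a coset
c1, c2 = cosets
prod = frozenset(compose(x, y) for x in c1 for y in c2)
print(prod in cosets)
```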

The notion of an isomorphism is essential to the development of both equivariant and invariant estimation problems.

Definition 2.7. Let K and L be groups with operations ∘ and *, respectively. A homomorphism is a mapping µ of K into L such that for every a and b in K, µ(a ∘ b) = µ(a) * µ(b).

Definition 2.8. A mapping from K onto L requires that every element of L be the image of at least one element of K. A mapping from K to L is a one-to-one mapping if each element of L is the image of at most one element of K.

Definition 2.9. An isomorphism is a homomorphism which is a one-to-one mapping of K onto L. If an isomorphism of K onto L exists, we say that the two groups are isomorphic.

In order to apply results in group theory to statistical estimation problems, it is necessary to define isomorphisms between groups and spaces:

Definition 2.10. A group K of transformations on a space Ω is transitive on Ω if for all a, b ∈ Ω there exists an element k ∈ K such that a = k(b).

Definition 2.11. A group K of transformations on a space Ω is isomorphic to Ω if for all a, b ∈ Ω there exists one and only one k ∈ K such that a = k(b).

We note that i) the isomorphism defined in Definition 2.11 is not a group isomorphism, since Ω is not a group; and ii) if K is isomorphic to Ω, then K is transitive on Ω.

Definition 2.12. A group L of transformations on a space Ω is sub-isomorphic with respect to Ω if there is a group K of transformations isomorphic to Ω such that L is isomorphic to a subgroup of K.

Invariant estimation problems require that the parameter space be partitioned into equivalence classes, or orbits:

Definition 2.13. If K is a group of transformations on a space Ω, then {kω : k ∈ K} is the orbit of ω in Ω.

Lemma 2.8. The orbits in Ω partition Ω into disjoint exhaustive equivalence classes.

Proof. We must show that i) every ω ∈ Ω belongs to at least one orbit and ii) every ω ∈ Ω belongs to only one orbit. Since K is a group of transformations on Ω, the identity e is in K, so ω = eω is in the orbit of ω. Clearly if ω = k₀ω₀ for some k₀ ∈ K the orbit of ω is the orbit of ω₀. Suppose ω in Ω is in the orbit of ω₀ and in the orbit of ω₁. Then ω = k₀ω₀ = k₁ω₁ and thus ω₀ = k₀⁻¹k₁ω₁, so that the orbit of ω₀ in Ω is the same as the orbit of ω₁ in Ω. Therefore ω is in only one orbit.

Lemma 2.9. If K is a group transitive on Ω then Ω is a single orbit.

Proof. If K is transitive on Ω, then for a fixed ω₀ and an arbitrary ω in Ω, there exists an element k ∈ K such that ω = kω₀, so that the orbit of ω₀ is all of Ω.
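Lemma 2.8 can be illustrated on a toy space of our own choosing: the two-element group {+1, -1}, acting on a finite set of integers by multiplication, splits the set into disjoint, exhaustive orbits of the form {x, -x}:

```python
# Toy sketch of Lemma 2.8: orbits of the sign-change group partition the
# space {-3, ..., 3} into disjoint, exhaustive equivalence classes.

omega = list(range(-3, 4))
K = [1, -1]                  # a group under ordinary multiplication

orbits = {frozenset(k * w for k in K) for w in omega}

# exhaustive: every point is covered; disjoint: each point in one orbit
covered = sorted(w for orb in orbits for w in orb)
print(covered == omega)
print(all(sum(w in orb for orb in orbits) == 1 for w in omega))
print(len(orbits))           # 4 orbits: {0}, {1,-1}, {2,-2}, {3,-3}
```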

Definition 2.14. Let the equivalence relation ~K be defined on Ω by orbit membership; that is, if ω₁, ω₂ ∈ Ω, then ω₁ ~K ω₂ if and only if there exists an element k ∈ K such that ω₂ = kω₁.

Lemma 2.10. If the group K is isomorphic to Ω and L is a subgroup of K, then the cosets of L and the elements of Ω/~L = S_L(Ω), the set of orbits of L in Ω, are in one-to-one correspondence.

Proof. Choose a point ω₀ in Ω to identify with e, the identity transformation in K. Consider the mapping f : C_R(L:K) → S_L(Ω) defined by f(Lk) = {ω : ω = ℓkω₀ for some ℓ ∈ L}. i) We must first show that f is well defined. Suppose Lk₁ = Lk₂. Then k₂ = ℓ'k₁ for some ℓ' ∈ L, so that {ω : ω = ℓk₂ω₀ for some ℓ ∈ L} = {ω : ω = ℓℓ'k₁ω₀ for some ℓ ∈ L} = {ω : ω = ℓk₁ω₀ for some ℓ ∈ L}, and thus f(Lk₁) = f(Lk₂). ii) We must now show that f is an onto mapping. For fixed ω₁ ∈ Ω, consider its orbit {ω : ω = ℓω₁ for some ℓ ∈ L}. Since K is isomorphic to Ω, there exists an element k₁ ∈ K such that ω₁ = k₁ω₀, and then f(Lk₁) = {ω : ω = ℓk₁ω₀ for some ℓ ∈ L} = {ω : ω = ℓω₁ for some ℓ ∈ L}. iii) Finally, we must show that f is one-to-one. Suppose f(Lk₁) = f(Lk₂). Then k₂ω₀ ∈ f(Lk₁), so k₂ω₀ = ℓ₁k₁ω₀ for some ℓ₁ ∈ L. Since K is isomorphic to Ω, this implies k₂ = ℓ₁k₁, and therefore Lk₂ = Lk₁.

The above lemmas can be used as follows. Suppose a problem involves determining in which orbit of a group K an unknown ω lies. Lemma 2.8 implies the problem is well defined. If the group K is transitive on Ω the problem is degenerate, since all ω lie in a single orbit (Lemma 2.9). Suppose the group K is a proper subgroup of a group H isomorphic on Ω; then by Lemma 2.10 the orbits of K on Ω and the cosets of K in H are in one-to-one correspondence, so the problem of determining in which orbit ω lies is equivalent to the problem of choosing a coset.

Suppose the group K is transitive on the space Ω but not isomorphic to Ω. Then the correspondence between elements of K and points of Ω is not one-to-one, and the following notion is needed.

Definition 2.16. If K is transitive on Ω, let K_ω₀ be the isotropy group of ω₀ in K; here k ∈ K_ω₀ implies that kω₀ = ω₀.

Lemma 2.11. If K is transitive on Ω, then there is a one-to-one correspondence between the cosets of K_ω₀ in K and the elements of Ω.

Proof. Define a function g by g{kK_ω₀} = kω₀. If k₁K_ω₀ = k₂K_ω₀, then k₁ = k₂k' for some k' ∈ K_ω₀, so that k₁ω₀ = k₂k'ω₀ = k₂ω₀ and

g{k₁K_ω₀} = g{k₂K_ω₀} ,

so g is well defined. Since K is transitive, if ω ∈ Ω, then ω = k*ω₀ for some k* ∈ K and thus ω = g{k*K_ω₀}, so that g is onto. Finally, g is one-to-one, since if g{k₁K_ω₀} = g{k₂K_ω₀}, then k₂ω₀ = k₁ω₀, which implies that k₂⁻¹k₁ ∈ K_ω₀, which in turn implies that k₁K_ω₀ = k₂K_ω₀.

Finally, we introduce the notion of decomposing a group into the product of subgroups.

Definition 2.17. A group K is the direct product of two subgroups K₁ and K₂ if every k ∈ K can be written as k = k₁k₂ for some k₁ ∈ K₁ and k₂ ∈ K₂, and every element of K₁ commutes with every element of K₂.

Thus if a group K of transformations on a space Ω is the direct product of K₁ and K₂, then if k ∈ K the action of k on the space Ω can be expressed as the composite action of an element k₁ in K₁ and an element k₂ in K₂. If K is not Abelian the notion of a direct product can be replaced by the notion of a semi-direct product:

Definition 2.18. A group K is the (left) semi-direct product of two subgroups K₁ and K₂ if every k ∈ K can be written uniquely as k = k₂k₁ for one and only one k₁ ∈ K₁ and k₂ ∈ K₂. The decomposition of k into k₂k₁ will be called the left decomposition. A right decomposition can be defined similarly.

CHAPTER III

THE STRUCTURE OF THE ESTIMATION PROBLEM

The estimation problems studied in this dissertation involve six concepts: X₀, A, P_θ, Θ, L(·,·), and V. We denote by X₀ a sample space and by A a σ-algebra of subsets of X₀. P_θ is a probability measure on A, the index θ belonging to Θ, the parameter space. L(·,·) is a function defined from Θ × Θ to R⁺, the positive real line, and is called the loss function. V denotes the space of all mappings from X₀ to Θ; these mappings are called estimators.

The problem is to estimate the value of θ. Given an estimator d(·) ∈ V the statistician observes a value z ∈ X₀ according to the measure P_θ, estimates θ to be θ̂ = d(z), and suffers loss L(θ̂, θ). The goal is to find an estimator d(·) which minimizes the expected value of L(θ̂, θ) for all values of θ. The search for such an estimator is usually fruitless unless attention is restricted to some subclass of the class of estimators (for example, unbiased or sufficient estimators), or alternatively, unless the parameter space is reduced in some manner. Although the most common method for reducing the class of available estimators is probably by using the criterion of unbiasedness, a second method of reduction is by equivariance, as was defined in Chapter I. In applying this latter criterion, the statistician requires that his estimator be equivariant under some group of transformations.

In this dissertation an estimation problem is called equivariant if a group H* of transformations on X₀ is defined which affects the problem in the following manner. i) H* induces a group H̄ of transformations on the parameter space Θ: if z has a distribution with parameter value θ, then h*z has a distribution with parameter value h̄θ. ii) H̄ is isomorphic to Θ, and iii) the loss function L(·,·) satisfies the invariance condition

L(θ̂, θ) = L(h̄θ̂, h̄θ) .

iv) In an equivariant estimation problem, attention is restricted to estimators which satisfy the property

h̄[d(z)] = d[h*(z)] .

As an illustration of the requirements of an equivariant decision problem, consider the problem of estimating the true mean of a set of measurements. Suppose the true mean is two (feet); transforming the measurements to inches makes the true mean 24 (inches). If the problem is equivariant the estimator of the mean when the measurements are taken in inches should be 12 times as great as the estimator obtained if the measurements are taken in feet. Moreover, regardless of whether the measurements are taken in feet or inches, the loss incurred in estimation should not be affected.
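The feet-and-inches illustration can be written out as a small numerical sketch (the sample values below are invented, and the scale-relative loss is one simple choice of invariant loss, not the dissertation's):

```python
# Sketch: the sample mean is equivariant under the change of units, and a
# loss measured relative to the unit of measurement is unaffected.

def sample_mean(z):
    return sum(z) / len(z)

feet = [1.8, 2.1, 2.4, 1.7]            # invented measurements in feet
inches = [12 * z for z in feet]

est_feet = sample_mean(feet)
est_inches = sample_mean(inches)
print(abs(est_inches - 12 * est_feet) < 1e-12)    # equivariance

def loss(est, true, scale):
    # squared error relative to the measurement scale (invariant)
    return ((est - true) / scale) ** 2

true_feet = 2.0
print(abs(loss(est_feet, true_feet, 1.0) -
          loss(est_inches, 12 * true_feet, 12.0)) < 1e-12)
```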

The notion of an equivariant estimation problem can be made more precise by introducing the following definition:

Definition 3.1.1. i) A function f defined on a space Ω is said to be invariant under a group K of transformations on Ω if

f(kω) = f(ω)

for ω ∈ Ω and k ∈ K. ii) The function f is said to be equivariant under K if

f(kω) = [k̃f](ω)

for ω ∈ Ω and k ∈ K, where k̃ is the element corresponding to k in a group K̃ of transformations induced by K on the range of f.

We may say that a function is equivariant if it commutes with the group operator, and invariant if it is constant on the orbits of K in Ω.

An equivariant estimation problem is thus one in which i) the distribution of the sample is equivariant, ii) the loss function is invariant, and iii) attention is restricted to decision rules which are equivariant.

In this dissertation we consider estimation problems which have a common basic structure, which gives rise to both an equivariant and an invariant estimation problem. It is assumed that any problem under consideration yields a sufficient statistic taking values in a sufficient statistic space X. A sub-isomorphic group G₁* of transformations on X₀ is given which induces sub-isomorphic groups G₁ and Ḡ₁ of transformations on X and Θ, respectively. The problem is to estimate the orbit of Θ (with respect to Ḡ₁) in which θ lies.

It is assumed that a group G₂* of transformations on X₀ can also be defined such that if G* is the set of ordered pairs of elements of G₂* and G₁*, then an operation can be defined on G* which makes G* a group which is the semi-direct product of isomorphic images of G₁* and G₂*. Furthermore, G* induces groups G and Ḡ of transformations on X and Θ, respectively, each isomorphic to G*. Finally, it is assumed that since Ḡ is isomorphic to the space Θ, Ḡ gives rise to an equivariant decision problem.

Suppose reference points x₀ and θ₀ are chosen in X and Θ and identified with the identity elements in G and Ḡ, respectively. Then since G and Ḡ are isomorphic to X and Θ, any point x ∈ X can be written

x = g_x x₀

for a unique g_x in G; similarly, any point θ ∈ Θ can be written

θ = ḡ_θ θ₀

for a unique ḡ_θ in Ḡ. Therefore, once x₀ and θ₀ are chosen, points in X and Θ can be identified with operators in G and Ḡ in a natural manner.

Section 3.3: An Example of the Estimation Structure Under Consideration

As an example of the structure of the estimation problems considered in this dissertation, consider the problem of estimating the mean µ and variance σ² of a normal distribution, given a sample z = (z₁, ..., zₙ) of n observations. The sample space is X₀ = Rⁿ and the parameter space is

Θ = {(µ, σ²) : µ ∈ R and σ² ∈ R⁺} .

Consider the class H* of location and scale transformations on X₀,

H* = {[c, λ²] : c ∈ R, λ² ∈ R⁺} ,

where the operation is defined on H* by

[c₁, λ₁²] [c₂, λ₂²] = [λ₁c₂ + c₁, λ₁²λ₂²] .

Lemma 3.3.1. H* is a group.

Proof. i) Since λ₁c₂ + c₁ ∈ R and λ₁²λ₂² ∈ R⁺, H* is closed. ii) For associativity, we have

([c₁, λ₁²] [c₂, λ₂²]) [c₃, λ₃²] = [λ₁c₂ + c₁, λ₁²λ₂²] [c₃, λ₃²] = [λ₁λ₂c₃ + λ₁c₂ + c₁, λ₁²λ₂²λ₃²]

and

[c₁, λ₁²] ([c₂, λ₂²] [c₃, λ₃²]) = [c₁, λ₁²] [λ₂c₃ + c₂, λ₂²λ₃²] = [λ₁(λ₂c₃ + c₂) + c₁, λ₁²λ₂²λ₃²] = [λ₁λ₂c₃ + λ₁c₂ + c₁, λ₁²λ₂²λ₃²] .

iii) The identity element is [0, 1] since

[0, 1] [c, λ²] = [c, λ²]

and

[c, λ²] [0, 1] = [λ·0 + c, λ²·1] = [c, λ²] .

iv) The inverse of [c, λ²] is [-cλ⁻¹, λ⁻²] since

[c, λ²] [-cλ⁻¹, λ⁻²] = [λ(-cλ⁻¹) + c, λ²λ⁻²] = [0, 1]

and

[-cλ⁻¹, λ⁻²] [c, λ²] = [λ⁻¹c - cλ⁻¹, λ⁻²λ²] = [0, 1] .
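Lemma 3.3.1 can also be checked numerically. The sketch below is our own; pairs (c, lam2) stand for the transformations [c, λ²], and the composition rule is the one used in the proof above:

```python
# Numerical sketch of Lemma 3.3.1 for the location-scale group:
# [c1, l1^2][c2, l2^2] = [l1*c2 + c1, l1^2 * l2^2].

def op(g1, g2):
    c1, l1sq = g1
    c2, l2sq = g2
    return (l1sq ** 0.5 * c2 + c1, l1sq * l2sq)

def inv(g):
    # the inverse [-c/lam, 1/lam^2] from part iv) of the proof
    c, lsq = g
    return (-c / lsq ** 0.5, 1.0 / lsq)

def close(g, h, tol=1e-9):
    return abs(g[0] - h[0]) < tol and abs(g[1] - h[1]) < tol

e = (0.0, 1.0)
a, b, c = (1.0, 4.0), (-2.0, 0.25), (3.0, 9.0)

print(close(op(op(a, b), c), op(a, op(b, c))))              # associativity
print(close(op(a, e), a) and close(op(e, a), a))            # identity
print(close(op(a, inv(a)), e) and close(op(inv(a), a), e))  # inverses
```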

In order to make H* a group of transformations on X₀ we now define how an element in H* acts on an element in X₀. Let

[c, λ²] (z₁, ..., zₙ) = (λz₁ + c, ..., λzₙ + c) .

(Note that square brackets are used to denote transformations in a group, and round brackets to denote elements in a space.)

The group of transformations induced by H* on Θ is

H̄ = {[c, λ²] : c ∈ R, λ² ∈ R⁺} ,

where the operation in H̄ is defined in exactly the same way as in H*.

Lemma 3.3.2. H̄ is isomorphic to Θ.

Proof. Given θ₁ and θ₂ ∈ Θ, where

θ₁ = (µ₁, σ₁²) and θ₂ = (µ₂, σ₂²) ,

there is one and only one element [a, b²] ∈ H̄ such that θ₁ = [a, b²]θ₂, namely

b² = σ₁²/σ₂² and a = µ₁ - (σ₁/σ₂)µ₂ ,

and this representation is unique. The isomorphism between H̄ and Θ is represented by choosing the reference point

θ₀ = (0, 1)

in Θ, so that if θ = (µ, σ²), then θ = [µ, σ²]θ₀, and thus [µ, σ²] and (µ, σ²) are identified with each other.

The problem gives rise to the two-dimensional sufficient statistic (x̄, s²), and thus the sufficient statistic space is X = R × R⁺. The induced group H on X contains elements h = [c, λ²], where

[c, λ²] (x̄, s²) = (λx̄ + c, λ²s²) ,

and where the group operation is defined as before. Clearly, H is isomorphic to X, and H and H̄ are isomorphic groups.
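The equivariance of the sufficient statistic can be verified directly on an invented sample (our own sketch): computing (x̄, s²) from the transformed observations λz + c gives exactly [c, λ²](x̄, s²):

```python
# Sketch: (xbar, s^2) of the transformed sample equals the transformed
# (xbar, s^2), i.e. (lam*xbar + c, lam^2 * s^2).

def xbar_s2(z):
    n = len(z)
    m = sum(z) / n
    return m, sum((x - m) ** 2 for x in z) / (n - 1)

z = [3.1, 2.7, 4.0, 3.3, 2.9]          # invented sample
c, lam = 5.0, 2.0

m0, v0 = xbar_s2(z)
m1, v1 = xbar_s2([lam * x + c for x in z])

print(abs(m1 - (lam * m0 + c)) < 1e-9)
print(abs(v1 - lam ** 2 * v0) < 1e-9)
```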

By defining an invariant loss function such as

L(θ̂, θ) = (1/σ²)(µ - d₁(x))² + (1/σ⁴)(σ² - d₂(x))² ,

where d(x) = (d₁(x), d₂(x)) is equivariant under H̄, we have an estimation problem equivariant under the group H.
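A quick numerical check of the invariance condition for this loss (our own sketch; the particular numbers are arbitrary): transforming (µ, σ²) by [c, λ²] and transforming the equivariant estimates the same way, d₁ → λd₁ + c and d₂ → λ²d₂, leaves the loss unchanged:

```python
# Sketch: L = (mu - d1)^2 / sigma^2 + (sigma^2 - d2)^2 / sigma^4 is
# invariant under (mu, s2) -> (lam*mu + c, lam^2*s2) with equivariant d.

def loss(mu, s2, d1, d2):
    return (mu - d1) ** 2 / s2 + (s2 - d2) ** 2 / s2 ** 2

mu, s2 = 1.5, 2.0
d1, d2 = 1.2, 2.6
c, lam = -3.0, 4.0

before = loss(mu, s2, d1, d2)
after = loss(lam * mu + c, lam ** 2 * s2, lam * d1 + c, lam ** 2 * d2)
print(abs(before - after) < 1e-12)
```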

However, suppose the problem is not to estimate (µ, σ²), but to estimate a function of (µ, σ²) which is invariant under scale changes. Let G₁* be the class of scale changes on (z₁, ..., zₙ), where

G₁* = {[λ]₁ : λ ∈ R⁺}

with operation

[λ₁]₁ [λ₂]₁ = [λ₁λ₂]₁

and where

[λ]₁ (z₁, ..., zₙ) = (λz₁, ..., λzₙ) .

Lemma 3.3.3. G₁* is a group of transformations on X₀.

Proof. Since λ₁λ₂ ∈ R⁺, G₁* is closed. Associativity is immediate. The identity element is [1]₁, and the inverse of [λ]₁ is [λ⁻¹]₁.

G₁* induces a group of transformations Ḡ₁ on Θ, where the operation in Ḡ₁ is identical to the operation in G₁*, and if

ḡ = [λ]₁ ∈ Ḡ₁ for λ ∈ R⁺,

then

ḡθ = [λ]₁ (µ, σ²) = (λµ, λ²σ²) .

Lemma 3.3.4. Ḡ₁ is not transitive on Θ and thus not isomorphic to Θ.

Proof. The proof is immediate since if (µ₁, σ₁²) = [λ]₁ (µ₂, σ₂²) then µ₁/σ₁ = µ₂/σ₂, and thus for some θ₁, θ₂ ∈ Θ there does not exist a ḡ ∈ Ḡ₁ such that θ₁ = ḡθ₂.

Since Ḡ₁ is not transitive on Θ, S_Ḡ₁(Θ) contains more than one orbit and it makes sense to estimate orbits.
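The proof of Lemma 3.3.4 rests on the ratio µ/σ being constant on each orbit of the scale group; a small numerical sketch of our own makes this explicit:

```python
# Sketch: under [lam](mu, s2) = (lam*mu, lam^2*s2), the ratio mu/sigma is
# constant along each orbit, so points with different ratios lie in
# different orbits and the scale group cannot be transitive.

def act(lam, theta):
    mu, s2 = theta
    return lam * mu, lam ** 2 * s2

def index(theta):
    mu, s2 = theta
    return mu / s2 ** 0.5

theta = (2.0, 4.0)                     # mu/sigma = 1
orbit_vals = {round(index(act(lam, theta)), 9) for lam in (0.5, 1.0, 3.0)}
print(len(orbit_vals))                 # one value along the whole orbit

other = (6.0, 4.0)                     # mu/sigma = 3: a different orbit
print(index(theta) != index(other))
```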

In order to complete the necessary structure, let G₂* be the class of location changes on X₀. We define

G₂* = {[c]₂ : c ∈ R}

with the operation

[c₁]₂ [c₂]₂ = [c₁ + c₂]₂ .

Lemma 3.3.5. G₂* is a group of transformations on X₀.

Proof. G₂* is closed since c₁ + c₂ ∈ R, and associativity is immediate. The identity is [0]₂, and the inverse of [c]₂ is [-c]₂.

Define the operation of G₂* on X₀ as

[c]₂ (z₁, ..., zₙ) = (z₁ + c, ..., zₙ + c) .

Definition 3.3.1. Let G* be the set of ordered pairs [g₂, g₁] for g₂ ∈ G₂* and g₁ ∈ G₁*, writing [c, λ²] for the pair ([c]₂, [λ]₁), and let the following operation be defined on G*:

[c₁, λ₁²] [c₂, λ₂²] = [λ₁c₂ + c₁, λ₁²λ₂²] .

Thus G* is a class of location and scale changes on X₀.

Lemma 3.3.6. G* is a group of transformations on X₀.

Proof. i) Closure follows since λ₁c₂ + c₁ ∈ R and λ₁²λ₂² ∈ R⁺. ii) To demonstrate associativity,

([c₁, λ₁²] [c₂, λ₂²]) [c₃, λ₃²] = [λ₁λ₂c₃ + λ₁c₂ + c₁, λ₁²λ₂²λ₃²]

and

[c₁, λ₁²] ([c₂, λ₂²] [c₃, λ₃²]) = [λ₁λ₂c₃ + λ₁c₂ + c₁, λ₁²λ₂²λ₃²] .

iii) The identity is [0, 1], since

[0, 1] [c, λ²] = [c, λ²]

and

[c, λ²] [0, 1] = [c, λ²] .

iv) The inverse of [c, λ²] is [-cλ⁻¹, λ⁻²], since

[c, λ²] [-cλ⁻¹, λ⁻²] = [0, 1]

and

[-cλ⁻¹, λ⁻²] [c, λ²] = [0, 1] .

Let Ḡ be the group of transformations induced on Θ by G*.

Lemma 3.3.7. Ḡ is a group of transformations isomorphic to Θ.

Proof. The proof is immediate since it is apparent that Ḡ is identical to H̄, and Lemma 3.3.2 shows that H̄ is isomorphic to Θ.

Lemma 3.3.8. G* is not Abelian.

Proof. Under the group operation, [c, 1] [0, λ²] = [c, λ²], while [0, λ²] [c, 1] = [λc, λ²]; these differ whenever λ ≠ 1 and c ≠ 0.

Lemma 3.3.9. G₁* and G₂* are isomorphic to the subgroups H₁* and H₂* of H*, respectively, where if h₁ ∈ H₁* then

h₁ = [0, λ²]

and if h₂ ∈ H₂* then

h₂ = [c, 1] .

Thus Ḡ₁ is a sub-isomorphic group with respect to Θ.

Proof. The proof is immediate if one identifies [λ]₁ in G₁* with [0, λ²] in H₁* and [c]₂ in G₂* with [c, 1] in H₂*.

Lemma 3.3.10. H* is not the direct product of H₁* and H₂*.

Proof. The proof is immediate since H* is not Abelian, while the direct product of the Abelian subgroups H₁* and H₂* would be Abelian.

Lemma 3.3.11. H* is the semi-direct product of H₁* and H₂*.

Proof. The proof is immediate since any [a, b²] ∈ H* can be written uniquely as

[a, b²] = [a, 1] [0, b²] ,

with [a, 1] ∈ H₂* and [0, b²] ∈ H₁*.
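The left decomposition in Lemma 3.3.11 can be checked numerically (our own sketch, with the same (c, lam2) encoding of [c, λ²] as before): [a, b²] factors as [a, 1][0, b²], and reversing the order of the factors gives a different group element, consistent with Lemma 3.3.10:

```python
# Sketch of Lemma 3.3.11: [a, b^2] = [a, 1][0, b^2] under the operation
# [c1, l1^2][c2, l2^2] = [l1*c2 + c1, l1^2 * l2^2].

def op(g1, g2):
    c1, l1sq = g1
    c2, l2sq = g2
    return (l1sq ** 0.5 * c2 + c1, l1sq * l2sq)

a, b2 = -1.7, 6.25
h2 = (a, 1.0)          # element of H2* (pure location change)
h1 = (0.0, b2)         # element of H1* (pure scale change)

print(op(h2, h1) == (a, b2))   # True: the left decomposition k = k2 k1
print(op(h1, h2) == (a, b2))   # False: order matters, so not a direct product
```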

The structure is complete since H̄ (or Ḡ) is isomorphic to Θ, H* is the semi-direct product of H₁* and H₂*, and the problem is to estimate the orbit of H̄₁ (or Ḡ₁) in which θ lies.

CHAPTER IV

INVARIANT ESTIMATION OF ORBITS OF THE PARAMETER SPACE

In this chapter we will derive an estimator for the orbit of θ (with respect to a sub-isomorphic group L) which is a natural function of the sufficient statistic. We begin by showing that if K is a group of transformations isomorphic to Θ and L is a subgroup of K, then estimating orbits of Θ is equivalent to estimating the cosets of L in K. Thus if L is a normal subgroup, estimating the orbit of θ is equivalent to estimating elements of K/L. Our estimator will be obtained from group theoretic considerations and not by trying to minimize the appropriate invariant risk function. However, in Chapter VII we will relate our estimator to the minimum risk equivariant estimator.

The problem of estimating the orbits of a sub-isomorphic group L in Θ is facilitated by introducing the notion of a function which indexes orbits.

Definition 4.1.1. Let L be a subgroup of a group K of transformations isomorphic to a space Ω. A function f defined on the space Ω will be said to index the orbits of L in Ω if f(ω₁) = f(ω₂) if and only if ω₁ and ω₂ belong to the same orbit of L in Ω.

Thus a function indexes orbits if it is constant on an orbit and if it takes different values on different orbits. Such a function is usually called a maximal invariant (see, for example, Wijsman (1957)).

Lemma 4.1.1. If f indexes the orbits of L (a group of transformations) in a space Ω, then f is invariant with respect to L, and any other function invariant with respect to L is a function of f.

Proof. If f indexes the orbits of L in Ω then

f(ω₁) = f(ℓω₁) for every ℓ ∈ L and ω₁ ∈ Ω,

and thus f is invariant with respect to L. If f* is invariant with respect to L then

f*(ω₁) = f*(ℓω₁) for every ℓ ∈ L and ω₁ ∈ Ω,

so f* is constant on each orbit; since f takes a distinct value on each orbit, f* can be written as a function of f.

Definition 4.1.2. If f is a function which is invariant and such that any other invariant function is a function of f, then f is called a maximal invariant function.

Thus, any function which indexes the orbits of L in Ω is a maximal invariant function under L.
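As a concrete illustration of the definition (our own toy example, not from the text): let L be the group of common location shifts acting on Ω = R² by ℓ_c(ω₁, ω₂) = (ω₁ + c, ω₂ + c). The difference f(ω) = ω₁ − ω₂ is constant on each orbit and separates orbits, so it indexes the orbits of L and is a maximal invariant.

```python
# Toy example of an orbit-indexing (maximal invariant) function: the group of
# common location shifts on R^2, with f(w) = w1 - w2 as the orbit index.

def shift(c, w):
    """Action of the group element l_c on a point w = (w1, w2)."""
    return (w[0] + c, w[1] + c)

def f(w):
    """Candidate orbit index."""
    return w[0] - w[1]

w = (3.0, 1.0)
# Invariance: f(l_c w) = f(w) for every shift c.
invariant = all(f(shift(c, w)) == f(w) for c in (-2.0, 0.5, 10.0))
# Separation: points with different f-values lie on different orbits, since
# w' = l_c w forces w1' - w2' = w1 - w2.
separates = f((3.0, 1.0)) != f((3.0, 2.0))
```

Any other shift-invariant function of (ω₁, ω₂) depends on the pair only through ω₁ − ω₂, which is Lemma 4.1.1 in miniature.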

Continuing the example of estimating θ = (μ, σ²) from a normal distribution, consider estimating the orbits of the group H₂ of location changes in Θ. The function which indexes the orbits of H₂ in Θ must be invariant under transformations of the form

[c, 1](μ, σ²) = (μ + c, σ²),

and σ² is such a function. Thus the problem of estimating the orbits of H₂ is the problem of estimating the element of Θ/H₂ associated with (μ, σ²), or, equivalently, estimating σ². Since μ is a nuisance parameter and the problem is to estimate the cosets of H₂, H₂ might be called a nuisance group.

Now consider the problem of estimating the orbits of the group H₁ in the same problem. Since

[0, λ²](μ, σ²) = (λμ, λ²σ²),

neither μ nor σ² indexes the orbits of H₁ in Θ, so the problem in this case does not reduce to the problem of estimating a single parameter in the presence of a nuisance parameter.

The problem of estimation in the presence of nuisance parameters has been explored elsewhere (Zacks, 1971); we will concentrate on problems of the second type, in which orbits of the nuisance group are not indexed by a single parameter. The following results show how to obtain a function which indexes the orbits of H₁ in Θ, provided that H₂ is a normal subgroup of a group H, where H is the semi-direct product of H₂ and H₁.

Lemma 4.1.2. Suppose L₁ and L₂ are subgroups of a group L, where L₂ is normal in L. If

ℓ₂′ = ℓ₁′ℓ₂(ℓ₁′)⁻¹ for some ℓ₁′ ∈ L₁ and ℓ₂, ℓ₂′ ∈ L₂,

then

i) ℓ₁′ℓ₂ = ℓ₂′ℓ₁′ (4.1.1)

and

ii) (ℓ₁′ℓ₁)⁻¹ ℓ₂′ (ℓ₁′ℓ₁) = ℓ₁⁻¹ ℓ₂ ℓ₁ for every ℓ₁ ∈ L₁. (4.1.2)

Proof. We first note that since L₂ is normal in L we can define ℓ₂′ ∈ L₂ by

ℓ₂′ = ℓ₁′ℓ₂(ℓ₁′)⁻¹,

so that

i) ℓ₁′ℓ₂ = ℓ₂′ℓ₁′.

ii) (ℓ₁′ℓ₁)⁻¹ ℓ₂′ (ℓ₁′ℓ₁) = ℓ₁⁻¹(ℓ₁′)⁻¹ ℓ₁′ℓ₂(ℓ₁′)⁻¹ ℓ₁′ℓ₁ = ℓ₁⁻¹ ℓ₂ ℓ₁.

Corollary to Lemma 4.1.2. If L is the semi-direct product of L₂ and L₁ and L₂ is normal in L, then for every ℓ ∈ L and ℓ₁ ∈ L₁, there exist ℓ₂, ℓ₂′ ∈ L₂ and ℓ₁′ ∈ L₁ such that

i) ℓ = ℓ₂ ℓ₁′

and

ii) ℓ₁ ℓ₂ = ℓ₂′ ℓ₁.

Proof. The proof of i) follows immediately from the fact that L is the semi-direct product of L₂ and L₁. Part ii) follows from part i) of Lemma 4.1.2.

For any ω ∈ Ω, let ℓ_ω be the element in L such that

ω = ℓ_ω ω₀,

where ω₀ is the reference point chosen in Ω. Recalling that since L is the semi-direct product of L₂ and L₁, any ℓ ∈ L can be written ℓ = ℓ₂ℓ₁ for some ℓ₁ ∈ L₁ and ℓ₂ ∈ L₂, the following theorem shows how to use the group structure to construct a function on a space Ω which indexes the orbits of L₁ in Ω, or equivalently, is a maximal invariant function under L₁.

Theorem 4.1.1. Let L₁ be a sub-isomorphic group of transformations on a space Ω. Suppose a group L₂ of transformations on Ω exists, such that if L is the semi-direct product of L₂ and L₁, then L is isomorphic to Ω and L₂ is a normal subgroup of L. Suppose

ω = ℓ_ω ω₀ = ℓ_{2ω} ℓ_{1ω} ω₀.

Then

ψ(ω) = (ℓ_{1ω})⁻¹ ω = (ℓ_{1ω})⁻¹ ℓ_{2ω} ℓ_{1ω} ω₀

indexes the orbits of L₁ in Ω.

Proof. Let ω′ = ℓ₁′ω for some ℓ₁′ ∈ L₁. Then

ω′ = ℓ_{2ω′} ℓ_{1ω′} ω₀

for ℓ_{1ω′} = ℓ₁′ℓ_{1ω} ∈ L₁ and some ℓ_{2ω′} ∈ L₂. Then by definition,

ψ(ω′) = (ℓ_{1ω′})⁻¹ ℓ_{2ω′} ℓ_{1ω′} ω₀ = (ℓ₁′ℓ_{1ω})⁻¹ ℓ_{2ω′} (ℓ₁′ℓ_{1ω}) ω₀,

by part ii) of the Corollary to Lemma 4.1.2. Then

ψ(ω′) = (ℓ_{1ω})⁻¹ ℓ_{2ω} ℓ_{1ω} ω₀,

by part ii) of Lemma 4.1.2. Thus we have that

ψ(ω′) = ψ(ω),

and ψ is constant on orbits.

Now let φ(ω) be some other invariant function. Since φ(ω) is invariant and (ℓ_{1ω})⁻¹ ∈ L₁,

φ(ω) = φ((ℓ_{1ω})⁻¹ω) = φ(ψ(ω)),

so that any other function constant on the orbits of L₁ in Ω is a function of ψ. Thus ψ indexes the orbits of L₁.

We will now apply Theorem 4.1.1 to the estimation problem defined in Chapter III.

Section 4.2. A Maximal Invariant Parametric Function

Applying the results of Theorem 4.1.1 to the estimation problem defined in Chapter III, we have immediately a corollary to the theorem.

Corollary to Theorem 4.1.1. Let H₁ be a sub-isomorphic group of transformations on the parameter space Θ. Suppose a group H₂ of transformations on Θ exists, such that if H is the semi-direct product of H₂ and H₁, then H is a group isomorphic to G and H₂ is a normal subgroup of H. Then

ψ(θ) = (h_{1θ})⁻¹ h_{2θ} h_{1θ} θ₀

indexes the orbits of H₁ in Θ. We call ψ(θ) a maximal invariant parametric function.

Let us return to the example of estimating (μ, σ²) from a normal distribution to show how Theorem 4.1.1 is applied.

Lemma 4.2.1. The group H₂ of location changes is a normal subgroup of the group H of location and scale changes.

Proof. h h₂ h⁻¹ = [c, λ²][c′, 1][c, λ²]⁻¹

= [c, λ²][c′, 1][−cλ⁻¹, λ⁻²]

= [c, λ²][c′ − cλ⁻¹, λ⁻²]

= [λ(−cλ⁻¹ + c′) + c, 1] = [λc′, 1] ∈ H₂.

We can apply the Corollary to Theorem 4.1.1 to obtain an index of the orbits of H₁ (the group of scale changes) in Θ. Since

θ = (μ, σ²) = [μ, σ²](0, 1) = [μ, 1][0, σ²](0, 1),

we have

ψ₁(θ) = [0, σ²]⁻¹[μ, 1][0, σ²](0, 1)

= [0, σ⁻²](μ, σ²)

= (σ⁻¹μ, 1),

and thus ψ₁(θ) indexes the orbits of H₁ in Θ. Now the problem of estimating the orbits is equivalent to the problem of estimating μ/σ.
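The group arithmetic above is easy to check numerically. In the sketch below (our own illustration), an element [c, λ²] is stored as the pair (c, λ); the action (μ, σ²) → (λμ + c, λ²σ²) forces the composition rule [c, λ²][c′, λ′²] = [c + λc′, λ²λ′²] used in Lemma 4.2.1, and ψ₁(θ) = h₁θ⁻¹ h₂θ h₁θ θ₀ indeed reduces to (μ/σ, 1).

```python
import math

# Elements of the location-scale group H are stored as pairs (c, lam),
# representing [c, lam^2]; the action on theta = (mu, sigma2) is
# (mu, sigma2) -> (lam*mu + c, lam^2 * sigma2).  Illustrative sketch only.

def act(h, theta):
    c, lam = h
    mu, sigma2 = theta
    return (lam * mu + c, lam ** 2 * sigma2)

def compose(h, g):
    """h applied after g: [c, lam^2][c', lam'^2] = [c + lam*c', (lam*lam')^2]."""
    c, lam = h
    cp, lamp = g
    return (c + lam * cp, lam * lamp)

def inv(h):
    c, lam = h
    return (-c / lam, 1.0 / lam)

mu, sigma2 = 5.0, 4.0
theta0 = (0.0, 1.0)                   # reference point
h2 = (mu, 1.0)                        # location part [mu, 1]
h1 = (0.0, math.sqrt(sigma2))         # scale part [0, sigma^2]
theta = act(h2, act(h1, theta0))      # theta = h2 h1 theta0 = (mu, sigma^2)
psi = act(inv(h1), theta)             # psi_1(theta) = h1^{-1} h2 h1 theta0
```

Here psi comes out as (μ/σ, 1) = (2.5, 1.0), the orbit index of H₁.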

Note, however, that Theorem 4.1.1 does not apply to the problem of estimating the orbits of H₂, since H₁ is not a normal subgroup of H:

h h₁ h⁻¹ = [c, λ′²][0, λ²][c, λ′²]⁻¹ = [(1 − λ)c, λ²] ∉ H₁ for c ≠ 0 and λ ≠ 1.

Since the orbits of H₂ in Θ are indexed by σ², as was seen to be the case in Section 4.1, the proposed procedure only applies to certain problems, namely those in which a particular subgroup (H₂) is normal.

Section 4.3. An Estimator of the Index of Orbits

The problem of estimating the orbit of H₁ in Θ in which θ lies is now reduced to the problem of estimating the index of the orbit, a maximal invariant parametric function. In the estimation problem defined in Chapter III, the group G is isomorphic to the sufficient statistic space X, so we can obtain an invariant sufficient estimator of ψ(θ) by forming a function on X in the same way as ψ(θ) was formed on Θ.

The Stein theorem, as presented by Hall, Wijsman and Ghosh (1965), stated below without proof, assures us that this will be the same estimator we would obtain if we first reduced the sample space by invariance and then further reduced it by sufficiency. We will state only the portions of the Stein theorem relevant to the problems under consideration.

First, we must present some preliminaries relevant to the Stein theorem. Denoting the sample space as

(X₀, A, P),

where X₀ is the space of observations z with probability distribution P and σ-algebra A, the sufficient statistic space can be represented as

(X, A_s, P_s),

where A_s and P_s are the σ-algebra and probability measure induced on X by the sufficient statistic x. The reduction of the sample space by invariance yields

(X₀/H₁*, A(H₁*), P(H₁*)),

where X₀/H₁* is the space of orbits of H₁* in X₀ and A(H₁*) and P(H₁*) are the induced σ-algebra and probability measure.

If we reduce the sample space by sufficiency and then reduce the sufficient space by invariance, we obtain the space of invariant sufficient statistics. If we reduce the sample space first by invariance and then by sufficiency, we obtain the space of sufficient invariant statistics.

Lemma 4.3.1. (Stein) Suppose U(x) is a measurable function which indexes orbits of X/H₁ and V(x) is a measurable function which indexes orbits of X₀/H₁*. Then U(x) = V(x) if h(A_s) = A_s for all h ∈ H₁ (that is, if the group preserves the measurability of functions).

The following lemma is well known.

Lemma 4.3.2. If a space (Y, B, P) is isomorphic to a subspace of Rⁿ with the Borel σ-algebra, then the σ-algebra is preserved by nonsingular transformations and location changes.

The above lemma insures us that Lemma 4.3.1 will apply to our problems.

The importance of Lemma 4.3.1 is that it is easier to attack a problem by first applying sufficiency and then invariance than the other way around; however, the more natural approach is to reduce by invariance and then by sufficiency. Lemma 4.3.1 implies that both approaches yield the same result, so we will take the former approach for the sake of simplicity.

Theorem 4.3.1. If we consider the estimation problem defined in Chapter III with the group of location and scale changes isomorphic to the sufficient statistic space, then an invariant sufficient estimator is a sufficient invariant estimator.

Let x₀ be the reference point chosen in X, and let h_x ∈ H be such that

x = h_x x₀.

Recall that H₁ and H₂ induce groups on X, so that any x ∈ X can be represented as

x = h_{2x} h_{1x} x₀.

Theorem 4.3.2. If H₂ is a normal subgroup of H, then

T(x) = (h_{1x})⁻¹ h_{2x} h_{1x} x₀

is an invariant sufficient estimator of ψ(θ); equivalently, T(x) is an invariant sufficient estimator of the orbits of H₁ in Θ.

Proof. Since x is sufficient, T(x) is a function of the sufficient statistic. That T(x) is invariant can be demonstrated in the same way as in the proof of Theorem 4.1.1. Because of the isomorphisms linking H and the groups induced on X, T(x) can be used as an estimator of ψ(θ).

Again referring to the example of estimating (μ, σ²) from a normal distribution, the sufficient statistic (x̄, s²) can be written

(x̄, s²) = [x̄, 1][0, s²](0, 1).

An invariant sufficient estimator of the orbits of H₁ is then

T(x) = [0, s²]⁻¹[x̄, 1][0, s²](0, 1)

= [0, s⁻²](x̄, s²)

= (s⁻¹x̄, 1)

= (x̄/s, 1).

Since T(x) = (x̄/s) estimates the orbits of H₁ in Θ, we can say that the invariant sufficient statistic x̄/s estimates the index of orbits, μ/σ.
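Numerically (our own made-up sample), the estimator is simply the sample mean divided by the sample standard deviation, and it is unchanged by a common rescaling of the data:

```python
import statistics

# Sample from a hypothetical normal population; xbar/s estimates the orbit
# index mu/sigma, and s/xbar is the ordinary coefficient of variation.
x = [9.8, 10.4, 10.1, 9.6, 10.7, 10.0, 9.9, 10.3]

xbar = statistics.mean(x)
s = statistics.stdev(x)          # divisor n-1, as in the sufficient statistic
t_invariant = xbar / s           # estimates mu/sigma
cv = s / xbar                    # estimates sigma/mu

# T is unchanged if every observation is rescaled by lambda > 0:
x_scaled = [2.5 * xi for xi in x]
t_scaled = statistics.mean(x_scaled) / statistics.stdev(x_scaled)
```

The scale invariance checked in the last two lines is exactly the invariance of T(x) under the induced group H₁.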

Note that (x̄/s)⁻¹ = s/x̄ is the coefficient of variation found in elementary statistics texts. Therefore s/x̄ is an invariant sufficient estimator of the orbits of the scale transformation group, and thus of σ/μ, the population coefficient of variation, which indexes the orbits.

CHAPTER V

APPLICATION TO THE LINEAR MODEL: NONSTOCHASTIC INPUT MATRIX

In this chapter the method of invariant estimation developed in Chapter IV is used to obtain invariant estimators of the parameters in a linear model.

Consider the model

y = Zβ + ε,

where y and ε are n × 1 vectors of observable and nonobservable random variables, respectively; Z = [1, z₁, ..., z_k] is an n × p (p = k + 1) nonstochastic input matrix (n > p) of full rank; and β is a p × 1 vector [α, β₁, ..., β_k]′ of unknown parameters to be estimated. We assume that the n values of each of z₁, ..., z_k have been expressed in deviation form, and we make the usual assumptions that E(ε) = 0 and E(εε′) = σ²I.

Suppose that the k input variables z₁, ..., z_k and the dependent variable y have arbitrary units of measurement. (This assumption is quite common in many areas of research, especially in the social sciences when dealing with such measures as attitudinal or belief intensity scales.) The problem is to estimate β in a manner invariant under scale changes on z₁, ..., z_k and y.

In order to apply the methods of Chapter IV, the appropriate group structure must be defined. Although in the nonstochastic case of the linear model y alone is usually considered to be the random variable, to apply our procedure scale changes on z₁, ..., z_k must be expressed as a group of operators on the sample space. Thus we define the estimation problem as follows.

Let V(Z₀) be the space of all n × p matrices such that if Z ∈ V(Z₀), then

Z = Z₀D

for some positive definite diagonal matrix D and a fixed matrix Z₀ = [1, z₀₁, ..., z₀_k] of the same form as Z. Since Z is nonstochastic, we may treat it as a random variable with a one-point distribution in V(Z₀). That is, we assume that there exists a diagonal matrix D* such that

P(Z = Z₀D*) = 1.

Note that the leading element of D* must be a one. The sample space is then defined as

X₀ = {(Z, y)},

where z = (Z, y) is a point in X₀.

Consider the class of transformations H* defined on X₀ and containing elements [Δ, c, γ²], where Δ is a p × p positive definite diagonal matrix diag(1, δ₁, ..., δ_k), c is the p × 1 vector [c₁, ..., c_p]′, and γ² is a positive scalar. The operation of H* on X₀ is defined by

[Δ, c, γ²](Z, y) = (ZΔ, γy + ZΔc),

and the operation in H* is defined by

[Δ₂, c₂, γ₂²][Δ₁, c₁, γ₁²] = [Δ₁Δ₂, γ₂Δ₂⁻¹c₁ + c₂, γ₁²γ₂²].

Lemma 5.1. H* is a group.

Proof. i) Since Δ₁Δ₂ is a p × p positive definite diagonal matrix with a one as its first component, γ₂Δ₂⁻¹c₁ + c₂ is a p × 1 vector, and γ₁²γ₂² is a positive scalar, H* is closed.

ii) Both

([Δ₃, c₃, γ₃²][Δ₂, c₂, γ₂²])[Δ₁, c₁, γ₁²] and [Δ₃, c₃, γ₃²]([Δ₂, c₂, γ₂²][Δ₁, c₁, γ₁²])

reduce to

[Δ₁Δ₂Δ₃, γ₂γ₃Δ₂⁻¹Δ₃⁻¹c₁ + γ₃Δ₃⁻¹c₂ + c₃, γ₁²γ₂²γ₃²],

so the operation is associative.

iii) The identity element is [I, 0, 1] since

[I, 0, 1][Δ, c, γ²] = [Δ, c, γ²] = [Δ, c, γ²][I, 0, 1].

iv) The inverse of [Δ, c, γ²] is [Δ⁻¹, −γ⁻¹Δc, γ⁻²] since

[Δ, c, γ²][Δ⁻¹, −γ⁻¹Δc, γ⁻²] = [I, γΔ⁻¹(−γ⁻¹Δc) + c, 1] = [I, 0, 1]

and

[Δ⁻¹, −γ⁻¹Δc, γ⁻²][Δ, c, γ²] = [I, γ⁻¹Δc − γ⁻¹Δc, 1] = [I, 0, 1].

Thus H* is a group.
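The closure and associativity computations can be spot-checked numerically. The sketch below (ours, for p = 2) stores an element [Δ, c, γ²] as (d, (c₁, c₂), γ) with Δ = diag(1, d), implements the action on (Z, y), and verifies that the composition rule reproduces the composite action:

```python
# Numerical sketch of Lemma 5.1 for p = 2 (our own illustration): the action
# on a point (Z, y) is (Z, y) -> (Z Delta, gamma*y + Z Delta c), and the group
# operation is the one induced by composing actions.

def act(h, x):
    d, (c1, c2), g = h
    Z, y = x
    Zd = [(r0, r1 * d) for r0, r1 in Z]              # Z Delta
    y_new = [g * yi + r0 * c1 + r1 * c2              # gamma*y + (Z Delta) c
             for (r0, r1), yi in zip(Zd, y)]
    return (Zd, y_new)

def compose(h2, h1):
    """Element acting like h1 first, then h2 (the text's group operation)."""
    d1, (a1, b1), g1 = h1
    d2, (a2, b2), g2 = h2
    return (d1 * d2, (g2 * a1 + a2, g2 * b1 / d2 + b2), g1 * g2)

Z = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
y = [0.5, 1.0, 2.0]
h1 = (2.0, (0.3, -0.4), 1.5)
h2 = (0.5, (-1.0, 0.2), 2.0)

lhs = act(compose(h2, h1), (Z, y))    # composite element acting once
rhs = act(h2, act(h1, (Z, y)))        # the two actions applied in turn
```

That lhs equals rhs is precisely the statement that the displayed operation in H* is the composition of the corresponding transformations of X₀.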

The parameter space for the linear models problem is

Θ = {(Dₙ*, β, σ²) : Dₙ* ∈ Δ_p, β a p × 1 vector, σ² > 0},

where Δ_p is the space of all p × p positive definite diagonal matrices, and θ ∈ Θ is of the form

θ = (Dₙ*, β, σ²).

The induced group H on Θ contains elements [Δ, c, γ²] which act on Θ as follows:

[Δ, c, γ²](Dₙ*, β, σ²) = (Dₙ*Δ, γΔ⁻¹β + c, γ²σ²).

To see that H* on X₀ induces the above operation on Θ, consider

γy + ZΔc = γZβ + γε + ZΔc = ZΔ(γΔ⁻¹β + c) + γε,

so that if y is transformed to γy + ZΔc, then Z is transformed to ZΔ, and since

Z = Z₀Dₙ*,

ZΔ = Z₀(Dₙ*Δ),

so Dₙ* is transformed to Dₙ*Δ. Also, σ² is transformed to γ²σ² and β is transformed to γΔ⁻¹β + c.

Lemma 5.2. If the operation in H is defined by

[Δ₂, c₂, γ₂²][Δ₁, c₁, γ₁²] = [Δ₁Δ₂, γ₂Δ₂⁻¹c₁ + c₂, γ₁²γ₂²],

then H is a group.

Proof. The proof is exactly the same as for Lemma 5.1.

Lemma 5.3. If we assume that ε is normally distributed, then the sufficient statistic in the linear model problem is

(Dₙ(Z), β̂, s²),

where

Dₙ(Z) = diag(1, d₁^{1/2}, ..., d_k^{1/2})

and dᵢ is the (i+1)st diagonal element of (n−1)⁻¹Z′Z (note that since the values of z₁, ..., z_k are in deviation form, Dₙ(Z) contains the standard deviations of the k input variables); β̂ is the usual least squares estimator

β̂ = (Z′Z)⁻¹Z′y;

and s² is the residual or error mean square,

s² = (y − Zβ̂)′(y − Zβ̂)/(n − p).

Proof. It is well known that for the case in which Z is a fixed input matrix, (β̂, s²) is jointly sufficient for (β, σ²) (see Graybill (1961), for example). Thus the conditional frequency distribution given (β̂, s²) is independent of (β, σ²). Since Z, and thus Dₙ(Z), has a one-point distribution, it is immediate that the conditional density given (Dₙ(Z), β̂, s²) is also independent of (β, σ²), so that (Dₙ(Z), β̂, s²) is sufficient for (Dₙ*, β, σ²).

The class of operators induced on X by H* and H is denoted H̄, where the operation in H̄ is defined as were the operations in H* and H, and where

[Δ, c, γ²](Dₙ(Z), β̂, s²) = (Dₙ(Z)Δ, γΔ⁻¹β̂ + c, γ²s²).

It is immediate that H̄ is a group of transformations on X.

Although the group structure we have introduced appears quite cumbersome, it is motivated by the following lemma.

Lemma 5.4. H̄ is isomorphic to X, H is isomorphic to Θ, and H and H̄ are isomorphic.

Proof. H and H̄ are isomorphic since the elements and operations in the two groups are identical. To show that H is isomorphic to Θ and that H̄ is isomorphic to X, (I, 0, 1) is chosen as a reference point in both Θ and X. Then any point (Dₙ*, β, σ²) ∈ Θ can be written uniquely as

(Dₙ*, β, σ²) = [Dₙ*, β, σ²](I, 0, 1)

for [Dₙ*, β, σ²] ∈ H; and any point (Dₙ(Z), β̂, s²) ∈ X can be written uniquely as

(Dₙ(Z), β̂, s²) = [Dₙ(Z), β̂, s²](I, 0, 1)

for [Dₙ(Z), β̂, s²] ∈ H̄. Thus H is isomorphic to Θ and H̄ is isomorphic to X.

We now consider the class G₁* of scale changes on X₀ = {(Z, y)}, containing elements [Δ, γ²]₁, where

[Δ, γ²]₁(Z, y) = (ZΔ, γy).

Lemma 5.5. G₁* is a group under the operation defined by

[Δ, γ²]₁[Δ′, γ′²]₁ = [ΔΔ′, γ²γ′²]₁.

Proof. Closure and associativity are immediate. The identity is [I, 1]₁ and the inverse of [Δ, γ²]₁ is [Δ⁻¹, γ⁻²]₁.

Another class G₂* of transformations on X₀ exists, where G₂* is the class of location changes containing elements [c]₂, where

[c]₂(Z, y) = (Z, y + Zc).

Lemma 5.6. If the operation in G₂* is defined by

[c]₂[c′]₂ = [c + c′]₂,

then G₂* is a group.

Proof. G₂* is closed and associative with identity element [0]₂ and inverse [−c]₂.

Lemma 5.7. The set G* of ordered pairs

[[c]₂, [Δ, γ²]₁]

with the operation

[[c₂]₂, [Δ₂, γ₂²]₁][[c₁]₂, [Δ₁, γ₁²]₁] = [[γ₂Δ₂⁻¹c₁ + c₂]₂, [Δ₁Δ₂, γ₁²γ₂²]₁]

is a group of transformations isomorphic to H, and thus also isomorphic to Θ.

Proof. i) Closure is apparent upon inspection. ii) Associativity follows by the same computation as in Lemma 5.1. iii) The identity element is [[0]₂, [I, 1]₁]. iv) The inverse of [[c]₂, [Δ, γ²]₁] is [[−γ⁻¹Δc]₂, [Δ⁻¹, γ⁻²]₁]. Thus G* is a group. To show that G* is isomorphic to H, we need only identify the element [[c]₂, [Δ, γ²]₁] ∈ G* with the element [Δ, c, γ²] ∈ H.

Lemma 5.8. If H₁ contains elements of H of the form [Δ, 0, γ²] and H₂ contains elements of H of the form [I, c, 1], then G₁* and H₁ are isomorphic, G₂* and H₂ are isomorphic, and H₁ and H₂ are subgroups of H.

Proof. i) If we identify [Δ, γ²]₁ ∈ G₁* with [Δ, 0, γ²] ∈ H₁ and [c]₂ ∈ G₂* with [I, c, 1] ∈ H₂, then it is apparent that G₁* is isomorphic to H₁ and G₂* is isomorphic to H₂.

ii) First consider H₁. For [Δ, 0, γ²] and [Δ′, 0, γ′²] ∈ H₁,

[Δ, 0, γ²][Δ′, 0, γ′²] = [Δ′Δ, 0, γ²γ′²] ∈ H₁,

so H₁ is closed. Also,

[Δ, 0, γ²]⁻¹ = [Δ⁻¹, 0, γ⁻²] ∈ H₁,

so H₁ contains inverses, and by Lemma 2.1, H₁ is a subgroup of H.

iii) For [I, c, 1] and [I, c′, 1] ∈ H₂,

[I, c, 1][I, c′, 1] = [I, c + c′, 1] ∈ H₂,

so H₂ is closed. Also,

[I, c, 1]⁻¹ = [I, −c, 1] ∈ H₂,

so H₂ contains inverses and is therefore a subgroup of H.

The problem is to estimate the orbits of H₁ in Θ.

Lemma 5.9. H₂ is a normal subgroup of H.

Proof. h h₂ h⁻¹ = [Δ, c, γ²][I, c′, 1][Δ, c, γ²]⁻¹

= [Δ, c, γ²][I, c′, 1][Δ⁻¹, −γ⁻¹Δc, γ⁻²]

= [Δ, c, γ²][Δ⁻¹, −γ⁻¹Δc + c′, γ⁻²]

= [I, γΔ⁻¹c′, 1] ∈ H₂,

and H₂ is a normal subgroup of H.

Lemma 5.9 allows us to obtain an index of the orbits of H₁ in Θ.

Theorem 5.1. In the linear model estimation problem,

Ψ(θ) = (I, Dₙ*β/σ, 1)

is a maximal invariant parametric function.

Proof. Using the result of the Corollary to Theorem 4.1.1,

Ψ(θ) = (h_{1θ})⁻¹ h_{2θ} h_{1θ} θ₀

= [Dₙ*, 0, σ²]⁻¹[I, β, 1][Dₙ*, 0, σ²](I, 0, 1)

= [Dₙ*⁻¹, 0, σ⁻²](Dₙ*, β, σ²)

= (I, Dₙ*β/σ, 1).

Thus the parametric function

B = Dₙ*β/σ

indexes the cosets of H₁ in Θ, and the problem of estimating (Dₙ*, β, σ²) invariantly under scale changes is equivalent to the problem of estimating B.

Note that Dₙ*β/σ is a parametric counterpart of the "standardized regression coefficient" or "beta coefficient" widely used in social science research.

To obtain an invariant sufficient estimator of B, we compute T(x) as indicated by Theorem 4.3.2. The estimator we will obtain is not, however, the estimator commonly used by social scientists.

Theorem 5.2. In the linear model problem,

B̃ = Dₙ(Z)β̂/s

is an invariant sufficient estimator of B.

Proof. T(x) = (h_{1x})⁻¹ h_{2x} h_{1x} x₀

= [Dₙ(Z), 0, s²]⁻¹[I, β̂, 1][Dₙ(Z), 0, s²](I, 0, 1)

= (I, Dₙ(Z)β̂/s, 1).

Thus, by Theorem 4.3.2,

B̃ = Dₙ(Z)β̂/s

is an invariant sufficient estimator of B.
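As a worked numerical sketch (our own synthetic data, k = 1, so Dₙ(Z) reduces to the scalar s_z on the slope): B̃ divides the standardized slope by the residual standard deviation s, whereas the common social science "beta coefficient" divides by the marginal standard deviation s_y; the decomposition of s_y² used in Section 6.2 links the two.

```python
import statistics

# Hypothetical data for the rank-two model y = alpha + beta*z + eps (k = 1).
z = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]          # already in deviation form
y = [1.0, 2.2, 2.8, 4.1, 5.2, 5.8, 7.1]
n = len(z)

beta_hat = sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * zi for zi in z)
alpha_hat = statistics.mean(y)                       # valid because mean(z) = 0
resid = [yi - alpha_hat - beta_hat * zi for zi, yi in zip(z, y)]

s_z = statistics.stdev(z)                            # input standard deviation
s = (sum(e * e for e in resid) / (n - 2)) ** 0.5     # residual std deviation
s_y = statistics.stdev(y)                            # marginal std deviation

B_tilde = s_z * beta_hat / s       # invariant sufficient estimator of B
B_hat = s_z * beta_hat / s_y       # usual social science beta coefficient
```

With a strong signal, s_y is much larger than s, so the social science estimator is far smaller than B̃; the two denominators are tied together by s_y² = [(n−1)s_z²β̂² + (n−2)s²]/(n−1).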

Note that the usual social science estimator of B uses s_y, the marginal standard deviation of y, in the denominator, rather than the residual standard deviation. Chapter VI will demonstrate that B̃ is a reasonably good estimator of B and is in fact superior to the usual social science estimator according to several criteria.

CHAPTER VI

COMPARISON OF THE ESTIMATORS OF THE MAXIMAL INVARIANT PARAMETRIC FUNCTION

In this chapter we consider the properties of the proposed function

B̃ = Dₙ(Z)β̂/s

as an estimator of the maximal invariant parametric function

B = Dₙ*β/σ,

and will compare B̃ to the usual social science estimator of B. We will show that the invariant sufficient estimator B̃ of the maximal invariant parametric function B is consistent, has small bias, and has a relatively simple distribution. We will also demonstrate that the social science estimator possesses none of these properties. Throughout this chapter we assume that the error component ε follows a normal distribution.

Section 6.1. Properties of B̃

Definition 6.1.1. If θₙ is a parameter which depends on n and θₙ tends to a constant a as n → ∞, then Tₙ will be called a consistent estimator of θₙ if Tₙ converges to a (in probability).

Theorem 6.1.1. B̃ is a consistent estimator of B.

Proof. Assuming Dₙ* is chosen so that it converges to some fixed diagonal matrix Δ, we must show that B̃ converges to Δβ/σ (in probability). If Dₙ* converges to Δ, then Dₙ(Z) also converges to Δ (in probability). Since s² converges to σ², Slutsky's theorem (Cramér, 1946) assures us that s converges to σ; and since β̂ converges to β, we have (again by Slutsky's theorem) that B̃ converges to Δβ/σ (in probability).

Lemma 6.1.1. For the rank two case (k = 1), the random variable

(n−1)^{1/2} B̃ = (n−1)^{1/2} s_z β̂/s,

where

s_z² = Σᵢ₌₁ⁿ zᵢ²/(n−1)

and

s² = Σᵢ₌₁ⁿ (yᵢ − α̂ − β̂zᵢ)²/(n−2),

is distributed as a noncentral t random variable with n−2 degrees of freedom and noncentrality parameter

δ = (n−1)^{1/2} s_z β/σ.

Proof. Under normal theory assumptions,

β̂ ~ N(β, σ²/((n−1)s_z²)),

and thus

T₁ = (n−1)^{1/2} s_z β̂/σ ~ N((n−1)^{1/2} s_z β/σ, 1).

Similarly

T₂ = (n−2)s²/σ² ~ χ²(n−2).

Since T₁ and T₂ are independent (Graybill (1961)),

(n−1)^{1/2} B̃ = (n−1)^{1/2} s_z β̂/s = T₁/[T₂/(n−2)]^{1/2}

is distributed as a noncentral t random variable with n−2 degrees of freedom and noncentrality parameter δ = (n−1)^{1/2} s_z β/σ.

Corollary to Lemma 6.1.1. For the rank two case, the random variable B̃ is distributed as (n−1)^{−1/2} times a noncentral t random variable with n−2 degrees of freedom and noncentrality parameter δ = (n−1)^{1/2} s_z β/σ.

Proof. Let t = (n−1)^{1/2} B̃. By Lemma 6.1.1, t follows a t distribution with n−2 degrees of freedom and noncentrality parameter δ. Since B̃ = (n−1)^{−1/2} t, the proof is immediate.
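This distributional result can be checked by a small Monte Carlo experiment (our own sketch, with an arbitrary fixed design and parameter values): the simulated mean of (n−1)^{1/2} B̃ should agree with the exact noncentral t mean δ(r/2)^{1/2} Γ((r−1)/2)/Γ(r/2) given later in Lemma 6.1.4.

```python
import math
import random

# Monte Carlo sketch of Lemma 6.1.1 (ours): for k = 1 and normal errors,
# t = (n-1)^{1/2} s_z beta_hat / s is noncentral t with r = n-2 degrees of
# freedom and noncentrality delta = (n-1)^{1/2} s_z beta / sigma.
rng = random.Random(12345)
z = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]      # fixed design, deviation form
n = len(z)
alpha, beta, sigma = 1.0, 1.2, 1.0

s_z = math.sqrt(sum(v * v for v in z) / (n - 1))
delta = math.sqrt(n - 1) * s_z * beta / sigma
r = n - 2

vals = []
for _ in range(8000):
    y = [alpha + beta * v + rng.gauss(0.0, sigma) for v in z]
    b = sum(v * yi for v, yi in zip(z, y)) / sum(v * v for v in z)
    a = sum(y) / n                          # z is in deviation form
    sse = sum((yi - a - b * v) ** 2 for v, yi in zip(z, y))
    s = math.sqrt(sse / (n - 2))
    vals.append(math.sqrt(n - 1) * s_z * b / s)

mc_mean = sum(vals) / len(vals)
# Exact noncentral t mean: delta * (r/2)^{1/2} * Gamma((r-1)/2)/Gamma(r/2).
exact_mean = delta * math.sqrt(r / 2) * math.exp(
    math.lgamma((r - 1) / 2) - math.lgamma(r / 2))
```

The agreement of mc_mean with exact_mean (up to simulation error) is a direct numerical check of the lemma.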

Lemma 6.1.2. The random variable (n−1)^{1/2} D_s β̂/s follows a multivariate noncentral t distribution with n−p degrees of freedom and noncentrality parameter δ, where

D_s = diag(s₀₀^{1/2}, s₁₁^{1/2}, ..., s_kk^{1/2}),

(1/sᵢⱼ) is the (i, j) element of (n−1)(Z′Z)⁻¹ for i, j = 0, 1, ..., k, and

δ = [(n−1)^{1/2} s₀₀^{1/2} α/σ, (n−1)^{1/2} s₁₁^{1/2} β₁/σ, ..., (n−1)^{1/2} s_kk^{1/2} β_k/σ]′.

Proof. The proof follows from the fact that (Graybill, 1961) each component (n−1)^{1/2} sᵢᵢ^{1/2} β̂ᵢ/s (calling β₀ = α) has a marginal noncentral t distribution.

Theorem 6.1.2. B̃ = Dₙ(Z)β̂/s is distributed as a multivariate noncentral t random variable with n−p degrees of freedom and noncentrality parameter δ (where δ is defined as in Lemma 6.1.2), up to the scale factor (n−1)^{1/2} D_s Dₙ⁻¹(Z).

Proof. The proof follows immediately from Lemma 6.1.2, since

(n−1)^{1/2} D_s β̂/s = (n−1)^{1/2} D_s Dₙ⁻¹(Z) B̃.

To obtain an expression for the density, consider a nonsingular matrix A such that

A′⁻¹ Z′Z A⁻¹ = I.

Then the model

y = Zβ + ε

can be written

y = Wη + ε,

where W = ZA⁻¹ and η = Aβ. Then

W′W = A′⁻¹ Z′Z A⁻¹ = I

and

η̂ = Aβ̂,

where

Var(η̂) = σ²(W′W)⁻¹ = σ²I,

so that the η̂ᵢ are independent normal random variables with means ηᵢ and variances σ². Thus

η̂ᵢ/s for i = 1, ..., p

are noncentral t random variables, and the joint density of

(η̂₁/s, ..., η̂_p/s)

is the product of p univariate noncentral t densities; the joint density of (n−1)^{1/2} D_s β̂/s is then obtained by a linear change of variable.

The following two lemmas will be used in Theorem 6.1.3 to show that B̃ is asymptotically unbiased.

Lemma 6.1.3. If the random variable u is distributed as a chi-square random variable with r degrees of freedom, then

E(u^{−1/2}) = Γ((r−1)/2)/(2^{1/2} Γ(r/2)).

Proof. If u ~ χ²(r), then for u > 0,

f(u) = [1/(2^{r/2} Γ(r/2))] u^{r/2 − 1} e^{−u/2}.

Then

E(u^{−1/2}) = [1/(2^{r/2} Γ(r/2))] ∫₀^∞ u^{(r−1)/2 − 1} e^{−u/2} du

= 2^{(r−1)/2} Γ((r−1)/2)/(2^{r/2} Γ(r/2))

= Γ((r−1)/2)/(2^{1/2} Γ(r/2)).

Lemma 6.1.4. If v is distributed as a noncentral t random variable with r degrees of freedom and noncentrality parameter δ, then

E(v) = δ (r/2)^{1/2} Γ((r−1)/2)/Γ(r/2).

Proof. The first moment of the noncentral t is found by noting that if v ~ t_r(δ), then

v = u₁/(u₂/r)^{1/2},

where u₁ ~ N(δ, 1), u₂ ~ χ²(r), and u₁ and u₂ are independent. Then

v = (u₁ − δ)/(u₂/r)^{1/2} + δ/(u₂/r)^{1/2} = w/(u₂/r)^{1/2} + δ/(u₂/r)^{1/2},

where w = u₁ − δ. Since w ~ N(0, 1), E(w) = 0. Since u₂ ~ χ²(r),

E[(u₂/r)^{−1/2}] = r^{1/2} E(u₂^{−1/2}) = (r/2)^{1/2} Γ((r−1)/2)/Γ(r/2)

by Lemma 6.1.3. Thus, since w and u₂ are independent,

E(v) = E(w) E[(u₂/r)^{−1/2}] + δ E[(u₂/r)^{−1/2}] = δ (r/2)^{1/2} Γ((r−1)/2)/Γ(r/2).

Theorem 6.1.3. The bias in B̃ is asymptotically zero.

Proof. Consider the case k = 1. Since by Lemma 6.1.1, (n−1)^{1/2} B̃ has a noncentral t distribution with n−2 degrees of freedom and noncentrality parameter (n−1)^{1/2} s_z β/σ, we have from Lemma 6.1.4 that

E[(n−1)^{1/2} B̃] = (n−1)^{1/2} (s_z β/σ) [(n−2)/2]^{1/2} Γ((n−3)/2)/Γ((n−2)/2).

Thus,

E(B̃) = (s_z β/σ) [(n−2)/2]^{1/2} Γ((n−3)/2)/Γ((n−2)/2).

Stirling's formula gives the following approximation to the gamma function:

Γ(x) ≈ e^{−x} x^{x−1/2} (2π)^{1/2}.

Thus

Γ((n−3)/2) ≈ e^{−(n−3)/2} [(n−3)/2]^{(n−4)/2} (2π)^{1/2}

and

Γ((n−2)/2) ≈ e^{−(n−2)/2} [(n−2)/2]^{(n−3)/2} (2π)^{1/2}.

Now

Γ((n−3)/2)/Γ((n−2)/2) ≈ e^{1/2} [(n−3)/2]^{(n−4)/2} / [(n−2)/2]^{(n−3)/2}

= (2e)^{1/2} (n−2)^{−1/2} [(n−3)/(n−2)]^{(n−4)/2}

= (2e)^{1/2} (n−2)^{−1/2} [1 − 1/(n−2)]^{(n−4)/2}

= (2e)^{1/2} (n−2)^{−1/2} [1 − 1/(n−2)]^{(n−2)/2} [1 − 1/(n−2)]^{−1}.

For large n,

[1 − 1/(n−2)]^{(n−2)/2} → e^{−1/2} and [1 − 1/(n−2)]^{−1} → 1,

so that

Γ((n−3)/2)/Γ((n−2)/2) ≈ (2e)^{1/2} (n−2)^{−1/2} e^{−1/2} = [2/(n−2)]^{1/2}

for large n. Thus for large n,

E(B̃) ≈ (s_z β/σ) [(n−2)/2]^{1/2} [2/(n−2)]^{1/2} = s_z β/σ = B.
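The gamma-function ratio in this proof is easy to evaluate exactly (our own numerical check, using log-gamma for stability): the factor [(n−2)/2]^{1/2} Γ((n−3)/2)/Γ((n−2)/2) multiplying B exceeds one for every n but tends to one as n grows, which is the asymptotic unbiasedness just established.

```python
import math

def bias_factor(n):
    """[(n-2)/2]^{1/2} * Gamma((n-3)/2) / Gamma((n-2)/2); E(B_tilde) = factor * B."""
    return math.sqrt((n - 2) / 2) * math.exp(
        math.lgamma((n - 3) / 2) - math.lgamma((n - 2) / 2))

factors = {n: bias_factor(n) for n in (10, 30, 100, 1000)}
```

For n = 10 the factor is already only about 1.11, and it shrinks steadily toward 1, so the bias of B̃ is small even for modest samples.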

Corollary to Theorem 6.1.3. The unbiased estimator of B, in the rank two case, is

B* = [2/(n−2)]^{1/2} [Γ((n−2)/2)/Γ((n−3)/2)] B̃.

Proof. The proof is immediate since by Theorem 6.1.3,

E(B̃) = [(n−2)/2]^{1/2} [Γ((n−3)/2)/Γ((n−2)/2)] B.

Theorem 6.1.4. For k > 1, the unbiased estimator of B is

B* = [2/(n−p)]^{1/2} [Γ((n−p)/2)/Γ((n−p−1)/2)] B̃.

Proof. If the model y = Zβ + ε is reparameterized as

y = Wη + ε,

where W = ZA⁻¹ and η = Aβ for some nonsingular matrix A such that W′W = I, then

B̃_A = Dₙ(W)η̂/s = (b̃₁, ..., b̃_p)′.

Now,

E(b̃ᵢ) = g bᵢ,

where

g = [(n−p)/2]^{1/2} Γ((n−p−1)/2)/Γ((n−p)/2),

so

E(B̃_A) = g B_A = g Dₙ(W)η/σ.

But

B̃ = Dₙ(Z)β̂/s = Dₙ(Z)A⁻¹η̂/s = Dₙ(Z)A⁻¹Dₙ⁻¹(W)B̃_A.

Then

E(B̃) = g Dₙ(Z)A⁻¹Dₙ⁻¹(W)Dₙ(W)η/σ = g Dₙ(Z)β/σ = g B,

and B* = g⁻¹B̃ is unbiased for B.

Note that [2/(n−2)]^{1/2} Γ((n−2)/2)/Γ((n−3)/2) is less than one. Thus, B* will tend to be a bit smaller than B̃ in absolute value.

We now have that the natural invariant sufficient estimator B̃ of the maximal invariant parametric function B is consistent, has very small bias for large n, and has a relatively simple distribution.

Theorem 6.1.5 will show, however, that the unbiased form B* is not efficient in the Cramér-Rao sense. We refer first to a theorem presented in Zacks (1971), presenting only the essentials and omitting the proof.

Lemma 6.1.5. (The generalized Cramér-Rao inequality.) Let {f(x; θ) : θ ∈ Θ} be a family of density functions satisfying certain general regularity conditions (all of which hold for this situation). Let g(θ) be a vector-valued function on Θ such that the matrix of partial derivatives with respect to θ exists for all θ ∈ Θ. Let ĝ(x) be an unbiased estimator of g(θ) such that

Var[ĝᵢ(x)] < ∞

for all i = 1, ..., r and all θ. If the matrix of partial derivatives is nonsingular, then ĝ(x) attains the minimum possible generalized variance if and only if ĝ(x) is a function of the minimal sufficient statistic for θ and, for each i = 1, ..., r,

(∂/∂θᵢ) log f(x; θ₁, ..., θ_r) = cᵢ(θ)[ĝᵢ(x) − gᵢ(θ)],

where cᵢ(θ) may depend on θ but not on x.

Theorem 6.1.5. B* is not an efficient estimator of B (in the Cramér-Rao sense).

Proof. Under the normality assumption the joint density of the observations can be written

f(y; β, σ²) = (2πσ²)^{−n/2} exp{−[(β̂ − β)′Z′Z(β̂ − β) + (n−p)s²]/(2σ²)},

so that

log f = −(n/2) log σ² − (n/2) log 2π − [(β̂ − β)′Z′Z(β̂ − β) + (n−p)s²]/(2σ²)

and

(∂/∂σ²) log f = −n/(2σ²) + [(β̂ − β)′Z′Z(β̂ − β) + (n−p)s²]/(2σ⁴),

which is not a function of c(θ)(B* − B), so that B* is not efficient in that it does not attain the Cramér-Rao lower bound.

Even though B* does not attain the Cramér-Rao lower bound, B* may be considered to be efficient in that it is the uniformly minimum variance unbiased estimator of B. This is so because B* is unbiased for B and is also a function of the complete sufficient statistic (β̂, s²) (see Graybill, 1961).

B* also has smaller mean squared error than B̃. For simplicity, consider k = 1 and let B̃ = ηB*, where

η = [(n−2)/2]^{1/2} Γ((n−3)/2)/Γ((n−2)/2) > 1.

Now,

E(B* − B)² = Var(B*) = v > 0

and

E(B̃ − B)² = E(ηB* − ηB)² + E(ηB − B)² = η²v + (η − 1)²B².

Suppose

v > η²v + (η − 1)²B².

Then

v(1 − η²) > (η − 1)²B²

and

v < (η − 1)²B²/(1 − η²),

since if η > 1, 1 − η² < 0 and dividing by it reverses the inequality. Then

v < (1 − η)²B²/[(1 − η)(1 + η)] = [(1 − η)/(1 + η)]B².

Since η > 1, 1 − η < 0 and

[(1 − η)/(1 + η)]B² ≤ 0,

which implies that v < 0, which cannot be. Thus, v must be less than or equal to η²v + (η − 1)²B², or

E(B* − B)² ≤ E(B̃ − B)².

Since B* is uniformly minimum variance unbiased and has smaller mean squared error than B̃, it might appear that B* would be preferred to B̃ as an estimator of B. However, we feel that in practice, since the bias in B̃ is quite small and since computation of the quantity

[2/(n−2)]^{1/2} Γ((n−2)/2)/Γ((n−3)/2)

is quite laborious, B̃ is the natural estimator to use unless the problem specifically requires use of the unbiased form. Our intent, furthermore, is not to compare the relative merits of the alternative forms of B̃, but rather to compare B̃ to the commonly used estimator.

Section 6.2. Properties of the Usual Social Science Estimator

The estimator of B commonly used in the social sciences is

B̂ = Dₙ(Z)β̂/s_y,

where

s_y² = Σᵢ₌₁ⁿ (yᵢ − ȳ)²/(n−1).

We will show that B̂ does not possess desirable properties: although B̂ is invariant and is a function of the sufficient statistic for (Dₙ*, β, σ²), B̂ is not a consistent estimator of B, it is not unbiased, and it has an unwieldy distribution.

;Lemma 6~.f_.1_. B is not a consistent estimator of B.

Proof. Recall that in order for B to be a consistent estimator of β*, B must converge to Δβ/σ in probability, assuming that D_n* is chosen to converge to a diagonal matrix Δ. We note that

    B = D_n(Z) β̂ / s_y ,

where

    (n−1) s_y² = Σ_{i=1}^n (y_i − ȳ)²

and

    Z' Z = [ n   0 ⋯ 0
             0
             ⋮   Z₁' Z₁
             0          ]

for Z₁' Z₁ a k × k matrix. Then

    Σ_{i=1}^n (y_i − ȳ)² = β̂₁' Z₁' Z₁ β̂₁ + SSE ,

where SSE is the error, or residual, sum of squares. Thus

    s_y² = [β̂₁' Z₁' Z₁ β̂₁ + SSE] / (n−1) .

Now, SSE/(n−1) converges to σ² and Z₁' Z₁/(n−1) will converge to some positive definite matrix K. Thus s_y converges to (β₁' K β₁ + σ²)^{1/2} ≠ σ, and B does not converge to Δβ/σ (in probability).
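The inconsistency is easy to exhibit by simulation. A minimal sketch for k = 1; the true values β = 1, σ = 1 and the standard-normal inputs are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, sigma = 50_000, 1.0, 1.0
z = rng.normal(size=n)                    # stand-in for the z column
y = beta * z + rng.normal(scale=sigma, size=n)

zc, yc = z - z.mean(), y - y.mean()
beta_hat = (zc @ yc) / (zc @ zc)          # least-squares slope
s_z = zc.std(ddof=1)
s_y = yc.std(ddof=1)
resid = yc - beta_hat * zc
s = np.sqrt(resid @ resid / (n - 2))      # s^2 = SSE/(n-2)

B_social = s_z * beta_hat / s_y           # tends to sigma_z*beta / sqrt(beta^2*sigma_z^2 + sigma^2)
B_tilde = s_z * beta_hat / s              # tends to sigma_z*beta / sigma = beta*

assert abs(B_tilde - 1.0) < 0.05          # consistent for beta* = 1
assert abs(B_social - 1.0 / np.sqrt(2.0)) < 0.05  # converges, but to the wrong value
```

With these parameter choices β* = 1 while B converges to 1/√2, illustrating the systematic underestimation in absolute value noted below.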

Lemma 6.2.2. B is not an unbiased estimator of β*.

Proof. If D_n* is chosen to converge to some diagonal matrix Δ, then by Lemma 6.2.1, B converges to

    Δ β / (β₁' K β₁ + σ²)^{1/2} ≠ Δ β / σ

unless β = 0 or K = 0. Since the parameter and the estimator do not converge to the same value, B cannot be unbiased for β*.

We also note that since K is positive definite, β₁' K β₁ > 0, so that

    |Δ β / (β₁' K β₁ + σ²)^{1/2}| < |β*| ,

and B tends to underestimate β* in absolute value. The effects of this underestimation will be illustrated

in Chapter X.

Lemma 6.2.3. For k = 1, B can be written

    B = u^{1/2} / (u + r)^{1/2} ,

where u is distributed as a noncentral F random variable with 1 and

r = n − 2 degrees of freedom and noncentrality parameter

    λ = (n−1) s_z² β² / 2σ² .

Proof. We note that for k = 1, s_y² can be written

    s_y² = s_z² β̂² + [(n−2)/(n−1)] s² .

Thus

    B = s_z β̂ / s_y = (n−1)^{1/2} s_z β̂ / [(n−1) s_z² β̂² + (n−2) s²]^{1/2} .

Let u = (n−1) s_z² β̂² / s². Then since

    β̂ ~ N(β, σ² / [(n−1) s_z²]) ,

we have

    (n−1)^{1/2} s_z β̂ / σ ~ N((n−1)^{1/2} s_z β / σ, 1)

and

    (n−1) s_z² β̂² / σ² ~ χ₁²(λ) ,

where

    λ = (n−1) s_z² β² / 2σ² .

Also, (n−2) s²/σ² ~ χ²_{n−2}, independently of β̂, so that

    u = [(n−1) s_z² β̂² / σ²] / [s² / σ²]

is distributed as a noncentral F with 1 and r = n − 2 degrees of freedom and noncentrality parameter λ. Then u^{1/2} = (n−1)^{1/2} s_z β̂ / s and

    B = u^{1/2} / [u + r]^{1/2} .
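The algebraic identity in this proof holds exactly on any data set and can be confirmed directly. A minimal sketch with simulated data; all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
z = rng.normal(size=n)
y = 0.8 * z + rng.normal(size=n)

zc, yc = z - z.mean(), y - y.mean()
beta_hat = (zc @ yc) / (zc @ zc)
s_z2 = zc @ zc / (n - 1)
s_y2 = yc @ yc / (n - 1)
resid = yc - beta_hat * zc
s2 = resid @ resid / (n - 2)             # s^2 = SSE/(n-2)

r = n - 2
u = (n - 1) * s_z2 * beta_hat**2 / s2    # the noncentral-F variable of the lemma
B = np.sqrt(s_z2) * abs(beta_hat) / np.sqrt(s_y2)

# (n-1) s_y^2 = (n-1) s_z^2 beta_hat^2 + (n-2) s^2, hence B = sqrt(u/(u+r))
assert abs(B - np.sqrt(u / (u + r))) < 1e-12
```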

Lemma 6.2.4. The distribution of B, in the rank two case, is

    f(w) = [2 e^{−λ} (1−w²)^{(r−2)/2} / Γ(r/2)] Σ_{i=0}^∞ [λ^i / i!] [Γ((2i+1+r)/2) / Γ((2i+1)/2)] w^{2i} ,   0 < w < 1 .

Proof. By Lemma 6.2.3, u is distributed as F_{1,r}(λ). That is, for u ≥ 0,

    f(u) = Σ_{i=0}^∞ [e^{−λ} λ^i / i!] [Γ((2i+1+r)/2) / (Γ((2i+1)/2) Γ(r/2))] r^{r/2} u^{(2i−1)/2} / (r+u)^{(2i+1+r)/2} .

Let

    w² = u / (r + u) .

Then

    u = r w² / (1 − w²) ,

    du = [2rw(1−w²) + 2rw³] / (1−w²)² dw = 2rw / (1−w²)² dw ,

and

    r + u = r + rw²/(1−w²) = [r − rw² + rw²] / (1−w²) = r / (1−w²) .

Substituting,

    f(w) = Σ_{i=0}^∞ K G_i [r w²/(1−w²)]^{(2i−1)/2} [2rw/(1−w²)²] / [r/(1−w²)]^{(2i+1+r)/2} ,

where

    K = r^{r/2} e^{−λ} / Γ(r/2)

and

    G_i = λ^i Γ((2i+1+r)/2) / [i! Γ((2i+1)/2)] .

Collecting powers, the powers of r cancel, the power of w is 2i, and the power of (1−w²) is (r−2)/2. Thus

    f(w) = [2 e^{−λ} (1−w²)^{(r−2)/2} / Γ(r/2)] Σ_{i=0}^∞ [λ^i / i!] [Γ((2i+1+r)/2) / Γ((2i+1)/2)] w^{2i} .
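The series form can be checked numerically: each term of the sum integrates over 0 < w < 1 to a Poisson weight, so the total mass should be one. A minimal sketch; the values r = 8, λ = 1.5 and the truncation point are arbitrary:

```python
import math
import numpy as np

def density_B(r, lam, terms=80):
    # series coefficients: e^{-lam} lam^i/i! * Gamma((2i+1+r)/2)/Gamma((2i+1)/2) / Gamma(r/2)
    c = [math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1)
                  + math.lgamma((2 * i + 1 + r) / 2)
                  - math.lgamma((2 * i + 1) / 2)
                  - math.lgamma(r / 2))
         for i in range(terms)]
    def f(w):
        return 2.0 * (1.0 - w * w) ** ((r - 2) / 2) * sum(
            ci * w ** (2 * i) for i, ci in enumerate(c))
    return f

f = density_B(r=8, lam=1.5)
w = np.linspace(0.0, 1.0, 2001)
dens = np.array([f(x) for x in w])
# trapezoid rule over (0, 1)
mass = float(((dens[:-1] + dens[1:]) * np.diff(w)).sum() / 2.0)
assert abs(mass - 1.0) < 1e-3
```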

Clearly, B does not have a simple distribution, even in the k = 1 case.

We note in closing that B̃, but not B* or B, is a maximum likelihood estimator of β*, since by the invariance property of maximum likelihood estimation (Hogg and Craig, 1965), single-valued functions of the maximum likelihood estimator of (β, σ²) are themselves maximum likelihood estimators.

The results presented thus far indicate that the estimator B̃ introduced in this dissertation is superior to the usual social science estimator B as an estimator of β*, the "beta coefficient." B̃ is consistent, has a relatively simple distribution, and an unbiased form of B̃ can be constructed which has uniformly minimum variance; none of these properties holds for B. The ramifications of the use of B̃ instead of B will be illustrated in Chapter X in numerical examples employing actual social science and engineering research data.

CHAPTER VII

RELATIONSHIP OF PROPOSED ESTIMATOR TO THE MINIMUM RISK EQUIVARIANT AND FIDUCIAL ESTIMATORS

In the linear model problem under consideration, the group of transformations H = {(Λ, c, γ)} is isomorphic to the parameter space Θ, and thus if an invariant loss function is defined, an equivariant decision problem is obtained. Since H is isomorphic to Θ, the maximal invariant function with respect to H is a constant, and the problem of finding a minimum risk equivariant estimator of θ = (D_n*, β, σ²) which is a function of the sufficient statistic x = (D_n(Z), β̂, s²) can be treated in the usual manner. We also note that if H were transitive on, but not isomorphic to, Θ, an equivariant decision problem could still be defined (Kudo, 1955).

Consider an equivariant estimator

    e(x) = (e₁, e₂, e₃) ,

where e₁ is a k × k diagonal matrix, e₂ ∈ Rᵏ, and e₃ ∈ R⁺. The loss function

    L(θ, e) = trace[(D_n*⁻¹)' (e₁ − D_n*)' (e₁ − D_n*) D_n*⁻¹]

            + σ⁻² (e₂ − β)' D_n*' D_n* (e₂ − β)

            + σ⁻⁴ (e₃ − σ²)²

is invariant under H*.

Proof.

    L(h̄θ, ĥe) = trace{[(D_n* Λ)⁻¹]' (e₁Λ − D_n*Λ)' (e₁Λ − D_n*Λ) (D_n*Λ)⁻¹}

              + (γσ)⁻² γ² (e₂ − β)' (Λ⁻¹)' Λ' (D_n*)' D_n* Λ Λ⁻¹ (e₂ − β)

              + (γσ)⁻⁴ γ⁴ (e₃ − σ²)²

            = trace[(D_n*⁻¹)' (e₁ − D_n*)' (e₁ − D_n*) D_n*⁻¹]

              + σ⁻² (e₂ − β)' (D_n*)' D_n* (e₂ − β) + σ⁻⁴ (e₃ − σ²)²

            = L(θ, e) .

Recall that risk is defined as the expected value of the loss.

Theorem 7.1.1. x = (D_n(Z), β̂, s²) is a minimum risk equivariant estimator of θ = (D_n*, β, σ²).

Proof. Consider the first term of the loss function. Its expected value,

    E trace[(D_n*⁻¹)' (e₁ − D_n*)' (e₁ − D_n*) D_n*⁻¹] ,

can be written

    E trace[(e₁ − D_n*) D_n*⁻² (e₁ − D_n*)'] ,

since traces are invariant under such changes. Since this term is nonnegative and D_n(Z) = D_n* with probability one, this term will be zero, and thus minimized, if e₁ = D_n(Z).

The second term of the risk function is

    E[σ⁻² (e₂ − β)' D_n*' D_n* (e₂ − β)] .

Since β̂ is unbiased for β, this term can be expanded as

    E[σ⁻² (e₂ − β̂)' D_n*' D_n* (e₂ − β̂)] + E[σ⁻² (β̂ − β)' D_n*' D_n* (β̂ − β)] ,

where the second term does not depend on e₂ and the first term is minimized by letting e₂ = β̂. Similarly, the third term in the risk function can be expanded into

    E[σ⁻⁴ (e₃ − s²)²] + E[σ⁻⁴ (s² − σ²)²] ,

and this term is minimized by taking e₃ = s².

Theorem 7.1.2. T(x) = (I, B̃, 1) is a maximal invariant function of the minimum risk equivariant estimator.

Proof. T(x) = (I, B̃, 1) = h₁ₓ⁻¹ (D_n(Z), β̂, s²) is a function of the minimum risk equivariant estimator (D_n(Z), β̂, s²), and T(x) indexes the orbits of H₁ in X.

Section 7.2. Relationship to the Fiducial Estimator

We will now show that B̃ is also a maximal invariant function of the fiducial estimator. Although a theorem of Hora and Buehler (1966) can be used which immediately shows that the minimum risk equivariant estimator is a minimum risk fiducial estimator, we will show this result directly in order to demonstrate the similarity between the structure of the problem studied in this dissertation and the corresponding fiducial problem. We do not intend, however, to pursue the philosophical implications of the fiducial argument.

Our approach to fiducial estimation will follow that of Fraser (1961 b); that is, we shall relate our proposed estimator to the fiducial estimator as defined by Fraser, which is not necessarily the same as the one proposed by Fisher (1934).

A pivotal quantity is a function of the sufficient statistic and the parameter which has a fixed distribution regardless of the value of the parameter.

Fraser developed a pivotal quantity as follows. By the isomorphisms of H to X and H̄ to Θ, and the isomorphism between H and H̄, each point x ∈ X corresponds to a point θ ∈ Θ, where the distribution of x has index value θ. Since x = hₓ x₀ and θ = h̄_θ θ₀,

    h̄_θ⁻¹ x = h̄_θ⁻¹ hₓ x₀

is a random variable with a θ₀-distribution. Thus

    p = h̄_θ⁻¹ hₓ

is a function of θ and x having a fixed distribution regardless of the value of θ; p is a pivotal quantity. (Note that the fiducial argument relies heavily on the permissibility of applying transformations in H̄ to elements of X. This procedure is apparently philosophically justified by the relationships between X, H, H̄, and G.)

Lemma 7.2.1. In the linear model estimation problem defined in Chapter V, the pivotal quantity (with respect to H) is

    p = [D_n(Z) D_n*⁻¹, D_n* (β̂ − β)/σ, s²/σ²] .

Proof. In the linear model problem, x = hₓ x₀, which implies that

    p x₀ = h̄_θ⁻¹ hₓ x₀ = [D_n*, β, σ²]⁻¹ [D_n(Z), β̂, s²] (I, 0, 1)

         = [D_n*, β, σ²]⁻¹ (D_n(Z), β̂, s²)

         = [D_n*⁻¹, −σ⁻¹ D_n* β, σ⁻²] (D_n(Z), β̂, s²)

         = (D_n(Z) D_n*⁻¹, σ⁻¹ D_n* β̂ − σ⁻¹ D_n* β, s²/σ²)

         = (D_n(Z) D_n*⁻¹, D_n* (β̂ − β)/σ, s²/σ²) .

To verify that p is indeed a pivotal quantity, we first note that p is a function of the sufficient statistic and the parameter. Now, u₂ = D_n* (β̂ − β)/σ is distributed as N(0, I), since

    β̂ ~ N(β, σ² (D_n*' D_n*)⁻¹) ,

and u₃² = s²/σ², where (n−p) s²/σ² ~ χ²_{n−p}, so that p has a fixed distribution and is a pivotal quantity.
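That the distribution of the pivot is free of the parameter can be illustrated by simulation. A minimal sketch for k = 1, comparing two arbitrary parameter settings; here u₂ = (Σ(z−z̄)²)^{1/2}(β̂ − β)/σ plays the role of D_n*(β̂ − β)/σ:

```python
import numpy as np

def pivot_samples(beta, sigma, n=30, reps=10_000, seed=0):
    rng = np.random.default_rng(seed)
    z = np.linspace(-1.0, 1.0, n)            # fixed (nonstochastic) design
    zc = z - z.mean()
    u2, u3sq = [], []
    for _ in range(reps):
        y = beta * z + rng.normal(scale=sigma, size=n)
        yc = y - y.mean()
        b = (zc @ yc) / (zc @ zc)
        resid = yc - b * zc
        s2 = resid @ resid / (n - 2)
        u2.append(np.sqrt(zc @ zc) * (b - beta) / sigma)   # ~ N(0, 1)
        u3sq.append(s2 / sigma**2)                         # ~ chi-square/(n-2)
    return np.array(u2), np.array(u3sq)

a2, a3 = pivot_samples(beta=0.0, sigma=1.0, seed=0)
b2, b3 = pivot_samples(beta=5.0, sigma=3.0, seed=1)

# the pivot has the same law under both parameter settings
assert abs(a2.mean() - b2.mean()) < 0.05 and abs(a2.std() - b2.std()) < 0.05
assert abs(a3.mean() - b3.mean()) < 0.05 and abs(a3.std() - b3.std()) < 0.05
```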

As a function of x and θ, p is invariant under transformations in H. Following Fraser's proof, if h is a transformation in H, then

    x = hₓ x₀

is transformed into

    h x = (h hₓ) x₀ ,

and a distribution

    θ = h̄_θ θ₀

for x becomes the distribution

    h̄ θ = (h̄ h̄_θ) θ₀

for hx. Hence, identifying h with h̄ under the isomorphism, the pivotal quantity computed from (hx, h̄θ) is

    (h̄ h̄_θ)⁻¹ (h hₓ) = h̄_θ⁻¹ h̄⁻¹ h hₓ = h̄_θ⁻¹ hₓ = p .

Specifically, for the linear model,

    h x = [Λ, c, γ] (D_n(Z), β̂, s²) = (D_n(Z) Λ, γ Λ⁻¹ β̂ + c, γ² s²)

and

    h̄ θ = [Λ, c, γ] (D_n*, β, σ²) = (D_n* Λ, γ Λ⁻¹ β + c, γ² σ²) .

Computing the pivotal quantity from the transformed statistic and parameter,

    [D_n* Λ, γ Λ⁻¹ β + c, γ² σ²]⁻¹ (D_n(Z) Λ, γ Λ⁻¹ β̂ + c, γ² s²)

        = (D_n(Z) Λ Λ⁻¹ D_n*⁻¹, (γσ)⁻¹ D_n* Λ [(γ Λ⁻¹ β̂ + c) − (γ Λ⁻¹ β + c)], γ⁻² σ⁻² γ² s²)

        = (D_n(Z) D_n*⁻¹, D_n* (β̂ − β)/σ, s²/σ²)

        = p .

Any function of x and θ which is invariant under H is a function of the pivotal quantity p = h̄_θ⁻¹ hₓ. The proof of this statement is again Fraser's. Let F(x; θ) be a function of x and θ which is invariant under H:

    F(hx; h̄θ) = F(x; θ)   for h ∈ H and h̄ ∈ H̄ .

Then

    F(x; θ) = F(hₓ x₀; h̄_θ θ₀)

            = F(h̄_θ⁻¹ hₓ x₀; h̄_θ⁻¹ h̄_θ θ₀)

            = F(h̄_θ⁻¹ hₓ x₀; θ₀) ,

so that F has been expressed as a function of h̄_θ⁻¹ hₓ = p. Any invariant pivotal quantity is a function of the pivotal quantity p, so that in a sense p is maximal.

The fiducial distribution for θ can be obtained as follows. Let p be a variable with the fixed pivotal distribution, let x be the observed value of the sufficient statistic, and let θ be a variable designating possible values of the parameter given the frequency and observational information. The pivotal equation can be solved for θ in terms of p and x:

    p = h̄_θ⁻¹ hₓ

implies that

    h̄_θ = hₓ p⁻¹ ,

which implies that

    θ = h̄_θ θ₀ = hₓ p⁻¹ θ₀ .

Theorem 7.2.1. In the linear model problem, the fiducial distribution of β is that of

    β̂ − s D_n⁻¹(Z) t_{n−p} ,

where t_{n−p} is a t random variable with n − p degrees of freedom.

Proof. From Lemma 7.2.1, the pivotal quantity in the linear model is

    p = (u₁, u₂, u₃²) = [D_n(Z) D_n*⁻¹, D_n* (β̂ − β)/σ, s²/σ²]

and

    p⁻¹ = [u₁⁻¹, −u₃⁻¹ u₁ u₂, u₃⁻²] = [u₁⁻¹, −u₃⁻¹ u₂, u₃⁻²]

with probability one, since u₁ = I with probability one. Thus

    θ = hₓ p⁻¹ θ₀ = [D_n(Z), β̂, s²] [u₁⁻¹, −u₃⁻¹ u₂, u₃⁻²] (I, 0, 1)

      = (D_n(Z), β̂ − s D_n⁻¹(Z) u₃⁻¹ u₂, s²/u₃²) .

Now, (n−p) u₃² = (n−p) s²/σ² ~ χ²_{n−p}, so that

    σ̃² = s²/u₃² ~ (n−p) s² / χ²_{n−p} .

Also, since u₂ ~ N(0, I),

    s D_n⁻¹(Z) u₃⁻¹ u₂ ~ s D_n⁻¹(Z) t_{n−p} ,

where t_{n−p} is a t random variable with n − p degrees of freedom. The fiducial variables are therefore

    D_n* = D_n(Z)

and

    β̃ = β̂ − s D_n⁻¹(Z) t_{n−p} .

That is,

    β̃ ~ β̂ − s D_n⁻¹(Z) t_{n−p}

gives the fiducial distribution.
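The fiducial recipe can be exercised directly: given an observed (β̂, s), draws of β̂ − s D_n⁻¹(Z) t_{n−p} form a fiducial sample. A minimal sketch for k = 1; the observed values and the scalar d_n standing in for D_n(Z) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 2
beta_hat, s, d_n = 0.42, 1.1, 5.0          # d_n plays the role of D_n(Z) for k = 1

t_draws = rng.standard_t(df=n - p, size=200_000)
beta_fid = beta_hat - s * t_draws / d_n    # fiducial sample for beta

# the fiducial mean recovers beta_hat, the minimum risk equivariant estimate
assert abs(beta_fid.mean() - beta_hat) < 0.01
# the spread is (s/d_n) times the t_{n-p} standard deviation
assert abs(beta_fid.std() - (s / d_n) * np.sqrt((n - p) / (n - p - 2))) < 0.01
```

This mirrors the conclusion below that, under squared error loss, the fiducial estimator coincides with the minimum risk equivariant estimator.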

Under squared error loss the fiducial estimator is the mean of the fiducial distribution. We will now show that the fiducial estimator of θ is the minimum risk equivariant estimator, (D_n(Z), β̂, s²). We may restrict our attention to the pivotal quantity

    [D_n* (β̂ − β)/σ, s²/σ²] ,

and for simplicity we will consider the case k = 1. Then the pivotal quantity becomes

    (T₁, T₂) = [(β̂ − β)/(σ/s_z), s²/σ²] .

The joint probability element of (T₁, T₂) is

    c (s²/σ²)^{(r−2)/2} e^{−s_z²(β̂−β)²/2σ²} e^{−r s²/2σ²} d[(β̂ − β)/(σ/s_z)] d[s²/σ²] ,

where r = n − 2 and c is the appropriate normalizing constant.

In order to obtain the fiducial distribution we need to find the probability element dβ dσ². It can be shown (Nachbin, 1965) that the left invariant Haar measure generated by (Λ, γ) is of the form

    d[(β̂ − β)/(σ/s_z)]   and   dσ²/(σ²)² ,

where (σ/s_z)² and (σ²)² are the appropriate modular functions. The fiducial probability element can then be written as the joint probability element of (T₁, T₂) multiplied by the corresponding factors in dβ and dσ².

Since the loss is squared error and the fiducial estimation problem is analogous to a Bayesian estimation problem with a Haar invariant prior (Good, 1972; Fraser, 1961 b), the fiducial estimator is the mean of the fiducial density (the posterior). These means are obtained (similarly to Fraser, 1961 b) by integrating the marginal fiducial densities of β and σ². These marginals can be represented as

    β = β̂ − (s/s_z) T₃

and

    σ² = s²/T₄ ,

where T₃ is a t random variable with r = n − 2 degrees of freedom and T₄ is r⁻¹ times a chi-square variable with r degrees of freedom. Since the integrals giving the means are Haar invariant, they are the same as the corresponding integrals in the pivotal variables, so that the fiducial means are β̂ and s², and the fiducial estimates are also the minimum risk equivariant estimators. Thus B̃ is a maximal invariant function of the fiducial estimator, as well as a maximal invariant function of the minimum risk equivariant estimator.

CHAPTER VIII

APPLICATION TO THE LINEAR MODEL: STOCHASTIC INPUT MATRIX

Section 8.1. Alternative Parametric Formulations

It is often the case that the input matrix Z in the linear model cannot be fixed by the experimenter. For example, in social science research it may be the case that the levels of the independent variables z₁, ..., z_k cannot be controlled by the experimenter, but are rather random observations on the k variables. This is usually the case in areas such as survey research. We will show that, depending on what the experimenter wishes to measure, there are two maximal invariant parametric functions which arise naturally. In one case the natural invariant estimator will be B̃, and in the other case the natural invariant estimator will be the social science estimator B. We will consider only the rank two model and will assume that z and y are jointly distributed as bivariate normal random variables.

The group of transformations on the sample space is the same as in Chapter V:

    G* = {[Λ, c, γ]} ,

where

    [Λ, c, γ] (z, y) = (zΛ, γy + zΛc) .

Depending upon the aims of the researcher, the joint distribution of z and y can be parameterized in one of two alternative manners. One parameterization, to which we shall refer as Formulation One, defines the parameter space as

    Θ = {(μ_z, μ_y, σ_z², σ_y², β)} ,

where μ_z and μ_y are the means of z and y, respectively, and σ_z² and σ_y² are the marginal variances of the two variables. Formulation Two defines the parameter space as

    Θ = {(μ_z, μ_y, σ_z², σ², β)} ,

where σ² is the conditional variance of y, given z. We will show that the investigator chooses Formulation One or Formulation Two as he wishes to estimate the standardized change in y relative to the marginal or conditional standardized change in z, respectively.

For example, suppose the researcher is interested in predicting the rate of drug use in a community as a function of the community's attitude toward drug use. He might then decide to predict the standardized change in use rate for a standardized change in attitude. (The reason for the distinction between the two standardizations will become apparent when the parametric functions are obtained.)

The group of transformations induced on Θ is, as before,

    Ḡ = {[λ, c, γ]} ,

where, under Formulation One,

    [λ, c, γ] (μ_z, μ_y, σ_z², σ_y², β) = (λμ_z, γμ_y + λμ_z c, λ²σ_z², γ²σ_y², γλ⁻¹β + c) ,

and under Formulation Two,

    [λ, c, γ] (μ_z, μ_y, σ_z², σ², β) = (λμ_z, γμ_y + λμ_z c, λ²σ_z², γ²σ², γλ⁻¹β + c) .

The jointly sufficient statistic for (μ_z, μ_y, σ_z², σ_y², β) is usually written

    (z̄, ȳ, s_z², s_y², β̂) ,

and the sufficient statistic for (μ_z, μ_y, σ_z², σ², β) is usually written

    (z̄, ȳ, s_z², s², β̂) ;

and the induced group on the sufficient statistic space acts exactly as the corresponding group on the parameter space.

The groups H, H₁, H₂ and the induced groups H̄, H̄₁ and H̄₂ are defined as before, leading to the maximal invariant parametric functions

    B₁ = σ_z β / σ_y   for Formulation One

and

    B₂ = σ_z β / σ   for Formulation Two .

Note that both B₁ and B₂ are standardized regression coefficients; the standardizations, however, are different.

Referring back to the example in which the researcher wishes to predict change in drug use from change in attitude toward drugs, B₁ measures the standardized change in drug use rate for a one-standard-deviation change in attitude, as compared to all communities, regardless of their attitudes. B₂ measures the standardized change in use rate as compared to other communities of the same attitude.

The invariant sufficient estimator of B₁ produced by the theory introduced in this dissertation is

    B = s_z β̂ / s_y ,

and the invariant sufficient estimator of B₂ is

    B̃ = s_z β̂ / s .

In the stochastic input case, both B and B̃ are reasonable estimators of their respective parameters. In the next subsections we will consider the consistency and distributions of the estimators.

Let us first consider B as an estimator of B₁. We note that B is equivalent to the sample correlation coefficient ρ̂ (recalling that k = 1). To show this, let

    S_zz = Σ_{i=1}^n (z_i − z̄)² = (n−1) s_z² ,

    S_yy = Σ_{i=1}^n (y_i − ȳ)² = (n−1) s_y² ,

and

    S_zy = Σ_{i=1}^n (z_i − z̄)(y_i − ȳ) .

Then

    B = s_z β̂ / s_y = [(n−1)^{1/2} s_z / ((n−1)^{1/2} s_y)] β̂

      = [S_zz / S_yy]^{1/2} (S_zy / S_zz)

      = S_zy / (S_zz S_yy)^{1/2} = ρ̂ .

To show that B is a consistent estimator of B₁, we note that, by Slutsky's theorem,

    B = s_z β̂ / s_y   converges to   σ_z β / σ_y = B₁ .
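The identity B = ρ̂ is a one-line check on any data set. A minimal sketch with random (illustrative) data:

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(size=200)
y = 0.6 * z + rng.normal(size=200)

zc, yc = z - z.mean(), y - y.mean()
beta_hat = (zc @ yc) / (zc @ zc)
B = zc.std(ddof=1) * beta_hat / yc.std(ddof=1)        # B = s_z beta_hat / s_y
rho_hat = (zc @ yc) / np.sqrt((zc @ zc) * (yc @ yc))  # sample correlation

assert abs(B - rho_hat) < 1e-12
```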

For the distribution of B, recall that in Lemma 6.2.4, when z was nonstochastic, we found its distribution to be

    f(w) = [2 e^{−λ} (1−w²)^{(r−2)/2} / Γ(r/2)] Σ_{i=0}^∞ [λ^i / i!] [Γ((2i+1+r)/2) / Γ((2i+1)/2)] w^{2i} ,

where r = n − 2 and λ = (n−1) s_z² β² / 2σ². That is,

    f(w) = [2 e^{−(n−1)s_z²β²/2σ²} (1−w²)^{(r−2)/2} / Γ(r/2)] Σ_{i=0}^∞ [((n−1)s_z²β²/2σ²)^i / i!] [Γ((2i+1+r)/2) / Γ((2i+1)/2)] w^{2i} .

f(w) can be shown to be the conditional density of B given z in the stochastic case. Considering the case in which z is normal, note that f(w) depends on z only through s_z². Let x = (n−1) s_z²/σ_z²; then x is distributed as a chi-square variable with n − 1 = r + 1 degrees of freedom, and the marginal density of x is

    g(v) = [1 / (2^{(r+1)/2} Γ((r+1)/2))] v^{(r−1)/2} e^{−v/2}   for v > 0 ,

where r = n − 2.

Let

    u = (n−1) s_z² / σ² = [(n−1) s_z² / σ_z²] (σ_z²/σ²) = (σ_z²/σ²) x .

Then

    x = (σ²/σ_z²) u ,   dx = (σ²/σ_z²) du ,

and the density of u is

    g(t) = [(σ²/σ_z²)^{(r+1)/2} / (2^{(r+1)/2} Γ((r+1)/2))] t^{(r−1)/2} e^{−(σ²/σ_z²) t/2}   for t > 0 .

The conditional density f_{B|u}(w) may be written

    f_{B|u}(w) = [2 e^{−tβ²/2} (1−w²)^{(r−2)/2} / Γ(r/2)] Σ_{i=0}^∞ [(tβ²/2)^i / i!] [Γ((2i+1+r)/2) / Γ((2i+1)/2)] w^{2i} .

Then the joint density of B and u can be written

    f_{B,u}(w, t) = f_{B|u}(w) g(t)

        = [(1−w²)^{(r−2)/2} (σ/σ_z)^{r+1} / (2^{(r−1)/2} Γ(r/2) Γ((r+1)/2))]

          × Σ_{i=0}^∞ [Γ((2i+1+r)/2) / (2^i i! Γ((2i+1)/2))] (wβ)^{2i} t^{(r+2i−1)/2} e^{−(t/2)(β² + σ²/σ_z²)} .

The marginal density of B is f_B(w) = ∫ f_{B,u}(w, t) dt:

    f_B(w) = C Σ_{i=0}^∞ G_i ∫₀^∞ t^{(r+2i−1)/2} e^{−(t/2)(β² + σ²/σ_z²)} dt ,

where

    C = (1−w²)^{(r−2)/2} (σ/σ_z)^{r+1} / [2^{(r−1)/2} Γ(r/2) Γ((r+1)/2)]

and

    G_i = Γ((2i+1+r)/2) (wβ)^{2i} / [2^i i! Γ((2i+1)/2)] ,

and the integrand is in the form of a gamma density with parameters

    α = (r+2i+1)/2   and   θ = 2/(β² + σ²/σ_z²) .

Thus

    f_B(w) = C Σ_{i=0}^∞ G_i Γ((r+2i+1)/2) [2/(β² + σ²/σ_z²)]^{(r+2i+1)/2}

           = [2 (1−w²)^{(r−2)/2} (σ/σ_z)^{r+1} / (Γ(r/2) Γ((r+1)/2))]

             × Σ_{i=0}^∞ [Γ²((2i+1+r)/2) β^{2i} w^{2i}] / [i! Γ((2i+1)/2) (β² + σ²/σ_z²)^{(r+2i+1)/2}] .

f_B(w) is the density of B when y and z are bivariate normal.

Note that if β = 0, all the terms in the sum will be zero except the first (for i = 0). Then f_B(w) becomes

    f_B(w) = 2 (1−w²)^{(r−2)/2} (σ/σ_z)^{r+1} Γ²((r+1)/2) / [Γ(r/2) Γ((r+1)/2) Γ(1/2) (σ²/σ_z²)^{(r+1)/2}]

           = 2 (1−w²)^{(r−2)/2} Γ((r+1)/2) / [Γ(r/2) Γ(1/2)] .

Graybill (1961) gives the density of the sample correlation coefficient ρ̂, when ρ = 0, as

    f(ρ̂; 0) = Γ((n−1)/2) (1−ρ̂²)^{(n−4)/2} / [Γ((n−2)/2) Γ(1/2)]   for −1 < ρ̂ < 1 .

If we consider that f_B(w) was derived for w > 0 and recall that r = n − 2, we see that for the case in which β = 0, the central density for B reduces to the central density for ρ̂. Note also that the density f_{B|z}(w) of B in the nonstochastic input case (Lemma 6.2.4) reduces to the same density, although B cannot be interpreted as a correlation coefficient if z is nonrandom.
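The reduction at β = 0 can be verified numerically by comparing the series density to the central closed form. A minimal sketch; the choices of r and σ/σ_z are arbitrary:

```python
import math

def f_B(w, r, beta, ratio, terms=60):
    # marginal density of B for bivariate-normal (z, y); ratio = sigma/sigma_z
    denom = beta * beta + ratio * ratio
    total = 0.0
    for i in range(terms):
        log_t = (2 * math.lgamma((r + 2 * i + 1) / 2)
                 - math.lgamma(i + 1) - math.lgamma((2 * i + 1) / 2)
                 - ((r + 2 * i + 1) / 2) * math.log(denom))
        total += math.exp(log_t) * (beta * w) ** (2 * i)
    c = 2.0 * (1.0 - w * w) ** ((r - 2) / 2) * ratio ** (r + 1)
    return c * total / (math.gamma(r / 2) * math.gamma((r + 1) / 2))

def f_central(w, r):
    # central correlation density as in Graybill, doubled for w > 0
    return (2.0 * (1.0 - w * w) ** ((r - 2) / 2) * math.gamma((r + 1) / 2)
            / (math.gamma(r / 2) * math.gamma(0.5)))

r, ratio = 10, 1.7
for w in (0.1, 0.4, 0.8):
    assert abs(f_B(w, r, 0.0, ratio) - f_central(w, r)) < 1e-10
```

At β = 0 only the i = 0 term survives, and the factors (σ/σ_z)^{r+1} cancel exactly, leaving the central correlation density.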

Section 8.3. Properties of the Estimator in Formulation Two

Directing our attention to B̃, it is apparent that B̃ is a consistent estimator of B₂, since by Slutsky's theorem,

    B̃ = s_z β̂ / s   converges to   σ_z β / σ = B₂ .

When z was nonstochastic, we found that B̃ was distributed as the constant (n−1)^{1/2} times a noncentral t distribution with r = n − 2 degrees of freedom and noncentrality parameter δ = (n−1)^{1/2} s_z β/σ (Theorem 6.1.2). That is,

    f(u) = [(n−1)^{1/2} r^{r/2} e^{−δ²/2} / (π^{1/2} Γ(r/2) (r+u²)^{(r+1)/2})] Σ_{i=0}^∞ [Γ((r+i+1)/2) δ^i 2^{i/2} u^i] / [i! (r+u²)^{i/2}] ,

or, since n − 1 = r + 1 and δ = (r+1)^{1/2} s_z β/σ,

    f(u) = [(r+1)^{1/2} r^{r/2} e^{−(r+1)s_z²β²/2σ²} / (π^{1/2} Γ(r/2) (r+u²)^{(r+1)/2})] Σ_{i=0}^∞ [Γ((r+i+1)/2) / i!] [2(r+1)s_z²β²/σ²]^{i/2} u^i / (r+u²)^{i/2} .

In the stochastic input case, f(u) is the conditional density of B̃ given z. Note that f(u) depends on z only through s_z². Let x = (r+1) s_z²/σ_z²; then x is distributed as a chi-square variable with n − 1 = r + 1 degrees of freedom and thus has the density

    g(w) = [1 / (2^{(r+1)/2} Γ((r+1)/2))] w^{(r−1)/2} e^{−w/2}   for w > 0 .

Let

    v = (r+1) s_z² / σ² = [(r+1) s_z² / σ_z²] (σ_z²/σ²) = (σ_z²/σ²) x .

Then x = (σ²/σ_z²) v, dx = (σ²/σ_z²) dv, and the density of v is

    g(t) = [(σ²/σ_z²)^{(r+1)/2} / (2^{(r+1)/2} Γ((r+1)/2))] t^{(r−1)/2} e^{−(σ²/σ_z²) t/2}   for t > 0 .

Then the conditional density of "'B given v may be written

Then fB~ (u,t) - f~B'I (u) g(t) ,v 1v

r+l 1/2 2 2 r fr+l) 1T l 2 r (f)

I f[(r-~i+l~/2J(t8 2 2i/ 2 2112 u1 i=O i! (r+u2)i/Z 100

r/2 (r+l) 112 r ( o / 0 ) r+1 = __..,. ____ z r+l r+l 2 2 1/2 2 2 r [r+ll-- 7T (r+u ) 2 ) r(f)

- I(B2+02 /oz2) e

The marginal density of B̃ is obtained by integrating f_{B̃,v}(u, t) over t:

    f_B̃(u) = k Σ_{i=0}^∞ c_i ∫₀^∞ t^{(r+i−1)/2} e^{−(t/2)(β² + σ²/σ_z²)} dt ,

where

    k = (r+1)^{1/2} r^{r/2} (σ/σ_z)^{r+1} / [2^{(r+1)/2} π^{1/2} Γ(r/2) Γ((r+1)/2) (r+u²)^{(r+1)/2}]

and

    c_i = Γ((r+i+1)/2) (2β²)^{i/2} u^i / [i! (r+u²)^{i/2}] .

The integrand is in the form of a gamma density with parameters

    α = (r+i+1)/2   and   θ = 2/(β² + σ²/σ_z²) .

Thus

    f_B̃(u) = k Σ_{i=0}^∞ c_i Γ((r+i+1)/2) [2/(β² + σ²/σ_z²)]^{(r+i+1)/2}

           = [(r+1)^{1/2} r^{r/2} (σ/σ_z)^{r+1} / (π^{1/2} Γ(r/2) Γ((r+1)/2) (r+u²)^{(r+1)/2})]

             × Σ_{i=0}^∞ [Γ²((r+i+1)/2) (2uβ)^i] / [i! (r+u²)^{i/2} (β² + σ²/σ_z²)^{(r+i+1)/2}] .

If β = 0, then f_B̃(u) becomes

    f_B̃(u) = (r+1)^{1/2} r^{r/2} (σ/σ_z)^{r+1} Γ((r+1)/2) / [π^{1/2} Γ(r/2) (r+u²)^{(r+1)/2} (σ²/σ_z²)^{(r+1)/2}]

           = (r+1)^{1/2} r^{r/2} Γ((r+1)/2) / [π^{1/2} Γ(r/2) (r+u²)^{(r+1)/2}] ,

which is (r+1)^{1/2} = (n−1)^{1/2} times the central t distribution. In the central case, then, the distribution of B̃ is the same whether z is stochastic or nonstochastic. This suggests a natural procedure for testing the hypothesis β = 0 in the stochastic input case.

CHAPTER IX

ESTIMATION IN THE LINEAR MODEL: INVARIANCE

UNDER NONSINGULAR TRANSFORMATIONS

In this chapter the proposed method of estimating orbits is applied to the problem of estimating the parameters in the linear model in a manner invariant under nonsingular transformations. It was hoped that such a procedure might lead to an estimator "invariant" to the effects of multicollinearity.

Consider the problem of estimation in the model

    y = Z β + ε ,

where Z is a nonstochastic input matrix such that Z'Z is very nearly singular. (For simplicity of notation we do not assume Z has a column of ones, and thus the model has no intercept.) That is, the eigenvalues λ_i of Z'Z are such that

    λ₁ ≥ λ₂ ≥ ... ≥ λ_p > 0

but

    λ_p < ξ ,

where ξ is a pre-selected small positive number.

This situation arises when the input variables z₁, ..., z_p are very highly correlated. The result is that the net regression coefficients β̂₁, ..., β̂_p are very unreliable estimates of the effects of the independent variables, since the generalized variance of β̂,

    σ² Π_{i=1}^p λ_i⁻¹ ,

tends to infinity as λ_p approaches zero.

103 104

It seems natural to try to obtain an estimator invariant under the group of orthogonal transformations on the input variables. However, application of an orthogonal transformation does not reduce the generalized variance. If we write the model as

    y = U δ + ε ,

where P is a p × p orthogonal matrix,

    U = Z P ,

and

    δ = P' β ,

then

    U' U = P' Z' Z P ,

which has the same eigenvalues λ₁, ..., λ_p as Z'Z. Thus

    δ̂ = P' β̂

and

    V(δ̂) = σ² Π_{i=1}^p λ_i⁻¹ ,

so that the generalized variance is not reduced.
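The invariance of the generalized variance under an orthogonal change of inputs is easy to confirm numerically, since det P = ±1. A minimal sketch with a random near-collinear design (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 50, 3
Z = rng.normal(size=(n, p))
Z[:, 2] = Z[:, 0] + 0.01 * rng.normal(size=n)   # near-collinear column

P, _ = np.linalg.qr(rng.normal(size=(p, p)))    # random orthogonal P
U = Z @ P

gv_Z = np.linalg.det(np.linalg.inv(Z.T @ Z))    # generalized variance of beta-hat, up to sigma^2
gv_U = np.linalg.det(np.linalg.inv(U.T @ U))    # same quantity for delta-hat = P' beta-hat

assert abs(gv_Z - gv_U) / abs(gv_Z) < 1e-6
```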

Thus we consider a group of transformations

    G* = {[N, c, γ]} ,

where N is a p × p nonsingular matrix, c is an n × 1 vector, and γ is a positive scalar, acting on the sample space

    X = {(Z, y)} ,

where (Z, y) ∈ X if

    Z = Z₀ A

for a nonsingular p × p matrix A and a fixed matrix Z₀ which satisfies the condition

    Z₀' Z₀ = I .

The action of G* on X is defined by

    [N, c, γ] (Z, y) = (Z N, γ y + Z N c) .

We may write the model y = Z β + ε as

    y = Z₀ A β + ε = Z₀ τ + ε ,                                    (9.1)

where

    τ = A β .

The parameter space consists of elements of the form

    θ = (A, β, σ²) ,

and the group Ḡ induced on Θ acts by

    [N, c, γ] (A, β, σ²) = (A N, γ N⁻¹ β + c, γ² σ²) .

The group Ḡ is isomorphic to Θ, where (Z₀, 0, 1) is chosen as a reference point in X, and is also isomorphic to H, the semidirect product of the groups

    H₂ = {[I, c, γ]}

and

    H₁ = {[N, 0, 1]} .

It is simple to show that H₁ is a normal subgroup of H. Thus the theory presented in this dissertation applies.

The maximal invariant parametric function is

    φ(θ) = [A⁻¹, 0, 1] θ = (I, A β, σ²) .

In order to estimate φ invariantly under the subgroup H₁, one could use the maximal invariant function

    T(x) = h₁ₓ⁻¹ x = [A⁻¹, 0, 1] (A, β̂, s²) = (I, A β̂, s²) .

This estimator is nothing more than the estimator obtained under an orthonormalization of the problem. This is easily seen if we consider the model written in (9.1),

    y = Z₀ τ + ε .

Since Z₀' Z₀ = I, this model is orthonormalized:

    τ̂ = (Z₀' Z₀)⁻¹ Z₀' y = Z₀' y .

But Z = Z₀ A implies that Z₀' = (Z A⁻¹)' = A⁻¹' Z', so

    τ̂ = A⁻¹' Z' y

       = A⁻¹' (Z' Z) (Z' Z)⁻¹ Z' y

       = A⁻¹' (Z' Z) β̂ .

The covariance matrix of this estimator is

    V(τ̂) = σ² Z₀' Z₀ = σ² I ,

so that the generalized variance is constant regardless of the value of Z'Z. Clearly, the estimator τ̂ is unaffected by multicollinearity, since the generalized variance of τ̂ is σ² regardless of the generalized variance of β̂.
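The orthonormalized estimator can be sketched as follows; the QR factorization supplies a Z₀ with Z₀'Z₀ = I, matching the condition in the text, and all data values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 60, 3
Z = rng.normal(size=(n, p))
Z[:, 2] = Z[:, 0] + 0.01 * rng.normal(size=n)   # severe multicollinearity
beta = np.array([1.0, -2.0, 0.5])
y = Z @ beta + rng.normal(size=n)

Z0, A = np.linalg.qr(Z)                  # Z = Z0 A with Z0' Z0 = I
tau_hat = Z0.T @ y                       # orthonormalized estimator
beta_hat = np.linalg.lstsq(Z, y, rcond=None)[0]

# tau_hat agrees with A beta_hat, and Z0 is exactly orthonormal
assert np.allclose(tau_hat, A @ beta_hat, atol=1e-8)
assert np.allclose(Z0.T @ Z0, np.eye(p), atol=1e-10)
```

The covariance of τ̂ = Z₀'y is σ²I regardless of how ill-conditioned Z'Z is, which is the point of the chapter; the interpretive difficulty is that τ = Aβ mixes the original coefficients.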

Thus it is a simple matter to estimate τ invariantly under the group of nonsingular transformations. Interpretation of τ = A β, however, is unclear. If the orthonormalized input variables have physical meaning in a given situation, the interpretation of τ presents no problem. Otherwise, this approach does not seem to be profitable in any actual research situation.

We note in closing that it might be possible to generate a group of transformations which affects only the smallest eigenvalues of Z'Z. If such a group can be formulated and the estimation problem carries the group naturally, it may lead to a quite useful tool with which to tackle the problem of multicollinearity.

CHAPTER X

EXAMPLES OF APPLICATION OF THE PROPOSED PROCEDURES

Section 10.1. Application to Sociological Data

As an application of the methods developed previously, we consider a sociological linear model provided by Dr. D. L. Klemmack, Department of Sociology, Virginia Polytechnic Institute and State University. Models such as the one presented are called "path models" in sociology.

As indicated in Figure 10.1.1, the effect of any independent variable upon a dependent variable can be represented by an arrow, or "path," from the independent variable to the dependent variable. The strength of the relationship indicated by the arrow is denoted by a "path" or "beta" coefficient. This diagram leads to one multiple regression equation for each intervening and dependent variable.

We can also consider the equations in which x₁ through x₉ are all included as independent variables.

Figure 10.1.1. Path diagram showing influence of family characteristics and age on marriage aspirations (from Klemmack). [The diagram links the background variables x₁ (FO, father's occupational status), x₂ (FE, father's education), x₃ (ME, mother's education), x₄ (MW, mother's work), x₅ (FS, family size), and x₆ (AGE, age) through the intervening variables x₇ (DS, dating status), x₈ (AM, ideal age of marriage), and x₉ (AFS, anticipated family size) to the dependent variables x₁₀ (FOC1, femininity of occupational aspiration − 1) and x₁₁ (FOC2, femininity of occupational aspiration − 2).]

We note from Figure 10.1.1 that the variables x₂ and x₃ appeared not to be significantly related to the intervening and/or dependent variables and were therefore deleted from the path diagram by Professor Klemmack.

Calculations of "path coefficients" were based on a sample of 300 observations (with some missing data) and were performed on the Virginia Polytechnic Institute and State University computer system with a modified version of the SPSS computer package. The results are presented in Tables 10.1.1 and 10.1.2. Table 10.1.1 displays values of B, B̃ and B* for the five multiple regression equations indicated in the path diagram of Figure 10.1.1, and Table 10.1.2 displays B, B̃ and B* for the cases in which x₁, x₂, ..., x₉ are all included as independent variables to predict x₁₀ and x₁₁.

To obtain B̃ from B, it is necessary only to multiply B by s_y/s, since

    (s_y/s) B = (s_y/s)(s_z β̂/s_y) = s_z β̂/s = B̃ .

For example, when considering the regression of x₇ on x₆, as in the first row of Table 10.1.1, s_y = .5004 and s = .49377, so that

    B̃ = (.5004/.49377)(.17353) = (1.01343)(.17353) = .17586 .

To obtain B*, it was necessary to compute

    C = [2/(n−2)]^{1/2} Γ((n−2)/2) / Γ((n−3)/2) .

We used the following approximation to the gamma function, found in the C.R.C. Standard Mathematical Tables:

    Γ(x+1) ≈ (2πx)^{1/2} x^x e^{−x} .

For n = 300, C ≈ .99944. Then to obtain B*, one must simply multiply the value of B̃ by .99944. For example, in the first row of Table 10.1.1,

    B* = .99944 B̃ = .99944 (.17586) = .17576 .
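These conversions are mechanical. A minimal sketch reproducing the first row of Table 10.1.1; the exact-gamma value of C is computed here alongside the table's approximate value:

```python
import math

def C(n):
    # C = [2/(n-2)]^(1/2) * Gamma((n-2)/2) / Gamma((n-3)/2), via log-gammas
    return math.sqrt(2.0 / (n - 2)) * math.exp(
        math.lgamma((n - 2) / 2) - math.lgamma((n - 3) / 2))

s_y, s, B = 0.5004, 0.49377, 0.17353     # first row of Table 10.1.1
B_tilde = (s_y / s) * B                  # B-tilde = (s_y/s) B
assert abs(B_tilde - 0.17586) < 5e-5

c = C(300)
assert 0.99 < c < 1.0                    # C is slightly below one for n = 300
B_star = c * B_tilde
assert abs(B_star - B_tilde) < 1e-3      # the unbiased form barely differs
```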

We note that there is very little difference in the values of B, B̃ and B* for any of the regressions considered. The reason that B̃ and B* differ so little is, of course, that C is close to one for a sample of 300. Comparison of s_y and s for each situation reveals that in no case is s_y very much larger than s; thus the ratio s_y/s will be close to one and B̃ will not differ much from B.

That s_y and s are almost equal indicates very poor correlation between the dependent and independent variables. Inspection of the correlation matrix in Table 10.1.3 reveals that there is, indeed, very little correlation among any of the variables.

Further, we can look at the multiple coefficients of determination for each regression. These express the proportion of the variance in the dependent variable accounted for by the regression.

Table 10.1.1. Values of B, B̃ and B* for the regressions indicated in Figure 10.1.1.

Dependent   Independent   Marginal S.D. of     Standard   Regression
Variable    Variable(s)   Dependent Variable   Error      Coefficient       B          B̃          B*

X7 (DS)     X6 (AGE)          .5004            .49377       .08122       .17353     .17586     .17576

X8 (AM)     X1 (FO)          1.8299           1.78320       .01257       .16149     .16572     .16563
            X7 (DS)                                        -.66009      -.18052    -.18525    -.18515
            X4 (MW)                                         .10129       .12470     .12894     .12887

X9 (AFS)    X5 (FS)          1.1884           1.14936       .11198       .13902     .14374     .14366
            X8 (AM)                                        -.13669      -.21048    -.21763    -.21751

X10 (FOC1)  X8 (AM)           .4903            .47245      -.05593      -.20874    -.21663    -.21651
            X9 (AFS)                                        .06111       .14811     .15371     .15362

X11 (FOC2)  X6 (AGE)          .4296            .41529       .04971       .12370     .12796     .12789
            X8 (AM)                                        -.05762      -.24539    -.25385    -.25371

Table 10.1.2. Values of B, B̃ and B* for predicting x₁₀ (FOC1) and x₁₁ (FOC2) from the other nine variables.

---- :Marginal Standard Regre.ssion Dep

X2 (FE) -·. 01508 -.05490 -,05651 -.0561+8

X, (ME) .00876 ,02442 .02514 .02513 _J

x4 (MW) -.,02230 -.06655 -.06850 -.06846

X"n(FOCl) Xc(FS) .4903 .47631 -.01210 -.03642 -.03749 -.03747 J..\,,. _) ,_..f--' x6 (AGE) -.02006 -.04375 -.04504 .04502 ~

x7 (DS) -.03958 -.04040 -.04159 -.04157

x8 (AM) -.05654 -.21101 -.21721 -.21709

x9 (AFS) ,06794 .16465 .16949 .16940 - x1 (FO) .00097 .05310 .05456 .05453

x2 (FE) .00197 .00818 .00840 .00840

X3 (Yi.E) -.01764 -.05612 -.05766 -.05763

X4 (MW) -.00393 -.01338 -.01375 -.01374 x11 (FOC2) x5 (FS) .4296 .41809 -.00830 -.02850 -.02928 -.02926

x6 (AGE) .04447 .11067 .11372 .11366 Table 10.1.2 continued

Marginal Standard

A Dependent Independent Deviation of Standard Regression "c. Variable Variables DeEendent Variable Error Coefficient B B B * X_(DS) .02838 .03305 .03396 .03394 I •x 8 (A-=·-, "'f) -.05337 -.22732 -.23358 -.23345

x9 (AFS) .02876 .07955 .08174 .08169

..... \JI Table 10.1.3. Correlation matrix fer all eleven variables.

_.._ ..,.._,__ __ -,---·-.-~--.--.------~--- y "{ -..{ x, x0 x ~\ XS .. , L. x8 x ·-----..M-----.... /.. 3 0 j 9 ··10 "'11

C.()/. t; '.( J),lv 1. 00000 .. _, vo.+ ...... _, .25222 -.21324 -.02433 - • 01155 .02036 .15781 -.09399 -.02168 .00266 v iJ..r} 1,00000 .44029 - .197l19 .05317 -.H048 -.06958 .ll.578 -,09081 - . 0!+628 -.04001 /... v ..!~,.... l,00000 .02Lt56 .05223 -.08717 -·. 07 356 • 076Li5 - .. 00989 -·. 00572 -.07119 ..) v ... :.., LOOOOO -.12161 -,07787 -.05463 .00673 ,10638 -.03619 ~-.02772 '·I· x c: 1.00000 .09305 -.00254 -.00798 .12553 -.01241 -.00865 .) X, l,00000 .17353 .04792 .06173 - • 04372 .11194 b t-' x.., 1.00000 -.17724 .10460 .01308 .10631 !--' I °' v "'8 1.00000 -.21075 -.23995 -.23947

"9" 1.00000 .19211 .12756

XlO LOOOOO .46052 xll 1.00000 117 dependent variable accounted for, or explained by, the independent var:iable(s) to the total variance of the dependent variable. Table

10.1.4 shows that the coefficient of determination is quite small in every ease, indicating that the relationships are weak ones. Much more dramatic results can be seen when '\,B and B* are calculated for data in which the relation.ships are stronger, as our second example will show.

Before turning our attention to the second example, however, let us consider briefly significance tests on $(n-1)^{1/2}\tilde B$. In order to convert $(n-1)^{1/2}\tilde B$ to a vector having a multivariate noncentral $t$ distribution, since we were given the standard errors of the regression coefficients, each element of $(n-1)^{1/2}\tilde B$ was multiplied by the appropriate factor. The results are presented in Table 10.1.5. Since the elements of $(n-1)^{1/2}\tilde B$ have marginal univariate $t$ distributions (not independent), the components of $(n-1)^{1/2}\tilde B$ are compared to critical values for testing whether they are significantly greater or less than zero. An asterisk next to a $t$-value in Table 10.1.5 indicates significance at the .05 level; two asterisks indicate significance at the .01 level. We note that the components of $(n-1)^{1/2}\tilde B$ in each of the five regressions in Table 10.1.6 are found to be significantly greater or less than zero at the .05 level of significance or better.
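The starring convention of Tables 10.1.5 and 10.1.6 is consistent with one-sided critical values, which for $n = 300$ are essentially the normal quantiles 1.645 (.05 level) and 2.326 (.01 level). The cutoffs in this sketch are our reading of the tables, not a formula stated in the text:

```python
def stars(t):
    """Flag a t-value: '**' beyond the one-sided .01 cutoff (2.326),
    '*' beyond the .05 cutoff (1.645), '' otherwise.  Normal critical
    values are used, which is adequate for n = 300."""
    a = abs(t)
    if a > 2.326:
        return '**'
    if a > 1.645:
        return '*'
    return ''

# Spot checks against Table 10.1.5.
print(stars(-3.36134), stars(1.78468), stars(1.27699))
```

Applied to every entry of Table 10.1.5, this rule reproduces the asterisks shown there.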

Table 10.1.4. Multiple coefficients of determination.

  Dependent Variable   Independent Variable(s)                        R²
  X7  (DS)             X6 (AGE)                                       .03011
  X8  (AM)             X1 (FO), X7 (DS)                               .05748
  X9  (AFS)            X4 (MW), X5 (FS), X8 (AM)                      .08082
  X10 (FOC1)           X8 (AM), X9 (AFS)                              .07854
  X11 (FOC2)           X6 (AGE), X8 (AM)                              .07261
  X10 (FOC1)           X1 (FO), X2 (FE), X3 (ME), X4 (MW), X5 (FS),   .08817
                       X6 (AGE), X7 (DS), X8 (AM), X9 (AFS)
  X11 (FOC2)           X1 (FO), X2 (FE), X3 (ME), X4 (MW), X5 (FS),   .08491
                       X6 (AGE), X7 (DS), X8 (AM), X9 (AFS)

Table 10.1.5. Values of $(n-1)^{1/2}\tilde B = t$ for the regressions of FOC1 and FOC2 on the other nine variables.

  Dependent Variable   Independent Variables        t
  X10 (FOC1)           X1 (FO)                      .48084
                       X2 (FE)                    - .72006
                       X3 (ME)                      .36419
                       X4 (MW)                    -1.05596
                       X5 (FS)                      .59659
                       X6 (AGE)                   - .70684
                       X7 (DS)                    - .65270
                       X8 (AM)                    -3.36134**
                       X9 (AFS)                    2.64745**
  X11 (FOC2)           X1 (FO)                      .75234
                       X2 (FE)                      .10703
                       X3 (ME)                    - .83529
                       X4 (MW)                    - .21193
                       X5 (FS)                    - .46595
                       X6 (AGE)                    1.78468*
                       X7 (DS)                      .53296
                       X8 (AM)                    -3.61579**
                       X9 (AFS)                    1.27699

Table 10.1.6. Values of $(n-1)^{1/2}\tilde B = t$ for the five regressions indicated in Figure 10.1.1.

  Dependent Variable   Independent Variable(s)      t
  X7 (DS)              X6 (AGE)                     2.847**
                                                    2.70961**

Section 10.2. Application to Engineering Research Data

We now consider data taken from routine operations records for a petroleum refining unit, as presented by Gorman and Toman (1966). There were 36 observations taken on 10 independent variables and one dependent variable.

Calculations were performed on the Virginia Polytechnic Institute and State University system using a modified version of the BMD03R program.

The correlation matrix for these data is presented in Table 10.2.1.

We note that in several instances the correlation coefficients are reasonably large, indicating strong relationships in certain pairs of variables. The multiple coefficient of determination is .9039, indicating that $X_1, \ldots, X_{10}$ are effective in explaining about 90% of the variation in $y$.

Table 10.2.2 gives the marginal standard deviations of all eleven variables, and Table 10.2.3 gives values of the regression coefficients, $\hat B$, $\tilde B$ and $B^*$ for the Gorman and Toman data. For $n = 36$,

$$C = .97742 ,$$

so that even for relatively small samples, the bias in $\tilde B$ is small. We note that the differences between values of $\hat B$ and $\tilde B$ are much greater for these data than in the previous example. In every case, $\tilde B$ and $B^*$ are larger than $\hat B$ in absolute value, verifying that $\hat B$ does tend to underestimate the population standardized regression coefficient in absolute value.

Values of $\underline t = (n-1)^{1/2}\tilde B$ are presented in Table 10.2.4. As in the

first example, the components of $\underline t$ are treated as univariate $t$ statistics and compared to critical values at the .05 and .01 levels of significance.

Table 10.2.1. Correlation matrix for the Gorman and Toman data.

         X1      X2        X3        X4       X5       X6       X7       X8         X9       X10      y
  X1   1.000   -.04117    .5110     .1175   -.7082   -.8679   -.1158   -.0001792  -.1609   -.3153   -.1991
  X2           1.000     -.003293  -.1568    .06393   .09460   .1300    .005631   -.2415   -.3094   -.2078
  X3                      1.000     .003326 -.5897   -.6484   -.09676   .3382     -.02926  -.3654   -.2252
  X4                                1.000   -.06697  -.08928  -.02487   .07954     .05285   .05433   .1008
  X5                                         1.000    .8409    .2524   -.3652     -.07572   .3726    .1800
  X6                                                  1.000    .1525   -.1977      .1114    .3967    .2562
  X7                                                           1.000   -.1776      .5261    .4710    .4124
  X8                                                                    1.000      .2662   -.1168    .09065
  X9                                                                               1.000    .4357    .4511
  X10                                                                                       1.000    .9144
  y                                                                                                  1.000

Table 10.2.2. Marginal standard deviations of $X_1, \ldots, X_{10}$ and $y$ for the Gorman and Toman data.

  Variable   Marginal Standard Deviation
  X1            2.37164
  X2             .52699
  X3             .98786
  X4             .41831
  X5             .33922
  X6             .41350
  X7             .33858
  X8            2.10839
  X9           16.23543
  X10           9.86270
  y             3.99514

  s = 1.46546

Table 10.2.3. Values of the regression coefficients, $\hat B$, $\tilde B$ and $B^*$ for the Gorman and Toman data (dependent variable $y$).

  Variable   Regression Coefficient    B-hat      B-tilde      B*
  X1            .12078                 .071701    .195470      .19106
  X2            .95493                 .125962    .343396      .33564
  X3            .08793                 .021742    .059273      .05793
  X4            .37275                 .039028    .106399      .10400
  X5          -1.69761                -.144142   -.392958     -.38409
  X6            .65778                 .068080    .185600      .18141
  X7           -.26913                -.022808   -.062180     -.06078
  X8            .30607                 .161523    .440344      .43040
  X9           -.00308                -.012523   -.034141     -.03337
  X10           .42272                1.043558   2.844941     2.78070

Table 10.2.4. Values of $t = (n-1)^{1/2}\tilde B$ for the Gorman and Toman data.

  Variable      t
  X1            .52467
  X2           1.54150
  X3            .24986
  X4            .61067
  X5           -.98165
  X6            .35249
  X7           -.21484
  X8           1.97716*
  X9           -.11881
  X10         11.67196**
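The relationships among the columns of Table 10.2.3 can be verified numerically. The sketch below assumes $\hat B$ is the usual beta coefficient, $\hat B = b\,s_x/s_y$, with $\tilde B = (s_y/s)\hat B$ and $B^* = C\,\tilde B$, $C = .97742$; the values it produces for $X_1$ agree with the tabulated ones to four decimal places:

```python
# Reproduce the X1 row of Table 10.2.3 from the raw regression
# coefficient and the standard deviations of Table 10.2.2.
b, s_x = 0.12078, 2.37164       # raw coefficient and S.D. of X1
s_y, s, C = 3.99514, 1.46546, 0.97742

B_hat   = b * s_x / s_y         # usual "beta coefficient"
B_tilde = (s_y / s) * B_hat     # proposed invariant coefficient
B_star  = C * B_tilde           # bias-corrected version

print(round(B_hat, 4), round(B_tilde, 4), round(B_star, 4))  # 0.0717 0.1955 0.1911
```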

Section 10.3. Summary

The examples presented in the preceding sections illustrate the use of $\tilde B$ instead of $\hat B$ as a "standardized regression coefficient." Since the bias in $\tilde B$ is quite small, even for small samples, since $\hat B$ tends to underestimate the "population standardized regression coefficient," and since the distribution of $\hat B$ is unwieldy, we recommend the use of $\tilde B$ instead of $\hat B$ when the aim is to estimate the parameters in the linear model in an invariant manner.

We note that when the data indicate weak relationships among the dependent and independent variables, the numerical values of $\hat B$ and $\tilde B$ differ very little. Dramatic increases in the values for $\tilde B$ over those for $\hat B$ occur, however, if reasonably strong relationships are present. Examination of Tables 10.2.1 and 10.2.3 indicates that, in fact, correlations need be not much larger than about .15 in order for relatively large differences in the values of $\hat B$ and $\tilde B$ to be observed. Thus, in problems in which any reasonable relationships are present, use of $\tilde B$ instead of $\hat B$ is likely to lead to more satisfactory results.

REFERENCES

Brown, L. D. (1966). On the admissibility of invariant estimators of one or more location parameters. Ann. Math. Statist., 37, 1087-1136.

Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press, Princeton, New Jersey.

Ferguson, T. S. (1967). Mathematical Statistics: A Decision Theoretic Approach. Academic Press, New York.

Fisher, R. A. (1934). Two new properties of mathematical likelihood. Proc. Roy. Soc. London, Ser. A, 144, 285-307.

Fraser, D. A. S. (1961a). On fiducial inference. Ann. Math. Statist., 32, 661-676.

Fraser, D. A. S. (1961b). The fiducial method and invariance. Biometrika, 48, 261-280.

Fraser, D. A. S. (1964). Fiducial inference for location and scale parameters. Biometrika, 51, 17-24.

Fraser, D. A. S. (1968). The Structure of Inference. John Wiley and Sons, New York.

Goldhaber, J. K. and Ehrlich, G. (1970). Algebra. Macmillan, London.

Good, I. J. (1972). Personal communication.

Gorman, J. W. and Toman, R. J. (1966). Selection of variables for fitting equations to data. Technometrics, 8, 27-51.

Graybill, F. A. (1961). An Introduction to Linear Statistical Models, Volume I. McGraw-Hill, New York.

Hall, W. J., Wijsman, R. A. and Ghosh, J. K. (1965). The relationship between sufficiency and invariance with applications in sequential analysis. Ann. Math. Statist., 36, 575-614.

Herstein, I. N. (1964). Topics in Algebra. Blaisdell, New York.

Hodgman, C. D., Editor. (1959). Chemical Rubber Company Standard Mathematical Tables. Chemical Rubber Publishing Company, Cleveland, Ohio.

Hogg, R. V. and Craig, A. T. (1965). Introduction to Mathematical Statistics. Macmillan, New York.

Hora, R. B. and Buehler, R. J. (1966). Fiducial theory and invariant estimation. Ann. Math. Statist., 37, 643-656.

James, A. T. (1954). Normal multivariate analysis and the orthogonal group. Ann. Math. Statist., 25, 40-75.

James, A. T. (1964). Distribution of matrix variates and latent roots derived from normal samples. Ann. Math. Statist., 35, 475-501.

Kiefer, J. (1957). Invariance, minimax sequential estimation, and continuous time processes. Ann. Math. Statist., 28, 573-601.

Kudo, H. (1955). On minimax invariant estimates of the transformation parameter. Natural Science Report of the Ochanomizu University, 6, 31-73.

Nachbin, L. (1965). The Haar Integral. D. Van Nostrand, Princeton, New Jersey.

Pitman, E. J. G. (1939). The estimation of the location and scale parameters of a continuous population of any given form. Biometrika, 30, 391-421.

Stein, C. (1959). The admissibility of Pitman's estimator of a single location parameter. Ann. Math. Statist., 30, 970-979.

Wijsman, R. A. (1957). Random orthogonal transformations and their use in some classical distribution problems in multivariate analysis. Ann. Math. Statist., 28, 415-423.

Wijsman, R. A. (1965). Cross-sections of orbits and their application to densities of maximal invariants. Proc. Fifth Berkeley Symp. Math. Statist. and Probab., 1, 389-400.

Younger, M. S. (1969). The Principle of Invariance in Statistics. Master of Science Thesis, Virginia Polytechnic Institute and State University.

Zacks, S. (1971). The Theory of Statistical Inference. Wiley, New York.

INVARIANT ESTIMATION WITH APPLICATION TO LINEAR MODELS

by

Mary Sue Younger

(ABSTRACT)

The method of invariant estimation proposed in this dissertation relies on defining a group of transformations on the sample space such that i) the group structure is isomorphic to the parameter space and "carries" the estimation problem in a natural manner (thus defining an equivariant estimation problem), and ii) the group structure generates orbits on the parameter space and the problem is to estimate the orbit in which the parameter lies (thus defining an invariant estimation problem). If the group of transformations can be expressed as the semi-direct product of two subgroups, one a "nuisance" group which is a normal subgroup, then an estimator of orbits under the nuisance group in the invariant estimation problem can be naturally obtained from the best estimator in the equivariant estimation problem.

The primary application is to the invariant estimation of the parameters in the general linear model under the (nuisance) group of scale changes on the dependent and independent variables. The invariant estimator of the regression coefficient is found to be a "standardized regression coefficient," but this standardized regression coefficient is not the same as the typical one ("beta coefficient") found in elementary statistics texts and social science research.

Comparison of the proposed estimator to the usual estimator, in the case in which the input matrix is nonstochastic, shows the proposed estimator to be superior to the usual estimator in terms of such criteria as consistency, unbiasedness, and simplicity of distribution. In the case in which the input matrix is stochastic, some justification can be found for the use of the usual estimator.

Application of the proposed method of invariant estimation to the problem of obtaining estimators invariant under nonsingular transformations is straightforward, although the estimator obtained is difficult to interpret.