R. H. Norden, 1972: A Survey of Maximum Likelihood Estimation (Parts 1-2 Combined)

A Survey of Maximum Likelihood Estimation
Author(s): R. H. Norden
Source: International Statistical Review / Revue Internationale de Statistique, Vol. 40, No. 3 (Dec., 1972), pp. 329-354. Published by: International Statistical Institute (ISI). Stable URL: http://www.jstor.org/stable/1402471

A Survey of Maximum Likelihood Estimation: Part 2
Author(s): R. H. Norden
Source: International Statistical Review / Revue Internationale de Statistique, Vol. 41, No. 1 (Apr., 1973), pp. 39-58. Published by: International Statistical Institute (ISI). Stable URL: http://www.jstor.org/stable/1402786

The International Statistical Institute (ISI) is collaborating with JSTOR to digitize, preserve and extend access to the International Statistical Review / Revue Internationale de Statistique.

A Survey of Maximum Likelihood Estimation
R. H. Norden
University of Bath, England

Contents

Abstract
1. Introduction
2. Historical note
3. Some fundamental properties
4. Consistency
5. Efficiency
6.* Consistency and efficiency continued
7.* Comparison of MLE's with other estimators
8.* Multiparameter situations
9.* Theoretical difficulties
10.* Summary
11.* Appendix of important definitions
12. Note on arrangement of bibliography
13. Bibliography

(* Part 2 following in the next issue of the Review)

Abstract

This survey, which is in two parts, is expository in nature and gives an account of the development of the theory of Maximum Likelihood Estimation (MLE) since its introduction in the papers of Fisher (1922, 1925) up to the present day, where original work in this field still continues. After a short introduction there follows a historical account in which, in particular, it is shown that Fisher may indeed be rightfully regarded as the originator of this method of estimation, since all previous apparently similar methods depended on the method of inverse probability.

There then follows a summary of a number of fundamental definitions and results which originate from the papers of Fisher himself and the later contributions of Cramer and Rao. Consistency and efficiency in relation to maximum likelihood estimators (MLE's) are then considered in detail, and an account of Cramer's important (1946) theorem is given, though it is remarked that, with regard to consistency, Wald's (1949) proof has greater generality. Comments are then made on a number of subsequent related contributions.

The paper then continues with a discussion of the problem of comparing MLE's with other estimators. Special mention is then made of Rao's measure of "second-order efficiency", since this does provide a means of discriminating between the various best asymptotically Normal (BAN) estimators, of which maximum likelihood estimators form a subclass. Subsequently there is a survey of work in the multiparameter field, which begins with a brief account of the classical Cramer-Rao theory. Finally there is a section on the theoretical difficulties which arise by reason of the existence of inconsistent maximum likelihood estimates and of estimators which are super-efficient. Particular reference is made to Rao's (1962) paper, in which these anomalies were considered and at least partially resolved by means of alternative criteria of consistency and efficiency which nevertheless are rooted in Fisher's original ideas.
1. Introduction

The aim of this paper and its sequel is to give an account of the theoretical aspect of Maximum Likelihood Estimation. To this end we shall consider a number of contributions in detail, in which the main themes are consistency and efficiency, both with regard to single- and multi-parameter situations.

The scope of these papers will also include a discussion of various theoretical difficulties, with special reference to examples of inconsistency of maximum likelihood estimators and super-efficiency of other estimators. The related question of to what extent Maximum Likelihood Estimation has an advantage over other methods of estimation will also be considered.

An extensive bibliography of nearly 400 titles, divided into four main groups, is given at the end of the first paper, in which it is hoped that all, or at least almost all, contributions which are in some way related to this subject are included. In the text we refer to papers mainly in the first two groups.

2. Historical Note

In two renowned papers R. A. Fisher (1922, 1925) introduced into statistical theory the concepts of consistency, efficiency and sufficiency, and he also advocated the use of Maximum Likelihood as an estimation procedure. His 1922 definition of consistency, although important from the theoretical standpoint, cannot easily be applied to actual situations, and he gave a further definition in his 1925 paper: "A statistic is said to be a consistent estimate of any parameter if when calculated from an indefinitely large sample it tends to be accurately equal to that parameter."

This is the definition of consistency which is now in general use, and mathematically it may be stated thus: the estimator $T_n$ of a parameter $\theta$, based on a sample of size $n$, is consistent for $\theta$ if, for arbitrary $\epsilon > 0$,

$$P[\,|T_n - \theta| > \epsilon\,] \to 0 \quad \text{as } n \to \infty.$$

We shall refer to Fisher's earlier definition in Section 9, however, where we consider examples of inconsistency of MLE's.

With regard to efficiency, he says: "The criterion of efficiency is satisfied by those statistics which when derived from large samples tend to a normal distribution with the least possible standard deviation". Later, in 1925, he says: "We shall prove that when an efficient statistic exists, it may be found by the method of Maximum Likelihood Estimation". Evidently, therefore, he regarded efficiency as an essentially asymptotic property. Nowadays we would say that an estimator is efficient, whatever the sample size, if its variance attains the Cramer-Rao lower bound, and Fisher's property would be described as asymptotic efficiency.

His definition of sufficiency is as follows: "A statistic satisfies the criterion of sufficiency when no other statistic which can be calculated from the same sample provides any additional information as to the value of the parameter to be estimated". It is well known, of course, that there is a close relation between sufficient statistics and Maximum Likelihood Estimation.

Further, he defines the likelihood as follows: "The likelihood that any parameter (or set of parameters) should have any assigned value (or set of values) is proportional to the probability that if this were so, the totality of observations should be that observed".

The method of Maximum Likelihood as expounded by Fisher consists of regarding the likelihood as a function of the unknown parameter and obtaining the value of this parameter which makes the likelihood greatest. Such a value is then said to be the Maximum Likelihood Estimate of the parameter.
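To make Fisher's consistency criterion and the modern Cramer-Rao notion of efficiency quoted above concrete, the short simulation below is offered as an editorial illustration rather than part of Norden's text; the Normal model with known standard deviation, the true value $\theta = 2$, the tolerance $\epsilon = 0.1$ and the number of replications are all assumptions made only for this example. It estimates $P[\,|T_n - \theta| > \epsilon\,]$ for the sample mean (the MLE of a Normal mean) at increasing sample sizes, and compares the empirical variance of $T_n$ with the lower bound $\sigma^2/n$.

    # A minimal Monte Carlo sketch (not from Norden's paper); model, parameter
    # values and sample sizes are assumptions chosen only for this illustration.
    import numpy as np

    rng = np.random.default_rng(0)
    theta, sigma, eps, reps = 2.0, 1.0, 0.1, 1000   # true mean, known sd, tolerance, replications

    for n in (10, 100, 1000, 10000):
        samples = rng.normal(theta, sigma, size=(reps, n))
        t_n = samples.mean(axis=1)                  # the sample mean is the MLE of theta
        p_out = np.mean(np.abs(t_n - theta) > eps)  # Monte Carlo estimate of P[|T_n - theta| > eps]
        crlb = sigma**2 / n                         # Cramer-Rao lower bound for unbiased estimators
        print(f"n={n:6d}  P[|T_n-theta|>eps]={p_out:.3f}  var(T_n)={t_n.var():.6f}  CRLB={crlb:.6f}")

Under these assumptions the estimated exceedance probability falls towards zero as $n$ grows, while the empirical variance of $T_n$ tracks $\sigma^2/n$, which is the sense in which the sample mean is here both consistent and efficient.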
A formal definition of the method would now therefore be as follows. Let $X$ be a random variable with probability density function $f = f(x; \theta)$, where the form of $f$ is known but the parameter $\theta$, which may be a vector, is unknown. Suppose $x_1, \ldots, x_n$ are the realized values of $X$ in a sample of $n$ independent observations. Define the Likelihood Function (LF) by

$$L(\theta) = f(x_1; \theta) \cdots f(x_n; \theta).$$

Then if $\hat{\theta}$ is such that

$$L(\hat{\theta}) = \sup_{\theta \in \Theta} L(\theta),$$

where $\Theta$ is the parameter space, i.e. the set of possible values of $\theta$, then we say that $\hat{\theta}$ is the MLE of $\theta$. If there is no unique MLE, then $\hat{\theta}$ will be understood to be any value of $\theta$ at which $L$ attains its supremum. (A short numerical sketch of this definition is given below.)

The question as to whether Fisher may be rightly regarded as the originator of the method of MLE has arisen from time to time, so that although most statisticians would now attribute the method to him, there does not appear to have been absolute agreement on this matter. There is, however, a full discussion of the historical aspect by C. R. Rao (1962), where he surveys the work of F. Y. Edgeworth (1908a and b), and also of Gauss, Laplace and Pearson in so far as they are relevant to this question.

According to Rao, all arguments prior to those of Fisher which appeared to be of a ML type were in fact dependent on the method of inverse probability, i.e. the unknown parameter is estimated by maximizing the "a posteriori" probability derived from Bayes' law. They could not therefore be regarded as MLE as such, for as Rao remarks, "If $\hat{\theta}$ is a maximum likelihood estimate of $\theta$, then $g(\hat{\theta})$ is a maximum likelihood estimate of any one-to-one function $g(\theta)$, while such a property is not true of estimates obtained by the inverse probability argument".

After further discussion he concludes by saying: "We, therefore, do not have any literature supporting prior claims to the method of maximum likelihood estimation as a principle capable of wide application, and justifying its use on reasonable criteria (such as sufficiency in a sense wider than that used by Edgeworth, and consistency) and not on inverse probability argument, before the fundamental contributions by Fisher in 1922 and 1925".

This agrees with the view put forward by Le Cam (1953), who also gives an outline account of developments of MLE since 1925, particularly with regard to the various attempts that were made to prove rigorously that MLE's are both consistent and efficient. Fisher himself, of course, provided a proof of efficiency, but there was no explicit statement of the restrictions on the type of estimation situation to which his conclusions would apply. He also gave no separate proof of consistency, although this could be regarded as implied by his proof at
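The following sketch is not part of Norden's survey; it is a minimal illustration of the formal definition above, under assumptions chosen only for the example: an Exponential model parameterized by its rate, a simulated sample of 500 observations, and NumPy/SciPy routines for the numerical maximization. It locates the supremum of $L$ numerically and compares it with the closed-form MLE $\hat{\theta} = 1/\bar{x}$; the last line also echoes Rao's invariance remark quoted above.

    # A minimal, illustrative sketch (not from the paper): the Exponential(rate)
    # model, sample size and variable names are assumptions for this example only.
    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(1)
    true_rate = 3.0
    x = rng.exponential(scale=1.0 / true_rate, size=500)   # simulated sample x1, ..., xn

    def neg_log_likelihood(rate):
        # -log L(rate), where L(rate) = prod_i rate * exp(-rate * x_i)
        return -(x.size * np.log(rate) - rate * x.sum())

    # Locate the supremum of L (equivalently, the minimum of -log L) numerically.
    res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 50.0), method="bounded")
    print("numerical MLE of the rate :", res.x)
    print("closed-form MLE, 1/mean(x):", 1.0 / x.mean())
    # Invariance (Rao's remark): the MLE of the mean 1/rate is simply 1/rate_hat = mean(x).
    print("MLE of the mean, mean(x)  :", x.mean())

Under these assumptions the numerical and closed-form estimates agree to the optimizer's tolerance, which is all the sketch is intended to show.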