Maximum-Likelihood Estimation: Basic Ideas

• The method of maximum likelihood provides estimators that have both a reasonable intuitive basis and many desirable statistical properties.
• The method is very broadly applicable and is simple to apply.
• Once a maximum-likelihood estimator is derived, the general theory of maximum-likelihood estimation provides standard errors, statistical tests, and other results useful for statistical inference.
• A disadvantage of the method is that it frequently requires strong assumptions about the structure of the data.


1. An Example

• We want to estimate the probability $\pi$ of getting a head upon flipping a particular coin.
  • We flip the coin 'independently' 10 times (i.e., we sample $n = 10$ flips), obtaining the following result: $HHTHHHTTHH$.
  • The probability of obtaining this sequence, in advance of collecting the data, is a function of the unknown parameter $\pi$:

    $$\Pr(\text{data} \mid \text{parameter}) = \Pr(HHTHHHTTHH \mid \pi) = \pi\pi(1-\pi)\pi\pi\pi(1-\pi)(1-\pi)\pi\pi = \pi^7(1-\pi)^3$$

  • But the data for our particular sample are fixed: we have already collected them.
  • The parameter $\pi$ also has a fixed value, but this value is unknown, and so we can let it vary in our imagination between 0 and 1, treating the probability of the observed data as a function of $\pi$.
  • This function is called the likelihood function:

    $$L(\text{parameter} \mid \text{data}) = L(\pi \mid HHTHHHTTHH) = \pi^7(1-\pi)^3$$

• The probability function and the likelihood function are given by the same equation, but the probability function is a function of the data with the value of the parameter fixed, while the likelihood function is a function of the parameter with the data fixed.
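The step from the flip-by-flip product to the compact form $\pi^7(1-\pi)^3$ can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `sequence_likelihood` and the string encoding of the flips ('H' for head, 'T' for tail, with the sequence taken to be HHTHHHTTHH) are assumptions made for illustration.

```python
import math


def sequence_likelihood(flips: str, pi: float) -> float:
    """Probability of one specific head/tail sequence when Pr(head) = pi.

    Flips are treated as independent, so the per-flip probabilities multiply.
    """
    prob = 1.0
    for flip in flips:
        prob *= pi if flip == "H" else (1.0 - pi)
    return prob


# Assumed encoding of the observed sample: 7 heads and 3 tails.
flips = "HHTHHHTTHH"
heads = flips.count("H")
tails = flips.count("T")

for pi in (0.3, 0.5, 0.7):
    by_product = sequence_likelihood(flips, pi)
    closed_form = pi**heads * (1.0 - pi) ** tails
    # The two columns should agree up to floating-point rounding.
    print(f"pi={pi:.1f}  product={by_product:.6g}  closed form={closed_form:.6g}")
```

Holding `flips` fixed and varying `pi` is exactly the change of viewpoint from probability function to likelihood function: the same expression is evaluated, but the data are the constant and the parameter is the argument.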

  • Here are some representative values of the likelihood for different values of $\pi$:

      $\pi$     $L(\pi \mid \text{data}) = \pi^7(1-\pi)^3$
      0.0       0.0
      0.1       0.0000000729
      0.2       0.00000655
      0.3       0.0000750
      0.4       0.000354
      0.5       0.000977
      0.6       0.00179
      0.7       0.00222
      0.8       0.00168
      0.9       0.000478
      1.0       0.0

  • The complete likelihood function is graphed in Figure 1.
  • Although each value of $L(\pi \mid \text{data})$ is a notional probability, the function $L(\pi \mid \text{data})$ is not a probability or density function: it does not enclose an area of 1.
  • The probability of obtaining the sample of data that we have in hand, $HHTHHHTTHH$, is small regardless of the true value of $\pi$.
    – This is usually the case: any specific sample result, including the one that is realized, will have low probability.
  • Nevertheless, the likelihood contains useful information about the unknown parameter $\pi$.
  • For example, $\pi$ cannot be 0 or 1, and is 'unlikely' to be close to 0 or 1.

• Reversing this reasoning, the value of $\pi$ that is most supported by the data is the one for which the likelihood is largest.
  • This value is the maximum-likelihood estimate (MLE), denoted $\hat{\pi}$.
  • Here, $\hat{\pi} = .7$, which is the sample proportion of heads, 7/10.
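The tabulated values, the location of the largest one, and the claim that the likelihood does not enclose an area of 1 can all be reproduced in a few lines. This is an illustrative sketch only: the names `likelihood`, `grid`, `best_pi` and `area` are introduced here, and the area is approximated with a simple midpoint Riemann sum rather than any method from the slides.

```python
def likelihood(pi: float, heads: int = 7, tails: int = 3) -> float:
    """Likelihood of a specific sequence with the given head and tail counts."""
    return pi**heads * (1.0 - pi) ** tails


# Evaluate the likelihood on the same grid as the table: 0.0, 0.1, ..., 1.0.
grid = [i / 10 for i in range(11)]
table = {pi: likelihood(pi) for pi in grid}
for pi, value in table.items():
    print(f"{pi:.1f}  {value:.3g}")

# Grid value with the largest likelihood (a coarse stand-in for the MLE).
best_pi = max(table, key=table.get)
print("largest likelihood on the grid at pi =", best_pi)

# Midpoint Riemann sum for the area under L(pi | data) on [0, 1].
steps = 100_000
area = sum(likelihood((k + 0.5) / steps) for k in range(steps)) / steps
print(f"approximate area under the likelihood: {area:.6g}")
```

The grid maximum falls at 0.7, and the approximate area is far below 1 (the exact value of this integral is 1/1320, about 0.00076), which is why the likelihood cannot be read as a density for $\pi$.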

• More generally, for $n$ independent flips of the coin, producing a particular sequence that includes $x$ heads and $n - x$ tails,

    $$L(\pi \mid \text{data}) = \Pr(\text{data} \mid \pi) = \pi^x(1-\pi)^{n-x}$$

  • We want the value of $\pi$ that maximizes $L(\pi \mid \text{data})$, which we often abbreviate $L(\pi)$.
  • It is simpler, and equivalent, to find the value of $\pi$ that maximizes the log of the likelihood:

    $$\log_e L(\pi) = x \log_e \pi + (n - x)\log_e(1-\pi)$$

  • Differentiating $\log_e L(\pi)$ with respect to $\pi$ produces

    $$\frac{d \log_e L(\pi)}{d\pi} = \frac{x}{\pi} + (n - x)\,\frac{1}{1-\pi}\,(-1) = \frac{x}{\pi} - \frac{n - x}{1-\pi}$$
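As a sanity check on the derivative, the analytic expression can be compared with a finite-difference slope of the log-likelihood. The sketch below is an illustration under stated assumptions: the function names `log_likelihood` and `score`, the step size `h`, and the choice of $x = 7$, $n = 10$ from the coin example are all introduced here and do not come from the slides.

```python
import math


def log_likelihood(pi: float, x: int, n: int) -> float:
    """Natural log of the likelihood for x heads in n independent flips."""
    return x * math.log(pi) + (n - x) * math.log(1.0 - pi)


def score(pi: float, x: int, n: int) -> float:
    """Analytic derivative of the log-likelihood with respect to pi."""
    return x / pi - (n - x) / (1.0 - pi)


x, n = 7, 10  # counts from the coin example
h = 1e-6      # assumed step size for the central difference

for pi in (0.5, 0.7, 0.9):
    numeric = (log_likelihood(pi + h, x, n) - log_likelihood(pi - h, x, n)) / (2 * h)
    print(f"pi={pi:.1f}  analytic={score(pi, x, n):+.6f}  numeric={numeric:+.6f}")
```

The derivative is positive at 0.5, essentially zero at 0.7, and negative at 0.9, so the log-likelihood rises up to the sample proportion and falls after it.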


Figure 1. Likelihood of observing 7 heads and 3 tails in a particular sequence for different values of the probability of observing a head, $\pi$. [Plot not reproduced: $L(\pi \mid \text{data})$ on the vertical axis against $\pi$ from 0.0 to 1.0 on the horizontal axis.]

  • Setting the derivative to 0 and solving produces the MLE which, as before, is the sample proportion $x/n$:

    $$\frac{x}{\hat{\pi}} = \frac{n - x}{1 - \hat{\pi}} \;\Longrightarrow\; x(1 - \hat{\pi}) = (n - x)\hat{\pi} \;\Longrightarrow\; \hat{\pi} = \frac{x}{n}$$

  • The maximum-likelihood estimator is $\hat{\pi} = X/n$.

2. Properties of Maximum-Likelihood Estimators

Under very broad conditions, maximum-likelihood estimators have the following general properties:

• Maximum-likelihood estimators are consistent.
• They are asymptotically unbiased, although they may be biased in finite samples.
• They are asymptotically efficient: no asymptotically unbiased estimator has a smaller asymptotic variance.
• They are asymptotically normally distributed.
• If there is a sufficient statistic for a parameter, then the maximum-likelihood estimator of the parameter is a function of a suf