Week 6

Interpretation, Interaction, & Heteroskedasticity

Rich Frank
University of New Orleans
September 27, 2012

Interpretation

. How do we take the output that our statistical software gives us and interpret it in a way that is meaningful?

. What effect is theoretically the most important for your argument?

. How can you interpret this estimated effect?

MLE Class 6

General Tests

. Likelihood Ratio
. Wald
. Lagrange Multiplier (“Score”)

. All asymptotically equivalent when H0 is true.

. No consensus which is preferred in small samples.

. The LR test only involves subtraction, while the Wald test requires matrix manipulation.

. I have not seen many instances of the Lagrange Multiplier used in the literature.

. All use nested models.

. A model (let’s call it A) is considered nested in another (B) if you can create Model A by restricting Model B’s parameters in some way.

. Usually this restriction is accomplished by setting a parameter (β_i) equal to zero.

. In Stata, you can do this by dropping the corresponding variable (X_i) from the model.

What is the Chi-Squared distribution?

. The χ2-distribution (a PDF) with v degrees of freedom is given by the equation:

Φ(x, v) = [1 / (2^(v/2) Γ(v/2))] (e^(−x/2)) (x^((v−2)/2))

Where v = degrees of freedom

. For df over 100, the χ2 approximates the normal distribution.
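As a quick sanity check on the density above, here is a minimal sketch in Python (the function name `chi2_pdf` and the test point are mine, not from the slides):

```python
import math

def chi2_pdf(x, v):
    """Chi-squared density with v degrees of freedom:
    (1 / (2^(v/2) * Gamma(v/2))) * e^(-x/2) * x^((v-2)/2)."""
    return (math.exp(-x / 2) * x ** ((v - 2) / 2)) / (2 ** (v / 2) * math.gamma(v / 2))

# With v = 2 the density collapses to (1/2)e^(-x/2), an easy check by hand
print(chi2_pdf(2.0, 2))
```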

See also Long (1997: 89)

. All of these are described well in Long (1997).

. Especially useful for understanding the intuition behind the different approaches is Figure 4.2 (page 88).

Likelihood Ratio Test

. Does the likelihood change much under the null hypothesis versus our alternative?

. You actually have to run two models to run the LR test.

. Stata command: lrtest

. The likelihood ratio statistic is calculated as follows:

G²(M_C | M_U) = 2lnL(M_U) − 2lnL(M_C)

Let’s try this in Stata and by hand

** LR Test of Ethnic Fractionalization using Fearon and Laitin (2003) **
. logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
> ethfrac relfrac

Iteration 0:   log likelihood = -534.46236
Iteration 1:   log likelihood = -489.16648
Iteration 2:   log likelihood = -477.13991
Iteration 3:   log likelihood = -476.83237
Iteration 4:   log likelihood = -476.83175
Iteration 5:   log likelihood = -476.83175

Logistic regression                               Number of obs   =       6326
                                                  LR chi2(12)     =     115.26
                                                  Prob > chi2     =     0.0000
Log likelihood = -476.83175                       Pseudo R2       =     0.1078

------------------------------------------------------------------------------
       onset |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        warl |  -.8826025   .3120193    -2.83   0.005    -1.494149   -.2710558
    lgdpenl1 |  -.6441048   .1241874    -5.19   0.000    -.8875076    -.400702
      lpopl1 |   .2566972   .0759579     3.38   0.001     .1078224    .4055719
     lmtnest |   .1816345   .0841573     2.16   0.031     .0166893    .3465797
     ncontig |   .3343231   .2800065     1.19   0.232    -.2144795    .8831258
         Oil |   .7628113   .2812081     2.71   0.007     .2116535    1.313969
     nwstate |   1.656059   .3452964     4.80   0.000       .97929    2.332827
      instab |   .4926794   .2464971     2.00   0.046      .009554    .9758048
       anocl |   .6148623   .2406123     2.56   0.011     .1432709    1.086454
        deml |   .0955259   .3141988     0.30   0.761    -.5202923    .7113441
     ethfrac |   .2201434    .367535     0.60   0.549    -.5002119    .9404987
     relfrac |  -.0441585   .5111071    -0.09   0.931     -1.04591    .9575929
       _cons |  -2.749393   1.191852    -2.31   0.021    -5.085381   -.4134058
------------------------------------------------------------------------------

. estimates store A

** Restricted Model does not include ethfrac **
. logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
> relfrac

Iteration 0:   log likelihood = -534.46236
Iteration 1:   log likelihood = -489.23147
Iteration 2:   log likelihood = -477.31961
Iteration 3:   log likelihood = -477.01331
Iteration 4:   log likelihood =  -477.0127
Iteration 5:   log likelihood =  -477.0127

Logistic regression                               Number of obs   =       6326
                                                  LR chi2(11)     =     114.90
                                                  Prob > chi2     =     0.0000
Log likelihood = -477.0127                        Pseudo R2       =     0.1075

------------------------------------------------------------------------------
       onset |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        warl |  -.8691268   .3109841    -2.79   0.005    -1.478644   -.2596092
    lgdpenl1 |  -.6582621   .1213198    -5.43   0.000    -.8960446   -.4204795
      lpopl1 |   .2619946   .0745104     3.52   0.000      .115957    .4080323
     lmtnest |   .1758016   .0836563     2.10   0.036     .0118382     .339765
     ncontig |   .3372547   .2787012     1.21   0.226    -.2089895    .8834989
         Oil |   .7898462   .2776031     2.85   0.004     .2457542    1.333938
     nwstate |   1.666944   .3449221     4.83   0.000      .990909    2.342979
      instab |   .4936941    .246361     2.00   0.045     .0108354    .9765529
       anocl |   .6196298   .2405215     2.58   0.010     .1482162    1.091043
        deml |   .0916092   .3145002     0.29   0.771    -.5247999    .7080183
     relfrac |   .0180241   .5006194     0.04   0.971     -.963172    .9992202
       _cons |  -2.615902   1.162481    -2.25   0.024    -4.894322   -.3374821
------------------------------------------------------------------------------

. estimates store B

. lrtest A

Likelihood-ratio test                                 LR chi2(1)  =      0.36
(Assumption: B nested in A)                           Prob > chi2 =    0.5475

. We can also easily do this by hand:

G²(M_C | M_U) = 2lnL(M_U) − 2lnL(M_C)
              = 2(−476.83175) − 2(−477.0127)
              = (−953.6635) − (−954.0254)
              = .3619
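The same arithmetic, plus the p-value Stata reports, can be reproduced outside Stata; a short Python sketch using the log likelihoods above (for 1 degree of freedom, the χ² tail probability reduces to a complementary error function):

```python
import math

llf_u = -476.83175   # unrestricted model (with ethfrac)
llf_c = -477.0127    # constrained model (ethfrac dropped)

g2 = 2 * llf_u - 2 * llf_c        # LR statistic, chi-squared with 1 df
p = math.erfc(math.sqrt(g2 / 2))  # P(chi2_1 > g2) = 2*(1 - Phi(sqrt(g2)))

# Compare with Stata: LR chi2(1) = 0.36, Prob > chi2 = 0.5475
print(round(g2, 4), round(p, 4))
```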

. This null finding is not surprising because the beta for ethnic fractionalization is not significant. Let’s try it with “New State.”

. logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
> ethfrac relfrac

Iteration 0:   log likelihood = -534.46236
Iteration 1:   log likelihood = -489.16648
Iteration 2:   log likelihood = -477.13991
Iteration 3:   log likelihood = -476.83237
Iteration 4:   log likelihood = -476.83175
Iteration 5:   log likelihood = -476.83175

Logistic regression                               Number of obs   =       6326
                                                  LR chi2(12)     =     115.26
                                                  Prob > chi2     =     0.0000
Log likelihood = -476.83175                       Pseudo R2       =     0.1078

------------------------------------------------------------------------------
       onset |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        warl |  -.8826025   .3120193    -2.83   0.005    -1.494149   -.2710558
    lgdpenl1 |  -.6441048   .1241874    -5.19   0.000    -.8875076    -.400702
      lpopl1 |   .2566972   .0759579     3.38   0.001     .1078224    .4055719
     lmtnest |   .1816345   .0841573     2.16   0.031     .0166893    .3465797
     ncontig |   .3343231   .2800065     1.19   0.232    -.2144795    .8831258
         Oil |   .7628113   .2812081     2.71   0.007     .2116535    1.313969
     nwstate |   1.656059   .3452964     4.80   0.000       .97929    2.332827
      instab |   .4926794   .2464971     2.00   0.046      .009554    .9758048
       anocl |   .6148623   .2406123     2.56   0.011     .1432709    1.086454
        deml |   .0955259   .3141988     0.30   0.761    -.5202923    .7113441
     ethfrac |   .2201434    .367535     0.60   0.549    -.5002119    .9404987
     relfrac |  -.0441585   .5111071    -0.09   0.931     -1.04591    .9575929
       _cons |  -2.749393   1.191852    -2.31   0.021    -5.085381   -.4134058
------------------------------------------------------------------------------

. estimates store A

** Restricted Model does not include New State **
. logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil instab anocl deml ///
> ethfrac relfrac

Iteration 0:   log likelihood = -534.46236
Iteration 1:   log likelihood = -507.63265
Iteration 2:   log likelihood = -485.83417
Iteration 3:   log likelihood =  -485.7061
Iteration 4:   log likelihood = -485.70598
Iteration 5:   log likelihood = -485.70598

Logistic regression                               Number of obs   =       6326
                                                  LR chi2(11)     =      97.51
                                                  Prob > chi2     =     0.0000
Log likelihood = -485.70598                       Pseudo R2       =     0.0912

------------------------------------------------------------------------------
       onset |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        warl |  -.9659943   .3115093    -3.10   0.002    -1.576541   -.3554473
    lgdpenl1 |  -.7129984   .1223772    -5.83   0.000    -.9528532   -.4731436
      lpopl1 |   .2159782   .0754739     2.86   0.004      .068052    .3639043
     lmtnest |   .1804607    .084162     2.14   0.032     .0155062    .3454152
     ncontig |    .430106    .278506     1.54   0.123    -.1157557    .9759678
         Oil |   .8182815   .2775055     2.95   0.003     .2743808    1.362182
      instab |   .3143875   .2407495     1.31   0.192    -.1574728    .7862479
       anocl |   .7433193   .2372684     3.13   0.002     .2782818    1.208357
        deml |   .2285169   .3150897     0.73   0.468    -.3890476    .8460814
     ethfrac |   .2775607   .3646804     0.76   0.447    -.4371997    .9923211
     relfrac |    .030431   .5085138     0.06   0.952    -.9662377      1.0271
       _cons |  -1.867458   1.162471    -1.61   0.108    -4.145859     .410944
------------------------------------------------------------------------------

. estimates store B

. lrtest A

Likelihood-ratio test                                 LR chi2(1)  =     17.75
(Assumption: B nested in A)                           Prob > chi2 =    0.0000

We can also calculate the LR test by hand:

G²(M_C | M_U) = 2lnL(M_U) − 2lnL(M_C)
              = 2(−476.83175) − 2(−485.70598)
              = (−953.6635) − (−971.41196)
              = 17.74846

Wald statistic

. Are the estimated parameters far away from what they would be under the null hypothesis?

. The Wald statistic is calculated as follows:

W = [Qβ̂ − r]′ [Q Var(β̂) Q′]⁻¹ [Qβ̂ − r]

. W is also distributed chi-square.

. In Stata: test
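For a single restriction β_i = 0, the quadratic form above collapses to the squared z-statistic, β̂_i² / Var(β̂_i). A toy Python sketch with hypothetical numbers (none of these values come from the model output):

```python
# Hypothetical estimates: two coefficients and their covariance matrix
beta = [0.9, 0.5]            # beta-hat
V = [[0.04, 0.01],
     [0.01, 0.04]]           # Var(beta-hat)

# Test the single restriction beta_2 = 0, i.e. Q = [0 1], r = 0:
# W = (Q b - r)' [Q V Q']^{-1} (Q b - r) reduces to b2^2 / Var(b2)
qb = beta[1]                 # Q b - r
qvq = V[1][1]                # Q V Q'
W = qb * qb / qvq            # = (b2 / se2)^2, the squared z-statistic
print(W)
```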

. ** Wald **
. ** Using Fearon and Laitin Table 1 Model #3 **
. qui logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
> ethfrac relfrac, robust cluster(ccode)

. test

 ( 1)  [onset]warl = 0
 ( 2)  [onset]lgdpenl1 = 0
 ( 3)  [onset]lpopl1 = 0
 ( 4)  [onset]lmtnest = 0
 ( 5)  [onset]ncontig = 0
 ( 6)  [onset]Oil = 0
 ( 7)  [onset]nwstate = 0
 ( 8)  [onset]instab = 0
 ( 9)  [onset]anocl = 0
 (10)  [onset]deml = 0
 (11)  [onset]ethfrac = 0
 (12)  [onset]relfrac = 0

           chi2( 12) =   142.13
         Prob > chi2 =    0.0000

. We can also test one parameter:

. test warl=0

( 1) [onset]warl = 0

           chi2(  1) =    12.62
         Prob > chi2 =    0.0004

. Or two jointly:

. test (warl=0)(lpopl1=0)

 ( 1)  [onset]warl = 0
 ( 2)  [onset]lpopl1 = 0

           chi2(  2) =    27.37
         Prob > chi2 =    0.0000

Lagrange Multiplier Test

. If I had a less restrictive likelihood function, would its derivative be close to zero here at the restricted ML estimate?

. In Stata: enumopt, testomit
. These are add-ons you have to install
. Type “findit enumopt” and “findit testomit” to install

. qui logit onset lgdpenl1 lmtnest ncontig Oil nwstate instab anocl deml ///
> ethfrac relfrac, robust cluster(ccode)
. predict test, score

.testomit warl lpopl1, score(test)

logit: score tests for omitted variables

             Term |   score   df        p
------------------+----------------------
 warl (as factor) |    0.00    1   1.0000
           lpopl1 |    0.00    1   1.0000
------------------+----------------------
simultaneous test |    0.00    2   1.0000
------------------+----------------------
adjusted for clustering on ccode

. There are a number of other ways of interpreting estimated coefficients.

. For a useful and much more in depth guide see Ch. 3 and 4 of Long and Freese (2005).

A number of other statistics can be calculated

. ** Long and Freese Spost **
. quietly logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
> ethfrac relfrac, robust cluster(ccode) nolog
. estimates store model1
. quietly fitstat, save
. end of do-file
. quietly logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil instab anocl deml ///
> ethfrac relfrac, robust cluster(ccode) nolog

. estimates store model2

. fitstat, diff

Measures of Fit for logit of onset

                                Current        Saved   Difference
Model:                            logit        logit
N:                                 6326         6326            0
Log-Lik Intercept Only         -534.462     -534.462        0.000
Log-Lik Full Model             -485.706     -476.832       -8.874
D                         971.412(6314) 953.664(6313)    17.748(1)
LR                           97.513(11)  115.261(12)     17.748(1)
Prob > LR                         0.000        0.000        0.000
McFadden's R2                     0.091        0.108       -0.017
McFadden's Adj R2                 0.069        0.084       -0.015
ML (Cox-Snell) R2                 0.015        0.018       -0.003
Cragg-Uhler(Nagelkerke) R2        0.098        0.116       -0.018
McKelvey & Zavoina's R2           0.210        0.221       -0.012
Efron's R2                        0.028        0.041       -0.014
Variance of y*                    4.164        4.225       -0.062
Variance of error                 3.290        3.290        0.000
Count R2                          0.983        0.983        0.000
Adj Count R2                      0.000        0.000        0.000
AIC                               0.157        0.155        0.002
AIC*n                           995.412      979.664       15.748
BIC                          -54291.389   -54300.385        8.996
BIC'                             -1.236      -10.232        8.996
BIC used by Stata              1076.441     1067.445        8.996
AIC used by Stata               995.412      979.664       15.748

Difference of 8.996 in BIC' provides strong support for saved model.

Note: p-value for difference in LR is only valid if models are nested.
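Several of the fitstat numbers can be recomputed directly from the reported log likelihoods. A Python sketch (here k counts all estimated parameters including the constant, which is how the "used by Stata" rows appear to be computed):

```python
import math

def aic_n(llf, k):
    """AIC*n: -2*lnL + 2k."""
    return -2 * llf + 2 * k

def bic_stata(llf, k, n):
    """BIC as Stata reports it: -2*lnL + k*ln(N)."""
    return -2 * llf + k * math.log(n)

n = 6326
print(aic_n(-476.83175, 13))          # saved model: 12 regressors + constant
print(bic_stata(-485.70598, 12, n))   # current model (nwstate dropped)
print(bic_stata(-476.83175, 13, n))   # saved model
```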

. Truly understanding your data, your model, and how the two are connected is a time-intensive and subtle process that is often overlooked by busy researchers.

. However, the better you know your data, your model, and your link function, the more confidence you can have in your findings (or try and figure out why you are not getting any results).

Let’s start by taking a look at outliers

[Figure: scatterplot of Standardized Residual (y-axis, −2 to 20) against Observation Number (x-axis, 0 to 7000)]

By observation number

[Figure: the same scatterplot with points labeled by observation number; the largest standardized residuals belong to observations 4843, 3848, 1831, 5600, 993, 2831, 2934, 5122, 2400, and 3241]

Stata code for these two graphs

** Residuals **
** From Long and Freese (pages 147-149) **
quietly logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
    ethfrac relfrac, robust cluster(ccode) nolog
predict rstd, rs
label var rstd "Standardized Residual"
sort lgdpenl1, stable
generate index=_n
label var index "Observation number"
graph twoway scatter rstd index, xlabel(0(1000)7000) ylabel(-2(2)20) ///
    xtitle("Observation Number") yline(0) msymbol(0h)
graph twoway scatter rstd index, xlabel(0(1000)7000) ylabel(-2(2)20) ///
    xtitle("Observation Number") yline(0) msymbol(none) ///
    mlabel(index) mlabposition(0)
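Assuming `predict rstd, rs` computes the usual standardized Pearson residual, r_i = (y_i − π̂_i)/√(π̂_i(1 − π̂_i)) divided by √(1 − h_ii), the computation can be sketched in Python (toy values, not from the data):

```python
import math

def std_pearson_residual(y, pi, h):
    """Pearson residual scaled by its leverage h_ii."""
    r = (y - pi) / math.sqrt(pi * (1 - pi))
    return r / math.sqrt(1 - h)

# A rare event (small predicted probability) that actually occurs
# produces a large positive residual -- exactly the pattern in the plot
print(std_pearson_residual(1, 0.01, 0.002))
```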

This lets us find and understand outliers

. ** Then you can list influential observations **
. list in 4844, noobs

  +---------------------------------------------------------------------------------------------+
  | ccode | country | cname  | cmark | year | wars | war | warl | onset | ethonset | durest | aim |
  |   352 | CYPRUS  | CYPRUS |     0 | 1979 |    0 |   0 |    0 |     0 |        0 |      . |   . |
  |---------------------------------------------------------------------------------------------|
  | casename | ended | ethwar | waryrs | pop | lpop     | polity2 | gdpen | gdptype | gdpenl |
  |          |     . |      . |        | 623 | 6.434546 |      10 | 5.091 |       0 |  4.697 |
  |---------------------------------------------------------------------------------------------|
  | lgdpenl1 | lpopl1   | region    | western | eeurop | lamerica | ssafrica | asia | nafrme |
  | 8.454679 | 6.426488 | n. africa |       0 |      0 |        0 |        0 |    0 |      1 |
  |---------------------------------------------------------------------------------------------|
  | colbrit | colfra | mtnest | lmtnest  | elevdiff | Oil | ncontig | ethfrac  | ef       | plural |
  |       1 |      0 |      9 | 2.302585 |     1952 |   0 |       0 | .3485829 | .3592001 |    .78 |
  |---------------------------------------------------------------------------------------------|
  | second | numlang | relfrac | plurrel | minrelpc | muslim | nwstate | polity2l | instab | anocl |
  |    .18 |       2 |   .3576 |      78 |       18 |     18 |       0 |       10 |      0 |     0 |
  |---------------------------------------------------------------------------------------------|
  | deml | empeth~c | empwarl | emponset | empgdp~l | emplpopl | emplmt~t | empnco~g | empol~2l |
  |    1 | .3485829 |       0 |        0 |    4.697 | 6.426488 | 2.302585 |        0 |       10 |
  |---------------------------------------------------------------------------------------------|
  | sdwars | sdonset | colwars | colonset | cowwars | cowonset | cowwarl | sdwarl | colwarl |
  |      0 |       0 |       0 |        0 |       0 |        0 |       0 |      0 |       0 |
  |---------------------------------------------------------------------------------------------|
  | cyr     | _est_A | _est_B | test      | _F1_1 | _F1_2 | _est_m~1 | _est_m~2 | rstd      |
  | 3521979 |      1 |      1 | -.0025459 |     1 |     0 |        1 |        1 | -.0505275 |
  |---------------------------------------------------------------------------------------------|
  | index |
  |  4844 |
  +---------------------------------------------------------------------------------------------+

MLE Class 6 28 . list country year onset warl Oil ethfrac if rstd>12 & rstd~=.

        +---------------------------------------------------+
        |     country   year   onset   warl   Oil   ethfrac |
        |---------------------------------------------------|
  993.  |     SOMALIA   1991       1      1     0  .0767429 |
 1831.  |    PARAGUAY   1947       1      0     0  .1449224 |
 2400.  |   COSTARICA   1948       1      0     0    .07109 |
 2831.  |      JORDAN   1970       1      0     0  .0466286 |
 2934.  | AFGHANISTAN   1992       1      1     0   .658281 |
        |---------------------------------------------------|
 3241.  |   SRI LANKA   1987       1      1     0  .4670414 |
 3848.  |   NICARAGUA   1978       1      0     0  .1793287 |
 4843.  |      CYPRUS   1974       1      0     0  .3485829 |
 5122.  |   ARGENTINA   1973       1      0     0  .3073192 |
 5600.  |          UK   1969       1      0     0  .3254545 |
        +---------------------------------------------------+

. We can also look for influential observations using Cook’s statistic:

P_i = (r_i² h_ii) / (1 − h_ii)²

Where:

h_ii = π̂_i (1 − π̂_i) x_i Var(β̂) x_i′

. This measures the effect of removing observation i on the vector β̂.
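A toy Python sketch of the two formulas above, with hypothetical values for x_i, π̂_i, and Var(β̂) (nothing here comes from the actual model):

```python
def leverage(pi, x, V):
    """h_ii = pi*(1 - pi) * x_i V x_i' for one observation."""
    xVx = sum(x[j] * sum(V[j][k] * x[k] for k in range(len(x)))
              for j in range(len(x)))
    return pi * (1 - pi) * xVx

def cooks(r, h):
    """P_i = r^2 * h_ii / (1 - h_ii)^2."""
    return r * r * h / (1 - h) ** 2

V = [[0.05, -0.01],
     [-0.01, 0.02]]          # hypothetical Var(beta-hat)
h = leverage(0.3, [1.0, 2.0], V)
c = cooks(2.5, h)            # r = 2.5 is a made-up standardized residual
print(h, c)
```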

Visually

[Figure: Cook’s Statistic (y-axis, 0 to .3) by Observation Number (x-axis, 0 to 7000), with reference lines at .1 and .2; points above .2 include observations 4290, 4291, 4439, 1662, and 3662]

Then in more detail

. list country year onset warl Oil ethfrac if cook>.2 & cook~=.

        +--------------------------------------------------+
        |    country   year   onset   warl   Oil   ethfrac |
        |--------------------------------------------------|
 1662.  | BANGLADESH   1976       1      0     0  .0049751 |
 3662.  |    LEBANON   1958       1      0     0  .1348051 |
 4290.  |    MOLDOVA   1991       0      0     0  .5515695 |
 4291.  |    MOLDOVA   1992       1      0     0  .5515695 |
 4439.  |    LEBANON   1975       1      0     0  .1348051 |
        +--------------------------------------------------+

Stata code for above figures

** INFLUENTIAL CASES **
predict cook, dbeta
label var cook "Cook's Statistic"
graph twoway scatter cook index, xlabel(0(1000)7000) ylabel(0(.1).3) ///
    xtitle("Observation Number") yline(.1 .2) ///
    msymbol(none) mlabel(index) mlabposition(0)
list country year onset warl Oil ethfrac if cook>.2 & cook~=.

Least Likely Observations

. You can also look for those cases that are least likely (i.e. have large residuals) given your model.

. qui logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
> ethfrac relfrac, robust cluster(ccode) nolog

. leastlikely country year warl lpopl1

Outcome: 0

       +-----------------------------------------------+
       |     Prob     country   year   warl     lpopl1 |
       |-----------------------------------------------|
  127. | .3946847   INDONESIA   1949      0   11.22257 |
  185. | .6983526   INDONESIA   1952      0   11.24785 |
  902. |  .765673   INDONESIA   1962      0   11.47718 |
 1267. | .6832954    PAKISTAN   1947      0   11.18728 |
 1268. | .6832954    PAKISTAN   1948      0   11.18728 |
       +-----------------------------------------------+

Outcome: 1

       +-----------------------------------------------+
       |     Prob     country   year   warl     lpopl1 |
       |-----------------------------------------------|
  993. |  .006282     SOMALIA   1991      1   8.958411 |
 1831. | .0053564    PARAGUAY   1947      0   7.150702 |
 3848. | .0045628   NICARAGUA   1978      0   7.852828 |
 4843. | .0025399      CYPRUS   1974      0   6.415097 |
 5600. | .0056291          UK   1969      0   10.91792 |
       +-----------------------------------------------+
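The logic of `leastlikely` is simple enough to sketch: for each observation, take the predicted probability of the outcome actually observed, then list the smallest values within each outcome. A toy Python version (the data and the function name are mine):

```python
# (index, observed y, predicted Pr(y=1)) for a handful of made-up cases
cases = [(1, 1, 0.70), (2, 0, 0.10), (3, 1, 0.02), (4, 0, 0.85), (5, 1, 0.40)]

def least_likely(cases, outcome, k=2):
    """The k cases with y == outcome that the model found least probable."""
    probs = [(p if outcome == 1 else 1 - p, i)
             for (i, y, p) in cases if y == outcome]
    return sorted(probs)[:k]

print(least_likely(cases, 1))   # observed 1s with the lowest Pr(y=1)
print(least_likely(cases, 0))   # observed 0s with the lowest Pr(y=0)
```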

Predicting Pr(Y)

. Again, what we are usually interested in is how our model predicts the probability of our Y over our main independent variable of interest.

. However, before we do that it can also be informative to get a rough sense of the range of our predicted latent variable.

As you can see, civil war data is highly skewed

[Figure: dotplot of Logit: Pr(DV) (y-axis, 0 to 1) against Frequency (x-axis, 0 to 3000); nearly all predicted probabilities cluster near 0]

** Predicted with PREDICT **
. qui logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
> ethfrac relfrac, robust cluster(ccode) nolog
. predict prlogit
. label var prlogit "Logit: Pr(DV)"
. dotplot prlogit, ylabel(0(.2)1)

. However, there are still a few cases with an over 50% chance of conflict.

. As most of our readings indicate, there is little difference between logit and probit estimates.

. However, in cases where the data are heavily skewed, differences between the two can emerge.
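This point is easy to check numerically: after rescaling the probit index (a factor of roughly 1.6 is a common rule of thumb), the logistic and normal CDFs agree closely in the middle but drift apart in the tails, which is where heavily skewed data live. A small Python sketch:

```python
import math

def logistic_cdf(x):
    return 1 / (1 + math.exp(-x))

def normal_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Compare Lambda(x) with Phi(x/1.6) over a grid of index values
grid = [i / 10 for i in range(-30, 31)]
max_diff = max(abs(logistic_cdf(x) - normal_cdf(x / 1.6)) for x in grid)
print(max_diff)   # small but not zero; the gap is largest away from x = 0
```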

Estimating Probit and Logit

qui logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
    ethfrac relfrac, robust cluster(ccode) nolog
predict prlogit
label var prlogit "Logit: Pr(onset)"
qui probit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
    ethfrac relfrac, robust cluster(ccode) nolog
predict prprobit
label var prprobit "Probit: Pr(onset)"
pwcorr prlogit prprobit

* graphing predicted probabilities from logit and probit
graph twoway scatter prlogit prprobit, ///
    xlabel(0(.25).75) ylabel(0(.25).75) ///
    xline(.25(.25).75) yline(.25(.25).75) ///
    plotregion(margin(zero)) msymbol(Oh) ///
    ysize(4.0413) xsize(4.0413)

MLE Class 6 39 . They are still highly correlated.

. pwcorr prlogit prprobit

             |  prlogit prprobit
    ---------+------------------
     prlogit |   1.0000
    prprobit |   0.9880   1.0000

. tab onset

  1 for civil |
    war onset |      Freq.     Percent        Cum.
  ------------+-----------------------------------
            0 |      6,499       98.34       98.34
            1 |        110        1.66      100.00
  ------------+-----------------------------------
        Total |      6,609      100.00

MLE Class 6 40

[Figure: scatterplot of "Logit: Pr(onset)" against "Probit: Pr(onset)", both axes 0 to .75; the points fall close to the 45-degree line.]

MLE Class 6 41 . So logit gives slightly higher estimates of Pr(Y). . Why? . Because of the slight difference between the normal and logistic CDFs we saw last week.
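That CDF difference can be checked directly. A minimal Python sketch (the ~1.7 rescaling that puts the logit index on the probit's scale is a standard textbook approximation, not from the slides):

```python
import math

def logit_cdf(x):
    """Logistic CDF: Pr(Y = 1) under the logit model."""
    return 1.0 / (1.0 + math.exp(-x))

def probit_cdf(x):
    """Standard normal CDF: Pr(Y = 1) under the probit model."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# The logistic CDF has fatter tails than the normal CDF, so for the
# rare-event (far-left-tail) predictions typical of civil war onset,
# logit returns slightly higher probabilities than probit.
for xb in (-3.0, -2.0, -1.0, 0.0):
    print(xb, round(logit_cdf(1.7 * xb), 4), round(probit_cdf(xb), 4))
```

With a balanced DV most fitted indexes sit near zero, where the two curves nearly coincide; with a skewed DV the fitted indexes live in the tail, where they differ most.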

MLE Class 6 42 . Let’s try it on less skewed data. . Labor force participation data from Long (1997)

. pwcorr prlogit prprobit

             |  prlogit prprobit
    ---------+------------------
     prlogit |   1.0000
    prprobit |   0.9998   1.0000

. tab lfp

   Paid Labor |
       Force: |
   1=yes 0=no |      Freq.     Percent        Cum.
  ------------+-----------------------------------
      NotInLF |        325       43.16       43.16
         inLF |        428       56.84      100.00
  ------------+-----------------------------------
        Total |        753      100.00

MLE Class 6 43

[Figure: scatterplot of "Logit: Pr(lfp)" against "Probit: Pr(lfp)", both axes 0 to 1; the points fall almost exactly on the 45-degree line.]

MLE Class 6 44

[Figure: dotplot of "Logit: Pr(lfp)" (0 to 1) against frequency (0 to 50); the predicted probabilities are spread across the whole range.]

MLE Class 6 45 . As you can see the DV is more balanced.

. Towards the end of class we are going to talk a bit about ways of dealing with skewed variables.

MLE Class 6 46 Now we all remember out-of-sample predictions

. As described last week, what we are really interested in is understanding what effect our X variable of interest has on Y for different values of X while holding all other variables constant.

MLE Class 6 47 Let’s move to the meat of interpretation

. How do we really get a sense of the relation between my main IV and DV?

. After running a regression there are a number of ways of isolating the effect of my IV on my DV.

. Several ways can be implemented in SPost, an add-on to Stata written by Long and Freese.

MLE Class 6 48 . Let’s say that I am most interested in capturing the probability of conflict as GDP varies.

. What you can do is look at the relationship graphically (I like pictures).

. To look at only GDP's effect on the probability of conflict I can set all other variables at their mean values (or modes for dichotomous variables) and calculate the predicted probability π̂.

MLE Class 6 49 Where:

  π̂ = 1 / (1 + exp[−X̄β̂ − GDP(β̂_GDP)])

with X̄β̂ holding all variables other than GDP at their means.

. Then we plug in different values for GDP and plot them.

. For more detail see King (1989: 105)
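The plug-in exercise can be sketched in Python. The coefficients below (beta_other for the intercept-plus-means term, beta_gdp for the log-GDP slope) are hypothetical placeholders for illustration, not Fearon and Laitin's estimates:

```python
import math

# Hypothetical values: beta_other collects the intercept plus all other
# covariates held at their means/modes; beta_gdp is the log-GDP slope.
beta_other = -1.0
beta_gdp = -0.3

def pi_hat(log_gdp):
    """Predicted Pr(onset): 1 / (1 + exp(-(Xb_bar + beta_gdp * log GDP)))."""
    xb = beta_other + beta_gdp * log_gdp
    return 1.0 / (1.0 + math.exp(-xb))

# Plug in different values of log(GDP) over the range we want to plot.
for g in (6, 7, 8, 9):
    print(g, round(pi_hat(g), 4))
```

Each printed pair is one point on the predicted-probability curve; `prgen` in Stata automates exactly this loop and adds confidence limits.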

MLE Class 6 50 This is what we get…

[Figure: predicted probability of civil war onset (0 to .04) plotted against Log(GDP) (6 to 9), with dashed lines for the 95% upper and lower confidence limits; the probability declines as GDP rises.]

MLE Class 6 51

logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
    ethfrac relfrac, robust cluster(ccode) nolog
prgen lgdpenl1, from(6) to(9) gen(gdp) rest(mean) ci
label var gdpp1 "Predicted Probability"
label var gdpp1ub "95% upper limit"
label var gdpp1lb "95% lower limit"
label var gdpx "GDP"

* using lines to show confidence interval
twoway ///
    (connected gdpp1 gdpx, ///
        clcolor(black) clpat(solid) clwidth(medthick) ///
        msymbol(i) mcolor(none)) ///
    (connected gdpp1ub gdpx, ///
        msymbol(i) mcolor(none) ///
        clcolor(black) clpat(dash) clwidth(thin)) ///
    (connected gdpp1lb gdpx, ///
        msymbol(i) mcolor(none) ///
        clcolor(black) clpat(dash) clwidth(thin)), ///
    ytitle("Probability of Civil War Onset") yscale(range(0 .03)) ///
    ylabel(, grid glwidth(medium) glpattern(solid)) ///
    xscale(range(6 9)) xlabel(6(1)9) xtitle("Log(GDP)") ///
    ysize(4) xsize(6)

MLE Class 6 52 Alternatively…

[Figure: the same predicted-probability plot, with the 95% confidence interval drawn as a shaded band instead of dashed lines.]

MLE Class 6 53

* using shading to show confidence interval
graph twoway ///
    (rarea gdpp1lb gdpp1ub gdpx, color(gs14)) ///
    (connected gdpp1 gdpx, ///
        clcolor(black) clpat(solid) clwidth(medthick) ///
        msymbol(i) mcolor(none)), ///
    ytitle("Probability of Civil War Onset") yscale(range(0 .03)) ///
    ylabel(, grid glwidth(medium) glpattern(solid)) ///
    xscale(range(6 9)) xlabel(6(1)9) xtitle("Log(GDP)") ///
    ysize(4) xsize(6) // legend(label(1 "95% confidence interval"))

MLE Class 6 54 Or we can look at point predictions using Clarify

. estsimp logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
      ethfrac relfrac, robust cluster(ccode) nolog
. setx lgdpenl1 p25 anocl 1

. simqi

  Quantity of Interest |     Mean    Std. Err.    [95% Conf. Interval]
  ---------------------+----------------------------------------------
           Pr(onset=0) | .9982832    .0011713     .9953762    .9995933
           Pr(onset=1) | .0017168    .0011713     .0004067    .0046238

                         GDP 25th     GDP 50th     GDP 75th
                        percentile   percentile   percentile
  Anocratic                .0017        .0010       .00062
  Democratic               .0010        .0006       .00030
  Percent Δ Pr(onset)       -41%         -40%         -48%

MLE Class 6 55

** USING CLARIFY **
set seed 1234567890
sort ccode year
estsimp logit onset warl lgdpenl1 lpopl1 lmtnest ncontig Oil nwstate instab anocl deml ///
    ethfrac relfrac, robust cluster(ccode) nolog
setx lgdpenl1 p25 anocl 1
simqi
setx lgdpenl1 p50 anocl 1
simqi
setx lgdpenl1 p75 anocl 1
simqi
setx lgdpenl1 p25 anocl 0 deml 1
simqi
setx lgdpenl1 p50 anocl 0 deml 1
simqi
setx lgdpenl1 p75 anocl 0 deml 1
simqi

MLE Class 6 56 . Lastly, it is possible to set values to those actually observed in your data (say Nicaragua in 1993 or Ethiopia in 2008).

. In summary, there are a number of different ways of trying to tease out the relationship between your dependent and independent variables.

MLE Class 6 57 . Moving from interpretation back to modeling…

MLE Class 6 58 Heteroskedastic Probit

. Heteroskedasticity is a greater problem in ML than in OLS, so it is important to account for non-constant variance before moving on to inference and interpretation.

MLE Class 6 59 . To begin it is important to understand the difference between heteroskedasticity and heterogeneity.

. Heterogeneity typically refers to a difference between groups in their mean of Y.

. Heteroskedasticity refers to a difference in the variance of Y between groups.

MLE Class 6 60 . Again remember the Bernoulli distribution. 1 , 휋 Y = 0, 1 − 휋

Where:

π_i = f(Xβ), and for the probit π_i = Φ(Xβ)

Where P(Y = 1) = Φ(Xβ) and P(Y = 0) = 1 - Φ(Xβ)

MLE Class 6 61 . The probit maximum likelihood function again:

Ln L = Σ_{i=1}^{n} [ y_i lnΦ(Xβ) + (1 − y_i) ln(1 − Φ(Xβ)) ]
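A minimal Python sketch of this log-likelihood for a single covariate (the toy data are made up for illustration); note that the second term uses ln(1 − Φ(Xβ)), for the y_i = 0 observations:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def probit_loglik(beta0, beta1, xs, ys):
    """Sum of y_i ln Phi(Xb) + (1 - y_i) ln(1 - Phi(Xb))."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = norm_cdf(beta0 + beta1 * x)
        ll += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return ll

# Toy data: parameters that fit the data better give a higher
# (less negative) log-likelihood.
xs = [-1.0, -0.5, 0.5, 1.0]
ys = [0, 0, 1, 1]
print(probit_loglik(0.0, 1.0, xs, ys), probit_loglik(0.0, -1.0, xs, ys))
```

Maximum likelihood estimation simply searches over (β0, β1) for the values that make this sum as large as possible.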

MLE Class 6 62 . And again remember that for both the logit and probit the variance was assumed to be constant: 1 for the probit and π²/3 for the logit.

. But what if the model is not homoskedastic? In OLS our estimates would not be efficient, but they would still be unbiased.

. This is not the case in ML.

MLE Class 6 63 Consider an example…

. Suppose we have data on baseball players' at-bats and whether they get hits. . Whether to swing or not is determined by some X variables. . We have data on both Major League players and AAA minor league players. . If we run probit models on both types of players we find that they have the same βs. . The effect of X on Pr(swinging) is the same for both samples and models.

MLE Class 6 64 . However, think about these two groups. . The Major Leaguers are the best of the best (think Big Poppie David Ortiz) and are selective in what pitches they go for. . The AAA players are also quite good, but maybe they are not as selective so they swing at more pitches than they should. . This means that the error term is smaller for the ML players than the AAA players.

MLE Class 6 65 . Now suppose we put data on both types of players into one model (Yes! More observations).

. If we include a dummy for whether the player is ML or AAA, this variable should be insignificant.

. But remember the kernel of the probit likelihood function is a ratio:

  lnΦ(Xβ / σ)

MLE Class 6 66 . We are assuming that the variance is the same for both groups and that it is equals to 1. If either assumption fails then our estimates of β will be inconsistent.

MLE Class 6 67 Again: 푋훽 푙푛Φ 휎2 So we can write:

훽퐴퐴퐴 휎2 퐴퐴퐴 = 1 훽푀퐿 휎2 푀퐿

. But what if σ_ML equals 1 but σ_AAA does not equal 1, but rather equals δ?

MLE Class 6 68 . Then,

훽퐴퐴퐴 δ 훽 퐴퐴퐴 = 퐴퐴퐴 훽푀퐿 1 δ퐴퐴퐴훽푀퐿 . Thus as the variance γ in AAA players increases our estimate of 훽 퐴퐴퐴 decreases.

. This means that our maximum likelihood estimates are both inconsistent and inefficient.
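The attenuation is mechanical: the probit identifies only the ratio β/σ, so if a group's true error scale is δ rather than the assumed 1, the slope recovered for that group shrinks by 1/δ. A quick illustration (the β = 1 value is arbitrary):

```python
# Probit identifies beta only up to the error scale: what we recover
# is beta / sigma. If sigma is really delta (not the assumed 1), the
# estimated effect is attenuated toward zero as delta grows.
beta_true = 1.0
recovered = [beta_true / delta for delta in (1.0, 1.5, 2.0, 4.0)]
print(recovered)  # shrinks toward zero as delta increases
```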

MLE Class 6 69 . This in some ways is similar to omitted variable bias.

. Some difference between groups in our data is not being controlled for.

. However correcting for heteroskedasticity is not as simple as correcting for heterogeneity.

MLE Class 6 70 . To correct for heterogeneity then, you need to control for the difference in variance.

. For the heteroskedastic probit, let

  Var[ε] = σ² = (e^{γ'Z})²

So

  σ = e^{γ'Z}

MLE Class 6 71 This gives the probit likelihood function:

Ln L = Σ_{i=1}^{n} [ y_i lnΦ(Xβ / e^{γ'Z}) + (1 − y_i) ln(1 − Φ(Xβ / e^{γ'Z})) ]

. Therefore we now have two unknowns, β and γ, where β represents the effects of X on the mean probability of Y, and γ represents the effects of Z on the variance of Y.

. It is possible for X and Z to be the same (e.g. ethnic fractionalization in Blimes 2006).
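A minimal sketch of this likelihood with scalar X and Z (toy data and hypothetical γ values; with γ = 0 it collapses to the ordinary probit):

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def hetprobit_loglik(beta, gamma, X, Z, y):
    """Heteroskedastic probit: the index Xb is divided by exp(gamma'Z)."""
    ll = 0.0
    for xi, zi, yi in zip(X, Z, y):
        sigma = math.exp(gamma * zi)      # variance model: sigma = e^(gamma'Z)
        p = norm_cdf(beta * xi / sigma)
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return ll

X = [-1.0, -0.5, 0.5, 1.0]
Z = [0.0, 1.0, 0.0, 1.0]   # hypothetical variance covariate (e.g. an AAA dummy)
y = [0, 0, 1, 1]
print(hetprobit_loglik(1.0, 0.0, X, Z, y))   # gamma = 0: ordinary probit
print(hetprobit_loglik(1.0, 0.5, X, Z, y))
```

Stata's `hetprobit` maximizes this function over β and γ jointly.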

MLE Class 6 72 . Why e^{γ'Z}? Basically it has some properties that the variance must maintain.

. First, it must be positive.

. Second, if the effect of Z on the variance is 0, then σ² must revert to 1. Exponentiating γ'Z does both of these things.
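Both properties are easy to verify numerically (a trivial check, not from the slides):

```python
import math

# exp(gamma'Z) is strictly positive for any real index, and equals 1
# exactly when the index is 0 -- so sigma^2 reverts to the probit
# assumption of 1 when Z has no effect on the variance.
for gz in (-2.0, -0.5, 0.0, 0.5, 2.0):
    sigma = math.exp(gz)
    print(gz, sigma, sigma ** 2)
```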

MLE Class 6 73 . Note that as the estimated heteroskedasticity γ'Z increases, the effects of the independent variables X decrease. (Why is this?)

. Now, let’s turn to the Blimes (2006) and Alvarez and Brehm (1995) articles to look at how researchers have used (and theoretically justified using) heteroskedastic probit.

MLE Class 6 74 Interaction Terms

. A detailed exploration of the debate about a) the need for and b) the interpretation of interaction terms is outside the scope of this class.

. I did want to show you that it is out there.

. Let’s take the time we have left to work through Berry et al.’s (2010) argument and evidence.

MLE Class 6 75 . See you next week!

MLE Class 6 76