
Bayesian Computation for Parametric Models
of Heteroscedasticity in the Linear Model

W. John Boscardin and Andrew Gelman

Department of Statistics, University of California, Berkeley

May; revised November

To appear in Advances in Econometrics. We thank the National Science Foundation for grant support and a Graduate Research Fellowship.

ABSTRACT

In the linear model with unknown variances, one can often model the heteroscedasticity as $\mathrm{Var}(y_i) \propto f(\theta, w_i)$, where $f$ is a fixed function, the $w_i$ are known weights for the problem, and $\theta$ is an unknown parameter; $f(\theta, w_i) = w_i^{-\theta}$ is a traditional choice.

We show how to do a fully Bayesian computation in this simple linear setting and also for a hierarchical model. The full Bayesian computation has the advantage that we are able to average over our uncertainty in $\theta$ instead of using a point estimate. We carry out the computations for a problem involving forecasting U.S. Presidential elections, looking at different choices for $f$ and the effects on both estimation and prediction.

Introduction

In both the econometrics and statistics literature, a standard way to model heteroscedasticity in regression is through a parametric model for the unequal variances, as described in many places (e.g., Amemiya; Greene; Judge et al.; Carroll and Ruppert). Modeling heteroscedasticity should improve the efficiency of estimates of regression coefficients, but the most important effect is in prediction, allowing predictive inferences to be more precise for some units and less precise for others.

Previous approaches have tended to focus on obtaining a point estimate for the heteroscedasticity parameter (possibly vector-valued). Unfortunately, in many applications the parameter governing the heteroscedasticity is not well identified by the data, so any point estimate may be quite unreliable. Furthermore, there is no consensus on which point estimate to use (Carroll and Ruppert).

To avoid these problems, we use a fully Bayesian approach, which automatically averages over our uncertainty in the model parameters. One potential pitfall of Bayesian analysis is its sensitivity to the choice of a prior distribution, but for the example that we consider, our inference and prediction are quite robust to reasonable choices of prior distributions on $\theta$.

In this paper, we will develop a computational framework for parameter estimation and prediction in both nonhierarchical (NH) and hierarchical (HIER) linear models. The computation is quite straightforward for the NH models and harder for the HIER models. We will illustrate our methods with the example of forecasting U.S. Presidential elections.

Models

Heteroscedasticity in the linear model

We use the following regression model for the $n$ independent units on the $k$ covariates:

$$ y_i \mid \beta, \sigma, \theta \;\sim\; \mathrm{N}\!\left(X_i\beta,\; \sigma^2 f(\theta, w_i)\right), \qquad i = 1, \ldots, n. $$

For later reference, the log-likelihood function is

$$ \log p(y \mid \beta, \sigma, \theta) \;=\; -\frac{n}{2}\log\sigma^2 \;-\; \frac{1}{2}\sum_{i=1}^{n}\log f(\theta, w_i) \;-\; \frac{1}{2\sigma^2}\sum_{i=1}^{n}\frac{(y_i - X_i\beta)^2}{f(\theta, w_i)} \;+\; \text{constant}. $$

We consider the special case in which the unequal variances are governed by known weights $w_i$, and we wish to define a continuum between the extreme cases of no weighting and of variance proportional to $1/w_i$ (i.e., weighted least squares). We consider two different forms for $f$:

$$ \text{HET1:}\quad f(\theta, w_i) \;=\; w_i^{-\theta}, $$

where $0 \le \theta \le 1$, and

$$ \text{HET2:}\quad f(\theta, w_i) \;=\; (1-\theta) \;+\; \theta / w_i, $$

where $0 \le \theta \le 1$. For either model, the extremes $\theta = 0$ and $\theta = 1$ correspond to equal variances and variances proportional to $1/w_i$, respectively.

The first model is standard in the literature (see Greene, for example). The second model has a natural interpretation as two independent sources of variation, with one term having variance inversely proportional to $w_i$ and the other term having constant variance (e.g., sampling errors and modeling errors). We also consider, as a comparison, the equal-variance (EV) or homoscedastic model, i.e., $f(\theta, w_i) \equiv 1$.

Parametrization of the weights

In each case, we normalize the $n$ values $f(\theta, w_i)$ to have product equal to 1, as suggested by Box and Cox. This is done to provide a consistent meaning for $\sigma^2$ between models with different values of $\theta$ and to simplify the expression for the likelihood (see the log-likelihood above). In the HET1 specification, this just amounts to dividing the $w_i$ by their geometric mean, so without loss of generality we can take the $w_i$ to have product equal to one. For the HET2 form, the normalization actually implies a slightly different functional form:

$$ \text{HET2:}\quad f(\theta, w_i) \;=\; \frac{(1-\theta) + \theta/w_i}{\left(\prod_{j=1}^{n}\left[(1-\theta) + \theta/w_j\right]\right)^{1/n}}. $$

For HET2 there is no trick that will let us avoid the issue of the normalization.
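To make the two parametric forms and the product-one normalization concrete, here is a minimal sketch (our own illustration, not code from the paper; NumPy is assumed, and the function names are ours):

```python
import numpy as np

def f_het1(theta, w):
    """HET1: f(theta, w_i) = w_i**(-theta).
    If the w_i have product one (geometric mean 1), the f values automatically do too."""
    return w ** (-theta)

def f_het2(theta, w):
    """HET2: (1 - theta) + theta / w_i, rescaled by its geometric mean so the product is one."""
    f = (1.0 - theta) + theta / w
    return f / np.exp(np.mean(np.log(f)))

w = np.array([0.5, 1.0, 2.0, 4.0])
w = w / np.exp(np.mean(np.log(w)))   # divide weights by their geometric mean (HET1 convention)
print(np.prod(f_het1(0.3, w)), np.prod(f_het2(0.3, w)))   # both products are 1, up to rounding
```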

Prior distributions on the variance parameters

We consider various noninformative prior distributions for the variance components. For the EV model, the improper uniform prior density on $\log\sigma$ is standard from many perspectives (e.g., Box and Tiao). For the heteroscedastic models, there does not seem to be any clear choice for a noninformative prior density on $\theta$, but $p(\theta) \propto 1$ on the unit interval seems reasonable. We cannot use a uniform density on $\mathrm{logit}(\theta)$, as it would lead to an improper posterior density under either model. Another school of thought suggests calculating the Jeffreys prior density (e.g., Box and Tiao). The conditional Fisher information for $\theta$ is

$$ I(\theta) \;=\; \frac{1}{2}\sum_{i=1}^{n}\left(\frac{f'(\theta, w_i)}{f(\theta, w_i)}\right)^{2}, $$

where the prime denotes differentiation with respect to $\theta$. For HET1 this turns out to be a function of the $w_i$'s only, and thus the information is constant, suggesting a uniform prior density on $\theta$. For HET2, the Jeffreys prior density has a particularly unattractive form, and we did not pursue this any further.

In our main analysis here, we use a uniform prior density on $(\log\sigma, \theta)$ for both models. For each model, we assess the sensitivity to the choice of prior density by comparing to inferences obtained using the Beta(.5, .5) density for $\theta$, which is another standard noninformative density for a parameter on $[0, 1]$.

Models for the regression coefficients

When computing regressions in the nonhierarchical model (NH) using least squares, we are implicitly assigning a uniform prior density to the regression parameters $\beta$ in the model.

In the hierarchical model (HIER), we let $\beta$ have an informative prior distribution which depends on unknown parameters. In this paper we restrict to the case

$$ \beta \mid \tau \;\sim\; \mathrm{N}(0, \Sigma_\beta), $$

where $\Sigma_\beta$ is a diagonal matrix whose entries come from $(\tau_1^2, \ldots, \tau_J^2)$. Lindley and Smith discuss more general forms of the hierarchical normal model. If we want to keep a nonhierarchical structure on some of the $\beta$'s (fixed effects), we can set the corresponding diagonal entries of $\Sigma_\beta^{-1}$ to zero. Having done this, there are now $k_2$ $\beta$'s with an informative prior distribution (random effects). Of these, $k_{2j}$ have variance $\tau_j^2$ ($j = 1, \ldots, J$), and $k_2 = k_{21} + \cdots + k_{2J}$. The hierarchical model is not necessary for our study of heteroscedasticity, but in all our experiences, including the example of election forecasting discussed below, the hierarchical part of the model has been important both for parameter estimation and prediction. We will elaborate on this importance in the example. We assign a noninformative prior density to the variance components $\tau_j$. As is well known in the statistics literature (e.g., Hill; Box and Tiao), we cannot assign a uniform prior density on the parameters $\log\tau_j$, as that would lead to an improper posterior distribution. Instead, we consider two alternative noninformative distributions: uniform in $\tau_j$ and uniform in $\tau_j^2$.

Computation

Our general goal in Bayesian computation is to draw simulations from the posterior distribution of the unknown parameters and then to sample from the predictive distribution of future data by using these simulated parameter draws. The computational procedure is easier for the nonhierarchical case, since there are only two unknown variance parameters and we are able to draw independent realizations from the posterior by factoring it in an appropriate way. In the hierarchical case, we need to use a Gibbs-type sampler, and the computation is more elaborate.

Nonhierarchical regression

First consider the NH-HET models. Conditional on $\theta$, we can use the standard results for Bayesian linear weighted regression (e.g., Box and Tiao; Zellner):

$$ \frac{S(\theta)}{\sigma^2} \,\Big|\, \theta, y \;\sim\; \chi^2_{n-k}, $$

$$ \beta \mid \sigma, \theta, y \;\sim\; \mathrm{N}\!\left(\hat\beta(\theta),\; \sigma^2 V_\beta(\theta)\right), $$

where

$$ W(\theta) \;=\; \mathrm{diag}\!\left(1/f(\theta, w_1), \ldots, 1/f(\theta, w_n)\right), $$
$$ \hat\beta(\theta) \;=\; (X^T W X)^{-1} X^T W y, $$
$$ V_\beta(\theta) \;=\; (X^T W X)^{-1}, $$

and

$$ S(\theta) \;=\; \left(y - X\hat\beta\right)^{T} W \left(y - X\hat\beta\right) $$

is the sum of the squared weighted residuals.

We can determine the marginal posterior density of $\theta$ using the identity

$$ p(\theta \mid y) \;=\; \frac{p(\beta, \sigma, \theta \mid y)}{p(\beta \mid \sigma, \theta, y)\, p(\sigma \mid \theta, y)} \;\propto\; \frac{p(y \mid \beta, \sigma, \theta)\, p(\beta, \sigma, \theta)}{p(\beta \mid \sigma, \theta, y)\, p(\sigma \mid \theta, y)}. $$

Since the left side of this expression is a function of $\theta$ and $y$ only, the right side cannot depend on $(\beta, \sigma)$. We can therefore evaluate it at any values we wish, and we set $\beta = \hat\beta(\theta)$ and $\sigma = 1$ for both numerical stability and algebraic simplicity. Substituting the likelihood and the conditional densities given above at these values, and recalling that the product of the $f$'s is equal to one as discussed earlier, we obtain

$$ p(\theta \mid y) \;\propto\; |V_\beta(\theta)|^{1/2}\, S(\theta)^{-(n-k)/2}\, p(\theta). $$

This is not a recognizable distribution, but we can certainly compute the value of the unnormalized posterior density for any value of $\theta$ by doing the weighted least-squares regression and computing the quantity above. We do this over a fine mesh of $\theta$ values and then use the inverse-cdf method to simulate many draws from this arbitrarily good approximation to $p(\theta \mid y)$. Given each of our simulated $\theta$ values, we can compute $W(\theta)$ and then use weighted least squares to get $\hat\beta(\theta)$ and $S(\theta)$, draw $\sigma^2$ from its posterior distribution, compute $V_\beta(\theta)$, and draw a sample of $\beta$ from its posterior distribution. Once we have a set of posterior simulations of $(\beta, \sigma, \theta)$, we can simulate from the predictive distribution of $y_{\mathrm{pred}}$ given a new set of covariates $X_{\mathrm{pred}}$.
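A minimal sketch of this grid-plus-inverse-cdf scheme, under our own naming conventions (the helper f is one of the functions sketched earlier, a flat prior on theta is assumed as in our main analysis, and the grid size and simulation count are arbitrary illustrative choices):

```python
import numpy as np
from scipy import stats

def nh_posterior_sims(y, X, w, f, n_sims=1000, grid=np.linspace(0.005, 0.995, 199)):
    """Simulate (theta, sigma^2, beta) from the NH heteroscedastic posterior."""
    n, k = X.shape

    def wls(theta):
        fvals = f(theta, w)                       # Var(y_i) = sigma^2 * f(theta, w_i)
        XtW = X.T / fvals                         # X^T W, with W = diag(1/f)
        V = np.linalg.inv(XtW @ X)                # V_beta(theta)
        beta_hat = V @ (XtW @ y)
        S = np.sum((y - X @ beta_hat) ** 2 / fvals)
        return beta_hat, V, S

    # p(theta | y) on a fine grid, up to a constant, with a flat prior on theta
    log_post = []
    for theta in grid:
        _, V, S = wls(theta)
        log_post.append(0.5 * np.linalg.slogdet(V)[1] - 0.5 * (n - k) * np.log(S))
    post = np.exp(np.array(log_post) - max(log_post))
    theta_draws = np.random.choice(grid, size=n_sims, p=post / post.sum())

    sims = []
    for theta in theta_draws:
        beta_hat, V, S = wls(theta)
        sigma2 = S / stats.chi2.rvs(df=n - k)                       # S/sigma^2 ~ chi^2_{n-k}
        beta = np.random.multivariate_normal(beta_hat, sigma2 * V)  # beta | sigma, theta, y
        sims.append((theta, sigma2, beta))
    return sims
```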

The procedure for NHEV is much simpler. By letting $W = I$ and suppressing the conditioning on $\theta$, the equations above give a full factorization of the posterior distribution of $(\beta, \sigma)$, and so the computations for this special case are trivial to carry out.

Hierarchical regression

The hierarchical regression model can be interpreted as a nonhierarchical regression with additional data corresponding to the prior distribution (e.g., Dempster, Rubin, and Tsutakawa):

$$ y_* \mid \beta, \sigma, \theta, \tau \;\sim\; \mathrm{N}(X_*\beta, \Sigma_*), \qquad p(\beta \mid \sigma, \theta, \tau) \;\propto\; 1, $$

where $y_*$, $X_*$, and $\Sigma_*$ are derived by combining the likelihood for $y$ and the prior distribution on $\beta$:

$$ y_* = \begin{pmatrix} y \\ 0 \end{pmatrix}, \qquad X_* = \begin{pmatrix} X \\ 0 \;\; I_{k_2} \end{pmatrix}, \qquad \Sigma_* = \begin{pmatrix} \sigma^2 W(\theta)^{-1} & 0 \\ 0 & \Sigma_\beta \end{pmatrix}. $$

Here $W(\theta)$ is defined as in the nonhierarchical case, and the prior distribution of the variance parameters $(\sigma, \tau_1, \ldots, \tau_J, \theta)$ is determined by the prior distributions described earlier.
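A sketch of this augmented-data construction, under the assumption (ours, for illustration) that the k2 random-effect coefficients occupy the last k2 columns of X:

```python
import numpy as np

def augment(y, X, k2, sigma2, theta, tau2_by_coef, f, w):
    """Build y_*, X_*, and the diagonal of Sigma_* by stacking one pseudo-observation
    per random-effect coefficient below the data.  tau2_by_coef is a length-k2 vector
    holding the prior variance (tau_j^2) of each random-effect coefficient."""
    n, k = X.shape
    y_star = np.concatenate([y, np.zeros(k2)])
    X_star = np.vstack([X, np.hstack([np.zeros((k2, k - k2)), np.eye(k2)])])
    sigma_star_diag = np.concatenate([sigma2 * f(theta, w), tau2_by_coef])
    return y_star, X_star, sigma_star_diag
```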



Even though we have reduced the problem to a nonhierarchical regression, we cannot just proceed as in the nonhierarchical case above, in which we were able to analytically integrate out $\beta$ and $\sigma$ to find an expression for $p(\theta \mid y)$. Here there are too many variance parameters $(\sigma, \tau_1, \ldots, \tau_J, \theta)$, and so we need a different method. We use the following general paradigm, not just for this problem but for other problems in our work. First, we attempt to find the mode of $p(\sigma, \tau, \theta \mid y)$ by an EM-type algorithm, as in Dempster, Rubin, and Tsutakawa. We then run a Gibbs sampler (e.g., Gelfand and Smith) to sample from $p(\beta, \sigma, \tau, \theta \mid y)$. If a particular conditional is not easy to sample from, then we use a step of the Metropolis algorithm instead of a Gibbs step for that parameter. This Markov chain simulation has the posterior distribution as its equilibrium distribution (see, e.g., Smith and Roberts). To help monitor convergence, we run multiple parallel sequences from overdispersed starting points simulated from a t distribution centered at the mode of $(\sigma, \tau, \theta)$ found by the EM procedure (Gelman and Rubin).

In this particular problem, the Gibbs-Metropolis sampler works as follows (for clarity, we demonstrate the calculations with the uniform prior density on $\log\sigma$ and uniform prior densities on $\tau_1, \ldots, \tau_J$ and $\theta$). At the $t$-th iteration for a particular sequence, we have $(\beta^{(t-1)}, \sigma^{(t-1)}, \tau^{(t-1)}, \theta^{(t-1)})$. We then sample

$$ \beta^{(t)} \mid \sigma^{(t-1)}, \tau^{(t-1)}, \theta^{(t-1)}, y \;\sim\; \mathrm{N}\!\left(\hat\beta_*,\; V_*\right), $$

where $\hat\beta_*$ and $V_*$ come from the weighted least-squares procedure applied to $y_*$ and $X_*$ with weight matrix $\Sigma_*^{-1}$ evaluated at the current parameter values. Then, given $\beta^{(t)}$ and $\theta^{(t-1)}$, the conditional posterior distribution of the variance components $\sigma^{(t)}$ and $\tau^{(t)}$ can be determined from the weighted residuals $r = W_*^{1/2}\,(y_* - X_*\beta^{(t)})$, where

$$ W_* \;=\; \begin{pmatrix} W(\theta) & 0 \\ 0 & I_{k_2} \end{pmatrix}. $$

Each variance component is simulated as the sum of the squares of the corresponding elements of $r$, divided by a $\chi^2$ random variate with the appropriate degrees of freedom: the first $n$ elements of $r$ correspond to $\sigma^2$, and $k_{2j}$ of the remaining elements correspond to each $\tau_j^2$.
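Each such draw is a scaled inverse-chi-square simulation; a short sketch (our own helper, with the degrees of freedom supplied by the caller since they depend on the number of corresponding residuals and on the form of the prior density):

```python
import numpy as np
from scipy import stats

def draw_variance_component(resid_block, df):
    """Sum of squared standardized residuals for this component, divided by a chi^2 draw."""
    return np.sum(resid_block ** 2) / stats.chi2.rvs(df=df)
```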

We now need to draw $\theta^{(t)}$ from $p(\theta \mid \beta^{(t)}, \sigma^{(t)}, \tau^{(t)}, y)$. Algebraically, we have

$$ p(\theta \mid \beta^{(t)}, \sigma^{(t)}, \tau^{(t)}, y) \;\propto\; p(\beta^{(t)}, \sigma^{(t)}, \tau^{(t)}, \theta \mid y) \;\propto\; \left(\sigma^{(t)}\right)^{-n} \prod_{i=1}^{n} f(\theta, w_i)^{-1/2}\; \prod_{j=1}^{J}\left(\tau_j^{(t)}\right)^{-k_{2j}}\; p(\theta)\; \exp\!\left(-\tfrac{1}{2}\,(y_* - X_*\beta^{(t)})^{T}\,\Sigma_*^{-1}\,(y_* - X_*\beta^{(t)})\right), $$

with $\Sigma_*$ evaluated at $(\sigma^{(t)}, \tau^{(t)}, \theta)$.

We use a Metropolis step here because the density has no standard form but is numerically computable for any set of coefficients and variance parameters. The Metropolis algorithm is easiest to describe in the special case of a symmetric random-walk chain (see, e.g., Tierney, for other possibilities). We generate a candidate $\theta^*$ by taking a random step away from our current value $\theta^{(t-1)}$ according to a jumping density $J(\theta^* \mid \theta^{(t-1)})$. The jumping density is symmetric; that is, $J(\theta^* \mid \theta^{(t-1)}) = J(\theta^{(t-1)} \mid \theta^*)$ for any $\theta^*$. We then accept this candidate with probability

$$ r \;=\; \min\!\left(1,\; \frac{p(\theta^{*} \mid \beta^{(t)}, \sigma^{(t)}, \tau^{(t)}, y)}{p(\theta^{(t-1)} \mid \beta^{(t)}, \sigma^{(t)}, \tau^{(t)}, y)}\right), $$

setting $\theta^{(t)} = \theta^{*}$ if the candidate is accepted and $\theta^{(t)} = \theta^{(t-1)}$ otherwise. For this problem, we use a symmetric normal jumping kernel $J(\theta^* \mid \theta^{(t-1)}) = \mathrm{N}(\theta^* \mid \theta^{(t-1)}, v)$, with $v$ set to a multiple of the estimated variance of $\theta$ based on the normal approximation at the EM estimate, as suggested in Gelman, Roberts, and Gilks. To maintain symmetry, we reflect the random walk at the boundaries of the space of $\theta$, 0 and 1.
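A sketch of this reflecting random-walk Metropolis step for theta (our own helper names; log_p_theta is assumed to evaluate the log of the conditional density above up to a constant):

```python
import numpy as np

def reflect_unit(x):
    """Fold a proposal back into [0, 1] so the random walk remains symmetric."""
    x = x % 2.0
    return 2.0 - x if x > 1.0 else x

def metropolis_theta_step(theta_curr, log_p_theta, jump_sd, rng=np.random.default_rng()):
    """One Metropolis update: propose, then accept with probability min(1, density ratio)."""
    theta_prop = reflect_unit(theta_curr + rng.normal(0.0, jump_sd))
    log_ratio = log_p_theta(theta_prop) - log_p_theta(theta_curr)
    return theta_prop if np.log(rng.uniform()) < log_ratio else theta_curr
```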

Once we have a set of simulations from $p(\beta, \sigma, \tau, \theta \mid y)$, we can use these to make predictions of $y_{\mathrm{pred}}$ for a new set of predictive covariates $X_{\mathrm{pred}}$. The procedure is not completely straightforward, so we describe it briefly. We use each $(\sigma, \tau, \theta)$ simulation to generate realizations of the random $\beta$'s needed for the new observations. We combine these with the fixed $\beta$'s and the $(\sigma, \theta)$ simulation to get $y_{\mathrm{pred}}$. We then repeat this process for the rest of the simulations in our set. The resulting set of $y_{\mathrm{pred}}$'s is a sample from the posterior predictive distribution $p(y_{\mathrm{pred}} \mid y)$.
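A sketch of this prediction step, assuming (our convention, not the paper's) that the columns of the predictive design matrix are split into those multiplying fitted coefficients and those multiplying freshly drawn year and region-by-year effects:

```python
import numpy as np

def predict_hier(X_pred_fit, X_pred_new, sims, f, w_pred, rng=np.random.default_rng()):
    """For each posterior draw, sample fresh random effects for the new election year,
    form the predictive mean, and add heteroscedastic noise with variance sigma^2 * f(theta, w)."""
    draws = []
    for beta_fit, sigma2, tau2_new_cols, theta in sims:
        new_effects = rng.normal(0.0, np.sqrt(tau2_new_cols))   # one tau_j^2 per new column
        mean = X_pred_fit @ beta_fit + X_pred_new @ new_effects
        draws.append(rng.normal(mean, np.sqrt(sigma2 * f(theta, w_pred))))
    return np.array(draws)
```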

Numerical Illustration

Judge et al. (in a worked example) give a simulated data set based on the following heteroscedastic linear model:

$$ y_i \mid \beta, \alpha \;\sim\; \mathrm{N}\!\left(X_i\beta,\; \exp(\alpha_1 + \alpha_2 x_{i2})\right), $$

where the $i$th row of $X$ is $(1, x_{i2}, x_{i3})$ and the vectors $\beta$ and $\alpha$ are set to known true values. We can transform this to the form of HET1 by taking the $w_i$ to be an exponential function of $x_{i2}$ (normalized to have product one), so that $(\sigma, \theta)$ is a reparametrization of $(\alpha_1, \alpha_2)$. Our posterior inference for $\theta$ and $\sigma$ is summarized in Table 1. Given our posterior simulations of $\theta$ and $\sigma$, we simply transform back to the $\alpha$ parametrization to get posterior simulations of $(\alpha_1, \alpha_2)$. The inference is quite similar to the generalized least squares estimates obtained in the reference by first estimating $\alpha$. Our posterior means are slightly closer than the GLS estimates to the true parameter values in each case.

TABLE 1 ABOUT HERE

Example: Forecasting U.S. Presidential Elections

Background

Most political scientists are now aware that the U.S. Presidential election can be predicted quite well with information available several months before the actual election. In contrast, the Gallup poll taken several months before the election can predict a landslide victory for the eventual loser, as when Dukakis lost to Bush in 1988. The methods used by Rosenstone, Fair, Campbell, and others use standard regression methods and are remarkably successful. We became interested in this problem to see if we could do better by using more refined statistical techniques.

We begin by using a model described by Gelman and King that is similar to the forecasting model of Campbell. The method is to run a simple linear regression of the Democratic share of the two-party vote by state, for each election year in the data set, on several covariates. These covariates can be divided into three groups: national variables, state variables, and regional variables. First, define the variable INC to be +1 if a Democrat is President and -1 if not. The national variables that we use are the Democratic share in the September Gallup poll (TRIHEAT), the change in GNP in the previous quarter times INC (ECxINC), a dummy indicating whether the current President is seeking reelection, times INC (PRESINC), and the latest approval rating times INC (APPRxINC). The state variables are the Democratic share in the last two Presidential elections (the two DEV variables), home-state advantage for the President and the Vice-President, times INC (HOME and VP), state economic growth in the past year times INC (the state-level ECxINC), legislature partisanship (LEGIS), an index of state liberalism (ADAACA), and a measure of the proportion of Catholic voters in each state for the year in which there was a Catholic candidate (CATH). The regional variables, mostly used to patch up problems in abnormal years, are a dummy for the South in one particular election year (SB) and similar single-year dummies for the Deep South (DEEPS), New England (NEWENG), the West (W), and the North Central (NCENT). There is also a true regional variable, S, which is 1 for every southern state in the years for which the Democratic candidate had a Southern home state. The variables are signed so that they are all expected to have positive coefficients. Finally, we have a column of 1's (CONSTANT), to give a total of $k$ covariates for the NH models. We exclude all cases in which a third-party candidate for President won the plurality of the votes in a state, leaving us with $n$ state-year observations.

There are a few issues that need to be addressed in connection with our modeling choices (see Judge et al.). First, the parameter space is actually discrete and different for each state, because we are looking at the Democratic share of a finite number of votes. This is not really a problem, though, because of the enormous population sizes involved (no less than twenty thousand in any particular instance). Second, proportions for count data typically are heteroscedastic, which usually precludes a standard regression analysis, but our models are taking this into account. Finally, in many problems that occur on the unit interval there is highly nonlinear behavior near the endpoints, which necessitates a logit or probit transformation. Here, though, most of the data points lie well away from the endpoints, so while we could model a transformation of the data, this would probably not make much of a difference.

Models NHEV, NHHET1, and NHHET2

Homoscedastic models are standard in empirical studies of elections, and so our first model, NHEV, is that the variance in each state for each election year is a constant, $\sigma^2$.

Of course, we do not really think that the states are equally variable. In fact, we might expect that the bigger states are less variable than the smaller states. A naive model which says that votes are binomially distributed implies that the variance of the Democratic vote share should be proportional to $1/n$, where $n$ is the number of voters. This motivates using the models of heteroscedasticity discussed earlier, and we tried both forms for $f$, with corresponding models NHHET1 and NHHET2, with $w_i$ being the number of voters for the Democrats and Republicans in the corresponding state and year. For predictions of the forecast election year, we set $w_i$ to the voter turnout in the state in the most recent election in the data set, scaled by the increase in the voting-age population of the state since that election.

Models HIEREV, HIERHET1, and HIERHET2

We can immediately notice some potential problems with the NH models. First, most of the regional variables, while they certainly allow the models to fit better (they noticeably increase the multiple $R^2$ relative to NHEV without them), will not help at all for doing predictions. Also, this model ignores the year-by-year structure completely and treats the data as independent observations.

These problems are handled with a hierarchical model. We include all the original covariates except for the regional patch-up variables (SB, DEEPS, NEWENG, W, and NCENT). We leave a flat prior density on the original $\beta$'s. We then include a covariate for election year, which has eleven levels corresponding to the elections in our data set, and a covariate for region by year. The regions we used were East, Midwest, West, and South; thus there were 44 levels of this factor. We call the $\beta$'s associated with the year factor the year effects, and the other $\beta$'s the region $\times$ year effects. The year effects are given independent normal prior distributions with mean zero and variance $\tau_1^2$. Similarly, the region $\times$ year effects are $\mathrm{N}(0, \tau_2^2)$, except for the South $\times$ year effects, which are $\mathrm{N}(0, \tau_3^2)$, reflecting the prior belief that the South is a special region that is not exchangeable with the other three politically. To match some of the earlier definitions, the year and region $\times$ year effects are the $k_2$ random effects, with $k_{21}$ of them (the year effects) having variance $\tau_1^2$, $k_{22}$ (the non-South region $\times$ year effects) having variance $\tau_2^2$, and $k_{23}$ (the South $\times$ year effects) having variance $\tau_3^2$.

This setup is quite arbitrary. We could specify more than four regions, we could give each region a separate variance, and so on. This model represents a simple first-pass solution which we hope should address the most serious problems of the NH models. We would want to try more carefully specified models and check the robustness of our inference to changing the specification.

We fit this hierarchical model with the equal-variance assumption (HIEREV) and the two heteroscedastic models (HIERHET1 and HIERHET2) described earlier.

Results with nonhierarchical models

Table 2 contains a summary of the posterior simulations for the parameters. Table 3 shows quantiles of the state-by-state, electoral college, and popular vote predictions for the forecast election year. Figure 1 plots the prediction standard errors versus the $w_i$ on a log scale.

TABLE 2 ABOUT HERE

TABLE 3 ABOUT HERE

FIGURE 1 ABOUT HERE

The posterior distribution of the coefficients is comparable across models. The posterior medians of the coefficients all have the expected positive sign, except for APPRxINC, which is essentially zero. Both of the HET models find evidence for a small amount of heteroscedasticity. This can be seen in Figures 1(b) and 1(c) and also from the posterior intervals for $\theta$ in Table 2. For the predictions, there is nearly exact agreement in the medians, as can be seen in Table 3, so we do not show a comparison plot. The standard errors of prediction for NHEV bear no relation to turnout and are essentially constant. The standard errors for NHHET1 decrease linearly with log turnout, whereas those for NHHET2 decrease linearly at first but then level out, following the parametric model. The resulting ranges of predictive standard errors for NHHET1 and NHHET2 are shown in Figure 1.

To confirm our suspicion that the year-by-year structure of the data is not being dealt with effectively by the NH models, we compute the average residuals by election year. This gives us a list of eleven numbers for each of the three models. For NHEV we expect these numbers $\bar r_t$ to be approximately independently normally distributed with variances $\sigma^2/n_t$, where $n_t$ is the number of observations we used in election year $t$; i.e., $\sqrt{n_t}\,\bar r_t \sim \mathrm{N}(0, \sigma^2)$. The posterior interval for $\sigma$ from Table 2 is fairly tight, but empirically the eleven values of $\sqrt{n_t}\,\bar r_t$ have a standard deviation of nearly twice the estimated $\sigma$; i.e., the year-by-year swings are about twice as variable as the model would predict. Similarly, for NHHET1 and NHHET2 the model predicts that

$$ \frac{1}{\sqrt{n_t}}\sum_{i=1}^{n_t}\frac{y_{it} - X_{it}\beta}{\sqrt{f(\theta, w_{it})}} \;\sim\; \mathrm{N}(0, \sigma^2). $$

The results are roughly the same: despite a tight posterior interval for $\sigma$ in each of the models, the empirical variation of the eleven quantities is nearly twice as large in each case. We perform a more formal check of the model in the Model Checking section below.

Results with hierarchical models

For each of the hierarchical models, we computed the mode of the variance components using the EM algorithm and then ran parallel simulation sequences of the Gibbs-type sampler. As described in Gelman and Rubin, we can use the multiple sequences to help decide whether the sampler has run for a long enough time. We calculated the potential scale reductions $\sqrt{\hat R}$ for all the parameters, based on discarding the first half of each sequence to allow burn-in to occur, and they were all very close to 1. Since running the sampler for an infinitely long time would only bring these numbers down to 1, we are satisfied with the convergence of the sampler. Table 4 summarizes the posterior distribution of the parameters, Table 5 the quantiles of the predictions, and Figure 2 the prediction standard errors.
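For reference, a minimal version of the potential scale reduction calculation (our own condensed implementation of the Gelman and Rubin diagnostic, applied to the retained halves of the parallel sequences for a scalar parameter):

```python
import numpy as np

def potential_scale_reduction(chains):
    """chains: (m, n) array of m parallel sequences, each of n post-burn-in draws of a scalar."""
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)      # between-sequence variance
    W = chains.var(axis=1, ddof=1).mean()        # within-sequence variance
    var_hat = (n - 1) / n * W + B / n            # pooled estimate of the posterior variance
    return np.sqrt(var_hat / W)                  # values near 1 indicate approximate convergence
```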

The coefficient estimates appear to be relatively similar across models. They are positively signed, as expected. If anything, the HET estimates are more efficient (have narrower intervals) than the EV estimates in most of the cases. The hierarchical standard deviations $\tau_j$ are not determined with very much precision. This points out one great advantage of doing Bayesian computations: if we had simply made point estimates of these variance components, we would have been ignoring a wide range of possible values for all the parameters. Also, there is more evidence for heteroscedasticity than in the NH models. Intuitively this makes sense: because we are accounting for the year and the region-by-year variability, it should be easier to detect unequal variances in the state-by-state errors.

TABLE 4 ABOUT HERE

TABLE 5 ABOUT HERE

FIGURE 2 ABOUT HERE

Again, the median predictions are virtually identical for both the HIER and NH models, and we do not show a plot. The prediction SEs split nicely into southern and non-southern states, with the southern states about one percent higher. Within each of the groups, we see the same behavior as a function of turnout as we saw in the NH models. The prediction standard errors are much larger than the ones for the NH models, perhaps too large because of our simplistic model of the regions of the country. To develop a model with narrower prediction intervals requires including more information about the elections.

To check that idea, we fit HIERHET1 again, but this time including the five regional covariates (HIERHETALL). Although this is not a well-defined model for prediction, as explained earlier, we certainly get better results by doing this. Now the posterior distributions of the $\tau$'s have both smaller medians and lower variance. Also, the prediction standard errors are more reasonable, falling in a noticeably lower range. Tables 6 and 7 and Figure 3 display some of these results.

TABLE 6 ABOUT HERE

TABLE 7 ABOUT HERE

FIGURE 3 ABOUT HERE

Model Checking

For each of our simulated parameter vectors $(\beta, \sigma, \tau, \theta)$, we can generate a replicated data set $y_{\mathrm{rep}}$. Then, to check whether the model fits well in some way, we can calculate an appropriate test variable $T$, which is a function of both data and parameters. For example, if we are concerned about the yearly swings, as discussed at the end of the results for the nonhierarchical models, we could calculate the mean square national prediction error,

$$ T(y, \beta) \;=\; \frac{1}{11}\sum_{t}\left(\frac{\sum_{i} w_{it}\,(y_{it} - X_{it}\beta)}{\sum_{i} w_{it}}\right)^{2}, $$

where the outer sum is over the eleven election years and the inner sums are over all states in year $t$ (recall that $w_{it}$ is the turnout in state $i$ and election year $t$). Then we can look at a scatterplot of $T(y)$ versus $T(y_{\mathrm{rep}})$ for our entire set of simulations of $(\beta, \sigma, \tau, \theta)$. If the model fits well, the points should cluster around the 45-degree line. We can regard the percentage below the line as a p-value (Gelman, Meng, and Stern, and Rubin) for formally testing the fit of the model.
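A sketch of this tail-area computation (our own helper names; resid holds realized or replicated residuals y - X*beta for one simulation draw, w holds turnouts, and years labels the election year of each observation):

```python
import numpy as np

def national_mse(resid, w, years):
    """T(y, beta): mean squared turnout-weighted national residual across election years."""
    terms = [(np.sum(w[years == t] * resid[years == t]) / np.sum(w[years == t])) ** 2
             for t in np.unique(years)]
    return np.mean(terms)

def posterior_predictive_pvalue(T_obs, T_rep):
    """Tail-area probability: share of draws in which the replicated statistic exceeds the realized one."""
    return np.mean(np.asarray(T_rep) >= np.asarray(T_obs))
```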

We carried out this computation for the three NH models and the three HIER models that were originally introduced in this paper. The scatterplots appear in Figure 4. The p-value is essentially zero for the three NH models and is acceptable for the HIER models, telling us that the year-to-year variability has been captured well by the HIER models.

FIGURE 4 ABOUT HERE

Sensitivity analysis

Given our set of simulations from the posterior distribution of the parameters for an original model, it is not difficult to get approximate posterior simulations from a perturbed model, i.e., one with different likelihood or prior assumptions. To do this, we use the importance resampling (SIR) algorithm (Rubin; see also Gelman, and Smith and Roberts). For each of the original simulations, we calculate its importance weight, i.e., the ratio of the unnormalized perturbed and original posterior densities. We then choose a subsample without replacement from the original set of simulations, with sampling probabilities proportional to the importance weights. This subsample is approximately from the perturbed posterior distribution.
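A sketch of the importance-resampling step (our own function; the two log-density arrays are the unnormalized original and perturbed log posteriors evaluated at each stored simulation):

```python
import numpy as np

def sir_subsample(sims, log_post_orig, log_post_pert, n_keep, rng=np.random.default_rng()):
    """Resample without replacement with probability proportional to the importance weights."""
    log_w = np.asarray(log_post_pert) - np.asarray(log_post_orig)
    p = np.exp(log_w - log_w.max())
    p /= p.sum()
    idx = rng.choice(len(sims), size=n_keep, replace=False, p=p)
    return [sims[i] for i in idx]
```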

We began by changing the prior distribution on $\theta$ from uniform on the unit interval to Beta(.5, .5). A summary of the changes appears in Tables 8 through 11. Since we already know that the models favor small values of $\theta$, changing to this new prior will tend to favor even smaller values of $\theta$. Indeed, for all five models the posterior quantiles of $\theta$ are shifted down slightly, and the predictions become slightly more like the EV results (Tables 9 and 11).

TABLE 8 ABOUT HERE

TABLE 9 ABOUT HERE

TABLE 10 ABOUT HERE

TABLE 11 ABOUT HERE

Next, we changed the prior distribution on the $\tau_j$ to the other of the two noninformative forms described earlier. This change favors lower values for these variance parameters, as can be seen in Table 10. The new predictions are in Table 11.

The most important thing to notice is that the predictions are quite robust to either change in the prior density. A more complete robustness analysis would perturb either the normal likelihood or the normal prior density on the $\beta$'s. We examined the residuals from the regression and found no outliers, suggesting that the normal likelihood model is acceptable.

Discussion

Heteroscedasticity was an easy thing to add in the Bayesian context, with little added computational effort, once we decided on the weights and specified the priors. Putting in the hierarchical model is the most important thing, but the heteroscedastic model adds some model fit and makes the predictions more trustworthy. The Bayesian model allows us to fit a parametric model for unequal variances even in situations in which the parameter is not well identified, thus allowing us to use information encoded in the weights $w_i$ in a more general and flexible setting than weighted least squares.

NOTE: The code and data used to do the computations in this paper are available from the authors upon request.

References

Amemiya, T. Advanced Econometrics. Cambridge, Mass.: Harvard University Press.

Box, G. E. P., and Cox, D. R. An analysis of transformations (with discussion). Journal of the Royal Statistical Society B.

Box, G. E. P., and Tiao, G. C. Bayesian Inference in Statistical Analysis. New York: Wiley Classics.

Campbell, J. E. Forecasting the Presidential vote in the states. American Journal of Political Science.

Carroll, R. J., and Ruppert, D. Transformations and Weighting in Regression. New York: Chapman and Hall.

Dempster, A. P., Laird, N. M., and Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society B.

Fair, R. C. The effect of economic events on votes for President. Review of Economics and Statistics.

Fair, R. C. The effect of economic events on votes for President: update. Review of Economics and Statistics.

Fair, R. C. The effect of economic events on votes for President: update. Political Behavior.

Gelfand, A. E., and Smith, A. F. M. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association.

Gelman, A. Iterative and non-iterative simulation algorithms. Computing Science and Statistics.

Gelman, A., Meng, X. L., and Stern, H. S. Bayesian model checking using tail area probabilities. Under revision for Statistica Sinica.

Gelman, A., Roberts, G. O., and Gilks, W. R. Efficient Metropolis jumping rules. Presented at the Bayesian Statistics meeting, Valencia, Spain.

Gelman, A., and Rubin, D. B. Inference from iterative simulation using multiple sequences (with discussion). Statistical Science.

Gelman, A., and King, G. Why are American Presidential election campaign polls so variable when votes are so predictable? British Journal of Political Science.

Greene, W. H. Econometric Analysis. New York: Macmillan.

Hill, B. M. Inference about variance components in the one-way model. Journal of the American Statistical Association.

Judge, G., Hill, C., Griffiths, H., Lütkepohl, H., and Lee, T. Introduction to the Theory and Practice of Econometrics, 2nd ed. New York: Wiley.

Judge, G., Hill, C., Griffiths, H., Lütkepohl, H., and Lee, T. The Theory and Practice of Econometrics, 2nd ed. New York: Wiley.

Lindley, D. V., and Smith, A. F. M. Bayes estimates for the linear model (with discussion). Journal of the Royal Statistical Society B.

Rosenstone, S. J. Predicting elections. Unpublished manuscript, Department of Political Science, University of Michigan.

Rosenstone, S. J. Forecasting Presidential Elections. New Haven: Yale University Press.

Rubin, D. B. Bayesianly justifiable and relevant frequency calculations for the applied statistician. Annals of Statistics.

Rubin, D. B. A noniterative sampling-importance resampling alternative to the data augmentation algorithm for creating a few imputations when fractions of missing information are modest: the SIR algorithm. Comment on "The calculation of posterior distributions by data augmentation" by M. A. Tanner and W. H. Wong. Journal of the American Statistical Association.

Smith, A. F. M., and Roberts, G. Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods (with discussion). Journal of the Royal Statistical Society B.

Tierney, L. Markov chains for exploring posterior distributions. Technical report, Department of Statistics, University of Minnesota.

Zellner, A. An Introduction to Bayesian Inference in Econometrics. New York: Wiley.

              BAYES                     GLS               True
        E(. | y)   SD(. | y)     Estimate    SE      parameter value

Table 1: Comparison of the posterior distribution with GLS estimates for the simulated data example of Judge et al.

NHEV    NHHET1    NHHET2

Constant

TRIHEAT

ECxINC

PRESINC

APPRxINC

DEV

DEV

HOME

VP

LEGIS

ECxINC

ADAACA

CATH

S

SB

DEEPS

NEWENG

W

NCENT

SIG

THETA

Table 2: Median and quantiles of the posterior distributions of the parameters for the NH models. All values are obtained by posterior simulation.

NHEV    NHHET1    NHHET2    Actual

AL

AK

AZ

AR

CA

CO

CN

DE

FL

GA

HA

ID

IL

IN

IO

KS

KY

LA

ME

MD

MA

MI

MN

MS

MO

MT

NB

NV

NH

NJ

NM

NY

NC

ND

OH

OK

OR

PA

RI

SC

SD

TN

TX

UT

VT

VA

WA

WV

WS

WY

ELECCOLL

POPVOTE

Table 3: Median and quantiles of the predictions for the forecast election year, for the NH models.

HIEREV    HIERHET1    HIERHET2

Constant

TRIHEAT

ECxINC

PRESINC

APPRxINC

DEV

DEV

HOME

VP

LEGIS

ECxINC

ADAACA

CATH

S

SIG

TAU

TAU

TAU

THETA

Table 4: Median and quantiles of the posterior distributions of the parameters for the HIER models (random effects omitted).

HIEREV    HIERHET1    HIERHET2    Actual

AL

AK

AZ

AR

CA

CO

CN

DE

FL

GA

HA

ID

IL

IN

IO

KS

KY

LA

ME

MD

MA

MI

MN

MS

MO

MT

NB

NV

NH

NJ

NM

NY

NC

ND

OH

OK

OR

PA

RI

SC

SD

TN

TX

UT

VT

VA

WA

WV

WS

WY

ELECCOLL

POPVOTE

Table 5: Median and quantiles of the predictions for the forecast election year, for the HIER models.

HIERHET1    HIERHETALL

SIG

TAU

TAU

TAU

THETA

Table 6: Posterior quantiles of variance components for HIERHET1 and HIERHETALL.

HIERHET1    HIERHETALL

Actual

CA

NJ

RI

VA

ELECCOLL

POPVOTE

Table 7: Prediction summary for HIERHET1 and HIERHETALL.

Original    Beta(.5, .5)

NHHET1:

SIG

THETA

NHHET2:

SIG

THETA

Table 8: Variance component sensitivity analysis comparing original and new prior densities on theta for the NH models.

Original    Beta(.5, .5)    Actual

NHHET1:

CA

NJ

RI

VA

ELECCOLL

POPVOTE

NHHET2:

CA

NJ

RI

VA

ELECCOLL

POPVOTE

Table 9: Prediction sensitivity analysis comparing original and new prior densities on theta for the NH models.

Original    Beta(.5, .5) on theta    New prior on tau_j

HIERHET1:

SIG

TAU

TAU

TAU

THETA

HIERHET2:

SIG

TAU

TAU

TAU

THETA

Table 10: Variance component sensitivity analysis comparing original and new prior densities for the HIER models.

Original    Beta(.5, .5) on theta    New prior on tau_j    Actual

HIERHET1:

CA

NJ

RI

VA

ELECCOLL

POPVOTE

HIERHET2:

CA

NJ

RI

VA

ELECCOLL

POPVOTE

Table 11: Prediction sensitivity analysis comparing original and new priors on theta for the HIER models.

Figure 1: Standard error of predictions versus turnout (on a log scale) for (a) NHEV, (b) NHHET1, and (c) NHHET2.

Figure 2: Standard error of predictions versus turnout (on a log scale) for (a) HIEREV, (b) HIERHET1, and (c) HIERHET2.

Figure 3: Standard error of predictions for HIERHETALL versus (a) the standard errors for HIERHET1 and (b) turnout.

Figure 4: Model checking scatterplots of T(y) versus T(y_rep) under the different models. T(y) is the mean square error in the national vote compared to the model. The top three plots indicate that T(y) is consistently higher than T(y_rep) for the NH models, showing that these models do not accurately capture the variation in the national vote. The HIER models fit much better in this respect.