
Controlling Observables & Unobservables Lecture 3

Rebecca B. Morton

NYU

Exp Class Lectures

R B Morton (NYU) EPS Lecture 3 ExpClassLectures

Controlling Observables & Unobservables

Controlling Observables in Experiments

Two types of observable variables can cause problems for the estimation of the effects of a cause: $Z_i$ and $X_i$.

Recall that $Y_i$ is a function of $X_i$ and $T_i$ is a function of $Z_i$.

$X_i$ represents the other observable variables, besides the treatment variable, that affect our dependent variable.

$Z_i$ represents the set of observable variables that affect the treatment variable.

Moreover, these variables may overlap, and we define $W_i = Z_i \cap X_i$.

In experiments, researchers deal with these observable variables in two ways: through random assignment and through the ability to manipulate these variables as they do with treatment variables.

We now turn to control.

Guala (2005, p. 238) remarks: “. . . the experimental method works by eliminating possible sources of error or, in other words, by controlling systematically the background factors that may induce us to draw a mistaken inference from the evidence to the main hypothesis under test. A good design is one that effectively controls for (many) possible sources of error.”

Definition (Controlling Observables in Experimentation): When an experimentalist holds observable variables constant or randomly assigns them in order to evaluate the effect of one or more treatments on subjects' choices.

Definition (Script): The context of the instructions given to subjects in an experiment.

Controlling Unobservables in Laboratory Experiments

Definition (Controlling Unobservables in Experimentation): When an experimentalist attempts to control typical unobservables through within-subjects designs, by manipulation, or by observation in order to evaluate the effect of one or more treatments on subjects' choices.

Ways to do this:

Within-subjects designs.

Can make some typically unobservable variables observable (for example, financial incentives to motivate subjects and reduce variance in things like voter intensity and other random factors that can affect preferences).

Can measure and control the time and effort spent on tasks.

Can use subliminal primes coupled with implicit measures of responses, and so can measure things like racial prejudices.

Controlling Manipulations

Manipulation Checks

Definition (Manipulation Check): A survey or other method used to check whether the manipulation conducted in an experiment is perceived by the subjects as the experimenter wishes it to be perceived.

Taber's experiment is a good example of the use of manipulation checks, which are often necessary in political science experiments that use words or visuals as manipulations.

Control Functions in Regressions

What happens when a researcher is investigating the effects of information on voting behavior in observational data, or in experimental data gathered through a field experiment in which the researcher could neither control observable variables as above nor randomly assign them?

The modal method for exploring hypothesized causal relations in political science in such a situation is the use of control functions in regressions.

Most who use this approach in political science are working within an RCM basis, either implicitly or explicitly.

Digression on Dealing with Voting as a Dependent Variable

Most political scientists use ordinary least squares (OLS) regression to estimate causal effects, usually based only implicitly on RCM.

But voter choice is an unordered multinomial response, and thus OLS is an inappropriate estimation procedure.

The standard line of attack is to concentrate on the probability of each voter choice, typically called the response probabilities.

The response probabilities, which sum to 1, are then assumed to be a function of explanatory variables.

Response probabilities are typically estimated using maximum likelihood procedures such as multinomial logit, conditional logit, or multinomial probit.
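To make the idea of response probabilities concrete, the sketch below computes multinomial logit probabilities for a single voter from made-up linear indices. The choice labels, the index values, and the function name `response_probabilities` are illustrative assumptions, not taken from the lecture. No estimation is performed; the block only shows that the logit form yields probabilities that sum to 1.

```python
import math


def response_probabilities(indices):
    """Multinomial logit response probabilities for one voter.

    `indices` maps each choice to a linear index (hypothetical values).
    """
    # Subtract the largest index before exponentiating for numerical stability.
    top = max(indices.values())
    weights = {choice: math.exp(v - top) for choice, v in indices.items()}
    total = sum(weights.values())
    return {choice: w / total for choice, w in weights.items()}


# Hypothetical linear indices for three unordered choices.
indices = {"abstain": 0.0, "candidate_A": 0.4, "candidate_B": -0.2}
probs = response_probabilities(indices)

for choice, p in probs.items():
    print(f"{choice}: {p:.3f}")
print("sum:", round(sum(probs.values()), 6))
```

In an actual application the indices would be linear functions of explanatory variables with coefficients estimated by maximum likelihood; here they are simply fixed numbers.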

Another solution is to focus more narrowly on the effect of information on a binary choice within our general choice context.

Bartels (1996) examines the effect of information on voters' choices between major party candidates in presidential elections from 1972 to 1992 using NES surveys, excluding voters who report no preference or a preference for minor party or independent candidates.

We could also restrict our analysis to the effect of information on the decision to turn out alone, also a binary choice, as does Lassen.

For expositional purposes, let's restrict our examination to the effect of information on turnout.

Assume then that our dependent variable, $Y_i$, can take only two values, 0 or 1, and that we are interested in the probability of voting.

One simple approach to estimating this probability is the linear probability model (LPM), in which we assume that the response probabilities are a linear function of explanatory variables and estimate that function using OLS.

For example, this is the approach used to estimate causal effects of media exposure on voting in GKB (working paper; the published paper still needs to be checked).

We use this approach because of its expositional clarity, although it has known limitations.
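A minimal sketch of the LPM with a single binary regressor may help here. The data are invented (10 uninformed voters of whom 3 vote, 10 informed voters of whom 7 vote), and the helper `ols_simple` is a hypothetical name for a hand-written OLS fit, not a function from any library or from GKB.

```python
def ols_simple(x, y):
    """OLS intercept and slope for a single regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: informed = 1 if the voter is informed, voted = 1 if they vote.
informed = [0] * 10 + [1] * 10
voted = [1] * 3 + [0] * 7 + [1] * 7 + [0] * 3

intercept, slope = ols_simple(informed, voted)
print(f"P(vote | uninformed) = {intercept:.2f}")
print(f"P(vote | informed)   = {intercept + slope:.2f}")
```

With a single binary regressor the fitted values are just the two sample shares, 0.3 and 0.7, so the slope is the difference in turnout rates. One of the known limitations is that, with continuous regressors, fitted LPM probabilities can fall outside the interval from 0 to 1.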

The Switching Regression Model

Following the LPM, assume that the probability of voting is a linear function of observables and unobservables.

Define that probability as $P_i$, and $P_{ij}$ as the probability of voting for individual $i$ in information state $j$.

The causal effect we are interested in when we use $P$ as our dependent variable is defined as $\delta = P_{i1} - P_{i0}$.

Also, to make our exposition simpler, we will drop the $i$'s from our notation and refer to $P_0$ as the probability of voting when an individual is uninformed and $P_1$ as the probability of voting when an individual is informed.

Can we just add our treatment variable, information, as one of these observable variables and use the effect of information on $P_i$ as measured in a linear regression as our measure of the causal effect, assuming that the LPM accurately describes how the probability of voting is determined?

The answer is yes, under certain additional assumptions.

What are those assumptions?

It is useful to decompose the two probabilities, $P_0$ and $P_1$, into their means and a stochastic part with a zero mean:

$P_0 = \mu_0 + u_0$

$P_1 = \mu_1 + u_1$

where $\mu_j$ is the mean value of $P$ in state $j$, $u_j$ is the stochastic term in state $j$, and $E(u_j) = 0$.

Further, we assume that the probability of voting that we observe, $P$, depends on the state of the world for a voter:

$P = T P_1 + (1 - T) P_0$

We can then plug in, yielding:

$P = \mu_0 + (\mu_1 - \mu_0) T + u_0 + (u_1 - u_0) T$

In econometrics this equation is called Quandt's switching regression model (see Quandt 1958, 1974), and the coefficient on $T$ is thus considered the causal effect of information on the probability of turnout.
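The substitution can be written out step by step. The first display uses only the decomposition of $P_0$ and $P_1$ into means and zero-mean stochastic terms. The second display is an added worked step, not on the slide: it compares mean observed probabilities across the two information states and shows which terms must vanish for the coefficient on $T$ to equal $\mu_1 - \mu_0$.

```latex
\begin{align*}
P &= T P_1 + (1 - T) P_0 \\
  &= T(\mu_1 + u_1) + (1 - T)(\mu_0 + u_0) \\
  &= \mu_0 + (\mu_1 - \mu_0)\,T + u_0 + (u_1 - u_0)\,T .
\end{align*}

\begin{align*}
E(P \mid T = 1) - E(P \mid T = 0)
  &= (\mu_1 - \mu_0) + E(u_1 \mid T = 1) - E(u_0 \mid T = 0) .
\end{align*}
```

If the stochastic terms are mean independent of $T$, so that $E(u_1 \mid T = 1) = E(u_0 \mid T = 0) = 0$, the difference reduces to $\mu_1 - \mu_0$.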

Selection on the Observables or Ignorability of Treatment

However, most political scientists would assume that the stochastic nature of the probabilities of voting in the two states, $u_j$, depends on $X$, our set of observable exogenous variables that also affect the decision to turn out, and on $Z$ and $M$, our sets of observable and manipulated exogenous variables that affect the decision to be informed when this decision is endogenous, such that the means of the stochastic terms are not zero.

Since the focus in this lecture is on control of observables, for ease of exposition we drop $M$ from our analysis for now.

However, inclusion of $M$ is straightforward whenever $Z$ is included or discussed.

Selection on the Observables or Ignorability of Treatment

Can a political scientist simply run an OLS regression with $T$ as an exogenous variable, using the coefficient on $T$ as an estimate of the causal effect as in the Quandt switching regression model? The answer is no.

The main problem is that now the potential choices, $P_j$, may be correlated with the treatment variable, making it difficult to determine treatment effects. Thus, we may not be able to estimate causal relationships.
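To see why the coefficient on $T$ can mislead, consider a minimal sketch with a hypothetical two-group population. The shares, information rates, and turnout probabilities below are illustrative assumptions, not figures from the lecture. Information raises turnout by 0.10 in both groups, but the group that is more likely to be informed is also more likely to vote in either state, so the potential choices are correlated with treatment.

```python
# Hypothetical two-group population; every number here is an illustrative assumption.
# Each tuple: (population share, Pr(informed), turnout if uninformed, turnout if informed)
GROUPS = [
    (0.5, 0.2, 0.3, 0.4),
    (0.5, 0.8, 0.6, 0.7),
]


def naive_difference(groups):
    """Mean turnout of informed voters minus mean turnout of uninformed voters."""
    treated_mass = sum(share * q for share, q, _, _ in groups)
    control_mass = sum(share * (1 - q) for share, q, _, _ in groups)
    mean_treated = sum(share * q * p1 for share, q, _, p1 in groups) / treated_mass
    mean_control = sum(share * (1 - q) * p0 for share, q, p0, _ in groups) / control_mass
    return mean_treated - mean_control


def true_ate(groups):
    """Population average of the within-group effect of information on turnout."""
    return sum(share * (p1 - p0) for share, _, p0, p1 in groups)


naive = naive_difference(GROUPS)
ate = true_ate(GROUPS)
print(f"naive difference in means: {naive:.2f}")
print(f"true average effect:       {ate:.2f}")
```

With a binary $T$ and no controls, the OLS coefficient on $T$ equals this difference in means, so in this constructed example the regression would report about 0.28 when the true effect is 0.10.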

Selection on the Observables or Ignorability of Treatment

How can we deal with the problem?

Axiom (Ignorability of Treatment or Selection on the Observables): Conditional on $W$, $T$ and $(P_0, P_1)$ are independent.

Axiom (Mean Ignorability of Treatment or Mean Selection on the Observables): $E(P_0 \mid W, T) = E(P_0 \mid W)$ and $E(P_1 \mid W, T) = E(P_1 \mid W)$.

Selection on the Observables or Ignorability of Treatment

Mean ignorability of treatment is sufficient for the estimation of a regression function with $T$ and $W$ as independent variables (recall that $W = X \cup Z$).

We also need a second assumption, that $(u_1 - u_0)$ has a zero mean conditional on $W$, although we relax this assumption below.

Therefore, given mean ignorability of treatment and the assumption that $E(u_1 \mid W) = E(u_0 \mid W)$, ATE = ATT and

$$E(P \mid T, W) = \mu_0 + \alpha T + h_0(W)\beta_0$$

where $\alpha = \text{ATE}$ and $h_0(W) = E(u_0 \mid W)$.

What does this mean? It means that if the predicted individual-specific effect of information given $W$ is zero, the coefficient on the treatment variable in a regression can be used to estimate ATE.
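The step from the two assumptions to the regression equation can be spelled out as follows. This is a sketch that assumes the switching-regression form $P_j = \mu_j + u_j$ for $j = 0, 1$ and the observed choice $P = T P_1 + (1 - T) P_0$; both are taken here as assumptions about the model set up earlier in the lecture.

```latex
\begin{align*}
P &= \mu_0 + (\mu_1 - \mu_0)\,T + u_0 + T\,(u_1 - u_0) \\
E(P \mid T, W)
  &= \mu_0 + (\mu_1 - \mu_0)\,T + E(u_0 \mid W)
     + T\,\bigl[E(u_1 \mid W) - E(u_0 \mid W)\bigr]
  && \text{by mean ignorability} \\
  &= \mu_0 + \alpha T + E(u_0 \mid W)
  && \text{since } E(u_1 \mid W) = E(u_0 \mid W)
\end{align*}
```

Here $\alpha = \mu_1 - \mu_0$. Mean ignorability is what allows $E(u_j \mid T, W)$ to be replaced by $E(u_j \mid W)$ in the second line, and the second assumption removes the term multiplied by $T$. The remaining term in $W$ is the part that the regression equation represents through $h_0(W)\beta_0$.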

Selection on the Observables or Ignorability of Treatment

Regressions like these are estimated frequently in political science.

The function $h_0(W)$ is generally called a control function.

If mean ignorability of treatment and $E(u_1 \mid W) = E(u_0 \mid W)$ are both true, then when the control function is added to the regression of $P$ on 1 and $T$, biases are controlled, allowing the estimate of $\alpha$ to serve as a consistent estimator of ATE and ATE($W$).
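A simulated sketch of the control-function regression, under assumptions chosen only for illustration: a single observable $W$, treatment that depends on $W$ plus independent noise, a control function that is linear in $W$, and $E(u_1 \mid W) = E(u_0 \mid W)$ by construction. The parameter values and the function name are hypothetical, and the outcome is kept on a continuous scale for simplicity.

```python
import numpy as np


def control_function_ols(n=200_000, seed=1):
    """Compare the coefficient on T with and without the control function."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)                              # observable W
    t = (w + rng.normal(size=n) > 0).astype(float)      # treatment depends on W
    u0 = 0.2 * w + rng.normal(scale=0.1, size=n)        # E(u0 | W) = 0.2 W
    u1 = 0.2 * w + rng.normal(scale=0.1, size=n)        # same conditional mean as u0
    mu0, mu1 = 0.3, 0.4                                 # true ATE = mu1 - mu0 = 0.1
    p = np.where(t == 1, mu1 + u1, mu0 + u0)            # observed choice

    ones = np.ones(n)
    short = np.column_stack([ones, t])                  # P on 1 and T
    long = np.column_stack([ones, t, w])                # adds the control function
    a_short = np.linalg.lstsq(short, p, rcond=None)[0][1]
    a_long = np.linalg.lstsq(long, p, rcond=None)[0][1]
    return a_short, a_long


a_short, a_long = control_function_ols()
print(f"coefficient on T without controls: {a_short:.3f}")
print(f"coefficient on T with W included:  {a_long:.3f}")
```

In this setup the short regression overstates the effect because $W$ raises both the chance of treatment and the outcome, while the regression that includes $W$ should return a coefficient close to the assumed 0.1. The result depends on the linear control function being correctly specified, which is guaranteed here only because the data are simulated that way.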

How Reasonable is the Ignorability of Treatment Assumption? In Observational or Nonexperimental Data

As Heckman (2005) points out, the ignorability assumption means that the conditional marginal treatment effect, MTE, also equals the conditional average treatment effect, ATE($W$), which is arguably an unattractive assumption since it implies that the average individual is indifferent between being treated or not (in our analysis, being informed or not).

For example, it means that the effect of information on the person who is just indifferent between voting and not voting is the same as the effect of information on a person who is a priori inclined to vote.

Thus, nonparametric uses of propensity scores make strong implicit assumptions about marginal and average treatment effects.

Some assume only mean ignorability, as in Sekhon (2005), and report ATT($W$) rather than ATE($W$).

How Reasonable is the Ignorability of Treatment Assumption? In Observational or Nonexperimental Data

If we believe that whether an individual is informed or not (treatment choice) is related, through unobservables not controlled for, to his or her potential choices if informed or not (whether he or she would vote), then ignorability of treatment does not hold.

Suppose that Louise and Sam have identical propensity scores as estimated using data available on observable variables such as gender (we assume they are both female in this example), race and ethnicity, income, home ownership, partisan identification, and so forth.

What is unobservable is the value they place on being a “good citizen” by voting and by being informed about political matters.

How Reasonable is the Ignorability of Treatment Assumption? In Observational or Nonexperimental Data

Louise values being a good citizen but Sam places no value on it. For Louise the value of being a good citizen by voting outweighs the cost of voting, so whether she is informed or not, she will vote. Also assume that she receives more utility from casting an informed vote than an uninformed vote (the value she places on being a citizen is higher). So she will choose to be informed. Her decision to be informed is a function of the choices she would make if informed or uninformed and the value she places on citizen duty.

How Reasonable is the Ignorability of Treatment Assumption? In Observational or Nonexperimental Data

In contrast, Sam places zero value on being a good citizen by voting or by being informed.

For Sam, the cost of voting outweighs the benefit from voting, whether she (this is Samantha, not Samuel) is informed or not. So whether she is informed or not, she will not vote.

However, we assume that if she is informed and does not vote, she will experience some regret from not participating, so she receives greater utility from being an uninformed nonvoter than being an informed nonvoter.

So, given that she will not vote regardless of whether she is informed, and that she receives higher utility from being an uninformed nonvoter than an informed nonvoter, she will choose to be uninformed.

How Reasonable is the Ignorability of Treatment Assumption? In Observational or Nonexperimental Data

On measurable variables, Louise and Sam are identical, except that one is informed and the other is not, and the one who is informed votes and the other does not.

Since being informed (treated) is a function of Louise’s and Sam’s potential or counterfactual choices even when controlling for all observable variables, ignorability of treatment does not hold.

Ignorability of treatment assumes that Louise’s behavior is the counterfactual of Sam’s behavior if informed and Sam’s behavior is the counterfactual of Louise’s behavior if uninformed.

But this is not true.
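The Louise and Sam story can be written out as a small sketch. The class and function names are hypothetical, and the example reduces each voter to a single unobservable (whether she values being a good citizen), which is a simplification of the utility comparison described in the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voter:
    name: str
    civic_duty: bool  # unobservable to the researcher

    def chooses_information(self) -> bool:
        # Treatment choice: only the voter who values good citizenship gets informed.
        return self.civic_duty

    def votes(self, informed: bool) -> bool:
        # Potential choice: the same whether informed or not, so `informed` is unused.
        return self.civic_duty


def turnout(group, informed):
    """Share of a group that votes in the given information state."""
    return sum(v.votes(informed) for v in group) / len(group)


def observed_gap(voters):
    """Turnout among the informed minus turnout among the uninformed."""
    informed = [v for v in voters if v.chooses_information()]
    uninformed = [v for v in voters if not v.chooses_information()]
    return turnout(informed, True) - turnout(uninformed, False)


def true_effects(voters):
    """Each voter's own causal effect of information on voting."""
    return [int(v.votes(True)) - int(v.votes(False)) for v in voters]


voters = [Voter("Louise", civic_duty=True), Voter("Sam", civic_duty=False)]
print("observed gap:", observed_gap(voters))
print("true individual effects:", true_effects(voters))
```

Because the two voters match on every observable, a comparison of the informed voter with the uninformed one attributes the whole turnout gap to information, even though neither voter would change her vote if her information state were switched.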

How Reasonable is the Ignorability of Treatment Assumption? In Observational or Nonexperimental Data

It is possible to estimate ATT using matching on propensity scores, or to estimate causal effects using regressions with controls, and assume only mean ignorability of treatment.

This implies that the expected voting choices of voters with observables like Louise and Sam if informed, and the expected choices if uninformed, are independent of whether they are informed, and that the distribution of effects of potential voting choices as perceived by these voters is such that these effects wash out.

While ignorability is not directly testable, there are sensitivity tests that help assess whether ignorability holds; see Rosenbaum (2002). See also Rivers and Vuong, and Heckman (2005).
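A rough sketch of the ATT idea, with loud simplifications: the propensity score is treated as known rather than estimated, it takes only three values, and the estimator is exact stratification on the score rather than a nearest-neighbour matching routine. All numbers and function names are hypothetical, and mean ignorability holds in the simulated data by construction.

```python
import numpy as np


def att_by_score_strata(p, t, score):
    """Within-stratum treated-minus-control means, weighted by treated counts."""
    weighted_sum, n_treated = 0.0, 0
    for s in np.unique(score):
        treated = (score == s) & (t == 1)
        control = (score == s) & (t == 0)
        if treated.sum() == 0 or control.sum() == 0:
            continue  # no overlap in this stratum, so it cannot be matched
        diff = p[treated].mean() - p[control].mean()
        weighted_sum += diff * treated.sum()
        n_treated += treated.sum()
    return weighted_sum / n_treated


def simulate(n=300_000, seed=3):
    rng = np.random.default_rng(seed)
    score = rng.choice([0.2, 0.5, 0.8], size=n)        # assumed known propensity score
    t = (rng.random(n) < score).astype(int)             # treatment drawn from the score
    effect = np.where(score == 0.8, 0.3, 0.1)           # larger effect for high-score voters
    p0 = 0.2 + 0.5 * score + rng.normal(scale=0.1, size=n)
    p = p0 + t * effect                                  # observed outcome
    return p, t, score


p, t, score = simulate()
att_hat = att_by_score_strata(p, t, score)
att_true = (0.2 * 0.1 + 0.5 * 0.1 + 0.8 * 0.3) / (0.2 + 0.5 + 0.8)
print(f"estimated ATT: {att_hat:.3f}   ATT implied by the design: {att_true:.3f}")
```

The ATT implied by this design is larger than the ATE (which averages the three effects equally) because treated voters are drawn disproportionately from the high-score stratum. That is the sense in which reporting ATT($W$) answers a different question from ATE($W$).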

Interaction Terms & Estimating Causal Relationships

A recent trend in political science empirical research is to use multiple interaction terms in an effort to loosen some of the restrictiveness of the implicit assumptions described above, particularly if we think that the effect of the causal variable, the treatment, is mitigated or part of some general imprecise process including some other observable variables. See Basinger and Lavine (2005) in their study of political alienation, knowledge, and campaigns on voting behavior. Can we use interaction terms and relax some of the assumptions above? Using interaction terms does allow us to relax the assumption that $E(u_1 \mid W) = E(u_0 \mid W)$. If we do so we lose the equality between ATE and ATT, but we can devise a regression equation that estimates these values. That is, if $E(u_1 \mid W) \neq E(u_0 \mid W)$, then:

$$E(P \mid T, W) = \mu_0 + \alpha T + h_0(W)\beta_0 + T\,[h_1(W) - h_0(W)]$$
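As a short worked step, differencing the regression function across the two treatment states shows where the covariate-specific effect comes from. The block assumes the regression function has the form $E(P \mid T, W) = \mu_0 + \alpha T + h_0(W)\beta_0 + T[h_1(W) - h_0(W)]$; every term that does not involve $T$ cancels.

```latex
\begin{aligned}
ATE(W) &= E(P \mid T = 1, W) - E(P \mid T = 0, W) \\
       &= \alpha + \left[ h_1(W) - h_0(W) \right]
\end{aligned}
```

So $\alpha$ alone equals the ATE only when the bracketed term averages to zero over the distribution of $W$.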

Interaction Terms & Estimating Causal Relationships

We can estimate the treatment effect if we include in our standard regression analysis interaction terms between the control function and the treatment variable. Wooldridge (2002, page 613) points out that if linearity is assumed, so that the coefficient on T measures ATE, the researcher should demean the values of W (he discusses procedures that can be used to do so and adjustments to standard errors that might be necessary). Assuming that these adjustments are appropriately made, we can estimate ATE(W) for given ranges of independent variables by combining the coefficient on T with the coefficients on the interaction terms times the different values of the independent variables that interest us.
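The demeaning step can be illustrated with a small simulation. This is a minimal sketch under assumed conditions, not the procedure from Wooldridge: the data-generating process, the variable names, and the numbers (`true_ate`, `slope_gap`) are all hypothetical, treatment is assigned independently of W, and no standard-error adjustment is attempted.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (an assumption for illustration only).
W = rng.normal(loc=2.0, scale=1.0, size=n)          # observable covariate
T = (rng.uniform(size=n) < 0.5).astype(float)       # treatment, independent of W here
true_ate = 1.5                                      # effect at the population mean of W
slope_gap = 0.8                                     # how the effect changes with W
P = 0.5 + 0.3 * W + T * (true_ate + slope_gap * (W - 2.0)) + rng.normal(size=n)

# Demean W before interacting it with T, so the coefficient on T estimates ATE.
W_c = W - W.mean()
X = np.column_stack([np.ones(n), T, W_c, T * W_c])  # intercept, T, W, T x W
coef, *_ = np.linalg.lstsq(X, P, rcond=None)
_, alpha_hat, beta_hat, delta_hat = coef

def ate_at(w):
    """Estimated ATE(W = w): coefficient on T plus interaction coefficient times demeaned w."""
    return alpha_hat + delta_hat * (w - W.mean())

print("coefficient on T:", round(float(alpha_hat), 3))
print("estimated ATE at W = 3:", round(float(ate_at(3.0)), 3))
```

Without the demeaning, the coefficient on T would instead estimate the effect at W = 0, which need not be a value of interest.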

Interaction Terms & Estimating Causal Relationships

In summary, if mean ignorability of treatment is true, then the standard political science approach of estimating causal effects through including control variables and interaction terms can provide estimates of ATE and ATE(W, M). The assumption of mean ignorability of treatment is important for making these statements. If this assumption does not hold, and as we see shortly there are good reasons for thinking that it does not, then the estimating equations, even including interaction effects, cannot give us accurate estimates of causal effects.

Bartels' Work on Information & Voting

Bartels (1996) evaluates the effect of information on major party voting choices in presidential elections using American National Election Studies (ANES) survey data. Restricts analysis to respondents who gave a choice of a major party candidate (thus excluding nonvoters and those who voted for minor party candidates). Uses a probit equation instead of LPM. As his information variable he uses the interviewer ratings of subject information. Interviewers rate subjects' information levels from "very high" to "very low."

Bartels' Work on Information & Voting

Assigns voters cardinal information scores to represent each of the five possible levels: 0.05, 0.2, 0.5, 0.8, or 0.95. Label this variable T. However, these numbers are best viewed as approximate assignments in an unspecified interval of information ranges around them, since within these categories respondents vary in information level. So, for example, some of those classified as very high information and assigned T = 0.95 may have information levels above 0.95 and some might have information levels below 0.95. Bartels assumes that the variable T is bounded between 0 and 1, with T = 1 if a voter is fully informed and T = 0 if a voter is fully uninformed.
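The coding step can be sketched as a simple lookup. Only the two endpoint labels ("very high", "very low") and the five scores come from the slide; the three intermediate labels and the function name `code_information` are assumptions made for illustration.

```python
# Map interviewer ratings to cardinal information scores T.
# Intermediate labels are assumed; the scores themselves are from the slide.
INFO_SCORES = {
    "very high": 0.95,
    "fairly high": 0.8,   # assumed label
    "average": 0.5,       # assumed label
    "fairly low": 0.2,    # assumed label
    "very low": 0.05,
}

def code_information(ratings):
    """Return the information score T for each interviewer rating."""
    return [INFO_SCORES[r.strip().lower()] for r in ratings]

T = code_information(["Very high", "average", "very low"])
print(T)
```

Because no respondent is coded exactly 0 or 1, statements about fully informed or fully uninformed voters are extrapolations slightly beyond the range of the coded data.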

Bartels' Work on Information & Voting

As control variables in the probit equations Bartels includes demographic variables that measure the following characteristics: age (which is entered nonlinearly), education, income, race, gender, marital status, homeownership, occupational status, region and urban, and religion. In the probit, he interacts these independent variables with both T and (1 - T) so assigned. This is a generalization of Quandt's switching regression model.
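The interaction scheme can be sketched as a design matrix in which every control appears twice, once weighted by T and once by (1 - T). This is an illustrative construction, not Bartels' code; the function name `switching_design` and the toy covariates are hypothetical.

```python
import numpy as np

def switching_design(X, T):
    """Stack T * X and (1 - T) * X side by side.

    The first block carries the 'informed' coefficients and the second
    block the 'uninformed' coefficients in the probit index.
    """
    X = np.asarray(X, dtype=float)
    T = np.asarray(T, dtype=float).reshape(-1, 1)
    return np.hstack([T * X, (1.0 - T) * X])

# Hypothetical respondents: constant, age, and one indicator variable.
X = np.array([[1.0, 30.0, 1.0],
              [1.0, 55.0, 0.0]])
T = np.array([0.95, 0.2])

D = switching_design(X, T)
print(D.shape)
```

At T = 1 the second block vanishes and at T = 0 the first block vanishes, which is what lets the two sets of coefficients be read as describing fully informed and fully uninformed voters.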

Bartels' Work on Information & Voting

Bartels argues then that the coefficients on the independent variables when interacted with T are the effects of these variables on voting behavior when a voter is fully informed, and that the coefficients on the independent variables when interacted with (1 - T) are the effects of these variables on voting behavior when a voter is completely uninformed. He then compares the goodness of fit of the model with the information variable interacted in this way to the goodness of fit of a probit estimation of voting choice as a function of the independent variables without the information variables; he finds that in every presidential election year from 1972 to 1992 the estimation including information effects improves the fit, and that in 1972, 1984, and 1992 the improvement is large enough to reject the hypothesis of no information effects.
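One standard way to formalize a comparison of nested probit models like this is a likelihood-ratio test. Whether this matches the exact statistic Bartels reports is an assumption here; the symbols $\ln L_{\text{info}}$, $\ln L_{\text{no info}}$, and $k$ are introduced only for illustration.

```latex
LR = 2\left[\ln L_{\text{info}} - \ln L_{\text{no info}}\right]
\;\sim\; \chi^2_{k} \quad \text{under } H_0: \text{no information effects}
```

Here $k$ is the number of restrictions imposed by the model without information effects, that is, the number of coefficients forced to be equal for informed and uninformed voters.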

Bartels' Work on Information & Voting

Using simulations and the clever way that he has coded and interacted the information variable, Bartels then makes a number of comparisons of how different types of voters, according to demographic characteristics, would or would not change their vote choices if they moved from completely uninformed to fully informed, and how electoral outcomes might actually have been different if the electorate had been fully informed. Finds that there are large differences in how his simulated fully informed and fully uninformed women, Protestants, and Catholics vote, but that the effects of education, income, and race on voting behavior are similar for the simulated fully informed and fully uninformed voters. Argues that his results show that incumbent presidents received about five percent more support and Democratic candidates almost two percent more support than they would have if voters had been fully informed.
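The logic of such a simulation can be sketched as follows: compute each respondent's predicted vote probability at the observed T, recompute it with T set to 1 for everyone, and compare the averages. All coefficients and data below are made up for illustration; they are not Bartels' estimates, and the helper names `norm_cdf` and `vote_prob` are hypothetical.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def vote_prob(X, T, beta_informed, beta_uninformed):
    """Probit probability with index T * X b_informed + (1 - T) * X b_uninformed."""
    X = np.asarray(X, dtype=float)
    T = np.asarray(T, dtype=float)
    index = T * (X @ beta_informed) + (1.0 - T) * (X @ beta_uninformed)
    return np.array([norm_cdf(v) for v in index])

# Hypothetical respondents (constant plus one indicator) and coefficients.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T = np.array([0.05, 0.5, 0.8, 0.95])
beta_informed = np.array([0.0, 0.5])
beta_uninformed = np.array([0.3, -0.2])

actual = vote_prob(X, T, beta_informed, beta_uninformed).mean()
fully_informed = vote_prob(X, np.ones_like(T), beta_informed, beta_uninformed).mean()
deviation = actual - fully_informed  # aggregate gap attributed to incomplete information

print(f"actual={actual:.3f} fully_informed={fully_informed:.3f} gap={deviation:+.3f}")
```

Note that this calculation holds every respondent's coefficients fixed while changing everyone's T at once, which is exactly the step that relies on SUTVA.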

Problem of Generalizing from Individual Results

Implications hinge crucially on belief in SUTVA, as Sekhon (2005) notes. It is extremely doubtful that we can assume that the treatment effects are fixed as we vary the number of informed voters in the population, and thus highly speculative to argue what would occur in the two worlds. How then can we estimate the effect of large, aggregate changes in information on voting behavior? In order to aggregate up from individual-level data, we need to assume SUTVA, which is highly suspect. Thus, the answer must be to use data at an aggregate level. If the unit of analysis is at the aggregate level, then the measured effect will be at the aggregate level and will take into account possible equilibrium & cross-effects from changing aggregate information levels.

Problem of Generalizing from Individual Results

What can be done? Certainly it is easy to measure aggregate voting behavior, but then the problem is measuring aggregate levels of voter information and, even more difficult, having significant enough variation in voter information to discern treatment effects, as well as having a large enough dataset to be able to show results that are statistically significant. Can we devise experiments to handle this problem?

Problems in Choosing Control Variables & Mediating Variables

Sekhon (2005) also asserts that because Bartels' estimation does not include known variables that can affect voter choices such as partisan identification (a mediating variable), the results overstate the effect of information on voting behavior. The reasons for excluding these variables are not in contention; that is, the values of these variables are likely to be affected by voter information and thus mask the effect of information on voting behavior in the survey data. This is an important issue in choosing control variables in analysis of causal effects that is often ignored in political science research.

Mediating Variables in Control Functions

Definition (Mediating Variable): A variable through which treatment variables can affect potential outcomes. Mediating variables are functions of treatment variables and potential outcomes are functions of mediating variables.

Problems in Choosing Control Variables

Figure 4.1: Control and Endogenous Variables

Say we are interested in the causal effect of T, information, on Y, voting behavior, shown by the arrow that goes from T to Y. Y is also a function of an observable variable, X, partisan identification, and of unobservable variables, U, which we assume affect X but not T.

R B Morton (NYU) EPS Lecture 3 ExpClassLectures 39/92 Problems in Choosing Control Variables

Figure 4.1: Control and Endogenous Variables

U is not a confounder of the relationship between T and Y. But suppose we control for X, partisan identification, in estimating the effect of information on voting behavior. Since X is a descendant, or a function, of both U and T, conditioning on X makes U and T associated even though they are otherwise independent. Controlling for X therefore makes U a confounder of the relationship between T and Y. Intuitively, when measuring the effect of T on Y, we remove part of the effect of information on voting behavior that is mediated through partisan identification because of the confounding.
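The argument about controlling for X can be checked with a small simulation. The sketch below is only an illustration under assumed values: the linear functional forms, the unit coefficients, and the helper names `simulate`, `cov`, and `slope_on_t` are hypothetical and do not come from the lecture or from Bartels's data. It draws T independently of U, lets X depend on both, and compares the coefficient on T with and without X as a control.

```python
import random


def simulate(n=50000, seed=0):
    """Draw (T, X, Y) from an assumed linear version of Figure 4.1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = 1.0 if rng.random() < 0.5 else 0.0   # information, independent of U
        u = rng.gauss(0, 1)                      # unobservable
        x = t + u + rng.gauss(0, 1)              # partisan identification: T -> X <- U
        y = t + x + u + rng.gauss(0, 1)          # vote index: T -> Y, X -> Y, U -> Y
        data.append((t, x, y))
    return data


def cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def slope_on_t(data, control_for_x):
    """OLS coefficient on T, with or without X as a control variable."""
    t, x, y = zip(*data)
    if not control_for_x:
        return cov(t, y) / cov(t, t)
    # Closed-form coefficient on T in a regression of Y on T and X.
    num = cov(t, y) * cov(x, x) - cov(t, x) * cov(x, y)
    den = cov(t, t) * cov(x, x) - cov(t, x) ** 2
    return num / den


data = simulate()
total_effect = slope_on_t(data, control_for_x=False)  # near 2: direct effect 1 plus 1 via X
controlled = slope_on_t(data, control_for_x=True)     # near 0.5: neither total nor direct
print(round(total_effect, 2), round(controlled, 2))
```

Under these assumed coefficients the uncontrolled regression recovers the total effect of T, while the regression that controls for X recovers neither the total effect nor the direct effect, because conditioning on X ties T to the unobserved U.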

However, leaving out these variables does result in models of voting behavior that do not fit well, and there can be numerous misclassification errors, which bothers Sekhon. This is probably why many political scientists tend to include variables that arguably create such confounding. Bartels's work is exceptional in this respect. Yet can the poor fit be a problem in itself? Sekhon argues that the estimated effects Bartels finds may be inaccurate if the consequences of the misclassifications are not considered. Sekhon re-estimates the model with a matching procedure designed to determine treatment effects that are robust to these errors (which we discuss more fully later) and finds that the information effects are not significant.

Reduced Form Equations & Implicit Models

Bartels contends that the estimating equation he uses can be thought of as a reduced form version of a more elaborate model that is unspecified. The argument is that by including as control variables only "essentially fixed" demographic and social characteristics, using a flexible estimation procedure, and experimenting with multiple alternative specifications that yield similar results, he can accurately detect the effect of information on voting behavior in the reduced form estimation. Many political scientists do this. However, the reduced form equation estimated is not actually solved for from a fully specified, more elaborate model; it is only hypothesized to be the solution to one. Bartels is therefore using an RCM approach to the problem, implicitly assuming SUTVA as well as some version of ignorability of treatment, which may be untrue, and thus the estimates may be inconsistent.

How might ignorability of treatment be false, and the results inconsistent, in equations that are reduced form versions of unstated models? We construct a simple example with a binary treatment variable that is either 0 or 1. We allow the treatment variable to be endogenously determined, as a function of the same demographics that affect voting as well as of other factors.

$$Y_j^* = X \beta_{Yj} + u_j$$

$$T^* = X \beta_T + Z \theta + v$$

$$Y_j = 1 \text{ if } Y_j^* > 0, \; 0 \text{ otherwise}$$

$$T = 1 \text{ if } T^* > 0, \; 0 \text{ otherwise}$$

$$Y = T Y_1 + (1 - T) Y_0$$

where $Y_j^*$ is the latent utility that a voter receives from voting for the Republican candidate over the Democratic opponent under treatment $j$, and $Y_j$ is a binary variable that represents whether an individual votes Republican or Democrat under treatment $j$.
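To make the roles of the latent and observed variables concrete, here is a minimal sketch that draws voters from this model. Everything numeric in it is an assumption for illustration: X and Z are taken to be scalars, the coefficient values are made up, the three errors are drawn independently, and `simulate_voter` is a hypothetical helper name.

```python
import random


def simulate_voter(x, z, beta_y0, beta_y1, beta_t, theta, rng):
    """Draw one voter from the latent-variable model (scalar X and Z for brevity)."""
    # Assumption: independent standard normal errors.
    u0, u1, v = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    y0_star = x * beta_y0 + u0            # latent utility if uninformed (j = 0)
    y1_star = x * beta_y1 + u1            # latent utility if informed (j = 1)
    t_star = x * beta_t + z * theta + v   # latent propensity to be informed
    y0 = int(y0_star > 0)                 # potential vote under j = 0
    y1 = int(y1_star > 0)                 # potential vote under j = 1
    t = int(t_star > 0)                   # realized treatment
    y = t * y1 + (1 - t) * y0             # only one potential vote is observed
    return {"T": t, "Y0": y0, "Y1": y1, "Y": y}


rng = random.Random(42)
voters = [
    simulate_voter(rng.gauss(0, 1), rng.gauss(0, 1), -0.2, 0.4, 0.5, 1.0, rng)
    for _ in range(1000)
]
informed_share = sum(voter["T"] for voter in voters) / len(voters)
print(informed_share)
```

In real data only T and Y would be available; the simulation keeps both potential votes so that the switching equation for Y can be inspected directly.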

Define $u = T u_1 + (1 - T) u_0$. Assume that (u, v) is independent of X and Z and distributed as bivariate normal with mean zero, and that each has unit variance. Since there is no manipulation by an experimenter or nature, we do not include M as an exogenous variable; however, if there were manipulation, then M would be included there. In our formulation, the equation equivalent to what Bartels estimates is the following probit model:

$$\Pr(Y = 1) = \Phi\big((1 - T) X \beta_{Y0} + T X \beta_{Y1}\big),$$

where $\Phi$ is the standard normal cumulative distribution function.
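A short derivation may help show where this probit form comes from. It is a sketch under one assumption that is made explicit in the third line: that $u_j$ is independent of $v$, and hence of $T$ given $X$ and $Z$. Here $Y_j^*$ denotes the latent utility under treatment $j$ and the probability is written conditional on $X$ and $T$.

$$
\begin{aligned}
\Pr(Y = 1 \mid X, T = j) &= \Pr(Y_j^* > 0 \mid X, T = j) \\
&= \Pr(u_j > -X \beta_{Yj} \mid X, T = j) \\
&= \Pr(u_j > -X \beta_{Yj}) \qquad \text{(if } u_j \text{ is independent of } v\text{)} \\
&= 1 - \Phi(-X \beta_{Yj}) = \Phi(X \beta_{Yj}).
\end{aligned}
$$

Because $T$ takes only the values 0 and 1, $\Phi(X \beta_{Yj})$ evaluated at $j = T$ is the same as $\Phi\big((1 - T) X \beta_{Y0} + T X \beta_{Y1}\big)$. If the independence in the third line fails, the conditional probability is no longer a simple probit in $X$ and $T$.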

Since T is an endogenous variable in the underlying model, can the estimation of this equation yield consistent estimates of the effect of information on voting behavior? Define $\rho = \mathrm{Corr}(u, v)$. If $\rho \neq 0$, then u and v are correlated, and probit estimation is inconsistent for $\beta_{Yj}$. Why might this occur? If we think that there are unobservable variables that affect both whether a voter is informed and how he or she votes, then these errors may be correlated. One such variable is cognitive ability; another unobserved variable that might lead to problems is the value that individuals place on citizen duty. Thus unobserved variables like cognitive abilities and the value that voters place on citizen duty might lead to a correlation between the error terms and to inconsistent estimates of the effect of information on voting.
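The inconsistency can be illustrated with a simulation. The sketch below rests on simplifying assumptions that are not in the lecture: X contains only a constant, a single error u is used for both treatment states, the coefficient values (0 and 0.5) are made up, and `naive_probit_gap` is a hypothetical helper name. With only a constant and T in the model, the probit estimate for each group is the inverse normal CDF of that group's vote share, so no optimizer is needed.

```python
import math
import random
from statistics import NormalDist


def naive_probit_gap(rho, n=100000, seed=0, beta_0=0.0, beta_1=0.5):
    """Naive estimate of beta_1 - beta_0 from treated and untreated vote shares."""
    rng = random.Random(seed)
    votes = {0: [], 1: []}
    for _ in range(n):
        z = rng.gauss(0, 1)
        u = rng.gauss(0, 1)
        v = rho * u + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)  # Corr(u, v) = rho
        t = int(z + v > 0)                                       # endogenous information
        beta = beta_1 if t else beta_0
        votes[t].append(int(beta + u > 0))                       # observed vote
    # Saturated probit: each group's coefficient is the inverse normal CDF
    # of that group's observed vote share.
    inv = NormalDist().inv_cdf
    b0_hat = inv(sum(votes[0]) / len(votes[0]))
    b1_hat = inv(sum(votes[1]) / len(votes[1]))
    return b1_hat - b0_hat


gap_exogenous = naive_probit_gap(rho=0.0)   # close to the assumed true gap of 0.5
gap_endogenous = naive_probit_gap(rho=0.6)  # well above 0.5: selection on unobservables
print(round(gap_exogenous, 2), round(gap_endogenous, 2))
```

When the errors are uncorrelated the naive comparison recovers the assumed gap, and when they are positively correlated the informed group is selected on high values of u, so the estimated effect of information is overstated.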

Figure 4.2: Ignorability of Treatment

We are interested in the effect of T on Y. T is a function of observables, Z, and unobservables, V. Y is a function of T and U, which contains both observables and unobservables.

Note that we allow Z to be related to U; Z may even overlap with some of the observables in U. U and T are correlated through the observables. When we use control functions, we are controlling for these effects.

The assumption of ignorability of treatment is that the observables are the only avenue through which U and T are correlated. If U and V are correlated, then selection on unobservables is occurring and there are common omitted variables in the estimation of the treatment effect.

How can a researcher find out if $\rho \neq 0$? Rivers and Vuong (1988) propose a test of exogeneity that can be applied to our problem. To do so, a researcher first estimates a probit of T on X and Z and saves the residuals. The researcher then estimates her equation including the residuals as an additional independent variable. If the coefficient on the residuals is significantly different from zero, then the researcher has an endogeneity problem. Evidence suggests that in studies of the effect of information on voting behavior, endogeneity can be a problem. As Wooldridge (2002, chapter 15) discusses, consistent estimates of the causal effect of information in the underlying model can be obtained through maximum likelihood methods that explicitly incorporate and estimate ρ.
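The two-step logic of the test can be sketched in code. Note the substitution: the procedure described above uses probit at both steps, whereas this sketch swaps in ordinary least squares (linear probability models) so that it runs with the Python standard library alone. It is therefore a linear analogue of the idea, not the Rivers and Vuong test itself. The data-generating values, the absence of X, and the helper names `ols` and `exogeneity_t_stat` are all assumptions for illustration.

```python
import math
import random


def ols(rows, y):
    """OLS coefficients, standard errors, and residuals; rows include a constant."""
    k, n = len(rows[0]), len(rows)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [xtx[i] + [float(i == j) for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [a / piv for a in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(s2 * inv[i][i]) for i in range(k)]
    return beta, se, resid


def exogeneity_t_stat(rho, n=20000, seed=1):
    """t-statistic on the first-step residual in the second-step regression."""
    rng = random.Random(seed)
    z, t, y = [], [], []
    for _ in range(n):
        zi, u = rng.gauss(0, 1), rng.gauss(0, 1)
        v = rho * u + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)  # Corr(u, v) = rho
        ti = 1.0 if zi + v > 0 else 0.0
        z.append(zi)
        t.append(ti)
        y.append(1.0 if 0.5 * ti + u > 0 else 0.0)
    # Step 1: regress T on Z and save the residuals.
    _, _, v_hat = ols([[1.0, zi] for zi in z], t)
    # Step 2: regress Y on T and the saved residuals.
    beta, se, _ = ols([[1.0, ti, vh] for ti, vh in zip(t, v_hat)], y)
    return beta[2] / se[2]


t_exog = exogeneity_t_stat(rho=0.0)   # residual coefficient should be insignificant
t_endog = exogeneity_t_stat(rho=0.6)  # residual coefficient should be clearly nonzero
print(round(t_exog, 2), round(t_endog, 2))
```

A large t-statistic on the saved residuals signals that the unobservables driving T also drive Y, which is the endogeneity problem the test is meant to detect. For applied work, a probit-based implementation from an established statistics package would be the appropriate choice.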

R B Morton (NYU) EPS Lecture 3 ExpClassLectures 50/92 a subjective measure by revealing complete information to voters & asking them if they voted correctly a normative measure where they used information from a pre-treatment survey of political attitudes & preferences.

Construct measures of whether subjects voted correctly in two ways –

Laboratory Experiment on How Voters Use Heuristics in Making Voting Choices

Bartels (1996), by investigating whether uninformed voters choose as if they are informed, is evaluating an implication of the hypothesis that uninformed voters use heuristics and information shortcuts in order to vote correctly; recall the theories of voting. In order to evaluate how voters use heuristics Lau and Redlawsk create a computer generated hypothetical campaign where voters are exposed to heuristics as well as have opportunities to acquire more substantive information. Able to monitor electronically which heuristics subjects use and how often & compare to votes in hypothetical election

R B Morton (NYU) EPS Lecture 3 ExpClassLectures 50/92 a subjective measure by revealing complete information to voters & asking them if they voted correctly a normative measure where they used information from a pre-treatment survey of political attitudes & preferences.

Laboratory Experiment on How Voters Use Heuristics in Making Voting Choices

Bartels (1996), by investigating whether uninformed voters choose as if they are informed, is evaluating an implication of the hypothesis that uninformed voters use heuristics and information shortcuts in order to vote correctly; recall the theories of voting. In order to evaluate how voters use heuristics Lau and Redlawsk create a computer generated hypothetical campaign where voters are exposed to heuristics as well as have opportunities to acquire more substantive information. Able to monitor electronically which heuristics subjects use and how often & compare to votes in hypothetical election Construct measures of whether subjects voted correctly in two ways –

R B Morton (NYU) EPS Lecture 3 ExpClassLectures 50/92 a normative measure where they used information from a pre-treatment survey of political attitudes & preferences.

Laboratory Experiment on How Voters Use Heuristics in Making Voting Choices

Bartels (1996), by investigating whether uninformed voters choose as if they are informed, is evaluating an implication of the hypothesis that uninformed voters use heuristics and information shortcuts in order to vote correctly; recall the theories of voting. In order to evaluate how voters use heuristics Lau and Redlawsk create a computer generated hypothetical campaign where voters are exposed to heuristics as well as have opportunities to acquire more substantive information. Able to monitor electronically which heuristics subjects use and how often & compare to votes in hypothetical election Construct measures of whether subjects voted correctly in two ways – a subjective measure by revealing complete information to voters & asking them if they voted correctly

Laboratory Experiment on How Voters Use Heuristics in Making Voting Choices

Bartels (1996), by investigating whether uninformed voters choose as if they are informed, is evaluating an implication of the hypothesis that uninformed voters use heuristics and information shortcuts in order to vote correctly; recall the theories of voting.

In order to evaluate how voters use heuristics, Lau and Redlawsk create a computer-generated hypothetical campaign in which voters are exposed to heuristics and also have opportunities to acquire more substantive information.

They are able to monitor electronically which heuristics subjects use and how often, & to compare this with subjects' votes in the hypothetical election.

They construct measures of whether subjects voted correctly in two ways:

a subjective measure, obtained by revealing complete information to voters & asking them whether they voted correctly;

a normative measure, for which they used information from a pre-treatment survey of political attitudes & preferences.

Laboratory Experiment on How Voters Use Heuristics in Making Voting Choices

Did not manipulate heuristic exposure, allowing subjects to choose which heuristics to use.

By controlling the information available to subjects beyond heuristics & the cost of using heuristics, attempted to control for confounding.

To evaluate the effect of heuristic use on whether a subject voted correctly, estimate a logit regression with voting correctly as the dependent variable & heuristic use as an independent variable.

Also interact heuristic use with a measure of political sophistication.

Thus, implicitly assuming mean ignorability of treatment & SUTVA.

Find that heuristic use increases the probability of voting correctly if voters are more politically sophisticated, suggesting unsophisticated voters are less able to use heuristics.
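A logit with an interaction term of the kind described above might be written as follows. This is an illustrative sketch only: $C_i$ (voted correctly), $H_i$ (heuristic use), $S_i$ (political sophistication) and the $\beta$ coefficients are symbols assumed here, and the specification actually estimated by Lau and Redlawsk may include further controls.

```latex
\Pr(C_i = 1 \mid H_i, S_i) = \Lambda(\beta_0 + \beta_1 H_i + \beta_2 S_i + \beta_3 H_i S_i),
\qquad \Lambda(z) = \frac{1}{1 + e^{-z}}
```

In this form the effect of heuristic use on the logit index is $\beta_1 + \beta_3 S_i$, so the reported pattern, in which heuristics help more sophisticated voters, would correspond to $\beta_3 > 0$.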

Using Regression Control Methods with Experimental Data

In many cases researchers using experiments estimate regression equations with control functions to deal with observable variables that they think might be confounders (e.g., Gerber, Kaplan, & Bergan; Clinton & Lapinski).

They are implicitly assuming both mean ignorability of treatment and $E(u_1 \mid W) = E(u_0 \mid W)$.

Humphreys (2009) presents a number of examples in which $E(u_1 \mid W) = E(u_0 \mid W)$ does not hold although mean ignorability holds because manipulations are randomly assigned.

He derives a monotonicity condition & shows that when the condition holds the estimated treatment effects are guaranteed to lie between estimates of ATT & the average treatment effects for non-treated subjects.
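To see where the second assumption enters, it may help to write out a switching regression. The following is a sketch of a standard formulation, assuming potential outcomes $Y_j = \mu_j + u_j$ with $E(u_j) = 0$ for $j \in \{0, 1\}$; the symbols $\mu_0$ and $\mu_1$ are notation assumed here, not taken from the slides.

```latex
\begin{aligned}
Y &= \mu_0 + (\mu_1 - \mu_0)\,T + u_0 + T\,(u_1 - u_0), \\
E(Y \mid T, W) &= \mu_0 + (\mu_1 - \mu_0)\,T + E(u_0 \mid W)
  + T\,\bigl[E(u_1 \mid W) - E(u_0 \mid W)\bigr].
\end{aligned}
```

The second line uses mean ignorability, $E(u_j \mid T, W) = E(u_j \mid W)$. When $E(u_1 \mid W) = E(u_0 \mid W)$ the bracketed term vanishes, so the coefficient on $T$ in a regression that controls for $W$ is $\mu_1 - \mu_0$, the ATE. When the two conditional means differ, the bracketed term is an interaction between $T$ and a function of $W$ that a regression without interactions omits.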

Using Regression Control Methods with Experimental Data

If confident that $E(u_1 \mid W) = E(u_0 \mid W)$ is true, including control variables as independent variables will result in an estimate of ATE.

If $E(u_1 \mid W) \neq E(u_0 \mid W)$, then should consider adding interaction terms or using nonparametric methods such as matching.

Alternatively, might randomize within strata of observable variables that one believes ex ante are likely to be confounders.

Gerber, Kaplan, & Bergan randomized within some of the observable variables from an initial survey: intention to vote, paper-reader (non-Post/non-Times), mentioned ever reading a paper, received a magazine, or asked whether they wish they read newspapers more.
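Randomizing within strata can be sketched in a few lines. The example below is a minimal illustration, not a reproduction of the procedure Gerber, Kaplan, & Bergan used: the field names `intends_to_vote` and `reads_paper`, the two-arm design, and the equal split within each stratum are all assumptions made here.

```python
import random
from collections import defaultdict


def randomize_within_strata(subjects, stratum_of, seed=0):
    """Assign treatment (1) or control (0) separately inside each stratum."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)

    assignment = {}
    for members in strata.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2  # assumed design: half of each stratum treated
        for position, subject in enumerate(shuffled):
            assignment[subject["id"]] = 1 if position < half else 0
    return assignment


# Hypothetical pre-treatment survey with two binary observables.
subjects = [
    {"id": i, "intends_to_vote": i % 2 == 0, "reads_paper": i % 4 < 2}
    for i in range(16)
]
assignment = randomize_within_strata(
    subjects, lambda s: (s["intends_to_vote"], s["reads_paper"])
)
```

Because treatment is balanced within every combination of the stratifying variables, those variables cannot be correlated with assignment in the realized sample.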

Using Regression Control Methods with Experimental Data

Random assignment of manipulations, when implemented ideally, allows one to compute ATE as the simple average of choices of subjects given their assignments.

Can estimate a simple bivariate switching regression equation for ATE.

Suppose have covariates & think that controls will increase the precision of estimates.

In finite samples, Freedman (2008a, 2008b) shows that including controls results in biased OLS estimates even when manipulations are randomly assigned.

Green (2009) shows in simulated & actual examples that biases tend to be negligible for sample sizes greater than 20.

Argues that cases where biases might occur in larger experiments are those with extreme outliers, which are readily detected through visual inspection.
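The first two points above can be checked numerically: with a binary treatment, the difference in mean outcomes between assigned groups equals the slope from a regression of the outcome on the treatment indicator alone. The sketch below uses made-up data and hypothetical function names, and takes the "bivariate" regression to mean a regression of Y on T with an intercept and no controls.

```python
from statistics import mean


def difference_in_means(y, t):
    """Mean outcome of treated subjects minus mean outcome of controls."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    return mean(treated) - mean(control)


def bivariate_ols_slope(y, t):
    """OLS slope from regressing y on t with an intercept and no controls."""
    t_bar, y_bar = mean(t), mean(y)
    covariance = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    variance = sum((ti - t_bar) ** 2 for ti in t)
    return covariance / variance


# Made-up assignments and binary choices for eight subjects.
t = [1, 1, 1, 0, 0, 0, 0, 1]
y = [1, 0, 1, 0, 1, 0, 0, 1]

ate_means = difference_in_means(y, t)
ate_ols = bivariate_ols_slope(y, t)
```

Here both estimates equal 0.5. Adding covariates to the regression is what raises the finite-sample bias question studied by Freedman and Green.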

Using Time to Control for Confounding Unobservables

So far we have considered regressions with a single cross-sectional random sample similar to that used by Bartels.

But we often have data over time & can exploit the time dimension in order to determine causal effects.

At one extreme, when the number of cases is small relative to the time periods examined, researchers turn to time-series methods.

There is a host of issues to consider when using time series to establish causal relationships, but these are not particularly relevant to experimental political science, where researchers normally have a much larger set of cases relative to the time periods examined.

Using Time to Control for Confounding Unobservables

There are two types of datasets that are relevant: pooled cross-sectional data and panel data.

Pooled cross-sectional data is when for each time period we have a new random sample from the relevant population. The observations are thought of as independent but not identically distributed.

Panel data is when we have observations on the same group of individuals, geographical locations, and so forth over time.
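The distinction between the two dataset types comes down to whether the same units reappear across periods. The sketch below illustrates this with a hypothetical helper, `classify_dataset`, applied to made-up records; it assumes at least two periods and treats only a balanced panel (every unit observed in every period) as a panel. It is a rough check rather than a definition, since a fresh random sample could by chance contain a repeated unit.

```python
from collections import defaultdict


def classify_dataset(records):
    """Classify (unit_id, period) records as panel, pooled cross-section, or mixed."""
    units_by_period = defaultdict(set)
    for unit_id, period in records:
        units_by_period[period].add(unit_id)
    unit_sets = list(units_by_period.values())

    # Balanced panel: the same units are observed in every period.
    if all(units == unit_sets[0] for units in unit_sets):
        return "panel"

    # Pooled cross-section: each period is a fresh sample, so no unit repeats.
    no_overlap = all(
        first.isdisjoint(second)
        for i, first in enumerate(unit_sets)
        for second in unit_sets[i + 1:]
    )
    return "pooled cross-section" if no_overlap else "mixed"


# Made-up examples: the same two subjects twice versus two fresh samples.
panel = [("ann", 1), ("bob", 1), ("ann", 2), ("bob", 2)]
pooled = [("ann", 1), ("bob", 1), ("cat", 2), ("dan", 2)]
```

Repeated choices by the same experimental subjects produce records of the first kind.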

Using Time to Control for Confounding Unobservables

One way researchers evaluate the effects of manipulations by nature, or natural experiments, is with pooled cross-sectional data. We discuss these methods later, when we review procedures that estimate causality by sidestepping confounding variables. Here we explore panel data methods, which are used primarily to help control for confounding variables and to better determine causal effects by controlling for unobservables. Panel data methods are also useful for analyzing experimental data generated by repeated choices of the same subjects.

Panel Data & Control of Unobservables

One problem in determining the effect of information on voters' choices is that some unobserved variable might affect both how informed voters are and the choices they would make given their information. Examples include cognitive ability, the value an individual places on citizen duty, or some other unobservable characteristic. If we have repeated observations on individuals, can observe differences in voter information over time, and assume the individual-specific variable is constant, then a panel data set can perhaps allow us to control for the unobservable variable. Experimental data generated in the multiple-period cross-over designs discussed earlier can also be analyzed in this fashion.

Panel Data & Control of Unobservables

Assume that, according to the LPM, the expected dependent variable $P$ is given by the following simple model: $P_t = \alpha + \beta T_t + I + U_t$

where $t$ denotes the time period, $T_t$ represents the information state of an individual at time $t$, $I$ is an unknown individual-specific characteristic that is constant across $t$, and $U_t$ is a time- and individual-specific unobservable error. The model assumes that the only things that affect voter choices are information and individual characteristics.

The problem is that if $I$ is correlated with $T_t$ and $I$ is simply left in the error term, we cannot consistently estimate the effect of information on voting behavior. With just a single cross-section, the estimate of the effect of information on voting behavior is problematic unless we find a good proxy for $I$ or take an instrumental variable approach.
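The inconsistency described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the outcome is treated as continuous rather than as a probability, and the sample sizes, the 0.8 loading of information on $I$, and the function names are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_panel(n_units=2000, n_periods=4, alpha=0.2, beta=0.3, seed=0):
    """Simulate P_t = alpha + beta*T_t + I + U_t with I correlated with T_t."""
    rng = np.random.default_rng(seed)
    # Unit-specific characteristic, constant across periods (assumed standard normal).
    I = rng.normal(0.0, 1.0, size=(n_units, 1))
    # Information depends on I, so T_t and I are correlated by construction.
    T = 0.8 * I + rng.normal(0.0, 1.0, size=(n_units, n_periods))
    U = rng.normal(0.0, 1.0, size=(n_units, n_periods))
    P = alpha + beta * T + I + U
    return P, T

def pooled_ols_slope(P, T):
    """Slope from regressing P on a constant and T, leaving I in the error term."""
    y = P.ravel()
    X = np.column_stack([np.ones(y.size), T.ravel()])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

P, T = simulate_panel()
beta_hat = pooled_ols_slope(P, T)
print(f"true beta = 0.3, pooled OLS estimate = {beta_hat:.3f}")
```

Because $I$ is positively related to $T_t$ in this setup, the pooled estimate absorbs part of the effect of $I$ and lands well above the true coefficient.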

Strict Exogeneity

Can we estimate $P_t = \alpha + \beta T_t + I + U_t$ if we have panel data, using repeated observations on the same individuals as a way to control for $I$? Political scientists use a number of relatively common estimation procedures to control for $I$, such as random effects estimators, fixed effects estimators, dummy variables, and first differencing methods. All of these methods assume, at a minimum, strict exogeneity: once $T_t$ and $I$ are controlled for, $T_s$ has no partial effect on $P_t$ for $s \neq t$. Formally, assume: $E(P_t \mid T_1, T_2, \ldots, T_s, I) = E(P_t \mid T_t, I) = \alpha + \beta T_t + I$. The second equality is an assumption of linearity.

When strict exogeneity holds, we say that the $T_t$ are strictly exogenous conditional on the unobserved effect.
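To see why repeated observations can remove $I$, consider a brief sketch based on the model above. The difference operator $\Delta$ is notation introduced here for illustration, and first differencing is only one of the procedures listed:

```latex
\begin{align*}
P_t &= \alpha + \beta T_t + I + U_t \\
P_{t-1} &= \alpha + \beta T_{t-1} + I + U_{t-1} \\
\Delta P_t \equiv P_t - P_{t-1} &= \beta\,(T_t - T_{t-1}) + (U_t - U_{t-1})
  = \beta\,\Delta T_t + \Delta U_t
\end{align*}
```

Both $\alpha$ and $I$ cancel, so $I$ no longer sits in the error term. OLS on the differenced equation is consistent for $\beta$ when $\Delta T_t$ is uncorrelated with $\Delta U_t$, which requires $U_t$ to be uncorrelated with $T_{t-1}$ and $U_{t-1}$ to be uncorrelated with $T_t$. Strict exogeneity delivers exactly this.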

How Reasonable is Strict Exogeneity?

Strict exogeneity implies that the explanatory variable or variables in each time period (here a single one, information) are uncorrelated with the idiosyncratic error in every time period. It is relatively simple to show that if we include a lagged dependent variable, the error terms will necessarily be correlated with future explanatory variables, so strict exogeneity does not hold. Strict exogeneity therefore rules out feedback from current values of the dependent variable to future values of the explanatory variable, here from current voting choices to future information levels. Is this a problem in observational data but less so in experimental data?
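The claim about a lagged dependent variable can be sketched as follows. The coefficient $\rho$ and the augmented model are assumptions introduced here for illustration; they are not part of the model given earlier.

```latex
\begin{align*}
P_t &= \alpha + \beta T_t + \rho P_{t-1} + I + U_t \\
P_{t+1} &= \alpha + \beta T_{t+1} + \rho P_t + I + U_{t+1}
\end{align*}
```

The period $t+1$ equation has $P_t$ as an explanatory variable, and $P_t$ contains $U_t$ by the first line. If $U_t$ is uncorrelated with the other right-hand-side terms of the period $t$ equation, then $\mathrm{Cov}(P_t, U_t) = \mathrm{Var}(U_t) > 0$, so the period $t$ error is correlated with a future explanatory variable and strict exogeneity fails.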

Can We Relax Strict Exogeneity?

It is possible to estimate the partial effect of an explanatory variable while relaxing strict exogeneity; Wooldridge (2002, chapter 11) reviews the methods. The upshot is that panel data can control for these individual unobservables, but a researcher must take great care to understand the assumptions behind the estimation procedure and how reasonable they are for the particular dataset and research question. When these assumptions do not hold, or are unlikely to hold, the conclusions of the analysis are suspect.

Can We Relax Strict Exogeneity?

Finally, using these methods with panel data to control for unit-specific unobservables makes it impossible to determine the effect of observable variables that do not vary over time within a unit. Thus, panel data can be used to control for unobservables only if the treatment variable, the independent variable of interest, varies over time within units. For example, if a researcher wishes to use panel data to control for unobservable individual-specific effects in studying the effect of information on voting, the approach will work only if voters' information levels vary over the time period covered by the panel.
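A tiny numerical sketch shows why time-invariant observables drop out. The three-voter panel, the variable names `info` and `educ`, and the helper `within_transform` are hypothetical and serve only to illustrate the point; the within (demeaning) transformation is the one that underlies fixed effects estimation.

```python
import numpy as np

def within_transform(x):
    """Subtract each unit's time average; x has shape (n_units, n_periods)."""
    return x - x.mean(axis=1, keepdims=True)

# Hypothetical panel: 3 voters observed over 4 periods.
info = np.array([[0.0, 1.0, 1.0, 0.0],      # information level varies over time
                 [1.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 1.0]])
educ = np.array([[12.0, 12.0, 12.0, 12.0],  # observable that never changes within a voter
                 [16.0, 16.0, 16.0, 16.0],
                 [10.0, 10.0, 10.0, 10.0]])

info_within = within_transform(info)
educ_within = within_transform(educ)
educ_diff = np.diff(educ, axis=1)            # first differences, period to period

print("variation left in info:", info_within.std())
print("largest demeaned educ value:", np.abs(educ_within).max())   # 0.0
print("largest differenced educ value:", np.abs(educ_diff).max())  # 0.0
```

After either transformation the time-invariant column is identically zero, so no coefficient on it can be estimated, while the time-varying information variable still has variation to work with.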

A Panel Study of Turnout

Holbrook and McClurg (2005) use panel data in a regression to estimate the causal effects of presidential campaign activities on voter turnout across states. Their panel consists of the 50 states for the presidential elections of 1992, 1996, and 2000. They assume (page 702) “that the link between campaign activity and the mobilization of core voters is based on the information transmitted by campaigns.” However, the causal effect they investigate is that of variation in campaign activity across states, not variation in information levels.

A Panel Study of Turnout

They use a modification of a first differencing method to attempt to control for unobservable state variables. In the first differencing method, a researcher drops the first period of observations and uses the period-to-period changes in the dependent and independent variables in place of their original levels. This controls for unit-specific effects (here, state-specific effects) that could confound the causal effect, as discussed above.
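The standard first differencing method described here can be sketched in a few lines. This is a minimal illustration on simulated data, not Holbrook and McClurg's specification: the number of units, the coefficient value, the noise levels, and the function name `first_difference_slope` are all hypothetical.

```python
import numpy as np

def first_difference_slope(P, T):
    """Estimate beta by OLS on period-to-period changes.

    P, T: arrays of shape (n_units, n_periods). Differencing drops the first
    period and removes any additive unit-specific constant.
    """
    dP = np.diff(P, axis=1).ravel()
    dT = np.diff(T, axis=1).ravel()
    X = np.column_stack([np.ones(dT.size), dT])  # constant plus differenced regressor
    coef, *_ = np.linalg.lstsq(X, dP, rcond=None)
    return coef[1]

# Simulated panel (assumed values): 500 units, 3 periods, true beta = 0.5.
rng = np.random.default_rng(1)
n_units, n_periods, beta = 500, 3, 0.5
unit_effect = rng.normal(0.0, 1.0, size=(n_units, 1))   # unobserved, constant over time
T = 0.8 * unit_effect + rng.normal(0.0, 1.0, size=(n_units, n_periods))
U = rng.normal(0.0, 0.5, size=(n_units, n_periods))
P = 0.1 + beta * T + unit_effect + U

beta_fd = first_difference_slope(P, T)
print(f"true beta = {beta}, first-difference estimate = {beta_fd:.3f}")
```

Even though the unit effect is correlated with the regressor by construction, it cancels in the differences, so the estimate should sit close to the true value.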

A Panel Study of Turnout

Holbrook and McClurg use the first difference in turnout as their dependent variable but do not difference their independent variables. They do include lagged turnout as an independent variable, which, since it is a function of the other independent variables in the previous period, is an indirect method of first differencing. Because lagged turnout is also a function of the state-specific unobservable variable, including it brings that variable into the estimation as well, and creates the possibility that the error terms are correlated across years.
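A short sketch may help show why this specification behaves like a lagged dependent variable model. The coefficients $a$, $b$, $c$ and the error $e_t$ are hypothetical labels, and a single campaign-activity regressor $T_t$ stands in for the full set of independent variables:

```latex
\begin{align*}
P_t - P_{t-1} &= a + b\,T_t + c\,P_{t-1} + e_t \\
\Longleftrightarrow\quad P_t &= a + b\,T_t + (1 + c)\,P_{t-1} + e_t
\end{align*}
```

If turnout in the previous period is itself driven by a state-specific unobservable, say $P_{t-1} = \alpha + \beta T_{t-1} + I + U_{t-1}$, then the regressor $P_{t-1}$ carries $I$ and $U_{t-1}$ with it, which is the source of the possible correlation of errors across years.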

A Panel Study of Turnout

A potential problem with Holbrook and McClurg's analysis is the endogeneity of campaign activities, which are probably functions of the expected turnout effects, even when differenced. That is, campaign activities are choices candidates make in order to maximize their probability of winning. Stromberg (2006) demonstrates that campaign visits and advertisements closely match those predicted by a game-theoretic model in which candidates make these choices strategically, in reaction both to anticipated effects on voter choices in a state and to the activities of their competitors. An estimate of the causal effect of campaign activities should take this endogeneity into account, which is not possible in a single-equation regression.

Propensity Scores in Regressions

An alternative to the standard control function approach discussed above is to use propensity scores as control functions in the regression equations. What are propensity scores? Propensity scores are estimates of the probability of receiving treatment as a function of W, which we define as $\pi(W)$.

Propensity Scores in Regressions

Rosenbaum and Rubin (1983) suggest that propensity scores be estimated using a flexible logit model, where W and various functions of W are included. Others have estimated them using nonparametric methods (see Powell (1994) and Heckman, Ichimura, and Todd (1997)). Wooldridge (1999) shows that if we assume

1. mean ignorability of treatment,
2. that $E(P_1 - P_0 \mid W)$ is uncorrelated with $\mathrm{Var}(T \mid W, P)$,
3. that the parametric estimator of the propensity score is consistent,

then the OLS coefficient on T is also a consistent estimator of ATE.
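The two steps just described (a flexible logit for the propensity score, then OLS of the outcome on T and the estimated score) can be sketched on simulated data. This is only an illustration under stated assumptions: the data-generating process, the function names `fit_logit` and `ate_by_propensity_regression`, and the choice of W and W squared as the "flexible" terms are all hypothetical and are not taken from the cited papers.

```python
import numpy as np

def fit_logit(X, t, iters=25):
    """Logit coefficients by Newton-Raphson; X must contain a constant column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)                          # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def ate_by_propensity_regression(P, T, W):
    """OLS of P on a constant, T, and the estimated propensity score."""
    n = len(T)
    X_logit = np.column_stack([np.ones(n), W, W ** 2])   # assumed flexible terms
    pi_hat = 1.0 / (1.0 + np.exp(-X_logit @ fit_logit(X_logit, T)))
    X_ols = np.column_stack([np.ones(n), T, pi_hat])
    coef = np.linalg.lstsq(X_ols, P, rcond=None)[0]
    return coef[1]                                       # coefficient on T

# Hypothetical simulated data: treatment depends on W only, and the
# probability of turnout rises by 0.2 under treatment for everyone.
rng = np.random.default_rng(0)
n = 20000
W = rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-0.8 * W))
T = (rng.uniform(size=n) < pi_true).astype(float)
P = (rng.uniform(size=n) < (0.2 + 0.2 * T + 0.4 * pi_true)).astype(float)

ate_hat = ate_by_propensity_regression(P, T, W)
print(round(ate_hat, 3))
```

In this simulation the conditional mean of P is linear in T and $\pi(W)$ by construction, so the coefficient on T should land close to the assumed effect of 0.2. With real data that linearity is itself an assumption.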

Propensity Scores in Regressions

Ho, Imai, King, and Stuart (2007) point out that the last assumption, that the parametric estimator of the propensity score is consistent, is untestable. As they demonstrate, if the parametric estimation of the propensity score is inaccurate, then the estimated causal effects are inconsistent. They advocate that researchers make use of what they call the "propensity score tautology."

Propensity Scores in Regressions

Ho et al. suggest that researchers begin with a simple estimation of the propensity score using a logistic regression of T on W. To check for balance, the researcher matches (discussed shortly) each treated unit to the control unit with the most similar value of the estimated propensity score (nearest-neighbor matching on the propensity score). If the matching shows that W is balanced (Ho et al. discuss how to check for balance), then the researcher should use the propensity scores. If not, the researcher respecifies the estimation, adding interaction terms and/or squared terms, and checks balance again. If this fails, the researcher turns to more elaborate specifications.
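The estimate, match, check, respecify loop can be sketched as follows. This is a rough illustration, not Ho et al.'s software: the standardized-mean-difference balance measure, the 0.1 threshold, the particular sequence of specifications, and all function names are assumptions made for the example, and the data are simulated.

```python
import numpy as np

def fit_logit(X, t, iters=25):
    """Logit coefficients by Newton-Raphson; X must contain a constant column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, X.T @ (t - p))
    return beta

def std_mean_diffs(W, T, controls=None):
    """Treated-minus-control covariate means, scaled by the full-sample SD."""
    if controls is None:
        controls = W[T == 0]
    return (W[T == 1].mean(axis=0) - controls.mean(axis=0)) / W.std(axis=0)

def nearest_neighbor_controls(ps, T, W):
    """Covariates of the control closest in score to each treated unit."""
    ps_t, ps_c = ps[T == 1], ps[T == 0]
    idx = np.abs(ps_t[:, None] - ps_c[None, :]).argmin(axis=1)  # with replacement
    return W[T == 0][idx]

def select_propensity_model(W, T, threshold=0.1):
    """Try richer logit specifications until W is balanced after matching.

    If no specification passes, the richest one tried is returned.
    """
    const = np.ones((len(T), 1))
    specs = {
        "linear": np.hstack([const, W]),
        "squares": np.hstack([const, W, W ** 2]),
        "squares+interaction": np.hstack(
            [const, W, W ** 2, W[:, [0]] * W[:, [1]]]),
    }
    for name, X in specs.items():
        ps = 1.0 / (1.0 + np.exp(-X @ fit_logit(X, T)))
        smd = std_mean_diffs(W, T, nearest_neighbor_controls(ps, T, W))
        if np.all(np.abs(smd) < threshold):
            break
    return name, ps, smd

# Hypothetical simulated data with two covariates.
rng = np.random.default_rng(1)
n = 2000
W = rng.normal(size=(n, 2))
T = (rng.uniform(size=n)
     < 1.0 / (1.0 + np.exp(-(0.8 * W[:, 0] - 0.5 * W[:, 1])))).astype(float)

raw = std_mean_diffs(W, T)                    # imbalance before matching
name, ps, smd = select_propensity_model(W, T) # imbalance after matching
print(name, np.round(raw, 2), np.round(smd, 2))
```

Note that the loop never looks at the outcome variable, which is the property of propensity score estimation that the next slide attributes to Sekhon (2005).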

Which is Better in Regressions as Controls: Propensity Scores or Standard Control Functions?

Sekhon (2005) contends that an advantage of propensity scores is that they are estimated independently of outcome or choice data, while control functions in regressions are not. This allows for the application of a standard for determining the optimal model that balances the covariates, independent of the outcome data. Propensity scores also appear parsimonious compared to the kitchen-sink-like regression with control functions.

Which is Better in Regressions as Controls: Propensity Scores or Standard Control Functions?

The appearance of parsimony is deceiving, since the propensity scores themselves are estimated in a similar kitchen-sink fashion. If the propensity score is estimated in linear regressions without interaction effects, the estimations are identical. Moreover, the standard errors in the kitchen-sink regression have known reliability, and adjustments for problems can be made as is standard in regression analysis.
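The equivalence claimed for the linear case can be checked numerically. In the sketch below (simulated data; all variable names are hypothetical), the coefficient on T from a regression on a constant, T, and W is compared with the coefficient on T from a regression on a constant, T, and a propensity score fitted by a linear regression of T on W with no interactions. The two agree up to rounding, because partialling out W and partialling out the linear fitted score leave the same residual variation in T.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical simulated data with three covariates.
rng = np.random.default_rng(2)
n = 500
W = rng.normal(size=(n, 3))
T = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-W[:, 0]))).astype(float)
P = (rng.uniform(size=n)
     < (0.3 + 0.2 * T + 0.1 * np.tanh(W[:, 1]))).astype(float)

const = np.ones((n, 1))

# (a) "Kitchen sink": P on a constant, T, and every element of W.
b_controls = ols(np.hstack([const, T[:, None], W]), P)[1]

# (b) Propensity score from a *linear* regression of T on W (no
#     interactions), then P on a constant, T, and that fitted score.
XW = np.hstack([const, W])
pi_lin = XW @ ols(XW, T)
b_pscore = ols(np.hstack([const, T[:, None], pi_lin[:, None]]), P)[1]

print(b_controls, b_pscore)
```

The equality is algebraic and holds for any data set; it breaks as soon as the score is estimated by a nonlinear model such as a logit or includes terms that the control regression omits.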

Which is Better in Regressions as Controls: Propensity Scores or Standard Control Functions?

However, in regressions with propensity scores, the first-stage estimation is sometimes ignored in computing the standard error and the proper adjustments are not made. Researchers who use propensity scores need to be careful in the construction of the standard error and in making these adjustments. Furthermore, the two techniques rely on different assumptions, and arguably the ones underlying propensity scores are more restrictive and less likely to be satisfied. Finally, the use of the propensity score depends on the assumption that the parametric model used is consistent.
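One way to let the first-stage estimation feed into the standard error is to bootstrap the whole two-step procedure, re-estimating the propensity score inside every resample. The lecture does not prescribe this remedy; it is offered here only as an illustrative sketch, with a simulated data set and hypothetical function names (`two_step_estimate`, `bootstrap_se`).

```python
import numpy as np

def fit_logit(X, t, iters=25):
    """Logit coefficients by Newton-Raphson; X must contain a constant column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, X.T @ (t - p))
    return beta

def two_step_estimate(P, T, W):
    """Stage 1: logit of T on W. Stage 2: OLS of P on T and the fitted score."""
    n = len(T)
    X = np.column_stack([np.ones(n), W])
    pi_hat = 1.0 / (1.0 + np.exp(-X @ fit_logit(X, T)))
    Z = np.column_stack([np.ones(n), T, pi_hat])
    return np.linalg.lstsq(Z, P, rcond=None)[0][1]

def bootstrap_se(P, T, W, reps=200, seed=0):
    """Standard error from resampling units and redoing BOTH stages each time."""
    rng = np.random.default_rng(seed)
    n = len(T)
    draws = []
    for _ in range(reps):
        idx = rng.integers(0, n, size=n)      # sample units with replacement
        draws.append(two_step_estimate(P[idx], T[idx], W[idx]))
    return float(np.std(draws, ddof=1))

# Hypothetical simulated data.
rng = np.random.default_rng(3)
n = 1000
W = rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-0.8 * W))
T = (rng.uniform(size=n) < pi_true).astype(float)
P = (rng.uniform(size=n) < (0.2 + 0.2 * T + 0.4 * pi_true)).astype(float)

estimate = two_step_estimate(P, T, W)
se = bootstrap_se(P, T, W)
print(round(estimate, 3), round(se, 3))
```

Taking the second-stage OLS standard error at face value would treat the fitted score as if it were known. The resampling loop above is one simple way to avoid that, at the cost of extra computation.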

Use of Controls & Propensity Scores without Regression

Nonregression methods are attractive since they do not require the linearity assumptions made in most regression analysis used in political science. Nonparametric approaches are particularly useful for discrete dependent variables such as voting behavior. That is, recall that we needed to restrict our attention to a binary dependent variable and then assume that the probability of turnout was given by the linear probability model (LPM), even though we knew that this placed unrealistic assumptions on that probability. If we use as our estimating equation a more realistic probit, logit, or multinomial variant, then the linearity assumptions break down and estimating ATE is more complicated.

Use of Controls & Propensity Scores without Regression

We can avoid these complications by noting that, if we assume mean ignorability of treatment, it can be shown that the conditional treatment effects, ATT(W) and ATE(W), are equal, although the unconditional ones are not necessarily equal. Specifically, $ATE = E[ATE(W)]$ and $ATT = E[ATE(W) \mid T = 1]$. If ATE(W) can be estimated, then the average treatment effects can be estimated. It turns out that if we have a random sample, then the $E[ATE(W) \mid T = j]$ are nonparametrically identified, and from them ATE can be estimated as well. However, obtaining asymptotically valid standard errors using this approach can be very difficult, making nonparametric estimation less desirable than the standard regression approach.
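A small worked example may help separate the conditional effect ATE(W) from the two unconditional averages. The records below are made up for illustration (a binary W, a binary treatment T, and a binary choice P), and the function names are hypothetical. Within each value of W the effect is the treated-minus-control difference in mean outcomes; ATE averages those differences over everyone, ATT over the treated only.

```python
from collections import defaultdict

# Hypothetical records (W, T, P): covariate, treatment, observed choice.
data = (
    [(0, 1, 1), (0, 1, 0)]                            # W = 0, treated
    + [(0, 0, 1), (0, 0, 0), (0, 0, 0), (0, 0, 0)]    # W = 0, control
    + [(1, 1, 1)] * 4                                 # W = 1, treated
    + [(1, 0, 1), (1, 0, 0)]                          # W = 1, control
)

def conditional_effects(records):
    """ATE(W): treated-minus-control mean outcome within each value of W.

    Requires at least one treated and one control unit for every W.
    """
    cells = defaultdict(lambda: {0: [], 1: []})
    for w, t, p in records:
        cells[w][t].append(p)
    return {w: sum(g[1]) / len(g[1]) - sum(g[0]) / len(g[0])
            for w, g in cells.items()}

def average_effects(records):
    """ATE = E[ATE(W)] and ATT = E[ATE(W) | T = 1] as sample averages."""
    ate_w = conditional_effects(records)
    n = len(records)
    n_treated = sum(t for _, t, _ in records)
    ate = sum(ate_w[w] for w, _, _ in records) / n
    att = sum(ate_w[w] for w, t, _ in records if t == 1) / n_treated
    return ate, att

ate, att = average_effects(data)
print(ate, att)   # 0.375 and 5/12 (about 0.4167)
```

Here ATE(0) = 0.25 and ATE(1) = 0.5. Because treated units are concentrated at W = 1, ATT (5/12) exceeds ATE (0.375), and both differ from the naive difference in means between all treated and all control units (0.5).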

Use of Controls & Propensity Scores without Regression

Alternatively, propensity scores can be used directly to estimate causal effects nonparametrically if we assume

1. ignorability of treatment in its stricter version,
2. that propensity scores are strictly between 0 and 1 (that is, every observation has a positive probability of being in either state of the world),
3. that the parametric estimation of the propensity score is correct, such that the propensity scores are consistent.

Use of Controls & Propensity Scores without Regression

Then:

$$ATE = E\left\{ \frac{[T - \pi(W)]\,P}{\pi(W)\,[1 - \pi(W)]} \right\}$$

and

$$ATT = \frac{E\left\{ \dfrac{[T - \pi(W)]\,P}{1 - \pi(W)} \right\}}{\Pr(T = 1)}$$
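The two weighting expressions can be turned into sample averages directly. In the sketch below, the ATE term divides $[T - \pi(W)]P$ by $\pi(W)[1 - \pi(W)]$, while the ATT term divides it by $1 - \pi(W)$ and then by the sample share of treated units. The records are made up, the propensity score is simply the treated share within each value of a binary W, and the function names are hypothetical.

```python
# Hypothetical records (W, T, P): covariate, treatment, observed choice.
data = (
    [(0, 1, 1), (0, 1, 0)]                            # W = 0, treated
    + [(0, 0, 1), (0, 0, 0), (0, 0, 0), (0, 0, 0)]    # W = 0, control
    + [(1, 1, 1)] * 4                                 # W = 1, treated
    + [(1, 0, 1), (1, 0, 0)]                          # W = 1, control
)

def cell_propensity(records):
    """pi(W): share of treated units at each value of W (assumed estimator)."""
    counts = {}
    for w, t, _ in records:
        n_w, n_t = counts.get(w, (0, 0))
        counts[w] = (n_w + 1, n_t + t)
    return {w: n_t / n_w for w, (n_w, n_t) in counts.items()}

def weighting_estimates(records):
    """Sample analogues of the ATE and ATT weighting formulas."""
    pi = cell_propensity(records)
    n = len(records)
    pr_treated = sum(t for _, t, _ in records) / n
    ate = sum((t - pi[w]) * p / (pi[w] * (1 - pi[w]))
              for w, t, p in records) / n
    att = sum((t - pi[w]) * p / (1 - pi[w])
              for w, t, p in records) / n / pr_treated
    return ate, att

ate, att = weighting_estimates(data)
print(round(ate, 4), round(att, 4))
```

With these records $\pi(0) = 1/3$ and $\pi(1) = 2/3$, and the estimates work out to ATE = 0.375 and ATT = 5/12. Every score must lie strictly between 0 and 1, otherwise the weights are undefined, which is the second assumption listed for this approach.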

Use of Controls & Propensity Scores without Regression

These expectations can be estimated after estimating the propensity scores, both nonparametrically and using flexible parametric approaches. Again, note that both of these approaches assume ignorability of treatment, with the propensity score approach making the more restrictive version of that assumption and also requiring that the propensity score estimates be consistent.

Control by Matching: Propensity Scores and Matching

An alternative use of propensity scores in estimating the effects of a cause is through the process of matching on propensity scores. Again, this can be useful if the dependent variable is, as in our case, discrete. The idea behind matching is simple: suppose that we randomly draw a propensity score from the population and then match two voting choices from the population, where one individual is informed and the other individual is uninformed. If we assume ignorability of treatment in the stricter form, that the propensity scores are strictly between 0 and 1, and that the propensity score estimates are consistent, then we know that

$$E[P \mid T = 1, \pi(W)] - E[P \mid T = 0, \pi(W)] = E[P_1 - P_0 \mid \pi(W)]$$
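A sketch of why this equality holds, writing $\pi$ for $\pi(W)$ and assuming the observed choice is $P = T P_1 + (1 - T) P_0$ (this switching notation is an assumption, though it is consistent with the potential choices $P_1$ and $P_0$ used above):

```latex
\begin{align*}
E[P \mid T = 1, \pi] - E[P \mid T = 0, \pi]
  &= E[P_1 \mid T = 1, \pi] - E[P_0 \mid T = 0, \pi] \\
  &= E[P_1 \mid \pi] - E[P_0 \mid \pi] \\
  &= E[P_1 - P_0 \mid \pi].
\end{align*}
```

The first line substitutes the observed choice for each group. The second uses ignorability of treatment conditional on the propensity score, so that conditioning on T no longer matters. The requirement that $\pi$ lie strictly between 0 and 1 guarantees that both conditional expectations on the left are defined.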

R B Morton (NYU) EPS Lecture 3 ExpClassLectures 82/92 Control by Matching Propensity Scores and Matching

By doing this iteratively & averaging across the distribution of propensity scores, a researcher can compute ATE. Matching implicitly assumes that conditioned on W , some unspeci…ed random process assigns individuals to be either informed or uninformed. Since the process of assignment is random, the possible e¤ects of unobservables on voting choices wash out, allowing for accurate estimates of the causal e¤ect of information on voting. If we assume just mean ignorability of treatment, then we can estimate ATT [see Heckman, Ichimura, and Todd (1997) and Ho, Imai, King, and Stuart (2007)].
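The averaging step can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a procedure taken from the cited papers: the data-generating process, the discrete covariate W, the true effect of 2.0, and all function names are hypothetical. Because W is discrete here, the propensity score is simply the treated share within each W cell, and the ATE estimate is the stratum-size-weighted average of within-stratum differences in means.

```python
import random
from collections import defaultdict
from statistics import mean

def simulate(n=6000, effect=2.0, seed=0):
    """Toy data (assumed): W raises both the chance of treatment and the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = rng.choice([0, 1, 2])
        p = [0.2, 0.5, 0.8][w]              # true propensity P(T=1 | W=w)
        t = 1 if rng.random() < p else 0
        y = 1.0 + effect * t + 3.0 * w + rng.gauss(0, 1)
        rows.append((w, t, y))
    return rows

def estimated_propensity(rows):
    """With a discrete W, the estimated score is the treated share in each W cell."""
    cells = defaultdict(list)
    for w, t, _ in rows:
        cells[w].append(t)
    return {w: mean(ts) for w, ts in cells.items()}

def stratified_ate(rows):
    """Difference in means within each propensity stratum, averaged over strata."""
    score = estimated_propensity(rows)
    strata = defaultdict(lambda: {0: [], 1: []})
    for w, t, y in rows:
        strata[round(score[w], 6)][t].append(y)
    total = len(rows)
    ate = 0.0
    for groups in strata.values():
        size = len(groups[0]) + len(groups[1])
        ate += (size / total) * (mean(groups[1]) - mean(groups[0]))
    return ate

def naive_difference(rows):
    """Raw treated-minus-control difference, ignoring W."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)

rows = simulate()
ate_hat = stratified_ate(rows)   # should be close to the assumed effect of 2.0
naive = naive_difference(rows)   # biased upward because W is ignored
```

In this toy setup the naive comparison overstates the effect because treated units have higher W on average, while the stratified average removes that imbalance.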

Control by Matching Propensity Scores and Matching

The process is complicated because it is difficult to get exact matches for propensity scores, which of course must be estimated.

Thus, most researchers who use matching procedures also use some grouping or local averaging to determine similarities in terms of propensity between treated & nontreated observations, as well as exact matching on certain selected variables.

The methods employed are discussed in Ho, Imai, King, and Stuart (2007), Heckman, Ichimura, and Todd (1997), Angrist (1998), and Dehejia and Wahba (1999).

Ho et al. provide free software for matching and explain these procedures in detail in the documentation.
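One simple form of inexact matching is nearest-neighbor matching on an estimated propensity score. The sketch below is illustrative only and is not the software of Ho et al.: the simulated data, the single covariate w, the caliper of 0.05, and the helper names are all assumptions. The score is estimated by a logit fitted with Newton-Raphson, each treated unit is matched (with replacement) to the control with the closest score, and the average matched difference estimates ATT.

```python
import bisect
import math
import random

def simulate(n=4000, seed=3):
    """Toy data (assumed): w drives both treatment and outcome; true effect is 2."""
    rng = random.Random(seed)
    ws, ts, ys = [], [], []
    for _ in range(n):
        w = rng.gauss(0, 1)
        p = 1.0 / (1.0 + math.exp(-w))
        t = 1 if rng.random() < p else 0
        ws.append(w)
        ts.append(t)
        ys.append(1.0 + 2.0 * t + 1.5 * w + rng.gauss(0, 1))
    return ws, ts, ys

def fit_logit(ws, ts, iterations=25):
    """Newton-Raphson for P(T=1 | w) = 1 / (1 + exp(-(a + b*w)))."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for w, t in zip(ws, ts):
            p = 1.0 / (1.0 + math.exp(-(a + b * w)))
            v = p * (1.0 - p)
            g0 += t - p            # score for the intercept
            g1 += (t - p) * w      # score for the slope
            h00 += v
            h01 += v * w
            h11 += v * w * w
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def att_nearest_neighbor(scores, ts, ys, caliper=0.05):
    """Match each treated unit to the closest-score control, with replacement."""
    controls = sorted((s, y) for s, t, y in zip(scores, ts, ys) if t == 0)
    control_scores = [s for s, _ in controls]
    diffs = []
    for s, t, y in zip(scores, ts, ys):
        if t != 1:
            continue
        i = bisect.bisect_left(control_scores, s)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(controls)]
        j = min(candidates, key=lambda k: abs(control_scores[k] - s))
        if abs(control_scores[j] - s) <= caliper:   # drop poor matches
            diffs.append(y - controls[j][1])
    return sum(diffs) / len(diffs), len(diffs)

ws, ts, ys = simulate()
a_hat, b_hat = fit_logit(ws, ts)
scores = [1.0 / (1.0 + math.exp(-(a_hat + b_hat * w))) for w in ws]
att_hat, n_matched = att_nearest_neighbor(scores, ts, ys)
```

The caliper discards treated units with no sufficiently close control, which trades some sample size for better comparability; the appropriate width is a judgment call in practice.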

Matching and the Effect of Information on Voting

Sekhon (2005) looks at information & voting in survey data using a matching procedure he developed for the statistical software program R and available on his web page.

The procedure produces standard errors that take into account ties and sampling controls with replacement when finding matches.

Sekhon matches the observations using a combination of variables and propensity scores that are estimated using principal components of the baseline variables.

He uses a general procedure to estimate the principal components, including first-order interactions and nonlinear specifications of continuous variables.

Matching and the Effect of Information on Voting

Sekhon tests his use of the matching procedures for balance.

He first uses a number of nonparametric tests:

1. the Wilcoxon rank sum test for univariate balance,
2. the McNemar test of marginal homogeneity on paired binary data,
3. the Kolmogorov-Smirnov test for equality of distributions.

He also estimates a logistic model in which the dependent variable is treatment assignment and the baseline variables are explanatory.
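As an illustration of what a balance check measures, the two-sample Kolmogorov-Smirnov statistic is the largest gap between the empirical distribution functions of a covariate in the treated and control groups. The sketch below computes only the statistic, not a p-value, and does not reproduce Sekhon's implementation; the age values and variable names are hypothetical.

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    gap = 0.0
    for x in a + b:   # the supremum is attained at an observed data point
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(fa - fb))
    return gap

# Hypothetical baseline covariate (age) for illustration only.
treated_age = [34, 41, 45, 52, 58, 63]
all_controls_age = [19, 22, 25, 31, 38, 44, 47, 70]
matched_controls_age = [33, 40, 47, 51, 59, 64]

before = ks_statistic(treated_age, all_controls_age)
after = ks_statistic(treated_age, matched_controls_age)
```

A smaller statistic after matching indicates that the covariate's distribution is more similar across the two groups, which is what balance means here.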

Matching and the Effect of Information on Voting

Unlike the control function approach in regressions, the tests of balance are conducted independently of the examination of the relationship between information and voting.

Sekhon assumes only mean ignorability and thus reports only ATT(W) rather than ATE(W).

Panel Data Redux

As noted above, Bartels (1996) restricted the independent variables he used to those that could be considered exogenous and not influenced by voter information levels, which led to a high number of misclassification errors and poor general goodness of fit compared to other models of voting behavior.

Sekhon avoids this problem by using panel survey data and matching on the baseline or "pre-treatment" survey.

Using panel surveys from three NES datasets, 1980, 1972-1974-1976, and 1992-1994-1996, Sekhon finds no significant differences between the choices of informed and uninformed voters in United States elections at the time of the election (he finds effects at earlier points in the election campaign).

Panel Data Redux

In contrast, he finds that a similar analysis of Mexican voters in 2000 at the time of the election shows significant differences.

Sekhon conjectures that in more advanced democracies it is easier for voters to vote as if they were informed, even if they are not, due to communications that occur during election campaigns.

Note that Sekhon uses panel data as a way to control for observables, not unobservables.

He explicitly assumes, as in other matching procedures, mean ignorability of treatment.

Causal Effects Through Mediating Variables

Suppose we are interested in identifying the causal effect of information on voting through partisan identification, or the causal effect of partisan identification on voting.

More generally, we are interested in causal mediation effects.

Assume that X is a mediating variable.

Since X is now a function of T, we can think of two potential values for X: $X_0$ when T = 0, and $X_1$ when T = 1.

As with Y, we observe only one value of X for each individual, which is given by:

$$X = T X_1 + (1 - T) X_0$$

Causal Effects Through Mediating Variables

We can think of potential outcomes as functions of both the values of the mediating variable & treatment, such that $Y_{jX_j}$ is the potential value of Y given that T = j.

We define the observed outcome as:

$$Y = T\,Y_{1X_1} + (1 - T)\,Y_{0X_0}$$

The Causal Mediation Effect or CME is the effect on the outcome of changing the mediator value as affected by the treatment without actually changing the treatment value.

That is, CME is given by:

$$\mathrm{CME}(T) = Y_{TX_1} - Y_{TX_0}$$

Since we cannot observe counterfactual situations, we cannot observe CME(T).

Instead, we estimate the Average Causal Mediation Effect or ACME(T):

$$\mathrm{ACME}(T) = E(\mathrm{CME}(T))$$
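A simulation in which both potential mediator values are known for every unit makes these definitions concrete. This is a sketch under assumptions that do not come from the slides: a linear outcome equation with a mediator coefficient of 2.0, an average shift of 1.0 in the mediator caused by treatment, and hypothetical function names. In real data only one of $X_0$, $X_1$ is observed per unit, so CME(T) could not be computed this way.

```python
import random
from statistics import mean

def simulate_units(n=5000, seed=1):
    """Each unit carries both potential mediator values and an outcome shock."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        x0 = rng.gauss(0, 1)                  # mediator value if T = 0
        x1 = x0 + 1.0 + rng.gauss(0, 0.5)     # mediator value if T = 1
        e = rng.gauss(0, 1)                   # unit-specific outcome shock
        units.append((x0, x1, e))
    return units

def potential_outcome(t, x, e):
    """Y_{tx}: outcome under treatment t and mediator value x (assumed linear)."""
    return 0.5 + 1.0 * t + 2.0 * x + e

def cme(unit, t):
    """CME(t) = Y_{t X_1} - Y_{t X_0}: move the mediator, hold treatment at t."""
    x0, x1, e = unit
    return potential_outcome(t, x1, e) - potential_outcome(t, x0, e)

def observed(unit, t):
    """What a researcher sees: X = T*X_1 + (1-T)*X_0 and the matching outcome."""
    x0, x1, e = unit
    x = t * x1 + (1 - t) * x0
    return x, potential_outcome(t, x, e)

units = simulate_units()
acme_0 = mean(cme(u, 0) for u in units)   # ACME(0)
acme_1 = mean(cme(u, 1) for u in units)   # ACME(1)
```

Because the assumed outcome equation has no interaction between treatment and mediator, ACME(0) and ACME(1) coincide here and both are close to 2.0 × 1.0 = 2.0; with an interaction they would generally differ.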

Causal Effects Through Mediating Variables

Imai, Keele, and Yamamoto (2008) & Imai, Keele, and Tingley (2009) consider what assumptions are necessary to estimate ACME(T).

Random assignment of values of the mediator holding T constant, or randomizing T, cannot measure ACME(T), since the point is to measure the effects of the changes in the mediator value as a consequence of changes in T.

Thus, in order to estimate ACME(T), a researcher must adopt a control approach.

Causal Effects Through Mediating Variables

Imai et al. (2008) show that if the following axiom of sequential ignorability holds, then ACME(T) can be easily estimated (where W does not include X):

Sequential Ignorability: Conditional on W, $\{Y_{T'X}, X_T\}$ are independent of T, and $Y_{T'X}$ is independent of X.

The first assumption is that treatment assignment is ignorable with respect to both potential outcomes & potential mediators.

The second assumption is that the mediator values are also ignorable with respect to potential outcomes.

As Imai et al. observe, these are strong & untestable assumptions which are likely not to hold even if random assignment of treatment is conducted perfectly.
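To see what estimation can look like when sequential ignorability holds, consider a special case that is assumed here rather than taken from the slides: linear equations for the mediator and the outcome with no treatment-mediator interaction and independent errors. In that case ACME equals the product of the effect of T on X and the coefficient on X in a regression of Y on T and X. The sketch below simulates such data and computes that product; all parameter values and function names are hypothetical.

```python
import random
from statistics import mean

def simulate(n=20000, seed=2):
    """Toy data in which sequential ignorability holds by construction."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.randint(0, 1)                            # randomized treatment
        x = 0.3 + 1.0 * t + rng.gauss(0, 1)             # mediator; slope on T is 1.0
        y = 0.5 + 1.0 * t + 2.0 * x + rng.gauss(0, 1)   # outcome; slope on X is 2.0
        data.append((t, x, y))
    return data

def group_mean(data, t, index):
    """Mean of column `index` among units with treatment value t."""
    return mean(row[index] for row in data if row[0] == t)

def acme_product_of_coefficients(data):
    mx = {t: group_mean(data, t, 1) for t in (0, 1)}
    my = {t: group_mean(data, t, 2) for t in (0, 1)}
    # Effect of T on the mediator: difference in mean X between the two arms.
    alpha = mx[1] - mx[0]
    # Coefficient on X in an OLS of Y on T and X, obtained by partialling out T
    # (deviations from arm-specific means), as in the Frisch-Waugh-Lovell result.
    sxy = sum((x - mx[t]) * (y - my[t]) for t, x, y in data)
    sxx = sum((x - mx[t]) ** 2 for t, x, y in data)
    gamma = sxy / sxx
    return alpha * gamma

data = simulate()
acme_hat = acme_product_of_coefficients(data)   # true value in this setup: 1.0 * 2.0
```

If the mediator and outcome errors were correlated, for example through an unobserved variable that affects both partisan identification and vote choice, the second part of sequential ignorability would fail and this product would no longer recover ACME, even though T is randomized.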
