
Research Design

©Thomas Plümper 2007/08

Research Design

Thomas Plümper Department of Government University of Essex [email protected] ©Thomas Plümper 2007/08

Organization

Lesson 1: Why Research Design?

Lesson 2: What is a good Research Question?

Lesson 3: Basic Concepts, Discussions and Axioms in the Philosophy of Science

Lesson 4: Causality

Lesson 5: Theory Formation

Lesson 6: Theory and Empirical Analysis

Lesson 7: Qualitative Research Designs and Case Studies

Lesson 8: Quantitative Research Designs

Lesson 9: Robustness ©Thomas Plümper 2007/08

Lesson 10: Efficient Writing: A Summary of Issues

Lesson 11: ©Thomas Plümper 2007/08

Lesson 1: Why Research Design? ©Thomas Plümper 2007/08

What is Science?

“Science is a public process. It uses systems of concepts called theories to help interpret and unify observation statements called data; in turn the data are used to check or ‘test’ the theories. Theory creation may be inductive, but demonstration and testing are deductive, although, in inexact subjects, testing will involve . Theories that are at once simple, general and coherent are valued as they aid productive and precise scientific practice.”

David F. Hendry 1980 ©Thomas Plümper 2007/08

Why Research Design?

Axiom 1:

A research design is good if and only if it allows researchers to draw valid inferences.

btw:

An axiom is a sentence or proposition that is not proved or demonstrated and is considered either self-evident or an initial, necessary consensus for building or accepting a theory.

Nevertheless, axiom 1 leads to two questions:

1. Why should scientists be interested in valid inferences?

2. Can we prove that inferences are valid? ©Thomas Plümper 2007/08

Why should scientists be interested in valid inferences?

One step back: What are scientists interested in? ©Thomas Plümper 2007/08

Why (social) scientists should be interested in valid inferences 1

One step back: What are scientists interested in?

− maximizing life-time utility (which is a function of income, social status, 1 − uncertainty, and so on)
− getting tenure
− getting cited
− publications in a certain type of journal (a book with a very good publisher)

or, perhaps more seriously but certainly on a lower plane:

− explanations
− generalizations
− simplifications

or, in short:

− advance scientific knowledge (we repeatedly come back to this statement) ©Thomas Plümper 2007/08

Why (social) scientists should be interested in valid inferences 2

Axiom 2

Ultimately, (social) scientists are interested in theories which are simultaneously as simple and as general as possible.

It follows:

1. A simpler theory is better than a more complicated theory which does not explain more.

2. An equally simple theory that explains more is better than a theory that explains less.

3. A more complicated theory that explains more is not per se better than a less complicated theory that explains less.

‘More’ means more cases, more phenomena, … ©Thomas Plümper 2007/08

Why (social) scientists should be interested in valid inferences 3

We can see how complicated a theory is when we see one (or compare it to other theories).

We cannot see how valid the generalizations are that the theory makes.

Thus:

(Social) scientists need to develop theories and test them (test generalizations of the theory).

BUT: keep in mind that theories need to simplify. Thus, testing theories means testing whether the predictions of the theory are correct, not whether the assumptions are ‘true’ (or whatsoever).

Hence Axiom 3:

Valid inferences are a necessary condition for an appropriate test of a theory. ©Thomas Plümper 2007/08

And back to research design:

Axiom 1 reformulated:

A good research design is one that allows making valid inferences and thus is a necessary condition for an appropriate test of a theory. ©Thomas Plümper 2007/08

Lesson 2: What is a good Research Question

Axiom 4:

A good research question is one that leads to a theory, which has an ex ante probability of being correct close to 0.5.

‘Correct’ here means: the theory simplifies reality in a way that leads to generalizations which help understanding many real world phenomena. ©Thomas Plümper 2007/08

Why approximately 0.5?

Given that research should increase (or foster) the visibility of the researcher:

A theory which has a prior probability of finding empirical support close to 1.0 is trivial.

A theory which has a prior probability of finding empirical support close to 0.0 is risky.

Again, keep in mind that researchers test the predictions of a theory, not the assumptions.

Do we know prior probabilities? Of course, just ask your colleagues whether they think your hypotheses are correct. ©Thomas Plümper 2007/08

Falsification and Falsifiability

Karl Popper (1963):

Theories must be falsifiable.

Thus, the words ‘may’, ‘could’, ‘should’ and so on shall not be used in theories. If some action or effect is conditional, make the conditionality explicit, if it takes place only with a certain probability, make this clear.

Imre Lakatos (1973):

“The demarcation between science and pseudoscience is not merely a problem of armchair philosophy, it is of vital social and political relevance.”

David Hume (1748):

“Let us ask: does it [any volume] contain any abstract reasoning concerning quantity or numbers? (…) No. Commit it then to the flames.” ©Thomas Plümper 2007/08

Falsifiability

Popper uses the term falsifiability with two different meanings:

1. Falsifiability is a logical property of statements which requires that scientific statements logically imply at least one testable prediction.

2. Falsifiability is a normative construct, telling scientists that a test of a theory should try to refute it.

There is no relevant dissent with the first meaning, but the prescriptive meaning has led to huge controversies.

I use the term in the first sense and will explain why ‘naïve falsification’ does not lead to scientific progress. ©Thomas Plümper 2007/08

On Naïve Falsification

Thomas Kuhn and Imre Lakatos:

Abandoning a theory the instant it makes false predictions would eliminate too much good research.

Well, yes, but the main point is that

The huge majority of (social science) theories is not deterministic but probabilistic.

This implies that we cannot falsify a theory in Popper’s sense. Rather, we have to show that on average the theory’s predictions are wrong.

Thomas Kuhn (1962):

Actual scientists do not refute a theory simply because it makes false predictions (or even worse: one false prediction).

But let’s not talk about paradigmatic change and scientific revolutions here… ©Thomas Plümper 2007/08

‘Proving’ theories right?

Hume and Popper and Lakatos and in fact just about everyone agree that scientists cannot prove a theory to be right.

This made Hume and Popper stress that scientists need to try to prove a theory wrong.

Lakatos, however, claims that science is not a competition between theories. For Lakatos, research programs compete.

Research programs can either be progressive or degenerating: progressive research programs continue to predict novel facts, degenerating research programs fabricate theories in order to accommodate known facts.

But let’s repeat and keep in mind: Verification of truth is logically impossible. ©Thomas Plümper 2007/08

Can we test probabilistic theories? ©Thomas Plümper 2007/08

Can probabilistic theories be tested?

Of course, but scientists need to socially agree on a certain threshold which tells us when empirical evidence contradicts the probabilistic predictions derived from a theory ‘too much’. ©Thomas Plümper 2007/08

Probabilistic theories and verification

“Verification is logically impossible.” (page 13)

Yet, when studying the empirical relevance of probabilistic theories, we get a more complex result (the statement still remains correct as we will see).

Let’s turn to Bayesian Philosophy of Science …

Bayesians contend that confirmation is quantitative:

Evidence E confirms a hypothesis H if and only if it raises the probability of H: p(H|E) > p(H). Read: the probability of H (being correct) given empirical evidence E is larger than the probability of H prior to the presentation of empirical evidence E. We refer to these probabilities as the prior and posterior probability. ©Thomas Plümper 2007/08

Bayes’s Theorem

Axioms

1) Every probability is a real number between 0 and 1: 0 ≤ p(A) ≤ 1.

2) If A is necessarily true, then p(A) = 1.

Note: we cannot prove p(A) = 1, but we can believe it.

3) If A and B are mutually exclusive (it is not possible that statements A and B are both true), then p(A ∨ B) = p(A) + p(B).

4) p(¬A) = 1 − p(A).

5) If A logically entails B, then p(B) ≥ p(A).

6) p(A ∨ B) = p(A) + p(B) − p(A & B); if A and B are mutually exclusive, we get 3).

Bayes’s theorem in its simplest form

p(B | A) = p(A | B) · p(B) / p(A) ©Thomas Plümper 2007/08

Bayes and Advancement in Scientific Knowledge

A more complex form of Bayes’s theorem:

p(T | E) = p(E | T) · p(T) / [ p(E | T) · p(T) + p(E | ~T) · p(~T) ]

Read:

The probability that a theory T is correct given some evidence E is a function of

− the prior probability that T is correct,

− the expectedness of evidence E,

− the likelihood of evidence E given the theory, p(E|T).

If a theory is deterministic, then p(E|T)=1.

If a theory is probabilistic, then p(E|T)<1. ©Thomas Plümper 2007/08
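To make the updating rule concrete, here is a minimal numerical sketch in Python (my own illustration; the prior and likelihood values are invented for demonstration only):

```python
# Bayesian updating of the probability that a theory T is correct, given evidence E.
# All numbers are illustrative assumptions, not taken from the lecture.

def posterior(prior_T, p_E_given_T, p_E_given_notT):
    """p(T|E) = p(E|T) p(T) / [ p(E|T) p(T) + p(E|~T) p(~T) ]"""
    p_notT = 1.0 - prior_T
    return (p_E_given_T * prior_T) / (p_E_given_T * prior_T + p_E_given_notT * p_notT)

# A 'risky' theory (low prior) confirmed by unexpected evidence gains a lot ...
print(posterior(prior_T=0.2, p_E_given_T=0.9, p_E_given_notT=0.1))  # ~0.69

# ... while a 'trivial' theory (high prior) confirmed by expected evidence gains little.
print(posterior(prior_T=0.9, p_E_given_T=0.9, p_E_given_notT=0.8))  # ~0.91
```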

Problems

p(E), p(T) and p(~T) are difficult to determine.
p(E) might be biased upward due to old evidence.
p(~T) refers to rival theories, which may or may not exist.
p(T) may vary across researchers (early Bayesians suggested deriving p(T) from observed behavior). ©Thomas Plümper 2007/08

Choosing research questions according to Bayes

Scientists can ‘confirm’ theories in a Bayesian world. Confirmation, however, just means that new evidence increases the (inter-)subjective probability of T. A final proof is still impossible.

In other words: new evidence may induce us to increase the probability that a theory is correct.

The marginal effect, however, depends on the probability of the presented evidence. If the presented evidence was likely, then the posterior probability will only be slightly larger than the prior probability.

Most importantly: if scientists knew the evidence before and believed it was correctly gathered, then the effect of the presented evidence on p(T) is zero. Hence p(T|E) = p(T) if E was known and believed to be correct. However, scientists often believe that E was noisy, so that a re-analysis may increase p(T).

Moreover, p(T|E)-p(T) is larger when p(T) is smaller, for all p(E)<1.

Note, however, that Bayesian confirmation is possible and should be the goal of research.

Then: less likely tests are better than easy tests; evidence in favor of theories which are less likely to be true is more important than evidence in favor of a theory which is likely to be true.

The lower p(T) of a given individual, the more they will update their beliefs in the presence of confirmatory evidence. ©Thomas Plümper 2007/08

Summary: A good research question

Scientists are interested in simplification and generalization.

Simplification refers to the assumptions and (to a lesser extent) to causal mechanisms.

Generalization refers to the number of cases covered and the number of phenomena explained.

Scientists work in a world with priors.

Research is more important, the larger the impact of results on the priors.

For example, a scientific revolution has a huge impact, because it deflates priors.

Normal science does nothing to priors, thus the relevance of normal science is low:

power of E = p(T|E) / p(T). ©Thomas Plümper 2007/08

Lesson 3: Basic Concepts, Discussions and Axioms in the Philosophy of Science

Definitions:

Law, Explanation, Theory, Prediction, Evidence, Test, and all that… ©Thomas Plümper 2007/08

Laws

Today, scientists are usually interested in theories. This was not always the case:

Ernst Mach and Pierre Duhem claimed that the proper function of scientific theories was not to explain phenomena but merely to classify and summarize experimental laws.

Law: A statement containing an all-relation between explanans and explanandum.

For all A, B follows. ©Thomas Plümper 2007/08

The status of laws for social science research is weak

Try this: All democracies organize elections.

This may be true as the definition of a democracy centers on elections.

However, we are either bordering a tautology here or the statement is untrue as we may have more than one necessary condition:

Country A held an election. All democracies organize elections. Country A is a democracy.

Countries A and B are democracies. Democracies never fight wars against each other. War is absent between countries A and B.

But is the dyadic information (democracy|democracy) causing absence of war?

And btw: what about Israel – Lebanon? ©Thomas Plümper 2007/08

The Status of Laws for science in general is not particularly strong either

Ardon Lyon (1990):

1) All metals conduct electricity.
2) Whatever conducts electricity is subject to gravitational attraction.
3) All metals are subject to gravitational attraction.

All statements are true, but 1 and 2 are irrelevant for 3. Thus, 3 does not follow from 1 and 2, it is just trivially true.

Axiom: Laws without causal explanation are likely to be spurious. ©Thomas Plümper 2007/08

Theory

Definition:

A theory is a set of assumptions and propositions, which allows the derivation of at least one prediction.

Predictions are in principle testable, but not all predictions can be tested with the evidence at hand.

Example: Einstein’s theory of relativity predicted that light will be deflected by large masses. Accordingly, light from distant stars will be deflected by the sun. This prediction was tested by Arthur Stanley Eddington, who used an eclipse of the sun to verify Einstein’s prediction, but that was only years after the prediction was made.

Axiom: If some set of propositions does not make predictions, it is not a theory. ©Thomas Plümper 2007/08

Theories and Predictions

Theories are not per se correct. All theories necessarily simplify and all simplifications are necessarily (partly) wrong. Accordingly, all theories make wrong assumptions. Hence, evidence that the assumptions of a theory are wrong cannot falsify a theory.

All theories make ceteris paribus assumptions, which often remain implicit. Accordingly, in order to appropriately test the predictions of theories (hypotheses), researchers need to control for other influences. This can be done by

− statistically controlling for the influence of other explanatory variables on the dependent variable,
− case selection which holds all other influences constant,
− matching (if the noise in the data is low), and
− carefully designed experiments.

Theories make predictions. These predictions are testable. Evidence falsifying the predictions of a theory therefore falsifies the theory, if the test is appropriate and consistent. ©Thomas Plümper 2007/08

Theories and Assumptions

I just said: Testing assumptions is useless because all assumptions are wrong.

BUT: Some assumptions are better than others.

Assumptions should be consistent

if you assume that voters maximize income, then you should not assume that governments are ideological (without explaining why income-maximizing individuals establish ideological parties).

Assumptions should be different from the hypothesis that you derive:

if you assume that x influences y, then the prediction that y=f(x) is tautological.

And perhaps: assumptions should be plausible (though I’m not entirely convinced…) ©Thomas Plümper 2007/08

Theories and Causal Mechanism

Axiom: Theories identify at least one causal mechanism.

Example: A increases B, because …

They are answers to the ‘why’ question which are meant to be valid for a defined set of cases.

(in contrast, explanations are answers to the ‘why’ question which are valid for one case or an undefined subset of cases).

Theories can be used to explain cases; explanations can be generalized to theories. However, explanations that fit one case (or a limited number of cases) do not necessarily generalize to correct theories.

Axiom: Inductive arguments are not deductively valid.

(David Hume, Karl Popper, Carl Hempel, Wesley Salmon and many more…) ©Thomas Plümper 2007/08

Typology of Social Sciences Theories

Social science theories can be divided into three broad categories:

− functionalistic theories,

− cultural theories,

− rationalistic theories.

Functionalistic theories explain a macrophenomenon by a macrophenomenon, without providing a microfoundation. Functionalistic theories are often criticized for being just that: functionalistic theories.

Cultural theories explain social phenomena by underlying social norms and values.

Rational theories explain social phenomena by individual calculation and preferences and social interaction. ©Thomas Plümper 2007/08

The Status of Social Science Theories

In the social sciences, theories are far more common than laws. Also, theories are more plausible.

Yet, the status of theories is often weak. This holds especially true if theories allow the derivation of just one prediction. Typically, that prediction has long been known to be in line with empirical evidence.

From a Bayesian perspective, evidence known before the formulation of a theory cannot confirm the theory. The posterior is identical to the prior.

Axiom: Theories should allow the derivation of multiple hypotheses, at least one of the hypotheses should include a prediction not previously known to the scientific community (or at least not to the scientist). ©Thomas Plümper 2007/08

The Problem of Theory Comparison: Underdetermination

Empirical evidence can typically be explained by more than one theory.

Axiom (Humean underdetermination): For any finite body of evidence, there are indefinitely many mutually contrary theories, each of which logically entails that evidence.

Axiom: (Quinean Underdetermination): For any theory T, and any given body of evidence supporting T, there is at least one rival to T that is as well supported by that evidence.

Axiom (Duhemian Underdetermination): When a theory is being tested (say: by an experiment), it is not the prediction of the theory alone that is being tested. Auxiliary hypotheses, assumptions, and measurement instruments are simultaneously tested.

Hence, the refutation of a theory based on a single test is premature.

See also Lakatos on falsification. ©Thomas Plümper 2007/08

Theory and Research Designs

All theories can be tested by a variety of research designs – some better some worse.

The goal of ‘theory tests’ is neither falsification nor ‘proof’.

A single theory test cannot falsify a theory – see Lakatos and Duhem.

Empirical analysis and induction can never prove a theory – see Popper amongst others.

Hence, empirical analysis should aim at changing the perceived probability that a theory is correct and usefully simplifies reality.

Research design is important in this respect: Choose the design (and analyze evidence) that maximizes leverage on p(T|E). ©Thomas Plümper 2007/08

Induction versus (?) Deduction

If, on the one hand, inductive results are not deductively valid, then the question emerges:

What is the scientific status of induction?

If, on the other hand, we cannot prove deductive theories to be right, then the question emerges:

What is the scientific status of deduction? ©Thomas Plümper 2007/08

What is Induction?

Peter Lipton (1991):

“Inductive inference is a matter of weighting evidence and judging likelihood, not a proof.”

For Chomsky and Kuhn, this necessarily follows from the validity of the underdetermination thesis:

Axiom: Inductive arguments are deductively underdetermined (Curd and Cover p. 496).

Induction means that scientists generalize from ‘singular statements’ to ‘universal statements’. In other words, induction is generalization from observation to law.

Axiom: Induction is highly problematic in a world in which causal mechanisms are not deterministic.

Popper (1959): The question whether inductive inferences are justified (…) is known as the problem of induction. ©Thomas Plümper 2007/08

Answers to the Problem of Induction

Induction is never justified. (Popper)

Induction should be based on the method of difference (John Stuart Mill 1904).

The quality of inductive inferences depends on the random selection of cases and on the number of cases. (all Statistics textbooks, see also King, Keohane and Verba)

Axioms:

Inferences from typical cases are more valid than inferences from atypical cases. However: without deductive theory, how can we tell typical from atypical cases???

Inferences from a broader distribution of cases are more valid than inferences from a smaller number of cases.

[We will get back to these issues when we discuss qualitative research.]

In sum, the problem of induction is SERIOUS. ©Thomas Plümper 2007/08

Induction is highly problematic in a world in which causal mechanisms are not deterministic

Assume a non deterministic causal mechanism

             B
           yes   no
A   yes     40   10
    no      10   40

Assume that in 80 percent of cases with A, B follows. Assume that in 80 percent of cases with ¬A, ¬B follows.

A random draw of one single case has a zero probability of supporting correct inferences. A random draw of two cases has (nearly) a zero probability of supporting correct inferences, and so on. You would need approximately 20 cases to make approximately valid inferences. And THIS IS a trivial probabilistic theory with only two influences: A and noise. The world is never that simple.
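A small simulation sketch of this 80/20 mechanism (my own illustration, not part of the original slides): it draws samples of increasing size and records how often the observed association between A and B points in the right direction.

```python
import random

random.seed(42)

def draw_case():
    """One case from the probabilistic mechanism: p(B|A) = 0.8, p(B|not A) = 0.2."""
    a = random.random() < 0.5
    b = random.random() < (0.8 if a else 0.2)
    return a, b

def association_points_right_way(n):
    """True if, in a sample of n cases, B is more frequent under A than under not-A."""
    cases = [draw_case() for _ in range(n)]
    with_a = [b for a, b in cases if a]
    without_a = [b for a, b in cases if not a]
    if not with_a or not without_a:
        return False  # one of the groups is empty: no comparison possible
    return sum(with_a) / len(with_a) > sum(without_a) / len(without_a)

for n in (1, 5, 10, 20, 50):
    share = sum(association_points_right_way(n) for _ in range(5000)) / 5000
    print(f"n = {n:3d}: association points the right way in {share:.0%} of samples")
```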

[We return to the problem of noise when we discuss quantitative and qualitative research designs.] ©Thomas Plümper 2007/08

David Hume on Induction

Induction (IL) is:

All observed As have been Bs.
The next individual is an A.
The next individual will also be a B.

1) If IL is justified, then there must be an argument (a causal mechanism) that shows it.
2) Arguments are either deductively valid or inductive.
3) No deductively valid argument can justify IL because of underdetermination.
4) No inductive argument can justify IL because of circularity.
5) IL cannot be shown to be justified.

If the assumptions of the argument are true, then 5) follows. ©Thomas Plümper 2007/08

Examples

In other words, we know that the sun will rise in the east not because the sun has always done so before, but because we have a model of the universe which links the sunrise to the rotation of the Earth around its polar axis.

Now try this with the democratic peace hypothesis:

Do we know that a dyad of democracies will not fight a war, not because they never have done so, but because of a model that unequivocally predicts peace in democratic dyads?

My answer is: we do not have a convincing model explaining the democratic peace, and IL is not deductively valid, so we do not know that the democratic peace hypothesis is correct.

In other words: inductive logic suffers from not being deductively valid, and deductive logic suffers from underdetermination and from the fact that most social science theories are probabilistic.

Again, we have established that a scientific proof is impossible. ©Thomas Plümper 2007/08

Excursus: Curd and Cover on Hume

While everything on the previous slide is in order, Curd and Cover (p. 501) argue that

“Even if Hume’s skeptical argument is flawless, its conclusion is strictly limited. Hume has not proven that inductive inference is unjustified. Nor has he proven that no argument can justify the belief that inductive inference is justified. At best, he has proven that no one can use an inductive argument to show an inductive skeptic that inductive inference is justified.”

So what to conclude?

As I said before:

Inductive inference is not deductively valid. Skeptics of induction are widely distributed in the (social) sciences.

Therefore, inductive inferences will never be very convincing. ©Thomas Plümper 2007/08

Consequences

Axioms:

The result of inductive research is called a hypothesis (one that requires a theoretical justification).

The result of deductive research is called ‘evidence in favor of the theory’ or ‘inability to reject the hypothesis’.

______

Attention!

The test of an inductively derived hypothesis based on the evidence from which it was derived is not valid! ©Thomas Plümper 2007/08

Lesson 4: Causality

Causality is a necessary condition for inferences. Without the notion of causality, we cannot ‘solve’ the problem of induction.

Judea Pearl:

Causality stands at the core of enlightenment:

In the ancient world, events were simply predetermined. Similarly, even today very religious societies accept explanations such as ‘act of god’ for all types of event.

This went hand in hand with the not entirely consistent idea that human beings have a free will, for the exercise of which they can be punished or rewarded.

Since the discovery that causes exist in the physical world, causes have stood at the core of blame (and credit), but detected causes can also be used to control outcomes. ©Thomas Plümper 2007/08

Evidence that Scientists take Causality serious…

…is provided by the notion of spurious correlation.

If a correlation is spurious, it is readily dismissed, which means that scientists refrain from making inferences based on it.

Example:

The correlation between the number of storks in a region and the regional birth-rate is positive (as is the correlation between storks and births over time).

Yet, for all we know, storks do not bring babies. ©Thomas Plümper 2007/08

Why Causality?

To Karl Pearson, the guy with the correlation coefficient, causation was second to correlation. In the ‘Grammar of Science’, he wrote:

“Beyond such discarded fundamentals as ‘matter’ and ‘force’ lies still another fetish amidst the inscrutable arcana of modern science, namely, the category of cause and effect.”

He then proceeds and recommends ‘correlation’ as a more general concept.

But then we are back at the stork and births problem.

It therefore seems fair to say that scientists are interested in causality, not in correlation.

And: correlation does not imply causation. ©Thomas Plümper 2007/08

What is Causality?

Does reckless driving cause accidents?

Does high income cause a preference for voting for conservative parties?

Does economic growth cause an increase in support for the incumbent?

Does economic growth cause an increase in employment?

Does global economic integration cause an increase in pollution?

Does smoking cause lung cancer?

Causality is directed: observe the difference between:

1) If it rains, the grass gets wet.

2) If we make the grass wet, it will rain. ©Thomas Plümper 2007/08

Does a cause precede an effect?

Often, but not always:

1) Do capital exports cause a financial market crash?

2) Does a financial market crash cause capital exports?

3) Does the expectation of a financial market crash cause capital exports?

4) Does the expectation of capital exports cause a financial market crash? ©Thomas Plümper 2007/08

Two Concepts of Causality

1) Causality is a theoretical concept independent of the evidence used to learn about it (KKV: 76).

Researchers claim that the relation between two factors is causal in at least one direction.

A → B or A → B | C

This concept of causality is trivial. All theories (explicitly or implicitly but better explicitly) have a causal core.

2) Causality as empirical concept:

Researchers observe that the relation between two factors is causal. ©Thomas Plümper 2007/08

The Consensual View of Causal Inference

Causality cannot be detected by statistical analysis.

Statistically, we observe that smokers are more likely than non-smokers to suffer from lung cancer.

Does smoking thus cause lung cancer?

We do not know, because it could be that a genotype G exists, which increases the probability of cigarette addiction C and simultaneously increases the predisposition for lung cancer L.

L = f(G) and C = f(G); thus corr(L, C) > 0, but L ≠ f(C).
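A quick simulation sketch of this confounding argument (illustrative parameter values only, not empirical estimates): a genotype G raises both the probability of smoking C and of lung cancer L, smoking itself has no effect, and yet C and L are positively correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

g = rng.random(n) < 0.3                      # genotype G
c = rng.random(n) < np.where(g, 0.7, 0.2)    # smoking depends only on G
l = rng.random(n) < np.where(g, 0.4, 0.05)   # cancer depends only on G, not on C

print("corr(C, L)            =", np.corrcoef(c, l)[0, 1])        # clearly positive
print("corr(C, L | G = yes)  =", np.corrcoef(c[g], l[g])[0, 1])  # roughly zero within the genotype group
```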

As a consequence, most researchers nowadays believe that only controlled experiments, (sometimes) quasi experiments, or matching can detect causality. ©Thomas Plümper 2007/08

Experiments

Experiments can detect causality if and only if the researcher manages to eliminate all other potential influences on the dependent variable. ©Thomas Plümper 2007/08

Quasi Experiments

Some researchers believe that if events are randomized, and if they have effects, then the observed effects must be caused by the randomized event.

Unfortunately, that is not logically true:

Even randomized events have a positive probability of being correlated with a variable that may influence the dependent variable.

Only infinitely often repeated quasi experiments would undoubtedly allow detecting causal effects. ©Thomas Plümper 2007/08
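A tiny simulation sketch of this point (my own illustration): even a genuinely randomized binary event can, in any single finite sample, be correlated with an unobserved variable by chance.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 50, 10_000

corrs = []
for _ in range(reps):
    treatment = rng.integers(0, 2, size=n)   # randomized event
    confounder = rng.normal(size=n)          # unrelated background variable
    corrs.append(np.corrcoef(treatment, confounder)[0, 1])

corrs = np.array(corrs)
print("mean correlation:       ", corrs.mean())                     # ~0 on average across repetitions
print("share with |corr| > 0.2:", (np.abs(corrs) > 0.2).mean())     # but often sizable in a single sample
```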

Matching

Matching reduces the sample to include only pairs of observations which are identical in all relevant respects except the treatment. This allows isolating the treatment effect.

Unfortunately, we typically do not know what the correct model is. This does not allow us to perfectly isolate the treatment effect.

More importantly, all data are noisy and matching procedures often do not eliminate noise. Therefore, the observed effect is the treatment effect plus noise. Even if the treatment effect is estimated rather than measured, bias from the correlation of noise and the treatment persists. ©Thomas Plümper 2007/08
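A stylized sketch of nearest-neighbour matching on a single observed covariate (my own illustration, not the procedure of any particular study): each treated unit is paired with the closest control, and the treatment effect is estimated as the mean within-pair outcome difference. With noisy data the estimate still deviates from the true effect, which is the point made above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000
x = rng.normal(size=n)                                  # observed covariate
treated = rng.random(n) < 1 / (1 + np.exp(-x))          # treatment more likely for high x
true_effect = 1.0
y = 2.0 * x + true_effect * treated + rng.normal(scale=2.0, size=n)  # noisy outcome

controls = np.where(~treated)[0]
diffs = []
for i in np.where(treated)[0]:
    j = controls[np.argmin(np.abs(x[controls] - x[i]))]  # nearest control on x
    diffs.append(y[i] - y[j])

print("naive difference in means:", y[treated].mean() - y[~treated].mean())
print("matched estimate:         ", np.mean(diffs))      # closer to 1.0, but still noisy
```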

Causality

In sum, causality is foremost part of a theoretical model.

Second, causal inference requires controlled experiments. ©Thomas Plümper 2007/08

Lesson 5: Theory Formation

Theories provide answers to why questions.

Why does A influence B?

A theory is a systematic set of statements and propositions that links explanandum to explanans. Or:

Theories are systematic and consistent sets of assumptions, from which a set of basic principles is derived. ©Thomas Plümper 2007/08

Necessary Conditions for a Theory

− generalizations over a category of phenomena

− predictions over outcomes given a state or a change of state

− consistency.

In other words, if something cannot be expressed by a formal model, it is not a theory.

More importantly, theories must be falsifiable. A set of assumptions and derivations that is formulated in a way that cannot be falsified is useless. ©Thomas Plümper 2007/08

Once more: Falsification of Theories

In 1726, Jonathan Swift, following Johannes Kepler, predicted the existence of two Mars moons.

Indeed, in 1877 these two moons were actually detected. One of these moons is now also called Swift, the other one Voltaire (scientifically, they are called Phobos and Deimos), because the latter made an identical prediction in 1750.

Examples of good science?

Well, Kepler predicted two moons because it was known that the Earth has one moon and Jupiter four moons. Since Kepler believed in the symmetry of a god-given universe, he predicted the existence of two Mars moons.

Hence, both predictions were based on wrong assumptions and a wrong causal mechanism (symmetric design).

Indeed, according to Kepler, the next planet, Saturn, should have 8 moons – it has five, Uranus should have 64 moons – and has five as well. Hence, no symmetry AND evidence that inductive results are not deductively valid. ©Thomas Plümper 2007/08

Purpose of Theories

Stephen Hawking: “[A theory] must accurately describe a large class of observations on the basis of a model which contains only a few arbitrary elements, and it must make definite predictions about the results of future observations.”

Theories shall be as simple as possible and as complicated as necessary.

Accordingly, theories have purposes.

A good theory in one context can be a bad theory in another context.

Example: Einstein’s theory of relativity is rarely used to build bridges (but Newton’s theory of gravity is). In turn, using only Newton’s theory, mankind would not be able to make a robot land on Mars. ©Thomas Plümper 2007/08

Relevant Theories

A set of assumptions that does not make predictions is not a theory.

To be relevant and to advance scientific knowledge, a new theory needs to make predictions which were either unknown before or explain known phenomena in a more parsimonious way than existing theories.

Relevant theories are as precise as possible. They

1) predict the existence of effects of one or a combination of causal mechanism,

2) predict the sign of these effects,

3) predict the strengths of these effects,

4) predict the size of conditional effects.

The further down the list theories get, the more precise they are. Social science theories usually predict only the direction of effects. ©Thomas Plümper 2007/08

Rethinking Predictions: Necessary and sufficient Conditions

Necessary Condition: A is Necessary for B

             B
           yes   no
A   yes     X    Y
    no      0    Z

Sufficient Condition: A is Sufficient for B

             B
           yes   no
A   yes     X    0
    no      Y    Z

X, Y, Z > 0 ©Thomas Plümper 2007/08

Rethinking Predictions: Necessary and sufficient Conditions

Necessary and Sufficient Condition: A is Necessary and Sufficient for B

             B
           yes   no
A   yes     X    0
    no      0    Z

Neither Necessary nor Sufficient

             B
           yes   no
A   yes     X    P
    no      Y    Z

X, Y, Z, P > 0 ©Thomas Plümper 2007/08

Note: a neither necessary nor sufficient condition can still significantly influence the explanandum.

Neither Necessary nor Sufficient

             B
           yes   no
A   yes     X    P
    no      Y    Z

with X,Z >> Y,P or X,Z << Y,P ©Thomas Plümper 2007/08
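The four patterns can be checked mechanically. A small helper function (my own illustration) takes the four cell counts of such a table and reports whether A looks necessary and/or sufficient for B:

```python
def classify(n_ab, n_a_notb, n_nota_b, n_nota_notb):
    """Cell counts: (A yes, B yes), (A yes, B no), (A no, B yes), (A no, B no)."""
    necessary = n_nota_b == 0      # no B without A
    sufficient = n_a_notb == 0     # no A without B
    return necessary, sufficient

print(classify(40, 10, 0, 40))    # (True, False): A necessary, not sufficient
print(classify(40, 0, 10, 40))    # (False, True): A sufficient, not necessary
print(classify(40, 10, 10, 40))   # (False, False): neither, but still associated with B
```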

Two Variants of Social Science Theories

Functionalistic theories link macro-phenomena to macro-phenomena.

Rationalistic theories provide a micro-foundation to explain the effect of macro-phenomena on macro-phenomena.

Rationalistic theories are superior, because they provide explanations for the link on the macro-level. They offer an idea of causality (which of course can be wrong, but that’s the idea of a theory). ©Thomas Plümper 2007/08

Research Designs and Theory Construction

How can we, how should we construct political science theories?

Three types of research designs:

x-centered: dominant interest in the explanatory variable.
Question: why does x matter and what can all be explained by x?
Example: Tsebelis’ veto-player theory.

y-centered: dominant interest in the dependent variable.
Question: what explains the variation in y?
Example: most welfare state research.

y=f(x)-centered: dominant interest in the influence of the explanatory variable on the dependent variable.
Question: why does x influence y?
Example: most articles in high-quality journals. ©Thomas Plümper 2007/08

A Rule of Thumb

x-centered and y-centered designs are good for books, but very difficult for articles. x-centered designs are necessarily a bit arbitrary; therefore y-centered approaches seem to be much more appropriate. y=f(x)-centered designs clearly dominate journal publications; books are possible, though the theory needs to be more complex than a linear relation between y and x.

However: y=f(x) designs are much more promising than x- and y-centered designs for junior researchers. The fallacy of x- and y-centered designs is that in this world everything depends on everything else and vice versa. Therefore, it is very difficult to define the boundaries of the research interest.

There is no such problem in y=f(x) designs!

-> clear recommendation ©Thomas Plümper 2007/08

Providing Microfoundations in y=f(x) Designs: Coleman’s Bathtub

James Coleman: Foundations of Social Theory

Coleman suggests a model of a social science theory which consists of four ontological entities and four relations between them.

The entities are

− structure
− actors and their preferences (utility functions)
− interactions
− social outcome

The four relations

− how the structure determines the social outcome (macro-hypotheses, predictions)
− how structure shapes actors’ preferences (constraints and opportunities)
− how actors’ preferences determine their behavior
− how actors’ behavior determines the outcome (aggregation rule, equilibrium selection, …) ©Thomas Plümper 2007/08

Coleman in General

[Diagram: Coleman’s ‘bathtub’. Macro level: structure → social outcome (the macro-macro prediction). Micro level: structure shapes actors and their utility functions; actors’ behavior and interactions produce the social outcome via aggregation and equilibrium selection.] ©Thomas Plümper 2007/08

Coleman in Particular

[Diagram: the bathtub applied to electoral systems. Macro level: majoritarian versus proportional electoral system → political representation (the electoral system influences the degree of political representation). Micro level: the electoral system shapes party competition; parties maximize vote shares and voters have spatial preferences; party positioning and voting aggregate into the degree of political representation.] ©Thomas Plümper 2007/08

How to think about the World: Deterministic and Probabilistic Theories

The Coleman Scheme can be used to develop deterministic and probabilistic theories.

Note: Probabilistic theories have a deterministic core. Do not write ‘may’, ‘might’, ‘could’ in theory sections.

Just state that the predictions of the deterministic theory are meant in probabilistic sense:

‘A increases the probability of B’

rather than

‘When A, then B’.

Probabilistic theories are much more plausible in social science. Why? ©Thomas Plümper 2007/08

Why Probabilistic Theories?

Social science theories have to deal with an important black box: individuals are different: they do not respond equally to the same stimulus.

These differences cause behavior to be noisy.

There are few alternatives to accepting this noise: we cannot and should not enrich our social theories with psychological accounts of all individuals who have the chance to make our data noisy.

Two reasons:

We are social scientists; individual behavior is an assumption for us – nothing we want to explain.

There are so many more relevant actors than ‘cases’ we can analyze. Hence, if we were to take all possible combinations of all possible actor types into consideration, our ‘theories’ would not have any predictive power. ©Thomas Plümper 2007/08

Most Importantly:

Probabilistic theories are not probabilistic, because we make assumptions or derivations probabilistic. Rather, we interpret the predictions in a probabilistic way. ©Thomas Plümper 2007/08

Summary: Good Social Science Theories

− make simplifying assumptions

− have predictive power

− predict more than just one phenomenon (allow derivation of multiple hypotheses)

− provide a microfoundation and perhaps

− are testable (not all theories that make predictions can be tested) ©Thomas Plümper 2007/08

Lesson 6: Theory and Empirical Analysis

Bridging gaps… ©Thomas Plümper 2007/08

The Supremacy of Theory

The idea behind predictions is that the responsibility for telling observers what to look for should fall upon theorists. Trying to expand our knowledge by waiting for new observations to be found by accident is like shooting in the dark. There are so many possible directions; how do we know where to look to find something new? Better to have a guiding theory telling you what to look for. Undoubtedly, it is observation that establishes facts, but without a theory one risks wasting a lot of time looking in vain.

Joao Magueijo, p. 55 ©Thomas Plümper 2007/08

…but incidents may happen

Naturally, science sometimes proceeds in the opposite way, and if that happens, so much the better. Experiments may go ahead of theory, so that we come across new facts first by observation. Theory is then about postdicting existing observations. The role of the theorist is now to collect existing new data and come up with a theory explaining it all; that is the theorist must find a framework within which all observations make sense. ibid ©Thomas Plümper 2007/08

Prediction and Postdiction

In reality, both prediction and postdiction play an important role in science – they are not mutually exclusive. ibid. ©Thomas Plümper 2007/08

Theory and Observation

It is sometimes said that we should never believe a scientific theory until it is verified by experiment. But a famous astronomer has also stated that we should never believe an observation until it is confirmed by theory.

Magueijo, p. 87 ©Thomas Plümper 2007/08

Linking Theory to Method

Apparently (and having discussed underdetermination), there always exists more than one way to bridge the gap between theory and empirical analysis.

Each theory can be tested in various ways and evidence is compatible with more than just one theory. ©Thomas Plümper 2007/08

Some Practical Advice

1. Take your theory seriously…

2. Write your theory twice: in general, abstract terms and in applied terms…

3. Add some empirical evidence to your ‘applied theory’…

4. Think hard about your cases and your sample…

5. Make sure that you use identical terms in the theory and the empirics section…

6. Do not use unobservable and/or unmeasurable variables in the ‘applied theory’ and the empirical analysis. ©Thomas Plümper 2007/08

Taking Theory seriously 1…

In its first version, Einstein’s theory of relativity predicts an expanding universe.

Einstein, however, had a very strong prior for a stable (or what he said: constant) universe.

Thus, he introduced a parameter with the sole purpose of ‘stabilizing’ the universe: the cosmological constant was born.

Einstein later called it his ‘größte Eselei’ (his ‘biggest blunder’)…

Hubble then found that the universe indeed does expand as predicted by the theory of relativity. ©Thomas Plümper 2007/08

Taking Theory seriously 2…

Plümper and Martin once wrote a manuscript which predicts a non-linear relation between political participation and economic growth. ©Thomas Plümper 2007/08

Predictions of this theory ©Thomas Plümper 2007/08

Taking theory seriously?

In the first version of the paper, however, we tested only prediction 1, perhaps mainly because I did not believe that we could find support for hypothesis 3. ©Thomas Plümper 2007/08

However:

We later added the additional tests and the paper was published in a reasonable journal. ©Thomas Plümper 2007/08

What does this tell us?

− theory is often better than intuition and priors

− good models allow the derivation of more than one hypothesis ©Thomas Plümper 2007/08

Write your theory twice…

What this means? ©Thomas Plümper 2007/08

Write your theory twice…

the language of abstract theory (in Plümper and Neumayer 2007; Neumayer and Plümper 2007)

The more a foreign government stabilizes the political control of the government of a country in which a radical group intends to overthrow the domestic government, the higher the probability of international terrorism conducted by the group against the foreign government.

the language of ‘applied’ theory

First, by being an ally to the home country, a foreign country becomes part of the enemy fought by the terrorists, which is their home country government. Attacking targets from the ally provides terror groups with strategic advantages. Second, the effect of alliances on international terrorism is amplified by an asymmetry in the military capabilities of the ally relative to the home country. The more powerful the foreign ally relative to the home country the more terrorism from the home country is targeted against nationals of the foreign ally. (Plümper and Neumayer 2007: 5) ©Thomas Plümper 2007/08

What is the difference? ©Thomas Plümper 2007/08

What is the difference?

− stabilization not observable (counterfactual!!)

− power not directly measurable

BUT:
− power asymmetry between the domestic country and the foreign country is measurable
− stabilization via alliances? ©Thomas Plümper 2007/08

Alternatives

− military aid
− arms exports
− troops stationed

see Neumayer and Plümper 2007 ©Thomas Plümper 2007/08

Add (some!) empirical evidence to your theory or the empirics section or (perhaps best) in between…

‘India is an important for testing the political economy of responsiveness. It is home to a large vulnerable population (…). India is a federal democracy, and popularly elected state governments play a key role in relief activities. There is a relatively free and independent press.’ (Besley and Burgess 2002: 1416). Indeed, Sen regards post-independence democratic India as a major piece of evidence in favor of his claim that no famine ever took place in a democratic country with free press. He insists that India has not suffered a major famine since 1947: ‘The last major famine in India took place before independence, viz. the Bengal famine of 1943, in which about 3 million people died. Since then there have been a number of threats of severe famine (e.g. in Bihar in 1967, in Maharashtra in 1973, in West Bengal in 1979, in Gujarat in 1987), but they did not materialize, largely due to public intervention.’ (Drèze and Sen 1989: 8). Sen thus argues that large-scale famine mortality has been prevented by intervention of a responsive democratic government. Yet, on closer inspection the devil lies in the detail.’

Plümper and Neumayer 2007 ©Thomas Plümper 2007/08

Another example… ©Thomas Plümper 2007/08

What is this good for?

Empirical evidence can be used to cast doubt on existing theories.

Empirical evidence can be used to support causal mechanisms.

MAKE SURE TO USE CASES WHICH ARE ALSO IN THE ANALYZED SAMPLE.

But do not exaggerate: theory must still be recognizable as such. ©Thomas Plümper 2007/08

Theory and Sample Selection

A simple way to bring the empirical analysis closer to the theoretical predictions (and vice versa) is to clearly discuss the external validity of the theory.

For what subset (if any) of cases does the theory claim validity? Why? Why do the predictions not hold for other cases?

Are the selection criteria exogenously given or is there a choice, in which case a selection model might be more appropriate?

What is a typical case? Does it make sense to show that your theory fits nicely to a typical case? ©Thomas Plümper 2007/08

Use the same terms

The ‘applied’ version of the theory and the empirical analysis need to use the same terms.

Don’t use power in the theory and military spending in the empirical analysis.

Don’t use utility in the theory and per capita income in the empirical analysis.

And so on…

If you want to use general concepts, use them in the discussion of the general theory. ©Thomas Plümper 2007/08

Terms to be avoided and used

Power: use military spending (or whatever you mean)
Freedom
Justice
Utility
Welfare: use per capita income (unless of course you have a direct measure of welfare) ©Thomas Plümper 2007/08

Unobservables

Many students think (and write) in general, broad concepts (see above).

These concepts can hardly be directly observed.

Thus, they are also difficult if not impossible to measure.

Accordingly, you are not likely to use them in the empirical section.

Avoid them altogether…. ©Thomas Plümper 2007/08

Lesson 7: Qualitative Research Designs and Case Studies ©Thomas Plümper 2007/08

Qualitative Research Designs

King, Keohane and Verba: “Qualitative research covers a wide range of approaches (…). Such work has tended to focus on one or a small number of cases, to use intensive or depth analysis of historical materials, to be discursive in method, and to be concerned with a rounded and comprehensive account of some unit or event.” (p.4)

What then is a ‘case’? ©Thomas Plümper 2007/08

What is a ‘case’?

Collier and Brady: “Cases are understood as the broader units, that is the broader research settings or sites within which analysis is conducted.” (p.38)

de Vaus: “A case is the object of a study. It is the unit of analysis about which we collect information. In case study designs it is the unit that we seek to understand as a whole.” (p.220)

de Vaus: “Case study designs differ from [quantitative designs] in that they seek to achieve both more complex and fuller explanations of phenomena.” (p.221)

Eckstein: “A study of six general elections in Britain may be, but need not be, an N=1 study. It might also be an N=6 study. It can also be an N=120,000,000.” [I doubt it…]

Brady and Collier: “Cases are the political, social, institutional, or individual entities or phenomena about which information is collected and inferences are made.” (p.275) ©Thomas Plümper 2007/08

Wikipedia: “Rather than using large samples and following a rigid protocol to examine a limited number of variables, case study methods involve an in-depth, longitudinal examination of a single instance or event: a case. They provide a systematic way of looking at events, collecting data, analyzing information, and reporting the results. As a result the researcher may gain a sharpened understanding of why the instance happened as it did, and what might become important to look at more extensively in future research.”

Yin: “[The term] case study should be defined as a research strategy, an empirical inquiry that investigates a phenomenon within its real-life context.”

Abercrombie, Hill, & Turner: “The detailed examination of a single example of a class of phenomena, a case study cannot provide reliable information about the broader class, but it may be useful in the preliminary stages of an investigation since it provides hypotheses, which may be tested systematically with a larger number of cases.” (p.34)

Are you still asking yourself why case study designs often do not work???? ©Thomas Plümper 2007/08

Where does this Confusion come from?

Can “Germany” possibly be a case?

Or labor market regulation in Germany?

Or the effect of a labor market reform (Hartz 1, Hartz 2) on unemployment figures?

Does the dependent variable define the case?

OR: Is a case defined by the causal mechanism put forward in the theory (that researchers want to make inferences upon)?

Confusion emerges because

1. the definition of a case is often far too broad,
2. the term case is used in different ways,
3. no clear distinction between case and observation exists within qualitative research,
4. we do not necessarily a priori define a case. ©Thomas Plümper 2007/08

A Procedural Solution

Think about theories making predictions across time and space.

The number of observations is then the number of cases times the number of periods. Hence,

N=n*T

If n=1 we have a single case study; inferences are possible if T>1.
If n>1 we observe multiple cases; inferences are possible even if T=1.
If n=1 and T=1 we have a problem.

Problems:

Are Germany before 1933 and Germany after 1945 the same case (not to mention Germany between 1933 and 1945)? ©Thomas Plümper 2007/08

More Problems

In deductive research, the definition of a case and of an observation should be clear, since they are defined by the boundaries of the theory.

But what defines a case in inductive research? Where are the boundaries of a case? ©Thomas Plümper 2007/08

Case Study Design and Complexity

George and Bennett: “We emphasize that various kinds of complex causal relations are central concerns of the social sciences, including not only equifinality and multiple interaction effects, but also disproportionate feedback loops, path dependencies, tipping points, selection effects, expectation effects and sequential interactions between individual agents and social structures.” (p.13)

So what???

“Our approach to the problem of complexity is to recommend process-tracing.” (p. 13) ©Thomas Plümper 2007/08

Are all cases equally important?

or are some case more equal than others?

George and Bennett: “It is difficult to judge the probative value of a particular test relative to the weight of prior evidence behind an existing theory.” (p.120)

Qualitative research may have a point here: ©Thomas Plümper 2007/08

Crucial Cases

Crucial cases must closely fit a theory if one is to have confidence in the theory’s validity.

[What is the fit between a theory and a case??? R²=1???]

Crucial cases have a small a priori probability of being deviant.

BUT: How can we tell a crucial case from a not so crucial case?

Is the US example more likely to be crucial than other ‘cases’?

Eckstein notes the difficulty in identifying such crucial cases when theories and their predictive consequences are not precisely stated.

BUT: DO THEORIES EVER MAKE PREDICTIONS ON SINGLE CASES?????? ©Thomas Plümper 2007/08

Most-likely/least-likely Cases

Well, defined as cases which are most or least likely to provide evidence in favor of the theory.

BUT: what about underdetermination?

If a case supports a theory, it is also likely to support other theories.

AND: Most evidence in favor of a theory is provided by least-likely cases that actually support the theory. What do we learn from supportive most-likely cases? ©Thomas Plümper 2007/08

Typical Cases

OK: How do we know that a case is typical? ©Thomas Plümper 2007/08

Nevertheless…

… I believe that qualitative approaches have a point here.

At least, I do not care much if the residuals for Luxembourg are large. But I care if the residuals for the Netherlands are large.

And yes, I thought about making fun of Italy here…. ©Thomas Plümper 2007/08

Confusion again…

Here is my view of why qualitative research is often confused (and confusing):

Case selection, model specification, definition of the boundaries of the case, and specification of what we accept as counter-evidence usually require much more information than any theory provides. For example, a theory usually does not tell us what a crucial or a most-likely case is. Hence, qualitative researchers necessarily have to make arbitrary decisions – decisions not guided by theory, but decisions which are likely to drive their results. ©Thomas Plümper 2007/08

A broader Critique ©Thomas Plümper 2007/08

Flyvbjerg: In Defense of Case Study Designs… ©Thomas Plümper 2007/08

Taking Flyvbjerg apart: A simple exercise

P1:

If ‘concrete, context-dependent knowledge’ is of any use, findings from the case study must be generalizable.

a) how do we know? b) why would we think that inferences from one case are more valid than inferences from many cases????

Moreover, it is nonsense to equate non-qualitative designs with context independence. After all, an interaction effect can be modeled in quantitative research, but not (or at least not as easily) in qualitative research.

P2: Generalization from a single case is (of course) possible. BUT not very likely correct.

The law of large numbers applies. In addition, how do we know to what other cases we may generalize? ©Thomas Plümper 2007/08

P3: There can’t be any doubt that qualitative designs cannot appropriately test hypotheses derived from probabilistic theories.

P4: Well, research in general has that bias.

P5: No one makes this argument. Indeed, qualitative designs are brilliant for induction – though of course most theories induced from qualitative research will be wrong.

BUT: data mining is far worse (I believe). ©Thomas Plümper 2007/08

OK, then: what is left for Qualitative Designs?

Pros:

Qualitative research designs are superior for theory development (inductive research), testing deterministic theories, providing additional evidence of causal mechanisms at work.

Cons:

Qualitative research designs are inferior for testing probabilistic theories, making inferences and generalizing results, analyzing complex relations with lots of intervening variables, controls and contingencies. ©Thomas Plümper 2007/08

Just in case we still want, how do we do qualitative research?

Let’s face problems:

− case selection

− model specification

− inference ©Thomas Plümper 2007/08

Case Selection

Axiom 1: Do not select on the dependent variable.

Axiom 2: Use quantitative research to identify the most interesting cases.

Axiom 3: Use matching techniques to identify comparable cases.

Axiom 4: Do not claim to be testing theories.

Axiom 5: Use qualitative research methods to improve your theory. ©Thomas Plümper 2007/08

Model Specification

Qualitative research is more dependent than quantitative research on having the right model, because qualitative research cannot account for error processes.

Thus, one needs to specify a general equilibrium model (a full explanation, if you prefer) to identify cases which are comparable (because all controls are kept constant).

Still, errors bias the result from qualitative research, since it cannot account for residuals. ©Thomas Plümper 2007/08

Inferences

Be moderate: most likely, your inferences are wrong….

anyway: this definitely is not a plea for avoiding qualitative research.

You definitely need it to learn about your theory and causal mechanisms. ©Thomas Plümper 2007/08

Lesson 8: Quantitative Research Designs ©Thomas Plümper 2007/08

A general data generating process

y_i = α + β x_i + ε_i is boring and will very likely not lead to valid inferences.

Let’s add a vector of controls...

y_i = α + β^1 x^1_i + Σ_{k=2}^{K} β^k x^k_i + ε_i

hmm, slightly better, but do we believe in unconditional effects? ©Thomas Plümper 2007/08

A general data generating process


If not, we may say

y_i = α + β^1 x^1_i + β^2 x^2_i + β^3 x^1_i x^2_i + Σ_{k=4}^{K} β^k x^k_i + ε_i ©Thomas Plümper 2007/08

A general data generating process

However, if any of the included variables at time t is not independent of the same variable at time t−1, we get biased estimation results. There is only one solution: pooling and controlling for serial correlation.

So we get

$y_{it} = \alpha + \beta^0 y_{it-1} + \beta^1 x_{it}^1 + \beta^2 x_{it}^2 + \beta^3 x_{it}^1 x_{it}^2 + \sum_{k=4}^{K} \beta^k x_{it}^k + \varepsilon_{it}$

Note: it is not necessarily the best idea to control for dynamics via the LDV. Prais-Winsten is more efficient, and distributed lag models impose fewer constraints. (The LDV assumes that all dynamic effects are identical, which is almost certainly wrong.)

$y_{it} = \alpha + \beta^0 y_{it-1} + \beta^1 x_{it}^1 + \beta^2 x_{it}^2 + \beta^3 x_{it}^1 x_{it}^2 + \beta^{1l} x_{it-1}^1 + \beta^{2l} x_{it-1}^2 + \beta^{3l} x_{it-1}^1 x_{it-1}^2 + \sum_{k=4}^{K} \beta^k x_{it}^k + \varepsilon_{it}$

This model estimates the deviation of the x's dynamic effects from $\beta^0$. But let us ignore these complicated dynamic structures. ©Thomas Plümper 2007/08

A general data generating process

We return to

$y_{it} = \alpha + \beta^0 y_{it-1} + \beta^1 x_{it}^1 + \beta^2 x_{it}^2 + \beta^3 x_{it}^1 x_{it}^2 + \sum_{k=4}^{K} \beta^k x_{it}^k + \varepsilon_{it}$

The second big issue is heterogeneity.

Are coefficients stable over time? Are coefficients identical across units? Are error processes heterogeneous?

If not, and if we believe average effects are awful and misleading, we get an even more complicated DGP:

$y_{it} = \alpha_i + \alpha_t + \beta_i^0 y_{it-1} + \beta_i^1 x_{it}^1 + \beta_i^2 x_{it}^2 + \beta_i^3 x_{it}^1 x_{it}^2 + \sum_{k=4}^{K} \beta^k x_{it}^k + \varepsilon_{it}$

Does it make sense to estimate such a model?

I’d say yes, if theory predicts heterogeneous effects. No, if not. ©Thomas Plümper 2007/08
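A sketch of how such a heterogeneous specification can be estimated with dummy interactions, on simulated data (the unit and period counts, the variable names, and all parameters are made-up assumptions):

```python
# Hypothetical illustration: a panel with unit and period fixed effects and
# unit-specific slopes for the variable of interest, estimated via dummies.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
units, periods = 20, 30
df = pd.DataFrame(
    [(i, t) for i in range(units) for t in range(periods)],
    columns=["unit", "year"],
)
df["x"] = rng.normal(size=len(df))
alpha_i = rng.normal(size=units)                       # unit effects
beta_i = rng.normal(loc=1.0, scale=0.5, size=units)    # heterogeneous slopes
df["y"] = alpha_i[df["unit"]] + beta_i[df["unit"]] * df["x"] + rng.normal(size=len(df))

# unit dummies, period dummies, and unit-specific slopes on x
fit = smf.ols("y ~ C(unit) + C(year) + C(unit):x", data=df).fit()
print(fit.params.filter(like=":x").head())   # estimated unit-specific slopes
```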

Summary

Cross-sectional analyses are unreliable.

Time-series analyses are too; pooling is a good solution.

Both TS and TSCS are costly, because researchers need to specify dynamic processes and account for serial correlation.

TSCS is even more problematic, because we do not believe that unit heterogeneity is low.

(if you want to defend CS analyses, let me know how you want to deal with unit heterogeneity and with temporal dependencies).

Why would we be willing to believe that observations in CS analyses are independent? ©Thomas Plümper 2007/08

Research Designs in Quantitative Research ©Thomas Plümper 2007/08

Estimator Properties

Unbiasedness: (absolute concept) The expected deviation of the estimated coefficient from the true coefficient is zero (regardless of the number of observations).

Efficiency: (relative concept) An estimator is most efficient if it produces estimates with the smallest sampling variance, i.e. the most tightly concentrated sampling distribution.

There can be a trade-off between efficiency and unbiasedness.

Rule of Thumb: Unbiasedness is relatively more important if the analyzed variation is large; efficiency is more important if the analyzed variation is low.

You may translate this “large variation” into the “number of (independent) observations”, though of course this is not literally the same. ©Thomas Plümper 2007/08
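A tiny Monte Carlo illustrating this trade-off (the shrinkage factor 0.8 and all other numbers are arbitrary assumptions): a deliberately biased estimator with a smaller variance can have a lower mean squared error than the unbiased estimator when the analyzed variation is low.

```python
# Hypothetical illustration of the bias-efficiency trade-off with a small sample.
import numpy as np

rng = np.random.default_rng(0)
mu, n, reps = 2.0, 10, 20_000

means = np.array([rng.normal(loc=mu, scale=3.0, size=n).mean() for _ in range(reps)])
shrunk = 0.8 * means   # biased towards zero, but less variable

for name, est in [("sample mean (unbiased)", means), ("shrunken mean (biased)", shrunk)]:
    bias = est.mean() - mu
    var = est.var()
    print(f"{name:24s} bias={bias:+.3f}  var={var:.3f}  mse={bias**2 + var:.3f}")
```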

Model Properties

As simple as possible, but not simpler. ©Thomas Plümper 2007/08

Included Variables

Of course, you need to include your variable of main interest.

But what about controls? ©Thomas Plümper 2007/08

Inclusion of Controls

Controls need to be included if they are correlated both with the error term (that is, they affect the dependent variable) and, more importantly, with the variable of main interest.

Otherwise: estimates will be biased. ©Thomas Plümper 2007/08
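A short simulated illustration of this omitted-variable bias (all coefficients are made up): omitting a control z that affects y and is correlated with x inflates the estimated effect of x.

```python
# Hypothetical illustration: the true effect of x is 1.0; omitting z biases it upward.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=n)
x = 0.7 * z + rng.normal(size=n)              # x is correlated with z
y = 1.0 * x + 2.0 * z + rng.normal(size=n)    # z also affects y

full = sm.OLS(y, sm.add_constant(np.column_stack([x, z]))).fit()
short = sm.OLS(y, sm.add_constant(x)).fit()
print("with control:   ", round(full.params[1], 2))   # close to 1.0
print("without control:", round(short.params[1], 2))  # biased upward, roughly 1.9
```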

Conditional Effects

If your theory suggests conditional effects, you need to specify the conditional effect correctly by using interaction effects – sometimes multiple interaction effects.

An interaction effect requires including all constitutive ('pure') terms.

Note: one can also estimate different slopes for different groups; that is not the same as an interaction effect. ©Thomas Plümper 2007/08

Presentation of Interaction Effect

The size and (more importantly) the functional form of interaction effects cannot be derived from looking at the coefficients. You need to compute the conditional effects.

For example: Generate an Excel Sheet

1. Observe the range of the two interacted variables (say x: 0..1 and z: 1..10).
2. Estimate the coefficients [say: 0.5 (x), -0.06 (z), 0.2 (xz)].
3. Insert a linear function of x in cells A2-A11.
4. Insert a linear function of z in cells B1-K1.
5. Type: 0.5*$A2-0.06*B$1+0.2*$A2*B$1 in cell B2.
6. Copy the content of cell B2 into the matrix B2..K11. ©Thomas Plümper 2007/08
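The same grid can also be computed directly; the sketch below mirrors the Excel recipe in Python, using the illustrative coefficients 0.5, -0.06 and 0.2 from the slide.

```python
# Conditional-effect grid for y = 0.5*x - 0.06*z + 0.2*x*z (illustrative coefficients).
import numpy as np
import pandas as pd

x = np.linspace(0, 1, 10)      # range of x: 0..1
z = np.arange(1, 11)           # range of z: 1..10

grid = 0.5 * x[:, None] - 0.06 * z[None, :] + 0.2 * x[:, None] * z[None, :]
table = pd.DataFrame(grid, index=np.round(x, 2), columns=z)
print(table.round(2))

# marginal effect of x conditional on z: dy/dx = 0.5 + 0.2*z
print("effect of x given z:", 0.5 + 0.2 * z)
```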

Interaction Effects in Non-Linear Models cannot directly be interpreted (Plümper, Schneider and Troeger 2006) ©Thomas Plümper 2007/08

Nor can correct significance levels be read directly off the coefficients (Plümper and Neumayer 2007).

Hence: you definitely need to compute effect size and significance if you use interaction effects in non-linear estimation models. ©Thomas Plümper 2007/08

Functional Forms

It is possible to estimate non-linear functional forms in linear models. Do not get confused.

$y_i = \alpha + \beta^1 x_i + \beta^2 (x_i \cdot x_i) + \beta^3 (x_i \cdot x_i \cdot x_i) + \varepsilon_i$

This is data-mining if done without theory.

Note again: you need to compute the effect of the argument in non-linear models. Results are not obvious and cannot be inferred from the coefficients. Btw: to compute the conditional effects of non-linear models you need to add an ado that does that for you (Clarify, e.g.) or (MUCH MUCH BETTER) you need to know the functional form of the probability density function of the estimator you use.

(well, you better take this seriously) ©Thomas Plümper 2007/08
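For a polynomial specification like the one above, the effect of x is not a single coefficient but the derivative $\beta^1 + 2\beta^2 x + 3\beta^3 x^2$, evaluated over the range of x. A minimal sketch with made-up coefficient values:

```python
# Hypothetical illustration: marginal effect of x in a cubic specification.
import numpy as np

b1, b2, b3 = 0.8, -0.3, 0.05               # illustrative coefficients
x = np.linspace(0, 5, 6)
marginal_effect = b1 + 2 * b2 * x + 3 * b3 * x**2
for xi, me in zip(x, marginal_effect):
    print(f"x = {xi:.1f}   dy/dx = {me:+.2f}")
```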

Indeed…

…one can combine interaction effects and non-linear functional forms… if the theory says something about these issues. ©Thomas Plümper 2007/08

Random Sampling

In theory, much in econometrics is based on the idea of random sampling. That is, without random sampling the concept of standard errors is slightly awkward.

Unfortunately, random samples are very very unlikely to exist, because the random draw from a non-random sample does not give us a random sample.

What then is the population of cases and what defines it? ©Thomas Plümper 2007/08

Populations are defined

by the theory. Full stop. ©Thomas Plümper 2007/08

Standard error and inference in non-random sample

Of course, we can draw inferences from non-random samples; they just have a higher probability of being wrong. It all depends on whether the non-random sample represents the population – and that we do not know.

Thus, we are less confident than the standard errors suggest. But that is fine.

Just do robustness checks… and we do so in the next week. ©Thomas Plümper 2007/08

Lesson 9: Robustness

Robustness can be defined as stability of results despite significant and realistic changes in the model specification.

Unrealistic model specifications that lead to different results do not indicate lack of robustness.

Hence, two questions emerge:

What is ‘robustness’ between results?

What are ‘realistic changes’ in model specifications? ©Thomas Plümper 2007/08

Before we start: Why Robustness Tests are important for you

Robustness tests help you understand the limits of your argument.

The inclusion of robustness tests in your paper makes your empirical section more convincing.

You may already address some concerns of the reviewers.

In sum, robustness tests lift the quality of your paper by approximately one level.

from C to B journals, from B- to B+, perhaps even from B to A, though getting published in A journals does not just require a good paper. ©Thomas Plümper 2007/08

What is ‘robustness’ between estimation results?

1) The ‘Sensitivity-Debate’

Leamer: permutation test with always significant coefficients of the same sign

Sala-i-Martin: permutation test with always the same sign

Dreher/Sturm: permutation test with the same sign in 95% of estimates

Monte Carlo permutation tests distinguish between essential X's and non-essential X's. Models always include the essential X and a few randomized non-essential X's. ©Thomas Plümper 2007/08
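A minimal sketch of such a permutation-style sensitivity analysis on simulated data (the number of controls, the number of models, and all coefficients are illustrative assumptions): the variable of interest enters every model, a random subset of the non-essential controls is added, and the sign and significance of the focal coefficient are recorded across permutations.

```python
# Hypothetical Sala-i-Martin-style sensitivity check on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n, n_controls, n_models = 500, 8, 200

x = rng.normal(size=n)
controls = rng.normal(size=(n, n_controls))
y = 0.4 * x + controls[:, :3] @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)

signs, significant = [], []
for _ in range(n_models):
    picked = rng.choice(n_controls, size=3, replace=False)      # random control set
    X = sm.add_constant(np.column_stack([x, controls[:, picked]]))
    fit = sm.OLS(y, X).fit()
    signs.append(np.sign(fit.params[1]))
    significant.append(fit.pvalues[1] < 0.05)

print("share with positive sign:", np.mean(np.array(signs) > 0))
print("share significant at 5%: ", np.mean(significant))
```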

Why Sensitivity Analyses do not work perfectly well

When sensitivity analyses show lack of robustness, this may well be caused by a serious mis-specification of many of the models in the MC analysis.

Accordingly, lack of robustness is not informative.

However, when results are robust, then fine (implies that results hold even if we allow for some mis-specified models).

THUS: when we believe in confirmation (if we are Bayesians) then sensitivity analyses provide a powerful tool. ©Thomas Plümper 2007/08

Alternatives to Sensitivity Analyses

selective (targeted) tests

These are based on an understanding of the main problems associated with your analysis.

Examples:

− crucial cases

− measurement error

− estimation procedure

− model specification (functional form, interaction effects, …)

− other proxy/measure for variable of main interest

− else? ©Thomas Plümper 2007/08

Crucial Cases

“In addition, we analyzed the effects of excluding the somewhat special case of anti-American terrorism from the sample. Americans are the major victims of international terrorism (see Author 2007b). The US is also the militarily strongest country in the world, with many international alliances and the highest possible level of democracy on the Polity scale. Hence, one might be concerned whether our results are driven by terror victims from a single country. We therefore re-ran models 2 and 4 on a sample that excludes the US as a target of terror to explore whether our estimation results hinge on this one special case. We find it does not. In model 4, for example, the vector coefficient increases from 0.87 to 1.04, while its interaction with the democracy level in the terrorists’ home country remains about constant (from -0.027 to -0.026) and the interaction with the democracy level in the target country declines only slightly from 0.033 to 0.026.”

Neumayer and Plümper 2007

Alternatives:

− bootstrap
− jackknife
− groupwise jackknife ©Thomas Plümper 2007/08
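A sketch of the groupwise jackknife on simulated data (the group variable "country" and all numbers are hypothetical): drop one group at a time, re-estimate, and report the range of the coefficient of interest across the leave-one-group-out samples.

```python
# Hypothetical groupwise jackknife: leave one country out at a time.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
countries = [f"c{i}" for i in range(15)]
df = pd.DataFrame({
    "country": np.repeat(countries, 40),
    "x": rng.normal(size=15 * 40),
})
df["y"] = 0.6 * df["x"] + rng.normal(size=len(df))

coefs = []
for c in countries:
    sub = df[df["country"] != c]                       # drop one group
    coefs.append(smf.ols("y ~ x", data=sub).fit().params["x"])

print(f"coefficient range across leave-one-out samples: "
      f"[{min(coefs):.3f}, {max(coefs):.3f}]")
```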

Bootstrap (Example: Plümper and Neumayer IO 2006) ©Thomas Plümper 2007/08

Groupwise Jackknife ©Thomas Plümper 2007/08

Measurement Error

“We therefore conducted a Monte Carlo study, which aims at exploring the effect of measurement error on our estimates. Specifically, we re-estimated model 1 1000 times. In each re-estimation, we multiplied the value of the dependent variable of approximately 15 percent of our observations by a uniform random number of the interval [0.5..1.5], which mirrors measurement errors of up to 50 percent. By drawing the measurement error from a uniform distribution, it is on average unlikely to be correlated with the explanatory variables. However, the actually drawn measurement error in each iteration may well be correlated with some of the regressors even if the average correlation over infinite iterations is zero. If we were just to report the mean coefficient estimates, then the Monte Carlo study would only address unsystematic measurement error. However, by reporting the full range of coefficients from the Monte Carlo study (minimum to maximum), we take each single iteration into consideration and thus account for some systematic measurement error as well. In other words, the range of the coefficients that we report offers an appropriate measure of the importance of measurement error.”

Plümper and Neumayer: Famine Mortality ©Thomas Plümper 2007/08
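A sketch of the procedure described in the quotation, applied to simulated data rather than the famine-mortality data (the true coefficients are made up): perturb the dependent variable of roughly 15 percent of observations by a uniform factor in [0.5, 1.5], re-estimate many times, and report the range of the coefficient of interest.

```python
# Hypothetical measurement-error Monte Carlo on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n, reps = 1000, 1000
x = rng.normal(size=n)
y = 2.0 + 0.8 * x + rng.normal(size=n)
X = sm.add_constant(x)

coefs = []
for _ in range(reps):
    y_err = y.copy()
    hit = rng.random(n) < 0.15                           # roughly 15% of observations
    y_err[hit] *= rng.uniform(0.5, 1.5, size=hit.sum())  # up to 50% measurement error
    coefs.append(sm.OLS(y_err, X).fit().params[1])

print(f"coefficient range under measurement error: "
      f"[{min(coefs):.3f}, {max(coefs):.3f}]")
```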

Example ©Thomas Plümper 2007/08

Estimation Procedure

Changing the estimation procedure makes sense if and only if various estimators are equally plausible.

Examples:

− nbreg vs. zinb
− poisson vs. nbreg
− fe vs. p-ols vs. re vs. fevd
− cox vs. any other survival model
− and so on

Important: due to the potential trade-off between efficiency and unbiasedness we hardly know, and certainly cannot trust textbooks on, the optimal estimator for the analysis of the data at hand. ©Thomas Plümper 2007/08

Example (Plümper and Neumayer Famine Mortality 2007) ©Thomas Plümper 2007/08

Functional Form

Functional forms tend to have a huge influence on results.

Yet, theories typically do not say much about functional forms.

distance or ln(distance) or log(distance) ?

gdpc or ln(gdpc)?

and so on…

Accordingly, 'to log or not to log' is a much-welcomed way of massaging results.

Alternatively, one can let the data decide and regress the dependent variable on an independent variable, its square, and its cube. Taylor-series regressions are pretty flexible on functional forms, but that is data mining. Btw: you have to calculate the resulting functional form when using higher-order Taylor-series regressions; the interpretation of coefficients is difficult and sometimes almost impossible. ©Thomas Plümper 2007/08
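A minimal functional-form robustness sketch on simulated data (the variable name gdpc and all numbers are illustrative assumptions): estimate the same model once with the level and once with the log of the regressor and compare.

```python
# Hypothetical level-vs-log robustness check on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(21)
n = 800
gdpc = rng.lognormal(mean=9.0, sigma=1.0, size=n)      # skewed, income-like variable
y = 0.5 * np.log(gdpc) + rng.normal(size=n)

for name, var in [("gdpc", gdpc), ("ln(gdpc)", np.log(gdpc))]:
    fit = sm.OLS(y, sm.add_constant(var)).fit()
    print(f"{name:9s} coef={fit.params[1]:+.4f}  p={fit.pvalues[1]:.3f}  R2={fit.rsquared:.3f}")
```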

Example Plümper and Martin 2003 ©Thomas Plümper 2007/08

Other proxies/variables for the independent variable of main interest or the dependent variable

Advantage: high probability of robustness.

Disadvantage: quite boring; unclear what a lack of robustness implies (a poorly coded variable?). ©Thomas Plümper 2007/08

Example: Plümper and Neumayer 2007 Alliance ©Thomas Plümper 2007/08

What else can be done? ©Thomas Plümper 2007/08

Summary

Robustness tests are important.

They help you understand your data.

They make your paper more convincing.

You may address concerns of referees before they can raise them (do not worry, they will come up with something else).

Make sure you find the toughest test for your theory. Do not report trivial robustness tests. ©Thomas Plümper 2007/08

How many robustness tests per paper?

Rule of thumb: 1-3. Do not add too many regression tables; put them in an appendix unless they are really interesting. ©Thomas Plümper 2007/08

Lesson 10: How to write Articles

Purposes

Introduction

Literature Review

Theory

Empirical Analysis

Robustness

Conclusion

Note: There are a million ways to organize and formulate an article. But only one is optimal. ©Thomas Plümper 2007/08

Introduction

The introduction does not introduce.

The introduction sells the article.

Most reviewers decide on accepting or rejecting the manuscript while reading the intro.

Take it seriously! ©Thomas Plümper 2007/08

Criteria of a good Introduction

The introduction

− suggests that the project is interesting and important,

− states the argument on the first page,

− clearly and briefly summarizes the argumentation,

− clearly and briefly summarizes the research design,

− clearly and briefly summarizes the findings,

− discusses and explains the contribution of the paper, and

− if necessary outlines the organization of the paper.

In this order… ©Thomas Plümper 2007/08

The first paragraph…

…is important.

Be clear. ©Thomas Plümper 2007/08

Examples ©Thomas Plümper 2007/08

The second paragraph… is equally important.

State your argument as clearly as possible. ©Thomas Plümper 2007/08

Examples ©Thomas Plümper 2007/08

The Discussion of your Contribution is essential

If you are too moderate, the referees may reject your paper because it makes no relevant contribution.

If you are not moderate enough, they will accuse you of overstating your point.

Sometimes, they will criticize you for not relating your work to their work.

That can be horrible, but not as horrible as a rejection for the mere reason that you made an argument that your referees have not made before. ©Thomas Plümper 2007/08

Examples ©Thomas Plümper 2007/08

Lessons learned

Mention the most important word of the article in the first sentence!

Explain what the paper does as soon as possible, but definitely on page 1.

Mention a few names on the first page – especially the names of guys you want to have as referees.

And do not criticize them!

Btw: you should learn how to describe your theory in five different ways. You’ll need it.

Make clear what your contribution is – perhaps without using the term, just describing the contribution. ©Thomas Plümper 2007/08

The Literature Review…

… does not review the literature. (What did you think?)

The literature review identifies a gap in the literature which is either identical to your argument or to your contribution.

Take it seriously. ©Thomas Plümper 2007/08

The most common mistake…

…of the literature review is that authors discuss one article after the other and do not summarize lines of argumentation.

You NEED to identify a structure in the literature! There will ALWAYS be a limited number of arguments.

AND the number of arguments is smaller than the number of articles! ©Thomas Plümper 2007/08

Example

You can state the gap in the literature in the 1st paragraph of the literature review. But you need not. ©Thomas Plümper 2007/08

And you can conclude by saying more or less the same… ©Thomas Plümper 2007/08

Limits

Discuss all relevant papers, and no irrelevant papers.

If in doubt, you had better discuss one paper too many than one too few.

Make sure that you know your argument and your contribution before you write the literature review. Otherwise you will waste your time.

NEVER start a paper with writing the literature review. It will be rubbish. ©Thomas Plümper 2007/08

Criteria of a Good Literature Review

A good literature review

− identifies a gap in the literature,

− summarizes different lines of argument,

− organizes the literature according to these previous arguments,

− discusses strengths and weaknesses of previous work in a fair way, and

− relates the gap in the literature to the argument developed in the next section. ©Thomas Plümper 2007/08

The Theory

Organization

You must write an introduction to the theory section summarizing the argument.

State your assumptions clearly.

Make sure that your hypotheses follow from your assumptions, but not immediately.

Develop your argumentation stepwise. Start simple, get more complicated.

Use simple English. Your significant other does not need to understand your argument, but s/he should understand your language (well, if s/he does).

Do not derive more than 3-4 and no fewer than 2 hypotheses from your argumentation.

Do NOT write a shopping list (at least, I have NEVER accepted a paper without a consistent theory)

Summarize your argument.

THERE ARE TOO MANY WAYS TO FORMULATE THEORY. ©Thomas Plümper 2007/08

First Paragraphs ©Thomas Plümper 2007/08

Empirical Analysis

Organization:

− Purpose of the Analysis
− Case Selection
− Period Selection
− Justification of
− Description of and Coding (if relevant)
− (if necessary)
− Justification of Method and Specification
− Results
− Description of Results
− fair Discussion of Quality and Relevance of Results
− Interpretation of Results in Relation to Theory
− Discussion of Competing Theories (if relevant) ©Thomas Plümper 2007/08

Conclusion

Summary of Findings and Results

Relevance of Findings

Discussion of Finding in Perspective ©Thomas Plümper 2007/08

The Writing Process: First things first

Theory

Empirical Analysis

Literature Review

Introduction and Conclusion ©Thomas Plümper 2007/08

Examples similar to introduction ©Thomas Plümper 2007/08

Figures and Tables

Use figures and tables, but use them moderately.

The golden rule of beauty: it takes 10% of the time to get 90% of the beauty, and 90% of the time to get the remaining 10%.

Worth it? Well, yes, even at the manuscript stage.

DO NOT: copy Stata output or paste raw Excel graphs.

Use consistent styles. ©Thomas Plümper 2007/08

Literature List

Chicago or Harvard manual of style.

Nothing else.

Do not include material you have not cited.

Do include all material you have cited. ©Thomas Plümper 2007/08

On Style

− invest in clarity
− write relaxed
− do not use pseudo-scientific jargon
− beware of words you use too often
− get rid of words that do not change the content of the sentence
− begin each PARAGRAPH with a summary statement
− begin each section with an introduction
− do not begin two consecutive paragraphs with the same word
− avoid passive constructions