Swing Models
MARK C. WILSON, University of Massachusetts Amherst
BERNARD N. GROFMAN, University of California Irvine
Submission to American Political Science Review. doi:xx.xxxx/xxxxx
Consider an election of a given type (say a legislative or presidential election) involving some set of districts for which we have data at two distinct points in time, time 1 and time 2. If there are only two political parties, A and B, we may think of the overall difference in the mean vote share of party A between the two elections as the aggregate inter-election swing between the two parties. If we know this mean inter-election swing, the question we seek to answer is: “How do we expect the aggregate swing to be distributed across the districts or states (or smaller units such as counties) as a function of previous vote share (and perhaps, other factors)?” In the electoral systems and political party literatures there have been two main answers to that question: uniform swing and proportional swing. Our main theoretical contributions are (a) to provide an axiomatic foundation for desirable properties of a model of inter-election changes in vote shares in a districted legislature; (b) to use those axioms to demonstrate why using uniform swing or proportional swing is a bad idea; (c) to provide a reasonably simple swing model that does satisfy the axioms; and (d) to show how to integrate a reversion to the mean effect into models of inter-election swing. Our main empirical contributions address the question of why, despite their theoretical flaws, there is strong evidence that the two standard models, especially uniform swing, provide a very good fit to empirical data. We show that these models can be expected to work well (a) when elections are close, (b) when we restrict ourselves to data where swing is low, or (c) when we eliminate the cases where the model is most likely to go wrong. In particular, sometimes the model tested is not the standard model, in that either (c1) a piecewise or truncated variant of the model is being used, or (c2) there is a correction (usually to 75%) for districts that are uncontested. As we show empirically with data from U.S. congressional elections, either of these corrections will at least marginally improve fit on one or more of five indicators: mistakes about directionality of change, mistakes in winner, estimates that are outside the [0, 1] bounds, mean-square error, and correlation between actual and predicted values. We also show that our new model provides an equal or better fit to U.S. congressional data than the traditional models, while having much nicer axiomatic properties.
Word Count: 5473
[email protected]. Corresponding author. Visiting Associate Professor, Mathematics and Statistics.
[email protected]. Jack W. Peltason Distinguished Professor, Political Science.
This is a manuscript submitted for review.
INTRODUCTION
There are many situations where we have data at two (or possibly more than two) points in time, and we are interested in comparing values between and among the data points for each subject or case. One such situation is an experiment in which there is a treatment effect which results in an overall mean change of $s$ units in some variable of interest after the intervention.
The situation in which we are interested here is a two-party contest (parties A and B) in the set of districts in a legislature, for which we have data at two distinct points in time, time 1 and time 2. In electoral situations, the variable of interest, vote share, is bounded in the interval $[0, 1]$. In this case, too, we can specify the magnitude of the change between time periods as $s$, where $s$ is now the overall difference in the mean vote share of one of the two parties, say party A, between the two elections. By symmetry, in a two-party contest, swing for party A is swing against party B, and conversely. In the electoral and party systems literature, $s$ is known as the aggregate inter-election swing between the two parties, which we often abbreviate simply as swing. [1]
If we know the mean inter-election swing, $s$, the question we seek to answer is: “How do we expect that swing to be distributed across the districts as a function of previous vote share (and perhaps, other factors)?” In the previous literature there have been two main answers to the above question: uniform swing and proportional swing.
The deterministic form of the uniform swing assumption posits that the swing $s_i$ in district $i$ is simply $s$, regardless of the previous outcome in that unit. In a more realistic probabilistic version of uniform swing, swing is posited to be uncorrelated with previous vote share in the unit, so that the expected swing in all units is $s$. [2] Butler, in a chapter in McCallum and Readman (1999, pp. 263–265), is credited with first using uniform swing to model British elections. While its limitations for multi-party contexts were noted by Butler and by later authors such as Curtice and Steed (1982), the concept has nonetheless subsequently become a workhorse model in the U.K. Multiparty competition in the U.K. is often dealt with by focusing on competition between the two largest parties. The use of the uniform swing model is equally common in the U.S., with a similar way of treating multi-party competition; see e.g. Katz et al. (2020). When we study congressional elections in the U.S. we also restrict ourselves to two-party competition.
[1] We do have to be careful with nomenclature, since swing is also used in the electoral systems and party literatures to refer to “responsiveness”, namely the percentage change in seat share for each one percentage point change in vote share (see e.g. Tufte 1973). That usage of “swing” has been extensively studied both theoretically and empirically. Here we will use the term “swing” only to refer to the magnitude of inter-election shifts in votes.
[2] In probabilistic versions of uniform swing, there is a stochastic error term with mean zero whose variance is usually taken to be linked to the historical density distribution of inter-election swing in recent elections. Some stochastic versions of swing include covariates, such as incumbency, in the model (see e.g. Katz et al. 2020).
Proportional swing, on the other hand, posits that each unit experiences the same percentage change from one election to the next, and thus units that have higher starting values experience greater change as measured in percentage point terms. We believe that the first proposed use of this model was by Berrington (1965), where it was offered as an alternative to uniform swing. In subsequent literature, proportional swing has become the main alternative to uniform swing (see e.g. Katz et al. 2020, where the fit of the two models is compared).
Even in the context of two-party politics the use of these models has always been problematic, since each has properties that make a good fit to empirical data implausible. And yet, when applied to British elections, as it has been by Butler and then by other authors in every post-WWII British National Election Study, or to U.S. congressional and other elections, the fit of these models, especially uniform swing, has been shown to be amazingly good; see e.g. Butler and Stokes (1969, pp. 140–151) and Katz et al. (2020). The remarkable empirical fit of uniform swing led Butler and Stokes (1969) to talk about the “paradox of swing”. They suggested that, on theoretical grounds, proportional swing should work better, and yet it did not. As Gershuny (1974, p. 116) puts it, after arguing for fundamental flaws with the proportional swing model: "The mass of the evidence is however that swings are virtually equal across constituencies." This finding has been confirmed by subsequent decades of observations.
We find neither approach to swing to be satisfactory. While many authors have criticized the uniform swing and proportional swing models, as far as we are aware, no author has sought to enumerate the desirable properties of a model of inter-election swing. Doing so is one of the several aims of the present article.
In Section 2, we provide a general model of inter-election swing, of the form
$$s_i := x_{i2} - x_{i1} = f(x_{i1}, s, \bar{x}_1)$$
where $x_{ij}$ is the vote share of party A in the $i$th district at time $j$, and a bar denotes a mean over districts. We provide an axiomatic characterization of the statistically desirable properties we wish the function $f$ to satisfy. [3]
We identify several key axioms. Axiom 1 requires that the average of the district-level swings $s_i$ over all districts equals $s$. We refer to this as mean-swing preservation. Axiom 2 requires that $0 \le x_{i1} + s_i \le 1$, which we refer to as respecting the bounds. Axiom 3 requires that the projected outcomes from time 1 to time 2 be independent of the labeling of the parties; we call this a symmetry condition. Axiom 4 states that if $s = 0$ then $s_i = 0$, which we call the no swing condition.
Having identified four key axioms, we next evaluate uniform swing and proportional swing with respect to them, discovering, as we should have expected, that neither of these models is satisfactory. In Section 2.2, we show that uniform swing violates one of the four conditions, and proportional swing violates two of them. We next look to see whether there are any functions that satisfy all the axioms. After proving an impossibility result applicable to a large family of models including both uniform and proportional swing, we find a positive solution in Section 2.2.3, via a piecewise-defined function that we feel has strong substantive arguments for adoption, and which has the simplest functional form we can find. We also show how to incorporate a reversion (sometimes called “regression”) to the mean effect while still satisfying our first three axioms.
Reversion to the mean is an obvious effect to wish to incorporate into a model of inter-election swing. In its simplest form it posits that outcomes are a product of fundamental factors and idiosyncratic factors, some of which will be constituency-specific. Idiosyncratic features can be represented as stochastic errors with mean zero. Since errors are posited to be uncorrelated across elections, the implication of a reversion to the mean effect is that, on average, outcomes will reflect fundamentals. But then especially large victories (defeats) are likely to involve idiosyncratic features which cannot be expected to persist into the next election. [4]
[3] Our approach here parallels that of Taagepera (2007) in his work on “logical models”, where great attention is paid to the functional form of relationships and how these are affected by boundary conditions that rule out certain outcomes as logically impossible.
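The reversion-to-the-mean mechanism described above can be illustrated with a small simulation. This is only a sketch under assumed values: the number of districts, the range of the fundamentals, and the noise level are hypothetical choices, not estimates from any data set.

```python
import random
from statistics import fmean

rng = random.Random(0)        # fixed seed so the sketch is reproducible
n = 5000                      # hypothetical number of districts
noise_sd = 0.05               # assumed size of the idiosyncratic errors

# Fundamentals: a persistent underlying vote share for each district.
fundamentals = [rng.uniform(0.3, 0.7) for _ in range(n)]

def election(mu):
    """One election: fundamentals plus fresh mean-zero noise, kept in [0, 1]."""
    return [min(1.0, max(0.0, m + rng.gauss(0.0, noise_sd))) for m in mu]

first = election(fundamentals)
second = election(fundamentals)   # errors are independent across elections

# Districts with the largest first-election shares (top decile).
cutoff = sorted(first)[int(0.9 * n)]
top = [i for i in range(n) if first[i] >= cutoff]

# Average change in those districts; negative means they fall back.
mean_change_top = fmean([second[i] - first[i] for i in top])
```

Because the top decile is selected partly on favourable noise that does not recur, the average change in those districts is negative even though there is no aggregate swing in expectation.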
In Section 3 we consider how our theoretical results impact on empirical analyses, beginning with “toy” results to illuminate the properties of the various models and culminating in analyses of two-party vote share in pairs of adjacent U.S. congressional elections from 1968 to 2016. In Section 3.1 we demonstrate with hypothetical data that, for close districts and small swings, all models will agree very often (see Table 5). In Section 3.2 we demonstrate with actual data that we can improve an already very good fit of these two standard models even further by changing the structure of the data, either by entirely eliminating certain extreme cases (uncompetitive districts) or by recoding uncompetitive districts from 100% to 75%. As we show, doing so improves fit on one or more of several indicators: mistakes about directionality of change, mistakes in winner, estimates that are outside the $[0, 1]$ bounds, mean-square error, and correlation between predicted and actual values. Our modified model provides an overall equal or better fit to U.S. congressional data than the traditional models, while having much nicer axiomatic properties.
Our results aid us in making sense of the exceptionally good fit of models such as uniform swing, especially once we recognize that some applications of these models are not actually uses of the standard model. For example, when Katz, King and Rosenblatt (2020) assess the fit of uniform swing and proportional swing, they are assigning non-competitive districts to be 75%-25% districts. In other studies of U.S. elections, such as ones to calculate seats-votes relationships, it is very common to see uncompetitive districts coded as 75% districts to avoid what would otherwise be seen to be misleading out-of-bounds results. [5]
[4] In the concluding discussion we will consider factors other than a reversion to the mean effect that might be added to more realistically model inter-election swing.
[5] Practices such as this recoding are taken for granted as minor. For example, Katz et al. (2020, footnote 6, p. 171) view this as "standard practice." They also observe that the effects of this imputation "have no material impact on our results." While generally correct, as we show later, this is too strong a claim.
AXIOMATIC PROPERTIES FOR FUNCTIONS MODELLING INTER-ELECTION SWING
We consider the following basic idealized situation. There are $N$ districts of equal size and two parties, A and B, contesting all districts. Unless otherwise specified we state results for party A. [6]
We will normally consider two elections, one at time $t = 1$ and one at time $t = 2$. Let $x_i$ denote the vote share of the party at the first election in district $i$, and $x_i'$ the vote share at the second election in that district. We use bars to denote mean values over districts, so that $\bar{x}$ denotes the mean over all districts of $x_i$, namely the overall vote fraction for that party. We use the following toy example throughout, in order to illustrate ideas concretely.
Example 2.1 (Toy example). There are two parties and two districts of equal size. Fix a parameter $\alpha$ with $0 \le \alpha \le 1/2$. In District 1, party A has fraction $1 - \alpha$ of the votes, and party B receives $\alpha$, while these vote shares are reversed in District 2. The overall vote share of each party is $(\alpha + (1 - \alpha))/2 = 0.5$, because the districts have the same size.
Definition 2.2. The district-level swing in district $i$ is given by
$$s_i := x_i' - x_i.$$
The aggregate swing is given by
$$s := \bar{x}' - \bar{x}.$$
Note that since $\bar{x}' = \bar{x} + s$, we must have $0 \le \bar{x} + s \le 1$ and so $-\bar{x} \le s \le 1 - \bar{x}$.
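As a minimal sketch of Definition 2.2, the following computes district-level and aggregate swing from two lists of vote shares. The function names and the three-district numbers are hypothetical illustrations, and equal-sized districts are assumed, as in the idealized setting above.

```python
from statistics import fmean

def district_swings(x_before, x_after):
    """District-level swings: second-election share minus first-election share."""
    if len(x_before) != len(x_after):
        raise ValueError("both elections must cover the same districts")
    return [after - before for before, after in zip(x_before, x_after)]

def aggregate_swing(x_before, x_after):
    """Aggregate swing: difference of mean shares, assuming equal-sized districts."""
    return fmean(x_after) - fmean(x_before)

# Hypothetical vote shares for one party in three districts (not data from the paper).
x1 = [0.40, 0.55, 0.70]
x2 = [0.45, 0.57, 0.66]

s_i = district_swings(x1, x2)   # one swing per district
s = aggregate_swing(x1, x2)     # equals the mean of the district swings
```

With equal-sized districts the aggregate swing is simply the mean of the district-level swings, and it necessarily lies between minus the old mean share and one minus the old mean share.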
Definition 2.3. By a naive swing model we mean a prediction of $x_i'$ of the form
$$x_i' = x_i + f(x_i, s)$$
where $f \equiv f_N$ is a fixed function (depending only on the number of districts $N$, but not on $i$ or $s$). Equivalently, $s_i = f(x_i, s)$ for all $i$. Note that the district-level swing depends on the previous vote fraction in that district, but not on the votes in other districts except via the overall swing.
[6] When we study congressional elections, we do the analyses in terms of two-party vote shares, rather than raw votes.
There are some obvious requirements for such a predictive model. With $N$ denoting the number of districts, the definition of district-level and aggregate swing yields
$$\frac{1}{N}\sum_{i=1}^{N} f(x_i, s) = s \qquad \text{(mean swing preservation)}. \tag{a1}$$
The next condition holds because $x_i'$ is bounded within the interval $[0, 1]$:
$$0 \le x_i + f(x_i, s) \le 1 \qquad \text{(respecting bounds)}. \tag{a2}$$
We also require a third condition, which we may think of as a symmetry condition similar to neutrality in social choice theory, namely that a universal swing model for two-party competition must give the same answer whichever party we look at. This translates to
$$f(x_i, s) + f(1 - x_i, -s) = 0 \qquad \text{(party symmetry)}. \tag{a3}$$
If we think of the vote shares of the parties as arranged in a matrix with one row per party and one column per district, (a1) can be interpreted as a row sum condition, and (a3) as a column sum condition.
There is one other condition, a very strong one, that we also wish to mention:
$$f(x_i, 0) = 0 \qquad \text{(no swing condition)}. \tag{a4}$$
This condition says that when there is no aggregate swing, our best predictor of outcomes in individual districts is that they will be unchanged from the previous period. While several of the models we discuss below do satisfy this property, we do not regard this as a desirable property to impose generally on inter-election swing. The reason is quite simple: there are many substantive effects, such as a reversion to the mean effect, and other substantive effects mentioned in the concluding discussion, that will cause violation of this condition.
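The four conditions can be spot-checked numerically for any candidate naive swing model. The sketch below is one hedged way to do so: the helper `check_axioms`, the tolerance, and the three-district profile are assumptions introduced here, and passing on one profile is evidence, not a proof, that a model satisfies an axiom in general.

```python
from statistics import fmean

TOL = 1e-9  # numerical tolerance (an assumption of this sketch)

def check_axioms(f, x, s):
    """Spot-check (a1)-(a4) for a naive swing model f(x_i, s) on one profile x."""
    swings = [f(xi, s) for xi in x]
    return {
        # (a1): district swings average to the aggregate swing
        "a1_mean_swing": abs(fmean(swings) - s) <= TOL,
        # (a2): every predicted share stays inside [0, 1]
        "a2_bounds": all(-TOL <= xi + si <= 1 + TOL for xi, si in zip(x, swings)),
        # (a3): relabelling the parties flips the sign of the swing
        "a3_symmetry": all(abs(f(xi, s) + f(1 - xi, -s)) <= TOL for xi in x),
        # (a4): zero aggregate swing gives zero district swing
        "a4_no_swing": all(abs(f(xi, 0.0)) <= TOL for xi in x),
    }

def same_shift_everywhere(xi, s):
    """Illustrative model: every district moves by exactly s."""
    return s

report = check_axioms(same_shift_everywhere, [0.30, 0.50, 0.70], 0.05)
```

On this moderate profile and small swing all four checks pass; a check such as the bounds condition can still fail for the same model on a more extreme profile or a larger swing.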
Existing models of inter-election swing
By far the most commonly used model of inter-election swing is the Uniform Swing model, in which
$$f(x_i, s) = s \quad \text{for all } i.$$
Thus $x_i' = x_i + s$ for all $i$. In other words, the same net fraction of voters changes to party A in each district. For example, in the toy model with $\alpha = 0.2$, a 6% swing to A changes its vote share in District 1 to 86% and in District 2 to 26%.
We first note that although it is easily seen that Uniform Swing satisfies (a1), (a3) and (a4), it does not satisfy (a2). For example, in the toy model, the upper bound for party A in District 1 is violated whenever $s > \alpha$, since then $x_1 + s = 1 - \alpha + s > 1$. Note that the lower bound for B in District 1 would also be violated.
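A short sketch of Uniform Swing on the two-district toy example, with party A at 80% in District 1 and 20% in District 2. The 6% swing is the value used in the text; the 30% swing is a hypothetical larger value chosen here to show a predicted share leaving the unit interval.

```python
def uniform_swing(x, s):
    """Uniform Swing prediction: add the same swing s in every district, untruncated."""
    return [xi + s for xi in x]

def out_of_bounds(shares):
    """0-based indices of districts whose predicted share falls outside [0, 1]."""
    return [i for i, v in enumerate(shares) if v < 0 or v > 1]

x = [0.8, 0.2]                      # party A in Districts 1 and 2

small = uniform_swing(x, 0.06)      # about [0.86, 0.26]
large = uniform_swing(x, 0.30)      # District 1 is pushed above 1

violations_small = out_of_bounds(small)
violations_large = out_of_bounds(large)
```

The larger swing is still admissible in aggregate, since the mean share is 0.5 and so any swing up to 0.5 is possible, yet the District 1 prediction exceeds 100%.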
There is a good reason why we have not presented a naive swing model satisfying all axioms (a1) – (a4): namely that no such model exists.
Proposition 2.4. No naive swing model can satisfy both (a1) and (a2).
Proof. Consider the situation where all district vote shares are equal, so that $x_i = \bar{x}$ for all $i$. Then (a1) implies that $f(\bar{x}, s) = s$ for all $s$. Since $\bar{x}$ can take any value in the interval $[0, 1]$, this shows that $f(x_i, s) = s$ for all $s$, in general. In other words, the row sum condition is enough to yield the uniform swing model, and since the uniform swing model fails (a2), the claimed impossibility follows.
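The key step of the proof, together with one concrete bounds failure, can be written out as follows. Here $N$ denotes the number of districts (a symbol assumed in this sketch), and the toy-example values of 80% in District 1 and a swing of 0.3 are illustrative choices rather than figures from the paper.

```latex
% Equal shares: every district has x_i = \bar{x}, so the average in (a1)
% is an average of N identical terms.
\frac{1}{N}\sum_{i=1}^{N} f(\bar{x}, s) \;=\; f(\bar{x}, s) \;=\; s .

% Hence f must be the uniform swing rule. In the toy example with
% x_1 = 0.8, x_2 = 0.2 (so \bar{x} = 0.5) and s = 0.3, which is admissible
% because -\bar{x} \le s \le 1 - \bar{x}:
x_1 + f(x_1, s) \;=\; 0.8 + 0.3 \;=\; 1.1 \;>\; 1 ,
% so (a2) fails in District 1.
```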
Thus in order to obtain a broader class of well-behaved simple models, we need a more general functional form for $f$.
Definition 2.5. A swing model is a prediction of the form
$$x_i' = x_i + f(x_i, s, \bar{x})$$
for some function $f$.
The properties (a1) – (a4) above correspond to analogous properties, with the same motivations:
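$$\frac{1}{N}\sum_{i=1}^{N} f(x_i, s, \bar{x}) = s \qquad \text{(mean swing preservation)} \tag{A1}$$
$$0 \le x_i + f(x_i, s, \bar{x}) \le 1 \qquad \text{(respecting bounds)} \tag{A2}$$
$$f(x_i, s, \bar{x}) + f(1 - x_i, -s, 1 - \bar{x}) = 0 \qquad \text{(party symmetry)} \tag{A3}$$
$$f(x_i, 0, \bar{x}) = 0 \qquad \text{(no swing condition)}. \tag{A4}$$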
We can now introduce the only other prominent swing model in the literature: Proportional Swing, defined via the formula
$$f(x_i, s, \bar{x}) = s\,x_i/\bar{x}.$$
Note that the definition implies that
$$\frac{x_i'}{\bar{x}'} = \frac{x_i}{\bar{x}},$$
so that the relative share among districts of the party’s overall vote does not change between elections. For example, in the toy model with $\alpha = 0.2$, a 6% swing to A is a 12% relative increase in its overall vote share. The Proportional Swing prediction for A in District 1 is 89.6% and in District 2 it is 22.4%. Of course if we think of this as a −6% swing to B, then this is a relative decrease of 12% in the overall vote share of B, and hence the vote fraction of B under this model should be 17.6% in the first district and 70.4% in the second. The two parties' predicted shares therefore sum to 107.2% in District 1 and 92.8% in District 2, rather than 100%. This causes a violation of the bounds condition (A2) and also shows that (A3) fails. This property of Proportional Swing was noticed early, for example by McLean (1973) and Gershuny (1974).
It is easily verified that Proportional Swing does satisfy (A1) and (A4).
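The Proportional Swing arithmetic can be reproduced with a few lines of code. This is a sketch for the two-district toy example with party A at 80% and 20% and a 6% swing; the variable names are introduced here for illustration.

```python
from statistics import fmean

def proportional_swing(x, s):
    """Proportional Swing: each district's swing is s scaled by x_i / mean(x)."""
    x_bar = fmean(x)
    return [xi + s * xi / x_bar for xi in x]

s = 0.06
shares_a = [0.8, 0.2]                       # party A in Districts 1 and 2
shares_b = [1 - v for v in shares_a]        # party B gets the remainder

pred_a = proportional_swing(shares_a, s)    # about [0.896, 0.224]
pred_b = proportional_swing(shares_b, -s)   # about [0.176, 0.704]

# Predicted shares of the two parties in each district; they should sum to 1.
district_totals = [a + b for a, b in zip(pred_a, pred_b)]   # about [1.072, 0.928]

# The mean of party A's predicted shares rises by exactly s.
realized_swing = fmean(pred_a) - fmean(shares_a)
```

The realized mean swing matches the input swing, while the district totals differ from 1, so the prediction depends on which party the model is applied to.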
Some new families of models
Do there exist any models that satisfy all axioms (A1)–(A4)? Consider the special case with 2 districts. Then we seek a mapping from a $2 \times 2$ contingency table with column sums equal to 1 and fixed row sums $r_1, r_2$ to another such table with $r_1, r_2$ replaced by $r_1', r_2'$, subject to $r_1' = r_1 + 2s$ and $r_2' = r_2 - 2s$, where the first row is that of party A. We can impose (A4) without any consequences, since the mapping can simply output the original table when $s = 0$. We have already accounted for (A1) and (A3). To satisfy (A2) we require that given a vote share $a$ for party A in district 1, we can choose vote share $a'$ for the same party in the same district such that $0 \le a' \le 1$ and $0 \le a + b + 2s - a' \le 1$, where $b$ is that party's vote share in district 2. This is always possible because of the constraints on $s$. Thus it is possible to satisfy all four axioms. The question is whether we can do so with a reasonably simple and understandable functional form.
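The two-district feasibility argument can be made concrete. In the sketch below, `a` and `b` are one party's first-election shares in Districts 1 and 2 (the label `b` is an assumption of this sketch), and a mean swing `s` raises the two-district total by `2 * s`. The function returns the range of new District 1 shares that keeps both districts inside the unit interval.

```python
def feasible_interval(a, b, s):
    """Range of new District 1 shares keeping both districts in [0, 1].

    a, b: first-election shares in Districts 1 and 2.
    s:    mean swing over the two districts, so the total rises by 2 * s.
    """
    total = a + b + 2 * s          # sum of the two second-election shares
    if not 0 <= total <= 2:
        raise ValueError("swing is outside the range allowed by the mean share")
    # District 2 receives total - a_new, which must also lie in [0, 1].
    return max(0.0, total - 1), min(1.0, total)

low, high = feasible_interval(0.8, 0.2, 0.3)   # new total is about 1.6
a_new = high                                   # pick any value in [low, high]
b_new = 0.8 + 0.2 + 2 * 0.3 - a_new            # District 2 share, about 0.6
```

The interval is non-empty exactly when the new total lies between 0 and 2, which is what the constraint on the aggregate swing guarantees.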
Beginning with uniform swing and proportional swing models, a natural question is whether there are minimal changes that can be made to one or both of those models that would satisfy axioms (A1)–(A3). The no swing condition (A4) is inconsistent with the empirically observed phenomenon of reversion to the mean for extreme vote shares, so imposing (A4) seems too strong in general. We have not yet seen any model that satisfies axioms (A1)–(??). We now look for one, by considering some models defined by simple formulae. After some false starts we eventually find one, in Section 2.2.3.
Truncations. One variant of the uniform swing model would be simply to impose truncation, i.e., to set $x_i + f(x_i, s) = 1$ if $x_i + s \ge 1$ and $x_i + f(x_i, s) = 0$ if $x_i + s \le 0$. Note that this procedure of simply truncating out-of-bounds values to be either 0 or 1 was used by expert witnesses estimating racially polarized voting in voting rights cases using Goodman’s (1953, 1959) method of ecological regression. However, that methodology has largely been replaced with ecological inference techniques (King 1997) that ensure that estimates are within bounds. In any case, in our situation the truncation procedure leads to failure of axiom (A1).
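A brief sketch of why truncation breaks mean swing preservation, using the two-district toy example with party A at 80% and 20%. The swing of 0.3 is a hypothetical value chosen here so that truncation actually binds.

```python
from statistics import fmean

def truncated_uniform_swing(x, s):
    """Uniform Swing with each prediction clipped to the interval [0, 1]."""
    return [min(1.0, max(0.0, xi + s)) for xi in x]

x = [0.8, 0.2]     # party A in Districts 1 and 2
s = 0.3            # intended aggregate swing (admissible, since the mean is 0.5)

pred = truncated_uniform_swing(x, s)      # District 1 is capped at 1.0
realized = fmean(pred) - fmean(x)         # swing actually delivered by the model
```

District 1 can absorb only 0.2 of the intended 0.3, so the realized mean swing is 0.25 rather than 0.3.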
Models that are linear in $s$. We start with the family of linear models given by formulae of the type
$$s_i = s\,g(x_i, \bar{x}).$$
Of course, we mean linear in the variable $s$; the functional dependence on $x$ is not specified. If $g(x_i, \bar{x}) = 1$, this gives Uniform Swing. The Proportional Swing model is also in this family, with $g(x_i, \bar{x}) = x_i/\bar{x}$.
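For this linear family, mean swing preservation reduces to a normalization condition on the multiplier function, as the following short derivation sketches. Here $N$ denotes the number of districts and $g$ the multiplier function of the family (notation assumed for this sketch).

```latex
% Substitute s_i = s\, g(x_i, \bar{x}) into the mean swing condition (A1):
\frac{1}{N}\sum_{i=1}^{N} s\, g(x_i, \bar{x}) = s
\quad\Longleftrightarrow\quad
\frac{1}{N}\sum_{i=1}^{N} g(x_i, \bar{x}) = 1 \qquad (s \neq 0).

% Uniform Swing: g \equiv 1, so the average is 1.
% Proportional Swing: g(x_i, \bar{x}) = x_i/\bar{x}, so the average is
% \bar{x}/\bar{x} = 1.
```

Any member of the family also satisfies the no swing condition automatically, since $s = 0$ forces every district-level swing to be zero.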
Example 2.6. McLean (1973) considers models using transition matrices, in which we account for the fraction of voters of party $i$ that vote for party $j$ in the next election (this includes abstainers as a party). Consider the simple case of no abstainers and two parties. If $a$ is the fraction who switch from party 1 to party 2, then in order to satisfy row and column sums, the matrix must have the form