Public Attention and the Polls: The Political Limelight and the 2016 Presidential Race

Geoffrey Skelley

April 28, 2017

Abstract

During the 2016 presidential election, some observers noted a negative relationship between the amount of public attention a candidate received and his or her poll numbers. Using a series of impulse response functions from vector autoregression models based on polling and attention data, I find inconclusive evidence as to whether more attention led to worse polling numbers. I also find some evidence of reverse causation, i.e. of poll numbers affecting levels of public attention.

Introduction

The 2016 presidential contest was arguably the most remarkable and strange election in modern U.S. political history. A political novice, Republican Donald Trump, upset a veteran politician, Democrat Hillary Clinton, to narrowly win the White House. Trump’s victory startled many observers, and his success throughout the Republican presidential primaries and the general election challenged many notions about how the American political system functions.

The two major-party nominees were both very well known to the American public at the outset of the campaign. Donald Trump had long been a cultural figure of note, entering the public consciousness in numerous ways, including the long-running television show The Apprentice. Just days after Trump announced his presidential campaign on June 16, 2015, Huffington Post Pollster’s trend line showed that nearly 90% of the public could say it had a favorable or unfavorable opinion of the eventual Republican nominee (“Donald Trump Favorable Rating” 2017). Conversely, at that same point, only about 70% of the public expressed a favorable or unfavorable opinion about Senator Ted Cruz of Texas, a leading player in the GOP presidential field (“Ted Cruz Favorable Rating” 2017). Hillary Clinton had been a public figure for a quarter of a century when she began her 2016 campaign, having served as First Lady, a U.S. Senator, and Secretary of State. When she announced her candidacy on April 12, 2015, almost 95% of the public had a favorable or unfavorable opinion of her (“Hillary Clinton Favorable Rating” 2017).

But importantly, the public’s general view of both candidates was very negative, and this remained largely unchanged throughout the election campaign. Countless observers pointed out that Clinton and Trump were the most disliked pair of general election candidates “ever,” i.e. for some period of time in the era of public polling. As Clinton and Trump were wrapping up their nominations in the spring of 2016, Harry Enten (2016) of FiveThirtyEight noted that, dating back to 1980, no eventual major-party nominees had worse average “strongly unfavorable” ratings from late March to late April of an election year than the 2016 frontrunners. By the end of the election, Gallup found that Clinton and Trump had the worst unfavorable ratings of any major-party nominees the polling organization had measured since 1956: 52% and 61%, respectively (Saad 2016).

Of course, Clinton and Trump gave the public—and the media—plenty to react to negatively. For Clinton, the scandal involving the use of a private email server while she served as Secretary of State dogged her throughout the campaign, particularly the FBI investigation regarding the affair. When asked specifically about what they had read, heard, or seen about Clinton in the past day or two, Gallup respondents from July 11 to September 18, 2016, overwhelmingly said “email” (Newport et al. 2016). Additionally, her health became an issue when she was witnessed having difficulty getting into a vehicle on September 11, 2016, and her campaign suffered from multiple releases of controversial internal Democratic National Committee communications by WikiLeaks. Trump created controversy with many of his statements, beginning with his presidential announcement speech, when he said people coming from Mexico were “bringing crime” and were “rapists.” Later, he attacked the Muslim family of a fallen soldier after the father and mother spoke at the Democratic National Convention. Perhaps the most controversial moment of Trump’s campaign came in early October, when a video and audio recording of a 2005 appearance on the television show Access Hollywood revealed Trump making a series of lewd comments about his relations with women. Whether it was emails or offensive quotes, the two candidates had myriad scandals and controversies for the public and the media to scrutinize.

The plethora of negative stories and problems for the two candidates prompted some observers to note an apparent relationship between which candidate was dominating the political limelight—often because of a negative story—and the state of that candidate’s polling numbers. That is, with an abundance of negative information about Clinton and Trump to focus on, if one candidate was in the midst of a widely followed scandal, that candidate’s position in the polls seemed to get worse. In the middle of September, analysts at Sabato’s Crystal Ball newsletter observed a negative correlation between a candidate’s margin in the polling averages and the amount of attention the candidate received from the public, based on a question from Gallup that asked respondents if they had recently encountered news about the candidates (Sabato, Kondik, and Skelley 2016). Thus, if a candidate received more attention, he or she saw his or her poll standing worsen, and vice versa.

This intuitively made sense: the 2016 election featured two candidates who were (1) notably disliked by the public and (2) afflicted with a multitude of scandals and controversies that helped to sustain those negative perceptions. While the media’s tendency to engage in a “feeding frenzy” when a scandal erupts has unquestionably contributed to the increasingly negative tenor of political campaigns (Sabato 1993), the 2016 campaign featured two scandal-ridden candidates who provided the media with plenty to chew on. Between attack journalism and two controversial candidates, the tone of the 2016 election was one of the most negative ever. According to the Shorenstein Center’s post-election media report, 71% of the coverage of Clinton and Trump was negative, a share surpassed only by the 2000 campaign, “when news reports questioned whether Al Gore was trustworthy enough and George W. Bush was smart enough to deserve the presidency” (Patterson 2016, 4).

As Election Day drew closer, further polling and attention data seemingly confirmed earlier observations about negative correlations between public attention and the candidates’ poll numbers. Comparing different polling averages—from Huffington Post Pollster and RealClearPolitics—and different measures of public attention, Skelley (2016) found correlations as strong as -.70 between Trump’s polling margin and the amount of attention he was receiving over the period from September 10 to October 10, 2016. For a longer period, from August 1 to October 10, 2016, correlations were still as strong as -.46.
Following the election, analysts at Sabato’s Crystal Ball again found strong negative correlations between polling margins and public attention in the final month of the election campaign (Kondik and Skelley 2016). A pair of Republican pollsters also noted the negative relationship between another measure of public attention, Google Trends search data, and candidate standing. They found that in the 30 days prior to Election Day, there were more searches for “Donald Trump” in states that the Republican candidate lost than in the states he won (Anderson and Ruffini 2016). Thus, we are left with a straightforward hypothesis regarding the impact of public attention on the 2016 campaign:

1. More public attention → worse poll numbers, or less attention → better poll numbers.

Exploring the relationship between the public’s attention during the 2016 campaign and the fortunes of Clinton’s and Trump’s poll numbers is a primary goal of this paper. However, given the nature of the negative correlations between these two pieces of information, we should consider an alternative hypothesis—reverse causation:

2. Worse poll numbers → more public attention, or better poll numbers → less public attention.

Why might the second relationship exist? The public’s attention to the campaign is undoubtedly influenced by the coverage choices of the media. The Shorenstein Center report on press reporting during the 2016 general election found that a plurality of all coverage, 42%, focused on elements of the “horse race” (Patterson 2016, 8); that is, reporting on changes in the polls as well as the strategies and tactics of the campaigns. The media’s focus on the horse race follows a pattern that dates back to the 1970s, when news outlets began regularly commissioning their own polling, and it has continued through primary and general elections ever since (Farnsworth and Lichter 2014; Patterson 1980, 1994). Horse-race stories and sound bites therefore make up a significant portion of the campaign coverage presented to the American public. There are a number of reasons for this, including cultural changes in news coverage due in part to a massive expansion in media sources, an increased reliance by press organizations on regurgitating news reported by other outfits rather than pursuing original reporting, cutbacks in newsroom budgets, and the mushrooming of available polling data to report on (Rosenstiel 2005). But the public itself may share some of the blame for the substantial horse-race coverage as well: at least one study found that news consumers are more interested in the ups and downs of the polls and the strategic political picture than in other aspects of the presidential campaign (Iyengar, Norpoth, and Hahn 2004).

What kind of horse-race coverage did Americans digest during the 2016 campaign? Just like other campaign subjects, the tone of horse-race coverage was more negative (59%) than positive (41%) (Patterson 2016, 10). The nature of the polls can lead reporters to make interpretations that their readers or viewers then receive. If a candidate is polling well, the media can attribute that improvement to supposedly smart campaign strategies or positive character traits. If a candidate is polling poorly, the media may connect his or her problems to supposed campaign mistakes or negative character flaws (Patterson 2005). With negative coverage of polling reports and with polls serving as a framing device for media interpretation, we can see how relationship #2 is possible. The media covered the 2016 horse race more than anything else, and the coverage of the horse race was relatively negative. With two highly disliked, flawed candidates, there was plenty of material for reporters to use when negatively framing polling stories in the context of whatever scandal or controversy reigned over the campaign for the day, week, or month. Thus, good polling news for Clinton may have been overshadowed by the inverse bad polling news for Trump as negative stories predominated, and vice versa. The public then sees more stories about who is struggling than about who is doing well. Therefore, worse poll numbers could conceivably have caused more public attention instead of more public attention causing worse poll numbers.

This work employs time series techniques to investigate the relationship between the attention of the American public and the polls during the 2016 campaign. Because the direction of causation is uncertain, a natural tool is to test for Granger causality using vector autoregression (VAR) models. To then see the impact of one variable on the other—or the lack thereof—this study utilizes a series of impulse response functions (IRFs) in an attempt to measure lagged effects.

Data

Earlier analyses of the attention-polls relationship employed data from two well-known sources that measured the public’s attention to the candidates, as well as two polling averages or trend lines that were commonly cited throughout the 2016 campaign (Kondik and Skelley 2016; Sabato, Kondik, and Skelley 2016; Skelley 2016). Throughout the campaign, Gallup regularly released data on a number of election indicators (“Presidential Election 2016: Key Indicators” 2016). Among these were daily responses to a question that asked respondents, “Did you read, hear, or see anything about [candidate name] in the last day or two?” Gallup’s data for this question cover 126 days, from July 5 to November 7, 2016. Figure 1 lays out the comparative public attention to Clinton and Trump based on the daily “Yes” percentage of responses during that period.[1]

[1] It should be noted that Gallup did not report data for five days in that timespan: July 13, July 21, September 7, October 6, and November 1.

Figure 1: Gallup Yes Percentage for Clinton and Trump (y-axis: percentage answering 'Yes'; x-axis: time in days; lines for Clinton and Trump)

The second measure of public attention comes from Google Trends, which is an “unbiased sample” of Google’s search data that has been normalized “as a proportion of all searches on all topics on Google at that time and location” (Rogers 2016). For this investigation, the Google Trends data are based on searches for the terms “Hillary Clinton” and “Donald Trump” in the United States. To make the data a better parallel to the Gallup data, the Google data were restricted to the category “News” and to the timespan July 5 to November 7, 2016. The comparative data for Clinton and Trump are presented in Figure 2.

Figure 2: Google Trends for Clinton and Trump (y-axis: normalized proportion of searches in the U.S.; x-axis: time in days; lines for Clinton and Trump)

But in order to better compare the two measures of public attention, this investigation uses attention margin as one of the key variables, i.e. the difference in attention between the two candidates on a given day based on the Gallup and Google Trends data. For example, when Clinton had her health scare on September 11, 2016, she got notably more attention than Trump did the following day. On September 12 (Day 70), 83% of Gallup respondents said “yes” to having heard, read, or seen something about Clinton, versus 71% for Trump. Thus, Clinton had 12 percentage points more attention that day than Trump. In Google’s measure, her edge in attention margin was far starker: the proportion of searches for her in the United States reached 67% while Trump’s proportion was just 15%, giving Clinton 52 percentage points more attention that day. The attention margin data for the two candidates are charted below in the methodology section.
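To illustrate how such a measure could be assembled, here is a rough sketch in R, the language suggested by the figures in this paper. The gtrendsR package and the category code for “News” are assumptions on my part, not the author’s documented workflow:

```r
# Sketch: pull Google Trends "News" data and compute daily attention margins.
# Assumes the gtrendsR package; category = 16 is assumed to be the code for
# Google's "News" category. The hits column may need numeric coercion.
library(gtrendsR)

trends <- gtrends(keyword = c("Hillary Clinton", "Donald Trump"),
                  geo = "US",
                  time = "2016-07-05 2016-11-07",
                  category = 16)

iot <- trends$interest_over_time
clinton <- as.numeric(iot$hits[iot$keyword == "Hillary Clinton"])
trump   <- as.numeric(iot$hits[iot$keyword == "Donald Trump"])

# Attention margin: Clinton's attention minus Trump's on each day.
# On Day 70 (September 12), this was 67 - 15 = 52 points in Clinton's favor.
google_margin <- clinton - trump
```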

Turning to the polls, this analysis uses three different election polling measures over the July 4 to November 7, 2016, period of the campaign. The reason for beginning with July 4 rather than July 5 is discussed below in the methodology section. First, two of the predominant media polling averages or trend lines are utilized: the polling average from RealClearPolitics, which was an arithmetic mean of selected polls, and the trend line from Huffington Post Pollster, which employed a Bayesian Kalman filter model to calculate a polling trend (Edwards-Levy, Jackson, and Velencia 2017, 156). Figures 3 and 4 present these data. As polling aggregators make certain choices about what polls to include and how to calculate their results, I also used a self-constructed polling average based on the methodology of Erikson and Wlezien (2012) to further broaden the analysis. This polling average encompasses all polls with a median survey date within the timeframe of interest, based on the databases of Huffington Post Pollster and RealClearPolitics. Figure 5 presents the self-constructed polling average.[2] To correspond with the Erikson and Wlezien methods, which calculate the two-party presidential vote, the RealClearPolitics and Huffington Post Pollster measures are also calculated as the two-party vote of their aggregate averages or trends.

Figure 3: Huffington Post Pollster Trend (y-axis: two-party percentage; x-axis: time in days; lines for Clinton and Trump)

[2] One difference between my self-constructed polling average and the Erikson and Wlezien methodology is that I included polls conducted by means other than live phone interviews. An abundance of the polling done in the 2016 cycle used other methods, particularly internet polls employing approaches such as online panels and multilevel regression with post-stratification. It should also be noted that the self-constructed polling average includes interpolated data for the following median dates: July 12, July 17, July 20, July 25, August 14, August 17, September 4, September 14, September 25-26, October 4, October 25, and October 29. I followed Erikson and Wlezien’s methods for interpolation. The values for November 7 are the same as for November 6, as there were no polls with a median survey date on the day before the November 8 election.
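For illustration, the gap-filling could be done with interpolation between the nearest polled days, a sketch that assumes the interpolation is linear and uses hypothetical object names:

```r
# Sketch: fill gaps in a daily polling average by linear interpolation.
# poll_days: integer day indices that have a computed polling average
# poll_avg:  the polling average on those days
# all_days:  every day index from July 4 to November 7
# rule = 2 repeats the endpoint value, so November 7 (no polls with a
# median survey date) takes November 6's value, as described above.
filled <- approx(x = poll_days, y = poll_avg, xout = all_days,
                 method = "linear", rule = 2)$y
```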

Figure 4: RealClearPolitics Polling Average (y-axis: two-party percentage; x-axis: time in days; lines for Clinton and Trump)

Figure 5: Self-Constructed Polling Average (y-axis: two-party percentage; x-axis: time in days; lines for Clinton and Trump)

As all three figures of the different polling aggregates show, the race had some predictable spikes. For instance, both candidates saw poll bounces after their conventions in the second half of July, around Day 20 for Trump and a few days later for Clinton as the conventions were in back-to-back weeks. Having been viewed as the winner of most of the debates, Clinton saw some spikes during the debate season in late September to mid-October. Overall, the three poll aggregates present a somewhat similar picture, with the trend-line version from Huffington Post being the smoothest and the self-constructed average being the noisiest. Next, I discuss the methodological approach for this paper, and the ways in which the polling data have to be transformed.

Methodology and Data Transformation

One challenge in this analysis is stationarity. A stationary time series tends to return to a long-term mean, and the effect of some shock to such data decays over time. However, some time series are non-stationary, meaning that shocks are not forgotten and the series is integrated.

Non-stationarity complicates the methodological choices for analyzing a time series, because modeling an integrated time series as a stationary process presumes there is decay when in fact there is none. This leads to incorrect estimates of the effects of lagged variables on the current dependent variable. In the case of this investigation, multiple tests of stationarity indicate that none of the polling time series are stationary. A useful hypothesis test for a unit root (i.e. non-stationarity) is the Dickey-Fuller (DF) test: if we reject the null hypothesis that a unit root process is present, the test suggests that the time series in question is stationary. DF tests for both candidates’ three sets of polling data (for each polling average or trend) all failed to reject the null hypothesis, indicating that all are likely non-stationary. These tests confirmed my suspicions based on the lack of a generally common return point for either candidate in all three polling aggregates. One approach for handling non-stationarity is to take the first difference of the data, which can often make the data stationary. In the case of all three polling data sets, DF tests indicated that the first differences of the nominees’ survey numbers were stationary. The first difference for a daily poll average is calculated as follows:

$$\Delta PA_t = PA_t - PA_{t-1}$$
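A minimal sketch of these checks in R, assuming the tseries package for an (augmented) Dickey-Fuller test and a hypothetical polling series:

```r
# Sketch: unit-root testing and first differencing of one polling series.
# clinton_rcp is a hypothetical daily two-party polling average for Clinton.
library(tseries)

# ADF test on the level series: failing to reject the null suggests a
# unit root, i.e. non-stationarity.
adf.test(clinton_rcp)

# First-difference the series and re-test; per the text, the differenced
# polling series reject the null, indicating stationarity.
clinton_rcp_diff <- diff(clinton_rcp)
adf.test(clinton_rcp_diff)
```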

The need to use first-differenced polling numbers explains why I included polling data from July 4 rather than beginning at July 5: doing so allows a first difference to be calculated for July 5 by subtracting July 4’s polling aggregate from July 5’s for each candidate. The first difference measures the day-to-day change in a candidate’s polling position. That is, if a candidate’s polling average is 52% on Day 1 and 51% on Day 2, there is a -1 point first difference for Day 2 (Day 2 − Day 1). If Day 3’s polling average is 51.5%, Day 3’s first difference is +0.5 points (Day 3 − Day 2). Shifts in this first difference, which are what the impulse response functions below trace out, are similar to the idea of the second derivative: the first derivative of a function gives the slope, while the second derivative shows the change in the slope. Overall, the analysis involves the first-differenced polling data for a candidate and that candidate’s attention margin on a given day. The unit root challenge does not occur in the data comparing Clinton’s and Trump’s attention on a given day: all DF tests for the Gallup and Google attention margins reject the null hypothesis at the p = .05 level.

Another consideration is the length of time to analyze. The data cover the period from July 5 to November 7, but as prior analyses of the relationship between the polls and public attention showed, the negative relationship became stronger closer to Election Day (Kondik and Skelley 2016; Skelley 2016). This makes intuitive sense given the typical nature of presidential campaigns. The convention period of a campaign, which in 2016 happened particularly early—it was the first time since 1960 that both parties held their conventions in July—sees the most polling variability, while the polls become less varied in the final weeks of the election campaign (Erikson and Wlezien 2012, 72-82). Thus, I decided to analyze the data from three different starting points: July 5, the start of the data sets; August 1, after the completion of the conventions; and September 6, the day after Labor Day, a date traditionally associated with the final stretch of major elections in the United States. Importantly, all the statistical findings for the data sets as a whole—i.e. the DF tests for stationarity—hold true for the shorter versions of the data.

The method for analyzing the relationship between the first-differenced daily polling averages/trends and the daily comparative levels of attention for the two major-party nominees is a vector autoregression (VAR). Considering the uncertainty regarding possible directions of causation, as laid out in the two hypotheses in the introduction, a VAR is an appropriate way to analyze the data: VAR models impose few restrictions, allowing the data to determine the number of lags used in the modeling process, and they allow for feedback between all variables—both lagged and concurrent—which is useful for interpreting the relationship between the two variables of interest. Granger causality tests can then shed light on possible directions of causality by testing whether a variable Y_t is more accurately predicted by past values of X_t and Y_t together than by past values of Y_t alone; in that case, X_t is said to “Granger cause” Y_t. It should be made clear that Granger causation alone is not proof positive of causation (Box-Steffensmeier et al. 2014, 106-118), but it can be taken as evidence of possible causation. Additionally, impulse response functions (IRFs) can be utilized in an attempt to measure the lagged effect of one variable on the other. IRFs show lagged effects, i.e. how a change in the value of an independent variable at Time 1 shifts values in the dependent variable at different time points. In the models below, that means looking at how a one-point increase in attention margin data shifts first-differenced polling data, and vice versa. The standard equation for the models in this work is a series of two equations with one lag:

$$\Delta PA_t = \alpha_1 + \delta_{11} \Delta PA_{t-1} + \beta_{11} AM_{t-1} + \epsilon_{1t}$$

$$AM_t = \alpha_2 + \delta_{21} \Delta PA_{t-1} + \beta_{21} AM_{t-1} + \epsilon_{2t}$$

where $\Delta PA$ is the first-differenced polling average and $AM$ is the attention margin for a candidate at a given time, $\alpha$ is the constant in each equation, and $\epsilon$ is the error term. Different forms of this basic equation include additional lagged versions of the two variables of interest, depending upon the Akaike information criterion (AIC) calculations for each model. In total, 18 models are part of this analysis, with six variable pairings (three horse-race polling data sets by two public attention data sets) across three different timespans. As the candidates’ data are the inverse of one another, it is unnecessary to run two different sets of models—the Granger causation tests and impulse response functions show the same results for both.

Figures 6 and 7 present the overall July-to-November 2016 data for the first-differenced polling averages/trends and the attention margin between the two candidates from Gallup and Google. As the figures show, the Clinton and Trump data are inverses of one another, by way of using the two-party vote in the polling averages/trends and subtracting each candidate’s public attention level from the other’s. The summary data for Clinton’s and Trump’s polling and attention margin data are presented in Tables 1 and 2.
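A compact sketch of one such model in R follows. The vars package and the series names are consistent with the annotations on the IRF figures below (e.g., hrc_gallup_marg, hrc_hp_1st_diff), but the code itself is an illustration under those assumptions, not the author’s published script:

```r
# Sketch: estimate the Model 1 VAR and run the Granger and IRF steps.
# hrc_gallup_marg and hrc_hp_1st_diff are the Gallup attention margin and
# the first-differenced Huffington Post Pollster trend for Clinton.
library(vars)

# na.omit drops the five days Gallup did not report.
dat <- na.omit(data.frame(hrc_gallup_marg, hrc_hp_1st_diff))

# Fit the VAR with the AIC-selected lag length (10 for Model 1, per Table 3).
model1 <- VAR(dat, p = 10, type = "const")

# Granger causality tests in both directions.
causality(model1, cause = "hrc_gallup_marg")$Granger
causality(model1, cause = "hrc_hp_1st_diff")$Granger

# Orthogonal IRF with 95% bootstrap confidence bands (1,000 runs),
# matching the annotations on the IRF figures below.
irf1 <- irf(model1, impulse = "hrc_gallup_marg", response = "hrc_hp_1st_diff",
            n.ahead = 25, ortho = TRUE, boot = TRUE, runs = 1000, ci = 0.95)
plot(irf1)
```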

Figure 6: Clinton Attention Margins & First-Differenced Polling (panels: Gallup margin, Google margin, HuffPost first difference, RCP first difference, and self-constructed first difference; x-axis: time in days)

Figure 7: Trump Attention Margins & First-Differenced Polling (panels: Gallup margin, Google margin, HuffPost first difference, RCP first difference, and self-constructed first difference; x-axis: time in days)

Table 1: Clinton Data Summary

          Gallup.Margin  Google.Margin  HuffPost.1stDiff  RCP.1stDiff  Self.1stDiff
Min.          -20.000        -68.000         -0.162668       -0.90858      -6.73233
1st Qu.        -8.000         -6.000         -0.067087       -0.17206      -0.88080
Median         -4.000         -2.000         -0.015921        0.00000       0.09934
Mean           -2.769         -2.175          0.004846       -0.00744      -0.02905
3rd Qu.         1.000          1.000          0.040131        0.11840       1.08754
Max.           15.000         52.000          0.422306        1.59272       3.34443
NA's            5              —               —               —             —

Table 2: Trump Data Summary

          Gallup.Margin  Google.Margin  HuffPost.1stDiff  RCP.1stDiff  Self.1stDiff
Min.          -15.000        -52.000         -0.422306       -1.59272      -3.34443
1st Qu.        -1.000         -1.000         -0.040131       -0.11840      -1.08754
Median          4.000          2.000          0.015921        0.00000      -0.09934
Mean            2.769          2.175         -0.004846        0.00744       0.02905
3rd Qu.         8.000          6.000          0.067087        0.17206       0.88080
Max.           20.000         68.000          0.162668        0.90858       6.73233
NA's            5              —               —               —             —

Overall, as Figures 6 and 7 and Tables 1 and 2 show, there is a much wider range in the Google data than in the Gallup attention margin data. The first-differenced Huffington Post Pollster data shows the least variance, while the first-differenced self-constructed polling average has the widest range of data points. The first-differenced RealClearPolitics data lies in between the other two in terms of variance. To construct the VARs for each pair of variables, we need to determine how many lags should be included for each of the 18 VARs. After running tests for this, the AIC indicates the following lag count for each VAR, as presented in Table 3:

Table 3: Number of lags per model

Model  Attention Margin  Polling Avg./Trend  Start Point  AIC Lags
1      Gallup            HuffPost            July 5       10
2      Gallup            RCP                 July 5        4
3      Gallup            Self-constructed    July 5        2
4      Google            HuffPost            July 5        5
5      Google            RCP                 July 5        3
6      Google            Self-constructed    July 5        2
7      Gallup            HuffPost            Aug. 1        2
8      Gallup            RCP                 Aug. 1        1
9      Gallup            Self-constructed    Aug. 1        6
10     Google            HuffPost            Aug. 1        2
11     Google            RCP                 Aug. 1        1
12     Google            Self-constructed    Aug. 1        6
13     Gallup            HuffPost            Sept. 6       2
14     Gallup            RCP                 Sept. 6       4
15     Gallup            Self-constructed    Sept. 6       3
16     Google            HuffPost            Sept. 6       2
17     Google            RCP                 Sept. 6       1
18     Google            Self-constructed    Sept. 6       3
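The lag counts in Table 3 could be generated along these lines (a sketch assuming the vars package; the data frame name is hypothetical):

```r
# Sketch: choose the lag length for one VAR pairing by the AIC.
# dat is a hypothetical two-column data frame holding an attention margin
# series and a first-differenced polling series for one model.
library(vars)

selection <- VARselect(dat, lag.max = 12, type = "const")
selection$selection["AIC(n)"]  # the AIC-preferred lag count
```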

The AIC calls for a number of different lag lengths across the 18 models, stretching from just one (as shown in the basic structural equation above) to 10. Before moving to the results and the analysis and discussion of those results, it is important to point out one overriding challenge: the effect of sampling error. While a polling sample cannot be expected to perfectly reflect the opinions of the population of interest, a complicating factor in any polling analysis is the reality that observed variance in the polls might be due more to survey error than to actual changes in the preferences of the electorate (Erikson and Wlezien 2012, 27). Gelman et al. (2016) famously found that vote swings in the 2012 presidential election were largely a result of shifts in party identification in samples, particularly before and after the first presidential debate between President Barack Obama and Mitt Romney. So this is an acknowledged limitation of this analysis that should be taken into account.

Results

To produce the results, I ran Granger causality tests for each VAR and then IRFs for each model. The Granger tests are based on the following two equations. For testing whether the attention margin Granger causes the polling average:

$$\Delta PA_t = \alpha + \sum_{i=1}^{L} \delta_i \Delta PA_{t-i} + \sum_{i=1}^{L} \beta_i AM_{t-i} + \epsilon_t$$

For testing whether the polling average Granger causes the attention margin:

$$AM_t = \alpha + \sum_{i=1}^{L} \delta_i AM_{t-i} + \sum_{i=1}^{L} \beta_i \Delta PA_{t-i} + \epsilon_t$$

In each equation, L is the number of lags, based on the AIC. For each model, the appropriate polling average/trend and measure of attention margin are substituted in to determine Granger causality. If a test is significant (I use p = .05 as the significance threshold), the null hypothesis that one variable does not cause the other can be rejected, indicating Granger causality. After Granger testing, an IRF was calculated for each relationship to estimate lagged effects. If a Granger causality test met the significance threshold, the IRF of that relationship is discussed in detail below.
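The testing loop might look roughly like the following sketch, which assumes the vars package and a hypothetical list of fitted models:

```r
# Sketch: run both directions of the Granger test for each VAR pairing and
# collect p-values, as reported in Tables 4-6. models is a hypothetical list
# in which each element holds a fitted VAR and the two series names.
library(vars)

granger_results <- lapply(models, function(m) {
  data.frame(
    attention_causes_polls = causality(m$var, cause = m$attention)$Granger$p.value,
    polls_cause_attention  = causality(m$var, cause = m$polling)$Granger$p.value
  )
})
```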

Timespan: July 5 to November 7

The July-to-November models are as follows:

1. Gallup Attention Margin and First-Differenced Huffington Post Pollster trend
2. Gallup Attention Margin and First-Differenced RealClearPolitics polling average
3. Gallup Attention Margin and First-Differenced Self-Constructed Polling Average
4. Google Attention Margin and First-Differenced Huffington Post Pollster trend
5. Google Attention Margin and First-Differenced RealClearPolitics polling average
6. Google Attention Margin and First-Differenced Self-Constructed Polling Average

Table 4: Granger Causality p-values for July to November data

Model  Granger cause     Outcome           p-value
1      Gallup            HuffPost          0.0002
1      HuffPost          Gallup            0.2646
2      Gallup            RCP               0.3289
2      RCP               Gallup            0.3494
3      Gallup            Self-constructed  0.6499
3      Self-constructed  Gallup            0.9255
4      Google            HuffPost          0.0008
4      HuffPost          Google            0.0236
5      Google            RCP               0.0061
5      RCP               Google            0.3041
6      Google            Self-constructed  0.6828
6      Self-constructed  Google            0.2385


Figure 8: Impulse Response Function for Model 1, Gallup Attention Margin → First-Differenced Huffington Post Pollster (orthogonal impulse response from hrc_gallup_marg to hrc_hp_1st_diff; 95% bootstrap CI, 1000 runs)

Figure 9: Impulse Response Function for Model 4, Google Attention Margin → First-Differenced Huffington Post Pollster (orthogonal impulse response from hrc_google_marg to hrc_hp_1st_diff; 95% bootstrap CI, 1000 runs)

Figure 10: Impulse Response Function for Model 4, First-Differenced Huffington Post Pollster → Google Attention Margin (orthogonal impulse response from hrc_hp_1st_diff to hrc_google_marg; 95% bootstrap CI, 1000 runs)

Figure 11: Impulse Response Function for Model 5, Google Attention Margin → First-Differenced RealClearPolitics (orthogonal impulse response from hrc_google_marg to hrc_rcp_1st_diff; 95% bootstrap CI, 1000 runs)

For the July-to-November models, Table 4 lays out the p-values for the Granger causality tests. In four cases, the results meet significance (p = .05) for one variable Granger causing another. In three of those cases, an attention margin Granger caused a first-differenced polling average/trend. However, in one instance, a polling trend Granger caused an attention margin, providing some limited evidence for the reverse causation laid out in the introduction (Hypothesis 2).

In Model 1, the Gallup attention margin Granger caused the first-differenced Huffington Post Pollster polling trend. Based on a one-percentage-point increase in the Gallup attention margin (i.e. more attention) at Time 1, the IRF for this relationship (Figure 8) shows a relatively positive change in the change of the candidates’ polling trend for most of Time 2 through Time 8. Recall that the first-differenced data are almost like second derivatives, describing the change in the nature of the daily change of the candidates’ polling average or trend. In the case of the Huffington Post Pollster trend, however, it is important to note that the very small variance in its first difference means there are very small shifts in the change of the change overall. Starting at Time 9 and lasting through Time 12, a one-point increase in the Gallup attention margin at Time 1 has a negative impact on the change of the change of just -.01 percentage points at each time point, suggesting the increasingly negative change from increased attention has about a 10-day lag.

Granger tests for Models 4 and 5 found Granger causation for Google’s attention margin on the first-differenced Huffington Post Pollster trend and the RealClearPolitics average, respectively. Figure 9 shows the IRF for the former, which differs a bit from the Gallup→Huffington Post Pollster relationship. In the case of Google’s attention margin, a one-point increase at Time 1 corresponds to consistently negative change in the change of the candidates’ polling trend from Time 5 on. That change in the change bottoms out at only about -.01 percentage points at Time 8, but it is a sustained lagged effect that remains negative from Time 5 through the end of the calculated run. In Figure 11, the IRF for Model 5 (Google→RealClearPolitics) shows an initial negative effect, with a one-point increase in the Google attention margin at Time 1 corresponding to a -.07 percentage point change at Time 3 in the change of the candidates’ RealClearPolitics poll average. But after that, the change of the change becomes positive to some extent until two weeks after the initial one-point increase in attention, peaking at .03 percentage points at Time 7. Only Figure 9’s IRF seems to truly match the expectations of Hypothesis 1 regarding more attention leading to worse polling numbers, though the other two IRFs do show some negative change in the change of the candidates’ horse-race numbers.

The notable finding is that Model 4 also shows the first-differenced Huffington Post Pollster data Granger causing the Google attention margin. The IRF for this relationship, shown in Figure 10, finds that a one-percentage-point increase in the change of the change of a candidate’s polling trend at Time 1 is accompanied by a positive increase in the Google attention margin over the first week or so. In other words, a big positive shift in the relative polling of a candidate was accompanied by more attention. As IRFs also signal the reverse, a one-point decrease in the change of the change of a candidate’s polling trend would be associated with decreasing attention in the first week.
But after a week, the relationship predicted in Hypothesis 2 shows itself, as improved polling change at Time 1 is associated with a decrease of more than 1 percentage point in attention at Time 10 to Time 18—and vice versa (a decrease in the polls→more attention). Within the July-to-November models, only four times in Figure 8 (Time 2, Times 21-23), four times in Figure 10 (Time 6, Times 14-16), and once in Figure 11 (Time 3) was a one-point increase at Time 1 in either attention margin or polling change associated with a change whose 95% confidence interval did not cross zero. So I am hesitant to make strong statements about the marginal effect of a one-point increase at Time 1.
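Checking which horizons exclude zero can be read directly off the irf object; here is a sketch assuming the vars package’s output structure and the irf1 object from the earlier sketch:

```r
# Sketch: flag IRF horizons whose 95% bootstrap interval excludes zero.
# vars stores point estimates and bands as lists of matrices keyed by
# impulse name, with one column per response variable.
est   <- irf1$irf$hrc_gallup_marg[, "hrc_hp_1st_diff"]
lower <- irf1$Lower$hrc_gallup_marg[, "hrc_hp_1st_diff"]
upper <- irf1$Upper$hrc_gallup_marg[, "hrc_hp_1st_diff"]

# Horizons where the whole interval sits on one side of zero.
which(lower > 0 | upper < 0)
```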

Timespan: August 1 to November 7

The August-to-November models are as follows:

7. Gallup Attention Margin and First-Differenced Huffington Post Pollster trend
8. Gallup Attention Margin and First-Differenced RealClearPolitics polling average
9. Gallup Attention Margin and First-Differenced Self-Constructed Polling Average
10. Google Attention Margin and First-Differenced Huffington Post Pollster trend
11. Google Attention Margin and First-Differenced RealClearPolitics polling average
12. Google Attention Margin and First-Differenced Self-Constructed Polling Average

14 Table 5: Granger Causality p-values for August to November data

Model  Granger cause     Outcome           p-value
7      Gallup            HuffPost          0.0235
7      HuffPost          Gallup            0.0508
8      Gallup            RCP               0.2329
8      RCP               Gallup            0.5671
9      Gallup            Self-constructed  0.2698
9      Self-constructed  Gallup            0.1503
10     Google            HuffPost          0.2493
10     HuffPost          Google            0.3716
11     Google            RCP               0.0218
11     RCP               Google            0.2369
12     Google            Self-constructed  0.2379
12     Self-constructed  Google            0.4376

Figure 12: Impulse Response Function for Model 7, Gallup Attention Margin → First-Differenced Huffington Post Pollster (orthogonal impulse response from hrc_gallup_marg to hrc_hp_1st_diff; 95% bootstrap CI, 1000 runs)


Figure 13: Impulse Response Function for Model 7, First-Differenced Huffington Post Pollster → Gallup Attention Margin (orthogonal impulse response from hrc_hp_1st_diff to hrc_gallup_marg; 95% bootstrap CI, 1000 runs)

Figure 14: Impulse Response Function for Model 11, Google Attention Margin → First-Differenced RealClearPolitics (orthogonal impulse response from hrc_google_marg to hrc_rcp_1st_diff; 95% bootstrap CI, 1000 runs)

As Table 5 shows, two Granger tests met the statistical significance threshold of p = .05, while a third fell just short. Once again, one model showed evidence of Granger causation in both directions.

In Models 7 and 11, an attention measure Granger caused a first-differenced polling measure. In the former, a one-point increase in Gallup’s attention margin at Time 1 corresponded with only lagged positive changes in the change of a candidate’s Huffington Post Pollster polling trend, peaking at .01 points from Time 5 to Time 9 (Figure 12). This runs counter to the expectations of Hypothesis 1, though again, the steadiness of that polling trend (calculated by way of a Kalman filter) compared to the others leads its first difference to have a small variance. In the latter model, a one-point increase in the Google attention margin at Time 1 led to consistently negative change in the change of the candidates’ RealClearPolitics average (shown in Figure 14), following the expectations of Hypothesis 1. This change peaked at the very start, corresponding to a -.05 point change at Time 2, and then slowly increased while remaining negative until Time 25.

The other relationship in Model 7, Huffington Post Pollster→Gallup attention margin, just missed significance at the p = .05 level and showed Granger causation in the opposite direction. A one-point increase in the change of the change of a candidate’s polling trend at Time 1 was associated initially with a positive change in attention margin from Time 1 to Time 6, peaking at 0.84 points at Time 2 (Figure 13). From Time 7 on, however, the one-point change at Time 1 corresponded to negative changes in attention margin, as expected by Hypothesis 2. Just as in the July models, there were only a few instances of an IRF estimate whose 95% confidence interval did not cross zero, which necessitates caution in interpreting these results. Still, it should be noted that the confidence interval for the Model 11 IRF described above barely includes zero.

Timespan: September 6 (after Labor Day) to November 7

The September-to-November models are as follows:

13. Gallup Attention Margin and First-Differenced Huffington Post Pollster trend
14. Gallup Attention Margin and First-Differenced RealClearPolitics polling average
15. Gallup Attention Margin and First-Differenced Self-Constructed Polling Average
16. Google Attention Margin and First-Differenced Huffington Post Pollster trend
17. Google Attention Margin and First-Differenced RealClearPolitics polling average
18. Google Attention Margin and First-Differenced Self-Constructed Polling Average

Table 6: Granger Causality p-values for September to November data

Model  Granger cause     Outcome           p-value
13     Gallup            HuffPost          0.0328
13     HuffPost          Gallup            0.0057
14     Gallup            RCP               0.0048
14     RCP               Gallup            0.5350
15     Gallup            Self-constructed  0.5826
15     Self-constructed  Gallup            0.3086
16     Google            HuffPost          0.4267
16     HuffPost          Google            0.2327
17     Google            RCP               0.0342
17     RCP               Google            0.2748
18     Google            Self-constructed  0.9127
18     Self-constructed  Google            0.4617

Figure 15: Impulse Response Function for Model 13, Gallup Attention Margin → First-Differenced Huffington Post Pollster (orthogonal impulse response from hrc_gallup_marg to hrc_hp_1st_diff; 95% bootstrap CI, 1000 runs)

Figure 16: Impulse Response Function for Model 13, First-Differenced Huffington Post Pollster → Gallup Attention Margin (orthogonal impulse response from hrc_hp_1st_diff to hrc_gallup_marg; 95% bootstrap CI, 1000 runs)

Figure 17: Impulse Response Function for Model 14, Gallup Attention Margin → First-Differenced RealClearPolitics (orthogonal impulse response from hrc_gallup_marg to hrc_rcp_1st_diff; 95% bootstrap CI, 1000 runs)

Figure 18: Impulse Response Function for Model 17, Google Attention Margin → First-Differenced RealClearPolitics (orthogonal impulse response from hrc_google_marg to hrc_rcp_1st_diff; 95% bootstrap CI, 1000 runs)

For the September models, four Granger tests were statistically significant. In three cases, public attention margins Granger caused polling measures, while in one case, a polling measure Granger caused a public attention measure.

As with Model 7 for August to November, Model 13’s Granger tests showed causation in both directions. In both models, the two data sets were Gallup’s public attention margin and the first difference of Huffington Post Pollster’s polling trend. In Figure 15, the IRF estimates are displayed for the Gallup public attention margin→Huffington Post Pollster relationship. A one-point increase in the Gallup attention margin at Time 1 corresponds to a slightly positive increase in the change of the change in the candidates’ polling trend, ranging from .005 to .010 points from Time 2 through Time 13. So, unlike the expectations of Hypothesis 1, this relationship shows increased attention corresponding to a slightly better polling situation. In Figure 16, the IRF of the reverse relationship is shown. A one-point increase in the change of the change of a candidate’s polling trend at Time 1 is initially associated with a positive bump of 1.1 percentage points in the Gallup attention margin at Time 2, and the change remains positive through Time 4. But then the one-point increase at Time 1 is associated with negative change from Time 5 through the end of the IRF, with the nadir being about a -1.2-point change in attention around Times 10-12.

The two other September cases of note involved RealClearPolitics. In Model 14 (Figure 17), a one-point increase in Gallup’s attention margin at Time 1 corresponds to negative changes in the change of the candidates’ polling average at Time 2 to Time 4, with the lowest mark being a -0.12 point change at Time 4. This is also the only point in that IRF where the point estimate’s confidence interval does not include zero. This finding matches Hypothesis 1, though the IRF as a whole does not compellingly line up with it: the IRF bounces between slightly positive and slightly negative until Time 10, after which it largely remains positive, though only very narrowly. In Model 17 (Figure 18), a one-point increase in Google’s attention margin at Time 1 has a quick, negative effect on the change in the change of the polling average, with a -0.06-point change at Time 2 and -0.05 at Time 3. The marginal effect then moves toward zero, though it remains negative.

Having laid out the notable results above, the final section of the paper delves into the meaning, or lack thereof in some cases, of the findings.

Analysis and Discussion

In total, 10 of the 36 two-way relationships in the 18 models produced significant Granger testing results at the p = .05 level, and one more skirted the line at p = .051. For each, I included an IRF displaying the lagged effect of a one-point increase in either the attention margin measurement in question or the particular first-differenced polling average/trend. With the caveats regarding survey error, the limited significance of most effects in the IRFs (the 95% confidence intervals often included zero), and the fact that Granger testing cannot be considered certain proof of causation, some broad observations can be made about the results.

First, three of the significant Granger test IRFs, parts of Models 4, 11, and 17 (Figures 9, 14, and 18, respectively), showed the expected relationship as laid out in Hypothesis 1: more public attention led to mostly negative marginal effects on polling. In each case, a one-point increase in attention margin at Time 1 corresponded to mostly continuous negative change in the change of the candidates’ polling average or trend for most of the 25-day run in the IRF. In two other instances, there was a relatively quick negative lagged effect from a one-point increase in attention at Time 1, but the effect subsided just as quickly (Figures 11 and 17, for parts of Models 5 and 14, respectively), becoming slightly or even substantially positive. Four of these five models involved Google’s attention margin (Models 4, 5, 11, and 17), which appears to have prompted polling shifts that better fit the expectations of Hypothesis 1. And four of the five polling averages/trends involved in these models were from RealClearPolitics.

Second, two IRFs for attention→polls (Figures 12 and 15, parts of Models 7 and 13) found that increased public attention led to mostly positive change in the change of a candidate’s polling position. In both cases, the models featured the Gallup attention margin and the first-differenced polling trend data for Huffington Post Pollster, one starting in August and one starting in September. One other IRF, for Model 1—the July model for Gallup→Huffington Post Pollster in Figure 8—went back and forth between a positive and negative effect over the 25-day timespan after a one-point increase in Gallup’s attention margin at Time 1. Thus, the three models involving the Gallup attention data and the Huffington Post Pollster trend were consistent in running at least somewhat counter to the expectations in Hypothesis 1.

But three other relationships showed some element of reverse causality (i.e. polls affecting attention), though not necessarily in the way anticipated by Hypothesis 2 (worse polling leading to more attention). In each case, the Huffington Post Pollster trend was the horse-race polling part of the model. Broadly speaking, the IRF for each instance (Models 4, 7, and 13, shown in Figures 10, 13, and 16, respectively) showed that a one-point increase in the change of a candidate’s polls at Time 1 corresponded to an increase in the amount of public attention in either Gallup’s or Google’s measure. Then, after a few time points—four to eight days of lag—the attention change would turn negative. The initially positive shift runs counter to the expectations in Hypothesis 2.

Overall, the results are inconclusive, even allowing for the caveats about polling error, Granger testing, and the inclusion of zero in most IRF confidence intervals. Certain public attention and horse-race measures seemed more likely than others to produce significant results, which adds further doubt about the ability to make strong statements about the effects. Google Trends attention margin data and RealClearPolitics’ horse-race data were most likely to produce a polling response more or less in line with Hypothesis 1. This may be because Google’s margin data had a greater variance than Gallup’s, so if a large shift in public attention corresponded to lagged polling change, a larger effect may have resulted. One notable fact is that models involving the self-constructed polling average, which showed the greatest variation in its first differences of any of the polling measures, did not once reach significance in Granger testing. A lack of clear trends in that data, which showed few consistent upswings or downswings in the change of the candidates’ daily polling change, may have made it difficult to identify any level of causality with shifts in attention, or vice versa.

Despite the inconclusive results, different methods and some potential controls could better clarify the relationship between public attention and 2016 polling. Getting access to private data regarding the tone of daily media coverage, such as that of Media Tenor, might enable analysis of, or control for, the nature of coverage and how it related to spikes of public attention during the campaign. Because the polling data were non-stationary and necessitated first-differencing, it may be possible to take a different strategy that attempts to control for the sinusoidal nature of the candidates’ polling averages and trends. Data and methods such as these should be part of the focus of further research on this relationship. If pollsters ask questions about public attention on a daily basis in the future, as Gallup did in 2016, that could also enable broader, multi-cycle studies of how that attention may or may not affect the polls.

In conclusion, some of the IRFs provide evidence that more public attention corresponded to worse polling numbers. The lagged negative impact of increased attention in some of the models suggests that major news events—which were mostly negative—could produce downward shifts in the candidates’ polling positions. To use an example, FBI Director James Comey sent a letter to Congress on October 28, 2016, which returned one of Clinton’s most damaging stories to the front pages. Model 11 (Figure 14) showed a negative lagged effect that could cumulatively have led to a -.2 point change in Clinton’s first-differenced polling numbers over the final 10 days. That is not massive, and the margin of error means the true shift could have been notably smaller or larger, but in an election that was essentially decided by around 77,000 votes in the states of Michigan, Pennsylvania, and Wisconsin, any negative shift in the final days for Clinton might have been enough for Trump to win.

But the results in the IRFs were hardly conclusive. Some of the models showed increased attention corresponding to positive shifts in the polls. Other models demonstrated reverse causality in a manner that ran somewhat counter to the expectations of Hypothesis 2 (that worse polls might lead to more attention). Despite the strong focus on the negative by the media and the public during the 2016 election, perhaps this latter finding shows that, at least initially, a positive change in the change of the candidates’ polls produced more attention. The Shorenstein Center data on news coverage showed the horse race to be the least negative of any category of coverage (Patterson 2016), so good polling news may have engendered more (positive) attention compared to policy or scandal stories, not less. Thus, there is more work to be done to elucidate the relationship between public attention and the movement of the 2016 polls.

References

Anderson, Kristin Soltis and Patrick Ruffini. 2016. “How to Recover from the Polling Disaster of 2016? Look Beyond Polls.” Washington Post. November 23. https://www.washingtonpost.com/opinions/how-to-recover-from-the-polling-disaster-of-2016-look-beyond-polls/2016/11/23/3bc5716a-b0d5-11e6-8616-52b15787add0_story.html?utm_term=.0cb4770cb569 (April 19, 2017).

Box-Steffensmeier, Janet M., John R. Freeman, Matthew P. Hitt, and Jon C. W. Pevehouse. 2014. Time Series Analysis for the Social Sciences. New York: Cambridge University Press.

“Donald Trump Favorable Rating.” 2017. Huffington Post Pollster. http://elections.huffingtonpost.com/pollster/donald-trump-favorable-rating (April 19, 2017).

Edwards-Levy, Ariel, Natalie Jackson, and Janie Velencia. 2017. “Polling in the 2016 Election.” In Trumped: The 2016 Election That Broke All the Rules, eds. Larry J. Sabato, Kyle Kondik, and Geoffrey Skelley. Lanham, Maryland: Rowman & Littlefield, 152-166.

Enten, Harry. 2016. “Americans’ Distaste For Both Trump And Clinton Is Record-Breaking.” FiveThirtyEight. May 5. https://fivethirtyeight.com/features/americans-distaste-for-both-trump-and-clinton-is-record-breaking/ (April 19, 2017).

Erikson, Robert S. and Christopher Wlezien. 2012. The Timeline of Presidential Elections: How Campaigns Do (and Do Not) Matter. Chicago: University of Chicago Press.

Farnsworth, Stephen J. and S. Robert Lichter. 2014. “News Coverage of US Presidential Campaigns: Reporting on Primaries and General Elections, 1988-2012.” Paper prepared for presentation at the annual meeting of the American Political Science Association, Washington, DC.

Gelman, Andrew, Sharad Goel, Douglas Rivers, and David Rothschild. 2016. “The Mythical Swing Voter.” Quarterly Journal of Political Science 11 (1): 103-130.

“Hillary Clinton Favorable Rating.” 2017. Huffington Post Pollster. http://elections.huffingtonpost.com/pollster/hillary-clinton-favorable-rating

Iyengar, Shanto, Helmut Norpoth, and Kyu S. Hahn. 2004. “Consumer Demand for Election News: The Horserace Sells.” The Journal of Politics 66 (1): 157-175.

Kondik, Kyle and Geoffrey Skelley. 2016. “In 2016’s Game of Musical Chairs, the Music Stopped at the Wrong Time for Clinton.” Sabato’s Crystal Ball. November 30. http://www.centerforpolitics.org/crystalball/articles/in-2016s-game-of-musical-chairs-the-music-stopped-at-the-wrong-time-for-clinton/ (April 19, 2017).

Newport, Frank, Lisa Singh, Stuart Soroka, Michael Traugott, and Andrew Dugan. 2016. “‘Email’ Dominates What Americans Have Heard About Clinton.” Gallup. September 19. http://www.gallup.com/poll/195596/email-dominates-americans-heard-clinton.aspx (April 19, 2017).

Patterson, Thomas E. 1980. The Mass Media Election: How Americans Choose Their President. New York: Praeger.

____. 1994. Out of Order. New York: Vintage.

____. 2005. “Of Polls, Mountains: U.S. Journalists and Their Use of Election Surveys.” Public Opinion Quarterly 69 (5): 716-724.

____. 2016. News Coverage of the 2016 General Election: How the Press Failed the Voters. Shorenstein Center on Media, Politics and Public Policy. December.

“Presidential Election 2016: Key Indicators.” 2016. Gallup. http://www.gallup.com/poll/189299/presidential-election-2016-key-indicators.aspx (April 20, 2017).

Rogers, Simon. 2016. “What Is Google Trends Data—And What Does It Mean?” Medium. July 1. https://medium.com/google-news-lab/what-is-google-trends-data-and-what-does-it-mean-b48f07342ee8 (April 20, 2017).

Rosenstiel, Tom. 2005. “Political Polling and the New Media Culture: A Case of More Being Less.” Public Opinion Quarterly 69 (5): 698-715.

Saad, Lydia. 2016. “Trump and Clinton Finish With Historically Poor Images.” Gallup. November 8. http://www.gallup.com/poll/197231/trump-clinton-finish-historically-poor-images.aspx (April 19, 2017).

Sabato, Larry J. 1993. Feeding Frenzy. New York: The Free Press.

Sabato, Larry J., Kyle Kondik, and Geoffrey Skelley. 2016. “The Fundamentals: Where Are We in This Strange Race for President?” Sabato’s Crystal Ball. September 15. http://www.centerforpolitics.org/crystalball/articles/the-fundamentals-where-are-we-in-this-strange-race-for-president/ (April 19, 2017).

Skelley, Geoffrey. 2016. “The Danger of the Political Limelight.” Sabato’s Crystal Ball. October 13. http://www.centerforpolitics.org/crystalball/articles/the-danger-of-the-political-limelight/ (April 19, 2017).

“Ted Cruz Favorable Rating.” 2017. Huffington Post Pollster. http://elections.huffingtonpost.com/pollster/ted-cruz-favorable-rating (April 19, 2017).
